FDSA Lab Manual aim algorithm
II Year/IV Semester
EXP NO: 1 Working with Pandas data frames
Date:
ALGORITHM:
Step1: Start
Step5: Stop
PROGRAM:
import pandas as pd
data = {
df = pd.DataFrame(data)
print(df.head())
print(filtered_df)
print(df)
grouped_df = df.groupby('City')['Age'].mean()
print(grouped_df)
OUTPUT:
RESULT:
Thus the Python program for working with Pandas data frames was successfully completed.
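The operations named in this experiment (creating a data frame, previewing it, filtering rows and grouping) can be sketched end to end as follows. The records, the column names other than `City` and `Age`, and the filter threshold of 25 are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample records; only the 'City' and 'Age' columns are named in the manual.
data = {
    "Name": ["Asha", "Bala", "Chitra", "Dinesh"],
    "Age": [24, 31, 28, 35],
    "City": ["Chennai", "Madurai", "Chennai", "Madurai"],
}
df = pd.DataFrame(data)

# Preview the first rows of the data frame
print(df.head())

# Keep only the rows where Age exceeds 25 (threshold assumed for illustration)
filtered_df = df[df["Age"] > 25]
print(filtered_df)

# Mean age per city
grouped_df = df.groupby("City")["Age"].mean()
print(grouped_df)
```

Boolean indexing such as `df[df["Age"] > 25]` returns a new data frame and leaves `df` unchanged.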
EXP NO: 2 Basic plots using Matplotlib
Date:
AIM:
ALGORITHM:
Step1: Start
Step5: Stop
PROGRAM:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
OUTPUT:
RESULT:
Thus the Python program for basic plots using Matplotlib was successfully completed.
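A complete version of a basic plot might look like the sketch below. It uses the x and y values and the axis labels from the program above. The choice of a line plot, the marker style and the title are assumptions.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Line plot with point markers (plot type assumed for illustration)
lines = plt.plot(x, y, marker="o", label="y versus x")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Basic line plot")
plt.legend()
plt.show()
```

Replacing `plt.plot` with `plt.bar(x, y)` or `plt.scatter(x, y)` gives a bar chart or a scatter plot of the same data.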
EXP NO: 3A Frequency distributions
Date:
AIM:
ALGORITHM:
PROGRAM:
import pandas as pd
data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 7, 7, 7]
series = pd.Series(data)
frequency = series.value_counts()
print(frequency)
OUTPUT:
7 4
4 3
2 2
1 1
3 1
5 1
6 1
dtype: int64
RESULT:
Thus the Python program to compute the frequency distribution in Jupyter Notebook was written and
executed successfully.
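`value_counts()` orders the result by count, as the output above shows. A frequency table ordered by value, with relative frequencies alongside, can be sketched as follows; the column names `Frequency` and `Relative` are chosen here for illustration.

```python
import pandas as pd

data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 7, 7, 7]
series = pd.Series(data)

# Absolute frequencies, ordered by value instead of by count
frequency = series.value_counts().sort_index()

# Relative frequencies (proportions that sum to 1)
relative = series.value_counts(normalize=True).sort_index()

table = pd.DataFrame({"Frequency": frequency, "Relative": relative.round(3)})
print(table)
```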
EXP NO: 3B Averages
Date:
AIM:
ALGORITHM:
PROGRAM:
import numpy as np
import pandas as pd
mean_pandas = pd.Series(data).mean()
mean_numpy = np.mean(data)
median = pd.Series(data).median()
print(f"Median: {median}")
mode = pd.Series(data).mode()
print(f"Mode: {mode}")
OUTPUT:
RESULT:
Thus the Python program to compute averages in Jupyter Notebook was written and executed
successfully.
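The three averages can be computed in one short, self-contained sketch. The data set below is an assumption: it reuses the values from Experiment 3A, because the program above does not show how `data` is defined.

```python
import numpy as np
import pandas as pd

# Assumed data set (borrowed from Experiment 3A for illustration)
data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 7, 7, 7]

mean_pandas = pd.Series(data).mean()
mean_numpy = np.mean(data)
median = pd.Series(data).median()
# mode() returns a Series because a data set can have more than one mode
mode = pd.Series(data).mode().tolist()

print(f"Mean (Pandas): {mean_pandas}")
print(f"Mean (NumPy): {mean_numpy}")
print(f"Median: {median}")
print(f"Mode: {mode}")
```

Converting the mode to a list keeps the printed line compact; printing the Series directly also shows its index and dtype.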
EXP NO: 3C Variability
Date:
AIM:
ALGORITHM:
PROGRAM:
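import numpy as np
import pandas as pd
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # assumed; consistent with the output shown below
range_value = max(data) - min(data)
print(f"Range: {range_value}")
variance_pandas = pd.Series(data).var()
std_dev_pandas = pd.Series(data).std()
variance_numpy = np.var(data)
std_dev_numpy = np.std(data)
print(f"Variance (Pandas): {variance_pandas}")
print(f"Standard Deviation (Pandas): {std_dev_pandas}")
print(f"Variance (NumPy): {variance_numpy}")
print(f"Standard Deviation (NumPy): {std_dev_numpy}")
OUTPUT: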
Range: 9
Variance (Pandas): 9.166666666666666
Standard Deviation (Pandas): 3.0276503540974917
Variance (NumPy): 8.25
Standard Deviation (NumPy): 2.8722813232690143
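The Pandas and NumPy figures above differ because the two libraries use different defaults: Pandas divides by n - 1 (sample variance, `ddof=1`), while NumPy divides by n (population variance, `ddof=0`). The sketch below shows the two conventions side by side. The data set of ten consecutive integers is an assumption; it reproduces the printed values.

```python
import numpy as np
import pandas as pd

# Assumed data set: the integers 1 to 10, which reproduce the figures printed above
data = list(range(1, 11))

range_value = max(data) - min(data)
variance_pandas = pd.Series(data).var()   # ddof=1 by default (sample variance)
variance_numpy = np.var(data)             # ddof=0 by default (population variance)

# Setting ddof explicitly makes the two libraries agree
variance_numpy_sample = np.var(data, ddof=1)
variance_pandas_population = pd.Series(data).var(ddof=0)

print(f"Range: {range_value}")
print(f"Sample variance:     {variance_pandas} (Pandas), {variance_numpy_sample} (NumPy, ddof=1)")
print(f"Population variance: {variance_pandas_population} (Pandas, ddof=0), {variance_numpy} (NumPy)")
```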
RESULT:
EXP NO: 4A Normal curves
Date:
AIM:
ALGORITHM:
PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
mu = 0
sigma = 1
plt.xlabel('Data values')
plt.ylabel('Probability density')
OUTPUT:
RESULT:
Thus the Python program to plot the normal curve was successfully completed.
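A normal curve for the `mu` and `sigma` used above can be drawn directly from the density formula. This is a minimal sketch: the plotting range of four standard deviations on each side and the number of points are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

mu = 0
sigma = 1

# Evaluate the density on a grid spanning mu +/- 4 sigma (range assumed)
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 1000)
pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

plt.plot(x, pdf, label=f"Normal(mu={mu}, sigma={sigma})")
plt.xlabel("Data values")
plt.ylabel("Probability density")
plt.legend()
plt.show()
```

`scipy.stats.norm.pdf(x, mu, sigma)` returns the same values if SciPy is available.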
EXP NO: 4B Correlation and scatter plots
Date:
AIM:
ALGORITHM:
PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(100)
plt.scatter(x, y, alpha=0.7)
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
OUTPUT:
RESULT:
Thus the Python program for correlation and scatter plots was successfully completed.
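A scatter plot of two correlated variables can be sketched as follows. The relationship between `x` and `y` (a straight line plus Gaussian noise), the seed and the title are assumptions, because the program above does not show how `y` is generated.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)  # fixed seed so the plot is reproducible
x = np.random.rand(100)
# Assumed for illustration: y rises linearly with x, plus Gaussian noise
y = 2 * x + np.random.normal(0, 0.1, 100)

plt.scatter(x, y, alpha=0.7)
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Scatter plot of Y against X')
plt.show()
```

Points that cluster around an upward-sloping line indicate a positive correlation; a downward slope indicates a negative one.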
EXP NO: 4C Correlation coefficient
Date:
AIM:
ALGORITHM:
PROGRAM:
import numpy as np
import pandas as pd
x = np.random.rand(100)
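A minimal sketch of how the Pearson correlation coefficient might be computed with both NumPy and Pandas is given below. The definition of `y` and the seed are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
x = np.random.rand(100)
# Assumed for illustration: y depends linearly on x, plus Gaussian noise
y = 2 * x + np.random.normal(0, 0.1, 100)

# NumPy returns the full 2x2 correlation matrix; the off-diagonal entry is r
r_numpy = np.corrcoef(x, y)[0, 1]

# Pandas computes the Pearson coefficient between two Series by default
r_pandas = pd.Series(x).corr(pd.Series(y))

print(f"Correlation coefficient (NumPy): {r_numpy}")
print(f"Correlation coefficient (Pandas): {r_pandas}")
```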
RESULT:
EXP NO: 5 Simple linear regression
Date:
AIM:
ALGORITHM:
PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
np.random.seed(0)
X = np.random.rand(100) * 10
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
X = X.reshape(-1, 1)
model = LinearRegression()
model.fit(X, Y)
slope = model.coef_[0]
intercept = model.intercept_
Y_pred = model.predict(X)
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()
r_squared = model.score(X, Y)
print(f"R-squared: {r_squared}")
X_new = np.array([[15]])
Y_new = model.predict(X_new)
print(f"Predicted Y for X = 15: {Y_new[0]}")
OUTPUT:
R-squared: 0.928337996765404
Predicted Y for X = 15: 37.75510721910058
RESULT:
Thus the computation for Simple Linear Regression was successfully completed.
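The slope, intercept and R-squared reported by `LinearRegression` can be cross-checked against the closed-form least-squares formulas. The sketch below uses NumPy only. The way `Y` is generated is an assumption, so the numbers it prints will not match the output above.

```python
import numpy as np

np.random.seed(0)
X = np.random.rand(100) * 10
# Assumed relationship for illustration; the manual's own Y is not shown
Y = 2.5 * X + np.random.randn(100) * 2

# Closed-form least-squares estimates for a single predictor
x_mean, y_mean = X.mean(), Y.mean()
slope = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# R-squared: share of the variance in Y explained by the fitted line
Y_pred = slope * X + intercept
ss_res = np.sum((Y - Y_pred) ** 2)
ss_tot = np.sum((Y - y_mean) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"Slope: {slope}")
print(f"Intercept: {intercept}")
print(f"R-squared: {r_squared}")
print(f"Predicted Y for X = 15: {slope * 15 + intercept}")
```

For simple linear regression, R-squared equals the square of the Pearson correlation coefficient between X and Y.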
EXP NO: 6 Z-test
Date:
AIM:
ALGORITHM:
PROGRAM:
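import numpy as np
from scipy import stats
# Summary statistics of the two samples
mean_1 = 50
mean_2 = 45
std_1 = 10
std_2 = 12
size_1 = 40
size_2 = 35
# Two-sample z statistic: difference in means divided by its standard error
z_score_two_sample = (mean_1 - mean_2) / np.sqrt(std_1**2 / size_1 + std_2**2 / size_2)
# Two-tailed p-value from the standard normal distribution
p_value_two_sample = 2 * (1 - stats.norm.cdf(abs(z_score_two_sample)))
print(f"Z-Score: {z_score_two_sample}")
print(f"P-value: {p_value_two_sample}")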
OUTPUT:
Z-Score: 1.9441444452997994
P-value: 0.051878034893831915
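The same test can be wrapped in a reusable function that needs only the standard library. The function name `two_sample_z_test` and the significance level of 0.05 are assumptions introduced for illustration.

```python
import math

def two_sample_z_test(mean_1, mean_2, std_1, std_2, size_1, size_2):
    """Return (z, two-tailed p-value) from the summary statistics of two samples."""
    standard_error = math.sqrt(std_1**2 / size_1 + std_2**2 / size_2)
    z = (mean_1 - mean_2) / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_sample_z_test(50, 45, 10, 12, 40, 35)
alpha = 0.05  # assumed significance level
decision = "reject" if p_value < alpha else "fail to reject"
print(f"Z-Score: {z:.4f}, P-value: {p_value:.4f}: {decision} the null hypothesis")
```

With the values used in this experiment the p-value is just above 0.05, so the null hypothesis of equal means is not rejected at the 5% level.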
RESULT:
EXP NO: 7 T-test
Date:
AIM:
ALGORITHM:
PROGRAM:
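import numpy as np
from scipy import stats
sample_data = np.array([52, 55, 48, 49, 53, 54, 51, 50, 55, 58, 56, 57, 52, 51, 54, 53, 59, 61, 50,
52, 54, 53, 49, 47, 52, 51, 50, 48, 56, 55])
population_mean = 50
# One-sample t-test of the sample mean against the hypothesised population mean
t_stat, p_value = stats.ttest_1samp(sample_data, population_mean)
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")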
OUTPUT:
T-statistic: 4.571679054413011
P-value: 8.327654458471987e-05
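The t-statistic above can be reproduced by hand from the sample mean, the sample standard deviation and the sample size. The sketch below shows the calculation step by step, with the p-value taken from Student's t distribution with n - 1 degrees of freedom. The intermediate variable names are chosen here for illustration.

```python
import numpy as np
from scipy import stats

sample_data = np.array([52, 55, 48, 49, 53, 54, 51, 50, 55, 58, 56, 57, 52, 51, 54,
                        53, 59, 61, 50, 52, 54, 53, 49, 47, 52, 51, 50, 48, 56, 55])
population_mean = 50

n = len(sample_data)
sample_mean = sample_data.mean()
sample_std = sample_data.std(ddof=1)        # sample standard deviation (divides by n - 1)
standard_error = sample_std / np.sqrt(n)

t_stat = (sample_mean - population_mean) / standard_error
# Two-tailed p-value from Student's t distribution with n - 1 degrees of freedom
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)

print(f"T-statistic: {t_stat}")
print(f"P-value: {p_value}")
```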
RESULT:
EXP NO: 8 ANOVA
Date:
AIM:
ALGORITHM:
PROGRAM:
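import numpy as np
from scipy import stats
group_1 = np.array([23, 45, 67, 32, 45, 34, 43, 45, 56, 42])
group_2 = np.array([45, 32, 23, 43, 46, 32, 21, 22, 43, 43])
group_3 = np.array([65, 78, 56, 67, 82, 73, 74, 65, 68, 74])
f_stat, p_value = stats.f_oneway(group_1, group_2, group_3)  # one-way ANOVA
print(f"F-statistic: {f_stat}")
print(f"P-value: {p_value}")
if p_value < 0.05:  # 5% significance level (assumed)
    print("There is a significant difference between the group means.")
else:
    print("There is no significant difference between the group means.")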
OUTPUT:
F-statistic: 32.6259618124822
P-value: 6.255218731829188e-08
There is a significant difference between the group means.
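The F-statistic above is the ratio of the between-group mean square to the within-group mean square. A minimal NumPy sketch of that calculation, using the three groups from the program, is given below. The intermediate variable names are chosen here for illustration.

```python
import numpy as np

group_1 = np.array([23, 45, 67, 32, 45, 34, 43, 45, 56, 42])
group_2 = np.array([45, 32, 23, 43, 46, 32, 21, 22, 43, 43])
group_3 = np.array([65, 78, 56, 67, 82, 73, 74, 65, 68, 74])
groups = [group_1, group_2, group_3]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Between-group sum of squares: spread of the group means around the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of the observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"F-statistic: {f_stat}")
```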
RESULT:
EXP NO: 9 Building and validating linear models
Date:
AIM:
To write a Python program to build and validate linear models using Jupyter Notebook.
ALGORITHM:
PROGRAM:
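import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score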
np.random.seed(0)
X = np.random.rand(100, 1) * 10
X_train_sm = sm.add_constant(X_train)
X_test_sm = sm.add_constant(X_test)
y_pred = model.predict(X_test_sm)
print(model.summary())
r2 = r2_score(y_test, y_pred)
print(f'R-squared: {r2}')
OUTPUT:
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Mean Squared Error: 3.6710129878857174
R-squared: 0.896480483165161
RESULT:
Thus the computation for building and validating linear models was successfully completed.
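The validation step (fit on a training set, then score on held-out data) can be illustrated without statsmodels or scikit-learn. The sketch below is a NumPy-only cross-check, not the manual's program. The data-generating process, the split of 80 training and 20 test rows, and the variable names are assumptions, so its output will differ from the figures above.

```python
import numpy as np

np.random.seed(0)
X = np.random.rand(100, 1) * 10
# Assumed data-generating process for illustration; the manual's own y is not shown
y = 3 * X[:, 0] + 2 + np.random.randn(100) * 2

# Shuffle the row indices and hold out the first 20 as the test set
idx = np.random.permutation(len(X))
test_idx, train_idx = idx[:20], idx[20:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# Prepend an intercept column (the role of sm.add_constant) and solve least squares
A_train = np.column_stack([np.ones(len(X_train)), X_train])
A_test = np.column_stack([np.ones(len(X_test)), X_test])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Validate on the held-out rows only
y_pred = A_test @ coef
mse = np.mean((y_test - y_pred) ** 2)
r2 = 1 - np.sum((y_test - y_pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2)

print(f"Intercept: {coef[0]}, Slope: {coef[1]}")
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")
```

Scoring on rows the model has not seen is what distinguishes validation from the in-sample fit statistics reported by `model.summary()`.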
EXP NO: 10 Building and validating logistic models
Date:
AIM:
To write a Python program to build and validate logistic models using Jupyter Notebook.
ALGORITHM:
PROGRAM:
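import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report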
np.random.seed(0)
X = np.random.rand(100, 2)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(f'Accuracy: {accuracy}')
print('Confusion Matrix:')
print(conf_matrix)
print('Classification Report:')
print(class_report)
plt.figure(figsize=(10, 6))
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()
OUTPUT:
Accuracy: 0.9
Confusion Matrix:
[[ 8  2]
 [ 0 10]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.80      0.89        10
           1       0.83      1.00      0.91        10

    accuracy                           0.90        20
   macro avg       0.92      0.90      0.90        20
weighted avg       0.92      0.90      0.90        20
RESULT:
Thus the computation for building and validating logistic models was successfully completed.
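Every figure in the classification report above follows from the confusion matrix. The sketch below recomputes accuracy, precision, recall and F1-score from the matrix shown in the output, using the scikit-learn convention that rows are actual classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix from the output above (rows: actual class, columns: predicted class)
conf_matrix = np.array([[8, 2],
                        [0, 10]])

correct = np.diag(conf_matrix)                 # correctly classified count per class
accuracy = correct.sum() / conf_matrix.sum()
precision = correct / conf_matrix.sum(axis=0)  # correct / all predicted as that class
recall = correct / conf_matrix.sum(axis=1)     # correct / all actually in that class
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy: {accuracy}")
for label in range(len(conf_matrix)):
    print(f"Class {label}: precision={precision[label]:.2f}, "
          f"recall={recall[label]:.2f}, f1-score={f1[label]:.2f}")
```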
EXP NO: 11 Time series analysis
Date:
AIM:
ALGORITHM:
PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(100).cumsum()
plt.figure(figsize=(12, 6))
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid()
plt.show()
OUTPUT:
RESULT:
Thus the computation for time series analysis was successfully completed.
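A complete time series sketch, built on the same random-walk data as the program above, is given below. The start date, the daily frequency, the seed and the 7-day rolling window are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(0)
# Hypothetical daily date index; the manual's own index is not shown
dates = pd.date_range(start="2023-01-01", periods=100, freq="D")
data = np.random.randn(100).cumsum()  # random walk
ts = pd.Series(data, index=dates)

# 7-day rolling mean smooths out short-term fluctuations (window assumed)
rolling_mean = ts.rolling(window=7).mean()

plt.figure(figsize=(12, 6))
plt.plot(ts.index, ts.values, label="Original series")
plt.plot(rolling_mean.index, rolling_mean.values, label="7-day rolling mean")
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid()
plt.show()
```

The first six values of the rolling mean are NaN because a full 7-day window is not yet available for them.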