R22 ML Lab Manual
import pandas as pd

# Sample data: name, age, and rating of 12 people
data = {
    'Name': pd.Series(['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack',
                       'Lee', 'Chanchal', 'Gasper', 'Naviya', 'Andres']),
    'Age': pd.Series([25, 26, 25, 23, 30, 29, 23, 34, 40, 30, 51, 46]),
    'Rating': pd.Series([4.23, 3.24, 3.98, 2.56, 3.20, 4.6, 3.8, 3.78, 2.98, 4.80,
                         4.10, 3.65])
}
df = pd.DataFrame(data)
print(df)

# Measures of central tendency
age_mean = df['Age'].mean()
age_median = df['Age'].median()
age_mode = df['Age'].mode()
rating_mean = df['Rating'].mean()
rating_median = df['Rating'].median()
rating_mode = df['Rating'].mode()

# Measures of dispersion
age_variance = df['Age'].var()
age_standard_deviation = df['Age'].std()
rating_variance = df['Rating'].var()
rating_standard_deviation = df['Rating'].std()
print("Variance...Age:", age_variance)
print("Standard deviation...Age:", age_standard_deviation)
print("Variance...Rating:", rating_variance)
print("Standard deviation...Rating:", rating_standard_deviation)
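By default, pandas var() and std() compute the sample variance and sample standard deviation (divisor n - 1). As a minimal sketch, the Age values from the program above can be cross-checked with the built-in statistics module. The variable names used here are illustrative only.

```python
import statistics

# Age values copied from the DataFrame above
ages = [25, 26, 25, 23, 30, 29, 23, 34, 40, 30, 51, 46]

n = len(ages)
mean = sum(ages) / n

# Sample variance divides by n - 1 (the pandas default, ddof=1)
sample_variance = sum((a - mean) ** 2 for a in ages) / (n - 1)
# Population variance divides by n (in pandas: df['Age'].var(ddof=0))
population_variance = sum((a - mean) ** 2 for a in ages) / n

print("Sample variance:", sample_variance)
print("Population variance:", population_variance)
print("statistics.variance:", statistics.variance(ages))
print("statistics.pvariance:", statistics.pvariance(ages))

# Age has more than one most-frequent value; multimode() lists all of them
print("All modes of Age:", statistics.multimode(ages))
```

The sample variance is always slightly larger than the population variance for the same data, because it divides by a smaller number.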
1. Statistics Library
• The statistics module provides functions for statistical computations like mean,
median, mode, and standard deviation.
• It is built into Python, so no extra installation is required.
• Useful for simple data analysis and numerical summaries.
• Supports working with lists, tuples, and other iterable data structures.
• Ideal for beginners to perform basic statistical calculations.
Example Program:
import statistics
data = [1, 2, 3, 4, 5]
print("Mean:", statistics.mean(data))
print("Median:", statistics.median(data))
print("Mode:", statistics.mode(data))
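The points above also mention standard deviation and support for other iterables, which the example program does not show. The following is a minimal sketch with made-up marks stored in a tuple. It assumes Python 3.8 or later, where statistics.multimode() is available.

```python
import statistics

# The module accepts any iterable of numbers, for example a tuple
marks = (70, 85, 85, 90, 60, 70)

print("Mean:", statistics.mean(marks))
print("Median:", statistics.median(marks))
print("Sample standard deviation:", statistics.stdev(marks))
print("Population standard deviation:", statistics.pstdev(marks))

# multimode() lists every most-common value instead of only the first one
print("Modes:", statistics.multimode(marks))
```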
2. Math Library
• The math module provides mathematical functions such as square root, power,
trigonometry, and logarithms.
• It is built-in and does not require installation.
• Contains constants like pi and e.
• Helps perform complex mathematical operations efficiently.
• Ideal for engineering, physics, and general numerical computations.
Example Program:
import math
a = 16
b = 4
print("a+b=", a + b)
print("a-b=", a - b)
print("a*b=", a * b)
print("a/b=", a / b)
print("a%b=", a % b)
print("Square root:", math.sqrt(a))
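The constants, power, logarithm, and trigonometric functions mentioned in the points above can be tried out with a short sketch like the one below. The chosen input values are arbitrary.

```python
import math

# Constants provided by the module
print("pi:", math.pi)
print("e:", math.e)

# Power and logarithms
print("2 to the power 10:", math.pow(2, 10))
print("Natural log of e:", math.log(math.e))
print("Log base 10 of 1000:", math.log10(1000))

# Trigonometric functions expect angles in radians
angle = math.radians(30)  # convert 30 degrees to radians
print("sin(30 degrees):", math.sin(angle))

# Greatest common divisor of two integers
print("GCD of 8 and 12:", math.gcd(8, 12))
```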
3. NumPy Library
• NumPy (Numerical Python) is used for array manipulations and numerical computations.
• Provides multi-dimensional array objects (ndarray) with fast operations.
• Supports mathematical functions like linear algebra and statistics.
• Requires installation using pip install numpy.
• Widely used in scientific computing and machine learning.
Example Program:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Dimensions:", data.ndim)
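The multi-dimensional arrays and linear algebra support mentioned above can be illustrated with a small sketch. The matrix A and vector b are arbitrary values chosen for the example.

```python
import numpy as np

# A 2-D array (matrix) with 2 rows and 2 columns
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

print("Dimensions:", A.ndim)
print("Shape:", A.shape)
print("Column means:", A.mean(axis=0))

# Element-wise arithmetic works without explicit loops
print("A * 2:\n", A * 2)

# Linear algebra: solve the system A @ x = b
x = np.linalg.solve(A, b)
print("Solution x:", x)
```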
4. SciPy Library
• SciPy (Scientific Python) is built on NumPy for advanced mathematical operations.
• Contains modules for optimization, integration, interpolation, and statistics.
• Useful for scientific and engineering applications.
• Requires installation using pip install scipy.
• Provides specialized functions like signal processing and image manipulation.
Example Program:
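import math
from scipy.special import comb
# scipy.special does not provide gcd; the built-in math module does
print("GCD:", math.gcd(8, 12))
print("Combinations C(5, 2):", comb(5, 2, exact=True))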
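The integration and optimization modules mentioned above can be sketched as follows. The functions being integrated and minimized are simple illustrative choices, and f is a hypothetical helper name.

```python
import math
from scipy import integrate, optimize

# Integration: area under sin(x) between 0 and pi (the exact value is 2)
area, error_estimate = integrate.quad(math.sin, 0, math.pi)
print("Integral of sin(x) on [0, pi]:", area)

# Optimization: find the minimum of f(x) = (x - 3)^2 + 1
def f(x):
    return (x - 3) ** 2 + 1

result = optimize.minimize_scalar(f)
print("Minimum found at x =", result.x)
print("Minimum value:", result.fun)
```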
5. Pandas Library
• Pandas is a powerful Python library for data manipulation and analysis.
• It provides data structures like Series (1D) and DataFrame (2D) for handling
structured data.
• Pandas supports data cleaning, filtering, aggregation, and visualization with built-in
functions.
• It efficiently handles CSV, Excel, SQL, JSON, and other file formats.
• Pandas is widely used in data science, finance, and machine learning for preprocessing
data.
Example Program :
import pandas as pd
mydataset = {
'cars': ["BMW", "Volvo", "Ford"],
'passings': [3, 7, 2]
}
myvar = pd.DataFrame(mydataset)
print(myvar)
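The data cleaning, filtering, and aggregation features mentioned above can be sketched on a small table. The city names and sales figures are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical data with one missing sales value
df = pd.DataFrame({
    'city': ['Pune', 'Delhi', 'Pune', 'Delhi', 'Pune'],
    'sales': [120.0, 80.0, None, 100.0, 150.0],
})

# Data cleaning: replace the missing value with the column mean
df['sales'] = df['sales'].fillna(df['sales'].mean())

# Filtering: keep only the rows where sales exceed 100
high_sales = df[df['sales'] > 100]
print(high_sales)

# Aggregation: total sales per city
totals = df.groupby('city')['sales'].sum()
print(totals)
```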
6. Matplotlib Library
• Matplotlib is a Python library for creating static, animated, and interactive visualizations.
• It provides the pyplot module, which offers a MATLAB-like interface for easy plotting.
• Matplotlib supports line plots, bar charts, histograms, scatter plots, and more.
• It allows customization of axes, labels, legends, colors, and styles for detailed
visualization.
• Widely used in data science, machine learning, and engineering for data representation.
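The customization of axes, labels, legends, colors, and styles described above can be sketched with a simple line plot. The curves and label texts are arbitrary choices for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
# Two line plots with different colors and line styles
ax.plot(x, np.sin(x), color='blue', linestyle='-', label='sin(x)')
ax.plot(x, np.cos(x), color='green', linestyle='--', label='cos(x)')

# Customize the axis labels, title, legend, and grid
ax.set_xlabel('x (radians)')
ax.set_ylabel('value')
ax.set_title('Sine and Cosine')
ax.legend()
ax.grid(True)

# plt.show()  # uncomment to display the figure window
```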
Example Program :
# importing libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(500, 4),
columns =['a', 'b', 'c', 'd'])
df.plot.scatter(x ='a', y ='b')
plt.show()
Output :
import numpy as np
import matplotlib.pyplot as plt
return theta
return X @ theta
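The listing above is incomplete: only the imports and two return statements are present. The sketch below is one possible way to complete it, not the manual's original program. It assumes a least-squares fit on synthetic data, and the function names add_bias, fit, and predict are hypothetical.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so that theta[0] acts as the intercept."""
    return np.c_[np.ones(X.shape[0]), X]

def fit(X, y):
    """Least-squares estimate of theta for the model y = X @ theta."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def predict(X, theta):
    """Predictions of the linear model."""
    return X @ theta

# Synthetic data (assumption): y = 4 + 3x plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=(100, 1))
y = 4 + 3 * x[:, 0] + rng.normal(0, 0.1, size=100)

X = add_bias(x)
theta = fit(X, y)
print("Estimated [intercept, slope]:", theta)
print("Prediction at x = 1.5:", predict(add_bias(np.array([[1.5]])), theta))
```

The estimated intercept and slope should come out close to 4 and 3, the values used to generate the data.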
Output :
5. Implementation of Multiple Linear Regression for House Price
Prediction using sklearn
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
# Split the dataset into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
# Initialize the Multiple Linear Regression model
model = LinearRegression()
# Train the model and predict prices for the test set
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
# Visualize the true vs predicted prices (if needed, this will work well for
# smaller datasets)
plt.scatter(y_test, y_pred)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red',
         linestyle='--')
plt.xlabel('True Prices')
plt.ylabel('Predicted Prices')
plt.title('True vs Predicted House Prices')
plt.show()
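The program above does not show how X and y are loaded. As a self-contained sketch of the same steps, the example below generates synthetic housing data. The column names, coefficients, and noise level are assumptions made for illustration and do not come from a real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic housing data (assumption): price depends linearly on three features
rng = np.random.default_rng(42)
n = 200
houses = pd.DataFrame({
    'area_sqft': rng.uniform(500, 3000, n),
    'bedrooms': rng.integers(1, 6, n),
    'age_years': rng.uniform(0, 40, n),
})
price = (50_000 + 120 * houses['area_sqft'] + 10_000 * houses['bedrooms']
         - 800 * houses['age_years'] + rng.normal(0, 5_000, n))

X = houses
y = price

# 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Coefficients:", dict(zip(X.columns, model.coef_)))
print("Intercept:", model.intercept_)
print("Mean squared error:", mean_squared_error(y_test, y_pred))
```

Because the data are generated from a known linear rule, the learned coefficients can be compared with the values used to create the prices.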
Output :
6. Implementation of Decision tree using sklearn and its parameter tuning
from sklearn.datasets import load_iris
# 1. Load dataset
iris = load_iris()
X = iris.data
y = iris.target
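Only the dataset loading step of this program is shown. A minimal sketch of the remaining steps, training a decision tree and tuning its parameters, is given below. It assumes a grid search with 5-fold cross-validation, and the candidate parameter values are illustrative choices rather than the manual's own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target

# Hold out 20% of the data for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Candidate hyperparameters (illustrative values)
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [2, 3, 4, 5, None],
    'min_samples_split': [2, 5, 10],
}

# 5-fold cross-validated grid search over the parameter grid
search = GridSearchCV(DecisionTreeClassifier(random_state=42),
                      param_grid, cv=5, scoring='accuracy')
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best cross-validation accuracy:", search.best_score_)

# best_estimator_ is refit on the whole training set by default
y_pred = search.best_estimator_.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Test accuracy:", test_accuracy)
```

Limiting max_depth and raising min_samples_split both make the tree simpler, which usually reduces overfitting on small datasets.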
Note: Download the “Credit card fraud detection” dataset from the Kaggle website
using this link: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
Program :
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_auc_score, precision_recall_curve, auc)
# Note: Add model saving, further tuning, or use of other classifiers like
# XGBoost if required.
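Only the imports and the closing note of this program are shown. The sketch below illustrates how the imported tools typically fit together for an imbalanced classification problem. It is not the manual's original program: it replaces the Kaggle file with synthetic data from make_classification (an assumption, so that the example runs without a download) and uses only logistic regression.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_auc_score, precision_recall_curve, auc)

# Synthetic stand-in for the fraud data (assumption): about 2% positive class
X, y = make_classification(n_samples=5000, n_features=10, n_informative=5,
                           weights=[0.98, 0.02], random_state=42)

# Stratify so that both splits keep the same class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Fit the scaler on the training split only to avoid data leakage
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# class_weight='balanced' compensates for the rare positive class
model = LogisticRegression(class_weight='balanced', max_iter=1000)
model.fit(X_train_scaled, y_train)

y_pred = model.predict(X_test_scaled)
y_score = model.predict_proba(X_test_scaled)[:, 1]

cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred, digits=3))
roc_auc = roc_auc_score(y_test, y_score)
print("ROC AUC:", roc_auc)

# Area under the precision-recall curve, often used for rare classes
precision, recall, _ = precision_recall_curve(y_test, y_score)
pr_auc = auc(recall, precision)
print("PR AUC:", pr_auc)
```

With the real dataset, X and y would instead be read from the downloaded CSV file, and accuracy alone would be a misleading metric because almost all transactions are legitimate.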
Output :