
MLCyber Lab

The document outlines a series of laboratory exercises focused on machine learning, covering topics such as data manipulation, linear regression, supervised and unsupervised learning, and model evaluation. Each section includes objectives, tasks, and solutions using Python and relevant libraries like NumPy and pandas. The exercises aim to provide hands-on experience in implementing machine learning concepts and techniques.


Machine Learning

Laboratory Exercises
Kuldeep. J. Purohit
November 25, 2024

Contents
1 Data Manipulation and Statistical Analysis
2 Solving Linear Equations Using Python
3 Working with Vectors and Matrices in Machine Learning
4 Implementing Linear Regression from Scratch
5 Introduction to AI and Machine Learning
6 Data Preprocessing and Feature Engineering
7 Supervised Learning: Classification and Regression
8 Unsupervised Learning: Clustering and Dimensionality Reduction
9 Model Evaluation and Hyperparameter Tuning

1 Data Manipulation and Statistical Analysis
Objective
Apply Python programming skills to manipulate data using lists, dictionaries, and the
pandas library. Perform statistical operations such as mean, median, mode, standard
deviation, and variance.

Tasks
1. Create a dictionary representing a dataset of students with their names, ages, and
scores. Convert it into a pandas DataFrame and display the data.

2. Perform statistical analysis on the Score column:

• Mean
• Median
• Mode
• Standard deviation
• Variance

3. Visualize the distribution of Score using a histogram.

Solution
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Data dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [23, 22, 24, 23, 22],
    'Score': [88, 92, 85, 91, 89]
}

# Convert to DataFrame
df = pd.DataFrame(data)
print(df)

# Calculate statistical measures
mean_score = np.mean(df['Score'])
median_score = np.median(df['Score'])
mode_score = df['Score'].mode()[0]
std_dev_score = np.std(df['Score'])
variance_score = np.var(df['Score'])

# Print statistics
print(f"Mean: {mean_score}")
print(f"Median: {median_score}")
print(f"Mode: {mode_score}")
print(f"Standard Deviation: {std_dev_score}")
print(f"Variance: {variance_score}")

# Plot histogram
plt.hist(df['Score'], bins=5, color='blue', alpha=0.7)
plt.title('Distribution of Scores')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.show()
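
NumPy and pandas use different default formulas for standard deviation and variance, which can change the numbers reported for the Score column. The sketch below is an illustration, not part of the original solution. It reuses the five scores from the exercise and compares the population form (divide by N) with the sample form (divide by N - 1). The variable names are chosen here for the example.

```python
import numpy as np
import pandas as pd

scores = pd.Series([88, 92, 85, 91, 89], name="Score")
values = scores.to_numpy()

# NumPy defaults to the population formulas (divide by N, ddof=0).
pop_std = np.std(values)
pop_var = np.var(values)

# pandas defaults to the sample formulas (divide by N - 1, ddof=1).
sample_std = scores.std()
sample_var = scores.var()

print(f"Population std: {pop_std:.4f}, variance: {pop_var:.4f}")
print(f"Sample std:     {sample_std:.4f}, variance: {sample_var:.4f}")

# Passing ddof explicitly makes the two libraries agree.
print(np.isclose(np.std(values, ddof=1), sample_std))
```

Which form is appropriate depends on whether the five students are treated as the whole population or as a sample from a larger one.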

2 Solving Linear Equations Using Python


Objective
Use Python and NumPy to solve a system of linear equations and understand matrix
operations.

Tasks
1. Represent the system of equations 2x + 3y = 5 and x − y = 1 as a matrix equation
Ax = b.

2. Solve for x and y using numpy.linalg.solve().

3. Verify the solution by substituting x and y back into the original equations.

Solution
import numpy as np

# Coefficients matrix A
A = np.array([[2, 3], [1, -1]])

# Constants vector b
b = np.array([5, 1])

# Solve for x and y
solution = np.linalg.solve(A, b)
print(f"Solution: x = {solution[0]}, y = {solution[1]}")

# Verify the solution
check = np.dot(A, solution)
print(f"Verification: {check}")
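
Printing the product of A and the solution leaves the comparison with b to the reader. One possible way to automate the check, sketched here as an assumption rather than as part of the original solution, is to compare the two with a floating-point tolerance using np.allclose.

```python
import numpy as np

A = np.array([[2, 3], [1, -1]])
b = np.array([5, 1])
solution = np.linalg.solve(A, b)

# A @ solution should reproduce b up to floating-point rounding.
residual = A @ solution - b
is_valid = np.allclose(A @ solution, b)

print(f"Residual: {residual}")
print(f"Solution satisfies both equations: {is_valid}")
```

A tolerance-based comparison is used instead of == because the solver works in floating-point arithmetic, so the residual is usually tiny but not always exactly zero.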

3 Working with Vectors and Matrices in Machine Learning

Objective
Understand and perform basic vector and matrix operations, foundational to machine
learning algorithms.

Tasks
1. Create vectors and perform:
• Dot product
• Element-wise addition
• Cross product
2. Create matrices and perform:
• Matrix multiplication
• Transpose
• Inverse (if invertible)
3. Compute eigenvalues and eigenvectors of a random 3 × 3 matrix.

Solution
import numpy as np

# Vector operations
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

dot_product = np.dot(v1, v2)
elementwise_addition = v1 + v2
cross_product = np.cross(v1, v2)

print(f"Dot Product: {dot_product}")
print(f"Element-wise Addition: {elementwise_addition}")
print(f"Cross Product: {cross_product}")

# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_multiplication = np.dot(A, B)
transpose_A = np.transpose(A)
inverse_A = np.linalg.inv(A)

print(f"Matrix Multiplication:\n{matrix_multiplication}")
print(f"Transpose of A:\n{transpose_A}")
print(f"Inverse of A:\n{inverse_A}")

# Eigenvalues and Eigenvectors
random_matrix = np.random.rand(3, 3)
eigenvalues, eigenvectors = np.linalg.eig(random_matrix)

print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")
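
The task asks for the inverse only "if invertible", but the solution above inverts A without checking. A minimal sketch of one way to guard the call follows. The helper name safe_inverse and the singular matrix S are hypothetical, introduced only for this illustration, and the rank test is one of several reasonable checks.

```python
import numpy as np

def safe_inverse(M):
    """Return the inverse of a square matrix M, or None if M is rank-deficient."""
    if np.linalg.matrix_rank(M) < M.shape[0]:
        return None
    return np.linalg.inv(M)

A = np.array([[1.0, 2.0], [3.0, 4.0]])  # full rank, invertible
S = np.array([[1.0, 2.0], [2.0, 4.0]])  # second row is twice the first, singular

inverse_A = safe_inverse(A)
inverse_S = safe_inverse(S)

print(f"Inverse of A:\n{inverse_A}")
print(f"Inverse of S: {inverse_S}")
```

Multiplying A by inverse_A should give the identity matrix up to rounding, which is a quick way to confirm the result.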

4 Implementing Linear Regression from Scratch


Objective
Implement a simple linear regression algorithm using Python.

Tasks
1. Generate synthetic data for y = 2x + 1 with random noise.
2. Visualize the data using matplotlib.
3. Implement the linear regression formula:
$$\theta = (X^{T} X)^{-1} X^{T} y$$

4. Make predictions and evaluate using Mean Squared Error (MSE).

Solution
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic data
np.random.seed(42)
X = np.random.rand(100, 1) * 10
y = 2 * X + 1 + np.random.randn(100, 1)

# Visualize the data
plt.scatter(X, y, color='blue')
plt.title('Generated Data')
plt.xlabel('X')
plt.ylabel('y')
plt.show()

# Linear regression: add a column of ones for the intercept term
X_bias = np.c_[np.ones((X.shape[0], 1)), X]
theta = np.linalg.inv(X_bias.T.dot(X_bias)).dot(X_bias.T).dot(y)

# Predictions
y_pred = X_bias.dot(theta)

# MSE
mse = np.mean((y - y_pred) ** 2)
print(f"Mean Squared Error: {mse}")

# Plot the regression line
plt.scatter(X, y, color='blue')
plt.plot(X, y_pred, color='red')
plt.title('Linear Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.show()
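
The closed-form solution above builds an explicit matrix inverse, which can be numerically fragile when the columns of X are nearly collinear. As a cross-check, and only as a sketch alongside the exercise's own method, the same coefficients can be obtained from NumPy's least-squares solver np.linalg.lstsq. The data generation mirrors the solution above, and the names theta_normal and theta_lstsq are chosen here for the comparison.

```python
import numpy as np

# Same synthetic data as in the exercise: y = 2x + 1 plus Gaussian noise.
np.random.seed(42)
X = np.random.rand(100, 1) * 10
y = 2 * X + 1 + np.random.randn(100, 1)
X_bias = np.c_[np.ones((X.shape[0], 1)), X]

# Normal equation, as in the exercise.
theta_normal = np.linalg.inv(X_bias.T @ X_bias) @ X_bias.T @ y

# Least-squares solver: no explicit inverse is formed.
theta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X_bias, y, rcond=None)

print(f"Normal equation: intercept={theta_normal[0, 0]:.4f}, slope={theta_normal[1, 0]:.4f}")
print(f"lstsq:           intercept={theta_lstsq[0, 0]:.4f}, slope={theta_lstsq[1, 0]:.4f}")
print(f"Estimates agree: {np.allclose(theta_normal, theta_lstsq)}")
```

On this well-conditioned problem the two estimates should match closely. The difference matters mainly for ill-conditioned design matrices.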

5 Introduction to AI and Machine Learning


Objective: Understand the basic concepts of AI and ML, including definitions and types
of learning. Explore the applications of AI and ML in real-world scenarios.

Tasks
1. Research and Present AI Applications: List at least 5 applications of AI and
ML in different domains (e.g., healthcare, finance, transportation, etc.). Write a
brief explanation of how AI/ML is used in each application.

2. Classify Types of Learning: Describe and compare supervised learning,
unsupervised learning, and reinforcement learning. For each type of learning, provide a
real-world example. Create a table summarizing the types of learning.

3. Hands-on Task: Load a simple dataset (e.g., the Iris dataset) using scikit-learn
and visualize the features.
from sklearn import datasets
import matplotlib.pyplot as plt

# Load Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Scatter plot of the first two features
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
plt.title("Iris Dataset")
plt.xlabel("Feature 1 (Sepal length)")
plt.ylabel("Feature 2 (Sepal width)")
plt.show()

6 Data Preprocessing and Feature Engineering


Objective: Learn about data cleaning, feature selection, and feature normalization.

Tasks
1. Data Cleaning: Load a dataset (e.g., Titanic dataset from Kaggle). Check for
missing values and apply methods to handle them (e.g., fill with mean or drop
rows).
import pandas as pd

# Load Titanic dataset
df = pd.read_csv('titanic.csv')

# Check for missing values
print(df.isnull().sum())

# Fill missing 'Age' values with the mean
# (assign the result back; chained inplace=True is deprecated in recent pandas)
df['Age'] = df['Age'].fillna(df['Age'].mean())
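
The task names two ways to handle missing values, filling with the mean or dropping rows, while the solution shows only the first. The sketch below contrasts both on a small made-up DataFrame. The toy values are hypothetical and stand in for titanic.csv so the example runs without the file.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value in each column.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0, 54.0],
    "Fare": [7.25, 71.28, np.nan, 51.86],
})

# Option 1: fill each missing value with its column mean (keeps all rows).
filled = toy.fillna(toy.mean())

# Option 2: drop every row that has at least one missing value.
dropped = toy.dropna()

print(f"Rows after filling:  {len(filled)}")
print(f"Rows after dropping: {len(dropped)}")
print(filled)
```

Filling keeps every row at the cost of inventing values. Dropping keeps only observed values but can discard a large share of a small dataset.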

2. Feature Normalization: Normalize the numerical features using MinMaxScaler
or StandardScaler.
from sklearn.preprocessing import StandardScaler

# Normalize 'Age' and 'Fare' columns
scaler = StandardScaler()
df[['Age', 'Fare']] = scaler.fit_transform(df[['Age', 'Fare']])
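
The task also mentions MinMaxScaler, which rescales each column to the range [0, 1] instead of standardizing it to zero mean and unit variance. A minimal sketch follows, using a small hypothetical DataFrame in place of the Titanic data so that it runs on its own.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy values standing in for the 'Age' and 'Fare' columns.
toy = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

# Rescale each column so its minimum maps to 0 and its maximum to 1.
scaler = MinMaxScaler()
scaled = pd.DataFrame(scaler.fit_transform(toy), columns=toy.columns)

print(scaled)
```

Min-max scaling preserves the shape of each distribution but is sensitive to outliers, since a single extreme value sets the range.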

3. Feature Selection: Use correlation analysis or feature importance (e.g., decision
trees) to select relevant features.
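import seaborn as sns
import matplotlib.pyplot as plt

# Calculate correlation matrix over the numeric columns only
# (text columns such as names cannot be correlated)
corr = df.corr(numeric_only=True)

# Plot the heatmap of correlations
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()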
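
The task also suggests feature importance from decision trees as a selection method. The sketch below shows the idea on the Iris dataset bundled with scikit-learn, used here only because it needs no external file. Applying it to the Titanic data would first require encoding the non-numeric columns, which is not shown.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Fit a decision tree; random_state fixes tie-breaking between equal splits.
tree = DecisionTreeClassifier(random_state=42)
tree.fit(iris.data, iris.target)

# Importances are non-negative and sum to 1; higher means more useful splits.
importances = tree.feature_importances_
ranked = sorted(zip(iris.feature_names, importances), key=lambda pair: pair[1], reverse=True)

for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Features with importance near zero are candidates for removal, although the ranking from a single tree can be unstable and is best treated as a rough guide.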

7 Supervised Learning: Classification and Regression

Objective: Apply supervised learning algorithms to classification and regression tasks.

Tasks
1. Regression with Linear Regression: Use the California Housing dataset from
scikit-learn to perform linear regression and predict house values. The Boston
Housing dataset (load_boston) was removed in scikit-learn 1.2, so
fetch_california_housing is used in its place. Evaluate the model using Mean
Squared Error (MSE).
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load dataset (downloaded and cached on first use)
housing = fetch_california_housing()
X = housing.data
y = housing.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

2. Classification with Logistic Regression: Use the Iris dataset for classification
with Logistic Regression. Evaluate the model using accuracy and confusion matrix.
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Train the model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, y_pred)}")

8 Unsupervised Learning: Clustering and Dimensionality Reduction

Objective: Implement unsupervised learning techniques like clustering and
dimensionality reduction.

Tasks
1. Clustering with K-Means: Apply K-Means clustering on the Iris dataset and
visualize the clusters.
from sklearn import datasets
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Load the Iris features
X = datasets.load_iris().data

# Apply KMeans clustering (n_init set explicitly; its default differs across versions)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
y_kmeans = kmeans.fit_predict(X)

# Visualize the clusters and their centers
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
            s=200, c='red', marker='x')
plt.title("K-Means Clustering")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

2. Dimensionality Reduction with PCA: Apply Principal Component Analysis
(PCA) to reduce the dimensions of the Iris dataset and visualize it in 2D.
from sklearn import datasets
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Load the Iris features and labels
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Apply PCA to reduce to 2 components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Visualize the reduced data
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y, cmap='viridis')
plt.title("PCA - Iris Data")
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.show()
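
A 2D PCA plot is easier to interpret when it is known how much of the original variance the two components retain. The sketch below, an addition to the exercise and not part of the original solution, reads this from the fitted model's explained_variance_ratio_ attribute.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Reduce the four Iris features to two principal components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Fraction of the total variance captured by each component.
ratios = pca.explained_variance_ratio_
for index, ratio in enumerate(ratios, start=1):
    print(f"Principal Component {index}: {ratio:.3f}")
print(f"Total variance retained: {ratios.sum():.3f}")
```

If the total is low, the 2D plot hides much of the structure in the data, and more components may be worth keeping.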

9 Model Evaluation and Hyperparameter Tuning


Objective: Evaluate models using cross-validation and tune hyperparameters for better
performance.

Tasks
1. Model Evaluation with Cross-Validation: Apply cross-validation to evaluate
the performance of a classification model (e.g., SVM or Random Forest).
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

# Apply 5-fold cross-validation
# (X and y are the Iris features and labels loaded in the earlier exercises)
rf = RandomForestClassifier(n_estimators=100)
scores = cross_val_score(rf, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean()}")
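
The task names SVM as an alternative to Random Forest. The same cross-validation call works with an SVM classifier, as sketched below on the Iris data. The sketch also reports the spread across folds, which the mean alone hides. The default RBF kernel is an assumption made here, not something the exercise specifies.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation with an SVM using the RBF kernel.
svm = SVC(kernel="rbf")
scores = cross_val_score(svm, X, y, cv=5)

print(f"Per-fold accuracy: {scores}")
print(f"Mean accuracy: {scores.mean():.3f}")
print(f"Standard deviation: {scores.std():.3f}")
```

A large standard deviation relative to the mean suggests the estimate depends heavily on how the data happened to be split.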

2. Hyperparameter Tuning with GridSearchCV: Use GridSearchCV to tune the
hyperparameters of an SVM classifier (e.g., C and gamma).
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Define parameter grid and model
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}
svm = SVC()

# Apply GridSearchCV
# (X_train and y_train come from the Iris train-test split in Section 7)
grid_search = GridSearchCV(svm, param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Display best hyperparameters
print(f"Best Hyperparameters: {grid_search.best_params_}")
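
The best cross-validation score is computed on the same training data that was used to choose the hyperparameters, so it tends to be optimistic. A possible follow-up, sketched here as an addition to the exercise, is to score the tuned model on a held-out test set. The split below repeats the Iris split from Section 7 so the example runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Search the same grid as in the exercise, using only the training data.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# score() uses the best estimator, refitted on the full training set.
test_accuracy = grid_search.score(X_test, y_test)

print(f"Best hyperparameters: {grid_search.best_params_}")
print(f"Best cross-validation accuracy: {grid_search.best_score_:.3f}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

Keeping the test set out of the search means the final accuracy reflects data the tuning procedure never saw.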
