ML Aml Cse It Lab Manual Final

Uploaded by

Dhyey Baldha

Faculty of Degree Engineering – 083

Department of Computer Science & Engineering - 31

SEMESTER: 7

LAB MANUAL
Machine Learning 3170724

Name:

Enrollment No:

Batch:

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING - 31
DR. SUBHASH TECHNICAL CAMPUS

CERTIFICATE

Roll No.:- Enrollment No.:-

This is to certify that the practical work satisfactorily carried out
and recorded in this journal is the bona fide work of Mr./Miss
_________________________________, student of Computer Science &
Engineering - 31, Semester 7, in the Machine Learning (3170724)
Laboratory of Dr. Subhash Technical Campus during the academic
year 2024-25.

Submission Date: ……………

Subject in Charge HOD

Examiner

INDEX

SR. NO.   AIM                                                                  PAGE NO   DATE   SIGN

1         Introduction to various libraries, tools used in Machine Learning.
2         Importing dataset and reading a dataset using Pandas Library.
3         To clean the data and apply methods to deal with Missing Values.
4         To deal with Outliers for data pre-processing.
5         To implement Linear Regression in Python.
6         To Evaluate Linear Regression in Python.
7         To Implement Logistic Regression in Python.
8         Implement KNN Algorithm for Classification.

Practical - 1
Introduction to various libraries, tools used in Machine Learning.

Machine Learning, as the name suggests, is the science of programming computers so that they can learn from
different kinds of data. A more general definition, attributed to Arthur Samuel, is: “Machine Learning is the field of
study that gives computers the ability to learn without being explicitly programmed.” Machine Learning techniques are
typically used to solve a wide range of real-life problems.
In the past, people performed Machine Learning tasks by manually coding all the algorithms and the underlying
mathematical and statistical formulas. This made the process time-consuming, tedious, and inefficient. Today it has
become much easier and more efficient thanks to various Python libraries, frameworks, and modules. Python is now
one of the most popular programming languages for this task and has replaced many other languages in the industry;
one of the reasons is its vast collection of libraries. Python libraries that are used in Machine Learning are:

1. NumPy
2. SciPy
3. Scikit-learn
4. TensorFlow
5. Keras
6. PyTorch
7. Pandas
8. Matplotlib

NumPy

NumPy is a very popular Python library for processing large multi-dimensional arrays and matrices, supported by a
large collection of high-level mathematical functions. It is very useful for fundamental scientific computations in
Machine Learning, particularly for its linear algebra, Fourier transform, and random number capabilities.
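
The three capabilities named above can be sketched in a few lines. This is a minimal illustration, not part of the original practical; the matrix, vector, and seed values are chosen only for the example.

```python
import numpy as np

# A 2x2 matrix and a vector (values chosen only for illustration).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Linear algebra: solve the system A @ x = b.
x = np.linalg.solve(A, b)

# Fourier transform: a constant signal has all its energy in the zero-frequency bin.
spectrum = np.fft.fft(np.ones(4))

# Random numbers: a seeded generator gives reproducible samples.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=5)

print(x)               # solution of the linear system
print(spectrum.real)   # real part of the spectrum
print(samples.shape)   # shape of the random sample array
```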

SciPy

SciPy is a very popular library among Machine Learning enthusiasts as it contains different modules for optimization,
linear algebra, integration and statistics. There is a difference between the SciPy library and the SciPy stack: the
SciPy library is one of the core packages that make up the SciPy stack. SciPy is also very useful for image manipulation.
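
As a rough illustration of the optimization and integration modules mentioned above, the sketch below minimizes a simple quadratic and integrates a polynomial. Both functions are made up for the example.

```python
from scipy import integrate, optimize

# Optimization: minimize f(x) = (x - 2)^2, whose minimum lies at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Integration: integrate x^2 from 0 to 3 (the exact value is 9).
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 3.0)

print(result.x)   # location of the minimum found by the optimizer
print(area)       # numerical value of the integral
```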

Scikit-learn


Scikit-learn is one of the most popular ML libraries for classical ML algorithms. It is built on top of two basic Python
libraries, viz., NumPy and SciPy. Scikit-learn supports most of the supervised and unsupervised learning algorithms.
It can also be used for data mining and data analysis, which makes it a great tool for anyone who is starting out with
ML.
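
As a small illustration of the fit/predict pattern that Scikit-learn estimators share, the sketch below clusters six made-up points with KMeans, one of the unsupervised algorithms in the library. The points and the choice of two clusters are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Six made-up 2D points forming two obvious groups (illustrative data only).
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])

# Unsupervised learning: no labels are given, only the number of clusters.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

# The cluster numbering (0 or 1) is arbitrary; only the grouping matters.
print(labels)
print(model.cluster_centers_)
```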

TensorFlow

TensorFlow is a very popular open-source library for high-performance numerical computation, developed by the
Google Brain team at Google. As the name suggests, TensorFlow is a framework for defining and running
computations involving tensors. It can train and run deep neural networks that can be used to develop several AI
applications. TensorFlow is widely used in the field of deep learning research and application.

Keras

Keras is a very popular Machine Learning library for Python. It is a high-level neural networks API capable of
running on top of TensorFlow, CNTK, or Theano. It can run seamlessly on both CPU and GPU. Keras makes it really
easy for ML beginners to build and design a neural network. One of the best things about Keras is that it allows for
easy and fast prototyping.

PyTorch

PyTorch is a popular open-source Machine Learning library for Python based on Torch, which is an open-source
Machine Learning library that is implemented in C with a wrapper in Lua. It has an extensive choice of tools and
libraries that support Computer Vision, Natural Language Processing (NLP), and many more ML programs. It allows
developers to perform computations on tensors with GPU acceleration and also helps in creating computational
graphs.

Pandas

Pandas is a popular Python library for data analysis. It is not directly related to Machine Learning, but a dataset
must be prepared before training, and this is where Pandas comes in handy: it was developed specifically for
data extraction and preparation. It provides high-level data structures and a wide variety of tools for data analysis,
including many inbuilt methods for grouping, combining and filtering data.
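
Grouping, combining and filtering can be sketched on two tiny tables. The column names and values below are hypothetical and serve only to show the calls.

```python
import pandas as pd

# Two small made-up tables (hypothetical names and values).
sales = pd.DataFrame({
    "city": ["A", "B", "A", "B"],
    "units": [10, 20, 30, 40],
})
regions = pd.DataFrame({
    "city": ["A", "B"],
    "region": ["West", "South"],
})

# Grouping: total units sold per city.
totals = sales.groupby("city")["units"].sum()

# Combining: attach the region of each city to every sales row.
merged = sales.merge(regions, on="city", how="left")

# Filtering: keep only the rows with more than 15 units.
large = sales[sales["units"] > 15]

print(totals)
print(merged)
print(large)
```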

Matplotlib

Matplotlib is a very popular Python library for data visualization. Like Pandas, it is not directly related to Machine
Learning. It particularly comes in handy when a programmer wants to visualize the patterns in the data. It is a 2D
plotting library used for creating 2D graphs and plots. A module named pyplot makes plotting easy for programmers,
as it provides features to control line styles, font properties, axis formatting, etc. It provides various kinds of
graphs and plots for data visualization, viz., histograms, error charts, bar charts, etc.
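
A minimal pyplot sketch with a styled line plot and a histogram is given below. The data is generated only for illustration, and the figure is written to an in-memory buffer so the script also runs on a machine without a display; in a notebook, plt.show() would typically be used instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, works without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 50)

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with an explicit line style, colour, axis labels and legend.
ax_line.plot(x, np.sin(x), linestyle="--", color="red", label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")
ax_line.set_title("Line plot")
ax_line.legend()

# Histogram of 200 random values split into 10 bins.
rng = np.random.default_rng(0)
counts, bin_edges, _ = ax_hist.hist(rng.normal(size=200), bins=10)
ax_hist.set_title("Histogram")

# Render the figure as a PNG image in memory.
fig.tight_layout()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```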

Signature:

Date: _________________________


Practical-2
Importing dataset and reading a dataset using Pandas Library.

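import pandas as pd

# Load the dataset from framingham.csv
disease_df = pd.read_csv("/framingham.csv")

# Drop the 'education' column
disease_df.drop(['education'], inplace=True, axis=1)

# Rename the 'male' column to 'Sex_male'
disease_df.rename(columns={'male': 'Sex_male'}, inplace=True)

# Show the first five rows
print(disease_df.head())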

Output:
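
The same read, drop and rename steps can be tried without any file on disk. In the sketch below the CSV text is a made-up stand-in (the values are not taken from the real dataset) and is read through io.StringIO.

```python
import io

import pandas as pd

# Stand-in CSV text so the sketch runs without a file (values are made up).
csv_text = """male,age,education,currentSmoker
1,39,4,0
0,46,2,0
1,48,1,1
"""

df = pd.read_csv(io.StringIO(csv_text))

# Same clean-up steps as in the practical, written without inplace=True.
df = df.drop(columns=["education"]).rename(columns={"male": "Sex_male"})

print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # data type of each column
print(df.head())   # first rows of the cleaned table
```

Chaining drop and rename returns a new DataFrame at each step, which avoids modifying the original object in place.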

Signature:

Date: _________________________


Practical-3
To clean the data and apply methods to deal with Missing Values.

When you have a dataset, the first step is to check which columns have missing data and how many values are missing.
Let us use the most famous dataset among data science learners: the Titanic survivors dataset. Read the dataset using
the pandas read_csv function as shown below.

import pandas as pd

# Load the Titanic dataset and show the first five rows
df = pd.read_csv("/content/titanic_data.csv")
print(df.head())

How to check which columns have missing data, and how many?
The isnull() function is used for this. When you call sum() on the result of isnull(), the output is the number of
missing values in each column.

missing_values=df.isnull().sum()
print(missing_values)

Although we know how many values are missing in each column, it is essential to know the percentage of them
against the total values. So, let us calculate that in a single line of code.

mis_value_percent = 100 * df.isnull().sum() / len(df)

print(mis_value_percent)
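
The count and the percentage can also be combined into one summary table. The sketch below uses a tiny made-up DataFrame (not the Titanic data) so that the numbers are easy to check by hand.

```python
import numpy as np
import pandas as pd

# Tiny made-up table: 'Age' has 2 missing values, 'Embarked' has 1, 'Fare' has none.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0, np.nan],
    "Embarked": ["S", "C", None, "S"],
    "Fare": [7.25, 71.28, 8.05, 53.10],
})

missing_count = toy.isnull().sum()
# The mean of a boolean column is the fraction of True values.
missing_percent = 100 * toy.isnull().mean()

# Keep only the columns that actually have missing values, worst first.
summary = pd.DataFrame({"missing": missing_count, "percent": missing_percent})
summary = summary[summary["missing"] > 0].sort_values("percent", ascending=False)

print(summary)
```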


Dropping rows with Missing Values

This is a simple method in which we drop all the rows that have a missing value in a particular column. As
easy as this is, it comes with a big disadvantage: you might end up losing a large chunk of your data. This
reduces the size of your dataset and can bias your model's predictions. You should use it only when the number of
missing values is very small.

For example, the ‘Embarked’ column has just 2 missing values, so we can drop the rows where this column is missing.
Follow the code snippet below.

print('Dataset before :', len(df))

df.dropna(subset=['Embarked'],how='any',inplace=True)
print('Dataset after :', len(df))
print('missing values :',df['Embarked'].isnull().sum())

Imputation with mean

When a continuous variable column has missing values, you can calculate the mean of the non-null values and use it
to fill the gaps.

# Fill the missing ages with the mean of the non-null ages
df['Age'] = df['Age'].fillna(df['Age'].mean())
print(df['Age'][:10])

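Imputation with median

When the column is skewed or contains outliers, the median is a more robust fill value than the mean. Use this in
place of the mean imputation above; once a column has been filled, it has no missing values left to impute.

# Fill the missing ages with the median of the non-null ages
df['Age'] = df['Age'].fillna(df['Age'].median())
print(df['Age'][:10])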
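
The difference between the two fill values is easiest to see on a small made-up series with one extreme value. In the sketch below the single large age pulls the mean upward, while the median stays close to the typical values.

```python
import numpy as np
import pandas as pd

# Made-up ages: one missing value and one extreme value (90).
ages = pd.Series([20.0, 22.0, 24.0, np.nan, 90.0], name="Age")

# Both statistics ignore the missing value when they are computed.
mean_filled = ages.fillna(ages.mean())      # mean of 20, 22, 24, 90
median_filled = ages.fillna(ages.median())  # median of 20, 22, 24, 90

print(mean_filled[3])    # value used to fill the gap with the mean
print(median_filled[3])  # value used to fill the gap with the median
```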

Signature:

Date: _________________________
Practical-4
To deal with Outliers for data pre-processing.

Outlier Detection And Removal

A pandas DataFrame is used here because real-world projects usually need to detect outliers in tabular data
during the data analysis step. The same approach can be used on lists and Series objects.

Dataset Used For Outlier Detection

The dataset used in this practical is the Diabetes dataset, which ships with the scikit-learn library.

# Importing
from sklearn.datasets import load_diabetes
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
diabetics = load_diabetes()

# Create the dataframe, using the feature names as column names
column_name = diabetics.feature_names
df_diabetics = pd.DataFrame(diabetics.data)
df_diabetics.columns = column_name
print(df_diabetics.head())

Visualizing and Removing Outliers Using Box Plot

A box plot captures the summary of the data effectively and efficiently with only a simple box and
whiskers. It summarizes sample data using the 25th, 50th, and 75th percentiles. One can get insights (quartiles,
median, and outliers) into the dataset just by looking at its box plot.

# Box Plot
import seaborn as sns
sns.boxplot(x=df_diabetics['bmi'])


import seaborn as sns
import matplotlib.pyplot as plt

def removal_box_plot(df, column, threshold):
    # Box plot of the original column
    sns.boxplot(x=df[column])
    plt.title(f'Original Box Plot of {column}')
    plt.show()

    # Keep only the rows whose value is at or below the threshold
    removed_outliers = df[df[column] <= threshold]

    # Box plot after removing the outliers
    sns.boxplot(x=removed_outliers[column])
    plt.title(f'Box Plot without Outliers of {column}')
    plt.show()
    return removed_outliers

threshold_value = 0.12

no_outliers = removal_box_plot(df_diabetics, 'bmi', threshold_value)
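
The function above uses a threshold read off the plot by eye. A common alternative, shown here as a sketch and not taken from the original practical, is the interquartile range (IQR) rule that box plots conventionally use to place their whiskers: values more than 1.5 times the IQR beyond the quartiles are treated as outliers. The data below is made up.

```python
import pandas as pd

# Made-up values with one obvious outlier (50).
values = pd.Series([10, 12, 11, 13, 12, 11, 50], name="value")

# Quartiles and interquartile range.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Fences: anything outside [lower, upper] is treated as an outlier.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

kept = values[(values >= lower) & (values <= upper)]

print(lower, upper)
print(kept.tolist())
```

Unlike a hand-picked threshold, the fences are computed from the data and remove outliers on both sides of the distribution.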

Z-score

The Z-score is also called the standard score. It tells us how far a data point is from the mean, measured in
standard deviations. After setting a threshold value, one can use the Z-scores of the data points to define the outliers.

Z-score = (data_point - mean) / standard deviation

In this example, we calculate the Z-scores for the ‘age’ column in the DataFrame df_diabetics using
the zscore function from the SciPy stats module. The resulting array z contains the absolute Z-score of each data point
in the ‘age’ column, indicating how many standard deviations each value is from the mean.

from scipy import stats
import numpy as np

# Absolute Z-score of every value in the 'age' column
z = np.abs(stats.zscore(df_diabetics['age']))
print(z)

threshold_z = 2

# Positions of the rows whose absolute Z-score exceeds the threshold
outlier_indices = np.where(z > threshold_z)[0]

# df_diabetics has the default 0..n-1 index, so positions and labels coincide
no_outliers = df_diabetics.drop(outlier_indices)
print("Original DataFrame Shape:", df_diabetics.shape)
print("DataFrame Shape after Removing Outliers:", no_outliers.shape)
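
To see what zscore computes, the formula can be applied by hand to a small made-up array and compared with the SciPy result. The threshold of 1.5 below is chosen only so that this tiny example flags one value.

```python
import numpy as np
from scipy import stats

# Made-up data with mean 5 and (population) standard deviation 2.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Manual Z-score: subtract the mean, divide by the standard deviation.
# np.std uses the population formula (ddof=0), the same default as stats.zscore.
manual_z = (data - data.mean()) / data.std()
scipy_z = stats.zscore(data)

# A boolean mask selects the outliers directly, without index look-ups.
threshold = 1.5  # illustrative threshold for this small example
outliers = data[np.abs(manual_z) > threshold]

print(manual_z)
print(outliers)
```

Filtering with a boolean mask also works when the DataFrame index is not the default 0..n-1 range, a case where dropping rows by position would remove the wrong rows.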


Practical-5
To implement Linear Regression in Python.

Machine Learning is a branch of Artificial Intelligence that focuses on the development of algorithms and statistical
models that can learn from data and make predictions on it. Linear regression is a machine-learning algorithm,
more specifically a supervised machine-learning algorithm, that learns from labelled datasets and maps the data points
to the best-fitting linear function, which can then be used for prediction on new datasets.

First, we should know what a supervised machine learning algorithm is. It is a type of machine learning where the
algorithm learns from labelled data. Labelled data means a dataset whose target values are already known.
Supervised learning has two types:

Classification: It predicts the class of a data point based on the independent input variables. A class is a categorical or
discrete value, for example whether the image of an animal shows a cat or a dog.

Regression: It predicts a continuous output variable based on the independent input variables, for example the
price of a house based on parameters such as house age, distance from the main road, location, area, etc.

Types of Linear Regression

There are two main types of linear regression:

Simple Linear Regression

This is the simplest form of linear regression, and it involves only one independent variable and one dependent
variable. The equation for simple linear regression is:
y = β0 + β1X
where:
y is the dependent variable
X is the independent variable
β0 is the intercept
β1 is the slope

Multiple Linear Regression

This involves more than one independent variable and one dependent variable. The equation for multiple linear
regression is:
y = β0 + β1X1 + β2X2 + … + βnXn
where:
y is the dependent variable
X1, X2, …, Xn are the independent variables
β0 is the intercept
β1, β2, …, βn are the slopes
The goal of the algorithm is to find the best-fit line equation that can predict the values based on the independent
variables.
In regression, a set of records with X and Y values is available, and these values are used to learn a function. If you
want to predict Y for an unseen X, this learned function can be used. Since Y is continuous in the case of regression,
we need a function that predicts a continuous Y given X as the independent features.
What is the best-fit line?
Our primary objective while using linear regression is to locate the best-fit line, which means that the error between the
predicted and actual values should be kept to a minimum. The best-fit line is the line with the least error.
The best-fit line equation provides a straight line that represents the relationship between the dependent and independent
variables. The slope of the line indicates how much the dependent variable changes for a unit change in the independent
variable(s).


Here Y is called the dependent or target variable and X is called the independent variable, also known as the predictor of Y.
Many types of functions or models can be used for regression; a linear function is the simplest type.
Here, X may be a single feature or multiple features representing the problem.
Linear regression predicts the value of a dependent variable (y) from a given independent variable (x),
hence the name Linear Regression. In the salary example used in this practical, X (input) is the work experience and
Y (output) is the salary of a person. The regression line is the best-fit line for our model.
Different values for the weights (the coefficients of the line) result in different regression lines, so we use a cost
function to find the values that give the best-fit line.
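
The role of the cost function can be sketched with the mean squared error and the closed-form least-squares solution for a single feature. The five data points below are made up and lie exactly on the line y = 1 + 2x, so the best-fit line has zero cost; the function name mse_cost is an illustrative choice.

```python
import numpy as np

# Made-up data lying exactly on the line y = 1 + 2x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

def mse_cost(b0, b1, x, y):
    """Mean squared error of the line y = b0 + b1 * x on the data."""
    return np.mean((y - (b0 + b1 * x)) ** 2)

# Closed-form least-squares estimates for simple linear regression.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

print(intercept, slope)                    # fitted β0 and β1
print(mse_cost(intercept, slope, x, y))    # cost of the best-fit line
print(mse_cost(0.0, 2.0, x, y))            # cost of a different, worse line
```

Any other choice of intercept and slope gives a larger cost on this data, which is what "best fit" means here.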

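import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Load the salary dataset
data_set= pd.read_csv('/content/salary_data.csv')

# x: all columns except the last (features), y: the last column (target)
x= data_set.iloc[:, :-1].values
y= data_set.iloc[:, -1].values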

# Splitting the dataset into training and test set.

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 1/3, random_state=0)

#Fitting the Simple Linear Regression model to the training dataset

from sklearn.linear_model import LinearRegression
regressor= LinearRegression()
regressor.fit(x_train, y_train)

#Prediction of Test and Training set result

y_pred= regressor.predict(x_test)
x_pred= regressor.predict(x_train)

#visualizing the Training set results
mtp.scatter(x_train, y_train, color="green")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Training Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()


#visualizing the Test set results
mtp.scatter(x_test, y_test, color="blue")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Test Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()

Signature:

Date: _________________________


Practical-6
To Evaluate Linear Regression in Python.
Evaluating linear regression models

Several metrics can be used to evaluate linear regression models. Since no model is 100 percent accurate,
evaluating the model on different metrics helps us fine-tune it and obtain better results.
The metrics we can use include:
Mean Absolute Error(MAE) is the average absolute difference between the actual and predicted values. We sum the
absolute values of all the prediction errors and divide by the total number of data points.

from sklearn.metrics import mean_absolute_error

print('MAE:', mean_absolute_error(y_test,y_pred))

Mean Squared Error(MSE):

This is one of the most commonly used metrics. It is the average squared difference between the actual and predicted
values. We sum the squares of all the prediction errors and divide by the number of data points.

To get the MSE for the model, import the mean_squared_error function from the sklearn.metrics module.

from sklearn.metrics import mean_squared_error

print("MSE",mean_squared_error(y_test,y_pred))

Root Mean Squared Error(RMSE) is the square root of MSE.

import numpy as np
print("RMSE",np.sqrt(mean_squared_error(y_test,y_pred)))


R Squared(R2): R2 is also called the coefficient of determination or the goodness-of-fit score of a regression function.
It measures the proportion of the variance in the dependent variable that the model can explain. The R2 value usually
lies between 0 and 1, and a larger value shows a better fit between the predicted and actual values. On test data it can
be negative when the model fits worse than simply predicting the mean.

from sklearn.metrics import r2_score

r2 = r2_score(y_test,y_pred)
print(r2)
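
All four metrics can be checked by hand on a few made-up values. The sketch below computes each one directly from its definition with NumPy and compares the result with the scikit-learn functions; the arrays y_true and y_hat are illustrative and are not the salary data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up actual and predicted values.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.0, 5.0, 8.0, 11.0])

errors = y_true - y_hat                      # prediction errors: 1, 0, -1, -2

mae = np.mean(np.abs(errors))                # (1 + 0 + 1 + 2) / 4
mse = np.mean(errors ** 2)                   # (1 + 0 + 1 + 4) / 4
rmse = np.sqrt(mse)                          # square root of the MSE

ss_res = np.sum(errors ** 2)                 # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination

print(mae, mse, rmse, r2)

# The manual values agree with the library functions.
print(mean_absolute_error(y_true, y_hat),
      mean_squared_error(y_true, y_hat),
      r2_score(y_true, y_hat))
```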

Signature:

Date: _________________________


Practical-7
To Implement Logistic Regression in Python.
Logistic regression is a statistical method for predicting binary classes. The outcome or target variable is dichotomous in
nature, meaning there are only two possible classes. For example, it can be used for cancer detection
problems. It computes the probability of an event occurring.
It is closely related to linear regression, but the target variable is categorical in nature. It uses the log of odds as the
dependent variable. Logistic Regression predicts the probability of occurrence of a binary event using the logit function.
Linear Regression Equation:
y = β0 + β1X1 + β2X2 + … + βnXn
where y is the dependent variable and X1, X2, …, Xn are the explanatory variables.
Sigmoid Function:
p = 1 / (1 + e^(-y))
Applying the sigmoid function to the output y of the linear equation gives a probability p between 0 and 1.
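
The sigmoid function maps any real number to a value between 0 and 1, which is how the linear output (the log of odds) is turned into a probability. A minimal sketch follows; the coefficients beta0 and beta1 are hypothetical values picked for the example and are not fitted to any data.

```python
import numpy as np

def sigmoid(z):
    """Map a real number (or array) to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients of a one-feature model.
beta0, beta1 = -4.0, 2.0
x = np.array([0.0, 2.0, 4.0])

log_odds = beta0 + beta1 * x          # linear part: -4, 0, 4
prob = sigmoid(log_odds)              # probability of the positive class
labels = (prob >= 0.5).astype(int)    # classify with a 0.5 cut-off

print(prob)
print(labels)
```

A log-odds value of 0 corresponds to a probability of exactly 0.5, which is why 0.5 is the usual decision boundary.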

Acquisition of data

The data is a CSV file that records which users purchased or did not purchase a particular product.

Loading data

Visualizing and splitting the dataset


Logistic Regression Model

Training the Model
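
The steps listed in this practical (loading, splitting, training and evaluating) can be sketched end to end with scikit-learn. Because the CSV file itself is not reproduced here, the sketch generates synthetic age and salary values and a synthetic purchased label; the data, the labelling rule, and all variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the purchase data (not the real CSV).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(18, 60, size=n)
salary = rng.uniform(15000, 150000, size=n)
# Made-up rule: older, better-paid users buy more often, plus some noise.
score = (age - 40) / 10 + (salary - 80000) / 40000 + rng.normal(0, 0.5, size=n)
purchased = (score > 0).astype(int)

X = np.column_stack([age, salary])

# Split into training and test sets, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, purchased, test_size=0.25, random_state=0, stratify=purchased)

# Scale the features; fit the scaler on the training data only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the model and evaluate it on the test set.
classifier = LogisticRegression()
classifier.fit(X_train_scaled, y_train)
y_pred = classifier.predict(X_test_scaled)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion matrix:")
print(cm)
```

With a real CSV file, the synthetic block would be replaced by pd.read_csv and a selection of the feature and target columns.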


Signature:

Date: _________________________


Practical-8
Implement KNN Algorithm for Classification.

Acquisition of data

The data is a CSV file that records which users purchased or did not purchase a particular product.

Loading data

Preprocessing

Splitting the dataset

Training
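
A minimal sketch of the steps above (preprocessing, splitting and training) with a K-Nearest Neighbours classifier is given below. The data is synthetic: two made-up groups of users with different ages and salaries stand in for the CSV file, and the value k = 5 is an illustrative choice.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the purchase data (not the real CSV).
rng = np.random.default_rng(0)
n_per_class = 100
# Class 0: younger users with lower salaries; class 1: older, higher salaries.
age = np.concatenate([rng.normal(30, 5, n_per_class), rng.normal(50, 5, n_per_class)])
salary = np.concatenate([rng.normal(40000, 10000, n_per_class),
                         rng.normal(90000, 10000, n_per_class)])
X = np.column_stack([age, salary])
y = np.concatenate([np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)])

# Splitting the dataset.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Preprocessing: KNN relies on distances, so features must share a common scale.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Training: each test point is assigned the majority class of its 5 nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_scaled, y_train)
y_pred = knn.predict(X_test_scaled)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

Without scaling, the salary column (tens of thousands) would dominate the distance calculation and the age column would have almost no influence.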


Signature:

Date: _________________________

