
PRACTICAL FILE FOR MACHINE LEARNING

Submitted by:
Ridham Kumar
B.Tech CSE, 7th Semester
17032100761
1.) Installation of Anaconda Navigator and introduction to various
tools/platforms.
Anaconda is a trusted suite that bundles Python and R distributions.
Anaconda is a package manager and virtual environment manager, and it
includes a set of pre-installed software packages. The Anaconda open-source
ecosystem is mainly used for data science, machine learning, and large-scale
data analysis. Anaconda is popular because it’s simple to install, and it
provides access to almost all the tools and packages that data professionals
require, including the following:

• the Python interpreter
• an extensive collection of packages
• Conda, a package and virtual environment management system
• Jupyter Notebook, a web-based interactive integrated development
environment (IDE) that combines code, text, and visualizations in the
same document
• Anaconda Navigator, a desktop application that makes it easy to
launch the software packages that come with the Anaconda distribution
and to manage packages and virtual environments without using
command-line commands
How to Install Anaconda

Anaconda is a cross-platform Python distribution that you can install on
Windows, macOS, or various distributions of Linux.

NOTE: If you already have Python installed, you don't need to uninstall it. You
can still go ahead and install Anaconda and use the Python version that
comes with the Anaconda distribution.

• Download the Anaconda installer for your operating system from
https://www.anaconda.com/downloads.
• Once the download is complete, double-click the package to start
installing Anaconda. The installer will walk you through a wizard to
complete the installation; the default settings work well in most cases.
• Click on Continue on the Introduction, Read Me, and License screens.
Click on Agree when the license prompt appears to continue the
installation.
• On the Destination Select screen, select “Install for me only.” It is
recommended to install Anaconda on the default path; to do so, click
on Install. If you would like to install Anaconda on a different location,
click on Change Install Location… and change the installation path.
• On the PyCharm IDE screen, click on Continue to install Anaconda
without the PyCharm IDE.
• After the installation completes, click on Close on the Summary screen
to close the installation wizard.
• There are two ways to verify your Anaconda installation: either locate
Anaconda Navigator in the installed applications on your computer and
double-click on its icon, or open a terminal (Anaconda Prompt on
Windows) and run conda list to see the installed packages. A quick
Python-level check is shown below.
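As an additional Python-level sanity check (a minimal sketch, assuming the default Anaconda distribution, which bundles NumPy, Pandas, and scikit-learn), you can open a Python prompt or a Jupyter Notebook cell and import a few of the bundled packages:

# Minimal check that the Anaconda-bundled packages are importable
# (the versions printed will vary with the installed distribution)
import sys
import numpy as np
import pandas as pd
import sklearn

print("Python :", sys.version.split()[0])
print("NumPy  :", np.__version__)
print("Pandas :", pd.__version__)
print("sklearn:", sklearn.__version__)

If all of the imports succeed, the installation is working.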

2.) Introduction to the NumPy, SciPy, scikit-learn, Pandas, Keras,
Matplotlib and TensorFlow packages in Python.

1. Pandas

Pandas is a BSD (Berkeley Software Distribution) licensed open-source
library. This popular library is widely used in the field of data science. It
is primarily used for data analysis, manipulation, cleaning, etc. Pandas
allows for simple data modelling and data analysis operations without the
need to switch to another language such as R. Pandas usually works with
the following types of data:

• Data in a dataset.
• Time series containing both ordered and unordered data.
• Matrix data with labelled rows and columns.
• Unlabelled data.
• Any other type of statistical data.

Pandas can do a wide range of tasks, including:

• The data frame can be sliced using Pandas.
• Data frame joining and merging can be done using Pandas.
• Columns from two data frames can be concatenated using Pandas.
• In a data frame, index values can be changed using Pandas.
• In a column, the headers can be changed using Pandas.
• Data conversion into various forms can also be done using Pandas, and
many more (a short example follows this list).
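The sketch below illustrates a few of these operations; the column names and values are made up purely for illustration:

import pandas as pd

# Two small data frames with made-up values
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": [1, 2, 3], "marks": [85, 90, 78]})

# Slicing a data frame
print(df1.iloc[0:2])

# Merging two data frames on a common column
merged = pd.merge(df1, df2, on="id")

# Concatenating a column from another data frame
combined = pd.concat([df1, df2["marks"]], axis=1)

# Changing index values and column headers
merged = merged.set_index("id")
merged = merged.rename(columns={"marks": "score"})
print(merged)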
2. NumPy

NumPy is one of the most widely used open-source Python libraries, focusing
on scientific computation. It features built-in mathematical functions for
quick computation and supports big matrices and multidimensional data.
The term “NumPy” stands for “Numerical Python.” It can be used in linear
algebra, as a multi-dimensional container for generic data, and as a random
number generator, among other things. Some of the important functions in
NumPy are arcsin(), arccos(), tan(), radians(), etc. NumPy Array is a Python
object which defines an N-dimensional array with rows and columns. In
Python, NumPy Array is preferred over lists because it takes up less memory
and is faster and more convenient to use.

Features:

1. Interactive: NumPy is a very interactive and user-friendly library.
2. Mathematics: NumPy simplifies the implementation of difficult
mathematical equations.
3. Intuitive: It makes coding and understanding topics a breeze.
4. Widely used: Because NumPy is so widely utilised, it attracts a lot of
open-source contribution.

The NumPy interface can be used to represent images, sound waves, and
other binary raw streams as an N-dimensional array of real values for
visualization. Knowledge of NumPy is required for full-stack developers to
use this library for machine learning.
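The sketch below illustrates these points with a small array; the numbers are arbitrary, and the exact memory sizes printed will vary by platform:

import sys
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # a 2-D NumPy array

# Built-in mathematical functions operate element-wise
print(np.radians(arr))             # degrees to radians
print(np.sin(np.radians(arr)))

# Random number generation
print(np.random.rand(2, 3))

# NumPy arrays are more memory-efficient than plain Python lists
lst = list(range(1000))
np_arr = np.arange(1000)
print("list bytes :", sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst))
print("array bytes:", np_arr.nbytes)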

3. Keras

Keras is a Python-based open-source neural network library that lets us
experiment with deep neural networks quickly. With deep learning becoming
more common, Keras emerges as a great option because, according to the
creators, it is an API (Application Programming Interface) designed for
humans, not machines. Keras has a higher adoption rate in the industry and
research community than TensorFlow or Theano. It is recommended that you
install the TensorFlow backend engine before installing Keras.
Features:

1. It runs without a hitch on both the CPU (Central Processing Unit) and
GPU (Graphics Processing Unit).
2. Keras supports nearly all neural network models, including fully
connected, convolutional, pooling, recurrent, embedding, and so forth.
These models can also be merged to create more sophisticated
models.
3. Keras’ modular design makes it very expressive, adaptable, and well
suited to cutting-edge research.
4. Keras is a Python-based framework, making it simple to debug and
explore different models and projects.

Keras-powered features are already in use at various companies, for
instance, Netflix, Uber, Yelp, Instacart, Zocdoc, Square, and a slew of other
companies. It is particularly popular among firms that use deep learning to
power their products. Keras includes a lot of implementations of standard
neural network building elements such as layers, objectives, activation
functions, optimizers, and a slew of other tools for working with picture and
text data. It also includes a number of pre-processed data sets and pre-
trained models, such as MNIST, VGG, Inception, SqueezeNet, ResNet, etc.
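A minimal sketch of a fully connected Keras model is shown below. It assumes TensorFlow is installed as the backend; the layer sizes and the 4-feature/3-class shapes are illustrative rather than tied to a specific dataset:

from tensorflow import keras
from tensorflow.keras import layers

# A small fully connected (dense) network for a 3-class problem
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(3, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(X_train, y_train, epochs=50) would train it on one-hot encoded labels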

4. TensorFlow

TensorFlow is a high-performance, open-source numerical computation
library. It is also used in machine learning and deep learning algorithms. It
was created by researchers on the Google Brain team within the
Google AI organization and is currently widely utilized by math, physics, and
machine learning researchers for complicated mathematical computations.
TensorFlow is designed to be fast, and it employs techniques such as XLA
(XLA or Accelerated Linear Algebra is a domain-specific compiler for linear
algebra that can accelerate TensorFlow models with potentially no source
code changes.) to do speedy linear algebra computations.

Features:
• Responsive Construct: We can easily visualize each and every part of
the graph with TensorFlow, which is not possible with NumPy or scikit-learn.
• Adaptable: One of the most essential Tensorflow features is that it is
flexible in its operation related to Machine Learning models, which
means that it has modularity and allows you to make sections of it
stand alone.
• It is Simple to Train Machine Learning Models in TensorFlow: Machine
Learning models can be readily trained using TensorFlow on both the CPU
and GPU for distributed computing.
• Parallel Neural Network Training: TensorFlow allows you to train many
neural networks and GPUs at the same time.
• Open source and a large community: Since it was developed by Google,
there is already a significant team of software experts working on
constant stability improvements. The nicest part about this machine
learning library is that it is open source, which means that anyone with
internet access can use it.

TensorFlow is used on a regular basis, though often indirectly, through
services like Google Voice Search and Google Photos. TensorFlow’s libraries
are developed entirely in C and C++. It does, however, have a sophisticated
Python front end. Your Python code will be compiled and run on the
TensorFlow distributed execution engine, which is written in C and C++.
TensorFlow has an almost infinite number of applications, which is one of its
most appealing features.
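The sketch below shows two of TensorFlow's core building blocks, tensor operations and automatic differentiation with GradientTape; the values are arbitrary and only illustrate the API:

import tensorflow as tf

# Tensors and fast linear algebra
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
print(tf.matmul(a, b))

# Automatic differentiation with GradientTape
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2.0 * x
print(tape.gradient(y, x))   # dy/dx = 2x + 2 = 8.0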

5. SciPy

Scipy is a free, open-source Python library used for scientific computing, data
processing, and high-performance computing. The library contains a huge
number of user-friendly routines for quick computation. The package is
based on the NumPy extension, which allows for data processing and
visualization as well as high-level commands. Scipy is used for mathematical
computations alongside NumPy. NumPy enables the sorting and indexing of
array data, while SciPy stores the numerical code. Cluster, constants, fftpack,
integrate, interpolate, io, linalg, ndimage, odr, optimize, signal, sparse,
spatial, special, and stats are only a few of the many sub packages available
in SciPy. “from scipy import subpackage-name” can be used to import them
from SciPy. NumPy, the SciPy library, Matplotlib, IPython, SymPy, and Pandas
are, however, the core packages of the wider SciPy ecosystem.

Features:

• SciPy’s key characteristic is that it is written on top of NumPy, and it
makes extensive use of NumPy arrays.
• SciPy uses its specialised submodules to provide all of the efficient
numerical algorithms such as optimization, numerical integration, and
many others.
• All functions in SciPy’s submodules are extensively documented.
SciPy’s primary data structure is NumPy arrays, and it includes
modules for a variety of popular scientific programming applications.
SciPy handles tasks like linear algebra, integration (calculus), solving
ordinary differential equations, and signal processing with ease.
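A short sketch using two of the submodules mentioned above, scipy.integrate and scipy.optimize; the function being integrated and minimised is an arbitrary example:

import numpy as np
from scipy import integrate, optimize

# Numerical integration: integrate sin(x) from 0 to pi (the exact answer is 2)
value, error = integrate.quad(np.sin, 0, np.pi)
print("integral:", value)

# Optimization: minimise (x - 3)^2 starting from x = 0
result = optimize.minimize(lambda x: (x - 3) ** 2, x0=0.0)
print("minimum at:", result.x)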

6.) Scikit Learn

Scikit Learn is an open-source library for machine learning algorithms that
runs in the Python environment. It can be used with both supervised and
unsupervised learning algorithms. The library includes popular algorithms and
builds on the NumPy, Matplotlib, and SciPy packages. Scikit-learn's most
well-known use is for music recommendations in Spotify. Let us now deep dive
into some of the key features of Scikit Learn:

• Cross-Validation: There are several methods for checking the accuracy
of supervised models on unseen data with Scikit Learn, for example
the train_test_split method, cross_val_score, etc.
• Unsupervised learning techniques: There is a wide range of
unsupervised learning algorithms available, ranging from clustering,
factor analysis, principal component analysis, and unsupervised neural
networks.
• Feature extraction: Extracting features from images and text is a useful
tool (e.g., bag of words).
Scikit Learn includes many algorithms and can be used for performing
common machine learning and data mining tasks such as dimensionality
reduction, classification, regression, clustering, and model selection.
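A minimal sketch of the cross-validation features described above, using the built-in iris dataset; the choice of dataset and classifier is purely illustrative:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold-out evaluation with train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("hold-out accuracy:", clf.score(X_test, y_test))

# 5-fold cross-validation with cross_val_score
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("cross-validation accuracy:", scores.mean())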

7.) Matplotlib

Matplotlib is a cross-platform data visualization and graphical plotting
library for Python and its numerical extension NumPy. As such, it offers a
viable open source alternative to MATLAB. Developers can also use
matplotlib’s APIs (Application Programming Interfaces) to embed plots in GUI
applications.

A Python matplotlib script is structured so that a few lines of code are all that
is required in most instances to generate a visual data plot. The matplotlib
scripting layer overlays two APIs:

• The pyplot API is a hierarchy of Python code objects topped
by matplotlib.pyplot.

• An OO (Object-Oriented) API: a collection of objects that can be
assembled with greater flexibility than pyplot. This API provides direct
access to Matplotlib’s backend layers.

Matplotlib and Pyplot in Python

The pyplot API has a convenient MATLAB-style stateful interface. In fact,
matplotlib was originally written as an open source alternative for MATLAB.
The OO API and its interface are more customizable and powerful than pyplot,
but considered more difficult to use. As a result, the pyplot interface is more
commonly used. Understanding matplotlib’s pyplot API is key to
understanding how to work with plots:

• matplotlib.pyplot.figure: Figure is the top-level container. It includes
everything visualized in a plot, including one or more Axes.
• matplotlib.pyplot.axes: An Axes contains most of the elements in
a plot (Axis, Tick, Line2D, Text, etc.) and sets the coordinates. It is the
area in which data is plotted. Axes include the X-axis, Y-axis, and
possibly a Z-axis as well. A short example of both interfaces follows.
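The sketch below produces a simple plot through each interface, first the stateful pyplot API and then the object-oriented Figure/Axes API; the data is arbitrary:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

# pyplot (MATLAB-style, stateful) interface
plt.plot(x, np.sin(x))
plt.title("sin(x) via pyplot")
plt.show()

# Object-oriented interface: explicit Figure and Axes objects
fig, ax = plt.subplots()
ax.plot(x, np.cos(x))
ax.set_title("cos(x) via the OO API")
ax.set_xlabel("x")
ax.set_ylabel("cos(x)")
plt.show()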

3.) Build classification models using Bayes Net and Naïve Bayes
models for given datasets in Python. Find the classification accuracy of
these models using a confusion matrix.
Introduction to the Naive Bayes algorithm

In machine learning, Naïve Bayes classification is a straightforward and powerful
algorithm for the classification task. Naïve Bayes classification is based on applying Bayes’
theorem with strong independence assumption between the features. Naïve Bayes
classification produces good results when we use it for textual data analysis such as
Natural Language Processing.

Naïve Bayes models are also known as simple Bayes or independent Bayes. All these
names refer to the application of Bayes’ theorem in the classifier’s decision rule. Naïve
Bayes classifier applies the Bayes’ theorem in practice. This classifier brings the power of
Bayes’ theorem to machine learning.
Naïve Bayes Classifier uses the Bayes’ theorem to predict membership probabilities for
each class, such as the probability that a given record or data point belongs to a particular
class. The class with the highest probability is considered as the most likely class. This is
also known as the Maximum A Posteriori (MAP).
The MAP for a hypothesis A given evidence B is:

MAP(A) = max(P(A | B))
       = max(P(B | A) * P(A) / P(B))
       = max(P(B | A) * P(A))

Here, P(B) is the evidence probability. It is used to normalize the result. Since it is the same
for every class, removing it does not change which class has the highest probability.

Working of Naïve Bayes' Classifier can be understood with the help of the
below example:
Suppose we have a dataset of weather conditions and a corresponding target
variable "Play". Using this dataset, we need to decide whether we should play
or not on a particular day according to the weather conditions. To solve this
problem, we need to follow the steps below:

1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by finding the probabilities of the given features.
3. Now, use Bayes' theorem to calculate the posterior probability.

Problem: If the weather is sunny, then the Player should play or not?

Solution: To solve this, first consider the below dataset:

Outlook Play

0 Rainy Yes

1 Sunny Yes

2 Overcast Yes

3 Overcast Yes

4 Sunny No

5 Rainy Yes

6 Sunny Yes

7 Overcast Yes

8 Rainy No

9 Sunny No

10 Sunny Yes
11 Rainy No

12 Overcast Yes

13 Overcast Yes

Frequency table for the Weather Conditions:

Weather Yes No

Overcast 5 0

Rainy 2 2

Sunny 3 2

Total 10 4

Likelihood table for the weather conditions:

Weather     No    Yes    P(Weather)

Overcast    0     5      5/14 = 0.35

Rainy       2     2      4/14 = 0.29

Sunny       2     3      5/14 = 0.35

All         4/14 = 0.29   10/14 = 0.71


Applying Bayes' theorem:

P(Yes|Sunny)= P(Sunny|Yes)*P(Yes)/P(Sunny)

P(Sunny|Yes)= 3/10= 0.3

P(Sunny)= 0.35

P(Yes)=0.71

So P(Yes|Sunny) = 0.3*0.71/0.35= 0.60

P(No|Sunny)= P(Sunny|No)*P(No)/P(Sunny)

P(Sunny|No)= 2/4=0.5

P(No)= 0.29

P(Sunny)= 0.35

So P(No|Sunny)= 0.5*0.29/0.35 = 0.41

As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny).

Hence, on a sunny day, the player can play the game.
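In practice, the same reasoning is applied by scikit-learn's Naive Bayes classifiers. The sketch below is a minimal example, assuming a hypothetical dataset.csv whose last column is the class label; it shows how the y_test and y_pred arrays used in the next step are typically produced:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Hypothetical file name; the last column is assumed to be the target class
data = pd.read_csv('dataset.csv')
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit a Gaussian Naive Bayes classifier and predict on the test set
classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)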

Creating Confusion Matrix:

Now we will check the accuracy of the Naive Bayes classifier using the Confusion matrix.
Below is the code for it:

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
Output:

As we can see in the above confusion matrix output, there are 7 + 3 = 10 incorrect predictions
and 65 + 25 = 90 correct predictions.
4.) Compare various models on the given dataset and explore the
concepts of overfitting and underfitting

Overfitting in Machine Learning

Overfitting refers to a model that models the training data too well.
Overfitting happens when a model learns the detail and noise in the training
data to the extent that it negatively impacts the performance of the model on
new data. This means that the noise or random fluctuations in the training
data are picked up and learned as concepts by the model. The problem is that
these concepts do not apply to new data and negatively impact the model's
ability to generalize.
Overfitting is more likely with nonparametric and nonlinear models that have
more flexibility when learning a target function. As such, many
nonparametric machine learning algorithms also include parameters or
techniques to limit and constrain how much detail the model learns.
For example, decision trees are a nonparametric machine learning algorithm
that is very flexible and is subject to overfitting training data. This problem
can be addressed by pruning a tree after it has learned in order to remove
some of the detail it has picked up.
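As an illustration of this idea, a scikit-learn decision tree can be constrained by limiting its depth; the dataset and the depth value below are arbitrary choices for the sketch:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree can memorise the training data
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Limiting max_depth constrains how much detail the tree can learn
pruned_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

print("full tree   train:", full_tree.score(X_train, y_train), "test:", full_tree.score(X_test, y_test))
print("pruned tree train:", pruned_tree.score(X_train, y_train), "test:", pruned_tree.score(X_test, y_test))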
Underfitting in Machine Learning
Underfitting refers to a model that can neither model the training data nor
generalize to new data.
An underfit machine learning model is not a suitable model and will be
obvious as it will have poor performance on the training data.
Underfitting is often not discussed as it is easy to detect given a good
performance metric. The remedy is to move on and try alternate machine
learning algorithms. Nevertheless, it does provide a good contrast to the
problem of overfitting.
A Good Fit in Machine Learning
Ideally, you want to select a model at the sweet spot between underfitting
and overfitting.

This is the goal, but is very difficult to do in practice.

To understand this goal, we can look at the performance of a machine
learning algorithm over time as it learns the training data. We can plot both
the skill on the training data and the skill on a test dataset we have held back
from the training process.
Over time, as the algorithm learns, the error for the model on the training
data goes down and so does the error on the test dataset. If we train for too
long, the error on the training dataset may continue to decrease
because the model is overfitting and learning the irrelevant detail and noise
in the training dataset. At the same time the error for the test set starts to rise
again as the model's ability to generalize decreases.
The sweet spot is the point just before the error on the test dataset starts to
increase where the model has good skill on both the training dataset and the
unseen test dataset.

You can perform this experiment with your favorite machine learning
algorithms. This is often not a useful technique in practice, because by
choosing the stopping point for training using the skill on the test dataset, it
means that the test set is no longer "unseen" or a standalone objective
measure. Some knowledge (a lot of useful knowledge) about that data has
leaked into the training procedure.
There are two additional techniques you can use to help find the sweet spot
in practice: resampling methods and a validation dataset.
How To Limit Overfitting
Both overfitting and underfitting can lead to poor model performance. But by
far the most common problem in applied machine learning is overfitting.
Overfitting is such a problem because the evaluation of machine learning
algorithms on training data is different from the evaluation we actually care
the most about, namely how well the algorithm performs on unseen data.
There are two important techniques that you can use when evaluating
machine learning algorithms to limit overfitting:
• Use a resampling technique to estimate model accuracy.
• Hold back a validation dataset.
The most popular resampling technique is k-fold cross validation. It allows
you to train and test your model k-times on different subsets of training data
and build up an estimate of the performance of a machine learning model on
unseen data. A validation dataset is simply a subset of your training data that
you hold back from your machine learning algorithms until the very end of
your project. After you have selected and tuned your machine learning
algorithms on your training dataset you can evaluate the learned models on
the validation dataset to get a final objective idea of how the models might
perform on unseen data. Using cross validation is a gold standard in applied
machine learning for estimating model accuracy on unseen data.
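A minimal sketch of both techniques, using the built-in iris data as a stand-in for the given dataset; the two models being compared are arbitrary examples:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hold back a validation set until the very end of the project
X_work, X_val, y_work, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Compare models with 5-fold cross-validation on the working data
models = {
    "kNN": KNeighborsClassifier(),
    "logistic regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    scores = cross_val_score(model, X_work, y_work, cv=5)
    print(name, "cross-validation accuracy:", scores.mean())

# Final, objective check of the chosen model on the untouched validation set
best = LogisticRegression(max_iter=1000).fit(X_work, y_work)
print("validation accuracy:", best.score(X_val, y_val))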
5.) Hierarchical clustering algorithm to cluster data stored in a .csv
dataset

# Do the necessary imports

import numpy as np

import matplotlib.pyplot as plt

from sklearn.datasets import make_blobs

# Create a blob of 200 data points

dataset = make_blobs(n_samples = 200,

n_features = 2,

centers = 4,
cluster_std = 1.6,

random_state = 50)

# Keep only the data points (the first element of the returned tuple)

points = dataset[0]

# Import libraries for Hierarchical Clustering

import scipy.cluster.hierarchy as sch

from sklearn.cluster import AgglomerativeClustering

# Create a dendrogram

dendrogram = sch.dendrogram(sch.linkage(points, method = 'ward'))

[Output: dendrogram plot]
# Scatter plot to see what the data looks like
plt.scatter(dataset[0][:,0], dataset[0][:,1])

# Perform the Actual Clustering

hc = AgglomerativeClustering(n_clusters = 4, affinity = 'euclidean', linkage = 'ward')


y_hc = hc.fit_predict(points)

plt.scatter(points[y_hc == 0,0], points[y_hc == 0,1], s= 100, c='cyan')

plt.scatter(points[y_hc == 1,0], points[y_hc == 1,1], s= 100, c='yellow')


plt.scatter(points[y_hc == 2,0], points[y_hc == 2,1], s= 100, c='red')

plt.scatter(points[y_hc == 3,0], points[y_hc == 3,1], s= 100, c='green')

plt.show()
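Since this practical asks for data stored in a .csv file, the synthetic make_blobs data above can be replaced by points read with pandas; the file name and column names below are hypothetical:

import pandas as pd

# Hypothetical CSV with two numeric feature columns, e.g. 'x' and 'y'
df = pd.read_csv('clusterData.csv')
points = df[['x', 'y']].values

# The same dendrogram and AgglomerativeClustering steps above can then be applied to these points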

6.) Build an ANN model with the back-propagation neural network
approach on .csv datasets for a classification problem. Compute the
accuracy of the classifier using a test dataset.

# Import libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Load dataset: get features and target
# irisData.csv holds the feature columns; Result.csv is expected to hold
# one-hot encoded class labels (3 columns)
X = pd.read_csv('irisData.csv').to_numpy()
y = pd.read_csv('Result.csv').to_numpy()

# Split data into train and test data (hold out 20 samples for testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20,
random_state=4)

# Initialize variables
learning_rate = 0.1
iterations = 5000
N = y_train.size

# number of input features


input_size = 4
# number of hidden layers neurons
hidden_size = 2

# number of neurons at the output layer


output_size = 3

results = pd.DataFrame(columns=["mse", "accuracy"])


# Initialize weights
np.random.seed(10)

# initializing weight for the hidden layer


W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))

# initializing weight for the output layer


W2 = np.random.normal(scale=0.5, size=(hidden_size , output_size))
def sigmoid(x):
return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):


return (((y_pred - y_true)**2).sum() / (2*y_pred.size))

def accuracy(y_pred, y_true):


acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
return acc.mean()
for itr in range(iterations):

    # feedforward propagation
    # on hidden layer
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)

    # on output layer
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculating error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(A2, y_train)
    # DataFrame.append was removed in recent pandas; pd.concat does the same job
    results = pd.concat([results, pd.DataFrame([{"mse": mse, "accuracy": acc}])],
                        ignore_index=True)

    # backpropagation
    # delta at the output layer
    E1 = A2 - y_train
    dW1 = E1 * A2 * (1 - A2)

    # error propagated back to the hidden layer
    E2 = np.dot(dW1, W2.T)
    dW2 = E2 * A1 * (1 - A1)

    # weight updates (gradients averaged over the N training samples)
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N

    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update
results.mse.plot(title="Mean Squared Error")
<AxesSubplot:title={'center':'Mean Squared Error'}>

results.accuracy.plot(title="Accuracy")

<AxesSubplot:title={'center':'Accuracy'}>

# feedforward
Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)

Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)

acc = accuracy(A2, y_test)


print("Accuracy: {}".format(acc))
Accuracy: 0.8
