Data Science Machine Learning
MAHAMAYA POLYTECHNIC OF INFORMATION
TECHNOLOGY (AMROHA).
Submitted to
Dr Jaya Singh
For
DIPLOMA OF TECHNOLOGY
In
Practical-01
Q-1 Write a program in Python to implement the Decision Tree Algorithm
The decision tree is one of the most powerful and popular algorithms. The decision-tree algorithm
falls under the category of supervised learning algorithms. It works for both continuous as
well as categorical output variables.
There are two types of decision trees. They are categorized based on the type of the target
variable they have. If the decision tree has a categorical target variable, then it is called a
‘categorical variable decision tree’. Similarly, if it has a continuous target variable, it is called
a ‘continuous variable decision tree’.
# Python program to implement decision tree algorithm and plot the tree

# Importing the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import metrics
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import tree

# Loading the dataset
iris = load_iris()

# Converting the data to a pandas dataframe
data = pd.DataFrame(data = iris.data, columns = iris.feature_names)

# Creating a separate column for the target variable of the iris dataset
data['Species'] = iris.target

# Replacing the categories of the target variable with the actual names of the species
target = np.unique(iris.target)
target_n = np.unique(iris.target_names)
target_dict = dict(zip(target, target_n))
data['Species'] = data['Species'].replace(target_dict)

# Separating the independent and dependent variables of the dataset
x = data.drop(columns = "Species")
y = data["Species"]
names_features = x.columns
target_labels = y.unique()

# Splitting the dataset into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3, random_state = 93)

# Importing the Decision Tree classifier class from sklearn
from sklearn.tree import DecisionTreeClassifier

# Creating an instance of the classifier class
dtc = DecisionTreeClassifier(max_depth = 3, random_state = 93)

# Fitting the training dataset to the model
dtc.fit(x_train, y_train)

# Plotting the decision tree
plt.figure(figsize = (30, 10), facecolor = 'b')
Tree = tree.plot_tree(dtc, feature_names = names_features, class_names = target_labels, rounded = True, filled = True, fontsize = 14)
plt.show()

# Predicting on the test set
y_pred = dtc.predict(x_test)

# Finding the confusion matrix
confusion_matrix = metrics.confusion_matrix(y_test, y_pred)
matrix = pd.DataFrame(confusion_matrix)

# Plotting the heatmap (the figure must be created before its axes)
sns.set(font_scale = 1.3)
plt.figure(figsize = (10, 7))
axis = plt.axes()
sns.heatmap(matrix, annot = True, fmt = "g", ax = axis, cmap = "magma")
axis.set_title('Confusion Matrix')
axis.set_xlabel("Predicted Values", fontsize = 10)
axis.set_xticklabels(list(target_labels))
axis.set_ylabel("True Labels", fontsize = 10)
axis.set_yticklabels(list(target_labels), rotation = 0)
plt.show()
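The program above builds a categorical-variable decision tree (a classifier). For the continuous-variable case mentioned at the start of this practical, a minimal sketch using sklearn's DecisionTreeRegressor could look like the following; the noisy quadratic toy data and the parameter choices are assumptions made only for illustration.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data, assumed for illustration: a noisy quadratic relationship
rng = np.random.RandomState(93)
x = np.sort(5 * rng.rand(80, 1), axis = 0)
y = x.ravel() ** 2 + rng.normal(0, 0.5, 80)

# A continuous-variable decision tree predicts a numeric value
# by averaging the training targets that fall in each leaf
dtr = DecisionTreeRegressor(max_depth = 3, random_state = 93)
dtr.fit(x, y)
print(dtr.predict([[2.5]]))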
Practical-02
Q-1 Write a program in Python to implement the K-means Algorithm
K-means is an unsupervised learning method for clustering data points. The algorithm
iteratively divides data points into K clusters by minimizing the variance in each cluster.
We will show you how to estimate the best value for K using the elbow method, then use K-
means clustering to group the data points into clusters.
Work
First, each data point is randomly assigned to one of the K clusters. Then, we compute the
centroid (functionally the center) of each cluster, and reassign each data point to the cluster
with the closest centroid. We repeat this process until the cluster assignments for each data
point are no longer changing.
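To make these two steps concrete, here is a minimal sketch of the assignment and update loop in plain NumPy. It is illustrative only: it initializes the centroids from K randomly chosen points (a common variant of the random assignment described above), uses Euclidean distance, and assumes no cluster ever becomes empty.

import numpy as np

def kmeans(points, k, n_iter = 100, seed = 93):
    # Initializing the centroids from K randomly chosen data points
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace = False)]
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster with the closest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis = 2)
        labels = dists.argmin(axis = 1)
        # Update step: each centroid becomes the mean of the points assigned to it
        new_centroids = np.array([points[labels == j].mean(axis = 0) for j in range(k)])
        # Stopping when the centroids (and hence the assignments) no longer change
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids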
K-means clustering requires us to select K, the number of clusters we want to group the data
into. The elbow method lets us graph the inertia (a distance-based metric) and visualize the
point at which it starts decreasing linearly. This point is referred to as the "elbow" and is a
good estimate for the best value for K based on our data.
Program
import matplotlib.pyplot as plt

# Sample data points (assumed for illustration)
x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]

plt.scatter(x, y)
plt.show()
Output
(Scatter plot of the data points.)
import numpy as np
import matplotlib.pyplot as plt
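from sklearn.cluster import KMeans

# A sketch of the elbow method described above, continuing from the imports;
# the sample data points are assumed for illustration
x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
data = list(zip(x, y))

# Computing the inertia for K = 1 to 10
inertias = []
for k in range(1, 11):
    kmeans = KMeans(n_clusters = k, n_init = 10)
    kmeans.fit(data)
    inertias.append(kmeans.inertia_)

# Plotting inertia against K; the bend ("elbow") marks a good value for K
plt.plot(range(1, 11), inertias, marker = 'o')
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.show()

# Suppose the elbow appears at K = 2: refit and color the points by cluster
kmeans = KMeans(n_clusters = 2, n_init = 10)
kmeans.fit(data)
plt.scatter(x, y, c = kmeans.labels_)
plt.show()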
Practical-03
Q-1 Write a program in Python to implement Linear Regression
The term regression is used when you try to find the relationship between variables.
In Machine Learning, and in statistical modeling, that relationship is used to predict the
outcome of future events.
Linear Regression
Linear regression uses the relationship between the data points to draw a straight line through
all of them.
Work
Python has methods for finding a relationship between data points and for drawing a line of
linear regression. We will show you how to use these methods instead of going through the
mathematical formula.
In the example below, the x-axis represents age, and the y-axis represents speed. We have
registered the age and speed of 13 cars as they were passing a tollbooth. Let us see if the data
we collected could be used in a linear regression:
Code
import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
plt.scatter(x, y)
plt.show()
Output
(Scatter plot of the age/speed data points.)
import matplotlib.pyplot as plt
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# Finding the slope and intercept of the line of best fit
slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    return slope * x + intercept

# Running each value of x through the function to get the fitted line
mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()
Output
(Scatter plot with the fitted regression line drawn through the data points.)
import numpy as np
from sklearn.linear_model import LinearRegression

# Reusing the age/speed data above; sklearn expects a 2-D feature matrix
features = np.array(x).reshape(-1, 1)
model = LinearRegression().fit(features, y)

# Predict the values of the target variable for the given features
predictions = model.predict(features)

# The fitted slope and intercept should match those from stats.linregress
print(model.coef_[0], model.intercept_)
Practical-04
Q-1 Write a program in Python to implement the K-NN Algorithm
K-NN is a supervised learning algorithm. Supervised learning is learning where the value or
result that we want to predict is within the training data (labeled data), and the value to be
predicted is known as the Target, Dependent Variable, or Response Variable.
All the other columns in the dataset are known as Features, Predictor Variables, or
Independent Variables.
Supervised Learning is classified into two categories:
1. Classification: Here our target variable consists of categories.
2. Regression: Here our target variable is continuous, and we usually try to find out the line
or curve that best fits the data.
k-nearest neighbor algorithm:
This algorithm is used to solve classification problems. The K-nearest neighbor or K-NN
algorithm essentially creates an imaginary boundary to classify the data: when new data
points come in, the algorithm predicts their class from the nearest side of that boundary.
Therefore, a larger k value means smoother curves of separation, resulting in less complex
models, whereas a smaller k value tends to overfit the data, resulting in more complex models.
Note: It’s very important to have the right k-value when analyzing the dataset to avoid
overfitting and underfitting of the dataset.
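One common way to find a reasonable k is to score several candidate values and compare. The sketch below assumes the iris dataset and 5-fold cross-validation purely for illustration.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Scoring odd values of k with 5-fold cross-validation;
# the k with the highest mean accuracy is a sensible choice
for k in range(1, 12, 2):
    knn = KNeighborsClassifier(n_neighbors = k)
    scores = cross_val_score(knn, iris.data, iris.target, cv = 5)
    print(k, scores.mean())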
Using the k-nearest neighbor algorithm we fit the historical data (or train the model) and
predict the future.
1. The k-nearest neighbor algorithm is imported from the scikit-learn package.
2. Create feature and target variables.
3. Split data into training and test data.
4. Generate a k-NN model using neighbor’s value.
5. Train or fit the data into the model.
6. Predict the future.
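A minimal sketch following these six steps, assuming the iris dataset and an illustrative neighbors value of 7, might look like this:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

# Steps 1-2: import the algorithm and create feature and target variables
iris = load_iris()
X, y = iris.data, iris.target

# Step 3: split data into training and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 93)

# Steps 4-5: generate a k-NN model and fit the training data
knn = KNeighborsClassifier(n_neighbors = 7)
knn.fit(X_train, y_train)

# Step 6: predict on the unseen test data and check the accuracy
y_pred = knn.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))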
K is the number of nearest neighbors to use. For classification, a majority vote is used to
determine which class a new observation should fall into. Larger values of K are often more
robust to outliers and produce more stable decision boundaries than very small values
(K = 3 would be better than K = 1, which might produce undesirable results).
Code

from sklearn.neighbors import KNeighborsClassifier

# Sample data points and their class labels (assumed for illustration)
x = [4, 5, 10, 4, 3, 11, 14, 8, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Turning the two feature lists into (x, y) points and fitting the model
data = list(zip(x, y))
knn = KNeighborsClassifier(n_neighbors = 1)
knn.fit(data, classes)

# Classifying a new, unseen point
new_x = 8
new_y = 21
new_point = [(new_x, new_y)]
prediction = knn.predict(new_point)
print(prediction)
Output
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# Choose the value of K
k = 5

# Refitting the model with more neighbors (reusing data, classes and new_point from above)
knn = KNeighborsClassifier(n_neighbors = k)
knn.fit(data, classes)
prediction = knn.predict(new_point)

# Plotting all points, coloring the new point with its predicted class
plt.scatter(x + [new_x], y + [new_y], c = classes + [prediction[0]])
plt.text(x = new_x - 1.7, y = new_y - 0.7, s = f"new point, class: {prediction[0]}")
plt.show()