FDS LAB PROGRAMS

The document provides a comprehensive guide on installing Python and Jupyter Notebook on Windows, including step-by-step procedures for downloading, running installers, and verifying installations. It also includes tasks for programming in NumPy and Pandas, such as adding borders to arrays, finding unique elements, and creating DataFrames. Additionally, it covers data visualization using Plotly, including creating line charts, bar charts, and pie charts.

Uploaded by

nagendrakatta2003


Task 1 1.

(a) Python installation for WINDOWS

Aim: To install Python on Windows

Procedure:
The Python programming language is an increasingly popular choice for both beginners and experienced
developers. Flexible and versatile, Python has strengths in scripting, automation, data analysis, machine
learning, and back-end development.

You’ll need a computer running Windows 10 with administrative privileges and an internet
connection.

Step 1 — Downloading the Python Installer

1. Go to the official Python download page for Windows.

2. Find a stable Python 3 release. This tutorial was tested with Python version 3.10.10.
3. Click the appropriate link for your system to download the executable file: Windows installer
(64-bit) or Windows installer (32-bit).

Step 2 — Running the Executable Installer

1. After the installer is downloaded, double-click the .exe file, for example python-3.10.10-
amd64.exe, to run the Python installer.
2. Select the Install launcher for all users checkbox, which enables all users of the computer to
access the Python launcher application.
3. Select the Add python.exe to PATH checkbox, which enables users to launch Python from the
command line.

4. If you’re just getting started with Python and you want to install it with default features as
described in the dialog, then click Install Now and go to Step 4 - Verify the Python Installation.
To install other optional and advanced features, click Customize installation and continue.
5. The Optional Features include common tools and resources for Python and you can install all of
them, even if you don’t plan to use them.
Select some or all of the following options:

• Documentation: recommended
• pip: recommended if you want to install other Python packages, such as NumPy or pandas
• tcl/tk and IDLE: recommended if you plan to use IDLE or follow tutorials that use it
• Python test suite: recommended for testing and learning
• py launcher and for all users: recommended to enable users to launch Python from the command line
6. Click Next.
7. The Advanced Options dialog displays.
Select the options that suit your requirements:

• Install for all users: recommended if you’re not the only user on this computer
• Associate files with Python: recommended, because this option associates all the Python file types with the launcher or editor
• Create shortcuts for installed applications: recommended to enable shortcuts for Python applications
• Add Python to environment variables: recommended to enable launching Python
• Precompile standard library: not required, it might slow down the installation
• Download debugging symbols and Download debug binaries: recommended only if you plan to create C or C++ extensions

Make note of the Python installation directory in case you need to reference it later.
8. Click Install to start the installation.
9. After the installation is complete, a Setup was successful message displays.
Step 3 — Adding Python to the Environment Variables (optional)

Skip this step if you selected Add Python to environment variables during installation.

If you want to access Python through the command line but you didn’t add Python to your
environment variables during installation, then you can still do it manually.

Before you start, locate the Python installation directory on your system. The following directories
are examples of the default directory paths:

• C:\Program Files\Python310: if you selected Install for all users during installation, then the directory will be system wide
• C:\Users\Sammy\AppData\Local\Programs\Python\Python310: if you didn’t select Install for all users during installation, then the directory will be in the Windows user path

Note that the folder name will be different if you installed a different version, but will still start
with Python.

1. Go to Start and enter advanced system settings in the search bar.

2. Click View advanced system settings.
3. In the System Properties dialog, click the Advanced tab and then click Environment
Variables.
4. Depending on your installation:
• If you selected Install for all users during installation, select Path from the list of System Variables and click Edit.
• If you didn’t select Install for all users during installation, select Path from the list of User Variables and click Edit.
5. Click New and enter the Python directory path, then click OK until all the dialogs are closed.
Step 4 — Verify the Python Installation

You can verify whether the Python installation is successful either through the command line or
through the Integrated Development Environment (IDLE) application, if you chose to install it.

Go to Start and enter cmd in the search bar. Click Command Prompt.

Enter the following command in the command prompt:

python --version

An example of the output is:

Output

Python 3.10.10

You can also check the version of Python by opening the IDLE application. Go to Start and
enter python in the search bar and then click the IDLE app, for example IDLE (Python 3.10 64-
bit).

You can start coding in Python using IDLE or your preferred code editor.
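
The same version check can also be made from inside Python, for example in the IDLE shell. The following is a minimal sketch that uses only the standard library; the helper name describe_python is hypothetical and not part of the original procedure.

```python
import platform
import sys


def describe_python():
    """Return the running interpreter's version string and executable path."""
    return platform.python_version(), sys.executable


version, executable = describe_python()
print("Python version:", version)
print("Interpreter path:", executable)
```

If the printed path is not the directory noted during installation, another Python installation is probably earlier on the PATH.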

Task 1 1.b) Installation of Jupyter Notebook

Aim: To install Jupyter Notebook on Windows

Description:

Jupyter Notebook is an open-source web application that allows you to create and share
documents that contain live code, equations, visualizations, and narrative text. Uses
include data cleaning and transformation, numerical simulation, statistical modeling,
data visualization, machine learning, and much more.
Jupyter has support for over 40 different programming languages and Python is one of
them. Python is a requirement (Python 3.3 or greater, or Python 2.7) for installing the
Jupyter Notebook itself.

Installing Jupyter Notebook using Anaconda:

Anaconda is a software distribution that bundles Jupyter, Spyder, and other tools
used for large-scale data processing, data analytics, and heavy scientific computing.
Anaconda works with the R and Python programming languages. Spyder, an application
included with Anaconda, is used for Python, and OpenCV for Python will work in
Spyder. Package versions are managed by the package management system
called conda.
To install Jupyter using Anaconda, go through the following steps:
• Launch Anaconda Navigator.
• Click the Install button for Jupyter Notebook.
• The installation begins.
• The packages are loaded.
• The installation finishes.
• Launch Jupyter.

Task-1 2.a) Write a NumPy program to add a border filled with 0’s around the existing array
AIM: To create a NumPy program to add a border (filled with 0's) around an existing array.

Description:

program:
Output:
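
A minimal sketch of one way to meet this aim, assuming a 3x3 array of ones as the existing array (the sample data is an assumption). It uses np.pad with a constant fill value of 0.

```python
import numpy as np

# A 3x3 array of ones stands in for the "existing array" (assumed sample data).
original = np.ones((3, 3), dtype=int)

# Pad one cell on every side; constant_values=0 fills the new border with zeros.
bordered = np.pad(original, pad_width=1, mode="constant", constant_values=0)

print("Original array:")
print(original)
print("Array with a border of 0's:")
print(bordered)
```

The result is a 5x5 array whose outer ring is 0 and whose inner 3x3 block is the original data.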

Task-1 2.b) Write a NumPy program to get the unique elements of an array.

AIM: To Write a NumPy program to get the unique elements of an array.

Description:

Program:
Output:
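
A minimal sketch of one way to meet this aim, using np.unique on assumed sample arrays. For a multi-dimensional input, np.unique flattens the array before looking for unique values.

```python
import numpy as np

# Assumed sample data: a 1-D array with repeated values.
values = np.array([10, 10, 20, 20, 30, 30])
unique_values = np.unique(values)
print("Original array:", values)
print("Unique elements:", unique_values)

# Assumed sample data: a 2-D array; np.unique works on the flattened values.
matrix = np.array([[1, 1], [2, 3]])
unique_in_matrix = np.unique(matrix)
print("Original array:")
print(matrix)
print("Unique elements:", unique_in_matrix)
```

np.unique returns the unique values in sorted order.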

Task-1 2.c) Write a NumPy program to get the values and indices of the elements
that are bigger than 10 in a given array.

AIM: To write a NumPy program to get the values and indices of the elements that are
bigger than 10 in a given array.

DESCRIPTION:
PROGRAM:
OUTPUT:
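
A minimal sketch of one way to meet this aim, assuming a small 2x3 sample array. A boolean mask selects the values, and np.nonzero returns the row and column indices where the mask is true.

```python
import numpy as np

# Assumed sample data.
x = np.array([[0, 10, 20], [20, 30, 40]])

# Boolean mask that is True where the element is bigger than 10.
mask = x > 10

# Values that satisfy the condition, in row-major order.
values = x[mask]

# Tuple of (row indices, column indices) of those values.
indices = np.nonzero(mask)

print("Original array:")
print(x)
print("Values bigger than 10:", values)
print("Their indices:", indices)
```

The element 10 itself is excluded because the comparison is strict.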

Task-2 3.a) Write a Pandas program to create and display a DataFrame from a specified dictionary data which has the index labels.

Aim: To Write a Pandas program to create and display a DataFrame from a specified
dictionary data which has the index labels.

Sample DataFrame:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}

labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
program:

import pandas as pd
import numpy as np

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# Build the DataFrame with the labels as its index
df = pd.DataFrame(exam_data, index=labels)
print(df)

Output :

   attempts       name qualify  score
a         1  Anastasia     yes   12.5
b         3       Dima      no    9.0
c         2  Katherine     yes   16.5
d         3      James      no    NaN
e         2      Emily      no    9.0
f         3    Michael     yes   20.0
g         1    Matthew     yes   14.5
h         1      Laura      no    NaN
i         2      Kevin      no    8.0
j         1      Jonas     yes   19.0

Task-2 3.b) Write a Pandas program to select the rows where the score is missing, i.e. is NaN.
Aim: To Write a Pandas program to select the rows where the score is missing, i.e. is
NaN.

Sample DataFrame:
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

Program:
import pandas as pd
import numpy as np

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

df = pd.DataFrame(exam_data, index=labels)

print("Rows where score is missing:")
# isnull() marks the NaN scores; the boolean mask keeps only those rows
print(df[df['score'].isnull()])

Output:

Rows where score is missing:

attempts name qualify score
d 3 James no NaN
h 1 Laura no NaN

Task-3 4.a) Write a Python program to draw a scatter plot with empty circles taking a random
distribution in X and Y and plotted against each other.

Aim: To Write a Python program to draw a scatter plot with empty circles taking a random distribution in X and
Y and plotted against each other.

Code:

Output:
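
A minimal sketch of one way to meet this aim, assuming normally distributed random data and a fixed seed for repeatability. The helper name empty_circle_scatter and its parameters are hypothetical. In Matplotlib, passing facecolors="none" leaves the marker interior unfilled, so only the outline given by edgecolors is drawn.

```python
import matplotlib.pyplot as plt
import numpy as np


def empty_circle_scatter(n_points=50, seed=0):
    """Plot random X against random Y using unfilled circular markers."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_points)
    y = rng.standard_normal(n_points)

    fig, ax = plt.subplots()
    # facecolors="none" makes the circles empty; edgecolors sets the outline color.
    ax.scatter(x, y, s=70, facecolors="none", edgecolors="green")
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    return fig, ax


fig, ax = empty_circle_scatter()
```

Call plt.show() afterwards to display the figure.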
Task-3 4.b) Write a Python program to draw a pie chart with a title of popularity of programming
languages.

Aim: To Write a Python program to draw a pie chart with a title of popularity of programming languages.

Sample data:
Programming languages: Java, Python, PHP, JavaScript, C#, C++
Popularity: 22.2, 17.6, 8.8, 8, 7.7, 6.7

Program:

import matplotlib.pyplot as plt

# Data to plot
languages = 'Java', 'Python', 'PHP', 'JavaScript', 'C#', 'C++'
popularity = [22.2, 17.6, 8.8, 8, 7.7, 6.7]
# Four colors for six slices: matplotlib cycles through the list
colors = ["green", "blue", "yellow", "red"]
# explode 1st slice
explode = (0.1, 0, 0, 0, 0, 0)
# Plot
plt.pie(popularity, explode=explode, labels=languages, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=140)

plt.axis('equal')
plt.title("Popularity of Programming Languages")
plt.show()

Output:
TASK-4 5.A) INSTALL PLOTLY
AIM: To install Plotly successfully on our PC

Procedure:

The Plotly Python library is an interactive, open-source plotting library. It can
be a very helpful tool for data visualization and for understanding the
data simply and easily. Plotly Express (plotly.express) is a high-level interface
to Plotly that is easy to use, and plotly.graph_objects gives finer control over
each figure. Plotly can plot various types of graphs and charts, such as scatter
plots, line charts, bar charts, box plots, histograms, and pie charts.
So why choose Plotly over other visualization tools or libraries? Here’s the answer:
• Plotly has hover tool capabilities that allow us to detect outliers or anomalies among a large number of data points.
• It is visually attractive and can be accepted by a wide range of audiences.
• It allows endless customization of our graphs, which makes our plots more meaningful and understandable for others.
Ok, enough theory, let’s start.
INSTALLATION:

Step 1: Click on Start, open a command prompt in which conda is available (for example, Anaconda Prompt), and type conda install plotly

Step 2: After this, type ‘y’ and press Enter.

Plotly is now installed. Open Jupyter Notebook and type the Plotly programs.

TASK-4 5.B) Create Line Chart, Bar Chart, Pie Chart Using Plotly

Aim: To Create Line Chart, Bar Chart, Pie Chart Using Plotly
1. Line chart:

Program:
import plotly.express as px

x = [1,2,3,4,5]

y = [1,3,4,5,6]

fig = px.line( x = x ,y = y,title = 'A simple line graph')

fig.show()

OUTPUT:

A Simple Linear Graph

2. Bar charts: Bar charts are used when we want to compare different
groups of data and make inferences of which groups are highest and
which groups are common and compare how one group is performing
compared to others.
Program:

output:
3. Pie chart: A pie chart represents the distribution of different
variables among total. In the pie chart each slice shows its
contribution to the total amount.
Program:
# import all required libraries
import plotly.graph_objects as go
from plotly.offline import init_notebook_mode

# enable Plotly output inside the notebook
init_notebook_mode(connected=True)

# different individual parts in total chart
countries = ['India', 'Canada', 'Australia', 'Brazil', 'Mexico',
             'Russia', 'Germany', 'Switzerland', 'Texas']

# values corresponding to each individual country present in countries
values = [4500, 2500, 1053, 500, 3200, 1500, 1253, 600, 3500]

# plotting pie chart
fig = go.Figure(data=[go.Pie(labels=countries, values=values)])
fig.show()
TASK-4 5.C) Create Box Plots, Violin Plots, Heatmaps Using Plotly

Aim :- To Create Box Plots, Violin Plots, Heatmaps Using Plotly

1. Box Plots: A box plot, also known as a box-and-whisker plot, displays
a summary of a set of data values: the minimum, first quartile, median,
third quartile, and maximum. In the box plot, a box is drawn from the
first quartile to the third quartile, and a vertical line goes through
the box at the median. Here the x-axis denotes the data to be plotted
while the y-axis shows the frequency distribution.
Program:
2. Violin plots

A violin plot is a method to visualize the distribution of numerical data of
different variables. It is similar to a box plot but with a rotated density plot on
each side, giving more information about the density estimate on the
y-axis. The density is mirrored and flipped over and the resulting
shape is filled in, creating an image resembling a violin. The advantage
of a violin plot is that it can show nuances in the distribution that aren’t
perceptible in a boxplot. On the other hand, the boxplot more clearly
shows the outliers in the data.
Program:

3. Heatmaps

A heatmap is a graphical representation of data that uses colors
to visualize the values of a matrix. Brighter colors, typically reddish
ones, represent more common values or higher activity, and darker
colors represent less common values or lower activity. A heatmap is
also known as a shading matrix.
Program:

TASK-5 6.A) Develop the Model Simple Linear Regression with Python

Aim: To develop the Simple Linear Regression model with Python

Description:

Simple Linear Regression is a type of regression algorithm that
models the relationship between a dependent variable and a single
independent variable. The relationship shown by a Simple Linear
Regression model is linear, that is, a sloped straight line, hence it is called
Simple Linear Regression.
The key point in Simple Linear Regression is that the dependent
variable must be a continuous/real value. However, the
independent variable can be measured on continuous or categorical
values.
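
To make the "sloped straight line" concrete, the following minimal sketch fits y = slope * x + intercept with the ordinary least-squares formulas, using NumPy only. The function name and the sample data are hypothetical; scikit-learn is not used here.

```python
import numpy as np


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope = covariance(x, y) / variance(x)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point of means.
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Assumed sample data: salary rises by 5000 per year of experience.
years = [1, 2, 3, 4, 5]
salary = [30000, 35000, 40000, 45000, 50000]

slope, intercept = fit_simple_linear_regression(years, salary)
print("Slope:", slope)
print("Intercept:", intercept)
```

Because the sample data lies exactly on a line, the fit recovers a slope of 5000 and an intercept of 25000.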

Program:

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
data_set= pd.read_csv('Salary_Data.csv')
x= data_set.iloc[:, :-1].values
y= data_set.iloc[:, 1].values
# Splitting the dataset into training and test set.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 1/3, random_state=0)
#Fitting the Simple Linear Regression model to the training dataset
from sklearn.linear_model import LinearRegression
regressor= LinearRegression()
regressor.fit(x_train, y_train)
#Prediction of Test and Training set result
y_pred= regressor.predict(x_test)
x_pred= regressor.predict(x_train)
mtp.scatter(x_train, y_train, color="green")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Training Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()
#visualizing the Test set results
mtp.scatter(x_test, y_test, color="blue")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Test Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()
Task5 6.b) Develop the Multiple Linear Regression with python
Aim: To Develop the Multiple Linear Regression with python

Description:

Multiple linear regression is used to estimate the relationship between
two or more independent variables and one dependent variable. You can
use multiple linear regression when you want to know:

Program:

# importing libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# importing the dataset
dataset = pd.read_csv("C:/Users/NECG/Desktop/50_StartsUp.csv")
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1]

# one-hot encoding the categorical column (column index 3)
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [3])],
                       remainder='passthrough')
x = np.array(ct.fit_transform(x))

# Avoiding the dummy variable trap
X = x[:, 1:]

# splitting the dataset into training and test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# training the multiple Linear Regression model on the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Predicting the test set results
y_pred = regressor.predict(X_test)
print(y_test)
print(y_pred)

# mean square error
from sklearn.metrics import mean_squared_error
print(mean_squared_error(y_test, y_pred))

# r2 score to find accuracy
from sklearn.metrics import r2_score
print(r2_score(y_test, y_pred))

# visualizing the training set results as actual vs predicted values,
# because several features cannot be drawn on a single x-axis
y_train_pred = regressor.predict(X_train)
plt.scatter(y_train, y_train_pred, color="green")
plt.plot([y_train.min(), y_train.max()], [y_train.min(), y_train.max()], color="red")
plt.title("Actual vs Predicted (Training Dataset)")
plt.xlabel("Actual value")
plt.ylabel("Predicted value")
plt.show()

# visualizing the test set results
plt.scatter(y_test, y_pred, color="blue")
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], color="red")
plt.title("Actual vs Predicted (Test Dataset)")
plt.xlabel("Actual value")
plt.ylabel("Predicted value")
plt.show()

Output:

TASK-6 7. Write a program to implement Logistic Regression.

Aim: To implement Logistic Regression

To understand the implementation of Logistic Regression in Python, we will
use the example below.

Example: A dataset contains information about various users obtained from
social networking sites. A car-making company has recently launched a new
SUV, and it wants to check how many users from the dataset want to
purchase the car.

For this problem, we will build a Machine Learning model using the Logistic
Regression algorithm. The dataset is shown in the image below. In this
problem, we will predict the purchased variable (Dependent
Variable) by using age and salary (Independent variables).
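
Before the scikit-learn program, the following minimal sketch shows how a fitted logistic regression model turns age and salary into a purchase prediction: a weighted sum of the features is passed through the sigmoid function, and the resulting probability is compared with a threshold. The weights, bias, and feature values are made up for illustration and are not taken from the dataset.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_purchase(features, weights, bias, threshold=0.5):
    """Return purchase probabilities and 0/1 labels for each row of features."""
    probabilities = sigmoid(features @ weights + bias)
    labels = (probabilities >= threshold).astype(int)
    return probabilities, labels


# Hypothetical, already-scaled (age, salary) rows and made-up coefficients.
features = np.array([[-1.0, -0.5],
                     [0.0, 0.0],
                     [1.5, 1.0]])
weights = np.array([2.0, 1.0])
bias = 0.0

probabilities, labels = predict_purchase(features, weights, bias)
print("Probabilities:", probabilities)
print("Predicted labels:", labels)
```

Training, as done by LogisticRegression in scikit-learn, amounts to choosing the weights and bias that best fit the training data.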

Program:

#Data Pre-processing Step

# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
#importing datasets
data_set= pd.read_csv('user_data.csv')
#Extracting Independent and dependent Variable
x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values

# Splitting the dataset into training and test set.

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0)
#feature Scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)
#Fitting Logistic Regression to the training set
from sklearn.linear_model import LogisticRegression
classifier= LogisticRegression(random_state=0)
classifier.fit(x_train, y_train)
#Predicting the test set result
y_pred= classifier.predict(x_test)

#Creating the Confusion matrix

from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)
#Visualizing the test set result
from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple','green' )))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Logistic Regression (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()
TASK-6 8. Write a program to implement the Decision Tree
Regression model.
Aim: To implement the Decision Tree Regression model.
Program:
# import numpy package for arrays and stuff
import numpy as np

# import matplotlib.pyplot for plotting our result

import matplotlib.pyplot as plt

# import pandas for importing csv files

import pandas as pd
# import dataset
# dataset = pd.read_csv('Data.csv')
# alternatively open up .csv file to read data

dataset = np.array(
[['Asset Flip', 100, 1000],
['Text Based', 500, 3000],
['Visual Novel', 1500, 5000],
['2D Pixel Art', 3500, 8000],
['2D Vector Art', 5000, 6500],
['Strategy', 6000, 7000],
['First Person Shooter', 8000, 15000],
['Simulator', 9500, 20000],
['Racing', 12000, 21000],
['RPG', 14000, 25000],
['Sandbox', 15500, 27000],
['Open-World', 16500, 30000],
['MMOFPS', 25000, 52000],
['MMORPG', 30000, 80000]
])

# print the dataset

print(dataset)
# select all rows by : and column 1
# by 1:2 representing features
X = dataset[:, 1:2].astype(int)

# print X
print(X)

# select all rows by : and column 2

# by 2 to Y representing labels
y = dataset[:, 2].astype(int)

# print y
print(y)
# import the regressor
from sklearn.tree import DecisionTreeRegressor

# create a regressor object

regressor = DecisionTreeRegressor(random_state = 0)

# fit the regressor with X and Y data

regressor.fit(X, y)

# predicting a new value

# test the output by changing values, like 3750

y_pred = regressor.predict([[3750]])

# print the predicted price

print("Predicted price: % d\n"% y_pred[0])

# arange for creating a range of values

# from min value of X to max value of X
# with a difference of 0.01 between two
# consecutive values
X_grid = np.arange(X.min(), X.max(), 0.01)

# reshape for reshaping the data into

# a len(X_grid)*1 array, i.e. to make
# a column out of the X_grid values
X_grid = X_grid.reshape((len(X_grid), 1))
# scatter plot for original data
plt.scatter(X, y, color = 'red')

# plot predicted data

plt.plot(X_grid, regressor.predict(X_grid), color = 'blue')

# specify title
plt.title('Profit to Production Cost (Decision Tree Regression)')

# specify X axis label

plt.xlabel('Production Cost')

# specify Y axis label

plt.ylabel('Profit')

# show the plot

plt.show()

# import export_graphviz
from sklearn.tree import export_graphviz

# export the decision tree to a tree.dot file

# for visualizing the plot easily anywhere
export_graphviz(regressor, out_file ='tree.dot',
feature_names =['Production Cost'])
TASK-7 9. Write a program to implement Random Forest Classification model

AIM: To implement Python Program on Random Forest Classification model

PROGRAM:

#Import scikit-learn dataset library

from sklearn import datasets
#Load dataset
iris = datasets.load_iris()
# print the label species(setosa, versicolor,virginica)
print(iris.target_names)
# print the names of the four features
print(iris.feature_names)
# print the iris data (top 5 records)
print(iris.data[0:5])
# print the iris labels (0:setosa, 1:versicolor, 2:virginica)
print(iris.target)
# Creating a DataFrame of given iris dataset.
import pandas as pd
data=pd.DataFrame({
'sepal length':iris.data[:,0],
'sepal width':iris.data[:,1],
'petal length':iris.data[:,2],
'petal width':iris.data[:,3],
'species':iris.target
})
data.head()
# Import train_test_split function
from sklearn.model_selection import train_test_split
# Features
X=data[['sepal length', 'sepal width', 'petal length', 'petal width']]
# Labels
y=data['species']
# Split dataset into training set and test set: 70% training and 30% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
#Import Random Forest Model
from sklearn.ensemble import RandomForestClassifier
#Create a random forest classifier
clf=RandomForestClassifier(n_estimators=100)
#Train the model using the training sets
clf.fit(X_train,y_train)
y_pred=clf.predict(X_test)
#Import scikit-learn metrics module for accuracy calculation
from sklearn import metrics
# Model Accuracy, how often is the classifier correct?
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
# Predict the species of a single flower from its four measurements
clf.predict([[3, 5, 4, 2]])

# Feature importance scores, highest first
feature_imp = pd.Series(clf.feature_importances_,index=iris.feature_names).sort_values(ascending=False)
feature_imp
import matplotlib.pyplot as plt
import seaborn as sns
# Jupyter Notebook magic: draw plots inside the notebook
%matplotlib inline
# Creating a bar plot
sns.barplot(x=feature_imp, y=feature_imp.index)
# Add labels to your graph
plt.xlabel('Feature Importance Score')
plt.ylabel('Features')
plt.title("Visualizing Important Features")
plt.show()
output:

TASK-8 10. Write a program to implement the K-Nearest Neighbour(KNN) algorithm to classify the
given dataset

Aim: To Write a program to implement the K-Nearest Neighbour(KNN) algorithm to classify the given
dataset

Program:

# Import necessary modules

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Loading data
irisData = load_iris()

# Create feature and target arrays

X = irisData.data
y = irisData.target

# Split into training and test set

X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size = 0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=7)

knn.fit(X_train, y_train)

# Predict on dataset which model has not seen before

print(knn.predict(X_test))
print(knn.score(X_test, y_test))
neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))

# Loop over K values

for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)

    # Compute training and test data accuracy

    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)

# Generate plot
plt.plot(neighbors, test_accuracy, label = 'Testing dataset Accuracy')
plt.plot(neighbors, train_accuracy, label = 'Training dataset Accuracy')

plt.legend()
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy')
plt.show()

OUTPUT:

Task 9 11. Write a program to implement the Naive Bayesian classifier for a simple training data set to be
stored in .CSV file.
Aim: To implement the Naive Bayesian classifier for a simple training data set to be stored in .CSV file.

Program:

#Importing the libraries

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('user_data.csv')
x = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
# Fitting Naive Bayes to the Training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train, y_train)
# Predicting the Test set results
y_pred = classifier.predict(x_test)
# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
# Visualising the Training set results
from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
X1, X2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(X1, X2, classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

# Visualising the Test set results

from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
X1, X2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(X1, X2, classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()
Output:
Task 10 12. Write a program to implement clustering algorithm to cluster the set of data stored in .CSV file.

Aim: Program to implement clustering algorithm to cluster the set of data stored
in .CSV file.

Program:

# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('Mall_Customers_data.csv')

x = dataset.iloc[:, [3, 4]].values

from sklearn.cluster import KMeans
wcss_list= [] #Initializing the list for the values of WCSS

#Using for loop for iterations from 1 to 10.

for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state= 42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)
mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters(k)')
mtp.ylabel('wcss_list')
mtp.show()
#training the K-means model on a dataset

kmeans = KMeans(n_clusters=5, init='k-means++', random_state= 42)
y_predict= kmeans.fit_predict(x)
#visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s = 100, c = 'blue', label = 'Cluster 1') #for first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s = 100, c = 'green', label = 'Cluster 2') #for second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s = 100, c = 'red', label = 'Cluster 3') #for third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4') #for fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5') #for fifth cluster
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroid')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()
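
Note on WCSS: The elbow plot in the program above uses kmeans.inertia_, which scikit-learn documents as the sum of squared distances of samples to their closest cluster centre. The following is a minimal sketch of that quantity computed by hand, using only the Python standard library. The function name wcss and the four toy points are illustrative assumptions and are not part of the original lab program or of the Mall_Customers_data.csv dataset.

```python
# Minimal sketch (not part of the original lab program): WCSS computed by hand.
# The function name `wcss` and the toy data below are hypothetical.

def wcss(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centre = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centre))
    return total


# Toy data: two tight groups of two points each.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (9.0, 11.0)]

# k = 1: every point is assigned to a single centroid at the overall mean.
one_cluster = wcss(points, [0, 0, 0, 0], [(5.0, 6.0)])

# k = 2: each group gets its own centroid at the group mean.
two_clusters = wcss(points, [0, 0, 1, 1], [(1.0, 2.0), (9.0, 10.0)])

print(one_cluster, two_clusters)  # prints: 132.0 4.0
```

WCSS drops sharply from 132.0 to 4.0 when k goes from 1 to 2, because the toy data really contains two groups. Adding further clusters can only reduce it by a small amount after that, and this flattening is the "elbow" that the graph is read for.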
