AI ML - Cycle 2 Programs

The document outlines various machine learning techniques, including linear regression, logistic regression, decision trees, random forests, SVM, and clustering algorithms. Each section provides a programmatic approach using Python libraries such as pandas, statsmodels, and scikit-learn, along with example datasets and outputs for model evaluation. The document also includes implementations of ensemble techniques (VotingClassifier, BaggingClassifier, AdaBoostClassifier) and Gaussian Mixture Models for clustering.

Ex No: 7a BUILD LINEAR REGRESSION MODELS

Date:

Program:
import pandas as pd
import statsmodels.api as sm
data = pd.read_csv("pima_diabetes.csv")
#create and display the correlation matrix
print(data.corr())
#Bivariate Analysis of Glucose-Insulin features
#define response variable 1
y1 = data['Glucose']
#define explanatory variable 1
x1 = data[['Insulin']]
#add constant to predictor variables
x1 = sm.add_constant(x1)
#fit linear regression model
model1 = sm.OLS(y1, x1).fit()
#view model summary
print(model1.summary())
#Bivariate Analysis of Age-Pregnancies features
#define response variable 2
y2 = data['Age']
#define explanatory variable 2
x2 = data[['Pregnancies']]
#add constant to predictor variables
x2 = sm.add_constant(x2)
#fit linear regression model
model2 = sm.OLS(y2, x2).fit()
#view model summary
print(model2.summary())
#Bivariate Analysis of SkinThickness-BMI features
#define response variable 3
y3 = data['SkinThickness']
#define explanatory variable 3
x3 = data[['BMI']]
#add constant to predictor variables
x3 = sm.add_constant(x3)
#fit linear regression model
model3 = sm.OLS(y3, x3).fit()
#view model summary
print(model3.summary())
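Beyond the full summary table, individual fit statistics can be read directly off a fitted results object; a minimal sketch using standard statsmodels attributes:

#Optional check: pull key statistics from the first fitted model.
print("R-squared:", model1.rsquared)
print("Coefficients:\n", model1.params)
print("p-values:\n", model1.pvalues)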
Output:

a. Correlation Matrix

b. Bivariate Analysis of Glucose-Insulin features


c. Bivariate Analysis of Age-Pregnancies features

d. Bivariate Analysis of SkinThickness-BMI features

Result:
Ex No: 7b BUILD LOGISTIC REGRESSION MODELS
Date:

Program:
# importing libraries
import statsmodels.api as sm
import pandas as pd
# loading the training dataset
data = pd.read_csv('pima_diabetes.csv', index_col = 0)
# defining the dependent and independent variables
Xtrain = data[['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
               'DiabetesPedigreeFunction', 'Age']]
ytrain = data[['Outcome']]
# building the model and fitting the data
log_reg = sm.Logit(ytrain, Xtrain).fit()
# printing the summary table
print(log_reg.summary())
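Note that sm.Logit does not add an intercept automatically, so the model above is fit without one. A variant sketch that adds a constant term and reports odds ratios (np.exp of the coefficients):

import numpy as np
# Variant sketch: include an intercept, then express coefficients as odds ratios.
Xtrain_const = sm.add_constant(Xtrain)
log_reg_const = sm.Logit(ytrain, Xtrain_const).fit()
print(np.exp(log_reg_const.params))  # odds ratio per one-unit increase in each feature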

Output:

Result:
Ex No: 7c BUILD DECISION TREES
Date:

Program:

import pandas
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

df = pandas.read_csv("data.csv")
print("Input:")
print(df.head(5))
d = {'UK':0,'USA':1,'N':2}
df['Nationality'] = df['Nationality'].map(d)
d = {'YES':1, 'NO':0}
df['Go'] = df['Go'].map(d)
print("Transformed Data:")
print(df.head(5))
features = ['Age','Experience','Rank','Nationality']
X = df[features]
y = df['Go']
dtree = DecisionTreeClassifier()
dtree = dtree.fit(X,y)
print(dtree.predict([[40,10,6,1]]))
print("[1]means 'Go'")
print("[0]means 'NO'")
DATA SET : (data.csv)
Age Experience Rank Nationality Go
36 10 9 UK NO
42 12 4 USA NO
23 4 6 N NO
52 4 4 USA NO
43 21 8 USA YES
Output:

Result:
Ex No: 7d BUILD RANDOM FORESTS
Date:

Program:

# Pandas is used for data manipulation


import pandas as pd
# Read in data and display first 5 rows
features = pd.read_csv('temps.csv')
features.head(5)

print('The shape of our features is:', features.shape)

# Descriptive statistics for each column


features.describe()

# One-hot encode the data using pandas get_dummies


features = pd.get_dummies(features)
# Display the first 5 rows of the last 12 columns
features.iloc[:,5:].head(5)

import numpy as np
# Labels are the values we want to predict
labels = np.array(features['actual'])
# Remove the labels from the features
# axis 1 refers to the columns
features= features.drop('actual', axis = 1)
# Saving feature names for later use
feature_list = list(features.columns)
# Convert to numpy array
features = np.array(features)

# Using Skicit-learn to split data into training and testing sets


from sklearn.model_selection import train_test_split
# Split the data into training and testing sets
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.25, random_state=42)

print('Training Features Shape:', train_features.shape)


print('Training Labels Shape:', train_labels.shape)
print('Testing Features Shape:', test_features.shape)
print('Testing Labels Shape:', test_labels.shape)
# Import the model we are using
from sklearn.ensemble import RandomForestRegressor
# Limit depth of tree to 3 levels
rf_small = RandomForestRegressor(n_estimators=10, max_depth = 3)
# Train the model on training data
rf_small.fit(train_features, train_labels)

# Extract the small tree


tree_small = rf_small.estimators_[5]
# Save the tree as a png image
from sklearn.tree import export_graphviz
import pydot
export_graphviz(tree_small, out_file = 'small_tree.dot', feature_names = feature_list,
                rounded = True, precision = 1)
(graph, ) = pydot.graph_from_dot_file('small_tree.dot')
graph.write_png('small_tree.png')

# Use the forest's predict method on the test data


predictions = rf_small.predict(test_features)
# Calculate the absolute errors
errors = abs(predictions - test_labels)
# Print out the mean absolute error (mae)
print('Mean Absolute Error:', round(np.mean(errors), 2), 'degrees.')

# Calculate mean absolute percentage error (MAPE)


mape = 100 * (errors / test_labels)
# Calculate and display accuracy
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')
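A random forest also reports how much each input contributed to its predictions; a short sketch using the regressor's standard feature_importances_ attribute:

# Pair each feature with its importance and list them, highest first.
importances = sorted(zip(feature_list, rf_small.feature_importances_),
                     key=lambda pair: pair[1], reverse=True)
for name, score in importances:
    print(name, round(score, 3))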

Output:

The shape of our features is: (348, 12)

Training Features Shape: (261, 17)


Training Labels Shape: (261,)
Testing Features Shape: (87, 17)
Testing Labels Shape: (87,)
RandomForestRegressor(max_depth=3, n_estimators=10)

Mean Absolute Error: 4.0 degrees.

Accuracy: 93.73 %.

Result:
Ex No: 7e BUILD SVM MODELS
Date:

Program:
import pandas
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

data = pandas.read_csv("vector.csv")
print("Input: ")
print(data.head(10))

training_set, test_set = train_test_split(data, test_size = 0.3, random_state=1)

x_train = training_set.iloc[:,0:2].values
y_train = training_set.iloc[:,2].values
x_test = test_set.iloc[:,0:2].values
y_test = test_set.iloc[:,2].values

classifier = SVC(kernel='linear', random_state=1)


classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)
test_set["prediction"] = y_pred
print("Output")
print(test_set)

cm = confusion_matrix(y_test, y_pred)
accuracy = float(cm.diagonal().sum()/len(y_test))
print("\nAccuracy of SVM for the given dataset: ", accuracy)

Dataset
Output:

Result:
Ex No: 8 IMPLEMENT ENSEMBLING TECHNIQUES
Date:

Program:

#Implement VotingClassifier
#Importing necessary libraries:
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

#Creating dataset:
X, y = make_moons(n_samples=500, noise=0.30)
X_train, X_test, y_train, y_test = train_test_split(X, y)

#Initializing the models:


log = LogisticRegression()
rnd = RandomForestClassifier(n_estimators=100)
svm = SVC()
voting = VotingClassifier(
    estimators=[('logistic_regression', log), ('random_forest', rnd),
                ('support_vector_machine', svm)],
    voting='hard')

#Fitting training data:


voting.fit(X_train, y_train)

#prediction using test data


for clf in (log, rnd, svm, voting):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(clf.__class__.__name__, accuracy_score(y_test, y_pred))
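Hard voting counts class votes; soft voting averages the estimators' predicted probabilities instead and often edges it out. A sketch (SVC needs probability=True to expose predict_proba):

# Sketch: soft voting averages class probabilities across the estimators.
soft_voting = VotingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('rf', RandomForestClassifier(n_estimators=100)),
                ('svc', SVC(probability=True))],
    voting='soft')
soft_voting.fit(X_train, y_train)
print('soft voting', accuracy_score(y_test, soft_voting.predict(X_test)))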

#Implement BaggingClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

bagging_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=250,
    max_samples=100, bootstrap=True, random_state=101)
#Fitting training data:
bagging_clf.fit(X_train, y_train)

#prediction using test data


y_pred = bagging_clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
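Because each tree is trained on a bootstrap sample, roughly a third of the training rows are left out of any given tree; setting oob_score=True scores the ensemble on those out-of-bag rows for free. A sketch:

# Sketch: out-of-bag evaluation, no separate validation split needed.
oob_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=250,
    max_samples=100, bootstrap=True, oob_score=True, random_state=101)
oob_clf.fit(X_train, y_train)
print(oob_clf.oob_score_)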

#Implement AdaBoostClassifier
from sklearn.ensemble import AdaBoostClassifier
# note: algorithm="SAMME.R" is deprecated in recent scikit-learn releases;
# omit the argument there and the default is used
adaboost_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=200,
    algorithm="SAMME.R", learning_rate=0.5, random_state=42)

#Fitting training data:


adaboost_clf.fit(X_train, y_train)

#prediction using test data


y_pred = adaboost_clf.predict(X_test)
print(accuracy_score(y_test, y_pred))

Output:

#For VotingClassifier
LogisticRegression 0.848
RandomForestClassifier 0.88
SVC 0.896
VotingClassifier 0.896

#For BaggingClassifier
0.888

#For AdaBoostClassifier
0.864

Result:
Ex No: 9 IMPLEMENT CLUSTERING ALGORITHMS
Date:

Program:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

data = {'x': [25,34,22,27,33,33,31,22,35,34,67,54,57,43,50,
              57,59,52,65,47,49,48,35,33,44,45,38,43,51,46],
        'y': [79,51,53,78,59,74,73,57,69,75,51,32,40,47,53,
              36,35,58,59,50,25,20,14,12,20,5,29,27,8,7]}

df = pd.DataFrame(data, columns=['x', 'y'])


kmeans = KMeans(n_clusters=3).fit(df)
centroids = kmeans.cluster_centers_
print(centroids)
plt.scatter(df['x'], df['y'], c= kmeans.labels_.astype(float), s=50, alpha=0.5)
plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
plt.show()
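Here k=3 is fixed by eye; the usual way to justify the choice is the elbow method, which plots within-cluster inertia as k grows and looks for the bend. A minimal sketch on the same DataFrame:

# Sketch: inertia falls steeply up to the natural k, then flattens.
inertias = [KMeans(n_clusters=k, n_init=10).fit(df).inertia_ for k in range(1, 10)]
plt.plot(range(1, 10), inertias, marker='o')
plt.xlabel('k')
plt.ylabel('inertia')
plt.show()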

Output:

Result:
Ex No: 10 IMPLEMENT GMM ALGORITHMS
Date:

Program:
import matplotlib.pyplot as plt
from sklearn import datasets
import sklearn.metrics as sm
import pandas as pd
import numpy as np
%matplotlib inline
# import some data to play with
iris = datasets.load_iris()

#print("\n IRIS DATA :",iris.data);


#print("\n IRIS FEATURES :\n",iris.feature_names)
#print("\n IRIS TARGET :\n",iris.target)
#print("\n IRIS TARGET NAMES:\n",iris.target_names)
# Store the inputs as a Pandas Dataframe and set the column names
X = pd.DataFrame(iris.data)
#print(X)
X.columns = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
#print(X.columns)
#print("X:",x)
#print("Y:",y)
y = pd.DataFrame(iris.target)
y.columns = ['Targets']
# Set the size of the plot
plt.figure(figsize=(14,7))
# Create a colormap
colormap = np.array(['red', 'lime', 'black'])
# Plot Sepal
plt.subplot(1, 2, 1)
plt.scatter(X.Sepal_Length,X.Sepal_Width, c=colormap[y.Targets], s=40)
plt.title('Sepal')
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length,X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Petal')

# GMM
from sklearn import preprocessing
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns = X.columns)
xs.sample(5)
from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm = gmm.predict(xs)
y_cluster_gmm
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
plt.title('GMM Classification')
# Accuracy (a low raw score is expected here: cluster indices are arbitrary)
print(sm.accuracy_score(y, y_cluster_gmm))
# Confusion Matrix
sm.confusion_matrix(y, y_cluster_gmm)
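The confusion matrix shows clusters 1 and 2 are swapped relative to the true labels, which is why the raw accuracy understates the fit: GMM cluster indices are arbitrary. A sketch (the remapping is an illustration, not part of the original program) that aligns each cluster with its best-matching class before scoring:

# Sketch: map each cluster to the true class it overlaps most, then re-score.
cm = sm.confusion_matrix(y, y_cluster_gmm)
mapping = cm.argmax(axis=0)  # best true class for each cluster column
print(sm.accuracy_score(np.asarray(y).ravel(), mapping[y_cluster_gmm]))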

Output:

array([[50, 0, 0],
[ 0, 5, 45],
[ 0, 50, 0]], dtype=int64)

Result:
