Q (1-8)

The document outlines a series of tasks involving data classification and clustering using various algorithms on randomly generated and existing datasets. It includes generating random values, classifying them using KNN, WKNN, and radius-based NNC, as well as clustering using Leader clustering and calculating purity values. Additionally, it covers tasks involving the Digits and Olivetti Face datasets, focusing on classification accuracy and dimensionality reduction techniques.


Q1. Randomly generate 100 values of x in the range [0,1]. Let them be x1, x2, ..., x100. Perform the following based on the data set generated.
a. Label the first 50 points {x1, ..., x50} as follows: if xi < 0.5, then xi ∈ Class1, else xi ∈ Class2.
b. Classification.
i. Classify the remaining points, x51, ..., x100, using KNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
ii. Classify the remaining points, x51, ..., x100, using WKNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
iii. Classify the remaining points, x51, ..., x100, using radius-based NNC. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
c. Compute the classification accuracy in all three cases and report. [Note: Classification accuracy = nc/n, where n = 50 and nc = the number of points correctly classified.]
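A minimal sketch of one way to set this up with scikit-learn is given below. The seed, the weighting scheme used for WKNN (weights='distance'), and the radius values for part iii are assumptions; the question lists k values for the radius-based case, but sklearn's RadiusNeighborsClassifier is parameterised by a radius rather than k.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier, RadiusNeighborsClassifier

rng = np.random.default_rng(0)                     # fixed seed, assumed for reproducibility
x = rng.uniform(0, 1, 100).reshape(-1, 1)          # x1, ..., x100
X_train, X_test = x[:50], x[50:]
y_train = np.where(X_train.ravel() < 0.5, 1, 2)    # Class1 / Class2 labels for x1..x50
y_true = np.where(X_test.ravel() < 0.5, 1, 2)      # ground truth for x51..x100

for k in [1, 2, 3, 4, 5, 20, 30]:
    for name, clf in [('KNN', KNeighborsClassifier(n_neighbors=k)),
                      ('WKNN', KNeighborsClassifier(n_neighbors=k, weights='distance'))]:
        clf.fit(X_train, y_train)
        acc = (clf.predict(X_test) == y_true).mean()   # nc/n with n = 50
        print(f'{name} k={k}: accuracy = {acc:.2f}')

for r in [0.05, 0.1, 0.2]:                         # radius values are an assumption
    rnn = RadiusNeighborsClassifier(radius=r, outlier_label='most_frequent')
    rnn.fit(X_train, y_train)
    print(f'radius-NNC r={r}: accuracy = {(rnn.predict(X_test) == y_true).mean():.2f}')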
Output:
Q2. Cluster the entire set of 100 points (as mentioned in Q1) using Leader clustering. Choose
different values for threshold (T) and carry out the clustering.
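Leader clustering is not available in scikit-learn, so a small hand-rolled sketch is shown below; the threshold values tried are assumptions.

import numpy as np

def leader_clustering(points, threshold):
    # Single pass: assign each point to the first leader within `threshold`;
    # otherwise the point becomes a new leader.
    leaders, labels = [], []
    for p in points:
        for idx, ld in enumerate(leaders):
            if abs(p - ld) <= threshold:
                labels.append(idx)
                break
        else:
            leaders.append(p)
            labels.append(len(leaders) - 1)
    return leaders, labels

x = np.random.default_rng(0).uniform(0, 1, 100)    # the 100 points from Q1
for T in [0.05, 0.1, 0.2, 0.3]:                    # assumed threshold values
    leaders, labels = leader_clustering(x, T)
    print(f'T={T}: {len(leaders)} clusters')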

Output:
Q3. Let the clustering obtained using some threshold T_i be C_i = {Cluster_1, Cluster_2, ..., Cluster_{l_i}}. Compute the purity value for each clustering, which is given by

$\mathrm{Purity}(C_i) = \sum_{j=1}^{l_i} \max\left(|\mathrm{Cluster}_j \cap \mathrm{Class}_1|,\; |\mathrm{Cluster}_j \cap \mathrm{Class}_2|\right)$

where Class1 and Class2 are the labels assigned in Q1.
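A small helper that computes the purity as defined above might look as follows; it assumes cluster labels from Q2 and the two class labels from Q1, and the variable names are hypothetical.

import numpy as np

def purity(cluster_labels, class_labels):
    # Sum, over clusters, of the size of the largest class inside each cluster.
    # Divide by len(class_labels) if a normalised purity in [0, 1] is wanted.
    cluster_labels = np.asarray(cluster_labels)
    class_labels = np.asarray(class_labels)
    return sum(np.bincount(class_labels[cluster_labels == c]).max()
               for c in np.unique(cluster_labels))

# e.g. with x and labels from Q1/Q2:
# print(purity(labels, np.where(x < 0.5, 1, 2)))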

Output:
Q4. Use the Digits data set available under sklearn: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html. Consider 10% of the data for training (179 samples). Each pattern is an 8 x 8-sized character where each value is an integer in the range 0 to 16. Convert it into binary form by replacing a value below 8 by 0 and other values (≥ 8) by 1.
a. Use these 179 patterns with labels and the remaining without labels for this subtask. Use KNN
and label the patterns without labels. Obtain the % classification accuracy. Perform this task with
k values from the set {1, 3, 5, 10, 20}.
b. Obtain the frequent itemsets for these 179 patterns using FP-growth by viewing each binary
pattern as a transaction of 64 items. Repeat this task with different minsup values from {0.1, 0.3,
0.5, 0.7}.
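One possible sketch for both subtasks follows. It assumes the mlxtend package for FP-growth (scikit-learn has no FP-growth implementation), and the stratified split is also an assumption.

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from mlxtend.frequent_patterns import fpgrowth     # assumes mlxtend is installed

X, y = load_digits(return_X_y=True)
X_bin = (X >= 8).astype(int)                       # binarise: < 8 -> 0, >= 8 -> 1
X_tr, X_te, y_tr, y_te = train_test_split(
    X_bin, y, train_size=179, stratify=y, random_state=0)

# (a) KNN on the binary patterns
for k in [1, 3, 5, 10, 20]:
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(f'k={k}: {100 * clf.score(X_te, y_te):.1f}% accuracy')

# (b) FP-growth, viewing each binary pattern as a transaction of 64 items
df = pd.DataFrame(X_tr.astype(bool), columns=[f'pix{i}' for i in range(64)])
for minsup in [0.1, 0.3, 0.5, 0.7]:
    print(f'minsup={minsup}: {len(fpgrowth(df, min_support=minsup))} frequent itemsets')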
Output:
Q5. Download the Olivetti Face data set. There are 40 classes (corresponding to 40 people), each class having 10 faces of the individual; so there are 400 images in total. Here, each face is viewed as an image of size 64 x 64 (= 4096) pixels; each pixel has values 0 to 255, which are ultimately converted into floating-point numbers in the range [0,1]. Visit https://scikit-learn.org/0.19/datasets/olivetti_faces.html for more details. Your Tasks: There are three tasks. For all the tasks, split the data set into train and test parts. Carry out this splitting randomly 10 times and report the average accuracy. You may vary the test and train data set sizes. The tasks are:
a. Task 1: Build a decision tree using the training data. Tune the parameters corresponding to pruning the decision tree. Use the best decision tree to classify the test data set and obtain the accuracy. Use both Gini and entropy impurities.
b. Task 2: Build a random forest classifier using the training data set. Use RF with 50 decision trees. Obtain the classification accuracy on the test data with the number of features as 20%, 40% and 60% of the given set of features.
c. Task 3: Use the XGBoost classifier to classify by viewing the entire data set as the training data set. Find out the accuracy on the data set using 50 and 100 trees.
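A condensed sketch of the three tasks follows. The 75/25 split ratio, the ccp_alpha grid used for pruning, and evaluating XGBoost on its own training data (a literal reading of Task 3) are assumptions.

import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier                  # assumes xgboost is installed

X, y = fetch_olivetti_faces(return_X_y=True)       # 400 x 4096, labels 0..39
tree_acc, rf_acc = [], []
for seed in range(10):                             # 10 random splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    # Task 1: tune pruning (cost-complexity alpha) with both impurities
    grid = GridSearchCV(DecisionTreeClassifier(random_state=seed),
                        {'criterion': ['gini', 'entropy'],
                         'ccp_alpha': [0.0, 0.001, 0.01]}, cv=3)
    tree_acc.append(grid.fit(X_tr, y_tr).score(X_te, y_te))

    # Task 2: RF with 50 trees; repeat with max_features = 0.2, 0.4, 0.6
    rf = RandomForestClassifier(n_estimators=50, max_features=0.2,
                                random_state=seed).fit(X_tr, y_tr)
    rf_acc.append(rf.score(X_te, y_te))

print('tree:', np.mean(tree_acc), 'rf:', np.mean(rf_acc))

# Task 3: XGBoost with the entire data set as the training data
for n in [50, 100]:
    xgb = XGBClassifier(n_estimators=n).fit(X, y)
    print(n, (xgb.predict(X) == y).mean())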
Output:
Q6. Download the Olivetti Face data set. There are 40 classes (corresponding to 40 people), each class having 10 faces of the individual; so there are a total of 400 images. Here, each face is viewed as an image of size 64 x 64 (= 4096) pixels; each pixel has values 0 to 255, which are ultimately converted into floating-point numbers in the range [0,1]. Visit https://scikit-learn.org/0.19/datasets/olivetti_faces.html for more details. Split the data set into train and test parts. Perform this splitting randomly 10 times and report the average accuracy. You may vary the test and train data set sizes. Use NBC (the naive Bayes classifier) to classify the test data set. Obtain the accuracy on the test data.
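A minimal sketch, assuming Gaussian naive Bayes (the question says only "NBC") and a 75/25 split:

import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB         # Gaussian NB is an assumption

X, y = fetch_olivetti_faces(return_X_y=True)
accs = []
for seed in range(10):                             # 10 random splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    accs.append(GaussianNB().fit(X_tr, y_tr).score(X_te, y_te))
print(f'average accuracy over 10 splits: {np.mean(accs):.3f}')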

Output:
Q7. Use the Wisconsin Breast Cancer data set available under sklearn. There are 569 samples corresponding to two classes. Each is a 30-dimensional vector. For more details, visit https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html. There are two tasks. For both tasks, split the data set into train and test parts using train_size = 0.8. Perform this splitting randomly 10 times and report the average accuracy. The tasks are:
a. Task 1: Here, you are supposed to reduce the dimensionality of the data set by clustering the 30 features into 12, 20 and 30 clusters obtained using the k-means algorithm. Note that the resulting feature values are obtained by the centroids of the K clusters in each case. Compute the percentage accuracy using a Gaussian naïve Bayes classifier on the test data. So, the resulting training data set is of size 455 × K and the test data set is of size 114 × K.
b. Task 2: Repeat the task in (a) using k-means++ in place of the k-means algorithm.
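One way to read Task 1 is: treat each of the 30 feature columns as a point (in the 455-dimensional training space), cluster those points with k-means, and replace each group of features by its centroid, i.e. the mean of the member columns. A sketch under that interpretation:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)

def group_features(M, labels, K):
    # New feature j = mean of the original columns assigned to cluster j
    return np.column_stack([M[:, labels == j].mean(axis=1) for j in range(K)])

for K in [12, 20, 30]:
    accs = []
    for seed in range(10):                         # 10 random 80/20 splits
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=0.8, random_state=seed)
        km = KMeans(n_clusters=K, init='random', n_init=10,
                    random_state=seed).fit(X_tr.T)   # cluster the 30 columns
        Z_tr = group_features(X_tr, km.labels_, K)   # 455 x K
        Z_te = group_features(X_te, km.labels_, K)   # 114 x K
        accs.append(GaussianNB().fit(Z_tr, y_tr).score(Z_te, y_te))
    print(f'K={K}: {100 * np.mean(accs):.1f}%')

For Task 2, the same call with init='k-means++' (scikit-learn's default initialisation) replaces init='random'.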
Output:
Q8. Download the Olivetti Face data set. There are 40 classes (corresponding to 40 people), each class having 10 faces of the individual; so there are 400 images in total. Here, each face is viewed as an image of size 64 x 64 (= 4096) pixels; each pixel has values 0 to 255, which are ultimately converted into floating-point numbers in the range [0,1]. Visit https://scikit-learn.org/0.19/datasets/olivetti_faces.html for more details. Your Tasks: There are two subtasks. For both subtasks, split the data set into train and test parts. Vary the test size. The tasks are:
a. Task 1: Reduce the dimensionality of the data set from 4096 to 400 using PCA and classify the 400-dimensional data set using perceptron, SVM, logistic regression and MLP.
b. Task 2: Repeat Task 1 using SVs (singular vectors) instead of PCs (principal components).
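A sketch of Task 1 follows. Note that PCA cannot produce more components than training samples, so with a 320-image training set the 400 requested components are capped; the classifier hyperparameters are assumptions. For Task 2, TruncatedSVD(n_components=p) can be swapped in for PCA to project onto singular vectors.

import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.linear_model import Perceptron, LogisticRegression
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

X, y = fetch_olivetti_faces(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)   # vary test_size as asked

p = min(400, X_tr.shape[0])              # PCA rank is limited by the sample count
reducer = PCA(n_components=p).fit(X_tr)  # Task 2: TruncatedSVD(n_components=p)
Z_tr, Z_te = reducer.transform(X_tr), reducer.transform(X_te)

classifiers = {
    'perceptron': Perceptron(),
    'svm': SVC(kernel='linear'),                   # linear kernel is an assumption
    'logreg': LogisticRegression(max_iter=1000),
    'mlp': MLPClassifier(hidden_layer_sizes=(100,), max_iter=500),
}
for name, clf in classifiers.items():
    print(f'{name}: {100 * clf.fit(Z_tr, y_tr).score(Z_te, y_te):.1f}%')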
Output:
