21033570029_dm file kashish
PRACTICAL FILE
Submitted to: -
MR. RAJEEV RAI (Teacher In-charge)
As part of academic curriculum (3RD YEAR - 6TH SEMESTER)
B.SC. (HONS) COMPUTER SCIENCE
INDEX
S. No. Program Name Page No.
DATASETS
1. People1.csv
2. Dirty_Iris.csv
3. Wine.csv
4. Market_basket_optimization
5. Social_network.csv
6. Mall_customers.csv
7. Wholesale_customers data.csv
QUESTION 1
Create a file “people.txt” with the following data:
PLOT IS AS FOLLOWS:
QUESTION 2
BOXPLOT IS AS FOLLOWS:
QUESTION 3
Load the data from the wine dataset. Check whether all attributes are
standardized or not (mean is 0 and standard deviation is 1). If not,
standardize the attributes. Do the same with the iris dataset.
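The check and the fix described above can be sketched for a single numeric attribute as follows. This is a minimal sketch, not the code submitted in this file: the helper names `is_standardized` and `standardize` are hypothetical, the tolerance is an assumed value, and the sample numbers are illustrative values rather than data read from Wine.csv.

```python
import statistics

def is_standardized(column, tol=1e-6):
    """Return True if the column has mean ~0 and population std ~1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return abs(mean) <= tol and abs(std - 1.0) <= tol

def standardize(column):
    """Z-score a column: subtract the mean, divide by the population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        # A constant column has no spread; map it to zeros instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]

# Illustrative values standing in for one numeric attribute (assumption).
alcohol = [14.23, 13.20, 13.16, 14.37, 13.24]

print("standardized before:", is_standardized(alcohol))
scaled = standardize(alcohol)
print("standardized after: ", is_standardized(scaled))
```

The same two functions would be applied to every numeric column of both datasets. If scikit-learn is used instead, `StandardScaler` applies the same population-standard-deviation scaling to all columns at once.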
QUESTION 4
QUESTION 5
Use Naïve Bayes, K-nearest neighbours and Decision tree classification algorithms to
build classifiers.
Divide the dataset into training and test sets. Compare the accuracy
of the different classifiers under the following situations:
5.1 a) Training set = 75%, Test set = 25%
    b) Training set = 66.6%, Test set = 33.3%
5.2 Training set is chosen by
    i) Hold out method
    ii) Random subsampling
    iii) Cross validation method
    Compare the accuracy of the classifiers obtained.
5.3 Dataset is scaled to standard format.
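The three ways of choosing the training set in 5.2, together with the two split ratios in 5.1, can be sketched as below. This is a rough illustration under stated assumptions, not the code submitted in this file: a hand-written 3-nearest-neighbour classifier stands in for all three classifiers, the data is synthetic (two well-separated Gaussian clusters) because the question does not name the dataset, and every function name, the number of repeats and the number of folds are hypothetical choices.

```python
import random

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train_idx, test_idx, X, y):
    """Fraction of test rows whose predicted label matches the true label."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i]) == y[i] for i in test_idx)
    return hits / len(test_idx)

def holdout(X, y, train_frac, rng):
    """Hold out method: one random split into training and test rows."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    cut = int(round(train_frac * len(idx)))
    return accuracy(idx[:cut], idx[cut:], X, y)

def random_subsampling(X, y, train_frac, rng, repeats=10):
    """Random subsampling: repeat the hold out split and average the accuracy."""
    return sum(holdout(X, y, train_frac, rng) for _ in range(repeats)) / repeats

def cross_validation(X, y, rng, k=5):
    """k-fold cross validation: each fold is the test set exactly once."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(accuracy(train_idx, test_idx, X, y))
    return sum(scores) / k

# Synthetic two-class data (assumption): clusters centred at (0, 0) and (6, 6).
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)] + \
    [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(60)]
y = [0] * 60 + [1] * 60

results = {
    "hold out 75/25": holdout(X, y, 0.75, rng),
    "hold out 66.6/33.3": holdout(X, y, 2 / 3, rng),
    "random subsampling": random_subsampling(X, y, 0.75, rng),
    "5-fold cross validation": cross_validation(X, y, rng),
}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

With scikit-learn, `train_test_split`, `ShuffleSplit` and `cross_val_score` cover the same three protocols, and `GaussianNB`, `KNeighborsClassifier` and `DecisionTreeClassifier` can be swapped in for the stand-in classifier. For 5.3, the attributes would be standardized before the split is evaluated.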
NAÏVE BAYES
(When training set = 75% and test set = 25%)
KNN
DECISION TREE
QUESTION 6
K-means
DBSCAN
Hierarchical Clustering