
CALCULATION

The document contains information about student mark lists from a notepad and calculations in Excel and relation format. It includes student registration numbers, names, and marks in three subjects. It also contains information about converting data between different formats like numeric to nominal.

Uploaded by

E2-08 Bharath.M


NOTEPAD:

@relation 'student mark list'
@attribute regno numeric
@attribute name {aa, bb, cc, dd, ee}
@attribute mark1 numeric
@attribute mark2 numeric
@attribute mark3 numeric
@data
1, aa, 35, 45, 65
2, bb, 46, 85, 52
3, cc, 65, 49, 63
4, dd, 75, 45, 41
5, ee, 41, 51, 74

CALCULATION:

Excel: Student Mark List

Reg NO Name Mark1 Mark2 Mark3

1 aa 35 45 65
2 bb 46 85 52
3 cc 65 49 63
4 dd 75 45 41
5 ee 41 51 74

Relation: Student Mark List

No 1:Reg No 2:Name 3:Mark1 4:Mark2 5:Mark3

Numeric Nominal Numeric Numeric Numeric

1 1.0 aa 35.0 45.0 65.0
2 2.0 bb 46.0 85.0 52.0
3 3.0 cc 65.0 49.0 63.0
4 4.0 dd 75.0 45.0 41.0
5 5.0 ee 41.0 51.0 74.0

NOTEPAD:
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {true, false}
@attribute play {yes, no}
@data
sunny, 85, 85, false, no
sunny, 80, 90, true, no
overcast, 83, 86, false, yes
rainy, 70, 96, false, yes
rainy, 68, 80, false, yes

CALCULATION:
Excel:

Outlook Temperature Humidity Windy Play

real Real
Sunny 85 85 FALSE NO
Sunny 80 90 TRUE NO
Rainy 70 96 FALSE YES
overcast 83 86 FALSE YES
Rainy 68 80 FALSE YES
CALCULATION:

NUMERIC TO NOMINAL

No 1:Outlook 2:Temperature 3:Humidity 4:Windy 5:Play

Nominal Numeric Numeric Nominal Nominal
1 Sunny 85.0 85.0 FALSE NO
2 Sunny 80.0 90.0 TRUE NO
3 Rainy 70.0 96.0 FALSE YES
4 Overcast 83.0 86.0 FALSE YES
5 Rainy 68.0 80.0 FALSE YES

No 1:Outlook 2:Temperature 3:Humidity 4:Windy 5:Play

Nominal Nominal Nominal Nominal Nominal
1 Sunny “ALL” “ALL” FALSE NO
2 Sunny “ALL” “ALL” TRUE NO
3 Rainy “ALL” “ALL” FALSE YES
4 Overcast “ALL” “ALL” FALSE YES
5 Rainy “ALL” “ALL” FALSE YES

CALCULATION:
No 1:Outlook 2:Temperature 3:Humidity 4:Windy 5:Play
Nominal Nominal Nominal Nominal Nominal
1 Sunny Hot high FALSE no
2 Sunny Hot high TRUE no
3 Rainy Hot high FALSE yes
4 Overcast Mid high FALSE yes
5 Rainy Cool normal FALSE yes

No 1:Outlook=sunny 2:Outlook=overcast 3:Outlook=rainy 4:Temperature=hot 5:Temperature=mid 6:Temperature=cool 7:Humidity=normal 8:Windy=false 9:Play
Numeric Numeric Numeric Numeric Numeric Numeric Numeric Numeric Nominal
1 1.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0 no
2 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 no
3 0.0 1.0 0.0 1.0 0.0 0.0 0.0 1.0 yes
4 0.0 0.0 1.0 0.0 1.0 0.0 0.0 1.0 yes
5 0.0 0.0 1.0 0.0 0.0 1.0 1.0 1.0 yes
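The nominal-to-binary step can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `encode_row` and the `CATEGORIES` table are hypothetical names, not part of the original exercise, and the category order is taken from the column headers of the table above.

```python
# Hypothetical helper illustrating nominal-to-binary (one-hot) encoding.
# Category order is assumed from the column headers of the table above.
CATEGORIES = {
    "outlook": ["sunny", "overcast", "rainy"],
    "temperature": ["hot", "mid", "cool"],
}


def encode_row(outlook, temperature, humidity, windy, play):
    """Return the binary indicator columns followed by the class label."""
    encoded = []
    for name, value in (("outlook", outlook), ("temperature", temperature)):
        # One indicator column per category value.
        encoded += [1.0 if value == c else 0.0 for c in CATEGORIES[name]]
    # Two-valued attributes collapse to a single indicator each.
    encoded.append(1.0 if humidity == "normal" else 0.0)  # Humidity=normal
    encoded.append(1.0 if windy == "false" else 0.0)      # Windy=false
    return encoded + [play]


row_1 = encode_row("sunny", "hot", "high", "false", "no")
row_5 = encode_row("rainy", "cool", "normal", "false", "yes")
print(row_1)
print(row_5)
```

Each three-valued attribute expands into three indicator columns, while the two-valued attributes need only one column each, which is why five attributes become nine columns.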

Applying Bias Theorem:

Calculation Old data Predicted data Error (Error)^2

3,3 56 77 21 441
3,5 89 69.8 19.2 368.64
5,4 98 76.3 21.7 470.89
7,6 98 71.8 26.2 38.44
∑ 341 294.9 68.1 1318.97

Accuracy = error / predicted data * 100

= 68.1 / 294.9 * 100

= 23%

CALCULATION:

$x$ $y$ $x_i - \bar{x}$ $y_i - \bar{y}$ $(x_i - \bar{x})^2$ $(y_i - \bar{y})^2$ $(x_i - \bar{x})(y_i - \bar{y})$

3 30 6 24 36 576 144
8 57 1 -3 1 9 3
9 64 0 -10 0 100 0
13 72 -4 -18 16 324 72
3 30 6 24 36 576 144
6 43 3 11 9 121 33
11 50 -2 4 4 16 8
21 90 -12 -35 144 1225 420
1 20 8 34 64 1156 272
16 83 -7 -29 49 841 203
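Mean: $\bar{x} = 9.1$, $\bar{y} = 53.9$
∑: $\sum (x_i - \bar{x})^2 = 359$, $\sum (y_i - \bar{y})^2 = 4944$, $\sum (x_i - \bar{x})(y_i - \bar{y}) = 1299$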

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{1299}{359} = 3.618$$

$$b_0 = \bar{y} - b_1 \bar{x} = 53.9 - 3.618 \times 9.1 \approx 21$$

Linear regression $\hat{y}$:

$$\hat{y} = b_0 + b_1 x = 21 + 3.618 \times 10 = 57.18$$
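The slope, intercept and prediction above can be reproduced with a short Python sketch. It assumes the tabulated totals (means 9.1 and 53.9, sums 359 and 1299) as inputs rather than recomputing them from the raw data; the function names `fit_line` and `predict` are illustrative only.

```python
def fit_line(mean_x, mean_y, sum_sq_x, sum_cross):
    """Least-squares slope and intercept from precomputed sums."""
    b1 = sum_cross / sum_sq_x      # slope: sum of cross-products / sum of squares
    b0 = mean_y - b1 * mean_x      # intercept: line passes through the means
    return b0, b1


def predict(b0, b1, x):
    """Evaluate the fitted line at x."""
    return b0 + b1 * x


# Totals taken from the worked table above (assumed, not recomputed).
b0, b1 = fit_line(mean_x=9.1, mean_y=53.9, sum_sq_x=359, sum_cross=1299)
y_hat = predict(b0, b1, 10)
print(round(b1, 3), round(b0, 2), round(y_hat, 2))
```

Because the sketch carries the coefficients at full precision, its prediction differs slightly in the second decimal place from the hand calculation, which rounds the intercept to 21 before predicting.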

Co-efficient:

$$R^2 = \left\{ \frac{\frac{1}{N} \sum (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y} \right\}^2$$

$$R = \frac{(1/10)(1299)}{6.315 \times 23.605} = 0.8714 \approx 0.9$$

CALCULATION:

RID Class Distance to New

1 No (1+0+0+1)/4=0.5
2 No (1+0+0+0)/4=0.25
3 Yes (0+0+0+1)/4=0.25
4 Yes (0+2+0+1)/4=0.75
5 Yes (0+0+1+1)/4=0.5
6 No (0+0+1+0)/4=0.25
7 Yes (0+0+1+0)/4=0.25
8 No (1+2+0+1)/4=1
9 Yes (1+0+1+1)/4=0.75
10 Yes (0+2+1+1)/4=1
11 Yes (1+2+1+0)/4=1
12 Yes (0+2+0+0)/4=0.5
13 Yes (0+0+1+1)/4=0.5
14 No (0+2+0+0)/4=0.5
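The distance column can be checked with a small Python sketch. The per-attribute differences are copied from the table above; the names `DIFFS`, `distance` and `nearest` are hypothetical, and since the source does not state a value of k, the sketch only reports the records at the smallest distance rather than taking a vote.

```python
# Per-attribute differences to the new record, copied from the table above.
DIFFS = {
    1: ("No", (1, 0, 0, 1)),   2: ("No", (1, 0, 0, 0)),
    3: ("Yes", (0, 0, 0, 1)),  4: ("Yes", (0, 2, 0, 1)),
    5: ("Yes", (0, 0, 1, 1)),  6: ("No", (0, 0, 1, 0)),
    7: ("Yes", (0, 0, 1, 0)),  8: ("No", (1, 2, 0, 1)),
    9: ("Yes", (1, 0, 1, 1)),  10: ("Yes", (0, 2, 1, 1)),
    11: ("Yes", (1, 2, 1, 0)), 12: ("Yes", (0, 2, 0, 0)),
    13: ("Yes", (0, 0, 1, 1)), 14: ("No", (0, 2, 0, 0)),
}


def distance(diffs):
    """Average of the per-attribute differences."""
    return sum(diffs) / len(diffs)


def nearest(table):
    """Return all distances and the record IDs at the minimum distance."""
    dists = {rid: distance(d) for rid, (_, d) in table.items()}
    best = min(dists.values())
    return dists, sorted(rid for rid, d in dists.items() if d == best)


dists, closest = nearest(DIFFS)
print(closest, [DIFFS[rid][0] for rid in closest])
```

A k-nearest-neighbour classifier would then take the majority class among the k closest records.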

CALCULATION:
First, determine which attribute provides the highest information gain, so that the training set can be split on that attribute. We need to calculate the expected information needed to classify the set and the entropy of each attribute. The information gain is this mutual information minus the entropy. The mutual information of the two classes is:

I(SYes,SNo) =I (9,5) = -9/14 log2(9/14)– 5/14 log2(5/14) =0.94

For Age we have three values: age<=30 (2 yes and 3 no), age 31..40 (4 yes and 0 no) and age>40 (3 yes and 2 no).

Entropy(age) = 5/14 (-2/5 log2(2/5) - 3/5 log2(3/5)) + 4/14 (0) + 5/14 (-3/5 log2(3/5) - 2/5 log2(2/5))

= 5/14 (0.9709) + 0 + 5/14 (0.9709)
= 0.6935

Gain(age) = 0.94 – 0.6935 = 0.2465

For Income we have three values: income high (2 yes and 2 no), income medium (4 yes and 2 no) and income low (3 yes and 1 no).

Entropy(income) = 4/14 (-2/4 log2(2/4) - 2/4 log2(2/4)) + 6/14 (-4/6 log2(4/6) - 2/6 log2(2/6)) + 4/14 (-3/4 log2(3/4) - 1/4 log2(1/4))

= 4/14 (1) + 6/14 (0.918) + 4/14 (0.811)

= 0.285714 + 0.393428 + 0.231714 = 0.9108

Gain(income) = 0.94 – 0.9108 = 0.0292

For Student we have two values: student yes (6 yes and 1 no) and student no (3 yes and 4 no).

Entropy(student) = 7/14 (-6/7 log2(6/7) - 1/7 log2(1/7)) + 7/14 (-3/7 log2(3/7) - 4/7 log2(4/7))

= 7/14 (0.5916) + 7/14 (0.9852)

= 0.2958 + 0.4926 = 0.7884

Gain(student) = 0.94 – 0.7884 = 0.1516

For Credit Rating we have two values: credit rating fair (6 yes and 2 no) and credit rating excellent (3 yes and 3 no).

Entropy(credit rating) = 8/14 (-6/8 log2(6/8) - 2/8 log2(2/8)) + 6/14 (-3/6 log2(3/6) - 3/6 log2(3/6))

= 8/14 (0.8112) + 6/14 (1)

= 0.4635 + 0.4285 = 0.8920

Gain(credit rating) = 0.94 – 0.8920 = 0.048
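The four gains above can be cross-checked with a minimal Python sketch. The class counts are copied from the text; the function names `info` and `gain` and the `SPLITS` table are illustrative assumptions, not part of the original exercise.

```python
from math import log2


def info(*counts):
    """Expected information (entropy, in bits) of a class distribution."""
    total = sum(counts)
    # Zero counts are skipped, since 0 * log2(0) is taken as 0.
    return -sum(c / total * log2(c / total) for c in counts if c)


def gain(parent, partitions):
    """Information gain of splitting `parent` into `partitions`."""
    total = sum(parent)
    expected = sum(sum(p) / total * info(*p) for p in partitions)
    return info(*parent) - expected


ROOT = (9, 5)  # 9 yes, 5 no
# (yes, no) counts per attribute value, copied from the text above.
SPLITS = {
    "age": [(2, 3), (4, 0), (3, 2)],
    "income": [(2, 2), (4, 2), (3, 1)],
    "student": [(6, 1), (3, 4)],
    "credit_rating": [(6, 2), (3, 3)],
}

gains = {attr: gain(ROOT, parts) for attr, parts in SPLITS.items()}
best = max(gains, key=gains.get)
for attr, g in gains.items():
    print(f"{attr}: {g:.4f}")
print("split on:", best)
```

Small differences in the last decimal place relative to the hand calculation are expected, because the hand calculation rounds intermediate entropies.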

Since Age has the highest information gain, we start splitting the dataset using the age attribute.
Since all records under the branch age 31..40 are of class Yes, we can replace that branch with a leaf labelled Class=Yes.

The same process of splitting has to happen for the two remaining branches.

For branch age<=30 we still have the attributes income, student and credit_rating. Which one should be used to split the partition?

The mutual information is I(SYes, SNo) = I(2,3) = -2/5 log2(2/5) – 3/5 log2(3/5) = 0.97.

For Income we have three values: income high (0 yes and 2 no), income medium (1 yes and 1 no) and income low (1 yes and 0 no).

Entropy(income) = 2/5 (0) + 2/5 (-1/2 log2(1/2) - 1/2 log2(1/2)) + 1/5 (0)

= 2/5 (1) = 0.4

Gain(income) = 0.97 – 0.4 = 0.57

For Student we have two values student yes(2 yes and 0 no) and student no(0 yes 3 no)

Entropy(student) = 2/5(0) + 3/5(0) = 0

Gain (student) = 0.97 – 0 = 0.97

We can then safely split on the attribute student without checking the other attributes, since the information gain is maximized.

Since these two new branches are from distinct classes, we make them into leaf nodes with their respective class as label.
Again, the same process is needed for the other branch of age.

The mutual information is I(SYes, SNo) = I(3,2) = -3/5 log2(3/5) – 2/5 log2(2/5) = 0.97.

For Income we have two values: income medium (2 yes and 1 no) and income low (1 yes and 1 no).

Entropy(income) = 3/5 (-2/3 log2(2/3) - 1/3 log2(1/3)) + 2/5 (-1/2 log2(1/2) - 1/2 log2(1/2))

= 3/5 (0.9182) + 2/5 (1) = 0.55 + 0.4 = 0.95

Gain(income) = 0.97 – 0.95 = 0.02

For Student we have two values: student yes (2 yes and 1 no) and student no (1 yes and 1 no).

Entropy(student) = 3/5 (-2/3 log2(2/3) - 1/3 log2(1/3)) + 2/5 (-1/2 log2(1/2) - 1/2 log2(1/2))

= 0.95

Gain(student) = 0.97 – 0.95 = 0.02

For Credit Rating we have two values: credit rating fair (3 yes and 0 no) and credit rating excellent (0 yes and 2 no).

Entropy(credit rating) = 0

Gain(credit rating) = 0.97 – 0 = 0.97

We then split based on credit rating. This split gives partitions whose records all belong to the same class. We just need to make these into leaf nodes with their class label attached.
CALCULATION:
This data set is to be grouped into two clusters. As a first step in finding a sensible initial partition,
let the A & B values of the two individuals furthest apart (using the Euclidean distance measure),
define the initial cluster means, giving

The remaining individuals are now examined in sequence and allocated to the cluster to which they
are closest, in terms of Euclidean distance to the cluster mean. The mean vector is recalculated
each time a new member is added. This leads to the following series of steps:
M1 = (1/2(1.0+1.5), 1/2(1.0+2.0)) = (1.25, 1.5)
M2 = (1/5(3.0+5.0+3.5+4.5+3.5), 1/5(4.0+7.0+5.0+5.0+4.5)) = (3.9, 5.1)
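The mean-vector update can be sketched in Python as below. The cluster memberships are an assumption: the first cluster uses the two points given in the text, and the second cluster's points are inferred so that five values average to the quoted means 3.9 and 5.1. The pairing of A and B values into points is likewise assumed, although the means do not depend on that pairing. The name `mean_vector` is illustrative.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


# Assumed cluster memberships (see the note above).
cluster_1 = [(1.0, 1.0), (1.5, 2.0)]
cluster_2 = [(3.0, 4.0), (5.0, 7.0), (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]

m1 = mean_vector(cluster_1)
m2 = mean_vector(cluster_2)
print(m1, m2)
```

In the sequential procedure described above, this function would be called again each time an individual is added to a cluster.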
