CEG Assessment II

The document provides information about an internal assessment test for a data science and analytics course. It includes details like course outcomes, exam date and duration, instructions, questions in different parts covering various concepts, and a marking scheme.

Uploaded by

M S Shanmukhaa


Roll No.

DEPARTMENT OF INFORMATION SCIENCE AND TECHNOLOGY, ANNA UNIVERSITY, CHENNAI

INTERNAL ASSESSMENT TEST II

VI Semester – B.TECH. in INFORMATION TECHNOLOGY

(R2019)
IT5602 – DATA SCIENCE AND ANALYTICS

Academic Session: August 2023 – December 2023

Program: B.Tech. IT Year / SEM: 2/3

Max. Marks: 50 Duration: 90 mins
Date of Exam: 06.05.2024 Faculty names: Dr. S. Sendhilkumar

CO 1 To learn the fundamentals of data science and big data.

CO 2 To gain in-depth knowledge on descriptive data analytical techniques.
CO 3 To gain knowledge to implement simple to complex analytical algorithms in big data
frameworks.
CO 4 To develop programming skills using required libraries and packages to perform data
analysis in Python.
CO 5 To understand and perform data visualization, web scraping, machine learning and
natural language processing using various Data Science tools.
BL – Bloom’s Taxonomy Levels
(L1 - Remembering, L2 - Understanding, L3 - Applying, L4 - Analyzing, L5 - Evaluating, L6 - Creating)

PART- A (7 x 2 = 14 Marks)

Q. No Questions Marks CO BL
1. Give the significance of Pearson’s coefficient in bivariate analysis. Also interpret its possible values. (Marks: 2, CO: 3, BL: L2)
2. Differentiate between bias and variance and state the tradeoff between these parameters in Machine Learning. (Marks: 2, CO: 3, BL: L2)
3. State how Hadoop is fault-tolerant. (Marks: 2, CO: 4, BL: L1)
4. What is reinforcement learning? How is it different from unsupervised learning? (Marks: 2, CO: 2, BL: L2)
5. Use the data given in Question No. 8(b) to create 3 NumPy arrays with 9 data elements each. Write a simple Python program to find the mean of every NumPy array in that list. (Marks: 2, CO: 5, BL: L4)
6. What are the functions of the job tracker and the task tracker in the Hadoop architecture? (Marks: 2, CO: 4, BL: L1)
7. What types of OLAP servers can be implemented in a warehouse framework? Describe each type briefly in a sentence or two. (Marks: 2, CO: 3, BL: L1)
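
One possible sketch for Question 5 follows. It assumes the 27 age values from Question 8(b) are split, in order, into three consecutive groups of nine; the question does not say how the values should be grouped, so that split and the variable names are illustrative only.

```python
import numpy as np

# Age values from Question 8(b), in increasing order (27 values).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22,
        22, 25, 25, 25, 25, 30, 33, 33, 35,
        35, 35, 35, 36, 40, 45, 46, 52, 70]

# Assumption: split the data in order into 3 consecutive chunks of 9 elements.
arrays = [np.array(ages[i:i + 9]) for i in range(0, len(ages), 9)]

# Mean of every NumPy array in the list.
means = [float(arr.mean()) for arr in arrays]

for index, (arr, mean) in enumerate(zip(arrays, means), start=1):
    print(f"Array {index}: {arr} -> mean = {mean:.3f}")
```

A different grouping (for example, every third value) would give different means, so the chosen split should be stated in the answer.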

PART- B (2 x 12 = 24 Marks)
Q. No Questions Marks CO BL
8(a) (Marks: 6+6, CO: 3, BL: L3)
(i) Consider the following variables X and Y:
X = [1, 2, 3, 4]
Y = [1, 4, 9, 15]
Apply polynomial regression to find the coefficients using the matrix approach, and hence write the polynomial regression equation.
(ii) Suppose you want to know whether gender is associated with political party preference. You poll 440 voters in a simple random sample to find out which political party they prefer. The results of the survey are shown in the table below:

          Republican   Democrat   Independent   Total
Male         109          59          22         200
Female       120          65          25         220
Total        240         130          50         440

To test whether gender is linked to political party preference, perform a Chi-Square test of independence. (For a 5% level of significance and dof = 2, the tabulated Chi-square value is 5.991.)
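
Both parts of 8(a) can be sketched with NumPy. The quadratic degree in part (i) is an assumption, since the question does not state one. Part (ii) derives the row and column totals from the six observed cell counts, which is an assumption about how to read the table, because the printed totals differ from those sums. All variable names are illustrative.

```python
import numpy as np

# --- 8(a)(i): polynomial regression via the normal equations ---
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 4.0, 9.0, 15.0])

degree = 2  # assumption: the question does not state the polynomial degree
# Design matrix A with columns [1, x, x^2].
design = np.vander(x, degree + 1, increasing=True)
# Solve (A^T A) b = A^T y for the coefficient vector b = [b0, b1, b2].
coeffs = np.linalg.solve(design.T @ design, design.T @ y)
print("y = {:.2f} + {:.2f} x + {:.2f} x^2".format(*coeffs))

# --- 8(a)(ii): chi-square test of independence ---
# Observed counts: rows = Male, Female; columns = Republican, Democrat, Independent.
observed = np.array([[109.0, 59.0, 22.0],
                     [120.0, 65.0, 25.0]])

# Assumption: marginal totals are recomputed from the cells themselves.
row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
expected = row_totals @ col_totals / observed.sum()

chi_square = float(((observed - expected) ** 2 / expected).sum())
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

critical_value = 5.991  # tabulated value given in the question (5%, dof = 2)
reject_independence = chi_square > critical_value
print(f"chi-square = {chi_square:.4f}, dof = {dof}, reject H0: {reject_independence}")
```

Under these assumptions the statistic is compared with the tabulated value: the null hypothesis of independence is rejected only when the computed Chi-square exceeds 5.991.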
(OR)
8(b) (Marks: 12, CO: 3, BL: L3)
Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
(i) Use smoothing by bin means to smooth the data, using a bin depth of 3.
(ii) Use min-max normalization to transform the value 35 for age onto the range [0.0, 1.0].
(iii) Use z-score normalization to transform the value 35 for age, where the standard deviation of age is 12.94 years.
(iv) Use normalization by decimal scaling to transform the value 35 for age.
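
The four transformations in 8(b) can be sketched as follows. This is a minimal illustration with hypothetical variable names; it assumes equal-depth bins taken over the sorted values and uses the standard deviation supplied in the question rather than recomputing it.

```python
import numpy as np

# Age values from the question, already sorted in increasing order.
ages = np.array([13, 15, 16, 16, 19, 20, 20, 21, 22,
                 22, 25, 25, 25, 25, 30, 33, 33, 35,
                 35, 35, 35, 36, 40, 45, 46, 52, 70], dtype=float)
value = 35.0

# (i) Smoothing by bin means with equal-depth bins of 3 values each:
# every value is replaced by the mean of its bin.
depth = 3
bins = ages.reshape(-1, depth)               # 27 values -> 9 bins of 3
smoothed = np.repeat(bins.mean(axis=1), depth)

# (ii) Min-max normalization onto [0.0, 1.0].
min_max = (value - ages.min()) / (ages.max() - ages.min())

# (iii) z-score normalization with the standard deviation given in the question.
std_given = 12.94
z_score = (value - ages.mean()) / std_given

# (iv) Decimal scaling: divide by 10^j, where j is the smallest integer
# that brings the largest absolute value below 1.
j = int(np.floor(np.log10(np.abs(ages).max()))) + 1
decimal_scaled = value / 10 ** j

print("bin means:", np.round(bins.mean(axis=1), 3))
print(f"min-max: {min_max:.4f}, z-score: {z_score:.4f}, decimal scaling: {decimal_scaled}")
```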

9(a) (Marks: 12, CO: 4, BL: L4)
(i) Consider an enterprise that deals with very large amounts of data, such as terabytes or petabytes, structured in warehouse schemas, where the data and queries arrive at high velocity. The enterprise also requires high availability of data for answering various ad-hoc queries. Suggest a suitable architecture that enables effective querying on the warehouse schemas by transforming the queries into suitable map-reduce tasks, and explain how it works with suitable diagrams.
(OR)
9(b) (Marks: 12, CO: 4, BL: L4)
(i) Consider a word-counting problem on a large set of web pages (size = 10 GB) stored in a Hadoop distributed framework. Explain, with a neat diagram and the necessary steps, how this task, if submitted as a Map-Reduce program, will be executed in HDFS.
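
The data flow asked about in 9(b) can be illustrated with a single-process simulation of the map, shuffle and reduce phases. This is only a sketch: the function names are hypothetical and it does not use the Hadoop API. In an actual cluster each input split would be an HDFS block handled by a mapper on a separate node.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(split: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    for word in split.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle and sort: group all emitted values by key, ordered by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reducer: sum the list of counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(splits: Iterable[str]) -> Dict[str, int]:
    """Run map -> shuffle -> reduce over all input splits."""
    pairs = (pair for split in splits for pair in map_phase(split))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Two small strings stand in for the input splits of the 10 GB data set.
    print(word_count(["big data big ideas", "data science and big data"]))
```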

PART- C (1 x 12 = 12 Marks)

Q. No Questions Marks CO BL
10 (i) Calculate the eigenvalues and eigenvectors for the data given in the table below. (Marks: 6, CO: 2, BL: L5)

Feature   Example 1   Example 2   Example 3   Example 4
X1           13           7           4           8
X2            5          14          11           4

Given the table below, do the following: (Marks: 6, CO: 3, BL: L5)

(ii) Estimate the conditional probabilities P(A|+), P(B|+), P(C|+), P(A|−), P(B|−), and P(C|−).
(iii) Use the estimates of the conditional probabilities from the previous question to predict the class label for a test sample (A = 0, B = 1, C = 0) using the naïve Bayes approach.
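
For 10(i), a minimal NumPy sketch is given below. It assumes the question intends the eigen-decomposition of the sample covariance matrix of X1 and X2 (as in principal component analysis), with the n − 1 denominator; the question itself does not state which matrix to decompose, and the variable names are illustrative.

```python
import numpy as np

# Rows are the features X1 and X2; columns are Examples 1 to 4.
data = np.array([[13.0, 7.0, 4.0, 8.0],
                 [5.0, 14.0, 11.0, 4.0]])

# Assumption: decompose the sample covariance matrix (np.cov treats each
# row as a variable and divides by n - 1 by default).
covariance = np.cov(data)

# eigh is intended for symmetric matrices; it returns eigenvalues in
# ascending order and the matching unit eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

print("covariance matrix:\n", covariance)
print("eigenvalues:", eigenvalues)
print("eigenvectors (columns):\n", eigenvectors)
```

The sign of each eigenvector is arbitrary, so a hand-computed answer may differ from the printed vectors by a factor of −1.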
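Mark Distribution:

                  Marks / CO                      Total   Marks / BL
Question No.      CO 1  CO 2  CO 3  CO 4  CO 5    Marks   L1   L2   L3   L4   L5   L6
1                   -     -     2     -     -       2      -    2    -    -    -    -
2                   -     -     2     -     -       2      -    2    -    -    -    -
3                   -     -     -     2     -       2      2    -    -    -    -    -
4                   -     2     -     -     -       2      -    2    -    -    -    -
5                   -     -     -     -     2       2      -    -    -    2    -    -
6                   -     -     -     2     -       2      2    -    -    -    -    -
7                   -     -     2     -     -       2      2    -    -    -    -    -
8                   -     -    12     -     -      12      -    -   12    -    -    -
9                   -     -     -    12     -      12      -    -    -   12    -    -
10                  -     6     6     -     -      12      -    -    -    -   12    -
Total               -     8    24    16     2      50     L1+L2 = 12, L3+L4 = 26, L5+L6 = 12
Distribution (%)    -    16%   48%   32%    4%    100     L1+L2 = 24%, L3+L4 = 52%, L5+L6 = 24%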

Date: 03/05/2024 Course Instructor(s) Signature

Professor In-charge Signature
