Dav Pyq
Q1
(a) What are the common tools used for data preparation phase and model planning phase of data analytics life cycle.
(b) Differentiate Linear Regression and Logistic Regression.
(c) Explain different data types in R with examples.
(d) Explain in brief steps of text analysis.
(e) What is time series analysis? Explain its components.
(f) What is Pandas? Explain features of Pandas.
Q2
(a) List and explain different phases in data analytics lifecycle.
(b) Explain Autoregressive (AR), Moving Average (MA), Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA) Models in detail.
Q3
(a) Calculate the regression equations of x on y and y on x from the following data and estimate x when y = 20. Also determine the value of the correlation coefficient.
X: 10 12 13 17 18
Y: 5 6 7 9 13
(b) Explain seven practice areas of text analytics.
Q4
(a) Explain, with justification, which analysis model is used to predict/forecast monthly average temperature in a specific region over the next year considering historical climate data.
(b) Explain following data visualization libraries in Python: Box plot, Violin plot, Pie chart, Histogram, Bar chart
Paper / Subject Code: 37471 / Data Analytics and Visualization
1T01876 - T.E. Computer Science and Engineering (Artificial Intelligence and Machine Learning) (Choice Based)
(R-19-20 'C' Scheme) SEMESTER - VI / 37471 - Data Analytics and Visualization
QP CODE: 10039481 DATE: 11/12/2023
(3 Hours) (Total Marks: 80)
N.B.: 1. Question No. 1 is compulsory.
2. Answer any three out of the remaining questions.
3. Assume suitable data if necessary.
4. Figures to the right indicate full marks.
Q1. Attempt the following (any 4): (20)
a. Why is data analytics lifecycle essential?
b. The regression lines of a sample are ... and ... .
Find (i) sample means x̄ and ȳ.
(ii) coefficient of correlation between x and y.
d. What is Pandas? State and explain key features of Pandas.
e. Explain term frequency (TF), document frequency (DF), and inverse document frequency (IDF).
a. Explain the data analytics lifecycle. (10)
Age of husband (x): 25 22 28 26 35 20 22 40 20 18
Age of wife (y): 18 15 20 17 22 14 16 21 15 14
Estimate (i) the age of husband when the age of wife is 19 and (ii) the age of wife when
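For part (i), one possible approach is a least-squares regression of the husband's age on the wife's age. The sketch below is only an illustration and not a model answer. It assumes x denotes the husband's age and y the wife's age (labels chosen here for illustration), and the helper names `mean` and `regression_x_on_y` are hypothetical.

```python
# Illustrative sketch: regression line of x (husband's age) on y (wife's age).
# Assumption: x = husband, y = wife; data taken from the table in the question.
husband = [25, 22, 28, 26, 35, 20, 22, 40, 20, 18]
wife = [18, 15, 20, 17, 22, 14, 16, 21, 15, 14]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def regression_x_on_y(x, y):
    """Return (x_bar, y_bar, b_xy) for the line x - x_bar = b_xy * (y - y_bar)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of y
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return x_bar, y_bar, s_xy / s_yy


x_bar, y_bar, b_xy = regression_x_on_y(husband, wife)

# Estimated age of husband when the age of wife is 19
husband_at_19 = x_bar + b_xy * (19 - y_bar)
print(f"x_bar={x_bar:.1f}, y_bar={y_bar:.1f}, b_xy={b_xy:.4f}")
print(f"Estimated age of husband: {husband_at_19:.2f}")
```

With these data the estimate comes out at roughly 29.6 years. The regression of y on x needed for part (ii) would use the sum of squared deviations of x in the denominator instead.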
b. What is text mining? Enlist and explain the seven practice areas of text analytics. (10)
... company on the weights of 6 shipments, the distances they were moved and the damage ...
[Table: weight (kg), distance (km), damage]
Estimate the damage when a shipment of 3700 kg is moved to a distance of 260 km.
a. From the following results, obtain two regression equations and estimate the yield when the rainfall is 29 cm and the rainfall when the yield is 600 kg. (10)
                  Mean     SD
Yield in kg       508.4    ...
Rainfall in cm    26.7     4.6
Coefficient of correlation: ...
b. What is stepwise regression? State and explain different types of stepwise regression. (10)
(20)
c. Regression plot
*******
Paper / Subject Code: 37471 / Data Analytics and Visualization
1T01876 - T.E. Computer Science and Engineering (Artificial Intelligence and Machine Learning) (Choice Based)
(R-19-20 'C' Scheme) SEMESTER - VI / 37471 - Data Analytics and Visualization
QP CODE: 10029185 DATE: 08/05/2023
Duration: 3 Hrs [Max Marks: 80]
Notes: (1) Question No. 1 is Compulsory.
(2) Attempt any THREE questions out of the remaining FIVE.
(3) All questions carry equal marks.
(4) Assume suitable data, if required, and state it clearly.
(5) Figures to the right indicate full marks.
Q1 a) What is an analytic sandbox, and why is it important? 5
b) Why use autocorrelation instead of autocovariance when examining stationary time series? 5
d) What is regression? What is simple linear regression? 5
Q2 a) Explain in detail how dirty data can be detected in the data exploration phase with visualizations. 10
b) List and explain methods that can be used for sentiment analysis. 10
Q3 a) List and explain the main phases of the Data Analytics Lifecycle.
Q4 a) Suppose everyone who visits a retail website gets one promotional offer or no ... difference. What statistical method would you recommend for this analysis? 10
Q5 a) How does the ARMA model differ from the ARIMA model? In what situation is 10
b) Explain with suitable example how the Term Frequency and Inverse Document 10
b) Box-Jenkins Methodology 5
c) Seaborn Library. 5
**************************
Paper / Subject Code: 37471 / Data Analytics and Visualization
May 15, 2024 02:30 pm - 05:30 pm 1T01876 - T.E. Computer Science and Engineering
(Artificial Intelligence and Machine Learning) (Choice Based) (R-19- ’C’ Scheme) SEMESTER - VI /
37471 - Data Analytics and Visualization QP CODE: 10055196
(3 Hours)
(Total Marks: 80)
N.B.: 1. Question No. 1 is compulsory.
2. Answer any three out of the remaining questions.
3. Assume suitable data if necessary.
Q1. Attempt any FOUR [20]
[A] List and explain the different key roles for successful data analytics.
[C] Explain Term Frequency-Inverse Document Frequency (TF-IDF)
[B] Explain ARIMA model in detail. Also state its Pros and Cons.
language.
[A] The number of bacterial cells (y) per unit volume in a culture at
x 0 1 2 3 4 5 6 7 8 9
What is Logistic Regression? What are the similarities and differences between linear regression and logistic regression?
How is Exploratory Data Analysis (EDA) performed in R?
Q5. Attempt the following [20]
Q6. Write short notes on: [20]
Pandas library
**************