Data Science
QUESTION BANK
VI SEMESTER
Regulation – 2019
Prepared by
SRM VALLIAMMAI ENGINEERING COLLEGE
(An Autonomous Institution)
SRM Nagar, Kattankulathur-603203
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
PART – A
Q. No Question BT Level Competence
1 What is the Data Science process? Explain. BTL-1 Remember
2 Differentiate Business Intelligence (BI) and Data Science. BTL-2 Understand
20 Develop a general algorithm for Data Science process. BTL-6 Create
PART – B
1 i. What is Big Data? (3) BTL-1 Remember
ii. Describe the main features of big data in detail. (10)
2 Describe the life cycle of Data Science with a neat diagram. (13) BTL-1 Remember
PART – C
1 Create a brief summary of the challenges faced in processing big data nowadays. (15) BTL-6 Create
2 Evaluate in detail a case study of big data solutions. (15) BTL-5 Evaluate
3 Explain the traditional vs. big data business approaches with their drawbacks. (15) BTL-6 Create
4 Evaluate the various formats of data and illustrate them with real-time examples. (15) BTL-5 Evaluate
UNIT II MATHEMATICAL FOUNDATIONS
Linear Algebra: Vectors, Matrices- Statistics: Describing a Single Set of Data, Correlation, Simpson’s
Paradox-Correlation and Causation- Probability: Dependence and Independence, Conditional Probability,
Bayes’s Theorem, Random Variables-Continuous Distributions- The Normal Distribution-The Central Limit
Theorem.
PART – A
Q. No Questions BT Level Competence
1 Point out the applications of vectors. BTL-4 Analyze
2 Point out the rules for the dot product of two vectors. BTL-4 Analyze
3 Compare variance and covariance. BTL-5 Evaluate
4 Develop a matrix to demonstrate a binary relationship. BTL-6 Create
5 What is statistics? What are the ways to describe a single set of data? BTL-1 Remember
6 List applications of matrices. BTL-1 Remember
7 Given a single set of data, explain the central tendencies of the data. BTL-2 Understand
8 Describe dispersion in a single set of data. BTL-2 Understand
9 Give an example of a continuous distribution. BTL-2 Understand
10 Define Bayes’s Theorem. BTL-1 Remember
11 List some applications of conditional probability. BTL-1 Remember
12 In what way can we think of probability with respect to Data Science? BTL-1 Remember
13 What is correlation? BTL-1 Remember
14 Why is the normal distribution important? BTL-2 Understand
15 Classify the different distributions of values of random variables. BTL-3 Apply
16 Illustrate the normal distribution with a diagram. BTL-3 Apply
17 Complete a routine to display a histogram for a sample of people and their respective numbers of friends. BTL-3 Apply
18 Analyze and write the importance of matrices in representing data sets. BTL-4 Analyze
19 The normal distribution is important because of the central limit theorem. Justify. BTL-5 Evaluate
20 Develop a routine to plot a Probability Density Function. BTL-6 Create
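The Part A questions on the dot product, variance, covariance, and correlation can be illustrated with a minimal sketch in plain Python. The function names and the sample data are illustrative assumptions, not a prescribed answer; the sample statistics here use the n - 1 denominator.

```python
import math

def dot(v, w):
    """Dot product: sum of the component-wise products of two vectors."""
    assert len(v) == len(w), "vectors must have the same length"
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def mean(xs):
    return sum(xs) / len(xs)

def de_mean(xs):
    """Shift the data so that its mean is zero."""
    x_bar = mean(xs)
    return [x - x_bar for x in xs]

def variance(xs):
    """Sample variance: how far one variable spreads around its mean."""
    deviations = de_mean(xs)
    return dot(deviations, deviations) / (len(xs) - 1)

def covariance(xs, ys):
    """Sample covariance: how two variables vary together around their means."""
    return dot(de_mean(xs), de_mean(ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance rescaled by both standard deviations; lies in [-1, 1]."""
    sd_x = math.sqrt(variance(xs))
    sd_y = math.sqrt(variance(ys))
    if sd_x > 0 and sd_y > 0:
        return covariance(xs, ys) / (sd_x * sd_y)
    return 0.0  # no variation in one variable, so correlation is taken as zero

# Hypothetical sample data: hours studied and marks scored.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]

print("dot product:", dot([1, 2, 3], [4, 5, 6]))
print("variance of hours:", variance(hours))
print("covariance:", covariance(hours, scores))
print("correlation:", correlation(hours, scores))
```

Variance describes a single variable, while covariance and correlation describe a pair; correlation is unitless, which makes it easier to compare across data sets.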
PART – B
1 Describe vectors and various operations on vectors with routines, examples, and diagrams. (13) BTL-1 Remember
2 Explain matrices with respect to Data Science. (6) BTL-3 Apply
Explain statistics and a single set of data. (7)
3 i. Describe correlation in detail. (7) BTL-1 Remember
ii. Explain any one application of correlation. (6)
4 Explain the normal distribution with an example. (13) BTL-3 Apply
5 i. Explain conditional probability. (8) BTL-5 Evaluate
ii. Justify the need for the normal distribution. (5)
6 i. Give a routine to display a histogram. (7) BTL-2 Understand
ii. Discuss dependence and independence. (6)
7 i. Describe the application of matrices to represent a binary relationship, with an example. (7) BTL-2 Understand
ii. Describe Bayes’s Theorem. (6)
8 i. Write a routine to plot a Probability Density Function and illustrate it with an example. (7) BTL-4 Analyze
ii. Write a routine to plot a histogram that compares the Binomial Distribution and the Normal Distribution. (6)
9 i. Describe Normal Distribution in detail. (7) BTL-1 Remember
ii. Explain any one application of Bayes’s theorem. (6)
10 Briefly describe the use of statistics in Data Science. (13) BTL-1 Remember
11 Analyze and write a routine to implement various Probability Functions with example. (13) BTL-4 Analyze
12 Develop a data set and demonstrate correlation. (13) BTL-6 Create
13 Discuss in detail variance, covariance, and correlation. (13) BTL-2 Understand
14 Illustrate various distributions of values of random variables. (13) BTL-4 Analyze
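For the Part B questions on conditional probability and Bayes’s Theorem, the following is a minimal sketch of the theorem applied to a diagnostic-test scenario. The function name, parameter names, and all probabilities are hypothetical values chosen only for illustration.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes's theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability of the evidence, over H and not-H.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical scenario: a rare disease and a fairly accurate test.
posterior = bayes_posterior(prior=0.0001, likelihood=0.99, false_positive_rate=0.01)
print(f"P(disease | positive test) = {posterior:.4f}")
```

With these assumed numbers the posterior stays below 1%, because the disease is so rare that false positives far outnumber true positives.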
PART – C
1 Develop a routine to demonstrate the Binomial Distribution and the Normal Distribution. (15) BTL-6 Create
2 Assess the routines to implement various random variable distribution functions. (15) BTL-5 Evaluate
3 Assess the difference between variance and covariance. Show a data set of values and demonstrate its correlation. (15) BTL-5 Evaluate
4 Develop your own scenarios to demonstrate the use of vectors and matrices in Data Science. (15) BTL-6 Create
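The questions on the Binomial and Normal Distributions can be approached with a sketch like the one below, which uses only the Python standard library. It compares the exact binomial probabilities with the normal density that has the same mean and standard deviation, which is the approximation the central limit theorem suggests. The function names and the choice n = 100, p = 0.5 are assumptions for illustration; plotting is left out so the routine has no external dependencies.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density function of the normal distribution."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function, written in terms of math.erf."""
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical parameters: 100 fair coin flips.
n, p = 100, 0.5
mu = n * p                          # mean of Binomial(n, p)
sigma = math.sqrt(n * p * (1 - p))  # standard deviation of Binomial(n, p)

print(" k  binomial  normal")
for k in range(40, 61, 5):
    print(f"{k:2d}  {binomial_pmf(k, n, p):.4f}    {normal_pdf(k, mu, sigma):.4f}")
```

The two columns agree closely near the mean; passing the same values to a plotting library would give the histogram-versus-curve comparison that the questions ask for.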
4 Create a chart that demonstrates overfitting. BTL 6 Create
5 How do supervised models differ from unsupervised models? BTL 4 Analyze
6 What is the reason for the word “Naïve” in Naïve Bayes classification? BTL 1 Remember
7 List the major categories of Machine Learning. BTL 1 Remember
8 What is a model with respect to Machine Learning? Give example. BTL 2 Understand
9 Define simple linear Regression. BTL 1 Remember
10 How is the hyperplane dimension found, given the dimension of the data, in Support Vector Machine classification? BTL 2 Understand
11 Simulate the idea behind nearest neighbors classification. BTL 6 Create
12 Discuss random forests. BTL 2 Understand
13 Give the formula for Conditional probability. BTL 2 Understand
14 Explain Bayes’s theorem. BTL 4 Analyze
15 How do we get random trees in Random Forest classification? BTL 3 Apply
16 List major categories of supervised learning. BTL 1 Remember
17 List out various regression models under supervised learning. BTL 1 Remember
18 Illustrate all possible decisions that can be made by the following decision tree. BTL 3 Apply
Is a Person Physically Fit?
Age < 30?
Yes: Eats a lot of pizzas? (Yes / No)
No: Exercises in the morning? (Yes / No)
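One way to enumerate every decision path in the tree of question 18 is sketched below. The nested-tuple encoding and the function name are assumptions made for illustration, and the leaf labels are placeholders because the outcome at each leaf is not stated in the question.

```python
# Each internal node is (question, yes_branch, no_branch); a leaf is a string.
# Leaf labels are placeholders: the question does not state the outcomes.
tree = (
    "Age < 30?",
    ("Eats a lot of pizzas?", "leaf 1", "leaf 2"),
    ("Exercises in the morning?", "leaf 3", "leaf 4"),
)

def decision_paths(node, path=()):
    """Return a list of (path, leaf) pairs, one per root-to-leaf route."""
    if isinstance(node, str):  # reached a leaf
        return [(path, node)]
    question, yes_branch, no_branch = node
    return (
        decision_paths(yes_branch, path + ((question, "Yes"),))
        + decision_paths(no_branch, path + ((question, "No"),))
    )

for path, leaf in decision_paths(tree):
    steps = " -> ".join(f"{question} {answer}" for question, answer in path)
    print(f"{steps} => {leaf}")
```

A tree with two levels of yes/no questions has four root-to-leaf paths, so four distinct decisions are possible.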
5 Discuss in detail the various Supervised Machine Learning techniques. (13) BTL 2 Understand
6 Construct a decision tree for the following data: BTL 5 Evaluate
Explain the various paths in the tree that lead to various decisions. (13)
4 Construct a decision tree for the following data: BTL 6 Create
Explain the various paths in the tree that lead to various decisions. (15)
PART – A
Q.No Questions BT Level Competence
1 What is SAS? BTL1 Remember
2 Define data visualization in machine learning. BTL1 Remember
3 Give the features of NumPy. BTL2 Understand
4 What is meant by Matplotlib? Give the features of Matplotlib. BTL1 Remember
5 Give the expansion for NLTK in machine learning and explain. BTL2 Understand
6 List any four data science tools. BTL1 Remember
7 Describe Apache Spark. BTL1 Remember
8 Predict the features of Scikit. BTL2 Understand
9 Compare R and Python. BTL4 Analyze
10 Distinguish Statistics and Data Science. BTL2 Understand
11 Classify the different visualization tools. BTL3 Apply
12 Develop a line chart for the following data. BTL6 Create
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]
13 Which language is best for learning data science? Illustrate why. BTL3 Apply
14 Summarize MATLAB. BTL5 Evaluate
15 Point out the components of Data Science. BTL4 Analyze
16 Compare various data science languages. BTL4 Analyze
17 Select the best tool or language for data science and give justification. BTL5 Evaluate
18 Illustrate line charts with an example. BTL3 Apply
19 Identify the tools for Data Science. BTL1 Remember
20 Develop a bar chart for the following data. BTL6 Create
movies = ["Annie Hall", "Ben-Hur", "Casablanca", "Gandhi", "West Side Story"]
num_oscars = [5, 11, 3, 8, 10]
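Questions 12 and 20 above give data for a line chart and a bar chart. The following is a minimal sketch using Matplotlib, assuming the library is installed. The function names, output file names, and axis labels are illustrative assumptions; the data is copied from the questions. Matplotlib is imported inside the helper so that the data definitions can be loaded even where the library is unavailable.

```python
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [300.2, 543.3, 1075.9, 2862.5, 5979.6, 10289.7, 14958.3]

movies = ["Annie Hall", "Ben-Hur", "Casablanca", "Gandhi", "West Side Story"]
num_oscars = [5, 11, 3, 8, 10]

def _pyplot():
    """Import Matplotlib lazily and select a backend that needs no display."""
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt
    return plt

def save_line_chart(path="gdp_line_chart.png"):
    """Line chart: years on the x-axis, gdp on the y-axis."""
    plt = _pyplot()
    fig, ax = plt.subplots()
    ax.plot(years, gdp, color="green", marker="o", linestyle="solid")
    ax.set_xlabel("year")
    ax.set_ylabel("gdp")
    ax.set_title("gdp by year")
    fig.savefig(path)
    plt.close(fig)

def save_bar_chart(path="oscars_bar_chart.png"):
    """Bar chart: one bar per movie, bar height = number of Oscars."""
    plt = _pyplot()
    fig, ax = plt.subplots()
    positions = list(range(len(movies)))
    ax.bar(positions, num_oscars)
    ax.set_xticks(positions)
    ax.set_xticklabels(movies)  # label each bar with its movie title
    ax.set_ylabel("num_oscars")
    fig.savefig(path)
    plt.close(fig)
```

Calling `save_line_chart()` and `save_bar_chart()` writes two PNG files to the current directory. A line chart suits the GDP data because the x-axis is ordered in time, while a bar chart suits the Oscar counts because the movies are unordered categories.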
PART – B
1 i. Describe NumPy in detail. (6) BTL1 Remember
ii. Write a Python program that uses NumPy and explain it. (7)
2 Describe the following. BTL1 Remember
i. NumPy. (7)
ii. Scikit. (6)
3 i. List the different types of charts. (7) BTL1 Remember
ii. Explain any one chart in detail with an example. (6)
4 Discuss various toolkits in Python in detail. (13) BTL2 Understand
5 Describe various web scraping methods in detail. (13) BTL2 Understand
6 Illustrate Matplotlib with an example. (13) BTL3 Apply
7 Explain different visualization tools in detail with an example. (13) BTL4 Analyze
8 Point out various features of Toolkits that can be used with Python. (13) BTL4 Analyze
9 Write about estimators and explain how an estimator can be fitted to data using its fit method. (13) BTL5 Evaluate
10 Write a program that loads the Iris dataset, splits it into train and test sets, and computes the accuracy score of a pipeline on the test data. (13) BTL6 Create
11 i. Write a Python program to read a file. (7) BTL3 Apply
ii. Illustrate the flow of the program. (6)
12 Describe the following. BTL1 Remember
i. MATLAB. (7)
ii. Python. (6)
13 Explain in detail about the following. BTL4 Analyze
i. Line chart. (6)
ii. Bar chart. (7)
14 Describe NLTK. Explain the steps to use it in Python. (13) BTL2 Understand
PART – C
1 Develop a line chart to visualize a data set of your choice and give a detailed explanation of the observations from the chart. (15) BTL6 Create
2 Analyze how to construct a bar chart for a data set and explain it in detail. (15) BTL4 Analyze
3 Explain the various methods of scraping the web in detail. (15) BTL5 Evaluate
4 Prepare a program to read a file and discuss its working. (15) BTL6 Create
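For the questions that ask for a program to read a file, a minimal sketch in Python is given below. The helper name and the sample file contents are hypothetical; the sample file is created in a temporary directory only so that the example is self-contained.

```python
import tempfile
from pathlib import Path

def read_lines(path):
    """Return the lines of a text file with trailing newlines removed."""
    # The with-statement closes the file automatically, even on errors.
    with open(path, "r", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]

# Create a small sample file (hypothetical contents), then read it back.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.txt"
    sample.write_text("first line\nsecond line\nthird line\n", encoding="utf-8")
    lines = read_lines(sample)

for number, line in enumerate(lines, start=1):
    print(number, line)
```

The flow is: open the file, iterate over it line by line, strip each newline, and let the `with` block close the file once the list has been built.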
16 Compare different computer vision tasks. BTL4 Analyze
17 Can Data Science be used in Stock Market Analysis? Justify. BTL5 Evaluate
18 How are weather forecasts made? BTL3 Apply
19 List three modules of R-CNN. BTL1 Remember
20 Develop sample input and output for Object Localization. BTL6 Create
PART – B
1 i. Describe how data is a crucial part of weather predictions. (6) BTL1 Remember
ii. Explain how weather data is an aid for many events. (7)
2 Describe the following. BTL1 Remember
i. Image Classification. (6)
ii. Object Localization. (7)
3 Describe the following. BTL1 Remember
i. A Twitter NLP chain. (5)
ii. NL processor and ad-hoc NL processor. (8)
4 Discuss various subprocesses involved in the complete process of data science BTL2 Understand
for weather prediction. (13)
5 Describe YOLO Model Family. (13) BTL2 Understand
6 Write in detail about R-CNN Model Family. (13) BTL3 Apply
7 Explain the schema that shows the process of the water cycle and the occurrence of precipitation. (13) BTL4 Analyze
8 Compare the following computer vision tasks and discuss each task in detail. BTL4 Analyze
i. Object Localization. (6)
ii. Object Detection. (7)
9 Summarize the predictions made by the YOLO model. (13) BTL5 Evaluate
10 Develop code to prepare the input for the LSTM model. (13) BTL6 Create
11 i. Write short notes on R-CNN. (7) BTL4 Analyze
ii. Illustrate Satellite Imagery and Sensor Data. (6)
12 Describe the following. BTL1 Remember
i. Image Classification. (6)
ii. Object Localization. (7)
13 i. Discuss in detail Satellite Imagery and Sensor Data in weather forecasting. (7) BTL3 Apply
ii. Explain the stock market with a suitable example. (6)
14 Describe various computer vision tasks in object recognition. (13) BTL2 Understand
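Question 10 of this part asks for code that prepares the input for an LSTM model. The question does not specify the model or framework, so the sketch below only shows the common sliding-window step in plain Python: a univariate series is cut into overlapping windows shaped (samples, timesteps, features), which is the layout expected by, for example, Keras LSTM layers. The function name and the temperature readings are hypothetical.

```python
def make_windows(series, timesteps):
    """Split a univariate series into (samples, targets) for sequence models.

    Each sample is a window of `timesteps` consecutive values, stored as
    [timesteps][1] so the overall shape is (samples, timesteps, features=1).
    Each target is the value that immediately follows its window.
    """
    samples, targets = [], []
    for start in range(len(series) - timesteps):
        window = series[start:start + timesteps]
        samples.append([[value] for value in window])  # one feature per step
        targets.append(series[start + timesteps])
    return samples, targets

# Hypothetical daily temperature readings.
temperatures = [20.1, 21.3, 19.8, 22.4, 23.0, 21.7]
X, y = make_windows(temperatures, timesteps=3)

print("number of samples:", len(X))
print("first window:", X[0], "-> target:", y[0])
```

In practice the nested lists would usually be converted to a numeric array and scaled before training; those steps depend on the chosen framework and are omitted here.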
PART – C
1 Develop a case study of Sentiment Analysis in Twitter. (15) BTL6 Create
2 Explain how condensation and coalescence are important parts of the water cycle and how data is collected from them. (15) BTL5 Evaluate