CS3352 FDS QB

The document is a question bank for the CS3352 - Foundations of Data Science course at Anna University, covering various topics across five units. It includes both Part-A and Part-B questions related to data science concepts, data description, relationships, Python libraries for data wrangling, and data visualization. Each unit contains definitions, explanations, and detailed inquiries into specific aspects of data science and its methodologies.

Uploaded by

viji

ANNA UNIVERSITY - REGULATIONS 2021

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

III SEMESTER - QUESTION BANK

CS3352 - FOUNDATIONS OF DATA SCIENCE

UNIT-1 INTRODUCTION

PART-A

1. What is data science?
2. Define structured data.
3. What is data?
4. What is unstructured data?
5. What is machine-generated data?
6. Define streaming data.
7. List the stages of the data science process.
8. What are the advantages of data repositories?
9. What is data cleaning?
10. What is outlier detection?
11. Explain exploratory data analysis.
12. Define data mining.
13. What are the three challenges to data mining regarding data mining methodology?
14. What is predictive mining?
15. What is data cleaning?
16. List the five primitives for specifying a data mining task.
17. List the stages of the data science process.
18. What is a data repository?
19. List the data cleaning tasks.
20. What is Euclidean distance?

PART-B

1. Define facets of data. Explain the categories of data in detail.
2. What is data science? Explain the data science life cycle.
3. Write an overview of the data science process and explain how to define the research goals in the data science process.
4. Give a brief explanation of data preparation.
5. What is exploratory data analysis, and what are the steps involved in it? Explain with an example.
6. Explain in detail data modeling in the data science process.
7. Explain presenting findings and building applications.
8. Explain in detail the basic statistical description of data with examples.
9. What is data mining? Explain the steps involved in data mining with a neat diagram, and also explain data mining techniques.
10. Draw a neat sketch of the architecture of a data warehousing system and explain the three-tier architecture.

UNIT-2 DESCRIBING DATA

PART-A
1. Define qualitative data.
2. What is quantitative data?
3. What is nominal data?
4. Describe ordinal data.
5. What is interval data?
6. What do you mean by an observational study?
7. What is a frequency distribution?
8. What is cumulative frequency?
9. Explain histogram.
10. What is the goal of variability?
11. How is the range calculated?
12. What is an independent variable?
13. Explain frequency polygon.
14. What is a stem-and-leaf diagram?
15. Define mean, median, and mode as averages.

PART-B

1. Explain in detail the types of data in describing data.
2. i) What are qualitative and quantitative data? Explain the difference between qualitative and quantitative data.
ii) Explain the following: range, variance, standard deviation, interquartile range.
3. Discuss in detail the types of variables with suitable examples.
4. Briefly explain describing data with tables and graphs, with neat sketches.
5. i) How are graphs drawn using quantitative data? Explain.
ii) Explain frequency distributions for quantitative data.
6. Explain describing data with averages.
7. Discuss describing variability in detail.
8. The heights of animals are 600 mm, 470 mm, 170 mm, 430 mm, and 300 mm. Find the mean, the variance, and the standard deviation.
9. Using the computation formula for the sum of squares, calculate the population standard deviation for the scores 1, 3, 7, 2, 0, 4, 7, 3.
10. Discuss in detail the normal distribution and standard (z) scores.
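The arithmetic asked for in questions 8 and 9 above can be sketched in Python as follows. This is a minimal sketch rather than a model answer: it relies only on the standard `statistics` and `math` modules, the variable names are illustrative, and because question 8 does not say whether the five heights form a population or a sample, the code assumes a population (divide by N) and computes the sample variance alongside for comparison.

```python
import math
import statistics

# Question 8: heights in millimetres, taken from the question text.
heights_mm = [600, 470, 170, 430, 300]

mean_height = statistics.mean(heights_mm)
# Assumption: the five animals are the whole population (divide by N).
pop_variance = statistics.pvariance(heights_mm)
pop_std = statistics.pstdev(heights_mm)
# If they are instead treated as a sample, divide by N - 1.
sample_variance = statistics.variance(heights_mm)

# Question 9: computation formula SS = sum(X^2) - (sum(X))^2 / N.
scores = [1, 3, 7, 2, 0, 4, 7, 3]
n = len(scores)
sum_x = sum(scores)
sum_x_squared = sum(x * x for x in scores)
sum_of_squares = sum_x_squared - sum_x ** 2 / n
score_pop_std = math.sqrt(sum_of_squares / n)

print("mean height:", mean_height)
print("population variance:", pop_variance)
print("population standard deviation:", round(pop_std, 2))
print("sample variance:", sample_variance)
print("sum of squares:", sum_of_squares)
print("population standard deviation of scores:", round(score_pop_std, 2))
```

The computation formula avoids subtracting the mean from every score, which is why it is convenient for hand calculation; the library call `statistics.pstdev(scores)` should agree with it.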

UNIT-3 DESCRIBING RELATIONSHIPS

PART-A

1. What is correlation?
2. Write the types of correlation.
3. What is the coefficient of correlation?
4. Define positive and negative correlation.
5. What is a cause-and-effect relationship?
6. Explain the advantages and disadvantages of a scatter diagram.
7. What is a regression problem?
8. Define linear and nonlinear regression.
9. What are the assumptions of regression?
10. What is regression analysis used for?
11. What are the types of regression?
12. What do you mean by the least squares method?
13. What is correlation analysis?
14. Write the characteristics of R-squared.
15. What are multiple regression equations?

PART-B
1. Explain correlation in detail, and also explain the types of correlation.
2. Discuss scatter plots in detail: define their construction and explain scatter plot relationships.
3. Briefly explain the correlation coefficient for quantitative data.
4. i) Calculate the coefficient of correlation from the following data.
X 12 9 8 10 11 13 7
Y 14 8 6 9 11 12 3
ii) What is linear regression? List its advantages and disadvantages.
5. Define computational formula, and write the computational formula for the correlation coefficient with an example of the computation sequence.
6. i) Compute Pearson's coefficient of correlation between maintenance cost and sales as per the data given below.
Maintenance cost 39 65 62 90 75 78 82 98 25 36
Sales 58 60 91 84 51 62 53 47 86 68
ii) What is correlation? Explain the coefficient and properties of correlation.
7. Explain regression and regression lines, and briefly discuss the types of regression lines.
8. Explain the least squares regression line.
9. Discuss in detail the interpretation of r² along with its characteristics.
10. Define regression equation and explain multiple regression equations.
11. Define linear and nonlinear regression using figures. Calculate the value of Y for X = 100 based on the linear regression prediction method.

X Y
4 390
9 580
10 650
14 730
4 410
7 530
12 600
22 790
1 350
3 400
8 590
11 640
5 450
6 520
10 690
11 690
16 770
13 700
13 730
10 640
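
The calculations behind questions 4 i) and 11 of Part-B can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a model answer: the helper names `pearson_r` and `least_squares_line` are illustrative, and the code uses the deviation-score (definitional) formulas rather than the computational formula, which gives the same values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def least_squares_line(xs, ys):
    """Slope and intercept of the least squares line Y' = intercept + slope * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Question 4 i): data copied from the question.
q4_x = [12, 9, 8, 10, 11, 13, 7]
q4_y = [14, 8, 6, 9, 11, 12, 3]
r_q4 = pearson_r(q4_x, q4_y)

# Question 11: the twenty (X, Y) pairs from the table above.
q11_x = [4, 9, 10, 14, 4, 7, 12, 22, 1, 3, 8, 11, 5, 6, 10, 11, 16, 13, 13, 10]
q11_y = [390, 580, 650, 730, 410, 530, 600, 790, 350, 400,
         590, 640, 450, 520, 690, 690, 770, 700, 730, 640]
slope, intercept = least_squares_line(q11_x, q11_y)
predicted_y = intercept + slope * 100

print("r for question 4:", round(r_q4, 4))
print("slope:", round(slope, 4), "intercept:", round(intercept, 4))
print("predicted Y at X = 100:", round(predicted_y, 1))
```

For the question 4 data this gives r of about 0.95, a strong positive correlation. In question 11, X = 100 lies far outside the observed range of X (1 to 22), so the predicted value is an extrapolation and should be interpreted with caution.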

UNIT-4 PYTHON LIBRARIES FOR DATA WRANGLING

PART-A

1. Define data wrangling.
2. What is Python?
3. What is NumPy?
4. What are the types of basic array manipulation?
5. How is array indexing defined?
6. What is an aggregation function?
7. What is a structured array?
8. Describe pandas.
9. How are categorical variables created and manipulated?
10. Explain hierarchical indexing.
11. What is the groupby() function?
12. What are pivot tables?
13. Write the syntax for slicing arrays.
14. Write the two main ways to carry out Boolean masking.
15. What is combining datasets?

PART-B
1. What is data wrangling? Explain the iterative steps of data wrangling.
2. Discuss the following:
I. Introduction to Python programming
II. Features of Python programming
III. Advantages and disadvantages of Python programming
3. What is NumPy? Explain the basics of NumPy arrays.
4. Define computations on arrays with NumPy ufuncs and explain the types of NumPy ufuncs with examples.
5. Discuss the following types:
I. Comparison
II. Mask
III. Boolean logic
6. Explain structured arrays with example programs and write the method of structure creation.
7. Discuss data manipulation with pandas and how to define a data frame with duplicate data.
8. Explain the various types of data manipulation with pandas.
9. Explain aggregation and grouping in detail.
10. Explain pivot tables, write the syntax of the pivot table function, and create a data frame using a pivot table.
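
Several of the Unit 4 questions (array slicing, Boolean masking, grouping, and pivot tables) can be illustrated with one short sketch. It assumes NumPy and pandas are installed; the array reuses the scores from Unit 2, and the small data frame with its column names `unit`, `part`, and `questions` is invented purely for illustration.

```python
import numpy as np
import pandas as pd

scores = np.array([1, 3, 7, 2, 0, 4, 7, 3])

# Slicing syntax is array[start:stop:step]; omitted parts take defaults.
first_three = scores[:3]
every_other = scores[::2]

# Boolean masking: a comparison produces a Boolean array,
# which can then be used as an index to select matching elements.
mask = scores > 3
high_scores = scores[mask]

# Illustrative data frame (hypothetical values, not from the syllabus).
df = pd.DataFrame({
    "unit": ["U1", "U1", "U2", "U2", "U3", "U3"],
    "part": ["A", "B", "A", "B", "A", "B"],
    "questions": [20, 10, 15, 10, 15, 11],
})

# Aggregation and grouping: split by unit, then sum each group.
per_unit = df.groupby("unit")["questions"].sum()

# Pivot table: units as rows, parts as columns, summed question counts.
table = df.pivot_table(values="questions", index="unit",
                       columns="part", aggfunc="sum")

print(first_three, every_other, high_scores)
print(per_unit)
print(table)
```

The pivot table holds the same totals as a two-key `groupby`, arranged as a two-dimensional grid, which is usually easier to read when there are exactly two grouping keys.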

UNIT-5 DATA VISUALIZATION

PART-A
1. What is data visualization?
2. Which concept is used in data visualization?
3. List the benefits of data visualization.
4. Why is big data visualization important?
5. How is work saved to disk?
6. Explain Matplotlib.
7. Write the four important parameters used with annotate().
8. What is a contour plot?
9. Explain legends.
10. What are subplots?
11. What is the use of ticks?
12. Briefly describe Basemap.
13. What is Seaborn?
14. Write examples of objects in Basemap.
15. Write the difference between Matplotlib and Seaborn.

PART-B

1. Describe in detail how to import Matplotlib.
2. Discuss scatter plots and write a program to create a simple scatter plot in Python by passing x and y values to plt.scatter().
3. What is a scatter plot? How is a scatter plot created using the plt.scatter() and plt.plot() methods? Explain with an example.
4. Briefly explain visualizing errors with an example for plotting a single point.
5. Explain density and contour plots in detail.
6. Define legend. Briefly discuss legends with an example.
7. Describe text and annotation in detail, and write a program to create a plot with both major and minor tick marks.
8. Explain three-dimensional plotting. Write a program for a 3D parametric plot.
9. Discuss geographic data with Basemap in detail.
10. Explain visualization with Seaborn, and plot a scatter plot in Seaborn.
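
A simple scatter plot of the kind Part-B question 2 asks for might look like the sketch below. It assumes Matplotlib is installed, reuses the X and Y values from Unit 3 as sample data, and selects the non-interactive `Agg` backend so that it runs without a display. The figure is saved into an in-memory buffer here; the file name in the comment is hypothetical.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

# Sample data: the X and Y values from Unit 3, Part-B, question 4.
x = [12, 9, 8, 10, 11, 13, 7]
y = [14, 8, 6, 9, 11, 12, 3]

fig = plt.figure()
# plt.scatter() takes the x and y values as its first two arguments.
points = plt.scatter(x, y, marker="o", label="(X, Y) pairs")
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Simple scatter plot")
plt.legend()

# Saving the figure: plt.savefig() accepts a path or a file-like object.
# A path such as "scatter.png" (hypothetical name) would write to disk.
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close(fig)
```

In an interactive session, calling `plt.show()` instead of `plt.savefig()` would display the figure in a window.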
