
DS MCQ

This document provides a collection of multiple choice questions about data science topics including basics, linear algebra, and machine learning. It tests fundamental concepts in these areas that are important for data science interviews, exams, and other assessments. Various types of questions are included such as true/false, matching, and multiple choice single answer questions.

Uploaded by

Paridhi Gaur
Copyright
© All Rights Reserved

Data Science MCQs

This section focuses on "Basics" of Data Science. These Data Science Multiple Choice
Questions (MCQ) should be practiced to improve the skills required for various
interviews (campus interview, walk-in interview, company interview), placements,
entrance exams and other competitive examinations.
1. Data science is the process of deriving knowledge and insights from a huge and diverse set of data through?

A. organizing data
B. processing data
C. analysing data
D. All of the above
Ans : D

Explanation: Data science is the process of deriving knowledge and insights from a huge and diverse set
of data through organizing, processing and analysing the data.

2. The modern conception of data science as an independent discipline is sometimes
attributed to?

A. William S.
B. John McCarthy
C. Arthur Samuel
D. Satoshi Nakamoto
Ans : A

Explanation: The modern conception of data science as an independent discipline is sometimes attributed
to William S.

3. Which of the following languages is used in data science?

A. C
B. C++
C. R
D. Ruby
Ans : C

Explanation: R is free software for statistical computing and analysis.

4. Which of the following is false?


A. Subsetting can be used to select and exclude variables and observations
B. Raw data should be processed only one time.
C. Merging concerns combining datasets on the same observations to produce a result with
more variables
D. None Of the above
Ans : B

Explanation: Statement B is false because raw data may need to be processed more than once.

5. What is the work of Data Architect?

A. utilize large data sets to gather information that meets their company's needs
B. work with businesses to determine the best usage of the information yielded from data
C. build data solutions that are optimized for performance and design applications
D. All of the above
Ans : C

Explanation: Data Architect: Data architects build data solutions that are optimized for performance and
design applications.

6. Which of the following are correct skills for a Data Scientist?

A. Probability & Statistics


B. Machine Learning / Deep Learning
C. Data Wrangling
D. All of the above
Ans : D

Explanation: All of the above are correct skills for a Data Scientist.

7. Which of the following are correct components of data science?

A. Data Engineering
B. Advanced Computing
C. Domain expertise
D. All of the above
Ans : D

Explanation: All of these are correct components of data science.


8. Which of the following is not a part of the data science process?

A. Discovery
B. Model Planning
C. Communication Building
D. Operationalize
Ans : C

Explanation: Communication Building is not a part of the data science process.

9. Which of the following are the Data Sources in data science?

A. Structured
B. UnStructured
C. Both A and B
D. None Of the above
Ans : C

Explanation: Data sources can be structured or unstructured, for example logs, SQL, NoSQL, or text.

10. Which of the following is not an application of data science?

A. Recommendation Systems
B. Image & Speech Recognition
C. Online Price Comparison
D. Privacy Checker
Ans : D

Explanation: Privacy Checker is not an application of data science.

11. Point out the correct statement.

A. Raw data is original source of data


B. Preprocessed data is original source of data
C. Raw data is the data obtained after processing steps
D. None of the above
Ans : A

Explanation: Raw data is the original source of data, before any processing steps have been applied.


12. Which of the following is one of the key data science skills?

A. Statistics
B. Machine Learning
C. Data Visualization
D. All of the above
Ans : D

Explanation: Statistics, machine learning, and data visualization are all key data science skills. Data
visualization is the presentation of data in a pictorial or graphical format.

13. Which of the following is a key characteristic of a hacker?

A. Afraid to say they don't know the answer


B. Willing to find answers on their own
C. Not Willing to find answers on their own
D. All of the above
Ans : B

Explanation: Hacker is an expert at programming and solving problems with a computer.

14. Raw data should be processed only one time.

A. True
B. False
C. Can be true or false
D. Can not say
Ans : B

Explanation: Raw data may need to be processed more than once.

15. Which of the following is the common goal of statistical modelling?

A. Inference
B. Summarizing
C. Subsetting
D. None of the above
Ans : A

Explanation: Inference is the act or process of deriving logical conclusions from premises known or
assumed to be true.
16. Causal analysis is commonly applied to census data.

A. True
B. False
C. Can be true or false
D. Can not say
Ans : B

Explanation: Descriptive analysis is commonly applied to census data.

17. Which of the following model is usually a gold standard for data analysis?

A. Inferential
B. Descriptive
C. Causal
D. All of the above
Ans : C

Explanation: A causal model is an abstract model that describes the causal mechanisms of a system.

18. Which of the following is a revision control system?

A. Git
B. Numpy
C. Scipy
D. Slidify
Ans : A

Explanation: Git is a free and open source distributed version control system designed to handle
everything from small to very large projects with speed and efficiency.

19. Which of the following steps is performed by a data scientist after acquiring the
data?

A. Data Cleaning
B. Data Integration
C. Data Replication
D. All of the above
Ans : A

Explanation: Data cleansing, data cleaning, or data scrubbing is the process of detecting and correcting (or
removing) corrupt or inaccurate records from a record set, table, or database.
20. Which of the following focuses on the discovery of (previously) unknown
properties on the data?

A. Data mining
B. BigData
C. Data wrangling
D. Machine Learning
Ans : A

Explanation: Data mining focuses on the discovery of previously unknown properties in the data. Data
munging or data wrangling, by contrast, is loosely the process of manually converting or mapping data
from one "raw" form into another format that allows for more convenient consumption of the data with
the help of semi-automated tools.

21. Data can be categorized into ______ groups.

A. 1
B. 2
C. 3
D. 4
Ans : B

Explanation: Data can be categorized into two groups: Structured data and Unstructured data

22. Unstructured data is not organized.

A. TRUE
B. FALSE
C. Can be true or false
D. Can not say
Ans : A

Explanation: True, Unstructured data is not organized. We must organize the data for analysis purposes.

23. A column is a ________ representation of data.

A. horizontal
B. diagonal
C. vertical
D. top
Ans : C

Explanation: A column is a vertical representation of data.

24. A ________ is a structured representation of data.

A. database table
B. functions
C. data preparation
D. data frame
Ans : D

Explanation: A data frame is a structured representation of data.

25. We write ______ in front of mean to let Python know that we want to activate the
mean function from the Numpy library.

A. npm.
B. np.
C. ng.
D. ngm.
Ans : B

Explanation: We write np. in front of mean to let Python know that we want to activate the mean function
from the Numpy library.
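
The np. prefix from question 25 can be tried out with a minimal sketch like the one below. The sample numbers are made up purely for illustration, and the alias np assumes the conventional import numpy as np.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration
values = [2, 4, 6, 8]

# "np." tells Python to look up mean() inside the NumPy library
average = np.mean(values)
print(average)
```

Without the import line, the name np is not defined and the call fails with a NameError.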

Linear Algebra MCQ Questions And Answers


This section focuses on "Linear Algebra" in Data Science. These Linear Algebra
Multiple Choice Questions (MCQ) should be practiced to improve the Data Science
skills required for various interviews (campus interview, walk-in interview, company
interview), placements, entrance exams and other competitive examinations.
1. Suppose that the price of 2 balls and 1 bat is 100 units. How is this problem
represented in Linear Algebra in terms of x and y?

A. 2x + y = 100
B. 2x + 2y = 100
C. 2x + y = 200
D. x + y = 100
Ans : A

Explanation: Suppose the price of a ball is Rs 'x' and the price of a bat is Rs 'y'. Values of 'x' and 'y' can
be anything depending on the situation i.e. 'x' and 'y' are variables. Two balls and one bat then cost
2x + y = 100, which is the answer.
2. What is the first step in linear algebra?

A. Let's complicate the problem


B. Solve the problem
C. Visualise the problem
D. None Of the above
Ans : C

Explanation: Visualise the problem is the first step in linear algebra.

3. A linear equation in three variables represents a?

A. flat objects
B. line
C. Planes
D. Both A and C
Ans : C

Explanation: A linear equation in 3 variables represents the set of all points whose coordinates satisfy the
equation. Basically, a linear equation in three variables represents a plane.

4. In how many ways can a set of three planes intersect?

A. 3
B. 4
C. 5
D. 6
Ans : B

Explanation: There are 4 possible cases : No intersection at all, Planes intersect in a line, They can
intersect in a plane, All the three planes intersect at a point.

5. Which of the following is not a type of matrix?

A. Square Matrix
B. Scalar Matrix
C. Trace Matrix
D. Term Matrix
Ans : D

Explanation: Term Matrix is not a type of matrix in linear algebra.

6. Which of the following is the sum of all the diagonal elements of a square matrix?

A. Diagonal matrix
B. Trace matrix
C. Identity matrix
D. Both A and B
Ans : B

Explanation: Trace : It is the sum of all the diagonal elements of a square matrix.
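
As a quick check of this definition, the sketch below sums the diagonal of a small matrix with NumPy. The matrix entries are hypothetical and chosen only so the sum is easy to verify by hand.

```python
import numpy as np

# Hypothetical 3x3 square matrix used only for illustration
M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# np.trace adds the diagonal elements: 1 + 5 + 9
trace_value = np.trace(M)
print(trace_value)
```

Note that the result is a single number (a scalar), not a matrix.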

7. Multiplication of a matrix with a scalar constant is called?

A. Complex multiplication
B. Linear multiplication
C. Scalar multiplication
D. Constant multiplication
Ans : C

Explanation: Multiplication of a matrix with a scalar constant is called scalar multiplication.

8. Which of the following is false?

A. we have a constant scalar 'c' and a matrix 'A'. Then multiplying 'c' with 'A' gives : c[Cij] =
[c*Aij]
B. The multiplication of two matrices of orders i*j and j*k results into a matrix of order i*k.
C. Two matrices will be compatible for multiplication only if the number of columns of the first
matrix and the number of rows of the second one are same.
D. Transposition simply means interchanging the row and column index.
Ans : A

Explanation: we have a constant scalar 'c' and a matrix 'A'. Then multiplying 'c' with 'A' gives : "c[Aij] =
[c*Aij]"
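
The rules stated in question 8 can be checked numerically. The following is a minimal sketch using NumPy with made-up matrices; the names A, B, C, scaled and A_t are illustrative only.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)     # matrix of order 2*3
B = np.arange(12).reshape(3, 4)    # matrix of order 3*4

# Columns of A (3) match rows of B (3), so the product exists
# and has order 2*4
C = A @ B

# Scalar multiplication multiplies every element: c[Aij] = [c*Aij]
scaled = 2 * A

# Transposition interchanges the row and column index
A_t = A.T

print(C.shape, scaled[1, 2], A_t.shape)
```

Trying B @ A instead raises a ValueError, because B has 4 columns while A has only 2 rows.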

9. Which of the following is a correct method to solve matrix equations?

A. Row Echelon Form


B. Inverse of a Matrix
C. Both A and B
D. None Of the above
Ans : C

Explanation: The two methods to solve matrix equations : Row Echelon Form, Inverse of a Matrix
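
Both approaches can be sketched in NumPy. The system below reuses 2x + y = 100 from question 1 and adds a second equation, x + y = 70, which is an assumption made only so that the system has a unique solution. np.linalg.solve relies on an LU factorisation, which is closely related to row reduction, while the second line uses the inverse explicitly.

```python
import numpy as np

# Coefficient matrix and right-hand side for:
#   2x + y = 100
#    x + y = 70   (hypothetical second equation)
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
b = np.array([100.0, 70.0])

# Elimination-based solver
x_solve = np.linalg.solve(A, b)

# Explicit inverse: x = A^-1 b
x_inv = np.linalg.inv(A) @ b

print(x_solve, x_inv)
```

Both give x = 30 and y = 40. In practice the elimination-based solver is usually preferred over forming the inverse, since it does less work and is numerically safer.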

10. _______________ is equal to the maximum number of linearly independent row
vectors in a matrix.

A. Row matrix
B. Rank of a matrix
C. Term matrix
D. Linear matrix
Ans : B

Explanation: Rank of a matrix is equal to the maximum number of linearly independent row vectors in a
matrix.
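
A small sketch of this idea, assuming NumPy is available: in the made-up matrix below the second row is twice the first, so only two rows are linearly independent.

```python
import numpy as np

# Hypothetical matrix: row 2 = 2 * row 1, row 3 is independent
M = np.array([[1, 2, 3],
              [2, 4, 6],
              [1, 0, 1]])

rank = np.linalg.matrix_rank(M)
print(rank)
```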

11. What is true regarding Determinant of a Matrix?

A. The concept of determinant is applicable to square matrices only.


B. To find determinant, subtract diagonal elements together.
C. determinant is a vector value that can be computed from the elements of a Trace matrix
D. Both A and C
Ans : A

Explanation: The concept of a determinant is applicable to square matrices only.

12. Vectors whose direction remains unchanged even after applying linear
transformation with the matrix are called?

A. Eigenvalues
B. Eigenvectors
C. Cofactor matrix
D. Minor of a matrix
Ans : B

Explanation: Vectors whose direction remains unchanged even after applying a linear transformation with
the matrix are called Eigenvectors for that particular matrix.
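
The defining property can be verified directly: for each eigenvector v with eigenvalue k, multiplying by the matrix only rescales v, that is A v = k v. The sketch below uses a made-up 2x2 matrix and NumPy's np.linalg.eig.

```python
import numpy as np

# Hypothetical symmetric matrix used only for illustration
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of `vectors` are the eigenvectors of A
values, vectors = np.linalg.eig(A)

for i in range(len(values)):
    v = vectors[:, i]
    # A @ v points along the same line as v, scaled by values[i]
    print(values[i], np.allclose(A @ v, values[i] * v))
```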

13. The concept of Eigen values and vectors is applicable to?


A. Scalar matrix
B. Identity matrix
C. Upper triangular matrix
D. Square matrix
Ans : D

Explanation: The concept of Eigen values and vectors is applicable to square matrices only.

14. What will be output for the following code?

# Build a 3x3 matrix, filling it row by row
A <- matrix(c(30,31,40,41,50,51,60,61,70), nrow = 3, byrow = TRUE)
# Compute the eigenvalues and eigenvectors of A
e <- eigen(A)
e$values
e$vectors

A. 148.737576 5.317459 -4.055035


B. 147.737576 5.317459 -3.055035
C. 147.737576 6.317459 -3.055035
D. 146.737576 4.317459 -4.055035
Ans : B

Explanation: e$values prints 147.737576 5.317459 -3.055035. e$vectors then prints the matrix of
corresponding eigenvectors, which is not shown in the options.
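
For readers working in Python rather than R, the same eigenvalues can be cross-checked with NumPy. This is a sketch of an equivalent computation, not part of the original question; the ordering returned by np.linalg.eig is not guaranteed, so the values are sorted explicitly.

```python
import numpy as np

# Same matrix as in the R code, filled row by row
A = np.array([[30, 31, 40],
              [41, 50, 51],
              [60, 61, 70]], dtype=float)

values, vectors = np.linalg.eig(A)

# Sort from largest to smallest, matching the order R prints
values_sorted = np.sort(np.real(values))[::-1]
print(values_sorted)
```

A useful sanity check is that the eigenvalues add up to the trace of A, which is 30 + 50 + 70 = 150.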

15. Singular matrices are?

A. non-invertible
B. invertible
C. Both non-invertible and invertible
D. None Of the above
Ans : A

Explanation: Singular matrices are non-invertible.
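
A minimal sketch of what non-invertible means in practice, using a made-up matrix whose second row is twice the first. NumPy signals the failure with np.linalg.LinAlgError.

```python
import numpy as np

# Hypothetical singular matrix: row 2 = 2 * row 1
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# The rank is below the matrix size, so no inverse exists
rank = np.linalg.matrix_rank(S)

try:
    np.linalg.inv(S)
    inverted = True
except np.linalg.LinAlgError:
    # Raised when the matrix is singular
    inverted = False

print(rank, inverted)
```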


16. Singular Value Decomposition is some sort of generalisation of __________
decomposition.

A. Singular
B. Eigen vector
C. Eigen value
D. None Of the above
Ans : C

Explanation: Singular Value Decomposition is some sort of generalisation of Eigen value decomposition.
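
One way to see the connection, sketched here with a made-up 2x2 matrix: the squared singular values of A equal the eigenvalues of the symmetric matrix A^T A. Unlike eigenvalue decomposition, SVD also works for non-square matrices.

```python
import numpy as np

# Hypothetical matrix used only for illustration
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# Singular values of A (largest first)
singular_values = np.linalg.svd(A, compute_uv=False)

# Eigenvalues of the symmetric matrix A^T A (smallest first)
eigvals = np.linalg.eigvalsh(A.T @ A)

print(np.sort(singular_values ** 2), eigvals)
```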

17. What will be the output of det(A)?

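# Build a 3x3 matrix, filling it row by row
B <- matrix(c(30,31,40,41,50,51,60,61,70), nrow = 3, byrow = TRUE)
# solve(B) with a single argument returns the inverse of B
A <- solve(B)
det(A)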

A. 0.0004166667
B. -0.0004166668
C. 0.0004166668
D. -0.0004166667
Ans : D

Explanation: det(B) = -2400, and A is the inverse of B, so det(A) = 1/det(B) = -1/2400, which is
printed as -0.0004166667.
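
The same result can be reproduced in Python as a cross-check. This sketch assumes NumPy and mirrors the R code: np.linalg.inv plays the role of solve(B).

```python
import numpy as np

# Same matrix as in the R code, filled row by row
B = np.array([[30, 31, 40],
              [41, 50, 51],
              [60, 61, 70]], dtype=float)

det_B = np.linalg.det(B)                  # close to -2400
det_A = np.linalg.det(np.linalg.inv(B))   # close to -1/2400

print(det_B, det_A)
```

Because of floating-point rounding, the printed digits may differ slightly in the last places.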

18. The cofactor is always preceded by a?

A. positive (+) sign


B. negative (-) sign
C. positive (+) or negative (-) sign
D. With decimal
Ans : C

Explanation: The cofactor is always preceded by a positive (+) or negative (-) sign, depending whether
the element is in a + or - position.

19. Which of the following is a correct application of Eigenvectors?

A. computer vision
B. physics
C. machine learning
D. All of the above
Ans : D

Explanation: Eigenvectors find a lot of applications in different domains like computer vision, physics and
machine learning.

20. Which of the following is false?

A. Order of matrix : If a matrix has 3 rows and 4 columns, order of the matrix is 3*4 i.e.
row*column.
B. Row matrix : A matrix consisting only of columns.
C. Column matrix : The matrix which consists of only 1 column.
D. Row matrix : A matrix consisting of only 1 row.
Ans : B

Explanation: A row matrix consists of only 1 row, so "Row matrix : A matrix consisting only of columns"
is false.



21.
Which library in Python is commonly used for data
manipulation and analysis in Data Science?

A. Pandas

B. Matplotlib

C. Scikit-Learn

D. TensorFlow

Answer: Option A

22.
What is the term for the process of cleaning and
organizing data into a structured format suitable for
analysis?

A. Data Transformation

B. Data Visualization

C. Data Aggregation

D. Data Extraction

Answer: Option A

23.
In Data Science, what is the primary purpose of data
visualization?

A. To make data more confusing

B. To obscure patterns in data

C. To help communicate insights from data

D. To complicate data

Answer: Option C

24.
Which step in the Data Science process involves
assessing the quality of collected data?

A. Data Collection

B. Data Cleaning

C. Data Validation

D. Data Visualization

Answer: Option C

25.
Which of the following is a common technique used to
handle imbalanced datasets in classification problems?

A. Upsampling

B. Downsampling

C. Feature Engineering

D. Data Wrangling

Answer: Option A

26.
What type of data analysis focuses on understanding
the relationships and patterns within a dataset?

A. Descriptive Analysis
B. Predictive Analysis

C. Inferential Analysis

D. Diagnostic Analysis

Answer: Option A

27.
Which of the following is NOT a key role in a typical
Data Science team?

A. Data Engineer

B. Data Analyst

C. Data Scientist

D. Database Administrator

Answer: Option D

28.
What is the process of converting categorical variables
into numerical values for machine learning called?

A. Feature Extraction

B. Data Encoding

C. Data Standardization

D. Label Encoding

Answer: Option B

29.
Which of the following is NOT a common data
visualization tool or library used in Data Science?

A. Tableau

B. Seaborn

C. Power BI

D. Excel

Answer: Option D

30.
In Data Science, what is the term for the process of
reducing the dimensionality of a dataset while
preserving information?

A. Dimensionality Reduction

B. Data Cleaning

C. Feature Engineering

D. Data Transformation

Answer: Option A

41.
Which step in the Data Science process involves
visualizing and interpreting the results of data analysis?

A. Data Collection

B. Data Cleaning

C. Data Visualization

D. Model Building

Answer: Option C

42.
Which of the following is a common method for dealing
with missing data in a dataset?

A. Dropping missing values

B. Imputing missing values

C. Creating new features

D. Normalizing the data

Answer: Option B
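
Options A and B of question 42 can both be tried with pandas. The sketch below is illustrative only: the column names and values are made up, and mean imputation is just one of several possible imputation strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with one missing age and one missing city
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0],
    "city": ["Pune", "Delhi", None],
})

# Dropping: remove every row that contains a missing value
dropped = df.dropna()

# Imputing: fill the missing age with the mean of the known ages
imputed = df.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].mean())

print(len(dropped))
print(imputed["age"].tolist())
```

Dropping keeps only the fully observed rows, while imputing keeps all rows at the cost of inserting estimated values.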

43.
What is the primary role of a Data Scientist in a
business context?

A. Conducting market research

B. Developing marketing campaigns

C. Extracting data from databases

D. Leveraging data for insights


Answer: Option D

44.
What is the primary objective of exploratory data
analysis (EDA) in Data Science?

A. To make predictions

B. To find patterns

C. To summarize data

D. To transform data

Answer: Option B

45.
Which of the following is NOT typically considered a
part of the Data Science process?

A. Data Collection
B. Data Visualization

C. Data Cleaning

D. Software Development

Answer: Option D

46.
Which of the following command is used to squash the
commits?

A. rebase

B. squash

C. boot

D. all of the mentioned

Answer: Option A

47.
3V's are not sufficient to describe big data.

A. True

B. False


48.
Which of the following analysis is usually modeled by
deterministic set of equations?

A. Predictive

B. Causal

C. Mechanistic

D. All of the mentioned

Answer: Option C

49.
Which of the following data mining technique is used to
uncover patterns in data?

A. Data bagging

B. Data booting

C. Data merging

D. Data Dredging

Answer: Option D

50.
Point out the wrong statement.

A. Command is the CLI command which does a specific task

B. There is one and only flag for every command in CLI

C. Flags are the options given to command for activating particular behaviour

D. All of the mentioned


Answer: Option B

A method used to make vector of repeated values?

1. read()
2. rep()
3. data()
4. view()
To find the maximum number in the Numpy array, we use?

1. max(array)
2. array.max()
3. array(max)
4. None of the above
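
The candidate calls in this question can be compared with a short sketch. The array contents are made up for illustration; array.max() is the NumPy method, while Python's built-in max() also happens to work on a one-dimensional array.

```python
import numpy as np

# Hypothetical 1-D array used only for illustration
arr = np.array([3, 7, 1, 9, 4])

# NumPy method: works for arrays of any dimension
largest = arr.max()

# Built-in max: iterates over the elements of a 1-D array
largest_builtin = max(arr)

print(largest, largest_builtin)
```

For arrays with two or more dimensions, the built-in max() no longer returns a single number, so the method form is the safer habit.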
Correct way to import Numpy is

1. import numpy
2. import numpy as np
3. from numpy import
Deep learning algorithms are constructed with how many layers?

1. 1
2. 2
3. 3
4. 4
Converting the Numpy array to the list in python requires

1. list.array
2. array.list
3. list(array)
4. None of the above
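
A small sketch of the conversion, using made-up values. Besides list(array), NumPy arrays also provide a tolist() method, which is shown for comparison.

```python
import numpy as np

# Hypothetical array used only for illustration
arr = np.array([1, 2, 3])

# list() builds a Python list whose items are NumPy scalars
as_list = list(arr)

# tolist() converts the items to plain Python ints as well
as_plain_list = arr.tolist()

print(as_list, as_plain_list)
```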
Index values in Pandas must be?

1. Hashable
2. Unique
3. Both Unique and Hashable
4. None of the above
Which role do internet technologies and the “IoT” play in the context of industry 4.0?

1. They form the base for an environmental friendly production.


2. form among others the base for corporate communication
3. They form the base to connect everyday items.
4. All of the above
A matplotlib function used to create a line chart is

1. line
2. graph
3. bar
4. plot
Which industry branches are suitable for industry 4.0 development?

1. Industry 4.0 is in first instance an enrichment for the service industry.


2. Especially in the automotive and agricultural sector
3. Industry 4.0 can be used in all industrial contexts where processes need to be more
intelligent.
4. All of the above
Which of the following is correct skills for a Data Scientist?

1. Probability & Statistics


2. Data Engineering
3. Machine Learning / Deep Learning
4. All of the above
A python package used for 2D graphics is

1. matplotlib.plt
2. matplotlib.pyplot
3. matplotlib.numpy
4. matplotlib.pip
What will be the output of the following code?
1. 0
2. 1
3. 3
4. None of the above
A series is a one-dimensional array which is labelled and can hold any data type.

1. True
2. False
What will be the output of the following program?

1. 2
2. 3
3. 1
4. 4
A correct way to preprocess the data when performing regression or classification is

1. None of the above


2. Normalize the data → PCA → training
3. Normalize the data → PCA → normalize PCA output → training
4. PCA → normalize PCA output → training
Size attribute in Numpy array in python is used for

1. finding the direction


2. finding the number of items
3. finding the shape
4. All of the above
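
The difference between the size, shape and ndim attributes can be seen in a minimal sketch; the 2 by 3 array of zeros is an arbitrary example.

```python
import numpy as np

# Hypothetical array with 2 rows and 3 columns
arr = np.zeros((2, 3))

print(arr.size)    # total number of items: 2 * 3
print(arr.shape)   # length of each dimension
print(arr.ndim)    # number of dimensions
```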
How many types of Artificial Neural Network topologies do we have

1. 3
2. 2
3. 4
4. 1
Supervised learning differs from unsupervised clustering in that supervised learning requires

1. input attributes to be categorical.


2. output attributes to be categorical.
3. at least one output attribute.
4. at least one input attribute.
Which of the following options can be used to create a DataFrame in Pandas?
1. All
2. A scalar value
3. A python dict
4. An ndarray
CNN is mostly used for which type of data

1. Both Structured and Unstructured


2. None of the above
3. Structured Data
4. Unstructured Data
