DS MCQ
This section focuses on "Basics" of Data Science. These Data Science Multiple Choice
Questions (MCQ) should be practiced to improve the skills required for various
interviews (campus interview, walk-in interview, company interview), placements,
entrance exams and other competitive examinations.
1. Data science is the process of deriving knowledge and insights from a huge and diverse set of data through?
A. organizing data
B. processing data
C. analysing data
D. All of the above
View Answer
Ans : D
Explanation: Data science is the process of deriving knowledge and insights from a huge and diverse set
of data through organizing, processing and analysing the data.
A. William S.
B. John McCarthy
C. Arthur Samuel
D. Satoshi Nakamoto
View Answer
Ans : A
A. C
B. C++
C. R
D. Ruby
View Answer
Ans : C
A. utilize large data sets to gather information that meets their company's needs
B. work with businesses to determine the best usage of the information yielded from data
C. build data solutions that are optimized for performance and design applications
D. All of the above
View Answer
Ans : C
Explanation: Data Architect: Data architects build data solutions that are optimized for performance and
design applications.
Explanation: All of the above are correct skills for a Data Scientist.
A. Data Engineering
B. Advanced Computing
C. Domain expertise
D. All of the above
View Answer
Ans : D
A. Discovery
B. Model Planning
C. Communication Building
D. Operationalize
View Answer
Ans : C
A. Structured
B. Unstructured
C. Both A and B
D. None of the above
View Answer
Ans : C
Explanation: Both structured and unstructured data, such as logs, SQL, NoSQL, or text.
A. Recommendation Systems
B. Image & Speech Recognition
C. Online Price Comparison
D. Privacy Checker
View Answer
Ans : D
A. Statistics
B. Machine Learning
C. Data Visualization
D. All of the above
View Answer
Ans : D
A. True
B. False
C. Can be true or false
D. Can not say
View Answer
Ans : B
A. Inference
B. Summarizing
C. Subsetting
D. None of the above
View Answer
Ans : A
Explanation: Inference is the act or process of deriving logical conclusions from premises known or
assumed to be true.
16. Causal analysis is commonly applied to census data.
A. True
B. False
C. Can be true or false
D. Can not say
View Answer
Ans : B
17. Which of the following models is usually considered the gold standard for data analysis?
A. Inferential
B. Descriptive
C. Causal
D. All of the above
View Answer
Ans : C
Explanation: A causal model is an abstract model that describes the causal mechanisms of a system.
A. Git
B. Numpy
C. Scipy
D. Slidify
View Answer
Ans : A
Explanation: Git is a free and open source distributed version control system designed to handle
everything from small to very large projects with speed and efficiency.
19. Which of the following steps is performed by a data scientist after acquiring the
data?
A. Data Cleaning
B. Data Integration
C. Data Replication
D. All of the above
View Answer
Ans : A
Explanation: Data cleansing, data cleaning or data scrubbing is the process of detecting and correcting (or
removing) corrupt or inaccurate records from a record set, table, or database.
20. Which of the following focuses on the discovery of (previously) unknown
properties in the data?
A. Data mining
B. BigData
C. Data wrangling
D. Machine Learning
View Answer
Ans : A
Explanation: Data mining focuses on the discovery of (previously) unknown properties in the data. Data
munging or data wrangling, by contrast, is loosely the process of manually converting or mapping data
from one "raw" form into another format that allows for more convenient consumption of the data with
the help of semi-automated tools.
A. 1
B. 2
C. 3
D. 4
View Answer
Ans : B
Explanation: Data can be categorized into two groups: Structured data and Unstructured data
A. TRUE
B. FALSE
C. Can be true or false
D. Can not say
View Answer
Ans : A
Explanation: True, Unstructured data is not organized. We must organize the data for analysis purposes.
A. horizontal
B. diagonal
C. vertical
D. top
View Answer
Ans : C
A. database table
B. functions
C. data preparation
D. data frame
View Answer
Ans : D
25. We write ______ in front of mean to let Python know that we want to activate the
mean function from the Numpy library.
A. npm.
B. np.
C. ng.
D. ngm.
View Answer
Ans : B
Explanation: We write np. in front of mean to let Python know that we want to activate the mean function
from the Numpy library.
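A minimal sketch of the np. prefix in use, assuming NumPy is installed and imported under the conventional alias np; the scores array is made-up illustrative data.

```python
import numpy as np  # "np" is only the conventional alias for the numpy module

# Illustrative data, not taken from the quiz.
scores = np.array([2.0, 4.0, 6.0, 8.0])

# The "np." prefix tells Python to look up mean inside the NumPy module.
average = np.mean(scores)
print(average)  # 5.0
```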
A. 2x + y = 100
B. 2x + 2y = 100
C. 2x + y = 200
D. x + y = 100
View Answer
Ans : A
Explanation: Suppose the price of a bat is Rs 'x' and the price of a ball is Rs 'y'. Values of 'x' and 'y' can
be anything depending on the situation i.e. 'x' and 'y' are variables. 2x + y = 100 is the answer.
2. What is the first step in linear algebra?
A. flat objects
B. line
C. Planes
D. Both A and C
View Answer
Ans : C
Explanation: A linear equation in 3 variables represents the set of all points whose coordinates satisfy the
equation. Basically, a linear equation in three variables represents a plane.
A. 3
B. 4
C. 5
D. 6
View Answer
Ans : B
Explanation: There are 4 possible cases : No intersection at all, Planes intersect in a line, They can
intersect in a plane, All the three planes intersect at a point.
A. Square Matrix
B. Scalar Matrix
C. Trace Matrix
D. Term Matrix
View Answer
Ans : D
Explanation: Term Matrix is not a type of matrix in linear algebra.
6. What is the sum of all the diagonal elements of a square matrix called?
A. Diagonal matrix
B. Trace matrix
C. Identity matrix
D. Both A and B
View Answer
Ans : B
Explanation: Trace : It is the sum of all the diagonal elements of a square matrix.
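The definition of the trace can be checked with a short sketch, assuming NumPy is available. The matrix values are illustrative; note that the result is a single number rather than a matrix.

```python
import numpy as np

# Illustrative 3x3 square matrix.
A = np.array([[30, 31, 40],
              [41, 50, 51],
              [60, 61, 70]])

# np.trace sums the main-diagonal entries: 30 + 50 + 70.
trace_value = np.trace(A)

# The same sum written out by hand.
manual = sum(A[i, i] for i in range(A.shape[0]))

print(trace_value, manual)  # 150 150
```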
A. Complex multiplication
B. Linear multiplication
C. Scalar multiplication
D. Constant multiplication
View Answer
Ans : C
A. We have a constant scalar 'c' and a matrix 'A'. Then multiplying 'c' with 'A' gives : c[Aij] =
[c*Aij]
B. The multiplication of two matrices of orders i*j and j*k results in a matrix of order i*k.
C. Two matrices will be compatible for multiplication only if the number of columns of the first
matrix and the number of rows of the second one are same.
D. Transposition simply means interchanging the row and column index.
View Answer
Ans : A
Explanation: we have a constant scalar 'c' and a matrix 'A'. Then multiplying 'c' with 'A' gives : "c[Aij] =
[c*Aij]"
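The rule c[Aij] = [c*Aij] can be illustrated with a small sketch, assuming NumPy; the scalar and matrix values are made up for the example.

```python
import numpy as np

c = 3                      # constant scalar
A = np.array([[1, 2],
              [3, 4]])     # illustrative matrix

# Scalar multiplication scales every entry A[i, j] by c.
scaled = c * A
print(scaled)
```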
Explanation: The two methods to solve matrix equations : Row Echelon Form, Inverse of a Matrix
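As a rough sketch of solving a matrix equation A x = b, assuming NumPy: the first line below uses the inverse of the matrix, and the second uses np.linalg.solve, an elimination-based routine that stands in here for the row-reduction approach. The system itself is made up for illustration.

```python
import numpy as np

# Illustrative system: 2x + y = 3 and x + 3y = 5.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Method 1: multiply b by the inverse of A.
x_inverse = np.linalg.inv(A) @ b

# Method 2: let NumPy solve the system directly, without forming the inverse.
x_solve = np.linalg.solve(A, b)

print(x_inverse, x_solve)
```

Both approaches should agree up to floating-point rounding; in practice, solving directly is usually preferred over forming the inverse.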
A. Row matrix
B. Rank of a matrix
C. Term matrix
D. Linear matrix
View Answer
Ans : B
Explanation: Rank of a matrix is equal to the maximum number of linearly independent row vectors in a
matrix.
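A minimal sketch of the rank definition, assuming NumPy. In the made-up matrix below the second row is twice the first, so only two rows are linearly independent.

```python
import numpy as np

# Row 2 is 2 * row 1, so the rows are not all linearly independent.
M = np.array([[1, 2, 3],
              [2, 4, 6],
              [0, 1, 1]])

rank = np.linalg.matrix_rank(M)
print(rank)  # 2
```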
Explanation: The true statement about the determinant of a matrix is that the concept applies to
square matrices only.
12. Vectors whose direction remains unchanged even after applying linear
transformation with the matrix are called?
A. Eigenvalues
B. Eigenvectors
C. Cofactor matrix
D. Minor of a matrix
View Answer
Ans : B
Explanation: vectors whose direction remains unchanged even after applying linear transformation with
the matrix are called Eigenvectors for that particular matrix.
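The defining property can be checked numerically. The sketch below assumes NumPy and uses a small made-up symmetric matrix; it verifies that A v equals lambda v for each eigenpair, meaning the transformation only rescales v.

```python
import numpy as np

# Illustrative 2x2 matrix with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are eigenvectors.
values, vectors = np.linalg.eig(A)

for i in range(len(values)):
    v = vectors[:, i]
    # A @ v points along v; only its length changes, by the factor values[i].
    print(np.allclose(A @ v, values[i] * v))  # True
```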
Explanation: The concept of Eigen values and vectors is applicable to square matrices only.
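# Build a 3x3 matrix, filling it row by row.
A <- matrix(c(30, 31, 40, 41, 50, 51, 60, 61, 70), nrow = 3, byrow = TRUE)
# Compute the eigenvalues and eigenvectors of A.
e <- eigen(A)
e$values   # eigenvalues
e$vectors  # eigenvectors, one per column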
Explanation: 147.737576 5.317459 -3.055035 are the eigenvalues (e$values) printed by the code above.
A. non-invertible
B. invertible
C. Both non-invertible and invertible
D. None of the above
View Answer
Ans : A
A. Singular
B. Eigen vector
C. Eigen value
D. None of the above
View Answer
Ans : C
Explanation: Singular value decomposition generalises eigenvalue decomposition to matrices that need not be square.
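# Build a 3x3 matrix, filling it row by row.
B <- matrix(c(30, 31, 40, 41, 50, 51, 60, 61, 70), nrow = 3, byrow = TRUE)
A <- solve(B)  # inverse of B
det(A)         # determinant of the inverse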
A. 0.0004166667
B. -0.0004166668
C. 0.0004166668
D. -0.0004166667
View Answer
Ans : D
Explanation: The cofactor is always preceded by a positive (+) or negative (-) sign, depending whether
the element is in a + or - position.
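For the determinant question above, a rough Python analogue of the R snippet, assuming NumPy, shows where the value comes from: the determinant of an inverse is the reciprocal of the original determinant, here 1 / (-2400).

```python
import numpy as np

# Same values as the R matrix B, filled row by row.
B = np.array([[30.0, 31.0, 40.0],
              [41.0, 50.0, 51.0],
              [60.0, 61.0, 70.0]])

det_B = np.linalg.det(B)                   # close to -2400
det_inv = np.linalg.det(np.linalg.inv(B))  # close to 1 / det_B

print(det_B, det_inv)
```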
A. computer vision
B. physics
C. machine learning
D. All of the above
View Answer
Ans : D
Explanation: Eigenvectors find a lot of applications in different domains like computer vision, physics and
machine learning.
A. Order of matrix : If a matrix has 3 rows and 4 columns, order of the matrix is 3*4 i.e.
row*column.
B. Row matrix : A matrix consisting only of columns.
C. Column matrix : The matrix which consists of only 1 column.
D. Row matrix : A matrix consisting of only 1 row.
View Answer
Ans : B
21. Which library in Python is commonly used for data manipulation and analysis in Data Science?
A. Pandas
B. Matplotlib
C. Scikit-Learn
D. TensorFlow
Answer: Option A
22. What is the term for the process of cleaning and organizing data into a structured format
suitable for analysis?
A. Data Transformation
B. Data Visualization
C. Data Aggregation
D. Data Extraction
Answer: Option A
23. In Data Science, what is the primary purpose of data visualization?
D. To complicate data
Answer: Option C
24. Which step in the Data Science process involves assessing the quality of collected data?
A. Data Collection
B. Data Cleaning
C. Data Validation
D. Data Visualization
Answer: Option C
25. Which of the following is a common technique used to handle imbalanced datasets in
classification problems?
A. Upsampling
B. Downsampling
C. Feature Engineering
D. Data Wrangling
Answer: Option A
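Upsampling can be sketched with the standard library alone. The labels and sample names below are hypothetical; the idea is to draw minority-class rows with replacement until both classes have the same count.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical imbalanced data: 8 majority samples (label 0), 2 minority samples (label 1).
majority = [("m%d" % i, 0) for i in range(8)]
minority = [("n%d" % i, 1) for i in range(2)]

# Upsample: draw minority rows with replacement until the classes match in size.
extra = random.choices(minority, k=len(majority) - len(minority))
balanced = majority + minority + extra

counts = {label: sum(1 for _, y in balanced if y == label) for label in (0, 1)}
print(counts)  # {0: 8, 1: 8}
```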
26. What type of data analysis focuses on understanding the relationships and patterns within
a dataset?
A. Descriptive Analysis
B. Predictive Analysis
C. Inferential Analysis
D. Diagnostic Analysis
Answer: Option A
27. Which of the following is NOT a key role in a typical Data Science team?
A. Data Engineer
B. Data Analyst
C. Data Scientist
D. Database Administrator
Answer: Option D
28. What is the process of converting categorical variables into numerical values for machine
learning called?
A. Feature Extraction
B. Data Encoding
C. Data Standardization
D. Label Encoding
Answer: Option B
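Two common ways of encoding a categorical variable can be sketched in plain Python. The colour values are made up, and assigning codes by sorted category name is an arbitrary choice made for this example.

```python
# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

# Assign each distinct category an integer code (sorted order is an arbitrary choice).
categories = sorted(set(colors))
to_code = {name: i for i, name in enumerate(categories)}

# Label encoding: one integer per value.
label_encoded = [to_code[c] for c in colors]

# One-hot encoding: one 0/1 column per category.
one_hot = [[1 if to_code[c] == i else 0 for i in range(len(categories))]
           for c in colors]

print(label_encoded)  # [2, 1, 0, 1]
print(one_hot[0])     # [0, 0, 1]
```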
29. Which of the following is NOT a common data visualization tool or library used in Data
Science?
A. Tableau
B. Seaborn
C. Power BI
D. Excel
Answer: Option D
30. In Data Science, what is the term for the process of reducing the dimensionality of a
dataset while preserving information?
A. Dimensionality Reduction
B. Data Cleaning
C. Feature Engineering
D. Data Transformation
Answer: Option A
41. Which step in the Data Science process involves visualizing and interpreting the results
of data analysis?
A. Data Collection
B. Data Cleaning
C. Data Visualization
D. Model Building
Answer: Option C
42. Which of the following is a common method for dealing with missing data in a dataset?
Answer: Option B
43. What is the primary role of a Data Scientist in a business context?
Answer: Option D
44. What is the primary objective of exploratory data analysis (EDA) in Data Science?
A. To make predictions
B. To find patterns
C. To summarize data
D. To transform data
Answer: Option B
45. Which of the following is NOT typically considered a part of the Data Science process?
A. Data Collection
B. Data Visualization
C. Data Cleaning
D. Software Development
Answer: Option D
46. Which of the following commands is used to squash commits?
A. rebase
B. squash
C. boot
Answer: Option A
47. 3V's are not sufficient to describe big data.
A. True
B. False
48. Which of the following analyses is usually modeled by a deterministic set of equations?
A. Predictive
B. Causal
C. Mechanistic
Answer: Option C
49. Which of the following data mining techniques is used to uncover patterns in data?
A. Data bagging
B. Data booting
C. Data merging
D. Data Dredging
Answer: Option D
50. Point out the wrong statement.
C. Flags are options given to a command to activate a particular behaviour
Answer: Option B
1. read()
2. rep()
3. data()
4. view()
To find the maximum number in a NumPy array, we use?
1. max(array)
2. array.max()
3. array(max)
4. None of the above
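A small sketch of finding the maximum, assuming NumPy and a made-up one-dimensional array. The array method and the module-level function give the same result here.

```python
import numpy as np

array = np.array([3, 9, 4])  # illustrative data

largest = array.max()        # method on the array object
also_largest = np.max(array) # equivalent module-level function

print(largest, also_largest)  # 9 9
```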
Correct way to import Numpy is
1. import numpy
2. import numpy as np
3. from numpy import
Deep learning algorithms are constructed with how many layers?
1. 1
2. 2
3. 3
4. 4
Converting a NumPy array to a list in Python requires
1. list.array
2. array.list
3. list(array)
4. None of the above
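A minimal sketch of the conversion, assuming NumPy. list(array) produces a Python list whose elements are still NumPy scalars, while the array's tolist() method also converts the elements to plain Python numbers.

```python
import numpy as np

array = np.array([1, 2, 3])  # illustrative data

as_list = list(array)      # Python list holding NumPy integer scalars
as_plain = array.tolist()  # Python list holding built-in ints

print(type(as_list).__name__, type(as_plain).__name__)  # list list
print(as_plain)  # [1, 2, 3]
```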
Index values in Pandas must be?
1. Hashable
2. Unique
3. Both Unique and Hashable
4. None of the above
Which role do internet technologies and the “IoT” play in the context of industry 4.0?
1. line
2. graph
3. bar
4. plot
Which industry branches are suitable for industry 4.0 development?
1. matplotlib.plt
2. matplotlib.pyplot
3. matplotlib.numpy
4. matplotlib.pip
What will be the output of the following code?
1. 0
2. 1
3. 3
4. None of the above
A series is a one-dimensional array which is labelled and can hold any data type.
1. True
2. False
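The statement about a Series can be illustrated with a short sketch, assuming pandas is installed. The values and index labels are made up; mixing types makes pandas fall back to the generic object dtype.

```python
import pandas as pd

# One-dimensional, labelled, and holding mixed data types.
s = pd.Series([10, "text", 3.5], index=["a", "b", "c"])

print(s["b"])   # text
print(s.dtype)  # object
```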
What will be the output of the following program?
1. 2
2. 3
3. 1
4. 4
A correct way to preprocess the data when performing regression or classification is
1. 3
2. 2
3. 4
4. 1
Supervised learning differs from unsupervised clustering in that supervised learning requires