
Select The Correct Answer

This document contains multiple choice questions about various machine learning and data science topics including linear regression, principal component analysis, naive bayes, support vector machines, and text mining. The questions cover techniques, assumptions, advantages, and applications of different algorithms. Overall, the document seems to be assessing knowledge of foundational machine learning concepts and methods.

Uploaded by

shirin

Which function in the Pandas library allows you to manipulate data and create new variables?

SELECT THE CORRECT ANSWER

pivot_table function
read_csv function
merge function
apply function
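
As a minimal sketch of creating a new variable from existing columns, the example below uses `DataFrame.apply` with `axis=1`. The column names and values are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data, made up for this illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# With axis=1, apply calls the function once per row;
# the returned values are stored as a new column.
df["total"] = df.apply(lambda row: row["price"] * row["qty"], axis=1)

print(df)
```

For simple arithmetic like this, the vectorized form `df["price"] * df["qty"]` gives the same result and is usually faster; `apply` is more useful when the per-row logic is not easily vectorized.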
Which of the following Python libraries provides advanced random number capabilities?
SELECT THE CORRECT ANSWER

NumPy
SciPy
Pandas
SymPy
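
For reference, a short sketch of random number generation with NumPy's `Generator` API. The seed and the sizes are arbitrary choices for this illustration.

```python
import numpy as np

# Seeded generator so the draws are reproducible (seed chosen arbitrarily).
rng = np.random.default_rng(seed=0)

normal_draws = rng.normal(loc=0.0, scale=1.0, size=5)   # Gaussian samples
integer_draws = rng.integers(low=1, high=7, size=5)     # integers in 1..6 (high is exclusive)
shuffled = rng.permutation(5)                           # random ordering of 0..4

print(normal_draws, integer_draws, shuffled)
```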
What is the output of the following Python code?
import numpy as np
percentiles = [98, 76.37, 55.55, 69, 88]
first_subject = np.array(percentiles)
print(first_subject.dtype)
SELECT THE CORRECT ANSWER

float64
float32
int32
float
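
A runnable Python 3 version of the snippet, offered as a sketch for readers who want to check the behaviour themselves. The second array (`integers_only`) is an addition for contrast and is not part of the question.

```python
import numpy as np

percentiles = [98, 76.37, 55.55, 69, 88]

# A list mixing Python ints and floats is promoted to a single
# floating-point dtype when converted to an array.
first_subject = np.array(percentiles)
print(first_subject.dtype)  # float64

# For contrast: a list of ints only yields an integer dtype
# (the exact width depends on the platform).
integers_only = np.array([98, 69, 88])
print(integers_only.dtype.kind)  # i
```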
Which of the following methods is used to find the best fit line for data in Linear
Regression?
SELECT THE CORRECT ANSWER

Least Square Error


Maximum Likelihood
Logarithmic Loss
Both A and B
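
A minimal sketch of fitting a line by least squares with `numpy.linalg.lstsq`. The data points are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical data, roughly following y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.1, 4.9, 7.2, 8.8])

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# lstsq returns the coefficients that minimize the sum of squared residuals.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```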
Which of the following statements is true about outliers in Linear regression?
SELECT THE CORRECT ANSWER

Linear regression is sensitive to outliers


Linear regression is not sensitive to outliers
The slope of the regression line will not change due to outliers
None of the above options
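
To see how a single extreme point can affect a least-squares line, the sketch below fits the same hypothetical data twice, once unchanged and once with the last point shifted upward by an arbitrary amount.

```python
import numpy as np

# Hypothetical points lying exactly on y = 2x + 1.
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0

slope_clean, intercept_clean = np.polyfit(x, y, 1)

# Shift only the last observation to create an outlier.
y_outlier = y.copy()
y_outlier[-1] += 50.0

slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)

print(f"slope without outlier: {slope_clean:.3f}")
print(f"slope with outlier:    {slope_outlier:.3f}")
```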

R-Squared measures: 
SELECT THE CORRECT ANSWER

The correlation between X and Y


The amount of variation in Y
The explained sum of squares as a proportion of the total sum of squares
The residual sum of squares as a proportion of the total sum of squares
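
The sums of squares named in the options can be computed directly. The sketch below does so for a simple least-squares line with an intercept, on hypothetical data, where the total sum of squares splits into an explained part and a residual part.

```python
import numpy as np

# Hypothetical data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
rss = np.sum((y - y_hat) ** 2)         # residual sum of squares

r2_from_ess = ess / tss
r2_from_rss = 1.0 - rss / tss

print(r2_from_ess, r2_from_rss)
```

For this one-predictor case, both values also equal the squared correlation between `x` and `y`.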

Which of the following options is true?


SELECT THE CORRECT ANSWER

Linear Regression error values have to be normally distributed, but in case of Logistic
Regression it is not the case
Logistic regression error values have to be normally distributed, but in case of linear
regression it is not the case
Both linear regression and logistic regression error values have to be normally distributed
Both linear regression and logistic regression error values need not be normally distributed

Which of the following options are true about PCA and LDA?
1. Both LDA and PCA are linear transformation techniques
2. LDA is supervised, whereas PCA is unsupervised
3. PCA maximizes the variance of the data, whereas LDA maximizes the separation between different classes
SELECT THE CORRECT ANSWER

1 and 2
1 and 3
Only 3
1, 2, and 3

Which of the following is an alternative technique to Principal Component Analysis?


SELECT THE CORRECT ANSWER

Factor analysis
Independent components analysis
Latent semantic analysis
All of the above options
What will happen when the eigenvalues are roughly equal?
SELECT THE CORRECT ANSWER

PCA performance is high


PCA performance is low
It does not affect PCA performance
None of the above options
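
A sketch of the roughly-equal-eigenvalues situation, using plain NumPy rather than a PCA library. The data are synthetic, with the same variance in every direction by construction, so no direction captures much more variance than any other.

```python
import numpy as np

# Synthetic isotropic data: same variance in every direction (seed is arbitrary).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))

# Eigenvalues of the sample covariance matrix, largest first.
cov = np.cov(X - X.mean(axis=0), rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Share of total variance carried by each principal direction.
explained_ratio = eigenvalues / eigenvalues.sum()

print(explained_ratio)  # each share is close to 1/3
```

With shares this even, dropping any component discards about as much variance as keeping it retains, so there is little to gain from reducing dimensions.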

Imagine you are solving a classification problem with highly imbalanced classes. The majority class is observed 99% of the time in the training data, and your model reaches 99% accuracy on the test data. Which of the following statements are true in this case?
1. Accuracy is not a good metric for imbalanced class problems.
2. Accuracy is a good metric for imbalanced class problems.
3. Precision and recall are good metrics for imbalanced class problems.
4. Precision and recall are not good metrics for imbalanced class problems.
SELECT THE CORRECT ANSWER

1 and 3
1 and 4
2 and 3
2 and 4
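
The scenario can be reproduced with a few lines of NumPy. The sketch below assumes a hypothetical test set of 100 samples with one minority-class example, and a "model" that always predicts the majority class.

```python
import numpy as np

# Hypothetical test set: 99 majority-class (0) samples and 1 minority-class (1) sample.
y_true = np.array([0] * 99 + [1] * 1)

# A trivial classifier that always predicts the majority class.
y_pred = np.zeros_like(y_true)

accuracy = np.mean(y_pred == y_true)

true_pos = np.sum((y_pred == 1) & (y_true == 1))
false_neg = np.sum((y_pred == 0) & (y_true == 1))
false_pos = np.sum((y_pred == 1) & (y_true == 0))

# Recall on the minority class: share of actual positives that were found.
recall = true_pos / (true_pos + false_neg)

# Precision is undefined here (no positive predictions), so report None.
precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else None

print(accuracy, recall, precision)
```

The classifier scores 99% accuracy while finding none of the minority-class samples, which only the recall figure reveals.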

Can decision trees be used for performing clustering?


SELECT THE CORRECT ANSWER

TRUE
FALSE

What is the minimum number of variables/features required to perform clustering?


SELECT THE CORRECT ANSWER

0
1
2
3

In time-series analysis, which source of variation can be estimated by the ratio-to-trend method?
SELECT THE CORRECT ANSWER

Cyclical
Trend
Seasonal
Irregular
Which of the following is true about averaging ensemble?
SELECT THE CORRECT ANSWER

It can only be used in classification problem


It can only be used in regression problem
It can be used in both classification as well as regression problem
None of the above options
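
A minimal sketch of averaging applied to two kinds of model output: predicted values and predicted class probabilities. All numbers are hypothetical outputs from three imaginary models.

```python
import numpy as np

# Predicted values from three models (rows) for two samples (columns).
value_predictions = np.array([
    [2.0, 3.0],
    [2.4, 2.8],
    [1.9, 3.3],
])
averaged_values = value_predictions.mean(axis=0)

# Predicted class probabilities, shape (models, samples, classes).
class_probabilities = np.array([
    [[0.6, 0.4], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.4, 0.6], [0.3, 0.7]],
])
averaged_probabilities = class_probabilities.mean(axis=0)

# Pick the class with the highest averaged probability for each sample.
averaged_labels = averaged_probabilities.argmax(axis=1)

print(averaged_values, averaged_labels)
```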
Which of the following commands replaces the value in the "third" row of column "Name" of a data frame with a NaN value?
SELECT THE CORRECT ANSWER

(df.index[3], 'Name') = np.nan


df.index[3]=='NAN'
df.loc[df.index[3], 'Name'] = np.nan
df.Name[df.index[3]] = np.nan
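
A sketch of label-based assignment with `DataFrame.loc`, on a hypothetical data frame. Note that `df.index[3]` is the label at zero-based position 3, which is the fourth row when counting from one.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame with string labels as the index.
df = pd.DataFrame(
    {"Name": ["Ann", "Bob", "Cara", "Dev", "Eli"]},
    index=["a", "b", "c", "d", "e"],
)

# df.index[3] looks up the label at position 3 ("d");
# .loc then assigns by (row label, column label) in a single step.
df.loc[df.index[3], "Name"] = np.nan

print(df)
```

Assigning through `.loc` in one step avoids chained indexing such as `df.Name[...] = ...`, which pandas discourages because it may operate on a temporary copy.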

To test for a linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?
SELECT THE CORRECT ANSWER

Barchart
Histogram
Scatter Plot
None of the above options

What is pca.components_ in Sklearn?


SELECT THE CORRECT ANSWER

Set of all eigenvectors for the projection space


Matrix of principal components
Result of the multiplication matrix
None of the above options
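
A short sketch for inspecting `pca.components_` after fitting scikit-learn's `PCA`. The data are synthetic and the choice of two components is arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data with a different spread along each of four features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.5])

pca = PCA(n_components=2).fit(X)

# One row per retained component, one column per original feature.
components = pca.components_
print(components.shape)  # (2, 4)

# The rows are unit-length and mutually orthogonal directions.
gram = components @ components.T
print(np.round(gram, 6))
```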
Advantages of SVM are:
SELECT THE CORRECT ANSWER

Effective in high dimensional spaces


Memory efficient
Versatile
All of the above options

Which of the following is true about Naive Bayes?


SELECT THE CORRECT ANSWER

Assumes that all of the features in a dataset are equally important 


Assumes that all of the features in a dataset are independent
Both A and B
None of the above options

When is Ridge regression favorable over Lasso regression?


SELECT THE CORRECT ANSWER

In presence of few variables with medium / large sized effect


When the least square estimates have higher variance
Both A and B
None of the above options

Which of the following is true for white noise? 


SELECT THE CORRECT ANSWER

Zero autocovariances
Zero autocovariances, except at lag zero
Mean is zero
Mean is constant
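
The autocovariance of a simulated white noise series can be estimated directly. The sketch below uses Gaussian noise with unit variance as an assumed example; `autocovariance` is a helper written for this illustration, not a library function.

```python
import numpy as np

# Simulated Gaussian white noise with variance 1 (seed is arbitrary).
rng = np.random.default_rng(0)
noise = rng.normal(loc=0.0, scale=1.0, size=10_000)


def autocovariance(series, lag):
    """Sample autocovariance of a 1-D series at a non-negative lag."""
    centered = series - series.mean()
    n = len(series)
    return np.sum(centered[: n - lag] * centered[lag:]) / n


# Lag 0 is the variance; higher lags should be close to zero for white noise.
acov = [autocovariance(noise, lag) for lag in range(4)]
print(np.round(acov, 3))
```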

In Ensemble learning, majority vote is used for:


SELECT THE CORRECT ANSWER

Regression
Classification
Both A and B
None of the above options
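
A minimal sketch of majority voting over predicted labels from several models. The labels are hypothetical, and `majority_vote` is a helper defined here for illustration.

```python
from collections import Counter

# Hypothetical predicted labels: one inner list per sample, one entry per model.
votes_per_sample = [
    ["cat", "dog", "cat"],
    ["dog", "dog", "cat"],
    ["cat", "cat", "cat"],
]


def majority_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


predictions = [majority_vote(votes) for votes in votes_per_sample]
print(predictions)  # ['cat', 'dog', 'cat']
```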
Which of the following techniques can be used for normalization in text mining?
SELECT THE CORRECT ANSWER

Stemming
Lemmatization
Stop Word Removal
Both A and B
Which of the following is not a text mining task?
SELECT THE CORRECT ANSWER

Text clustering
Entity relation modelling
Production of granular taxonomies
Text restructuring
