Exam All Questions
SCQs [Paper-I]
Linear Regression
1. Feature engineering is an important step in any model building exercise. It is the process of creating new features
from a given data set using the domain knowledge to leverage the predictive power of a machine learning model.
Which of the following statements are correct?
Statement 1: Feature engineering techniques are applied before train test split.
Statement 2: There is no difference between standardization and normalization.
Statement 3: Mean encoding is a feature engineering technique for handling categorical features.
a. Only 1 and 2 c. Only 2 and 3
b. Only 1 d. Only 3
2. VIF is used to detect Multicollinearity. Which of the following statements is NOT true for VIF?
a. The VIF has a lower bound of 0
b. The VIF has no upper bound
c. VIF for a variable generally changes if you drop one of the predictor variables
d. If a variable is a product of two other variables, it can have a high VIF
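For reference, a minimal sketch of computing VIFs with statsmodels; the small DataFrame here is a made-up illustration, while variance_inflation_factor is the standard helper:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # Made-up predictor matrix; x2 is nearly collinear with x1
    X = pd.DataFrame({
        "x1": [1, 2, 3, 4, 5],
        "x2": [2, 4, 6, 8, 11],
        "x3": [5, 3, 6, 2, 7],
    })
    X_const = sm.add_constant(X)  # include an intercept column

    for i in range(1, X_const.shape[1]):  # skip the intercept column
        print(X_const.columns[i], variance_inflation_factor(X_const.values, i))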
3. The distribution of error terms in a linear regression model should look like (the horizontal line represents y = 0):
a. A c. B
b. C d. D
4. For the same dependent variable Y, two models were created using the independent variables X1 and X2. The
following graphs represent the fitted lines on the scatterplots (both graphs are drawn on the same scale). Which of
the following is true about the residuals in these two models?
a. The sum of residuals in model 2 is higher than model 1
b. The sum of residuals in model 1 is higher than model 2
c. Both have the same sum of residuals
d. Nothing can be said about the sum of residuals from
the given graph
5. You built a simple linear regression model on a problem statement provided by the client. After a few days, the client
asks you to build a new model with an increased number of data points (old dataset + new data points). The number
of new data points exceeds the number of old data points by 20%.
Which of the following statements is TRUE regarding the mean of residuals?
a. Mean of residuals of old model > Mean of residuals of new model
b. Mean of residuals of old model < Mean of residuals of new model
c. Mean of residuals of old model = Mean of residuals of new model
d. Information provided is not enough to comment on the mean of residuals
6. A scatterplot was plotted for two variables – age and income to find out how the income depends on the age of a
person. It was found that as the income increases linearly with age, the variability in income also increases. This is a
violation of which of the following assumptions of linear regression?
a. Homogeneity c. Heterogeneity
b. Homoscedasticity d. Linearity
7. RFE method is used for:
a. Dummy variable creation c. Detecting multicollinearity
b. Feature selection d. Univariate regression
8. Which of the following assumptions do we make while building a simple linear regression model? (Assume X and y to
be the independent and dependent variables respectively.)
A. There is a linear relationship between X and y
B. X and y are normally distributed
C. Error terms are independent of each other
D. Error terms have constant variance
a. A, B, C and D c. A, C and D
b. A, B and C d. B, C and D
9. A client approached you with a problem statement. You decided to build a multiple linear regression model on the
dataset provided. The dataset consists of 40 features. Not all features will be significant, and selecting the relevant
features manually would be a tedious task. You can use RFE, an automated feature selection technique, to select the
relevant features. Initially, you assumed that 25 features can explain your whole data.
Which of the following commands correctly calls the RFE technique in Python? (Here, “lm” is the fitted instance of the
multiple linear regression model.)
a. from stastmodel.feature_selection import RFE
rfe=RFE(lm,25)
rfe=rfe.fit(X_train,y_train)
b. from sklearn.feature_selection import RFE
rfe=RFE(lm,25)
rfe=rfe.predict(X_train,y_train)
c. from sklearn.feature_selection import RFE
rfe=RFE(lm,25)
rfe=rfe.fit(X_train,y_train)
d. from RFE import feature_selection
rfe=RFE(lm,25)
rfe=rfe.predict(X_train,y_train)
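Note that in recent scikit-learn versions, n_features_to_select must be passed as a keyword argument, so the call would be written as in this sketch (synthetic data used as a stand-in for the client's 40-feature dataset):

    from sklearn.datasets import make_regression
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LinearRegression

    # Synthetic stand-in for the 40-feature dataset
    X_train, y_train = make_regression(n_samples=200, n_features=40, random_state=0)

    lm = LinearRegression()
    rfe = RFE(lm, n_features_to_select=25)  # keep 25 features
    rfe = rfe.fit(X_train, y_train)

    print(rfe.support_)   # boolean mask of the selected features
    print(rfe.ranking_)   # rank 1 = selected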
10. Suppose that on adding a new predictor variable to a linear regression model (model-1), the adjusted r-squared of the
new model (model-2) decreases. Choose the correct statement:
a. The r-squared of model-2 will be less than that of model 1
b. The r-squared of model-2 increases, but the complexity of model-2 also increases
c. The r-squared of model-2 decreases, but the complexity of model-2 also increases
d. Nothing can be said about the r-squared of model-2
11. Some of the independent variables (predictors) might be interrelated, due to which the presence of a particular
independent variable in the model is redundant. This phenomenon is called Multicollinearity.
Suppose that you are building a multiple linear regression model for a given problem statement, which of the
following statements is TRUE w.r.t. multicollinearity?
a. Multicollinearity is a problem when your only goal is to predict the independent variable from the set of
dependent variables
b. Multicollinearity is a problem when your goal is to infer the effect on the dependent variable due to
independent variable.
c. Multicollinearity is not a problem if a variable is not collinear with your variable of interest
d. Multicollinearity is not a problem if there are multiple dummy(binary) variables that represent a categorical
variable with three or more categories
12. The coefficient of determination between a dependent variable and an independent variable is 0.47. This denotes
that:
a. The relationship between the two variables is not strong
b. The correlation coefficient between the two variables is also 0.47
c. 47% of the variance in the independent variable is explained by the dependent variable
d. 47% of the variance in the dependent variable is explained by the independent variable
13. In linear regression, the dependent variable is:
a. Numeric c. Categorical
b. Dummy coded d. Binary
14. Consider the following two assumptions for a simple linear regression model. (Assume X and y to be the independent
and dependent variables respectively.)
Statement 1: There is a linear relationship between X and y
Statement 2: X and y are normally distributed
a. Statement 1 is correct and statement 2 is wrong
b. Statement 2 is correct and statement 1 is wrong
c. Both the statements are correct
d. Both the statements are incorrect
15. What does standardized scaling do?
a. Bring all data points in the range 0 to 1
b. Bring all data points in the range -1 to 1
c. Bring all the data points in a normal distribution with mean 0 and standard deviation 1
d. Bring all the data points in a normal distribution with mean 1 and standard deviation 0
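A small sketch contrasting standardization with min-max normalization, using the standard scikit-learn scalers on a made-up column:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, StandardScaler

    data = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

    standardized = StandardScaler().fit_transform(data)  # mean 0, std 1
    normalized = MinMaxScaler().fit_transform(data)      # range 0 to 1

    print(standardized.mean(), standardized.std())  # ~0.0 and 1.0
    print(normalized.min(), normalized.max())       # 0.0 and 1.0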
16. In linear regression, the F-statistic is used to determine:
a. The significance of the individual beta coefficient
b. The variance explanation strength of the model
c. The significance of the overall model fit
d. Both A and C
17. Suppose you regress one of the feature variables, T, on all the remaining feature variables. The R-squared of this
model was found to be 0.8. What will be the VIF for the variable T?
a. 1.56 c. 2.77
b. 3.33 d. 5.00
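The VIF of a predictor follows directly from the R-squared of this auxiliary regression, VIF = 1 / (1 - R^2), as the quick check below shows:

    # VIF from the auxiliary regression's R-squared
    r_squared = 0.8
    vif = 1 / (1 - r_squared)
    print(vif)  # 5.0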
18. Which of the following is true regarding the error terms in linear regression?
a. The sum of residuals should be zero
b. The sum of residuals should be lesser than zero
c. The sum of residuals should be greater than zero
d. There is no such restriction on what the sum of residuals should be
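A quick numerical check on arbitrary synthetic data: for ordinary least squares with an intercept, the residuals sum to (numerically) zero:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 1))
    y = 3 * X[:, 0] + rng.normal(size=100)

    model = LinearRegression().fit(X, y)
    residuals = y - model.predict(X)
    print(residuals.sum())  # ~0, up to floating-point error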
Clustering
38. In hierarchical clustering, the shortest distance and the maximum distance between points in two clusters are
defined as ………. and ………….. respectively.
a. Single linkage and complete linkage c. Complete linkage and single linkage
b. Single linkage and average linkage d. Complete linkage and average linkage
39. Which of the following statements is NOT true?
a. Each time the clusters are made during the K-means algorithm, the centroid is updated.
b. The cluster centres that are computed in the K-means algorithm are given by centroid value of the cluster
points
c. Standardization of the data is not important before applying Euclidean distance as a measure of
similarity/dissimilarity
d. The centroid of a column with data points 25, 32, 34 and 23 is 28.5.
e. The Euclidean distance between two points (10,2) and (4,5) is 7.
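The arithmetic in the last two options can be checked with NumPy:

    import numpy as np

    # Centroid (mean) of the column in option d
    print(np.mean([25, 32, 34, 23]))  # 28.5

    # Euclidean distance in option e
    p, q = np.array([10, 2]), np.array([4, 5])
    print(np.linalg.norm(p - q))  # sqrt(6^2 + 3^2) = sqrt(45) ≈ 6.71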
40. Executing the following command in Python will result in which of the following?
model_clus = KMeans(n_clusters=6, max_iter=50)
a. Run maximum 6 iterations c. Run maximum 40 iterations
b. Create 6 final clusters d. Create 50 final clusters
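A sketch of what the two arguments control, on synthetic data: n_clusters fixes the number of final clusters, while max_iter only caps the number of iterations per run:

    import numpy as np
    from sklearn.cluster import KMeans

    X = np.random.default_rng(0).normal(size=(300, 2))
    model_clus = KMeans(n_clusters=6, max_iter=50, n_init=10, random_state=0)
    model_clus.fit(X)

    print(model_clus.n_clusters)  # 6 final clusters
    print(model_clus.n_iter_)     # iterations actually used, at most 50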
41. Which of the following is not true for the Hopkins statistic?
a. The Hopkins statistic decides whether the data is suitable for clustering or not
b. The Hopkins statistic lies between -1 and 1
c. If the Hopkins statistic comes out to be 0, then the data is uniformly distributed
d. If the Hopkins statistic comes out to be 1, then the data is highly suitable for clustering
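For reference, a rough sketch of the Hopkins statistic under one common convention (values near 1 suggest a strong clustering tendency, values near 0.5 suggest random data); the helper name and sampling scheme are illustrative assumptions, not a library API:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def hopkins(X, m=None, seed=0):
        # Illustrative implementation; conventions for H vary across texts
        rng = np.random.default_rng(seed)
        n, d = X.shape
        m = m or max(1, n // 10)
        nn = NearestNeighbors(n_neighbors=2).fit(X)

        # w: distances from sampled real points to their nearest other real point
        sample = X[rng.choice(n, m, replace=False)]
        w = nn.kneighbors(sample, n_neighbors=2)[0][:, 1]  # column 0 is self

        # u: distances from uniform random points (in the bounding box) to the data
        uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
        u = nn.kneighbors(uniform, n_neighbors=1)[0][:, 0]

        return u.sum() / (u.sum() + w.sum())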
42. Consider the two statements-
Statement 1: In complete linkage, the distance between 2 clusters is the maximum distance between 2 points in the
two clusters.
Statement 2: Most of the time, complete linkage will produce unstructured dendrograms.
a. Statement 1 is correct and statement 2 is wrong
b. Statement 2 is correct and statement 1 is wrong
c. Both the statements are correct
d. Both the statements are incorrect
43. A client has approached you with a problem statement that requires the use of clustering. You decided to model the
problem statement with hierarchical clustering. Consider a dataset having ‘n’ data points.
Which of the following statements is true for the above problem statement?
a. An ‘n × n’ distance matrix should be calculated for the mentioned problem statement
b. Initially, ‘n’ clusters are formed for the mentioned problem statement
c. The output of the problem statement above is a dendrogram
d. All of the above
44. The Silhouette metric for any ith point is given by S(i) = (b(i) - a(i)) / max(a(i), b(i)).
Which of the following is not true about the Silhouette metric?
a. b(i) is the average distance from the nearest neighbour cluster (Separation)
b. a(i) is the average distance from own cluster (Cohesion).
c. If S(i) = 1 the data point is similar to its own cluster.
d. Silhouette metric ranges from 0 to +1
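A sketch of silhouette analysis with scikit-learn on synthetic blob data; silhouette_score averages S(i) over all points:

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

    # A higher average silhouette indicates better-separated clusters
    for k in range(2, 7):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        print(k, silhouette_score(X, labels))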
45. Clustering is used to identify:
a. Data distribution c. Correlation among the data points
b. Principal components d. Subgroups in the data
46. For a K-means clustering process, the Hopkins statistic for the dataset came out to be 0.8. Hence the dataset is:
a. Suitable for clustering c. Not suitable for clustering
b. Can’t say from the given information d. None of the above
47. For a K-means clustering process, the Hopkins statistic for the dataset came out to be 0.3. Hence the dataset is:
a. Suitable for clustering c. Not suitable for clustering
b. Can’t say from the given information d. None of the above
48. You observed the following dendrogram after performing K-means clustering on a dataset. Which of the following
statements can be concluded from this dendrogram?
49. Refer to the dendrogram image below and answer the question that follows:
Find the number of clusters formed if the dendrogram is cut at 0.25. (Assume the agglomerative clustering method.)
a. 6 c. 11
b. 13 d. 15
Decision Tree
50. Which of the following is the correct sampling technique that is used by a random forest model to overcome the
problem of overfitting?
a. Random sampling c. Bootstrapping
b. Oversampling d. Stratified sampling
51. Which of the following metrics measures how often a randomly chosen element would be incorrectly identified?
a. Entropy c. Information Gain
b. Gini Index d. None of these
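For reference, a minimal sketch of the Gini impurity, i.e. the probability that a randomly chosen element is labelled incorrectly when labelled at random according to the node's class distribution:

    import numpy as np

    def gini(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    print(gini(["a", "a", "b", "b"]))  # 0.5, maximally impure for 2 classes
    print(gini(["a", "a", "a", "a"]))  # 0.0, a pure node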
52. Which of the following is true for weight of evidence (WoE) analysis?
a. It helps in finding the different predictive patterns for the different segments that might be present in the data
b. WoE helps in treating missing values for both continuous and categorical variables
c. WoE values should follow an increasing or decreasing trend across bins.
d. All of the above
53. Refer to the decision tree given below and choose the statement that is correct as per this tree.
a. The tree given above will show very good performance on the train data
b. The tree given above is an underfitting tree.
c. If the petal length is more than 2.45, then it is equally likely that the flower is either setosa or virginica.
d. Both B and C
54. Suppose you train a decision tree with the following data. Which feature should we split on at the root?
X   Y   Z   | V
T   T   F   | 1
F   F   F   | 0
T   T   T   | 0
F   T   T   | 1
a. X c. Y
b. Z d. Cannot be determined
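One way to decide is to compute the information gain of each candidate root split on the table above, as in this sketch (T/F encoded as 1/0):

    import numpy as np

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    # The table above, with T/F encoded as 1/0
    features = {"X": [1, 0, 1, 0], "Y": [1, 0, 1, 1], "Z": [0, 0, 1, 1]}
    V = np.array([1, 0, 0, 1])

    for name, col in features.items():
        col = np.array(col)
        cond = sum((col == v).mean() * entropy(V[col == v]) for v in (0, 1))
        print(name, "information gain:", entropy(V) - cond)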
55. Select the correct option based on the following decision tree.