Supervised Machine Learning Regression
This document consists of 40 multiple-choice questions, with an answer key, covering supervised learning, regression tasks, linear regression, cost functions, gradient descent, and regularization techniques. It also touches on overfitting, underfitting, feature scaling, and the bias-variance tradeoff, and each question tests knowledge of fundamental principles and practices in machine learning and regression analysis.
1. Which of the following best describes supervised learning?
   a) Finding patterns in unlabeled data
   b) Using labeled data to make predictions
   c) Clustering unlabeled samples into groups
   d) Reducing dimensionality of data without labels

2. In a regression task, the target variable is:
   a) Continuous
   b) Categorical
   c) Ordinal only
   d) Binary

3. The hypothesis function for linear regression with one feature is typically expressed as:
   a) $h(x) = \theta_1 x$
   b) $h(x) = \theta_0 + \theta_1 x$
   c) $h(x) = \theta_0 + \theta_1 x + \theta_2 x^2$
   d) $h(x) = x^T \theta$ with no intercept term

4. The cost function often used in linear regression is:
   a) Mean Absolute Error (MAE)
   b) Mean Squared Error (MSE)
   c) Cross-Entropy Loss
   d) Hinge Loss

5. Minimizing the cost function in linear regression typically involves:
   a) Maximizing the number of features
   b) Reducing the training set size
   c) Adjusting model parameters to minimize errors
   d) Decreasing the number of training iterations intentionally

6. Gradient descent updates parameters by moving in the direction of:
   a) Increasing gradient of the cost function
   b) Decreasing gradient of the cost function
   c) Zero gradient at all steps
   d) Random directions to explore the parameter space

7. If the learning rate in gradient descent is too large, the algorithm may:
   a) Converge too slowly but still reach the minimum
   b) Converge exactly to the global minimum
   c) Fail to converge and instead diverge
   d) Not update parameters at all

8. Feature scaling (e.g., normalization) can speed up gradient descent by:
   a) Making the cost function vanish
   b) Ensuring all features have similar ranges
   c) Removing the need for an intercept term
   d) Guaranteeing convergence in one step

9. Which of the following would likely indicate overfitting in a regression model?
   a) High training error and low test error
   b) Low training error and low test error
   c) Low training error and high test error
   d) High training error and high test error

10. High bias in a model typically leads to:
    a) Underfitting
    b) Overfitting
    c) Perfect fitting of training data
    d) Higher variance

11. When adding polynomial features, we are:
    a) Increasing model complexity by transforming inputs
    b) Reducing model complexity by removing features
    c) Increasing the number of training examples
    d) Automatically ensuring better generalization

12. The normal equation is a closed-form solution for:
    a) Finding learning rates
    b) Performing gradient descent updates
    c) Computing the parameters $\theta$ that minimize the cost function
    d) Selecting the best model hyperparameters

13. Which matrix operation is commonly used in the normal equation approach?
    a) Matrix inversion
    b) Element-wise multiplication only
    c) Eigenvalue decomposition only
    d) Convolution

14. Regularization techniques like Ridge (L2) regression:
    a) Add a penalty proportional to the absolute values of coefficients
    b) Add a penalty proportional to the square of coefficients
    c) Remove the bias term from the model
    d) Are never used in linear models

15. Lasso (L1) regularization tends to produce models that are:
    a) More likely to have many small coefficients
    b) More likely to zero out some coefficients
    c) Identical to Ridge regression solutions
    d) Always outperforming Ridge in all scenarios

16. The purpose of a validation set is to:
    a) Estimate the model’s performance on unseen data and tune hyperparameters
    b) Train the model parameters
    c) Replace the test set
    d) Collect more training examples
17. Reducing variance in a regression model could be achieved by:
    a) Increasing the model’s complexity
    b) Using fewer training examples
    c) Implementing regularization
    d) Removing regularization terms

18. Suppose we have a dataset with features in widely different numerical ranges. Without feature scaling, gradient descent may:
    a) Converge more quickly
    b) Converge more slowly
    c) Be unaffected by feature scales
    d) Always find a global minimum instantly

19. Overfitting can often be addressed by:
    a) Using a more complex model
    b) Increasing the number of features arbitrarily
    c) Using regularization or collecting more data
    d) Decreasing regularization penalties

20. The mean squared error (MSE) between predictions $\hat{y}$ and true values $y$ is calculated as:
    a) $\frac{1}{m}\sum_{i=1}^m (\hat{y}^{(i)} - y^{(i)})$
    b) $\frac{1}{m}\sum_{i=1}^m |\hat{y}^{(i)} - y^{(i)}|$
    c) $\frac{1}{2m}\sum_{i=1}^m (\hat{y}^{(i)} - y^{(i)})^2$
    d) $\sum_{i=1}^m (\hat{y}^{(i)} - y^{(i)})^2$

21. A model that is too simple and fails to capture the underlying data pattern is likely experiencing:
    a) Low bias, low variance
    b) High bias, low variance
    c) High bias, high variance
    d) Low bias, high variance

22. A polynomial regression model that fits training data perfectly but performs poorly on test data is an example of:
    a) Underfitting
    b) Overfitting
    c) Just right model complexity
    d) No variance

23. The term “hypothesis” in linear regression typically refers to:
    a) The theoretical assumption that data is perfectly linear
    b) The function mapping inputs to predicted outputs
    c) A guess about the number of features to use
    d) A hypothesis test for statistical significance

24. Convergence in gradient descent is often checked by:
    a) Monitoring the training set size
    b) Checking if parameters exceed a certain range
    c) Observing if the cost function stops decreasing significantly
    d) Ensuring the gradient updates are random

25. To combat underfitting, one might:
    a) Increase model complexity (e.g., add features or polynomial terms)
    b) Use more aggressive regularization
    c) Reduce the size of the training data
    d) Decrease the number of iterations in gradient descent

26. If the learning rate is too small, gradient descent will:
    a) Never update parameters
    b) Converge very quickly
    c) Converge very slowly but still eventually approach the minimum
    d) Oscillate around the minimum

27. The bias-variance tradeoff refers to the balance between:
    a) Underfitting and overfitting
    b) Training size and feature size
    c) Linear and polynomial models
    d) Regularized and unregularized models

28. The parameter $\theta_0$ in linear regression is:
    a) The slope of the regression line
    b) The regularization parameter
    c) The intercept term
    d) Always equal to zero

29. Which approach would NOT typically help with high variance?
    a) Adding regularization
    b) Adding more training examples
    c) Simplifying the model
    d) Increasing the complexity of the hypothesis

30. In multiple linear regression, the model is:
    a) $h(\mathbf{x}) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$
    b) $h(\mathbf{x}) = x_1 + x_2 + \cdots + x_n$ with no parameters
    c) $h(\mathbf{x}) = \theta_0 x_0$ only
    d) $h(\mathbf{x}) = \max(\mathbf{x})$
31. When performing feature scaling by normalization, values are typically rescaled to:
    a) Mean 0, standard deviation 1
    b) Range [0, 1]
    c) Both a and b are common approaches
    d) Range [-10, 10]

32. One advantage of using the normal equation over gradient descent is:
    a) It does not require choosing a learning rate
    b) It scales better for very large feature sets
    c) It always runs faster
    d) It prevents overfitting automatically

33. The “cost function” in linear regression measures:
    a) The model’s complexity in terms of number of parameters
    b) The discrepancy between predictions and actual values
    c) The number of iterations required
    d) The computational expense of training

34. If adding new features significantly reduces training error but not test error, this typically indicates:
    a) Underfitting
    b) Proper generalization
    c) Overfitting
    d) No change in performance

35. Ridge regression’s penalty term is added to the cost function as:
    a) $\lambda \sum |\theta_j|$
    b) $\lambda \sum \theta_j^2$
    c) $\lambda \sum \log(\theta_j)$
    d) $\lambda \sum \sqrt{\theta_j}$

36. A key distinction between regression and classification tasks is that regression outputs:
    a) Discrete categories
    b) Probabilities of classes
    c) Continuous numeric values
    d) Binary (0/1) predictions only

37. Before running gradient descent, we often initialize parameters $\theta$ to:
    a) Arbitrary small random values or zeros
    b) Their closed-form solution
    c) Exactly the final solution
    d) Infinity

38. The term "regularization parameter" ($\lambda$) controls:
    a) The number of features in the dataset
    b) The relative importance of the penalty term on the parameters
    c) The step size in gradient descent
    d) The intercept term

39. Which evaluation metric best measures how close predicted values are to the actual values in a regression problem?
    a) Accuracy
    b) Mean squared error (MSE)
    c) AUC (Area Under the Curve)
    d) Gini impurity

40. When training a polynomial regression model, an appropriate approach to avoid overly large polynomial coefficients might be:
    a) Add more training data only
    b) Use a lower learning rate only
    c) Implement L2 regularization to control coefficients
    d) Randomly shuffle the features
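For readers who want to connect the questions to running code, here is a minimal sketch, in Python with NumPy (a language choice the quiz itself does not make), of batch gradient descent for univariate linear regression with the $\frac{1}{2m}$-scaled squared-error cost. It ties together the hypothesis of question 3, the update direction of question 6, the cost of question 20, the convergence check of question 24, and the zero initialization of question 37. The function and parameter names (gradient_descent_1d, learning_rate, n_iters, tol) are illustrative, not drawn from any particular course or library.

```python
import numpy as np

def gradient_descent_1d(x, y, learning_rate=0.5, n_iters=1000, tol=1e-10):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent on
    J = (1 / (2m)) * sum((h(x_i) - y_i)^2)."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0                  # start from zeros (question 37)
    prev_cost = np.inf
    for _ in range(n_iters):
        preds = theta0 + theta1 * x            # hypothesis (question 3)
        errors = preds - y
        cost = (errors ** 2).sum() / (2 * m)   # cost function (question 20)
        if prev_cost - cost < tol:             # stop once J stops decreasing (question 24)
            break
        prev_cost = cost
        # step against the gradient of J (question 6)
        theta0 -= learning_rate * errors.sum() / m
        theta1 -= learning_rate * (errors * x).sum() / m
    return theta0, theta1

# Tiny synthetic example: y is roughly 2 + 3x plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=50)
print(gradient_descent_1d(x, y))               # parameters close to (2, 3)
```

Making learning_rate much larger causes the cost to grow instead of shrink (question 7), while a very small value still converges, only slowly (question 26).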
Answer Key

1. b) Using labeled data to make predictions
2. a) Continuous
3. b) $h(x) = \theta_0 + \theta_1 x$
4. b) Mean Squared Error (MSE)
5. c) Adjusting model parameters to minimize errors
6. b) Decreasing gradient of the cost function
7. c) Fail to converge and instead diverge
8. b) Ensuring all features have similar ranges
9. c) Low training error and high test error
10. a) Underfitting (high bias means underfitting)
11. a) Increasing model complexity by transforming inputs
12. c) Computing the parameters $\theta$ that minimize the cost function
13. a) Matrix inversion
14. b) Add a penalty proportional to the square of coefficients
15. b) More likely to zero out some coefficients
16. a) Estimate the model’s performance on unseen data and tune hyperparameters
17. c) Implementing regularization
18. b) Converge more slowly
19. c) Using regularization or collecting more data
20. c) $\frac{1}{2m}\sum_{i=1}^m (\hat{y}^{(i)} - y^{(i)})^2$
21. b) High bias, low variance
22. b) Overfitting
23. b) The function mapping inputs to predicted outputs
24. c) Observing if the cost function stops decreasing significantly
25. a) Increase model complexity (e.g., add features or polynomial terms)
26. c) Converge very slowly but still eventually approach the minimum
27. a) Underfitting and overfitting (bias and variance)
28. c) The intercept term
29. d) Increasing the complexity of the hypothesis
30. a) $h(\mathbf{x}) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$
31. c) Both a and b are common approaches
32. a) It does not require choosing a learning rate
33. b) The discrepancy between predictions and actual values
34. c) Overfitting
35. b) $\lambda \sum \theta_j^2$
36. c) Continuous numeric values
37. a) Arbitrary small random values or zeros
38. b) The relative importance of the penalty term on the parameters
39. b) Mean squared error (MSE)
40. c) Implement L2 regularization to control coefficients
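As a companion sketch, again only an illustration under assumed conventions rather than a prescribed method, the snippet below standardizes features to mean 0 and standard deviation 1 (question 31) and then fits ridge (L2) regression in closed form with a regularized normal equation (questions 12-14, 32, 35, 40). The helper names standardize and ridge_normal_equation are made up for this example, and the exact scaling of the penalty $\lambda$ varies between textbooks.

```python
import numpy as np

def standardize(X):
    """Rescale every column to mean 0 and standard deviation 1 (question 31a)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def ridge_normal_equation(X, y, lam=0.1):
    """Solve (X_b^T X_b + lam * I) theta = X_b^T y, leaving the intercept unpenalized."""
    m, n = X.shape
    X_b = np.hstack([np.ones((m, 1)), X])   # prepend a column of ones for theta_0
    penalty = lam * np.eye(n + 1)
    penalty[0, 0] = 0.0                     # do not shrink the intercept (question 28)
    # solve() is the numerically safer stand-in for an explicit matrix inverse
    return np.linalg.solve(X_b.T @ X_b + penalty, X_b.T @ y)

# Toy data with two features on very different numeric ranges (question 18).
rng = np.random.default_rng(1)
X = np.column_stack([rng.uniform(0.0, 1.0, 100), rng.uniform(0.0, 1000.0, 100)])
y = 5.0 + 2.0 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.1, size=100)

X_scaled, mu, sigma = standardize(X)
theta = ridge_normal_equation(X_scaled, y, lam=0.1)
print(theta)   # intercept first, then one coefficient per (scaled) feature
```

Using np.linalg.solve instead of forming an explicit inverse applies the matrix-inversion idea behind question 13 in a numerically safer way, and zeroing the first diagonal entry of the penalty matrix keeps the intercept $\theta_0$ out of the regularization term.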