
1. State True or False:

Typically, linear regression tends to underperform compared to k-nearest neighbor algorithms when dealing with high-dimensional input spaces.

True.

In high-dimensional input spaces, linear regression can underperform compared to k-nearest neighbor (k-NN) algorithms for several reasons:

1. Curse of Dimensionality: As the number of dimensions (features) increases, the data become sparse and a single global linear fit may fail to capture the underlying patterns. k-NN, on the other hand, adapts to local patterns rather than relying on one global trend.

2. Model Assumptions: Linear regression assumes a linear relationship between the predictors
and the target variable. In high-dimensional spaces, this assumption may not hold, leading to
poor performance. k-NN does not assume a specific functional form and can adapt to
complex, non-linear relationships.

3. Overfitting: In high-dimensional spaces, linear regression may overfit the training data,
especially if the number of features is large compared to the number of observations. k-NN
can be less prone to overfitting if the value of k is chosen appropriately, as it relies on local
neighborhoods rather than fitting a global model.

Overall, while linear regression is a powerful technique for many applications, k-NN can sometimes
provide better performance in high-dimensional spaces where the relationships between variables
are complex and non-linear.
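To make point 3 above concrete, here is a minimal sketch (not part of the original question; the synthetic data, dimensions, and choice of k are assumptions made purely for illustration). It fits both models when the number of features is close to the number of training points and prints train and test MSE; the exact numbers, and which model does better, will vary with the data-generating process.

```python
# Hedged illustration: with almost as many features as training points, ordinary
# linear regression can fit the training set very closely yet generalize poorly,
# while k-NN's error comes from local averaging instead. All values here are
# assumptions for the sketch; results depend on the data and on k.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n_train, n_test, d = 60, 500, 50                  # nearly as many features as training points
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
signal = lambda X: X[:, 0]                        # assumed true signal: depends on one feature
y_train = signal(X_train) + rng.normal(size=n_train)
y_test = signal(X_test) + rng.normal(size=n_test)

for name, model in [("linear regression", LinearRegression()),
                    ("k-NN (k=5)       ", KNeighborsRegressor(n_neighbors=5))]:
    model.fit(X_train, y_train)
    print(name,
          "train MSE:", round(mean_squared_error(y_train, model.predict(X_train)), 2),
          "test MSE:", round(mean_squared_error(y_test, model.predict(X_test)), 2))
```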

2. Given the following dataset, find the univariate regression function that best fits the dataset.

f(x) = 1·x + 4
f(x) = 1·x + 5
f(x) = 1.5·x + 3
f(x) = 2·x + 1

To determine the best univariate regression function for the given dataset, follow these steps.

Dataset
x = [2, 3, 4, 10]
y = [5.5, 6.5, 9, 18.5]

Step 1: Compute the Best-Fit Line Using Linear Regression
We use the least squares method to find the line of best fit in the form y = mx + b.

Calculate the means of x and y:
x̄ = (2 + 3 + 4 + 10) / 4 = 19/4 = 4.75
ȳ = (5.5 + 6.5 + 9 + 18.5) / 4 = 39.5/4 = 9.875

Calculate the slope m:
m = ∑(xᵢ − x̄)(yᵢ − ȳ) / ∑(xᵢ − x̄)²

Compute the numerator and denominator:
Numerator = (2 − 4.75)(5.5 − 9.875) + (3 − 4.75)(6.5 − 9.875) + (4 − 4.75)(9 − 9.875) + (10 − 4.75)(18.5 − 9.875)
= (−2.75)(−4.375) + (−1.75)(−3.375) + (−0.75)(−0.875) + (5.25)(8.625)
= 12.03125 + 5.90625 + 0.65625 + 45.28125
= 63.875

Denominator = (2 − 4.75)² + (3 − 4.75)² + (4 − 4.75)² + (10 − 4.75)²
= (−2.75)² + (−1.75)² + (−0.75)² + (5.25)²
= 7.5625 + 3.0625 + 0.5625 + 27.5625
= 38.75

m = 63.875 / 38.75 ≈ 1.648

Calculate the intercept b:
b = ȳ − m·x̄ = 9.875 − 1.648 × 4.75 ≈ 9.875 − 7.83 ≈ 2.05

Step 2: Form the Regression Equation
The regression equation is:
y ≈ 1.648x + 2.05

Step 3: Compare With Given Options
f(x) = 1·x + 4
f(x) = 1·x + 5
f(x) = 1.5·x + 3
f(x) = 2·x + 1

Our computed regression function y ≈ 1.648x + 2.05 is closest to:
f(x) = 1.5·x + 3

Although not an exact match, f(x) = 1.5·x + 3 is the best of the given options: among the four candidates it gives the smallest sum of squared errors on the data.
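As a quick sanity check (not part of the original question), the short NumPy sketch below recomputes the least-squares fit and compares the sum of squared errors (SSE) of the four candidate functions on the given data.

```python
# Verify the hand calculation and rank the candidate functions by SSE.
import numpy as np

x = np.array([2, 3, 4, 10], dtype=float)
y = np.array([5.5, 6.5, 9, 18.5])

# Least-squares slope and intercept, as computed by hand above.
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()
print(f"best fit: y = {m:.3f}x + {b:.3f}")           # ~ y = 1.648x + 2.045

# SSE of each option; the smallest SSE identifies the best of the given choices.
options = {"1.0x + 4": (1.0, 4), "1.0x + 5": (1.0, 5),
           "1.5x + 3": (1.5, 3), "2.0x + 1": (2.0, 1)}
for name, (slope, intercept) in options.items():
    sse = np.sum((y - (slope * x + intercept)) ** 2)
    print(name, "SSE =", sse)                         # 1.5x + 3 gives the smallest SSE
```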

3. Given a training data set of 500 instances, with each input instance having 6 dimensions
and each output being a scalar value, the dimensions of the design matrix used in applying
linear regression to this data are
500 × 6
500 × 7
500 × 62
None of the above
In linear regression, the design matrix, often denoted as X, includes all the input features
for each instance, along with an additional column for the intercept (bias term) if you are
using a model with an intercept.
Given:
 The training dataset consists of 500 instances.
 Each instance has 6 dimensions (features).
 Each output is a scalar value.
Dimensions of the Design Matrix
For linear regression:
1. Input Features: There are 6 features for each instance.
2. Intercept Term: If the model includes an intercept term, an additional column is added to the
design matrix to account for this.
Therefore, the design matrix X will have:
 Rows: Equal to the number of instances (500).
 Columns: Equal to the number of features plus 1 (for the intercept term).
So, the dimensions of the design matrix will be 500 × (6 + 1) = 500 × 7.
Conclusion
The correct dimension of the design matrix for applying linear regression to this dataset is:
 500 × 7
4. Assertion A: Binary encoding is usually preferred over One-hot encoding to represent
categorical data (e.g., colors, gender, etc.)
Reason R: Binary encoding is more memory efficient when compared to One-hot encoding
Both A and R are true and R is the correct explanation of A
Both A and R are true but R is not the correct explanation of A
A is true but R is false
A is false but R is true
To evaluate the assertion and reason, let's break down both concepts:
Assertion A: Binary encoding is usually preferred over One-hot encoding to represent
categorical data (e.g., colors, gender, etc.)
Binary Encoding vs. One-hot Encoding:
 One-hot Encoding: Each category is represented as a binary vector with a single 1 and all
other entries as 0. For a categorical variable with n possible categories, you need n
binary columns. This can lead to high-dimensional data if n is large.
 Binary Encoding: Categories are first assigned unique integer values, which are then
converted to binary form. For a categorical variable with n categories, you use
⌈log₂(n)⌉ binary columns. This is often more memory efficient compared to one-hot
encoding, especially when n is large.


Assertion A is true in many cases because binary encoding can be more compact than one-
hot encoding, especially with many categories.
Reason R: Binary encoding is more memory efficient when compared to One-hot encoding
Binary encoding is indeed more memory efficient than one-hot encoding, especially for
variables with a large number of categories. This is because it reduces the number of
columns required to represent the data.
Reason R is true and accurately describes the advantage of binary encoding over one-hot
encoding.
Explanation of Relationship:
Both A and R are true, and R is the correct explanation of A.
Binary encoding is preferred due to its memory efficiency, which is a correct explanation for
why it might be chosen over one-hot encoding.
Therefore, the correct answer is:
 Both A and R are true and R is the correct explanation of A
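To make the memory argument concrete, here is a small illustrative sketch; the color categories are made up, and the binary encoding is written out by hand rather than using a dedicated encoding library.

```python
# Compare the number of columns produced by one-hot and binary encoding.
import math
import pandas as pd

colors = pd.Series(["red", "green", "blue", "yellow", "purple", "red", "blue"])
n_categories = colors.nunique()

one_hot = pd.get_dummies(colors)                      # one column per category
print("one-hot columns:", one_hot.shape[1])           # 5

# Binary encoding: map each category to an integer, then write it in base 2.
codes = colors.astype("category").cat.codes
n_bits = math.ceil(math.log2(n_categories))           # ceil(log2(5)) = 3
binary = pd.DataFrame({f"bit_{i}": (codes // 2**i) % 2 for i in range(n_bits)})
print("binary-encoded columns:", binary.shape[1])     # 3
```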
5. Select the TRUE statement
Subset selection methods are more likely to improve test error by only focusing on the
most important features and by reducing variance in the fit.
Subset selection methods are more likely to improve train error by only focusing on the
most important features and by reducing variance in the fit.
Subset selection methods are more likely to improve both test and train error by focusing
on the most important features and by reducing variance in the fit.
Subset selection methods don’t help in performance gain in any way.
To determine the true statement regarding subset selection methods, let's review what
subset selection methods are and how they affect training and test errors.
Subset Selection Methods
Subset selection methods are techniques used in feature selection where the goal is to
choose a subset of the most relevant features from a larger set. Common methods include:
1. Forward Selection: Adding features one by one to find the best subset.
2. Backward Elimination: Starting with all features and removing the least significant ones.
3. Stepwise Selection: A combination of forward selection and backward elimination.
Effects on Training and Test Errors
1. Training Error: Subset selection restricts the model to a subset of the features, so it
generally cannot fit the training data better than the full model; any effect on training error
is incidental rather than the goal. Choosing features purely to minimize training error can
also amount to overfitting the selection process.
2. Test Error: The main advantage of subset selection methods is to improve generalization and
reduce variance. By focusing only on the most important features, these methods can help to
avoid overfitting and thus potentially improve test error. The test error might improve if the
reduced feature set generalizes better to unseen data, but this is not guaranteed.
Evaluation of Statements
1. Subset selection methods are more likely to improve test error by only focusing on the
most important features and by reducing variance in the fit.
o True: By selecting a subset of important features, the model can reduce overfitting
and variance, which often leads to improved test error.
2. Subset selection methods are more likely to improve train error by only focusing on the
most important features and by reducing variance in the fit.
o False: Subset selection methods usually improve test error rather than training error.
Training error may decrease, but the primary goal is to improve the model’s
performance on unseen data (test error).
3. Subset selection methods are more likely to improve both test and train error by focusing
on the most important features and by reducing variance in the fit.
o False: While subset selection methods can improve test error, they typically do not
improve training error significantly. The focus is on improving generalization rather
than specifically reducing training error.
4. Subset selection methods don’t help in performance gain in any way.
o False: Subset selection methods can help in performance gain by improving model
generalization and reducing overfitting.
Conclusion
The true statement is:
 Subset selection methods are more likely to improve test error by only focusing on the
most important features and by reducing variance in the fit.
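As an illustration (not part of the original question), the sketch below runs forward selection with scikit-learn's SequentialFeatureSelector on synthetic data in which only a few of many features are informative. Whether the cross-validated (test-like) score actually improves depends on the dataset; this is a sketch under those assumptions, not a guarantee.

```python
# Forward feature selection with scikit-learn, compared against using all features.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=50, n_informative=10,
                       noise=10.0, random_state=0)

full_score = cross_val_score(LinearRegression(), X, y, cv=5).mean()

selector = SequentialFeatureSelector(LinearRegression(),
                                     n_features_to_select=10,
                                     direction="forward", cv=5)
X_subset = selector.fit_transform(X, y)
subset_score = cross_val_score(LinearRegression(), X_subset, y, cv=5).mean()

print("CV R^2, all 50 features :", round(full_score, 3))
print("CV R^2, selected subset :", round(subset_score, 3))
```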
6. Rank the 3 subset selection methods in terms of computational efficiency:
Forward stepwise selection, best subset selection, and forward stagewise regression.
Forward stepwise selection, forward stagewise regression and best subset selection.
Best subset selection, forward stagewise regression and forward stepwise selection.
Best subset selection, forward stepwise selection and forward stagewise regression.
To rank the subset selection methods in terms of computational efficiency, let's analyze
each method:
1. Best Subset Selection
 Description: Best subset selection evaluates all possible subsets of features to find the one
that best fits the model according to some criterion (e.g., minimizing error).
 Computational Complexity: It is the most computationally intensive because it requires
evaluating 2^p subsets, where p is the number of features. This exponential growth
makes it impractical for large numbers of features.
2. Forward Stepwise Selection
 Description: Forward stepwise selection starts with no features and adds them one by one
based on their contribution to improving the model. At each step, it evaluates all
remaining features to decide which one to add.
 Computational Complexity: It is less computationally intensive than best subset selection.
For each feature added, it requires evaluating all remaining features, leading to a
complexity of approximately O(p²) model fits, where p is the number of features.
3. Forward Stagewise Regression
 Description: Forward stagewise regression is similar to forward stepwise selection but
proceeds more gradually. At each stage, it updates the coefficient of a single feature by
only a small increment, rather than refitting the full model.
 Computational Complexity: It is generally more computationally efficient than both best
subset selection and forward stepwise selection because it makes smaller, incremental
updates and does not evaluate as many feature combinations in each step.
Ranking by Computational Efficiency
1. Forward Stagewise Regression: Most efficient, due to its incremental approach and fewer
evaluations.
2. Forward Stepwise Selection: More efficient than best subset selection but less efficient
than forward stagewise regression.
3. Best Subset Selection: Least efficient, as it involves evaluating all possible subsets.
Conclusion
The correct ranking of the subset selection methods in terms of computational efficiency
is:
 Forward stagewise regression, forward stepwise selection, and best subset selection.
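To make the complexity comparison concrete, here is a rough sketch that counts the number of candidate model fits needed by best subset selection and forward stepwise selection for p features. Forward stagewise regression is not included because it performs many small coefficient updates rather than full model refits, so a simple fit count does not apply to it.

```python
# Count full least-squares refits required by two of the strategies for p features.

def n_fits_best_subset(p: int) -> int:
    return 2 ** p - 1                # every non-empty subset of features is fit once

def n_fits_forward_stepwise(p: int) -> int:
    return p * (p + 1) // 2          # at step k, the p - k + 1 remaining features are each tried

for p in (5, 10, 20):
    print(f"p = {p:2d}   best subset: {n_fits_best_subset(p):>9,d}   "
          f"forward stepwise: {n_fits_forward_stepwise(p):>4d}")
```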
7. Choose the TRUE statements from the following: (Multiple correct choice)
Ridge regression since it reduces the coefficients of all variables, makes the final fit a lot
more interpretable.
Lasso regression since it doesn’t deal with a squared power is easier to optimize than
ridge regression.
Ridge regression has a more stable optimization than lasso regression.
Lasso regression is better suited for interpretability than ridge regression.
Let's evaluate each statement regarding Ridge and Lasso regression:
1. Ridge regression since it reduces the coefficients of all variables, makes the final fit a lot
more interpretable.
o False: Ridge regression applies L2 regularization, which reduces the magnitude of
all coefficients but does not force any coefficients to be exactly zero. This means all
variables remain in the model, potentially making the final fit less interpretable
compared to Lasso regression, which can zero out some coefficients.
2. Lasso regression since it doesn’t deal with a squared power is easier to optimize than ridge
regression.
o False: Lasso regression applies L1 regularization, which can lead to sparse solutions
(coefficients exactly zero). The optimization problem for Lasso is not necessarily
easier to solve than Ridge regression; in fact, it can be more complex due to the L1
penalty and the need for algorithms that handle the non-differentiability at zero.
3. Ridge regression has a more stable optimization than Lasso regression.
o True: Ridge regression (L2 regularization) has a more stable optimization process
because it involves differentiable L2 norms, which provides a smooth and
continuous penalty. Lasso regression (L1 regularization) can lead to non-
differentiability at zero, making optimization more challenging.
4. Lasso regression is better suited for interpretability than ridge regression.
o True: Lasso regression tends to produce sparse models by driving some coefficients
exactly to zero. This sparsity can make the model easier to interpret because it
effectively selects a subset of important features, potentially leading to a more
straightforward and interpretable model.
Conclusion
The TRUE statements are:
 Ridge regression has a more stable optimization than Lasso regression.
 Lasso regression is better suited for interpretability than Ridge regression.
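A minimal sketch (synthetic data and an arbitrary regularization strength, both assumptions for illustration) of the interpretability point: with a comparable penalty, Ridge leaves every coefficient non-zero while Lasso tends to drive many of them exactly to zero.

```python
# Contrast Ridge and Lasso coefficient sparsity on synthetic regression data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

print("Ridge non-zero coefficients:", np.sum(ridge.coef_ != 0))   # typically all 20
print("Lasso non-zero coefficients:", np.sum(lasso.coef_ != 0))   # typically far fewer
```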
8. Which of the following statements are TRUE? Let xᵢ be the i-th datapoint in a
dataset of N points. Let v represent the first principal component of the dataset.
(Multiple answer question)
v = argmax ∑_{i=1}^N (vᵀxᵢ)²  s.t. ‖v‖ = 1
v = argmin ∑_{i=1}^N (vᵀxᵢ)²  s.t. ‖v‖ = 1
Scaling at the start of performing PCA is done just for better numerical stability and
computational benefits but plays no role in determining the final principal components of
a dataset.
The resultant vectors obtained when performing PCA on a dataset can vary based on the
scale of the dataset.

Let's analyze each statement regarding Principal Component Analysis (PCA):

1. v = argmax ∑_{i=1}^N (vᵀxᵢ)²  s.t. ‖v‖ = 1

True: This statement describes the objective of PCA (for centered data). The first principal
component v is the vector that maximizes the variance of the projections vᵀxᵢ.
Mathematically, this is equivalent to maximizing the sum of squared projections
∑_{i=1}^N (vᵀxᵢ)² subject to the constraint that v is a unit vector (‖v‖ = 1). This
corresponds to the eigenvector associated with the largest eigenvalue of the covariance matrix.

2. v = argmin ∑_{i=1}^N (vᵀxᵢ)²  s.t. ‖v‖ = 1

False: This statement is incorrect because PCA aims to maximize the variance of the
projections, not minimize it. Minimizing the variance of the projections is not the goal of
PCA and does not correspond to the principal components.
3. Scaling at the start of performing PCA is done just for better numerical stability and
computational benefits but plays no role in determining the final principal components of
a dataset.
False: Scaling (or standardization) is crucial in PCA, especially when the features have
different units or scales. If the features are not scaled, the PCA might be dominated by
features with larger scales, and the principal components obtained will be biased towards
those features. Scaling ensures that each feature contributes equally to the computation of
the principal components.
4. The resultant vectors obtained when performing PCA on a dataset can vary based on the
scale of the dataset.
True: The principal components obtained from PCA can indeed vary depending on the
scale of the dataset. If the features are on different scales and are not standardized, PCA
will give more weight to features with larger scales. Standardizing the features to have zero
mean and unit variance ensures that the principal components are not biased by the scale
of the features.
Conclusion

The TRUE statements are:
 v = argmax ∑_{i=1}^N (vᵀxᵢ)²  s.t. ‖v‖ = 1
 The resultant vectors obtained when performing PCA on a dataset can vary based on the
scale of the dataset.
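A minimal sketch (the two correlated features with deliberately different scales are an assumption made for illustration) showing that the first principal component depends on whether the data is standardized first:

```python
# Compare PCA with and without standardization on features of very different scale.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, size=300)                   # small numeric scale
weight_g = 60_000 * height_m + rng.normal(0, 5_000, 300)    # correlated, huge scale (grams)
X = np.column_stack([height_m, weight_g])

pca_raw = PCA(n_components=1).fit(X)
pca_scaled = PCA(n_components=1).fit(StandardScaler().fit_transform(X))

print("first PC, raw data   :", pca_raw.components_[0])     # dominated by the large-scale feature
print("first PC, scaled data:", pca_scaled.components_[0])  # both features get similar absolute weight
```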
