ML 5 Mark Questions Answers
Ans: The Least Squares method minimizes the sum of squared differences between observed values and
predicted values. It finds the best-fitting line by minimizing the cost function J(θ) = (1/n) Σᵢ (yᵢ − ŷᵢ)².
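A minimal sketch of this in Python using NumPy's least-squares solver; the data values here are made up for illustration, not taken from the question:

import numpy as np

# Toy data (illustrative only): y is roughly 2x + 1 with noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix with a bias column so we fit y = theta0 + theta1 * x
X = np.column_stack([np.ones_like(x), x])

# Least squares finds theta minimizing sum((y - X @ theta)**2)
theta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print("intercept, slope:", theta)

# Cost J(theta) = (1/n) * sum((y_i - yhat_i)**2)
yhat = X @ theta
print("mean squared error:", np.mean((y - yhat) ** 2))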
Ans: L1 (Lasso) adds λ Σ|wᵢ| to the cost and can shrink some coefficients exactly to zero (feature selection). L2 (Ridge)
adds λ Σwᵢ² and shrinks all coefficients toward zero without eliminating them. Use L1 for sparsity and L2 when all features matter.
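A short sketch of the difference with scikit-learn; the synthetic data and alpha values are illustrative assumptions:

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features matter in this toy target
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1: some coefficients driven exactly to 0
ridge = Ridge(alpha=1.0).fit(X, y)   # L2: coefficients shrunk but kept non-zero

print("Lasso coefficients:", lasso.coef_)
print("Ridge coefficients:", ridge.coef_)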
Ans: It applies the sigmoid function σ(z) = 1 / (1 + e^(−z)) to the linear output, converting it to a probability. If the probability exceeds a threshold (typically 0.5), the instance is classified as the positive class.
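A minimal sketch of the sigmoid applied to a linear output; the weights and input below are made-up placeholders:

import numpy as np

def sigmoid(z):
    # Maps any real-valued score into the (0, 1) interval
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.array([0.8, -0.4]), 0.1     # illustrative weights and bias
x = np.array([2.0, 1.5])              # a single input vector

z = w @ x + b                         # linear output
p = sigmoid(z)                        # probability of the positive class
label = int(p > 0.5)                  # threshold at 0.5
print(p, label)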
Ans: 1. Data preprocessing (handling missing values, scaling, encoding). 2. Feature selection/engineering.
Ans: The input data is evaluated from the root node by applying feature-based splits until it reaches a leaf node, whose class (or value) is returned as the prediction.
Ans: Pruning removes unnecessary branches to reduce overfitting. Pre-pruning stops growth early, while post-pruning removes branches after the full tree has been grown.
Ans: The hyperplane separates classes in feature space. Margin is the distance to the closest data points (the support vectors); a larger margin generally gives better generalization.
Ans: Bagging trains multiple models independently to reduce variance (e.g., Random Forest). Boosting trains models sequentially, with each model correcting the errors of the previous ones, to reduce bias (e.g., AdaBoost, Gradient Boosting).
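A brief sketch contrasting the two with scikit-learn; the dataset and hyperparameters are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: many independent trees, predictions averaged (variance reduction)
bagging = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Boosting: trees built sequentially, each focusing on previous errors (bias reduction)
boosting = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("Random Forest accuracy:", bagging.score(X_test, y_test))
print("Gradient Boosting accuracy:", boosting.score(X_test, y_test))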
Ans: 1. Initialize K centroids. 2. Assign each point to the nearest centroid. 3. Update centroids. 4. Repeat steps 2–3 until the centroids stop changing (convergence).
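A simplified sketch of those four steps in Python (it skips edge cases such as empty clusters, and the two-blob data is invented for illustration):

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Initialize K centroids by picking random data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # 2. Assign each point to the nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Update centroids as the mean of their assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. Stop when centroids no longer change
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
centroids, labels = kmeans(X, k=2)
print(centroids)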
Ans: PCA finds principal components (directions of maximum variance) and reduces dimensionality by projecting the data
onto a smaller set of axes while retaining most of the information.
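A short scikit-learn sketch; the synthetic data (5-D points lying near a 2-D subspace) is an illustrative assumption:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))                       # hidden 2-D structure
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(200, 5))

pca = PCA(n_components=2)                                # keep 2 directions of max variance
X_reduced = pca.fit_transform(X)                         # project onto those components

print(X_reduced.shape)                                   # (200, 2)
print(pca.explained_variance_ratio_)                     # variance retained by each component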
Ans: It removes irrelevant/noisy features, reduces overfitting, speeds up training, and improves model accuracy and interpretability.
Ans: Check if it has high bias (underfitting) or variance (overfitting) using learning curves. Tune model complexity, regularization, or features accordingly, and gather more data if variance is the problem.
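One way to produce such learning curves with scikit-learn; the model and dataset are placeholders:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Training vs. validation score as the training set grows
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5),
)

# Both scores low and close together -> high bias; large gap -> high variance
print("train:", train_scores.mean(axis=1))
print("validation:", val_scores.mean(axis=1))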
Ans: Bias is error from overly simplistic models; variance is error from sensitivity to data noise. High bias causes underfitting, high variance causes overfitting; a good model balances the two (the bias-variance tradeoff).
Ans: It converts categorical variables into binary vectors so ML models can use them. Each category is mapped to a vector with a 1 in its own position and 0s elsewhere.
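A minimal sketch with pandas; the column name and values are invented for illustration:

import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Each category becomes its own binary indicator column
encoded = pd.get_dummies(df, columns=["color"])
print(encoded)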
Ans: 1. Define the problem. 2. Collect and clean data. 3. Feature engineering. 4. Split into train/test. 5. Train the model. 6. Evaluate on the test set. 7. Tune hyperparameters and deploy.
Ans: Follow feature-based decisions from root to leaf. Example: if outlook=sunny and humidity=high, predict
'no'.
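A minimal sketch of that traversal written as plain conditionals; the splits mirror the classic play-tennis example rather than any tree from the question:

def predict_play(outlook, humidity, wind):
    # Root split on outlook, then further feature-based splits down to a leaf
    if outlook == "sunny":
        return "no" if humidity == "high" else "yes"
    if outlook == "overcast":
        return "yes"
    # outlook == "rainy"
    return "no" if wind == "strong" else "yes"

print(predict_play("sunny", "high", "weak"))   # 'no'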
Ans: Information Gain measures entropy reduction; Gini measures class impurity. Both help select the best attribute to split on at each node.
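A small sketch of both impurity measures computed from class counts; the 9-vs-5 node below is an illustrative example:

import numpy as np

def entropy(counts):
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                        # avoid log(0)
    return -np.sum(p * np.log2(p))

def gini(counts):
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    return 1.0 - np.sum(p ** 2)

# Node with 9 positive and 5 negative examples
print(entropy([9, 5]))   # about 0.940 bits
print(gini([9, 5]))      # about 0.459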
Ans: Unpruned trees are complex and overfit. Pruning simplifies the tree, improving generalization and
reducing overfitting.
Ans: Linear Regression assumes linearity and works with numeric data. Decision Trees handle both numerical and categorical data and can capture non-linear relationships.
Ans: Pre-pruning stops tree growth early. Post-pruning grows the tree fully and removes unhelpful branches
afterward.
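A short scikit-learn sketch of both styles; the dataset and parameter values are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Pre-pruning: stop growth early with depth / leaf-size limits
pre_pruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5).fit(X, y)

# Post-pruning: grow fully, then prune with cost-complexity pruning (ccp_alpha)
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01).fit(X, y)

print(pre_pruned.get_depth(), post_pruned.get_depth())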
21. Margin in SVMs and why maximize it.
Ans: Margin is the distance to the closest points. Maximizing it improves generalization and robustness of
the model.
Ans: The kernel trick implicitly maps data to a higher-dimensional space where it becomes linearly separable, using a kernel function instead of computing the mapping explicitly.
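A brief sketch on data that is not linearly separable in its original space; the concentric-circles dataset is an illustrative choice:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)      # RBF kernel separates the circles implicitly

print("linear kernel accuracy:", linear_svm.score(X, y))
print("RBF kernel accuracy:", rbf_svm.score(X, y))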
Ans: Hard margin allows no misclassification (requires perfect separation). Soft margin uses slack variables to tolerate some misclassifications, trading margin width against training errors.
Ans: C balances margin maximization vs. classification error. Low C allows wider margins with some errors; high C penalizes errors heavily, giving narrower margins and a tighter (possibly overfit) fit.
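A small sketch comparing two C values; the dataset and the specific values 0.01 and 100 are arbitrary choices for illustration:

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, flip_y=0.1, random_state=0)

soft = SVC(kernel="linear", C=0.01).fit(X, y)   # low C: wide margin, tolerates errors
hard = SVC(kernel="linear", C=100.0).fit(X, y)  # high C: narrow margin, penalizes errors

# A lower C typically leaves more support vectors (more margin violations allowed)
print(len(soft.support_), len(hard.support_))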
Ans: SVM maximizes the margin and handles non-linear data via kernels. Logistic Regression estimates class probabilities with the sigmoid function and gives a simpler, more interpretable linear decision boundary.