ML

The document provides an overview of key concepts in Machine Learning (ML), including types of learning (supervised, unsupervised, reinforcement), algorithms, and applications. It explains K-Means clustering, proximal tuning in optimization, the role of weights and biases in artificial neural networks, and considerations for handling missing data. Additionally, it discusses linear and polynomial regression, Principal Component Analysis (PCA), and the formula for Manhattan distance.

1. What do you know about ML?

• Machine Learning (ML) is a subset of artificial intelligence where algorithms learn patterns
from data to make predictions or decisions without explicit programming.
• Supervised Learning: Uses labeled data (e.g., regression, classification).
• Unsupervised Learning: Finds patterns in unlabeled data (e.g., clustering, dimensionality
reduction).
• Reinforcement Learning: Agents learn by interacting with an environment to maximize
rewards.
• Common algorithms: Linear Regression, Logistic Regression, Decision Trees, Random
Forests, SVM, K-Means, PCA, Neural Networks.
• Applications: Image recognition, NLP, recommendation systems, fraud detection, etc.
• Key concepts: Feature engineering, overfitting, underfitting, bias-variance tradeoff, cross-
validation, and evaluation metrics (e.g., accuracy, MSE, F1-score).
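
• A minimal sketch tying several of these concepts together (labeled data, a common supervised algorithm, cross-validation, and an accuracy metric), assuming scikit-learn is installed:

# Minimal sketch: supervised classification with cross-validation (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)          # labeled data: features X, targets y
model = LogisticRegression(max_iter=1000)  # a common supervised algorithm

# 5-fold cross-validation guards against overfitting to a single train/test split.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())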
2. How clustering happens in K-Means?
• Steps:
1. Initialize K centroids randomly.
2. Assign each data point to the nearest centroid (based on Euclidean distance).
3. Recalculate centroids as the mean of all points in each cluster.
4. Repeat steps 2-3 until convergence (centroids stabilize or the maximum number of iterations is reached).
• Output: K clusters with assigned data points.
• Note: Sensitive to initial centroid placement; may require multiple runs or K-Means++ for
better initialization.
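
• A minimal NumPy sketch of the steps above (illustrative, not a production implementation; names like kmeans are chosen for this example):

# Sketch of the K-Means steps described above, using plain NumPy.
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialize K centroids randomly (here: k distinct data points).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recalculate centroids as the mean of each cluster's points
        # (keeping the old centroid if a cluster happens to be empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop when the centroids stabilize (convergence).
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
labels, centroids = kmeans(X, k=2)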
3. What is proximal tuning?
• Proximal tuning (or proximal gradient methods) is an optimization technique used in ML to
solve problems with composite objective functions, often involving a smooth loss function plus a
non-smooth regularization term (e.g., L1 regularization).
• Combines gradient descent with a proximal operator to handle non-differentiable terms.
• Common in sparse models like Lasso or compressed sensing.
• Example: In sparse linear regression, it minimizes the loss while encouraging sparsity in
coefficients.
• Proximal algorithms are efficient for large-scale, high-dimensional data.
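
• A minimal sketch of one proximal gradient method, ISTA for Lasso regression, assuming NumPy; the soft-thresholding function is the proximal operator of the L1 penalty:

# Sketch: proximal gradient descent (ISTA) for sparse linear regression (Lasso).
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1: shrinks coefficients toward zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(X, y, lam=0.1, n_iters=500):
    n, d = X.shape
    w = np.zeros(d)
    L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the smooth gradient
    step = 1.0 / L
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n        # gradient of the smooth squared-error loss
        # Gradient step on the smooth part, then soft-thresholding for the
        # non-smooth L1 term -- this is what encourages sparsity in w.
        w = soft_threshold(w - step * grad, step * lam)
    return w

X = np.random.randn(100, 20)
true_w = np.zeros(20); true_w[:3] = [2.0, -1.0, 0.5]   # sparse ground truth
y = X @ true_w + 0.1 * np.random.randn(100)
w_hat = ista(X, y, lam=0.1)   # most entries of w_hat end up exactly zero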
4. How do weights and bias work in ANN?
• In Artificial Neural Networks (ANNs):
• Weights: Parameters that scale the input features or activations to influence the output
of a neuron. Each connection between neurons has a weight, adjusted during training to
minimize error.
• Bias: A constant added to the weighted sum of inputs to shift the activation function,
allowing better fitting of complex patterns.
• Process: For a neuron, input features are multiplied by weights, summed, and added to the
bias. This sum passes through an activation function (e.g., ReLU, sigmoid) to produce the
neuron’s output.
• During backpropagation, weights and biases are updated using gradient descent to minimize
the loss function.
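
• A minimal NumPy sketch of a single neuron's forward pass, showing how weights, bias, and the activation function combine (values are illustrative):

# Sketch of one neuron: weighted sum of inputs plus bias, then activation.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])    # input features
w = np.array([0.8, 0.1, -0.4])    # weights: scale each input's influence
b = 0.25                          # bias: shifts the activation threshold

z = np.dot(w, x) + b              # weighted sum of inputs plus the bias
output = sigmoid(z)               # activation function produces the neuron's output
print(output)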
5. How do you decide whether to drop the desired column or fill it with values?
• Deciding whether to drop a column or impute missing values depends on:
• Percentage of Missing Data:
• Low missing data (<5-10%): Impute using mean, median, mode, or advanced
methods (e.g., KNN imputation).
• High missing data (>50%): Consider dropping the column if it’s not critical, as
imputation may introduce bias.
• Importance of the Column: If the column is highly relevant (e.g., strong correlation with
the target), impute to retain information. If irrelevant, drop it.
• Data Distribution: Imputation works better if the data is missing at random and the
column’s distribution is stable.
• Domain Knowledge: If the column is critical based on domain expertise, prioritize
imputation.
• Model Requirements: Some models (e.g., tree-based) handle missing values better,
reducing the need for imputation.
• Use exploratory data analysis and cross-validation to assess the impact of dropping vs.
imputing.
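
• A minimal pandas sketch of this decision process; the 50% drop threshold and the toy DataFrame are illustrative assumptions, not fixed rules:

# Sketch: drop mostly-missing columns, impute the rest (thresholds illustrative).
import pandas as pd

df = pd.DataFrame({
    "age":    [25, None, 40, 31, None],             # 40% missing -> impute
    "income": [50_000, 60_000, None, None, None],   # 60% missing -> drop
})

missing_pct = df.isna().mean()          # fraction of missing values per column
for col, pct in missing_pct.items():
    if pct > 0.5:
        df = df.drop(columns=col)       # mostly missing: imputation would add bias
    elif pct > 0:
        df[col] = df[col].fillna(df[col].median())  # low missingness: impute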
6. What is linear regression and can it be used for multiple classification?
• Linear Regression: A supervised learning algorithm that models the relationship between a
dependent variable (continuous) and one or more independent variables using a linear
equation: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$. It minimizes the mean
squared error between predictions and actual values.
• Multi-class Classification: Linear regression is not suitable for multi-class classification
(predicting one of several discrete classes). It assumes a continuous output, while classification
requires discrete outputs. Instead, use:
• Logistic Regression for binary classification.
• Softmax Regression (or multinomial logistic regression) for multiple classes.
• Linear regression can be used indirectly in classification (e.g., as a baseline for
predicting class probabilities), but it’s not optimal due to unbounded outputs.
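
• A short scikit-learn sketch contrasting the two on a 3-class dataset (illustrative; LogisticRegression handles the multi-class case via a softmax/multinomial formulation):

# Sketch: linear regression vs. multinomial logistic regression for 3 classes.
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression

X, y = load_iris(return_X_y=True)       # y holds 3 discrete class labels

# Linear regression treats y as continuous: unbounded floats, not class labels.
reg = LinearRegression().fit(X, y)
print(reg.predict(X[:3]))

# Softmax (multinomial logistic) regression yields labels and probabilities.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:3]))               # discrete class labels
print(clf.predict_proba(X[:3]).round(2))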
7. Where do you use polynomial regression?
• Polynomial regression is used when the relationship between the independent variable(s) and
the dependent variable is non-linear but can be approximated by a polynomial function.
• Use Cases:
• Non-linear Trends: When data shows curves or higher-order patterns (e.g., quadratic,
cubic) that linear regression can’t capture.
• Scientific Modeling: E.g., modeling growth rates, chemical reactions, or physical
phenomena with polynomial relationships.
• Feature Engineering: To capture non-linear effects in datasets with low dimensionality.
• Limitations: Avoid high-degree polynomials to prevent overfitting; consider regularization
(e.g., Ridge) or other models (e.g., splines, decision trees) for complex patterns.
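
• A minimal scikit-learn sketch fitting a quadratic trend; the synthetic data and the degree choice are illustrative:

# Sketch: polynomial regression on a quadratic trend (assumes scikit-learn).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 2 * X.ravel() ** 2 - X.ravel() + rng.normal(scale=1.0, size=50)  # curve + noise

# degree=2 captures the curve; much higher degrees risk overfitting (see above).
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[1.5]]))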
8. What is PCA and the steps for the reduction technique in PCA?
• Principal Component Analysis (PCA): A dimensionality reduction technique that transforms
high-dimensional data into a lower-dimensional space while retaining most variance. It finds
orthogonal axes (principal components) that maximize variance.
• Steps:
1. Standardize the data (mean = 0, variance = 1) to ensure equal feature contribution.
2. Compute the covariance matrix to understand feature relationships.
3. Perform eigenvalue decomposition (or SVD) on the covariance matrix to find principal components (eigenvectors) and their importance (eigenvalues).
4. Sort eigenvalues in descending order and select the top k components for the desired dimensionality.
5. Project the data onto the selected components to obtain the reduced dataset.
• Output: Lower-dimensional data with minimal information loss.
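
• A minimal NumPy sketch following these steps (illustrative; sklearn.decomposition.PCA is the usual library route):

# Sketch of the PCA steps above with plain NumPy (eigendecomposition route).
import numpy as np

def pca(X, k):
    # Step 1: standardize (zero mean, unit variance per feature).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: covariance matrix of the standardized features.
    cov = np.cov(Xs, rowvar=False)
    # Step 3: eigendecomposition (eigh suits symmetric matrices).
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Step 4: sort components by descending eigenvalue, keep the top k.
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:k]]
    # Step 5: project the data onto the selected components.
    return Xs @ components

X = np.random.rand(100, 5)
X_reduced = pca(X, k=2)        # 100 samples, now in 2 dimensions
print(X_reduced.shape)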
9. What is the formula for Manhattan distance?
• The Manhattan distance (L1 norm) between two points $A = (x_1, x_2, \dots, x_n)$ and
$B = (y_1, y_2, \dots, y_n)$ in n-dimensional space is:

$d(A, B) = \sum_{i=1}^{n} |x_i - y_i|$

• Example: For $A = (1, 2)$ and $B = (4, 6)$, $d(A, B) = |1 - 4| + |2 - 6| = 7$.
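• A one-line NumPy sketch of this formula:

# Manhattan (L1) distance between two n-dimensional points.
import numpy as np

def manhattan(a, b):
    return np.sum(np.abs(np.asarray(a) - np.asarray(b)))

print(manhattan([1, 2], [4, 6]))   # |1-4| + |2-6| = 7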