ML

The document provides an overview of key concepts in Machine Learning (ML), including types of learning (supervised, unsupervised, reinforcement), algorithms, and applications. It explains K-Means clustering, proximal tuning in optimization, the role of weights and biases in artificial neural networks, and considerations for handling missing data. Additionally, it discusses linear and polynomial regression, Principal Component Analysis (PCA), and the formula for Manhattan distance.

1. What do you know about ML?

• Machine Learning (ML) is a subset of artificial intelligence where algorithms learn patterns
from data to make predictions or decisions without explicit programming.
• Supervised Learning: Uses labeled data (e.g., regression, classification).
• Unsupervised Learning: Finds patterns in unlabeled data (e.g., clustering, dimensionality
reduction).
• Reinforcement Learning: Agents learn by interacting with an environment to maximize
rewards.
• Common algorithms: Linear Regression, Logistic Regression, Decision Trees, Random
Forests, SVM, K-Means, PCA, Neural Networks.
• Applications: Image recognition, NLP, recommendation systems, fraud detection, etc.
• Key concepts: Feature engineering, overfitting, underfitting, bias-variance tradeoff, cross-
validation, and evaluation metrics (e.g., accuracy, MSE, F1-score).
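
• A minimal sketch tying several of these concepts together (labeled data, a common supervised algorithm, cross-validation, and an accuracy metric), assuming scikit-learn is installed:

# Minimal sketch: supervised classification with cross-validation (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)          # labeled data: features X, targets y
model = LogisticRegression(max_iter=1000)  # a common supervised algorithm

# 5-fold cross-validation guards against overfitting to a single train/test split.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("accuracy per fold:", scores)
print("mean accuracy:", scores.mean())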
2. How clustering happens in K-Means?
• Steps:
1. Initialize K centroids randomly.
2. Assign each data point to the nearest centroid (based on Euclidean distance).
3. Recalculate centroids as the mean of all points in each cluster.
4. Repeat steps 2-3 until convergence (centroids stabilize or the maximum number of iterations is reached).
• Output: K clusters with assigned data points.
• Note: Sensitive to initial centroid placement; may require multiple runs or K-Means++ for
better initialization.
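
• A minimal NumPy sketch of the steps above (illustrative, not a production implementation; names like kmeans are chosen for this example):

# Sketch of the K-Means steps described above, using plain NumPy.
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialize K centroids randomly (here: k distinct data points).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recalculate centroids as the mean of each cluster's points
        # (keeping the old centroid if a cluster happens to be empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop when the centroids stabilize (convergence).
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
labels, centroids = kmeans(X, k=2)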
3. What is proximal tuning?
• Proximal tuning (or proximal gradient methods) is an optimization technique used in ML to
solve problems with composite objective functions, often involving a smooth loss function plus a
non-smooth regularization term (e.g., L1 regularization).
• Combines gradient descent with a proximal operator to handle non-differentiable terms.
• Common in sparse models like Lasso or compressed sensing.
• Example: In sparse linear regression, it minimizes the loss while encouraging sparsity in
coefficients.
• Proximal algorithms are efficient for large-scale, high-dimensional data.
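
• A minimal sketch of one proximal gradient method, ISTA for Lasso regression, assuming NumPy; the soft-thresholding function is the proximal operator of the L1 penalty:

# Sketch: proximal gradient descent (ISTA) for sparse linear regression (Lasso).
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1: shrinks coefficients toward zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(X, y, lam=0.1, n_iters=500):
    n, d = X.shape
    w = np.zeros(d)
    L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the smooth gradient
    step = 1.0 / L
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n        # gradient of the smooth squared-error loss
        # Gradient step on the smooth part, then soft-thresholding for the
        # non-smooth L1 term -- this is what encourages sparsity in w.
        w = soft_threshold(w - step * grad, step * lam)
    return w

X = np.random.randn(100, 20)
true_w = np.zeros(20); true_w[:3] = [2.0, -1.0, 0.5]   # sparse ground truth
y = X @ true_w + 0.1 * np.random.randn(100)
w_hat = ista(X, y, lam=0.1)   # most entries of w_hat end up exactly zero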
4. How do weights and bias work in ANN?
• In Artificial Neural Networks (ANNs):
• Weights: Parameters that scale the input features or activations to influence the output
of a neuron. Each connection between neurons has a weight, adjusted during training to
minimize error.
• Bias: A constant added to the weighted sum of inputs to shift the activation function,
allowing better fitting of complex patterns.
• Process: For a neuron, input features are multiplied by weights, summed, and added to the
bias. This sum passes through an activation function (e.g., ReLU, sigmoid) to produce the
neuron’s output.
• During backpropagation, weights and biases are updated using gradient descent to minimize
the loss function.
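
• A minimal NumPy sketch of a single neuron's forward pass, showing how weights, bias, and the activation function combine (values are illustrative):

# Sketch of one neuron: weighted sum of inputs plus bias, then activation.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])    # input features
w = np.array([0.8, 0.1, -0.4])    # weights: scale each input's influence
b = 0.25                          # bias: shifts the activation threshold

z = np.dot(w, x) + b              # weighted sum of inputs plus the bias
output = sigmoid(z)               # activation function produces the neuron's output
print(output)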
5. How do you decide whether to drop the desired column or fill it with values?
• Deciding whether to drop a column or impute missing values depends on:
• Percentage of Missing Data:
• Low missing data (<5-10%): Impute using mean, median, mode, or advanced
methods (e.g., KNN imputation).
• High missing data (>50%): Consider dropping the column if it’s not critical, as
imputation may introduce bias.
• Importance of the Column: If the column is highly relevant (e.g., strong correlation with
the target), impute to retain information. If irrelevant, drop it.
• Data Distribution: Imputation works better if the data is missing at random and the
column’s distribution is stable.
• Domain Knowledge: If the column is critical based on domain expertise, prioritize
imputation.
• Model Requirements: Some models (e.g., tree-based) handle missing values better,
reducing the need for imputation.
• Use exploratory data analysis and cross-validation to assess the impact of dropping vs.
imputing.
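
• A minimal pandas sketch of this decision process; the 50% drop threshold and the toy DataFrame are illustrative assumptions, not fixed rules:

# Sketch: drop mostly-missing columns, impute the rest (thresholds illustrative).
import pandas as pd

df = pd.DataFrame({
    "age":    [25, None, 40, 31, None],             # 40% missing -> impute
    "income": [50_000, 60_000, None, None, None],   # 60% missing -> drop
})

missing_pct = df.isna().mean()          # fraction of missing values per column
for col, pct in missing_pct.items():
    if pct > 0.5:
        df = df.drop(columns=col)       # mostly missing: imputation would add bias
    elif pct > 0:
        df[col] = df[col].fillna(df[col].median())  # low missingness: impute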
6. What is linear regression and can it be used for multiple classification?
• Linear Regression: A supervised learning algorithm that models the relationship between a
dependent variable (continuous) and one or more independent variables using a linear
equation: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$. It minimizes the mean
squared error between predictions and actual values.
• Multi-class Classification: Linear regression is not suitable for multi-class classification
(predicting one of several discrete classes). It assumes a continuous output, while classification
requires discrete outputs. Instead, use:
• Logistic Regression for binary classification.
• Softmax Regression (or multinomial logistic regression) for multiple classes.
• Linear regression can be used indirectly in classification (e.g., as a baseline for
predicting class probabilities), but it’s not optimal due to unbounded outputs.
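
• A short scikit-learn sketch contrasting the two on a 3-class dataset (illustrative; LogisticRegression handles the multi-class case via a softmax/multinomial formulation):

# Sketch: linear regression vs. multinomial logistic regression for 3 classes.
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression

X, y = load_iris(return_X_y=True)       # y holds 3 discrete class labels

# Linear regression treats y as continuous: unbounded floats, not class labels.
reg = LinearRegression().fit(X, y)
print(reg.predict(X[:3]))

# Softmax (multinomial logistic) regression yields labels and probabilities.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:3]))               # discrete class labels
print(clf.predict_proba(X[:3]).round(2))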
7. Where do you use polynomial regression?
• Polynomial regression is used when the relationship between the independent variable(s) and
the dependent variable is non-linear but can be approximated by a polynomial function.
• Use Cases:
• Non-linear Trends: When data shows curves or higher-order patterns (e.g., quadratic,
cubic) that linear regression can’t capture.
• Scientific Modeling: E.g., modeling growth rates, chemical reactions, or physical
phenomena with polynomial relationships.
• Feature Engineering: To capture non-linear effects in datasets with low dimensionality.
• Limitations: Avoid high-degree polynomials to prevent overfitting; consider regularization
(e.g., Ridge) or other models (e.g., splines, decision trees) for complex patterns.
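
• A minimal scikit-learn sketch fitting a quadratic trend; the synthetic data and the degree choice are illustrative:

# Sketch: polynomial regression on a quadratic trend (assumes scikit-learn).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 2 * X.ravel() ** 2 - X.ravel() + rng.normal(scale=1.0, size=50)  # curve + noise

# degree=2 captures the curve; much higher degrees risk overfitting (see above).
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[1.5]]))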
8. What is PCA and the steps for the reduction technique in PCA?
• Principal Component Analysis (PCA): A dimensionality reduction technique that transforms
high-dimensional data into a lower-dimensional space while retaining most variance. It finds
orthogonal axes (principal components) that maximize variance.
• Steps:
1. Standardize the data (mean = 0, variance = 1) to ensure equal feature contribution.
2. Compute the covariance matrix to understand feature relationships.
3. Perform eigenvalue decomposition (or SVD) on the covariance matrix to find principal components (eigenvectors) and their importance (eigenvalues).
4. Sort eigenvalues in descending order and select the top k components for the desired dimensionality.
5. Project the data onto the selected components to obtain the reduced dataset.
• Output: Lower-dimensional data with minimal information loss.
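
• A minimal NumPy sketch following these steps (illustrative; sklearn.decomposition.PCA is the usual library route):

# Sketch of the PCA steps above with plain NumPy (eigendecomposition route).
import numpy as np

def pca(X, k):
    # Step 1: standardize (zero mean, unit variance per feature).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: covariance matrix of the standardized features.
    cov = np.cov(Xs, rowvar=False)
    # Step 3: eigendecomposition (eigh suits symmetric matrices).
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Step 4: sort components by descending eigenvalue, keep the top k.
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:k]]
    # Step 5: project the data onto the selected components.
    return Xs @ components

X = np.random.rand(100, 5)
X_reduced = pca(X, k=2)        # 100 samples, now in 2 dimensions
print(X_reduced.shape)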
9. What is the formula for Manhattan distance?
• The Manhattan distance (L1 norm) between two points $A = (x_1, x_2, \dots, x_n)$ and
$B = (y_1, y_2, \dots, y_n)$ in n-dimensional space is:

$d(A, B) = \sum_{i=1}^{n} |x_i - y_i|$

• Example: For $A = (1, 2)$ and $B = (4, 6)$, $d(A, B) = |1 - 4| + |2 - 6| = 7$.
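• A one-line NumPy sketch of this formula:

# Manhattan (L1) distance between two n-dimensional points.
import numpy as np

def manhattan(a, b):
    return np.sum(np.abs(np.asarray(a) - np.asarray(b)))

print(manhattan([1, 2], [4, 6]))   # |1-4| + |2-6| = 7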