
Machine Learning

Unit-I
Dr. Jagan. T
Professor
Department of ECE, GRIET
Introduction to Machine Learning
• Machine learning is a subfield of artificial intelligence (AI) that focuses on enabling
computers to learn from data without being explicitly programmed. It involves algorithms
and statistical models that allow computers to improve their performance on a specific task
over time, based on the data they are exposed to.
Key Concepts in Machine Learning:
• Algorithms: These are sets of rules and procedures that guide the learning process.
Different algorithms are suited for different types of data and tasks.
• Data: Machine learning models require data to learn from. This data can be labeled (for
supervised learning) or unlabeled (for unsupervised learning).
• Training: The process of feeding data to the algorithm to enable it to learn patterns and
relationships.
• Model: The output of the training process, representing the learned patterns and used for
making predictions or decisions on new data.
Types of Machine Learning

• Supervised Learning: The algorithm learns from labeled data, where the correct
output is provided. Examples include classification (e.g., spam detection) and
regression (e.g., predicting house prices).

• Unsupervised Learning: The algorithm learns from unlabeled data, identifying patterns and structures without explicit guidance. Examples include clustering (e.g., customer segmentation) and dimensionality reduction (e.g., feature extraction).

• Reinforcement Learning: The algorithm learns through trial and error, receiving
rewards or penalties for its actions. It's often used in robotics and game playing.
Applications of Machine Learning:

• Image Recognition: Identifying objects or faces in images.
• Natural Language Processing: Understanding and generating human language.
• Recommendation Systems: Suggesting products or content based on user preferences.
• Fraud Detection: Identifying potentially fraudulent transactions.
• Medical Diagnosis: Assisting in disease diagnosis and treatment planning.
Etc.
Benefits of Machine Learning:

• Automation: Automating tasks that traditionally require human intelligence.
• Improved Accuracy: Making more accurate predictions and decisions compared to traditional methods.
• Personalization: Tailoring experiences to individual users.
• Insights: Discovering hidden patterns and insights in data.


Challenges of Machine Learning:

• Data Requirements: Machine learning models often require large amounts of high-quality data.
• Complexity: Developing and deploying machine learning models can be complex.
• Bias: Machine learning models can inherit biases present in the data.
• Interpretability: Understanding why a machine learning model makes certain decisions can be challenging.
The Future of Machine Learning:

• Machine learning is a rapidly evolving field with tremendous potential. Ongoing research is focused on areas such as:
• Deep Learning: Using artificial neural networks with multiple layers to learn complex patterns.
• Explainable AI: Developing methods to make machine learning decisions more transparent and understandable.
• Federated Learning: Training machine learning models on decentralized data sources without sharing the data itself.
Introduction - What is Feature Engineering?

• The process of selecting, transforming, and creating features from raw data.
• A crucial step in machine learning, often more impactful than
algorithm selection.
• Aims to improve the performance of machine learning models.
• Bridges the gap between raw data and model understanding.
Introduction - What is Feature Engineering?

A data pipeline for feature creation is typically visualized as a series of connected boxes, each representing a distinct step in the process: raw data ingestion from various sources, data cleaning, transformation (including feature engineering), and finally the creation of new, refined features ready for machine learning modeling. Arrows indicate the data flow between stages and highlight the key transformations applied at each step, such as binning, scaling, or one-hot encoding.
Introduction - What is Feature Engineering?
Introduction - What is Feature Engineering?

Key elements in a feature creation data pipeline visualization:


Data Source:
•Boxes representing the initial raw data sources, like databases, APIs, or files.
Data Ingestion:
•A box signifying the process of extracting data from sources and loading it into the
pipeline.
Data Cleaning:
•A stage where data is cleaned and preprocessed, including handling missing values,
outliers, and formatting inconsistencies.
Introduction - What is Feature Engineering?

Feature Engineering:
•A set of boxes representing different feature engineering techniques, like binning,
scaling, creating interaction features, or applying domain-specific transformations.
Feature Selection:
•A box indicating the process of choosing the most relevant features for the model.
Data Transformation:
•Boxes showing operations like one-hot encoding, normalization, or log
transformation.
Feature Storage:

•A box representing where the final engineered features are stored, like a data
warehouse or a dedicated feature store.
Why Bother with Feature Engineering?

•Improved Model Performance: Better features lead to more accurate and robust models.
•Faster Training: Relevant features can reduce training time.
•Enhanced Interpretability: Engineered features can make models easier to understand.
•Handles Missing Data: Feature engineering can address missing values effectively.
•Addresses Outliers: Techniques can mitigate the impact of outliers.
•Image: A graph showing improved model performance with feature engineering.
Types of Feature Engineering Techniques
• (Brief explanations and examples for each):
• Data Cleaning: Handling missing values, outliers, and inconsistencies. (e.g.,
imputation, outlier removal)
• Feature Scaling: Normalization and standardization. (e.g., Min-Max scaling, Z-
score normalization)
• Encoding Categorical Variables: One-hot encoding, label encoding. (e.g.,
converting colors to numerical representations)
• Creating New Features: Polynomial features, interaction features, domain-specific
features. (e.g., creating BMI from height and weight)
• Feature Transformation: Log transformation, square root transformation. (e.g.,
handling skewed data)
• Feature Selection: Identifying the most relevant features. (e.g., filter methods,
wrapper methods, embedded methods)
• Image: Icons representing each technique.
Data Cleaning

• Missing Value Imputation: Mean/median/mode imputation, K-NN imputation.
• Outlier Handling: Identifying and treating outliers using box plots, IQR, etc.
• Handling Inconsistent Data: Correcting typos, inconsistencies in formatting.
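
As a quick illustration (not part of the original slides), here is a minimal Python sketch of these cleaning steps, assuming pandas and scikit-learn are available; the column names and values are made up for the example.

```python
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy data (hypothetical): a numeric column with a missing value and an extreme
# outlier, plus a categorical column with a missing value.
df = pd.DataFrame({
    "income": [42_000.0, None, 58_000.0, 61_000.0, 1_000_000.0],
    "city": ["Hyderabad", "Hyderabad", None, "Delhi", "Delhi"],
})

# Missing value imputation: median for the numeric column
# (KNNImputer is scikit-learn's option for K-NN imputation),
# and the mode (most frequent value) for the categorical column.
df["income"] = SimpleImputer(strategy="median").fit_transform(df[["income"]]).ravel()
df["city"] = df["city"].fillna(df["city"].mode()[0])

# Outlier handling with the IQR rule: keep points inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
df_clean = df[df["income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]
print(df_clean)
```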


Data Cleaning
Feature scaling

• Feature scaling is one of the most important data preprocessing steps in machine learning. Algorithms that compute distances between features are biased towards numerically larger values if the data is not scaled.
• Tree-based algorithms are fairly insensitive to the scale of the features. Feature scaling also helps machine learning and deep learning algorithms train and converge faster.
• Normalization and Standardization are the most popular feature scaling techniques, and at the same time the most frequently confused ones.
Scaling Up: Feature Scaling

•Normalization (Min-Max Scaling): Scales features to a range between 0 and 1.

•Standardization (Z-score Normalization): Scales features to have a mean of 0 and a standard deviation of 1.

Normalization (Min-Max Scaling):
This is used to transform features onto a similar scale. Each new value is calculated as

X' = (X − Xmin) / (Xmax − Xmin)

• This scales the range to [0, 1] or sometimes [-1, 1].
• Geometrically speaking, the transformation squishes the n-dimensional data into an n-dimensional unit hypercube.
• Normalization is useful when there are no outliers, as it cannot cope with them. Usually, we would scale age and not income, because only a few people have high incomes but age is close to uniform.
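
A minimal NumPy/scikit-learn sketch of Min-Max scaling (the age/income values are illustrative, not from the slides):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[25.0, 40_000.0],
              [32.0, 52_000.0],
              [47.0, 150_000.0]])      # columns: age, income

# By hand, per column: X' = (X - Xmin) / (Xmax - Xmin)
X_manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Equivalent with scikit-learn
X_scaled = MinMaxScaler().fit_transform(X)

print(np.allclose(X_manual, X_scaled))   # True: both map every column to [0, 1]
```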
Standardization or Z-Score Normalization

• Standardization or Z-Score Normalization is the transformation of features by subtracting the mean and dividing by the standard deviation. The result is often called the Z-score:

Z = (X − μ) / σ
Standardization or Z-Score Normalization
Standardization can be helpful in cases where the data follows a Gaussian distribution. However, this does not necessarily have to be true.

• Geometrically speaking, it translates the mean vector of the original data to the origin and squishes or expands the points so that the standard deviation becomes 1.

• We are only shifting the mean and rescaling the standard deviation; a normal distribution stays normal, so the shape of the distribution is not affected.
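
And a matching sketch for standardization, using the same illustrative matrix:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[25.0, 40_000.0],
              [32.0, 52_000.0],
              [47.0, 150_000.0]])

# By hand, per column: Z = (X - mu) / sigma
Z_manual = (X - X.mean(axis=0)) / X.std(axis=0)

# Equivalent with scikit-learn (StandardScaler also uses the population std, ddof=0)
Z_scaled = StandardScaler().fit_transform(X)

print(Z_scaled.mean(axis=0).round(6))   # ~[0, 0]
print(Z_scaled.std(axis=0))             # [1, 1]
```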
Encoding Categorical Variables
• One-Hot Encoding: Creates binary
columns for each category.
• Label Encoding: Assigns a unique integer
to each category.
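
A minimal pandas/scikit-learn sketch of both encodings, using the colour example mentioned earlier (the column name is illustrative):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# One-hot encoding: one binary column per category.
one_hot = pd.get_dummies(df["color"], prefix="color")

# Label encoding: one integer per category (implies an ordering, so use with care
# for nominal variables; it is fine for target labels).
df["color_label"] = LabelEncoder().fit_transform(df["color"])

print(one_hot)
print(df)
```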
Creating New Features

•Polynomial Features: Creating higher-degree versions of existing features.
•Interaction Features: Combining multiple features.
•Domain-Specific Features: Leveraging domain knowledge to create
relevant features.
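
A minimal sketch of feature creation, using the BMI example from earlier plus scikit-learn's PolynomialFeatures (units and values are illustrative):

```python
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

df = pd.DataFrame({"height_m": [1.60, 1.75, 1.82],
                   "weight_kg": [55.0, 72.0, 95.0]})

# Domain-specific feature: BMI = weight / height^2
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

# Polynomial and interaction features up to degree 2 on the original columns
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(df[["height_m", "weight_kg"]])

print(poly.get_feature_names_out())   # includes the interaction term height_m*weight_kg and the squares
print(X_poly.shape)                   # (3, 5)
```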
Feature Transformation

• Transforming Data: Feature Transformation

• Log Transformation: Useful for handling skewed data.
• Square Root Transformation: Another technique for dealing with skewed data.
Feature Transformation
• Log Transformation:

By taking the logarithm of a data set, large values are compressed, effectively reducing the impact of extreme outliers and making the distribution appear more symmetrical. This is particularly useful when data is skewed to the right, with a few very large values.

• Square Root Transformation:

This transformation has a milder effect on the data compared to the log transformation, making it suitable for situations where the skewness is not as extreme. It can also be applied to data with zero values, unlike the log transformation.
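
A minimal NumPy sketch of both transforms on a right-skewed feature (the values are illustrative):

```python
import numpy as np

income = np.array([20_000, 25_000, 30_000, 42_000, 1_200_000], dtype=float)

log_income = np.log1p(income)    # log(1 + x): also defined at zero, unlike log(x)
sqrt_income = np.sqrt(income)    # milder compression than the log transform

# The huge value dominates far less after transforming.
print(income.max() / income.min())            # ~60
print(log_income.max() / log_income.min())    # ~1.4
print(sqrt_income.max() / sqrt_income.min())  # ~7.7
```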


Feature Transformation

• Transforming Data: Feature Transformation

• Log transformation is a widely used technique to handle skewed data, while square root transformation can also be used to address skewness, particularly when dealing with positively skewed data; however, log transformation is often considered more effective for heavily skewed distributions, especially when the large values significantly skew the data.
Feature Transformation
Feature Selection
•Filter Methods: Using statistical measures to rank features.
•Wrapper Methods: Evaluating subsets of features.
•Embedded Methods: Feature selection integrated into the model training process.
Feature Selection
•In feature selection:
•"Filter Methods" evaluate features individually based on statistical measures like correlation, ranking them by their apparent relevance to the target variable, while
•"Wrapper Methods" test different combinations of features by training a model on each subset to find the best-performing set, and
•"Embedded Methods" incorporate feature selection directly within the model training process, allowing the model to learn which features are most important during training itself.
Feature Selection
Key points about each method:
•Filter Methods:
•Pros: Fast, computationally efficient, easy to interpret.
•Cons: May miss important feature interactions, as features are evaluated
independently.
•Examples: Chi-square test, correlation coefficient, mutual information.

•Wrapper Methods:
•Pros: Can potentially find optimal feature subsets for a specific model.
•Cons: Computationally expensive, can overfit to the chosen model.
•Examples: Forward selection, backward elimination, recursive feature elimination
(RFE).
Feature Selection

Key points about each method:

•Embedded Methods:
•Pros: Combines advantages of filter and wrapper methods; less computationally intensive than pure wrapper methods.
•Cons: May be model-specific, requiring careful selection of the model to use.
•Examples: Lasso regression (L1 regularization), decision trees, Random Forests. (A short sketch of all three families follows below.)
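
As a quick illustration (not from the slides), here is a minimal scikit-learn sketch of one method from each family on a synthetic classification dataset:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, n_informative=4, random_state=0)

# Filter: rank features individually by a univariate statistic (ANOVA F-score).
filter_sel = SelectKBest(score_func=f_classif, k=4).fit(X, y)

# Wrapper: recursive feature elimination around a chosen model.
wrapper_sel = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4).fit(X, y)

# Embedded: L1 (Lasso-style) regularization pushes unhelpful coefficients to zero.
embedded_sel = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
).fit(X, y)

print(filter_sel.get_support())    # boolean mask of the kept features
print(wrapper_sel.get_support())
print(embedded_sel.get_support())
```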
Dimensionality:

• Dimensionality reduction is a powerful tool for dealing with high-dimensional data. By reducing the number of features, it can improve model performance, computational efficiency, and data visualization.

• What is Dimensionality Reduction?

• Dimensionality reduction is a technique that reduces the number of features in a dataset while preserving its important information. It's like summarizing a long article into a few key points - you lose some details, but you keep the main idea.
Dimensionality:
• Why is it Important?
• The Curse of Dimensionality: When dealing with high-dimensional data, the data becomes
sparse, making it harder to find meaningful patterns. This can lead to overfitting, where the
model performs well on the training data but poorly on new data.

• Computational Efficiency: Training models on high-dimensional data can be computationally expensive and time-consuming. Reducing the number of features can significantly speed up the training process.

• Improved Model Performance: By removing irrelevant or redundant features, dimensionality reduction can improve the accuracy and generalization ability of machine learning models.

• Data Visualization: It's often easier to visualize data in 2D or 3D. Dimensionality reduction
can help project high-dimensional data into a lower-dimensional space for visualization.
Dimensionality:

• How Does it Work?
• There are various techniques for dimensionality reduction, including:
• Feature Selection: Choosing a subset of the most relevant features
and discarding the rest.
• Feature Extraction: Creating new features by combining or
transforming the original features.
Dimensionality:

Common Techniques:
• Principal Component Analysis (PCA): A linear technique that finds
the directions of maximum variance in the data and projects the data
onto those directions.
• Linear Discriminant Analysis (LDA): A technique that finds the
linear combinations of features that best separate different classes.
Dimensionality:

Principal Component Analysis (PCA) is an unsupervised technique that identifies the directions of greatest variance within a dataset and projects the data onto these directions to reduce dimensionality.
Dimensionality:

• Principal Component Analysis (PCA) is a linear dimensionality reduction technique designed to extract a new set of variables from an existing high-dimensional dataset.

• Its primary goal is to reduce the dimensionality of the data while preserving as much variance as possible.

• PCA is an unsupervised algorithm that creates linear combinations of the original features, known as principal components.

• These components are calculated such that the first one captures the maximum variance in the dataset, while each subsequent component explains the remaining variance without being correlated with the previous ones.
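
A minimal scikit-learn sketch of PCA on the classic Iris data (4 features reduced to 2 principal components); the quoted variance ratios are approximate:

```python
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

X_std = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scales
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_std)

print(pca.explained_variance_ratio_)        # roughly [0.73, 0.23]
print(X_2d.shape)                           # (150, 2)
```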
Dimensionality:

Linear Discriminant Analysis (LDA) is a supervised technique that finds linear combinations of features that best separate different classes in a dataset, focusing on maximizing class separability rather than overall variance.
Dimensionality:

• Linear discriminant analysis (LDA) is an approach used in supervised machine learning to solve multi-class classification problems.

• LDA separates multiple classes with multiple features through data dimensionality reduction.

• This technique is important in data science as it helps optimize machine learning models.
PCA:
Advantages and Disadvantages of Principal Component Analysis
Advantages of Principal Component Analysis
1. Multicollinearity Handling: Creates new, uncorrelated variables to address issues when original features are highly correlated.
2. Noise Reduction: Eliminates components with low variance (assumed to be noise), enhancing data clarity.
3. Data Compression: Represents data with fewer components, reducing storage needs and speeding up processing.
4. Outlier Detection: Identifies unusual data points by showing which ones deviate significantly in the reduced space.
Dimensionality:
Disadvantages of Principal Component Analysis
1. Interpretation Challenges: The new components are combinations of original variables, which can be hard to explain.
2. Data Scaling Sensitivity: Requires proper scaling of data before application, or results may be misleading.
3. Information Loss: Reducing dimensions may lose some important information if too few components are kept.
4. Assumption of Linearity: Works best when relationships between variables are linear, and may struggle with non-linear data.
5. Computational Complexity: Can be slow and resource-intensive on very large datasets.
6. Risk of Overfitting: Using too many components or working with a small dataset might lead to models that don't generalize well.
PCA: Example
Let's understand how it works in simple terms:
Imagine you’re looking at a messy cloud of data points (like stars in the sky) and want to
simplify it. PCA helps you find the “most important angles” to view this cloud so you don’t
miss the big patterns. Here’s how it works, step by step:

Step 1: Standardize the Data

Make sure all features (e.g., height, weight, age) are on the same scale. Why? A feature like “salary” (ranging 0–100,000) could otherwise dominate “age” (0–100). Standardizing our dataset ensures that each variable has a mean of 0 and a standard deviation of 1:

Z = (X − μ) / σ

Here, μ is the mean of the independent features, μ = {μ1, μ2, ⋯, μm},
•and σ is the standard deviation of the independent features, σ = {σ1, σ2, ⋯, σm}.
PCA: Example
Step 2: Find Relationships
Calculate how features move together using a covariance matrix. Covariance measures the strength of joint variability between two or more variables, indicating how much they change in relation to each other. For two features x1 and x2, the (sample) covariance is

cov(x1, x2) = Σ (x1i − x̄1)(x2i − x̄2) / (n − 1)

The value of covariance can be positive, negative, or zero.

•Positive: As x1 increases, x2 also increases.
•Negative: As x1 increases, x2 decreases.
•Zero: No direct relation.


PCA: Example
Step 3: Find the “Magic Directions” (Principal Components)
•PCA identifies new axes (like rotating a camera) where the data spreads out the most:

• 1st Principal Component (PC1): The direction of maximum variance (most spread).
• 2nd Principal Component (PC2): The next best direction, perpendicular to PC1, and so on.
•These directions are calculated using eigenvectors (the math tools that find these axes), and their importance is ranked by eigenvalues (how much variance each captures).
For a square matrix A, an eigenvector X (a non-zero vector) and its corresponding eigenvalue λ (a scalar) satisfy:
AX = λX
PCA: Example
This means:
•When A acts on X, it only stretches or shrinks X by the scalar λ.
•The direction of X remains unchanged (hence, eigenvectors define the “stable directions” of A).

It can also be written as:
AX − λX = 0
(A − λI)X = 0
where I is the identity matrix of the same shape as matrix A.

The above conditions hold only if (A − λI) is non-invertible (i.e., a singular matrix). That means
∣A − λI∣ = 0
This determinant equation is called the characteristic equation.
•Solving it gives the eigenvalues λ,
•and the corresponding eigenvectors can then be found using the equation AX = λX.
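
A minimal NumPy sketch of this step: build a covariance matrix from two correlated features and read off its eigenvalues and eigenvectors (synthetic data, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
X[:, 1] = 0.8 * X[:, 0] + 0.3 * X[:, 1]   # make the second feature depend on the first

A = np.cov(X, rowvar=False)               # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(A)      # eigh: eigen-decomposition for symmetric matrices

order = np.argsort(eigvals)[::-1]         # largest eigenvalue first -> that eigenvector is PC1
print(eigvals[order])                     # variance captured along PC1, PC2
print(eigvecs[:, order])                  # columns are the principal directions
```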
PCA: Example

Step 4: Pick the Top Directions & Transform Data

•Keep only the top 2–3 directions (or enough to capture ~95% of the variance).
•Project the data onto these directions to get a simplified, lower-dimensional version.
PCA is an unsupervised learning algorithm, meaning it doesn’t require prior knowledge of
target variables.
It’s commonly used in exploratory data analysis and machine learning to simplify datasets
without losing critical information.
We know this all sounds complicated, so let's look at it again with the help of a visual example where the x-axis (Radius) and y-axis (Area) represent two original features in the dataset.
PCA: Example
PCA: Example

Principal Components (PCs):

•PC₁ (First Principal Component): The direction along which the data has the maximum variance. It captures the most important information.
•PC₂ (Second Principal Component): The direction orthogonal (perpendicular) to PC₁. It captures the remaining variance but is less significant.
PCA: Example
Now, the red dashed lines indicate the spread (variance) of data along different directions. The variance along PC₁ is greater than along PC₂, which means that PC₁ carries more useful information about the dataset.

•The data points (blue dots) are projected onto PC₁, effectively reducing the dataset from two dimensions (Radius, Area) to one dimension (PC₁).
•This transformation simplifies the dataset while retaining most of the original variability.

The image visually explains why PCA selects the direction with the highest variance (PC₁). By removing PC₂, we reduce redundancy while keeping essential information. The transformation helps in data compression, visualization, and improved model performance.
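
Putting the four steps together, here is a minimal NumPy sketch that mirrors the Radius/Area example above (the data is synthetic, generated only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
radius = rng.uniform(1.0, 10.0, size=200)
area = np.pi * radius ** 2 + rng.normal(scale=5.0, size=200)   # strongly related to radius
X = np.column_stack([radius, area])

# Step 1: standardize each feature.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 2: covariance matrix of the standardized data.
C = np.cov(Z, rowvar=False)

# Step 3: eigen-decomposition; sort directions by how much variance they capture.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 4: keep only PC1 and project the 2-D data onto it.
X_pc1 = Z @ eigvecs[:, :1]
print("variance explained by PC1:", eigvals[0] / eigvals.sum())   # close to 1 here
print(X_pc1.shape)                                                # (200, 1)
```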
Linear Discriminant Analysis in Machine Learning

When working with high-dimensional datasets it is important to apply dimensionality reduction techniques to make data exploration and modeling more efficient.

One such technique is Linear Discriminant Analysis (LDA), which helps in reducing the dimensionality of data while retaining the most significant features for classification tasks.

It works by finding the linear combinations of features that best separate the classes in the dataset. In this unit we will learn about it and how to implement it in Python (a short sketch appears at the end of this section).
Linear Discriminant Analysis in Machine Learning

Maximizing Class Separability: The Role of LDA

Linear Discriminant Analysis (LDA), also known as Normal Discriminant Analysis, is a supervised classification technique that helps separate two or more classes by converting a higher-dimensional data space into a lower-dimensional space.

It is used to identify a linear combination of features that best separates the classes within a dataset.
Linear Discriminant Analysis in Machine Learning

For example, suppose we have two classes that need to be separated efficiently. Each class may have multiple features, and using a single feature to classify them may result in overlapping.

To solve this, LDA is used, as it combines multiple features to improve classification accuracy.
Linear Discriminant Analysis in Machine Learning

Core Assumptions of LDA

For LDA to perform effectively, certain assumptions are made:
1. Gaussian Distribution: Data within each class should follow a Gaussian distribution.
2. Equal Covariance Matrices: Covariance matrices of the different classes should be equal.
3. Linear Separability: A linear decision boundary should be sufficient to separate the classes.
Linear Discriminant Analysis in Machine Learning

For example, when data points belonging to two classes are plotted and they are not linearly separable, LDA will attempt to find a projection that maximizes class separability.
Linear Discriminant Analysis in Machine Learning

The image shows an example where the classes (black and green circles) are not linearly separable. LDA attempts to separate them using the red dashed line.

It uses both axes (X and Y) to generate a new axis in such a way that it maximizes the distance between the means of the two classes while minimizing the variation within each class.

This transforms the dataset into a space where the classes are better separated.
Linear Discriminant Analysis in Machine Learning

• After transforming the data points along a new axis, LDA maximizes the class separation. This new axis allows for clearer classification by projecting the data along a line that enhances the distance between the means of the two classes.
Linear Discriminant Analysis in Machine Learning
• Perpendicular distance between the decision boundary and the data points helps
us to visualize how LDA works by reducing class variation and increasing
separability.

• After generating this new axis using the above-mentioned criteria, all the data
points of the classes are plotted on this new axis and are shown in the figure
given below.
Linear Discriminant Analysis in Machine Learning
• It shows how LDA creates a new axis to project the data and separate the two
classes effectively along a linear path.

• But it fails when the means of the distributions are shared, as it becomes impossible for LDA to find a new axis that makes both classes linearly separable. In such cases we use non-linear discriminant analysis.
Linear Discriminant Analysis in Machine Learning
Advantages of LDA
•Simple and computationally efficient.

•Works well even when the number of features is much larger than the number of training
samples.
•Can handle multicollinearity.

Disadvantages of LDA
•Assumes Gaussian distribution of data which may not always be the case.

•Assumes equal covariance matrices for different classes which may not hold in all datasets.

•Assumes linear separability which is not always true.

•May not always perform well in high-dimensional feature spaces.


Linear Discriminant Analysis in Machine Learning

Applications of LDA

1. Face Recognition: It is used to reduce the high-dimensional feature space of pixel values in face recognition applications, helping to identify faces more efficiently.
2. Medical Diagnosis: It classifies disease severity as mild, moderate, or severe based on patient parameters, helping in decision-making for treatment.
3. Customer Identification: It can help identify customer segments most likely to purchase a specific product based on survey data.
Linear Discriminant Analysis in Machine Learning

Linear Discriminant Analysis (LDA) is a technique for dimensionality reduction that not only simplifies high-dimensional data but also enhances the performance of models by maximizing class separability.

By converting data into a lower-dimensional space, it helps us improve the accuracy of classification tasks.
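
As promised above, here is a minimal Python sketch of LDA with scikit-learn, used both as a supervised dimensionality-reduction step and as a classifier on the Iris data:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis(n_components=2)   # at most (n_classes - 1) components
X_train_2d = lda.fit_transform(X_train, y_train)   # 4 features -> 2 discriminant axes

print("projected shape:", X_train_2d.shape)        # (112, 2) with the default split
print("test accuracy:", lda.score(X_test, y_test)) # LDA also works as a classifier
```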
