23ECE205 FoDS 13 Introduction To ML

Uploaded by

rohithdhoni86

Om

23ECE205 Foundations of
Data Science

Introduction to Machine
Learning
Dr. Binoy B. Nair

Introduction

• Why Machine Learning?

• What Is Machine Learning?

• What Kind of Data Can Be Mined?

• What Technologies Are Used?

• What Kind of Applications Are Targeted?

Why Machine Learning?

• The Explosive Growth of Data: from terabytes to petabytes

• Data collection and data availability
• Automated data collection tools, database systems, Web,
computerized society
• Major sources of abundant data
• Business: Web, e-commerce, transactions, stocks, …
• Science: Remote sensing, bioinformatics, scientific simulation, …
• Society and everyone: news, digital cameras, YouTube

• We are drowning in data, but starving for knowledge!

• “Necessity is the mother of invention”—Machine Learning—
Automated analysis of massive data sets
Introduction

• Why Machine Learning?

• What Is Machine Learning?

• What Kind of Data Can Be Mined?

• What Kinds of Patterns Can Be Mined?

• What Technologies Are Used?

• What Kind of Applications Are Targeted?

What Is Machine Learning?

• Machine Learning (knowledge discovery from data)

• Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data

• Alternative names: Knowledge discovery (mining) in databases (KDD), Data Mining, knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
Machine Learning
Workflow

In other words

Types of ‘Learning’ in Machine Learning

Machine Learning
• Supervised Learning: Classification, Regression
• Unsupervised Learning: Clustering, Dimensionality Reduction, Anomaly Detection, Association Rule Mining
• Reinforcement Learning

Supervised Learning


Supervised learning is a type of machine learning where a model is trained on a labeled dataset.

In this context, "labeled" means that each training example has input data (features) and the corresponding correct output (label or target).

The goal of supervised learning is to learn a mapping from inputs to outputs, so the model can predict the correct output for new, unseen inputs.
Key Components of Supervised
Learning
1. Training Data: A dataset that includes both input features (e.g., Age, Income) and their corresponding labels (e.g., 'Fiction' or 'NonFiction').
2. Model: A mathematical function or algorithm that maps inputs to outputs.
3. Loss Function: Measures how far the model's predictions are from the actual labels. The model is trained to minimize this loss.
4. Optimization Algorithm: Adjusts the model's parameters (weights) to improve predictions by minimizing the loss function, commonly using methods like gradient descent.
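
The four components above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the toy data, the straight-line model, the mean squared error loss, and the learning rate and step count are all assumptions chosen for the example, not part of the course material.

```python
# 1. Training data: inputs xs with labels ys (invented values lying on y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# 2. Model: a straight line with two parameters (weight w and bias b).
def predict(w, b, x):
    return w * x + b

# 3. Loss function: mean squared error between predictions and labels.
def mse_loss(w, b):
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# 4. Optimization algorithm: batch gradient descent on the MSE loss.
def fit(lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        # Move the parameters a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = fit()
print(f"w = {w:.3f}, b = {b:.3f}, loss = {mse_loss(w, b):.2e}")
```

On this toy data the learned parameters approach w = 2 and b = 1, the line that generated the labels.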

Types of Supervised Learning

Most common types of supervised learning:

1. Classification: Predicts a discrete label (e.g., classifying emails as 'spam' or 'not spam').

2. Regression: Predicts a continuous value (e.g., predicting house prices based on features like size and location).
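
To make the difference concrete, the sketch below uses the same nearest-neighbour idea for both tasks: a vote over labels for classification and an average over values for regression. k-nearest neighbours is swapped in here purely for illustration, and the single-feature datasets (`emails`, `houses`) are invented.

```python
from collections import Counter

def k_nearest(train, x, k):
    """Return the k training pairs whose feature value is closest to x."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]

def knn_classify(train, x, k=3):
    """Classification: majority vote over the neighbours' discrete labels."""
    labels = [label for _, label in k_nearest(train, x, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, x, k=3):
    """Regression: average of the neighbours' continuous target values."""
    values = [value for _, value in k_nearest(train, x, k)]
    return sum(values) / len(values)

# Invented data: (number of suspicious words in an email, label).
emails = [(0, "not spam"), (1, "not spam"), (2, "not spam"),
          (7, "spam"), (8, "spam"), (9, "spam")]

# Invented data: (house size, price), in arbitrary units.
houses = [(50, 100.0), (60, 120.0), (70, 140.0), (80, 160.0), (100, 200.0)]

print(knn_classify(emails, 8.5))     # a discrete label
print(knn_regress(houses, 65, k=2))  # a continuous value
```

The only difference between the two functions is how the neighbours' outputs are combined, which mirrors the discrete versus continuous distinction above.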

Classification: Example- Predict survival on the Titanic [3]

• On April 15, 1912, during her maiden voyage, RMS Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew.

• Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive.

• We try to use classification to predict which passengers are more likely to survive such future tragedies (assuming that we are still living in 1912 and that our hero has booked the next boat ticket to the USA via the same route).
Classification: Example- Titanic dataset

• Each row is called an observation or a sample.
• Each column is called a feature or attribute.
• Attributes can be numeric, logical, ordinal, nominal or one of several other types.
• The classification dataset will always have the observed class. Here we have two classes denoted by 0 and 1.
• Issues like missing data are very common.
Classification: Example- Survival on
Titanic

• The result of classification using a decision tree classifier (we will learn how this is obtained later) is given alongside.

• Let’s now check the possible outcome for our hero Jack, given that he is a 20-year-old male with no parent or child accompanying him, by looking at the rule generated by the classifier: Jack might wind up dead.
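
As a rough idea of where such a rule comes from, the sketch below picks the single yes/no test that best separates survivors from non-survivors using Gini impurity, which is one common criterion for growing decision trees. The records and candidate tests are invented for illustration. They are not the real Titanic data, and the resulting rule is not the one shown on the slide.

```python
# Invented toy records (NOT the real Titanic data): (sex, age, survived).
records = [
    ("female", 29, 1), ("female", 40, 1), ("female", 8, 1), ("female", 55, 0),
    ("male", 22, 0), ("male", 35, 0), ("male", 6, 1), ("male", 48, 0),
]

def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split_impurity(test):
    """Weighted Gini impurity of the two groups produced by a yes/no test."""
    yes = [r[2] for r in records if test(r)]
    no = [r[2] for r in records if not test(r)]
    n = len(records)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)

# Hypothetical candidate tests; a real tree learner would generate these itself.
candidate_tests = {
    "sex == male": lambda r: r[0] == "male",
    "age < 10": lambda r: r[1] < 10,
    "age < 30": lambda r: r[1] < 30,
}

# The root of the tree is the test with the lowest weighted impurity.
best_test = min(candidate_tests, key=lambda name: split_impurity(candidate_tests[name]))

def predict(person):
    """Predict the majority class of the branch the person falls into (ties go to 1)."""
    test = candidate_tests[best_test]
    branch_labels = [r[2] for r in records if test(r) == test(person)]
    return int(2 * sum(branch_labels) >= len(branch_labels))

jack = ("male", 20, None)  # survival unknown; this is what we want to predict
print(best_test, "->", predict(jack))
```

A full decision tree repeats this split selection recursively inside each branch until the groups are pure enough.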

What a ‘clean’ dataset might look like

Input features: Sepal length, Sepal width, Petal length, Petal width. Output class (label for the features): Species. The sample no. (Sl. No.) is not a feature.

Sample No. | Sepal length | Sepal width | Petal length | Petal width | Species (class label)
1 | 5 | 3.5 | 6 | 1.5 | Setosa
2 | 5 | 3.2 | 6 | 1.5 | Setosa
3 | 4 | 4 | 5.5 | 1.4 | Setosa
4 | 7.4 | 9 | 2 | 2 | Setosa
5 | 7 | 9 | 2 | 5 | Versicolor
6 | 7 | 8 | 1.2 | 5 | Versicolor
7 | 8.6 | 6 | 1.5 | 6 | Versicolor
8 | 8 | 7 | 2.5 | 6.4 | Versicolor
Unsupervised Learning


Unsupervised learning is a type of ML where the model is trained on data that does not have labeled outputs.

Unsupervised learning focuses on finding hidden patterns, structures, or relationships within the input data without the guidance of known labels.

Key Characteristics of Unsupervised
Learning

• No Labels: The dataset contains only input data (features) without corresponding outputs or labels.

• Pattern Discovery: The model's objective is to learn the underlying structure or distribution in the data, such as grouping similar examples together or finding relationships between features.

• Exploratory: Unsupervised learning is often used for exploratory analysis to understand the data better or as a preprocessing step for other tasks.

Common Tasks in Unsupervised
Learning
1. Clustering: The process of grouping data points based on their
similarity.
• Example: Grouping customers into different market segments based on
purchasing behavior.
• Algorithms: k-means, hierarchical clustering, DBSCAN.
2. Dimensionality Reduction: Reducing the number of input
variables while retaining the most important information.
• Example: Reducing the number of features in an image dataset for
visualization or speeding up computation.
• Algorithms: PCA (Principal Component Analysis), t-SNE.
3. Anomaly Detection: Identifying rare or unusual data points
that don't fit the general pattern.
• Example: Detecting fraudulent transactions in banking data.
4. Association: Finding rules that describe relationships between
variables in the data.
• Example: Market basket analysis, where you identify items frequently
bought together in a store.
• Algorithms: Apriori, FP-Growth (which builds an FP-tree).
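
As an illustration of the clustering task, here is a minimal k-means sketch in plain Python. The customer points and the two starting centroids are invented, and the starting centroids are fixed by hand so the run is repeatable. Library implementations usually pick the starting centroids more carefully.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, initial_centroids, max_iters=100):
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iters):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the algorithm has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return assignments, centroids

# Invented customers: (monthly spend, store visits), in arbitrary units.
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
labels, centers = k_means(customers, initial_centroids=[(1.0, 1.0), (9.0, 9.0)])
print(labels, centers)
```

No labels are supplied anywhere: the two market segments emerge only from the distances between the points.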
Association Rule Mining Example: Market
Basket Analysis

What the store wants to know from your purchase list

Fig. MB analysis [4]

What it does with the mined associations

Association Rule Mining

A typical transaction database from a shop [5]

• Each row is called a transaction.
• Assume that each transaction no. denotes one purchase session.
• Rules derived will typically be of the form: Soy milk => Orange Juice, where "Soy milk" is the rule antecedent and "Orange Juice" is the rule consequent.
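
A rule of this form is usually judged by its support and confidence. The sketch below computes both for Soy milk => Orange Juice. The five transactions are an assumed example and may not match the database shown on the slide.

```python
# Assumed example transactions; each set is one purchase session.
transactions = [
    {"soy milk", "lettuce"},
    {"lettuce", "diapers", "wine", "chard"},
    {"soy milk", "diapers", "wine", "orange juice"},
    {"lettuce", "soy milk", "diapers", "wine"},
    {"lettuce", "soy milk", "diapers", "orange juice"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

antecedent, consequent = {"soy milk"}, {"orange juice"}
print("support:", support(antecedent | consequent))
print("confidence:", confidence(antecedent, consequent))
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds, rather than checking one rule at a time as done here.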
Unsupervised Learning Applications

Customer Segmentation in marketing

Anomaly Detection in security or fraud detection

Data Compression or feature extraction

Recommender Systems based on association rules

Example: Text document clustering

Popular Unsupervised Learning Algorithms

• Partitional, Hierarchical, Density Based Clustering
• Principal Component Analysis (PCA)
• Autoencoders

Reinforcement Learning


• RL is a type of machine learning where an agent learns to make decisions by interacting with an environment in order to maximize some notion of cumulative reward.
• Unlike supervised learning, where the correct input-output pairs are provided, or unsupervised learning, where the goal is to find hidden patterns in data, reinforcement learning focuses on learning from the consequences of actions taken within an environment.

Reinforcement Learning Working
1. Interaction: The agent interacts with the environment by observing
the current state, choosing an action, and receiving feedback in the
form of a reward.
2. Feedback: After taking an action, the environment moves to a new
state, and the agent receives a reward (positive or negative) based on
the action's outcome.
3. Learning: The agent updates its understanding of the environment
(typically by updating the value function or policy) based on the
reward and the new state.
4. Exploration vs. Exploitation: The agent must balance exploring new
actions (to find better strategies) with exploiting known actions that
give good rewards. This balance is crucial for maximizing long-term
rewards.
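
The loop described above can be sketched with tabular Q-learning, one common RL algorithm chosen here for illustration. The corridor environment, the reward of 1 at the goal, and the values of `ALPHA`, `GAMMA` and `EPSILON` are all assumptions made for this example.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical environment)
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate (assumed)

def step(state, action):
    """Environment: return (next_state, reward, done) for the chosen action."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def greedy_action(q, state, rng):
    """Exploit: pick the best-valued action, breaking ties at random."""
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}  # value table
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Exploration vs. exploitation (epsilon-greedy choice).
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state, rng)
            # Interaction and feedback: the environment returns a new state and a reward.
            next_state, reward, done = step(state, action)
            # Learning: move the estimate towards reward + discounted future value.
            target = reward if done else reward + GAMMA * max(q[next_state].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q

q = train()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # learned action for each non-goal cell
```

The agent is never told that moving right is correct. It infers this only from the reward received at the goal, which is the key difference from supervised learning.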

Applications of Reinforcement
Learning

Game Playing: RL has been used to train agents to play games like chess (AlphaZero), Go, and Atari video games.

Robotics: Training robots to perform tasks like walking, grasping objects, or flying drones.

Self-Driving Cars: Learning to navigate in complex environments with various inputs and outcomes.

Introduction

• Why Machine Learning?

• What Is Machine Learning?

• What Kind of Data Can Be Mined?

• What Technologies Are Used?

• What Kind of Applications Are Targeted?

Machine Learning: On What Kinds of
Data?
Database-oriented data sets and applications
• Relational database, data warehouse, transactional database
Advanced data sets and advanced applications
• Data streams and sensor data
• Time-series data, temporal data, sequence data (incl. bio-sequences)
• Heterogeneous databases and legacy databases
• Spatial data and spatiotemporal data
• Multimedia database
• Text databases
• The World-Wide Web

Introduction

• Why Machine Learning?

• What Is Machine Learning?

• A Multi-Dimensional View of Machine Learning

• What Kind of Data Can Be Mined?

• What Kinds of Patterns Can Be Mined?

• What Technologies Are Used?

• What Kind of Applications Are Targeted?

• Major Issues in Machine Learning

Machine Learning: Confluence of Multiple
Disciplines

Machine Learning (at the centre of the diagram) draws on:
• Statistics
• Database Technology
• Visualization
• High-Performance Computing

Why Confluence of Multiple Disciplines?

• Tremendous amount of data

• Algorithms must be highly scalable to handle data volumes such as terabytes

• High-dimensionality of data
• Micro-array may have tens of thousands of dimensions

• High complexity of data

• Data streams and sensor data
• Time-series data, temporal data, sequence data
• Structured data, graphs, social networks and multi-linked data
• Heterogeneous databases and legacy databases
• Spatial, spatiotemporal, multimedia, text and Web data
• Software programs, scientific simulations
Introduction

• Why Machine Learning?

• What Is Machine Learning?

• A Multi-Dimensional View of Machine Learning

• What Kind of Data Can Be Mined?

• What Kinds of Patterns Can Be Mined?

• What Technologies Are Used?

• What Kind of Applications Are Targeted?

Applications- Actual Story so
Far

Midjourney: overview shot of three dutch happy 40-year-old woman chatting in a
Recommended Reference Books

1. J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2011.
2. Is free will a matter of being a conscious outlier?, Available online: https://baldscientist.wordpress.com/2013/02/02/is-free-will-a-matter-of-being-a-conscious-outlier/, Last accessed: Jan 1, 2016.
3. Hermann Mucke, Data Mining in Drug Development and Translational Medicine Overview, Data Mining in Drug Development and Translational Medicine, Available online: http://www.insightpharmareports.com/data_mining/, Last accessed: Jan 1, 2016.
4. Peter Bajcsy, Introduction to Data Mining, Available online: http://www.slideshare.net/p2045i/introduction-to-data-mining, Last accessed: Jan 1, 2016.
5. Machine learning and Data Mining - Association Analysis with Python, Available online: http://aimotion.blogspot.in/2013/01/machine-learning-and-data-mining.html, Last accessed: Jan 1, 2016.
6. Titanic dataset, Available online: https://www.kaggle.com/c/titanic/data, Last accessed: Jan 1, 2016.
7. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2000.

Questions??
