Machine Learning Unit-1.1

Uploaded by

Devchand Chaudhari


Overview of Course

1. Introduction
2. Linear Regression and Decision Trees
3. Instance-based learning, Feature Selection
4. Probability and Bayes Learning
5. Support Vector Machines
6. Neural Network
7. Introduction to Computational Learning Theory
8. Clustering
UNIT-1: Overview of Course
1. Introduction to Machine Learning
2. Definition of Machine Learning
3. Machine Learning with daily life examples
4. Different Types of Learning
5. Applications of Machine Learning
6. Examples of Machine Learning
7. Hypothesis space, Inductive Bias
8. Evaluation, Training and Test set, Cross-validation
9. Overfitting and Underfitting
Figure1:- Machine
• Artificial Intelligence is the concept of creating
intelligent machines: the ability of a machine to
imitate the intelligence of the human brain.

• Machine Learning is a subset of Artificial
Intelligence that helps you build AI-driven
applications. It allows a system to learn and
improve automatically from experience.

• Deep Learning is a subset of Machine Learning
that uses complex algorithms and deep neural
networks to train a model.
Difference Between Machine Learning
And Artificial Intelligence

• Artificial Intelligence is the concept of creating
intelligent machines that simulate human
behaviour, whereas
• Machine Learning is a subset of Artificial
Intelligence that allows machines to learn from
data without being explicitly programmed.
Introduction
• Popularity of this field in recent time and the
reasons behind that
– New software/ algorithms
• Neural networks
• Deep learning
– New hardware
• GPUs
– Cloud Enabled
– Availability of Big Data
• Ever since computers were invented, we have
wondered whether they might be made to learn. If we
could understand how to program them to learn-to
improve automatically with experience-the impact
would be dramatic.

• Imagine computers learning from medical records

which treatments are most effective for new diseases.

• A successful understanding of how to make computers

learn would open up new uses of computers and new
levels of competence and customization.
• We do not yet know how to make computers learn nearly as well as people
learn.

• However, algorithms have been invented that are effective for certain types of
learning tasks, and a theoretical understanding of learning is beginning to
emerge.

• Many practical computer programs have been developed to exhibit useful

types of learning, and significant commercial applications have begun to
appear.

• For problems such as speech recognition, algorithms based on machine
learning outperform all other approaches that have been attempted to date.

• In the field known as data mining, machine learning algorithms are being used
routinely to discover valuable knowledge from large commercial databases
containing equipment maintenance records, loan applications, financial
transactions, medical records, patient record etc.
• A few specific achievements provide a glimpse of the
state of the art:

• Programs have been developed that successfully learn

to recognize spoken words.

• Predict recovery rates of pneumonia patients, detect

fraudulent use of credit cards, drive autonomous
vehicles on public highways.

• Play games such as backgammon at levels approaching

the performance of human world champions
What is data science?
• Data science is the field of applying advanced analytics techniques and
scientific principles to extract valuable information from data for
decision-making, strategic planning and other business uses.

• Data science combines math and statistics, specialized programming,

advanced analytics, artificial intelligence (AI), and machine learning with
specific subject matter expertise to uncover actionable insights hidden in
an organization’s data.

• Data science is the study of data to extract meaningful insights for

business. It combines principles and practices from the fields of
mathematics, statistics, artificial intelligence, and computer engineering
to analyze large amounts of data.

• These insights can be used to guide decision making and strategic

planning.
Prerequisite for Data Science
1.Machine Learning:- Machine learning is the backbone of data science.
Data Scientists need to have a solid grasp of ML in addition to basic
knowledge of statistics.

2. Modeling:- Mathematical models enable you to make quick calculations

and predictions based on what you already know about the data.

3. Statistics:- A sturdy handle on statistics can help you extract more

intelligence and obtain more meaningful results.

4. Programming:- The most common programming languages are Python
and R. Python is especially popular because it’s easy to learn, and it
supports multiple libraries for data science and ML.

5. Databases:- A capable data scientist needs to understand how databases

work, how to manage them, and how to extract data from them.
Different Roles/Jobs in Data Science
• Data Scientist:
– A data scientist’s skill set is typically broader than that of the average data
analyst. Data scientists use common programming languages, such as R and
Python, to conduct statistical inference and data visualization.
– Data scientists require computer science and pure science skills beyond
those of a typical business analyst or data analyst.
– The data scientist must also understand the specifics of the business domain,
such as manufacturing, eCommerce, healthcare, or agriculture.
• Data Scientist:
• Job role: Determine what the problem is, what questions need answers, and
where to find the data. Also, they mine, clean, and present the relevant
data.
• Skills needed: Programming skills (SAS, R, Python), storytelling and data
visualization, statistical and mathematical skills, knowledge of Hadoop, SQL,
and Machine Learning.
• Data analyst: particularly deals with exploratory data analysis and data
visualization.
• Data Analyst:
• Job role: Analysts bridge the gap between the data scientists and the
business analysts, organizing and analyzing data to answer the questions
the organization poses. They take the technical analyses and turn them
into qualitative action items.
• Skills needed: Statistical and mathematical skills, programming skills (SAS,
R, Python), plus experience in data wrangling and data visualization.
• Data Engineer:
• Job role: Data engineers focus on developing, deploying, managing, and
optimizing the organization’s data infrastructure and data pipelines.
Engineers support data scientists by helping to transfer and transform
data for queries.
• Skills needed: NoSQL databases (e.g., MongoDB, Cassandra DB),
programming languages such as Java and Scala, and frameworks (Apache
Hadoop).
• Business Managers:
– The business managers are the people in charge of
overseeing the data science training method. Their primary
responsibility is to collaborate with the data science team to
characterise the problem and establish an analytical method.
Their goal is to ensure projects are completed on time by
collaborating closely with data scientists and IT managers.

• IT Managers:
– They are primarily responsible for developing the
infrastructure and architecture to enable data science
activities. Data science teams are constantly monitored and
resourced accordingly to ensure that they operate efficiently
and safely. They may also be in charge of creating and
maintaining IT environments for data science teams.
Data Science tools
• Data scientists rely on the following popular programming
languages to conduct data analysis and statistical regression.

• Open source tools support pre-built statistical modeling,

machine learning, and graphics capabilities.

1. R (with RStudio): R is an open source programming language and
environment for statistical computing and graphics; RStudio is a
development environment for it.

2. Python: A dynamic and flexible programming language.
Python has numerous libraries, such as NumPy,
Pandas, and Matplotlib, for analyzing data quickly.
Data Science Tools

• Data Analysis: SAS, Jupyter, R Studio, MATLAB, Excel,

RapidMiner.

• Data Warehousing: Informatica/ Talend, AWS Redshift

• Data Visualization: Jupyter, Tableau, Cognos, RAW

• Machine Learning: Spark MLlib, Mahout, Azure ML Studio
Programs vs learning algorithms
• Algorithmic solution: Data + Program → Computer → Output
• Machine Learning solution: Data + Output → Computer → Program
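One rough way to see this contrast in code: in the algorithmic solution a person writes the rule, while in the machine learning solution the rule is derived from example data and its known outputs. The sketch below is illustrative only; the temperature readings, the labels, and the names `handwritten_program` and `learn_program` are assumptions, not part of the slides.

```python
def handwritten_program(temperature):
    """Algorithmic solution: a person supplies the rule (the program)."""
    return "hot" if temperature > 30 else "cold"


def learn_program(data, outputs):
    """Machine learning solution: derive the rule from data and known outputs.

    Tries each midpoint between neighbouring values as a threshold and keeps
    the one that makes the fewest mistakes on the examples.
    """
    values = sorted(set(data))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(data) + 1
    for threshold in candidates:
        errors = sum(
            ("hot" if x > threshold else "cold") != y
            for x, y in zip(data, outputs)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors

    def program(temperature):
        return "hot" if temperature > best_threshold else "cold"

    return program, best_threshold


# Hypothetical readings (degrees Celsius) with labels supplied by a person.
data = [12, 18, 22, 27, 31, 35]
outputs = ["cold", "cold", "cold", "hot", "hot", "hot"]

learned_program, threshold = learn_program(data, outputs)
print(threshold)               # 24.5
print(learned_program(29))     # hot
print(handwritten_program(29)) # cold
```

The computer returns a program (here, a threshold rule) rather than an output, which is the point of the second diagram.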
What is Machine Learning?
• Learning:- “Learning is any process by which a system
improves performance from experience.” - Herbert Simon

• Machine learning is a discipline of computer science that

uses computer algorithms and analytics to build predictive
models that can solve business problems.

• Definition by Tom Mitchell on machine learning says: “A
computer program is said to learn from experience E with
respect to some class of tasks T and performance measure
P, if its performance at tasks in T, as measured by P, improves
with experience E.”
What is Machine Learning
• Machine learning is programming computers to
optimize a performance criterion using data or
past experience.
• There is no need to “learn” to calculate payroll
• Learning is used when:
– Human expertise does not exist (navigating on Mars),
– Humans are unable to explain their expertise (speech
recognition)
– Solution changes in time (routing on a computer
network)
– Solution needs to be adapted to particular cases (user
biometrics)
Machine Learning : Definition
• Learning is the ability to improve one's behaviour based on
experience.

• Build computer systems that automatically improve with

experience

• What are the fundamental laws that govern all learning

processes?

• Machine Learning explores algorithms that can

– learn from data / build a model from data
– use the model for prediction, decision making or solving some
tasks
• “A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.” [Mitchell]
Components of a learning problem
• Task: The behaviour or task being improved.
– For example: classification, acting in an
environment

• Data: The experiences that are being used to

improve performance in the task.

• Measure of improvement:
– For example: increasing accuracy in prediction,
acquiring new skills, improved speed and efficiency
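The three components can be made concrete with a small sketch. Here the task T is labelling a number with a class, the performance measure P is accuracy on a fixed test set, and the experience E is a growing list of labelled examples. The data, the nearest-neighbour rule, and all function names are assumptions chosen for illustration.

```python
def nearest_neighbour_predict(training_examples, x):
    """Task T: label x with the class of the closest training example."""
    _, nearest_label = min(training_examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label


def accuracy(training_examples, test_examples):
    """Performance measure P: fraction of test examples labelled correctly."""
    correct = sum(
        nearest_neighbour_predict(training_examples, x) == label
        for x, label in test_examples
    )
    return correct / len(test_examples)


# Hypothetical data: the true class is 1 when x > 5, otherwise 0.
experience = [(0, 0), (6, 1), (4, 0)]  # Experience E, seen one example at a time
test_examples = [(i + 0.5, int(i + 0.5 > 5)) for i in range(10)]

# Accuracy after 1, 2 and 3 training examples.
scores = [accuracy(experience[:n], test_examples) for n in range(1, len(experience) + 1)]
print(scores)  # [0.5, 0.8, 1.0]
```

Performance at T, as measured by P, improves as E grows, which is what the definition asks for.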
How Does Machine Learning Work?

• Machine learning accesses vast amounts of data (both structured and unstructured)
and learns from it to predict the future. It learns from the data by using multiple
algorithms and techniques. Below is a diagram that shows how a machine learns from
data.

Past Data → Machine Learning Algorithm → Output

• https://www.simplilearn.com/tutorials/artificial-intelligence-tutorial/ai-vs-machine-learning-vs-deep-learning
• Reference Books:
1. Machine-Learning-Tom-Mitchell Publisher: McGraw-Hill
Domains and ML Applications
Domain:- Automobile
Example: A robot driving learning problem
• Task T: driving on public four-lane highways
using vision sensors
• Performance measure P: average distance
traveled before an error
• Training experience E: a sequence of images
and steering commands recorded while
observing a human driver
Domain: Health Care

• Diagnose a disease
– Input: symptoms, lab measurements, test result.
– Output: One of set of possible diseases, or
“none of the above”

• Data: Historical medical records.

• Learn: which future patients will respond best to

which treatments
Association rule
• Association rule learning is an unsupervised learning technique that
checks how strongly one data item depends on another and maps them
accordingly, so that the relationships found can be used more profitably.

• Association rule learning is an important approach in machine
learning. It is employed in market basket analysis, web usage mining,
continuous production, etc. In market basket analysis, big retailers
use it to find relations between items.

• Association rule learning can be divided into three types of

algorithms:
1. Apriori algorithm
2. Eclat algorithm
3. F-P Growth algorithm
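To give a feel for how the first of these works, here is a minimal Apriori-style sketch: it searches level by level and only builds larger candidates from itemsets that are already frequent. It is a simplified illustration, not an optimized or complete implementation. The function name `frequent_itemsets`, the `min_count` threshold of 3, and the use of the five baskets from the market-basket table in these slides are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least min_count transactions."""
    def count(itemset):
        return sum(itemset <= t for t in transactions)

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if count(frozenset([i])) >= min_count]
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = count(itemset)
        size += 1
        # Join step: unions of frequent itemsets that have exactly `size` items.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == size}
        # Prune step: keep a candidate only if all of its smaller subsets are
        # frequent (the Apriori property) and it is frequent itself.
        current = [
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
            and count(c) >= min_count
        ]
    return frequent


transactions = [
    frozenset({"Bread", "Milk"}),
    frozenset({"Bread", "Diaper", "Beer", "Eggs"}),
    frozenset({"Milk", "Diaper", "Beer", "Cola"}),
    frozenset({"Bread", "Milk", "Diaper", "Beer"}),
    frozenset({"Bread", "Milk", "Diaper", "Cola"}),
]

result = frequent_itemsets(transactions, min_count=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With this threshold the search finds four frequent single items and four frequent pairs, and no frequent triple.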
Associations: Market Basket Analysis
How does Association Rule Learning work?
• Association rule learning works on if-then rules,
such as: if A then B.

• To measure the associations between thousands of data

items, there are several metrics.

• These metrics are given below:

• Support
• Confidence
• Lift
Applications of Machine Learning:
Learning Associations:
TID Items
1 {Bread, Milk}
2 {Bread, Diaper, Beer, Eggs}
3 {Milk, Diaper, Beer, Cola}
4 {Bread, Milk, Diaper, Beer}
5 {Bread, Milk, Diaper, Cola}

Support Count(σ) – Frequency of occurrence of an itemset.

Here σ({Milk, Diaper, Beer})=2

σ({Bread, Diaper, Beer})=2
• Association Rule – An implication expression of the
form X -> Y, where X and Y are disjoint itemsets.
• Example: {Milk, Diaper}->{Beer}

• Definition of Support:
• Support is how frequently an itemset appears in the
dataset, expressed as a fraction of all transactions.
• From the above table:

• Support(s) = σ({Milk, Diaper, Beer})/|T| = 2/5 = 0.4

where |T| is the total number of transactions.

• Definition of Confidence:
• How often the items X and Y occur together in
the dataset, given that X occurs. It is the ratio of
the number of transactions that contain both X and Y
to the number of transactions that contain X.
• From the example {Milk, Diaper} -> {Beer}:
• Confidence(c) = σ({Milk, Diaper, Beer})/σ({Milk, Diaper}) = 2/3 = 0.67
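The support count, support, and confidence figures above can be checked with a few lines of Python. This is a small sketch over the five transactions in the table; the helper names `support_count`, `support`, and `confidence` are chosen here for illustration.

```python
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Cola"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Cola"},
]


def support_count(itemset):
    """sigma(itemset): number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return support_count(itemset) / len(transactions)


def confidence(x, y):
    """Of the transactions that contain X, the fraction that also contain Y."""
    return support_count(x | y) / support_count(x)


x, y = {"Milk", "Diaper"}, {"Beer"}
print(support_count(x | y))        # 2
print(support(x | y))              # 0.4
print(round(confidence(x, y), 2))  # 0.67
```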
• Definition of Lift(l): The lift of the rule X=>Y is the confidence of the rule
divided by the expected confidence, assuming that the itemsets X and Y
are independent of each other. The expected confidence is the
support (relative frequency) of {Y}.

• If Lift(l)=1: X and Y appear together exactly as often as expected
under independence,
• If Lift(l)>1: they appear together more often than expected, and
• If Lift(l)<1: they appear together less often than expected. Greater lift values
indicate stronger association.

• Lift(l) = Support(X,Y)/(Support(X)*Support(Y))
l = Supp({Milk, Diaper, Beer})/(Supp({Milk, Diaper})*Supp({Beer}))
= 0.4/(0.6*0.6)
= 1.11
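Lift is described here in two ways: as confidence divided by expected confidence, and as a ratio of supports. Both give the same number. A short derivation for the rule {Milk, Diaper} -> {Beer}, writing s(·) for support and c(·) for confidence (shorthand assumed here for brevity):

```latex
\[
\mathrm{lift}(X \Rightarrow Y)
  = \frac{c(X \Rightarrow Y)}{s(Y)}
  = \frac{s(X \cup Y)/s(X)}{s(Y)}
  = \frac{s(X \cup Y)}{s(X)\, s(Y)}
\]
\[
\mathrm{lift}(\{\text{Milk}, \text{Diaper}\} \Rightarrow \{\text{Beer}\})
  = \frac{2/3}{3/5}
  = \frac{0.4}{0.6 \times 0.6}
  = \frac{10}{9} \approx 1.11
\]
```

Since the value is slightly above 1, Beer appears with {Milk, Diaper} a little more often than independence would predict.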
Applications of Association Rule Learning

• Below are some popular applications of association rule learning:

1. Market Basket Analysis: It is one of the popular examples and applications of

association rule mining. This technique is commonly used by big retailers to
determine the association between items. By discovering such associations,
retailers produce marketing methods by analyzing which elements are
frequently purchased by users.

2. Medical Diagnosis: Association rules can support diagnosis, as they
help in identifying the probability of illness for a particular
disease.

3. Protein Sequence: The association rules help in determining the synthesis of

artificial Proteins.

4. Web usage mining: Web usage mining is basically the extraction of various
types of interesting data that is readily available and accessible in the ocean
of huge web pages, from Internet.
Classification

• Example: Credit scoring
• Differentiating between low-risk and high-risk
customers from their income and savings

Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
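The discriminant can be written directly as a function. In a real system the thresholds θ1 and θ2 would be learned from past customer data; the values below are hypothetical placeholders, as is the function name `credit_risk`.

```python
def credit_risk(income, savings, theta1, theta2):
    """Apply the discriminant: low-risk only if both thresholds are exceeded."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


# Hypothetical thresholds; a learning algorithm would fit these from data.
THETA1 = 40_000  # income threshold
THETA2 = 10_000  # savings threshold

print(credit_risk(55_000, 20_000, THETA1, THETA2))  # low-risk
print(credit_risk(55_000, 5_000, THETA1, THETA2))   # high-risk
print(credit_risk(30_000, 20_000, THETA1, THETA2))  # high-risk
```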
Face Recognition

Training examples of a person

Test images

AT&T Laboratories, Cambridge UK

http://www.uk.research.att.com/facedatabase.html

Clustering
Regression

• Example: Price of a used car
• x : car attributes
• y : price
• Linear model: y = wx + w0
• General form: y = g(x | θ), where g( ) is the model
and θ are its parameters
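For the linear model y = wx + w0 with a single attribute, the parameters can be estimated by ordinary least squares. The sketch below uses the closed-form solution on made-up data (car age in years against price in thousands); the data and the function name `fit_line` are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates of w and w0 in y = w*x + w0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    w0 = mean_y - w * mean_x
    return w, w0


# Hypothetical used-car data: age in years, price in thousands.
ages = [1, 2, 3, 4, 5]
prices = [18, 16, 14, 12, 10]

w, w0 = fit_line(ages, prices)
print(w, w0)       # -2.0 20.0
print(w * 6 + w0)  # 8.0, the predicted price of a six-year-old car
```

Here θ = (w, w0) plays the role of the parameters and g is the straight line.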
Some other applications
• Fraud detection: credit card providers
• Determine whether or not someone will
default on a home mortgage.
• Understand consumer sentiment based on
unstructured text data.
• Determine customer behavior based on
previous records/patterns.
• Speech recognition:
• Face Recognition:
• Weather Forecasting
• NLP:
– Detect where entities are mentioned in natural language text
– Detect what facts are expressed in natural language text
– Detect whether a product/movie review is positive,
negative, or neutral
• Financial:
– Predict whether a stock will rise or fall
• Predict whether a user will click on an ad
