Unit 3
AI Vs. ML
Artificial Intelligence
What is AI?
John McCarthy
Artificial Intelligence is concerned with the design of intelligence in an artificial device.
1. Intelligence
2. Artificial device
The Turing Test
(“Can machines think?” A. M. Turing, 1950)
• Requires:
– Natural language processing
– Knowledge representation
– Automated reasoning
– Machine learning
– (vision, robotics) for full test
Agents
An agent is anything that can be viewed as perceiving its environment through sensors
and acting upon that environment through actuators/effectors.
• Robotic agent:
– Sensors: cameras (picture analysis), infrared range finders, solar sensors
– Actuators: various motors, speakers, wheels
INTRODUCTION
How does a computer work?
For some tasks, however, we do not have an algorithm; for these we rely on machine learning.
Example: telling spam emails from legitimate emails.
Input: an email document (a file of characters). Output: yes/no, indicating whether the message is spam.
We want the computer (machine) to extract the algorithm for this task automatically.
(source: https://medium.com/analytics-vidhya/introduction-to-machine-learning-e1b9c055039c)
• Machine learning is a “field of study that gives computers the ability to learn without
being explicitly programmed” (Arthur Samuel, 1959).
• In other words, it is concerned with the question of how to construct computer programs
that automatically improve with experience.
• A computer program is said to learn from experience ‘E’ with respect to some
class of tasks ‘T’ and performance measure ‘P’ if its performance at tasks in ‘T’,
as measured by ‘P’, improves with experience ‘E’. – Tom M. Mitchell
Example 1
Classify email as spam or not spam
• Task (T): Classify email as spam or not spam
• Experience (E): Watching the user mark/label emails as spam or
not spam
• Performance (P): The number or fraction of emails correctly
classified as spam or not spam
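The T/E/P framing of Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-count scoring scheme, the function names and all of the messages are made up for the sketch and do not come from the slides. It measures P (fraction classified correctly on held-out messages) after a small and a larger amount of experience E.

```python
from collections import Counter

# Hypothetical labelled emails (experience E): (text, is_spam)
LABELLED = [
    ("win free prize now", True),
    ("meeting agenda for monday", False),
    ("free money click now", True),
    ("project report attached", False),
    ("claim your free prize", True),
    ("lunch on monday", False),
]

# Hypothetical held-out emails, used only to measure performance P
HELD_OUT = [
    ("free prize inside", True),
    ("monday project meeting", False),
    ("click to claim money", True),
    ("report for the meeting", False),
]

def train(examples):
    """Count how often each word appears in spam and in legitimate mail."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in examples:
        (spam_words if is_spam else ham_words).update(text.split())
    return spam_words, ham_words

def classify(model, text):
    """Task T: label an email as spam (True) or not spam (False)."""
    spam_words, ham_words = model
    score = sum(spam_words[w] - ham_words[w] for w in text.split())
    return score > 0

def performance(model, examples):
    """Performance P: fraction of emails classified correctly."""
    correct = sum(classify(model, text) == label for text, label in examples)
    return correct / len(examples)

p_small = performance(train(LABELLED[:2]), HELD_OUT)  # little experience
p_full = performance(train(LABELLED), HELD_OUT)       # more experience
print(f"P with 2 examples: {p_small:.2f}, with 6 examples: {p_full:.2f}")
```

On this toy data the score on the held-out messages rises as more labelled emails are seen, which is the sense of "learning" in Mitchell's definition; real spam filters use far richer models and far more data.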
Example 2
Recognizing handwritten digits/characters
• Task (T): Recognizing handwritten digits
• Experience (E): Watching the user mark/label handwritten digits
with one of 10 classes (0-9) and identifying the underlying pattern
• Performance (P): The number or fraction of handwritten digits
correctly classified
Why Is Machine Learning Important?
• Human expertise does not exist
– Navigating on Mars
– Industrial/manufacturing control
– Mass spectrometer analysis, drug design, astronomic discovery
• Human expertise is a black box, or the task cannot be defined well except by
examples
– Face/handwriting/speech recognition, recognizing people
– Driving a car, flying a plane
How does machine learning help us in daily life?
Social networking :
• Use of the appropriate emotions,
Personal finance and
banking solutions
smartphones machine
Commute estimation
transportation modes,
Examples…
Input: an image. The classes are the people to be recognized [non-face, frontal-face, profile-face], and the learning
program should learn to associate face images with identities.
This problem is more difficult because there are more classes, the input image is larger, and a face is three-dimensional, so
differences in pose and lighting cause significant changes in the image. There may also be occlusion (blockage) of
certain inputs; e.g., glasses may hide the eyes and eyebrows, and a beard may hide the chin.
Example 3: Spam detection
[Screenshot: an email from “Watch Fest”, dated May 22, in the Spam folder, with the notice “Why is this message in Spam? It's similar to messages that were detected by our spam filters.”]
Example 4: Stock price prediction
A Crop Yield Prediction App in Senegal Using Satellite Imagery (video link):
https://www.youtube.com/watch?v=4OnBGkhA4jc&t=160s
Types of Learning
• Supervised learning
Supervised Learning:
The aim is to learn a mapping from the input to an output whose correct values are provided
by a supervisor.
1. Classification: The data is labelled, meaning each example is assigned a class,
for example spam/non-spam or fraud/non-fraud.
e.g. for a financial institution, the input to the classifier is savings and income and the output is
one of the classes, high risk or low risk, based on the following classification rule:
❑ if income > δ1 and savings > δ2 then low risk else high risk
2. Regression: The data is labelled with a real value rather than a class.
e.g. the price of a stock over time.
e.g. predicting the price of a used car:
Input: brand, year, engine capacity, mileage and other information
Output: price of the car
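The two supervised tasks above can be contrasted in a short Python sketch. It rests on assumptions that are not in the slides: the threshold values standing in for δ1 and δ2 are hypothetical, the car data is made up, and the regression uses mileage as its only input (rather than brand, year and so on) with an ordinary least-squares line as one possible model.

```python
# Hypothetical thresholds standing in for δ1 and δ2 in the rule above.
INCOME_THRESHOLD = 50_000
SAVINGS_THRESHOLD = 10_000

def credit_risk(income, savings):
    """Classification: map (income, savings) to a discrete class label."""
    if income > INCOME_THRESHOLD and savings > SAVINGS_THRESHOLD:
        return "low risk"
    return "high risk"

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up used-car data: mileage in thousands of km, price in thousands.
mileage = [20, 40, 60, 80]
price = [18, 16, 14, 12]

slope, intercept = fit_line(mileage, price)
predicted_price = slope * 50 + intercept  # real-valued output for a new car

print(credit_risk(60_000, 20_000), credit_risk(60_000, 5_000))
print(round(predicted_price, 2))
```

In a learned classifier the thresholds would be estimated from labelled examples rather than fixed by hand; they are constants here only to keep the rule itself visible.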
Unsupervised Learning
Example of Unsupervised learning
• Clustering
• Association
Example of Semi-supervised learning
Reinforcement Learning
• Learning from mistakes
• Place a reinforcement learning algorithm into any environment and
it will make a lot of mistakes in the beginning.
• If we provide a signal that associates good behaviors with a positive
value and bad behaviors with a negative one, we can reinforce the
algorithm to prefer good behaviors over bad ones.
• Over time, the learning algorithm learns to make fewer mistakes
than it used to.
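The idea of reinforcing good behavior with a positive signal can be sketched as below. This is a deliberately simplified illustration, not a full reinforcement learning algorithm: the four-state environment, the reward values and the learning rate are all assumptions made for the sketch, the agent is purely greedy (no exploration), and rewards are immediate rather than delayed.

```python
# Hypothetical environment: in each state exactly one action is "good".
GOOD_ACTION = {0: 2, 1: 0, 2: 1, 3: 2}
N_ACTIONS = 3
LEARNING_RATE = 0.5

def reward_signal(state, action):
    """Positive signal for good behavior, negative signal for bad behavior."""
    return 1.0 if GOOD_ACTION[state] == action else -1.0

def run(n_passes):
    """Let the agent act in every state n_passes times; count mistakes per pass."""
    # The agent's running estimate of how good each action is in each state.
    values = {state: [0.0] * N_ACTIONS for state in GOOD_ACTION}
    mistakes_per_pass = []
    for _ in range(n_passes):
        mistakes = 0
        for state in GOOD_ACTION:
            # Greedy choice; ties go to the lowest-numbered action.
            action = max(range(N_ACTIONS), key=lambda a: values[state][a])
            reward = reward_signal(state, action)
            # Move the estimate for this action toward the reward received.
            values[state][action] += LEARNING_RATE * (reward - values[state][action])
            if reward < 0:
                mistakes += 1
        mistakes_per_pass.append(mistakes)
    return mistakes_per_pass

mistakes_per_pass = run(4)
print(mistakes_per_pass)
```

The mistake count per pass falls as negative signals push the agent away from bad actions. Greedy selection is enough here only because every bad action is penalized immediately; practical methods add exploration and handle delayed rewards.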
Where is reinforcement learning in the real world?
• Video Games
• Industrial Simulation
• Resource Management
Key Elements of Machine Learning
3. Optimization: the way candidate programs are generated, known as the
search process.
For example: combinatorial optimization, convex optimization,
constrained optimization.
• All machine learning algorithms are combinations of these three
components.
• Together they give a framework for understanding all algorithms.
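As one illustration of the optimization component, here is a minimal sketch of gradient descent on a convex function. The objective, the step size and the number of steps are assumptions chosen for the example; the slides name convex optimization only as a category and do not prescribe this method.

```python
def loss(w):
    """A convex objective with its minimum at w = 3 (chosen for illustration)."""
    return (w - 3.0) ** 2

def gradient(w):
    """Derivative of the loss with respect to w."""
    return 2.0 * (w - 3.0)

def gradient_descent(start, step_size=0.1, n_steps=100):
    """Search for a minimizer by repeatedly stepping against the gradient."""
    w = start
    for _ in range(n_steps):
        w -= step_size * gradient(w)
    return w

w_best = gradient_descent(start=0.0)
print(round(w_best, 6), round(loss(w_best), 6))
```

Each step shrinks the distance to the minimum by a constant factor, so the search converges quickly on this function. In a learning system the candidate being adjusted would be a model's parameters and the loss would be computed from training data.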
Aspects of developing a learning system:
training data, concept representation, function approximation
• To train and test a model, we split the dataset into
three distinct sets: a training set, a validation set and a test set
• Training set:
• A set of data used to train the model
• It is used to fit the model
• The model sees and learns from this data
• Later, the trained model can be deployed and used to make predictions
on new data that it has not seen before
• Labeled data is used
Validation set
• The validation set is a set of data kept separate from the training data
• It is used to validate the model during training
• It provides information used for tuning the model's hyperparameters
• It helps check that the model is not overfitting to the data in the training
set
• Labeled data is used
Test Set
• A set of data used to test the model
• The test set is kept separate from both the training set and the validation set
• Once the model has been trained and validated using the training and
validation sets, it is used to predict the output for the
data in the test set
• The labels are withheld from the model; they are used only to score its predictions
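The three-way split described above can be sketched as follows. The 60/20/20 proportions, the fixed seed and the function name `split_dataset` are assumptions for the example; the slides do not specify split ratios.

```python
import random

def split_dataset(records, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the records and split them into training, validation and test sets.

    Whatever is left after the training and validation portions becomes the test set.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test

train_set, validation_set, test_set = split_dataset(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before slicing keeps any ordering in the original data from leaking into one subset, and the three slices never overlap, so the test set stays unseen until final evaluation.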
Data Split
Sometimes the data in a dataset has missing or incomplete information, which leads to less
accurate or incorrect predictions.
Further, some datasets are clean but not adequately shaped, for example aggregated or
pivoted, and some carry little business context.
Hence, after data is collected from various sources, data preparation is needed to
transform the raw data.
The significant advantages of data preparation in machine learning are as follows:
• It helps provide reliable prediction outcomes in various analytics operations.
• It helps identify data issues or errors and significantly reduces the chances of errors.
• It increases decision-making capability.
• It reduces overall project cost (data management and analytics cost).
• It helps remove duplicate content, making the data usable across different applications.
• It increases model performance.
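Two of the preparation steps mentioned above, removing duplicates and handling missing values, can be sketched in plain Python. The records and field names are made up, and filling gaps with the column mean is just one common choice assumed here, not a method stated in the slides.

```python
# Made-up raw records; None marks a missing value.
raw_rows = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 61000},
    {"id": 2, "age": None, "income": 61000},  # exact duplicate of the row above
    {"id": 3, "age": 46, "income": None},
]

def drop_duplicates(rows):
    """Keep only the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def fill_missing_with_mean(rows, column):
    """Replace missing values in a numeric column with that column's mean."""
    known = [row[column] for row in rows if row[column] is not None]
    mean = sum(known) / len(known)
    return [
        {**row, column: mean if row[column] is None else row[column]}
        for row in rows
    ]

prepared = drop_duplicates(raw_rows)
for column in ("age", "income"):
    prepared = fill_missing_with_mean(prepared, column)

for row in prepared:
    print(row)
```

Duplicates are removed first so that a repeated row does not count twice when the column means are computed.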
Steps in Data Preparation Process
2. Outliers or anomalies: unexpected values
• ML algorithms are sensitive to the range and distribution of values, especially when data
comes from unknown sources.
• Such values can spoil the entire training process and the
performance of the model.
• Hence, it is essential to detect these outliers or anomalies using techniques
such as visualization.
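Alongside visual inspection, outliers can also be flagged numerically. The sketch below uses the interquartile-range (IQR) rule, the same rule that sets the whiskers of a box plot; it is swapped in here as an example and is not the technique named in the slides. The readings and the multiplier k = 1.5 are assumptions.

```python
import statistics

def find_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Made-up readings containing one unexpected value.
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
outliers = find_outliers(readings)
print(outliers)
```

Whether a flagged value should be removed, corrected or kept is a judgment that depends on the domain; the rule only points at candidates.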
3. Unstructured data format:
• Data comes from various sources and needs to be extracted into a different
format.
• Hence, before deploying an ML project, always consult with domain experts or
import data from known sources.
4. Limited or sparse features/attributes:
• Whenever data comes from a single source, it contains limited features,
so it is necessary to import data from various sources for feature enrichment
or to build multiple features in the dataset.
5. Understanding feature engineering:
• Feature engineering helps develop additional content for ML models,
increasing model performance and prediction accuracy.
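Building extra features from existing fields can be sketched as below. The transaction records, the field names and the two derived features are hypothetical, chosen only to show the idea of enriching a dataset that starts with few attributes.

```python
from datetime import date

# Made-up transaction records; the field names are hypothetical.
transactions = [
    {"amount": 120.0, "items": 4, "date": date(2024, 3, 2)},
    {"amount": 45.0, "items": 1, "date": date(2024, 3, 6)},
]

def add_features(row):
    """Derive additional features from the raw fields of one record."""
    enriched = dict(row)  # copy, so the raw record is left unchanged
    enriched["amount_per_item"] = row["amount"] / row["items"]
    # weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
    enriched["is_weekend"] = row["date"].weekday() >= 5
    return enriched

engineered = [add_features(row) for row in transactions]

for row in engineered:
    print(row["amount_per_item"], row["is_weekend"])
```

Which derived features actually help depends on the task, so in practice candidates like these are checked against validation performance.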