01 - Introduction To Machine Learning
Machine Learning (ML)
• The subfield of computer science that “gives computers the ability to learn without being explicitly programmed”. (Arthur Samuel, 1959)
• “Learning is any process by which a system improves performance from experience.” (Herbert Simon)
When do we use Machine Learning?
• ML is used when:
– Human expertise does not exist (navigating on Mars)
– Humans can’t explain their expertise (speech recognition)
– Models must be customized (personalized medicine)
– Models are based on huge amounts of data (genomics)
Traditional Programming vs Machine Learning
Defining the Learning Task
Definition by Tom Mitchell (1997):
• Machine Learning is the study of algorithms that
– improve their performance P
– at some task T
– with experience E.
• A well-defined learning task is given by <P, T, E>.
Defining the Learning Task - Examples
Improve on task T, with respect to performance P, based on experience E
T: Playing tennis
P: Percentage of games won against an arbitrary opponent
E: Playing practice games
Tasks Best Solved by a Learning Algorithm
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
Association Rule Learning
• A methodology useful for discovering interesting relationships hidden in large data sets.
• The uncovered relationships can be presented in the form of association rules.
• Example: If a person buys {Banana, Mango}, then there is a chance they will also buy {Dates}.
TID   Items
1     Banana, Mango
2     Banana, Dates, Plum, Apricot
3     Mango, Dates, Plum, Pear
4     Banana, Mango, Dates, Plum
5     Banana, Mango, Dates, Pear
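The "chance" in the example rule can be quantified. The sketch below uses two standard measures that the slides do not name, support and confidence, and applies them to the five transactions in the table. The function names are illustrative choices, not part of any library.

```python
# Transactions copied from the table above (TID -> set of items).
transactions = {
    1: {"Banana", "Mango"},
    2: {"Banana", "Dates", "Plum", "Apricot"},
    3: {"Mango", "Dates", "Plum", "Pear"},
    4: {"Banana", "Mango", "Dates", "Plum"},
    5: {"Banana", "Mango", "Dates", "Pear"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


# Rule under study: {Banana, Mango} -> {Dates}
rule_support = support({"Banana", "Mango", "Dates"}, transactions)
rule_confidence = confidence({"Banana", "Mango"}, {"Dates"}, transactions)
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Three of the five baskets contain {Banana, Mango}, and two of those three also contain Dates, so the rule holds in two out of three of the baskets where it applies.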
Classification
• Classification is the process of categorizing a given set of data into classes.
• It can be performed on both structured and unstructured data.
• A classifier predicts the class of each given data point.
• The classes are often referred to as targets, labels, or categories.
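As a minimal sketch of predicting the class of a data point, the code below uses a 1-nearest-neighbor rule, which is just one simple classifier and is not named in the slides. The training points and the labels "A" and "B" are hypothetical.

```python
import math

# Hypothetical labeled training data: (feature vector, class label).
training_data = [
    ((1.0, 1.2), "A"),
    ((0.8, 0.9), "A"),
    ((1.3, 0.7), "A"),
    ((4.0, 4.2), "B"),
    ((4.5, 3.8), "B"),
    ((3.9, 4.6), "B"),
]


def predict(point, training_data):
    """Return the label of the training example closest to `point`."""
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(point, example[0]),
    )
    return nearest_label


print(predict((1.1, 1.0), training_data))  # a point near the "A" examples
print(predict((4.2, 4.0), training_data))  # a point near the "B" examples
```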
Regression
• Regression is a technique used to predict the value of a target quantity when that quantity is continuous.
• For example, consider price vs. area data for Rawalpindi, depicted in the figure below.
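The price vs. area idea can be sketched with a straight-line fit by ordinary least squares, one of the simplest regression methods. The numbers below are made up for illustration and are not real Rawalpindi data; the units are left unspecified on purpose.

```python
# Hypothetical (area, price) observations; values and units are invented.
areas = [50.0, 80.0, 100.0, 120.0, 150.0]
prices = [5.2, 8.1, 9.9, 12.2, 14.8]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(areas, prices)

# Predict the continuous target (price) for an unseen area.
new_area = 110.0
predicted_price = slope * new_area + intercept
print(f"predicted price for area {new_area}: {predicted_price:.2f}")
```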
Clustering
• Clustering is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense.
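One common way to form such clusters is k-means, which is not named in the slides and is used here only as an example. In this sketch, "similar" is taken to mean close in Euclidean distance; the points and the two starting centroids are hypothetical.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids


# Hypothetical 2-D observations forming two visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
assignments, centroids = k_means(points, centroids=[points[0], points[3]])
print(assignments)
print(centroids)
```

No labels are supplied anywhere: the grouping comes only from the distances between observations.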
Dimensionality Reduction
• The number of input variables or features for a dataset is referred to as its dimensionality.
• Dimensionality reduction is simply the process of reducing the dimension of the feature set.
• The feature set could be a dataset with a hundred columns (i.e., features).
• More input features often make a predictive modeling task more challenging, a problem generally referred to as the curse of dimensionality.
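As one possible illustration, the sketch below reduces three features to one by projecting onto the first principal component (PCA), found with power iteration. PCA is not named in the slides, and the data rows are hypothetical; the three columns were chosen to be strongly correlated so that one derived feature captures most of the variation.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (column means, unit direction of greatest variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Map each row to a single coordinate along `direction`."""
    d = len(direction)
    return [sum((r[j] - means[j]) * direction[j] for j in range(d)) for r in rows]


# Hypothetical dataset: 5 rows, 3 correlated features.
rows = [
    (2.0, 4.1, -1.9),
    (3.0, 5.9, -3.1),
    (4.0, 8.2, -4.0),
    (5.0, 9.8, -5.1),
    (6.0, 12.1, -5.9),
]
means, direction = first_principal_component(rows)
reduced = project(rows, means, direction)  # 3 features -> 1 feature per row
print(reduced)
```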
Reinforcement Learning
• Reinforcement learning trains algorithms using a system of reward and punishment.
• A reinforcement learning algorithm, or agent, learns by interacting with its environment.
• The agent receives rewards for performing correctly and penalties for performing incorrectly.
• The agent learns without intervention from a human by maximizing its reward and minimizing its penalty.
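The reward-and-penalty loop can be sketched with tabular Q-learning, one standard reinforcement learning algorithm that the slides do not name. The environment is a hypothetical five-cell corridor invented for this example: reaching the right end gives a reward of +1, reaching the left end gives a penalty of -1. The hyperparameters (alpha, gamma, epsilon) are arbitrary illustrative values.

```python
import random

N_STATES = 5        # states 0..4; states 0 and 4 end the episode
ACTIONS = (-1, +1)  # index 0 = move left, index 1 = move right


def step(state, action_index):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = state + ACTIONS[action_index]
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # reward at the right end
    if next_state == 0:
        return next_state, -1.0, True   # penalty at the left end
    return next_state, 0.0, False


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = N_STATES // 2, False
        while not done:
            # Explore with probability epsilon (or on ties), else act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning update toward reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
policy = [
    "right" if q_table[s][1] > q_table[s][0] else "left"
    for s in range(1, N_STATES - 1)
]
print(policy)  # learned action for each non-terminal state
```

The agent is never told which direction is correct; it only observes the reward or penalty that follows its own actions.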