WEEK 5 Machine Learning
Supervised Learning
• Labelled data – Data for which the target answer is known. For example, if you are shown a picture of a cat and told that it is a cat, that is labelled data.
• Unlabelled data – Data for which the target answer is not known. For example, if you are shown an image but are given no description of what it contains.
Training Process:
(figure omitted: a model is trained on labelled examples and the learned model is then used to classify new data, e.g. animals as mammals vs. non-mammals)
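A minimal sketch of this training process in Python with scikit-learn (not from the notes; the animal features, values, and labels below are made-up assumptions for illustration):

from sklearn.tree import DecisionTreeClassifier

# Labelled data: each animal is described by [gives_birth, has_fur, lays_eggs]
# and its target answer (label) is known: "mammal" or "non-mammal".
X = [
    [1, 1, 0],  # dog
    [1, 1, 0],  # cat
    [0, 0, 1],  # snake
    [0, 0, 1],  # bird
    [1, 0, 0],  # whale
]
y = ["mammal", "mammal", "non-mammal", "non-mammal", "mammal"]

# Training: the model learns the mapping from features to labels
model = DecisionTreeClassifier().fit(X, y)

# Prediction on a new, unseen animal described by the same features
print(model.predict([[1, 1, 0]]))  # expected output: ['mammal']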
Unsupervised Learning
Key Concepts:
1. No Labeled Output: the training data has no target answers (no labels).
2. Goal: discover hidden patterns or structure in the data.
Common Tasks: clustering, dimensionality reduction, association rule mining.
Example:
Imagine you have a big pile of customer data (age, purchase history,
website visits), but you don’t know anything about them. You want to
group similar customers together to send them tailored marketing
emails.
A clustering algorithm can group those customers into similar segments on its own. You didn’t tell the algorithm what to look for; it found those patterns by itself.
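As a concrete sketch of this customer example, here is a small Python snippet using scikit-learn's KMeans clustering (the customer numbers for age, purchases, and website visits are invented for illustration):

import numpy as np
from sklearn.cluster import KMeans

# Unlabelled customer data: [age, purchases last year, website visits]
customers = np.array([
    [22,  3, 40],
    [25,  2, 35],
    [47, 20, 10],
    [52, 25,  8],
    [31,  8, 22],
    [29,  7, 25],
])

# Ask for 2 groups; no labels are given, the algorithm finds the groups itself
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)  # cluster assignment for each customer, e.g. [0 0 1 1 0 0]

Each customer is assigned to a cluster, and each cluster can then receive its own tailored marketing email.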
When to use supervised vs. unsupervised learning: use supervised learning when you have labelled data and want to predict a known target; use unsupervised learning when the data is unlabelled and you want to discover structure or groups in it.
Decision Tree
When building a decision tree, we count how many positive and negative examples fall into each branch of a candidate split. In the Sunny branch, for example, the counts are [2+, 3-] because the Yes (positive) examples are D9 and D11, which gives 2, and the remaining 3 Sunny examples are No (negative). So the branch is written as [2+, 3-].
Then, to measure this uncertainty, we use the entropy formula:
Entropy(S) = −p₊ log₂(p₊) − p₋ log₂(p₋)
where p₊ and p₋ are the proportions of positive and negative examples in the set S.
Entropy tells us how mixed a set of examples is. If a set is pure (all Yes or all No), e.g. 10Y, 0N, the entropy is 0. If it is a 50/50 split, e.g. 5Y, 5N, the entropy is 1, the maximum uncertainty.
In our worked example the set had 6 examples of one class and 2 of the other, so the proportions are 6/8 and 2/8 (the 8 comes from 6 + 2 = 8); we then plug these into the entropy formula, which gives about 0.811.
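To make the calculation concrete, here is a minimal Python sketch (not from the notes) that computes the entropy of a two-class set and reproduces the numbers above, including the 6/8 vs 2/8 example and the Sunny branch [2+, 3-]:

import math

def entropy(pos, neg):
    # Entropy of a set with `pos` positive and `neg` negative examples.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:              # treat 0 * log2(0) as 0
            result -= p * math.log2(p)
    return result

print(entropy(10, 0))  # pure set               -> 0.0
print(entropy(5, 5))   # 50/50 split            -> 1.0
print(entropy(6, 2))   # the 6/8, 2/8 example   -> ~0.811
print(entropy(2, 3))   # Sunny branch [2+, 3-]  -> ~0.971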
Gini Index
The Gini index is another impurity measure, like entropy, used to decide which attribute to split on in a decision tree. For a set S it is Gini(S) = 1 − Σᵢ pᵢ², where pᵢ is the proportion of examples in class i; it is 0 for a pure set and largest for an even split.
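For comparison with the entropy sketch above, here is a minimal Python version of the Gini calculation (again an illustrative sketch, not from the notes):

def gini(pos, neg):
    # Gini impurity of a set with `pos` positive and `neg` negative examples.
    total = pos + neg
    p_pos, p_neg = pos / total, neg / total
    return 1.0 - (p_pos ** 2 + p_neg ** 2)

print(gini(10, 0))  # pure set               -> 0.0
print(gini(5, 5))   # 50/50 split            -> 0.5 (maximum for two classes)
print(gini(2, 3))   # Sunny branch [2+, 3-]  -> 0.48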