
Machine Learning

What is Machine Learning?

"Machine Learning: Improving performance at a task with experience."

A program is said to learn when it can:
• improve its performance P
• at some task T
• with experience E.
This is summarized by the triple <P, T, E>.

"Traditional Programming is like a recipe for the Computer: Data goes in, the Program processes it, and Output comes out."

"In Machine Learning, the Computer learns from Data to produce Output. It's like teaching the Computer to program itself."
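To make the contrast concrete, here is a minimal sketch; the spam-filter setting, function names, and numbers are all invented for illustration. In the first function a human writes the rule; in the second, the rule (a simple threshold) is chosen automatically from labeled examples.

```python
# Traditional programming: the rule is written by hand.
def is_spam_traditional(num_links: int) -> bool:
    return num_links > 5  # threshold chosen by the programmer

# Machine learning: the rule is inferred from labeled data.
def learn_threshold(examples):
    # Pick the threshold that classifies the most training examples correctly.
    return max(range(20), key=lambda t: sum((x > t) == y for x, y in examples))

data = [(1, False), (2, False), (8, True), (12, True)]  # (num_links, is_spam)
threshold = learn_threshold(data)

def is_spam_learned(num_links: int) -> bool:
    return num_links > threshold

print(threshold, is_spam_learned(9))
```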

When Do We Use Machine Learning? ML is used when:

• Human expertise is lacking (e.g., navigating Mars)
• Humans can't explain their expertise (e.g., speech recognition)
• Customized models are needed (e.g., personalized medicine)
• Large datasets are available (e.g., genomics)

Classic Examples of Tasks:

1. Recognizing patterns (e.g., facial identities, medical images)
2. Generating patterns (e.g., images, motion sequences)
3. Recognizing anomalies (e.g., unusual credit card transactions)
4. Prediction (e.g., stock prices)
Sample Applications:

• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging software
• [Your favorite area]

Defining the Learning Task: Improve performance on a task, based on experience:

• Task (T)
• Performance metric (P)
• Experience (E)

Examples of Learning Tasks:

1. Playing checkers: P = percentage of games won against an opponent; E = practice games played against itself.
2. Recognizing hand-written words: P = percentage of words correctly classified; E = a database of labeled images.
3. Driving on highways: P = average distance traveled before an error; E = recorded images and steering commands.
4. Email classification: P = percentage of emails correctly classified; E = a database of labeled emails.

Summary: Machine learning is about improving performance through experience, used in various fields for tasks where human expertise is limited or when dealing with large datasets.
Autonomous Cars:

• Sensors: laser terrain mapping, adaptive vision.
• Technology: learning from human drivers, path planning.

Deep Learning:

• Deep belief net trained on face images.
• Learning of object parts (e.g., faces, cars, elephants, chairs).
• Training on multiple objects with shared and object-specific features.
• Scene labeling via deep learning.

Inference from Deep Learned Models:

• Generating posterior samples from faces.
• Combining bottom-up and top-down inference.

Automatic Speech Recognition:

• ML predicts phone states from the sound spectrogram.
• Deep learning reduces the word error rate (WER).

Impact of Deep Learning:

• Deep learning outperforms traditional methods in speech recognition (e.g., large WER reductions).

Memorize the key points of each application to understand the advancements in machine learning.

Types of Learning:

Supervised Learning:

• Definition: In supervised learning, the algorithm learns from labeled data, meaning it is provided with inputs along with their corresponding correct outputs during the training phase (a short sketch follows this list).
• Regression: Predicting continuous or real-valued outputs. For example, predicting the Arctic sea ice extent from historical data.
• Classification: Predicting categorical outputs. For instance, classifying breast tumors as malignant or benign based on tumor size and other attributes.
• Example Attributes: Each input x can have multiple dimensions corresponding to different attributes (e.g., age, tumor size, clump thickness).
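A minimal supervised-learning sketch, assuming scikit-learn is available; the tumor attributes and labels below are invented toy values, not a real dataset.

```python
# Supervised learning: inputs X (age, tumor size) with known labels y.
from sklearn.linear_model import LogisticRegression

X = [[45, 1.2], [52, 2.8], [61, 3.5], [38, 0.9], [70, 4.1], [44, 1.0]]
y = [0, 1, 1, 0, 1, 0]  # 0 = benign, 1 = malignant (given during training)

model = LogisticRegression().fit(X, y)   # learn from labeled examples
print(model.predict([[50, 3.0]]))        # classify a new, unseen input
```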

Unsupervised Learning:

• Definition: Unsupervised learning algorithms are provided with input data without labeled responses; the algorithm tries to find hidden structures or patterns within the data (a short sketch follows this list).
• Clustering: Grouping similar data points together. For example, grouping individuals by genetic similarity in genomics applications.
• Examples: Unsupervised learning is used in fields such as social network analysis, market segmentation, and astronomical data analysis.
• Independent Component Analysis (ICA): Separates combined signals into their original sources, useful in audio and image processing.
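A minimal clustering sketch, assuming scikit-learn; the points are invented so that two natural groups exist, and no labels are supplied.

```python
# Unsupervised learning: k-means discovers the grouping on its own.
from sklearn.cluster import KMeans

X = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],    # one natural cluster
     [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]]    # another natural cluster

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)   # discovered cluster assignment for each point
```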

Reinforcement Learning:

• Definition: Reinforcement learning involves an agent interacting with an environment, learning to achieve a goal through trial and error. The agent receives feedback in the form of rewards or penalties based on its actions (a short sketch follows this list).
• Policy Learning: Reinforcement learning aims to learn a policy, a mapping from states to actions, that guides the agent's decision-making.
• Examples: Common applications include game playing (e.g., AlphaGo), robot navigation, and autonomous vehicle control.
• Inverse Reinforcement Learning: Inferring the reward function (and from it a policy) from demonstrations provided by a human expert.
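A minimal reinforcement-learning sketch: tabular Q-learning on a tiny invented corridor environment (states, rewards, and hyperparameters are illustrative only).

```python
# States 0..4; action 0 moves left, 1 moves right; state 4 gives reward +1.
import random

n_states, n_actions = 5, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for _ in range(200):                 # episodes of trial and error
    s = 0
    while s != 4:
        if random.random() < eps:    # explore
            a = random.randrange(n_actions)
        else:                        # exploit current estimates
            a = max(range(n_actions), key=lambda i: Q[s][i])
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == 4 else 0.0
        # update toward reward + discounted value of the next state
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Learned policy: greedy action per state (states 0..3 should prefer "right").
print([max(range(n_actions), key=lambda i: Q[s][i]) for s in range(n_states)])
```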

Framing a Learning Problem

Designing a Learning System:

1. Choose the training experience.
2. Define the target function to be learned.
3. Determine how to represent the target function.
4. Select a learning algorithm to infer the target function from the experience.

Training vs. Test Distribution:

• Assume training and test examples are independently drawn from the same overall distribution (i.i.d.).
• If examples are not independent, this is the setting of collective classification.
• If the test distribution differs from the training distribution, this is the setting of transfer learning.

Machine Learning in a Nutshell:

• Numerous machine learning algorithms exist, with hundreds introduced yearly.
• Every ML algorithm comprises three components: representation, optimization, and evaluation.

Various Function Representations:

• Numerical functions (e.g., linear regression, neural networks).
• Symbolic functions (e.g., decision trees, rules in propositional logic).
• Instance-based functions (e.g., nearest-neighbor, case-based).
• Probabilistic graphical models (e.g., Naïve Bayes, Bayesian networks).

Various Search/Optimization Algorithms:

• Gradient descent (e.g., perceptron, backpropagation); see the sketch after this list.
• Dynamic programming (e.g., HMM learning, PCFG learning).
• Divide and conquer (e.g., decision tree induction, rule learning).
• Evolutionary computation (e.g., genetic algorithms, genetic programming).
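A minimal gradient-descent sketch, assuming a one-parameter linear model y ≈ w·x and invented toy data: repeatedly step w opposite the gradient of the mean squared error.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.0]      # roughly y = 2x (toy data)

w, lr = 0.0, 0.01              # initial weight, learning rate
for _ in range(1000):
    # gradient of the mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad             # step downhill

print(round(w, 2))             # converges to about 2.0
```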

Evaluation Metrics:

• Accuracy, precision, recall, squared error, likelihood, posterior probability, cost/utility, margin, entropy, K-L divergence, etc.

Machine Learning in Practice Loop:

• Understand the domain, prior knowledge, and goals.
• Data integration, selection, cleaning, pre-processing, etc.
• Learn models.
• Interpret results.
• Consolidate and deploy discovered knowledge.

Lessons Learned about Learning:

• Learning approximates a chosen target function using direct or indirect experience.
• Function approximation involves searching a space of hypotheses for one that best fits the training data.
• Different learning methods assume different hypothesis spaces and/or employ different search techniques.
Decision Trees (Lectures 3, 4, 5)

1. Problem Setting:

• Set of possible instances X.
• Unknown target function f: X → Y.
• Set of function hypotheses H = {h | h: X → Y}.
2. Decision Tree Learning:
• Decision trees classify instances by sorting them down the tree from the root to some leaf node.
• Each internal node tests an attribute, each branch corresponds to an attribute value, and each leaf node assigns a class label.
• Decision trees can handle both categorical and continuous attributes.
3. Entropy:
• Entropy measures impurity in a group of examples.
• H(X) is the entropy of a random variable X.
• H(X) = − Σ p(x) log₂ p(x), summed over the values x of X; for a binary variable, H = −p log₂ p − (1 − p) log₂(1 − p).
4. Information Gain:
• Information gain measures the reduction in entropy (impurity) achieved by splitting the data on an attribute.
• It is calculated as the entropy of the parent node minus the weighted average of the entropies of the child nodes (a short sketch follows).
• Information Gain = entropy(parent) − [weighted average entropy(children)]
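A minimal sketch of both quantities using invented toy labels; entropy() implements the formula above, and the gain is the parent's entropy minus the weighted average of the children's.

```python
from math import log2

def entropy(labels):
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

parent = ["yes"] * 5 + ["no"] * 5    # maximally impure: entropy = 1.0
left   = ["yes"] * 4 + ["no"] * 1    # children after splitting on an attribute
right  = ["yes"] * 1 + ["no"] * 4

weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(parent)
gain = entropy(parent) - weighted    # reduction in impurity
print(round(gain, 3))
```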

1. Classification Task Overview:

• Learning a function f that maps attributes x to a class label y.
• Effective with datasets containing binary or nominal categories.
• Two types: descriptive modeling (describing features) and predictive modeling (predicting class labels).

Training Set:

• Contains examples (instances) with known class labels.

Test Set:

• Contains examples for which the "Class" label must be predicted.
• Similar to the training set, each example has attribute values, but the label is missing and needs to be predicted.

Illustrating the Classification Task:

1. Apply Model: apply the learned model to make predictions on new, unseen data.
2. Induction: the process of learning or training the model from the provided training dataset.
3. Deduction: after learning the model, use it to deduce (predict) the class labels of the test dataset.
4. Learning Model: building or training a model from the training data, which involves selecting an appropriate algorithm and optimizing its parameters to best fit the data.

The goal of the classification task is to train a model on the training set so that it can accurately predict the class labels of the test set from their attribute values. This is typically done with machine learning algorithms such as decision trees.
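A toy version of that induction/deduction loop, assuming scikit-learn; the attribute vectors and labels are invented for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

train_X = [[0, 1], [1, 1], [1, 0], [0, 0]]      # training set: attributes
train_y = ["No", "Yes", "Yes", "No"]            # ...with known class labels

model = DecisionTreeClassifier().fit(train_X, train_y)   # induction
print(model.predict([[1, 1], [0, 0]]))                   # deduction on test set
```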

2. Classification Techniques:
• Decision tree based methods
• Rule-based methods
• Memory-based reasoning
• Neural networks
• Naïve Bayes and Bayesian belief networks
• Support vector machines
Each technique applies a learning algorithm to identify the model that best predicts the class labels.

3. Evaluating Classification Models: performance is summarized with a confusion matrix.

4. Confusion Matrix:

                    PREDICTED TRUE    PREDICTED FALSE
    ACTUAL TRUE           TP                FN
    ACTUAL FALSE          FP                TN

• True Positives (TP): instances that are actually positive (class "Yes") and are correctly classified as positive.
• False Positives (FP): instances that are actually negative (class "No") but are incorrectly classified as positive.
• False Negatives (FN): instances that are actually positive (class "Yes") but are incorrectly classified as negative.
• True Negatives (TN): instances that are actually negative (class "No") and are correctly classified as negative.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Error rate = (FP + FN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
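A minimal sketch computing these four metrics from raw confusion-matrix counts (the counts are invented for illustration).

```python
TP, FN, FP, TN = 40, 10, 5, 45   # toy confusion-matrix counts

accuracy   = (TP + TN) / (TP + TN + FP + FN)
error_rate = (FP + FN) / (TP + TN + FP + FN)
precision  = TP / (TP + FP)
recall     = TP / (TP + FN)
print(accuracy, error_rate, precision, recall)
```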

Tree Induction:
• Uses a greedy strategy: split the records on the attribute test that optimizes a chosen criterion.
• Challenges include specifying the attribute test conditions and determining the best split.

Stopping Criteria for Tree Induction:
• Stop expanding a node when all its records belong to the same class or have similar attribute values.
• Early-termination methods may also be applied.

Decision Tree Based Classification:
• Advantages:
  • Inexpensive to construct.
  • Fast at classifying unknown records.
  • Easy to interpret for small trees.
  • Accuracy comparable to other techniques for simple datasets.

Practical Issues of Classification:
• Underfitting and Overfitting:
  • Achieving a balance between training error and generalization error is crucial.
  • Underfitting occurs when the model is too simple; overfitting results from overly complex models and is aggravated by noise and insufficient training examples.

Estimating Generalization Errors:

• Re-substitution errors: error on the training data, e(t).
• Generalization errors: error on the testing data, e'(t).
• Methods for estimating generalization errors (accounting for model complexity):
  • Optimistic approach: e'(t) = e(t); a poor estimate.
  • Pessimistic approach:
    • For each leaf node: e'(t) = e(t) + 0.5.
    • Total errors: e'(T) = e(T) + N × 0.5, where N is the number of leaf nodes.
    • Pessimistic error estimate = (e(T) + total penalty) / total number of training instances.
Example: for a tree with 30 leaf nodes and 10 errors on the training data (out of 1000 instances):
• Training error = 10 / 1000 = 1%.
• Pessimistic generalization error = (10 + 30 × 0.5) / 1000 = 2.5%.
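The same arithmetic as a small sketch:

```python
# Pessimistic estimate for the example above: 10 training errors,
# 30 leaf nodes, 0.5 penalty per leaf, 1000 training instances.
train_errors, leaves, penalty, n = 10, 30, 0.5, 1000
training_error = train_errors / n                        # 0.01  -> 1%
pessimistic    = (train_errors + leaves * penalty) / n   # 0.025 -> 2.5%
print(training_error, pessimistic)
```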

• Reduced error pruning (REP):
  • Uses a validation dataset to estimate the generalization error.

How to Address Overfitting:
• Pre-pruning (early stopping rule):
  • Halt the algorithm before the tree is fully grown, using stopping conditions for node expansion.
• Post-pruning:
  • Grow the tree fully, then trim nodes bottom-up wherever doing so improves generalization (a sketch follows this list).
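As a concrete stand-in for post-pruning, here is a sketch using scikit-learn's cost-complexity pruning (one specific post-pruning method, not REP itself); the ccp_alpha value and synthetic data are illustrative, not tuned.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)
full   = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)
print(full.tree_.node_count, pruned.tree_.node_count)  # pruned tree is smaller
```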

Handling Missing Attribute Values:
• Missing values affect decision tree construction in how impurity measures are computed and how instances are distributed among child nodes.

Metrics for Performance Evaluation:
• Confusion Matrix:
  • Provides counts or percentages of true positive, false negative, false positive, and true negative predictions.
• Accuracy:
  • Often used but limited, especially on imbalanced datasets.
• Cost Matrix:
  • Assigns costs to misclassifications; useful in scenarios like medical diagnosis.
• Information Retrieval Measures:
  • Precision, recall, and F-measure offer insight into model performance beyond accuracy.

Methods of Estimation:
• Holdout, cross-validation, stratified sampling, and bootstrap are common techniques for estimating generalization error (a sketch follows).
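A minimal cross-validation sketch, assuming scikit-learn and synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print(scores.mean())  # average held-out accuracy across the 5 folds
```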
