
FIRST STEP IN SUPERVISED LEARNING
Saman Siadati

INTRODUCTION
• Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs.
• It infers a function from labeled training data consisting of a set of training examples.
• In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
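The idea of inferring a function from labeled pairs can be sketched in a few lines. The example below is purely illustrative (the names `train` and `predict` are not from any library): the "inferred function" is a 1-nearest-neighbour rule, the simplest learner that maps new inputs using stored (input, output) pairs.

```python
# Minimal sketch of supervised learning: infer a function from labeled
# (input, output) pairs, then use it to map new examples.

def train(examples):
    """1-NN 'learning' is memorisation: store the labeled training pairs."""
    return list(examples)

def predict(model, x):
    """Map a new input to the label of its closest training input."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest_input, label = min(model, key=lambda pair: sq_dist(pair[0], x))
    return label

# Each example is (input vector, supervisory signal).
data = [((0.0, 0.0), "neg"), ((1.0, 1.0), "pos"), ((0.9, 1.1), "pos")]
model = train(data)
print(predict(model, (1.0, 0.9)))  # a new point near the "pos" cluster
```

Any supervised learner follows this same contract (fit on labeled pairs, then map unseen inputs); only the form of the inferred function changes.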
SOLVING A GIVEN PROBLEM OF SUPERVISED LEARNING
1. Determine the type of training examples. Before doing anything else, the user should decide
what kind of data is to be used as a training set. In the case of handwriting analysis, for
example, this might be a single handwritten character, an entire handwritten word, or an
entire line of handwriting.
2. Gather a training set. The training set needs to be representative of the real-world use of
the function. Thus, a set of input objects is gathered and corresponding outputs are also
gathered, either from human experts or from measurements.
3. Determine the input feature representation of the learned function. The accuracy of the
learned function depends strongly on how the input object is represented. Typically, the
input object is transformed into a feature vector, which contains a number of features that
are descriptive of the object. The number of features should not be too large, because of
the curse of dimensionality; but should contain enough information to accurately predict the
output.
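Step 3 can be made concrete with a small sketch. The input object below is a hypothetical 4x4 binary glyph (standing in for the handwriting example); instead of using all 16 raw pixels as features, it is reduced to a four-element feature vector of quadrant ink densities — few enough features to sidestep the curse of dimensionality while still describing the shape. The representation is illustrative, not a standard pipeline.

```python
# Sketch of step 3: transform a raw input object into a feature vector.

def quadrant_densities(image):
    """Return [top-left, top-right, bottom-left, bottom-right] ink ratios
    for a 4x4 binary image."""
    feats = []
    for rows in (image[:2], image[2:]):          # top half, bottom half
        for cols in (slice(0, 2), slice(2, 4)):  # left half, right half
            block = [px for row in rows for px in row[cols]]
            feats.append(sum(block) / len(block))
    return feats

glyph = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
print(quadrant_densities(glyph))  # [0.75, 0.0, 0.0, 1.0]
```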
4. Determine the structure of the learned function and corresponding learning algorithm. For
example, the engineer may choose to use support vector machines or decision trees.
5. Complete the design. Run the learning algorithm on the gathered training set. Some
supervised learning algorithms require the user to determine certain control parameters.
These parameters may be adjusted by optimizing performance on a subset (called a
validation set) of the training set, or via cross-validation.
6. Evaluate the accuracy of the learned function. After parameter adjustment and learning, the
performance of the resulting function should be measured on a test set that is separate from
the training set.
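Steps 5 and 6 can be sketched end to end: tune a control parameter on a validation split held out of the training data, then measure accuracy on a separate test set. The learner below is deliberately trivial (a one-feature threshold classifier, with the threshold as the tuned parameter); the setup and names are illustrative only.

```python
# Sketch of steps 5-6: validation-set tuning, then held-out evaluation.
import random

random.seed(0)
# Synthetic data: label is 1 when the feature exceeds ~0.5, plus noise.
data = [(x, int(x + random.uniform(-0.1, 0.1) > 0.5))
        for x in [i / 100 for i in range(100)]]
random.shuffle(data)
train, valid, test = data[:60], data[60:80], data[80:]

def accuracy(threshold, examples):
    return sum(int(x > threshold) == y for x, y in examples) / len(examples)

# Step 5: choose the control parameter by optimizing validation accuracy.
best = max([i / 20 for i in range(21)], key=lambda t: accuracy(t, valid))

# Step 6: report performance on data never used for fitting or tuning.
print("chosen threshold:", best)
print("test accuracy:", accuracy(best, test))
```

Cross-validation would replace the single validation split with an average over several train/validation partitions, but the logic is the same.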
MOST WIDELY USED ALGORITHMS
• Support vector machines
• Linear regression
• Logistic regression
• Naive Bayes
• Linear discriminant analysis
• Decision trees
• k-nearest neighbors
• Neural networks (multilayer perceptron)
• Similarity learning
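One entry from the list above, sketched concretely: logistic regression fitted by plain gradient descent on the log-loss. This is a minimal pure-Python illustration, not a production implementation; the function names are ours.

```python
# Logistic regression on one feature, trained by stochastic gradient descent.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Learn weight w and bias b so that p(y=1|x) = sigmoid(w*x + b)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x  # gradient of the log-loss w.r.t. w
            b -= lr * (p - y)      # gradient of the log-loss w.r.t. b
    return w, b

xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(int(sigmoid(w * 0.9 + b) > 0.5))  # 0.9 falls on the class-1 side
```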
MAJOR ISSUES TO CONSIDER IN ALGORITHM CHOICE
• Bias-variance tradeoff
• Function complexity and amount of training data
• Noise in the output values
• Heterogeneity of the data
• Redundancy in the data
• Presence of interactions and non-linearities
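The bias-variance tradeoff from the list above can be illustrated with a toy experiment: on noisy data whose true target is simple, a model that memorises the training set (1-nearest-neighbour: low bias, high variance) loses to a heavily smoothed one (the global mean: high bias, low variance). The setup is purely illustrative.

```python
# Toy bias-variance comparison on noisy observations of a constant target.
import random

random.seed(1)

def sample(n):
    """Noisy observations of the constant target y = 0.5."""
    return [(random.random(), 0.5 + random.gauss(0, 0.3)) for _ in range(n)]

train, test = sample(30), sample(200)

def mse(predict):
    return sum((predict(x) - y) ** 2 for x, y in test) / len(test)

mean_y = sum(y for _, y in train) / len(train)
err_mean = mse(lambda x: mean_y)  # high bias, low variance
err_1nn = mse(lambda x: min(train, key=lambda p: abs(p[0] - x))[1])  # low bias, high variance

print(err_mean < err_1nn)  # here, the smoother model has lower test error
```

With a more complex true target and more training data, the comparison can flip — which is exactly why function complexity and the amount of training data appear together in the list above.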
APPLICATIONS
• Bioinformatics
• Cheminformatics
• Quantitative structure–activity relationship
• Database marketing
• Handwriting recognition
• Information retrieval
• Learning to rank
• Information extraction
• Object recognition in computer vision
• Optical character recognition
• Spam detection
• Pattern recognition
• Speech recognition
• Supervised learning has also been described as a special case of downward causation in biological systems
REFERENCES
Stuart J. Russell and Peter Norvig (2010). Artificial Intelligence: A Modern Approach, Third Edition. Prentice Hall. ISBN 9780136042594.

Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2012). Foundations of Machine Learning. The MIT Press. ISBN 9780262018258.

S. Geman, E. Bienenstock, and R. Doursat (1992). "Neural networks and the bias/variance dilemma." Neural Computation 4, 1–58.

G. James (2003). "Variance and Bias for General Loss Functions." Machine Learning 51, 115–135. (http://www-bcf.usc.edu/~gareth/research/bv.pdf)

C.E. Brodley and M.A. Friedl (1999). "Identifying and Eliminating Mislabeled Training Instances." Journal of Artificial Intelligence Research 11, 131–167. (http://jair.org/media/606/live-606-1803-jair.pdf)

M.R. Smith and T. Martinez (2011). "Improving Classification Accuracy by Identifying and Removing Instances that Should Be Misclassified." Proceedings of International Joint Conference on

Siadati, Saman (2017). Just a Moment with Machine Learning. 10.13140/RG.2.2.31765.35042.

Siadati, Saman (2018). A Quick Review of Deep Learning. 10.13140/RG.2.2.27269.58089.
