
What Is Machine Learning

Machine Learning is the field that enables computers to learn from data without explicit programming, exemplified by applications like spam filters that improve accuracy through training on examples. It encompasses various types of learning systems, including supervised, unsupervised, semi-supervised, and reinforcement learning, each with distinct methodologies and applications. Machine Learning is particularly effective for complex problems, adapting to new data, and simplifying solutions that would otherwise require extensive manual tuning.

Uploaded by

Ankitha Reddy
Copyright
© All Rights Reserved

What is Machine Learning

Machine Learning is the science (and art) of programming computers so they can learn from
data.
Here is a slightly more general definition:
Machine Learning is the field of study that gives computers the ability to learn without being
explicitly programmed. —Arthur Samuel, 1959
And a more engineering-oriented one:
A computer program is said to learn from experience E with respect to some task T and some
performance measure P, if its performance on T, as measured by P, improves with experience E.
—Tom Mitchell, 1997

For example, your spam filter is a Machine Learning program that can learn to flag spam given
examples of spam emails (e.g., flagged by users) and examples of regular (nonspam, also called
“ham”) emails. The examples that the system uses to learn are called the training set. Each
training example is called a training instance (or sample).
In this case, the task T is to flag spam for new emails, the experience E is the training data, and
the performance measure P needs to be defined; for example, you can use the ratio of correctly
classified emails. This particular performance measure is called accuracy and it is often used in
classification tasks.
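
The accuracy measure described above can be sketched in a few lines of Python. This is only an illustration: the function name `accuracy` and the five example emails are made up for this sketch and do not come from any real filter.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical outputs of a spam filter on five emails, next to the true classes.
predicted = ["spam", "ham", "ham", "spam", "ham"]
actual = ["spam", "ham", "spam", "spam", "ham"]
print(accuracy(predicted, actual))  # 4 of 5 correct -> 0.8
```

In Mitchell's terms, the labeled emails are the experience E, flagging new emails is the task T, and the value returned by `accuracy` is the performance measure P.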

Why Use Machine Learning?


Consider how you would write a spam filter using traditional programming techniques.

1. First you would look at what spam typically looks like. You might notice that some words or
phrases (such as “4U,” “credit card,” “free,” and “amazing”) tend to come up a lot in the subject.
Perhaps you would also notice a few other patterns in the sender’s name, the email’s body, and so
on.
2. You would write a detection algorithm for each of the patterns that you noticed, and your
program would flag emails as spam if a number of these patterns are detected.
3. You would test your program, and repeat steps 1 and 2 until it is good enough.

Since the problem is not trivial, your program will likely become a long list of complex
rules—pretty hard to maintain.
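
A minimal sketch of the traditional approach, assuming the filter only inspects the subject line. The pattern list reuses the phrases mentioned in step 1; the threshold `MIN_MATCHES` and the function name `is_spam_by_rules` are illustrative assumptions, not part of any real filter.

```python
import re

# Hand-picked patterns, taken from the phrases mentioned in the text.
SUSPICIOUS_SUBJECT_PATTERNS = [
    re.compile(r"\b4U\b", re.IGNORECASE),
    re.compile(r"\bcredit card\b", re.IGNORECASE),
    re.compile(r"\bfree\b", re.IGNORECASE),
    re.compile(r"\bamazing\b", re.IGNORECASE),
]

# Assumed threshold: how many patterns must match before an email is flagged.
MIN_MATCHES = 2


def is_spam_by_rules(subject):
    """Flag a subject line when enough hand-written patterns match it."""
    matches = sum(1 for pattern in SUSPICIOUS_SUBJECT_PATTERNS if pattern.search(subject))
    return matches >= MIN_MATCHES


print(is_spam_by_rules("Amazing offer: free credit card 4U"))  # True
print(is_spam_by_rules("Minutes from Monday's meeting"))       # False
```

Every new spam trick means another entry in the pattern list and possibly a new threshold, which is how such a program grows into the long list of rules described above.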

In contrast, a spam filter based on Machine Learning techniques automatically learns which
words and phrases are good predictors of spam by detecting unusually frequent patterns of words
in the spam examples compared to the ham examples. The program is much shorter, easier to
maintain, and most likely more accurate.
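
To make the contrast concrete, here is a rough sketch of a filter that learns its word weights from examples instead of having them written by hand. It is a simplified word-frequency scorer, loosely in the spirit of Naive Bayes rather than a full implementation; the function names and the six tiny example emails are assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a text into lowercase words."""
    return text.lower().split()


def train(spam_emails, ham_emails):
    """Learn one weight per word: positive means 'more typical of spam'."""
    spam_counts = Counter(w for email in spam_emails for w in set(tokenize(email)))
    ham_counts = Counter(w for email in ham_emails for w in set(tokenize(email)))
    weights = {}
    for word in set(spam_counts) | set(ham_counts):
        # Add-one smoothing keeps the ratio finite for words seen in one class only.
        spam_rate = (spam_counts[word] + 1) / (len(spam_emails) + 2)
        ham_rate = (ham_counts[word] + 1) / (len(ham_emails) + 2)
        weights[word] = math.log(spam_rate / ham_rate)
    return weights


def is_spam(weights, text):
    """Sum the learned weights of the known words; unknown words count as 0."""
    score = sum(weights.get(w, 0.0) for w in set(tokenize(text)))
    return score > 0


# Made-up training set: three spam examples and three ham examples.
spam = ["free credit card 4U", "amazing free offer", "free prize 4U"]
ham = ["meeting notes attached", "lunch on friday", "notes from the offer review"]

weights = train(spam, ham)
print(is_spam(weights, "free offer 4U"))         # True
print(is_spam(weights, "friday meeting notes"))  # False
```

No word list appears in the code: the predictors of spam come entirely from the training set, so adapting to new spam only requires new examples.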

To summarize, Machine Learning is great for:


• Problems for which existing solutions require a lot of hand-tuning or long lists of
rules: one Machine Learning algorithm can often simplify code and perform better.
• Complex problems for which there is no good solution at all using a traditional
approach: the best Machine Learning techniques can perhaps find a solution.
• Fluctuating environments: a Machine Learning system can adapt to new data.
• Getting insights about complex problems and large amounts of data.

Types of Machine Learning Systems


There are so many different types of Machine Learning systems that it is useful to classify them
in broad categories based on:
• Whether or not they are trained with human supervision (supervised, unsupervised,
semisupervised, and Reinforcement Learning)
• Whether or not they can learn incrementally on the fly (online versus batch learning)
• Whether they work by simply comparing new data points to known data points, or instead
detect patterns in the training data and build a predictive model, much like scientists do
(instance-based versus model-based learning)

• Supervised/Unsupervised Learning
• Batch and Online Learning
• Instance-Based Versus Model-Based Learning

Supervised/Unsupervised Learning
Machine Learning systems can be classified according to the amount and type of supervision
they get during training. There are four major categories: supervised learning, unsupervised
learning, semisupervised learning, and Reinforcement Learning.

Supervised learning: In supervised learning, the training data you feed to the algorithm includes
the desired solutions, called labels.

A typical supervised learning task is classification. The spam filter is a good example of this: it is
trained with many example emails along with their class (spam or ham), and it must learn how to
classify new emails.
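
As a sketch of what learning from labeled examples can look like, the snippet below classifies emails with k-Nearest Neighbors, one of the algorithms listed in this section. The two numeric features per email and all the data points are hypothetical, chosen only to keep the example small.

```python
from collections import Counter


def knn_classify(training_set, features, k=3):
    """Return the majority label among the k training instances closest to `features`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(training_set, key=lambda item: squared_distance(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features per email: (suspicious words in subject, links in body).
# Each training instance pairs the features with its label (the desired solution).
training_set = [
    ((0, 0), "ham"), ((0, 1), "ham"), ((1, 0), "ham"), ((1, 1), "ham"),
    ((3, 4), "spam"), ((4, 5), "spam"), ((5, 3), "spam"), ((4, 6), "spam"),
]

print(knn_classify(training_set, (4, 4)))  # spam
print(knn_classify(training_set, (0, 2)))  # ham
```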

Another typical task is to predict a target numeric value, such as the price of a car, given a set of
features (mileage, age, brand, etc.) called predictors. This sort of task is called regression. To
train the system, you need to give it many examples of cars, including both their predictors and
their labels (i.e., their prices).
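
A minimal regression sketch, assuming a single predictor (mileage) and ordinary least squares as the fitting method. The four cars and their prices are invented, and real data would not fall on a perfect line as these do.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training set: mileage in thousands of km (predictor) and price (label).
mileage = [10, 50, 90, 130]
price = [19000, 15000, 11000, 7000]

slope, intercept = fit_line(mileage, price)
predicted_price = slope * 70 + intercept  # prediction for a car at 70,000 km
print(slope, intercept, predicted_price)  # -100.0 20000.0 13000.0
```

The labels here are numbers rather than classes, which is what separates regression from the classification task above.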

Here are some of the most important supervised learning algorithms:


• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks

Unsupervised learning:
In unsupervised learning, as you might guess, the training data is unlabeled. The system tries to
learn without a teacher.

For example, say you have a lot of data about your blog’s visitors. You may want to run a
clustering algorithm to try to detect groups of similar visitors. At no point do you tell the
algorithm which group a visitor belongs to: it finds those connections without your help. For
example, it might notice that 40% of your visitors are males who love comic books and generally
read your blog in the evening, while 20% are young sci-fi lovers who visit during the weekends,
and so on. If you use a hierarchical clustering algorithm, it may also subdivide each group into
smaller groups. This may help you target your posts for each group.
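
The clustering idea can be sketched with a bare-bones K-Means, one of the algorithms listed in this section. The visitor features (age, usual visiting hour), the choice of two clusters, and the starting centroids are all assumptions made for this example.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        centroids = new_centroids
    return assignments, centroids


# Hypothetical blog visitors described by (age, usual visiting hour). No labels are given.
visitors = [(18, 10), (20, 11), (22, 9), (40, 20), (42, 21), (44, 22)]

# Assumed starting centroids: the first and the fourth visitor.
assignments, centroids = k_means(visitors, [visitors[0], visitors[3]])
print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centroids)    # [(20.0, 10.0), (42.0, 21.0)]
```

The algorithm is never told which group a visitor belongs to. The group numbers 0 and 1 are arbitrary, and it is up to you to interpret what each group has in common.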

Here are some of the most important unsupervised learning algorithms:


• Clustering
—K-Means
—DBSCAN
—Hierarchical Cluster Analysis (HCA)
• Anomaly detection and novelty detection
—One-class SVM
—Isolation Forest
• Visualization and dimensionality reduction
—Principal Component Analysis (PCA)
—Kernel PCA
—Locally-Linear Embedding (LLE)
—t-distributed Stochastic Neighbor Embedding (t-SNE)
• Association rule learning
—Apriori
—Eclat

Semisupervised learning:
Some algorithms can deal with partially labeled training data, usually a lot of unlabeled data and
a little bit of labeled data. This is called semisupervised learning.
Some photo-hosting services, such as Google Photos, are good examples of this. Once you upload all
your family photos to the service, it automatically recognizes that the same person A shows up in
photos 1, 5, and 11, while another person B shows up in photos 2, 5, and 7. This is the unsupervised
part of the algorithm (clustering). Now all the system needs is for you to tell it who these people
are. Just one label per person, and it is able to name everyone in every photo, which is useful for
searching photos.
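
The labeling step of the photo example can be sketched as follows, assuming the clustering has already been done. This is not how any particular photo service works: the data layout, the function `name_photos`, and the names "Alice" and "Bob" are hypothetical.

```python
from collections import defaultdict

# Output of the unsupervised step (assumed already done): each detected face
# is a (photo_id, cluster_id) pair. The values mirror the example in the text.
detected_faces = [
    (1, "A"), (5, "A"), (11, "A"),
    (2, "B"), (5, "B"), (7, "B"),
]

# The supervised step: one label per cluster, supplied by the user (hypothetical names).
cluster_names = {"A": "Alice", "B": "Bob"}


def name_photos(faces, names):
    """Propagate each cluster's single label to every photo containing that cluster."""
    people_in_photo = defaultdict(list)
    for photo_id, cluster_id in faces:
        people_in_photo[photo_id].append(names.get(cluster_id, "unknown"))
    return dict(people_in_photo)


photos = name_photos(detected_faces, cluster_names)
print(photos[5])   # ['Alice', 'Bob']
print(photos[11])  # ['Alice']
```

Two labels are enough to name six detected faces, which is the appeal of combining a little labeled data with a lot of unlabeled data.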

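Most semisupervised learning algorithms are combinations of unsupervised and supervised
algorithms. For example, deep belief networks (DBNs) are based on unsupervised components called
restricted Boltzmann machines (RBMs) stacked on top of one another. RBMs are trained sequentially
in an unsupervised manner, and then the whole system is fine-tuned using supervised learning
techniques.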

Reinforcement Learning
Reinforcement Learning is a very different beast. The learning system, called an agent in this
context, can observe the environment, select and perform actions, and get rewards in return (or
penalties in the form of negative rewards). It must then learn by itself the best strategy, called a
policy, to get the most reward over time.
For example, many robots implement Reinforcement Learning algorithms to learn how to walk.
DeepMind’s AlphaGo program is also a good example of Reinforcement Learning: it made the
headlines in May 2017 when it beat the world champion Ke Jie at the game of Go. It learned its
winning policy by analyzing millions of games, and then playing many games against itself.
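
The agent, action, and reward loop can be illustrated with tabular Q-learning on a toy problem. Q-learning is a technique chosen for this sketch and is not named in the text; walking robots and AlphaGo rely on far more elaborate methods. The five-state corridor, the reward of +1 at the goal, and the constants `ALPHA` and `GAMMA` are all assumptions.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA = 0.5, 0.9  # assumed learning rate and discount factor


def step(state, action):
    """Environment: move along the corridor; reaching the last state pays +1."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    """Learn a table of action values from repeated interaction with the environment."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore at random while learning
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q = train()
# The learned policy: in each non-goal state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that moving right is correct. It only observes rewards, and the policy read from the table should end up preferring the step toward the goal in every state.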
