This MIT Lecture (01:08:05) : How Machine Learning Works
nuances between the two. Machine learning, deep learning, and neural networks are all sub-fields
of artificial intelligence. However, deep learning is actually a sub-field of machine learning, and
neural networks are a sub-field of deep learning.
The way in which deep learning and machine learning differ is in how each algorithm learns.
Deep learning automates much of the feature extraction piece of the process, eliminating some of
the manual human intervention required and enabling the use of larger data sets. You can think
of deep learning as "scalable machine learning," as Lex Fridman notes in this MIT lecture
(01:08:05). Classical, or "non-deep," machine learning is more
dependent on human intervention to learn. Human experts determine the set of features to
understand the differences between data inputs, usually requiring more structured data to learn.
"Deep" machine learning can leverage labeled datasets, also known as supervised learning, to
inform its algorithm, but it doesn’t necessarily require a labeled dataset. It can ingest
unstructured data in its raw form (e.g. text, images), and it can automatically determine the set of
features which distinguish different categories of data from one another. Unlike classical machine
learning, deep learning doesn't require human intervention to process data, allowing us to scale
machine learning in more interesting ways. Deep learning and neural networks are primarily
credited with accelerating progress in areas such as computer vision, natural language processing,
and speech recognition.
Neural networks, or artificial neural networks (ANNs), are made up of node layers,
containing an input layer, one or more hidden layers, and an output layer. Each node, or artificial
neuron, connects to another and has an associated weight and threshold. If the output of any
individual node is above the specified threshold value, that node is activated, sending data to the
next layer of the network. Otherwise, no data is passed along to the next layer of the network.
The “deep” in deep learning refers to the depth of layers in a neural network. A neural network
that consists of more than three layers, counting the input and the output layers, can be
considered a deep learning algorithm or a deep neural network. A neural network with only two
or three layers is just a basic neural network.
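The node behavior described above can be sketched as a single artificial neuron with a step (threshold) activation. This is a minimal illustration, not a production implementation; the weights and threshold values are invented for the example:

```python
# A single artificial neuron: weighted sum of inputs compared to a threshold.
# If the sum exceeds the threshold, the node "activates" and passes data on.

def neuron(inputs, weights, threshold):
    """Return 1 (activated) if the weighted input sum exceeds the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# A tiny two-step pass: input values -> one hidden node -> one output node.
# Weights and thresholds here are illustrative, not learned.
hidden = neuron([1.0, 0.5], weights=[0.6, 0.4], threshold=0.5)  # 0.8 > 0.5, so 1
output = neuron([hidden], weights=[1.0], threshold=0.5)          # 1.0 > 0.5, so 1
```

In a real network these weights would be learned during training, and smooth activations (e.g. sigmoid or ReLU) usually replace the hard threshold so that gradients can flow.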
See the blog post “AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the
Difference?” for a closer look at how the different concepts relate.
UC Berkeley breaks out the learning system of a machine learning
algorithm into three main parts.
Machine learning methods
Supervised learning, also known as supervised machine learning, is defined by its use of labeled
datasets to train algorithms to classify data or predict outcomes accurately. As input data is
fed into the model, the model adjusts its weights until it has been fitted appropriately. This
occurs as part of the cross-validation process to ensure that the model
avoids overfitting or underfitting. Supervised learning helps organizations solve for a variety of
real-world problems at scale, such as classifying spam in a separate folder from your inbox.
Some methods used in supervised learning include neural networks, naïve Bayes, linear
regression, logistic regression, random forest, support vector machine (SVM), and more.
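As an illustrative sketch of supervised learning, the example below trains a logistic regression classifier from scratch by gradient descent. The tiny "spam" dataset, features, and hyperparameters are all invented for the demonstration:

```python
import math

# Supervised learning sketch: logistic regression on a labeled toy dataset.
# Each example has two hypothetical features: [number_of_links, has_urgent_word].

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by per-example gradient descent on log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of class 1
            err = p - yi                      # gradient factor for log loss
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, xi):
    """Classify as 1 if the decision function is positive."""
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z > 0 else 0

# Invented labeled data: 1 = spam, 0 = not spam.
X = [[0, 0], [1, 0], [3, 1], [4, 1]]
y = [0, 0, 1, 1]
w, b = train_logistic(X, y)
```

The model adjusts its weights as each labeled example is fed in, which is the weight-fitting process the paragraph above describes; real systems would also hold out data for cross-validation to guard against overfitting.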
Unsupervised learning, also known as unsupervised machine learning, uses machine learning
algorithms to analyze and cluster unlabeled datasets. These algorithms discover hidden patterns
or data groupings without the need for human intervention. Its ability to discover similarities and
differences in information makes it the ideal solution for exploratory data analysis, cross-selling
strategies, customer segmentation, and image and pattern recognition. It's also used to reduce the
number of features in a model through the process of dimensionality reduction; principal
component analysis (PCA) and singular value decomposition (SVD) are two common
approaches for this. Other algorithms used in unsupervised learning include neural networks, k-
means clustering, probabilistic clustering methods, and more.
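To make the clustering idea concrete, here is a minimal k-means sketch that groups unlabeled one-dimensional points without any human-provided labels. The points, starting centroids, and cluster count are invented for the example:

```python
# Unsupervised learning sketch: k-means clustering on unlabeled 1-D data.
# The algorithm alternates between assigning points to their nearest
# centroid and moving each centroid to the mean of its assigned points.

def kmeans(points, centroids, iters=10):
    """Return final centroids and the clusters of points assigned to each."""
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two obvious groupings hidden in unlabeled data (invented values).
points = [1.0, 1.2, 0.8, 10.0, 10.4, 9.6]
centroids, clusters = kmeans(points, centroids=[0.0, 5.0])
```

No labels are supplied anywhere: the grouping emerges from the data alone, which is what distinguishes this from the supervised setting.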
Semi-supervised learning
Semi-supervised learning offers a happy medium between supervised and unsupervised learning.
During training, it uses a smaller labeled data set to guide classification and feature extraction
from a larger, unlabeled data set. Semi-supervised learning can solve the problem of having not
enough labeled data (or not being able to afford to label enough data) to train a supervised
learning algorithm.
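One common semi-supervised pattern is self-training: fit a simple model on the small labeled set, use it to pseudo-label the larger unlabeled set, then refit on everything. The sketch below uses a nearest-centroid classifier with invented data; it is one possible illustration of the idea, not a canonical recipe:

```python
# Semi-supervised learning sketch: self-training with a nearest-centroid
# classifier. A small labeled set guides the labeling of a larger
# unlabeled pool, and the model is then refit on the combined data.

def centroid(values):
    """Mean of a list of 1-D points."""
    return sum(values) / len(values)

def self_train(labeled, unlabeled):
    """labeled: list of (value, class_id) pairs, class_id in {0, 1}."""
    # Step 1: fit a class centroid per label on the small labeled set.
    cents = [centroid([v for v, c in labeled if c == k]) for k in (0, 1)]
    # Step 2: pseudo-label each unlabeled point with its nearest centroid.
    pseudo = [(v, min((0, 1), key=lambda k: abs(v - cents[k]))) for v in unlabeled]
    # Step 3: refit the centroids on labeled + pseudo-labeled data combined.
    combined = labeled + pseudo
    return [centroid([v for v, c in combined if c == k]) for k in (0, 1)]

labeled = [(1.0, 0), (9.0, 1)]     # only two labeled examples (invented)
unlabeled = [0.5, 1.5, 8.5, 9.5]   # larger unlabeled pool (invented)
cents = self_train(labeled, unlabeled)
```

Only two examples carry human labels, yet the final model is fit on all six points, which is how semi-supervised learning stretches a small labeling budget.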
For a deep dive into the differences between these approaches, check out "Supervised vs.
Unsupervised Learning: What's the Difference?"