Unit 1 (MLT) Lecture Notes 1
We do not yet know how to make computers learn nearly as well as people learn.
However, algorithms have been invented that are effective for certain types of
learning tasks, and a theoretical understanding of learning is beginning to emerge.
Many practical computer programs have been developed to exhibit useful types of
learning, and significant commercial applications have begun to appear. For problems
such as speech recognition, algorithms based on machine learning outperform all
other approaches that have been attempted to date. In the field known as data mining,
machine learning algorithms are being used routinely to discover valuable knowledge
from large commercial databases containing equipment maintenance records, loan
applications, financial transactions, medical records, and the like. As our
understanding of computers continues to mature, it seems inevitable that machine
learning will play an increasingly central role in computer science and computer
technology. In recent years, many successful ML applications have been developed,
ranging from data-mining programs that learn to detect fraudulent credit card
transactions, to information-filtering systems that learn users’ reading preferences, to
autonomous vehicles that learn to drive on public highways.
The first problem: let us write a program to add two numbers a and b. Most of you
will wonder what kind of question this is; it is such a basic question that this
program is probably among the first programs all of us have written. So, how do we
really write this program? We essentially write a function f() that takes two
arguments a and b and returns a + b. This is a program all of you are familiar with;
we can add two numbers very easily by writing a computer program.
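For concreteness, here is a minimal sketch of that first program in Python (the language choice is ours; the function name f and the arguments a and b follow the description above):

    # Traditional program: the rule (addition) is written by the programmer.
    def f(a, b):
        return a + b

    print(f(2, 3))  # prints 5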
Let us try to solve a slightly different problem with the same technique, and we will
see whether we can solve it or whether we need some more tools in our toolkit. The
second problem is to recognize handwritten digits.
Let us take a step back and try to understand why we are able to recognize these
digits. We have been seeing these kinds of digits right from our childhood: when we
started our formal education, we were introduced to them.
So, somehow our brain is trained to recognize these digits even if they are written in a
slightly different style or a slightly different orientation. Can we mimic the training
that was provided to our brain, and give the same training to a computer? Let us try to
explore that; this is exactly the question that ML explores. So, let us write down the
key difference between the traditional programming paradigm and ML.
In the traditional programming world, we have a program; we give some data as input
and we also supply the rules, or rather we code these rules into the program, and then
the program produces the output. In ML, by contrast, we give the data together with
the expected outputs, and the machine learns the rules itself.
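As a rough illustration of this contrast, consider the following Python sketch; the toy task (doubling a number) and the names rule_based and learned_rule are made up purely for illustration:

    # Traditional programming: we code the rule ourselves.
    def rule_based(x):
        return 2 * x  # the rule "multiply by 2" is hand-written

    # Machine learning: we give data and expected outputs, and estimate the rule.
    xs = [1.0, 2.0, 3.0, 4.0]   # input data
    ys = [2.0, 4.0, 6.0, 8.0]   # expected outputs
    # Least-squares estimate of w in y = w * x (no intercept, for simplicity).
    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

    def learned_rule(x):
        return w * x  # the rule was discovered from the examples

    print(rule_based(5), learned_rule(5))  # 10 10.0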
4. History of ML
1950 — Alan Turing creates the “Turing Test” to determine if a computer has real
intelligence. To pass the test, a computer must be able to fool a human into believing
it is also human.
1952 — Arthur Samuel wrote the first computer learning program. The program was
the game of checkers, and the IBM computer improved at the game the more it
played, studying which moves made up winning strategies and incorporating those
moves into its program.
1957 — Frank Rosenblatt designed the first neural network for computers (the
perceptron), which simulates the thought processes of the human brain.
1967 — The “nearest neighbor” algorithm was written, allowing computers to begin
using very basic pattern recognition. This could be used to map a route for traveling
salesmen, starting at a random city but ensuring they visit all cities during a short tour.
1979 — Students at Stanford University invent the “Stanford Cart” which can
navigate obstacles in a room on its own.
1981 — Gerald Dejong introduces the concept of Explanation Based Learning (EBL),
in which a computer analyses training data and creates a general rule it can follow by
discarding unimportant data.
1985 — Terry Sejnowski invents NetTalk, which learns to pronounce words the same
way a baby does.
2006 — Geoffrey Hinton coins the term “deep learning” to explain new algorithms
that let computers “see” and distinguish objects and text in images and videos.
2010 — The Microsoft Kinect can track 20 human features at a rate of 30 times per
second, allowing people to interact with the computer via movements and gestures.
2011 — Google Brain is developed, and its deep neural network can learn to discover
and categorize objects much the way a cat does.
2015 — Microsoft creates the Distributed Machine Learning Toolkit, which enables the
efficient distribution of machine learning problems across multiple computers.
2015 — Over 3,000 AI and Robotics researchers, endorsed by Stephen Hawking, Elon
Musk and Steve Wozniak (among many others), sign an open letter warning of the
danger of autonomous weapons which select and engage targets without human
intervention.
Deep Learning (DL) refers to systems that learn from experience on large data sets.
6. Types of Learning
6.1 Supervised Learning:
Supervised learning is when the model is trained on a labelled dataset. A labelled
dataset is one that contains both the input and the output parameters. In this type of
learning, both the training and validation datasets are labelled.
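A minimal sketch of supervised training in Python, assuming scikit-learn is available; the tiny dog/cat dataset and its two input features are invented here purely for illustration:

    # Supervised learning: the model is trained on a labelled dataset.
    from sklearn.neighbors import KNeighborsClassifier

    # Labelled dataset: each input (weight in kg, height in cm) has an output label.
    X = [[30, 60], [35, 65], [4, 25], [5, 30]]   # input parameters
    y = ["dog", "dog", "cat", "cat"]             # output parameters (labels)

    model = KNeighborsClassifier(n_neighbors=1)
    model.fit(X, y)                    # training on the labelled dataset
    print(model.predict([[33, 62]]))   # -> ['dog']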
6.2 Unsupervised Learning:
Unsupervised learning is when the model is trained on an unlabelled dataset. Suppose
the machine is shown pictures of dogs and cats that it has never seen before. The
machine has no idea about the features of dogs and cats, so it cannot categorize the
pictures as dogs or cats. But it can categorize them according to their similarities,
patterns, and differences; that is, it can easily divide the pictures into two parts, where
the first part may contain all pictures having dogs in them and the second part may
contain all pictures having cats in them. Here the machine has learned nothing
beforehand; there is no training data or examples.
Unsupervised learning is classified into two categories of algorithms:
Clustering: A clustering problem is where you want to discover the inherent
groupings in the data, such as grouping customers by purchasing behavior.
Association: An association rule learning problem is where you want to discover
rules that describe large portions of your data, such as people that buy X also tend
to buy Y.
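A minimal clustering sketch in Python, again assuming scikit-learn is available; the data points are the same invented dog/cat measurements as above, but this time without labels:

    # Unsupervised learning: only inputs are given, no output labels.
    from sklearn.cluster import KMeans

    X = [[30, 60], [35, 65], [4, 25], [5, 30]]   # unlabelled data

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
    labels = kmeans.fit_predict(X)   # group the points by similarity
    print(labels)                    # e.g. [1 1 0 0]: two discovered groups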
In order to complete the design of the learning system, we must now choose
1. the exact type of knowledge to be learned
2. a representation for this target knowledge
3. a learning mechanism
Thus, we seek the weights that minimize the squared error E between the training
values and the values predicted by the current hypothesis, summed over the observed
training examples:

    E ≡ Σ over training examples ⟨b, V_train(b)⟩ of ( V_train(b) − V̂(b) )²

Several algorithms are known for finding weights of a linear function that minimize E
defined in this way. In our case, we require an algorithm that will incrementally refine
the weights as new training examples become available and that will be robust to
errors in these estimated training values. One such algorithm is called the least mean
squares, or LMS, training rule. For each observed training example it adjusts the
weights a small amount in the direction that reduces the error on this training example.
This algorithm can be viewed as performing a stochastic gradient-descent search
through the space of possible hypotheses (weight values) to minimize the squared
error E. The LMS algorithm is defined as follows: for each training example
⟨b, V_train(b)⟩, use the current weights to calculate V̂(b), and then update each
weight w_i as

    w_i ← w_i + η ( V_train(b) − V̂(b) ) x_i

where η is a small constant (e.g., 0.1) that moderates the size of the weight update
and x_i is the value of the i-th feature of board state b.
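A minimal sketch of the LMS rule in Python; the linear hypothesis follows the update rule above, but the feature vectors and training values here are invented purely for illustration:

    # LMS: for each example, nudge each weight in the direction that
    # reduces the squared error on that single example.
    def lms_update(weights, features, v_train, eta=0.1):
        # features[0] is a constant 1, so weights[0] plays the role of w0.
        v_hat = sum(w * x for w, x in zip(weights, features))   # current prediction
        error = v_train - v_hat                                 # V_train(b) - V_hat(b)
        return [w + eta * error * x for w, x in zip(weights, features)]

    # Illustrative training examples: (feature vector of b, V_train(b)).
    examples = [([1.0, 3.0, 0.0], 100.0),
                ([1.0, 0.0, 3.0], -100.0),
                ([1.0, 2.0, 2.0], 0.0)]

    weights = [0.0, 0.0, 0.0]
    for _ in range(50):               # repeated passes over the examples
        for x, v in examples:
            weights = lms_update(weights, x, v)
    print(weights)                    # approaches [0.0, 33.3, -33.3]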