
Module1 ML

Machine learning (ML) is a branch of artificial intelligence focused on developing algorithms that learn from data to perform tasks without explicit instructions. Supervised learning, a key aspect of ML, relies on past labeled data to train models for tasks such as classification and regression. Various algorithms, including Naïve Bayes, Decision Trees, and k-Nearest Neighbors (KNN), are commonly used to solve classification problems.

Machine learning (ML)

Machine learning
• Machine learning (ML) is a field of study in artificial intelligence
concerned with the development and study of statistical algorithms
that can learn from data and generalize to unseen data, and thus
perform tasks without explicit instructions.
• As regards machines, we might say, very broadly, that a machine
learns whenever it changes its structure, program, or data (based on
its inputs or in response to external information) in such a manner
that its expected future performance improves.
Supervised Learning
• The major motivation of supervised learning is to learn from past information. So
what kind of past information does the machine need for supervised learning?
• It is the information about the task that the machine has to execute. In the
context of the definition of machine learning, this past information is the experience.
• Say a machine is getting images of different objects as input and the task is to
segregate the images by either shape or colour of the object.
• If it is by shape, the images which are of round-shaped objects need to be
separated from images of triangular-shaped objects, etc. If the segregation needs
to happen based on colour, images of blue objects need to be separated from
images of green objects.
• But how can the machine know what a round shape or a triangular shape is?
• In the same way, how can the machine distinguish the image of an object based on
whether it is blue or green in colour?
Training data
• A machine needs the basic information to be provided to it. This basic
input, or the experience in the paradigm of machine learning, is given
in the form of training data.
• Training data is the past information on a specific task.
• In the context of the image segregation problem, training data will have
past data on different aspects or features of a number of images,
along with a tag on whether the object in the image is round or
triangular, or blue or green in colour.
• The tag is called a 'label', and we say that the training data is labelled
in the case of supervised learning.
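To make the idea of labelled training data concrete, the following minimal Python sketch represents a few images from the segregation example as feature records paired with a label. The feature names (`corners`, `colour`) and all values are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical labelled training data for the shape-segregation task.
# Each record is (features, label); the label is the "tag" on the image.
training_data = [
    ({"corners": 0, "colour": "blue"}, "round"),
    ({"corners": 0, "colour": "green"}, "round"),
    ({"corners": 3, "colour": "blue"}, "triangular"),
    ({"corners": 3, "colour": "green"}, "triangular"),
]

# Separate the inputs (features) from the labels the machine learns from.
features = [record for record, _ in training_data]
labels = [label for _, label in training_data]

# Count how many examples carry each label.
label_counts = Counter(labels)
print(label_counts)
```

If the task were segregation by colour instead, the same images would simply be tagged with a colour label rather than a shape label.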
Supervised learning
Examples
• Predicting the results of a game
• Predicting whether a tumour is malignant or benign
• Predicting prices in domains like real estate, stocks, etc.
• Classifying texts, such as classifying a set of emails as spam or non-spam
• Now, let’s consider two of the above examples, say ‘predicting whether a
tumour is malignant or not’ and ‘predicting price of real estate’. Are these
two problems the same in nature?
• The answer is ‘no’.
• Though both of them are prediction problems, in one case we are trying to
predict which category or class an unknown data instance belongs to, whereas
in the other case we are trying to predict a numerical value and not a class.
• When we are trying to predict a categorical or nominal variable, the
problem is known as a classification problem. When we are trying
to predict a real-valued variable, the problem falls under the category of
regression.
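The distinction can be illustrated with a small Python sketch that looks only at the type of the target variable. The function name `problem_type` and the sample targets are hypothetical; the rule is a simplification, since class labels that happen to be encoded as integers would be misread as regression targets.

```python
def problem_type(targets):
    """Guess the supervised problem type from the target values.

    Simplifying assumption: numeric targets are real-valued quantities,
    anything else (e.g. strings) is a categorical class label.
    """
    numeric = all(
        isinstance(t, (int, float)) and not isinstance(t, bool)
        for t in targets
    )
    return "regression" if numeric else "classification"


# Hypothetical targets for the two examples discussed above.
tumour_targets = ["malignant", "benign", "malignant"]  # categorical classes
price_targets = [250000.0, 310500.0, 189900.0]         # real-valued prices

print(problem_type(tumour_targets))
print(problem_type(price_targets))
```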
Classification
• Segregate the images of objects based on the shape:
• If the image is of a round object, it is put under one category, while if the image is
of a triangular object, it is put under another category.
• The category in which the machine should put an image of unknown category, also
called test data in machine learning, depends on the information it gets from
the past data, which we have called training data. Since the training data has a
label or category defined for each and every image, the machine has to map a
new image or test data to the set of images to which it is similar and assign the
same label or category to the test data.
• So we observe that the whole problem revolves around assigning a label or
category or class to test data based on the label or category or class information
that is imparted by the training data.
• Since the target objective is to assign a class label, this type of problem is
referred to as a classification problem.
In summary, classification is a type of supervised
learning where a target feature, which is
categorical in nature, is predicted for test data
based on the information imparted by training
data. The target categorical feature is known as
class.
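One simple way to sketch "assign the label of the most similar set of training images" is a nearest-centroid rule, shown below. This is only one illustrative technique and is not taken from the slides; the two features (number of corners, a roundness score) and all values are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(training_data):
    """Average the feature vectors of each class into one centroid per label."""
    grouped = defaultdict(list)
    for feature_vector, label in training_data:
        grouped[label].append(feature_vector)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, test_point):
    """Assign the label whose centroid is closest to the test data point."""
    return min(centroids, key=lambda label: math.dist(centroids[label], test_point))


# Hypothetical labelled training data: (corners, roundness score) -> class.
training_data = [
    ((0, 0.95), "round"),
    ((0, 0.90), "round"),
    ((3, 0.30), "triangular"),
    ((3, 0.35), "triangular"),
]

centroids = fit_centroids(training_data)
print(classify(centroids, (0, 0.88)))  # unlabelled test data point
```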

1. Image classification
2. Prediction of disease
3. Win–loss prediction of games
4. Prediction of natural calamities like earthquakes, floods, etc.
5. Recognition of handwriting
Algorithms
• There are a number of popular machine learning algorithms which help
in solving classification problems.
• To name a few, Naïve Bayes, Decision tree, and k-Nearest Neighbour
algorithms are adopted by many machine learning practitioners.
KNN
• Nonparametric statistics is a type of statistical analysis that makes
minimal assumptions about the underlying distribution of the data
being studied. Often these models are infinite-dimensional, rather
than finite-dimensional, as in parametric statistics.
• The k-nearest neighbors algorithm, also known as KNN or k-NN, is a
non-parametric, supervised learning classifier, which uses proximity
to make classifications or predictions about the grouping of an
individual data point.
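A minimal sketch of the k-NN idea in plain Python follows, assuming Euclidean distance as the proximity measure and a simple majority vote among the k closest training points. The function name `knn_predict`, the choice of k = 3, and the two-dimensional sample points are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; proximity is
    measured with Euclidean distance (an assumption of this sketch).
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled points forming two well-separated groups.
train = [
    ((1.0, 1.0), "round"),
    ((1.2, 0.8), "round"),
    ((0.9, 1.1), "round"),
    ((3.0, 3.2), "triangular"),
    ((3.1, 2.9), "triangular"),
    ((2.8, 3.0), "triangular"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (3.0, 3.0)))
```

Note that nothing is fitted in advance: the training data itself is kept and consulted at prediction time, which is consistent with the non-parametric description above.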
