MLT Unit-1 Notes

Uploaded by

srimaddhesia9
Copyright
© All Rights Reserved
MACHINE LEARNING & TECHNIQUES

UNIT -1

 What is machine learning?


Machine learning (ML) is a type of artificial intelligence focused on building
computer systems that learn from data. The broad range of techniques ML
encompasses enables software applications to improve their performance
over time.

Machine learning algorithms are trained to find relationships and patterns in
data. They use historical data as input to make predictions, classify
information, cluster data points, reduce dimensionality and even help generate
new content, as demonstrated by ML-fueled applications such as
ChatGPT.
Machine learning is a subset of AI that focuses on learning from data to
develop an algorithm that can be used to make predictions.

 History Of Machine Learning

The early history of Machine Learning (Pre-1940):

o 1834: Charles Babbage, the father of the computer, conceived a
device that could be programmed with punch cards. The machine was
never built, but all modern computers rely on its logical structure.
o 1936: Alan Turing put forward a theory of how a machine can
determine and execute a set of instructions.
The era of stored program computers:

o 1940s: "ENIAC", the first electronic general-purpose computer, was
completed in 1945; it had to be programmed manually. After that,
stored-program computers such as EDSAC (1949) and EDVAC (1951)
were built.
o 1943: A human neural network was modeled with an electrical
circuit. In 1950, scientists started putting the idea to work and
analyzed how human neurons might function.

Computer machinery and intelligence:

o 1950: Alan Turing published a seminal paper, "Computing
Machinery and Intelligence," on the topic of artificial intelligence. In
the paper, he asked, "Can machines think?"

Machine intelligence in Games:

o 1952: Arthur Samuel, a pioneer of machine learning, created a
program that helped an IBM computer play checkers. The more it
played, the better it performed.
o 1959: The term "Machine Learning" was coined by Arthur Samuel.

The first "AI" winter:

o The period from 1974 to 1980 was a tough time for AI and ML
researchers, and it came to be called the AI winter.
o During this period, machine translation failed to deliver, and people
lost interest in AI, which led governments to reduce research funding.

Machine Learning from theory to reality

o 1959: The first neural network was applied to a real-world
problem: removing echoes over phone lines using an adaptive filter.
o 1985: Terry Sejnowski and Charles Rosenberg invented the neural
network NETtalk, which was able to teach itself how to correctly
pronounce 20,000 words in one week.
o 1997: IBM's Deep Blue computer won a chess match against the
world champion Garry Kasparov, becoming the first computer to beat a
reigning world chess champion in a match.

Application Of Machine Learning:

1. Speech Recognition:

Speech recognition is the process of converting spoken instructions into
text; it is also known as "speech to text" or "computer speech
recognition." Machine learning algorithms are now widely used in
speech recognition applications. Google Assistant, Siri, Cortana, and
Alexa use speech recognition technology to follow voice instructions.
2. Traffic prediction:

If we want to visit a new place, we take the help of Google Maps, which
shows us the shortest route and predicts the traffic conditions.

It predicts whether traffic is clear, slow-moving, or heavily congested
from two sources of information:

o The real-time location of vehicles, from the Google Maps app and
sensors
o The average time taken on past days at the same time of day

3. Product recommendations:

Machine learning is widely used by e-commerce and entertainment
companies such as Amazon and Netflix to recommend products to the
user. Whenever we search for a product on Amazon, we start seeing
advertisements for the same product while surfing the internet in the
same browser, and this is because of machine learning.

Google infers the user's interests using various machine learning
algorithms and suggests products accordingly.

Similarly, when we use Netflix, we find recommendations for series,
movies, etc., and this is also done with the help of machine learning.

4. Self-driving cars:

One of the most exciting applications of machine learning is self-driving
cars. Tesla, a well-known car manufacturer, is working on self-driving
cars. Its car models are trained with machine learning methods to
detect people and objects while driving.
5. Email Spam and Malware Filtering:

Whenever we receive a new email, it is automatically filtered as
important, normal, or spam. Important mail arrives in our inbox marked
with the important symbol and spam goes to the spam box; the
technology behind this is machine learning.

6. Virtual Personal Assistant:

We have various virtual personal assistants such as Google
Assistant, Alexa, Cortana, and Siri. As the name suggests, they
help us find information using voice instructions. These assistants
can help us in various ways through voice instructions alone, such
as playing music, calling someone, opening an email, or scheduling
an appointment.

7. Online Fraud Detection:

Machine learning makes our online transactions safe and secure by
detecting fraudulent transactions. Whenever we perform an online
transaction, fraud can take place in various ways, such as fake
accounts, fake IDs, and money stolen in the middle of a transaction.
To detect this, a feed-forward neural network checks whether a
transaction is genuine or fraudulent.

Advantage Of ML:
1. Easily identifies trends and patterns
Machine Learning can review large volumes of data and
discover specific trends and patterns that would not be
apparent to humans. For instance, for an e-commerce
website like Amazon, it serves to understand the
browsing behaviors and purchase histories of its users to
help cater to the right products, deals, and reminders
relevant to them. It uses the results to reveal relevant
advertisements to them.

2. No human intervention needed (automation)

Once an ML system is set up, it can learn and make predictions on
its own, so no human intervention is needed at every step.
3. Continuous Improvement
As ML algorithms gain experience, they keep improving
in accuracy and efficiency. This lets them make better
decisions. Say you need to make a weather forecast
model. As the amount of data you have keeps growing,
your algorithms learn to make more accurate predictions
faster.

4. Handling multi-dimensional and multi-variety data
Machine Learning algorithms are good at handling data
that are multi-dimensional and multi-variety, and they
can do this in dynamic or uncertain environments.
5. Wide Applications
You could be an e-tailer or a healthcare provider and
make ML work for you. Where it does apply, it holds the
capability to help deliver a much more personal
experience to customers while also targeting the right
customers.
Disadvantages of Machine Learning

1. Data Acquisition
Machine Learning requires massive data sets to train on, and
these should be inclusive/unbiased and of good quality. There
can also be times when you must wait for new data to be
generated.

2. Time and Resources

ML needs enough time to let the algorithms learn and develop
enough to fulfill their purpose with a considerable amount of
accuracy and relevancy. It also needs massive resources to
function. This can mean additional computing power
requirements for you.

3. Interpretation of Results

Another major challenge is the ability to accurately interpret
results generated by the algorithms. You must also carefully
choose the algorithms for your purpose.

4. Highly Expensive

This software is highly expensive, and not everybody can own it.
Government agencies, big private firms, and enterprises mostly
own it. It needs to be made accessible to everybody for wide use.

5. Privacy Concern

As we know, one of the pillars of machine learning is data.
The collection of data has raised the fundamental question of
privacy. The way data is collected and used for commercial
purposes has always been a contentious issue. In India, the
Supreme Court has declared privacy a fundamental right
of Indians. Without the user's permission, data cannot be
collected, used, or stored. However, many cases have come up
in which big firms collected data without the user's knowledge
and used it for their commercial gain.
6. Research And Innovation
Machine learning is an evolving concept. According to these notes,
the area has not yet seen a development that fully revolutionized an
economic sector. It requires continuous research and innovation.

Types Of ML :

1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Reinforcement Learning

1.) Supervised Learning :


Supervised machine learning is based on supervision. In the
supervised learning technique, we train the machine using a
"labelled" dataset, and based on that training, the machine
predicts the output.

Let's understand supervised learning with an example.
Suppose we have an input dataset of cat and dog images. First, we
train the machine to understand the images using features such as
the shape and size of the tail, the shape of the eyes, colour, and
height (dogs are taller, cats are smaller). After training, we input
the picture of a cat and ask the machine to identify the object and
predict the output. Since the machine is now trained, it checks the
features of the object, such as height, shape, colour, eyes, ears, and
tail, and finds that it is a cat. So, it puts it in the Cat category. This
is how the machine identifies objects in supervised learning.
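The cat and dog example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two numeric features (height and tail length), every number in the dataset, and the nearest-centroid rule are invented here and are not part of the notes. The sketch shows the two phases described above, training on labelled data and then predicting the label of a new example.

```python
# Toy labelled dataset: (height_cm, tail_length_cm) -> label.
# All values are invented for illustration.
training_data = [
    ((25.0, 28.0), "cat"),
    ((23.0, 30.0), "cat"),
    ((24.0, 27.0), "cat"),
    ((55.0, 35.0), "dog"),
    ((60.0, 33.0), "dog"),
    ((58.0, 38.0), "dog"),
]

def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

centroids = train(training_data)
print(predict(centroids, (26.0, 29.0)))  # a small animal
print(predict(centroids, (57.0, 34.0)))  # a large animal
```

Real image classifiers learn their features from pixels rather than from two hand-picked measurements, but the train-then-predict pattern is the same.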
Application :
Some common applications of Supervised Learning are given
below:

Image Segmentation:

Supervised Learning algorithms are used in image segmentation.
In this process, image classification is performed on different
image data with pre-defined labels.

Medical Diagnosis:

Supervised algorithms are also used in the medical field for
diagnosis. This is done using medical images and past data
labelled with disease conditions. With such a process, the
machine can identify a disease in new patients.

Fraud Detection - Supervised Learning classification algorithms
are used for identifying fraud transactions, fraud customers, etc.
It is done by using historic data to identify the patterns that can
lead to possible fraud.

Spam detection - In spam detection & filtering, classification
algorithms are used. These algorithms classify an email as spam
or not spam. The spam emails are sent to the spam folder.

Speech Recognition - Supervised learning algorithms are also
used in speech recognition. The algorithm is trained with voice
data, and various identifications can be done using the same,
such as voice-activated passwords, voice commands, etc.
2. Unsupervised Machine Learning
Unsupervised learning is different from the Supervised learning technique; as its
name suggests, there is no need for supervision. It means, in unsupervised machine
learning, the machine is trained using the unlabeled dataset, and the machine
predicts the output without any supervision.

In unsupervised learning, the models are trained with the data that is neither
classified nor labelled, and the model acts on that data without any supervision.

The main aim of an unsupervised learning algorithm is to group or
categorize the unsorted dataset according to its similarities,
patterns, and differences. The machine is instructed to find the
hidden patterns in the input dataset.

Let's take an example to understand it more precisely. Suppose there is a
basket of fruit images, and we input it into the machine learning model. The
images are totally unknown to the model, and the task of the machine is to
find the patterns and categories of the objects.

The machine then discovers the patterns and differences on its own,
such as differences in colour and shape, and predicts the output
when it is tested with the test dataset.

Applications of Unsupervised Learning:

o Network Analysis: Unsupervised learning is used for
identifying plagiarism and copyright in document network
analysis of text data for scholarly articles.
o Recommendation Systems: Recommendation systems
widely use unsupervised learning techniques for building
recommendation applications for different web applications
and e-commerce websites.
o Anomaly Detection: Anomaly detection is a popular
application of unsupervised learning, which can identify
unusual data points within the dataset. It is used to discover
fraudulent transactions.
o Singular Value Decomposition: Singular Value
Decomposition or SVD is used to extract particular
information from the database. For example, extracting
information of each user located at a particular location.

3. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in
which an AI agent (a software component) automatically explores
its surroundings by trial and error: taking actions, learning from
experience, and improving its performance. The agent is rewarded for
each good action and punished for each bad action; hence the goal of
a reinforcement learning agent is to maximize the rewards.

In reinforcement learning, there is no labelled data as in supervised
learning, and agents learn from their experiences only.

The reinforcement learning process is similar to how a human being
learns; for example, a child learns various things through experience in
day-to-day life. An example of reinforcement learning is playing a
game, where the game is the environment, the moves of the agent at
each step define states, and the goal of the agent is to get a high
score. The agent receives feedback in the form of punishments and
rewards.
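The reward-and-punishment loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the notes do not name a specific one). Everything concrete below is an assumption made for illustration: a five-position corridor as the environment, a reward of 1.0 at the goal, a penalty of 0.1 per other move, and the learning parameters.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True     # reward for reaching the goal
    return next_state, -0.1, False       # small punishment for every other move

def train(episodes=300, seed=0):
    """Learn a table of action values Q(state, action) from experience."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:               # explore: try a random action
                action = rng.choice(ACTIONS)
            else:                                    # exploit: best known action
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

With these settings the learned policy should prefer moving right (+1) from every position, since that is the shortest way to the rewarded goal.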

Application Of Reinforcement Learning:

1. Robotics:

RL is used in robot navigation, robo-soccer, walking, juggling, etc.

2. Control:

RL can be used for adaptive control, such as factory processes,
admission control in telecommunication, and helicopter piloting.

3. Game Playing:

RL can be used in game playing, such as tic-tac-toe, chess, etc.

4. Chemistry:

RL can be used for optimizing chemical reactions.

5. Business:

RL is now used for business strategy planning.

6. Manufacturing:

In various automobile manufacturing companies, robots use
deep reinforcement learning to pick goods and put them in
containers.

7. Finance Sector:

RL is currently used in the finance sector for evaluating
trading strategies.

Steps Of Machine Learning :

1. Gathering Data:
Data gathering is the first step of the machine learning life cycle.

In this step, we need to identify the different data sources, as
data can be collected from various sources such as files,
databases, the internet, or mobile devices. It is one of the
most important steps of the life cycle. The quantity and quality of
the collected data determine the quality of the output. In general,
the more data we have, the more accurate the prediction will be.

This step includes the below tasks:

o Identify various data sources
o Collect data
o Integrate the data obtained from different sources

2. Data preparation

After collecting the data, we need to prepare it for further steps.
Data preparation is the step where we put our data into a suitable
place and prepare it for use in machine learning training.

In this step, we first put all the data together and then randomize
its ordering.

This step can be further divided into two processes:

o Data exploration:

It is used to understand the nature of the data that we have to
work with. We need to understand the characteristics,
format, and quality of the data.
A better understanding of the data leads to a more effective
outcome. In this step, we find correlations, general trends, and
outliers.

o Data pre-processing:

The next step is to pre-process the data (for example, cleaning it
and converting it into a usable format) for analysis.

3. Data Analysis
The aim of this step is to build a machine learning model to
analyze the data using various analytical techniques and review
the outcome. It starts with determining the type of problem,
where we select a machine learning technique such as
Classification, Regression, Cluster analysis, or Association.
We then build the model using the prepared data and evaluate it.

Hence, in this step, we take the data and use machine learning
algorithms to build the model.

5. Train the Algorithm/Model

The next step is to train the model. In this step, we train our
model to improve its performance and get a better outcome for
the problem.

We use datasets to train the model using various machine
learning algorithms. Training is required so that the model can
understand the various patterns, rules, and features.

6. Test Model
Once our machine learning model has been trained on a given
dataset, we test the model. In this step, we check the
accuracy of our model by providing a test dataset to it.

Testing the model determines its percentage accuracy against
the requirements of the project or problem.
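The preparation, training, and testing steps above can be sketched together as follows. This is a minimal illustration under assumptions: the dataset (hours studied versus passing an exam), the 25% test fraction, and the one-feature threshold "model" are all invented here. The sketch randomizes the data order, holds out a test set, trains on the rest, and reports accuracy on the held-out rows.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle the rows, then hold out a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train_rows):
    """'Train' a one-feature model: the midpoint between the two class means."""
    pos = [x for x, y in train_rows if y == 1]
    neg = [x for x, y in train_rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(threshold, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for x, y in rows if (1 if x > threshold else 0) == y)
    return correct / len(rows)

# Invented data: (hours_studied, passed_exam)
data = [(h, 1 if h >= 6 else 0) for h in range(1, 11)] * 2
train_rows, test_rows = train_test_split(data)
model = fit_threshold(train_rows)
print(f"test accuracy: {accuracy(model, test_rows):.0%}")
```

Measuring accuracy on rows the model never saw during training is what makes the number a fair estimate of how it will behave after deployment.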

7. Deployment
The last step of the machine learning life cycle is deployment, where
we deploy the model in a real-world system.

If the model prepared above produces accurate results as per
our requirements with acceptable speed, we deploy it in the
real system. Before deploying the project, we check whether it
is improving its performance using the available data. The
deployment phase is similar to making the final report for a
project.

 Clustering :
Clustering or cluster analysis is a machine learning technique
that groups an unlabelled dataset. It can be defined as "A way
of grouping the data points into different clusters,
consisting of similar data points. The objects with the
possible similarities remain in a group that has less or no
similarities with another group."

It does this by finding similar patterns in the unlabelled
dataset, such as shape, size, color, behavior, etc., and divides
the data according to the presence or absence of those patterns.

It is an unsupervised learning method, hence no supervision is
provided to the algorithm, and it deals with an unlabelled dataset.

Example: Let's understand the clustering technique with the
real-world example of a mall. When we visit any shopping mall, we
can observe that things with similar usage are grouped together:
t-shirts are in one section and trousers in another. Similarly, in
the fruit and vegetable section, apples, bananas, mangoes, etc., are
kept in separate groups so that we can easily find things. The
clustering technique works in the same way. Another example of
clustering is grouping documents according to topic.

The clustering technique can be widely used in various tasks.
Some of the most common uses of this technique are:

o Market Segmentation
o Statistical data analysis
o Social network analysis
o Image segmentation
o Anomaly detection, etc.

Apart from these general usages, it is used by Amazon in its
recommendation system to provide recommendations based on
past product searches. Netflix also uses this technique to
recommend movies and web series to its users based on their
watch history.

The below diagram explains the working of the clustering
algorithm. We can see that the different fruits are divided into
several groups with similar properties.

Types Of Clustering :

1.) Partitioning Clustering

It is a type of clustering that divides the data into non-hierarchical
groups. It is also known as the centroid-based method. The
most common example of partitioning clustering is the K-Means
Clustering algorithm.
In this type, the dataset is divided into a set of K groups, where K
is the pre-defined number of groups. The cluster centers are
chosen so that each data point is closer to the centroid of its own
cluster than to the centroid of any other cluster.
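The centroid-based idea can be sketched as a plain K-Means loop. This is a minimal illustration, not a production implementation: the six 2-D points, the fixed iteration count, and the random choice of starting centroids are assumptions made here.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on 2-D points; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # start from k random data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                        # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments

# Invented data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```

The first three points should end up sharing one cluster index and the last three the other; which group gets index 0 depends on the random starting centroids.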

2.) Density-Based Clustering

The density-based clustering method connects highly dense
areas into clusters, and arbitrarily shaped clusters can form as
long as the dense regions can be connected. The algorithm
identifies the different clusters in the dataset by joining areas of
high density. The dense areas in the data space are separated
from each other by sparser areas.

These algorithms can face difficulty in clustering the data points if
the dataset has varying densities and high dimensions.
3.) Distribution Model-Based Clustering
In the distribution model-based clustering method, the data is
divided based on the probability that a data point belongs to a
particular distribution. The grouping is done by assuming some
distribution, most commonly the Gaussian distribution.

An example of this type is the Expectation-Maximization
clustering algorithm, which uses Gaussian Mixture Models (GMM).

4.) Hierarchical Clustering

Hierarchical clustering can be used as an alternative to
partitioning clustering, as there is no requirement to pre-specify
the number of clusters to be created. In this technique, the
dataset is divided into clusters to create a tree-like structure,
which is also called a dendrogram. Any number of clusters can
be selected by cutting the tree at the appropriate level. The most
common example of this method is the Agglomerative
Hierarchical algorithm.
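A minimal sketch of the agglomerative (bottom-up) approach is given below. The 1-D values and the single-linkage distance rule are assumptions chosen to keep the example short; other linkage rules exist. Stopping at `n_clusters` plays the role of cutting the dendrogram at a chosen level.

```python
def agglomerative(points, n_clusters):
    """Bottom-up clustering of 1-D values with single linkage."""
    clusters = [[p] for p in points]           # every point starts as its own cluster
    merges = []                                # record of merges, i.e. the dendrogram
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]    # merge the closest pair
        del clusters[j]
    return clusters, merges

values = [1.0, 1.2, 5.0, 5.1, 9.0]   # invented data
clusters, merges = agglomerative(values, n_clusters=3)
print(clusters)  # [[1.0, 1.2], [5.0, 5.1], [9.0]]
```

The `merges` list records the order in which clusters were joined, which is the information a dendrogram draws.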
Application Of Clustering :

o In Identification of Cancer Cells: The clustering
algorithms are widely used for the identification of cancerous
cells. It divides the cancerous and non-cancerous data sets
into different groups.
o In Search Engines: Search engines also work on the
clustering technique. The search result appears based on the
closest object to the search query. It does it by grouping
similar data objects in one group that is far from the other
dissimilar objects. The accurate result of a query depends on
the quality of the clustering algorithm used.
o Customer Segmentation: It is used in market research to
segment the customers based on their choice and
preferences.
o In Biology: It is used in the biology stream to classify
different species of plants and animals using the image
recognition technique.
o In Land Use: The clustering technique is used to identify
areas of similar land use in a GIS database. This can be very
useful for finding the purpose for which a particular piece of
land is most suitable.

 Artificial Neural Network :

The term "Artificial Neural Network" is derived from the biological
neural networks that make up the structure of the human brain.
Just as the human brain has neurons interconnected with
one another, artificial neural networks also have neurons that are
interconnected with one another in various layers of the network.
These neurons are known as nodes.

An Artificial Neural Network, in the field of Artificial
Intelligence, attempts to mimic the network of neurons that
makes up a human brain so that computers can understand
things and make decisions in a human-like manner.
The artificial neural network is designed by programming
computers to behave like interconnected brain cells.

There are around 86 billion neurons in the human brain (a figure
often rounded to 100 billion). Each neuron has somewhere between
1,000 and 100,000 connection points. In the human brain, data is
stored in a distributed manner, and we can retrieve more than one
piece of this data from our memory in parallel when necessary. We
can say that the human brain is made up of incredibly powerful
parallel processors.

We can understand the artificial neural network with an example.
Consider a digital logic gate that takes inputs and gives an output,
such as an "OR" gate, which takes two inputs. If one or both of the
inputs are "On," we get "On" as the output. If both inputs are "Off,"
we get "Off" as the output. Here the output depends only on the
input. Our brain does not work in this fixed way: the relationship
between outputs and inputs keeps changing because the neurons in
our brain are "learning."
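To connect the OR gate to learning, the sketch below trains a single artificial neuron (a perceptron) to reproduce the OR truth table instead of hard-wiring it. The perceptron learning rule, the learning rate of 0.1, and the 10 training passes are assumptions made for this illustration; the notes do not prescribe a particular training method.

```python
def step(z):
    """Threshold activation: the neuron 'fires' (1) when its input is positive."""
    return 1 if z > 0 else 0

def train_perceptron(samples, epochs=10, lr=0.1):
    """Learn two weights and a bias with the perceptron learning rule."""
    w1, w2, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - step(w1 * x1 + w2 * x2 + bias)
            # Nudge each weight in the direction that reduces the error.
            w1 += lr * error * x1
            w2 += lr * error * x2
            bias += lr * error
    return w1, w2, bias

# Truth table of the OR gate: ((input1, input2), expected_output)
or_gate = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w1, w2, bias = train_perceptron(or_gate)
for (x1, x2), target in or_gate:
    print((x1, x2), "->", step(w1 * x1 + w2 * x2 + bias))
```

The weights start at zero and change only when the neuron makes a mistake, which is the sense in which the input-output relationship is "learned" rather than fixed in advance.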

 Decision Tree :

o Decision Tree is a Supervised learning technique that
can be used for both classification and Regression problems,
but mostly it is preferred for solving Classification problems.
It is a tree-structured classifier, where internal nodes
represent the features of a dataset, branches
represent the decision rules and each leaf node
represents the outcome.

o In a Decision tree, there are two types of nodes:
the Decision Node and the Leaf Node. Decision nodes are
used to make a decision and have multiple branches,
whereas Leaf nodes are the output of those decisions and do
not contain any further branches.
o The decisions or the test are performed on the basis of
features of the given dataset.
o It is a graphical representation for getting all the
possible solutions to a problem/decision based on
given conditions.
o It is called a decision tree because, similar to a tree, it starts
with the root node, which expands on further branches and
constructs a tree-like structure.
o In order to build a tree, we can use the CART algorithm, which
stands for Classification and Regression Tree
algorithm.
o A decision tree simply asks a question, and based on the
answer (Yes/No), it further splits the tree into subtrees.
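How a tree-building algorithm chooses which question to ask can be sketched with Gini impurity, the split measure commonly associated with CART. The four data rows, the feature names, and the thresholds below are invented for illustration; the function scores a candidate Yes/No question, and a lower score means the question separates the classes better.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

def split_quality(rows, feature, threshold):
    """Weighted Gini impurity after asking 'is feature >= threshold?'."""
    yes = [label for features, label in rows if features[feature] >= threshold]
    no = [label for features, label in rows if features[feature] < threshold]
    n = len(rows)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)

# Invented rows: ({feature: value}, label)
rows = [
    ({"salary": 30, "distance": 5}, "decline"),
    ({"salary": 35, "distance": 20}, "decline"),
    ({"salary": 60, "distance": 25}, "accept"),
    ({"salary": 70, "distance": 8}, "accept"),
]
print(split_quality(rows, "salary", 50))    # 0.0: this question separates the classes
print(split_quality(rows, "distance", 10))  # 0.5: this question does not help
```

A tree builder would evaluate many such questions and put the one with the lowest impurity at the current node, then repeat on each resulting subset.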

Example:

Suppose there is a candidate who has a job offer and wants to
decide whether to accept the offer or not. To solve this
problem, the decision tree starts with the root node (the Salary
attribute, chosen by an attribute selection measure, ASM). The root
node splits into the next decision node (distance from the office)
and one leaf node, based on the corresponding labels. That decision
node in turn splits into one decision node (cab facility) and one
leaf node. Finally, the last decision node splits into two leaf nodes
(Accepted offer and Declined offer).

[Diagram: decision tree for the job-offer example]
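The job-offer tree can be written as nested conditions, one per decision node. The structure (salary, then distance, then cab facility) follows the description above, but the notes give no numbers and do not say which branch leads to which leaf, so the thresholds `min_salary` and `max_distance_km` and the branch outcomes below are hypothetical.

```python
def decide_offer(salary, distance_km, cab_facility,
                 min_salary=50_000, max_distance_km=15):
    """Walk the job-offer tree from the root node down to a leaf node.

    The thresholds and the outcome of each branch are assumptions.
    """
    if salary < min_salary:              # root node: salary
        return "Declined offer"          # leaf node
    if distance_km > max_distance_km:    # decision node: distance from the office
        return "Declined offer"          # leaf node
    if cab_facility:                     # decision node: cab facility
        return "Accepted offer"          # leaf node
    return "Declined offer"              # leaf node

print(decide_offer(salary=60_000, distance_km=10, cab_facility=True))
print(decide_offer(salary=40_000, distance_km=10, cab_facility=True))
```

Each `if` corresponds to a decision node and each `return` to a leaf node, which is why a trained decision tree is easy to read and explain.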
Difference Between Data Science And Machine Learning :

S.No | Data Science | Machine Learning
1. | Data Science is a field about processes and systems to extract data from structured and semi-structured data. | Machine Learning is a field of study that gives computers the capability to learn without being explicitly programmed.
2. | Needs the entire analytics universe. | Combination of machine and data science.
3. | Branch that deals with data. | Machines utilize data science techniques to learn about the data.
4. | Data in Data Science may or may not have evolved from a machine or mechanical process. | It uses various techniques like regression and supervised clustering.
5. | Data Science, as a broader term, focuses not only on algorithms and statistics but also takes care of the data processing. | It is focused only on algorithms and statistics.
6. | It is a broad term for multiple disciplines. | It fits within data science.
7. | Includes many operations, such as data gathering, data cleaning, data manipulation, etc. | It is of three types: Unsupervised learning, Reinforcement learning, Supervised learning.
8. | Example: Netflix uses Data Science technology. | Example: Facebook uses Machine Learning technology.