CHP1 Introduction To Machine Learning

Uploaded by

cseicb354
Copyright
© All Rights Reserved
CHP1: Introduction to Machine Learning

Terminologies:

The three main types of machine learning are Supervised Learning, Unsupervised
Learning, and Reinforcement Learning.

Supervised Learning: This is depicted in red. It involves training a model with
labelled data. Two common tasks associated with supervised learning are:

Classification: Illustrated with two groups of different shapes (triangles
and circles) separated by a dashed line, indicating the model’s ability
to categorize data into distinct classes.

Regression: Shown as a scatter plot of stars and circles with a solid line,
indicating the model’s ability to predict continuous outcomes.

Unsupervised Learning: This is shown in blue. It involves training a model
with unlabelled data. A common task associated with it is:

Clustering: Represented by a mix of red and green dots grouped together,
showcasing the model’s capability to group similar data points
without pre-defined labels.

Reinforcement Learning: This is depicted in green. It involves a model that
takes actions in an environment and then receives state updates and feedback.
An illustration shows the interaction between the “Model Agent” and
“Environment” through actions, feedback, and state updates.

Each type of machine learning has specific applications and methods for training
models to perform tasks based on input data.

Regression is a statistical method used in machine learning and data analysis to
predict a continuous outcome variable (dependent variable) based on one or more
predictor variables (independent variables). The goal of regression is to find the
relationship between the independent and dependent variables.

In the context of machine learning, regression algorithms are supervised learning
methods that learn this relationship from training data, and can then predict the
output for unseen data.

There are several types of regression, including:

Linear Regression: Assumes a linear relationship between the input variables
and the single output variable. It can be used to predict a quantitative
response.

Logistic Regression: Despite its name, it’s a classification method. It uses the
logistic function to model a binary dependent variable (an outcome that can
take only two values, like 0/1, Yes/No).

Polynomial Regression: Used when the data points are modeled better by an
nth degree polynomial instead of a straight line (as in Linear Regression).

Ridge/Lasso Regression: These are regularized forms of regression (Ridge adds an
L2 penalty on the coefficients, Lasso an L1 penalty) used to reduce
overfitting in linear and polynomial regression.

The regression model is typically evaluated using metrics like Mean Absolute Error
(MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-
squared (coefficient of determination) value.
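
As a minimal sketch of the metrics above, the following Python fits a straight line by
ordinary least squares and then computes MAE, MSE, RMSE, and R-squared. The data values
and the function names `fit_line` and `regression_metrics` are made up for illustration
and are not part of the original notes.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total variation
    ss_res = sum(e * e for e in errors)              # unexplained variation
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Hypothetical training data, chosen only to illustrate the calculation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
metrics = regression_metrics(ys, preds)
print(slope, intercept, metrics)
```

Lower MAE, MSE, and RMSE indicate a closer fit, while an R-squared near 1 means the line
explains most of the variation in the output.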

Clustering is an unsupervised machine learning technique used to group similar
instances on the basis of features into clusters. The goal is to partition the data
into sets such that the data points in the same set share some common traits.

The key idea behind clustering is that data points in the same group are more
similar to each other than to those in other groups. In simple words, the aim is to
segregate groups with similar traits and assign them into clusters.

There are several types of clustering methods, including:

K-Means Clustering: This algorithm partitions the input data into K distinct
clusters. The number of clusters (K) is user specified.

Hierarchical Clustering: This algorithm builds a hierarchy of clusters where
each node is a cluster consisting of the clusters of its offspring nodes.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise): This
algorithm groups together points that are packed closely together (points with
many nearby neighbors), marking low-density regions that are far from any
cluster as outliers.
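
The K-Means idea described above can be sketched in a few lines of Python. This is a
simplified illustration, not a production implementation: it assumes the first K points
are used as the initial centroids (real implementations usually initialize randomly),
and the sample points and the function name `kmeans` are made up for this example.

```python
def kmeans(points, k, iterations=20):
    """Very small K-Means: alternate between assigning points and moving centroids."""
    # Assumption: the first k points act as initial centroids so the run is deterministic.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: each centroid moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Note that no labels are supplied: the grouping comes only from the distances between
points, which is what makes this an unsupervised method.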

Association in machine learning refers to a type of rule-based machine learning
method that is primarily used for discovering interesting relations or associations
among a large set of data items. This method is a part of unsupervised learning,
where the algorithm is not guided by a specific outcome variable.

The most common application of association in machine learning is in Market
Basket Analysis, where the goal is to find associations between products that
occur together more frequently than expected. These associations are often
represented as association rules.

An association rule has two parts: an antecedent (if) and a consequent (then). An
example of an association rule could be “If a customer buys bread and butter, they
are likely to also buy milk.”

The strength of an association rule is measured using support, confidence, and
lift:

Support: This is the percentage of transactions in the dataset that contain all
items in the antecedent and consequent of the rule. It indicates how frequently
the rule occurs in the dataset.

Confidence: This is the conditional probability of the occurrence of the
consequent given the antecedent. For a rule X → Y, it indicates the likelihood of
item Y being purchased when item X is purchased.

Lift: This is the ratio of the observed support to that expected if the
antecedent and the consequent were independent. A lift of 1 indicates
independence. A lift greater than 1 indicates that the antecedent and consequent
occur together more often than expected, and a lift below 1 indicates that they
occur together less often than expected.
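
The three measures can be computed directly from a list of transactions. The sketch below
uses a made-up set of five baskets and hypothetical helper names (`support`, `confidence`,
`lift`); it evaluates the rule "bread and butter → milk" from the example above.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence divided by the baseline frequency of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)

# Hypothetical market-basket data (each set is one transaction).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

rule_support = support(transactions, {"bread", "butter", "milk"})
rule_confidence = confidence(transactions, {"bread", "butter"}, {"milk"})
rule_lift = lift(transactions, {"bread", "butter"}, {"milk"})
print(rule_support, rule_confidence, rule_lift)
```

In this toy data the lift comes out below 1, because milk already appears in four of the
five baskets; a rule can have fairly high confidence and still be uninformative.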

One of the most popular algorithms for generating association rules is the Apriori
algorithm. It uses a breadth-first search strategy to count the support of itemsets
and uses a candidate generation function that exploits the downward closure
property of support.
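
A compact, simplified sketch of the Apriori idea follows. It is not an optimized
implementation: support is recounted by scanning all transactions, and the function name
`apriori`, the sample baskets, and the threshold are assumptions made for illustration.
The pruning step is where the downward closure property is used: a candidate is kept only
if all of its smaller subsets were already found to be frequent.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its support."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if sup(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while current:  # level-wise (breadth-first) search over itemset sizes
        for itemset in current:
            frequent[itemset] = sup(itemset)
        k += 1
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Pruning via downward closure: every (k-1)-subset must itself be frequent.
        previous = set(current)
        candidates = [c for c in candidates
                      if all(frozenset(s) in previous for s in combinations(c, k - 1))]
        current = [c for c in candidates if sup(c) >= min_support]
    return frequent

# Hypothetical market-basket data.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]
frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```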

Decision Trees are a type of supervised machine learning algorithm that is mostly
used for classification, but can also be used for regression. They are called
decision trees because they make decisions by splitting data into subsets based
on feature values, which can be visualized as a tree structure.

The top of the tree starts with a single node, which splits into several branches.
Each branch represents a decision based on a feature’s value, leading to another
node or to a leaf node. Leaf nodes represent the output or decision (class label for
classification, numerical value for regression).
Key concepts in decision trees include:

Root Node: The top-most node that represents the entire sample or
population. It further gets divided into two or more homogeneous sets.

Splitting: The process of dividing a node into two or more sub-nodes.

Pruning: The process of removing the unwanted branches from the tree.

Branch / Sub-Tree: A subsection of the entire tree.

Parent and Child Node: A node gets divided into sub-nodes where the node is
called parent and sub-nodes are called child nodes.

Decision trees have many advantages such as simplicity, interpretability, and
handling both numerical and categorical data. However, they can also create
over-complex trees that overfit the data and can be unstable with small
variations in data.
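
To make the splitting step concrete, the sketch below searches for the best threshold on a
single numeric feature. It assumes Gini impurity as the splitting criterion, which is one
common choice (the notes do not name a criterion), and the data and function names are
invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure node, higher for mixed class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Try midpoints between sorted feature values; return (threshold, weighted impurity)."""
    best = (None, float("inf"))
    unique = sorted(set(values))
    n = len(labels)
    for low, high in zip(unique, unique[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical data: does a customer of a given age buy the product?
ages = [22, 25, 30, 35, 40, 50]
buys = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, buys)
print(gini(buys), threshold, impurity)
```

A full decision tree repeats this search recursively on each child node, over all
features, until the nodes are pure or a stopping rule (or pruning) ends the growth.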

A Support Vector Machine (SVM) is a supervised machine learning algorithm
primarily used for classification tasks; it can also be used for regression.
The main idea behind SVM is to find a hyperplane in an N-dimensional space
(where N is the number of features) that distinctly classifies the data points. The
hyperplane is chosen to have the maximum distance to the nearest data point of
any class, which is known as the maximum margin. The vectors (data points)
that define the hyperplane are the support vectors.
SVM also uses a technique called the kernel trick. A kernel function computes
similarities between data points as if they had been mapped from the
low-dimensional input space into a higher-dimensional space, without computing
that mapping explicitly. It is mostly useful in non-linear separation problems.
Key concepts in SVM include:

Hyperplane: This is a decision boundary that separates a set of
objects having different class memberships.

Support Vectors: These are the data points that are closest to the hyperplane
and influence the position and orientation of the hyperplane. Using these
support vectors, we maximize the margin of the classifier.

Margin: This is the gap between the two lines that pass through the closest points
of each class. It is calculated as the perpendicular distance from the hyperplane
to the support vectors (the closest points). A larger margin between the classes
is considered a good margin; a smaller margin is a bad margin.
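
The margin calculation can be illustrated as follows. This sketch does not train an SVM:
it assumes a hyperplane (w, b) is already given and only measures the perpendicular
distances from it. The hyperplane, the four points, and the function name
`signed_distance` are hypothetical.

```python
import math

def signed_distance(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0 (sign = side)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Assumed hyperplane x1 + x2 - 5 = 0 and labelled points (label is +1 or -1).
w, b = (1.0, 1.0), -5.0
points = [((1.0, 2.0), -1), ((2.0, 2.0), -1), ((3.0, 3.0), 1), ((4.0, 4.0), 1)]

# Multiplying by the label makes the distance positive for correctly classified points.
distances = [label * signed_distance(w, b, x) for x, label in points]
margin = min(distances)
support_vectors = [x for (x, _), d in zip(points, distances) if math.isclose(d, margin)]
print(margin, support_vectors)
```

Training an SVM amounts to searching for the w and b that make this smallest distance as
large as possible.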

Neural Networks, also known as Artificial Neural Networks (ANNs), are a subset
of machine learning and are at the heart of deep learning algorithms. They are
inspired by the structure and function of the human brain - specifically, the way
neurons interconnect and transmit signals.
A neural network takes in inputs, which are then processed in hidden layers using
weights that are adjusted during training. The model then produces a prediction as
the output.
Key components of a neural network include:

Neurons: These are the basic units of a neural network. They take in one or
more inputs, apply a transformation function (often non-linear), and produce
an output.

Layers: A neural network is composed of layers: an input layer, one or more
hidden layers, and an output layer. Each layer consists of multiple neurons.

Weights and Biases: These are the learnable parameters of a neural network.
The weights control the signal (or the strength of the connection) between two
neurons, and the biases allow you to shift the activation function to the left or
right.

Activation Function: This function is used to introduce non-linearity into the
output of a neuron. This is important because most real-world data is
non-linear and we want neurons to learn these non-linear representations.

Backpropagation: This is the standard algorithm for computing the gradients that
gradient descent uses to train neural networks. It calculates the gradient of the
loss function with respect to the weights of the network for a single input-output
example, and does so efficiently, unlike a naive direct computation.
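
The components listed above can be seen together in a very small network. The sketch below
assumes two inputs, one hidden layer of two sigmoid neurons, one sigmoid output neuron, and
a squared-error loss; all of these choices, the starting weights, and the function names
are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass: inputs -> hidden layer (sigmoid) -> single output (sigmoid)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output

def loss(output, target):
    return 0.5 * (output - target) ** 2

def backprop(x, target, W1, b1, W2, b2):
    """Gradients of the loss for one example, obtained with the chain rule."""
    hidden, output = forward(x, W1, b1, W2, b2)
    # Error signal at the output neuron (derivative of loss w.r.t. its pre-activation).
    delta_out = (output - target) * output * (1.0 - output)
    grad_W2 = [delta_out * h for h in hidden]
    grad_b2 = delta_out
    # Error signal propagated back to each hidden neuron.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = list(delta_hidden)
    return grad_W1, grad_b1, grad_W2, grad_b2

# Hypothetical example and starting parameters.
x, target = [0.5, -1.0], 1.0
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1]
W2, b2 = [0.7, -0.5], 0.2
learning_rate = 0.5

gW1, gb1, gW2, gb2 = backprop(x, target, W1, b1, W2, b2)

# One gradient-descent step: move every parameter against its gradient.
new_W1 = [[w - learning_rate * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
new_b1 = [b - learning_rate * g for b, g in zip(b1, gb1)]
new_W2 = [w - learning_rate * g for w, g in zip(W2, gW2)]
new_b2 = b2 - learning_rate * gb2

loss_before = loss(forward(x, W1, b1, W2, b2)[1], target)
loss_after = loss(forward(x, new_W1, new_b1, new_W2, new_b2)[1], target)
print(loss_before, loss_after)
```

Backpropagation reuses the output error signal when computing the hidden-layer gradients,
which is what makes it cheaper than differentiating the loss separately for every weight.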

Deep Learning is a subset of machine learning that’s based on artificial neural
networks with representation learning. It is called deep learning because it makes
use of deep neural networks, where the term “deep” refers to the number of
layers in the network. The more layers, the deeper the network.
Deep learning models learn to represent data by training on a large amount of data
and automatically extracting the features from the data. This is in contrast to
traditional machine learning, where features are manually extracted.
Key components of a deep learning model include:

Layers: A deep learning model consists of multiple layers that transform the
input data into a representation that can be used for the task at hand. Each
layer learns to extract a new feature from the input it receives.

Weights and Biases: These are the learnable parameters of a deep learning
model. The weights control the signal (or the strength of the connection)
between two neurons, and the biases allow you to shift the activation function
to the left or right.

Activation Function: This function is used to introduce non-linearity into the
output of a neuron. This is important because most real-world data is
non-linear and we want neurons to learn these non-linear representations.

Backpropagation and Gradient Descent: Backpropagation is used to calculate
the gradients of the loss with respect to the weights of the network, which are
then used in the gradient descent step to update the weights and minimize the
loss.
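
The gradient descent step itself is a simple loop: repeatedly move the parameter a small
distance against the gradient. The sketch below applies it to a toy one-parameter loss,
L(w) = (w - 3)^2, chosen only for illustration; in a real deep learning model the gradient
would come from backpropagation, and the learning rate and step count here are arbitrary
assumptions.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly apply w <- w - learning_rate * dL/dw and record each value of w."""
    w = start
    history = [w]
    for _ in range(steps):
        w = w - learning_rate * gradient(w)
        history.append(w)
    return w, history

# Toy loss L(w) = (w - 3)^2, whose gradient is dL/dw = 2 * (w - 3).
final_w, history = gradient_descent(lambda w: 2.0 * (w - 3.0), start=0.0)
print(history[:4], final_w)
```

Each step shrinks the distance to the minimum at w = 3. A learning rate that is too large
would overshoot, and one that is too small would need many more steps.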

Deep learning has been successfully applied to a variety of tasks including image
recognition, speech recognition, natural language processing, and many others.

Reinforcement Learning (RL) is a type of machine learning where an agent learns
to make decisions by taking actions in an environment to achieve a goal. The
agent learns from the consequences of its actions rather than from being
explicitly taught. It selects its actions based on its past experiences
(exploitation) and also by trying new choices (exploration).
Key components of reinforcement learning include:

Agent: The ‘learner’ or ‘decision maker’.

Environment: What the agent interacts with and learns from.

Actions: What the agent can do. The choice of action depends on the policy.

Policy: The strategy that the agent employs to determine the next action
based on the current state.

Reward Function: A rule that returns a reward to the agent based on the action
it took. The agent’s objective is to learn to act in a way that maximizes the
reward.

Value Function: A prediction of future rewards. It’s the total amount of reward
the agent expects to accumulate over the future, starting from a state.

Q-function or Action-Value Function: Similar to the value function, but takes
an extra parameter, the current action.

Reinforcement learning algorithms include Q-Learning, Deep Q-Networks (DQN),
Policy Gradients, and many others. These algorithms have been used to train
software agents to play games, operate robots, and perform other complex tasks.
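
As a rough illustration of how the Q-function and the policy fit together, here is a
sketch of the tabular Q-Learning update with an epsilon-greedy policy. The two-state
environment, the rewards, and the values of `alpha` (learning rate) and `gamma` (discount
factor) are made-up assumptions, and the helper names are hypothetical.

```python
import random

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-Learning step: nudge Q(state, action) toward reward + gamma * best next value."""
    best_next = max(Q[next_state].values())
    td_target = reward + gamma * best_next
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]

def epsilon_greedy(Q, state, epsilon, rng):
    """Policy: explore with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return rng.choice(list(Q[state]))       # exploration
    return max(Q[state], key=Q[state].get)      # exploitation

# Q-table for a hypothetical environment with two states and two actions.
Q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 0.0},
}

# Two hand-written transitions (state, action, reward, next state).
first = q_update(Q, "s0", "right", 1.0, "s1")
second = q_update(Q, "s1", "right", 2.0, "s0")

rng = random.Random(0)
greedy_action = epsilon_greedy(Q, "s0", epsilon=0.0, rng=rng)
print(first, second, greedy_action)
```

In a full agent these updates run inside a loop that observes the state, picks an action
with the policy, receives the reward from the environment, and updates the table.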

Cross-validation is a resampling procedure used in machine learning to evaluate
the performance of a model on an independent data set, and to tune model
hyperparameters. The goal of cross-validation is to test the model’s ability to
predict new data that was not used in estimating it, in order to flag problems like
overfitting or selection bias.
The most common type of cross-validation is k-fold cross-validation, where the
original sample is randomly partitioned into k equal sized subsamples. Of the k
subsamples, a single subsample is retained as the validation data for testing the
model, and the remaining k-1 subsamples are used as training data. The cross-
validation process is then repeated k times, with each of the k subsamples used
exactly once as the validation data. The k results can then be averaged to produce
a single estimation.
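
The k-fold procedure can be sketched with index lists alone. In the example below the
function names `k_fold_indices` and `cross_validate`, the fixed shuffle seed, and the
`evaluate` callback (which stands in for "train on these rows, score on those rows") are
all assumptions made for illustration.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (training indices, validation indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random partition of the sample
    folds = [indices[i::k] for i in range(k)]   # k folds of (nearly) equal size
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

def cross_validate(n_samples, k, evaluate):
    """Average the score returned by evaluate(train_idx, val_idx) over the k folds."""
    scores = [evaluate(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# Show the folds for 10 samples and k = 5.
for train_idx, val_idx in k_fold_indices(10, 5):
    print("train:", sorted(train_idx), "validate:", sorted(val_idx))
```

Every sample lands in the validation set exactly once, so the averaged score uses all of
the data for testing without ever scoring a model on rows it was trained on.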

1.2 Types of ML
Supervised Learning

Regression

Classification

Examples of Supervised Learning
Speech-recognition-based automation: OK Google, Apple Siri

Weather Prediction

Biometric Attendance

Unsupervised Learning

Types of Unsupervised learning

Clustering

Association

Examples of Unsupervised Learning

Reinforcement Learning

1.3 Issues, Applications and Steps

Applications:

1. Image Recognition:

Image Recognition is a common application of Machine Learning (ML). It is used
to identify various elements within digital images such as objects, persons, places,
etc. A popular use case of image recognition and face detection is the automatic
friend tagging suggestion feature provided by Facebook.
When a user uploads a photo with their Facebook friends, the platform
automatically suggests tags for the faces present in the image. This is made
possible by Machine Learning's face detection and recognition algorithms. The
technology behind this feature is based on a project named "DeepFace" by
Facebook, which is responsible for face recognition and person identification in
pictures.
In essence, Image Recognition using ML involves training a model to understand
and interpret digital images, thereby enabling it to recognize and identify different
elements within those images. This technology has wide-ranging applications,
from social media platforms like Facebook to security systems and beyond.

2. Speech recognition:

Speech Recognition is a popular application of Machine Learning (ML). It is the
process of converting voice instructions into text,
often referred to as “Speech to Text” or “Computer Speech Recognition”.
A notable example of this technology is Google’s “Search by Voice” feature. When
using Google, we get an option of “Search by Voice”, which comes under speech
recognition. This feature allows users to input their search queries by speaking
instead of typing, enhancing the user experience and accessibility.

3. Traffic Prediction

Traffic Prediction is a significant application of Machine Learning (ML). It involves
predicting traffic conditions such as whether traffic is cleared, slow-moving, or
heavily congested.
A notable example of this technology is Google Maps. When we want to visit a
new place, we often use Google Maps, which shows us the correct path with the
shortest route and predicts the traffic conditions. This prediction is made possible
by analyzing the real-time location of vehicles from the Google Maps app and
sensors, as well as the average time taken on past days at the same time.
Everyone who uses Google Maps helps to make the app better. It takes
information from the user and sends it back to its database to improve
performance. This continuous input from users makes the predictions more
accurate over time.

4. Product Recommendation

5. Self Driving cars

Self-Driving Cars are one of the most exciting applications of Machine Learning
(ML). Machine Learning plays a significant role in the functioning of these
autonomous vehicles.
A prominent example of this technology is Tesla, a popular car manufacturing
company that is working on self-driving cars. Its models are trained to detect
people and objects while driving, thereby enabling the cars to navigate safely and
efficiently. Detection models of this kind are typically trained on labelled data
(supervised learning), possibly combined with other learning methods.

6. Email spam and Malware filtering

7. Virtual Personal Assistant

Virtual Personal Assistants (VPAs) are a significant application of Machine
Learning (ML). VPAs such as Google Assistant, Alexa, Cortana, and Siri are widely
used today. As the name suggests, they assist users in finding information and
performing various tasks using voice instructions. These tasks range from
playing music, making calls, and opening emails to scheduling appointments, among
others.
Machine Learning algorithms play a crucial role in the functioning of these VPAs.
These assistants record the user’s voice instructions and send them over to
servers on the cloud. ML algorithms then decode these instructions and act
accordingly to execute the requested tasks.

8. Online Fraud detection

Online Fraud Detection is a significant application of Machine Learning (ML). ML
enhances the safety and security of online transactions by identifying fraudulent
activities.
When an online transaction is initiated, there are multiple ways a fraudulent act
can occur, such as through fake accounts or IDs, or theft in the middle of a
transaction. To detect this, feed-forward neural networks are employed to
discern between genuine and fraudulent transactions.
In this process, each legitimate transaction is converted into specific hash values
that serve as input for subsequent rounds of verification. A distinct pattern is
associated with every genuine transaction; any deviation from this pattern signals
a potential fraud, ensuring enhanced security for online transactions.

9. Stock Market Trading

10. Medical Diagnosis

Medical Diagnosis is a significant application of Machine Learning (ML) in the
field of medical science. ML is used for disease diagnosis, contributing to the
rapid growth of medical technology.
One of the advancements is the ability to build 3D models that can predict the
exact position of a tumor in the brain. This technology aids in finding brain
tumors and other brain-related diseases.
The process can involve a feed-forward neural network, a type of artificial
neural network, to check whether a diagnosis is genuine or a false positive.

Steps for Developing Machine Learning Applications

1. Gathering Data

Gathering Data is the first step of the Machine Learning (ML) life cycle. The goal
of this step is to identify and obtain all data related to the specific problem.
Data can be collected from various sources such as files, databases, the internet,
or mobile devices. It is one of the most important steps of the life cycle, as the
quantity and quality of the collected data determine the quality of the output.
In general, the more relevant, high-quality data there is, the more accurate the
predictions will be.
This step includes tasks such as identifying various data sources, collecting data,
and integrating the data obtained from different sources. By performing these
tasks, we get a coherent set of data, also called a dataset, which will be used in
further steps.

2. Data Preparation

Data Preparation is the second step of the Machine Learning (ML) life cycle.
After gathering data, it needs to be prepared for further steps. This preparation
involves putting all collected data together and randomizing its order.

This step can be further divided into two processes: Data Exploration and Data
Pre-processing. Data Exploration is used to understand the nature of the data that
we have to work with. We need to understand the characteristics, format, and
quality of the data. A better understanding leads to an effective outcome. In this
phase, we find correlations, general trends, and outliers.
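
Two of the exploration tasks mentioned above, finding correlations and spotting outliers,
can be sketched with the standard library alone. The Pearson correlation coefficient and a
z-score threshold of 2 are assumptions chosen for this example (the notes do not name
specific techniques), and the sample values are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def outliers(values, z_threshold=2.0):
    """Values lying more than z_threshold standard deviations away from the mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return []
    return [v for v in values if abs(v - mean) / std > z_threshold]

# Hypothetical columns from a collected dataset.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]
readings = [10, 11, 9, 10, 12, 11, 10, 50]

print(pearson(hours, scores))   # close to 1: strong positive linear relationship
print(outliers(readings))       # values that stand far apart from the rest
```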

Data Pre-processing is the next step, where the explored data is prepared for
analysis.

3. Data Wrangling

Data Wrangling is a crucial step in the Machine Learning (ML) life cycle. It
involves cleaning and converting raw data into a format that is
usable for machine learning or statistical analysis.
This step is essential as it ensures the quality and reliability of data before it is
used for training models or making decisions. The process includes cleaning the
data to address quality issues, selecting relevant variables, and transforming the
data into a suitable format for analysis.
In real-world applications, collected data may have various issues such as missing
values, duplicate data, and invalid data. Noise is another common problem that
needs to be addressed during this stage. Various filtering techniques are
employed to clean the data and remove these issues because they can negatively
affect the quality of outcomes derived from the processed data.
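
A minimal cleaning pass for the issues listed above (missing values, duplicates, invalid
data) might look like the sketch below. The record layout, the field names, the validity
rule for `age`, and the function name `clean_records` are hypothetical and only serve to
show the idea.

```python
def clean_records(records, required_fields, is_valid):
    """Drop records with missing values, invalid values, or exact duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        if any(record.get(field) is None for field in required_fields):
            continue  # missing value
        if not is_valid(record):
            continue  # invalid data
        key = tuple(record[field] for field in required_fields)
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        cleaned.append(record)
    return cleaned

# Hypothetical raw data showing each kind of problem.
raw = [
    {"name": "Asha", "age": 29},
    {"name": "Ravi", "age": None},   # missing value
    {"name": "Asha", "age": 29},     # duplicate
    {"name": "Meera", "age": -4},    # invalid value
    {"name": "John", "age": 41},
]

# Assumed validity rule: an age must lie between 0 and 120.
cleaned = clean_records(raw, ["name", "age"], lambda r: 0 <= r["age"] <= 120)
print(cleaned)
```

Dropping rows is only one option. Depending on the problem, missing values may instead be
filled in (for example with a column mean) so that less data is lost.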

4. Analyse Data

Data Analysis is a crucial step in the Machine Learning (ML) life cycle. Once the
data is cleaned and prepared, it is passed on to the analysis step. This step
involves selecting analytical techniques, building models, and reviewing the
results.
The aim of this step is to build an ML model that analyzes the data using
various analytical techniques and to review the outcomes. It starts with
determining the type of problem, for which we select a machine learning
technique such as Classification, Regression, Cluster Analysis, or Association.
We then build models using the prepared data and evaluate them.
Hence, in this step, we use machine learning algorithms to build models.

5. Train the model

6. Test the model

7. Deployment

Issues:

1. Inadequate Training Data:

Inadequate Training Data is a major issue that arises while using Machine
Learning (ML) algorithms. The quality and quantity of data play a vital role in
how well ML algorithms perform.
Many data scientists note that inadequate, noisy, and unclean data make it very
hard for ML algorithms to learn. For example, a simple task may require
thousands of samples, and an advanced task such as speech or image
recognition may need millions of examples.
Data quality is equally important for the algorithms to work well, yet many ML
applications lack it. Data quality can be affected by factors such as missing
values, duplicate data, invalid data, and noise.

1.4 Hypothesis

Inductive Bias

Preference Bias: It expresses a preference for some hypotheses over others.
For example, in decision tree algorithms like ID3, the preference is for shorter
trees over longer trees.

Restriction Bias: It restricts the set of hypotheses considered by the
algorithm. For instance, a linear regression algorithm restricts its hypothesis to
linear relationships between variables.

Dimensionality Reduction Technique

• Curse of Dimensionality: This term refers to the complications and challenges
that arise when handling high-dimensional data. As the dimensionality of the input
dataset increases, any machine learning algorithm and model becomes more
complex. As the number of features increases, the number of samples needed to
cover the feature space grows rapidly (roughly exponentially), so for a fixed
dataset the chance of overfitting also increases. A machine learning model
trained on high-dimensional data with too few samples tends to overfit and
perform poorly on new data. Hence, it is often necessary to reduce the number
of features, which can be done with dimensionality reduction.
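
A back-of-the-envelope calculation shows why more features demand far more data. The
sketch below assumes each feature is divided into 10 equal bins and counts how many cells
the feature space then contains; the bin count, the dataset size of 1000, and the function
names are arbitrary choices made for illustration.

```python
def cells_to_cover(bins_per_feature, n_features):
    """Number of grid cells when every feature is split into the same number of bins."""
    return bins_per_feature ** n_features

def samples_per_cell(n_samples, bins_per_feature, n_features):
    """Average number of samples available per grid cell."""
    return n_samples / cells_to_cover(bins_per_feature, n_features)

# With a fixed dataset of 1000 samples, the data quickly becomes sparse.
for n_features in (1, 2, 3, 6):
    print(n_features,
          cells_to_cover(10, n_features),
          samples_per_cell(1000, 10, n_features))
```

With six features there are a million cells for a thousand samples, so almost every region
of the space is empty and a model has very little evidence to generalize from.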

1.5 Overfitting and Underfitting

Table of Contents of ML

Table of Contents for ML(Minor)
