
Unit – 1

Fundamentals of Deep Learning: Artificial Intelligence, History of Machine Learning, Probabilistic
Modeling, Early Neural Networks, Kernel Methods, Decision Trees, Random Forests and Gradient
Boosting Machines; Fundamentals of Machine Learning: Four Branches of Machine Learning,
Evaluating Machine Learning Models, Overfitting and Underfitting.
Fundamentals of Deep Learning
Artificial Intelligence (AI)
Artificial Intelligence (AI) refers to the development of computer systems or machines that can
perform tasks that would typically require human intelligence. AI aims to replicate or simulate
human cognitive abilities, such as perception, reasoning, learning, problem-solving, and
decision-making.
Artificial intelligence (AI) plays a significant role in deep learning. Deep learning algorithms are
a specific subset of AI techniques that use artificial neural networks to mimic human brain
function and learn from data. Here's how AI is incorporated into deep learning:
1. Learning and Adaptation: Deep learning models utilize AI techniques to learn and adapt from
large datasets. They learn to recognize patterns, make predictions, or classify data by adjusting
their internal parameters based on feedback received during the training process.
2. Feature Extraction: AI methods are used to automatically extract relevant features or
representations from raw data. Instead of manually defining features, deep learning models can
learn hierarchical representations that capture important characteristics of the input data. AI
algorithms, such as convolutional neural networks (CNNs), are commonly employed to extract
features from images, audio, or text.
3. Natural Language Processing (NLP): NLP is a field of AI that deals with the interaction
between computers and human language. Deep learning techniques, such as recurrent neural
networks (RNNs) and transformer models, are used in NLP tasks like machine translation,
sentiment analysis, language generation, and question-answering systems.
4. Computer Vision: Deep learning has revolutionized computer vision tasks, such as image
classification, object detection, and image segmentation. AI algorithms, including CNNs and
deep convolutional generative adversarial networks (DCGANs), enable machines to perceive and
understand visual data with remarkable accuracy.
5. Reinforcement Learning: Reinforcement learning is a branch of AI concerned with training
agents to make decisions in an environment to maximize rewards. Deep reinforcement learning
combines deep learning and reinforcement learning techniques, enabling agents to learn directly
from raw sensory input and achieve impressive performance in complex tasks, such as game
playing and robotics.
6. Autonomous Systems: Deep learning, along with AI principles, is a key component in the
development of autonomous systems. These systems, such as self-driving cars and autonomous
drones, rely on deep learning algorithms to perceive the environment, interpret sensor data, and
make real-time decisions.
Overall, artificial intelligence provides the foundation and tools for deep learning
algorithms to learn, adapt, and perform complex tasks across various domains, ranging from
computer vision to natural language understanding.

History of Machine learning


The history of machine learning dates back several decades and has undergone significant
developments over time. Here's a brief overview of the key milestones in the history of machine
learning:
1. Early Foundations (1950s-1960s):
- The field of machine learning emerged from the intersection of computer science and
statistics, with early pioneers including Alan Turing and Arthur Samuel.
- In 1950, Alan Turing proposed the "Turing Test" as a way to measure a machine's ability to
exhibit intelligent behavior.
- In the 1950s, Arthur Samuel developed the concept of machine learning by creating programs
that could improve their performance over time through experience, specifically in the domain of
game-playing, such as checkers.
2. Symbolic AI and Expert Systems (1960s-1980s):
- During this period, researchers focused on symbolic AI and expert systems, which relied on
rules and logical reasoning.
- Machine learning took a backseat as rule-based systems dominated the field, with projects
like DENDRAL (a system for inferring molecular structures in organic chemistry) and MYCIN (a
system for diagnosing bacterial infections) gaining attention.
3. Connectionism and Neural Networks (1980s-1990s):
- Interest in neural networks and connectionism resurged during this period.
- Backpropagation, a widely used algorithm for training neural networks, was developed in the
1980s.
- The field saw advancements in areas such as pattern recognition and speech recognition,
fueled by neural network models like the Multi-Layer Perceptron (MLP).
4. Statistical Learning and Data-Driven Approaches (1990s-2000s):
- Researchers started emphasizing statistical learning and data-driven approaches.
- Support Vector Machines (SVMs) gained popularity for classification tasks, offering strong
theoretical foundations.
- The field saw the emergence of ensemble methods, such as Random Forests and Boosting,
which combined multiple models to improve performance.
5. Big Data and Deep Learning (2010s-present):
- The rise of big data, increased computational power, and advancements in deep learning
models revolutionized the field.
- Deep learning, specifically Convolutional Neural Networks (CNNs) and Recurrent Neural
Networks (RNNs), achieved remarkable success in computer vision, speech recognition, and
natural language processing.
- Deep learning frameworks like TensorFlow and PyTorch gained widespread adoption,
making it easier for researchers and practitioners to build and train deep neural networks.
Today, machine learning is a rapidly evolving field that continues to push boundaries
in areas such as reinforcement learning, generative models, and explainability. It has become an
integral part of numerous applications, including recommendation systems, fraud detection,
autonomous vehicles, and personalized medicine, among many others.

Probabilistic Modeling:
Probabilistic modeling is an approach to modeling and analyzing data that incorporates
uncertainty and probability theory. It allows us to reason and make predictions in situations
where there is inherent variability or noise in the data. In probabilistic modeling, we represent
uncertain quantities as probability distributions and use statistical inference techniques to learn
and make inferences from the available data.
Here are some key aspects and applications of probabilistic modeling:
1. Probability Distributions: In probabilistic modeling, we assign probability distributions to
uncertain variables. These distributions describe the likelihood of different values the variables
can take. Commonly used probability distributions include the Gaussian (normal) distribution,
Bernoulli distribution, Poisson distribution, and more.
2. Bayesian Inference: Bayesian inference is a fundamental approach in probabilistic modeling
that allows us to update our beliefs about uncertain variables based on observed data. It combines
prior knowledge or beliefs (expressed as prior distributions) with observed data to obtain
posterior distributions, which represent our updated beliefs.
3. Generative Models: Probabilistic modeling enables the construction of generative models,
which can generate new samples that resemble the observed data. Generative models learn the
underlying probabilistic structure of the data and can be used for tasks such as data generation,
anomaly detection, and missing data imputation.
4. Bayesian Networks: Bayesian networks, also known as probabilistic graphical models, are
graphical representations of probabilistic dependencies among variables. They use directed
acyclic graphs to model the conditional dependencies and allow efficient inference and reasoning
about the joint distribution of variables.
5. Uncertainty Quantification: Probabilistic modeling provides a natural framework for
quantifying and expressing uncertainty. By representing uncertain variables as probability
distributions, we can estimate confidence intervals, calculate probabilities of different outcomes,
and assess the uncertainty associated with predictions or decisions.
6. Applications: Probabilistic modeling finds applications in various fields, including finance,
healthcare, natural language processing, computer vision, and more. It is used for tasks such as
risk assessment, fraud detection, recommendation systems, sentiment analysis, image
recognition, and predictive modeling.
Notable probabilistic modeling techniques include Bayesian regression, Hidden Markov
Models (HMMs), Gaussian Processes (GPs), and Variational Autoencoders (VAEs). These
techniques provide powerful tools for modeling complex systems and making principled
inferences in the presence of uncertainty.
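To make the Bayesian inference idea above concrete, here is a minimal sketch (assuming Python with NumPy and SciPy) that updates a Beta prior over a coin's probability of heads after observing a handful of hypothetical flips; the prior parameters and the data are illustrative only.

```python
import numpy as np
from scipy import stats

# Hypothetical observed data: 1 = heads, 0 = tails
flips = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])

# Beta(2, 2) prior over the coin's probability of heads (a mild belief in fairness)
prior_a, prior_b = 2.0, 2.0

# The Beta prior is conjugate to the Bernoulli likelihood, so the posterior
# is again a Beta distribution with updated parameters.
post_a = prior_a + flips.sum()               # add the number of heads
post_b = prior_b + len(flips) - flips.sum()  # add the number of tails
posterior = stats.beta(post_a, post_b)

print("Posterior mean of P(heads):", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```

The posterior distribution quantifies the remaining uncertainty: with only ten flips the credible interval is still wide, and it narrows as more data is observed.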

Early Neural Networks:


Early neural networks, also known as the first-generation neural networks, emerged in the 1940s
and 1950s. These early models laid the foundation for modern deep learning and were the
precursors to the more advanced neural networks we have today. Here are some notable early
neural networks:
1. McCulloch-Pitts Neuron (1943): Proposed by Warren McCulloch and Walter Pitts, this model
was a simplified abstraction of a biological neuron. It introduced the concept of threshold logic,
where inputs were summed and compared to a threshold to produce a binary output. While not a
full-fledged neural network, it provided the basis for future developments.
2. Perceptron (1957): Developed by Frank Rosenblatt, the perceptron was one of the earliest
forms of a learning algorithm for neural networks. It consisted of a single layer of interconnected
artificial neurons (McCulloch-Pitts neurons) that could learn to classify inputs into two classes.
The perceptron learning rule adjusted the connection weights based on errors made during
training.
3. Adaline (1960): An abbreviation for "Adaptive Linear Neuron," Adaline was developed by
Bernard Widrow and Ted Hoff. It extended the perceptron model by adjusting real-valued weights
against a continuous (linear) output rather than a thresholded one. Adaline could learn linear
regression tasks and was an early example of using gradient descent (the least mean squares rule)
for weight adjustment.
4. Madaline (1960s): Short for "Multiple ADALINE," Madaline was an advancement from Widrow's
group that introduced multiple layers of Adaline units. It allowed for the learning of more complex
decision boundaries and was one of the earliest attempts at building multilayer neural networks.
5. Backpropagation (1970s-1980s): Although backpropagation is now a fundamental algorithm
in deep learning, its development can be traced back to the 1970s. The core idea behind
backpropagation is to compute the gradient of the error with respect to the weights in a neural
network, enabling efficient weight updates. The algorithm experienced significant advancements
in the 1980s, leading to renewed interest in neural networks.
These early neural networks faced limitations in terms of computational power, data
availability, and the lack of sophisticated training algorithms. As a result, they were limited to
relatively simple tasks and had only a few layers. However, they laid the groundwork for future
breakthroughs and set the stage for the resurgence of neural networks in the 21st century, with
the development of deep learning architectures and powerful training techniques.
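To make the perceptron learning rule concrete, here is a minimal NumPy sketch that trains a single threshold unit on the logical AND problem; the toy dataset, learning rate, and epoch count are illustrative choices rather than part of the historical algorithm.

```python
import numpy as np

# Toy dataset: logical AND of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)   # connection weights
b = 0.0           # bias (negative threshold)
lr = 0.1          # learning rate

for epoch in range(10):
    for xi, target in zip(X, y):
        # Threshold logic: fire (output 1) if the weighted sum exceeds 0
        output = 1 if np.dot(w, xi) + b > 0 else 0
        # Perceptron rule: adjust weights in proportion to the error
        error = target - output
        w += lr * error * xi
        b += lr * error

print("Learned weights:", w, "bias:", b)
print("Predictions:", [1 if np.dot(w, xi) + b > 0 else 0 for xi in X])
```

Because AND is linearly separable, the perceptron convergence theorem guarantees that this loop stops making errors after a finite number of updates.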

Kernel Methods:
Kernel methods are a family of machine learning techniques that operate in a high-dimensional
feature space implicitly through a kernel function. They are particularly useful for solving
complex nonlinear problems while preserving the computational efficiency of linear methods.
Kernel methods have applications in various fields, including classification, regression,
dimensionality reduction, and anomaly detection.
Here are some key aspects of kernel methods:
1. Kernel Functions: A kernel function measures the similarity or distance between pairs of data
points in the input space. It takes two inputs and returns a similarity measure or inner product in
a high-dimensional feature space. Popular kernel functions include the linear kernel, polynomial
kernel, Gaussian (RBF) kernel, and sigmoid kernel.
2. Kernel Trick: The kernel trick is a central concept in kernel methods. It allows us to implicitly
map the original input space into a higher-dimensional feature space without explicitly
computing the transformed features. This is computationally efficient as it avoids the need to
compute and store the high-dimensional feature representations explicitly.
3. Support Vector Machines (SVM): SVM is a widely used kernel-based algorithm for
classification and regression tasks. It aims to find a hyperplane that separates data points of
different classes while maximizing the margin between the classes. SVMs use kernel functions to
implicitly operate in a high-dimensional feature space and find the optimal decision boundary.
4. Kernel PCA: Kernel Principal Component Analysis (PCA) is an extension of traditional PCA
that uses kernel functions to perform nonlinear dimensionality reduction. It captures nonlinear
relationships in the data by mapping it to a high-dimensional feature space and computing
principal components in that space.
5. Gaussian Processes (GPs): Gaussian processes are probabilistic models that use kernel
functions to define the covariance structure between data points. GPs are flexible and can model
complex nonlinear relationships while providing uncertainty estimates. They are used for
regression, classification, and Bayesian optimization tasks.
6. Kernel-based Clustering: Kernel methods can also be applied to clustering algorithms, such as
Kernel K-means and Spectral Clustering. These methods use kernel functions to measure
similarity or dissimilarity between data points and group them into clusters.
Kernel methods have several advantages, including their ability to handle nonlinear
relationships, their mathematical elegance, and their interpretability. However, they may face
challenges with scalability and hyperparameter selection. Nevertheless, kernel methods have had
a significant impact on the field of machine learning, providing powerful tools for solving a wide
range of problems.
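As a brief illustration of the kernel trick in practice, the sketch below (assuming scikit-learn is available) fits an SVM with a Gaussian (RBF) kernel to a toy nonlinear dataset; the dataset and the hyperparameter values C and gamma are illustrative assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy nonlinear dataset: two interleaving half-moons
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel implicitly maps the 2-D inputs into a very high-dimensional
# feature space; the decision boundary is linear there but nonlinear here.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

print("Test accuracy:", clf.score(X_test, y_test))
```

Swapping `kernel="rbf"` for `"linear"` on the same data typically lowers the accuracy noticeably, which is a quick way to see what the implicit feature mapping buys.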

Decision Trees:
 Decision Tree is a Supervised learning technique that can be used for both
classification and Regression problems, but mostly it is preferred for solving
Classification problems. It is a tree-structured classifier, where internal nodes represent
the features of a dataset, branches represent the decision rules and each leaf node
represents the outcome.
 In a decision tree, there are two types of nodes: decision nodes and leaf nodes. Decision
nodes are used to make decisions and have multiple branches, whereas leaf nodes are the
outputs of those decisions and do not contain any further branches.
 The decisions or tests are performed on the basis of the features of the given dataset.
 It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
 It is called a decision tree because, similar to a tree, it starts with the root node, which
expands on further branches and constructs a tree-like structure.
Decision Tree Terminologies
 Root Node: Root node is from where the decision tree starts. It represents the entire
dataset, which further gets divided into two or more homogeneous sets.
 Leaf Node: Leaf nodes are the final output nodes, and the tree cannot be split further
after a leaf node is reached.
 Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes
according to the given conditions.
 Branch/Sub Tree: A subtree formed by splitting a node of the tree.
 Pruning: Pruning is the process of removing the unwanted branches from the tree.
 Parent/Child node: A node that is divided into sub-nodes is called the parent node of
those sub-nodes, and the sub-nodes are called its child nodes.

Algorithm
 Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
 Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
 Step-3: Divide S into subsets that contain the possible values of the best attribute.
 Step-4: Generate the decision tree node that contains the best attribute.
 Step-5: Recursively make new decision trees using the subsets of the dataset created in
Step-3. Continue this process until a stage is reached where the nodes cannot be classified
further; the final node is then called a leaf node.
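The splitting procedure above is what decision tree libraries implement internally. As a hedged sketch of the same idea, the snippet below trains a scikit-learn decision tree on the built-in Iris dataset, using Gini impurity as the attribute selection measure; the depth limit is an illustrative choice.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# criterion="gini" is one common Attribute Selection Measure;
# max_depth limits splitting so the tree stays small and interpretable.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
tree.fit(X_train, y_train)

print("Test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=load_iris().feature_names))
```

The printed rules show the root node, the decision nodes with their test conditions, and the leaf nodes holding the predicted classes.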
Random Forest:
Random Forest is an ensemble learning method that combines multiple decision trees to make
predictions or classifications. It is a powerful and widely used algorithm known for its
robustness and ability to handle complex datasets. Random Forest overcomes the limitations of
individual decision trees by reducing overfitting and improving generalization.
Here are the key characteristics and concepts of Random Forest:

1. Ensemble of Decision Trees: Random Forest consists of a collection of decision trees, where
each tree is trained on a random subset of the training data. Each tree independently makes
predictions, and the final prediction is determined by combining the predictions of all the trees.

2. Random Sampling: Random Forest uses two types of random sampling. The first type is
random sampling with replacement, also known as bootstrap sampling. It creates multiple
bootstrap samples by randomly selecting data points from the training dataset, allowing some
data points to be present in multiple subsets. The second type is random feature selection,
where only a subset of features is considered for splitting at each node of the decision tree.

3. Voting for Predictions: Random Forest employs a majority voting scheme for classification
tasks and averaging for regression tasks. Each decision tree in the ensemble makes an
individual prediction, and the class with the most votes or the average of the predicted values is
chosen as the final prediction.

4. Feature Importance: Random Forest can provide a measure of feature importance based on
the average impurity decrease (such as Gini impurity or entropy) caused by the feature across
all decision trees in the forest. This information helps identify the most informative features for
the task at hand.

5. Robust to Overfitting: By aggregating predictions from multiple decision trees, Random
Forest reduces overfitting. The individual decision trees in the ensemble can overfit the training
data, but the averaging or voting process helps generalize predictions and reduces the impact
of outliers or noisy data.

6. Parallelizable: Random Forest can be easily parallelized since each decision tree in the
ensemble can be trained independently. This allows for efficient computation, especially for
large datasets.
7. Versatility: Random Forest is applicable to both classification and regression problems. It
handles a mixture of feature types, such as categorical and numerical features, without
requiring extensive preprocessing.

Random Forest is widely used in various domains, including finance, healthcare,
marketing, and computer vision. Its versatility, robustness, and ability to handle high-
dimensional data make it a popular choice for many machine learning tasks.
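The following minimal scikit-learn sketch ties these points together: an ensemble of bootstrap-trained trees, predictions obtained by voting, and averaged impurity-based feature importances. The dataset and hyperparameter values are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)

# 200 trees, each trained on a bootstrap sample of the rows and a random
# subset of the features at every split
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                                random_state=0)
forest.fit(X_train, y_train)

print("Test accuracy:", forest.score(X_test, y_test))

# Impurity-based feature importance, averaged over all trees in the forest
top = sorted(zip(data.feature_names, forest.feature_importances_),
             key=lambda pair: pair[1], reverse=True)[:5]
for name, score in top:
    print(f"{name}: {score:.3f}")
```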

Gradient Boosting Machines:


Gradient Boosting Machines (GBMs) are a powerful ensemble learning method that combines multiple
weak prediction models, typically decision trees, to create a strong predictive model. GBMs iteratively
build an ensemble of models by optimizing a loss function in a gradient descent manner, focusing on
reducing the errors made by the previous models in the ensemble. They are known for their
effectiveness in a wide range of machine learning tasks, including regression and classification.

Here are the key characteristics and concepts of Gradient Boosting Machines:

1. Boosting: GBMs belong to the boosting family of algorithms, where weak models are sequentially
trained to correct the mistakes of the previous models. Each subsequent model in the ensemble focuses
on reducing the errors made by the previous models, leading to an ensemble with improved overall
predictive performance.

2. Gradient Descent: GBMs optimize the ensemble by minimizing a differentiable loss function using
gradient descent. The loss function measures the discrepancy between the predicted values and the
true values of the target variable. Gradient descent updates the model parameters in the direction of
steepest descent to iteratively improve the model's predictions.

3. Weak Learners: GBMs use weak learners as building blocks, typically decision trees with a small depth
(often referred to as "shallow trees" or "decision stumps"). These weak learners are simple models that
make predictions slightly better than random guessing. They are usually shallow to prevent overfitting
and to focus on capturing the specific patterns missed by previous models.

4. Residuals: In GBMs, the subsequent weak learners are trained to predict the residuals (the differences
between the true values and the predictions of the ensemble so far). By focusing on the residuals, the
subsequent models are designed to correct the errors made by the previous models and improve the
overall prediction accuracy.

5. Learning Rate: GBMs introduce a learning rate parameter that controls the contribution of each weak
learner to the ensemble. A smaller learning rate makes the learning process more conservative, slowing
down the convergence but potentially improving the generalization ability.

6. Regularization: To prevent overfitting, GBMs often include regularization techniques. Common
regularization methods include limiting the depth or complexity of the weak learners, applying shrinkage
(reducing the impact of each weak learner), and using subsampling techniques to train each weak
learner on a random subset of the data.
7. Feature Importance: GBMs can provide estimates of feature importance based on how frequently and
effectively they are used in the ensemble. This information helps identify the most informative features
for the task.

Gradient Boosting Machines, particularly popular implementations such as XGBoost, LightGBM,
and CatBoost, have achieved state-of-the-art performance in various machine learning competitions
and real-world applications. They excel at handling complex, high-dimensional data and have become an
essential tool in the machine learning practitioner's toolkit.
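The sketch below (assuming scikit-learn) shows how the knobs discussed above map onto a gradient boosting model: shallow trees as weak learners, a learning rate for shrinkage, and subsampling for regularization. The synthetic dataset and parameter values are illustrative only.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingRegressor(
    n_estimators=300,    # number of sequentially added weak learners
    max_depth=3,         # shallow trees as weak learners
    learning_rate=0.05,  # shrinkage: each tree contributes only a small step
    subsample=0.8,       # train each tree on a random 80% of the data
    random_state=0,
)
gbm.fit(X_train, y_train)  # each new tree fits the residuals of the ensemble so far

print("Test MSE:", mean_squared_error(y_test, gbm.predict(X_test)))
```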

Fundamentals of Machine Learning


Four Branches of Machine Learning:

Machine learning is a subset of AI, which enables the machine to automatically learn from
data, improve performance from past experiences, and make predictions. Machine learning
contains a set of algorithms that work on a huge amount of data. Data is fed to these algorithms
to train them, and on the basis of training, they build the model & perform a specific task.

These ML algorithms help to solve different business problems like Regression, Classification,
Forecasting, Clustering, and Associations, etc.
Based on the methods and way of learning, machine learning is divided into mainly four types,
which are:

1. Supervised Machine Learning


2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning

In this topic, we will provide a detailed description of the types of Machine Learning along with
their respective algorithms:

1. Supervised Machine Learning


As its name suggests, Supervised machine learning is based on supervision. It means in the supervised
learning technique, we train the machines using the "labelled" dataset, and based on the training, the
machine predicts the output. Here, the labelled data specifies that some of the inputs are already
mapped to the output. More precisely, we first train the machine with the input and
corresponding output, and then we ask the machine to predict the output using the test dataset.

Let's understand supervised learning with an example. Suppose we have an input dataset of cat
and dog images. First, we train the machine to understand the images using features such as the
shape and size of the tail, the shape of the eyes, colour, and height (dogs are taller, cats are
smaller). After completion of training, we input the picture of a cat and ask
the machine to identify the object and predict the output. Now, the machine is well trained, so it
will check all the features of the object, such as height, shape, colour, eyes, ears, tail, etc., and
find that it's a cat. So, it will put it in the Cat category. This is the process of how the machine
identifies the objects in Supervised Learning.
The main goal of the supervised learning technique is to map the input variable(x) with the
output variable(y). Some real-world applications of supervised learning are Risk Assessment,
Fraud Detection, Spam filtering, etc.

Categories of Supervised Machine Learning


Supervised machine learning can be classified into two types of problems, which are given
below:

 Classification
 Regression
a) Classification
Classification algorithms are used to solve classification problems in which the output
variable is categorical, such as "Yes" or "No", "Male" or "Female", "Red" or "Blue", etc. The
classification algorithms predict the categories present in the dataset. Some real-world examples
of classification algorithms are Spam Detection, Email filtering, etc.
Some popular classification algorithms are given below:

 Random Forest Algorithm


 Decision Tree Algorithm
 Logistic Regression Algorithm
 Support Vector Machine Algorithm
b) Regression
Regression algorithms are used to solve regression problems in which the output variable is
continuous and depends on the input variables. These are used to predict continuous output
values, such as market trends, weather forecasts, etc.
Some popular Regression algorithms are given below:

 Simple Linear Regression Algorithm


 Multivariate Regression Algorithm
 Decision Tree Algorithm
 Lasso Regression
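As a minimal illustration of the supervised workflow (train on labelled examples, then predict on unseen data), the sketch below fits a simple linear regression with scikit-learn on synthetic data; the data-generating function and the train/test split are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic labelled data: y is roughly 3x + 5 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3 * X.ravel() + 5 + rng.normal(scale=2.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)  # learn the mapping from input x to output y

print("Learned slope:", model.coef_[0], "intercept:", model.intercept_)
print("R^2 on unseen test data:", model.score(X_test, y_test))
```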
Advantages and Disadvantages of Supervised Learning
Advantages:

 Since supervised learning works with a labelled dataset, we can have an exact idea
about the classes of objects.
 These algorithms are helpful in predicting the output on the basis of prior experience.
Disadvantages:

 These algorithms are not able to solve complex tasks.
 They may predict the wrong output if the test data differs from the training data.
 Training these algorithms can require a lot of computational time.
Applications of Supervised Learning
Some common applications of Supervised Learning are given below:

 Image Segmentation:
Supervised Learning algorithms are used in image segmentation. In this process,
image classification is performed on different image data with pre-defined labels.
 Medical Diagnosis:
Supervised algorithms are also used in the medical field for diagnosis. This is done
by using medical images and past data labelled with disease conditions. With such a
process, the machine can identify a disease for new patients.
 Fraud Detection - Supervised Learning classification algorithms are used for identifying
fraudulent transactions, fraudulent customers, etc. This is done by using historical data to
identify the patterns that can lead to possible fraud.
 Spam detection - In spam detection & filtering, classification algorithms are used. These
algorithms classify an email as spam or not spam. The spam emails are sent to the spam
folder.
 Speech Recognition - Supervised learning algorithms are also used in speech
recognition. The algorithm is trained with voice data and can then be used for tasks such
as voice-activated passwords and voice commands.
2. Unsupervised Machine Learning
Unsupervised learning is different from the Supervised learning technique; as its name suggests,
there is no need for supervision. It means, in unsupervised machine learning, the machine is
trained using the unlabeled dataset, and the machine predicts the output without any supervision.
In unsupervised learning, the models are trained with the data that is neither classified nor
labelled, and the model acts on that data without any supervision.
The main aim of the unsupervised learning algorithm is to group or categorize the unsorted
dataset according to the similarities, patterns, and differences. Machines are instructed to
find the hidden patterns in the input dataset.
Let's take an example to understand it more precisely: suppose there is a basket of fruit images,
and we input it into the machine learning model. The images are totally unknown to the model,
and the task of the machine is to find the patterns and categories of the objects.
So the machine will discover its own patterns and differences, such as differences in colour and
shape, and predict the output when it is tested with the test dataset.

Categories of Unsupervised Machine Learning


Unsupervised Learning can be further classified into two types, which are given below:

 Clustering
 Association
1) Clustering
The clustering technique is used when we want to find the inherent groups from the data. It is a
way to group the objects into a cluster such that the objects with the most similarities remain in
one group and have fewer or no similarities with the objects of other groups. An example of the
clustering algorithm is grouping the customers by their purchasing behaviour.
Some of the popular clustering algorithms are given below:

 K-Means Clustering algorithm


 Mean-shift algorithm
 DBSCAN Algorithm
 Principal Component Analysis
 Independent Component Analysis
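A minimal K-Means sketch (assuming scikit-learn) illustrates the clustering idea: the model receives only unlabeled points and groups them by similarity. The synthetic data and the choice of three clusters are illustrative assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Unlabeled data: three hypothetical groups of points (the labels are discarded)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

# Ask K-Means to find 3 clusters purely from the similarities in the data
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("Cluster sizes:", [int((labels == k).sum()) for k in range(3)])
print("Cluster centres:\n", kmeans.cluster_centers_)
```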
2) Association
Association rule learning is an unsupervised learning technique, which finds interesting relations
among variables within a large dataset. The main aim of this learning algorithm is to find the
dependency of one data item on another data item and map those variables accordingly so that it
can generate maximum profit. This algorithm is mainly applied in Market Basket analysis,
Web usage mining, continuous production, etc.
Some popular algorithms of Association rule learning are Apriori Algorithm, Eclat, FP-growth
algorithm.
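To make association rule learning concrete, the small pure-Python sketch below computes the support and confidence of one candidate rule over hypothetical market-basket transactions; real systems use algorithms such as Apriori or FP-growth to search the space of candidate rules efficiently.

```python
# Hypothetical market-basket transactions
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Candidate rule: {bread} -> {milk}
antecedent, consequent = {"bread"}, {"milk"}
rule_support = support(antecedent | consequent)
confidence = rule_support / support(antecedent)

print("support(bread -> milk) =", rule_support)    # 0.6 for this toy data
print("confidence(bread -> milk) =", confidence)   # 0.75 for this toy data
```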

Advantages and Disadvantages of Unsupervised Learning Algorithm


Advantages:

 These algorithms can be used for more complicated tasks than supervised ones
because they work on unlabeled datasets.
 Unsupervised algorithms are preferable for many tasks because obtaining an unlabeled
dataset is easier than obtaining a labelled one.
Disadvantages:

 The output of an unsupervised algorithm can be less accurate because the dataset is not
labelled, and the algorithms are not trained with the exact output in advance.
 Working with unsupervised learning is more difficult because the unlabelled dataset
does not map to a known output.
Applications of Unsupervised Learning
 Network Analysis: Unsupervised learning is used in document network analysis of text
data for scholarly articles, for example to identify plagiarism and copyright issues.
 Recommendation Systems: Recommendation systems widely use unsupervised learning
techniques for building recommendation applications for different web applications and
e-commerce websites.
 Anomaly Detection: Anomaly detection is a popular application of unsupervised
learning, which can identify unusual data points within the dataset. It is used to discover
fraudulent transactions.
 Singular Value Decomposition: Singular Value Decomposition (SVD) is used to
extract specific information from a database, for example, extracting information about
every user located in a particular region.
3. Semi-Supervised Learning
Semi-Supervised learning is a type of Machine Learning algorithm that lies between
Supervised and Unsupervised machine learning. It represents the intermediate ground
between Supervised (With Labelled training data) and Unsupervised learning (with no labelled
training data) algorithms and uses the combination of labelled and unlabeled datasets during the
training period.
Semi-supervised learning is the middle ground between supervised and unsupervised learning:
it operates on data that contains a few labels but mostly consists of unlabeled examples. Labels
are costly to obtain, so in practice organizations often have only a small number of them. This
distinguishes semi-supervised learning from supervised and unsupervised learning, which are
based on the presence and absence of labels, respectively.
The concept of semi-supervised learning was introduced to overcome the drawbacks of
supervised and unsupervised learning algorithms. Its main aim is to make effective use of all the
available data, rather than only the labelled data as in supervised learning. Typically, similar data
points are first clustered with an unsupervised learning algorithm, and the clusters then help to
turn the unlabeled data into labelled data. This matters because labelled data is considerably
more expensive to acquire than unlabeled data.
We can picture these approaches with an analogy. Supervised learning is like a student studying
under the supervision of an instructor at home and at college. If the student analyses the same
concept on their own without any help from the instructor, that corresponds to unsupervised
learning. Under semi-supervised learning, the student revises the concept on their own after first
studying it under the guidance of an instructor at college.

Advantages and disadvantages of Semi-supervised Learning


Advantages:

 The algorithms are simple and easy to understand.
 It is highly efficient.
 It helps overcome the drawbacks of supervised and unsupervised learning algorithms.
Disadvantages:

 Iteration results may not be stable.
 These algorithms cannot be applied to network-level data.
 Accuracy is low.
4. Reinforcement Learning
Reinforcement learning works on a feedback-based process in which an AI agent (a
software component) automatically explores its surroundings by trial and error: taking
actions, learning from experience, and improving its performance. The agent is rewarded for
each good action and punished for each bad action; hence the goal of a reinforcement learning
agent is to maximize the rewards.
In reinforcement learning there is no labelled data as in supervised learning; agents learn
only from their own experience.
The reinforcement learning process is similar to how a human learns; for example, a child learns
various things through experience in day-to-day life. An example of reinforcement learning is
playing a game, where the game is the environment, the agent's moves at each step define the
states, and the goal of the agent is to obtain a high score. The agent receives feedback in terms of
rewards and punishments.
Because of the way it works, reinforcement learning is employed in different fields such as game
theory, operations research, information theory, and multi-agent systems.
A reinforcement learning problem can be formalized using a Markov Decision Process (MDP). In
MDP, the agent constantly interacts with the environment and performs actions; at each action,
the environment responds and generates a new state.
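The interaction loop described above (act, observe a new state and a reward, then update) is what tabular Q-learning implements. The following is a minimal sketch on a made-up five-state chain environment; the states, rewards, and hyperparameters are illustrative assumptions, not a standard benchmark.

```python
import numpy as np

# Tiny chain MDP: states 0..4, actions 0 = left, 1 = right.
# Reaching state 4 yields reward +1 and ends the episode.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration
rng = np.random.default_rng(0)

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state == 4

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection (exploration vs. exploitation)
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update: move toward reward + discounted best future value
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state])
                                     - Q[state, action])
        state = next_state

print("Greedy policy (0 = left, 1 = right):", np.argmax(Q, axis=1))
```

After training, reading off the action with the highest Q-value in each state gives the greedy policy, which in this toy chain is simply to keep moving right.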

Categories of Reinforcement Learning


Reinforcement learning is categorized mainly into two types of methods/algorithms:

 Positive Reinforcement Learning: Positive reinforcement learning means adding
something to increase the tendency that the required behaviour occurs again. It
strengthens the agent's behaviour and has a positive impact on it.
 Negative Reinforcement Learning: Negative reinforcement learning works in the
opposite way to positive RL. It increases the tendency that a specific behaviour occurs
again by removing or avoiding a negative condition.
Real-world Use cases of Reinforcement Learning
 Video Games:
RL algorithms are very popular in gaming applications, where they are used to achieve
super-human performance. Well-known systems that use RL algorithms are AlphaGo and
AlphaGo Zero.
 Resource Management:
The paper "Resource Management with Deep Reinforcement Learning" showed how RL
can be used in computer systems to automatically learn to allocate and schedule resources
for waiting jobs in order to minimize average job slowdown.
 Robotics:
RL is widely used in robotics applications. Robots are used in industrial and
manufacturing settings, and reinforcement learning makes these robots more capable.
Many industries envision building intelligent robots using AI and machine learning
technology.
 Text Mining
Text mining, one of the major applications of NLP, is also being implemented with the
help of reinforcement learning, for example by Salesforce.
Advantages and Disadvantages of Reinforcement Learning
Advantages

 It helps in solving complex real-world problems that are difficult to solve with
conventional techniques.
 The learning model of RL is similar to human learning, so it can produce highly accurate
results.
 It helps in achieving long-term results.
Disadvantage

 RL algorithms are not preferred for simple problems.
 RL algorithms require large amounts of data and computation.
 Too much reinforcement can lead to an overload of states, which can weaken the results.
 The curse of dimensionality limits reinforcement learning for real physical systems.

Evaluating Machine learning Models:


Machine learning is a field of study and application that focuses on developing algorithms and
models that enable computers to learn and make predictions or decisions without being explicitly
programmed. It involves the development of mathematical and statistical techniques that allow
systems to automatically learn patterns and relationships from data and improve their
performance through experience.
Here are some fundamental concepts of machine learning
1. Data: Machine learning algorithms rely on data to learn and make predictions. The data
consists of input variables (features) and corresponding output variables (targets or labels). The
quality, quantity, and representativeness of the data play a crucial role in the success of machine
learning models.
2. Training, Validation, and Testing: In machine learning, the available data is typically divided
into three sets: the training set, the validation set, and the testing set. The training set is used to
train the model by adjusting its parameters based on the input-output patterns. The validation set
is used to fine-tune the model's hyperparameters and assess its performance during training. The
testing set is used to evaluate the final performance of the trained model on unseen data.
3. Supervised Learning: In supervised learning, the goal is to learn a mapping function that can
predict the output variable given the input variables. The training data consists of labeled
examples, where both the input and the desired output are known. Supervised learning
algorithms include regression (predicting continuous values) and classification (predicting
categorical values).
4. Unsupervised Learning: In unsupervised learning, the goal is to discover patterns or structures
in the data without explicit labels or target variables. Unsupervised learning algorithms include
clustering (grouping similar data points together) and dimensionality reduction (reducing the
number of input variables while preserving important information).
5. Feature Engineering: Feature engineering is the process of selecting, transforming, and
creating relevant features from the raw data to improve the performance of machine learning
models. It involves domain knowledge, data exploration, and various techniques such as
normalization, scaling, one-hot encoding, and feature extraction.
6. Model Evaluation and Selection: The performance of machine learning models needs to be
evaluated to assess their effectiveness. Common evaluation metrics depend on the task and can
include accuracy, precision, recall, F1 score, mean squared error, or area under the receiver
operating characteristic curve (AUC-ROC). Model selection involves comparing and choosing
the best-performing model based on the evaluation metrics.
7. Generalization and Overfitting: Machine learning models should be able to generalize well to
unseen data, meaning they can make accurate predictions on new, unseen examples. Overfitting
occurs when a model learns the training data too well, capturing noise and irrelevant patterns,
which can lead to poor performance on new data. Techniques like cross-validation and
regularization are used to prevent overfitting.
8. Bias-Variance Tradeoff: The bias-variance tradeoff is a key concept in machine learning. Bias
refers to the error introduced by the model's assumptions and simplifications, while variance
refers to the model's sensitivity to fluctuations in the training data. Finding the right balance
between bias and variance is crucial to achieve a model that can generalize well.
9. Model Deployment and Monitoring: Once a machine learning model is trained and evaluated,
it can be deployed to make predictions on new, real-world data. Model performance should be
continuously monitored, and models may need to be retrained or updated periodically as new
data becomes available or as requirements change.
Machine learning is a dynamic and rapidly evolving field with a wide range of algorithms,
techniques, and applications. Understanding these fundamental concepts provides a solid
foundation for diving deeper into the various aspects of machine learning and developing
effective models for solving real-world problems.
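The sketch below (assuming scikit-learn) illustrates points 2 and 6 above: the data is split into training, validation, and testing sets, the validation set is used to pick a hyperparameter, and the final model is scored on the held-out test set with accuracy, F1, and ROC AUC. The dataset and the candidate hyperparameter values are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Split into training (60%), validation (20%), and testing (20%) sets
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Use the validation set to choose a hyperparameter (regularization strength C)
best_C, best_val_acc = None, 0.0
for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(C=C, max_iter=5000).fit(X_train, y_train)
    val_acc = accuracy_score(y_val, model.predict(X_val))
    if val_acc > best_val_acc:
        best_C, best_val_acc = C, val_acc

# Retrain with the chosen hyperparameter and report final metrics on the test set
final = LogisticRegression(C=best_C, max_iter=5000).fit(X_train, y_train)
pred = final.predict(X_test)
print("Accuracy:", accuracy_score(y_test, pred))
print("F1 score:", f1_score(y_test, pred))
print("ROC AUC :", roc_auc_score(y_test, final.predict_proba(X_test)[:, 1]))
```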

Overfitting and Underfitting:


Overfitting and underfitting are two common problems in machine learning that arise when a
model's performance on the training data does not generalize well to unseen data. These issues
affect the model's ability to make accurate predictions on new examples. Understanding
overfitting and underfitting is crucial for building reliable and effective machine learning
models.

1. Overfitting:
Overfitting occurs when a model learns the training data too well, capturing noise and random
variations that are specific to the training set but do not exist in the underlying population or the
test data. Signs of overfitting include:
- High training accuracy but poor performance on the test/validation data.
- The model captures the noise and outliers in the training data, leading to poor generalization.
- The model is excessively complex and has too many parameters, which allows it to memorize
the training examples instead of learning the underlying patterns.
- Overly flexible models like deep neural networks can be prone to overfitting, especially with
limited training data.
To mitigate overfitting, the following strategies can be employed:
- Increase the size of the training dataset to provide more diverse examples.
- Use techniques like cross-validation or train/test split to evaluate the model's performance on
unseen data.
- Regularization methods like L1 or L2 regularization can be applied to penalize complex models
and reduce the impact of noise in the training data.
- Simplify the model by reducing the number of parameters, limiting the depth of decision trees,
or reducing the complexity of neural networks.
- Feature selection or dimensionality reduction techniques can help remove irrelevant or noisy
features.
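As a hedged illustration of two of these strategies, the sketch below compares an unregularized high-degree polynomial fit with the same model under L2 (ridge) regularization, using 5-fold cross-validation to estimate performance on unseen data; the dataset, polynomial degree, and regularization strength are illustrative choices, and exact scores will vary.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Small noisy dataset: easy for a flexible model to memorize
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)

# Unregularized degree-15 polynomial (prone to overfitting on 30 points)
overfit_model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
# Same features with an L2 (ridge) penalty on large coefficients
ridge_model = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1.0))

# 5-fold cross-validation estimates how each model generalizes to unseen folds
for name, model in [("no regularization", overfit_model),
                    ("L2 regularization", ridge_model)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
```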

2. Underfitting:
Underfitting occurs when a model is too simple to capture the underlying patterns in the data. It
fails to learn the important relationships between the input features and the target variable,
resulting in poor performance on both the training and test data. Signs of underfitting include:
- Low training accuracy and poor performance on both the training and test/validation data.
- The model is too simple and does not capture the complexities of the data.
- The model fails to learn important patterns or relationships in the data.

To address underfitting, the following strategies can be used:


- Increase the complexity of the model by adding more parameters or using more sophisticated
algorithms.
- Collect more relevant features or create new features that provide more information to the
model.
- Adjust hyperparameters of the model, such as learning rate, regularization strength, or tree
depth, to improve its performance.
- Consider using more advanced models that are better suited to capture complex patterns in the
data.
Balancing between overfitting and underfitting is crucial to achieve a model that can generalize
well to unseen data. It involves finding the right level of model complexity that captures the
underlying patterns without being too sensitive to noise or too simplistic to capture the relevant
information. Regular evaluation techniques, such as cross-validation, can help in assessing a
model's performance and detecting signs of overfitting or underfitting.
