Machine Learning
Once the data is collected, you need to validate whether the quantity is sufficient for the use case (for time-series data,
we typically need a minimum of 3–5 years of data).
The two most important things we do in a machine learning project are selecting a learning algorithm and
training the model on some of the acquired data.
As humans, we naturally tend to make mistakes, and as a result things may go wrong.
Here, the mistakes could be opting for the wrong model or selecting bad data.
NON-REPRESENTATIVE TRAINING DATA
The training data should be representative of the new cases we want to generalize to, i.e., the data used for training
should cover the kinds of cases the model will encounter in production.
By using a non-representative training set, the trained model is not likely to make accurate predictions.
A machine learning model is considered good when it makes accurate predictions for the general cases of the
business problem, i.e., when it performs well even on data it has never seen.
If the number of training samples is too small, we get sampling noise, i.e., non-representative data due to chance;
and even very large training sets can suffer from sampling bias if the method used to collect them is flawed.
POOR QUALITY OF DATA
In reality, we don’t start training the model straight away; analyzing the data is the most important step.
The data we collected might not be ready for training: some samples are abnormal, containing outliers
or missing values, for instance.
In these cases, we can remove the outliers, fill the missing features/values using the median or mean (e.g., to fill a
missing height), simply remove the attributes/instances with missing values, or train the model both with and
without these instances.
We don’t want our system to make false predictions, right? So the quality of data is very important to get
accurate results.
Data preprocessing therefore involves handling missing values and extracting and rearranging the features the model needs.
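A minimal sketch of this kind of clean-up, assuming pandas; the file name and column names (employees.csv, height, salary) are illustrative only:

```python
import pandas as pd

# Hypothetical dataset; file and column names are illustrative only.
df = pd.read_csv("employees.csv")

# Fill missing numeric values with the median (e.g., a missing height).
df["height"] = df["height"].fillna(df["height"].median())

# Or simply drop rows that still contain missing values.
df = df.dropna()

# Remove outliers, e.g., salaries more than 3 standard deviations from the mean.
mean, std = df["salary"].mean(), df["salary"].std()
df = df[(df["salary"] - mean).abs() <= 3 * std]
```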
IRRELEVANT/UNWANTED FEATURES
If the training data contains a large number of irrelevant features and not enough relevant features, the machine
learning system will not give the expected results.
One of the important aspects required for the success of a machine learning project is the selection of good
features to train the model, also known as feature selection.
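A minimal sketch of dropping irrelevant features, assuming scikit-learn; the data here is synthetic, with only a few informative features by construction:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data: 10 features, of which only 4 are informative.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)

# Keep the 4 features with the strongest relationship to the target.
selector = SelectKBest(score_func=f_classif, k=4)
X_selected = selector.fit_transform(X, y)
print(selector.get_support())  # boolean mask of the selected features
```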
UNDERFITTING THE TRAINING DATA
Underfitting, which is the opposite of overfitting, generally occurs when the model is too simple to learn the
underlying structure of the data.
It’s like trying to fit into undersized pants.
It generally happens when we have too little information to build an accurate model, or when we try to fit a linear
model to non-linear data.
Main options to reduce underfitting are:
Feature Engineering — feeding better features to the learning algorithm.
Removing noise from the data.
Increasing the number of model parameters or selecting a more powerful model (a short sketch follows this list).
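A minimal sketch of the last option, assuming scikit-learn and a synthetic non-linear dataset: a plain linear model underfits, while a more powerful (non-linear) model captures the structure:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Non-linear data: y = sin(x) plus a little noise.
rng = np.random.default_rng(0)
X = np.linspace(0, 6, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# A linear model is too simple for this structure and underfits.
print("linear R^2:", LinearRegression().fit(X, y).score(X, y))

# A more powerful (non-linear) model fits the structure far better.
print("tree R^2:", DecisionTreeRegressor(max_depth=4).fit(X, y).score(X, y))
```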
OFFLINE LEARNING & DEPLOYMENT OF THE MODEL
Machine learning engineering follows these steps while building an application: 1) data collection, 2) data cleaning,
3) feature engineering, 4) analyzing patterns, 5) training and optimizing the model, 6) deployment.
Many machine learning practitioners can perform all of the earlier steps but lack the skills for deployment.
Bringing their applications into production has become one of the biggest challenges, due to lack of practice,
dependency issues, a poor understanding of the underlying models and the business problem, and unstable models.
Generally, many developers collect data from websites like Kaggle and start training the model.
But in reality, we need to build our own source of data collection, and that data varies dynamically.
Offline learning (batch learning) may not be suitable for this type of changing data: the system is trained once,
launched into production, and then runs without learning anymore.
Here the data might drift as it changes dynamically.
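A minimal sketch of the alternative, incremental (online) learning, assuming scikit-learn's SGDClassifier and synthetic data; the model keeps learning from new batches, which batch learning cannot do after deployment:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
classes = np.unique(y)

# Unlike batch (offline) learning, partial_fit lets the model keep learning
# from new chunks of data after it has been launched into production.
clf = SGDClassifier(random_state=0)
for start in range(0, len(X), 200):          # simulate data arriving in batches
    X_batch, y_batch = X[start:start + 200], y[start:start + 200]
    clf.partial_fit(X_batch, y_batch, classes=classes)

print("accuracy on all data seen so far:", clf.score(X, y))
```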
MACHINE LEARNING TECHNIQUES
SUPERVISED LEARNING
• Supervised learning is where you have input variables (x) and an output variable (Y) and you use an algorithm to learn the
mapping function from the input to the output.
Y = f(X)
• Goal: To approximate the mapping function so well that when you have new input data (x), you can predict the output
variables (Y) for that data
TYPES OF SUPERVISED LEARNING
• Linear Regression: used for predicting continuous values by fitting a linear relationship between input features and output.
• Logistic Regression: used for binary classification problems, predicting the probability of a binary outcome.
• Decision Trees: a tree-like model that makes decisions based on the features of the input data; it is intuitive and interpretable.
• Random Forest: an ensemble method that combines multiple decision trees to improve accuracy and reduce overfitting.
• Support Vector Machines (SVM): a powerful classification technique that finds the hyperplane that best separates different classes in the feature space.
• K-Nearest Neighbors (KNN): a non-parametric method where the classification of a data point is based on the majority class among its nearest neighbors.
• Neural Networks: complex models that mimic the human brain’s structure, suitable for both classification and regression tasks, especially in high-dimensional spaces.
• Gradient Boosting Machines (GBM): an ensemble technique that builds models sequentially, where each new model corrects the errors of the previous ones.
• AdaBoost: an ensemble method that combines multiple weak classifiers to create a strong classifier, focusing on misclassified instances.
• Naive Bayes: a probabilistic classifier based on Bayes’ theorem, assuming independence between features; it is particularly effective for text classification.
LINEAR REGRESSION
Linear regression is a statistical regression method that is used for predictive analysis.
It is one of the simplest regression algorithms and models the relationship between continuous variables.
Linear regression shows the linear relationship between the independent variable (X-axis) and the dependent
variable (Y-axis), hence called linear regression.
If there is only one input variable (x), then such linear regression is called simple linear regression.
If there is more than one input variable, then such linear regression is called multiple linear regression.
LINEAR REGRESSION
The relationship between the variables in a linear regression model can be illustrated by predicting the
salary of an employee on the basis of years of experience.
Below is the mathematical equation for Linear regression:
Y = aX + b
Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients.
Some popular applications of linear regression are:
Analyzing trends and sales estimates
Salary forecasting
Real estate prediction
Arriving at ETAs in traffic.
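A minimal sketch of the salary-from-experience example, assuming scikit-learn; the numbers are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data only: years of experience vs. salary (in thousands).
X = np.array([[1], [2], [3], [5], [7], [10]])
y = np.array([35, 42, 50, 65, 80, 105])

model = LinearRegression().fit(X, y)
print("a (slope):", model.coef_[0])
print("b (intercept):", model.intercept_)
print("predicted salary for 4 years of experience:", model.predict([[4]])[0])
```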
POLYNOMIAL REGRESSION
• A type of regression that models the non-linear dataset using a linear model.
• Similar to multiple linear regression, but it fits a non-linear curve between the value of x and
corresponding conditional values of y.
• Suppose there is a dataset that consists of data points that are present in a non-linear fashion, so for
such a case, linear regression will not best fit those data points.
• To cover such data points, we need Polynomial regression.
• In Polynomial regression, the original features are transformed into polynomial features of a
given degree and then modeled using a linear model.
• The data points are best fitted using a polynomial line.
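A minimal sketch of polynomial regression, assuming scikit-learn: the original feature is transformed into polynomial features and then fitted with a linear model; the data is synthetic:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Data generated from a cubic curve with noise, so a straight line won't fit well.
rng = np.random.default_rng(1)
X = np.linspace(-2, 2, 50).reshape(-1, 1)
y = 0.5 * X.ravel() ** 3 - X.ravel() + rng.normal(scale=0.3, size=50)

# Transform the original feature into polynomial features, then fit a linear model.
poly_model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
poly_model.fit(X, y)
print("R^2 of the polynomial fit:", poly_model.score(X, y))
```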
LOGISTIC REGRESSION
• Logistic regression is another supervised learning algorithm that is used to solve classification problems.
• In classification problems, we have dependent variables in a binary or discrete format such as 0 or 1.
• Logistic regression algorithm works with the categorical variable such as 0 or 1, Yes or No, True or
False, Spam or not spam, etc.
• It is a predictive analysis algorithm that works on the concept of probability.
• Logistic regression uses a sigmoid or logistic function, which maps any real-valued input to a value between 0 and 1.
This sigmoid function is used to model the data in logistic regression.
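A minimal sketch of the sigmoid and of a logistic regression classifier, assuming scikit-learn and synthetic binary data (e.g., spam vs. not spam):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# The sigmoid squashes any real number into a probability between 0 and 1.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print("sigmoid(0) =", sigmoid(0.0))  # 0.5

# Synthetic binary classification data standing in for spam / not spam.
X, y = make_classification(n_samples=300, n_features=4, random_state=0)

clf = LogisticRegression().fit(X, y)
print(clf.predict(X[:5]))        # predicted classes (0 or 1)
print(clf.predict_proba(X[:5]))  # predicted probabilities per class
```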
SUPPORT VECTOR MACHINE
Support Vector Machine is a supervised learning algorithm that can be used for regression as well as
classification problems
Support Vector Regression (SVR) is a regression algorithm that works for continuous variables.
Below are some keywords that are used in Support Vector Machine:
• Kernel: It is a function used to map lower-dimensional data into higher dimensional data.
• Hyperplane: In general SVM, it is a separation line between two classes, but in SVR, it is a line that helps to
predict the continuous variables and covers most of the data points.
• Boundary line: Boundary lines are the two lines apart from the hyperplane, which creates a margin for data
points.
• Support vectors: Support vectors are the data points nearest to the hyperplane; they define the margin and the position of the hyperplane.
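A minimal sketch of SVM for classification (SVC) and regression (SVR), assuming scikit-learn; the data is synthetic and the regression target is only illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, SVR
from sklearn.model_selection import train_test_split

# Synthetic classification data to illustrate SVC with an RBF kernel.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# SVR works the same way for continuous targets (here we simply reuse y as a toy target).
reg = SVR(kernel="rbf").fit(X_train, y_train.astype(float))
print("first regression predictions:", reg.predict(X_test[:3]))
```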
K-NEAREST NEIGHBOR
K-NN assumes similarity between the new case/data and the available cases and puts the new case into the category
that is most similar to the available categories.
Stores all the available data and classifies a new data point based on the similarity.
Mostly used for classification problems.
K-NN is a non-parametric algorithm, which means it does not make any assumption on underlying data.
It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it
stores the dataset and, at the time of classification, performs an action on the dataset.
K-NEAREST NEIGHBOR
Suppose there are two categories, Category A and Category B, and we have a new data point x1;
in which of these categories will this data point lie?
To solve this type of problem, we need the K-NN algorithm.
With the help of K-NN, we can easily identify the category or class of a particular data point.
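A minimal sketch of this Category A / Category B example, assuming scikit-learn; the points are made up for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data: points labelled Category A (0) and Category B (1).
X = np.array([[1, 2], [2, 1], [1.5, 1.8],    # Category A
              [6, 7], [7, 6], [6.5, 6.8]])   # Category B
y = np.array([0, 0, 0, 1, 1, 1])

# Classify a new data point x1 by the majority class among its 3 nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
x1 = np.array([[6, 6]])
print("x1 belongs to category:", "A" if knn.predict(x1)[0] == 0 else "B")
```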
UNSUPERVISED LEARNING
• The training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that
information without guidance
• The task of the unsupervised learning algorithm is to group unsorted information according to similarities, patterns, and
differences, without any prior training on the data
TYPES OF UNSUPERVISED LEARNING
• Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping
customers by purchasing behavior.
• Dimensionality reduction: Refers to techniques for reducing the number of input variables in training data
• More input features often make a predictive modeling task more challenging to model, a difficulty generally referred to as the
curse of dimensionality
• Association: An association rule learning problem is where you want to discover rules that describe large portions of your
data, such as people that buy X also tend to buy Y.
CLUSTERING
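A minimal sketch of clustering, assuming scikit-learn's KMeans on synthetic data standing in for, e.g., customer purchasing behaviour:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic, unlabelled data standing in for customer purchasing behaviour.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Discover 3 groupings in the data without any labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments of the first 10 points:", kmeans.labels_[:10])
print("cluster centres:\n", kmeans.cluster_centers_)
```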
DIMENSIONALITY REDUCTION TECHNIQUES
Feature Selection: The process of selecting the subset of the relevant features and leaving out the
irrelevant features present in a dataset to build a model of high accuracy.
Three methods are used for feature selection:
Filter Methods
Wrapper Methods
Embedded Methods
DIMENSIONALITY REDUCTION TECHNIQUES
Feature Selection
Filter Methods: In this method, the dataset is filtered, and a subset that contains only the relevant features is taken.
Some common techniques of filter methods are:
Correlation
Chi-Square Test
ANOVA
Information Gain

Wrapper Methods: The wrapper method has the same goal as the filter method, but it takes a machine learning
model for its evaluation.
In this method, some features are fed to the ML model and the performance is evaluated.
The performance decides whether to add those features or remove them to increase the accuracy of the model.
This method is more accurate than the filter method but more complex to work with.

Embedded Methods: Embedded methods check the different training iterations of the machine learning model
and evaluate the importance of each feature.
Some common techniques of embedded methods are:
LASSO
Elastic Net
Ridge Regression, etc.
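A minimal sketch of one technique from each family, assuming scikit-learn (chi-square as a filter, recursive feature elimination as a wrapper, LASSO as an embedded method); the dataset and the regression use of LASSO on a 0/1 target are purely illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2, RFE
from sklearn.linear_model import LogisticRegression, Lasso

X, y = load_breast_cancer(return_X_y=True)

# Filter method: score each feature independently (chi-square) and keep the best 10.
X_filter = SelectKBest(score_func=chi2, k=10).fit_transform(X, y)

# Wrapper method: recursive feature elimination, using a model to judge features.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10).fit(X, y)
print("features kept by RFE:", rfe.support_.sum())

# Embedded method: LASSO drives the coefficients of unhelpful features to zero.
lasso = Lasso(alpha=0.1).fit(X, y)
print("features kept by LASSO:", (lasso.coef_ != 0).sum())
```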
DIMENSIONALITY REDUCTION TECHNIQUES
Feature Extraction: Feature extraction is the process of transforming the space containing many dimensions
into space with fewer dimensions.
This approach is useful when we want to keep the whole information but use fewer resources while processing the
information.
Some common feature extraction techniques are:
Principal Component Analysis
Linear Discriminant Analysis
Kernel PCA
Quadratic Discriminant Analysis
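A minimal sketch of feature extraction with Principal Component Analysis, assuming scikit-learn and its bundled digits dataset:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# 64-dimensional digit images compressed down to 10 principal components.
X, _ = load_digits(return_X_y=True)
pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)

print("original shape:", X.shape, "reduced shape:", X_reduced.shape)
print("variance explained by 10 components:", pca.explained_variance_ratio_.sum())
```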
POPULAR DIMENSIONALITY REDUCTION TECHNIQUES
Backward Feature Selection: Mainly used while developing Linear Regression or Logistic Regression model.
Below steps are performed in this technique to reduce the dimensionality or in feature selection:
In this technique, firstly, all the n variables of the given dataset are taken to train the model.
The performance of the model is checked.
Now we will remove one feature each time and train the model on n-1 features for n times, and will compute the performance of the
model.
We will check which variable made the smallest (or no) change in the performance of the model and drop that
variable; after that, we will be left with n-1 features.
Repeat the complete process until no feature can be dropped.
In this technique, by selecting the optimum performance of the model and maximum tolerable error rate, we can define the optimal
number of features required for the machine learning algorithms.
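A minimal sketch of backward elimination, assuming scikit-learn's SequentialFeatureSelector as a greedy implementation of these steps; the dataset and the target number of features are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector

X, y = load_breast_cancer(return_X_y=True)

# Start from all 30 features and repeatedly drop the one whose removal
# hurts the cross-validated performance the least.
backward = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),
    n_features_to_select=10,
    direction="backward",
).fit(X, y)
print("number of features kept:", backward.get_support().sum())
```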
POPULAR DIMENSIONALITY REDUCTION TECHNIQUES
Forward Feature Selection: Follows the inverse process of the backward elimination process.
It means, in this technique, we don't eliminate the feature; instead, we will find the best features that can produce
the highest increase in the performance of the model.
Below steps are performed in this technique:
• We start with a single feature only, and progressively we will add each feature at a time.
• Here we will train the model on each feature separately.
• The feature with the best performance is selected.
• The process is repeated, adding one feature at a time, until adding further features no longer gives a significant increase in the performance of the model (a short sketch follows this list).
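A minimal sketch of forward selection, again assuming scikit-learn's SequentialFeatureSelector (the same class as in the backward sketch, with direction="forward"); dataset and feature count are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector

X, y = load_breast_cancer(return_X_y=True)

# Start with no features and greedily add, one at a time, the feature that
# gives the largest improvement in cross-validated performance.
forward = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),
    n_features_to_select=10,
    direction="forward",
).fit(X, y)
print("selected feature indices:", forward.get_support().nonzero()[0])
```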
Missing Value Ratio: If a dataset has too many missing values, then we drop those variables as they
do not carry much useful information.
To perform this, we can set a threshold level, and if a variable has missing values more than that threshold, we
will drop that variable.
The lower the threshold value, the more variables are dropped and the more aggressive the reduction.
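A minimal sketch of the missing value ratio check, assuming pandas; the file name and the 40% threshold are illustrative:

```python
import pandas as pd

# Hypothetical dataset; drop every column whose fraction of missing values
# exceeds a chosen threshold (here 40%).
df = pd.read_csv("dataset.csv")  # illustrative file name
threshold = 0.40

missing_ratio = df.isnull().mean()              # fraction of missing values per column
df_reduced = df.loc[:, missing_ratio <= threshold]
print("dropped columns:", list(missing_ratio[missing_ratio > threshold].index))
```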
REINFORCEMENT LEARNING
• Reinforcement learning is the training of machine learning models to make a sequence of decisions
• The agent learns to achieve a goal in an uncertain, potentially complex environment
REINFORCEMENT LEARNING ALGORITHMS
SARSA (state-action-reward-state-action):
An on-policy reinforcement learning algorithm that estimates the value of the policy being followed.
In this algorithm, the agent learns the value of the policy it is currently following and uses that same policy to act.
The policy that is used for updating and the policy used for acting are the same, unlike in Q-learning.
An experience in SARSA is of the form ⟨S,A,R,S’, A’⟩, which means that
current state S,
current action A,
reward R, and
new state S’,
future action A’.
This experience provides a new target to update toward:
Q(S,A) ← Q(S,A) + α[R + γQ(S′,A′) − Q(S,A)]
where α is the learning rate and γ is the discount factor.
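A minimal sketch of this tabular SARSA update in Python; the state/action space sizes, α, γ, and the example experience are illustrative only:

```python
import numpy as np

# A minimal tabular SARSA update, assuming small discrete state/action spaces.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9   # learning rate and discount factor

def sarsa_update(Q, s, a, r, s_next, a_next):
    """Move Q(S, A) toward the target R + gamma * Q(S', A')."""
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])

# One experience <S, A, R, S', A'>: illustrative values only.
sarsa_update(Q, s=0, a=1, r=1.0, s_next=2, a_next=0)
print(Q)
```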