Machine Learning (Aryan Kumar 7th Sem) PDF

The document is a training report submitted by Aryan Kumar, a student at Aravali Institute of Technical Studies, for their internship at WebTek Labs Pvt. Ltd. It discusses machine learning concepts learned during the internship including supervised learning, unsupervised learning, reinforcement learning, and environment setup. It then describes a project done at WebTek Labs to create emojis using deep learning and concludes with testing and references.

Uploaded by

Aniket Kumar

Emoji Creator Project

A
Summer
“Training Report”
On
“WebTek Labs Pvt. Ltd”
Submitted in Partial fulfillment
For the award of the Degree of
B.Tech in Department of Computer Science & Engineering
( With Specialization in Computer Science & Engineering )

Session – 2020-2021

Submitted To :
Mr. Devendra Suthar
(Assistant Professor)
Computer Science

Submitted By :
Aryan Kumar
Roll no. 17ERACS007
(B.Tech, 7th Sem)

Department of Computer Science And Engineering


Aravali Institute of Technical Studies, Udaipur (Raj.)
Board of Rajasthan Technical University, Kota
(September – 2019)
CERTIFICATE
CANDIDATE DECLARATION

I hereby declare that this report on the summer training undertaken at “WEBTEK LABS PVT. LTD,
JAIPUR, RAJASTHAN”, submitted in partial fulfilment for the award of the degree of
“Bachelor of Technology” in the Department of Computer Science, Aravali Institute
of Technical Studies, Udaipur, is a record of my own investigation carried out under
the guidance of Mr. Jitendra Singh, Department of Computer Engineering,
Aravali Institute of Technical Studies.

I have not submitted the work presented in this report anywhere for the award of any
other degree.

Aryan Kumar
Computer Science
R no. 17ERASC007
AITS - Udaipur, Rajasthan

H.O.D: Dr. Jitendra Singh
DIRECTOR: Dr. Hemant Dhabhai
ACKNOWLEDGEMENT

I have given my best to this project and was able to complete it largely on my own, but I
would like to thank my friends, who helped me whenever a problem occurred. I am highly
indebted to WEBTEK LABS PVT. LTD, JAIPUR, RAJASTHAN for providing me with a creative
environment and all the facilities to learn advanced things. WEBTEK LABS PVT. LTD is doing
its best in the IT field to provide us with more opportunities to explore our ideas and build
a really great project in our field. I would also like to thank Dr. Jitendra Singh Chauhan,
Head of the Department of Computer Science and Engineering, for the support and
encouragement during the project. I perceive this project as a big milestone in my career.
I will strive to use the gained skills and knowledge in the best possible way, and I will
continue to work on their improvement in order to attain the desired objective.

Aryan Kumar
Computer Science
R no. 17ERASC007
AITS - Udaipur, Rajasthan
COMPANY PROFILE

WebTek Labs Pvt. Ltd. is recognized as a leading IT solution providing organization
with a dynamic and fast growing team of diversely talented individuals. Incorporated
in 2001, in our aim to provide the best talent, we initially started with Recruitment
& Staffing services. We paralleled this by providing knowledge and skill
development certification training programs. The WebTek Certified Tester (WCT)
Program, which aims to provide IT companies with trained software testers, has reached
soaring heights of recognition over the years. A few years after its inception,
WebTek Labs added Software development & testing services to the portfolio.

Having partnered and worked with some of the leading names across Education, IT,
ITES, Banking, Insurance, Aviation, Retail, Healthcare, Hospitality, Media,
Manufacturing and FMCG sectors, WebTek Labs has explored business
opportunities in software solutions with the Government, Corporate and Institutes.

With over a decade of experience we create and deliver high-impact solutions,
enabling our clients to achieve their business goals and enhance their
competitiveness. In our pursuit of excellence, WebTek’s Research & Development
team consistently innovates to provide up-to-date solutions keeping in pace with
changing times. Our mission is for businesses to leverage the internet and mobility
to work smarter and grow faster. We work as your outsourcing and consulting
partner.
INDEX
College Details
Certificate
Candidate Declaration
Acknowledgements
Company Profile
Contents
List of Figures

INTRODUCTION
About
1. Machine Learning
2. Supervised Machine Learning
3. Unsupervised Machine Learning
4. Semi-Supervised Machine Learning
5. Reinforcement Machine Learning

Core Topics
6. Environment Setup For Machine Learning
7. Installing All Required Modules For Machine Learning

My Projects
8. Create your emoji with Deep Learning
9. Testing
10. Conclusion
11. References
Chapter-1

MACHINE LEARNING
What is Machine Learning?
Machine learning is a subfield of computer science that evolved from the study of pattern
recognition and computational learning theory in artificial intelligence. Machine learning
explores the construction and study of algorithms that can learn from and make predictions on
data. Such algorithms operate by building a model from example inputs in order to make
data-driven predictions or decisions, rather than following strictly static program instructions.
Machine learning is closely related to and often overlaps with computational statistics, a
discipline that also specializes in prediction-making. It has strong ties to mathematical
optimization, which delivers methods, theory and application domains to the field. Machine
learning is employed in a range of computing tasks where designing and programming explicit
algorithms is infeasible. Example applications include spam filtering, optical character
recognition (OCR), search engines and computer vision. Machine learning is sometimes
conflated with data mining, although that focuses more on exploratory data analysis. Machine
learning and pattern recognition “can be viewed as two facets of the same field.” When
employed in industrial contexts, machine learning methods may be referred to as predictive
analytics or predictive modelling.

In 1959, Arthur Samuel defined machine learning as a “Field of study that gives computers the
ability to learn without being explicitly programmed”. Tom M. Mitchell provided a widely
quoted, more formal definition: “A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E”. This definition is notable for its defining
machine learning in fundamentally operational rather than cognitive terms, thus following
Alan Turing's proposal in his paper "Computing Machinery and Intelligence" that the question
“Can machines think?" be replaced with the question “Can machines do what we (as thinking
entities) can do?"

Steps involved in Machine Learning:

A machine learning project involves the following steps −

1. Defining a Problem

2. Preparing Data

3. Evaluating Algorithms

4. Improving Results

5. Presenting Results

The best way to get started using Python for machine learning is to work through a project
end-to-end and cover the key steps like loading data, summarizing data, evaluating algorithms
and making some predictions. This gives you a replicable method that can be used dataset after
dataset.

Terminologies of Machine Learning


• Model
A model is a specific representation learned from data by applying some machine
learning algorithm. A model is also called hypothesis.
• Feature
A feature is an individual measurable property of our data. A set of numeric features
can be conveniently described by a feature vector. Feature vectors are fed as input to
the model. For example, in order to predict a fruit, there may be features like colour,
smell, taste, etc.
Note: Choosing informative, discriminating and independent features is a crucial step
for effective algorithms. We generally employ a feature extractor to extract the relevant
features from the raw data.
• Target(Label)
A target variable or label is the value to be predicted by our model. For the fruit
example discussed in the features section, the label with each set of input would be the
name of the fruit like apple, orange, banana, etc.
• Training
The idea is to give a set of inputs (features) and their expected outputs (labels), so that after
training, we will have a model (hypothesis) that will then map new data to one of the
categories trained on.
• Prediction
Once our model is ready, it can be fed a set of inputs to which it will provide a predicted
output(label).
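
The terms above can be tied together in a few lines of Python. This is only a minimal sketch: the fruit measurements are made-up numbers, the two features (weight in hundreds of grams and a colour score from 0 for green to 1 for orange) are assumptions, and the nearest-neighbour rule is just one simple choice of model, not the method used in this report.

```python
# Hypothetical feature vectors: (weight in hundreds of grams, colour score 0..1).
training_features = [(1.5, 0.2), (1.7, 0.3), (1.4, 0.9), (1.3, 0.8)]
# Labels (targets) for the feature vectors above, in the same order.
training_labels = ["apple", "apple", "orange", "orange"]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(features, labels):
    """Training a nearest-neighbour model simply stores the labelled examples."""
    return list(zip(features, labels))


def predict(model, feature_vector):
    """Prediction returns the label of the stored example closest to the input."""
    nearest = min(model, key=lambda example: squared_distance(example[0], feature_vector))
    return nearest[1]


model = train(training_features, training_labels)
print(predict(model, (1.6, 0.25)))   # a new, unlabelled fruit
print(predict(model, (1.35, 0.85)))  # another new fruit
```

Here `model` plays the role of the hypothesis, each tuple is a feature vector, and the strings are the labels the model maps new inputs to.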
The figure shown below clarifies the above concepts:

Types of machine learning problems


There are various ways to classify machine learning problems. Here, we discuss the most
obvious ones.

1. On the basis of the nature of the learning “signal” or “feedback” available to a learning
system

•Supervised learning: The computer is presented with example inputs and their desired
outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to outputs.
The training process continues until the model achieves the desired level of accuracy on the
training data. Some real-life examples are:

•Image Classification: You train with images/labels. Then in the future you give a new image
expecting that the computer will recognize the new object.

•Market Prediction/Regression: You train the computer with historical market data and ask the
computer to predict the new price in the future.

•Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own
to find structure in its input. It is used for clustering population in different groups.
Unsupervised learning can be a goal in itself (discovering hidden patterns in data).

•Clustering: You ask the computer to separate similar data into clusters, this is essential in
research and science.

•High Dimension Visualization: Use the computer to help us visualize high dimension data.

•Generative Models: After a model captures the probability distribution of your input data, it
will be able to generate more data. This can be very useful to make your classifier more robust.
A simple diagram which clarifies the concept of supervised and unsupervised learning is shown
below:

As you can see clearly, the data in supervised learning is labelled, whereas the data in
unsupervised learning is unlabelled.

•Semi-supervised learning: Problems where you have a large amount of input data and only
some of the data is labeled, are called semi-supervised learning problems. These problems sit
in between both supervised and unsupervised learning. For example, a photo archive where
only some of the images are labeled, (e.g. dog, cat, person) and the majority are unlabeled.

•Reinforcement learning: A computer program interacts with a dynamic environment in
which it must perform a certain goal (such as driving a vehicle or playing a game against an
opponent). The program is provided feedback in terms of rewards and punishments as it
navigates its problem space.

2. On the basis of “output” desired from a machine learned system

•Classification: Inputs are divided into two or more classes, and the learner must produce a
model that assigns unseen inputs to one or more (multi-label classification) of these classes.
This is typically tackled in a supervised way. Spam filtering is an example of classification,
where the inputs are email (or other) messages and the classes are “spam” and “not spam”.

•Regression: It is also a supervised learning problem, but the outputs are continuous rather
than discrete. For example, predicting the stock prices using historical data.

An example of classification and regression on two different datasets is shown below:

•Clustering: Here, a set of inputs is to be divided into groups. Unlike in classification, the
groups are not known beforehand, making this typically an unsupervised task.

As you can see in the example below, the given dataset points have been divided into groups
identifiable by the colours red, green and blue.

•Density estimation: The task is to find the distribution of inputs in some space.

•Dimensionality reduction: It simplifies inputs by mapping them into a lower-dimensional
space. Topic modeling is a related problem, where a program is given a list of human language
documents and is tasked to find out which documents cover similar topics.

SOME APPLICATIONS OF MACHINE LEARNING ARE:

• Vision processing

• Language processing

• Forecasting things like stock market trends, weather

• Pattern recognition

• Games

• Data mining

• Expert systems

• Robotics

How does Machine Learning Work?

A Machine Learning algorithm is trained using a training data set to create a model. When new
input data is introduced to the ML algorithm, it makes a prediction on the basis of the model.
The prediction is evaluated for accuracy and if the accuracy is acceptable, the Machine
Learning algorithm is deployed. If the accuracy is not acceptable, the Machine Learning
algorithm is trained again and again with an augmented training data set.

This is just a very high-level example as there are many factors and other steps involved.
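
The train, evaluate and retrain cycle described above can be sketched as a small loop. Everything here is an assumption made for illustration: the one-feature data, the rule that the true label is 1 from 14 upwards, the 0.95 acceptance level and the nearest-neighbour model are all hypothetical.

```python
def predict(training_set, x):
    """1-nearest-neighbour prediction on a single numeric feature."""
    nearest_x, nearest_label = min(training_set, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def accuracy(training_set, test_set):
    """Fraction of test examples whose predicted label equals the true label."""
    correct = sum(1 for x, label in test_set if predict(training_set, x) == label)
    return correct / len(test_set)


# Hypothetical test data: odd numbers 1..17, labelled 1 when the value is at least 14.
test_set = [(x, int(x >= 14)) for x in range(1, 18, 2)]

# Labelled batches that become available over time (the "augmented" training data).
batches = [[(0, 0), (19, 1)], [(12, 0), (16, 1)]]
ACCEPTABLE = 0.95  # assumed acceptance level

training_set = []
history = []
for batch in batches:
    training_set.extend(batch)        # train again with the augmented data set
    score = accuracy(training_set, test_set)
    history.append(score)
    if score >= ACCEPTABLE:           # accuracy is acceptable: stop and deploy
        break

print(history)
```

With only the first batch the model misplaces the boundary, so the loop adds the second batch and evaluates again before accepting the model.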

Chapter-2

SUPERVISED MACHINE LEARNING


What is Supervised Machine Learning?
In supervised learning, learning data comes with description, labels, targets or desired outputs
and the objective is to find a general rule that maps inputs to outputs. This kind of learning
data is called labeled data. The learned rule is then used to label new data with unknown
outputs.

Supervised learning is commonly used in real world applications, such as face and speech
recognition, products or movie recommendations, and sales forecasting.

Supervised learning is when the model is trained on a labelled dataset. A labelled dataset
is one which has both input and output parameters. In this type of learning both training and
validation datasets are labelled as shown in the figures below.

Both the above figures have labelled data set –
•Figure A: It is a dataset of a shopping store which is useful in predicting whether a
customer will purchase a particular product under consideration or not based on his/ her
gender, age and salary.
Input: Gender, Age, Salary
Output: Purchased, i.e. 0 or 1; 1 means yes, the customer will purchase, and 0 means that the
customer won’t purchase it.
•Figure B: It is a Meteorological dataset which serves the purpose of predicting wind speed
based on different parameters.
Input: Dew Point, Temperature, Pressure, Relative Humidity, Wind Direction
Output: Wind Speed
Training the system:
While training the model, data is usually split in the ratio of 80:20, i.e. 80% as training data and
the rest as testing data. In the training data, we feed the input as well as the output for 80% of
the data. The model learns from the training data only. We use different machine learning
algorithms to build our model. By learning, we mean that the model will
build some logic of its own.
Once the model is ready, it can be tested. At the time of testing, input is fed from the
remaining 20% of the data, which the model has never seen before. The model predicts some value,
and we compare it with the actual output and calculate the accuracy.
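
The 80:20 split and the accuracy calculation can be sketched in plain Python as follows. The helper names `split_rows` and `accuracy`, the age-based purchase data and the threshold "model" are all hypothetical and only illustrate the procedure; a real project would normally use a library routine for the split.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out test_fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# Hypothetical labelled data: (age, purchased), with purchased = 1 from age 30 upwards.
rows = [(age, int(age >= 30)) for age in range(18, 68)]
train_rows, test_rows = split_rows(rows)

# A deliberately simple "model": the midpoint between the oldest non-buyer and the
# youngest buyer seen in the training rows is used as a decision threshold.
oldest_non_buyer = max(age for age, bought in train_rows if bought == 0)
youngest_buyer = min(age for age, bought in train_rows if bought == 1)
threshold = (oldest_non_buyer + youngest_buyer) / 2

# Testing uses only the held-out 20%, which the model has never seen.
predicted = [int(age >= threshold) for age, _ in test_rows]
actual = [bought for _, bought in test_rows]
print(len(train_rows), len(test_rows), accuracy(actual, predicted))
```

The threshold is learned from the training rows only, so the accuracy printed at the end measures how well that logic carries over to unseen rows.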

Types of Supervised Learning:
1.Classification: It is a Supervised Learning task where the output has defined labels
(discrete values). For example, in Figure A above, the output Purchased has defined labels, i.e. 0
or 1; 1 means the customer will purchase and 0 means that the customer won’t purchase. The goal
here is to predict discrete values belonging to a particular class and evaluate on the basis of
accuracy.
It can be either binary or multi-class classification. In binary classification, the model predicts
either 0 or 1 (yes or no), but in multi-class classification, the model chooses among more than
two classes.
Example: Gmail classifies mails into more than two classes, like social, promotions, updates and
forums.
A classification problem is when the output variable is a category, such as “red” or “blue”, or
“disease” and “no disease”. A classification model attempts to draw some conclusion from
observed values. Given one or more inputs, a classification model will try to predict the value
of one or more outcomes.
Examples include labelling emails as “spam” or “not spam”, or labelling transactions as
“fraudulent” or “authorized”. In short, classification builds a model from the training set and
its class labels, and then uses that model to predict categorical class labels for new data.
There are a number of classification models, including logistic regression, decision tree,
random forest, gradient-boosted tree, multilayer perceptron, one-vs-rest, and Naive Bayes.
For example:
Which of the following is/are classification problem(s)?
• Predicting the gender of a person by his/her handwriting style
• Predicting house price based on area

• Predicting whether monsoon will be normal next year
• Predict the number of copies a music album will be sold next month
Solution: Predicting the gender of a person and predicting whether monsoon will be normal next
year are classification problems. The other two are regression.
Having discussed classification with some examples, we now perform classification on the iris
dataset using RandomForestClassifier in Python.

Dataset Description
Title: Iris Plants Database
Attribute Information:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
Missing Attribute Values: None
Class Distribution: 33.3% for each of 3 classes

Program:
# Python code to illustrate classification using the iris data set

# Importing the required libraries
import pandas as pd
# train_test_split lives in sklearn.model_selection; the old
# sklearn.cross_validation module has been removed from scikit-learn
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

# Importing the dataset
dataset = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-'
                      'databases/iris/iris.data', sep=',', header=None)
data = dataset.iloc[:, :]

# Checking for null values
print("Sum of NULL values in each column. ")
print(data.isnull().sum())

# Separating the predicting column from the whole dataset
X = data.iloc[:, :-1].values
y = dataset.iloc[:, 4].values

# Encoding the predicting variable
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)

# Splitting the data into test and train dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Using the random forest classifier for the prediction
classifier = RandomForestClassifier()
classifier = classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)

# Printing the results
print('Confusion Matrix :')
print(confusion_matrix(y_test, predicted))
print('Accuracy Score :', accuracy_score(y_test, predicted))
print('Report : ')
print(classification_report(y_test, predicted))

Output:

Sum of NULL values in each column.


0 0
1 0
2 0
3 0
4 0

Confusion Matrix:
[[16 0 0]
[ 0 17 1]
[ 0 0 11]]

Accuracy Score : 97.7

Report:
             precision    recall  f1-score   support
0            1.00         1.00    1.00       16
1            1.00         0.94    0.97       18
2            0.92         1.00    0.96       11
avg/total    0.98         0.98    0.98       45
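
As a cross-check, the figures in the report above can be recomputed by hand from the confusion matrix. The sketch below assumes the usual convention that rows are the actual classes and columns are the predicted classes; the variable names are chosen for illustration.

```python
# Confusion matrix copied from the output above (rows: actual, columns: predicted).
confusion = [
    [16, 0, 0],
    [0, 17, 1],
    [0, 0, 11],
]

n_classes = len(confusion)
total = sum(sum(row) for row in confusion)
correct = sum(confusion[k][k] for k in range(n_classes))
accuracy = correct / total  # correctly classified samples over all samples

report = []
for k in range(n_classes):
    predicted_as_k = sum(confusion[i][k] for i in range(n_classes))  # column sum
    actually_k = sum(confusion[k])                                   # row sum (support)
    precision = confusion[k][k] / predicted_as_k
    recall = confusion[k][k] / actually_k
    f1 = 2 * precision * recall / (precision + recall)
    report.append({"precision": precision, "recall": recall,
                   "f1": f1, "support": actually_k})

print(round(accuracy, 3))
for k, row in enumerate(report):
    print(k, round(row["precision"], 2), round(row["recall"], 2),
          round(row["f1"], 2), row["support"])
```

The accuracy works out to 44 correct predictions out of 45, which matches the 97.7 reported above when expressed as a percentage.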

2.Regression: It is a Supervised Learning task where the output has a continuous value.

For example, in Figure B above, the output Wind Speed does not take discrete values but is
continuous within a particular range. The goal here is to predict a value as close to the actual
output value as our model can, and evaluation is then done by calculating the error value. The
smaller the error, the greater the accuracy of our regression model.
A regression problem is when the output variable is a real or continuous value, such as “salary”
or “weight”. Many different models can be used; the simplest is linear regression. It tries
to fit the data with the best hyper-plane which goes through the points.
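
For a single input feature, the best-fitting line can be computed directly with the ordinary least-squares formulas. The sketch below uses made-up data points and hypothetical helper names; it shows the idea of fitting and of measuring the error, not the library code used later in this chapter.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between predicted and actual outputs."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data that roughly follows y = 2x + 1 with a little noise.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
error = mean_squared_error(xs, ys, slope, intercept)
print(slope, intercept, error)
```

The smaller the mean squared error, the closer the fitted line passes to the data points.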

Types of Regression Models:

For example:
Which of the following is a regression task?
• Predicting age of a person
• Predicting nationality of a person

• Predicting whether stock price of a company will increase tomorrow
• Predicting whether a document is related to sighting of UFOs?
Solution: Predicting age of a person (because it is a real value, predicting nationality is
categorical, whether stock price will increase is discrete-yes/no answer, predicting whether a
document is related to UFO is again discrete- a yes/no answer).
Let’s take an example of linear regression. We have a Housing data set and we want to predict
the price of the house. Following is the python code for it

# Python code to illustrate regression using the Housing data set
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
import pandas as pd

# Load CSV and columns
df = pd.read_csv("Housing.csv")

Y = df['price']
X = df['lotsize']

# Convert the pandas Series to column arrays (scikit-learn expects 2-D input)
X = X.values.reshape(len(X), 1)
Y = Y.values.reshape(len(Y), 1)

# Split the data into training/testing sets
X_train = X[:-250]
X_test = X[-250:]

# Split the targets into training/testing sets
Y_train = Y[:-250]
Y_test = Y[-250:]

# Plot outputs
plt.scatter(X_test, Y_test, color='black')
plt.title('Test Data')
plt.xlabel('Size')
plt.ylabel('Price')
plt.xticks(())
plt.yticks(())

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(X_train, Y_train)

# Plot outputs
plt.plot(X_test, regr.predict(X_test), color='red', linewidth=3)
plt.show()

The output of above code is:

Some types of Supervised Learning Algorithms:


• Linear Regression
• Nearest Neighbour
• Gaussian Naive Bayes
• Decision Trees

• Support Vector Machine (SVM)
• Random Forest

Chapter-3

UNSUPERVISED MACHINE LEARNING


Introduction:
Unsupervised learning is used to detect anomalies, outliers, such as fraud or defective
equipment, or to group customers with similar behaviors for a sales campaign. It is the opposite
of supervised learning. There is no labeled data here.

When learning data contains only some indications without any description or labels, it is up
to the coder or to the algorithm to find the structure of the underlying data, to discover hidden
patterns, or to determine how to describe the data. This kind of learning data is called unlabeled
data.

Unlike supervised learning, no teacher is provided, which means no training will be given to the
machine. Therefore, the machine has to find the hidden structure in unlabelled data by
itself.
For instance, suppose it is given an image having both dogs and cats which it has never seen
before.

Thus, the machine has no idea about the features of dogs and cats, so it can’t label them as
dogs and cats. But it can categorize them according to their similarities, patterns, and
differences, i.e., it can easily categorize the above picture into two parts. The first part may
contain all the pics having dogs in them and the second part may contain all the pics having
cats in them. Here the machine didn’t learn anything before, which means no training data or
examples.

Types of Unsupervised Machine Learning:


1) Clustering
2) Association
1) Clustering: A clustering problem is where you want to discover the inherent groupings in
the data, such as grouping customers by purchasing behaviour.
Clustering is the task of dividing the population or data points into a number of groups such
that data points in the same groups are more similar to other data points in the same group and
dissimilar to the data points in other groups. It is basically a collection of objects on the basis
of similarity and dissimilarity between them.
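
One common way to form such groups is the k-means algorithm, which repeatedly assigns each point to its nearest centre and then moves each centre to the mean of its points. The sketch below is a minimal version under stated assumptions: the nine 2-D points are made up, the number of clusters is fixed at three, and the starting centres are picked by hand so that the result is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points]


def update(points, assignments, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k in range(len(centroids)):
        members = [p for p, a in zip(points, assignments) if a == k]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centroids.append(centroids[k])  # keep an empty cluster's centre
    return new_centroids


def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and update steps until the assignments stop changing."""
    assignments = assign(points, centroids)
    for _ in range(max_iterations):
        centroids = update(points, assignments, centroids)
        new_assignments = assign(points, centroids)
        if new_assignments == assignments:
            break
        assignments = new_assignments
    return assignments, centroids


# Hypothetical unlabelled data with three visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0),
          (1.0, 9.0), (1.5, 8.5), (2.0, 9.5)]
initial_centroids = [points[0], points[3], points[6]]  # hand-picked starting centres

assignments, centroids = k_means(points, initial_centroids)
print(assignments)
print(centroids)
```

No labels are used anywhere: the groups come only from the distances between the points.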

For example, the data points in the graph below that are clustered together can be classified
into one single group. We can distinguish the clusters, and we can identify that there are 3
clusters in the picture below.

Clusters need not be spherical, for example:

DBSCAN Density data
These data points are clustered by using the basic concept that the data point lies within the
given constraint from the cluster centre. Various distance methods and techniques are used for
calculation of the outliers.

2) Association: An association rule learning problem is where you want to discover rules that
describe large portions of your data, such as people that buy X also tend to buy Y.
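
Association rules are usually judged by their support (how often the items appear together) and their confidence (how often the rule holds when its left-hand side is present). The sketch below computes both for a rule of the form "people who buy X also tend to buy Y"; the shopping baskets are invented for illustration.

```python
# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


print(support({"bread", "butter"}))       # how often bread and butter occur together
print(confidence({"bread"}, {"butter"}))  # how often bread buyers also buy butter
```

A rule is typically kept only when both values exceed thresholds chosen by the analyst.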

Chapter-4

SEMI-SUPERVISED MACHINE LEARNING


Introduction:
If some learning samples are labelled, but some others are not labelled, then it is semi-supervised
learning. It makes use of a small amount of labelled data together with a large amount
of unlabelled data for training. Semi-supervised learning is applied in cases where it is expensive
to acquire a fully labelled dataset while more practical to label a small subset. For example, it
often requires skilled experts to label certain remote sensing images, and lots of field
experiments to locate oil at a particular location, while acquiring unlabelled data is relatively
easy.

Today’s Machine Learning algorithms can be broadly classified into three categories,
Supervised Learning, Unsupervised Learning and Reinforcement Learning. Setting
Reinforcement Learning aside, the primary two categories of Machine Learning problems are
Supervised and Unsupervised Learning. The basic difference between the two is that
Supervised Learning datasets have an output label associated with each tuple while
Unsupervised Learning datasets do not.

The most basic disadvantage of any Supervised Learning algorithm is that the dataset has to
be hand-labeled either by a Machine Learning Engineer or a Data Scientist. This is a very
costly process, especially when dealing with large volumes of data. The most basic
disadvantage of any Unsupervised Learning is that its application spectrum is limited.

To counter these disadvantages, the concept of Semi-Supervised Learning was introduced. In
this type of learning, the algorithm is trained upon a combination of labelled and unlabelled
data. Typically, this combination will contain a very small amount of labeled data and a very
large amount of unlabelled data. The basic procedure involved is that first, the programmer
will cluster similar data using an unsupervised learning algorithm and then use the existing
labeled data to label the rest of the unlabelled data. The typical use cases of such type of
algorithm have a common property among them – The acquisition of unlabelled data is
relatively cheap while labelling the said data is very expensive.
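
The two-step procedure described above (cluster first, then spread the few known labels) can be sketched as follows. The one-dimensional measurements, the gap-based clustering rule, the `max_gap` value and the labels "small" and "large" are all assumptions chosen to keep the example short; any unsupervised clustering algorithm could take the place of `cluster_by_gaps`.

```python
from collections import Counter


def cluster_by_gaps(values, max_gap):
    """Group sorted 1-D values; a new cluster starts wherever the gap exceeds max_gap."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > max_gap:
            clusters.append([value])
        else:
            clusters[-1].append(value)
    return clusters


def propagate_labels(clusters, known_labels):
    """Give every point the majority label of the labelled points in its cluster."""
    labels = {}
    for cluster in clusters:
        votes = Counter(known_labels[v] for v in cluster if v in known_labels)
        if votes:  # clusters without any labelled point stay unlabelled
            majority = votes.most_common(1)[0][0]
            for value in cluster:
                labels[value] = majority
    return labels


# Hypothetical data: eight measurements, only two of which were labelled by hand.
values = [1.0, 1.2, 1.5, 1.9, 7.8, 8.1, 8.4, 9.0]
known_labels = {1.2: "small", 8.4: "large"}

clusters = cluster_by_gaps(values, max_gap=2.0)
labels = propagate_labels(clusters, known_labels)
print(labels)
```

Two hand-made labels are enough here to label all eight points, which is the cost saving that motivates semi-supervised learning.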

Intuitively, one may imagine the three types of learning algorithms as Supervised learning
where a student is under the supervision of a teacher at both home and school, Unsupervised
learning where a student has to figure out a concept himself and Semi-Supervised learning
where a teacher teaches a few concepts in class and gives questions as homework which are
based on similar concepts.

A Semi-Supervised algorithm assumes the following about the data –

1. Continuity Assumption: The algorithm assumes that the points which are closer to each
other are more likely to have the same output label.

2. Cluster Assumption: The data can be divided into discrete clusters and points in the same
cluster are more likely to share an output label.

3. Manifold Assumption: The data lie approximately on a manifold of much lower
dimension than the input space. This assumption allows the use of distances and densities
which are defined on a manifold.

Practical applications of Semi-Supervised Learning –

1) Speech Analysis: Since labelling of audio files is a very intensive task, Semi-Supervised
learning is a very natural approach to solve this problem.

2) Internet Content Classification: Labelling each webpage is an impractical and unfeasible
process, and thus Semi-Supervised learning algorithms are used. Even the Google search algorithm
uses a variant of Semi-Supervised learning to rank the relevance of a webpage for a given
query.

3) Protein Sequence Classification: Since DNA strands are typically very large in size, the
rise of Semi-Supervised learning has been imminent in this field.

Chapter-5

REINFORCEMENT MACHINE LEARNING


Introduction:
Here learning data gives feedback so that the system adjusts to dynamic conditions in order to
achieve a certain objective. The system evaluates its performance based on the feedback
responses and reacts accordingly. The best-known instances include self-driving cars and the
Go-playing program AlphaGo.

Reinforcement learning is an area of Machine Learning. It is about taking
suitable action to maximize reward in a particular situation. It is employed by various software
and machines to find the best possible behaviour or path to take in a specific situation.
Reinforcement learning differs from supervised learning in that in supervised
learning the training data has the answer key with it, so the model is trained with the correct
answer itself, whereas in reinforcement learning there is no answer, and the reinforcement agent
decides what to do to perform the given task. In the absence of a training dataset, it is bound to
learn from its experience.

Example: The problem is as follows: We have an agent and a reward, with many hurdles in
between. The agent is supposed to find the best possible path to reach the reward. The
following example explains the problem more easily.

The above image shows a robot, a diamond and fire. The goal of the robot is to get the reward, that
is the diamond, and avoid the hurdles, that is the fire. The robot learns by trying all the possible
paths and then choosing the path which gives it the reward with the least hurdles. Each right
step will give the robot a reward and each wrong step will subtract from the reward of the robot. The
total reward will be calculated when it reaches the final reward, that is the diamond.
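
The robot example can be sketched with a small table of Q-values, one for each (state, action) pair. Several things here are assumptions rather than part of the original example: the 3 x 3 grid, the positions of the fire and the diamond, the rewards (+10 for the diamond, -10 for the fire, -1 for any other step), the discount factor 0.9, and the use of systematic sweeps over every state and action (instead of random exploration) so that the result is reproducible.

```python
ROWS, COLS = 3, 3
START, FIRE, DIAMOND = (0, 0), (1, 1), (2, 2)   # assumed layout, as (row, column)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.9  # discount factor for future rewards


def step(state, action):
    """Apply an action; returns (next_state, reward, episode_finished)."""
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (0 <= row < ROWS and 0 <= col < COLS):
        row, col = state  # bumping into a wall leaves the robot in place
    next_state = (row, col)
    if next_state == DIAMOND:
        return next_state, 10.0, True
    if next_state == FIRE:
        return next_state, -10.0, True
    return next_state, -1.0, False


states = [(r, c) for r in range(ROWS) for c in range(COLS)]
q = {(s, a): 0.0 for s in states for a in ACTIONS}

# Try every action in every state repeatedly, applying the Q-learning update
# with learning rate 1 (reasonable here because the environment is deterministic).
for _ in range(50):
    for s in states:
        if s in (FIRE, DIAMOND):
            continue
        for a in ACTIONS:
            next_state, reward, done = step(s, a)
            future = 0.0 if done else max(q[(next_state, b)] for b in ACTIONS)
            q[(s, a)] = reward + GAMMA * future

# Follow the best-valued action from the start to read off the learned path.
path = [START]
while path[-1] not in (FIRE, DIAMOND) and len(path) < 20:
    best_action = max(ACTIONS, key=lambda a: q[(path[-1], a)])
    path.append(step(path[-1], best_action)[0])

print(path)
```

The learned path reaches the diamond in the fewest steps while never entering the fire cell, because stepping into the fire always has the lowest value.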

Main points in Reinforcement learning –


• Input: The input should be an initial state from which the model will start
• Output: There are many possible outputs as there are a variety of solutions to a particular
problem
• Training: The training is based upon the input; the model will return a state and the user
will decide to reward or punish the model based on its output.
• The model continues to learn.
• The best solution is decided based on the maximum reward.
Types of Reinforcement: There are two types of Reinforcement:

1) Positive –

Positive reinforcement is defined as when an event that occurs due to a particular behaviour
increases the strength and the frequency of that behaviour. In other words, it has a positive
effect on the behaviour.

Advantages of positive reinforcement:

a) Maximizes performance

b) Sustains change for a long period of time

Disadvantages of positive reinforcement:

i) Too much reinforcement can lead to an overload of states, which can diminish the results

2) Negative –

Negative reinforcement is defined as the strengthening of a behaviour because a negative
condition is stopped or avoided.

Advantages of negative reinforcement:

a) Increases behaviour

b) Provides defiance to a minimum standard of performance

Disadvantages of negative reinforcement:

i) It only provides enough to meet the minimum behaviour

Chapter-6

ENVIRONMENT SETUP FOR MACHINE LEARNING

Programming Language Setup:


PYTHON INSTALLATION
Step-1:

Open a browser and go to the official Python website, www.python.org, to download Python.

The following page will appear in your browser.

Step-2:

Click the Windows link (two lines below the Download Python 3.7.4 button).

The following page will appear in your browser.

Step-3:

Click on the Download Windows x86-64 executable installer link under the top-left Stable
Releases.

The following pop-up window titled Opening python-3.7.4-amd64.exe will appear.

Click the Save File button.

The file named python-3.7.4-amd64.exe should start downloading into your standard download
folder. This file is about 30 MB, so it might take a while to download fully if you are on a
slow internet connection (it took me about 10 seconds over a cable modem).

The file should appear in your download folder.

1. Move this file to a more permanent location, so that you can install Python (and
reinstall it easily later, if necessary).
2. Feel free to explore this webpage further; if you want to just continue the installation,
you can terminate the tab browsing this webpage.
3. Start the Installing instructions directly below.

Installing

Step-1:

Double-click the icon labeling the file python-3.7.4-amd64.exe.

A Python 3.7.4 (64-bit) Setup pop-up window will appear.

Ensure that the Install launcher for all users (recommended) and the Add Python 3.7 to
PATH checkboxes at the bottom are checked.

If the Python Installer finds an earlier version of Python installed on your computer, the Install
Now message may instead appear as Upgrade Now (and the checkboxes will not appear).

Step-2:

Highlight the Install Now (or Upgrade Now) message, and then click it.

When run, a User Account Control pop-up window may appear on your screen. I could not
capture its image, but it asks, Do you want to allow this app to make changes to your device.

Step-3:

Click the Yes button.

A new Python 3.7.4 (64-bit) Setup pop-up window will appear with a Setup
Progress message and a progress bar.

During installation, it will show the various components it is installing and move the progress
bar towards completion. Soon, a new Python 3.7.4 (64-bit) Setup pop-up window will appear
with a Setup was successful message.

Step-4:

Click the Close button.

Python should now be installed.

Verifying

To verify the installation:

1. Navigate to the directory C:\Users\Pattis\AppData\Local\Programs\Python\Python37 (or to
whatever directory Python was installed in: see the pop-up window for Installing Step-3).

2. Double-click the icon/file python.exe.

The following pop-up window will appear.

A pop-up window with the title
C:\Users\Pattis\AppData\Local\Programs\Python\Python37\python.exe appears. Inside the window,
on the first line, is the text Python 3.7.4 ... (notice that it should also say 64 bit).
Inside the window, at the bottom left, is the prompt >>>: type exit() at this prompt and press
Enter to terminate Python.
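
The same check can also be done from inside Python. The following is a small sketch, not part of the original instructions, that reports the interpreter version and whether it is a 64-bit build; the helper name describe_python is hypothetical.

```python
import platform
import sys


def describe_python():
    """Return the interpreter version string and its word size in bits."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    bits = 64 if sys.maxsize > 2**32 else 32  # sys.maxsize exceeds 2**32 only on 64-bit builds
    return version, bits


version, bits = describe_python()
print("Python", version, "({}-bit)".format(bits))
print("Executable:", sys.executable)  # shows which installation is actually running
print("Platform:", platform.platform())
```

If the printed executable path is not the directory chosen during installation, another Python installation is probably earlier on the PATH.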

You should keep the file python-3.7.4-amd64.exe somewhere on your computer in case you need to
reinstall Python (not likely necessary).

Installing Jupyter Notebook using pip

As an existing or experienced Python user, you may wish to install Jupyter using Python’s
package manager, pip.

If you have Python 3 installed (which is recommended):

python3 -m pip install --upgrade pip

python3 -m pip install jupyter

If you have Python 2 installed:

python -m pip install --upgrade pip

python -m pip install jupyter

Congratulations, you have installed Jupyter Notebook! To run the notebook, run the
following command at the Terminal (Mac/Linux) or Command Prompt (Windows):

jupyter notebook

See Running the Notebook for more details.

Chapter-7

INSTALLING ALL REQUIRED MODULES FOR

MACHINE LEARNING

Step-1:

Open Command Prompt.

Step-2:

Enter the following commands in the Command Prompt shell to install the corresponding
modules:

1. PANDAS:

pip install pandas

2. NUMPY:

pip install numpy

3. SCIPY:

pip install scipy

4. MATPLOTLIB:

pip install matplotlib

5. SCIKIT LEARN:

pip install scikit-learn

6. SPEECH RECOGNITION:

pip install SpeechRecognition

7. PYTTSX3:

pip install pyttsx3
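
After running the commands above, it can be useful to confirm that each module is importable. The sketch below is an illustrative addition: the helper check_modules is hypothetical, and the mapping pairs each pip package name with the name used in an import statement (for example, scikit-learn is imported as sklearn).

```python
import importlib.util

# pip package name -> name used in "import ..."
REQUIRED_MODULES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "SpeechRecognition": "speech_recognition",
    "pyttsx3": "pyttsx3",
}


def check_modules(modules):
    """Return {package name: True/False} without actually importing anything."""
    return {
        pip_name: importlib.util.find_spec(import_name) is not None
        for pip_name, import_name in modules.items()
    }


for pip_name, installed in check_modules(REQUIRED_MODULES).items():
    print("{:<20} {}".format(pip_name, "OK" if installed else "MISSING"))
```

Any package reported as MISSING can be installed again with the corresponding pip install command.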


Opening Jupyter Notebook:

Enter the following command in the Command Prompt: jupyter notebook

After this, the following window will open:

PROJECT WORK:
Why IRIS project:

• Attributes are numeric so you have to figure out how to load and handle data.
• It is a classification problem, allowing you to practice with perhaps an easier type of
supervised learning algorithm.
• It is a multi-class classification problem (multinomial) that may require some specialized
handling.
• It only has 4 attributes and 150 rows, meaning it is small and easily fits into memory (and a
screen or A4 page).
• All of the numeric attributes are in the same units and the same scale, not requiring any
special scaling or transforms to get started.

Here is an overview of what we are going to cover:

1. Installing the Python and SciPy platform.
2. Loading the dataset.
3. Summarizing the dataset.
4. Visualizing the dataset.
5. Evaluating some algorithms.
6. Making some predictions.

Whole project work explained step-by-step:


Step-1: Start Python and Check Versions

Step-2: Import Libraries

Step-3: Load Dataset

Step-4: Summarize Dataset

Step-5: Check Dimensions of Dataset

Step-6: Peek at the Data

Step-7: Statistical Summary

Step-8: Class Distribution
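
The loading and summarizing steps (Step-3 to Step-8) can be sketched as follows. This is an assumption-laden illustration rather than the code from the original screenshots: it loads the Iris data from the copy bundled with scikit-learn instead of a CSV file, and the column names are chosen here for readability.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the dataset (bundled copy; no download needed).
iris = load_iris()
dataset = pd.DataFrame(
    iris.data,
    columns=["sepal-length", "sepal-width", "petal-length", "petal-width"],
)
dataset["class"] = [iris.target_names[i] for i in iris.target]

# Dimensions: 150 rows, 4 attributes plus the class column.
print(dataset.shape)

# Peek at the first rows of the data.
print(dataset.head(20))

# Statistical summary: count, mean, min, max and percentiles per attribute.
print(dataset.describe())

# Class distribution: number of rows per class.
print(dataset.groupby("class").size())
```

Each of the three classes has 50 rows, so the dataset is balanced.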

Step-9: Data Visualization:

Univariate Plots

Histogram plots

Multivariate Plots

Step-10: Evaluate some Algorithms

Step-11: Compare Model by plotting

Step-12: Make Prediction on validation dataset

Chapter-8

Create your emoji with Deep Learning

About the Dataset

The FER2013 dataset (facial expression recognition) consists of 48×48 pixel grayscale face
images. The images are centered and occupy an equal amount of space. This dataset consists of
facial emotions of the following categories:

1. Angry
2. Disgust
3. Fear
4. Happy
5. Sad
6. Surprise
7. Neutral
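
Because every image is a 48×48 grayscale array, a single face has to be reshaped before a Keras-style convolutional network can consume it. The sketch below is illustrative only: the helper to_model_input is hypothetical, and it assumes the network expects input of shape (batch, height, width, channels) with pixel values scaled to the range 0 to 1.

```python
import numpy as np


def to_model_input(gray_face):
    """Turn one 48x48 uint8 grayscale image into a (1, 48, 48, 1) float32 batch."""
    if gray_face.shape != (48, 48):
        raise ValueError("expected a 48x48 grayscale image")
    scaled = gray_face.astype("float32") / 255.0  # same scaling as rescale=1./255
    return scaled[np.newaxis, :, :, np.newaxis]   # add batch and channel axes


# A stand-in image: in practice this would be a cropped face from the dataset.
dummy_face = np.full((48, 48), 255, dtype=np.uint8)
batch = to_model_input(dummy_face)
print(batch.shape, batch.dtype)
```

The leading axis is the batch size (one image here) and the trailing axis is the single grayscale channel.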

Facial Emotion Recognition using CNN


In this section we build a convolutional neural network architecture and train the model on the
FER2013 dataset for emotion recognition from images.
Download the dataset and extract it into the data folder with separate train and test
directories.

1. Imports (Make a file train.py):

import numpy as np
import cv2
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.optimizers import Adam
from keras.layers import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator

2. Initialize the training and validation generators:

train_dir = 'data/train'
val_dir = 'data/test'
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(48, 48),
    batch_size=64,
    color_mode="grayscale",
    class_mode='categorical')

validation_generator = val_datagen.flow_from_directory(
    val_dir,
    target_size=(48, 48),
    batch_size=64,
    color_mode="grayscale",
    class_mode='categorical')

3. Build the convolution network architecture:

emotion_model = Sequential()

emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                         input_shape=(48, 48, 1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))


emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dropout(0.5))
emotion_model.add(Dense(7, activation='softmax'))
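
To see how the layers above shrink the 48×48 input, the shapes and parameter counts can be worked out by hand. The sketch below is an illustrative calculation, not part of train.py; it assumes the Keras defaults of 'valid' padding and stride 1 for Conv2D, and a pooling stride equal to the pool size for MaxPooling2D. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width of max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


size = 48
size = conv_out(size)   # Conv2D(32)   -> 46
size = conv_out(size)   # Conv2D(64)   -> 44
size = pool_out(size)   # MaxPooling2D -> 22
size = conv_out(size)   # Conv2D(128)  -> 20
size = pool_out(size)   # MaxPooling2D -> 10
size = conv_out(size)   # Conv2D(128)  -> 8
size = pool_out(size)   # MaxPooling2D -> 4
flat = size * size * 128  # Flatten    -> 2048 features

total = (
    conv_params(3, 1, 32)
    + conv_params(3, 32, 64)
    + conv_params(3, 64, 128)
    + conv_params(3, 128, 128)
    + dense_params(flat, 1024)
    + dense_params(1024, 7)
)
print(flat, total)  # 2048 2345607
```

Most of the roughly 2.3 million parameters sit in the first dense layer (2048 × 1024 weights), which is why Dropout(0.5) is applied right after it.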

4. Compile and train the model:

# Note: lr/decay and fit_generator are from older Keras releases;
# newer releases use learning_rate and fit() instead.
emotion_model.compile(loss='categorical_crossentropy',
                      optimizer=Adam(lr=0.0001, decay=1e-6),
                      metrics=['accuracy'])

emotion_model_info = emotion_model.fit_generator(
    train_generator,
    steps_per_epoch=28709 // 64,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=7178 // 64)

5. Save the model weights:

emotion_model.save_weights('model.h5')

6. Using the OpenCV Haar cascade XML file, detect the bounding boxes of faces in the
webcam feed and predict the emotions:

cv2.ocl.setUseOpenCL(False)

emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral", 5: "Sad", 6: "Surprised"}

# Load the face detector once, outside the loop.
# This path is from the original setup; adjust it for your installation.
bounding_box = cv2.CascadeClassifier('/home/shivam/.local/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml')

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    num_faces = bounding_box.detectMultiScale(gray_frame, scaleFactor=1.3, minNeighbors=5)

    for (x, y, w, h) in num_faces:
        cv2.rectangle(frame, (x, y-50), (x+w, y+h+10), (255, 0, 0), 2)
        # Crop the face and reshape it to (1, 48, 48, 1) for the model
        roi_gray_frame = gray_frame[y:y + h, x:x + w]
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray_frame, (48, 48)), -1), 0)
        emotion_prediction = emotion_model.predict(cropped_img)
        maxindex = int(np.argmax(emotion_prediction))
        cv2.putText(frame, emotion_dict[maxindex], (x+20, y-60),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)

    cv2.imshow('Video', cv2.resize(frame, (1200, 860), interpolation=cv2.INTER_CUBIC))
    # Press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Code for GUI and mapping with emojis


Create a folder named emojis and save the emojis corresponding to each of the
seven emotions in the dataset.
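
Before wiring up the GUI, the mapping from a predicted class to an emoji file can be sketched on its own. This is an illustrative addition: the helpers emoji_for and missing_emoji_files are hypothetical, and the file names follow the ones used in the GUI code in this chapter (including the spelling surpriced.png).

```python
import os

EMOTIONS = ["Angry", "Disgusted", "Fearful", "Happy", "Neutral", "Sad", "Surprised"]
EMOJI_FILES = {
    0: "angry.png",
    1: "disgusted.png",
    2: "fearful.png",
    3: "happy.png",
    4: "neutral.png",
    5: "sad.png",
    6: "surpriced.png",  # file name as used in the GUI code
}


def emoji_for(probabilities, folder="./emojis"):
    """Return (label, image path) for the most probable of the seven classes."""
    if len(probabilities) != len(EMOTIONS):
        raise ValueError("expected one probability per emotion")
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return EMOTIONS[index], os.path.join(folder, EMOJI_FILES[index])


def missing_emoji_files(folder="./emojis"):
    """List the emoji images that are not present in the folder."""
    return [
        name for name in EMOJI_FILES.values()
        if not os.path.isfile(os.path.join(folder, name))
    ]


# Example with a made-up softmax output where class 3 (Happy) is the most likely.
label, path = emoji_for([0.05, 0.05, 0.10, 0.60, 0.10, 0.05, 0.05])
print(label, path)
print("Missing files:", missing_emoji_files())
```

Running missing_emoji_files before starting the GUI gives an early warning if one of the seven images has not been saved yet.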
import tkinter as tk
from tkinter import *
import os

import cv2
import numpy as np
from PIL import Image, ImageTk
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D
from keras.optimizers import Adam
from keras.layers import MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator

emotion_model = Sequential()

# Same architecture as in train.py, so that the saved weights can be loaded.
emotion_model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                         input_shape=(48, 48, 1)))
emotion_model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
emotion_model.add(MaxPooling2D(pool_size=(2, 2)))
emotion_model.add(Dropout(0.25))

emotion_model.add(Flatten())
emotion_model.add(Dense(1024, activation='relu'))
emotion_model.add(Dropout(0.5))
emotion_model.add(Dense(7, activation='softmax'))
emotion_model.load_weights('model.h5')

cv2.ocl.setUseOpenCL(False)

emotion_dict = {0: " Angry ", 1: "Disgusted", 2: " Fearful ", 3: " Happy ", 4: " Neutral ", 5: " Sad ", 6: "Surprised"}

# Each class index maps to one emoji image (index 1 is "Disgusted").
emoji_dist = {0: "./emojis/angry.png", 1: "./emojis/disgusted.png", 2: "./emojis/fearful.png",
              3: "./emojis/happy.png", 4: "./emojis/neutral.png", 5: "./emojis/sad.png",
              6: "./emojis/surpriced.png"}

last_frame1 = np.zeros((480, 640, 3), dtype=np.uint8)
show_text = [0]

# Open the webcam and load the face detector once, rather than on every frame.
# The cascade path is from the original setup; adjust it for your installation.
cap1 = cv2.VideoCapture(0)
bounding_box = cv2.CascadeClassifier('/home/shivam/.local/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml')

def show_vid():
    global last_frame1
    if not cap1.isOpened():
        print("cant open the camera1")
    flag1, frame1 = cap1.read()
    if not flag1:
        # No frame was captured, so there is nothing to process or display.
        print("Major error!")
        return
    frame1 = cv2.resize(frame1, (600, 500))
    gray_frame = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    num_faces = bounding_box.detectMultiScale(gray_frame, scaleFactor=1.3, minNeighbors=5)

    for (x, y, w, h) in num_faces:
        cv2.rectangle(frame1, (x, y-50), (x+w, y+h+10), (255, 0, 0), 2)
        roi_gray_frame = gray_frame[y:y + h, x:x + w]
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray_frame, (48, 48)), -1), 0)
        prediction = emotion_model.predict(cropped_img)
        maxindex = int(np.argmax(prediction))
        cv2.putText(frame1, emotion_dict[maxindex], (x+20, y-60),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
        show_text[0] = maxindex

    # Convert the annotated frame to a Tk image and schedule the next update.
    last_frame1 = frame1.copy()
    pic = cv2.cvtColor(last_frame1, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(pic)
    imgtk = ImageTk.PhotoImage(image=img)
    lmain.imgtk = imgtk
    lmain.configure(image=imgtk)
    lmain.after(10, show_vid)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        exit()

def show_vid2():
    frame2 = cv2.imread(emoji_dist[show_text[0]])
    pic2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2RGB)
    img2 = Image.fromarray(pic2)  # use the RGB-converted image so colours display correctly
    imgtk2 = ImageTk.PhotoImage(image=img2)
    lmain2.imgtk2 = imgtk2
    lmain3.configure(text=emotion_dict[show_text[0]], font=('arial', 45, 'bold'))
    lmain2.configure(image=imgtk2)
    lmain2.after(10, show_vid2)

if __name__ == '__main__':
    root = tk.Tk()
    img = ImageTk.PhotoImage(Image.open("logo.png"))
    heading = Label(root, image=img, bg='black')
    heading.pack()
    heading2 = Label(root, text="Photo to Emoji", pady=20,
                     font=('arial', 45, 'bold'), bg='black', fg='#CDCDCD')
    heading2.pack()

    lmain = tk.Label(master=root, padx=50, bd=10)
    lmain2 = tk.Label(master=root, bd=10)
    lmain3 = tk.Label(master=root, bd=10, fg="#CDCDCD", bg='black')
    lmain.pack(side=LEFT)
    lmain.place(x=50, y=250)
    lmain3.pack()
    lmain3.place(x=960, y=250)
    lmain2.pack(side=RIGHT)
    lmain2.place(x=900, y=350)

    root.title("Photo To Emoji")
    root.geometry("1400x900+100+10")
    root['bg'] = 'black'
    exitbutton = Button(root, text='Quit', fg="red", command=root.destroy,
                        font=('arial', 25, 'bold')).pack(side=BOTTOM)
    show_vid()
    show_vid2()
    root.mainloop()

Chapter-9

Testing

Chapter-10

CONCLUSION

In this deep learning project for beginners, we built a convolutional neural network to recognize
facial emotions. We trained our model on the FER2013 dataset and then mapped the predicted
emotions to the corresponding emojis or avatars.

Using OpenCV's Haar cascade XML file, we obtain the bounding boxes of the faces in the webcam
feed. We then feed these boxes to the trained model for classification.

Chapter-11

REFERENCES

Workshop / Production technology by WEBTEK LABS PVT. LTD, JAIPUR, RAJASTHAN
Study material provided by technical training center
Study material provided by Puja Batiya
https://data-flair.training/blogs/create-emoji-with-deep-learning/
https://webteklabs.com/
https://en.wikipedia.org/wiki/MachineLearning
