
A Report of

Industrial Training

at

NULL CLASS EDTECH PVT. LTD.

On

MACHINE LEARNING

Submitted in partial fulfilment of the requirements

for the degree of

BACHELOR OF TECHNOLOGY

in
COMPUTER SCIENCE AND ENGINEERING

Submitted by
Mridul Shinghal (1900140100068)

Department of Computer Science And Engineering

Shri Ram Murti Smarak College of Engineering & Technology, Bareilly

Dr. A.P.J. Abdul Kalam Technical University, Lucknow

July, 2022
CERTIFICATE

ACKNOWLEDGEMENT

I am grateful to NULL CLASS for giving me the opportunity to carry out project work in the area of
Machine Learning during my training and internship. I would also like to thank my institute, Shri Ram
Murti Smarak College of Engineering & Technology, Bareilly, for granting permission and providing the
necessary administrative support to take up the training and internship work at NULL CLASS.
My deepest thanks go to our trainer for his guidance, monitoring, constant encouragement, and for
correcting our various assignments with attention and care. He took pains to go through the project and
training sessions and make the necessary corrections whenever needed, and I am very grateful for that.
I see this opportunity as a big milestone in my career development. I will strive to use the skills and
knowledge I have gained in the best possible way, and I will continue to work on improving them in
order to attain my desired career objective.

MRIDUL SHINGHAL
1900140100068

TABLE OF CONTENTS

CERTIFICATE

ACKNOWLEDGEMENT

LIST OF FIGURES

CHAPTER 1: COMPANY PROFILE

CHAPTER 2: INTRODUCTION TO ML

CHAPTER 3: HISTORY OF ML

CHAPTER 4: TYPES OF ML

CHAPTER 5: LITERATURE SURVEY

CHAPTER 6: ML ALGORITHMS

CHAPTER 7: APPLICATIONS

CHAPTER 8: DATA PREPROCESSING, ANALYSIS AND VISUALIZATION

CHAPTER 9: ADVANTAGES & DISADVANTAGES OF ML

CHAPTER 10: PROJECTS UNDERTAKEN

CHAPTER 11: CONCLUSIONS

REFERENCES
LIST OF FIGURES

Fig 1. Introduction
Fig 2. Types of ML
Fig 3. ML Algorithms
Fig 4. Linear Regression
Fig 5. Logistic Regression
Fig 6. Decision Tree
Fig 7. SVM
Fig 8. KNN
Fig 9. Random Forest
Fig 10. Applications of ML
Fig 11. Sample 1
Fig 12. Sample 2
Fig 13. Sample 3
CHAPTER 1

COMPANY’S PROFILE

About the Company


NULL CLASS EDTECH PVT. LTD. is a leader among specialized training brands in India. It is the largest
internship/training service provider in various engineering domains, serving engineering students as well
as working professionals, and it has extensive experience of nurturing over 200,000 students in the past
few years. NULL CLASS is a trustworthy brand in the education and training/internship industry.
Nullclass Technology Private Limited is a private company incorporated on 14 Mar 2018 (about 4 years
and 5 months old at the time of writing). Its registered office is in Dharmapuri, Tamil Nadu, India.
The company's status is Active, and it has filed its Annual Returns and Financial Statements up to 31
Mar 2019 (FY 2018-2019). It is a company limited by shares with an authorized capital of Rs 1.00 lakh
and a paid-up capital of Rs 1.00 lakh, as per MCA records.

WHY NULL CLASS?

Build real-time projects: You get an opportunity to apply your knowledge in a real-world
environment and gain experience by building a real-time project from scratch.

Paid internship opportunity: There's no substitute for hands-on experience. You may get an
opportunity to be paid to solve a real-world problem.

Experience certificate: You get an opportunity to receive an experience certificate that records
your experience of building the entire project.

24×7 Mentor support: A mentor can pass on years of experience. You get 24×7 mentor support
over WhatsApp and other platforms.
CHAPTER 2
INTRODUCTION TO ML

Fig 1. Introduction

Machine learning (ML) is a type of artificial intelligence (AI) that allows software applications to become
more accurate at predicting outcomes without being explicitly programmed to do so. Machine learning
algorithms use historical data as input to predict new output values.

Recommendation engines are a common use case for machine learning. Other popular uses include fraud
detection, spam filtering, malware threat detection, business process automation (BPA) and predictive
maintenance.

Machine Learning is a concept that allows a machine to learn from examples and experience without
being explicitly programmed. Instead of writing the code yourself, you feed data to a generic algorithm,
and the algorithm/machine builds the logic based on the given data.

Why is machine learning important?

Machine learning is important because it gives enterprises a view of trends in customer behavior and
business operational patterns, as well as supports the development of new products. Many of today's
leading companies, such as Facebook, Google and Uber, make machine learning a central part of their
operations.
Machine learning has become a significant competitive differentiator for many companies.

CHAPTER 3
HISTORY OF ML
The name machine learning was coined in 1959 by Arthur Samuel.

Tom M. Mitchell provided a widely quoted, more formal definition of the algorithms studied in the
machine learning field: "A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves
with experience E."

This follows Alan Turing's proposal in his paper "Computing Machinery and Intelligence", in which the
question "Can machines think?" is replaced with the question "Can machines do what we (as thinking
entities) can do?".

In Turing's proposal, the characteristics that could be possessed by a thinking machine, and the various
implications of constructing one, are laid out.

Today, machine learning algorithms enable computers to communicate with humans, autonomously drive
cars, write and publish sports match reports, and find terrorist suspects. I firmly believe machine learning
will significantly impact most industries and the jobs within them, which is why every manager should have
at least some grasp of what machine learning is and how it is evolving. This chapter offers a quick trip
through time to examine the origins of machine learning as well as its most recent milestones.

CHAPTER 4
TYPES OF ML

The types of machine learning algorithms differ in their approach, the type of data they input and output,
and the type of task or problem that they are intended to solve. Broadly, machine learning can be
categorized into four types.

I. Supervised Learning

II. Unsupervised Learning

III. Reinforcement Learning

IV. Semi-supervised Learning

Machine learning enables the analysis of massive quantities of data. While it generally delivers faster, more
accurate results that help identify profitable opportunities or dangerous risks, it may also require
additional time and resources to train properly.

Fig 2. Types of ML

Supervised Learning
Supervised Learning is a type of learning in which we are given a data set and already know what the
correct output should look like, with the idea that there is a relationship between the input and the output.
Basically, it is the task of learning a function that maps an input to an output based on example
input/output pairs. It infers a function from labeled training data consisting of a set of training examples.
Supervised learning problems are categorized into regression and classification problems.

Unsupervised Learning
Unsupervised Learning is a type of learning that allows us to approach problems with little or no idea of
what our results should look like. We can derive structure by clustering the data based on relationships
among the variables in the data. With unsupervised learning, there is no feedback based on the
prediction results. Basically, it is a type of self-organized learning that helps in finding previously
unknown patterns in a data set without pre-existing labels.

Reinforcement Learning
Reinforcement learning is a learning method that interacts with its environment by producing actions
and discovers errors or rewards. Trial and error search and delayed reward are the most relevant
characteristics of reinforcement learning. This method allows machines and software agents to
automatically determine the ideal behavior within a specific context in order to maximize its
performance. Simple reward feedback is required for the agent to learn which action is best.

Semi-Supervised Learning
Semi-supervised learning falls somewhere between supervised and unsupervised learning, since it
uses both labeled and unlabeled data for training – typically a small amount of labeled data and a large
amount of unlabeled data. Systems that use this method are able to considerably improve learning
accuracy.

Usually, semi-supervised learning is chosen when acquiring labeled data requires skilled and relevant
resources to train on or learn from, whereas acquiring unlabeled data generally doesn't require
additional resources.

CHAPTER 5

LITERATURE SURVEY
A core objective of a learner is to generalize from its experience. The computational analysis of machine
learning algorithms and their performance is a branch of theoretical computer science known as
computational learning theory. Because training sets are finite and the future is uncertain, learning theory
usually does not yield guarantees of the performance of algorithms. Instead, probabilistic bounds on the
performance are quite common. The bias–variance decomposition is one way to quantify generalization
error.
For the best performance in the context of generalization, the complexity of the hypothesis should match
the complexity of the function underlying the data. If the hypothesis is less complex than the function,
then the model has underfit the data. If the complexity of the model is increased in response, then the
training error decreases. But if the hypothesis is too complex, then the model is subject to overfitting and
generalization will be poorer.
In addition to performance bounds, learning theorists study the time complexity and feasibility of
learning. In computational learning theory, a computation is considered feasible if it can be done in
polynomial time. There are two kinds of time complexity results. Positive results show that a certain
class of functions can be learned in polynomial time. Negative results show that certain classes cannot
be learned in polynomial time.

The Challenges Facing Machine Learning

While there has been much progress in machine learning, there are also challenges. For example, the
mainstream machine learning technologies are black-box approaches, making us concerned about their
potential risks. To tackle this challenge, we may want to make machine learning more explainable and
controllable. As another example, the computational complexity of machine learning algorithms is
usually very high and we may want to invent lightweight algorithms or implementations. Machine
learning takes much more time. You have to gather and prepare data, then train the algorithm. There are
much more uncertainties. That is why, while in traditional website or application development an
experienced team can estimate the time quite precisely, a machine learning project used for example to
provide product recommendations can take much less or much more time than expected.
CHAPTER 6

MACHINE LEARNING ALGORITHMS

Fig 3. ML Algorithms

1. Linear Regression -

Linear regression is one of the supervised Machine Learning algorithms in Python that observes
continuous features and predicts an outcome. Depending on whether it runs on a single variable or on
many features, we call it simple linear regression or multiple linear regression.
This is one of the most popular Python ML algorithms and is often under-appreciated. It assigns optimal
weights to the variables to create a line ax + b that predicts the output. We often use linear regression to
estimate real values, such as the number of calls or the cost of houses, based on continuous variables. The
regression line is the best-fitting line Y = a*X + b that denotes the relationship between the independent and
dependent variables.

Fig 4. Linear Regression
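As a quick illustration of the Y = a*X + b idea above, the short sketch below (not part of the internship code; the data and variable names are made up for illustration) fits a simple linear regression with scikit-learn on synthetic data:

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: one continuous feature and a noisy linear target y = 3x + 2
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1, 100)

reg = LinearRegression()
reg.fit(X, y)
print("slope a:", reg.coef_[0], "intercept b:", reg.intercept_)
print("prediction at x = 5:", reg.predict([[5.0]])[0])

The learned coefficients should land close to the true slope and intercept used to generate the data.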

2. Logistic Regression -

Logistic regression is a supervised classification algorithm and one of the most widely used Machine
Learning algorithms in Python. It is used to estimate discrete values like 0/1, yes/no, and true/false
from a given set of independent variables.
We use a logistic function to predict the probability of an event, which gives us an output between 0
and 1.
Although its name says regression, it is actually a classification algorithm.
Logistic regression fits data to a logit function and is therefore also called logit regression.

Fig 5. Logistic Regression
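A minimal sketch (illustrative only, not taken from the report) of logistic regression for a binary 0/1 outcome with scikit-learn; the synthetic data set is an assumption made for the example:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem
X, y = make_classification(n_samples=500, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

clf = LogisticRegression()
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))            # fraction of correct 0/1 predictions
print("probabilities:", clf.predict_proba(X_test[:3]))   # logistic outputs between 0 and 1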

3. Decision Tree -

A decision tree falls under supervised Machine Learning algorithms in Python and is used for both
classification and regression, although mostly for classification. The model takes an instance, traverses
the tree, and compares important features against a determined conditional statement; whether it descends
to the left child branch or the right depends on the result. Usually, the more important features are closer
to the root.
Decision Tree, as a Machine Learning algorithm in Python, can work on both categorical and continuous
dependent variables. Here, we split a population into two or more homogeneous sets. Tree models where
the target variable can take a discrete set of values are called classification trees; in these tree structures,
leaves represent class labels and branches represent conjunctions of features that lead to those class
labels. Decision trees where the target variable can take continuous values are called regression trees.

Fig 6. Decision Tree
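The following sketch (an illustration, not from the internship project) trains a depth-limited classification tree on the standard Iris data set with scikit-learn:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)   # shallow tree, easier to inspect
tree.fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))

Limiting max_depth is one simple way to keep the tree from overfitting, in line with the generalization discussion in Chapter 5.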

4. Support Vector Machine (SVM)-


SVM is a supervised classification algorithm and one of the most important Machine Learning algorithms
in Python. It plots a line that divides the different categories of your data. In this ML algorithm, we
calculate the vector that optimizes the line, ensuring that the closest point in each group lies as far as
possible from the other group. While you will almost always find this to be a linear separator, it can be
other than that. An SVM model is a representation of the examples as points in space, mapped so that the
examples of the separate categories are divided by a clear gap that is as wide as possible. In addition to
performing linear classification, SVMs can efficiently perform non-linear classification using what is
called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. When data are
unlabeled, supervised learning is not possible, and an unsupervised learning approach is required, which
attempts to find a natural clustering of the data into groups and then maps new data to these formed groups.

Fig 7. SVM
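A hedged sketch (not from the report) of an SVM with an RBF kernel, showing the kernel trick mentioned above on a data set that is not linearly separable:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: a non-linear decision boundary is needed
X, y = make_moons(n_samples=400, noise=0.2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

svm = SVC(kernel='rbf', C=1.0, gamma='scale')   # kernel trick: implicit high-dimensional mapping
svm.fit(X_train, y_train)
print("test accuracy:", svm.score(X_test, y_test))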

5. Naive Bayes Algorithm -


Naive Bayes is a classification method based on Bayes' theorem that assumes independence between
predictors: a Naive Bayes classifier assumes that a feature in a class is unrelated to any other. Consider a
fruit: it is an apple if it is round, red, and 2.5 inches in diameter. A Naive Bayes classifier will say these
characteristics independently contribute to the probability of the fruit being an apple, even if the features
depend on each other. For very large data sets, it is easy to build a Naive Bayes model. Not only is this
model very simple, it often performs better than many highly sophisticated classification methods. Naïve
Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables
(features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a
closed-form expression, which takes linear time, rather than by the expensive iterative approximation used
for many other types of classifiers.
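A small illustrative sketch (not part of the internship code) of a Gaussian Naive Bayes classifier on the Iris data set:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=7)

nb = GaussianNB()   # treats features as conditionally independent given the class
nb.fit(X_train, y_train)
print("test accuracy:", nb.score(X_test, y_test))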

6. KNN Algorithm -
This is a Python Machine Learning algorithm for classification and regression, though mostly for
classification. It is a supervised learning algorithm that compares new points to stored examples, usually
with a Euclidean distance function. It classifies new cases using a majority vote of their k nearest
neighbours: the class assigned to a case is the one most common among its k nearest neighbours, as
measured by the distance function. k-NN is a type of instance-based learning, or lazy learning, where the
function is only approximated locally and all computation is deferred until classification. k-NN is a special
case of a variable-bandwidth, kernel density "balloon" estimator with a uniform kernel.

Fig 8. KNN
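A minimal sketch (illustrative, not from the report) of k-nearest neighbours with k = 5 and Euclidean distance, using scikit-learn:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=3)

knn = KNeighborsClassifier(n_neighbors=5, metric='euclidean')   # majority vote of 5 neighbours
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))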

7. K-Means Algorithm -
k-Means is an unsupervised algorithm that solves the problem of clustering. It classifies data using a
number of clusters: the data points inside a cluster are homogeneous, and heterogeneous with respect to
other clusters. k-means clustering is a method of vector quantization, originally from signal processing,
that is popular for cluster analysis in data mining. It aims to partition n observations into k clusters in
which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the
cluster. k-means clustering is rather easy to apply even to large data sets, particularly when using heuristics
such as Lloyd's algorithm. It is often used as a preprocessing step for other algorithms, for example to find
a starting configuration. The problem is computationally difficult (NP-hard). k-means originates from
signal processing and still finds use in that domain. In cluster analysis, the k-means algorithm can be
used to partition the input data set into k partitions (clusters). k-means clustering has also been used as a
feature learning (or dictionary learning) step, in either (semi-)supervised learning or unsupervised learning.
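The sketch below (an illustration with made-up synthetic data, not from the report) runs k-means with k = 3 on synthetic blobs using scikit-learn, which by default uses Lloyd's algorithm:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)   # unlabeled data with 3 groups

km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)             # cluster index assigned to every observation
print("cluster centres:\n", km.cluster_centers_)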

8. Random Forest -
A random forest is an ensemble of decision trees. In order to classify every new object based on its
attributes, the trees vote for a class: each tree provides a classification, and the classification with the most
votes wins in the forest. Random forests, or random decision forests, are an ensemble learning method for
classification, regression and other tasks that operates by constructing a multitude of decision trees at
training time and outputting the class that is the mode of the classes (classification) or the mean prediction
(regression) of the individual trees.

Fig 9. Random Forest
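An illustrative sketch (not from the internship code) of a random forest of 100 trees whose majority vote gives the predicted class, using scikit-learn:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)

forest = RandomForestClassifier(n_estimators=100, random_state=5)   # ensemble of 100 decision trees
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))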


CHAPTER 7

APPLICATIONS OF MACHINE LEARNING

Fig 10. Applications of ML

Applications of Machine Learning

Machine learning is one of the most exciting technologies one will ever come across. As is evident from
the name, it gives the computer that which makes it more similar to humans: the ability to learn. Machine
learning is actively being used today, perhaps in many more places than one would expect. We probably
use a learning algorithm dozens of times a day without even knowing it. Applications of Machine
Learning include:

1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to identify
objects, persons, places, digital images, etc. A popular use case of image recognition and face detection
is automatic friend tagging suggestions.

Facebook provides a feature of automatic friend tagging suggestions. Whenever we upload a photo with
our Facebook friends, we automatically get a tagging suggestion with names, and the technology behind
this is machine learning's face detection and recognition algorithms.

It is based on the Facebook project named DeepFace, which is responsible for face recognition and
person identification in pictures.

2. Speech Recognition:
While using Google, we get an option to "Search by voice"; this comes under speech recognition and is a
popular application of machine learning.

Speech recognition is the process of converting voice instructions into text, and it is also known as
speech-to-text or computer speech recognition. At present, machine learning algorithms are widely used
in various speech recognition applications. Google Assistant, Siri, Cortana, and Alexa use speech
recognition technology to follow voice instructions.

3. Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with
the shortest route and predicts the traffic conditions.

Everyone who uses Google Maps is helping to make the app better. It takes information from users and
sends it back to its database to improve performance.

4. Product recommendations:
Machine learning is widely used by various e-commerce and entertainment companies such as Amazon,
Netflix, etc., for product recommendations to the user. Whenever we search for a product on
Amazon, we start getting advertisements for the same product while surfing the internet in the same
browser, and this is because of machine learning.

Google understands user interest using various machine learning algorithms and suggests products
according to customer interest.

Similarly, when we use Netflix, we see recommendations for series, movies, etc., and this is also done
with the help of machine learning.

5. Self-driving cars:

One of the most exciting applications of machine learning is self-driving cars, where machine learning
plays a significant role. Tesla, one of the most popular car manufacturers, is working on self-driving cars.
It uses unsupervised learning methods to train the car models to detect people and objects while driving.

6. Email Spam and Malware Filtering:


Whenever we receive a new email, it is filtered automatically as important, normal, or spam. We always
receive important mail in our inbox marked with the important symbol and spam emails in our spam box,
and the technology behind this is machine learning. Below are some spam filters used by Gmail:

● Content Filter

● Header filter

● General blacklists filter

● Rules-based filters

● Permission filters

Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes
classifier are used for email spam filtering and malware detection.

7. Virtual Personal Assistant:


We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name
suggests, they help us find information using our voice instructions. These assistants can help us in various
ways just by our voice instructions, such as playing music, calling someone, opening an email, scheduling
an appointment, etc.

Machine learning algorithms are an important part of these virtual assistants. They record our voice
instructions, send them to a server in the cloud, decode them using ML algorithms, and act accordingly.

8. Online Fraud Detection:


Machine learning is making our online transactions safe and secure by detecting fraudulent transactions.
Whenever we perform an online transaction, there are various ways a fraudulent transaction can take
place, such as fake accounts, fake IDs, and money being stolen in the middle of a transaction. To detect
this, a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent.

For each genuine transaction, the output is converted into hash values, and these values become the input
for the next round. Genuine transactions follow a specific pattern which changes for fraudulent ones;
hence the system detects the fraud and makes our online transactions more secure.

9. Stock Market trading:


Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups
and downs in share prices, so machine learning's long short-term memory (LSTM) neural network is used
for the prediction of stock market trends.

10. Medical Diagnosis:

In medical science, machine learning is used for disease diagnosis. With it, medical technology is
growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.

It helps in finding brain tumors and other brain-related diseases easily.

11. Automatic Language Translation:

Nowadays, if we visit a new place and do not know the language, it is not a problem at all; here too,
machine learning helps us by converting the text into a language we know. Google's GNMT (Google
Neural Machine Translation) provides this feature: it is a neural machine translation system that translates
text into our familiar language, and this is called automatic translation.

The technology behind automatic translation is a sequence-to-sequence learning algorithm, which is used
together with image recognition to translate text from one language to another.

Future Scope

The future of Machine Learning is as vast as the limits of the human mind. We can always keep learning
and teaching computers how to learn, while wondering how some of the most complex machine learning
algorithms have been running in the back of our own minds so effortlessly all the time.
There is a bright future for machine learning. Companies like Google, Quora, and Facebook hire people
with machine learning skills, and there is intense research in machine learning at the top universities in the
world. The global machine-learning-as-a-service market is rising rapidly, mainly due to the Internet
revolution. The process of connecting the world virtually has generated a vast amount of data, which is
boosting the adoption of machine learning solutions. Considering all these applications and the dramatic
improvements that ML has brought us, it doesn't take a genius to realize that in the coming future we will
definitely see more advanced applications of ML, applications that will stretch its capabilities to an
unimaginable level.

CHAPTER 8
DATA PREPROCESSING, ANALYSIS AND VISUALIZATION

Machine Learning algorithms don't work well on raw data. Before we can feed such data to an ML
algorithm, we must preprocess it by applying some transformations to it. With data preprocessing,
we convert raw data into a clean data set. To perform this, there are 7 common techniques (a short
scikit-learn sketch illustrating a few of them follows the list) -

1. Rescaling Data -
For data with attributes of varying scales, we can rescale the attributes so that they share the same scale.
Rescaling the attributes into the range 0 to 1 is called normalization; for this we use the MinMaxScaler
class from scikit-learn, which gives us values between 0 and 1.

2. Standardizing Data -
With standardizing, we can take attributes with a Gaussian distribution and different means and
standard deviations and transform them into a standard Gaussian distribution with a mean of 0 and a
standard deviation of 1.

3. Normalizing Data -
In this task, we rescale each observation to a length of 1 (a unit norm). For this, we use the Normalizer
class.

4. Binarizing Data -
Using a binary threshold, it is possible to transform our data by marking the values above it 1 and
those equal to or below it, 0. For this purpose, we use the Binarizer class.

5. Mean Removal -
We can remove the mean from each feature to center it on zero.

6. One Hot Encoding -


When dealing with a few scattered numerical values, we may not need to store them as such; instead, we
can perform One Hot Encoding. For k distinct values, we transform the feature into a k-dimensional
vector with a single value of 1 and 0 for the remaining values.

7. Label Encoding -
Some labels can be words or numbers. Usually, training data is labeled with words to make it
readable.
Label encoding converts word labels into numbers to let algorithms work on them.
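The short sketch below (illustrative only; the toy array and label values are assumptions, not data from the report) applies several of the preprocessing techniques above with scikit-learn:

import numpy as np
from sklearn.preprocessing import (MinMaxScaler, StandardScaler, Normalizer,
                                   Binarizer, LabelEncoder)

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 600.0]])

print(MinMaxScaler().fit_transform(X))               # 1. rescaling each column to the range 0-1
print(StandardScaler().fit_transform(X))             # 2. standardizing to mean 0, std 1
print(Normalizer().fit_transform(X))                 # 3. normalizing each row to unit norm
print(Binarizer(threshold=250.0).fit_transform(X))   # 4. binarizing around a threshold

labels = ['male', 'female', 'female']
print(LabelEncoder().fit_transform(labels))          # 7. label encoding: words become integers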

CHAPTER 9

Advantages of Machine Learning

1. Easily identifies trends and patterns -


Machine Learning can review large volumes of data and discover specific trends and patterns that would
not be apparent to humans. For instance, for an e-commerce website like Amazon, it helps to understand
the browsing behaviour and purchase histories of its users, so the site can cater to them with the right
products, deals, and reminders, and it uses the results to show them relevant advertisements.

2. No human intervention needed (automation) -


With ML, we don't need to babysit our project every step of the way. Since it means giving machines
the ability to learn, it lets them make predictions and also improve the algorithms on their own. A
common example of this is anti-virus software; it learns to filter new threats as they are recognized.
ML is also good at recognizing spam.

3. Continuous Improvement -
As ML algorithms gain experience, they keep improving in accuracy and efficiency. This lets them make
better decisions. Say we need to make a weather forecast model. As the amount of data we have keeps
growing, our algorithms learn to make more accurate predictions faster.

4. Handling multi-dimensional and multi-variety data -


Machine Learning algorithms are good at handling data that are multi-dimensional and multi-variety,
and they can do this in dynamic or uncertain environments.

5. Wide Applications -
We could be an e-seller or a healthcare provider and make ML work for us. Where it does apply, it holds
the capability to help deliver a much more personal experience to customers while also targeting the right
customers.

Disadvantages of Machine Learning

1. Data Acquisition -
Machine Learning requires massive data sets to train on, and these should be inclusive/unbiased and of
good quality. There can also be times when we must wait for new data to be generated.

2. Time and Resources -


ML needs enough time to let the algorithms learn and develop enough to fulfil their purpose with a
considerable amount of accuracy and relevance. It also needs massive resources to function, which can
mean additional requirements for computing power.

3. Interpretation of Results -
Another major challenge is the ability to accurately interpret results generated by the algorithms. We
must also carefully choose the algorithms for our purpose.

4. High error-susceptibility -
Machine Learning is autonomous but highly susceptible to errors. Suppose you train an algorithm with
data sets small enough to not be inclusive. You end up with biased predictions coming from a biased
training set.

This leads to irrelevant advertisements being displayed to customers. In the case of ML, such blunders
can set off a chain of errors that can go undetected for long periods of time. And when they do get noticed,
it takes quite some time to recognize the source of the issue, and even longer to correct it.
CHAPTER 10
PROJECT UNDERTAKEN

Project Name: - AGE AND GENDER DETECTOR

OVERVIEW:

Age and gender play a major role in a person's identification. Automatic age and gender
classification has become relevant to an increasing number of applications, particularly
since the rise of social platforms and social media. Hiding the true values of these variables
can mainly cause security issues. When it comes to image processing, an image or a
video frame is taken as the input and, after processing, the expected predictions are output.
Various algorithms and techniques have been used over the years as the processing mechanism.

DATASET:

➢ https://www.kaggle.com/datasets/jangedoo/utkface-new

CODE:

# Imports for data loading, the CNN model, and plotting
import numpy as np
import tensorflow as tf
import cv2
import sys
import os
import argparse
import time
from mtcnn.mtcnn import MTCNN
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Dense, MaxPool2D, Conv2D
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Activation, Add
from tensorflow.keras.regularizers import L2
from tensorflow.keras.optimizers import Adam, Adagrad, Adadelta, Adamax, RMSprop

fldr="C:/Users/lenovo/.jupyter/lab/workspaces/UTKFace" import os
flies=os.listdir(fldr) ages=[] genders=[] images=[]

for fle in flies:

age=int(fle.split('_')[0]
)
gender=int(fle.split('_')
[1]) total=fldr+'/'+fle
print(total)
image=cv2.imread(tota
l)
image=cv2.cvtColor(image,cv2.COLOR_BGR2R
GB) image=cv2.resize(image,(48,48))
images.append(image)

for fle in flies:


age=int(fle.split('_')[0])
gender=int(fle.split('_')[1])
ages.append(age)
genders.append(gender)
plt.imshow(images[45])

25
plt.imshow(images[87])

print(ages[87]) #10
print(genders[87]) #0
images_f=np.array(images)
ages_f=np.array(ages)
genders_f=np.array(genders)
np.save(fldr+'image.npy',images_f)
np.save(fldr+'age.npy',ages_f)
np.save(fldr+'gender.npy',genders_f
) #findings the number of elements
in the dataset:
values, counts=np.unique(genders_f,return_counts=True) print(counts)
#[12391 11317]

# Plotting the number of male and female samples:
fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
gender = ['male', 'female']
values = [12391, 11317]
ax.bar(gender, values)
plt.show()

# Plotting the age distribution:
values, counts = np.unique(ages_f, return_counts=True)
print(counts)
val = values.tolist()
cnt = counts.tolist()
plt.plot(counts)
plt.xlabel('ages')
plt.ylabel('distribution')
plt.show()

# Build a combined [age, gender] label for every image
labels = []
i = 0
while i < len(ages):
    label = []
    label.append(ages[i])
    label.append(genders[i])
    labels.append(label)
    i = i + 1

images_f_2 = images_f / 255          # scale pixel values to the range 0-1
images_f_2.shape
labels_f = np.array(labels)

from tensorflow.keras.regularizers import l2
from sklearn.model_selection import train_test_split

# Split into train/test sets and separate the two outputs (gender, age)
X_train, X_test, Y_train, Y_test = train_test_split(images_f_2, labels_f, test_size=0.25)
Y_train[0:5]
Y_train_2 = [Y_train[:, 1], Y_train[:, 0]]   # [gender labels, age labels]
Y_test_2 = [Y_test[:, 1], Y_test[:, 0]]
Y_train_2[0][0:5]
Y_train_2[1][0:5]

# Defining the model:


from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import MaxPooling2D

def Convolution(input_tensor, filters):
    # Conv block: 3x3 convolution with L2 regularisation, dropout and ReLU
    x = Conv2D(filters=filters, kernel_size=(3, 3), padding="same", strides=(1, 1),
               kernel_regularizer=l2(0.001))(input_tensor)
    x = Dropout(0.1)(x)
    x = Activation('relu')(x)
    return x

def model(input_shape):
    # Shared convolutional trunk followed by two heads: gender (sigmoid) and age (relu)
    inputs = Input((input_shape))
    conv_1 = Convolution(inputs, 32)
    maxp_1 = MaxPooling2D(pool_size=(2, 2))(conv_1)
    conv_2 = Convolution(maxp_1, 64)
    maxp_2 = MaxPooling2D(pool_size=(2, 2))(conv_2)
    conv_3 = Convolution(maxp_2, 128)
    maxp_3 = MaxPooling2D(pool_size=(2, 2))(conv_3)
    conv_4 = Convolution(maxp_3, 256)
    maxp_4 = MaxPooling2D(pool_size=(2, 2))(conv_4)
    flatten = Flatten()(maxp_4)
    dense_1 = Dense(64, activation='relu')(flatten)
    dense_2 = Dense(64, activation='relu')(flatten)
    drop_1 = Dropout(0.2)(dense_1)
    drop_2 = Dropout(0.2)(dense_2)
    output_1 = Dense(1, activation='sigmoid', name='sex_out')(drop_1)
    output_2 = Dense(1, activation='relu', name='age_out')(drop_2)
    model = Model(inputs=[inputs], outputs=[output_1, output_2])
    # Binary cross-entropy for gender, mean absolute error for age
    model.compile(loss=["binary_crossentropy", "mae"], optimizer="Adam",
                  metrics=["accuracy"])
    return model

Model = model((48, 48, 3))
Model.summary()
from tensorflow.keras.callbacks import ModelCheckpoint
import tensorflow as tf

# Save the best model and stop early if the validation loss stops improving
fle_s = 'Age_Sex_Detection.h5'
checkpoint = ModelCheckpoint(fle_s, monitor='val_loss', verbose=1, save_best_only=True,
                             save_weights_only=False, mode='auto', save_freq='epoch')
Early_stop = tf.keras.callbacks.EarlyStopping(patience=75, monitor='val_loss',
                                              restore_best_weights=True)
callback_list = [checkpoint, Early_stop]

History = Model.fit(X_train, Y_train_2, batch_size=64,
                    validation_data=(X_test, Y_test_2),
                    epochs=250, callbacks=callback_list)
Model.evaluate(X_test, Y_test_2)
pred = Model.predict(X_test)
pred[1]

# Training vs validation loss
plt.plot(History.history['loss'])
plt.plot(History.history['val_loss'])
plt.title('Model loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.subplots_adjust(top=1.0, bottom=0.0, right=0.95, left=0, hspace=0.25, wspace=0.35)

# Training vs validation gender accuracy
plt.plot(History.history['sex_out_accuracy'])
plt.plot(History.history['val_sex_out_accuracy'])
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.subplots_adjust(top=1.0, bottom=0.0, right=0.95, left=0, hspace=0.25, wspace=0.35)

# GUI CODE FOR THIS MODEL:

This will help run the model easily in a small app:


# Importing necessary libraries
import tkinter as tk
from tkinter import filedialog
from tkinter import *
from PIL import Image, ImageTk
import numpy
import numpy as np
from tensorflow.keras.models import load_model   # needed for load_model below

# Loading the trained model from disk
model = load_model('Age_Sex_Detection.h5')

# Initializing the GUI window
top = tk.Tk()
top.geometry('800x600')
top.title('Age & Gender Detector')
top.configure(background='#CDCDCD')

# Initialising the result labels (one for age and one for sex) and the image area
label1 = Label(top, background="#CDCDCD", font=('arial', 15, "bold"))
label2 = Label(top, background="#CDCDCD", font=('arial', 15, "bold"))
sign_image = Label(top)

def Detect(file_path):
    # Preprocess the selected image and predict age and gender with the trained model
    global label_packed
    image = Image.open(file_path)
    image = image.resize((48, 48))
    image = numpy.expand_dims(image, axis=0)
    image = np.array(image)
    image = np.delete(image, 0, 1)
    image = np.resize(image, (48, 48, 3))
    print(image.shape)
    sex_f = ['Male', 'Female']
    image = np.array([image]) / 255          # scale pixels to 0-1, as done in training
    pred = model.predict(image)
    age = int(np.round(pred[1][0]))          # second output head: age
    sex = int(np.round(pred[0][0]))          # first output head: gender
    print("Predicted Age is" + str(age))
    print("Predicted Gender is" + sex_f[sex])
    label1.configure(foreground="#011638", text=age)
    label2.configure(foreground="#011638", text=sex_f[sex])

def show_Detect_button(file_path):
    # Show a "Detect Image" button once an image has been uploaded
    Detect_b = Button(top, text="Detect Image", command=lambda: Detect(file_path),
                      padx=10, pady=5)
    Detect_b.configure(background="#364156", foreground='white', font=('arial', 10, 'bold'))
    Detect_b.place(relx=0.79, rely=0.46)

def upload_image():
    # Open a file dialog, show the chosen image, and reveal the Detect button
    try:
        file_path = filedialog.askopenfilename()
        uploaded = Image.open(file_path)
        uploaded.thumbnail(((top.winfo_width() / 2.25), (top.winfo_height() / 2.25)))
        im = ImageTk.PhotoImage(uploaded)
        sign_image.configure(image=im)
        sign_image.image = im
        label1.configure(text='')
        label2.configure(text='')
        show_Detect_button(file_path)
    except:
        pass   # ignore errors (e.g. the file dialog was cancelled)

upload=Button(top,text="Upload an
Image",command=upload_image,padx=10,pady=5)
upload.configure(background="#364156",foreground='white',fo
nt=('arial',10,'bold')) upload.pack(side='bottom',pady=50)
sign_image.pack(side='bottom',expand=True)
label1.pack(side="bottom",expand=True)
label2.pack(side="bottom",expand=True)
heading=Label(top,text="Age & Gender
Detector",pady=20,font=('arial',20,'bold'))
heading.configure(background="#CDCDCD",foreground="#36
4156") heading.pack() top.mainloop()

#OUTPUT:

(48, 48, 3)

1/1 [==============================] - 1s 1s/step


Predicted Age is2
Predicted Gender isFemale
(48, 48, 3)

1/1 [==============================] - 0s 352ms/step


Predicted Age is28
Predicted Gender isFemale
(48, 48, 3)

1/1 [==============================] - 0s 192ms/step


Predicted Age is63
Predicted Gender isFemale
(48, 48, 3)

1/1 [==============================] - 0s 112ms/step


Predicted Age is2
Predicted Gender isFemale
(48, 48, 3)

1/1 [==============================] - 0s 152ms/step


Predicted Age is33
Predicted Gender isFemale
(48, 48, 3)

1/1 [==============================] - 0s 120ms/step


Predicted Age is2
Predicted Gender isMale

SAMPLE INPUT OUTPUT AFTER TESTING THE MODEL:

1) SAMPLE:

Fig 11. Sample 1

2) SAMPLE:

Fig 12. Sample 2

3) SAMPLE:

Fig 13. Sample 3

CHAPTER 11
CONCLUSION
This internship has introduced us to Machine Learning. Now we know that Machine Learning is a
technique for training machines to perform the activities a human brain can do, albeit a bit faster and better
than an average human being. Today we have seen that machines can beat human champions in
games such as Chess and Mahjong, which are considered very complex. We have seen that machines can be
trained to perform human activities in several areas and can aid humans in living better lives. Machine
learning is a quickly growing field in computer science. It has applications in nearly every other field of
study and is already being implemented commercially, because machine learning can solve problems too
difficult or time-consuming for humans to solve. To describe machine learning in general terms, a variety
of models are used to learn patterns in data and make accurate predictions based on the patterns they observe.

Machine Learning can be Supervised or Unsupervised. If we have a smaller amount of clearly
labeled data for training, we opt for Supervised Learning. Unsupervised Learning would generally give
better performance and results for large data sets. If we have a huge data set easily available, we go for
deep learning techniques. We have also learned about Reinforcement Learning and Deep Reinforcement
Learning. We now know what Neural Networks are, along with their applications and limitations.
Specifically, we have developed a thought process for approaching problems that machine learning works
so well at solving. We have learnt how machine learning is different from descriptive statistics.

Finally, when it comes to developing machine learning models of our own, we looked at the
choices of various development languages, IDEs and platforms. The next thing we need to do is
start learning and practicing each machine learning technique. The subject is vast in breadth, but
if we consider the depth, each topic can be learned in a few hours.

REFERENCES

• https://expertsystem.com

• https://www.geeksforgeeks.org

• https://www.wikipedia.org

• https://www.coursera.org/learn/machine-learning

• https://machinelearningmastery.com

• https://www.javatpoint.com/machine-learning

• https://towardsdatascience.com/machine-learning/home

