2018 International Conference on Advances in Computing and Communication Engineering (ICACCE-2018)

Paris, France 22-23 June 2018

Evaluating the Performance of Deep Learning Techniques on Classification Using Tensor Flow Application
Suresh Kallam (a), Syed Muzamil Basha (b), Dharmendra Singh Rajput (b), Rizwan Patan (a), Balamurugan B (a), Sk. Abdul Khalandar Basha (c)

(a) School of Computing Science and Engineering, Galgotias University, Greater Noida, India
(b) VIT University, Vellore, Tamilnadu, India

[email protected], [email protected], {sureshkallam, prizwan5 & kadavulai}@gmail.com, [email protected]

978-1-5386-4485-0/18/$31.00 ©2018 IEEE

Abstract - Artificial intelligence is the overall, bigger domain, in which machines are given the capability to learn from new instances of data and then adapt; Machine Learning is its basic domain. Deep Learning is a subset of Machine Learning that pursues further accuracy by using neural network technology to take in more complex situational data and make more precise decisions. The objective of this research is to determine details such as the ratio of training data, noise, batch size, properties of features, learning rate, type of activation function, level of regularization, and rate of regularization when constructing a neural network on four different classification datasets, by directly manipulating the design provided in the TensorFlow Playground application. The evaluation parameters considered in our experiments are test loss and training loss. The findings of our research specify how many hidden layers, and how many neurons in each hidden layer, are needed for each type of classification problem. These findings help researchers fix the maximum number of neurons and hidden layers needed to solve the four different types of classification problems while achieving a test loss of less than 0.005.

Index Terms - Deep Learning, Artificial intelligence, Neural Network, Test loss.

I. INTRODUCTION

Rapid growth of new industries and technologies came about, and mass production became possible due to electrical power, oil and steel. The driving technologies of that period include the telephone, light bulb, phonograph and internal combustion engine. The third industrial revolution (3IR), from the 1980s until today, has had a worldwide influence. Because of the advancements in digital electronics and mechanical devices, this era is named the Digital Revolution. Its driving technologies include the personal computer (PC), the internet and various information and communication technologies. Some of the current industry-driving technologies include smartphones, wireless communication (including Wi-Fi and Bluetooth), mobile communication such as LTE and 4G, big data, cloud computing, mobile internet, social networking services and artificial intelligence. Then comes the fourth industrial revolution (4IR), as defined in a book from the World Economic Forum (2016), which is focused more on Deep Learning and Machine Learning in all technical areas. Where and how will Deep Learning and Machine Learning be used in future industry? For example, from data collection to big data analysis, we have a large amount of data with diverse data types and a lot of junk data that needs to be filtered out. Considering all these challenges, Deep Learning and Machine Learning can be very effective as a filter and as a controller, to collect information and put it into the big data analysis engine. In addition, Deep Learning and Machine Learning will be used to create real-time estimations based upon the real-time data collected. Then, with real-time Internet of Things (IoT) data, the overall situational awareness, condition and status of the environment are included in the system, in which Deep Learning and Machine Learning are used again to make accurate predictions. Through these predictions, Deep Learning and Machine Learning can be used to create control and management technologies, so that we have a safe and more productive society. Deep Learning and Machine Learning, in various shapes and forms, will be used all the way from data collection through data analysis to control. In future industry evolution, Artificial Intelligence, Machine Learning and especially Deep Learning technologies will assist development and improvement in all of the core areas identified as the future of the fourth industrial revolution. These include nanotechnology, biotechnology, 3D printing, robotics, autonomous vehicles, IoT, 5G and smart devices, and cloud computing techniques.

Now, let us attempt to understand why Deep Learning is important. One reason is that every business is interested in declining computational costs. In cloud computing, a device connected to a cloud can receive data storage, data analysis, application functions and control intelligence services in the form of SaaS, PaaS, and IaaS, which collaborate to support cloud services such as storage, control, management, measurement and networking functionalities. In addition, the greater availability of data is another reason that deep learning is becoming so


popular. Data is collected at higher quality through intelligent data filters and databases, for example with Hadoop's MapReduce and HDFS (the Hadoop Distributed File System). This enables fast key-value feature extraction from structured, semi-structured, and unstructured data in real time. After the key parameter features of the big data are extracted, the Deep Learning engine can work directly on the resulting data, using it as training data to train the system and then immediately give accurate results on the overall structure of the data being collected from the databases. That performance scales with data is another of the major reasons: improved machine learning, deep learning and big data technology make better use of data. In addition, there are major contributions from hardware innovation, with support from much more powerful CPUs and GPUs, including multi-core smartphone CPUs. Low-energy, efficient processing can be used on mobile devices such as smartphones, augmented reality devices, and IoT platforms that support powerful distributed computing. Integrated system hardware and software, clouds, servers, and network innovations are another driving source of why Deep Learning is so popular. Deep learning is a machine learning technique that uses multiple internal layers (hidden layers) of nonlinear processing units (neurons) to conduct supervised or unsupervised learning from data, which is why deep learning technology is commonly implemented using a neural network. Machine Learning and Deep Learning are used for natural language processing, computer vision, speech recognition, robotics motion and manipulation, as well as computational creativity. Some AI technology types are artificial neural networks, evolutionary algorithms, genetic programming, and swarm intelligence. AI tools used to make optimal decisions, or faster suboptimal decisions, include optimization theory, game theory, fuzzy logic, and simulated annealing. The behavior of a neuron with only two states is given in equation 1.

\[
y = \sum_{i=1}^{n} x_i w_i, \qquad
\text{output} =
\begin{cases}
1 & \text{if } y \ge T \\
0 & \text{otherwise}
\end{cases}
\tag{1}
\]

Here x_i represents the possible inputs, w_i their corresponding weights, y the possible outcome, and T the threshold value provided by the programmer while constructing the neural network. Equation 1 produces a hard (binary) output. To obtain a soft (continuous) output instead, other functions are used, such as the Rectified Linear Unit (ReLU), softplus, and the logistic sigmoid (whose output lies between 0 and 1), as described in equation 2.

\[
g(y) = \max(0, y), \qquad
z(y) = \log\bigl(1 + \exp(y)\bigr), \qquad
s(y) = \frac{1}{1 + \exp(-y)}
\tag{2}
\]

II. LITERATURE SURVEY

Machine learning related findings from a survey of enterprises with at least $500 million in sales were provided in [1]. Among the significant findings, 76% claimed to be targeting higher sales growth using machine learning. In addition, at least 40% already use machine learning in sales and marketing. It was also found that 38% credited machine learning for contributions to sales improvements, and several European banks were able to increase new product sales by 10% while reducing churn by 20%.

Here are some company considerations for machine learning and deep learning [2].

a. Data protection is critical, because artificial intelligence accuracy is based on the data set used in training the software.
b. Machine learning and deep learning software, including open source software and development tools, is available. But if one misuses the way the overall artificial intelligence and machine learning system is trained, that could result in racial discrimination and other types of social issues.
c. The machine learning or artificial intelligence engine is somewhat of a black box to many people, who may not even know that a bias of that type was programmed into the system.
d. Artificial intelligence and robotics will result in job displacements.

Here are some personal considerations in the machine learning area that should be taken into account [3].

a. Make artificial intelligence do the routine administrative work.
b. Make artificial intelligence do the report writing.
c. Learn to trust the advice of AI data analysis, or at least partially trust it.
d. Judgment and decision making that adds human knowledge and experience to the data analysis results of AI and ML is very important. This is the main role of a human manager, human administrator or strategic planner. The factors to consider in this part are organization history, culture, empathy, human rights, common wealth, principles of equality, policies and ethical reflection.
e. Become a leader in teaching and advising on how to effectively use artificial intelligence and machine learning.

Characteristics of businesses with deep learning and machine learning [4].


a. Deep learning excels at classification, prediction, and generative tasks across a number of domains.
b. Based on data structure, there is structured data and unstructured data.
c. Structured data with time series: log analysis and risk detection for data centers, security, and finance; enterprise resource planning for manufacturing, automation, and supply chains.
d. Relational Data Stream Management System (RDSMS): a data stream management system that uses a distributed, in-memory computing structure and SQL queries to process real-time structured and unstructured data streams. RDSMS SQL queries do not exit after being executed, so that they generate continuous results as new data streams enter the database.
e. Unstructured data, which includes sound, text, image and video types. For sound, there is voice recognition, which is used in UX, UI, automotive, security and IoT areas, and voice search, which is used on smartphones and by telecom companies.
f. In addition, sentiment analysis for CRM; flaw detection based on engine noise, for example in automotive, aviation, and manufacturing; and fraud detection in finance, credit cards, banking and payment processing.
g. Language translation, which is used by the government and by various private applications, augmented searching, and theme detection in the area of finance. In the image domain, unstructured data applications include facial recognition and image search, which are used by the government, social media, and many other places, and machine vision in manufacturing, robotics, automotive, and aviation.
h. In addition, video for motion detection, which is used in gaming, robotics, UX and UI; threat prediction, which is used by the government and transportation agencies; and real-time threat detection, which is used for security and for places like airports and terminals.

Deep learning and machine learning deployment options [5].

a. In the hardware domain, there are CPUs (Central Processing Units), GPUs (Graphics Processing Units), ASICs (Application-Specific Integrated Circuits), FPGA technology, and others.
b. In the software domain, there are operating systems, libraries, APIs (Application Programming Interfaces), and many others.
c. Using pre-trained models: there is IBM Watson, from Google there are TensorFlow and Inception, Nvidia has the DGX-1 system, from Baidu there is DeepBench, and there are others.

The competitive landscape and opportunities related to machine learning and deep learning are tremendous. Major players are providing open-source deep learning frameworks to attract developer talent and influence downstream applications. Large open source communities maintain frameworks, provide support and drive new application areas. In addition, web-scale companies have a competitive advantage due to data volumes and large capital reserves for hardware. The motivation for our research comes from previous work that used TensorFlow to demonstrate the performance of Deep Learning techniques. In [6], the authors demonstrated the usefulness of visualization using TensorFlow and described its usage in different scenarios for discovering novel deep learning models. In [7], the authors designed a new library called GPflow, which uses Gaussian processes and aims to support different kernel functions, providing fast processing at large scale with greater accuracy. In [8], the authors used weighted fuzzy logic to assign proper weights when training the data to extract sentiments, whereas in [9] the authors made a detailed comparison of predictive models and performed analysis on a time series dataset. In [10], the authors used a generalized linear model (GLM) to select the most influential attribute from the PIMA dataset, whereas in [11] the authors used the gradient ascent method to initialize the exact weights of the terms when analyzing the sentiments of tweets.

III. METHODOLOGY

TensorFlow is a numerical computation mechanism based on data flow graphs, in which nodes represent mathematical operations and edges represent the tensors. A tensor is defined as a geometric vector: an organized, multidimensional array of data values, used in describing the geometric relationship among the values. TensorFlow version 1.1.0 was released in March of 2017, and included support for Java APIs for Windows. Let us first understand the definition of Artificial Intelligence (AI). It is a technology that enables a machine to make an intelligent decision or action. AI technology enables an intelligent agent (a hardware module, software, a robot, or an application) to cognitively perceive its environment and correspondingly attempt to maximize its probability of success in a target action. Machine learning, in turn, is the capability given to a computer to learn without being explicitly programmed. It is a functionality to learn and make predictions from data, and it evolved from pattern recognition and computational learning theory in AI. Deep learning is a machine learning technique that uses multiple internal layers (hidden layers) of nonlinear processing units (neurons) to conduct supervised or unsupervised learning from data, which is why deep learning technology is commonly implemented using a neural network. The detailed structure is shown in figure 1, which has six hidden layers and six neurons in each hidden layer.


Figure 1. Architecture of CNN using TensorFlow Playground

Supervised learning uses training with labeled data, that is, data for which the desired output values are already specified. We have the inputs and the desired outputs corresponding to these inputs, and we can train the weights so that the network operates the way we want it to. The other approach is unsupervised learning, which is training that uses unlabeled data, so no desired output values are used. Other techniques include semi-supervised learning, which is training that uses both labeled and unlabeled data, and reinforcement learning, in which feedback is given back into the system but no labeled data is used. Backpropagation is used to train perceptrons and multi-layer perceptrons. It uses training iterations in which the error size, as well as the variation, direction, and speed, are used to determine the update value of each weight of the neural network. The structure of a neural network determines the level of intelligence that can be accomplished. One neuron can only make a very simple one-dimensional decision. For more complex intelligence, we need more neurons working together. In each iteration, the difference value and the input value are used to derive the gradient of the weights of the output-layer and hidden-layer neurons. The gradient of the weights is then scaled down by the learning rate, which determines the learning speed and the resolution. We update the weights in the opposite direction to the sign of the gradient. In other words, if the gradient is positive, we update the weights by a negative value; if the gradient is negative, we update the weights by a positive value. We repeat all steps until the input-to-output performance is satisfactory.

IV. RESULTS AND DISCUSSION

The setup is as follows. The learning rate is set to 0.001. For the activation, we have used ReLU and Tanh. The regularization level is None; regularization is needed when solving a complex problem, such as Dataset 4, to avoid the overfitting problem. We consider the problem type as classification, and the ratio of training-to-test data is set to 70%. The noise is set to zero to make it easy to find the solution. In addition, the batch size is set to 30. The screen setting is as shown in figure 5. The objective is to conduct classification, that is, a separation of two sets of clusters of data. The blue dots are in the blue region and the orange dots are in the orange region, and therefore our experiment is successful. The black curve drops very quickly and stays at a low level. In fact, the two lines, the black line and the gray line, overlap perfectly in all the cases. Our experimental work is compared with the previous work carried out by the authors in [12], which focuses on direct manipulation and interactive visualization. In the present work, by contrast, we focus on finding the exact numbers of neurons and hidden layers needed for four different datasets with test and train error less than 0.005; the results are listed in table 1. The output obtained from the experiments carried out on all the datasets is shown in figure 3. Figure 2 plots the behavior of the CNN designed on one of the datasets with three neurons and with five neurons, with proper weights.

Figure 2. Behavior of CNN based on neurons
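The weight-update procedure described in the methodology (derive the gradient from the difference value and the input value, scale it by the learning rate, move each weight against the sign of its gradient, and repeat) can be sketched for a single sigmoid neuron. This is a minimal illustration under stated assumptions: the toy dataset, the squared-error loss, the learning rate of 0.5, and the 2000 iterations are hypothetical choices for the example and are not the settings used in the experiments.

```python
import math


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def mean_squared_error(weights, bias, samples):
    """Average squared difference between the neuron output and the desired output."""
    total = 0.0
    for inputs, target in samples:
        output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        total += (output - target) ** 2
    return total / len(samples)


def train_step(weights, bias, samples, learning_rate):
    """One gradient-descent iteration over all samples."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for inputs, target in samples:
        output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        # Difference value times the sigmoid slope gives the error signal.
        delta = 2.0 * (output - target) * output * (1.0 - output)
        for i, x in enumerate(inputs):
            grad_w[i] += delta * x / len(samples)
        grad_b += delta / len(samples)
    # Move each weight against its gradient, scaled by the learning rate.
    new_weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    new_bias = bias - learning_rate * grad_b
    return new_weights, new_bias


# Hypothetical, linearly separable toy data: (inputs, desired output).
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
weights, bias = [0.0, 0.0], 0.0
initial_loss = mean_squared_error(weights, bias, samples)
for _ in range(2000):
    weights, bias = train_step(weights, bias, samples, learning_rate=0.5)
final_loss = mean_squared_error(weights, bias, samples)
```

A positive gradient produces a negative update and a negative gradient a positive one, which is the sign rule stated in the text; a smaller learning rate gives finer but slower updates.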


Figure 3. Output obtained from the datasets considered

Table 1. Comparative results on different datasets

Dataset  Training-to-test  Batch  Number of  Number of      Neurons per   Learning  Activation  Regularization  Regularization  Test   Training
         data ratio (%)    size   inputs     hidden layers  hidden layer  rate      function    level           rate            loss   loss
-------  ----------------  -----  ---------  -------------  ------------  --------  ----------  --------------  --------------  -----  --------
1        70                30     2          1              1             0.001     ReLU        L1              0.01            0.007  0.008
2        70                30     2          3              6             0.001     ReLU        L2              0.01            0.018  0.011
3        70                30     2          4              6             0.001     ReLU        L2              0.01            0.008  0.007
4        50                17     2          6              8             0.003     Tanh        None            0               0.007  0.007
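To give a feel for how the network size grows across the four configurations in Table 1, the number of trainable parameters can be counted from the number of inputs, hidden layers and neurons per hidden layer. This is a rough sketch that assumes a fully connected network with one bias per neuron and a single output neuron; neither assumption is stated in the table.

```python
def count_parameters(n_inputs, hidden_layers, neurons_per_layer, n_outputs=1):
    """Weights plus one bias per neuron for a fully connected network (assumed layout)."""
    sizes = [n_inputs] + [neurons_per_layer] * hidden_layers + [n_outputs]
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes[:-1], sizes[1:]))


# (inputs, hidden layers, neurons per hidden layer) copied from Table 1.
table_1 = {1: (2, 1, 1), 2: (2, 3, 6), 3: (2, 4, 6), 4: (2, 6, 8)}
parameter_counts = {dataset: count_parameters(*shape) for dataset, shape in table_1.items()}
```

Under these assumptions the counts are 5, 109, 151 and 393 parameters for datasets 1 to 4 respectively.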

CONCLUSION
The TensorFlow Playground application provides a direct-manipulation approach to understanding CNN behavior. Its significance for experiments in the field of deep learning is that the visualization is designed to make it easy to get a practical feel for how to make a CNN work without any need for explicit coding. In our experiments, TensorFlow Playground is restricted to creating only six hidden layers, and for dataset 4 we need at least seven hidden layers to achieve a test loss of less than 0.007. The findings of our research fix the number of neurons needed in each hidden layer for different classification problems, as listed in Table 1, with very low training and testing error. In future, we would like to work on TensorFlow version 1.0 with different computation resources, towards achieving low test and train error rates on large datasets.

REFERENCES
[1] Ye Q, Zhang Z, Law R. Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Systems with Applications. 2009 Apr 1;36(3):6527-35.
[2] Najafabadi MM, Villanustre F, Khoshgoftaar TM, Seliya N, Wald R, Muharemagic E. Deep learning applications and challenges in big data analytics. Journal of Big Data. 2015 Dec;2(1):1.
[3] Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015 Jul 17;349(6245):255-60.
[4] Witten IH, Frank E, Hall MA, Pal CJ. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann; 2016 Oct 1.
[5] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015 May;521(7553):436.
[6] Wongsuphasawat K, Smilkov D, Wexler J, Wilson J, Mané D, Fritz D, Krishnan D, Viégas FB, Wattenberg M. Visualizing Dataflow Graphs of Deep Learning Models in TensorFlow. IEEE Transactions on Visualization and Computer Graphics. 2018 Jan;24(1):1-2.
[7] Matthews AG, van der Wilk M, Nickson T, Fujii K, Boukouvalas A, León-Villagrá P, Ghahramani Z, Hensman J. GPflow: A Gaussian process library using TensorFlow. Journal of Machine Learning Research. 2017 Jan 1;18(40):1-6.
[8] Basha, Syed Muzamil, Yang Zhenning, Dharmendra Singh Rajput, N. Iyengar, and D. R. Caytiles. "Weighted Fuzzy Rule Based Sentiment Prediction Analysis on Tweets." International Journal of Grid and Distributed Computing 10, no. 6 (2017): 41-54. DOI: 10.14257/ijgdc.2017.10.6.04
[9] Basha, Syed Muzamil, Yang Zhenning, Dharmendra Singh Rajput, Ronnie D. Caytiles, and N. Ch SN Iyengar. "Comparative Study on Performance Analysis of Time Series Predictive Models." International Journal of Grid and Distributed Computing 10, no. 8 (2017): 37-48. DOI: 10.14257/ijgdc.2017.10.8.04
[10] Basha, Syed Muzamil, H. Balaji, N. Ch SN Iyengar, and Ronnie D. Caytiles. "A Soft Computing Approach to Provide Recommendation on PIMA Diabetes." International Journal of Advanced Science and Technology 106 (2017): 19-32. DOI: 10.14257/ijast.2017.106.03
[11] Basha, Syed Muzamil, Dharmendra Singh Rajput, and Vishnu Vandhan. "Impact of Gradient Ascent and Boosting Algorithm in Classification." International Journal of Intelligent Engineering and Systems (IJIES) 11, no. 1 (2018): 41-49. DOI: 10.22266/ijies2018.0228.05
[12] Smilkov D, Carter S, Sculley D, Viégas FB, Wattenberg M. Direct-manipulation visualization of deep networks. arXiv preprint arXiv:1708.03788. 2017 Aug 12.
