Training of Neural Network On A Mnist Dataset Using Openmp For Handwritten Digit Recognition

This document proposes researching the use of OpenMP for parallelizing the training of neural networks on a fashion MNIST dataset for image recognition. The researchers aim to assess how OpenMP impacts recognition accuracy, training time, and resource utilization. They will train a neural network on the fashion MNIST dataset through multiple iterations while varying parameters like thread count, processor architectures, and dataset sizes. The goal is to optimize OpenMP configurations and improve efficiency and reduce training times to provide insights for fashion recognition systems.

Uploaded by

Subhas Roy

TRAINING OF NEURAL NETWORK ON A FASHION MNIST DATASET

USING OPENMP FOR RECOGNITION

Research Proposal
Mahule Roy
Department of Metallurgical and Materials Engineering
National Institute of Technology Karnataka Surathkal
211MT026

Mayank Sinha
Department of Metallurgical and Materials Engineering
National Institute of Technology Karnataka Surathkal
211MT029

Spoorthi Gumtapure
Department of Civil Engineering
National Institute of Technology Karnataka Surathkal
211CV248

February 28th, 2024


ABSTRACT
This paper explores the application of OpenMP for accelerating the training of
neural networks for fashion image recognition. OpenMP's parallelization
capabilities will be harnessed to distribute computation across multiple threads,
enhancing efficiency. Using the Fashion-MNIST dataset, we aim to assess the impact
of OpenMP on recognition accuracy, training time, and resource utilization. The
study will encompass variations in thread count, processor architectures, and
dataset sizes. The anticipated outcomes include improved efficiency and reduced
training times, providing valuable insights for optimizing OpenMP configurations
in fashion recognition systems.

1. Introduction
Parallel computing is now a standard way for computational scientists and engineers to
solve problems in areas as diverse as galactic evolution, climate modeling, aircraft design, and
molecular dynamics. Parallel computers are roughly classified as multicomputers and multiprocessors.
Multi-core technology places more than one core inside a single chip. This opens the way to
parallel computation, where multiple parts of a program execute at the same time. Thread-level
parallelism is a well-known strategy for improving processor performance, and it has led to
multithreaded processors. Multi-core processors offer explicit support for executing multiple threads
in parallel and thus reduce idle time. The factor that motivates the design of parallel algorithms for
multi-core systems is performance. The performance of a parallel algorithm is sensitive to the number
of cores available in the system. One of the parameters used to measure performance is execution time.
We chose a Neural Network (NN) to classify the Fashion MNIST dataset due to its efficiency in
handling large datasets, facilitated by GPU acceleration and the scalability of stochastic gradient descent
(SGD) and minibatch gradient descent. NN's parallel processing on GPUs and the ability to efficiently
scale to a vast number of examples make it superior to traditional algorithms like support vector
machines (SVM). This adaptability, coupled with NN's prowess in capturing intricate complexities,
positions it as an optimal choice for image classification tasks with substantial and intricate data.
In machine and deep learning algorithms, the training process is notably time- and memory-intensive, a
critical consideration for hardware implementation on FPGA or GPGPU. Employing parallelization in
our program becomes pivotal to enhance performance and conserve memory during the training of our
Neural Network (NN). This paper investigates the influence of modifying NN variables on its
performance and accuracy, shedding light on optimization strategies.

2. Related works
In [6], the authors present a context-sensitive grammar in an And-Or graph representation that can
produce a large set of composite graphical templates of cloth configurations.
In [5], the authors introduce a pipeline for recognizing and classifying clothes in natural scenes. To do
so, they use a multi-class learner based on a Random Forest. Extracted feature maps are converted to
histograms and used as inputs for a Random Forest classifier (for clothing types) and an SVM classifier
(for clothing attributes). As a result, their pipeline can describe the clothes in a scene. They also
created their own dataset of 80000 images labeled in 15 classes.
Other authors propose a knowledge-guided fashion analysis network for clothing landmark
localization and classification. To do so, they use a Bidirectional Convolutional Neural Network. As
a result, the model can predict not only landmarks but also categories and attributes.

3. MOTIVATION FOR CHOOSING COMPUTER VISION AND IMAGE PROCESSING DOMAIN FOR PARALLELIZATION

In the specialized field of computer vision and image processing, the demands on
computational resources are pronounced, especially during the training of complex deep
learning models. Efficient hardware implementation, for example on an FPGA or GPGPU,
becomes imperative to meet these challenges. Integrating parallelization techniques into
our program is key to improving the training efficiency of Neural Networks (NN)
for tasks like image recognition and processing. This research explores the
impact of modifying NN variables, offering insights into improving both the
performance and accuracy of image-related applications within the context of computer vision
and image processing.

Furthermore, the insights gained from this research can be extended to the realm of anomaly
detection. As the study delves into optimizing Neural Networks (NN) for image-related tasks
in computer vision and image processing, the findings may be leveraged to enhance the
efficiency and effectiveness of anomaly detection systems, broadening the applicability of the
research in diverse domains. Anomaly detection holds significant relevance in materials and
civil engineering applications, and the insights derived from training the Neural Network (NN)
can be effectively applied to analyze diverse image datasets beyond its original scope, thereby
extending its utility across various applications.

4. DATASET USED
Fashion-MNIST is a direct drop-in alternative to the original MNIST dataset for benchmarking
machine learning algorithms. MNIST is a collection of handwritten digits and contains 70000
greyscale 28x28 images associated with 10 labels, of which 60000 form the training set and
10000 the test set. Fashion-MNIST has the exact same structure, but the images are fashion
products, not digits. The dataset can be obtained as two 785-column CSV files, one with training
images and the other with test images. Each CSV row is an image: one column holds the
label (enumerated from 0 to 9) and the remaining 784 columns describe the 28x28 pixel image
with values from 0 to 255 representing pixel luminosity. To make data access easier, we
generated image files organized into directories by usage and label. This way, we can easily
obtain an image's usage and label from its path alone.
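
The CSV layout described above can be read with a few lines of C. The following is a minimal sketch, not the code used in this proposal: the function name parse_row, the constant IMG_PIXELS, the label-first column order, and the scaling of pixels to the range 0 to 1 are all assumptions made for illustration.

```c
#include <stdlib.h>

#define IMG_PIXELS 784  /* 28 x 28 pixels per image */

/* Parses one CSV row of the assumed form "label,p0,p1,...,p783".
 * Stores the label (0-9) in *label and the pixels, scaled from
 * 0..255 to 0..1, in pixels.
 * Returns 0 on success and -1 if the row is malformed. */
int parse_row(const char *row, int *label, float pixels[IMG_PIXELS])
{
    char *end;
    long value = strtol(row, &end, 10);
    if (end == row || value < 0 || value > 9)
        return -1;
    *label = (int)value;

    for (int i = 0; i < IMG_PIXELS; i++) {
        if (*end != ',')
            return -1;               /* fewer than 784 pixel columns */
        const char *field = end + 1;
        value = strtol(field, &end, 10);
        if (end == field || value < 0 || value > 255)
            return -1;               /* empty field or out-of-range pixel */
        pixels[i] = (float)value / 255.0f;
    }
    return 0;
}
```

Scaling the pixels is a common preprocessing choice rather than something the dataset requires; the raw 0 to 255 values can be kept by dropping the division.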

5. Implementation
One of the useful things about OpenMP is that it lets users compile the same source
code with both OpenMP-compliant compilers and ordinary compilers. This is achieved by keeping the
OpenMP directives and commands hidden from ordinary compilers. The OpenMP standard was formulated
as an API for writing portable, multithreaded applications. It started as a Fortran-based standard, but later
grew to include C and C++. The OpenMP programming model provides a set of compiler pragmas. Many
loops can be threaded by inserting a single pragma right above the loop. The OpenMP implementation
determines how many threads to use and how best to manage them. Instead of adding a lot of code to
create a parallel program, the programmer only needs to tell OpenMP which loop should be threaded.
In order to understand OpenMP, it is necessary to know the concept of parallel
programming: in parallel computing systems, processing is carried out by more than one processor.
Some of the advantages of OpenMP are good performance, portability, very little programming
effort, and the ability to parallelize a program incrementally.
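
As an illustration of threading a loop with a single pragma, the sketch below parallelizes a simple vector update. The function scaled_add and its parameters are hypothetical names chosen for this example and are not part of the proposed implementation.

```c
/* Computes y[i] = a * x[i] + y[i] for every i in [0, n).
 * With an OpenMP-enabled compiler (for example GCC with -fopenmp) the
 * iterations are divided among threads. A compiler without OpenMP
 * support ignores the pragma and runs the same loop serially. */
void scaled_add(float a, const float *x, float *y, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

The loop is safe to thread because each iteration writes a different element of y. The thread count can be left to the runtime, set through the OMP_NUM_THREADS environment variable, or fixed per loop with a num_threads clause on the pragma.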

The neural network is a fully-connected network consisting of 2 layers with 100 and 10 neurons,
respectively. The input vector holds 12 values. The weights are stored in WL1[100][12+1] (+1 is for
the bias) and WL2[10][100+1] for layers 1 and 2, respectively. The internal states of the neurons are
stored in DL1[100] for layer 1 and DL2[10] for layer 2, and their corresponding outputs in OL1[100]
and OL2[10].
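
A forward pass over these arrays might look like the sketch below. It reuses the array names and sizes given above. Everything else is an assumption made for illustration: the function name forward, the placement of the bias in the last column of each weight row, and the ReLU activation (the text does not name an activation function).

```c
#define N_IN 12    /* input values, as stated in the text */
#define N_L1 100   /* neurons in layer 1 */
#define N_L2 10    /* neurons in layer 2 */

float WL1[N_L1][N_IN + 1];   /* ASSUMPTION: last column holds the bias */
float WL2[N_L2][N_L1 + 1];
float DL1[N_L1], OL1[N_L1];  /* internal states and outputs, layer 1 */
float DL2[N_L2], OL2[N_L2];  /* internal states and outputs, layer 2 */

/* ASSUMPTION: ReLU activation, chosen only to keep the sketch short. */
static float activation(float d) { return d > 0.0f ? d : 0.0f; }

void forward(const float in[N_IN])
{
    /* Each thread computes whole neurons, so no two threads write the
     * same element of DL1 or OL1. */
    #pragma omp parallel for
    for (int i = 0; i < N_L1; i++) {
        float sum = WL1[i][N_IN];            /* bias term */
        for (int j = 0; j < N_IN; j++)
            sum += WL1[i][j] * in[j];
        DL1[i] = sum;
        OL1[i] = activation(sum);
    }
    /* The implicit barrier at the end of the first loop guarantees that
     * every OL1 value is written before layer 2 reads it. */
    #pragma omp parallel for
    for (int i = 0; i < N_L2; i++) {
        float sum = WL2[i][N_L1];            /* bias term */
        for (int j = 0; j < N_L1; j++)
            sum += WL2[i][j] * OL1[j];
        DL2[i] = sum;
        OL2[i] = activation(sum);
    }
}
```

Parallelizing over neurons rather than over inputs avoids the need for a reduction, since each neuron's sum is accumulated by a single thread.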

5.1 Objective
We implement parallel algorithms using OpenMP with the expectation that they will run faster than their
sequential counterparts. A sequential program executing on a single processor can perform only one
computation at a time, whereas a parallel program ideally divides its work evenly among multiple
processors that execute at the same time. The main objective of this approach is to increase performance
(speedup), which is inversely proportional to execution time.
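
The relationship between speedup and execution time can be written out using the standard definitions below. The symbols are introduced here for illustration: T_1 is the execution time of the sequential program, T_p the execution time on p threads, S(p) the speedup, and E(p) the parallel efficiency.

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}
```

As a purely hypothetical example, if training takes T_1 = 120 s sequentially and T_4 = 40 s on 4 threads, then S(4) = 120 / 40 = 3 and E(4) = 3 / 4 = 0.75. The ideal case, in which the work divides perfectly among the processors, corresponds to S(p) = p and E(p) = 1.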

5.2 Overview of Proposed Work


We are training our Neural Network (NN) architecture on the Fashion MNIST dataset through
multiple iterations to systematically evaluate its performance under varying parameters.
Through this iterative process, we aim to compare and analyze the impact of adjusting
parameters on computation time and accuracy across both training and test samples. Employing
OpenMP for parallelization, we anticipate achieving faster computation times by optimizing
various aspects of the model and thereby enhancing its overall efficiency.

In our experimentation, we will systematically vary hyperparameters, including learning rate,
number of iterations, alpha value, neuron weights, and hidden layers, to comprehensively
evaluate their influence on our model's performance. A critical aspect of this analysis involves a
detailed examination of the back propagation step, a fundamental process for determining
precise weights crucial for accuracy and overall model performance.

Furthermore, we are keen on investigating the computational efficiency of back propagation
with the incorporation of OpenMP parallelization. This comparison between the runtime of
serial and parallel back propagation implementations will provide valuable insights into the
potential gains in efficiency achieved through parallel processing. Our focus on back
propagation, coupled with parameter variations, aims to offer a holistic understanding of how
these factors collectively impact the performance, accuracy, and computational speed of our
neural network model.
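
To make the serial versus parallel comparison concrete, the sketch below shows one way the weight update for the output layer could be threaded. It is an illustration under stated assumptions, not the back propagation code of this proposal: the function name update_layer2, the squared-error loss, the identity output activation, and the bias in the last column are all assumptions.

```c
#define N_L1 100   /* neurons in layer 1 */
#define N_L2 10    /* neurons in layer 2 */

/* Performs one gradient-descent step on the layer-2 weights.
 * ASSUMPTIONS: squared-error loss and identity output activation, so the
 * error term of output neuron i is (OL2[i] - target[i]). The last column
 * of each row of WL2 is the bias weight, whose input is fixed at 1. */
void update_layer2(float WL2[N_L2][N_L1 + 1], const float OL1[N_L1],
                   const float OL2[N_L2], const float target[N_L2],
                   float learning_rate)
{
    /* Each thread updates whole rows of WL2, so no two threads ever
     * write the same weight. */
    #pragma omp parallel for
    for (int i = 0; i < N_L2; i++) {
        float delta = OL2[i] - target[i];
        for (int j = 0; j < N_L1; j++)
            WL2[i][j] -= learning_rate * delta * OL1[j];
        WL2[i][N_L1] -= learning_rate * delta;
    }
}
```

Removing the pragma, or compiling without OpenMP support, gives the serial version of the same update, which makes a like-for-like runtime comparison straightforward.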

5.3 Further Implementation


Genetic Algorithm
For our extended implementation, we aim to tackle the classic N-Queen problem, known for its
complexity in AI. Traditional serial approaches, including various search methods like Hill
Climbing and A*, tend to be lengthy and less effective compared to the robustness of the genetic
algorithm. We intend to conduct a thorough performance comparison between the serial and
parallel implementations of the genetic algorithm to showcase its efficiency in solving the N-
Queen problem.

The rationale behind choosing the genetic algorithm lies in its versatility for solving both
constrained and unconstrained optimization problems, drawing inspiration from natural
selection in biological evolution. The genetic algorithm iteratively modifies a population of
individual solutions, employing selection, crossover, and mutation processes. By delving into
research and literature on genetic algorithms, we aim to identify optimal crossover rules,
selection strategies, and mutation rules, maximizing the efficiency and effectiveness of solving
the N-Queen problem.
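
In a genetic algorithm, the fitness of each individual can be evaluated independently, which makes fitness evaluation a natural candidate for OpenMP. The sketch below assumes one common encoding, in which board[c] is the row of the queen in column c, and uses the number of attacking pairs as the fitness to minimize. The names N_QUEENS, attacking_pairs, and evaluate_population are hypothetical, and the board size of 8 is chosen only for the example.

```c
#define N_QUEENS 8   /* ASSUMPTION: board size chosen for illustration */

/* Counts pairs of queens that attack each other. board[c] is the row of
 * the queen in column c, so two queens never share a column. A result of
 * 0 means the board is a valid solution. */
int attacking_pairs(const int board[N_QUEENS])
{
    int conflicts = 0;
    for (int a = 0; a < N_QUEENS; a++) {
        for (int b = a + 1; b < N_QUEENS; b++) {
            int row_diff = board[a] - board[b];
            if (row_diff < 0)
                row_diff = -row_diff;
            /* same row, or same diagonal */
            if (row_diff == 0 || row_diff == b - a)
                conflicts++;
        }
    }
    return conflicts;
}

/* Evaluates every individual of the population. Individuals are
 * independent, so each thread writes only its own fitness[k]. */
void evaluate_population(int population[][N_QUEENS], int fitness[], int size)
{
    #pragma omp parallel for
    for (int k = 0; k < size; k++)
        fitness[k] = attacking_pairs(population[k]);
}
```

Selection, crossover, and mutation would operate on the fitness array afterwards. Those steps are left out here because the crossover, selection, and mutation rules are still to be chosen.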

6. References
1. https://in.mathworks.com/help/gads/what-is-the-genetic-algorithm.html
2. https://www.tensorflow.org/datasets/catalog/fashion_mnist
3. [2007.08657v1] Training with reduced precision of a support vector machine model for text
classification (arxiv.org)
4. Sanjay Kumar Sharma, Dr. Kusum Gupta, "Performance Analysis of Parallel Algorithms on
Multi-core System using OpenMP Programming Approaches", International Journal of
Computer Science, Engineering and Information Technology (IJCSEIT), Vol. 2, No. 5, October
2012.
5. Pranav Kulkarni, Sumith Pathare, "Performance Analysis of Parallel Algorithms over Sequential
using OpenMP", IOSOR Journal of Computer Science, Volume 6, March-April 2014.
6. Sushil Kumar Sah and Dinesh Naik, "Parallelizing Doolittle Algorithm Using TBB", IEEE 2014.
7. Sheela Kathavate, N.K. Srikanth, "Efficient of Parallel Algorithms on Multi Core Systems Using
OpenMP", International Journal of Advanced Research in Computer and Communication
Engineering, Vol. 3, Issue 10, October 2014.
8. Vijayalakshmi Saravanan, Mohan Radhakrishnan, A.S. Basavesh, and D.P. Kothari, "A
Comparative Study on Performance Benefits of Multi-core CPUs using OpenMP", IJCSI
International Journal of Computer Science Issues, Vol. 9, Issue 1, No. 2, January 2012.
