1. INTRODUCTION
Object detection is the process of finding real-world objects such as cars, bikes, TVs,
flowers, and humans in still images or videos. It allows for the recognition,
localization, and detection of multiple objects within an image, which gives us a much
better understanding of the image as a whole. It is commonly used in applications such
as image retrieval, security, surveillance, and advanced driver assistance systems (ADAS).
Object detection can be performed in several ways:
Feature-based object detection
Viola-Jones object detection
SVM classification with HOG features
Deep learning object detection
The real world poses challenges: tiny hardware such as mobile phones and the
Raspberry Pi cannot run complex deep learning models. This project demonstrates how
object detection can be done on a Raspberry Pi, for targets like cars on a road, oranges
in the fridge, or signatures in a document. The Raspberry Pi is a neat piece of hardware
that has captured a whole generation, with roughly 15 million devices sold and hackers
building ever cooler projects on it. Given the popularity of deep learning and the
availability of the Raspberry Pi camera, we can detect objects using deep learning on
the Pi. Object detection has been good enough for a variety of applications; even
though image segmentation gives a much more precise result, it suffers from the
complexity of creating training data.
Object detection is of significant importance and has been used across a variety of
industrial processes to identify products. Finding a specific object through visual
inspection is a basic task involved in multiple industrial processes such as sorting,
inventory management, machining, quality management, and packaging. Inventory
management can be very tricky, as items are hard to track in real time; automatic
object counting and localization help improve inventory accuracy. Every object
detection algorithm has a different way of working, but they all operate on the same
principle.
In deep learning based object detection, there are three primary methods we will
likely encounter:
Faster R-CNNs
You Only Look Once (YOLO)
Single Shot Detectors (SSDs)
When building object detection networks we normally take an existing network
architecture, such as VGG or ResNet, and use it inside the object detection pipeline.
The problem is that these network architectures can be very large, on the order of
200-500 MB. Architectures such as these are unsuitable for resource-constrained
devices due to their sheer size and the resulting number of computations. Instead, we
can use MobileNets, from another paper by Google researchers. These networks are
called "MobileNets" because they are designed for resource-constrained devices such
as smartphones. MobileNets differ from traditional CNNs through the use of depthwise
separable convolution.
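The efficiency gain is easy to quantify. Below is a minimal back-of-the-envelope
sketch in Python; the layer sizes are illustrative assumptions, not figures taken from
the MobileNet paper.

def standard_conv_cost(h, w, c_in, c_out, k):
    # Each output pixel needs k*k*c_in multiplies for each of c_out filters.
    return h * w * c_out * k * k * c_in

def depthwise_separable_cost(h, w, c_in, c_out, k):
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 convolution mixes the channels
    return depthwise + pointwise

h = w = 112
c_in, c_out, k = 64, 128, 3
std = standard_conv_cost(h, w, c_in, c_out, k)
sep = depthwise_separable_cost(h, w, c_in, c_out, k)
print(f"standard: {std:,} MACs, separable: {sep:,} MACs ({std / sep:.1f}x fewer)")

For a 3 x 3 kernel this works out to roughly an 8-9x reduction in multiply-accumulate
operations, which is why the architecture fits on a Raspberry Pi.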
Creating accurate machine learning models capable of identifying and localizing
multiple objects in a single image has remained a core challenge in computer vision.
But with recent advancements in deep learning, object detection applications have
become easier to develop: the TensorFlow Object Detection API is an open-source
framework built on top of TensorFlow that makes it easy to construct, train, and
deploy object detection models.
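For this project's setting, the same idea can be sketched with OpenCV's dnn module,
which loads a pre-trained Caffe-format MobileNet-SSD. The model file names and the
test image below are assumptions; any MobileNet-SSD trained on the 20-class PASCAL
VOC set should work the same way.

import cv2

CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle",
           "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
           "horse", "motorbike", "person", "pottedplant", "sheep", "sofa",
           "train", "tvmonitor"]

# Assumed file names for the pre-trained MobileNet-SSD weights.
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")
frame = cv2.imread("test.jpg")
h, w = frame.shape[:2]

# MobileNet-SSD expects a 300x300 input, scaled and mean-subtracted.
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                             0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()

for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:                        # drop weak detections
        idx = int(detections[0, 0, i, 1])
        box = detections[0, 0, i, 3:7] * [w, h, w, h]
        print(CLASSES[idx], float(confidence), box.astype("int"))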
Deep learning (also known as deep structured learning or hierarchical
learning) is part of a broader family of machine learning methods based on learning
data representations, as opposed to task-specific algorithms. Learning can be
supervised, semi-supervised, or unsupervised. Deep learning models are vaguely
inspired by information processing and communication patterns in biological nervous
systems, yet have various differences from the structural and functional properties of
biological brains (especially human brains), which make them incompatible with
neuroscience evidence.
In deep learning, each level learns to transform its input data into a slightly
more abstract and composite representation. In an image recognition application, the
raw input may be a matrix of pixels; the first representational layer may abstract the
pixels and encode edges; the second layer may compose and encode arrangements of
edges; the third layer may encode a nose and eyes; and the fourth layer may recognise
that the image contains a face. Importantly, a deep learning process can learn which
features to optimally place in which level on its own. (Of course, this does not
completely obviate the need for hand-tuning; for example, varying numbers of layers
and layer sizes can provide different degrees of abstraction.)
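As a toy illustration of the "first layer encodes edges" idea, the snippet below
convolves an image patch with a fixed edge kernel; in a real network the kernel
weights would be learned rather than hand-written like this Sobel filter.

import numpy as np

def conv2d(image, kernel):
    # Naive valid-mode 2-D convolution (strictly, cross-correlation).
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

patch = np.zeros((5, 5))
patch[:, 3:] = 1.0                 # a vertical brightness step
print(conv2d(patch, sobel_x))      # responds only where the edge sits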
2. LITERATURE SURVEY
[1] Sriram K.V. and R.H. Havaldar:
This paper proposes an approach to detect and track persons in a video. The approach
uses a Gaussian mixture model to detect the person and a Kalman filter to track the
detected person. The processing time for detection is reduced by performing the
detection on a down-sampled video; after detecting the person, the original size of
the video is reconstructed using the Papoulis-Gerchberg method. The performance
analysis is carried out by comparing with state-of-the-art algorithms. The
experimental results show that the proposed method is well suited for detecting and
tracking a person with lower processing time.
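A hedged sketch of that pipeline with OpenCV follows: Gaussian-mixture background
subtraction finds the moving person on a down-sampled frame, and a constant-velocity
Kalman filter smooths the track. The video file name and the tuning constants are
placeholders.

import cv2
import numpy as np

cap = cv2.VideoCapture("walk.avi")               # placeholder video source
subtractor = cv2.createBackgroundSubtractorMOG2()

kf = cv2.KalmanFilter(4, 2)                      # state: x, y, vx, vy
kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                                [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

while True:
    ok, frame = cap.read()
    if not ok:
        break
    small = cv2.resize(frame, None, fx=0.5, fy=0.5)   # detect at half size
    mask = subtractor.apply(small)
    kf.predict()                                      # predicted position
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
        centre = np.array([[x + w / 2], [y + h / 2]], np.float32)
        kf.correct(centre)                            # smooth the track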
[2] Virginia Menezes, Vamsikrishna Patchava and M. Surya Deekshith Gupta:
Surveillance and monitoring have become very important for security reasons these
days. Residential areas, government organisations, commercial spaces, schools and
hospitals, industries, banking and other challenging indoor and outdoor environments
require high-end surveillance systems, which are very expensive. This paper proposes
a motion detection and tracking system for surveillance. The proposed system uses a
Raspberry Pi and computer vision with SimpleCV to detect moving objects in the
surveillance area, switches on the lights to capture images, and streams the camera
feed online using MJPG-Streamer, where it can be viewed by any authorised person on
the go.
[3] Hwajeong Seo, Jongseok Choi, Hyunjin Kim, Taehwan Park and Howon Kim:
Traditional surveillance is enabled by closed-circuit television (CCTV) monitoring
each district in real time. However, this approach requires installing expensive
CCTV at every location, and the collected images or videos must go through complex
post-processing to yield useful and meaningful information. Furthermore, CCTV
nowadays intrudes on people's private lives, which is a crucial problem in modern
society. If the goal is a secure and robust street, simpler and cheaper approaches
may be preferable. In this paper, we present a novel surveillance system using the
light sensor commonly available in embedded processors and modern smartphones. In
contrast to the traditional method, a light sensor is a cheap module whose
information is easy to install and process. After processing, we can determine
secure or insecure places from the derived information. For practical evaluation,
we built a micro testbed on our campus. First we collected light information from
several locations in different time windows, and then secure or insecure places
were determined for each time window. We defined bright and dark places as secure
and insecure places, respectively. The evaluation shows that our approach is an
unprecedented, ultra-lightweight, and cost-effective method to improve security in
our society.
3. EXISTING METHOD
This paper puts forth a prototype system for assistive text reading and image
processing. The three functional components are scene capture, data processing,
and audio output. The scene capture component captures the scene by motion-based
object detection using a camera attached to a pair of sunglasses, with R denoting
the calculated foreground object at each frame; the object of interest is localized
by the mean of the foreground. The data processing component deploys the proposed
algorithms, which include object-of-interest detection and text localization to
obtain the image region containing text, and converts the identified text into
readable codes. Mini laptops are used as the processing device in the current
prototype.
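The text-extraction step of such a prototype can be sketched as follows: binarize
the localized region of interest and hand it to Tesseract OCR. This assumes the
pytesseract package and the Tesseract engine are installed; the image name and ROI
coordinates are placeholders.

import cv2
import pytesseract

frame = cv2.imread("scene.jpg")
roi = frame[100:300, 150:450]                 # hypothetical object region
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)

# Otsu thresholding cleans the text up before recognition.
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print(pytesseract.image_to_string(binary))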
A number of portable reading systems have been designed specifically for the
visually impaired. "K-Reader Mobile" runs on a cell phone and allows a visually
impaired person to read mail, receipts, fliers, and many other documents; however,
the documents must be flat and placed on a clear, dark surface. In addition,
"K-Reader Mobile" accurately reads black print on a white background but has
problems recognizing coloured text or text on a coloured background.
The drawback with these systems is that they are complex, heavy devices that blind
people find difficult to carry along. The internal architecture consisted of an
8051 microcontroller and separate chips for the CPU, GPU, USB controller, and RAM,
which made the model heavy and complicated. Thus our proposed work deals with
designing a device that is handy and easy for visually impaired people to carry
along.
To recognize different objects, we need to extract visual features which can
provide a semantic and robust representation; SIFT, HOG, and Haar-like features are
the representative ones. This is due to the fact that these features can produce
representations associated with complex cells in the human brain. However, due to
the diversity of appearances, illumination conditions, and backgrounds, it is
difficult to manually design a robust feature descriptor that perfectly describes
all kinds of objects.
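For reference, one of the hand-crafted descriptors named above (HOG) can be computed
in a few lines with OpenCV. The default descriptor works on 64 x 128 windows, the
size used for pedestrian detection in the original HOG paper; the image name is a
placeholder.

import cv2

image = cv2.imread("person.jpg", cv2.IMREAD_GRAYSCALE)
window = cv2.resize(image, (64, 128))

hog = cv2.HOGDescriptor()          # default 64x128 window parameters
features = hog.compute(window)     # 3780 gradient-histogram values
print(len(features))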
Multi-task learning learns a useful representation for multiple correlated tasks
from the same input. StuffNet, for instance, introduced convolutional features
trained for object segmentation and "stuff" (amorphous categories such as ground
and water) to guide accurate detection of small objects. Another line of work
presented multi-task network cascades of three networks, namely class-agnostic
region proposal generation, pixel-level instance segmentation, and region-based
object detection, combined into a multi-stage architecture to fully exploit the
learned segmentation features.
4. PROPOSED SYSTEM
To help blind persons detect objects with different patterns and complex
backgrounds, as found on many everyday hand-held products, we conceive a
camera-based assistive text reading framework to capture the object of interest
within the camera view and extract the printed text information from it. The
algorithm used in this system can handle cluttered backgrounds and different
patterns, and the proposed system can extract text information from hand-held
objects. In existing systems it is very challenging for a blind user to position
the object of interest within the camera's view, and there are still no fully
acceptable solutions; the problem is approached in several stages. In this
framework the product must appear in the camera's view. The system develops a
motion-based method to extract the object, i.e. the region of interest (ROI), from
the captured image; the framework then performs text recognition only on that ROI.
The text in captured images is usually surrounded by noise, and text characters
appear with different scales, sizes, fonts, and colours, so localizing objects and
the ROI in the captured image is a difficult task.
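A minimal sketch of that motion-based ROI step is given below: difference two
frames, threshold the result into a foreground mask, and crop a window around the
mask's centroid (the "mean of foreground"). The frame sources and crop size are
placeholder assumptions; the cropped ROI would then go to the text-recognition
stage.

import cv2
import numpy as np

prev = cv2.imread("frame1.jpg", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame2.jpg", cv2.IMREAD_GRAYSCALE)

diff = cv2.absdiff(curr, prev)                 # motion shows up as change
_, fg = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

ys, xs = np.nonzero(fg)
if xs.size:
    cx, cy = int(xs.mean()), int(ys.mean())    # mean of foreground pixels
    half = 100                                 # assumed half-width of ROI
    roi = curr[max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
    cv2.imwrite("roi.png", roi)                # hand this to the OCR stage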
Thinking in deep learning based object detection: apart from the above approaches,
there are still many important factors for continued progress. There is a large
imbalance between the number of annotated objects and background examples. To
address this problem, an effective online hard example mining (OHEM) algorithm was
proposed for automatic selection of hard examples, which leads to more effective
and efficient training. Instead of concentrating on feature extraction, a detailed
analysis of object classifiers found that it is of particular importance for object
detection to construct a deep and convolutional per-region classifier carefully,
especially for ResNet and GoogLeNet. The traditional CNN framework for object
detection is not skilled at handling significant scale variation, occlusion, or
truncation, especially when only 2D object detection is involved. To address this
problem, Xiang et al. proposed a novel subcategory-aware region proposal network,
which guides the generation of region proposals with subcategory information
related to object poses and jointly optimizes object detection and subcategory
classification. Ouyang et al.
found that the samples from different classes follow a long-tailed distribution,
which indicates that different classes with distinct numbers of samples have
different degrees of impact on feature learning. To this end, objects are first
clustered into visually similar class groups, and then a hierarchical feature
learning scheme is adopted to learn deep representations for each group separately.
In order to minimize computational cost and achieve state-of-the-art performance,
with the "deep and thin" design principle and following the pipeline of Fast R-CNN,
Hong et al. proposed the PVANET architecture, which adopts building blocks
including concatenated ReLU, Inception, and HyperNet to reduce the expense of
multi-scale feature extraction, and trains the network with batch normalization,
residual connections, and learning-rate scheduling based on plateau detection.
PVANET achieves state-of-the-art performance and can run in real time on a
Titan X GPU.
Recently, some CNN based face detection approaches have been proposed. As less
accurate localization results from independent regressions of object coordinates,
a novel IoU loss function was proposed for predicting the four bounds of a box
jointly. The Deep Dense Face Detector (DDFD) was proposed to conduct multi-view
face detection; it is able to detect faces in a wide range of orientations without
requiring pose/landmark annotations. Yang et al. proposed a novel deep learning
based face detection framework, which collects the responses from local facial
parts (e.g. eyes, nose, and mouth) to address face detection under severe
occlusions and unconstrained pose variations. Yang et al. also proposed a
scale-friendly detection network named ScaleFace, which splits a large range of
target scales into smaller sub-ranges; different specialized sub-networks are
constructed for these sub-scales and combined into a single network for end-to-end
optimization. Hao et al. designed an efficient CNN to predict the scale
distribution histogram of the faces and used this histogram to guide the zoom-in
and zoom-out of the image. Since the faces are approximately of uniform scale after
zooming, better performance is achieved with less computational cost compared with
other state-of-the-art baselines. Besides, some generic detection frameworks have
been extended to face detection with different modifications, e.g. Faster R-CNN.
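The intersection-over-union measure underlying the IoU loss mentioned above is
simple to state; here it is for two axis-aligned boxes given as (x1, y1, x2, y2).

def iou(a, b):
    # Overlap rectangle, clipped at zero if the boxes are disjoint.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 = 0.143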
Although DCNNs have obtained excellent performance on generic object detection, for
a long time none of these approaches achieved better results than the best
hand-crafted feature based methods for pedestrian detection, even when part-based
information and occlusion handling were incorporated. Therefore, some research has
been conducted to analyze the reasons. Zhang et al. attempted to adapt generic
Faster R-CNN to pedestrian detection: they modified the downstream classifier by
adding boosted forests to shared, high-resolution feature maps and used an RPN to
handle small instances and hard negative examples. To deal with the complex
occlusions in pedestrian images, a deep learning framework called DeepParts,
inspired by DPM, was proposed, which makes decisions based on an ensemble of
extensive part detectors.
DeepParts has advantages in dealing with weakly labeled data, low IoU positive
proposals and partial occlusion. Other researchers also tried to combine
complementary information from multiple data sources. CompACT-Deep adopts a
complexity-aware cascade to combine hand-crafted features and fine-tuned DCNNs.
Based on Faster R-CNN, Liu et al. proposed multi-spectral deep neural networks for
pedestrian detection to combine complementary information from color and thermal
images. Tian et al. proposed a task-assistant CNN (TA-CNN) to jointly learn
multiple tasks with multiple data sources and to combine pedestrian attributes with
semantic scene attributes. Du et al. proposed a deep neural network fusion
architecture for fast and robust pedestrian detection: based on the candidate
bounding boxes generated with SSD detectors, multiple binary classifiers are run in
parallel to conduct soft-rejection based network fusion (SNF) by consulting their
aggregated degrees of confidence. However, most of these approaches are much more
sophisticated than the standard R-CNN framework. CompACT-Deep consists of a
variety of hand-crafted features, a small CNN model and a large VGG16 model.
DeepParts contains 45 fine-tuned DCNN models, and a set of strategies, including
bounding box shifting handling and part selection, are required to arrive at the
reported results. Such modification and simplification are therefore significant
for reducing the burden on both software and hardware and satisfying real-time
detection demands. Tome et al. proposed a novel solution to adapt the generic
object detection pipeline to pedestrian detection by optimizing most of its stages.
Hu et al. trained an ensemble of boosted decision models by reusing the feature
maps, and a further improvement was gained with simple pixel labelling and
additional complementary hand-crafted features. A reduced-memory region-based deep
CNN architecture has also been proposed, which fuses regional responses from both
ACF detectors and SVM classifiers into R-CNN.
Ribeiro et al. addressed the problem of human-aware navigation and proposed a
vision-based person tracking system guided by multiple camera sensors.
Camera-based object detection of this kind helps blind persons detect products:
the camera acts as the main vision sensor, detecting the label image of the
product; the image is then processed internally, the label is separated from the
image, and finally the product is identified and its name is pronounced through
voice. The received label image is converted to text, the converted text is
displayed on a display unit connected to the controller, and the text is then
converted to voice so that the label name can be heard through earphones connected
to the audio output.
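The final voice stage can be sketched with the espeak command-line synthesizer,
which is commonly available on Raspbian; the recognized label string below is a
placeholder.

import subprocess

label = "water bottle"                  # text recognized from the label
subprocess.run(["espeak", label])       # spoken through the audio jack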
4.2 BLOCK DIAGRAM
[Block diagram: a power supply and a USB camera connected to the Raspberry Pi 3,
which drives the audio output.]
(a) Power Supply:
Fig: 4.2 Power supply
(b) USB Camera:
A Quantum QHM495LM 25MP USB web camera is used; its full description is given in
section 4.3.1.
The Raspberry Pi is a series of small single-board computers developed by the
Raspberry Pi Foundation to promote the teaching of basic computer science in
schools and in developing countries. The original model became far more popular
than anticipated, selling outside its target market for uses such as robotics. It
does not include peripherals (such as keyboards and mice) or cases; however, some
accessories have been included in several official and unofficial bundles.
The organization behind the Raspberry Pi consists of two arms. The first two
models were developed by the Raspberry Pi Foundation. After the Pi Model B was
released, the Foundation set up Raspberry Pi Trading, with Eben Upton as CEO, to
develop the third model, the B+. Raspberry Pi Trading is responsible for developing
the technology while the Foundation is an educational charity to promote the teaching
of basic computer science in schools and in developing countries.
The Raspberry Pi Foundation recommends Python, but any language that will compile
for ARMv6 can be used.
Installed by default on the Raspberry Pi:
C
C++
Java
Scratch
Ruby
4.3.1 Quantum QHM495LM 25MP Web Camera:
The Quantum QHM495LM 25MP webcam offers life-like picture quality and good sound
reproduction. It has a built-in microphone that allows clear conversation in a
video call. Simply clip this 6-LED webcam onto a PC or laptop and start chatting
without downloading any drivers. The web camera also has six lights that
automatically switch on in the dark, as well as 16 special effects and 10
background frames.
This USB webcam with mic has a high-speed USB 2.0 interface. The webcam also
offers good camera resolution and comes with AWB (automatic white balance) so that
images look clear and natural. The Quantum 25MP night-vision webcam has advanced
features such as brightness and sharpness control that help produce the expected
high-quality image output, and the incorporated CMOS sensor renders images with
high quality. Other features: 10x digital zoom, six white lights, built-in
sensitive microphone, snapshot mode for taking still pictures, adjustable
brightness, sharpness, and colour, and anti-flicker.
4.3.2 RASPBERRY PI 3 MODEL B (BCM2837):
The Raspberry Pi is an ARM-based (built using the RISC architecture),
credit-card-sized single-board computer.
Fig: 4.7 Raspberry Pi 3B Model
Dimensions: 85 x 56 x 17 mm
GPIO: 40 pins
Ethernet: 10/100
Storage: Micro-SD
4.3.4 POWER SUPPLY SYSTEM:
The power supply is designed to convert high-voltage AC mains electricity to a
suitable low-voltage supply for electronic circuits and other devices. A power
supply can be broken down into a series of blocks, each of which performs a
particular function. A DC power supply which maintains the output voltage constant
irrespective of AC mains fluctuations or load variations is known as a "regulated
DC power supply". The functional block diagram of the power supply is shown in the
figure below.
Transformer:
A transformer converts AC electricity from one voltage to another with little loss
of power. There is no electrical connection between the two coils; instead they
are linked by an alternating magnetic field created in the soft-iron core of the
transformer. The two lines in the middle of the circuit symbol represent the core.
An electrical transformer is shown in figure 4.10 below. Transformers waste very
little power, so the power out is (almost) equal to the power in. Note that as
voltage is stepped down, current is stepped up. The ratio of the number of turns
on each coil, called the turns ratio, determines the ratio of the voltages. A
step-down transformer has a large number of turns on its primary (input) coil,
which is connected to the high-voltage mains supply, and a small number of turns
on its secondary (output) coil, to give a low output voltage.
Vp / Vs = Np / Ns = Is / Ip
where Np, Ns = number of turns on the primary and secondary coils, Vp, Vs = primary
and secondary voltages, and Ip, Is = primary (input) and secondary (output)
currents.
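A short worked example of this relation, with assumed winding counts for a 230 V
to 12 V step-down transformer:

Vp, Np, Ns = 230.0, 1150, 60        # assumed primary voltage and turns
Vs = Vp * Ns / Np                   # Vs / Vp = Ns / Np  ->  12.0 V
Ip = 0.1                            # assumed primary current in amps
Is = Ip * Np / Ns                   # current steps up: about 1.92 A
print(f"Vs = {Vs:.1f} V, Is = {Is:.2f} A")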
4.3.5 RECTIFIER:
A circuit which is used to convert AC to DC is known as a RECTIFIER. The process
of converting AC to DC is called "rectification".
TYPES OF RECTIFIERS
Full-wave Rectifier:
From the comparison of rectifier circuits given below, the full-wave bridge
rectifier has more advantages than the other two rectifiers, so in our project we
are using a full-wave bridge rectifier circuit.
Bridge Rectifier:
A bridge rectifier makes use of four diodes in a bridge arrangement to achieve
full-wave rectification. This is a widely used configuration, both with individual
diodes wired as shown and with single component bridges where the diode bridge is
wired internally.
Comparison of rectifier circuits:
Parameter                              Half-wave   Full-wave   Bridge
Number of diodes                       1           2           4
PIV of diodes                          Vm          2Vm         Vm
Ripple frequency                       f           2f          2f
Rectification efficiency               0.406       0.812       0.812
Transformer utilization factor (TUF)   0.287       0.693       0.812
RMS voltage Vrms                       Vm/2        Vm/√2       Vm/√2
Fig: 4.12 Operation of a forward-biased bridge rectifier
During the negative half cycle of the secondary voltage, diodes D1 and D4 are
forward biased while D2 and D3 are reverse biased; the current flow direction is
shown with dotted arrows. The operation of the reverse-biased bridge rectifier is
shown in the figure below:
We can observe that in both cases the load current flows in the same direction,
i.e., from top to bottom as shown in the figure, so it is unidirectional, which
means DC current. Thus, by using a bridge rectifier, the AC input is converted
into DC. The output of the bridge rectifier is pulsating in nature; producing pure
DC requires an additional filter such as a capacitor. The same operation applies
to other bridge rectifiers, but in the case of controlled rectifiers, thyristor
triggering is necessary to drive the load current.
4.3.6 Filter:
A filter is a device which removes the AC component of the rectifier output but
allows the DC component to reach the load.
Capacitor Filter:
We have seen that the ripple content in the rectified output of a half-wave
rectifier is 121%, and that of a full-wave or bridge rectifier is 48%; such high
percentages of ripple are not acceptable for most applications. Ripples can be
removed by the following methods of filtering. A capacitor in parallel with the
load provides an easy bypass path for the ripple voltage, due to its low impedance
at the ripple frequency, while leaving the DC to appear across the load. An
inductor in series with the load blocks the ripple current (due to its high
impedance at the ripple frequency) while allowing the DC to pass (due to its low
DC resistance). Various combinations of capacitors and inductors, such as the
L-section filter, the π-section filter, and multiple-section filters, make use of
both properties. Two cases of the capacitor filter are considered: one applied to
a half-wave rectifier and another to a full-wave rectifier.
Filtering is performed by a large value electrolytic capacitor connected across
the DC supply to act as a reservoir, supplying current to the output when the varying
DC voltage from the rectifier is falling. The capacitor charges quickly near the peak of
the varying DC, and then discharges as it supplies current to the output. Filtering
significantly increases the average DC voltage to almost the peak value (1.4 × RMS
value).
To calculate the value of the capacitor C:
C = 1 / (4√3 × f × r × RL)
where
f = supply frequency,
r = ripple factor,
RL = load resistance.
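A worked example with assumed values (50 Hz mains, a 5% ripple target, and a
500-ohm load):

from math import sqrt

f, r, RL = 50.0, 0.05, 500.0
C = 1 / (4 * sqrt(3) * f * r * RL)       # farads
print(f"C = {C * 1e6:.0f} microfarads")  # about 115 uF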
4.3.7 Regulator:
Voltage regulator ICs are available with fixed (typically 5, 12 and 15 V) or
variable output voltages. They are also rated by the maximum current they can
pass. Negative voltage regulators are available, mainly for use in dual supplies.
Most regulators include some automatic protection from excessive current
("overload protection") and overheating ("thermal protection"). Many of the fixed
voltage regulator ICs have three leads and look like power transistors, such as
the 7805 +5V 1A regulator shown on the right. A 3-terminal voltage regulator is
shown in the figure below.
The LM7805 is simple to use: connect the positive lead of your unregulated DC
power supply (anything from 9 VDC to 24 VDC) to the Input pin, connect the
negative lead to the Common pin, and when you turn on the power you get a 5-volt
supply from the Output pin.
• No external components required.
• Output voltages: 5.0 V, 6 V, 8 V, 9 V, 10 V, 12 V, 15 V, 18 V, 24 V.
5. SOFTWARE DESCRIPTION
Evolution of UNIX:
Between 1963 and 1969, Bell Laboratories' Computing Sciences Research Center was
developing a mainframe time-sharing operating system called Multics, based around
the concept of a single-level memory. Bell Labs quit financing the Multics
project, yet a group of programmers, including Dennis Ritchie and Ken Thompson,
kept working with the Multics project's principles, from which UNIX was produced
in 1969.
The first edition of UNIX was released by Dennis Ritchie on November 3, 1971, and
it shipped with a set of software commands.
Linux is a free and open-source operating system created by Linus Torvalds in
1991. An operating system is simply a collection of software that manages hardware
resources and provides an environment where applications can run; it also allows
applications to store information, send documents to printers, and so on. The
kernel is the core of the Linux operating system; it runs on numerous different
platforms, including Alpha and Intel, and the Linux kernel is available under the
GNU GPL.
The Linux kernel is an interface between the hardware and the software; however,
to have a fully useful operating system, a system requires libraries, a graphical
UI, web browsers, and other programs in addition to the kernel. As Linux is
open-source software, it is free to use, copy, study, and change as required by
developers. This has led to the rise of Linux distributions: a Linux distribution
is the combination of the Linux kernel and other software that together make an
operating system.
Linux is used by millions of users around the globe. It runs on various hardware
platforms, from dedicated networking devices to phones to PCs and even
supercomputers. Linux is widely used for server applications, which means it can
host websites, act as a file server, and run database software.
Some Linux distributions:
1. CentOS
2. Chrome OS
3. Debian
4. Fedora
5. Raspbian
6. Red Hat Linux
7. Ubuntu
8. Arch Linux
Raspbian Operating System:
Raspbian is a computer operating system for the Raspberry Pi, developed on the
basis of the Linux kernel and optimized to work on Raspberry Pi hardware. Raspbian
was created by Mike Thompson and Peter Green in June 2012. The Raspbian operating
system is still under active development, and its most recent version is Stretch,
released in August 2017. By default, the Raspbian operating system comes with a
full graphical user interface and the most useful utility software installed.
NOOBS:
NOOBS is an operating system installer which comes preloaded with the Raspbian and
LibreELEC operating systems; the user can select either one to install on the
Raspberry Pi. Apart from these two operating systems, NOOBS can download other
operating systems over the network and install them on the Raspberry Pi.
NOOBS Lite:
NOOBS Lite is an operating system installer which does not come with preloaded
operating systems; instead it lets the user select and install any of the
available operating systems supported by the Raspberry Pi.
Fig: 5.1 NOOBS software on the Raspberry Pi official website
Step 1: After downloading NOOBS/NOOBS Lite, extract the zipped file (preferably
using 7-Zip).
Step 2: Format the micro SD card using File Explorer or any formatting software.
Step 3: Copy the extracted NOOBS/NOOBS Lite files onto the formatted micro SD
card.
Step 4: Put the micro SD card containing the extracted NOOBS/NOOBS Lite files into
the Raspberry Pi and boot it up. After booting, the Raspberry Pi shows a menu that
lets you choose which operating system you would like to install; choose your
preferred operating system and click Install. The Raspberry Pi will install that
operating system, complete the boot, and load it.
Fig: 5.4 Raspbian operating system on the Raspberry Pi official website
Insert the micro SD card into the Raspberry Pi and boot it up; after booting
completes, the Raspberry Pi will load the Raspbian operating system.
Introduction to Python
Python is a high-level, general-purpose, interpreted, object-oriented programming
language. Python is also open-source software which can be used without purchasing
any licence. The development of Python was started by Guido van Rossum in 1989,
and he made the code public in February 1991.
Python runs on different operating systems, including Windows, macOS, Linux, the
Raspberry Pi, and many others. Python's syntax is similar to the English language,
which makes it more readable than other programming languages. Python can be used
in a functional, procedural, or object-oriented programming paradigm.
Python versions 2 and 3 have significant differences, which should be kept in mind
when working with the language. Python uses a new line to complete a programming
statement, whereas other programming languages use parentheses or semicolons. In
Python, scope is determined by indentation and whitespace for classes, functions,
loops, etc.
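A short example of those two points: statements end at the newline, and
indentation alone marks the scope of functions, conditionals, and loops.

def classify(readings):
    labels = []
    for value in readings:          # loop body is marked by indentation
        if value > 0.5:
            labels.append("bright")
        else:
            labels.append("dark")
    return labels

print(classify([0.2, 0.9, 0.7]))    # ['dark', 'bright', 'bright']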
The Python programming language is designed for readability, and its syntax is
similar to the English language.
5.6 QT creator:
Qt Creator is a cross-platform C++, JavaScript and QML integrated
development environment which is part of the SDK for the Qt GUI application
development framework. It includes a visual debugger and an integrated GUI layout
and forms designer. The editor's features include syntax highlighting and auto
completion. Qt Creator uses the C++ compiler from the GNU Compiler Collection on
Linux and FreeBSD. On Windows it can use MinGW or MSVC with the default
install and can also use Microsoft Console Debugger when compiled from source
code. Clang is also supported.
Development of what would eventually become Qt Creator had begun by 2007 or
earlier under the transitional names Workbench and later Project Greenhouse. It
debuted during the later part of the Qt 4 era, starting with the release of Qt
Creator 1.0 in March 2009, and was subsequently bundled with Qt 4.5 in SDK 2009.3.
This was at a time when the standalone Qt Designer application was still the
widget layout tool of choice for developers. There is no indication that Creator
had layout capability at this stage. The record is somewhat muddied on this point
(perhaps due to changes in ownership or the emphasis on Qt Quick), but the
integration of Qt Designer into Qt Creator is first mentioned at least as early as
Qt 4.7 (ca. late 2011). Currently (in the Qt 5 era) it is simply stated that
"[Qt Designer's] functionality is now included as part of Qt Creator IDE."
Qt Creator also provides an internal JavaScript debugger.
5.7 VNC (Virtual Network Computing):
1. The VNC server is the program on the machine that shares some screen (it may
not be related to a physical display; the server can be "headless"), and allows
the client to share control of it.
2. The VNC client (or viewer) is the program that presents the screen data
originating from the server, receives updates from it, and presumably controls it
by informing the server of collected local input.
3. The VNC protocol (RFB protocol) is very simple, based on transmitting one
graphic primitive from server to client ("put a rectangle of pixel data at the
specified X,Y position") and event messages from client to server.
4. The VNC viewer acts as the interface between the Raspberry Pi and the user for
modifying the software configuration of the Raspberry Pi.
Fig: 5.8 VNC Viewer
5.8 Hardware:
The Raspberry Pi hardware has evolved through several versions that feature
variations in memory capacity and peripheral-device support.
This block diagram describes Model B and B+; Model A, A+, and the Pi Zero
are similar, but lack the Ethernet and USB hub components. The Ethernet adapter is
internally connected to an additional USB port. In Model A, A+, and the Pi Zero, the
USB port is connected directly to the system on a chip (SoC). On the Pi 1 Model B+
and later models the USB/Ethernet chip contains a five-port USB hub, of which four
ports are available, while the Pi 1 Model B only provides two. On the Pi Zero, the
USB port is also connected directly to the SoC, but it uses a micro USB (OTG) port.
Processor:
Fig: 5.10 The Raspberry Pi 2B uses a 32-bit 900 MHz quad-core ARM Cortex-A7
processor.
The graphics capabilities of the first-generation Raspberry Pi are roughly
equivalent to the performance of the Xbox of 2001.
Raspberry Pi 2 V1.1 included a quad-core Cortex-A7 CPU running at 900 MHz
and 1 GB RAM. It was described as 4–6 times more powerful than its predecessor.
The GPU was identical to the original. In parallelized benchmarks, the Raspberry Pi 2
V1.1 could be up to 14 times faster than a Raspberry Pi 1 Model B+.
The Raspberry Pi 3, with a quad-core ARM Cortex-A53 processor, is described
as having ten times the performance of a Raspberry Pi 1. This was suggested to be
highly dependent upon task threading and instruction set use. Benchmarks showed the
Raspberry Pi 3 to be approximately 80% faster than the Raspberry Pi 2
in parallelized tasks.
5.9 Overclocking:
Most Raspberry Pi systems-on-chip could be overclocked to 800 MHz, and some to
1000 MHz. There are reports that the Raspberry Pi 2 can be similarly overclocked,
in extreme cases even to 1500 MHz (discarding all safety features and over-voltage
limitations). In the Raspbian Linux distro, overclocking can be configured at boot
by running the command "sudo raspi-config", without voiding the warranty. In
those cases the Pi automatically shuts the overclocking down if the
chip temperature reaches 85 °C (185 °F), but it is possible to override automatic over-
voltage and overclocking settings (voiding the warranty); an appropriately sized heat
sink is needed to protect the chip from serious overheating.
Newer versions of the firmware contain the option to choose between several
overclock ("turbo") presets that when used, attempt to maximise the performance of
the SOC without impairing the lifetime of the board. This is done by monitoring the
core temperature of the chip and the CPU load, and dynamically adjusting clock
speeds and the core voltage. When the demand is low on the CPU or it is running too
hot the performance is throttled, but if the CPU has much to do and the chip's
temperature is acceptable, performance is temporarily increased with clock speeds of
up to 1 GHz, depending on the board version and on which of the turbo settings is
used.
The seven overclock presets are:
none: 700 MHz ARM, 250 MHz core, 400 MHz SDRAM, 0 overvolting;
modest: 800 MHz ARM, 250 MHz core, 400 MHz SDRAM, 0 overvolting;
medium: 900 MHz ARM, 250 MHz core, 450 MHz SDRAM, 2 overvolting;
high: 950 MHz ARM, 250 MHz core, 450 MHz SDRAM, 6 overvolting;
turbo: 1000 MHz ARM, 500 MHz core, 600 MHz SDRAM, 6 overvolting;
Pi 2: 1000 MHz ARM, 500 MHz core, 500 MHz SDRAM, 2 overvolting;
Pi 3: 1100 MHz ARM, 550 MHz core, 500 MHz SDRAM, 6 overvolting. In system
information the CPU speed will appear as 1200 MHz; when idling, the speed lowers
to 600 MHz.
In the highest (turbo) preset the SDRAM clock was originally 500 MHz, but
this was later changed to 600 MHz because 500 MHz sometimes causes SD card
corruption. Simultaneously in high mode the core clock speed was lowered from 450
to 250 MHz, and in medium mode from 333 to 250 MHz. The Raspberry Pi Zero runs
at 1 GHz.
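As a concrete illustration, the "medium" preset above corresponds roughly to the
following entries in /boot/config.txt (the key names are the standard Raspberry Pi
firmware options; treat the exact values as an assumption to verify against your
board's documentation):

# /boot/config.txt - approximate "medium" overclock preset (assumed values)
arm_freq=900
core_freq=250
sdram_freq=450
over_voltage=2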
The CPU on the first and second generation Raspberry Pi board did not
require cooling, such as a heat sink or fan, even when overclocked, but the Raspberry
Pi 3 may generate more heat when overclocked.
5.10 RAM:
On the older beta Model B boards, 128 MB was allocated by default to the
GPU, leaving 128 MB for the CPU. On the first 256 MB release Model B (and
Model A), three different splits were possible. The default split was 192 MB (RAM
for CPU), which should be sufficient for standalone 1080p video decoding, or for
simple 3D, but probably not for both together. 224 MB was for Linux only, with only
a 1080p frame buffer, and was likely to fail for any video or 3D. 128 MB was for
heavy 3D, possibly also with video decoding (e.g. XBMC). Comparatively, the Nokia
701 uses 128 MB for the Broadcom VideoCore IV.
For the later Model B with 512 MB RAM, new standard memory split files
(arm256_start.elf, arm384_start.elf, arm496_start.elf) were initially released for
256 MB, 384 MB and 496 MB CPU RAM (and 256 MB, 128 MB and 16 MB video RAM)
respectively. But a week or so later the RPF released a new version of start.elf
that could read a new entry in config.txt (gpu_mem=xx) and could dynamically
assign an amount of RAM (from 16 to 256 MB in 8 MB steps) to the GPU, so the older
method of memory splits became obsolete, and a single start.elf worked the same
for 256 MB and 512 MB Raspberry Pis.
The Raspberry Pi 2 and the Raspberry Pi 3 have 1 GB of RAM. The Raspberry Pi Zero
and Zero W have 512 MB of RAM.
5.11 Networking:
The Model A, A+ and Pi Zero have no Ethernet circuitry and are commonly connected
to a network using an external user-supplied USB Ethernet or Wi-Fi adapter. On the
Model B and B+ the Ethernet port is provided by a built-in USB Ethernet adapter
using the SMSC LAN9514 chip. The Pi 3 and Pi Zero W (wireless) are equipped with
2.4 GHz WiFi 802.11n (150 Mbit/s) and Bluetooth 4.1 (24 Mbit/s) based on the
Broadcom BCM43438 FullMAC chip, with no official support for monitor mode but
implemented through unofficial firmware patching; the Pi 3 also has a 10/100
Mbit/s Ethernet port. The Raspberry Pi 3 B+ features dual-band IEEE 802.11b/g/n/ac
WiFi, Bluetooth 4.2, and Gigabit Ethernet (limited to approximately 300 Mbit/s by
the USB 2.0 bus between it and the SoC).
Peripherals:
Fig: 5.11 The Model 2B boards incorporate four USB ports for connecting
peripherals.
The Raspberry Pi may be operated with any generic USB computer
keyboard and mouse. It may also be used with USB storage, USB to MIDI converters,
and virtually any other device/component with USB capabilities. Other peripherals
can be attached through the various pins and connectors on the surface of the
Raspberry Pi.
5.12 Video:
Fig: 5.12 The early Raspberry Pi 1 Model A, with an HDMI port and a
standard RCA composite video port for older displays
The video controller can generate standard modern TV resolutions, such as
HD and Full HD, and higher or lower monitor resolutions as well as older NTSC or
PAL standard CRT TV resolutions. As shipped (i.e., without custom overclocking) it
can support the following resolutions: 640×350 EGA; 640×480 VGA; 800×600 SVGA;
1024×768 XGA; 1280×720 720p HDTV; 1280×768 WXGA variant;
1280×800 WXGA variant; 1280×1024 SXGA; 1366×768 WXGA variant;
1400×1050 SXGA+; 1600×1200 UXGA;
1680×1050 WXGA+; 1920×1080 1080p HDTV; 1920×1200 WUXGA.
Higher resolutions, up to 2048×1152, may work or even 3840×2160 at 15 Hz
(too low a frame rate for convincing video). Note also that allowing the highest
resolutions does not imply that the GPU can decode video formats at these
resolutions; in fact, the Pis are known to not work reliably for H.265 (at those high
resolutions), commonly used for very high resolutions (however, most common
formats up to Full HD do work).
Although the Raspberry Pi 3 does not have H.265 decoding hardware, the
CPU is more powerful than its predecessors, potentially fast enough to allow the
decoding of H.265-encoded videos in software. The GPU in the Raspberry Pi 3 runs
at higher clock frequencies of 300 MHz or 400 MHz, compared to previous versions
which ran at 250 MHz.
The Raspberry Pis can also generate 576i and 480i composite video signals, as
used on old-style (CRT) TV screens and less-expensive monitors through standard
connectors – either RCA or 3.5 mm phone connector depending on model. The
television signal standards supported are PAL-BGHID, PAL-M, PAL-
N, NTSC and NTSC-J.
Fig: 5.14 Raspberry Pi 3 Model B
The sub-images have to be cropped tightly to the border of the character in order
to standardize them. The image standardization is done by finding the maximum row
and column containing 1s; starting from that peak point, a counter is increased
and decreased until white space (a line with all 0s) is met. This technique is
shown in the figure below, where a character "S" is being cropped and resized.
After cropping, the pre-processing continues with another resize to meet the
network input requirement of 5 by 7 matrices, where a value of 1 is assigned to
every pixel whose 10 by 10 box is filled with 1s. Finally, the 5 by 7 matrices are
concatenated into a stream so that they can be fed into the network's 35 inputs.
The input to the network is actually the negative image of the figure, where the
input range is 0 to 1, with 0 representing black and 1 representing white, while
values in between show the intensity of the relevant pixel. By this, we are able
to extract the character and pass it on for future classification or training
purposes on the neural network.
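A sketch of that standardization step is shown below: crop a binary character
image to the tight bounding box of its 1-pixels, resize to the 5 x 7 network
input, and flatten to a 35-value stream. The file name is a placeholder.

import cv2
import numpy as np

char = (cv2.imread("s.png", cv2.IMREAD_GRAYSCALE) > 127).astype(np.uint8)

ys, xs = np.nonzero(char)                     # rows/cols that contain 1s
cropped = char[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

small = cv2.resize(cropped, (5, 7), interpolation=cv2.INTER_AREA)
inputs = (small > 0).astype(float).flatten()  # 35 inputs for the network
print(inputs)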
6. RESULTS AND DISCUSSION
6.1 ADVANTAGES:
1. Delivers best-in-class performance that significantly outperforms other
solutions in multiple domains, including speech, language, vision, and game
playing.
2. Reduces the need for feature engineering, one of the most time-consuming parts
of machine learning practice.
6.2 APPLICATIONS:
1. Detecting intruders in a bank locker room and other sensitive areas of a bank
where people can only enter with permission.
2. In a home backyard at night, to notify the owner if an intruder enters.
3. In museums, to secure places where humans are not allowed to enter.
4. Inside a home at night, to catch an intruder.
Comparison:
The image processed on the MATLAB platform is now compared with the images that
have already been fed into the database. When the captured image matches any of
the images in the database, the corresponding text is output as audio.
Voice Module:
This is an enhanced 8-channel recordable voice module. Each channel can hold up to
1 minute of recorded voice and/or music. The built-in microphone and
push-to-record button make recording easy and instant. Connection to external
amplifiers, audio equipment, and paging systems is made through a line-level
output jack.
7. CONCLUSION AND FUTURE SCOPE
7.1 CONCLUSION
8. REFERENCES
1. K.V. Sriram and R.H. Havaldar, "Human Detection and Tracking in Video
Surveillance".
2. Virginia Menezes, Vamsikrishna Patchava and M. Surya Deekshith Gupta,
"Surveillance and Monitoring System Using Raspberry Pi and SimpleCV".
3. Hwajeong Seo, Jongseok Choi, Hyunjin Kim, Taehwan Park and Howon Kim.
4. A. Mahmoud and H. Farouk, "An Efficient Detection and Classification Method for
Landmine Types Based on IR Images Using Neural Network", International Journal of
Geology, Vol. 4, Iss. 4, 2010.
5. L. Robledo, M. Carrasco and D. Mery, "A Survey of Land Mine Detection
Technology", International Journal of Remote Sensing, Vol. 30, Iss. 9, 2009.
6. P. Rapillard and M. Walton, "Humanitarian and Developmental Impact of
Anti-vehicle Mines", Journal of Conventional Weapons Destruction, Vol. 18,
Iss. 3, 2014.
7. M. Kale, V. Ratnaparkhe and A. Bhalchandra, "Sensors for Landmine Detection and
Techniques: A Review", International Journal of Engineering Research & Technology,
Vol. 2, Iss. 1, 2013.
8. A. Patri, A. Nayak and S. Jayanthu, "Wireless Communication Systems for
Underground Mines - A Critical Appraisal", International Journal of Engineering
Trends and Technology (IJETT), Vol. 4, Iss. 7, 2013.
9. J. Bharath, "Automatic Land Mine Detection and Sweeper Robot Using
Microcontroller", International Journal of Mechanical Engineering and Robotics
Research, Vol. 4, No. 1, Jan. 2015.
10. Raspberry Pi, https://www.raspberrypi.org/, Mar. 30, 2017.