Input Devices
CHAPTER OUTLINE
Introduction
Types of Input Devices
Keyboard
Pointing Devices
Speech Recognition System
Digital Camera
Webcam
Scanners
Optical Character Recognition
Optical Mark Recognition
Magnetic-ink Character Recognition
Bar Code Reader
6.1 INTRODUCTION
In computer terminology, a device is a unit of hardware that is capable of providing input to
the computer, receiving output from it, or both. The term ‘device’ covers keyboards, mice,
display monitors, printers and other hardware units. Input devices are the hardware that
allows the user to feed information into a computer, and so they provide a means of
communication between the computer and the outside world.
An input device can be defined as an electromechanical device that allows the user to feed
data into the computer for analysis and storage, and to give commands to the computer. The
data and instructions are entered into the main memory of the computer through an input
device. The input device captures information and translates it into a form that can be
processed and used by other parts of the computer. After processing the input data, the
computer provides the results with the help of output devices, which are discussed in
Chapter 7. This chapter explains the types of input devices and their uses.
6.2 TYPES OF INPUT DEVICES
The computer accepts input in two ways: manually and directly. In the case of manual data
entry, the user enters the data into the computer by hand, for example, by using the keyboard
and the mouse. In direct data entry, information is transferred automatically from a source
document (for example, from a cheque using MICR) into the computer, so the user need not
enter the data manually. Direct data entry is accomplished by using special devices like a
barcode reader. Some of the commonly used input devices are the keyboard, pointing devices
like the mouse and joystick, speech recognition systems, digital cameras, scanners and so on.
6.2.1 Keyboard
A keyboard is the most common data entry device. Using a keyboard, the user can type text
and commands. The keyboard is designed like a regular typewriter with a few additional keys.
The data are entered into the computer by using various keys. There are different types of
keyboard layouts such as QWERTY, DVORAK and AZERTY, but the most common layout is
the QWERTY. It is named so because the first six keys on the top row of letters are Q, W, E, R,
T and Y. The layout of the keyboard has changed very little since it was introduced. In fact,
the most common change in its technology has simply been the natural evolution of adding
more keys that provide additional functionality. The number of keys on a typical keyboard
varies from 84 to 109. Portable computers such as laptops quite often have custom
keyboards that have slightly different key arrangements than a standard keyboard. In
addition, many system manufacturers add special buttons to the standard layout (Figure
6.1).
Figure 6.1 Keyboard
According to the QWERTY layout, the keys are categorized under the following groups:
Alphanumeric Keys: The alphanumeric keys include the number and letter keys (1, 2, A, B and
so on), which are generally laid out in the same style as in typewriters.
Numeric Keypad: The numeric keypad is located on the right-hand side of the keyboard. It
is like a calculator keypad, consisting of digits and mathematical operators. The same keys
are also available in the set of alphanumeric keys, but the separate numeric keypad allows
fast entry of numeric data.
Function Keys: The function keys (e.g. F1, F2 and F3) are arranged in a row along the top of
the keyboard. The functions of these keys depend on the program in which they are being
used. For example, most Microsoft programs use F1 to display help.
Cursor-movement Keys: The cursor-movement keys include four directional arrow keys
that are arranged in an inverted T formation between the alphanumeric keys and the numeric
keypad. These keys allow the user to move the cursor one space at a time in left or right
direction and one line at a time in up or down direction.
Other Keys: Apart from the above-mentioned keys, a keyboard contains some other keys
such as Enter, Shift, Caps Lock, Num Lock, Spacebar, Tab, Print Screen, Home, End, Insert,
Delete, Page Up, Page Down, Control (Ctrl), Alternate (Alt) and Escape (Esc). The Windows
keyboard also has two Windows or Start keys and an application key (Table 6.1).
Note: The Caps Lock and Num Lock keys are called ‘toggle’ keys because, when pressed, they
change their status from one state to another. In simple words, when a toggle key is
pressed, it changes its status from ON to OFF or vice versa. Moreover, unlike the Shift
key, a toggle key need not be held down.
6.2.2 Pointing Devices
Most computers have alphanumeric keyboards, but in some applications the keyboard is not
convenient. For example, if the user wants to select an item from a list, the user can
identify the item and move to it by using the keyboard. However, this action can be
performed more quickly by pointing at the correct position. A pointing device is used to
communicate with the computer by pointing to locations on the monitor. Such devices do not
require keying of characters; instead, the user moves a cursor on the screen and performs
move, click or drag operations. Some of the commonly used pointing devices are the mouse,
trackball, joystick, light pen, touch screen and trackpad.
6.2.2.1 Mouse
A mouse allows the user to create graphic elements on the screen such as lines, curves and
freehand shapes. Since it is an intuitive device, it is easier and more convenient to work
with than the keyboard. Like a keyboard, it is usually supplied with a computer; therefore,
no additional cost is incurred. However, it needs a flat space close to the computer. The
mouse cannot easily be used with laptop (notebook) or palmtop computers. These types of
computers need a trackball or a touch-sensitive pad called a touchpad.
6.2.2.1.1 Common Mouse Actions
Pointing: Pointing means moving the mouse pointer to position it on an object such as an icon
or a menu item on the screen.
Click: The action of pressing down a mouse button (usually the left one) and releasing it is
known as a click. The term comes from the fact that pressing and releasing most mouse
buttons makes a clicking sound.
Right-click: Clicking the right mouse button is known as a right-click. In Microsoft
Windows, right-clicking often produces a ‘pop-up’ menu that, depending upon the object
selected, offers options to open a program, cut or copy, create a shortcut or display the
properties of the selected object.
Double-click: Double-click refers to the action of clicking the mouse button twice in rapid
succession without moving the mouse between the clicks. Double-clicking is used to
perform an action such as starting an application or opening a folder.
Drag and Drop: It refers to the action of clicking and holding down the mouse button while
moving the mouse (drag), and then releasing the mouse button (drop). It is used to move the
object (e.g. a file) or selected text to a new position. If the mouse has several buttons, use
the leftmost button unless otherwise instructed (Figure 6.4).
Figure 6.4 Common Mouse Actions
6.2.2.1.2 Working of a Mouse
A mechanical mouse has a rubber ball at the bottom. When the user moves the mouse along
the flat surface, the ball rolls. The distance, direction and speed of motion of the ball are
tracked. These data are used by the computer to position the mouse pointer on the screen.
There are three rollers inside the mouse. One of them, which is mounted at an angle of 45°
to the other two, is spring loaded. This roller is usually the smallest of the three, and its
job is to hold the ball against the other two rollers. The other two rollers are usually
larger and have different colours. These rollers are mounted at an angle of 90° to each
other; one roller measures how fast the ball is turning horizontally, and the other measures
how fast it is turning vertically. When the ball rolls, it turns these two rollers. The
rollers are connected to axles, and each axle is connected to a small sensor that measures
its speed.
Both sets of information are passed to the electronics inside the mouse. This little processor,
usually consisting of little more than a single chip, uses the information to determine the
speed of the movement of the mouse and its direction. This information is passed to the
computer via a mouse cord, where the operating system then moves the pointer accordingly.
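The way the two roller readings combine into a single speed and direction can be sketched in a few lines of Python. This is only an illustration: the function name `pointer_motion` and the units of the readings are assumptions, not details of any real mouse.

```python
import math

def pointer_motion(horizontal_rate, vertical_rate):
    """Combine the two roller readings into a speed and a direction.

    horizontal_rate and vertical_rate are signed roller speeds in
    hypothetical units (for example, counts per second). Positive values
    are taken to mean 'right' and 'up' respectively.
    """
    # The two rollers are at 90 degrees to each other, so the overall
    # speed is the length of the (horizontal, vertical) vector.
    speed = math.hypot(horizontal_rate, vertical_rate)
    # The direction is the angle of that vector, measured from 'right'.
    direction_deg = math.degrees(math.atan2(vertical_rate, horizontal_rate))
    return speed, direction_deg

speed, direction = pointer_motion(3.0, 4.0)
print(speed)                 # 5.0
print(round(direction, 1))   # 53.1
```

The operating system would use values of this kind to decide how far, and in which direction, to move the pointer.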
The optical mouse uses infrared light, and early designs required special mouse pads with
fine grid lines. In a mouse that senses the rollers optically, each axle is connected to a
small photo-interrupter wheel with a number of tiny holes in it. There is a light source on
one side of this wheel and a light sensor on the other side. As the wheel turns, the light
flashes through the holes in the wheel. By measuring the frequency of these flashes, the
light sensor can measure the speed of the wheel, and the mouse sends the corresponding
coordinates to the computer. The computer moves the cursor on the screen based on the
coordinates received from the mouse. This happens hundreds of times each second, making
the cursor movement very smooth (Figure 6.5).
Figure 6.5 Inside a Mechanical Mouse
6.2.2.2 Trackball
A trackball is another pointing device that resembles a ball nestled in a square cradle and
serves as an alternative to a mouse. In general, the trackball is like a mouse turned upside
down. It has a ball, which can be rotated by fingers in any direction, and the cursor moves
accordingly. The size of the ball in the trackball varies from as large as a cue ball to as small
as a marble. Since it is a static device, the ball on the top is moved by using the fingers,
thumbs and palms instead of rolling the whole device across the table. This pointing device
comes in various shapes and forms but with the same functions. The three commonly used
shapes are a ball, a button and a square (Figure 6.6).
Like the mouse, the trackball is also used to control cursor movements and the actions on a
computer screen. The cursor is activated when buttons on the device are pressed. However,
the trackball remains stationary on the surface, and only the ball is moved with the fingers
or palm. By moving just the fingers and not the entire arm, the user can get more precision
and accuracy. That is why many graphic designers and gamers choose to use a trackball
instead of a mouse. In addition, since the whole device is not moved for moving the graphic
cursor, a trackball requires less space than a mouse for operation. Generally, the trackball
tends to have more buttons. A lot of computer game enthusiasts and graphic designers also
tend to choose to have more buttons to cut down on keyboard use. These extra buttons can
also be reprogrammed to suit the functions as per the requirement. Normally, the trackballs
are not supplied as standard, so an additional cost is always incurred. Moreover, a user
needs a thorough knowledge about them for efficient use.
6.2.2.2.1 Working of a Trackball
A trackball works in the same way as a mouse, with the ball turning rollers and the rollers
turning axles, which are in turn connected to either mechanical or optical sensors that
measure their rotation. As shown in Figure 6.7, a trackball consists of a number of
components. As one moves the trackball, it starts a chain of events inside the box that
results in the pointer moving on the computer screen. In a normal trackball, there is a pair of
light-emitting diodes (LEDs) on one side of each encoder wheel, which emit infrared light.
On the opposite side of the wheel from each pair of LEDs, there is a light sensor. Every time
the light from the LEDs shines through a hole in the encoder wheel, a pulse of electricity is
sent from the light sensor to the microprocessor. When the trackball rolls side-to-side, the
horizontal (x-axis) shaft rotates, spinning the attached encoder wheel. Similarly, when the
trackball is rolled up and down, the vertical (y-axis) shaft rotates, spinning the attached
encoder wheel. Due to this spinning, the light blinks, which is detected by the light sensor.
The microprocessor counts the number of times the light sensors detect light each second and
sends this information to the computer along the cord.
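The counting step performed by the microprocessor can be illustrated with a short Python sketch. It assumes that the light sensor is read at regular intervals and that each reading is 1 (light detected) or 0 (light blocked); the function names and the sample data are hypothetical.

```python
def count_pulses(samples):
    """Count dark-to-light transitions in a list of 0/1 sensor samples."""
    pulses = 0
    previous = 0
    for sample in samples:
        # A new pulse starts when the sensor goes from dark (0) to light (1).
        if sample == 1 and previous == 0:
            pulses += 1
        previous = sample
    return pulses

def pulses_per_second(samples, duration_s):
    """Pulse rate over the sampling window, which reflects the wheel speed."""
    return count_pulses(samples) / duration_s

# Hypothetical readings from the x-axis sensor over half a second.
x_samples = [0, 1, 1, 0, 1, 0, 0, 1]
print(count_pulses(x_samples))            # 3
print(pulses_per_second(x_samples, 0.5))  # 6.0
```

A faster roll of the ball produces more pulses in the same time, so a higher count corresponds to a faster pointer movement along that axis.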
6.2.2.3 Joystick
A joystick is a device that moves in all directions and controls the movement of the cursor.
The basic design of a joystick consists of a stick that is attached to a plastic base with a
flexible rubber sheath. This plastic base houses a circuit board beneath the stick. The
electronic circuitry measures the movement of the stick from its central position and sends
the information for processing. The joystick also consists of buttons, which can be
programmed to indicate certain actions once a position on the screen has been selected
using the stick. It offers three types of control—digital, glide and direct. Digital control allows
movement in a limited number of directions such as up, down, left and right. Glide and direct
control allow movements in all directions (360°). Direct control joysticks have the added
ability to respond to the distance and speed with which the user moves the stick (Figure 6.8).
A joystick is generally used to control the velocity of the screen cursor movement rather than
its absolute position. It is mostly used for computer games. The other applications are in
flight simulators, training simulators, CAD/CAM systems and for controlling industrial
robots.
6.2.2.3.1 Working of a Joystick
Nowadays, various joystick technologies are available and they differ mainly in the amount
of information they pass on. All joysticks are designed to inform the computer about the
positioning of the stick at any given time. This is done by providing the x–y coordinates of the
stick. The x-axis represents the side-to-side position and the y-axis represents the
forward-and-back position. The circuit board directly below the stick carries electricity
from one contact point to another. When the joystick is in the neutral position, all but one
of the individual circuits are broken. Each broken section is covered with a simple plastic
button containing a tiny metal
disc. When the stick is moved in any direction, it pushes down one of these buttons, pressing
the conductive metal disc against the circuit board. This closes the circuit, that is, it
completes the connection between the two wire sections. When the circuit is closed, the
electricity can flow down a wire from the computer and to another wire leading back to the
computer. When the computer picks up a charge on a particular wire, it knows that the
joystick is in the right position to complete that particular circuit. The joystick buttons work
exactly the same way. When a button is pressed down, it completes a circuit and the
computer recognizes a command (Figure 6.9).
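The idea that each closed circuit tells the computer where the stick is can be sketched as follows. The contact names and the coordinate convention (x to the right, y forward) are assumptions made for this illustration; the wiring of a real joystick may differ.

```python
# Hypothetical contacts under the stick, each with the x-y offset it signals.
CONTACT_OFFSETS = {
    "left": (-1, 0),
    "right": (1, 0),
    "forward": (0, 1),
    "back": (0, -1),
}

def stick_position(closed_contacts):
    """Return the (x, y) position implied by the set of closed circuits."""
    x = 0
    y = 0
    for name in closed_contacts:
        dx, dy = CONTACT_OFFSETS[name]
        x += dx
        y += dy
    return x, y

print(stick_position(set()))                 # (0, 0)  neutral position
print(stick_position({"right"}))             # (1, 0)
print(stick_position({"right", "forward"}))  # (1, 1)  diagonal
```

Because each contact is either open or closed, this corresponds to the digital type of control, which reports only a limited number of directions.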
6.2.2.4 Light Pen
A light pen (sometimes called a mouse pen) is a hand-held electro-optical pointing device
which, when touched to or aimed closely at a connected computer monitor, allows the
computer to determine the position of the pen on the screen. It facilitates drawing images
and selecting objects on the display screen by directly pointing to them. It is a pen-like
device, which is connected to the machine by a cable. Although named a light pen, it does
not actually emit light; instead, its light-sensitive diode senses the light coming from the
screen. The light coming from the screen causes the photocell to respond by generating a
pulse. This electric response is transmitted to the processor, which identifies the position
to which the light pen is pointing. Lines or images are drawn with the movement of the light
pen over the screen (Figure 6.10).
A light pen gives the user the full range of mouse capabilities without the use of a pad or
any horizontal surface. Using a light pen, the user can interact more easily with
applications in such modes as drag and drop or highlighting. It is used directly on the
monitor screen and does not require any special hand/eye coordination skills. Pushing the
light pen tip against the screen activates a switch, which allows the user to make menu
selections, draw and perform other input functions. Light pens are well suited to
applications where desk space is limited, to harsh workplace environments and to any
situation where fast and accurate input is desired. A light pen is very useful in identifying
a specific location on the screen. However, it does not provide any information when held
over a blank part of the screen. The light pen is economically priced and requires little or
no maintenance.
6.2.2.4.1 Working of a Light Pen
The light pen contains a lens and a photo detector located at its tip. When the electron beam
that sweeps the monitor strikes the phosphor within the field of view of the light pen, the light
emitted by the phosphor is focused through the lens and onto the photo detector. Hence,
the signal current is increased and is transmitted to the computer. The position of the beam
is tracked by the horizontal and vertical counters, which relay this information to a register.
This cycle is repeated for every frame produced by the electron beam. By noting when a scan
goes by and measuring the interval between the scan lines or entire screen refreshes, an
accurate position of the photo detector on the screen is determined. The light pen software
generates x–y vectors corresponding to a point on the screen, which may be used to make a
selection by activating a switch on the light pen (Figure 6.11).
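The timing calculation described above can be sketched in Python. The sketch assumes an idealized display in which the beam scans every line in the same time and there are no blanking intervals; the function name, the use of nanoseconds and the sample figures are all hypothetical.

```python
def pen_position(elapsed_ns, line_period_ns, pixels_per_line):
    """Estimate the (column, row) the pen is pointing at.

    elapsed_ns      time from the start of the frame until the pen
                    detected the beam, in nanoseconds
    line_period_ns  time the beam takes to sweep one scan line
    pixels_per_line number of pixels along one scan line
    """
    # Whole scan lines completed before the pulse give the row
    # (the role of the vertical counter).
    row = elapsed_ns // line_period_ns
    # The time left over within the current line gives the column
    # (the role of the horizontal counter).
    within_line_ns = elapsed_ns % line_period_ns
    column = within_line_ns * pixels_per_line // line_period_ns
    return column, row

# The pulse arrives 10.5 line periods after the frame started.
print(pen_position(672_000, 64_000, 640))  # (320, 10)
```

Integer arithmetic is used so that the example gives exact results; a real system would also have to allow for the time the beam spends returning to the start of each line.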
6.2.2.5 Touch Screen
A touch screen is a special kind of display screen device, which is placed on the computer
monitor to allow direct selection or activation of the computer when the user touches the
screen. Essentially, it registers the input when a finger or other object touches the screen.
The touch screen is normally used when information has to be accessed with minimum
effort. However, the touch screen is not suitable for input of large amounts of data.
Typically, touch screens are used in information-providing systems at hospitals, airline and
railway reservation counters, amusement parks and so on (Figure 6.12).
A basic touch screen has three main components: a touch sensor, a controller and a
software driver. The touch sensor/panel is a clear glass panel with a touch-responsive
surface. It is placed over a display screen so that the responsive area of the panel covers the
viewable area of the video screen. Currently, there are different types of touch sensor
technologies available, each using a different method to detect touch input: optical,
acoustical and electrical. In the optical method, infrared beams criss-cross the surface of
the screen and when a light beam is broken, that particular location is recorded. In the
acoustical method, ultrasonic acoustic waves pass over the surface of the screen and when
the wave signals are interrupted by some contact with the screen, the location is recorded.
In the electrical method, the panel has an electrical current going through it and touching
the screen causes a voltage change, which is used to determine the location of the touch on
the screen.
The controller connects the touch sensor and the computer. It takes information from the
touch sensor and translates it into information that a computer can understand. The driver
is a software update for the computer system that allows the touch screen and computer to
work together. It instructs the operating system how to interpret the touch event information
that is sent from the controller (Figure 6.13).
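As an illustration of the optical method described above, the following Python sketch locates a touch from the beams that are broken. It assumes one set of beams running down the columns of the screen and another running across the rows; the function name and the data layout are hypothetical.

```python
def locate_touch(column_beams, row_beams):
    """Return the (x, y) centre of the broken beams, or None if no touch.

    Each argument is a list of booleans, one per infrared beam,
    where True means that the beam is currently broken.
    """
    broken_columns = [i for i, broken in enumerate(column_beams) if broken]
    broken_rows = [i for i, broken in enumerate(row_beams) if broken]
    if not broken_columns or not broken_rows:
        return None  # nothing is touching the screen
    # A finger usually blocks more than one beam, so take the middle.
    x = sum(broken_columns) / len(broken_columns)
    y = sum(broken_rows) / len(broken_rows)
    return x, y

print(locate_touch([False, False, True, True, False],
                   [False, True, False]))  # (2.5, 1.0)
```

In a real touch screen, this kind of calculation would be the work of the controller, which then passes the location to the driver.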
6.2.2.6 Trackpad
A trackpad consists of several layers: the top layer is the rubber layer on which the user
moves the finger, and beneath this layer are two more layers, which consist of horizontal and
vertical rows of electrodes. The rows of electrodes do not touch each other; rather, they are
separated by a nonconductive, dielectric material. The layers of electrodes are charged up
(one with positive charge and another with negative charge) by an alternating current (AC)
and as a result, an electric field is created between them. The strength of the mutual
capacitance of the electric field is sampled by the integrated circuits to which the layers of
electrodes are connected.
As a finger approaches the top layer of the trackpad, its presence causes a change in
capacitance where the electrodes cross over. The capacitance is most affected at the
intersection of electrodes closest to the point where the centre of the finger is
touching. By reading the capacitances of the closest intersections, the trackpad identifies
the cursor position on the screen. These capacitances are measured about 100 times per
second. As the finger moves, the changes in the measurements are translated into
movement of the cursor on the screen.
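The search for the most affected intersection can be sketched in Python. The sketch assumes that the capacitance at each intersection is available as a number in a grid, together with a baseline grid measured when nothing is touching the pad; the function name and the values are hypothetical.

```python
def finger_position(baseline, reading):
    """Return the (row, column) of the most affected intersection.

    baseline and reading are grids (lists of rows) of capacitance values
    in arbitrary units. Returns None when nothing has changed.
    """
    best = None
    best_change = 0
    for r, (base_row, read_row) in enumerate(zip(baseline, reading)):
        for c, (base, value) in enumerate(zip(base_row, read_row)):
            change = abs(value - base)
            if change > best_change:
                best_change = change
                best = (r, c)
    return best

baseline = [[100, 100, 100],
            [100, 100, 100],
            [100, 100, 100]]
reading = [[100, 100, 98],
           [100, 92, 80],
           [100, 100, 97]]
print(finger_position(baseline, reading))  # (1, 2)
```

Repeating this measurement many times per second and comparing successive positions gives the movement that is passed on to the cursor.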
6.2.3 Speech Recognition System
Speech recognition is one of the most interactive ways to communicate with the
computer. With the help of a microphone (along with speech recognition software), the user
can simply tell the computer which task is to be performed. It is the technology by which
sounds, words or phrases spoken by individuals are converted into digital signals, and
these signals are transformed into computer-generated text or commands. Most
speech recognition systems are speaker dependent, so they must be separately trained
for each individual user. The speech recognition system ‘learns’ the voice of the user, who
speaks isolated words repeatedly. These spoken words can then be recognized in the future
(Figure 6.15).
Speech recognition is gaining popularity in the corporate world among nontypists, people
with disabilities and business travellers who tape-record information for later transcription.
The computer-based speech recognition systems can be used to create text documents
such as letters or e-mail, to browse the Internet and to navigate among applications by voice
commands. They have relatively high accuracy rates and allow the user to communicate
with the computer directly without using a keyboard or a mouse. However, compared with
other input devices, a speech recognizer is less reliable. Sometimes, it is unable to
differentiate between two similar-sounding words such as see and sea. It is also not
suitable for noisy places.
For the last two decades, computer scientists have been working on speech recognition
systems. The major difficulty in developing these systems is that people communicate
with each other using languages with different accents and intonations. Hence, most
successful speech recognition systems require a period of ‘training’ to become accustomed
to an individual's accent and intonation. Although this technology is still in its developmental
stage, it may someday eliminate the need for keying the data into the memory of the
computer completely.
6.2.3.1 Working of a Speech Recognition System
A speech recognition system consists of a number of components, which together convert
spoken words into computer text and commands. When a person speaks, the sound is captured
through a microphone. The signals coming out of the microphone are analog waves, which
are converted into digital signals by the sound card of the computer. The speech recognition
software analyses the digital pattern to find matches with known sounds contained in a
database, and then passes the recognized words to an application such as Microsoft Word or
WordPerfect. A part of that database consists of predefined sound patterns, a
one-size-fits-all vocabulary for recognizing speech from as many different voices as
possible. The rest is built when a user ‘trains’ the software by repeating keywords so that
it can recognize the user's distinctive speech patterns (Figure 6.16).
Figure 6.16 Speech Recognition System
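The matching step can be illustrated, in a greatly simplified form, by comparing a digitized sound with stored patterns and choosing the closest one. In this Python sketch each sound is reduced to a short list of numbers; the feature values, the words in the database and the function name are all hypothetical, and real speech recognition software uses far more elaborate methods.

```python
import math

def recognize(features, database):
    """Return the word whose stored pattern is closest to the features."""
    best_word = None
    best_distance = math.inf
    for word, pattern in database.items():
        # Straight-line distance between the two lists of numbers.
        distance = math.dist(features, pattern)
        if distance < best_distance:
            best_distance = distance
            best_word = word
    return best_word

# Predefined patterns supplied with the (hypothetical) software.
database = {"yes": [1.0, 0.2, 0.1], "no": [0.1, 0.9, 0.3]}
print(recognize([0.9, 0.3, 0.1], database))  # yes

# 'Training' adds a pattern recorded from the individual user.
database["stop"] = [0.5, 0.5, 0.9]
print(recognize([0.4, 0.6, 0.8], database))  # stop
```

The second part of the example mirrors the training stage: once the user's own pattern has been stored, a similar sound is matched to it in future.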
6.2.4 Digital Camera
A digital camera stores images digitally rather than recording them on film. Once a picture
has been taken, it can be transferred to a computer system and then manipulated with
image-editing software and printed. The main advantage of a digital camera is that making
photos is both inexpensive and fast because there is no film processing (Figure 6.17).
All digital cameras record images in an electronic form, that is, the image is represented in
the computer language of bits and bytes. Essentially, a digital image is a long string of 1s
and 0s that represent all the tiny coloured dots or pixels that collectively make up the
image. Similar
to a conventional camera, it has a series of lenses that focus light to create an image of a
scene. However, instead of focusing this light onto a piece of film, it focuses it onto a
semiconductor device that records light electronically. A computer then breaks this
electronic information down into the digital data.
The major difference between a digital camera and a film-based camera is that the digital
camera does not have a film; instead, it has a sensor that converts light into electrical
charges. The image sensor employed by most of the digital cameras is a charge-coupled
device (CCD). Some low-end cameras use complementary metal oxide semiconductor
(CMOS) technology. The CCD is a collection of tiny light-sensitive diodes, which convert
photons (light) into electrons (electrical charge). These diodes are called photosites.
Concisely, each photosite is sensitive to light. The brighter the light that hits a single
photosite, the greater will be the electrical charge that accumulates at that site. To digitize
the information, the signal must be passed through an analog-to-digital converter (ADC). It
converts that information to binary form and sends it to a digital signal processor (DSP). The
DSP adjusts the image details, compresses the information and sends it to the storage
medium of the camera from where it is transferred to the storage of the computer through a
cable (Figure 6.18).
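The work of the analog-to-digital converter can be sketched in Python. The sketch assumes that the charge at each photosite is given as a number between zero and a full-scale value and that the converter produces 8-bit results; the function name and the sample charges are hypothetical.

```python
def digitize(charges, full_scale, bits=8):
    """Convert photosite charges into whole numbers, as an ADC would."""
    levels = (1 << bits) - 1          # 255 for an 8-bit converter
    values = []
    for charge in charges:
        # Charges outside the measurable range are clipped.
        clipped = min(max(charge, 0.0), full_scale)
        values.append(round(clipped / full_scale * levels))
    return values

# Four photosites: dark, dim, fully lit and over-exposed.
pixels = digitize([0.0, 0.25, 1.0, 1.2], full_scale=1.0)
print(pixels)                   # [0, 64, 255, 255]
print(format(pixels[1], "08b")) # 01000000
```

The brighter the light at a photosite, the larger the number produced, and the last line shows the binary form in which one of the values would be stored.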
6.2.5 Webcam
A webcam (short form of web camera) is a portable video camera, which captures live video
or images that may be viewed in real time over the network or the Internet. It is just a small
digital camera that is either in-built in your computer (in most laptops) or can be connected
through a USB port. It is normally placed on top of the PC monitor or laptop so as to capture
the images of the user while one is working on the computer (Figure 6.19).
Nowadays, a wide variety of webcams are available and according to their varied capabilities
and features, they are classified into two categories—streaming and snapshot. The
streaming webcam captures the moving images (about 30 images per second), thus creating
a streaming video—a web video that plays on the computer immediately as its data arrive
via the network; the recipient need not download the video. However, a high-speed Internet
connection is needed to transfer the smooth video and the image quality is also
comparatively poor. On the other hand, a snapshot webcam captures only still images
(usually, once every 30 s) and refreshes them continuously. It produces better-quality images
and is easier to configure than a streaming webcam.
The popularity of webcams is increasing day by day owing to their many uses. The most
popular use of a webcam is in videoconferencing, which provides real-time communication in
which a group of people can see and interact with each other. A webcam can be used with
various messenger programs like Yahoo and Windows Live Messenger so that one can also
share video while instant messaging. It is also being used in educational institutions to
conduct distance learning activities; one can attend the classes while sitting at home.
The webcams are cheap, compact and are easy to use and install. They are affordable
because of their low manufacturing cost. The major drawback of using webcams is that they
produce only real-time images and cannot be used unless attached to the PC. Some
webcams also include advanced features such as automatic lighting controls, automatic
face tracking and autofocus, which increase their cost.
6.2.6 Scanners
There are a number of situations where some information (picture or text) is available on
paper and is needed on the computer for further manipulation. A scanner is an input device
that converts a document into an electronic format that can be stored on the disk. The
electronic image can be edited, manipulated, combined and printed by using the image-
editing software. The scanners are also called optical scanners as they use a light beam to
scan the input data.
Note that most scanners are equipped with a utility program that allows them to
communicate with the computer and save the scanned image as a graphic file on the
computer. Moreover, they can store images in both grey scale and colour mode. The two
most common types of scanners are the hand-held scanner and the flat-bed scanner.
6.2.6.1 Hand-held Scanner
A hand-held scanner consists of light-emitting diodes, which are placed over the document
to be scanned. This scanner scans the document very slowly from the top to the bottom with
its light on. In this process, the whole document is converted and then stored as an image.
While working, the scanner must be dragged steadily and carefully over the document at a
constant speed, without stopping or jerking, to obtain the best results. Hence,
hand-held scanners are widely used where high accuracy is not of much importance. The
size of the hand-held scanner is small. They are available in various resolutions, up to about
800 dots per inch (DPI) and in either grey scale or colour mode. Furthermore, they are used
when the volume of the documents to be scanned is low. The typical application of this
scanner includes the storage and reproduction of the images in publications. These devices
read the data on the price tags, shipping labels, inventory part number, book ISBNs and so
on.
When the scan button of the hand-held scanner is pressed, a light-emitting diode illuminates
the document underneath it. An inverted angled mirror directly over the window of the
scanner reflects the image onto the lens, which is located at the back of the scanner. The
lens focuses a single line of the image onto a charge-coupled device (CCD), which contains
a row of light detectors. As the light shines onto these detectors, each of them records
the amount of light as a voltage that corresponds to white, black, grey or a colour. These
voltages are sent to a specialized analog chip, which corrects any colour detection error.
After that a single-line image is passed to an analog-to-digital converter (ADC), which
converts the analog signals into binary forms that can be sent to the computer. The converter
itself clears the data so that it can receive the next line of the image.
6.2.6.2 Flat-bed Scanner
To scan a document with a flat-bed scanner, the document is first placed on the glass plate
and the cover is closed. A lamp is
used to illuminate the document. The scan head (mirrors, lens, filters and CCD array
constitute a scan head) is moved slowly across the document by a belt that is attached to a
stepper motor. The head is attached to a stabilizer bar to ensure that there is no wobble or
deviation in the pass. In scanning terms, a pass means that the scan head has completed a
single complete scan of the document. The image of the document is reflected by an angled
mirror to another mirror. Each mirror is slightly curved to focus the image it reflects onto a
smaller surface. The last mirror reflects the image onto a lens. The lens focuses the image
through a filter onto the CCD array. The CCD array is a collection of tiny light-sensitive
diodes (also called
photosites), which convert light into electrical charge. The brighter the light that hits a single
photosite, the greater the electrical charge that will accumulate at that site.
Some scanners use a three-pass scanning method. Each pass uses a different colour filter
(red, green or blue) between the lens and the CCD array. After the three passes are
completed, the scanner software assembles the three filtered images into a single full-
colour image. Nowadays, most scanners use the single-pass method, in which the lens splits
the image into three smaller versions of the original image. Each smaller version passes
through a colour filter (red, green or blue) onto a discrete section of the CCD array. The
scanner combines the data from the three parts of the CCD array into a single full-colour
image, which is then sent to the computer.
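The way three filtered images are assembled into one full-colour image can be sketched in Python as follows. The representation of each filtered image as a list of rows of brightness values (0 to 255) and the name combine_passes are assumptions made for this illustration.

```python
def combine_passes(red, green, blue):
    """Merge three equally sized filtered images into rows of (R, G, B) pixels."""
    if not (len(red) == len(green) == len(blue)):
        raise ValueError("the three images must have the same number of rows")
    image = []
    for r_row, g_row, b_row in zip(red, green, blue):
        # Pair up the red, green and blue readings taken at the same position.
        image.append(list(zip(r_row, g_row, b_row)))
    return image


# Hypothetical 2 x 2 readings from the red, green and blue filters.
red = [[255, 0], [10, 20]]
green = [[0, 255], [10, 20]]
blue = [[0, 0], [10, 20]]

full_colour = combine_passes(red, green, blue)
print(full_colour)
```

The same combination applies whether the three images come from three separate passes or from three sections of the CCD array in a single pass.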
6.3 OPTICAL CHARACTER RECOGNITION
As stated earlier, a scanner converts an input document into an electronic format that can
be stored on the disk. If the document to be scanned contains an image, it can be
manipulated using the image-editing software. However, if the document to be scanned
contains text, optical character recognition (OCR) software is needed. This is because
when the scanner scans a document, the scanned document is stored as a bitmap in the
memory of the computer. The OCR software translates the bitmap image of the text to the
ASCII codes that the computer can interpret as letters, numbers and special characters.
OCR works best with originals or very clear copies and mono-spaced fonts like Courier. For
efficient working of OCR, one should use 12 point or greater font size. The text should be laid
out in a single column and should be printed/written in black on a white background. The
earliest OCR systems were dedicated to high-volume variable data entry. The first major use
of OCR was in processing petroleum credit card sales drafts. Other applications that evolved
over time include cash register tape readers, page scanners and so on. Any standard form or
document with repetitive variable data would be a candidate application for OCR.
OCR makes data entry easier, less error-prone and less time consuming.
However, it is very expensive and if the document is not typed clearly, it will be difficult for
the OCR software to recognize the characters. Furthermore, except for tab stops and paragraph
marks, most of the document formatting is lost during text scanning. The output from a
finished text scan will be a single-column editable text file. This text file will always require
spell checking and proof reading, as well as reformatting to get the desired final layout.
6.3.1 Working of an OCR
All the OCR systems include an optical scanner for reading text and sophisticated software
for converting the text into machine-readable form. During the OCR processing, the text is
analysed for light and dark areas to identify each alphabetic letter or numeric digit. When a
character is recognized, it is converted into an ASCII code. Two basic methods are used for
OCR—matrix matching and feature extraction. The matrix matching technique compares
what the OCR scanner sees as a character with a library of character matrices or templates.
When an image matches one of these prescribed matrices of dots within a given level of
similarity, the computer labels that image as the corresponding ASCII character. The feature
extraction OCR does not require strict matching to prescribed templates. This method varies
depending upon the extent of ‘computer intelligence’ applied by the manufacturer. The
computer looks for general features such as open areas, closed shapes, diagonal lines and
line intersections. This method is much more versatile than matrix matching. Matrix
matching works best when the OCR encounters a limited repertoire of type styles, with little
or no variation within each style. Where the characters are less predictable, feature
extraction is superior.
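Matrix matching can be illustrated with a minimal Python sketch. The tiny 3 x 5 templates, the 0.9 similarity threshold and the function names are assumptions made for this example; a real OCR package uses far larger templates and libraries.

```python
# Hypothetical template library: each character is a 5-row by 3-column dot matrix.
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def similarity(glyph, template):
    """Fraction of cells in which the scanned glyph agrees with the template."""
    cells = [(g, t)
             for g_row, t_row in zip(glyph, template)
             for g, t in zip(g_row, t_row)]
    return sum(g == t for g, t in cells) / len(cells)


def match_character(glyph, threshold=0.9):
    """Return the best-matching character, or None if nothing is similar enough."""
    best_char, best_score = None, 0.0
    for char, template in TEMPLATES.items():
        score = similarity(glyph, template)
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= threshold else None


noisy_t = ["111", "010", "110", "010", "010"]   # a 'T' with one stray dot
char = match_character(noisy_t)
print(char, ord(char))                          # the character and its ASCII code
```

The threshold plays the role of the "given level of similarity": a glyph that resembles no template closely enough is left unrecognized rather than guessed.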
At the end of the OCR processing, the final information can be saved in a number of formats,
e.g. text or rich text format (RTF). OCR software that supports RTF can also recognize
bold and italic characters and retain tabs and white space, as well as a limited number of
different fonts. However, using OCR, the computer cannot interpret the special characters
and images. In addition, the storage capacity required for storing the document as an image
is much more than the capacity required for storing the document as a text.
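The difference in storage requirement can be estimated with a little arithmetic, sketched here in Python. The page size (8.5 x 11 inches), the resolution (300 dots per inch), the 1-bit black-and-white depth and the figure of 3,000 characters per page are all assumptions chosen for illustration, and the image is taken to be uncompressed.

```python
def bitmap_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Bytes needed for an uncompressed bitmap of a page of the given size."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


def text_bytes(characters, bytes_per_char=1):
    """Bytes needed for the same page as plain text, one byte per character."""
    return characters * bytes_per_char


image_size = bitmap_bytes(8.5, 11, 300, 1)   # assumed page size, resolution, depth
text_size = text_bytes(3000)                 # assumed character count for one page
print(image_size, text_size, round(image_size / text_size))
```

Under these assumptions the bitmap needs roughly a megabyte, whereas the text needs only a few kilobytes.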
6.4 OPTICAL MARK RECOGNITION
Optical mark recognition (OMR) is the process of detecting the presence of intended marked
responses. A mark registers significantly less light than the surrounding paper. The optical
mark reading is done by a special device known as an optical mark reader. A mark has to be
positioned correctly on the paper and should be significantly darker than the surrounding
paper so that it can be detected by the OMR reader. The OMR technology enables a high-
speed reading of large quantities of data and transferring these data to the computer without
using a keyboard. The OMR reader scans the form, detects the presence of marks and
passes this information to the computer for processing by the application software.
Generally, this technology is used to read answer sheets (objective-type tests). In this
method, special forms/documents are printed with boxes, which can be marked with
dark pencil or ink. These forms are then passed under a light source and the presence of dark
ink is transformed into electric pulses, which are transmitted to the computer.
The optical mark recognition is also used for standardized testing as well as course
enrolment and attendance in education. Human resource departments across
industries use OMR for applications such as benefit enrolment, employee testing, payroll
deductions and user training. Healthcare providers use the technology for registration
and surveys, and medical labs use it for patient evaluations and for tracking supply orders and lab
services. The OMR is also used for inventory management, voting applications, exit surveys,
polling and all types of questionnaires and evaluation studies.
The OMR has a better recognition rate than the OCR because machines make fewer mistakes
in reading marks than in reading handwritten characters. Large volumes of data
can be collected quickly and easily without any specially trained staff. Usually, an OMR
reader can maintain a throughput of 1,500–10,000 forms per hour. However, designing
documents for optical mark recognition is complicated and the OMR reader needs to be
reprogrammed for each new document design. Filling in the forms is relatively slow because
the person putting marks on the documents must follow the instructions precisely. Any
folding or dirt on a form may prevent the form from being read correctly. In addition, OMR
requires accurate alignment of printing on forms and needs good-quality paper.
To make an OMR system work, any of the following methods of mark reading can be used.
The first method is based on the conductivity of graphite to determine the presence of a
pencil mark. The marks must be made only in pencil because the graphite in the pencil
lead conducts electricity.
The second method is based on the reflection of light. In this method, a thin beam of light is
directed on the surface of the paper. When a lesser amount of light is reflected from a
marked position, the filled box can be recognized. The OMR can evaluate only those documents that
are printed with the marking positions in the specified areas.
The optical mark recognition is traditionally performed using a reflective light method, where
a beam of light is directed onto a sheet with marks, to capture a reduced reflection (presence of
a mark) or a strong reflection (absence of a mark). The OMR data entry system converts
the information about the presence or absence of marks into a computer data file. A simple
pen or pencil mark is made on the form to indicate each selected response such as answers
to survey questions. The completed forms are scanned by an optical mark reader, which
detects the presence of a mark by measuring the reflected light. The OMR reader then
interprets the pattern of marks into a data record and sends this to the computer for storage,
analysis and reporting.
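The reflective light method can be sketched in Python as follows. The reflectance scale (0 for no reflected light, 1 for blank paper), the 0.5 threshold, the four options A to D and the function name read_answers are assumptions made for this illustration.

```python
def read_answers(sheet, options="ABCD", threshold=0.5):
    """Turn reflectance readings into one answer per question.

    sheet holds one row per question; each row holds the reflectance measured
    at each answer box. A box counts as marked when it reflects less light
    than the threshold.
    """
    record = []
    for reflectances in sheet:
        marked = [opt for opt, r in zip(options, reflectances) if r < threshold]
        if len(marked) == 1:
            record.append(marked[0])
        elif not marked:
            record.append(None)        # no box was filled in
        else:
            record.append("?")         # more than one box filled in
    return record


# Hypothetical readings for three questions.
sheet = [
    [0.92, 0.15, 0.90, 0.88],          # only the second box is dark
    [0.91, 0.93, 0.89, 0.90],          # nothing marked
    [0.20, 0.90, 0.18, 0.91],          # two boxes marked
]
print(read_answers(sheet))
```

The resulting list is the kind of data record that the reader passes to the computer for storage, analysis and reporting.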
6.5 MAGNETIC-INK CHARACTER RECOGNITION
You must have seen special magnetic encoding using characters, printed on the bottom of
a cheque. The characters are printed using a special ink, which contains iron particles that
can be magnetized. To recognize these magnetic-ink characters, a magnetic-ink character
recognition (MICR) reader is used. It reads the characters by examining their shapes in matrix form and
the information is then passed on to the computer.
The banking industry prefers MICR to OCR as the MICR gives extra security against forgeries
such as colour copies of payroll cheques or hand-altered characters on a cheque. If a
document has been forged, say a counterfeit cheque produced using a colour photocopying
machine, the magnetic-ink line will either not respond to the magnetic fields or will produce
an incorrect code when scanned using a device designed to recover the information in the
magnetic characters. The reading speed of the MICR is also higher than that of the OCR. This method is very
efficient and time saving for data processing.
6.6 BAR CODE READER
A bar code is a machine-readable code in the form of parallel vertical lines of varying widths.
It is commonly used for labelling goods that are available in supermarkets and numbering
books in libraries. This code is sensed and read by a bar code reader using reflective light.
The information recorded by the bar code reader is then fed into the computer, which
recognizes the information from the thickness and spacing of the bars. The bar code readers
are either hand-held or fixed-mount. Hand-held scanners are used to read bar codes on
stationary items. With fixed-mount scanners, the items having bar codes are passed by the
scanner—by hand as in retail scanning applications or by conveyor belt in many industrial
applications.
Bar code data collection systems provide enormous benefits for every business. With a bar
code data collection solution, capturing data is faster and more accurate. A bar code
scanner can record the data five to seven times faster than a skilled typist. Bar code data
entry has an error rate of about one in three million. Bar coding also reduces the cost in terms
of labour and revenue losses resulting from the data collection errors. The bar code readers
are widely used in supermarkets, department stores, libraries and other places. You must
have seen a bar code on the back cover of certain books and greeting cards. The retail and
grocery stores use a bar code reader to determine the item being sold and to retrieve the
price of an item from a computer system.
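The price look-up performed at a shop counter can be sketched in Python as follows. The catalogue, the code numbers, the item names and the prices are all hypothetical and serve only to illustrate the idea.

```python
# Hypothetical catalogue: bar code number -> (item name, price).
CATALOGUE = {
    "8901234567890": ("Notebook", 45.00),
    "8909876543210": ("Greeting card", 30.00),
}


def look_up(code):
    """Return (name, price) for a scanned code, or None if it is not on file."""
    return CATALOGUE.get(code)


def bill(scanned_codes):
    """Total the prices of recognized items and list the codes not on file."""
    total, unknown = 0.0, []
    for code in scanned_codes:
        item = look_up(code)
        if item is None:
            unknown.append(code)
        else:
            total += item[1]
    return total, unknown


print(bill(["8901234567890", "8909876543210", "000"]))
```

The bar code itself carries only the number. The item name and price come from the record held in the computer system.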
6.6.1 Working of a Bar Code Reader
The bar code scanners are electro-optical systems that include a means of illuminating the
symbol and measuring the reflected light. The light waveform data are converted from analog
to digital form, processed by a decoder and then transmitted to the computer software.
The process begins when a device directs a light beam over a bar code. The device contains
a small sensory reading element, called a sensor, which detects the light being reflected back
from the bar code, and converts light energy into electrical energy. The result is an electrical
signal that can be converted into alphanumeric data. The pen in the bar code unit reads
the information stored in the bar code and converts it into a series of ASCII characters,
through which the operating system obtains the information stored in the bar code.
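The first stages of this process, from reflected light to a pattern of bar widths, can be sketched in Python as follows. The reflectance scale (0 for dark, 1 for bright), the 0.5 threshold, the rule that a wide element is at least twice the narrowest one, and the function names are assumptions made for illustration. The sketch stops before the final step, in which a decoder maps the pattern of widths to characters according to the bar code scheme in use.

```python
from itertools import groupby


def to_runs(samples, threshold=0.5):
    """Group reflectance samples into (kind, width) runs; dark samples are bars."""
    kinds = ["bar" if s < threshold else "space" for s in samples]
    return [(kind, len(list(group))) for kind, group in groupby(kinds)]


def classify(runs):
    """Label each run narrow or wide relative to the narrowest run seen."""
    narrow = min(width for _, width in runs)
    return [(kind, "wide" if width >= 2 * narrow else "narrow")
            for kind, width in runs]


# Hypothetical sensor readings taken as the light beam sweeps across the bars.
samples = [0.1, 0.1, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1,
           0.9, 0.9, 0.9, 0.9, 0.1, 0.1]

runs = to_runs(samples)
print(runs)
print(classify(runs))
```

Measuring widths relative to the narrowest element, rather than in absolute units, keeps the result the same whether the beam sweeps over the code quickly or slowly.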