Brief Introduction and Overview of Visual Media Compression and Processing
Course Code: COME5103
Course instructor: Mr Yancho Basil W
Email: [email protected]
Tel: 679615856/655023871
DEA Applied Computer Science
With the rapid development of applications and technologies built around the Internet, more and
more images and videos are freely available online. We refer to these images and videos as
visual media, and together they can be considered to form a large database. This creates
opportunities for new data-driven applications that allow non-professional users to create and
edit visual media easily. Such models come from a diverse and expanding set of fields, including
physical, mathematical, artistic, biological, and even conceptual (abstract) structures. In this
course, as said earlier, we will be answering the following questions.
1 Digital Image
An image as defined on, say, a photographic film is a continuous function of brightness values.
Each point on the developed film can be associated with a gray level value representing how
[Introduction into Visual Media Compression and processing] by Yancho Basil Page 1
bright that particular point is. In order to store such images on a computer, we need to
sample them (consider the image only at a finite number of tiny squares) and digitize them
(change the squares into a sequence of discrete numbers, called pixels, that a computer can
understand).
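The two steps above, sampling and digitizing, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function brightness is a made-up continuous image (it is not from the course material), and 256 gray levels are assumed.

```python
import math


def brightness(x, y):
    """Hypothetical continuous image: brightness in [0, 1] for any real x, y in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)


def digitize(f, width, height, levels=256):
    """Sample f at the centre of each grid square, then quantize to integer gray levels."""
    image = []
    for row in range(height):
        y = (row + 0.5) / height          # centre of the square, vertical direction
        line = []
        for col in range(width):
            x = (col + 0.5) / width       # centre of the square, horizontal direction
            value = f(x, y)               # sampling: read the continuous function once
            level = int(value * levels)   # digitizing: map [0, 1] to an integer
            line.append(max(0, min(levels - 1, level)))
        image.append(line)
    return image


pixels = digitize(brightness, width=4, height=4)
for line in pixels:
    print(line)
```

A larger width and height give a finer grid (more samples), while a larger levels value gives finer brightness steps; both increase the amount of data to store.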
In simple terms, a digital image is a picture stored on a computer. It is therefore a numeric
representation (normally binary) of a two-dimensional image. Lots of these tiny coloured
squares (pixels) together form a digital image. To create a digital image, you could therefore
draw one in a piece of design software (like Paint or Photoshop), take one with a digital
camera, or scan one in using a scanner.
Pixel: In digital imaging, a pixel, or picture element, is a physical point in a raster image, or
the smallest addressable element in an all-points-addressable display device; it is therefore the
smallest controllable element of a picture represented on the screen. The easiest way of
sampling is on a regular grid of squares. Each square is a pixel, and to store the picture the
computer simply records a number representing the colour of each square. If you look closely at
your computer monitor, you will see that the screen is made up of a grid of millions of tiny
squares. The more squares in the grid, the better the images will look. Each pixel’s color sample
has three numerical RGB components (Red, Green, Blue) that represent the color of that tiny
pixel area. These components are three 8-bit numbers for each pixel. Three 8-bit bytes (one
byte for each of R, G, and B) are called 24-bit color. Each 8-bit RGB component can have 256
possible values, ranging from 0 (none of that color) to 255 (full intensity); with all three
components at 0 the pixel is black, and with all three at 255 it is fully white.
We often distinguish two different types of images, namely:
1. Bitmap (or raster) images: They are stored as a series of tiny dots called pixels. Each pixel
is actually a very small square that is assigned a color and then arranged in a pattern to form
the image. When you zoom in on a bitmap image, you can see the individual pixels that make up
that image. Bitmap graphics can be edited by erasing or changing the color of individual pixels
using a program such as Adobe Photoshop. These are picture formats that use pixels to store
pictures such as
Figure 3.1: (a) Digital image 2D representation and (b) pixel grid representation
2. Vector graphics: They are made up of lines, curves, and shapes instead of pixels. Each part of
a vector graphic is editable, and vector graphics can be resized very easily. Unlike bitmaps,
vector images are not based on pixel patterns, but instead use mathematical formulas to draw
lines and curves that can be combined to create an image from geometric objects such as circles
and polygons. Vector images are edited by manipulating the lines and curves that make up the
image using a program such as Adobe Illustrator.
Vector images tend to be smaller than bitmap images. That’s because a bitmap image has
to store color information for each individual pixel that forms the image. A vector image
just has to store the mathematical formulas that make up the image, which take up less
space.
Vector images are also more scalable than bitmap images. When a bitmap image is scaled
up you begin to see the individual pixels that make up the image. This is most noticeable
in the edges of the image. When a vector image is scaled up, the image is redrawn using
the mathematical formula, so the resulting image is just as smooth as the original.
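The size and scaling arguments above can be made concrete with a rough sketch. It assumes an uncompressed 24-bit bitmap (real formats such as PNG or JPEG compress the data, so actual files are smaller) and a hypothetical red circle described in SVG; the helper names are illustrative only.

```python
def bitmap_bytes(width, height, bytes_per_pixel=3):
    """Raw storage for an uncompressed bitmap: one colour value per pixel."""
    return width * height * bytes_per_pixel


def circle_svg(size):
    """Vector description of a filled circle; only the formula's parameters are stored."""
    r = size // 2
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<circle cx="{r}" cy="{r}" r="{r}" fill="red"/></svg>'
    )


def upscale(image, factor):
    """Nearest-neighbour enlargement: every pixel becomes a factor x factor block."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


for size in (100, 1000):
    raw = bitmap_bytes(size, size)
    vec = len(circle_svg(size).encode("utf-8"))
    print(f"{size}x{size}: bitmap {raw} bytes, vector {vec} bytes")

print(upscale([[0, 255]], 2))  # the two pixels turn into visible 2x2 blocks
```

The bitmap size grows with the square of the image side, while the vector description changes only in the few digits of its parameters. The upscale function shows why enlarged bitmaps look blocky: no new detail is created, each pixel is simply repeated.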
The three most popular image formats used on the Web (PNG, JPEG, and GIF) are bitmap
formats. The Scalable Vector Graphics (SVG) format comes in a distant fourth due to a legacy of
poor support for vector graphics in early browsers. Today, however, all major browsers support
SVG.
Bitmap formats are best for images that need a wide range of color gradations, such as
most photographs. Vector formats, on the other hand, are better for images that consist of a few
areas of solid color. Examples of images that are well suited to the vector format include logos
and type.
Image processing is a method of converting an image into digital form and performing
operations on it in order to extract useful information or features from it. Digital image
processing means processing digital images by means of a computer; it covers low-, mid-, and
high-level processes. Image processing algorithms alter an input image to create a new image:
the input is an image and the output is an image.
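As a minimal sketch of the "image in, image out" idea, the two point operations below take a grayscale image (a list of rows of 8-bit values) and return a new image of the same size. The function names and the sample image are assumptions made for illustration.

```python
def negative(image, max_level=255):
    """Invert every gray level: dark pixels become bright and vice versa."""
    return [[max_level - p for p in row] for row in image]


def threshold(image, cutoff=128):
    """Turn a grayscale image into black and white around a cutoff level."""
    return [[255 if p >= cutoff else 0 for p in row] for row in image]


gray = [
    [0, 64, 128],
    [192, 255, 30],
]

print(negative(gray))
print(threshold(gray))
```

Both functions are low-level processes in the sense used above: they operate pixel by pixel and the result is again an image.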
Characteristics of Bitmap Images
A bitmap describes a type of image that web-users encounter all the time without realizing it.
Basically, it’s a grid where each individual square is a pixel that contains color information. The
key characteristics are
2. X-Ray imaging – X-rays of body.
3. Ultraviolet Band – Lithography, industrial inspection, microscopy, lasers.
4. Visual and Infrared Band – Remote sensing.
5. Microwave Band – Radar imaging.
heat-sensitive devices, inkjet units, and digital units such as optical and CD-ROM disks. Film
provides the highest possible resolution, but paper is the obvious medium of choice for written
applications.
viii) Networking: It is almost a default function in any computer system in use today. Because
of the large amount of data inherent in image processing applications, the key consideration in
image transmission is bandwidth.
Elements of Visual Perception
Structure of the human Eye
The eye is nearly a sphere, with an average diameter of approximately 20 mm. It is
enclosed by three membranes:
a) The cornea and sclera: The cornea is a tough, transparent tissue that covers the anterior
surface of the eye. The rest of the optic globe is covered by the sclera.
b) The choroid: It contains a network of blood vessels that serve as the major source of nutrition
to the eye. It also helps to reduce the amount of extraneous light entering the eye. It has two
parts:
(1) Iris diaphragm: it contracts or expands to control the amount of light that enters the eye.
(2) Ciliary body
c) Retina: It is the innermost membrane of the eye. When the eye is properly focused, light
from an object outside the eye is imaged on the retina. Light receptors are distributed over the
surface of the retina. The two major classes of receptors are:
1) Cones: There are about 6 to 7 million of them. They are located in the central portion of the
retina, called the fovea, and are highly sensitive to color. Humans can resolve fine details with
these cones because each one is connected to its own nerve end. Cone vision is called photopic
or bright-light vision.
2) Rods: They are much more numerous, from 75 to 150 million, and are distributed over the
entire retinal surface. Because they are spread over a large area and several rods are connected
to a single nerve end, they give a general, overall picture of the field of view. They are not
involved in color vision and are sensitive to low levels of illumination. Rod vision is called
scotopic or dim-light vision. The region of the retina where receptors are absent is called the
blind spot.
Image Formation in the Eye
The major difference between the lens of the eye and an ordinary optical lens is that the
former is flexible.
The shape of the lens of the eye is controlled by tension in the fibers of the ciliary body. To
focus on distant objects, the controlling muscles cause the lens to be relatively flattened; to
focus on objects near the eye, these muscles allow the lens to become thicker. The distance
between the center of the lens and the retina is called the focal length, and it varies from
17 mm to 14 mm as the refractive power of the lens increases from its minimum to its maximum.
When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest
refractive power. When the eye focuses on a nearby object, the lens is most strongly refractive.
The retinal image is focused primarily on the area of the fovea. Perception then takes
place by the relative excitation of light receptors, which transform radiant energy into
electrical impulses that are ultimately decoded by the brain.
Brightness Adaptation and Discrimination
Digital images are displayed as a discrete set of intensities. The range of light intensity
levels to which the human visual system can adapt is enormous, on the order of 10^10
from the scotopic threshold to the glare limit. Experimental evidence indicates that subjective
brightness is a logarithmic function of the light intensity incident on the eye.
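A small numeric sketch may help to see what a logarithmic relationship means here. The function below is an illustrative model only: the base-10 logarithm, the arbitrary intensity units, and the name subjective_brightness are assumptions, not a calibrated description of human vision.

```python
import math


def subjective_brightness(intensity, reference=1.0):
    """Illustrative model: perceived brightness grows with log10 of the intensity."""
    return math.log10(intensity / reference)


# Sweep the adaptation range, from the reference level up to 10^10 times brighter.
for exponent in range(0, 11, 2):
    intensity = 10.0 ** exponent
    print(f"intensity 10^{exponent:<2} -> brightness {subjective_brightness(intensity):.1f}")
```

Under this model, multiplying the intensity by a fixed factor always adds the same amount of perceived brightness, which is how a range of ten orders of magnitude collapses onto a manageable scale.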
It is an area that has been gaining importance because of the use of digital images over the
internet. Color image processing deals basically with color models and their implementation in
image processing applications.
v) Wavelets and Multiresolution Processing
These are the foundation for representing images in various degrees of resolution.
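One simple way to see the multiresolution idea is a single averaging-and-differencing step on one row of pixels, which is the unnormalized form of the Haar wavelet. The sketch assumes an even-length row; the function names and sample values are made up for illustration.

```python
def haar_step(signal):
    """Split a row into a half-resolution version (averages) and the lost detail."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Rebuild the full-resolution row from the coarse version plus the detail."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal


row = [8, 6, 2, 4, 10, 10, 0, 2]
coarse, detail = haar_step(row)
print(coarse)   # the same row at half the resolution
print(detail)   # what must be added back to recover the original
print(haar_inverse(coarse, detail))
```

Applying haar_step again to the coarse row gives a quarter-resolution version, and so on, which yields the image at several degrees of resolution.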
vi) Compression
It deals with techniques for reducing the storage required to save an image, or the bandwidth
required to transmit it over the network. It has two major approaches:
a) Lossless Compression
b) Lossy Compression
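The difference between the two approaches can be sketched with toy techniques: run-length encoding as an example of lossless compression, and coarse quantization as an example of a lossy step. These are stand-ins chosen for brevity, not the methods used by any particular image format, and the function names are assumptions.

```python
def rle_encode(pixels):
    """Lossless: store each run of equal values once, together with its length."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs


def rle_decode(runs):
    """Expand the runs back into the exact original sequence."""
    return [value for value, count in runs for _ in range(count)]


def quantize(pixels, step=16):
    """Lossy: round every value down to a multiple of step, discarding fine detail."""
    return [(p // step) * step for p in pixels]


row = [200, 200, 200, 201, 199, 50, 50, 50, 50]

lossless = rle_encode(row)
print(len(lossless), "runs, exact round trip:", rle_decode(lossless) == row)

lossy = rle_encode(quantize(row))
print(len(lossy), "runs, exact round trip:", rle_decode(lossy) == row)
```

The lossless path reproduces every pixel exactly. The lossy path needs fewer runs because near-equal values are merged, but the small variations in the original row cannot be recovered.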
vii) Morphological processing
It deals with tools for extracting image components that are useful in the representation and
description of the shape and boundary of objects. It is mainly used in automated inspection
applications.
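As a hedged sketch of how such tools work, the code below implements binary erosion and dilation with a 3x3 square structuring element and uses erosion to extract the boundary of a shape. The choice of structuring element, the treatment of pixels outside the image as background, and the function names are all assumptions made for this example.

```python
def _window(image, r, c):
    """Values in the 3x3 neighbourhood of (r, c); outside the image counts as 0."""
    rows, cols = len(image), len(image[0])
    return [
        image[i][j] if 0 <= i < rows and 0 <= j < cols else 0
        for i in range(r - 1, r + 2)
        for j in range(c - 1, c + 2)
    ]


def erode(image):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[1 if all(_window(image, r, c)) else 0
             for c in range(len(image[0]))] for r in range(len(image))]


def dilate(image):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [[1 if any(_window(image, r, c)) else 0
             for c in range(len(image[0]))] for r in range(len(image))]


def boundary(image):
    """Boundary = original shape minus its eroded version."""
    eroded = erode(image)
    return [[p - e for p, e in zip(prow, erow)] for prow, erow in zip(image, eroded)]


shape = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

for row in boundary(shape):
    print(row)
```

Erosion shrinks the 3x3 square to its single centre pixel, so subtracting it from the original leaves the ring of eight boundary pixels.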
viii) Representation and Description
It almost always follows the output of a segmentation step, which is raw pixel data constituting
either the boundary of a region or all the points in the region itself. In either case,
converting the data to a form suitable for computer processing is necessary.
ix) Recognition
It is the process that assigns a label to an object based on its descriptors. It is the last
step of image processing and often uses artificial intelligence software.
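A very small sketch of label assignment from descriptors is a nearest-prototype rule: the object receives the label of the stored example whose descriptor is closest. The labels, the two descriptors (aspect ratio and circularity), and their values are hypothetical and chosen only for illustration; practical recognition systems are far more elaborate.

```python
import math

# Hypothetical prototypes: label -> (aspect ratio, circularity)
PROTOTYPES = {
    "coin": (1.0, 0.95),
    "key": (3.5, 0.30),
}


def recognize(descriptor, prototypes=PROTOTYPES):
    """Return the label whose prototype descriptor is nearest (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(descriptor, prototypes[label]))


print(recognize((1.1, 0.90)))
print(recognize((3.0, 0.40)))
```

The descriptors would normally come from the representation and description step; here they are typed in by hand.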
2 VIDEO PROCESSING
Digital video processing refers to manipulation of the digital video bitstream. All digital video
applications require compression. In addition, they may benefit from filtering for format
conversion, enhancement, restoration, and super-resolution in order to obtain better-quality
images or to extract specific information, and some may require additional processing for motion
estimation, video segmentation, and 3D scene analysis. What makes digital video processing
different from still image processing is that video contains a significant amount of temporal
correlation (redundancy) between the frames. One may attempt to process video as a sequence of
still images, where each frame is processed independently. However, multi-frame processing
techniques using inter-frame correlations enable us to develop more effective algorithms, such as
motion-compensated filtering and prediction. In addition, some tasks, such as motion estimation
or the analysis of a time-varying scene, obviously cannot be performed on the basis of a single
image.
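The temporal redundancy between frames can be illustrated with plain frame differencing: instead of storing the second frame, store only how it differs from the first. This sketch deliberately leaves out motion compensation, and the tiny frames and function names are assumptions made for the example.

```python
def frame_difference(previous, current):
    """Residual between two frames of equal size, pixel by pixel."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(previous, current)]


def reconstruct(previous, residual):
    """Recover the current frame from the previous frame and the residual."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(previous, residual)]


# A bright 2-pixel object on a flat background moves one pixel to the right.
frame1 = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
frame2 = [
    [10, 10, 10, 10],
    [10, 10, 50, 50],
    [10, 10, 10, 10],
]

residual = frame_difference(frame1, frame2)
nonzero = sum(1 for row in residual for d in row if d != 0)
print(residual)
print(nonzero, "of", len(frame1) * len(frame1[0]), "residual values are non-zero")
```

Most residual values are zero because most of the scene did not change, which is exactly the redundancy that multi-frame techniques exploit. Motion-compensated prediction goes further by first shifting blocks of the previous frame to follow the motion, so that even the moving object leaves little residual.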