Understanding Digital Imaging

 Digital imaging is the creation of digital images. These images can be created with tools such as digital cameras, digital video cameras, 3D animation software, digital painting programs, digital sculpting programs, and scanners.

 Pixels
 A pixel is a sample of an image at a certain location; because it is a sample, a pixel has no true size or shape. Pixels are the smallest visible units of a digital image on a display screen.
 The more pixels an image has, the sharper it will be.
 The number of pixels on a screen is known as the resolution of the screen.
 On a monitor, pixels are tiny dots or rectangles of red, green, and blue that are grouped together to create a color sample.

 Raster Graphics vs. Vector Graphics


 All digital images are either raster graphics or vector graphics.
 Raster graphics are built on a data structure that represents the image as a grid of points, or pixels, typically viewed on a monitor.
 In fact, a raster image is the typical image you will see on a computer.
 Raster graphics are also resolution dependent: at the correct viewing percentage they look sharp and clear, but if you zoom in, they pixelate and lose their overall quality.

 Vector graphics are composed of geometric, math-based primitives: points, lines, curves, and polygons.
 Vector graphics are not resolution dependent and can be scaled to any size without losing quality.
Anti-Aliasing
 Anti-aliasing is an extra point sampling of the pixels to better represent the image.
 Anti-aliasing makes images look smoother by softening jagged edges.
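The idea can be sketched with a toy supersampling example (illustrative only, not from the source): each display pixel is sampled at a grid of sub-points, and the average coverage becomes the pixel's grey level, which smooths a hard diagonal edge.

```python
# Illustrative anti-aliasing sketch: supersample each pixel and average.
def coverage(px, py, samples=4):
    """Fraction of sub-samples of pixel (px, py) inside the region y < x
    (a hard diagonal edge)."""
    inside = 0
    for sy in range(samples):
        for sx in range(samples):
            # Sub-sample position at the centre of each sub-cell
            x = px + (sx + 0.5) / samples
            y = py + (sy + 0.5) / samples
            if y < x:
                inside += 1
    return inside / (samples * samples)

# A pixel the edge passes through gets a partial (grey) value...
print(coverage(2, 2))   # 0.375
# ...while pixels fully on one side stay black or white.
print(coverage(0, 5))   # 0.0
print(coverage(5, 0))   # 1.0
```

Without the averaging, the edge pixel would be forced to all-black or all-white, producing the stair-step look anti-aliasing removes.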
Basic graphic-file formats
 File formats are a way of encoding information for a specific type of file:
 Data
 Image
 Each format has its own advantages and disadvantages
Graphic/Image Data Structures

A digital image consists of many picture elements, termed pixels. The number of pixels that compose a monitor image determines the quality of the image (resolution). Higher resolution yields better quality.

A bit-map representation stores the graphic/image data in the same manner that the computer monitor contents are stored in video memory.

Monochrome/Bit-Map Images

An example 1-bit monochrome image is illustrated in Fig. 6.11, where:

Each pixel is stored as a single bit (0 or 1)

A 640 x 480 monochrome image requires 37.5 KB of storage.

Dithering is often used for displaying monochrome images

Gray-scale Images

An example gray-scale image is illustrated in Fig. 6.12 where:

Each pixel is usually stored as a byte (value between 0 to 255)

A 640 x 480 greyscale image requires over 300 KB of storage.


Figure 6.11: Sample Monochrome Bit-Map Image
Figure 6.12: Example of a Gray-scale Bit-map Image
Figure 6.13: Example of 8-Bit Colour Image

8-bit Colour Images

An example 8-bit colour image is illustrated in Fig. 6.13 where:

One byte for each pixel

Supports 256 of the millions of possible colours, which gives acceptable colour quality for many images

Requires Colour Look-Up Tables (LUTs)

A 640 x 480 8-bit colour image requires 307.2 KB of storage (the same
as 8-bit greyscale)

24-bit Colour Images

An example 24-bit colour image is illustrated in Fig. 6.14 where:


Figure 6.14: Example of 24-Bit Colour Image
Each pixel is represented by three bytes (e.g., RGB)

Supports 256 x 256 x 256 possible combined colours (16,777,216)

A 640 x 480 24-bit colour image would require 921.6 KB of storage

Many 24-bit images are actually stored as 32-bit images; the extra byte of data for each pixel is used to store an alpha value representing special-effect (e.g., transparency) information
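The storage figures quoted above all follow from width x height x bits per pixel. Note that the text mixes binary kilobytes (1024 bytes, giving 37.5 KB and 300 KB) with decimal kilobytes (1000 bytes, giving 307.2 KB and 921.6 KB); a small sketch reconciles the two:

```python
def image_storage(width, height, bits_per_pixel):
    """Raw (uncompressed) image size in bytes."""
    return width * height * bits_per_pixel // 8

for name, bpp in [("1-bit mono", 1), ("8-bit grey/colour", 8), ("24-bit colour", 24)]:
    size = image_storage(640, 480, bpp)
    print(f"{name}: {size} bytes = {size / 1024} KB (binary) = {size / 1000} kB (decimal)")
# 1-bit mono: 38400 bytes = 37.5 KB (binary) = 38.4 kB (decimal)
# 8-bit grey/colour: 307200 bytes = 300.0 KB (binary) = 307.2 kB (decimal)
# 24-bit colour: 921600 bytes = 900.0 KB (binary) = 921.6 kB (decimal)
```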

Calculating image File Sizes


 Everyone who creates content that will be broadcast, streamed, downloaded, or shared must be aware of the file size of the project.
 Productions that are too big won't play or load quickly, and will take up bandwidth and server storage space.
 Productions compressed to very small file sizes will pixelate and suffer from poor quality.

 For audio/video production companies, file size (video quality) is determined by the client and distribution method.

 For broadcast companies, a smaller file size (lower video quality) is often preferred in order to maximize the storage capacity and bandwidth of their network.

Example:
Consider a standard video application stage or canvas that is 720 x 480 pixels in size. The DPI is 72 and the bit depth is 24-bit RGB (16.7 million colors).

This produces a file that is 345,600 total pixels in size and is 10 inches x 6.66 inches in print size. There are 1,036,800 bytes, 1,012.5 kilobytes (KB), and 0.9887 megabytes (MB) in the file. Some applications will tell you all of this information, but others will not. You can calculate it yourself, so it is no mystery how the computer comes up with these numbers.

Total pixels is an easy math problem: it is simply the area of the frame (length x width = area).

L x W = Area
SD video: 720 px x 480 px = 345,600 total pixels
HD video: 1920 px x 1080 px = 2,073,600 total pixels (sometimes loosely called 2K resolution)

To convert pixels to inches you need to know that there are 72 pixels per inch. The formula to convert pixels to inches is X / 72 = inches, where X is the number of pixels on a side.

720 px / 72 = 10 inches     480 px / 72 = 6.6666 inches
1920 px / 72 = 26.6666 inches     1080 px / 72 = 15 inches
So once you figure out the size of your bitmap, how can you convert it to a data file size?

There are several conventions for the number of bytes of data each pixel holds, but the value we will use here is 3, which is the norm for RGB color.

The formula for the number of bytes in an image is L x W x 3 = bytes, where the length and width are measured in pixels.

SD: 720 px x 480 px x 3 = 1,036,800 bytes (of information)
HD: 1920 px x 1080 px x 3 = 6,220,800 bytes (of information)

Now how do we convert that to kilobytes or megabytes?

Back in the early days of computing, when data cost a great deal to process and store and we managed data in kilobytes, the standard was set to binary kilobytes. Memory chips and file sizes are measured in kilobytes (1024 bytes), megabytes (1024 x 1024 bytes), or gigabytes (1024 x 1024 x 1024 bytes). It may not matter much today if we have a few extra KB in storage when we measure storage in gigabytes, but it still matters when you are saving tens of thousands of images to be used in a game, animation, or video. Even today we need to keep our file sizes as small as possible.

To convert our photograph to kilobytes we divide by 1024.

To convert to megabytes we divide by 1,048,576 (1024 x 1024)

For gigabytes we divide by 1024 cubed (1024 x 1024 x 1024).


1,036,800 / 1024 = 1,012.5 kilobytes (KB)
1,036,800 / (1024 x 1024) = 0.9887695 megabytes (MB)
1,036,800 / (1024 x 1024 x 1024) = 0.0009655 gigabytes (GB)
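The whole worked example above can be reproduced in a few lines. This is a sketch; the function name `frame_stats` is our own:

```python
def frame_stats(width, height, bytes_per_pixel=3, dpi=72):
    """Total pixels, print size at the given DPI, and raw file size for an
    uncompressed image (default: 24-bit RGB, 3 bytes per pixel)."""
    pixels = width * height
    size_bytes = pixels * bytes_per_pixel
    return {
        "pixels": pixels,
        "inches": (width / dpi, height / dpi),
        "bytes": size_bytes,
        "kilobytes": size_bytes / 1024,
        "megabytes": size_bytes / (1024 * 1024),
    }

sd = frame_stats(720, 480)
print(sd["pixels"])      # 345600
print(sd["inches"])      # (10.0, 6.666...)
print(sd["bytes"])       # 1036800
print(sd["kilobytes"])   # 1012.5

hd = frame_stats(1920, 1080)
print(hd["bytes"])       # 6220800
```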
Channels
 Channels are representations of the individual red, green, and blue (RGB, monitor display) or cyan, magenta, yellow, and black (CMYK, print) information of a digital image.
 Each channel represents a single color of the file's color model, either RGB or CMYK.
 A channel is displayed as a black-and-white image showing the amount of the color it represents.

Color depth / bit depth


 Bit depth (color depth) is the number of bits of memory used to represent a single pixel.
 The higher the depth, the more information is in each pixel.
 Because the only possible values per bit are 0 and 1, a pixel with a bit depth of x can take on 2^x different values; this tells you how many variations of color a pixel can have for a given file type.
 Color depth can also be expressed in RGB color space as bits per channel.
 An 8-bit-per-channel (24-bit) image can have 256 colors per red, 256 per green, and 256 per blue channel, for a total of 16,777,216 available colors.
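The 2^x relationship above can be checked in a couple of lines:

```python
# Number of distinct values a pixel (or channel) can hold is 2 ** bit_depth.
for bits in (1, 8, 16, 24):
    print(f"{bits}-bit: {2 ** bits:,} colours")
# 1-bit: 2 colours
# 8-bit: 256 colours
# 16-bit: 65,536 colours
# 24-bit: 16,777,216 colours

# Equivalently, per channel: 256 values each for R, G, and B.
assert 256 ** 3 == 2 ** 24 == 16_777_216
```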
Color Calibration
 Color calibration is setting your monitors to the same color profile and color response.
 This allows multiple users to see the same image on different computers.
 It also allows the monitors to represent what a final print will look like.
 In the end, the best thing you can do is to create an image that will work for the majority of people in the digital world.

Color models
There are two types of color models:
 Additive and subtractive color models

 RGB Color Mode: RGB stands for red, green, and blue.
 In RGB mode, these colors are mixed in an additive manner to create the colors seen
on a monitor or television.
 This color mode is based on the human perception of color. Humans see wavelengths
of light in the range of 400 to 700 nanometers (nm).
 This range enables us to see from violet to red in a continuous rainbow. In our eyes,
receptors called cones are sensitive to red, green, and blue light.
 CMYK Color Mode
 CMYK stands for cyan, magenta, yellow, and black. In CMYK mode, these colors
are mixed in a subtractive form specifically for print.
 This subtractive model is not a native format for a computer because screens create color in an additive mode with light.
 HSV Color Mode: HSV stands for hue, saturation, and value.
 This color mode is also known as HSB (hue, saturation, and brightness) and HSL
(hue, saturation, and lightness).
 This color mode is actually a transformation of RGB mode, commonly presented as a color-picker model to allow for easy color selection.
 YUV Color Mode: YUV color mode uses a luma (Y) component and two chrominance (U, V) components.
 Luma is the brightness of the image and chrominance is the color.
 This color mode is used to interface between analog and digital equipment, such as when digitizing old VHS tapes. The same luma/chroma representation underlies the MPEG and JPEG file formats.
 Resolution, Device Aspect Ratio, and Pixel Aspect Ratio

 Resolution is the total number of pixels on a monitor or screen


 Device aspect ratio is the ratio of the width to the height of the monitor.
 Pixel aspect ratio is the width-to-height ratio of the individual pixels on that screen, i.e., the shape of the pixel:
 round
 square
 rectangular
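The two ratios combine: what you see on screen depends on both the pixel count and the pixel shape. A small sketch (the PAR value 0.9 is the figure many editing tools quote for NTSC SD video; treat it as an illustrative assumption):

```python
def display_aspect(width_px, height_px, pixel_aspect=1.0):
    """Device aspect ratio as seen on screen:
    (pixel count * pixel shape) / height."""
    return (width_px * pixel_aspect) / height_px

print(round(display_aspect(1920, 1080), 3))  # square pixels: 1.778 (16:9)
print(round(display_aspect(720, 480, 0.9), 3))  # non-square SD pixels: 1.35 (~4:3)
```

This is why 720 x 480 SD video, whose raw pixel grid is 1.5:1, still displays at roughly 4:3: its pixels are narrower than they are tall.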

Safe areas
 Safe areas and title-safe areas on a monitor or screen were created to account for the bezel of old televisions and the shape of old TV screens.
 The bezel is the overlapping part of a TV case that would occlude the screen.
 The safe area is about 10% in from the outer edge of the screen; the title-safe area is about 20% in from the outer edge of the screen.
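The margins can be sketched as follows. This reads the 10% / 20% figures as the total fraction trimmed from each dimension (half from each side), which is one common interpretation:

```python
def safe_rect(width, height, trim_fraction):
    """Rectangle (x, y, w, h) after trimming `trim_fraction` of each
    dimension, half from each side."""
    mx = width * trim_fraction / 2
    my = height * trim_fraction / 2
    return (mx, my, width - 2 * mx, height - 2 * my)

print(safe_rect(720, 480, 0.10))  # action-safe: (36.0, 24.0, 648.0, 432.0)
print(safe_rect(720, 480, 0.20))  # title-safe:  (72.0, 48.0, 576.0, 384.0)
```

Anything drawn outside the title-safe rectangle risked being hidden by the bezel on an old CRT television.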
Interlaced and progressive scanning
 Progressive scanning draws a whole image, like a photograph, from top
to bottom until the image is completed.
 This progressive vertical scanning occurs at a very high rate, and persistence of vision means we do not perceive the scanning on the monitor.
 The speed at which this scanning happens is called the
refresh rate.
 You may have witnessed a flickering on your computer
monitor that typically means the refresh rate of the monitor
needs to be adjusted for smooth movement.
 Interlacing was created because old television sets could not hold an
image on the screen through the phosphors long enough before the
scan could complete, so the image would fade away before the next
cycle.
 So the lines were broken into alternating even and odd lines, enabling
the television to draw the image faster.
 These alternating lines are called fields. The images are captured as fields at twice the frame rate.
 For example, 25 frames per second is captured and displayed as 50 fields per second.
Standard System Independent Formats

The following brief descriptions cover the most commonly used formats.

Follow some of the document links for more details.

GIF (GIF87a, GIF89a)

Graphics Interchange Format (GIF), devised by CompuServe, initially for transmitting graphical images over phone lines via modems

Uses the Lempel-Ziv-Welch (LZW) algorithm (a dictionary-based compression method patented by Unisys), modified slightly for image scan-line packets (line grouping of pixels)

Limited to only 8-bit (256) colour images, suitable for images with few
distinctive colours (e.g., graphics drawing)

Supports interlacing
JPEG

A standard for photographic image compression created by the Joint Photographic Experts Group

Takes advantage of limitations in the human vision system to achieve high rates of compression

Lossy compression, which allows the user to set the desired level of quality/compression

Detailed discussion in the next chapter on compression.

TIFF

Tagged Image File Format (TIFF) stores many different types of images (e.g., monochrome, greyscale, 8-bit & 24-bit RGB, etc.) –> tagged

Developed by the Aldus Corp. in the 1980s and later supported by Microsoft

TIFF is a lossless format (when not utilizing the new JPEG tag which
allows for JPEG compression)
It does not provide any major advantages over JPEG and is not as user-controllable; it appears to be declining in popularity

Graphics Animation Files

FLC – main animation or moving-picture file format, originally created by Autodesk for Animator Pro

FLI – an earlier, similar format

GL – better quality moving pictures, usually large file sizes

PostScript/Encapsulated PostScript (EPS)

A typesetting language which includes text as well as vector/structured graphics and bit-mapped images

Used in several popular graphics programs (Illustrator, FreeHand)

Does not provide compression; files are often large

6.4.3 System Dependent Formats

Many graphical/imaging applications create their own file formats particular to the systems they are executed upon. The following are a few popular system-dependent formats:

Microsoft Windows: BMP

A system standard graphics file format for Microsoft Windows

Used in Windows Paintbrush and many other programs

It is capable of storing 24-bit bitmap images

Macintosh: PAINT and PICT

PAINT was originally used in the MacPaint program, initially only for 1-bit monochrome images.
PICT is used in MacDraw (a vector-based drawing program) for storing structured graphics.

X Windows: XBM

Primary graphics format for the X Window system

XBM itself stores monochrome (1-bit) bitmaps; the companion XPM format adds colour

Supported by many public-domain graphic editors, e.g., xv

Used in X Windows for storing icons, pixmaps, backdrops, etc.

Compression
 Compression is the reduction or reordering of data in a file to make that file smaller so it can be more easily distributed and viewed.
 However, the way you compress images and video can greatly change the quality and usability of those files.
 The compression of video and image files has two basic parts: compressing and decompressing.
 Compressing and decompressing always come as a pair, and a codec is used to start and complete this process.
 Codec, short for compressor/decompressor, is a program that compresses and decompresses video or images for easy viewing or editing.
 The primary idea of compressing a file is to keep only the information that is needed and to discard data that is less important, shrinking the file size.
 There are two types of compression: lossy and lossless.

 Lossy compression allows for the loss of some data to shrink the final file size.
 Human perception of visual data is taken into account when the codec discards data, to give a good representation of the original image.
 The bottom line of lossy compression is that you trade image quality for faster playback and a smaller file size, and after that information is lost, it cannot be brought back.
 Lossless compression does not allow any loss of quality. It typically does not create as small a file as lossy compression, but final quality is the most important aspect of lossless compression.
 Video can use either spatial or temporal compression.
 Spatial (intraframe) compression looks at each frame individually and removes redundant information within that frame.
 Temporal (interframe) compression, instead of compressing every frame in full, chooses certain frames called keyframes in which to write all the pixel information.
 Then, for all other frames between the keyframes, the codec writes only the pixel information that differs from these keyframes.
 The frames between the keyframes are called delta frames.
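The keyframe/delta-frame idea can be sketched with a toy encoder (illustrative only; real codecs are far more sophisticated). Frames are flat lists of pixel values; keyframes store everything, delta frames store only the (index, value) pairs that differ from the last keyframe:

```python
def temporal_encode(frames, keyframe_interval=3):
    """Encode frames as full keyframes plus sparse delta frames."""
    encoded = []
    keyframe = None
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            keyframe = frame
            encoded.append(("key", list(frame)))   # all pixel data
        else:
            # Only the pixels that changed relative to the last keyframe
            diffs = [(j, v) for j, (v, k) in enumerate(zip(frame, keyframe)) if v != k]
            encoded.append(("delta", diffs))
    return encoded

frames = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 0, 5], [1, 9, 0, 5]]
for kind, data in temporal_encode(frames):
    print(kind, data)
# key [0, 0, 0, 0]
# delta [(1, 9)]
# delta [(1, 9), (3, 5)]
# key [1, 9, 0, 5]
```

When little changes between frames, the delta frames are tiny, which is exactly where temporal compression wins.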

 Frame rate and timecode


 Frame rate is the rate at which a video, film, or game shows each frame:
 24 fps – film
 30 fps – NTSC television
 6 fps to 100+ fps – frame rates for games
 Timecode is the method of labeling frames so that picture and sound can be synced.
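Timecode labels each frame as HH:MM:SS:FF (hours, minutes, seconds, frames). A sketch of the conversion from a raw frame count, assuming a non-drop-frame 30 fps count:

```python
def to_timecode(frame_number, fps=30):
    """Convert a frame count to HH:MM:SS:FF (non-drop-frame)."""
    frames = frame_number % fps
    total_seconds = frame_number // fps
    seconds = total_seconds % 60
    minutes = (total_seconds // 60) % 60
    hours = total_seconds // 3600
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"

print(to_timecode(0))       # 00:00:00:00
print(to_timecode(29))      # 00:00:00:29
print(to_timecode(30))      # 00:00:01:00
print(to_timecode(107892))  # 00:59:56:12
```

Real NTSC material at 29.97 fps uses drop-frame timecode to stay in step with wall-clock time; that correction is omitted here.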

Digital Image Capture


 You can use various tools to capture photo-like digital images: scanners, cameras, video cameras, and webcams.
 Scanners allow you to place an image or object on a glass flatbed to be optically scanned, similar to a copy machine.

 Digital cameras have come a long way in a short amount of time. These capturing devices can now capture very high-definition images with the aid of a lens at an incredible rate.
 Digital video cameras enable video to be recorded at various frame rates.

Basics of Video
6.6.1 Types of Colour Video Signals

Component video – each primary is sent as a separate video signal.

– The primaries can either be RGB or a luminance-chrominance transformation of them (e.g., YIQ, YUV).

– Best colour reproduction

– Requires more bandwidth and good synchronization of the three components
Composite video – colour (chrominance) and luminance signals are mixed into a single carrier wave. Some interference between the two signals is inevitable.

S-Video (Separated video, e.g., in S-VHS) – a compromise between component analog video and composite video. It uses two lines, one for luminance and another for the composite chrominance signal.

6.6.2 Analog Video

The following figures (Fig. 6.27 and 6.28) are from A.M. Tekalp, Digital
video processing, Prentice Hall PTR, 1995.

Figure 6.27: Raster Scanning


Figure 6.28: NTSC Signal

NTSC Video

525 scan lines per frame, 30 frames per second (or, to be exact, 29.97 fps, 33.37 msec/frame)
Aspect ratio 4:3

Interlaced, each frame is divided into 2 fields, 262.5 lines/field

20 lines reserved for control information at the beginning of each field (Fig. 6.29)

– So a maximum of ~485 lines of visible data

– Laserdisc and S-VHS have an actual resolution of ~420 lines

– Ordinary TV – ~320 lines

Each line takes 63.5 microseconds to scan. Horizontal retrace takes 10 microseconds (with a 5-microsecond horizontal sync pulse embedded), so the active line time is 53.5 microseconds.

Figure 6.29: Digital Video Rasters

Colour representation:

– NTSC uses YIQ colour model.

– composite = Y + I cos(Fsc t) + Q sin(Fsc t), where Fsc is the frequency of the colour subcarrier

– The eye is most sensitive to Y, next to I, next to Q. In NTSC, 4 MHz is allocated to Y, 1.5 MHz to I, and 0.6 MHz to Q.

PAL Video
625 scan lines per frame, 25 frames per second (40 msec/frame)

Aspect ratio 4:3

Interlaced, each frame is divided into 2 fields, 312.5 lines/field

Colour representation:

– PAL uses YUV (YCbCr) colour model

– composite = Y + 0.492 x U sin(Fsc t) + 0.877 x V cos(Fsc t)

– In component analog video, the U and V signals are lowpass filtered to about half the bandwidth of Y.
6.6.3 Digital Video

Advantages:

– Direct random access –> good for nonlinear video editing

– No problem for repeated recording

– No need for blanking and sync pulse

Almost all digital video uses component video

6.6.4 Chroma Subsampling

How to decimate for chrominance (Fig. 6.30)?

Figure 6.30: Chroma Subsampling

4:2:2 –> Chrominance horizontally subsampled by a factor of 2. Each pixel is two bytes, e.g., (Cb0, Y0)(Cr0, Y1)(Cb2, Y2)(Cr2, Y3)(Cb4, Y4) ...

4:1:1 –> Horizontally subsampled by a factor of 4

4:2:0 –> Subsampled in both the horizontal and vertical axes by a factor
of 2 between pixels as shown in the Fig. 6.30.

4:1:1 and 4:2:0 are mostly used in JPEG and MPEG (see Chapter 4).
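The savings from each scheme follow from counting samples: one byte per luma sample plus two chroma planes at reduced resolution. A sketch:

```python
def frame_bytes(width, height, h_sub, v_sub):
    """Bytes per frame: full-resolution luma plus two chroma planes
    subsampled by h_sub horizontally and v_sub vertically (1 byte/sample)."""
    luma = width * height
    chroma = 2 * (width // h_sub) * (height // v_sub)
    return luma + chroma

w, h = 720, 480
print(frame_bytes(w, h, 1, 1))  # no subsampling (4:4:4): 1036800
print(frame_bytes(w, h, 2, 1))  # 4:2:2 -> 691200 (2 bytes/pixel on average)
print(frame_bytes(w, h, 4, 1))  # 4:1:1 -> 518400 (1.5 bytes/pixel)
print(frame_bytes(w, h, 2, 2))  # 4:2:0 -> 518400 (1.5 bytes/pixel)
```

So 4:2:0 halves the chroma data in both axes, cutting the frame to half the size of full 4:4:4 sampling while leaving luma resolution untouched.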

CCIR Standards for Digital Video

(CCIR – Consultative Committee for International Radio)

                      CCIR 601    CCIR 601    CIF         QCIF
                      525/60      625/50
                      NTSC        PAL/SECAM   NTSC        NTSC
--------------------  ----------  ----------  ----------  ----------
Luminance resolution  720 x 485   720 x 576   352 x 240   176 x 120
Chrominance resolut.  360 x 485   360 x 576   176 x 120   88 x 60
Colour subsampling    4:2:2       4:2:2
Fields/sec            60          50          30          30
Interlacing           Yes         Yes         No          No

CCIR 601 uses interlaced scan, so each field only has half as much vertical resolution (e.g., 243 lines in NTSC). The CCIR 601 (NTSC) data rate is ~165 Mbps.

CIF (Common Intermediate Format) was introduced as an acceptable temporary standard. It delivers about VHS quality. CIF uses progressive (non-interlaced) scan.

ATSC Digital Television Standard

(ATSC – Advanced Television Systems Committee) The ATSC Digital Television Standard was recommended for adoption as the Advanced TV broadcasting standard by the FCC Advisory Committee on Advanced Television Service on November 28, 1995. It covers the standard for HDTV (High Definition TV).

Video Format

The video scanning formats supported by the ATSC Digital Television Standard are shown in the following table.

Vertical Lines  Horizontal Pixels  Aspect Ratio   Picture Rate
1080            1920               16:9           60I 30P 24P
720             1280               16:9           60P 30P 24P
480             704                16:9 and 4:3   60I 60P 30P 24P
480             640                4:3            60I 60P 30P 24P

The aspect ratio for HDTV is 16:9 as opposed to 4:3 in NTSC, PAL, and SECAM. (A 33% increase in the horizontal dimension.)
In the picture rate column, the "I" means interlaced
scan, and the "P" means progressive (non-interlaced)
scan.

Both NTSC rates and integer rates are supported (i.e., 60.00, 59.94,
30.00, 29.97, 24.00, and 23.98).

You might also like