0% found this document useful (0 votes)
8 views

Lecture 4

The document discusses image representation and storage in computer vision, focusing on raster and vector images, their encoding, and techniques used in machine learning. It highlights the importance of color depth and various image formats, as well as representation methods such as pixel-based representation and feature extraction. A case study is included to illustrate the application of these techniques in developing an object recognition system for self-driving cars.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views

Lecture 4

The document discusses image representation and storage in computer vision, focusing on raster and vector images, their encoding, and techniques used in machine learning. It highlights the importance of color depth and various image formats, as well as representation methods such as pixel-based representation and feature extraction. A case study is included to illustrate the application of these techniques in developing an object recognition system for self-driving cars.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 22

Image Processing &

Computer Vision
Lecture# 4
Content
• Image representation & storage
• Image Representation
• Raster Images
• Vector Images
• Image Representation Techniques in Machine Learning
What is the purpose of image encoding in computer science?

 Image encoding or image compression is the process of converting an image into a series of
bytes and codes, reducing the file size for storage or transmission and enhancing the efficiency
of computer resources.

 Image encoding in computer science is solely for data encryption and maintaining the privacy
of image content.

 The purpose of image encoding in computer science is to enlarge the file size for better image
quality.

 Image encoding or image compression is the process of editing an image to enhance its visual
elements like contrast and brightness.
Image Representation

Definition
Image representation refers to the way in which images are stored, processed, and displayed in
digital systems.

It encompasses the encoding of visual information so that it can be used by computer


algorithms.Images can be represented in various formats, and understanding how these
formats work is crucial for anyone working with digital media.

This representation can cover anything from basic pixel values to more complex data
structures.

Image Representation: The method or format used to encode, store, and


display visual information in digital systems.
There are two primary types of image representation:

 raster and vector. In raster images, the picture is made up of pixels, which are
small dots of color.

 This means that the quality of the image is tied to its resolution.
 higher resolution means more pixels and finer detail.

 In contrast, vector images are composed of paths defined by mathematical


formulas.

 These images can be scaled to any size without losing quality because they
do not rely on pixels.

 Understanding these fundamental types is vital for selecting the appropriate


image format for specific applications.
Example of Raster vs. Vector:
A photograph is typically represented as a raster image, while a logo that needs to be resized
frequently may be best represented as a vector image.

Image representation can also involve color depth, which refers to the number of bits used to
represent each pixel's color.

Common depths include:

 1-bit: Black and white images


 8-bit: 256 colors
 24-bit: over 16 million colors (True Color)

Higher color depths allow for more vibrant images but require more storage space.
Different image formats are available, each with unique properties that affect usage.Some
common formats include:

Format Type Use Cases


Photographs and web images due
JPEG Raster
to compression.

PNG Raster Images requiring transparency

GIF Raster Simple animations and graphics

SVG Vector Logos and icons that need scaling

TIFF Raster High-quality images in printing

This detailed understanding allows for better decisions in digital projects, ensuring that the right
format is chosen for quality, performance, and intended use.
Image Representation in Computer Science
Image representation plays a critical role in Computer Science, particularly in areas
involving graphics, web development, and media. Understanding how images are
encoded allows for effective manipulation, storage, and display in various
applications.

Two main categories of image representation exist: raster and vector images.

Raster Image: An image represented as a grid of pixels, where each pixel has its
specific color value.

Vector Image: An image defined by mathematical equations representing shapes,


allowing scaling without loss of quality.

For example, a photo taken with a camera is stored as a raster image, while a
company logo designed in a graphics editor may be saved as a vector image.
Another aspect of image representation is the concept of color depth, which
indicates the number of bits used to represent the color of a single pixel.
Common color depths include:

 1-bit: Black and white images


 8-bit: 256 colors
 24-bit: True color (16.7 million colors)

Higher color depth results in more accurate color representation but also
increases file sizes.

Color Depth: The number of bits allocated to represent the color of a pixel in
an image.
The choice of image format is vital for performance and quality. Here are some popular formats and
their characteristics:

Format Type Uses

Common for photographic images due


JPEG Raster
to compression

Supports transparency and is widely used on the


PNG Raster
web

GIF Raster Used for simple animations and graphics

SVG Vector Ideal for scalable graphics like logos

TIFF Raster High-quality images for printing and archiving


Selecting the appropriate format depends on the desired balance between image quality, file
size, and specific use cases.
Image Representation Techniques in Machine Learning
 In the realm of machine learning, understanding image representation is essential for tasks such
as image classification, object detection, and image generative modeling. Images must be
converted into a format that algorithms can interpret, which often involves translating visual data
into numerical representations.

 This conversion is key to enabling effective analysis and manipulation of image data.

 One common technique for image representation is pixel-based representation.

 This method involves representing each image as a grid of pixels, with each pixel containing
values for color components.

 Consider the structure of a 24-bit RGB image, where each pixel has three color components
(red, green, blue), typically represented by 8 bits each. Thus, an image of size 100x100 pixels
consists of an array with dimensions 100x100x3.
Pixel-based Representation:

 Pixel-based representation is the backbone of digital imaging, allowing computers to store


and manipulate visual data as discrete units of color and intensity. It's crucial for various
applications, from digital photography to computer vision, enabling the capture, processing,
and display of images in a format computers can understand.

 A method of representing an image by encoding each pixel's color information in a grid


format.

Example of Pixel Array:For a 2x2 pixel image in RGB format, the pixel representation could
look like the following:In this case, the first pixel is red, the second pixel is green, the third is
blue, and the fourth is yellow.
Another important technique is feature extraction, where key attributes of an image are
identified and used instead of raw pixel values.

This technique reduces the amount of data that needs to be processed and can significantly
improve model performance. Popular methods of feature extraction include:

 Histogram of Oriented Gradients (HOG)

 Scale-Invariant Feature Transform (SIFT)

 Principal Component Analysis (PCA)

 Feature Extraction:
 The process of transforming raw image data into a set of relevant features for analysis in
machine learning.
Feature extraction methods play a crucial role in image representation. Here are some popular
techniques with their applications:

Method Advantages Applications

Effective for recognizing Pedestrian detection, face


HOG
objects detection

Robust against scaling Image stitching, object


SIFT
and rotation recognition

Reduces dimensionality, Image compression, facial


PCA
denoising recognition

By leveraging these features, machine learning models can focus on the most significant
information in the image, leading to faster processing and better predictions.
Image Representation Examples and Explained Techniques
In computer science, various techniques are used to represent images effectively for processing
and analysis. Understanding these techniques can greatly enhance the performance of algorithms
that deal with visual data.

Two primary categories of image representation play a crucial role:

Pixel-based representation and Feature extraction.

Pixel-based Representation: The encoding of images as a grid of pixels, where each pixel
corresponds to a specific color value.

Example of a Pixel Array:


For a small 2x2 pixel image in RGB format, the pixel representation could look like this:
[ [[255, 0, 0], [0, 255, 0]], [[0, 0, 255], [255, 255, 0]]]

Here, each array holds the RGB values for each pixel – red, green, blue, and yellow.
Normalizing pixel values to a range between 0 and 1 can improve model performance in
machine learning applications.
Feature extraction involves identifying and utilizing key attributes from images instead of relying
solely on raw pixel values.

This approach allows algorithms to operate more efficiently, focusing on the most relevant
characteristics of the image data.Common techniques for feature extraction include:

 Histogram of Oriented Gradients (HOG)

 Scale-Invariant Feature Transform (SIFT)

 Principal Component Analysis (PCA)

Feature Extraction: A technique that transforms raw image data into a set of relevant
features for analysis, improving data handling efficiency.
Feature extraction techniques are crucial for image representation efficiency and effectiveness.
Below are notable methods with their descriptions and applications:

Method Advantages Applications

Good for object Pedestrian detection,


HOG
recognition tasks face recognition

Robust to scaling and Image stitching, 3D


SIFT
rotation changes modeling

Reduces dimensionality, Facial recognition, image


PCA
useful for denoising compression

By employing these techniques, machine learning models can enhance their accuracy and
efficiency when interpreting image data.
Image Representation : Key takeaways
 Image representation is the method used to encode, store, and display visual information in
digital systems, essential for image processing in computer science.
 The two primary types of image representation are raster, which consists of pixel grids, and
vector, which is defined by mathematical formulas for scalability.
 Color depth refers to the number of bits used to represent pixel colors; higher color depth
allows for more vibrant images at the cost of increased storage.
 In machine learning, image representation techniques like pixel-based representation and
feature extraction are crucial for processing and analyzing visual data effectively.
 Pixel-based representation encodes images as grids of pixels, while feature extraction
identifies relevant attributes to improve algorithm performance.
 Examples of image representation techniques include: Histogram of Oriented Gradients
(HOG), Scale-Invariant Feature Transform (SIFT), and Principal Component Analysis (PCA),
which enhance the efficacy of image processing tasks.
Case Study: Image Representation for Object
Recognition
Problem Statement
A self-driving car company wants to develop an object recognition
system to detect pedestrians, cars, and road signs. The system should
be able to represent images in a way that allows for efficient and
accurate object recognition.

Questions
1. What are the different image representation techniques?
2. How do these techniques affect object recognition accuracy?
3. Which technique is most suitable for the self-driving car's object
recognition system?
Solution
Image Representation Techniques

1. Pixel-based representation: Images are represented as a matrix


of pixel values.

2. Feature-based representation: Images are represented as a set of


features, such as edges, lines, or shapes.

3. Frequency-based representation: Images are represented in the


frequency domain using techniques such as Fourier transform.
Object Recognition Accuracy

1. Pixel-based representation: High accuracy for simple objects, but low


accuracy for complex objects or objects with varying lighting conditions.

2. Feature-based representation: High accuracy for objects with distinct


features, but low accuracy for objects with similar features.

3. Frequency-based representation: High accuracy for objects with distinct


frequency patterns, but low accuracy for objects with similar frequency
patterns.
Suitable Technique
Based on the requirements of the self-driving car's object recognition system, a
feature-based representation technique is most suitable. This technique allows
for efficient and accurate object recognition, even in complex environments
with varying lighting conditions.

Tools and Techniques


1. Convolutional Neural Networks (CNNs): A type of neural network that is
particularly well-suited for image classification tasks.
2. Object Detection Algorithms: Such as YOLO (You Only Look Once) or SSD
(Single Shot Detector).
3. Image Processing Libraries: Such as OpenCV or Pillow.

By using a feature-based representation technique and leveraging tools and techniques such
as CNNs, object detection algorithms, and image processing libraries, the self-driving car
company can develop an efficient and accurate object recognition system.

You might also like