Unit 3 CV and Di
1. HOUGH TRANSFORM
The Hough Transform is a computer vision technique that detects shapes such as lines and
circles in an image. It maps these shapes to mathematical representations in a parameter
space, which makes them easier to identify even when they are broken or partly obscured. The
method is valuable for image analysis, pattern recognition, and object detection. The Hough
Transform is a feature extraction method used in image analysis,
computer vision, and digital image processing. It uses a voting mechanism to identify imperfect
instances of objects within a given class of shapes. The voting is carried out in
parameter space: votes are collected in an accumulator array, and object candidates are
obtained as local maxima in this accumulator space.
1.1. Original image of Lane
Figure 2: Image after applying edge detection technique. Red circles show that the line is
breaking there.
A line can be described analytically in various ways. One of the line equations uses the
parametric or normal form: x cos θ + y sin θ = r, where r is the length of the normal from the
origin to the line and θ is the orientation of that normal with respect to the x-axis, as shown in Figure 5.
The coordinates (x_i, y_i) of the edge points in the image are known and act as constants in the
parametric line equation, whereas r and θ are the unknown variables we seek. If we plot the
possible (r, θ) values defined by each point, points in Cartesian image space map to curves
(sinusoids) in the polar Hough parameter space. This point-to-curve transformation is the
Hough Transform for straight lines. Collinear points in the Cartesian image space become
apparent in the Hough parameter space because their curves
intersect at a single (r, θ) point.
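The point-to-curve mapping can be checked numerically. The sketch below is only an illustration: the four sample points and the helper name `sinusoid` are assumptions chosen for the example, not part of the original notes. It evaluates r = x cos θ + y sin θ for points on the line x + y = 4 and looks for the θ at which all curves agree.

```python
import numpy as np

def sinusoid(x, y, thetas):
    """Curve traced in (r, theta) space by one image point (x, y)."""
    return x * np.cos(thetas) + y * np.sin(thetas)

# Hypothetical sample points, all lying on the line x + y = 4.
points = [(0, 4), (1, 3), (2, 2), (4, 0)]
thetas = np.deg2rad(np.arange(0, 180))          # 1-degree steps, 0..179

curves = np.array([sinusoid(x, y, thetas) for x, y in points])

# The curves cross where their r values coincide, i.e. where the spread is smallest.
spread = curves.max(axis=0) - curves.min(axis=0)
theta_deg = int(np.argmin(spread))
r = float(curves[:, theta_deg].mean())

print(theta_deg, round(r, 4))
```

For this line the normal points along the 45-degree direction, so the curves meet at θ = 45° with r = 4/√2.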
The same idea extends to other shapes. A circle, for example, can be written as
(x − a)² + (y − b)² = r², where a and b are the circle's center coordinates and r is the radius.
The algorithm's computational cost increases because the parameter space now has three
coordinates and the accumulator is 3-D. (In general, with k parameters each quantized into n
bins, the accumulator holds n^k cells, so computation and memory grow rapidly with the number
of parameters.) As a result, the basic Hough approach described
here is applied only to straight lines.
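To make the 3-D accumulator concrete, here is a minimal NumPy sketch of circle voting. It is an illustration under stated assumptions rather than the method used in the course: the function name `hough_circles`, the candidate radii, and the synthetic test circle are all made up for the example.

```python
import numpy as np

def hough_circles(edge_points, shape, radii, n_angles=360):
    """Vote in a 3-D accumulator indexed as [b, a, radius index]."""
    h, w = shape
    acc = np.zeros((h, w, len(radii)), dtype=np.int32)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for x, y in edge_points:
        for k, r in enumerate(radii):
            # Every center (a, b) at distance r from the edge point is a candidate.
            a = np.rint(x - r * np.cos(angles)).astype(int)
            b = np.rint(y - r * np.sin(angles)).astype(int)
            ok = (a >= 0) & (a < w) & (b >= 0) & (b < h)
            if not ok.any():
                continue
            # De-duplicate so one edge point votes at most once per cell.
            cells = np.unique(np.stack([b[ok], a[ok]], axis=1), axis=0)
            acc[cells[:, 0], cells[:, 1], k] += 1
    return acc

# Synthetic edge points on a circle with center (a, b) = (20, 15) and radius 8.
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
pts = sorted({(int(round(20 + 8 * np.cos(u))), int(round(15 + 8 * np.sin(u)))) for u in t})

radii = [6, 8, 10]
acc = hough_circles(pts, shape=(30, 40), radii=radii)
b, a, k = np.unravel_index(np.argmax(acc), acc.shape)
print("center:", (int(a), int(b)), "radius:", radii[int(k)])
```

Each extra candidate radius adds a full 2-D slice to the accumulator, which is the growth in cost the paragraph above describes.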
The Hough transform in image processing is a technique used to detect simple geometric shapes in
images. It works by transforming the image space into a parameter space, where the geometric
shapes can be detected through the identification of patterns in the parameter space.
1.2.1. Algorithm
1. Determine the range of ρ and θ (ρ is the quantity written r earlier). Typically, θ ranges
over [0, 180] degrees and ρ over [-d, d], where d is the length of the diagonal of the edge
image. Both ranges must be quantized so that there is only a finite number of possible values.
2. Create a 2D array called the accumulator, with dimensions (num rhos, num thetas),
to represent the Hough space, and set all its values to zero.
3. Apply edge detection (ED) to the original image. Any ED technique can be
used.
4. Check each pixel of the edge image to see whether it is an edge pixel. If it is,
loop over all possible values of θ, compute the corresponding ρ, locate the θ and ρ
indices in the accumulator, and increment the accumulator at that index pair.
5. Iterate over the accumulator's values. Where a value exceeds a specified threshold,
retrieve the ρ and θ indices, look up the corresponding ρ and θ, and
convert that pair back to the form y = ax + b.
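The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function names, the 1-degree and 1-pixel quantization, the threshold, and the synthetic edge image (a horizontal line with a gap) are assumptions made for the example, and the edge image is supplied directly instead of coming from an edge detector.

```python
import numpy as np

def hough_lines(edges, num_thetas=180):
    """Accumulate votes for rho = x cos(theta) + y sin(theta)."""
    h, w = edges.shape
    d = int(np.ceil(np.hypot(h, w)))                 # image diagonal
    thetas = np.deg2rad(np.linspace(0.0, 180.0, num_thetas, endpoint=False))
    rhos = np.arange(-d, d + 1)                      # quantized to 1 pixel
    acc = np.zeros((len(rhos), len(thetas)), dtype=np.int32)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    theta_idx = np.arange(len(thetas))
    ys, xs = np.nonzero(edges)                       # edge pixels only
    for x, y in zip(xs, ys):
        rho = np.rint(x * cos_t + y * sin_t).astype(int)
        acc[rho + d, theta_idx] += 1                 # one vote per theta
    return acc, rhos, thetas

def find_peaks(acc, rhos, thetas, threshold):
    """Return (rho, theta in degrees, votes) for cells at or above threshold."""
    r_idx, t_idx = np.nonzero(acc >= threshold)
    return [(int(rhos[i]), float(np.rad2deg(thetas[j])), int(acc[i, j]))
            for i, j in zip(r_idx, t_idx)]

def slope_intercept(rho, theta_deg):
    """Convert (rho, theta) to y = a*x + b; undefined for vertical lines."""
    t = np.deg2rad(theta_deg)
    return -np.cos(t) / np.sin(t), rho / np.sin(t)

# Synthetic edge image: horizontal line y = 20 with a gap at x = 20..24.
edges = np.zeros((50, 50), dtype=np.uint8)
edges[20, 5:45] = 1
edges[20, 20:25] = 0

acc, rhos, thetas = hough_lines(edges)
lines = find_peaks(acc, rhos, thetas, threshold=30)
a, b = slope_intercept(lines[0][0], lines[0][1])
print(lines, round(float(a), 6), round(float(b), 6))
```

The gap in the line does not prevent detection, because the remaining 35 edge pixels still vote for the same (ρ, θ) cell.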
DISCUSSION QUESTIONS:
1. How does the Hough Transform algorithm utilize the concept of parameter space to detect
shapes like lines or circles in an image?
2. What are the limitations of the standard Hough Transform in detecting shapes in images
with significant noise or occlusion, and how can these be mitigated?
3. How does the probabilistic Hough Transform differ from the standard approach, and in
what scenarios is the probabilistic version preferred?
4. Given an image with overlapping circles, how can the Hough Transform be adjusted or
combined with other techniques to accurately detect all the circles?
5. Can you propose an enhancement to the Hough Transform to make it more efficient in
terms of computational cost while maintaining accuracy?
22AM602 – COMPUTER VISION AND DIGITAL IMAGING
Now take a = 0 and a = 1 and find the corresponding b value for each of the five equations given above.

Point    Equation      a = 0                 New point (a, b)   a = 1                  New point (a, b)
A(1,4)   b = -a + 4    b = -(0) + 4 = 4      (0, 4)             b = -(1) + 4 = 3       (1, 3)
B(2,3)   b = -2a + 3   b = -2(0) + 3 = 3     (0, 3)             b = -2(1) + 3 = 1      (1, 1)
C(3,1)   b = -3a + 1   b = -3(0) + 1 = 1     (0, 1)             b = -3(1) + 1 = -2     (1, -2)
D(4,1)   b = -4a + 1   b = -4(0) + 1 = 1     (0, 1)             b = -4(1) + 1 = -3     (1, -3)
E(5,0)   b = -5a + 0   b = -5(0) + 0 = 0     (0, 0)             b = -5(1) + 0 = -5     (1, -5)
Let us plot the new points on the graph, as shown below in Figure 6.
We can see that almost all of the lines cross each other at the point (-1, 5), so a = -1 and b = 5.
Substituting these values into y = ax + b gives y = -1x + 5, so y = -x + 5 is the line
that passes through the edge points (all of them except C, whose line does not pass through (-1, 5)).
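The worked example can be reproduced with a few lines of Python. This is a small sketch for checking the arithmetic, not part of the original exercise: it intersects every pair of lines b = -x·a + y in (a, b) space and counts how often each intersection occurs. The helper name `intersection` is an assumption for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

points = {"A": (1, 4), "B": (2, 3), "C": (3, 1), "D": (4, 1), "E": (5, 0)}

def intersection(p, q):
    """Intersection in (a, b) space of the lines b = -x*a + y for points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None                      # parallel lines in (a, b) space
    a = Fraction(y1 - y2, x1 - x2)
    b = y1 - x1 * a
    return (a, b)

# Each pair of points "votes" for the (a, b) where their two lines cross.
votes = Counter()
for p, q in combinations(points.values(), 2):
    hit = intersection(p, q)
    if hit is not None:
        votes[hit] += 1

(a, b), count = votes.most_common(1)[0]
on_line = [name for name, (x, y) in points.items() if y == a * x + b]
print(f"a = {a}, b = {b}, votes = {count}, points on the line: {on_line}")
```

Exact fractions are used so that intersections compare equal without floating-point tolerance.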
1.3 Applications
1.3.2. 3D Applications
The Hough transform in image processing can be extended to three-dimensional spaces to
detect three-dimensional shapes, such as planes or spheres, in 3D images or point clouds. This can
be useful in applications such as 3D modeling, robotics, and computer vision.
1.3.4. Object Tracking
The Hough transform can be used for object tracking by detecting and tracking specific geometric
shapes over time. This can be useful in applications such as surveillance or autonomous vehicles.
1.3.7. Medical Application
The Hough transform can be used in medical applications such as image analysis or diagnosis,
where it can detect specific geometric shapes in medical images, such as tumors
or blood vessels.
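from google.colab.patches import cv2_imshow  # Colab replacement for cv2.imshow
print("Circle Detection using Hough Transform")
cv2_imshow(img)  # img is assumed to hold the result image; the code that produces it is not shown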
DISCUSSION QUESTIONS:
1. Propose potential modifications to the Hough Transform that could enhance its
accuracy in noisy environments.
2. Analyze how the Hough Transform facilitates object detection in satellite imagery. What
specific challenges does it address in this domain?
3. Discuss the advantages and limitations of applying the Hough Transform for detecting
lines and circles in medical imaging, such as X-rays or MRI scans.
4. Compare the Hough Transform's role in detecting shapes in natural images versus
synthetic datasets. What adjustments are necessary for each scenario?
5. Critique the adaptability of the Hough Transform for edge cases, such as detecting shapes
in low-contrast images or those with high noise levels.
1. ARCHITECTURAL DESIGN
Architectural design is the process of conceptualizing, planning, and envisioning a
framework and turning it into a fully functional structure. It focuses on cultural needs and
aesthetics.
The phases of the architectural design process are:
Pre-Design.
Schematic Design.
Design Development.
Contract Documents.
Bidding/Negotiation.
Contract Administration (construction).
Post-Occupancy.
2. DESIGN-BUILD
In design-build, a construction firm provides its own designers in lieu of a
separate architect and is also responsible for constructing the project. Design-build
is advantageous for owners looking for a single entity to manage the entire project, as
builders and designers work hand-in-hand to provide design, engineering,
and implementation services. This streamlined communication reduces conflicts
between parties and the need for mediation between designer and builder. Another
advantage is a faster timeline, because there is no bidding process.
2.1. Design-bid-build
Design-bid-build is the most traditional project delivery method and appeals most
to owners seeking the lowest bid. In this method, the owner contracts designers and builders
separately. The design firm delivers 100% complete design documents. The owner then
solicits bids from contractors to perform the documented scope of work. Costs stay low
because contractors bid against each other and the owner can choose the most competitive price.
With design-bid-build, design and construction have more distinct and independent roles
and clear-cut ownership, which makes liability more apparent.
Despite these benefits, design-bid-build also has drawbacks.
First, it is higher-risk because designers and builders have no
contractual obligation to each other. As a result, the owner assumes the risk associated with
the completeness of the design documents. Furthermore, contractors are bound solely by the
contents of the documents, which can cause mistakes, variations, and gaps in expectations.
DISCUSSION QUESTIONS:
1. Analyze the key considerations when designing architectures for object detection and
segmentation tasks in computer vision. How do these considerations influence model performance
and computational efficiency?
4. Design an architecture suitable for real-time object tracking in video feeds. What
architectural elements would you prioritize to balance speed and accuracy?
5. Explore how architectural design in computer vision has evolved to support multi-modal
inputs (e.g., combining images with text or audio). What are the implications for future
developments?
Within a session: We configure a session for the user-defined commit. If the Integration
Service fails to transform or write any row to the target, we can choose to commit or
roll back the transaction.
When we run the session, the Integration Service evaluates the expression for each
row that enters the transformation. When it evaluates a commit row, it commits all
rows in the transaction to the target or targets. When it evaluates a
rollback row, it rolls back all rows in the transaction from the target or targets.
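The commit and rollback behaviour described above can be illustrated with a generic Python sketch. It does not use or reproduce the Integration Service's actual interface: the function `apply_transactions`, the `"flag"` field, and the choice to include the commit row itself in the committed transaction are all assumptions made for illustration.

```python
def apply_transactions(rows, decide):
    """Buffer rows and commit or discard them when a control row arrives.

    decide(row) returns "commit", "rollback", or anything else to continue.
    """
    target, pending = [], []
    for row in rows:
        pending.append(row)
        action = decide(row)
        if action == "commit":
            target.extend(pending)   # write the whole open transaction
            pending = []
        elif action == "rollback":
            pending = []             # discard the whole open transaction
    return target, pending

# Hypothetical rows; "flag" plays the role of the evaluated expression.
rows = [
    {"id": 1, "flag": "continue"},
    {"id": 2, "flag": "commit"},
    {"id": 3, "flag": "continue"},
    {"id": 4, "flag": "rollback"},
    {"id": 5, "flag": "continue"},
]

target, pending = apply_transactions(rows, decide=lambda row: row["flag"])
print([r["id"] for r in target], [r["id"] for r in pending])
```

Rows 1 and 2 reach the target, rows 3 and 4 are discarded, and row 5 stays in an open transaction until a later control row decides its fate.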
If the mapping has a flat file as the target, the Integration Service can generate a new
output file for each transaction, and we can name the target flat files dynamically.
1.1. Steps of Data Mapping
Step 1: Define — Define the data to be moved, including the tables, the fields within each
table, and the format of the field after it's moved. For data integrations, the frequency of data
transfer is also defined.
Step 2: Map the Data — Match source fields to destination fields.
Step 3: Transformation — If a field requires transformation, the transformation formula
or rule is coded.
Step 4: Test — Using a test system and sample data from the source, run the transfer to
see how it works and make adjustments as necessary.
Step 5: Deploy — Once it's determined that the data transformation is working as planned,
schedule a migration or integration go-live event.
Step 6: Maintain and Update — For ongoing data integration, the data map is a living
entity that will require updates and changes as new data sources are added, as data sources change,
or as requirements at the destination change.
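Steps 1 to 4 can be sketched in a few lines of Python. This is a hedged illustration only: the field names, the transformation rules, and the sample record are hypothetical and do not come from any particular mapping tool.

```python
from datetime import datetime

# Steps 1-3: define the source fields, map each to a destination field,
# and attach a transformation rule where one is needed.
FIELD_MAP = {
    "cust_name": ("customer_name", str.strip),
    "dob": ("date_of_birth",
            lambda s: datetime.strptime(s, "%d/%m/%Y").date().isoformat()),
    "amt": ("amount_cents", lambda s: round(float(s) * 100)),
}

def map_record(source):
    """Apply the field map to one source record."""
    return {dest: transform(source[field])
            for field, (dest, transform) in FIELD_MAP.items()}

# Step 4: test the mapping on sample data before deploying it.
sample = {"cust_name": "  Ada Lovelace ", "dob": "10/12/1815", "amt": "19.99"}
mapped = map_record(sample)
print(mapped)
```

Keeping the map as data makes step 6 easier, since adding or changing a field means editing one entry in `FIELD_MAP`.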
Advanced cloud-based data mapping and transformation tools can help enterprises get
more out of their data without stretching the budget. This data mapping example shows data fields
being mapped from the source to a destination.
In the past, organizations documented data mappings on paper, which was sufficient at the
time. But the landscape has become much more complex. With more data, more mappings, and
constant changes, paper-based systems can't keep pace. They lack transparency and don't track the
inevitable changes in the data models. Mapping by hand also means coding transformations by
hand, which is time-consuming and error-prone.
Data maps are not a one-and-done deal. Changes in data standards, reporting
requirements, and systems mean that maps need maintenance. With a cloud-based data mapping
tool, stakeholders no longer run the risk of losing documentation about changes. Good data
mapping tools allow users to track the impact of changes as maps are updated. Data mapping
tools also allow users to reuse maps, so you don't have to start from scratch each time.
DISCUSSION QUESTIONS:
1. Examine the importance of transform mapping in creating feature hierarchies for tasks
like object detection and semantic segmentation. How can improper mapping affect downstream
performance?
2. Design a data pipeline for a computer vision project that involves large-scale image
datasets. What steps would you include to ensure efficient transform and transaction mapping?
4. Investigate the impact of using learned transformations (e.g., neural networks) versus
handcrafted transformations in transaction mapping for data pre-processing in computer vision
pipelines.
5. Explore how transform mapping can optimize the pre-processing of video data for tasks
like action recognition or video summarization.
The user interface plays a crucial role in any software system. It is possibly the only visible
aspect of a software system, because −
o Users initially see the external user interface of a software system
without considering its internal architecture.
o A good user interface must attract the user to use the software system without
mistakes. It should help the user understand the software system easily, without
misleading information. A bad UI may cause the software system to fail in the market
against its competition.
o UI has its syntax and semantics. The syntax comprises component types such as
textual, icon, button, etc., and usability summarizes the semantics of the UI. The quality
of a UI is characterized by its look and feel (syntax) and its usability (semantics).
o There are basically two major kinds of user interface − a) Textual b) Graphical.
o Software in different domains may require a different style of user interface; for
example, a calculator needs only a small area for displaying numbers but a big
area for commands, whereas a web page needs forms, links, tabs, etc.
1.1. Graphical User Interface
A graphical user interface is the most common type of user interface available
today. It is very user-friendly because it makes use of pictures, graphics, and icons, which
is why it is called 'graphical'.
Pointers − A symbol, such as an arrow, that moves around the screen as the user
moves the mouse. It helps the user select objects.
1.2. Design of User Interface
User interface design starts with task analysis, which establishes the user's primary tasks
and problem domain. The interface should be designed in terms of the user's terminology and
the user's job, rather than the programmer's.
To perform user interface analysis, the practitioner needs to study and understand four
elements −
o The users who will interact with the system through the interface
o The tasks that end users must perform to do their work
o The content that is presented as part of the interface
o The work environment in which these tasks will be conducted
Good UI design works from the user's capabilities and limitations, not the
machine's. While designing the UI, knowledge of the nature of the user's work and
environment is also critical.
The tasks to be performed can then be divided and assigned to the user or the machine,
based on knowledge of the capabilities and limitations of each. The design of a user
interface is often divided into four different levels −
o The conceptual level − It describes the basic entities considering the user's
view of the system and the actions possible upon them.
o The semantic level − It describes the functions performed by the system i.e.
description of the functional requirements of the system, but does not address
how the user will invoke the functions.
o The syntactic level − It describes the sequences of inputs and outputs required
to invoke the functions described.
o The lexical level − It determines how the inputs and outputs are actually
formed from primitive hardware operations.
User interface design is an iterative process, in which each iteration elaborates and refines
the information developed in the preceding steps. The general steps for user interface
design are −
o Define user interface objects and actions (operations).
o Define events (user actions) that will cause the state of the user interface to
change.
o Indicate how the user interprets the state of the system from information
provided through the interface.
o Describe each interface state as it will actually look to the end user.
1.3.1. Interface analysis
Interface analysis focuses on the users who will interact with the system, their tasks, the
content, and the work environment. It defines the human- and computer-oriented tasks that are
required to achieve system function.
Interface design − It defines a set of interface objects, actions, and their screen
representations that enable a user to perform all defined tasks in a manner that meets every
usability objective defined for the system.
Interface construction − It starts with a prototype that enables usage scenarios to be
evaluated and continues with development tools to complete the construction.
Interface validation − It focuses on the ability of the interface to implement every user task
correctly, to accommodate all task variations, and to achieve all general user requirements,
and on the degree to which the interface is easy to use and easy to learn.
When a user interface is analyzed and designed, the following four models are used −
User profile model − Created by a user or software engineer, it establishes the profile of the
end users of the system based on age, gender, physical abilities, education, motivation, goals,
and personality.
It considers the syntactic and semantic knowledge of the user and classifies users as novices,
knowledgeable intermittent users, and knowledgeable frequent users.
Implementation model − Created by the software implementers, who work on the look and feel of
the interface combined with all supporting information (books, videos, help files) that
describes system syntax and semantics.
It serves as a translation of the design model and attempts to agree with the user's mental
model so that users feel comfortable with the software and use it effectively.
User's mental model − Created by the user when interacting with the application. It contains
the image of the system that users carry in their heads.
It is often called the user's system perception, and the correctness of the description
depends upon the user's profile and overall familiarity with software in the application domain.
DISCUSSION QUESTIONS:
1. Analyze the role of user interface design in improving the interpretability of computer
vision systems. How can UI elements help users understand model outputs, such as object
detection or segmentation results?
3. Investigate how user interfaces can help users identify and address biases in computer
vision models. What tools or visualizations would be most effective?
4. Reflect on the impact of poor UI design on the adoption of computer vision technologies
in industries like retail, manufacturing, or security. How can effective UI design bridge the gap
between complex technology and end users?
5. Explore the use of gamification in user interfaces for crowdsourcing computer vision
tasks like image labeling. What are the potential benefits and drawbacks of this approach?
systems can not only be quickly modified but also evolve according to the changing
requirements.
1.2. Characteristics of Component-Based Design
1.3.1. UI Components
o User interface components offer a convenient way to encapsulate presentation
logic together with visible elements such as
buttons, forms, and widgets.
1.3.2. Service Components
o Service components hold the business logic or application services and
serve as the platform for activities such as data processing,
authentication, and communication with external systems.
1.3.3. Data Components
o Data components abstract data access: they handle database interaction
and provide interfaces and data structures
for querying, updating, and saving data.
1.3.4. Infrastructure Components
o Infrastructure components provide fundamental services or resources, such as logging,
caching, security, and communication protocols, on which a software system
depends.
1.3.5. Integration Components
o Integration components connect different systems or modules,
enabling protocol translation, workflow orchestration, and data exchange.
1.3.6. Reusable Components
o A reusable component encapsulates common functionality or algorithms
that can be used across multiple projects and domains, which
promotes code reuse and uniformity.
Stage 5: Decommissioning: Components are decommissioned, or disposed of, when
they are no longer needed because of obsolescence, redundancy, or architectural
changes.
1.6. Tools and Technologies for Component Design
Component Frameworks: Component frameworks such as Angular, React,
and Vue.js include a variety of pre-built components, templates, and utilities for
building interactive user interfaces.
Dependency Injection Containers: Inversion of Control (IoC) containers such as Spring
IoC, Guice, and Dagger implement inversion of control by managing
component dependencies and creating the instances that dependent
classes need.
Middleware Platforms: Middleware systems such as Apache Camel, MuleSoft,
and RabbitMQ connect different systems and services by means of message routing,
transformation, and mediation.
Component Repositories: Repositories such as npm, Maven Central, and
NuGet centralize the storage, sharing, and discovery of libraries and reusable
components.
Component Testing Tools: Tools such as Jest, JUnit, and Mockito are used for
automated testing of components, so that their functionality, reliability, and
adherence to specification can be verified.
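The dependency-injection idea from the list above can be sketched with a toy container in Python. This is a simplified illustration and not the API of Spring IoC, Guice, or Dagger: the classes `Container`, `ListLogger`, and `OrderService` and the string keys are hypothetical.

```python
from typing import Protocol

class Logger(Protocol):
    def log(self, message: str) -> None: ...

class ListLogger:
    """Logger component that keeps messages in memory."""
    def __init__(self):
        self.messages = []

    def log(self, message: str) -> None:
        self.messages.append(message)

class OrderService:
    """Service component; its logger is injected, not created internally."""
    def __init__(self, logger: Logger):
        self.logger = logger

    def place(self, item: str) -> str:
        self.logger.log(f"order placed: {item}")
        return item

class Container:
    """Toy container: maps a key to a factory that builds the component."""
    def __init__(self):
        self._factories = {}

    def register(self, key, factory):
        self._factories[key] = factory

    def resolve(self, key):
        return self._factories[key](self)

container = Container()
shared_logger = ListLogger()
container.register("logger", lambda c: shared_logger)
container.register("orders", lambda c: OrderService(c.resolve("logger")))

service = container.resolve("orders")
service.place("camera")
print(shared_logger.messages)
```

Because `OrderService` only depends on the `Logger` interface, a test can register a different logger without changing the service, which is also what makes component testing tools easy to apply.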
E-commerce Platforms: E-commerce platforms such as Amazon, eBay,
and Shopify apply component-based architecture to deliver diverse
functionality, such as product catalog management, order
processing, credit card processing, and customer relationship management.
Content Management Systems (CMS): Most content management
systems, such as WordPress, Drupal, and Joomla, use component-based design to offer
modular functionality through plugins, themes, and extensions, which let users
build, manage, and publish content.
Enterprise Resource Planning (ERP) Systems: ERP systems such as SAP, Oracle,
and Microsoft Dynamics use components to integrate business
processes and functionality from a variety of fields, including finance, human
resources, supply chain management, and customer relationship management.
DISCUSSION QUESTIONS:
2. Design a reusable component for real-time object tracking in video feeds. How would
you address issues like latency and robustness in your design?
1. DESIGN PATTERNS
Design patterns are typical solutions to commonly occurring problems in software design.
They are like pre-made blueprints that you can customize to solve a recurring design problem in
your code. You can’t just find a pattern and copy it into your program, the way you can with off-
the-shelf functions or libraries. The pattern is not a specific piece of code, but a general concept
for solving a particular problem. You can follow the pattern details and implement a solution that
suits the realities of your own program. Patterns are often confused with algorithms, because both
concepts describe typical solutions to some known problems. While an algorithm always defines
a clear set of actions that can achieve some goal, a pattern is a more high-level description of a
solution. The code of the same pattern applied to two different programs may be different.
Most patterns are described very formally so people can reproduce them in many contexts.
Here are the sections that are usually present in a pattern description:
Intent of the pattern briefly describes both the problem and the solution.
Motivation further explains the problem and the solution the pattern makes possible.
Structure of classes shows each part of the pattern and how they are related.
Code example in one of the popular programming languages makes it easier to grasp
the idea behind the pattern.
Some pattern catalogs list other useful details, such as applicability of the pattern,
implementation steps and relations with other patterns.
Design patterns differ in their complexity, level of detail, and scale of applicability to the
entire system being designed. Road construction offers a useful analogy: you can make an intersection
safer by either installing some traffic lights or building an entire multi-level interchange with
underground passages for pedestrians.
The most basic and low-level patterns are often called idioms. They usually apply only to
a single programming language.
The most universal and high-level patterns are architectural patterns. Developers can
implement these patterns in virtually any language. Unlike other patterns, they can be used to
design the architecture of an entire application.
In addition, all patterns can be categorized by their intent, or purpose. There are
three main groups of patterns:
Creational patterns provide object creation mechanisms that increase flexibility and
reuse of existing code.
Structural patterns explain how to assemble objects and classes into larger structures,
while keeping these structures flexible and efficient.
Behavioral patterns take care of effective communication and the assignment of
responsibilities between objects.
1.2.2. Structural Design Patterns
Structural design patterns explain how to assemble objects and classes into larger
structures, while keeping these structures flexible and efficient.
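As one example of a structural pattern, the following Python sketch shows an Adapter. The classes are hypothetical and chosen only for illustration: `LegacyDetector` stands in for an older component that returns boxes as (x, y, width, height), while the rest of the system is assumed to expect (x1, y1, x2, y2).

```python
class LegacyDetector:
    """Existing component with an interface the new code cannot use directly."""
    def detect_xywh(self, image):
        # Fixed result for illustration: one box as (x, y, width, height).
        return [(10, 20, 30, 40)]

class DetectorAdapter:
    """Wraps the legacy detector and exposes the interface clients expect."""
    def __init__(self, legacy):
        self._legacy = legacy

    def detect(self, image):
        # Convert (x, y, w, h) boxes to corner format (x1, y1, x2, y2).
        return [(x, y, x + w, y + h)
                for x, y, w, h in self._legacy.detect_xywh(image)]

detector = DetectorAdapter(LegacyDetector())
boxes = detector.detect(image=None)   # the image is unused in this toy example
print(boxes)
```

The legacy class is left untouched; only the adapter knows about both box formats.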
1.2.3. Behavioral Design Patterns
Behavioral design patterns are concerned with algorithms and the assignment of
responsibilities between objects.
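As one example of a behavioral pattern, the following Python sketch shows an Observer. The class `EventSource` and the event dictionaries are hypothetical, used only to show how responsibility for reacting to an event is handed to subscribers.

```python
class EventSource:
    """Subject that notifies subscribed callbacks about new events."""
    def __init__(self):
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def unsubscribe(self, callback):
        self._observers.remove(callback)

    def notify(self, event):
        # Iterate over a copy so observers may unsubscribe while being notified.
        for callback in list(self._observers):
            callback(event)

alerts = []
on_alert = alerts.append          # the observer: simply records each event

source = EventSource()
source.subscribe(on_alert)
source.notify({"frame": 12, "score": 0.97})   # delivered to the observer
source.unsubscribe(on_alert)
source.notify({"frame": 13, "score": 0.99})   # no observers left, so ignored
print(alerts)
```

The source never needs to know what its observers do with an event, so new reactions can be added without modifying it.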
DISCUSSION QUESTIONS:
1. Critique the use of the Adapter design pattern in integrating traditional computer vision
libraries with modern deep learning frameworks. What are the advantages and pitfalls?
2. Explore how the Composite design pattern can simplify the hierarchical representation
of data in tasks like scene understanding or multi-object tracking.
3. Investigate the role of the Command design pattern in automating workflows for batch
image processing in computer vision. What would an implementation look like?
4. Propose how the Observer design pattern can be implemented in a computer vision
system for real-time event monitoring, such as anomaly detection.