
Kim2020 - Low-Cost Hardware Architecture For Integral Image Generation Using Word Length Reduction

The document proposes a low-cost hardware architecture for generating integral images using word length reduction. It accumulates pixel values to update the integral image as a sub-window shifts, without subtracting old values. Word length reduction limits the bit width of integral image elements based on the maximum sum in feature value calculations. This allows generating integral images with significantly fewer logic resources than previous methods.


Low-cost Hardware Architecture for Integral Image Generation using Word Length Reduction

Junghwan Kim, Jongkil Hyun, and Byungin Moon


School of Electronics Engineering
Kyungpook National University
Daegu, Korea
{jh5746, 26712isjk, bihmoon}@knu.ac.kr

Abstract—An integral image is widely used in face detection to calculate feature values at high speed. However, implementing integral images in hardware requires considerable logic and memory resources. This paper proposes a hardware architecture for integral image generation that reduces resource usage by applying the word length reduction method. When implemented in an FPGA, the proposed architecture uses about 83% fewer Slice LUTs than the conventional integral image method. Therefore, the proposed architecture is suitable for low-cost real-time face detection systems.

Keywords—FPGA; hardware architecture; integral image; face detection

I. INTRODUCTION
In embedded systems, the Haar classifier is generally used for face detection [1]. Accordingly, many studies have been conducted to implement the Haar classifier in hardware. For real-time operation of the Haar classifier, the integral image should be used to minimize the time taken to access the data in memory. However, generating an integral image of the entire image demands substantial hardware resources, especially as the resolution of the image increases. To solve this hardware resource problem, many studies use architectures that generate an integral image for a sub-window and update the integral image each time the sub-window shifts [2, 3]. Using the sub-window approach, the hardware resources used for storing integral images decrease significantly. Nevertheless, the hardware resources required to generate the integral image are still high because, when the integral image is updated, the values of the leftmost pixels of the integral image before shifting must be subtracted from all other elements of the integral image. Therefore, we propose a real-time integral image generation method and its hardware architecture to reduce the hardware resources used in the computation and storage of the integral image.

II. PROPOSED METHOD
The proposed method produces integral images by accumulating pixel values to the right end of the horizontal direction of the input image while the vertical size of the integral images is fixed at the height of the sub-window. The proposed method stores an integral image of only the size of the sub-window, and it updates the integral image by simply discarding the oldest values and including newly accumulated right pixel values when the sub-window shifts. The proposed method requires less computation than conventional methods because it does not subtract the oldest values to update the integral image when the sub-window shifts to the right. However, generating an integral image in this way causes the bit width of the elements in the integral image to grow. To solve this problem, we apply the word length reduction method [4] to each element in the integral image. This is the first study to apply both the sub-window approach and the word length reduction method to the integral image.

The word length reduction method allows the bit width of each element in the integral image to be limited by the maximum sum of the rectangular area used in the calculation of the feature value. This means that even if the integral image reflects the entire horizontal width of the image, the element size of the integral image remains small. When the word length reduction method is applied, the cutoff value of the integral image is determined by the maximum sum of the rectangular area used in the Haar classifier. Then, when an element of the integral image is calculated, the cutoff value is subtracted if the accumulated value exceeds the cutoff value. The feature value calculation of the word length reduction method is carried out in a similar way to the traditional feature value calculation process. However, when the sum of the rectangular area is negative or greater than the cutoff value, the cutoff value is added or subtracted as a correction. This requires additional operations in the feature value calculation, but their cost is small relative to the benefit because the correction can be carried out simply by comparison and compensation.

III. PROPOSED ARCHITECTURE
Fig. 1 illustrates the proposed integral image generator. The proposed integral image generator consists of a pixel buffer, a vertical adder tree, adders for the horizontal sum, and an integral image buffer. The pixel buffer consists of 19 line buffers for storing input pixel values. Twenty vertical pixel values, which consist of the current input pixel value and the 19 outputs of the pixel buffer, are sent to the vertical adder tree. The vertical adder tree then performs a cumulative sum operation on the 20 vertical pixels to produce 1 × 20 column sum data. The column sum data are summed horizontally with the previous 1 × 20 vertical integral image data in each row to produce a new vertical integral image reflecting the previous values. The vertical integral images are stored sequentially in the integral image buffer to form a 20 × 20 integral image. Each time a new vertical integral image is generated, the data within the integral image buffer are shifted to form a new 20 × 20 integral image for the new sub-window. At this time, the oldest vertical integral image is discarded from the integral image buffer.

Fig. 1. Proposed integral image generator

Fig. 2 shows an example of the word length reduction in the proposed architecture. As shown in Fig. 2(a), if the sum value is greater than the cutoff value during the integral image generation, the cutoff value is subtracted, following the word length reduction method. In this paper, we limit the bit width of the element to remove the additional calculation that subtracts the cutoff value, so that the architecture can perform the word length reduction more efficiently. In our architecture, which targets a sub-window size of 20 × 20 and uses grayscale pixels, the maximum sum of the rectangular area is 255 × 20 × 20, which can be expressed in 17 bits. For this reason, the proposed architecture limits the bit width of each element in the integral image to 17 bits. Carry outputs beyond 17 bits in the process of generating the integral image are discarded, as shown in Fig. 2(b). The proposed architecture also applies the bit width of 17 bits to the sum of the rectangular area for feature value calculation, thus ignoring the overflow that produces results with negative values or values exceeding the cutoff value. This makes the proposed architecture produce the correct feature value without any comparison or compensation.

Fig. 2. Word length reduction: (a) cutoff process and (b) proposed cutoff method using bit width limiting

IV. EXPERIMENTAL RESULT
Table I presents a comparison of the arithmetic units used in our architecture and the architectures of [2] and [3]. All architectures have been designed based on a sub-window of 20 × 20. Unlike the architectures of [2] and [3], the proposed architecture does not require any subtractor to generate the integral image. Thus, the proposed architecture uses fewer arithmetic units than traditional methods.

TABLE I. COMPARISON OF ARITHMETIC UNITS FOR INTEGRAL IMAGE GENERATION

Resources          Architecture used in [2]             Architecture used in [3]          Proposed architecture
Arithmetic units   20+1 adders, 20×20+20 subtractors    35+20 adders, 20×20 subtractors   35+20 adders

Table II presents the hardware utilization of the proposed architecture and the architecture of [3] when they are implemented on the Xilinx FPGA XC7Z045 FFG900-2. The proposed architecture greatly reduces the number of Slice LUTs compared with the architecture of [3] (1,300 versus 7,834), while keeping the other hardware utilization almost the same.

TABLE II. COMPARISON OF HARDWARE UTILIZATION

Hardware utilization   Architecture used in [3]   Proposed architecture
Slice LUTs             7,834                      1,300
Slice Registers        7,498                      7,510
BRAMs (18 Kbits)       19                         19

V. CONCLUSION
In this paper, we proposed an integral image generation architecture with reduced hardware resource usage. The proposed architecture applies the word length reduction method to integral image generation based on sliding sub-windows, to reduce the logic and memory resources used to create and store integral images. Because of these advantages, the proposed architecture can be efficiently used for embedded systems. In future work, we will implement the entire Haar classifier in our architecture.

ACKNOWLEDGMENT
This research was supported by Multi-Ministry Collaborative R&D program (R&D program for complex cognitive technology) through the National Research Foundation of Korea (NRF) funded by the Ministry of Trade, Industry and Energy (NRF–2018M3E3A1057248).

REFERENCES
[1] P. Viola and M. J. Jones, “Robust real-time face detection,” International Journal of Computer Vision, vol. 57, no. 2, 2004, pp. 137–154.
[2] C. Kumar and S. Agarwal, “A novel architecture for dynamic integral image generation for Haar-based face detection on FPGA,” in TENCON 2014–2014 IEEE Region 10 Conference, 2014, pp. 1–6.
[3] D. Kim, J. Hyun, and B. Moon, “Memory-efficient architecture for contrast enhancement and integral image computation,” in 2020 IEEE International Conference on Electronics, Information, and Communication, 2020, pp. 1–4.
[4] H. J. Belt, “Word length reduction for the integral image,” in 2008 15th IEEE International Conference on Image Processing, 2008, pp. 805–808.
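
As an illustration of the method in Sections II and III, the following Python sketch models the column-wise accumulation and the 17-bit truncation in software. It is a minimal sketch under stated assumptions, not the authors' hardware design: the names `sliding_integral_columns`, `rect_sum`, `WIN`, `BITS`, and `MASK` are hypothetical, and a rectangle is taken between two stored columns (columns left+1 through right), a convention chosen here so that only columns held in the buffer are needed.

```python
from collections import deque

WIN = 20                 # sub-window height and width (from the paper)
BITS = 17                # 255 * 20 * 20 = 102000 < 2**17
MASK = (1 << BITS) - 1   # keeping the low 17 bits is the same as dropping the carry


def sliding_integral_columns(rows):
    """Yield, after each input column, the most recent integral-image columns.

    `rows` is a list of WIN equal-length rows of 8-bit pixels (one horizontal
    band of the image). Each yielded value is a list of up to WIN columns,
    oldest first; every column holds WIN elements truncated to BITS bits.
    """
    assert len(rows) == WIN
    prev = [0] * WIN              # integral column to the left of column 0
    window = deque(maxlen=WIN)    # appending to a full deque drops the oldest column
    for x in range(len(rows[0])):
        running = 0
        column = []
        for i in range(WIN):
            running += rows[i][x]                      # vertical cumulative sum
            column.append((prev[i] + running) & MASK)  # horizontal add, carry dropped
        window.append(column)
        prev = column
        yield list(window)


def rect_sum(window, top, bottom, left, right):
    """Sum of pixels in rows top..bottom and columns left+1..right of the window.

    Indices are window-relative. The four-corner result is masked to BITS
    bits, so no comparison or correction step is performed.
    """
    upper_right = window[right][top - 1] if top > 0 else 0
    upper_left = window[left][top - 1] if top > 0 else 0
    total = window[right][bottom] - window[left][bottom] - upper_right + upper_left
    return total & MASK
```

Every stored element and the four-corner result are reduced modulo 2^17, and any rectangle inside the sub-window sums to at most 255 × 20 × 20, which is below 2^17. The masked result therefore equals the true sum even after the running totals have wrapped around many times, which is why no subtractor or correction logic appears in the sketch.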

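
The 17-bit word length and the roughly 83% LUT saving can be checked by direct arithmetic from the figures given in Section III and Table II. The symbols below are notation introduced here for illustration only: $S_{\max}$ is the largest possible rectangle sum, $S$ is the true sum of a rectangle, and $A$, $B$, $C$, $D$ are the top-left, top-right, bottom-left, and bottom-right integral image elements, each stored modulo $2^{17}$.

```latex
\begin{align*}
S_{\max} &= 255 \times 20 \times 20 = 102000, \\
2^{16} = 65536 &< S_{\max} < 2^{17} = 131072
  \quad\Rightarrow\quad 17 \text{ bits are sufficient}, \\
0 \le S \le S_{\max} < 2^{17}
  &\quad\Rightarrow\quad (D - B - C + A) \bmod 2^{17} = S, \\
\frac{7834 - 1300}{7834} &\approx 0.834
  \quad\Rightarrow\quad \text{about } 83\% \text{ fewer Slice LUTs}.
\end{align*}
```

The third line is the reason discarded carries do not affect the feature value: the truncated elements differ from the untruncated ones only by multiples of $2^{17}$, and those multiples vanish when the four-corner result is itself reduced modulo $2^{17}$.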