A Real-Time Compact Structured-Light Based Range Sensing System
Abstract—In this paper, we propose a new approach for a compact range sensor system for real-time robot applications. Instead of using an off-the-shelf camera and projector, we devise a compact system with a CMOS image-sensor and a DMD (Digital Micro-mirror Device) that yields smaller dimensions (168x50x60 mm) and lighter weight (500 g). We also realize one-chip hard-wired processing of structured-light projection and range computation by exploiting correspondences between the CMOS image-sensor and the DMD. This application-specific processing is implemented on an FPGA and runs in real time. Our range acquisition system performs 30 times faster than the same implementation in software. We also devise an efficient methodology to identify a proper light intensity that enhances the quality of the range sensor and minimizes the decoding error. Our experimental results show that the total-error is reduced by 16% compared to the average case.

Index Terms—Range sensing, structured-light, FPGA

I. INTRODUCTION

[...] using the structured-light. In this paper, we develop a compact range sensing system with a CMOS image-sensor and a DMD (Digital Micro-mirror Device), and provide an FPGA solution for the entire processing of the range sensor system, i.e., projecting the structured-light patterns followed by computing the sensed range.

The generic range measurement system based on structured-light is shown in Fig. 1. First, a projector generates the structured-light patterns that contain identifiable lines. A line on the image plane of the projector creates a light plane containing the center of projection and the projected line. Then, a camera captures the pattern projected onto the scene and identifies the lines generated by the projector. An image point on the image plane of the camera creates a light ray (camera ray) containing the center of projection and the projected point. The intersection of the plane and the ray yields the range information.

Generally, a single-pattern structured-light encoding method is suitable for moving scenes but yields low-accuracy range data, whereas a multiple-pattern structured-light encoding method can acquire accurate distance data
but is vulnerable to moving objects [5]. Motivated by these facts, our approach is to apply a high-speed CMOS image-sensor (500 fps) and hard-wired range computation to overcome the vulnerability of the multiple-pattern approach. Furthermore, we attain real-time processing of dynamic scenes without losing accuracy.

To enhance the quality of the range sensor and minimize the decoding error, we also present a methodology to identify a proper light intensity. There are related efforts which aim to enhance the quality of the range sensor by adapting to the scene. In [6] the authors introduced an adaptive algorithm that adjusts the coding patterns depending on the environment. However, it allows no flexibility in the structured-light coding method because of the specially designed coding patterns. In [7] the authors presented an adaptive algorithm that uses the partially reconstructed scene. However, this approach is not suitable for real-time applications due to its high computational complexity, although it provides highly accurate results. In this paper, we present a methodology to enhance the quality of the range sensor in real time by adjusting the light intensity according to the proposed structured-light decoding error model.

The rest of the paper is organized as follows. In Section II, we exhibit the overall system architecture and describe the process of structured-light decoding and ray-plane intersection. In Section III, we present the error-minimized light-intensity search. In Section IV, we show our implementation and experimental results. Finally, Section V concludes the paper.

II. SYSTEM ARCHITECTURE

As shown in Fig. 2, our architecture performs: 1) Structured-light patterns are projected by the DMD and captured by the CMOS image-sensor. 2) The captured images are stored in external memory and decoded. 3) The decoded images are triangulated with the ray-plane intersection. 4) The final range data is transmitted to the external main control component through USB.

1. Structured Light Decoding

Structured-light decoding is a process of finding correspondences between the columns of the DMD-projected pattern and the pixels of the captured image. We use Gray code for structured-light encoding. Gray code is well suited for binary position encoding due to its one-bit-difference property. A more robust encoding is to add an inverted Gray pattern between two Gray patterns [8], as shown in Fig. 3.

The structured-light decoding process is shown in Fig. 4. We need one frame of a non-illuminated image (i.e., the reference image) and 16 frames of patterned images to decode the structured-light patterns. Each layer contains two frames: the original Gray-coded pattern and the inverted Gray-coded pattern. The decoding process is as follows. First, the pixel values of the patterned images are subtracted by the pixel value of the non-illuminated image (referred to as the 'Ref-frame-value'). The 'Frame value' represents the pixel value at each frame. The 'Frame value*' represents the value after subtracting the 'Ref-frame-value'. Next, we compare the 'Frame value*' of the two frames in each layer and choose the greater value (Layer value).

The 'Max frame#' bit is set to '0' if the left frame value (i.e., the pixel from the Gray pattern) is greater than the right frame value (i.e., the pixel from the inverted Gray pattern); otherwise it is set to '1'. The 'Max frame#' bits of all layers together form the index into the LUT (look-up table). The LUT yields the corresponding column of the projection image at the specific pixel of the captured image. To remove the ambiguous decoding [...]

Fig. 2. The overall architecture of our system.
Fig. 3. The structure of sequence frames.
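The per-pixel decoding described above can be summarized by the following sketch. It is a non-authoritative illustration in Python/NumPy rather than the paper's FPGA datapath; the bit ordering, the Gray-code LUT construction, and names such as build_lut, decode_pixel, and TH_SL are assumptions made for illustration.

```python
import numpy as np

TH_SL = 10       # assumed minimum 'Layer value' contrast (the paper's th_sl; value illustrative)
NUM_LAYERS = 8   # 16 patterned frames = 8 (Gray, inverted-Gray) pairs

def build_lut(num_layers=NUM_LAYERS):
    """LUT from the decoded Gray code to the projector column index (assumed construction)."""
    lut = np.zeros(1 << num_layers, dtype=np.int32)
    for col in range(1 << num_layers):
        gray = col ^ (col >> 1)        # binary column index -> its Gray code
        lut[gray] = col
    return lut

def decode_pixel(ref, pairs, lut, th_sl=TH_SL):
    """Decode one pixel.

    ref   : pixel value of the non-illuminated reference frame
    pairs : (gray_frame_value, inverted_frame_value) per layer, most significant layer first
    Returns the projector column index, or -1 for an invalid (zero-error) pixel."""
    code = 0
    for gray_v, inv_v in pairs:
        gray_star = int(gray_v) - int(ref)       # 'Frame value*' of the Gray frame
        inv_star = int(inv_v) - int(ref)         # 'Frame value*' of the inverted frame
        if max(gray_star, inv_star) < th_sl:     # 'Layer value' too small -> zero-error pixel
            return -1
        bit = 1 if gray_star > inv_star else 0   # pixel illuminated in the Gray pattern
        code = (code << 1) | bit
    return int(lut[code])
```

The paper's hardware uses the 'Max frame#' bits directly as the LUT index; in this sketch the bit convention is flipped (1 when the Gray frame wins) only so that the LUT can be built from the standard binary-to-Gray conversion.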
(Figure: the range-computation datapath, with plane and ray generators, DMD and camera distortion parameters, a divider, and an output register. Eq. (1), the ray-plane intersection used to compute the range, is garbled in this extraction.)
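Eq. (1) is not recoverable here, but the computation the Introduction describes is the standard ray-plane intersection: the decoded column selects a calibrated light plane, the pixel selects a camera ray, and their intersection gives the 3D point. The sketch below illustrates that step only; the plane and ray parameterizations, the function name, and the numeric values are assumptions, not the paper's calibration model (which also involves the DMD and camera distortion parameters shown in the datapath).

```python
import numpy as np

def intersect_ray_plane(ray_dir, plane_normal, plane_d):
    """Intersect the camera ray X = t * ray_dir (camera center at the origin) with the
    light plane {X : plane_normal . X + plane_d = 0}. Returns the 3D point, or None if
    the ray is (nearly) parallel to the plane or the intersection lies behind the camera."""
    denom = float(np.dot(plane_normal, ray_dir))
    if abs(denom) < 1e-9:
        return None
    t = -plane_d / denom           # this division is what the divider block performs
    if t <= 0:
        return None
    return t * np.asarray(ray_dir, dtype=float)

# Usage: ray from the decoded pixel, plane from the decoded projector column (values assumed).
ray = np.array([0.10, -0.05, 1.00])          # undistorted, back-projected camera ray
n, d = np.array([0.7, 0.0, -0.7]), 350.0     # calibrated light plane for one DMD column
point = intersect_ray_plane(ray, n, d)
if point is not None:
    range_mm = np.linalg.norm(point)         # range value written to the output register
```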
[...] uncertain range data. The zero-error can be classified into three cases: "zero-over-error," "zero-under-error," and "zero-occlusion-error."

We now explain how to compute the four different errors mentioned above. The zero-error occurs when any 'Layer value' (as shown in Fig. 4) is smaller than th_sl at a pixel. This implies that the pixel-value difference between the light-illuminated case and the non-illuminated case is smaller than th_sl. This generally occurs when we take the image of a dark object (e.g., a black object) because of its low reflectivity. For this case, we define the zero-under-error. This error changes depending on the light intensity γ in Eq. (2). The zero-error for pixel (i,j) is defined when the pixel value of the structured-light decoding image (denoted as I_dec(i,j,γ)) is zero. A zero-under-error pixel additionally requires the pixel value of the reference image (denoted as I_ref(i,j,γ)) to be less than (255 - th_sl), as in Eq. (3). In Eq. (3), the pixel value of the zero-under-error image (denoted as I_zero-under(i,j,γ)) becomes '1'. Then, we obtain the zero-under-error E_zero-under(γ) by counting all the 1's in I_zero-under(i,j,γ) and normalizing by the image size (H x W).

$$E_{\text{zero-under}}(\gamma)=\frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W} I_{\text{zero-under}}(i,j,\gamma) \qquad (2)$$

$$I_{\text{zero-under}}(i,j,\gamma)=\begin{cases}1 & \text{for } I_{\text{dec}}(i,j,\gamma)=0 \text{ and } I_{\text{ref}}(i,j,\gamma)<255-th_{sl}\\ 0 & \text{else}\end{cases} \qquad (3)$$

Next, we define the zero-over-error. The zero-error also increases with increasing light intensity when we take the image of a bright object (e.g., a white object). In this case the pixel value is already close to 255 without the projector's light, which causes a difference value smaller than th_sl because of the upper limit of the pixel value (255 for 8-bit pixel resolution). We define this case as the zero-over-error. In other words, we identify a zero-over-error pixel when the pixel value of the reference image I_ref(i,j,γ) is larger than (255 - th_sl). Similarly to the zero-under-error, we define the zero-over-error as shown in Eqs. (4, 5).

$$E_{\text{zero-over}}(\gamma)=\frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W} I_{\text{zero-over}}(i,j,\gamma) \qquad (4)$$

$$I_{\text{zero-over}}(i,j,\gamma)=\begin{cases}1 & \text{for } I_{\text{dec}}(i,j,\gamma)=0 \text{ and } I_{\text{ref}}(i,j,\gamma)>255-th_{sl}\\ 0 & \text{else}\end{cases} \qquad (5)$$

Then, we define the zero-occlusion-error. The zero-error also occurs when a pixel of the captured image lies in a shadow area or out of sight of the projection area. We define this case as the zero-occlusion-error. We identify a zero-occlusion-error pixel when the pixel-value difference between the all-illuminated image (denoted as I_all(i,j,γ)) and the reference image I_ref(i,j,γ) is less than the threshold value (denoted as th_occlusion).

$$E_{\text{zero-occlusion}}(\gamma)=\frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W} I_{\text{zero-occlusion}}(i,j,\gamma) \qquad (6)$$

$$I_{\text{zero-occlusion}}(i,j,\gamma)=\begin{cases}1 & \text{for } I_{\text{dec}}(i,j,\gamma)=0 \text{ and } I_{\text{all}}(i,j,\gamma)-I_{\text{ref}}(i,j,\gamma)<th_{\text{occlusion}}\\ 0 & \text{else}\end{cases} \qquad (7)$$

Last, we define the order-error. Among the valid decoding pixels (i.e., the non-zero-error pixels), a decoded pixel value may still contain error. The decoding result should increase toward the right, since the decoding result corresponds to the column number of the projector's image plane. We define an out-of-order decoding result as the order-error. We detect the order-error for a pixel (i,j) by checking whether the pixel value of the decoding image I_dec(i,j,γ) is less than that of the preceding pixel I_dec(i-1,j,γ). If I_order(i,j,γ) = 1, the pixel is out of order. We then accumulate the number of order-error pixels for which I_order(i,j,γ) = 1, as shown in Eq. (8), and normalize by the number of non-zero-error pixels to obtain the ratio of order-error pixels to non-zero-error pixels. Here, the zero-error-total (denoted as E_zero-total(γ)) is defined as in Eq. (10).

$$E_{\text{order}}(\gamma)=\frac{\sum_{i=1}^{H}\sum_{j=1}^{W} I_{\text{order}}(i,j,\gamma)}{\bigl(1-E_{\text{zero-total}}(\gamma)\bigr)\,HW} \qquad (8)$$

$$I_{\text{order}}(i,j,\gamma)=\begin{cases}1 & \text{for } I_{\text{dec}}(i,j,\gamma)<I_{\text{dec}}(i-1,j,\gamma)\\ 0 & \text{else}\end{cases} \qquad (9)$$

$$E_{\text{zero-total}}(\gamma)=E_{\text{zero-under}}(\gamma)+E_{\text{zero-over}}(\gamma)+E_{\text{zero-occlusion}}(\gamma) \qquad (10)$$
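The error terms of Eqs. (2)-(10) can be evaluated directly from the decoded image, the reference image, and the all-illuminated image. The following sketch is a plain NumPy illustration of those definitions, not the paper's FPGA error estimator of Fig. 11; the threshold values and the array and function names are assumptions.

```python
import numpy as np

def decoding_errors(I_dec, I_ref, I_all, th_sl=10, th_occlusion=10):
    """Evaluate the zero-error and order-error terms of Eqs. (2)-(10) for one light
    intensity. I_dec: decoded column image (0 marks an invalid pixel), I_ref: reference
    (non-illuminated) image, I_all: all-illuminated image; thresholds are illustrative."""
    H, W = I_dec.shape
    zero = (I_dec == 0)

    e_under = np.mean(zero & (I_ref < 255 - th_sl))                                     # Eqs. (2), (3)
    e_over = np.mean(zero & (I_ref > 255 - th_sl))                                      # Eqs. (4), (5)
    e_occl = np.mean(zero & ((I_all.astype(int) - I_ref.astype(int)) < th_occlusion))   # Eqs. (6), (7)
    e_zero_total = e_under + e_over + e_occl                                             # Eq. (10)

    # Order-error: decoded column numbers must not decrease from left to right,
    # checked only between valid (non-zero-error) neighbours.
    out_of_order = (I_dec[:, 1:] < I_dec[:, :-1]) & ~zero[:, 1:] & ~zero[:, :-1]
    valid_fraction = max(1.0 - e_zero_total, 1e-9)
    e_order = out_of_order.sum() / (valid_fraction * H * W)                              # Eqs. (8), (9)
    return e_zero_total, e_order
```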
The total error combines the zero-error-total and the order-error with a weighting factor ω:

$$E_{\text{total}}(\gamma)=E_{\text{zero-total}}(\gamma)+\omega\cdot E_{\text{order}}(\gamma) \qquad (11)$$
Fig. 7. Zero-error function (depending on the light intensity).
Fig. 9. Light Intensity Search.
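Fig. 9 outlines the light-intensity search that selects the γ minimizing E_total(γ) of Eq. (11). The concrete search strategy is not recoverable from this extraction, so the sketch below simply evaluates a set of candidate intensities exhaustively; it reuses decoding_errors() from the previous sketch, and capture_and_decode and the candidate list are assumptions.

```python
def search_optimal_intensity(capture_and_decode, candidates=range(100, 0, -10), omega=1.0):
    """Pick the structured-light intensity gamma that minimizes E_total(gamma), Eq. (11).

    capture_and_decode(gamma) is assumed to project the pattern set at intensity gamma
    and return the (I_dec, I_ref, I_all) images for that capture."""
    best_gamma, best_error = None, float("inf")
    for gamma in candidates:
        I_dec, I_ref, I_all = capture_and_decode(gamma)
        e_zero_total, e_order = decoding_errors(I_dec, I_ref, I_all)
        e_total = e_zero_total + omega * e_order      # Eq. (11)
        if e_total < best_error:
            best_gamma, best_error = gamma, e_total
    return best_gamma, best_error
```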
Fig. 11. Error estimator hardware.
(Table excerpt — total processing time: 1741 ms (0.57 fps) for the software implementation vs. 58 ms (17.24 fps) for the proposed hardware.)
[...] times compared to the conventional system configuration. Fig. 12 shows the entire system board, which is smaller than a system built from an off-the-shelf projector and camera.

2. Experiments on Structured-Light Range Search

Fig. 13 shows a non-illuminated general image captured by the CMOS image-sensor. To evaluate our system, we stacked several objects (a book, two boxes, and a paper cup, from bottom to top) at a distance of about 50 cm from the acquisition system. Note that the box right below the paper cup is placed diagonally so as to experimentally verify a smooth increase or decrease in distance.

Fig. 14 shows the structured-light decoded image. Each pixel's intensity value represents the column number of the DMD-projected image. The black-colored pixels are invalid values caused either by ambiguous correspondences or by being out of sight of the projection area. The intensity values of the image pixels increase smoothly toward the right, implying that the column numbers of the DMD-projected image increase from left to right.

Fig. 14. Structured-light decoding image.

The range image is shown in Fig. 15. Each pixel's intensity represents the range (in millimeters; a larger value indicates a position nearer to the acquisition system). The distance values of the 3D points of the diagonally positioned box increase or decrease smoothly around the front edge of the box. Also note that the distances of the paper cup and of the box at the bottom are similar.

Fig. 16 shows the reconstruction of the objects from the 3D points without any further post-processing such as smoothing.

For the accuracy analysis of our system, we placed a white plane board at a distance of about 60 cm from the acquisition system. First, we obtain the 3D points of the white plane through our acquisition system. Next, we reconstruct a flat 3D plane from the acquired 3D points using a plane-fitting method such as PCA (principal component analysis). The difference in distance between the points on the fitted plane and the 3D points is used as the metric for the range accuracy (expressed in terms of average error and standard deviation). Table 2 shows the resulting accuracy.
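The plane-fitting accuracy metric just described can be computed as in the following sketch, which fits the plane by PCA (via an SVD of the centered points) and reports the mean and standard deviation of the point-to-plane distances. The function name and the millimeter units are assumptions for illustration.

```python
import numpy as np

def plane_fit_accuracy(points):
    """points: (N, 3) array of 3D points measured on the white board (in mm).
    Returns (average error, standard deviation) of the point-to-plane distances."""
    centroid = points.mean(axis=0)
    # PCA: the plane normal is the direction of least variance (last right-singular vector).
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    distances = np.abs((points - centroid) @ normal)
    return distances.mean(), distances.std()
```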
Table 3 compares the total-error obtained at the optimal light intensity with the total-errors at arbitrary light intensities, namely the maximum (100%), middle (60%), and minimum (20%) intensities. The total-error was reduced by 16% compared to the average case (the average total-error over the three arbitrary settings is about 0.93, versus 0.78 at the optimal intensity). The zero-error relates to the number of valid 3D points, and the order-error indicates unreliable 3D points. Therefore, we adaptively apply the optimal light intensity to maximize both the number of 3D points and the accuracy. To show the quality of the measured distance, we set an ROI (Region of Interest), a flat area in the scene, as shown in Fig. 17. Then, we fit a 3D plane in the ROI and compute the accuracy at the arbitrary light intensities and at the optimal light intensity.

Table 3. Comparison of decoding errors among different light intensity assignments

    Arbitrary γ                      Optimal γ (O)
    γ (%)         E_total(γ)         γ (%)        E_total(γ)
    max (100)     0.92               50           0.78
    mid (60)      0.87
    min (20)      0.99

* γ (light intensity), O (optimal light intensity), E_total(γ) (total error)

Table 4 shows the number of 3D points in the entire image area and the accuracy in the ROI area at the different light-intensity assignments, respectively. The optimal light intensity yields five times more 3D points compared to the worst case (γ = 20) with only a small accuracy reduction (0.04 mm).

* γ (light intensity), A (arbitrary light intensity), O (optimal light intensity), E_total(γ) (total error)

We also verified the influence of ambient-light conditions on the optimal light intensity of the system. Fig. 18 shows the relationship between E_total(γ) and the structured-light intensity under varying ambient-light
conditions. In Fig. 18, a higher level implies a higher ambient-light condition. With our error model based on E_total(γ), the system attempts to minimize E_total(γ) regardless of changes in the ambient-light condition.

Fig. 18. Influence of ambient light: E_total(γ) versus structured-light intensity (%) for four ambient-light levels (Level1-Level4).

V. CONCLUSIONS

In this paper, we proposed a new technique for a structured-light based range sensing system that performs both structured-light projection and range computation. The overall processing is realized in a fully hard-wired manner on an FPGA, which results in real-time operation. The frame rate of 17 fps (640x480) is reliable for dynamic scenes and is sufficient to recognize moving objects. Furthermore, to enhance the quality of the range sensor and minimize the decoding error, using a proper light intensity leads to desirable results.

Finally, we pose some future work. First, the current system has room for hardware optimization through pipelining [10] and rescheduling of the processes. Second, we may apply various structured-light encoding techniques to obtain more robustness for moving scenes or more accurate range data depending on the conditions.

ACKNOWLEDGMENTS

REFERENCES

[3] [...] Imaging, Vol. 11(5-6), pp. 358-369, 2005.
[4] http://en.wikipedia.org/wiki/Structured-light_3D_scanner.
[5] J. Salvi, J. Pagés, and J. Batlle, "Pattern Codification Strategies in Structured Light Systems," Pattern Recognition, Vol. 37, No. 4, pp. 827-849, 2004.
[6] T. P. Koninckx and L. Van Gool, "Real-Time Range Acquisition by Adaptive Structured Light," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 3, pp. 432-445, 2006.
[7] Y. Xu and D. G. Aliaga, "An Adaptive Correspondence Algorithm for Modeling Scenes with Strong Interreflections," IEEE Transactions on Visualization and Computer Graphics, Vol. 15, No. 3, pp. 465-480, 2009.
[8] D. Scharstein and R. Szeliski, "High-accuracy stereo depth maps using structured light," CVPR 2003, pp. 195-202, 2003.
[9] S. Zhang and P. Huang, "Novel method for structured light system calibration," Optical Engineering, Vol. 45, No. 8, pp. 083601, 2006.
[10] C. Lee, "Smart Bus Arbiter for QoS control in H.264 decoders," Journal of Semiconductor Technology and Science, Vol. 11, No. 1, pp. 33-39, 2011.
Chan-Oh Park was born in Korea in 1986. He received the B.S. degree in the Department of Electronic and Information Engineering from Seoul National University of Technology in 2010. He is currently pursuing the M.S. degree at Sungkyunkwan University. His current research interests include stereo vision, computer architecture, and system on chip.

Nam-Seok Seo was born in Korea in 1982. He received the B.S. degree in the Department of System Control Engineering from Hoseo University in 2009. He is currently pursuing the M.S. degree at Sungkyunkwan University. His current research interests include stereo vision, computer architecture, and system on chip.

Jun-Dong Cho received the B.S. degree from the Department of Electronic Engineering, Sungkyunkwan University, Suwon, Korea, in 1980, the M.S. degree from the Department of Computer Science, Polytechnic University, Brooklyn, New York, in 1989, and the Ph.D. degree from the Department of Computer Science, Northwestern University, Evanston, in 1993. He was a CAD engineer and Team Leader at Samsung Electronics Company from 1983 to 1987, and a Senior Technical Staff at Samsung Electronics Company from 1993 to 1995. In 1995, he joined the Department of Electrical and Computer Engineering, Sungkyunkwan University (SKKU), Suwon, Korea, where he is currently a Professor. His research interests include Low Power Design, 3-D Image Processor, Embedded System Integration Design, and Multiprocessor on Chip for Software Defined Radio and Multimedia Applications. Prof. Cho is an IEEE Senior Member.