High-Speed Architecture Based on FPGA for a Stereo Vision Algorithm
Signal Processing Laboratory, Electronics Department, DICIS, University of Guanajuato, Salamanca, Guanajuato, Mexico
Mechatronics Department, Campus Loma Bonita, University of Papaloapan, Loma Bonita, Oaxaca, Mexico
1. Introduction
Stereo vision is used to reconstruct the 3D (depth) information of a scene from two images, called left and right, acquired by two cameras separated by a previously established distance. Stereo vision is a very popular technique used for applications such as mobile robotics, autoguided vehicles and 3D model acquisition. However, the real-time performance these applications require cannot be achieved by conventional computers, because the processing is computationally expensive. For this reason, other solutions, such as reconfigurable architectures, have been proposed to execute dense computational algorithms. In the last decade, several works have proposed high-performance architectures to solve the stereo vision problem, e.g. digital signal processors (DSP), field programmable gate arrays (FPGA) or application-specific integrated circuits (ASIC). ASIC devices are one of the most complicated and expensive solutions; however, they afford the best conditions for developing a final commercial system (Woodfill et al., 2006). FPGAs, on the other hand, have allowed the creation of hardware designs in standard, high-volume parts, thereby amortizing the cost of mask sets and significantly reducing time-to-market for hardware solutions. However, engineering cost and design time for FPGA-based solutions still remain significantly higher than for software-based solutions. Designers must frequently iterate the design process in order to achieve system performance requirements and simultaneously minimize the required size of the FPGA, and each iteration of this process takes hours or days to complete (Schmit et al., 2000). Even if designing with FPGAs is faster than designing ASICs, an FPGA has a finite resource capacity, which demands clever strategies for adapting versatile real-time systems (Masrani & MacLean, 2006).

In this chapter, we present a high-speed reconfigurable architecture based on the Census Transform algorithm (Zabih & Woodfill, 1994) for calculating the disparity map of a dense stereo vision system. The reuse of operations and the integer/binary nature of these operations were carefully exploited on the FPGA to obtain a final architecture that generates up to 325 dense disparity maps of 640 × 480 pixels per second, even though most vision-based systems do not require such high frame rates. In this context, we propose a stereo vision system that can be adapted to the requirements of the real-time application. An
analysis of the four essential architectural parameters (the window sizes of the arithmetic mean and median filters, the maximal disparity and the window size of the Census Transform) is carried out to obtain the best trade-off between consumed resources and disparity-map accuracy. We vary these parameters and show a graphical representation of the consumed resources versus the desired performance for different extended architectures; from these curves, we can easily select the most appropriate architecture for our application. Furthermore, we develop a practical application of the obtained disparity map to tackle the problem of 3D environment reconstruction using the back-projection technique. Experimental performance results are compared to those of related architectures.
2. The stereo vision algorithm
First of all, the algorithm processes each of the two images (right and left) in parallel and independently. The process begins with the rectification and distortion correction of each image. This step reduces the search for corresponding points, needed to calculate the disparity, to a single dimension. In order to reduce the complexity and size of the required architecture, the algorithm relies on the epipolar constraint: the main axes of the cameras must be aligned in parallel, so that the epipolar lines between the two cameras correspond to the displacement of position between two pixels (one per camera). Under this condition, locating an object in the scene reduces to a horizontal translation: if a pair of pixels is visible in both cameras and is assumed to be the projection of a single point in the scene, then both pixels must lie on the same epipolar line (Ibarra-Manzano, Almanza-Ojeda, Devy, Boizard & Fourniols, 2009).
2.1 Image preprocessing
The Census Transform requires the left and right input images to be pre-processed. During image pre-processing, we use an arithmetic mean filter over a rectangular window of m × n pixels. $S_{uv}$ represents the set of image coordinates inside the rectangular window centered on the point (u, v). The arithmetic mean filter calculates the mean value of the noisy image I(u, v) over the rectangular window defined by $S_{uv}$; the filtered image $\bar{I}$ takes this arithmetic mean value at each point (u, v) (see Equation 1):

$$\bar{I}(u,v) = \frac{1}{mn} \sum_{(i,j) \in S_{uv}} I(i,j) \qquad (1)$$
This filter can be implemented without the scale factor 1/(mn), because the window size is constant during the filtering process. The arithmetic mean filter smooths local variations in the image and, at the same time, notably reduces the noise produced by camera motion.
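As an illustration of Equation 1, the following C sketch applies the filter in software. The window constants and the function name are ours, and the border pixels are simply skipped rather than handled as in the hardware module described later.

```c
#include <stdint.h>

#define M 3 /* window height (assumed 3, as in the hardware module) */
#define N 3 /* window width */

/* Arithmetic mean filter over an m x n window (Equation 1).
 * As noted in the text, the accumulated sum could be kept unscaled
 * (no 1/(m*n) division) since the window size is constant. */
void mean_filter(const uint8_t *in, uint8_t *out, int width, int height)
{
    for (int v = M / 2; v < height - M / 2; v++) {
        for (int u = N / 2; u < width - N / 2; u++) {
            uint32_t sum = 0;
            for (int i = -M / 2; i <= M / 2; i++)
                for (int j = -N / 2; j <= N / 2; j++)
                    sum += in[(v + i) * width + (u + j)];
            out[v * width + u] = (uint8_t)(sum / (M * N));
        }
    }
}
```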
2.2 Census Transform
Once the input images have been filtered, they are used to calculate the Census Transform. This transform is a non-parametric measure used during the matching process for measuring similarities and obtaining the correspondence between points in the left and right images. A neighborhood of pixels is used for establishing the relationships among them (see Equation 2):

$$I_C(u,v) = \bigotimes_{(i,j) \in D_{uv}} \xi\big(I(u,v),\, I(i,j)\big) \qquad (2)$$
where $D_{uv}$ represents the set of coordinates in the square window of n × n pixels (n odd) centered at the point (u, v). The function ξ compares the intensity of the center pixel (u, v) with that of each pixel in $D_{uv}$: it returns 1 if the intensity of the pixel (i, j) is lower than the intensity of the center pixel (u, v), and 0 otherwise. The operator ⊗ denotes the concatenation of the bits calculated by ξ. $I_C(u,v)$, the Census Transform at the point (u, v), is therefore a bit chain.
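A minimal C sketch of the transform follows. The 7 × 7 window, fixed here as a constant, anticipates the size chosen for the hardware in Section 3.2; the function assumes (u, v) lies at least 3 pixels from the image border.

```c
#include <stdint.h>

#define CW 7 /* Census window size n x n (assumed 7, as chosen later) */

/* Census Transform (Equation 2): concatenates, for every neighbor
 * (i, j) of the center pixel (u, v), one bit that is 1 when
 * I(i, j) < I(u, v) and 0 otherwise. With a 7x7 window this yields
 * a 48-bit chain (the center pixel is not compared with itself). */
uint64_t census_transform(const uint8_t *img, int width, int u, int v)
{
    uint64_t chain = 0;
    uint8_t center = img[v * width + u];
    for (int i = -CW / 2; i <= CW / 2; i++) {
        for (int j = -CW / 2; j <= CW / 2; j++) {
            if (i == 0 && j == 0)
                continue;               /* skip the center pixel */
            chain <<= 1;
            if (img[(v + i) * width + (u + j)] < center)
                chain |= 1;             /* xi(I(u,v), I(i,j)) = 1 */
        }
    }
    return chain;
}
```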
2.3 Census correlation
The two bit chains (one per image) obtained from the Census Transform are compared using the Hamming distance. This comparison, called the correlation process, allows us
to obtain a disparity measure. The similarity evaluation is based on the binary comparison between two bit chains given by the Census Transform. The left-to-right disparity measure $D_{H_1}$ at the point (u, v) is calculated by Equation 3, where $I_{C_l}$ and $I_{C_r}$ represent the Census Transforms of the left and right images, respectively, and N is the length of the Census bit chains. This disparity measure comes from maximizing the similarity along the same epipolar line v in the two images. In the same equation, D represents the maximal displacement on the epipolar line of the right image, and the operator ⊙ is the binary XNOR:

$$D_{H_1}(u,v) = \max_{d \in [0,D]} \frac{1}{N} \sum_{i=1}^{N} I_{C_l}(u,v)_i \odot I_{C_r}(u-d,v)_i \qquad (3)$$

The correlation process is carried out twice (left to right, then right to left) with the aim of reducing the disparity error. Equation 4 gives the right-to-left disparity measure, which was added to complement the process. Contrary to the measure in Equation 3, Equation 4 searches over the pixels that follow the current pixel:

$$D_{H_2}(u,v) = \max_{d \in [0,D]} \frac{1}{N} \sum_{i=1}^{N} I_{C_r}(u,v)_i \odot I_{C_l}(u+d,v)_i \qquad (4)$$
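The C sketch below illustrates Equation 3 only; the right-to-left measure of Equation 4 is symmetric. It assumes the 48-bit chains are stored in 64-bit words and uses the GCC/Clang builtin __builtin_popcountll for the bit count, so the popcount mechanism (though not the XNOR-and-maximize logic) is an implementation assumption.

```c
#include <stdint.h>

#define D_MAX 64 /* maximal disparity (as in the architecture) */

/* Number of set bits in a 48-bit chain (GCC/Clang builtin assumed). */
static int popcount48(uint64_t x) { return __builtin_popcountll(x); }

/* Left-to-right Census correlation (Equation 3): for each candidate
 * displacement d, the similarity is the number of matching bits
 * between the two Census chains (XNOR followed by a bit count); the
 * retained disparity is the d maximizing this similarity, which is
 * equivalent to minimizing the Hamming distance. cl and cr hold one
 * epipolar line of Census-transformed left and right pixels. */
int disparity_lr(const uint64_t *cl, const uint64_t *cr, int u)
{
    int best_d = 0, best_sim = -1;
    for (int d = 0; d <= D_MAX && d <= u; d++) {
        int sim = popcount48(~(cl[u] ^ cr[u - d]) & ((1ULL << 48) - 1));
        if (sim > best_sim) { best_sim = sim; best_d = d; }
    }
    return best_d;
}
```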
2.4 Disparity validation
Once both disparity measures have been obtained, the validation task is straightforward. The disparity measure validation (right to left and left to right) consists of comparing both disparity values and taking the absolute difference between them. If this difference is lower than a predefined threshold λ, the disparity value is accepted; otherwise, the disparity value is labeled as undefined (ind). Equation 5 expresses the validation of the disparity measures, $D_H$ being the validated result:

$$D_H = \begin{cases} D_{H_1} & \text{if } |D_{H_1} - D_{H_2}| < \lambda \\ \text{ind} & \text{otherwise} \end{cases} \qquad (5)$$
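In software, the validation of Equation 5 reduces to a few operations. The threshold value and the numeric encoding of the undefined label below are assumptions, chosen to be consistent with the hardware description in Section 3.4.

```c
#define LAMBDA    1        /* validation threshold (assumed value) */
#define UNDEFINED (64 + 1) /* undefined label: max disparity plus one */

/* Disparity validation (Equation 5): the left-to-right value is kept
 * only if it agrees with the right-to-left value within LAMBDA;
 * otherwise the pixel is labeled undefined. */
int validate_disparity(int d_h1, int d_h2)
{
    int diff = d_h1 - d_h2;
    if (diff < 0)
        diff = -diff;
    return (diff < LAMBDA) ? d_h1 : UNDEFINED;
}
```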
2.5 Disparity filtering
A final filtering process is needed in order to improve the quality of the final disparity image. $M_{uv}$ is the set of coordinates in an m × n rectangular window centered on the point (u, v). First, the disparity values $D_H(i,j)$ in the region defined by $M_{uv}$ are ordered. The median filtering process then selects the central value of the ordered list. The same process is carried out over every image pixel in order to obtain the filtered image $\bar{D}_H$. Hence this filtered image, calculated by the median filter and expressed in terms of the central pixel (u, v), is written as in Equation 6:

$$\bar{D}_H(u,v) = \operatorname{median}\big(D_H(i,j),\ (i,j) \in M_{uv}\big) \qquad (6)$$
Whereas an arithmetic mean filter is used for the image preprocessing described above, a median spatial filter is used here, because the median filter selects one value among the existing disparity values to represent the disparity in the search window. This means that no new value is created, as happens with the arithmetic mean filter.
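A direct C rendering of Equation 6 for the 3 × 3 case might look as follows. The insertion sort is for clarity only; the hardware uses the partial-ordering network described in Section 3.5.

```c
#include <string.h>

/* Median filter of the disparity map (Equation 6) over a 3x3 window:
 * the nine disparity values are ordered and the central one is kept,
 * so the output is always one of the existing disparity values. */
int median9(const int w[9])
{
    int s[9];
    memcpy(s, w, sizeof(s));
    for (int i = 1; i < 9; i++) {           /* insertion sort */
        int key = s[i], j = i - 1;
        while (j >= 0 && s[j] > key) { s[j + 1] = s[j]; j--; }
        s[j + 1] = key;
    }
    return s[4]; /* middle value of the ordered list */
}
```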
3. Hardware implementation
We have implemented an FPGA architecture that performs several image-processing tasks with high performance. During the architecture design, we tried to minimize the resources consumed in the FPGA while maximizing system performance. In the previous section, we explained the essential tasks of our algorithm: the image preprocessing, the Census Transform, the Census correlation, the disparity validation and the filtering of the disparity image. In this section, we describe how these tasks are implemented in the FPGA.
3.1 The arithmetic mean filter module
Usually the image acquired from the camera is not very noisy, so a fast smoothing function can be used to obtain a good-quality image. For this reason, we use the arithmetic mean filter, which is fast and easy to implement. Since we are mainly interested in minimizing the required resources, after several tests we chose a window size of 3 × 3 pixels, which is enough to achieve good results. Indeed, this size allows us to save resources in the final architecture.
Fig. 2. Module architecture to calculate the arithmetic mean filter.

The arithmetic mean calculation is carried out in two stages: horizontal and vertical. The block diagram of this architecture, in accordance with the process described in Subsection 2.1, is shown in Figure 2. The three input registers (left side of the diagram) are used for the horizontal addition. These registers are connected to two 8-bit parallel adders, and the result is coded on 10 bits. The result of this operation is stored in a memory that is twice the image line length. To obtain the sum of all the elements in the window, a vertical addition is then carried out, using the current horizontal sum plus the two previous horizontal sums stored in the memory; this is shown on the right side of the diagram. Finally, the arithmetic mean of the nine pixels is coded on 12 bits. The delay of this stage corresponds to one image line plus one value.
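The following behavioral C sketch mimics this dataflow; it is an illustration of Figure 2, not the RTL itself. The function and buffer names are ours, and one call corresponds to one pixel clock.

```c
#include <stdint.h>

#define WIDTH 640 /* image line length (as in the chapter) */

static uint16_t line_mem[2 * WIDTH]; /* two lines of horizontal sums */

/* Two-stage mean-filter step: the horizontal stage adds three 8-bit
 * pixels into a 10-bit sum; the line memory (two image lines long)
 * provides the two previous horizontal sums, which the vertical stage
 * adds into the final 12-bit unscaled 9-pixel sum. */
uint16_t mean_filter_step(uint8_t p0, uint8_t p1, uint8_t p2,
                          int x, int line)
{
    uint16_t hsum  = (uint16_t)p0 + p1 + p2;             /* 10 bits */
    uint16_t prev1 = line_mem[((line + 1) % 2) * WIDTH + x]; /* v-1 */
    uint16_t prev2 = line_mem[(line % 2) * WIDTH + x];       /* v-2 */
    uint16_t vsum  = hsum + prev1 + prev2;               /* 12 bits */
    line_mem[(line % 2) * WIDTH + x] = hsum; /* recycle oldest slot */
    return vsum;
}
```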
3.2 The Census Transform module
The arithmetic means of the left and right images are used as inputs of the Census Transform stage. This transformation codes all the intensity values inside the search window with respect to the central intensity value. The block diagram of the Census Transform architecture is shown in Figure 3. The performance of this module depends on the size of the search window: the window size directly increases the resources and the processing time, so the best trade-off between consumed resources and window size has to be selected. After several tests, the best compromise between processing time and hardware resources was reached for a 7 × 7 pixel window.
Fig. 3. Module architecture to calculate the Census Transform.

This window requires 49 registers. In addition, 6 memory blocks are used in the processing module. The size of these memory blocks is obtained as follows: the image width (usually 640) minus the size of the search window (7 pixels in our case), multiplied by 12; in our case, (640 − 7) × 12 = 7,596 bits per block. The constant 12 appears because the input of the Census Transform has the same width as the output of the arithmetic mean module. Once the size of the search window has been selected, we can describe the Census Transform itself. The central pixel of the search window is compared with its 48 local neighbors. This operation implies connecting all the corresponding registers to parallel comparators, as shown in Figure 3. The result is coded on 48 bits, each bit corresponding to one comparator output. This stage has a delay equal to half the search window height multiplied by the image line length.
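The sizing rule can be checked with a few lines of C; the program below simply evaluates the quoted formula for the chapter's parameters.

```c
#include <stdio.h>

/* Worked example of the line-buffer sizing rule quoted in the text:
 * (image width - Census window size) x 12 bits per memory block,
 * with one block per window line except the current one. */
int main(void)
{
    int width = 640, census_win = 7, word_bits = 12;
    int block_bits = (width - census_win) * word_bits; /* 7,596 bits */
    int blocks     = census_win - 1;                   /* 6 blocks   */
    printf("%d blocks of %d bits = %d bits total\n",
           blocks, block_bits, blocks * block_bits);
    return 0;
}
```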
3.3 The Census correlation module
The correlation process consists of analyzing the left and right images resulting from the Census Transform. Considering that both images contain a common object, the correlation process
has the aim of finding the displacement between two projected pixels belonging to that common object in the images. Since the images are acquired from two different points of view (associated with the left and right camera positions), there is a noticeable difference between the positions of points belonging to the same object; this difference is referred to as the disparity. Usually, the correlation maximizes the similarity between the two images in order to find the disparity. We also use this common method, which consists of two main stages: the calculation of the similarity measure and the search for its maximal value. Figure 4 shows the block diagram of the corresponding architecture.

We are interested in reducing the delay of the correlation process, therefore it is more convenient to compare one point of the left Census Transform image with the maximal number of points in the right one. Furthermore, the correlation process is executed twice in order to minimize the error during the correlation computation. We consider 64 pixels as the maximal disparity value for each of the left and right Census Transform images. The registers represented by the gray blocks on the left of Figure 4 store those pixels. The stored pixels, together with the pixels of the left Census Transform image, enter the binary XNOR operators, which deliver a 48-bit chain at the output. The XNOR function is used to find the maximal and minimal similarity associated with the disparity values at the input: the bits enter the XNOR gates by pairs, and if the compared bits are equal the XNOR output is 1, which indicates maximal similarity; if they differ, the output is 0, which is associated with minimal similarity.

Once the similarity has been calculated, we continue with the search for the highest similarity value among the 64 compared pixels, independently for the left-to-right and right-to-left correlations. This task requires several selector units, each with 4 inputs distributed as follows: 2 for the similarity values to be compared and 2 for the indicators. The indicators are associated with the displacement between pixels in the left and right Census Transforms. That is, if a pixel has the same position in both the right and left Census Transforms, its indicator is zero; otherwise, the indicator represents the number of pixels between the pixel position in the left Census Transform and the pixel in the right one. The block diagram in Figure 4 graphically describes the implementation of the maximization process. Each selector unit receives two similarity measures and two indicators as inputs; its elements are a multiplexer and a comparator. The comparator receives the two similarity values coming from the first stage, and its output acts as the selector input of the multiplexer, so that the multiplexer outputs the higher similarity measure together with its associated pixel index. In order to obtain the pixel with the maximal similarity measure, six levels of selector units are needed, organized in a pyramidal fashion. The lowest level, the first layer, carries out the selector-unit task described above 32 times.
As we ascend the pyramid, each level halves the number of selector units relative to the previous one. The last level delivers the highest similarity value between the corresponding pixels in the left and right Census images. Whereas the right-to-left correlation stores left Census Transform pixels, which are compared with one pixel of the right Census image, in the left-to-right correlation the comparison is relative to one pixel of the left Census image. For this reason, a similar architecture is used to implement both correlation processes. This whole stage (including both the right-to-left and left-to-right processes) has a delay that depends on the number of selector-unit layers, which in turn depends on the maximal disparity value. In our case, we set the maximal disparity value to 64, so the number of layers is 6.
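The maximization pyramid can be modeled in C as follows. The struct and function names are ours, and ties are resolved here toward the smaller index, a choice the chapter does not specify.

```c
/* Behavioral sketch of the pyramidal maximization of Figure 4: at
 * each of the log2(64) = 6 levels, selector units compare pairs of
 * (similarity, index) values and keep the pair with the higher
 * similarity, so the last level delivers the winning disparity. */
typedef struct { int sim; int idx; } cand_t;

int pyramid_argmax(cand_t c[64])
{
    for (int n = 64; n > 1; n /= 2)      /* 6 levels for 64 inputs */
        for (int k = 0; k < n / 2; k++)  /* one selector unit each */
            c[k] = (c[2 * k].sim >= c[2 * k + 1].sim) ? c[2 * k]
                                                      : c[2 * k + 1];
    return c[0].idx; /* disparity index of the best match */
}
```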
3.4 The disparity validation module
This module fuses the two disparity values obtained by the Census correlation processes (right to left and left to right). First, the difference between the two values is calculated; it is then compared with the threshold λ. If the difference is lower than λ, the left-to-right disparity value is the output of the comparison; otherwise, the output is labeled as undefined.
Figure 5 shows the block diagram of the disparity validation architecture. The inputs of this module are the two disparity values obtained by the Census correlation process. The absolute difference of the values is compared with λ. The comparator delivers one bit that controls the multiplexer selector. If the result of the comparison is 1 (that is, the absolute difference is greater than or equal to λ), the multiplexer outputs the undefined label. This label is encoded as the maximal disparity value plus one, referred to as the default value. If the comparison result is 0, the output of the multiplexer is the left-to-right Census correlation value.
3.5 The median filter module
Some errors remain after the disparity validation, due to errors in the correlation process. Most of these errors appear because objects in the image have intensity values similar to those of their surrounding area. This situation produces very similar Census Transform values for different pixels and, consequently, wrong disparity values in certain cases. We reduce these errors by using a median filter. As pointed out before, it is not appropriate to use the same arithmetic mean filter as in the preprocessing stage, because that filter would produce a new value (the average over the filtering window) which is not among the current disparity values. The median filter, on the other hand, works with the true values in the image, so the resulting value is an element of the disparity window. The median filter uses a search window of 3 × 3 pixels, which is enough to notably reduce the error and improve the final disparity map, as shown in Figure 6.
Fig. 6. Median filter: a) left image, b) resulting disparity map without filtering and c) with filtering.

Figure 7 shows the block diagram of the median filter architecture. This filter is based on the works of (Dhanasekaran & Bagan, 2009) and (Vega-Rodriguez et al., 2002). The left side of the diagram shows the nine registers and the two RAM blocks used to generate the sliding window that extracts the nine pixels of the processing window. This architecture works similarly to the processing window of the Census Transform. The nine pixels in the window are processed by a pyramidal architecture, in this case with 9 levels. Each level contains several comparison units that find the higher of two input values A and B. Each comparison unit contains a comparator and a multiplexer. If input A is higher than input B, the comparator output is 1; otherwise it is 0. The comparator output is used as the control signal of the multiplexer: when it is 1, the multiplexer selects A as the higher value and B as the lower one, and conversely otherwise. Each comparator level in the median module orders
the disparity values with respect to their neighbors in the processing window, progressively building a descending organization of the values. However, it is not necessary to order all the disparity values completely, because we are only looking for the middle value at the last level. Therefore, the earlier levels only produce a partial order of the elements, and the comparison unit at the last level delivers the median. The connection structure between the comparison units at each level guarantees a correct median value (Vega-Rodriguez et al., 2002).
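To make the idea concrete, here is a C model of a fixed compare-exchange network for the median of nine values. The wiring below is a classic 19-operation network; it illustrates the partial-ordering principle but is not necessarily the exact connection structure of (Vega-Rodriguez et al., 2002).

```c
/* Compare-exchange unit: leaves *a <= *b, like the comparator plus
 * multiplexer pair described in the text. */
static void cswap(int *a, int *b)
{
    if (*a > *b) { int t = *a; *a = *b; *b = t; }
}

/* Median of 9 using a fixed network of compare-exchange units (19
 * operations). Only a partial order is built, just enough to isolate
 * the middle element, mirroring the hardware idea of Figure 7.
 * Note: the input array is modified in place. */
int median9_network(int p[9])
{
    cswap(&p[1], &p[2]); cswap(&p[4], &p[5]); cswap(&p[7], &p[8]);
    cswap(&p[0], &p[1]); cswap(&p[3], &p[4]); cswap(&p[6], &p[7]);
    cswap(&p[1], &p[2]); cswap(&p[4], &p[5]); cswap(&p[7], &p[8]);
    cswap(&p[0], &p[3]); cswap(&p[5], &p[8]); cswap(&p[4], &p[7]);
    cswap(&p[3], &p[6]); cswap(&p[1], &p[4]); cswap(&p[2], &p[5]);
    cswap(&p[4], &p[7]); cswap(&p[4], &p[2]); cswap(&p[6], &p[4]);
    cswap(&p[4], &p[2]);
    return p[4];
}
```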
4. Results

The final architecture was synthesized for an image size of 640 × 480 pixels, a 3 × 3 window for the arithmetic mean filter, a 7 × 7 window for the Census Transform and a maximal disparity of 64 pixels. With these parameters, the architecture calculates from 130 disparity images per second with a 50 MHz clock up to 325 disparity images per second with a 100 MHz clock.
4.1 Architectural exploration through high-level synthesis
High-level synthesis was used to implement the stereo vision architecture based on the Census Transform. The algorithm was developed using GAUT (Coussy & Morawiec, 2008), a high-level synthesis tool based on the C language, and was then synthesized for the Altera Cyclone II (EP2C35F672C6). Each stage of the architecture (filtering, Census Transform and correlation) was developed taking into account both consumed resources and processing speed, and the best trade-off was selected to implement an optimal architecture. Tables 1 to 3 lay out three different designs for each stage, labeled Design 1, 2 and 3, with their most representative figures.

In the following, we describe how the different implementation metrics are related. There is a clear relation between performance, cadence and pipeline depth: if we relax the performance requirement, the cadence (the interval at which new data enter the pipeline) increases, and therefore the number of operators and pipeline stages decreases. The remaining design features are related less directly. The number of logic elements depends directly on the combinational functions used and on the number of dedicated logic registers. The combinational functions are strongly associated with the number of operators and weakly with the number of states in the state machine; as in any state machine, the cadence controls the processing speed. Conversely, the dedicated logic registers depend strongly on the number of states in the state machine and weakly on the number of operators. Finally, the latency follows from the number of operators, the number of pipeline stages and, especially, the cadence established by the architecture design.

The results shown in Tables 1 to 3 were obtained for an image size of 640 × 480 pixels with a 3 × 3 processing window for the arithmetic mean filter, a 7 × 7 window for the Census Transform and a maximal disparity of 64 pixels, with a 100 MHz clock.

Characteristics          Design 1   Design 2   Design 3
Cadency (ns)                20         30         40
Performance (fps)          160        100         80
Logic elements             118        120         73
Comb. functions             86         72         73
Ded. log. registers        115        116         69
# Stages in pipeline         3          2          2
# Operators                  2          2          1
Latency (µs)             25.69      38.52      51.35
Table 1. Comparative table for the arithmetic mean filter.

Taking into account the most common real-time constraints, it is possible to choose Design 3 for the implementation of the arithmetic mean filter, because it represents the best compromise between performance and consumed resources. For the same reason, Design 2 can be chosen for the Census Transform and Design 3 for the Census correlation. The results of the hardware synthesis in the FPGA are summarized as follows: the global architecture needs 6,977 logic elements and 112,025 memory bits. The number of logic elements represents 21% of the total logic resources of the Cyclone II device; furthermore, the memory size represents 23% of the available memory bits.
Characteristics          Design 1   Design 2   Design 3
Cadency (ns)                40         80        200
Performance (fps)           80         40         15
Logic elements           2,623      1,532      1,540
Comb. functions          2,321        837        864
Ded. log. registers      2,343      1,279      1,380
# Stages in pipeline        48         24         10
# Operators                155         79         34
Latency (µs)            154.36     308.00     769.50

Table 2. Comparative table for the Census Transform.

Characteristics          Design 1   Design 2   Design 3
Cadency (ns)                20         40         80
Performance (fps)          160         80         40
Logic elements           1,693      2,079      2,644
Comb. functions          1,661      1,972      2,553
Ded. log. registers      1,369      1,451      1,866
# Stages in pipeline        27         12          8
# Operators                140         76         46
Latency (ns)               290        160        100
Table 3. Comparative table for the Census correlation.

This architecture calculates 40 dense disparity images per second with a 100 MHz clock. This performance is lower than that of the RTL architecture proposed above, but it corresponds to a well-optimized design, since it uses fewer resources. In spite of the lower performance, it is high enough for the majority of real-time vision applications.
4.2 Comparative analysis of the architectures
First, we analyze the system performance of four different solutions for computing the dense disparity image. Two of them are the hardware implementations described above. The third is a solution on a Digital Signal Processor (DSP), an ADSP-21161N from Analog Devices with a 100 MHz clock. The last is a software solution on a PC, a DELL Optiplex 755 with a 2.00 GHz Intel Core 2 Duo processor and 2 GB of RAM. The performance comparison between these solutions is shown in Table 4. The first column indicates the image sizes used during the experimental tests; the second shows the sizes of the search window used in the Census Transform; the remaining columns show the processing times. In the FPGA implementations, parallel processing allows short calculation times. The architecture designed at the RTL level reaches the lowest processing time, but takes more time to implement. On the other hand, designing the architecture with high-level synthesis yields a less complex design with a longer processing time; its advantage is the short implementation time. Unlike the FPGA implementations, the DSP solution is easier and faster to implement; nevertheless, its processing remains sequential, so the computation time is considerably higher. Finally, the PC solution, which affords the easiest implementation of all those discussed,
requires very high processing times compared to the hardware solutions, since its architecture is inappropriate for real-time applications.

Image size (pixels)   Census window (pixels)   FPGA       DSP        PC
192 × 144             3 × 3                    0.69 ms    0.26 s     33.29 s
192 × 144             5 × 5                    0.69 ms    0.69 s     34.87 s
192 × 144             7 × 7                    0.69 ms    1.80 s     36.31 s
384 × 288             3 × 3                    2.77 ms    1.00 s    145.91 s
384 × 288             5 × 5                    2.77 ms    2.75 s    151.39 s
384 × 288             7 × 7                    2.77 ms    7.20 s    158.20 s
640 × 480             3 × 3                    7.68 ms    2.80 s    403.47 s
640 × 480             5 × 5                    7.68 ms    7.70 s    423.63 s
640 × 480             7 × 7                    7.68 ms   20.00 s    439.06 s
Table 4. Performance comparison of the different implementations.

We now present a comparative analysis between our two architectures and four different FPGA implementations found in the literature. The first column of Table 5 lays out the most common characteristics of the architectures. The second and third columns show the constraints, performance and resources consumed by our architectures using the RTL design and high-level synthesis (HLS) (Ibarra-Manzano, Devy, Boizard, Lacroix & Fourniols, 2009), labeled Design 1 and Design 2, respectively. The remaining columns show the corresponding values for four architectures by other authors, labeled Design 3 to 6; see (Naoulou et al., 2006), (Murphy et al., 2007), (Arias-Estrada & Xicotencatl, 2001) and (Miyajima & Maruyama, 2003), respectively, for more technical details. Although all of these are FPGA implementations that calculate dense disparity images from two stereo images, our architecture can be directly compared only with Designs 3 and 4, since they use the Census Transform algorithm for calculating the disparity map. Our Design 1 offers two essential improvements with respect to Design 3: the latency and the memory size. These improvements directly affect the number of logic elements (area), which in our case increases. Design 2, in turn, offers three important improvements: the latency, the area and the memory size; these come at the cost of performance, that is, fewer images processed per second. Although Design 4 performs well with respect to the other designs, its performance is lower than that of our architecture; in addition, it uses a four-times-smaller image, has a lower maximal disparity and consumes a larger quantity of resources (area and memory). Our architecture cannot be directly compared with Designs 5 and 6, since they use the Sum of Absolute Differences (SAD) as the correlation measure. However, an interesting point of comparison is the performance obtained when an architecture uses only logic elements (Design 5) or relies on several accesses to external memories (Design 6). The large quantity of logic elements consumed by Design 5 limits the size of the input images and the maximal disparity value; as a consequence, its performance is lower than that of our Design 1. Design 6 requires a large quantity of external memory, which directly penalizes its performance with respect to our Design 1.
5. Implementation results
We are interested in obtaining the disparity maps for an image sequence acquired from a camera mounted on a moving vehicle or robot. It is important to point out the additional constraint imposed by a vehicle whose velocity varies or is very high.
                    Design 1    Design 2    Design 3    Design 4    Design 5    Design 6
Measure             Census      Census      Census      Census      SAD         SAD
Image size          640 × 480   640 × 480   640 × 480   320 × 240   320 × 240   640 × 480
Window size         7 × 7       7 × 7       7 × 7       13 × 13     7 × 7       7 × 7
Disparity max       64          64          64          20          16          80
Performance (fps)   325         40          130         40          71          18.9
Latency (µs)        115         206         274         —           —           —
Area (logic el.)    12,188      6,977       11,100      26,265      4,210       7,096
Memory size         114 Kb      109 Kb      174 Kb      375 Kb      —           —

Table 5. Comparative table of the different architectures.

In this context, our architecture was tested in different navigation scenes using a stereo vision bank mounted first on a mobile robot and then on a vehicle. In this section, we present three operational environments. Figures 8 (a) and (b) respectively show the left and right images from the stereo vision bank, and Figure 8 (c) depicts the dense disparity image, with disparity values coded as gray levels. By examining this last image, we can see that an object close to the stereo vision bank has a large disparity value, which corresponds to a light gray level; conversely, an object far from the bank has a low disparity value, which corresponds to a dark gray level. Thus the gray level representing the road in the resulting images changes gradually from light to dark. Note the right side of the image, where the different gray levels correspond to the vehicles in the parking lot: since these vehicles are located at different depths from the stereo vision bank, the disparity map detects them and assigns a corresponding gray value.

The second test of the algorithm is shown in Figure 9. In this case, a white vehicle moves straight toward our robot. This vehicle is detected in the disparity image and depicted with different gray levels. Different depth points of the vehicle can be distinguished, since it is closer to our stereo vision bank than the vehicles parked on the right side of the scene. On the other hand, it is important to point out that the disparity validation sometimes fails when the similarity between the left and right images is too close. This problem is more significant when there are shadows close to the visual system (as in this experiment), producing several detection errors in the shadow zones.

In the last test (see Figure 10), the stereo vision bank is mounted on a vehicle driven on a highway. This experimental test is a difficult situation because the vehicle is driven at high speed. The left and right images (Figures 10 (a) and (b), respectively) show a car that overtakes our vehicle. Figure 10 (c) shows the dense disparity map. Note the different depths detected on the overtaking car, and how the gray level of the highway becomes gradually darker down to black, which represents infinite depth.
Fig. 8. Stereo images acquired from a mobile robot during outdoor navigation: a) left image, b) right image and c) the disparity map.
Fig. 9. Stereo images acquired from a mobile robot during outdoor navigation: a) left image, b) right image and c) the disparity map.

Figures 11 (a) and (b) show the left image and the obtained disparity map, respectively, while Figure 11 (c) shows the environment reconstructed using the back-projection technique. Each point in the reconstructed scene was located with respect to a reference frame set on the stereo bank, employing the intrinsic/extrinsic parameters of the cameras and geometrical
Fig. 10. Stereo images acquired from a vehicle on the highway: a) left image, b) right image and c) the disparity map.
Fig. 11. 3D reconstruction of an outdoor environment using the dense disparity map obtained by our architecture: a) left image, b) disparity map and c) 3D reconstruction.
assumptions. By examining Figure 11 (c), we can see that most of the undefined disparity points were removed, so the reconstruction is based on the well-defined depth points. Finally, it is important to point out that the reconstruction of this environment is a difficult task, since the robot carrying the stereo vision bank moves at a considerable velocity of 6 meters per second in an outdoor environment. Therefore, the ideal conditions of controlled illumination and vibration do not hold, and this is reflected in some images, making it more difficult to obtain the disparity map and, consequently, the scene reconstruction.
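For reference, a minimal back-projection sketch is given below. It assumes an ideal rectified pinhole stereo pair; the parameter names are generic placeholders, not the calibration of the actual stereo bank.

```c
/* Back-projection of a validated disparity d at pixel (u, v) into a
 * 3D point in the stereo-bank frame, assuming rectified cameras with
 * focal length f (pixels), baseline B (meters) and principal point
 * (cx, cy). d must be a defined, non-zero disparity. */
typedef struct { double x, y, z; } point3d_t;

point3d_t back_project(int u, int v, int d,
                       double f, double B, double cx, double cy)
{
    point3d_t p;
    p.z = f * B / (double)d;   /* depth from disparity */
    p.x = (u - cx) * p.z / f;
    p.y = (v - cy) * p.z / f;
    return p;
}
```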
8. Acknowledgments
This work was partially funded by CONACyT through the project entitled "Diseño y optimización de una arquitectura para la clasificación de objetos en tiempo real por color y textura basada en FPGA" (Design and optimization of an FPGA-based architecture for real-time object classification by color and texture).
9. References
Arias-Estrada, M. & Xicotencatl, J. M. (2001). Multiple stereo matching using an extended architecture, in G. Brebner & R. Woods (eds), FPL '01: Proceedings of the 11th International Conference on Field-Programmable Logic and Applications, Springer-Verlag, London, UK, pp. 203–212.

Coussy, P. & Morawiec, A. (2008). High-Level Synthesis: from Algorithm to Digital Circuit, 1st edn, Springer.

Dhanasekaran, D. & Bagan, K. B. (2009). High speed pipelined architecture for adaptive median filter, European Journal of Scientific Research 29(4): 454–460.

Ibarra-Manzano, M. (2011). Vision multi-caméra pour la détection d'obstacles sur un robot de service: des algorithmes à un système intégré, PhD thesis, Institut National des Sciences Appliquées de Toulouse, Toulouse, France.

Ibarra-Manzano, M., Almanza-Ojeda, D.-L., Devy, M., Boizard, J.-L. & Fourniols, J.-Y. (2009). Stereo vision algorithm implementation in FPGA using Census Transform for effective resource optimization, 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools, pp. 799–805.

Ibarra-Manzano, M., Devy, M., Boizard, J.-L., Lacroix, P. & Fourniols, J.-Y. (2009). An efficient reconfigurable architecture to implement dense stereo vision algorithm using high-level synthesis, 2009 International Conference on Field Programmable Logic and Applications, Prague, Czech Republic, pp. 444–447.

Masrani, D. & MacLean, W. (2006). A real-time large disparity range stereo-system using FPGAs, IEEE International Conference on Computer Vision Systems (ICVS '06), p. 13.

Miyajima, Y. & Maruyama, T. (2003). A real-time stereo vision system with FPGA, in G. Brebner & R. Woods (eds), FPL '03: Proceedings of the 13th International Conference on Field-Programmable Logic and Applications, Springer-Verlag, London, UK, pp. 448–457.

Murphy, C., Lindquist, D., Rynning, A. M., Cecil, T., Leavitt, S. & Chang, M. L. (2007). Low-cost stereo vision on an FPGA, FCCM '07: Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, IEEE Computer Society, Washington, DC, USA, pp. 333–334.

Naoulou, A., Boizard, J.-L., Fourniols, J. Y. & Devy, M. (2006). A 3D real-time vision system based on passive stereovision algorithms: Application to laparoscopic surgical manipulations, Proceedings of the 2nd Conference on Information and Communication Technologies (ICTTA 2006), Vol. 1, IEEE, pp. 1068–1073.

Schmit, H. H., Cadambi, S., Moe, M. & Goldstein, S. C. (2000). Pipeline reconfigurable FPGAs, Journal of VLSI Signal Processing Systems 24(2-3): 129–146.

Vega-Rodriguez, M. A., Sanchez-Perez, J. M. & Gomez-Pulido, J. A. (2002). An FPGA-based implementation for median filter meeting the real-time requirements of automated visual inspection systems, Proceedings of the 10th Mediterranean Conference on Control and Automation, Lisbon, Portugal, pp. 1–7.

Woodfill, J., Gordon, G., Jurasek, D., Brown, T. & Buck, R. (2006). The Tyzx DeepSea G2 vision system: a taskable, embedded stereo camera, Computer Vision and Pattern Recognition Workshop (CVPRW '06), p. 126.

Zabih, R. & Woodfill, J. (1994). Non-parametric local transforms for computing visual correspondence, ECCV '94: Proceedings of the Third European Conference on Computer Vision, Vol. II, Springer-Verlag, Secaucus, NJ, USA, pp. 151–158.