Lin - Fast Motion Compensation

This paper proposes a fast block-matching algorithm (FFBMA) for motion estimation in video compression that uses integral projections to reduce computational complexity while maintaining optimal accuracy. The FFBMA compares a reference block to candidate blocks using multiple fast matching error measures based on integral projections, which have low complexity to compute. Most candidate blocks can be rejected after calculating just one or two error measures. Only a few candidate blocks need the full mean absolute error or mean square error calculations. Simulation results show the FFBMA achieves over an 86% reduction in computations compared to the full-search block-matching algorithm, while still finding the optimal motion vector.
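The rejection scheme summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the matching error is the sum of squared differences over an n x n block, and it uses lower bounds that follow from the inequality (sum of K terms)^2 <= K * (sum of their squares). The function names (`block`, `projections`, `sse`, `full_search`, `fast_full_search`) are hypothetical, and the projections are recomputed from scratch for readability, whereas an efficient implementation would update them incrementally.

```python
import random

def block(frame, x, y, n):
    """Return the n x n block whose upper-left pixel is at row x, column y."""
    return [row[y:y + n] for row in frame[x:x + n]]

def projections(b):
    """Column sums, row sums and total sum of a square block."""
    cols = [sum(c) for c in zip(*b)]
    rows = [sum(r) for r in b]
    return cols, rows, sum(rows)

def sse(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def full_search(cur, prev, k, l, n, w):
    """Exhaustive search: evaluate the full error at every displacement."""
    ref = block(cur, k, l, n)
    best, best_err = (0, 0), sse(ref, block(prev, k, l, n))
    for i in range(-w, w + 1):
        for j in range(-w, w + 1):
            err = sse(ref, block(prev, k + i, l + j, n))
            if err < best_err:
                best, best_err = (i, j), err
    return best, best_err

def fast_full_search(cur, prev, k, l, n, w):
    """Same search, but cheap projection bounds reject candidates first."""
    ref = block(cur, k, l, n)
    r_cols, r_rows, r_tot = projections(ref)
    best, best_err = (0, 0), sse(ref, block(prev, k, l, n))
    full_evals = 1  # number of full error evaluations actually performed
    for i in range(-w, w + 1):
        for j in range(-w, w + 1):
            if (i, j) == (0, 0):
                continue
            cand = block(prev, k + i, l + j, n)
            c_cols, c_rows, c_tot = projections(cand)
            # Test 1: (difference of block sums)^2 <= n*n * SSE
            if (r_tot - c_tot) ** 2 >= n * n * best_err:
                continue
            # Test 2: squared differences of column sums <= n * SSE
            if sum((p - q) ** 2 for p, q in zip(r_cols, c_cols)) >= n * best_err:
                continue
            # Test 3: squared differences of row sums <= n * SSE
            if sum((p - q) ** 2 for p, q in zip(r_rows, c_rows)) >= n * best_err:
                continue
            # Test 4: only survivors pay for the full error computation.
            full_evals += 1
            err = sse(ref, cand)
            if err < best_err:
                best, best_err = (i, j), err
    return best, best_err, full_evals

if __name__ == "__main__":
    rng = random.Random(0)
    prev = [[rng.randint(0, 255) for _ in range(32)] for _ in range(32)]
    # Current frame: previous frame displaced by (2, -1), wrapping at borders.
    cur = [[prev[(x + 2) % 32][(y - 1) % 32] for y in range(32)] for x in range(32)]
    print(full_search(cur, prev, 12, 12, 8, 4))
    print(fast_full_search(cur, prev, 12, 12, 8, 4))
```

Because a candidate is skipped only when a lower bound on its error already reaches the current minimum, both functions return the same motion vector and error; the third value returned by `fast_full_search` shows how many candidates still required the full computation.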


IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 45, NO. 5, MAY 1997, p. 527

Fast Full-Search Block-Matching Algorithm for Motion-Compensated Video Compression
Yih-Chuan Lin and Shen-Chuan Tai

Abstract: This paper proposes a fast block-matching algorithm that uses three fast matching error measures, besides the conventional mean-absolute error (MAE) or mean-square error (MSE). An incoming reference block in the current frame is compared to candidate blocks within the search window using multiple matching criteria. These three fast matching error measures are established on the integral projections, which have the advantages of being good block features and of having low complexity in measuring matching errors. Most of the candidate blocks can be rejected by calculating only one or more of the three fast matching error measures. The time-consuming computations of MSE or MAE are performed on only the few candidate blocks that first pass all three fast matching criteria. Simulation results show that a reduction of over 86% in computations is achieved after integrating the three fast matching criteria into the full-search algorithm, while ensuring optimal accuracy.

Paper approved by M. R. Civanlar, the Editor for Image Processing of the IEEE Communications Society. Manuscript received September 8, 1995; revised May 25, 1996. This work was supported in part by the National Science Council of Taiwan, R.O.C. under Contract NSC86-2221-E-006-057. This paper was presented in part at the 13th IEEE International Conference on Pattern Recognition, Vienna, Austria, August 1996.
The authors are with the Institute of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan, Republic of China.
Publisher Item Identifier S 0090-6798(97)03723-9.
0090-6778/97$10.00 © 1997 IEEE

I. INTRODUCTION

Motion estimation using a block-matching algorithm (BMA) is widely used in many motion-compensated video coding systems, such as those recommended by the H.261 and MPEG standards [1], [2], to remove interframe redundancy and thus achieve high data compression. In a typical BMA, the current frame of a video sequence is divided into nonoverlapping square blocks of pixels, say, of size $N \times N$. For each reference block in the current frame, BMA searches for the best matched block within a search window of size $(2w+1) \times (2w+1)$ in the previous frame, where $w$ stands for the maximum allowed displacement. Then the relative position between the reference block and its best matched block is represented as the motion vector of the reference block. A nonnegative matching error function $D_p(i,j)$ is defined over all the positions to be searched, i.e.,

$$D_p(i,j) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} \left| f_t(k+m,\,l+n) - f_{t-1}(k+i+m,\,l+j+n) \right|^p, \quad p = 1 \text{ or } 2 \text{ and } -w \le i, j \le w \tag{1}$$

where $f_t(x,y)$ is the pixel intensity at $(x,y)$ in frame $t$, $B_t(k,l)$ is the reference block with its upper left pixel at the coordinate $(k,l)$ in the current frame, and $B_{t-1}(k+i,l+j)$ is a candidate block with its upper left pixel at the coordinate $(k+i,l+j)$ in the previous frame. The computations incurred in one complete measurement of $D_p(i,j)$ are $N^2$ absolute values (or squarings when $p = 2$) and $2N^2-1$ additions. The best matched block corresponds to the candidate block with its upper left corner located at $(k+i^*,l+j^*)$, which has the minimum matching error $D_p(i^*,j^*)$. A straightforward method of BMA is the full-search BMA (FBMA), which computes the $D_p(i,j)$'s for all $(2w+1)^2$ positions of candidate blocks in the search window; that is, the FBMA needs $(2w+1)^2N^2$ absolute values (or squarings), $(2w+1)^2(2N^2-1)$ additions, and $(2w+1)^2$ comparisons for each reference block. It is therefore a computation-intensive process, which limits its practical applications. Many well-known fast algorithms [3]–[13] have been developed to reduce the high computational complexity of the full-search BMA by considering only a limited number of the motion vectors in the search window, at the expense of estimate accuracy. That is, only suboptimal estimate accuracy is guaranteed by these algorithms. Concerning the VLSI implementation, most of these fast algorithms, e.g., the three-step search (TSS) [3], have the drawbacks of irregular data flow and high control overhead, while the full-search BMA has the advantages of regular data flow and low control overhead [14], [15].

Recently, a number of algorithms for pattern-matching problems [16]–[19] have made use of integral projections to reduce the computational complexity of the pattern-matching operation. However, none of the previous research work on motion estimation using integral projections has provided the optimality-preserving ability of the FBMA. Integral projections are good features describing the block mean intensity and the edge location and orientation in a block of pixels, and are most likely to be different for different blocks. In this letter, a fast full-search BMA (FFBMA), which is also based on the use of integral projections, is presented to provide much faster motion estimation than the traditional FBMA, while preserving the optimality of estimate accuracy. Similar ideas have been realized by other techniques for fast vector quantization (VQ), such as the partial distortion search (PDS) [20], the triangle-inequality elimination rule (TIE) [21], or VQ using mean pyramids of vectors [22]. These fast VQ algorithms share a common goal: to reject most entries in the codebook that are not best matched to the target block using only partial and simple information in the blocks. It is not straightforward, and may even be difficult, to extend these fast algorithms directly to the motion estimation task. For example, in [22], Lee and Chen defined a sequence of fast matching criteria, each associated with a different level of the mean pyramids of blocks, and employed these criteria in the "coarse-to-fine" manner to promote speed in searching for the nearest neighbor in a VQ
system; however, in trying to extend this fast hierarchical matching technique with optimality-preserving ability to the block-matching algorithm, one has to spend a large number of overhead computations to construct the mean pyramids of all the candidate blocks in the search window, and a significant amount of storage for these mean pyramids, prior to the start of the block-matching process. On the other hand, using integral projections instead of the mean pyramids, these two technical difficulties can be almost completely avoided by an efficient approach that generates the integral projections in an on-line manner and requires only six additions/subtractions at each candidate block position. In sum, this paper gives the optimality-preserving ability of motion estimation using integral projections, along with an efficient method for preparing the integral projections of all the candidate blocks in the search window, and shows the performance gain over that solely using all separate pixels in the blocks.

II. THE FAST FULL-SEARCH BLOCK-MATCHING ALGORITHM (FFBMA)

The basic idea behind the proposed fast full-search BMA relies on constructing three fast matching criteria and, during block matching, employing these three fast matching criteria to discard the candidate blocks in the search window which are not matched to the reference block in the current frame, before using the time-consuming matching error defined in (1). These fast matching criteria are derived from the integral projections since the integral projections are simple and relevant features of a block of pixels.

Roughly speaking, the integral projections can be regarded as the intensity sums of spatial pixels along any fixed direction in a block of pixels. For any given block $B_t(k,l)$ in frame $t$, three kinds of integral projections are defined as follows:

1) vertical projections:

$$V_t(k,l;n) = \sum_{m=0}^{N-1} f_t(k+m,\,l+n), \quad 0 \le n \le N-1 \tag{2}$$

2) horizontal projections:

$$H_t(k,l;m) = \sum_{n=0}^{N-1} f_t(k+m,\,l+n), \quad 0 \le m \le N-1 \tag{3}$$

3) massive projection:

$$M_t(k,l) = \sum_{m=0}^{N-1}\sum_{n=0}^{N-1} f_t(k+m,\,l+n) \tag{4}$$

In the proposed FFBMA, the three fast matching error measures are defined as follows:

$$MP_p(i,j) = \left| M_t(k,l) - M_{t-1}(k+i,\,l+j) \right|^p \tag{5}$$

$$VP_p(i,j) = \sum_{n=0}^{N-1} \left| V_t(k,l;n) - V_{t-1}(k+i,\,l+j;n) \right|^p \tag{6}$$

$$HP_p(i,j) = \sum_{m=0}^{N-1} \left| H_t(k,l;m) - H_{t-1}(k+i,\,l+j;m) \right|^p \tag{7}$$

With these measures defined in (1) and (5)–(7), four different kinds of matching errors are available for each position within the search window. Obviously, (5) takes only one squaring (or absolute value) and one subtraction (or addition); as for (6) or (7), only $2N-1$ additions and $N$ squarings (or absolute values) are sufficient. Each of these three computational complexities is relatively low in comparison with that of (1). In the following, Theorems 1 and 2 provide the relationships among the multiple matching errors at each position $(i,j)$ within the search window.

Theorem 1: a) $MP_1(i,j) \le D_1(i,j)$; b) $VP_1(i,j) \le D_1(i,j)$; c) $HP_1(i,j) \le D_1(i,j)$.

Theorem 2: a) $MP_2(i,j) \le N^2 D_2(i,j)$; b) $VP_2(i,j) \le N D_2(i,j)$; c) $HP_2(i,j) \le N D_2(i,j)$.

Notice that $D_1(i,j)$ corresponds to the mean-absolute error (MAE), and $D_2(i,j)$ is the mean-square error (MSE). The validity of these two theorems can be shown according to two mathematical inequalities. They are

$$\left| \sum_{r=1}^{K} a_r \right| \le \sum_{r=1}^{K} |a_r| \tag{8}$$

$$\left( \sum_{r=1}^{K} a_r \right)^2 \le K \sum_{r=1}^{K} a_r^2 \tag{9}$$

where $a_1, a_2, \ldots, a_K$ are arbitrary real numbers. Inequality (8) follows from the well-known triangle inequality. As for inequality (9), a brief explanation is given as follows. When considering any pair of two real numbers $a_r$ and $a_s$, we have $(a_r - a_s)^2 \ge 0$ or, equivalently,

$$2 a_r a_s \le a_r^2 + a_s^2 \tag{10}$$

Taking summation over all the pairs with $r < s$ on both sides of (10), and then adding $\sum_r a_r^2$ to both sides, yields inequality (9). Theorems 1 and 2 can be derived easily by processing (1) according to (8) and (9), respectively, where the integral projections of the error terms in (1) are formulated accordingly to form the $MP_p(i,j)$, $VP_p(i,j)$, or $HP_p(i,j)$. For example, writing $e(m,n) = f_t(k+m,\,l+n) - f_{t-1}(k+i+m,\,l+j+n)$ for the error terms in (1),

$$HP_2(i,j) = \sum_{m=0}^{N-1} \left[ \sum_{n=0}^{N-1} e(m,n) \right]^2 \le \sum_{m=0}^{N-1} N \sum_{n=0}^{N-1} e(m,n)^2 = N D_2(i,j)$$

This proves c) of Theorem 2. A similar process can be employed for the remaining parts of the two theorems.
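The relationships stated in Theorems 1 and 2 can be checked numerically. The following is a minimal Python sketch under stated assumptions, not code from the paper: it assumes the matching error of (1) is an unnormalized sum of absolute (p = 1) or squared (p = 2) differences, that "vertical" projections are column sums and "horizontal" projections are row sums, and that the bound factors are 1 for the absolute-error case and N^2, N, N for the squared-error case, as follows from inequalities (8) and (9). All function names are hypothetical.

```python
import random

def projections(block):
    """Return (vertical, horizontal, massive) projections of a square block.

    Assumption: vertical = sum down each column, horizontal = sum along
    each row, massive = sum of all pixels.
    """
    vertical = [sum(col) for col in zip(*block)]
    horizontal = [sum(row) for row in block]
    massive = sum(horizontal)
    return vertical, horizontal, massive

def matching_errors(ref, cand, p):
    """Return (D, MP, VP, HP) for exponent p (1: absolute, 2: squared)."""
    d = sum(abs(a - b) ** p for rr, rc in zip(ref, cand) for a, b in zip(rr, rc))
    v_r, h_r, m_r = projections(ref)
    v_c, h_c, m_c = projections(cand)
    mp = abs(m_r - m_c) ** p
    vp = sum(abs(a - b) ** p for a, b in zip(v_r, v_c))
    hp = sum(abs(a - b) ** p for a, b in zip(h_r, h_c))
    return d, mp, vp, hp

def bounds_hold(ref, cand):
    """Check the assumed lower bounds for both criteria on one block pair."""
    n = len(ref)
    d1, mp1, vp1, hp1 = matching_errors(ref, cand, 1)
    d2, mp2, vp2, hp2 = matching_errors(ref, cand, 2)
    mae_ok = mp1 <= d1 and vp1 <= d1 and hp1 <= d1
    mse_ok = mp2 <= n * n * d2 and vp2 <= n * d2 and hp2 <= n * d2
    return mae_ok, mse_ok

def random_block(n, rng):
    """Hypothetical helper: an n x n block of random 8-bit intensities."""
    return [[rng.randint(0, 255) for _ in range(n)] for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    checks = [bounds_hold(random_block(8, rng), random_block(8, rng))
              for _ in range(100)]
    print(all(mae and mse for mae, mse in checks))
```

Since the bounds are inequalities that hold for any real-valued blocks, the check should succeed for every pair of blocks, not only for the random ones drawn here. The cheap measures therefore never overestimate the scaled full error, which is what makes rejection based on them safe.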
Returning to the block-matching problem, let us assume that the known current most matched motion vector $(i^*,j^*)$ is initially set to (0,0), and the associated MSE is $D_{\min} = D_2(0,0)$ (if the MSE criterion is used). For any other candidate block $B_{t-1}(k+i,\,l+j)$ within the search window, if one or more of the following conditions holds:
1) $MP_2(i,j) \ge N^2 D_{\min}$
2) $VP_2(i,j) \ge N D_{\min}$
3) $HP_2(i,j) \ge N D_{\min}$
4) $D_2(i,j) \ge D_{\min}$
then, applying Theorem 2, the candidate block $B_{t-1}(k+i,\,l+j)$ can be rejected; when one of conditions 1)–3) holds, this is done without calculating the time-consuming MSE measurement. Here $MP_2$, $VP_2$, and $HP_2$ are the fast matching errors (5)–(7) computed with squared differences. In the FFBMA, for each candidate block, test conditions 1)–4) are checked sequentially; if any of the first three test conditions fails, then the candidate block is checked further by the next test condition; otherwise, it is rejected, and the next candidate block is compared. Once test condition 4) is tested and unsatisfied, both the best matched motion vector and the associated matching error are updated. In this algorithm, all of the candidate blocks in the search window need to be examined by test condition 1); this requires $(2w+1)^2$ additions, $(2w+1)^2$ squarings, and $(2w+1)^2$ comparisons. Each of the candidate blocks that fails test condition 1) goes through test condition 2) for further rejection, and this test needs $2N-1$ additions, $N$ squarings, and one comparison for one block. Similarly, test condition 3) also takes $2N-1$ additions, $N$ squarings, and one comparison for one candidate block that fails the preceding test. Finally, when a candidate block satisfies none of the first three test conditions, the FFBMA requires $2N^2-1$ additions, $N^2$ squarings, and one comparison to decide whether the best matched motion vector should be updated to the current candidate position.

For the sake of clarity, the computations required in the FFBMA for one reference block should include:
1) $(2w+1)^2 + n_2 + n_3 + n_4$ comparisons;
2) $(2w+1)^2 + (n_2+n_3)(2N-1) + n_4(2N^2-1)$ additions (or subtractions);
3) $(2w+1)^2 + (n_2+n_3)N + n_4 N^2$ squarings (or absolute values if MAE is concerned);
where $n_2$, $n_3$, and $n_4$ stand for the occurrence frequencies of evaluating test conditions 2)–4), respectively, required for finding the best matched motion vector within the search window.

To evaluate the first three conditions 1)–3), the integral projections of each candidate block have to be known prior to matching. We do not have to calculate all of the integral projections for each candidate block in the previous frame from scratch. Let $V(x,y;n)$, $H(x,y;m)$, and $M(x,y)$ denote the vertical, horizontal, and massive projections of the block $B(x,y)$ with upper left pixel at $(x,y)$, and $f(x,y)$ the pixel intensity, dropping the frame subscript. If the integral projections for the two candidate blocks $B(x-1,y)$ and $B(x,y-1)$ are known, only a few terms are updated for obtaining all of the integral projections of the block $B(x,y)$, i.e.,

$$V(x,y;n) = V(x,y-1;n+1), \quad 0 \le n \le N-2$$
$$V(x,y;N-1) = V(x-1,y;N-1) - f(x-1,\,y+N-1) + f(x+N-1,\,y+N-1)$$
$$H(x,y;m) = H(x-1,y;m+1), \quad 0 \le m \le N-2$$
$$H(x,y;N-1) = H(x,y-1;N-1) - f(x+N-1,\,y-1) + f(x+N-1,\,y+N-1)$$
$$M(x,y) = M(x,y-1) - V(x,y-1;0) + V(x,y;N-1)$$

Therefore, the computations required for the integral projections of block $B(x,y)$ are six arithmetic operations of addition/subtraction. Suppose the frame size is $X \times Y$ pixels. The integral projections of all of the blocks $B(x,0)$ for $0 \le x \le X-N$ and $B(0,y)$ for $0 \le y \le Y-N$ in the considered frame are first calculated. This precomputation is performed once per frame. To calculate the integral projections of the remaining blocks in the frame, we need $6(X-N)(Y-N)$ additions/subtractions since each of the remaining blocks requires six additions/subtractions and there are $(X-N)(Y-N)$ blocks remaining. Since the integral projections convey much of the information in a block of pixels and the arithmetic operations required for calculating the three fast matching errors are both much fewer and simpler than those for MSE, a great deal of computation, or a large number of MSE (or MAE) measurements, is thus saved. The next section shows several experiments to demonstrate the effectiveness of using integral projections to speed up the FBMA.

III. SIMULATION RESULTS

The efficiency of the proposed algorithm was tested by using two benchmark video sequences, Salesman and Flower Garden. We first used 60 consecutive frames of size 360 × 288 pixels in Salesman and 60 consecutive frames of size 360 × 240 pixels in Flower Garden. The block and search window sizes were fixed at 16 × 16 and 33 × 33, respectively. Thus, the traditional FBMA requires computing 1089 MSE or MAE measurements for each reference block in the current frame. In addition to the FBMA and FFBMA, we also implemented a partial distortion search block-matching algorithm (PDSBMA) that uses the partial distortion search (PDS) technique to speed up the block-matching process. In the PDSBMA, the matching process for a candidate block measures the distortion by accumulating the individual error terms of block elements one at a time, and can quit early without completing the accumulation of the full distortion: it checks whether the accumulation thus far has already exceeded the distortion of the best match and, if so, there is no need to continue the accumulation. Table I shows
the averaged numbers of the various arithmetic operations, including squarings, additions/subtractions, and comparisons, required by the three considered algorithms, respectively, for the two test sequences. As can be seen from this table, the FFBMA can achieve over 96% reduction of computation complexity compared to the FBMA and 86% compared to the PDSBMA in terms of the total number of arithmetic operations. Table II shows similar results when the MAE measure is considered. As shown in this table, over 96% and 88% of the total arithmetic operations in the FBMA and the PDSBMA, respectively, are also saved by the FFBMA. These two tables clearly indicate that, comparing the Salesman and Flower Garden sequences, more computation is needed in both the FFBMA and the PDSBMA for the Flower Garden sequence. This increase of computation complexity is mainly due to the abrupt scene changes in Flower Garden, which could make the current known minimum matching error $D_{\min}$ larger during block matching. Obviously, this larger $D_{\min}$ reduces the rejection rate of candidate blocks in the FFBMA and PDSBMA. Therefore, the reduction in computations of the FFBMA is dependent on the sequence envisaged. The more significant the motion of objects in the sequence, the less reduction of complexity the FFBMA exhibits.

TABLE I
COMPARISON OF THE COMPUTATIONAL COMPLEXITY FOR VARIOUS BLOCK-MATCHING ALGORITHMS WITH THE MSE CRITERION ACCORDING TO THE NUMBERS OF THE ARITHMETIC OPERATIONS REQUIRED FOR EACH 16 × 16 REFERENCE BLOCK

TABLE II
COMPARISON OF THE COMPUTATIONAL COMPLEXITY FOR VARIOUS BLOCK-MATCHING ALGORITHMS WITH THE MAE CRITERION ACCORDING TO THE NUMBERS OF THE ARITHMETIC OPERATIONS REQUIRED FOR EACH 16 × 16 REFERENCE BLOCK

As for the higher resolution sequences, we have also done an experiment on the Flower Garden sequence with a size of 720 × 480 pixels. This higher resolution Flower Garden consists of 60 frames which are all finer sampling versions of those in the preceding experiments. For this higher resolution sequence, Tables III and IV compare the computation complexities of the three algorithms with the MSE and MAE, respectively. Notice that the block and search window sizes were also set to 16 × 16 and 33 × 33, respectively. Referring to Tables III and IV, we can observe that the computational performance gain of the FFBMA for the higher resolution Flower Garden is less than that for the lower resolution one. This is because, for the same scene, the finer sampling version tends to make most candidate blocks' matching errors within the search window close to each other, especially in smooth parts.

TABLE III
COMPARISON OF THE COMPUTATION COMPLEXITY FOR VARIOUS ALGORITHMS WITH MSE WORKING ON 60 Flower Garden FRAMES OF 720 × 480 SIZE ACCORDING TO THE NUMBERS OF ARITHMETIC OPERATIONS REQUIRED FOR EACH 16 × 16 REFERENCE BLOCK

TABLE IV
COMPARISON OF THE COMPUTATION COMPLEXITY FOR VARIOUS ALGORITHMS WITH MAE WORKING ON 60 Flower Garden FRAMES OF 720 × 480 SIZE ACCORDING TO THE NUMBERS OF ARITHMETIC OPERATIONS REQUIRED FOR EACH 16 × 16 REFERENCE BLOCK

From these tables, we can find that the FFBMA exhibits a larger reduction of arithmetic operations for MAE as compared to the MSE. This is because the MAE criterion inherently offers higher rejection ratios of candidate blocks than the MSE criterion. To explain this fact, we show a 2-D example as follows. Assume that the 2-D vector (1,0) is the best matched vector found thus far to the origin (0,0). The minimum MSE and MAE values are both set to 1. Consider another candidate vector (0.5,0.6), whose MAE to the origin, 0.5 + 0.6 = 1.1, exceeds the current minimum. The MAE criterion can certainly reject this candidate vector by using the massive projection, that is, $|0.5 + 0.6| = 1.1 > 1$; however, in the MSE case, the vector (0.5,0.6) cannot be rejected by means of the massive projection because $(0.5 + 0.6)^2 = 1.21 < 2 \times 1 = 2$. This example shows that in the FFBMA, the MAE is superior to the MSE in the reduction of arithmetic operations.

When comparing to the suboptimal algorithms, e.g., Liu and Zaccarin's subsampled motion-field estimation algorithm [10] that reduces the complexity of the FBMA by a fixed factor of 8 at the expense of estimation accuracy, the FFBMA with MAE can provide a greater computation reduction, up to a factor of 29. Although the computation complexity of the FFBMA is dependent on the input sequence, for the worst case in Table IV, a comparable computation reduction factor of about 7.5 can be achieved by the FFBMA. In [17], Fok and Au proposed a feature domain BMA whose computation reduction factor depends on the search block size. This algorithm is also suboptimal due to the employment of the integral projection features. For the search block size of 16 × 16, the FFBMA with MAE produces a computation reduction over twofold better than Fok and Au's algorithm for the kind
of sequences like Salesman. As for the sequence of Flower Garden, a smaller computation reduction is achieved by the FFBMA as compared to Fok and Au's method.

These results verify the efficiency of the proposed FFBMA with either MSE or MAE. From the experiments, it is concluded that the FFBMA, which uses the three fast matching criteria, can perform much faster than the FBMA and the PDSBMA at the same estimate accuracy.

IV. CONCLUSIONS

A new fast full-search block-matching algorithm is presented in this paper. It runs much faster than the traditional full-search BMA, while the optimal accuracy of motion estimation is guaranteed. This improvement of speed is based on the fact that multiple matching errors with different levels of computation complexity are available at each position to be searched. The relationships among the multiple matching errors of a candidate position are utilized to construct three test conditions which can be employed during block matching to avoid the time-consuming computations of MSE or MAE measurements. As the experiments show, the proposed method yields large savings in computation, and thus can be well suited for a wide range of applications, such as videotelephony, videoconferencing, and HDTV.

REFERENCES

[1] CCITT SG XV, "Recommendation H.261 - Video codec for audiovisual services at p*64 kbits/s," Tech. Rep. COM XV-R37-E, Aug. 1990.
[2] MPEG, "ISO CD 11172-2: Coding of moving pictures and associated audio for digital storage media at up to about 1.5 M bits/s," Nov. 1991.
[3] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro, "Motion compensated interframe coding for video conferencing," in Proc. Nat. Telecommun. Conf., New Orleans, LA, Nov. 1981, pp. G5.3.1–G5.3.5.
[4] J. R. Jain and A. K. Jain, "Displacement measurement and its application in interframe image coding," IEEE Trans. Commun., vol. COM-29, pp. 1799–1808, Dec. 1981.
[5] R. Srinivasan and K. R. Rao, "Predictive coding based on efficient motion estimation," IEEE Trans. Commun., vol. COM-33, pp. 888–896, Aug. 1985.
[6] S. Kapagantula and K. R. Rao, "Motion predictive interframe coding," IEEE Trans. Commun., vol. COM-33, pp. 1011–1015, Sept. 1985.
[7] A. Puri, H. M. Hang, and D. L. Schilling, "An efficient block-matching algorithm for motion-compensated coding," in Proc. Int. Conf. Acoust., Speech, Signal Processing, 1987, pp. 25.4.1–25.4.4.
[8] M. Ghanbari, "The cross-search algorithm for motion estimation," IEEE Trans. Commun., vol. 38, pp. 950–953, July 1990.
[9] C. H. Hsieh, P. C. Lu, J. S. Shyn, and E. H. Lu, "Motion estimation algorithm using interblock correlation," Electron. Lett., vol. 26, pp. 276–277, Mar. 1990.
[10] B. Liu and A. Zaccarin, "New fast algorithms for the estimation of block motion vectors," IEEE Trans. Circuits Syst. Video Technol., vol. 3, pp. 148–157, Apr. 1993.
[11] L. W. Lee, J. F. Wang, J. Y. Lee, and J. D. Shie, "Dynamic search-window adjustment and interlaced search for block-matching algorithm," IEEE Trans. Circuits Syst. Video Technol., vol. 3, pp. 85–87, Feb. 1993.
[12] M. J. Chen, L. G. Chen, and T. D. Chiueh, "One-dimensional full search motion estimation algorithm for video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 4, pp. 504–509, Oct. 1994.
[13] F. Dufaux and F. Moscheni, "Motion estimation techniques for digital TV: A review and a new contribution," Proc. IEEE, vol. 83, pp. 858–876, June 1995.
[14] K. M. Yang, M. T. Sun, and L. Wu, "A family of VLSI designs for the motion compensation block-matching algorithm," IEEE Trans. Circuits Syst., vol. 36, pp. 1317–1325, Oct. 1989.
[15] H. M. Jong, L. G. Chen, and T. D. Chiueh, "Parallel architectures for 3-step hierarchical search block-matching algorithm," IEEE Trans. Circuits Syst. Video Technol., vol. 4, pp. 407–416, Aug. 1994.
[16] J. S. Kim and R. H. Park, "A fast feature-based block matching algorithm using integral projections," IEEE J. Select. Areas Commun., vol. 10, pp. 968–971, June 1992.
[17] Y. H. Fok and O. C. Au, "An improved fast feature-based block motion estimation," in Proc. IEEE 1994 Int. Conf. Image Processing, 1994, pp. 741–745.
[18] H. B. Park and C. Wang, "Image compression by vector quantization of projection data," IEICE Trans. Inform. Syst., vol. E75-D, pp. 148–155, Jan. 1992.
[19] K. H. Jung and C. Wang, "Projective image representation and its application to image compression," IEICE Trans. Inform. Syst., vol. E79-D, pp. 136–142, Feb. 1996.
[20] C. Bei and R. M. Gray, "An improvement of the minimum distortion encoding algorithm for vector quantization," IEEE Trans. Commun., vol. COM-33, pp. 1132–1133, Oct. 1985.
[21] S. H. Huang and S. H. Chen, "Fast encoding algorithm for VQ-based image coding," Electron. Lett., vol. 26, pp. 1618–1619, Sept. 1990.
[22] C. H. Lee and L. H. Chen, "A fast search algorithm for vector quantization using mean pyramids of codewords," IEEE Trans. Commun., vol. 43, pp. 1697–1702, Feb./Mar./Apr. 1995.