A Fast Mode Decision Algorithm For Downscaled Transcoding of H.264 Preencoded Video
978-1-4244-4316-1/10/$25.00 ©2010 IEEE
Authorized licensed use limited to: East China Normal University. Downloaded on March 23, 2024 at 07:08:51 UTC from IEEE Xplore. Restrictions apply.
When re-encoding the downscaled video using the AWMVM as a starting point for the 1-pixel motion search, we also found that a second measure, the distribution of the cost values J obtained from the 16x16 mode search within a frame, correlates significantly with the distribution of the final block sizes chosen.

Interestingly, our analysis revealed that in the vast majority of cases in which the first measure failed to indicate the best block size, the second measure made a correct prediction, and vice versa. We concluded that a combination of these two measures might yield the desired predictor that meets the two goal criteria. The following linear combination of the two measures, obtained by running an optimization algorithm, minimizes the error between the predicted block size and the final block size chosen:

$$ M_i = 1.25 \times SAR_i + J_i, \quad i \in \{1, \ldots, N\} \tag{2} $$

with Mi denoting the measure that is used to assign a set of valid modes to the given (16x16) macroblock i within a frame of N macroblocks, SARi indicating the sum of absolute residual values in the corresponding area of the preencoded video, and Ji stating the cost value obtained from the 16x16 mode search.

With this measure at hand, we now need to define the decision criteria that assign the appropriate set of modes to the current macroblock. While the distribution of M remains strongly correlated with the block-size distribution for varying motion content, the mean and the range of the values Mi within a frame change with the content. We concluded that a criterion involving the mean (µ) and the standard deviation (σ) of Mi would be suitable for this task. A close analysis revealed that the adaptive thresholds in Table 1 are the most suitable for assigning a set of valid modes to every macroblock i.

TABLE 1: VALID MODES FOR EVERY MACROBLOCK

Threshold for Mi       Valid Modes
Mi > µ+0.5σ            all modes
µ+0.5σ > Mi > µ-σ      16x16, 16x8, 8x16
Mi < µ-σ               16x16 only

... proposed algorithm and the AWMVM method for the obtained QCIF resolution stream of the Foreman sequence.

TABLE 2: EXPERIMENTAL RESULTS (CIF → QCIF), QP 20

Sequence  Method    PSNR (dB)  Bitrate (kbps)  Enc. Time (s)  SrchPts per MB  CMP/MB
Foreman   AWMVM     42.44      346.4           808.1          5,148           150,172
Foreman   Proposed  42.40      350.4           636.5          2,070           92,677
Foreman   ∆         -0.04      1.17%           -21.23%        -59.79%         -38.29%
Akiyo     AWMVM     44.49      77.4            577.2          4,395           125,418
Akiyo     Proposed  44.45      77.6            448.3          1,974           81,175
Akiyo     ∆         -0.04      0.30%           -22.34%        -55.09%         -35.28%
Mobile    AWMVM     41.40      1,109.8         647.2          5,209           140,641
Mobile    Proposed  41.35      1,124.1         500.1          2,184           85,124
Mobile    ∆         -0.05      1.28%           -22.74%        -58.07%         -39.47%
Average   ∆         -0.04      0.92%           -22.10%        -57.65%         -37.68%

Fig. 1: Foreman (CIF → QCIF). [Plot of PSNR (dB) versus bitrate (kbps) for the Proposed and AWMVM methods.]

IV. CONCLUSION

We presented an early-termination mode decision process for efficiently transcoding H.264 video content to a smaller resolution, downscaled by a factor of 0.5. The method takes advantage of the correlation between the residual of the preencoded stream, the final cost value from the 1-pixel full search around the AWMVM vector, and the set of optimal coding modes for a given macroblock. Our performance evaluations have shown that about 37% of the computations can be saved compared to the unmodified AWMVM method while preserving picture quality (-0.04 dB) and incurring a 0.92% increase in bitrate.
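The mode assignment defined by Eq. (2) and Table 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function and constant names are hypothetical, the expansion of "all modes" into the seven H.264 inter partition sizes is an assumption, σ is taken as the population standard deviation over the frame (the text does not say which estimator is used), and values lying exactly on a threshold are placed in the middle set because Table 1 leaves that case open.

```python
from statistics import mean, pstdev

# Assumption: "all modes" is read as the H.264 inter partition sizes.
ALL_MODES = ("16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4")
LARGE_MODES = ("16x16", "16x8", "8x16")
ONLY_16X16 = ("16x16",)

SAR_WEIGHT = 1.25  # weight of SAR_i in Eq. (2)


def mode_measure(sar, cost):
    """Eq. (2): M_i = 1.25 * SAR_i + J_i for one macroblock."""
    return SAR_WEIGHT * sar + cost


def assign_valid_modes(sar_values, cost_values):
    """Return the set of valid modes (Table 1) for every macroblock of a frame.

    sar_values[i]  -- sum of absolute residuals for macroblock i
    cost_values[i] -- cost J_i from the 16x16 mode search for macroblock i
    """
    if len(sar_values) != len(cost_values):
        raise ValueError("need one SAR and one cost value per macroblock")
    if not sar_values:
        return []

    measures = [mode_measure(s, j) for s, j in zip(sar_values, cost_values)]
    mu = mean(measures)
    sigma = pstdev(measures)  # assumption: population standard deviation
    upper = mu + 0.5 * sigma
    lower = mu - sigma

    modes = []
    for m in measures:
        if m > upper:
            modes.append(ALL_MODES)
        elif m < lower:
            modes.append(ONLY_16X16)
        else:
            # Assumption: values exactly on a threshold fall in the middle set.
            modes.append(LARGE_MODES)
    return modes


if __name__ == "__main__":
    # Made-up per-macroblock values, for illustration only.
    sar = [4, 8, 8, 24]
    cost = [0, 10, 10, 25]
    for i, valid in enumerate(assign_valid_modes(sar, cost)):
        print(i, valid)
```

Because µ and σ are recomputed for each frame, the thresholds adapt to the content: the same Mi can map to different mode sets in a low-motion frame and in a high-motion frame.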