Software Manual
Version: Last update: JSVM 9.18 (CVS tag: JSVM_9_18) June 19th, 2009
Summary: This document contains a detailed description of the usage and configuration of the JSVM (Joint Scalable Video Model) software for the Scalable Video Coding (SVC) project of the Joint Video Team (JVT) of the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG). It explains how to build the software on Windows 32/64 bit and Linux 32/64 bit platforms. It describes the usage and configuration of the binaries built from the software package, including examples for spatial, SNR and combined scalability scenarios. Guidelines for the integration and validation of new tools in the software are also provided.
File: 111010436.doc
Table of Contents
JSVM Software Manual
1 General Information
2 Usage and configuration of the JSVM software
3 Use Examples as a brief tutorial
4 Information for Software Integration
1 General Information
The JSVM (Joint Scalable Video Model) software is the reference software for the Scalable Video Coding (SVC) project of the Joint Video Team (JVT) of the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG). Since the SVC project is still under development, the JSVM software is also under development and changes frequently. The JSVM software is written in C++ and is provided as source code. Section 1.1 describes how the JSVM software can be obtained via a CVS server. Information about the structure of the CVS repository is presented in section 1.2. Section 1.3 describes how the JSVM software can be built on Win32 and Linux platforms, and section 1.4 gives basic information about the binaries that are contained in the JSVM software package.
Example 1 shows how the JSVM software can be accessed by using a command line CVS client.
Example 1: Accessing the JSVM software with a command line CVS client
cvs -d :pserver:jvtuser:[email protected]:/cvs/jvt login
cvs -d :pserver:[email protected]:/cvs/jvt checkout jsvm
In Example 2, it is shown how a specific JSVM software version specified by a tag (JSVM_4_5 in Example 2) can be obtained using a command line CVS client. Note that co represents an abbreviation for the command checkout, which was used in Example 1.
Example 2: Accessing the JSVM software version with the tag JSVM_4_5 with a command line CVS client
cvs -d :pserver:jvtuser:[email protected]:/cvs/jvt login
cvs -d :pserver:[email protected]:/cvs/jvt co -r JSVM_4_5 jsvm
It is possible to check out only a reduced JSVM software package by using the alias jsvm_red instead of jsvm. In this case, the directories JSVM0-config-sample and MVC-Configs are omitted from the checkout, see Example 3.
Example 3: Accessing the JSVM software without the JSVM0 and MVC directories.
cvs -d :pserver:jvtuser:[email protected]:/cvs/jvt login
cvs -d :pserver:[email protected]:/cvs/jvt co jsvm_red
This file describes the (main) changes from one CVS version to the next. It starts with JSVM version 4.0 (CVS tag: JSVM_4_0).
SoftwareManual.doc: software manual (this document).
bin/QualityLevelAssignerStatic.exe
bin/QualityLevelAssignerStaticd.exe
bin/SIPAnalyser.exe
bin/SIPAnalyserd.exe
bin/YUVCompareStatic.exe
bin/YUVCompareStaticd.exe
===== libraries =====
lib/AvcRewriterLibStatic.lib
lib/AvcRewriterLibStaticd.lib
lib/H264AVCCommonLibStatic.lib
lib/H264AVCCommonLibStaticd.lib
lib/H264AVCDecoderLibStatic.lib
lib/H264AVCDecoderLibStaticd.lib
lib/H264AVCEncoderLibStatic.lib
lib/H264AVCEncoderLibStaticd.lib
lib/H264AVCVideoIoLibStatic.lib
lib/H264AVCVideoIoLibStaticd.lib
By replacing make with make release or make debug in Example 5, it can be specified that only the release or only the debug versions of the libraries and executables are built. After building the software, the folders bin and lib contain the binaries and libraries summarized in Example 6. Note that there are two versions of each binary or library, one with and one without a trailing d in the base name (before the dot of the file extension, if there is one). The versions with the trailing d have been built in debug mode, while the versions without it have been built in release mode. When the command make release or make debug was used, only the release or debug versions are present, respectively.
Example 6: Binaries and libraries after building the software on Linux
===== binaries =====
bin/AvcRewriterStatic
bin/AvcRewriterStaticd
bin/BitStreamExtractorStatic
bin/BitStreamExtractorStaticd
bin/DownConvertStatic
bin/DownConvertStaticd
bin/FixedQPEncoderStatic
bin/FixedQPEncoderStaticd
bin/H264AVCDecoderLibTestStatic
bin/H264AVCDecoderLibTestStaticd
bin/H264AVCEncoderLibTestStatic
bin/H264AVCEncoderLibTestStaticd
bin/MCTFPreProcessorStatic
bin/MCTFPreProcessorStaticd
bin/PSNRStatic
bin/PSNRStaticd
bin/QualityLevelAssignerStatic
bin/QualityLevelAssignerStaticd
bin/SIPAnalyser
bin/SIPAnalyserd
bin/YUVCompareStatic
bin/YUVCompareStaticd
===== libraries =====
lib/libAvcrewriterLibStatic.a
lib/libAvcrewriterLibStaticd.a
lib/libH264AVCCommonLibStatic.a
lib/libH264AVCCommonLibStaticd.a
lib/libH264AVCDecoderLibStatic.a
lib/libH264AVCDecoderLibStaticd.a
lib/libH264AVCEncoderLibStatic.a
lib/libH264AVCEncoderLibStaticd.a
lib/libH264AVCVideoIoLibStatic.a
lib/libH264AVCVideoIoLibStaticd.a
decoder
The decoder can be used for decoding AVC or SVC bit-streams and reconstructing raw video sequences. More information on using the decoder is provided in section 2.3.

AvcRewriter: SVC to AVC rewriter
The rewriter can be used for rewriting an SVC bit-stream into an AVC bit-stream when avc_rewrite_flag is present in the enhancement layers. More information on using the rewriter is provided in section 2.4.

BitStreamExtractorStatic: bit-stream extractor
The bit-stream extractor can be used for extracting sub-bit-streams with a lower spatio-temporal resolution and/or bit-rate from a global scalable (SVC) bit-stream. More information on using the bit-stream extractor is provided in section 2.5.

QualityLevelAssignerStatic: quality level assigner
The quality level assigner can be used for generating a bit-stream with additional quality layer information from a given scalable bit-stream. Apart from the additional quality layer information, the input and output bit-streams are identical. More information on using the quality level assigner is provided in section 2.6.

MCTFPreProcessor: MCTF pre-processor
The MCTF pre-processing tool can be used for pre-filtering image sequences. More information on using the MCTF pre-processor is provided in section 2.7.

PSNRStatic: PSNR tool
The PSNR tool can be used for measuring the PSNR between two raw video sequences. In addition, it can be used for measuring the bit-rate of a given bit-stream. More information on using the PSNR tool is provided in section 2.8.

FixedQPEncoderStatic: fixed QP encoder
This tool can be used for controlling the encoder and adjusting the bit-rate of the generated bit-stream. More information on using this tool is provided in section 2.9.

SIPAnalyser: SIP analyser
The SIPAnalyser tool can be used to make the selective inter-layer prediction decision. More information on using the SIPAnalyser tool is provided in section 2.10.

YUVCompareStatic: tool for comparing YUV sequences
This tool can be helpful for finding and debugging encoder-decoder mismatches. Call this tool without command line parameters to obtain a brief usage explanation.
  hin    : input height  (luma samples)
  in     : input file
  wout   : output width  (luma samples)
  hout   : output height (luma samples)
  out    : output file
--------------------------- OPTIONAL ---------------------------
  method : rescaling methods (default: 0)
             0: normative upsampling
                non-normative downsampling (JVT-R006)
             1: dyadic upsampling (AVC 6-tap (1/2 pel) on odd luma samples)
                dyadic downsampling (MPEG-4 downsampling filter)
             2: crop only
             3: upsampling (Three-lobed Lanczos-windowed sinc)
             4: upsampling (JVT-O041: AVC 6-tap (1/2 pel) + bilinear 1/4 pel)
  t      : number of temporal downsampling stages (default: 0)
  skip   : number of frames to skip at start (default: 0)
  frms   : number of frames wanted in output file (default: max)
-------------------------- OVERLOADED --------------------------
  -crop <type> <parameters>
     type   : 0: Sequence level, 1: Picture level
     params : IF Sequence level: <x_orig> <y_orig> <crop_width> <crop_height>
                 cropping window origin (x,y) and dimensions (width and height)
              IF Picture level: <crop_file>
                 input file containing cropping window parameters.
                 each line has four integer numbers separated by a comma
                 as following: "x_orig, y_orig, crop_width, crop_height"
                 for each picture to be resampled;
  -phase <in_uv_ph_x> <in_uv_ph_y> <out_uv_ph_x> <out_uv_ph_y>
     in_uv_ph_x : input  chroma phase shift in horizontal direction (default:-1)
     in_uv_ph_y : input  chroma phase shift in vertical   direction (default: 0)
     out_uv_ph_x: output chroma phase shift in horizontal direction (default:-1)
     out_uv_ph_y: output chroma phase shift in vertical   direction (default: 0)
  -resample_mode <resample_mode> (default: 0)
     0: low-res-frm = progressive, high-res-frm = progressive
     1: low-res-frm = interlaced, high-res-frm = interlaced
     2: low-res-frm = progressive (top-coincided), high-res-frm = interlaced
     3: low-res-frm = progressive (bot-coincided), high-res-frm = interlaced
     4: low-res-frm = interlaced (top-first), high-res-frm = progressive (double frm rate)
     5: low-res-frm = interlaced (bot-first), high-res-frm = progressive (double frm rate)
negative values (i.e. temporal upsampling) for the parameter t. With the parameter skip, the number of frames to be skipped at the beginning of the input sequence can be specified. With the optional parameter frms, the maximum number of output frames to be produced can be specified.
2.1.2.1 Upsampling
When method is set to 0, the SVC normative upsampling method designed to support Extended Spatial Scalability (ESS) is applied. It is based on a set of integer-based 4-tap filters, which were originally derived from the Lanczos-3 filter. Arbitrary inter-layer scaling ratios, which may differ between the horizontal and vertical direction, are supported.

When method is set to 1, only dyadic rescaling ratios are supported, i.e. the ratio between the input and output dimensions shall be a power of 2. The upsampling is realized via several dyadic stages. By default, in each stage, every second sample in horizontal and vertical direction is given by a sample of the input image, and the missing luma samples are interpolated using the AVC half-sample interpolation filter {1, -5, 20, 20, -5, 1} / 32. The missing chroma samples are interpolated using the simple filter {16, 16} / 32.

When method is set to 3, upsampling is performed using three-lobed Lanczos-windowed sinc functions. Arbitrary inter-layer scaling ratios, which may differ between the horizontal and vertical direction, are supported.

When method is set to 4, a combination of the AVC half-sample filter (see above) and a bi-linear filter is used.
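The dyadic luma interpolation described for method 1 can be illustrated with a minimal one-dimensional sketch. It is not taken from the JSVM sources: the function name upsample_luma_row_2x, the border handling (edge replication), the rounding offset and the clipping to an 8-bit range are assumptions made for illustration only.

```python
# Illustrative 1-D sketch of one dyadic upsampling stage for luma samples.
# Assumptions (not from the JSVM sources): borders are handled by edge
# replication, results are rounded to nearest and clipped to [0, max_value].

AVC_HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)  # normalised by 32


def upsample_luma_row_2x(row, max_value=255):
    """Return a row of twice the length.

    Even output positions copy the input samples; odd output positions are
    interpolated with the 6-tap filter centred between two input samples.
    """
    n = len(row)
    out = []
    for i in range(n):
        out.append(row[i])  # existing input sample is kept unchanged
        acc = 0
        for k, tap in enumerate(AVC_HALF_PEL_TAPS):
            # The taps cover input positions i-2 .. i+3; clamp at the borders.
            pos = min(max(i - 2 + k, 0), n - 1)
            acc += tap * row[pos]
        value = (acc + 16) >> 5  # division by 32 with rounding
        out.append(min(max(value, 0), max_value))
    return out
```

A two-dimensional stage would apply the same operation to rows and columns; a flat input row stays flat because the filter taps sum to 32.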
2.1.2.2 Downsampling
When method is set to 0, a downsampling method based on the sine-windowed sinc function is applied. A set of filters has been designed to support the extended range of spatial scaling ratios required by ESS. Arbitrary inter-layer scaling ratios, which may differ between the horizontal and vertical direction, are supported.

When method is set to 1, dyadic downsampling is used. As for the dyadic upsampling, the downsampling is done in several stages, where in each stage a scaling factor of 0.5 is applied in both horizontal and vertical direction. In each stage, the input images are filtered with the MPEG-4 downsampling filter {2, 0, -4, -3, 5, 19, 26, 19, 5, -3, -4, 0, 2} / 64, and the output is given by every second sample (in horizontal and vertical direction) of the filtered images. Note that the ratio between the input and output dimensions shall be a power of 2.

In Example 8, the command line call for downsampling a 4CIF (704x576 samples) sequence with a frame rate of 60Hz to a QCIF (176x144 samples) sequence with a frame rate of 15Hz is illustrated. Please note that this example uses method 1, which is not the recommended method for simulations with the JSVM (see section 3.1). In Example 9, the resampling of a CIF (352x288 samples) 30Hz sequence to a 528x432 15Hz sequence using the SVC normative upsampler is illustrated.
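One dyadic downsampling stage as described for method 1 (filter, then keep every second sample) can be sketched in one dimension as follows. This is an illustration, not the JSVM implementation: the function name downsample_row_2x, the edge-replication border handling, the rounding and the 8-bit clipping are assumptions.

```python
# Illustrative 1-D sketch of one dyadic downsampling stage.
# Assumptions (not from the JSVM sources): edge replication at the borders,
# rounding to nearest, clipping to [0, max_value].

MPEG4_DOWNSAMPLING_TAPS = (2, 0, -4, -3, 5, 19, 26, 19, 5, -3, -4, 0, 2)  # / 64


def downsample_row_2x(row, max_value=255):
    """Filter the row with the 13-tap filter and keep every second sample."""
    n = len(row)
    half = len(MPEG4_DOWNSAMPLING_TAPS) // 2  # 6 taps on each side of the centre
    out = []
    for i in range(0, n, 2):  # only every second filtered sample is needed
        acc = 0
        for k, tap in enumerate(MPEG4_DOWNSAMPLING_TAPS):
            pos = min(max(i - half + k, 0), n - 1)  # clamp at the borders
            acc += tap * row[pos]
        value = (acc + 32) >> 6  # division by 64 with rounding
        out.append(min(max(value, 0), max_value))
    return out
```

Applying the function twice corresponds to two dyadic stages, e.g. from a width of 704 samples to a width of 176 samples.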
Example 8: Down-sampling a 4CIF 60Hz sequence to a QCIF 15Hz sequence using the dyadic method.
DownConvertStatic 704 576 4CIF60.yuv 176 144 QCIF15.yuv 1 2
Example 9: Resampling of a CIF 30Hz sequence to a 528x432 15Hz sequence using the normative upsampler.
DownConvertStatic 352 288 CIF30.yuv 528 432 528x432_15.yuv 0 1
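The last argument in Examples 8 and 9 is the number of temporal downsampling stages t, where each stage halves the frame rate. The following sketch derives t from the input and output frame rates and assembles the argument string; the helper names temporal_stages and downconvert_command are hypothetical and only cover temporal downsampling.

```python
# Sketch: derive the number of temporal downsampling stages from the frame
# rates and build a DownConvertStatic call like the ones in Examples 8 and 9.
# The helper names are hypothetical; only temporal downsampling is handled.


def temporal_stages(input_fps, output_fps):
    """Number of frame-rate halvings needed to go from input_fps to output_fps."""
    if input_fps <= 0 or output_fps <= 0 or input_fps % output_fps != 0:
        raise ValueError("frame rates must be positive with an integer ratio")
    ratio = input_fps // output_fps
    if ratio & (ratio - 1) != 0:
        raise ValueError("frame-rate ratio must be a power of 2")
    return ratio.bit_length() - 1


def downconvert_command(win, hin, infile, wout, hout, outfile,
                        method, input_fps, output_fps):
    """Return the command line as a single string."""
    t = temporal_stages(input_fps, output_fps)
    args = ["DownConvertStatic", win, hin, infile, wout, hout, outfile, method, t]
    return " ".join(str(a) for a in args)


print(downconvert_command(704, 576, "4CIF60.yuv", 176, 144, "QCIF15.yuv", 1, 60, 15))
```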
Example 11 illustrates the resampling of a 720p (HD) sequence with a resolution of 1280x720 samples and a frame rate of 50Hz to an SD sequence with a resolution of 720x576 samples and a frame rate of 25Hz. In this example, the aspect ratio of the samples is kept, but the left and right borders of the HD sequence in 16:9 format are cut off in order to obtain an SD sequence in 5:4 format. First, an area of 900x720 samples is cropped out of the middle of each HD picture (horizontal offset of 190 samples); then the cropped pictures are downsampled with a resampling factor of 4/5 in both horizontal and vertical direction.
Example 11: Resampling of an HD sequence (1280x720, 50Hz) to an SD sequence (720x576, 25Hz)
DownConvertStatic.exe 1280 720 720p50.yuv 720 576 SD25.yuv 0 -crop 0 crop.txt 1
content of crop.txt:
190 0 900 720
Note that the resampler can be used in crop only mode by setting the method argument to 2. In case no crop option is specified, the width and height of the cropping window are the output parameters wout and hout.
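The cropping window used in Example 11 can be derived from the input and output dimensions when the same resampling factor is to be applied horizontally and vertically. The following sketch reproduces that arithmetic; the function name centred_crop_window is hypothetical, and centring the window is the choice made in the example, not a requirement of the tool.

```python
# Sketch: centred cropping window that keeps the sample aspect ratio,
# i.e. uses one resampling factor for both directions.
# The function name is hypothetical; exact fractions avoid rounding issues.
from fractions import Fraction


def centred_crop_window(in_w, in_h, out_w, out_h):
    """Return (x_orig, y_orig, crop_width, crop_height, factor)."""
    # The larger of the two ratios is the factor for which the cropping
    # window still fits inside the input picture in both directions.
    factor = max(Fraction(out_w, in_w), Fraction(out_h, in_h))
    crop_w = Fraction(out_w) / factor
    crop_h = Fraction(out_h) / factor
    if crop_w.denominator != 1 or crop_h.denominator != 1:
        raise ValueError("cropping window is not an integer number of samples")
    crop_w, crop_h = int(crop_w), int(crop_h)
    x_orig = (in_w - crop_w) // 2
    y_orig = (in_h - crop_h) // 2
    return x_orig, y_orig, crop_w, crop_h, factor


print(centred_crop_window(1280, 720, 720, 576))
```

For the 1280x720 to 720x576 case this gives a 900x720 window at horizontal offset 190 and a factor of 4/5, matching the values stated for Example 11.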
It should be noted that using the encoder does not guarantee rate-distortion efficient coding. To obtain optimized encoding results, the encoder configuration has to be specified carefully. Special care has to be taken when running the encoder in scalable mode, since the coding efficiency generally decreases with the number of supported scalability options. It should further be noted that the current encoder implementation does not provide rate control. The bit-rate needs to be controlled by selecting appropriate quantization parameters. Examples for using the encoder are described in section 2.9.

The encoder can be run in two different coding modes: single-layer coding mode and scalable coding mode. Although single-layer bit-streams can also be generated in the scalable coding mode, the single-layer coding mode provides more flexibility but lacks support for generating scalable bit-streams. The encoding mode is specified by the parameter AVCMode inside the main configuration file. When this parameter is not present or equal to 0, the encoder is run in scalable coding mode; otherwise, the encoder is operated in single-layer coding mode. The configuration file parameters and command line options for the single-layer coding mode are described in section 2.2.1, while the configuration file parameters and command line options for the scalable coding mode are described in section 2.2.2.

Generally, a configuration file is a collection of configuration parameters. Each configuration parameter is specified in one line of the configuration file. Comments are started by the character #. The order of configuration parameters inside a configuration file is arbitrary. Each configuration parameter has a default value, and when the configuration parameter is not present in the configuration file, the default value is taken instead. Thus, it is generally not required to specify all configuration parameters in a configuration file.
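The configuration file rules above (one parameter per line, comments introduced by #, arbitrary order, defaults for absent parameters) can be sketched with a small reader. This is not the parser used by the encoder: the function name parse_config is hypothetical, the assumption that name and value are the first two whitespace-separated tokens of a line is ours, and only a few of the defaults listed in this manual are included.

```python
# Sketch of a reader for the configuration file format described above.
# Assumptions: name and value are the first two whitespace-separated tokens
# of a line; values are kept as strings. Only a few defaults are listed.

DEFAULTS = {
    "AVCMode": "0",
    "InputFile": "in.yuv",
    "OutputFile": "test.264",
    "ReconFile": "rec.yuv",
    "FrameRate": "60.0",
    "FramesToBeEncoded": "1",
}


def parse_config(text, defaults=DEFAULTS):
    """Return a dict of parameter names to string values."""
    params = dict(defaults)  # parameters that are absent keep their default
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # blank line or comment-only line
        fields = line.split()
        if len(fields) < 2:
            raise ValueError("parameter without value: " + raw_line.strip())
        params[fields[0]] = fields[1]
    return params


example = """
#============================== MOTION SEARCH ================
SearchMode        4   # Search mode (0:BlockSearch, 4:FastSearch)
SourceWidth       352
SourceHeight      288
"""
config = parse_config(example)
print(config["SearchMode"], config["InputFile"])
```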
#============================== MOTION SEARCH ==================================
SearchMode              4     # Search mode (0:BlockSearch, 4:FastSearch)
SearchFuncFullPel       3     # Search function full pel
                              #   (0:SAD, 1:SSE, 2:HADAMARD, 3:SAD-YUV)
SearchFuncSubPel        2     # Search function sub pel
                              #   (0:SAD, 1:SSE, 2:HADAMARD)
SearchRange             32    # Search range (Full Pel)
FastBiSearch            1     # Fast bi-directional search (0:off, 1:on)
BiPredIter              4     # Max iterations for bi-pred search
IterSearchRange         8     # Search range for iterations (0: normal)
#============================== LOOP FILTER ====================================
LoopFilterDisable       0     # Loop filter idc (0: on, 1: off, 2:
                              #   on except for slice boundaries)
LoopFilterAlphaC0Offset 0     # AlphaOffset(-6..+6): valid range
LoopFilterBetaOffset    0     # BetaOffset (-6..+6): valid range
#============================== WEIGHTED PREDICTION ============================
WeightedPrediction      0     # Weighting IP Slice (0:disable, 1:enable)
WeightedBiprediction    0     # Weighting B  Slice (0:disable, 1:explicit, 2:implicit)
#=============================== HRD =====================================
EnableVclHRD            0     # Type I HRD  (default 0:Off, 1:on)
EnableNalHRD            0     # Type II HRD (default 0:Off, 1:on)
AVCMode
Flag (0 or 1), default: 0 Specifies whether the encoder is run in single-layer coding mode, which is also referred to as Multiview coding mode, since this mode was implemented to support multiview coding. When AVCMode is equal to 1, the scalability tools cannot be used, but the coding structure is not restricted to dyadic prediction structures (cf. section 2.2.1.2). Note that, depending on the value of AVCMode, several configuration parameters have no effect. When AVCMode is equal to 0, the encoder is run in scalable coding mode, which is described in section 2.2.2.
InputFile
String, default: in.yuv Specifies the filename of the original raw video sequence to be encoded.
OutputFile
String, default: test.264 Specifies the filename of the bit-stream to be generated.
ReconFile
String, default: rec.yuv Specifies the filename of the coded and reconstructed input sequence. This sequence is provided for debugging purposes.
SourceWidth
Unsigned Int, default: 0 (invalid) Specifies the width of the input images in luma samples. SourceWidth shall be non-zero and a multiple of 16. This parameter shall be present in each configuration file, since the default value of 0 is invalid.
SourceHeight
Unsigned Int, default: 0 (invalid) Specifies the height of the input images in luma samples. SourceHeight shall be non-zero and a multiple of 16. This parameter shall be present in each configuration file, since the default value of 0 is invalid.
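The constraints on SourceWidth and SourceHeight (non-zero, multiple of 16) can be checked before starting an encoding run. The following is a minimal sketch; the function name macroblock_grid is hypothetical, and the interpretation of the factor 16 as the luma size of a macroblock comes from the H.264/AVC design rather than from this manual.

```python
# Sketch: validate SourceWidth / SourceHeight as required above.
# The function name is hypothetical. A macroblock covers 16x16 luma samples,
# so valid dimensions correspond to a whole number of macroblocks.


def macroblock_grid(source_width, source_height):
    """Return (macroblocks per row, macroblock rows) or raise ValueError."""
    for name, value in (("SourceWidth", source_width),
                        ("SourceHeight", source_height)):
        if value == 0 or value % 16 != 0:
            raise ValueError(name + " shall be non-zero and a multiple of 16")
    return source_width // 16, source_height // 16


print(macroblock_grid(352, 288))
```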
FrameRate
Double, default: 60.0 Specifies the frame rate of the input sequence in Hz.
FramesToBeEncoded
Unsigned Int, default: 1 Specifies the number of frames of the input sequence to be encoded.
File: 111010436.doc
Page: 14
SymbolMode
Flag (0 or 1), default: 1 Specifies the entropy coding mode. When SymbolMode is equal to 0, the video sequence is encoded using variable length codes (VLC). When SymbolMode is equal to 1, the video sequence is encoded using context-adaptive binary arithmetic coding (CABAC). CABAC usually provides an increased coding efficiency.
Enable8x8Transform
Flag (0 or 1), default: 0 Specifies whether the 8x8 transform (High Profile) is enabled. When Enable8x8Transform is equal to 1, the 8x8 transform is enabled in addition to the standard 4x4 transform for the luminance component; otherwise, the 8x8 transform is disabled. The coding efficiency is generally increased by enabling the 8x8 transform, especially for high-resolution source material.
ScalingMatricesPresent
Flag (0 or 1), default: 0 Specifies whether (non-flat) scaling matrices are used. When ScalingMatricesPresent is equal to 0, the flat scaling matrices are used (with all entries equal to 16). When ScalingMatricesPresent is equal to 1, the (non-flat) default scaling matrices are used.
BiPred8x8Disable
Flag (0 or 1), default: 1 Specifies whether bi-prediction is disabled for blocks smaller than 8x8. When BiPred8x8Disable is equal to 1, bi-prediction is disabled for sub-macroblock partitions that contain blocks smaller than 8x8.
MCBlocksLT8x8Disable
Flag (0 or 1), default: 0 Specifies whether motion-compensated blocks smaller than 8x8 are disabled. When MCBlocksLT8x8Disable is equal to 1, sub-macroblock partitions that contain blocks smaller than 8x8 are disabled.
BasisQP
Double, default: 26 Specifies the basic quantization parameter. This parameter shall be used to control the bit-rate of a bitstream. The actual quantization parameters that are used for encoding a specific frame of a video sequence are additionally dependent on the parameter SequenceFormatString and the parameters DeltaLayerXQuant (with X in the range of 0 to 5, inclusive). More information on how quantization parameters for specific frames are chosen is given in the description of the parameters DeltaLayerXQuant.
DPBSize
Unsigned Int, default: 1 Specifies the minimum size of the decoded picture buffer (DPB) in frames. Depending on this parameter, the Level for encoding a video sequence is selected.
NumRefFrames
Unsigned Int, default: 1 Specifies the maximum number of reference frames that are stored in the DPB and can be referenced via Inter prediction of a following frame. The parameter also specifies the value of the syntax element num_ref_frames. NumRefFrames shall not be greater than DPBSize.
Log2MaxFrameNum
Unsigned Int, default: 4 Specifies the maximum value of the syntax element frame_num. It also specifies the value of the syntax element log2_max_frame_num_minus4. Log2MaxFrameNum shall be in the range of 4 to 16, inclusive. Note that this parameter shall be large enough to allow encoding/decoding of the sequence structure specified by SequenceFormatString. A greater value of Log2MaxFrameNum also increases the robustness against packet losses. When the value of Log2MaxFrameNum is too small in an error-prone environment, the actual number of missing frames cannot be detected.
Log2MaxPocLsb
Unsigned Int, default: 4 Specifies the number of bits that are used for transmitting the syntax element pic_order_cnt_lsb. It also specifies the value of the syntax element log2_max_pic_order_cnt_lsb_minus4. Log2MaxPocLsb shall be in the range of 4 to 15, inclusive. Note that this parameter shall be large enough to allow encoding/decoding of the sequence structure specified by SequenceFormatString. A greater value of Log2MaxPocLsb also increases the robustness against packet losses. When the value of Log2MaxPocLsb is too small in an error-prone environment, the output order of decoded pictures cannot be correctly determined.
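The relation between Log2MaxFrameNum, Log2MaxPocLsb and the corresponding syntax elements can be made concrete with a short sketch. The function name sps_counter_fields is hypothetical; the ranges are the ones stated in this manual, and the wrap-around interpretation (frame_num and pic_order_cnt_lsb are coded modulo a power of two) follows the H.264/AVC syntax.

```python
# Sketch: derive the "minus4" syntax element values and the wrap-around
# periods from Log2MaxFrameNum and Log2MaxPocLsb.
# The function name is hypothetical; the ranges are those given in the manual.


def sps_counter_fields(log2_max_frame_num, log2_max_poc_lsb):
    if not 4 <= log2_max_frame_num <= 16:
        raise ValueError("Log2MaxFrameNum shall be in the range 4..16")
    if not 4 <= log2_max_poc_lsb <= 15:
        raise ValueError("Log2MaxPocLsb shall be in the range 4..15")
    return {
        "log2_max_frame_num_minus4": log2_max_frame_num - 4,
        # frame_num takes values 0 .. frame_num_period - 1 and then wraps
        "frame_num_period": 1 << log2_max_frame_num,
        "log2_max_pic_order_cnt_lsb_minus4": log2_max_poc_lsb - 4,
        # pic_order_cnt_lsb is transmitted with log2_max_poc_lsb bits
        "pic_order_cnt_lsb_period": 1 << log2_max_poc_lsb,
    }


print(sps_counter_fields(4, 4))
```

With the default value 4, both counters wrap after 16 values, which illustrates why larger values make it easier to detect lost frames.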
SequenceFormatString
String, default: A0*n{P0} Specifies the coding structure for the video sequence. This parameter allows a very flexible configuration; more information about using this parameter is provided in section 2.2.1.2.
MaxRefIdxActiveP
Unsigned Int, default: 1 Specifies the maximum number of active entries in the reference picture list for P pictures.
SearchMode
Int, default: 0 Specifies the motion search algorithm to be applied. When SearchMode is equal to 0, an exhaustive block search is employed. When SearchMode is equal to 4, a fast motion search algorithm is employed. Values other than 0 or 4 lead to unspecified behaviour. The fast motion search algorithm should be preferred, since it provides a comparable rate-distortion efficiency but significantly reduces the encoding time.
SearchFuncFullPel
Int, default: 0 Specifies the distortion measure that is applied for the motion search on integer-sample positions. The following values are supported:
0: Sum of absolute differences (SAD) for the luminance component
1: Sum of squared differences (SSE) for the luminance component
2: Sum of absolute differences in the Hadamard transform domain for the luminance component
3: Sum of absolute differences (SAD) for all colour components
SearchFuncSubPel
Int, default: 0 Specifies the distortion measure that is applied for the motion search on sub-sample positions. The following values are supported:
0: Sum of absolute differences (SAD) for the luminance component
1: Sum of squared differences (SSE) for the luminance component
2: Sum of absolute differences in the Hadamard transform domain for the luminance component
SearchRange
Unsigned Int, default: 96 Specifies the maximum search range for the motion search. Note that when the fast search algorithm is selected, the actual search range can be smaller.
FastBiSearch
Flag (0 or 1), default: 0 Specifies the usage of fast bi-directional search. When FastBiSearch is equal to 1, the fast bi-directional motion search is used; otherwise, the standard bi-directional motion search is used.
BiPredIter
Unsigned Int, default: 4 Specifies the number of iterations of the motion search for bi-predictive blocks. The coding efficiency for B pictures is usually increased when the parameter BiPredIter is set to a value greater than or equal to 2.
IterSearchRange
Unsigned Int, default: 8 Specifies the search range for the motion search iterations for bi-predictive blocks. Since an initial search is performed with the search range specified by SearchRange, this parameter can be set to significantly smaller values without decreasing the coding efficiency.
LoopFilterDisable
Unsigned Int, default: 0 Specifies how the in-loop deblocking filter is applied. The following values are supported:
0: The deblocking filter is applied to all block edges.
1: The deblocking filter is not applied.
2: The deblocking filter is applied to all block edges with the exception of slice boundaries.
LoopFilterAlphaC0Offset
Int, default: 0 Specifies the alpha offset for the deblocking filter. LoopFilterAlphaC0Offset shall be in the range of -6 to 6, inclusive. This parameter can be used to adjust the strength of the deblocking filter.
LoopFilterBetaOffset
Int, default: 0 Specifies the beta offset for the deblocking filter. LoopFilterBetaOffset shall be in the range of -6 to 6, inclusive. This parameter can be used to adjust the strength of the deblocking filter.
WeightedPrediction
Flag (0 or 1), default: 0 Specifies whether weighted prediction is used for P pictures. When this parameter is equal to 0, weighted prediction for P pictures is disabled. When this parameter is equal to 1, weighted prediction for P pictures is enabled, and the prediction weights are estimated during encoding.
WeightedBiPrediction
Unsigned Int, default: 0 Specifies whether and how weighted prediction is used for B pictures. When this parameter is equal to 0, weighted prediction for B pictures is disabled. When this parameter is equal to 1, weighted prediction for B pictures is enabled and operated in explicit weighting mode; the corresponding prediction weights are estimated during encoding. When this parameter is equal to 2, weighted prediction for B pictures is enabled and operated in implicit prediction mode. In the implicit mode, the prediction weights are not estimated and transmitted, but inferred from the distances (measured via Picture Order Count) of the reference pictures to the picture currently to be encoded.
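The idea behind the implicit mode, weights inferred from Picture Order Count distances, can be illustrated with a simplified floating-point sketch. It is deliberately not the normative derivation: the H.264/AVC standard uses an integer computation with clipping and a fallback to equal weights, none of which is reproduced here. The function name implicit_weights_simplified is hypothetical.

```python
# Simplified illustration of distance-based (implicit) bi-prediction weights.
# NOT the normative integer derivation of the standard; the function name
# and the floating-point formulation are assumptions for illustration.


def implicit_weights_simplified(poc_cur, poc_ref0, poc_ref1):
    """Return (w0, w1) with w0 + w1 == 1.

    The reference picture that is closer to the current picture (in POC
    distance) receives the larger weight.
    """
    span = poc_ref1 - poc_ref0
    if span == 0:
        return 0.5, 0.5  # both references at the same POC: equal weights
    w1 = (poc_cur - poc_ref0) / span
    return 1.0 - w1, w1


print(implicit_weights_simplified(2, 0, 8))
```

For a picture with POC 2 between references with POC 0 and POC 8, the nearer reference (list 0) gets the weight 0.75 and the farther one 0.25.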
EnableVclHRD
Unsigned Int, default: 0 Specifies whether the Type I HRD is enabled. When this parameter is equal to 0, Type I HRD information is not generated. When this parameter is equal to 1, Type I HRD information is generated.
EnableNalHRD
Unsigned Int, default: 0 Specifies whether the Type II HRD is enabled. When this parameter is equal to 0, Type II HRD information is not generated. When this parameter is equal to 1, Type II HRD information is generated.
where GSP is a string that specifies a general sequence part. It generally consists of an arbitrary combination of frame sequence parts (FSP) and/or further general sequence parts (GSP), which can optionally be enclosed in curly brackets preceded by an asterisk and a syntax element REP specifying the number of repetitions of the sub-sequence structure inside the curly brackets:
GSP: [*REP{]GSP|FSP[GSP|FSP][GSP|FSP][}]
The syntax element REP specifies the number of repetitions of the sub-sequence enclosed in curly brackets. It can be either a decimal number or the character n, which specifies that the sub-sequence structure enclosed in curly brackets is repeated forever:
REP: [(decimal number)|n]
A frame sequence part (FSP) consists of a combination of frame descriptions (FDES), which can optionally be enclosed in curly brackets preceded by an asterisk and the syntax element REP specifying the number of repetitions of the sub-sequence structure enclosed in the curly brackets, in the same manner as for the general sequence part (GSP):
FSP: [*REP{]FDES[FDES][FDES][}]
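The repetition syntax *REP{...} can be illustrated with a small expander that unrolls nested groups into a flat string. It treats everything outside the repetition markers as opaque text and is not the encoder's parser; the function name expand_repetitions and the forever_count parameter (used to unroll *n{...} a finite number of times) are assumptions for illustration.

```python
# Sketch: unroll "*REP{...}" groups of a sequence format string.
# Frame descriptions are treated as opaque text. "*n{...}" (repeat forever)
# is unrolled forever_count times, which is an assumption of this sketch.


def expand_repetitions(spec, forever_count=1):
    text, pos = _expand(spec, 0, forever_count)
    if pos != len(spec):
        raise ValueError("unbalanced '}' in " + spec)
    return text


def _expand(spec, pos, forever_count):
    """Expand until the end of the string or an unmatched '}' is reached."""
    parts = []
    while pos < len(spec) and spec[pos] != "}":
        if spec[pos] == "*":
            brace = spec.index("{", pos)  # raises ValueError if '{' is missing
            rep = spec[pos + 1:brace]
            count = forever_count if rep == "n" else int(rep)
            body, pos = _expand(spec, brace + 1, forever_count)
            if pos >= len(spec) or spec[pos] != "}":
                raise ValueError("missing '}' in " + spec)
            pos += 1  # skip the closing bracket
            parts.append(body * count)
        else:
            parts.append(spec[pos])
            pos += 1
    return "".join(parts), pos


print(expand_repetitions("A0*n{P0}", forever_count=3))
```

For the default SequenceFormatString A0*n{P0}, unrolling the infinite repetition three times yields A0P0P0P0.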
A frame description (FDES) specifies all parameters needed by the encoder for encoding a given frame of a video sequence and updating the decoded picture buffer (DPB). A frame description (FDES) consists of a frame coding type (TYPE) specifying the slice coding type and whether the frame is marked as used for reference, a frame index offset (OFFSET) specifying the difference between the frame index of the frame to encode and the frame index of the first frame (in input order) of the current frame sequence part (FSP) inside the input video sequence, and optionally up to two lists of reference picture reordering commands and/or a list of memory management control operation commands:
FDES: (TYPE)(OFFSET)[Layer][RPLR-List0[RPLR-List1]][MMCO-List]
The frame coding type (TYPE) specifies the slice coding type for the frame, and it further determines whether the frame is marked as used for reference after being stored in the decoded picture buffer. The following values are possible:
TYPE: A|I|P|B|i|p|b
A: The first picture of the frame is coded as IDR picture (the second picture is coded as P picture and marked as used for reference).
I: The picture(s) is/are coded as I picture(s) and marked as used for reference.
P: The picture(s) is/are coded as P picture(s) and marked as used for reference.
B: The picture(s) is/are coded as B picture(s) and marked as used for reference.
i: The picture(s) is/are coded as I picture(s) and marked as unused for reference.
p: The picture(s) is/are coded as P picture(s) and marked as unused for reference.
b: The picture(s) is/are coded as B picture(s) and marked as unused for reference.
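The leading part of a frame description, (TYPE)(OFFSET)[Layer], can be decoded with a short sketch. It is not the encoder's parser: the function name parse_frame_description_head is hypothetical, any reference picture list reordering or MMCO commands are returned unparsed, and a missing layer is reported as 0, as stated for the Layer sub-string in this section.

```python
# Sketch: decode the (TYPE)(OFFSET)[Layer] head of a frame description.
# Hypothetical helper; RPLR and MMCO lists are returned unparsed in "rest".
import re

# TYPE character -> (slice coding type, marked as used for reference)
FRAME_TYPES = {
    "A": ("IDR", True),
    "I": ("I", True),
    "P": ("P", True),
    "B": ("B", True),
    "i": ("I", False),
    "p": ("P", False),
    "b": ("B", False),
}

_FDES_HEAD = re.compile(r"([AIPBipb])(\d+)(?:L(\d+))?")


def parse_frame_description_head(fdes):
    match = _FDES_HEAD.match(fdes)
    if match is None:
        raise ValueError("not a frame description: " + fdes)
    type_char, offset, layer = match.groups()
    slice_type, used_for_reference = FRAME_TYPES[type_char]
    return {
        "slice_type": slice_type,
        "used_for_reference": used_for_reference,
        "offset": int(offset),
        "layer": int(layer) if layer is not None else 0,  # absent: layer 0
        "rest": fdes[match.end():],
    }


print(parse_frame_description_head("b3L2"))
```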
The frame index offset (OFFSET) is a decimal number, which specifies the frame index of the given frame inside the given input video sequence: OFFSET: (decimal number)
This frame index is not given as absolute number, but as difference between the frame index of the current frame and the frame index of the first frame (in input order) of the current frame sequence part (FSP). If a frame sequence part consists of N frames, each frame index offset (OFFSET) must greater or equal to 0 and less than N, and all frame index offsets inside the frame sequence part must be different. The layer specification starts with the character L followed by a decimal number. Layer: L(LayerNumber)
The integer LayerNumber specifies the assignment of a picture to a specific layer. LayerNumber shall be in the range of 0 to 5, inclusive. Pictures are grouped into layers, for assigning them a specific quantization parameter is described in section 2.2.1.1. When the sub-string Layer is not present, the LayerNumber is inferred to be equal to 0. The lists of reference picture list reordering commands start with the character R followed by an optional ordered list of re-ordering commands. RPLR-List0: RPLR-List1: R[DIFF|+DIFF|L(LTI)][DIFF|+DIFF|L(LTI)] R[DIFF|+DIFF|L(LTI)][DIFF|+DIFF|L(LTI)]
The reference list re-ordering commands are specified in the following way.
DIFF: specifies the reference picture list reordering command reordering_of_pic_nums_idc = 0, the associated syntax element abs_diff_pic_num_minus1 is set equal to the given decimal number DIFF.
+DIFF: specifies the reference picture list reordering command reordering_of_pic_nums_idc = 1, the associated syntax element abs_diff_pic_num_minus1 is set equal to the given decimal number DIFF.
L(LTI): specifies the reference picture list reordering command reordering_of_pic_nums_idc = 2, the associated syntax element long_term_pic_num is set equal to the given decimal number LTI. [Currently not supported by the encoder]
The reference picture list reordering command reordering_of_pic_nums_idc = 3 (end of list) is not specified, but is automatically appended if the specified list of reordering commands is not empty. For B frames, two lists of reordering commands can be specified; the first list is related to reference picture list 0 and the second list is related to reference picture list 1. If reference picture list reordering commands should only be specified for reference picture list 1, this can be done by specifying an empty list for reference picture list 0: RR[DIFF|+DIFF|L(LTI)][DIFF|+DIFF|L(LTI)].
The list of memory management control operation (MMCO) commands starts with the character M followed by an optional ordered list of MMCO commands.
MMCO-List: M[N(DIFF)|L(LTI)|L(LTI):(DIFF)|L(NUM)$|E|L(LTI)+][]
The MMCO commands are specified in the following way.
N(DIFF): specifies the memory management control operation command memory_management_control_operation = 1 (mark a short-term picture as unused for reference), the associated syntax element difference_of_pic_nums_minus1 is set equal to the given decimal number DIFF.
L(LTI): specifies the memory management control operation command memory_management_control_operation = 2 (mark a long-term picture as unused for reference), the associated syntax element long_term_pic_num is set equal to the given decimal number LTI. [Currently not supported by the encoder]
L(LTI):(DIFF): specifies the memory management control operation command memory_management_control_operation = 3 (assign a long-term index to a short-term picture), the associated syntax elements long_term_frame_idx and difference_of_pic_nums_minus1 are set equal to the given decimal numbers LTI and DIFF, respectively. [Currently not supported by the encoder]
L(NUM)$: specifies the memory management control operation command memory_management_control_operation = 4 (specify maximum long-term frame index), the associated syntax element max_long_term_frame_idx_plus1 is set equal to the given decimal number NUM. [Currently not supported by the encoder]
E: specifies the memory management control operation command memory_management_control_operation = 5 (mark all reference pictures as unused for reference and set the MaxLongTermFrameIdx variable to no long-term frame indices). [Currently not supported by the encoder]
L(LTI)+: For non-IDR pictures, it specifies the MMCO command memory_management_control_operation = 6 (assign a long-term frame index to the current decoded picture), the associated syntax element long_term_frame_idx is set equal to the given decimal number LTI. [Currently not supported by the encoder] For IDR pictures, it sets the flag long_term_reference_flag equal to 1 and the given decimal number LTI is ignored. [Currently not supported by the encoder]
2.2.1.2.2.1 Examples without MMCO and reordering commands
Example 1:
Format string: A0*n{P0}
Coding Types: IDR P P P P P P
Stored as reference: 1 1 1 1 1 1 1
Coding Order: 0 1 2 3 4 5 6
Example 2:
Format string: A0*n{P2b0b1}
Coding Types: IDR B B P B B P
Stored as reference: 1 0 0 1 0 0 1
Coding Order: 0 2 3 1 5 6 4
Example 3:
Format string: A0*2{P1p0}*n{I0*2{P1p0}}
Coding Types: IDR P P P P I P P P P I P P P P
Stored as reference: 1 0 1 0 1 1 0 1 0 1 1 0 1 0 1
Coding Order: 0 2 1 4 3 5 7 6 9 8 10 12 11 14 13
Example 4:
Format string:
Coding Types:
Stored as reference:
Coding Order:
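The relation between a format string and the rows shown in these examples can be illustrated with a short Python sketch. It is not part of the JSVM software: the function name expand_sequence and the list-of-tuples representation are assumptions made for illustration, and the repetition syntax (*n{...}) is assumed to have been unrolled by hand. The sketch only applies the rules stated above: OFFSET is relative to the first frame of the frame sequence part, offsets must be distinct and lie in [0, N), and upper-case types (including A) are stored as reference.

```python
from typing import List, Tuple

Frame = Tuple[str, int]  # (TYPE, OFFSET), e.g. ("P", 2)


def expand_sequence(parts: List[List[Frame]]) -> List[Tuple[str, int, int]]:
    """Return (coding type, stored-as-reference flag, coding order) per frame,
    listed in input order. `parts` holds the frame sequence parts (FSPs) in
    coding order, with repetitions already unrolled by the caller."""
    rows = {}
    base = 0          # input-order index of the first frame of the current FSP
    coding_order = 0  # position of the next frame in coding order
    for fsp in parts:
        n = len(fsp)
        offsets = sorted(offset for _, offset in fsp)
        # Offsets must be distinct and in the range [0, N).
        if offsets != list(range(n)):
            raise ValueError(f"invalid frame index offsets in {fsp}")
        for frame_type, offset in fsp:
            label = "IDR" if frame_type == "A" else frame_type.upper()
            stored = 1 if frame_type.isupper() else 0
            rows[base + offset] = (label, stored, coding_order)
            coding_order += 1
        base += n
    return [rows[i] for i in range(base)]


# Example 2 (A0*n{P2b0b1}) with the repeated part unrolled twice.
example2 = [[("A", 0)]] + 2 * [[("P", 2), ("b", 0), ("b", 1)]]
for label, stored, order in expand_sequence(example2):
    print(label, stored, order)
```

For the unrolled Example 2, the printed rows correspond to the Coding Types, Stored as reference, and Coding Order rows listed for that example.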
2.2.1.2.2.2 Example with MMCO commands
Example 5:
Reference frames:
Format string:
Coding Types:
Stored as reference:
Coding Order:
Table columns: Current Frame Index (coding order) | Ordered set of long-term reference pictures (specified by the frame indices) after decoding and storing the current frame
Current Frame Index (coding order): 0 (0) 1 (1) 2 (2) 3 (3) 4 (4) 5 (5) 6 (6) 7 (7) 8 (8) 9 (9) 10 (10)
Ordered set of long-term reference pictures: 9 8 7 6 10
2.2.1.2.2.3 Example with Reordering Commands
Example 6:
Reference frames:
Format string:
Coding Types:
Stored as reference:
Coding Order:
Current Frame Index (coding order): 0 (0) 4 (1) 1 (2) 3 (3) 2 (4) 5 (5) 9 (6) 6 (7) 8 (8) 7 (9)
5 5 9 6 5 9 8 6 9 5 (6 5 8 9) 9 5 9 6 5 6 8 5 9 (8 9 6 5)
0 0 4 1 0 4 3 1 4 0 (1 0 3 4) 4 0 4 1 0 1 3 0 4 (3 4 1 0)
-frms (frames)
The parameter frames specifies the number of frames of the input sequence to be encoded.
-pf (config)
The parameter config specifies the name of the config file to be used.
-h
Prints out a brief help on using the encoder.
coding mode, one or more layer configuration files have to be specified inside the main configuration files. The parameters of the main configuration file are described in section 2.2.2.1, the parameters of the layer configuration files are described in section 2.2.2.2. Additional command line parameters are described in section 2.2.2.3.
2:implicit)
#============================== LOSS-AWARE RDO =================================
LARDO                  0      # Loss-aware RDO (0:disable, 1:enable)
#============================== OTHER PARAMETERS ============================
MultiLayerLambdaSel    0      # 0: disable, 1:enable, 2:enable with factor 0.8
PreAndSuffixUnitEnable 1      # Add prefix unit (0: off, 1: on)
                              # shall always be on
                              # in SVC contexts (i.e. when there are
                              # MGS/CGS/spatial enhancement layers)
NestingSEI             1      # Nesting SEI message (1:enable, 0:disable)
SceneInfo              1      # scene info SEI message (1:enable, 0:disable)
TLNestingFlag          0      # Sets the temporal level nesting flag
                              # (0: off, default 1: on)
IntegrityCheckSEI      0      # Integrity check SEI message in bitstream
                              # (0: off, default 1: on)
TL0DepRepIdxSeiEnable  0      # TL0 SEI message (1:enable, 0:disable)
RPEncCheck             1      # Enable the checking mechanism
                              # (default 0: off, 1: on)
MVDiffThreshold        20     # Motion vector difference threshold (default 20)
#=============================== HRD =====================================
EnableVclHRD           0      # Type I HRD (default 0:Off, 1:on)
EnableNalHRD           0      # Type II HRD (default 0:Off, 1:on)
#=========================== RATE CONTROL =======================
RateControlEnable      0      # Enable base-layer rate control (0=off, 1=on)
InitialQP              30     # Initial QP
RCMinQP                12     # Minimum QP value during rate control
RCMaxQP                40     # Maximum QP value during rate control
MaxQPChange            2      # Maximum QP change among subsequent
                              # highest-priority frames
AdaptInitialQP         0      # Adapt the initial QP based on sequence dimensions
                              # and rate
                              # (0=off, 1=on)
BitRate                64000  # Target bit rate in bits per second
BasicUnit              99     # Number of MBs that constitute a rate control
                              # basic unit
OutputFile
String, default: test.264 Specifies the filename for the bit-stream to be encoded.
FrameRate
Double, default: 60.0 Specifies the maximum frame rate of all input sequences or a multiple thereof. The parameter FramesToBeEncoded specifies the number of frames that should be encoded at the input frame rate FrameRate. The parameter FrameRate is additionally required for determining the actual coding structure of a layer. The basic coding structure (GOP size) for a layer is determined by the parameters FrameRate and GOPSize of the main configuration files and the parameters FrameRateIn and FrameRateOut of the layer configuration files (cp. section 2.2.2.2). See also GOPSize.
MaxDelay
Double, default: 1200.0 Specifies the maximum allowed structural delay in milliseconds (ms). In order to not exceed this maximum delay the coding structure might be adapted by restricting the motion-compensated prediction to reference pictures from the past for several or all pictures. In general the coding efficiency is decreased by forcing a maximum structural delay.
FramesToBeEncoded
Unsigned Int, default: 1
Specifies the number of frames of the input sequence to be encoded. The number of frames to be encoded is specified at the frame rate given by FrameRate. Thus, depending on the value of FrameRateOut in the layer configuration file(s), the actual number of encoded frames might be smaller.
NonRequiredEnable
Flag (0 or 1), default: 0 Specifies whether non-required picture SEI messages are included in the generated bit-stream. If NonRequiredEnable is equal to 1, non-required picture SEI messages are included in the bit-stream; otherwise, non-required picture SEI messages are not included in the bit-stream.
CgsSnrRefinement
Flag (0 or 1), default: 0 Specifies whether SNR enhancements are coded using MGS (flag set to a value of 1) or coded using CGS (flag set to a value of 0). [Ed.: Attention, name suggests inverse meaning!]
EncodeKeyPictures
Unsigned Int, default: 0 Specifies whether the pictures at temporal level 0 are coded as key pictures (use_base_prediction_flag and store_base_rep_flag equal to 1). The following parameters are supported:
0 no pictures are coded as key pictures,
1 pictures with MGS (Q>0) refinement are coded as key pictures,
2 all pictures of temporal level 0 are encoded as key pictures.
There is basically no difference in coding efficiency between the values 1 and 2 for single-layer, CGS, and spatial configurations; only the high level syntax is different, and with the value of 2, a larger decoded picture buffer is required in single-layer, spatial, or CGS coding. For MGS configurations (CgsSnrRefinement = 1), a value of 1 improves the error robustness as well as the supported degree of bit-stream adaptation; but it usually decreases the coding efficiency, especially in connection with small GOP sizes.
MGSControl
Unsigned Int, default: 0 Specifies which pictures are used as references for motion estimation and motion compensation (determination of the residual signal to be coded) for MGS coding. The following parameters are supported:
0 pictures of the current MGS layer are used,
1 pictures of the highest EL are used for motion estimation, pictures of the current layer are used for motion compensation,
2 pictures of the highest EL are used for motion estimation and motion compensation.
The motion estimation / motion compensation process for pictures of temporal level 0 is not modified. A value of 0 corresponds to CGS coding.
GOPSize
Unsigned Int, default: 1 Specifies the GOP size that shall be used for encoding a video sequence. A GOP (group of pictures) consists of a key picture, which is generally coded as P picture, and several hierarchically coded B pictures that are located between the key pictures. The parameter GOPSize must be equal to a power of 2. The GOP size is specified at the frame rate given by FrameRate. Thus, depending on the value of FrameRateOut in the layer configuration file, the actual GOP size for a layer might be smaller. For example, if FrameRate is equal to 30, GOPSize is equal to 16, and FrameRateOut is equal to 7.5, the actual GOP size that is employed for encoding the specific layer is equal to 4. The GOP size of all layers is selected in a way that the key pictures for all layers are temporally aligned. Hence, depending on the parameters FrameRateOut in the layer configuration files, the allowed range for GOPSize might be restricted. The maximum allowed value is 64.
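The arithmetic behind the example above can be sketched in a few lines of Python. This is not taken from the JSVM source: the function name actual_gop_size is hypothetical, and the scaling rule (GOPSize divided by the ratio FrameRate / FrameRateOut) is an assumption inferred from the single example given in the text (30, 16 and 7.5 yield 4). The lower bound of 1 is likewise an assumption.

```python
def actual_gop_size(frame_rate: float, gop_size: int, frame_rate_out: float) -> int:
    """Estimate the GOP size used for a layer (illustrative sketch only)."""
    # GOPSize must be a power of 2 and must not exceed 64.
    if gop_size < 1 or gop_size > 64 or gop_size & (gop_size - 1):
        raise ValueError("GOPSize must be a power of 2 in the range 1..64")
    ratio = frame_rate / frame_rate_out
    if ratio < 1 or ratio != int(ratio):
        raise ValueError("FrameRate must be a multiple of FrameRateOut")
    # Assumption: the layer GOP size never drops below one picture.
    return max(1, gop_size // int(ratio))


print(actual_gop_size(30.0, 16, 7.5))
```

With the values from the text (FrameRate 30, GOPSize 16, FrameRateOut 7.5) the sketch returns 4.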
IntraPeriod
Unsigned Int, default: 2^32-1 (equal to -1) Specifies the intra period for the encoded video sequence. When IntraPeriod is equal to -1 (2^32-1), only the very first picture is intra coded. Otherwise, every IntraPeriod-th picture (at the frame rate FrameRate) is intra-coded. The parameter IntraPeriod shall be equal to -1 or equal to a multiple of GOPSize.
NumberReferenceFrames
Unsigned Int, default: 1
Specifies the maximum number of active entries for the reference picture lists 0 and 1. The actual number of active entries that are used for encoding a specific frame additionally depends on the location of the frame inside the group of pictures.
BaseLayerMode
Unsigned Int, default: 0 Specifies whether sub-sequence SEI messages are included for the base layer. The following values are supported:
0 AVC compatible base layer without sub-sequence SEI messages,
1 AVC compatible base layer without sub-sequence SEI messages (same as for the value of 0),
2 AVC compatible base layer with sub-sequence SEI messages for supporting temporal scalability without prefix NAL units.
ConstrainedIntraUps
Unsigned Int, default: 0 Specifies whether slice boundaries in the base layer picture are treated similarly to picture boundaries for the Intra_Base upsampling process. A value of 1 specifies that slice boundaries are treated as picture boundaries for the Intra_Base upsampling process. A value of 0 specifies that slice boundaries are not treated as picture boundaries for the Intra_Base upsampling process. When ConstrainedIntraUps is equal to 1, the parameter InterLayerLoopFilterDisable must be set equal to 1, 2, or 5.
LoopFilterDisable
Unsigned Int, default: 0 Specifies how the in-loop deblocking filter is applied. The following values are supported:
0 The deblocking filter is applied to all block edges.
1 The deblocking filter is not applied.
2 The deblocking filter is applied to all block edges with the exception of slice boundaries.
3 Two-stage deblocking process (slice boundaries are filtered in the second stage).
4 The deblocking filter is applied to all luma block edges, chroma is not filtered.
5 The deblocking filter is applied to all luma block edges with the exception of slice boundaries, chroma is not filtered.
6 Two-stage deblocking process for luma (slice boundaries are filtered in the second stage), chroma is not filtered.
LoopFilterAlphaC0Offset
Int, default: 0 Specifies the alpha C0 offset for the loop filter.
LoopFilterBetaOffset
Int, default: 0 Specifies the beta offset for the loop filter.
InterLayerLoopFilterDisable
Unsigned Int, default: 0 Specifies how the inter-layer deblocking filter is applied. The following values are supported:
0 The deblocking filter is applied to all block edges.
1 The deblocking filter is not applied.
2 The deblocking filter is applied to all block edges with the exception of slice boundaries.
3 Two-stage deblocking process (slice boundaries are filtered in the second stage).
4 The deblocking filter is applied to all luma block edges, chroma is not filtered.
5 The deblocking filter is applied to all luma block edges with the exception of slice boundaries, chroma is not filtered.
6 Two-stage deblocking process for luma (slice boundaries are filtered in the second stage), chroma is not filtered.
InterLayerLoopFilterAlphaC0Offset
Int, default: 0 Specifies the alpha C0 offset for the inter-layer deblocking filter.
InterLayerLoopFilterBetaOffset
Int, default: 0 Specifies the beta offset for the inter-layer deblocking filter.
NumLayers
Unsigned Int, default: 1 Specifies the number of layers. A layer represents a spatial layer or a coarse-grain SNR scalable layer (CGS). Note that the number of temporal layers is specified by the parameter GOPSize, and that medium-grain SNR scalable layers (MGS) are specified by the parameter MgsVectorMode in the layer configuration files (a layer can contain several MGS layers). The parameter NumLayers shall be in the range of 1 to 8, inclusive. For each layer, a layer configuration file shall be specified by using the parameter LayerCfg.
LayerCfg
String, no default value Specifies the filename of a layer configuration file. The main configuration file shall contain exactly NumLayers occurrences of the parameter LayerCfg. Each of these specifies the layer configuration file for a specific layer. The first occurrence of the parameter LayerCfg specifies the layer configuration file for the base layer (LayerId 0), the next occurrence specifies the layer configuration file for the next layer (LayerId 1), etc.
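The NumLayers / LayerCfg rule can be checked before starting the encoder with a small script. The following Python sketch is not part of the JSVM package; the function names and the file names layer0.cfg and layer1.cfg are hypothetical, and the line syntax (parameter name, whitespace, value, optional comment introduced by #) is an assumption based on the configuration listings shown in this manual.

```python
from typing import List, Tuple


def parse_config(text: str) -> List[Tuple[str, str]]:
    """Split a configuration text into (name, value) pairs, keeping the order."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split(None, 1)
        value = fields[1].strip() if len(fields) > 1 else ""
        entries.append((fields[0], value))
    return entries


def layer_config_files(text: str) -> List[str]:
    """Return the layer configuration files in LayerId order (0, 1, ...)."""
    entries = parse_config(text)
    num_layers = 1  # default value of NumLayers
    for name, value in entries:
        if name == "NumLayers":
            num_layers = int(value)
    if not 1 <= num_layers <= 8:
        raise ValueError("NumLayers shall be in the range of 1 to 8")
    files = [value for name, value in entries if name == "LayerCfg"]
    if len(files) != num_layers:
        raise ValueError(f"expected {num_layers} LayerCfg entries, found {len(files)}")
    return files


sample = """
NumLayers 2            # number of layers
LayerCfg  layer0.cfg   # base layer (LayerId 0)
LayerCfg  layer1.cfg   # enhancement layer (LayerId 1)
"""
print(layer_config_files(sample))
```

The position of each returned filename in the list corresponds to the LayerId it configures.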
LARDO
Flag (0 or 1), default: 0 Specifies whether loss-aware rate-distortion optimized macroblock mode decision is used. This parameter is evaluated when the parameter ClosedLoop in the layer configuration files is not equal to 0.
SearchMode, SearchFuncFullPel, SearchFuncSubPel, SearchRange, FastBiSearch, BiPredIter, IterSearchRange, LoopFilterDisable, LoopFilterAlphaC0Offset, LoopFilterBetaOffset, WeightedPrediction, WeightedBiPrediction
These parameters are described in section 2.2.1.1.
ELSearchRange
Unsigned Int, default: 0 Specifies the enhancement layer search range in full luma samples. When ELSearchRange is equal to 0, the enhancement layer search range is given by the parameter SearchRange. When ELSearchRange is greater than 0, the motion search in the enhancement layer is done around the scaled base layer motion vector using the search range specified by ELSearchRange.
MultiLayerLambdaSel
Unsigned Int, default: 0 Specifies whether the multi-layer lambda selection algorithm is enabled. The following values are supported:
0 multi-layer lambda selection is disabled.
1 multi-layer lambda selection based on the current single layer lambda selection.
2 lambda corresponds to 0.8 times the lambda value of option 1.
PreAndSuffixUnitEnable
Bool, default: 1 Specifies whether to add prefix NAL units before the NAL units of AVC slices. When this parameter is equal to 0, no prefix NAL units are added to the bitstream. When this parameter is equal to 1, prefix NAL units are added before the AVC NAL units. This parameter shall always be on in SVC contexts (i.e. when there are MGS/CGS/spatial enhancement layers).
NestingSEI
Bool, default: 0 Specifies whether scalable nesting SEI messages are included in the bitstream. When this parameter is equal to 0, no scalable nesting SEI message is included. When this parameter is equal to 1, scalable nesting SEI messages containing AVC SEI messages are included in the bitstream.
SceneInfo
Bool, default: 0 Specifies whether scene information SEI messages are included in the bitstream. When this parameter is equal to 0, no scene information SEI message is included. When this parameter is equal to 1, NestingSEI should be equal to 1, and scalable nesting SEI messages containing AVC scene information SEI messages are included in the bitstream.
TLNestingFlag
Bool, default: 0 Specifies whether the temporal prediction structure of the sequence is nested. When this parameter is equal to 0, the sequence may not have a nested structure. When this parameter is equal to 1, the sequence must be nested.
IntegrityCheckSEI
Bool, default: 0 Specifies whether integrity check SEI message is included in the bitstream. When this parameter is equal to 0, no integrity check SEI message is included. When this parameter is equal to 1, integrity check SEI messages are included in the bitstream.
TL0DepRepIdxSeiEnable
Bool, default: 0 Specifies whether temporal level 0 index information SEI message is included in the bitstream. When this parameter is equal to 0, no TL0 SEI message is included. When this parameter is equal to 1, TL0 SEI messages are included in the bitstream.
EnableVclHRD
Unsigned Int, default: 0 Specifies whether the Type I HRD is enabled. When this parameter is equal to 0, Type I HRD information is not generated. When this parameter is equal to 1, Type I HRD information is generated.
EnableNalHRD
Unsigned Int, default: 0 Specifies whether the Type II HRD is enabled. When this parameter is equal to 0, Type II HRD information is not generated. When this parameter is equal to 1, Type II HRD information is generated.
RPEncCheck
Unsigned Int, default: 0 Specifies whether the encoder activates the mechanism for avoiding potential visual artifacts due to residual prediction under ESS. When this parameter is equal to 1, the mechanism is activated.
MVDiffThreshold
Unsigned Int, default: 20 Specifies a threshold value used in the mechanism above for avoiding potential visual artifacts due to residual prediction under ESS. This value is effective only when the parameter RPEncCheck is not equal to 0. In ESS, if an enhancement layer block is covered by multiple motion vectors from the base layer, the mechanism above is effective for the block only if the difference of these base layer motion vectors is above the threshold value.
RateControlEnable
Unsigned Int, default: 0 Enable rate control for the base layer: 0-disabled, 1-enabled.
InitialQP
Unsigned Int, default: 30 Initial QP for the base layer: ranges from 1 to 51.
RCMinQP
Unsigned Int, default: 12 Minimum QP value used during rate control of the base layer: ranges from 1 to 51.
RCMaxQP
Unsigned Int, default: 40 Maximum QP value used during rate control of the base layer: ranges from 1 to 51.
MaxQPChange
Unsigned Int, default: 2 Maximum QP change allowed among subsequent frames that belong to the highest-priority temporal level during rate control of the base layer. This parameter ranges from 2 to 50.
AdaptInitialQP
Unsigned Int, default: 0 Adapt the initial QP for the base layer based on the input bit rate and the sequence dimensions.
BitRate
Unsigned Int, default: 64000 Target bit rate for the base layer in bits per second.
BasicUnit
Unsigned Int, default: 99 Number of MBs that constitute a rate control basic unit. This number has to be a fraction of the total number of MBs in the picture. In general, the smaller the basic unit, the more accurate the rate control is.
ScalMatIntra4x4V      # scaling matrix for intra 4x4 Cr blocks
ScalMatInter4x4Y      # scaling matrix for inter 4x4 luma blocks
ScalMatInter4x4U      # scaling matrix for inter 4x4 Cb blocks
ScalMatInter4x4V      # scaling matrix for inter 4x4 Cr blocks
ScalMatIntra8x8Y      # scaling matrix for intra 8x8 luma blocks
ScalMatInter8x8Y      # scaling matrix for inter 8x8 luma blocks
IPCMRate              # forced percentage of IPCM macroblocks
BiPredLT8x8Disable    # disabled bi-predicted blocks smaller than 8x8
MCBlocksLT8x8Disable  # blocks smaller than 8x8 are disabled
DisableBSlices        # disables B slice coding
MaxDeltaQP            # Max. absolute delta QP
QP                    # Quantization parameters
BaseLayerId           # Layer ID for the base layer
ForceReOrdering       # Force RPLR commands (0:off, 1:on)
EncSIPFile            # SIP decision file
#====================== CONTROL ================================================
MeQPLP          30.00  # QP for mot. est. / mode decision (key pics)
MeQP0           30.00  # QP for mot. est. / mode decision (stage 0)
MeQP1           30.00  # QP for mot. est. / mode decision (stage 1)
MeQP2           30.00  # QP for mot. est. / mode decision (stage 2)
MeQP3           30.00  # QP for mot. est. / mode decision (stage 3)
MeQP4           30.00  # QP for mot. est. / mode decision (stage 4)
MeQP5           30.00  # QP for mot. est. / mode decision (stage 5)
InterLayerPred  0      # Inter-layer Pred. (0:no, 1:yes, 2:adap.)
ILModePred      0      # Inter-layer mode pred. (0:no, 1:yes, 2:adap.)
ILMotionPred    0      # Inter-layer motion pred. (0:no, 1:yes, 2:adap.)
ILResidualPred  0      # Inter-layer residual pred. (0:no, 1:yes, 2:adap.)
BaseQuality     15     # Base quality level [0 .. 15]
SliceSkip       1
#====================== EXTENDED SPATIAL SCALABILITY ===========================
UseESS               2         # ESS mode
ESSPicParamFile      crop.txt  # picture level cropping parameters
                               # (ignored when UseESS != 2)
ESSCropWidth         640       # base layer upsampled frame width
ESSCropHeight        560       # base layer upsampled frame height
ESSOriginX           0         # base layer upsampled frame x-pos
ESSOriginY           0         # base layer upsampled frame y-pos
ESSChromaPhaseX      -1        # current layer chroma phase shift x
ESSChromaPhaseY      0         # current layer chroma phase shift y
ESSBaseChromaPhaseX  -1        # base layer chroma phase shift x
ESSBaseChromaPhaseY  0         # base layer chroma phase shift y
#=========================== MGS ===============================================
MGSVectorMode        0         # MGS vector usage selection
MGSVector0           0         # Specifies 0th position of the vector
MGSVector1           0         # Specifies 1st position of the vector
MGSVector2           0         # Specifies 2nd position of the vector
MGSVector3           0         # Specifies 3rd position of the vector
MGSVector4           0         # Specifies 4th position of the vector
MGSVector5           0         # Specifies 5th position of the vector
MGSVector6           0         # Specifies 6th position of the vector
MGSVector7           0         # Specifies 7th position of the vector
MGSVector8           0         # Specifies 8th position of the vector
MGSVector9           0         # Specifies 9th position of the vector
MGSVector10          0         # Specifies 10th position of the vector
MGSVector11          0         # Specifies 11th position of the vector
MGSVector12          0         # Specifies 12th position of the vector
MGSVector13          0         # Specifies 13th position of the vector
MGSVector14          0         # Specifies 14th position of the vector
MGSVector15          0         # Specifies 15th position of the vector
#====================== QP Cascading =========================================
ExplicitQPCascading  1         # QP Cascading (0:auto, 1:explicit)
DQP4TLevel0          -2        # Delta QP for temporal level 0
DQP4TLevel1          1         # Delta QP for temporal level 1
2 # QP 2
3 # QP 3
4 # QP 4
5 # QP 5
6 # QP 6
#====================== FMO =========================================
NumSlicGrpMns1          0            # Number of Slice Groups Minus 1,
                                     # 0 == no FMO,
                                     # 1 == two slice groups, etc.
SlcGrpMapType           1            # 0: Interleave, 1: Dispersed,
                                     # 2: Foreground with left-over,
                                     # 3: Box-out, 4: Raster Scan, 5: Wipe,
                                     # 6: Explicit, slice_group_id read from
                                     # SliceGroupConfigFileName
SlcGrpChgDrFlag         1            # slice_group_change_direction_flag,
                                     # 0: box-out clockwise, raster scan or
                                     # wipe right
                                     # 1: box-out counter clockwise, reverse raster
                                     # scan or wipe left
SlcGrpChgRtMns1         1            # slice_group_change_rate_minus1
SlcGrpCfgFileNm         sg.cfg       # SliceGroupConfigFileName,
                                     # Used for slice_group_map_type 0, 2, 6
UseRedundantSlc         1            # UseRedundantSlice,
                                     # 0: not used,
                                     # 1: one redundant slice used for each slice
PLR                     3            # (Packet Loss Rate)
NumROI                  1            # ????????
ROICfgFileNm            roiconf.cfg  # ????????
#=========================== SLICES =======================
SliceMode               0            # (0=off 1=fixed #mb in slice
                                     # 2=fixed #bytes in slice)
SliceArgument           50           # (Arguments to modes 1 and 2 above)
#==================== SVC TO AVC REWRITE =======================================
AvcRewriteFlag          0            # Enable SVC to AVC rewrite (0: no, 1: yes)
AvcAdaptiveRewriteFlag  0            # AVC adaptive rewrite flag (0: no, 1: yes)
SourceWidth
Unsigned Int, default: 352 Specifies the width of the input images in luma samples. SourceWidth shall be non-zero and a multiple of 16.
SourceHeight
Unsigned Int, default: 288 Specifies the height of the input images in luma samples. SourceHeight shall be non-zero and a multiple of 16.
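The constraints on SourceWidth and SourceHeight follow from the 16x16 luma sample macroblock size. As a minimal illustration (the function name macroblock_count is hypothetical and not part of the JSVM software), the following Python sketch validates the dimensions and derives the number of macroblocks per picture:

```python
def macroblock_count(source_width: int, source_height: int) -> int:
    """Validate the picture dimensions and return the macroblocks per picture."""
    for name, value in (("SourceWidth", source_width), ("SourceHeight", source_height)):
        # Both dimensions shall be non-zero multiples of 16.
        if value <= 0 or value % 16 != 0:
            raise ValueError(f"{name} shall be non-zero and a multiple of 16")
    return (source_width // 16) * (source_height // 16)


print(macroblock_count(352, 288))
```

For the default dimensions of 352x288 luma samples this yields 22 x 18 = 396 macroblocks.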
FrameRateIn
Double, default: 30 Specifies the frame rate of the input sequence. The parameter FrameRate in the main configuration file shall be a multiple of FrameRateIn, and FrameRateIn shall be a multiple of FrameRateOut.
FrameRateOut
Double, default: 30 Specifies the output frame rate for the current layer. The parameter FrameRate in the main configuration file shall be a multiple of FrameRateIn, and FrameRateIn shall be a multiple of FrameRateOut. When FrameRateOut is not equal to FrameRateIn, only every (FrameRateIn / FrameRateOut)-th frame of the input sequence is encoded and the remaining frames are skipped. The actual GOP size that is used for encoding a layer is determined by the parameters FrameRate, GOPSize, and FrameRateOut. The actual number of frames that are coded for a layer is determined by the parameters FrameRate and FrameRateOut.
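The frame rate constraints can be illustrated with a short Python sketch. It is not derived from the JSVM source code: the function names are hypothetical, and the sketch assumes that one frame out of every (FrameRateIn / FrameRateOut) input frames is coded, starting with the first frame of the input sequence.

```python
from typing import List


def subsampling_factor(frame_rate: float, frame_rate_in: float, frame_rate_out: float) -> int:
    """Check the frame rate constraints and return FrameRateIn / FrameRateOut."""
    def is_multiple(a: float, b: float) -> bool:
        quotient = a / b
        return quotient >= 1 and quotient == int(quotient)

    if not is_multiple(frame_rate, frame_rate_in):
        raise ValueError("FrameRate shall be a multiple of FrameRateIn")
    if not is_multiple(frame_rate_in, frame_rate_out):
        raise ValueError("FrameRateIn shall be a multiple of FrameRateOut")
    return int(frame_rate_in / frame_rate_out)


def coded_frame_indices(num_input_frames: int, factor: int) -> List[int]:
    """Input frame indices that are coded (assumption: frame 0 is always kept)."""
    return list(range(0, num_input_frames, factor))


factor = subsampling_factor(60.0, 30.0, 15.0)
print(factor, coded_frame_indices(8, factor))
```

With FrameRate 60, FrameRateIn 30 and FrameRateOut 15, the factor is 2, so under the stated assumption every second frame of the layer's input sequence is coded.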
InputFile
String, default: test.yuv Specifies the filename of the original raw video sequence for the layer.
ReconFile
String, default: rec.yuv Specifies the filename of the coded and reconstructed input sequence for the layer. This sequence is provided for debugging purposes.
SymbolMode
Flag (0 or 1), default: 1 Specifies the entropy coding mode. When SymbolMode is equal to 0, the video sequence is encoded using variable length codes (VLC). When SymbolMode is equal to 1, the video sequence is encoded using context-adaptive binary arithmetic coding (CABAC). CABAC usually provides an increased coding efficiency.
IDRPeriod
Int, default: 0 Specifies the period of IDR pictures inside the layer. The set of possible values is given by (GOPSize*N) with N being 0, 1, ...
IntraPeriod
Int, default: 0 Specifies the period of intra pictures inside the layer (in addition to the intra period specified in the main configuration files). The set of possible values is given by (GOPSize*N) with N being 0, 1, ...
MbAff
Flag (0 or 1), default: 0 Specifies whether Macroblock Adaptive Frame Field coding mode is used. This flag or the PAff flag should be set to 1 when coding interlaced material.
PAff
Flag (0 or 1), default: 0 Specifies whether Picture Adaptive Frame Field coding mode is used. This flag or the MbAff flag should be set to 1 when coding interlaced material.
BottomFieldFirst
Flag (0 or 1), default: 0 Specifies whether the top or the bottom field is the first field in display order for interlaced sources.
LowComplexityMbMode
Flag (0 or 1), default: 0 Specifies if a Low-Complexity method is used instead of the Lagrangian Rate-Distortion Optimization for the Macroblock Mode Decision. It provides lower computational complexity, but also lower R-D efficiency. Valid for Base Layer coding only.
ProfileIdc
Unsigned Int, default: 0 Specifies the profile for the layer. The following values are supported:
0 The profile is automatically determined
66 Baseline profile (only for the first layer)
77 Main profile (only for the first layer)
88 Extended profile (only for the first layer)
100 High profile (only for the first layer)
83 Scalable Baseline profile (not for the first layer)
86 Scalable High profile (not for the first layer)
When ProfileIdc is not equal to 0, other coding parameters might be changed in order to ensure compatibility with the specified profile. When ProfileIdc is equal to 0, the profile is automatically determined. In the latter case, it must be ensured that the chosen combination of coding parameters is supported in a profile.
MinLevelIdc
Unsigned Int, default: no restriction Specifies the minimum level for the layer (level_idc * 10). It ensures that the encoder does not select a level that is less than the specified level. The encoder might still select a level that is higher than the specified level depending on other configuration parameters.
UseLongTerm
Flag (0 or 1), default: 0 Specifies whether reference pictures are coded as short-term or long-term reference pictures. When UseLongTerm is equal to 0, all reference pictures are coded as short-term reference pictures. When UseLongTerm is equal to 1, all reference pictures are coded as long-term reference pictures.
MMCOBaseEnable
Bool, default: 1 Specifies whether MMCO commands are used for the management of decoded layer representations. When this parameter is equal to 0, no MMCO commands for decoded layer representations are included in the bitstream and the decoder uses the sliding window mechanism to manage the decoded layer representations. When this parameter is equal to 1, MMCO commands for decoded layer representations are included in the bitstream.
MMCOBaseEnable
Bool, default: 1 Specifies whether MMCO commands are used for the management of base representations (of key pictures). When this parameter is equal to 0, no MMCO commands for base representations are included in the bitstream and the decoder uses the sliding window mechanism to manage the base representations. When this parameter is equal to 1, MMCO commands for base representations are included in the bitstream.
Enable8x8Transform
Flag (0 or 1), default: 0 Specifies whether the 8x8 transform (High Profile) is enabled. When Enable8x8Transform is equal to 1, the 8x8 transform is enabled in addition to the standard 4x4 transform for the luminance component; otherwise the 8x8 transform is disabled. The coding efficiency is generally increased by enabling the 8x8 transform, especially for high-resolution source material.
ScalingMatricesPresent
Flag (0 or 1), default: 0 Specifies whether (non-flat) scaling matrices are used. When ScalingMatricesPresent is equal to 0, the flat scaling matrices are used (with all entries equal to 16). When ScalingMatricesPresent is equal to 1, scaling matrices are used. The corresponding eight scaling matrices can be specified by the parameters ScalMatIntra4x4Y, ScalMatIntra4x4U, ScalMatIntra4x4V, ScalMatInter4x4Y, ScalMatInter4x4U, ScalMatInter4x4V, ScalMatIntra8x8Y, and ScalMatInter8x8Y. When a scaling matrix is not specified using these parameters, the corresponding (non-flat) default scaling matrix is used.
ScalMatIntra4x4Y
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for intra 4x4 luma blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 16 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatIntra4x4U
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for intra 4x4 Cb blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 16 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatIntra4x4V
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for intra 4x4 Cr blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 16 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatInter4x4Y
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for inter 4x4 luma blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 16 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatInter4x4U
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for inter 4x4 Cb blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 16 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatInter4x4V
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for inter 4x4 Cr blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 16 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatIntra8x8Y
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for intra 8x8 luma blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 64 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
ScalMatInter8x8Y
String, default: empty string (specifies usage of default scaling matrix) Specifies the filename of the text file containing the scaling matrix for inter 8x8 luma blocks, when ScalingMatricesPresent is equal to 1. The corresponding file shall contain the 64 scaling matrix coefficients in raster scan order separated by white spaces (spaces, tabs, or end-of-lines). When the scaling matrix is not specified using this parameter (and ScalingMatricesPresent is equal to 1), the default scaling matrix is used.
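The file format shared by the eight ScalMat* parameters (16 or 64 coefficients in raster scan order, separated by arbitrary white space) can be illustrated with a small Python sketch. The helper names and the file name used in the usage note are hypothetical and are not part of the JSVM software; the sketch only checks the coefficient count stated above and assumes integer coefficients.

```python
from pathlib import Path


def write_scaling_matrix(path, rows):
    """Write a square matrix as text, one matrix row per line (raster scan order)."""
    text = "\n".join(" ".join(str(v) for v in row) for row in rows) + "\n"
    Path(path).write_text(text)


def read_scaling_matrix(path, size):
    """Read size*size coefficients separated by spaces, tabs, or end-of-lines.

    size is 4 for the ScalMat...4x4 parameters and 8 for the ScalMat...8x8 ones.
    Integer coefficients are an assumption of this sketch.
    """
    if size not in (4, 8):
        raise ValueError("size must be 4 or 8")
    values = [int(token) for token in Path(path).read_text().split()]
    if len(values) != size * size:
        raise ValueError(
            f"expected {size * size} coefficients, found {len(values)}"
        )
    # Regroup the raster-scan sequence into rows.
    return [values[r * size:(r + 1) * size] for r in range(size)]
```

For instance, write_scaling_matrix("intra4x4y.txt", [[16] * 4 for _ in range(4)]) would produce a file describing a flat 4x4 matrix, which read_scaling_matrix("intra4x4y.txt", 4) reads back as four rows of four values.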
IPCMRate
Unsigned Int, default: 0 Specifies the percentage of macroblocks that are forced to be coded in IPCM mode. Forcing IPCM macroblocks decreases the coding efficiency. This parameter is intended only for debugging purposes, namely for testing the implementation of IPCM macroblocks in the encoder and decoder.
BiPredLT8x8Disable
Flag (0 or 1), default: 0 Specifies whether bi-prediction is disabled for blocks smaller than 8x8. When BiPredLT8x8Disable is equal to 1, bi-prediction is disabled for sub-macroblock partitions that contain blocks smaller than 8x8.
MCBlocksLT8x8Disable
Flag (0 or 1), default: 0 Specifies whether motion-compensated blocks smaller than 8x8 are disabled. When MCBlocksLT8x8Disable is equal to 1, sub-macroblock partitions that contain blocks smaller than 8x8 are disabled.
DisableBSlices
Flag (0 or 1), default: 0 When this flag is equal to 1, B slice coding is disabled and all slices are coded as I or P slices.
MaxDeltaQP
Unsigned Int, default: 1 Specifies the maximum absolute difference of macroblock quantization parameters from the picture quantization parameter.
QP
Double, default: 32.0 Specifies the basis quantization parameter for encoding the current layer. The actual quantization parameters for encoding a specific frame are determined depending on this parameter QP and the position of the frame inside the group of pictures.
BaseLayerId
Unsigned Int, default: next lower layer when present Specifies the layer that is used for inter-layer prediction. When BaseLayerId is not present, the next lower layer (specified by the occurrence order in the main configuration file) is considered to represent the base layer for inter-layer prediction. Otherwise, the layer that is employed for inter-layer prediction is specified by its LayerId. The LayerId is determined by the occurrence order in the main configuration file; the first layer (i.e., the base layer) has LayerId equal to 0. See LayerCfg.
ForceReOrdering
Flag (0 or 1), default: 0 Specifies whether reference picture list reordering commands are encoded for every entry of the reference picture lists. When ForceReOrdering is equal to 0, only required reference picture list reordering commands are transmitted. When ForceReOrdering is equal to 1, a reordering command is transmitted for each entry of the reference picture lists. By forcing reordering commands, the ability to detect missing pictures is increased.
EncSIPFile
String, default: Specifies the name of the SIP decision file. This file records the POCs of those frames of this layer that do not use inter-layer prediction. When EncSIPFile is not present, the encoder does not use the SIP strategy for this layer.
MeQPLP
Double, default: QP Specifies the Lagrangian parameter that is used for motion estimation and mode decision of key pictures. The Lagrangian parameter for mode decision is determined by L = 0.85 * (2 ^ (MeQPLP / 3) ). The Lagrangian parameter for motion estimation additionally depends on the distortion measure that is used for the motion search. When this parameter is not specified, it is inferred to be equal to the parameter QP.
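As a rough illustration of the formula given for MeQPLP, the following Python sketch evaluates L for a few values. The function name is hypothetical and not part of the JSVM software; the sketch simply transcribes the formula as printed in this manual and does not model the additional dependency on the distortion measure that applies to motion estimation.

```python
def mode_decision_lambda(me_qp_lp: float) -> float:
    """Lagrangian parameter for mode decision: L = 0.85 * 2^(MeQPLP / 3)."""
    return 0.85 * 2.0 ** (me_qp_lp / 3.0)


if __name__ == "__main__":
    # Print L for a few example values of MeQPLP.
    for value in (26.0, 29.0, 32.0, 35.0):
        print(f"MeQPLP = {value:4.1f}  ->  L = {mode_decision_lambda(value):.2f}")
```

With this formula, increasing MeQPLP by 3 doubles L.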
InterLayerPred
Unsigned Int, default: 0 Specifies the usage of inter-layer prediction and specifies default values of ILModePred, ILMotionPred, and ILResidualPred. The following values are supported:
0 The default values for ILModePred, ILMotionPred, and ILResidualPred are 0. The layer does not use inter-layer prediction (no_inter_layer_pred_flag is equal to 1), unless specified otherwise by the parameters ILModePred, ILMotionPred, and ILResidualPred.
1 The default values for ILModePred, ILMotionPred, and ILResidualPred are 1. The layer does use inter-layer prediction (no_inter_layer_pred_flag is equal to 0).
2 The default values for ILModePred, ILMotionPred, and ILResidualPred are 2. The layer does use inter-layer prediction (no_inter_layer_pred_flag is equal to 0).
InterLayerPred shall always be equal to 0 for the base layer (LayerId 0).
ILModePred
Unsigned Int, default: InterLayerPred Specifies the usage of inter-layer mode prediction. The following values are supported:
0 The layer does not use inter-layer mode prediction (adaptive_base_mode_flag is equal to 0, default_base_mode_flag is equal to 0)
1 Inter-layer mode prediction is always used (adaptive_base_mode_flag is equal to 0, default_base_mode_flag is equal to 1)
2 Inter-layer mode prediction is used in a macroblock adaptive way (adaptive_base_mode_flag is equal to 1)
ILModePred should always be equal to 0 for the base layer (LayerId 0). With mode 1, inter-layer mode prediction is enabled for all macroblocks of a picture as long as a corresponding base layer picture exists. That means that the macroblock modes and, when applicable, the reference indices and motion vectors are always copied from the reference layer. When ILModePred is equal to 2, the use of inter-layer mode prediction is selected for each macroblock by rate-distortion optimization. For scalable enhancement layers, the best coding efficiency is usually obtained when ILModePred is set equal to 2. When ILModePred is equal to 1, the encoder might still decide to set adaptive_base_mode_flag equal to 1 depending on other parameters in the configuration files, but it tries to select the mode with base_mode_flag equal to 1 whenever possible (this might not be possible when it would violate the level constraint on the maximum number of motion vectors).
ILMotionPred
Unsigned Int, default: InterLayerPred Specifies the usage of inter-layer motion prediction. The following values are supported:
0 The layer does not use inter-layer motion prediction (adaptive_motion_prediction_flag is equal to 0, default_motion_prediction_flag is equal to 0)
1 Inter-layer motion prediction is always used (adaptive_motion_prediction_flag is equal to 0, default_motion_prediction_flag is equal to 1)
2 Inter-layer motion prediction is used in a macroblock partition adaptive way (adaptive_motion_prediction_flag is equal to 1)
ILMotionPred should always be equal to 0 for the base layer (LayerId 0). With mode 1, inter-layer motion prediction is enabled for all macroblocks of a picture as long as a corresponding base layer picture exists. That means that the reference layer motion vector is always used as motion vector predictor. When ILMotionPred is equal to 2, the use of inter-layer motion prediction is selected for each macroblock partition by rate-distortion optimization. For scalable enhancement layers, the best coding efficiency is usually obtained when ILMotionPred is set equal to 2. When adaptive_base_mode_flag is equal to 0 and default_base_mode_flag is equal to 1 (ILModePred is equal to 1), the value of ILMotionPred has no influence, since all macroblocks inside the cropping window are coded with base_mode_flag equal to 1.
ILResidualPred
Unsigned Int, default: InterLayerPred Specifies the usage of inter-layer residual prediction. The following values are supported:
0 The layer does not use inter-layer residual prediction (adaptive_residual_prediction_flag is equal to 0, default_residual_prediction_flag is equal to 0)
1 Inter-layer residual prediction is always used (adaptive_residual_prediction_flag is equal to 0, default_residual_prediction_flag is equal to 1)
2 Inter-layer residual prediction is used in a macroblock adaptive way (adaptive_residual_prediction_flag is equal to 1)
ILResidualPred should always be equal to 0 for the base layer (LayerId 0). With mode 1, inter-layer residual prediction is enabled for all macroblocks of a picture as long as a corresponding base layer picture exists. That means that the reference layer residual is always used as predictor for the residual of the current layer. When ILResidualPred is equal to 2, the use of inter-layer residual prediction is selected for each macroblock by rate-distortion optimization. For scalable enhancement layers, the best coding efficiency is usually obtained when ILResidualPred is set equal to 2.
BaseQuality
Unsigned Int, default: 15
Specifies which quality layer (MGS layer) of the base layer is used for inter-layer prediction. Since up to 16 MGS layers can be present for a spatial or coarse-grain layer, BaseQuality shall be in the range of 0 to 15, inclusive. A value of 0 for BaseQuality specifies that only the base representation and none of the MGS layers are employed for inter-layer prediction; a value of 1 specifies that the data of the first MGS layer are used in addition, etc. When the number of MGS layers that are actually present is less than the specified value of BaseQuality, all available MGS layers are employed for inter-layer prediction. In particular, with the default value of 15, all base layer data are always used.
SliceSkip
Flag (0 or 1), default: 0 SliceSkip equal to 1 specifies that, for all access units in which the reference layer representation is present (no_inter_layer_pred_flag is equal to 0), the slices of the current layer representation are coded with slice_skip_flag equal to 1. SliceSkip equal to 0 specifies that all slices of the layer representations are coded with slice_skip_flag equal to 0.
UseESS
Int, default: 0 Specifies whether Extended Spatial Scalability (ESS) should be used. ESS enables a generalized relation between successive spatial layers. A picture of a lower spatial layer may represent a cropped area of the higher resolution picture, and the relation between successive spatial layers does not need to be dyadic. Geometrical parameters defining the cropping window and the down-sampling ratio can either be defined at the sequence level or evolve at the picture level. The following values are supported:
0 No ESS
1 Sequence level ESS
2 Picture level ESS
ESSPicParamFile
String, default: ess.dat Specifies the name of the file containing the cropping window parameters. This option is only required when the UseESS option is set to 2 (Picture level ESS). Each line of the file consists of four integers separated by commas, in the following order: x_orig, y_orig, crop_width, crop_height. Here, x_orig and y_orig represent the coordinates of the upper left corner of the cropping window, and crop_width and crop_height represent its width and height, respectively. The cropping window parameters are given in the coordinate system of the higher layer; the window represents the upsampled lower layer. If the file contains fewer lines than the number of pictures in the output sequence, the cropping parameters of the last line are used for the remaining pictures.
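The file layout described for ESSPicParamFile, including the reuse of the last line when the file is shorter than the sequence, can be sketched in Python as follows. The function names are hypothetical and not taken from the JSVM sources; skipping blank lines is an assumption of this sketch, not documented behaviour.

```python
def parse_ess_lines(lines):
    """Parse lines of the form 'x_orig, y_orig, crop_width, crop_height'."""
    params = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        x_orig, y_orig, crop_width, crop_height = (int(v) for v in line.split(","))
        params.append((x_orig, y_orig, crop_width, crop_height))
    return params


def cropping_window(params, picture_index):
    """Return the window for a picture; the last entry is reused past the end."""
    if not params:
        raise ValueError("no cropping parameters available")
    return params[min(picture_index, len(params) - 1)]
```

With the two lines "0, 0, 352, 288" and "16, 8, 320, 272", picture 0 uses the first window, and picture 1 as well as every later picture uses the second one.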
ESSCropWidth
Int, default: 0 Specifies the width of the cropping window. The UseESS option should be set to 1 or 2.
ESSCropHeight
Int, default: 0 Specifies the height of the cropping window. The UseESS option should be set to 1 or 2.
ESSOriginX
Int, default: 0 Specifies the X-position of the upper left corner of the cropping window. The UseESS option should be set to 1.
ESSOriginY
Int, default: 0 Specifies the Y-position of the upper left corner of the cropping window. The UseESS option should be set to 1.
ESSChromaPhaseX
Int, default: -1 Specifies the high layer (current layer) chroma component phase shift in dimension X in comparison to the luma component in dimension X. Values are given in quarter luma samples and shall be in the range of -1 to 1, inclusive.
ESSChromaPhaseY
Int, default: 0 Specifies the high layer (current layer) chroma component phase shift in dimension Y in comparison to the luma component in dimension Y. Values are given in quarter luma samples and shall be in the range of -1 to 1, inclusive.
ESSBaseChromaPhaseX
Int, default: -1 Specifies the base layer (lower than the current layer) chroma component phase shift in dimension X in comparison to the luma component in dimension X. Values are given in quarter luma samples and shall be in the range of -1 to 1, inclusive.
ESSBaseChromaPhaseY
Int, default: 0 Specifies the base layer (lower than the current layer) chroma component phase shift in dimension Y in comparison to the luma component in dimension Y. Values are given in quarter luma samples and shall be in the range of -1 to 1, inclusive.
MGSVectorMode
Flag (0 or 1), default: 0 Specifies whether the transform coefficients of the layer are written into several quality layers. A value of 0 specifies that no additional quality layers are inserted, 1 specifies that the transform coefficients of the layer are written into several quality layers according to MGSVectorX.
NumSlicGrpMns1
Unsigned Int, default: 0 Specifies the number of slice groups minus 1. A value of 0 specifies one slice group, a value of 1 specifies two slice groups, etc.
SlcGrpMapType
Unsigned Int, default: 2 Specifies the slice group map type. The following values are supported:
0 Interleaved mode
1 Dispersed mode
2 Foreground with left-over
3 Box-out
4 Raster scan
5 Wipe
6 Explicit mode, slice_group_id is read from the file specified by SlcGrpCfgFileNm
SlcGrpChgDrFlag
Unsigned Int, default: 0 Sets slice_group_change_direction_flag. The following values are supported:
0 Box-out clockwise, raster scan, or wipe right
1 Box-out counter-clockwise, reverse raster scan, or wipe left
SlcGrpChgRtMns1
Unsigned Int, default: 85 Specifies the value of the syntax element slice_group_change_rate_minus1.
SlcGrpCfgFileNm
String, default: sgcfg.cfg Slice configuration file used for slice group map types 0, 2, and 6.
UseRedundantSlc
Unsigned Int, default: 0 Enables the use of redundant slices. Currently supports one redundant slice per slice. The current redundant slice tool does not support multiple slices per picture.
PLR
Unsigned Int, default: 0 Specifies the packet loss rate for the current layer. This parameter is used by the loss-aware rate-distortion optimization, see LARDO.
NumROI
Unsigned Int, default: 0 Specifies the number of ROIs in the current layer.
ROICfgFileNm
String, default: roiconf.cfg Specifies the ROI configuration file. When NumROI is greater than 0, this file describes the roi_id, the slice group id, and the scalability layer id for each ROI.
SliceMode
Unsigned Int, default: 0 Sets the slice mode. The following values are supported:
0 Disabled
1 Fixed number of macroblocks per slice
2 Fixed number of bytes per slice
SliceArgument
Unsigned Int, default: 2^32-1 Specifies the argument for SliceMode 1 (number of macroblocks per slice) or SliceMode 2 (number of bytes per slice).
ExplicitQPCascading
Flag (0 or 1), default: 0 Specifies whether the cascading of the quantization parameters over the temporal levels is explicitly given in the layer configuration file, or whether the default QP cascading is used. When ExplicitQPCascading is equal to 0, the default QP cascading is used. When ExplicitQPCascading is equal to 1, the QP cascading needs to be specified in the configuration file by the parameters DQP4TLevelX (with X being replaced by 0-6).
AvcRewriteFlag
Flag (0 or 1), default: 0 Specifies that the transmitted sequence can be rewritten without degradation as an AVC bit-stream by only decoding and coding entropy codes and scaling transform coefficients. An alternative method for the IntraBL block is employed, and restrictions are placed on the transform size selection by the encoder. AvcRewriteFlag should be 0 for the base layer. For an enhancement layer, if AvcRewriteFlag is on, all enhancement layers below it (if any) should also have AvcRewriteFlag on.
AvcAdaptiveRewriteFlag
Flag (0 or 1), default: 0 Specifies that the avc_rewrite_flag will be sent in the slice header.
-pf (config)
The parameter config specifies the name of the main configuration file to be used.
-bf (bitstream)
The parameter bitstream specifies the filename for the bit-stream to be generated.
-frms (frames)
The parameter frames specifies the number of frames of the input sequence to be encoded. The number of frames to be encoded is specified at the frame rate given by the parameter FrameRate in the main configuration file.
-gop (gsize)
The parameter gsize overwrites the parameter GOPSize of the main configuration file.
-iper (iperiod)
The parameter iperiod overwrites the parameter IntraPeriod of the main configuration file.
-numl (numLayers)
Specifies the number of layers for the encoder call. The parameter numLayers shall not exceed the value of parameter NumLayers in the main configuration file.
-cabac
Sets the parameter SymbolMode equal to 1 for all layers.
-vlc
Sets the parameter SymbolMode equal to 0 for all layers.
The parameter mode overwrites the parameter InterLayerPred of the layer configuration file for the layer given by layer.
-bcip
This option specifies that the base layer (LayerId equal to 0) is encoded using constrained intra prediction. When a multi-layer sequence is encoded successively by starting with the base layer and adding one enhancement layer in each further encoder call (e.g. to match given bit-rates), this option shall be specified for the encoder call in which only the base layer is encoded.
-tlnest (mode)
The option sets the value of the syntax element temporal_level_nesting_flag in the scalability Information SEI message. If mode is equal to 1, a reference picture shall not be used for inter prediction after a succeeding picture with lower temporal level value has been decoded. Otherwise, no such restriction is placed. The default value is 0.
-icsei (mode)
The option specifies whether the integrity check SEI message is present or not. If mode is equal to 1, the encoder writes the integrity check SEI message. Otherwise, no integrity check SEI message is present. The default value is 0.
-tlidx (mode)
The option specifies whether the tl0_dep_rep_idx SEI is present or not. If mode is equal to 1, tl0_dep_rep_idx SEI is present in all NAL units where dId and qId are equal to 0. Otherwise, it is not present. The default value is 0.
-ciu (mode)
Specifies whether slice boundaries in the base layer picture are treated like picture boundaries in the Intra_Base upsampling process. When mode is equal to 1, slice boundaries are treated as picture boundaries in the Intra_Base upsampling process, and InterLayerLoopFilterDisable in the configuration file must be set equal to 1, 2, or 5. When mode is equal to 0, slice boundaries are not treated as picture boundaries in the Intra_Base upsampling process.
-kpm (value)
Specifies the usage of MGS key pictures. The parameter value overwrites the parameter EncodeMGSKeyPictures of the main configuration file.
-mgsctrl (value)
Specifies what pictures are employed for motion estimation and motion compensation (determination of the residual signal) for MGS coding. The parameter value overwrites the parameter MGSControl of the main configuration file.
-aeqpc (value)
Specifies whether explicit QP cascading mode is used in all layers. The parameter value overwrites the parameter ExplicitQPCascading of all layer configuration files.
-mbaff (layer)
Specifies whether Macroblock Adaptive Frame Field Coding is used for layer layer. This parameter overwrites the parameter MbAFF of the layer configuration file.
-paff (layer)
Specifies whether Picture Adaptive Frame Field Coding is used for layer layer. This parameter overwrites the parameter MbAFF of the layer configuration file.
-ml (value)
Specifies the usage of multi-layer lambda selection algorithm. The parameter value overwrites the parameter MultiLayerLambdaSel of the main configuration file.
-h
Prints out a brief help on using the encoder.
The optional parameter maxPOCDiff specifies a maximum difference of Picture Order Counts of successive output frames. When the actual difference of the Picture Order Count values of successive output frames exceeds this maximum value, the last decoded picture is repeated. This option can be used for decoding bit-streams with packet/frame losses in a way that the number of output frames is identical to the number of originally encoded frames. The value of maxPOCDiff shall be greater than 0. When this parameter is not present, maxPOCDiff is inferred to be equal to infinity.
The parameter ec specifies the error concealment method. The following values are supported:
1 All macroblocks are assumed to be coded using BLSkip
2 Frame copy
3 All macroblocks are assumed to be coded in direct mode
The default value for ec is 0. When ec is equal to 0, neither error concealment nor packet loss detection is performed. For bit-streams that are not supported by the current packet loss detection, a value of 0 means that the decoder behaves like the JSVM without error concealment.
generates a packet trace file from the given stream
extract the layer with layer id = SL and the dependent lower layers
extract all layers with dependency_id <= L
extract all layers with temporal_level <= T
extract all layers with quality_level <= F
extract a layer (possibly truncated) with the target bitrate = B
extract a layer (possibly truncated) with A: frame width [luma samples], B: frame height [luma samples], C: frame rate [Hz], D: bit rate [kbit/s]
information about quality layers is used during extraction
ordered/toplayer quality layer extraction
extract using the selective inter-layer prediction strategy
prefixUnit and suffixUnit are used as auxiliary information during extraction when sip is present and the base layer is AVC compatible
uses a (modified) packet trace file for bit-stream extraction
extract rate of the highest layer without truncation (use QL when present); may be specified as a percentage (0%: no quality layers, 100%: all quality layers)
use with the l and f options: extract all included layers of the layer L specified with l and all quality levels below quality level F specified with f of the layer L
The options "-l", "-t", and "-f" can be used in combination with each other. The options "-ql" or "-qlord" can be used in connection with the option "-e". The options "-sip" and "-suf" must be used in connection with the option "-e". Other options can only be used separately.
The parameter in specifies the filename for the input bit-stream (global bit-stream). The parameter out specifies the filename for the output bit-stream, which generally represents a sub-stream of the input bit-stream as specified by the additional command line parameters. When the extractor is called with a single parameter input.svc specifying the input bit-stream, information about the contained scalable layers is displayed as illustrated in Example 19. The DTQ information represents a triple of the values (dependency_id, temporal_level, quality_level).
Example 19: Using the bit-stream extractor for displaying information about a bit-stream
> BitStreamExtractorStatic input.svc
Layer  Resolution  Framerate  Bitrate  DTQ
0      176x144     3.7500     106.00   (0,0,0)
1      176x144     3.7500     206.00   (0,0,1)
2      176x144     3.7500     341.00   (0,0,2)
3      176x144     7.5000     430.00   (0,1,0)
4      176x144     7.5000     454.00   (0,1,1)
5      176x144     7.5000     508.00   (0,1,2)
6      176x144     15.0000    594.00   (0,2,0)
7      176x144     15.0000    631.00   (0,2,1)
8      176x144     15.0000    721.00   (0,2,2)
9      352x288     3.7500     666.00   (1,0,0)
10     352x288     3.7500     1010.00  (1,0,1)
11     352x288     7.5000     1364.00  (1,1,0)
12     352x288     7.5000     1454.00  (1,1,1)
13     352x288     15.0000    1838.00  (1,2,0)
14     352x288     15.0000    1963.00  (1,2,1)
15     352x288     30.0000    2192.00  (1,3,0)
16     352x288     30.0000    2358.00  (1,3,1)
Each of the scalable layers listed in the bit-stream extractor output (with Ids equal to 0 to 16 in Example 19) can be extracted using the option -sl. Note that these scalable layers do not correspond to the usual notion of layer. For instance, in order to extract the scalable layer 7 in Example 19, which has a spatial resolution of 176x144 samples (QCIF), a frame rate of 15 Hz, and a bit-rate of 631 kbit/s, the command in Example 20 can be used.
Example 20: Extraction of a scalable layer
BitStreamExtractorStatic input.svc output.svc -sl 7
The options -l, -t, and -f can be used to identify scalable layers by the DTQ values (see Example 19). The option -l L specifies that all layers with dependency_id less than or equal to L shall be extracted, the option -t T specifies that all layers with temporal_level less than or equal to T shall be extracted, and the option -f F specifies that all layers with quality_level less than or equal to F shall be extracted. Example 21 shows an alternative that uses the options -l, -t, and -f for extracting the scalable layer 7 (cf. Example 19 and Example 20).
Example 21: Extraction of a scalable sub-stream using the options -l, -t, and -f
BitStreamExtractorStatic input.svc output.svc -l 0 -t 2 -f 1
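The relation between Example 19, Example 20, and Example 21 can be checked with a short Python sketch. It applies the three DTQ conditions to the layer list of Example 19; treating the three options as a logical AND is this sketch's reading of the text above, and the names used are hypothetical, not part of the extractor.

```python
# (layer id, bit rate in kbit/s, (dependency_id, temporal_level, quality_level)),
# copied from the extractor output in Example 19.
EXAMPLE_19 = [
    (0, 106.0, (0, 0, 0)), (1, 206.0, (0, 0, 1)), (2, 341.0, (0, 0, 2)),
    (3, 430.0, (0, 1, 0)), (4, 454.0, (0, 1, 1)), (5, 508.0, (0, 1, 2)),
    (6, 594.0, (0, 2, 0)), (7, 631.0, (0, 2, 1)), (8, 721.0, (0, 2, 2)),
    (9, 666.0, (1, 0, 0)), (10, 1010.0, (1, 0, 1)), (11, 1364.0, (1, 1, 0)),
    (12, 1454.0, (1, 1, 1)), (13, 1838.0, (1, 2, 0)), (14, 1963.0, (1, 2, 1)),
    (15, 2192.0, (1, 3, 0)), (16, 2358.0, (1, 3, 1)),
]


def select_layers(layers, max_d, max_t, max_q):
    """Return the ids of all scalable layers with D <= max_d, T <= max_t, Q <= max_q."""
    return [
        layer_id
        for layer_id, _bitrate, (d, t, q) in layers
        if d <= max_d and t <= max_t and q <= max_q
    ]


if __name__ == "__main__":
    kept = select_layers(EXAMPLE_19, max_d=0, max_t=2, max_q=1)
    print("retained scalable layers:", kept)
    print("highest retained layer:", max(kept))
```

For the values of Example 21 (L = 0, T = 2, F = 1), the highest retained scalable layer is layer 7, which is the layer extracted in Example 20.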
One further option, used together with the options -f F and -l L, allows extracting all complete layers below layer L and quality level F of layer L.
The option -b B can be used for extracting a sub-stream with a bit-rate specified by B. The bit-rate B has to be specified in kbit/s. The extracted sub-stream is determined by the following ordered steps.
1. If there is a layer having a bitrate exactly equal to B, that layer is extracted. Otherwise, the extraction process continues with the next step.
2. The layer X having a bitrate smaller than but closest to B is found. If layer X is the highest layer, layer X is extracted. Otherwise, the extraction process continues with the next step.
3. If the next higher layer Y to layer X is not an MGS layer, then layer X is extracted. Otherwise, the extraction process continues with the next step.
4. If the bitrate of layer Y is greater than B, layer Y is truncated to B. Otherwise, layer X is replaced with layer Y, and the checks are repeated for the new layer X.
The most powerful option for operating the bit-stream extractor is the option -e. When using this option, the spatial resolution, the frame rate, and the bit-rate of the sub-stream to be extracted can be specified. This extraction option has to be specified in the form -e AxB@C:D. Here, the parameters A and B represent the spatial resolution expressed by the frame width A and the frame height B. The frame rate of the sub-stream to be extracted is specified by the parameter C, and the target bit-rate is specified by the parameter D. When MGS data is present, some of the corresponding packets might be discarded in order to match the given target bit-rate. Note that each scalable stream only supports a specific range of target bit-rates for each spatio-temporal resolution; the supported bit-rate range depends on the encoder configuration. Example 22 shows an extractor call in which a sub-stream in QCIF resolution (176x144 samples) with a frame rate of 15 Hz and a bit-rate of 600 kbit/s is extracted. The extracted bit-stream is similar to that of Example 20 and Example 21, with the difference that the bit-rate (631 kbit/s in Example 20 and Example 21) is reduced to 600 kbit/s. The reduction of the bit-rate is obtained by discarding MGS packets (quality_level > 0) of the scalable layer 7 in Example 19.
Example 22: Extraction of a scalable sub-stream using the general option -e
BitStreamExtractorStatic input.svc output.svc -e 176x144@15:600
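The AxB@C:D form of the extraction point can be taken apart as in the following Python sketch, which may be helpful when extractor calls are generated by scripts. The function name is hypothetical, and the accepted syntax (integer frame sizes, optional decimal fractions for the frame rate and the bit-rate) is an assumption of this sketch; the parsing performed by the extractor itself may differ.

```python
import re

# A: frame width, B: frame height, C: frame rate [Hz], D: bit rate [kbit/s]
_EXTRACTION_POINT = re.compile(r"^(\d+)x(\d+)@(\d+(?:\.\d+)?):(\d+(?:\.\d+)?)$")


def parse_extraction_point(spec):
    """Split a string such as '176x144@15:600' into (width, height, rate, bitrate)."""
    match = _EXTRACTION_POINT.match(spec)
    if match is None:
        raise ValueError(f"not of the form AxB@C:D: {spec!r}")
    width, height, frame_rate, bit_rate = match.groups()
    return int(width), int(height), float(frame_rate), float(bit_rate)


if __name__ == "__main__":
    print(parse_extraction_point("176x144@15:600"))
```

Applied to the argument of Example 22, the sketch yields a width of 176, a height of 144, a frame rate of 15 Hz, and a target bit-rate of 600 kbit/s.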
When using the general extraction option -e, an additional option -ql can be specified. When this option is specified, the extractor uses the quality level information in the bit-stream to extract a rate-distortion optimized bit-stream. In general, the quality (PSNR) of an extracted bit-stream is higher when the option -ql is used. However, this option should only be used when quality layer information is embedded in the bit-stream; otherwise, the result is undefined. More information on how quality level information can be embedded in a bit-stream is given in section 2.6.
Example 23: Extraction of a scalable sub-stream using -ql
BitStreamExtractorStatic input.svc output.svc -e 352x288@30:1400 -ql
If the quality level assigner did a multi layer quality layer assignment (see section 2.6), extraction using the option -ql may result in the removal of some of the MGS packets of the lower layers (dependency_ids) before all MGS packets of the top layers are removed. This behaviour is intentional, since the goal of multi layer quality layer assignment is to provide optimal RD performance at the top-most layer (possibly at the cost of the RD performance at embedded lower layers). However, for certain applications it may be preferable to remove the MGS packets of lower layers (dependency_ids) only after all MGS packets of the top layers have been removed. This ensures that the quality of the embedded spatial layers (dependency_ids) is not affected by bit-stream extraction, if the target bitrate allows it. This kind of ordered extraction can be performed by using the option -qlord. Using -qlord has the same effect as using -ql if the quality level assigner did not use multi layer quality layer assignment, or if there is only one layer (dependency_id).
Example 24: Extraction of a scalable sub-stream using -qlord
BitStreamExtractorStatic input.svc output.svc -e 352x288@30:1400 -qlord
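The difference between the two discarding orders can be illustrated with a toy model. This is not the extractor's implementation: the packet fields, the function names, and the rule that packets with the smallest quality layer id are dropped first are assumptions made for illustration only.

```python
from typing import List, NamedTuple


class MgsPacket(NamedTuple):
    dependency_id: int   # layer the packet belongs to
    quality_layer: int   # quality layer id (hypothetical field name)
    size: int            # packet size in bytes


def discard_order(packets: List[MgsPacket], ordered: bool) -> List[MgsPacket]:
    """Return the packets in the order in which this toy model drops them."""
    if ordered:
        # -qlord-like: all packets of the top layer first, then lower layers.
        return sorted(packets, key=lambda p: (-p.dependency_id, p.quality_layer))
    # -ql-like: by quality layer id only, regardless of the layer.
    return sorted(packets, key=lambda p: p.quality_layer)


def trim_to_budget(packets: List[MgsPacket], budget: int, ordered: bool) -> List[MgsPacket]:
    """Drop packets in discard order until the total size fits the budget."""
    kept = list(packets)
    total = sum(p.size for p in kept)
    for victim in discard_order(packets, ordered):
        if total <= budget:
            break
        kept.remove(victim)
        total -= victim.size
    return kept


# A multi layer assignment: one layer-0 packet has the smallest quality layer id.
packets = [MgsPacket(0, 5, 100), MgsPacket(1, 2, 100),
           MgsPacket(0, 1, 100), MgsPacket(1, 3, 100)]
print(trim_to_budget(packets, 200, ordered=False))  # a layer-0 packet is dropped
print(trim_to_budget(packets, 200, ordered=True))   # only layer-1 packets are dropped
```

In this model the two orders coincide whenever every packet of the top layer has a smaller quality layer id than every packet of the lower layers, which mirrors the statement that -qlord and -ql behave identically without multi layer assignment.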
When using the general extraction option -e, an additional option -sip can be specified. When this option is specified the extractor uses the selective inter-layer prediction information in the bit-stream
to extract a rate-distortion optimized bit-stream. This option should only be used when quality layer information is embedded in the bit-stream. If the base layer is AVC compatible, the option can only be used when the base layer bit-stream is encoded with prefix or suffix units and the option -suf is present. Furthermore, -sip cannot be used in connection with -ql. More information on how the selective inter-layer prediction information can be embedded in a bit-stream is given in section 2.9.
With the option -pt, a so-called packet trace file can be generated from a given stream. This trace file is a text file, which specifies various parameters for each single packet inside the given bit-stream. These parameters include the start position (in units of bytes) of the packet inside the bit-stream, the length of the packet (in units of bytes), the values of dependency_id (LId), temporal_level (TId), and quality_level (QId) for the packet, the type of the packet, and two flags which indicate whether the packet is discardable or truncatable (truncatable packets are not supported in SVC). An example, which shows how such a trace file can be generated and what information is present in the trace file, is given in Example 25.
The packet trace file can be modified and then used for bit-stream extraction. This is a very useful feature for easily simulating non-regular packet losses or packet truncations. In order to remove a packet from a trace file, simply delete the corresponding line in the text file. For simulating a packet truncation, just reduce the Length parameter in the trace file. Note that the parameter Start-Pos shall not be modified for a packet, since this parameter is used for identifying packets inside a bit-stream. Only the parameters Start-Pos and Length are used for a trace-file-based bit-stream extraction; all other parameters are irrelevant and can be arbitrarily modified or deleted.
Example 26 shows a modified version of the trace file in Example 25. Here, several packets have been deleted and other packets have been truncated. The example further shows how this modified trace file can be used for extracting a bit-stream. The extracted bit-stream can for instance be used for testing the error robustness of the decoder.
Example 25: Example for generating a packet trace file
> BitStreamExtractorStatic -pt trace.txt input.svc
> type trace.txt
Start-Pos.  Length  LId  TId  QId  Packet-Type   Discardable  Truncatable
==========  ======  ===  ===  ===  ============  ===========  ===========
0x00000000     162    0    0    0  StreamHeader  No           No
0x000000a2      13    0    0    0  ParameterSet  No           No
0x000000af       9    0    0    0  ParameterSet  No           No
0x000000b8       9    0    0    0  ParameterSet  No           No
0x000000c1    1408    0    0    0  SliceData     No           No
0x00000641    1838    0    0    1  SliceData     Yes          Yes
0x00000d6f    2811    0    0    2  SliceData     Yes          Yes
0x0000186a     532    0    0    0  SliceData     No           No
0x00001a7e    1809    0    0    1  SliceData     Yes          Yes
0x0000218f    2837    0    0    2  SliceData     Yes          Yes
0x00002ca4     232    0    1    0  SliceData     Yes          No
0x00002d8c     167    0    1    1  SliceData     Yes          Yes
0x00002e33     492    0    1    2  SliceData     Yes          Yes
0x0000301f     163    0    2    0  SliceData     Yes          No
0x000030c2      64    0    2    1  SliceData     Yes          Yes
0x00003102     250    0    2    2  SliceData     Yes          Yes
0x000031fc     181    0    2    0  SliceData     Yes          No
0x000032b1      77    0    2    1  SliceData     Yes          Yes
0x000032fe     250    0    2    2  SliceData     Yes          Yes
0x000033f8     103    0    3    0  SliceData     Yes          No
0x0000345f      27    0    3    1  SliceData     Yes          Yes
0x0000347a     138    0    3    2  SliceData     Yes          Yes
0x00003504     115    0    3    0  SliceData     Yes          No
0x00003577      18    0    3    1  SliceData     Yes          Yes
0x00003589     127    0    3    2  SliceData     Yes          Yes
0x00003608     119    0    3    0  SliceData     Yes          No
0x0000367f      19    0    3    1  SliceData     Yes          Yes
0x00003692     154    0    3    2  SliceData     Yes          Yes
0x0000372c     123    0    3    0  SliceData     Yes          No
0x000037a7      18    0    3    1  SliceData     Yes          Yes
0x000037b9     122    0    3    2  SliceData     Yes          Yes
0x00003833    1362    0    0    0  SliceData     No           No
...
Example 26: Example for extracting a bit-stream using a (modified) packet trace file
> type trace_mod.txt
Start-Pos.  Length  LId  TId  QId  Packet-Type   Discardable  Truncatable
==========  ======  ===  ===  ===  ============  ===========  ===========
0x00000000     162    0    0    0  StreamHeader  No           No
0x000000a2      13    0    0    0  ParameterSet  No           No
0x000000af       9    0    0    0  ParameterSet  No           No
0x000000b8       9    0    0    0  ParameterSet  No           No
0x000000c1    1408    0    0    0  SliceData     No           No
0x00000641    1838    0    0    1  SliceData     Yes          Yes
0x00000d6f    1200    0    0    2  SliceData     Yes          Yes
0x0000186a     532    0    0    0  SliceData     No           No
0x00001a7e     100    0    0    1  SliceData     Yes          Yes
0x0000218f    2837    0    0    2  SliceData     Yes          Yes
0x00002ca4     232    0    1    0  SliceData     Yes          No
0x00003504     115    0    3    0  SliceData     Yes          No
0x00003577      18    0    3    1  SliceData     Yes          Yes
0x00003589     127    0    3    2  SliceData     Yes          Yes
0x00003608     119    0    3    0  SliceData     Yes          No
0x0000367f      19    0    3    1  SliceData     Yes          Yes
0x00003692     154    0    3    2  SliceData     Yes          Yes
0x0000372c     123    0    3    0  SliceData     Yes          No
0x000037a7      18    0    3    1  SliceData     Yes          Yes
0x000037b9     122    0    3    2  SliceData     Yes          Yes
0x00003833    1362    0    0    0  SliceData     No           No
0x00003d85    1815    0    0    1  SliceData     Yes          Yes
0x0000449c    1407    0    0    2  SliceData     Yes          Yes
0x00004f9a     219    0    1    0  SliceData     Yes          No
...
> BitStreamExtractorStatic input.svc output.svc -et trace_mod.txt
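Editing a trace file by hand becomes tedious for longer streams. The following sketch shows how the two manipulations described above (deleting a line, reducing the Length value) could be scripted. The function name edit_trace and its arguments are hypothetical; the sketch assumes only that each packet line starts with the hexadecimal Start-Pos followed by the decimal Length, as in the examples above.

```python
import re
from typing import Dict, List, Set


def edit_trace(lines: List[str], drop: Set[int], truncate: Dict[int, int]) -> List[str]:
    """Remove or shorten packets, identified by their Start-Pos value."""
    result = []
    for line in lines:
        match = re.match(r"(0x[0-9a-fA-F]+\s+)(\d+)", line)
        if match is None:
            result.append(line)       # header, separator or "..." line
            continue
        start = int(match.group(1), 16)
        if start in drop:
            continue                  # simulate a lost packet
        if start in truncate:
            # Never enlarge a packet; only the Length field is rewritten,
            # Start-Pos stays untouched because it identifies the packet.
            old_text = match.group(2)
            new_len = min(int(old_text), truncate[start])
            new_text = str(new_len).rjust(len(old_text))
            line = match.group(1) + new_text + line[match.end():]
        result.append(line)
    return result


trace = [
    "Start-Pos.  Length  LId  TId  QId  Packet-Type   Discardable  Truncatable",
    "0x00000641    1838    0    0    1  SliceData     Yes          Yes",
    "0x00000d6f    2811    0    0    2  SliceData     Yes          Yes",
    "0x0000186a     532    0    0    0  SliceData     No           No",
]
for row in edit_trace(trace, drop={0x641}, truncate={0xD6F: 1200}):
    print(row)
```

The edited lines can be written to a file such as trace_mod.txt and passed to the extractor as in Example 26.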
The input bit-stream file is specified by the parameter Input. As input a bit-stream generated by the encoder shall be specified. The output bit-stream is given by the parameter Output. Input and output
bit-streams are identical with the exception that the output bit-stream contains additional information about quality layers. The quality layer information can be present in two different ways: (1) The information can be embedded in the NAL unit header syntax element simple_priority_id. In this case, the bit-rate is kept constant. This method can only be applied when the 3-byte NAL unit header is used. (2) The quality layer information can be inserted as additional SEI messages. Due to the additional NAL units, the bit-rate of the output bit-stream is slightly increased in comparison to the input bit-stream, but this method can also be applied when the 2-byte NAL unit header is used. By default, the quality layer information is embedded in the NAL unit header syntax element simple_priority_id. When SEI messages should be inserted, the option -sei needs to be specified.
In any case, an original video sequence has to be specified for each layer (spatial or CGS layer) that is present in the input bit-stream. The original video sequences are specified by the option -org L org.yuv, where L specifies the LayerId for a contained layer and org.yuv specifies the filename of the original video sequence. In Example 28, the usage of the quality level assigner is illustrated for a bit-stream that contains 2 spatial or CGS layers. The file in.svc represents the input bit-stream, and the files LX.yuv with X being 0 or 1 represent the original sequences for layer X. The output bit-stream is specified by the filename out.svc.
Example 28: Using the quality level assigner for inserting quality layer information into a 2-layer stream
QualityLevelAssigner -in in.svc -org 0 L0.yuv -org 1 L1.yuv -out out.svc
When the option -sei is additionally specified in Example 28, the quality layer information is inserted as SEI messages. Furthermore, it is possible to specify one of the options -dep and -ind. These options signal that a slightly modified algorithm should be used for determining the quality level information. By using one of these options, the execution time can be reduced by 50%, but the rate-distortion efficiency of the quality layer information is usually slightly reduced.
Example 29: Using the quality level assigner in two steps and storing quality layer information to a file
QualityLevelAssigner -in in.svc -org 0 L0.yuv -org 1 L1.yuv -wp QLInfo.dat
QualityLevelAssigner -in in.svc -out out.svc -rp QLInfo.dat
The insertion of quality layer information can also be done in two steps as illustrated in Example 29. By specifying the option -wp QLInfo.dat, the quality layer information is stored in the file QLInfo.dat. This file can later be used via the option -rp QLInfo.dat to directly insert the quality layer information into a bit-stream without carrying out the complex and time-consuming rate-distortion analysis. These options are for instance useful when two different bit-streams with quality layer information, one using the syntax element simple_priority_id and one using SEI messages, shall be generated as illustrated in Example 30.
Example 30: Using the quality level assigner for generating two streams with quality layer information
QualityLevelAssigner -in in.svc -org 0 L0.yuv -org 1 L1.yuv -wp QLInfo.dat
QualityLevelAssigner -in in.svc -out out_pid.svc -rp QLInfo.dat
QualityLevelAssigner -in in.svc -out out_sei.svc -rp QLInfo.dat -sei
By default, the quality level assigner assigns Quality layer Ids in such a way that the Quality layer Ids associated with a lower layer (dependency_id) are in a higher range compared to the Quality layer Ids associated with a higher layer (dependency_id). This ensures that the extractor removes the MGS packets associated with a higher layer (dependency_id) before it removes MGS packets of a lower layer (dependency_id). Multi Layer Quality Layer assignment (as proposed in JVT-S043) can be done via the option -mlql. This may improve the top layer (dependency_id) PSNR when the extractor truncates the bit-stream using the option -ql. This option is useful only if there is more than one layer (dependency_id). Specifically, this option is targeted for the combined scalability (meaning, a configuration having more than one Spatial/CGS
layer and their associated MGS layers) configuration. If this option is used, the Quality layer Id assignment is done for the best RD performance at the top layer(dependency_id). The Quality layer Ids associated with a particular layer(dependency_id) may take any valid value.
Example 31: Using the quality level assigner for generating multi layer quality layer information
QualityLevelAssigner -in in.svc -out out.svc -org 0 L0.yuv -org 1 L1.yuv -mlql
The extractor can choose to use the multi layer quality layer information directly or it can choose to use it for an ordered truncation. More information about this can be obtained by reading the extractor usage description above.
The input sequence is specified by the parameter Input. The output sequence is given by the parameter Output. The parameters width and height specify the frame width and height in luma samples, respectively. The parameter frms specifies the number of frames that are pre-filtered. The GOP size, which is used for applying the MCTF pre-filter, is given by the parameter GOPSize. Allowed values for GOPSize are 2, 4, 8, 16, 32, and 64. The parameter QP specifies the Lagrangian multiplier, which is used for motion estimation and mode decision for the MCTF analysis. This parameter controls the strength of the pre-filter. In Example 33, the usage of the MCTF pre-processor is illustrated for 300 frames of a sequence in CIF resolution (352x288 samples). The file in.svc represents the input sequence, and the file pre.yuv specifies the output sequence. For the parameters GOPSize and QP the default values of 16 and 26 are used, respectively.
Example 33: Example for using the MCTF pre-processor
MCTFPreProcessor -in in.svc -out pre.yuv -w 352 -h 288 -f 300
rec    reconstructed file
t      number of temporal downsampling stages (default: 0)
skip   number of frames to skip at start (default: 0)
strm   coded stream
fps    frames per second
The filenames for the original sequence and the reconstructed sequence are specified by the parameters org and rec. The spatial resolution of the original sequence is specified by the frame width w and the frame height h. By default, the temporal resolution of the reconstructed sequence shall be identical to the temporal resolution of the original sequence. It is however also possible to measure the PSNR between the original and reconstructed sequence when the reconstructed sequence represents a temporally downsampled version. In that case, the number of temporal downsampling stages t needs to be specified. This parameter is identical to the one described in section 2.1.1. The optional parameter skip can be used to specify the number of frames that shall be skipped at the start of the sequences. The bit-rate of a corresponding bit-stream is additionally output when a bit-stream file strm and a frame rate (in Hz) fps are specified. It is not possible to specify only one of these parameters.
Example 35: Using the PSNR tool for PSNR and rate measurement
> PSNRStatic 176 144 org.yuv rec.yuv 0 0 str.svc 15 2>PSNR.txt
> type PSNR.txt
128,00 32,23 38,79 39.02
In Example 35, the most common use of the PSNR tool is illustrated. The files org.yuv and rec.yuv specify the original and reconstructed sequences, respectively. The file str.svc specifies a bit-stream. It is assumed that the sequences are given in QCIF resolution (176x144 samples) and a frame rate of 15 Hz. The PSNR is measured between the original and reconstructed sequence, and the bit-rate of the given stream is calculated. The average bit-rate as well as the PSNR values for the luminance and the two chrominance components are written to the file PSNR.txt. All values are written to one line of the file and are separated by tabulator characters. The order of the values is the following: (1) bit-rate in kbit/s, (2) Y-PSNR in dB luminance component, (3) U-PSNR in dB chrominance component U or Cb, (4) V-PSNR in dB chrominance component V or Cr.
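For batch evaluations, the single result line of PSNR.txt can be read back with a few lines of code. This is a sketch only: the names PsnrResult and parse_psnr_line are hypothetical, and the handling of a decimal comma is an assumption based on the output shown in Example 35, where both "," and "." appear as decimal separators.

```python
from typing import NamedTuple


class PsnrResult(NamedTuple):
    bit_rate_kbps: float   # (1) bit-rate in kbit/s
    y_psnr_db: float       # (2) Y-PSNR in dB
    u_psnr_db: float       # (3) U-PSNR in dB (Cb)
    v_psnr_db: float       # (4) V-PSNR in dB (Cr)


def parse_psnr_line(line: str) -> PsnrResult:
    """Parse the result line written by the PSNR tool."""
    fields = line.split()          # tab characters count as whitespace
    if len(fields) != 4:
        raise ValueError(f"expected 4 values, got {len(fields)}: {line!r}")
    # Accept a decimal comma as well as a decimal point.
    values = [float(field.replace(",", ".")) for field in fields]
    return PsnrResult(*values)


result = parse_psnr_line("128,00\t32,23\t38,79\t39.02")
print(result.bit_rate_kbps, result.y_psnr_db)
```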
The parameter rc_cfg represents the rate control parameter file configuring the fixed QP encoder. Specific parameters are read from defined lines of the configuration file. Therefore, the ordering of the parameters in the configuration file is fixed. For each layer, a target bit-rate or a target PSNR, the maximum positive and negative bit-rate mismatch, and a start value for the quantization parameters are specified. An example for the fixed QP encoder configuration file is provided in Example 37 below.
Example 37: Example fixed QP encoder configuration file for determination of the layer base quantization parameters with 2-layer encoding. For reference, the line numbers are printed before the lines.
 1 ########### RATE-POINTS CONFIGURATION FILE ###########
 2
 3 CITY                                 # Label
 4 bin/H264AVCEncoderLibTestStatic.exe  # Encoder Binary
 5 bin/PSNRStatic.exe                   # PSNR Binary
 6 cfg/CITY.cfg                         # Parameter File
 7 str/CITY.264                         # Bit Stream File
 8 150                                  # Number Of Frames
 9 2                                    # GOP Size
10 -1                                   # Intra Period
11 30.0                                 # Frames Per Second
12 2                                    # Number Of Layers
13 1                                    # constrained intra for base layer
14 10                                   # Number Of Iterations
15 0                                    # Mode (0:Rate, 1:PSNR)
16
17
18 ---------- LAYER 0 ----------
19 176                                  # Input Width
20 144                                  # Input Height
21 org/CITY_176x144_15.yuv              # Input File
22 tmp/CITY_layer0.yuv                  # Reconstructed File
23 256.00                               # Bit rate [kbit/s] /PSNR [dB]
24 2.00                                 # Maximum Negative Mismatch [%]
25 2.00                                 # Maximum Positive Mismatch [%]
26 32.00                                # StartBaseQpResidual
27 32.00                                # StartQpModeDecision
28 0                                    # Entropy Coding Mode Flag (0: CAVLC, 1:CABAC)
29 0                                    # Inter-Layer Prediction Mode
30 -1                                   # Base Layer Id
31
32 ---------- LAYER 1 ----------
33 352                                  # Input Width
34 288                                  # Input Height
35 org/CITY_352x288_30.yuv              # Input File
36 tmp/CITY_layer1.yuv                  # Reconstructed File
37 1024.00                              # Bit rate [kbit/s] /PSNR [dB]
38 2.00                                 # Maximum Negative Mismatch [%]
39 2.00                                 # Maximum Positive Mismatch [%]
40 36.00                                # StartBaseQpResidual
41 36.00                                # StartQpModeDecision
42 1                                    # Entropy Coding Mode Flag (0: CAVLC, 1:CABAC)
43 2                                    # Inter-Layer Prediction Mode
44 0                                    # Base Layer Id
Label (line 3)
String Label used for temporary file names (typically the name of the sequence under consideration)
String Name of the main configuration file as specified in section 2.2 for the configuration to be optimized. The main configuration file specifies the maximum number of layers.
The description of the layer parameters is given for a layer nlayer, with nlayer being in the range of 0..10(maximum). Input width (line 19+14*nlayer)
Unsigned int Input width in number of pixels. Overwrites the parameter SourceWidth in the layer configuration file.
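Because the parameters are read from fixed lines, the position of a layer parameter follows from the rule 19+14*nlayer. The sketch below computes these line numbers; the function name config_line is hypothetical, and the offsets of the parameters other than the input width are read off the layout of Example 37 under the assumption that every layer block has the same layout.

```python
FIRST_LAYER_LINE = 19   # line of "Input Width" for layer 0
LINES_PER_LAYER = 14    # distance between the blocks of two layers
MAX_LAYER = 10          # nlayer is in the range 0..10

# Offsets relative to the "Input Width" line, taken from Example 37
# (assumption: identical layout for every layer block).
FIELD_OFFSETS = {
    "Input Width": 0,
    "Input Height": 1,
    "Input File": 2,
    "Reconstructed File": 3,
    "Bit rate / PSNR": 4,
    "Maximum Negative Mismatch": 5,
    "Maximum Positive Mismatch": 6,
    "StartBaseQpResidual": 7,
    "StartQpModeDecision": 8,
    "Entropy Coding Mode Flag": 9,
    "Inter-Layer Prediction Mode": 10,
    "Base Layer Id": 11,
}


def config_line(field: str, nlayer: int) -> int:
    """Return the 1-based line number of a layer parameter."""
    if not 0 <= nlayer <= MAX_LAYER:
        raise ValueError(f"nlayer out of range 0..{MAX_LAYER}: {nlayer}")
    return FIRST_LAYER_LINE + LINES_PER_LAYER * nlayer + FIELD_OFFSETS[field]


print(config_line("Input Width", 0))    # layer 0 input width
print(config_line("Input Width", 1))    # layer 1 input width
print(config_line("Base Layer Id", 1))  # last line of the layer 1 block
```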
Filename of the original raw video sequence for the layer. Overwrites the parameter Inputfile in the layer configuration file.
The parameter sip_cfg specifies the SIP parameter file. Specific parameters are read from defined lines of the configuration file. Therefore, the ordering of the parameters in the configuration file is fixed. For each layer, the output frame rate, the loss which can be tolerated under the condition with MA, and the input and output filenames are specified. The parameter FileLabel adds a suffix to the output filename. An example for the SIP configuration file is provided in Example 39 below.
Example 39: Example main configuration file main.cfg for single-layer coding
 1 ########### SIP CONFIGURATION FILE ###########
 2
 3 2                      # Number of Layers
 4 150                    # Number of Frames
 5 30                     # In Frame Rate
 6
 7 ---------- LAYER 0 ----------
 8 1.00                   # Toleration
 9 15                     # Out Frame Rate
10 SIP/bus_0_without.dat  # Dat File Without
11 SIP/bus_0_with.dat     # Dat File With
12 SIP/bus_0.dat          # Output SIP Decision (dummy for layer 0)
13
14 ---------- LAYER 1 ----------
15 1.03                   # loss which can be tolerated with MA (dummy for layer 0)
16 30                     # out frame rate
17 SIP/bus_1_without.dat  # dat file without interlayer pred
18 SIP/bus_1_with.dat     # dat file with interlayer pred (dummy for layer 0)
19 SIP/bus_1.dat          # output SIP decision (dummy for layer 0)
For further ease of use, it is strongly suggested to rename these sequences as BUS_352x288_30.yuv, FOOTBALL_352x288_30.yuv, FOREMAN_352x288_30.yuv and MOBILE_352x288_30.yuv.
Here, SEQ is the sequence name (i.e. BUS, FOOTBALL, FOREMAN or MOBILE) and frms is the number of frames wanted in the output file. It is worth noting that, whereas the above-mentioned process corresponds to dyadic downsampling, we recommend using method 0 (which corresponds to non-normative downsampling (JVT-R006)) instead of method 1.
1 352 288 0 0
# # # # #
# # # #
chroma phase x 0 or -1, default = -1 chroma phase y -1 to +1, default = 0 base chroma phase x 0 or -1, default = -1 base chroma phase y -1 to +1, default = 0 0
For further ease of use, it is strongly suggested to rename these sequences as CITY_704x576_60.yuv, CREW_704x576_60.yuv, HARBOUR_704x576_60.yuv, and SOCCER_704x576_60.yuv.
Example 43: Down-sampling CIF 30 Hz sequence to QCIF 15 Hz sequence for the 4CIF scenario.
DownConvertStatic 352 288 SEQ_352x288_30.yuv 176 144 SEQ_176x144_15.yuv 0 1 0 frms phase -1 0
Here, SEQ is the sequence name (i.e. CITY, CREW, HARBOUR or SOCCER) and frms is the number of frames wanted in the output file. It is worth noting that the original sequences in QCIF resolution are obtained via a two-step process and not in a single step, which would result in different sequences. In addition, whereas the above-mentioned process corresponds to dyadic downsampling, we recommend using method 0 (which corresponds to non-normative downsampling (JVT-R006)) instead of method 1.
Example 45: ESS parameters for Layer1 (CIF) configuration file with Extended Spatial Scalability.
UseESS         1    # ESS
ESSCropWidth   704  # cropping width
ESSCropHeight  576  # cropping height
ESSOriginX     0    # cropping origin X
ESSOriginY     0    # cropping origin Y
The most important parameters that need to be specified in the main configuration file are the name for the bit-stream OutputFile, the frame rate FrameRate, the number of frames to be encoded FramesToBeEncoded, the GOP size GOPSize, and the base layer mode BaseLayerMode. Furthermore, the parameter NumLayers has to be set equal to 1 for single-layer coding, and exactly one layer configuration file has to be specified via the parameter LayerCfg. In Example 46, we additionally specified SearchMode and SearchRange to speed-up the encoder execution. For all other parameters of the main configuration file, default values as specified in section 2.2.2.1 are used. In the layer configuration file, the filename of the input sequence InputFile, the frame width SourceWidth and the frame height SourceHeight of the input images as well as the frame rates FrameRateIn and FrameRateOut need to be specified.
Example 47: Example layer configuration file layer.cfg for single-layer coding
# JSVM Layer Configuration File
InputFile     BUS_QCIF15.yuv  # Input file
SourceWidth   176             # Input frame width
SourceHeight  144             # Input frame height
FrameRateIn   15              # Input frame rate [Hz]
FrameRateOut  15              # Output frame rate [Hz]
Note that the frame rates FrameRate and FrameRateOut are different. Thus, only 75 frames are actually encoded although the parameter FramesToBeEncoded is set equal to 150. Similarly, groups of 8 pictures are actually used for encoding the sequence BUS_QCIF15.yuv, although a GOP size GOPSize of 16 is specified in the main configuration file. Note further that the base layer mode BaseLayerMode is set equal to 2. Hence, an AVC compatible bit-stream with additional sub-sequence SEI messages is written. These SEI messages could be used for the extraction of a temporal sub-stream in a non-SVC environment (or when all prefix NAL units are removed).
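The two numbers quoted above (75 frames, groups of 8 pictures) are consistent with a simple proportional scaling by the ratio of the layer output frame rate to the maximum frame rate. The sketch below reproduces this arithmetic; the function name effective_count is hypothetical, a FrameRate of 30 Hz in the main configuration file is assumed, and the proportional rule is inferred from the numbers in the text rather than taken from the encoder source.

```python
def effective_count(value_at_max_rate: int, max_frame_rate: float,
                    frame_rate_out: float) -> int:
    """Scale a value given at the maximum frame rate to a layer frame rate."""
    return int(value_at_max_rate * frame_rate_out / max_frame_rate)


MAX_FRAME_RATE = 30.0   # FrameRate in the main configuration file (assumed)
FRAME_RATE_OUT = 15.0   # FrameRateOut in layer.cfg

frames = effective_count(150, MAX_FRAME_RATE, FRAME_RATE_OUT)  # FramesToBeEncoded
gop = effective_count(16, MAX_FRAME_RATE, FRAME_RATE_OUT)      # GOPSize
print(frames, gop)
```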
As mentioned above, even in single-layer mode, the generated bit-stream can provide temporal scalability. The number of supported temporal scalability levels is dependent on the specified GOP size. A so-called group of pictures (GOP) consists of a key picture and hierarchically predicted B pictures that are located between the key picture of the current GOP and the key picture of the previous GOP. In Figure 1, the hierarchical prediction structure is illustrated for a group of 8 pictures, which is employed in the example above.
[Figure 1: Hierarchical prediction structure for a group of 8 pictures. Labels: GOP border, prediction, key picture]
The key pictures are either intra-coded (e.g. in order to enable random access) or inter-coded by using previous key pictures as reference for motion-compensated prediction. The sequence of key pictures is independent from any other pictures of the video sequence, and in general it represents the minimal temporal resolution that can be decoded. Furthermore, the key pictures can be considered as resynchronisation points between encoder and decoder when arbitrarily discardable MGS enhancement layers are added. The first picture of a video sequence is always intra-coded as IDR picture and represents a special GOP, which consists of exactly one picture. The remaining pictures of a GOP are hierarchically predicted. For the example in Figure 1, the picture in the middle (blue) is predicted by using the surrounding key pictures as references. It depends only on the key pictures, and represents the next higher temporal resolution together with the key pictures. The pictures of the next temporal level (green) are predicted by using only the pictures of the lower temporal resolution as references, etc. It is obvious that this hierarchical prediction structure inherently provides temporal scalability. Given the configuration files in Example 46 and Example 47, the encoder can be started using the call in Example 48. With option -pf the main configuration file is specified. The option -lqp 0 30 specifies the basis quantization parameter as well as the Lagrangian parameters for motion estimation and mode decision as described in section 2.2.2. Note that for rate-distortion efficient coding, the quantization parameter and the QP values that are used for determining the Lagrangian parameters should be identical. This is ensured by using the command line option -lqp.
Example 48: Example encoder call for single-layer coding
>H264AVCEncoderLibTestStatic -pf main.cfg -lqp 0 30
...
SUMMARY:
                      bitrate    Min-bitr    Y-PSNR
                      ---------  ----------  --------
176x144 @  1.8750      68.1375     68.1375   37.1245
176x144 @  3.7500      94.5711     94.5711   35.8306
176x144 @  7.5000     122.0511    122.0511   34.7876
176x144 @ 15.0000     154.9952    154.9952   34.0359
At the end of Example 48, the final encoder output is shown. It summarizes the supported spatial resolutions, frame rates, and bit-rates. For our example, only a single spatial resolution of 176x144
samples is supported. However, the bit-stream provides 4 different temporal resolutions with frame rates of 1.875, 3.75, 7.5, and 15 Hz, since we are using an effective GOP size of 8 pictures. The lowest supported temporal resolution is built by the sequence of key pictures.
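The relation between the effective GOP size and the supported frame rates can be written down directly: each additional temporal level doubles the frame rate, starting from the key picture rate. The following sketch assumes a dyadic hierarchy with a power-of-two GOP size; the function name temporal_frame_rates is hypothetical.

```python
from typing import List


def temporal_frame_rates(frame_rate: float, gop_size: int) -> List[float]:
    """Frame rates of all temporal levels, lowest (key pictures) first."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError(f"GOP size must be a power of two: {gop_size}")
    levels = gop_size.bit_length()          # log2(gop_size) + 1
    return [frame_rate / 2 ** (levels - 1 - k) for k in range(levels)]


# Effective GOP size of 8 at 15 Hz, as in the single-layer example.
print(temporal_frame_rates(15.0, 8))
```

For the values of the single-layer example this yields the four frame rates listed in the encoder summary of Example 48.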
Example 49: Example of supported scalable layers for single-layer coding
>BitStreamExtractorStatic test.264
Contained Layers:
====================
Layer  Resolution  Framerate  Bitrate  MinBitrate  DTQ
0      176x144      1.8750     68.10     68.10     (0,0,0)
1      176x144      3.7500     94.60     94.60     (0,1,0)
2      176x144      7.5000    122.10    122.10     (0,2,0)
3      176x144     15.0000    155.00    155.00     (0,3,0)
A printout of the supported spatio-temporal resolutions and bit-rates can also be obtained when the bit-stream extractor is called with the filename of the bit-stream as illustrated in Example 49. A sub-bit-stream that only contains NAL units for a specific temporal resolution can be extracted from the global bit-stream test.264 by using the bit-stream extractor either with option -sl or -t (cp. section 2.5). During the extraction of a sub-stream, all NAL units (packets) that are not required for decoding a specific spatio-temporal-rate point are simply discarded. In Example 50, it is illustrated how the temporal sub-stream of 3.75 Hz is extracted from the global 30 Hz stream and how this sub-stream is decoded. When referring to Figure 1, this temporal sub-stream only consists of the (brown) key pictures and the (blue) pictures of the next temporal level. Furthermore, in Example 50 the PSNR of the decoded video sequence and the bit-rate of the extracted sub-stream are measured using the PSNR tool, which is described in section 2.7. It shall be noted that the measured values for the bit-rate and the PSNRs are identical to those that have been reported in the encoder output (see Example 48). The extremely minor differences in the rate values result from the fact that the bits that are required for transmitting the stream header (scalable SEI message) are not counted during encoding.
inter-layer prediction, and that the actually used inter-layer prediction concepts are adaptively switched on a macroblock basis via rate-distortion-optimized coding. For rate-distortion efficient coding, InterLayerPred should be set equal to 2 for any enhancement layer.
Example 51: Example main configuration file main.cfg for spatial scalable coding
# JSVM Main Configuration File
OutputFile         test.264    # Bitstream file
FrameRate          30.0        # Maximum frame rate [Hz]
FramesToBeEncoded  150         # Number of frames (at input frame rate)
GOPSize            16          # GOP Size (at maximum frame rate)
BaseLayerMode      2           # Base layer mode (0,1: AVC compatible, 2: AVC w subseq SEI)
SearchMode         4           # Search mode (0:BlockSearch, 4:FastSearch)
SearchRange        32          # Search range (Full Pel)
NumLayers          2           # Number of layers
LayerCfg           layer0.cfg  # Layer configuration file
LayerCfg           layer1.cfg  # Layer configuration file
Example 52: Example layer configuration file layer0.cfg for spatial scalable coding
# JSVM Layer Configuration File
InputFile     BUS_QCIF15.yuv  # Input file
SourceWidth   176             # Input frame width
SourceHeight  144             # Input frame height
FrameRateIn   15              # Input frame rate [Hz]
FrameRateOut  15              # Output frame rate [Hz]
Example 53: Example layer configuration file layer1.cfg for spatial scalable coding
# JSVM Layer Configuration File
InputFile       BUS_CIF30.yuv  # Input file
SourceWidth     352            # Input frame width
SourceHeight    288            # Input frame height
FrameRateIn     30             # Input frame rate [Hz]
FrameRateOut    30             # Output frame rate [Hz]
InterLayerPred  2              # Inter-layer Pred. (0: no, 1: yes, 2:adap.)
The hierarchical coding structure with 2 spatial layers is illustrated in Figure 2. In each layer (QCIF and CIF in the above example), an independent hierarchical coding structure with layer-specific motion parameters is employed. Note that in the above example, the QCIF layer is coded at a frame rate of 15 Hz, while the CIF layer is coded at a frame rate of 30 Hz. In order to obtain a temporally scalable representation, the prediction structures of all layers have to be aligned as depicted in Figure 2. When specifying a GOP size of 16 pictures in the main configuration file, the effective GOP sizes that are used for the QCIF and CIF layer are 8 and 16, respectively (in Figure 2, the prediction structure is illustrated for GOP sizes of 4 and 8). The red arrows in Figure 2 indicate the usage of inter-layer prediction. Inter-layer prediction can only be used inside an access unit, and thus between base and enhancement layer pictures at the same time instant. Since the frame rate of the CIF enhancement layer is twice the frame rate of the QCIF base layer, the enhancement layer pictures of the highest temporal level are coded without inter-layer prediction. These pictures are only predicted using motion-compensated temporal prediction.
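The statement that exactly the pictures of the highest temporal level lack a base layer picture can be checked with a small model of the dyadic hierarchy. The sketch below is an illustration only: pictures are indexed at the enhancement layer frame rate starting with a key picture at index 0, and both function names are hypothetical.

```python
def temporal_level(index: int, gop_size: int) -> int:
    """Temporal level of a picture in a dyadic hierarchy (0 = key picture)."""
    if index % gop_size == 0:
        return 0
    levels = gop_size.bit_length() - 1                    # log2(gop_size)
    trailing_zeros = (index & -index).bit_length() - 1
    return levels - trailing_zeros


def has_base_picture(index: int, enh_rate: int, base_rate: int) -> bool:
    """True if the base layer has a picture at the same time instant."""
    return index % (enh_rate // base_rate) == 0


# CIF layer at 30 Hz with GOP size 8 (as drawn in Figure 2), QCIF layer at 15 Hz.
for i in range(9):
    level = temporal_level(i, 8)
    ilp = "possible" if has_base_picture(i, 30, 15) else "not possible"
    print(f"picture {i}: temporal level {level}, inter-layer prediction {ilp}")
```

In this model, every picture without a co-located base layer picture has temporal level 3, which is the highest level for a GOP size of 8.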
[Figure 2: Hierarchical coding structure with 2 spatial layers (QCIF and CIF). Labels: motion-compensated prediction, inter-layer prediction, GOP border, key pictures]
Given the configuration files of Example 51, Example 52, and Example 53, an example encoder call is illustrated in Example 54. It is assumed that the original sequence BUS_CIF30.yuv is given in CIF resolution with a frame rate of 30 Hz. In a first step, the resampling tool is used for generating a spatially and temporally downsampled sequence BUS_QCIF15.yuv in QCIF resolution with a frame rate of 15 Hz. More information about the resampling tool is given in section 2.1. Then the encoder is called with the main configuration file main.cfg and the additional options -lqp 0 30 and -lqp 1 32, where the first values (0 and 1) specify the layer, and the second values (30 and 32) specify the corresponding quantization parameter. As described in section 3.1, with option -lqp the quantization parameter as well as the Lagrangian parameters for motion estimation and mode decision are specified. The bit-rate of the generated sequence is mainly dependent on these parameters. Similar to single-layer coding, the quantization parameter QP and the parameters MeQPX, which determine the Lagrangian multipliers for mode decision and motion estimation, should be set to identical values for each spatial layer in order to generate a rate-distortion optimized bit-stream. This is always ensured by using the command line option -lqp.
Example 54: Example encoder call for spatial scalable coding
>DownConvertStatic 352 288 BUS_CIF30.yuv 176 144 BUS_QCIF15.yuv 0 1
>H264AVCEncoderLibTestStatic -pf main.cfg -lqp 0 30 -lqp 1 32
...
SUMMARY:
                      bitrate    Min-bitr    Y-PSNR    U-PSNR    V-PSNR
                    ---------  ----------  --------  --------  --------
176x144 @  1.8750     68.7450     68.7450   37.1325   41.4114   42.6966
176x144 @  3.7500     95.2042     95.2042   35.8244   41.1374   42.3771
176x144 @  7.5000    122.6842    122.6842   34.7798   40.9418   42.2639
176x144 @ 15.0000    155.5248    155.5248   34.0337   40.8782   42.1806
352x288 @  1.8750    290.1045    290.1045   37.2590   41.7302   43.5598
352x288 @  3.7500    396.3095    396.3095   35.8589   41.3258   43.2483
352x288 @  7.5000    509.5389    509.5389   34.8414   41.1008   43.0806
352x288 @ 15.0000    646.3088    646.3088   34.1147   40.9572   42.9462
352x288 @ 30.0000    769.7504    769.7504   33.5364   40.8049   42.8078
At the end of Example 54, the corresponding encoder output is shown, which summarizes the supported spatio-temporal resolutions and bit-rates. This information can also be obtained by calling the bit-stream extractor with the generated bit-stream as shown in Example 55. It should be noted that the lowest supported frame rate of 1.875 Hz is identical to the frequency of key pictures.
Example 55: Example of supported scalable layers for spatial scalable coding
>BitStreamExtractorStatic test.264
Contained Layers:
====================
Layer  Resolution  Framerate  Bitrate  MinBitrate  DTQ
    0     176x144     1.8750    68.70       68.70  (0,0,0)
    1     176x144     3.7500    95.20       95.20  (0,1,0)
    2     176x144     7.5000   122.70      122.70  (0,2,0)
    3     176x144    15.0000   155.50      155.50  (0,3,0)
    4     352x288     1.8750   290.10      290.10  (1,0,0)
    5     352x288     3.7500   396.30      396.30  (1,1,0)
    6     352x288     7.5000   509.50      509.50  (1,2,0)
    7     352x288    15.0000   646.30      646.30  (1,3,0)
    8     352x288    30.0000   769.80      769.80  (1,4,0)
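For scripted processing, the list of contained layers can be read into a simple data structure. The following sketch assumes that each layer is printed on its own row with whitespace-separated columns (layer index, resolution, frame rate, bit-rate, an optional minimum bit-rate, and the DTQ triple); the exact layout of the extractor output may differ, and the function name is hypothetical.

```python
import re

# One row: index, WxH, frame rate, bit-rate, optional minimum bit-rate, (D,T,Q)
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)x(\d+)\s+([\d.]+)\s+([\d.]+)"
    r"(?:\s+([\d.]+))?\s+\((\d+),(\d+),(\d+)\)\s*$"
)


def parse_layers(text):
    """Return one dictionary per layer row; lines that do not match are skipped."""
    layers = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        idx, w, h, fps, rate, min_rate, d, t, q = m.groups()
        layers.append({
            "layer": int(idx),
            "width": int(w),
            "height": int(h),
            "framerate": float(fps),
            "bitrate": float(rate),
            "min_bitrate": float(min_rate) if min_rate else None,
            "dtq": (int(d), int(t), int(q)),
        })
    return layers


sample = """\
Layer  Resolution  Framerate  Bitrate  MinBitrate  DTQ
0      176x144     1.8750     68.70    68.70       (0,0,0)
2      176x144     7.5000     122.70   122.70      (0,2,0)
8      352x288     30.0000    769.80   769.80      (1,4,0)
"""
layers = parse_layers(sample)
```

A layer can then be selected by its DTQ triple, for example `[l for l in layers if l["dtq"] == (0, 2, 0)]` for the QCIF representation at 7.5 Hz.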
The generated scalable bit-stream contains 9 different representations: 4 of them are QCIF representations with frame rates of 1.875, 3.75, 7.5, and 15 Hz, and 5 of them are CIF representations with frame rates of 1.875, 3.75, 7.5, 15, and 30 Hz. All of the included QCIF representations are AVC compatible and can be decoded with a standard AVC decoder. Sub-streams for any of the 9 included representations can be extracted using the bit-stream extractor. The bit-stream extractor can be called either with the option -sl or with the options -l and -t. In Example 56, examples for extracting a QCIF representation with a frame rate of 7.5 Hz (scalable layer 2 in the encoder output) and a CIF representation with a frame rate of 15 Hz are given. Note that an extracted representation still includes all representations with a spatial resolution equal to or smaller than the spatial resolution of the extracted representation and a frame rate equal to or smaller than the frame rate of the extracted representation. Thus, the extracted representation can also be used for further extractions. For instance, the bit-stream sub.264 obtained with
>BitStreamExtractorStatic subP1.264 sub.264 -l 0 -t 2
is identical to the bit-stream subP0.264 of Example 56. After the extraction of spatio-temporal sub-streams in Example 56, these sub-streams are decoded, and the PSNR and the rate are measured similar to Example 50 for single-layer coding. Note again that the measured rate and PSNR values are identical to the values that are reported in the encoder output, with the exception of minor rate differences. The reason for these rate differences has been described in section 3.1.
Example 56: Example for extracting and decoding a spatio-temporal sub-sequence
>BitStreamExtractorStatic test.264 subP0.264 -l 0 -t 2
>BitStreamExtractorStatic test.264 subP1.264 -l 1 -t 3
>H264AVCDecoderLibTestStatic subP0.264 decP0.yuv
>H264AVCDecoderLibTestStatic subP1.264 decP1.yuv
>PSNRStatic 176 144 BUS_QCIF15.yuv decP0.yuv 1 0 subP0.264 15 2> PSNR.dat
>PSNRStatic 352 288 BUS_CIF30.yuv decP1.yuv 1 0 subP1.264 30 2>>PSNR.dat
>type PSNR.dat
123,4611  34,7798  40,9418  42,2639
647,7872  34,1147  40,9572  42,9462
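The values in PSNR.dat are written with a decimal comma in this example, so they have to be converted before they can be compared with the encoder summary. The following sketch assumes that each line holds four values in the order rate, Y-PSNR, U-PSNR, V-PSNR; this order is inferred from the match with the encoder output and is not stated explicitly. The function name is hypothetical.

```python
def parse_psnr_line(line):
    """Convert one PSNR.dat line with decimal commas into named float values."""
    fields = [float(token.replace(",", ".")) for token in line.split()]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    rate, y_psnr, u_psnr, v_psnr = fields
    return {"rate": rate, "y_psnr": y_psnr, "u_psnr": u_psnr, "v_psnr": v_psnr}


qcif = parse_psnr_line("123,4611 34,7798 40,9418 42,2639")
cif = parse_psnr_line("647,7872 34,1147 40,9572 42,9462")

# Difference between the measured rate and the rate reported by the encoder
# for the QCIF representation at 7.5 Hz (122.6842 kbit/s in Example 54)
rate_difference = qcif["rate"] - 122.6842
```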
scalability MGS. In principle, CGS and MGS coding are identical to spatial scalable coding, with the only exception that all layers have an identical spatial resolution. Thus, for generating a CGS bit-stream, the examples in section 3.3 can be used; only the spatial resolution of the layers has to be modified so that SourceWidth and SourceHeight are identical for all layers. Similar to spatial scalable coding, only a limited set of extractable rate points is included in a bit-stream. In contrast to CGS coding, MGS coding supports many more rate points, since an arbitrary subset of an MGS layer's packets may be extracted. Furthermore, an encoder may choose to partition the transform coefficients of a layer into up to 16 MGS layers, which increases the number of packets and thus the number of subsets of packets. In this way, a finer granularity is achieved.
In Example 57, Example 58, and Example 59, configuration files for SNR scalable coding with MGS are depicted. The main difference to the configuration files for single-layer coding (see section 3.1) is that the MGS parameter CgsSnrRefinement is set to 1 and that an MGS vector is specified in layer1.cfg. This vector specifies that for layer1.cfg, the transform coefficients are written into three slices. Additionally, the parameter MGSControl is set equal to 2 in the main configuration file, which specifies that the encoder control is operated by closing the prediction loop at the lowest and the highest rate point.
Example 57: Example main configuration file main.cfg for SNR scalable coding
# JSVM Main Configuration File
OutputFile         test.264    # Bitstream file
FrameRate          30.0        # Maximum frame rate [Hz]
FramesToBeEncoded  150         # Number of frames (at input frame rate)
GOPSize            16          # GOP Size (at maximum frame rate)
BaseLayerMode      2           # Base layer mode (0,1: AVC compatible, 2: AVC w subseq SEI)
CgsSnrRefinement   1           # SNR refinement as 1: MGS; 0: CGS
EncodeKeyPictures  1           # Key pics at T=0 (0:none, 1:MGS, 2:all)
MGSControl         2           # ME/MC for non-key pictures in MGS layers (0:std, 1:ME with EL, 2:ME+MC with EL)
SearchMode         4           # Search mode (0:BlockSearch, 4:FastSearch)
SearchRange        32          # Search range (Full Pel)
NumLayers          2           # Number of layers
LayerCfg           layer0.cfg  # Layer configuration file
LayerCfg           layer1.cfg  # Layer configuration file
Example 58: Example layer configuration file layer0.cfg for SNR scalable coding
# JSVM Layer Configuration File
InputFile      BUS_CIF30.yuv  # Input file
SourceWidth    352            # Input frame width
SourceHeight   288            # Input frame height
FrameRateIn    30             # Input frame rate [Hz]
FrameRateOut   30             # Output frame rate [Hz]
MGSVectorMode  0              # MGS vector usage selection
Example 59: Example layer configuration file layer1.cfg for SNR scalable coding
# JSVM Layer Configuration File
InputFile       BUS_CIF30.yuv  # Input file
SourceWidth     352            # Input frame width
SourceHeight    288            # Input frame height
FrameRateIn     30             # Input frame rate [Hz]
FrameRateOut    30             # Output frame rate [Hz]
InterLayerPred  1              # Inter-layer Prediction (0: no, 1: yes, 2:adaptive)
MGSVectorMode   1              # MGS vector usage selection
MGSVector0      4              # Specifies 0th position of the vector
MGSVector1      4              # Specifies 1st position of the vector
MGSVector2      8              # Specifies 2nd position of the vector
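The partitioning that is specified by the MGS vector of layer1.cfg can be illustrated with a small sketch. It is assumed here that each vector entry gives the number of transform coefficients (in scan order) that are assigned to one MGS layer and that the entries have to sum up to the 16 coefficients of a 4x4 block; this interpretation is an assumption, and the function name is hypothetical.

```python
def mgs_coefficient_ranges(mgs_vector, coeffs_per_block=16):
    """Map an MGS vector to inclusive (first, last) coefficient index ranges."""
    if not 1 <= len(mgs_vector) <= coeffs_per_block:
        raise ValueError("an MGS vector needs between 1 and 16 entries")
    if any(count <= 0 for count in mgs_vector):
        raise ValueError("every MGS vector entry must be positive")
    if sum(mgs_vector) != coeffs_per_block:
        raise ValueError("MGS vector entries must sum up to 16 (assumed rule)")
    ranges = []
    start = 0
    for count in mgs_vector:
        ranges.append((start, start + count - 1))
        start += count
    return ranges


# MGSVector0 = 4, MGSVector1 = 4, MGSVector2 = 8 as in layer1.cfg
ranges = mgs_coefficient_ranges([4, 4, 8])
```

For the vector of the example, the three slices would carry the coefficient positions 0 to 3, 4 to 7, and 8 to 15, respectively.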
Example 60 illustrates a possible encoder call for generating an SNR scalable bit-stream with the given configuration files. In addition to the option -pf (main configuration file), the options -lqp and -rqp are specified for both layers independently, where the first number after -lqp or -rqp determines the layer and the second number determines the value. The option -lqp is known from the examples for single-layer and spatial scalable coding (sections 3.1 and 3.3). This option specifies the quantization parameter and the Lagrangian parameters for mode decision and motion estimation. The option -rqp specifies the quantization parameter only and thus overwrites the value previously given by -lqp. As a consequence, the option -lqp effectively specifies only the Lagrangian multipliers for motion estimation and mode decision (as a QP equivalent), while the actually used quantization parameter is specified by the option -rqp. For rate-distortion efficient MGS coding, the values of MeQP (option -lqp in the example) that are used for determining the Lagrangian parameters should be set to smaller values than the actual quantization parameter (option -rqp in the example). The optimal difference depends on the number of MGS layers that should be encoded as well as on the sequence content. For SNR scalable coding with one MGS layer, a QP difference of 1 or 2 is reasonable in most cases.
Example 60: Example encoder call for SNR scalable coding
>H264AVCEncoderLibTestStatic -pf main.cfg -lqp 0 30 -rqp 0 32 -lqp 1 24 -rqp 1 26
...
SUMMARY:
                      bitrate    Min-bitr    Y-PSNR    U-PSNR    V-PSNR
                    ---------  ----------  --------  --------  --------
352x288 @  1.8750    475.8690    274.9965   42.1823   45.2743   46.7233
352x288 @  3.7500    661.9200    372.1784   40.2007   44.3558   46.0389
352x288 @  7.5000    883.4511    466.8632   38.8771   43.7826   45.6203
352x288 @ 15.0000   1173.8320    581.3968   37.8877   43.3222   45.2689
352x288 @ 30.0000   1493.6992    710.0240   36.9683   42.9079   44.9354
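The rule that the QP equivalent for the Lagrangian parameters (-lqp) should be slightly smaller than the actual quantization parameter (-rqp) can be captured in a small helper that assembles the option list. This is only a sketch; the function name and the default offset of 2 are illustrative choices, and only the options -pf, -lqp and -rqp described above are used.

```python
def mgs_qp_options(layer_qps, me_qp_offset=2):
    """Build -lqp/-rqp options; layer_qps maps layer index -> quantization parameter."""
    args = []
    for layer in sorted(layer_qps):
        qp = layer_qps[layer]
        # -lqp carries the (smaller) QP equivalent for the Lagrangian parameters,
        # -rqp carries the quantization parameter that is actually used
        args += ["-lqp", str(layer), str(qp - me_qp_offset),
                 "-rqp", str(layer), str(qp)]
    return args


command = ["H264AVCEncoderLibTestStatic", "-pf", "main.cfg"]
command += mgs_qp_options({0: 32, 1: 26})
command_line = " ".join(command)
```

With quantization parameters of 32 and 26 and an offset of 2, the resulting command line corresponds to the encoder call of Example 60.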
Example 61: Example of supported scalable layers for SNR scalable coding
>BitStreamExtractorStatic test.264
Contained Layers:
====================
Layer  Resolution  Framerate  Bitrate  MinBitrate  DTQ
    0     352x288     1.8750   275.00      275.00  (0,0,0)
    1     352x288     3.7500   372.20      372.20  (0,1,0)
    2     352x288     7.5000   466.90      466.90  (0,2,0)
    3     352x288    15.0000   581.40      581.40  (0,3,0)
    4     352x288    30.0000   710.00      710.00  (0,4,0)
    5     352x288     1.8750   344.60              (0,0,1)
    6     352x288     1.8750   402.60              (0,0,2)
    7     352x288     1.8750   475.90              (0,0,3)
    8     352x288     3.7500   478.00              (0,1,1)
    9     352x288     3.7500   561.20              (0,1,2)
   10     352x288     3.7500   661.90              (0,1,3)
   11     352x288     7.5000   628.80              (0,2,1)
   12     352x288     7.5000   746.70              (0,2,2)
   13     352x288     7.5000   883.50              (0,2,3)
   14     352x288    15.0000   821.70              (0,3,1)
   15     352x288    15.0000   988.90              (0,3,2)
   16     352x288    15.0000  1173.80              (0,3,3)
   17     352x288    30.0000  1038.40              (0,4,1)
   18     352x288    30.0000  1261.20              (0,4,2)
   19     352x288    30.0000  1493.70              (0,4,3)
In Example 60, the encoder output is depicted. It is part of the output of the bit-stream extractor in Example 61, which additionally shows the bit-rates for the MGS layers that are introduced by the MGS vector specified in layer1.cfg. These values are not the only extractable rate points, since an MGS layer consists of many packets (one for each slice). When we look at a frame rate of 30 Hz, many bit-rates inside the interval from 710 kbit/s to 1493 kbit/s can be extracted from the generated bit-stream by extracting a subset of all packets of an MGS layer. However, such a subset has to be chosen carefully. In Example 62, it is illustrated that it is possible to extract almost any rate inside this interval. In this example, a target rate of 768 kbit/s is specified via the command line option -e for the bit-stream extractor, and the rate that is measured by the PSNR tool verifies that this bit-rate was really extracted. The output of the bit-stream extractor shows that this is reached by extracting all packets with quality_id equal to 0 and 38 of 75 packets with quality_id equal to 1. This means that the bit-stream extractor has chosen a subset.
Example 62: Example for extracting and decoding a sub-sequence in SNR scalable coding
>BitStreamExtractorStatic test.264 substream.264 -e 352x288@15:768
Contained Layers:
====================
Layer  Resolution  Framerate  Bitrate  MinBitrate  DTQ
    0     352x288     1.8750   275.00      275.00  (0,0,0)
    1     352x288     3.7500   372.20      372.20  (0,1,0)
    2     352x288     7.5000   466.90      466.90  (0,2,0)
    3     352x288    15.0000   581.40      581.40  (0,3,0)
    4     352x288    30.0000   710.00      710.00  (0,4,0)
    5     352x288     1.8750   344.60              (0,0,1)
    6     352x288     1.8750   402.60              (0,0,2)
    7     352x288     1.8750   475.90              (0,0,3)
    8     352x288     3.7500   478.00              (0,1,1)
    9     352x288     3.7500   561.20              (0,1,2)
   10     352x288     3.7500   661.90              (0,1,3)
   11     352x288     7.5000   628.80              (0,2,1)
   12     352x288     7.5000   746.70              (0,2,2)
   13     352x288     7.5000   883.50              (0,2,3)
   14     352x288    15.0000   821.70              (0,3,1)
   15     352x288    15.0000   988.90              (0,3,2)
   16     352x288    15.0000  1173.80              (0,3,3)
   17     352x288    30.0000  1038.40              (0,4,1)
   18     352x288    30.0000  1261.20              (0,4,2)
   19     352x288    30.0000  1493.70              (0,4,3)

============Extraction Information======
Extracted spatail layer : 352x288
Extracted temporal rate : 15f/s
quality_id statistics for dependency_id 0
===========================================
quality_id 0 - total: 75 retained: 75
quality_id 1 - total: 75 retained: 38
Number of input packets:  906
Number of output packets: 269

>H264AVCDecoderLibTestStatic substream.264 dec.yuv
>PSNRStatic 352 288 BUS_CIF30.yuv dec.yuv 1 0 substream.264 30 2>PSNR.dat
>type PSNR.dat
747,0656  34,9682  42,3511  44,4188
The rate-distortion efficiency of sub-streams that are extracted from an SNR scalable bit-stream can be increased when quality layer information is inserted into the bit-stream. Example 63 shows how the quality level assigner can be used for this purpose. The input bit-stream test.264, which has been generated by the encoder, is investigated by a rate-distortion analysis, for which the original sequence BUS_CIF30.yuv is required. The determined quality layer information, which can be employed by the bit-stream extractor, is inserted into the newly created output bit-stream testQL.264. Apart from the additional quality layer information, both bit-streams test.264 and testQL.264 are identical. More information about the quality level assigner is given in section 2.6.
Example 63: Example for using the quality level assigner for optimizing an SNR scalable bit-stream
>QualityLevelAssignerStatic -in test.264 -org 0 BUS_CIF30.yuv -out testQL.264
In the following, it is shown that the additional quality information really improves the rate-distortion efficiency of the extracted sub-streams. Example 64 and Example 65 show two batch files for extracting and decoding a set of 19 rate points from an SNR scalable bit-stream. In Example 66, these batch scripts are used for generating sets of R-D values. For the first set of R-D values, Standard.dat, the standard method for extracting rate points is employed with the original bit-stream test.264. An identical set of R-D values is obtained when the bit-stream test.264 is replaced by the bit-stream testQL.264. For the second set of R-D values, QualityLayer.dat, the quality layer information that has been inserted in the bit-stream testQL.264 is employed for an optimized extraction of rate points. The optimized extraction method is specified by the option -ql. Note, however, that it is not possible to use this option with a bit-stream that does not contain information about quality layers.
Example 64: Batch file decodePoint.bat for extracting and decoding a rate point of a SNR scalable stream.
:: (1:stream) (2:rate) (3:-ql)
BitStreamExtractorStatic %1 tmp.264 -e 352x288@30:%2 %3
H264AVCDecoderLibTestStatic tmp.264 tmp.yuv
PSNRStatic 352 288 BUS_CIF30.yuv tmp.yuv 0 0 tmp.264 30
Example 65: Batch file decode.bat for extracting and decoding a set of rate points of a SNR scalable stream.
:: (1:stream) (2:-ql)
FOR /L %%i IN (700,50,1600) DO @CALL decodePoint.bat %1 %%i %2
Example 66: Example for generating rate-distortion curves for SNR scalable bit-streams.
>decode.bat test.264 2>Standard.dat
>decode.bat testQL.264 -ql 2>QualityLayer.dat
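The loop of Example 65 and the extractor call of Example 64 can be mirrored in a short sketch that lists the rate points and assembles the extractor arguments without executing anything. The function names are hypothetical; the inclusive end value reflects the behaviour of the FOR /L statement of the Windows command interpreter.

```python
def rate_points(start=700, step=50, stop=1600):
    """Rate points in kbit/s; the end value is included, as with FOR /L."""
    return list(range(start, stop + 1, step))


def extractor_args(stream, rate, use_quality_layers=False, out="tmp.264"):
    """Argument list for one extraction at CIF resolution and 30 Hz."""
    args = ["BitStreamExtractorStatic", stream, out, "-e", f"352x288@30:{rate}"]
    if use_quality_layers:
        args.append("-ql")  # only valid for streams with quality layer information
    return args


points = rate_points()
calls = [extractor_args("testQL.264", rate, use_quality_layers=True) for rate in points]
```

The range from 700 to 1600 kbit/s in steps of 50 kbit/s yields the 19 rate points mentioned above.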
In Figure 3, the generated sets of R-D points are plotted in a diagram. It can be clearly seen that the rate-distortion efficiency is improved when using the quality layer information. For the lowest and highest rate point, the coding efficiency is identical.
[Figure 3: R-D points for the two extraction methods (legend: Standard, Quality Layer); vertical axis: Y-PSNR [dB], 34 to 37]
[Main configuration file for combined scalable coding: parameter names and values are not available here; only the comment column is reproduced]
# Bitstream file
# Maximum frame rate [Hz]
# Number of frames (at input frame rate)
# GOP Size (at maximum frame rate)
# SNR refinement as 1: MGS; 0: CGS
# Key pics at T=0 (0:none, 1:MGS, 2:all)
# ME/MC for non-key pictures in MGS layers (0:std, 1:ME with EL, 2:ME+MC with EL)
# Base layer mode (0,1: AVC compatible, 2: AVC w subseq SEI)
# Search mode (0:BlockSearch, 4:FastSearch)
# Search range (Full Pel)
# Number of layers
# Layer configuration file
# Layer configuration file
# Layer configuration file
# Layer configuration file
Example 68: Example layer configuration file layer0.cfg for combined scalable coding
# JSVM Layer Configuration File
InputFile     BUS_QCIF15.yuv  # Input file
SourceWidth   176             # Input frame width
SourceHeight  144             # Input frame height
FrameRateIn   15              # Input frame rate [Hz]
FrameRateOut  15              # Output frame rate [Hz]
QP            34              # Quantization parameters
MeQP0         32              # QP for mot. est. / mode decision (stage 0)
MeQP1         32              # QP for mot. est. / mode decision (stage 1)
MeQP2         32              # QP for mot. est. / mode decision (stage 2)
MeQP3         32              # QP for mot. est. / mode decision (stage 3)
MeQP4         32              # QP for mot. est. / mode decision (stage 4)
MeQP5         32              # QP for mot. est. / mode decision (stage 5)
Example 69: Example layer configuration file layer1.cfg for combined scalable coding
# JSVM Layer Configuration File
InputFile       BUS_QCIF15.yuv  # Input file
SourceWidth     176             # Input frame width
SourceHeight    144             # Input frame height
FrameRateIn     15              # Input frame rate [Hz]
FrameRateOut    15              # Output frame rate [Hz]
InterLayerPred  1               # Inter-layer Pred. (0: no, 1: yes, 2:adap.)
QP              28              # Quantization parameters
MeQP0           28              # QP for mot. est. / mode decision (stage 0)
MeQP1           28              # QP for mot. est. / mode decision (stage 1)
MeQP2           28              # QP for mot. est. / mode decision (stage 2)
MeQP3           28              # QP for mot. est. / mode decision (stage 3)
MeQP4           28              # QP for mot. est. / mode decision (stage 4)
MeQP5           28              # QP for mot. est. / mode decision (stage 5)
Example 70: Example layer configuration file layer2.cfg for combined scalable coding
# JSVM Layer Configuration File
InputFile       BUS_CIF30.yuv  # Input file
SourceWidth     352            # Input frame width
SourceHeight    288            # Input frame height
FrameRateIn     30             # Input frame rate [Hz]
FrameRateOut    30             # Output frame rate [Hz]
InterLayerPred  2              # Inter-layer Pred. (0: no, 1: yes, 2:adap.)
QP              36             # Quantization parameters
MeQP0           34             # QP for mot. est. / mode decision (stage 0)
MeQP1           34             # QP for mot. est. / mode decision (stage 1)
MeQP2           34             # QP for mot. est. / mode decision (stage 2)
MeQP3           34             # QP for mot. est. / mode decision (stage 3)
MeQP4           34             # QP for mot. est. / mode decision (stage 4)
MeQP5           34             # QP for mot. est. / mode decision (stage 5)
Example 71: Example layer configuration file layer3.cfg for combined scalable coding
# JSVM Layer Configuration File
InputFile       BUS_CIF30.yuv  # Input file
SourceWidth     352            # Input frame width
SourceHeight    288            # Input frame height
FrameRateIn     30             # Input frame rate [Hz]
FrameRateOut    30             # Output frame rate [Hz]
InterLayerPred  1              # Inter-layer Pred. (0: no, 1: yes, 2:adap.)
QP              30             # Quantization parameters
MeQP0           30             # QP for mot. est. / mode decision (stage 0)
MeQP1           30             # QP for mot. est. / mode decision (stage 1)
MeQP2           30             # QP for mot. est. / mode decision (stage 2)
MeQP3           30             # QP for mot. est. / mode decision (stage 3)
MeQP4           30             # QP for mot. est. / mode decision (stage 4)
MeQP5           30             # QP for mot. est. / mode decision (stage 5)
The commands that are executed for generating the combined scalable bit-stream are given in Example 72. Let us assume that the input sequence is given in CIF resolution (352x288 samples) at a frame rate of 30 Hz. First, the downsampled sequence for the QCIF layers is generated by using the resampler. This sequence is required as input for the encoder. Thereafter, the actual encoding is executed.
Example 72: Example encoder call for combined scalable coding
>DownConvertStatic 352 288 BUS_CIF30.yuv 176 144 BUS_QCIF15.yuv 0 1
>H264AVCEncoderLibTestStatic -pf main.cfg
SUMMARY:
                      bitrate    Min-bitr    Y-PSNR    U-PSNR    V-PSNR
                    ---------  ----------  --------  --------  --------
176x144 @  1.8750     93.4125     47.0745   38.5063   42.4337   43.8098
176x144 @  3.7500    127.9137     64.0311   37.1204   42.0545   43.4157
176x144 @  7.5000    165.0995     81.1974   35.9945   41.7574   43.1639
176x144 @ 15.0000    210.6160    101.4816   35.1407   41.6518   43.0275
352x288 @  1.8750    404.2635    188.1210   38.7535   42.7444   44.3954
352x288 @  3.7500    547.1179    252.8147   37.1742   42.1826   44.0011
352x288 @  7.5000    703.0626    317.7947   35.9920   41.8283   43.7475
352x288 @ 15.0000    895.7664    396.9712   35.1249   41.6106   43.5677
352x288 @ 30.0000   1065.7728    475.8656   34.4091   41.3834   43.3742
Example 73: Example of supported scalable layers for combined scalable coding
>BitStreamExtractorStatic test.264
Contained Layers:
====================
Layer  Resolution  Framerate  Bitrate  MinBitrate  DTQ
    0     176x144     1.8750    47.10       47.10  (0,0,0)
    1     176x144     3.7500    64.00       64.00  (0,1,0)
    2     176x144     7.5000    81.20       81.20  (0,2,0)
    3     176x144    15.0000   101.50      101.50  (0,3,0)
    4     176x144     1.8750    93.40              (0,0,1)
    5     176x144     3.7500   127.90              (0,1,1)
    6     176x144     7.5000   165.10              (0,2,1)
    7     176x144    15.0000   210.60              (0,3,1)
    8     352x288     1.8750   234.50      188.20  (1,0,0)
    9     352x288     3.7500   316.70      252.80  (1,1,0)
   10     352x288     7.5000   401.70      317.80  (1,2,0)
   11     352x288    15.0000   506.10      397.00  (1,3,0)
   12     352x288    30.0000   585.00      475.90  (1,4,0)
   13     352x288     1.8750   404.30              (1,0,1)
   14     352x288     3.7500   547.10              (1,1,1)
   15     352x288     7.5000   703.10              (1,2,1)
   16     352x288    15.0000   895.80              (1,3,1)
   17     352x288    30.0000  1065.80              (1,4,1)
The extractable bit-rates (minimum and maximum value) for each spatio-temporal resolution are specified by the encoder output at the end of Example 72 as well as by the extractor output in Example 73. The spatio-temporal resolutions and the corresponding rate ranges that are supported by the generated combined scalable bit-stream are summarized in Table 5.
Table 5: Supported bit-rates for combined scalable coding

                       supported bit-rates [kbit/s]
temporal resolution    QCIF        CIF
1.875 Hz               47-93       234-404
3.75 Hz                64-128      316-547
7.5 Hz                 81-165      401-703
15 Hz                  101-210     506-895
30 Hz                  na          585-1065
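The rate ranges of Table 5 can be used to check whether a target rate lies inside the supported interval of a spatio-temporal resolution before the extractor is called. The following sketch only encodes the table values; the data structure and the function name are hypothetical.

```python
# (resolution, frame rate in Hz) -> (minimum, maximum) bit-rate in kbit/s, from Table 5
SUPPORTED_RATES = {
    ("QCIF", 1.875): (47, 93),
    ("QCIF", 3.75): (64, 128),
    ("QCIF", 7.5): (81, 165),
    ("QCIF", 15.0): (101, 210),
    ("CIF", 1.875): (234, 404),
    ("CIF", 3.75): (316, 547),
    ("CIF", 7.5): (401, 703),
    ("CIF", 15.0): (506, 895),
    ("CIF", 30.0): (585, 1065),
}


def is_rate_supported(resolution, frame_rate, target_rate):
    """True if the target rate lies inside the range listed for this resolution."""
    limits = SUPPORTED_RATES.get((resolution, frame_rate))
    if limits is None:  # e.g. QCIF at 30 Hz is not contained in the bit-stream
        return False
    low, high = limits
    return low <= target_rate <= high
```

For instance, a target rate of 768 kbit/s lies inside the range for CIF at 30 Hz, whereas no QCIF representation at 30 Hz is available.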
- Please check out the latest software version from the CVS. It is recommended to run (at least) the short-term validation scripts before integration to make sure that the software passes all tests before you start the integration. See section 4.3 for details on how to run the validation scripts.
- Do your integration. Please try hard to stick to the given time slot! Please obey the guidelines in section 4.1. In case you run into trouble meeting the deadline, please contact the software coordinators as soon as possible.
- Re-run the validation tests. To be considered as accepted, an implemented tool should "pass" all validation tests. In case the tests fail with a slight deviation from the target rate and PSNR values defined in the scripts, please contact the software coordinators to discuss whether the target points should be re-adjusted.
- Propose new tests that are designed to ensure that the tool you have implemented is not broken. The tests therefore do not need to demonstrate the performance of the tool but should rather validate its operability.
- Update the file changes.txt as well as the software manual.
To verify the validity of the claimed gains of the proposal, proponents are encouraged to provide verification results with the integrated JSVM software for their adopted tool at the next meeting.
- The integrated software shall compile without warnings when using the provided VS .NET workspaces as well as the Linux makefiles (i.e. using an up-to-date g++ compiler).
- Follow the coding style of the JSVM software. Use 2 (two) spaces for indentation, no tabs.
- Re-use code and integrate functionality where possible. Try to avoid redundant code.
- Do not change the meaning of existing input parameters, but define new ones if necessary (and applicable).
- Make sure that new parameters have meaningful default values. Tools should not be switched on by default (unless decided otherwise by the JVT).
- Do not re-structure the output of the compiled binaries (unless decided otherwise by the JVT).
- Please change the JSVM version number (i.e. the macro _JSVM_VERSION_ located in the file CommonDefs.h) to be in line with your integration tag (e.g. 5.10).
- The corresponding proposal numbers (for reference),
- A zipped software package containing only the modified files, but in the correct directory structure without CVS directories,
- The output for the current validation scripts when running with the modified software,
- Proposals for modified and/or additional test configurations (also regarding updates of PSNR and rate values) for existing tools,
- Proposals for new short term tests (one or more) that validate the functionality of the new tool,
- A brief list of changes (to maintain the changes file changes.txt in the root directory of the JSVM software, the changes should be directly written to the file changes.txt),
- An updated version of the software manual (when parameters or tools have been added).
4.3.1 Structure
The root directory of the validation scripts contains the following file and sub-directories:
- run.pm (file): the main Perl script to be run in order to validate/evaluate the JSVM performance.
- Tools (dir): contains additional Perl scripts.
- SimuDataBase (dir): contains the simulations (i.e. test sets) data base to be run for the validation process.
- PacketLossSimulator (dir): contains the source code of a dummy packet loss simulator to be used for validation tests related to Error Concealment.
- 422_to_420fullres (dir): contains the source code of the SD 422 to 420 conversion tool proposed in [VCEG-059]. This tool has to be used to generate the input sequences for certain validation tests related to Interlaced material.
- Short_term: a set of several short tests (on a small number of pictures) which aim at validating most of the tools integrated in the JSVM software. Cross tests that run various tools together are also defined. Each time a tool is integrated, the proponents are invited to propose new dedicated short tests. The running time of this test set should be less than 3 hours.
- Long_term: a set of long tests that aim at assessing the sanity of the software over time and in more complex encoding/decoding scenarios. For the time being, only one simulation is present.
- AVC_Conformance: a set of AVC conformance tests which aim at validating the conformance of the JSVM decoder. The numbering of the conformance tests corresponds to the classification of the conformance bit-streams in the ITU-T Recommendation H.264.1 "Conformance specification for H.264 advanced video coding".
4.3.2.2 Interlaced
In order to run the Interlaced validation scripts CFG1_P2I, CFG4_I2I, MGSCtrl3, MGSCtrl4, LCFG1_P2I, and LCFG4_I2I, the user has to download the SD sequence called src5_ref__625.yuv available at ftp://vqeg.its.bldrdoc.gov/SDTV/VQEG_PhaseI/TestSequences/ALL_625/. This sequence is in 4:2:2 format. The conversion to 4:2:0 format shall be made using the conversion tool 422_to_420fullres with the -p option as follows:
> 422_to_420perl.exe -p src5_ref__625.yuv CANOA_720x576_25i.yuv
The resulting CANOA_720x576_25i.yuv is then used for the Interlaced validation scripts (in both the Short_term and Long_term test sets).
4.3.2.3 AVC_Conformance
Before running the AVC_Conformance test set, the user has to download the conformance bitstreams and YUV sequences available at http://ftp3.itu.ch/av-arch/jvt-site/draft_conformance/. Then, the dump.pm Perl script (located in the SimuDataBase/AVC_Conformance directory) should be used in order to copy the sequences and bitstreams to the correct locations (or to remove them). Example 74 illustrates the dump.pm usage.
Example 74: Using the dump.pm script
USAGE: dump.pm
--------------
[-c]                                 : to copy the "conformance sequences and bitstreams" in the corresponding simus directories.
[-r]                                 : to remove the "conformance sequences and bitstreams" of each simus directories.
[-simu <name_simu1>...<name_simuN>]  : name of the simulations to copy/remove.
[-data <yuv_streams_directory>]      : name of the directory containing the "conformance sequences and bitstreams".
[-which]                             : print the name of "conformance sequences and bitstreams" to be used.
[-u]                                 : Usage.
In order to find out which sequences and bitstreams are necessary to run a given AVC_Conformance test, the -which option can be used. Example 75 illustrates how the dump.pm script is used to determine which conformance bitstreams and sequences are needed for running the test named 6.6.1.
Example 75: Knowing the necessary conformance bitstreams and sequences
> perl dump.pm -simu 6.6.1 -which
Assume that the user has downloaded the conformance sequences and bitstreams into a directory named ConformanceData and intends to copy the data related to the tests named 6.6.1 and 6.6.11 to the correct directories. Example 76 illustrates how the dump.pm script should be run in this case.
Example 76: Copying conformance data using the dump.pm script
> perl dump.pm -c -data ./ConformanceData -simu 6.6.1 6.6.11
The name of the simulation set to be validated can also be specified by using the option -SimusetName, where SimusetName represents the name of the targeted test set. So far, the possible test set names are Short_term, Long_term and AVC_Conformance. To simplify the use of this option, the user may specify only a prefix of the name of the test set to be validated. Example 78 illustrates the use of the run.pm script when the user intends to validate the Long_term test set. In addition to the option -SimusetName, the user may explicitly specify the names of the tests to be run in the considered test set. For this purpose, the full names of the simulations (tests) to be run should be given, separated by white space, as follows: SimuName1 SimuName2 ... SimuNameN. The binaries directory and the directory of the input YUV sequences can be specified by using the options -bin and -seq, respectively.
Example 78: Validating the Long_term tests set
> perl run.pm -Long_term
is equivalent to
> perl run.pm -L
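The prefix matching for test set names can be pictured with the following sketch. It only illustrates the idea that a prefix selects a test set when it matches exactly one name; how run.pm resolves prefixes internally (for example with respect to upper and lower case) is not described here, and the function name is hypothetical.

```python
TEST_SETS = ["Short_term", "Long_term", "AVC_Conformance"]


def resolve_test_set(prefix):
    """Return the single test set whose name starts with the given prefix."""
    matches = [name for name in TEST_SETS if name.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"prefix {prefix!r} matches {len(matches)} test sets")
    return matches[0]


long_term = resolve_test_set("L")       # same result as the full name "Long_term"
short_term = resolve_test_set("S")
conformance = resolve_test_set("AVC")
```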
Example 79 illustrates the command line when the user intends to validate the tests named CAVLC, ESS, and T1 of the simulation set Short_term, assuming ../../bin as binaries directory and ../../orig as YUV sequences directory.
Example 79: Validating the tests CAVLC, ESS and T1 of the Short_term set
> perl run.pm -S CAVLC ESS T1 -seq ../../orig -bin ../../bin
A working directory named SimuRun is created (if it does not exist) while the validation scripts are running. For each test that is run, a detailed log file as well as a global log file is created at the root of the SimuRun directory. To be considered as validated, the global log file of a given simulation shall contain only "passed" messages. If "failed" or "no results" messages occur, this can result from an incorrect usage of the validation scripts or, worse, from a broken JSVM tool. Example 80 illustrates the log file CAVLC_Global.log assuming a successful validation of the CAVLC test.
Example 80: CAVLC_Global.log file assuming a successful validation of the CAVLC test.
==================================================
Run simu CAVLC:
------------------
Load Simu.............. ok
Create Sequences....... ok
Encode................. ok
Run Tests..............
-----------------------------------------------
L0 :: (176x144, 15) -> 268.2 - 34.47
-----------------------------------------------
Rate (265.4400)        Passed
PSNR (34.4708)         Passed
Encoder/Decoder match  Passed
-----------------------------------------------
L1 :: (352x288, 30) -> 1565 - 33.88
-----------------------------------------------
Rate (1563.3680)       Passed
PSNR (33.9121)         Passed
Encoder/Decoder match  Passed
-----------------------------------------------
Temp Base :: (352x288, 3.75) -> 805 - 37.05
-----------------------------------------------
Rate (801.6225)        Passed
PSNR (37.0608)         Passed
ok
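The acceptance rule for a global log file (only "passed" messages, no "failed" or "no results" messages) can be checked automatically. The sketch below assumes that these messages appear literally in the log text, possibly with different capitalization; the exact wording of the failure messages is not shown in this manual, and the function name is hypothetical.

```python
def log_is_validated(log_text):
    """True if the log reports at least one pass and no failure-like message."""
    lowered = log_text.lower()
    has_passed = "passed" in lowered
    # Assumed spellings of the problem messages mentioned in the text
    has_problem = "failed" in lowered or "no results" in lowered
    return has_passed and not has_problem


good_log = """\
Rate (265.4400) Passed
PSNR (34.4708) Passed
Encoder/Decoder match Passed
"""
bad_log = good_log + "PSNR (33.9121) Failed\n"

good_result = log_is_validated(good_log)
bad_result = log_is_validated(bad_log)
```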