Multimedia Systems: Sreeraj K. P. Asst. Professor, Dec, Rset
Introduction
Two approaches to analysis-synthesis filter design
Direct implementation of filter banks in the time domain (through FIR filters) with overlapping frequency-domain characteristics.
These require frequency-domain alias correction through the proper design of adjacent filter bank characteristics.
If ai = bi*
L : length of the filter (number of taps)
Substituting the expression for hi(n) into si(n), we obtain a DCT/DST of the input samples multiplied by the impulse response of the prototype low-pass filter.
The audio signal is shifted into a 512-sample X buffer, 32 samples at a time. The contents of the X buffer are multiplied by the C-window function c(n), and the results are stored in the Z buffer. The Z-buffer contents are divided into eight 64-element vectors (taking M = 32), which are summed to form a 64-element Y vector. The Y vector is transformed using the MDCT to yield the 32 subband samples.
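The analysis steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the real C-window coefficients come from a table in the MPEG audio standard and are not reproduced here, so `placeholder_window` is a hypothetical stand-in (a windowed sinc), and the cosine matrix is the form usually quoted for the MPEG-1 analysis matrixing, cos[(2i+1)(k-16)π/64]. The function names are invented for this sketch.

```python
import numpy as np

M = 32  # number of subbands


def placeholder_window(n_taps=512):
    """Hypothetical stand-in for the C-window c(n); NOT the standard's table."""
    n = np.arange(n_taps)
    return np.sinc((n - (n_taps - 1) / 2) / (2 * M)) * np.hanning(n_taps) / (2 * M)


def analysis_step(x_buf, new_samples, window):
    """Process one block of 32 input samples into 32 subband samples."""
    # Shift 32 new samples into the 512-sample X buffer (newest sample first).
    x_buf = np.concatenate((new_samples[::-1], x_buf[:-M]))
    # Window the buffer: Z = c(n) * X.
    z = x_buf * window
    # Split Z into eight 64-element vectors and sum them to get Y.
    y = z.reshape(8, 64).sum(axis=0)
    # Cosine matrixing of Y gives the 32 subband samples (assumed matrix form).
    i = np.arange(M)[:, None]
    k = np.arange(64)[None, :]
    matrix = np.cos((2 * i + 1) * (k - 16) * np.pi / 64)
    return x_buf, matrix @ y
```

A caller would keep `x_buf` between calls, starting from `np.zeros(512)`, and feed the input 32 samples at a time.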
The 32 subband samples are transformed back into the 64-element V vector using the inverse MDCT (IMDCT). The V vector is pushed into a FIFO that stores the last 16 V vectors. A U vector is created from alternate 32-component blocks of the FIFO, and a window (called the D-window) is applied to U to produce the W vector, which is divided into 16 vectors of 32 values each. These 16 vectors are added together to obtain the 32 output samples.
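The synthesis steps can be sketched in the same way. Again this is a minimal illustration, not the reference decoder: the D-window is passed in as an argument because its coefficients come from a table in the standard that is not given here, the cosine matrix cos[(16+i)(2k+1)π/64] and the block selection used to build U are the forms usually quoted for MPEG-1, and `synthesis_step` is a name invented for this sketch.

```python
import numpy as np


def synthesis_step(v_fifo, subbands, d_window):
    """Turn 32 subband samples into 32 output samples.

    v_fifo   : 1024 values, i.e. the last 16 V vectors (64 values each)
    subbands : 32 subband samples
    d_window : 512 D-window coefficients (supplied by the caller)
    """
    # Inverse matrixing: 32 subband samples -> 64-element V vector.
    i = np.arange(64)[:, None]
    k = np.arange(32)[None, :]
    n_matrix = np.cos((16 + i) * (2 * k + 1) * np.pi / 64)
    v_new = n_matrix @ subbands
    # Push the new V vector into the FIFO and drop the oldest one.
    v_fifo = np.concatenate((v_new, v_fifo[:-64]))
    # Build U from alternate 32-component blocks of the FIFO.
    blocks = v_fifo.reshape(8, 128)
    u = np.concatenate((blocks[:, :32], blocks[:, 96:]), axis=1).reshape(512)
    # Apply the D-window, then add the sixteen 32-value vectors.
    w = u * d_window
    return v_fifo, w.reshape(16, 32).sum(axis=0)
```

The caller keeps `v_fifo` between calls, starting from `np.zeros(1024)`.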
Psychoacoustic Models
Model 1:
Computationally simple. Gives high accuracy at high bit rates.
Model 2:
Computationally complex. Gives high accuracy at low bit rates.
Psychoacoustic model I
The auditory spectrum is approximated by a list of tonal and non-tonal components. Tonal components are selected by identifying the local maxima of the spectrum, i.e. the lines whose level is the greatest in their neighbourhood. All the remaining spectral lines are used to calculate the non-tonal components: they are grouped into critical bands, and each critical band is represented by a single non-tonal component. The list of tonal and non-tonal (noise) components is then decimated by eliminating the components that lie below the auditory threshold or that are less than one half of a critical bandwidth from a neighbouring component.
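The selection of tonal and non-tonal components can be illustrated with a simplified sketch. The neighbourhood size `reach`, the 7 dB margin, the choice to remove one line on either side of each tonal peak, and the `band_edges` layout are all assumptions made for this example; they are not taken from these notes, and the standard uses a frequency-dependent neighbourhood.

```python
import numpy as np


def find_tonal(spectrum_db, reach=2, margin_db=7.0):
    """Return indices of local maxima that stand out from their neighbourhood."""
    offsets = [j for j in range(-reach, reach + 1) if abs(j) >= 2]
    tonal = []
    for k in range(reach, len(spectrum_db) - reach):
        x = spectrum_db[k]
        # Must be a local maximum with respect to the adjacent lines.
        if not (x > spectrum_db[k - 1] and x >= spectrum_db[k + 1]):
            continue
        # Must exceed the wider neighbourhood by the assumed margin.
        if all(x - spectrum_db[k + j] >= margin_db for j in offsets):
            tonal.append(k)
    return tonal


def nontonal_per_band(spectrum_db, tonal, band_edges):
    """Sum the power of the remaining lines into one component per band."""
    keep = np.ones(len(spectrum_db), dtype=bool)
    for k in tonal:
        keep[max(k - 1, 0):k + 2] = False  # lines assigned to a tonal component
    power = 10.0 ** (np.asarray(spectrum_db) / 10.0)
    levels = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        p = power[lo:hi][keep[lo:hi]].sum()
        levels.append(10.0 * np.log10(p) if p > 0 else -np.inf)
    return levels
```

The decimation step (dropping components below the auditory threshold or too close to a neighbour) would follow on these two lists.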
To compute the masking effect of a tonal or non-tonal component on neighbouring spectral frequencies, two terms are added to the strength of the component: the masking index and the masking function.
Masking index: an attenuation term that depends on the critical band rate of the component and on whether the component is tonal or non-tonal.
Masking function: an attenuation factor that depends on (1) the displacement of the component from the neighbouring frequency and (2) the signal strength of the component.
Tonal masking index
For a tonal component j at critical band rate z(j), the masking threshold LTtm(j,i) at critical band rate z(i) is given by
LTtm(j,i) = Xtm(j) + avtm(z(j)) + vf[z(i) - z(j), Xtm(j)]
Xtm(j) : the strength of the tonal component j
avtm(z(j)) : the tonal masking index at the critical band rate z(j)
vf(dz, X) : the masking function, where the first argument dz = z(i) - z(j) is the displacement and the second argument X is the signal strength
For a non-tonal component j, the masking threshold is calculated in the same way, using the non-tonal masking index avnm:
LTnm(j,i) = Xnm(j) + avnm(z(j)) + vf[z(i) - z(j), Xnm(j)]
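As a worked illustration of the two threshold formulas, the sketch below evaluates LT(j,i) for one masker. The numerical coefficients of the masking index and of the piecewise masking function are not given in these notes; the values used here are the ones commonly quoted for Model 1 of the MPEG-1 audio standard and should be treated as assumptions to be checked against the standard. The function names are invented for this example.

```python
def masking_index(z_j, tonal):
    """Masking index av(z(j)) in dB (assumed coefficients)."""
    if tonal:
        return -1.525 - 0.275 * z_j - 4.5
    return -1.525 - 0.175 * z_j - 0.5


def masking_function(dz, x):
    """Masking function vf(dz, X) in dB (assumed piecewise form).

    dz : displacement z(i) - z(j) in Bark
    x  : signal strength of the masking component in dB
    """
    if -3 <= dz < -1:
        return 17 * (dz + 1) - (0.4 * x + 6)
    if -1 <= dz < 0:
        return (0.4 * x + 6) * dz
    if 0 <= dz < 1:
        return -17 * dz
    if 1 <= dz < 8:
        return -(dz - 1) * (17 - 0.15 * x) - 17
    return float("-inf")  # outside this range the component is taken not to mask


def masking_threshold(x_j, z_j, z_i, tonal):
    """LT(j,i) = X(j) + av(z(j)) + vf[z(i) - z(j), X(j)]."""
    return x_j + masking_index(z_j, tonal) + masking_function(z_i - z_j, x_j)
```

For example, with these assumed coefficients a 60 dB tonal component at z(j) = 10 Bark gives a threshold of 51.225 dB at its own critical band rate, while a non-tonal component of the same strength gives 56.225 dB, so the tonal component masks less.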
The global masking threshold is computed at every spectral frequency by adding the masking thresholds computed above, for all the neighbouring tonal and non-tonal components, to the threshold of hearing. The minimum masking threshold for each sub-band is then determined as the minimum of all the global masking thresholds that fall within that sub-band. Finally, the signal-to-mask ratio (SMR) is computed for each sub-band.
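The last three steps can be sketched as follows. The sketch assumes that the thresholds are given in dB and are added as powers rather than as dB values, and that the SMR of a sub-band is its signal level minus its minimum masking threshold. The array layout and the names `global_threshold`, `smr_per_subband` and `subband_of_line` are choices made for this example.

```python
import numpy as np


def global_threshold(lt_quiet_db, lt_maskers_db):
    """Combine the threshold of hearing with the individual masking thresholds.

    lt_quiet_db   : threshold of hearing per frequency line, shape (n_lines,)
    lt_maskers_db : one row per tonal or non-tonal masker, shape (n_maskers, n_lines)
    """
    power = 10.0 ** (np.asarray(lt_quiet_db) / 10.0)
    power = power + (10.0 ** (np.asarray(lt_maskers_db) / 10.0)).sum(axis=0)
    return 10.0 * np.log10(power)


def smr_per_subband(signal_level_db, lt_global_db, subband_of_line):
    """Signal-to-mask ratio per sub-band, from the minimum global threshold in it."""
    n_sb = int(subband_of_line.max()) + 1
    lt_min = np.array(
        [lt_global_db[subband_of_line == sb].min() for sb in range(n_sb)]
    )
    return np.asarray(signal_level_db) - lt_min
```

Here `subband_of_line[k]` gives the sub-band index of frequency line k, and `signal_level_db` holds one level per sub-band.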
Psychoacoustic model II
It makes no explicit distinction between tonal and non-tonal components. The spectral data is transformed into a partition domain. A 1024-point FFT is used. Tonality is estimated from how unpredictable the spectrum is over time.
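One common way to measure "unpredictability over time" is to predict the magnitude and phase of each FFT line by linear extrapolation from the two previous frames and to compare the prediction with the actual value. The sketch below follows that idea; the Hann window, the extrapolation rule and the normalisation are assumptions made for this illustration and are not taken from these notes.

```python
import numpy as np


def unpredictability(x_t, x_t1, x_t2):
    """Per-line unpredictability, from 0 (predictable, tone-like) to 1 (noise-like).

    x_t, x_t1, x_t2 : current frame and the two previous frames (equal length,
                      e.g. 1024 samples each)
    """
    win = np.hanning(len(x_t))
    s, s1, s2 = (np.fft.rfft(win * f) for f in (x_t, x_t1, x_t2))
    # Linear extrapolation of magnitude and phase from the two previous frames.
    r_pred = 2.0 * np.abs(s1) - np.abs(s2)
    phi_pred = 2.0 * np.angle(s1) - np.angle(s2)
    pred = r_pred * np.exp(1j * phi_pred)
    # Distance between actual and predicted value, normalised to at most 1.
    denom = np.abs(s) + np.abs(r_pred)
    return np.abs(s - pred) / np.maximum(denom, 1e-12)
```

A steady sinusoid gives a value close to 0 at its own FFT line, while white noise gives large values on most lines, which is the property used to decide tonality.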