0% found this document useful (0 votes)
267 views

Modern Control Systems

sistenas de control moderno

Uploaded by

Elias Gabriel
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
267 views

Modern Control Systems

sistenas de control moderno

Uploaded by

Elias Gabriel
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 119
6.0 INTRODUCTION 374 ‘As we saw in Chapter 5,an LTI system with a rational system function bas the property that the input and output sequences satisfy linear constant-coefficient difference equa- tion, Singe the system function is the ¢-transform of the impulse response, and since the difference equation satisfice by the input and output can be determined by inspection Of the system function, follows that the difference equation, the impulse response, and the system function are equivalent characterizations of the input-output relation of an LU discrete-time system. When such systems arc implemented with discrete-time ana- og or digital hardware. the difference equation or the system function representation must be converted to an algorithm or structure that can be realized in the desired tech- nology. As we will see:in this chapter, systems deseribed by lineztr constant-coefficient difference equations can be represented hy structures consisting of an interconnection of the basic operations of addition, multiplication by a comslunt, and delay, the exac! implementation of which is dictated by the technology to be used. Asan illustration of the computation associated with 2 difference equation, con- sider the system described by the system function by + yet " Hg = TE. tl > ll. (6) “The impulse response of this system is ho] = Byala) + da" tuln 1, (62) and the ISLorder difference equation that is satisfied by the input and ontpat sequences is ylnd = aya — 1] = boxtad + bieln — 1. (63) Section 6.1 Block Diagram Representation of Linear Constant-Coefficient Difference Equations 375 ‘Fquation (6.2) gives a formula for the impulse response for this system. However, since the system impulse response has infinite duration, even if we only wanted to compute the output over a finite interval, it would not be efficient to do so by discrete convolution since the amount of computation required to compute y{n] would grow with 2. However, rewriting Eq. (6.3) in the form yin} = ayln — 1] + boxtn] + yzin = 1] (64) provides the basis for an algorithm for recursive computation of the output at any time n in terms of the previous output yl — 1}, the current input sample x[7], and the previous input sample x[v ~ 1. As discussed in Section 2.5, if we further assume initial-rest conditions (i.e. if x\x] = 0 for < 0,then y[n} = 0 for n = 0), and if swe use Eq. (64) as a recurrence formula for computing the output from past values of the output and present and past values of the input, the system will be linear and time invariant. A similar proceclure can be applied to the more general case of an "order difference equation. However, the algorithm suggested by Eq. (6.4), and generalization for higher-order difference equations is not the only computational algorithm for implementing a particular system, and often, itis not the best choice. As ‘we will see. an unlimited variety of computational structures result in the same relation between the input sequence x{n] and the output sequence y[n} Inthe remainder of this chapter. we consider the important issuesin the implemen- tation of LT! discrete-time systerns. We first present the block diagram and signal flow graph descriptions of computational structures for linear constant-coetficient difference equations representing LT causal systems.' Using a combination of algebraic manipu- ations and manipulations of block diagram representations, we derive a number of basic equivalent structuses for implementing a causal LTT system including lattice structures, Although two structures may be equivalent with regard to their input-output character istics for infinite-precision representations of coefficients and variables, they may have vastly different behavior when the numerical precision is limited. This is the major rea- son that it is of interest to study different implementation structures, The effects of finite- precision representation of the system coefficients and the effects of truncation orround- ing of intermediate computations are considered in the latter sections of the chapter. 6.1 BLOCK DIAGRAM REPRESENTATION OF LINEAR CONSTANT-COEFFICIENT DIFFERENCE EQUATIONS The implementation of an ITT discrete-time system by iteratively evaluating a recur- ence formula obtained from a difference equation requires that delayed values of the ‘output, input, and intermediate sequences be available. The delay of sequence values implies the need for storage of past sequence values, Also, we must provide means for multiplication of the delayed sequence values by the coefficients, as well as means for adding the resulting products. Therefore, the basic elements required for the implemen- tation of an LTT discrete-time system are adders, multipliers, and memory for storing #Such flow graphs are also called “networks” ip analogy tovtectrical circuit diagrams. We shall use the terms low graph, structure, and net work interchangeably with respect gruphic representations of difference equations Chapter Structures for Discrete-Time Systems vale ze) wb + abel wal ei LL ,Frizure 6.4 Block ciagram symbols tn Tie 71] (A) Addition of two sequences. . (b) Multiplication of a sequence by a © ‘constant, (t) Unit delay. delayed sequence values and coefficients. The interconnection of these basic elements is conveniently depicted by block diagrams composed of the: busic pictorial symbols shown in Figure 6.1, Figure 6, 1(a) represents the addition of two sequences. In general block diagram notation. an adder may have any number of inputs. However, in almost all practical implementations. adders have only two inpuls. In all the diagrams of this chapter, we indicate this explicitly by limiting the number of inpuls as in Figure 6.1(a}. Figure 6.1(b) depicts multiplication of a sequence by a constant, and Figure 6.1(c) de- picts delaying a sequence by one sample, In digital implementations, the delay operation can be implemented by providing a storage register for each unit delay that is required, For this reason, we sometimes refer to the operator of Figure 6.1(c} as a delay register. In analog disercte-time implementations such as switched-capacitor filters, the delays are implemented by charge storage devices. The unit delay system is represented in Fi ure 6.1(c) by its system function, 4, Delays of more than one sample can be denoted as in Figure 6.1(¢), with a system function of 2, where M is the number of samples of delay; however, the actual implementation of Af samples of delay would generally be done by cascading M unit delays, In an integrated-ciscuit implementation, these unit delays might form a shift register that is clocked at the sampling rate of the input signal. In a software implementation, M cascaded unit delays would be implemented as M consecutive memory registers Example 6.1 Block Diagram Representation of a Difference Equation Asan example of the representation of a difference equation in terms of the elements in Figure 6.t, consider the 2-order difference equation F vin} = ayy — 1] + any — 2] + boat 5) © The corresponding system function is HG 6.6) Tae =a The block diagram representation ofthe systemt realization based oa Eq. (6.5) isshowa in Figure 6.2, Such diagrams give a pictorial representation of a computational a 1 gorithm for ienplementing the system. When the system is implemented on either 4 Section 6.1 Blook Diagram Representation of Linear Constant-Coefficient Difference Equations 377 ©) general-purpose computer or a digital signal processing (DSP) chip, network structures || Stich asthe one shown in Figure 6.2serveas the basis for a program that implements the system. Ifthe system isimplemented with discrete components or asacomplete s with very large-scale integration (VLSI technology, the block diagram is the basis for determining a hardware architecture for the system. In both eases, diagrams such as | Figure 6.2 show expiicitly that we must provide storage for the delayed variables (in this case, yln—1Tand yr —2)) and also the coetficients ofthe difference equation (in this case, uj,.<), and hy). Furthermore, we see ftom Figure 62 that an output sequence value yf] is computed by first forming the products ayn — 1] and ayy {4 ~ 2}. then adding them, and, finally, adding the result to box[n, Thas, Figure 62 conveniently depicts the complexity of the associated computational algorithm, the steps of the algorithm, and the amount of hardware required to realize the system. ‘ yfn-2] Figure 6.2 Example of a block diagram representation ofa difference equation. Example 6.1 can be generalized to higher-order difference equations of the form? x « vit = Cayln - £1] =) bexin - (67) ‘ with the corresponding system function u Dobe HQ = _. (68) 1-Yae* Rewriting Eq, (6.7) as a recurrence formula for y{n]in terms of a linear combination of past values of the output sequence and current and past values of the input sequence 2 the form used in previous chaprers fora general ND order difference equation was y w Vevte a= Sie 4 = f= Tn the remainder of the book. it wil be more convenient 10 use the form in Eq. (67), where the ooetficient Of y{n) is normalized to unity und the coefficients asociated with the delayed output appear with a positive sigh afer they have been moved to the right-hand side of the equstion. (See Eq. (6.9).) 378 Ghapter6 Structures for Discrete-Time Systems. Jeads to the relation 8 yn] = Sayin — 41+ 7 basta — kh (69) i ‘The block diagram of Figure 6.3 is an explicit pictorial representation of Eq, (6.9) More precisely, it represents the pair of difference equations vin] = )Sbexin #1, (6.10a) im vlad = So ary mi “The assumption of a two-input adder implies that the additions are done in a specified order, hat is, Figure 6.3 shows that the products ay yu — NJ and ay-iyta — N +1] must be computed, then added, and the resulting sum added to ay—2¥ln — N +2], and soon, After y[n] has been computed, the delay variables must be updated by moving o{n — N + LJ into the register holding yl — NJ, and so on, with the newly computed ‘yla} becoming yf — 1] for the next iteration. A block diagram can be rearranged or modified in a variety of ways without chang- ing the overall system function. Each appropriate rearrangement represents a different computational algorithm for implementing the same system. For example, the block diagram of Figure 6.3 can be viewed as a cascade of two systems, the first represent- ing the computation of v(1| from x(n and the second representing the computation of {vl from v[n. Since each of the two systems is an LTT system (assuming initial-rest conditions for the delay registers), the order in which the two systems are cascaded can be reversed, as shown in Figure 6.4, without affecting the overall system Function. In Figure 6.4, for convenience, we have assumed that M = N. Clearly, there is no Loss of generality, since if M 3 NV, some of the coefficients ay or by in the figure would be zero, and the diagram could be simplified accordingly nk) + vl (6.10b) Figure 6.3 Block diagram rapresentation for a general N"-orger yIn-NY ifference equation. Section 6,1 Block Diagram Representation af Linear Constant-Goelficient Diference Equations 379 In terms of the system function H(z) in Eq. (68). Figure 6.3 can be viewed as an implementation of (z) through the decomposition a L H(2) = Hae) Hile) = 7 (She (11) 1- Yq * |? = or, equivalently, through the pair of equations Vz) = My(@)X (2) (6.128) 1 ¥ (5) = Hxz)¥(@) = | —{— ] ¥@) (6.120) 1-Slas* a Figure 6.4, on the other hand, represents H(z) as 1 + (6.13) 1 or, equivalently, through the equations pees 1 f Wiz) = Ha(z)X (2) = | —~——_ ] X@), (6.14a) 1- Yaz a YG) a = Hy(z)Wiz) = (Sact) wes (6.14b) = Chapter 6 Structures for Discrete-Time Systems lal by re a by wf] nN] Figure 6.4 Rearrangement of block diagram of Figure 6.3. Wa assume for ‘convenience that N= MIEN 2 M, win] some of the coefficients wil be zero Inthe time domain, Figure 6.4 and, equivalently, Eqs. (6.14a) andl (6.14b) can be repre- sented by the pair of difference equations wha] = Sawin — b+ xed, (6.153) i yin] =) bewtn — kD (6.15b) ‘The block diagrams of Figures 6.3 and 6.4 have several important differences. In Figure 6.3, the zeros of H(z), represented by H (2), are implemented first, followed by the poles, represented by H(z). In Figure 6.4, the poles are implemented first, followed by the zeros. Theoretically, the order of implementation does not affect the overall sys- tem function. However, as we will see, when a difference equation is implemented with finite-precision arithmetic, there can be a significant difference between two systems that are equivalent with the assumption of infinite precision arithmetic in the real num- ber system. Another important point concerns the number of delay elements in the two systems, As drawn, the systems in Figures 6.3 and 6.4 each have a total of (V + Mf) delay elements, However, the block diagram of Figure 6.4 can be redrawn by noting that exactly the same signal, wf], is stored in the two chains of delay elements in the Figure. Consequently, the two can be collapsed into one chain, as indicated in Figure 6.5, ‘The total number of delay elementsin Figure 6.5 is tess than or equal to the number required in either Figure 6.3 or Figure 6.4.and in fact itis the minimum number required toimplement asystem with system function given by Eq, (6.8). Specifically, the minimum number of delay elements required is, in general, max(W, Mf). An implementation with the minimum number of delay elements is commonly referred to a8 a canonic form Section 6:1 Block Diagram Ropresentation af Linear Constant-Coeflicient Difference Equations 381 win} To] Figure 6.5 Combination of delays in Figure 6.4, implementation. The noncanonic block diagram in Figure 6.3 is referred to as the direct form Fimplementation of the general Nth-order system because itis a direct realization of the difference equation satisfied by the input x[n] and the output yfa], which in Turn can he written directly from the system function by inspection. Figure 6.5 is often referred to as the direct form I or canonic direct form implementation. Knowing that Figure 6.5 is an appropriate realization structure for H(2) given by Eq. (6.8), We can 20 directly back and forth in a straightforward manner between the system function and the block diagram (or the equivalent difference equation). Example 6.2 Direct Form I and Direct Form | Implementation of an LTI System Consider the LTT system with system function e 14227 4) = ope OS [rise 1 40. ‘Comparing this system (unction with Eq. (6.8), We find bg = 1, bt = 2,4, = +15, and a = ~0.9, so it follows from Figure 63 that we can implement the system in direct form [block disgram as shown in Figure 6,6, Referring io Figure 6.5, we can also implement the system function in direct form 11, as shown in Figure 6,7. In both cases, note that the coetficients in the feedback branches in the block diagram have opposite signs from the corresponding coefficients of 2~? and ¢~? in Eq. (6.16). Although this ‘change of sign is sometimes confusing, itis essential to remember that the feedback coefficients (ax) slways have the opposite sign in the difference equation from their sign in the system function, Note also that the direct form 11 requires only two delay elements to implement #(¢), one less than the direct form | implementation. (6.16) 382 Chapter 6 Structures for Discrete-Time Systerns Figure 6.6 Direct form | implementation of Ea. (6.16). Figure 6.7 Direct form It implementation of Eq, (6.16), In the preceding discussion, we developed two equivalent block diagrams for im- plementing an LTT system with system function given by Eq. (6.8). These block diagrams, which represent different computational algorithms for implementing the systern, were obtained by manipulations based on the Linearity of the system and the algebraic prop- erties of the system function. Indeed, since the basic difference equations that represent an LTI system are linear, equivalent sets of difference equations can be obtained sim- ply by linear transformations of the variables of the difference equations. Thus, there are an unlimited number of equivalent realizations of any given system. In Section 6.3, using an approach similar to that employed in this section, we will develop a number of other important and useful equivalent structures for implementing a system with system function as in Eq, (6.8). Before discussing these other forms, however, it is convenient to introduce signal flow graphs as an alternative to block diagrams for representing difference equations. 6.2 SIGNAL FLOW GRAPH REPRESENTATION OF LINEAR CONSTANT-COEFFICIENT DIFFERENCE EQUATIONS A signal flow graph representation of a difference equation is essentially the same as a block diagram representation, except for a few notational differences. Formally, a Section 6.2 Signal Flow Graph Representation 383 Noe Ke) a) / Figure 6.8 Example of nodes and Nodek branches ina signa flow graph signal flow graph is a network of directed branches that connect al nodes. Associated with each node is a variable or node value. The value associated with node & might be denoted wz, or, since node variables for digital filters are generally sequences, We often indicate this explicitly with the notation will, Branch (7,4) denotes a branch originating at node j and terminating at node k, with the direction from j to k being \dicated by an arrowhead on the branch. This is shown in Figure 6.8. Each branch has an input signal and an output signal, The input signal from node j to branch (j, 4) is the node value w;[n|. In a linear signal flow graph, which is the only class we will consider, the output of a branch is a finear transformation of the input to the branch, ‘The simplest example is a constant gain, ie., when the output of the branch is simply a constant multiple of the input to the branch. The linear operation represented by the branch is typically indicated next to the arrowhead showing the direction of the branch. For the case of a constant multiplier, the constant issimply shown next (o the arrowhead. When an explicit indication of the branch operatian is omitted, this indicates a branch transmittance of unity, or the identity transformation. By definition, the value at each node in a graph is the sum of the outputs of all the branches entering the node. To complete the definition of signal flow graph notation, we define two special types of nodes, Source nodes are nodes that have no entering branches, Source nodes are used to represent the injection of external inputs or signal sources into a graph. Sink nodes are nodes that have only entering branches. Sink nodes are used to extract outputs from a graph, Source nodes, sink nodes, and simple branch gains are illustrated in the signal flow graph of Figure 6,9, The linear equations represented by the figure are as follows: wn] = xn] + awin} + bw2 ln], waln| = cuilnl, (617) gla] = dein] +ewaln Source Rede fn] wf) Sink jrjnose Figure 6.9. Example ofa signal flow graph showing source and sink nodes. fr og Anata 384 Chapter Structures for Discrete-Time Systems Souree rode 0 8 vin] Figure 6.10 (2) Block diagram ropresentation ofa 1"'-order digital iter (b) Structure of the signal flow graph wy garespanlg fs Boek cea 5) fa. Addition, multiplication by a constant, and delay are the busie operations required toimplementa linear constant-cocfficient difference equation. Since these are all lincar operations, it is possible to use signal flow graph notation to depict algorithms for implementing LTI discrete-time systems. As an example of how the flow graph concepts just discussed can be applied to the representation of a difference equation, consider the block diagram in Figure 6.10(a), which is the direct form II realization of the system whose system function is given by Eq. (6.1). A signal flow graph corresponding to this system is shown in Figure 6.10(b). In the signal flow graph representation of difference equations, the node variables are sequences. In Figure 6.10(b), node Qis a source node whose value is determined by the input sequence x[n], and node 5 is a sink node whose value is denoted yl]. Notice that the source and sink nodes are connected to the rest of the graph by unity-gain branches to clearly denote the input and output of the system, Obviously, nodes 3 and 5 have icentical values, The extra branch with unity gain is simply used to highlight the fact that node 3 is the output of the system. Tn Figure 6.10(b), all branches except one (the delay branch (2, 4)) can be represented by a simple branch gain; ie, the output signal is a constant multiple of the branch input, A delay cannot be represented in the time domain by a branch gain, However, the z-transform representation of a unit delay is multiplication by the factor 2". Ifwe represented the difference equations by their corresponding z-transform equations, all the branches would be characterized by theirsystem {unctions. In this ease, each branch ain would be a function of z; eg., a unit delay branch would have a gain of 2~!. By convention, we represent the variables in a signal flow graph ax sequences rather than as z-transforms of sequences. However, to simplify the notation, we normally indicate 2 delay branch by showing its branch gain as <~!, bul it is understood that the output of such a branch is the branch input delayed by one sequence value. That is, the use of 2! in a signal flow graph is in the sense of an operator that produces a delay of one sample, The geaph of Figure 6,10(b) is shown in Figure 6.11 with this convention. The Section 82 ‘Signal Flow Graph Representation 385 will whe a, wale . 11 M = N, then Np = M — N; otherwise, the first summation in Eq. (6.34) is not included. If the coctficients ay and hy are real in Eq, (6.27), then the quantities Ay, Bg, Ce, ce, and eg are all real. In this form, the system function can be interpreted as representing a paraltel combination of i*- and 2°4-order LR systems, with possibly .V, simple scaled delay paths. Alternatively, we may group the real poles in pairs, so that 1(z) can be expressed as (6.35) YD Ny He) = Deets Yet kad k ee. aye where, as in the cascade form, Ny = L(V +1)/2| is the largest integer contained in (N-+1)/2,and if Np = M —N is negative, the first sumis not present. A typical example for N = M = Gisshownin Figure 6.20. The general difference equations for the parallel form with 2"-order direct form I] sections are wel] = aywaln — 1 bawele 24 ala}, k= 122M (6.360) Yale) = eoewaled + crawler ~ 1), kaL2 My (6,36b) , Ne yb = So Cutln — E+) vile (6.360) ra] ai If M < NW, then the first summation in Eq, (6.36c) is not included, 394 Chepter6 Structures for Discrete-Time Systems & will ey Figure 6.20 Parallel torm structure for 6"*-order system (MM — WW = 6) with the real and complex poles grouped in pairs. Example 6.6 Illustration of Parallel Form Structures Consider again the system function used in Examples 6.4 and 6.5. For the parallel form, fe mustexpress H(z) in the form of either Eq, (6.34) or Eq, (6.35). If we use order sections, l4mlyet T+ 87! Toonset yodas ? 7° * parm Ty ase | The parallel form realization for this example with a 2°-order section is shown in Figure 6.21 Since all the poles are real, we can obtain an alternative parallel orm realization by expanding H(z) as He (637) 448s 1-052) ~ 1-028 ‘The resulting parallel form with 1%-order sections is shown in Figure 6.22. As in the general ease, the difference equations represented by both Figures 6.21 and 6.22 can “: be written down by inspection. HE (638) ) Section 6.3 Basic Structures for IR Systems 395 xia) | Figure 6.21 Paral form structure for Example 6.6 using a 2"-order system. a Is ala via} 05 2s 028 Figure 6.22 Parallel form structure for Example 6.6 using 1*-order systems. 6.3.4 Feedback it IR Systems All the flow graphs of this section have feedback loops; ie., they have closed paths that begin at a node and return to that node by traversing branches only in the dircetion of their arrowheads, Such a structure in the flow graph implies that a node variable in aloop depends directly or indirectly on itself, A simple example is shown in Figure 6.23(a). which represents the difference equation ple] = ayfn — 11+ xfn]. (6.39) Chapter 8 Structures for Discrete-Time Systems ch vial a ° } Pa in) — le a Figure 6.23 (a) System with feedback loop. (b} FIR system with feedback loop te {c) Noncomputable system. Such loops are necessary (but not suificient) to generate infinitely long impulse responses. This can be seen if we consider a network with no feedback loops. In such a case, any path from the input to the output can pass through each delay element only once. Therefore, the longest delay between the input and output would occur for a path that passes through all of the delay elements in the network. Thus, For a network with no loops, the impulse response is no longer than the total number of delay ctements in the network. From this, we conclude that if a network hus no loops, then the system funetion has only zeros (except for poles at z = 0), and the number ef zeros can be no more than the number of delay elements in the network Returning to the simple example of Figure 6.23(a), we see that when the input is the impulse sequence 3{n, the single-input sample continually recirculates in the feedback loop with either increasing (if jal > 1) or decreasing (if Jal = 1) amplitude owing to multiplication by the constant a, so that the impulse response is hf] = aul] This illustrates how feedback can create an infinitely long impulse response. Ifa system function has poles a corresponding block diagram or signal flow graph will have feedback loops. On the other hand, neither poles in the system function nor loops in the network are sufficient for the impulse response to be infinitely long. Fig- ure 6.23(b) shows a network with a feedback Toop, but with an impulse response of finite length. This is because the pole of the system function cancels with a zero; he for Figure 6.23(b), (6.0) The impulse response of this system is ffm] = 3|7] + a3{n — 1). The system is a simple example of a general class of FIR systems called freguency-sampling systerns. This class of systems is considered in more detail in Problems 6.39 and 6,51. Section 6.4 Transposed Forms 397 Loops in a network pose special problems in implementing the computations im- plied by the network, As we have discussed, it must be possible to compute the node variables in a network in sequence such that all necessary values are available when needed. In. some cases, there is no way to order the Computations so that the node variables of a flow graph can be computed in sequence. Such a network is called non computable (see Crochiere and Oppenheim, 1975). A simple noncomputable network is shown in Figure 6.23(c). The difference equation for this network is yfe] = aytn} + af). (641) In this form, we cannot compute yin] because the right-hand side of the equation in- volves the quantity we wish to compute. The fact that a flow graph is noncomputable does not mean the equations represented by the flow graph cannot be solved; indeed, the solution to Eq. (6.41) is yfa] = x[u]/(1 a}. ILsimply means that the flow graph does nol represent a set of difference equations that can be solved successively for the mode variables. The key to the computability of a flow graph is that all loops must contain at east one unit delay element. Thus, in manipulating flow graphs representing implemen- ations of LL systems, we must be careful not to create delay-free loops. Problem 6.37 deals witha system having a delay-free loop. Problem 7.51 shows how a delay-free loop can be introduced. 6.4 TRANSPOSED FORMS ‘The theory oflinear signal flow graphs provides a variety of procedures for transforming, such geaphs into different forms while leaving the overall system function between input and output unchanged. One of these procedures, called flow graph reversal or transpasition, leads to a set of transposed system structures that provide some useful alternatives to the structures discussed in the previous section, Transposition of a flow graph is accomplished by reversing the directions of all branches in the metwork while keeping the branch transmittances as they were and reversing the roles of the input and output so that source nodes become sink nodes and vice versa. For single-input, single-output systems, the resulting flow graph has the same system function as the original graph if the input and output nodes are interchanged. Although we will not formally prove this result here,* we will demonstrate that iLis valid with two examples. Example 6.7 Transposed Form for a 1*-Order System with No Zeros The ISLorder system corresponding to the flow graph in Figure 6.24a) has system function (62) The theorem fotiows dtecty from Mason’s gain formuta of signal flow graph theory (Sec Mason and Zimmermann. 1960: Chow and Cassignol, 1962; 0: Phillips and Nagle, 1995.) 398, Chanter Structures for Diserate-Timo Systems 28 To obtain the teansposed form for this system, we reverse the directions of all the branch arrows, taking the output where the input was and injecting the input where the output was. The resull is shown in Figure 6.24{b). Tis usually convenient to draw the transposed network with the inputon the left and the output on the right, as shown in Figure 6,24(c). Comparing Figures 6,24(a) and 6.24(c) we note that the only differ cence is that in Figure 6.24(a), we multiply the delayed output sequence yin — 1 by the coefficient a, whereas in Figure 624(c) we multiply the output yfn] by the coetficient aand then delay the resulting prod uct. Since the two operations can be interchanged, ‘we can conclude by inspection that the original system in Figure &24(a) and the cor responding transposed system in Figure 6.24(c) have the same system function. Stee ae x td (a) ae - “Tie 5 ie rc) Weal ap ¢ 2 ©) EB a Figure 6.24 (a) Flow graph of simple 1**-order system, (b) Transposed form af (9 6) Sacro) redrawn wih np ona. In Example 6.7, itis straightforward tosee that the original system and its transpose have the same system function. However, for more complicated graphs. the result is often not so obvious. This is illustrated by the next example, Example 6.8 Transposed Form for a Basic 2"¢-Order Section Consider the basic 2"-ordex section depicted in Figure 6.25. The corresponding ditt ference equations for this system are vf] = ay wher — 11 + agteln — 2] =x Enh, (6.434) la] = bpwler} + by wl — 1) + bwhn — 2) (6.430) Section 64 Transposed Forms 399 ‘The transposed flow graph is shown in Figure 6.26; its corresponding difference equations are gla] = &pxie] + vba — 1, (@.44a) vf] = tot (6.44h) uyled = ay ybnd + byt} + natn — 1 (6.44) vbr] = agylay 4+ bax (4a) Equutions (643n)-(6.43b) and Eqs (6.440)-(6.44d) are different ways to ong nize the computation of the output samples [a] from the input samples x[n), and it is not immediately clent that the wo sets of difference equations are equivalent, One vay to show this equivalence is €0 use the z-transform representations of both sets of equations, solve for the ratio ¥(:)/X (s) = H(z) in both cases, and compare the results. Another way is £0 substitute Eq. (6.444) into Eq. (6.44e), substitute the result into Eq. (6.44), and finally, substitute that result into Ea. (6.44b). The final result is yim = aryl — 1) + avin —21-+ Bpxin] + bya ~ 1} + boxy — 21 (645) Since the network of Figure 625 isa direct form I structure, itis easily seen that the input and output of the system in Figure 6.25 also satisfies the difference Eq. (6.45). ‘Therefore, for initial-rest conditions, the systems in Figures 6.25 and 6.26 are equivalent. whol by Figure 6.25 Direct form I! structure for Example 6.8. ty tala) Figure 6.26 Transposed direct form Il structure for Example 6.8, The transposition theorem can be applied to any of the structures that we have discussed so far, For example, the result of applying the theorem to the direct form I structure of Figure 6.14 is shown in Figure 6.27, and similarly, the structure obtained by 400 Chapler6 Structures for Discrete-Time Systems Figure6.27 _Generalfiow graph resulting from applying the transposition theorem tothe direct form I structure of Figure 6.14. transposing the direct form Il structure of Figure 6.15 isshown in Figure 6.28. If signal flow graph configuration is transposed, the number of delay branches and the number of coefficients remain the same. Thus, the transposed direct form If structure is also a canonic structure. Transposed structures derived from direct forms are also “direct” in the sense that they can be constructed by inspection of the numerator and denominator of the system function, ‘An important point becomes evident through a comparison of Figures 6.15 and 6.28, Whereas the direct form II structure implements the poles first and then the zeros, the iransposed direct form IT structure implements the zeros first and then the poles. These differences can become important in the presence of quantization in finite- precision digital implementations or in the presence of noise in discrete-time analog implementations Figure6.28 _Goneraltlow graph resulting from applying the transposition theorem ta the direct form il structure of Figure 6.15. Section 65 asic Network Structures for FIR Systems aot When the transposition theorem is applied to cascade or parallel structures, the individual 2"-order systems are replaced by transposed structures. For example, ap- plying the transposition theorem 1o Figure 6, 18 results in a cascade of three transposed direct form IT sections (cach like the one in Example 6.8) with the same coefficients as in Figure 6.18, but with the order of the cascade reversed. A similar statement car be made about the transposition of Figure. 6.20. The transposition theorem further emphasizes that un infinite variety of imple- mentation structures exists for any given rational system function, The transposition theorem provides a simple procedure for generating new struetures. The problems of {implementing systems with finite-precision arithmetic have motivated the development of many more classes of equivalent structures than we can discuss here. Thus, we con- centrate only on the most commonly used structures. 6.5 BASIC NETWORK STRUCTURES FOR FIR SYSTEMS The direct, cascade, and parallel form structures discussed in Sections 6.3 and 6.4 are the most common basic structures for IIR systems, These structures were developed under the assumption that the system function had both poles and zeros. Although the direct and cascade forms for IIR systems include FIR systems as a special case, there are additional specific forms for FIR 6.5.1 Direct Form For causal FIR systems, the system function has only zeros (except for poles at z = 0). and since the coefficients ay are all zera, the difference equation of By. (6.9) reduces to “ vba] = So baxta 41, (6.46) i This can be recognized as the discrete convolution of xn] with the impulse response vin=[ In this case, the direct form J and direct form II structures in Figures 6.14 and 6.15 both reduce to the direct form FIR structure as redrawn in Figure 6.29. Because of the chain of delay elements across the top of the diagram, this structure is also referred to asa rapped delay line structure or a transversal filter structure. As seen from Figure 6.29. the signal at each tap along this chain is weighted by the appropriate coefficient {impulse- response value), and the resulting products are summed to form the output yl]. ‘The transposed direct form for the FIR case is obtained by applying the transpo- sition theorem to Figure 6.29 or, equivalently, by setting the coefficients a, 0 zero in Figure 6.27 or Figure 6.28. The result is shown in Figure 6.30. (6.47) oz Chapteré Structures for Discrete-Time Systems zt Tne f i hia) fala Figure 6.29 Direct form realization of an FIR systern ~ ~ ~ vin iy Angst] bata] dae] att) fo) Figure 6.30 Transposition of the network of Figure 6.29. ln] 6.5.2 Cascade Form The cascade form for FIR systems is obtained by factoring the polynomial system func- tion. ‘That is, we represent H(z) a8 Me ue ad = Sita" = [tbo + beet bo = i f (6.48) where M, = [(M + 1)/2] is the largest integer contained in (M + 1)/2. If M is odd, one of the coetficients bag will be zero, since A(z) in that case would have an odd number of real zeros. The flow graph representing Eq. (6.48) is shown in Figure 6.31, which is identical in form to Figure 6.18 with the coefficients ay, and az. all zero. Each of the 2"-order sections in Figure 6.31 uses the direet form shown in Figure 629. Another alternative is to use trunsposed dizeet form 2*€-order sections or, equivalently. to apply the transposition theorem to Figure 6.31 Figure 6.31 Cascade form realization af an FIR system, Section 6.5 Basic Network Structures for FIR Systems 403 6.5.3 Structures for Linear-Phase FIR Systems Tn Chapter 5, we showed that causal FTR systems have generalized linear phase if the impulse response satisfies the symmetry condi AIM —n] = hin] M (6.40) AIM =n) =—Aln] n= 0,1,.... (6.496) With either of these conditions, the number of coefficient multipliers can be es- sentially halved. To see this, consider the following manipulations of the discrete con- volution equation, assuming that Mf is an even integer corresponding to type T or type IIL systems: ve) = Salk lxtn — k1 im una a =O miktehe 1-4 HIM Qt — MP2 DS nko A i= 7a Mpa Mpa = Metahe 1-4 mE 2c — MAE data = MF i ard For type T systems, we use Eq. (6.49a) to obtain Mpa yl] = SD wleledn — kl tala — M+ RD + ALM 21x10 — M/2}. (6.50) mS For type Il systems, we use Bq. (6.49b) to obtain Mant ylal = s Alk](cln — k] —xla— M4 k)). (6.51) for} For the ease of M an odd integer, the corresponding equations are, for type TI systems, orn sil = SS Atal CeIn A+ ale = MF RD, (6.52) fod and, for type IV systems, otaay2 vin= Yo blkline] xdn AD. (6.53) ra) Equations (6.50)-{6.53) imply structures with either M/2+ 1, Mj2, or (M+ 1)/2 coefficient multipliers, rather than the M coefficient multipliers of the gencral direct form structure of Figure 6.29. Figure 6.32 shows the structure implied by Eq. (6.50), and Figure 6.33 shows the structure implied by Eq. (6.52). In our discussion of lincar-phase systems in Section 5.7.3, we showed that the symmetry conditions of Eqs. (6.49a) and (6.496) cause the zeros of H(z) to occur in mirror-image pairs. ‘That is, if zp is a zero of H(z), then 1/z9 is also a zero of Hz). Furthermore, if h{nl is real, then the zeros of H(z) occur in complex-conjugate pairs. Chapteré Structures for Discrete-Time Systems. tnd viel Figure 6.32 Direct form structure for an Fil even integer. so), ae f et} oa Paicar 3/2] falcan 1972], Figure 6.33 Direct form structure for an FIR Linear-phase system when Mis an 6d integer. As a consequence, real zeros not on the unit circle occur in reciprocal pairs Complex zexos not on the unit circle occur in groups of four, corresponding to the complex conjugates and reciprocals. [fa zeroison the unit citee, its reciprocal isalso its conjugate. Consequently, complex zeros on the unit citcle are conveniently grouped into pairs. Zeros at : = +1 are their own reciprocal and complex conjugate. The four cases are summarized in Figure 6.34, where the zeros at 21, 27, 1/21, and 1/2} are considered as a group of four. The zeros at zz and I/z2 are considered as a group of two, as are the zeros at z3 and z$, The zero at z4 is considered singly, If H(2) has the zeros shown in Figure 6.34, it can be factored into a product of 1°-, 2°4., and 4*h-order factors. Each of these factors is a polynomial whose coefficients have the same symmetry as the coeflicients of H(z); ie. each factor is a linear-phase polynomial in z~!, Therefore, the system can be implemented as a cascade of 1%, 2"*., and 4!"-order systems. For example, the system function corresponding to the zerus of Figure 6.34 can be expressed as H(z) = hfO\ + ¢ x(Ltee + de ter Mi +az (6.54) where a@=2tl fiz), b= WRRelssh, = -2Relr + Mal, d= 2H le + 1 fer? Section 6.6 Lattice Filters 405 Figure 6.34 | Symmetry of zeres tor a ‘inear-phase FIR filter. ‘This representation suggests a cascade structure consisting of linear-phase elements. It can be seen thet the order of the system function polynomial is M = 9 and the number of different coefficient multipliers is five. This is the same number (M + 1)/2 = 5) of constant multipliers required for implementing the system in the linear-phase direct form of Figure 6.32. Thus, with no additional multiplications, we obtain a modular structure in terms of a cascade of short lincar-phase FIR systems. 6.6 LATTICE FILTERS In Sections 6.3.2 and 6.52, we discussed cascade forms for both IR and FIR systems obtained by factoring their system functions into 1*- and 2"!-order sections. Another interesting and usctul cascade structure is based on a cascade (output to input) con- nection of the basie structure shown in Figure 6.35(a). In the case of Figure 6.35(a) the basic building block system has two inputs and two outputs, and is called a two-port flow graph. Figure 6.35(b) shows the equivalent flow graph representation. Figure 6.36 shows a cascade of M of these basic elements with a “lermination” at each end of the cascade so that the overall system is a single-input single-output system with input xfn] feeding both inputs of two-port building block (1) and output y{n] defined to be a |], the upper output of the last two-port building block M/, (The lower output of the a" stage is generally ignored.) Although such a structure could take many different forms Two-Port Vow Graph = wa edgy 8g) ula ite fa) 0) Figure 6.35 One section of the lattice structure for FIR late filters. (2) Block iagram representation of a two-port building biock (b) Equivalert flow graph. 408 Chapter6 Structures for Discrete-Time Systems aa an an] any an) Two-Port Two-Part Two-Part =stel Flow Flow Flow Graph Graph Graph @ @) ca Ppl aad ra Ha} Figure-6.36 Cascade connection of Mf basic building block sections. depending on the definition of the basic building block, we will limit our attention to the particular choice in Figure 6.35(b), which leads to a widely used class of FIR and IIR filter structures known as lattice filters. 6.6.1 FIR Lattice Filters If the basic butterfly-shaped two-port building block in Figure 6.35(b) is used in the cascade of Figure 6.36, we obtain a flow praph like the one shown in Figure 637, whose lattice shape motivates the name lattice filer. The coefficients fy, k,..., ky, atereferted to generally as the k-parameters of the lattice structure, In Chapter 11, we will see that the &-parameters have a special meaning in the context of all-pole modeling of signals, and the lattice filter of Figure 6,37 is an implementation structure fora linear prediction of signal samples. In the eurrent chapter, our focus is only on the use of lattice filters to implement FIR and all-pole ITR transfer functions. ‘The node variables a'{n] and b'[n] in Figure 6.37 are intermediate sequences that depend upon the input x{] through the set of difference equations al [nt = OO [a] = ated (6.55a) a! tn] =a py — bP — 1 i= M (6.556) BOM = bn — kal Py i= aM (6.55e) yin = a" [a (655d) As we can see, the &-parameters are coefficients in this set of M coupled difference equations represented by Figure 6.37 and Eqs (6.55a)-(6.55d). It should be clear that xlel_o on tn) a] _ yin) “kn fay pis Bo fw Ny pa) oe ree ron Figure 6.37 Lattice flow graph for an FIR system based on a cascade of M two- port building blocks of Figure 6.35(b). Section 66 Lattice Filters 407 these equations must be computed in the order shown {i = 0.1, ..., M)since the output of stage (i — 1) is needed as input to stage ti), etc. ‘The lattice structure in Figure 6,37 is clearly an UTI system, since itis a finear signal flow graph with only delays and constant hranch coefficients. Furthermore, note that there are no feedhack loops, which implies that the system hay a finite- duration impulye response, In fact, a straightforward argument is sufficient to show that the impulse response from the input to any internal node has finite length. Specifically, consider the impulse response from x(n] to the node variable an], ic., from the input to the i* upper ftode. Its clear that if x[n] = Sp], then a0] — | for every i, since the impulse propagates with no delay through the top branch of all the stages. All other paths to any node variable a![n] or 6m] pass through at least one delay, with the greatest delay being along the bottom path and then up to node variable a”[n| through the coefficient —k;. This will be the last impulse that arrives at a" [1], so the impulse response will have length i + 1 samples. All other paths fo an internal node zigeag between the top and bottom of the graph. thereby passing through at least one, but not all, of the delays occurring before the outputs of section (0), ‘Note that in our introduction to lattice Alters, a[a] and 6 {n} were used in Figure 6.37 and Eqs. (6.55a}-(6.55d) to denote the node variables of building block (7) for any input x[n]. However, for the remainder of our discussion, it is convenient to assume specifically that fa] = 4{n] so that a" [a] and b(n] arc the resulting impulse Fesponses at the associated nodes, and that the corresponding z-trans forms A¥(:) and B'(z) are the transfer functions between the input and the i nodes, Consequently, the transfer function betwcen the input and the upper i" node is 1-Dawes, (656) = where in the second form, the coefficients ay’ for m < i are composed of sums of products of the coefficients k; for j = m. As we have shown, the coefficient for the fongest delay from the input to the upper node / is a” impulse response from x[1] to node variable al!"[n] is, &. In this notation, the 1 aso ainl=} a! Len 0. As indicated earlier, the transfer functions A(z) and B“'(z) can be computed recursively using Eq, (6.584) and (6.58b). These transfer functions are *-order polyno- mials, and it is particularly useful to obtain a direct relationship among the coefficients of the polynomials. Toward this end, the right side of Eq. (6.57) defines the coeficients of A®() to be —at!, for m = 1.2,.... 4 with the leading coefficient equal to one: ic, as in Eq, (6.56), AO@) Daven, (661) and similarly, AN UG) = 1 Sal mmol F (6.62) Section 6.6 Lattice Filters 409 To obtain a direct recursive relationship for the coefficients e\’ in terms of eft"! and ki, we combine Eqs, (6.60a) and (6.62) from which it follows that Big aa Dayy = (6.63) Substituting Eqs. (6.62) and (6.63) into Eq. (6.58a), Al) can also he expressed as Re indexing the second summation by reversing the ordering of the terms (ie..replacing m by i —m and re-summing) and combining terms in Eq, (6.64) leads to in - wt =F fog aol] where we see that, as indicated earlier, the coefficient of <~ is —k;, Comparing Eqs. (6.65) and (6.61) shows that AOE hit (6.65) wid (6.66a) ait) = [al Y — kal?) m= af? = ki (6.66b) Equations (6.66) are the desired direct recursion between the coefficients of A(z) and the coefficients of A! (z), These equations, together with Eq. (6.60a) also determine the transfer function B)(z), The recursion of Fys, (6.66) can also be expressed compactly in matrix form, We denote by a1 the vector of transfer function coefficients for AY" (2) and by &—1 these coefficients in reverse order, ic, yt ey ie wy and : (ght), g(h HY fia = [al a? af] Then Eqs, (6.66) can be expressed as the matrix equation a (6.67) -1 The recursion in Eqs, (6.66) or Eqs, (6.67) isthe basis foranalgorithm for analyzing an FIR lattice structure to obtain its transfer function, We begin with the flow graph specified as in Figure 6.37 by the set of &-parameters (Ay, ka, ., Aye} Then we can use Eqs. (6.66) recursively to compute the transfer functions of successively higher-order FIR filters until we come to end of the cascade giving us wu AD =1-P Gat ¥@ ~XxG)" (6.682) 410 Chapter Structures for Discrete-Time Systoms k-Parameters-to-Coefficients Algorithm ‘Given ki, ka,....ku for i=1,2...,M al = ky Eq. (6.66b) if i> I then for j = 1,2... af = al) ali? Eq. (6.66a) | end i snd: Figure 6.38 Algorithm for converting a fF 24,.2..,M Eq. (6.68b) orn paramo tn FR er. coe where iw aq = al M. (6.68) ‘The steps of this algorithm are represented in Figure 6.38, It js also of interest Lo obtain the k-parameters in the FIR lattice structure that lize a given desired transfer function from input [7] to the output y[a] = afahic., wwe wish Lo go from A(z) specified as a polynomial by Eqs. (6.68a) and (6.68) to the sct of keparameters for the lattice structure in Figure 6.37. This can be done by reversing the recursion of Egs. (6.66) or (6.67) to obtain successively the transfer function AY "'(2) in. terms of A"'}(z) fori — M, M—1,....2. The &-parameters are obtained as.a by-product of this recursion. Specifically, we assume that the coefficients af!” = ce, for m = 1,..., M are spec- ified and we want to obtain the k-parameters ky, ... ky to realize this transfer function, in lattice form. We start with the last stage of the FIR lattice, ie., with i = M. From Eq. (6.666). ku =a! am (6.69) with A(*? (2) defined in terms of the specified coefficients as AM) =1— Salt - (6.70) Inverting Eqs. (6.66) or equivalently Eq. (6.67), with ¢ = M and ky Of then determines oy, the vector of transform coefficients ut the next to last stage t= M1. “This process is repeated until we reach Al (2), To oblain a general recursion formula for af 1 “ in terms of af? from Bq. (6.668) note that «/'!? must be eliminated. To do this, replace m by i — m in Eq. (6.66a) and multiply both sides of the resulting equation by &; thereby obtaining kya, = hat? — ot» Adding this equation to Eq, (6.66a) results in Bag) from which it follows that ty 67a) Section 6.6 Lattice Filters an With af” calculated for m (671b) ‘Thus, starting with oy!” = an, 2, ..M we can use Eqs. (6.71a) and (6.71b) to compute wf!!—"!, form = 1,2,...,M — Land kjy—1, and then repeat this process recursively to obtain all of the transfer functions A'?(z) and, as a by-product, all of the &- parameters needed for the lattice structure. The algorithm is represented in Figure 6.39. Coefficients-to--Purameters Algorithm Given al) = aj) 7 = 1.2, all) Fq. (6.69) ‘for M.M- 1, 2 for j= 1,2.-44-1 Eq. (6.71a) Eq. (6.71b) Figure 6.39 Algorithm for converting from FIR filter coetficients to t-parameters. Example 6.9 #Parameters for a 3"°-Order FIR System Consider the FIR system shown in Figure: 6.401 whose system function is, 1 AG) = 1-095! 40.6427 ~ 0.576: Consequently, M = 3 and the coefficients af” in Eq. (6.70), are 09 af =064 a ‘We begin by observing that és = af") = 0.576, Next we want to caleulate the coefficients for transfer function 4/22) using Eq, (6.71a). Specifically, applying Eq. (6.72a), we obtain (rounded to three decimal places): S16, el) 4 tgal? pee i From Bg, (6706) we then ident iy = a = -0.182 To obtain A") (2) we again apply Eq, (6.71a) obtaining © 5 tal? af) 1 thay 1 ae 673, anz Chapter Structures for Discrete-Time Systems We then identify & — a ure 6.40b. = 0.673, The resulting lattice structure is shown in Fig- “9+ “ oo a _— vie @ aia] yd wg pe . = \ Aw Lat ora Joe Jos oS (o) Figure 6.40 Flow graphs for example. (a) Direct form. (ty cients rounded), 6.6.2 Al-Pole Lattice Structure A lattice structure for implementing the all-pole system function H(z) = 1/A(2) can be developed from the FIR lattice of the previous section by recognizing that Hic) is the inverse filter for the FIR system function A(z). To derive the all-pole lattice structure, assume that we are given s{7] = a!*fn}, and we wish to compute the input ain} = x{n]. This can be done by working from right to left to invert the computations in Figure 6,37. More specifically, if we solve Eq. (6.58a) for A ')(z) in terms of A(z) and BY!) @) and leave Eq, (6.58b) as it is, we obtain the pair of equations AvP) 2) jet BED (6.72a) B® Ale 4B, (6.720) ) which have the flow graph representation shown in Figure 6.41, Note that in this case, the signal flow is from i to? — 1 along the top of the diagram and from é — 1 to é along the bottom. Suecessive connection of M stages of Figure 6.41 with the appropriate &; in each section takes the input a!{n] to the output a°)[a] as shown in the flow graph. of Figure 6.42. Finally, the condition x{r] = a! {n] = 4 (n] at the terminals of the last stage in Figure 6.42 causes a feedback connection that provides the sequences [nj that propagate in the reverse direction, Such feedback is, of course, necessary for an IIR system Section 6.6 Lattice Filters 413 Figure 6.41 One stage of computation bea 2! bs] for an all-nole lattice system. The set of difference equations represented by Figure 6.42 ist ain] = sla] (6.73a) a Yin} =a in] +6" Pt) i= MM 1. (6.730) bn] = ba 1] — kai Mn) i= MM 1. (6.73c) ala] = 2 In] = ba, (6.734) ause of the feedback inherent in Figure 6.42 and these corresponding equations, al conditions must he specified for all of the node variables associated with delays. Typically, we would specify b'""[—1] = 0 for initial rest conditions. Then, if Eq. (6.73b) is evaluated first, a!'~))[n] will be available at times 7 = 0 for evaluation of Eq. (6.73¢) with the values of b)fa ~ 1] having been provided by the previous iteration Now we can state that all the analysis of Section 6.6.1 applies to the all-pole lattice system of Figure 6.42. If we wish to obtain a lattice implementation of an all-pole system with system function H(z) = 1/A(z), we can simply use the algorithms in Figures 6.39 and 6.38 to obtain &-parameters from denominator polynomial coefficients or vice-versa. wel = fn] Opn) apa. 4a] Fay Figure 6.42 _At-pole lattice system. Nowe that by basing our derivation of the all-pole lattice on the FIR lattice in Figure: 637, we hawe ‘ended up with the input denoted yin} and the ourput x{a} in opposition to our normal convention. This labeling i, of course, arbitrary once the derivation has been completed, a4 Chapter 8 Structures for Discrete-Time Systems Example 6.10 Lattice Impfementation of an IIR System 26 Asan example of an IIR system, consider the system function & He = ——— (6.740 @ 1092-140. 2 0.5762-3 « ) L (108) + Oizo THE 0.927) (6.740) hich is the inverse system for the system in Example 6.9. Figure 6.43(a) shows the direel form realization of this system, whereas Figure 6.43(b) shows the equivalent TIR lattice systent using the &-parameters computed as in Example 69. Note that the Jatice structure has the same number of delays (memory registers) as the © structure, However, the number of multipliers is twice the number of the: This is obviously true for any order Mf ze ln) ta) te tN / / a ey / 0182, cose of 2A. LZNef Nef Ne! o Figure 6.43 Signal tlow graph of II filter (a) direct form, (b) lattice form, Since the lattice structure of Figure 6.42 is an IIR system, we must be concerned about its stability, We will see in Chapter 13 thal a necessary and sufficient condi- tion for all the zeros of a polynomial A(z) to be inside the unit circle is |ky| < 1, i= 1,2,...,M. (Gee Markel and Gray, 1976.) Example 6.10 confirms this fact since, as shown in Eq, (6.74b) the poles of H(z) (zeros of A(c}) are located inside the unit circle of the z-plane and all the &-parameters have magnitude less than one. For LIK systems, the guarantee of stability inherent in the condition |&;| < 1 is particularly important. Even though the lattice structure requires twice the number of multiplications per out- put sample as the direct form, itis insensitive to quantization of the k-parameters. This, property accounts for the popularity of lattice filters in speech synthesis applications, (See Quatieri, 2002 and Rabiner and Schafer. 1978.) Section 6.7 Overview of Finite-Precision Numerical Efects a5 6.6.3 Generalization of Lattice Systems We have shown that FIR systems and all-pole IIR systems have a lattice structure representation, When the system function has both poles and zeros itis still possible to find a lattiee structure based upon a modification of the all-pole striictute of Figure 6.42. The derivation will not he provided here (Sec Gray and Markel, 1973, 1976.), but it is outlined in Problem 11.27. 6.7 OVERVIEW OF FINITE-PRECISION NUMERICAL EFFECTS We have seen that a particular LIL discrete-time system can be implemented by a variety of computational structures, One motivation for considering alternatives to the simple direct form structures is that different structures that are equivalent for intinite- precision arithmetic may behave differently when implemented with finite numerical precision. In this section, we give a brief introduction 10 the major numerical proble that arise in implementing discrete-timesysiems, A more detailed analysis of these finite ‘word-length effects is given in Sections 6.8-6.10. 6.7.1 Number Representations In theoretical analyses of discrete-time systems, we generally assume that signal values and system coefficients are represented in the real-number system. However, with ana- log discrete-time systems, the limited precision of the components of a circuit makes it difficult to realize coefficients exactly. Similarly, when implementing digital signal- Processing systems, we must represent signals and coefficients in some digital number system that must always have finite precision The problem of finite numerical precision has already been discussed in Sec- tion 4.82 in the context of A/D conversion. We showed there that the output samples from an A/D converter are quantized and thus can be represented by fixed-point binary numbers For compactness and simplicity in implementing arithmetic, one of the bits of the binary number is assumed to indicate the algebraie sign of the number. Formats such as sign and magnitude, one’s complement, and two's complement are possible, but two's complement is most common.> A real number can be represented with infinite precision in two's-complement form as Rm (e+ S02 ‘ (6:75) Where Xj_ is an arbitrary scale factor and the bys are either 0 or 1. ‘The quantity by is, referred to as the sign bit. If bp = 0, then 0 In this ease & would be replaced by (4 ~ X21) in the bounds om the size ‘ofthe quantization error. Overview of Finite-Precision Numerical Efects 47 {Ol + Lt po 4 a a8 . © [2a ! ! 1 7 7 ya, 8 42 4 | 4 2 4 “3 Figure 6.44 Nonlinear relationships representing :wo's-complement (a) rounding and (b) truncation for ©) B=? So far, we have simply stated that the quantity X,_ is am arbitrary scale factor; however, thisfactorhas several useful interpretations. In A/D conversion, we considered Xp tobe the full-scale amplitude of the A/D converter. In this case, Xj, would probably represent a voltage or current in the analog part of the system, Thus, Xqr serves as a calibration constant for relating binary numbers in the range —1 <= ft» < 1 to analog signal amplitudes. In digital signal-processing implementations, itis common to assume thatall signal variables and all coefficients are binary fractions. Thus, if we multiply a (B + 1)-bit signal variable by a (B + 1)-bit coefficient, the result is a (2B + 1)-bit fraction that can be conveniently reduced to (B+ |) bits by rounding or truncating the least significant bits. With this convention, the quantity X,, can be thought of as a scale factor that allows the 418 au 010 oo Chapter 6 f= Ola] Structures for Discrete-Time Systems ou ono oo 101 100 m 3A 43— @) OL 01 106 ou oo 101 @) Figure 6.45 Two's-complement rounding. (a} Natural over‘low. (b) Saturation, representation of numbers that are greater than unity in magnitude, For example, in fixed-point computations, it is common to assume that each binary number h factor of the farm Xyq actually located betw a scale 2°. Accordingly, a value c= 2 implies that the binary point is in bp and b3 of the binary word in Eq. {6.78}. Often, this scl factor is not explicitly represented; instead, itis implicit in the implementation program or hardware architecture. Section 6.7 Overview of Finite-Precision Numerical Effects aig Still another way of thinking about the scale factor Xq, leads to the floating-point representations, in which the exponent c of the stale factor is called the characteristic and the fractional part Zp is called the mantissa. ‘The characteristic and the mantissa are each represented explicitly as binary numbers in floating-point arithmetic systems Floating-point representations provide a convenient means for maintaining both a wide dynamic range and a small quantization noise: however, quantization error manifests itself in a somewhat different way. 6.7.2 Quantization in Implementing Systems Numerical quantization affects the implementation of LT discrete-time systems in sev- eral ways, As a simple illustration, consider Figure 6.46(a), which shows a block diagram ein _slel | ~ 10 + © cote, ato, Bel ne ad | T gan T r é a= Ontal ©) dat sa s)_ tte d tn stl oe a ° ~O ‘O 5 ! eam el r ln] © Figure 6.46 _(mplementation of discrete-time filtering ofan analog signal. (a) ideal system. (b} Nonfinear model. (c} Linearized model 420 Chapter 6 Structures for Diseete-Time Systems for a system in which a bandlimited continuous-time signal x47) is sampled to oblain the sequence x{n], which is the input to an LIT system whose system function is, 1 HO=TaT (6.80) ‘The output of this system, yf}, is converted by ideal bandlimited interpolation to the bandlimited signal ye). ‘A more realistic madel is shown in Figure 6.46(b). In a practical setting, sampling would be done with an A/D converter with finite precision of (8; + 1) bits. The sys tem would be implemented with binary arithmetic of (B + 1) bits, The coefficient « in Figure 6.46(a) would be represented with (B + 1) bits of precision. Also, the cl vatiable é{n — 1] would be stored in a (# + 1)-bit register, and when the (B + 1)-bit number 5[u — 1] is multiplied by the (B | 1)-bit number a, the resulting product would be (28 +1) bits in length. If we assume that a (B + 1)-bit adder is used. the product a0 (n — 1} must be quantized (1.e., rounded or truncated) to (B +1) bits before it can be added to the (8; + 1)-bit input sample é[n]. When B; < B, the (B, + 1) bits of the input samples can be placed anywhere in the (2 + 1)-bit binary word with appropriate extension of the sign. Different choices correspond to different scalings of the input. The coeflicient a has been quantized, so leaving aside the other quantization errors, the system response cannot in general be the same as in Figure 6.46(a). Finally, the (B 4 1)-bit samples én]. computed by iterating the difference equation represented by the block diagram, would be converted to an analog signal by a (8, + 1)-bit DIA converter. When B,, = #, the output samples must be quantized further before DIA conversion Although the model of Figure 6.46(b) could be an accurate representation of 4 real system, it would be difficult to analyze. The system is nonlinear owing to the quantizers and the possibility of overflow at the adder, Also, quantization errors ate introduced at many points in the system. ‘The effets of these efrors are impossible to analyze precisely, since they depend on the input signal, which we generally consider to be unknown. ‘Fhus, we are forced to adopt several different approximation approaches to simplify the analysis of such systems. The effect of quantizing the system parameters, such as the coetficient « in Fig- ure 6.46(a), is generally determined separately from the effect of quantization in data conversion or in implementing difference equations. That is, the ideal coefficients of a system function are replaced by their quantized values, and the resulting response fune- tions are tested to see whether, in the absence of other quantization in the arithmetic, quantization of the filter coefficients has degraded the performance of the system to ‘unacceptable levels. For the example of Figure 6.46, if the real number a is quantized to (B +1) bits, we must consider whether the resulting system with system function 1 Toit A= (681) is close enough to the desired system function H(z) given by Eq. (6.80), Since there are only 2+? different (8 + 1)-bit binary numbers, the pole of H(z) can occur only at2!4! locations on the real axis of the z-plane, and. while itis possible thatd = a, inmost cases some deviation from the idcal response would result. This type of analysis is discussed in more general terms in Section 6.8. Section 68 The Effects of Coefficient Quantization a2 The nonlinearity of the system of Figure 6.46(b) causes behavior that cannot occur ina linear system. Specifically, systems such as this can exhibit zero-input limit eycles, where the output oscillates periodically when the input becomes zero after having been nonzero. Limit cycles are caused both by quantization and by overflow. Although the analysis of such phenomena is difficult, some useful approximate results have been developed. Limit cycles are discussed briefly in Section 6.10 Ti care is taken in the design of a digital implementation, we can ensure that overflow occurs only rarely and quantization errors are small. Under these conditions, the system of Figure 6.46(b) behaves very much like a lincar system (with quantized coefficients) in which quantization errors are injected at the input and output and at internal points in the structure where rounding or truncation occurs, Therefore, we can replace the model of Figure 6.46(b) by the linearized model of Figure 6.46(c), where the quantizers are replaced by additive noise sources (sce Gold and Rader, 1969; Jackson, 1970a, 1970b). Figure 6.46(c) is equivalent to Figure 6.46(b) if we know cach of the noise sources exactly. However, as discussed in Section 4.8.3, useful results are obtained if we assume a random noise model for the quantization noisc in A/D conversion. This same approach can be used in analyzing the effvets of arithmetic quantization in digital implementations of linear systems. As seen in Figure 6.46(¢), each noise soure injects a random signal that is processed by a different part of the sysiem, but since we assume that all parts of the system are lincar, we can compute the overall cffcet by superposition. In Section 6.9, we iblustrate this style of analysis for several important systems. In the simple example of Figure 6.46, there is little flexibility in the choice of structure, However, for higher-ordcr systems, we have seen that there is a wide variety of choices. Some of the structures are less sensitive to coefficient quantization than others, Similarly, because different structures have different quantization noise sources and because these noise sources are filtered in different ways by the system, we will find that structures that are theoretically equivalent sometimes perform quite differently when finite-precision arithmetic is used to implement them. 6.8 THE EFFECTS OF COEFFICIENT QUANTIZATION ITI discrete-time systems are generally used to perform a filtering operation. Methods for designing FIR and IIR filters, which are discussed in Chapter 7, typically assume 4 particular form for the system function. The result of the filter desig process is a system function for which we must choose an implementation structure (4 set of dif- ference equations) from an unlithiled number of theoretically equivalent implementa tions. Although we are almost always interested in implementations that require the least hardware or soltware complexity, it is not always possible to base the choice of yplementation structure on this criterion alone. As we will sce in Scetion 6.9, the im- plementation structure determines the quantization noise generated internally in the system, Also, some structures are more scasitive than others to perturbations of the coefficients. As we pointed out in Section 6.7, the standard approach to the study of co- efficient quantization and round-off noise is to-treat them independently. In this section, we consider the effects of quantizing the system parameters, Chapteré Structures ‘or Discrete-Time Systems 6.8.1 Effects of Coefficient Quantization in UR Systems ‘When the parameters of. rational system function or corresponding difference equation sare quantized, the poles and zeros of the system function move to new positions in the z-plane. Equivalently, the frequency response is perturbed from its original value. Ifthe system implementation structure is highly sensitive to perturbations of the eoeflicients, the resulting system may no longer meet the original design specifications, or an IIR system might even become unstable. ‘A detailed sensitivity analysis for the general case is complicated and usually of limited value in specific cases of digital filter implementation. Using powerfulsimulation tools, it is usually easy to simply quantize the coefficients of the difference equations employed in implementing the system and then compute the corresponding frequency response and compare it with the desired frequency-response function, Even though simulation of the system is generally necessary in specific cases, it is still worthwhile to consider, in general, how the system function is affected hy quantization of the co- efficients of the difference equations. For eximple, the system function representation corresponding to both direct forms (and their corresponding transposed versions) is the ratio of polynomials (6.82) ‘The sets of coefficients {ay} and {bi} are the ideal infinite-precision coefficients in both direct form implementation structures (and corresponding transposed structures). If we quantize these coefficients, we obtain the system function \ Yur is He) (6.83) i where de = ag + Aay and by = by + Aby are the quantized coefficients that differ from the original coefficients by the quantization errors Aay and Ab. Now consider how the roots of the denominator and numerator polynomials (the poles and zeros of H(z}) are affected by the errors in the coefficients, Each polynomial Toot is affected by al! of the errors in the coefficients of the polynomial since each root isa function of all the coefficients of thc polynomial, Thus, cach pole and zero will be affected by all of the quantization errors in the denominator and numerator polynomi- als, respectively. More specifically, Kaiser (1966) showed that if the poles (or zeros) are tightly clustered. itis possible that small errors in the denominator (numerator) coetfi~ cients can cause large shifts of the poles (zeros) for the direct form structures. Thus, if the poles (zeros) are tightly clustered, corresponding to a narrow-bandpass filter or a narfow-bandwidth lowpass filter, then we can expect the poles of the direct form struc- ture to be quite sensitive to quantization errors in the coefficients. Furthermore, Kaiser's Section 6.8 ‘The E'fects of Coefficient Quantization aa analysis showed that the larger the number of clustered poles (zeros), the greater is the sensitivity. ‘The cascade and parallel form system functions, which are given by Eqs (6.30) and (6.35), res -onsist of combinations of 2°0-order direct form systems. However, in both eases, each pair of complex-conjugate poles is realized independently of all the other poles. Thus, the error in a particular pole pair is independent of its distance from the other poles of the system function. For the cascade form, the same argument holds for the zeros, since they are realized as independent 2°2-order factors Thus, the cascade form is generally much less sensitive to coefficient quantization than the equivalent direct form realization. ‘Asscen in Eq, (6.35), the zeros of the parallel form system function are realized implicitly, through combining the quantized 2"-order sections to obisin a common denominator, Thus, a particular zerois affected by quantization errors in the numerator and denominator coellicients of ail the 2"-order sections, However, for most practical filter designs, the parallel form is also found to be much less sensitive to cocfticient quantization than the cquivalcnt dircet forms becausc the 2°4-order subsystems are not extremely sensitive to quantization. In many practical filters, the zeros are often widely distributed around the unit circle, or in some cases they may all be located at z= 41. In thelatter situation. the zeros mainly provide much higher attenuation around frequencies w = 0 and w = 1 than is specified, and thus, movements of zeros away from £1 do not significantly degrade the performance of the system. 6.8.2 Example of Coefficient Quantization in an Elliptic Filter As an illustration of the effect of coefficient quantization, consider the example of an IIR bandpass elliptic filter designed using approximation techniques to be discussed in Chapter 7. The filter was designed to meet the following specification: 0.99 < |M(e!*)| = LOL O.3n = |oi = 04m, [M(e!)| < 0014.2, —40dB), foo] < 0.29. H(ei)| hk = H(2) + AH), (687) 430 Chapter Structures for Discrete-Time Systems zplane © Realizable pole positions Re ae plane i) = ~ os} \ Figure 6.52 Pole locations tor coupled form 2°orver IIR system of 9 4 a ‘Re Figure 6.51. (2) Four-bit quantization of &) coeticients, (2) Soven-bit quantization where ANE (6.88) ‘Thus, the system function (and therefore, also the Frequeney response) of the quantized system is linearly related! to the quantization errors in the impulse-response coefficients For this rcason, the quantized system can be represented as in Figure 6.53, which shows te unquantized system in parallel with an error system whose impulse response is the sequence of quantization exo samples (Afr) aud whose system Lunction is the corresponding z-transform, A/(2). Section 68 The Effects of Coeicient Quantization 4a ee) xin viel Figure 6.53 Representation of ‘coefficient quantization in FIR systems. TABLE 6.5 _UNQUANTIZED AND QUANTIZED COEFFICIENTS FOR AN OPTIMUM FIR LOWPASS FILTER (M = 27), Costfcient ——_Unquantized obits MAKIN LaseeT* IO ASD |S A{)=h(25]— —Lere993 x 10-853 1738032 = 10-3 =284 x 2-18 286841 10 21S 1255246 410-24 x 2-18 esyisay 10-7 2ibe2 22172 «10% PDB 124663 = 10-2 S00 2-18 B20 x 10212192 nsx 2 IF 3288892 % 10-2 10592-18265 2-13 653057 x10 2142 «2-1 336 2-18 AT52K754 10-2467 x21 617 x AZ] AlIs| —1560970% 10-5115 «2-H 279 «2-8 ALLS} = Alls} — 4394004 107114399. 2-15 3600 2°13 Another approach to studying the sensitivity of the direct form FIR structure would be to examine the sensitivity of the zeros to quantization errors in the impulse- response coeflicients, which are, of course the coefficients of the polynomial H(z). If the zeros of H(z) are tightly clustered, then their locations will be highly sensitive to quantization crrorsin the impulse-response coefficients. The reason that the direet form PIR system is widely used is that, for most linear phase FIR filters, the zeros are more or less uniformly spread in the z-plane, We demonstrate this by the following example. 6.8.5 Example of Quantization of an Optimum FIR Filter ‘As an example of the effect of coefficient quantization in the FIR case, consider a linear-phase lowpass filter designed to meet the following specifications: 0.99 = [Hel = 101, 0s al = 04x, IH(e*)| $0.001(.e,-600B), 0.6 This filter was designed using the Parks-McCletlan design technique, which will be discussed in Section 7.7.3, The details of the design for this example are considered in Section 7.8.1. Table 6.5 shows the unquantized impulse-response coefficients for the system, along with quantized coefficients for 16-, 14-, 13-, and &-bit quantization. Figure 6.54 32 0 40 “0 -80~ 100 a 7 Ralan Frequency (op @) on 0.605 005 ooio|—-—L» —t__1_ ae Oe Radian Frequency (a) ( oa10.- ————— ————— ‘nos E 5 — a 0 ae Oar O67 O8e 7 Radian frequency (0) © Figure 6.54 FIR quantization example, (a) Log magnitude for unquantized case. (©) Approximation error forun quantized case. (Err not defined in transition band) {6) Approximation errr for 16-bit quantization opi 0s Amplitude 0 O29 Ode ‘Ge oar 7 Radium frequency (a) (a) oo 8.005 3 | q 1 = =oons = -0010 A ccna Hl camo 02 nae em Or 7 Radian frequency ( © Radian frequency («) 0 Figure 6.54 (contiaved) {d) Approximation error for 14-bit quantization, (} Approximation error for 13-bit quantization. {f) Approximation error for 8-bit ‘quantization. 433 434 Chapter® Structures for Discrate-Time Systems gives a comparison of the frequency responses of the various systems. Figure 6.54{a) shows the log magnitude in dB of the frequency response for unquantized coefficients Figures 6.54(b), (c), (d), (c), and (1) show the passband and stopband approximation errors (errors in approximating unily in the passband and zero in the stopband) for the unquantized, 16-, 14-, 13-, and 8-bil quantized cascs, respectively. From Figure 6.54, ‘we See that the system meets the specifications for the unquantized case and both the 16-bit and 14-bit quantized cases. However, with 13-bit quantization the stopband ap- proximation error becomes greater than 0.001, and with 8-bit quantization the stopband approximation erroris over 10 timesas large as specified. ‘Thus, we see that at least 14-bit coclicients are required for a direct form implementation of the system, However, this is not a serious limitation, since 16-bit or 14-bit coefficients are well matched to many of the technologies that might be used to implement such a filter. ‘The effect of quantization of the filter coefficieats on the locations of the zeros of the filter is shown in Figure 6.55. Note that in the unquantized case, shown in Fig- ure 6.55(a), the zeros are spread around the z-plane, although there is some clustering ‘on the unit circle. The: zeros on the unit cirele are primarily responsible for developing, the stopband attenuation, whereas those at conjugate reciprocal locations off the unit circle are primarily responsible for forming the passband. Note that little difference is observed in Figure 6.554b) for 16-bit quantization, but in Figure 6.55(c), showing 13-bit quantization, the zeros on the unit cirele have moved significantly. Finally, in Fig ure 6.55(d), we sve that 8-hil quantization causes several of the zeros on the unit circle to pair up and move off the circle to conjugate reciprocal locations. This behavior of the zeros explains the behavior of the frequency response shown in Figure 6.54. ‘A final point about this example is worth mentioning. All of the unquantized coetficients have magnitudes less than 0.5. Consequently, if all of the coefficients (and therefore, the impulse response) are doubled prior to quantization, more efficient use of the available bits will result, corresponding in effect to increasing & by 1, In Table 6. and Figure 6.54, we did not take this potential for increased accuracy into account. 6.8.6 Maintaining Linear Phase So far, we have not made any assumptions about the phase response of the FLR sys- tem, However, the possibility of generalized linear phase is one of the major advan- tages of an FIR system, Recall that a linear-phase FIR system has either a symmetric (ALM =n} = hlml) or an antisymmetric (1M — n] = —hln)) impulse response. ‘These Jinear-phase conditions are easily preserved for the direct form quantized system. Thus, all the systems discussed in the example of the: previous subsection have a precisely linear phase, regardless of the coarseness of the quantization. This can be seen in the way in which the conjugate reciprocal locations are preserved in Figure 6.55 Figure 6.55(d) suggests that, in situations where quantization is very coarse or for high-order systems with closely spaced zeros, it may be worthwhile to realize smaller sets of zeros independently with a cascade form FIR system. To maintain linear phase, each of the sections in the cascade must also have linear phase, Recall that the zeros of a linear-phase system must occur as illustrated in Figure 6.34. For example, if-we use 2" .order sections of the form (1 +az~' +2°?) for each complex-conjugate pair of zeros: on the unit cirele. the zero can move only on the unit circle when the coefficient a is Section 68 The Etfects of Coeiient Quantization 435, FIR Lops Filter: FIR Lospuss Biter: Ungsinced Coeliients 165it Celcents a z e 2 é = Real part @) ® FIR Low pass Fiter, FIR Lowpass Fates Ishi Cooficiens Sit Coeliients Imaginary past maginary part Real part Real part ©) a Figure 6.55 Effect of impulse response quantization on zeros of Hi2) (a) Un~ ‘quantized. (b) 16-bit quantization. (c) 13-bit quantization, (d) 8-bit quantization. ‘quantized. This prevents zeros from moving away from the unit circle, thereby lessening their attenuating effect. Similarly, real zeros inside the unit circle and at the reciprocal location outside the unit circle would remain real. Also, 2er08 at = +L can be realized exactly by 1*-order systems. Ifa pair of complex-conjugate zcros inside the unit circle is realized by a 2"¢-order system rather than a 4"*-order system, then we must ensure that, for each complex zero inside the unit circle, there is a conjugate reciprocal zero outside the unit circle, This can be done by expressing the 4""-order factor corresponding to zeros at z = rel and 2 =rte-#* as Leet - dz? $e: (6.89) (1 = 2r cosdz = 2reos dz! +2 436 Chapter Structures for Discrete-Time Systems Figure 6.56 Subsystem to implement 4"-order factors in a linear-phase FIR system such that linearity of the phase is maintained independently of parameter ‘quantization ‘This condition corresponds to the subsystem shown in Figure 6.56, This system uses the same coefficients, ~2r cos # and r2, to realize both the zeros inside the unit circle and the conjugate reciprocal zeros outside the unit circle. Thus, the linear-phase condition is preserved undex quantization. Notice that the factor (1~2r cos 27} 4 r22~2) is identical tothe denominator of the 2-order direct form TTR system of Figure 6,49, Therefore, the set of quantized zeros is as depicted in Figure 6.50. More details on cascade realizations of FIR systems are given by Herrmann and Schiissler (1970b), 6.9 EFFECTS OF ROUND-OFF NOISE IN DIGITAL FILTERS Difference equations realized with finite-precision arithmetic are nonlinear systems Although itis important in general to understand how this nonlinearity affects the per- formance of discrete-time systems, a precise analysis of arithmetic quantization effects is generally not required in practical applications, where we are typically concerned with the performance of a specific system. Indeed, just as with coefficient quantization, the most effective approach is often to simulate the system and measure its performance. For example, a common objective in quantization error analysis is to choose the digital word length such that the digital system is a sufficiently accurate realization of the de- sired linear system and at the same time requires a minimum of hardware or software complesity, The digital word length can, of course, be changed only in steps of 1 bit, and as we have already seen in Section 4.8.2, the addition of 1 bit to the word length reduces the size of quantization errors by a factor of 2. Thus, the choice of word length is insensitive to inaccuracies in the quantization error analysis; an analysis that is correct to within 30 to 40 percent is often adequate. For this reason, many of the important effects of quantization can he studied using lincar additive noise approximations, We develop such approximations in this section and illustrate their use with several ex amples. An cxception is the phenomenon of zero-input limit cycles, which are striclly nonlinear phenomena. We restrict our sttidy of nonlinear models for digital filters to a brief introduction to zero-inpul limit cycles in Section 6.10. 6.9.1 Analysis of the Direct Form IR Structures Tointroduce the basic ideas, let us consider the direct form structure for an LI discrete time system. The flow graph of a direct form 1 2"!-order system is shown in Section 6.9 Effects of Round-off Noise in Digital Fiters se yal 437 Figure 6.57 Models for direct form 1 system, (2 Infnite-precision madel (b) Nonlinear quantized model {c) Linear noise model Figure 6.$7(a), The general "order difference equation for the direct form T structure is x a Late e+ Yoextn -A, (6.90) fe ms 438 Chapter Structures for Discrete-Time Systems and the system function is, (691) Let us assume that all signal values and coefficients are represented by (B + 1)-bit fixed-point binary numbers. Then, in implementing Eq. (6.90) with a (B + 1)-bit adder, it would be necessary to reduce the length of the (28 + 1)-bit products resulting from multiplying two (B + 1)-bit numbers back to (# + 1) bits. Since all numbers are treated as fractions, we would discard the least significant & bits by either rounding or truncation. This is represented by replacing each constant multiplier branch in Figure 6.57(a) by a constant multiplier followed by a quantizer, as in the nonlinear model of Figure 6.57(b).. The difference equation corresponcting to Figure 6.57(b) is the nonlinear equation ¥ u Sta} = YO Olax dla — kl) =P Otbextn — ka (692) 1 7 Figure 6.57(c) shows an alternative representation in which the quantizers are Teplaced by noise sources that arc equal to the quantization error at the autput of cach quantizer. For example, rounding or truncation of a product dxf] is represented by a noise source of the form eln] = Qlbrin}} ~ bet). (6.93) If the noise sources are known exactly, then Figure 6.57(c) is exactly equivalent to Figure 6.57(b). However, Figure 6.57(c) is most useful when we assume that each quan- tization noise source has the following properties: 1. Ewch quantization noiscsource efn] isa wide-scnse-stationary white-noise process, 2, Each quantization noise source has a uniform distribution of amplitudes over one quatitization interval 3. Each quantization noise source is uncorrelated with the input to the corresponding quantizer, all other quantization neise sources. and the input to the system. ‘These assumptions are identical to those made in the analysis of A/D conversion in Section 48. Strictly speaking, our assumptions here cannot be valid, since the quanti- zation ertor depends directly on the input to the quantizer. This is readily apparent for constant and sinusoidal signals. However, experimental and theoretical analyses have shown (see Bennett, 1948; Widrow, 1956, 1961; Widrow and Kollar, 2008) that in many situations the approximation just described leads to accurate predictions of measured statistical averages such as the mean, variance, and correlation function. This is truc when the input signal is a complicated wideband signal such as speech, in which the signal fhuctuates rapidly among all the quantization levels and traverses many of those levels in going from sample to sample (see Gold and Rader, 1969), The simple linear- noise approximation presenicd here allows us to characterize the noise generated in the system by averages such as the mean and variance and to determine how these averages are modiied by the system. Section 6.9 Effects of Round-off fotse in Digital Filters 439 Pele) 1 is Ano a 2 (a) pee) 1 & ay Figure 6.58 Probability density function for quantization errors, wv) (@) Rounding. (bj Truncation. For (B + 1)-bit quantization, we showed in Section 6.7.1 that, for rounding, lin low 52 "seul 52"%, (6.94a) and for two’s-complement truncation, -2-8 elnl <0. (6.94b) ‘Thus, according to our second assumption, the probability density functions for the random variables representing quantization error are the uniform densities shown in Figure 6.58(a) for rounding and in Figure 6,58(b) for truncation. The mean and variance for rounding are, respectively. (6.95a) (6.95b) For two's-complement truncation, the mean and varianee are — (6.96a) (6.96b) In general, the autocorrelation sequence of a quantization noise source is, according to the first assumption, deel] = 07 In the case of rounding, which we will assume for convenience henceforth m = 0,80 the autocorrelation function is d,e[2] = 07 4[n], and the power spectrum is ®,-(e/) = a? for \w| < m. In this case, the variance and the average power are identical. In the Bin] + m2. (6.97) Chapter 8 Structures for Discrete-Time Systerrs elal Bla} = vn} Fe Figure 6.59 Linear-noise model for direct form | with noise sources combined, case of truncation, the mean is not zero, so average-power results derived for rounding must be corrected by computing the mean of the signal and adding its square to the average-power resulls for rounding. With this mode! for each of the noise sourcesin Figure 6.57(c), we can now proceed to determine the effect of the quantization noise on the output of the system. To aid Us in doing this, it is helpful to observe that all of the noise sources in that figure are effectively injected between the part of the system that implements the zeros and the part that implements the poles. Thus, Figure 6.59 is equivalent to Figure 6.57(c) if e[a] in Figure 6.59 is eln] = enn} + e,{n] + e2ln] + esfn] + eatn (698) Since we assume that all the noise sources are independent of the input and independent of each other, the variance of the combined noise sources for the 2"4-order direct form Tease is 2 val agh to2ag? as tt oh to2 to, +02, +02 =5 : 6.99) Ty +86, + ey + Os + 8 7 (6.99) and for the general direct form I case, itis 28 of =(M41+N) iz (6.100), To obtain an expression for the output noise, we note from Figure 6.59 that the system has two inputs, x[n] and ef]. and since the system is now assumed to be linear, the output can be represented as 121] = yInl + fin], where yfa] is the response of the ideal unquantized system Lo the input sequence x[a] and f[a] is the response of the system to the mput ejn]. ‘The output y[n] is given by the difierence equation (6.90), but since elit] is injected after the zeros und before the poles, the output noise satisfies the difference equation Fl) = Daa sin e+e: (6.101) 1 i.e., the properties of the output noise in the direct form I implementation depend only on the poles of the system. To determine the mean and variance of the output noise sequence, we can use some results from Section 2.10. Consider a linear system with system function H.y(z) Section 6.9 Effects of Round-off Noise in Digital Filters 4a with a white-noise input el] and corresponding output fl]. Then, from Es. (2.184) and (2.185), the mean of the output is me S2 hep] = me gle (6.102) Since me = 0 for rounding, the mean of the output will be zero, so we need not be concerned with the mean value of the noise if we assume rounding, From Eqs. (6.97)and (2.190), it follows that, because, for rounding, elm ]is azero-mean white-noise sequence, the power density spectrum of the output noise is simply Preto) = Pppfeh”) = o2|Heple™)P. (6.103) From Eq. (2.192), the variance of the output noise can be shown to be ef 1 f” iy oan _Pritoldes = 085 [Hey (2)Pde. (6.104) Using Parseval's theorem in the form of Eq. (2.162), we can also express o7 as (6.10: When the system function corresponding to her[n] isa rational function, as it will always be for difference equations of the type considered in this chapter, we can use Eq. (A.66) in Appendix A for evaluating infinite sums of squares of the form of Eq, (6.105). ‘We will use the results summarized in Eqs. (6.102)-(6.105) often in our analysis cof quantization noise in linear systems. For example, for the direct form 1 system of Figure 6.59, H,;(z) = 1/A(2); Le, the system function from the point where all the noise Sources are injected to the output consists only of the poles of the system function A(z) in Eq. (6.91). Thus, we conclude that, in general, the total output variance owing to internal round-off or truncation is 2781 pt do T2 2x Jig (ACE 2S 2 = ELE NIE D bert, oF =(M+14N) (6.106) where hey{nt| is the impulse response corresponding to Her (2) = 1/A(e). The use of the preceding results is illustrated by the following examples. Example 6.11 Round-off Nolse in a 1*-Order System 2 Suppose thal we wish to implement a stable system having the system function q b We) = @= Ta Figure 6.60 shows the flow graph of tae lineur-noise model for the implementation in which products ate quantized before addition. Each noise source is filtered by the "system from en] to the output, for which the impulse response is ef) = aula. iid (6.107) az Chapter 6 Structures for Discrete-Time Systems 4g Since M = Oand = 1 for this example, from Eq. (6,103), the power spectrum of the ‘ouEpUt noise is er) (6.108) and the total noise variance athe output is we, Set = (6109) From Pa, (6.109), we see that the output noise variance increases as the pole at approaches the unit circle, Thus, to maintain the noise variance below a specified level as jal approaches unity, we must use longer word lengths, The following example also illustrates this point. el) = ele] + exla) In] Bln] =o) + Fl) Figure 6.60 1*-order linear noise made! Example 6.12 Round-off Noise in a 2"-Order System $8 Consider a stable 2"J-order direct form I system with system function bot bye7! + byzm? He) Nd ree-Hy (6.110) Tei, ‘The linear-noise model for this system is shown in Figure 6.57(0), or equivalently, Figure 6.59, with a; — 2r cos and az = —r?. In this case, the total output noise power can be expressed in the form. ora pe des 5 Se og Ware eTou are Pee oy Using Eq. (A.66) in Appendix A, the output noise power is found to be 28 eas (He (1) As in Exumple 6.11, we see thal as the complex conjugate poles approach the unit circle (r — 1), the (otal autput noise variance increases, thus requiring longer word lengths to maintain the variance below a prescribed level, The techniques of analysis developed so far for the direct form structure can also bbe applied to the direct form II structure. The nonlinear difference equations for the Section 6.9 Effects of Round-off Noise in Digital Fiters 443 direct form HI structure are of the form w Bln] = D> Olaxiile — KI] + xLn], (6.1134) = aM $lal = D> Olbxtbln — Fy] (6.1136) far] Figure 6.61(a) shows the linear-noise model for a 2-order direct form II system, A noise source has been introduced after each multiplication, indicating that the products are quantized to (+ 1) bits before addition, Figure 6.61(b) shows an equivalent linear model, wherein we have moved the noise sources resulting from implementation of the poles and combined them into a single noise source e,{1] = e3in| + eal] at the input. Likewise. the noise sources due to implementation of the zeros are combined into the single noise source ¢p[] = eg{n]-tex{nl--eala) that is added directly to the output. From this equivalent model. it follows that for M zeros and N poles and rounding (m, = 0), the power spectrum of the output noise is 28 qe Prplo) = NAS IMP + MED (e114) and the output noise variance is P 2 28 a fine Pda ony a 4 (ot1sy DY hea? aren 2 That is, the white noise produced in implementing the poles is filtered by the entire system, whereas the white noise produced in implementing the zeros is added dircetly to the output of the system. In writing Eq. (6.115), we have assumed that the N noise sources at the input are independent, so that their sum has N’ times the variance of a single quantization noise source. The same assumption was made about the (M + 1) noise sources at the output. These results are easily modified for two's-complement truncation, Recall from Eqs. (6.95a)-(6.95b) and Eqs. (6.96a)-(6.96b) that the variance of a truncation noise source is the same as that of a rounding noise source, but the mean of a truncation noise source is not zero. Consequently, the formulas in Eqs. (6.106) and (6-115) for the total output noise variance also hold for truncation. However, the output noise will have a nonzero average value that can be computed using Eq. (6.102). A comparison of Fg. (6.106) with Eq, (6.1 15) shows thatthe direct form [and direct, form I structures are affected differently by the quantization of products in implement- ing the corresponding difference equations. In general, other equivalent structures such as cascade, parallel, and transposed forms will have a total output noise variance differ- ent from that of either of the direct form structures. However, even though Egs. (6.106) and (6.115) axe different, we cannot say which system will have the smaller output noise variance unless we know specific valucs for the coefficients of the system. Inother words, it is not possible to state that a particular structural form will always produce the least output noise Chapter 6 Structures for Discrete-Time Systems bp “TT ic) eal ele @ eal] ‘ a a sh i Sl Liat 4% oe | < | igure 6.61 Linear-noise models for a | direct form I. (a) Showing quantization of individual products. (b) With noise a) sources combined. It is possible to improve the noise performance of the dircet form systems (and therefore cascade and parallel forms as well) by using a (2B-+1)-bit adder to accumulate the sum of products required in both direct form systems. For example, for the direct form implementation, we could use a difference equation of the form ¥ ow Sin] = o| Spout Soh -ah (6.116) m i.e, the sums of products are accumulated with (28 + 1)-or (28 ++ 2)+bit accuracy, and the result is quantized to (B + 1) bits for output and storage:in the d direct form I case, this means that the quantization noise is still filtered by the poles, but the factor (M-++1+N ) in Eq. (6, 106) is replaced by unity. Similarly. for the direct form TT realization, the difference equations (6.113a)-(6.113b) can respectively be replaced by win] =O [Zam -# at] (6.107) and i ait 0[ Src —a (6.1176) io F Section 6.9 Effects of Round-off Noise in Digital Fitters 45 This implies a single noise source at both the input and output, so the factors V and (M ~1)in Eq. (6.115) are each replaced by unity. Thus, the double-length accumulator provided in most DSP chips can be used to significantly reduce quantization noise in direct form systems 6.9.2 Scaling in Fixed-Point Implementations of IIR Systems “The possibility of overflow is another important consideration in the implements of IIR systems using fixed-point arithmetic. If we follow the convention that each fixed- point number represents a fraction (possibly times a known scale factor), each node in the structure must be constrained to have a magnitude less than unity to avoid overflow. If we[n] denotes the value of the k'" node variable, and Ay[m] denotes the impulse response [rom the input x[71] lo the node variable wel], then B= | YD xe = onytete (6.118) The bound WwelnT| Samax Velo (6.119) is obtained by replacing x[n — ml by its maximum value xmax and using the fact that the magnitude of a sum is less than or equal to the sum of the magnitudes of the summands. Therefore, a sufficient condition for |w|7]| < Lis 1 aa (6.120) YS thamtl for all nodes in the flow graph. If Ymax does not satisfy Eq. (6.120), then we can multiply xin] by a sealing multiplier s at the input to the system so that sxmax Satisfies Eq. (6.120) for all nodes in the flow graph; ic., a (6.121) max [2mm] Scaling the input in this way guarantees that overflow never occurs at any of the modes in the flow graph, Equation (6.120) is necessary as well as sufficient, since an input always exists such that Eq. (6.119) is satisfied with equality. (See Eg, (2.70) in the discussion of stability in Section 2.4.) However, Eq, (6.120) leads to a very conservative scaling of the input for most signals Another approach to scaling is to assume that the input is a narrowband signal, modeled as x[ 1] = Xinax COS gan. In this case, the node variables will have the form wel] = | ae!) xmax coslogn + £ Hele!) (6.122) Therefore, overflow is avoided for all sinusoidal RAS, Hele! <1 (6.123) nals if Chapter 6 Structures for Discrete-Time Systems or if the input is scated by the scale factor s such that 1 Simex < ——_— (6.124) max [Ha(e!)] Still another sealing approach is based on the energy F = %y x(n? of the input signal, We can derive the scale factor in this case by applying the Schwarz inequality (see Bartle, 2000) to obtain the following inequality relating the square of the node signal lo the energies of the input signal and the node impulse response: lr puxia? = Lf" neti steed s (2f lately) (s f Ix(e™Pde) sete) Therefore, if we scale the input sequence values by s and apply Parseval’s theorem, we see that {ay [n]/? < 1 for all nodes k i s (= pe?) EB pl (6.125) op x wna Since it can be shown that for the 4" node, % 1 ‘Ds van} é max Mite] = YS Malell (61m) it follows that (for most input signals) gs. (6.121), (6.124), and (6.126) give three de- creasingly conservative ways of scaling the input to a digital filter (equivalently decreas- ing the gain of the filter). Ot the three, Eq. (6.126) is generally the easiest to evaluate analytically because the purtial Lraction method of Appendix A can be used: however use of Eq. (6.126) requires an assumption about the mean-squared value of the signal E, On the other hand, Eq, (6.121) is difficult to evaluate analytically, except for thesim- plest systems, Of course, if the filter coefficients are fixed numbers, the scale factors can be estimated by computing the impulse response or frequency response numerically. If the input must be scaled down (s < 1), the signal-to-noise ratio (SNR) at the output of the system will be reduced because the signal power is reduced, but the noise power is dependent only on the rounding operation. Figure 6.62 shows 2"-order direct form Land direct form II systems with scaling multipliers at the input. In determining the scaling multiplier for these systems, it is not necessary to examine each node in the flow graph. Some nodes do not represent addition and thus cannot overflow: Other nodes represent partial sums. If we use nonsaturation two's-complement arithmetic, such nodes are permitted to overflow if certain key nodes do not. For example, in re 6,62(a), we can focus om the node enclosed by the dashed cirele, In the figure, the scaling multipticr is shown combined with the his. so that the noise source is the same Section 6.9 Effects of Round-off Noise in Digital Fiters a7 Figure 6.62 Scaling of direct form systems. (a) Direct form |. (b} Direct form I as in Figure 6.59; ie., it has five times the power of a single quantization noise source.'"” Since the noise source is again filtered only by the poles, the output noise power is the same in Figures 6.59 and 6,62(a). However, the overall system function of the system in Figure 6.62(a) is s/1(2) instead of H(z), 80 the unquantized component of the output 3fn] is sy{a] instead of yf. Since the noise és injected after the scaling, the ratio of signal power to noise power in the scaled system is s* times the SNR for Figure 6.59. Because s < 1 if scaling is required to avoid overflow, the SNR is reduced by sealing, ‘The same is true for the direct form IT system of Figure 6.62(b). In this ease, we must determine the scaling multiplier to avoid overflow at both of the circled nodes Again, the overall gain of the system is s times the gain of the system in Figure 6.61(B) but it may be necessary to implement the scaling multiplier explicitly in this case to avoid overflow at the node on the left. This scaling multiplier adds an additional noise component to ea[n], $0 the noise power at the input is, in general, (N + 1)2~24/12. Otherwise, the noise sources are filtered by the system in. exactly the same way in both Figure 6.61(b) and Figure 6.62(b), Therefore, the signal power is multiplied by s?, and the noise power at the output is again given by Eq. (6.115), with N replaced by (N + 1). The SNR is again reduced if scaling is required to avoid overflow. "This eliminates a separate scaling multiplication and quantization naise soures, However, seing {and quantizing) the ys can change the frequency response ol the system, Ifa separate input scaling multiplier precedesthe implementation the zeros in Figure 6.62(a), thea un additional quantization aise source would contribute tothe output noise after going through the entire system Chapter 6 Structures for Discrete-Time Systems Example 6.13 Interaction Between Scaling and Round-off Noise {To illostrate the interaction of scaling and round-off noise, consider the system of “) Example 6.11 with system function given by Eo. (6.107). If the scaling multiplier is combined with the coefficient b. we oblain the flow graph of Figure 6.63 forthe scaled system. Suppose that the input is white noise with amplitudes uniformly distributed between —1 and +1, Then the total signal variance is a? = 1/3. To guarantee no ‘overflow in computing §[n}, we use Eg, ¢6.121) to compute the scale factor 1 11a) => (6.128) Yair a0 ‘The output noise variance was determined in Example 6,11 to be 28 2222 2 wate (6.129) and since we again have two (B + )-bit rounding operations, the noise power i the cutput is the same, Lc.,0, = 0. The variance ofthe output yf] due tothe scaled input so{n]is =s7of. (6.130) ( =a)? 2 a(S (6.131) my) ae TS at ‘As the pole of the system approaches the unit circle, the SNR decreases because the quantization noise is amplified by the system and because the high gain of the system forces the input to be sealed down (o avoid overflow. Again, we see that overflow and quantization noise work in opposition to decrease the performance of the system. eal sb To sind Sled oyled Figure 6.63 Scaled 1*-arder system, 6.9.3 Example of Analysis of a Cascade HR Structure ‘The previous results of this section can be applied directly to the analysis of either paral- fel or cascade structures composed of 2°4-order direet form subsystems. The interaction of scaling and quantization is particularly interesting in the cascade form. Our general Section 6.9 Effects of Round-off Notse in Digital Fiters 449 comments on cascade systems will be interwoven with a specific example. An elliptic lowpass filter was designed to meet the following specifications: 0.99 = [He = 1.01, — \a < 05x, L(e) = 0.01, 056m < ol Sm “The system function of the 1 sulting system is Gy = 0.079459 T] (; el the coefficients are given in Table 6.6. Notice that all the zeros of H(z) are on the le in this example; however, that need not be the case in general, Figure 6.64(a) shows a flow graph of a possible implementation of this system asa cascade of 2"4-order transposed direct form LI subsystems, The gain constant, 0.079459, issuch that the overall gain of the system is approximately unity in the passband, and it is assumed that this guarantees no overfiow at the output of the system. Figure 6.64(a) shows the gain constant placed at the input to the system. This approach reduces the amplitude of the signal immediately, with the result that the subsequent filter sections must have high gain to produce an overall gain of unity. Since the quantization noise sources are introduced after the gain of 0.079459 but are likewise amplified by the rest of the system. this is not a good approach. Ideally, the overall gain constant, being, less than unity, should be placed at the very end of the cascade, so that the signal and noise will be attenuated by the same amount. However. this creates the possibility of overflow along the cascade, Therefore, a better approach is to distribute the gain among the three stages of the system, so that overflow is just avoided al each stage of the cascade. This distribution is represented by HG (6.133) where sjs253 = 0.079459. The scaling multipliers can be incorporated into the coetfi- cients of the nume rators of the individual system functions H(z) = sy H(z), as in 5 afi age 3 4G) =[] ( ai bu ) Te. (6.134) where bj = by = se and 61, = 4614. The resulting scaled system is depicted in Figure 6.64(b), Also shown in Figure 6.64(b) are quantization noise sources re presenting the quan- tization of the products before addition. Figure 6.64 (c) shows an equivalent noise model, ) = 0.079459] 17402). (6.132) sy (sq HazhssH3 (2). 2 2 +0, TABLE 6.6 COEFFICIENTS FOR ELLIPTIC LOWPASS FILTER IN CASCADE FORM kaw ae bu 1 OATBRRE 0.72150. 71985 2 OASTIT 0.010077 081109 32 —DoSMTF9 DOU Oa LLAS2 450 o7D4s9 Ghapter'6 Structures for Discrete-Time Systems Sta Figure 6.64 Models for 6"-order cascade system with transposed direct form lt subsys- ‘tems. (a) Infinite-precision model, (b) Linear-noise model for scaled system, showing quan- ization of individual multiplications. (c} Linear-noise model with noise sources combined. for which it is recognized that all the noise sources in a particular section are filtered only by the poles of that section (and the subsequent subsystems). Figure 6.64(c) also uses the fact that delayed white-noise sources are still white noise and are independent of all the other noise sources, so that all five sources in a subsection can be combined into a single noise source having five times the variance of a single quantization noise source;! Since the noise sources are assumed independent, the variance of the output “this discussion can be generalized se show that the transpased direst form II nas the same noise behavior asthe ditect form Esystem, Section 6. Effects of Round-off Noise in Digit Fers 451 noise is the sum of the variances owing to the three noise sources in Figure 6.64(c). Therefore, for rounding, the power spectrum of the output noise is 0 BHy ei Pat Ag(ely? — BiH ei? 1 Pr pla) Se eel Cie ®) [ Teese Tan * jae) (ens) and the total output noise variance is 2-28 fa f SBOP HEMP 1 [2x Tyee (6.136) L ft stitts(e!®)h 1 time ‘Axle A Dae L meme ‘Ifa double-length accumulator is available, it would be necessary to quantize only the sums that are the inputs to the delay elements in Figure 6.64(b). In this case the factor of 5 in Eqs. (6.135) and (6.136) would be changed to 3. Furthermore, if a double- length register were used to implement the delay elements, only the variables ity] would have to be quantized, and there would be only one quantization noise source per subsystem. In that case, the factor of 5 in Eqs, (6.135) and (6.136) would be changed to unity, The scale factoxs s, are chosen to avoiel overffow at points along the cascade system. We will use the scaling convention of Eq. (6.124). Therefore, the scaling constants are chosen to satisfy si max (eI <1, (6.137) suse may (He!) Fate!) = 1 (6.137b) sisps3 = 0.079459, (6.137) The last condition ensures that there will be no overflow al the outputof thesystem for unit-amplitude sinusoidal inputs, because the maximum overall gain of the filter is unity. For the coefficients of Table 6.6, the resulting scale factors are s; = 0.186447, $9 = 0.529236, and s3 = 0.805267. Equations (6.135) and (6.136) show that the shape of the output noise power spec- trum and the total output noise variance depends on the way that zeros and poles are paired to form the 2"4-order sections and on the order of the 2"order sections in the cascade form realization. Indeed, it is easily seen that, for N sections, there are (NI): ways to pair the poles and zeros, and there are likewise (N!) ways to order the result ing 2"4-order sections, a total of (A)? different systems. In addition, we can choose either direct form 1 or direct form II (or their transposes} for the implementation of the 2""-order sections, In our example, this implies that there are 144 different cascade systems to consider, if we wish to find the system with the lowest output noise variance. For five cascaded sections, there would be 57,600 different systems. Clearly, the complete analysis of even low-order systems is a tedious task, since an expression like Eq. (6.136) must be evaluated for each pairing and ordering. Hwang (1974) used dynamic pro- gramming and Liu and Peled (1975) used a heuristic approach to reduce the amount of computation, 452 Ghapter 6 Structures for Discrete-Time Systems e-plane Unit cele Figure 6.65 Pole-zer0 plot for 6""-arder system of Figure 6.64, i showing pairing of poles and zeros, Even though finding the best pairing and orderingmay require computer optimiza- tion, Jackson (1970a, 1970b, 1996) found that good results are: almost always oblained by applying simple rules of the following form: 1. The pole that is efosest to the unit circle should be paired with the zero that is closest to it in the z-plane. 2. Rule I should be repeatedly applied until all the poles and zeros have been paired. 3. The resulting 2""-order sections should be ordered according to either increasing closeness to the unit cirele or decreasing closeness to the unit circle, ‘The pairing rules are based on the observation that subsystems with high peak gain are undesirable because they can cause overflow and because they can amplify quantization noise. Pairing a pole that is close to the unit circle with an adjacent zero tends to reduce the peak gain of that section. These heuristic rules are implemented in design and analysis tools such as the MATLAB Iunction 2p2sos. One motivation for rule 3 is suggested by Eq. (6.135). We see that the frequency re- sponses of some of the subsystems appear more than once in the equation for the power spectrum of the output noise. If we do not Want the output noise variance spectrum to have a high peak around a pole that is close to the unit circle, then it is advantageous to have the frequency-response component owing to that pole not appear frequently in Eq, (6.135). This suggests moving such ‘“high Q poles to the beginning of the cascade. On the other hand, the frequency response from the input to a particular node in the flow graph will involve a product of the frequency responses of the subsystems that pre- cede the node. Thus, to avoid excessive reduction of the signal level in the early stages of the cascade, we should place the poles that are close to the unit circle last in order. Section 6.9 Effects of Round-off Noise in Digita Filters 453 Clearly then, the question of ordering hinges on a variety of considerations, including total output noise variance and the shape of the output noise spectrum, Jackson (1970, 1970) used L , norms to quantify the analysis of the pairing-and-ordering problem and gave a much more detailed set of “rules of thumb” for obtaining good results without having to evaluate all possibilities The pole-zero plot for the system in our example is shown in Figure 6.65. The paired poles and zeros are circled. In this case, we have chosen to order the sections from least peaked to most peaked frequency response. Figure 6.66 illustrates. how the frequency responses of the individual sections combine to form the overall frequency response. Figures 6.66(a)-(c) show the frequency responses of the individual unscaled subsystems. Figures 6.66(d)-(I) show how the overall frequency response is built up. Notice that Figures 6.66(d)-(f) demonstrate that the scaling Figs. (6.137a)-(6.137¢) en- sure that the maximum gain from the input to the output of any subsystem is less than unity, The solid curve in Figure 6.67 shows the power spectrum of the output noise for the ordering 123 (least peaked to most peaked). We assume that B + 1 = 16 for the plot. Note that the spectrum peaks in the vicinity of the pole that is closest to the unit Circle, The dotted curve shows the power spectrum of the output noise when the scetion order is reversed (i.e., 321). Since section 1 has high gain at low frequencies, the noise spectrum is appreciably larger at low frequencies and slightly lower around the peak ‘The high Q pole still filters the noise sources of the first section in the cascade. soit still tends to dominate the spectrum, The total noise power for the tWo orderings turns out to be almost the same in this case. ‘The example we have just presented shows the complexity of the issues that arise in fixed-point implementations of cascade LR systems, ‘The parallel form is somewhat simpler because the issue of pairing and ordering does not arise. However, scaling still required (o avoid overfiow in individual 2" order subsystems and when the outputs of the subsystems are summed to produce the overall output. The techniques that we have developed must therefore be applied for the parallel form as well. Jackson (1996) dis- cusses the analysis of the parallel form in detail and concludes that its total output noise power is typically comparable to that of the best pairings and orderings of the cascade form. Even sa, the cascade form is more common, because, for widely used TTR filters such that the zeros of the system function are on the unit circle, the cascade form can be implemented with fewer multipliers and with more control over the locations of the zeros, 6.9.4 Analysis of Direct-Form FIR Systems Since the direct form [and direct form ILIIR systems include the direct form FIR system asa special case (ie, the case where all coefficients ay in Figures 6.14 and 6.15 are zero), the results and analysis techniques of Sections 6.9.1 and 6.9.2 apply to FIR systems if we eliminate all reference to the poles. of the system function and eliminate the feedback paths in all the signal flow graphs. ‘The direct form FIR system is simply the discrete convolution a yl] = > Alelatn — 41 (6.138) i= 20 — of 3 -2of woh -) ! 1 ! 1 0 oT 7 Radian frequency (o) ® En oF § 20h 40 a a ° a 7 Radian frequency (a) ws a0 2 2 #0 Figure 6.66 Frequency-response 2a : functions for example system, 0 ae Der (Re (a) 2010949 1H (2*. Radian frequerey (0) (0) 2010939 IH(6"| Cr {€) 2010949 13 (2%) 454 Oe Oar Om Radian frequency (0) @ 1 Oar leer Radian frequency (a) © 08 * or Or Radian frequency (o) 0 Figure 6.66 (continued (0) 20iogig 1H4co*D, fe) 20logy9 iM; (eh HH5 (2h), {1 2010040 hcp col age) = 20 log ig 1H'(e 9h. 455 455 Chapter Structures for Discrete-Time Systems 0 Gf 02 03 04 OS 06 07 08 09 1 Radian frequency ¢a) Figure 6.67 utput noise power spectrum for 123 ordering (solid line) and 321 ‘ordering (dashed line) of 2"-order sections. Figure 6.68(a) shows the ideal unquantized direct form FLR system, and Figure 6.68(b) shows the lincar-noisc mode! for the system, assuming that all products are quantized before additions are performed. The effect is to inject (M + 1) white-noise sources directly at the output of the system, so that the total output noise variance is 2-8 oF a +) (6.139) This is cxactly the result we would obtain by setting N = 0 and hes{n] = Slr} in Eqs. (6.106) and (6.115). When a double-length accumulator is available, we would need to quantize only the output. Therefore, the factor (Mf +1) in Eq. (6.139) would be replaced by unity. This makes the double-length accumblator a very attractive hardware feature for implementing FIR systems Overflow is also a problem for fixed-point realizations of FIR systems in direct form. For two's-complement arithmetic, we need to be concerned only about the size of the output, since all the other sums in Figure 6.68(b) are partial sums, Thus, the impulse-response coe fficients.can be scaled to reduce the possibility of overflow, Scaling, multipliers can be determined using any of the altematives discussed in Section 6.9.2. Of course, scaling the impulse response reduces the gain of the system, and therefore the SNR at the output is reduced as discussed in that section. Section 89 Effects of Rourd-off Noise in Digital Filters 487 go) Paty dae) patsy iyAM all) ated > eateh) ewe or] + Fle} ) Figure 6.68. Direct form realization of an FIR system. (2) nfnte- precision model {b) Linear-noise model Example 6.14 Scaling Considerations for the FIR System in Section 6.8.5 ‘The impulse-response coetficients forthe system in Section 6.8.5 are given in Table 65. Simple calculations show, and from Figure 6.54(b) we see, that Spun = ‘ (5 | ’ 4 max | (e!)] = 1,009, 751382, 1679442, 12 These numbers satisfy the ordering relationship in Eq, (6.127). Thus, the system, as 2. given, is scaled so that overflow is theoretically possible for a sinusoidal signal whose | amplitude is greater than 1/1.009 = 0.9911, but even so, overflow is unlikely for most signals, Indeed, since the filter has a finear phase, we can argue thal, for widchand signals, since the gain in the passband is approximately unity, and the gain elsewhere is ess han unity, the output signal should be smaller than the input signal In Section 6.5.3, we showed that linear-phase systems like the one in Example 6.14 can be implemented with about half the number of multiplications of the general FIR system, This is evident from the signal flow graphs of Figures 6.32 and 6,33. In these cases, it should be clear that the output noise variance would be halved if products were quantized before addition, However, the utilization of such structures involves a more complicated indexing algorithm than the dircet form. ‘The architecture of most DSP chips combines a double-length accumulator with an efficient pipelined multiply. 458 Chapter 6 Structures for Discrete-Time Systems accumulate operation and simple lonping control to optimize for the case of the direct form FIR system, For this reason, direct form FIR implementations are often most attractive, even compared with IIR filters that meet frequency-response specifications with fewer multiplications, since cascade or parallel structures do not permit long se- quences of multiply-accumulate operations It Section 6.5.3, we discussed cascade realizations of FIR systems. The results and analysis techniques of Section 6.9.3 apply to these realizations; but for FIR systems with no- poles, the pairing and ordering problem reduces to just an ordering problem. As in the case of IIR cascade systems, the analysis of all possible orderings can be very difficult ifthe system is composed of many subsystems. Chan and Rabiner (97a, 1973b) studied this problem and found experimentally that the noise performance is relatively insensitive to the ordering. Their results suggest that a good ordering is an ordering for which the frequency response from cach noise source to the outputiis relatively Nal and for which the peak gain is small 6.9.5 Floating-Point Realizations of Discrete-Time Systems From the preceding discussion, itis clear that the limited dynamic range of fixed-point arithmetic makes it necessary to carefully scale the input and intermediate signal levels in fixed-point digital realizations of diserete-time systems. The need for such scaling can be essentially eliminated by using floating-point numeric representations and floating point arithmetic In foating-point representations, a real number x is represented by the binary number 2°, where the exponent c of the scale factor is called the characteristic and isa fractional part called the mantissa, Both the characteristic and the mantissa are represented explicitly as fixed-point binary numbers in floating-point arithmetic sys tems. Floating-point representations provide a convenient means for maintaining both a wide dynamic range and low quantization noise; however, quantization error mani- fests itseif ina somewhat different way, Floating-point arithmetic generally maintains its high accuracy and wide dynamic range by adjusting the characteristic and normalizing the mantissa so that 0.5 = fy < 1. When floating-point numbers are multiplied, their characteristics are added and their mantissas are multiplied. Thus, the mantissa must be quantized. When two floating-point numbers are added, their characteristics must he adjusted to be the same hy moving the binary point of the mantissa of the smaller number. Hence, addition results in quantization, too. If we assume that the range of the characteristic is sufficient so that no numbers become larger than 2°. then quantization affects only the mantissa, but the error in the mantissa is also sealed by 2°. Thus, a quantized loating-point number is conveniently represented as Bax tear tex (6.140) By representing the quantization error as a fraction ¢ of x, we automatically represent the fact that the quantization error is scaled up and down with the signal level. ‘The aforementioned properties of floating-point arithmetic complicate the quan- tization error analysis of floating, point implementations of discrete-time systems First, noise sources must he inserted both after each multiplication and after each addition. ‘An important consequence is that, in contrast to fixed-point arithmetic, the order in Section 6.10 Zero-Input Lint Cycles in Fixed-Point Realzations of HR Digital Fitters 459 which multiplications and additions are performed can sometimes make a big differ. ence. More important for analysis, we can no longer justify the assumption that the quantization noise sources are white noise and are independent of the signal. In fact, in Eq, (6.140), the noise is expressed explicitly in terms of the signal. Therefore, we can no longer analyze the noise without making assumptions about the mature of the input signal. If the input is assumed to be known (e.g, white noise), a reasonable assumption is that the relative error ¢ is independent of x and is uniformly distributed white noise. With these types of assumptions, useful results have been obtained by Sandberg (1967), Liu and Kaneko (1969), Weinstein and Oppenheim (1969), and Kan and Ag- garwal (1971). In particular, Weinstein and Oppenheim, comparing floating-point and fixed-point realizations of 1*- and 2"-order IIR systems, showed that if the umber of bits representing the floating-point mantissa is equal to the length of the fixed-point word, then floating-point arithmetic leads to higher SNR at the output. Not surpris- ingly, the difference was found to be greater for poles close to the unit circle. However, additional bits are required to represent the characteristic, and the greater the desired dynamic range, the more bits are required for the characteristic. Also, the hardware to implement floating-point arithmetic is much more complex than that for fixed-point arithmetic. Therefore, the use of floating-point arithmetic entails an increased word length and increased complexity in the arithmetic unit, Tts major advantage is that it essentially eliminates the problem of overflow, and if a sufficiently long mantissa is used, quantization also hecomes much less of a problem. This translates into greater simplicity in system design and implementation ‘Nowadays, digital iltering of multi-media signalsis often implemented on personal ‘computers or workstations that have very accurate floating point numerical represen- tions and high speed arithmetic units. In such cases, the quantization issues discussed in Sections 6.7 6:9 are generally af lltle or no concern, However. in high volume systems, fixed point arithmetic is generally required to achieve low cost. 10 ZERO-INPUT LIMIT CYCLES IN FIXED-POINT REALIZATIONS OF IIR DIGITAL FILTERS: For stable IIR discrete-time systems implemented with infinite-precision arithmetic, if the excitation becomes zero and remains zero for n greater than some value 129, the output for a = np will decay asymptotically toward zero, For the same system, implemented with inite-register-length arithmetic, the output may continue to oscillate indefinitely with a periodic pattern while the input remains equal to zero. This effect is often referred to as zera-inpur limit cycle behavior and is a consequence either of the nonlinear quantizers in the feedback loop of the system or of overflow of additions. The limit cycle behavior of a digital filter is complex and difficult to analyze, and we will not allempt to treat the topic in any general sense, To illustrate the point, however, we will give two simple examples that will show how such limit q 6.10.1 Limit Cycles Owing to Round-off and Truncation Successive roundoff or truncation of products in an iterated difference equation can create repeating patterns. This is illustrated in the following example. Chapter 6 Structures for Discrete-Time Systerns Example 6.15 imit Cycle Behavior in a 1"-Order System S% Consider the 1*Lorder system characterized by the difference equation & ybt) = ayln — 1) + xl, tal 1m) The signal flow graph of this system is shown in Figure 6,65%a), Let us assume that the register length tor storing the coefficient a, the input x(n], and the flver node , Yatiable yin ~ 1] 4 bits (ic, a sign bit to the left of the binary point and 3 bits to the right of the binary point), Because of the fnite-length reyister, the produet ayn — T] must be tounded o truncated t0 4 bits before being added co «in]. The flow graph representing the actual realization based on Eq. (6.141) is shown in Figure 6.69(b). Assuming rounding of the product the netual output $f] satisfies the nonlinear dit ference equation Stn = Olas in — 11) + xh (6.142) where Q-] represents the rounding operation, Let us assume that @ = 1/2 = 06106 and that the input isx{al = (7/8)8la] = (oI 118i). Using Bq, (6.142), we see that for 40, $10} = 7/8 = 00111. To obiain S11), we muliply $10] by a, obtaining the resuk y[0] = 0-011100, a 7-bit number that must be rounded to 4 bits. This number, 7/16, is exactly halfway between the two 4-bit quantization levels 4/8 and 3/8. If we choose always to round upward in such cases, then 0.01 1100 rounded to 4 bits is 0; 100 = 1/2 ‘Since [1] = 0, it follows that §{1] = 0,100 = 1/2. Continuing to iterate the difference ‘equation gives §[2] = QlaS11}) = 0.010 = 1/4 and $13] = 00001 = 1/8. In both these cases, no rounding is necessary. However, to obtain $[4]. we must round the 7-bit sumber 513] = 05090100 to 0.0L, The same result is obtained for all values of n= 3. The output sequence for this example is shown in Figure 6.70(@). Ifa = —1/2, ‘we can carry oul the preceding computation again to demonstrate thal the output is as shown in Figure 6.70(b). Thus, because of rounding ofthe product aa — 1], the output reaches a constant value of 1/8 when a = 1/2 and a periodic steady-state oscillation between 41/8 and 1/8 whena = —1/2. These are periodic outputs similar to those that would be obtained from a I"-order pole at 2 = =I instead of at » = +1/2 alnl in lal orl 4 () Figure 6.69 1*-order IIR system. (a) Ininte-precision tmear system. (b) Nan- ‘near system due to quantization Section 6.10 Zero-Input Limit Cycles in Fixed-Point Realizations ct 1IR Digital Fiers 461 2 a 3 fa) @= 2 Sr) (a 3) 1 1 i 4 4d, ay Gy i : 1 yy 2 Ae 123 4 5 6) ~ ” (a i. apap x ei ms ° Figure 6.70 Resparse of the 1-order system of Figure 6.69 to an impulse. (b)a=-} } When a = 41/2, the period of the oscillation is 1, and when @ = —1/2, the period of oscillation is 2. Such steady-state periodic outputs are called limit eycles, and their existence was firsl noted by Blackman (1965), who referred to the amplitude intervals to which such limit cycles are confined as dead bunds. Tn this ease, the desd band is -2-F «© jn] = 42°", where B = 3. ‘The foregoing example has iflustrated that a zero-input limit cycle can result from rounding ina 1*-order DIR system. Similar results can be demonstrated for truncation, 2"4-order systems can also exhibit limit cycle behavior. In the case of parallel realizations Chapter 6 Structures for Discrete-Time Systerns of higher-order systems, the outputs of the individual 2"-order systems are indepen: dent when the input is zero. In this case, one or more of the 2%-order sections could contribute a limit cycle to the output sum. In the case of cascade realizations, only the first section has zero input; succeeding sections may exhibit their own characteristic limit eycle behavior, or they may appear to be simply filtering the limit eyele output of a previous section. For higher-order systems realized by other filter structures, the limit eyele behavior becomes more complex, as does its analysis. In addition to giving an understanding of limit cycle effects in digital filters, the preceding results are useful when the zero-input limit cycle response of a system is the desired output. This is the case, for example, when one is concerned with digital sine ‘wave oscillators for signal generation or for the generation of coefficients for caleulation, of the discrete Fourier transform. 6.10.2 Limit Cycles Owing to Overflow In addition to the classes of limit cycles discussed in the preceding section, a more severe type of limit eycle can occur owing 1o overflow. The effect of overflow is to inserta gross error in the Output, and in some cases the filter output thereafter oscillates between large-amplitude limits. Such limit cycles have heen referred to as averfiow oscillation. ‘The problem of oscillations caused by overflow is discussed in detail by Ebert et al (1969). Overflow oscillations are illustrated by the following example. Example 6.16 Overflow Oscillations in a 2"-Order System: %) Consider a 2%4-order system realized by the difference equation ala] + Ola, 51n— 111+ Qlaysln — I, (6143) where QL-] represents two's-complemeat rounding with a word fenath of 3 bits plus | Dil for the sign, Overflow can occur with (wo's-complement addition of the rounded proglucts. Suppose that a, = 3/4 = 001 10 and a = —3/4 = 10610, and assume tral ‘[n] Femains equal to zoro for n = 0, Furthermore, assume that §[-1] = 3/4 = Qld and §—2] = =3/4 = 1,010, Now the output at sample n = 0s, 510] = 05110 x 0110+ 1,010 x 16010. I we evaluate the products using (wo's-complement arithmetic, we obtain 510] = 0-100100 + 0,100100, and if we choose to round upward when a numberis halfway between two quantization levels, the result of two's-complement addition is 510] = 0101 + 06101 = 1,010 — 3) Inthiscase the binary carry overflows into the sign bit, thus changing the positive sum. into a negative number. Repeating the process ives = 1011+ 1.011 0.110 = 3 ‘The binary carry resulting from the sum ofthe sign bits is lost, amd the negative sum is mapped into a positive number. Clearly. stn] will continue to oscillate between 13/4 - and ~3/4 until an input is applied. Thus, §{n} has entered a periodic limit cycle with 2 £2 period of Zand an amplitude of almost the full-scale omplitude of the implementation, Section 6.19 Summary 463 ‘The preceding example illustrates how overflow oscillations occur. Much more complex behavior cam be exhibited by higher-order systems, and other frequencies can occur, Some results are available for predicting when overllow oscillations can be sup- ported by a difference equation (see Ebert et al., 1969). Overflow oscillations ean be avoided by using the saturation overflow charactcristic of Figure 6.45(b) (see Ebert et al., 1969). 6.10.3 Avoiding Limit Cycles The possible existence of a zero-input limit cycle is important in applications where a digital filter is to be in continuous operation, since itis generally desired that the output approach zero when the input is zero. For example, suppose that a speech signal is sampled, filtered by a digital filter, and then converted back to an acoustic signal using a D/A converter. In such a situation it would be very undesirable for the filter to enter a periodic limit cycle whenever the input is zero, since the limit cycle would produce an audible tone ‘One approach to the general problem of limit cycles is to seek structures that do ot support limit cycle oscillations. Such structures have been derived by using state~ space representations (see Barnes and Fam, 1977; Mills, Mullis and Roberts, 1978) and concepts analogous to passivity in analog systems (sce Ruo and Kailath, 1984; Fettweis, 1986}. However, these structures generally require more computation than an equivalent cascade or parallel form implementation. By adding more bits to the computational wordlength, we can generally avoid overflow. Similarly, since round-off Jimit cycles usually are limited to the least significant bits of the binary word, additional bits can be used to reduce the effective amplitude of the limit cycle. Also, Claasen et al. (1973) showed that if a double-length accumulator is used so that quantization ‘occurs after the accumulation of products, then limit cycles owing to round-off are much less likely to occur in 2°4-order systems, Thus, the trade-off between word length and computational algorithm complexity arises for limit cycles just as it does for coefficient quantization and round-off noise. Finally. itis important to point out that zero-input limit cycles due to both overflow and round-off are a phenomenon unique to ITR systems: FIR systems cannot support zero-input limit cycles, because they have no feedback paths. The output of an FIR system will be zero no later than (M + 1) samples after the input goes to zero and remains there. This is a major advantage of FIR systems in applications wherein limit cycle oscillations cannot be tolerated. 6.11 SUMMARY In this chapter, we have considered many aspects of the problem of implementing an, LTT discrete-time system. The first half of the chapter was devoted to basic implemen- tation structures. After introducing block diagram and signal flow graphs as pictorial Tepresentations of difference equations, we discussed a number of basic structures for Chapter 6 Structures for Discrete-Time Systems IIR and PIR discrete-time systems. ‘These included the direct form J, direct form Hl, cas- cade form, parallel form, lattice form. and transposed version of all the basic forms. We showed that these forms are all equivalent when implemented with infinite-precision arithmetic. However, the different structures are most significant in the context of finite- precision implementations. Therefore, the remainder of the chapter addressed problems associated with finite precision or quantization in fixed-point digital implementations of the basic structures We began the discussion of finite precision effects with a brief review of digital number representation and an overview showing that the quantization effects that are important in sampting (discussed in Chapter 4) are also important in representing the coefficients of a discrete-time system and in implementing systems using finite-precision arithmetic, We ilustrated the effect of quantization of the coefficients of a difference ‘equation through several examples, This issuc was treated independently of the effects of finite-precision arithmetic, which we showed introduces nonlinearity into the system, We demonstrated that in some cases this nonlinearity was responsible for limit cycles that may persist after the input toa systern has become zero, We also showed that quan- tization effects can be modeled in terms of independent random white-noise sourc thal are injected internally into the flow graph. Such linesr-noise models were devel oped for the direct form structures and for the caseade structure. In all of our discussion of quantization effects, the underlyingtheme was the conflict between the desire forline quantization and the need {o maintain a wide range of signal amplitudes. We saw that in fixed-point implementations, one can be improved at the expense of the other, but to improve one while leaving the other unalfected requires that we increase the number of bits used to represent coefficients and signal amplitudes. This can be done either by increasing the fixed-point word length or by adopting a floating-point representation Our discussion of quantization effects serves two purposes, Hirst, we developed several results that can be useful in guiding the desiga of practical implementations We fonnd that quantization effects depend greatly on the structure used and on the specific parameters of the system to be implemented, and even though simulation of the system is generally necessary to evaluate its performance, many of the results discussed are useful in making intelligent decisions in the design process. A second. equally important purpose of this part of the chapter was to illustrate style of analysis that can be applied in studying quantization effects in a variety of digital signal-processing algorithms. The examples of the chapter indicate the types of assumptions and approximations that are commonly made in studying quantization effects. In Chapter 9, we will apply the analysis techniques developed here to the study of quantization in the computation of the discrete Fourier transform Basic Problems with Answers 6.1, Determine the system function of the (wo Row graphs in Figure P6.1, and show that they have the same pales. bar 6 Problems 465 ar ' Neewor im a 77 reas in rin le % reos ag NN 4 MN etmork2 oy Figure P6.1 6.2. ‘The signal flow graph of Figure P62 represents 4 Tinear difference equation with constant coefficients. Determine the difference equation that relates the ourput yln to the input shy). Figure P6.2 6.4, Figure P63 shows six systems, Determine which one of the last five, (b)-(). has the same system function as (2). You shauld be able to eliminate some of the possibilities by inspec: tion, “hn co) w) © ‘a Figure P63 Chapter 6 Problems. 467 ated vied Figure P6.3 (continued) 64, Consider the system in Figure P6.3(d). (a) Determine the system function relating the :-transforms of the input and output (b) Write the difference equation that issatisfied by the inputsequence x[n)and the output sequence yin} 6.5. An LIV system i realized by the flow graph shown in Figure Pa.5. Figure P6.5 {a) Write the difference equation relating xn] and yl] for this flow graph. {b) What is the system function of the system? {e) Inthe realization 07 Figure P6.5, how many real multiplications and real additions are required to compute each sample of the output? (Assume that x[x] is real, and ussume that multiplication by 1 does not count in the total.) {@) The realization of Figure P65 requires four storage registers (delay elements). Is it possible to reduce the number of storage registers by using a different structure? If so, draw the flow graph; if not, explain why the aumber of storage registers cannol be reduced, Chapter 6 Structures for Discrete-Time Systems 66, Deterinine the impulse response of each of the systems in Figure P66 tel st] @ Figure P6.6 6.7. Let x{n] and y[n} be sequences related by the following difference equation: 1 1 yin] = 5yln = 2] = xf —2}— 5rin Draw a direct form 1 signal flow graph for the causal LIT system corresponding to this difference equation. Shepter 6 Problems 469 628. The signal flow graph in Figure P68 represents un LTH system. Determine a difference ‘equation that gives a relationship between the input x[n] and the output y[n] of this system, As usual, all branches of the signal low graph have unity gain unless specifically indicated otherwise la} 3 Figure P6.8 63, Figure P6.9 shows the signal Nlow graph for 4 causal discrete-time LTH system. Branches ‘without gains explicitly indicated have a gain of unity. (a) By tracing the path of an impulse through the flowgraph, determine hil, the impulse response at (b) Determine the difference equation relating x{n] and y(n]. Figure P6.9 6.10. Consider the signal flow graph shown in Figure P6.10, (a) Using the node variables indicated, write the set of difference equations represented by this flow graph, (b) Draw the flow graph of an equivatent system that is the cascade of two order systems, (@) Is the system stable? Explain a” 1 a win} Figure P6.10 470 Chapter 6 Structures for Discrete-Time Systems 6.1L. Consider a causal LTT system with impulse response h(v] and system function (a) Draw a direct form II Blow graph for the system. (b)_ Draw the transposed form of the flow graph in part (a). 6.2. For the LTT system described by the Now graph in Figure P6.12, determine the difference equation relating the input x{2] to the output yl} Sa —\ fl Figure P6.12 6.13. Draw the signal flow graph for the direct form | implementation of the fT system with system function Hw 6.14. Draw the signal flow graph for the direct form I implementation of the LTT system with system function GAS. Draw the signal flow graph for the transposed direct form I] implementation of the LT1 sys tem with system function 6.16. Consider the signal flow graph shown in Figure P6.16. (2), Draw the signal flow graph that results from applying the transposition theorem to this, signal flow graph 0). Confirm that the transposed signal flow graph that you foundin (a) has thesame system fanction H(z) as the original system in the figure. st) Figure P6.16 Chapter & Problems an 6.17, Consider the causal TI system with system function (a) Draw the signal flow graph for the direct form implementation of this system. (b) Draw the signal flow graph for the transposed direet form implementation of the system 6.18. For some nonzero choices of the parameter 4, the signal flow graph in Figure P6.18 can be replaced by a 2"4-order direct form II signal low graph implementing the same system function. Give one such choice for a and the systern function H() that results. Figure P6.18 6.19, Consider the causal LTT sysiem with the system function Draw a signal fow graph that implements this system as a parallel combination of 1"order transposed direct form II secti 6.20, Draw a signal flow graph implementing the system function HG) as a cascade of 2°¢-order transposed direct form I sections with real coefficients. Basic Problems 6.21. For many applications. itis useful to have asystom that will generate a sinusoidal sequence. One possible way to do this is with a system whose impulse response is An] =e!" Tal. ‘he real and imaginary parts of hla} are therefore hyn] = (coswynduln] and hyn} = (sin eqa jul}, respectively. Tn implementing a system with x complex impulse response, the real and imaginary parisare distinguished as separate outputs. By frst writing the complex difference equation required to produce the desired impulse response and then separating it into its real uod imaginary parts, draw a flow graph that will implement this system. The flow graph that ‘you draw should have only real cocflicients. This implementation is sometimes called the caupled form oscillaior, since, when the input is excited by an impulse, the outputs are sinusoidal ar Chapter Structures for Discrete-Time Systerns 6.22, For the system function draw the flow graphs of all possible realizations for this system as cascades of 1%-order systems. 623, We want to implement a causal system H(<) with the pole-zero diagram shown in Fig. ure P6.23. For all parts of this problem, cj, 22. pj, and pz are real and a gain constant that is independent of frequency can be absorbed into a gain coefficient in the outpat branch of each flow graph, ' Figure P6.23 (a) eww the flow graph of the direct form IT implementation. Determine an expression for each of the branch gains in terms of the variables z,, 22. py, and po. (b) Draw the flow graph of an implementation as a cascade of 2"4-order dicect form 1 sections. Determine an expression for each of the bramch gains in terms of the variables 24527» Ps 08 pr» (€), Draw the flow graph of a parallel form implementation with 1*-order direct form 11 sections. Specify a system of linear equations that can be solved to express the branch ins in terms ofthe variables zy, 22, pj.nd pa, 624, Consider a causal LTT system whose system function is (a) Draw the signal flow graphs for implementations ofthe system in each ofthe Following Forms (@) Diroct form i) Direct form 11 (iii) Cascade form using 1*- and 2"4-order direet form II sections fv) Parallel form using 1"- and 2"4-order direct form I seetions (8) Transposed direct form 1 (b) Write the difference equations for the low graph of part (¥) in (a) and show that this system has the correct system function Chapter Problems ars 625. A causal LTI system is defined by the signal flow graph shown in Figure P6.25, which represents the system as a cascade of a 2°-order system with a 15!-order system, sal ea ost — 4 Figure P6.25 (a) What isthe system fanction of the overall cascade system? 4h) Is the overall system stable? Explain briefly {¢) Is the overall system a minimam-phase system? Explain briefiy. (a) Draw the signal Sow graph ofa transposed direct form IT implementation of this system. 6.26. A causal LATE system has sys sm function given by the following expression: geek i+ {opt (a) Is this system stable? Explain briefly (b)_ Draw the signal flow graph of a parallel form implementation of this system, (©) Draw the signal flow graph of a cascade form implementation of this system as & cascade of a 1" order system and a 2"-order system. Use a transposed direct form II implementation for the 2°4-order system. 627. An LTI system with system function Ae am Chapter Structures for Discrete-Time Systems (a) Fill in all the coefficients in the diagram of Figure P6.27. Is your solution unique? (b) Define appropriate node variables in Figure P6.27, and write the set of difference ‘equations that is represented by the flow praph. 62K. (a) Determine the system function, 1/(z), from xf] to yl] for the flow graph shown in, Figure P6.28-1 (note that the location where the diagonal lines criss-cross isnot a single node). z Figure P6284 (b) Draw the direct form (I and II) flow graph of systems having the system funetion H() (©) Design A(z) such that H(z) in Figure P6.28-2 has a causal stable inverse and H,(e!)| — |H(e)), Note: Zeto-pole cancellation is permitied. ST ney LOL ace Ti Figure P6.28-2 (a) Draw the transposed direct form II flow graph for H:(2), 6.29. (a) Determine the system function H (¢> relating the input xin] to the output pln forthe FIR lattice fitter depicted in Figure P628. yi] eg ge pe vt zt a Figure P6.29 (b) Draw the lattice filler structure for the all-pole filler 1/2). Chapter Problems 475 6.40, Determine and draw the lattice filter implementation of the following causal all-pole system function: Ts the system stale? GL, An IIR lattice filter is shown in Figure Pé.31 xin vay z Figure P6.31 (a) By tracing the path of an impulse through the flowgraph, determine y{1] for input fee} = Bf. tb) Determine a flow graph for the corresponding inverse filter {c) Determine the transfer function for the IIR filter in Figure P6.31, 6.32. ‘he flow graph shown in Figure P6.32 is an implementation of a causal, LT! system. ral Figure PB.32 (a) Draw the transpose of the signal flow graph. (b) Foreither the original system or its transpose, determine the difference equation relat- ing the input xr] to the output y[n]. (Note: The difference equations wil be the same for both structures) (6) Is the sysiem BIBO stable? (a) Determine y12] if a) = (1/2) ln 418 Chapter 6 Structures for Discrete-Time Systems Advanced Problems 6.33, Consider the LTT system represented by (he FIR lattice structure in Figure P6.33-1 vin xf] Seal Figure P6.23-1 (a) Determine the system function from the input x[n] to the output ef] (NOT yf. (b) Let #(<) be the system function from the input 2[7] to the output yl], and let gl] be the result of expanding the associated impulke response kl} by 2 as shown in Figure. P6.33-2. S0| pg Figure P6.33-2 The impulse response ef] defines a new system with system function (42). We would Tike to implement Giz) using an FIR lattice structure. Determine the &-parameters necescary for an FIR lattice implementation of Gi), Note: You should think carefully before diving into a long calculation. 6.34, Figure P6.34-1 shows an impulse response k(n, specified as A , ' hin = | (2)"" sta, torn an integer maple of constant in between asindieated Alay 12 is 3 0 4 8 Figure P6.34-1 (a) Determine a choice for 4 [7] and hy|n] such that ‘e] = fy fal eal, where hy [n] isan FIR filterand where fi] = 0 for n/4 not an integer. fs gf] an FIR or IIR filter? Chapter6 Problems: an {b} ‘The impulse response h[n] #8 to he used in a downsampling system as indicated in Figure P6:34-2. ON int Ll ye Figure P6.34-2 Draw a flow graph implementation of the system in Figure P6.34-2 that requires the minimum number of nonzero and nomuaity coefficient multipliers, You may tse unit delay elements, coefficient muttiptiers, adders and compressors. (Multiplication by a zero or a one does not require a multiplier) (e} For your system, state how many multiplications per input sample and per output sample are required, giving a brief explanation. 5. Consider the system shown in Figure P6.35-L gu lr) 44 al Figure P6.35-1 We want to implement this system using the polyphase structure shown in Figure: P6.35- se +4 extn] i vi 7 yt] sy 4 extn] Figure P6.35-2 Polyphase structure of the system. For parts (a) andl (b) only, assume fifn] is defined ia Figure P6.35-3, wow ye Ata va ttle. +s 67 8 ! lvoe WT "Figure PB.35-3 (hl) =O for all n = Dand n > 12). ame Ghapter6 Structures for Discrete-Time Systems (@) Give the sequences ep|J, cyl Jeli], and e3[n] that result in acorrect implementation. (b)_ We want to minimize the total number of multiplies per output sample for the imple- mentation of the structure in Figure Pé.35-2. Using the appropriate choice of fn). elt, eala!, and e3|n] from part (a), determine the minimum number of multiplies per output sample for the overall system. Also, determine the minimum number of ‘multiplies per input sample for the overall system. Explain, (6) Instead of using the sequences gol |, eye}. ealit}, and e3[n] identified in part (a), now assume that Fy(e®) and £2(e!®) the DTFTS of eolrrl andl epln], respectively, are as aliven in Figure P6.35-4, and E,(e!®) = Ex(e) = 0. Pony Ble) YE sa 209 ar ar Figure P6.35-4 ‘Sketch und label A1¢e!*") from (x, 2), 6.36. Consider a general flow graph (denoted Network A) consisting of coefficient multipliers and delay clements, as shown in Figure P6.36-1. Ifthe systems initially at rest, its behavior is completely specified by its impulse response ln]. We wish to modify the system to create a new flow graph (denoted Network A ;) with impulse response: hy [nl = (— L)*aln so Network A Figure P6.36-1 (a) Hf H(e4%} is as given in Figure P6.36.2, sketch Hf (2/4), Me™) 2 Figure P5362 (b) Explain how to modify Network A by simple modifications ofits coefficient multipliers andior the delay branches to form the new Network 4, whose impulse response is rule Ghapter6 Problems 479 (c) IE Network A is as given in Figure P6.36-3, show how to modify it by simple moditica- tions to only the coefficient multipliers so that the resulting Network A has impulse response [1]. al) Figure P6.36-3 6.37. The flow graph shown in igure 6.37 is noncomputable; ic. itis not possible to compute the output using the difference equations represented by the flow graph because it contains a closed loop having no delay elements Figure P6.37 (a) Write the difference equations for the system of Figure P6.37, and from them, find the system function of the flow graph. (b) From the system function, obtain a flow graph that ‘computable, 6.38. ‘The impulse response of an LTT system is a, Went, ninr= (5 Siac (a) Draw the flow graph of a direct form nonrecursive. implementation of the system. (b) Show that the corresponding system function can he expressed as a8 A) = lel ah (€) Draw the flow graph of an. implementation of (2), as expressed in part (b), eorte- sponding toacascade of an FIR system (numerator) with an IIR system (denominator). (@) Is the implementation in part (c) recursive or nonrecursive? Is the overall system FIR or IR? Chapter 6 Structures for Discrete-Time Systerns (e) Which implementation of the system requires (i) the most storage (delay elements)? (i) the most arithmetic (mukiplcations and additions per ouspat sample)? 6.39, Consider an FIR system whose impulse response is iin) = [ HOTEOARH/AG ng). On 14 o otherwise This ystem is un example ofa lass of filer: known a Frequeney-sampling filters, Prob- lem 6351 discusses these filters in detall. Ln this problem, we consider just one specific case. 5/2 (a) Sketch the impulse response of the system for the cases ny = Oand my = (b) Show that the system function of the system can be expressed as Jen s2xnadl5 Leitenos 5 H@=d-2 8) sintiolS/2) 1 sinflw —277/18)15/2] singo/3) * 2 sinllo— 2a/15)/21 1 sino + 29 /15)15/2] 2 sin[(o + 29/15/21 ‘Use this expression to sketch the magnitude of the frequency response ofthe system for ny = 15/2. Obtain a similar expression for rig = 0. Sketeh the magnitude response {for np =0. For which choices of ng does the system have a generalized linear phase? (a) Drawa signal flow graph of an implementation of the system asa cascade of an FIR sys- tem whose system function is 2! and a parallelcombinationof a1*t-and 2"4-order AIR system. 6.0. Consider the discrete-time system depicted in Figure P6.40-1. Figure P6.40-4 (a), Write the set of difference equations represented by the flow graph of Figure Pé.ath! (b) Determine the system function #7; (:) = ¥(2)/ X (2) of the system in Figure Pé6.40-1 and determine the magnitudes and angles of the poles of Fz) asa function of r for -leral (©) Figure P6.40-2 shows a different flow graph obtained from the flow graph of Fig: ure P640-I by moving the delay elements to the opposite top branch. How is the system function F(z) = ¥p(z)/ X (2) related to H(z)? Figure Pé.40-2 Chapter 6 Problems 481 GAl. ‘The three flow graphs in Figure P6.41 are all equivalent implementations of the same two- input, two-output LTE system. ‘ 4 Saal ier aT xe aot ln) la Nework A Network « 6) © Figure Pé.44 (a) Write the difference equations for Network A. (b) Determine values of a, b,c, and d for Network 8 in terms of rin Network A such that the two systems are equivalent (c) Determine values of e and f for Network C in terms of r in Network 4 such that the ‘ovo systerns are equivalent, (4) Why might Network B or C be preferred over Network A? What possible advantage could Network 4 have over Network B or C? 6.42. Consider an all-pass system with system function ut 1-054 ‘A flow graph for un implementation of this systema is shown in Figure P6.42, H(2) = 054 Figure P6.42 (a) Determine the coefficients fc, and d such that the flow graph in Figure P6.42 is a direct eealization of H(). (b) Ina practical implementation of the network in Figure P6.42, the coefficients h, ¢, and might be quantized by rounding the exact walue to the nearest tenth (¢., 0.54 will be rounded 10 (h5 and. 1/054 = 1.8518... Will be rounded to 1,9), Would the resulting system stil be an all-pass system: Chapter Structures for Discrete-Time Systems (©) Show that the difference equation relating the input and output of the all-pass system ‘with system function #2) can be expressed as yl] = O.54¢[a 1) ~ tad + she = 1. Draw the flow graph of « network that implements this difference oquation with to delay elements, hut only one multiplication by a constant other than +1 (@) With quantized coetficients, would the flow graph of part (c) be an all-pess system? The primary disadvantage of the implementation in part {c) compared with the im plementation in part (a) is that it requires two delay elements. However, for higherorder systems, itis necessary to implement a cascade of all-pass systerns. Ror W allepass sections in cascade, itis possible to use all-pass sections in the form determined in part (c) while requiring Only (.V + 1) delay elements, This is accomplished by sharing a delay element between sections (e) Consider the all-pass system with system function (8) (=e) ‘Draw the flow graph of a “cascade” realization composed of two sections of the form ‘oblained in part {c) with one delay clement shared between the sections. The resulting ow graph should have only three delay elements, (0) With quantized coefficients « and 6, would the flow graph in part {e) be an all-pass system? 643. All branches of the signal flow graphs in this problem have unity gain unless specifically indicated otherwise ia] Figure P6.43-1 (a) “The signa! flow graph of Systern A, shown in Figure P6.43-1. represents a causal LTI sys ‘tem. Isil possible to implement the same input-output relationship using fewer delays? {itis possible, whatis the minimum number of delays required to implement an equiv. alent system? If it is not possible, explain why not. (h) Does the System B shown in Figure P6.43-2 represent the same input-outpul relation: ship as System A in Figure P6.43-1? Explain clearly, va) 1 Figure P5.43-2 Chapter 6 Problems 68, 483 (a) Draw the direct form | signal flow graph for the system. How many delays and multi pliers do you need’? (Do not count multiplying by +1.) (b) Draw a signal flow graph for the systers that usesene multiplier. Minimize the number of delays (e) Now consider another all-pass system whose system function is Determine and draw a signal flow graph for the system with wo multipliers and three: delays With infinite-precision arithmetic, the flow graphs shown in Figure P6.45 have the same system function, but with quantizcd fixed-point arithmetic they behave differently, Assume that w and bare eeal mumbers and

You might also like