APPLYING A SEAMLESS DESIGN FLOW TO
FAST DEVELOPMENT OF A CARRIER
SYNCHRONIZER FOR MPSK

M. Vaupel and H. Meyr

Chair for Integrated Systems of Signal Processing
Aachen University of Technology
52056 Aachen
Germany
[email protected] th-aachen.de

Abstract - Short product cycles and the necessity to achieve a short
time to market call for sophisticated design methodologies. In this paper
a seamless design flow is described that enabled the design of a chip for
carrier synchronization of 8PSK, QPSK and BPSK modulated signals in
three months. The circuit was fabricated in a 1 µm CMOS process using
standard cells and is currently in commercial use in a modem for digital
TV transmission.

INTRODUCTION
The demand for short product cycles in the field of VLSI design is
continuously increasing. Time to market and rapid prototyping are hot
topics. These demands call not only for skilled designers but also for
design methodologies which support the development of digital integrated
circuits. The design flow ranges from system level specification and
simulation down to post layout simulation or even post production testing.
The verification steps are an integral part of this flow.
Phase shift keying (PSK) is a commonly used modulation technique for
the transmission of digital data. Due to local oscillator inaccuracies of the
receiver-transmitter pair, the complex valued signal suffers from a frequency
offset and a phase drift. In order to overcome these effects a correction device
is needed which should be able to cope with different modulation schemes
(8PSK, QPSK and BPSK) to be applicable in various environments.
Additionally the functional parameters should be programmable to ensure
the required flexibility.

THE CIRCUIT
Figure 1 shows the receiver branch of the whole modem for digital TV
transmission. After analog down conversion the signal is sampled. In-phase
and quadrature components are input to a matched filter that is part of
the timing synchronization loop. The output signal is fed into the carrier
synchronizer. The corrected signal values are input to a channel decoder (eg
a Viterbi decoder). As an alternative, the matched filter could also be used
inside the synchronization loop (see Figure 1).

0-7803-2612-1/95 $4.00 © 1995 IEEE

Authorized licensed use limited to: University of Management & Technology Lahore. Downloaded on May 09,2020 at 00:16:46 UTC from IEEE Xplore. Restrictions apply.


Figure 1: Receiver chain of the modem


The internal structure of the synchronizer is basically a digital PLL [1,2]
and is depicted in Figure 2. It consists mainly of phase rotator, phase error
detector, loop filter, and NCO. Additionally a sweep generator is implemented
that allows an automated search of the frequency offset to be performed
within a wide range during acquisition. This search is controlled by a lock
detection unit that detects whether the received data is located within a
certain range around a reference point in the complex plane or not. This
in-range information is processed within the count-and-compare unit which
puts out a loop-in-lock bit. Furthermore a microprocessor interface ensures
the necessary flexibility of configurations and allows observation of internal
states and signals. Additionally, a frequency measurement unit is implemented
to support the coarse adjustment of the oscillator.

Figure 2: Structure of the Synchronizer
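
The loop structure described above (phase rotator, phase error detector, loop filter, NCO) can be sketched in a few lines of floating-point Python. This is a minimal illustration under stated assumptions, not the implementation on the chip: the decision-directed QPSK detector, the gains `kp` and `ki`, and the function name `carrier_loop` are all chosen for the example only.

```python
import cmath
import math

def carrier_loop(samples, kp=0.05, ki=0.002):
    """Second-order digital PLL for QPSK (illustrative, floating point)."""
    nco_phase = 0.0   # NCO output phase in radians
    freq = 0.0        # frequency estimate stored in the loop filter (rad/sample)
    out = []
    for x in samples:
        # Phase rotator: de-rotate the input by the current NCO phase.
        y = x * cmath.exp(-1j * nco_phase)
        # Phase error detector (decision directed, an assumption here):
        # angle between y and the nearest QPSK constellation point.
        d = complex(math.copysign(1.0, y.real), math.copysign(1.0, y.imag))
        err = cmath.phase(y * d.conjugate())
        # Loop filter: integral path holds the frequency estimate.
        freq += ki * err
        # NCO: accumulate frequency estimate plus proportional term.
        nco_phase += freq + kp * err
        out.append(y)
    return out, freq
```

Feeding the function a QPSK sequence with a small constant frequency offset should drive `freq` towards that offset, which mirrors the role of the value stored in the loop filter.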


The maximum specified sample rate was 35 MHz. The correctable
frequency error should be up to +/- 12.5% (8PSK) of the sample rate. The
normalized loop bandwidth B_L can be programmed to lie in the range of
[...] to [...] of the sampling rate; the damping ratio ζ can be chosen
between 0.7 and 1.5.
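
How a programmed loop bandwidth and damping ratio translate into loop filter gains can be illustrated as follows. The sketch assumes a second-order loop with a proportional-plus-integral filter and unity detector and NCO gains, and it uses the common small-bandwidth textbook relation B_L = (ω_n / 2)(ζ + 1/(4ζ)); the mapping actually used on the chip is not stated in the text, and the name `loop_gains` is hypothetical.

```python
def loop_gains(bl_norm, zeta):
    """Map normalized loop bandwidth B_L*T and damping ratio to PI gains.

    Assumes B_L = (wn / 2) * (zeta + 1 / (4 * zeta)), valid for B_L*T << 1.
    """
    wn_t = 2.0 * bl_norm / (zeta + 1.0 / (4.0 * zeta))  # natural frequency times T
    kp = 2.0 * zeta * wn_t   # proportional gain
    ki = wn_t ** 2           # integral gain
    return kp, ki
```

For example, `loop_gains(0.01, 0.7)` returns the gain pair for a purely illustrative bandwidth of 1% of the sampling rate at the lower end of the damping range given above.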

DESIGN FLOW
The integration of telecommunication systems on silicon is a process that
requires modeling on different abstraction levels [3]. For each level suitable
simulation paradigms have to be applied in order to evaluate the performance
with the help of criteria that are adequate on this particular level (see
Figure 3).

Of course, designing an integrated circuit is not a pure top-down process.
Each refinement step feeds back to the upper levels. For instance, the cost
in silicon area has an impact on the decision on quantization parameters.
The delay that is introduced into the synchronizer loop by pipelining the
building blocks influences the system performance. This stresses the
importance of a smooth transition between system and architectural level. In
addition, each shift from an upper abstraction level to a lower one requires a
verification between the two different representations of the same block.

[Figure 3 labels: Development of Algorithms, System performance,
Implementation, Route, Fabrication]

Figure 3: Overview of Design Flow

System level
On the uppermost abstraction level all system performance related issues
(eg BER or remaining phase inaccuracies) have to be investigated. For this
reason the system environment (eg channel, source decoder) has to be
modeled and simulated. The simulation efficiency is a crucial point especially
for complex systems since long simulations are necessary in order to achieve
accurate results. All information which is not relevant on this level has to
be hidden in order to speed up simulations. Therefore a data flow driven
simulation engine is the appropriate choice [4]. In our case the system level
simulation tool was COSSAP [5].
In order to decide on quantization of internal signals and pave the way
down to implementation, a bit-true specification of all blocks of the circuit
had to be created.
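
What a bit-true model of a quantized signal amounts to can be sketched with a small fixed-point helper. The word lengths, the rounding mode, and the saturation behavior below are assumptions for illustration; the text does not state the formats chosen for the chip, and `quantize` is a hypothetical name.

```python
def quantize(x, word_bits, frac_bits):
    """Round x onto a signed two's-complement fixed-point grid with saturation."""
    scale = 1 << frac_bits
    lo = -(1 << (word_bits - 1))        # most negative integer code
    hi = (1 << (word_bits - 1)) - 1     # most positive integer code
    q = int(round(x * scale))           # Python's round: ties go to even
    q = max(lo, min(hi, q))             # saturate instead of wrapping
    return q / scale
```

Inserting such a function at every internal signal of the functional model lets the system level simulation show directly how a given word length affects BER or residual phase error.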

For each building block, a corresponding VLSI architecture was developed
and described in VHDL. That consists of eg directly implementing the
arithmetic units (eg within the loop filter) with some improvements that
increase implementation efficiency or of building architectures that are
structurally different but show identical behavior at the borders of the
entity compared with the functional description of the block.

[Figure 4 labels: phase difference, in-lock area]

Figure 4: Mapping of address values


The phase rotator is realized as a CORDIC-processor [6,7,8]. The loop
filter is a first order filter that is pipelined in order to reach the specified
data rate. Phase error detector and lock detection unit are implemented
once for each modulation scheme. The phase detection algorithm used for
8PSK modulated signals was described in [9]. Its implementation was done
via a synthesized table, because of the increased efficiency compared to using
arithmetic units or ROMs. In order to reduce the required silicon real estate,
the implicit symmetry of the phase error detection was advantageously taken
into account: in a first step all quadrants are mapped onto the first one
1) (see Figure 4), then the second octant is mapped onto the first 2), and
finally the upper remaining triangle is mapped onto the free address space
in the lower half of the first quadrant 3). Each mirroring operation in the
address space is compensated for by an inversion of the output phase value.
Following this approach, the address space of the table could be decreased
by a factor of eight at the expense of implementing a preprocessing unit that
performs the mapping and a post processing unit for conditionally inverting
the phase values. The values of the synthesized table were obtained using the
functional simulation model of the Viterbi-and-Viterbi [9] algorithm and the
postprocessing facilities of the simulation environment.
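
The quadrant and octant folding described above can be illustrated with a short Python sketch. It assumes an 8PSK constellation with points at multiples of 45 degrees and uses an eighth-power phase detector as a stand-in for the functional model of [9]; the names `phase_error_8psk`, `fold`, and `folded_phase_error` are hypothetical. Only mapping steps 1) and 2) are shown; the repacking of the remaining triangle in step 3) depends on the address layout and is omitted.

```python
import cmath

def phase_error_8psk(i, q):
    """Reference detector (assumed stand-in): angle of z**8 divided by 8."""
    z = complex(i, q)
    z2 = z * z
    z4 = z2 * z2
    return cmath.phase(z4 * z4) / 8.0

def fold(i, q):
    """Map (i, q) into the first octant; return folded address and output sign."""
    sign = 1
    if i < 0:            # mirror at the Q axis
        i, sign = -i, -sign
    if q < 0:            # mirror at the I axis
        q, sign = -q, -sign
    if q > i:            # mirror at the diagonal: second octant onto the first
        i, q, sign = q, i, -sign
    return i, q, sign

def folded_phase_error(i, q):
    """Look up first-octant addresses only and conditionally invert the result."""
    fi, fq, sign = fold(i, q)
    return sign * phase_error_8psk(fi, fq)
```

In this sketch `fold` plays the role of the preprocessing unit and the multiplication by `sign` that of the post processing unit; the table itself would only need entries for addresses with 0 <= q <= i.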
Another possible structure of the receiver is to use the matched filter with
subsequent decimation within the frequency synchronizer loop (see Figure 1).
The input data of the phase error estimator is then read from outside the
chip rather than from the CORDIC-processor. Due to the different data
rates (CORDIC-processor and NCO process the oversampled data, while the
other parts of the loop work with the decimated values), the
CORDIC-processor has to be decoupled from the rest of the loop, which was
implemented using enable signals.
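
The rotation performed by a CORDIC-processor can be sketched with the conventional shift-and-add iteration of [6]. This is an illustrative floating-point model only: the chip may use a different variant (the differential CORDIC of [8] is cited), and the iteration count and the name `cordic_rotate` are assumptions.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by angle in radians, |angle| <= pi/2, using micro-rotations."""
    gain = 1.0
    for k in range(iterations):
        d = 1.0 if angle >= 0.0 else -1.0        # direction of this micro-rotation
        # In hardware the factor 2**-k is a right shift by k bits.
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
        angle -= d * math.atan(2.0 ** -k)        # remaining angle to rotate
        gain *= math.sqrt(1.0 + 4.0 ** -k)       # accumulated CORDIC scale factor
    return x / gain, y / gain
```

Each iteration needs only shifts, additions, and one small constant, which is why the structure pipelines well at the oversampled data rate.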

Verification
VHDL Verification. A crucial step in the design flow is the verification
of the VHDL descriptions against their counterparts, the functional system
level specifications. For this reason a simulator coupling between COSSAP
and the SYNOPSYS VSS VHDL simulator was developed at our site [10,11].
For each VHDL entity an interface block is generated automatically that can
replace the respective simulation model in a COSSAP netlist. Via this block
the simulation data of the system level simulation run is fed into the VHDL
simulation that is running simultaneously. The output data is compared on
the functional level, enabling the use of graphical postprocessing facilities.
This approach avoids developing VHDL testbenches for each building block,
producing input stimuli within the test bench or writing input stimuli of the
functional model to files and reading these into the VHDL simulation, and
comparing the produced output files, which would be a very tedious,
error-prone and time consuming process. Therefore, debugging is greatly
simplified using the cosimulation approach.
The granularity of the modeled blocks has to be fine enough to ensure
controllability and observability with a limited number of input datasets and
to ease the debugging process, and coarse enough to obtain a high
verification efficiency. After design and verification of all building blocks
the overall system has to be verified using the same approach described above.
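
The idea of checking an implementation against its functional counterpart on identical stimuli can be sketched independently of any particular simulator. The code below does not use the COSSAP or VSS interfaces; the models `ref_model`, `impl_model`, and `wrapping_model` and the helper `first_mismatch` are hypothetical stand-ins for a functional block, its VHDL entity, and a faulty variant.

```python
def first_mismatch(reference, implementation, stimuli):
    """Feed identical stimuli to both models; return index of first mismatch or None."""
    for n, x in enumerate(stimuli):
        if reference(x) != implementation(x):
            return n
    return None

def sat8(v):
    """Saturate to the signed 8 bit range."""
    return max(-128, min(127, v))

def ref_model(x):
    """Functional (system level) description: saturating offset addition."""
    return sat8(x + 100)

def impl_model(x):
    """Stand-in for a correct implementation of the same block."""
    return sat8(x + 100)

def wrapping_model(x):
    """Stand-in for a faulty implementation that wraps instead of saturating."""
    return ((x + 100 + 128) % 256) - 128
```

Comparing at block level reports the first sample at which the two representations diverge, which is the information needed to start debugging.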

Post Layout Simulation. The verification step that follows placement and
routing is aided by generating input stimuli and expected output data for
post layout simulation using the system level simulation setup again. It is
very easy to use the available source models to produce large sets of complex
data patterns in order to drive the internal states to all possible values. The
generation of production test vectors is done using the same approach.

[Figure 5 labels: frequency stored in loop]

Figure 5: Measured internal data

Functional Chip Verification. The manufactured chip was functionally
tested using a hardware-in-the-simulation-loop environment [12]. Via a
software interface module the input data, which is produced by the simulator
under simulated environmental conditions (eg a specific type of channel), is
fed into the real hardware. The on-chip processed results are then read back
and can be compared with the simulation model and visualized using the
graphical output facilities of the simulation tool. In the case of the
synchronizer chip the microprocessor interface that is needed for configuring
the chip additionally allows an observation of internal states and signals.
For instance, estimated phase error, output of NCO, and the frequency stored
in the loop filter can be read out without increasing the pin count. Figure 5
summarizes some measured results that are read out during acquisition. The
uppermost curve shows the detected phase error, the middle the frequency
that is stored in the loop filter, and the lower displays the NCO output.
During the first 2000 clock cycles, a positive sweep value is added into the
loop. After reaching the programmed limit, the output of the sweep generator
is inverted in order to search in the negative direction. When a loop-in-lock
is detected (a certain amount of received signal points are near the reference
points in the complex plane) the output of the sweep generator is disabled.
The circuit is now in tracking mode and follows slowly varying frequencies.
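
The acquisition behavior described above, a sweep that reverses at a programmed limit and a count-and-compare lock decision, can be sketched as follows. The step size, limit, radius, and threshold are illustrative parameters, and the function names `sweep_step` and `loop_in_lock` are hypothetical; the actual register-level behavior of the chip is not given in the text.

```python
def sweep_step(freq, step, limit):
    """Add the sweep value to the stored frequency; invert it at the limit."""
    freq += step
    if abs(freq) >= limit:
        step = -step          # continue the search in the opposite direction
    return freq, step

def loop_in_lock(points, refs, radius, threshold):
    """Count-and-compare: lock if enough points lie near a reference point."""
    hits = sum(1 for p in points if any(abs(p - r) <= radius for r in refs))
    return hits >= threshold
```

In a complete loop model the sweep would be applied only while `loop_in_lock` returns `False`, so that the loop switches to tracking mode once lock is declared.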

Problems
The remaining bottleneck of the design flow was in our case the loose
coupling between synthesis and placement and routing. The large number of
simple logic gates and the non-local connections inside the synthesized table
caused a large amount of wiring in this region (see Figure 6). Because the
wireload models used in SYNOPSYS are not accurate enough with respect to
the actual wiring capacitances, the pre- and post-layout timing estimates
differed by up to 33%, resulting in serious timing violations that had to be
removed by numerous trial-and-error iterations.

Figure 6: Final Layout

RESULTS
The chip was manufactured in a 1 µm CMOS process and consists of
5964 standard cells. The chip area is 27 mm². It is packaged in a 68 pin
ceramic pin grid array package. The maximum sample rate under worst case
conditions is 35 MHz. No self test is implemented; the fault coverage using
the generated 32k test vectors is 89%. Figure 6 shows the final layout of the
chip.

SUMMARY
An integrated design flow that ranges from system level specification and
simulation down to post layout simulation or even post production testing
avoids implementation failures that result from verification gaps between
different levels of abstraction and ensures short design times. It guarantees a
close interaction between system level and hardware design providing a well
defined interface. This enables concurrent engineering on different levels and
of different structural blocks as well. With the help of the design environment
described above it was possible to produce the final physical layout within
three months. The chip is fully functional and was a first silicon success.
A natural extension of the methodology is to enable reusing the developed
pairs of functional and architectural parameterizable modules in order to
improve the design process even further. Currently a tool set [13,14] is under
development at our site which provides an automated VHDL generation from
the functional specifications.

References
[1] H. Meyr and G. Ascheid, Synchronization in Digital Communications,
vol. 1. John Wiley & Sons, 1990.
[2] E. A. Lee and D. G. Messerschmitt, Digital Communications. Kluwer
Academic Publishers, 1994.
[3] K. ten Hagen, Abstrakte Modellierung digitaler Schaltungen (VHDL vom
funktionalen Modell bis zur Gatterebene). Heidelberg New York: Springer,
August 1995. In preparation.
[4] G. Jennings, "A case against event driven simulation of digital system
design," in The 24th Annual Simulation Symposium (A. H. Rutan, ed.),
(Los Alamitos, California), pp. 170-176, IEEE Computer Society Press,
April 1991.
[5] Synopsys, Inc., 700 E. Middlefield Rd., Mountain View, CA 94043, USA,
COSSAP User's Manual.
[6] J. E. Volder, "The CORDIC trigonometric computing technique," IRE
Trans. Electronic Computing, vol. EC-8, pp. 330-34, September 1959.
[7] J. S. Walther, "A unified algorithm for elementary functions," in AFIPS
Spring Joint Computer Conference, vol. 38, pp. 379-85, 1971.
[8] H. Dawid and H. Meyr, "The Differential CORDIC Algorithm: Constant
Scale Factor Redundant Implementation without correcting Iterations,"
accepted for IEEE Transactions on Computers, May 1994.
[9] A. J. Viterbi and A. M. Viterbi, "Nonlinear estimation of PSK-modulated
carrier phase with application to burst digital transmission," IEEE
Transactions on Information Theory, vol. IT-29, pp. 543-551, July 1983.
[10] P. Zepter, "Simulator Coupling: COSSAP - Synopsys VSS," Internal
Memo 715/16, ISS, RWTH Aachen, September 1993.
[11] P. Zepter, "Kopplung eines VHDL Simulators an einen Simulator für
Signalverarbeitungsalgorithmen," in GME Fachberichte 11 Mikroelektronik
(D. Seitzer, ed.), pp. 127-132, VDE Verlag, March 1993. In German.
[12] O. J. Joeressen and H. Meyr, "Hardware "in the loop" simulation with
COSSAP: Closing the verification gap," in International Conference on
DSP Applications and Technology, (Dallas, TX), pp. 779-784, DSP
Associates, October 1994.
[13] H. Meyr, H. Dawid, O. Joeressen, and P. Zepter, "Design of High
Speed Communication Systems," in Proceedings of the IEEE International
Symposium on Circuits and Systems, (London), p. 2.134, IEEE,
May 1994.
[14] P. Zepter, T. Grötker, O. Joeressen, and H. Meyr, "A design system
for high throughput digital signal processing," in GME Fachberichte 13
Mikroelektronik (E. Barke, ed.), VDE Verlag, March 1995.
