Accurate Timing- and Power Characterization with NanoSim
Stefan Scharfenberg,
Andreas Roth
ABSTRACT
In today's large SoC designs it is very important to perform full-chip timing and power
analysis. This can only be done with sufficient accuracy if timing and power data are
available for all the building blocks.
This paper describes an approach and a script setup around NanoSim to create the required
.lib/.db view of a building block.
This characterization environment takes advantage of NanoSim's Block Delay Calculator
(BDC) and its bisection features to measure timing data. Power data is calculated from the
reported block power and can be modeled in a mode-dependent way.
Table of Contents
List of Figures
1.0 Introduction
2.0 View Generation System
3.0 NanoSim Encapsulation
3.1 Timing characterization
3.2 Power characterization
4.0 Modeling Power Consumption
4.1 Library Units
5.0 Conclusions and Recommendations
6.0 References
7.0 Appendix
7.1 Mode Definition File
7.2 Nanosim Configuration File
7.3 Timing Characterization Control File
7.4 NanoSim Generated sdf File
7.5 NanoSim hold time control file
7.6 NanoSim Power Report
List of Figures
Figure 1: Overview of VGS system
Figure 2: vgs_nanosim encapsulation
Figure 3: Timing characterization environment
Figure 4: setup time measurement
Figure 5: hold time measurement
Figure 6: minimum setup / hold time
Figure 7: Counter example
Figure 8: Oscillator monitor example
SNUG Europe 2005 2 Accurate Timing- and Power Characterization with NanoSim
1.0 Introduction
SoC designs have grown significantly in area and complexity over the last years. At the
same time, EDA tools have made progress in dealing with this large amount of data. The
flow for pure digital logic, using either a flat or a hierarchical approach, has become quite
robust. However, there are chips, such as microcontrollers, that cannot be synthesized as one
large digital design, because they consist of many different blocks such as pure digital
blocks, full-custom blocks, analog blocks, and different types of memories. It is very
important that these designs can be fully analyzed for timing and power under different
aspects.
A very straightforward way to do this is to describe those blocks that are not fully
synthesizable as large standard cells. This allows the existing analysis tools to be kept in
the flow.
Describing a block as a standard cell requires the creation of several abstractions, or views,
of the design to be fed into the digital analysis tools. This view creation process also defines
a clean interface between the digital SoC integration team and any other team that works on
the design of the subblocks, so the view generation step can be used as a quality gate. At the
same time, this enables reuse of the block.
Figure 1: Overview of VGS system (PathMill and PrimeTime each deliver raw timing data to the database)
The central element of the View Generation System (VGS) is a global database for timing
values, power values, and other attributes. Several database clients write to this database or
read from it. This architecture has proven to be very flexible, as it is very easy to add new
characterization tools to it.
Since the initial release of VGS, static timing analysis (TA) tools like PathMill and PrimeTime
have become more powerful, and today they are able to create abstract timing views like ETM
(Extracted Timing Model) or ILM (Interface Logic Model) directly. These timing models are
better suited for static timing analysis.
However, static TA with these tools has limitations. First, the tools are not very effective in
analyzing circuits that are largely analog in nature. Such circuits are not fully understood by
static TA tools and thus cannot be characterized. Second, static TA is not suited to any type
of power analysis or power characterization.
Fortunately, these deficiencies can be addressed through transistor-level simulation with
Spice or NanoSim. The fundamental difference compared to static TA is that this approach
requires input stimuli to the block. These stimuli have to be selected carefully under
different aspects. From the timing point of view, they have to sensitize all the delay paths
and setup/hold paths that are under characterization. From the power point of view, the input
vectors have to create representative activity in the circuit so that the power number is
meaningful.
Figure 2: vgs_nanosim encapsulation (blocks: Perl, Measure Threshold Adjustment, NanoSim BDC, Database Client; data: raw timing data, timing database)
Figure 2 shows the input and output files of the vgs_nanosim encapsulation, which is a
collection of Perl and shell scripts. As input files the encapsulation needs a transistor-level
netlist, transistor model data (either as Spice parameters or as a technology file), input
stimuli, and several configuration files. For more accurate results, extracted parasitic data
should be provided as well. Let's look a bit more closely at the configuration files.
In VGS there are three types of configuration files. The first type, the mode.ctl file, describes
all the possible modes of the circuit. Appendix 7.1 shows an example for a dual-ported RAM.
The second type, the nanosim.cfg file, lists all the conditions that need to be looped
through for a complete characterization. This includes a list of PVT points, a list of input
slopes, a list of output loads, and a list of modes. Appendix 7.2 shows an example. The third
type, the timing characterization control file, is required only for timing characterization.
It lists all the timing arcs that need to be measured. See Appendix 7.3 for an example.
The output data of the NanoSim runs is then parsed and temporarily stored in simple
raw data files. At the end of the characterization runs, all the raw data files are entered into
the global database.
1. for each mode: 2D delay tables, depending on input slope and output load
2. for each mode: 2D output slope tables, depending on input slope and output load
3. for each mode: 2D setup time tables, depending on clock slope and data slope
4. for each mode: 2D hold time tables, depending on clock slope and data slope
Figure 3 shows how the timing data is measured and which boundary conditions are applied.
Figure 3: Timing characterization environment (labels: CLK1, OUT1, OUT2, flip-flop with D and Q, sequential delay)
These tables are stored in a .lib file, which is PVT dependent. This means that all the
characterization runs have to be repeated for each PVT point.
Now that we know what data needs to be obtained, let's look at NanoSim's features and how
the required data can be measured. The delays and output slopes can be measured through the
BDC (Block Delay Calculator) feature. Using this feature, NanoSim creates an sdf file with
path delays and with output slopes printed as comments; see Appendix 7.4.
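As an illustration of how such an sdf file could be post-processed, the following Python sketch collects the delays and the commented output slopes per output pin. It is not part of the VGS scripts (which are written in Perl and shell); the function name `parse_bdc_sdf` is hypothetical, and the line layout is assumed to match the excerpt in Appendix 7.4. It also assumes that the first value triple is the rising and the second the falling value, which is the usual SDF order for IOPATH but is not confirmed for the TRANS comments.

```python
import re

# Matches e.g. (IOPATH (posedge acca) qa0 (1.830:1.830:1.830) (1.580:1.580:1.580))
IOPATH_RE = re.compile(
    r"^\(IOPATH\s+\((\w+)\s+(\w+)\)\s+(\w+)\s+\(([^)]*)\)\s+\(([^)]*)\)\)"
)
# Matches the commented slope line, e.g. //(TRANS qa0 (0.200:...) (0.110:...))
TRANS_RE = re.compile(r"^//\(TRANS\s+(\w+)\s+\(([^)]*)\)\s+\(([^)]*)\)\)")


def _typ(triple):
    """Return the middle (typical) value of a min:typ:max triple."""
    return float(triple.split(":")[1])


def parse_bdc_sdf(text):
    """Collect delay and slope values per output pin (assumed line layout)."""
    arcs = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = IOPATH_RE.match(line)
        if m:
            edge, source, out, rise, fall = m.groups()
            arcs[out] = {
                "source": source,
                "edge": edge,
                "delay_rise": _typ(rise),  # assumption: first triple is rise
                "delay_fall": _typ(fall),
            }
            continue
        m = TRANS_RE.match(line)
        if m and m.group(1) in arcs:
            out, rise, fall = m.groups()
            arcs[out]["slope_rise"] = _typ(rise)
            arcs[out]["slope_fall"] = _typ(fall)
    return arcs
```

The sketch keys the result by output pin only, so a block with several arcs ending at the same output would need a richer key.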
Compared to measuring path delays and output slopes, determining setup and hold times is a
relatively complex task. Setup and hold times are measured for storage elements. The
data edges have to be moved back and forth around the clock edge to find the point at which
the data signal can still be captured. This can be achieved through the Hspice bisection
optimization command, which is also available in NanoSim.
Figure 4 shows how setup times are measured. The clock edge is kept fixed at a specific
point in time. The data edge is then moved from 2 ns before the clock edge to 2 ns after the
clock edge. While doing this, the output pin of the storage element is observed. The
optimization stops when the output pin stops toggling.
Figure 4: setup time measurement (clock and data voltage over time; the data signal is moved right until it is no longer captured, which defines tsetup)
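The search itself can be pictured with a small Python sketch. It is only a model of the bisection idea, not the Hspice/NanoSim optimizer: `captures` is a hypothetical stand-in for one simulation run that reports whether the output pin still toggles for a given data-edge offset relative to the clock edge (negative means before the clock edge).

```python
def bisect_capture_limit(captures, early, late, tol=1e-3):
    """Return the latest data-edge offset (ns) that is still captured.

    captures(offset) -> bool stands in for one simulator run (assumption).
    The data must be captured at `early` and missed at `late`.
    """
    if not captures(early) or captures(late):
        raise ValueError("search window does not bracket the capture limit")
    while late - early > tol:
        mid = 0.5 * (early + late)
        if captures(mid):
            early = mid  # still captured: the limit lies later
        else:
            late = mid   # no longer captured: the limit lies earlier
    return early


# Toy storage element that needs the data 0.35 ns before the clock edge.
limit = bisect_capture_limit(lambda offset: offset <= -0.35, -2.0, 2.0)
setup_time = -limit  # setup time is measured from data edge to clock edge
```

Each halving of the window costs one simulation run, so a 4 ns window resolved to 1 ps needs about 12 runs per table entry.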
Figure 5 shows that measuring the hold time works in a very similar way. Again, the clock
edge is kept fixed. The first data edge is placed at the previously measured setup time, and
the second data edge, at which the data signal changes its value again, initially occurs 2 ns
after the clock edge. This second edge is then moved toward the clock edge until the output
pin of the storage element no longer toggles.
Figure 5: hold time measurement (clock and data voltage over time, with thold marked)
In some cases the second data edge ends up immediately after the first data edge. In this
case it is not possible to measure the exact hold time. Instead, it is assumed that the sum of
setup and hold time is equal to the slope of the data signal, as Figure 6 illustrates.
Figure 6: minimum setup / hold time (clock and data voltage over time)
When measuring delays, and especially setup and hold times, it is important that the input
vectors stimulate the paths under characterization. Figure 3 shows an example: in order to
characterize the setup/hold times of the flip-flop, the second input of the Nand gate has to
be set to logical “1”. Otherwise, the flip-flop would not see changes on the input signal
“IN2”.
VGS allows a very flexible selection of input vectors for this purpose. Appendix 7.2 shows
two lists (@STIM_FILES and @SIM_TIMES) that can be used to select different stimulus
files for each mode along with measure windows. The format of the stimulus files is .vcd
(value change dump), which can be written out by every digital simulator. The .vcd files are
then translated into Spice PWL sources by the VTRAN utility program. The vgs_nanosim
encapsulation generates the configuration files for VTRAN and starts the program
automatically. For delay measurement, all input signals are translated. For setup and hold
time measurement, all input signals except the data signal under characterization are
translated. The data signal under characterization is instead defined in the bisection
control file. Appendix 7.5 shows an example.
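To make the translation step more concrete, the following Python sketch turns a list of value changes for one signal into a Spice PWL voltage source with a given input slope. It only illustrates the idea and does not reproduce VTRAN or its configuration format; the function name and the argument layout are assumptions. The sketch also assumes that changes occur after time 0 and are spaced further apart than the slope.

```python
def changes_to_pwl(name, changes, vdd, slope_ns, initial=0):
    """Build a PWL source line from (time_ns, bit) value changes.

    changes must be sorted by time; each edge ramps linearly within slope_ns.
    """
    points = [(0.0, initial * vdd)]
    level = initial
    for time_ns, bit in changes:
        if bit == level:
            continue  # no edge, nothing to add
        points.append((time_ns, level * vdd))          # hold the old level
        points.append((time_ns + slope_ns, bit * vdd))  # end of the ramp
        level = bit
    body = " ".join(f"{t:g}n {v:g}" for t, v in points)
    return f"V{name} {name} 0 PWL({body})"


# Data pin da0 rises at 10 ns and falls at 20 ns, 2.5 V supply, 0.2 ns slope.
line = changes_to_pwl("da0", [(10, 1), (20, 0)], vdd=2.5, slope_ns=0.2)
```

Because the input slope is an argument, the same value changes can be reused for every entry in the list of input slopes.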
All the required looping through the different conditions can be described by the following
pseudo code:
Pseudo code for timing characterization:
Loop through all PVTs
  Loop through all modes
    Loop through all input slopes
      Translate vcd file for given V, mode, slope
      Loop through all output loads
        Measure delays and output slope in SDF
    Loop through all clock slopes
      Translate vcd file for given V, mode, slope
      Loop through all data slopes
        Use bisection to measure setup times
        Use bisection to measure hold times
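The nesting above can also be written as a small Python generator that enumerates the individual simulation runs. This is a sketch for estimating the run count, not the VGS implementation; the function name and the tuple layout are made up for illustration, and the vcd translation step is left out.

```python
from itertools import product


def timing_runs(pvts, modes, input_slopes, output_loads,
                clock_slopes, data_slopes):
    """Yield one tuple per simulation run of the timing characterization."""
    for pvt, mode in product(pvts, modes):
        # Delay and output slope: one BDC run per (input slope, output load).
        for slope, load in product(input_slopes, output_loads):
            yield ("delay", pvt, mode, slope, load)
        # Setup and hold: one bisection each per (clock slope, data slope).
        for clock_slope, data_slope in product(clock_slopes, data_slopes):
            yield ("setup", pvt, mode, clock_slope, data_slope)
            yield ("hold", pvt, mode, clock_slope, data_slope)


runs = list(timing_runs(
    pvts=["wcs", "bcs"], modes=["read", "write", "idle"],
    input_slopes=[0.1, 0.5], output_loads=[10, 100],
    clock_slopes=[0.1, 0.5], data_slopes=[0.1, 0.5],
))
```

In this toy setup the list holds 72 entries, and each setup or hold entry itself stands for a whole bisection, that is, several simulator runs.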
3.2 Power characterization
A complete description of a large standard cell includes not only its timing behavior but also
its power consumption. Power is modeled in much the same way as in the black box timing
model. Looking at all the power-related tables in the .lib and .db files, it turns out that just
two different types of power consumption need to be measured:
Power numbers have to be put into the database in a similar way to the timing numbers. They
depend either on the input slope of a block and its output loads, or just on the input slope.
NanoSim measures the total average power consumption of a block in a given time window.
This value is extracted by the encapsulation and entered into the database as total power.
NanoSim also measures a static wasted current in the very same time window. This value is
entered into the database as leakage current. Appendix 7.6 shows an example
nanosim.log file and the reporting lines from which the values are extracted and
put into the database.
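A minimal Python sketch of this extraction step is shown below. It assumes report lines of the form printed in Appendix 7.6 ("Average wasted current : -72.665250 uA") and is not the parser used by the encapsulation; the function names are hypothetical. Taking the magnitude of the reported current is an assumption, made because the example report shows a negative value.

```python
import re

# Matches e.g. "Average wasted current : -72.665250 uA"
CURRENT_RE = re.compile(
    r"^\s*(Average|RMS)\s+(\w+)\s+current\s*:\s*([-+0-9.eE]+)\s*uA"
)


def parse_block_currents(log_text):
    """Return {(kind, name): microamps} for the block current lines."""
    currents = {}
    for line in log_text.splitlines():
        m = CURRENT_RE.match(line)
        if m:
            kind, name, value = m.groups()
            currents[(kind.lower(), name)] = float(value)
    return currents


def leakage_power_uw(currents, vdd):
    """Leakage power in uW from the average wasted current (uA * V = uW)."""
    return abs(currents[("average", "wasted")]) * vdd


report = """Average capacitive current : -243.559375 uA
RMS capacitive current : 1395.686573 uA
Average wasted current : -72.665250 uA
RMS wasted current : 720.986883 uA"""
leak = leakage_power_uw(parse_block_currents(report), vdd=2.5)
```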
A block's power consumption depends on input slopes and output loads, but its dynamic
power is mainly determined by its activity. Unfortunately, the internal activity is hidden
and cannot be observed from outside the block. Instead, the activity at its input ports
and output ports can be used to some extent. The encapsulation uses NanoSim's
report_node_toggle command to get the toggle count for all output ports and the input clock
ports during the measure time window. These values are also stored in the database.
Finally, we are also interested in the number of rising and falling edges at the output ports.
The report_node_toggle command does not differentiate between rising and falling edges; it
simply counts all toggle events. If the reported toggle count is an even number, we can
assume that 50% are rising events and 50% are falling events. If the reported toggle count is
an odd number, we need to know whether the extra toggle comes from a rising edge or from a
falling edge. This missing information can be obtained by comparing the value of each output
port at the beginning of the measure window with its value at the end of the measure window.
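The rule for splitting a toggle count into rising and falling edges can be sketched in a few lines of Python. The function name is illustrative, and the consistency checks go slightly beyond what the text states: an odd count implies that the start and end values differ, and an even count implies that they are equal.

```python
def split_toggles(toggle_count, start_value, end_value):
    """Split a toggle count into (rising, falling) edge counts.

    start_value and end_value are the logic levels (0 or 1) of the port
    at the beginning and at the end of the measure window.
    """
    rising = falling = toggle_count // 2
    if toggle_count % 2 == 1:
        if (start_value, end_value) == (0, 1):
            rising += 1   # the extra toggle is a rising edge
        elif (start_value, end_value) == (1, 0):
            falling += 1  # the extra toggle is a falling edge
        else:
            raise ValueError("odd toggle count needs different start/end values")
    elif start_value != end_value:
        raise ValueError("even toggle count needs equal start/end values")
    return rising, falling
```

For example, five toggles on a port that starts at 0 and ends at 1 give three rising and two falling edges.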
All the required looping through the different conditions can be described by the following
pseudo code:
Pseudo code for power characterization:
Loop through all PVTs
  Loop through all modes
    Loop through all input slopes
      Translate vcd file for mode and slope
      Loop through all output loads
        Measure power consumption:
        . average total power
        . average static wasted current
        . store output loads
        . measure toggle count for output pins
4.0 Modeling Power Consumption
The dynamic power consumption of a block depends mainly on its activity. For modeling
purposes we have to concentrate on the activity that can be observed from outside, at the
block ports. If a block shows enough activity at its input ports and output ports, we can use a
modeling method that corresponds to timing arcs. In VGS this method is called
“power_arc”. Figure 7 shows a counter block as an example where this method can be used.
Using this method, we count the toggles at all the output pins during the measure time
window. This total toggle count relates to the total dynamic power consumption of the block.
Each power arc now gets a fraction of the total dynamic power consumption, in proportion to
the toggle count at the sink node of the power arc.
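Figure 7: Counter example (block “Counter” with pins CLK, CNT0, CNT1, CNT2, CNT3)
Figure 8: Oscillator monitor example (block “Oscillator Monitor” with pins CLK1, CLK2, ABOVE, BELOW)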
\[ \text{Power} = \frac{\text{Energy}}{\text{Transition}} \cdot \text{Transition rate} \]
Cload
An alternative way to describe the looping and calculations to be done to create the power
tables in the .lib file is the following pseudo code:
Pseudo code for power modeling (power_arc mode):
Loop through all PVTs
  Loop through all modes
    Put down leakage power as static wasted current * V
    Loop through all output pins
      Put down dynamic energy as
      (Average power - leakage power - CV^2) * toggle
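One possible reading of the power_arc calculation is sketched below in Python. It splits the dynamic power over the output pins in proportion to their toggle counts and converts each share into an energy per transition by dividing by the transition rate (toggles per measure window). This is an interpretation, not the VGS code: the function name and arguments are hypothetical, the load-dependent CV^2 term of the pseudo code is deliberately left out, and the magnitude of the wasted current is used.

```python
def power_arc_energies(avg_power_uw, wasted_current_ua, vdd, window_ns, toggles):
    """Return (leakage power in uW, {pin: energy per transition in fJ}).

    toggles maps each output pin to its toggle count in the measure window.
    Note: the CV^2 load term from the pseudo code is NOT subtracted here.
    """
    leakage_uw = abs(wasted_current_ua) * vdd  # uA * V = uW
    dynamic_uw = avg_power_uw - leakage_uw
    total = sum(toggles.values())
    if total == 0:
        raise ValueError("no output activity: power_arc model not applicable")
    energies = {}
    for pin, count in toggles.items():
        if count == 0:
            energies[pin] = 0.0
            continue
        share_uw = dynamic_uw * count / total       # this arc's power share
        energies[pin] = share_uw * window_ns / count  # uW * ns = fJ
    return leakage_uw, energies


leak, energy = power_arc_energies(
    avg_power_uw=1000.0, wasted_current_ua=-100.0, vdd=2.5,
    window_ns=80.0, toggles={"CNT0": 3, "CNT1": 1},
)
```

With a purely toggle-proportional split, every active pin ends up with the same energy per transition; the pins differ only in how often that energy is spent.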
4.1 Library Units
When putting the characterized values into the .lib tables we have to make sure that the right
units are used. The encapsulation measures data in the following units:
• time in ns
• voltage in V
• capacitances in fF
• wasted current in μA
• average power in μW
Some of the library units are explicitly mentioned in the .lib header and declared through
specific keywords. Other units are derived from them:
• time_unit 1ns
• voltage_unit 1V
• capacitive_load_unit 1pF
• current_unit 1mA
• leakage_power_unit 1mW
• dynamic energy capacitive_load_unit * voltage_unit^2
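The two unit lists differ by a factor of 1000 for capacitance, current, and power, so measured values have to be scaled before they are written into the .lib tables. The following Python sketch shows the conversion; the table and function names are illustrative only. The energy entry assumes that energies are first computed in fJ (μW times ns) and that the library energy unit is pF * V^2 = pJ, as derived above.

```python
# Divisors from measured unit to library unit (assumed names, see lead-in).
MEASURED_PER_LIB_UNIT = {
    "time": 1.0,            # ns -> ns
    "voltage": 1.0,         # V  -> V
    "capacitance": 1000.0,  # fF -> pF
    "current": 1000.0,      # uA -> mA
    "power": 1000.0,        # uW -> mW
    "energy": 1000.0,       # fJ -> pJ (pF * V^2)
}


def to_lib_units(kind, value):
    """Convert a measured value into the unit declared in the .lib header."""
    return value / MEASURED_PER_LIB_UNIT[kind]


load_pf = to_lib_units("capacitance", 500.0)  # 500 fF expressed in pF
leak_mw = to_lib_units("power", 181.663125)   # uW expressed in mW
```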
5.0 Conclusions and Recommendations
The next enhancements to the system are planned to support the characterization of blocks
with multiple supply voltages. There are also deficiencies in describing the power
consumption of purely combinational blocks.
Finally, the .lib format will soon become a bottleneck in the attempt to create accurate
power models. The larger a block is, the more different modes it can have. In most cases it
is not possible to observe the current mode from outside the block.
6.0 References
[1] “Semiconductor Reuse Standard: VC Block Deliverables”, http://www.freescale.com
[2] “IP Characterization with Pathmill and PrimeTime”, Stefan Scharfenberg, SNUG 1999
[3] “NanoSim User Guide”, Synopsys Inc.
7.0 Appendix
7.1 Mode Definition File
// Syntax for defining modes:
// /# name <in_pin_1> <in_pin_2> ... [relevant_clk] [power_model]
// <mode_name> [0|1|?] [0|1|?] ... [<clock_name>|?|-] [power_arc|input]
//
7.3 Timing Characterization Control File
// Nanosim Timing Characterization File
start_setuphold_table
// source_node clock clock_edge data_out #_of_clock_edge time
// -----------------------------------------------------------------------
mea acca r ctl_a_lmeib 2 60ns
wrea acca r ctl_a_lwre 2 60ns
meb accb r ctl_b_lmeib 2 60ns
wreb accb r ctl_b_lwre 2 60ns
adra1 acca r ctl_a_net288 2 60ns
adrb1 accb r ctl_b_net288 2 60ns
da0 acca r sa_a_SA0_ldi 2 60ns
end_setuphold_table
7.4 NanoSim Generated sdf File
(DELAY (ABSOLUTE
(IOPATH (posedge acca) qa0 (1.830:1.830:1.830) (1.580:1.580:1.580))
//(TRANS qa0 (0.200:0.200:0.200) (0.110:0.110:0.110))
//(AT_TIME qa0 (301.910:301.910:301.910) (261.660:261.660:261.660))
(IOPATH (posedge acca) qa1 (1.840:1.840:1.840) (1.580:1.580:1.580))
//(TRANS qa1 (0.220:0.220:0.220) (0.110:0.110:0.110))
//(AT_TIME qa1 (261.920:261.920:261.920) (301.660:301.660:301.660))
(IOPATH (posedge acca) qa2 (1.840:1.840:1.840) (1.580:1.580:1.580))
//(TRANS qa2 (0.220:0.220:0.220) (0.120:0.120:0.120))
//(AT_TIME qa2 (301.920:301.920:301.920) (261.660:261.660:261.660))
. . .
))
)
)
7.5 NanoSim hold time control file
** Parameter **
***************
** This will be used to sweep from <UpperLimit>/2 before the clock edge to
** <UpperLimit>/2 after the clock edge
*.Param <ParamName> = <OptParFun> (<Initial>, <LowerLimit>, <UpperLimit>)
.Param DelayTime = Opt1 ( 0.0n, 0.0n, 3.7269n )
** Transient Simulation **
**************************
.Tran 1n 67.6ns Sweep
+ Optimize=Opt1
+ Result=MaxVDout
+ Model=OptMod
** Hold time **
.Measure Tran MaxVDout Min v(sa_b_SA0_ldi) Goal='0.1*2.5'
+ from=63.571ns to=67.6ns
.Measure Tran HOLD_accbdb0 Trig v(accb) Val='2.5/2' rise=2
+ Targ v(db0) Val='2.5/2' rise=1
** Optimization Model **
************************
.Model OptMod Opt method=Bisection
+ relin=0.0001
+ relout=0.001
7.6 NanoSim Power Report
| |
| NanoSim Version W-2004.12 |
| SN: P20041105-HP_UX |
| Machine Name: trubadix |
| Copyright (c) 2004 Synopsys Inc., All Rights Reserved. |
| |
--------------------------------------------------------
Built by nsmgr in " 20041105_hp32_ns_main " on Fri Nov 5 21:30:10 PST 2004
Thu Feb 10 14:47:59 2005
Compiling
"/home/stefan_s/VGS/sdpram64x16/tool_data/vgs/spice/sdpram64x16_wcs_c.hsp" (SPICE)
Compiling "sdpram64x16.default.vec" (SPICE)
Compiling "/home/stefan_s/VGS/sdpram64x16/tool_data/vgs/tool_data/nanosim/\
sdpram64x16/templates/supplies.V250.sp" (SPICE)
Compiling "/home/stefan_s/VGS/sdpram64x16/tool_data/vgs/tool_data/nanosim/\
sdpram64x16/Run/tsmc25nvmPtypV250T025/header_tsmc25nvmPtypV250T025"
(SPICE)
Compiling "/vobs/libs/tsmc25nvm/ams/amsmodels/hspice/mix025_1.l" (SPICE)
Compiling "/vobs/libs/tsmc25nvm/ams/amsmodels/hspice/025emf.l" (SPICE)
. . .
0.00000e+00 - 8.00000e+01 ns
Node: 0
Average current : 3.15704e+02 uA
Node: vdd
Average current : -3.16225e+02 uA
Block: total
Number of nodes in block : 3832
Number of elements in block : 11786
Number of block supply nodes : 1
Number of block ground nodes : 1
Number of block biput nodes : 0
Number of block input nodes : 52
Number of block output nodes : 0
Number of block stages : 928
Number of block partial stages : 0
Average output current : 0.000000 uA
RMS output current : 0.000000 uA
Average biput current : 0.000000 uA
RMS biput current : 0.000000 uA
Average capacitive current : -243.559375 uA
RMS capacitive current : 1395.686573 uA
Average wasted current : -72.665250 uA   <- average static wasted current, taken to calculate leakage power
RMS wasted current : 720.986883 uA