W06 - RTL Synthesis Using Synopsys Design Compiler

RTL Synthesis using Synopsys Design Compiler
Chao-Tsung Huang
National Tsing Hua University
Department of Electrical Engineering
EE4292 IC Design Lab, Fall 2024


Outline
• Part I
– Basic Settings for RTL Synthesis
– Cell Library – Cadence GSC Library (GPDK45)
– Basic Design Constraints
• Part II
– Design Compiler Optimization Process
– Static Timing Analysis
– DesignWare Library

• Reference
– Synopsys DC Documents
• at /usr/cad/synopsys/synthesis/"Design Compiler"
• "Design Compiler.xls" for TOC (can be opened by: soffice "Design Compiler.xls")
• especially dcug.pdf and tcoug.pdf (synqr.pdf: quick reference)
– TSRI Synthesis Training Materials

Basic Design Flow with DC
• Inputs: HDL, constraints (SDC), technology library, IP/DesignWare library
• Design Compiler: datapath optimization, timing optimization, area optimization, power optimization, scan chain synthesis
• Output: synthesized netlist, which goes to P&R
• Checks: timing/power analysis, Formality check
DC: Design Compiler
SDC: Synopsys Design Constraints


Setup File .synopsys_dc.setup
set search_path ". \
    /usr/cadtool/GPDK45/gsclib045_svt_v4.4/gsclib045/db/ \
    $search_path"

# Standard cell technology library
set target_library "slow_vdd1v2_basicCells_wl.db"

# Link library: designs in memory (*), the target library, and the DesignWare database
set link_library "* $target_library dw_foundation.sldb"

# Schematic views (for DV)
set symbol_library "generic.sdb"
# Specially licensed DW
set synthetic_library "dw_foundation.sldb"

# Recognize synchronous reset (avoid mixing logic with reset)
set hdlin_ff_always_sync_set_reset true


Basic DC Execution Flow
• Read designs
– Read or analyze *.v (RTL or netlist)
– Elaborate (for modules with parameters)
– Uniquify (avoid multiple instances referring to the same synthesized design)
• design U0(…); design U1(…);
• Uniquified => design_0 U0(…); design_1 U1(…);
• Each design can be optimized independently
• Link libraries (*.db)
• Set design constraints
– Will be introduced in detail
• Optimize netlist
• Report and generate output files
Note: a script-based flow is mostly used.


Example of DC Optimization Log
• Mapping (compile_ultra)
Beginning Pass 1 Mapping
------------------------
Processing 'regfile'
Implement Synthetic for 'regfile'.
Processing 'alu'
Information: Added key list 'DesignWare' to design 'alu'. (DDB-72)
Implement Synthetic for 'alu'.

Information: Performing clock-gating on design PC. (PWR-730)

Beginning Mapping Optimizations (Ultra High effort)
-------------------------------
Information: Added key list 'DesignWare' to design 'PC'. (DDB-72)
Information: Added key list 'DesignWare' to design 'EXE_stage'. (DDB-72)
Information: Added key list 'DesignWare' to design 'IF_stage'. (DDB-72)
Mapping Optimization (Phase 1)


Example of DC Optimization Log
• Mapping (compile_ultra)

                                 TOTAL
 ELAPSED            WORST NEG    SETUP     DESIGN
  TIME      AREA      SLACK      COST    RULE COST   ENDPOINT
--------- --------- --------- --------- --------- -------------------------
  0:01:35  261209.3      3.33    3406.4     648.6
  0:01:40  262075.3      3.31    3388.4     648.6
  0:01:40  262075.3      3.31    3388.4     648.6
  0:01:40  262075.3      3.31    3388.4     648.6
Re-synthesis Optimization (Phase 1)
Re-synthesis Optimization (Phase 2)
  0:01:45  262075.3      3.31    3388.4     648.6
…
Global Optimization (Phase 29)
Global Optimization (Phase 30)
  0:02:00  271424.2      3.33    3401.0     635.6

WORST NEG SLACK: worst slack
TOTAL SETUP COST: total slack, i.e. Σ setup time violation
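The two slack columns of the log can be illustrated with a small calculation. This is a minimal sketch, assuming the total setup cost is the plain sum of the setup violations over all endpoints (DC may additionally weight path groups); the endpoint slacks below are hypothetical and are not taken from the log.

```python
def setup_costs(endpoint_slacks):
    """Return (worst negative slack, total setup cost) as positive magnitudes.

    Endpoints with non-negative slack contribute nothing to either number.
    """
    violations = [-s for s in endpoint_slacks if s < 0.0]
    worst = max(violations, default=0.0)
    total = sum(violations)
    return worst, total


# Hypothetical endpoint slacks in ns (illustration only).
slacks = [0.40, -3.33, -1.20, 0.05, -0.75]
worst, total = setup_costs(slacks)
print(f"worst negative slack = {worst:.2f} ns, total setup cost = {total:.2f} ns")
```

When both columns reach 0.00, as in the later phases of the log, no endpoint violates its setup constraint.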


Example of DC Optimization Log
• Optimization (compile_ultra)
Beginning Delay Optimization Phase
----------------------------------

                                 TOTAL
 ELAPSED            WORST NEG    SETUP     DESIGN
  TIME      AREA      SLACK      COST    RULE COST   ENDPOINT
--------- --------- --------- --------- --------- -------------------------
  0:00:22  274512.3      2.10    2082.3       3.2
  0:00:22  274512.3      2.10    2082.3       3.2
  0:00:31  275296.8      1.25     920.6       4.8
  0:00:31  275296.8      1.25     920.6       4.8
  0:00:31  275294.5      1.24     920.5       4.8

  0:01:02  275962.1      0.15      26.8       2.5
  0:01:02  275962.1      0.15      26.8       2.5
  0:01:02  275962.1      0.15      26.8       2.5
  0:01:07  275474.0      0.15      27.6     609.7


Example of DC Optimization Log
• Optimization (compile_ultra)
Beginning Design Rule Fixing (max_transition) (max_capacitance)
----------------------------

                                 TOTAL
 ELAPSED            WORST NEG    SETUP     DESIGN
  TIME      AREA      SLACK      COST    RULE COST   ENDPOINT
--------- --------- --------- --------- --------- -------------------------
  0:01:07  275474.0      0.15      27.6     609.7
  0:01:07  275474.0      0.15      27.6     609.7
Global Optimization (Phase 6)
Global Optimization (Phase 7)
Global Optimization (Phase 8)
  0:01:07  275545.7      0.15      27.6       0.0
  0:01:07  275571.4      0.00       0.0       0.0
  0:01:07  275571.4      0.00       0.0       0.0
  0:01:07  275571.4      0.00       0.0       0.0


Example of DC Optimization Log
• Optimization (compile_ultra)
Beginning Area-Recovery Phase (max_area 0)
-----------------------------

                                 TOTAL
 ELAPSED            WORST NEG    SETUP     DESIGN
  TIME      AREA      SLACK      COST    RULE COST   ENDPOINT
--------- --------- --------- --------- --------- -------------------------
  0:01:07  275571.4      0.00       0.0       0.0
  0:01:07  275571.4      0.00       0.0       0.0
Global Optimization (Phase 9)
Global Optimization (Phase 10)
  0:01:27  274857.7      0.00       0.0       0.0
  0:01:27  274857.7      0.00       0.0       0.0
  0:01:27  274857.7      0.00       0.0       0.0


Outline
• Part I
– Basic Settings for RTL Synthesis
– Cell Library – Cadence GSC (GPDK45)
– Basic Design Constraints
• Part II
– Design Compiler Optimization Process
– Static Timing Analysis
– DesignWare Library


Cell Liberty File for Synthesis
• Human-readable format: *.lib
– Converted to database *.db for DC
• Key characteristics
– Cell height: 7-8 tracks (low), 9-10 tracks (standard)
– Threshold voltage: high-VT, regular/standard-VT, low-VT
• One delay corner includes
– Operating condition (PVT)
• Process corner for NMOS-PMOS speed
– e.g. Fast-Fast (FF), Typical-Typical (TT), Slow-Slow (SS)
– e.g. full node, half node
• Operating Voltage, e.g. 1.16V, 1.05V, 0.95V
• Temperature, e.g. -40℃, 25℃, 125℃
– Characterized physical parameters for cells
• Area, timing, power, input capacitance
– Estimated wire load model
• Capacitance, resistance, area of nets

Cell Delay: NLDM (non-linear delay model)
• Characterized and approximated by tables for propagation delay Dp and transition time Dt
• In terms of a timing arc, Dp and Dt are both 2-D functions (tables) of the input transition time Dt,in of Vin and the output total capacitance Ctotal
• Ctotal = Cnet + Cin
– Wire has parasitic Cnet, Rnet (estimated by models)
– Cell has input capacitance Cin (accurate calibration)
[Figure: Vin and Vout waveforms marking Dp and Dt]

Cell Power: NLDM
• Characterized and approximated by tables for internal power and leakage power
• Total Power = Switching Power + Internal Power + Leakage Power
– Switching power: capacitance charging and discharging (no need of characterization)
– Internal power: short-circuit current, which is related to input transition time and output capacitance
– Leakage power: leakage current


Cadence GSC Liberty File (slow_vdd1v2_basicCells_wl.lib)
• 45nm, standard Vt
• Slow delay/PVT corner: slow-slow process, 1.08V, 125℃ (typical: TT, 1.2V, 25℃)
library (slow_vdd1v2) {
  delay_model : table_lookup;
  /* Default units */
  capacitive_load_unit (1,pf);
  current_unit : "1mA";
  leakage_power_unit : "1pW";
  pulling_resistance_unit : "1kohm";
  time_unit : "1ns";
  voltage_unit : "1V";

  /* Nominal operating condition of this library (PVT) */
  operating_conditions (PVT_1P08V_125C) {
    process : 1;
    temperature : 125;
    voltage : 1.08;
  }
  default_operating_conditions : PVT_1P08V_125C;


Wire Load Model (WLM)
/* Wire load models added by CT for teaching purpose */
wire_load("Small") {              /* model name */
  resistance : 5.0000;            /* per unit of estimated net length */
  capacitance : 0.000001;         /* per unit of estimated net length */
  area : 1e-40;                   /* per unit of estimated net length */
  slope : 6.00;                   /* extrapolation slope */
  fanout_length (1, 9.00);
}
wire_load("Large") {
  resistance : 5.0000;
  capacitance : 0.000001;
  area : 1e-40;
  slope : 12.00;
  fanout_length (1, 24.00);
}
default_wire_load : "Small";      /* use "Small" as the default WLM */
default_wire_load_mode : top;     /* will be changed to "enclosed" later */

Extrapolation example: if fanout = 10,
  net length = 9 + (10-1)*6.00 = 63.00 ("Small") or
  net length = 24 + (10-1)*12.00 = 132.00 ("Large")
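The extrapolation rule can be written out as a short calculation. This is only a sketch of the arithmetic on this slide, not of DC's internal implementation; the function and dictionary names are made up for illustration, and the net R and C are simply the per-unit values multiplied by the estimated length.

```python
def wlm_net_length(fanout, fanout_length, slope):
    """Estimated net length: table lookup, else linear extrapolation past the last entry."""
    if fanout in fanout_length:
        return fanout_length[fanout]
    max_fanout = max(fanout_length)
    if fanout < max_fanout:
        raise ValueError("fanout falls in a gap of the table; not handled in this sketch")
    return fanout_length[max_fanout] + (fanout - max_fanout) * slope


# Values copied from the "Small" and "Large" models of the teaching library.
SMALL = {"resistance": 5.0, "capacitance": 0.000001, "slope": 6.00, "fanout_length": {1: 9.00}}
LARGE = {"resistance": 5.0, "capacitance": 0.000001, "slope": 12.00, "fanout_length": {1: 24.00}}


def wlm_rc(fanout, model):
    """Return (net length, net resistance, net capacitance) in library units."""
    length = wlm_net_length(fanout, model["fanout_length"], model["slope"])
    return length, model["resistance"] * length, model["capacitance"] * length


for name, model in (("Small", SMALL), ("Large", LARGE)):
    length, r, c = wlm_rc(10, model)
    print(f"{name}: length={length:.2f}, R={r:.1f} kohm, C={c:.6f} pF")
```

With fanout 10 the lengths match the slide: 63.00 for "Small" and 132.00 for "Large".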


Wire Load Assignment
• For each wire, a WLM is chosen to calculate its C, R, and area by the following kinds of synthesis commands
– Wire load mode
• Two common modes are used
– enclosed (consider the smallest module that encloses the wire)
– top (consider the top module; worst guess)
– Direct assignment
• You can assign different WLMs to the wires inside different designs, cells, and ports
• e.g. set_wire_load_model -name "Large" YourDesign
– Area selection
• Use the WL selection group to choose the WLM
• e.g. set_wire_load_selection_group predcaps


Example of WLM
DC timing report:
Operating Conditions: PVT_1P08V_125C Library: slow_vdd1v2
Wire Load Model Mode: enclosed

Startpoint: rst_n (input port clocked by clk)
Endpoint: mac_00/clk_gate_cu_o_reg/latch
(negative level-sensitive latch clocked by clk)
Path Group: clk
Path Type: max

Des/Clust/Port     Wire Load Model    Library
------------------------------------------------
mac_array          Large              slow_vdd1v2    <- direct assignment
mac_mydesign_8     Small              slow_vdd1v2    <- uses the default "Small"

Point                                    Incr       Path
--------------------------------------------------------------------------
clock clk (rise edge)                    0.00       0.00
clock network delay (ideal)              0.00       0.00
input external delay                     2.00       2.00 r


Templates for Table Lookup
/* Lookup table (LUT) templates for delay calculation, e.g. 2-D table */
/* (typical libraries use 7x7, but GSC adopts 2x2 for simplicity) */
lu_table_template (delay_template_2x2) {
  variable_1 : input_net_transition;
  variable_2 : total_output_net_capacitance;
  index_1 ("0.008, 0.28");
  index_2 ("0.01, 0.3");
}
lu_table_template (delay_template_7x7) {
  variable_1 : input_net_transition;
  variable_2 : total_output_net_capacitance;
  index_1 ("0.008, 0.04, 0.08, 0.12, 0.16, 0.224, 0.28");
  index_2 ("0.01, 0.06, 0.1, 0.15, 0.2, 0.25, 0.3");
}
/* LUT template for internal power calculation, e.g. 1-D table for input energy */
power_lut_template (passive_power_template_2x1) {
  variable_1 : input_transition_time;
  index_1 ("0.008, 0.28");
}
/* LUT template for internal power calculation, e.g. 2-D table for output energy */
power_lut_template (power_template_2x2) {
  variable_1 : input_transition_time;
  variable_2 : total_output_net_capacitance;
  index_1 ("0.008, 0.28");
  index_2 ("0.01, 0.3");
}

Example: NAND2 Cell
cell (NAND2X1) {                  /* 2-input NAND: inputs A and B, output Y */
  area : 1.026;
  pg_pin (VDD) {
    pg_type : primary_power;
    voltage_name : "VDD";
  }
  pg_pin (VSS) {
    pg_type : primary_ground;
    voltage_name : "VSS";
  }
  /* Input-dependent leakage power */
  leakage_power () {
    value : 61.9728;
    when : "(A&B)";
    related_pg_pin : VDD;
  }
  leakage_power () {
    value : 42.1336;
    when : "(A&!B)";
    related_pg_pin : VDD;
  }


Example: NAND2 Cell Input
/* Input capacitance and internal power for input pin A */
pin (A) {
  direction : "input";
  related_ground_pin : VSS;
  related_power_pin : VDD;
  /* Cell-dependent design rule setting
     (here for the max accurate table index of input transition) */
  max_transition : 0.28;
  capacitance : 0.000430669;
  rise_capacitance : 0.000430669;
  rise_capacitance_range (0.000423435, 0.000435958);
  fall_capacitance : 0.000378596;
  fall_capacitance_range (0.000372796, 0.000388644);
  internal_power () {
    related_pg_pin : VDD;
    rise_power (passive_power_template_2x1) {
      index_1 ("0.008, 0.28");
      values ( \
        "-0.000307505, -0.00030599" \
      );
    }

  }
}


Example: NAND2 Cell Output (1/2)
/* Timing and internal power for output pin Y */
pin (Y) {
  direction : "output";
  function : "(!(A B))";
  related_ground_pin : VSS;
  related_power_pin : VDD;
  /* Cell-dependent design rule setting (here for the max output
     capacitance it can drive; the max value for table lookup) */
  max_capacitance : 0.25;
  timing () {                            /* timing arc for A->Y */
    related_pin : "A";
    timing_sense : negative_unate;
    timing_type : combinational;
    cell_rise (delay_template_2x2) {     /* propagation delay for rising output */
      index_1 ("0.008, 0.28");
      index_2 ("0.01, 0.25");
      values ( \
        "0.083797, 1.77744", \
        "0.205226, 1.93256" \
      );
    }
    rise_transition (delay_template_2x2) {   /* transition time for rising output */
      index_1 ("0.008, 0.28");
      index_2 ("0.01, 0.25");
      values ( \
        "0.137835, 3.19574", \
        "0.173557, 3.1956" \
      );
    }

Example: NAND2 Cell Output (2/2)
  internal_power () {
    related_pin : "A";
    related_pg_pin : VDD;
    /* Use 2-D power LUT template */
    rise_power (power_template_2x2) {
      index_1 ("0.008, 0.28");
      index_2 ("0.01, 0.25");
      values ( \
        "0.00103162, 0.00104113", \
        "0.000996519, 0.00102774" \
      );
    }
    fall_power (power_template_2x2) {
      index_1 ("0.008, 0.28");
      index_2 ("0.01, 0.25");
      values ( \
        "0.000265354, 0.000267192", \
        "0.000225005, 0.000264408" \
      );
    }
  }
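How a tool might read a value out of such a 2-D table can be sketched with bilinear interpolation. This is an illustration under assumptions: it uses the cell_rise table of the A->Y arc of NAND2X1 quoted earlier, treats index_1 as input transition (rows) and index_2 as output capacitance (columns), and the function name lut_lookup is made up. The exact interpolation used by a given tool may differ in detail.

```python
from bisect import bisect_right


def lut_lookup(index_1, index_2, values, x1, x2):
    """Bilinear interpolation in a 2-D Liberty-style lookup table.

    values[i][j] is the entry for index_1[i] and index_2[j]. Points outside
    the index range are linearly extrapolated from the nearest segment.
    """
    def segment(index, x):
        # Pick the pair of neighbouring index points around x (clamped to the table).
        i = min(max(bisect_right(index, x) - 1, 0), len(index) - 2)
        t = (x - index[i]) / (index[i + 1] - index[i])
        return i, t

    i, t = segment(index_1, x1)
    j, u = segment(index_2, x2)
    low = values[i][j] * (1 - u) + values[i][j + 1] * u
    high = values[i + 1][j] * (1 - u) + values[i + 1][j + 1] * u
    return low * (1 - t) + high * t


# cell_rise of NAND2X1, arc A->Y (ns), copied from the Liberty excerpt.
TRANSITION = [0.008, 0.28]      # index_1: input transition (ns)
CAPACITANCE = [0.01, 0.25]      # index_2: total output capacitance (pF)
CELL_RISE = [[0.083797, 1.77744],
             [0.205226, 1.93256]]

delay = lut_lookup(TRANSITION, CAPACITANCE, CELL_RISE, 0.144, 0.13)
print(f"cell_rise at transition 0.144 ns and load 0.13 pF: {delay:.6f} ns")
```

The query point is the centre of the 2x2 table, so the result is the average of the four entries. A 2x2 table gives only a coarse approximation, which is why typical libraries use 7x7.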


Example: D Flip-Flop Cell
lu_table_template (constraint_template_2x2) {
  variable_1 : constrained_pin_transition;
  variable_2 : related_pin_transition;
  index_1 ("0.008, 0.28");
  index_2 ("0.008, 0.28");
}

cell (DFFHQX1) {
  area : 5.472;

  pin (D) {

    /* Setup/hold time checking is also done by table look-up */
    timing () {
      related_pin : "CK";
      timing_type : setup_rising;
      rise_constraint (constraint_template_2x2) {
        index_1 ("0.008, 0.28");
        index_2 ("0.008, 0.28");
        values ( \
          "0.0543128, 0.00217756", \
          "0.152974, 0.091404" \
        );
      }


Outline
• Part I
– Basic Settings for RTL Synthesis
– Cell Library – Cadence GSC (GPDK45)
– Basic Design Constraints
• Set operating and I/O environment
• Set optimization constraints
• Set compile constraints

• Part II
– Design Compiler Optimization Process
– Static Timing Analysis
– DesignWare Library
Set Operating and I/O Environment

credit: Synopsys DCUG.


Note:
1. I/O depends on your preceding and following macros OR chip interface.
2. Macro WLM could be different from the one chosen in whole-chip synthesis.

# Setting Design and I/O Environment
set_operating_conditions -library slow_vdd1v2 PVT_1P08V_125C
# Assume outputs go to DFF and inputs also come from DFF
set_driving_cell -library slow_vdd1v2 -lib_cell DFFHQX1 -pin {Q} [all_inputs]
set_load [load_of "slow_vdd1v2/DFFHQX1/D"] [all_outputs]

# Setting wire-load model (default: "Small")
set_wire_load_mode enclosed
set_wire_load_model -name "Large" $TOPLEVEL

[Figure: comparison of wire load modes. credit: Synopsys DCUG.]


Set Optimization Constraints
# Setting Timing Constraints
# 4*0.8=3.2 (20% timing margin for 250MHz clock)
create_clock -name clk -period 3.2 [get_ports clk]
set_ideal_network [get_ports clk]
set_dont_touch_network [all_clocks]

# I/O delay should depend on the real environment. This only shows an example setting.
# Default is zero if not explicitly set.
set_input_delay 2 -clock clk [remove_from_collection [all_inputs] [get_ports clk]]
set_output_delay 1 -clock clk [all_outputs]

# Setting DRC Constraint
# Defensive setting: default fanout_load is 1.0 and our target max fanout number is 20 => 1.0*20 = 20.0
# max_transition and max_capacitance are given in the cell library
set_max_fanout 20.0 $TOPLEVEL

# Area Constraint
set_max_area 0

Notes:
• Ideal nets: free from design rule constraints (max cap, transition, fanout)
• Don't touch nets: free from modification or replacement (e.g. no buffer insertion)
• Network: the property will propagate to all related nets.
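The numbers in these constraints can be checked with a few lines of arithmetic. This is a sketch only: the function names are made up, and the "budget" is the simplified view that a path through an input must fit in the period minus the input delay, and a path to an output in the period minus the output delay.

```python
def clock_period_ns(freq_mhz, margin):
    """Constrained clock period after reserving a fractional timing margin."""
    return (1000.0 / freq_mhz) * (1.0 - margin)


def io_budgets_ns(period_ns, input_delay_ns, output_delay_ns):
    """Time left inside the block for input paths and for output paths."""
    return period_ns - input_delay_ns, period_ns - output_delay_ns


period = clock_period_ns(250.0, 0.20)                 # 4 ns * 0.8
in_budget, out_budget = io_budgets_ns(period, 2.0, 1.0)
print(f"period = {period:.1f} ns")
print(f"input-path budget  = {in_budget:.1f} ns")
print(f"output-path budget = {out_budget:.1f} ns")
```

With set_input_delay 2 and a 3.2 ns period, only 1.2 ns remains for the logic between an input port and the first flip-flop (including its setup time), which is why I/O delays should reflect the real environment.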


Set Compile Constraints
# Insert buffers for outputs and constants to avoid multiple loading
set_fix_multiple_port_nets -feedthroughs -outputs -constants -buffer_constants

#### check design ####
# Had better check these log files before going to the next step
check_design > ./report/check_design.log
check_timing > ./report/check_timing.log

# One clock-gating cell serves at most 10 clock fanouts
set_clock_gating_style -max_fanout 10

# Synthesize the whole design
# These options help hierarchical LEC later
compile_ultra -gate_clock -no_autoungroup -no_seq_output_inversion \
    -no_boundary_optimization -exact_map


More about Latch-based Clock Gating

credit: Synopsys DCUG.

You can do clock gating for macros (e.g. SRAM) by manually inserting clock-gating cells.


Timing Report for Clock-gating Latch
Startpoint: temp_b_reg_1_
(rising edge-triggered flip-flop clocked by clk)
Endpoint: clk_gate_gcd_out_reg/latch
(positive level-sensitive latch clocked by clk')

clock clk (rise edge)                              0.0000   0.0000
clock network delay (ideal)                        0.0000   0.0000
temp_b_reg_1_/CLK (DFFX1_HVT)                      0.0000   0.0000 r
temp_b_reg_1_/QN (DFFX1_HVT)                       0.1582   0.1582 r
….
clk_gate_gcd_out_reg/latch/D (LATCHX1_HVT)         0.0000   9.8096 r
data arrival time                                           9.8096

clock clk' (rise edge)                             5.0000   5.0000
clock network delay (ideal)                        0.0000   5.0000
clk_gate_gcd_out_reg/latch/CLK (LATCHX1_HVT)       0.0000   5.0000 r
time borrowed from endpoint                        4.8096   9.8096
data required time                                          9.8096
--------------------------------------------------------------------------
data required time                                          9.8096
data arrival time                                          -9.8096
--------------------------------------------------------------------------
slack (MET)                                                 0.0000

Note: "time borrowed from endpoint" is the borrowed time (w.r.t. a flip-flop).
[Figure: clock-gating latch with D = en and G = clk', signals clk and enclk; waveforms of clk, clk', and en]
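The borrowed time in this report follows from the latch opening edge and the data arrival time. This is a simplified sketch of that bookkeeping, not the tool's exact algorithm: latch_setup_check is a made-up name, and max_borrow is a hypothetical limit (5.0 ns here, i.e. half of an assumed 10 ns period, ignoring the latch setup time).

```python
def latch_setup_check(arrival, open_edge, max_borrow):
    """Return (borrowed_time, slack) for data arriving at a level-sensitive latch.

    Data arriving before the opening edge borrows nothing. Data arriving while
    the latch is transparent borrows (arrival - open_edge) and reports zero
    slack, as long as the borrowing stays within max_borrow.
    """
    if arrival <= open_edge:
        return 0.0, open_edge - arrival
    borrowed = arrival - open_edge
    if borrowed <= max_borrow:
        return borrowed, 0.0
    return max_borrow, max_borrow - borrowed


# Arrival time and opening edge copied from the report; max_borrow is assumed.
borrowed, slack = latch_setup_check(arrival=9.8096, open_edge=5.0, max_borrow=5.0)
print(f"time borrowed = {borrowed:.4f} ns, slack = {slack:.4f} ns")
```

This reproduces the 4.8096 ns of borrowed time and the zero slack shown in the report.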


Outline
• Part I
– Basic Settings for RTL Synthesis
– Cell Library – Cadence GSC (GPDK45)
– Basic Design Constraints
• Part II
– Design Compiler Optimization Process
– Static Timing Analysis
– DesignWare Library


DC Optimization Process
From high level to low level, each optimization works on:
• Architectural optimization: HDL
• Logic-level optimization: generic netlist (GTECH library)
• Gate-level optimization: technology-specific netlist

Architectural Optimization (1/2)
• High-level synthesis based on your constraints
and your coding style
– Will generate technology-independent netlist (GTECH)
• High-level tasks like:
– Sharing common subexpressions
• Before: out1 = a*b+c; out2 = a*b+d;
• After: t = a*b; out1 = t+c; out2 = t+d;
– Sharing resources
– Reordering operators
• e.g. (b+c)+d = b+(c+d)


Architectural Optimization (2/2)
• Identifying arithmetic expressions for datapath synthesis (DC Ultra only)
– Datapath block extraction: addition/subtraction/increment, multiplier, even comparator
– e.g. a*b+c*b = (a+c)*b
• Selecting DesignWare implementation
– A variety of optimized designs of arithmetic operations for different optimization purposes
– DC can "remember" the high-level function of these blocks in the low-level netlist phase
– The only task that can recur in a mapped netlist
• One benefit of using DW
– e.g. c = a+b; => DW01_add(.A(a), .B(b), .SUM(c));


Logic-level Optimization (1/2)
• Works on GTECH netlist and consists of two processes
– Structuring
• Constraint-based, useful for noncritical timing paths
• Introduce intermediate variables and logic structures for reducing area
– Before: out1 = a&b|c; out2 = a&b|d;
– After (sharing at the logic level): t = a&b; out1 = t|c; out2 = t|d;
• Factoring out the subfunctions that reduce the most logic


Logic-level Optimization (2/2)
– Flattening
• Not constraint-based (default is false)
• Convert combinational logic into two-level sum-of-products representations for speed optimization
– Not all circuits will be flattened, since the two-level SOP may explode or be inefficient
• Single- or multiple-output optimization
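Why a two-level SOP "may explode" can be seen by flattening a product of sums. The following is a small illustration, not DC's flattening algorithm: the function name is made up, and each factor is assumed to be an OR of distinct variables so that no product term simplifies away.

```python
from itertools import product


def flatten_product_of_sums(factors):
    """Expand a product of sums into its sum-of-products terms.

    factors is a list of OR groups, e.g. [["a", "b"], ["c", "d"]] stands for
    (a|b)&(c|d). Each returned string is one AND term of the flattened SOP.
    """
    terms = {frozenset(choice) for choice in product(*factors)}
    return sorted("&".join(sorted(term)) for term in terms)


print(flatten_product_of_sums([["a", "b"], ["c", "d"]]))

# The number of product terms doubles with every additional two-input OR factor.
for n in (2, 3, 10):
    factors = [[f"x{i}", f"y{i}"] for i in range(n)]
    print(n, "factors ->", len(flatten_product_of_sums(factors)), "product terms")
```

A structured form with n small OR gates and one AND grows linearly in n, while the flattened form grows as 2^n, so flattening only pays off for logic where the SOP stays compact.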

credit: CIC


Gate-level Optimization (1/3)
• Or technology mapping, consists of
– Mapping
• Use combinational and sequential gates from the
target library to generate a gate-level netlist

credit: CIC


Gate-level Optimization (2/3)
– Delay optimization
• Fix delay violations in the mapping phase
• May not clean up design rule violations or meet the area constraint
– Design rule fixing (Quality of your netlist for P&R)
• Common design rules:
– Maximum transition time (for accurate cell timing lookup)
– Maximum/minimum total capacitance (for accurate cell timing)
– Maximum fanout load (for accurate wire load lookup)
• Fixed by inserting buffers or resizing gates
– DC tries not to affect timing and area results


Gate-level Optimization (3/3)

– Area optimization
• Try to meet the area constraint without introducing new violations of design rules and delay constraints
– Power optimization (not included in the labs)
• Dynamic power
– RTL: Clock-gating (in architectural optimization phase)
– RTL: Operand isolation
– Gate-level: Dynamic power optimization
• Leakage power
– Multi-Vt optimization


Outline
• Part I
– Basic Settings for RTL Synthesis
– Cell Library – Cadence GSC (GPDK45)
– Basic Design Constraints
• Part II
– Design Compiler Optimization Process
– Static Timing Analysis
– DesignWare Library


Design Objects
• Design: A design consists of instances, nets, ports, and pins
• Port: Inputs and outputs of a design
• Reference: A library component or design, e.g. INV, REGFILE
• Instance or cell
– An occurrence of a reference
– Instances (which can point to the same reference) have unique names, e.g. U2 and U3
• Pin: Inputs and outputs of an instance or a cell within a design
credit: Synopsys DCUG.

Static Timing Analysis (STA)
• Analyze the timing conditioned by a single clock
• Four timing path types, each from a startpoint (input or flip-flop Q) to an endpoint (output or flip-flop D)
– Flip-flop -> Flip-flop
– Flip-flop -> Output
– Input -> Flip-flop
– Input -> Output
credit: Synopsys TCOUG.


Path Delay
• Path delay = cell delay + net delay
(usually assume zero for synthesis)

credit: Synopsys TCOUG.

[Figure annotations: "Estimated for synthesis"; "Estimated for post-synthesis, accurate for P&R"]


Input/Output Delay

External Input Delay

External Output Delay

credit: Synopsys TCOUG.


Setup/Hold Checks

credit: Synopsys TCOUG.


Example of Setup Check
• Input delay = DFF1 clk->Q delay + M
• Output delay = T + DFF4 setup time
• Clock cycle has to be larger than or equal to each of
– DFF2 clk->Q delay + X + DFF3 setup time
– Input delay + N + DFF2 setup time
– DFF3 clk->Q delay + S + output delay
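The three setup conditions can be evaluated together to find the minimum clock period. This is a minimal sketch with hypothetical delay values in ns (none of them come from the slide); M, N, X, S, and T are the combinational delays named in the bullets, and the function name is made up.

```python
def min_clock_period(clk_to_q, setup, logic):
    """Smallest clock period that satisfies the three setup checks of the example.

    clk_to_q and setup map flip-flop names to delays; logic maps the
    combinational block names (M, N, X, S, T) to delays.
    """
    input_delay = clk_to_q["DFF1"] + logic["M"]
    output_delay = logic["T"] + setup["DFF4"]
    reg_to_reg = clk_to_q["DFF2"] + logic["X"] + setup["DFF3"]
    in_to_reg = input_delay + logic["N"] + setup["DFF2"]
    reg_to_out = clk_to_q["DFF3"] + logic["S"] + output_delay
    return max(reg_to_reg, in_to_reg, reg_to_out)


# Hypothetical delays in ns, for illustration only.
clk_to_q = {"DFF1": 0.2, "DFF2": 0.2, "DFF3": 0.2}
setup = {"DFF2": 0.1, "DFF3": 0.1, "DFF4": 0.1}
logic = {"M": 0.5, "N": 1.0, "X": 2.0, "S": 0.8, "T": 0.4}

print(f"minimum clock period = {min_clock_period(clk_to_q, setup, logic):.1f} ns")
```

With these numbers the register-to-register path through X is the critical one (2.3 ns), while the input and output paths need 1.8 ns and 1.5 ns.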

credit: CIC.

STA with Max/Min Analysis
• Slow PVT corner for setup check
• Fast PVT corner for hold check
Note: Usually only setup time is checked in DC.

credit: CIC.

DesignWare Library
• Technology-independent soft macros
– Multiple pre-designed and optimized architectures for
each macro for speed/area tradeoffs
– Will be synthesized into technology-specific netlist
– As simple as add, sub, shift, …
– As complex as pipelined multiplier, divider, sin/cos, …

credit: CIC


DesignWare Library
• Library can be found at
– /usr/cadtool/cad/synopsys/synthesis/cur/dw
– doc/: documents
– dw/sim_ver: Verilog behavioral models
• Only for simulation; many are not synthesizable
• Don't include these files for synthesis
• Invoke by two approaches
– Inference (depends on DC's algorithm)
• e.g. wire [7:0] c = a+b; (no carry-in/-out specified)
– Instantiation (with an instance name)
• e.g. DW01_add #(8) U1(.A(a), .B(b), .CI(1'b0), .SUM(c), .CO(carry));
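The difference between the two approaches shows up in the carry. Below is a behavioral sketch of a width-bit adder with carry-in and carry-out, written in Python for illustration only. It is not the DesignWare simulation model; add_with_carry is a made-up name, and the operands are assumed to be unsigned.

```python
def add_with_carry(a, b, ci=0, width=8):
    """Behavioral model of a width-bit unsigned adder: returns (sum, carry_out)."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + (ci & 1)
    return total & mask, total >> width


# Instantiation style: both SUM and CO are available.
s, co = add_with_carry(200, 100, ci=0, width=8)
print(f"SUM = {s}, CO = {co}")

# Inference style, like "wire [7:0] c = a+b;": only the low 8 bits are kept.
c = (200 + 100) & 0xFF
print(f"c = {c}")
```

For 200 + 100 the 8-bit sum wraps to 44; the instantiated adder still reports the overflow on CO, while the inferred 8-bit expression silently drops it.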


Example: Library List
• /usr/cadtool/cad/synopsys/synthesis/cur/dw/doc/intro.pdf

Read the document to know the specification before using special operations like DIV, SIN, … (usually not recommended)

Example: DW01_add


Appendix – Synthesis Tips


Suggested Usage
• Use only the "simple" ones
– Adder, shifter, comparator
– Multiplier (signed or unsigned)
• By instantiation
• Avoid "complex" ones
– Divider, SIN, LOG, …
– Floating-point
– You have to figure out how to do Formality checking before using these DW components


Notes on Synthesis Report
• Design rule cost > 0?
– It means the reported timing and power may not be accurate for some logic
– Had better check where the violations occur
• Timing slack?
– Positive slack means your design passes the timing constraint => Totally OK
• The slack does not need to be exactly zero
– Negative slack means your design doesn't pass the current timing constraint
• OK if this is intentional
• e.g. real target = 6ns, assigned constraint = 5ns, reported slack = -0.5ns => the worst timing path is 5.5ns => OK for real target
• Problem: DC stops optimizing your circuits by default if there is any negative timing slack
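The over-constraining example is just a subtraction. A minimal sketch of that arithmetic follows, with made-up function names; it ignores I/O delays, setup time, and clock uncertainty, as the slide's example does.

```python
def worst_path_delay(constraint_ns, slack_ns):
    """Worst path delay implied by the assigned constraint and the reported slack."""
    return constraint_ns - slack_ns


def meets_real_target(constraint_ns, slack_ns, real_target_ns):
    """True if the worst path still fits the real (looser) target."""
    return worst_path_delay(constraint_ns, slack_ns) <= real_target_ns


delay = worst_path_delay(5.0, -0.5)
print(f"worst path = {delay:.1f} ns, OK for 6 ns target: {meets_real_target(5.0, -0.5, 6.0)}")
```

A reported slack of -0.5 ns against a 5 ns constraint means a 5.5 ns worst path, which still meets the real 6 ns target.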


Reminders for Beginners
• Write synthesizable RTL for hardware
– No verification constructs
• wait, initial, while, …
• Don't use latches (except in clock-gating cells)
– neither intentionally nor unintentionally
• Use only one clock and one edge (positive)
– DC handles one clock and one corner at one time
(setup and hold time checked in different corners)

• Professional only!
– Paths between clock A and clock B
• Usually treated as asynchronous
– Paths between different edges
• Need to take care of the quality of the clock source (e.g. duty cycle)

Check The Logs
• Check the logs of VCS, Verdi, SpyGlass, and dc_shell
– Avoid errors and understand warnings
• Check if there is any unintentional latch or
any unresolved design
– SpyGlass
– dc_shell


Squeeze The Timing
• compile_ultra -incremental
– Incremental (local) optimization
– More than one iteration may give the timing you want
• set_critical_range [ns] [DesignTop]
– e.g. set_critical_range 1.0 $TOPLEVEL
– DC then optimizes the paths whose delay is inside the critical range [by default, only the worst path]
[Figure: two slack plots. Default: only try to fix the worst path. With a critical range: fix a wider range of paths.]
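Which paths fall inside the critical range can be sketched as a simple filter. This illustrates the idea only, assuming that the range is measured from the worst slack and that only violating paths are considered; the path names, slacks, and function name are hypothetical.

```python
def paths_in_critical_range(path_slacks, critical_range=0.0):
    """Names of violating paths whose slack is within critical_range of the worst slack."""
    worst = min(path_slacks.values())
    return sorted(
        name
        for name, slack in path_slacks.items()
        if slack < 0.0 and slack <= worst + critical_range
    )


# Hypothetical path slacks in ns.
slacks = {"p1": -2.0, "p2": -1.5, "p3": -0.8, "p4": 0.3}
print("default (range 0.0):", paths_in_critical_range(slacks))
print("range 1.0:          ", paths_in_critical_range(slacks, 1.0))
```

With the default range only p1 is a target; with a 1.0 ns range p2 is also optimized, which can reduce the total negative slack even when the worst path itself cannot be improved.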


Synthesis for P&R
• Hierarchical RTL partitioning
– Give better WLM for smaller enclosed modules
– Remember: WLM less accurate for higher-level blocks
• Keep your high-level blocks as unrelated as possible
• You may set smaller WLM for top blocks to avoid over-estimation
• Preserve timing margin (e.g. 10%~20%, technology dependent)
– In case of under-estimation of WLM
– Timing buffer for placement and routing
– Setup time buffer for fixing hold time violation
• Setup and hold time are fixed in different corners, and the two phases usually do not converge together
• The timing buffer lets you tolerate a "literal" setup violation

Appendix – More Accurate Wire Load Model


Wire Load Model (WLM) in Synopsys SAED library
wire_load ("8000") {                /* model name */
  capacitance : 0.000343;           /* per unit of estimated net length */
  resistance : 1.730000e-03;        /* per unit of estimated net length */
  area : 0.010000;                  /* per unit of estimated net length */
  slope : 90.646360;                /* extrapolation slope if fanout overflows (>20 here) */
  /* Net length table: non-linear table derived by calibration */
  fanout_length("1", \
    "13.9403600");
  fanout_length("2", \
    "31.8040800");
  fanout_length("3", \
    "51.6121200");
  fanout_length("4", \
  …
  fanout_length("19", \
    "834.4876400");
  fanout_length("20", \
    "925.1340000");
}

Net length table: if fanout = 3, net length = 51.61212
  => (estimated) C = 0.000343 x 51.6, R = 1.73e-03 x 51.6, Area = 0.01 x 51.6
Extrapolation: if fanout = 22,
  net length = 925.134 + (22-20)*90.646 (not accurate, should be avoided!)


Wire Load Selection
wire_load_selection (predcaps) {      /* selection group name */
  /* Select WLM by area: wire_load_from_area(min area, max area, model name) */
  wire_load_from_area(0.000000,200.000000, \
    "ForQA");
  wire_load_from_area(200.000000,8000.000000, \
    "8000");
  wire_load_from_area(8000.000000,16000.000000, \
    "16000");
  wire_load_from_area(16000.000000,35000.000000, \
    "35000");
  wire_load_from_area(35000.000000,70000.000000, \
    "70000");      /* select WLM "70000" if the area is between 35K and 70K */
  …
  wire_load_from_area(2000000.000000,4000000.000000, \
    "4000000");
  wire_load_from_area(4000000.000000,8000000.000000, \
    "8000000");
}
p.s. Synthesis constraints define which part of the area is to be concerned.

For advanced technologies, there are many different selection groups, from conservative to aggressive, for you to choose from.
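Area-based selection amounts to a range lookup. The sketch below uses only the ranges visible in the excerpt (the entries elided on the slide are left out, so areas in that gap return None). The function name is made up, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption.

```python
# (min_area, max_area, model name), copied from the visible part of "predcaps".
SELECTION = [
    (0.0, 200.0, "ForQA"),
    (200.0, 8000.0, "8000"),
    (8000.0, 16000.0, "16000"),
    (16000.0, 35000.0, "35000"),
    (35000.0, 70000.0, "70000"),
    # ... entries elided on the slide ...
    (2000000.0, 4000000.0, "4000000"),
    (4000000.0, 8000000.0, "8000000"),
]


def select_wlm(area):
    """Return the wire load model name for a block area, or None if no listed range matches."""
    for low, high, name in SELECTION:
        if low <= area < high:
            return name
    return None


for area in (100.0, 50000.0, 3000000.0):
    print(f"area {area:>10.1f} -> WLM {select_wlm(area)}")
```

In enclosed mode this lookup is applied per hierarchical block, so a small block such as an ALU gets a lighter model than the top level.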

Example of Area-Selected WLM
DC timing report:
Operating Conditions: ss0p95v125c Library: saed32hvt_ss0p95v125c
Wire Load Model Mode: enclosed

Startpoint: top0/ID_EXE/EXE_ALUSrc_reg (rising edge-triggered flip-flop clocked by clk)
Endpoint: top0/ID_stage/regfile/gpr_reg_21__31_ (rising edge-triggered flip-flop clocked by clk)
Path Group: clk
Path Type: max

Des/Clust/Port     Wire Load Model    Library
------------------------------------------------
top_pipe           280000             saed32hvt_ss0p95v125c
ID_EXE             8000               saed32hvt_ss0p95v125c
top                280000             saed32hvt_ss0p95v125c
alu                8000               saed32hvt_ss0p95v125c
regfile            16000              saed32hvt_ss0p95v125c

Point                                                Incr       Path
--------------------------------------------------------------------------
clock clk (rise edge)                                0.0000     0.0000
clock network delay (ideal)                          0.0000     0.0000
top0/ID_EXE/EXE_ALUSrc_reg/CLK (DFFSSRX1_HVT)        0.0000 #   0.0000 r
top0/ID_EXE/EXE_ALUSrc_reg/QN (DFFSSRX1_HVT)         0.1662     0.1662 f
top0/ID_EXE/U100/Y (INVX2_HVT)                       0.0440     0.2102 r
