Proceedings of the 7th International Conference on Intelligent Computing and Control Systems (ICICCS-2023)

IEEE Xplore Part Number: CFP23K74-ART; ISBN: 979-8-3503-9725-3

Design and Implementation of High-Performance PnR Block-Level Design with Timing Placement
2023 7th International Conference on Intelligent Computing and Control Systems (ICICCS) | 979-8-3503-9725-3/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICICCS56967.2023.10142497

Dr. J. Charles Pravin, Keshav Kondabala, Arava Santosh Reddy, Onteru Mahesh, Siripireddy Satish Reddy, Chilakuri Vishnu Chaithanya

Department of ECE, Kalasalingam Academy of Research and Education, Krishnankoil, Tamilnadu, India

Abstract—The growing number of standard cells and macros in integrated chips increases signal distribution delay, and this timing delay can limit chip performance. Most researchers have run simulation experiments to minimize timing by reducing slack, with results that involve some tradeoffs. This study aims to minimize timing delay through efficient slack management in place-and-route strategies. To achieve the reduced timing, macros were placed close to the core boundary. The proposed work is evaluated using data required time, data arrival time, and violated slack as performance metrics. The proposed methodology reduces the data required time to 32.6%, the data arrival time to 73.4%, and the violated slack time to 45%, comparing before and after placement. This makes the approach useful for reducing delay and timing in circuits built in 40nm technology.

Keywords—Register-transfer level, Graphic Design System, Graphical User Interface, Clock tree synthesis, Timing-driven placement, Quality of Results, Power and Ground, Design Rule Checking, Layout Versus Schematic, Design for Manufacturing

I. INTRODUCTION
A PnR block-level design flow consists of several stages: floor planning, power planning, placement, clock tree synthesis, routing, and LVS, DFM, and DRC checks. Most of the chip design is done in the placement stage. At this stage the tool checks the number of remaining ideal nets, the maximum number of routing layers, the number of scan chains, the number of tap cells, and the standard cells placed in the design. Each standard cell must occupy the minimum area in the core area of the design. Macros placed in a design are grouped by type.

Every macro placement approach had a different effect on timing, congestion, and power for various types of dies, which made it possible to automate floor planning for different die shapes and shorten cycle times [1]. Timing, congestion, and power are all variables that affect how cells are placed in various die shapes. Selecting a rectangular die for placement gave the best congestion of 0.1% and shortened the cycle time from weeks to days for a design with 500k instances and 18 macros at a utilization of 59%. At a utilization of around 49%, the Rectilinear Die (L-Shape) and Rectilinear Die (U-Shape) show how congestion, timing, QoR, and power vary with each type of macro placement [1].

The majority of placement strategies work to reduce the overall wire length. Timing-driven placement instead focuses on the wires on timing-critical paths. A cell frequently connects to two or more other cells, forming multi-pin nets. As a result, placement must be carried out with care and balance [2].

One of the factors driving the development of cutting-edge placement strategies to improve circuit performance is technology scaling and integration. The abundance of macros and standard cells in modern system-on-chip (SoC) architectures, along with other factors, contributes to the increased ratio of interconnect delay to gate delay [2]. As on-die functional integration levels increase, global interconnects grow longer. These factors continue to make improved placement difficult.

II. PREVIOUS WORK
A. Floorplan
The floor plan first establishes the boundary between the die and the core, from either the core utilization factor or the dimensions of the core and die. The macros are then arranged by a set of rules: grouping similar types of macros, minimizing stacking, arranging macros according to the number of connections between them, orienting all macro pins towards the core area, and leaving no dead space in the core area [3]. After the macros and blockages are fixed, the standard cells are placed [4].

B. Powerplan
Building the power grid, also referred to as the power plan, is the next stage after the floor plan. The power mesh is built from vertical and horizontal straps. The width and spacing of the metal straps, together with the floor plan, determine how many vertical and horizontal straps are needed. Keeping power consumption low is one of the primary objectives of the power plan [5].
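The dependence of the strap count on strap width, pitch, and offset can be made concrete with a small sketch. It is not taken from the paper or from any tool: the function name `strap_count`, the 500 um by 400 um core size, and the placement rule (first strap edge at the offset, one strap per pitch, only straps that fit entirely inside the span are counted) are assumptions chosen for illustration, and real power-planning tools apply additional rules.

```python
import math


def strap_count(span_um: float, pitch_um: float, offset_um: float, width_um: float) -> int:
    """Count parallel straps that fit across a span (hypothetical rule).

    Assumes the first strap's leading edge sits at `offset_um`, each
    following strap starts one `pitch_um` later, and a strap is counted
    only if it lies entirely inside the span.
    """
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    usable = span_um - offset_um - width_um
    if usable < 0:
        return 0  # not even one strap fits
    return math.floor(usable / pitch_um) + 1


# Illustrative numbers only: a 500 um x 400 um core is an assumption.
vertical_straps = strap_count(500.0, pitch_um=10.0, offset_um=1.0, width_um=4.5)
horizontal_straps = strap_count(400.0, pitch_um=10.0, offset_um=1.0, width_um=4.5)
print(vertical_straps, horizontal_straps)
```

Under this rule a wider strap or a larger pitch lowers the count, which is the tradeoff between routing resources and grid resistance that the power plan has to balance.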


C. Placement
At this stage, all standard cells are placed and legalized in standard cell rows according to the timing constraints. In addition to regular standard cells, this step also places special cells. Buffers are inserted while the standard cells are placed to maintain synchronization. After all necessary cells are placed, a global route check is done.

D. Clock Tree Synthesis
In the physical design cycle, clock tree synthesis (CTS) comes just after placement and before routing. To ensure the least possible skew, the clock must be connected so that the clock signal reaches all cells at the same time. The tool places repeaters in the clock path to minimize skew [6]. Before CTS, only setup timing is taken into account; because real clock paths exist once the clock tree is built, hold timing is also taken into account from then on.

E. Routing
During routing, the tool determines the connections between the various components. To do this, it must identify the standard cell and macro pins as well as the pins on the block boundary or the pads near the chip's perimeter. Once placement and CTS are finished, the tool has exact information about the positions of the blocks, their pins, and the I/O pads at the chip boundary [7].

III. PROPOSED WORK
The proposed method consists of several stages. The logical and physical data setup, checks, constraints, and netlist are prepared to make a design. The netlist is provided as input to the floorplan. Once the floorplan is designed and all macros are arranged manually, the power plan is processed. In the power plan, PG grids are placed for the standard cells and the already placed macros. Placement follows the power plan. If global congestion is acceptable after placement, the design moves on to the clock tree synthesis stage; otherwise the floorplan and power plan are redone. After clock tree synthesis, routing is done, followed by the DRC, DFM, and LVS checks (Fig. 1).

Fig. 1. Flowchart of proposed work

TABLE I. SPECIFICATIONS OF PROPOSED WORK

Block level design | 40 nm technology
Macros | 34
Standard cells | 38K
Clock frequency | 1 GHz
Supply voltage | 1.1 V
Metal layers | 5
Max IR drop | 5% of supply voltage
Power budget | 600 mW

TABLE I lists the specifications of the design. All experiments were done in the Synopsys IC Compiler II tool.

A. Floorplan of proposed work

Fig. 2. Placement of macros manually

Fig. 2 shows the macros placed manually during floorplanning. Connections between macros of the same type and color grouping were used to construct the floorplan (Fig. 2). To achieve a congestion-free design, all pins must face the core area, macro stacking must be limited, and there must be no dead space.
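The iterative flow of Fig. 1 (floorplan, power plan, placement, a congestion check that either loops back to the floorplan or proceeds to clock tree synthesis, routing, and the final checks) can be sketched as a small loop. This is only a schematic of the control flow: the function `run_flow`, the stage names, and the `congestion_ok` callback are hypothetical and do not correspond to any tool command.

```python
from typing import Callable, List


def run_flow(congestion_ok: Callable[[int], bool], max_iterations: int = 5) -> List[str]:
    """Return the ordered list of stages executed (hypothetical stage names).

    `congestion_ok(attempt)` stands in for the global congestion check
    after placement; it receives the 1-based attempt number.
    """
    executed: List[str] = []
    for attempt in range(1, max_iterations + 1):
        executed += ["floorplan", "powerplan", "placement"]
        if congestion_ok(attempt):
            # Congestion is acceptable: continue with the rest of the flow.
            executed += ["cts", "routing", "drc_dfm_lvs"]
            return executed
        # Otherwise loop back and redo floorplan and power plan.
    raise RuntimeError("congestion not resolved within max_iterations")


# Example: the first placement is congested, the second one is clean.
stages = run_flow(lambda attempt: attempt >= 2)
print(stages)
```

The example mirrors the experience reported later in the paper, where the first placement run was congested and the macros had to be moved before the flow could continue.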


B. Power Planning of proposed work

Fig. 3. Power plan analysis graph

Fig. 4. Graph representation of IR drop

The goal of power planning is to achieve the lowest possible power consumption. However, smaller technology nodes also see considerable power consumed in the interconnecting PG wires. Only dynamic power reduction solutions with the requisite standard cells are given in [3]. IR drop is caused by current flowing through resistance. The IR drop was tested by varying the width, metal-to-metal spacing, pitch, and offset of the PG straps. The goal was to bring the IR drop within the limit listed in TABLE I (5% of the supply voltage). Resistance varies with strap width.

After building the PG mesh and running the IR drop check, the final IR drop was 49.2 mV (Fig. 3), which is within the permitted range listed in TABLE I (5% of the supply voltage). Fig. 4 shows the results of the power analysis together with the strap parameter values. The first IR drop check used width = 3, spacing = 0.5, pitch = 10, and offset = 1. The strap parameters are shown on the graph in yellow, orange, green, and violet, and the resulting IR drop in cyan. The PG mesh generated with these values produced an IR drop of 50 mV, which was undesirable for the power plan's specification criteria. The second IR drop check used width = 4, spacing = 0.5, pitch = 10, and offset = 1, and the resulting IR drop was 51 mV, which is likewise not optimal. In the last iteration the width was changed to 4.5 while the other values stayed the same. IR drop was checked using the IR drop map in the Synopsys IC Compiler II tool.

C. Placement of proposed work

Fig. 5. Placement of Standard Cells

The placement step was carried out according to the number of connections between standard cells and macros, so that all standard cells were placed in the core region. In addition to regular standard cells, the core region contains special cells such as tap cells, boundary cells, and filler cells (Fig. 5). Every standard cell is placed in its appropriate row. After placement, we checked the global route congestion.

Fig. 6. Global route congestion map (with congestion)

Fig. 6 shows the result of the first placement run. Because of routing congestion, we returned to the floorplan, moved the macros to add some space between them, and repeated all steps up to placement.

Fig. 7. Global route congestion map (without congestion)
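The global route congestion check that drives these placement iterations can be pictured with a simple overflow metric. The sketch below is a generic illustration, not the definition used by the tool in the paper: the function `overflow_percent`, the per-cell routing demand and capacity lists, and all numbers are hypothetical.

```python
from typing import Sequence


def overflow_percent(demand: Sequence[int], capacity: Sequence[int]) -> float:
    """Routing overflow as a percentage of total track capacity.

    Each index is one global routing cell. Overflow in a cell is the
    demand above its capacity; cells with spare capacity contribute 0.
    """
    if len(demand) != len(capacity):
        raise ValueError("demand and capacity must have the same length")
    total_capacity = sum(capacity)
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    overflow = sum(max(0, d - c) for d, c in zip(demand, capacity))
    return 100.0 * overflow / total_capacity


# Hypothetical numbers: four routing cells with 10 tracks each.
capacity = [10, 10, 10, 10]
first_placement = [12, 9, 14, 10]   # macros packed tightly, two cells over capacity
after_spacing = [10, 9, 10, 10]     # macros moved apart, demand spread out
print(overflow_percent(first_placement, capacity))
print(overflow_percent(after_spacing, capacity))
```

A nonzero overflow corresponds to the congested map of Fig. 6, and zero overflow to the congestion-free map of Fig. 7.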


After several iterations of the placement stage, the global route congestion was cleared; Fig. 7 shows the congestion-free map.

TABLE II. SETUP SLACK BEFORE AND AFTER PLACEMENT

Points | Before Placement | After Placement
Data required time | 0.72ns | 0ns
Data arrival time | -2.68ns | -1.24ns
Slack (violated) | -1.96ns | -1.24ns

TABLE II shows the violated setup slack before and after placement. Before placement the slack is -1.96ns; after placement it improves to -1.24ns, which shows that timing optimization has taken place.

Fig. 8. Number of endpoints vs slack in func.ss_125c

Fig. 8 shows the timing and the number of violating endpoints before and after placement in the func.ss_125c scenario. Only setup violations are counted because this is the placement stage. Before placement, both sys_rclk and uart_clk have more violating endpoints; after several iterations the endpoints were reduced and optimized to improve the timing of the design. Timing violations were reduced by gate resizing, buffer/inverter insertion, and selecting registers with appropriate setup times.

Fig. 9. Number of endpoints vs slack in the test.ss_125c

Fig. 9 shows the timing and the number of violating endpoints before and after placement in the test.ss_125c scenario. Again, only setup violations are counted because this is the placement stage. The endpoints of both scan_rclk and uart_clk vary across slack values, which may reduce the timing violations and optimize the timing on particular paths.

D. Clock tree synthesis of proposed work
This work uses concurrent clock and data (CCD) optimization to minimize timing slack violations. The number of clocks used in this study is listed in TABLE I, along with the location of the relevant cells in the clock paths. The clock port was connected to the clock pin of each sink following the CCD approach.

Fig. 10. Number of timing violated paths before and after clock tree synthesis

Fig. 10 shows the timing in each scenario and the number of violating endpoints before and after CTS, along with the number of violated paths. Because only setup violations are considered before CTS, hold violations number in the thousands at that point. After CTS, violated paths dropped by 90% in the design scenario and by 97% in the test scenario at the ff 125c corner. This shows that clock tree synthesis succeeded.

E. Routing of proposed work

Fig. 11. Metal routing of signal pins of all the standard cells

All signal pins of the cells in the design were connected with metal wires using 40nm design rules and non-default routing rules (Fig. 11), and the physical verification checks (DRC, LVS, and DFM) were successful.
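As a worked check of TABLE II, setup slack is the data required time minus the data arrival time, and a negative result is a violation. The sketch below assumes that the minus sign printed in front of the data arrival time in TABLE II is the subtraction notation commonly used in static timing reports, so the arrival times themselves are taken as 2.68 ns and 1.24 ns. The function name `setup_slack` is illustrative.

```python
def setup_slack(required_ns: float, arrival_ns: float) -> float:
    """Setup slack in ns: data required time minus data arrival time."""
    return required_ns - arrival_ns


# Values from TABLE II, with arrival times read as positive (assumption).
before = setup_slack(required_ns=0.72, arrival_ns=2.68)
after = setup_slack(required_ns=0.0, arrival_ns=1.24)

for label, slack in (("before placement", before), ("after placement", after)):
    status = "VIOLATED" if slack < 0 else "MET"
    print(f"{label}: slack = {slack:.2f} ns ({status})")
```

Both slacks remain negative, so the setup check is still violated after placement, but the violation shrinks by 0.72 ns.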


TABLE III. PROPOSED WORK VS PREVIOUS WORK

Stage | Previous work | Proposed work | Outcome
Floorplan | 28 macros arranged manually. | 34 macros arranged manually using fly lines. | Wire length was reduced and repetition stages were also minimal.
Powerplan | Mesh-type powerplan. Used 10% of the source voltage. | Mesh-type powerplan. Used 5% of the source voltage. | Better IR drop. Reduced to 50% IR drop.
Placement | 30k standard cells used. Placed standard cells. | 38k standard cells placed. Minimizing the slack using the minimal number of tap cells and filler cells. | Timing optimized up to 45% by reducing the slack in the design.
Clock tree synthesis | The classic method is used. | CCD method is used. | Run time reduced to 50%. The data path and clock path run simultaneously.
Routing | 80nm technology node used. | 40nm technology node was used and cleared more DRC violations. | Minimal short and opens reduces to 30% compared to 80nm node. Minimized DRC violations.

TABLE III compares previous work in 80nm technology with the proposed work in 40nm technology at each stage of physical design.

IV. RESULTS AND CONCLUSION
Based on the findings of this study, achieving reduced timing during placement is crucial to simplify the subsequent physical design steps in 40nm technology. To lower power consumption in power planning, it is recommended to use metal straps 4.5 micrometers wide with a spacing of 0.5 micrometers between them and an offset of 1. To address congestion in the standard cell global route during placement, soft blockages should be used. Additionally, integrating a concurrent clock and data optimization (CCD) technique in clock tree synthesis can help minimize skew. These measures can improve the efficiency and performance of the physical design process in 40nm technology.

ACKNOWLEDGMENT
We thank the management of Kalasalingam Academy of Research and Education and the Department of Electronics and Communication Engineering for providing computational facilities at the VLSI Research Lab. We also thank Mr. Manjunath and Mr. B.K.Sreenath from Nanochip Solutions for their support in the successful completion of this work.

REFERENCES
[1] Garg, Shivani, and Neeraj Kr Shukla. "A Study of Floorplanning Challenges and Analysis of macro placement approaches in Physical Aware Synthesis." International Journal of Hybrid Information Technology 9.1 (2016): 279-290.
[2] Venkatesh, S. V. (1995, May). Hierarchical timing-driven floorplanning and place and route using a timing budgeter. In Proceedings of the IEEE 1995 Custom Integrated Circuits Conference (pp. 469-472). IEEE.
[3] Lin, Jai-Ming, and Ji-Heng Wu. "F-FM: Fixed outline floorplanning methodology for mixed-size modules considering voltage-island constraint." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 33.11 (2014): 1681-1692.
[4] Ravali, K., S. Ravi, and Harish M. Kittur. "Power Optimization Techniques and Physical Design Flow on Repeaters for High-Speed Processor Core in sub 14nm." 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT). IEEE, 2018.
[5] Shen, Weixiang, et al. "An effective gated clock tree design based on activity and register aware placement." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 18.12 (2009): 1639-1648.
[6] Sai, Manoj PD, et al. "Reliable 3-D clock-tree synthesis considering nonlinear capacitive TSV model with electrical–thermal–mechanical coupling." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 32.11 (2013): 1734-1747.
[7] Cho, Minsik, and David Z. Pan. "A high-performance droplet routing algorithm for digital microfluidic biochips." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27.10 (2008): 1714-1724.
[8] Marek-Sadowska, Malgorzata, and Shen P. Lin. "Timing Driven Placement." Electronics Research Laboratory, University of California, 1989.
[9] Guo, Z., & Lin, Y. (2022, July). Differentiable timing-driven global placement. In Proceedings of the 59th ACM/IEEE Design Automation Conference (pp. 1315-1320).
[10] Fadnavis, J., & Kariyappa, B. S. (2021). PNR flow methodology for congestion optimization using different macro placement strategies of DDR memories. International Journal of Advanced Technology and Engineering Exploration, 8(80), 903.
[11] Aland, S., & Venkateswarlu, V. (2012). Block Level Physical Design of Interfacing Module in RISC Core.
[12] Chang, C. C., Lee, J., Stabenfeldt, M., & Tsay, R. S. (1994, December). A practical all-path timing-driven place and route design system. In Proceedings of APCCAS'94-1994 Asia Pacific Conference on Circuits and Systems (pp. 560-563). IEEE.
[13] Hutton, M., Adibsamii, K., & Leaver, A. (2001, February). Timing-driven placement for hierarchical programmable logic devices. In Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays (pp. 3-11).
[14] Chang, Robert C., and B-H. Lim. "Efficient IP routing table VLSI design for multigigabit routers." IEEE Transactions on Circuits and Systems I: Regular Papers 51.4 (2004): 700-708.
[15] Sterpone, Luca, and Massimo Violante. "A new reliability-oriented place and route algorithm for SRAM-based FPGAs." IEEE Transactions on Computers 55.6 (2006): 732-744.
[16] Vansteenkiste, Elias, et al. "TPaR: place and route tools for the dynamic reconfiguration of the FPGA's interconnect network." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 33.3 (2014): 370-383.
[17] Lin, Tung-Liang, and Sao-Jie Chen. "A Platform of Resynthesizing a Clock Architecture Into Power-and-Area Effective Clock Trees." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 39.10 (2019): 2475-2488.
[18] Lu, Jingwei, Wing-Kai Chow, and Chiu-Wing Sham. "Fast power- and slew-aware gated clock tree synthesis." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20.11 (2011): 2094-2103.
