Design and Implementation of High-Performance PNR Block-Level Design With Timing Placement
Abstract: Signal-distribution delay in integrated chips grows with the number of standard cells and macros, and this added timing delay can limit chip performance. Most researchers have run simulation experiments to minimize timing by reducing slack, which has produced compromised results with some tradeoffs. This study aims to minimize timing delay through efficient slack management in place-and-route strategies. To achieve the reduced timing, macros were placed close to the core boundary. The proposed work is evaluated using data required time, data arrival time, and violated slack as performance metrics. The proposed methodology reduces the data required time to 32.6%, the data arrival time to 73.4%, and the violated slack time to 45% between the before-placement and after-placement cases. This makes the approach useful for reducing delay and timing in circuits built in 40 nm technology.

Keywords: Register-transfer level, Graphic Design System, Graphical User Interface, Clock tree synthesis, Timing-driven placement, Quality of Results, Power and Ground, Design Rule Checking, Layout Versus Schematic, Design for Manufacturing

I. INTRODUCTION
A PnR block-level design flow consists of several stages: floor planning, power planning, placement, clock tree synthesis, routing, and LVS, DFM, and DRC checks. Most of the design of the chip is done in the placement stage. At this stage, the flow checks the number of remaining ideal nets, the maximum number of routing layers, the number of scan chains, the number of tap cells, and the standard cells placed in the design. Each standard cell must occupy the minimum area in the core area of the design. Macros placed in a design are grouped by type.

Every macro placement approach had a different effect on the timing, congestion, and power supply for various types of dies, which made it possible to automate floor planning for different die shapes and shorten cycle times [1]. Timing, congestion, and power all affect how cells are placed in various die shapes. Selecting a rectangular die for placement gave the best congestion of 0.1% and shortened the cycle time from weeks to days for a design with 500k instances and 18 macros at a utilization of 59%. Utilization of around 49% for the Rectilinear Die (L-Shape) and Rectilinear Die (U-Shape) demonstrates that congestion, timing, QoR, and power vary with each type of macro placement [1].

The majority of the placement strategies discussed in the prior chapters work to reduce the overall wire length. The wires on timing-critical paths are the main focus of the placement strategy. Note that a cell frequently connects to two or more other cells, forming some focused nets. As a result, placement must be carried out with great care and balance [2].

One of the factors affecting the development of cutting-edge placement strategies to improve circuit performance is the scaling and integration of technology. In modern system-on-chip (SoC) architectures, the abundance of macros and standard cells, among other factors, contributes to the increased ratio of interconnect delay to gate delay [2]. As on-die functional integration levels increase, global interconnects lengthen. These factors continue to make improved placement difficult.

II. PREVIOUS WORK
A. Floorplan
Given the core utilization factor or the dimensions of the core and die area, the floor plan first establishes the boundary between the die and the core. The macros are then arranged according to rules such as grouping similar types of macros, minimizing stacking, arranging the macros according to the number of connections between them, orienting all macro pins towards the core area, and leaving no dead space in the core area [3]. After the macros and blockages are fixed, the standard cells are placed [4].

B. Powerplan
Building a power grid, also referred to as a power plan, is the next stage after the floor plan. The power mesh is built from vertical and horizontal straps. The width and spacing of the metal straps, together with the floor plan, determine how many vertical and horizontal straps are needed. Maintaining lower power usage is one of the primary objectives of the power plan [5].
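The relationship between utilization, core size, and strap count described for the floor plan and power plan can be sketched in a few lines of Python. This is only a rough illustration, not the sizing method used in this work. It assumes the common definition of core utilization as placed cell area divided by core area, and it estimates the strap count as the core span divided by the strap pitch (width plus spacing), ignoring edge offsets. The function names, the cell area, and the strap dimensions are hypothetical; only the 59% utilization figure comes from the text.

```python
import math


def core_dimensions(cell_area_um2, utilization, aspect_ratio=1.0):
    """Return (width, height) in um of a rectangular core.

    Assumes utilization = cell_area / core_area and
    aspect_ratio = height / width (both are illustrative conventions).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    core_area = cell_area_um2 / utilization
    width = math.sqrt(core_area / aspect_ratio)
    height = core_area / width
    return width, height


def strap_count(span_um, strap_width_um, strap_spacing_um):
    """Rough number of straps that fit across a span.

    The pitch is strap width plus spacing; edge offsets are ignored.
    """
    pitch = strap_width_um + strap_spacing_um
    return math.floor(span_um / pitch)


# Hypothetical cell area and strap geometry; 0.59 is the utilization quoted above.
width, height = core_dimensions(cell_area_um2=590_000.0, utilization=0.59)
vertical_straps = strap_count(width, strap_width_um=2.0, strap_spacing_um=28.0)
horizontal_straps = strap_count(height, strap_width_um=2.0, strap_spacing_um=28.0)
print(f"core: {width:.1f} um x {height:.1f} um")
print(f"vertical straps: {vertical_straps}, horizontal straps: {horizontal_straps}")
```

Lowering the utilization enlarges the core for the same cell area, which in turn raises the number of straps needed at a fixed pitch.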
III. PROPOSED WORK
TABLE I lists the specifications of the design; all experiments were run in the Synopsys IC Compiler II tool.
B. Power Planning of proposed work
remained the same. IR drop was tested by using the IR drop map in the Synopsys IC Compiler II tool.
After several iterations of this placement stage, the global route congestion was cleared; the second graph shows the congestion-free result (Fig. 7).

TABLE II. SETUP SLACK BEFORE AND AFTER PLACEMENT
Points               Before Placement   After Placement
Data required time   0.72 ns            0 ns
Data arrival time    -2.68 ns           -1.24 ns
Slack (violated)     -1.96 ns           -1.24 ns

TABLE II shows the setup slack (violation) before and after placement. Before placement the slack is -1.96 ns; after placement it improves to -1.24 ns, which shows that timing optimization has taken place.

Fig. 8. Number of endpoints vs slack in func.ss_125c

Fig. 8 shows the timing in each scenario and the number of violating endpoints before and after placement. Only setup violations are counted because this is the placement stage. Both sys_rclk and uart_clk have more violating endpoints before placement; after several iterations, the endpoints were minimized and optimized to reduce the timing of the design. Timing violations were reduced by gate resizing, buffer/inverter insertion, and selecting appropriate registers with different setup times.

Fig. 9 shows the timing in each scenario and the number of violating endpoints before and after placement. Only setup violations are counted because this is the placement stage. Initially, the endpoints of both scan_rclk and uart_clk vary across slack values, which may reduce the timing violations and optimize the timing in particular paths.

D. Clock tree synthesis of proposed work
This work uses concurrent clock and data (CCD) optimization to keep timing slack violations as small as possible. The number of clocks used in this study is listed in TABLE I, along with the relevant cell's location in the clock paths. The clock port was connected to each sink's clock pin following CCD.

Fig. 10. Number of timing violated paths before and after clock tree synthesis

Fig. 10 shows the timing in each scenario and the number of violating endpoints before and after CTS. It also shows how many paths were violated. Since only setup violations are taken into account before CTS, hold violations number in the thousands. Following CTS, violated paths dropped by 90% in the design scenario and by 97% in the test scenario for the ff 125c. Hence, clock tree synthesis succeeded.

E. Routing of proposed work
Fig. 11. Metal routing of signal pins of all the standard cells
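For reference, the setup-slack rows of TABLE II can be reproduced with the usual relation slack = data required time - data arrival time. The sketch below is illustrative only. It assumes that the minus sign on the arrival times in TABLE II marks a quantity subtracted from the required time, so the arrival magnitudes (2.68 ns and 1.24 ns) are passed in as positive values. The function name is hypothetical.

```python
def setup_slack(required_ns, arrival_ns):
    """Setup slack in ns; a negative result is a violation."""
    return required_ns - arrival_ns


# Values from TABLE II; arrival times entered as magnitudes (assumption).
before = setup_slack(required_ns=0.72, arrival_ns=2.68)
after = setup_slack(required_ns=0.00, arrival_ns=1.24)
improvement = after - before

print(f"slack before placement: {before:.2f} ns")
print(f"slack after placement:  {after:.2f} ns")
print(f"improvement:            {improvement:.2f} ns")
```

Under this sign convention the computed slacks match the table's -1.96 ns and -1.24 ns entries.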