
Report on ML-Based Abnormal Simulation Detection in SOC Verification

Kolluru Yashwanth
March 2024

1 Abstract
In the referenced paper, the authors propose an Abnormal Simulation Detector (ASD), based on the machine learning algorithm multiple linear regression, to reduce the overall simulation time of SOC verification.

2 Introduction
2.1 SOC Verification
SOC stands for System-on-Chip. It refers to an integrated circuit (IC) that integrates multiple electronic components or subsystems into a single chip. These components typically include a central processing unit (CPU), memory, input/output interfaces, and various peripherals. SOCs are commonly used in a wide range of electronic devices, including smartphones, tablets, wearable devices, digital cameras, automotive systems, and IoT devices.
It is crucial to make sure that all the parts of an SOC fit together and do what they are supposed to do before the manufacturing phase. This not only mitigates the risk of design flaws but also contributes to enhanced product quality and reduced development cost; this is where SOC verification comes into the picture. SOC verification uses various methodologies, such as UVM environments, formal verification techniques, and code and functional coverage, to ensure the above.

2.2 Multiple Linear Regression


In machine learning, multiple linear regression is a fundamental supervised learning technique used for predictive modeling. It is employed when we want to predict a continuous target variable from multiple input features. Think of it as fitting a line (more precisely, a hyperplane) to a multidimensional dataset, where each feature represents a dimension. Formally, the model takes the form y = b0 + b1*x1 + b2*x2 + ... + bn*xn, where each coefficient bi is learned from the training data. For example, in predicting house prices, we might consider features like square footage, number of bedrooms, and location. Multiple linear regression helps us understand how each feature contributes to the prediction. By learning the relationships between these features and the target variable from training data, the model can then make predictions on new, unseen data. It is a cornerstone of regression analysis, widely applied in fields like finance, healthcare, and marketing for making forecasts and informed decisions.
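
To make this concrete, the following is a minimal sketch of multiple linear regression using scikit-learn; the house-price features and numbers are invented for illustration and are not taken from the paper.

    # Minimal sketch of multiple linear regression with scikit-learn.
    # Features and prices below are illustrative, not real data.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy data: [square footage, number of bedrooms, distance to city (km)]
    X = np.array([
        [1400, 3, 10.0],
        [2000, 4, 5.0],
        [850, 2, 15.0],
        [1700, 3, 8.0],
    ])
    y = np.array([240000, 380000, 150000, 300000])  # house prices

    model = LinearRegression()
    model.fit(X, y)

    # Each coefficient estimates how much the price changes per unit of a feature.
    print("coefficients:", model.coef_)
    print("intercept:", model.intercept_)

    # Predict the price of a new, unseen house.
    print("prediction:", model.predict([[1600, 3, 7.0]]))

The learned coefficients correspond to the bi values in the regression equation above, which is what makes the model easy to interpret.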

3 Motivation
3.1 Challenges in SOC verification
• Due to the increase in design size and complexity, more testcases are needed to check an SOC chip, which increases the use of EDA licenses and computing resources as the overall simulation time grows.
• If a simulation goes fine, the verification process normally takes 5-6 hours; but if the simulation hangs, i.e., an abnormal simulation occurs, it takes 2 to 100 times longer than a normal passed simulation.

3.2 Causes of abnormal simulation


Abnormal simulations can be caused by various issues, such as a shortage of storage, network problems, protocol violations, unintended sequences and testbench errors, and VIP (verification IP) problems.

3.3 Effects due to abnormal simulation


• These abnormal simulations may run indefinitely until a TIMEOUT is reached, or until an engineer notices them and stops them explicitly.
• Limited compute, storage, and license resources are occupied uselessly.
• Cost and time increase unnecessarily.
Figure 1 shows an example of the run time taken by an SOC verification simulation in a normal case and in an abnormal case.

Figure 1: Run time taken by SOC Verification

The figure above is reproduced from the referenced paper.

3.4 Methods to Reduce the abnormal simulation time
• Monitor the simulation terminals for hours.
• Set a TIMEOUT value on all tests. This is difficult because there are too many testcases, and it is hard to assign a timeout to every testcase since simulation time depends on the functions and scenarios being verified.
All of these methods must be performed manually, so the authors suggest an innovative way of automating the process.

4 Proposed Idea
An ASD (Abnormal Simulation Detector) has been proposed. This detector predicts the expected simulation run time for a testcase using the ML algorithm multiple linear regression. The steps are as follows:
1. Collect simulation data from previous projects for training, such as project name, test name, result, and simulation run time.

2. Encode the input data and its simulation run time into vectors using the 'bag-of-words' concept, shown below in Table 1, and then convert them into a binary format, shown in Table 2 (a minimal sketch of this encoding and training appears after Table 2).
The input data has the format [project name, block or IP, user-defined test name/scenario, user-defined test option].
NUM     Testname (input data)                    Run time
1       project1-dram-access-test-FASTBOOT       3h 15min
2       project2-spmi-basic-test-MULTIMASTER     7h 4min
3       -                                        -
4       -                                        -
-       -                                        -
-       -                                        -
20000   project10-i2c-interrup-test-GETDUMP      10h 37min

Table 1: Training data set.

3. Train the ML model with these encoded vectors, optimizing the parameters and the model.

4. After training, the model is ready to predict the expected run time; we can give the project name and test name as inputs.
According to the paper, an accuracy of 91.03% has been achieved.

1 0 1 0 - 0 1
1 0 0 0 - 0 1
1 1 0 0 - 0 1
- - - - - - -
- - - - - - -
1 0 1 0 - 0 0
1 0 0 0 - 0 0

Table 2: Binary (bag-of-words) format.
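
The following is a minimal sketch of the encoding and training steps, assuming hypothetical test names and run times; the actual data format, vocabulary, and model tuning in the paper may differ.

    # Sketch: binary bag-of-words encoding of test names, then multiple
    # linear regression on run time. Test names and times are made up.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LinearRegression

    test_names = [
        "project1-dram-access-test-FASTBOOT",
        "project2-spmi-basic-test-MULTIMASTER",
        "project10-i2c-interrup-test-GETDUMP",
    ]
    run_time_hours = [3.25, 7.07, 10.62]  # target run times, in hours

    # Split each name on '-' and mark word presence/absence with 0/1,
    # mirroring the binary format of Table 2.
    vectorizer = CountVectorizer(
        tokenizer=lambda name: name.split("-"), token_pattern=None, binary=True
    )
    X = vectorizer.fit_transform(test_names)

    model = LinearRegression()
    model.fit(X, run_time_hours)

    # Predict the expected run time of a new (hypothetical) testcase.
    new_test = vectorizer.transform(["project1-spmi-basic-test-FASTBOOT"])
    print("expected run time (h):", model.predict(new_test)[0])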

5 Integrating with the Verification Process

Verification proceeds in phases, and here the pre-run phase and the run phase are chosen for this task. Additionally, a Test Manager (TM) block is used, whose functionality is described below.
• During the pre-run phase:
1. The input data described above is sent to the ML model, which outputs the expected simulation run time.
2. This prediction is stored in the TM, which then calculates the time limit by adding some extra buffer time to the expected run time.
• During the run phase:
1. A monitoring mechanism in the TM checks whether the time limit has been crossed.
2. If the time limit is crossed, a warning is displayed along with a button to stop the simulation.
This process is illustrated in Figure 2, and a minimal sketch of the flow follows the figure.

Figure 2: Working flow

The figure above is reproduced from the referenced paper.
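
Below is a minimal sketch of this pre-run/run-phase flow. The buffer factor, the './run_simulation.sh' launch script, and the direct termination are illustrative assumptions; the paper's TM instead offers a stop button when the warning fires.

    # Sketch of the Test Manager (TM) flow: compute a time limit from the
    # predicted run time (pre-run phase), then watch the wall clock while
    # the simulation runs (run phase).
    import subprocess
    import time

    BUFFER_FACTOR = 1.2  # illustrative extra buffer on top of the prediction

    def time_limit(expected_hours: float) -> float:
        """Pre-run phase: time limit = predicted run time + buffer."""
        return expected_hours * BUFFER_FACTOR

    def run_phase(test_name: str, limit_hours: float) -> None:
        """Run phase: launch the simulation and warn if the limit is crossed."""
        sim = subprocess.Popen(["./run_simulation.sh", test_name])
        deadline = time.time() + limit_hours * 3600
        while sim.poll() is None:  # simulation still running
            if time.time() > deadline:
                print(f"WARNING: {test_name} exceeded its time limit")
                sim.terminate()  # here we stop it directly for simplicity
                break
            time.sleep(60)  # poll once a minute

    # Example: suppose the model predicted 3.25 hours for this testcase.
    run_phase("project1-dram-access-test-FASTBOOT", time_limit(3.25))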

6 Results
According to the paper, about 2000 randomly sampled abnormal simulations, which can take 2 to 100 times longer than normal simulations, were examined, and the following results were observed:
• Normal simulations took 11 hours on average.

• Abnormal simulations took 57 hours on average.

• Simulations with the ML-based ASD took 20 hours on average.
This corresponds to a reduction of (57 - 20) / 57 ≈ 65% in simulation time with ASD compared to unmonitored abnormal simulations, which reduces effort and cost.

7 Advantages
• Early detection of abnormalities caused by wrong testcases or sequences.
• Debugging can be improved, as more time is saved.
• Test planning can be done more efficiently by considering the timing parameters.
• Regression tests can be scheduled more flexibly, since the expected run times are known.
• Easy detection of simulation hangs.

8 Conclusion
An innovative ML-based framework has been proposed to predict the run-time limits of tests and to detect simulation hangs at an early stage, which helps reduce verification time and cost.

9 Future Scope
Accuracy could be improved through further optimization or by using other, more efficient ML algorithms.

10 References
"Machine Learning Based Abnormal Simulation Detector in SoC Verification," in Proceedings of the 59th Design Automation Conference (DAC), 2022.
