
This article has been accepted for publication in IEEE Open Journal of Nanotechnology.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/OJNANO.2024.3414955

Pseudo-Random Number Generators for Stochastic Computing (SC): Design and Analysis
Pilin Junsangsri, Member IEEE and Fabrizio Lombardi, Fellow IEEE

Abstract—In most nanoscale stochastic computing designs, the Stochastic Number Generator (SNG) circuit is complex and
occupies a significant area because each copy of a stochastic variable requires its own dedicated (and independent) stochastic
number generator. This paper introduces a novel approach for pseudo-random number generators (RNGs) to be used in SNGs.
The proposed RNG design leverages the inherent randomness between each bit of data to generate larger sets of random numbers
by concatenating the modules of the customized linear feedback shift registers. To efficiently generate random data, a plane of
RNGs (comprising multiple modules) is introduced. A sliding window approach is employed for reading data in both the
horizontal and vertical directions; the number of sets of random numbers is then doubled by duplicating the datasets and inverting
the duplicates. Flip-flops are utilized to isolate the datasets and diminish correlation among them. This paper explores
variations in parameters to evaluate their impact on the performance of the proposed design. A comparative analysis between
the proposed design and existing SNG designs from the technical literature is presented. The results show that the proposed
nanoscale RNG design offers many advantages, such as small area per RNG, low power operation, large generated datasets, and
higher accuracy.

Index Terms—pseudo–Random Number Generator (RNG), Stochastic Computing (SC), Stochastic Number Generator (SNG)

Pilin Junsangsri is with the School of Engineering, Wentworth Institute of Technology, Boston, MA 02115 USA (email: [email protected]).
Fabrizio Lombardi is with the Electrical and Computer Engineering Department, Northeastern University, Boston, MA 02115 USA (email: [email protected]).

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

I. INTRODUCTION

Stochastic computing (SC) is a computational technique that executes logic operations on random bitstreams, making it particularly effective at nanoscales. In contrast to binary numbers, stochastic computing encodes data based on the probability of encountering 1's in bitstreams [1]. SC presents several advantages over traditional binary computation, including reduced hardware complexity, tolerance to errors, and low-power implementations of complex arithmetic functions, so it is very suitable for nanoscale environments. For instance, implementing a multiplier function in SC requires only a single AND gate, while a scaled adder function can be achieved using a multiplexer circuit. SC has been used in various applications such as image processing [2], digital filters [3][4], and neural networks [5][6][7][8].

Figure 1 illustrates a schematic diagram of SC circuits, which can be divided into two main components: the stochastic number generators (SNGs) and the stochastic computing (arithmetic) section. The SNG generates a stochastic bitstream for each input/variable. During each clock cycle, a new n-bit random number (R) is compared to an n-bit binary number (B). If the random number (R) is less than or equal to the binary number, the output is set to 1; otherwise, it is 0. The sequence of output signals is referred to as a stochastic bitstream; its value can be determined by calculating the probability of encountering 1's in the bitstream. For instance, a number x represented as 0.375 or 3/8 can be encoded by the sequence 0, 1, 0, 0, 1, 1, 0, 0, ..., where the frequency of 1's is equivalent to 3/8. This encoded number is referred to as a stochastic number [9].

Fig. 1. A SC multiplier circuit and a SC adder circuit

Once the stochastic bitstreams for all variables are generated, they are fed to the stochastic arithmetic parts. In this example, SC circuits of a multiplier and an adder are considered. For the multiplier circuit, let the length of the stochastic stream be 8 bits. Bitstream A produces the sequence 1, 0, 0, 0, 0, 1, 0, 0, with a value of 1/4 or 0.25. Bitstream B generates the sequence 0, 0, 1, 0, 1, 1, 0, 1, with a value of 1/2 or 0.5. When these bitstreams are input into the AND gate, the resulting sequence is 0, 0, 0, 0, 0, 1, 0, 0, with a value of 1/8 or 0.125. This value is equivalent to the product of bitstream A and bitstream B (0.5 * 0.25 = 0.125).

The stochastic adder is implemented by using only a two-input multiplexer (MUX) in which the probability of the selecting signal is set at 0.5 [5][10]. The output of a SC adder is determined by the following expression.

$$p_Y = \frac{p_C + p_D}{2} \tag{1}$$

where $p_C$ and $p_D$ are the stochastic values of bitstreams C and D respectively; $p_Y$ is the stochastic value of the bitstream Y as output of the adder circuit (multiplexer); however, in (1), the value of the output of the adder circuit is reduced by half, so the

output signal needs to be renormalized prior to the next step of the entire process [10].

As for the accuracy of a SC circuit, the individual bits in a bitstream sequence are random; therefore, a long bitstream is needed to increase the accuracy, because the average of the sequence converges to the desired stochastic value [9].

In addition to the length of a stochastic bitstream, the correlation of stochastic bitstreams is another crucial factor influencing the accuracy of a stochastic circuit. Two correlation metrics are usually considered in an SC circuit: autocorrelation and cross-correlation. Autocorrelation assesses the independence of bits within a single stochastic number, indicating how effectively isolation can decorrelate the stochastic number from copies of itself. Cross-correlation gauges the level of independence between different stochastic numbers [11]. Correlation values fall within the range of -1 to 1, where 0 means no correlation or the maximum degree of independence [11], and ±1 indicates maximum correlation. A correlation close to 1 suggests that an increase in a variable significantly influences an increase in another variable. Conversely, negative correlation signifies an inverse relationship between the two variables.

To mitigate the correlation between bitstreams and enhance the accuracy of an SC circuit [11], it is important to assign a dedicated and independent stochastic number generator (SNG) to each copy of a stochastic variable (or input). This approach ensures that the generation of stochastic numbers for different variables remains independent. However, hardware implementations for SNGs are typically complex; it has been reported that, in several stochastic designs, the SNGs occupy more than 80% of the total area of the stochastic circuit [12][13].

This paper is an extension of a previous manuscript [14] proposing nanoscale pseudo-Random Number Generators (PRNGs) that can be used in Stochastic Number Generators (SNGs). Because it is challenging to implement a true random number generator, in which a physical noise source (such as voltage or temperature) is converted to a digital value, a deterministic "pseudo-random" circuit that produces random-like number sequences is often used as the random number source [11]. The proposed pseudo-RNGs, referred to as RNGs for simplicity, are constructed from multiple modules of a customized linear feedback shift register (LFSR); each module comprises a set of LFSRs in which data from each bit of the LFSR is transmitted to the other LFSRs. In the proposed design, a plane of RNGs is accessed using a sliding window approach in both the horizontal and vertical directions. Due to the low correlation observed when reading data in both the forward and reverse directions [1], the generated random data are effectively duplicated by also reading them in reverse. Next, an isolation method is employed to reduce correlation between datasets. For isolation, flip-flops (or isolators) are inserted in a circuit to reduce the correlation between each bit of data [15].

In this paper, the parameters of the proposed design are initially varied to study the impact on the generated random data; then, the proposed design is compared with other RNGs found in the technical literature. The results of this paper show that the proposed design offers many advantages over other designs. When evaluating RNG planes of equal size, the proposed design stands out by generating datasets significantly larger than its counterparts. Each generated dataset has a very large size with low correlation, leading to accurate results. Furthermore, in terms of hardware, the circuit of the proposed design per dataset is small and consumes less power compared to other designs.

II. PROPOSED NANOSCALE DESIGN

This paper introduces a nanoscale design for pseudo-random number generators (RNGs) for SC. Rather than focusing on improving a single RNG, the proposed design exploits the randomness between each bit of data in the LFSRs to generate larger sets of random numbers. The architecture of the proposed RNG design comprises multiple modules of customized LFSRs. Each module consists of a group of LFSRs in which data is shifted between them. To illustrate the principles of the proposed design, an example is provided using a module consisting of three 4-bit customized LFSRs.

Fig. 2. Circuit diagram for the proposed sample module; a module consists of 3 rows and each row has 4-bit data.

Figure 2 depicts the circuit diagram of the proposed sample module, constructed with three 4-bit custom Linear Feedback Shift Registers (LFSRs). The proposed module has three input signals (In0, In1, and In2), one input clock signal, three output signals (Out0, Out1, Out2), and a total of twelve output data bits. The data bits are stored and processed by twelve D flip-flops (D-FFs), with four D-FFs allocated per row and three rows in a module.

In each module, every row generates 4 bits of random data, and each bit is shifted to the next bit in the other rows. This deliberate shifting of each bit between the Random Number Generators (RNGs) effectively diminishes the autocorrelation among the generated random numbers. To further reduce correlation between data in the current and the next modules, three XOR gates are employed, one per row. The output signals of these XOR gates serve as input data for the subsequent module. For output, data in the proposed design is read concurrently, and these signals are processed to generate random numbers.

Fig. 3. A row of proposed RNGs in which p modules are connected. Each module is made from three 4-bit customized LFSRs.

Figure 3 depicts the connection of the proposed RNG modules in a row of the RNG plane. The modules are connected in series. Output signals from each module serve as input data


for the next module, except for the last one, whose output is looped back to the first module.

A multiplexer selects the input for the first column modules from either the last module's data or the seed input. Prior to utilizing the proposed Linear Feedback Shift Register (LFSR) plane, seed data is loaded as the initial random numbers; this data can be pre-selected or can originate from a physical noise source (such as voltage or temperature) acting as a true random number generator. As with an ordinary LFSR, the condition that must be avoided when selecting the seed data is the one in which every bit of data in a row of the RNG plane is 0 (i.e., zero XOR zero is equal to zero); if every bit in a row of the RNG is zero, the proposed RNG plane always generates a zero number. Once seeded, the modules' data is read in parallel and processed to generate random numbers in normal operation.

N-bit random numbers are formed by concatenating adjacent n-bit data. At each clock cycle, the output data from the proposed modules is read. Due to the low correlation between each bit, random numbers are generated using a sliding window approach, reading data in both the horizontal and vertical directions. Figure 4 illustrates the use of the sliding window to enhance the efficiency of the proposed RNG plane, focusing on the horizontal direction. The random numbers are obtained by concatenating data starting from a specific read bit; note that there are 8 bits in this case.

Fig. 4. Sliding window to improve the utilization of the proposed RNG plane by considering only the horizontal direction.

Next, permutation is used to increase the number of generated random variables from the existing random numbers. Leveraging the low cross-correlation observed when reading data (random numbers) in both forward and reverse directions, additional sets of random numbers are generated by reading the data in the inverse direction. For the proposed LFSR plane with n rows and m columns, the number of datasets to generate k-bit random numbers is given by

$$N = (\text{horizontal} + \text{vertical}) \times 2$$
$$N = [n(m - k + 1) + m(n - k + 1)] \times 2$$
$$N = 4mn + 2(m + n)(1 - k) \tag{2}$$

where N represents the number of generated k-bit random number datasets using the proposed design. The term "horizontal" denotes the number of datasets that an RNG plane could generate from n rows and m columns when considering only the horizontal direction. The term "vertical" represents the number of datasets that an RNG plane of size n*m (n rows and m columns) could generate when considering only the vertical direction.

The sliding window method increases the number of generated random number datasets in an RNG plane; however, a drawback of this method is the potential for high cross-correlation between specific pairs of datasets, especially when two datasets share the most significant bit (MSB). To mitigate this, an isolation method is employed. In this approach, flip-flops (FFs) are inserted to delay the data in each dataset. When the two datasets of a pair with high cross-correlation are delayed by different amounts, the cross-correlation between these datasets is reduced.

In the proposed design, the number of flip-flops for isolating data in each dataset is pre-determined and fixed. To determine the number of flip-flops needed to introduce the delay to each dataset, a flip-flop is repeatedly added to the dataset with the highest cross-correlation until the cross-correlation between all datasets falls below a specified threshold. Once the datasets are delayed by this specific duration using flip-flops, the random numbers are derived by reading the data in each dataset.

Fig. 5. Organization of the proposed RNG plane

Figure 5 illustrates the organization of the proposed RNG plane. After the data in the RNG plane is read by using a sliding window, the datasets undergo the isolation method, which delays each dataset by a different amount. The data from each dataset after the isolation method serves as the random number dataset.

III. SIMULATION AND DISCUSSION

In this section, the simulation results of the proposed scheme are discussed. The default parameters of the proposed module are configured as follows: each module comprises 3 rows of customized LFSRs, with each row generating 4 bits of random data. The XOR inputs for the last module in each row are positioned at bits 1 and 3 of its module, whereas the XOR inputs of other modules are selected from bits 2 and 3 of their modules.

Fig. 6. The simulated proposed RNG plane
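As a sanity check on (2), the window count can also be obtained by direct enumeration. The sketch below is illustrative only: the function names are hypothetical, and it assumes a plane of 12 rows by 16 columns read as 8-bit numbers, which are the dimensions stated for the plane of Figure 6.

```python
def count_datasets(n_rows, m_cols, k):
    """Enumerate every k-bit sliding window in an n_rows x m_cols bit plane."""
    windows = []
    # Horizontal windows: each row, each start column where k bits still fit.
    for r in range(n_rows):
        for c in range(m_cols - k + 1):
            windows.append(("h", r, c))
    # Vertical windows: each column, each start row where k bits still fit.
    for c in range(m_cols):
        for r in range(n_rows - k + 1):
            windows.append(("v", r, c))
    horizontal = sum(1 for w in windows if w[0] == "h")
    vertical = len(windows) - horizontal
    # Reading every window in the reverse direction doubles the count.
    return horizontal, vertical, 2 * len(windows)


def closed_form(n_rows, m_cols, k):
    """Last line of (2): N = 4mn + 2(m + n)(1 - k)."""
    return 4 * m_cols * n_rows + 2 * (m_cols + n_rows) * (1 - k)


h, v, total = count_datasets(12, 16, 8)
print(h, v, total, closed_form(12, 16, 8))
```

For these dimensions the enumeration and the closed form agree on 376 datasets (108 horizontal and 80 vertical windows, doubled).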


In this simulation, an RNG plane is constructed with 16 proposed modules; there are 4 modules per row and 4 rows of modules in total. This RNG plane holds 192 bits of data in total, with 16 bits per row and 12 rows in the plane, as illustrated in Figure 6. The simulation focuses on 8-bit random numbers, generated by concatenating data from multiple modules aligned in the same direction. The simulation is implemented in Python, and the Genus Synthesis Solution tool is used at 32 nm CMOS technology to evaluate the delay, power dissipation, and area of the RNG circuits at the nanoscale.

By using the proposed method, 376 datasets are generated from 192 bits of data, with each dataset having a length exceeding 120 million random numbers with no repetition. The average autocorrelation and cross-correlation of every dataset are $2.766 \times 10^{-6}$ and $4.67 \times 10^{-4}$ respectively, which are very small values.

When assessing delay, power dissipation, and area, the following configuration is considered:
• A single module composed of three 4-bit RNGs (Figure 2).
• An RNG plane comprising 16 proposed modules, with 4 modules per row and 4 rows in an RNG plane (Figure 6).
• For the isolation module, FFs are added to each dataset until the cross-correlation between datasets falls below 0.1. To achieve this goal, a total of 376 FFs are needed and the highest number of FFs to delay a dataset is 35, i.e., after the first 35 clock cycles, the proposed design operates normally.

Table 1 presents a summary of the delay, power dissipation, and area of the plane of the proposed RNG array.

TABLE I.
DELAY, POWER DISSIPATION, AND AREA OF THE PLANE OF THE PROPOSED RNG ARRAY WITH 16 PROPOSED MODULES IN FIGURE 6

| Module | Delay (ps) | Power Dissipation (µW) | Area (µm²) |
|---|---|---|---|
| Single Module | 109 | 122.895 | 105.216 |
| RNG plane | 207 | 1,988.82 | 1,711.66 |
| Isolation | 109 | 23,907.09 | 19,876.11 |
| Total | 207 | 25,895.91 | 21,587.77 |

A. Proposed customized LFSR module.

This section investigates the impact of parameters on the proposed LFSR module, with respect to the size of the generated random numbers. To explicitly show the effect of each parameter, this section considers only one row of the RNG plane; therefore, each row of the RNG plane consists of two interconnected proposed modules where the outputs of the second module are fed back to the input of the first module. Each module consists of 3 rows of LFSRs, with each row generating 4 bits of random data. For the last module in each row, XOR inputs are positioned at bits 1 and 3, while for other modules, the XOR inputs are selected at bits 2 and 3. In this section, the output generates a set of 8-bit random numbers, with data being read only from the first row (row 0) of the RNG plane.

1) Number of rows (RNGs) in each module

When varying the number of rows in each LFSR module, the size of each dataset changes. Figure 7 shows the size of the generated random numbers in each dataset when varying the number of rows in each module of the proposed design. As with a linear feedback shift register, an increase in the number of rows in a module does not always generate a larger set of random numbers. The results in Figure 7 show that the size of the generated random numbers in a dataset by using the proposed design is low when the number of rows (RNGs) in each module is a power of 2 or an even number. For example, when there are 32 rows of RNGs per module, the size of the generated 8-bit random numbers in a dataset is only 6943. However, when the number of rows in each module is odd or a prime number (for example, 5, 7, or 9), the number of generated 8-bit random numbers per dataset is increased to more than 120,000,000. For ease of presentation and as an example, in this paper, each module consists of only 3 RNGs (rows).

Fig. 7. Size of generated 8-bit random numbers in each dataset. Each module is simulated 120,000,000 times.

2) Number of bits in each row of the proposed module

The number of bits in each row of the proposed module also impacts the size of the generated random numbers in each dataset. As shown in Table 2, by increasing the number of bits in each module, the size of the generated random numbers in a dataset increases too. For example, when each row in a module consists of 6 bits of data and only 2 modules are connected in series, the proposed RNG design generates more than 120 million random numbers in a dataset (by reading data only from bits 0 to 7 in the first row).

TABLE II.
LENGTH OF GENERATED 8-BIT RANDOM NUMBERS FROM EACH RNG; EACH RNG IS MADE FROM TWO CONSECUTIVE PROPOSED MODULES

| Number of bits in each row in the proposed module | Size of dataset of generated 8-bit random numbers |
|---|---|
| 4 | 1,185,036 |
| 5 | 44,434,004 |
| 6 | More than 120 × 10^6 |

3) Number of modules in each row

Next, the number of modules connected in series in each row of the RNG plane is considered. In this simulation, a default module (4 bits per row and 3 rows of RNGs, as shown in Figure 2) is considered. By varying the number of modules in each row, the size of the first 8-bit random number dataset is found by reading data horizontally (so only from bits 0 to 7 of an RNG plane).

In Table 3, an increase in the number of proposed modules in each row does not guarantee an increase in the size of the generated random numbers in a dataset; however, the size of the generated random number dataset is very large when the total number of bits in each row is a power of 2. Since each module has 4 bits of data per row, when 2, 4, and 8 modules are connected, the total numbers of bits in each row of the RNG plane are 8, 16, and 32 respectively, i.e. the number of generated 8-bit random numbers in a dataset is therefore large.
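The sensitivity of sequence length to structural choices is familiar from ordinary LFSRs, to which the text compares the proposed module. The sketch below is an illustration only: it models a textbook Fibonacci LFSR, not the customized module of Figure 2, and the function name and tap convention (bit indices counted from 0 at the input end) are assumptions made for this example.

```python
def lfsr_period(n_bits, taps, seed=1):
    """Count the steps a Fibonacci LFSR takes to return to its seed state.

    taps are the bit indices XORed together to form the bit shifted in.
    """
    if (n_bits - 1) not in taps:
        # Without the top bit in the feedback, the update is not invertible
        # and the register may never return to its seed.
        raise ValueError("taps must include the most significant bit")
    mask = (1 << n_bits) - 1
    start = seed & mask
    if start == 0:
        raise ValueError("the all-zero state never changes")
    state, steps = start, 0
    while True:
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask
        steps += 1
        if state == start:
            return steps


full_length = lfsr_period(4, (3, 2))   # visits every non-zero 4-bit state
short_length = lfsr_period(4, (3, 1))  # same register size, different taps
print(full_length, short_length)
```

With the same 4-bit register, moving a single tap shortens the cycle from 15 states to 6, which is the kind of effect that motivates sweeping the module parameters rather than assuming that a larger structure yields a longer sequence.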


TABLE III.
LENGTH OF GENERATED 8-BIT RANDOM NUMBERS PER DATASET WHEN VARYING THE NUMBER OF MODULES IN EACH ROW

| Number of modules in each row | Size of dataset of generated 8-bit random numbers |
|---|---|
| 2 | 1,185,036 |
| 3 | 83 |
| 4 | More than 120 million |
| 5 | 59,593,841 |
| 6 | 1,048,574 |
| 7 | 664,019 |
| 8 | More than 120 million |
| 9 | 196,601 |

4) Position of inputs for the XOR gate

Next, the impact of the position of the inputs for the XOR gates on the size of the generated random number dataset is considered. The positions of the XOR inputs are given as an index of each module (where the first index starts at zero). Each simulation is performed 10,000,000 times; in this simulation, each row of the RNG plane consists of 4 modules in series.

TABLE IV.
NUMBER OF GENERATED 8-BIT RANDOM NUMBERS IN A DATASET WHEN READING DATA IN ROW 1 FROM BIT 0 TO BIT 7; EACH ROW CONSISTS OF 4 MODULES IN SERIES. EACH CASE IS SIMULATED 10,000,000 TIMES

Columns: positions of XOR inputs of the first module. Rows: positions of XOR inputs of the other modules.

| Other modules \ First module | 01 | 02 | 03 | 12 | 13 | 23 |
|---|---|---|---|---|---|---|
| 01 | 1 × 10^7 | 1 × 10^7 | 35,804 | 1 × 10^7 | 1 × 10^7 | 15,809 |
| 02 | 42,704 | 419 | 1 × 10^7 | 1 × 10^7 | 1 × 10^7 | 4,784,054 |
| 03 | 1,392,554 | 1 × 10^7 | 60 | 1 × 10^7 | 1 × 10^7 | 1 × 10^7 |
| 12 | 298,934 | 1 × 10^7 | 1 × 10^7 | 419 | 1 × 10^7 | 7,340,024 |
| 13 | 4,094 | 1 × 10^7 | 1 × 10^7 | 1 × 10^7 | 120 | 1 × 10^7 |
| 23 | 332,009 | 69,614 | 1 × 10^7 | 1 × 10^7 | 1 × 10^7 | 1,259 |

The results in Table 4 show that there are several cases in which the proposed RNG plane generates more than 10 million 8-bit random numbers per dataset; hereafter in this paper, the positions of the XOR inputs for the first module are selected at bits 1 and 3 from the last module, while the XOR inputs of the other modules are selected from bits 2 and 3 of the previous module.

B. Plane of the proposed RNGs

Previously, each random number was generated by reading only the first 8 bits in each row of the RNG plane; a method is proposed next to improve the utilization of the proposed RNG plane by using a so-called sliding window. As presented in Figures 4 and 5, data in an RNG plane can be reused to generate more datasets of random numbers by reading in both directions (horizontal and vertical) using the sliding window and permuting the data by reading it in the reverse direction.

For simulation, the RNG plane of Figure 6 consists of 16 proposed modules; each row consists of 4 proposed modules connected horizontally. An RNG plane with 4 rows of modules is used as an example. This RNG plane has 192 bits of data in total, with 16 bits per row and 12 rows in the RNG plane. By reading data in both directions with the sliding window and permuting by reversing it, 376 sets of 8-bit random numbers are generated from the proposed RNG plane. The following features are attained.

• For the horizontal direction, 9 sets of random numbers are generated in each row. So, 108 sets of random numbers are generated horizontally.
• For the vertical direction, 5 sets of random numbers are generated per column. So, 80 sets of random numbers are generated vertically.

The total number of generated sets is 188; moreover, when using permutation, the number of sets of generated 8-bit random numbers is doubled, i.e., 376 sets; this number can also be found by using (2).

The proposed RNG plane is simulated 100,000 times to generate 376 datasets of 8-bit random numbers (each set has 100,000 random numbers). The size of the generated (8-bit) random numbers for each dataset and the correlation (between a dataset and other datasets in the same RNG plane) are assessed next.

For the size of the dataset, every set of generated random numbers (376 sets) generates more than 100,000 random numbers at a very low autocorrelation; the average autocorrelation of every generated set is only $-2.766 \times 10^{-6}$. For cross-correlation, there are 376 sets of generated 8-bit random numbers; therefore, 70,500 pairs must be considered.

Fig. 8. Scatter plot of autocorrelation of each 8-bit random number dataset by using the proposed method when reading data in both directions (horizontal and vertical), using sliding windows and permuting its output.

Fig. 9. Scatter plot of cross-correlation of pairs of 8-bit random number datasets by using the proposed method when reading data in both directions (horizontal and vertical), using sliding windows and permuting its output.

Figures 8 and 9 show the scatter plots of the autocorrelation of the datasets and the cross-correlation between pairs of datasets generated by using the proposed methods. As shown in Figures 8 and 9, the autocorrelations of the datasets are close to each other; the cross-correlation between pairs of generated datasets can be categorized into multiple sub-categories. In this paper, cross-correlations between each pair are separated as shown in Table 5.
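The clustering of cross-correlation values into a few distinct levels is what an idealized model predicts for overlapping windows. The sketch below is a back-of-the-envelope check, not the authors' simulation: it assumes every bit of the plane is an independent fair coin flip, so the correlation between two k-bit windows depends only on which cells they share and on the binary weights of those cells in each window. The function names and the (row, column, direction) convention are assumptions made for this example.

```python
def window_cells(row, col, direction, k=8):
    """Cells covered by a k-bit window, MSB first, starting at (row, col)."""
    steps = {"left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0)}
    dr, dc = steps[direction]
    return [(row + i * dr, col + i * dc) for i in range(k)]


def ideal_correlation(win_a, win_b):
    """Pearson correlation of two equal-length windows under the iid-bit model.

    Each shared cell contributes (weight in A) * (weight in B) / 4 to the
    covariance, and each window has variance sum(4**i) / 4, so the factor
    of 1/4 cancels in the ratio.
    """
    k = len(win_a)
    weight_a = {cell: 2 ** (k - 1 - i) for i, cell in enumerate(win_a)}
    weight_b = {cell: 2 ** (k - 1 - i) for i, cell in enumerate(win_b)}
    shared = sum(weight_a[c] * weight_b[c] for c in weight_a if c in weight_b)
    return shared / sum(4 ** i for i in range(k))


# Two windows that share only their MSB cell.
same_msb = ideal_correlation(window_cells(0, 7, "left"),
                             window_cells(0, 7, "right"))
# Neighbouring windows read in the same direction (seven shared cells).
adjacent = ideal_correlation(window_cells(9, 7, "left"),
                             window_cells(9, 8, "left"))
# Windows read in opposite directions that overlap in four cells.
partial = ideal_correlation(window_cells(0, 7, "left"),
                            window_cells(0, 4, "right"))
print(round(same_msb, 4), round(adjacent, 4), round(partial, 4))
```

Under this model the three cases come out near 0.75, 0.5, and 0.375, close to the level averages of 0.7501, 0.5004, and 0.3751 listed in Table 5. The simulated plane is not made of ideal independent bits, so the agreement should be read as an indication of where the levels come from rather than as a derivation.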


TABLE V. flop after reading the data. This method can significantly reduce
CROSS-CORRELATION BETWEEN SETS OF GENERATED 8-BIT RANDOM the cross-correlation; for example, cross-correlation between a
NUMBERS pair of datasets with a high cross-correlation (0.75) is
Range of significantly reduced to 0.004 when shifting data in one of the
Level of Average cross-
cross- Number Percentage
cross-
correlation
correlation in
of pair (%) datasets by a clock cycle.
correlation this level Next, the number of FFs to isolate datasets in the proposed
[low, high]
Very high (0.7, 1] 0.7501 240 0.34 RNG plane is established; in the simulation, a FF is added to a
High (0.55, 0.7] 0.5627 48 0.07 dataset that has the highest total cross-correlation. This
Medium high (0.45, 0.55] 0.5004 320 0.45
simulation is run till the cross-correlation of every dataset is less
Medium (0.3, 0.45] 0.3751 436 0.62
Medium low (0.2, 0.3] 0.2457 368 0.52 than the threshold value (set to 0.1 in this case) and the number
Low (0.1, 0.2] 0.1651 892 1.27 of FFs for each dataset are found.
Very low (0.05, 0.1] 0.0819 1208 1.71 Figure 11 shows the scatter plot of the cross-correlation
Extremely low [0, 0.05] 0.0017 66988 95.02 between datasets after using the isolation method in which their
In Table 5, 95.02% of the cross-correlations are extremely low (less than 0.05); the average cross-correlation is only 0.0017, which makes the datasets well suited to stochastic computing.
Next, the relationship between each dataset is considered to find pairs of datasets that have high cross-correlation. A very high cross-correlation between datasets (i.e., a cross-correlation greater than 0.7) occurs when both datasets share their most significant bit (MSB) or the second most significant bit. Figure 10 shows a sample of datasets that have high cross-correlation: three datasets (dataset 0, dataset 1, and dataset 2) share the MSB at row 0 column 7. Note that dataset 0 is read from row 0 column 7 to the left, dataset 1 is read from row 0 column 7 to the right, and dataset 2 is read from row 0 column 7 to the bottom. The cross-correlation between these datasets is very high, i.e., approximately 0.75.

Fig. 10. Sample datasets in the proposed RNG plane that have very high cross-correlation.

A high cross-correlation, of approximately 0.7, occurs between datasets that are located next to each other and read in the same direction. For example, in Figure 10, dataset 3 (whose MSB is at row 9 column 7) and dataset 4 (whose MSB is at row 9 column 8) are both read to the left; the cross-correlation between these two datasets is approximately 0.5, because the MSB of dataset 3 is the second most significant bit of dataset 4, so the value in dataset 3 impacts the value in dataset 4.
The cross-correlation between datasets is reduced when the bits shared between the sets have less significance; e.g., the cross-correlation between a dataset read from row 0 column 7 to the left and a dataset read from row 0 column 4 to the right is only 0.37.
1) Isolation method
Due to the high cross-correlation between some pairs of generated datasets, an isolation method is used to shift the data in a dataset by a clock cycle. This is accomplished by adding a flip- [...] cross-correlation threshold is set to 0.1. After using the isolation method, the average cross-correlation between these pairs of datasets is only 0.000467 and a total of 376 FFs are needed. The largest number of FFs to be inserted in a dataset is 35 (corresponding to the delay in clock cycles), i.e., after the first 35 clock cycles, the proposed design operates normally. This occurs at dataset 65, in which data from row 7 column 9 is read in the left direction. 33 datasets do not need an FF to delay a value, i.e., data can be read directly from such datasets. The number of inserted FFs tends to be high for a dataset whose MSB is located in the middle of the RNG plane.

Fig. 11. Scatter plot of cross-correlation of every pair of datasets of the proposed design with isolation; FFs are inserted before datasets.

C. Accuracy
Next, the random numbers generated by the proposed design are used to generate stochastic bitstreams for each input. These bitstreams are fed into two stochastic circuits: a multiplier and an adder. The value of the result is compared with the expected value. The Mean Square Error (MSE) is used as a metric to assess the difference between the expected results and the results computed using SC. The values of each input are selected randomly and sets of 1,000,000 random numbers are generated using the proposed design.

TABLE VI.
MEAN SQUARE ERROR IN SC MULTIPLIER WHEN STOCHASTIC BITSTREAMS ARE GENERATED USING THE PROPOSED DESIGN

              Number of bits in generated RNGs
Stream size   8 bits        16 bits       32 bits       64 bits
8             1.81*10^-2    1.82*10^-2    1.82*10^-2    1.81*10^-2
16            9.22*10^-3    9.20*10^-3    9.19*10^-3    9.12*10^-3
32            4.58*10^-3    4.73*10^-3    4.58*10^-3    4.64*10^-3
64            2.35*10^-3    2.37*10^-3    2.29*10^-3    2.33*10^-3
128           1.18*10^-3    1.15*10^-3    1.17*10^-3    1.15*10^-3
256           5.74*10^-4    5.85*10^-4    5.71*10^-4    5.82*10^-4
512           3.17*10^-4    2.93*10^-4    2.85*10^-4    3.07*10^-4
1024          1.74*10^-4    1.48*10^-4    1.21*10^-4    1.53*10^-4
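The cross-correlation values quoted for datasets that share bits (approximately 0.75, 0.5 and 0.37) can be checked with a short calculation. The Python sketch below is not part of the proposed design. It assumes an idealized plane in which every cell holds an independent, unbiased bit, it ignores any wrap-around at the plane edges, and the helper names `window` and `correlation` are illustrative only. Under these assumptions each bit has variance 1/4, so the correlation of two 8-bit datasets depends only on the binary weights of the cells they share.

```python
from math import sqrt

def window(start, step, length=8):
    """Cells of one dataset, MSB first; start=(row, col), step=(d_row, d_col)."""
    (row, col), (d_row, d_col) = start, step
    return [(row + k * d_row, col + k * d_col) for k in range(length)]

def correlation(cells_a, cells_b):
    """Pearson correlation of two datasets built from i.i.d. fair bits (assumption)."""
    # Binary weight of each cell inside its own dataset (first cell is the MSB).
    weight_a = {cell: 2 ** (len(cells_a) - 1 - k) for k, cell in enumerate(cells_a)}
    weight_b = {cell: 2 ** (len(cells_b) - 1 - k) for k, cell in enumerate(cells_b)}
    # Only shared cells contribute to the covariance; the common factor 1/4 cancels.
    shared = sum(w * weight_b[cell] for cell, w in weight_a.items() if cell in weight_b)
    norm_a = sum(w * w for w in weight_a.values())
    norm_b = sum(w * w for w in weight_b.values())
    return shared / sqrt(norm_a * norm_b)

LEFT, RIGHT, DOWN = (0, -1), (0, 1), (1, 0)

# Datasets 0 and 1: same MSB at row 0 column 7, read in opposite directions.
same_msb = correlation(window((0, 7), LEFT), window((0, 7), RIGHT))
# Datasets 0 and 2: same MSB, one read horizontally and one read downwards.
same_msb_vertical = correlation(window((0, 7), LEFT), window((0, 7), DOWN))
# Datasets 3 and 4: neighbouring MSBs on row 9, both read to the left.
adjacent = correlation(window((9, 7), LEFT), window((9, 8), LEFT))
# Row 0: column 7 read to the left versus column 4 read to the right.
opposite = correlation(window((0, 7), LEFT), window((0, 4), RIGHT))

print(round(same_msb, 3), round(adjacent, 3), round(opposite, 3))
```

Under the stated assumptions the three cases come out close to the values reported in the text, which suggests that the reported correlations are mainly a consequence of which bit positions are shared rather than of the particular LFSR modules.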

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
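As a rough software cross-check of how the mean square error of the SC multiplier and adder depends on the stream size, the following Python sketch simulates both circuits. It is only an illustration under stated assumptions: Python's built-in generator stands in for the proposed RNG plane (the actual design uses customized LFSR modules), the bitstreams come from a simple comparator, the multiplier is modelled as an AND gate and the scaled adder as a 2-to-1 multiplexer with a select stream of value 0.5. The function names and the number of trials are hypothetical choices.

```python
import random

def bitstream(value, length, rng, bits=8):
    """Comparator-style SNG: emit 1 when a random n-bit number is below the input."""
    return [1 if rng.getrandbits(bits) < value else 0 for _ in range(length)]

def sc_mse(length, trials=2000, bits=8, seed=1):
    """Return (multiplier MSE, adder MSE) for a given stream length."""
    rng = random.Random(seed)  # stand-in random source, NOT the proposed RNG
    scale = 1 << bits
    mul_err = add_err = 0.0
    for _ in range(trials):
        a, b = rng.randrange(scale), rng.randrange(scale)
        stream_a = bitstream(a, length, rng, bits)
        stream_b = bitstream(b, length, rng, bits)
        select = bitstream(scale // 2, length, rng, bits)  # value 0.5
        # SC multiplier: bitwise AND of the two input streams.
        mul = sum(x & y for x, y in zip(stream_a, stream_b)) / length
        # SC scaled adder: multiplexer choosing between the inputs.
        add = sum(x if s else y for x, y, s in zip(stream_a, stream_b, select)) / length
        pa, pb = a / scale, b / scale
        mul_err += (mul - pa * pb) ** 2
        add_err += (add - (pa + pb) / 2) ** 2
    return mul_err / trials, add_err / trials

for n in (8, 64, 512):
    print(n, sc_mse(n))
```

With an ideal random source the error of each output bit count is binomial, so both errors shrink roughly in proportion to the inverse of the stream length, which is the same trend as in the tables for the proposed design.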

TABLE VII.
MEAN SQUARE ERROR IN SC ADDER WHEN STOCHASTIC BITSTREAMS ARE GENERATED USING THE PROPOSED DESIGN

              Number of bits in generated RNGs
Stream size   8 bits        16 bits       32 bits       64 bits
8             2.66*10^-2    2.67*10^-2    2.65*10^-2    2.66*10^-2
16            1.33*10^-2    1.33*10^-2    1.34*10^-2    1.33*10^-2
32            6.77*10^-3    6.68*10^-3    6.67*10^-3    6.66*10^-3
64            3.32*10^-3    3.408*10^-3   3.41*10^-3    3.36*10^-3
128           1.69*10^-3    1.68*10^-3    1.68*10^-3    1.64*10^-3
256           8.60*10^-4    8.85*10^-4    8.40*10^-4    8.59*10^-4
512           4.32*10^-4    4.03*10^-4    4.05*10^-4    4.09*10^-4
1024          2.19*10^-4    2.19*10^-4    1.96*10^-4    2.18*10^-4

Tables 6 and 7 present the Mean Square Errors (MSEs) of the SC multiplier and adder when the proposed design is used for random number generation. The error in the SC circuit diminishes when the bitstream length increases. However, the number of bits in the generated random numbers has a low impact on the accuracy of these stochastic circuits.

IV. COMPARISON
This section presents a comparison between the proposed design and other random number generators found in the technical literature, e.g., the Linear Feedback Shift Register (LFSR), SBoNG [11], which generates low-correlated SNGs from an LFSR by using the S-box circuit, and the permuted LFSR [1], which generates more random numbers by permuting data from a single LFSR with no additional circuit. A 12x16 array of RNGs is considered; it consists of 192 bits of RNGs arranged with 16 bits per row and 12 rows in an RNG plane. For the proposed design, an RNG is made of 16 default proposed modules as shown in Figure 6. Various aspects are evaluated, including the number of generated random numbers, delay, power dissipation, area, correlations, and accuracy of the stochastic circuit.

TABLE VIII.
COMPARISON BETWEEN THE PROPOSED RNG DESIGN AND OTHER DESIGNS WHEN IT IS SIMULATED FOR 120*10^6 TIMES. MEAN SQUARE ERROR IS CONSIDERED AT 64 BITS FOR STREAM SIZE AND 8 BIT RNG

                                  Array of random number generators (RNGs)
Metric                            Proposed design   LFSR          SBoNG [11]    Permuted LFSR [1]
Number of datasets                376               24            192           192
Size of dataset                   120*10^6          65,534        65,534        65,534
Average autocorrelation           2.766*10^-6       1.532*10^-4   1.561*10^-5   1.555*10^-5
Average cross-correlation         4.67*10^-4        5.685*10^-5   9.326*10^-2   1.317*10^-1
Delay/RNG (ps)                    207               159           159           159
Power/RNG (µW)                    68.872            130.02        96.77         71.887
Area/RNG (µm^2)                   57.41             111.95        86.68         60.25
MSE of SC multiplier (*10^-4)     23.457            39.532        110.67        56.443
MSE of SC adder (*10^-4)          33.204            45.288        93.791        61.879

A. Number of generated random numbers per dataset
As shown in Table 8, the proposed design generates a very large number of random numbers per dataset, more than 120*10^6, while the other designs generate only 65,534 random data. The SBoNG design [11] exhibits variability in the number of random data per dataset. Specific conditions must be met for SBoNG to generate large amounts of random data; however, these conditions cannot always be met. For the permuted LFSR [1], the size of the generated random values in each dataset is the same as for the LFSR; in this case, a 16-bit LFSR is used in each row, and each 8-bit dataset generates a total of 65,534 data.

B. Number of datasets
For an RNG plane with a size of 12*16, the proposed design leverages the sliding window in both the vertical and horizontal directions, along with data reversal, to generate 376 sets of RNGs. The other designs generate a smaller number of RNGs than the proposed design. For the permuted LFSR [1], the number of generated datasets is limited to the number of bits per row. This limitation arises from the increase in cross-correlation between datasets, e.g., when datasets share their most significant bit (MSB), their cross-correlation is increased by 0.5. Hence, the permuted LFSR method [1] generates up to 16 datasets per row (with 16 bits in each row). The total number of datasets generated by [1] is limited to 192. For SBoNG [11], each row also generates 16 datasets by rotating data to the right to increase the number of datasets. The total number of generated datasets for SBoNG [11] is limited to 192 datasets for 16 bits per row of data. As for the LFSR, it generates 2 datasets of 8-bit random numbers per row, and with 12 rows in this array, the total number of datasets that the LFSR generates is 24.

C. Correlation
In the simulation results presented in Table 8, the size of the RNG dataset is constrained by either the maximum number of generated random numbers in a dataset or 100,000, whichever is lower. As shown in Table 8, the proposed design generates datasets with very low autocorrelation. The cross-correlation of the proposed design is higher than for the LFSR, due to the sharing of bits when reading data through the sliding windows. However, after using the isolation method, the cross-correlation of the proposed design is significantly reduced, and it is better than the cross-correlation of the SBoNG method [11] and the permuted LFSR [1]. The cross-correlations of the permuted LFSR [1] are high because n datasets are generated from an LFSR with n bits in each row. Sharing the second MSB increases the cross-correlation between datasets.

D. Delay, Area and Power
In Table 8, the delay, power dissipation, and area are normalized per the number of generated datasets. The results show that both power dissipation and area per RNG of the proposed design are the lowest, followed by the permuted LFSR [1], SBoNG [11], and LFSR respectively. The main advantage of the proposed design is its ability to reuse data in both directions of an RNG plane to generate a larger number of datasets. Both power dissipation and area per RNG of the proposed design are very low.
For the LFSR, the values of the power dissipation and area per RNG are the largest (worst) because the number of generated datasets is small. SBoNG [11] generates more datasets than the LFSR; however, it needs additional combinational circuits in addition to the LFSR. The power dissipation and area of SBoNG [11] are therefore high. The permuted LFSR [1] reuses data in the LFSR; nevertheless, the size of the permuted LFSR


is limited by the number of bits in each row. Area and power dissipation per RNG of the permuted LFSR [1] are lower than for SBoNG [11]; however, their values are still larger than for the proposed design.
The only disadvantage of the proposed design is the delay. In addition to the initial setup of the proposed RNG plane, which takes the first 35 clock cycles, the delay of the proposed design is high, because the complexity of the proposed RNG plane is higher than that of the other designs, whose complexity is similar to that of an LFSR.
Next, the delay and power dissipation of stochastic applications (e.g., the SC multiplier and the SC adder) are evaluated when using various RNGs. Stochastic arithmetic is simple, e.g., a multiplier circuit can be implemented by using an AND gate and a scaled adder circuit can be implemented by using a multiplexer circuit; therefore, the power dissipation of these circuits is lower than for the stochastic number generators (SNGs). The power dissipation of an SC multiplier circuit is only 7.2µW, while the power dissipation of an SC adder circuit is 9.956µW. Hence, the power dissipation of the SC arithmetic hardware has a smaller impact on the SC system, because the power dissipation of the SNGs dominates.
For the delay of SC applications when using various RNG designs, the circuits of Figure 1 are used; in these circuits, 2 comparator circuits are needed to generate the stochastic bitstreams; these values are then processed by the stochastic arithmetic hardware.

TABLE IX.
DELAY AND POWER DISSIPATION OF STOCHASTIC MULTIPLIER AND STOCHASTIC ADDER CIRCUITS

                                          Array of random number generators (RNGs)
Metric                    Component       Proposed design   LFSR      SBoNG [11]   Permuted LFSR [1]
Delay (ps)                RNG             207               159       159          159
                          SC multiplier   379 (common to all designs)
                          SC adder        434 (common to all designs)
Power (µW)                RNG             68.872            130.02    96.77        71.887
                          SC multiplier   144.944           267.24    200.74       150.974
                          SC adder        147.7             269.996   203.50       153.73
Power Delay Product       RNG             14.26             20.67     15.39        11.43
(PDP) (fs*W)              SC multiplier   54.93             101.28    76.08        57.22
                          SC adder        64.10             117.18    88.32        66.72

Table 9 presents the simulation results; in this simulation, the Genus synthesis tools at the 32nm technology node are used at the nominal supply voltage of 1.05V. The simulation results show that for the RNG alone, the proposed scheme has a larger delay compared to the other designs. However, the delay of the SC multiplier is 379ps while the delay of the SC adder circuit is 434ps; hence, the delay of the SC applications dominates the delay of the entire SC circuit. As shown in Table 9, the worst-case delay does not come from the SNGs, so changing the RNG design has no impact on the delay of the SC applications. Simulation also shows that the total power is significantly reduced when using the proposed scheme for both applications considered. As for the power-delay product (PDP) of the stochastic number generators (SNGs), its value for the proposed design is better than for the LFSR and SBoNG [11] but slightly worse than for the permuted LFSR [1] (due to its large delay). However, when the SNGs are used in the two considered SC applications, the delay of the stochastic applications dominates the delay of the SNGs; so for the SC multiplier and SC adder, the PDP of the proposed RNG design is better than for all other RNG based designs, hence showing its validity and effectiveness for such widely used applications.

E. Accuracy
Next, the accuracy of stochastic circuits when using various random number generators is compared. Different types of random number generators are used to generate stochastic bitstreams as inputs for the SC multiplier and adder. The results of the SC multiplier and SC adder circuits are compared with the expected results; the Mean Square Error (MSE) is used to evaluate the errors from different RNGs. As in a prior section, the value of each input is selected randomly, and it is held constant for a number of cycles equal to the length of the bitstream before it is updated.
To increase the length, the size of the random numbers in each dataset is increased by setting the number of bits in each row of an RNG plane to 32; by considering only the first 1,000,000 random numbers in each dataset, the results are shown in Figures 12 and 13.

Fig. 12. Mean Square Error (MSE) vs size of bitstream in multiplier circuit; each bitstream is generated from 8 bits random number by different designs.

Fig. 13. Mean Square Error (MSE) vs size of bitstream in adder circuit; each bitstream is generated from 8 bits random number by different designs.

Figures 12 and 13 present the mean square errors of the SC multiplier and adder circuits when bitstreams are generated from various RNG designs. As expected, an increase of the bitstream size enhances accuracy, i.e., it reduces the errors. Compared to the other RNG designs, the proposed RNG design has the lowest error, showing that it is very accurate.

F. Product Metric
To evaluate each RNG design, a product metric (DPACP) is given in (3)
DPACP = delay * power * area * correlation    (3)
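Equation (3) can be evaluated directly from the per-RNG figures of merit. The Python sketch below copies the delay, power, area and correlation values from Table VIII and takes the correlation term as the product of the average autocorrelation and the average cross-correlation. Dividing each quantity by the largest value among the four designs is an assumption about how the normalization is done; the function name `dpacp` and the dictionary layout are illustrative only.

```python
# Per-RNG figures copied from Table VIII:
# (delay in ps, power in uW, area in um^2, average autocorrelation, average cross-correlation)
designs = {
    "Proposed":      (207, 68.872, 57.41, 2.766e-6, 4.67e-4),
    "LFSR":          (159, 130.02, 111.95, 1.532e-4, 5.685e-5),
    "SBoNG":         (159, 96.77, 86.68, 1.561e-5, 9.326e-2),
    "Permuted LFSR": (159, 71.887, 60.25, 1.555e-5, 1.317e-1),
}

def dpacp(table):
    """Return the normalized delay*power*area*correlation product per design."""
    # Fold the two correlations into a single product term.
    rows = {name: (d, p, a, auto * cross)
            for name, (d, p, a, auto, cross) in table.items()}
    # Assumed normalization: divide every column by its largest entry.
    maxima = [max(row[i] for row in rows.values()) for i in range(4)]
    result = {}
    for name, row in rows.items():
        product = 1.0
        for value, peak in zip(row, maxima):
            product *= value / peak
        result[name] = product
    return result

scores = dpacp(designs)
for name in sorted(scores, key=scores.get):
    print(f"{name:14s} {scores[name]:.3g}")
```

A lower score is better. Because every column is divided by its own maximum, the ranking does not depend on the units used for delay, power or area.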


In (3), power is the average power dissipation per dataset, and area is the area per dataset. Correlation is the product of the cross-correlation and the autocorrelation, i.e.
correlation = cross-correlation * autocorrelation
These parameters are normalized to equalize their significance, such that a good RNG design has a low DPACP as a combined metric for these 4 figures of merit.

TABLE X.
POWER DISSIPATION, AND AREA PER RNG

Design              Delay   Power   Area   Product Correlation   DPACP
Proposed            1       0.53    0.51   6.29*10^-4            1.71*10^-4
LFSR                0.77    1       1      4.23*10^-3            3.26*10^-3
SBoNG [11]          0.77    0.74    0.77   0.71                  0.315
Permuted LFSR [1]   0.77    0.55    0.54   1                     0.229

Table 10 shows the DPACP of each considered design; the DPACP of the proposed RNG design is the lowest, followed by the LFSR, the permuted LFSR [1], and SBoNG [11] respectively. The proposed design is better than these designs in terms of power dissipation, area, and correlation; however, its delay is higher. While both SBoNG [11] and the permuted LFSR [1] are good in terms of power dissipation and area per RNG, their correlation is rather high, so the accuracy of these designs is low. For the LFSR, even though it has a high power dissipation and area per RNG, its correlation is still low, so the LFSR is better than SBoNG [11] and the permuted LFSR [1].

TABLE XI.
PERCENTAGE IMPROVEMENT OF RNG DESIGN VS LFSRS WHEN CONSIDERING A PLANE OF RNGS SIZE 12*16 BITS. MEAN SQUARE ERROR (MSE) IS CONSIDERED AT 16 BITS OF RANDOM DATA WITH BITSTREAM SIZE OF 128 BITS

                           Percentage improvement from LFSR
Metric                     Proposed     SBoNG [11]   Permuted LFSR [1]
Number of datasets         1,466%       700%         700%
Size of dataset            183,001%     0%           0%
Delay/RNG                  -30.19%      0%           0%
Power/RNG                  47.03%       25.57%       44.71%
Area/RNG                   48.71%       22.57%       46.18%
Autocorrelation            98.19%       89.81%       89.85%
Cross-correlation          -721.46%     -163,945%    -231,632%
MSE (multiplier circuit)   40.62%       -433.78%     -38.55%
MSE (adder circuit)        26.79%       -247.14%     -34.24%

Next, the percentage differences in each metric when comparing the various RNG designs with the LFSR are considered. A positive (negative) value shows an improvement (degradation) of an RNG design compared to the LFSR. As shown in Table 11, even though the delay of the proposed RNG design is higher than for the LFSR and the other RNG designs by 30.19%, the proposed RNG design is better than the LFSR in all other figures of merit except the cross-correlation, for which the LFSR value remains lower. Moreover, the mean square error (MSE) of the proposed RNG design is lower (so it is more accurate) than for these designs for the SC adder/multiplier circuits.

V. CONCLUSION
In a Stochastic Computing (SC) design, the Stochastic Number Generator (SNG) incurs a substantially large area, exceeding 80% of the overall circuit. This paper introduces a novel approach to random number generators (RNGs). The proposed RNG design leverages the inherent randomness between bits to generate larger sets of random numbers. The proposed design enhances RNG utilization through a sliding window and reversing technique, producing multiple RNGs from a shared RNG plane. To mitigate the potentially high correlation from the sliding window (in which sharing of the significant bits between datasets is required), an isolation method is employed. Flip-Flops (FFs) are added at each RNG until the cross-correlation of every dataset reaches an acceptable level.
In this paper, parameters in the proposed design are varied to study the impact of each parameter in the design; the results in this paper show that to generate large datasets of random numbers from the proposed design, the following criteria must be met.
- An increase in the number of rows in each module tends to increase the random numbers in each dataset. However, to generate a large dataset, the number of rows in each module must not be a power of 2 or an even number.
- An increase in the number of bits in each row of the proposed module increases the generated random numbers in each dataset.
- In an RNG plane, the number of modules in each row does not directly impact the size of generated random numbers in a dataset. However, the total bits in each row of an RNG plane significantly impacts the number of generated random numbers in each dataset. Large datasets of random numbers can be generated when the total bits in each row of an RNG plane is a power of 2.
- The inputs of the XOR gates in each module also impact the size of the random numbers in each dataset. There are several pairs of XOR inputs that can generate large random number datasets.
In stochastic computing (SC), both the autocorrelation and the cross-correlation of RNG datasets significantly impact the accuracy; the proposed design uses a sliding window method to increase the number of generated datasets. When increasing the size of the RNG plane, the number of generated datasets is also increased; however, the correlation between these datasets is high, especially for datasets that are generated from the data in the middle of the RNG plane, whose most significant bits are shared. Flip-Flops are used to isolate these datasets and reduce their correlation.
Compared to other RNG designs found in the technical literature, the proposed design offers the best performance per RNG in terms of power dissipation, area and correlation (as product of the cross- and auto-correlations); it incurs the largest delay. However, when considering the product of all these figures of merit (i.e., DPACP) as a combined metric, the proposed design has the best performance. When using the proposed design for SC in two stochastic circuits (multiplier and adder), results show that the mean square error for the SC


multiplier and adder circuits, when bitstreams are generated using the proposed RNG design, is the lowest, so the proposed design is very accurate.

REFERENCES
[1] S. A. Salehi, “Low-cost Stochastic Number Generators for Stochastic Computing”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 28, No. 4, April 2020
[2] P. Li, D. J. Lilja, W. Qian, K. Bazargan, M. D. Riedel, “Computation on Stochastic Bit Streams Digital Image Processing case studies”, IEEE Trans. VLSI Syst., Vol. 22, No. 3, pp. 449-462, March 2014
[3] H. Ichihara, T. Sugino, “Compact and accurate digital filters based on stochastic computing”, IEEE Transactions on Emerging Topics in Computing, 2016
[4] R. Wang, B. F. Cockburn, D. G. Elliott, “Design, evaluation and fault-tolerance analysis of stochastic FIR filters”, Microelectronics Rel., Vol. 57, No. 2, pp. 111-127, 2016
[5] B. D. Brown, H. C. Card, “Stochastic neural computation. I. computational elements”, IEEE Trans. Comput., Vol. 50, No. 9, pp. 891-905, September 2001
[6] N. Nedjah, L. de Macedo Mourelle, “Stochastic reconfigurable hardware for neural networks”, Proc. Euromicro Symp. Digit. Syst. Des., pp. 438-442, 2003
[7] S. Lyshevski, V. Shmerko, M. A. Lyshevski, S. Yanushchkevich, “Neuronal processing, reconfigurable neural networks and stochastic computing”, Proc. 8th IEEE Conf. Nanotechnology, pp. 717-720, 2008
[8] S. Li, Q. Wang, X. Liu, J. Chen, “Low-Cost LSTM Implementation based on Stochastic Computing for Channel State Information Prediction”, APCCAS, Vol. 10.1109, pp. 231-234, 2018
[9] W. J. Gross, V. C. Gaudet, “Stochastic Computing: Techniques and Applications”, Springer 2019, ISBN-10: 3030037290
[10] Y. Liu, S. Liu, Y. Wang, F. Lombardi, J. Han, “A Survey of Stochastic Computing Neural Networks for Machine Learning Applications”, IEEE Transactions on Neural Networks and Learning Systems, Vol. 32, No. 7, pp. 2809-2824, July 2021
[11] F. Neugebauer, I. Polian, J. P. Hayes, “Building a Better Random Number Generator for Stochastic Computing”, Euromicro Conference on Digital System Design, 2017
[12] H. Ichihara, et al., “Compact and accurate stochastic circuits with shared random number sources”, IEEE International Conference on Computer Design (ICCD) 32nd, pp. 361-366, 2014
[13] S. Mohajer, Z. Wang, K. Bazargan, M. Riedel, D. Lilja, S. Faraji, “Parallel computing using Stochastic circuit and deterministic shuffling networks”, U.S. Patent Appl. 16/165713, April 25, 2019
[14] P. Junsangsri, F. Lombardi, “A Pseudo-Random Number Generator Circuit for Nanoscale Stochastic Computing (SC)”, 23rd IEEE International Conference on Nanotechnology (IEEE NANO 2023), July 2-5, 2023, Jeju Island, Korea
[15] A. Alaghi, J. P. Hayes, “Exploiting correlation in stochastic circuit design”, IEEE International Conference on Computer Design (ICCD) 31st, pp. 39-46, 2013

Pilin Junsangsri (Member, IEEE) received the B.Eng. in Electrical Engineering from Chulalongkorn University, Bangkok, Thailand, in 2006, the M.S. degree in Electrical and Computer Engineering from Northeastern University, Boston, in 2010, and the Ph.D. in Computer Engineering from Northeastern University, Boston, MA, in 2017.
She is currently an assistant professor at the School of Engineering, Wentworth Institute of Technology, Boston, MA. Her past research included the simulation and design of the model of solar cells, and the design of non-volatile memory by using emerging technologies such as memristor, phase change memory, programmable metallization cell, and racetrack memory. Her research interests are in VLSI design, memory design, Stochastic Computing, and Artificial Intelligence.

Fabrizio Lombardi (M’81–SM’02-F’09) graduated in 1977 from the University of Essex (UK) with a B.Sc. (Hons.) in Electronic Engineering. In 1977 he joined the Microwave Research Unit at University College London, where he received the Master in Microwaves and Modern Optics (1978), the Diploma in Microwave Engineering (1978) and the Ph.D. from the University of London (1982).
He is currently the holder of the International Test Conference (ITC) Endowed Chair Professorship at Northeastern University, Boston. His research interests are bio-inspired and nano manufacturing computing, VLSI design, testing, and fault/defect tolerance.
