Reconfigurable Computing (EN2911X, Fall07) : Lab 2 Presentations

The document discusses several student presentations on palindrome checkers implemented in Verilog for reconfigurable computing. It describes various approaches taken by different student teams, including using loop unrolling to compare digits, optimizing with non-blocking instructions, and dividing digits into a vector to compare. It also discusses a binary to BCD conversion module, integrating the checker with a Nios II processor using a PIO interface, and performance results achieving 4.2 seconds to check 1 billion numbers.

Uploaded by

Tom Perrin
Copyright
© Attribution Non-Commercial (BY-NC)

Reconfigurable Computing (EN2911X, Fall07)

Lab 2 presentations
Prof. Sherief Reda, Division of Engineering, Brown University, http://ic.engin.brown.edu

Reconfigurable Computing S. Reda, Brown University

Runtimes by different teams


4.2 seconds, 14 seconds, 33 seconds, 300 seconds, 305 seconds, 320 seconds


Palindrome Checker

Cesare Ferri, Rotor Le


Part I : Verilog Module


always @(posedge CLOCK_50)
begin
    not_palindrome = 1'd0; len = 0; tmp = number; // reset

    // DECOMPOSE THE NUMBER IN DIGITS (room for loop unrolling here)
    for (i = 0; i < 9; i = i + 4'd1)
    begin
        if (tmp > 0)
        begin
            modulo = tmp % 4'd10;
            tmp = tmp / 10;
            vector[len % 9] = modulo;
            len = len + 1;
        end
    end

    th = (len >> 1);
    for (j = 0; j < th; j = j + 4'd1)
    begin
        tmp2 = (len-1) - j;
        tmp3 = vector[j]; tmp4 = vector[tmp2];
        if (tmp3 != tmp4)
            not_palindrome = 1'b1;
    end
    result = ~(not_palindrome);
end

Part I : Verilog Module


always @(posedge CLOCK_50)
begin
    not_palindrome = 1'd0; len = 0; tmp = number; // reset
    for (i = 0; i < 9; i = i + 4'd1)
    begin
        if (tmp > 0)
        begin
            modulo = tmp % 4'd10;
            tmp = tmp / 10;
            vector[len % 9] = modulo;
            len = len + 1;
        end
    end

    // COMPARE THE DIGITS STORED INTO THE VECTOR (loop unrolling, again)
    th = (len >> 1);
    for (j = 0; j < th; j = j + 4'd1)
    begin
        tmp2 = (len-1) - j;
        tmp3 = vector[j]; tmp4 = vector[tmp2];
        if (tmp3 != tmp4)
            not_palindrome = 1'b1;
    end
    result = ~(not_palindrome);
end
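For reference, the same two-phase scheme (decompose into digits, then compare mirrored positions) can be modeled in software. This is only a minimal C sketch for checking results against the hardware; the function name `is_palindrome_digits` is hypothetical and not part of the team's design.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of the decompose-then-compare scheme.
   Returns 1 if n reads the same in both directions in base 10. */
int is_palindrome_digits(uint32_t n)
{
    uint8_t digits[10];   /* a 32-bit value has at most 10 decimal digits */
    int len = 0;

    /* Phase 1: peel off decimal digits, least significant first. */
    do {
        digits[len++] = (uint8_t)(n % 10);
        n /= 10;
    } while (n > 0);
    assert(len <= 10);

    /* Phase 2: compare digit j with its mirror at len-1-j. */
    for (int j = 0; j < len / 2; j++) {
        if (digits[j] != digits[len - 1 - j])
            return 0;
    }
    return 1;
}
```

Unlike the Verilog loop, this model stops at the first mismatch, which is harmless in software but changes nothing about the result.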

Optimized Verilog Code


Do loop unrolling to compare digits:

if (digits[0] == digits[3] && digits[1] == digits[2]) not_palindrome = 1'd0; // reset: the four digits mirror each other



Unsolved things
Our running time now depends on the way that we extract digits from the number.
Some ideas to improve?
Using a shift register
Using non-blocking assignments


Palindrome Homework Summary


ENGN2911X
Aaron Mandle, Bryant Mairs

Setup
Two-cycle, fixed-length custom instruction
Operates on 20 numbers at a time
Returns total palindromes in that 20-number block


Process
Combinational conversion from binary to BCD
Check number of digits
Compare digits based on length
Total up number of valid palindromes


Binary to BCD Conversion


Built using blocks of conditional add-3 modules and shifts
Add-3 modules:
4-bit input
Adds 3 if input was 5 or greater
Based on adding 6 to numbers > 9
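The add-3-and-shift structure can be illustrated with a sequential software model. Adding 3 to a digit of 5 or more before doubling has the same effect as adding 6 to a digit above 9 after doubling. The sketch below is an assumption-laden C model, not the team's combinational netlist: the names `add3` and `bin_to_bcd`, the 30-bit input bound, and the packed-nibble output layout are all chosen here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Conditional add-3 cell: 4-bit input, adds 3 when the input is 5 or more. */
static uint64_t add3(uint64_t nibble)
{
    return (nibble >= 5) ? nibble + 3 : nibble;
}

/* Shift-and-add-3 conversion of a 30-bit binary value to packed BCD
   (one decimal digit per 4-bit nibble, least significant digit lowest). */
uint64_t bin_to_bcd(uint32_t n)
{
    assert(n < (1u << 30));
    uint64_t bcd = 0;
    for (int bit = 29; bit >= 0; bit--) {
        /* Correct every digit that would exceed 9 after doubling. */
        for (int d = 0; d < 10; d++) {
            uint64_t nibble = (bcd >> (4 * d)) & 0xF;
            bcd &= ~((uint64_t)0xF << (4 * d));
            bcd |= add3(nibble) << (4 * d);
        }
        /* Shift left by one and bring in the next binary bit. */
        bcd = (bcd << 1) | ((n >> bit) & 1u);
    }
    return bcd;
}
```

In hardware the two loops are unrolled into a fixed array of add-3 cells, so the whole conversion is a single combinational block.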


module checkPalindrome(data, result);

input [31:0] data;
output [31:0] result;

wire [3:0] digits [10:0];
wire [3:0] digCount;

// Instance name "converter" added here; a module instantiation needs one.
bin2bcd converter({digits[9], digits[8], digits[7], digits[6], digits[5],
                   digits[4], digits[3], digits[2], digits[1], digits[0]}, data);

// Number of decimal digits: position of the highest non-zero digit.
assign digCount = digits[9] != 0 ? 10 :
                  digits[8] != 0 ? 9 :
                  digits[7] != 0 ? 8 :
                  digits[6] != 0 ? 7 :
                  digits[5] != 0 ? 6 :
                  digits[4] != 0 ? 5 :
                  digits[3] != 0 ? 4 :
                  digits[2] != 0 ? 3 :
                  digits[1] != 0 ? 2 : 1;

// Compare mirrored digit pairs according to the detected length.
assign result = digCount == 1 ||
    digCount == 2 && (digits[0] == digits[1]) ||
    digCount == 3 && (digits[0] == digits[2]) ||
    digCount == 4 && (digits[0] == digits[3] && digits[1] == digits[2]) ||
    digCount == 5 && (digits[0] == digits[4] && digits[1] == digits[3]) ||
    digCount == 6 && (digits[0] == digits[5] && digits[1] == digits[4] && digits[2] == digits[3]) ||
    digCount == 7 && (digits[0] == digits[6] && digits[1] == digits[5] && digits[2] == digits[4]) ||
    digCount == 8 && (digits[0] == digits[7] && digits[1] == digits[6] && digits[2] == digits[5] && digits[3] == digits[4]) ||
    digCount == 9 && (digits[0] == digits[8] && digits[1] == digits[7] && digits[2] == digits[6] && digits[3] == digits[5]);

endmodule


Yossi


For all solutions


Finding the length of the decimal representation (# digits) by:
typedef unsigned long UINT;
inline UINT GetMSDFIndx(UINT n)
{
    return (n >= 100000000 ? 8 :
           (n >= 10000000  ? 7 :
           (n >= 1000000   ? 6 :
           (n >= 100000    ? 5 :
           (n >= 10000     ? 4 :
           (n >= 1000      ? 3 :
           (n >= 100       ? 2 :
           (n >= 10        ? 1 : 0))))))));
}


Software Only Solutions


Times:
On laptop (Intel 2333 MHz): 8 secs.
On NIOS (100 MHz): 3500 secs.

Inherently sequential
Early false detection: quit the computation if we find two digits that do not match.
Brings down expected # divide operations to less than 2.2
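Early false detection can be sketched as follows: compare the outermost digit pair first and return as soon as a pair differs. This is a minimal C illustration, assuming the length is found by comparisons against a power-of-ten table; the names `POW10` and `is_palindrome_early_exit` are hypothetical, and this version still uses division to read the leading digit.

```c
#include <assert.h>
#include <stdint.h>

static const uint32_t POW10[10] = {
    1u, 10u, 100u, 1000u, 10000u, 100000u,
    1000000u, 10000000u, 100000000u, 1000000000u
};

int is_palindrome_early_exit(uint32_t n)
{
    /* Index of the most significant digit, found by comparisons only. */
    int msd_idx = 9;
    while (msd_idx > 0 && n < POW10[msd_idx])
        msd_idx--;
    assert(n == 0 || n >= POW10[msd_idx]);

    uint32_t div = POW10[msd_idx];
    while (n > 0) {
        uint32_t msd = n / div;   /* leading digit (0 if inner zeros remain) */
        uint32_t lsd = n % 10;    /* trailing digit */
        if (msd != lsd)
            return 0;             /* early false detection */
        n = (n % div) / 10;       /* strip both ends */
        div /= 100;
    }
    return 1;
}
```

Most numbers fail on the first pair, so the loop body rarely runs more than once.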

Software Only Solutions


Observations:
1. Detect whether the MSD is a given number without division. MSD test: d is the MSD of a number n of length L if and only if $d \cdot 10^{L-1} \le n < (d+1) \cdot 10^{L-1}$. E.g. $4 \cdot 10^3 \le 4765 < 5 \cdot 10^3$.
2. Cut out the MSD: $4665 - 4 \cdot 10^3 = 665$, and continue.

Algorithm: find one LSD after another, compare with MSDs, quit early if not a palindrome.
Runs in 8 seconds on laptop.
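The two observations combine into a loop that divides only to obtain each LSD. The C sketch below is one possible reading of the algorithm, not the presenter's code; `msd_is`, `is_palindrome_msd_test` and the `POW10` table are hypothetical names, and the explicit length counter is an assumption made here so that inner zeros are handled correctly.

```c
#include <assert.h>
#include <stdint.h>

static const uint32_t POW10[10] = {
    1u, 10u, 100u, 1000u, 10000u, 100000u,
    1000000u, 10000000u, 100000000u, 1000000000u
};

/* MSD test: 1 if d is the leading digit of n, where pow10 = 10^(L-1)
   and L is the current decimal length. No division is needed. */
static int msd_is(uint32_t n, uint32_t d, uint32_t pow10)
{
    uint64_t lo = (uint64_t)d * pow10;
    return lo <= n && n < lo + pow10;
}

int is_palindrome_msd_test(uint32_t n)
{
    /* Decimal length by comparisons only. */
    int len = 1;
    while (len < 10 && n >= POW10[len])
        len++;
    assert(len >= 1 && len <= 10);

    while (len > 1) {
        uint32_t lsd = n % 10;            /* next least significant digit */
        uint32_t p = POW10[len - 1];
        if (!msd_is(n, lsd, p))
            return 0;                     /* quit early */
        n -= lsd * p;                     /* cut out the MSD */
        n /= 10;                          /* drop the LSD */
        len -= 2;
    }
    return 1;
}
```

For example, 1001 passes the first test with d = 1, is reduced to 0 with length 2, and then passes the test with d = 0.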

Software Only Solutions

On NIOS, division is really expensive.
Division-free algorithm: don't test the MSD, find it with binary search.
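A binary search over the ten possible digit values needs at most four comparisons against multiples of $10^{L-1}$. The following C sketch shows the idea under stated assumptions: the function name `msd_binary_search` is hypothetical, and the search range starts at 0 so that it also works after leading digits have been cut out.

```c
#include <assert.h>
#include <stdint.h>

/* Finds the digit d in 0..9 with d*pow10 <= n < (d+1)*pow10,
   where pow10 = 10^(L-1). Uses only multiplies, shifts and compares. */
uint32_t msd_binary_search(uint32_t n, uint32_t pow10)
{
    assert((uint64_t)n < 10ull * pow10);
    uint32_t lo = 0, hi = 9;
    while (lo < hi) {
        uint32_t mid = (lo + hi + 1) >> 1;   /* upper midpoint */
        if ((uint64_t)mid * pow10 <= n)
            lo = mid;        /* digit is at least mid */
        else
            hi = mid - 1;    /* digit is below mid */
    }
    return lo;
}
```

Since `pow10` takes only a few values, the products `mid * pow10` could also be read from a small table, which removes the multiplications as well.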


Software Only Solutions


On NIOS, division is really expensive.
Algorithm:
Start from left
Find half of the digits
Compute the palindrome whose left half matches these digits
Compare to the tested number
Lose the early false detection, but still better than division.

Runs in 3500 secs on NIOS 100 MHz.
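The mirror-and-compare steps can be sketched in C as below. This is an illustrative model rather than the presenter's code: `is_palindrome_mirror` and `POW10` are hypothetical names, and each digit is found here by repeated subtraction instead of binary search to keep the example short. Both approaches avoid division.

```c
#include <assert.h>
#include <stdint.h>

static const uint32_t POW10[10] = {
    1u, 10u, 100u, 1000u, 10000u, 100000u,
    1000000u, 10000000u, 100000000u, 1000000000u
};

int is_palindrome_mirror(uint32_t n)
{
    /* Decimal length by comparisons only. */
    int len = 1;
    while (len < 10 && n >= POW10[len])
        len++;
    assert(len >= 1 && len <= 10);

    uint32_t rest = n;
    uint64_t mirror = 0;   /* palindrome whose left half matches n */
    for (int i = 0; i < ((len + 1) >> 1); i++) {
        int hi_pos = len - 1 - i;
        /* Extract the digit at hi_pos without dividing. */
        uint32_t d = 0;
        while (rest >= POW10[hi_pos]) {
            rest -= POW10[hi_pos];
            d++;
        }
        mirror += (uint64_t)d * POW10[hi_pos];
        if (i != hi_pos)                     /* skip the middle digit's twin */
            mirror += (uint64_t)d * POW10[i];
    }
    return mirror == n;
}
```

The accumulator is 64 bits wide because the mirrored value of a 10-digit input can exceed the 32-bit range.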



Using the Hardware


A general trick to divide by a constant without using division.

Based on a trick I read in Hacker's Delight for dividing by 3.

Demonstrate on divide by 10:


Given: number $n < 2^{30}$
Needed: $\lfloor n/10 \rfloor$
Algorithm: Multiply n by $(2^{31}+2)/10$ = 0xCCCCCCD, and then shift right 31 positions.

Division Free divide by 10


Algorithm:
Multiply $n < 2^{30}$ by $(2^{31}+2)/10$ = 0xCCCCCCD, and then shift right 31 positions.

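Proof: The above algorithm outputs:

$\lfloor n \cdot \frac{2^{31}+2}{10} \cdot \frac{1}{2^{31}} \rfloor = \lfloor \frac{n}{10} + \frac{2n}{10 \cdot 2^{31}} \rfloor = \lfloor n/10 \rfloor$

The last equality holds because:

$n < 2^{30}$ implies $2n < 2^{31}$, so $\frac{2n}{10 \cdot 2^{31}} < \frac{1}{10}$

$\lfloor n/10 \rfloor \le n/10 \le \lfloor n/10 \rfloor + 9/10$

$\lfloor n/10 \rfloor \le \frac{n}{10} + \frac{2n}{10 \cdot 2^{31}} < \lfloor n/10 \rfloor + 1$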
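The multiply-and-shift step translates directly into C. The sketch below assumes a 64-bit intermediate product, since n times 0xCCCCCCD can need up to 58 bits; the function name `div10` is chosen here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* floor(n / 10) for n < 2^30, using one multiply and one shift. */
uint32_t div10(uint32_t n)
{
    assert(n < (1u << 30));
    /* 0xCCCCCCD = (2^31 + 2) / 10 */
    return (uint32_t)(((uint64_t)n * 0xCCCCCCDu) >> 31);
}
```

In hardware the shift is free (it is just wiring), so the cost is a single multiplier.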



Divide by Constant
Similarly, to divide n by a constant C, we need to find P and R such that:
$2^P + R \equiv 0 \pmod{C}$
$R \cdot n < 2^P$

And then multiply n by $(2^P + R)/C$, and shift right P positions.
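One way to find such P and R is a small search over P. The C sketch below is an assumption-based illustration: it takes the dividend bound to be $n < 2^{30}$ as in the divide-by-10 case, picks the smallest P that satisfies both conditions, and uses the hypothetical names `magic_t`, `find_magic` and `div_by_const`.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier (2^P + R) / C and shift amount P for dividing by C. */
typedef struct {
    uint64_t multiplier;
    unsigned shift;
} magic_t;

/* Smallest P such that R = (-2^P) mod C satisfies R * n_max < 2^P,
   where n_max = 2^30 - 1 bounds the dividend. */
magic_t find_magic(uint32_t c)
{
    assert(c >= 1 && c < (1u << 30));
    const uint64_t n_max = (1ull << 30) - 1;
    for (unsigned p = 0; p < 62; p++) {
        uint64_t two_p = 1ull << p;
        uint64_t r = (c - two_p % c) % c;   /* makes 2^P + R divisible by C */
        if (r * n_max < two_p) {
            magic_t m = { (two_p + r) / c, p };
            return m;
        }
    }
    assert(0 && "no suitable P below 62");
    magic_t none = { 0, 0 };
    return none;
}

/* floor(n / C) for n < 2^30, given the constants for C. */
uint32_t div_by_const(uint32_t n, magic_t m)
{
    assert(n < (1u << 30));
    return (uint32_t)((n * m.multiplier) >> m.shift);
}
```

For C = 10 this search returns P = 31 and the multiplier 0xCCCCCCD, matching the divide-by-10 case.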

Found the constants for all powers of 10 needed.
Algorithm worst register-to-register delay: 25 ns.

Run Time: 33 secs.

EN2911X Lab 2: Palindromes


Brian Reggiannini and Chris Erway


Checking a palindrome
All combinational logic!
Step 1: Convert 30-bit integer to 37-bit binary-coded decimal (BCD) format
Step 2: Detect the length of decimal number
Step 3: Compare pairs of digits with XOR


Binary to BCD converter


Integration with Nios II


Worst-case propagation delay: 43 ns, 5 cycles
Don't want to wait!
Use 32-bit PIO interface
Array of 25 palindrome-checking units
Write out 32-bit start value
Read back # of total palindromes found (from next 25)
While Nios is waiting: increment loop counter
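The write-then-read protocol can be sketched as a driver loop. Everything hardware-facing in this C example is hypothetical: `pio_write_start` and `pio_read_count` stand in for the real PIO register accesses and are backed here by a software simulation of the 25 checking units, so the sketch runs on any host.

```c
#include <assert.h>
#include <stdint.h>

#define UNITS 25u   /* palindrome-checking units in the array */

/* Software stand-in for the hardware array (hypothetical). */
static uint32_t sim_start;   /* last start value written to the PIO */

static int sim_is_palindrome(uint32_t n)
{
    uint64_t rev = 0;        /* 64-bit: a reversed value may exceed 32 bits */
    uint32_t t = n;
    while (t > 0) {
        rev = rev * 10 + t % 10;
        t /= 10;
    }
    return rev == n;
}

static void pio_write_start(uint32_t start)
{
    sim_start = start;
}

static uint32_t pio_read_count(void)
{
    uint32_t count = 0;
    for (uint32_t k = 0; k < UNITS; k++)
        count += (uint32_t)sim_is_palindrome(sim_start + k);
    return count;
}

/* Counts palindromes in [0, limit); limit must be a multiple of 25. */
uint32_t count_palindromes(uint32_t limit)
{
    assert(limit % UNITS == 0);
    uint32_t total = 0;
    uint32_t start = 0;
    while (start < limit) {
        pio_write_start(start);          /* write out 32-bit start value */
        uint32_t next = start + UNITS;   /* loop counter work overlaps the hardware delay */
        total += pio_read_count();       /* palindromes among the next 25 values */
        start = next;
    }
    return total;
}
```

On the board, the two stand-in functions would be replaced by accesses to the PIO data registers, and the loop counter update fills the cycles the checker needs to settle.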


Nios Software


Results
Original C program: 49.59 s/billion
Unoptimized Nios C program: 7842 s/100 million
Final result: 4.2 s/billion (420000036 cycles @ 100 MHz)
Total logic elements: 23,039 / 33,216 (69%)
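As a quick consistency check on these figures, the cycle count can be converted to time and to cycles per 25-number transaction. The per-transaction figure assumes that the full billion numbers are processed in blocks of 25, which the slides imply but do not state outright.

```latex
\begin{align*}
t &= \frac{420000036 \text{ cycles}}{100 \times 10^{6} \text{ cycles/s}} \approx 4.2 \text{ s} \\
\text{transactions} &= \frac{10^{9}}{25} = 4 \times 10^{7} \\
\text{cycles per transaction} &= \frac{420000036}{4 \times 10^{7}} \approx 10.5 \\
\text{speedup over original C} &= \frac{49.59}{4.2} \approx 11.8
\end{align*}
```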

