INDS08 Implementation of One Dimensional CNN Array On FPGA A Design Based On Verilog HDL
Abstract: This paper describes an FPGA-based implementation of a 1D-CNN with a 3×1 template and a length of 8×1. The Cellular Neural Network (CNN) is a parallel processing technology that has been widely used for image processing. It is a reduced version of a Hopfield Neural Network. Because each cell is connected only to its neighbors, this technology is easier to implement than a Hopfield Neural Network. There are various options for implementing a CNN on a chip; the best solution is ASIC technology, and the next best is emulation on a digital reconfigurable chip such as an FPGA. Universal CNN-based machines can be designed and developed with these technologies. Since FPGAs are commercial off-the-shelf (COTS) components that are developing rapidly, emulating a CNN on an FPGA yields a simple and economical architecture. A digital design on an FPGA does, however, involve a tradeoff between speed and area. One key target is therefore to reach the best performance for this emulation architecture under these constraints.

Keywords: Cellular Neural Network; CNN Emulation on FPGA; Simulation; HDL.

I. INTRODUCTION

This paper briefly introduces the design of a digital emulation of a CNN based on hardware description languages. Cellular Neural Networks were introduced by Chua and Yang of the University of California at Berkeley in 1988 [4]. This type of neural network is a reduced version of the Hopfield Neural Network. One of the most important features of a CNN is its local connectivity: each cell is connected only to its neighbor cells. Because of this local connection between a cell and its neighbors, a hardware implementation of this type of neural network is easily realizable [5]. By digitizing the analog behavior of the system (i.e. emulating it on a digital platform), it can be realized on an FPGA. Because of the local connectivity and the processing around each cell, the global system works like a parallel processor system [6].

The behavior of a CNN system is determined by the template values; changing these values changes the behavior of the CNN. This is an important feature for realizing a universal machine with a CNN [7]. A test-bed for CNN emulation on an FPGA evaluation board is a relatively cheap option, and the required development time should be short.

II. CNN DIGITAL EMULATION/MODELLING

The mathematical model of each CNN cell is a first-order differential equation, given in Eq-1.

Eq-1 (State Equation):

$$C \frac{dv_{xij}(t)}{dt} = -\frac{1}{R_x} v_{xij}(t) + \sum_{C(k,l) \in N_r(i,j)} A(i,j;k,l)\, v_{ykl}(t) + \sum_{C(k,l) \in N_r(i,j)} B(i,j;k,l)\, v_{ukl} + I, \qquad 1 \le i \le M;\ 1 \le j \le N$$

To model this equation in HDL, the nonlinear terms must be simplified. The simplified equation takes the form of Eq-2, which corresponds to Eq-1 with C and R_x set to 1 and the neighborhood sums written as template operations.

Eq-2:

$$\dot{x} = -x + A * v_y + B * v_u + I$$

In this equation 'A' is the feedback template and 'B' is the control template.

Eq-3 (Output Equation):

$$v_{yij}(t) = \frac{1}{2} \left( \left| v_{xij}(t) + 1 \right| - \left| v_{xij}(t) - 1 \right| \right), \qquad 1 \le i \le M;\ 1 \le j \le N$$
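The output equation (Eq-3) can be checked numerically with a small behavioural model. The following Python sketch is only an illustration and is not part of the FPGA design; the function names `cnn_output` and `clamp` are assumptions made for this example. It shows that the absolute-value form of Eq-3 is the same as clipping the state to the range [-1, +1].

```python
def cnn_output(vx: float) -> float:
    """Piecewise-linear CNN output of Eq-3: 0.5 * (|vx + 1| - |vx - 1|)."""
    return 0.5 * (abs(vx + 1.0) - abs(vx - 1.0))


def clamp(vx: float) -> float:
    """Equivalent saturation form: clip the state value to [-1, +1]."""
    return max(-1.0, min(1.0, vx))


if __name__ == "__main__":
    # Compare both forms on a few sample state values.
    for vx in (-3.0, -1.0, -0.25, 0.0, 0.5, 1.0, 2.5):
        print(vx, cnn_output(vx), clamp(vx))
```

The clipping form is usually the more convenient one in digital logic, since it needs only two comparisons instead of two absolute values and a subtraction.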
The output equation is a piecewise-linear sigmoid function that limits the output value to the range [-1, +1]. In some references this function is denoted by f(.).

Eq-2 can be converted to a discrete-time model. A discrete-time CNN can easily be mapped to an FPGA by defining digital integrators, multipliers, adders and other digital operators. Once these fundamental operators are defined in the FPGA and wired together, dynamic modeling of the CNN is possible [7]. A single CNN cell is modeled by a first-order differential equation; therefore, this architecture can solve it.

The input data are normalized according to Eq-4, which maps a pixel value of 0 to -1 and a pixel value of 1 to +1.

Eq-4:

$$U = \mathrm{Pixel\_value} \times 2 - 1$$

The convolution module loads the template values and the input data and returns their convolution, i.e. the sum of the element-wise products. The block diagram in Fig. 2 shows the inputs and outputs of the convolution operator. Values outside the bounds of the CNN array are set to zero. Replicating (cloning) the module according to this diagram is simple in HDL code.
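The normalization, the zero-padded 3×1 convolution and the discrete-time update described above can be sketched as a behavioural Python model. This is a minimal sketch under stated assumptions, not the paper's HDL: it assumes a forward-Euler discretization of Eq-2 with a hypothetical step size `h`, floating-point instead of fixed-point arithmetic, pixel values in [0, 1], and an initial state equal to the normalized input. The names `normalize`, `output`, `conv3` and `dtcnn_step`, the sample input, and the template values in the usage part are all introduced for illustration only.

```python
from typing import List, Sequence


def normalize(pixels: Sequence[float]) -> List[float]:
    """Eq-4: U = Pixel_value * 2 - 1 (assumes pixel values in [0, 1])."""
    return [p * 2.0 - 1.0 for p in pixels]


def output(x: float) -> float:
    """Eq-3: piecewise-linear output, limited to [-1, +1]."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def conv3(template: Sequence[float], values: Sequence[float], i: int) -> float:
    """Apply a 3x1 template around cell i; out-of-bound neighbours read as zero."""
    total = 0.0
    for k, weight in enumerate(template):
        j = i + k - 1  # neighbour index: left, centre, right
        if 0 <= j < len(values):
            total += weight * values[j]
    return total


def dtcnn_step(x: Sequence[float], u: Sequence[float], ta: Sequence[float],
               tb: Sequence[float], bias: float, h: float = 0.1) -> List[float]:
    """One forward-Euler step of Eq-2: x <- x + h * (-x + A*y + B*u + I)."""
    y = [output(v) for v in x]
    return [
        x[i] + h * (-x[i] + conv3(ta, y, i) + conv3(tb, u, i) + bias)
        for i in range(len(x))
    ]


if __name__ == "__main__":
    pixels = [0, 1, 1, 0, 0, 1, 0, 1]   # hypothetical 8x1 binary input
    u = normalize(pixels)
    x = list(u)                          # assumption: initial state = input
    # Illustrative template values only, not the templates used in the paper.
    ta, tb, bias = [0.0, 2.0, 0.0], [0.0, 1.0, 0.0], 0.0
    for _ in range(200):
        x = dtcnn_step(x, u, ta, tb, bias)
    print([output(v) for v in x])
```

In a hardware version each cell would run `conv3` and the update in parallel on every clock cycle; the loop over `i` here only stands in for that replication.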
[Diagram signal labels: a1, a2, a3, b1, b2, b3, CLK]

Figure 3. Digital Architecture of CNN

TA = [-1, +2, +1]
TB = Bias = 0

Figure 4. Hole Counting Sample: (Left) input image, (Right) output result.
TA = [+1, -1, +1]
TB = [0, 1, 0]
Bias = 0

Figure 5. Filter Sample: (Left) input image, (Right) output result.

IV. CONCLUSION

REFERENCES

[1] Martinez, J. J., F. J. Toledo, et al. "New emulated discrete model of CNN architecture for FPGA and DSP applications." Lecture Notes in Computer Science: 33-40.
[2] Roska, T. and A. Rodriguez-Vazquez (2000). "Review of CMOS implementations of the CNN universal machine-type visual microprocessors." IEEE Int. Symp. on Circuits and Systems, ISCAS 2000, Geneva, Switzerland, vol. 2, pp. 120-123.
[3] Chua, L. O. and L. Yang (1988). "Cellular neural networks: applications." Circuits and Systems, IEEE Transactions on 35(10): 1273-1290.
[4] Chua, L. O. and L. Yang (1988). "Cellular neural networks: theory." Circuits and Systems, IEEE Transactions on 35(10): 1257-1272.
[5] Zhao, J., Q. Ren, J. Wang, and H. Meng (2007). "A New Approach for Image Restoration Based on CNN Processor." ISNN 2007, Part III, LNCS 4493, pp. 821-827.
[6] Toledo, F. J., J. J. Martínez, et al. (2005). "Image processing with CNN in a FPGA-based augmented reality system for visually impaired people." 8º Int. Work-Conference on Artificial and Natural Neural Networks, IWANN: 906-912.
[7] Martinez-Alvarez, J. J., F. J. Garrigos-Guerrero, et al. "High Performance Implementation of an FPGA-Based Sequential DT-CNN."
[8] Chou, E. Y., B. J. Sheu, T. H. Wu, and R. C. Chang. "VLSI Design of Densely-Connected Array Processors." Proceedings of the International Conference on Computer Design: VLSI in Computers & Processors (ICCD '95).
[9] Lai, K. and P. Leong. "Implementation of Time-Multiplexed CNN Building Block Cell." Proc. MicroNeuro 96: 80-85.
[10] Sadeghi-Emamchaie, S., G. A. Jullien, et al. (1998). "Digital arithmetic using analog arrays." VLSI, 1998. Proceedings of the 8th Great Lakes Symposium on: 202-205.
[11] Espejo, S., A. Rodriguez-Vazquez, et al. (1994). "Smart-pixel cellular neural networks in analog current-mode CMOS technology." Solid-State Circuits, IEEE Journal of 29(8): 895-905.

APPENDIX

// Convolution module (port and wire declarations only; module header and body not shown)
input [15:0] VA1;
input [15:0] VA2;
input [15:0] VA3;
input [15:0] Y1;
input [15:0] Y2;
input [15:0] Y3;
wire signed [17:0] conv;
endmodule

// Signed fixed-point multiplier.
// Result range [-7,+7], 16-bit fixed point with 12 fractional bits.
module signe_mul (out,a,b);
output [15:0] out;
// Operands are declared signed so that a*b is a signed multiplication;
// with unsigned ports the product of negative operands would be wrong.
input signed [15:0] a;
input signed [15:0] b;
wire signed [15:0] out;
wire signed [31:0] mul_out;   // full product, 24 fractional bits
assign mul_out = a*b;
// Keep the sign bit and bits [26:12] to return to 12 fractional bits.
assign out = {mul_out[31],mul_out[26:12]};
endmodule
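The bit selection performed by `signe_mul` in the appendix can be cross-checked against a software model. The following Python sketch is a hedged illustration, not part of the original design: it assumes that the 16-bit operands are two's-complement words with 12 fractional bits (as the comment in the appendix suggests) and that the multiplication is signed. The helper names `to_fixed`, `to_signed`, `from_fixed` and `fixed_mul` are hypothetical.

```python
FRAC_BITS = 12       # fractional bits of the 16-bit operands (assumed)
WORD_MASK = 0xFFFF   # 16-bit word


def to_fixed(value: float) -> int:
    """Encode a real number as a 16-bit two's-complement word, 12 fractional bits."""
    return int(round(value * (1 << FRAC_BITS))) & WORD_MASK


def to_signed(word: int, bits: int) -> int:
    """Interpret an unsigned bit pattern of the given width as two's complement."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word


def from_fixed(word: int) -> float:
    """Decode a 16-bit fixed-point word back to a real number."""
    return to_signed(word, 16) / (1 << FRAC_BITS)


def fixed_mul(a: int, b: int) -> int:
    """Model of out = {mul_out[31], mul_out[26:12]} for a signed 32-bit product."""
    mul_out = (to_signed(a, 16) * to_signed(b, 16)) & 0xFFFFFFFF
    sign = (mul_out >> 31) & 0x1
    low_bits = (mul_out >> 12) & 0x7FFF   # bits [26:12] of the product
    return (sign << 15) | low_bits


if __name__ == "__main__":
    for x, y in ((1.5, -2.0), (-1.0, 1.0), (0.5, 0.5)):
        product = fixed_mul(to_fixed(x), to_fixed(y))
        print(x, y, hex(product), from_fixed(product))
```

Such a model can serve as a reference when writing a testbench: the same operand words are applied to the HDL module and the 16-bit results are compared. Products outside the representable range are not handled here, just as the bit selection in the module does not saturate.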