Design, Implementation and Optimization of Highly Efficient UART
Parul Sharma* and Ashutosh Gupta**

The paper presents the hardware implementation of a high-speed, efficient universal
asynchronous receiver and transmitter (UART) on an FPGA; the design is fully functional
and synthesizable. The UART consists of a transmitter, a baud rate generator and a
receiver. It has been described in the Verilog hardware description language (HDL)
and simulated using ModelSim SE 6.0d. The simulated waveform for 8 data bits spans
0.83 ms (baud rate of 9,600 bps) using a 25.175 MHz clock. The Verilog description
has been synthesized for field programmable gate array (FPGA) devices from the Virtex-4
and Spartan-3 families, and a comparative study has been carried out. The maximum
frequency of operation is 289.151 MHz for Virtex-4 and 155.473 MHz for Spartan-3.
Furthermore, the total power consumption is 268 mW for Virtex-4 and 93 mW for
Spartan-3. The number of slices, look-up tables (LUTs) and global clocks (GCLKs) used
is 63, 109 and 2, respectively, on Virtex-4, and 59, 112 and 25, respectively, on
Spartan-3.

1. Introduction
A universal asynchronous receiver and transmitter (UART) is an indispensable
component for communication with serial input and output devices. Serial communication
[1-3] is essential in computers and allows them to communicate with low-speed
peripherals such as keyboards, mice and modems. The UART takes bytes of data and
transmits the individual bits sequentially. At the destination, a second UART
reassembles the bits into complete bytes [4, 5]. Figure 1 illustrates a basic UART data
packet. It consists of 1 start bit, which is always at logic 0, followed by a programmable
number of data bits (typically 5 to 8) and a programmable number of stop bits
(1 or 2) [2]. The stop bit is always at logic 1. Thus, a standard frame carrying one
8 bit data byte is 10 bits long. In a UART link, the transmitter and the receiver do not
share a clock signal; each has its own local clock [6].
* Post Graduate Student, ECE Department, GJUS&T, Hisar 125001, India. E-mail: [email protected]
** Project Assistant, DSG, Central Electronics Engineering Research Institute, Pilani 333031, India.
E-mail: [email protected]

© 2009 The Icfai University Press. All Rights Reserved.
The Icfai University Journal of Science & Technology, Vol. 5, No. 4, 2009
[Figure 1 image: frame layout showing the start bit (S), the 8 data bits D0 to D7 and the stop bit]

Figure 1: Basic Data Format of UART: 1 Start Bit, 8 Data Bits and 1 Stop Bit

Since a common clock is not shared, a known data transfer rate (baud rate) must be agreed
upon between the transmitter and the receiver before any data is transmitted or received.
The transmitter shifts out the data LSB first. Once the required baud rate is established
(prior to initial communication), the internal clocks of both the transmitter and the
receiver are set to the same frequency (though not the same phase). The receiver
‘synchronizes’ its internal clock to that of the transmitter at the beginning of every data
packet received.
The paper is arranged as follows: after a brief introduction in Section 1, the Verilog design
of the UART is discussed in Section 2. The simulation and synthesis results are presented
and discussed in Section 3, followed by the conclusion in Section 4.

2. System Description
The UART system consists of a transmitter module, receiver module and the baud rate
generator, which are elaborated in the following sections.

2.1 Transmitter Module


The proposed UART transmitter is composed of a memory element, a baud rate circuit,
a bit cell counter, a transmitted bit counter, a shift register and a state machine. The memory
element stores the data bytes to be transmitted. Its size is 20 bytes by default and can be
varied according to the requirement. The baud rate circuit provides the necessary baud clk
to the transmitter; it is explained in Section 2.3. The Xmit bit counter keeps track of the
number of data bits accumulated so far. When this count reaches a preset limit (i.e., 8), the
state machine stops accepting more data bits. This counter has 2 control inputs:
enabitcountH and rstbitCountH. When the former is active high, the counter is advanced
by 1. When the latter is active high, the counter is cleared to 0. The counter is 4 bits wide
by default. The main function of the bit cell counter is to generate a delay in units of uartclk
(baud rate period/16). It is an up counter controlled by the signal countEnableH1. While
countEnableH1 is high, the counter is held in its reset state; while the signal is low, the
counter counts up by 1. Figure 2 shows the functional block diagram and Table 1 lists the
input and output ports of the transmitter.
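The behaviour of the two counters can be sketched as follows. This is a Python model for illustration only and not the authors' Verilog; the class and method names are hypothetical, and both the wrap-around at 4 bits and the priority of reset over enable are assumptions of the sketch.

```python
class BitCellCounter:
    """4 bit up counter clocked by uartclk (illustrative model)."""

    def __init__(self):
        self.value = 0

    def tick(self, reset_h):
        # Control high holds the counter at 0; control low counts up by 1.
        if reset_h:
            self.value = 0
        else:
            self.value = (self.value + 1) & 0xF   # assumed wrap at 4 bits
        return self.value


class BitCounter:
    """4 bit counter with separate enable and reset controls."""

    def __init__(self):
        self.value = 0

    def tick(self, enable_h, reset_h):
        if reset_h:                               # assumed: reset has priority
            self.value = 0
        elif enable_h:
            self.value = (self.value + 1) & 0xF
        return self.value


cell = BitCellCounter()
bits = BitCounter()
for _ in range(8 * 16):                           # 8 data bits, 16 uartclk ticks each
    wrapped = cell.tick(reset_h=False) == 0       # one wrap marks one full bit cell
    bits.tick(enable_h=wrapped, reset_h=False)
print(bits.value)                                 # 8
```

Advancing the bit counter once per bit cell wrap gives the preset limit of 8 after 8*16 uartclk ticks.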


[Figure 2 image: block diagram of the transmitter showing the memory element, the 8 bit shift register, the output MUX (inputs 1’b0, 1’b1 and shift register data, selected by XmitDataSelH) driving uartXmitH, the 4 bit bit cell counter, the 4 bit Xmit bit counter, the state machine, and the baud rate circuit generating uartclk]

Figure 2: Transmitter Block Diagram

Signal     | Description                                                            | Width | Type
sys clk    | Main system clock                                                      | 1     | Input
sysrst1    | Main system reset                                                      | 1     | Input
uartclk    | Baud clock, 16*baud rate                                               | 1     | Input
uartxmitH  | Output of the asynchronous transmitter                                 | 1     | Output
xmitH      | An active high pulse of 1 uartclk starts the transmit process          | 1     | Output
XmitH1     | When active high, shifts the data in memory                            | 1     | Input
Xmitdone   | When active high, indicates that XmitdataH1 has been fully transmitted | 1     | Output
XmitdataH1 | Data to be sent to the remote; sampled when XmitH1 goes high           | 8     | Input

Table 1: Input and Output Ports of the Transmitter

The shift register in the transmitter is an 8 bit parallel-in-serial-out shift register. It is
controlled by two control inputs: loadshiftRegH and shiftEnaH1. When the first signal is
active high, the parallel data is loaded into the shift register. An active high on the latter
signal shifts the loaded data out by 1 bit. A multiplexer (MUX) is present on the output signal
uartxmitH1. It is a 2 to 1 MUX. Its function is to select between the start bit (logic 0), the
user data (from the shift register) and the stop bit (logic 1). The state machine is a simple 5
state Mealy type. Figure 3 illustrates the state flow.

[Figure 3 image: state diagram with the states Reset, Start, Wait, Shift and Stop; transition labels: XmitH1 = 0, bitcellcntrlH != 16, bitcountH = Wordslenght1]

Figure 3: State Machine of the Transmitter

When the system is reset, the state machine defaults to the ‘idle’ state. It remains in this
state as long as no transmit command is given. When xmitH1 becomes active high (i.e., for
a single uartclk pulse), the shift register is loaded and the state machine jumps to the
‘start’ state. In the ‘start’ state, the uartxmitH1 MUX is set to 1’b0 (start bit), and the state
machine waits for 1 baud tick (16 uartclk pulses) before transitioning to the ‘wait’ state. In
the ‘wait’ state, the uartxmitH1 MUX is set to point to the shift register, and the state
machine waits for 1 baud tick. When the wait is complete and all bits (WORDLENGHT1 = 8)
have been transmitted, the state machine transitions to the ‘stop’ state; otherwise it goes
to the ‘shift’ state. In the ‘shift’ state, the shift register is shifted by 1 bit and the state
machine transitions back to the ‘wait’ state. In the ‘stop’ state, the uartxmitH1 MUX is set
to 1’b1 (stop bit), the state machine waits for 1 baud tick and then transitions to the
‘idle’ state.
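The state flow just described can be modelled behaviourally as a sketch. The Python code below is an illustration under stated assumptions and not the authors' Verilog: the names `State`, `transmit`, `TICKS_PER_BIT` and `WORD_LENGTH` are hypothetical, and the ‘shift’ state is assumed to take no extra uartclk time so that every bit cell lasts exactly 16 ticks.

```python
from enum import Enum

TICKS_PER_BIT = 16   # uartclk runs at 16 times the baud rate
WORD_LENGTH = 8      # data bits per frame


class State(Enum):
    IDLE = 0
    START = 1
    WAIT = 2
    SHIFT = 3
    STOP = 4


def transmit(byte):
    """Yield (state, line_level) once per uartclk tick for one frame."""
    shift_reg = byte & 0xFF          # loaded when the transmit command arrives
    bit_count = 0
    state = State.START              # 'idle' -> 'start' on the transmit command
    while state is not State.IDLE:
        if state is State.START:
            for _ in range(TICKS_PER_BIT):
                yield state, 0                   # MUX selects the start bit
            state = State.WAIT
        elif state is State.WAIT:
            for _ in range(TICKS_PER_BIT):
                yield state, shift_reg & 1       # MUX selects the shift register LSB
            bit_count += 1
            state = State.STOP if bit_count == WORD_LENGTH else State.SHIFT
        elif state is State.SHIFT:
            shift_reg >>= 1                      # assumed to take no extra tick
            state = State.WAIT
        elif state is State.STOP:
            for _ in range(TICKS_PER_BIT):
                yield state, 1                   # MUX selects the stop bit
            state = State.IDLE


levels = [level for _, level in transmit(0x53)]
print(len(levels), levels[::TICKS_PER_BIT])   # 160 [0, 1, 1, 0, 0, 1, 0, 1, 0, 1]
```

Taking one sample per bit cell recovers the start bit, the 8 data bits in LSB-first order and the stop bit.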

2.2 Receiver Module


The proposed receiving part of the UART consists of a baud rate circuit, a state machine
which controls the different states of the receiver, a serializer and supporting logic. Figure 4
shows the block diagram of the receiver and Table 2 lists its input and output ports.


[Figure 4 image: block diagram of the receiver showing the synchronizer on uartdataH1, the 8 bit serializer producing recVdataH, the 4 bit bit cell counter, the received bit counter, the state machine, and the baud rate circuit generating uartclk]

Figure 4: Receiver Block Diagram

Signal     | Description                                                           | Width | Type
sys clk    | Main system clock                                                     | 1     | Input
sysrst1    | Main system reset                                                     | 1     | Input
uartclk    | Baud clk, 16*baud rate                                                | 1     | Input
uartdataH1 | Incoming 8 bit data, as transmitted on the transmitter output uartxmitH | 1   | Input
recVdataH  | Serialized data received from the remote                              | 8     | Output
recVreadyH | When active high, indicates that fresh data is available on recVdataH | 1     | Output

Table 2: Input and Output Ports of the Receiver

The functionality of the baud rate circuit remains the same as in the transmitter. The main
function of the receiver is to detect the start bit, convert the following 8 bits of serial data
into parallel form (serialize), detect the stop bit and make the data available to the host.
No parity bit is used in this design, so there is no error checking logic by default. The
ubaud1.v file generates the signal uartclk, which is 16*baud rate. The clocks within the
receiver module are driven by this clock. Before the incoming data, i.e., uartdataH1, is given
to the serializer, it is fed to the synchronizer. The synchronizer is an essential part of the
receiver because the data on uartdataH1 is synchronous to the transmitter’s clock and not to
the receiver’s clock. The serializer functions as a serial-to-parallel shift register. It has one
control input, shiftH1, from the state machine. When this signal is active high, the serializer
shifts. By default the width of the shift register is 8 bits. The LSB is shifted in first. The
received bit counter keeps track of the number of data bits received so far. When this count
equals the predefined limit (i.e., 8), the state machine stops receiving more data bits. This
counter has 2 control inputs: countH1 and rstCountH1. When the former is active high, the
counter is advanced by 1. When the latter is active high, the counter is cleared to 0. Note
that this is a synchronous counter. The counter is 4 bits wide by default. The main function
of the bit cell counter is to generate a delay in units of uartclk (baud rate period/16). It is an
up counter controlled by the signal cntrresetH1. While cntrresetH1 is high, the counter is held
in the reset state; while the signal is low, the counter counts up by 1. The state machine is a
simple 5 state Mealy type (the output is a function of the present state and the input).
Figure 5 shows the state diagram of the receiver state machine.

[Figure 5 image: state diagram with the states Reset, Start, Center, Wait, Sample and Stop; transition labels: recVdatH = 0, recVdatH = 1, bitcellcntrlH != 4, bitcellcntrlH != 16, bitcellcntrlH != 0, recVdbitcntrlH = Wordlenght1]

Figure 5: Receiver State Machine Diagram

The state machine ties together all of the functional units previously described. When the
system is reset, the state machine is in the ‘start’ state by default. In this state the state
machine looks for the start bit, which is detected as a transition of the incoming data (which
idles at logic 1) to logic 0. Once the start bit is detected, the state machine transitions to
the ‘centre’ state. In the ‘centre’ state, it waits for ½ bit cell in order to find the bit cell
centre. A bit cell is 1 baud ‘tick’ and corresponds to 16 uartclk ticks, so ½ bit cell
corresponds to 8 uart pulses. Once the bit cell centre is found (after having waited
4 uart pulses), if recvdataH (the synchronized incoming data) is still low, the state machine
transitions to the ‘wait’ state. If recdataH1 is high, this is not a valid start bit, so the state
machine transitions back to the ‘start’ state. Such a false start can be produced by noise on
the UART data line. The ‘wait’ state simply waits for 1 baud tick (16 uart pulses). Note that
the previous state, ‘centre’, aligned the sampling point to the centre of the start bit cell.
Once 1 baud tick has elapsed, the incoming data can be sampled into the serializer. If all
WORDLENGHT1 (8 by default) bits have been sampled, the state machine transitions to the
‘stop’ state; otherwise, it transitions to the ‘sample’ state.
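The receiver flow can be sketched behaviourally in the same spirit. The Python model below is an illustration and not the authors' Verilog; the function name `receive` and the constants are hypothetical. The half-bit delay is taken as 8 uartclk ticks (half of the 16 ticks per bit cell), the ‘sample’ state is folded into the end of ‘wait’, and the check that the stop bit is at logic 1 is likewise an assumption of this sketch.

```python
TICKS_PER_BIT = 16                  # uartclk ticks per bit cell
HALF_BIT = TICKS_PER_BIT // 2       # assumed delay to reach the bit cell centre
WORD_LENGTH = 8                     # data bits per frame


def receive(samples):
    """Decode bytes from line levels sampled once per uartclk tick."""
    received = []
    state = "start"
    count = 0          # bit cell counter
    bits_seen = 0      # received bit counter
    shift_reg = 0      # serializer contents
    for level in samples:
        if state == "start":
            if level == 0:                       # line left idle: possible start bit
                state, count = "centre", 0
        elif state == "centre":
            count += 1
            if count == HALF_BIT:
                if level == 0:                   # still low: valid start bit
                    state, count, bits_seen, shift_reg = "wait", 0, 0, 0
                else:                            # noise glitch: look again
                    state = "start"
        elif state == "wait":
            count += 1
            if count == TICKS_PER_BIT:           # centre of the next data bit
                # LSB arrives first, so new bits enter at the top and move down.
                shift_reg = (shift_reg >> 1) | (level << (WORD_LENGTH - 1))
                bits_seen += 1
                count = 0
                state = "stop" if bits_seen == WORD_LENGTH else "wait"
        elif state == "stop":
            count += 1
            if count == TICKS_PER_BIT:           # centre of the stop bit
                if level == 1:
                    received.append(shift_reg)
                state = "start"
    return received


frame = [0, 1, 1, 0, 0, 1, 0, 1, 0, 1]           # 0x53 sent LSB first
line = [1] * 20 + [b for b in frame for _ in range(TICKS_PER_BIT)] + [1] * 20
print([hex(b) for b in receive(line)])           # ['0x53']
```

A low pulse shorter than half a bit cell is rejected in the ‘centre’ state, which is the noise case mentioned in the text.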

2.3 Functionality of Baud Rate Generator


The baud rate generator generates uartclk from the external clock (sys clk) of the
system. This clock is used to derive all of the clocks within the transmitter and receiver
modules. The uartclk frequency is equal to 16 times the baud rate. In this design the system
clock is 25.175 MHz and the baud rate is 9,600 bps. The required frequency of uartclk
(16*baud rate) is 9,600*16 = 153,600 Hz. The time interval for 1 bit is 1/9,600 s =
0.1042 ms, and for 8 data bits it is 0.83 ms.
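The arithmetic behind the baud rate generator can be checked with a short calculation. The sketch below reads the baud rate as 9,600 bits per second and assumes that uartclk is obtained by dividing the system clock by the nearest integer; the source does not state the divider used, and the function name `baud_divisor` is hypothetical.

```python
def baud_divisor(sys_clk_hz, baud_rate, oversample=16):
    """Return (divisor, actual uartclk in Hz, error in percent)."""
    target = baud_rate * oversample              # required uartclk frequency
    divisor = round(sys_clk_hz / target)         # assumed: nearest integer divider
    actual = sys_clk_hz / divisor
    error_pct = 100.0 * (actual - target) / target
    return divisor, actual, error_pct


divisor, uartclk_hz, error_pct = baud_divisor(25_175_000, 9_600)
bit_time_us = 1e6 / 9_600                        # duration of one bit
print(divisor)                                   # 164
print(round(uartclk_hz), round(error_pct, 3))
print(round(bit_time_us, 1), round(8 * bit_time_us, 1))   # 104.2 833.3
```

Under these assumptions the divided clock is within about 0.1% of the 153,600 Hz target, and one bit lasts about 104.2 µs, so 8 data bits take about 0.83 ms.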

3. Results and Discussion


The proposed UART has been simulated on ModelSim SE 6.0d and synthesized using Xilinx
ISE 10.1. Figures 6 and 7 show the serial transmission and reception of 8 bit data at a baud
rate of 9,600 bps using a crystal frequency of 25.175 MHz. When the signal XmitH1 is active
high, the serializer is loaded with 8 bit parallel data and converts it into serial form. The
transmitter takes 8*16 clk pulses to transmit the 8 bit data. The signal XmitdoneH goes high
when the 8 bit data has been successfully transmitted. After the first 8 bits have been
transmitted, the transmitter starts transmitting the remaining data. Similarly, the receiver
collects the 8 bit serial data at the uartdataH1 input and converts it into parallel data. When
the receiver has received the first 8 bit data, the signal recvreadyH goes high.

Figure 6: Simulated Waveform of Transmitter

Figure 7: Simulation Results of Receiver

The synthesis flow for the UART has been targeted at two flexible, high-performance FPGA
architectures available from Xilinx, the Virtex-4 FX and Spartan-3 families. Synthesis was
carried out with area as the optimization goal. Table 3 compares the synthesized results for
the two FPGA families, and their comparative graph is shown in Figure 8. The maximum
frequency of operation is 289.151 MHz for Virtex-4 and 155.473 MHz for Spartan-3.
Furthermore, the total power consumption is 268 mW for Virtex-4 and 93 mW for Spartan-3.
The number of slices, LUTs and GCLKs used is 63, 109 and 2, respectively, on the Virtex-4 FX
device and 59, 112 and 25, respectively, on the Spartan-3 device.

Parameters                 | Virtex-4 (vfx12-sf363-10) | Spartan-3 (s1000fg320-4)
Number of Slices           | 63                        | 59
Number of Slice flip-flops | 67                        | 69
Number of LUTs             | 109                       | 112
Number of GCLKs            | 2                         | 25
Minimum Input Delay Time   | 3.204 ns                  | 4.431 ns
Maximum Output Delay Time  | 4.677 ns                  | 7.241 ns
Maximum Frequency          | 289.151 MHz               | 155.473 MHz
Total Power Consumption    | 268 mW                    | 93 mW

Table 3: Synthesized Results of Two FPGA Families

[Figure 8 image: bar chart comparing slices, LUTs, maximum frequency and total power for the two devices; vertical axis from 0 to 350]

Figure 8: Comparison Chart of Parameters for Virtex4fx & Spartan3sfg FPGA Devices

4. Conclusion
The proposed UART system has been designed in Verilog using a high-level design method.
All modules have been simulated using ModelSim SE 6.0d and implemented using the Xilinx
ISE 10.1 tool. A Virtex-4 device in a 363-pin package and a Spartan-3 device in a 320-pin
package have been used as target devices. From the comparative analysis of the two devices
we conclude that there is a small difference in the number of slices, LUTs and GCLKs but a
large difference in total power consumption. The power consumption of the Spartan-3 FG is
93 mW, which is less than the 268 mW of the Virtex-4 FX. Hence it is concluded that the
Spartan-3 FG is a better choice for the implementation of the proposed UART system.
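The size of the differences can be made concrete by computing ratios from the figures reported in Table 3. The short Python sketch below is only an arithmetic check and adds no new measurements; the function name `ratios` is hypothetical.

```python
def ratios(results):
    """Return the Virtex-4 to Spartan-3 ratio for each reported parameter."""
    return {name: v4 / s3 for name, (v4, s3) in results.items()}


# Values copied from Table 3 as (Virtex-4, Spartan-3).
table3 = {
    "slices": (63, 59),
    "slice flip-flops": (67, 69),
    "LUTs": (109, 112),
    "max frequency (MHz)": (289.151, 155.473),
    "total power (mW)": (268, 93),
}

for name, ratio in ratios(table3).items():
    print(f"{name:20s} Virtex-4 / Spartan-3 = {ratio:.2f}")
```

On these numbers the slice and LUT counts differ by less than 10%, while the Virtex-4 device draws about 2.9 times the power and reaches about 1.9 times the maximum frequency.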

References

1. Martin S Michael (n.d.), “A Comparison of the INS8250, NS16450 and NS16550A
Series of UARTs”, National Semiconductor Application Note 493, April.

2. Harvey M S (1999), Generic UART Manual, Silicon Valley, December.

3. Mohd. Yamani Idna Idris and Mashkuri Yaccob (2003), “A VHDL Implementation of
BIST Technique in UART Design”, in TENCON, Conference on Convergent
Technologies for Asia-Pacific Region, 5-17 October.

4. Chig-Chang Wong and Yu-Han Lin (2006), “A Reusable UART IP Design and its
Application in Mobile Robots”, in Jonas Buchli (Ed), Mobile Robots, Moving Intelligence,
pp. 576, ARS/plV, December, Germany.

5. Norhuzaimin J and Maimun H H (2005), “The Design of High Speed UART”, Asia-
Pacific Conference on Applied Electromagnetic (APACE 2005), December.

6. Thomas Oelsner (n.d.), “Digital UART Design in HDL”, QuickLogic, Europe.

Reference # 30J-2009-12-0x-01
