ECE 224 Notes

Section II: Embedded Systems


Embedded Systems & Design
 Embedded system: a computer system designed to perform a task without user knowledge of
its existence
o Users might provide input via sensors and receive output, but they don’t need to be
aware of the presence of the embedded system itself
o Examples include consumer electronics, household appliances, automotive systems, etc.
o Products including cars, calculators, and cameras have evolved from being mostly
mechanical to almost entirely electronics/informatics
 Characteristics of embedded systems:
o Multidisciplinary design problems
o Complex design interactions
o Unpredictable external environment
o Security and reliability are very important

Computer Technology
 Hardware: mechanical or electronic components of a computer
 Software: programs or other operating information used by a computer
 Hardware/Software Co-design: task of simultaneously designing hardware and software
components in a combined system

 Microprocessor: consists of only a CPU, no main memory or built-in support for I/O devices
o General-purpose CPU that can include multiple processor cores
o Delivers high performance
o Requires (off-chip) I/O devices to implement a complete computer
 Microcontroller: a complete, single-chip computer that consists of a CPU, memory, and some
I/O devices
o Specialized CPU to control a mechanical or electronic system
o Built-in memory storage
o Small and cost-efficient
o Specialized, built-in interface support for devices (e.g. parallel, analog, serial)
o Designed to meet needs of wide range of applications
 System-on-a-Chip (SoC): a user-designed, fully functional system implemented on a single chip;
may contain CPU, memory, I/O devices, and other digital logic
o Typically contains:
 Similar functionality to a microprocessor or microcontroller, implemented as
either hardware or software
 Communication ports
 Volatile storage (e.g. RAM)
 Non-volatile storage (e.g. ROM)
 Other components such as timers, parallel interfaces, analog to digital
converters, etc.
 Programmable Logic Device (PLD): a digital logic chip that permits configuring and
interconnecting internal logic blocks to form a custom digital circuit; can be rewired using
primitive building blocks to implement custom circuits
o Primitive building blocks: flip-flops, multiplexers, LUTs, adders, multipliers, and RAM
blocks
o Configuration storage technologies: SRAM, EEPROM, NOR Flash
o PLDs can be one-time programmable vs. reconfigurable, as well as contain in-system
programmability vs. external programming hardware
 System-on-a-Programmable-Chip (SoPC): a SoC that is implemented using a PLD
o Advantages over SoCs:
 Flexible
 Upgradable
o Disadvantages compared to SoCs:
 Can be slower
 More expensive in large quantities

Section III: Software, Synchronization, and Device Drivers


Software Development and Synchronization
 Good software is modular
 Low level functions typically interact directly with
hardware devices to:
1. Initialize device configurations
2. Read/write data
3. Synchronize (polling or interrupts)
 When a device completes its task, the device must
synchronize with the processor
 Events can be assigned priority, to help minimize
latency of high priority events
 Latency: delay between arrival of a request and completion of service
o CPU latency: the time between receiving a service request and initiating service
 Involves both hardware and software delays
o Device latency: the time between requesting service and receiving service
 Real-time system: a system that guarantees a certain worst-case latency for critical events
 Throughput: measure of number of items processed per unit of time

Synchronization Mechanisms
 Blind cycle: software waits for a fixed amount of time, then acts on data regardless of whether
the device is ready
 Occasional polling: device status is checked at convenience of the designer
 Periodic polling: device is checked after a set amount of time, repeating until device is done –
generally uses a timer interrupt
 Tight polling loop (busy waiting): software continuously checks I/O status until device is ready
 Interrupt handling: device generates hardware interrupts to request service

 Blind cycle, occasional polling, and periodic polling are CPU-oriented – device waits for CPU to
initiate synchronization
 Tight polling and interrupt handling are device-oriented – the device demands service, reducing
device latency

Polling Loop Synchronization – Input Data


 Poll device, wait until data is available, then read input data
while (not data_available) loop
read data
clear data_available
process data
return
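
The input polling loop above can be sketched as a small Python simulation. This is a minimal sketch, not the course's code: `SimulatedInputDevice`, `check_status`, and `polled_read` are hypothetical names, and the device is assumed to become ready after a fixed number of status checks.

```python
class SimulatedInputDevice:
    """Hypothetical device whose data becomes available after a few status checks."""

    def __init__(self, value, polls_until_ready):
        self._value = value
        self._polls_left = polls_until_ready
        self.data_available = False

    def check_status(self):
        # Each status read brings the simulated device one step closer to ready.
        if self._polls_left > 0:
            self._polls_left -= 1
        else:
            self.data_available = True
        return self.data_available

    def read_data(self):
        return self._value


def polled_read(device):
    """Tight polling loop: wait for data_available, read, then clear the flag."""
    failed_polls = 0
    while not device.check_status():   # while (not data_available) loop
        failed_polls += 1
    data = device.read_data()          # read data
    device.data_available = False      # clear data_available
    return data, failed_polls


data, failed_polls = polled_read(SimulatedInputDevice(0x41, polls_until_ready=3))
```

The count of failed polls is CPU time spent doing no useful work, which is the cost of busy waiting.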

Polling Loop Synchronization – Output Data


 Conservative: assume device isn’t initially ready, so poll device until it’s ready, then output the
data
while (not output_ready) loop
clear output_ready
output data
return
 Optimistic: assume device is already ready, output data, poll device, and wait until device is
ready
clear output_ready
output data
while (not output_ready) loop
return
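
The difference between the conservative and optimistic output loops can be sketched as follows. This is an illustrative simulation under stated assumptions: `SimulatedOutputDevice` is a hypothetical device that stays busy for a fixed number of status checks after each write.

```python
class SimulatedOutputDevice:
    """Hypothetical output device that is busy for `busy_polls` status checks per item."""

    def __init__(self, busy_polls, initially_ready):
        self.busy_polls = busy_polls
        self._remaining = 0 if initially_ready else busy_polls
        self.written = []

    def output_ready(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def write(self, item):
        # Writing clears output_ready: the device is busy until it finishes the item.
        self.written.append(item)
        self._remaining = self.busy_polls


def conservative_write(dev, item):
    while not dev.output_ready():   # poll first: device may not be ready
        pass
    dev.write(item)                 # then output and return immediately


def optimistic_write(dev, item):
    dev.write(item)                 # assume device is ready on entry
    while not dev.output_ready():   # wait here until the device has finished
        pass
```

In this sketch the conservative version returns while the device is still busy, so the CPU can do other work in the meantime; the optimistic version returns only once the device is ready again, and is only safe if the device is ready on entry.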

Interrupt Synchronization
1. Device requests interrupt to CPU
2. CPU completes execution of current instruction
3. CPU suspends execution of main program
4. Interrupts may be disabled
5. Internal registers saved to stack
6. Device may be acknowledged
7. ISR is selected
8. ISR is executed
9. Registers are restored
10. Interrupts are enabled (if disabled)
11. CPU resumes execution of main program

CPU Interrupt Notification


 Interrupts must be handled from multiple sources

Single interrupt request line

 Advantages: fewer wires and pins

Multiple interrupt request lines

Interrupt Service Routine Selection


Non-vectored interrupts

 Devices are polled to determine source and priority


 General ISR selects device-specific handler to execute
Vectored interrupts

 Each request is associated with an interrupt vector, which has a fixed priority
 ISR at vector address is executed
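
The two ISR selection schemes can be contrasted in a short sketch. The handler names and the vector number are hypothetical; real hardware performs the vectored lookup itself rather than in software.

```python
def make_handler(name, log):
    """Build a device-specific handler that records its name when it runs."""
    def handler():
        log.append(name)
    return handler


def nonvectored_dispatch(devices):
    """General ISR: poll devices in priority order and run the first requester's handler.

    `devices` is a list of (is_requesting, handler) pairs, highest priority first.
    """
    for is_requesting, handler in devices:
        if is_requesting:
            handler()
            return True
    return False   # no device was requesting service (spurious interrupt)


def vectored_dispatch(vector_table, vector_number):
    """Vectored: the request identifies its vector, so the handler is looked up directly."""
    vector_table[vector_number]()
```

Polling order doubles as priority in the non-vectored case, while the vectored case needs no polling at all.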

Interrupt Service Routines


 ISRs should execute as fast as possible since they are interrupting other tasks
 ISRs should avoid blocking synchronous I/O functions
 General structure:
1. Save any registers that will be modified during ISR
2. Acknowledge device
3. Re-enable interrupts to allow same/higher-priority interrupts
4. Test for a valid interrupt and determine its source
5. Complete desired action
6. Restore registers
7. Return from ISR
 Initializing system software that uses interrupts:
1. Disable all interrupts
2. Enable device interface interrupts (by setting appropriate registers)
3. Set interrupt mask, to allow interrupts from device
4. Initialize interrupt vector with address of ISR
5. Enable interrupts

Device Drivers
 Device driver: software supplied with a particular device that lets the system control and communicate with it
 Includes:
o Data structures
 Variables to access device interface registers
 Variables to track state of device
 Data buffers
o Initialization functions
 Device initialization
 Synchronization initialization
 Driver variables initialization
o I/O functions
o ISRs
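
The driver components listed above can be arranged as in the following sketch. Everything here is hypothetical: the class name, the register names `CONTROL`, `STATUS`, and `DATA`, and the use of a dictionary as a stand-in for memory-mapped interface registers.

```python
from collections import deque


class ReceiverDriverSketch:
    """Illustrative driver layout: data structures, initialization, I/O function, ISR."""

    def __init__(self, registers):
        self.regs = registers       # variables to access device interface registers
        self.initialized = False    # variable tracking driver/device state
        self.rx_buffer = deque()    # data buffer

    def init(self):
        self.regs["CONTROL"] = 0x01   # device + synchronization init (hypothetical enable bit)
        self.rx_buffer.clear()        # driver variables init
        self.initialized = True

    def isr(self):
        # Keep the ISR short: copy the data out and acknowledge the device.
        self.rx_buffer.append(self.regs["DATA"])
        self.regs["STATUS"] = 0

    def read(self):
        """Non-blocking I/O function: return buffered data, or None if empty."""
        return self.rx_buffer.popleft() if self.rx_buffer else None
```

The ISR only moves data into the buffer, leaving processing to the caller of `read`.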
Section IV: Synchronization, Data Generation, and Data Transfer
 Generally, during interface communication, the two sides of the interface (producer and
consumer) likely have different views of time and perform independent tasks except when
communicating with each other

Producer/Consumer Model of Data Transfer

 Producer: hardware/software component that’s responsible for producing data and/or events
 Consumer: hardware/software component that’s responsible for consuming data and/or events
 Data: state information transferred from producer to consumer
 Event: control information transferred from producer to consumer to indicate the occurrence of
an activity

Producer/Consumer Communication
 Event only: event occurs and the occurrence is transferred from producer to consumer
 Event and data: an event occurs, so the event and some associated data are transferred to the
consumer
 Data only: a data value is produced but the consumer isn’t notified of the data; consumer can
read data at any time

Synchronization Hierarchy
 Synchronization is considered at several levels:
1. Data generation: how is data creation controlled, started/stopped?
2. Data notification/Initiation of transfer: once producer has data, how does it notify the
consumer? Does the consumer request the data?
3. Data transfer: once producer has data and consumer is ready, how is
synchronization of the data transfer handled?
 Synchronization: interaction required to make two entities with different views of time interact
o Active synchronization: one of the entities is able to force a change in the operational
characteristics of the other
o Passive synchronization: one of the entities signals a request for service; the other is
not required to respond
 Synchronization needs
o Relationship between/amongst entities (how many, master/slave vs. equal entities)
o Level of required service
 Active, demand-oriented: event must be serviced
 Passive, request-oriented: event may be serviced
Data Generation
 Creation of data can be initiated by both the producer and consumer, but requires action by the
producer
 Spontaneous sources: data is produced by the device, independent of the actions of the
consumer accepting it
 Consumer sensitive sources: data is produced by the device, only after data is consumed by the
consumer (consumer acknowledges consumption)
 Consumer responsive sources: data is produced by the device, only after requested by the
consumer

Notification/Initiation of Transfer
 Consumer initiated scenarios: consumer requests data, data becomes/is ready, request is
completed
o Passive synchronization example: polling for a key press, and once it’s pressed, data is
consumed
o Active synchronization example: interrupt from printer to indicate that it’s ready for the
next item
 Producer initiated scenarios: data is available, data is accepted by consumer, transfer is
completed
o Passive synchronization example: polling for printer to be ready for the next character,
and once it is ready, data transfer occurs
o Active synchronization example: keyboard sends interrupt to indicate it has the next
character for consumption
Data Transfer
 Data transfer: exchange of information between two entities that may have different views of
time
 Data transfer considerations:
o Data persistence: how long the data is valid for transferring between entities
o Time synchronization/clocking: how signals representing any data are enabled to make
a transfer happen
o Control signal: how control information is exchanged between communicating entities

 Persistent data: information remains valid until consumer indicates that it has consumed it
o Requires feedback path
 Transient data: information is made available to consumer, and only valid for a period of time
o Synchronized buses

Blind Synchronization (Independent Timing)

 Consumer reads data at its own pace, regardless of when the producer changes it


 Errors can occur if data changes too close to sample time
 Some data values may be missed, others read multiple times
o Useful for devices where a missed reading doesn’t really matter, like a thermometer

Synchronous Systems (Common View of Time)

 Producer and consumer share common clock


 Producer and consumer use different edges to ensure data validity
 Data may be sampled multiple times, unless transfers are limited to one per clock period

Asynchronous Systems (Different Views of Time)


 Producer informs consumer of data validity
 Producer-driven consumer clock – timing inferred from data validity
 Data is only sampled once

Generalized I/O Overview

 Global initialization: set processor and I/O interface parameters once for overall system
operation
o Initializing vectored interrupt tables, specifying port directions, etc.
 Transfer initialization: set processor and I/O interface parameters once per transfer to facilitate
a specific transfer
o Setting memory location to provide data for transfer, setting the block and track number
of disk transfer, etc.
 Data transfer: synchronization before and after data transfer

Impact of Data Generation


 $t_{\text{transfer,producer}}$: time it takes for producer to transfer one data unit
 $t_{\text{transfer,consumer}}$: time it takes for consumer to transfer one data unit
 $t_{\text{wait}}$: time required by initiator (CPU) to synchronize with the target (device) and recognize
availability of new data
 $t_{\text{sync-poll}}$: time spent by initiator (CPU) to actively poll and detect new data available at a target
(device)
 $t_{\text{sync-inter}}$: time spent by initiator (CPU) to interrupt its execution in response to new data available
at the target (device)
 $t_{\text{interdata}}$: time between generation of consecutive data blocks by the producer

Impact of Notification/Initiation
 Assume data source is consumer sensitive, and that each block contains one data unit
 Since $t_{\text{transfer,producer}} < t_{\text{transfer,consumer}}$, $t_{\text{transfer}}$ (total transfer time) $= t_{\text{transfer,consumer}}$
 Transfer is consumer-initiated since $t_{\text{transfer,consumer}}$ starts before and ends after $t_{\text{transfer,producer}}$
Block Read Transfers

 Assume data source is consumer sensitive, and that each block contains two data units

Estimating Synchronization Times of Block Transfers


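$$t_{\text{transfer}} = \max\left(t_{\text{transfer,producer}},\ t_{\text{transfer,consumer}}\right)$$

$$t_{\text{wait}} = \begin{cases} t_{\text{sync-poll}} \\ t_{\text{sync-inter}} + t_{\text{interdata}} \end{cases} \quad \text{(differs for polling and interrupts)}$$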

 n = number of data units in one block


 For transfer of a block of 256 data units:
o n=1 means synchronization occurs after transfer of every 1 data unit, for 256 total
synchronizations per block
o n=256 means synchronization occurs after transfer of each block, for 1 synchronization
per block
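
The block-transfer arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and only the two relations stated in the notes are used (transfer time is the larger of the producer and consumer times, and one synchronization occurs per n data units).

```python
import math


def transfer_time(t_transfer_producer, t_transfer_consumer):
    """Per-unit transfer time is set by the slower of the two sides."""
    return max(t_transfer_producer, t_transfer_consumer)


def synchronizations_per_block(block_units, n):
    """Number of synchronizations when syncing once per n data units."""
    return math.ceil(block_units / n)


per_unit_sync = synchronizations_per_block(256, 1)     # 256 synchronizations
per_block_sync = synchronizations_per_block(256, 256)  # 1 synchronization
```

Larger n reduces the number of synchronizations per block, so any fixed per-synchronization cost is paid less often.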

Section V: Computer Structure


Computer Organization
System Bus
 System bus: set of wires shared by several components that need to communicate with each
other
o Buses exchange information using a standardized protocol
o Reduces the total number of wires required by the system
o Building blocks of a system bus:
 Technique for selecting between 2+ communicating entities
 Set of wires/signals to transfer data
 Set of wires/signals to control data transfer

Memory Mapped I/O


 I/O devices are connected to the system bus similar to memory

 I/O devices can also be connected to a separate I/O bus that isn’t connected to memory
Memory
 Memory connects to CPU through 1+ buses
 Instructions and data are both stored in memory
o They can only be distinguished by context
 Stored memory values are accessed using their memory addresses
 Read Only Memory (ROM): contains sequences of instructions necessary to put processor in
start-up state, which can be the final state or a starting point to boot into a more functional OS
 Random Access Memory (RAM): used to store values and programs that may change
 There is no distinction between ROM and RAM addresses

Central Processing Unit (CPU)


 Consists of the following components:
o Arithmetic and Logic Unit (ALU): performs required operations based on instructions
o Control unit: interprets instructions and generates sequence of operations
o Registers: stores state of system
 General purpose registers: internal storage of intermediate results
 Special purpose registers: stores control/status information
 Program counter (PC): address of next instruction to read
 Instruction register (IR): current instruction
 Program status register (PSR): processor status (flags, etc.)
 Stack pointer (SP): address of top of stack
 Memory address register (MAR) / Memory data register (MDR): buffer
data between CPU and memory via the system bus

Wire Signal Representations


Single Wire Signals
Multiple Wire Signals

Control Signals
 Many operations are triggered by 1+ control signals
 Control signals can be active high or active low triggered
o Active high: indicates certain condition when set to 1
o Active low: indicates certain condition when set to 0
 Control signals can be rising or falling edge triggered
o Rising edge: changes from low to high voltage
o Falling edge: changes from high to low voltage

Clock Signals
 Special signal that synchronizes 2+ devices
 They are (rising or falling) edge-triggered
 They are periodic, with a fixed duty cycle (fraction of each period that the signal is high)
 System clock signals are typically periodic with duty cycles of ~50%
o Input signals are sampled at predictable times with respect to the system clock
o Output signals change at predictable times with respect to the system clock
 Some clock signals aren’t periodic, like register clock signals, where active edges only occur
when needed to trigger an event
 Multiphase clock signals have been proposed to increase synchronization opportunities

Clock Signals Involved in Memory Interface Operations

 Clock enable signals: MDRin, MARin


 Tri-state enable signals: MDRout
 Direction control signals: READ, WRITE
 Acknowledgement signals: MFC (Memory Function Complete)
o MFC exists logically, but may not exist physically
MDR Bus Connections

CPU Memory Timing Interactions

 CPU generates the following signals:


o MARin
o MDRin/MDRout
o READ/WRITE
 Device generates MFC
 CPU and memory often differ significantly in performance
o E.g. in above example, assuming 50ns memory access time, there’s a 5x performance
difference
 During a normal read operation, CPU asserts address value, memory eventually responds with
corresponding data value
Synchronous Option (Blind Synchronization)

 Suppose memory requires 6 clock periods to respond to a CPU’s read request


 With synchronous option:
o CPU begins operation with a synchronization pulse
o CPU assumes data is valid after 6 clock periods
o Memory must provide valid data prior to end of 6th clock period
o Data is transient – after 6 clock periods, data is gone

Asynchronous Option (Non-Blind Synchronization)

 Suppose memory requires 6 clock periods to respond to a CPU’s read request


 With asynchronous option:
o CPU and memory exchange synchronization signals
o CPU begins operation with a synchronization pulse
o Memory responds with a synchronization pulse (MFC) when data is valid
o Address is persistent (valid) until finish signal
Sharing Bus Lines with Multiple Devices

 To share a bus line, the following steps are necessary:


1. Select at most one output driver to output at one time
 Use addressing, arbitration, or time-multiplexing
2. Connect output drivers so that conflicts do not result in large current flows
 Use resistances to limit current

Types of Output Drivers


 Totem pole output drivers – used by all logic gates
o Capable of actively driving 0 or 1
o Not capable of high impedance state (so not suitable for bus sharing)
 Open-collector output drivers – indicated by o/c on a gate
o Capable of actively driving 0
o When not driving 0, it is in high impedance state
o High impedance state can passively drive a value of 1 using a pull-up resistor connected
to a voltage supply
 Tri-state output drivers – indicated by enable signal on a gate
o Capable of actively driving 0 or 1 when enabled
o When not enabled, in high impedance state
o Ideal for bus sharing

Sourcing and Sinking Current


 Sourcing current: output driver connects to a voltage source
 Sinking current: output driver connects to a voltage sink

 During high impedance state, no voltage source or sink is connected

Totem Pole Output Drivers

Truth table:

i o
0 0
1 1

 Logic must be implemented to prevent active connections to both the power supply voltage and
ground at the same time
o Permanent: use passive pull-up or pull-down resistor to limit current flow in all possible
paths
o Temporary: disable device by turning off all its transistor-controlled connections to the
power source and ground
Open-Collector Output Driver

Truth table:

i o
0 0
1 Z

Tri-State Output Driver

Truth table:

i e o
0 0 Z
1 0 Z
0 1 0
1 1 1

Bus Conflict Resolutions


 Conflicts between active pull-ups and active pull-downs = unknown
 Conflicts between active pull-ups and passive pull-downs = 1
 Conflicts between passive pull-ups and active pull-downs = 0
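
The three conflict rules above can be captured in one function. This is a sketch with hypothetical names: active drivers are given as a list of 0/1 values, a passive resistor as an optional pull value, and "X" and "Z" are used here to denote unknown and high impedance.

```python
def resolve_bus(active_values, passive_pull=None):
    """Resolve the logic level of a shared bus line.

    active_values: values (0 or 1) from drivers that are actively driving the line.
    passive_pull:  1 for a pull-up resistor, 0 for a pull-down, None if absent.
    """
    driven = set(active_values)
    if driven == {0, 1}:
        return "X"              # active pull-up vs active pull-down: unknown
    if driven:
        return driven.pop()     # an active driver overrides a passive resistor
    if passive_pull is not None:
        return passive_pull     # nobody drives: the resistor sets the level
    return "Z"                  # nothing connected: high impedance
```

For example, `resolve_bus([0], passive_pull=1)` gives 0, matching the rule that an active pull-down wins over a passive pull-up.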

Device Selection
 If a bus line has more than one possible driver, the system must select which device drives the bus
 Explicit selection
o Selects unique device based on available bus signals
 Address values
 Arbitration signal
 Timing signal
 Event signal
o Used for general purpose buses
 Implicit selection
o Resolves bus conflicts in predictable manner, using wired logic
 Wired AND logic = 1 if all drivers output a 1
 Wired OR logic = 1 if any driver outputs a 1
o Used for special applications (e.g. IRQ lines)
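
Wired logic can be modelled directly from the two definitions above. This sketch assumes the usual arrangement of passive pull-up drivers for wired AND and passive pull-down drivers for wired OR; the function names are hypothetical.

```python
def wired_and(driver_outputs):
    """Passive pull-up line: any driver outputting 0 pulls the line low."""
    return 0 if 0 in driver_outputs else 1


def wired_or(driver_outputs):
    """Passive pull-down line: any driver outputting 1 pulls the line high."""
    return 1 if 1 in driver_outputs else 0
```

No explicit selection is needed: several devices can share the line and the result is still well defined.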

Bus Timing Example – Read Operation


 Assume system uses synchronous bus with the following data transfer signals:
o CLOCK: clock signal
o A[15..0]: 16-bit address bus signals
o D[15..0]: 16-bit data bus signals
o R/W: read (1) / write (0) signal
 Since address bus is 16 bits wide, system supports 2^16 = 65,536 addresses
 Since data bus is 16 bits wide, system word size is 16 bits
 Time A: CPU (bus master) drives address and read signal onto their bus signals
 Time B: Memory (bus slave) assumes bus signals are correct
o Decodes address and fetches data between time B and time C
 Time C: Memory (bus slave) drives data bus signals with requested value

Address Decoding
 Centralized vs. decentralized
o Centralized: system has centralized address and timing decoder to enable devices
o Decentralized: system embeds all address and timing decode logic into circuitry
 Address aliasing
o Aliasing: portion of address bus signals can be ignored by the system
o No aliasing: all address bus signals are considered by the system
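
Address aliasing can be illustrated with a 16-bit address bus. The numbers are assumptions chosen for the example: a device with 4 registers at base address 0x8000, and a partial decoder that only examines the top four address lines.

```python
def full_decode(address, base=0x8000, size=4):
    """No aliasing: every address bit is considered, so only base..base+size-1 match."""
    return base <= address < base + size


def partial_decode(address, base=0x8000, decoded_mask=0xF000):
    """Aliasing: only the bits in decoded_mask are compared; the rest are ignored."""
    return (address & decoded_mask) == (base & decoded_mask)


def register_index(address, size=4):
    """Low address bits select one of the device's registers (size is a power of 2)."""
    return address & (size - 1)
```

With partial decoding, 0x8102 selects the same register as 0x8002, so the device appears at many addresses in exchange for simpler decode logic.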
Bus Timing Example – Write Operation

 Time A: CPU (bus master) puts address, data, and write signal onto the appropriate bus signals
 Time B: Memory (bus slave) assumes valid bus signals
 Time C: Memory (bus slave) stores data in appropriate register or memory location

Metastability
 Metastability: an unstable, indeterminate output state that can occur when setup or hold times
are violated, i.e. when data changes during the setup or hold window
o Can never be completely eliminated from a computer system, since external inputs are
unpredictable
o Synchronization chains can reduce probability of metastable signals

Characteristics of a Synchronous Bus


 Common clock / view of time available to all devices
 Control line indicates read or write operation
 Assume all operations have the same completion time
 No feedback from consumer to producer to alter transfer rate or duration of valid data

Bus Transfer Terms


 tPA – Address propagation delay: time for address to propagate from bus master to all slaves
 tPD – Data propagation delay: time for data to propagate from data source to all potential data
receivers
 tP – Bus propagation delay: maximum propagation delay of all bus signals = MAX(tPA, tPD)
 tSetup – Setup time: minimum time that a signal has to be available at the input to buffer before
an active clock edge
o Read: specified at master
o Write: specified at slave
 tHold – Hold time: minimum time that a signal has to be available (held stable) after the clock
edge triggers a transfer
o Read: specified at master
o Write: specified at slave
 tSelect/tS – Select time: time required for a device attached to the bus to recognize that the
current transfer involves its device interface
o Doesn’t include time required by device to perform register selection
 tAccess – Access time: time required for a device interface to access the requested information
after it has been selected
 tStore – Store time: time required for a device interface to capture/store data after it has been
selected
o Read operation:

o Write operation:

 tSkew – Skew time: maximum difference in propagation times of signals that ideally occur
simultaneously
 tM1/tM2 – Margin time: extra time reserved by designer to allow for unexpected variables in
performance, ensuring reliable operation

Skew
 Sources of skew:
o Minor differences in propagation delays due to wire length
o Differences in logic gate delays
o Differences in rise and fall times due to capacitive effects

Synchronous Bus Read – Timing Diagram

 Master Edge
o CPU puts address on bus: tPA + tSkew
o tM1 until slave edge
 Slave Edge
o Device is selected and retrieves data: tSelect + tAccess
o Device puts data on bus: tPD + tSkew
o tM2 until master edge
 Master Edge
o tSetup before master edge
o tHold after master edge
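
The read timing above can be turned into a rough budget for each half of the cycle. This is a sketch under a stated assumption: the margin tM2 is taken to end where the setup window begins, so tSetup is added on top of it. The function name and the example numbers are hypothetical.

```python
def read_cycle_budget(t_pa, t_skew, t_select, t_access, t_pd, t_setup,
                      t_m1=0, t_m2=0):
    """Return (master-to-slave time, slave-to-master time, total) for a bus read."""
    # Master edge to slave edge: address propagates, plus skew and margin.
    master_to_slave = t_pa + t_skew + t_m1
    # Slave edge to next master edge: select, access, data propagation, skew,
    # margin, and (by assumption) the setup time before the master edge.
    slave_to_master = t_select + t_access + t_pd + t_skew + t_m2 + t_setup
    return master_to_slave, slave_to_master, master_to_slave + slave_to_master


# Example with made-up delays in nanoseconds.
phases = read_cycle_budget(t_pa=4, t_skew=1, t_select=3, t_access=20,
                           t_pd=4, t_setup=2, t_m1=1, t_m2=1)
```

With these example values the two phases need 6 ns and 31 ns, so the access time dominates the cycle.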
Synchronous Bus Write – Timing Diagram

 Master Edge
o CPU puts address and data on bus: tP (for both address and data to be valid) + tSkew
o tM1 until slave edge
 Slave Edge
o Device is selected and stores data: tSelect + tStore
o tM2 until master edge
 Master Edge
o tSetup before master edge
o tHold after master edge

Register Clock Derivation for Write Operation


 Clock signal is influenced by:
o CLOCK
o Address Decoded
o R/W signal

 Normal clock:
 Register clock:
o Inverted so that data is clocked on rising edge

Section VI: Parallel Interfacing


Parallel Interfaces

 Functions:
o Synchronizes device with system
o Alters signal levels
o Encodes data
o Buffers data
 Parallel interfaces help convert processing domain (operating in GHz–MHz) to device domain
(operating in Hz) – large time discrepancy

Properties of Parallel Interfaces


Property        Processor Side             Device Side
Signal levels   Processor standard         Device standard
Timing          Memory-like                Any
Signals         RD and WR, or R/W and CK   Any
Delays          Fixed/known                Device dependent

 From a system view:


o Known: bus lines, address, data, control
o Unknown: signal lines, device/interface

System Bus Side


 Signal groups:
o Data: bi-directional between processor and devices
o Selection: technique to select memory location or I/O device address
o Control: synchronizes transfers
 Data transfer synchronization signals (e.g. R/W, CLK)
 Bus control signals (e.g. Bus Request, Bus Grant)
 Processor arbitration signals (e.g. IRQ)
 Assumptions:
o Memory-mapped I/O device
o Interface has 4 memory-mapped registers
o Synchronous bus, i.e. global clock
o Timing:
Device-Side (Data) Alternatives
 Unidirectional – low cost/complexity
 Bi-directional – versatile
o Control
 Explicit: use Data Direction Register
 Implicit: no Data Direction Register
o Implementation
 Pseudo bi-directional: implicit only
 Tri-state: explicit only
 Passive pull-up (open collector): implicit or explicit
o Importance of changing directions
 Static: bi-directional port only configured during initialization, operates as a
unidirectional port
 Dynamic: bi-directional port is re-configured frequently during operation

Driver Alternatives

 Totem-pole driver: both switches are active


o Logic-1 is applied by closing the pull-up switch and opening the pull-down switch
o One of the two switches is always closed, making it not useful in shared signal lines
 Passive pull-up driver (open collector): pull-up switch replaced with resistor
o Can be pulled low for logic-0
o Drives logic-1 if no driver pulls it low
 Passive pull-down driver (open emitter): pull-down switch replaced with resistor
o Can be pulled up for logic-1
o Drives logic-0 if no driver pulls it up
 Tri-state driver: both switches are active, but control logic permits both switches to be off at
the same time – high impedance (Z)
Data (No Synchronization) – Unidirectional, Data Only Input

 Enable:

Data (Transient Data) – Unidirectional (Data and Event) Input


Data (No Synchronization) – Unidirectional Output

 Clock 1:
 Clock 2:

Explicit Bidirectional
Implicit Bidirectional
Explicit Direction Control using Passive Pull-up
Implicit Direction Control (Pseudo Bidirectional)
Data Characteristics
 When two systems communicate, need to consider:
o Signal translation
o Synchronization
 Data can be:
o Persistent: remains valid until consumer explicitly accepts data
o Transient: data will vanish if not read at the appropriate time
 Data source may either inform or not inform receiver that data is present
 Receiver may or may not request new data when it is ready

Control Signalling
 Two techniques to pass control information between two systems:
o In-band: some or all control information is passed the same way that data is transferred
o Out-of-band: control signalling is done with techniques that cannot be confused with
data, e.g. signals at frequency where data isn’t found, values that data will never be,
extra signal lines, etc.
 Signalling of data changes:
o In-band: value of data line differing from previous value indicates that data has
changed; special value can be inserted in between consecutive values to indicate change
o Out-of-band: use an extra signal line to indicate that new data is available

Persistence of Data

 Persistent data can be accepted at convenience of receiver; after data is received, source is
informed that data is no longer needed

o Data available triggers Valid


o Once data is received, Accept is set to 1, which triggers Valid to go to 0 and Data to end
availability
o Accept is set back to 0

 Transient data must be accepted within a source-specified time limit after it signals its
availability; this can be done with a latch at the receiver
o What happens if consumption is too slow?
 Error
 Only keep first read value (discard new value)
 Remember the new value

 Bounce: noise that occurs when switches change state


 Single Pole Double Throw (SPDT) solution assumes the switch won’t bounce from one throw to
the other; it only bounces on and off from one throw at a time
Switch Debouncing Options
 Bouncing duration is usually unknown and can change over time
 Software approaches:
o Wait a fixed amount of time, using a delay loop
o Wait until signal is stable
 Hardware approaches:
o Wait a fixed amount of time, using a counter or shift register
o Slow down signal transitions with a one-shot capacitor
o Using a more expensive switch attached to an RS latch

Synchronization – Control Line Issues


 Input synchronization issues:
o How does device inform CPU of new input?
 Interrupts or polling
o What function should be performed when new input is available?
 Inform CPU or clock data into register
o Which control edge is the active edge?
 Rising or falling edge
o When should the Status bit be set?
 Output synchronization issues:
o When should the status bit/signal be set?
 CPU decides or as result of external event
o When should the status bit/signal be reset?
o How does the device indicate to CPU that data has been processed?

Data Only Transfers


Section VII: Error Detection and Correction
Errors in Digital Systems
 Hard errors: persistent/repeatable errors, e.g. memory bit stuck at 0 will always return 0
 Soft errors: transient/non-repeatable errors, e.g. memory bit transmission failed due to noise
during that specific transmission

 Error detection (ED): add sufficient redundant information at the data source so that at the data
destination, we can determine if the data has changed from the original
 Error correction (EC): add sufficient redundant information at the data source to make it
possible to recover the original information

 Error rate: the rate/probability of errors on a channel; e.g. 10^-6 = one bit in a million is in error
o Errors may or may not be correlated; correlated means an error in one bit increases the likelihood of
an error in the next bit
 Overhead = (number of non-data bits) / (total number of bits transferred)

Block Recovery vs. Byte/Word Recovery


 If recovery is done at byte level (smallest level), system may be very responsive
o May require many extra bits to transmit
o Fewest bits to re-transmit
 If recovery is done at block level (group of bits), fewer extra bits are required – less overhead
o Fewer extra bits transmitted
o Re-transmission requires entire block to be sent again

Data Words and Code Words


 B: number of data bits
 C : number of check bits required to handle errors
 Data word: information bits to be transferred over a channel; smallest unit for error
correction/detection – has B bits of data
 Code word: combination of B data bits and C error handling bits
o In the event of an uncorrectable error, all B+C bits must be re-transmitted
 Overhead = C / (B + C)
Communication Structure

 F: calculates check bits


o If F(Bin) = F(Bout), there has been no detectable error
o If F(Bin) =/= F(Bout), then:
 ED: F detects errors only
 EC: F provides sufficient information to correct errors
 EC-ED: F provides sufficient information to correct some errors and detects
others
 Some errors can’t be detected
o Designed for single-bit error but multiple bits contained errors -> valid code word could
be converted into a different valid code word
o More erroneous bits than permitted -> erroneous code word could be "corrected" to the wrong
code word
o Timing/signalling errors prevent successful error event recovery

Parity
 Most common first level of defense to detect single bit errors (C = 1)
 Add 1 bit per B bits, so that B+1 bits always have:
o Even parity: even number of bits set to 1
o Odd parity: odd number of bits set to 1
 Error is detected if data doesn’t have correct parity
 Distance: minimum number of bits that must change to go from one code word to another code
word
o For parity bits, minimum distance between valid code words is 2
o To detect single bit errors, minimum distance of 2 is required

Example: Consider a 3-bit data word + 1 parity bit (bbbc); assume even parity

Word   Valid?     Word   Valid?
0000   ✓          1000
0001              1001   ✓
0010              1010   ✓
0011   ✓          1011
0100              1100   ✓
0101   ✓          1101
0110   ✓          1110
0111              1111   ✓
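
The even-parity rule behind the table can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this sketch and are not part of the course material.

```python
def even_parity_bit(data_bits: str) -> str:
    """Return the check bit that gives data + check an even number of 1s."""
    return "1" if data_bits.count("1") % 2 else "0"


def is_valid_even(code_word: str) -> bool:
    """A received code word is valid if its total number of 1s is even."""
    return code_word.count("1") % 2 == 0


word = "011" + even_parity_bit("011")  # 3-bit data word plus parity bit (bbbc)
print(word, is_valid_even(word))       # the transmitted word is valid
print("0111", is_valid_even("0111"))   # a single flipped bit is detected
print("0101", is_valid_even("0101"))   # two flipped bits go undetected
```

The last line shows why a distance of 2 only protects against single-bit errors: flipping two bits lands on another valid code word.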
(Hamming) Distance and EC/ED

 Error detection: required distance = number of bit errors to detect + 1
 Error correction: required distance = (number of bit errors to correct × 2) + 1
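
As a rough check of these two rules, the sketch below computes the minimum distance of a code and the error-handling capability it implies. The helper names are hypothetical, and the capability is read as "detect up to d-1 errors, or alternatively correct up to (d-1)/2 errors".

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(code_words) -> int:
    """Smallest distance between any two distinct valid code words."""
    return min(hamming_distance(a, b) for a, b in combinations(code_words, 2))


def capability(d: int):
    """(max bit errors detectable, max bit errors correctable) for distance d."""
    return d - 1, (d - 1) // 2


# All 4-bit words with even parity (3 data bits + 1 parity bit).
even_parity_code = [format(w, "04b") for w in range(16)
                    if format(w, "04b").count("1") % 2 == 0]
d = min_distance(even_parity_code)
print(d, capability(d))  # distance 2: detect 1 error, correct none
```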

Hamming Code
Introduction
 Consider 4 data bits ( B = 4)
 Three overlapping sets (P, Q, R) are needed to construct the code word
o Each set uses 1 check bit – 3 check bits for 4 data bits
o Each set has initial even parity (combination of data and check bits results in even
number of 1’s)
 After transmission, each set has either even or odd parity
o 2^3 − 1 = 7 parity combinations indicate a single-bit error condition
o 1 parity combination indicates no error (P, Q, R all have even parity)

Code Word Size


 Given C check bits, we can represent 2^C − 1 error conditions and 1 correct condition
 Given B data bits, we need to transmit B + C bits total (data + check bits)
o Thus, we need 2^C − 1 ≥ B + C
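
The inequality can be solved by simple search. The following sketch (function name chosen here for illustration) finds the smallest C for a given B:

```python
def min_check_bits(b: int) -> int:
    """Smallest C such that 2^C - 1 >= B + C."""
    c = 1
    while 2 ** c - 1 < b + c:
        c += 1
    return c


for b in (4, 8, 16, 32):
    print(b, "data bits need", min_check_bits(b), "check bits")
```

For B = 4 this gives C = 3, matching the three overlapping sets P, Q, and R described in the introduction.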

Syndrome
 Syndrome: XORs received check bits (CRec) and calculated check bits (CCalc) to determine the
location of a single bit error in a received code word
o If S = 0 (all bits are 0), no error
o If S only has one 1, error occurred in check bit
o If S has more than one 1, error occurred in data bit

 Check bits always occupy the bit positions whose binary index contains only one 1 (positions 1, 2, 4, 8, ...)


 To get CTransmitted (i.e. CRec), for each check bit:
o XOR the data bits whose position has a 1 in the same bit location as the check bit's position
C0 = D0 ⊕ D1 ⊕ D3 ⊕ D4 ⊕ D6
C1 = D0 ⊕ D2 ⊕ D3 ⊕ D5 ⊕ D6
C2 = D1 ⊕ D2 ⊕ D3 ⊕ D7
C3 = D4 ⊕ D5 ⊕ D6 ⊕ D7

Example: For a data value of 1110

Overall Bit   Binary   Contents
1             001      C0
2             010      C1
3             011      D0
4             100      C2
5             101      D1
6             110      D2
7             111      D3

CTransmitted: 100

 C0 = D0 ⊕ D1 ⊕ D3 = 0 ⊕ 1 ⊕ 1 = 0
 C1 = D0 ⊕ D2 ⊕ D3 = 0 ⊕ 1 ⊕ 1 = 0
 C2 = D1 ⊕ D2 ⊕ D3 = 1 ⊕ 1 ⊕ 1 = 1

Correct transmitted code word: 1111000

Assume received data has error in D2: 1011000


CRec: 100

DRec: 1010

CCalc: 010

 C0 = D0 ⊕ D1 ⊕ D3 = 0 ⊕ 1 ⊕ 1 = 0
 C1 = D0 ⊕ D2 ⊕ D3 = 0 ⊕ 0 ⊕ 1 = 1
 C2 = D1 ⊕ D2 ⊕ D3 = 1 ⊕ 0 ⊕ 1 = 0

Syndrome: CRec ⊕ CCalc = 110 = 6

 Error in 6th bit of 1011000


 Corrected data: 1111000
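
The worked example can be reproduced with a short Python sketch. It assumes the bit layout implied by the check-bit equations (positions 7 down to 1 hold D3 D2 D1 C2 D0 C1 C0) and writes code words with position 7 on the left; the function names are illustrative only.

```python
def check_bits(d0, d1, d2, d3):
    """Return (C0, C1, C2) for four data bits, using the XOR equations above."""
    return d0 ^ d1 ^ d3, d0 ^ d2 ^ d3, d1 ^ d2 ^ d3


def encode(data: str) -> str:
    """data is written D3 D2 D1 D0; the result is positions 7..1."""
    d3, d2, d1, d0 = (int(ch) for ch in data)
    c0, c1, c2 = check_bits(d0, d1, d2, d3)
    return "".join(str(b) for b in (d3, d2, d1, c2, d0, c1, c0))


def correct(code_word: str):
    """Return (syndrome, corrected code word), assuming at most one bit error."""
    bits = [int(ch) for ch in code_word]        # index 0 holds position 7
    d3, d2, d1, c2, d0, c1, c0 = bits           # received data and C_Rec
    k0, k1, k2 = check_bits(d0, d1, d2, d3)     # C_Calc from received data
    syndrome = ((c2 ^ k2) << 2) | ((c1 ^ k1) << 1) | (c0 ^ k0)
    if syndrome:
        bits[7 - syndrome] ^= 1                 # flip the bit at that position
    return syndrome, "".join(str(b) for b in bits)


print(encode("1110"))       # 1111000
print(correct("1011000"))   # (6, '1111000')
```

A syndrome with a single 1 (1, 2, or 4) points at a check bit, and any other non-zero value points at a data bit, which matches the syndrome rules listed earlier.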

Section VIII: Serial Interfacing


Communication Context

 Serial is the least complex electrical interface between digital devices


 It uses a single wire (plus ground) to transfer information one symbol (usually one bit) at a time
 Hierarchy of information transmission:
o Information
o Packets (data and header)
o Frames (formatted for communication)
o Bytes
o Bits
o Logic values on wires
o Analog signals
Channel Types
 Simplex: unidirectional

 Half duplex: bidirectional, only one way at a time

 Full duplex: bidirectional, simultaneous


o Requires double the bandwidth of half duplex

Classes of Synchronization
 Bit synchronization: how long each bit is, and where it starts/stops
o Bit rate synchronization: time that elapses between start of one bit and start of another
o Phase synchronization: given a bit rate, how the midpoint or start of each bit is found –
found by determining phase relationship between local clock and transmitter clock
 Byte synchronization: where bytes start/stop
 Block synchronization: where blocks start/stop

Bit Synchronization
 Given an input signal with no other information, number of bits and their values are unknown
 Given an input signal and a clock signal, the clock provides way to get bit rate synchronization
 Assume data line is sampled on every edge of the clock period
o Number of bits: 8
o Value: depends on phase synchronization – which edge of the clock is sampled
(rising/falling)
Byte Synchronization
 Given an input signal and a clock signal, the clock provides a way to get bit timing, but we do not
know where a byte starts
 Need additional byte timing clock to reliably get byte info

 Again, assume data line is sampled on every edge of the clock period
o Still do not know what the bytes represent

 Bit rate: number of bits that can be transferred per second over a channel
 Baud rate: maximum number of symbols that can be transferred per second over a channel
o Example:

 Bit rate is twice the baud rate


 Advantage: more bits of data can be transmitted per clock period
 Disadvantage: smaller noise margin between adjacent data voltage thresholds,
which can lead to more errors
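
The relationship between the two rates can be sketched as follows. The four-level example is an assumption made here to match "bit rate is twice the baud rate"; the original figure is not available.

```python
import math


def bit_rate(baud_rate: float, levels: int) -> float:
    """Bits per second when each symbol takes one of `levels` distinct values."""
    return baud_rate * math.log2(levels)


print(bit_rate(9600, 2))  # two levels: one bit per symbol
print(bit_rate(9600, 4))  # four levels: two bits per symbol
```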
Data Representation as Electrical Signals

 Non-Return to Zero (NRZ): no zero rest point


o On/off: 0V = data 0, positive voltage = data 1
o Bipolar: negative voltage = data 0, positive voltage = data 1 (voltages centered around
0)
 Return to Zero (RZ): data 1 returns to 0 during clock cycle, data 0 remains at 0 through entire
clock cycle
o Less time to sample data bit, but clock is embedded in the signal
 Manchester (Split Phase): correct data occurs in first half of clock cycle, edge occurs in middle of
data bit
o Clock recovery is easy
 Differential Encoding: data edge on clock edge = data 0, no edge on clock edge = data 1
 Inverted/Interval (NRZI): no edge on clock edge = data 0, data edge on clock edge = data 1
o Data can be interpreted even if it is inverted
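
A minimal sketch of two of these encodings, using the conventions stated in this list (Manchester: data value in the first half of the bit, complement in the second half; NRZI: an edge means data 1). Signal levels are modelled as 0/1 integers and the initial line level is an assumption.

```python
def manchester(bits):
    """Two half-bit levels per data bit: the data value, then its complement."""
    out = []
    for b in bits:
        out += [b, 1 - b]
    return out


def nrzi(bits, level=0):
    """Toggle the line level for a data 1, hold it for a data 0."""
    out = []
    for b in bits:
        if b:
            level ^= 1
        out.append(level)
    return out


def nrzi_decode(levels, level=0):
    """Recover data by comparing each level with the previous one."""
    out = []
    for cur in levels:
        out.append(int(cur != level))
        level = cur
    return out


data = [1, 0, 1, 1, 0]
line = nrzi(data)
inverted = [1 - x for x in line]
print(manchester([1, 0, 1]))
print(nrzi_decode(line), nrzi_decode(inverted, level=1))
```

Because NRZI carries information in edges rather than levels, the inverted line decodes to the same data, which is the property noted in the last bullet.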
One-Byte Serial Interface
Transmitting Data

 Shift registers (SR) convert between serial and parallel data streams
 Transmitting data:
o Data is loaded into XMIT (Data Out) register in parallel from system bus
o When SRout is empty and XMIT has data, data from XMIT is transferred to SRout in
parallel
o Data is shifted out serially by the Tx (transmit) clock connected to SRout
 Asynchronous (separate clocks): Tx clock generated by local oscillator
 Synchronous (common clock): Tx clock received from data destination or
another external source
Receiving Data

 Receiving data:
o Data is shifted into SRin serially by the Rx (receiver) clock
 Asynchronous (separate clocks): Rx clock generated by local oscillator
 Synchronous (common clock): Rx clock generated by data source or another
external source
o When SRin is full, data is transferred to RCV (Data In) register in parallel
o Data is transferred from RCV to system bus in parallel
Processor Side

Reality Checks/Issues
 Time
o How does receiver recognize bits/bytes/blocks?
o Given a clock, how do you send/receive data?
o How do you get the clock?
o Synchronous vs. asynchronous:
 Synchronous: Rx and Tx share the same clock
 Asynchronous: Rx and Tx have different clocks that may have same or slightly
different frequencies
 Errors
o Corrupted data due to channel noise
o Erroneous data caused by timing problems (e.g. slow readings, clock differences)

Clocks/Time
 Transmit and Receive clocks don’t have to be the same rate, but they usually are
 Clock is considered external to the serial interface
 In some cases, a local oscillator is used for the Transmit clock, and may also be used for the
Receive clock
 Depending on the chosen edge and Receive clock rate, different data is received:
Time and Data at the Transmit and Receive Registers
 Transmit and Receive each consist of two registers: a data register (XMIT or RCV) and a shift register (SR)
 A transfer between the SR and the XMIT/RCV register must complete within one bit time, while the CPU
has one byte time to service the XMIT/RCV register
o Receive: processor may not pick up the data fast enough – overrun
o Transmit: processor may not respond fast enough for serial data stream to be
uninterrupted
Communication Protocols
 Two systems that want to exchange information need to agree on:
o Timing
o Control
o Format
o Data representation
o Electrical signalling (voltages, currents, connector pins, etc.)
 Communication Protocol: set of rules for making connections and transferring information
o Communication protocols are generally developed in the following sequence:
1. A company or organization defines the specifications for use with their products
2. A national-level organization sanctions (approves) the standard
3. International Organization for Standardization (ISO) may sanction an international
standard
4. International Telecommunication Union – Telecommunication Standardization Sector (ITU-T)
sanctions it

Asynchronous Protocols
 Transmit and Receive sides are connected to separate local oscillators
 Assume both sides agree on a data rate (baud rate)
 Phase of Transmit clock is communicated to the Receiver

Bit Synchronization – Asynchronous Systems


 Transmit and Receive sides use different clocks
 The designer/user of the receiver only knows the nominal transmission frequency (ideal)
 There is no direct phase synchronization between Transmit and Receive
 Goal is to sample the bit in the middle, to help ensure that no bits are missed; in order to do so,
need:
o Reasonably accurate specification of the nominal transmission frequency, so that the
current bit can be used to find the next bit
o Way to extract the phase of the clock
 Example:

Asynchronous Frame Example

 Bit −1: start bit (0)


 Bit 0-7: data bits
 Bit 8 (and possibly 9): stop bit (1) – ensures that next start bit will have falling edge
o Hold the line as 1 when idle

Bit Sampling at n times the Data Rate


 Assume n=16, 8-bit data, and no parity
 Sample middle of each bit
 Since start bit time and bit structure are both known, bit synchronization is inherent
1. 0x0: wait for start of start bit (falling edge)
2. 0x0 – 0x7: wait for 0.5 bit times / 8 clock periods (since n=16)
3. 0x8: test for 0 (middle of start bit)
4. 0x9 – 0x18: wait for 1 bit time / 16 clock periods, sample b0
5. 0x(i)9 – 0x(i+1)8: wait for 1 bit time / 16 clock periods, sample bi (repeat for i = 1 to 7)
6. 0x89 – 0x98: wait for 1 bit time / 16 clock periods, check stop bit – If data = 0, then framing error
7. (if 2 stop bits) 0x99 – 0xA8: wait for 1 bit time / 16 clock periods, check stop bit – If data = 0,
then framing error
8. 0x0: indicate byte has been received and go to step 1
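
The sampling steps can be sketched in Python as a receiver that walks over a list of line samples taken at n = 16 times the data rate. The sample-list representation, the helper names, and the single stop bit are assumptions made for this illustration.

```python
N = 16  # receiver clock periods per bit time (n = 16, as in the steps above)


def frame_samples(byte: int, n: int = N):
    """Oversampled line levels: idle, start bit, 8 data bits (b0 first), stop bit."""
    bits = [1, 0] + [(byte >> i) & 1 for i in range(8)] + [1]
    return [b for b in bits for _ in range(n)]


def receive(samples, n: int = N):
    """Return (byte, error) where error is None, 'start', or 'framing'."""
    # Step 1: wait for the falling edge that begins the start bit.
    edge = next(i for i in range(1, len(samples))
                if samples[i - 1] == 1 and samples[i] == 0)
    # Steps 2-3: half a bit time later the line must still be 0.
    t = edge + n // 2
    if samples[t] != 0:
        return None, "start"
    # Steps 4-5: sample each data bit one full bit time after the last sample.
    byte = 0
    for i in range(8):
        t += n
        byte |= samples[t] << i
    # Step 6: the stop bit must be 1, otherwise there is a framing error.
    t += n
    if samples[t] != 1:
        return byte, "framing"
    return byte, None


print(receive(frame_samples(0x41)))
```

Sampling half a bit time after the falling edge places every later sample near the middle of its bit, which is what gives the scheme its tolerance to small clock differences.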

Clock Errors

 If clock rate drift causes sample point to change by more than 0.5 bits in 10 bits (8 data bits +
start/stop bits), sampled data will be incorrect
o Clock too slow: missed bits
o Clock too fast: double sampled bits
 Thus, for 10-bit frames, frequencies of transmit and receive clocks can’t differ by more than 5%
Clock Shift

∆T = | 1/f_Rx − 1/f_Tx |
 f_Rx and f_Tx are the actual clock rates of the receive and transmit clocks
 If B is the total number of bits to be transmitted (including start/stop bits), then if
B × ∆T > (1/2) × (1/f_Tx), at the end of B bits the shift will be larger than 0.5 bits
o The larger B is, the smaller ∆T has to be
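
The clock shift condition translates directly into a small check. This is a sketch with a hypothetical function name; frequencies are in Hz and the 9600 Hz figures are arbitrary example values.

```python
def clock_shift_ok(f_tx: float, f_rx: float, frame_bits: int) -> bool:
    """True if the accumulated shift over the frame stays within half a bit time."""
    delta_t = abs(1 / f_rx - 1 / f_tx)
    return frame_bits * delta_t <= 0.5 / f_tx


print(clock_shift_ok(9600, 9600 * 1.03, 10))  # about 3% fast over 10 bits
print(clock_shift_ok(9600, 9600 * 0.90, 10))  # 10% slow over 10 bits
```

With a 10-bit frame the first case stays within half a bit time and the second does not, consistent with the roughly 5% limit stated under Clock Errors.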

Overhead
 Overhead = (number of non-data bits) / (number of total bits)
 Per byte of data:
o Worst-case – more overhead, but know if data is correct:
 7 data bits
 1 start bit
 2 stop bits
 1 parity bit
 Overhead = 4/11 ≈ 36%
o Best-case – less overhead, but don’t know if data is correct:
 8 data bits (no parity bit)
 1 start bit
 1 stop bit
 Overhead = 2/10 = 20%
Errors
 Framing error: incorrect frame detected due to incorrect stop bit
 Overrun error: data is overwritten by the data following it, because serial data arrives too fast for the
Receiver to process, or the Transmitter sends data too fast for the serial channel to transfer
o Receive overrun error: at Receiver, incoming data has overwritten data in SRin
o Transmit overrun error: CPU or device writes to transmit buffer before its current data
has been transmitted
 Parity error: parity of received data and value of parity bit do not match
 Start bit error: when the line is sampled 0.5 bit times after the (falling) edge of the start bit, the
value is not 0 – false start bit detected or there is an error in the assumed bit rate

Impact of Parity Selection


 Choice of even or odd parity can depend on character set usage, as some short-pulse noise can
imitate the timing of one or more bits
 Examples:

o DELETE (0x7F): even parity is more prone to error, because a start bit followed by all 1’s
looks like a correct even-parity character
o ~ (0x7E): odd parity is more prone to error than even parity, but there is no ideal parity

 Break is a special character that forces a framing error


o A normal break is usually a fixed duration of 0s
o In cases with slow data rates, the break may seem like data with a number of 0s
o A long break stays 0 as long as the user presses the key, to help ensure a framing error is
forced (so the character isn’t detected as data)
 This is an example of out-of-band signalling

Impact of Stop Bit Mismatch


 What happens if the transmitter and receiver assume different numbers of stop bits?
o If Tx assumes more stop bits than Rx, communication isn’t a problem – subsequent
transmitted stop bits are received as idle bits
o If Tx assumes fewer stop bits than Rx, there is a problem – if data bytes are sent one
after another (the fastest rate), then start bit from Tx would be seen as framing error by
Rx
 For bidirectional channels, transmitter and receiver must agree on number of stop bits in order
to work in both directions

Summary of Synchronization in Asynchronous Serial Communication


 Bit rate: by agreement or software detection
 Bit phase: by start/stop bits – receiver detects falling edge to determine location of start bit
 Byte synchronization: start/stop bits – 1 byte per frame, so once bit 0 is detected, entire byte is
known
Synchronous Protocols

 Requires a common clock at the source and destination


o Separate lines for clock and data
o Embed clock in data (e.g. Manchester encoding), so that clock can be extracted from
data
 Advantages:
o No start/stop bits – less overhead
o No need to worry about relative accuracies of two clocks, so higher speeds are possible

 If distances are too large, then clock may not arrive at the input of each device at the correct
time
o Moving data from device 2 to 1: need to shift CLK x 8
o Moving data from device 2 to 3: need to shift CLK x 16
Block Oriented Synchronous Port Structure

 Receive clock is extracted from the data; data line edges are used to adjust the frequency and
phase of the local oscillator
 Since there are no start/stop bits, Sync Detect is required for synchronization, to determine
where bytes start
 Since higher speeds are possible, FIFO queues may be required

Bit Synchronization – Synchronous Systems


 If you initially assume the wrong edge, you will know that you have the wrong edge assumption
after not detecting edge
 Then, wait half a bit time to get the correct clock edge

 BISYNC Protocol: modify sequence of characters (using hardware or software) to provide block
synchronization
o Byte synchronization used to derive block synchronization
 HDLC Protocol: modify sequence of bits to maintain synchronization
o Bit synchronization used to derive block synchronization

BISYNC Protocol
 Based on special ASCII characters:
o SYNC (Synchronize) – 0x16 (0001 0110)
o SOH (Start of Header)
o STX (Start of Text)
o ETX (End of Text)
o ETB (End of Transmission Block)
o DLE (Data Link Escape)
 A long string of SYNC characters can be used to establish correct synchronization during
initialization; they are ignored
 Header: can contain source/destination addresses or a sequence number
 Data: data being transferred (excluding STX and ETX characters)
 Special characters: all special characters are sent with a DLE, which indicates that the next
character should be interpreted as a special character
o E.g. to send ETX in data, send DLE | ETX
o E.g. to send DLE in data, send DLE | DLE
 When the source buffer empties, SYNCs are inserted by the source and ignored by the
destination

BISYNC frame:

1. SYNC: >= 1 byte


2. SOH: 1 byte
3. Header: 3 bytes
4. STX: 1 byte
5. Data: < 256 bytes
6. ETX: 1 byte
7. Check bits: 2 bytes

HDLC (High-Level Data Link Control) Protocol


 Based on modifying bit sequence to achieve synchronization:
o A string of six 1s is a special flag to the receiver
o Bit stuffing: If user data requires five 1s, the hardware sends five 1s, stuffs an extra 0,
then sends the next data bit
 E.g. 0x7E (0111 1110) would be sent as (0111 11010)
 E.g. 0x3E (0011 1110) would be sent as (0011 11100)
o When receiver receives five 1s in a row, it discards the next bit if it is a 0, or treats it as a
flag if it is a 1
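
Bit stuffing and its inverse can be sketched as below. Bit streams are represented as strings of '0' and '1' for readability; that representation and the function names are choices made for this sketch, not part of HDLC itself.

```python
def stuff(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")  # stuffed bit
            run = 0
    return "".join(out)


def unstuff(bits: str) -> str:
    """Remove the 0 that follows every run of five consecutive 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            i += 1  # skip the stuffed 0
            if i < len(bits) and bits[i] != "0":
                raise ValueError("six 1s in a row: flag, not data")
            run = 0
        i += 1
    return "".join(out)


print(stuff("01111110"))  # 011111010
print(stuff("00111110"))  # 001111100
```

Both outputs match the 0x7E and 0x3E examples above, and `unstuff(stuff(x))` returns the original bit string.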

HDLC Frame:

1. Flag: 8 bits
2. Address: 8 bits
3. Control: 8 bits
4. Data: >= 0 bits
5. Checksum: 16 bits
6. Flag: 8 bits

 Minimum frame has Address, Control, and Checksum


 Types of frames:
o Information
o Supervisory
o Unnumbered

Information Frames

 Seq: frame sequence number


o Each frame is assigned a number between 0 and 7, used for re-transmission control
o Permits several messages to be sent before any ACK is received
 Next: number of messages consumed by destination
o Piggy-backed acknowledgement that all messages up to Next-1 have been received
without error
 p/f (Poll/Final): can be used to Poll a unit for data or indicate that this is the Final packet; can
also be used to Force an acknowledgement from the destination
Supervisory Frames

 Type 0 – ACK: Acknowledgement


 Type 1 – NACK: Negative Acknowledgement – Next field indicates the number of the first frame
that needs to be retransmitted
 Type 2 – ACK: ACK of messages up to Next-1, tells the sender to stop transmission
 Type 3 – Selective Reject: resend only the frame in the Next field

Unnumbered Frames

 Used for various control functions:


o DISConnect – machine disconnecting
o SNRM (Set Normal Response Mode) – reset sequence numbers to 0
o FRMR (Frame Reject) – a frame was received with valid checksum, but makes no sense
o UA (Unnumbered Acknowledge) – ACK for a control frame

General Message Passing


No Sequence Numbers – Error Free Unidirectional Transmission

 Device 1 sends data frames, Device 2 acknowledges received data frames


o D_x: Data frame from device x
o A_x: Acknowledgement from device x
 At most 1 outstanding data frame at a time
No Sequence Numbers – Error Free Bidirectional Transmission

 Both devices send and receive/acknowledge data frames


o D_x: Data frame from device x
o A_x: Acknowledgement from device x
 At most 1 outstanding data frame from each device at a time

Sequence Numbers – Error Free Bidirectional Transmission

 Both devices send and receive/acknowledge data frames


o D_x,y: Data frame y from device x
o A_x,y: Acknowledgement of data frame y from device x
 At most n−1 outstanding data frames from each device at a time
o Can send data every frame

Sequence Numbers – Unidirectional Transmission with Timeouts

 Device 1 sends data frames and Device 2 acknowledges received data frames
o D_x,y: Data frame y from device x
o A_x,y: Acknowledgement of data frame y from device x
o E: Erroneous frame
o D: Discarded frame
 At most n−1 outstanding data frames from Device 1 at a time
 When an erroneous frame is detected, all subsequent frames are discarded until timeout
 Then, erroneous frame is re-transmitted
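
The discard-and-retransmit behaviour can be illustrated with a heavily simplified simulation. It assumes the sender transmits in bursts of up to `window` frames and restarts from the first unacknowledged frame after a timeout; sequence numbers are not wrapped modulo n, and acknowledgements are not modelled explicitly. All names are hypothetical.

```python
def go_back_n(num_frames, corrupt_first_try, window):
    """Return the (frame number, outcome) events seen at the receiver.

    Outcomes: 'ok' = accepted, 'E' = erroneous frame, 'D' = discarded frame.
    """
    events, expected, base = [], 0, 0
    corrupted = set(corrupt_first_try)
    while base < num_frames:
        # Sender transmits up to `window` frames from the oldest unacknowledged one.
        for seq in range(base, min(base + window, num_frames)):
            if seq in corrupted:
                corrupted.discard(seq)      # only the first attempt is corrupted
                events.append((seq, "E"))
            elif seq == expected:
                events.append((seq, "ok"))
                expected += 1
            else:
                events.append((seq, "D"))   # out of sequence, so discarded
        base = expected                     # timeout: resume at first missing frame
    return events


print(go_back_n(5, {1}, 3))
```

In the printed trace, frame 1 arrives in error, frame 2 is discarded, and after the timeout the sender resumes from frame 1.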
Sequence Numbers – Unidirectional Transmission with NACKs

 Device 1 sends data frames and Device 2 acknowledges received data frames
o D_x,y: Data frame y from device x
o A_x,y: Acknowledgement of data frame y from device x
o N_x,y: Negative Acknowledgement of data frame y from device x
o E: Erroneous frame
o D: Discarded frame
 At most n−1 outstanding data frames from Device 1 at a time
 When an erroneous frame is detected, a NACK is sent so that the frames can be retransmitted
starting with the erroneous frame
 All frames are discarded until erroneous frame is re-received

Sequence Numbers – Bidirectional Transmission with Piggy-Backed Acknowledgements

 Both devices send and receive/acknowledge data frames


o D_x,y: Data frame y from device x
o A_x,y: Acknowledgement of data frame y from device x
 At most n−1 outstanding data frames from Device 1 at a time
 Acknowledgements are piggy-backed on data frames (i.e. sent at the same time)

Sequence Numbers – Out-of-Order Transmission with Selective Rejection

 Both devices send and receive/acknowledge data frames


o D_x,y: Data frame y from device x
o A_x,y: Acknowledgement of data frame y from device x
o R_x,y: Rejection of data frame y from device x
 At most n−1 outstanding data frames from Device 1 at a time
 Acknowledgements are piggy-backed on data frames
 Erroneous frames are rejected

Summary of Synchronization in Synchronous Serial Communication


 Bit rate:
o Global clock: dedicated clock signal
o Single line: hardware detection, likely with agreement on appropriate rate
 Bit phase:
o Global clock: dedicated clock signal
o Single line: derived from data stream – assuming Manchester encoding, phase can be
determined by looking at and shifting clock edges until middle of bit is found
 Byte synchronization:
o Global clock: flag, bit, bit pattern, or global reset
o Single line: flag or bit pattern
 Block synchronization:
o Global clock: derived from byte synchronization
o Single line: derived from bit or byte synchronization

Asynchronous vs. Synchronous Communication


Issue                         | Asynchronous                                         | Synchronous
Transmission rate             | Often slower                                         | Faster
Complexity                    | May be high if implementing block-oriented transfers | Even higher, depending on hardware/software choices
Bit and byte synchronization  | Higher overhead                                      |
Block synchronization         | Byte-oriented only (BISYNC)                          | Alternatives available (BISYNC and HDLC)

 Actual drivers/receivers used in transfer of serial signals can be of any type


 Considerations when choosing the best one:
o Distance – increased distance means more specialized drivers/receivers needed
o Noise – increased noise means more specialized drivers/receivers needed
o Data rates – increased data rate means more specialized drivers/receivers needed
o Standards – needed if driver and receiver are provided by different manufacturers
Universal Serial Bus (USB)

 General goal was to create a standard for cables in the PC market, with:
o Cheap connectivity
o Ease-of-use
o Expandable ports
 Specific goals:
o Common connector
o Automatic device detection and configuration upon plug-in
o Support for new and legacy devices
o Higher performance
o Low-power
o Direct power distribution for low-power devices

Low Speed and Full Speed Signalling


 Differential drivers and receivers – four wires: power, ground, D+ (data positive), D- (data
negative)
o D+ and D- are half duplex (bidirectional, one direction at a time)
 R_source = 45 Ω for high-speed operation, between 28 Ω and 45 Ω for full-speed operation
 Slew rate is limited

 Maximum cable skew between D+ and D- is 100ps


Low-Level Synchronization
 Data rate is specified by one of the resistors pulling up either D+ (full speed) or D- (low speed) –
detected when device is plugged in
 NRZI signal encoding – 0’s cause transitions, 1’s do not
 Bit stuffing used to maintain clock edges and keep the clock synchronized
 Block synchronization:
o Start of Packet is transmitted as 00000001 to synchronize clock
o Line goes idle between packets and clock resynchronization is required for each packet
o End of Packet is transmitted by out-of-band signalling

USB Data Transfers


 Only root hub can initiate transfers
 Each transfer consists of 3 packets:
o USB Token packet: selects slave for communication
 Target device
 Endpoint number (device I/O register)
 Direction of transfer
o USB Data packet: transfer payload information (actual data)
o USB Handshake packet: confirmation to data source that data has been received by
destination

USB Packets

 USB PID: there are 16 PID values, each of which is 4 bits followed by the same 4 bits
complemented
o E.g. Start of Frame (0101) -> 0xA5
 USB SOF Token Packet: frame used to provide some synchronization and facilitate data transfer
– generally sent once per millisecond to indicate start of new cycle
 USB IN Token Packet: used for Interrupt, Bulk, Control, and Isochronous transfers; provides the
device address and the I/O register endpoint that supplies data during the transfer
 USB OUT Token Packet: same as IN packet except communication is in opposite direction
 USB SETUP Token Packet: used to set up a remote device or hub
 USB ACK Packet: error free receipt of data packet
 USB NACK Packet: inability to accept or return data, or no new data available
 USB STALL Packet: error that can’t be recovered by itself; target is unable to complete transfer,
so software intervention is required
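
The PID check field is easy to illustrate. The sketch below assumes the 4-bit PID sits in the low nibble of the byte and its complement in the high nibble, which reproduces the Start of Frame example (0101 -> 0xA5); the function names are hypothetical.

```python
def pid_byte(pid: int) -> int:
    """4-bit PID in the low nibble, its bitwise complement in the high nibble."""
    return ((~pid & 0xF) << 4) | (pid & 0xF)


def pid_is_valid(byte: int) -> bool:
    """A PID byte is valid if the high nibble is the complement of the low nibble."""
    return (byte >> 4) == (~byte & 0xF)


print(hex(pid_byte(0b0101)))   # 0xa5
print(pid_is_valid(0xA5), pid_is_valid(0xA4))
```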

Types of USB Transfers


 Interrupt Transfer: traditional, small volume transfers
 Bulk Transfer: used by block transfer devices to ensure delivery of every piece of data, with no
time constraints
 Isochronous Transfer: data is read/written at a given rate – generally real-time data; missed
items are ignored
 Control Transfer: set up devices
USB Setup

RS-232 Electrical Interface


 Originally used to connect data terminal equipment (DTE) to data communication equipment
(DCE) or a modem
 Now used to connect computers to almost any relatively slow peripheral device

o Ring Indicator: request from DCE to communicate with DTE


o Data Terminal Ready: indicates if DTE power is on
o Carrier Detect: communication line is active
o Signal ground: grounds loop problems
o Data Set Ready: indicates if DCE power is on
o Clear to Send: DCE indicates it is ready to receive character
o Request to Send: DTE indicates it has character to send
o Receive: data from DCE to DTE
o Transmit: data from DTE to DCE
o Shield ground: noise protection
 To connect two DTEs together, a null modem is needed to translate connections between lines

3/4-Wire Null Modem

5/6-Wire Null Modem

Section IX: Analog Interfaces


Op-Amp Review
Ideal Op-Amps

 Infinite input impedance → zero input current
 Zero output impedance → can source/sink any amount of current
 Infinite open-loop gain
 V_out = A_Vol × (V+ − V−)

 Zero internal noise


 Infinite bandwidth

Non-Ideal Op-Amps

 Finite gain
 V_out is bounded by V_S+ and V_S−
 V_out = A_Vol × (V+ − V−)

Op-Amp (without feedback) as a Comparator

 Without feedback components, the output switches between V_S+ and V_S−
 Two inputs: V_C and V_Ref
o Positive-Logic Comparator – V_out = V_S+ (positive) when V_C > V_Ref
o Negative-Logic Comparator – V_out = V_S+ (positive) when V_C < V_Ref

Basic Inverting Amplifier

 Op-amp with resistor feedback such that increase in input voltage → decrease in output voltage
 V_out = −A_Vol × V−, or V− = −V_out / A_Vol
 If A_Vol → ∞, then the closed-loop gain is V_out / V_in = −R_f / R_in, i.e.
V_out = −(R_f / R_in) × V_in
Current Summer/Inverting Adder/Weighted Summer

 The current in the i-th branch is I_i = V_i / R_i
 I_f = V_out / R_f = −∑ I_i = −∑ (V_i / R_i) by KCL
 V_out = I_f × R_f = −R_f × ∑ (V_i / R_i)
o If all R_i = R (same value), then V_out = −(R_f / R) × ∑ V_i
Basic Non-Inverting Amplifier

 Op-amp with resistor feedback element such that increase in input voltage → increase in output
voltage
 V_out / V_in = (R_f + R_in) / R_in
Non-Inverting (Unity Gain) Buffer

 A special non-inverting amplifier where R_f = 0 and R_in = ∞


o High input impedance
o Low output impedance
 V_out = V_in
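
The ideal closed-loop relations for the inverting amplifier, the weighted summer, and the non-inverting amplifier can be checked numerically. The sketch below assumes ideal op-amps (infinite open-loop gain, no saturation); the function names and component values are illustrative.

```python
def inverting(v_in, r_f, r_in):
    """V_out = -(R_f / R_in) * V_in"""
    return -(r_f / r_in) * v_in


def summer(v, r, r_f):
    """V_out = -R_f * sum(V_i / R_i) for input voltages v and branch resistors r."""
    return -r_f * sum(vi / ri for vi, ri in zip(v, r))


def non_inverting(v_in, r_f, r_in):
    """V_out = ((R_f + R_in) / R_in) * V_in"""
    return (r_f + r_in) / r_in * v_in


print(inverting(1.0, 10e3, 5e3))             # gain of -2
print(summer([1.0, 2.0], [1e3, 1e3], 1e3))   # about -(1 + 2)
print(non_inverting(1.0, 1e3, 1e3))          # gain of +2
```

Setting `r_f` to 0 in `non_inverting` gives a gain of 1, which is the unity gain buffer case.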

Dynamic Characteristics of Analog Signals

 At analog level, step response with delay, slew, and ring


 Settling time: time for output to settle within a specific range of the final value for a given input
o Total time of delay, slew, and ring
 Slew rate: the rate at which signals change from one value to another

Analog Interfacing – Conversion Issues


 Quantization: the act of assigning a single digital value (quantity) to a range of analog values,
when reducing continuous analog signal values to a set of digital values
 LSB: the change in analog signal corresponding to a change in the least significant bit (LSB) of the
digital representation
o 1 LSB = FSAR / (2^n − 1), where FSAR is the full-scale analog range and n is the number of bits
in the digital representation
o Analog: LSB measured in voltage or current
o Digital: LSB measured in bits
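
A small sketch of the LSB formula and of quantization. Rounding to the nearest code and clamping to the valid range are assumptions made for illustration, as are the function names and the 5 V, 8-bit example.

```python
def lsb_size(fsar: float, n_bits: int) -> float:
    """Analog change per LSB: FSAR / (2^n - 1)."""
    return fsar / (2 ** n_bits - 1)


def quantize(v: float, fsar: float, n_bits: int) -> int:
    """Map an analog value to the nearest digital code, clamped to 0 .. 2^n - 1."""
    code = round(v / lsb_size(fsar, n_bits))
    return max(0, min(2 ** n_bits - 1, code))


print(lsb_size(5.0, 8))       # volts per LSB for a 5 V, 8-bit converter
print(quantize(1.0, 5.0, 8))  # code for 1.0 V
```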
Digital-to-Analog Converters (DACs)

 Output voltage/current x is proportional to V_Ref (analog reference voltage) and to the n-bit binary
input B being mapped to analog
o x = k × V_Ref × B, where k is a proportionality constant
Binary Weighted Resistor Ladder

 Switch is closed when Bi = 1, open when Bi = 0


 I_f = V_out / (0.5R) = −(V_Ref/(8R) × B_0 + V_Ref/(4R) × B_1 + V_Ref/(2R) × B_2 + V_Ref/R × B_3)
 All the R’s cancel, so:
2 × V_out = −(V_Ref / 8) × (B_0 + 2 B_1 + 4 B_2 + 8 B_3)
V_out = −(V_Ref / 16) × ∑_{i=0}^{3} 2^i B_i = −(V_Ref / 2^n) × ∑_{i=0}^{n−1} 2^i B_i, so k = −1/16
 V_out^Max = −(V_Ref / 2^n) × (2^n − 1) = −V_Ref × (2^n − 1) / 2^n = FSAR
 1 LSB = FSAR / (2^n − 1) = [−V_Ref × (2^n − 1) / 2^n] / (2^n − 1) = −V_Ref / 2^n
This is the amount of analog voltage change per change in LSB
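
The derivation can be checked numerically by summing the branch currents and comparing with the closed-form expression. This sketch assumes an ideal op-amp, branch resistors of 2^(n-1-i) × R for bit B_i, and a feedback resistor of 0.5R as in the 4-bit case above; the names and the 1 kΩ value are illustrative.

```python
def dac_output(code: int, n_bits: int, v_ref: float) -> float:
    """Closed form: V_out = -(V_Ref / 2^n) * B."""
    return -v_ref * code / 2 ** n_bits


def ladder_output(bits, v_ref: float, r: float = 1e3) -> float:
    """Sum the branch currents; bits[i] is B_i (bits[0] is the LSB)."""
    n = len(bits)
    i_total = sum(b * v_ref / (2 ** (n - 1 - i) * r) for i, b in enumerate(bits))
    return -i_total * 0.5 * r  # feedback resistor is 0.5 R


print(dac_output(0b1111, 4, 5.0))        # full-scale output
print(ladder_output([1, 1, 1, 1], 5.0))  # same value from the branch currents
print(dac_output(1, 4, 5.0))             # one LSB step
```

The result does not depend on the chosen R, which reflects the "all the R's cancel" step.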

Varying Input Impedance


 The input impedance with respect to V Ref varies as the n -bit values vary, i.e. current is much
larger when all B’s = 1 than when all B’s = 0
 In this case, V_Ref always has a constant current flow, unlike the previous version, as it is
grounded when the switch is open and connected to V− (virtual ground) when the switch is
closed
 The constant current load is equal to the maximum current load of the previous version
 Attaching this to a processor:

Resistor Values
 Arbitrary example:
 For very large n (number of bits), very large resistors are needed and a very large resistor range
is needed – hard to fabricate
 The largest current (100 mA) is too large for most op-amps to supply, while the smallest current (4 µA) is
comparable to the noise level of most op-amps
 This configuration is impractical and expensive for large n

R-2R Ladders

 The input impedance is fixed at Z_in = R
 I = V_Ref / R
o I_1 = I/2, I_2 = I/4, I_3 = I/8
 Just like the Binary Weighted Ladder,
V_out = −(V_Ref / 16) × ∑_{i=0}^{3} 2^i B_i = −(V_Ref / 2^n) × ∑_{i=0}^{n−1} 2^i B_i, so k = −1/16

Characteristics
 Easy to fabricate two resistors of a fixed ratio
 Consider an n-bit DAC:
V_out = −(V_Ref / 2^n) × ∑_{i=0}^{n−1} 2^i B_i
 Full scale output (every B = 1):
V_out^Max = −V_Ref × (2^n − 1) / 2^n
 Thus, if |k| < 1/2^n, then |V_out| < |V_Ref|, and so the largest possible output value is
V_out^Max = −V_Ref × (2^n − 1) / 2^n
Multiplying DACs (M-DACs)

 A DAC that is configured such that the analog input voltage signal (V_Ref) is a time-varying signal
 Output analog voltage:
V_out = k × B × V_in
o V_in is a varying input – attaching an analog signal to V_in would allow you to control the
gain with the digital signal B

DAC Characteristics – Specifications


 Resolution: the number of bits in the digital value at the input of the DAC
 Precision: the smallest distinguishable change in output (ideally 1 LSB)
 Accuracy: comparison of actual output to expected output
o Often specified as a fraction of an LSB
 Range: the difference between the maximum and minimum output value
 Dynamic Range: the difference in decibels between the system’s noise level and saturation
(overload) level
$$\text{Dynamic range (dB)} = 20 \log_{10}(2^n - 1)$$
 Errors:
o Gain and offset errors – independent from the digital value
o Linearity errors – dependent on the applied digital value
o Environmental errors – things we can’t control

DAC Offset and Scale Errors


 Offset error: an analog shift in the output of a DAC that is consistent over the full range of digital
input values
 Gain error: error in the output value of a DAC that varies linearly with the applied digital value
o Can be caused by errors/drift in resistor values, or changes in reference voltage

Mathematical Model of Offset Errors and Calibration


 Ideal relationship:
$$V_{out} = V_{Ideal} = k\,V_{Ref}\,B$$
 With offset error:
$$V_{out} = V_{WithOffsetError} = V_{Ideal} + V_{Offset}$$
 The offset error can be cancelled by subtracting $V_{ZeroCode}$, the output voltage when the digital input is 0:
$$V_{out} = V_{OffsetCalibrated} = V_{Ideal} + V_{Offset} - V_{ZeroCode}$$
 There could also be gain error:
$$V_{out} = V_{OffsetCalibrated} \times GainError$$
 To cancel the gain error, determine $V_{MaxActual}$, the actual maximum output voltage, and then
alter the gain (by adjusting the feedback resistor):
$$V_{out} = V_{OffsetCalibrated} \times GainError \times \frac{FSAR}{V_{MaxActual}}$$

DAC Linearity Errors – Differential


 Differential Non-Linearity (DNL): the differences between analog values corresponding to
consecutive digital input values, compared against the ideal step
$$DNL = \max\left(\left|\Delta V_{actual} - \Delta V_{ideal}\right|\right)$$
o E.g. DNL = 1 LSB means that moving from B=i to B=i+1 will result in a change in the
output of between 0 and 2 LSBs in magnitude (where $\Delta V_{ideal} = 1$ LSB)
 Monotonicity Error: If DNL > 1 LSB, stepping the digital value from i to i+1 could
result in a decrease in output (analog) voltage

DAC Linearity Errors – Integral


 Integral Non-Linearity (INL): the maximum deviation between the true output and the ideal
output
o Based on the assumption that all linear errors have been eliminated/compensated for

Example of INL and DNL error calculation

DAC Environmental Errors


 Power Supply Rejection Ratio (PSRR): how sensitive the converter is to changes in the power
supply voltage
$$S = \frac{\%\ \text{change in full scale}}{\%\ \text{change in supply voltage}}$$
o E.g. if decreasing the power supply voltage by 5% (from 5V to 4.75V) results in a decrease
in $V_{Ref}$ by 20% (from 2.5V to 2.0V), then $S = \frac{20\%}{5\%} = 4$
DAC Dynamic Performance

 Settling time – often expressed as the amount of time it takes for a certain percentage of the
final output to be reached
o Should be within $\pm\frac{1}{2}$ LSB to accurately read output
 Manufacturers will often leave amplifier off the DAC and only provide a current source, making
the DAC faster and cheaper

 Glitch impulse: when digital values change from one to another, the analog output may not
change directly from one value to the other
o Glitch impulse area is the region of error caused by internal switches not changing at the
same rate – e.g. in the example above, if the switch for b3 changes slower than switches
for b0, b1, and b2
Analog-to-Digital Converters

 If the digital value is 010, then the applied analog voltage was between 1.5–2.5 LSBs
 Two basic operations:
o Quantization: reduce range of analog values to discrete digital values
o Coding: assign binary code to each discrete range
ADC Unbiased vs. Biased Error

 For biased errors, transition point gets moved over to each volt (instead of between volts)
 If digital value of 010 is read, then applied analog voltage is between 2–3 LSBs

Binary Ramp ADC


 Increment DV until output of DAC > analog input voltage V Analog
 DV is now a digital representation of the analog input voltage
1. Processor asserts start-of-conversion (SOC) signal
2. SOC signal resets the Counter and D Flip-Flop
3. Counter increments by 1 during each CLOCK cycle
4. DAC outputs a discretized ramp signal controlled by the digital value (DV)
5. When DAC output exceeds analog voltage V Analog (i.e. A < B), Comparator clocks LOGIC 1 into
the D Flip-Flop to assert conversion complete (CC)
a. Counter stops counting up when LOGIC 1 is clocked
6. Processor detects CC and reads DV

Example
 Analog input voltage 0.6V is converted by ADC to digital value 101, since it is between 0.571V –
0.714V
 Biased high: digital value 101 is interpreted as 0.714V, i.e. all values in range 0.571V – 0.714V
are interpreted as 0.714V

Binary Ramp ADC – Software Version

/* Let n store the resolution of the DAC in bits. */


/* Let DV map to the input of the DAC. */
/* Let S map to the output of the comparator. */

DV = 0; // start by resetting digital value

// increment digital value while it is lower than the analog value and the
// max value hasn't been reached
while (S == 0 && DV < (1 << n) - 1) {
DV += 1;
}

// DV now stores the digital representation of the analog value

Binary Ramp ADC – Comments


 Conversion speed depends on:
o Analog input voltage
o DAC speed
 Conversion accuracy depends on quality of DAC and comparator
 Data output value is persistent until next SOC signal
 Advantages:
o Simple to implement
 Disadvantages:
o Variable conversion time
o Slow conversion time for large resolutions
o DAC overshoot can cause flip-flop to trigger prematurely
o Speed is limited by DAC settling time
Successive Approximation Register ADC
 Goal: improve performance of Binary Ramp by converting linear search of DV to binary search

Conversion time becomes fixed for a given resolution, instead of being dependent on the analog input
voltage

 Conversion time is fast and predictable


 Implementation is relatively straightforward

Assume bits of SAR are numbered 0 (LSB) to n−1 (MSB)

1. Clear all bits of digital value DV in the SAR


2. Set X =n−1
3. Set DV X =1 and wait for DAC to settle
4. If Comparator outputs 1, set DV X =0 and wait for DAC to settle; otherwise, leave the bit as 1
5. Decrement X ; if X ≥ 0, go to step 3

Example
 Analog input voltage = 0.6V
o First clock cycle: X =2, SAR = 100 ( DV 2=1); comparator outputs 0, so leave DV 2=1
o Second clock cycle: X =1, SAR = 110 ( DV 1=1); comparator outputs 1, so set DV 1=0
o Third clock cycle: X =0, SAR = 101 ( DV 0=1); comparator outputs 1, so set DV 0=0
 3 clock periods to convert 0.6V to 100
 Biased low: digital value 100 interpreted as 0.571V, i.e. all values in range 0.571V – 0.714V are
interpreted as 0.571V

Successive Approximation Register ADC – Software Version


 Uses same hardware as Binary Ramp ADC

/* Let n store the resolution of the DAC in bits. */


/* Let DV map to the input of the DAC. */
/* Let S map to the output of the comparator. */

DV = 0;

for (int i = n-1; i >= 0; i--) {


DV = DV | (1 << i); // set the ith bit to 1
if (S == 1) {
DV = DV & ~(1 << i); // reset the ith bit to 0
}
}

// DV now stores the digital representation of the analog value

Successive Approximation ADC – Comments


 Speed can be improved by observing that the DAC output doesn’t have to change as much as
the conversion progresses, so it may be possible to clock the SAR increasingly faster
 Data output is persistent until the next SOC
 Advantages:
o Simple to implement
o Faster than Binary Ramp ADCs for large analog input voltages
 Disadvantages:
o Requires multiple clock periods per conversion
o Slower than Binary Ramp ADCs for very small analog input voltages

Comparison of Binary Ramp and Successive Approximation ADCs


 Binary Ramp ADCs bias high (overestimate voltage), while Successive Approximation Register
ADCs bias low (underestimate voltage)
 Both ADCs have very similar software and hardware costs
o Successive Approximation ADCs require fewer clock periods, but require faster DACs if
the same clock period length is used for both converters
 Successive Approximation Register ADCs have fixed conversion times (n clock periods) that
aren’t affected by analog input voltage
 Successive Approximation ADCs may have fewer glitch problems, since they only have one bit
turned on or off per DAC cycle
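The comparison can be made concrete with a small simulation of both search strategies. This is only a sketch: it replaces the memory-mapped DV and S of the software versions with a hypothetical ideal DAC model whose output is code / (2^n - 1) of full scale, which matches the 1/7 V steps of the 3-bit examples. All function names are illustrative.

```c
#include <assert.h>

/* Hypothetical ideal DAC: code / (2^n - 1) * v_fs (1/7 V steps for 3 bits, 1 V). */
static double dac_output(unsigned code, unsigned n_bits, double v_fs)
{
    unsigned max_code = (1u << n_bits) - 1u;
    return v_fs * (double)code / (double)max_code;
}

/* Comparator: 1 when the DAC output exceeds the analog input. */
static int comparator(double v_dac, double v_analog)
{
    return v_dac > v_analog;
}

/* Binary ramp: linear search upward; *clocks receives the number of increments. */
unsigned binary_ramp_adc(double v_analog, unsigned n_bits, double v_fs,
                         unsigned *clocks)
{
    assert(n_bits >= 1 && n_bits <= 16);
    unsigned max_code = (1u << n_bits) - 1u;
    unsigned dv = 0;
    *clocks = 0;
    while (!comparator(dac_output(dv, n_bits, v_fs), v_analog) && dv < max_code) {
        dv += 1;
        *clocks += 1;
    }
    return dv;
}

/* SAR: binary search from MSB to LSB; always takes n_bits clock periods. */
unsigned sar_adc(double v_analog, unsigned n_bits, double v_fs, unsigned *clocks)
{
    assert(n_bits >= 1 && n_bits <= 16);
    unsigned dv = 0;
    *clocks = 0;
    for (int i = (int)n_bits - 1; i >= 0; i--) {
        dv |= 1u << i;                      /* try the bit as 1 */
        if (comparator(dac_output(dv, n_bits, v_fs), v_analog))
            dv &= ~(1u << i);               /* DAC overshoots: clear the bit */
        *clocks += 1;
    }
    return dv;
}
```

With a 0.6 V input, 3 bits, and 1 V full scale, the ramp search stops at 101 after 5 increments (biased high), while the SAR search settles on 100 after exactly 3 clock periods (biased low), matching the two worked examples.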

Flash ADC (Fast, Brute Force Approach)


General (n -bit) converter:
 Test every possible value at once → requires $2^n - 1$ comparators and 1 resistor per possible
value
 Input values are limited to a number of consecutive 0’s followed by a number of consecutive 1’s
 Each comparator outputs result of V Analog >V , where V is the voltage from the resistor chain
2-bit converter example:

 E.g. XYZ = 100 means that $V_{Analog}$ is between $\frac{1}{4}V_{Ref}$ and $\frac{1}{2}V_{Ref}$, so DV = 01
 E.g. XYZ = 110 means that $V_{Analog}$ is between $\frac{1}{2}V_{Ref}$ and $\frac{3}{4}V_{Ref}$, so DV = 10
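The encoding in these two examples amounts to counting how many comparators are asserted. The sketch below models that for an ideal n-bit flash converter. The loop stands in for comparators that operate in parallel in hardware, and the function name and the equal-resistor thresholds at k/2^n of $V_{Ref}$ are assumptions consistent with the 2-bit example.

```c
#include <assert.h>

/* Ideal n-bit flash ADC: comparator k (k = 1 .. 2^n - 1) tests
 * V_analog > k * V_ref / 2^n, and the encoder counts the asserted outputs. */
unsigned flash_adc(double v_analog, double v_ref, unsigned n_bits)
{
    assert(n_bits >= 1 && n_bits <= 16);
    unsigned levels = 1u << n_bits;
    unsigned dv = 0;
    for (unsigned k = 1; k < levels; k++) {
        double threshold = v_ref * (double)k / (double)levels;
        if (v_analog > threshold)
            dv++;   /* every comparator below the input level outputs 1 */
    }
    return dv;
}
```

For a 2-bit converter with $V_{Ref}$ = 1 V, an input of 0.3 V asserts only the lowest comparator (DV = 01) and 0.6 V asserts two (DV = 10).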

Flash ADC – Comments


 Data is transient, unlike Binary Ramp and Successive Approximation
 Biased low
 No need for start/end of conversion, but also no way to tell when the converter has stabilized
 Advantage:
o Very fast
 Disadvantage:
o Requires many accurate resistors and even more comparators
 Size becomes extremely big for higher resolutions
 Expensive
 Greater number of parts that can fail
 Large amount of heat due to a large number of parts

Indirect Integrating ADCs


 Converts voltage to time, then to number through a counter
Op-Amp Integrators

 $I = C\dfrac{dv}{dt}$ and $I_{in} = I_f$ so:
$$\frac{V_{in}}{R} = -C\,\frac{dV_{out}}{dt}$$
 Integrating both sides:
$$\int_{t_0}^{T} -\frac{1}{RC}\,V_{in}\,dt = V_{out}(T) - V_{out}(t_0)$$
 Assume that the input voltage doesn’t change between $t_0$ and $T$:
$$V_{out}(T) - V_{out}(t_0) = -\frac{V_{in}}{RC}\int_{t_0}^{T} dt = -\frac{V_{in}}{RC}\,(T - t_0)$$
 Then, at time $T$:
$$V_{out}(T) = V_{out}(t_0) - \frac{V_{in}}{RC}\,(T - t_0)$$
o If $V_{in} < 0$, then the slope becomes positive


Single Slope ADC

 Create a voltage ramp (with a known slope)


 Determine how long the ramp takes to reach the unknown voltage
o Use a comparator to determine if V out of integrator has reached V Analog

 Switch is kept closed until start of conversion, so $V_{out}(0) = 0$ and $V_{out}(T) = \dfrac{T}{RC}\,V_{Ref}$
 At the start of conversion, switch is opened and counter is reset
o $V_{out}$ ramps up with a slope of $\dfrac{V_{Ref}}{RC}$
 When the threshold is reached (at time $T$), the counter stops
o $V_{out}(T) \approx V_{Analog}$
o $T = \dfrac{DV}{f}$, where DV is the binary value of the counter and $f$ is the clock frequency
 Thus:
$$V_{Analog} \approx \frac{T}{RC}\,V_{Ref} \approx \frac{V_{Ref}}{RC}\cdot\frac{DV}{f}$$
$$DV \approx \frac{V_{Analog}\,RC\,f}{V_{Ref}}$$
o This shows that DV is dependent on $R$, $C$, $f$, and $V_{Ref}$

Single Slope ADC – Component Selection


 To convert values of $V_{Analog}$ between 0 and $V_{Max}$, choose
$$RCf = (2^n - 1)\,\frac{V_{Ref}}{V_{Max}}$$
 E.g. if $n = 8$, $V_{Ref} = 5\,V$, and $V_{Max} \approx 5\,V$, then $RCf = 255$
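The component-selection rule and the resulting counter value can be sketched as two small helpers. The names are illustrative, and truncating DV to a whole count is an assumption of this sketch (the counter only advances on whole clock periods), not something the notes specify.

```c
#include <assert.h>

/* Component-selection rule: RCf = (2^n - 1) * V_ref / V_max. */
double single_slope_rcf(unsigned n_bits, double v_ref, double v_max)
{
    assert(n_bits >= 1 && n_bits <= 31 && v_max > 0.0);
    return (double)((1u << n_bits) - 1u) * v_ref / v_max;
}

/* Ideal counter value: DV ~= V_analog * RCf / V_ref, truncated to a whole
 * number of clock periods (assumption of this sketch). */
unsigned single_slope_dv(double v_analog, double v_ref, double rcf)
{
    assert(v_ref > 0.0 && v_analog >= 0.0);
    return (unsigned)(v_analog * rcf / v_ref);
}
```

With n = 8 and $V_{Ref} = V_{Max}$ = 5 V the first helper returns 255, and a 2.5 V input then produces a count of 127. Because RCf appears directly in DV, any drift in R, C, or f shifts the result.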

Single Slope ADC – Comments


 Converter produces persistent data, since counter isn’t reset until next SOC signal
 Advantages:
o Simple and reasonably good converter
 Disadvantages:
o Since DV is dependent on R , C , f , and V Ref , conversion errors can occur due to:
 Clock frequency errors
 Errors/drift in R and C
 Errors in V Ref
o Conversion time is proportional to V Analog

Dual Slope ADC


 Conversion starts with V o =0 and Counter = 0
 Integrate with $V_{Analog}$ for a fixed amount of time $T$:
$$V_o(T) = -\frac{V_{Analog}}{RC}\int_0^T dt = -\frac{V_{Analog}}{RC}\,T$$
o Since $V_{Analog}$ is positive, the output voltage slope is initially negative
 When the counter reaches some value $N$ ($T = \frac{N}{f}$), the switch position is changed and the
counter is reset
o $N$ is typically $2^n$, where $n$ is the ADC’s resolution and the number of bits in the Counter
o A carry out occurs in the counter when $2^n$ clock periods have passed
 Then, integrate with $-V_{Ref}$ for a variable amount of time $\tau$ until $V_o = 0$ again:
$$V_o(T+\tau) = -\frac{1}{RC}\int_T^{T+\tau} \left(-V_{Ref}\right) dt + V_o(T) = \frac{V_{Ref}}{RC}\,\tau - \frac{V_{Analog}}{RC}\,T = 0$$
 When $V_o(T+\tau) = 0$, stop the counter, copy the counter value (DV) into a register, and start
integrating with $V_{Analog}$ as the input again
o $V_o(T+\tau) = 0$ occurs when:
$$\frac{V_{Ref}}{RC}\,\tau = \frac{V_{Analog}}{RC}\,T$$
$$V_{Analog} = V_{Ref}\,\frac{\tau}{T} \quad (RC\text{'s cancel out})$$
o But since $\tau = \frac{DV}{f}$ and $T = \frac{2^n}{f}$:
$$V_{Analog} = V_{Ref}\,\frac{DV}{2^n}$$
$$V_{Analog}^{Max} = V_{Ref}\,\frac{2^n - 1}{2^n}$$
 Value doesn’t depend on $R$, $C$, or $f$; assume that these won’t change during
conversion
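The cancellation of R, C, and f can be seen in a tick-by-tick sketch of an ideal dual slope conversion. The component values passed in are hypothetical, the integrator is assumed ideal, and the counter is modelled as counting whole clock periods completed before the zero crossing.

```c
#include <assert.h>

/* Ideal dual slope conversion, one loop iteration per clock period.
 * rc (seconds) and f (Hz) are hypothetical values; dt/RC = 1/(rc*f). */
unsigned dual_slope_adc(double v_analog, double v_ref, unsigned n_bits,
                        double rc, double f)
{
    assert(n_bits >= 1 && n_bits <= 16);
    assert(v_analog >= 0.0 && v_analog <= v_ref && rc > 0.0 && f > 0.0);

    double step = 1.0 / (rc * f);      /* dt / RC for one clock period */
    unsigned n_ticks = 1u << n_bits;   /* fixed first phase: T = 2^n / f */
    double v_o = 0.0;

    /* Phase 1: integrate V_analog for 2^n clocks (slope is negative). */
    for (unsigned k = 0; k < n_ticks; k++)
        v_o -= v_analog * step;

    /* Phase 2: integrate -V_ref and count the whole clock periods that
     * complete before V_o would rise above 0 (truncating, biased low). */
    unsigned dv = 0;
    while (v_o + v_ref * step <= 0.0 && dv < n_ticks) {
        v_o += v_ref * step;
        dv++;
    }
    return dv;
}
```

For $V_{Analog}$ = 2.5 V, $V_{Ref}$ = 5 V, and n = 8, the count is 128 = $2^n V_{Analog}/V_{Ref}$ whether RC is 1 s or 2 s and whether f is 1024 Hz or 4096 Hz. For values that are not exactly representable in floating point, the simulated count may differ from the ideal by one.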

Dual Slope ADC – Comments


 Clocking the DV into register can be performed so that the data is persistent or transient
 Output is biased low, and V Ref ≥V Analog
 Advantages:
o Output doesn’t depend on values of R and C
o Output doesn’t depend on value of f – clock drift between conversions is tolerable
o Fairly simple to build
o Popular for low-speed low-cost applications
 Disadvantages:
o Completion time is a function of $\frac{V_{Analog}}{V_{Ref}}$, since the second slope takes that amount of
time to get back to 0V – thus, the completion time is between $2^n$ and $2^{n+1}-1$ clock periods
o Number of clock periods doubles with each 1-bit increase in resolution

ADC Specifications and Errors


 Resolution: the number of digital value bits that the ADC converts
 Quantization Error
 Dynamic Range: ratio of largest value that can be converted to smallest value
o E.g. for a 10-bit ADC with an input range of 0V – 4V, the quantization step is
$\frac{4V - 0V}{2^{10} - 1} \approx 3.910\,mV$, so the dynamic range is $\frac{4V}{3.910\,mV} = 1023 \approx 60\,dB$
 Missing Code: due to some errors in the ADC, certain digital values may never be able to be
generated; these are known as missing codes
 Accuracy
 Offset, gain, and linearity errors
 Conversion time

Time Varying Signals


 Changing signals can cause converter problems
 Converter Aperture Time: the maximum time that the converter output (result) is sensitive to
changes in the analog signal
o A sample-and-hold circuit is often used to ensure that the ADC’s input analog signal
doesn’t change during the aperture time

 The frequency of a periodic signal can impact the required sampling rate
o Nyquist Sampling Rate: if a time-varying signal contains components of significant
amplitude only below $f$ Hz, then a sampling frequency $\ge 2f$ Hz will suffice to
reconstruct the signal without generating lower frequency aliasing signals
 i.e. a signal with frequency $f$ requires a sampling frequency of at least $2f$ to
reconstruct the signal

Sampling a Time-Varying Signal without a Sample-and-Hold Circuit


 What is the maximum frequency of a time-varying signal that can be sampled without adding a
sample-and-hold circuit?
 Example: assume sampling a sine wave of some frequency – worst case occurs at point of
maximum rate of change (slope) of the signal
$$V(t) = V_{peak}\sin(2\pi f t)$$
$$\frac{dV}{dt} = 2\pi f\,V_{peak}\cos(2\pi f t)$$
$$\text{Max slope} = 2\pi f\,V_{peak}$$
 We often assume that the signal won’t change by more than ¼ LSB during the conversion
 E.g. 12-bit ADC requires 10µs to convert a signal with range 10V peak-to-peak
o Converter aperture time = 10µs, so the max amount that the input can change by in
10µs is ¼ LSB
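o $\frac{1}{4}\,LSB = \frac{1}{4}\cdot\frac{10\,V}{2^{12}-1} \approx 0.6\,mV$, so there can be at most a 0.6mV change in 10µs
o $\left.\frac{\Delta V}{\Delta t}\right|_{Max} = 2\pi f_{Max} V_{peak}$
$$f_{Max} = \frac{0.6\,mV}{(10\,\mu s)(2\pi)(5\,V)} \approx 2\,Hz$$
→ very constraining for a 100kHz converter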
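The same calculation can be written as a small helper for trying other resolutions and time windows. It is a sketch of the formula above, assuming a sine input centred in the converter range so that $V_{peak}$ is half the peak-to-peak range; the function and parameter names are illustrative.

```c
#include <assert.h>

static const double PI = 3.14159265358979323846;

/* Largest sine frequency (Hz) for which the input moves by at most
 * lsb_fraction of an LSB during t_window seconds, using
 * max slope = 2*pi*f*V_peak and LSB = V_pp / (2^n - 1). */
double max_signal_frequency(unsigned n_bits, double v_pp, double t_window,
                            double lsb_fraction)
{
    assert(n_bits >= 1 && n_bits <= 31 && v_pp > 0.0 && t_window > 0.0);
    double lsb = v_pp / (double)((1u << n_bits) - 1u);
    double dv_max = lsb_fraction * lsb;     /* allowed change, e.g. 1/4 LSB */
    double v_peak = v_pp / 2.0;
    return dv_max / (t_window * 2.0 * PI * v_peak);
}
```

Calling `max_signal_frequency(12, 10.0, 10e-6, 0.25)` reproduces the roughly 2 Hz limit of the example. Shrinking the time window is what raises the usable frequency, which is the role a sample-and-hold circuit plays.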
Sample-and-Hold Overview

 ADCs require a stable input voltage during conversions, i.e. the voltage remains within ¼ LSB
 A sample-and-hold circuit is used to ideally hold the input voltage constant during the
conversion
 Buffer 1 isolates the analog circuit from C when Q is on
 Buffer 2 isolates the converter from C
 Sample period: When Q is on, $V_c$ tracks $V_{in}$
 Hold period: When Q is off, $V_c$ retains the most recent $V_{in}$, which means that $V_{out}$ retains the
most recent $V_{in}$ value and it can be used as the ADC input voltage
 Some ADCs include sample-and-hold circuits

Sample-and-Hold Circuit Errors

 During sampling
o Errors in input buffer
 Offset
 Non-linearity
 Non-unity gain
o Settling time: time to attain a good estimate of the final value, given a full-scale step at
the input, to within a specified error – i.e. time to couple the voltage across the
capacitor through the two buffers, to the ADC input
 During sample-to-hold transition
o Sample and Hold Aperture Time: time required for Q to turn off, once the hold signal is
asserted

o Sample and Hold Aperture Uncertainty (Jitter): the variation in the time between the command to turn
Q off and when Q actually turns off
 Caused by variables in: the delay to turn off, sample/hold transition, or
temperature
 When sampling a signal at regular intervals, the sampling instant cannot be off by more than the time
required for the signal to change by a fraction of the LSB value
o Example: if you need to sample a 10kHz signal using the 10µs ADC mentioned above,
how much sample and hold aperture uncertainty can be tolerated?
$$\frac{1}{4}\,LSB = \frac{1}{4}\cdot\frac{10\,V}{2^{12}-1} \approx 0.6\,mV$$
$$\left.\frac{\Delta V}{\Delta t}\right|_{Max} = 2\pi f_{Max} V_{peak}$$
$$T_{aperture\ uncertainty} = \frac{0.6\,mV}{2\pi\,(10\,kHz)(5\,V)} \approx 2\,ns$$
 During hold
o Droop: a drop in the signal out of the sample and hold circuit
 Caused by C discharging due to:
 Input bias currents in output buffer
 Leakage through the switch
 Leakage across C
o Hold settling time (t hs): time for signal to stabilize after hold begins
o Feed through: leakage forward through the switch
 During hold-to-sample transition
o Acquisition time (t acq): time required before capacitor voltage is within a specified
percentage of the final value
o At times, input and stored values may appear to be close, but there may still be
transients when sample starts due to stray capacitance still present in the circuit

Section X: Buses – Data Transfer


 Synchronous Data Transfer: transfer of data between communicating entities with a common
view of time
o Global clock; data transfer occurs at specific times in the clock period
o All transfers are a fixed duration of one clock period
o No feedback from consumer to producer to alter the rate of transfer or the length of
data validity
 Asynchronous Data Transfer: transfer of data between communicating entities with different
view of time
o No global clock
o Variable transfer times are permitted
o Can be fully or partially interlocked
 Semi-Synchronous Data Transfer: transfer of data between communicating entities with a
common view of time, but variable transfer times are permitted
o Transfer times are an integral number of clock cycles, but the actual number of clock
cycles are generally controlled by the slave
 Split-Cycle Data Transfer: transfer of data between communicating entities that only permits
transfer from producer to consumer
o Write transfers are accomplished in a single transfer
o Read transfers are accomplished as two write transfers (one in each direction)
Synchronous Read (Single Clock)
Minimum Read Time – Synchronous
 After the rising Master Edge of the clock,
$$t_{phase1}^{minimum(Read)} = t_{Hold} + t_{PA} + t_{Skew}$$
 After the falling Slave Edge of the clock,
$$t_{phase2}^{minimum(Read)} = t_{Select} + t_{Access} + t_{PD} + t_{Skew} + t_{Setup}$$
 Thus, assuming the slowest interfacing device has $t_{Access}^{Max}$, the minimum time for a synchronous
read bus cycle is:
$$T_{buscycle}^{Synchronous\ Read} = t_{phase1}^{minimum(Read)} + t_{phase2}^{minimum(Read)}$$
$$= t_{Hold} + t_{PA} + t_{Skew} + t_{Select} + t_{Access}^{Max} + t_{PD} + t_{Skew} + t_{Setup}$$
$$= t_{PA} + t_{PD} + 2t_{Skew} + t_{Setup} + t_{Hold} + t_{Select} + t_{Access}^{Max}$$
Synchronous Read (Two Clocks)

 In this case, hold time is decoupled from the next cycle, because it occurs before the clock edge
where data is read
Synchronous Write (Single Clock)
Synchronous Write (Two Clocks)

Clock Rate Limits


 Assume that a pure synchronous system is constrained to a single clock period and so the slower
of the two times is taken as $T_{buscycle}^{Optimized\ Synchronous\ Read}$
 Read time is always slower than write time, due to t Access
Asynchronous Buses

 Synchronous buses must run at the speed of the slowest interfacing device
 Asynchronous buses allow bus speed to vary according to a wide variety of device speeds
o Clock line is replaced with Master and Slave lines
Fully Interlocked Asynchronous Read Transfer

Timing Diagram at Master


Timing Diagram at Slave

Minimum Read Time – Fully Interlocked Asynchronous


$$T_{buscycle}^{Asynchronous\ Fully\text{-}interlocked\ Read} = 2t_P + t_{Skew} + t_{Select} + t_{Access} + 2t_P + t_{Skew} + t_{Setup} \quad \text{(current transfer)}$$
$$= 4t_P + 2t_{Skew} + t_{Setup} + t_{Select} + t_{Access}$$

 $2t_P$ = $t_P$ from Master to Slave and $t_P$ from Slave to Master

Fully Interlocked Asynchronous Write Transfer


Timing Diagram at Master
Timing Diagram at Slave

 Optimistic slave: once the address is decoded, save the data in a temporary register until it can
be written to the correct location
 Conservative slave: wait until the data is written to the correct location before informing the
master

Factors Limiting Bus Speed


 Skew time: must be known by master or slave, depending on implementation
 Propagation time
 Setup time: may need to be known by master
 Hold time
 Device access times

Synchronous vs. Fully Interlocked Asynchronous Transfers


 Synchronous:
o Transfer rate limited by slowest device connected to bus
o Transfers are faster than asynchronous if all attached devices have comparable
performance
o Treat data as transient
 Asynchronous:
o Slower than optimized synchronous bus in situations where all attached devices have
comparable performance
o Supports wide range of device response times
o Verifies that real values are being read
o Treat data as persistent

Partially Interlocked Asynchronous Bus Transfers

 Based on constraining slave signals to a fixed duration


 Non-interlocked transitions must obey timing constraints
 Reduces the number of bus propagation delays from 4 to 2, but adds a delay time

Minimum Read Time – Partially Interlocked Asynchronous


$$T_{buscycle}^{Asynchronous\ Partially\text{-}interlocked\ Read} = 2t_P + t_{Skew} + t_{Select} + t_{Access} + t_{Delay} \quad \text{(current transfer)}$$

 Assume that t Delay can be set to any fixed value that is higher than the device time
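The three minimum read times in this section (synchronous, fully interlocked, partially interlocked) can be compared side by side for a given set of timing parameters. The struct and function names below are illustrative, and the field comments only restate the symbols used in the formulas.

```c
#include <assert.h>

/* Timing parameters (any consistent unit, e.g. ns). */
typedef struct {
    double t_pa;      /* t_PA from the synchronous formula */
    double t_pd;      /* t_PD from the synchronous formula */
    double t_p;       /* t_P: one-way propagation on the asynchronous bus */
    double t_skew;
    double t_setup;
    double t_hold;
    double t_select;
    double t_access;  /* access time; slowest device for the synchronous bus */
    double t_delay;   /* fixed delay of the partially interlocked scheme */
} bus_timing;

/* t_PA + t_PD + 2 t_Skew + t_Setup + t_Hold + t_Select + t_Access */
double sync_read_time(const bus_timing *t)
{
    assert(t != 0);
    return t->t_pa + t->t_pd + 2.0 * t->t_skew + t->t_setup + t->t_hold
         + t->t_select + t->t_access;
}

/* 4 t_P + 2 t_Skew + t_Setup + t_Select + t_Access */
double full_interlock_read_time(const bus_timing *t)
{
    assert(t != 0);
    return 4.0 * t->t_p + 2.0 * t->t_skew + t->t_setup + t->t_select
         + t->t_access;
}

/* 2 t_P + t_Skew + t_Select + t_Access + t_Delay */
double partial_interlock_read_time(const bus_timing *t)
{
    assert(t != 0);
    return 2.0 * t->t_p + t->t_skew + t->t_select + t->t_access + t->t_delay;
}
```

With made-up values such as t_PA = t_PD = t_P = 5, t_Skew = 2, t_Setup = 3, t_Hold = 1, t_Select = 4, t_Access = 50, and t_Delay = 10, the results are 72, 81, and 76. The numbers are only placeholders, but they show the two extra propagation delays that the fully interlocked handshake pays.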
Synchronous vs. Fully- and Partially-Interlocked Asynchronous

Semi-Synchronous Bus Transfer


 Hybrid of synchronous and asynchronous techniques
 Best implemented when most transfers are fast; if a device is slower than the average transfer
and it notifies the master that the transfer cannot be completed in the expected time, it is
possible to slow the transfer
 Transfer timing is based on two signals:
o Clock from the bus
o Wait/Ready/Hold from the slave
 Constraint: slave interface must be able to remove the Ready signal or assert a
Wait signal, in time to stop the master from reading the data
 Semi-synchronous transfers combine the speed of synchronous transfers with the flexibility of
asynchronous transfers
Semi-Synchronous Read
Semi-Synchronous Write

Split Cycle

 In all previous strategies, the bus is occupied by the master until the transfer is completed
 What if there is a very slow device on the bus, but several possible bus masters?
 Split cycle protocol:
o Writes are one quick transfer – could use a temporary register to hold data at the slave,
if necessary
o Reads  two write cycles
1. Master sends address and Read signal to slave
2. Master releases bus
3. Slave requests access to the bus and writes the request result back to master
 Can be implemented using synchronous, asynchronous, or semi-synchronous buses
o Most useful when combined with synchronous buses, to make use of extra clock cycles

Section XI: Buses – Arbitration


 Bus arbitration: a process that selects a single bus master from one or more devices requesting
to be bus master
 Primary goal: unique selection of a bus master
 Secondary goals (possible):
o Fair (round robin, FCFS) or unfair (fixed priority)
o Cheap
 Non-pre-emptive arbitration: once a device is selected as bus master, it will continue to use the
bus until completion – the CPU can’t forcibly take the bus back
 Central arbiter (CA): the part of the arbitration system that’s responsible for detecting bus
requests and issuing bus grants
 Distributed arbiter (DA): the part of the arbitration system that repeats in each device, which
has a requirement to be bus master at some time
o Devices that do not contain a DA cannot become bus master; they bypass arbitration
and are a slave device only

2-Wire Daisy-Chain Arbitration System

2-Wire Daisy-Chain Central Arbiter

 Asserts $BusGrant_0$ whenever $BusRequest_0$ is received


 Connected by a single wire – approximately no delay through the central arbiter ($t_{CA} \approx 0$)

2-Wire Daisy-Chain Distributed Arbiter

 Arbiter circuit is implemented as an asynchronous (fundamental mode) design


 The closer the DA to the CA, the higher priority the device is
 There is a finite delay through the DA ($t_{DA} \neq 0$)
Design of a 2-Wire Daisy-Chain Distributed Arbiter
Cases that don’t require arbitration:

 Case A – Idle: local device interface i isn’t requesting service, no lower-priority devices are
requesting service
o All inputs ($R_{in}$, $G_{in}$, $Request_i$) and outputs ($R_{out}$, $G_{out}$, $Grant_i$) are 0
 Case B – Local Only: only local device interface i requests service
 Case C – Lower-Priority Only: only one or more lower-priority device interfaces request service
o Higher-priority devices pass along the signal

Cases that require arbitration:

 Case D: one or more lower-priority device interfaces request service while the local device
interface i is using the bus
o Case D1: forward grant immediately
 $Grant_i: 1 \to 0$ and $G_{out}: 0 \to 1$
o Case D2: re-arbitration cycle
 $Grant_i: 1 \to 0$ and $R_{out}: 1 \to 0$ (regardless of value of $R_{in}$)
 DA waits for $G_{in}: 0 \to 1$ before re-asserting $R_{out}: 0 \to 1$
 Case E: local device interface i requests service after a lower-priority device has requested
service
o Case E1: If $G_{in}: 0 \to 1$ hasn’t occurred yet, grant bus to local device
o Case E2: If $G_{in}: 0 \to 1$ has already occurred, pass grant along to lower-priority device

Case B: Local Request

 $Request_i: 0 \to 1$, then $R_{out}: 0 \to 1$
 $BusRequest_{i-1}: 0 \to 1$ and so $BusGrant_{i-1}: 0 \to 1$
 $G_{in}: 0 \to 1$, so $Grant_i: 0 \to 1$
Case C: Lower-Priority Request

 $R_{out}$ passes along the signal from $R_{in}$
 $G_{out}$ passes along the signal from $G_{in}$
Case D1: Lower-Priority Request During Use – Propagate Grant

 Forward grant to the right when local device finishes


 Not recommended – breaks order of priority and complexity of implementation not justified
Case D2: Lower-Priority Request During Use – Re-Arbitration
 After local device interface is finished, start arbitration over again
Case E: Simultaneous Requests
 Device 1 issues bus request after Device 2 issues one

Case E1: Local Device Gets Bus First

 $Request_1$ occurred before $BusGrant_0$, so local device (Device 1) gets bus as if it was the first to
request
Case E1A: Propagate Grant

Case E1B: Re-Arbitration


Case E2: Remote Device Gets Bus First

 $Request_1$ occurred after $BusGrant_0$, so remote device (Device 2) gets bus first

Case E2A/E2B: Local Request after Grant

 Case E2A: Once $BusRequest_1$ is removed, $Grant_1$ is issued without re-arbitration
o Device 1 must assume that once $BusRequest_1$ is removed, Device 2 released the bus
quickly enough for Device 1 to use the bus
o Breaks order of priority
 Case E2B: Once $BusRequest_1$ is removed, initiate re-arbitration
Fundamental Mode Circuit Design

 Fundamental mode assumptions:


o Inputs only change one at a time
o Output signals stabilize before the next input change is received

 a: system is idle
 b: local request ($Request$) only (no remote requests ($R_{in}$))
 c: remote requests only (no pending local request)
 d: system is awaiting a $G_{in}$ signal; remains in this state until $G_{in}$ is received
 e: $G_{in}$ received, bus granted to local device; if there is an $R_{in}$, move to state i
 f: $G_{in}$ received, bus granted to remote device; if $Request$ is activated during this state, move to
state j
 g: two pending requests; local request wins when grant is detected
 h: arbiter waits for $Request$ to fall after $R_{out}$ has been de-asserted – transition state
 i: local device is bus master while remote request ($R_{in}$) arrives
 j: remote device is bus master while local request ($Request$) arrives

State Reduction (Merging)

Reduced state table (multiple states per row):

3-Wire Daisy-Chain Arbitration System


 Uses two passive pull-up bus lines and one daisy-chained signal
 Devices can request bus usage faster, since request lines are shared
 Central Arbiter contains open-collector driver
3-Wire Daisy-Chain Central Arbiter
 Asserts $BusGrant_0$ when $BusRequest = 0$ (asserted) and $BusBusy = 1$ (not asserted)
$$BusGrant_0 = BusBusy \cdot \overline{BusRequest}$$
o Implemented using a NOT gate and an AND gate
 Finite delay through the central arbiter (t CA ≠ 0)

3-Wire Daisy-Chain Distributed Arbiter

 Implemented as an asynchronous (fundamental mode) design


 Finite delay through each distributed arbiter (t DA ≠ 0)

 When a device interface wants to use the bus, its DA pulls BusRequest low to request service
 A DA may take control of the bus when it sees a rising edge on $BusGrant_{i-1}$, the bus is not busy
($BusBusy = 1$), and it has a local request ($Request_i = 1$)
 When it gets the bus, the local device pulls BusBusy low to indicate that it is in use
3-Wire Daisy-Chain Arbitration

Example 1: Device 1 Requests Bus Before Device 2


Example 2: Device 2 Requests Bus when System Idle
Example 3: Three Devices

4-Wire Distributed Bus Arbitration


 Add BusAck – it is asserted by the DA next in line to get the bus, after it receives a BusGrant
Example 1: Device 1 Requests Service Before Device 2
Summary of Daisy-Chain Characteristics
Bus Idle – Delay
 Consider a request by the nth device in a daisy chain, when the bus is idle
 Estimated delay:
o 2-wire: delay through n DAs, 1 in each direction ($BusRequest$ and $BusGrant$)
$$t_{delay,\,2\text{-}wire} = 2n \times t_{DA}$$
o 3-wire: delay to assert $BusRequest$ ($t_{BL}$), delay through n DAs, delay through CA, and
delay to assert $BusBusy$ ($t_{BL}$)
$$t_{delay,\,3\text{-}wire} = t_{BL} + n \times t_{DA} + t_{CA} + t_{BL} = 2t_{BL} + n\,t_{DA} + t_{CA}$$
o 4-wire: delay to assert $BusRequest$ ($t_{BL}$), delay through n DAs, delay through CA, and
delay to assert $BusBusy$ and $BusAck$ signals ($t_{BL}$ each)
$$t_{delay,\,4\text{-}wire} = t_{BL} + n \times t_{DA} + t_{CA} + 2 \times t_{BL} = 3t_{BL} + n\,t_{DA} + t_{CA}$$
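The three idle-bus delay estimates are easy to tabulate for a given chain position. The sketch below simply evaluates the formulas above; the struct, the function names, and any numeric values used with them are illustrative.

```c
#include <assert.h>

/* Delays of one distributed arbiter, the central arbiter, and one bus line. */
typedef struct {
    double t_da;
    double t_ca;
    double t_bl;
} arb_timing;

/* 2-wire: request and grant each ripple through n DAs. */
double idle_delay_2wire(unsigned n, const arb_timing *t)
{
    assert(t != 0 && n >= 1);
    return 2.0 * (double)n * t->t_da;
}

/* 3-wire: 2 t_BL + n t_DA + t_CA. */
double idle_delay_3wire(unsigned n, const arb_timing *t)
{
    assert(t != 0 && n >= 1);
    return 2.0 * t->t_bl + (double)n * t->t_da + t->t_ca;
}

/* 4-wire: 3 t_BL + n t_DA + t_CA. */
double idle_delay_4wire(unsigned n, const arb_timing *t)
{
    assert(t != 0 && n >= 1);
    return 3.0 * t->t_bl + (double)n * t->t_da + t->t_ca;
}
```

With placeholder values $t_{DA}$ = 10, $t_{CA}$ = 5, $t_{BL}$ = 20 and n = 4, the delays are 80, 85, and 105. The 2-wire delay grows twice as fast with n as the other two, so the shared-line schemes pull ahead on longer chains.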

Bus Idle – Priority


 There is nominal priority if the bus is idle and two or more simultaneous requests are received
 For 2-wire, 3-wire, and 4-wire, there is fixed priority starting from the central arbiter

 Priority uncertainty: consider the situation where the bus is idle and there are > 2 requests
 $t_{uncertainty} = (n - i) \times t_{DA}$ for 2-wire, 3-wire, and 4-wire arbitration systems
o If device i requests the bus more than $t_{uncertainty}$ before device n receives the bus and
$i < n$, it will be serviced first
o If device i requests the bus after $t_{uncertainty}$ has passed, device n will get the bus even
though device i has higher priority
Bus Busy – Delay
 Consider a bus request by the nth device in the daisy chain
 Assume the bus is currently being used by the $c$th device, where $c$ may be greater than or less than $n$
 Assume the bus is busy, so the delay is only related to the time to determine the next user
 What is the delay between the completion of one user and the start of the next (assuming the
request is already present when the current DA finishes)?
 Delay if the bus is busy:
o 3-wire: delay to de-assert $BusBusy$ ($t_{BL}$), delay through CA, delay through n DAs, delay
to assert $BusBusy$ (including $Grant$) ($t_{BL}$)
$$t_{arbitration\ busy} = 2t_{BL} + n\,t_{DA} + t_{CA}$$
o 4-wire: if there is enough time to arbitrate the new request before the previous user has
finished using the bus, then the delay is just to assert $BusBusy$ (including $Grant$)
$$t_{arbitration\ busy} = t_{BL}$$

Bus Busy – Priority


 Nominal priority if the bus is busy when the request arrives:
o 2-wire: fixed priority starting from CA
o 3-wire: fixed priority starting from DA
o 4-wire: DA closest to CA that makes a request when the current user starts using the
bus has highest priority
 i.e. re-arbitration starts right after the current user starts using the bus, so if a
higher-priority request comes in after re-arbitration completes, it doesn’t get
the bus
 Priority uncertainty: if the bus is idle and there are > 2 requests; if device i requests the bus
more than t uncertainty before device n receives the bus and i<n, it will be serviced first
o 2-wire: $t_{uncertainty} = (n - i) \times t_{DA}$
o 3-wire: $t_{uncertainty} = (n - i) \times t_{DA}$
o 4-wire: $t_{uncertainty} = (n - i) \times t_{DA} + t_{current\ user\ usage}$
Section XII: Direct Memory Access (DMA)
Block Transfers

 Many systems require the transfer of large blocks of data


 Block-oriented device interfaces tend to be more complex than character-oriented ones and
tend to have built-in buffers to permit synchronization that is less frequent than one byte

 Assume:
o Device is unidirectional and only provides data
o Data is applied and clocked in by the external device
o When data is clocked in, a status flip-flop is set to 1, which the bus master can read by
querying the status register
o When the data is read, the status register is cleared
 Example: transfer of 256 bytes
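for (int i = 255; i >= 0; i--) {
    /* Busy-wait until the data-ready bit (bit 7) of the status register is set */
    while ((Status_Register & 0x80) != 0x80);
    value[i] = Data_Register; /* reading the data clears the status register */
}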

 Assembly:
o Number of useless memory/bus cycles is $2 + n(6x + 7)$, where n is the number of bytes
to transfer and x is the number of cycles needed for synchronization
 Even for the fastest devices with x = 1, 13 bus cycles are wasted per byte

Integrated DMA Controller (DMAC)

 MAR: pointer to next byte in memory to be transferred


 BCR: number of bytes left to transfer in block
 Status/Control register: holds the following fields
o Mode: number of transfers per bus mastership
o R/W: direction of transfer
o Up/Down: increment/decrement memory address in MAR
o Start: ready to start transfer
o Interrupt Enable: enable/disable interrupts
o Busy: synchronization bit for processing one byte of data
o Interrupt Request: interrupt pending is asserted after processing one block of data
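One way to picture the register set is as a C structure with bit masks for the status/control fields. The bit positions, register widths, and names below are assumptions made for illustration; a real DMAC's data sheet defines the actual layout.

```c
#include <stdint.h>

/* Hypothetical bit assignments for the status/control register. */
enum {
    DMAC_MODE_BURST = 1u << 0, /* Mode: 0 = one transfer per mastership, 1 = burst */
    DMAC_RW_TO_MEM  = 1u << 1, /* R/W: 1 = device to memory, 0 = memory to device */
    DMAC_UP         = 1u << 2, /* Up/Down: 1 = increment MAR, 0 = decrement MAR */
    DMAC_START      = 1u << 3, /* Start: ready to start transfer */
    DMAC_INT_ENABLE = 1u << 4, /* Interrupt Enable */
    DMAC_BUSY       = 1u << 6, /* Busy: per-byte synchronization bit */
    DMAC_INT_REQ    = 1u << 7  /* Interrupt Request: set after one block */
};

/* Register file of the DMAC (widths are assumed). */
typedef struct {
    uint32_t mar; /* pointer to next byte in memory to be transferred */
    uint16_t bcr; /* number of bytes left to transfer in block */
    uint8_t  csr; /* status/control register */
} DmacRegs;

/* Block initialization: load MAR, BCR and control values, then set Start. */
void dmac_setup_block(DmacRegs *d, uint32_t start_addr,
                      uint16_t nbytes, uint8_t flags) {
    d->mar = start_addr;
    d->bcr = nbytes;
    d->csr = (uint8_t)(flags | DMAC_START);
}
```

A call such as dmac_setup_block(&regs, 0x2000, 256, DMAC_UP | DMAC_INT_ENABLE) would describe an upward-counting 256-byte block with the completion interrupt enabled.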

Transfer Mode
 Cycle Stealing: only transfer 1 byte per bus mastership
o May require an excessive number of bus requests/grants
 Transparent: the CPU tells the DMA controller which cycles it doesn't need the bus, and the
DMAC uses the bus only during those cycles
 Burst: multiple transfers are permitted per bus mastership; allows the transfer of up to an entire
block per bus mastership
o Eliminates some latency due to bus request/grant cycles, but the CPU can be blocked from
using the bus for as long as the I/O device needs to transfer data
 Trade-offs:
o During a burst, the CPU can’t access anything on the bus
o Thus, the length of a burst is usually constrained by:
 Amount of data available to be transferred
 Number of cycles that the CPU is willing to give up the bus
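To make the trade-off concrete, the number of bus masterships (and so request/grant cycles) needed for a block can be estimated per mode. This sketch assumes the device always has data ready, so every burst is full length; max_burst is a hypothetical cap on transfers per mastership.

```c
/* Cycle stealing: one bus mastership per byte transferred. */
long bus_masterships_cycle_stealing(long nbytes) {
    return nbytes;
}

/* Burst: up to max_burst bytes per mastership, rounded up to cover the block. */
long bus_masterships_burst(long nbytes, long max_burst) {
    return (nbytes + max_burst - 1) / max_burst;
}
```

For a 1000-byte block, cycle stealing needs 1000 masterships, a 64-byte burst limit needs 16, and an unlimited burst needs 1, at the cost of keeping the CPU off the bus for the whole block.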

Basic DMA Transfer Sequence


1. CPU loads DMAC with starting memory address (MAR), number of bytes to transfer (BCR), and
control values
2. When the device is ready or has data, it issues a DMARequest
3. DMAC requests bus and waits for arbitration to grant it the bus
4. DMAC provides address and control to make transfer occur
5. DMAC increments or decrements MAR and decrements BCR
6. If in burst mode, BCR isn't 0 yet, data is available, and the max transfer count hasn't been
reached, go to step 4 and repeat
7. Transfer complete; release bus, synchronize CPU and DMAC
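The sequence above can be sketched as a small software model of one bus mastership, for a device-to-memory transfer. Memory is a plain array, arbitration is assumed to succeed immediately, and the struct and field names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of a DMAC (not a real device's register map). */
typedef struct {
    uint8_t *memory;    /* simulated main memory */
    size_t   mar;       /* index of the next byte in memory */
    size_t   bcr;       /* bytes left in the block */
    int      up;        /* 1: increment MAR, 0: decrement MAR */
    size_t   max_burst; /* transfers allowed per mastership (1 = cycle stealing) */
    int      done;      /* stands in for the status bit / interrupt at block end */
} Dmac;

/* Services one DMARequest (step 2). src holds the avail bytes the device has
 * ready. Returns the number of bytes moved during this bus mastership. */
size_t dmac_service_request(Dmac *d, const uint8_t *src, size_t avail) {
    size_t moved = 0;
    /* Step 3: bus request and grant are assumed to have completed here. */
    while (d->bcr > 0 && moved < avail && moved < d->max_burst) {
        d->memory[d->mar] = src[moved];           /* step 4: perform the transfer */
        d->mar = d->up ? d->mar + 1 : d->mar - 1; /* step 5: update MAR ... */
        d->bcr--;                                 /* ... and decrement BCR */
        moved++;                                  /* step 6: repeat if allowed */
    }
    if (d->bcr == 0) {
        d->done = 1; /* step 7: block complete, synchronize with the CPU */
    }
    return moved;    /* the bus is released when the function returns */
}
```

Setting max_burst to 1 models cycle stealing; a larger value models burst mode with a cap on how long the CPU is kept off the bus.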

DMA Transfer from CPU’s Point of View

 Number of wasted memory/bus cycles decreases to $12 + 6x$; if the DMAC completes first, $x = 1$
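A quick comparison of the two wasted-cycle formulas from these notes: 2 + n(6x + 7) for the polled loop in Block Transfers and 12 + 6x for the DMA case. The sketch assumes the DMA figure is per block and independent of n; the function names are illustrative.

```c
/* Wasted bus cycles for a polled transfer of n bytes with x synchronization cycles. */
long polled_wasted_cycles(long n, long x) {
    return 2 + n * (6 * x + 7);
}

/* Wasted bus cycles from the CPU's point of view when the DMAC moves the block. */
long dma_wasted_cycles(long x) {
    return 12 + 6 * x;
}
```

For the 256-byte example with x = 1, polling wastes 2 + 256(13) = 3330 cycles, while DMA wastes 12 + 6 = 18.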
Complete DMA Cycle
Initialization
 Global initialization – once
o CPU configures global aspects of DMAC
o CPU configures unchanging aspects of device
o CPU configures unchanging aspects of DMAC interrupts, if applicable
 CPU does normal operations
 Block initialization – once per block
o May be necessary to set up device or device interface every block
o CPU writes control values to DMAC (MAR, BCR, interrupt enable/disable, etc.)
 CPU does normal operations until DMAC takes control

Data Transfer
 Data transfer – once per transfer
o Device data is ready
o Device interface requests transfer from DMAC
o DMAC requests and gets control of the bus through arbitration
o DMAC does a read (memory data gets written to interface) or write (interface’s data
gets written to memory) operation
o DMAC increments/decrements MAR and decrements BCR (loop if burst mode)
o DMAC releases bus
 Block synchronization – once per block
o DMAC tests if the BCR = 0; if so, set the status bit or trigger an interrupt (if interrupts
enabled)
Cycle Stealing – Ladder Diagram
DMAC Architecture Considerations

 Connection between device interface and DMAC:


o Integrated DMAC: DMAC is part of device interface – each device that uses DMA has
its own
o Detached DMAC: DMAC interface is separate from both processor and device interfaces
– only one DMAC per system
 Addressing structure (for detached DMAC only):
o Dual Address Protocol: two address cycles per transfer – one to read data from source,
one to write data to destination
o Implicit Address Protocol: one address cycle per transfer – DMAC causes device to put
data on the bus and the destination to copy data from the bus, at the appropriate times
 Bus structure:
o Single bus – memory-mapped I/O
o Separate memory and I/O buses – non-memory-mapped I/O
 DMAC should have higher bus priority than CPU to prevent starvation, since it uses the bus less
often
Dual Address Protocol

1,2: DMAC requests and claims bus


3,4: DMAC reads data from device/memory (depending on transfer direction)
5: DMAC temporarily stores data to be written
6,7: DMAC writes data to memory/device (depending on transfer direction)

 Advantage: No hardware or device interface changes required


 Disadvantage: Twice as many bus cycles required as implicit, since each DMA cycle requires 2
bus cycles – 1 read and 1 write
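A minimal model of one dual-address DMA cycle, counting bus cycles to show where the factor of two comes from. The Bus type and helper names are hypothetical; source and destination are plain bytes standing in for a device register and a memory location.

```c
#include <stdint.h>

/* Counts bus cycles used; a stand-in for the shared system bus. */
typedef struct {
    unsigned long bus_cycles;
} Bus;

/* One read bus cycle: the DMAC addresses the source and receives its data. */
static uint8_t bus_read(Bus *b, const uint8_t *src) {
    b->bus_cycles++;
    return *src;
}

/* One write bus cycle: the DMAC addresses the destination and supplies data. */
static void bus_write(Bus *b, uint8_t *dst, uint8_t value) {
    b->bus_cycles++;
    *dst = value;
}

/* Dual address: two address cycles per transfer, with the data held
 * temporarily inside the DMAC between them. */
void dma_dual_address_transfer(Bus *b, const uint8_t *src, uint8_t *dst) {
    uint8_t temp = bus_read(b, src); /* steps 3,4: read from device/memory */
    bus_write(b, dst, temp);         /* steps 5-7: hold, then write to memory/device */
}
```

Because source and destination are both ordinary addressed locations, the same routine works in either transfer direction, which is why no interface changes are needed.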
Implicit Address Protocol

 Required signals to transfer data from memory to interface:


o Memory address and Read signal (on bus) – for memory
o CS and Write signal (at interface) – for device

1. Device requests service from interface, which requests service from DMAC
2. DMAC requests and claims bus
3. DMAC asserts memory address and control signals for desired memory operation
4. DMAC issues Acknowledge signal to the interface to indicate that it should take the appropriate
action
5. Interface releases request and DMAC completes transaction (including removing Acknowledge
signal)
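The same kind of model for an implicit-address transfer from memory to interface: the memory drives the data lines and the interface latches them in the same bus cycle, triggered by Acknowledge. The types and names are hypothetical, and bus arbitration (step 2) is assumed to have completed.

```c
#include <stddef.h>
#include <stdint.h>

/* Shared bus: data lines plus a cycle counter. */
typedef struct {
    uint8_t       data_lines;
    unsigned long bus_cycles;
} Bus;

/* Device interface: a data latch and its pending DMA request. */
typedef struct {
    uint8_t latch;
    int     request;
} Interface;

/* One implicit-address DMA cycle, memory to interface. */
void dma_implicit_transfer(Bus *b, const uint8_t *memory, size_t addr,
                           Interface *io) {
    b->data_lines = memory[addr]; /* step 3: address + Read, memory drives the bus */
    io->latch = b->data_lines;    /* step 4: Acknowledge acts as CS + Write */
    b->bus_cycles++;              /* only one address cycle was needed */
    io->request = 0;              /* step 5: interface releases its request */
}
```

Only the memory is addressed on the bus; the interface is selected by the dedicated Acknowledge line, which is why this protocol needs that extra signal.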
Implicit Addresses vs. Dual Addresses

 Implicit address requires Request and Acknowledge lines


 Dual address only requires Request line
Single Bus vs. Separate Memory and I/O Buses

 For separate memory and I/O buses:


o Although the buses are logically separate, most of the wires would still be shared
o DMAC must be separate and connected to both buses

o Options:
 2 physical buses – two sets of address, control, and data lines
 1 physical bus and 1 mode bit (virtual memory bus) – two modes, one
optimized for memory transfers and the other optimized for I/O transfers
I/O Performance

 Assume that $t_{interdata}$ starts as soon as the previous block of data is available


 Polling times:
o Data is read every 10 ms, so it takes 10 s to read 1000 bytes
o 80 bus cycles required for block synchronization
o Processor/Bus time: 10 s ÷ 50 ns per instruction + block synchronization = 200,000,040
instructions = 400,000,080 bus cycles (bc), at 2 bc per instruction
 Interrupt times:
o Each of the 1000 bytes requires one subroutine (20 bc) + one interrupt response time (5
bc); the block also requires one block synchronization (80 bc)
o Processor/Bus time: 1000 × 25 bc + 80 bc = 25,080 bc
 DMA – Dual Address times:
o Each of the 1000 transfers requires 2 bus arbitration cycles (bac) (one for DMA to
become master, one for CPU to become master) and 2 bc for actual transfer
o Assume both transfers required for DMA transfer are made in 1 bac, i.e. DMA can keep
bus to make both transfers
o Each block of transfers requires one setup (40 bc) and one interrupt at the end of
synchronization (5 bc) = 45 bc
o Processor time: 85 bc
o Bus time: 85 bc + 2000 DMA bc + 2000 bac
 DMA – Implicit Address times:
o Each of the 1000 transfers requires 2 bus arbitration cycles (bac) (one for DMA to
become master, one for CPU to become master) and 1 bc for actual transfer
o Each block of transfers requires one setup (40 bc) and one interrupt at the end of
synchronization (5 bc) = 45 bc
o Processor time: 85 bc
o Bus time: 85 bc + 1000 DMA bc + 2000 bac
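The totals above can be reproduced with a short calculation. The constants are taken from the figures in this section; the ratio of 2 bc per instruction is inferred from 200,000,040 instructions = 400,000,080 bc, and the 85 bc DMA processor time is used as given. Type and function names are illustrative.

```c
/* Cost of moving one 1000-byte block, in bus cycles (bc) and
 * bus arbitration cycles (bac). */
typedef struct {
    long long processor_bc;
    long long bus_bc;
    long long bac;
} IoCost;

enum {
    N_BYTES            = 1000,
    SUBROUTINE_BC      = 20,
    INT_RESPONSE_BC    = 5,
    BLOCK_SYNC_BC      = 80,
    DMA_PROCESSOR_BC   = 85, /* as stated in the notes */
    BC_PER_INSTRUCTION = 2   /* inferred: 50 ns instruction, 25 ns bus cycle */
};

/* Polling: the CPU executes instructions for the entire 10 s transfer. */
IoCost cost_polling(void) {
    long long total_ns = (long long)N_BYTES * 10000000LL; /* 10 ms per byte */
    long long bc = total_ns / 50 * BC_PER_INSTRUCTION + BLOCK_SYNC_BC;
    IoCost c = { bc, bc, 0 };
    return c;
}

/* Interrupts: one subroutine and one response per byte, one sync per block. */
IoCost cost_interrupt(void) {
    long long bc = (long long)N_BYTES * (SUBROUTINE_BC + INT_RESPONSE_BC)
                 + BLOCK_SYNC_BC;
    IoCost c = { bc, bc, 0 };
    return c;
}

/* DMA: bc_per_transfer is 2 for dual address, 1 for implicit address. */
IoCost cost_dma(int bc_per_transfer) {
    IoCost c = { DMA_PROCESSOR_BC,
                 DMA_PROCESSOR_BC + (long long)N_BYTES * bc_per_transfer,
                 2LL * N_BYTES };
    return c;
}
```

Here cost_dma(2) gives 2085 bc plus 2000 bac of bus time for dual address, and cost_dma(1) gives 1085 bc plus 2000 bac for implicit address, against 25,080 bc for interrupts and 400,000,080 bc for polling.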

Performance Comparison
 Assume 3-wire arbitration, where arbitration time is comparable to (or faster than) a memory
cycle (i.e. 1 bac = 1 bc)

 Actual transfer time in all cases is slightly more than 10s


 Changing bus cycle time from 25 ns to 250 ns will reduce the number of bus cycles required for
polling by a factor of 10, while the others stay the same

 Perspective is everything:
o Device sees tight polling as the best option because it provides fastest response time
o Processor may see best performance with interrupts, since processor performs fewer
instructions when interrupts are used, so it has more time to complete other tasks
o Overall cost/complexity may be better with polling, since it is less complex and so may
cost less
 Device perspective:
o Latency – delay between data being available and data being read (by DMAC or CPU)
o Transfer rate – how quickly data is transferred out of buffer
o Effective transfer rate – highest sustained rate of transfer
 CPU perspective:
o How much processing and bus time is needed for the transfer?
o Device latency – delay between making request and first data being available
o Transfer rate – delay between subsequent transfers
o Effective transfer rate – highest sustained rate of transfer
