Microprocessor Applications
Syllabus :
Microprocessor architecture, Organisation & operation of microcomputer systems. Hardware and software interaction. Programme and data storage. Parallel interfacing and programmable ICs. Serial interfacing, standards and protocols. Analogue interfacing. Interrupts and DMA. Microcontrollers and small embedded systems. The CPU, memory and the operating system.
Assessment:
Written Assignment: 20%, Week 10 (March)
Unseen Examination: 80%, June
Reading List
Alan Clements. 2000. The Principles of Computer Hardware, Oxford, 3rd edition. (A number are available for loan from the Engineering & Design Department Office.)
For assessment exercise: various manufacturers' microprocessor and microcontroller datasheets and user documentation, downloadable from the internet.
For the lecture notes I have used the above Clements plus the others below. Note: these are not recommended for buying.
- The 68000 Microprocessor Family, M.A. Miller, 1992, MacMillan
- Digital Fundamentals, Floyd, 2006, Pearson International
- Computer Engineering Hardware Design, M. Mano, Prentice Hall
- Microcomputer Interfacing, H. Stone, Addison Wesley
- Various datasheets from the web, e.g. 68HC000 (the CMOS version of the 68000)
1. Review Architecture & Programming of Microcomputer Systems
2. Programme and Data Storage
3. Parallel Input & Output Peripheral Devices
4. Interrupts
5. Serial Input and Output
6. Analogue I/O
7. Microcontrollers for Small Embedded Systems
8. CPU, Memory, and the Operating System
- Principal function(s) controlled by a microprocessor embedded within it
- Computer (microprocessor or microcontroller) hidden from view
- Purpose-designed for a particular application (a PC is really a general purpose computing machine rather than an embedded system)
- Embedded computer takes input variables from the controlled system
- Computes output variables to control the system
- Sometimes autonomous, sometimes interacting with a user, sometimes interacting with other systems
In this course the 68000 or 68HC000 processor is used to demonstrate aspects of the device hardware interface and software device access.
Address Register Indirect:
MOVE.L (A1),D0      32 bits from memory pointed to by A1 -> D0
Address Register Indirect with Displacement: MOVE.<size> displacement16(An),Dn
MOVE.W $4(A0),D2    D2 <- memory at location given by (contents of A0 + 4)
Absolute:
MOVE.L $C02E,D5     loads D5 with the 32-bit data word from location $FFC02E (the 16-bit address is sign-extended)
Immediate:
MOVE.L #$30,D2      loads D2 with immediate data (fills with leading zeros)
Address Register Indirect with Predecrement/Postincrement, e.g.:
MOVE.B -(A3),D3
MOVE.L (A0)+,D4
MOVE.W D4,(A2)+
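The effect of the postincrement and predecrement modes on the address register can be sketched with a toy model. This is only an illustration in Python, not 68000 code: the class name Mini68k and its methods are hypothetical, and it assumes big-endian byte order in memory (as on the 68000) and a single address register.

```python
class Mini68k:
    """Toy model (hypothetical): byte-addressed memory plus one address register A0."""

    def __init__(self, memory, a0):
        self.mem = bytearray(memory)
        self.a0 = a0

    def read_postinc(self, size):
        """Model MOVE.<size> (A0)+,Dn: read the operand, then add its size to A0."""
        value = int.from_bytes(self.mem[self.a0:self.a0 + size], "big")
        self.a0 += size
        return value

    def read_predec(self, size):
        """Model MOVE.<size> -(A0),Dn: subtract the size from A0, then read."""
        self.a0 -= size
        return int.from_bytes(self.mem[self.a0:self.a0 + size], "big")


cpu = Mini68k(bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC]), a0=0)
first_long = cpu.read_postinc(4)   # long word at address 0, A0 moves on by 4
next_word = cpu.read_postinc(2)    # word at address 4, A0 moves on by 2
back_word = cpu.read_predec(2)     # A0 steps back by 2 and the same word is read again
```

Postincrement suits stepping forward through a table; predecrement is the mirror image and is what a stack push onto -(A7) relies on.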
C) Logical Instructions
ASL   Arithmetic shift left (lsb <- 0)
ASR   Arithmetic shift right (old msb -> msb)
LSR   Logical shift right (0 -> msb)
AND   Logical AND
OR    Logical OR
NOT   All bits complemented (0 <-> 1)
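The difference between the three shifts is easiest to see on a concrete bit pattern. The following is a minimal Python sketch for a byte-sized operand; the function names are illustrative only, and the carry/extend flags that the real instructions also update are ignored.

```python
def asl(value, bits=8):
    """Arithmetic shift left: a 0 enters the lsb, the old msb is shifted out."""
    return (value << 1) & ((1 << bits) - 1)


def asr(value, bits=8):
    """Arithmetic shift right: the old msb is copied back into the msb (sign kept)."""
    msb = value & (1 << (bits - 1))
    return (value >> 1) | msb


def lsr(value):
    """Logical shift right: a 0 enters the msb."""
    return value >> 1


def logical_not(value, bits=8):
    """NOT: every bit complemented."""
    return ~value & ((1 << bits) - 1)


pattern = 0b10010110
shifted_left = asl(pattern)     # 0b00101100
arith_right = asr(pattern)      # 0b11001011
logic_right = lsr(pattern)      # 0b01001011
```

ASR preserves the sign of a two's complement number (a divide by 2), whereas LSR treats the operand as unsigned.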
Programme instructions make intensive use of the 8 data registers and 7 address registers in the CPU to hold intermediate data products or temporary variables in the course of processing data to/from the external world via external devices.
[ Also a fifth type: Direct Memory Access, I/O <-> Memory, but this requires a special DMA bus controller (see later) ]
Summary of the 68000's 64 connection pins:
Vcc: voltage source (e.g. 5 volts above Vss)
Vss: ground
Clock: system clock input
Buses:
D0 to D15: data bus lines, bidirectional
A1 to A23: address bus lines, O/P
Main control lines:
AS: address strobe, valid address on A1-A23, O/P
R/W: direction of D0-D15 bus, 1=read, 0=write, O/P
UDS, LDS: upper/lower data strobe, O/P [effectively A0: maps 8-bit wide memories to 16 bits]
DTACK: data transfer acknowledge, I/P; slower external devices can cause the CPU to wait
RESET: resets CPU programme counter, I/P
HALT: halts operation (I/P) or indicates failure (O/P)
IPL0-2: interrupt request lines, I/P
Others:
BR, BG, BGACK: for external DMA control of the bus
FC0-FC2: monitor programme, data, interrupt, O/P
E, VMA, VPA, BERR: extra signals for interfacing
Main aspects:
- FC0-FC2=010 indicates program opcode fetch (alternatively 001 for data)
- Valid address on A1-A23; UDS,LDS=00 means 16-bit read (10 = D0-D7 only, 01 = D8-D15 only)
- Address strobe, AS, allows the address bus to be decoded for memory chip select
- R/~W stays high throughout as this is a read operation
- External device/address decode asserts ~DTACK as data is placed on the bus
- If the memory device is slow, ~DTACK assertion can be delayed to provide wait states
- D0-D15 data bus receives valid data from the addressed memory before AS returns
- The 68000 uses a 2-word prefetch, absorbing program fetch cycles within execution cycles
System Design
1) Before designing the system decide on requirements:
- Amount of programme memory (ROM)
- Amount of read/write data memory (RAM)
- Number & type of I/O ports
- Other system and peripheral components as needed
2) Software must be considered.
3) Then individual component types are chosen, considering their characteristics (timing, voltage levels, etc.) & requirements.
4) Circuit wiring, board design & board layout are completed.
- Types of memory device
- Connecting memory to the processor
- Memory device address decoding
Types of Memory
Random Access Memory, RAM (data volatile- lost on power off)
RAM is used for data; it can be written to and read from.
Static RAM (SRAM): each bit stored in a simple circuit of a few transistors, e.g. a flip-flop.
Dynamic RAM (DRAM): each bit stored as charge on a single transistor gate, but needs refresh circuitry as the gate is a leaky capacitor and data is lost otherwise.
SRAM: faster, takes more power, less dense, expensive, but easy to use.
DRAM: simpler, lower power, cheaper, but requires extra refresh control and is more complex to use.
Read Only Memory, ROM (data non-volatile, remains after power cycling)
ROM data remains after power off.
Mask programmed: custom written at manufacture, e.g. PC boot-up programme.
PROM: semi-custom, written only once to the chip by specialist equipment/company; data 0/1 stored as fuses blown/unblown, or as OTP (see below).
EPROM: user programmed by an EPROM programmer. Data stored as charge on high impedance gates; can be erased by ultra-violet light through a window in the chip and reprogrammed.
One time programmable (OTP): version of the EPROM chip without the window.
EEPROM: similar to EPROM but erased electrically without being removed from the circuit. Erased in blocks of memory; in-system programmable.
Flash memory: similar but simpler, very dense memory (silicon hard disc).
FRAM: access as fast as RAM but data non-volatile.
Address Decoding
a) General Address Decoding
Chip selected by a specific combination of higher address line values.
b) Linear Address Decoding
Each chip select uses a dedicated address line. Simple for small systems, but wasteful and can lead to bus contention (more than one device selected at once, e.g. A11 and A12 must not both be 1).
c) Full Address Decoding
Logic is used to provide the maximum number of chip selects from the address lines. E.g. two address lines A11 and A12 have four possible combinations (00, 01, 10, 11), each decoded for a separate chip select.
When the chip is not enabled: all 8 outputs are high, independent of the A inputs.
When the chip is enabled (~E1,~E2,E3=001): only one output goes low, the rest stay high.
Inputs A1,A2,A3 select which of the 8 outputs goes low.
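The decoder behaviour described above can be written as a small truth-table function. This is a sketch in Python rather than hardware; the function name decoder_3to8 is illustrative, and the three select inputs are passed as a single number 0-7.

```python
def decoder_3to8(select, e1_n, e2_n, e3):
    """Return the 8 active-low outputs of a 3-to-8 line decoder.

    select     : value 0-7 formed by the three A inputs
    e1_n, e2_n : active-low enables (0 = asserted)
    e3         : active-high enable (1 = asserted)
    """
    enabled = e1_n == 0 and e2_n == 0 and e3 == 1
    # Only the selected output goes low, and only when the chip is enabled.
    return [0 if (enabled and i == select) else 1 for i in range(8)]


selected = decoder_3to8(5, 0, 0, 1)   # output 5 low, the rest high
disabled = decoder_3to8(5, 1, 0, 1)   # chip not enabled: all outputs high
```

Each active-low output can drive the ~CE pin of one memory or peripheral chip, so at most one device is ever selected, which is what avoids bus contention.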
Two examples:
Intel 27210: 64k x 16, 16-bit word size
Data: O0-O15
Address: A0-A15
~CE is chip select
~OE is output enable (read data from ROM)
~PGM is for programming data into ROM
Another Device: 20000H
Connecting RAM
Addition of two 32K x 8 RAMs to the previous slide (the two ROMs of the last slide are not shown for clarity).
Again paired for 16 bits wide.
ROM: A16-A18=000
RAM: A16-A18=001
R/~W is now needed.
~DTACK is asserted as long as either ROM or RAM is accessed.
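The ROM/RAM selection on A16-A18 can be sketched as a software model of the decode logic. This Python sketch assumes that address lines above A18 are ignored and that even byte addresses use the upper data strobe (D8-D15) while odd addresses use the lower one (D0-D7); the function name chip_select is illustrative.

```python
def chip_select(address):
    """Model the decode of A16-A18 for the ROM and RAM pairs.

    Returns (device, byte_lane, dtack) where dtack is True when
    ~DTACK would be asserted (ROM or RAM accessed).
    """
    block = (address >> 16) & 0b111      # A16-A18
    if block == 0b000:
        device = "ROM"
    elif block == 0b001:
        device = "RAM"
    else:
        device = None                    # some other device, or unused
    # A0 is not a pin on the 68000; it appears as UDS (even) / LDS (odd).
    byte_lane = "upper" if address & 1 == 0 else "lower"
    dtack = device is not None
    return device, byte_lane, dtack


rom_access = chip_select(0x000100)   # ROM, upper byte
ram_access = chip_select(0x010001)   # RAM, lower byte
other = chip_select(0x020000)        # neither ROM nor RAM
```

With this decode each pair of chips occupies a 64 kbyte block: ROM from 00000H, RAM from 10000H, and the next block starts at 20000H.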
- Buffers and latches
- Example input and output devices
- Programmable I/O devices
- Counter-timers
D-type latches are used to sample and hold (latch) data.
Data on the bus, e.g. from the microprocessor, is sent to the output port.
Address decoding for the port enables the latch, latching data at the end of the pulse.
The port outputs the new data until the next time the port is addressed (OUTPUT).
(~LE = logical OR of ~Port CS, ~Bus Write)
Keyboard:
1) Key pressed -> CA1 high. Keyboard tells the PIA that data is ready to be read; CA1 is used to strobe data into the PIA data register.
2) PIA acknowledges the keyboard by setting CA2 high and at the same time tells the uP by setting IRQA low.
3) Microprocessor services the interrupt request and reads the PIA data register.
4) The act of reading resets IRQA high and resets CA2 low, telling the keyboard that it is ready for more data.
Rows are scanned with a travelling 0 on the output port until a keypress causes the input to read < 11111111. The key is then identified by the combination of the known position of the 0 on the O/P port and the measured position of the 0 on the I/P port.
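The travelling-0 scan can be sketched as follows. This is a Python illustration assuming an 8 x 8 key matrix; make_keypad is a hypothetical stand-in for the real hardware (it returns a function that plays the part of "write the row pattern, then read the input port"), and debouncing is ignored.

```python
def scan_keypad(read_columns, rows=8):
    """Drive a travelling 0 across the rows; return (row, col) of a pressed key."""
    for row in range(rows):
        row_pattern = 0xFF & ~(1 << row)        # a single 0 on the output port
        cols = read_columns(row_pattern) & 0xFF
        if cols != 0xFF:                        # input < 11111111: a key is down
            for col in range(8):
                if not cols & (1 << col):       # position of the 0 on the input
                    return row, col
    return None


def make_keypad(pressed):
    """Hypothetical hardware model: pressed is a (row, col) pair or None."""
    def read_columns(row_pattern):
        if pressed is not None:
            row, col = pressed
            if not row_pattern & (1 << row):    # the key's row is being driven low
                return 0xFF & ~(1 << col)
        return 0xFF                             # no key seen on this row
    return read_columns


found = scan_keypad(make_keypad((2, 5)))
nothing = scan_keypad(make_keypad(None))
```

The row number comes from where the programme put the 0 and the column number from where the 0 was read back, so 16 port lines are enough for 64 keys.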
Counter-Timer Chips
This example has three separate counter-timers. Each has a clock input, a gate input, and an output:
a) Clock can be supplied from the microprocessor clock, or by an external system.
b) Gate is a signal that enables/disables counting.
c) The output is changed when the counter reaches a preset value, or counts down to 0.
Uses:
- Output can be used as an interrupt to the uP
- Enables accurate time delays to be generated under software control
- Multi-mode, configured by software
- Used for delays instead of software timing loops, which frees up the uP to do other tasks
- Can be used to count external events
- Watchdog timer: unless software reloads the counter before an initial long count value reaches zero, the timer resets the system. This checks against endless-loop type software hang-ups and ensures continued operation of essential systems.
PSEUDO-PROGRAMME:
FOR i=1 to 128
    move data from table to port
    wait a fixed time
END FOR
        MOVE.W  #deloop,D2   ;set up delay loop time
Loop2   SUBQ.W  #1,D2        ;decrement loop time
        BNE     Loop2        ;wait for loop time
        RTS                  ;return from subroutine
        ORG     $002000
Table   DS.B    128
PSEUDO-PROGRAMME:
FOR i=1 to 128
    get data from table
    wait until port ready
    output data
END FOR
;D0 <- memory([A0]); [A0] <- [A0]+1
;Read status
;mask off all but ready bit
;wait for port ready
;Output data to peripheral
;decrement loop count
;repeat for all 128 data
Table
Need Interrupts...
OUTPUT  EQU     $008000             ;Location of O/P port
        :
        ORG     $000400             ;Start of programme
INT     MOVEM.L D0-D7/A0-A6,-(A7)   ;Save environment (general for subroutines)
        MOVEA.L POINTER,A0          ;Point A0 to buffer
        MOVE.B  (A0)+,D0            ;Read a byte from buffer
        MOVE.B  D0,OUTPUT           ;Send to O/P port
        MOVE.L  A0,POINTER          ;Save updated pointer
        MOVEM.L (A7)+,D0-D7/A0-A6   ;Restore environment
        RTE                         ;Return from interrupt
        :
        ORG     $002000             ;Data origin
BUFFER  DS.B    1024                ;Reserve 1024 bytes
POINTER DC.L    BUFFER              ;Reserve long word
In the previous example (of the last 2 slides), to obtain regular, slow, timed outputs the interrupt could be generated by the output of a software pre-programmed timer/counter chip connected to a processor interrupt line.
Priority Encoder:
converts IRQ1-IRQ7 to three bits IPL0-IPL2
CPU:
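gets the interrupt vector (the address of the service routine) from the memory location pointed to by the peripheral's vector number x 4, e.g. vector number 40H gives address 100H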
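The two steps above (priority encoding, then vector look-up) can be sketched numerically. This Python sketch works with logical values only and ignores the active-low polarity of the real IPL pins; the function names are illustrative.

```python
def priority_encode(pending):
    """Return the highest pending request level among IRQ1-IRQ7 (0 if none)."""
    return max(pending, default=0)


def ipl_bits(level):
    """Three-bit code (IPL2, IPL1, IPL0) for a level 0-7, as logical values."""
    return ((level >> 2) & 1, (level >> 1) & 1, level & 1)


def vector_address(vector_number):
    """Each vector is a 32-bit address, so it occupies 4 bytes of the table."""
    return vector_number * 4


level = priority_encode({2, 5, 3})   # IRQ5 wins over IRQ2 and IRQ3
code = ipl_bits(level)               # (1, 0, 1)
address = vector_address(0x40)       # 0x100
```

The long word stored at that address is loaded into the programme counter, which is how the CPU reaches the correct interrupt service routine.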
Serial Interfaces
Serial transmission can be:
i) Asynchronous (e.g. traditional PC COM1 port)
ii) Synchronous (e.g. USB port)
The UART chip performs parallel-to-serial conversion on data sent from the CPU and serial-to-parallel conversion on data received by the CPU. The mechanism is a shift register, which shifts out the bits of a data byte (or character) one at a time.
Bit-Serial Data
Non-Return to Zero (NRZ), quiescent level = 1.
Bit serial data is framed by start and stop bits, with an optional parity bit for error checking.
E.g. if 7-bit character data (ASCII) then up to 11 bits are required per character.
The above example is of transmitting the character R, which in ASCII is 52 hex (1010010b): seven-bit data (8th, most significant, bit discarded).
Character rate = bit rate / bits per character
Bit rate = baud rate
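The framing of the character R can be reproduced with a short Python sketch. It assumes the usual asynchronous convention of sending the least significant data bit first, a start bit of 0, and stop bits of 1; the choice of even parity and two stop bits is only an example giving the 11-bit maximum, and the function names are illustrative.

```python
def frame_character(char, data_bits=7, parity="even", stop_bits=2):
    """Return the list of line levels for one asynchronous character frame."""
    code = ord(char) & ((1 << data_bits) - 1)            # discard higher bits
    data = [(code >> i) & 1 for i in range(data_bits)]   # LSB first (assumed)
    bits = [0] + data                                    # start bit is 0, line idles at 1
    if parity in ("even", "odd"):
        ones = sum(data)
        # Even parity makes the total number of 1s (data + parity) even.
        bits.append(ones % 2 if parity == "even" else 1 - ones % 2)
    bits += [1] * stop_bits                              # stop bits return the line to 1
    return bits


def characters_per_second(bit_rate, bits_per_character):
    """Character rate = bit rate / bits per character."""
    return bit_rate / bits_per_character


frame = frame_character("R")                 # 11 bits in total
rate = characters_per_second(1100, len(frame))
```

With 11 bits per character only 7 of every 11 transmitted bits carry data, which is the overhead paid for not sending a separate clock.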
Data Link can be: 1) Simplex ( one way data transfer ) 2) Half Duplex ( two way data transfer, but only one at a time ) 3) (Full) Duplex ( two way data transfer simultaneously )
Extra signals needed for handshake with external serial devices:
RTS: Request to send. Computer asks the modem if it is ready for data operations.
CTS: Clear to send. In response to RTS the modem tells the computer that data can be sent.
DCD: Data carrier detect. Modem tells the computer that it is receiving a carrier tone on the telephone line.
Receive a Character Subroutine:
RDRF    EQU     0               ;RX data ready bit 0 of SR
SR      EQU     0               ;Status register offset
DR      EQU     2               ;Data register offset
        LEA     ACIA,A0         ;A0 points to ACIA
POLL    BTST.B  #RDRF,SR(A0)    ;Read RX status bit
        BEQ     POLL            ;Repeat until char received
        MOVE.B  DR(A0),D0       ;Get input from ACIA to D0
        RTS
Transmit a Character Subroutine:
TDRE    EQU     1               ;Transmitter data register empty bit
        LEA     ACIA,A0         ;A0 points to ACIA
TPOLL   BTST.B  #TDRE,SR(A0)    ;TX register empty?
        BEQ     TPOLL           ;Repeat until ready to transmit
        MOVE.B  D0,DR(A0)       ;Move byte from D0 to ACIA
        RTS
RS423:
RS422:
RS485:
Many resistors needed: 2^n, where n = number of bits, but all of the same value.
a) Simple Explanation:
Each bit, if logic 1, connects its resistor to the circuit.
R values vary as 2:1 from bit to bit, with the MSB having the lowest value of R: low R passes the highest current and has the most effect on the output voltage.
Disadvantage: need to have many different, precise R values.
b) Detailed Explanation:
The amplifier -ve input is a virtual ground, since the feedback resistor from Vout holds the input at 0 volts.
The high impedance amplifier input takes zero input current, so all currents I0, I1, etc. must pass through the feedback resistor, Rf.
Total current in the feedback resistor: If = b0 I0 + b1 I1 + b2 I2 + ... (with each bit bn = 0 or 1)
So the final analogue output voltage is Vout = -If Rf (negative because the amplifier is inverting).
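The summing of the bit currents can be checked numerically. This Python sketch assumes an ideal inverting amplifier, a 4-bit converter, a 5 V logic 1 level, and component values chosen only for illustration (MSB resistor equal to Rf); none of these values come from the slides.

```python
def weighted_dac_vout(code, n_bits=4, v_ref=5.0, r_msb=1000.0, r_f=1000.0):
    """Output voltage of an ideal binary-weighted DAC.

    The MSB uses the lowest resistor (r_msb); each less significant bit
    uses twice the resistance of the bit above it.
    """
    i_f = 0.0
    for n in range(n_bits):
        if (code >> n) & 1:                        # bit bn = 1 connects its resistor
            r_n = r_msb * 2 ** (n_bits - 1 - n)
            i_f += v_ref / r_n                     # current into the virtual ground
    return -i_f * r_f                              # inverting amplifier


msb_only = weighted_dac_vout(0b1000)    # about -5 V
next_bit = weighted_dac_vout(0b0100)    # about -2.5 V
full_scale = weighted_dac_vout(0b1111)  # about -9.375 V
```

Each bit contributes half the voltage of the bit above it, so any error in the MSB resistor has the largest effect on the output, which is why precise resistor values are needed.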
Example: data value 1000 (D3=1). 5V is applied via 2R to the input, which is held at 0V by feedback; all the current flows through Rf (=2R), so Vout must be -5V. The lumped value, Req, of the other resistors is not critical as no current passes through Req.
Thevenin's Theorem: any circuit can be reduced to an equivalent voltage source in series with an equivalent resistor. Applying the theorem to the left of R8 we have Vth=1.25V and Rth=R. Again the voltage across R7=0, so 1.25V through 2R to the input requires Vout to be -1.25V to keep the input at zero voltage (virtual earth).
a) Square wave, period T (without DAC):
Loop1   output 0 on port pin
        wait T/2
        output 1 on port pin
        wait T/2
        branch to Loop1 (repeat forever)
b) Sawtooth ramp, period T (with 8-bit DAC):
        initialise D0=0
Loop2   output D0 to DAC
        increment D0
        wait T/256
        branch to Loop2 (repeat forever)
c) Triangular wave, period T (with 8-bit DAC):
        initialise D0=0
Loop3   output D0 to DAC
        increment D0
        wait T/512
        compare D0 to #255
        branch to Loop3 if not equal
Loop4   output D0 to DAC
        decrement D0
        wait T/512
        compare D0 to #0
        branch to Loop4 if not equal
        branch to Loop3 (repeat forever)
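The sample sequences that the sawtooth and triangular loops send to the DAC can be generated off-line to check the wave shapes. This Python sketch leaves out the wait steps and simply lists successive 8-bit DAC values; the function names are illustrative.

```python
def sawtooth_samples(count):
    """Successive DAC values for the sawtooth loop (8-bit register wraps 255 -> 0)."""
    d0 = 0
    samples = []
    for _ in range(count):
        samples.append(d0)            # output D0 to DAC
        d0 = (d0 + 1) & 0xFF          # increment D0, byte-sized
    return samples


def triangle_samples(count):
    """Successive DAC values for the triangular loop (up to 255, then back to 0)."""
    d0 = 0
    step = 1
    samples = []
    for _ in range(count):
        samples.append(d0)            # output D0 to DAC
        d0 += step
        if d0 == 255:                 # end of the rising loop
            step = -1
        elif d0 == 0:                 # end of the falling loop
            step = 1
    return samples


saw = sawtooth_samples(258)
tri = triangle_samples(512)
```

The triangle repeats every 510 outputs (255 up, 255 down), so a wait of T/512 per step gives a period close to T.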
Accuracy
Ideally equal to the resolution, but in practice less because of the limited accuracy of the resistors.
Linearity
Linear error is deviation from ideal straight line Vout= constant x digital value
Settling Time
Time taken for analogue output value to reach a new value in response to a change in the digital input - depends on RC time constants , internal & external capacitance.
Example of 4-bit ADC:
1) Microprocessor sends SC (start conversion) to the control logic
2) Control logic within the ADC outputs a digital value to a DAC
3) The analogue input is compared with the DAC output
4) Control logic tries each bit in turn, starting at the MSB (decision tree: only four decisions, red lines)
5) After 4 successive approximations the ADC sends an end of conversion (EOC) signal to the microprocessor
6) Microprocessor reads the digital value
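The successive approximation steps can be sketched as a short loop. This Python model assumes an ideal internal DAC with a full-scale reference of 5 V and a comparator that keeps a bit when the input is at or above the trial DAC voltage; the function name and default values are illustrative.

```python
def sar_adc(v_in, v_ref=5.0, n_bits=4):
    """Successive approximation: try each bit in turn, starting at the MSB."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)                  # tentatively set this bit
        v_dac = v_ref * trial / (1 << n_bits)      # ideal internal DAC output
        if v_in >= v_dac:                          # comparator decision
            code = trial                           # keep the bit
    return code                                    # n_bits decisions were made


reading = sar_adc(3.2)   # decisions: keep 8, drop 4, keep 2, drop 1 -> 10
```

Only n comparisons are needed for an n-bit result, compared with up to 2^n steps for a simple counting (ramp) converter.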
x(n) = sampled analogue waveform, a_n = weights (coefficients, or scaling factors), z^-1 = unit time delay = 1 sample period
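These symbols are the ingredients of a digital filter built from delayed, weighted samples. Assuming the structure intended is the usual finite impulse response form, y(n) = a_0 x(n) + a_1 x(n-1) + a_2 x(n-2) + ..., a minimal Python sketch is given below; the function name and the example weights are illustrative, and samples before the start of the input are taken as zero.

```python
def fir_filter(x, a):
    """y(n) = sum over k of a[k] * x(n - k), with x taken as 0 before the first sample."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, a_k in enumerate(a):
            if n - k >= 0:                 # x(n-k) has passed through k unit delays
                acc += a_k * x[n - k]
        y.append(acc)
    return y


impulse_response = fir_filter([1, 0, 0, 0], [0.5, 0.3, 0.2])   # returns the weights
smoothed = fir_filter([4, 4, 4, 4], [0.25] * 4)                # 4-point moving average
```

Feeding in a single unit sample returns the weights themselves, which is a quick way to check the filter; equal weights give a moving average that smooths out short spikes.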
Computer Architectures:
Most microprocessors use the von Neumann architecture, as Harvard would need many more pins to access two external buses. However, the more processing-efficient Harvard architecture, with two buses, is easily implemented internally within a microcontroller.
8051 chip includes:
- central processor
- ROM & RAM
- 3 counter/timers
- 4 parallel ports
- 1 serial port
Requires only a crystal for the clock and Vcc.
Ports can be used to expand the ROM & RAM bus externally.
8bit microcontrollers:
Some Microcontrollers
Very many types; manufacturers produce various versions with different facilities, e.g.:
a) Speed: reduced to 1 clock/instruction
b) Memory: data RAM up to 2 kbyte; programme ROM up to 128 kbytes+; types: EPROM, Flash, EEPROM
c) Communications: RS232, I2C, 1-wire, CAN bus, Ethernet, etc.
d) Peripheral drivers: LCD, etc.
16bit microcontrollers:
Recent progress in processor speeds has been curbed by power dissipation problems. Every transistor switching action dissipates power, I*V. Very many transistors at any one time are often doing nothing but still dissipate heat!
Over-clocking, say from 3GHz to 3.6GHz, is possible beyond the manufacturer's specifications, but running hot reduces component lifetime (plan for redundancy: use a 10-year guarantee computer for only 4 years).
Supercomputers have similar problems. Gallium Arsenide logic + exotic components + liquid cooling -> faster clocks.
Supercomputer progress now comes only through vastly parallel machines with many processors (groups of 1000s of PCs): massively parallel hardware and applications.
Desktop PCs have co-opted the supercomputer model: dual core, quad core, 2 to 4 CPU cores on a single die.
Typically 2 CPUs, each with its own L1 cache, share a single L2 cache that accesses a single external bus to off-chip memory (cache memory: see later).
But typically only ~1/3 of presently written PC programs can be parallelised: only ~50% speedup going from single to dual core (not the expected 100%; Amdahl's law).
The single bus from L2 to off-chip external RAM is still a bus bottleneck.
Programs are only fast as long as they work from L1, but L1 size is typically only 32-64 kbytes (small programs!); L2 is typically 4 Mbytes.
Multi-core (2-4) use is presently optimised by the operating system running multiple programs written for a single core. Primarily for speeding up multi-tasking and multiple threads.
Moving towards many-cores (>>4): many-cores really need to be programmed specifically for parallel processors, which needs effective parallel languages and auto-parallelising compilers. Active research area.
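Amdahl's law can be evaluated directly to see why doubling the cores does not double the speed. This Python sketch uses the standard form of the law, speedup = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel and N is the number of cores; the function name is illustrative.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallel fraction and core count."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


two_thirds_dual = amdahl_speedup(2 / 3, 2)    # 1.5, i.e. a 50% speedup
one_third_dual = amdahl_speedup(1 / 3, 2)     # 1.2, i.e. a 20% speedup
half_many = amdahl_speedup(0.5, 1000)         # just under 2, however many cores
```

Under this simple model a 50% gain on two cores corresponds to about two thirds of the work being parallel, while a programme that is only one third parallel gains about 20%. In either case the serial part sets a ceiling that extra cores cannot remove.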
Overview: 9 processor elements on a single chip:
- 1 x 64-bit PowerPC processor element (PPE), optimised for operating system/control
- 8 x Synergistic processor elements (SPE), optimised for compute intensive applications
PPE accesses main storage via load/store to a private register file. Operating system neutral.
SPE accesses main storage via DMA to local memory for data & programme.
See: http://www-01.ibm.com/chips/techlib/techlib.nsf/products/Cell_Broadband_Engine
Xera flop    10^27
Yotta flop   10^24
Zetta flop   10^21
Exa flop     10^18
Peta flop    10^15
Tera flop    10^12
Giga flop    10^9
Mega flop    10^6
Kilo flop    10^3
Today:
1) Top of the range PC, quad-core: 30 Gigaflops
2) Roadrunner supercomputer (~130k cores = 13k Cell processors [9 cores each] + 7k AMD dual core): 1.7 Petaflops
FPGAs can be re-configured in the application to provide various functionalities, e.g. mobile phone, GPS receiver, etc. FPGA cores often operate at a lower voltage than the data buses, and buses are often of mixed voltage, so there is a need to convert logic levels between buses.
Part 8.
for
Fast I/O
Memory
DMA modes:
Generous!
Reasonable
Greedy!
Cache memory: local fast memory used to hold pre-fetched operation codes.
Speed-up depends on:
(i) ratio of cache memory speed to main memory speed,
(ii) how often the op-code is already in the cache (a hit),
(iii) average number of machine/clock cycles per instruction.
Cache controller needs to:
(a) look ahead in the programme to fetch instructions,
(b) keep address tags of instructions to identify hits.
Note that the 68000 uses a standard simple 2-word pre-fetch, absorbing some program op-code fetch cycles within execution cycles, as it takes many clock cycles per instruction.
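The dependence of the speed-up on factors (i) and (ii) can be illustrated with the usual simple model of average access time, t_avg = h x t_cache + (1 - h) x t_main, where h is the hit ratio. This is a textbook approximation rather than something stated on the slides, and the timings used below are made-up example values.

```python
def average_access_time(hit_ratio, t_cache, t_main):
    """Average fetch time when a fraction hit_ratio of fetches are served by the cache."""
    return hit_ratio * t_cache + (1.0 - hit_ratio) * t_main


def fetch_speedup(hit_ratio, t_cache, t_main):
    """Speed-up of op-code fetching relative to always using main memory."""
    return t_main / average_access_time(hit_ratio, t_cache, t_main)


# Example values only: 10 ns cache, 100 ns main memory, 90% hits.
t_avg = average_access_time(0.9, 10.0, 100.0)    # about 19 ns
gain = fetch_speedup(0.9, 10.0, 100.0)           # about 5.3 times faster
```

Even with a cache ten times faster than main memory, the misses dominate: a 90% hit rate gives only about a five-fold gain, so the hit rate matters as much as the raw cache speed.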
Operating System
Overall OS: Co-ordinates, optimises efficiency, schedules tasks (processes). Applications use resources provided by OS OS hides details of the hardware.
Task Scheduling: each process is in one of three states:
- Runnable: available & waiting
- Running: running now
- Blocked: waiting for an essential resource to become available
Multi-Tasking
The OS safely switches contexts between processes. The scheduler saves the current process's context (volatile portion) and invokes a new process. The present state of each process must be saved at the end of running it. The previous state of each process must be restored at the start of running it again. Separate stack areas are used for each process to save the register status.
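The save/restore cycle can be sketched with a toy round-robin scheduler. This Python model is only illustrative: the Process class, the fixed time slice and the use of a single "pc" value to stand for the whole register context are all assumptions made for the example.

```python
from collections import deque


class Process:
    """Toy process: 'work' steps to do, and a saved context holding its progress."""

    def __init__(self, name, work):
        self.name = name
        self.work = work
        self.context = {"pc": 0}       # stands in for the saved registers


def run_round_robin(processes, time_slice):
    """Run each process for up to time_slice steps in turn until all finish."""
    ready = deque(processes)
    trace = []
    while ready:
        proc = ready.popleft()
        pc = proc.context["pc"]                    # restore the previous state
        pc += min(time_slice, proc.work - pc)      # run for one time slice
        proc.context["pc"] = pc                    # save the present state
        trace.append((proc.name, pc))
        if pc < proc.work:
            ready.append(proc)                     # still runnable: back of the queue
    return trace


history = run_round_robin([Process("A", 3), Process("B", 5)], time_slice=2)
```

Because each process keeps its own saved context, it resumes exactly where it left off, which is the reason for giving every process its own stack area.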
I2C,
Fast 2-wire bus, up to 400kbits/s
1-wire,
Single wire used by master to communicate with slaves, also used to power slave devices. Economic in hardware resources. Ideal for short distances.
CANbus
Each byte is transmitted as Non-Return to Zero (NRZ), asynchronous, with start & stop bits (like RS-232). Balanced 2-wire interface with differential line drivers & receivers in parallel (like RS422/RS485).
I2C Bus
Each device on the bus has a unique address.
Multi-master: more than 1 device can control the bus, with arbitration between contending devices.
Serial 8-bit data.
Two-wire bus shared by all devices: SDA (Serial Data line) and SCL (Serial Clock Line).
1-wire
(Maxim-Dallas)
Device families include: ADC, DAC, Analogue Switches, Memory, Temperature Sensors, etc.
Programming Microprocessors/Microcontrollers
Directly in low level assembler language:
- Slow, tedious, unforgiving; only practical for small systems.
- Timing for critical programme loops and for interfaces can be set precisely from the number of clocks/instruction.
- Memory use/allocation can be easily organised and kept within bounds.
- Better for understanding what is actually happening in fine detail.
- Difficult to appreciate the whole design.
Indirectly via cross compiling from a higher level language, e.g. C:
- Relatively quick and easy to implement in C; often necessary for large systems.
- Difficult to ensure the precise timing of critical parts.
- Care must be taken in memory use and data variable type assignment.
- Easy to appreciate the whole and verify overall design functionality.
In practice the overall system is often written in a higher level language, with some time-critical sub-systems written directly in the relevant assembler language.
Course Summary
Outline the connections for:
a) Simplest linear address decoding, assuming no other devices are on the bus
b) Full address decoding, allowing for system expansion
c) Full address decoding as in (b), but when the above chips are not available and the only chips available are 2k x 8 ROMs and 4k x 8 RAMs
Work out for next week; we'll go over possible solutions in the lecture.
Problem Sheet 2 :
General Review
1) Draw and label the block diagram of a small microprocessor system that might be used in a dedicated application. What is the function of each section?
2) Outline the sequence of operations involved in the execution of a typical instruction by a microprocessor such as a 68000. Provide a labelled timing diagram.
3) Explain the differences between (a) operation-code (programme word) fetch cycle, (b) data memory read cycle, and (c) data memory write cycle.
4) What is meant by a wait state and how does its use affect the microprocessor bus control signals and memory cycle times? What devices initiate wait states?
5) Discuss the operation and use of the programme counter and stack pointer and show how they control the sequence of programme execution.
6) Explain the difference between static and dynamic RAM. What are the different types of non-volatile memory?
7) A particular peripheral chip contains four internal address locations which may be written to or read from. Draw a diagram of the bus connections necessary to connect two such chips to a microprocessor using (a) linear addressing, (b) fully decoded addressing.
8) Explain the operation of data latches and buffers. How are these used for microprocessor input and output ports?
9) What functions can be performed by counter-timer chips and how might these be used?
Problem Sheet 3 :
General Review
1) Explain the time sequence of actions that occur when a subroutine is called from the main programme, with particular reference to the role of the stack pointer.
2) How does an external device interrupt the microprocessor programme? What is meant by interrupt priority and how does the microprocessor control which devices can cause interrupts?
3) Explain the difference between synchronous and asynchronous serial data communication and sketch a typical asynchronous data character. How would you obtain this signal from an 8-bit parallel data byte?
4) How does a Universal Asynchronous Receiver/Transmitter interface with (a) external systems, and (b) its host microprocessor?
5) Explain the operation of three types of digital to analogue convertor (DAC): (a) potential divider network DAC, (b) binary weighted input DAC, and (c) R-2R ladder DAC.
6) With the aid of pseudo-code or flow charts explain microprocessor programme sequences that use an 8-bit DAC to generate the following analogue signal patterns: (a) a square wave whose amplitude is software controlled, (b) a square wave whose period is software controlled, (c) a full amplitude sawtooth wave, and (d) a full amplitude triangular wave.
7) How might a microprocessor utilise a timer/counter chip, an input port, and a DAC to provide an analogue sawtooth wave whose period is controlled by an external digital input?
Problem Sheet 4 :
General Review
1) Explain how the following types of Analogue to Digital Convertor (ADC) operate: (a) successive approximation, (b) flash ADC, and (c) dual slope ADC.
2) What is meant by aliasing error? What does the Nyquist frequency signify? What is the minimum sampling rate required for signals which contain frequency components up to 20kHz?
3) A microprocessor system has two ADC inputs, one DAC output, and an alarm bell operated by a single output port bit. The two ADC inputs monitor the voltage across an electric motor and the current taken by it. Provide the pseudo-code, or flow chart, for a microprocessor programme that outputs an analogue value proportional to the electric power taken by the motor and also rings an alarm whenever the power taken exceeds a preset maximum value. (Hint: power = voltage x current)
4) The motor of question (3) produces many short-term current spikes that cause false alarms. Modify the pseudo-programme/flow chart to reduce the effects of spikes by averaging over 8 successive current samples.
5) Describe a typical microcontroller and highlight its essential features, identifying their uses.
6) Choose a typical microcontroller application and show how the microcontroller is used.
7) What is Direct Memory Access (DMA) and why is it used?
8) What is cache memory and what is its purpose?
9) Provide a brief description of different types of microcontroller network: I2C, CANbus, and 1-wire.
10) Outline the essential features of an operating system. What is meant by multi-tasking?