Microelectronic Design
TUTORIALS
1.1 Introduction
An integrated circuit (IC) is a piece of semiconductor
material, typically about 1 cm across, most commonly silicon,
and often referred to as a chip.
The rate of increase of complexity (number of transistors on a chip) is often expressed as a doubling every 2.2 years (Moore's Law) -
shown diagrammatically over the last 35 years in Graph 1. There is some recent evidence that the increase in complexity is slowing down, the trend becoming
more of a curve than a straight line (see Graph 1).
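As a rough numerical sketch of the doubling relationship N(t) = N₀·2^(t/2.2), with t in years (the 1970 starting count below is an illustrative assumption, not data from Graph 1):

```python
# Moore's Law as stated above: complexity doubles every 2.2 years.
# The starting point (1970, 2,250 transistors) is illustrative only.
def transistors(year, base_year=1970, base_count=2250, doubling_years=2.2):
    """Estimated transistor count per chip in a given year."""
    return base_count * 2 ** ((year - base_year) / doubling_years)

for year in (1970, 1980, 1990, 2000):
    print(year, f"{transistors(year):,.0f}")
```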
Design of a complex VLSI chip is a major task - it can take several man-years. Teams of designers working on different parts of a
chip use Computer Aided Design (CAD) tools to reduce design time, with the objective of ensuring that the design is correct first
time.
Chip specification drawn up by the system user/designer covering all aspects including functionality, speed, power dissipation,
package type, number of package pins, volume, reliability, voltage and current ratings, etc.
Initial specification leads the designer to conclusions on the technology and architecture to be used.
System test considerations taken into account to ensure adequate and economic testing possible. (Design for Test or DFT).
Sub-system design using pre-defined circuits (cells) or specially designed sub-systems if required and allowed within the design
style adopted.
Designs are now usually carried out using a Hardware Description Language (HDL) such as VHDL (Very High Speed IC HDL)
for digital systems, or an analogue HDL for analogue or mixed analogue/digital systems. The advantages are ease of use and a correct-by-
design facility, using a synthesis tool to generate the low-level designs.
● Simulate the circuit using a simulator tool (logic simulation for digital, circuit simulation for analogue) - a toy illustration follows this list.
● Back-annotate the layout to check for corrections and re-simulate using actual layout parameters.
● Silicon fabrication using the masks and thin slices of silicon (wafers), test the wafers and package the chips.
● Production test.
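As a toy illustration of logic simulation (a real simulator handles netlists, timing and waveforms; this two-gate example is purely illustrative):

```python
from itertools import product

# Gate-level models: a two-input NAND, and an inverter built from it.
def nand(a, b):
    return 1 - (a & b)

def inv(a):
    return nand(a, a)

# Exhaustively "simulate" a tiny circuit: y = NAND(a, b), z = INV(y).
for a, b in product((0, 1), repeat=2):
    y = nand(a, b)
    print(f"a={a} b={b} -> nand={y} inverted={inv(y)}")
```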
1.3 Design Considerations - Technology
Two semiconductor materials are in common use: silicon (Si) and gallium arsenide (GaAs).
1.3.1 Silicon
• Two main types of technologies available in silicon based on Bipolar Junction Transistor (BJT) and Metal-Oxide-
Semiconductor Field Effect Transistor (MOSFET).
• Bipolar technology offers high speed, high current drive but at the expense of high power dissipation, low
complexity. It uses n-p-n and p-n-p BJTs, diodes and resistors and is good for analogue and digital circuits.
• MOSFET technology offers moderate speed and power and high complexity, and is based on either Enhancement
(normally off) type n-channel (EnMOS) or, for improved performance, both Enhancement and Depletion type (normally
on) n-channel MOSFETs (EDnMOS). Only suitable for digital circuits.
• p-channel MOSFET technology (pMOS) on its own is no longer offered because of the devices' slow performance
compared to nMOS technology, caused by the lower mobility of the charge carriers (holes).
• Technology based on both p-channel and n-channel enhancement MOSFET known as Complementary Metal-
Oxide Semiconductor (CMOS) popular as it offers very low power dissipation, is suitable for both analogue and digital
applications but at the expense of lower complexity compared to EDnMOS. CMOS is becoming the standard process
for all but the highest speed devices.
• Technology based on both bipolar and CMOS (BiCMOS) can give the best of both technologies but at the
expense of increased fabrication complexity and cost.
1.3.2 Gallium Arsenide (GaAs)
● Based on components which are a variation of field effect transistors known as Metal Semiconductor Field Effect Transistors (MESFETs).
● Originally used in very high frequency ICs but now available for very high speed digital circuits of relatively low complexity.
● Choice of technology will depend on a number of factors given in the specification, including cost, CAD tools, etc.
● Design styles divide broadly into semi-custom and full custom.
Note: In semi-custom design the designer accepts restrictions in order to simplify the design, whereas in full custom the designer is free
to optimise each component to improve performance and reduce chip size. This increases cost and design time enormously.
1.4.1 PLDs
• PLDs are reusable PROM-type memory devices in which the user programmes an array of
transistors/gates to form a given function using an electrical programming device.
• No involvement of IC manufacture.
• Cheap and quick but with little flexibility.
• Range of CAD tools available running on PCs.
• Originally digital only but pure analogue arrays now also available.
1.4.3 FPGA
• Variation of gate array which allows the user to field programme a gate array as in PLD and thus reduce design
time.
• FPGAs less dense than gate arrays and therefore higher unit cost.
• Compatible FPGA/gate array device ranges available to enable initial designs to be made using FPGAs and
transfer easily to gate arrays later if required.
Chip area and floorplan and packaging aspects need to be considered, because chip area usually determines cost and yield
while packaging is of concern to the customer and designer alike.
Power supply details, including operating voltage and current requirements, are required in order to design power track widths,
guard against electromigration, etc.
CAD tools availability, effectiveness and ease of use important as they will form a key part of the design process, including
manufacture.
Test aspects are vital, including the incorporation of Design for Test (DfT) circuitry to enable efficient and effective production
testing to take place in a few seconds.
Test costs are becoming a significant part of total chip costs in many cases - see Graph 2:
Graph 2 - Relative cost versus cost breakdown.
Finally, cost considerations have to be taken into account, as each of the previous technological, architectural
and other considerations has different cost implications.
We can estimate the cost of a particular chip having made some reasonable assumptions.
For a typical chip cost illustration see section 1.6.1 in the recommended textbook.
For this example this results in a typical cashflow forecast - see Graph 3:
Hence the break-even time (27 months in this case) can be estimated.
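A minimal sketch of how such a break-even estimate is made: accumulate the monthly cashflow (development spend followed by ramping sales income) and report the month at which the cumulative total turns positive. All figures here are invented for illustration; the real numbers come from the costing exercise referred to above:

```python
# Illustrative monthly cashflows (k£): 12 months of development spend,
# then ramping sales revenue. All values are assumptions for illustration.
flows = [-50] * 12 + [10 + 2 * m for m in range(36)]

total = 0.0
for month, flow in enumerate(flows, start=1):
    total += flow
    if total >= 0:
        print(f"Break-even after {month} months")
        break
```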
Microelectronic Design
Chapter 2 - Microelectronics Fabrication Process
2.1 Introduction
● Fabrication process described in this module applies only to silicon as it is by far the most commonly used semiconductor
material, and covers all the stages involved in the fabrication of any microelectronics device.
● GaAs fabrication process has many parallels with silicon process but is different in some significant ways - for further
information see texts on GaAs device fabrication.
● Semiconductors essential for the fabrication of microelectronic devices because their atomic band structures are such that the
addition of small amounts of certain elements (doping) changes the electrical properties of the semiconductor dramatically.
● Silicon occurs naturally as silicon dioxide in sand; it is
chemically reduced and then purified until it contains
typically <1 part per billion (ppb) of impurities.
[Photographs: polysilicon ingots, ingot pulling, a single silicon ingot (Mitsubishi), and sawn wafers.]
● Many variants of the basic IC process to produce different types of IC eg. bipolar, CMOS, nMOS, etc. but all essentially
follow the stages shown in Diagram 2.
2.3 Photolithography
Process which effectively transfers the chip layout on a mask onto the silicon surface - it has similarities to photographic
printing.
[Photographs: wafer stepper (ASM Lithography) and mask (SGS Thomson).]
• Object is to enable many (millions) of shapes to be printed on the wafer in one operation
(enormous cost benefits).
• Most important process for ensuring that the various components line up with each other
and are interconnected correctly (it determines the line width).
2.4 Diffusion and Ion Implantation
• Ability to change the doping level or type essential in IC fabrication process since parts of the
IC have to be doped differently in order that the IC functions correctly. For example, for a high gain
BJT, emitter doping>base doping>collector doping.
• Epitaxy (section 2.2) changes the doping over the whole of the wafer (globally) whereas more
often it is required to change the doping over part of the slice (selectively).
• Photolithography used to form patterns on the wafer surface but cannot, in itself, be used as a
mask to prevent dopants reaching the silicon wafer underneath it. This is because both diffusion and
ion implantation are high temperature/high energy processes and the chemical elements involved
would simply pass through the photoresist.
• Silicon dioxide which can be easily formed on the surface of the wafer and is very dense and
strong, is capable of forming an excellent dopant barrier.
• Process to selectively dope an area in two steps; firstly photolithography used to define the
required pattern in the silicon dioxide layer which is then used in the second step to limit the dopant
to the required areas only.
• Process relies on the ability to accurately remove material such as silicon dioxide defined by
photolithography - process known as etching.
[Photograph: acid etch station (Cybor).]
● Dry etching anisotropic processes developed which only
remove material in one direction (normally vertically)
overcoming undercutting, giving a faithful representation
of the mask pattern on the silicon wafer.
[Photograph: plasma asher (Fusion Systems).]
2.4.1 Diffusion
• To form a p-type region the element boron is diffused into silicon while the elements arsenic or
phosphorus used to form n-type regions.
• For an n-type wafer placed in a furnace at high temperatures in the presence of a high
concentration of boron, the boron will progressively diffuse into the wafer to a depth dependent on
the furnace temperature and duration. (Typical depths used are 0.25-2.0µm). A p-n junction is then
formed in the wafer whose electrical properties are those of a diode and is electrically stable.
• Further diffusions could be used to form n-p-n BJTs, MOSFETs, resistors, etc.
• Silicon dioxide and silicon nitride are dielectric materials and are used in IC processing for
electrical insulation purposes and for passivation (final covering of all exposed areas of silicon).
• Metallisation takes place towards the end of the fabrication process and involves the deposition
of a thin layer of aluminium (typically 1µm) over the whole of the wafer, (by processes known as
aluminium evaporation or sputtering) and then the use of photolithography to define the interconnect
pattern.
• A layer of dielectric over the first level of interconnect will allow a further layer of interconnect to
be formed, and so on (multi-level interconnect). The interconnect on the higher layers is connected
to the silicon wafer, or to other layers, through contact holes cut in the insulator.
• A typical IC process will see one or two layers of polysilicon and two or three layers of
aluminium.
● Final test of completed devices carried out against full specification prior to shipment.
Microelectronic Design
Chapter 3 - Packaging
● A successful package will satisfy all application requirements at an acceptable design, manufacturing and operating cost.
● Improving technology increases the demands on the number and density of package pins and interconnections, demanding reduced physical dimensions and
requiring the use of improved techniques.
● The need to improve the quality and reliability of the packaging process is also important.
● Number of connections (pin-out) a major cost factor and strongly dependent on IC function, eg. memories require few connection pins but random logic
requires many more.
● The number of terminals required and the number of circuits are related by Rent's Rule, which states that
N = K·M^p, where N is the number of terminals required, K is the average number of terminals used for an individual logic circuit, M is the total number of
circuits and p is a constant (0 ≤ p ≤ 1). Typically p is 0.12 for static memory, 0.45 for microprocessors and 0.63 for high performance computer chips (evaluated in the sketch after Graph 1).
● Graph 1 shows pin-out for memory and random logic ICs showing memories requiring relatively few pins whereas random logic requires many and
approximately follows Rent’s Rule.
Graph 1 - Graph of number of pins (terminals) versus circuit complexity for various microelectronic functions
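Rent's Rule is simple to evaluate; a minimal sketch using the p values quoted above (K, the average number of terminals per circuit, is an illustrative assumption):

```python
# Rent's Rule: N = K * M**p, with p values from the notes above.
# K (average terminals per logic circuit) is an illustrative assumption.
def rent_pins(circuits, p, k=4.0):
    return k * circuits ** p

for name, p in (("static memory", 0.12),
                ("microprocessor", 0.45),
                ("high-performance logic", 0.63)):
    print(f"{name:24s} p={p}: ~{rent_pins(1_000_000, p):,.0f} pins")
```

Note how, for the same one million circuits, the memory needs only tens of pins while random logic needs thousands, matching the shape of Graph 1.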
● Increasing pin-out requires the package to squeeze more pins into the same or less space whilst still meeting requirements of mechanical robustness, electrical
performance and thermal specification.
● Basic package electrical parasitics of resistance, inductance and capacitance present in all IC packages and can cause signal delays, signal distortion and noise.
● Self-capacitances in particular cause signal delays.
● Non-zero source resistances also increase signal delays.
● New more complex packages have reduced self-capacitances (by using improved geometry and lower dielectric constant materials) and reduced source
resistances.
● Package resistance causes voltage drops and increases signal delays.
● Signal reflections are particularly troublesome causing faulty circuit operation in some cases.
● Noise generated by switching current from one chip driver can affect other drivers through inductances.
● Reduce noise by reducing inductances, restricting the total switched current, or using decoupling capacitors.
● Power distribution across a chip must be accurately controlled (<10% variation maximum) so package power lines must be designed to ensure this happens
even in the event of circuit switching activity.
● More complex chips make more demands on efficient heat removal from the chips.
● Silicon chips are limited to approximately 100ºC for normal operation, which limits power densities on chip to a maximum of 10 W/cm² in current IC packages.
● This imposes an average limit on power dissipation per individual circuit on chip of approximately 1µW/circuit for a 10 million transistor chip of area 1cm × 1cm (see the sketch below).
● Important to reduce power dissipation/circuit and improve package thermal design in order to produce larger, more complex ICs in the future.
● Differential thermal expansion of package parts gives rise to mechanical stresses and reliability risks.
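The 1µW/circuit budget follows directly from the quoted limits; a one-line check using the figures stated above:

```python
# Per-circuit power budget from the thermal limit quoted above.
max_density_w_per_cm2 = 10      # package-limited power density
chip_area_cm2 = 1.0             # 1 cm x 1 cm chip
circuits = 10_000_000           # 10 million circuits

budget = max_density_w_per_cm2 * chip_area_cm2 / circuits
print(f"{budget * 1e6:.1f} microwatts per circuit")  # -> 1.0
```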
● Total package technologies in electronic products are very diverse and include ICs, PCBs, flexible circuit carriers and Multi-Chip Modules (MCMs) as shown
in Table 1.
Chip connection | 1st level package | 1st-2nd level connection | 2nd level package | 2nd-3rd level connection | 3rd level package | Chip cooling | Max chips/system

Consumer electronics
WB | PSCM | SMT/PTH | Card | - | - | - | <10
WB/TAB | PSCM | SMT/PTH | Card | - | - | - |
WB | PSCM | SMT/PTH | Card/flex | - | - | - |

Low-end systems
WB | PSCM | SMT/PTH | Card | Connection | Board | - | 10s
WB | PSCM | SMT/PTH | Card | Connection | Board | - |
WB | PSCM | SMT/PTH | Card | - | - | - |
WB | PSCM | SMT/PTH | Card | - | - | - |
WB | PSCM | SMT/PTH | Card/flex | - | - | - |

Intermediate systems
WB | C-SCM | PTH | Card | Connection | Board | Air | 100s
WB | C-PGA | SMT | Card | Connection | Board | Air (w/fin) |
WB | C-PGA | PTH | Card | Connection | Board | Air |
C4 | C-TCM | PTH | Board | Connection | Cable | Air |

Large systems
WB | C-L-CC | SMT | P-G board | Connection | Cable | Water | 1,000s
WB/C0FP | C-MCM | PTH | P-G board | Cable | P-G board | Air |
C4 | C-TCM | PTH | FR-4 board | Connection | Cable | Water |
TAB | FTC | SMT | LCM | PTH | P-G board | Water |

Supercomputers
WB | C-FP | SMT | Card | Connection | Cable | FC-78 | >10,000
TAB | C-LCC | SMT | Board | Connection | Cable | LN2 |
TAB | FTC | SMT | LCM | PTH | P-G board | Water |

Table 1 - Typical packaging technologies
● Packaging hierarchy imposes thermal hierarchy considerations on the assembly of the total system.
● Connections between the chip and package commonly performed by one of the following three technologies:
❍ Wirebond using thin gold or aluminium wire.
❍ Solder bond or Controlled Collapse Chip Connection (C4), also called flip-chip.
❍ Tape Automated Bonding (TAB).
● Wirebond most common, cheapest, lowest temperature but limited in the number of connections that can be made (<500) and also high lead inductance
limiting electrical performance.
● C4 capable of many more connections up to 20k and low lead inductance but more complex technology and higher temperature.
● TAB gives higher number of connections than wirebond though not as high as C4, better electrical performance, high yield and lower assembly costs but
highest temperature and more complex technology.
● Chip packages are made of metal or ceramics and are either hermetically sealed or encapsulated in plastic.
● The most common types are shown in Table 3. Columns give the package type and body, typical and maximum connection counts, and lead pitch; two rows lost their package names in reproduction:

Package type | Body | Connections | Max connections | Pitch (mm)
(name lost) | - | - | 200 | 0.63
(name lost) | - | - | >500 | 2.54
Tape automated bonding (TAB) | Plastic | 300 | >1000 | 0.50/0.25
Ball grid array | Plastic | 300 | >500 | 0.50
Ball grid array | Ceramic | 604 | >1000 | 0.40
Chip scale | - | 300-1000 | >1000 | 0.5
Thin film ceramic packages | - | 300-1000 | >1000 | 0.5
SLIM | Thin film | - | >5000 | 0.25

Table 3 - First level single chip packages and their characteristics
● Despite reductions in power dissipation/individual circuit due to technology improvements, current chips contain more individual circuits/chip and so total
power dissipation/chip is increasing.
● Graph 4 shows a plot of typical power dissipation per chip between 1970 and 1990, showing at least a tenfold increase over that period.
● Most systems use forced-air cooling of modules for package cooling.
● High performance systems additionally use heat sinks mounted onto packages using various technologies.
● In extreme cases packages are further cooled by immersion in inert liquids, such as fluorocarbons.
3.5 Package Sealing and Encapsulation
● Intended to protect the chip and package metallisation from corroding environments and from mechanical damage due to handling.
● Moisture is one of the major sources of corrosion.
● Plastic materials, such as silicones and epoxies, developed with low water diffusion properties are used extensively for IC encapsulation.
● For high reliability devices hermetic sealing used based on welding or brazing of ceramic/metal packages. More expensive and time-consuming than plastic
encapsulation.
Microelectronic Technologies and Applications
Chapter 1 - CMOS Digital Logic
Chapter Overview
This chapter reviews aspects covered in two earlier modules: Microelectronic Design and Business Issues and Benefits of Microelectronic Devices, and also covered in
the textbook.
● Work through the MIB - DTI presentation associated with this chapter.
● Review sections 1.1 and 1.2 in module Microelectronic Design.
● Read section 1.2 Design Flow in the textbook.
● Module concentrates on ICs made using CMOS technologies (low power, cheapest and popular).
● Basic two input NAND gate (equivalent gate) requires four transistors in CMOS; the number of equivalent gates equals one quarter the number of transistors.
● Feature size, λ, is quoted for processes (currently 0.25µm typically); it equals half the dimension (length or width) of the smallest transistor.
● Review sections 1.4.1 - 1.4.5 in module Microelectronic Design on the different types of ASICs.
● Read sections 1.1 - 1.1.8 in the textbook on the same topic.
● Read section 1.3 in the textbook - case study on the development of the Sun Microsystems SPARCstation 1.
● Review the costing within section 1.5 of module Microelectronic Design and also in module Business Issues and Benefits of Microelectronic Devices.
● Read sections 1.4, 1.4.1, 1.4.2 in the textbook relating to product costs for different ASIC solutions.
● Figure 1.11 shows a break-even analysis for different ASIC types.
FIGURE 1.11 - A break-even analysis for an FPGA, a masked
gate array (MGA) and a custom cell-based ASIC (CBIC). The
break-even volume between two technologies is the point at
which the total costs of parts are equal. These numbers are very
approximate.
● Familiarise yourself with the constituent parts of fixed costs (spreadsheet figure 1.12) - section 1.4.3 and variable costs (spreadsheet figure 1.14) - section 1.4.4 in the textbook.
● Understand how the costs change with technology advances and product maturity - see figure 1.15.
Figure 1.1 - CMOS transistors as switches. (a) An n-channel transistor. (b) a p-channel transistor. (c) A CMOS inverter and its symbol (an equilateral triangle and a circle).
• CMOS inverter formed by connecting n-channel and p-channel transistors in series between VDD and VSS
• Operation is that inverter output at logic 0 for logic 1 input and output at logic 1 for logic 0 input - see CMOS interactive
exercise.
• In both logic states one transistor OFF and hence no current flow and very low power dissipation (virtually zero) in both
states. Major advantage of CMOS!
• Other gates eg. NAND/NOR and more complex structures can be easily designed - see figure 2.2 in the textbook.
• Theory of CMOS transistor operation complex - see sections 2.1, 2.1.1, 2.1.2, 2.1.4 in the textbook.
• Theory gives V-I equations in saturation (VDS>VGS-VtN) as equation 2.12 and in linear region (VDS < VGS-VtN) as equation
2.9 for n-channel transistor.
• Equations 2.15 apply for p-channel transistor.
• All V-I equations contain a β term (the Gain Factor), which equals k (the process transconductance) times the width/length ratio of the transistor.
• β is important, allowing the current in a CMOS transistor to be varied by varying the device geometry (W/L) as well as the terminal voltages - see the sketch after Figure 2.4.
• Theory and practical measurements agree well (see figure 1.4). Short-channel transistors are normal in most ASIC devices.
Figure 2.4 - MOS n-channel transistor characteristics for a generic 0.5µm process (G5):
a short-channel transistor (W = 6µm, L = 0.6µm drawn) and a long-channel transistor (W = 60µm, L = 6µm).
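The square-law model behind equations 2.9 and 2.12 can be sketched in a few lines. This is the first-order long-channel model only, and the parameter values below (k', W/L, Vt) are illustrative assumptions, not the G5 process values:

```python
# First-order (long-channel) n-channel MOSFET model, as in eqs 2.9/2.12.
# beta = k' * (W/L); all parameter values below are illustrative only.
def ids_nmos(vgs, vds, k_prime=100e-6, w_over_l=10.0, vt=0.65):
    beta = k_prime * w_over_l          # gain factor (A/V^2)
    vov = vgs - vt                     # overdrive voltage, VGS - Vt
    if vov <= 0:
        return 0.0                     # cut-off
    if vds < vov:                      # linear (triode) region, eq 2.9
        return beta * (vov * vds - vds ** 2 / 2)
    return 0.5 * beta * vov ** 2       # saturation, eq 2.12

print(f"{ids_nmos(vgs=3.0, vds=3.0) * 1e3:.2f} mA")
```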
• CAD programme SPICE often used to characterise transistors or gates. Parameters for a generic 0.5µm CMOS process given
in Table 1.1 for example.
• Due to transistor operation logic levels can be either strong or weak.
• n-channel transistor gives a strong '0' logic level but a weak '1' - see section 2.1.4 and Figure 2.5 in the textbook.
• p-channel transistor gives a strong '1' logic level but a weak '0'.
• Use both transistors together in CMOS gates to give strong '0' and '1' levels.
· Review chapter 2 in module mdesign - Microelectronics Fabrication Process and in particular the single n-well CMOS process (section 2.8).
· The various mask/layer names together with MOSIS (US design house) mask labels are given in Table 2.2
· Using these names Figure 2.7 in the textbook shows the layers required to achieve a typical standard cell layout given in figure 1.3 (p8),
together with the complete cell layout and the phantom layout often used in ASIC designs.
· 'Wells' of the opposite type of semiconductor to the substrate are used in CMOS to allow the fabrication of p-type and n-type transistors on the
same substrate.
· Single, twin and triple-well processes available.
· In general the more wells in a process the more control over transistor properties.
· In all cases n-wells must be connected to the most positive part of the circuit (VDD) to ensure that substrate/source drain junctions are not
forward biased and p-wells must be connected to the most negative part of the circuit (VSS) for the same reason.
· Often substrate connections not shown on circuit schematics but vital for correct circuit operation.
· CMOS process for circuit depicted in figure 2.7 described in pages 52-55 of the textbook.
· Sheet resistance is a measure of the doping concentration of a semiconductor layer, to which it is inversely proportional.
· Sheet resistance measured in ohms/square since layers are very shallow compared to widths/lengths.
· Typical values range from 1.1 kΩ/sq for an n-well to 30 × 10⁻³ Ω/sq for metal - see Tables 2.3 and 2.4 in the textbook for an example set.
· Contact resistance (CR) - metal/silicon - often significant and process steps taken to reduce CR and also improve contact reliability (see
Tables 2.5 and 2.6 in the textbook).
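Track resistance follows from the sheet resistance and the number of "squares" (length/width) in the track; a minimal sketch using representative values of the kind tabulated in the textbook:

```python
# Track resistance = sheet resistance * (length / width), i.e. ohms/square
# times the number of squares. Values below are representative only.
def track_resistance(r_sheet_ohms_per_sq, length_um, width_um):
    return r_sheet_ohms_per_sq * (length_um / width_um)

print(f"{track_resistance(1100.0, 100, 10):.3g} ohms (n-well track)")
print(f"{track_resistance(0.030, 100, 10):.3g} ohms (metal track)")
```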
· Basic combinational gates (NAND, NOR, etc) can be made in CMOS (figure 2.2) as already discussed.
· More complex combinational cells comprising several gate combinations such as AND-OR-INVERT(AOI) and OR-AND-INVERT (OAI) are
much more efficient in CMOS and often used in combinational design - see Table 2.10 in the textbook.
· Numbering notation based on number of inputs at first level and second level often used - see Figure 2.12 in the textbook.
· Design procedure (pushing bubbles!) using networks of transistors (stacks) used.
· Illustrated for the AOI221 in figure 2.13 of the textbook.
· Different hole and electron mobilities give rise to different transistor gain factors βn and βp (equations 2.11 and 2.15 in the textbook).
· Equalise by adjusting the (W/L) ratios of the n- and p-type transistors to make βn and βp equal (same drive strength) - see section 2.4.2 in the textbook.
· Cells in a library available in a range of drive strengths.
· For transistors in series or parallel, design procedures are more complicated but essentially:
• for transistors in parallel, make all the lengths unity and add the widths
• for transistors in series, make all the widths unity and add the lengths.
· For example applied to AOI221 in figure 2.13c.
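These sizing rules can be expressed directly in code. A minimal sketch computing the equivalent W/L of series and parallel networks of unit transistors (the resistor-style combination below is just a restatement of "add widths"/"add lengths" for unit devices):

```python
# Equivalent W/L of a transistor network, following the rules above:
# parallel -> unit lengths, add the widths; series -> unit widths, add
# the lengths. Ratios are in multiples of a unit (minimum) transistor.
def parallel_w_over_l(ratios):
    """Devices in parallel: effective W/L is the sum of the ratios."""
    return sum(ratios)

def series_w_over_l(ratios):
    """Devices in series: W/L combines like parallel resistors,
    i.e. unit-width devices of lengths l1, l2 give W/L = 1/(l1 + l2)."""
    return 1.0 / sum(1.0 / r for r in ratios)

# A two-high NAND pull-down stack of unit transistors:
print(series_w_over_l([1.0, 1.0]))    # 0.5 -> double each width to restore drive
# Two unit transistors in parallel (a NOR pull-down):
print(parallel_w_over_l([1.0, 1.0]))  # 2.0
```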
· Alternative combinational design approach based on CMOS transmission gate (TX) exists for simple gates. (section 2.4.3 and figure 2.14).
· More efficient in terms of number of transistors but other considerations may be important ie. charge sharing which may require extra
buffering and therefore extra transistors.
· A 2:1 multiplexer can be designed using TX gates (figure 2.15).
· Comparison with design using OAI22 cell (figure 2.16) shows little difference but for longer MUXs differences can become significant.
· Can then design EXC-OR cell from a 2/1 MUX and an OR gate (section 2.4.4).
1.1.8.2 Sequential
See section 2.5 in the textbook
· Synchronous design using a single system clock is nearly always used since it is safer, compatible with CAD tools and usually guarantees that
the ASIC will work as simulated
· Sequential cells have a memory or storage feature
· Simple Latches are transparent (ie. changes at inputs appear at the Q output when the clock is high)
· Flip-flops are more complex (require at least two latches)
· Design a latch from TX gates – operation illustrated in Figure 2.17 (Ref. Smith)
· Design a flip-flop from two D latches (see fig 2.18)
· Clocked inverter easily designed from an inverter and one TX gate (see fig 2.19) which can then replace inverters in latches and FFs
Sum = A ⊕ B ⊕ CIN
and Cout = A·B + A·CIN + B·CIN
· These can be expressed in terms of the PARITY function (where the output is 1 if an odd number of the inputs are 1) and the
MAJORITY function (where the output is 1 if the majority of the three inputs are 1) as Sum = PARITY(A, B, CIN) and Cout = MAJORITY(A, B, CIN) - see the sketch after Figure 1.22.
o Table 2.11 reviews the common binary arithmetic operations (add, subtract etc) for the four common binary number representations
(unsigned, signed magnitude, ones' complement, two's complement).
o Often addition is in terms of generate G(i) and propagate P(i) signals - see section 2.6.2 for an explanation and equivalences.
o Form a ripple carry adder (RCA) conventionally (figure 1.22a) or using the generate/propagate approach (figure 1.22b).
Figure 1.22 - The Ripple Carry Adder (RCA). (a) A conventional RCA. The delay may be reduced slightly by adding pairs of bubbles as shown to use two-input NAND gates. (b) An
alternative RCA circuit topology using different cells for odd and even stages and an extra connection between cells. The carry chain is a fast string of NAND gates (shown in bold).
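The Sum and Cout equations translate directly into a bit-level model; a minimal behavioural sketch of a full adder chained into a ripple-carry adder (no gate delays modelled):

```python
# Full adder: Sum = A xor B xor Cin (PARITY), Cout = majority(A, B, Cin).
def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_carry_add(a_bits, b_bits):
    """Add two little-endian bit lists; the carry ripples stage to stage."""
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

# 6 + 3: bits are given least-significant first.
print(ripple_carry_add([0, 1, 1, 0], [1, 1, 0, 0]))  # ([1, 0, 0, 1], 0) = 9
```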
■ Other faster adders are based on carry-save (CSA) (figure 2.23), carry-bypass (CBA), carry-skip and, the best known, the carry-lookahead (CLA).
■ Brent-Kung adder reduces the delay and increases the regularity of CLAs (figure 2.24).
■ Fastest adders are based on carry-select leading to the conditional sum adder (CSA) (see figure 2.25).
■ Graphs of normalized delays versus number of bits (figure 2.26a) show ripple-carry to be the slowest, carry-select faster and carry-save the fastest.
Figure 1.26 - Datapath adders. This data is from a series of submicron datapath libraries.
(a) Delay normalized to a two-input NAND logic cell delay (approximately equal to 250ps in a 0.5µm process). For example, a 64-bit ripple-carry
adder (RCA) has a delay of approximately 30ns in a 0.5µm process. The spread in delay is due to variation in delays between different inputs and
outputs. An n-bit RCA has a delay proportional to n. The delay of an n-bit carry-select adder is approximately proportional to log₂n. The carry-save
adder delay is constant (but requires a carry-propagate adder to complete an addition).
(b) In a datapath library the area of all adders is proportional to the bit size.
■ Graphs of areas versus bits (figure 2.26b) show that ripple-carry and carry-save take up about the same area whereas carry-select takes up about twice as much
area.
1.1.8.4.2 Multipliers
See section 2.6.4 in the textbook
· Multiplication is a series of additions and shifts.
· Use a number of adders to form an array multiplier - see figure 2.27 for a 6-bit multiplication illustration.
· Performance determined by the number of partial products and the addition of partial products.
· Use canonical signed-digit vectors (CSDs) to reduce the number of add/subtract operations and replace some additions
by shifts.
· Further improvement comes from Booth encoding (partial products reduced by a factor of 2, improving speed and area
utilisation).
· Improve speed still further by using Wallace-trees and Dadda multipliers (Figures 2.28, 2.29 and 2.30)
· Several considerations apply in the choice of parallel multiplier architecture eg. overall speed, power dissipation,
implementation (cell or full custom), pipelining etc.
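"A series of additions and shifts" can be shown in a few lines; a behavioural sketch of the basic shift-add scheme that array multipliers parallelise (Booth encoding and Wallace/Dadda trees reorganise or reduce these partial products):

```python
# Shift-and-add multiplication: one partial product per multiplier bit.
def shift_add_multiply(a, b, bits=6):
    product = 0
    for i in range(bits):
        if (b >> i) & 1:          # partial product for bit i of b
            product += a << i     # shifted copy of the multiplicand
    return product

print(shift_add_multiply(27, 45))  # 1215, cf. the 6-bit example of fig 2.27
```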
Chapter Information
❍ Although no current is taken by a CMOS inverter under static conditions (logic 1 or 0), current is drawn during switching due to both transistors being on (see Figure
2.2 in textbook).
❍ Switching delay often given driving standard loads (gates) where n = 1, 2, 4, 8 etc.
❍ Simulation shows that switching delays are approximately linear function of load capacitances - see Figure 2.3 in text book (and hence the number of loads).
❍ Typically tpdf = Rpd (COUT + Cp) - equation 3.2
❍ Due to complexity it is only possible to evaluate time delays accurately from CAD simulation (eg. SPICE)
❍ Hand calculations only give estimates.
❍ Cell delay results from transistor resistances, transistor (intrinsic) parasitic capacitances and load (extrinsic) capacitances.
❍ The input capacitance of the driven cell forms part of the load capacitance of the driving cell.
❍ CAD tool SPICE lists 8 capacitances for a CMOS transistor (shown diagrammatically in Figure 2.4 in text book).
❍ Junction capacitances CBD and CBS are p-n diode capacitances.
❍ Overlap capacitances CGSOV and CGDOV are oxide capacitances, and account for lateral diffusion of drain and source under the gate.
❍ Gate-source, gate-drain and gate bulk capacitances, CGS, CGD and CGB are combinations of junction and oxide capacitances.
❍ Detailed formulae for calculating each capacitance is given in Table 3.1 together with a typical calculation at one operating condition.
❍ All transistor parasitic capacitances are functions of operating conditions.
❍ For instance, Figure 2.5 in textbook shows the variation of all the parasitic capacitances with VIN from 0 to +3V.
❍ 'Logical effort' is a concept which gives us an insight into why logic has the delay it does and allows us to examine relative delays.
❍ Modifies equation 3.2 by an additional term tq to give tpd = R (COUT + Cp) + tq - equation 3.10.
❍ tq is non-linear and includes delay due to parasitic capacitances and other effects.
❍ For scaled cells (scaling factor s), since capacitances increase and resistances
decrease, equation 3.12 results, giving tpd = (R/s)(COUT + sCp) + stq.
❍ Equation 3.12 is then rewritten using the capacitance of the scaled cell (equation 3.14), and the delay is normalised using the pull resistance RINV and input
capacitance CINV of a minimum-size inverter, giving equation 3.15:
d = f + p + q (3.15)
❍ So delay (d) is the sum of the effort delay (f), parasitic delay (p) and non-ideal delay (q).
❍ Effort delay (f) is further broken up into the product of logical effort (g) and electrical effort (h).
❍ Logical effort is defined in figure 2.8 and is a function of the type of logic cell.
❍ Table 3.2 gives logical efforts for inverter, NAND and NOR cells.
❍ Section 3.3.1 in the textbook shows how to use this technique to calculate the delay in a 3 i/p NOR gate driving a net with capacitance 0.3pF giving an answer of
0.846ns.
❍ Enables a calculation of the area of transistors in a logic cell to be made - logical area (see section 2.3.2).
❍ The calculation of delay in section 2.3.1 did not depend on the logical effort g because the NOR cell was driven by an ideal source rather than by another logic cell.
❍ In a logical path situation it is possible to calculate the delays of logic cells driven by a minimum size inverter.
❍ Path delay, d, is the sum of the effort delay (gh), parasitic delay and non-ideal delay of each stage along the path.
❍ Extend this concept to work out delays in multistage cells. (See section 2.3.4)
2.2.3 Optimum delay
❍ For a chain of N inverters each with equal stage effort f, then neglecting parasitic and non-ideal delay, it can be shown that minimum delay occurs when the electrical
effort h is equal to 2.7.
❍ Shown diagrammatically and in a tabular form in Figure 2.12
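This is easy to check numerically. For a chain driving a total electrical effort H, with N stages each of effort h = H^(1/N), the total delay (ignoring p and q) is N·h; a minimal sketch, with H chosen purely for illustration:

```python
# Minimum-delay inverter chain: total electrical effort H, N stages,
# stage effort h = H**(1/N), total delay ~ N*h (parasitics neglected).
H = 148.0  # illustrative total effort (Cload / Cin of the first inverter)

best = min(range(1, 12), key=lambda n: n * H ** (1 / n))
h = H ** (1 / best)
print(f"N={best} stages, stage effort h={h:.2f}")  # h comes out near 2.7
```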
❍ Use hand-crafted or more commonly, symbolic layout, such as STICKS to draw the cell layout.
❍ Particular design rules built-in.
❍ Cells for gate array, standard cell and datapath are quite different - see section 3.6, 3.7, 3.8 and elsewhere in this course.
2.4 Power Dissipation
See section 15.5 in the textbook
❍ Switching current power dissipation (P1) is given by P1 = f·C·VDD² and is the major source of power dissipation in CMOS - see the sketch at the end of this section.
❍ Reduce by reducing supply voltage VDD and parasitic capacitance C.
❍ Short circuit current (crowbar current) power dissipation (P2) can be important for output drivers and large clock buffers
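P1 is straightforward to evaluate; a minimal sketch with illustrative values, showing why reducing VDD (which enters squared) is the most effective lever:

```python
# Switching (dynamic) power: P1 = f * C * Vdd^2.
def switching_power(f_hz, c_farads, vdd):
    return f_hz * c_farads * vdd ** 2

# Illustrative values: 100 MHz switching, 10 pF total switched capacitance.
for vdd in (5.0, 3.3, 2.5):
    p = switching_power(100e6, 10e-12, vdd)
    print(f"Vdd={vdd}V -> {p * 1e3:.1f} mW")
```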
Chapter Overview
This chapter introduces a group of custom silicon components known collectively as Programmable Logic Devices. Typical generic architectures are considered
with particular reference to the construction, principle of operation and application of those devices categorised as Simple Programmable Logic Devices.
Devices are supplied from the manufacturer with an array of prefabricated logic components and interconnect. The designer utilises a CAD system to define the
required configuration of components and interconnect for a given design.
Unlike mask programmable devices that require fabrication at a silicon foundry, PLDs are electrically programmable by the user. Typically this is implemented by
downloading the configuration data from the serial line of the CAD system either directly into the device or via a device programmer.
Device architectures vary considerably from the simple AND/OR gate structure of small PLDs to the complex logic cell arrays of Field Programmable Gate Arrays
(FPGAs). Device types are further distinguished by being either one time programmable (OTP) or re-programmable.
Most manufacturers will also offer hardwired versions of their devices in which the interconnect circuitry is removed in favour of single tracks. This reduces silicon
area and therefore device cost and would be an appropriate consideration for higher volume production (typically > 10,000 units/year).
Programmable Logic Devices are available in a wide range of architectures and sizes. They are generally classified into three groups of increasing capacity, features
and cost as follows: Simple Programmable Logic Devices (SPLDs), Complex Programmable Logic Devices (CPLDs) and Field Programmable Gate Arrays (FPGAs).
Since the AND array is fixed, each of the 8 combinations for the three input variables I2, I1 and I0 effectively defines a memory address.
Only one AND gate will be active for any given input combination so each of the outputs O3, O2, O1 and O0 will register a logic 1 or logic 0
depending on whether the corresponding fuse in the OR array is left intact or blown.
Logic expressions are therefore being defined by storing a complete truth table in the PROM for each of the 4 functions required. The limitation of
this type of architecture is the inefficiency incurred when large numbers of input variables are required. A sixteen variable function for example
would require a 64K location PROM and would occupy far more silicon area than a discrete logic gate implementation.
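The address/truth-table idea can be modelled in a few lines. A minimal sketch of a 3-input, 4-output PROM in the spirit of the device described above; the four functions programmed here are arbitrary examples, not those of the figure:

```python
from itertools import product

# A PROM as programmable logic: the 3 inputs form an address, and the
# stored 4-bit word at that address gives outputs O3..O0.
def program_prom(functions):
    """functions: list of 4 callables f(i2, i1, i0) -> 0/1."""
    return [tuple(f(i2, i1, i0) for f in functions)
            for i2, i1, i0 in product((0, 1), repeat=3)]

prom = program_prom([
    lambda i2, i1, i0: i2 & i1,        # O3 = I2.I1
    lambda i2, i1, i0: i1 ^ i0,        # O2 = I1 xor I0
    lambda i2, i1, i0: i2 | i0,        # O1 = I2 + I0
    lambda i2, i1, i0: 1 - i0,         # O0 = /I0
])

address = (1 << 2) | (0 << 1) | 1      # I2=1, I1=0, I0=1 -> address 5
print(prom[address])                   # (0, 1, 1, 0)
```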
3.4 PLA Architecture (Programmable AND Array, Programmable OR Array)
Figure 3 defines a Programmable Logic Array (PLA) structure with 3 input variables and 4 output functions.
Having both arrays programmable allows greater flexibility in generating logic functions, particularly when more than one output function requires the
same product term. In this case the AND gate generating the product term is simply connected to the appropriate OR gates.
However, should a particular logic function require a large number of product terms, this can quickly use up the available supply of AND gates, leaving
few to implement the other functions.
The major disadvantage of the PLA structure is related to having an extra set of fuses in the OR array which adds extra propagation delay to the signals
and reduces the component packing density. For this reason examples of commercially available devices are limited.
3.5 PAL Architecture (Programmable AND Array, Fixed OR Array)
Figure 4 defines a Programmable Array Logic (PAL) structure with 6 input variables and 4 output functions.
The basic PAL structure is the exact opposite of that required for a PROM.
The number of AND gates required is considerably reduced from the 2ⁿ needed for the PROM and is related only to the number of product terms to be defined.
The fixed OR array, however, imposes restrictions on the number of product terms that can be logically ORed together and therefore limits the complexity of the
required logic expressions.
PAL devices provide the optimum compromise for speed, flexibility and packing density and offer many features such as programmable input/output pins, internal
feedback from outputs, flip flops and active LOW/active HIGH outputs that make them ideal components for implementing logic functions.
3.6 Simple Programmable Logic Devices (SPLDs)
There exists a bewildering array of terminology amongst manufacturers for this group of devices. Often they are simply referred to as PLDs. The PAL devices
previously discussed fall into this category. More complex devices are sometimes referred to as Erasable Programmable Logic Devices (EPLDs) although some
manufacturers use this as a generic term to cover all programmable devices.
PAL devices form the entry level into the group, typically replacing 5 or 6 TTL chips in a 20 or 22 pin package. They provide a good introduction to the features,
architecture and applications of SPLDs.
The PAL was invented at Monolithic Memories Incorporated (MMI) in 1978 to provide an alternative to Small Scale Integration (SSI) chips in applications where
customised combinational or sequential logic is required. In its original form it employed bipolar technology and fusible links for programming.
Amongst the first devices to be offered were the PAL18H8 (a combinational device constructed from an AND/OR array structure) and the PAL16R8 (which added
flip flops to enable sequential circuits to be implemented).
The modern equivalent is the PAL16V8, manufactured in CMOS technology and combining features from both these devices. The programming fuses are replaced
by electrically programmable cells, allowing the device to be repeatedly reconfigured.
Programming is accomplished by placing the device in a device programmer and downloading the configuration data file from the serial line of a CAD system. The
file conforms to an internationally agreed format known as JEDEC.
The programmer configures the device by re-designating the input and output pins to programming functions. This is achieved by applying a higher than normal
voltage to a specified input pin to select programming mode.
The address of the cell to be configured is placed on a subset of the input pins and its required state (logic 0 or logic 1) is set as an appropriate voltage on the output
pin connected to the AND/OR block it resides in. The cell is configured by applying a programming pulse to a second specified input pin and the cell state read back
on its corresponding output pin to verify correct programming has occurred.
A security bit is provided in the device, which when set during programming, inhibits any reading of the device contents.
The simple AND/OR array construction of PAL devices provides consistent and easily predictable propagation delays through the logic elements, an important
requirement in speed critical applications. Architectures that exhibit this property are often referred to as Deterministic.
Locate and select the "Data Sheets: PAL and GAL Products" entry and observe the devices currently available.
Study this information carefully and in particular familiarise yourself with the following:
Chapter Information
You will probably come across several different definitions of a microsystem but, as pointed out in the
Business Issues module, no single definition is generally accepted. A very basic definition of a microsystem
might be a very finely toleranced structure. This is much too broad for our purposes and would include any
precision engineered component. A more specific definition is the combination of microelectronics together
with a micromachined element on the same substrate. This sounds fine but is actually quite restrictive and
would exclude some of the more interesting techniques and devices emerging from this field. It will
probably be useful to have a more flexible view of what constitutes a microsystem. It may also be helpful to
consider the definition of the broader term: microengineering.
This question is also posed in the module Business Issues and Benefits of Microelectronic Devices, chapter 5.
It raises several interesting issues about how we view microelectronics in the wider sense.
Microengineering has developed over the last ten years or so and has largely arisen out of a recognition of
the possibilities of using the normal microelectronics production techniques to produce things other than
integrated circuits. Using batch processes which operate on hundreds or thousands of components at one
time, microelectronics technologies allow the production (the mass production) of highly complex structures
with feature sizes in the range from 1-100 microns. Microengineering recognises that the same, or similar,
techniques can be used to fabricate very small sensors or actuators. If we combine these artifacts with some
electronic signal processing at the sub-mm level, then we have a microsystem. Note that this gets round the
difficulty posed by one of the above definitions in that the integration does not have to take place on the
same substrate.
One of the most useful concepts is to think of the difference between a microelectronic device and a
microsystem. A microelectronic component processes electrical information whereas a microsystem
interacts with its environment.
It is important to realise that microelectronics technologies on their own do not provide a wide enough
portfolio of processes and materials; many additional techniques are required and combinations of
technologies are used together to give highly miniaturised components and systems. In particular, materials
other than silicon are used. Microelectronics is essentially a two-dimensional technology, consisting of thin
patterned layers. MST often requires a third dimension.
The issue of definition is important given the potential market for microsystems and the rate at which it is
expected to grow over the next decade or so. Some sources predict a market with growth rates and overall
size similar to that for microelectronics over the next couple of decades. This sounds excessive, but there is
no doubt that the eventual market will be huge; probably growing to some billions of dollars within the next
five years (see the module on Business Issues). Indeed, in a recent survey, the European Commission has
identified nearly 20,000 companies in Europe which are interested in applying microsystems technology to
their products in the immediate future. The world-wide market is predicted to reach around $US 7-8 billion
by 2002.
It is interesting to note how this technology is defined in different parts of the world. MST is largely a
European term. In the US, the generally accepted term is MEMS (Micro Electro Mechanical Systems)
while, in Japan, the subject is known as Micromachines.
These differences stem largely from the different evolutionary paths the technology has followed. In Europe
and the U.S. the main drive has been from device engineers looking for new applications for semiconductor
technologies. In Japan, the origins lie in the field of robotics and a search for increasing miniaturisation.
Microsystems technology (MST) then, is concerned with the production of new products, systems or
components through the use of microengineering techniques. Such a component will incorporate sensors
and/or actuators and signal processing on a microscopic scale.
This last definition will serve us well but has its limitations. Some authorities would exclude the necessity to
incorporate any electronics in the system before it becomes a microsystem. And, although a photodiode is
undoubtedly a sensor and an LED or semiconductor laser can be considered to be an actuator, MST
sometimes excludes optoelectronic devices from its remit. We also have to take care to include new
categories of component such as microfluidic devices in our definition. These may or may not have an active
electronic component. One can see the need to adopt a flexible definition.
In this module we shall look at the various MST techniques, examine their capabilities and discuss some
applications.
Q 2. What are the key differences between microelectronics and microsystems?
Microsystems products are commonly classified into four groups:
• Sensors
• Actuators
• Microstructures and
• Integrated Microsystems
When the laser was being developed in the 1960s, it was described as “a solution in search of a problem.”
Luckily, those involved in the development (and the funding) ignored this view and kept working. The
problems for this solution came thick and fast in the 1970s and ‘80s and now it is difficult to imagine a world
without the laser. Applications range from industrial to medical. In fact, the laser is now a household item,
being a key component of a compact disk player. Think about this. The chances are that you have more than
one. I have at least four in my home; two CD-ROM drives (three if you count last year's slow model sitting
on the shelf) and two audio CD players. One of the latter is battery powered, can be carried round or used in
the car. The key points are:
• a technology which was once seen as being of little practical use is now the basis of a
huge, world-wide industry,
• the application areas are many and diverse with a requirement for a large number of
types, models and variations,
• this demand can only be met by means of mass production methods - particularly
when it comes to the smaller, cheaper devices.
When the original technology was being developed, it would have been very difficult to look at these early,
large, power-hungry lasers, with their limited capabilities and requirement for cabinets full of drive
electronics and predict that, one day, the man in the street would carry one around with him, have one in his
car and a few more in his home. Not only has this happened, it has happened very quickly (within a few
decades).
A virtuous circle is at work here; the emerging applications lead to the development of new devices and
manufacturing techniques and the possibilities thus revealed lead to the identification of new applications
which leads to further development etc. All this is driven by the, potentially enormous, financial rewards to
be had. The creation of a huge consumer market for “must have” goods which did not exist previously (CD-
ROMs and audio players) is a golden scenario for industry.
If anything, MST is an even more "golden" opportunity. Firstly, the origins of the technology, to a large
extent, arise out of manufacturing techniques which already exist. Secondly, many of the potential markets
and applications have already been identified. Indeed, some of these are already established and can be said
to be mature.
The parallel with the laser industry is therefore close but not exact. In fact, some laser devices could now be
considered microsystems in themselves. Lasers were a classic case of a phenomenon known as "technology
push." This is where a new technology is developed and the applications and markets follow. The opposite
case is where the market or need is identified first and a technology is developed (or an existing one adapted)
to meet the requirement. This is called "market pull."
MST has a foot in both camps. Many of the well known "demonstrators" (for example miniature cogs,
sprockets, and even motors) are of little obvious use - for the moment anyway. However, MST has
responded very rapidly to some market pulls, such as the requirement for airbag triggers in the automotive
industry.
There are basically two ways in which the potential of MST can be exploited: replacing an existing (macro) device or system
with a microsystem, or creating products which could not be realised in any other way.
The first of these can only be true if there is some positive advantage in moving to a microsystem.
In general, there are three key advantages of microsystems over their (macro) counterparts:
• reduced size
• reduced cost
• improved performance
These may be applicable to greater or lesser degrees. They are not necessarily independent and a
combination of advantages can result. This is especially so for the size and performance arguments. Let us
look at this a bit more deeply.
Reduced size
This is perhaps the most obvious advantage of a microsystem. Many applications are driven, solely or
largely, by considerations of space. Invasive and implantable medical devices such as catheters are an
obvious example. A reduction in size also opens up the possibility of incorporating many components into
one device. An example of this is the array of magnetic coils produced by CSEM.
The microsystem need not be in the form of an array of similar components. Different components can be
incorporated on the one substrate giving rise to possibilities such as the laboratory on a chip. A reduction in
size brings other benefits. Smaller devices often consume less power and have a faster response (ie.
improved performance).
Reduced cost
This is often a result of low production costs derived from the processing techniques developed for
microelectronics. The batch processing techniques used in microelectronics manufacture have been the key to
its staggering success. The ability to make large numbers of components at once has driven costs down
while functionality and performance have increased. For microsystems using similar processes, the same
advantages will apply. A related benefit is that of reproducibility. Material properties and dimensions can be
kept within tight limits and made uniform both within a batch and across batches. This results in predictable
component characteristics which is a considerable advantage to both the system and the component
designer. The microelectronics industry spends a considerable amount of time and energy improving the
quality and predictability (and, in turn, reliability) of its processes. This benefits not only costs but also
performance.
Improved performance
There are several reasons why a microsystem might display improved performance, mostly related to size.
One is the potential to integrate the sensing element with the electronics. This means the signal does not
have to travel any significant distance before being processed. Much weaker effects can thus be measured.
There is also the possibility to incorporate calibration functions in the device. The size of the device also
makes it less likely to interfere with its environment. A smaller sensor will be less affected by outside
influences and forces (this is most obviously so for mechanical microsensors and will be discussed in
Chapter 3). In addition, the improved quality and reproducibility of the fabrication process will lead to
improved predictability of performance. A fourth advantage exists: the ability to do things that could not be
done by any other method.
1.4 Scaling
When we design a microsystem, we cannot simply scale down the dimensions of an existing macrodesign.
We rely upon certain relationships to predict performance. As we reduce the dimensions of an object, the
significance of the various parameters in these relationships changes. For example, consider an airplane. If
we take the linear dimension as L, the fuel load it can carry, and hence the distance it can travel on one load,
will depend on volume (L³). The drag, however, will be proportional to the surface area (L²). So, all other
things being equal, if we increase the size, we will increase the distance travelled on a single load of fuel.
Following the same logic, we can see that the strength of adhesion of a bond will be proportional to the area
(L²) while the mass will scale as L³. A similar argument can be applied to a supporting structure and its
cross sectional area so a smaller object will be more capable of supporting its own weight than a large one.
Examples of this are micromechanical cantilevers which can be very long relative to their thickness and in
the macro world where small animals can carry their weight with greater ease than large ones (compare an
elephant to an insect).
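A quick numerical sketch of the area/volume argument: quantities tied to area fall as L², those tied to volume as L³, so their ratio grows as 1/L as dimensions shrink (a purely dimensionless illustration, not a model of any particular device):

```python
# Surface (L^2) versus volume (L^3) scaling: the area/volume ratio grows
# by a factor 1/s when all dimensions shrink by a factor s.
for scale in (1.0, 1e-1, 1e-3, 1e-6):   # from 1 m down to 1 micron
    area, volume = scale ** 2, scale ** 3
    print(f"L={scale:8.0e}  area/volume ratio = {area / volume:.0e}")
```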
So it is essential to have some idea of how the forces we take for granted and make use of in the macro world
scale down to microsystem dimensions.
• Gravitational forces
• Elastic forces
• Surface tension forces
• Electrostatic forces
• Electromagnetic forces
• Piezoelectric forces
• Thermal forces
Some of these forces will act destructively and some will be useful, eg. as a source of motive power for a
microactuator. Scaling is considered in detail in the study section.
Study the table of properties in table 7.5 in detail and try to relate the properties to what you know about them
and the effect they have at the everyday, macro level. Note the Load and Response parameters. These can be
read as "static" and "dynamic" parameters respectively, and this gives a clue to a major classification of
sensors into those that detect a static load directly and those that use an indirect method such as a change in
resonant frequency.
Pay particular attention to tables 7.6 and 7.7. Try to envisage how the physical parameters change with the
linear dimension as this reduces.
The physicist Richard Feynman gave a classic talk in 1959 entitled "There's Plenty of Room at the Bottom".
His main theme was the possibilities and problems of making very small machines. Whilst talking about this
he touched on many interesting points which are interesting from the perspective of the state of both
microsystem and microelectronics technology today. You can find this on:
http://www.zyvex.com/nanotech/feynman.html
Read this and note the following, bearing in mind that this talk was given in 1959 before the creation of the
semiconductor industry as we know it:
• In the section "Miniaturization by evaporation" compare what he says in his first paragraph
with what you know about current semiconductor processing.
• Read what he says about the problems of scaling in this and the following section. Note his
comment on the effect of Van der Waals attractions and compare with the Knudsen effect
mentioned on page 153 of the textbook. He also hints at the possibility of creating a highly
focused beam of light.
You may wish to print this paper and refer to it at the end of this module to see how Feynman fares in his
predictions.
Another useful paper is "Grand In Purpose, Insignificant In Size" by William Trimmer which you can
find on:
http://home.earthlink.net/~trimmerw/mems/mems_97_talk.html
This paper is very useful as a reference to others in the field and as a summary of the history of the
technological process. Read this carefully. Note particularly the following:
• In the paragraph beginning "The field we are contemplating....." he indicates that we are
looking at a technology, or a series of technologies, which have not evolved from a single point
from development into manufacture (like the spawning of the microelectronics industry from the
transistor). However, this can be considered a strength (see the paragraph which begins "One
thing giving me confidence.....").
• The sentence "Complex calculations and decisions have now become inexpensive" sums
up, in a few short words, almost the entire benefit and reason behind the success of digital
computing over the last few decades (and, in fact, of electronics as a whole). See if you can think
of an equivalent, concise sentence to sum up the benefits of microsystems - if not now, then when
you have completed the module.
• Pay attention to his description of Surface and Bulk Micromachining and LIGA. We will
look at these in more detail soon. Keep in mind his analogy of the flour & steel automobile when
you come to look at Surface Micromachining.
Additional information
The booklets published by the DTI (references 1, 2 & 3) are introductory guides aimed at managers and chief
engineers of companies who may find microsystems useful in their product development. They make useful
background reading and Ref 3 (the Handbook of the Microsystems Supply Industry) gives some www
addresses which are of interest.
The European Commission, through the Europractice initiative, has carried out several activities to promote
awareness of MST throughout Europe. The UK centre was known as MST 2001. Visit their web site:
www.jasa.or.jp/mst/index_e.html
Browse around the three sites linked from this page to get a feel for the services on offer. Have a look at:
http://www.MST-Design.co.uk/markets.html
Compare the tables for current and emerging products. Look at the case studies on:
http://www.MST-Design.co.uk/casestudies.html
These are not detailed but the last two (Thin Film Bulk Acoustic Resonator and Microwave Switch) give a
good account of how the devices were manufactured. You may like to muse on the techniques which were
employed to fabricate these structures.
The AML site is of interest as an example of a company offering services in MST. A microsystem will often
require a number of processes and techniques for its manufacture. Not all of these will be available from one
source so someone (a lead contractor perhaps) will have to undertake to manage the device through these
various facilities (not an easy task). There will be a role in this industry for services such as this one from
AML, perhaps from companies who carry out no manufacture themselves. Again, from the MST 2001 home
page you will find a link to NMRC. Have a look at the microsystems articles in their newsletter.
The pages:
http://www.nexus-emsto.com/
http://www.nexus-emsto.com/jap-tai_mission.html
and:
http://www.nexus-emsto.com/mission.html
(You will have to register to get access to the latter).
These give the results from some recent visits to Japan and the US and serve to give a flavour of what is
currently happening in these areas.
http://www.nexus-emsto.com/market-analysis/index.html
is worth looking at and gives another take on the MST market roll-out. (You will only get access to the
executive summary).
Microsystems and Multichip Modules
Chapter Information
Throughout this module reference will be made to articles in MST News. Access to back issues is free but you need to register with the site first.
At the top right-hand corner, in the index box pull-down menu, choose "mst newsletter".
This takes you to a page headed "MST News - International newsletter on Microsystems and MEMs"
From the greyed text on the right select "REGISTRATION" and fill out the details.
You should now be able to get to the issues of the journal by following the above steps but, instead of "REGISTRATION", click on "DOWNLOAD".
This will take you to a list of issues and you can select from these as before. It may be as well to download the PDF version of the appropriate issues for easy reference.
The techniques for manufacturing MST components will be covered in the following chapters but a short overview of the main methods will be given here.
Some very precise conventional machine tools can manufacture objects at the micron level. However, difficulties occur due to the material properties of the tool,
such as elasticity, thermal effects etc. CO2 lasers can be used to cut and form the material but, since they do this by melting, the precision is again limited. Finer
dimensions and higher precision can be achieved through the use of Excimer lasers, which operate at high (ultraviolet) frequencies. With this technique, the laser fires pulses which
blast away layers of atomic thickness. The pulses are so short that no heating or melting takes place. The intensity of the beam can be altered to control the amount of
material removed with each pulse and the depth of cut can be determined by counting the number of pulses.
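A toy calculation makes the pulse-counting control concrete. The removal rate per pulse below is an assumed,
illustrative figure (real rates depend on the material and the laser fluence), but the principle - depth of cut as a
simple count of pulses - is as described above.

    # A minimal sketch of depth control by pulse counting; the per-pulse
    # removal rate is an assumed figure, not a value from the text.
    DEPTH_PER_PULSE_NM = 0.3  # assumed ablation depth per pulse, nanometres

    def pulses_for_depth(target_depth_um: float) -> int:
        """Pulses needed to reach a target depth in micrometres."""
        return round(target_depth_um * 1000 / DEPTH_PER_PULSE_NM)

    print(pulses_for_depth(5.0))  # ~16667 pulses for a 5 micron cut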
The main problem with direct machining techniques is that they commonly work on one structure at a time. This makes them suitable for prototypes but not for mass
production. However, there is still a place for such methods (particularly Excimer laser ablation) in the MST portfolio of techniques.
Micromachining
As we have already established, the most useful MST manufacturing techniques have been developed from existing, microelectronics technologies. The basic
techniques were described in Chapter 2 of Microelectronic Technologies and Applications. You should look at this again now in order to refresh your memory on the
steps involved.
These methods are based on the fundamental technique of photolithography where patterns are reproduced on the surface of the material (usually silicon) and a circuit
is built up through a combination of patterning, etching, deposition, doping etc. This results in an essentially two-dimensional structure whereas MST requires three-
dimensional features. However, the basic technique has such powerful advantages in terms of unit cost and reproducibility that it makes an excellent basis upon which
to develop MST methods. Indeed, much of MST research is focused on overcoming the restrictions of photolithographic manufacturing whilst retaining the benefits.
The term micromachining is typically used to refer to these processing techniques. There are two main subdivisions: surface micromachining and bulk
micromachining.
Surface micromachining uses common microelectronic and thin film processing to form micromechanical structures on the surface (that is, to a depth of a few
microns) of the silicon (we refer to silicon but the techniques can be applied to other materials). Thin film techniques (essentially material deposition) can be employed
to enhance the process capabilities.
A lot of very useful structures (such as cantilevers) can be made using surface micromachining. But there is often a requirement for deeper, taller structures. The big
advantage of surface micromachining is that it can often be combined with a specific microelectronic process in order to produce a sensing device as part of an electronic
component.
Bulk micromachining exploits the fact that silicon etches much faster in some directions than others. This is a property of the crystal structure and can be controlled by
doping the silicon with various impurities. Two methods of etching are employed: wet etching and dry etching.
These micromachining methods can be broken down into several key processes:
• Pattern replication
• Material deposition
• Etching
• Sacrificial layer processing
LIGA
The term LIGA comes from the German for lithography, electroplating and moulding (Lithographie, Galvanoformung, Abformung). It is essentially a technique for
creating a mould on the micron scale and using this to mass produce very small structures. The LIGA technique is not based on, or compatible with, silicon processing. If
it is to be combined with electronics, this must be done using a packaging technology.
Packaging Techniques
In most cases the ideal solution is to integrate the sensor or actuator on the same piece of silicon as the electronics. This will give a true microsystem. However, as
noted above, some of the techniques for producing microsystems are often incompatible with normal silicon processing. One solution is to combine a separate
microsensor/actuator and a microchip on the same substrate using one of the packaging techniques developed for microelectronics. This is often referred to as Hybrid or
Multichip Module (MCM) technology and involves bonding of a device (or many devices) onto the same substrate which can be silicon or some other material. We will
look at this in more detail in chapter 10.
It can be seen that the manufacture of microsystems can employ a variety of techniques and often a combination of methods is used. The above list is not exhaustive.
Many other techniques are used and others are being developed. For example, chemical and biological sensors usually employ a coating of some kind to make a FET
sensitive to a particular substance.
Look again at the ASIC development process described in Microelectronic Technologies and Applications Section 1.2. The flow chart of diagram 1 is a typical
representation of the IC design route. It involves a number of steps including system design, partitioning, layout and simulation with feedback loops at various stages.
Although the details of the representation may vary and the process itself will evolve, the design flow is fairly well established.
The opportunity for design verification is a big advantage, dramatically increasing the chances of “right first time” devices. For microsystems, by contrast, note:
• The complications arising from the possible (or probable) need to combine different technologies (and hence manufacturers).
• The lack of adequate design tools.
• In particular, note the comments on page 22.
All this makes microsystem development sound a lot more lengthy, costly and risky than that for an IC, especially when we consider a microsystem which effectively
has an ASIC design as a subset of its overall development. This is indeed so and is likely to remain so given the nature of the technology. However, the issue is being
addressed by the MST community, particularly by companies offering a brokerage service.
Written by H. v.d. Vlekkert of CSEM, a Swiss company specialising in MST development and supply. Reproduced with
permission.
The goal of the EUROPRACTICE program is to promote access to Microsystems. A Microsystem is defined as a small
(<1 cm3 typically) package containing at least a microstructure which interfaces with the non-electrical world (sensor or
actuator) and an IC which provides an intelligent signal processing interface between the microstructure and the user. By
promoting access, it is intended that the number of Microsystem products manufactured in Europe will increase.
However, manufacturing represents only the final stage of a product life cycle (as depicted in Figure 1). The life cycle
passes through three main phases: technology set-up, development and production. The technology set-up phase is the beginning
of a Microsystem and, starting from an idea, results in a function demonstrator for which manufacturing is described in a
cookbook. The development phase consists of two stages: product definition and product development. The product
definition stage is carried out by the customer, in co-operation with the Microsystems service provider. The product
development stage starts after the product definition phase and assumes that the Microsystem specifications have been
defined and agreed upon with the customer. The production phase starts with industrialisation followed by the production
of the Microsystem. In parallel, the customer implements market introduction, distribution and sales.
Before Microsystem manufacturing at medium to large scales can begin, a product development stage must be executed.
The goal of product development is to make industrial prototypes of the Microsystem which can be manufactured in
medium to large quantities without the necessity of a redesign.
In the past, the development stage for a Microsystem was long, costly and risky. A
Microsystem development often spans several years. Its cost can run to several million ECUs, often with
major setbacks along the way or even total failure at the end.
To reduce these problems, CSEM has implemented a methodology for the development of Microsystems. The
methodology describes the flow of a Microsystem development to guarantee its systematic execution. It focuses on the
reuse of existing components and knowledge through the optimal use of design libraries. The software tools are optimised
to execute each part of the development stage as effectively as possible. Check points described extensively in the
methodology limit the risks of the development.
Microsystem Development Flow
The flow of the Microsystem development stage that has been defined in the methodology is depicted in the flow chart.
The development begins with the system design phase in which the product specifications are partitioned over the different
system components. Simulations are performed to verify that the system will meet all specifications. In the next stage, all
components are designed in detail according to their specifications. The results of the detailed simulations are cross-
checked against the system level simulations. When the components meet their specifications, they are fabricated and
tested. They can then be assembled to form the first prototype of the system. This prototype is then tested extensively to
gain insight into the tolerances of the system to different parameters. When the initial prototype meets all critical
specifications, the project continues with the design of the final prototype. Minor modifications to the design will be made
to ensure that this prototype meets all specifications.
The experience gained with the fabrication will now also be used to optimise the final prototype so that it can be produced
industrially without any further modifications. The product specific equipment necessary for this future production will
also be defined at this stage. The final prototypes are then fabricated, tested and sent to the customer. They can also
undergo the environmental and quality tests specified in the development contract.
The methodology for a Microsystem development is similar to the one for an ASIC with two notable differences. The first
difference is that the Microsystem methodology develops an IC, a microstructure and a package in parallel with much
emphasis on their interactions during the entire development stage. The second difference is that the ASIC development
methodology does not distinguish between first and industrial prototypes. The need for this distinction in Microsystems
stems from the fact that there are no standard test or assembly procedures for Microsystems. Therefore, the first
prototype is used to optimise the test and assembly procedures for industrial production. The resulting industrial prototype
is conceived in such a way that the prototype can be produced in large quantities without the need for redesign in the
industrialisation stage.
The system design phase is very important in the Microsystem methodology. In this phase, three different issues are
addressed. The first issue is system partitioning which distributes the customer specifications over the different
components of the Microsystem. The second issue is the choice of technologies for the different components and the
verification as to whether the component specifications can be met with the technologies chosen. The third issue is
concerned with the assembly and test of the components and the system. Given the small dimensions of a Microsystem, a
test and assembly concept must be worked out during the system design.
Throughout the entire methodology, there are checkpoints defined with the precise definition of the information which
must be available. The checkpoints are very effective in limiting the risks of the Microsystem development, since they
require the evaluation of all critical aspects of the Microsystem and split the development stage into shorter, well-defined
parts.
The methodology also helps to define the software tools needed for the Microsystem development. The requirements of
the software tools for each step of the development stage are based on the kind of information that must be available at the
end of the step. This has helped us to choose the software and implement the libraries necessary for each step. The
libraries in turn will help shorten the development time and reduce its cost, since they maximise the reuse of available
components and knowledge.
Conclusion
The methodology described above has been implemented at CSEM and is used for all our development projects. We are
using this methodology to define software tools and libraries necessary for Microsystem design and simulation. The
libraries of existing components are currently being implemented and will be extended as more components become
available.
The result of the methodology appears to be that development projects tend to get shorter, although it is too early to reach
a definitive conclusion. The methodology has certainly helped us in discussions with potential customers, because it
explains how a Microsystem development takes place and how it limits the risks of such a project.
What features of standard, microelectronic processes make them suitable for development as MST techniques?
Question 2
Question 3
Question 4
Question 5
Question 6
With reference to the article on “development technologies” from ref 4, answer the following:
Let us explore the design task more thoroughly. Go back to the bullet points of section 2.2 where we list the reasons for the well established IC design flow and the
difficulties in the MST case. Bearing these in mind, read the following article from the journal "MST News". (Click on the link below to reach the relevant copy of the
journal and look at the paper entitled "Moving MEMS CAD Tools into the next Century" - the file is in PDF format). Read this critically now, noting the following points
and answer the SAQs in order to assess your understanding of the material. As you study this, compare it to what you know of the IC design task and the CAD tools
available.
http://www.ami.bolton.ac.uk/courseware/msysmcm/ch2/mstnews0499.pdf
Note the point made in the final paragraph on the inaccuracy of process parameters. Again, this is a major problem. SAQs 7, 8 & 9 refer to this paper.
Question 7
In the preceding paper, what do the authors suggest to speed up Time To Market?
Question 8
What are suggested as being the main areas of difficulty in creating a CAD tool suite?
Question 9
Now have a look at the paper in MST News of 5/00 entitled "Towards Dedicated Design Tools for Microsystems".
http://www.ami.bolton.ac.uk/courseware/msysmcm/ch2/mstnews5-500.pdf
The first thing to note is that the first paper was from April 1999 whereas this is dated May 2000, i.e. just over a year later.
Read this paper carefully paying particular attention to the diagrams. Note the following:
The paper paints a fairly detailed picture of the requirements for an integrated set of design tools. Do you think this will ever come about? Can you think of any factors
that would slow the development of such a toolset? The IC CAD business is largely driven by the huge amounts of money at stake. The CAD vendors therefore have a
strong incentive to develop new tools and a very competitive market has resulted. For MST, the wide range of application areas and technologies needed would mean a
wide range of toolsets and a more fragmented market. This may limit the applicability of each and there may not be such an incentive for CAD vendors to get involved
in this field. Until the predicted large market develops, the development of the tools may be slow.
Question 10
What two design tools are used most in the ad-hoc approach to MST design?
Question 11
Question 12
What does the author suggest is the cause of the loss of information and a source of errors?
If you have the time, read the other articles in this issue of MST news. This journal is very comprehensive and is a good way of keeping up to date with
developments in MST. Indeed, we shall refer to more material from this source in the following chapters. If you develop an interest in the subject you
should get a subscription (it's free).
Demonstration
Finally for this chapter, you can download a demonstration version of an MST toolset. This is an executable file and you should be able to run it on your
PC. You will not be able to design any Microsystems with it but play around with it and, from your knowledge of IC design tools, see how it differs from
these. There are no SAQs on it.
Note: The MST toolset software will typically take fifteen minutes to download, depending on the speed of your modem
and internet connection.
Microsystems and Multichip Modules
Chapter Information
Self Assessment
3.1 Introduction
There are a number of ways in which we could classify and group the various processes
used in MST. As we have already ascertained, there is much to be gained by using
processes compatible with, or based upon, well understood semiconductor processes.
These use silicon (almost exclusively) as a base material. However, MST frequently
requires the use of other materials, so we will need additional processes. It may be
useful therefore to discuss MST in terms of silicon processing and specialised
processing. (This has the added advantage that the assigned textbook classifies it in this
way).
In this chapter, we shall look at the common silicon processes and talk about their
suitability for microsystem manufacture.
First of all, let us consider the key processes for microfabrication (note: the terms
micromachining and microfabrication are frequently used interchangeably). These were
mentioned at the end of the previous chapter and you should try to recall what they are.
Automated methods of reproducing patterns are the fundamental process required for
almost every mass production technique. (For example, think of the stamping of car
body parts in sheet steel). Traditional methods include printing, moulding, casting and
embossing. Difficulties arise when we try to apply such methods directly to the
dimensions required for microsystems. These lie in:
3.3 Deposition
Another key technique is that of deposition. As the term suggests, this is essentially a
process which deposits material onto the surface of the wafer. The main techniques are:
1. Epitaxy. This is a process whereby a very thin (1-10 micron) layer of doped silicon is
effectively grown on the wafer in such a way that the crystal structure is continuous
between the substrate and the deposited layer.
figure 3.2
4. Spin coating. In this process, a vacuum holds the wafer to a spinning chuck. A
liquid is applied which spins out to form a coating. This dries or polymerises to form a
layer of about 100 microns or greater. There is no precise control and the process tends
to planarise a non-planar surface (see figure 3.3 below). The technique is commonly
used for spinning of photoresist onto silicon wafers where such considerations are less
important.
figure 3.3
figure 3.4
figure 3.5
3.4 Etching
• They can be highly selective with regard to the material they etch.
• This means they can be controlled using the photolithographic method
with a patterned etch resist layer.
• The use of different etchants and other methods of control (such as
doping or etch stops - discussed later) provides a variety of techniques for
different effects.
Wet etching is the simplest process where the sample is placed in a liquid that dissolves
some materials but not others (typically, the mask material or a doped region). Broadly
speaking, etchants can be Isotropic or Anisotropic. Isotropic etchants attack the material
equally in all directions and Anisotropic etchants attack the material at different rates in
different directions. Anisotropic etching is particularly useful for cutting deep V-grooves
and trenches. We shall look at these in more detail later.
Dry etching uses a gaseous etchant. The gas is ionised and the ions are propelled to the
sample in an RF field. This is known as Reactive Ion Etching (RIE). It has a high
directionality and allows deep, steep-sided features to be made.
(A) wet isotropic (B) anisotropic in (100) silicon (C) reactive-ion (RIE)
figure 3.6
Typical effects of the two wet and the dry etching techniques are shown in figure 3.6
above.
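For the anisotropic case (B), the geometry can be checked with a little trigonometry. The sketch below assumes
the commonly quoted result for anisotropic wet etching of (100) silicon: the slow-etching (111) sidewalls meet
the surface at about 54.7 degrees, so a groove etched through a mask opening of width w self-terminates at a
depth of roughly (w/2) x tan(54.7 deg).

    # A minimal sketch of V-groove geometry in (100) silicon; 54.74 degrees
    # is the standard angle between the (111) and (100) planes.
    import math

    SIDEWALL_ANGLE_DEG = 54.74

    def v_groove_depth(mask_opening_um: float) -> float:
        """Self-limiting V-groove depth for a given mask opening width."""
        return (mask_opening_um / 2) * math.tan(math.radians(SIDEWALL_ANGLE_DEG))

    print(f"{v_groove_depth(100.0):.1f} um")  # a 100 um opening gives ~70.7 um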
In order to appreciate the processes required for MST, it is important to have a good
grasp of the Silicon Planar Process used in IC manufacture. In particular, note the
following:
Now read section 3.2 of the textbook. Note the process steps as set out on page 37. A
couple of important additions are bonding and encapsulation. Passivation is discussed
in section 3.2.6. of the book. It is important to note here that it is possible to leave a gap
in the passivation layer. With an IC it is necessary to leave such gaps (known as
windows) over the metal bond pads to allow bonding to take place. One important
technique we have not explicitly discussed above is doping. A good account of this is
given in section 3.2.1 and you should go over this carefully. Figure 3.2 is interesting
mainly because it comes from a source dated 1976. Although it doesn’t imply any
relative importance of the processes listed, CMOS is undoubtedly the most prominent
process today and this is where most research and investment goes. NMOS & PMOS
are by no means extinct, but they are very rare. One process not mentioned is SOI
(Silicon on Insulator). You may come across this as SOS (Silicon on Sapphire).
We have from time to time mentioned that much of the research into microsystems is
focused on solving the problems of compatibility with common IC production. The
April 1998 issue of the magazine "Semiconductor International" contains an article from
Sandia Labs showing one innovative approach (and mentions some interesting points
along the way). You should look at this. Note the following in particular:
The processes above are generally applied to silicon. Of course, silicon is just one of the
materials available to us to construct a microsystem. A number of other materials, both
passive and active are used in MST.
Question 2
Question 4
Question 5
Question 7
Now read the paper presented by Sniegowski et al. which you will find on the link below.
This gives a short overview of how one team sees the problem of adapting silicon
processes and integrating MEMS. The sections of note are those entitled "Integrating
MEMS and CMOS" and "Adapting to microsystem manufacture". Figure 3 is
interesting. This paper hints at some of the things we shall look at in the following
chapter."
http://www.semiconductor.net/semiconductor/issues/Issues/1998/apr98/docs/feature8.asp
Microelectronic Project Management
Chapter 1 - Introduction
Chapter Information
It may be felt in other environments, where innovation is not a key feature, that
the techniques of Project Management are unlikely to be appropriate. This would
be a dangerous misconception. Good management achieves most of its effects
from the successful introduction of change rather than by the supervision of the
status quo. Project Management in such an environment is particularly important
and demanding.
In projects like Manhattan and Apollo, requirements are not so flexible. First,
both projects were subject to severe time constraints. Manhattan, undertaken
during World War II, required that the atomic bomb be developed in the shortest
time possible, preferably ahead of the Nazis; Apollo, undertaken in the early
1960s, had to be finished by 1970 to fulfil President Kennedy's goal of landing a
man on the moon and returning him safely to earth “before the decade is out".
Both projects involved advanced research and development and explored new
areas of science and engineering. In neither case could technical performance
requirements be compromised to compensate for limitations in time, funding, or
other resource; to do so would increase the risk to undertakings that were already
very risky.
The next sections present an overview of the main stages of planning and
managing a typical microelectronics project.
All projects share one common characteristic - the projection of ideas and
activities into new endeavours. The ever-present element of risk and uncertainty
means that the steps and tasks leading to completion can never be described with
absolute accuracy in advance. For some complex projects the achievement of a
successful outcome may even be in question. The function of project management
is to foresee or predict as many of the dangers and problems as possible and to
plan, organise and control activities so that the project is completed successfully.
When the uncertainty of a project drops to nearly zero, and when it is repeated a
large number of times, then the effort is usually no longer considered a project.
For example, building a skyscraper is definitely a project, but mass construction
of prefabricated homes more closely resembles an assembly line than a project.
Admiral Byrd's exploratory flight to the South Pole was a project, but modern
daily supply flights to Antarctic bases are not. When (far in the future) tourists
begin taking chartered excursions to Mars, trips there will no longer be considered
projects either. They will just be ordinary scheduled operations.
In all cases, projects involve organisations which, after target goals have been
accomplished, go on to do something else (construction companies or
microelectronics project teams) or are disbanded (Admiral Byrd's crew, the Mars
exploration team). In contrast, repetitive, high certainty activities (prefabricated
housing, supply flights and tourist trips to Antarctica or Mars) are performed by
permanent organisations which do the same thing over and over, with little
change in operations other than rescheduling. That projects differ greatly from
repetitive efforts requires that they be managed differently.
MANAGEMENT + PLANNING
= PROJECT MANAGEMENT
The objectives of Project Management, as distinct from those of Project Planning,
are:
· to manage time and progress;
· to manage cost and cash-flow;
· to manage quality and performance.
The means by which these objectives are achieved include Project Planning,
together with co-ordinating, monitoring and controlling available resources.
The projects mentioned earlier in this chapter, the Great Pyramids of Egypt, the
Manhattan project, the Apollo space programme, and the development of Product
X all have something in common with each other and with every other
undertaking of human organisations: they all require, in a word, management.
Certainly the resources, work tasks and goals of these projects vary greatly, yet
without management none of them could happen.
The project must then be broken down into manageable work packages. From
here activities can be defined and, against the activities, resources can be
planned. The order in which activities are carried out, and the extent to which
parts of the project can be carried out concurrently is calculated. This is normally
achieved using techniques of network planning. The normal outcome here is a
list of start and finish dates for activities.
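As a rough illustration of what a network-planning tool does, the Python sketch below (the activities and
durations are invented for illustration) performs the forward pass: each activity's earliest start is the latest
earliest-finish among its predecessors, which yields exactly the list of start and finish dates mentioned above.

    # A minimal forward-pass sketch with invented activity data; real tools
    # also compute latest dates, float and the critical path.
    activities = {            # name: (duration in days, predecessors)
        "design":    (10, []),
        "layout":    (15, ["design"]),
        "test plan": (5,  ["design"]),
        "simulate":  (8,  ["layout", "test plan"]),
    }

    earliest = {}
    for name, (dur, preds) in activities.items():  # listed in topological order
        start = max((earliest[p][1] for p in preds), default=0)
        earliest[name] = (start, start + dur)

    for name, (s, f) in earliest.items():
        print(f"{name}: start day {s}, finish day {f}")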
In addition, a cost plan must be formulated such that the accumulation of costs
throughout the project can be investigated, planned and monitored.
Key players in the project team are the project manager, the ‘customer’ (external
or internal to the organisation), team members and senior management of the
company.
In addition to requiring the right mix of technical skills to carry out the work,
there is also a requirement for a good mix of ‘human’ skills. Too many natural
leaders can lead to chaos. Likewise, a group consisting of all ‘doers’, or all
‘thinkers’, would also not be effective. The formulation of teams from a ‘human’
perspective is as important as ensuring the right mix of technical knowledge
and expertise.
You may have come across the work of Belbin, who defined eight ‘personality’
types which are required for a team to function well. He said that some people
naturally prefer to operate in the role of ‘leader’, whereas others have a natural
tendency to prefer to carry out detailed work. Other ‘characteristics’ he
determined as being important to have in the team include someone who is
naturally good at encouraging and being enthusiastic, someone who can smooth
disputes, and someone who can be constructively critical - or evaluative - about
the work of the team. Belbin’s theory will be covered in more depth later in the
module.
5. Kliem, Ralph, The People Side of Project Management, Gower, 1994, ISBN
0556 0736 33 (library 658.404 KLI)
Self Assessment
Question 1
Describe a project in which you have been involved. State the objectives,
timescale and any penalties for not achieving the objectives.
Describe briefly how you broke the project into activities.
List any activities which you could carry out simultaneously and those that were
sequential.
Question 2
Describe the characteristics which classify a piece of work as a project rather than
a routine or regular task.
Question 3
Describe the main steps in a typical microelectronics project from definition
through to completion of the product design.
Microelectronic Project Management
Chapter Information
2.1 Introduction
This section deals with project definition, which should take place before a project is given the ‘go-ahead’ (authorisation).
Project specification provides information with which to appraise the proposed works against required outcomes (cost, time,
quality, fit for purpose etc.). It also should give a sound basis of information with which to carry out the detailed planning of
the project.
The best source of information for creating the project specification is the person or persons requiring the work. This may be
an external customer, or an internal manager or department. In either case the ‘initiator’ of the project will be termed ‘the
customer’ for the purposes of these notes. (Similarly, the person(s) carrying out the project work will normally be termed ‘the
contractor’) Ensuring that the customer’s specifications for the project are fully understood and documented is a vital first
step towards a successfully completed project.
During this session we will first examine the nature of projects, in order to understand the features which must be specified.
2.2 What is a project?
Why are some works considered "projects" while other human activities, such as planting and harvesting a crop, stocking a
warehouse, issuing payroll cheques, or manufacturing a product, are not?
What is a project? This is a question we will cover in more detail as we progress through the course. Just for an introduction
though, some characteristics will be listed that warrant classifying an activity as a project. They centre on the purpose,
complexity, uniqueness, unfamiliarity, stake, impermanence and life cycle of the activity.
2. Projects cut across organisational lines since they need to utilise skills and talents from multiple
professions and organisations. Project complexity often arises from the complexity of advanced
technology, which relies on task interdependencies, and may introduce new and unique problems. This
may be especially true for microelectronic applications.
3. Every project is unique in that it requires doing something different than was done previously. Even
in "routine" projects such as home construction, variables such as terrain, access, zoning laws,
labour market, public services and local utilities make each project different. A project is a one-time
activity, never to be exactly repeated again.
4. Given that a project differs from what was previously done, it also involves unfamiliarity. It may
encompass new technology and, for the organisation undertaking the project, possess significant
elements of uncertainty and risk. So the organisation usually has something at stake when doing a
project. The activity may call for special effort because failure would jeopardise the organisation or its
goals.
5. Projects are temporary activities. An ad hoc organisation of personnel, material, and facilities is
assembled to accomplish a goal, usually within a scheduled time frame; once the goal is achieved, the
organisation is disbanded or reconfigured to begin work on a new goal.
6. Finally, a project is the process of working to achieve a goal; during the process, projects pass
through several distinct phases, called the project life cycle. The tasks, people, organisations and other
resources change as the project moves from one phase to the next. The organisation structure and
resource expenditure slowly build with each succeeding phase, peak, and then decline as the project
nears completion.
Imagine a company which primarily carries out project work for customers. These notes are oriented towards a ‘building
contractor’ type of organisation. However, the same logic can be applied to a mechanical engineering jobbing shop, or indeed
a department receiving requests for work from other areas of a company. The project specification process described here is
in essence generic, although particular examples are mostly oriented towards construction.
Project definition is a process which starts when the customer or investor first conceives the idea of a project. It does not end
until the last piece of information has been filed to describe the project in its finished 'as built condition'. Figure 2.1 shows
some of the elements in the overall process. This section deals with the part of project definition that should take place before
a project is authorised; the part most relevant to setting the project on its proper course and which plays a vital role in helping
to establish any initial contractual commitments.
The sales specification is only the first stage in defining a project. The process is not complete until as-built records have
been made.
Figure 2.1 The process of project definition
With acknowledgements to Dennis Lock "Project Management" 1992
Enquiries and subsequent orders for commercial projects generally enter contracting companies through their sales
engineering or marketing organisation, and it is usually from this source that other departments learn of each new enquiry or
firm order. Even when enquiries bypass the sales organisation, sensible company rules should operate to ensure referral to the
marketing director or sales manager, so that every enquiry is ‘entered into the system'. This will ensure that every enquiry
received can be subjected to a formal screening process for assessing its potential project scope, risk and value. The work
involved in preparing a tender can easily constitute a small project in itself, needing significant preliminary engineering
design work plus sales and office effort that must be properly authorised and budgeted. The potential customer will almost
certainly set a date by which all tenders for the project must be submitted, so that time available for preparation is usually
limited. Everything must be planned, co-ordinated and controlled if a tender of adequate quality is to be delivered on time.
Some companies record their screening decision and appropriate follow-up action on a form.
Before any company (or internal department within a company) can even start the enquiry screening process, and certainly
before tender preparation can be authorised, the customer's requirements must be clearly established and understood. The
project must be defined as well as possible right at the start. The contracting company must know for what it is bidding and
what its commitments would be in the event of winning the contract. Similarly the product development department of a
manufacturing company must understand the product requirements, normally identified in conjunction with the marketing
department.
Adequate project definition is equally important for the customer, who must be clear on what he expects to get for his money.
Project definition is also just as important for a company considering an in-house project, where that company (as the
investor in the project) can be regarded as the customer.
All of this demands proper and extended project definition, the full scope of which would include the evaluation and
assessment of some or all of the following parameters:
· An outline description of the project, with its required performance characteristics quantified in
unambiguous terms.
· The total amount of expenditure expected to carry out the project and bring its resulting product into
use.
· The expected date when the product can be put effectively to its intended use.
· Forecast of any subsequent operating and maintenance costs for the new product.
· A forecast of the costs of financing likely over the appraisal period (bank interest rates, inflationary
trends, international exchange rate trends, and so on, as appropriate).
· Fiscal considerations (taxes or financial incentives expected under national or local government
legislation).
· A schedule which sets out all known items of expenditure (cash outflows) against the calendar.
· A schedule which sets out all expected savings or other contributions to profits (cash inflows) against
the same calendar.
· A net cash flow schedule (which is the difference between the inflow and outflow schedules, again
tabulated against the same calendar).
For short-term commercial projects the financial appraisal may take the form of a simple payback calculation. This sets out
expenditure against time (it could be tabulated or drawn as a graph) and also plots all the financial benefits (savings or
profits) on the same chart. Supposing that graphs were drawn, the point where the two graphs intersect is the break-even
point, where the project can be said to have 'paid for itself'. The time taken to reach this point is called the payback period.
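A minimal sketch of this payback calculation, using invented cash-flow figures: expenditure is entered as
negative, benefits as positive, and the payback period is simply the first year in which the running total turns
non-negative.

    # Invented net cash flows for years 0..4 (GBP); year 0 is the outlay.
    net_cash_flow = [-120_000, 30_000, 45_000, 50_000, 50_000]

    cumulative = 0
    for year, flow in enumerate(net_cash_flow):
        cumulative += flow
        if cumulative >= 0:
            print(f"break-even reached in year {year}")  # year 3 here
            break
    else:
        print("project does not pay back within the horizon")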
Any financial sum listed as a saving or cost item in future years will have its significance distorted through the passage of
time (for instance, £100 spent today is more expensive than spending £100 in a year's time, owing to lost interest that the
money might have earned for the investor on deposit in the meantime). Such distortions can have a considerable effect on the
forecast cash flows of a project lasting more than two or three years if factors are not introduced to correct them, and it is best
to use a discounting technique for the financial appraisal of long-term projects.
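For comparison, here is a sketch of one common discounting technique, net present value, applied to the same
style of cash-flow schedule. The 8% discount rate is an assumed figure for illustration; each year's flow is
divided by (1 + r)^t before summing, which corrects exactly the distortion described above.

    # A minimal NPV sketch; the discount rate is an assumed illustrative value.
    DISCOUNT_RATE = 0.08

    def npv(cash_flows, rate=DISCOUNT_RATE):
        """Net present value of year-indexed cash flows (year 0 undiscounted)."""
        return sum(flow / (1 + rate) ** t for t, flow in enumerate(cash_flows))

    print(f"NPV: {npv([-120_000, 30_000, 45_000, 50_000, 50_000]):,.0f}")
    # A positive NPV suggests the project earns more than the discount rate.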
Project managers do not, of course, have to be expert in all or any of the techniques of project financial appraisal. It may,
however, help to increase their determination to meet the defined objectives if they realise that these were key factors in an
earlier appraisal decision; factors (time, money, performance) on which the investor and the contractor are both dependent if
the completed project is to be a mutual success.
Project scope
Should the quotation be successful and a firm order result, the contractor will have to ensure that the customer's specification
is satisfied in every respect. His commitments will not be confined to the technical details but will encompass the fulfilment
of all specified commercial conditions. The terms of the order may lay down specific rules governing methods for invoicing
and certification of work done for payment. Inspection and standards may be spelled out in the contract and one would
certainly expect to find a well-defined statement of delivery requirements. There may even be a warning that a condition of
the resulting contract will provide for penalties to be paid by the contractor should he default on the agreed delivery dates.
Any failure by the contractor to meet his contractual obligations could obviously be very damaging for his reputation. Bad
news travels fast throughout an industry, and the contractor's competitors will, to put it mildly, not attempt to slow the
process. The contractor may suffer financial loss if the programme cannot be met or if he has otherwise miscalculated the
size of the task which he undertook. It is therefore extremely important for the contractor to determine in advance exactly
what the customer expects for the money.
The customer's specification should therefore set out all the requirements in unambiguous terms, so that they are understood
and similarly interpreted by customer and contractor alike. Much of this section deals with the technical requirements of a
specification but, equally important, is the way in which responsibility for the work is to be shared between the contractor,
the customer, and others. In more precise terms, the scope of work required from the contractor, the size of his contribution
to the project, must be made clear.
At its simplest, the scope of work required might be limited to making and delivering a piece of hardware in accordance with
drawings supplied by the customer. At the other extreme, the scope could be defined so that the contractor handles the project
entirely, and is responsible for conceptual design through until the purchaser is able to accept delivery of a fully completed
and proven project (known as a turnkey operation).
Whether the scope of work lies at one of these extremes or the other, there is always a range of ancillary items that have to be
considered. Will the contractor be responsible for any training of the customer's staff and, if so, how much (if any) training is
to be included in the project contract and price? What about commissioning, or support during the first few weeks or months
of the project's working life? What sort of warranty or guarantee is going to be expected? Are any training, operating or
maintenance instructions to be provided? If so, in what language?
Answers to all of these questions must be provided, as part of project definition, before cost estimates, tenders and binding
contracts can be considered. Checklists are a useful way of ensuring that nothing important is forgotten.
Use of checklists
Contractors who have amassed a great deal of experience in their particular field of project operation will learn the type of
questions that must be asked of the customer in order to fill in most of the information gaps and arrive at a specification that
is sufficiently complete.
The simplest level of checklist use is seen when a sales engineer takes a customer's order for equipment that is standard, but
which can be ordered with a range of options. The sales engineer will use a pad of pre-printed forms, ticking off the options
that the customer requires. People selling replacement windows with double-glazing use such pads. So do some automobile
salesmen. The forms are convenient and help to ensure that no important details are omitted when the order is taken and
passed back to the factory for action.
Concept options
It is well known that the desired end results of a project can often be achieved by a variety of technical or logistical concepts.
There could be considerable differences between proposals submitted by companies competing for the same order. Once an
order has been won, however, the successful contractor knows the general solution which has been chosen. The defeated
alternative options will usually be relegated to history. But there will still remain a considerable range of possibilities for the
detailed design and make-up of the project within the defined boundaries of the accepted proposal and its resulting contract.
Taking just a tiny element of a technical project as an example, suppose that a plant is being designed in which there is a
requirement to position a lever from time to time by automatic remote control. Any one or combination of a number of drive
mechanisms might be chosen. Possibilities include hydraulic, mechanical, pneumatic, or electromagnetic devices. Each of
these could be subdivided into a further series of techniques. If, for example, an electromagnetic system were chosen this
might be a solenoid, a stepping motor or a servo motor. There are still further possible variations within each of these
methods. The device chosen might have to be flameproof or magnetically shielded, or special in some other respect. Every
time the lever has been moved to a new position, several ways can be imagined for measuring and checking the result.
Electro-optical, electrical, electronic or mechanical methods could be considered. Very probably the data obtained from this
positional measurement would be used in some sort of control or feedback system to correct errors. There would, in fact,
exist a very large number of permutations between all the possible ways of providing drive, measurement and positional
control. The arrangement eventually chosen might depend not so much on the optimum solution (if such exists) as on the
contractor's usual design practice or simply on the personal preference of the engineer responsible.
With the possibility of all these different methods for such a simple operation, the variety of choice could approach infinite
proportions when the permutations are considered for all the design decisions for a major project. It is clear that coupled with
all these different possibilities will be a correspondingly wide range of different costs, since some methods by their very
nature must cost more than others. When a price or budget is quoted for a project, this will obviously depend not only on
economic factors (such as the location of the firm and its cost/ profit structure) but also on the system and detailed design
intentions.
It can be seen that owing to their cost implications, the main technical proposals must be established before serious attempts
at estimating can start. Once these design guidelines have been decided, they must be recorded in a provisional design
specification. If this were not done, there would be a danger that a project could be costed, priced and sold against one set of
design solutions but actually executed using a different, more costly, approach. This danger is very real. It occurs in practice
when the period between submitting a quotation and actually receiving the order exceeds a few months, allowing the original
intentions to be forgotten. It also happens when the engineers carrying out the project work decide not to agree with the
original proposals (sometimes called the 'not invented here' syndrome). Projects in this author's experience have strayed so
far from their original design concept for such reasons that their total costs reached more than double their budgets.
Similar arguments apply concerning the need to associate the production methods actually used in manufacturing projects
with those assumed in the cost estimates and subsequent budgets. It can happen that certain rather bright individuals come up
with suggestions during the proposal stage for cutting corners and saving expected costs - all aimed at securing a lower and
more competitive tender price. Provided that these ideas are recorded with the estimates, all will be well and the cost savings
can be achieved when the project goes ahead. Now imagine what could happen if, for instance, a project proposal were to be
submitted by one branch of the organisation, but that when an order eventually materialised responsibility for carrying out
the work was switched to a production facility at some other location in the organisation, with no record of the production
methods originally envisaged. The cost consequences could prove to be nothing short of disastrous. Unfortunately, it is not
necessary to transfer work between locations for mistakes of this kind to arise. Even the resignation of one production
engineer from a manufacturing company could produce such consequences if his intentions had not been adequately
recorded. The golden rule, once again, is to define and document the project in all essential respects before the estimates are
made and translated into budgets and price.
Development programmes aimed at the introduction of additions or changes to a company's product range are perhaps more
prone than most to overspending on cost budgets and timescale. One possible cause of this phenomenon is that chronic
engineer's disease which might be termed 'creeping improvement sickness'. Many will recognise the type of situation
illustrated in the following case study:
Case Study Questions
1. Imagine you are involved in discussions between the chief engineer, marketing and the production manager before the
design engineer (George) is called in to be briefed on the new product. You decide that you would like to create a written
project specification, rather than rely on a verbal briefing. Draw up a list of the information you would include in the
specification for the new product development project.
2. How would the team decide an appropriate target cost for the unit? How would the target completion date be set?
3. Imagine now that the engineer has received the design brief, along with the written specification you have developed
above. Which problems from the case study would have been avoided by creating and agreeing the project specification?
Although the customer may be clear from the very first about his needs, it is usual for dialogue to take place between the
customer and one or more potential contractors before a contract is signed. During this process each contractor can be
expected to make various proposals for executing the project that effectively add to or amend the customer's initial enquiry
document. In some companies this pre-project phase is aptly known as solution engineering, since the contractor's sales
engineers work to produce and recommend an engineering solution which they consider would best suit the customer (and
win the order).
Solution engineering may last a few days, several months, or even years. It can be an expensive undertaking (especially when
the resulting tender fails to win the contract). Although it is a nice tidy theory to imagine the contractor's sales engineers
putting their pens to paper and writing the definitive project specification at the end of the solution engineering phase, the
practice is likely to be quite different. An original descriptive text, written fairly early in the proceedings, will undergo
additions and amendments as the solution develops, and there will probably be a pile of drawings, artists' impressions,
flowsheets, schedules (or other documents appropriate to the type of project) which themselves have undergone amendments
and substitutions. A fundamental and obvious requirement when the contract is signed is to be able to identify positively
which of these versions record the actual contract commitment. Remember that the latest issue of any document may not be
the correct issue.
Consider, therefore, the composition of a project specification. The following arrangement provides the basis for
unambiguous definition of project requirements at any stage by reference to the specification serial number and its correct
revision number. The total specification will comprise:
1. Binder or folder The specification for a large project is going to be around for some time and receive considerable
handling. It deserves the protection of an adequate binder or folder. This should carry the project number and title,
prominently displayed for easy identification. The binder should be loose-leaf, to allow for the addition or substitution of
amended pages.
2. Descriptive text The narrative describing the project should be written clearly and concisely. The text should be preceded
by a contents list, and be divided logically into sections, with all pages numbered. Every amendment must be given an
identifying serial number or letter, and the overall amendment (or revision) number for the entire specification must be raised
each time the text is changed. Amended paragraphs or additional pages must be highlighted, for example by placing the
relevant amendment number alongside the change (possibly within an inverted triangle in the manner often used for
engineering drawings).
3. Supporting documents Most project specifications need a number of supporting engineering and other documents that
may be too bulky for binding in the folder. All these documents must be listed and treated as part of the specification (see the
next item).
4. Control schedule of specification documents This vital part of the specification should be bound in the folder along with
the main text (either in the front or at the back). This schedule must list every document which forms part of the complete
specification or which is otherwise relevant to adequate project definition (for example, a standard engineering specification).
Minimum data required for each document are its serial and correct revision numbers. Preferably the title of each document
should also be given. Should any of the associated documents itself be complex, it should have its own in-built control
schedule. It usually helps if a control schedule is given the same serial and amendment numbers as the main document which
it is controlling.
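To make the idea concrete, a control schedule is essentially a small table of records, one per controlled
document. The sketch below (the field and document names are our own invention, not prescribed by the text)
captures the minimum data identified above - serial number and correct revision number, preferably with the
title as well.

    # A minimal sketch of a control schedule as a data structure; the
    # documents listed are invented examples.
    from dataclasses import dataclass

    @dataclass
    class ControlledDocument:
        serial: str    # document serial number
        revision: int  # correct (contractual) revision number
        title: str

    control_schedule = [
        ControlledDocument("PS-1001", 3, "Descriptive text"),
        ControlledDocument("DRG-2044", 1, "General arrangement drawing"),
        ControlledDocument("ES-0007", 5, "Standard engineering specification"),
    ]

    for doc in control_schedule:
        print(f"{doc.serial}  rev {doc.revision}  {doc.title}")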
· The following self assessment questions are intended to reinforce the material presented in chapter
2.
· Once you have completed the questions click the 'submit' button at the bottom of the page.
· To reset given answers and start again click the 'reset' button at the bottom of this page.
Question 1
Describe the four elements of a documented project specification.
Question 2
What factors must be specified in order to appraise a project financially?
Question 3
How should amendments be incorporated into the specification?
Question 4
List the type of information which should be sought from the project "customer" in order to specify a microelectronics
project.
Question 5
List the information the "contractor" would add to complete the specification.
Question 6
Describe QFD and the main steps in its implementation. What are the benefits of the approach? How could it be applied to a
Microelectronics example?
Microelectronic Project Management
3.1 Introduction
Throughout the history of project management, project managers have managed
their projects according to three criteria: cost, schedule, and quality (see Figure
3.1). They treated all other considerations as subordinate.
Ironically, following this approach has not proven too successful for any of the three criteria. Projects in most industries
often exceed project completion dates by months, even years, and overrun their budgets by thousands, even millions, of
pounds. In addition, each criterion seems to go in different directions. Meeting the schedule often means foregoing budget
and quality considerations. Adhering to budget frequently means sacrificing quality or ignoring the schedule. Concentrating
on quality means 'blowing' the budget or ignoring the schedule. All this has occurred when project managers have a wide
array of project management tools and techniques at their disposal. Many plan their projects by developing work breakdown
structures, time estimates, and network diagrams. Many organise their projects by developing organisation charts and forms
and allocating resources. Many control their projects by collecting information on progress of the project and developing
contingency plans to address anticipated problems. In addition, these tools and techniques have become more sophisticated
and automated. Then why the dismal record, at least from the perspective of the three criteria?
Figure 3.1 - Criteria for managing projects
The answer is that schedule, budget, and quality are not enough. One other
important criterion is missing: people.
What many project managers fail to realise is that their handling of people affects
the outcome of their projects. Indeed, their neglect or mismanagement of people
can affect schedule, cost, and quality.
Successful project managers are those who recognise the importance of people in
completing their projects. They know that without people no project would exist
in the first place. They also recognise that people play an integral role in
completing the project within budget, on schedule, and with top workmanship.
The people side is not more important than the hard side, and vice versa. Rather,
project managers must recognise the equal importance of both sides. That entails
adding the fourth important criterion, people, to the traditional three: cost,
schedule, and quality.
3.2 The major players
To progress smoothly, project management requires that four key players (shown
in Figure 3.2) participate. These players are the project manager, senior
management, client, and the project team.
Some project managers do not participate in a project even though they hold the
title of 'project manager.' They may be uninterested in the project because it was
forced upon them or they assumed the position by circumstance. In response to
this situation, they may fail to plan, organise, control, or lead these projects
adequately. The results are unsuccessful projects, that is, projects that fail to meet
goals and objectives with regard to cost, schedule, and quality.
The project manager of today plays an important central role in ensuring that communication and co-ordination among
different participants occur efficiently and effectively. If project managers fail to perform such tasks disaster is soon
forthcoming.
PROJECT MANAGER
• Orchestrates successful delivery of the project
• Enables interactive communications among senior management, client and project team
• Co-ordinates effective and efficient participation
• Develops project plans, including estimates, work breakdown structure and schedules
• Provides mechanism for monitoring and tracking progress regarding schedule, budget and technical performance
• Creates infrastructure for managing the project team
SENIOR MANAGEMENT
• Determine project's fate (proceed or stop)
• Allocate project support resources including money and manpower
• Identify favoured or preferred projects
• Continued participation throughout the life cycle
• Provide strategic guidance and direction
CLIENT
• Pays for project/product
• Co-ordinates with project manager for project/product clarification
• Uses the product
• Approves the product
• Dedicates resources to the project including people, time and money
PROJECT TEAM
• Supports the project manager
• Provides requisite skills and creativity
• Operates as a unified team
• Works with the clients to obtain requirements, feedback and approvals
Figure 3.2 - Responsibilities of four key players in projects
3.4 Senior management
The project manager needs the participation of senior management because much
power resides with them. Senior management decides whether the project will
proceed. They also determine the extent of support the project will receive
relative to other projects. If they do not view the project as having much
importance, senior management will allocate resources to more ‘significant'
endeavours. If they have a favourable view, the opposite will occur.
The importance of senior management's participation becomes very clear when
there is a split over how important a project is. This may give a project a 'stop and
go' mode of operation which can result in poor productivity and low morale. The
problem can become even worse if management withdraws their support.
Senior management must not, however, adopt a policy of benign neglect. They
must keep abreast of what occurs on the project. The emphasis is on what, not
how. Feedback up and down the chain of command is absolutely essential.
3.5 Client
The client is the reason why the project exists in the first place. Clients may be
internal or external (outside the company). They pay for the project, either at the
beginning or later. Their participation, like that of senior management, is
principally during the start and end of a project.
The client is not always a monolithic entity but may comprise several types of
people. First, there are the people who pay for the product; they typically are the
principal decision-makers. Second, there are the people whose function is co-ordination
with the project manager during most of the project; they are the main
contacts for information and clarification. Third, and finally, there are the people
who will use the product in their operational environment; they see the value of
the product in terms of how it improves their productivity.
Dealing with the client requires sensitivity. What and how much to tell a client
depends on your relationship. The best policy, from your perspective as a project
manager, is to maintain an open and honest relationship. Any hint of dishonesty
or duplicity can result in a breakdown of communications and a cancellation of
agreements.
There is another aspect to the requirement for sensitivity and project managers
find themselves caught in a political crossfire. They can make one person on the
client's side happy and inadvertently anger someone else. Project managers must
always be aware of this possibility and focus on the key participants (with respect
to political power) in the client's organisation.
Unity and co-operation among team members are absolutely necessary. Projects
involve a diverse number of specialised skills which must complement one
another in achieving goals. If team members fight with one another, energy is
directed into unproductive endeavours. If the team members fight with the client,
the latter can withhold co-operation, or, worse, cancel the contract. If team
members fight with senior management, communications up and down the chain
of command suffer and so, ultimately, will productivity.
Without the support of any one of these people, the quality of the product will
decline. As the project manager, you play an important role in ensuring that senior
management, client, and project team contribute to your project. If your
relationship with them deteriorates in any way or if their relationships with one
another worsen, the people side of project management can prove very difficult
and damage progress, affecting schedule, budget, and workmanship.
R. Meredith Belbin describes in his book Management Teams: Why They Succeed
or Fail (1981, Butterworth Heinemann) a number of team roles that team members
adopt when they are brought into a team. Understanding the different roles
helps the members find their place during the life-cycle of the project.
Teams need to have several of the different roles present in order to be successful.
A common mistake is to select people on their technical competence alone.
Research carried out by Belbin showed that teams made up entirely of very clever
members performed significantly worse than other teams. This
phenomenon has been identified as the Apollo syndrome.
As individuals differ greatly in personality and in behaviour, so will the team role profiles of
individuals vary. The research that Dr. Belbin conducted showed that the natural variations in
different team roles of individuals can give strength to a team if they occur in the right
combination and are used in an appropriate setting. A need thus developed for a way to make
these research findings available and useful.
The value for project management is an understanding of the necessary ingredients for a
successful team. The book includes a self-perception questionnaire that indicates a person's
preferred roles as they themselves perceive them. It can be used by teams as a team-building
exercise in which team members get to know and understand each other.
Knowledge of preferred team function can be useful both for individuals and the
project manager on many levels.
The website indicated above shows how a "team map" can be constructed to
visualise the team in terms of its members' preferred roles. This can help to
highlight potentially problematic situations such as a team where most members
tend towards one role. One can imagine the difficulties if for example, the team is
made up entirely of the "plant" role, where plenty of ideas might be conceived,
but the team has great difficulty in following these through. The Belbin theory
suggests a balanced team, with strengths in most of the roles would be most
effective (at least in terms of team dynamics).
Knowledge of one's own preferred role(s) can also be useful in aiding team
dynamics. If for example, the team seems to lack a natural leader, or perhaps is
struggling with various conflicts, individual members "can take up the missing
role" in an attempt to move the team forward.
Place your answers in an email or an attached Word file (using the button below) and please
indicate whether or not you would like a response.
Question 3. - Describe the Belbin team roles, and how the theory could be applied in the
Microelectronics industry.
Introduction to Programmable Logic Components
• PAL: a small FPD that contains two levels of logic, an AND-plane and an OR-plane; the
AND-plane is programmable while the OR-plane is fixed.
• FPGA: Field Programmable Gate Array is an FPD featuring a general structure that
allows very high logic capacity. Whereas CPLDs feature logic resources with a wide
number of inputs, FPGAs offer more narrow logic resources. FPGAs also offer a higher
ratio of flip-flops to logic resources than CPLDs.
• Interconnect: the wiring resources in an FPD.
• Programmable Switch: a user programmable switch that can connect a logic element to an
interconnect wire, or one interconnect wire to another.
• Logic Block: a relatively small circuit block that is replicated in an array in an FPD. When a
circuit is implemented in an FPD, it is first decomposed into smaller sub-circuits that can
each be mapped into a logic block.
• Logic Capacity: The amount of digital logic that can be mapped into a single FPD.
• The first type of programmable device was the programmable read-only memory (PROM). The PROM
is a one-time programmable device that consists of an array of read-only cells.
However, field programmable devices offer advantages that often out-weigh their
speed-performance shortcomings:
• Field programmable chips are less expensive
• They can be programmed in a very short time.
Two field programmable variants of the PROM, the EPROM and the EEPROM,
can both be erased and reprogrammed many times.
• Programmable Logic Devices PLDs are designed specifically for implementing logic circuits.
PLDs can only implement small logic circuits that can be represented with a modest number of
product terms.
•Programmable logic elements fall under the category of Application Specific Integrated Circuits
ASICs.
•A full custom IC includes logic cells that are customized and all mask layers that are customized.
A microprocessor is an example of a full-custom IC.
Full-custom ICs are the most expensive to manufacture and to design. The time needed
to manufacture an IC, called the manufacturing lead time, is typically 8 weeks
for a full-custom IC. Usually these ICs are application specific and are therefore
called Application Specific Integrated Circuits (ASICs).
Semicustom ASICs are of two types: standard-cell-based and gate-array-based ASICs.
Programmable ASICs are those that have predesigned logic cells.
There are two types of programmable ASICs: the programmable logic device
(PLD) and the field programmable gate array (FPGA).
• ROMs are memories that can generate all the possible minterms of their inputs. Diodes
are used to sum minterms into a sum-of-products expression.
Example: Consider a ROM that has 3 inputs and that would satisfy the following
function:
Example: Draw the schematic diagram of a ROM implementation of the following functions:
W=/A /C + A /C /D + A /B D
X=A /C + BD + AB + /A /B C /D
Y=/C /D + /A B + B /C + A /B D
Z=/C D + A B /C + /A B /D + A /B C /D
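As a rough illustration (not part of the original notes; the function names and the choice to model only W and X are mine), the following Python sketch treats the ROM as a full lookup table: one word per minterm of A, B, C, D, with each output column being the OR of the minterms whose stored bit is 1. Y and Z would simply be two further bits per word.

    # A ROM with n inputs stores one word for each of the 2**n minterms.
    def build_rom():
        rom = []
        for addr in range(16):                       # one word per minterm of A,B,C,D
            a = (addr >> 3) & 1
            b = (addr >> 2) & 1
            c = (addr >> 1) & 1
            d = addr & 1
            # W = /A./C + A./C./D + A./B.D
            w = ((1 - a) & (1 - c)) | (a & (1 - c) & (1 - d)) | (a & (1 - b) & d)
            # X = A./C + B.D + A.B + /A./B.C./D
            x = (a & (1 - c)) | (b & d) | (a & b) | ((1 - a) & (1 - b) & c & (1 - d))
            rom.append((w << 1) | x)                 # 2-bit word: bit 1 = W, bit 0 = X
        return rom

    for addr, word in enumerate(build_rom()):
        print(f"{addr:04b} -> W={word >> 1} X={word & 1}")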
Example: Consider the following set of logic equations:
W=A B /C + A /B C + /A /B /C
X=A /B + /A C
Y=A B + /A /B + A C
Z=A /B /C + B C
What size of PAL is needed to implement these functions? Draw the schematic diagram.
Notice that there are three inputs and four outputs, that the maximum number of product terms per output is three, and that
each product term must be able to select its literals from the three input variables.
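A minimal Python sketch (mine, not from the notes; the dictionary name and the "/X means NOT X" text encoding are illustrative) of the sizing argument above: count the distinct input variables, the number of outputs, and the largest number of product terms feeding any one OR-gate.

    # /X denotes NOT X; product terms are separated by '+'.
    EQUATIONS = {
        "W": "A B /C + A /B C + /A /B /C",
        "X": "A /B + /A C",
        "Y": "A B + /A /B + A C",
        "Z": "A /B /C + B C",
    }

    def pal_size(eqs):
        inputs, max_terms = set(), 0
        for expr in eqs.values():
            terms = [t.split() for t in expr.split("+")]
            max_terms = max(max_terms, len(terms))
            for term in terms:
                inputs.update(lit.lstrip("/") for lit in term)
        return len(inputs), len(eqs), max_terms

    n_in, n_out, n_terms = pal_size(EQUATIONS)
    print(f"{n_in} inputs, {n_out} outputs, {n_terms} product terms per OR-gate")

Running it prints "3 inputs, 4 outputs, 3 product terms per OR-gate", which is exactly the sizing noted above.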
Field Programmable Gate Array FPGA
• FPGAs were introduced by Xilinx in 1985. Since then, many different FPGAs have been developed
by a number of companies: Actel, Altera, Synopsys and others.
• An FPGA consists of a two-dimensional array of logic blocks that can be connected by general
interconnection resources.
• The interconnect comprises segments of wire, where the segments may be of various lengths.
Within the interconnect are programmable switches that serve to connect the logic blocks to
the wire segments, or one wire segment to another.
• Logic circuits are implemented in FPGA by partitioning the logic into individual logic blocks
and then interconnecting the blocks as required via the switches.
Logic Blocks
• The structure and contents of a logic block are called the architecture.
Logic block architecture can be designed in many different ways.
Most logic blocks also contain some type of flip-flops to aid in the
implementation of sequential circuits.
• The structure and content of the interconnect in an FPGA is called its routing architecture.
• The routing architecture consists of both wire segments and programmable switches; the switches
may be implemented with static RAM cells, anti-fuses, or EPROM and EEPROM transistors.
• There exist many different ways to design the structure of the routing architecture.
Applications of FPGAs
• FPGAs can be used in almost all of the applications that currently use Mask-
Programmable Gate Arrays, PLDs and Small Scale Integration SSI logic chips.
The following are a few categories of such designs:
Implementation Process
• The starting point is the design entry of the circuit to be implemented. This step
typically involves:
• drawing a schematic
• entering HDL code
• specifying Boolean expressions
• or specifying state diagrams
• Regardless of the initial design entry method, the circuit is translated into
a standard form such as Boolean expressions.
• The Boolean expressions are then processed by a logic optimizer tool, which manipulates the
expressions. The goal is to optimize these expressions so as to improve the area and/or speed of
the final circuit.
• The optimized Boolean expressions are then transformed into a circuit of FPGA logic blocks.
This is done by a technology mapping program.
• The mapper may attempt to optimize the total number of blocks required (area optimization).
• Alternatively, the objective may be to minimize the number of stages of logic blocks in time-
critical paths (delay optimization).
• The next step is to decide on where to place each block in the FPGA’s array.
• Typical placement algorithms attempt to minimize the total length of interconnect required for the
resulting placement.
• The next step is the routing. The routing assigns the FPGA’s wire segments and chooses
programmable switches to establish the required connections among the logic blocks. It is often
necessary to do routing such that propagation delays in time-critical connections are minimized.
• The final step is the programming. The CAD system's output is fed to a
programming unit which configures the final FPGA chip.
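As a toy illustration of the placement step described above (my sketch, not from the notes; the block names and coordinates are invented), placement tools commonly estimate total interconnect length per net with a half-perimeter bounding box and try to minimise the sum:

    def hpwl(nets, placement):
        """Half-perimeter wirelength: bounding-box size summed over all nets."""
        total = 0
        for net in nets:                              # each net lists the blocks it connects
            xs = [placement[b][0] for b in net]
            ys = [placement[b][1] for b in net]
            total += (max(xs) - min(xs)) + (max(ys) - min(ys))
        return total

    placement = {"b0": (0, 0), "b1": (2, 1), "b2": (1, 3)}   # block -> (column, row)
    nets = [("b0", "b1"), ("b0", "b1", "b2")]
    print("estimated wirelength:", hpwl(nets, placement))   # 3 + 5 = 8

A placement algorithm would repeatedly move blocks and keep changes that reduce this cost.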
Programmable Technologies
ROM
• Each bit position in the memory consists of a transistor switch, bipolar
or MOS connected in series with a small fuse.
Fuse
• A fuse breaks open when the current flowing through it exceeds a certain limit.
• As the fuse starts to heat up its resistance falls, which rapidly increases the current
flow, causing further rapid heating until the fuse is vaporised to form an open circuit.
Antifuse
• In a poly–diffusion antifuse the high current density causes a large power dissipation
in a small area, which melts a thin insulating dielectric between polysilicon and diffusion
electrodes and forms a thin (about 20 nm in diameter), permanent, and resistive silicon link.
• The programming process also drives dopant atoms from the poly and diffusion
electrodes into the link, and the final level of doping determines the resistance
value of the link.
Metal-Metal Antifuse
– The second advantage is that the direct connection to the low-resistance metal
layers makes it easier to use larger programming currents to reduce
the antifuse resistance.
Static RAM
• This Xilinx SRAM configuration cell is constructed from two cross-coupled inverters
and uses a standard CMOS process. The configuration cell drives the gates of other
transistors on the chip—either turning pass transistors or transmission gates on to make a
connection or off to break a connection.
• The advantages of SRAM programming technology are that designers can reuse chips
during prototyping and a system can be manufactured using in-system programming (ISP).
• This programming technology is also useful for upgrades—a customer can be sent a new
configuration file to reprogram a chip, not a new chip.
• Designers can also update or change a system on the fly in reconfigurable hardware.
• The disadvantage of using SRAM programming technology is that you need to keep
power supplied to the programmable ASIC (at a low level) for the volatile SRAM
to retain the connection information.
• Alternatively you can load the configuration data from a permanently programmed memory
(typically a programmable read-only memory or PROM ) every time you turn the system on.
• The total size of an SRAM configuration cell plus the transistor switch that the SRAM
cell drives is also larger than the programming devices used in the antifuse technologies.
EPROM
Erasure is caused by shining an intense ultraviolet light through a window that is designed
into the memory chip. (Although ordinary room lighting does not contain enough ultraviolet light
to cause erasure, bright sunlight can cause erasure. For this reason, the window is usually
covered with a label when not installed in the computer.)
• Altera MAX 5000 EPLDs and Xilinx EPLDs both use UV-erasable EPROM cells as
their programming technology.
• The EPROM cell is almost as small as an antifuse. An EPROM transistor looks like a normal
MOS transistor except it has a second, floating, gate.
• Applying a programming voltage VPP (usually greater than 12 V) to the drain of the n-channel
EPROM transistor programs the EPROM cell.
• A high electric field causes electrons flowing toward the drain to move so fast they “jump” across
the insulating gate oxide where they are trapped on the bottom, floating, gate.
• We say these energetic electrons are hot and the effect is known as hot-electron injection
or avalanche injection . EPROM technology is sometimes called floating-gate avalanche
MOS (FAMOS ).
EEPROM
Unlike EPROM chips, EEPROMs do not need to be removed from the computer to
be modified. However, an EEPROM chip has to be erased and reprogrammed in its
entirety, not selectively.
• The movement of the data stored in registers and the processing performed on them is
referred to as register transfer operations. Register transfer operations involve the set of
registers in the system, the operations that are performed on the data stored in the registers, and
the control signals that sequence the microoperations in a prescribed manner.
Basic Technology Requirements
Switching
An input must be able to affect an output conditionally
At least one of the functions AND or OR
Inversion
It must be possible to invert (NOT) the logic state
Amplification
Nothing is 100% efficient so some gain must be provided
Instantiation
It must be possible to make many copies of the basic elements and
connect them selectively
The output of one gate is connectable to the inputs of others
Specifically:
The Xilinx FPGAs used in the laboratories use RAM look-up tables.
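As an illustrative aside (my sketch, not from the original notes; the function names are invented), the requirements above can be seen in miniature: a single universal gate such as NAND provides switching and inversion in one element, and instantiation lets copies be composed into AND and OR.

    def nand(a, b):
        return 1 - (a & b)            # switching and inversion in a single element

    def not_(a):    return nand(a, a)
    def and_(a, b): return not_(nand(a, b))
    def or_(a, b):  return nand(not_(a), not_(b))   # instantiation: copies wired together

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", and_(a, b), "OR:", or_(a, b))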
The VLSI part of this course is concerned primarily with the way gates are integrated on to
large devices. Of course FPGAs, PLDs, memories etc. are also made in very similar ways.
The integrated circuits most commonly encountered in industry are Application Specific
Integrated Circuits (ASICs). These are devices aimed at integrating as many desired functions
as possible onto one device.
Abstract
We describe the recent evolution of FPDs. The three main categories of FPDs are delineated: Simple
PLDs (SPLDs), Complex PLDs (CPLDs) and Field-Programmable Gate Arrays (FPGAs). We
then give details of the architectures of all of the most important commercially available chips.
Unlike previous generations of technology, in which board-level designs included large numbers
of SSI chips containing basic gates, virtually every digital design produced today consists mostly
of high-density devices. This applies not only to custom devices like processors and memory, but
also to logic circuits such as state machine controllers, counters, registers, and decoders. When
such circuits are destined for high-volume systems they have been integrated into high-density
gate arrays. However, gate array NRE costs often are too expensive and gate arrays take too long
to manufacture to be viable for prototyping or other low-volume scenarios. For these reasons,
most prototypes, and also many production designs are now built using FPDs. The most compel-
ling advantages of FPDs are instant manufacturing turnaround, low start-up costs, low financial
risk and (since programming is done by the end user) ease of design changes.
The market for FPDs has grown dramatically over the past decade to the point where there is
now a wide assortment of devices to choose from. A designer today faces a daunting task to
research the different types of chips, understand what they can best be used for, choose a particu-
lar manufacturer’s product, learn the intricacies of vendor-specific software and then design the
hardware. Confusion for designers is exacerbated by not only the sheer number of FPDs avail-
able, but also by the complexity of the more sophisticated devices. The purpose of this paper is to
provide an overview of the architecture of the various types of FPDs. The emphasis is on devices
with relatively high logic capacity; all of the most important commercial products are discussed.
Before proceeding, we provide definitions of the terminology in this field. This is necessary
because the technical jargon has become somewhat inconsistent over the past few years as companies have introduced their own terminology.
• Field-Programmable Device (FPD) — a general term that refers to any type of integrated cir-
cuit used for implementing digital hardware, where the chip can be configured by the end user
to realize different designs. Programming of such a device often involves placing the chip into
a special programming unit, but some chips can also be configured “in-system”. Another name
for FPDs is programmable logic devices (PLDs); although PLDs encompass the same types of
chips as FPDs, we prefer the term FPD because historically the word PLD has referred to rela-
tively simple types of devices.
• PLA — a Programmable Logic Array (PLA) is a relatively small FPD that contains two levels
of logic, an AND-plane and an OR-plane, where both levels are programmable (note: although
PLA structures are sometimes embedded into full-custom chips, we refer here only to those
PLAs that are provided as separate integrated circuits and are user-programmable).
• PAL* — a Programmable Array Logic (PAL) is a relatively small FPD that has a programma-
ble AND-plane followed by a fixed OR-plane.
• SPLD — refers to any type of Simple PLD, usually either a PLA or PAL
• CPLD — a more Complex PLD that consists of an arrangement of multiple SPLD-like blocks
on a single chip. Alternative names (that will not be used in this paper) sometimes adopted for
this style of chip are Enhanced PLD (EPLD), Super PAL, Mega PAL, and others.
• FPGA — a Field-Programmable Gate Array is an FPD featuring a general structure that allows
very high logic capacity. Whereas CPLDs feature logic resources with a wide number of inputs
(AND planes), FPGAs offer more narrow logic resources. FPGAs also offer a higher ratio of
flip-flops to logic resources than do CPLDs.
• HCPLDs — high-capacity PLDs: a single acronym that refers to both CPLDs and FPGAs. This
term has been coined in trade literature for providing an easy way to refer to both types of
devices. We do not use this term in the paper.
• Logic Block — a relatively small circuit block that is replicated in an array in an FPD. When a
circuit is implemented in an FPD, it is first decomposed into smaller sub-circuits that can each
be mapped into a logic block. The term logic block is mostly used in the context of FPGAs, but
it could also refer to a block of circuitry in a CPLD.
• Logic Capacity — the amount of digital logic that can be mapped into a single FPD. This is
usually measured in units of “equivalent number of gates in a traditional gate array”. In other
words, the capacity of an FPD is measured by the size of gate array that it is comparable to. In
simpler terms, logic capacity can be thought of as “number of 2-input NAND gates”.
In the remainder of this section, to provide insight into FPD development the evolution of
FPDs over the past two decades is described. Additional background information is also included
on the semiconductor technologies used in the manufacture of FPDs.
The first type of user-programmable chip that could implement logic circuits was the Programma-
ble Read-Only Memory (PROM), in which address lines can be used as logic circuit inputs and
data lines as outputs. Logic functions, however, rarely require more than a few product terms, and
a PROM contains a full decoder for its address inputs. PROMS are thus an inefficient architecture
for realizing logic circuits, and so are rarely used in practice for that purpose. The first device
developed later specifically for implementing logic circuits was the Field-Programmable Logic
Array (FPLA), or simply PLA for short. A PLA consists of two levels of logic gates: a programmable
"wired" AND-plane followed by a programmable OR-plane.
[Figure 1 - Structure of an SPLD: inputs and flip-flop feedbacks drive the AND-plane, whose product terms feed OR-gates and D flip-flops that drive the outputs.]
A PLA is structured so that any of its inputs (or their complements) can be AND’ed together in the AND-plane; each
AND-plane output can thus correspond to any product term of the inputs. Similarly, each OR-
plane output can be configured to produce the logical sum of any of the AND-plane outputs. With
this structure, PLAs are well-suited for implementing logic functions in sum-of-products form.
They are also quite versatile, since both the AND terms and OR terms can have many inputs.
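A minimal Python sketch of the sum-of-products structure just described (mine, not from the text; the plane contents and output names are invented): the AND-plane holds programmable product terms, and each OR-plane output sums a programmable subset of them.

    # AND-plane: each product term lists required input values (true literal = 1,
    # complemented literal = 0; absent inputs are don't-cares).
    AND_PLANE = [
        {"A": 1, "B": 1},           # term 0: A.B
        {"A": 0, "C": 1},           # term 1: /A.C
    ]
    # OR-plane: each output ORs a programmable subset of the product terms.
    OR_PLANE = {"F0": [0, 1], "F1": [0]}

    def evaluate(inputs):
        products = [all(inputs[v] == val for v, val in term.items())
                    for term in AND_PLANE]
        return {out: int(any(products[i] for i in terms))
                for out, terms in OR_PLANE.items()}

    print(evaluate({"A": 0, "B": 1, "C": 1}))   # {'F0': 1, 'F1': 0}

In a PAL, by contrast, the OR_PLANE mapping would be fixed at manufacture and only the AND_PLANE would be programmable.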
When PLAs were introduced in the early 1970s, by Philips, their main drawbacks were that
they were expensive to manufacture and offered somewhat poor speed-performance. Both disad-
vantages were due to the two levels of configurable logic, because programmable logic planes
were difficult to manufacture and introduced significant propagation delays. To overcome these
weaknesses, Programmable Array Logic (PAL) devices were developed. As Figure 1 illustrates,
PALs feature only a single level of programmability, consisting of a programmable “wired” AND-
plane that feeds fixed OR-gates. To compensate for lack of generality incurred because the OR-
plane is fixed, several variants of PALs are produced, with different numbers of inputs and out-
puts, and various sizes of OR-gates. PALs usually contain flip-flops connected to the OR-gate out-
puts so that sequential circuits can be realized. PAL devices are important because when
introduced they had a profound effect on digital hardware design, and also they are the basis for
some of the newer, more sophisticated architectures that will be described shortly. Variants of the
basic PAL architecture are featured in several other products known by different acronyms. All
small PLDs, including PLAs, PALs, and PAL-like devices are grouped into a single category
called Simple PLDs (SPLDs), whose most important characteristics are low cost and very high
pin-to-pin speed-performance.
As technology has advanced, it has become possible to produce devices with higher capacity
than SPLDs. The difficulty with increasing the capacity of a strict SPLD architecture is that the struc-
ture of the programmable logic-planes grows too quickly in size as the number of inputs is
increased. The only feasible way to provide large capacity devices based on SPLD architectures is
then to integrate multiple SPLDs onto a single chip and provide interconnect to programmably
connect the SPLD blocks together. Many commercial FPD products exist on the market today
with this basic structure, and are collectively referred to as Complex PLDs (CPLDs).
CPLDs were pioneered by Altera, first in their family of chips called Classic EPLDs, and then
in three additional series, called MAX 5000, MAX 7000 and MAX 9000. Because of a rapidly
growing market for large FPDs, other manufacturers developed devices in the CPLD category and
there are now many choices available. All of the most important commercial products will be
described in Section 2. CPLDs provide logic capacity up to the equivalent of about 50 typical
SPLD devices, but it is somewhat difficult to extend these architectures to higher densities. To
build FPDs with very high logic capacity, a different approach is needed.
The highest capacity general purpose logic chips available today are the traditional gate arrays,
also known as Mask-Programmable Gate Arrays (MPGAs), which comprise an array of pre-fabricated
transistors that can be customized into the user’s logic circuit by connecting the
transistors with custom wires. Customization is performed during chip fabrication by specifying
the metal interconnect, and this means that in order for a user to employ an MPGA a large setup
cost is involved and manufacturing time is long. Although MPGAs are clearly not FPDs, they are
mentioned here because they motivated the design of the user-programmable equivalent: Field-
Programmable Gate Arrays (FPGAs). Like MPGAs, FPGAs comprise an array of uncommitted
circuit elements, called logic blocks, and interconnect resources, but FPGA configuration is per-
formed through programming by the end user. An illustration of a typical FPGA architecture
appears in Figure 2. As the only type of FPD that supports very high logic capacity, FPGAs have
been responsible for a major shift in the way digital circuits are designed.
[Figure 2 - Structure of an FPGA: a two-dimensional array of logic blocks and I/O blocks connected by general interconnection resources.]
Figure 3 shows the logic capacity available in each of the three categories. In the figure, “equivalent gates” refers loosely to “number of 2-input NAND
gates”. The chart serves as a guide for selecting a specific device for a given application, depend-
ing on the logic capacity needed. However, as we will discuss shortly, each type of FPD is inher-
ently better suited for some applications than for others. It should also be mentioned that there
exist other special-purpose devices optimized for specific applications (e.g. state machines, ana-
log gate arrays, large interconnection problems). However, since use of such devices is limited
they will not be described here. The next sub-section discusses the methods used to implement the
programmable switches in FPDs.
The first type of user-programmable switch developed was the fuse used in PLAs. Although fuses
are still used in some smaller devices, we will not discuss them here because they are quickly
being replaced by newer technology. For higher density devices, where CMOS dominates the IC
industry, different approaches to implementing programmable switches have been developed. For
CPLDs the main switch technologies (in commercial products) are floating gate transistors like
those used in EPROM and EEPROM cells; for FPGAs they are SRAM and antifuse, as discussed below.
[Figure 3 - Logic capacities of the three FPD categories, measured in equivalent gates (chart spans roughly 200 to 40000 gates).]
An EEPROM or EPROM transistor is used as a programmable switch for CPLDs (and also
for many SPLDs) by placing the transistor between two wires in a way that facilitates implemen-
tation of wired-AND functions. This is illustrated in Figure 4, which shows EPROM transistors as
they might be connected in an AND-plane of a CPLD. An input to the AND-plane can drive a
product wire to logic level ‘0’ through an EPROM transistor, if that input is part of the corre-
sponding product term. For inputs that are not involved in a product term, the appropriate
EPROM transistors are programmed to be permanently turned off. A diagram for an EEPROM-
based device would look similar.
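A hedged functional sketch of the wired-AND behaviour just described (mine, not from the text; names and the literal encoding are illustrative): any programmed-on transistor whose literal is not satisfied pulls the product wire low, so the wire behaves as the AND of the literals left connected to it.

    def product_wire(literals, inputs):
        """literals: list of (name, required_value); inputs: dict name -> 0/1."""
        for name, required in literals:
            if inputs[name] != required:   # a connected transistor pulls the wire low
                return 0
        return 1                           # the pull-up keeps the wire high

    print(product_wire([("A", 1), ("B", 0)], {"A": 1, "B": 0}))  # 1: A./B satisfied
    print(product_wire([("A", 1), ("B", 0)], {"A": 1, "B": 1}))  # 0: wire pulled low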
Although there is no technical reason why EPROM or EEPROM could not be applied to
FPGAs, current commercial FPGA products are based either on SRAM or antifuse technologies,
as discussed below.
[Figure 4 - EPROM transistors connected to a product wire (with a +5 V pull-up) in the AND-plane of a CPLD.]
[Figure 5 - SRAM-controlled programmable switches in an FPGA.]
In SRAM-based FPGAs, the SRAM cells control the gates of pass-transistors and the select lines of
multiplexers that drive logic block inputs. The figure gives an example of
the connection of one logic block (represented by the AND-gate in the upper left corner) to
another through two pass-transistor switches, and then a multiplexer, all controlled by SRAM
cells. Whether an FPGA uses pass-transistors or multiplexers or both depends on the particular
product.
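As a small illustration of the multiplexer case (my sketch, not from the text; the function name and signal values are invented), the SRAM cells' contents form the select index that chooses which wire drives a logic block input:

    def routing_mux(wires, sram_bits):
        """SRAM cells drive the select lines; the chosen wire drives the block input."""
        index = 0
        for bit in sram_bits:              # select lines form a binary index
            index = (index << 1) | bit
        return wires[index]

    signals = [0, 1, 1, 0]                 # four candidate routing wires
    print(routing_mux(signals, (1, 0)))    # SRAM contents '10' select signals[2] -> 1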
The other type of programmable switch used in FPGAs is the antifuse. Antifuses are origi-
nally open-circuits and take on low resistance only when programmed. Antifuses are suitable for
FPGAs because they can be built using modified CMOS technology. As an example, Actel’s anti-
fuse structure, known as PLICE [Ham88], is depicted in Figure 6. The figure shows that an anti-
fuse is positioned between two interconnect wires and physically consists of three sandwiched
layers: the top and bottom layers are conductors, and the middle layer is an insulator. When
unprogrammed, the insulator isolates the top and bottom layers, but when programmed the insula-
tor changes to become a low-resistance link. PLICE uses Poly-Si and n+ diffusion as conductors
and ONO (see [Ham88]) as an insulator, but other antifuses rely on metal for conductors, with
amorphous silicon as the middle layer [Birk92][Marp94].
[Figure 6 - Actel's PLICE antifuse: a dielectric layer sandwiched between a Poly-Si wire and an n+ diffusion wire on the silicon substrate, with oxide isolation.]
Table 1 lists the most important characteristics of the programming technologies discussed in
this section. The left-most column of the table indicates whether the programmable switches are
one-time programmable (OTP), or can be re-programmed (RP). The next column lists whether the
switches are volatile, and the last column names the underlying transistor technology.
Name       Re-programmable   Volatile   Technology
Fuse       no                no         Bipolar
Antifuse   no                no         CMOS+
All FPDs are supported by Computer Aided Design (CAD) programs. Such software tools are discussed briefly in this section to provide a feel for the design process.
CAD tools are important not only for complex devices like CPLDs and FPGAs, but also for
SPLDs. A typical CAD system for SPLDs would include software for the following tasks: initial
design entry, logic optimization, device fitting, simulation, and configuration. This design flow is
illustrated in Figure 7, which also indicates how some stages feed back to others. Design entry
may be done either by creating a schematic diagram with a graphical CAD tool, by using a text-
based system to describe a design in a simple hardware description language, or with a mixture of
design entry methods. Since initial logic entry is not usually in an optimized form, algorithms are
employed to optimize the circuits, after which additional algorithms analyse the resulting logic
equations and “fit” them into the SPLD. Simulation is used to verify correct operation, and the
user would return to the design entry step to fix errors. When a design simulates correctly it can be
loaded into a programming unit and used to configure an SPLD. One final detail to note about Fig-
ure 7 is that while the original design entry step is performed manually by the designer, all other
steps are carried out automatically by the CAD tools.
[Figure 7 - CAD design flow for SPLDs: design entry (manual), then logic optimization, device fitting, simulation and configuration (automatic), with feedback to fix errors.]
CAD tools for CPLDs follow a similar flow, but the
tools themselves are more sophisticated. Because the devices are complex and can accommodate
large designs, it is more common to use a mixture of design entry methods for different modules
of a complete circuit. For instance, some modules might be designed with a small hardware
description language like ABEL, others drawn using a symbolic schematic capture tool, and still
others described via a full-featured hardware description language such as VHDL. Also, for
CPLDs the process of “fitting” a design may require steps similar to those described below for
FPGAs, depending on how sophisticated the CPLD is. The necessary software for these tasks is normally supplied by the device manufacturer.
The design process for FPGAs is similar to that for CPLDs, but additional tools are needed to
support the increased complexity of the chips. The major difference is in the “device fitter” step
that comes after logic optimization and before simulation, where FPGAs require at least three
steps: a technology mapper to map from basic logic gates into the FPGA’s logic blocks, placement
to choose which specific logic blocks to use in the FPGA, and a router to allocate the wire seg-
ments in the FPGA to interconnect the logic blocks. With this added complexity, the CAD tools
might require a fairly long period of time (often more than an hour or even several hours) to complete.
In the remainder of this section each category of device is discussed
briefly, and then details are given for all of the most important CPLDs and FPGAs. The reader
who is interested in more details on the commercial products is encouraged to contact the manufacturers.*
* Most FPD manufacturers now provide their data sheets on the world wide web, and can be located
at URL “http://www.companyname.com”.
2.1 Commercially Available SPLDs
As the staple for digital hardware designers for the past two decades, SPLDs are very important
devices. SPLDs represent the highest speed-performance FPDs available, and are inexpensive.
However, they are also fairly straight-forward and well understood, so this paper will discuss them only briefly.
Two of the most popular SPLDs are the PALs produced by Advanced Micro Devices (AMD)
known as the 16R8 and 22V10. Both of these devices are industry standards and are widely sec-
ond-sourced by various companies. The name “16R8” means that the PAL has a maximum of 16
inputs (there are 8 dedicated inputs and 8 input/outputs), and a maximum of 8 outputs. The “R”
refers to the type of outputs provided by the PAL and means that each output is “registered” by a
D flip-flop. Similarly, the “22V10” has a maximum of 22 inputs and 10 outputs. Here, the “V”
means each output is “versatile” and can be configured in various ways, some registered and some not.
Another widely used and second sourced SPLD is the Altera Classic EP610. This device is
similar in complexity to PALs, but it offers more flexibility in the way that outputs are produced
and has larger AND- and OR-planes. In the EP610, outputs can be registered and the flip-flops can be configured in several ways.
In addition to the SPLDs mentioned above many other products are available from a wide
array of companies. All SPLDs share common characteristics, like some sort of logic planes
(AND, OR, NOR, or NAND), but each specific product offers unique features that may be partic-
ularly attractive for some applications. A partial list of companies that offer SPLDs includes:
AMD, Altera, ICT, Lattice, Cypress, and Philips-Signetics. Since some of these SPLDs have com-
plexity approaching that found in CPLDs, the paper will now move on to more sophisticated
devices.
[Figure 8 - Architecture of Altera MAX 7000 series: an array of Logic Array Blocks (LABs) and I/O blocks interconnected by the Programmable Interconnect Array (PIA).]
As stated earlier, CPLDs consist of multiple SPLD-like blocks on a single chip. However, CPLD
products are much more sophisticated than SPLDs, even at the level of their basic SPLD-like
blocks. In this section, CPLDs are discussed in detail, first by surveying the available commercial
products and then by discussing the types of applications for which CPLDs are best suited. Suffi-
cient details are presented to allow a comparison between the various competing products, with
more attention being paid to devices that we believe are in more widespread use than others.
Altera has developed three families of chips that fit within the CPLD category: MAX 5000,
MAX 7000, and MAX 9000. Here, the discussion will focus on the MAX 7000 series, because it
is widely used and offers state-of-the-art logic capacity and speed-performance. MAX 5000 rep-
resents an older technology that offers a cost effective solution, and MAX 9000 is similar to MAX
7000, except that MAX 9000 offers higher logic capacity (the industry’s highest for CPLDs).
The general architecture of the Altera MAX 7000 series is depicted in Figure 8. It comprises
an array of blocks called Logic Array Blocks (LABs), and interconnect wires called a Program-
mable Interconnect Array (PIA). The PIA is capable of connecting any LAB input or output to
any other LAB. Also, the inputs and outputs of the chip connect directly to the PIA and to LABs.
A LAB can be thought of as a complex SPLD-like structure, and so the entire chip can be consid-
ered to be an array of SPLDs. MAX 7000 devices are available based on both EPROM and
EEPROM technology. Until recently, even with EEPROM, MAX 7000 chips could be programmed only out-of-circuit, in a special programming unit.
The structure of a LAB is shown in Figure 9. Each LAB consists of two sets of eight macro-
cells (shown in Figure 10), where a macrocell comprises a set of programmable product terms
(part of an AND-plane) that feeds an OR-gate and a flip-flop. The flip-flops can be configured as
D type, JK, T, SR, or can be transparent.
[Figure 9 - A MAX 7000 Logic Array Block (LAB): an array of 16 macrocells with product-term sharing, an I/O control block, and connections to the PIA, the I/O cells and other LABs.]
As illustrated in Figure 10, the number of inputs to the
OR-gate in a macrocell is variable; the OR-gate can be fed by any or all of the five product
terms within the macrocell, and in addition can have up to 15 extra product terms from macrocells
in the same LAB. This product term flexibility makes the MAX 7000 series LAB more efficient in
terms of chip area because typical logic functions do not need more than five product terms, and
the architecture supports wider functions when they are needed. It is interesting to note that vari-
able sized OR-gates of this sort are not available in basic SPLDs (see Figure 1). Similar features
of this kind are found in other CPLD architectures discussed shortly.
Besides Altera, several other companies produce devices that can be categorized as CPLDs.
For example, AMD manufactures the Mach family, Lattice has the (i)pLSI series, Xilinx pro-
duces a CPLD series that they call XC7000 (unrelated to the Altera MAX 7000 series) and has
announced a new family called XC9500, and ICT has the PEEL array. These devices are discussed
in the sub-sections that follow.
[Figure 10 - A MAX 7000 macrocell: product terms feed an OR-gate and a configurable flip-flop with set, clear and array clock controls; the output drives the PIA and I/O (global clear not shown).]
AMD offers a CPLD family with five sub-families called Mach 1 to Mach 5. Each Mach device
comprises multiple PAL-like blocks: Mach 1 and 2 consist of optimized 22V16 PALs, and Mach 3
and 4 comprise several optimized 34V16 PALs, and Mach 5 is similar but offers enhanced speed-
performance. All Mach chips are based on EEPROM technology, and together the five sub-fami-
lies provide a wide range of selection, from small, inexpensive chips to larger state-of-the-art
ones. This discussion will focus on Mach 4, because it represents the most advanced Mach devices currently available.
Figure 11 depicts a Mach 4 chip, showing the multiple 34V16 PAL-like blocks, and the inter-
connect, called Central Switch Matrix, for connecting the blocks together. Chips range in size
from 6 to 16 PAL blocks, which corresponds roughly to 2000 to 5000 equivalent gates and are in-
circuit programmable. All connections in Mach 4 between one PAL block and another (even from
a PAL block to itself) are routed through the Central Switch Matrix. The device can thus be
viewed not only as a collection of PALs, but also as a single large device. Since all connections
travel through the same path, timing delays of circuits implemented in Mach 4 are predictable.
A Mach 4 PAL-like block is depicted in Figure 12. It has 16 outputs and a total of 34 inputs
(16 of which are the outputs fed-back), so it corresponds to a 34V16 PAL. However, there are two
key differences between this block and a normal PAL: 1. there is a product term allocator
between the AND-plane and the macrocells (the macrocells comprise an OR-gate, an EX-OR gate
and a flip-flop), and 2. there is an output switch matrix between the OR-gates and the I/O pins.
These two features make a Mach 4 chip easier to use, because they “decouple” sections of the
PAL block. More specifically, the product term allocator distributes and shares product terms
from the AND-plane to whichever OR-gates require them. This is much more flexible than the
fixed-size OR-gates in regular PALs. The output switch matrix makes it possible for any macro-
cell output (OR-gate or flip-flop) to drive any of the I/O pins connected to the PAL block. Again,
flexibility is enhanced over a PAL, where each macrocell can drive only one specific I/O pin.
Mach 4’s combination of in-system programmability and high flexibility promote easy hardware
design changes.
[Figure 12 - A Mach 4 PAL-like block: the AND-plane feeds a product-term allocator, OR/EXOR macrocells (output or buried) and an output switch matrix to the I/O cells; an input switch matrix and clock generator complete the block.]
Lattice offers a complete range of CPLDs, with two main product lines: the Lattice pLSI consists
of three families of EEPROM CPLDs, and the ispLSI devices are the same as the pLSI devices, except
that they are in-system programmable. For both the pLSI and ispLSI products, Lattice offers three
families: the 1000, 2000 and 3000 series.
Lattice’s earliest generation of CPLDs is the pLSI and ispLSI 1000 series. Each chip consists
of a collection of SPLD-like blocks, described in more detail later, and a global routing pool to
connect blocks together. Logic capacity ranges from about 1200 to 4000 gates. Pin-to-pin delays
are 10 nsec. Lattice also offers a CPLD family called the 2000 series, which are relatively small
CPLDs, with between 600 and 2000 gates that offer a higher ratio of macrocells to I/O pins and
higher speed-performance than the 1000 series. At 5.5 nsec pin-to-pin delays, the 2000 series offers the highest speed-performance of the Lattice CPLD families.
Lattice’s 3000 series represents their largest CPLDs, with up to 5000 gates. Pin-to-pin delays
for this device are about 10-15 nsec. In terms of other chips discussed so far, the 3000 series func-
tionality is most similar to AMD’s Mach 4. The 3000 series offers some enhancements over the
other Lattice parts to support more recent design styles, such as JTAG boundary scan.
The general structure of a Lattice pLSI or ispLSI device is indicated in Figure 13. Around the
outside edges of the chip are the bi-directional I/Os, which are connected both to the Generic
Logic Blocks (GLBs) and the Global Routing Pool (GRP). As the fly-out on the right side of the
figure shows, the GLBs are small PAL-like blocks that consist of an AND-plane, product term
allocator, and macrocells. The GRP is a set of wires that span the entire chip and are available to
connect the GLB inputs and outputs together. All interconnections pass through the GRP, so tim-
ing between levels of logic in the Lattice chips is fully predictable, much as it is for the AMD
Mach devices.
2.2.4 Cypress FLASH370 CPLDs
Cypress has recently developed a family of CPLD products that are similar to both the AMD
and Lattice devices in several ways. The Cypress CPLDs, called FLASH370 are based on FLASH
EEPROM technology, and offer speed-performance of 8.5 to 15 nsec pin-to-pin delays. The
FLASH370 parts are not in-system programmable. Recognizing that larger chips need more I/Os,
FLASH370 provides more I/Os than competing products, featuring a linear relationship between
the number of macrocells and number of bi-directional I/O pins. The smallest parts have 32 mac-
rocells and 32 I/Os and the largest 256 macrocells and 256 I/Os.
Figure 14 shows that FLASH370 has a typical CPLD architecture with multiple PAL-like
blocks and a programmable interconnect matrix (PIM) to connect them. Within each PAL-like
block, there is an AND-plane that feeds a product term allocator that directs from 0 to 16 product
terms to each of 32 OR-gates.
[Figure 14 - FLASH370 architecture: PAL-like blocks connected by a programmable interconnect matrix (PIM), with an input bus and I/O pads.]
Note that in the feed-back path from the macrocell outputs to the
PIM, there are 32 wires; this means that a macrocell can be buried (not drive an I/O pin) and yet
the I/O pin that could be driven by the macrocell can still be used as an input. This illustrates
another type of flexibility available in PAL-like blocks in CPLDs, but not present in normal PALs.
Although Xilinx is mostly a manufacturer of FPGAs, they also offer a selection of CPLDs, called
XC7000, and have announced a new CPLD family called XC9500. There are two main families
in the XC7000 offering: the 7200 series, originally marketed by Plus Logic as the Hiper EPLDs,
and the 7300 series, developed by Xilinx. The 7200 series are moderately small devices, with
about 600 to 1500 gates capacity, and they offer speed-performance of about 25 nsec pin-to-pin
delays. Each chip consists of a collection of SPLD-like blocks that each have 9 macrocells. The
macrocells in the 7200 series are different from those in other CPLDs in that each macrocell
includes two OR-gates and each of these OR-gates is input to a two-bit Arithmetic Logic Unit
(ALU). The ALU can produce any functions of its two inputs, and its output feeds a configurable
flip-flop. The Xilinx 7300 series is an enhanced version of the 7200, offering more capacity (up to
3000 gates when the entire family becomes available) and higher speed-performance. Finally, the
new XC9500, when available, will offer in-circuit programmability with 5 nsec pin-to-pin delays.
The Intel FLASHlogic family offers moderate logic capacity
and provides on-chip SRAM blocks, a unique feature among CPLD products. The upper part
of Figure 15 illustrates the architecture of FLASHlogic devices; it comprises a collection of PAL-
like blocks, called Configurable Function Blocks (CFBs), that each represents an optimized
24V10 PAL.
[Figure 15 - Intel FLASHlogic: Configurable Function Blocks (24V10 PALs) connected by a global interconnect matrix; a CFB can act as a PAL or be configured as SRAM.]
In basic structure FLASHlogic devices resemble the other CPLDs described above; how-
ever, they have one unique feature that sets them apart from all other CPLDs: each PAL-like
block, instead of being used for AND-OR logic, can be configured as a block of 10 nsec Static
RAM. This concept is illustrated in the lower part of Figure 15, which shows one CFB being used
as a PAL and another configured as an SRAM. In the SRAM configuration, the PAL block
becomes a 128 word by 10 bit read/write memory. Inputs that would normally feed the AND-
plane in the PAL in this case become address lines, data in, and control signals for the memory.
Notice that the flip-flops and tri-state buffers are still available when the PAL block is configured
as memory.
In the FLASHlogic device, the AND-OR logic plane’s configuration bits are SRAM cells that
are “shadowed” by EPROM or EEPROM cells. The SRAM cells are loaded with a copy of the
non-volatile EPROM or EEPROM memory when power is applied, but it is the SRAM cells that
control the configuration of the chip. It is possible to re-configure the chips in-system by down-
loading new information into the SRAM cells. The SRAM cells’ contents can be written back to
the non-volatile memory so that an updated configuration can be stored permanently.
The ICT PEEL Arrays are basically large PLAs that include logic macrocells with flip-flops
and feedback to the logic planes. This structure is illustrated by Figure 16, which shows a pro-
grammable AND-plane that feeds a programmable OR-plane. The outputs of the OR-plane are
divided into groups of four, and each group can be input to any of the logic cells. The logic cells
provide registers for the sum terms and can feed-back the sum terms to the AND-plane.
Because they have a PLA-like structure, logic capacity of PEEL Arrays is somewhat difficult
to measure compared to the CPLDs discussed so far; an estimate is 1600 to 2800 equivalent gates.
Figure 16 - Architecture of ICT PEEL Arrays.
PEEL Arrays offer relatively few I/O pins, with the largest part being offered in a 40 pin package.
Since they do not comprise SPLD-like blocks, PEEL Arrays do not fit well into the CPLD cate-
gory; however they are included here because they represent an example of PLA-based, rather than
PAL-based devices, and they offer larger capacity than a typical SPLD.
The logic cell in the PEEL Arrays, depicted in Figure 17, includes a flip-flop, configurable as
D, T, or JK, and two multiplexers. The multiplexers each produce an output of the logic cell and
can provide either a registered or combinational output. One of the logic cell outputs can connect
back to the AND-plane. In addition, the flip-flop clock, as well as preset and clear, are full sum-of-products logic functions. This differs
from all other CPLDs, which simply provide product terms for these signals and is attractive for
some applications. Because of their PLA-like OR-plane, the ICT PEEL Arrays are especially well suited to functions that require wide sum-of-products logic.
We will now briefly examine the types of applications which best suit CPLD architectures.
Because they offer high speeds and a range of capacities, CPLDs are useful for a very wide assort-
ment of applications, from implementing random glue logic to prototyping small gate arrays. One
of the most common uses in industry at this time, and a strong reason for the large growth of the
CPLD market, is the conversion of designs that consist of multiple SPLDs into a smaller number
of CPLDs.
CPLDs can realize reasonably complex designs, such as graphics controllers, LAN controllers,
UARTs, cache control, and many others. As a general rule-of-thumb, circuits that can exploit wide
AND/OR gates, and do not need a very large number of flip-flops are good candidates for imple-
mentation in CPLDs. A significant advantage of CPLDs is that they provide simple design
changes through re-programming (all commercial CPLD products are re-programmable). With in-
system programmable CPLDs it is even possible to re-configure hardware (an example might be
to change a protocol for a communications circuit) without power-down.
Designs often partition naturally into the SPLD-like blocks in a CPLD. The result is more pre-
dictable speed-performance than would be the case if a design were split into many small pieces
and then those pieces were mapped into different areas of the chip. Predictability of circuit imple-
mentation is one of the strongest advantages of CPLD architectures.
2.3 Commercially Available FPGAs
As one of the largest growing segments of the semiconductor industry, the FPGA market-place is
volatile. As such, the pool of companies involved changes rapidly and it is somewhat difficult to
say which products will be the most significant when the industry reaches a stable state. For this
reason, and to provide a more focused discussion, we will not mention all of the FPGA manufac-
turers that currently exist, but will instead focus on those companies whose products are in wide-
spread use at this time. In describing each device we will list its capacity, nominally in 2-input
NAND gates as given by the vendor. Gate count is an especially contentious issue in the FPGA
industry, and so the numbers given in this paper for all manufacturers should not be taken too seri-
ously. Wags have taken to calling them “dog” gates, in reference to the traditional ratio between
human years and dog years.
There are two basic categories of FPGAs on the market today: 1. SRAM-based FPGAs and 2.
antifuse-based FPGAs. In the first category, Xilinx and Altera are the leading manufacturers in
terms of number of users, with the major competitor being AT&T. For antifuse-based products,
Actel, Quicklogic, Cypress, and Xilinx offer competing products.
2.3.1 Xilinx SRAM-based FPGAs
The basic structure of Xilinx FPGAs is array-based, meaning that each chip comprises a two-
dimensional array of logic blocks that can be interconnected via horizontal and vertical routing
channels. An illustration of this type of architecture was shown in Figure 2. Xilinx introduced the
first FPGA family, called the XC2000 series, in about 1985 and now offers three more genera-
tions: XC3000, XC4000, and XC5000. Although the XC3000 devices are still widely used, we
will focus on the more recent and more popular XC4000 family. We note that XC5000 is similar
to XC4000, but has been engineered to offer similar features at a more attractive price, with some
penalty in speed. We should also note that Xilinx has recently introduced an FPGA family based
on anti-fuses, called the XC8100. The XC8100 has many interesting features, but since it is not
yet in widespread use, we will not discuss it here. The Xilinx 4000 family is offered in a wide
range of capacities.
The XC4000 features a logic block (called a Configurable Logic Block (CLB) by Xilinx) that
is based on look-up tables (LUTs). A LUT is a small one-bit-wide memory array, where the
address lines for the memory are the inputs of the logic block and the one-bit output from the memory
is the LUT output. A LUT with K inputs then corresponds to a 2^K x 1 bit memory, and can
realize any logic function of its K inputs by programming the logic function’s truth table directly
into the memory. The XC4000 CLB contains three separate LUTs, in the configuration shown in
Figure 18. There are two 4-input LUTs that are fed by CLB inputs, and the third LUT can be used
in combination with the other two. This arrangement allows the CLB to implement a wide range
of logic functions: a single function of up to nine inputs, two separate functions of four inputs, or
other possibilities.
Each CLB also contains two flip-flops.
Figure 18 - The Xilinx XC4000 Configurable Logic Block (CLB).
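The LUT-as-truth-table idea is easy to see in a short sketch: a K-input LUT is simply a
2^K-entry bit array indexed by the input values. The Python below is a minimal sketch with
invented names, not vendor code.

    def make_lut(truth_table):
        """truth_table: 2^K output bits, one per input combination."""
        k = len(truth_table).bit_length() - 1
        assert len(truth_table) == 1 << k, "length must be a power of two"
        def lut(*inputs):
            address = 0
            for bit in inputs:           # the inputs form the memory address
                address = (address << 1) | (bit & 1)
            return truth_table[address]
        return lut

    # Program a 4-input LUT with the truth table of f = (a AND b) OR (c XOR d).
    table = [(a & b) | (c ^ d) for a in (0, 1) for b in (0, 1)
             for c in (0, 1) for d in (0, 1)]
    f = make_lut(table)
    print(f(1, 0, 1, 0))                 # 1, since c XOR d = 1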
To support the design of complete systems, the XC4000 chips have “system oriented” features. For instance, each CLB contains cir-
cuitry that allows it to efficiently perform arithmetic (i.e., a circuit that can implement a fast carry
operation for adder-like circuits) and also the LUTs in a CLB can be configured as read/write
RAM cells. A new version of this family, the 4000E, has the additional feature that the RAM can
be configured as a dual-port RAM with a single write port and two read ports, and the RAM
blocks can be configured as synchronous RAM. Also, each XC4000 chip includes very wide AND-planes
around the periphery of the logic block array to facilitate implementing circuit blocks such as
wide decoders.
Besides logic, the other key feature that characterizes an FPGA is its interconnect structure.
The XC4000 interconnect is arranged in horizontal and vertical channels. Each channel contains
some number of short wire segments that span a single CLB (the number of segments in each
channel depends on the specific part number), longer segments that span two CLBs, and very long
segments that span the entire length or width of the chip. Programmable switches are available
(see Figure 5) to connect the inputs and outputs of the CLBs to the wire segments, or to connect
one wire segment to another. A small section of a routing channel representative of an XC4000
device appears in Figure 19. The figure shows only the wire segments in a horizontal channel, and
does not show the vertical routing channels, the CLB inputs and outputs, or the routing switches.
An important point worth noting about the Xilinx interconnect is that signals must pass through
switches to reach one CLB from another, and the total number of switches traversed depends on
the particular set of wire segments used. Thus, speed-performance of an implemented circuit
depends in part on how the wire segments are allocated to individual signals by CAD tools.
Figure 19 - Wire segments in an XC4000 horizontal routing channel (length-1 wires, length-2
wires, and long wires; vertical channels not shown).
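As a toy illustration of this point, the sketch below charges a hypothetical delay per
programmable switch and per wire segment; all numbers are invented for illustration, not
Xilinx data.

    SWITCH_DELAY_NS = 0.35    # hypothetical delay per programmable switch
    WIRE_DELAY_NS = {1: 0.10, 2: 0.18, "long": 0.90}  # per segment, invented

    def route_delay(segments):
        """segments: the list of segment lengths used by one signal."""
        switches = len(segments) + 1     # a switch at each segment boundary
        wire = sum(WIRE_DELAY_NS[s] for s in segments)
        return switches * SWITCH_DELAY_NS + wire

    # Crossing eight CLBs on length-1 segments vs. one long line:
    print(route_delay([1] * 8))          # nine switches: about 4.0 ns
    print(route_delay(["long"]))         # two switches: about 1.6 ns

Under this toy model the route built from many short segments traverses far more switches,
which is exactly why segment allocation by the CAD tools affects speed-performance.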
2.3.2 Altera FLEX 8000 and FLEX 10000 FPGAs
Altera’s FLEX 8000 series consists of a three-level hierarchy much like that found in CPLDs.
However, the lowest level of the hierarchy consists of a set of lookup tables, rather than an SPLD-
like block, and so the FLEX 8000 is categorized here as an FPGA. It should be noted, however,
that FLEX 8000 is a combination of FPGA and CPLD technologies. FLEX 8000 is SRAM-based
and features a four-input LUT as its basic logic block. Logic capacity ranges upward from about
4000 equivalent gates.
The overall architecture of FLEX 8000 is illustrated in Figure 20. The basic logic block,
called a Logic Element (LE), contains a four-input LUT, a flip-flop, and special-purpose carry cir-
cuitry for arithmetic circuits (similar to Xilinx XC4000). The LE also includes cascade circuitry
that allows for efficient implementation of wide AND functions. Details of the LE are illustrated
in Figure 21.
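The cascade idea can be pictured as an AND gate between adjacent LEs, so that an N-input AND
needs only about N/4 LEs. A minimal Python sketch with invented names, assuming each LUT is
programmed as a 4-input AND:

    def le_and(cascade_in, inputs):
        """One LE: its LUT is programmed as an AND of up to four inputs,
        and the cascade gate ANDs that with the previous LE's result."""
        lut_out = all(inputs)
        return cascade_in and lut_out

    def wide_and(bits):
        cascade = True                   # cascade input of the first LE
        for i in range(0, len(bits), 4):
            cascade = le_and(cascade, bits[i:i + 4])
        return cascade

    print(wide_and([1] * 12))            # True: a 12-input AND in 3 LEs
    print(wide_and([1] * 11 + [0]))      # False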
In the FLEX 8000, LEs are grouped into sets of 8, called Logic Array Blocks (LABs, a term
borrowed from Altera’s CPLDs). As shown in Figure 22, each LAB contains local interconnect
and each local wire can connect any LE to any other LE within the same LAB.
Figure 20 - Architecture of Altera FLEX 8000.
Figure 21 - The Altera FLEX 8000 Logic Element (LE).
Figure 22 - The Altera FLEX 8000 Logic Array Block (LAB).
The local interconnect also connects to the FLEX 8000’s global interconnect, called FastTrack. FastTrack is similar to
Xilinx long lines in that each FastTrack wire extends the full width or height of the device. How-
ever, a major difference between FLEX 8000 and Xilinx chips is that FastTrack consists of only
long lines. This makes the FLEX 8000 easy for CAD tools to configure automatically. All Fast-
Track horizontal wires are identical, and so interconnect delays in the FLEX 8000 are more
predictable than in FPGAs that employ many shorter segments, because there are fewer pro-
grammable switches in the longer paths. Predictability is further aided by the fact that connec-
tions between horizontal and vertical lines pass through active buffers.
The FLEX 8000 architecture has been extended in the state-of-the-art FLEX 10000 family.
FLEX 10000 offers all of the features of FLEX 8000, with the addition of variable-sized blocks of
SRAM, called Embedded Array Blocks (EABs). This idea is illustrated in Figure 23, which shows
that each row in a FLEX 10000 chip has an EAB on one end. Each EAB is configurable to serve
as a block of SRAM with a selectable aspect ratio. An EAB can alternatively be configured to
implement a complex logic circuit, such as a multiplier, by employing it as a large multi-output
lookup table.
Figure 23 - Architecture of Altera FLEX 10000, with an Embedded Array Block (EAB) at one
end of each row.
Altera provides, as part of their CAD tools, sev-
eral macro-functions that implement useful logic circuits in EABs. Counting the EABs as logic
gates, FLEX 10000 offers the highest logic capacity of any FPGA, although it is hard to provide
an accurate number.
2.3.3 AT&T ORCA FPGAs
AT&T’s SRAM-based FPGAs have an overall structure similar to that of Xilinx FPGAs, and are
called Optimized Reconfigurable Cell Array (ORCA) devices. The ORCA logic block, called a
Programmable Function Unit (PFU), is based on LUTs, and each chip contains an array of PFUs.
The structure of a PFU is
shown in Figure 24. A PFU possesses a unique kind of configurability among LUT-based logic
blocks, in that it can be configured in the following ways: as four 4-input LUTs, as two 5-input
LUTs, and as one 6-input LUT.
Figure 24 - The AT&T ORCA Programmable Function Unit (PFU).
A key element of this architecture is that when used as four 4-input LUTs, several of the LUTs’
inputs must come from the same PFU input. While this reduces
the apparent functionality of the PFU, it also significantly reduces the cost of the wiring associ-
ated with the chip. The PFU also includes arithmetic circuitry, like Xilinx XC4000 and Altera
FLEX 8000, and like Xilinx XC4000 a PFU can be configured as a RAM block. A recently
announced version of the ORCA chip also allows dual-port and synchronous RAM.
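The PFU’s three LUT modes follow from how a single truth-table memory can be carved up: one
64-entry table is one 6-input LUT, two 32-entry halves act as two 5-input LUTs, and four
16-entry quarters act as four 4-input LUTs. A minimal sketch of this view (illustrative only,
not AT&T’s actual implementation):

    memory = [0] * 64            # one 64-entry truth-table memory

    def lut6(a5, a4, a3, a2, a1, a0):
        # the whole memory acts as a single 6-input LUT
        return memory[a5 << 5 | a4 << 4 | a3 << 3 | a2 << 2 | a1 << 1 | a0]

    def lut5(half, a4, a3, a2, a1, a0):
        # 'half' (0 or 1) selects which 32-entry block is this 5-LUT's table
        return memory[half << 5 | a4 << 4 | a3 << 3 | a2 << 2 | a1 << 1 | a0]

    def lut4(quarter, a3, a2, a1, a0):
        # 'quarter' (0..3) selects one of four 16-entry blocks
        return memory[quarter << 4 | a3 << 3 | a2 << 2 | a1 << 1 | a0]

    # Example: program quarter 0 as a 4-input AND, then evaluate it.
    memory[0b1111] = 1
    print(lut4(0, 1, 1, 1, 1))    # 1
    print(lut4(0, 1, 0, 1, 1))    # 0

Sharing PFU inputs among the four 4-input LUTs, as described above, corresponds to feeding
the same address bits to each quarter, which is what reduces the wiring cost.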
ORCA’s interconnect structure is also different from those in other SRAM-based FPGAs.
Each PFU connects to interconnect that is configured in four-bit buses. This provides for more
efficient support for “system-level” designs, since buses are common in such applications. The
ORCA family has been extended in the ORCA 2 series, which offers very high logic capacity, up to
40,000 logic gates. ORCA 2 features a two-level hierarchy of PFUs based on the original ORCA
architecture.
2.3.4 Actel FPGAs
In contrast to FPGAs described above, the devices manufactured by Actel are based on anti-
fuse technology. Actel offers three main families: Act 1, Act 2, and Act 3. Although all three gen-
erations have similar features, this paper will focus on the most recent devices, since they are apt
to be more widely used in the longer term. Unlike the FPGAs described above, Actel devices are
based on a structure similar to traditional gate arrays; the logic blocks are arranged in rows and
there are horizontal routing channels between adjacent rows. This architecture is illustrated in
Figure 25. The logic blocks in the Actel devices are relatively small in comparison to the LUT-
based ones described above, and are based on multiplexers. Figure 26 illustrates the logic block in
the Act 3 and shows that it comprises an AND and OR gate that are connected to a multiplexer-
based circuit block. The multiplexer circuit is arranged such that, in combination with the two
logic gates, a very wide range of functions can be realized in a single logic block. About half of
the logic blocks in an Act 3 device also contain a flip-flop.
Figure 25 - Structure of Actel FPGAs: rows of logic blocks separated by horizontal routing
channels, surrounded by I/O blocks.
Figure 26 - The multiplexer-based logic block of the Actel Act 3.
As stated above, Actel’s interconnect is organized in horizontal routing channels. The chan-
nels consist of wire segments of various lengths with antifuses to connect logic blocks to wire
segments or one wire to another. Also, although not shown in Figure 25, Actel chips have vertical
wires that overlay the logic blocks, for signal paths that span multiple rows. In terms of speed-per-
formance, it would seem probable that Actel chips are not fully predictable, because the number
of antifuses traversed by a signal depends on how the wire segments are allocated during circuit
implementation by CAD tools. However, Actel provides a rich selection of wire segments of dif-
ferent lengths in each channel and has developed algorithms that guarantee strict limits on the
number of antifuses traversed by any two-point connection in a circuit, which improves speed-per-
formance significantly.
2.3.5 Quicklogic pASIC FPGAs
The main competitor for Actel in antifuse-based FPGAs is Quicklogic, which has two fami-
lies of devices, called pASIC and pASIC-2. The pASIC-2 is an enhanced version that has only
recently been introduced, and will not be discussed here. The pASIC, as illustrated in Figure 27,
has similarities to several other FPGAs: the overall structure is array-based like Xilinx FPGAs, its
logic blocks use multiplexers similar to Actel FPGAs, and the interconnect consists of only long
lines, like in Altera FLEX 8000.
Figure 27 - Structure of the Quicklogic pASIC, with a ViaLink antifuse at every crossing of a
wire and a logic cell pin (cross-section: metal 2, amorphous silicon, metal 1, oxide).
We note that the pASIC architecture is now independently devel-
oped by Cypress as well, but this discussion will focus only on Quicklogic’s version of their parts.
Quicklogic’s antifuse structure, called ViaLink, is illustrated on the left-hand side of Figure
27. It consists of a top layer of metal, an insulating layer of amorphous silicon, and a bottom layer
of metal. When compared to Actel’s PLICE antifuse, ViaLink offers a very low on-resistance of
about 50 ohms (PLICE is about 300 ohms) and a low parasitic capacitance. Figure 27 shows that
ViaLink antifuses are present at every crossing of logic block pins and interconnect wires, provid-
ing generous connectivity. pASIC’s multiplexer-based logic block is depicted in Figure 28. It is more
complex than Actel’s Logic Module, with more inputs and wide (6-input) AND gates on the multi-
plexer select lines.
Figure 28 - The Quicklogic pASIC logic cell.
2.3.6 Applications of FPGAs
FPGAs have gained rapid acceptance and growth over the past decade because they can be
applied to a very wide range of applications. A list of typical applications includes: random logic,
integrating multiple SPLDs, device controllers, communication encoding and filtering, small to
medium-sized systems with SRAM blocks, and many more. FPGAs are also used for prototyping
designs that are later implemented in gate arrays, and for the emulation of entire large hardware
systems. The former might be possible using only a single large FPGA (which corresponds to a small Gate Array in
terms of capacity), and the latter would entail many FPGAs connected by some sort of intercon-
nect; for emulation of hardware, QuickTurn [Wolff90] (and others) has developed products that
comprise many FPGAs and the necessary software to partition and map circuits.
Another promising area for FPGA application, which is only beginning to be developed, is the
usage of FPGAs as custom computing machines. This involves using the programmable parts to
“execute” software, rather than compiling the software for execution on a regular CPU. The
reader is referred to the proceedings of the FPGA-Based Custom Computing Machines (FCCM)
workshop, held annually in recent years.
It was mentioned in Section 2.2.8 that when designs are mapped into CPLDs, pieces of the
design often map naturally to the SPLD-like blocks. However, designs mapped into an FPGA are
broken up into logic block-sized pieces and distributed through an area of the FPGA. Depending
on the FPGA’s interconnect structure, there may be various delays associated with the intercon-
nections between these logic blocks. Thus, FPGA performance often depends more upon how
CAD tools map circuits into the chip than is the case for CPLDs.
This paper has surveyed the field of FPDs, covering the underlying technology that
provides the programmability and many of the architectures in the current mar-
ketplace. This paper has not focussed on the equally important issue of CAD tools for FPDs.
We believe that over time programmable logic devices will become the dominant form of digital logic
design and implementation. Their ease of access, principally through the low cost of the devices,
makes them attractive to small firms and small parts of large companies. The fast manufacturing
turn-around they provide is an essential element of success in the market. As architecture and
CAD tools improve, the disadvantages of FPDs compared to Mask-Programmed Gate Arrays will
continue to diminish.
Up-to-date research topics can be found in several conferences (CICC, ICCAD, DAC) and in the
published proceedings of the FPGA Symposium Series: FPGA ‘95: The 3rd Int’l ACM Symposium
on Field-Programmable Gate Arrays, and FPGA ‘96. In addition, there have been international
workshops on Field-Programmable Logic in Oxford (1991), Vienna (1992), Oxford (1993), Prague
(1994), Oxford (1995), and Darmstadt (1996), some of the proceedings of which are published by
Abingdon Press, UK. Finally, there is the Canadian Workshop on Field-Programmable Devices,
held in 1994, 1995, and 1996.
5 References
[Birk92] J. Birkner et al, “A very-high-speed field-programmable gate array using metal-to-
metal antifuse programmable elements,” Microelectronics Journal, v. 23, pp. 561-568, 1992.
[Ham88] E. Hamdy et al, “Dielectric-based antifuse for logic and memory ICs,” IEEE
International Electron Devices Meeting Technical Digest, pp. 786 - 789, 1988.
[Marp94] David Marple and Larry Cooke, “Programming Antifuses in CrossPoint’s FPGA,”
Proc. CICC 94, May 1994, pp. 185-188.
[MAX94] Altera Corporation, “MAX+PlusII CAD Design System, version 5.0”, 1994.
[Oldf95] J. Oldfield, R. Dorf, Field Programmable Gate Arrays, John Wiley & Sons, New
York, 1995.
[ORCA94] AT&T Corp., “ORCA 2C Series FPGAs, Preliminary Data Sheets,” April 1994.
[Wolff90] H. Wolff, “How QuickTurn is filling the gap,” Electronics, April 1990.
Acknowledgments
We wish to acknowledge the students, colleagues, and acquaintances in industry who have helped
with this work.
1. Introduction
Figure 1. European market of personal communication systems (source: Elsevier Advanced
Technology).
In the eighties, two new processor types were introduced: RISC and VLIW.
Superscalar architectures are a more special case, where the processor can dynamically
schedule pipelined instructions into parallel instruction streams.
The concepts of CISC, RISC and VLIW described above are applicable
to computer architectures for many broad application domains, such as
scientific computing, control, or digital signal processing. The term DSP
(digital signal processing) architecture is often used to designate a processor
architecture suited to the latter application domain, including e.g. audio,
speech processing and telecom applications.
DSPs may use any of the above concepts (CISC, RISC or VLIW). Usu-
ally a processor is classified as a DSP when it has the following features:
- It contains a parallel multiplier unit, in addition to the standard ALU.
This makes it possible to execute multiply-accumulate instructions at a
rate of one per machine cycle (see the sketch after this list).
- It has efficient memory and register structures, ensuring a high com-
munication bandwidth between the different functional units in the
processor's data path, and between data path and memory.
- Several addressing modes are available for the data memory.
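The multiply-accumulate feature is easiest to see on the archetypal DSP kernel, a FIR-filter
inner loop; each iteration of the Python sketch below corresponds to one multiply-accumulate
instruction (illustration only; a real DSP would execute one tap per machine cycle).

    def fir(coeffs, samples):
        acc = 0                          # accumulator register (e.g. MR)
        for c, x in zip(coeffs, samples):
            acc += c * x                 # one multiply-accumulate per tap
        return acc

    print(fir([1, 2, 3], [4, 5, 6]))     # 1*4 + 2*5 + 3*6 = 32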
note:
Such a mixed-signal chip would implement all signal processing and control functions,
excluding power amplifier circuits.
note:
The data path structure of this ASIP is largely similar to the ADSP-21xx processor.
Both code types are illustrated in Figure 4, which shows the execution
of a small application consisting of multiply-accumulate sequences, imple-
mented on the multiplier-accumulator of Figure 3, assuming that an extra
pipeline stage has been added between the multiplier and accumulator unit.
Per sequence, three operations have to be executed: an operand fetch, a
multiplication, and an accumulation (we assume that the result is kept in
the accumulator register MR). The processor can start a new sequence
in every machine cycle. A macrocoded multiply-accumulate instruction is
shown, which executes in three cycles, thus controlling one complete se-
quence. Alternatively, a microcoded multiply-accumulate instruction would
execute in a single cycle, controlling the fetch, multiply and accumulate op-
erations of three consecutive sequences.
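The two code types can be related to a toy pipeline schedule. The Python sketch below prints
the cycle-by-cycle occupancy of a three-stage pipeline; the notation and numbers are invented
for illustration.

    STAGES = ["fetch", "mul", "acc"]

    def schedule(n):
        """Print the pipeline occupancy for n overlapped MAC sequences."""
        for cycle in range(n + len(STAGES) - 1):
            ops = [f"{stage}(s{cycle - i})"
                   for i, stage in enumerate(STAGES)
                   if 0 <= cycle - i < n]
            print(f"cycle {cycle}: " + ", ".join(ops))

    schedule(4)
    # A microcoded instruction encodes one printed line (operations of up
    # to three different sequences in the same cycle), so the programmer
    # or code generator maintains the pipeline explicitly. A macrocoded
    # instruction encodes one diagonal (the fetch, mul and acc of a single
    # sequence); overlapping the diagonals is left to the processor
    # controller.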
In the case of a microcoded processor, setting up and maintaining the
instruction pipeline is a responsibility of the programmer or the code generator.
note:
Hybrid code types do exist as well. For example, some processors contain instructions
that specify an operand fetch and an arithmetic operation that start in the same machine
cycle, but the arithmetic operation continues in the next cycle. Such a hybrid code type
is sometimes referred to as time-stationary macrocode.
Figure 4. Different code types, illustrated for a multiply-accumulate instruction (b), on
a pipelined data path (a).
The resulting pipeline schedule is fully visible in the machine code
programme. In contrast, in a macrocoded processor these actions are per-
formed by the processor controller. Macrocoded processors may exhibit
pipeline hazards. Depending on the processor, pipeline hazards may
have to be resolved in the machine code programme (statically) or by means
of interlocking in the processor controller (dynamically). Macrocoded pro-
cessors with interlocking are relatively easy to programme, although it may
be more difficult for a designer to predict their exact cycle-time behaviour.
- Most ASIPs for consumer and telecom products have a load-store (also
called register-register) architecture. In a load-store architecture, all
data path operators get their operands from, and produce results in
addressable registers (e.g. AX, AY, AR in Figure 3(a)). Communication
between memories and registers requires separate "load" and "store"
operations, which may be scheduled in parallel with arithmetic opera-
tions if permitted by the instruction set (see Section 4.2.3).
Control flow. Most ASIPs support standard control flow instructions, like
conditional branching based on bit values in the condition code register.
However, several measures are usually taken to guarantee good performance
in the presence of control flow.
- Branch penalties are usually small, i.e. 0 or 1 cycles. This term refers
to the delay incurred in executing a branch due to the instruction
pipeline.
With respect to the use of an ASIP, the following major design tasks
can be identified (Figure 5):
The discussion in this chapter will be restricted to the code generation part.
As an illustration, Figure 6 shows the Chess retargetable code generation
environment, which is currently under development at IMEC. The code
generation process is traditionally divided into a number of phases. For the
ASIPs envisaged in this chapter, it is useful to distinguish the following
code generation phases:
Although these different phases are often solved in different passes of the
code generation trajectory, it is well known that they are strongly depen-
dent, especially in the case of parallel architectures. This is referred to as
the phase coupling problem.
note:
A distinction is sometimes made between determining a valid ordering of partial in-
structions (called scheduling) and the actual merging of partial instructions into complete
instructions (called compaction).
- The method presupposes that both the legal instruction patterns and
the application programme take the form of trees. Again, these conditions
are not always satisfied in practice.
Figure 7. Code selection as a tree pattern matching problem: (a) Template pattern
base, derived for the ASIP of Figure 3; (b) CDFG of a symmetrical FIR filter, with a
possible cover using the available pattern base.
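As a minimal illustration of tree-based code selection, the Python sketch below covers an
expression tree bottom-up with the cheapest matching patterns; the pattern base (add, mul,
mac) and unit costs are invented stand-ins for a template base like that of Figure 7(a).

    class Node:
        def __init__(self, op, left=None, right=None):
            self.op, self.left, self.right = op, left, right

    def best_cost(node):
        """Minimal number of instructions needed to compute 'node'."""
        if node.op == "var":
            return 0                                 # already in a register
        plain = 1 + best_cost(node.left) + best_cost(node.right)
        if node.op == "+" and node.left.op == "*":
            # A single multiply-accumulate pattern covers the * and the +.
            mac = 1 + (best_cost(node.left.left) +
                       best_cost(node.left.right) +
                       best_cost(node.right))
            return min(plain, mac)
        return plain

    x, y, z = Node("var"), Node("var"), Node("var")
    print(best_cost(Node("+", Node("*", x, y), z)))  # 1: one mac covers x*y + z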
6.1.3. Scheduling
Scheduling (or compaction) is an essential task for architectures that have
parallelism in their instructions, or that exhibit pipeline hazards (e.g.
RISCs, VLIWs). In these cases the code selection phase normally delivers
only partial instructions, which can be further combined by a scheduling
tool. In the microprogramming community, the set of partial instructions
after code selection is usually called vertical microcode, while the final in-
structions after scheduling are referred to as horizontal microcode.
Scheduling algorithms are usually based on list scheduling. Early
scheduling algorithms operated only at the basic block level. More
global scheduling algorithms have been proposed as well, such as trace
scheduling and software pipelining.
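A bare-bones list scheduler over one basic block might look as follows: in each cycle, the
ready operations (all predecessors already scheduled) are packed into the available issue
slots in priority order. Everything here (issue width, priorities, operation names) is
invented for illustration.

    ISSUE_WIDTH = 2

    def list_schedule(ops, deps, priority):
        """ops: operation names; deps: dict op -> set of predecessor ops;
        priority: dict op -> number (higher is scheduled first)."""
        done, schedule = set(), []
        remaining = set(ops)
        while remaining:
            ready = sorted((o for o in remaining if deps.get(o, set()) <= done),
                           key=lambda o: (-priority[o], o))
            if not ready:
                raise ValueError("dependence cycle")
            slot = ready[:ISSUE_WIDTH]       # fill this cycle's instruction
            schedule.append(slot)
            done.update(slot)
            remaining.difference_update(slot)
        return schedule

    ops = ["load_a", "load_b", "mul", "acc"]
    deps = {"mul": {"load_a", "load_b"}, "acc": {"mul"}}
    prio = {"load_a": 3, "load_b": 3, "mul": 2, "acc": 1}
    print(list_schedule(ops, deps, prio))
    # [['load_a', 'load_b'], ['mul'], ['acc']]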
The code generation techniques described in the previous section were pri-
marily developed for general purpose processors. Complementary to this
work, new research activities have started in the past few years, into the
problem of code generation for ASIPs. These new activities are justified by
the following reasons:
Note:
A number of commercial compilers for fixed-point DSPs are based on standard soft-
ware compilation techniques, enhanced with processor-specific heuristics in the various
code generation phases. These heuristics result in improved code quality, but they
make the compiler non-retargetable.
- The retargetability problem has not been fundamentally solved in the
compiler community. As motivated in Section 5, retargetability is a basic
requirement in an ASIP environment.
note:
The "best" form of retargetability offered by a traditional compiler today is
probably the portability of the GCC compiler, which requires a signifjcant
intervention by the user.
Figure 8. ISG representation of the example ASIP.
The above methods rely on processor models in which all patterns cor-
responding to legal partial instructions are enumerated in advance. Van
Praet proposed a bundling technique, in which the required patterns are
constructed on the fly during code selection. Whether or not such a
pattern is legal can be derived from the ISG model. A comparable ap-
proach is used by Nowak, who uses the connection-operation graph to
check patterns.
(a) Storage in AR; (b) storage in AR followed by MX; (c) spilling to memory. The latter
two alternatives require extra transfers.
Hitherto, the design of industrial ASIPs has largely been a manual task.
Whereas the selection of functional units for a given application may be rea-
sonably straightforward, defining efficient memory and register structures
is non-trivial. Moreover, the design of the actual instruction set involves
a careful tradeoff between the parallelism that will be offered (which de-
termines the ASIP's performance and also its power dissipation) and the
width of the instruction word and programme memory. In reality, designers
sometimes start from an existing processor architecture that is modified for
new applications. In this case only part of the search space will be explored,
leading to sub-optimal solutions. A first aid to ASIP designers would be an
environment in which:
instruction set, using a graph-based processor model (see also Section 6.2).
Evaluations of the architecture are made by calling a retargetable code
generator that operates on the processor model, and produces compile-time
diagnostics to the designer.
Several authors have explored the semi-automatic definition of ASIP
architectures for a given application. Alomary's approach is to start from a
generic ASIP model. This is a processor with a maximal instruction set,
consisting of a fixed set of basic instructions extended with more sophis-
ticated instructions that are compositions of the basic ones or correspond
to special hardware blocks. A tool then selects a subset from the maximal
instruction set. The latter problem is solved using a branch-and-bound al-
gorithm, based on user-defined constraints and metrics (e.g. performance
optimisation under area and power constraints). Holmer presented a tech-
nique to define macrocoded ASIP architectures using a code compaction
technique. In this approach, the designer defines the ASIP's data path,
the instruction-invariant behaviour of the instruction pipeline, and an up-
per bound on the allowed instruction width. The tool then determines the
actual instruction set, optimising the processor performance.
Huang presented a method similar to Holmer's that uses different com-
piler techniques. Moreover, his method can also automatically deter-
mine certain parameters of the data path, such as the number of parallel
functional units. Goossens presented an architectural synthesis system for
microcoded ASIPs with orthogonal instruction formats. The user spec-
ifies the required functional units in the data path. The tool then optimises
the actual register structure, within the scope of a restricted data path
model.
ASIP). For every control thread, the machine code is compiled statically
using a code generator. During the execution, the kernel may e.g. decide to
interrupt an ongoing process of which the timing is non-critical, to insert
a high-priority time-critical process. Run-time kernels introduce a multi-
tasking functionality on the processor, similar to multi-tasking in a time-
sharing operating system. The kernel itself is simply another software pro-
cess running on the processor. Note however that most processors have
special hardware provisions to facilitate the implementation of a run-time
kernel (e.g. interrupt handlers, timers, etc.).
Several run-time kernels for DSP processors are commercially available.
These tools are stripped versions of time-sharing operating
systems, with more efficient mechanisms for context switching between
processes. In most cases they use a fixed priority preemptive scheduling
mechanism, in which the user has to set the process priorities. Limitations
of these tools are: their inability to guarantee that hard real-time con-
straints will be met, and the lack of automatic retargetability to different
processor cores (e.g. ASIPs).
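Fixed-priority preemption can be shown in a few lines: at every tick, the highest-priority
released process with work remaining runs, preempting anything lower. The processes and
numbers in the Python sketch below are invented.

    # (priority, release time, execution time); higher priority wins.
    procs = {"codec": (2, 0, 4), "keypad": (1, 1, 2), "alarm": (3, 3, 1)}

    def run(ticks):
        left = {name: exe for name, (_, _, exe) in procs.items()}
        trace = []
        for t in range(ticks):
            ready = [n for n, (_, rel, _) in procs.items()
                     if rel <= t and left[n] > 0]
            if ready:                    # preempt: highest priority runs
                n = max(ready, key=lambda name: procs[name][0])
                left[n] -= 1
                trace.append(n)
            else:
                trace.append("idle")
        return trace

    print(run(8))
    # ['codec', 'codec', 'codec', 'alarm', 'codec', 'keypad', 'keypad', 'idle']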
Consumer and telecom applications are real-time systems, that must
operate under externally specified timing constraints. The kernel should
schedule the different processes in such a way that these timing constraints
are met. In this context the term real-time kernel is sometimes used. In
addition, care must be taken to ensure a correct communication of data
between processes. Current research in the field of run-time kernels is fo-
cussing on these issues.
Recent publications in the CAD community have considered the prob-
lem of automatic synthesis of application-specific kernels.
By statically compiling a kernel that exploits the information about the
application at hand, more efficient communication and context switching
mechanisms can be incorporated. By means of a static timing analysis, the
timing constraints can already be checked at compile time.
8. Conclusions
note:
Some kernels also support the execution on multi-processor targets.
ACKNOWLEDGEMENTS
The authors wish to acknowledge the help of the following researchers at IMEC,
who directly contributed to the insights described in this chapter: Augusli Kifli,
Koen Schoofs, Hans Cappelle, and Stefan De Troch.
Systems-on-programmable chips: A look at the
packaging challenges
For example, programmable logic device (PLD) vendors offer their customers
the ability to develop and verify designs for their devices well before the actual
devices are shipped, typically 4-6 months before first samples are
available. This requires the packaging aspects of the entire product family to
be finalized before then. These aspects include items like pinout and electrical
and thermal characteristics which collectively facilitate early board layout,
design timing and verification, signal integrity analysis, and power budgeting.
Programmable logic vendors also offer customers the ability to migrate designs
between different device family members with the same package and pinout,
avoiding expensive board re-spins. This feature is called vertical migration and
has been accomplished largely through package/die layout optimization.
Facilitating this capability has required proactive development of associated
substrate technologies to support the routing densities required.
Altera has been one of the early users of high-density interconnect (HDI)
technology and has worked, and continues to work, extensively with HDI
providers to enhance capabilities and improve performance. One of the more recent
challenges in programmable logic packaging has been the integration of high-
speed transceivers.
Altera's experience with its first generation of transceiver devices, the Mercury
FPGA family, enabled it to establish procedures for the complex simulation of
transceiver-based FPGAs, laying the groundwork for its more recent work with
the Stratix GX family. At that time, Altera package engineers discovered that
they had to develop a common framework and process to address both the
mechanical and electric aspects of these increasingly complex packages.
Working with the silicon design engineering team and using a combination of
tools from multiple vendors, the package engineers were able to develop
accurate models of the electrical behavior of the packaging circuitry that, when
used with the IC design test bench, would indicate the overall behavior of the
packaged die on the board. These models included ball-to-transmission line,
transmission line, and transmission line-to-bump H-spice models, as well as S-
parameter models of the ball-to-bump behavior.
This process enabled Altera's engineers to accurately predict the signal integrity
behavior of the Stratix GX device several months before actual silicon was
available.
Higher priorities
The increased emphasis on signal integrity is coupled with growing customer
usability requirements: managing power dissipation and board-mounting
considerations such as accommodating various reflow conditions for different
package lead finishes. The implementation of lead-free packages adds a further
dimension to these challenges.
Programmable logic devices tend to have high pin counts and relatively large die
sizes. Component- and board-level reliability are achieved by starting with optimal
silicon and package design. Programmable logic vendors need to partner closely
with their assembly partners to optimize processes to meet customer
requirements for reliability and manufacturability. This includes participation in
the material selection for substrates/leadframes, underfill, and die attach, as well
as encapsulation. Both modeling and empirical techniques are used to validate
these choices on test vehicles prior to product introduction.