AMP Manual

The document describes experiments related to studying internal components of a CPU cabinet. It defines different types of processor packages and sockets used over time. It explains the components inside the CPU cabinet including the motherboard, processor slots/sockets, RAM slots, cache memory, and bus slots. It discusses simulating pipeline processing, superscalar and super pipeline architectures, and detecting different types of data dependency hazards in pipelines.
Copyright
© Attribution Non-Commercial (BY-NC)

Advanced Microprocessor

Experiment No. 01: Study of internal components of CPU cabinet

AIM: To study the internal components of the CPU cabinet.

EQUIPMENT: P IV 2 GHz, 512 MB RAM, 40 GB HDD, 15" DELL Color Monitor, optical Mouse, Dot Matrix Printer (EPSON FX-2175).

THEORY:

Package Type:

DIP (Dual In-line Package): 8086/88, Z80, 68000, 68010
QFP (Quad Flat Package): Intel NG80386, PowerPC 601
PGA (Pin Grid Array): Intel 386 DX, Cyrix Cx486DLC, AMD 486 DX, Intel 486 SX
LCC (Leadless Chip Carrier): AMD R80186, Intel R80286-6, Siemens SAB 80188-R

Department of computer Engineering, SIES Graduate School of Technology

PLCC (Plastic Leaded Chip Carrier): AMD N80C186, Harris CS80C286-16, Cyrix CX-83S87-16-JP
Slot Packages: Intel Pentium III (Slot 1), Intel Celeron (Slot 1)

Motherboard

It is the main unit inside the cabinet, on which all the components are mounted or to which they are connected. It is mainly described according to the processor slot/socket available on it. Motherboards come in several types, such as AT and ATX.

Processor slots/sockets:

Socket / Slot | Pin count / Type | Supported Processors
Socket 1 | 169 LIF/ZIF PGA | Intel i486; AMD Am5x86 133 (w/ voltage adaptor); Cyrix Cx5x86 100/120 (w/ voltage adaptor)
Socket 2 | 238 LIF/ZIF PGA | Intel i486; Intel Pentium OverDrive; AMD Am5x86 133 (w/ voltage adaptor); Cyrix 5x86 100/120 (w/ voltage adaptor)
Socket 3 | 237 LIF/ZIF PGA | Intel i486; Intel Pentium OverDrive; AMD Am5x86 133; Cyrix 5x86 100/120
Socket 4 | 273 LIF/ZIF PGA | Intel Pentium 60/66 (P5); Intel Pentium OverDrive 120/133
Socket 5 | 296/320 LIF/ZIF SPGA | Intel Pentium 75-133 (P45C); Intel Pentium MMX 166-233 (P55); AMD K5 PR75-133; AMD K6 166-300; Cyrix 6x86L PR120-166 (w/ voltage adaptor); Cyrix 6x86MX PR166-233 (w/ voltage adaptor); IDT Winchip
Socket 6 (uncommon) | 235 ZIF PGA | Intel i486 DX4 75-120
Socket 7 / Super 7 | 321 ZIF SPGA | Intel Pentium (P45C); Intel Pentium MMX (P55); AMD K5 PR75-200; AMD K6; Cyrix 6x86; IDT Winchip
Socket 8 | 387 LIF/ZIF PGA/SPGA dual pattern | Intel Pentium Pro 150-200; Intel Pentium II OverDrive
Slot 1 | 242 SECC / SECC2 / SEPP | Intel Celeron; Intel Pentium Pro; Intel Pentium II; Intel Pentium III
Slot 2 | 330 SECC | Intel Pentium II Xeon 400/450 (Drake); Intel Pentium III Xeon 500/550 (Tanner); Intel Pentium III Xeon 600-1GHz (Cascades)
Socket 370 | 370 ZIF SPGA | Intel Celeron; Intel Pentium III; Cyrix III 533-667 (Samuel)
Slot A | 242 SECC | AMD Athlon 500-700 (K7); AMD Athlon 550-1GHz (K75); AMD Athlon 700-1GHz (Thunderbird)
Socket A | 462 ZIF SPGA | AMD Duron 600-950; AMD Athlon; AMD Sempron
Socket 423 | 423 ZIF SPGA | Intel Pentium 4 1.3GHz; Intel Celeron 1.7GHz-1.8GHz
Socket 478 | 478 ZIF PGA | Intel Celeron; Intel Pentium 4
Socket T | 775 LGA | Intel Celeron D 325J (Prescott); Intel Pentium 4; Intel Pentium D; Intel Pentium Extreme
Socket 603/604 | 603/604 ZIF PGA | Intel Xeon
PAC418 | 418 VLIF | Intel Itanium 733-800MHz (Merced)
PAC611 | 611 VLIF | Intel Itanium 2
Socket 754 | 754 ZIF | AMD Athlon 64; AMD Sempron 2600+-3300
Socket 940 | 940 ZIF | AMD Athlon 64 FX-51 - FX-53 (Sledgehammer); AMD Opteron 140-150 (Sledgehammer)
Socket 939 | 939 ZIF | AMD Athlon 64


Bus Slots

The various bus slots on the motherboard are:
ISA (Industry Standard Architecture)
PCI (Peripheral Component Interconnect)
AGP (Accelerated Graphics Port)
AMR (Audio Modem Riser)

The motherboard also carries external connections for the onboard sound card, USB ports, serial and parallel ports, PS/2 ports for the keyboard and mouse, as well as network and FireWire connections.

RAM Slots

Several kinds of RAM module can be mounted on the motherboard:
SIMM (Single Inline Memory Module): supports EDO RAM
DIMM (Dual Inline Memory Module): supports SDRAM and DDR RAM
RIMM (Rambus Inline Memory Module): supports RDRAM

Cache Memory

Cache is an intermediate or buffer memory. The idea behind cache is that it should function as a near store of fast RAM, a store from which the CPU can always be supplied. In practice there are always at least two such near stores. They are called Level 1, Level 2 and (if applicable) Level 3 cache.

Level 1 cache is built into the actual processor core. It is a piece of RAM, typically 8, 16, 20, 32, 64 or 128 KB, which operates at the same clock frequency as the rest of the CPU. Thus you could say the L1 cache is part of the processor. L1 cache is normally divided into two sections, one for data and one for instructions. For example, an Athlon processor may have a 32 KB data cache and a 32 KB instruction cache. If the cache is common to both data and instructions, it is called a unified cache.

The Level 2 cache is normally much bigger (and unified), such as 256, 512 or 1024 KB. The purpose of the L2 cache is to constantly read in slightly larger quantities of data from RAM, so that these are available to the L1 cache. The L2 cache is now integrated within the processor, which makes it work much better in relation to the L1 cache and the processor core.

The Level 2 cache takes up a lot of the chip's die, as millions of transistors are needed to make a large cache. The integrated cache is made using SRAM (static RAM), as opposed to normal RAM, which is dynamic (DRAM).

Buses

Bus | Description
PC-XT, from 1981 | Synchronous 8-bit bus which followed the CPU clock frequency of 4.77 or 6 MHz. Band width: 4-6 MB/sec.
ISA (PC-AT), from 1984 | Simple, cheap I/O bus. Synchronous with the CPU. Band width: 8 MB/sec.
MCA, from 1987 | Advanced I/O bus from IBM (patented). Asynchronous, 32-bit, at 10 MHz. Band width: 40 MB/sec.
EISA, from 1988 | 32-bit I/O bus at 8.33 MHz. Band width: about 33 MB/sec.
VL bus (VESA Local Bus), early 1990s | Simple, high-speed I/O bus. 32-bit, synchronized with the CPU's clock frequency: 33, 40, 50 MHz. Band width: up to 160 MB/sec.
PCI, from 1993 | Advanced, general, high-speed I/O bus. 32-bit, asynchronous, at 33 MHz. Band width: 133 MB/sec.
USB and FireWire, from 1998 | Serial buses for external equipment.
PCI Express, from 2004 | A serial bus for I/O cards with very high speed. Replaces PCI and AGP. 500 MB/sec per channel.
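The band width figures quoted for the parallel buses follow from the bus width and clock frequency. The sketch below is only an illustration of that arithmetic, assuming one transfer per clock cycle; the class and method names (BusBandwidth, peakMBps) are hypothetical and not part of the manual.

```java
// Sketch: peak band width of a parallel bus, assuming one word is
// transferred on every clock cycle (an idealisation).
class BusBandwidth {

    /** Peak transfer rate in MB/sec = (bus width in bytes) x (clock in MHz). */
    static double peakMBps(int widthBits, double clockMHz) {
        return (widthBits / 8.0) * clockMHz;
    }

    public static void main(String[] args) {
        // MCA: 32-bit at 10 MHz
        System.out.printf("MCA: %.0f MB/sec%n", peakMBps(32, 10.0));
        // PCI: 32-bit at 33.33 MHz
        System.out.printf("PCI: %.0f MB/sec%n", peakMBps(32, 33.33));
    }
}
```

With these inputs the formula reproduces the 40 MB/sec quoted for MCA and the 133 MB/sec quoted for PCI. Buses that need more than one clock cycle per transfer achieve less than this peak.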

CONCLUSION:


Experiment No. 02: Simulation of pipeline processing

AIM: Write a program in Java to simulate pipeline processing.

EQUIPMENT: Internet, Books, PC.

THEORY: Pipelining is the technique of prefetching the next task while the current task is being executed. A task is divided into subtasks, and each stage of the pipeline executes one subtask. In an instruction pipeline, the next instruction is prefetched while the current instruction is being executed. A high-level language can be used to simulate this behaviour.

Algorithm:
1. Start
2. Display the vertical lines
3. Display the instruction stages in the pipeline
4. Move the instructions through the stages one by one
5. End
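The algorithm above can be sketched as a text-mode space-time diagram. This is a minimal illustration, not the manual's own program: the five stage names (IF, ID, EX, MEM, WB) are an assumption, since the manual does not name the stages, and the class PipelineSim and its methods are hypothetical.

```java
// Sketch of an ideal instruction pipeline drawn as text.
// Instruction i enters the pipeline in cycle i and advances one stage per cycle.
class PipelineSim {
    // Assumed stage names; the manual does not prescribe a particular set.
    static final String[] STAGES = {"IF", "ID", "EX", "MEM", "WB"};

    /** Cycles needed for n instructions when there are no stalls. */
    static int totalCycles(int n) {
        return STAGES.length + n - 1;
    }

    /** Stage occupied by instruction instr (0-based) in the given cycle, or null. */
    static String stageAt(int instr, int cycle) {
        int s = cycle - instr;
        return (s >= 0 && s < STAGES.length) ? STAGES[s] : null;
    }

    /** One row per instruction; vertical lines separate the clock cycles. */
    static String row(int instr, int n) {
        StringBuilder sb = new StringBuilder(String.format("I%-2d", instr + 1));
        for (int c = 0; c < totalCycles(n); c++) {
            String st = stageAt(instr, c);
            sb.append(String.format("|%-4s", st == null ? "" : st));
        }
        return sb.append('|').toString();
    }

    public static void main(String[] args) {
        int n = 4; // number of instructions to display
        for (int i = 0; i < n; i++) {
            System.out.println(row(i, n));
        }
    }
}
```

Each printed row is shifted one column to the right of the previous one, which shows the overlap between the execution of one instruction and the fetch of the next.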

CONCLUSION:


Experiment No. 03: Super Pipeline

AIM: Write a program to simulate Superscalar and Super Pipeline architectures.

EQUIPMENT: Internet, Books, PC.

THEORY: In a superscalar architecture, the pipeline implementation provides parallelism: more than one instruction is executed at a time. A two-issue superscalar pipeline issues two instructions at a time, and a three-issue superscalar pipeline issues three. This type of pipelining increases the throughput of the processor. Nowadays, 8-issue superscalar structures have been developed. A superscalar processor fetches multiple instructions at a time and attempts to find nearby instructions that are independent of one another and can be executed in parallel. The essence of the superscalar approach is the ability to execute instructions independently in different pipelines.

In a super pipeline, many pipeline stages need less than half a clock cycle. Doubling the internal clock speed allows two tasks to be performed per external clock cycle.

Algorithm:
1. Start
2. Display the vertical lines
3. Display the instruction stages in the pipeline
4. Move two instructions at a time (2-issue superscalar)
5. In the super pipeline, each instruction takes less than one cycle per stage (each stage completes in half a cycle)
6. End
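The difference between the two schemes can be illustrated by counting cycles for an ideal, hazard-free run. The sketch below is an assumption-laden model rather than the manual's program: n is the number of instructions, k the number of stages, m the issue width, and d the factor by which the internal clock is multiplied (d = 2 for the half-cycle stages described above). The class IssueTiming and its methods are hypothetical names.

```java
// Sketch: ideal cycle counts (no hazards, no stalls) for three pipeline styles.
class IssueTiming {

    /** Plain scalar pipeline: the first instruction takes k cycles, then one completes per cycle. */
    static int scalarCycles(int n, int k) {
        return k + n - 1;
    }

    /** m-issue superscalar: instructions enter in groups of m per cycle. */
    static int superscalarCycles(int n, int k, int m) {
        int groups = (n + m - 1) / m; // ceil(n / m) issue groups
        return k + groups - 1;
    }

    /**
     * Super pipeline: every stage takes 1/d of an external clock cycle,
     * so the scalar count is simply divided by d.
     */
    static double superpipelineCycles(int n, int k, int d) {
        return (k + n - 1) / (double) d;
    }

    public static void main(String[] args) {
        int n = 6, k = 4;
        System.out.println("scalar:        " + scalarCycles(n, k));
        System.out.println("2-issue:       " + superscalarCycles(n, k, 2));
        System.out.println("superpipeline: " + superpipelineCycles(n, k, 2));
    }
}
```

For 6 instructions and 4 stages this model gives 9 cycles for the scalar pipeline, 6 for the 2-issue superscalar pipeline and 4.5 external cycles for the super pipeline. Real processors fall short of these figures because of dependencies between instructions.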


CONCLUSION:


Experiment No. 04: Data dependency hazards

AIM: Write a program to detect data dependency hazards.

EQUIPMENT: Internet, Books, PC.

THEORY: Dependencies among instructions must be removed in order to implement instruction-level parallelism (ILP). Three types of data dependency have to be identified in, and eliminated from, the sequential flow of instructions:

1. True data dependency hazard (flow dependency / RAW hazard)
   Eg: R1 := R2 + R3
       R4 := R1 - R5
2. Antidependency (WAR hazard)
   Eg: R1 := R2 + R3
       R2 := 6
3. Output dependency hazard (WAW hazard)
   Eg: R1 := R2 + R3
       R1 := R5

Algorithm:
1. Start
2. Accept the number of instructions
3. Accept the source and destination for each instruction
4. To check flow dependency, compare the destination of each instruction with the sources of the other instructions sequentially
5. To check antidependency, compare the sources of each instruction with the destinations of the other instructions sequentially
6. To check output dependency, compare the destination of each instruction with the destinations of the others
7. Output: display the flow-dependent, anti-dependent and output-dependent instructions
8. Display the instruction stages in the pipeline and move two instructions at a time (2-issue superscalar)
9. Three data dependency hazards are to be simulated
10. End
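The comparison steps of the algorithm can be sketched as follows. This is a minimal illustration under stated assumptions: each instruction has one destination and at most two source registers, registers are plain strings, and the names HazardDetector, Instr and detect are hypothetical. Like the algorithm above, it compares every earlier instruction with every later one and does not check whether an intervening write cancels the dependency.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of pairwise data-dependency detection (RAW, WAR, WAW).
class HazardDetector {

    /** One instruction of the form dest := src1 op src2; a source may be null. */
    static final class Instr {
        final String dest, src1, src2;

        Instr(String dest, String src1, String src2) {
            this.dest = dest;
            this.src1 = src1;
            this.src2 = src2;
        }

        boolean reads(String reg) {
            return reg.equals(src1) || reg.equals(src2);
        }
    }

    /** Returns one line per dependency found between an earlier and a later instruction. */
    static List<String> detect(List<Instr> prog) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < prog.size(); i++) {
            for (int j = i + 1; j < prog.size(); j++) {
                Instr a = prog.get(i), b = prog.get(j);
                String pair = " I" + (i + 1) + "->I" + (j + 1);
                if (b.reads(a.dest)) found.add("RAW" + pair);       // flow dependency
                if (a.reads(b.dest)) found.add("WAR" + pair);       // antidependency
                if (a.dest.equals(b.dest)) found.add("WAW" + pair); // output dependency
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Instr> prog = List.of(
            new Instr("R1", "R2", "R3"),   // R1 := R2 + R3
            new Instr("R4", "R1", "R5"));  // R4 := R1 - R5
        System.out.println(detect(prog));
    }
}
```

Run on the three example pairs from the theory, the sketch reports a RAW, a WAR and a WAW dependency respectively.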

CONCLUSION:


Experiment No. 05: Simulation of Branch Prediction logic

AIM: Write a program to simulate Branch Prediction logic.

EQUIPMENT: Internet, Books, PC.

Branch prediction logic is used to minimize the penalty incurred due to branch instructions. It reduces the time lost when the instruction queue has to be flushed and fetched again each time a branch is taken. The following diagram depicts the need for branch prediction logic.

The BTB (Branch Target Buffer) is a lookup table with 256 entries (2^8 = 256), organized as a 2-way set-associative cache. Each entry holds:
Valid bit
Source address
History bits
Target address


The history bits can be in one of four states, on which the prediction is based:
00 ~ Strongly taken
01 ~ Weakly taken
10 ~ Weakly not taken
11 ~ Strongly not taken

Algorithm:
1. Look up the source address of the instruction in the lookup table.
   a. If the source address is not found (the instruction is encountered for the first time), the prediction is NO JUMP.
      If the branch is taken, insert a record into the BTB with history bits 00; otherwise do nothing.
   b. If the source address is found, the prediction is JUMP or NO JUMP, based on the history bits.
      If the branch is taken, the history bits are upgraded; otherwise the history bits are degraded.

Output:
Instructions in program are:
cmp x1,x2
Jump if x1 < x2

Enter x1, x2 value: 35 45
Prediction is NO JUMP
Branch taken
Incorrect Prediction. History bits are strongly taken

Enter x1, x2 value: 31 11
Prediction is JUMP
Branch not taken
Incorrect Prediction. History bits are weakly taken

Enter x1, x2 value: 63 10
Prediction is NO JUMP
Branch not taken
Correct Prediction. History bits are weakly not taken

Enter x1, x2 value: 74 95
Prediction is NO JUMP
Branch taken
Incorrect Prediction. History bits are weakly taken
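The algorithm can be sketched in Java as below. This is one possible reading, not the program that produced the output above. It assumes that 00 and 01 predict JUMP, that "upgraded" means moving one state towards 00 and "degraded" means moving one state towards 11. The BTB is reduced to a map from source address to history bits; the valid bit, target address, 256-entry limit and 2-way associativity are left out, and the names BranchPredictor, predictAndUpdate and state are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of 2-bit branch prediction using the manual's state encoding:
// 0 = strongly taken, 1 = weakly taken, 2 = weakly not taken, 3 = strongly not taken.
class BranchPredictor {
    static final String[] NAMES = {
        "strongly taken", "weakly taken", "weakly not taken", "strongly not taken"};

    // Simplified BTB: source address -> history bits (assumption: no size limit).
    private final Map<Integer, Integer> btb = new HashMap<>();

    /** Predicts the branch, then updates the table with the real outcome. Returns true for JUMP. */
    boolean predictAndUpdate(int sourceAddr, boolean taken) {
        Integer h = btb.get(sourceAddr);
        if (h == null) {                        // first encounter: predict NO JUMP
            if (taken) btb.put(sourceAddr, 0);  // insert with history bits 00
            return false;
        }
        boolean prediction = h <= 1;            // 00 or 01 predicts JUMP
        int next = taken ? Math.max(0, h - 1)   // upgrade towards strongly taken
                         : Math.min(3, h + 1);  // degrade towards strongly not taken
        btb.put(sourceAddr, next);
        return prediction;
    }

    /** Name of the current history state for an address. */
    String state(int sourceAddr) {
        Integer h = btb.get(sourceAddr);
        return h == null ? "not in BTB" : NAMES[h];
    }

    public static void main(String[] args) {
        BranchPredictor p = new BranchPredictor();
        int[][] inputs = {{35, 45}, {31, 11}, {63, 10}, {74, 95}};
        for (int[] in : inputs) {
            boolean taken = in[0] < in[1];      // "Jump if x1 < x2"
            boolean pred = p.predictAndUpdate(0x100, taken);
            System.out.println((pred ? "JUMP" : "NO JUMP") + ", "
                + (pred == taken ? "correct" : "incorrect") + ", " + p.state(0x100));
        }
    }
}
```

Under these assumptions the history states after each input match the sample output. The prediction for the third input (63, 10) comes out as JUMP here, because the history bits are "weakly taken" at that point, whereas the sample output lists NO JUMP; the original program may have used a different rule for that state.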

CONCLUSION:


Experiment No. 06: Implementation of Page replacement algorithm

AIM: Write a program to implement a page replacement algorithm.

EQUIPMENT: Internet, Books, PC.

THEORY: Whenever a page is required, it is first searched for in the cache. If it is not present, it is brought into the cache. If there is no free space in the cache, an existing page is replaced by the new page. Various techniques are used to choose the page to replace, such as FIFO, LRU, optimal and clock.

FIFO: in this technique the page that entered first is replaced. Eg:

LRU: in this technique the least recently used page is replaced. Eg:
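The FIFO and LRU policies, together with the optimal policy named in the theory, can be compared by counting page faults on a reference string. The sketch below is a minimal illustration; the class PageReplacement and its methods are hypothetical names, and the input in main is the 4-frame reference string 1,2,3,4,1,2,5,1,2,3,4,5 used in this experiment.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: page-fault counts for FIFO, LRU and optimal replacement.
class PageReplacement {
    enum Policy { FIFO, LRU, OPTIMAL }

    static int faults(int[] refs, int frames, Policy policy) {
        // FIFO: oldest page first. LRU: least recently used page first.
        List<Integer> mem = new ArrayList<>();
        int faults = 0;
        for (int t = 0; t < refs.length; t++) {
            Integer page = refs[t];
            if (mem.contains(page)) {
                if (policy == Policy.LRU) {   // hit: move page to the most-recent end
                    mem.remove(page);         // remove(Object)
                    mem.add(page);
                }
                continue;
            }
            faults++;
            if (mem.size() == frames) {
                int victim = (policy == Policy.OPTIMAL) ? farthest(mem, refs, t + 1) : 0;
                mem.remove(victim);           // remove(int index)
            }
            mem.add(page);
        }
        return faults;
    }

    /** Index in mem of the page whose next use lies farthest in the future (or never comes). */
    static int farthest(List<Integer> mem, int[] refs, int from) {
        int victim = 0, best = -1;
        for (int i = 0; i < mem.size(); i++) {
            int next = Integer.MAX_VALUE;     // stays MAX_VALUE if the page is never used again
            for (int t = from; t < refs.length; t++) {
                if (refs[t] == mem.get(i)) { next = t; break; }
            }
            if (next > best) { best = next; victim = i; }
        }
        return victim;
    }

    public static void main(String[] args) {
        int[] refs = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
        for (Policy p : Policy.values()) {
            System.out.println(p + ": " + faults(refs, 4, p) + " page faults");
        }
    }
}
```

With 4 frames this reference string gives 10 faults for FIFO, 8 for LRU and 6 for optimal. Running FIFO with 3 frames gives only 9 faults, which is an instance of Belady's anomaly: adding a frame increased the number of faults.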


Optimal: in this technique the page that will not be used for the longest period of time is replaced.
It has the lowest page-fault rate of all algorithms.
It never suffers from Belady's anomaly.
4 frames example: 1,2,3,4,1,2,5,1,2,3,4,5
It is difficult to implement, as it requires prior knowledge of the reference string (like SJF in CPU scheduling).
It is mainly used in comparison studies, for measuring how well other algorithms perform.

CONCLUSION:


Experiment No. 07: PENTIUM Processors

AIM: Study of PENTIUM processors.

EQUIPMENT: Internet, Books.

THEORY:

FEATURES OF P5:
64-bit data bus, which permits 8 bytes (4 words) to be transferred in a single bus cycle.
8 KB data cache and 8 KB instruction cache.
Two-issue superscalar architecture.
Parallel integer execution, consisting of the U pipeline and the V pipeline.
Branch target buffer and branch prediction logic.
Operating speed from 60 MHz to 200 MHz.

FEATURES OF P6:
It is a 32-bit Intel microprocessor.
It implements a dynamic execution micro-architecture with speculative and out-of-order execution.
It has a 3-way superscalar architecture, allowing execution of 3 instructions per clock cycle.
It has two on-chip 8 KB L1 caches and a 256 KB L2 cache.
Its dynamic execution consists of micro data flow analysis, out-of-order execution, superior branch prediction and speculative execution.

FEATURES OF PENTIUM 4:
It was introduced at clock speeds starting from 1.4 GHz.
It has a 20-stage pipeline (hyper pipelined technology).
It supports Hyper-Threading technology.
It expands the floating-point registers to a full 128 bits and adds an additional register for data movement, which helps improve performance in both floating-point and multimedia applications.
It has a built-in self test to check whether all the units of the processor are working properly.
The 13 new instructions in SSE3 are primarily designed to improve thread synchronization and specific application areas such as media and gaming.

ARCHITECTURE OF PENTIUM 4:


COMPARISON BETWEEN PENTIUM PROCESSORS:

Sr no | Features | Pentium (P5) | Pentium Pro | Pentium II | Pentium 4
1 | Year introduced | 1993 | 1995 | 1997 | 2000
2 | Processor size | 32 bits | 32 bits | 32 bits | 32 bits
3 | Speed | 60 to 66 MHz (later up to 200 MHz) | 60 MHz (later up to 200 MHz) | 166 MHz (later up to 300 MHz) | 400 MHz (later up to 2.26 GHz)
4 | MIPS | 100 to 112 | 200 | 350 | 3000
5 | Address bus size | 32 | 32 | 32 | 32
6 | Data bus size | 64 | 64 | 64 | 64
7 | No. of transistors | 3.1 million | 5.5 million | 7.5 million | 77 million
8 | Addressable memory | 4 GB | 64 GB | 64 GB | 64 GB
9 | Virtual memory | 64 TB | 64 TB | 64 TB | 64 TB
10 | L1 cache | 16 KB split | 16 KB split | 32 KB split | 12K micro-ops + 8 KB data
11 | L2 cache | Off chip, not specified | 256 KB to 1 MB, unified | 512 KB on chip | 1 MB ATC
12 | MMX instruction set | No | No | Yes | Yes
13 | Hyper-threading support | No | No | No | Yes
14 | Architecture family | P5 | P6 | P6 | NetBurst
15 | SMP (multiprocessor) support | No | No | No | Yes
16 | Integer pipeline stages | 5 | 14 | 5 | 20
17 | Floating-point pipeline stages | 8 | - | 8 | 20
18 | Brief description | Superscalar architecture | Intel's first true server/workstation chip | Dual independent bus, dynamic execution, MMX technology | Data transfer rate is 4.2 GB/sec

CONCLUSION:


Experiment No. 08: SPARC Architecture

AIM: Study of SPARC Architecture (V8).

EQUIPMENT: Internet, Books, SPARC Manual.

THEORY: Scalable Processor ARChitecture, or SPARC

ATTRIBUTES
SPARC is a CPU instruction set architecture (ISA) derived from a reduced instruction set computer (RISC) lineage. As an architecture, SPARC allows for a spectrum of chip and system implementations at a variety of price/performance points for a range of applications, including scientific/engineering, programming, real-time and commercial.

DESIGN GOALS
SPARC was designed as a target for optimizing compilers and easily pipelined hardware implementations. SPARC implementations provide exceptionally high execution rates and short time-to-market development schedules.

REGISTER WINDOWS
SPARC, formulated at Sun Microsystems in 1985, is based on the RISC I & II designs engineered at the University of California at Berkeley from 1980 through 1982. The SPARC register window architecture, pioneered in the UC Berkeley designs, allows for straightforward, high-performance compilers and a significant reduction in memory load/store instructions over other RISCs, particularly for large application programs. For languages such as C++, where object-oriented programming is dominant, register windows result in an even greater reduction in instructions executed.

Note that supervisor software, not user programs, manages the register windows. A supervisor can save a minimum number of registers (approximately 24) at the time of a context switch, thereby optimizing context switch latency.

One difference between SPARC and the Berkeley RISC I & II is that SPARC provides greater flexibility to a compiler in its assignment of registers to program variables. SPARC is more flexible because register window management is not tied to the procedure call and return instructions (CALL and JMPL), as it is on the Berkeley machines. Instead, separate instructions (SAVE and RESTORE) provide register window management.

SPARC SYSTEM COMPONENTS
The architecture allows for a spectrum of input/output (I/O), memory management unit (MMU) and cache system sub-architectures. SPARC assumes that these elements are optimally defined by the specific requirements of particular systems. Note that they are invisible to nearly all user application programs, and the interfaces to them can be limited to localized modules in an associated operating system.

SPARC includes the following principal features:
A linear, 32-bit address space.
Few and simple instruction formats: all instructions are 32 bits wide and are aligned on 32-bit boundaries in memory. There are only three basic instruction formats, and they feature uniform placement of opcode and register address fields. Only load and store instructions access memory and I/O.
Few addressing modes: a memory address is given by either register + register or register + immediate.
Triadic register addresses: most instructions operate on two register operands (or one register and a constant) and place the result in a third register.
A large windowed register file: at any one instant, a program sees 8 global integer registers plus a 24-register window into a larger register file. The windowed registers can be described as a cache of procedure arguments, local values and return addresses.
A separate floating-point register file: configurable by software into 32 single-precision (32-bit), 16 double-precision (64-bit) or 8 quad-precision (128-bit) registers, or a mixture thereof.
Delayed control transfer: the processor always fetches the next instruction after a delayed control-transfer instruction. It either executes it or not, depending on the control-transfer instruction's annul bit.
Fast trap handlers: traps are vectored through a table and cause allocation of a fresh register window in the register file.
Tagged instructions: the tagged add/subtract instructions assume that the two least-significant bits of the operands are tag bits.
Multiprocessor synchronization instructions: one instruction performs an atomic read-then-set-memory operation; another performs an atomic exchange-register-with-memory operation.
Coprocessor: the architecture defines a straightforward coprocessor instruction set, in addition to the floating-point instruction set.

The SPARC architecture also describes the following concepts: the instruction set, addressing modes, pipeline processing, FPU, interrupts, bus cycles, programming model, etc.
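The windowed register file described above can be sketched as a small model. This is an illustrative simplification, not the SPARC manual's definition: it assumes 8 windows, follows the SPARC convention that registers r8-r15 are the outs, r16-r23 the locals and r24-r31 the ins, and ignores window overflow and underflow traps. The class RegisterWindows and its methods are hypothetical names.

```java
// Sketch of overlapping register windows: the caller's "out" registers
// become the callee's "in" registers after SAVE.
class RegisterWindows {
    static final int NWINDOWS = 8;                 // assumed window count for this sketch
    private final int[] globals = new int[8];      // r0-r7, shared by all windows
    private final int[] windowed = new int[NWINDOWS * 16];
    private int cwp = 0;                           // current window pointer

    /** Maps r8-r31 of the current window to a slot in the windowed file. */
    private int physical(int r) {
        if (r < 24) {
            return cwp * 16 + (r - 8);             // outs (r8-r15) and locals (r16-r23)
        }
        return ((cwp + 1) % NWINDOWS) * 16 + (r - 24); // ins (r24-r31) overlap the next window's outs
    }

    int read(int r) {
        if (r == 0) return 0;                      // r0 always reads as zero
        return r < 8 ? globals[r] : windowed[physical(r)];
    }

    void write(int r, int value) {
        if (r == 0) return;                        // writes to r0 are discarded
        if (r < 8) globals[r] = value; else windowed[physical(r)] = value;
    }

    /** SAVE: switch to a fresh window (overflow trap handling omitted). */
    void save() { cwp = (cwp - 1 + NWINDOWS) % NWINDOWS; }

    /** RESTORE: return to the caller's window (underflow trap handling omitted). */
    void restore() { cwp = (cwp + 1) % NWINDOWS; }

    public static void main(String[] args) {
        RegisterWindows rw = new RegisterWindows();
        rw.write(8, 42);                           // caller puts an argument in its first out register
        rw.save();                                 // procedure entry
        System.out.println("callee sees argument: " + rw.read(24));
        rw.restore();                              // procedure exit
    }
}
```

Because the argument is passed through the overlapping registers, no load or store instruction is needed on procedure entry, which is the reduction in memory traffic that the register window design aims for.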

CONCLUSION:
