Week2 - 1

This document discusses the evolution of computer memory and processor architectures. It describes how magnetic core memory worked and was later replaced by semiconductor memory as it became smaller, faster, and cheaper. It outlines 13 generations of semiconductor memory density increases. It also discusses the evolution of Intel microprocessors from the 4004 to modern multi-core designs. Additionally, it covers techniques used to increase processor speeds like pipelining, branch prediction, and speculative execution which attempt to keep the processor busy by predicting future instructions.


COAL

Basic Concepts and Evaluation


Instruction Set Architectures X86, ARM

Dr. Zafar Iqbal


Assistant Professor
Department of Cyber Security, FCAI.
Air University, Islamabad.

2
Magnetic Memory
• In core storage, each ferrite ring can represent a 0 or 1 bit, depending on its magnetic state.
• If magnetized in one direction, it represents a 1 bit, and
• if magnetized in the opposite direction, it represents a 0 bit.
• These cores are magnetized by sending an electric current through the wires on which the core is laced.
• It is this direction of current that determines the state of each core.

Types of internal storage (tpub.com) 2


Semiconductor Memory
• In 1970, Fairchild produced the first relatively large semiconductor memory.
• It was about the size of a single core (i.e., one bit of magnetic core storage) but held 256 bits.
• It took only 70 billionths of a second to read a bit.
• However, the cost per bit was higher than that of core.
• In 1978, the price per bit of semiconductor memory dropped below the price per bit of core memory.
• Following this, there was a rapid decline in memory cost, accompanied by a corresponding increase in physical memory density.
• This led the way to smaller, faster machines with memory sizes matching those of larger, more expensive machines from just a few years earlier.

3
Semiconductor Memory
• Since 1970, semiconductor memory has been through 13 generations:

• 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M, 1G, 4G, and, as of this writing, 8 Gb on a single chip (1K = 2^10, 1M = 2^20, 1G = 2^30).

• Density was projected to reach 16 Gb by 2018 and 32 Gb by 2023 [ITRS14].

• Just as the density of elements on memory chips has continued to rise, so has the density of
elements on processor chips.
• As time went on, more and more elements were placed on each chip,
• So that fewer and fewer chips were needed to construct a single computer processor

4
Intel
1971 – 4004
• First microprocessor
• All CPU components on a single chip
• Added two 4-bit numbers
• Multiplied by repeated addition

1972 – 8008
• 8-bit
• Both the 4004 and 8008 were designed for specific applications

1974 – 8080
• Intel’s first general-purpose microprocessor
Dr. Syed Atif Ali Shah 5
Evolution of Intel Micro Processors
Memory/Storage Architecture: Historical Trend
The gap between compute and memory/storage is increasing

https://medium.com/@abruyns/memory-holds-the-keys-to-ai-adoption-5acd5e06508b Pure Storage Inc. 10


Speeding it up

Pipelining
Branch prediction
Superscalar execution
Data flow analysis
Speculative execution
11
Pipelining
• Pipelining enables a processor to work simultaneously on multiple instructions.
• For example, while one instruction is being executed, the computer is decoding the next
instruction.

12
https://www.researchgate.net/publication/352189762_Advancements_in_Microprocessor_Architecture_for_Ubiquitous_AI-An_Overview_on_History_Evolution_and_Upcoming_Challenges_in_AI_Implementation
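To make the overlap concrete, here is a toy Python sketch (an illustrative assumption, not from the slides) comparing cycle counts with and without a 3-stage pipeline:

```python
# Toy model: with pipelining, a new instruction enters the pipeline each
# cycle, so n instructions finish in (stages + n - 1) cycles instead of
# n * stages.

def sequential_cycles(n_instructions, n_stages):
    # Without pipelining, each instruction occupies all stages in turn.
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages):
    # The first instruction takes n_stages cycles to drain; afterwards
    # one instruction completes per cycle.
    return n_stages + (n_instructions - 1)

print(sequential_cycles(10, 3))  # 30 cycles sequentially
print(pipelined_cycles(10, 3))   # 12 cycles pipelined
```

For long instruction streams the speedup approaches the number of stages, which is why deeper pipelines look attractive until hazards intervene.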
Pipelining Limitations
• The speed of a pipeline is eventually limited by its slowest stage.
• For this reason, conventional processors rely on very deep pipelines (e.g., 20-stage pipelines).

• However, in typical program traces, every fifth or sixth instruction is a conditional jump!

• This requires very accurate branch prediction.

• The penalty of a misprediction grows with the depth of the pipeline, since a larger number of instructions will have to be flushed.

13
https://www.researchgate.net/publication/352189762_Advancements_in_Microprocessor_Architecture_for_Ubiquitous_AI-An_Overview_on_History_Evolution_and_Upcoming_Challenges_in_AI_Implementation
Branch prediction
• Processor looks ahead in the instruction code fetched from memory and predicts which
branches, or groups of instructions, are likely to be processed next.
• Techniques
• Static Branch Prediction
• Dynamic Branch Prediction

14
Branch prediction (BP)
• Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next.
• Techniques
• Static Branch Prediction
• Dynamic Branch Prediction

Static BP: the underlying hardware always assumes either that the branch is never taken or that the branch is always taken.

15
Branch prediction (BP)
• Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next.
• Techniques
• Static Branch Prediction
• Dynamic Branch Prediction

Dynamic BP: the prediction made by the underlying hardware is not fixed; rather, it changes dynamically.
• 1-bit branch prediction technique
• 2-bit branch prediction technique
• Correlating branch prediction technique

16
Branch prediction (BP)
• Dynamic Branch Prediction
• 1-bit branch prediction technique: a single bit records whether the branch was last taken; the next prediction simply repeats the last outcome.

17
Branch prediction (BP)
• Dynamic Branch Prediction
• 2-bit branch prediction technique:
• the underlying hardware does not change its prediction after one incorrect guess,
• rather, it changes its prediction only after two consecutive wrong guesses.

19
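The behavior above can be sketched as a 2-bit saturating counter (a minimal Python illustration, not tied to any particular processor):

```python
# 2-bit saturating-counter branch predictor. States 0-1 predict
# "not taken", states 2-3 predict "taken"; from a strong state, the
# prediction flips only after two consecutive mispredictions.

class TwoBitPredictor:
    def __init__(self):
        self.state = 0  # start in "strongly not taken"

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        # Saturate at 0 and 3 instead of wrapping around.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
for actual in [True, True, True]:
    p.update(actual)
# After a run of taken branches the counter saturates at 3;
# a single not-taken outcome will not flip the prediction.
```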
Branch prediction (BP)
• Dynamic Branch Prediction
• Correlating branch prediction technique: the prediction for a branch also takes into account the recent behavior of branches, using a local history table that indexes into a local prediction table.

20
https://www.geeksforgeeks.org/correlating-branch-prediction/
Branch prediction
• Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next.

• If the processor guesses right most of the time, it can prefetch the correct instructions and buffer them so that the processor is kept busy.

• Thus, branch prediction increases the amount of work available for the processor to execute.

21
Branch Prediction Limitations
• However, in typical program traces, every fifth or sixth instruction is a conditional jump!
• This requires very accurate branch prediction.

• The penalty of a misprediction grows with the depth of the pipeline, since a larger number of instructions will have to be flushed.

22
https://www.researchgate.net/publication/352189762_Advancements_in_Microprocessor_Architecture_for_Ubiquitous_AI-An_Overview_on_History_Evolution_and_Upcoming_Challenges_in_AI_Implementation
Superscalar execution
• One simple way of relieving these bottlenecks is to use multiple pipelines.
• In effect, multiple parallel pipelines are used.
• Drawback: wastage of resources due to data dependencies.

23
Data flow analysis
• Processor analyzes which instructions are dependent on each other’s results, or data, to
create an optimized schedule of instructions.
• In fact, instructions are scheduled to be executed when ready, independent of the original
program order.
• This prevents unnecessary delay.

24
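A minimal sketch of the idea (hypothetical instruction tuples, not a real scheduler): issue each instruction as soon as the registers it reads are available, independent of program order.

```python
# Toy data flow scheduler: group instructions into issue cycles based
# only on when their input registers become available.

def schedule(instructions):
    """instructions: list of (name, reads, writes) in program order.
    Returns instructions grouped by issue cycle."""
    ready_at = {}   # register -> cycle its value is available
    cycles = {}
    for name, reads, writes in instructions:
        # An instruction can issue once every register it reads is ready.
        start = max([ready_at.get(r, 0) for r in reads], default=0)
        cycles.setdefault(start, []).append(name)
        for w in writes:
            ready_at[w] = start + 1  # result available next cycle
    return [cycles[c] for c in sorted(cycles)]

prog = [
    ("i1", [], ["r1"]),      # r1 = load
    ("i2", ["r1"], ["r2"]),  # r2 = r1 + 1 (depends on i1)
    ("i3", [], ["r3"]),      # r3 = load (independent of i1/i2)
    ("i4", ["r3"], ["r4"]),  # r4 = r3 * 2 (depends on i3)
]
print(schedule(prog))  # i1 and i3 issue together; then i2 and i4
```

The toy model ignores structural hazards and write-after-write conflicts, but it shows how independent instructions (i1 and i3) can run together even though they are not adjacent in program order.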
Speculative execution

• Speculative execution = Branch Prediction + Data Flow Analysis

25
Performance Balance

• Processor speed increased
• Memory capacity increased
• Memory speed lags behind processor speed

26
Logic and Memory Performance Gap

27
Memory/Storage Architecture: Historical Trend
The gap between compute and memory/storage is increasing

https://medium.com/@abruyns/memory-holds-the-keys-to-ai-adoption-5acd5e06508b Pure Storage Inc. 28


Solution
• Processor power has raced ahead rapidly, while other critical components of the computer have not kept up.

• The problem created by such mismatches is particularly critical at the interface between processor and main memory.

• While processor speed has grown rapidly, the speed with which data can be transferred between main memory and the processor has lagged badly.

• The result is a need to look for performance balance: an adjusting of the organization and architecture to compensate for the mismatch among the capabilities of the various components.

29
Solutions
• The problem created by such mismatches is particularly critical at the interface between processor and main
memory.

• While processor speed has grown rapidly, the speed with which data can be transferred between main memory and
the processor has lagged badly.

• The interface between processor and main memory is the most crucial pathway in the entire computer

• It is responsible for carrying a constant flow of program instructions and data between memory chips and the
processor.

• If memory or the pathway fails to keep pace with the processor’s insistent demands, the processor stalls in a wait
state, and valuable processing time is lost.

30
Solutions
• A system architect can attack this problem in a number of ways.

• Increase the number of bits that are retrieved at one time by making DRAMs “wider” and using wide bus data paths.

• Change the DRAM interface to make it more efficient by including a cache.

• This includes the incorporation of one or more caches on the processor chip, as well as an off-chip cache close to the processor chip.

31
Typical I/O Device Data Rates

[Figure 2.1: Typical I/O device data rates, spanning roughly 10^1 to 10^11 bps — keyboard, mouse, scanner, laser printer, optical disc, hard disk, Wi-Fi modem (max speed), graphics display, Ethernet modem (max speed)]

• Figure 2.1 gives some examples of typical peripheral devices in use on personal computers and workstations.

• These devices create tremendous data throughput demands.

• While the current generation of processors can handle the data pumped out by these devices, there remains the problem of getting that data moved between processor and peripheral.

32
I/O Devices
Peripherals with intensive I/O demands

Processors can handle these demands through:

• Caching
• Buffering
• Higher-speed interconnection buses
• More elaborate bus structures
• Multiple-processor configurations
33
Key is Balance
• Designers constantly struggle to balance the throughput and processing demands of the processor, main
memory, I/O devices, and the interconnection structures.

• This design must constantly be reconsidered to cope with two constantly evolving factors:

• The rate at which performance is changing in the various technology areas (processor, buses,
memory, peripherals) differs greatly from one type of element to another.

• New applications and new peripheral devices constantly change the nature of the demand on
the system in terms of typical instruction profile and the data access patterns.

34
Improvements in Chip Organization and Architecture
Increase hardware speed of processor

• Fundamentally due to shrinking logic gate size


• More gates, packed more tightly, increasing clock rate
• Propagation time for signals reduced

Increase size and speed of caches

• Dedicating part of processor chip


• Cache access times drop significantly

Change processor organization and architecture

• Increase effective speed of execution


• Parallelism

35
Problems with Clock Speed and Logic Density
Power

• Power density increases with density of logic and clock speed


• Dissipating heat

RC delay

• The speed at which electrons flow is limited by the resistance and capacitance of the metal wires connecting them
• Delay increases as the RC product increases
• As wires become thinner, resistance increases
• As wires come closer together, capacitance increases

Memory latency

• Memory speeds lag processor speeds

Solution:

• More emphasis on organizational and architectural approaches


36
Increased Cache Capacity

https://www.youtube.com/watch?v=IA8au8Qr3lo 37

• Typically, two or three levels of cache between processor and main memory

• Chip density increased

• More cache memory on chip

• Faster cache access

• Pentium chip devoted about 10% of chip area to cache

• Pentium 4 devotes about 50%

39
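As a rough illustration of why on-chip cache helps, here is a minimal direct-mapped cache model in Python (an assumed toy design, not any specific Pentium cache): each memory block maps to exactly one cache line, and a hit avoids a slow main-memory access.

```python
# Toy direct-mapped cache: address -> block -> (line, tag).
# A hit means the tag stored in that line matches the requested tag.

class DirectMappedCache:
    def __init__(self, n_lines, block_size):
        self.n_lines = n_lines
        self.block_size = block_size
        self.tags = [None] * n_lines   # one tag per cache line
        self.hits = self.misses = 0

    def access(self, address):
        block = address // self.block_size
        line = block % self.n_lines
        tag = block // self.n_lines
        if self.tags[line] == tag:
            self.hits += 1
            return True
        self.tags[line] = tag          # fill the line on a miss
        self.misses += 1
        return False

cache = DirectMappedCache(n_lines=4, block_size=16)
for addr in [0, 4, 8, 64, 0, 64]:      # nearby addresses share a block
    cache.access(addr)
# Accesses 4 and 8 hit (same block as 0); 0 and 64 keep evicting
# each other because they map to the same line.
```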
Moore’s law
• Moore observed that the number of transistors that could be put on a single chip was doubling
every year.

• Correctly predicted that this pace would continue into the near future.

40
Instruction Set Architecture
• To command a computer’s hardware, you must speak its
language.
• The words of a computer’s language are called instructions.
• Its instruction vocabulary is called an instruction set.
• Popular instruction sets include:
• Intel x86
• ARM

41
x86 ISA
• Early chips were given technical part numbers, such as 8086, 80386, or 80486.

• This led to the commonly used shorthand of “x86 architecture,” in reference to the last two digits of each chip’s part number.

• Beginning in 1993, the “x86” naming convention gave way to more memorable
(and pronounceable) product names such as:
• Intel® Pentium® processor,
• Intel® Celeron® processor

42
x86 Evolution
8080
• Intel’s first general-purpose microprocessor
• 8-bit data path
• Used in the first personal computer – Altair

8086 – 5 MHz – 29,000 transistors
• Much more powerful, 16-bit
• Instruction cache, prefetches a few instructions
• 8088 (8-bit external bus) used in first IBM PC

80286
• 16 MByte of memory addressable, up from 1 MB

80386
• 32-bit
• Support for multitasking

80486
• Sophisticated, powerful cache and instruction pipelining
• Built-in maths co-processor

43
x86 Evolution
Pentium
• Superscalar
• Multiple instructions executed in parallel

Pentium Pro
• Increased superscalar organization
• Aggressive register renaming
• Branch prediction
• Data flow analysis
• Speculative execution

Pentium II
• MMX technology
• Graphics, video & audio processing

Pentium III
• Additional floating-point instructions for 3D graphics

44


x86 Evolution
Pentium 4
• Note Arabic rather than Roman numerals
• Further floating-point and multimedia enhancements

Core
• First x86 with dual core

Core 2
• 64-bit architecture

Core 2 Quad – 3 GHz
• 820 million transistors
• Four processors on chip

• x86 architecture dominant outside embedded systems
• Instruction set evolved with backwards compatibility; roughly one instruction per month added
• ~500 instructions available
• Organization and technology changed dramatically
• See Intel web pages for detailed information on processors

45
Intel Microprocessor Performance

46
Intel Microprocessor Performance

• Internal memory cache: memory found within modern processors.

• It acts as temporary storage for frequently accessed data and instructions, significantly improving overall system performance.

47
Intel Microprocessor Performance

• Speculative out-of-order execution: branch prediction + data flow analysis

48


Intel Microprocessor Performance

Multimedia Extensions (MMX):
• Released in 1997 for Intel x86 processors.
• Introduced 57 new instructions specifically designed for multimedia tasks like audio, video, image processing, and 3D graphics.
• Utilized a Single Instruction, Multiple Data (SIMD) architecture to process multiple data elements simultaneously.
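The SIMD idea can be imitated in Python with NumPy (an analogy only; real MMX code is x86 assembly): one vector operation processes every data element at once, as in a brightness adjustment over an image's pixels.

```python
import numpy as np

# SIMD analogy: adjust the brightness of many pixels with a single
# vectorized operation instead of one element at a time.

pixels = np.array([10, 200, 130, 255, 0], dtype=np.uint16)

# Scalar style: one element per "instruction", clamped at 255
scalar = np.array([min(p + 50, 255) for p in pixels], dtype=np.uint16)

# SIMD style: a single vectorized add-and-clamp over all elements
simd = np.minimum(pixels + 50, 255)

assert (scalar == simd).all()
```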

49


Intel Microprocessor Performance

• Hyper Threading: Imagine a single core with processing resources like execution units and registers.

• During normal operation, these resources are used by one thread at a time.

• Hyperthreading creates virtual copies of these resources, allowing two threads to "share" the core,
essentially multitasking within the same space.

• Though the threads still need to wait for each other for certain tasks, hyperthreading enables efficient
utilization of idle resources during wait times, potentially boosting overall performance.
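The resource-sharing idea can be loosely imitated with software threads (an analogy, not real hyperthreading): two threads whose wait times overlap finish in roughly the time of one.

```python
import threading
import time

# Two "hardware threads" sharing one core: while one waits (here a
# sleep standing in for a memory stall), the other can make progress.

def worker(results, idx):
    time.sleep(0.1)            # stand-in for a stall / wait time
    results[idx] = idx * idx   # the useful work

results = [None, None]
start = time.time()
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# The two waits overlap: total time is ~0.1 s rather than ~0.2 s.
```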

50


CISC vs RISC

• “Complex instruction set computer” (CISC, pronounced “sisk”)

• The x86 architecture has a complex instruction set, which makes it difficult to optimize code for performance. This complexity can also make it harder to debug software and hardware issues.

• The main idea behind CISC processors is that:

• a single instruction can be used to do all of the loading, evaluating, and storing operations. Because of this, instructions are relatively more complicated compared to RISC, hence the name Complex Instruction.

• The concept of RISC is to reduce the complexity of individual instructions, even if more instructions are needed to do the same work. In this way, instructions are simplified, low-level operations are easier to achieve, and instructions can be executed faster.

51
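The contrast can be sketched with a toy register machine in Python (hypothetical operations, not real x86 or ARM encodings): the same update `mem[a] = mem[a] + mem[b]` as one CISC-style memory-to-memory instruction versus a RISC-style load/operate/store sequence.

```python
# Toy machine state: a memory and a register file.
memory = {"a": 5, "b": 7}
registers = {}

def cisc_add_mem(dst, src):
    # One complex instruction: reads both operands from memory,
    # adds them, and writes the result straight back to memory.
    memory[dst] = memory[dst] + memory[src]

# RISC style: only loads and stores touch memory; arithmetic
# works on registers.
def risc_load(reg, addr):
    registers[reg] = memory[addr]

def risc_add(rd, rs, rt):
    registers[rd] = registers[rs] + registers[rt]

def risc_store(reg, addr):
    memory[addr] = registers[reg]

cisc_add_mem("a", "b")            # a = 12 in a single instruction

memory.update({"a": 5, "b": 7})   # reset, then the RISC equivalent:
risc_load("r1", "a")
risc_load("r2", "b")
risc_add("r3", "r1", "r2")
risc_store("r3", "a")             # a = 12 again, via four simple steps
```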
CISC vs RISC

• CISC: x86 processors from Intel and AMD.

• RISC: ARM processors commonly used in mobile devices, MIPS processors used in
embedded systems and networking equipment.

52
ARM (Advanced RISC Machine)

• ARM is a family of RISC-based microprocessors and microcontrollers designed by ARM Holdings, Cambridge, England.

• The ARM architecture is the most commonly implemented 32-bit instruction set architecture.

• Used mainly in embedded systems

• An embedded system refers to the use of electronics and software within a product

• ARM is used within a product, not in a general-purpose computer

• Dedicated function

• E.g. anti-lock brakes in a car, digital cameras, cell phones

53
Embedded Systems Requirements

Different sizes
• Different constraints, optimization, reuse

Different requirements
• Safety, reliability, real-time, flexibility, legislation
• Lifespan
• Environmental conditions
• Static v dynamic loads
• Slow to fast speeds
• Computation v I/O intensive
• Discrete event v continuous dynamics

54
54
Possible Organization of an Embedded System

• Through the human interface, we interact with the embedded system: communication channels through which users interact with and control the system, providing access to data, functionalities, and settings.

• The diagnostic port may be used for diagnosing the system.

• Special-purpose field programmable (FPGA), application-specific (ASIC), or even nondigital hardware may be used to increase performance or reliability.

• An actuator is a device that receives an input signal (electrical, pneumatic, or hydraulic) and converts it into mechanical motion or force.
• Example: a stepper motor, where electrical energy drives the motor.

59
ARM Evolution

• Designed by ARM Inc., Cambridge, England
• Licensed to manufacturers
• High speed, small die, low power consumption
• Used in PDAs, hand-held games, phones (e.g. iPod, iPhone)
• Acorn produced the ARM1 & ARM2 in 1985 and the ARM3 in 1989
• Acorn, VLSI and Apple Computer founded ARM Ltd.

60
ARM Systems Categories

Embedded real time

Application platform
• Linux, Palm OS, Symbian OS, Windows mobile

Secure applications

61


Cloud Computing

• A model for enabling universal, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

62
Cloud Service

63
Thank You
