
CSIT123 Computing and Cyber Security Fundamentals
Week 4: Computer Architecture Fundamentals (Part 1)
Dr. Huseyin Hisil and Dr. Xueqiao Liu

Initially prepared by Dr. Dung Duong


Reading Task:
● Hennessy, J.L. and Patterson, D.A., 2019. Computer Architecture: A Quantitative Approach, 6th edition. Morgan Kaufmann.
○ Fundamentals of Quantitative Design and Analysis
Introduction to Quantitative Design and Analysis
● Performance Improvements
• Semiconductor technology - feature size, clock speed
• Improvements in computer architectures - enabled by HLL compilers and UNIX; led to RISC
(Reduced Instruction Set Computer) architectures.
• Together these enable - lightweight computers; productivity-based, managed/interpreted
programming languages; SaaS, virtualization, and cloud computing.
• Applications - Speech, sound, images, video, “augmented/extended reality”, “big data”.
Processor Performance

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Introduction
● Current Trends in Architecture
• Cannot continue to leverage instruction-level parallelism (ILP) alone - the era of rapid single-processor
performance improvement ended around 2003
• New models for performance - Data-level parallelism (DLP); Thread-level parallelism (TLP);
Request-level parallelism (RLP)
• These require explicit restructuring of the application
Classes of Computers
● Internet of Things/Embedded Computers
● Personal Mobile Device (PMD)
● Desktop Computing
● Servers
● Clusters / Warehouse Scale Computers
Classes of Computers
Internet of Things (IoT)/Embedded Computers
● Embedded computers are found in everyday machines: microwaves, washing
machines, most printers, networking switches, and all automobiles.
● Internet of Things (IoT) refers to embedded computers that are connected to the
Internet, typically wirelessly
○ When augmented with sensors and actuators, IoT devices collect useful data and
interact with the physical world, leading to a wide variety of “smart” applications, e.g.,
smart watches, smart thermostats, smart speakers, smart cars, smart homes, smart
grids, and smart cities.
● Embedded computers have the widest spread of processing power and cost.
○ Price is a key factor in the design of computers for this space.
Classes of Computers
Personal Mobile Device
● The term applies to a collection of wireless devices with multimedia user interfaces, such as cell
phones, tablet computers, and so on.
○ Cost is a prime concern
● Applications on PMDs are often web-based and media-oriented
● Energy and size requirements lead to use of Flash memory for storage
● Characteristics:
○ Responsiveness and predictability
○ Minimize memory use and maximize energy efficiency
Classes of Computers
Desktop Computing
● Largest market in dollar terms
● Desktop market tends to be driven to optimize price-performance.
○ Price-performance matters to customers and hence to computer designers
○ As a result, both the highest-performance and the cost-reduced microprocessors tend to appear first in this market
Classes of Computers
Servers
● Servers have become the backbone of large-scale enterprise computing, replacing the
traditional mainframe.
● Important characteristics:
○ Availability
○ Scalability
○ Efficiency and cost-effectiveness
Classes of Computers
Clusters/Warehouse-Scale Computers
● Clusters are collections of desktop computers or servers connected by local area
networks to act as a single larger computer.
● WSCs (Warehouse-Scale Computers) are the largest of the clusters
○ tens of thousands of servers can act as one
● Price-performance and power are critical to WSCs
● WSCs are related to servers in that availability is critical
● Difference between WSCs and servers:
○ WSCs use redundant, inexpensive components as the building blocks, relying on a
software layer to catch and isolate the many failures
○ scalability for a WSC is handled by the local area network connecting the computers
and not by integrated computer hardware, as in the case of servers.
● Supercomputers are related to WSCs in that they are equally expensive
○ Supercomputers emphasize floating-point performance
○ WSCs emphasize interactive applications, large-scale storage, dependability, and high
Internet bandwidth.
Classes of Computers
Parallelism at Multiple Levels
● Classes of parallelism in applications
○ Data-Level Parallelism (DLP) - many data items that can be operated on at the
same time.
○ Task-Level Parallelism (TLP) - tasks of work are created that can operate
independently and largely in parallel
● Classes of architectural parallelism
○ Instruction-Level Parallelism (ILP) - exploits data-level parallelism at modest levels with
compiler help using ideas like pipelining and at medium levels using ideas like
speculative execution.
○ Vector architectures/Graphic Processor Units (GPUs) - exploit data-level parallelism by
applying a single instruction to a collection of data in parallel.
○ Thread-Level Parallelism - exploits either data-level parallelism or task-level parallelism
in a tightly coupled hardware model that allows for interaction among parallel threads.
○ Request-Level Parallelism - exploits parallelism among largely decoupled tasks specified
by the programmer or the operating system.
Classes of Computers
Flynn’s Taxonomy - Proposed in the 1960s
● Single instruction stream, single data stream (SISD)
○ Uniprocessor. A sequential computer that can nonetheless exploit instruction-level parallelism, using ILP techniques
such as superscalar and speculative execution.
● Single instruction stream, multiple data streams (SIMD)
○ The same instruction is executed by multiple processors using different data streams. Exploit data-level parallelism
by applying the same operations to multiple items of data in parallel.
○ Each processor has its own data memory (hence the MD of SIMD), but there is a single instruction memory and
control processor.
○ DLP and three different architectures that exploit it: vector architectures, multimedia extensions to standard
instruction sets, and GPUs.
● Multiple instruction streams, single data stream (MISD)
○ No commercial multiprocessor of this type has been built to date.
● Multiple instruction streams, multiple data streams (MIMD)
○ Each processor fetches its own instructions and operates on its own data, targeting task-level parallelism.
○ More flexible than SIMD and thus more generally applicable, but more expensive.
○ Tightly coupled MIMD - exploit thread-level parallelism since multiple cooperating threads operate in parallel;
Loosely coupled MIMD – (clusters and warehouse-scale computers) exploit request-level parallelism with little
communication or synchronization.
Define Computer Architecture
● The Myopic View: only Instruction Set Architecture (ISA)
○ ISA serves as the boundary between the software and hardware
○ Seven dimensions: Class of ISA, Memory addressing, Addressing modes, Types and sizes
of operands, Operations, Control flow instructions, and Encoding an ISA
● The Genuine View: Designing the Organization and Hardware
○ Meet functional requirements as well as price, power, performance, and availability
goals.
○ Covers three aspects: instruction set architecture, microarchitecture (organization), and
hardware
Trends in Technology
● Integrated circuit technology
○ Transistor density increases by 35%/year
○ Die size increases by 10-20%/year
○ Together these lead to growth in transistor count on a chip of 40-55%/year
● DRAM (dynamic random-access memory) capacity
○ Capacity per DRAM chip has increased by 25-40%/year (slowing)
● Flash (electrically erasable programmable read-only memory) capacity
○ Capacity per Flash chip has increased by 50-60%/year
○ In 2019, 8-10X cheaper/bit than DRAM
● Magnetic disk technology
○ Since 2004, density increased by 40%/year
○ 8-10X cheaper/bit than Flash
○ 200-300X cheaper/bit than DRAM
● Network technology: performance depends on
○ performance of switches
○ performance of the transmission system
Performance Trends: Bandwidth over Latency
● Bandwidth or throughput
○ The total amount of work done in a given time, such as megabytes per second for a disk transfer
○ 10,000-25,000X improvement for processors over the first milestone
○ 300-1200X improvement for memory and disks over the first milestone
● Latency or response time
○ The time between the start and completion of an event
○ 30-80X improvement for processors over the first milestone
○ 6-8X improvement for memory and disks over the first milestone
Scaling of Transistor Performance and Wires
● Feature size
○ Minimum size of transistor or wire in x or y dimension
○ Shrank from 10 microns in 1971 to 0.016 microns in 2017
○ Transistor performance scales linearly with decreasing feature size - wire delay does not improve as feature size shrinks
○ Integration density scales quadratically
○ Linear performance and quadratic density growth present both a challenge and an opportunity,
creating the need for the computer architect
Power and Energy in Integrated Circuits
● Power is the biggest challenge
○ Problem: power is brought in and distributed around the chip, and modern
microprocessors use hundreds of pins and multiple interconnect layers just for power
and ground; power is dissipated as heat and must be removed.
○ Three concerns
■ What is the maximum power a processor ever requires?
■ What is the sustained power consumption? - the thermal design power (TDP): Characterizes
sustained power consumption; Used as target for power supply and cooling system; Lower
than peak power, higher than average power consumption
■ Consider energy and energy efficiency – energy (not power) consumption per task better
measures efficiency
Dynamic Energy and Power
● Dynamic energy
○ The dynamic energy required per transistor is proportional to the product of the capacitive load
driven by the transistor and the square of the voltage, i.e., the energy of a pulse of the
logic transition 0→1→0 or 1→0→1:
   Energy_dynamic ∝ Capacitive load × Voltage²
○ The energy of a single transition (0→1 or 1→0):
   Energy_dynamic ∝ ½ × Capacitive load × Voltage²
● Dynamic power
○ The dynamic power required per transistor is the energy of a transition multiplied
by the frequency of transitions:
   Power_dynamic ∝ ½ × Capacitive load × Voltage² × Frequency switched
Dynamic Energy and Power
● Example: Some microprocessors today are designed to have adjustable voltage, so
a 15% reduction in voltage may result in a 15% reduction in frequency. What would
be the impact on dynamic energy and on dynamic power?
● Solution:
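○ A sketch of the calculation, assuming voltage and frequency both scale by a factor of 0.85: dynamic energy is
proportional to Voltage², so Energy_new / Energy_old = 0.85² ≈ 0.72, i.e., energy drops to about 72% of its original
value; dynamic power additionally scales with frequency, so Power_new / Power_old = 0.85³ ≈ 0.61, about 61% of the
original power.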
Dynamic Energy and Power
● Reducing clock frequency/rate reduces power, not energy
○ The first microprocessors consumed less than a watt, and the first 32-bit microprocessors (like the
Intel 80386) used about 2 W, while a 4.0 GHz Intel Core i7-6700K consumes 95 W.
Improve Energy Efficiency
● Techniques
○ Do nothing well - Most microprocessors turn off the clock of inactive modules to save energy
and dynamic power. E.g., if no floating-point instructions are executing, the clock of the
floating-point unit is disabled; if some cores are idle, their clocks are stopped.
○ Dynamic Voltage-Frequency Scaling (DVFS) - Modern microprocessors typically offer a few
clock frequencies and voltages at which to operate that use lower power and energy.
○ Design for the typical case - Because PMDs and laptops are often idle, memory and storage offer
low-power modes to save energy. E.g., DRAMs have a series of increasingly lower-power modes to
extend battery life, and disks can spin at lower rates when idle. These modes cannot be accessed
directly, so the device must return to fully active mode to read or write.
○ Overclocking - Run at a higher clock rate for a short time on some cores until the temperature begins to rise.
For single-threaded code, microprocessors can turn off all cores but one and run it at a higher clock
rate.
Static Power
● Static power
○ Leakage current flows even when a transistor is off:
   Power_static ∝ Current_static × Voltage
○ Static power is proportional to the number of transistors (more devices means more leakage)
○ Power gating reduces static power - Turn off the power supply
○ Race-to-halt strategy - use a faster, less energy-efficient processor to allow the rest of
the system to go into a sleep mode
● New metric to evaluate
○ Old: performance per mm² of silicon
○ New: tasks per joule or performance per watt
Trends in Cost
● The learning curve drives costs down.
○ Measured by change in yield - the percentage of manufactured devices that survives the
testing procedure. Designs that have twice the yield will have half the cost.
● DRAM prices closely track cost
● Microprocessors
○ With significant competition, price closely tracks cost
○ Volume also determines cost
■ Increasing volumes decrease the time needed to move down the learning curve, which is partly
proportional to the number of systems/chips manufactured
■ Increasing volumes decrease cost through improved purchasing and manufacturing efficiency (roughly 10%
less cost for each doubling of volume)
Integrated Circuit
● Integrated circuit costs are a growing and highly variable portion of system cost
○ Although the cost of integrated circuits has dropped exponentially, the basic silicon manufacturing process is
unchanged: a wafer is still tested and chopped into dies that are packaged
○ To predict the number of good chips per wafer, we need to know how many dies fit on a
wafer and how to predict the percentage of those that will work
○ The number of dies per wafer is approximately the area of the wafer divided by the area of the die,
minus a correction for partial dies along the wafer's edge:
   Dies per wafer = [π × (Wafer diameter / 2)²] / Die area − [π × Wafer diameter] / √(2 × Die area)
Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Integrated Circuit
● Example: Find the number of dies per 300 mm (30 cm) wafer for a die
that is 1.5 cm on a side and for a die that is 1.0 cm on a side.
● Answer:
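○ Working, using the dies-per-wafer formula above: for the 1.5 cm die, the die area is 2.25 cm², so
Dies per wafer ≈ [π × 15²] / 2.25 − [π × 30] / √(2 × 2.25) ≈ 314.2 − 44.4 ≈ 270; for the 1.0 cm die, the die area
is 1.00 cm², so Dies per wafer ≈ 706.9 − 66.6 ≈ 640.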
Integrated Circuit
● Problem: what fraction of the dies on a wafer are good (the die yield)?
○ The formula above gives only the maximum number of dies per wafer. Defects are randomly distributed
over the wafer, and yield is inversely proportional to the complexity of the fabrication process.
○ The Bose–Einstein formula, an empirical fit to the yield of many manufacturing lines:
   Die yield = Wafer yield × 1 / (1 + Defects per unit area × Die area)^N
■ Wafer yield accounts for wafers that are completely bad and so need not be tested; assume the
wafer yield is 100% for simplicity.
■ Defects per unit area measures the random manufacturing defects that occur. In 2017, 0.08
to 0.1 defects per square inch; or 0.012 to 0.016 defects per square cm for a 28nm process,
as it depends on the maturity of the process (learning curve).
■ N: the process-complexity factor, a measure of manufacturing difficulty. For 28 nm
processes in 2017, N ranged from 7.5 to 9.5.
○ The manufacturing process dictates the wafer cost, wafer yield, and defects per unit
area, so the sole control of the designer is die area

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Integrated Circuit
● Example: Find the die yield for dies that are 1.5 cm on a side and 1.0 cm on a side, assuming a defect
density of 0.047 per cm² and N is 12.
● Answer:
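○ Working, using the Bose–Einstein formula above with a wafer yield of 100%: for the 1.5 cm die (area 2.25 cm²),
Die yield = 1 / (1 + 0.047 × 2.25)^12 ≈ 0.30; for the 1.0 cm die (area 1.0 cm²),
Die yield = 1 / (1 + 0.047 × 1.0)^12 ≈ 0.57.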
Dependability
● With respect to a service level agreement (SLA) or service level
objective (SLO), systems alternate between two states of service
○ Service accomplishment, where the service is delivered as specified
○ Service interruption, where the delivered service is different from the SLA
● Transitions caused by failures (1 to 2) or restorations (2 to 1)
○ Module reliability: a measure of continuous service accomplishment (equivalently, the time to failure)
■ Mean time to failure (MTTF); failure rates are also reported as failures in time (FIT), i.e., failures per billion (10⁹) hours of operation
■ An MTTF of 1,000,000 hours equals 10⁹/10⁶ = 1000 FIT
■ Mean time to repair (MTTR)
■ Mean time between failures (MTBF) = MTTF + MTTR
○ Module availability: a measure of service accomplishment with respect to the alternation
between the two states of accomplishment and interruption; for nonredundant systems with repair:
   Module availability = MTTF / (MTTF + MTTR)
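As a small illustration of these definitions, a Python sketch (the MTTF and MTTR values below are made up, not taken from the slides):

    MTTF_HOURS = 1_000_000   # mean time to failure (illustrative)
    MTTR_HOURS = 24          # mean time to repair (illustrative)

    # FIT = failures per billion (10^9) device-hours, so an MTTF of 10^6 hours is 1000 FIT.
    fit = 1e9 / MTTF_HOURS

    mtbf = MTTF_HOURS + MTTR_HOURS                          # mean time between failures
    availability = MTTF_HOURS / (MTTF_HOURS + MTTR_HOURS)

    print(f"FIT rate: {fit:.0f}")                # 1000
    print(f"MTBF: {mtbf} hours")                 # 1000024
    print(f"Availability: {availability:.6f}")   # ~0.999976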

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Dependability
● Example:
Dependability
● Answer:
Performance Measurement
● Two metrics
• Reduce response time: time between start and end of an event (execution time)
• Increase throughput: the total amount of work done in a given time
• Computer X is n times faster than Y:
   n = Execution time_Y / Execution time_X = Performance_X / Performance_Y
• “The throughput of X is 1.3 times higher than Y” signifies here that the number of tasks
completed per unit time on computer X is 1.3 times the number completed on Y.
● Execution time
• Wall-clock time/response time/elapsed time: latency to complete a task (all overheads)
• CPU time: only computation time
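As a small illustration of these definitions, a Python sketch (the execution times and task counts below are made-up values):

    # Made-up execution times for the same task (seconds).
    time_x = 10.0
    time_y = 15.0

    # "X is n times faster than Y" compares execution times (equivalently, performance = 1/time).
    n = time_y / time_x
    print(f"X is {n:.2f} times faster than Y")    # 1.50

    # Throughput compares tasks completed in the same interval.
    tasks_x, tasks_y = 130, 100
    print(f"Throughput of X is {tasks_x / tasks_y:.1f}x that of Y")   # 1.3x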

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Performance Measurement
● Benchmarks
• The best choice of benchmarks to measure performance is real applications.
• Kernels: small, key pieces of real applications
• Toy programs: 100-line programs from beginning programming assignments, e.g., quicksort
• Synthetic benchmarks: fake programs invented to try to match the profile and behaviour of
real applications, e.g., Dhrystone (https://en.wikipedia.org/wiki/Dhrystone)
Performance Measurement
● Benchmarks
• Benchmark suites: a popular measure of processor performance across a variety of
applications, e.g., SPEC (Standard Performance Evaluation Corporation)
• SPECRatio: divide the execution time on the reference computer by the execution time on the computer being rated
   SPECRatio = Execution time_reference / Execution time_rated
• Because SPECRatio is a ratio rather than an absolute execution time, the mean over a suite must be computed
using the geometric mean:
   Geometric mean = (SPECRatio_1 × SPECRatio_2 × … × SPECRatio_n)^(1/n)
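As a small illustration, a Python sketch of the SPECRatio and geometric-mean calculation (the execution times below are made up, not actual SPEC results):

    from math import prod

    # Made-up execution times (seconds) on the reference machine and on the machine being rated.
    ref_times   = [500.0, 800.0, 1200.0]
    rated_times = [125.0, 320.0, 400.0]

    # SPECRatio = reference time / rated-machine time, per benchmark.
    spec_ratios = [ref / rated for ref, rated in zip(ref_times, rated_times)]   # [4.0, 2.5, 3.0]

    # Because SPECRatios are ratios, they are summarised with the geometric mean.
    geometric_mean = prod(spec_ratios) ** (1.0 / len(spec_ratios))
    print(f"Geometric mean: {geometric_mean:.2f}")    # (4.0 * 2.5 * 3.0) ** (1/3) ≈ 3.11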

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Quantitative Principles of Computer Design
● Take Advantage of Parallelism
• Multiple processors, disks, memory banks, pipelining, multiple functional units
● Principle of Locality
• Reuse of data and instructions
Extra Reading Material
Quantitative Principles of Computer Design
● Focus on the Common Case
• Amdahl’s law:
   Speedup_overall = Execution time_old / Execution time_new = 1 / [(1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced]
• The speedup gained from an enhancement depends on two factors:
• Fraction_enhanced - the fraction of the computation time in the original computer that can use the
enhancement. E.g., if 20 seconds of the execution time of a program that takes 60 seconds in total can use
an enhancement, Fraction_enhanced = 20/60. This fraction is always less than or equal to 1.
• Speedup_enhanced - how much faster the task would run if the enhanced mode were used for the entire
program, i.e., the time of the original mode over the time of the enhanced mode. E.g., if the enhanced mode
takes 2 seconds for a portion of the program that takes 5 seconds in the original mode,
Speedup_enhanced = 5/2. This speedup is always greater than 1.
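As a small illustration, a Python sketch of Amdahl’s law as stated above, applied to the 20-of-60-seconds example:

    def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
        """Overall speedup when only a fraction of the original execution time is enhanced."""
        return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

    # One third of the original time (20 of 60 seconds) runs 5/2 = 2.5 times faster.
    print(amdahl_speedup(20 / 60, 5 / 2))    # ≈ 1.25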

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Quantitative Principles of Computer Design

● Focus on the Common Case


○ Example: Suppose that we want to enhance the processor used for web serving. The
new processor is 10 times faster on computation in the web serving application than
the old processor. Assuming that the original processor is busy with computation 40%
of the time and is waiting for I/O 60% of the time, what is the overall speedup gained
by incorporating the enhancement?
○ Answer:
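■ A sketch of the calculation using Amdahl’s law: Fraction_enhanced = 0.4 and Speedup_enhanced = 10, so
Speedup_overall = 1 / (0.6 + 0.4/10) = 1 / 0.64 ≈ 1.56.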

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Quantitative Principles of Computer Design
● Focus on the Common Case
○ Example: A common transformation required in graphics processors is square root.
Implementations of floating-point (FP) square root vary significantly in performance,
especially among processors designed for graphics. Suppose FP square root (FSQRT) is
responsible for 20% of the execution time of a critical graphics benchmark. One
proposal is to enhance the FSQRT hardware and speed up this operation by a factor of
10. The other alternative is just to try to make all FP instructions in the graphics
processor run faster by a factor of 1.6; FP instructions are responsible for half of the
execution time for the application. The design team believes that they can make all FP
instructions run 1.6 times faster with the same effort as required for the fast square
root. Compare these two design alternatives.
○ Answer: Improving the performance of the FP operations overall is slightly better
because of the higher frequency.
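■ The supporting arithmetic via Amdahl’s law: Speedup_FSQRT = 1 / (0.8 + 0.2/10) = 1 / 0.82 ≈ 1.22, whereas
Speedup_FP = 1 / (0.5 + 0.5/1.6) = 1 / 0.8125 ≈ 1.23.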

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Quantitative Principles of Computer Design
● The Processor Performance Equation
• All computers are constructed using a clock running at a constant rate.
• These discrete time events are called ticks, clock ticks, clock periods, clocks, cycles, or clock cycles.
Computer designers refer to the time of a clock period by its duration (e.g., 1 ns) or by its rate (e.g.,
1 GHz).
   CPU time = CPU clock cycles for a program × Clock cycle time = CPU clock cycles for a program / Clock rate
• The number of instructions executed is the instruction path length or instruction count (IC). If we know
the number of clock cycles and the instruction count, we can calculate the average number of clock
cycles per instruction (CPI), or its inverse, instructions per clock (IPC):
   CPI = CPU clock cycles for a program / Instruction count
• Clock cycles can therefore be expressed as IC × CPI, which gives
   CPU time = Instruction count × CPI × Clock cycle time
• Expanding the first formula into the units of measurement shows how the pieces fit together:
   (Instructions / Program) × (Clock cycles / Instruction) × (Seconds / Clock cycle) = Seconds / Program = CPU time
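As a small illustration, a Python sketch of the processor performance equation (the instruction count, CPI, and clock rate below are made-up values):

    # Made-up program and machine parameters.
    instruction_count = 2_000_000_000    # instructions executed (IC)
    cpi = 1.5                            # average clock cycles per instruction
    clock_rate_hz = 2.0e9                # 2 GHz clock

    # CPU time = IC x CPI x clock cycle time = IC x CPI / clock rate
    cpu_time_s = instruction_count * cpi / clock_rate_hz
    print(f"CPU time: {cpu_time_s:.2f} s")    # 1.50 s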

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Quantitative Principles of Computer Design
● Processor performance and CPU time equally depend on
• clock cycle (or rate)
• clock cycles per instruction
• instruction count
● Basic technologies are interdependent so hard to change one parameter
in complete isolation from others
• Clock cycle time: Hardware technology and organization
• CPI: Organization and instruction set architecture
• Instruction count: Instruction set architecture and compiler technology
Quantitative Principles of Computer Design
● Potential improvement techniques improve one component of the equation with
small or predictable impacts on the other two
• Calculate the total number of processor clock cycles, where IC_i is the number of times instruction i is
executed in a program and CPI_i is the average number of clocks per instruction for instruction i:
   CPU clock cycles = Σ_i (IC_i × CPI_i)
• The overall CPI can then be expressed using each individual CPI_i and the fraction of occurrences of that instruction in a program,
i.e., IC_i ÷ Instruction count:
   CPI = Σ_i (IC_i ÷ Instruction count) × CPI_i
• In practice, use measurements of instruction frequencies and of instruction CPI values, since CPI_i must
include pipeline and memory-system effects
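As a small illustration, a Python sketch of the per-class form of the CPI equation (the instruction mix and per-class CPI values below are made up):

    # Made-up instruction mix: class -> (fraction of instruction count, CPI for that class).
    mix = {
        "ALU":    (0.50, 1.0),
        "load":   (0.20, 2.0),
        "store":  (0.10, 2.0),
        "branch": (0.20, 1.5),
    }

    # Overall CPI = sum over classes of (IC_i / Instruction count) x CPI_i
    overall_cpi = sum(fraction * cpi for fraction, cpi in mix.values())
    print(f"Overall CPI: {overall_cpi:.2f}")    # 0.5*1 + 0.2*2 + 0.1*2 + 0.2*1.5 = 1.40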

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Quantitative Principles of Computer Design
● Example:

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


● Answer

Image Source: https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1


Reference
● Hennessy, J.L. and Patterson, D.A., 2019. Computer Architecture: A Quantitative Approach, 6th edition. Morgan Kaufmann.
