

CS2100 Finals Cheatsheet

Computer Organisation (National University of Singapore)




Hazards and Resolution

Structural Hazards
- Simultaneous use of a hardware resource (e.g. the memory unit used by both a load and an instruction fetch)
- Not an issue for MIPS, as data memory and instruction memory are separate

Data Hazards
RAW (Read After Write)
- Register writes happen first, then reads (within the same cycle)
- Without data forwarding: if the dependent instruction is immediately after: 2-cycle delay; 2 instructions after: 1-cycle delay
- With data forwarding: if the dependent instruction depends on a lw: 1-cycle delay; otherwise: no delay
- Detect a load-use hazard when ID/EX.instruction == Load && (ID/EX.rt == IF/ID.rs || ID/EX.rt == IF/ID.rt)

Data Forwarding
- Resolves all RAW hazards except those on a lw (which needs one stall)
- sw after lw might not need to stall at all
- Forward from EX/MEM to the ALU for a distance-1 dependency
- Forward from MEM/WB to the ALU for a distance-2 dependency
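A minimal Python sketch of the load-use hazard check and the forwarding choice above; the field and register names are illustrative, not from the module:

# Sketch (assumed field names): load-use hazard detection and forwarding source,
# mirroring the conditions listed above. RegWrite/$zero checks are omitted.
def load_use_hazard(id_ex_is_load, id_ex_rt, if_id_rs, if_id_rt):
    """Stall one cycle when the instruction in ID needs the result of a lw in EX."""
    return id_ex_is_load and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)

def forward_source(ex_mem_rd, mem_wb_rd, needed_reg):
    """Prefer the newest value: EX/MEM (distance 1) over MEM/WB (distance 2)."""
    if needed_reg == ex_mem_rd:
        return "EX/MEM"
    if needed_reg == mem_wb_rd:
        return "MEM/WB"
    return "REGFILE"

# lw $t0, 0($s0) followed by add $t1, $t0, $t2 -> must stall
print(load_use_hazard(True, "$t0", "$t0", "$t2"))   # True
print(forward_source("$t3", "$t0", "$t0"))          # MEM/WB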
Control Hazards (Branching/Jumping)
- Without any control measures: 3-cycle delay
- Early branch resolution: move the branch decision from the EX/MEM stage to the ID stage – stall 1 cycle instead of 3 (may cause further stalls if the register is written by a previous instruction)
o Involved in a RAW with the previous instruction (not lw): stall 2 cycles
o Involved in a RAW with the previous instruction (lw): stall 3 cycles
o Not involved in any RAW: stall 1 cycle
- Branch prediction (not taken): guess the outcome and speculatively execute instructions; if the guess is wrong, flush the pipeline
o With early branching: 1 cycle occurs before instructions get flushed / not flushed
o Without early branching: 3 cycles occur before instructions get flushed / not flushed
- Delayed branch: X instructions following a branch are always executed regardless of the outcome (requires the compiler to re-order instructions into the branch-delay slot(s), or to add nop instructions). Try to find independent instructions from before the branch.
o With early branching: shift 1 instruction
o Without early branching: shift 3 instructions
Performance

Single Cycle
- One instruction = 1 clock cycle
- Clock cycle time: longest latency amongst all instructions (usually lw)
- Total execution time = Number of Instructions x Clock Cycle Time

Multi Cycle
- One stage = 1 clock cycle
- Cycle time decreases, clock frequency increases
- Different instructions take a variable number of clock cycles (since not all stages are needed)
- Clock cycle time: longest latency amongst all stages
- Total execution time = I x Average CPI x Clock Cycle Time

Pipeline
- One stage = 1 clock cycle
- Clock cycle time: longest latency amongst all stages + Td (time needed to write into the pipeline registers)
- Cycles needed for I instructions: I + N – 1 (N = number of stages)
- Total execution time = (I + N – 1) x Clock Cycle Time
- If N(instructions) >> N(stages): Speedup(pipeline) = Time(single cycle) / Time(pipeline) ≈ N
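A quick numeric sketch of the execution-time formulas above; the stage latencies and counts are made up:

# Assumed stage latencies in ns, purely for illustration.
stage_latency = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}
I = 1_000_000                 # instructions
N = len(stage_latency)        # pipeline stages
Td = 20                       # assumed pipeline-register overhead (ns)

single_cycle_ct = sum(stage_latency.values())    # lw needs every stage
pipeline_ct = max(stage_latency.values()) + Td   # longest stage + register overhead

t_single = I * single_cycle_ct                   # I x cycle time
t_pipe = (I + N - 1) * pipeline_ct               # (I + N - 1) x cycle time
print(t_single / t_pipe)                         # ~3.6x here; bounded above by N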
Pipelining (Pipeline register contents)
- IF/ID: instruction from memory & PC + 4
- ID/EX: data read from the register file, 32-bit sign-extended Imm, & PC + 4
- EX/MEM: (PC + 4) + (Imm * 4), ALU result, isZero signal & RD2 from the register file
- MEM/WB: ALU result, memory read data & write register data (passed through all pipeline stages)

Performance
Performance = 1 / Response Time
Speedup n of X over Y: n = Performance_X / Performance_Y = Execution Time_Y / Execution Time_X
CPU Time = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)
Factors affecting performance: a different compiler (affects instructions per program), a different ISA (affects CPI)
Cannot use CPI alone to determine performance/time – use total time!
Amdahl's Law (performance is limited by the portion of the program that is not sped up)
P: fraction of program time that can be improved; if that portion is sped up by a factor k, Speedup_overall = 1 / ((1 – P) + P/k)
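A quick sketch plugging made-up numbers into the CPU-time equation and Amdahl's Law:

# Amdahl's Law: overall speedup when a fraction P of the time is sped up by k.
def amdahl(P, k):
    return 1 / ((1 - P) + P / k)

print(amdahl(0.8, 10))   # ~3.57x, even though 80% of the program runs 10x faster
print(amdahl(0.8, 1e9))  # ~5x ceiling: limited by the untouched 20%

# CPU time = instruction count x CPI x clock period (assumed values).
insts, cpi, clock_hz = 2_000_000, 1.5, 1e9
print(insts * cpi / clock_hz)   # 0.003 seconds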
Cache (1 GiB = 2^30 bytes, 1 KiB = 2^10 bytes)
- Temporal locality: the same item tends to be referenced again soon
- Spatial locality: nearby items tend to be referenced soon
- Hit rate: fraction of memory accesses that hit in the cache
- Average access time = (hit rate) x (hit time) + (1 – hit rate) x (miss penalty)
- Cache block/line: the smallest unit of transfer between main memory and the cache
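The average access time formula as a one-line sketch, with assumed timings:

# Assumed: 1 ns hit time, 50 ns miss penalty.
def avg_access_time(hit_rate, hit_time, miss_penalty):
    return hit_rate * hit_time + (1 - hit_rate) * miss_penalty

print(avg_access_time(0.95, 1, 50))   # 3.45 ns: a 5% miss rate more than triples the 1 ns hit time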
Types of misses
- Cold/Compulsory: the block has never been accessed before
- Conflict: the same index gets overwritten (direct-mapped & set-associative)
- Capacity: the cache cannot contain all the blocks needed (fully associative)

Write policy
- Write-through: write data to both the cache and main memory, using a write buffer to queue the memory writes
- Write-back: write data to the cache only; write to main memory when the block is evicted, tracked with a "dirty bit" on each cache block

Write miss policy
- Write allocate: load the block into the cache, then follow the write policy
- Write around: write directly to main memory
Direct Mapped Cache
- Per-block overhead: valid flag (1 bit) + tag length (initially, all valid flags are unset)
- Blocks in cache: 2^M, bytes per block: 2^N
- For each memory address val:
o Block (set) index = (val mod 2^(N+M)) // 2^N
o Word index (offset) = (val mod 2^N) // bytes_per_word
o Tag = val // 2^(N+M)
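A minimal sketch of that address split, assuming M = 7 index bits, N = 5 offset bits and 4-byte words:

# Split a byte address into tag / block index / word offset for a direct-mapped
# cache with 2**M blocks of 2**N bytes (all sizes here are assumed).
M, N, BYTES_PER_WORD = 7, 5, 4

def split(addr):
    index = (addr % 2 ** (N + M)) // 2 ** N
    word = (addr % 2 ** N) // BYTES_PER_WORD
    tag = addr // 2 ** (N + M)
    return tag, index, word

print(split(0x0000ABCD))                  # (10, 94, 3)
print(split(0x0000ABCD + 2 ** (N + M)))   # (11, 94, 3): same index, different tag -> conflict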
Set-Associative Cache
- A block maps to a unique set of N possible cache locations
- N-way SAC → N cache blocks per set
- Bytes per block: 2^M
- Cache blocks = Size_cache / Size_block
- Sets = Cache Blocks / N (a power of two, which gives the number of set-index bits)

Fully-Associative Cache
- A block can be placed anywhere, but all blocks must be searched
- No conflict misses anymore; capacity misses = total misses – cold misses

Cache Performance
Larger block trade-off:
- Spatial locality advantage (hit rate increases)
- Miss penalty increases due to loading more data
- Temporal locality disadvantage beyond a certain limit (miss rate increases)
Rule of thumb: a direct-mapped cache of size N has almost the same miss rate as a 2-way set-associative cache of size N/2
- Cold/compulsory misses do not depend on size or associativity
- For the same cache size, conflict misses decrease with increasing associativity
- Conflict misses are 0 for a FA cache
- For the same cache size, capacity misses do not depend on associativity
- Capacity misses decrease with increasing cache size

Block replacement policy
- Least recently used (LRU): the usual policy, but hard to track
- First in first out (FIFO) – with a second-chance variant
- Random replacement (RR)
- Least frequently used (LFU)
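A tiny sketch of LRU replacement for a single 2-way set (the tags are arbitrary):

# LRU within one set: keep tags ordered from least to most recently used.
from collections import OrderedDict

class LRUSet:
    def __init__(self, ways=2):
        self.ways, self.blocks = ways, OrderedDict()

    def access(self, tag):
        if tag in self.blocks:                # hit: mark as most recently used
            self.blocks.move_to_end(tag)
            return "hit"
        if len(self.blocks) == self.ways:     # miss on a full set: evict the LRU tag
            self.blocks.popitem(last=False)
        self.blocks[tag] = None
        return "miss"

s = LRUSet()
print([s.access(t) for t in (1, 2, 1, 3, 2)])   # ['miss', 'miss', 'hit', 'miss', 'miss']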
Boolean Algebra
Precedence: NOT > AND > OR
- Identity: A + 0 = A and A · 1 = A
- Complement: A + A' = 1 and A · A' = 0
- Commutative: A + B = B + A and A · B = B · A
- Associative: A + (B + C) = (A + B) + C and A · (B · C) = (A · B) · C
- Distributive: A + (B · C) = (A + B) · (A + C) and A · (B + C) = (A · B) + (A · C)
- Duality (not a real law): if we flip the AND/OR operators and flip the operands (0 and 1), the Boolean equation still holds
- Idempotency: X + X = X and X · X = X
- One/Zero Element: X + 1 = 1 and X · 0 = 0
- Involution: (X')' = X
- Absorption: X + (X · Y) = X and X · (X + Y) = X
- Absorption (variant): X + (X' · Y) = X + Y and X · (X' + Y) = X · Y
- De Morgan's (can be used on > 2 variables): (X · Y)' = X' + Y' and (X + Y)' = X' · Y'
- Consensus: (X · Y) + (X' · Z) + (Y · Z) = (X · Y) + (X' · Z) and (X + Y) · (X' + Z) · (Y + Z) = (X + Y) · (X' + Z)
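Any of these identities can be brute-force checked over all input combinations; a short sketch for the consensus theorem:

# Brute-force check of the consensus theorem over every assignment of X, Y, Z.
from itertools import product

for x, y, z in product([0, 1], repeat=3):
    lhs = (x & y) | ((1 - x) & z) | (y & z)   # X·Y + X'·Z + Y·Z
    rhs = (x & y) | ((1 - x) & z)             # X·Y + X'·Z
    assert lhs == rhs
print("consensus theorem holds for all 8 cases")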
Logic Gates
Complete set of logic: any set of gates sufficient for building any Boolean function.
- e.g. {AND, OR, NOT}
- e.g. {NAND} (self-sufficient / universal gate) = {Negative-OR}
- e.g. {NOR} (self-sufficient / universal gate) – the output is 1 only when both inputs are 0
With negated outputs, use NAND to simulate OR and NOR to simulate AND.

Logic Circuits
Combinational circuit: each output depends entirely on the present inputs
Sequential circuit: each output depends on both the present inputs and the state
• Half-adder: C = X·Y, S = X ⊕ Y
• Full-adder: Cout = X·Y + (X ⊕ Y)·Cin, S = X ⊕ (Y ⊕ Z) = (X ⊕ Y) ⊕ Z (Z = Cin)
• 4-bit parallel adder: built by cascading 4 full-adders via their carries
• Adder-cum-subtractor: XOR each Y input with S (S = 0/1 for add/subtract) and pass S in as C-in (X – Y = X + (1s-complement of Y) + 1)
• Magnitude comparator: input: 2 unsigned values A and B; output: "A > B", "A = B", "A < B"
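A bit-level sketch of the full-adder equations above and the 4-bit ripple-carry cascade:

# Full adder per the equations above, then a 4-bit ripple-carry adder.
def full_adder(x, y, cin):
    s = x ^ y ^ cin                      # S = (X xor Y) xor Cin
    cout = (x & y) | ((x ^ y) & cin)     # Cout = X·Y + (X xor Y)·Cin
    return s, cout

def ripple_add4(a_bits, b_bits, cin=0):  # bit lists are LSB first
    out = []
    for a, b in zip(a_bits, b_bits):
        s, cin = full_adder(a, b, cin)
        out.append(s)
    return out, cin                      # 4 sum bits + carry out

print(ripple_add4([1, 0, 1, 0], [1, 1, 0, 0]))   # 5 + 3 = ([0, 0, 0, 1], 0), i.e. 8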
Circuit Delays
• For each component, time = max(∀ t_input) + t_current_component
• Propagation delay of a ripple-carry parallel adder ∝ number of bits
ALU Build
SOP expression – implement using a 2-level AND-OR circuit or a 2-level NAND circuit
POS expression – implement using a 2-level OR-AND circuit or a 2-level NOR circuit
Minterms & Maxterms
- A minterm/maxterm of n variables is a product/sum term that contains n literals, one from each variable → n variables → 2^n minterms and 2^n maxterms
- Minterm: m0 = X'·Y'·Z'
- Maxterm: M0 = X + Y + Z
- m0' = M0
- Functions can be written as a sum of minterms or a product of maxterms
- The sum of 2 distinct maxterms is 1
- The product of 2 distinct minterms is 0
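A small sketch that lists the minterms of an arbitrary example function F(X, Y, Z) = X·Y + Z, i.e. its canonical sum-of-minterms form:

# Enumerate the minterms of an example function F(X, Y, Z) = X·Y + Z.
from itertools import product

def F(x, y, z):
    return (x & y) | z

minterms = [i for i, (x, y, z) in enumerate(product([0, 1], repeat=3)) if F(x, y, z)]
print(minterms)   # [1, 3, 5, 6, 7] -> F = sum of minterms m(1, 3, 5, 6, 7)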
K-map
- Implicant: a product term covering cells that are all '1' or 'X', with at least one '1'
- Prime implicant: an implicant that is not a subset of any other implicant
- Essential prime implicant: a prime implicant with at least one '1' that is not in any other prime implicant (must appear in the final equation)
- Simplified SOP expression – group the '1's on the K-map
- Simplified POS expression – find the SOP expression using the '0's on the K-map, then negate the resulting expression
- Grouping 2^n cells (only power-of-two sizes are allowed) eliminates n variables
- EPIs are counted only by checking 1s, not Xs
- K-maps help to obtain a canonical SOP, but might not give the simplest possible expression (use Boolean algebra for that)

MSI Components
(Input 0 is the least significant input!)
• Decoder (n-to-m-line decoder): converts binary data from n input lines to one of the m ≤ 2^n output lines (e.g. 2 x 4)
- Each output line represents a minterm
- Active high – generate the minterms and use OR on minterms to form a function; alternatively, use NOR on the maxterms
- Active low – AND the maxterms or NAND the minterms
- Can add an Enable signal
- Larger decoders can be constructed from smaller ones with an inverter (e.g. a 3 x 8 decoder built from 2 x 4 decoders)
• Encoder: the opposite of a decoder
- Exactly ONE input should be '1'
- If more than one input is switched on, the outputs are X (don't-care values)
- The position of the single active input line among the 2^n possibilities is coded as an n-bit code
- A priority encoder can deal with the garbage inputs by assigning priorities to the inputs
- Add a valid bit to deal with the case where nothing is switched on
• Demultiplexer:
- One input data line, N selection lines
- Directs data from the input to a selected output line among the 2^N possibilities
- Demultiplexer ≡ decoder with enable
• Multiplexer:
- Selects one of 2^n inputs onto a single output line, using n selection lines
- To implement a function of n variables, pass the variables to the n-bit selector and set the 2^n inputs to the appropriate constants from the truth table
- To implement a function of n + 1 variables, pass the first n variables to the n-bit selector and set each input appropriately to '0', '1', Z, or Z' (Z is the last variable)

Larger Components
- Remove a decoder that gives duplicate outputs (w.r.t. another decoder) by using an OR gate with the outputs from the first decoder and the enable input of the second.
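A behavioural sketch of a 2-to-4 decoder with enable and of a function realised on a 4-to-1 multiplexer; the truth-table constants come from an arbitrary example F(X, Y) = Σm(1, 2, 3):

# 2-to-4 decoder with enable: exactly one (minterm) output goes high.
def decoder2to4(s1, s0, enable=1):
    outputs = [0, 0, 0, 0]
    if enable:
        outputs[(s1 << 1) | s0] = 1
    return outputs

# 4-to-1 multiplexer: route input d[s] to the output.
def mux4to1(d, s1, s0):
    return d[(s1 << 1) | s0]

F_column = [0, 1, 1, 1]                  # truth-table column of F = Σm(1, 2, 3)
print(decoder2to4(1, 0))                 # [0, 0, 1, 0] -> minterm m2 is active
print([mux4to1(F_column, x, y) for x in (0, 1) for y in (0, 1)])   # [0, 1, 1, 1]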

Sequential Circuits
Self-correcting: any unused state can transit to a used state after a finite number of cycles
Synchronous: outputs change at specific times (governed by a clock)
Asynchronous: outputs change at any time

Multivibrator: a sequential circuit that operates/swings between the HIGH and LOW states
- Bistable: 2 stable states (e.g. latch, flip-flop)
- Monostable / one-shot: 1 stable state
- Astable: no stable state (e.g. clock)
Memory element: a device that can remember a value indefinitely, or change its value on command from its inputs. The same input does not always give the same output!
- Pulse-triggered: activated by +ve/−ve pulses (e.g. latch)
- Edge-triggered: activated by a rising/falling edge (e.g. flip-flop)

S-R latch ("Set-Reset"): the active-high version uses 2 cross-coupled NOR gates; the active-low version uses NAND gates

Gated S-R latch: outputs change only when EN is HIGH (input AND gates); the value is memorised when EN is LOW

Gated D latch ("Data"): can be built from a gated S-R latch (no invalid inputs)

• S-R flip-flop: Similar to gated S-R latch


• D (data) flip-flop: similar to the gated D latch (no invalid inputs)
• J-K flip-flop: J = "Set", K = "Reset"; toggles if both are HIGH
• T flip-flop (“Toggle”): J-K flip-flop with tied inputs

J-K Flip Flop: Q and Q’ fed back to NAND gates

T Flip Flop: Tie both inputs of J-K together
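A behavioural sketch of the J-K flip-flop's next-state rule on the active clock edge, with D and T as special cases:

# Next state of a J-K flip-flop on the active clock edge:
# hold, reset, set or toggle depending on (J, K); D and T are special cases.
def jk_next(q, j, k):
    if (j, k) == (0, 0):
        return q          # hold
    if (j, k) == (0, 1):
        return 0          # reset
    if (j, k) == (1, 0):
        return 1          # set
    return 1 - q          # toggle (J = K = 1)

def d_next(q, d):
    return d              # D flip-flop: output follows the data input

def t_next(q, t):
    return jk_next(q, t, t)   # T flip-flop: J-K with tied inputs

q = 0
for j, k in [(1, 0), (0, 0), (1, 1), (1, 1), (0, 1)]:
    q = jk_next(q, j, k)
print(q)   # 0, after set, hold, toggle, toggle, reset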

