Computer Architecture

Uploaded by

biel.fernandez07
Copyright
© All Rights Reserved
Computer Architecture

Computer organisation

Memory components
▪ D latch
o Stores the state value unless the clock input C is asserted
o When C is asserted, the value of input D replaces the value of Q

▪ Flip-flop
o The output is equal to the value of the stored state
o The internal state is changed only on clock edge
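
The difference between the two components can be sketched in Python. This is a minimal behavioural model, not a gate-level design; the class names are illustrative, and the flip-flop is assumed to trigger on the rising edge (the notes only say "clock edge").

```python
class DLatch:
    """Level-sensitive: Q follows D while C is asserted, otherwise Q holds."""

    def __init__(self):
        self.q = 0

    def update(self, c, d):
        if c:  # clock asserted: D replaces Q
            self.q = d
        return self.q


class DFlipFlop:
    """Edge-triggered: Q changes only on a rising clock edge (assumption)."""

    def __init__(self):
        self.q = 0
        self._prev_c = 0

    def update(self, c, d):
        if c and not self._prev_c:  # rising edge detected
            self.q = d
        self._prev_c = c
        return self.q


# The same sequence of (C, D) input pairs applied to both components.
signals = [(0, 1), (1, 1), (1, 0), (0, 1), (1, 1)]
latch, ff = DLatch(), DFlipFlop()
latch_trace = [latch.update(c, d) for c, d in signals]
ff_trace = [ff.update(c, d) for c, d in signals]
print(latch_trace)  # [0, 1, 0, 0, 1]
print(ff_trace)     # [0, 1, 1, 1, 1]
```

In the third step C stays high while D drops to 0: the latch follows D, whereas the flip-flop keeps the value it captured on the edge.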

▪ A register is a group of flip-flops that stores several bits
o An n-bit register consists of n flip-flops with n inputs, n outputs and 1 clock
o Various types of registers are available commercially

▪ In a shift register, the output of flip-flop i is connected to the input of flip-flop i+1
▪ A register file is an array of registers
o Each register can be read by supplying its register number
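
As a rough illustration of the shift register and the register file, the following Python sketch models both. The class names, the serial input and the 32-bit default width are assumptions made for the example.

```python
class ShiftRegister:
    def __init__(self, n):
        self.bits = [0] * n  # bits[i] is the output of flip-flop i

    def clock(self, serial_in):
        # On each clock edge, flip-flop i+1 takes the old output of flip-flop i
        # and flip-flop 0 takes the serial input.
        self.bits = [serial_in] + self.bits[:-1]
        return self.bits


class RegisterFile:
    def __init__(self, count, width=32):
        self.mask = (1 << width) - 1  # keep values within the register width
        self.regs = [0] * count

    def read(self, number):
        return self.regs[number]

    def write(self, number, value):
        self.regs[number] = value & self.mask


sr = ShiftRegister(4)
for bit in (1, 0, 1):
    sr.clock(bit)
print(sr.bits)  # [1, 0, 1, 0]

rf = RegisterFile(8)
rf.write(3, 42)
print(rf.read(3))  # 42
```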

▪ Random Access Memory


o Larger amounts of memory than registers
o Slower access than registers
o Organised as arrays of 2^m rows of n bits
▪ m bits needed to select a row
▪ Read Write Selector (RWS) control bit
• RWS = 0: the RAM reads the row at the given address and its contents are available on output O
• RWS = 1: the RAM writes input I to the row at the given address
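
The read/write behaviour described above can be sketched as a small Python model. The class and method names are hypothetical; only m, n, RWS, I and O come from the notes.

```python
class RAM:
    def __init__(self, m, n):
        self.rows = [0] * (2 ** m)     # 2^m rows
        self.addr_mask = (1 << m) - 1  # m address bits select a row
        self.data_mask = (1 << n) - 1  # each row holds n bits

    def access(self, address, rws, i=0):
        row = address & self.addr_mask
        if rws == 1:
            # Write: input I is stored at the addressed row.
            self.rows[row] = i & self.data_mask
            return None
        # Read: the addressed row's contents appear on output O.
        return self.rows[row]


ram = RAM(m=4, n=8)           # 16 rows of 8 bits
ram.access(5, rws=1, i=0xAB)  # write 0xAB to row 5
o = ram.access(5, rws=0)      # read row 5
print(hex(o))  # 0xab
```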
Memory hierarchy
Memory organisation

▪ Endianness
o The order in which the bytes of a multi-byte value are stored in memory

▪ Big-Endian
o The most significant byte is stored first (at the lowest memory address)
o Used in data networking and mainframes
o Motorola 68000 and PowerPC G5 are big-endian

▪ Little-Endian
o The least significant byte is stored first (at the lowest memory address)
o Used by the Intel x86 and AMD64 processor families and most microprocessors

▪ Some architectures support both


o E.g. Arm and IBM POWER support both in full; recent x86 and x86-64 processors have limited support (the movbe instruction)
Big endian

▪ The location address points to the big end of the number


o Like writing the bytes left-to-right

Little Endian

▪ The location address points to the little end of the number


o Like writing the bytes right-to-left
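
A short Python example may make the two layouts concrete. The value 0x12345678 is an arbitrary choice for illustration; index 0 of each bytes object corresponds to the lowest memory address.

```python
value = 0x12345678

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12

# Reading the bytes back requires using the same byte order.
restored = int.from_bytes(little, byteorder="little")
print(hex(restored))    # 0x12345678
```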
Endianness in Python

▪ Handling binary data


o stored in files
o or from network connections
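
One possible way to handle such binary data is the standard `struct` module, whose format strings carry an explicit byte-order prefix (`>` big-endian, `<` little-endian, `!` network order). The record layout below (a 32-bit magic number followed by a 16-bit port) is invented for the example, and `io.BytesIO` stands in for a real file or socket.

```python
import io
import struct
import sys

# Pack a hypothetical record as big-endian: uint32 magic + uint16 port.
record = struct.pack(">IH", 0xCAFEBABE, 80)

# Read it back from a byte stream using the same byte order.
stream = io.BytesIO(record)
magic, port = struct.unpack(">IH", stream.read(6))

# Decoding with the wrong byte order silently yields different numbers.
wrong = struct.unpack("<IH", record)

print(hex(magic), port)              # 0xcafebabe 80
print(hex(wrong[0]), hex(wrong[1]))  # 0xbebafeca 0x5000
print(sys.byteorder)                 # byte order of the machine running this
```

Stating the byte order explicitly, rather than relying on the native order, keeps the code portable across machines.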

Computer organisation

Central Processor Unit


Grouping operations together
Computation – programs

▪ Compute the sum of two vectors


o Vectors = data; data is stored in memory
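
The vector sum can be written as a plain loop, which makes the memory accesses visible: each iteration reads one element of each input vector and writes one element of the result. The function name and sample values are illustrative.

```python
def vector_sum(a, b):
    """Return c where c[i] = a[i] + b[i]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    c = [0] * len(a)
    for i in range(len(a)):
        c[i] = a[i] + b[i]  # two loads, one add, one store
    return c


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
c = vector_sum(a, b)
print(c)  # [11, 22, 33, 44]
```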

Data operations
Processors

▪ M Chips
o N cores/chip
o T threads/core

▪ LLC – last level cache memory

▪ What do we need?
o A program – sequence of instructions
▪ Or multiple sequences... if concurrent/parallel

o Data – operands should reach the instructions
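
With M chips, N cores per chip and T threads per core, the number of hardware threads in the system is M × N × T. A minimal sketch, using assumed sample values (2 chips, 24 cores, 2 threads per core):

```python
import os


def hardware_threads(chips, cores_per_chip, threads_per_core):
    """Total hardware threads: M * N * T."""
    return chips * cores_per_chip * threads_per_core


total = hardware_threads(chips=2, cores_per_chip=24, threads_per_core=2)
print(total)  # 96

# The number of logical CPUs the OS reports for the current machine
# (may be None if it cannot be determined).
print(os.cpu_count())
```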


Hardware Thread
▪ Each hardware thread independently...
o Fetches instructions*
o Decodes
o Issues load memory accesses*
o Executes*
o Stores results*

*When a core executes a single thread, that thread has all of the core's resources available:
- Memory bandwidth

- Functional units

▪ Multithreading
o Execute multiple threads in parallel
Software Thread

▪ The instruction flow of a running program. Every program has at least one thread.
o Single-threaded: one instruction flow
o Multi-threaded: executes multiple threads in parallel or concurrently
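
A multi-threaded program can be sketched with Python's `threading` module. Here each software thread sums its own slice of two vectors; the function names and the choice of four threads are assumptions. Note that in the standard CPython build the global interpreter lock means these threads run concurrently rather than truly in parallel for CPU-bound work, so the example shows the structure, not a speed-up.

```python
import threading


def worker(a, b, c, start, stop):
    # Each thread works on its own index range, so no locking is needed.
    for i in range(start, stop):
        c[i] = a[i] + b[i]


def threaded_vector_sum(a, b, num_threads=4):
    n = len(a)
    c = [0] * n
    chunk = (n + num_threads - 1) // num_threads  # ceiling division
    threads = []
    for t in range(num_threads):
        start = t * chunk
        stop = min(start + chunk, n)
        th = threading.Thread(target=worker, args=(a, b, c, start, stop))
        threads.append(th)
        th.start()
    for th in threads:
        th.join()  # wait until every thread has finished
    return c


a = list(range(10))
b = [100] * 10
result = threaded_vector_sum(a, b)
print(result)  # [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]
```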

Hardware multithreading

▪ Each hardware thread independently...


o Fetches instructions*
o Decodes
o Issues load memory accesses*
o Executes*
o Stores results*

*When a core executes a single thread, that thread has all of the core's resources available:
- Memory bandwidth

- Functional units
Detailed memory access
Sample code

▪ Computing on vectors a, b, and c


▪ Accesses reference main memory locations, not cache locations
o Cache memories are transparently managed by the hardware
o Memory coherency: any read from any processor to a particular memory address returns the most recently written value to that address
o Memory consistency: ensures that writes to different memory addresses are seen in the correct order by all processors
Code generation details

Code execution details


Core details
▪ Instructions use registers to bring data to the thread
o Load/store instructions bring data from memory (also mov, add, mul...)
o Computation instructions use the ALUs to process data (add, mul...)
o Control instructions break the execution sequence (conditionally...)
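
The three instruction classes can be illustrated with a toy register machine in Python. The instruction names (`load`, `store`, `add`, `bnez`, `halt`) and their encoding as tuples are invented for this sketch and do not correspond to any real instruction set.

```python
def run(program, memory, num_regs=4):
    regs = [0] * num_regs
    pc = 0  # program counter
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "load":     # load rd, addr: memory -> register
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "store":  # store rs, addr: register -> memory
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "add":    # add rd, ra, rb: computation in the ALU
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "bnez":   # bnez rs, target: conditional branch
            rs, target = args
            if regs[rs] != 0:
                pc = target  # breaks the sequential execution
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Sum n + (n-1) + ... + 1, with n in memory[0] and the result in memory[2].
memory = [4, -1, 0]
program = [
    ("load", 0, 0),    # r0 = n
    ("load", 2, 1),    # r2 = -1
    ("add", 1, 1, 0),  # r1 = r1 + r0
    ("add", 0, 0, 2),  # r0 = r0 - 1
    ("bnez", 0, 2),    # loop back while r0 != 0
    ("store", 1, 2),   # memory[2] = r1
    ("halt",),
]
run(program, memory)
print(memory[2])  # 10
```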

Complete processor/memory system

▪ Most commonly, systems have two or more chips


o NUMA – Non-Uniform Memory Access
Example of multiprocessor motherboard
Software/hardware mapping

Current processor chips

▪ Intel Xeon E7 v4 family


o 14 nm technology
o 24 cores / hyperthreading (2), 2.2 – 3.4 GHz
o L3 cache 60 MB
o Max sockets supported: 8
o Max RAM 3.07 TB, 1866 MHz, 4 memory channels
o PCIe x4, x8, x16

▪ IBM Power 9
o 14 nm technology
o 24 cores / SMT (8), 3.0 – 4.0 GHz
o L1 caches 32+32 KB
o L2 cache 512 KB
o L3 cache 120 MB
o Max sockets supported: 4-8 and more
o Max RAM 2 TB, DDR4
o PCIe v4 x4, x8, x16
▪ Intel KNL – Xeon Phi 72x5
o 14 nm technology
o 72 cores, 1.5 – 1.6 GHz
o L2 cache 36 MB
o Max sockets supported: 1?
o Max RAM 384 GB, DDR4
o PCIe v3 x4, x8, x16

▪ ARM Cortex-A77
o 7 nm technology
o aarch64 – ARMv8-A
o 4-8 cores
o DynamIQ Technology – (big-LITTLE)

▪ ARM Cortex-A72 – A64FX (Fujitsu)


o 7 nm
o ARMv8.2
o 48 cores
o 512-bit SIMD Scalable Vector Extension (SVE)

▪ Apple M3
o 3 nm technology
o 4.05 GHz performance, 2.76 GHz efficiency
o aarch64 – ARMv8.6-A
o 4 performance cores + 4 efficiency cores
o L1 cache 192+128 KiB per performance core
o L1 cache 128+64 KiB per efficiency core
o L2 cache 16 MiB
o RAM 8-24 GB
o GPU 8-10 cores

Computer organisation
Input/Output components
▪ The I/O bus extends access to:
o Accelerators (GPUs, FPGAs)
o Disks
o Network
o Human-Machine Interface Peripherals

Accelerators
Access to accelerators/devices/peripherals

SATA and HMI Peripherals


Storage and file systems
Networking

▪ Send/receive information to
o Servers
o Network-attached disks

▪ Protocols
o Low-level – Ethernet packets
o High-level – TCP/IP

▪ Control is based on memory-mapped configuration registers
o Accessed from the OS
▪ Data transfers are based on DMA engines
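
The split between control and data transfer can be sketched with a toy device model in Python. Everything here is hypothetical: the `ToyNIC` class, its register offsets and the start bit are invented for illustration and do not describe any real network card. The "OS" configures the device by writing its registers, and the device then copies the data out of main memory itself, without the CPU moving each byte.

```python
class ToyNIC:
    # Hypothetical register offsets within the device's memory-mapped region.
    REG_ADDR, REG_LEN, REG_CTRL = 0x0, 0x4, 0x8
    CTRL_START = 1  # hypothetical "start transfer" bit

    def __init__(self, main_memory):
        self.memory = main_memory  # main memory shared with the CPU
        self.regs = {self.REG_ADDR: 0, self.REG_LEN: 0, self.REG_CTRL: 0}
        self.sent = []             # packets "transmitted" so far

    def write_reg(self, offset, value):
        # Control path: the OS writes configuration registers.
        self.regs[offset] = value
        if offset == self.REG_CTRL and value & self.CTRL_START:
            self._dma_transfer()

    def _dma_transfer(self):
        # Data path: the DMA engine reads the buffer directly from memory.
        start = self.regs[self.REG_ADDR]
        length = self.regs[self.REG_LEN]
        self.sent.append(bytes(self.memory[start:start + length]))
        self.regs[self.REG_CTRL] = 0  # clear the start bit when done


memory = bytearray(64)
memory[16:21] = b"hello"  # the OS places the data in main memory

nic = ToyNIC(memory)
nic.write_reg(ToyNIC.REG_ADDR, 16)  # where the buffer starts
nic.write_reg(ToyNIC.REG_LEN, 5)    # how many bytes to send
nic.write_reg(ToyNIC.REG_CTRL, ToyNIC.CTRL_START)
print(nic.sent)  # [b'hello']
```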


Virtual Machine (VM)
