Computer Architecture

Uploaded by

biel.fernandez07
Computer organisation

Memory components
▪ D latch
o Stores the state value while the clock input C is not asserted
o When C is asserted, the value of input D replaces the value of Q

▪ Flip-flop
o The output is equal to the value of the stored state
o The internal state changes only on a clock edge
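The difference between the two elements can be sketched with a small behavioural model. This is only an illustration, not a circuit description: the class names are hypothetical, and the flip-flop is assumed to trigger on the rising edge of C.

```python
class DLatch:
    """Level-sensitive: Q follows D while C is asserted, otherwise holds."""

    def __init__(self):
        self.q = 0

    def update(self, c, d):
        if c:                      # clock asserted: D replaces Q
            self.q = d
        return self.q


class DFlipFlop:
    """Edge-triggered: Q changes only on a rising clock edge (assumption)."""

    def __init__(self):
        self.q = 0
        self._prev_c = 0

    def update(self, c, d):
        if c and not self._prev_c:  # rising edge detected
            self.q = d
        self._prev_c = c
        return self.q


# The same (C, D) input sequence is fed to both elements.
inputs = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 1)]
latch, ff = DLatch(), DFlipFlop()
latch_trace = [latch.update(c, d) for c, d in inputs]
ff_trace = [ff.update(c, d) for c, d in inputs]
print(latch_trace)  # [0, 1, 0, 0, 1]
print(ff_trace)     # [0, 1, 1, 1, 1]
```

In the third step C is still asserted and D drops to 0: the latch follows D, while the flip-flop keeps the value captured on the edge.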

▪ A register is a group of flip-flops that stores several bits
o An n-bit register consists of n flip-flops with n inputs, n outputs and 1 clock
o Various types of registers are available commercially

▪ In a shift register, the output of flip-flop i is connected to the input of flip-flop i+1
▪ A register file is an array of registers
o Each register can be read by supplying its register number
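A minimal sketch of a shift register and a register file, assuming one call to `clock()` stands for one clock edge. The class and method names are hypothetical and chosen only for this illustration.

```python
class ShiftRegister:
    def __init__(self, n):
        self.bits = [0] * n          # bits[i] is the state of flip-flop i

    def clock(self, serial_in):
        # On a clock edge, flip-flop i+1 takes the old output of flip-flop i
        # and flip-flop 0 takes the serial input.
        self.bits = [serial_in] + self.bits[:-1]
        return self.bits


class RegisterFile:
    def __init__(self, count):
        self.regs = [0] * count      # an array of registers

    def read(self, number):
        return self.regs[number]     # select a register by its number

    def write(self, number, value):
        self.regs[number] = value


sr = ShiftRegister(4)
for bit in (1, 0, 1, 1):
    sr.clock(bit)
print(sr.bits)       # [1, 1, 0, 1]

rf = RegisterFile(8)
rf.write(3, 42)
print(rf.read(3))    # 42
```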

▪ Random Access Memory (RAM)
o Larger amounts of memory than registers
o Slower access than registers
o Organised as arrays of 2^m rows of n bits
▪ m bits are needed to select a row
▪ Read Write Selector (RWS) control bit
• RWS = 0: the RAM reads the row at the given address and its contents are available on O
• RWS = 1: the RAM writes I to the row at the given address
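The RAM organisation described above can be modelled in a few lines. This is a sketch under stated assumptions: the `RAM` class and its `access` method are hypothetical names, and a written value is truncated to n bits.

```python
class RAM:
    def __init__(self, m, n):
        self.m, self.n = m, n
        self.rows = [0] * (2 ** m)   # 2^m rows, each holding n bits

    def access(self, address, rws, i=0):
        if not 0 <= address < 2 ** self.m:
            raise ValueError("address does not fit in m bits")
        if rws == 1:
            # RWS = 1: write input I, keeping only the low n bits
            self.rows[address] = i & ((1 << self.n) - 1)
            return None
        # RWS = 0: the contents of the row appear on output O
        return self.rows[address]


ram = RAM(m=4, n=8)                  # 16 rows of 8 bits
ram.access(0b1010, rws=1, i=0x1FF)   # 9-bit value, truncated to 8 bits
value = ram.access(0b1010, rws=0)
print(len(ram.rows), value)          # 16 255
```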
Memory hierarchy
Memory organisation

▪ Endianness
o The order in which the bytes of a multi-byte value are stored in memory

▪ Big-endian
o The most significant byte is stored first (at the lowest memory address)
o Used in data networking and mainframes
o Motorola 68000 and PowerPC G5 are big-endian

▪ Little-endian
o The least significant byte is stored first (at the lowest memory address)
o Intel x86 and AMD64 processor families and most microprocessors

▪ Some architectures support both
o E.g. Arm and IBM POWER support both in full; recent x86 and x86-64 processors have limited support (movbe)
Big-endian

▪ The location address points to the big end of the number (the most significant byte)
o Like writing the bytes left-to-right

Little-endian

▪ The location address points to the little end of the number (the least significant byte)
o Like writing the bytes right-to-left
Endianness in Python

▪ Endianness matters when handling binary data
o Stored in files
o Received from network connections
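As an illustration of both byte orders, the sketch below uses only the Python standard library (`int.to_bytes`, `int.from_bytes`, `struct` and `sys.byteorder`). The value 0x12345678 is an arbitrary example.

```python
import struct
import sys

value = 0x12345678

# The same 32-bit value laid out in both byte orders
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")
print(big.hex())      # 12345678
print(little.hex())   # 78563412

# struct format prefixes: '>' big-endian, '<' little-endian,
# '!' network byte order (big-endian)
packed = struct.pack("!I", value)
(unpacked,) = struct.unpack("!I", packed)

# Byte order of the machine running this script: 'little' or 'big'
print(sys.byteorder)
```

When reading binary data from a file or a socket, the byte order has to be stated explicitly, as above, rather than relying on the native order of the machine.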

Computer organisation

Central Processor Unit


Grouping operations together
Computation – programs

▪ Compute the sum of two vectors
o The vectors are the data; data is stored in memory
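A minimal sketch of the vector sum in Python, with the vectors held as lists. The function name `vector_sum` and the sample values are hypothetical.

```python
def vector_sum(a, b):
    """Return the element-wise sum of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Each iteration reads one element of a and b and produces one of c
    return [x + y for x, y in zip(a, b)]


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
c = vector_sum(a, b)
print(c)  # [11, 22, 33, 44]
```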

Data operations
Processors

▪ M Chips
o N cores/chip
o T threads/core

▪ LLC – last level cache memory

▪ What do we need?
o A program – sequence of instructions
▪ Or multiple sequences... if concurrent/parallel

o Data – operands should reach the instructions
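The M chips, N cores per chip and T threads per core listed above multiply to give the number of hardware threads in the system. The sketch below shows the arithmetic; the function name and the sample figures (2 chips, 24 cores, 2 threads) are assumptions chosen for illustration.

```python
import os


def hardware_threads(chips, cores_per_chip, threads_per_core):
    """Total hardware threads = M chips * N cores/chip * T threads/core."""
    return chips * cores_per_chip * threads_per_core


total = hardware_threads(chips=2, cores_per_chip=24, threads_per_core=2)
print(total)            # 96

# Number of logical CPUs reported for the machine running this script
# (may be None if it cannot be determined)
print(os.cpu_count())
```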


Hardware Thread
▪ Each hardware thread independently...
o Fetches instructions*
o Decodes
o Issues memory load accesses*
o Executes*
o Stores results*

* When a single thread executes per core, that thread has all the core resources available:
- Memory bandwidth
- Functional units

▪ Multithreading
o Execute multiple threads in parallel
Software Thread

▪ The instruction flow of a given running program. Any program has at least one thread.
o Single-threaded: the program has exactly one thread

▪ Multi-threaded: the program executes multiple threads in parallel or concurrently
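A minimal sketch of a multi-threaded program using Python's standard `threading` module. The function `partial_sum` and the choice of 4 threads are assumptions made for this example.

```python
import threading


def partial_sum(data, start, end, results, index):
    # Each software thread sums its own slice and writes to its own slot
    results[index] = sum(data[start:end])


data = list(range(1000))
n_threads = 4
chunk = len(data) // n_threads
results = [0] * n_threads

threads = []
for t in range(n_threads):
    start = t * chunk
    end = len(data) if t == n_threads - 1 else start + chunk
    th = threading.Thread(target=partial_sum,
                          args=(data, start, end, results, t))
    threads.append(th)
    th.start()

for th in threads:
    th.join()               # wait for every thread to finish

total = sum(results)
print(total)  # 499500
```

In the default CPython build, the global interpreter lock makes these threads run concurrently rather than truly in parallel for CPU-bound work, so the example shows the structure of a multi-threaded program and not a speed-up.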

Hardware multithreading

Detailed memory access
Sample code

▪ Computing on vectors a, b, and c
▪ Accesses reference main memory locations, not cache locations
o Cache memories are transparently managed by the hardware
o Memory coherency: any read from any processor to a particular memory address returns the most recently written value to that address
o Memory consistency: ensures that writes to different memory addresses are seen in the correct order by all processors
Code generation details

Code execution details


Core details
▪ Instructions use registers to bring data to the thread
o Load/store instructions bring data from memory (mov, add, mul... can also access memory)
o Computation instructions use the ALUs to process data (add, mul...)
o Control instructions break the execution sequence (conditionally...)
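The three kinds of instruction can be illustrated with a toy register machine. This is a sketch only: the instruction names (`li`, `load`, `store`, `add`, `addi`, `blt`, `halt`) and the 4-register layout are hypothetical and do not correspond to any real instruction set.

```python
def run(program, memory):
    regs = [0] * 4                        # register file
    pc = 0                                # program counter
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "li":                    # load an immediate into a register
            rd, imm = args
            regs[rd] = imm
        elif op == "load":                # memory -> register
            rd, ra = args
            regs[rd] = memory[regs[ra]]
        elif op == "store":               # register -> memory
            rs, ra = args
            memory[regs[ra]] = regs[rs]
        elif op == "add":                 # computation in the ALU
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "addi":
            rd, rs, imm = args
            regs[rd] = regs[rs] + imm
        elif op == "blt":                 # control: conditional branch
            rs1, rs2, target = args
            if regs[rs1] < regs[rs2]:
                pc = target
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown instruction {op!r}")


# Sum memory[0..3] and store the result in memory[4]
program = [
    ("li", 0, 0),        # r0 = i = 0
    ("li", 1, 0),        # r1 = accumulator = 0
    ("li", 2, 4),        # r2 = n = 4
    ("load", 3, 0),      # r3 = memory[r0]
    ("add", 1, 1, 3),    # r1 = r1 + r3
    ("addi", 0, 0, 1),   # r0 = r0 + 1
    ("blt", 0, 2, 3),    # if r0 < r2, jump back to the load
    ("store", 1, 2),     # memory[r2] = r1
    ("halt",),
]
memory = [5, 7, 11, 13, 0]
regs = run(program, memory)
print(memory[4])  # 36
```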

Complete processor/memory system

▪ Most commonly, systems have two or more chips
o NUMA – Non-Uniform Memory Access
Example of multiprocessor motherboard
Software/hardware mapping

Current processor chips

▪ Intel Xeon E7 v4 family
o 14 nm technology
o 24 cores / Hyper-Threading (2), 2.2 – 3.4 GHz
o L3 cache 60 MB
o Max 8 sockets supported
o Max RAM 3.07 TB, 1866 MHz, 4 memory channels
o PCIe x4, x8, x16

▪ IBM POWER9
o 14 nm technology
o 24 cores / SMT (8), 3.0 – 4.0 GHz
o L1 caches 32+32 KB
o L2 cache 512 KB
o L3 cache 120 MB
o Max 4-8 or more sockets supported
o Max RAM 2 TB DDR4
o PCIe v4 x4, x8, x16

▪ Intel KNL – Xeon Phi 72x5
o 14 nm technology
o 72 cores, 1.5 – 1.6 GHz
o L2 cache 36 MB
o Max 1 socket supported?
o Max RAM 384 GB DDR4
o PCIe v3 x4, x8, x16

▪ ARM Cortex-A77
o 7 nm technology
o aarch64 – ARMv8-A
o 4-8 cores
o DynamIQ technology (big.LITTLE)

▪ Fujitsu A64FX (Arm-based)
o 7 nm technology
o ARMv8.2-A
o 48 cores
o 512-bit SIMD Scalable Vector Extension (SVE)

▪ Apple M3
o 3 nm technology
o 4.05 GHz performance, 2.76 GHz efficiency
o aarch64 – ARMv8.6-A
o 4 performance cores + 4 efficiency cores
o L1 cache 192+128 KiB per performance core
o L1 cache 128+64 KiB per efficiency core
o L2 cache 16 MiB
o RAM 8-24 GB
o GPU 8-10 cores

Computer organisation
Input/Output components
▪ The I/O bus extends access to
o Accelerators (GPUs, FPGAs)
o Disks
o Network
o Human-Machine Interface Peripherals

Accelerators
Access to accelerators/devices/peripherals

SATA and HMI peripherals


Storage and file systems
Networking

▪ Send/receive information to
o Servers
o Network-attached disks

▪ Protocols
o Low-level – Ethernet packets
o High-level – TCP/IP

▪ Control is based on memory-mapped configuration registers
o Accessed from the OS

▪ Data transfers are based on DMA engines
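As an illustration of the low-level protocol mentioned above, the sketch below builds the 14-byte header of an Ethernet II frame (destination MAC, source MAC, EtherType) in network byte order. The helper name `ethernet_header` and the MAC addresses are hypothetical; 0x0800 is the EtherType for IPv4.

```python
import struct


def ethernet_header(dst_mac, src_mac, ethertype):
    """Pack an Ethernet II header: 6-byte dst, 6-byte src, 2-byte type."""
    dst = bytes.fromhex(dst_mac.replace(":", ""))
    src = bytes.fromhex(src_mac.replace(":", ""))
    # '!' selects network byte order (big-endian) with no padding
    return struct.pack("!6s6sH", dst, src, ethertype)


header = ethernet_header("ff:ff:ff:ff:ff:ff",   # broadcast destination
                         "02:00:00:00:00:01",   # made-up source address
                         0x0800)                # payload is IPv4
print(len(header))    # 14
print(header.hex())   # ffffffffffff0200000000010800
```

On real hardware the network interface and the OS driver build and transmit these headers; the example only shows the byte layout.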


Virtual Machine (VM)
