11. Compute - Part 2
Architecture
Infrastructure Building Blocks and Concepts
Compute – Part 2 (chapter 11)
Midrange systems
• Still used in places where high availability, performance, and security are
very important
Midrange systems - Architecture
• In a shared memory architecture, all CPUs in the system can access all
installed memory blocks
Changes made in memory by one CPU are immediately seen by all other CPUs
A shared bus connects all CPUs and all RAM
The I/O system is also connected to the interconnection network
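As a software-level illustration only (a minimal Python sketch, not tied to any particular server hardware): all threads of a process share one address space, so a value written by one thread is seen by the others, much like all CPUs in a shared memory architecture see the same RAM:

import threading

counter = 0                      # a single shared memory location
lock = threading.Lock()          # coordinates the concurrent writes

def worker() -> None:
    global counter
    for _ in range(100_000):
        with lock:               # without coordination, concurrent updates could be lost
            counter += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)                   # 400000: every thread saw and updated the same memory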
Midrange systems - Architecture
• The virtualization and operating systems using the server hardware must
be aware that components can be swapped on the fly
For instance, the operating system must be able to recognize that memory is added
while the server operates and must allow the use of this extra memory without the
need for a reboot
Parity memory
• To detect memory failures, parity bits can be used as the simplest form of
error detecting code
• Parity bits enable the detection of data errors
• They cannot correct the error, as it is unknown which bit has flipped
DATA        PARITY
1001 0110   0
1011 0110   1
0001 0110   0   -> ERROR: parity bit should have been 1!
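A minimal Python sketch of how the even-parity bit in the example above is computed and checked:

def parity_bit(word: int) -> int:
    # Even parity: the parity bit makes the total number of 1-bits even
    return bin(word).count("1") % 2

def check(word: int, stored_parity: int) -> bool:
    # True if no error is detected (a single flipped bit changes the parity)
    return parity_bit(word) == stored_parity

print(parity_bit(0b1001_0110))   # 0 -> four 1-bits, already even
print(parity_bit(0b1011_0110))   # 1 -> five 1-bits, parity bit restores the even count
print(check(0b0001_0110, 0))     # False -> error detected, but not correctable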
ECC memory
• ECC memory not only detects errors, but is also able to correct them
• ECC Memory chips use Hamming Code or Triple Modular Redundancy
(TMR) as the method of error detection and correction
• Memory errors are proportional to the amount of RAM in a computer as
well as the duration of operation
Since servers typically contain many GBs of RAM and are in operation 24 hours a day,
the likelihood of memory errors is relatively high and hence they require ECC memory
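A minimal Python sketch of the TMR idea named above: keep three copies of each word and take a per-bit majority vote on read, so a single flipped bit in one copy is corrected (real ECC DIMMs typically use Hamming-style codes rather than three full copies; this is only the simplest illustration of correction):

def tmr_read(a: int, b: int, c: int) -> int:
    # Per-bit majority vote over three copies of the same word:
    # a bit is 1 in the result only if it is 1 in at least two copies
    return (a & b) | (a & c) | (b & c)

stored = 0b1001_0110
copy_a = stored
copy_b = stored ^ 0b0000_1000    # simulate a single bit flip in one copy
copy_c = stored

print(bin(tmr_read(copy_a, copy_b, copy_c)))   # 0b10010110 -> the flipped bit is corrected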
Virtualization availability
• Moore's law, the doubling of the number of transistors on a chip roughly every two years, has continued for more than half a century now
• An Intel Alder Lake hybrid processor contains 100,000,000,000 (100
billion) transistors
• A 43 million-fold increase in 52 years’ time!
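A quick back-of-the-envelope check (Python) showing that the growth factor quoted above matches Moore's law's doubling of the transistor count roughly every two years:

import math

growth_factor = 43_000_000            # 43 million-fold increase, from the slide
years = 52

doublings = math.log2(growth_factor)  # about 25.4 doublings
print(round(years / doublings, 2))    # about 2.05 years per doubling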
CPU: Moore's law
Please note that the vertical scale of the chart is logarithmic instead of linear: each step represents a 10-fold increase in the number of transistors
CPU: Moore's law
• Moore’s law cannot continue forever, as there are physical limits to the
number of transistors a single chip can hold
Today, the connections used inside a high-end CPU have a physical width of 5 nm
(nanometer)
This is extremely small – about 24 atoms wide (the diameter of an atom is of the order of 0.21 nm)
CPU: Increasing CPU and memory
performance
• Various techniques have been invented to increase CPU performance, like:
Increasing the clock speed
Caching
Prefetching
Branch prediction
Pipelines
Use of multiple cores
CPU: Increasing clock speed
• Cache memory runs at full CPU speed (say 3 GHz), while main memory runs at the CPU's external clock speed (say 100 MHz, which is 30 times slower)
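A rough illustration of why the cache matters, using the example clock speeds above and an assumed 95% cache hit rate (a simplified weighted-average model, not a vendor figure):

cache_time  = 1 / 3e9      # one access at 3 GHz  -> about 0.33 ns
memory_time = 1 / 100e6    # one access at 100 MHz -> 10 ns
hit_rate = 0.95            # assumed fraction of accesses served from the cache

average = hit_rate * cache_time + (1 - hit_rate) * memory_time
print(f"{average * 1e9:.2f} ns versus {memory_time * 1e9:.2f} ns without a cache")
# about 0.82 ns versus 10.00 ns -> roughly 12 times faster on average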
CPU: Caching
CPU: Pipelines
• Early processors first fetched an instruction, decoded it, then executed the
fetched instruction, and wrote the result back before fetching the next
instruction and starting the process over again
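An idealized sketch of the gain a pipeline brings over the sequential fetch-decode-execute-write-back cycle described above (assuming one clock cycle per stage and no stalls):

stages = 4                 # fetch, decode, execute, write back
instructions = 1000

without_pipeline = instructions * stages        # each instruction occupies all stages alone
with_pipeline = stages + (instructions - 1)     # stages overlap once the pipeline is full

print(without_pipeline, with_pipeline)          # 4000 versus 1003 clock cycles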
CPU: Pipelines
• A superscalar CPU can process more than one instruction per clock tick
• This is done by simultaneously dispatching multiple instructions to
redundant functional units on the processor
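Extending the same idealized model to a hypothetical 2-wide superscalar CPU (assuming no instruction dependencies and one-cycle stages):

import math

stages = 4
instructions = 1000
issue_width = 2            # two instructions dispatched per clock tick

scalar_pipeline = stages + (instructions - 1)
superscalar = stages + math.ceil(instructions / issue_width) - 1

print(scalar_pipeline, superscalar)    # 1003 versus 503 clock cycles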
CPU: Multi-core CPUs
• The fastest commercial CPUs have been running between 3 GHz and 4 GHz for a number of years now
• Reasons:
High clock speeds make connections on the circuit board act as radio antennas
A frequency of 3 GHz means a wavelength of 10 cm (see the calculation below). When signals travel for more than a few cm on a circuit board, the signal gets out of phase with the clock
The CPU can heat up tremendously at certain spots, which could lead to a meltdown
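The 10 cm figure follows directly from dividing the speed of light by the clock frequency:

speed_of_light = 3e8       # metres per second, approximately
frequency = 3e9            # a 3 GHz clock

wavelength = speed_of_light / frequency
print(wavelength * 100, "cm")    # 10.0 cm -> traces of a few cm already matter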
CPU: Multi-core CPUs
• The physical machine needs to handle the disk and network I/O of all
running virtual machines
This can easily lead to an I/O performance bottleneck
Virtualization performance
• Databases generally require a lot of network bandwidth and high disk I/O
performance
This makes databases less suitable for a virtualized environment
Raw Device Mapping allows a virtual machine exclusive access to a physical storage
medium
This diminishes the performance hit of the hypervisor on storage to almost zero
Virtualization performance
• Some servers allow the detection of the physical opening of the server
housing
Such an event can be sent to a central management console using, for instance, SNMP traps (a sketch of sending such a trap follows below)
If possible, enable this feature to detect unusual activity
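A minimal sketch of sending such a trap to a management console, assuming the Python pysnmp library; the console address and the notification OID (the generic coldStart trap is used here only as a stand-in for a vendor-specific chassis-intrusion trap) are placeholders:

from pysnmp.hlapi import (
    sendNotification, SnmpEngine, CommunityData, UdpTransportTarget,
    ContextData, NotificationType, ObjectIdentity,
)

errorIndication, errorStatus, errorIndex, varBinds = next(
    sendNotification(
        SnmpEngine(),
        CommunityData('public', mpModel=1),        # SNMPv2c community string (placeholder)
        UdpTransportTarget(('192.0.2.10', 162)),   # management console address (placeholder)
        ContextData(),
        'trap',
        NotificationType(ObjectIdentity('1.3.6.1.6.3.1.1.5.1')),  # coldStart, as a stand-in
    )
)
if errorIndication:
    print(errorIndication)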
Data in use
• DMZ
Consider using separate physical machines that run all the virtual machines needed in
the DMZ
Virtualization security