Module 10. Compute
Introduction
• Compute is an
umbrella term
for computers located
in the datacenter
• Physical machines or
virtual machines
• Three groups of compute
systems:
• Mainframes
• Midrange systems
• x86 servers
Introduction
• Physical computers contain:
• Power supplies
• Central Processing Units
• A Basic Input/Output System
• Memory
• Expansion ports
• Network connectivity
• A keyboard, mouse, and monitor
History
• The British Colossus
computer, created during
World War II, was the
world's first programmable
computer
• Information about it was
classified under British
secrecy laws
• The first publicly recognized
general purpose computer
was the ENIAC (Electronic
Numerical Integrator And
Computer)
• The ENIAC was designed in
1943 and was financed by the
United States Army in the
midst of World War II
History
• The ENIAC:
• Could perform 5,000 operations per second
• Used more than 17,000 vacuum tubes
• Got its input using an IBM punched card reader
• Punched cards were used for output
• In the 1960s computers started to be built using
transistors instead of vacuum tubes
• Smaller
• Faster
• Cheaper to produce
• Required less power
• Much more reliable
History
• The transistor based computers were followed in the
1970s by computers based on integrated circuit (IC)
technology
• ICs are small chips that contain a set of transistors
providing standardized building blocks like AND gates, OR
gates, counters, adders, and flip-flops
• By combining building blocks, CPUs and memory circuits
could be created
• Microprocessors decreased size and cost of
computers even further
• Increased their speed and reliability
• In the 1980s microprocessors were cheap enough to be
used in personal computers
Compute building blocks
Computer housing
• Originally, computers were
stand-alone complete
systems, called pedestal or
tower computers
• Placed on the datacenter floor
• Most x86 servers and
midrange systems are now:
• Rack mounted
• Blade servers
• Blade servers are less
expensive than rack mounted
servers
• They use the enclosure’s
shared components like power
supplies and fans
Computer housing
• A blade enclosure typically hosts from 8 to 16 blade
servers
• Blade enclosure provides:
• Shared redundant power supplies for all blades
• Shared backplane to connect all blades
• Redundant network switches to connect the blades’
Ethernet interfaces providing redundant Ethernet
connections to other systems
• Redundant SAN switches to connect the HBA interfaces on
the blade servers providing dual redundant Fibre Channel
connections to other systems
• A management module to manage the enclosure and the
blades in it
Computer housing
• The amount of wiring in a blade server setup is
substantially reduced when compared to traditional
server racks
• Fewer possible points of failure
• Lower initial deployment costs
• Enclosures are often not only used for blade servers,
but also for storage components like disks,
controllers, and SAN switches
Processors
• In a computer, the Central Processing Unit (CPU) – or
processor – executes a set of instructions
• A CPU is the electronic circuitry that carries out the
instructions of a computer program by performing the
basic arithmetic, logical, control and input/output (I/O)
operations specified by the instructions
• Today’s processors contain billions of transistors and
are extremely powerful
Processor instructions
• A CPU can perform a fixed number of instructions such as
ADD, SHIFT BITS, MOVE DATA, and JUMP TO CODE
LOCATION, called the instruction set
• A program created using CPU instructions is referred to
as machine code
• Each instruction is associated with an English-like
mnemonic
• Easier for people to remember
• Set of mnemonics is called the assembly language
• For example:
• Binary code for the ADD WITH CARRY
• Machine code instruction: 10011101
• Mnemonic: ADC
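To make the mapping between mnemonics and machine code concrete, here is a toy assembler sketch in Python; only the ADC encoding (10011101) comes from the example above, the other opcodes are made-up placeholders.

```python
# Toy opcode table: only ADC's encoding (10011101 = 0x9D) is taken from the
# example above; ADD and JMP are hypothetical placeholders.
OPCODES = {"ADC": 0b10011101, "ADD": 0b10011100, "JMP": 0b01001100}

def assemble(mnemonics):
    """Translate a list of assembly mnemonics into machine code bytes."""
    return bytes(OPCODES[m] for m in mnemonics)

print(assemble(["ADD", "ADC"]).hex())   # -> '9c9d'
```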
Processors - programming
• Assembly language is the lowest-level programming
language for computers
• Higher level programming languages are much
more human friendly
• C#
• Java
• Python
• Programs written in these languages are translated
to assembly code before they can run on a specific
CPU
• This compiling is done by a high-level language
compiler
Processors - speed
• A CPU needs a high frequency clock to operate,
generating so-called clock ticks or clock cycles
• Each machine code instruction takes one or more clock
ticks to execute
• An ADD instruction typically costs 1 tick to compute
• The speed at which the CPU operates is defined in
GHz (billions of clock ticks per second)
• A single core of a 2.4 GHz CPU can perform 2.4 billion
additions in 1 second
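As a back-of-the-envelope check of the figure above, a small Python sketch (the one-tick ADD is the assumption stated on the slide):

```python
clock_hz = 2.4e9                  # a 2.4 GHz core: 2.4 billion clock ticks per second
ticks_per_add = 1                 # an ADD typically takes one clock tick
print(clock_hz / ticks_per_add)   # 2.4e9 additions per second
print(1 / clock_hz)               # ~4.2e-10 s: the duration of a single tick
```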
Processors – word size
• Each CPU is designed to handle data in chunks,
called words, with a specific size
• The first CPUs had a word size of 4 bits
• Today, most CPUs have a word size of 64 bits
• The word size is reflected in many aspects of a
CPU's structure and operation:
• The majority of the internal memory registers are the
size of one word
• The largest piece of data that can be transferred to and
from the working memory in a single operation is a word
• With a 64-bit word, a CPU can in theory address 2^64 bytes
of memory (about 16.8 million TB, or 16 EB)
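A quick calculation of the 64-bit address space, using binary units:

```python
word_size = 64
address_space = 2 ** word_size    # bytes addressable with a 64-bit address
print(address_space)              # 18,446,744,073,709,551,616 bytes
print(address_space / 2 ** 40)    # ~16.8 million TB (i.e. 16 EB)
```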
Intel x86 processors
• Intel CPUs became the de-facto standard for many
computer architectures
• The original PC used a 4.77 MHz 16-bit 8088 CPU
• A few years later, Intel produced the 32-bit 80386 and
the 80486 processors
• Since these names all ended with the number 86,
the generic architecture was referred to as x86
• In 2017, the latest Intel x86 model is the 22-core
E5-2699A Xeon Processor, running at 2.4 GHz
AMD x86 processors
• Advanced Micro Devices, Inc. (AMD) is the second-
largest global supplier of microprocessors based on
the x86 architecture
• In 1982, AMD signed a contract with Intel, becoming a
licensed second-source manufacturer of 8086 and 8088
processors for IBM
• Intel cancelled the licensing contract in 1986
• AMD still produces x86 compatible CPUs, forcing Intel to
keep innovating and to keep CPU prices relatively low
• In 2017, the latest model is the 16-core AMD
Opteron 6386 SE CPU, running at 2.8 GHz
Itanium and x86-64 processors
• The Itanium processor line was a family of 64-bit high-
end CPUs meant for high-end servers and workstations
• Not based on the x86 architecture
• HP was the only company to actively produce Itanium based
systems, running HP-UX and OpenVMS
• In 2005, AMD released the K8 core processor
architecture as an answer to Intel’s Itanium architecture
• The K8 included a 64-bit extension to the x86 instruction set
• Later, Intel adopted AMD's instruction set as an
extension to its x86 processor line, called x86-64
• Today, the x86-64 architecture is used in all Intel and
AMD processors
ARM processors
• The ARM (Advanced RISC Machine) is the most used
CPU in the world
• In 2013, 10 billion ARM processors were shipped,
running on:
• 95% of smartphones
• 90% of hard disk drives
• 40% of digital televisions and set-top boxes
• 15% of microcontrollers
• 20% of mobile computers
• The CPU is produced by a large number of
manufacturers under license of ARM
• Since 2016, ARM has been owned by the Japanese
telecommunications company SoftBank Group
Oracle SPARC processors
• In 1986, Sun Microsystems started to produce the
SPARC processor series for their Solaris UNIX based
systems
• The SPARC architecture is fully open and non-
proprietary
• A true open source hardware design
• Any manufacturer can get a license to produce a SPARC CPU
• Oracle bought Sun Microsystems in 2010
• SPARC processors are still used by Oracle in their
Exadata and Exalogic products
• In 2017, the latest model is the 32-core SPARC M7 CPU,
running at 4.1 GHz
IBM POWER processors
• POWER (also known as PowerPC) is a series of
CPUs
• Created by IBM
• Introduced in 1990
• IBM uses POWER CPUs in many of their high-end
server products
• Watson, the supercomputer that won Jeopardy in 2011,
was equipped with 3,000 POWER7 CPU cores
• In 2017, the latest model is the 24-core POWER9
CPU, running at 4 GHz
Memory – early systems
• The first computers used vacuum tubes to store
data
• Extremely expensive, power-hungry, fragile, and generated
a lot of heat
• Relays were an alternative to vacuum tubes
• Mechanical parts that use magnetism to move a physical
switch
• Two relays can be combined to create a single bit of
memory storage
• Slow, power-hungry, noisy, heavy, and expensive
• Based on cathode ray tubes, the Williams tube was
the first random access memory, capable of storing
several thousands of bits, but only for some seconds
Memory – early systems
• The first truly useable type of
main memory was magnetic
core memory, introduced in
1951
• The dominant type of memory
until the late 1960s
• Uses very small magnetic rings,
called cores, with wires running
through them
• The wires can polarize the
magnetic field one direction or
the other in each individual core
• One direction means 1, the other
means 0
• Core memory was replaced by
RAM chips in the 1970s
RAM memory
• RAM: Random Access Memory
• Any piece of data stored in RAM can be read in the same
amount of time, regardless of its physical location
• Based on transistor technology, typically
implemented in large amounts in Integrated Circuits
(ICs)
• Data is volatile – it remains available as long as the
RAM is powered
RAM memory
• Static RAM (SRAM)
• Uses flip-flop circuitry to store bits
• Six transistors per bit
• Dynamic RAM (DRAM)
• Uses a charge in a capacitor
• One transistor per bit
• DRAM loses its data after a short time due to the leakage
of the capacitors
• To keep data available in DRAM it must be refreshed
regularly (typically 16 times per second)
BIOS
• The Basic Input/Output System (BIOS) is a set of
instructions stored on a memory chip located on the
computer’s motherboard
• The BIOS controls a computer from the moment it is
powered on, to the point where the operating
system is started
• Mostly implemented in a Flash memory chip
• It is good practice to update the BIOS software
regularly
• Upgrading computers to the latest version of the BIOS is
called BIOS flashing
Interfaces
• Connecting computers to external peripherals is
done using interfaces
• External interfaces use connectors located at the
outside of the computer case
• One of the first standardized external interfaces was the
serial bus based on RS-232
• RS-232 is still used today in some systems to connect:
• Older type of peripherals
• Industrial equipment
• Console ports
• Special purpose equipment
USB
• The Universal Serial Bus (USB) was introduced in
1996 as a replacement for most of the external
interfaces on servers and PCs
• Can provide operating power to attached devices
• Up to seven devices can be daisy-chained
• Hubs can be used to connect multiple devices to one USB
computer port
• In 2013, USB 3.1 was introduced
• Provides a throughput of 10 Gbit/s
• In 2014, USB Type-C was introduced
• Smaller connector
• Ability to provide more power to connected devices
Thunderbolt
• Thunderbolt, also known as Light Peak, was
introduced in 2011
• Thunderbolt 3 was released in 2015
• Can provide a maximum throughput of 40 Gbit/s
• Can provide up to 100 W of power to devices
• Uses the USB Type-C connector
• Backward compatible with USB 3.1
PCI
• Internal interfaces, typically some form of PCI,
are located on the system board of the computer,
inside the case, and connect expansion boards
like network adapters and disk controllers
• Uses a shared parallel bus architecture
• Only one shared communication path between two
PCI devices can be active at any given time
PCIe
• PCI Express (PCIe) uses a topology based on point-to-
point serial links, rather than a shared parallel bus
architecture
• A connection between any two PCIe devices is known as a
link
• A link is made up of one or more lanes
• Routed by a hub on the system board acting as a
crossbar switch
• The hub allows multiple pairs of devices to communicate
with each other at the same time
• Despite the availability of the much faster PCIe,
conventional PCI remains a very common interface in
computers
PCI and PCIe
PCI speeds in Gbit/s

                       Lanes:   1     2     4     8     16    32
PCI  32-bit/33 MHz              1
PCI  32-bit/66 MHz              2
PCI  64-bit/33 MHz              2
PCI  64-bit/66 MHz              4
PCI  64-bit/100 MHz             6
PCIe 1.0                        2     4     8     16    32    64
PCIe 2.0                        4     8     16    32    64    128
PCIe 3.0                        8     16    32    64    128   256
PCIe 4.0                        16    32    64    128   256   512

(Conventional PCI is a shared bus, so the lane count does not apply to it.)
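The PCIe rows of the table scale linearly with the number of lanes; a small sketch that reproduces them from the per-lane throughput:

```python
# Per-lane effective throughput in Gbit/s, taken from the x1 column above.
PER_LANE_GBIT = {"PCIe 1.0": 2, "PCIe 2.0": 4, "PCIe 3.0": 8, "PCIe 4.0": 16}
LANES = [1, 2, 4, 8, 16, 32]

for generation, per_lane in PER_LANE_GBIT.items():
    row = [per_lane * lanes for lanes in LANES]
    print(generation, row)    # e.g. PCIe 3.0 [8, 16, 32, 64, 128, 256]
```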
Compute virtualization
• Compute virtualization is
also known as:
• Server virtualization
• Software Defined Compute
• Introduces an abstraction
layer between physical
computer hardware and
the operating system
using that hardware
• Allows multiple operating
systems to run on a single
physical machine
• Decouples and isolates
virtual machines from the
physical machine and from
other virtual machines
Compute virtualization
• A virtual machine is a logical representation of a
physical computer in software
• New virtual machines can be provisioned without
the need for a hardware purchase
• With a few mouse clicks or using an API
• New virtual machines can be installed in minutes
• Costs can be saved on hardware, power, and cooling
by consolidating many physical computers as virtual
machines on fewer (bigger) physical machines
• Because fewer physical machines are needed, the
cost of maintenance contracts can be reduced and
the risk of hardware failure is reduced
Software Defined Compute
(SDC)
• Virtual machines are typically managed using one
redundant centralized virtual machine management
system
• Enables systems managers to manage more machines
with the same number of staff
• Allows managing the virtual machines using APIs (see the sketch below)
• Server virtualization can therefore be seen as Software
Defined Compute
• In SDC, all physical machines are running a
hypervisor and all hypervisors are managed as one
layer using management software
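As an illustration of such API-based management, here is a minimal sketch using the libvirt Python bindings, assuming a local QEMU/KVM hypervisor and a hypothetical VM name:

```python
import libvirt                               # the libvirt Python bindings

conn = libvirt.open("qemu:///system")        # connect to the local hypervisor
for dom in conn.listAllDomains():            # enumerate all virtual machines
    state = "running" if dom.isActive() else "stopped"
    print(dom.name(), state)

vm = conn.lookupByName("example-vm")         # hypothetical VM name
if not vm.isActive():
    vm.create()                              # start the virtual machine
conn.close()
```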
Software Defined Compute (SDC)
• Some virtualization platforms allow running virtual
machines to be moved automatically between
physical machines
• Benefits:
• When a physical machine fails, all virtual machines that
ran on the failed physical machine can be restarted
automatically on other physical machines
• Virtual machines can automatically be moved to the least
busy physical machines
• Some physical machines can get fully loaded while other
physical machines can be automatically switched off,
saving power and cooling cost
• Enables hardware maintenance without downtime
Disadvantages of computer
virtualization
• Because creating a new virtual machine is so easy,
virtual machines tend to get created for all kinds of
reasons
• This effect is known as "virtual machine sprawl"
• All VMs:
• Must be managed
• Use resources of the physical machine
• Use power and cooling
• Must be backed up
• Must be kept up to date by installing patches
Disadvantages of computer
virtualization
• Introduction of an extra layer in the infrastructure
• License fees
• Systems managers training
• Installation and maintenance of additional tools
• Virtualization cannot be used on all servers
• Some servers require additional specialized hardware,
like modem cards, USB tokens or some form of high speed
I/O like in real-time SCADA systems
• Virtualization is not supported by all application
vendors
• When the application experiences some problem,
systems managers must reinstall the application on a
physical machine before they get support
Virtualization technologies
• Emulation
• Can run programs on a computer, other than the one they
were originally intended for
• Run a mainframe operating system on an x86 server
• Logical Partitions (LPARs)
• Hardware based
• Used on mainframe and midrange systems
Virtualization technologies
• Hypervisors
• Control the physical computer's hardware and provide
virtual machines with all the services of a physical system
• Virtual CPUs
• BIOS
• Virtual devices
• Virtualized memory management
• Three types:
• Binary translation
• Paravirtualization
• Hardware assisted virtualization (most used on x86 servers)
Container technology
• Container technology is a server virtualization method
in which the kernel of an operating system provides
multiple isolated user-space instances, instead of just
one
• Containers look and feel like a real server from the
point of view of its owners and users, but they share
the same operating system kernel
• Containers are part of the Linux kernel since 2008
Container technology
• Containers strike a balance between isolation and the
overhead of running isolated applications
Container technology
• Containers have a number of benefits:
• Isolation
• Applications or application components can be encapsulated in
containers, each operating independently and isolated from
each other
• Portability
• Since containers typically contain all components the
application needs to function, including libraries and patches,
containers can be run on any infrastructure that is capable of
running containers using the same kernel version
• Easy deployment
• Containers allow developers to quickly deploy new software
versions, as their containers can be moved from the
development environment to the production environment
unaltered
Container implementation
• Containers are based on 3 technologies that are all
part of the Linux kernel:
• Chroot (also known as a jail; see the sketch below)
• Changes the root directory for the current running process
• Ensures processes cannot access files outside the designated
directory tree
• Namespaces
• Allows complete isolation of an application's view of the
operating environment
• Process trees, networking, user IDs and mounted file systems
Container implementation
• Cgroups
• Limits and isolates the resource usage of a collection of
processes
• CPU, memory, disk I/O, network
• Linux Containers (LXC), introduced in 2008, is a
combination of these
• Docker is a popular implementation of a container
ecosystem
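Of the three kernel building blocks above, chroot is the simplest to demonstrate. A minimal sketch (it must run as root, and the jail directory must already contain the binaries and libraries the command needs):

```python
import os

def run_in_jail(new_root, command):
    """Run a command with its root directory changed to new_root (a chroot jail)."""
    pid = os.fork()
    if pid == 0:                        # child process
        os.chroot(new_root)             # change the root directory
        os.chdir("/")                   # start inside the new root
        os.execvp(command[0], command)  # replace the child with the command
    os.waitpid(pid, 0)                  # parent waits for the jailed process

# run_in_jail("/srv/jail", ["/bin/sh"])   # hypothetical jail directory
```

Namespaces and cgroups are configured through kernel interfaces (clone flags and the /sys/fs/cgroup filesystem); container runtimes such as Docker combine all three for you.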
Container orchestration
• Container orchestration abstracts the resources of a cluster
of machines and provides services to containers
• A container orchestrator enables containers to be run
anywhere on a cluster of machines
• Schedules the containers to run on any machine that has resources
available
• Acts like a kernel for the combined resources of an entire
datacenter
Mainframes
• A mainframe is a high-performance computer
made for high-volume, I/O-intensive computing
• Expensive
• Mostly used for administrative processes
• Optimized for handling high volumes of data
• IBM is the largest vendor – it has 90% market share
• The end of the mainframe is predicted for decades
now, but mainframes are still widely used
• Today’s mainframes are still large (the size of a few
19" racks), but they don’t fill-up a room anymore
Mainframes
• Mainframes are highly
reliable, typically running
for years without downtime
• Much redundancy is built in
• Hardware can be upgraded
and repaired while the
mainframe is operating
without downtime
• The latest IBM z13
mainframe:
• Introduced in 2015
• Up to 10 TB of memory
• Up to 141 processors
• Running at a 5 GHz clock speed
• Can run up to 8000 virtual
machines simultaneously
Mainframe architecture
• A mainframe
consists of:
• Processing units
(PUs)
• Memory
• I/O channels
• Control units
• Devices, all placed
in racks (frames)
Mainframe architecture – PU,
memory, and disks
• In the mainframe world, the term PU (Processing
Unit) is used instead of CPU
• A mainframe has multiple PUs, so there is no central
processing unit
• The total of all PUs in a mainframe is called a Central
Processor Complex (CPC)
• Each book package in the CPC cage contains from
four to eight memory cards
• For example, a fully loaded z9 mainframe has four book
packages that can provide up to a total of 512 GB of
memory
• Disks in mainframes are called DASD (Direct Access
Storage Devices)
• Comparable to a SAN in a midrange or x86 environment
Mainframe architecture –
Channels and control units
• A channel provides a data and control path
between I/O devices and memory
• Today’s largest mainframes have 1024 channels
• A control unit is similar to an expansion card in
an x86 or midrange system
• Contains logic to work with a particular type of I/O
device, like a printer or a tape drive
Mainframe architecture –
Channels and control units
• Channel types:
• OSA
• Connectivity to various industry standard networking
technologies, including Ethernet
• FICON
• The most flexible channel technology, based on fiber-optic
technology
• With FICON, input/output devices can be located many
kilometers from the mainframe to which they are attached
• ESCON
• An earlier type of fiber-optic technology
• Slower than FICON channels, and limited to shorter distances
Mainframe virtualization
• Mainframes were designed for virtualization from the
start
• Logical partitions (LPARs) are the default virtualization
solution
• LPARs are equivalent to separate mainframes
• A common number of LPARs in use on a mainframe is
less than ten
• The mainframe operating system running on each
LPAR is designed to concurrently run a large number of
applications and services, and can be connected to
thousands of users at the same time
• Often one LPAR runs all production tasks while another
runs the consolidated test environment
Midrange systems
• The midrange platform is positioned between
the mainframe platform and the x86 platform
• Built using parts from only one vendor, and run
an operating system provided by that same
vendor
• This makes the platform:
• Stable
• Highly available
• Secure
Midrange systems
• Today midrange systems are produced by three
vendors:
• IBM
• Power Systems series
• Operating system: AIX UNIX, Linux and IBM i
• Hewlett-Packard
• HP Integrity systems
• Operating system: HP-UX UNIX and OpenVMS
• Oracle
• SPARC servers based on Sun Microsystems designs
• Operating system: Solaris UNIX
Midrange systems - History
• Minicomputers evolved in the 1960s as small
computers, made possible by the use of IC
and core memory technologies
• Small was relative
• A single minicomputer
typically was housed in a
few cabinets the size of a
19” rack
• The first commercially successful minicomputer
was the DEC PDP-8, launched in 1965
Midrange systems - History
• Minicomputers became powerful systems
• They ran full multi-user, multitasking operating
systems like OpenVMS and UNIX
• In the 1980s, minicomputers (a.k.a. midrange
systems) became less popular
• A result of the lower cost of PCs, and the emergence
of LANs
• Still used in places where high availability,
performance, and security are very important
Midrange systems -
Architecture
• The architecture of most midrange systems:
• Uses multiple CPUs
• Based on a shared memory architecture
• In a shared memory architecture, all CPUs in the
system can access all installed memory blocks
• Changes made in memory by one CPU are
immediately seen by all other CPUs
• A shared bus connects all CPUs and all RAM
• The I/O system is also connected to the
interconnection network
Midrange systems -
Architecture
• Shared memory architectures come in two
flavors:
• Uniform Memory Access (UMA)
• Non-Uniform Memory Access (NUMA)
• Cache coherence is needed
• If one CPU writes to a location in shared memory, all other
CPUs must update their caches to reflect the changed data
• Cache coherent versions are known as ccUMA and
ccNUMA
Midrange systems – UMA
architecture
• The UMA architecture is
one of the earliest styles
of multi-CPU
architectures, typically
used in systems with no
more than 8 CPUs
• The machine is organized
into a series of nodes
containing either a
processor, or a memory
block
• Nodes are
interconnected, usually
by a shared bus
Midrange systems – SMP
architecture
• UMA systems are also
known as Symmetric
Multi-Processor (SMP)
systems
• SMP is used in x86
servers as well as early
midrange systems
• Can be implemented
inside multi-core CPUs
• The interconnect is
implemented on-chip in
the CPU
• A single path to the
main memory is
provided between the
chip and the memory
subsystem
Midrange systems – NUMA
architecture
• NUMA is a computer
architecture in which the
machine is organized into a
series of nodes
• Each node contains
processors and memory
• Nodes are interconnected
using a crossbar interconnect
• When a processor accesses
memory not within its own
node, data must be
transferred over the
interconnect
• Slower than accessing local
memory
• Memory access times are non-
uniform
Midrange virtualization
• Most midrange platform vendors provide
virtualization based on LPARs
• LPARs are a type of hardware partitioning
• IBM AIX: Workload Partitions (WPARs)
• HP: nPARs
• Oracle Solaris: zones and containers
x86 servers
• The x86 platform is the most dominant server architecture
today
• In the 1990s, x86 servers first started to appear.
• They were basically big PCs, housed in 19” racks without dedicated
keyboards and monitors
• x86 servers are produced by many vendors, like:
• HP
• Dell
• Lenovo
• HDS (Hitachi Data Systems)
• Huawei
• Implementation of the platform is highly dependent on the
vendor and the components available at a certain moment
• Most used operating systems are Microsoft Windows and
Linux
x86 servers - Architecture
• x86 architectures are
defined by a CPU from
the x86 family, and
building blocks,
integrated in a number
of specialized chips,
known as an x86
chipset
• Earlier x86 systems
utilized a Northbridge
/ Southbridge
architecture
• Front Side Bus (FSB)
• Direct Media Interface
(DMI)
x86 servers - Architecture
• In the PCH architecture, the
RAM and PCIe data paths
are directly connected to
the CPU
• The Northbridge is integrated
into the CPU
• Intel introduces new
architectures and chipsets
roughly every two years
• Now full system on a chip
(SoC)
• SoCs directly expose:
• PCIe lanes
• SATA
• USB
• High Definition Video
x86 virtualization
• On x86 platforms, most servers only run one application each
• A Windows server running Exchange will probably not also run
SharePoint
• A Linux server running Apache will probably not also run MySQL
• This is the main reason x86 systems use their hardware much less
efficiently than midrange systems
• By running multiple operating systems – each in one virtual
machine – on a large x86 server, resource utilization can be
improved
• The most used products for virtualization on the x86 platform
are:
• VMware vSphere
• Microsoft’s Hyper-V
• Citrix XenServer
• Oracle VirtualBox
• Red Hat RHEV
Supercomputers
• A supercomputer is a computer architecture
designed to maximize calculation speed
• This is in contrast with a mainframe, which is optimized
for high I/O throughput
• Supercomputers are the fastest machines available
at any given time
• Used for highly compute-intensive tasks requiring
floating point calculations, like:
• Weather forecast calculations
• Oil reservoir simulations
• Rendering of animation movies
Supercomputers
• Originally,
supercomputers were
produced primarily by
a company named
Cray Research
• Cray-1 (1976): 250
MFLOPS (Million
Floating Point
Operations per
second)
• Cray-2 (1985): 1,900
MFLOPS
Supercomputers
• Nowadays high performance computing is
done mainly with large arrays of x86
systems
• Intel’s Core i7 5960X CPU has a peak
performance of 354,000 MFLOPS
• The fastest computer array is a cluster with
more than 10,000,000 CPU cores,
calculating at 125,435,000,000 MFLOPS,
running Linux
• Graphics processing units (GPUs) can be
used together with CPUs to accelerate
specific calculations
• A GPU has a massively parallel architecture
consisting of thousands of small, efficient
cores designed for handling multiple tasks
simultaneously
• The NVIDIA Tesla GP100 GPU, introduced in
2016, has 3840 cores
Compute availability
Hot swappable components
• Hot swappable components are server components
that can be installed, replaced, or upgraded while the
server is running
• Memory
• CPUs
• Interface cards
• Power supplies
• The virtualization layer and the operating systems using the
server hardware must be aware that components can
be swapped on the fly
• For instance, the operating system must be able to recognize
that memory is added while the server operates and must
allow the use of this extra memory without the need for a
reboot
Parity memory
• To detect memory failures, parity bits can be used
as the simplest form of error detecting code
• Parity bits enable the detection of data errors
• They cannot correct the error, as it is unknown
which bit has flipped
DATA       PARITY
1001 0110  0
1011 0110  1
0001 0110  0  -> ERROR: parity bit should have been 1!
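The example above, expressed as a small Python sketch using even parity:

```python
def parity_bit(bits):
    """Even parity: the parity bit makes the total number of 1s even."""
    return sum(bits) % 2

written = [1, 0, 0, 1, 0, 1, 1, 0]      # 1001 0110 as written, parity bit = 0
stored_parity = parity_bit(written)     # 0

read = [0, 0, 0, 1, 0, 1, 1, 0]         # 0001 0110: the first bit flipped in memory
if parity_bit(read) != stored_parity:
    print("ERROR: a bit has flipped (parity cannot tell which one)")
```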
ECC memory
• ECC memory not only detects errors, but is also
able to correct them
• ECC memory chips use Hamming code or Triple
Modular Redundancy (TMR) for error detection and
correction (see the sketch below)
• Memory errors are proportional to the amount of
RAM in a computer as well as the duration of
operation
• Since servers typically contain many GBs of RAM and are
in operation 24 hours a day, the likelihood of memory
errors is relatively high and hence they require ECC
memory
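A toy Hamming(7,4) encoder and corrector, sketching the single-bit correction idea behind ECC memory (real ECC DIMMs use wider codes implemented in hardware):

```python
def hamming74_encode(d1, d2, d3, d4):
    """Encode 4 data bits into a 7-bit codeword (positions p1 p2 d1 p3 d2 d3 d4)."""
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(codeword):
    """Detect a single flipped bit and flip it back."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]       # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]       # checks positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]       # checks positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3      # 0 = no error, otherwise the error position
    if syndrome:
        c[syndrome - 1] ^= 1             # correct the faulty bit
    return c

word = hamming74_encode(1, 0, 1, 1)
damaged = word.copy()
damaged[4] ^= 1                          # simulate a single bit flip in memory
assert hamming74_correct(damaged) == word
```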
Lockstepping
• Lockstepping is an error detection and correction
technology for servers
• Multiple systems perform the same calculation, and
the results of the calculations are compared
• If the results are equal, the calculations were correctly
performed
• If there are different outcomes, one of the servers made
an error
• Very expensive technology
• Only used in systems that require extremely high
reliability
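A toy simulation of the comparison idea (real lockstepping is done in hardware, cycle by cycle, not in software like this):

```python
def lockstepped(calculation, *args, replicas=2):
    """Run the same calculation on several 'units' and compare the outcomes."""
    results = [calculation(*args) for _ in range(replicas)]
    if len(set(results)) != 1:
        raise RuntimeError("lockstep mismatch: one of the units made an error")
    return results[0]

print(lockstepped(lambda a, b: a + b, 2, 3))   # 5: both replicas agree
```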
Virtualization availability
• All virtualization products provide failover
clustering
• When a physical machine fails, the virtual machines
running on that physical machine can be configured to
restart automatically on other physical machines
• When a virtual machine crashes, it can be restarted
automatically on the same physical machine
Virtualization availability
• The virtualization layer has no knowledge of the
applications running on the virtual machine’s
operating system
• Failover clustering on virtualization level can only
protect against two situations:
• A physical hardware failure.
• An operating system crash in a virtual machine
Virtualization availability
• To cope with the effects of a failure of a physical
machine, a spare physical machine is needed
• All hypervisors are placed in a virtualization cluster
• The hypervisors on the physical machines check the
availability of the other hypervisors in the cluster
• One physical machine is running as a spare to take over the
load of any failing physical machine
Virtualization availability
• When a physical machine fails, the virtual machines
running on it are automatically restarted on the
spare physical machine
Virtualization availability
• An alternative is to have all physical machines
running at lower capacity
Virtualization availability
• When a physical machine fails, the virtual machines
running on it are automatically restarted on the
other physical machine
• All machines now run on full capacity
Compute performance
CPU: Moore's law
• In 1971, Intel released the world's first
universal microprocessor, the 4004
• Contained 2,300 transistors
• Could perform 60,000 instructions per
second
• About as much as the ENIAC computer that filled
a complete room and weighed several tons
• Since the introduction of the first CPU in
1971, the power of CPUs has increased
exponentially
CPU: Moore's law
• Moore's law states:
• The number of transistors that can be placed
inexpensively on an integrated circuit doubles
approximately every two years
• This trend has continued for more than half a
century now
• An Intel Broadwell-EP Xeon in 2017 contains
7,200,000,000 transistors
• A 3,100,000-fold increase in 46 years' time!
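The numbers above can be checked with a little arithmetic (the doubling-every-two-years prediction is an idealization):

```python
transistors_4004 = 2_300            # Intel 4004, 1971
transistors_xeon = 7_200_000_000    # Broadwell-EP Xeon, 2017
years = 2017 - 1971

print(transistors_xeon / transistors_4004)    # ~3.1 million-fold increase
print(transistors_4004 * 2 ** (years / 2))    # Moore's prediction: ~1.9e10, same order of magnitude
```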
CPU: Moore's law
Please note that the vertical scale is logarithmic instead of linear, showing a
10-fold increase of the number of transistors in each step
CPU: Moore's law
• Moore's law only speaks of the number of transistors; not
the performance of the CPU
• The performance of a CPU is dependent on a number of
variables, like:
• Clock speed
• Use of caches and pipelines
• Width of the data bus
• Moore’s law cannot continue forever, as there are
physical limits to the number of transistors a single chip
can hold
• In 2017, connections used inside a high-end CPU had a physical
width of 14 nm (nanometer), the size of 140 atoms
• In 2020, 5 nm CPUs are expected to be produced; traces on such a
chip will be just 50 atoms wide
CPU: Increasing CPU and
memory performance
• Various techniques have been invented to increase
CPU performance, like:
• Increasing the clock speed
• Caching
• Prefetching
• Branch prediction
• Pipelines
• Use of multiple cores
CPU: Increasing clock speed
• CPU instructions need to be fetched, decoded, executed, and the
result must often be written back to memory
• Each step in the sequence is executed when an external clock
signal changes from 0 to 1 (the clock tick)
• The clock signal is supplied to the CPU by an external oscillator
• In the early years, CPU clock speed was the main performance
indicator
• CPU clock speed is measured in Hertz (Hz) – clock ticks or cycles
per second
CPU: Increasing clock speed
• Today’s CPUs use clock speeds as high as 3 GHz (3,000,000,000
clock ticks per second)
• Because of physical limitations, oscillators cannot run at this speed
• An oscillator with a lower frequency is used (for instance 400
MHz) and this clock rate is multiplied on the CPU chip
• The oscillator speed is known as the Front Side Bus (FSB) speed
CPU: Caching
• All CPUs in use today contain on-chip caches
• A cache is a relatively small piece of high speed
static RAM on the CPU
• Temporarily stores data received from slower main
memory
• Most CPUs contain two types of cache: level 1 and level 2
cache
• Some multi-core CPUs also have a large level 3 cache; a
cache shared by the cores
• Cache memory runs at full CPU speed (say 3 GHz),
main memory runs at the CPU external clock speed
(say 100 MHz, which is 30 times slower)
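A rough model of why caching pays off, assuming a single cache level and a 90% hit ratio (the hit ratio is an assumption, not a figure from the slide):

```python
t_cache = 1 / 3.0e9        # one access per cache cycle at 3 GHz   (~0.33 ns)
t_memory = 1 / 100.0e6     # one access per memory cycle at 100 MHz (10 ns, 30x slower)
hit_ratio = 0.9            # assumed: 90% of accesses are served from the cache

average = hit_ratio * t_cache + (1 - hit_ratio) * t_memory
print(average * 1e9)       # ~1.3 ns on average instead of 10 ns
```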
CPU: Caching
CPU: Pipelines
• Early processors first fetched an instruction,
decoded it, then executed the fetched instruction,
and wrote the result back before fetching the next
instruction and starting the process over again
CPU: Pipelines
• Later CPUs used pipelines
• While the first instruction is being executed, the second
instruction can be fetched (since that circuitry is idling
anyway), creating instruction overlap
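A sketch of the cycle-count difference for a simple 4-stage fetch/decode/execute/write-back design (ignoring stalls and branch penalties):

```python
def cycles(instructions, stages, pipelined):
    """Clock cycles needed to run a stream of instructions."""
    if pipelined:
        return stages + (instructions - 1)   # stages overlap after the first instruction
    return stages * instructions             # each instruction runs all stages alone

print(cycles(1000, 4, pipelined=False))  # 4000 cycles without a pipeline
print(cycles(1000, 4, pipelined=True))   # 1003 cycles with a pipeline
```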
CPU: Prefetching and branch
prediction
• Prefetching:
• When an instruction is fetched from main memory, also
the next instructions are fetched and stored in cache
• When the CPU needs the next instruction it is already
available in cache
• Unfortunately, most programs contain jumps (also
known as branches), resulting in cache misses
• The next instruction is not the next instruction in
memory
CPU: Prefetching and branch
prediction
• The cache system tries to predict the outcome of
branch instructions before they are executed by the
CPU (called branch prediction)
• In practice more than 80% of the processor instructions
are delivered to the CPU from cache memory using
prefetching and branch prediction
CPU: Superscalar CPUs
• A superscalar CPU can process more than one instruction per
clock tick
• This is done by simultaneously dispatching multiple
instructions to redundant functional units on the processor
CPU: Multi-core CPUs
• The fastest commercial CPUs have been running between
3 GHz and 4 GHz for a number of years now
• Reasons:
• High clock speeds make connections on the circuit board
work as a radio antenna
• A frequency of 3 GHz means a wavelength of 10 cm. When
signals travel for more than a few cm on a circuit board,
the signal gets out of phase with the clock
• The CPU can heat up tremendously at certain spots,
which could lead to a meltdown
CPU: Multi-core CPUs
• A multi-core processor is a CPU with multiple
separate cores
• The equivalent of getting multiple processors in one
package
• The cores in a multi-core CPU run at a lower
frequency
• Reduce power consumption
• Reduce heat (no hot spots)
• Trend is to have CPUs with tens or even hundreds of
cores
CPU: Hyper-threading
• Certain Intel CPUs contain a proprietary technology
called hyper-threading
• For example the Core i3/i5/i7 and Xeon CPUs
• Hyper-threading makes a single processor core
virtually work as a multi-core processor
• Hyper-threading can provide some increase in
system performance by keeping the processor
pipelines busier
Virtualization performance
• Consolidating multiple virtual machines on one
physical machine increases CPU usage and reduces
CPU idle time
• This is a primary driver for the use of virtualization
• The physical machine needs to handle the disk and
network I/O of all running virtual machines
• This can easily lead to an I/O performance bottleneck
Virtualization performance
• When choosing a physical machine to host virtual
machines, consider getting a machine with:
• Ample CPU and memory capacity
• Capability of very high I/O throughput
• Virtualization introduces performance penalties:
• Resources required to run the hypervisor
• Operation transformations
• This is usually less than 10% of the total
performance
Virtualization performance
• Databases generally require a lot of network
bandwidth and high disk I/O performance
• This makes databases less suitable for a virtualized
environment
• Raw Device Mapping allows a virtual machine exclusive
access to a physical storage medium
• This diminishes the performance hit of the hypervisor on storage
to almost zero
Virtualization performance
• Often one physical server is needed per database
• The physical server runs just one virtual machine
• Many benefits of virtualization remain
• Database servers can easily be migrated to other physical
machines without downtime
• Management of the servers is unified when all servers run
hypervisors
Compute security
Physical security
• Disable external USB ports in the BIOS
• BIOS settings in an x86 server should be
protected using a password
• Via the BIOS, external USB ports can be enabled, and
other parameters can be set
• Some servers allow the detection of the physical
opening of the server housing
• Such an event can be sent to a central management
console using for instance SNMP traps
• If possible, enable this to detect unusual activities
Virtualization security
• Virtual machines must be protected the same way as
physical machines
• The use of virtualization introduces new security
vulnerabilities of its own:
• If possible, firewalls and Intrusion Detection Systems
(IDSs) in the hypervisor should be deployed
• The virtualization platform itself needs patching too
• The size and complexity of the hypervisor should be kept
to a minimum
• DMZ
• Consider using separate physical machines that run all
the virtual machines needed in the DMZ
Virtualization security
• Systems management console
• The systems management console connects to all
hypervisors and virtual machines
• When the security of the systems management console is
breached, security is breached on all virtual machines
• Not all systems managers should have access to all virtual
machines
• Special user accounts and passwords should be
configured for high risk operations like shutting down
physical machines or virtualized clusters
• All user activity in the systems management console
should be logged