Wagner W 420 KVM Performance Improvements and Optimizations


KVM PERFORMANCE IMPROVEMENTS AND OPTIMIZATIONS

Mark Wagner, Principal SW Engineer, Red Hat, May 4, 2011

Overview

Discuss a range of topics about KVM performance


Will show how some features impact SPECvirt results
Also show the impact on real-world applications

RHEL5 and RHEL6
Use libvirt where possible

Note that not all features are in all releases
Some of this will apply to your setup, but your mileage will vary...

We will not cover the RHEV products explicitly

Before we dive in...


[Chart: Guest NFS Write Performance, RHEL6 default configuration. Y-axis: throughput (MBytes/second), 0-450. Annotations: "Arrow shows improvement", "are we sure?", "Is this really a 10Gbit line?"]

By default the rtl8139 device is chosen

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

SPECvirt - Basics

Virtual Machines are added in sets of six

Called a Tile

Each guest has a specific function:
Web (http)
Infrastructure (NFS for Web)
Application (Java Enterprise)
DB (for App)
Mail (imap)
Idle


SPECvirt - Basics

Three SPEC workloads drive one Tile


SPECweb SPECjApp SPECmail

Run as many Tiles as possible until any of the workloads fails any of the Quality of Service requirements

Tune, lather, rinse, repeat

SPECvirt - Basics

Each workload is throttled

there are think times between requests

SPECjApp workload has peaks/valleys to greatly vary resource usage in App & DB guests

SPECvirt Home Page

http://www.spec.org/virt_sc2010/

SPECvirt Diagram
Each client runs a modified version of SPECweb, SPECjApp, and SPECmail

A single QoS error invalidates the entire test

[Diagram: a controller drives a set of clients; each client exercises one Tile (Web, Infra, App, DB, Mail, Idle guests) running on the host, for Tiles 1 through n]

SPECvirt_sc2010 Published Results March 2011


[Chart: SPECvirt_sc2010 2-Socket Results (x86_64 servers), 3/2011. Left axis: SPECvirt_sc2010 score (0-2000); right axis: Tiles/Core. Systems: RHEL 5.5 (KVM) / IBM x3650 M3 / 12 cores; VMware ESX 4.1 / HP DL380 G7 / 12 cores; RHEL 6.0 (KVM) / IBM HS22V / 12 cores; RHEL 5.5 (KVM) / IBM x3690 X5 / 16 cores; RHEL 6 (KVM) / IBM x3690 X5 / 16 cores. Scores shown: 1169, 1221, 1367, 1369, and 1763, with RHEL 6 on the IBM x3690 X5 posting the top 2-socket score of 1763.]

SPECvirt_sc2010 Published Results March 2011


[Chart: SPECvirt_sc2010 2-4 Socket Results (x86_64 servers), 3/2011. Left axis: SPECvirt_sc2010 score (0-6000); right axis: Tiles/Core. Scores: 2721 and 2742 for the 32-core VMware ESX 4.1 systems (Bull SAS and IBM x3850 X5), 3723 for VMware ESX 4.1 on the 40-core HP DL580 G7, and 5466 for RHEL 6 (KVM) on the 64-core IBM x3850 X5.]

SPECvirt_sc2010 Published Results May 2011


[Chart: SPECvirt_sc2010 2-Socket Results (x86_64 servers), 5/2011. Left axis: SPECvirt_sc2010 score (0-2000); right axis: Tiles/Core. Systems: RHEL 5.5 (KVM) / IBM x3650 M3 / 12 cores; VMware ESX 4.1 / HP DL380 G7 / 12 cores; RHEL 6.0 (KVM) / IBM HS22V / 12 cores; RHEL 5.5 (KVM) / IBM x3690 X5 / 16 cores; RHEL 6 (KVM) / IBM x3690 X5 / 16 cores; VMware ESX 4.1 / HP BL620c G7 / 20 cores; RHEL 6.1 (KVM) / HP BL620c G7 / 20 cores. Scores shown: 1169, 1221, 1367, 1369, 1763, 1811, and 1820.]

SPECvirt_sc2010 Published Results May 2011


[Chart: SPECvirt_sc2010 2-4 Socket Results (x86_64 servers), 4/2011. Left axis: SPECvirt_sc2010 score (0-8000); right axis: Tiles/Core. Scores: 2721 and 2742 for the 32-core VMware ESX 4.1 systems (Bull SAS and IBM x3850 X5), 3723 for VMware ESX 4.1 on the 40-core HP DL580 G7, 5466 for RHEL 6 (KVM) on the 64-core IBM x3850 X5, and 7067 for RHEL 6 (KVM) on the 80-core IBM x3850 X5.]

Unofficial SPECvirt Tiles vs Tuning - RHEL5.5


[Chart: Impact of Tuning KVM for SPECvirt (not official SPECvirt results). Y-axis: number of Tiles, 0-14. Baseline configuration: 8 Tiles. Based on a presentation by Andrew Theurer at KVM Forum, August 2010.]

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Some block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

Quick Overview KVM Architecture

Guests run as a process in userspace on the host
A virtual CPU is implemented using a Linux thread
The Linux scheduler is responsible for scheduling a virtual CPU, as it is a normal thread

Guests inherit features from the kernel:
NUMA
Huge Pages
Support for new hardware


Quick Overview KVM Architecture

Disk and Network I/O through host (most of the time)

I/O settings in the host can make a big difference in guest I/O performance
Proper settings to achieve true direct I/O from the guest
Deadline scheduler (on host) typically gives the best performance

Need to understand host buffer caching


Network typically goes through a software bridge

Simplified view of KVM

[Diagram: ordinary Linux processes and user VMs run side by side on top of the Linux kernel with the KVM modules; the kernel's drivers drive the hardware]

Performance Improvements in RHEL6

Performance enhancements in every component

Component: Feature
CPU/Kernel: NUMA; ticketed spinlocks; Completely Fair Scheduler; extensive use of Read Copy Update (RCU); scales up to 64 vcpus per guest
Memory: large memory optimizations; Transparent Huge Pages is ideal for hardware-based virtualization
Networking: vhost-net, a kernel-based virtio backend with better throughput and latency; SR-IOV for ~native performance
Block: AIO, MSI, scatter gather

Performance Enhancements

Vhost-net

new host kernel networking backend providing superior throughput and latency over the prior userspace implementation

FPU performance optimization

Avoid the need for host to trap guest FPU cr0 access

Performance Enhancements

Disk I/O latency & throughput improvements,

using ioeventfd for faster notifications

Qcow2 virtual image format: caching of metadata improves performance

Batches writes at I/O barriers rather than issuing an fsync every time additional block storage is needed (thin-provisioning growth)

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Some block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

Remember this ?
[Chart: Guest NFS write performance, RHEL6 default configuration. Y-axis: throughput (MBytes/second), 0-450. Shows the impact of not specifying the OS at guest creation.]

Be Specific !

virt-manager will:

Make sure the guest will function
Optimize where it can

The more info you provide, the more tailoring will happen

Specify the OS details

Specify OS + flavor

Specifying Linux / Red Hat will get you:


The virtio drivers
At RHEL6.1, the vhost_net backend as well (see the example below)
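
A minimal sketch of supplying the OS hint at install time with virt-install (the guest name, disk path and install source are made-up placeholders):

virt-install --name rhel6-guest --ram 4096 --vcpus 2 \
    --os-type linux --os-variant rhel6 \
    --disk path=/var/lib/libvirt/images/rhel6-guest.img,size=20 \
    --network bridge=br0,model=virtio \
    --location http://installserver.example.com/rhel6/ --graphics none

With the OS variant declared, the tools default the NIC and disk to virtio rather than emulated rtl8139/IDE devices.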

I Like This Much Better


[Chart: Guest NFS write performance, impact of specifying the OS type at creation. Y-axis: throughput (MBytes/second), 0-450. Bars: RHEL6-default, RHEL6-vhost, RHEL5-virtio; specifying the OS yields roughly a 12.5x improvement over RHEL6-default.]

Low Hanging Fruit

Remove unused devices

Do you need sound in your web server?

Remove / disable unnecessary services
On both host and guest: Bluetooth in a guest? (see the example below)
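
As an illustration only (the exact service list depends on what the host and guest actually run), unneeded services can be switched off with the usual RHEL tools:

chkconfig --list | grep ":on"          # review what is enabled
chkconfig bluetooth off && service bluetooth stop
chkconfig cups off && service cups stop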

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Some block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

Memory Enhancements for Virtualization

Extended Page Table (EPT) age bits
Kernel Same-page Merging (KSM)
Transparent Huge Pages

Memory Enhancements for Virtualization

Extended Page Table (EPT) age bits
Allow the host to make smarter swap choices when under pressure

Kernel Same-page Merging (KSM)
Consolidates duplicate pages; particularly efficient for Windows guests

Transparent Huge Pages
Efficiently manage large memory allocations as one unit

Memory sharing

KSM

Scan into (transparent) huge pages
The major issue is making sure THP won't reduce the memory overcommit capacity of the host

Helps delay/avoid swapping
At the cost of slower memory access

Throughput vs. density
A classical speed vs. space trade-off

Add KSM on|off per VM
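
For reference, a rough sketch of the host-side KSM knobs on RHEL6 (RHEL6 also ships the ksm and ksmtuned services that manage these values automatically):

echo 1 > /sys/kernel/mm/ksm/run         # start scanning for duplicate pages
cat /sys/kernel/mm/ksm/pages_sharing    # pages currently being shared
echo 0 > /sys/kernel/mm/ksm/run         # stop scanning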

Memory Tuning Huge Pages

2M pages vs 4K standard Linux page


Virtual-to-physical page map is 512 times smaller
TLB can map more physical memory, resulting in fewer misses

Traditional Huge Pages are always pinned
Transparent Huge Pages in RHEL6
Most databases support Huge Pages

How to configure Huge Pages (16G):
echo 8192 > /proc/sys/vm/nr_hugepages
Make it persistent in /etc/sysctl.conf: vm.nr_hugepages=8192
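
A hedged sketch of backing a guest with the static huge page pool (the mount point and XML snippet assume a RHEL6 host with libvirt):

mkdir -p /dev/hugepages
mount -t hugetlbfs hugetlbfs /dev/hugepages

Then request huge page backing in the guest's libvirt XML:
<memoryBacking>
  <hugepages/>
</memoryBacking>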

Memory Tuning Huge Pages

Benefit not only the host but also the guests

Try them in a guest too !

Try Huge Pages on Host and Guest


[Chart: Impact of Huge Pages on SPECjbb, RHEL5.5, 8 threads. Y-axis: SPECjbb2005 bops, 0-300K. Bars: no huge pages; host using huge pages (+24%); guest and host using huge pages (+46%).]

Impact of Huge Pages


[Chart: RHEL5.5 KVM, OLTP workload, guest runs with and without Huge Pages. Y-axis: aggregate transactions per minute, 0-600K. X-axis: number of users (10U, 20U, 40U, x100). Series: 1 guest, 1 guest with Huge Pages, 4 guests, 4 guests with Huge Pages.]

Transparent Huge Pages


[Chart: RHEL6/6.1 SPECjbb, 24-cpu host, 24-vcpu guest, Westmere EP, 24GB. Y-axis: transactions per minute, 0-500K. Bars compare r6-guest and r6-metal, each with and without THP.]

Unofficial SPECvirt Tiles vs Tuning - RHEL5.5


[Chart: Impact of Tuning KVM for SPECvirt (not official SPECvirt results). Y-axis: number of Tiles, 0-14. Baseline: 8 Tiles; with Huge Pages: 8.8 (+10%). Based on a presentation by Andrew Theurer at KVM Forum, August 2010.]

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Some block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

Virtualization Tuning - Network

General tips
Virtio
vhost_net
PCI passthrough
SR-IOV (Single Root I/O Virtualization)

Network Tuning Tips

Separate networks for different functions

Use arp_filter to prevent ARP Flux


echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
Use /etc/sysctl.conf to make it permanent

Don't need HW to bridge intra-box communications
VM traffic never hits the HW on the same box

Packet size
Can really kick up the MTU as needed
Need to make sure it is set across all components (see the example below)
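
Illustrative settings only (interface and bridge names are placeholders): making arp_filter permanent and raising the MTU end to end:

# /etc/sysctl.conf
net.ipv4.conf.all.arp_filter = 1
# jumbo frames: every hop must agree (bridge, uplink, guest NIC)
ip link set dev br0 mtu 9000
ip link set dev eth0 mtu 9000    # uplink enslaved to br0, and again inside the guest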

Virtualization Tuning - Network

Virtio
VirtIO drivers for the network

vhost_net (low latency, close to line speed)
Bypasses the qemu layer

PCI passthrough
Bypasses the host and passes the PCI device to the guest
Can be passed to only one guest

SR-IOV (Single Root I/O Virtualization)
Can be shared among multiple guests
Limited hardware support
Passed through to the guest

KVM Network Architecture - VirtIO

Virtual machine sees a paravirtualized network device (VirtIO)
VirtIO drivers included in the Linux kernel
VirtIO drivers available for Windows
Network stack implemented in userspace
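
For reference, a minimal libvirt interface definition that selects the virtio model over a host bridge (the bridge name is an assumption):

<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
</interface>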

KVM Network Architecture


[Diagram: VirtIO network architecture]

Latency comparison RHEL 6


[Chart: Network latency, virtio vs host, RHEL 6, guest receive (lower is better). Y-axis: latency (usecs), 0-400. X-axis: message size (bytes). There is roughly a 4X gap in latency between virtio and the host.]

KVM Network Architecture vhost_net

New in RHEL6.1
Moves the QEMU network stack from userspace to the kernel
Improved performance
Lower latency
Reduced context switching
One less copy
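
A sketch of requesting the vhost backend in the guest XML on RHEL6.1 (when vhost_net is loaded it is normally chosen automatically for virtio interfaces; the explicit driver element is shown only for illustration):

<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
  <driver name='vhost'/>
</interface>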

KVM Network Architecture vhost_net


[Diagram: vhost_net network architecture]

Latency comparison RHEL 6


[Chart: Network latency, virtio vs vhost_net vs host, RHEL 6, guest receive (lower is better). Y-axis: latency (usecs), 0-400. X-axis: message size (bytes). With vhost_net, latency is much closer to bare metal.]

KVM Network Architecture VirtIO vs vhost_net


[Diagram: VirtIO vs vhost_net data paths]

Host CPU Consumption virtio vs vhost_net


[Chart: Host CPU consumption, virtio vs vhost_net, 8 guests, TCP receive. Y-axis: % total host CPU (lower is better), 0-45, broken down into %usr, %soft, %guest, %sys. X-axis: message size (bytes). The major difference is usr time.]

vhost_net Efficiency
[Chart: 8-guest scale-out receive, vhost_net vs virtio, netperf TCP_STREAM. Y-axis: Mbit per % host CPU (bigger is better), 0-400. X-axis: message size (bytes), 32 bytes to 64KB.]

KVM Network Architecture PCI Device Assignment

Physical NIC is passed directly to the guest
Guest sees the real physical device
Needs the physical device driver
Requires hardware support: Intel VT-d or AMD IOMMU
Lose hardware independence
1:1 mapping of NIC to guest
BTW - this also works on some I/O controllers
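
As an example (the PCI address 06:00.0 is made up; find the real one with lspci), a device is assigned in the guest XML with a hostdev element:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
  </source>
</hostdev>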

KVM Network Architecture Device Assignment


[Diagram: network device assignment]

KVM Network Architecture SR-IOV

Single Root I/O Virtualization
A new class of PCI devices that present multiple virtual devices that appear as regular PCI devices

Guest sees a real physical device
Needs the physical device driver

Requires hardware support
Low overhead, high throughput
No live migration
Lose hardware independence
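
An illustrative sequence for an Intel 82599 (ixgbe) adapter; the max_vfs module parameter and the VF count are driver-specific assumptions:

modprobe -r ixgbe
modprobe ixgbe max_vfs=7
lspci | grep -i "virtual function"    # the VFs appear as new PCI devices

Each VF is then handed to a guest with the same hostdev XML used for plain PCI device assignment.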

KVM Architecture SR-IOV


[Diagram: SR-IOV architecture]

KVM Architecture Device Assignment vs SR-IOV

[Diagram: device assignment vs SR-IOV]

Latency comparison RHEL 6 based methods


[Chart: Network latency by guest interface method (virtio, vhost_net, SR-IOV, host), RHEL 6, guest receive (lower is better). Y-axis: latency (usecs), 0-400. X-axis: message size (bytes). SR-IOV latency is close to bare metal.]

SR-IOV latency close to bare metal

RHEL6 KVM w/ SR-IOV Intel Niantic 10Gb Postgres DB


[Chart: DVD Store Version 2 results, throughput in orders/minute (OPM). 1 Red Hat KVM bridged guest: 69,984 OPM; 1 Red Hat KVM SR-IOV guest: 86,469 OPM, 93% of bare metal; 1 database instance on bare metal: 92,680 OPM.]

Unofficial SPECvirt Tiles vs Tuning - RHEL5.5


[Chart: Impact of Tuning KVM for SPECvirt (not official SPECvirt results). Y-axis: number of Tiles, 0-14. Baseline: 8; Huge Pages: 8.8 (+10%); SR-IOV: 10.6 (+32% over baseline). Based on a presentation by Andrew Theurer at KVM Forum, August 2010.]

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

Block

Recent improvements
Tuning hardware
Choosing an elevator
Choosing the caching model
Tuned / ktune
Device assignment

Block Improvements

ioeventfd performance enhancement
4K sector size (RHEL6)
Windows drivers

MSI support

AIO implementation

I/O Tuning - Hardware

Know your Storage


SAS or SATA? Fibre Channel, Ethernet or SSD?
Bandwidth limits
Multiple HBAs
Device-mapper-multipath provides multipathing capabilities and LUN persistence
How to test
Low-level I/O tools: dd, iozone, dt, etc.

I/O Tuning Understanding I/O Elevators

I/O Elevators of Interest


Deadline
CFQ
Noop

Use deadline on the host
Your mileage may vary; experiment!

I/O Tuning Understanding I/O Elevators

Deadline
Two queues per device, one for reads and one for writes
I/Os dispatched based on time spent in the queue

CFQ
Per-process queue
Each process queue gets a fixed time slice (based on process priority)

Noop
FIFO
Simple I/O merging
Lowest CPU cost

I/O Tuning How to configure I/O Elevators

Boot-time

Grub command line: elevator=deadline|cfq|noop

Dynamically, per device:
echo deadline > /sys/class/block/sda/queue/scheduler
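
For illustration (sda and the grub path are assumptions for a typical RHEL6 install):

cat /sys/class/block/sda/queue/scheduler         # e.g. noop deadline [cfq]
echo deadline > /sys/class/block/sda/queue/scheduler
grep elevator /boot/grub/grub.conf               # confirm the boot-time setting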

Virtualization Tuning I/O elevators - OLTP


[Chart: Performance impact of I/O elevators on an OLTP workload, host running the deadline scheduler. Y-axis: transactions per minute, 0-300K. X-axis: 1, 2, and 4 guests. Series: Noop, CFQ, Deadline.]

Virtualization Tuning I/O Cache

Three Types

Cache=none
Cache=writethrough
Cache=writeback - Not supported

Virtualization Tuning - Caching

Cache=none
I/O from the guest is not cached on the host

Cache=writethrough
I/O from the guest is cached and written through on the host
Potential scaling problems with this option with multiple guests (host CPU used to maintain the cache)

Cache=writeback - Not supported

Virtualization Tuning - Caching

Configure I/O-Cache per disk in qemu command line or libvirt

Virt-manager: dropdown option under Advanced Options
Libvirt XML file: <driver name='qemu' type='raw' cache='none' io='native'/>
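
A minimal sketch of a complete disk definition with these settings (source device and target name are placeholders):

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/mapper/guest-data-lun'/>
  <target dev='vda' bus='virtio'/>
</disk>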

Effect of I/O Cache settings on Guest performance


[Chart: OLTP-like workload on FusionIO storage. Y-axis: transactions per minute, 0-900K. X-axis: 1 guest and 4 guests. Series: cache=writethrough vs cache=none.]

I/O Tuning - Filesystems

RHEL6 introduced barriers


Needed for data integrity
On by default
Can disable on enterprise-class storage

Configure read ahead
Database (parameters to configure read ahead)
Block devices (getra, setra)

Asynchronous I/O
Eliminates synchronous I/O stalls
Critical for I/O-intensive applications

(See the read-ahead / barrier example below)
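
Illustrative commands only (device, filesystem and mount point are assumptions; disable barriers only on storage with a protected write cache):

blockdev --getra /dev/sdb            # current read-ahead, in 512-byte sectors
blockdev --setra 8192 /dev/sdb       # raise read-ahead for streaming I/O
mount -o nobarrier /dev/sdb1 /data   # drop barriers on battery-backed storage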

AIO Native vs Threaded (default)


[Chart: Impact of AIO selection (threaded, the default, vs native) on an OLTP workload with cache=none. Y-axis: transactions per minute, 0-1000K. X-axis: number of users (10U, 20U, x100).]

Configurable per device (only via the XML configuration file)
Libvirt XML file: <driver name='qemu' type='raw' cache='none' io='native'/>

RHEL6 tuned-adm profiles


# tuned-adm list
Available profiles:
- default
- latency-performance
- throughput-performance
- enterprise-storage

Example: # tuned-adm profile enterprise-storage

Recommend enterprise-storage w/ KVM

RHEL6 tuned-adm profiles -default

default

CFQ elevator (cgroup)
I/O barriers on
ondemand power savings
upstream VM settings
4 msec quantum

Example # tuned-adm profile default

RHEL6 tuned-adm profiles - latency-performance

latency-performance

elevator=deadline
power=performance

Example # tuned-adm profile latency-performance

RHEL6 tuned-adm profiles - throughput-performance

throughput-performance

latency-performance settings, plus:
10 msec quantum
readahead 4x
VM dirty_ratio=40

Example # tuned-adm profile throughput-performance

Remember Network Device Assignment ?

Device Assignment

It works for block too!
Device specific
Similar benefits
And drawbacks...

Block Device Passthrough - SAS Workload


[Chart: RHEL6.1 SAS Mixed Analytics workload, bare metal vs KVM. Intel Westmere EP, 12 cores, 24 GB memory, LSI with 16 SAS disks. Y-axis: time to complete (secs), 0-20K, for SAS system and SAS total. Bars: KVM VirtIO (79%), KVM with PCI passthrough (94% of bare metal), and bare metal.]

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

NUMA (Non Uniform Memory Access)

Multi Socket Multi core architecture


NUMA is needed for scaling
RHEL 5 / 6 are completely NUMA aware
Additional performance gains by enforcing NUMA placement

How to enforce NUMA placement
numactl
CPU and memory pinning (see the example below)
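
A hedged illustration of enforcing placement (node and CPU numbers are assumptions taken from the numactl --hardware output on the next slide):

numactl --cpunodebind=1 --membind=1 <command>   # run a process with CPUs and memory on node 1
In the guest XML, the vcpus can likewise be restricted to node 1's cores:
<vcpu cpuset='6-11'>4</vcpu>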

Memory Tuning - NUMA


# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 4 5
node 0 size: 8189 MB
node 0 free: 7220 MB
node 1 cpus: 6 7 8 9 10 11
node 1 size: 8192 MB
...
node 7 cpus: 42 43 44 45 46 47
node 7 size: 8192 MB
node 7 free: 7816 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  16  16  22  16  22  16  22
  1:  16  10  22  16  16  22  22  16
  2:  16  22  10  16  16  16  16  16
  3:  22  16  16  10  16  16  22  22
  4:  16  16  16  16  10  16  16  22
  5:  22  22  16  16  16  10  22  16
  6:  16  22  16  22  16  22  10  16
  7:  22  16  16  22  22  16  16  10

Internode memory distance from the SLIT table
Note the variation in internode distances

NUMA and Huge Pages -Not an issue with THP

Static Huge Page allocation takes place uniformly across NUMA nodes

Make sure that guests are sized to fit

Workaround 1: use Transparent Huge Pages
Workaround 2: allocate Huge Pages, start the guest, then de-allocate the Huge Pages (sketch below)

[Diagram: 128G physical memory across 4 NUMA nodes; an 80G Huge Page pool puts 20G in each NUMA node; a 20GB guest using Huge Pages vs a 20GB guest using NUMA and Huge Pages]
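
A rough sketch of workaround 2 (page counts and guest name are assumptions; pages already backing the running guest stay allocated when the pool is shrunk):

echo 10240 > /proc/sys/vm/nr_hugepages   # 20 GB of 2 MB pages for the guest
virsh start rhel6-guest                  # guest XML requests <hugepages/> backing
echo 0 > /proc/sys/vm/nr_hugepages       # release only the unused part of the pool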

OLTP Workload Effect of NUMA and Huge Pages


[Chart: OLTP workload, multiple instances, effect of NUMA. Y-axis: aggregate transactions per minute, 0-1400K. Enforcing NUMA placement gives roughly 8% over non-NUMA.]

OLTP Workload Effect of NUMA and Huge Pages


[Chart: OLTP workload, multiple instances, effect of NUMA and Huge Pages. Y-axis: aggregate transactions per minute, 0-1400K. Relative to non-NUMA: NUMA +8%, non-NUMA with Huge Pages +12%, NUMA with Huge Pages +18%.]

RHEL5.5 KVM Huge Pages + CPU Affinity


[Chart: Huge Pages vs Huge Pages + vcpu affinity, four RHEL5.5 guests using libvirt. Y-axis: aggregate transactions per minute, 0-700K. X-axis: number of users (10U, 20U, 40U, x100). The Huge Pages-only series levels off at higher user counts.]

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

CPU Performance

Improvements
CPU type
CPU topology
CPU pinning - affinity

CPU Improvements

RCU for the KVM module locks

we scale to 64 vcpus!

Dynamically cancelled ticket spinlocks
User return notifiers added in the kernel
x2apic
Uses MSR access to limit MMIO accesses to the irq chip

Specifying Processor Details

Mixed results with CPU type and topology
Experiment and see what works best in your case (example below)
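
One hedged example of specifying the CPU model and topology in the guest XML (the model name and counts here are assumptions; measure before standardizing on any of them):

<cpu match='exact'>
  <model>Westmere</model>
  <topology sockets='1' cores='4' threads='1'/>
</cpu>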

CPU Pinning - Affinity

Virt-manager allows selection based on NUMA topology

True NUMA support in the works

Virsh pinning allows finer grain control

1:1 pinning

Good gains with pinning
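
An illustrative 1:1 pinning of a 4-vcpu guest onto one node's cores with virsh (guest name and core numbers are assumptions):

virsh vcpupin rhel6-guest 0 6
virsh vcpupin rhel6-guest 1 7
virsh vcpupin rhel6-guest 2 8
virsh vcpupin rhel6-guest 3 9
virsh vcpuinfo rhel6-guest      # confirm the affinity took effect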

Unofficial SPECvirt Tiles vs Tuning - RHEL5.5


[Chart: Impact of Tuning KVM for SPECvirt (not official SPECvirt results). Y-axis: number of Tiles, 0-14. Baseline: 8; Huge Pages: 8.8 (+10%); SR-IOV: 10.6 (+32%); Node Binding: 12 (+50% over baseline). Based on a presentation by Andrew Theurer at KVM Forum, August 2010.]

Performance monitoring tools

Monitoring tools

top, vmstat, ps, iostat, netstat, sar, perf

Kernel tools
/proc, sysctl, AltSysrq

Networking
ethtool, ifconfig

Profiling
oprofile, strace, ltrace, systemtap, perf

Agenda

What is SPECvirt
A quick, high level overview of KVM
Low Hanging Fruit
Memory
Networking
Block I/O basics
NUMA and affinity settings
CPU Settings
Wrap up

Wrap up
KVM can be tuned effectively

Understand what is going on under the covers
Turn off stuff you don't need
Be specific when you create your guest
Look at using NUMA or affinity
Choose appropriate elevators (Deadline vs CFQ)
Choose your cache wisely

For More Information Other talks

Performance Analysis & Tuning of Red Hat Enterprise Linux - Shak and Larry, Campground 2
Part 1 - Thurs 10:20
Part 2 - Thurs 11:30

Tuning Your Red Hat System for Databases - Sanjay Rao, Fri 9:45

KVM Performance Optimizations - Rik van Riel, Thurs 4:20

For More Information

KVM Wiki

http://www.linux-kvm.org/page/Main_Page

irc, email lists, etc.
http://www.linux-kvm.org/page/Lists%2C_IRC

libvirt Wiki
http://libvirt.org/

New, revamped edition of the Virtualization Guide - should be available soon!
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/index.html

For More Information

Reference Architecture Website

https://access.redhat.com/knowledge/refarch/TBD

Andrew Theurer's original presentation
http://www.linux-kvm.org/wiki/images/7/7f/2010-forum-perf-andscalability-server-consolidation.pdf

SPECvirt Home Page
http://www.spec.org/virt_sc2010/

Principled Technologies
http://www.principledtechnologies.com/clients/reports/Red%20Hat/Red%20Hat.htm

