OpenStack performance optimization

NUMA, Large pages & CPU pinning

Daniel P. Berrangé <[email protected]>

KVM Forum 2014: Düsseldorf
About me
● Contributor to multiple virt projects
● Libvirt Developer / Architect 8+ years
● OpenStack contributor 2 years
● Nova Core Team Reviewer
● Focused on Nova libvirt + KVM integration

Talk Structure
● Introduction to OpenStack
● NUMA config
● Large page config
● CPU pinning
● I/O devices

What is OpenStack?
● Public or private cloud
● Multiple projects (compute, network, block storage, image storage, messaging, ...)
● Self-service user API and dashboard

What is OpenStack Nova?
● Execution of compute workloads
● Virtualization agnostic
– Libvirt (KVM, QEMU, Xen, LXC), XenAPI, Hyper-V, VMware ESX, Ironic (bare metal)
● Concepts
– Flavours, instances, image storage, block storage, network ports

Nova approach
● Cloud infrastructure administrators
– Flavours for VM instance policy
– Minimal host provisioning / setup
– No involvement in per-VM setup
● Guest instance users
– Preferences via image metadata
– No visibility of compute hosts / hardware

Nova architecture (simplified)
[Architecture diagram: an HTTP REST API fronts nova-api; nova-api, nova-scheduler, nova-conductor and nova-compute communicate over AMQP; nova-conductor mediates database access; nova-compute drives the hypervisor via Libvirt+KVM]

Current VM scheduling
● VM scheduler has multiple filters
● Filters applied to pick compute host
● Overcommit of RAM and CPUs
● VMs float across shared resources
● Assignment of I/O devices (PCI)
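
Which filters run is configuration driven. A minimal sketch of enabling the NUMA-aware filter discussed later in this talk, assuming a Nova release that includes this work (the surrounding filter list is illustrative, not a recommendation):

# /etc/nova/nova.conf (illustrative filter set)
[DEFAULT]
scheduler_default_filters = RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,NUMATopologyFilter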

Scheduling goals
● Motivation: Network function virtualization (NFV)
– Support “dedicated resource” guest
– Support predictable / low latency
● Motivation: Maximise hardware utilization
– Avoid inefficient memory access on NUMA

NUMA
● Factors for placement
– Memory bandwidth & access latency
– Cache efficiency
– Locality of I/O devices
● Goal – small guests
– Fit entirely within single host node
● Goal – large guests
– Define virtual NUMA topology
– Fit each guest node within single host node

libvirt host resource info
<capabilities>
  <host>
    <topology>
      <cells num='2'>
        <cell id='0'>
          <memory unit='KiB'>4047764</memory>
          <pages unit='KiB' size='4'>999141</pages>
          <pages unit='KiB' size='2048'>25</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='20'/>
          </distances>
          <cpus num='4'>
            <cpu id='0' socket_id='0' core_id='0' siblings='0'/>
            <cpu id='1' socket_id='0' core_id='1' siblings='1'/>
            <cpu id='2' socket_id='0' core_id='2' siblings='2'/>
            <cpu id='3' socket_id='0' core_id='3' siblings='3'/>
          </cpus>
        </cell>
        <cell id='1'>....
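
For reference, this is the XML libvirt reports on the compute host; it can be retrieved with:

$ virsh capabilities

(the <topology> element appears under <host>).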

Nova NUMA config
● Property for number of guest nodes
– Default: 1 node
– hw:numa_nodes=2
● Property to assign vCPUS/RAM to guest nodes
– Assume symmetric by default
– hw:numa_cpu.0=0,1
– hw:numa_cpu.1=2,3,4,5
– hw:numa_mem.0=500
– hw:numa_mem.1=1500
● NO choice of host node assignment
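
These properties are flavour extra specs set by the cloud administrator. A minimal sketch with the nova client, using a hypothetical flavour name m1.numa:

$ nova flavor-key m1.numa set hw:numa_nodes=2
$ nova flavor-key m1.numa set hw:numa_cpu.0=0,1 hw:numa_cpu.1=2,3,4,5
$ nova flavor-key m1.numa set hw:numa_mem.0=500 hw:numa_mem.1=1500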

NUMA impl
● Scheduling
– Hosts NUMA topology recorded in DB
– VM Instance placement recorded in DB
– Filter checks host load to identify target
– Scheduler records NUMA topology in DB
– Compute node starts VM with NUMA config

libvirt NUMA config
● vCPUs pinned to the pCPUs of specific host NUMA nodes
● vCPUs float across the pCPUs within each node
● Emulator threads pinned to the union of all vCPU pinnings

<vcpu placement='static'>6</vcpu>
<cputune>
  <vcpupin vcpu="0" cpuset="0-1"/>
  <vcpupin vcpu="1" cpuset="0-1"/>
  <vcpupin vcpu="2" cpuset="4-7"/>
  <vcpupin vcpu="3" cpuset="4-7"/>
  <vcpupin vcpu="4" cpuset="4-7"/>
  <vcpupin vcpu="5" cpuset="4-7"/>
  <emulatorpin cpuset="0-1,4-7"/>
</cputune>

Libvirt NUMA config
● vCPUs + RAM regions assigned to guest NUMA nodes
● RAM in each guest NUMA node pinned to a host NUMA node

<memory>2048000</memory>
<numatune>
  <memory mode='strict' nodeset='0-1'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
<cpu>
  <numa>
    <cell id='0' cpus='0,1' memory='512000'/>
    <cell id='1' cpus='2,3,4,5' memory='1536000'/>
  </numa>
</cpu>

Large pages
● Factors for usage
– Availability of pages on hosts
– Page size vs RAM size
– Lack of overcommit
● Goals
– Dedicated RAM resource
– Maximise TLB efficiency

Large page config
● Property for page size config
– Default to small pages (for overcommit)
– hw:mem_page_size=large|small|any|2MB|1GB
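
A minimal sketch of setting this on a hypothetical flavour m1.largepage:

$ nova flavor-key m1.largepage set hw:mem_page_size=large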

Large page impl
● Scheduling
– Cloud admin sets up host group
– NUMA record augmented with large page info
– Filter refines NUMA decision for page size
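
The host-side page reservation is outside Nova's control. A minimal sketch of reserving 2 MiB pages at runtime on a compute host (1 GiB pages generally need to be reserved via kernel boot parameters instead):

# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages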

libvirt large page config
● Page size set for each guest NUMA node

<memoryBacking>
  <hugepages>
    <page size='2' unit='MiB' nodeset='0-1'/>
    <page size='1' unit='GiB' nodeset='2'/>
  </hugepages>
</memoryBacking>

CPU pinning
● Factors for usage
– Efficiency of cache sharing
– Contention for shared compute units
● Goals
– Prefer hyperthread siblings for cache benefits
– Avoid hyperthread siblings for workload independence
– Dedicated CPU resource

CPU pinning config
● Property for dedicated resource
– hw:cpu_policy=shared|dedicated
– hw:cpu_threads_policy=avoid|separate|isolate|prefer
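
A minimal sketch of requesting dedicated CPUs on a hypothetical flavour m1.pinned:

$ nova flavor-key m1.pinned set hw:cpu_policy=dedicated
$ nova flavor-key m1.pinned set hw:cpu_threads_policy=prefer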

CPU pinning impl
● Scheduling
– Cloud admin sets up host group
– NUMA info augmented with CPU topology
– Filter refines NUMA decision with topology

libvirt CPU pinning config
● Strict 1-to-1 pinning of vCPUs <-> pCPUs
● Emulator threads pinned to dedicated CPU

<cputune>
  <vcpupin vcpu="0" cpuset="0"/>
  <vcpupin vcpu="1" cpuset="1"/>
  <vcpupin vcpu="2" cpuset="4"/>
  <vcpupin vcpu="3" cpuset="5"/>
  <vcpupin vcpu="4" cpuset="6"/>
  <vcpupin vcpu="5" cpuset="7"/>
  <emulatorpin cpuset="2"/>
</cputune>

I/O devices
● Factors for usage
– Locality of PCI device to NUMA node
– Connectivity of PCI network interface
● Goals
– Assign PCI device on local NUMA node
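
For context, PCI device assignment is requested via a device alias on the flavour. A minimal sketch, where the alias name "niantic", the flavour m1.pci and the vendor/product IDs are purely illustrative:

# /etc/nova/nova.conf on the compute host
pci_passthrough_whitelist = {"vendor_id":"8086","product_id":"10fb"}
pci_alias = {"vendor_id":"8086","product_id":"10fb","name":"niantic"}

$ nova flavor-key m1.pci set "pci_passthrough:alias"="niantic:1"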

Libvirt device info
<device>
  <name>pci_0000_80_16_7</name>
  <path>/sys/devices/pci0000:80/0000:80:16.7</path>
  <capability type='pci'>
    <domain>0</domain>
    <bus>128</bus>
    <slot>22</slot>
    <function>7</function>
    <product id='0x342c'>5520/5500/X58 Chipset QuickData Technology</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <iommuGroup number='25'>
      <address domain='0x0000' bus='0x80' slot='0x16' function='0x0'/>
    </iommuGroup>
    <numa node='1'/>
    <pci-express/>
  </capability>
</device>
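
This XML comes from libvirt's node device API and can be viewed with:

$ virsh nodedev-dumpxml pci_0000_80_16_7

The <numa node='1'/> element identifies the device's host NUMA node.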

I/O device impl
● Scheduling
– Hosts record locality of PCI devices in DB
– Filter refines NUMA decision for device
● Guest config
– TBD: Tell guest BIOS NUMA locality of PCI dev

http://libvirt.org - http://openstack.org

https://wiki.openstack.org/wiki/VirtDriverGuestCPUMemoryPlacement
http://people.redhat.com/berrange/kvm-forum-2014/
