Sharing IOMMU Page Tables With TDP in KVM
Lu Baolu
Zhao Yan
Tian Kevin
Sep. 2021
Disclaimers
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information provided here is
subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and
roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. No product or component can be absolutely secure.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.
Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© Intel Corporation
Agenda
• Goal
• Sharing Advantages
• Sharing Prerequisites
• Sharing Interfaces
• Page & Page Table Pinning
• Shared Page Table Root Update
• Bootup Performance
• TODOs
Goal
[Figure: IOPT and TDP each translating to HPA, alongside a single TDP translating to HPA.]
The Same Address Space
• QEMU
– KVM side
• check that TDP is enabled
• check that the vCPU model does not include the EPT/NPT feature
– IOMMU side
• no vIOMMU, or
• the vIOMMU is not in shadow mode (nested mode on GPA is OK)
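The checks above can be sketched as a single predicate. This is a minimal illustration only: `VIommuMode`, `VmConfig`, and `same_address_space` are hypothetical names made up for this sketch, not QEMU or KVM interfaces.

```python
from dataclasses import dataclass
from enum import Enum


class VIommuMode(Enum):
    """Hypothetical vIOMMU configurations, named after the slide's bullets."""
    NONE = "none"              # no vIOMMU exposed to the guest
    NESTED_ON_GPA = "nested"   # vIOMMU in nested mode on top of GPA
    SHADOW = "shadow"          # vIOMMU emulated in shadow mode


@dataclass
class VmConfig:
    tdp_enabled: bool           # KVM side: TDP (EPT/NPT) is in use
    vcpu_exposes_ept_npt: bool  # KVM side: vCPU model includes EPT/NPT
    viommu_mode: VIommuMode     # IOMMU side


def same_address_space(cfg: VmConfig) -> bool:
    """Return True when the CPU and IOMMU sides can use one address space."""
    kvm_ok = cfg.tdp_enabled and not cfg.vcpu_exposes_ept_npt
    iommu_ok = cfg.viommu_mode is not VIommuMode.SHADOW
    return kvm_ok and iommu_ok
```

A shadow-mode vIOMMU fails the check because the DMA address space is then no longer the GPA space that TDP translates.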
The Same Address Space (Cont.)
• Nested VM
– TBD currently
• SMM in x86
– A different address space. Cannot be shared with the IOMMU.
– The non-SMM mode EPT must be kept for sharing while the vCPU is in SMM mode.
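The SMM rule can be sketched as a root-selection helper: when the vCPU switches to an SMM root, the IOMMU keeps the previously shared non-SMM root. `Root` and `root_for_iommu` are hypothetical names for illustration and do not correspond to KVM symbols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Root:
    hpa: int    # host physical address of the page table root
    smm: bool   # True if this root describes the SMM address space


def root_for_iommu(current_shared: Root, new_vcpu_root: Root) -> Root:
    """Pick the root the IOMMU should use after the vCPU changes roots.

    An SMM root is a different address space, so it is never handed to
    the IOMMU; the existing non-SMM root stays in place instead.
    """
    if new_vcpu_root.smm:
        return current_shared
    return new_vcpu_root
```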
Compatible Page Table Formats
• Local APIC
– DMA write to 0xfeexxxxx doesn’t go through DMA remapping.
• Read/Write/Execute bit
– RO for RO memslots
– RW for other memslots
– Execute bit
• currently ignored in IOMMU and no device uses it.
– Write protection for live migration
• Allowed when IO page fault is supported
• Must be disabled otherwise; dirty tracking then relies on either
– treating all pinned ranges as dirty, or
– traversal for the Dirty bit
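The format rules above can be sketched as a few small helpers. All names here (`Perm`, `iommu_perm`, `bypasses_dma_remapping`, `dirty_tracking_options`) are hypothetical and exist only for this illustration; the address window is taken from the 0xfeexxxxx pattern on the slide.

```python
from enum import Flag, auto


class Perm(Flag):
    R = auto()
    W = auto()
    X = auto()


def iommu_perm(memslot_readonly: bool) -> Perm:
    """RO for RO memslots, RW for other memslots; Execute is ignored."""
    return Perm.R if memslot_readonly else Perm.R | Perm.W


def bypasses_dma_remapping(addr: int, is_write: bool) -> bool:
    """DMA writes to 0xfeexxxxx do not go through DMA remapping."""
    return is_write and 0xFEE00000 <= addr <= 0xFEEFFFFF


def dirty_tracking_options(io_page_fault_supported: bool) -> tuple:
    """List the dirty-tracking choices for live migration."""
    if io_page_fault_supported:
        return ("write_protect",)
    # Without IO page fault, write protection must stay disabled.
    return ("all_pinned_dirty", "scan_dirty_bit")
```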
Sharing Interfaces
• Request/stop sharing
• Page/page table pinning for DMAs without IO page fault
• Page fault for IO page fault support
[Figure: QEMU, IOMMU (IOASID), and KVM (TDP). Labels: "Attach/Detach Device to IOASID shared with TDP", "Request sharing with pin mode & notification callbacks", "pin/unpin", "page fault", "IO page fault", "Page table content update notification".]
[Figure: a 2M page entry and a page table of 512 entries (0 … 511), with numbered steps 1 to 3. Labels: "rmap remove", "rmap add", "2M page".]
TDP entry being atomically updated from non-zero value to another non-zero value.
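The update case named in the caption can be sketched as a table that calls registered callbacks when an entry goes from one non-zero value to a different non-zero value. This is an assumption-laden illustration: `SharedTable`, `register`, and `set_entry` are hypothetical names, and the source does not say whether other transitions (for example zero to non-zero) also notify, so this sketch leaves them silent.

```python
from typing import Callable, Dict, List

# Callback signature (hypothetical): (gfn, old_entry, new_entry) -> None
UpdateCallback = Callable[[int, int, int], None]


class SharedTable:
    def __init__(self) -> None:
        self.entries: Dict[int, int] = {}
        self.callbacks: List[UpdateCallback] = []

    def register(self, cb: UpdateCallback) -> None:
        """Register a page table content update notification callback."""
        self.callbacks.append(cb)

    def set_entry(self, gfn: int, new: int) -> None:
        old = self.entries.get(gfn, 0)
        self.entries[gfn] = new  # stands in for the atomic update
        # Only the non-zero -> different non-zero case is modeled here.
        if old != 0 and new != 0 and old != new:
            for cb in self.callbacks:
                cb(gfn, old, new)
```

A listener on the IOMMU side could use such a callback to invalidate translations cached for the old entry; that use is also an assumption of this sketch.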
Page & Page Table Pinning Interfaces
• For sharing without IO page fault,
– Pinning of all ranges in user memslots: memslot add
– Pinning a specific range: extra interface
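The two pinning paths above can be sketched with a per-page pin counter. `PinTracker`, its `pin_all` flag, and its method names are hypothetical and chosen for this illustration; they are not the interfaces proposed in the talk.

```python
from collections import Counter


class PinTracker:
    """Tracks pinned guest frame numbers when IO page fault is unavailable."""

    def __init__(self, pin_all: bool) -> None:
        self.pin_all = pin_all      # True: pin every user memslot on add
        self.pin_count = Counter()  # gfn -> number of outstanding pins

    def pin_range(self, base_gfn: int, npages: int) -> None:
        """Extra interface: pin one specific range."""
        for gfn in range(base_gfn, base_gfn + npages):
            self.pin_count[gfn] += 1

    def unpin_range(self, base_gfn: int, npages: int) -> None:
        for gfn in range(base_gfn, base_gfn + npages):
            if self.pin_count[gfn] > 0:
                self.pin_count[gfn] -= 1
            if self.pin_count[gfn] == 0:
                del self.pin_count[gfn]

    def on_memslot_add(self, base_gfn: int, npages: int) -> None:
        """Memslot add: pin the whole slot only in pin-all mode."""
        if self.pin_all:
            self.pin_range(base_gfn, npages)

    def is_pinned(self, gfn: int) -> bool:
        return self.pin_count[gfn] > 0
```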
[Figure: shared page table root update. Labels: "root_count--", "If !role.smm, root_count++" (step 4).]
• Performance optimization
– Page table root update reduction,
– Huge page support for P2P, etc.
Why KVM Manages the Shared Table
• The CPU side has more restrictions on page size
– Check guest MTRR
– NX huge page workaround
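A rough sketch of why the CPU side decides the mapping size: both restrictions above can force a smaller page than the host backing would allow. The level constants, the function `max_mapping_level`, and the exact conditions are assumptions made for illustration and are simplified relative to whatever KVM actually does.

```python
# Hypothetical mapping levels, smallest to largest.
LEVEL_4K, LEVEL_2M, LEVEL_1G = 1, 2, 3


def max_mapping_level(host_level: int,
                      mtrr_uniform_in_range: bool,
                      nx_huge_page_workaround: bool,
                      executable: bool) -> int:
    """Return the largest level the CPU side would accept (simplified)."""
    level = host_level
    # Assumption: a huge mapping needs one guest MTRR type across the range.
    if not mtrr_uniform_in_range:
        level = LEVEL_4K
    # Assumption: with the NX huge page workaround enabled, executable
    # mappings are kept at 4K.
    if nx_huge_page_workaround and executable:
        level = LEVEL_4K
    return level
```

Because the IOMMU walks the same table, it inherits whatever level the CPU side settles on.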
[Figure: overall flow among /dev/vfio/devices/dev[n] (Device[n]), /dev/iommu (IOASID), and the anon_inode:kvm-vm (TDP, KVM_EPT_LEVEL_4), with numbered steps 1 to 5 including 4a, 4b, and 4c. Labels: "1 Alloc IOASID", "2 Attach/Detach IOASID", "2 Request/Stop sharing", "Pin/Unpin", "Pin/Unpin on memslot create/delete", "MAP_DMA/UNMAP_DMA", "page fault", "IO page fault", "notification of root/map update".]