Getting Started Developing vSphere IO Filter Solutions

Version 1.0

VMware Confidential and Proprietary


Copyright © 2016 VMware, Inc. All rights reserved. Copyright and trademark information.

VMware, Inc.
3401 Hillview Ave.
Palo Alto, CA 94304
www.vmware.com



Contents

Preface 7
Acknowledgements and Credits 7
Legal Access Notice 7

1 Course Overview 9
About This Course 9
Understand the Course's Prerequisites 9
Understand the Course's Strategic Objectives 10
Understanding the Course's Tactical Objectives 10
Understanding the Course's Organization 12
Chapter Summary 12

2 Overview of vSphere IO Filters: Purpose, Architecture / IO Flows and Components 13
Understanding the Purpose and High-Level Attributes of vSphere IO Filter Solutions 14
Understanding VAIO General and Supported Use Cases 15
Features and Flow of vSphere IO Filters 16
Online and Offline Filtering 17
Understanding IO Flows to VMDKs and Back to Requestor on vSphere, Without IO Filters 17
Understanding the role of the vSCSI Module in IO Flows 19
Understanding the Role of the POSIX Object Layer in IO Flows 19
Understanding the Role of the File System Switch (FSS) Module in IO Flows 19
Understanding the Role of the Buffer Cache Module in IO Flows 20
Understanding IO Flows to VMDKs and Back to Requester on vSphere With IO Filters 20
Understanding the Role of the SecretSauceMod Kernel Modules in IOs with IO Filters 21
Understanding the Role of the SecretSauce Library (SSLib) in IOs With IO Filters 22
Understanding the Role of an IO Filter Solution's Library and Library Instances (LIs) in IO
Flows with IO Filters 23
Understanding Filter-Private Data: Instance Data and Sidecars 24
Understanding the Role of an IO Filter Daemon in IO Filter Solutions 25
Understanding the Role of CIM Providers in IO Filter Solutions 26
Understanding IO Filters Impact on Clusters 27
Understanding the Role of vSphere Web Client (VWC) Plugins in IO Filter Solutions 28
Summarizing the Architecture of a vSphere IO Filter Solution 28
Chapter Summary 29
Review Questions - Overview of vSphere IO Filters - Purpose, IO Flows, Architecture, Components
and Key Concepts 29

3 Deploying and Testing an IO Filter Development Environment 31


Understanding the Development Environment and its Requirements 32
32 and 64-bit Architectures in IO Filter Solutions 33


Deploying the vSphere API for IO Filters SDK (VAIODK) 33


Downloading and Unwrapping the VAIODK 34
Installing the VAIODK using VMware Workbench 35
Understanding the Results of a Successful VAIODK Install 43
Removing the VAIODK 47
Understanding a Filter's Directory Contents and Building a VAIODK Sample Filter 47
Understanding the Results of a Successful Build 48
Testing an IO Filter 49
Placing an ESXi host in Community Supported Mode 50
Understanding Logging in an IO Filter 50
Testing an IO Filter with an ESXi Host 52
Testing with a Cluster 67
Chapter Summary 84
Review Question Answers - Deploying and Testing an IO Filter Development Environment 84

4 Creating a Basic IO Filter Solution 87


Understanding Strategies for Starting IO Filter Development 88
Creating a Build Folder 88
Adding a Makefile to Your Build Folder 88
How to Build a CIM Provider 89
Creating and Populating a Correct Scons File 89
Creating and Populating a Correct .json File 98
How to properly set DAEMON MEMORY RESERVATION 100
Creating a Skeletal Filter Library Component Source File and Understanding its Entry-points /
Callbacks 101
Understanding and Invoking the VMIOF_DEFINE_DISK_FILTER Macro 101
Understanding VMIOF_Status Results for Functions in the VAIO 102
Understanding and Defining diskAttach and diskDetach Callbacks 103
Understanding and Defining diskOpen and diskClose Callbacks 104
Understanding and Defining the diskSnapshot Callback 104
Understanding and Defining the diskCollapse Callback 105
Understanding and Defining the diskDeleteBlock* Callbacks 105
Understanding and Defining a diskRequirements Callback 106
Understanding and Defining the diskClone Callback 106
Understanding and Defining the diskVmMigration Callback 107
Understanding and Defining the diskIOStart Callback 107
Understanding and Defining the diskIOAbort Callback 108
Understanding and Defining the diskIOsReset Callback 108
Understanding and Defining the diskStun and diskUnStun Callbacks 108
Understanding and Defining the diskRelease Callback 109
Understanding and Defining the diskGrow Callback 109
Understanding and Defining the diskExtentGetPre and diskExtentGetPost Callbacks 109
Understanding and Defining the diskProperties{Valid,Set,Get,Free} Callbacks 110
A Summary Table of Filter Library Callbacks 111
Manipulating per-VMDK Filter Properties using vmkfstools 112
Creating a Skeletal Daemon Component 113
Understanding and Invoking the VMIOF_DEFINE_DAEMON Macro 113
Understanding and Defining the Daemon Start Callback 113
Understanding and Defining the Daemon stop Callback 114


Understanding and Defining the Daemon cleanup Callback 115


Creating a Skeletal CIM Provider 116
Understanding How to Localize your Solution 116
Debugging IO Filter Core Dumps 118
Live Debugging 120
Chapter Summary - Creating a Basic IO Filter Solution 127
Review Questions 127

5 Fleshing IO Filter Library Component Source 129


Understanding and Using IO Filter Utility Functions Common to Most Solutions 130
Managing Memory in an IO Filter Solution 130
Using Sidecars Functions in Library Code to Keep Persistent Per-VMDK Meta Data 137
Understanding and Using the IO Filters Polling Functions 141
Understanding and Using Filter-Private Data Functions to Keep Non-Persistent Per-VMDK
Meta Data 145
Getting the UUID of a VMDK 149
Getting the ESX Version Info 150
Using the IO Filter Failure Reporting Functionality 151
Understanding and Processing Primary Disk Events 153
Understanding and Using a diskRequirements Callback 153
Understanding and Processing DiskAttach Events 155
Understanding and Processing DiskDetach Events 158
Understanding and Processing diskOpen Events 159
Understanding and Processing DiskClose Events 165
Understanding and Processing IO Events 167
Understanding the DiskIO Structure 167
Understanding and Processing diskIOStart Events 169
Understanding and Processing the diskDeleteBlocks* Events 178
Understanding and Processing IO Abort (Cancel) Events (diskIOAbort()) 180
Understanding and Processing Disk IO Reset Requests (diskIOsReset()) 181
Overview of VMIOF IO Utility Functions 181
Understanding and Using the IO Filters VMIOF_DiskIODup Function 182
Understanding and Using the IO Filters VMIOF_DiskIOAlloc Function 182
Understanding and using the IO Filters VMIOF_DiskIOSubmit Function 183
Understanding and Using the VMIOF_DiskIOFree Function 183
Optimizing the Performance of Your IO Filter Solution 184
Understanding and Using the IO Filters CrossFD Functions 184
Understanding and Using the IO Filters AIO Functions (for cache file IO) 186
Understanding and Processing Snapshot-related Events 192
Understanding and Processing a Snapshot Event 193
Reverting a Snapshot 198
Understanding and Processing a DiskCollapse Event 201
Understanding and Processing the diskExtentGet* Events 204
Using the Extent Scanning Functions 206
Understanding and Processing Other IO Filter Events 208
Understanding and Processing diskStun and diskUnstun Events 208
Understanding and Processing a DiskClone Event 210
Understanding and Processing an xMigration (diskVmMigration) Event 216
Understanding and Processing a DiskRelease Event 228


Understanding and Using the VMIOF_VirtualDisk* Functions 231


Understanding and Processing a diskGrow Event 233
Understanding and Processing diskProperties{Valid,Set,Get,Free} Events 237
Understanding and Using the VMIOF SCSI Functions 242
Chapter Summary 250
Review Questions - Fleshing IO Filter Library Component Source 251

6 Using Cache Solution-Specific Functions 253


Understanding and Using the VMIOF_Cache*() Functions 253
Updating the Dirty State of the VMDK using VMIOF_DiskContentsDirtySet() 257
Chapter Summary 257
Review Questions - Using Cache Solution-Specific Functions 258

7 Understanding Additional Rules Tips and Tricks 259


Understanding Rules for Using pthread_create() 260
Understanding an Example of Why You Should Not Use pthread_create() in Library Code 260
Blocking Rules in Poll and Timer Callbacks and WorkGroup Functions 264
Bundle Deployment URL lifetime 264
Troubleshooting vSphere Cluster and Filter Build Configuration (Checklist) 265
Using the RAMDisk 265
Holding a Lock across calls 265
Understanding Sidecar Functions Invocations 266
Heap Debugging 266
Debugging With Different Versions of ESX 268
Using the realtime clock functions 269
Using Libraries in your Library Instance and Daemon 269
Development Tips 270
Frequently Asked Questions 270
Chapter Summary 272

Index 275



Preface

Acknowledgements and Credits


This course was a collaborative effort of several people in various roles. It is appropriate to give them credit:

- Course architect and project manager: Matthew Thurmaier

- Content developers (in alphabetical order): Nagib Gulam, Vasuki Nagaraju, Kamal Prasad, Deepa
  Seshadri, Matthew Thurmaier, Jinlu Yu

- Content reviewers (in alphabetical order): Adrian Drzewiecki, Bob Seigman, Christoph Klee, Dworkin
  Muller, Jesse Pool, Nasser Shayesteh, Nishant Yadav, Suraj Swaminathan, Rohit Jog, Zahed Khurasani.

  These people gave significant feedback and corrections. To the extent that there are errors still in the
  course, they are the responsibility of Vasuki, Nagib and Matthew, not these engineers.

- Editors: Matthew Thurmaier, Vasuki Nagaraju and Nagib Gulam.

The course's content developers received significant help from the IO Filter engineering team and the project
and program managers including Alex Jauch and Nasser Shayesteh.

The entire team hopes you find the course valuable and enjoyable.

Legal Access Notice


In addition to other copyright notices on this document, the content of this document is proprietary and
confidential to VMware, Inc. Access to this document is restricted to partners who have signed an NDA for,
and are members of, the VMware API for IO Filters (VAIO) program. Use and distribution of this document, in
whole or part, is governed by the terms of the VAIO program agreement.



1 Course Overview

Chapter Objectives and Topics

To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:

- Define the purpose of this course

- List the course's prerequisites

- List the course's objectives

- List the course's chapters and appendices

This chapter includes the following topics:

- “About This Course,” on page 9

- “Understand the Course's Prerequisites,” on page 9

- “Understand the Course's Strategic Objectives,” on page 10

- “Understanding the Course's Tactical Objectives,” on page 10

- “Understanding the Course's Organization,” on page 12

- “Chapter Summary,” on page 12

About This Course


VMware has created a new feature for vSphere called IO Filters, which allows the interception of IO
requests between the requestor and a virtual disk. VMware also created a new SDK that allows partners to
create IO Filters. This course teaches the concepts and tasks a partner needs to create their own IO
Filters.

Understand the Course's Prerequisites


Students must have certain skills and knowledge before attending this course. Failure to have the required
knowledge and skills will make successful completion of the course very difficult, if not impossible.

The prerequisites for this course are:

- C programming language fluency and experience, including compiling programs, solving syntax
  issues, etc.

- Experience using the gdb debugger, for example for post-mortem analysis of core dumps

- Experience using a Linux shell and text editors (vi/vim, Emacs, etc.)

- Knowledge of how to deploy and manage a basic vSphere development cluster as presented in the VMware
  Fundamentals for Developers course, including:

    - Deploying ESXi and vCenter Server

    - Creating a cluster with DRS enabled

    - Adding hosts to the cluster

    - Deploying VMs in the cluster

    - Reconfiguring VMs, specifically to add / remove virtual disks

- Experience installing and removing VIBs from ESXi instances, including moving ESXi into and out of
  maintenance mode, all from the ESXi CLI

In addition, it is advantageous if students have experience:

- Using storage policies in the vSphere Web Client (VWC)

- Using the Managed Object Browser (MOB) with a vCenter Server

- Using the Remote System Explorer (RSE) in Eclipse, especially the VMware customized RSE in VMware
  Workbench

Understand the Course's Strategic Objectives


The purpose of the course is to provide software engineers responsible for developing vSphere IO Filters
with the information they need to accomplish that task. The course is designed to give developers the big
picture of a complete IO Filter Solution, along with the concepts and skills necessary to build that
solution's Library and Daemon components. Successful completion of this course should help solution
developers get their product to market faster than those who do not take the course.

Understanding the Course's Tactical Objectives


To achieve the strategic objective of this course, by the end of this course, you should be able to:

- Define the purpose of this course

- List the course's prerequisites

- List the course's objectives

- List the course's chapters and appendices

- Understand the flow of an IO between a guest OS and a Virtual Machine Disk (VMDK), with and
  without an IO Filter, including understanding the role of:

    - The VMX

    - The Virtual SCSI (vSCSI), POSIX Object Layer (POL), File System Switch (FSS), and SSMod kernel
      modules

    - SSLib in the VMX and user-space cartels

    - IO Filter Library Instances and Daemons

- Understand the purpose of vSphere IO Filters

- Understand the purpose of IO Filter daemons in IO Filter Solutions

- Understand the role of CIM providers and VWC plug-ins in IO Filter Solutions

- Diagram the architecture of IO flows when vSphere IO Filters are used


- Understand issues related to IO Filters and vSphere clusters

- Define the terms filter-private data, instance data and sidecar

- Understand the use of sidecars and instance data in IO Filter Solutions

- Understand the difference between on-line and off-line IO Filter operations

- List the components of a minimal environment for developing and testing IO Filter Solutions

- Deploy the VAIODK to a supported development platform, including:

    - Locating and downloading the VAIODK

    - Installing the VAIODK

- List the key folders, files, and tools created by installing the VAIODK, including the sample filters

- Remove the VAIODK

- Build a sample filter and list the results of a successful build

- Deploy an IO Filter to ESXi systems in a cluster

- Test an IO Filter, including:

    - Creating a new VMDK for a VM

    - Adding an IO Filter to a VMDK, using the ESXi shell and VWC

    - Generating IOs to the VMDK

    - Viewing filter log messages in the appropriate log files

- Specify where to create your build folder on your build system, and why

- Create a Makefile to build your Solution

- Create appropriate entries in your Solution's SCons and JSON files, based on the plans for your
  Solution

- Create minimal source files for the library and daemon components of your Solution

- Define the prototype for each of the callbacks in a library and daemon component, and when the IO
  Filters Framework invokes each

- Use sidecars in an IO Filter Solution

- Use the IO Filter timer functions in a Solution

- Use the IO Filter polling functions in a Solution

- Process IO transactions in a Library Instance, including:

    - Understanding the data structures used in IO transactions

    - Creating new and duplicating existing IO transactions

    - Submitting new and duplicated IO transactions to the framework

- Process xMigration events in an IO Filter Solution

- Process Snapshot events in an IO Filter Solution

- Use the Cache functions in the vSphere IO Filter API

- Specify the components in which you cannot, may not, and may create threads using
  pthread_create()

- Understand the use of blocking rules in Poll and Timer Callbacks, and WorkGroup Functions

- Understand the lifetime requirements of the Bundle Deployment URL


- Understand how to troubleshoot a vSphere Cluster and Filter Build Configuration

- Understand how to use a RAM Disk

- Understand when you can and cannot hold a lock across calls to the VAIO

- Understand the rules regarding the use of Sidecar Functions

- Understand how to debug the Heap

- Understand how to debug with different versions of ESXi

- Understand how to use the realtime clock functions

Understanding the Course's Organization


This course contains a number of chapters to support its strategic and tactical objectives, including:

1 Course Overview (this chapter): discusses the course overview, objectives, and organization

2 Overview of vSphere IO Filters: Purpose, IO Flows, Architecture/Components, and Key Concepts

3 Deploying and Testing an IO Filter Development Environment

4 Creating a Basic IO Filter Solution

5 Fleshing IO Filter Library Component Source

6 Using Cache Solution-Specific Functions

7 Understanding Additional Rules Tips and Tricks

Chapter Summary
This chapter has presented an overview of the course, its objectives, and organization. You should now be
able to:

- Define the purpose of this course

- List the course's prerequisites

- List the course's objectives

- List the course's chapters and appendices



2 Overview of vSphere IO Filters: Purpose, Architecture / IO Flows and Components
Chapter Objectives
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:
- Understand the flow of an IO between a guest OS and a Virtual Machine Disk (VMDK), with and
  without an IO Filter, including understanding the role of:

    - The VMX

    - The Virtual SCSI (vSCSI), POSIX Object Layer (POL), File System Switch (FSS), and SSMod kernel
      modules

    - SSLib in the VMX and user-space cartels

    - IO Filter Library Instances and Daemons

- Understand the purpose of vSphere IO Filters

- Understand the purpose of IO Filter daemons in IO Filter Solutions

- Understand the role of CIM providers and VWC plug-ins in IO Filter Solutions

- Diagram the architecture of IO flows when vSphere IO Filters are used

- Understand issues related to IO Filters and vSphere clusters

- Define the terms filter-private data, instance data and sidecar

- Understand the use of sidecars and instance data in IO Filter Solutions

- Understand the difference between on-line and off-line IO Filter operations

This chapter includes the following topics:

- “Understanding the Purpose and High-Level Attributes of vSphere IO Filter Solutions,” on page 14

- “Understanding IO Flows to VMDKs and Back to Requestor on vSphere, Without IO Filters,” on page 17

- “Understanding IO Flows to VMDKs and Back to Requester on vSphere With IO Filters,” on page 20

- “Understanding the Role of CIM Providers in IO Filter Solutions,” on page 26

- “Understanding IO Filters Impact on Clusters,” on page 27

- “Understanding the Role of vSphere Web Client (VWC) Plugins in IO Filter Solutions,” on page 28

- “Summarizing the Architecture of a vSphere IO Filter Solution,” on page 28

- “Chapter Summary,” on page 29

- “Review Questions - Overview of vSphere IO Filters - Purpose, IO Flows, Architecture, Components
  and Key Concepts,” on page 29

Understanding the Purpose and High-Level Attributes of vSphere IO Filter Solutions

Consider the following definition:
Consider the following definition:

IO Stack    The set of software components through which an IO request travels between the entity
            initiating the request (the initiator) and the target of the data.

In a traditional multipurpose OS, the initiator can be a user-space application or a kernel module, with the
target being a device driver that sends the data to the target device for write requests or extracts the data
from the device for read requests. Components of the IO Stack can include buffer caches for block disk IO
and / or packet rings for network IO. Historically, general-purpose OS IO stacks offered only two points of
flexibility:

- Anyone could create a user-level application to initiate IOs

- Certain 3rd-party developers could create device drivers (using platform-vendor-provided DDKs) to
  interface with hardware not supported by the base kernel

The following diagram illustrates this historic architecture of the IO Stack (the red line indicates an example
flow of an IO between an initiator and a target):

Figure 2‑1. IO Stack - Traditional

Other than in the driver itself, these IO stacks did not allow 3rd party developers to intercept and process
any IOs. That is, if a developer wanted to process IOs for devices managed by Driver A, there was no way to
do this. There are many use cases for developers to intercept and process these IOs (called IO Filtering),
including:

- Checking data being read from or written to the disk for malware, blocking IOs containing infected
  data

- Replicating disk writes to a remote site for hot or warm backup

- Encrypting data in an IO for security

- Compressing data in an IO for better throughput across a network and / or to fit more data onto a single
  disk

This leads to the next definition:

IO Filter   Software that intercepts IO transactions somewhere between the initiator and target,
            regardless of the final disposition of the transaction (blocking, deleting, replication, etc.)
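
As a rough illustration of this definition, the following C sketch models an IO stack as an ordered chain of filter functions sitting between an initiator and a target buffer. Every name in it (IoRequest, FilterFn, io_stack_submit, and the two toy filters) is hypothetical and chosen for this example only; none of them belongs to VAIO or to any operating system's filter framework.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not part of any real API. */
typedef enum { IO_READ, IO_WRITE } IoKind;

typedef struct {
    IoKind kind;
    unsigned char *data;   /* payload carried by the request */
    size_t len;            /* payload length in bytes */
} IoRequest;

typedef enum { FILTER_PASS, FILTER_BLOCK } FilterVerdict;
typedef FilterVerdict (*FilterFn)(IoRequest *req);

/* Toy inspection filter: blocks any write whose payload contains 0xEE. */
static FilterVerdict inspect_filter(IoRequest *req)
{
    if (req->kind != IO_WRITE) {
        return FILTER_PASS;
    }
    for (size_t i = 0; i < req->len; i++) {
        if (req->data[i] == 0xEE) {
            return FILTER_BLOCK;
        }
    }
    return FILTER_PASS;
}

/* Toy transform filter: XORs write payloads with a fixed key.
 * This stands in for encryption or compression; it is not real encryption. */
static FilterVerdict xor_filter(IoRequest *req)
{
    if (req->kind == IO_WRITE) {
        for (size_t i = 0; i < req->len; i++) {
            req->data[i] ^= 0x5A;
        }
    }
    return FILTER_PASS;
}

/* Pass the request through each filter in order. A write reaches the
 * target only if no filter blocks it. Returns 1 if the request got past
 * the whole chain, 0 if it was blocked or did not fit in the target. */
static int io_stack_submit(const FilterFn *filters, size_t nfilters,
                           IoRequest *req,
                           unsigned char *target, size_t target_len)
{
    assert(req != NULL && target != NULL);
    for (size_t i = 0; i < nfilters; i++) {
        if (filters[i](req) == FILTER_BLOCK) {
            return 0;
        }
    }
    if (req->kind == IO_WRITE) {
        if (req->len > target_len) {
            return 0;
        }
        memcpy(target, req->data, req->len);
    }
    return 1;
}
```

Calling io_stack_submit() with the chain { inspect_filter, xor_filter } lets a clean write through in transformed form and stops a write containing the marker byte before it touches the target, which is the "regardless of final disposition" part of the definition.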

Eventually, most general-purpose OS vendors created frameworks that allow 3rd party developers to create
IO Filters and insert them into the vendor's IO Stack. Most of these frameworks require IO
Filter developers to create kernel modules that have specific entry points, similar to device drivers, for
receiving and processing IO requests. Thus, IO Filters using such frameworks are often called Filter
Drivers. The following figure extends the preceding figure, inserting Filter Drivers.

Figure 2‑2. IO Filters - Filter Drivers Framework

Through vSphere 6.0, with respect to IO Filters, VMware's ESXi was similar to the traditional OSes. While
3rd parties were allowed to create specific-purpose kernel modules in the IO flow (for example, for the
Pluggable Storage Architecture (PSA) or the vSphere APIs for Array Integration (VAAI-Block)), those kernel
modules did not provide a framework for general-purpose IO Filtering.

Now, VMware has created the vSphere APIs for IO Filters, abbreviated VAIO, which provide a general-purpose
framework that allows 3rd parties to create solutions (called IO Filter Solutions) analogous to Filter
Drivers in non-hypervisor operating systems. (The F of Filters is not included in the acronym, and "APIs"
is plural even though there is really only one API with many functions and data structures.)

Understanding VAIO General and Supported Use Cases


VAIO envisions supporting multiple use cases, including but not limited to:

- Inspection: for example, looking for malware or file corruption

- Compression: to increase the amount of information stored in a given space on the disk

- Encryption: separate from compression; requires a key to decrypt the information and is used to
  enhance security

- Replication: to create hot or warm backups of vDisks on remote sites

- Cache: using SSDs to cache the contents of spinning disks to increase performance

VAIO defines each use case as a separate class of filter. Currently, it allows administrators to assign at most
one filter of each class to a given virtual disk at any one time. For example, a VM called foo could have a
virtual disk called foo-disk1.vmdk with filters for inspection (for malware), encryption, and replication
attached.
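
The one-filter-per-class rule can be pictured with a small C model. This is only a sketch under stated assumptions: VirtualDisk, FilterClass, and disk_attach_filter are names invented for this example, and the real bookkeeping is done by vSphere, not by filter code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Filter classes named after the use cases listed in the text. */
typedef enum {
    CLASS_INSPECTION,
    CLASS_COMPRESSION,
    CLASS_ENCRYPTION,
    CLASS_REPLICATION,
    CLASS_CACHE,
    CLASS_COUNT
} FilterClass;

/* Hypothetical model of a virtual disk: one slot per filter class,
 * where NULL means no filter of that class is attached. */
typedef struct {
    const char *name;
    const char *filter_by_class[CLASS_COUNT];
} VirtualDisk;

/* Attach a filter; fails if a filter of the same class is already present. */
static bool disk_attach_filter(VirtualDisk *disk, FilterClass cls,
                               const char *filter_name)
{
    assert(disk != NULL && filter_name != NULL);
    if ((int)cls < 0 || cls >= CLASS_COUNT) {
        return false;
    }
    if (disk->filter_by_class[cls] != NULL) {
        return false;   /* at most one filter of each class per disk */
    }
    disk->filter_by_class[cls] = filter_name;
    return true;
}

/* Count how many classes currently have a filter attached. */
static size_t disk_filter_count(const VirtualDisk *disk)
{
    size_t n = 0;
    for (int c = 0; c < CLASS_COUNT; c++) {
        if (disk->filter_by_class[c] != NULL) {
            n++;
        }
    }
    return n;
}
```

With this model, attaching inspection, encryption, and replication filters to foo-disk1.vmdk succeeds, while attaching a second replication filter to the same disk is rejected.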

NOTE The first release of the vSphere APIs for IO Filters supports only the replication and caching use
cases. VMware will not certify VAIO solutions attempting to support other use cases at this time.

Features and Flow of vSphere IO Filters


VAIO for vSphere 6.0 U1 provides the following features:

- Filtering of IOs to VMDKs. There is currently no support for filtering network IO.

- Filter modules run in multi-world user-space cartels (like multi-threaded user-space processes in a Unix
  environment). This feature ensures that flaws in an IO filter will not crash the ESXi kernel.

The following figure provides a high-level overview of the flow of an IO between a VM and a filtered VMDK
(and back) using a user-space IO Filter Solution:

Figure 2‑3. High Level Overview of vSphere IO Filters

The steps in the illustrated flow are:

1 A guest OS in a VM starts an IO to a VMDK.

2 The ESXi kernel starts processing the request, detects that the target disk has an IO filter on it, and
  sends the IO request to the vSphere IO Filter Framework.

3 The IO Filter Framework sends the IO request to the IO Filter Solution. There are actually several
  sub-steps involved in getting the IO to user-space and to the right IO Filter Solution.

4 The IO Filter Solution processes / filters the IO request appropriately for the use case it implements. For
  example, a replication solution sends write requests to a remote replication site; a caching solution
  looks up read requests in the cache and updates the cache with write requests.

  After processing the request, the solution may then allow the rest of the IO stack to continue
  processing the IO. The solution continues the processing by returning control of the IO to the vSphere
  IO Filter Framework.

  NOTE Later sections in this course discuss exceptions to this rule, including dropping the IO and
  processing the IO request asynchronously.

5 The IO Filter Framework continues the IO through the rest of the IO stack (typically including other
  ESXi kernel modules), which eventually results in a call to the device driver controlling the hardware
  on which the VMDK resides.

6 The driver completes the IO request and sends the result back towards the requesting VM, by way of
  the vSphere IO Filter Framework.

7 If, during its processing of the IO request (step 4), the IO Filter Solution asked to be notified when
  the IO completes, the IO Filter Framework sends the completed request back to the IO Filter Solution.
  For example, a caching solution may wait to update the cache on writes until after the write completes
  on the VMDK itself. If the solution did not ask for IO completion notification, the flow continues at
  step 9.

8 The IO Filter Solution does any additional processing required on IO completion, and then may notify
  the vSphere IO Filter Framework that the IO can complete.

9 The vSphere IO Filter Framework continues the IO on its way, typically through additional ESXi kernel
  modules.

10 The last ESXi kernel module in the IO Stack presents the results of the IO to the guest OS in the VM.
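
The ordering of these steps, and in particular the optional completion notification in steps 7 and 8, can be sketched as a small simulation. The types and functions below (Io, FilterSolution, framework_process, and the trace helpers) are invented for this illustration and are not the VAIO callback signatures, which later chapters cover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TRACE_MAX 16

/* Records the order in which the simulated components handle an IO. */
typedef struct {
    const char *steps[TRACE_MAX];
    int count;
} Trace;

static void trace_add(Trace *t, const char *step)
{
    if (t->count < TRACE_MAX) {
        t->steps[t->count++] = step;
    }
}

/* Returns the position of a step in the trace, or -1 if it never ran. */
static int trace_index_of(const Trace *t, const char *step)
{
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->steps[i], step) == 0) {
            return i;
        }
    }
    return -1;
}

/* Hypothetical IO descriptor for this sketch only. */
typedef struct {
    bool is_write;
    bool notify_on_completion;  /* set by the filter during step 4 */
    bool completed;             /* set once the driver finishes (step 6) */
} Io;

/* Hypothetical pair of entry points a filter solution would provide. */
typedef struct {
    void (*on_start)(Io *io, Trace *t);     /* step 4 */
    void (*on_complete)(Io *io, Trace *t);  /* step 8 */
} FilterSolution;

/* Toy caching filter: asks for completion notification on writes only,
 * so it could update its cache after the write lands on the VMDK. */
static void cache_on_start(Io *io, Trace *t)
{
    trace_add(t, "filter:start");
    if (io->is_write) {
        io->notify_on_completion = true;
    }
}

static void cache_on_complete(Io *io, Trace *t)
{
    (void)io;
    trace_add(t, "filter:complete");
}

/* Simulated framework: drives one IO through the steps in order. */
static void framework_process(const FilterSolution *f, Io *io, Trace *t)
{
    trace_add(t, "framework:dispatch");   /* steps 2-3 */
    f->on_start(io, t);                   /* step 4 */
    trace_add(t, "driver:io");            /* steps 5-6 */
    io->completed = true;
    if (io->notify_on_completion) {       /* step 7 */
        f->on_complete(io, t);            /* step 8 */
    }
    trace_add(t, "guest:result");         /* steps 9-10 */
}
```

In this model a write visits the filter twice (once on the way down, once on completion), while a read visits it once and skips straight from the driver to the guest, mirroring the "continues at step 9" branch.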

Online and Offline Filtering


A central component of ESXi for processing disk IOs from VMs is called DiskLib. It provides methods for
opening, closing, reading, writing, and otherwise manipulating virtual disk (VMDK) files.

In addition to the DiskLib used by ESXi, VMware offers another version of DiskLib (called vixDiskLib) via
the un-gated Virtual Disk Development Kit (VDDK). The VDDK version provides functions for
accessing and manipulating VMDK files from a Windows or Linux program. Both versions of DiskLib invoke
IO Filtering Solutions for all IOs to VMDKs marked for filtering. This leads to the following definitions:

On-line Filtering    The application of IO Filtering Solutions to IOs between a VM's guest OS and
                     any of its VMDKs that are marked for filtering

Off-line Filtering   The application of IO Filtering Solutions to IOs to VMDKs (that are marked
                     for filtering) driven by anything other than a VM's guest OS. Examples
                     include:

                     - ESXi utilities such as vmkfstools and od

                     - Linux or Windows applications performing IOs using the functions shipped in
                       the VDDK, when the program has connected to the ESXi host managing those
                       VMDKs (via the VixDiskLib_ConnectEx() function, not discussed further in this
                       course)

Understanding IO Flows to VMDKs and Back to Requestor on vSphere, Without IO Filters

This section provides a simplified view of how IO requests flow to a VMDK and back to the requester. It
does not include any discussion of special processing such as IO Filter processing or Pluggable Storage
Architecture. It is important to have this basic understanding before you attempt to understand any special
processing for an IO request.


It is important to understand that there are three main sources of IO requests on ESXi, as illustrated by the
following figure:

Figure 2‑4. Basic ESXi IO Stack

- VMs accessing their VMDKs

- User-space cartels accessing a VMDK. Two special cases of this are the hostd and vpxa cartels which,
  among other things, proxy requests from off-host entities to access or manage VMDKs on the host.

- vSphere functionality embedded in ESXi, for example VM migration with vMotion

This section concentrates on the first two items in the preceding list.

The first kernel module to receive IO requests depends on the source of the IO.

n For IOs generated from a guest VM, the IO gets handed to the vSCSI module in the kernel which is
invoked by the Virtual Machine Monitor (VMM).

TIP For those unfamiliar with ESXi VMM/VMX architecture, here is some background information.

When ESXi starts running a VM, it creates a new kernel space cartel called Virtual Machine Monitor
(VMM) with one world (thread) per vCPU in the VM. The VMM in turn creates a user space cartel called
Virtual Machine Executable (VMX), again with one world per vCPU in the VM. Each pair of worlds
works in concert to provide virtualization services to the guest OS, and then starts the guest code
running.

NOTE When the guest code is running in the CPU, the vCPU is said to be in direct execution mode.

At some point, the guest code performs some operation that requires the services of the hypervisor,
such as submitting an IO request to virtual hardware. The CPU switches from direct execution mode back
to running the VMM. The VMM processes and then submits the IO request to the vSCSI module in the
ESXi kernel. Said vSCSI module implements a SCSI interface for all storage IO requests (to a VMDK).

After performing some processing on the IO request, the vSCSI module passes the IO to the File System
Switch (FSS) kernel module (see “Understanding the Role of the File System Switch (FSS) Module in IO
Flows,” on page 19).


n For user-space cartels, the module that handles IO requests is called POSIX Object Layer (POL)
(“Understanding the Role of the POSIX Object Layer in IO Flows,” on page 19). For reads, this layer
may consult the Buffer Cache (BC) to obviate a read to physical hardware. This layer may also update the
BC for write operations. On BC miss, or if it does not use the BC, this layer hands off the IO request to
the File System Switch (FSS) kernel module (see “Understanding the Role of the File System Switch (FSS)
Module in IO Flows,” on page 19).

Thus, whether a VMDK's IOs are generated from a user space cartel or a VM, eventually, said requests end
up in the FSS module.

Next, the FSS module invokes the appropriate filesystem callbacks to perform the IO. For NFS, this involves
network IO. For other filesystems, the ESXi kernel invokes some (purposely obscured, for purposes of this
discussion) other kernel modules, eventually resulting in a call to the storage device's driver.

When the device completes the IO, analogous (but not necessarily the same) modules process the result on
its way back to the requester.
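
The routing just described can be condensed into a small lookup. The following Python sketch is only a model of this section's simplified description: the function name io_path and the string labels are hypothetical and are not ESXi identifiers.

```python
def io_path(source, use_buffer_cache=True, cache_hit=False):
    """Return the ordered list of modules an IO request visits going down the stack."""
    if source == "guest_vm":
        # Guest IO enters through the VMM and vSCSI; it never touches the Buffer Cache.
        return ["VMM", "vSCSI", "FSS", "filesystem"]
    if source == "user_cartel":
        path = ["POL"]
        if use_buffer_cache:
            path.append("BufferCache")
            if cache_hit:
                # A read satisfied by the cache never reaches FSS.
                return path
        # Cache miss, or the cache was bypassed: hand off to FSS.
        return path + ["FSS", "filesystem"]
    raise ValueError("unknown IO source: %r" % source)
```

Both sources converge on FSS, which is the point the preceding paragraphs make; only the entry module differs.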

NOTE The ESXi File IO Stack, as described in this and subsequent subsections, may appear to be similar to
File IO Stacks in other semi-POSIX-compliant operating systems. However, it is important to remember that
ESXi is its own proprietary OS with its own kernel and should be thought of as such. You should not think
of ESXi as being based on any other semi-POSIX-compliant OS.

Understanding the Role of the vSCSI Module in IO Flows


This module processes SCSI commands from the guest's virtual Host Bus Adapter (HBA). For SCSI commands
directed to file-based virtual disks (VMDKs), this module invokes the File System Switch (FSS) layer as
discussed in the preceding section.

Understanding the Role of the POSIX Object Layer in IO Flows


As shown in the preceding figure, only user-space programs use the POSIX Object Layer (POL). It creates
and manages the numeric file handles (like Linux file descriptors) that are given to user-space cartels as a
result of opening files on a filesystem, including file-based VMDKs. This module associates user-objects
with these file handles. For actual files, said user-objects themselves contain handles to file systems
managed by the File System Switch (FSS) layer (thus called FSS handles).

When a user-space cartel issues a file-related system call, the VMkernel redirects the call to this
module, which then invokes the FSS layer using an appropriate FSS handle.

Understanding the Role of the File System Switch (FSS) Module in IO Flows
ESXi's File System Switch (FSS) module is analogous to the Virtual File System (VFS) layer in Linux / Unix
systems.

For those unfamiliar with either of these, the purpose of this module is to provide a uniform set of
operations to higher-layer kernel modules (for ESXi: POL / vSCSI), but allow different implementations of
the actual filesystem organization (for ESXi: NFS, VMFS, VSAN, etc.).

For those familiar with object-oriented programming, think of this module as a pure-virtual class (C++) or
Interface (Java), with filesystems being the derived classes that actually implement each of the methods in
the FSS class.
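
To make the object-oriented analogy concrete, here is a minimal Python sketch. It is an illustration of the interface idea only: the class and method names (FileSystem, DictBackedFs, copy_block) are hypothetical and do not correspond to any ESXi kernel interface.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical stand-in for the uniform operations FSS offers to POL / vSCSI."""

    @abstractmethod
    def read(self, name, offset, length):
        ...

    @abstractmethod
    def write(self, name, offset, data):
        ...


class DictBackedFs(FileSystem):
    """Toy implementation; a real filesystem (NFS, VMFS, ...) would do network or device IO."""

    def __init__(self):
        self._files = {}

    def read(self, name, offset, length):
        return bytes(self._files.get(name, b"")[offset:offset + length])

    def write(self, name, offset, data):
        buf = bytearray(self._files.get(name, b""))
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap before offset
        buf[offset:offset + len(data)] = data
        self._files[name] = buf
        return len(data)


def copy_block(fs, src, dst, offset, length):
    """A 'higher layer' that only knows the FileSystem interface, not the implementation."""
    return fs.write(dst, offset, fs.read(src, offset, length))
```

The caller copy_block works unchanged with any class derived from FileSystem, which mirrors how POL and vSCSI can issue the same operations regardless of the filesystem behind FSS.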


Understanding the Role of the Buffer Cache Module in IO Flows


Most systems benefit from caching IOs to block-oriented storage devices. ESXi provides a single kernel
module, called Buffer Cache, that provides uniform caching functionality for user-space cartels. However,
guest IO does not go through the buffer cache because ESXi assumes that each guest knows best which IOs
to cache.

Understanding IO Flows to VMDKs and Back to Requester on vSphere With IO Filters
As mentioned at the beginning of this chapter, the IO Filter Framework allows 3rd parties to intercept and
process IOs to VMDKs on ESXi systems. The Framework attaches filters to, and detaches them from,
individual VMDKs in response to administrators' configuration commands via a supported management
tool (for example the vSphere Web Client [VWC] or the vmkfstools ESX CLI command).

Attaching an IO Filter to a VMDK means that the Framework sends all subsequent IOs for said VMDK to
said IO Filter for processing. Further, it means the Framework invokes filter-supplied functions (called
callbacks) to alert said filter of key events related to the VMDK (other than IOs) for example:

n when someone requests a snapshot of the disk

n when someone migrates the VM controlling a disk from one host to another

n when someone changes the size of a disk (grow or shrink)

n when a program opens or closes the disk (such as VM start and stop / pause, etc.)

Detaching an IO Filter from a VMDK means the Framework no longer sends IOs or invokes callbacks for
VMDK-related events.

IMPORTANT The concept of callbacks is central to developing IO Filters. In this context, a callback is analogous
to an entry-point function in a device driver. The VAIO specifies a series of callbacks that every filter must
provide for the IO Filter Framework to call, as well as some that are optional.
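
As a rough illustration of a callback table with required and optional entries, consider the Python sketch below. It is a model only and not the VAIO data structure. The callback names diskIOStart, diskStun and diskUnstun appear elsewhere in this course, but treating the latter two as optional is an assumption made purely for the example, and the helper names build_callback_table and notify are hypothetical.

```python
REQUIRED = ("diskIOStart",)             # this chapter names diskIOStart as mandatory
OPTIONAL = ("diskStun", "diskUnstun")   # assumed optional for this sketch only


def build_callback_table(**callbacks):
    """Validate a filter's callbacks the way a framework might when loading it."""
    missing = [name for name in REQUIRED if name not in callbacks]
    if missing:
        raise ValueError("missing required callbacks: " + ", ".join(missing))
    unknown = [name for name in callbacks if name not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError("unknown callbacks: " + ", ".join(unknown))
    # Optional callbacks that were not supplied are stored as None.
    return {name: callbacks.get(name) for name in REQUIRED + OPTIONAL}


def notify(table, event, *args):
    """Invoke the callback for an event, skipping optional callbacks left as None."""
    callback = table[event]
    return callback(*args) if callback is not None else None
```

The design point is that the framework, not the filter, decides when each entry point runs; the filter only fills in the table.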

This section focuses on describing the flow of an IO to a VMDK with IO Filters attached. Other chapters of
this course discuss how to define and flesh out the callbacks that make up an IO Filter. This section builds
on the preceding discussion and is also high-level, keeping the concepts as simple as possible. The intent is
to provide you with a big picture and context for how vSphere IO Filters work, which will help you
understand what you can and can't do in an IO Filter and why.

The following figure augments the preceding one (Figure 2-4) and shows the possible IO Flows to a VMDK
with an IO Filter attached.


Figure 2‑5. IO Stack with vSphere IO Filters and other Filter Components

NOTE To reduce the complexity of this (already complicated) figure, this diagram does not show the path of
IOs to VMDKs without IO Filters. That said, for those IOs, the path would flow from the FSS beneath the
VMM to either DevFS (in the case of writing to a device file) or to a filesystem module such as NFS, VMFS,
etc.

The following sub-topics describe the vSphere IO Filter-specific modules illustrated in this figure.

Understanding the Role of the SecretSauceMod Kernel Modules in IOs with IO Filters
The SecretSauce Module (SSMod) implements the vSphere IO Filtering API. As an IO targeted at a filtered
VMDK transits the IO stack in ESXi, the kernel passes the IO to SSMod, which passes the IO (possibly with
other IOs and other requests) to the SecretSauce Library Module (SSLib, discussed in the next sub-topic)
running in whatever user-space cartel initiated the IO, for example the VMX for a VM requesting the IO.
The mechanism for sending messages to SSLib, and the messages themselves, are called upcall and upcalls,
respectively.

NOTE IO Filters are interested in, and must process, events beyond the IO requests passed to them via
upcalls. Those events not passed via upcalls are sent to SSLib via other mechanisms.


Understanding the Role of the SecretSauce Library (SSLib) in IOs With IO Filters
The SecretSauce Library (SSLib) is a shared-object library that gets loaded into the VMX cartel and runs in the
context of its worlds. It receives VMDK-related events via messages from DiskLib and upcalls from SSMod.
It interprets each such event (such as a disk read or snapshot request) and invokes the appropriate callbacks
for the IO Filters attached to the VMDK.

NOTE As depicted in the preceding figure, IO Filters can have several components: A Library Instance (LI),
a Daemon, and a CIM Provider, each of which is discussed in other sub-topics. The component that
provides the callbacks referenced in the preceding paragraph is the Library Instance.

Library Instances are shared object (library) files that get loaded into the VMX process of VMs with said
library attached to one of their VMDKs. For offline filtering, SSLib and the LIs get loaded into the cartel
accessing the VMDK, for example hostd, vmkfstools, etc. In this latter case, SSMod makes upcalls to the
SSLib in the cartel that loaded it for offline processing.

SSLib invokes the applicable callback(s) in sequence according to the class / type of the filter. Currently, the
two classes of filters supported by vSphere IO Filters, and the order in which they are invoked by SSLib are:

n Replication

n Cache

For example, if VMDK foo.vmdk has a caching filter C and a replication filter R attached, SSLib will invoke
R's callback first and then C's callback for any given event for, or IO request to, foo.vmdk.
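
The class-based ordering can be sketched in a few lines of Python. This is a model of the rule stated above, not SSLib code; the names CLASS_ORDER and invocation_order are hypothetical.

```python
# Lower rank is invoked first: replication filters run before cache filters.
CLASS_ORDER = {"replication": 0, "cache": 1}


def invocation_order(attached):
    """attached: iterable of (filter_name, filter_class) pairs for one VMDK."""
    ranked = sorted(attached, key=lambda item: CLASS_ORDER[item[1]])
    return [name for name, _filter_class in ranked]
```

For the foo.vmdk example, invocation_order([("C", "cache"), ("R", "replication")]) yields R first and then C, regardless of the order in which the filters were attached.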

NOTE Developers specify the class of their filter in a build-related configuration file, discussed in “Creating
and Populating a Correct Scons File,” on page 89. SSLib relies solely on this declaration to define a filter's
type.

Since callbacks are in the LI of the filter, SSLib invokes them with a simple procedure call. As such, it must
wait for each callback to return before it can proceed. How SSLib proceeds depends on the value returned
from the callback. A value indicating failure causes SSLib to fail the event (for example, fail the read(),
snapshot, migration, etc.). A value indicating success causes SSLib to submit the event to the next filter in
sequence, or, if all filters have processed the event successfully, return a success value to the caller. There
is a third option beyond success or failure, though, as discussed in the next sub-topic (“Understanding
Synchronous vs Asynchronous Processing in Callbacks,” on page 22).

For each IO request, SSLib tracks the following:

n The last library instance to process the request

n The completion callbacks (and data) registered by filters to be issued when the IO request is completed

n IO request status (being processed by a callback, deferred, completed, aborted)
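
The dispatch behavior and per-request bookkeeping described above can be modeled as follows. This Python sketch is an assumption-laden illustration, not SSLib's implementation: VMIOF_SUCCESS and VMIOF_ASYNC are the codes named in this course, while FILTER_FAILURE, IoRequest, run_filters and the "failed" status label are hypothetical placeholders.

```python
from dataclasses import dataclass, field

VMIOF_SUCCESS = "VMIOF_SUCCESS"
VMIOF_ASYNC = "VMIOF_ASYNC"
FILTER_FAILURE = "FILTER_FAILURE"   # placeholder; the real error codes are not shown here


@dataclass
class IoRequest:
    op: str
    last_filter: int = -1                              # last library instance to see the request
    status: str = "new"                                # in_callback, deferred, completed, failed
    completions: list = field(default_factory=list)    # completion callbacks registered by filters


def run_filters(io, filters, start=0):
    """Call each filter in order; stop on failure or when a filter defers the request."""
    for index in range(start, len(filters)):
        io.last_filter = index
        io.status = "in_callback"
        rc = filters[index](io)          # plain procedure call: we wait for it to return
        if rc == VMIOF_ASYNC:
            io.status = "deferred"       # the filter will signal the framework later
            return rc
        if rc != VMIOF_SUCCESS:
            io.status = "failed"
            return rc
    io.status = "completed"
    return VMIOF_SUCCESS
```

Recording last_filter is what lets processing resume at the next filter (start=io.last_filter + 1) after a deferred callback finishes its work.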

Understanding Synchronous vs Asynchronous Processing in Callbacks

Many VMDK events require synchronous processing. That is, all work that a filter must do within a callback
must be completed before the callback returns. In these cases, unless the callback wants to return an error, it
returns a VAIO-defined code representing success (VMIOF_SUCCESS).

However, some events require work that may not complete quickly. For example, consider a replication
filter that receives a write request from a guest OS, where the filter is configured to only complete the write
locally after it receives an ACK that the data has been written on the replication site(s). The replication filter
must send the data to the replication site, and wait for an ACK from that site, before sending the write
request to the local disk and completing the write request.


If the filter were to block, waiting for the ACK from the remote site before returning, it would tie up a
world (thread) in the cartel that issued the write. In the case of online filtering, the VMX process has very
few worlds. Having them blocked indefinitely (in this example, waiting for an ACK) would have an adverse
impact on performance. In these cases, the callback may schedule some method for handling the work
asynchronously, and then return a code indicating this to the IO Filter Framework (VMIOF_ASYNC).

When SSLib receives VMIOF_ASYNC from a callback, it suspends any further processing of the event. When the
filter completes the asynchronous work (in the example, it receives the ACK and sends the write to the local
disk), the filter invokes a VAIO function to tell the IO Filter Framework that it can continue processing the
event.

Callbacks, other than the one for IO requests, include a pointer to a function the Filter must call to signal
completion of the callback.

For IO requests, the VAIO provides separate utility functions that signal the Framework that the Filter has
completed its work. One function indicates that the Filter has fulfilled (completed) the IO request, which
causes the Framework to turn the IO request around and start sending it up the IO stack. Another function
indicates that the Filter has completed its work and that the Framework should continue the request down
the IO stack. Each of these functions takes a parameter that indicates whether the IO request was serviced
successfully or had a failure.
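
The replication example can be modeled as a callback that defers, followed by a later "continue" or "complete" signal. The Python sketch below is a simulation under stated assumptions: the Framework class and its io_continue / io_complete methods are hypothetical stand-ins for the VAIO utility functions described above, whose real names and signatures are not given in this section.

```python
VMIOF_SUCCESS = "VMIOF_SUCCESS"
VMIOF_ASYNC = "VMIOF_ASYNC"


class Framework:
    """Records what the filter asks the framework to do (hypothetical API)."""

    def __init__(self):
        self.log = []

    def io_continue(self, io, ok):
        self.log.append(("continue_down", io["id"], ok))   # keep going down the stack

    def io_complete(self, io, ok):
        self.log.append(("complete_up", io["id"], ok))     # turn around, go back up


class ReplicationFilter:
    def __init__(self, framework):
        self.framework = framework
        self.pending = {}                 # IOs waiting for a remote ACK

    def disk_io_start(self, io):
        if io["op"] != "write":
            return VMIOF_SUCCESS          # nothing to replicate for reads
        self.pending[io["id"]] = io       # sending to the remote site would start here
        return VMIOF_ASYNC                # return at once instead of blocking a world

    def on_remote_ack(self, io_id, ok):
        io = self.pending.pop(io_id)
        if ok:
            self.framework.io_continue(io, True)    # now let the local write proceed
        else:
            self.framework.io_complete(io, False)   # fail the write back to the requester
```

The callback never waits for the network; the ACK handler, running later, is what releases the request.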

From the time a filter receives an IO request until it completes or continues it, the filter is said to own the
IO. Each Filter is required to keep track of every IO request it owns on a per-VMDK basis. How a Filter
implements this is up to the Filter developer, though this course provides a suggestion and example. Filters
must do this because the Framework may send abort or reset requests to a Filter, which must be able to
check whether it owns the IO(s) being aborted / reset.
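
One possible way to track owned IOs per VMDK is a table of sets keyed by VMDK handle. The Python sketch below is an illustration only and is not the example this course provides later; the class name OwnedIoTable and its methods are hypothetical.

```python
from collections import defaultdict


class OwnedIoTable:
    """Tracks, per VMDK handle, the ids of the IO requests this filter currently owns."""

    def __init__(self):
        self._owned = defaultdict(set)

    def take(self, vmdk, io_id):
        self._owned[vmdk].add(io_id)          # called when the filter receives the IO

    def release(self, vmdk, io_id):
        self._owned[vmdk].discard(io_id)      # called when the filter completes or continues it

    def owns(self, vmdk, io_id):
        return io_id in self._owned.get(vmdk, ())

    def abort(self, vmdk, io_ids):
        """Return the subset of io_ids this filter owned on vmdk, and stop tracking them."""
        hit = [io_id for io_id in io_ids if self.owns(vmdk, io_id)]
        for io_id in hit:
            self.release(vmdk, io_id)
        return hit
```

An abort or reset handler can then act only on the returned subset and ignore requests the filter has already handed back.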

Understanding the Role of an IO Filter Solution's Library and Library Instances (LIs) in IO Flows with IO Filters
Every IO Filter Solution must include a Filter Library component. This is a shared object library that gets
loaded by SSLib into the address space of the user-space cartel that opens a VMDK with this filter attached.
This component contains the callback functions that the IO Filter Framework invokes to process VMDK and
IO events, for example: attaching and detaching the IO Filter Solution to/from the VMDK,
opening/closing the VMDK, deleting blocks for thin-provisioned VMDKs, processing IOs issued to a
VMDK. The Filter Library code must include a data structure, defined by the VAIO, that provides pointers
to each of the filter's required and optional callback functions.

One callback that each Library component must supply is a function generically referred to as
diskIOStart. SSLib invokes this function for each read / write IO request it receives on a VMDK to which
this filter is attached. This function must perform whatever filtering logic is appropriate, and then return a
value indicating one of: success; failure; or that the filter will complete the IO request later, asynchronously.
SSLib then proceeds with the IO request as discussed in the previous topic, “Understanding the Role of the
SecretSauce Library (SSLib) in IOs With IO Filters,” on page 22. Code in this function often involves
communication with a Filter's daemon, if present, as discussed in the next topic.

Developers specify the source files that make up a filter's library instance using definitions in the filter's
SCONS file (see “Creating and Populating a Correct Scons File,” on page 89).

It is important to understand that for any given user-space cartel that opens one or more filtered VMDKs,
the IO Filter Framework (through SSLib) loads the Library component of the required IO Filters just once.
When the IO Filter Framework attaches an IO Filter to a VMDK, or opens a VMDK with an IO Filter
attached, the framework allocates a new opaque handle that it associates with that VMDK. Whenever the IO
Filter Framework invokes a callback of the Library component, said framework passes the opaque handle to
the callback so that the Library code knows on which VMDK the event occurred. Thus, an IO Filter's Library
component code must keep separate data for each VMDK it is filtering. This data is known as instance data.
The context of a Library component filtering a single VMDK is thus called a Library Instance (LI). For
example, consider the following figure:


Figure 2‑6. IO Filter Library Instances

This figure illustrates the following:

n VM Y has two VMDKs (VMDK1 and VMDK2)

n Filter A is attached to each VMDK

n Filter A's Library component is loaded into the VMX of VM Y (so one address for each of the
callbacks in the Library component)

n The Library code keeps separate data about each of the VMDKs to which it is attached (represented
by the hexagons labeled ID1 and ID2). Thus, it is said there are two LIs for this filter in this VM's
context.

n VM Z has two VMDKs (VMDK3 and VMDK4)

n Filters A and B are attached to VMDK3

n Only Filter B is attached to VMDK4

n Filter A's and Filter B's Library components are loaded into the VMX of VM Z. That is, there are two
addresses for each callback, one for Filter A, one for Filter B.

n Filter A only keeps track of data for VMDK3. Thus, it is said there is only one LI for this filter in this
VM's context.

n Filter B keeps separate data for VMDK3 and VMDK4. Thus, it is said there are two LIs for this filter
in this VM's context.
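
The relationship between one loaded Library component and several Library Instances can be modeled with a dictionary keyed by opaque handle. This Python sketch mirrors the figure's scenario; the names FilterLibrary, disk_attach, disk_io_start and build_figure_example are hypothetical, and object() merely stands in for the opaque handle the framework would allocate.

```python
class FilterLibrary:
    """One copy of the code per cartel; one instance-data record per filtered VMDK."""

    def __init__(self, name):
        self.name = name
        self.instances = {}               # opaque handle -> instance data

    def disk_attach(self, handle, vmdk_name):
        self.instances[handle] = {"vmdk": vmdk_name, "ios_seen": 0}

    def disk_io_start(self, handle, io):
        # The handle tells the shared code which VMDK this event is for.
        self.instances[handle]["ios_seen"] += 1


def build_figure_example():
    # VM Y: Filter A attached to VMDK1 and VMDK2 (one library, two LIs).
    a_in_y = FilterLibrary("A")
    a_in_y.disk_attach(object(), "VMDK1")
    a_in_y.disk_attach(object(), "VMDK2")
    # VM Z: Filters A and B on VMDK3, only Filter B on VMDK4.
    a_in_z, b_in_z = FilterLibrary("A"), FilterLibrary("B")
    a_in_z.disk_attach(object(), "VMDK3")
    b_in_z.disk_attach(object(), "VMDK3")
    b_in_z.disk_attach(object(), "VMDK4")
    return a_in_y, a_in_z, b_in_z
```

Counting the entries in each instances dictionary gives the LI counts stated in the list above: two for Filter A in VM Y, one for Filter A in VM Z, and two for Filter B in VM Z.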

Understanding Filter-Private Data: Instance Data and Sidecars


Filters need to keep meta data about each VMDK they are filtering, including:

n Configuration data (also called filter properties or filter capabilities) — For example:

n A caching solution may allow administrators to configure write-through vs write-back caching

n A replication solution may allow administrators to configure whether the writes to the replication
site must be ACKed by at least one remote site before continuing the IO request on the source host


n Operational data / statistics — For example:

n A caching solution may keep cache hit rates

n A replication solution may want to keep the average throughput between the host and replication
site(s)

n State — For example:

n Whether the VMDK is stunned

n The name of the VMDK file being filtered in this instance

n Whether vSphere is migrating the VM owning the VMDKs between hosts

Some of this data is fast-moving and need not persist between the close of a VMDK and it being reopened
(for example between a power-off and power-on of a VM owning the VMDK), such as current cache hit
rates. Some of this data must persist and be present as long as the filter is attached to the VMDK, such as
configuration data. All such data is referred to as filter-private data.

The non-persistent filter-private data kept by a LI is called instance data as discussed in “Understanding the
Role of an IO Filter Solution's Library and Library Instances (LIs) in IO Flows with IO Filters,” on page 23
and must be kept by Filters in RAM-based data structures they define themselves. The persistent filter-
private data is called sidecar data because the Filter keeps it in a file associated with the VMDK called a
sidecar. The VAIODK provides a set of utility functions that allow Library code to manage both instance and
sidecar data.

Examples of items commonly found in instance data are:

n The stun level of a VMDK (defined and discussed in detail in “Understanding and Processing diskStun
and diskUnstun Events,” on page 208)

n A list or tree of IOs currently owned by the instance

n File descriptors for any sockets used by the instance

n Handles for any sidecar files associated with the instance's VMDK

Examples of items commonly found in sidecar data are:

n Whether the VM is currently migrating between hosts

n The pathname of the VMDK file (this can change as a result of storage migration)

n The current settings for filter properties and any additional configuration parameters

Some additional notes about sidecars. They:

n Are associated with a particular VMDK file

n Are managed (created/deleted/read/written) by Library code

n Reside in the same directory as a VMDK, though their file name is (currently) obscured

n Can only be managed in library code

For information on the instance data functions, see “Understanding and Using Filter-Private Data Functions
to Keep Non-Persistent Per-VMDK Meta Data,” on page 145. For information on sidecar functions, see
“Using Sidecars Functions in Library Code to Keep Persistent Per-VMDK Meta Data,” on page 137.
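
The difference between instance data and sidecar data can be illustrated with a small simulation. In this Python sketch a JSON file stands in for a sidecar, which is an assumption made only for illustration: real sidecars are managed through the VAIO sidecar functions referenced above, and the names LibraryInstance, set_property and reopen_demo are hypothetical.

```python
import json
import os
import tempfile


class LibraryInstance:
    def __init__(self, sidecar_path):
        self.sidecar_path = sidecar_path
        self.cache_hits = 0                       # instance data: RAM only, lost on close
        self.config = {"write_back": False}       # defaults used when no sidecar exists yet
        if os.path.exists(sidecar_path):
            with open(sidecar_path) as handle:
                self.config = json.load(handle)   # sidecar data: survives close and reopen

    def set_property(self, key, value):
        self.config[key] = value
        with open(self.sidecar_path, "w") as handle:
            json.dump(self.config, handle)        # persist immediately


def reopen_demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "vmdk1.sidecar.json")
        instance = LibraryInstance(path)
        instance.set_property("write_back", True)
        instance.cache_hits = 42
        instance = LibraryInstance(path)          # models a VMDK close followed by a reopen
        return instance.config["write_back"], instance.cache_hits
```

After the simulated reopen, the configuration setting is still present while the hit counter has started over, which is the persistence split the section describes.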

Understanding the Role of an IO Filter Daemon in IO Filter Solutions


Daemons are an optional component of an IO Filter. Despite their name, vSphere IO Filter Daemons are not
processes that run in an infinite loop as they do in Unix / Linux systems. Rather, vSphere IO Filter Daemons
are actually shared object (library) files that get dynamically loaded and run in the context of the iofilterd
user-space cartel.


The daemon gets loaded and started as part of the filter's VIB installation onto an ESXi host. The iofilterd
cartel starts a daemon by invoking a (mandatory) start callback. The daemon is stopped and unloaded as
part of a filter's VIB removal. The iofilterd cartel stops a daemon by invoking a (mandatory) stop callback.

Daemons are not allowed to run in infinite loops as they do in Linux / Unix (or services in Windows).
Instead, the start function typically creates and binds to sockets, then sets up asynchronous IO callbacks on
receipt of data on said sockets. They may also perform timer operations, etc. using VAIO utility functions
(discussed in Chapter 5).

Thus, Daemons: start before LI code runs; stop after all LI code exits; and can handle asynchronous events,
which makes it appear as though they are running at all times. A common thing for LIs to include in the
callback for handling VMDK open events is a socket connection to the filter's Daemon.

NOTE It is a best practice to use UNIX domain sockets for these connections.
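
The "no infinite loop" model can be sketched with an event loop owned by the host process and a daemon that only registers handlers. This Python example is a generic illustration using the standard selectors module; HostLoop stands in for the hosting cartel and EchoDaemon for a filter daemon, and neither reflects the actual VAIO start / stop callback signatures.

```python
import selectors
import socket


class HostLoop:
    """Stand-in for the hosting cartel: it owns the loop; daemons only register handlers."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()

    def on_readable(self, sock, handler):
        self.selector.register(sock, selectors.EVENT_READ, handler)

    def run_once(self, timeout=1.0):
        for key, _events in self.selector.select(timeout):
            key.data(key.fileobj)          # invoke the handler registered for this socket


class EchoDaemon:
    def start(self, loop, sock):
        self.sock = sock
        loop.on_readable(sock, self._on_data)   # returns immediately; no loop in here

    def _on_data(self, sock):
        sock.sendall(b"ack:" + sock.recv(1024))

    def stop(self, loop):
        loop.selector.unregister(self.sock)
        self.sock.close()


def demo():
    li_end, daemon_end = socket.socketpair()    # a UNIX domain socket pair on Unix systems
    loop, daemon = HostLoop(), EchoDaemon()
    daemon.start(loop, daemon_end)
    li_end.sendall(b"open vmdk1")               # the LI side announces a VMDK open
    loop.run_once()                             # the host, not the daemon, drives dispatch
    reply = li_end.recv(1024)
    daemon.stop(loop)
    li_end.close()
    return reply
```

Because start only registers a handler and returns, the daemon appears to be always running even though it executes only when its socket has data.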

The purpose of a Daemon is to provide functionality to LIs that LIs cannot do themselves. For example, LI
code cannot make a TCP/IP socket connection to an off-host server. This is enforced with ESXi access
restrictions. However, daemons can. Further, after making such a connection, they can pass the file-
descriptor of the resulting socket to a LI so that said LI can communicate directly with the remote server
without having to continually involve the daemon.
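
Passing an open file descriptor over a UNIX domain socket is a standard operating-system technique. The Python sketch below shows the general mechanism using socket.send_fds and socket.recv_fds (Python 3.9 or later, Unix only). It is not VAIO code: a pipe stands in for the daemon's connection to a remote server, and the function name pass_fd_demo is hypothetical.

```python
import os
import socket


def pass_fd_demo():
    daemon_end, li_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # The pipe's write end stands in for a socket the daemon connected off-host.
    read_fd, write_fd = os.pipe()
    try:
        # Daemon side: send a short message with the descriptor attached.
        socket.send_fds(daemon_end, [b"fd"], [write_fd])
        # LI side: receive the message and a new descriptor for the same open file.
        message, fds, _flags, _address = socket.recv_fds(li_end, 16, 1)
        received_fd = fds[0]
        os.write(received_fd, b"hello")      # the LI now writes without the daemon's help
        data = os.read(read_fd, 5)
        os.close(received_fd)
        return message, data
    finally:
        os.close(read_fd)
        os.close(write_fd)
        daemon_end.close()
        li_end.close()
```

The receiver gets its own descriptor number referring to the same underlying connection, so later traffic does not need to pass through the sender.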

Another example: for a caching filter, it is considered a best practice to have the daemon create and
manage the vFlash File System (VFFS) cache files on caching storage (typically, but not always, SSDs). In this case, for
read request processing, the LI queries the daemon to see if the cache file contains a required block for a
given VMDK. For write request processing, the LI must send updated data to the daemon to update the
cache.

To avoid sending large amounts of data for IO requests between daemon and LI over a socket, a reasonable
programming pattern is to: 1) Create a shared memory area accessible to LI and Daemon components; 2)
Place data for read / write requests in the shared memory; 3) Send requests / responses, that include offsets to
the applicable data, in messages between the LI and daemon over the socket. You can think of the shared
memory as a data-plane and the socket as control-plane for the data sharing.
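
The data-plane / control-plane split can be sketched with Python's standard library. This is a single-process illustration of the pattern, not a VAIO API: the message layout (operation, offset, length) and the names MSG and shared_memory_demo are assumptions, and in a real solution the LI and daemon would each map the same region from their own cartels.

```python
import socket
import struct
from multiprocessing import shared_memory

# Control message: 1-byte operation, 4-byte offset, 4-byte length (network byte order).
MSG = struct.Struct("!cII")


def shared_memory_demo():
    region = shared_memory.SharedMemory(create=True, size=4096)
    li_sock, daemon_sock = socket.socketpair()
    try:
        payload = b"block-data"
        offset = 128
        # Data plane: the LI places the write payload directly in shared memory.
        region.buf[offset:offset + len(payload)] = payload
        # Control plane: only a small descriptor travels over the socket.
        li_sock.sendall(MSG.pack(b"W", offset, len(payload)))
        # Daemon side: read the descriptor, then fetch the data from shared memory.
        op, got_offset, got_length = MSG.unpack(daemon_sock.recv(MSG.size))
        seen = bytes(region.buf[got_offset:got_offset + got_length])
        return op, seen
    finally:
        li_sock.close()
        daemon_sock.close()
        region.close()
        region.unlink()
```

Only nine bytes cross the socket per request here, however large the payload is, which is the motivation for the pattern.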

In summary, for a given IO request, an LI may return VMIOF_ASYNC to the SSLib, then send requests to its
daemon to perform processing on the request that the LI cannot (or should not) perform itself, and then wait
for a response from the daemon.

As with LIs, you specify the source files that make up the daemon using SCONS file definitions (see
“Creating and Populating a Correct Scons File,” on page 89).

NOTE It is recommended that the daemon in your IO Filter Solution use SSL to establish secure connections
between its LIs and off-host peer daemons, incorporating additional measures to authenticate such
connections, such as magic numbers, etc.

Understanding the Role of CIM Providers in IO Filter Solutions


VMware vSphere uses the Common Information Model (CIM) to view operational and configuration
information from, and provide management of, hardware and software modules. The main purpose of CIM
is to provide an open framework, defined by the Distributed Management Task Force (DMTF), for defining
standardized management data, standardized methods to perform operations on that data, and a
standardized method for defining management commands to be sent to hardware and services. Each IO
Filter must include a CIM Provider to perform these standardized operations for the filter.

For example, a caching filter may optionally perform read-ahead when processing a cache miss. Further, it
may allow administrators to configure how many blocks to read ahead. Such a filter can include a CIM
provider that receives configuration commands from a standard management tool (vSphere Client, vSphere
Web Client, PowerCLI, etc.) and passes those commands on to the filter instance, typically via the filter's
daemon. The CIM provider can also surface the current configuration to said management tools by
requesting current settings from the filter daemon or LI. If the daemon supports separate read-ahead values
for each VMDK, the filter can keep the parameter in one of the VMDK's sidecar files. If the daemon only
supports a global setting, the filter can keep the data in the CIM provider, which has standardized methods
for storing and retrieving persistent data.

While each IO Filter must define and provide source for a CIM provider in its SCONS file, said CIM
provider code need not actually do anything useful. Unless you are an experienced CIM provider
developer, consider starting with the skeletal example provided by one of the example filters in the
VAIODK, then adding functionality as you learn.

Developing CIM providers is a large topic on its own, and is somewhat orthogonal to the discussion of
developing a vSphere IO Filter. Thus, further discussion of developing a CIM provider is beyond the scope
of this course.

Understanding IO Filters Impact on Clusters


While you can deploy a vSphere IO Filter to a single ESXi host, and will almost certainly do that during
development and testing (to keep things simple), vSphere IO Filters are meant to be deployed at the vSphere
Cluster level.

Further, vSphere IO Filters integrate, at the vSphere layer, with vSphere's Storage Policy Based Management
(SPBM) feature. Historically, SPBM has allowed administrators to apply policies to VMDKs such as: Only
place this VMDK on devices that can provide X IOPs, and migrate it to a sufficient device if the current
device falls below X IOPs. For vSphere IO Filters, administrators associate an IO Filter to a VMDK as part of
the SPBM framework.

The work flow for configuring IO Filters in a cluster, and applying them to VMDKs for VMs running there
is similar to, if not exactly, the following:

1 Obtain the IO Filter Bundle from the vendor (you).

2 Deploy the bundle to the vCenter Server / VCSA controlling clusters in the datacenter. This enables
administrators to associate an IO Filter with a VMDK using SPBM methods in the VWC.

3 Apply an IO Filter SPBM policy to a VMDK, either when creating said VMDK or after it is created,
using the VWC. This action causes the vCenter Server / VCSA to do two things:

a Deploy the specified IO Filter's VIB to each ESXi host in the cluster (which in turn starts said IO
Filter's daemon)

b Perform a diskAttach operation for said IO Filter to the specified VMDK, using default values for
any configurable filter properties.

There are implications of running VMs with IO Filters attached to their VMDKs in a cluster, including:

n Migration — For a VM (with filtered disks) to migrate from Host A to B, B must have the required IO
Filter installed, too. Further, if the filter is of class cache, does B have its own SSD with a VFFS volume?
What is the appropriate behavior for the filter to take in either case?

NOTE It is entirely possible for a VM running on Host B to use the VFFS cache on Host A, and still
increase performance over just accessing VMDKs on their native backing store without cache.

n If a host enters the cluster with DRS, the vCenter Server / VCSA must (and does) deploy all deployed
IO Filters to that host as part of the join. Similarly, if a host leaves the cluster, all VMs with filtered
VMDKs must migrate to other hosts in the cluster, and the filter must be removed as part of the leave.
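
The deployment workflow and the cluster implications listed above can be modeled in a few lines. This Python sketch is a toy simulation, not vCenter Server behavior: the class Cluster and its methods apply_policy, add_host and can_host are hypothetical names chosen for the example.

```python
class Cluster:
    def __init__(self, hosts):
        self.vibs = {host: set() for host in hosts}   # host -> filters installed on it
        self.attached = {}                            # vmdk -> {filter name: properties}

    def apply_policy(self, vmdk, filter_name, defaults):
        # Step 3a: install the filter on every host in the cluster.
        for installed in self.vibs.values():
            installed.add(filter_name)
        # Step 3b: attach the filter to the VMDK with default property values.
        self.attached.setdefault(vmdk, {})[filter_name] = dict(defaults)

    def add_host(self, host):
        # A joining host receives every filter already deployed in the cluster.
        deployed = set()
        for installed in self.vibs.values():
            deployed |= installed
        self.vibs[host] = deployed

    def can_host(self, host, vmdk):
        # Migration check: the destination must have every filter the VMDK uses.
        return set(self.attached.get(vmdk, {})) <= self.vibs[host]
```

The can_host check captures only the "filter must be installed" condition; questions such as whether the destination has its own VFFS volume remain design decisions for the filter itself.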

Where necessary, topics in Chapter 5 include detailed discussion of cluster implications on callback design
decisions.


Understanding the Role of vSphere Web Client (VWC) Plugins in IO Filter Solutions
The vSphere Web Client (VWC) is VMware's flagship, browser-based, extensible tool for managing vSphere
environments. The base tool, shipped with vCenter Server and vCenter Server Appliance (VCSA), provides
administrators with tools to view the operational state of, and configure most aspects of, their virtual data
center, including clustering components, load-balancing policies, storage policies, assignment
of SSDs to vFlash Cache volumes, and more.

For IO Filters, the VWC includes methods for administrators to:

n Deploy an IO Filter Solution into a cluster, meaning that the method causes the Filter's VIB to get
installed into each ESXi host in the cluster.

n Configure properties of an IO Filter Solution, which get surfaced to the VWC via entries in the filter's
build environment files. That is, when you specify that your filter has a configurable property (for
example cache read-ahead) in your filter's build configuration variables, then compile the VIB, the VIB
includes the information you provide about those configurable properties. When the VIB is deployed to
the cluster, the VWC is able to see the configurable properties in the VIBs and provide a standard UI for
changing the properties. When the administrator changes the property, the VWC sends the change
command, via the vCenter Server, to the IO Filter's CIM Provider, which can then effect the change in
the other components of the IO Filter as appropriate (for example, sending the command to change the
read-ahead value to the Filter's daemon or LI). Note that the CIM provider is not involved when the
change is only to a filter property / capability.

VMware provides an un-gated, free SDK that allows anyone, but typically partners, to develop plugins that
provide additional functionality for managing their solutions within a vSphere environment. As a vSphere
IO Filter Developer, you only need to develop a VWC plugin for your filter if you need functionality beyond
what is provided by the base VWC, that is, beyond requesting reads / writes to filter properties. If you do
decide to create your own VWC plugin, you should still use the CIM framework for communicating
requests to your solution.

Creating VWC plugins is a large topic, for which VMware offers a 5-6 day bootcamp. Thus further
discussion of developing VWC plugins is beyond the scope of this course.

Summarizing the Architecture of a vSphere IO Filter Solution


To summarize the architectural features of IO Filters:

n Each IO Filter must provide Library Instance code that gets invoked by the IO Filter Framework to filter
IOs to designated VMDKs. The library code runs in the context of a VM's VMX user-space cartel and / or
in the context of a user-space cartel performing off-line processing of the VMDK.

n Each IO Filter can optionally provide a Daemon that runs in the context of the iofilterd user-space
cartel. Daemons are started by the IO Filter Framework during deployment of the filter's VIB on an
ESXi host.

n IOs to VMDKs from VMs enter the storage stack via the VMM invoking the vSCSI module. IOs to VMDKs
from on-host user-space programs, including hostd, enter the storage stack via the POL module. The hostd
process proxies IOs that originate in Windows / Linux applications for this off-line processing. The
combination of these factors indicates that all IOs to VMDKs coming from these three sources enter the
ESXi storage stack and are thus subject to IO Filtering.

NOTE IOs to VMDK files on shared storage that do not go through ESXi are not subject to IO filtering. For
example, a VMDK on an NFS file store, opened by a program using standard POSIX open/close/read/write
system calls will not have its IOs filtered by vSphere IO Filters.

Executing filter code in the context of a user-space cartel has the benefit that fatal flaws in a given filter's code
only affect VMs whose VMDKs are attached to said filter. This contrasts with kernel-based filtering
solutions, where a fatal flaw in the filtering code causes the kernel to crash. This user-space architecture
increases the reliability of the ESXi hosts.

Chapter Summary
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:

n Understand the flow of an IO between a guest OS and a Virtual Machine Disk (VMDK), with and
without an IO Filter, including understanding the role of:

n The VMX

n The Virtual SCSI (vSCSI), POSIX Object Layer (POL), File System Switch (FSS), and SSMod kernel
modules

n SSlib in the VMX and user-space cartels

n IO filter Library Instances and Daemons

n Understand the purpose of vSphere IO Filters

n Understand the purpose of IO Filter daemons in IO Filter Solutions

n Understand the role of CIM providers and VWC plug-ins in IO filter Solutions

n Diagram the architecture of IO flows when vSphere IO Filters are used

n Understand issues related to IO Filters and vSphere clusters

n Define the terms filter-private data, instance data and sidecar

n Understand the use of sidecars and instance data in IO Filter Solutions

n Understand the difference between on-line and off-line IO Filter operations

Review Questions - Overview of vSphere IO Filters - Purpose, IO Flows, Architecture, Components and Key Concepts
1 Which of the following are supported classes / use cases of vSphere IO Filters today? (choose all that
apply)

a Caching

b Encryption

c Inspection

d Replication

2 Which of the following are required components of a vSphere IO Filter solution developed by you (a
partner)? (choose all that apply)

a Library Instance

b Daemon

c CIM Provider

d VWC Plugin

3 In which of the following contexts does SSLib run for a VM? (Choose the best answer)

a DiskLib of a user-space cartel

b User-space part of VMM

c VMX user-space cartel

d A VMkernel module

4 Which of the following components does VMware recommend control VFFS cache files for a caching
filter solution?

a The CIM provider

b The daemon

c The LI

d The VWC plugin

5 The LI of an online IO Filter runs in the context of which program?

a VMM

b VMX

c iofilterd

d VMkernel

6 Which of the following statements best describes the role of SSLib?

a It receives events from SSMod / hostd / vpxa and invokes the appropriate callback of the LI

b It attaches and detaches to / from VMDKs

c It sends IO requests to LIs

d It proxies IOs between LIs and their Daemons

7 Where do IO Filters store persistent meta-data about the VMDKs they filter?

a Instance Data

b Sidecars

c The .vmdk meta data file for each VMDK

d The Daemon's heap

3 Deploying and Testing an IO Filter Development Environment
Chapter Objectives
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:
n List the components of a minimal environment for developing and testing IO Filter Solutions

n Deploy the VAIODK to a supported development platform, including:

n Locating and downloading the VAIODK

n Installing the VAIODK

n List the key folders, files, and tools created by installing the VAIODK, including the sample filters

n Remove the VAIODK

n Build a sample filter and list the results of a successful build

n Deploy an IO Filter to ESXi systems in a cluster

n Test an IO Filter, including:

n Creating a new VMDK for a VM

n Adding an IO Filter to a VMDK, using the ESXi shell and VWC

n Generating IOs to the VMDK

n Viewing filter log messages in the appropriate log files

This chapter includes the following topics:

n “Understanding the Development Environment and its Requirements,” on page 32

n “32 and 64-bit Architectures in IO Filter Solutions,” on page 33

n “Deploying the vSphere API for IO Filters SDK (VAIODK),” on page 33

n “Understanding a Filter's Directory Contents and Building a VAIODK Sample Filter,” on page 47

n “Testing an IO Filter,” on page 49

n “Chapter Summary,” on page 84

n “Review Question Answers - Deploying and Testing an IO Filter Development Environment,” on page 84

Understanding the Development Environment and its Requirements


VMware provides the VAIODK to members of the vSphere IO Filters program. The VAIODK, combined
with the required environment detailed in this topic, provides the tools you need to develop / test / debug
your own IO Filter Solution.

Some key properties of this SDK include the following:

n A C tool chain that includes a cross compiler and customized gdb debugger, linker, etc. The entire
toolchain must run on a supported 64-bit Linux system. It also includes make and scons, which are used
to build IO Filter VIBs / Bundles. If you are unfamiliar with scons:

n Don't Panic! This course teaches you what entries to make in scons-related build files. You don't
have to learn scons to create IO Filters.

n That said, consider learning about it at www.scons.org and/or http://en.wikipedia.org/wiki/SCons

n A customized version of the CIM Provider Development Kit (CIMPDK) to build CIM providers

n A customized version of the Host Extension Development Kit (HEXDK) that the toolchain uses to create
a VIB and Bundle with which you can deploy your IO Filter Solution to ESXi hosts and vCenter Servers

n It is possible to include your own static ELF libraries in your IO Filter Solution, allowing for modular
development of your solution.

As indicated by the presence of a C tool chain, all IO Filters must be written in C. No other language is
supported at this time, including C++. You must use the VAIO-provided toolchain to compile your filters
and build your VIBs / Bundles.

To develop and test vSphere IO Filters, you must have the following in your development environment:

n A vSphere cluster using vCenter Server 60U1 or VCSA 60U1 with:

n At least two ESXi 60U1 systems, each with a VMkernel vNIC configured for vMotion traffic. You
should enable SSH to each host, and disable ESXi shell timeouts.

n vSphere DRS configured. Consider changing the Automation parameter from the default of Fully
Automated to Manual.

WHAT'S NEW DRS is no longer a requirement starting in ESXi 60U2. This will allow customers to
upgrade the IO Filter by moving the hosts into maintenance mode and then performing the
upgrade or uninstall process without the need of DRS.

n A VM installed to the shared datastore. The guest OS in the VM must be one that can support
dynamically adding additional disks. It can be anything that you know how to administer: DOS,
Windows, Linux, etc. You must have an appropriate license for said guest OS.

IMPORTANT The build of ESXi you use to deploy your ESXi hosts must match the build of the VAIODK.
Mismatches will (almost certainly) cause IO Filters to fail to load, VMs to crash in unexpected ways, etc.

NOTE The authentication system used between vCenter Server and ESXi hosts is time sensitive. If the
systems' times are too far out of sync, certain cluster operations may fail. Therefore, a best practice is to
configure the vCenter Server and ESXi hosts to use the same Network Time Protocol (NTP) server.

n A 64-bit Linux system with which you write, build, and debug your IO Filter. To do this you must
install the VAIODK onto this system and follow the other procedures discussed in this chapter.

While you can use just about any 64-bit Linux system to do your development today, VMware
recommends using the VMware Workbench VM, installed in a Fusion / vSphere / Workstation
environment that is not part of the cluster you will be using to test your IO Filter.

NOTE VMware Workbench VM is currently based on SUSE Linux Enterprise Server (SLES) 11 SP 3. You get access
to this product free from VMware as part of belonging to the IO Filters program. It is provided through
a special licensing agreement with Novell, owners of SLES, eliminating the need for you to buy a license
for the version of SLES used for Workbench yourself.

Another advantage of using VMware Workbench VM is that you can use its customized Remote System
Explorer to install and remove VIBs from ESXi hosts instead of having to do this from the ESXi CLI.

32 and 64-bit Architectures in IO Filter Solutions


The user-space cartels in which an IO Filter's CIM Provider, Daemon, and Library Instance are loaded are a
mix of 32 and 64-bit executables. Specifically:

n CIM Providers are compiled to 32-bit ELF shared object (.so) files that get loaded into the context of the
sfcbd (Small Footprint CIM Broker daemon) cartel, which is a 32-bit executable.

n IO Filter Daemons are compiled to 64-bit ELF shared object (.so) files that get loaded into the context of
the iofilterd cartel, which is a 64-bit executable.

n IO Filter Libraries are compiled to both 32-bit and 64-bit ELF shared object (.so) files that get loaded
into the context of 32-bit and 64-bit executables, respectively. The key user of the 64-bit Library is
vmx. Most other user-space cartels, such as vmkfstools, hostd, and vpxa, are currently 32-bit executables
and thus use the 32-bit Library.
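
Because the same Library source is built for both 32-bit and 64-bit cartels, any record that a 32-bit and a 64-bit component both read needs the same layout in both builds. The following is a minimal sketch of one way to get that, assuming a hypothetical header record (DemoSharedHeader) at the start of a shared region. None of these names come from the VAIODK. The sketch uses only fixed-width integer types, stores an offset rather than a pointer, and orders the members so that the 64-bit fields fall on 8-byte boundaries without compiler-inserted padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical header for a region shared between 32-bit and 64-bit code.
 * It contains no pointers and no 'long' or 'size_t' members, because the
 * size of those types differs between the two builds.
 */
typedef struct DemoSharedHeader {
    uint32_t magic;       /* identifies the region */
    uint32_t version;     /* layout version */
    uint64_t dataOffset;  /* payload offset from the start of the region */
    uint64_t dataLength;  /* payload length in bytes */
} DemoSharedHeader;

/* Fail the build if the layout ever differs from the expected one. */
static_assert(sizeof(DemoSharedHeader) == 24,
              "header size must not depend on word size");
static_assert(offsetof(DemoSharedHeader, dataOffset) == 8,
              "dataOffset must sit at byte 8 in every build");

/* Turn the stored offset back into a pointer valid in this process. */
void *demo_payload(void *regionBase, const DemoSharedHeader *hdr)
{
    assert(regionBase != NULL && hdr != NULL);
    return (unsigned char *)regionBase + (size_t)hdr->dataOffset;
}
```

Storing an offset instead of an address also means the record stays meaningful when the two processes map the region at different virtual addresses.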

This mix of 32 and 64-bit executables has several implications for writing code, leading to the following guidelines:

n The structure that the VAIODK defines for IO elements (scatter/gather items), VMIOF_DiskIOElem, holds
the address of its read/write buffer in a member, addr, that is declared to be of type uint64_t. To convert this
address to a C pointer, and vice versa, use the macros A2P() and P2A(), which are defined as follows:

#define A2P(_a) ((void *)(uintptr_t)(_a))


#define P2A(_p) ((uint64_t)(uintptr_t)(void *)(_p))

These macros are defined in at least one of the sample filters that come with the VAIODK and are discussed
in “Understanding the Results of a Successful VAIODK Install,” on page 43.

n You cannot share a mutex between 32 and 64-bit code. For example, if you create a shared memory
segment between your LI and Daemon, and control access to the area with a pthread_mutex_t object,
this will work for LIs running in a VMX context, but not in the context of, for example, vmkfstools.

n System V semaphores are an alternative to pthread mutexes when working with 32 and 64-bit code.

n If you want to include your own static library in an IO Filter solution, you must provide both 32 and 64-bit
versions of the library so that it, too, can get loaded into both 32 and 64-bit cartels such as vmx and
vmkfstools.

n Do not write code that depends on the size of a pointer
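
To illustrate the first guideline above, the following sketch round-trips a buffer pointer through a 64-bit address field using the A2P() and P2A() macros exactly as quoted. The DemoIOElem structure and the demo_* functions are hypothetical stand-ins written for this example; the real VMIOF_DiskIOElem is defined in the VAIODK headers and may have different members.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Macros as quoted in the guideline above. */
#define A2P(_a) ((void *)(uintptr_t)(_a))
#define P2A(_p) ((uint64_t)(uintptr_t)(void *)(_p))

/* Hypothetical stand-in for an IO element; not the real VAIODK type. */
typedef struct DemoIOElem {
    uint64_t addr;    /* buffer address, always stored as 64 bits */
    uint32_t length;  /* buffer length in bytes */
} DemoIOElem;

/* Record a buffer in the element, converting pointer -> address. */
void demo_fill_elem(DemoIOElem *elem, void *buf, uint32_t len)
{
    assert(elem != NULL);
    elem->addr = P2A(buf);
    elem->length = len;
}

/* Zero the element's buffer, converting address -> pointer. */
void demo_zero_elem(const DemoIOElem *elem)
{
    assert(elem != NULL);
    memset(A2P(elem->addr), 0, elem->length);
}
```

Going through uintptr_t means the same source compiles cleanly in both builds: in a 32-bit build the pointer is widened to 64 bits on the way in and narrowed on the way out, and in a 64-bit build both conversions leave the value unchanged.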

Deploying the vSphere API for IO Filters SDK (VAIODK)


At a high level, deploying the VAIODK is a 3 or 4-step process as outlined in the procedure here. Each of the
sub-topics of this topic provides details for these steps. This procedure assumes that you have a Linux
system on which you have not previously installed the VAIODK. If you have previously installed the
VAIODK on your Linux system, you must first remove it before reinstalling the same or a newer version.

Prerequisites
In order to deploy the VAIODK, you must:

n Have a 64-bit Linux system installed and available for hosting the VAIODK, as discussed in
“Understanding the Development Environment and its Requirements,” on page 32

n Have received the download location for the VAIODK from the VAIO program management.

Procedure
1 Download and unwrap the VAIODK package.

2 Install the VAIODK from the downloaded and unwrapped content.

3 Verify that the installation created content in the appropriate places on the disk.

What to do next
Build the sample filters as discussed in “Understanding a Filter's Directory Contents and Building a
VAIODK Sample Filter,” on page 47

Downloading and Unwrapping the VAIODK


When VMware accepts your organization into the IO Filters program, a program manager sends your
technical contact an email with credentials for an SFTP site where VMware has staged the components you
need to build a development environment, including versions of ESXi, vCenter Server, and certain
documentation. The following figure illustrates the files in the GA 60U1A directory on the SFTP
site:

Figure 3‑1. VAIODK - GA 60U1A Files on SFTP Site

The files listed in the preceding figure are located in the folder /iofilter-early-access/20151014-3074370-
VAIODK-GA-ESXi6.0U1A. Your account on the SFTP site does not have permission to browse the directories on
the site. Therefore you must specify files to get by their full pathname. VMware recommends using tools
such as Filezilla or WinSCP to access and download files from the SFTP site.

Historically, the VAIODK was staged on this server. Now, you download the VAIODK from Developer Center as
discussed in the next topic.

It's important to note that the VAIODK install requires Workbench version 3.5.3.

Given the above, the procedure for installing and unwrapping the VAIODK is:

Procedure
1 Download the VAIODK from Developer Center.

2 Download Workbench VM 3.5 from Developer Center (as discussed in the course VMware
Fundamentals for Developers) and deploy it in a virtualized environment that is NOT part of your test
environment. (The developers of this course typically run Workbench in VMware Fusion or VMware
Workstation.)

3 Download the Workbench 3.5.3 update package from Developer Center, and install it within the
Workbench application (Eclipse) in the Workbench VM.

4 Install the VAIODK archive as described in the next topic: “Installing the VAIODK using VMware
Workbench,” on page 35.

Installing the VAIODK using VMware Workbench

Prerequisites
The VMware Fundamentals for Developers course discusses how to install the VMware Workbench. You can find
the Workbench VM download at https://developercenter.vmware.com/group/workbench/vm/3.5. You must
also have updated your Workbench environment to version 3.5.3 by installing the VMware Workbench Core
Update found on the page at that same URL.

Procedure
1 If you have not already done so, start the VMware Workbench VM. Log in as either root or
vmware. The default password for both is vmware.

Whichever username you use, when you start the Workbench app, it runs as root.

Figure 3‑2. Workbench login as root

2 Start the Workbench application by clicking on the VMware Workbench icon on the desktop.

Figure 3‑3. Start VMware Workbench

3 Specify the location where you want your project to reside. Click OK to continue.

Figure 3‑4. Select a Workspace

4 You should have downloaded the VAIODK from https://developercenter.vmware.com/group/sdk/60/io-filter.
Select Help > Install New Software... from within the VMware Workbench.

Figure 3‑5. Install VAIODK

5 In the Install Wizard, click on Add to specify the location of the VAIODK zip file that you downloaded
in the previous step.

Figure 3‑6. Install Software Wizard

6 Click on Archive and browse to the location of the VAIODK zip file and select it.

Figure 3‑7. Select VAIODK zip file to install

7 Click on Select All to include all the files inside the zip file. Click on Next to proceed with the VAIODK
installation.

Figure 3‑8. Install VAIODK contd.

8 You are prompted with the Licensing Terms and Conditions. Select the I accept the terms of the license
agreements radio button. Then click on Finish to complete the VAIODK installation.

Figure 3‑9. Accept License Agreement

9 To verify that the VAIODK installation was successful, open a terminal and execute the command rpm -qa
| grep vmware-esx. The display is similar to the following screenshot.

Figure 3‑10. Verify VAIODK Installation

The VAIODK Installation is complete.

NOTE Install the VAIODK only as the root user. It is recommended that you do your IO
Filter solution development as a non-root user. For example, you could use the vmware account
(username: vmware, password: vmware), which is already available.

Understanding the Results of a Successful VAIODK Install

NOTE Within this topic, the term buildId refers to a specific build of ESXi. You must use the version of ESXi
indicated in the buildId of a VAIODK.

Installing the VAIODK results in adding the following key items to the filesystem:

n /build/toolchain — This directory is the traditional location of VMware's internal tools and supporting
tools (such as various versions of Perl, gcc, etc.). These tools will migrate to /opt/vmware/toolchain
over time.

n /opt/vmware — This directory is where VMware installs its SDKs. It includes directories such
as /opt/vmware/toolchain, which is the (new) location for VMware's internal tools and supporting
tools. Sub-directories containing SDKs have a name in the form SDKName-SDKVersion, for example
vaiodk-6.0.0-2329218.

n /usr/bin/vibauthor and /usr/bin/vibpublish — These are symbolic links to utilities
in /opt/vmware/vibtools*. The tools themselves create VIBs and offline bundles. The symbolic links
exist for historical reasons.
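
Since SDK sub-directories are named in the form SDKName-SDKVersion (for example vaiodk-6.0.0-2329218), and the trailing number is the ESXi build that the SDK must be paired with, a small helper can split such a name and compare the build with that of a host. The following is a sketch only. DemoSdkId, demo_parse_sdk_dir, and demo_build_matches are hypothetical names, not VAIODK tools, and obtaining the build number of an ESXi host is outside the scope of the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of a directory name such as
 * "vaiodk-6.0.0-2329218". */
typedef struct DemoSdkId {
    char          name[32];     /* e.g. "vaiodk" */
    char          version[32];  /* e.g. "6.0.0" */
    unsigned long buildId;      /* e.g. 2329218 */
} DemoSdkId;

/*
 * Parse "name-version-buildId". Returns 0 on success, -1 if the input
 * does not have exactly that shape.
 */
int demo_parse_sdk_dir(const char *dirName, DemoSdkId *out)
{
    char trailing;
    int  n;

    assert(dirName != NULL && out != NULL);
    /*
     * %31[^-] reads up to 31 characters that are not '-'. The final %c
     * is expected NOT to match: if it does, there is unexpected text
     * after the build number and the name is rejected.
     */
    n = sscanf(dirName, "%31[^-]-%31[^-]-%lu%c",
               out->name, out->version, &out->buildId, &trailing);
    return (n == 3) ? 0 : -1;
}

/* Non-zero if the SDK build equals the given ESXi build number. */
int demo_build_matches(const DemoSdkId *sdk, unsigned long esxiBuild)
{
    assert(sdk != NULL);
    return sdk->buildId == esxiBuild;
}
```

A check like this could run at the start of a build or deployment script so that a mismatch is reported before a filter is loaded on a host.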

Within the /opt/vmware directory, VAIODK installation creates several sub-directories, of which the
following are key:

n cimpdk — As the name suggests, this directory contains the CIM Provider Development Kit (CIMPDK).
This allows IO Filter Solutions to include a CIM Provider. The tool chain for building an IO Filter VIB
invokes tools in this CIMPDK.

n vaiodk-6.0.0-buildId — This directory contains the core of the VAIODK as discussed later in this
topic.

n vaiodk-symbols-6.0.0-vaiodk-6.0.0-buildId — This directory contains the symbol files, beyond those
of an IO Filter itself, that are required to debug an IO Filter

While writing code for, building, and debugging IO Filters, you will spend most of your time working
within the /opt/vmware/vaiodk-6.0.0-buildId folder. The key sub-directories within this directory after
install are illustrated in the following figure:

Figure 3‑11. Structure of vaiodk-6.0.0-buildId directory, as installed

The three key sub-directories are:

n debug — This directory contains items needed to perform post-mortem debugging on IO Filter
components

n docs — As the name suggests, this directory contains documentation for VAIODK. Most notably, the
doc/html directory contains doxygen contents from the VAIODK .h files.

n src — This is the directory under which all IO filter code must be created. It also includes the directory
with the VAIODK include files and samples, as discussed in the next two sub-topics.

Understanding the VAIODK .h files

The directory bora/lib/public/vmiof of the VAIODK's src directory contains all the header files that define
data types and prototypes for the VAIO. It is a best practice to just #include vmiof.h, which includes all of
the other .h files for you. However, the following list describes each of the header files for your background
information.

n vmiof.h — This file #includes all other .h files in this folder.

n vmiof_aio.h — This file provides prototypes for the functions, and related data structures, used to
perform asynchronous IO from within an IO Filter

n vmiof_crossfd.h — This file provides prototypes for the functions, and related data structures, that a
cartel can use to provide access to its memory to other cartels (for example iofilterd and a cartel that
loaded an IO Filter Library) using a file IO abstraction. For more information about these functions, see
“Understanding and Using the IO Filters CrossFD Functions,” on page 184.

n vmiof_daemon.h — This file provides prototypes for the functions, and related data structures, used to
create Daemons for an IO Filter

n vmiof_disk.h — This file provides prototypes for the callback functions defined by an IO Filter, utility
functions used by an IO Filter, and their related data structures. While the callback functions defined
here are only used to write Filter Libraries, the utility functions are used by Filter Daemons as well.

n vmiof_heap.h — This file provides prototypes for the functions, and related data structures, used to
create, destroy, and manage dynamic memory spaces (heaps) within an IO Filter. For more
information about these functions, see “Managing Memory in an IO Filter Solution,” on page 130.

n vmiof_log.h — This file provides prototypes for the functions, and related data structures, used to
create log entries within an IO Filter. For more information about these functions, see “Understanding
Logging in an IO Filter,” on page 50.

n vmiof_poll.h — This file provides prototypes for the functions, and related data structures, used to do
poll-based processing, as an alternative to using the select(2) or poll(2) system calls (or variants
thereof), within an IO Filter. For more information about these functions, see “Understanding and
Using the IO Filters Polling Functions,” on page 141.

n vmiof_scsi.h — This file provides prototypes for the functions, and related data structures, used to
issue SCSI commands within an IO Filter

n vmiof_status.h — This file provides definitions for the common return values returned by callbacks
and utility functions of the VAIO. For more information about status codes, see “Understanding
VMIOF_Status Results for Functions in the VAIO,” on page 102.

n vmiof_timer.h — This file provides prototypes for the functions, and related data structures, for doing
time-based processing within an IO Filter. For more information about these functions, see
“Understanding and Using the IO Filters Timer Functions,” on page 171.

n vmiof_cache.h — This file provides prototypes for the functions, and related data structures, used to
create, destroy, and perform IO to files on an ESXi host's vFlash File System (VFFS), created from an
aggregate of the host's SSDs. For more information about these functions, see “Understanding and
Using the VMIOF_Cache*() Functions,” on page 253.

n vmiof_work.h — This file provides prototypes for the functions, and related data structures, for
implementing work-pile based processing within an IO Filter. For more information about these
functions, see “Understanding and Using the IO Filters Worker Functions,” on page 174.

Understanding the VAIODK Sample Solutions

The VAIODK includes several sample IO Filter Solutions, each of which serves a specific purpose in aiding
you to develop your own IO Filter Solution. All of the samples are located within sub-directories
of /opt/vmware/vaiodk-version-build/src/partners/samples/iofilters. The samples and their
descriptions, in order of complexity, are as follows:

n sampfilt — This solution includes LI and CIM provider components, but no daemon component. The
LI does not communicate with the CIM provider.

The LI contains the minimum code necessary for a functional LI: it has just the required entry points,
each of which makes a log entry and then returns a value indicating that the callback succeeded.
The CIM provider is similarly minimal.

You can use this sample in several ways:

n Validate the VAIODK build capability — After installing the VAIODK, build this solution. If you
encounter errors, there is almost certainly something wrong with your VAIODK deployment.

n Validate the VAIO runtime in your ESXi hosts — After building this solution, deploy its VIB to an
ESXi host, attach it to the VMDK of a VM, and power on the VM. If you have any problems doing
this, you have issues in your development environment.

n Understand callback invocation — When an entity loads and runs this filter, you can observe
when the IO Filter Framework invokes various callbacks. For example, you can perform a
snapshot, migration, power on/off, etc. and observe when the framework invokes callbacks such as
diskOpen(), diskIOStart(), diskStun(), diskUnStun(), etc. Further, you can enhance the code by
having it log the values passed into parameters of the various callbacks to gain an understanding of
said parameters.

While this filter is designated as a replication class filter, in fact it performs no replication.

n countIO — This solution is only slightly more complicated than sampfilt. It includes a daemon
component in addition to an LI and CIM provider. That said, none of the components communicate
with one another.

The daemon component contains minimal functionality, including the callbacks for starting, stopping,
and cleaning up after a daemon stops.

The LI callbacks include code to manipulate non-persistent per-VMDK meta-data (private data) and
to count the number of times the IO Filter Framework invokes the diskIOStart() callback. Both
components use the VMIOF_heapAllocate() and VMIOF_heapFree() functions (see “Managing Memory in
an IO Filter Solution,” on page 130).

This filter is designated as a caching filter, even though it performs no caching. Thus, you can use this
filter, in combination with sampfilt, to understand the sequence in which the IO Filter Framework
invokes callbacks on a VMDK with multiple filters attached. That is, if you configure a VMDK with
both the sampfilt and countIO filters attached, you can see that for any given event, the IO Filter
framework invokes the callback for sampfilt before countIO because of their respective classes, and that
the IO Filter Framework invokes callbacks for replication filters before caching filters.

n proxy — This solution is more complicated still, though it does not include a daemon component. The
complexity added in this solution is in how it processes IO requests in diskIOStart(). Instead of just
completing each IO request synchronously, this filter illustrates several commonly used programming
patterns for handling IOs, including:

n Duplicating and submitting filter-generated IO requests — The LI creates a duplicate of each IO
request it receives, and submits the duplicate, not the original, to the disk.

n Registering for IO completion — Before submitting the duplicate IO to the IO Filter Framework,
the LI registers a callback function for the IO Filter Framework to call upon completion of the IO
request. In that callback, the LI completes the original IO request, and frees the duplicated IO it
had previously created and submitted.

n Asynchronous IO completion — After duplicating, registering for completion, and submitting the
duplicate IO to the IO Filter Framework, the LI signals said framework that it will complete the
original IO request asynchronously. Per the preceding bullet, the completion callback signals said
framework that the original IO is complete after it receives the results using the duplicate IO
request.

n sampcache — This solution implements a full-featured write-through caching IO Filter Solution. At a
high level, it includes:

n A fully functional LI and Daemon that communicate with one another via UNIX domain sockets
and shared memory using the CrossFD functions (see “Understanding and Using the IO Filters
CrossFD Functions,” on page 184)

n Use of the vFlash functions to create cache files on a cache device, for example an SSD, (see
“Understanding and Using the VMIOF_Cache*() Functions,” on page 253)

n Use of sidecar functions to manage persistent per-VMDK meta-data (see “Using Sidecars Functions
in Library Code to Keep Persistent Per-VMDK Meta Data,” on page 137)

n Asynchronous IO completion — Requests are always completed asynchronously, after read data is
retrieved from the cache or disk and after writes are completed to the disk and cache, respectively.

n IO request duplication, allocation, and freeing — Read requests are fulfilled from the cache as
much as possible, and from the disk for whatever is not in the cache. For the data missing from the
cache, the filter creates a new IO request and submits it to the disk, filling in the original request
with the complete data on completion of the filter-generated IO request.

n Use of work-pile and poll callbacks — The LI and daemon both use work-piles to complete
background processing of certain tasks. They also use poll callbacks to process reads/writes
through their shared socket.
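
The duplicate, register, and complete-asynchronously pattern described for the proxy sample above can be sketched in plain C as follows. This is not the sample's code and does not use any VAIODK function. DemoIO, demo_submit(), demo_run_pending(), and demo_io_start() are hypothetical names, and the "framework" here is a toy that holds a single pending request so the control flow can be followed end to end.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoIO DemoIO;
typedef void (*DemoDoneFn)(DemoIO *io, int status, void *ctx);

/* Hypothetical IO request; not a VAIODK type. */
struct DemoIO {
    uint64_t   offset;
    uint32_t   length;
    int        completed;  /* set once the request has been completed */
    int        status;     /* result reported at completion */
    DemoDoneFn onDone;     /* completion callback, may be NULL */
    void      *ctx;        /* argument passed to onDone */
};

enum { DEMO_IO_ASYNC = 1 };  /* "this request will be completed later" */

/* Toy framework: remembers one submitted request until the disk is done. */
static DemoIO *demoPending;

void demo_submit(DemoIO *io)
{
    demoPending = io;
}

/* Simulate the disk finishing the pending request with 'status'. */
void demo_run_pending(int status)
{
    DemoIO *io = demoPending;

    demoPending = NULL;
    if (io != NULL && io->onDone != NULL) {
        io->onDone(io, status, io->ctx);
    }
}

/* Completion callback: complete the original, then free the duplicate. */
static void demo_dup_done(DemoIO *dup, int status, void *ctx)
{
    DemoIO *orig = ctx;

    orig->status = status;
    orig->completed = 1;
    free(dup);
}

/* Stand-in for an IO-start callback that proxies the request. */
int demo_io_start(DemoIO *orig)
{
    DemoIO *dup;

    assert(orig != NULL);
    dup = malloc(sizeof *dup);
    if (dup == NULL) {
        return -1;
    }
    *dup = *orig;                 /* duplicate the request */
    dup->completed = 0;
    dup->onDone = demo_dup_done;  /* register for completion */
    dup->ctx = orig;
    demo_submit(dup);             /* submit the duplicate, not the original */
    return DEMO_IO_ASYNC;         /* the original completes later */
}
```

The original request stays untouched until the completion callback runs, which is what lets the start function return immediately while the duplicate is still in flight.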

VMware maintains a VMware Confidential Copyright on all of the sample code. In general, you are free to
copy the patterns in your own solutions. Check your program contracts for specific language on how you
can (and can't) use the source code in these samples.

Removing the VAIODK

Prerequisites
At this time, uninstall support of the VAIODK in Workbench is unconfirmed. We recommend that
you create a new Workbench VM and install the desired VAIODK.

Understanding a Filter's Directory Contents and Building a VAIODK Sample Filter
You deliver IO Filter Solutions to clusters as offline bundles (archives of VIBs - vSphere Installation Bundles),
via a VIM API function (the specifics of which are discussed in a different topic). This causes vpxd on
the vCenter Server to deploy the VIB(s) in the offline bundle to the ESXi hosts in a specific cluster. This is
the way VMware expects filters to get deployed in production environments.

During development, as an alternative, you can deploy an IO Filter Solution to ESXi hosts as VIBs. This is
faster than deploying to a whole cluster and is appropriate when testing non-cluster related functionality.

Thus, the objective of an IO Filter build is the generation of its VIB and bundle files.

NOTE If you are unfamiliar with VIBs and bundles, see
http://blogs.vmware.com/vsphere/2011/09/whats-in-a-vib.html and
http://pubs.vmware.com/vsphere-51/index.jsp?topic=%2Fcom.vmware.cimsdk.smashpg.doc%2F05_CIM_SMASH_PG_Offline_Bundles.7.1.html, respectively.

Each of the VAIODK sample filters is ready to build as soon as you install the VAIODK because each
contains all the required, and sometimes optional, components of an IO Filter Solution. This section discusses
what those components are and how to build them into a VIB and bundle.

Specifically, each IO Filter source directory must contain the following items:

n Source for the filter's LI, based on the name of the filter, for example sampfilt.c, countIO.c, etc.

n A source file for the filter's daemon, if present. Remember that the sampfilt and proxy sample solutions
do not include daemons.

n A sub-directory with a build environment for the filter's CIM provider. The sub-directory's name is also
based on the filter name using the pattern cim/filter-name, for example cim/sampfilt, cim/countIO, etc.
The CIMPDK dictates the contents of this sub-directory. For this topic, it is enough to know that the
CIM provider source is located in cim/filter-name/src.

n A catalogs directory that contains files for localizing the solution.

n A SCONS file, whose name is based on the filter, for example sampfilt.sc, with rules for building the
solution's VIB, bundle, and their components. For details on this file, see “Creating and Populating a
Correct Scons File,” on page 89.

n A Makefile with rules for invoking scons to build the solution's VIB, debugging environment, and to
clean the build environment for this filter.

Together, the contents of each filter's directory let you build the VIB easily:

Prerequisites
To build the VAIODK sample filters, you must have already installed the VAIODK.

Procedure
1 Change into the sample's directory.

2 Enter the command: make

Make invokes SCONS to build the solution. It compiles the 32-bit and 64-bit versions of the filter's LI, the
64-bit daemon shared object, and the 32-bit CIM provider shared object, then builds the VIB and bundle file
from those components.

After successfully building your filter, you should deploy it to, and test it on your vSphere cluster as
described in “Testing an IO Filter,” on page 49.

IMPORTANT Run "make clean" before you run "make". Otherwise, the VAIODK might include old
VIBs in the final bundle. Use "unzip -l <bundle-name>.zip" to verify what is inside the bundle.
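
The build steps above can be scripted. The following is a minimal sketch, assuming a POSIX shell on the Workbench VM. The names run, rebuild_filter, and DRY_RUN are illustrative and are not part of the VAIODK.

```shell
#!/bin/sh
# Hypothetical helper (not part of the VAIODK): rebuild a filter from a clean
# tree, then list the offline bundle so you can confirm which VIBs it holds.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: rebuild_filter <filter-directory> <bundle-zip-relative-to-directory>
# The body runs in a subshell so the caller's working directory is unchanged.
rebuild_filter() (
    cd "$1" || exit 1
    run make clean &&
    run make &&
    run unzip -l "$2"
)
```

For example, from inside the countIO sample directory, rebuild_filter . build/bundle/countio-offline-bundle.zip cleans, builds, and lists the bundle contents. Set DRY_RUN=1 first to preview the commands.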

Understanding the Results of a Successful Build


When you successfully build an IO Filter Solution, the build actions create several artifacts, most placed
within the build sub-directory of the filter's directory:

n 32-bit LI — Located in build/usr/lib/vmware/plugin/libvmiof-disk-countio.so

n 64-bit LI — Located in build/usr/lib64/vmware/plugin/libvmiof-disk-countio.so

n 64-bit Daemon — Located in build/usr/lib64/vmware/plugin/libvmiof-disk-daemon-countio.so

n VIB — Located in build/vib/countio-yourversion-devkitversion.x86_64.vib, where yourversion is
what you specified in the Identification dictionary item in the SCONS file, and devkitversion is a string
created by the VAIODK that includes the vSphere release and build (for example 600 and 2329218). For
example, the VIB created for CountIO using Drop 6 of the VAIODK is
countio-1.0.0-1OEM.600.0.0.2329218.x86_64.vib.

NOTE The x86_64 in the vib file name indicates that it is meant to be loaded on 64-bit ESXi hosts, not
that all of its contents are 64-bit executables. Remember that IO Filter Solution VIBs have a mix of 32
and 64-bit objects.

n Bundle — Located in build/bundle/countio-offline-bundle.zip

The notable exception is that the build creates CIM objects under the filter's cim/filter_name/build/ sub-
directory, for example cim/sampfilt/build/.libs/libsampfiltprovider.so.

Again, these examples are for builds of the countIO sample. For any other filter, replace
countio or countIO in these names with your filter's name.

IMPORTANT Each time you build, you should check the timestamp on the VIB and bundle to ensure that they
were updated by your most recent build.
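
The timestamp check can be automated by comparing each artifact against a marker file created just before the build. This is a minimal sketch, assuming a POSIX shell and find; the function name check_fresh and the marker-file approach are illustrative, not part of the VAIODK.

```shell
#!/bin/sh
# Hypothetical helper: report whether each build artifact is newer than a
# marker file that was created just before the build started.

# usage: check_fresh <marker-file> <artifact>...
# Prints "fresh" or "STALE" for each artifact and returns non-zero if any
# artifact is stale or missing.
check_fresh() {
    marker=$1
    shift
    status=0
    for artifact in "$@"; do
        # find prints the path only when its modification time is newer
        # than the marker; a missing artifact produces no output.
        if [ -n "$(find "$artifact" -newer "$marker" 2>/dev/null)" ]; then
            echo "fresh: $artifact"
        else
            echo "STALE: $artifact"
            status=1
        fi
    done
    return "$status"
}
```

A typical use is to create the marker outside the build directory (for example with marker=$(mktemp)), run make clean and make, and then call check_fresh "$marker" build/vib/*.vib build/bundle/*.zip.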

Testing an IO Filter
IO Filter Solutions are meant to be deployed to a cluster, which in turn causes the cluster's vCenter Server to
deploy them to all hosts within the cluster. That said, during early development, for simplicity, you may choose
to deploy your IO Filter Solution directly to one or more ESXi hosts.

Once you have the IO Filter Solution deployed in your vSphere environment (to a cluster or one or more
hosts), you must attach the filter to one or more VMDKs. Again, for initial simplicity, you may attach it to
only one VMDK, then to multiple VMDKs associated with the same VM, and finally to multiple VMDKs on
multiple VMs.

To prevent damage to the guest OS running in your test VM, while developing your Solution, attach your
filter to and detach it from a non-system disk. VMware recommends just adding an additional VMDK to
your VM and using that with your IO Filter, until you are confident in the correctness of your code. At that
point, a good test of your Solution is to try installing a guest OS in a new VM with your Solution attached to
all of said VM's VMDKs. Try migrating the VM from host to host in the cluster, and taking snapshots,
during the installation. If the installation fails because of an IO Filter issue, you have more work to do.

Once you have your Solution working with multiple VMDKs in multiple VMs, you should also deploy a
sample Solution of a different class than your Solution to ensure that your Solution "plays well with others."
You may even consider adding an alternate test filter which injects various faults into certain operations,
such as preparing for snapshots or migrations, to validate how your Solution responds to the IO Filter
Framework surfacing those faults for your LI.

In general, you should follow these steps to test your solution:

Prerequisites
To test an IO Filter Solution, you must have successfully built its VIB and bundle files and have a minimal
vSphere cluster into which you can deploy them. Your test VM should run a guest OS that you can easily
administer, for example by adding and removing disks and making filesystems on those disks.

Procedure
1 Set each of the ESXi hosts in your cluster to Community Supported mode so that it can accept VIBs whose
signatures can't be verified.

You can now install the VIBs you build in your environment without having to sign each VIB with your
company's certificates.

2 Deploy your filter on a single ESXi host using the filter's VIB file and test it in a single-host environment.

This simplifies the environment in which your filter runs. You don't have to worry about DRS, vCenter
Server issues, using the MOB or VWC, etc.

3 After thoroughly testing your filter on an ESXi host, deploy it to your cluster's vCenter Server using the
filter's offline bundle file and test it in that environment.

What to do next
The sub-topics that follow provide detailed instructions for deploying your filter onto an individual
host and into a cluster.

Placing an ESXi host in Community Supported Mode


Normally, for security reasons, customers install only VIBs that are signed
with certificates from a VMware partner or VMware itself. During development of your IO Filter Solution,
you can make ESXi install your unsigned VIBs in two ways: install the VIB using the
ESXi CLI and tell ESXi to ignore the certificates, or install the VIB using VMware Workbench's Remote
System Explorer.

However, if you deploy your IO Filter Solution to a cluster, the vCenter Server sends the VIB to each host
and proxies the install request through each host's hostd cartel. In this case, hostd does not tell the system to
ignore the VIB certificates, which causes the installation to fail unless you first put the ESXi hosts into
Community Supported mode. In this mode, the ESXi Installer (the back end invoked by all installation
methods) ignores all VIB certificates and just installs the VIB.

DANGER Put only development hosts into Community Supported mode. For security reasons, never put
production systems into Community Supported mode.

Prerequisites
In order to perform the commands in this topic, you must have configured the host for SSH access, enabled
the ESX CLI, and ssh'd into the host as root.

Procedure
1 SSH into your ESXi system as root.

2 Enter this command: esxcli software acceptance set --level=CommunitySupported

This command should print the following: Host acceptance level changed to 'CommunitySupported'.

If so, the ESXi Installer now ignores certificates on VIBs.

3 Verify that the preceding command worked by entering this command: esxcli software acceptance get

This command should print CommunitySupported
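
The two commands above can be combined into a check that changes the acceptance level only when needed. This is a minimal sketch for development hosts, assuming it runs in the ESXi shell as root and that esxcli software acceptance get prints only the level name, as described in step 3. The function name is illustrative.

```shell
#!/bin/sh
# Sketch for development hosts only: make sure the acceptance level is
# CommunitySupported, changing it only when necessary. The function name
# ensure_community_supported is illustrative.

ensure_community_supported() {
    current=$(esxcli software acceptance get)
    if [ "$current" = "CommunitySupported" ]; then
        echo "already CommunitySupported"
        return 0
    fi
    esxcli software acceptance set --level=CommunitySupported >/dev/null || return 1
    # Re-read the level to confirm that the change took effect.
    [ "$(esxcli software acceptance get)" = "CommunitySupported" ] &&
        echo "changed from $current to CommunitySupported"
}
```

Calling ensure_community_supported on each development host before a cluster deployment avoids changing hosts that are already configured.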

Understanding Logging in an IO Filter


There are several files that receive log events from the IO Filter Framework and from your filter. Where
a log message ends up depends on which component generates it and the context in which that
component runs.

Log Messages for LI Components


The .so files of an LI are loaded into a user-space cartel, so log messages are output according to the log
configuration of that cartel. The four most common cartels that load LIs, and their log configurations, are:

n VMX — The VMX sends log messages to the file vmware.log located in the VM's directory. For example, a
VM called foo running on a datastore mounted at /vmfs/volumes/nas1 will have VMX log entries appear
in /vmfs/volumes/nas1/foo/vmware.log.

n vmkfstools — This ESXi CLI command sends log messages to its standard output (stdout). However, it
only makes log messages visible if you set the verbose flag (-v or --verbose) to 5 or higher.

n hostd — This cartel, among many other things, proxies requests from a vCenter Server to perform IO
Filter operations on the host on which hostd runs. For example, when you deploy an IO Filter to a cluster,
and the vCenter Server in turn deploys the VIB to hosts, it uses hostd on each host to perform the VIB
deployment. Hostd sends log messages to /var/log/hostd.log. Check this file for log entries generated
by your LI when you perform operations through the VWC such as attaching and detaching an IO
Filter from a VMDK.

n vpxa — Similar to hostd, vpxa also proxies requests from a vCenter Server to perform IO Filter operations
on the host. Vpxa sends log messages to /var/log/vpxa.log. Logs for some operations initiated by the VWC,
such as disk cloning, go to vpxa.log. If you have a hard time finding log messages, check both
hostd.log and vpxa.log.

Log Messages for Daemon Components


Remember that the iofilterd cartel loads all Daemon components into its address space. This cartel logs
messages when it tries to start, stop, or clean up Daemons to /var/log/iofilter-init.log. If you see repeated
Daemon starts, watchdog timeouts, etc. in this file, your Daemon is failing.

The messages printed by the Daemon go to /var/log/iofilter-[yourfiltername].log.

The iofilterd cartel also sends log messages generated by a Daemon to /var/log/syslog.log.

Log Messages from the IO Filter Framework


Components of the IO Filter Framework, such as the SSMod kernel module, log messages to
/var/log/syslog.log. For example, when the Framework creates new worker threads in a VMX to handle
additional LI work, it generates a message to that effect in this file.

Log Messages for Offline Bundle Installation


When you install an IO Filter Offline Bundle through the vSphere APIs, the installation is actually done by
ESXi Agent Manager (EAM). EAM sends the VIB inside the Offline Bundle to the ESXi hosts and installs it on
them. The messages for this installation are logged in /var/log/vmware/eam/eam.log
and /var/log/vmware/vpxd/vpxd.log on the vCenter Server, as well as in /var/log/esxupdate.log on the
ESXi host(s).

File Location Summary


In summary, the places on the ESXi host to look for log messages related to IO Filtering include:

n /var/log/syslog.log

n /var/log/iofilter-init.log

n /var/log/hostd.log

n /var/log/vpxa.log

n /var/log/esxupdate.log

n /vmfs/volumes/some-datastore/VM-name/vmware.log

n Stdout on the ESXi CLI for commands you run from there that load the LI, for example vmkfstools

The places on the vCenter Server to look for log messages related to IO Filtering include:

n /var/log/vmware/eam/eam.log

n /var/log/vmware/vpxd/vpxd.log
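
When you are unsure which host-side file received a message, you can search all of them at once. The following is a minimal sketch, assuming a shell with grep on the ESXi host. The function name iofilter_grep and the LOG_ROOT variable are illustrative; LOG_ROOT exists only so that the sketch can be tried against copied log files and should be left unset on a real host. The per-filter Daemon log name follows the /var/log/iofilter-[yourfiltername].log pattern given earlier.

```shell
#!/bin/sh
# Hypothetical helper: search the ESXi host log files listed above for lines
# that mention a filter name.

# usage: iofilter_grep <filter-name> [path-to-vmware.log]
iofilter_grep() {
    name=$1
    vmlog=${2:-}
    root=${LOG_ROOT:-}
    for f in \
        "$root/var/log/syslog.log" \
        "$root/var/log/iofilter-init.log" \
        "$root/var/log/iofilter-$name.log" \
        "$root/var/log/hostd.log" \
        "$root/var/log/vpxa.log" \
        "$root/var/log/esxupdate.log" \
        ${vmlog:+"$vmlog"}
    do
        # Skip files that do not exist or cannot be read.
        [ -r "$f" ] || continue
        # -i: filter names appear in mixed case (countio, countIO).
        # -H: prefix each match with the file it came from.
        grep -iH -- "$name" "$f" || true
    done
}
```

For example, iofilter_grep countio /vmfs/volumes/nas1/foo/vmware.log searches the host logs plus the VMX log of the VM named foo from the earlier example.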

Log Levels and the VMIOF_Log() Function


The VAIO provides VMIOF_Log() to generate log messages. Although later chapters of this course discuss
specific functions of the VAIO, it is important to understand this function now. The function has a signature
similar to that of syslog(). The first parameter is the level of the message, which is analogous to
the severity parameter of syslog(). The log levels defined for VMIOF_Log() are:

n VMIOF_LOG_PANIC

n VMIOF_LOG_ERROR

n VMIOF_LOG_WARNING

n VMIOF_LOG_INFO

n VMIOF_LOG_VERBOSE

n VMIOF_LOG_TRIVIA

By default, the IO Filter Infrastructure only logs messages with a level of VMIOF_LOG_WARNING or higher. To
force the infrastructure to log messages with a lower level, do one of two things. Either:

n Edit the .vmx file for the VM, and add log.logMinLevel=X to the end of the file (where X is the minimum
log level you want the infrastructure to use)

n Edit the file /etc/vmware/config on the ESXi host, adding vmx.log.logMinLevel=X to the end of the file
(where X is the minimum log level you want the infrastructure to use)

VMIOF_VLog() Function
The VAIO also provides VMIOF_VLog() to generate log messages. This function has the following prototype:

void VMIOF_VLog(VMIOF_LogLevel level, const char *fmt, va_list ap);

n The first input parameter to this function is the level of the message, which is analogous to the severity
parameter to syslog(). The log levels defined for VMIOF_VLog are:

n VMIOF_LOG_PANIC

n VMIOF_LOG_ERROR

n VMIOF_LOG_WARNING

n VMIOF_LOG_INFO

n VMIOF_LOG_VERBOSE

n VMIOF_LOG_TRIVIA

n The second input parameter is a printf-like format string.

n The third input parameter is the list of message arguments, passed as a va_list.

This function does not return a value.

Testing an IO Filter with an ESXi Host


To test your IO Filter on an individual ESXi host, you need to be able to:

1 Install the VIB on the ESXi host. You can do this using an ESXi CLI command, the Remote System
Explorer in VMware Workbench, PowerCLI, or a Perl or Java program using the VIM API.

2 Attach the IO Filter to a VMDK of a test VM. While you can create a VMDK for a VM with any of the
management tools, you must attach the filter to the VMDK with the ESXi CLI.

NOTE If the VMDK does not exist, you must power off the VM to associate the VMDK with the VM,
then power the VM back on.

3 Perform IO to the disk using the guest OS in your test VM.

As you discover and fix flaws or provide enhancements to your filter, you must remove the existing filter
before you can install a new one.

The following sub-topics provide steps for performing each of these actions.

Staging your VIB for Deployment from the ESX CLI

The ESXi CLI command you use to deploy a VIB to an ESXi host requires that you provide a URL to the VIB
file. The acceptable URLs include:

n file://some-pathname — where some-pathname is a file on the host's filesystem. To use this type of URL,
you must scp the VIB to your ESXi host, or access the VIB staged on an NFS datastore.

n http[s]://IP-or-hostname/some-path — where some-path is the path relative to the document root of
the web server running on the specified IP or hostname.

n ftp://IP-or-hostname/some-path — where some-path is the path relative to the root of an FTP server
running on the specified IP or hostname.

To use the latter two URL forms, you must copy the VIB to an appropriate place under the doc root of the
web server or root of the FTP server.

NOTE To be able to scp files to an ESXi host, you must enable SSH to the host using its Direct Console.

ATTENTION While VMware's Workbench does not currently ship with a standard web server, there is a trick
you can use to create a web server using the build/vib directory of your filter's source directory:

1 Start a shell terminal window.

2 Cd to the build/vib directory within your filter's source. Remember, this directory only exists after you
build your VIB the first time.

3 Enter the following command: python -m SimpleHTTPServer &

You can now browse to the IP address of your Workbench VM, port 8000, using http, and see your VIB file
there. The URL to your VIB is now http://ip-of-WorkbenchVM:8000/vib-file-name, for example:
http://172.16.1.2:8000/sampfilt-1.0.0-1OEM.600.0.0.2329218.x86_64.vib
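
To avoid typing long VIB file names, you can print the staging URL for every VIB in a directory. This is a minimal sketch, assuming a POSIX shell on the Workbench VM and a web server started in that directory as described above. The function name vib_urls is illustrative. On a system that has only Python 3, the equivalent server command is python3 -m http.server, which also listens on port 8000 by default.

```shell
#!/bin/sh
# Hypothetical helper: print the URL under which each VIB in a directory is
# reachable once a web server is serving that directory.

# usage: vib_urls <workbench-ip> <vib-directory> [port]
vib_urls() {
    ip=$1
    dir=$2
    port=${3:-8000}
    for vib in "$dir"/*.vib; do
        # If no VIB exists, the unexpanded pattern is skipped here.
        [ -e "$vib" ] || continue
        echo "http://$ip:$port/$(basename "$vib")"
    done
}
```

For example, vib_urls 172.16.1.2 build/vib prints one URL per VIB, ready to paste into an esxcli command on the ESXi host.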

Deploying an IO Filter VIB to an ESXi Instance Using the ESXi CLI

To test your IO Filter's functionality on a single ESXi host, you must install its VIB to said host. The
following procedure provides one method (of several) to complete this task:

Prerequisites
To install an IO Filter VIB via the ESXi CLI you must have:

n Enabled SSH for your ESXi host from its Direct Console

n Successfully built the VIB

n Staged the VIB so that it is accessible by the ESXi host using a supported URL

Procedure
1 SSH into your ESXi host.

2 Execute the following command: esxcli software vib install -v URL-to-VIB

For example:

esxcli software vib install --no-sig-check -v http://172.16.193.11:8000/countio-1.0.0-1OEM.600.0.0.2329218.x86_64.vib

Another example:

esxcli software vib install --no-sig-check -v file:///tmp/countio-1.0.0-1OEM.600.0.0.2329218.x86_64.vib

NOTE For file: URLs, the text file:// is optional.

This command invokes the ESXi installer, which opens the VIB and deploys its components to the
appropriate places on the host. On success, the command displays output similar to the following:

Installation Result
Message: Operation finished successfully.
Reboot Required: false
VIBs Installed: ZZZ_bootbank_countio_1.0.0-1OEM.600.0.0.2329218
VIBs Removed:
VIBs Skipped:

You should see the following results of a successful VIB deployment:

n If your solution includes a daemon component:

n The deployment should place an init script for it in the file /etc/init.d/iofilterd-filter_name (for
example /etc/init.d/iofilterd-countio).

n The daemon should be running. Verify this in two ways:

n In the ESXi shell, run the command: ps -c | grep iof. You should see one iofilterd cartel
running, with a user-world belonging to that cartel called iofltd-filter_name (for example,
iofltd-countio).

n Check /var/log/iofilter-init.log for messages indicating that your daemon started and did
not terminate due to a watchdog or for some other reason. You should see output similar to
the following in this file:

2014-07-26T04:48:49Z iofilterd_countIO: Starting ...
2014-07-26T04:48:49Z iofilterd_countIO: +/bin/watchdog.sh -d -t 100 -s iofilterd_countIO /usr/lib/vmware/iofilter/bin/iofilterd --filter=countIO --mempool=100
2014-07-26T04:48:49Z iofilterd_countIO: +/etc/init.d/iofiltervpd start iofiltervpd started
2014-07-26T04:48:49Z iofilterd_countIO: +sleep 2
2014-07-26T04:48:51Z iofilterd_countIO: +/usr/lib/vmware/iofilter/bin/iofvp-ctrl-app -a/usr/lib/vmware/vmiof/disk/countIO-config.xml
2014-07-26T04:48:51Z iofilterd_countIO: ... Start is complete

n Your daemon is placed in /usr/lib64/vmware/plugin/libvmiof-disk-daemon-filter_name.so.

n Your LI is placed in /usr/lib/vmware/plugin/libvmiof-disk-filter_name.so
and /usr/lib64/vmware/plugin/libvmiof-disk-filter_name.so for the 32-bit and 64-bit versions,
respectively.

n The XML file created from parsing the CAPABILITIES part of your filter's .json file is placed
in /usr/lib/vmware/vmiof/disk/filter_name-config.xml.

n The XML file created from parsing the NETWORK part of your filter's .json file is placed
in /etc/vmware/firewall/vmiof-disk-filter_name.xml.

n The CIM provider is placed
in /var/lib/sfcb/registration/repository/filter_name*/filter_name/cimv2.
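
The file locations listed above can be checked in one pass after an install. The following is a minimal sketch, assuming it runs in the ESXi shell. The function name check_deployed and the ESX_ROOT variable are illustrative; ESX_ROOT exists only so that the sketch can be tried against a copied directory tree and should be left unset on a real host. The sketch treats the LI and capabilities files as required and the Daemon files as optional, because not every solution includes a Daemon.

```shell
#!/bin/sh
# Hypothetical helper: check that the files described above exist for a
# deployed filter. Pass the filter name exactly as it appears in the paths.

# usage: check_deployed <filter-name>
# Returns non-zero if a required LI or capabilities file is missing.
check_deployed() {
    name=$1
    root=${ESX_ROOT:-}
    missing=0
    for p in \
        "/usr/lib/vmware/plugin/libvmiof-disk-$name.so" \
        "/usr/lib64/vmware/plugin/libvmiof-disk-$name.so" \
        "/usr/lib/vmware/vmiof/disk/$name-config.xml"
    do
        if [ -e "$root$p" ]; then
            echo "present: $p"
        else
            echo "MISSING: $p"
            missing=1
        fi
    done
    # Daemon artifacts exist only when the solution includes a daemon.
    for p in \
        "/usr/lib64/vmware/plugin/libvmiof-disk-daemon-$name.so" \
        "/etc/init.d/iofilterd-$name"
    do
        if [ -e "$root$p" ]; then
            echo "present: $p"
        else
            echo "absent (daemon only): $p"
        fi
    done
    return "$missing"
}
```

For example, check_deployed countio reports each path and returns a non-zero status if either LI library or the capabilities XML file is missing.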

Deploying an IO Filter VIB to an ESXi Instance Using Workbench

This topic builds on the content presented in “Deploying an IO Filter VIB to an ESXi Instance Using the ESXi
CLI,” on page 53. It has, essentially, the same prerequisites and purpose. However, this topic discusses how
to complete the same task using the Remote System Explorer component of VMware Workbench instead of
the ESXi CLI.

One difference is that you do not need to stage the VIB to deploy it with Workbench. That is handled under
the covers for you by Workbench. The other difference is that you need to have configured your ESXi host in
the Remote System Explorer before you can follow this procedure.

NOTE The Remote System Explorer (RSE) is an open source plugin to Eclipse, a key component of VMware
Workbench. VMware expanded RSE's functionality to understand ESXi type hosts and perform certain
operations with said hosts, including installing and removing VIBs. The VMware Fundamentals for Developers
course discusses how to create RSE connections to ESXi hosts.

Once you've successfully built your filter's VIB, to deploy it with Workbench, follow these steps:

Procedure
1 If you have not already done so, start VMware Workbench.

2 Select the Remote System Explorer Perspective.

One way to do this is to select Window > Open Perspective > Remote System Explorer.

In one of its panes, Workbench displays the RSE perspective similar to the following:

Figure 3‑12. Workbench: Remote System perspective, with ESXi hosts added

3 Right-click on the ESXi host to which you would like to deploy the VIB and then select VMware >
Install Package...

Workbench displays the Install Wizard similar to the following:

Figure 3‑13. Workbench: Installation Wizard, first dialog

Notice that the wizard includes a list of all of the ESXi systems that exist in the RSE, with the one you
right-clicked on already checked. To install the VIB on multiple ESXi hosts at the same time, check
additional hosts.

4 Either enter the full pathname to the VIB file, or click Browse... to use a modified File Chooser to
navigate to the VIB file.

The following figure shows the use of File Chooser, navigated to the build/vib folder for the countIO
sample:

Figure 3‑14. Workbench: Install Wizard File Chooser

NOTE You can create shortcuts to your build/vib directory by: Navigating to the build directory;
Highlighting the vib directory; Clicking + Add. This figure shows that shortcuts were added for several
vib directories. In the future, just select the vib directory shortcut instead of navigating through the
entire file tree.

After selecting the VIB, click OK in the File Chooser.

The Install Wizard now shows the name of the VIB file to deploy.

5 Click Next >.

Workbench displays the next dialog in the Install Wizard, similar to the following:

Figure 3‑15. Workbench: Install Wizard, 2nd Dialog

6 Click Install.

Workbench displays the following confirmation dialog:

Figure 3‑16. Workbench: Install Wizard, Confirmation dialog

7 Read the information in the dialog, and then click OK to proceed.

The Install Wizard attempts to deploy the VIB to the selected hosts. As it does this, it displays a
progress dialog similar to the following:

Figure 3‑17. Workbench: Install Wizard, Progress dialog

When the operation completes, the Install Wizard displays the results in a dialog similar to the
following:

Figure 3‑18. Workbench: Install Wizard, Results dialog

NOTE The text displayed in this dialog is the output from the ESXi Installer, as discussed in “Deploying
an IO Filter VIB to an ESXi Instance Using the ESXi CLI,” on page 53.

8 If the install failed, fix the problem, then use the < Back button to return to the beginning and try
again.

9 On successful install, click Next >.

The Install wizard displays the final dialog as follows:

Figure 3‑19. Workbench: Install Wizard, Summary Page

10 Click Finish.

The VIB is now deployed to the ESXi host(s) you specified.

Attaching an IO Filter to a VMDK, Using the ESXi Shell

After you deploy your IO Filter to a host for the first time, you must attach the filter to a VMDK in order to
test it. You need to do this only once, even if you later replace the filter on the host (for example, after fixing
a flaw or adding functionality) or change the filter's version, because you attach filters to a VMDK
using the name of the filter, not the name of its VIB file.

Attaching the IO Filter to a VMDK causes the IO Filter Framework to invoke the diskAttach() callback of
said filter, passing any filter properties that you may set when you perform this procedure.

This procedure uses the vmkfstools command, whose syntax has been extended to support IO Filter
operations. The relevant syntax is:

vmkfstools -v # --iofilters filter_name[:property="value"...] vmdkfile

Where:

n # — is the level of debugging verbosity. During development, consider setting this number to 5 or
higher to see the messages generated by calls to VMIOF_Log().

n filter_name — is the name of the IO Filter as specified in the name field of its SCONS file in its
Identification dictionary. This is not to be set to the VIB's file name.

n property — is the name of one of the properties exposed by the IO Filter as listed in its .json file

n value — is a quoted string that is to be assigned to the property

n vmdkfile — is the name of the VMDK's meta-data file, not its flat file or snapshot files

NOTE To specify values for multiple properties, separate each property name/value pair with a colon.
Further, all values are passed as ASCII strings. It is up to the IO Filter LI to convert ASCII strings to binary
numbers if needed.
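
The filter argument described above can be assembled from a filter name and a list of property settings. This is a minimal sketch, assuming a POSIX shell. The function name filter_spec is illustrative. It emits the value without literal double quotes, because the shell removes the quotes in a command such as countio:numWorkGroups="3" before vmkfstools sees the argument; pass the result as one quoted word instead. How vmkfstools treats a value that itself contains a colon is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: build the argument for the --iofilters option from a
# filter name followed by zero or more property=value settings. Pairs are
# joined with colons, as described in the NOTE above.

# usage: filter_spec <filter-name> [property=value]...
filter_spec() {
    spec=$1
    shift
    for pair in "$@"; do
        case $pair in
            ?*=*) spec="$spec:$pair" ;;
            *)
                echo "filter_spec: expected property=value, got '$pair'" >&2
                return 1
                ;;
        esac
    done
    printf '%s\n' "$spec"
}
```

For example, vmkfstools -v 5 --iofilters "$(filter_spec countio numWorkGroups=3)" disk2.vmdk passes countio:numWorkGroups=3 as a single argument.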

Perform the following procedure from the ESXi CLI, as root:

Prerequisites
In order to attach an IO Filter to a VMDK via the ESXi Shell using the procedure described in this topic, you
must have already:

n Deployed the IO Filter VIB to the ESXi host running the VM to which the VMDK belongs

n Added the VMDK to said VM

n SSHed to the ESXi host's management interface as root

NOTE While it is technically possible to both create a VMDK and attach a filter to it with a single command,
attaching a filter to an existing VMDK provides behavior that more closely approximates the behavior of
attaching a filter to a VMDK via the VWC.

The VMware Fundamentals for Developers course discusses how to create VMDKs (vDisks) for a VM.

Procedure
1 Run the vmkfstools command providing the --iofilters option with the name of your IO Filter and
any properties you want to set and also providing the name of the VMDK to which you wish to add the
IO Filter.

The filter should now be attached to the VMDK.

2 Run the vmkfstools command providing the --iofilterslist option and the name of the VMDK file
specified in the preceding step.

The command displays the list of IO Filters attached to the VMDK file.

Example: Attaching countio to disk2.vmdk

The following sequence of commands attaches the countio IO Filter to the VMDK named disk2.vmdk,
setting the numWorkGroups property to 3, then verifies the attachment.

# vmkfstools --iofilters countio:numWorkGroups="3" disk2.vmdk
# vmkfstools --iofilterslist disk2.vmdk
countio

Detaching an IO Filter from a VMDK, Using the ESXi Shell

This topic builds on the information presented in the topic “Attaching an IO Filter to a VMDK, Using the
ESXi Shell,” on page 60. It has analogous context and prerequisites. It has one additional prerequisite: that
you have already attached the IO Filter that you wish to detach.

Detaching an IO Filter from a VMDK causes the IO Filter Framework to invoke the diskDetach() callback of
said filter.

To detach a filter from a VMDK using the ESX CLI, perform the following steps as root:

Procedure
u Run the vmkfstools command as you did to attach the IO Filter, except: A) use "" (empty quotes) for the
filter_name; B) Do not set any property values.

On successful completion, the IO Filter is detached from the VMDK.

Example: Detaching all filters from disk2.vmdk


# vmkfstools --iofilters "" disk2.vmdk
# vmkfstools --iofilterslist disk2.vmdk
#
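
In a test script, the attach and detach checks can be turned into a condition. The following is a minimal sketch, assuming that vmkfstools --iofilterslist prints each attached filter name on its own line with no surrounding whitespace, as in the examples in this chapter; that output format for multiple filters is an assumption. The function name filter_attached is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the named filter appears in the list of
# filters attached to a VMDK.

# usage: filter_attached <filter-name> <vmdk-descriptor-file>
filter_attached() {
    # -x: match the whole line; -F: treat the name as a fixed string;
    # -q: print nothing and report the result through the exit status.
    vmkfstools --iofilterslist "$2" | grep -qxF -- "$1"
}
```

For example, after the detach command above, filter_attached countio disk2.vmdk returns a non-zero status, so a script can confirm that the detach took effect.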

Removing an IO Filter VIB from an ESXi Instance using the ESXi CLI

Every time you make changes to your IO Filter code and successfully build it, you need to deploy the new
VIB to test it. Before you can deploy the new vib, you must remove the previously installed VIB. This topic
provides one procedure (of many) for completing the task of removing the VIB, using the ESXi CLI.

To remove an IO Filter VIB, the host must be in maintenance mode. In maintenance mode, ESXi
hosts do not run VMs. Thus, to enter maintenance mode, all VMs running on the host must be shut down,
suspended, or migrated. Certain methods of putting the host in maintenance mode migrate the VMs to another
host in the cluster for you. Other methods require you to suspend or power off the VMs manually. This
topic assumes that you have powered off your test VM or migrated it to another host so that you can put
this host into maintenance mode.

To remove an IO Filter VIB from an ESXi host using the ESXi CLI, follow this procedure, running all
commands as root:

Prerequisites
To remove an IO Filter VIB via the ESXi CLI you must have:

n Enabled SSH for your ESXi host from its Direct Console

n SSHed into the ESXi host as root

n Deployed the VIB to the ESXi host (using any of the available methods)

n Migrated all running VMs to another host in the DRS cluster, or powered them off on this host, so
that the host can enter maintenance mode

Procedure
1 Run the command: esxcli system maintenanceMode set -e true

The system tries to enter maintenance mode. If any VMs are running that it cannot stop, it fails to enter
maintenance mode and displays the following error message:

Failed to perform requested operation.

The system does not generate any output if it successfully enters maintenance mode. You can always
obtain the current mode using the following command: esxcli system maintenanceMode get.

2 Run the command: esxcli software vib remove -n filter_name

For example: esxcli software vib remove -n countio

The ESXi Installer stops your IO Filter's daemon, if it has one. Then, it removes the files that the VIB
deployed to the filesystem (for example, /usr/lib/vmware/plugin/libvmiof-disk-filter_name.so).
Finally, it removes the VIB from the VIB database.

The ESXi Installer displays the results of the command, similar to the following:

Removal Result
Message: Operation finished successfully.
Reboot Required: false
VIBs Installed:
VIBs Removed: ZZZ_bootbank_countio_1.0.0-1OEM.600.0.0.2329218
VIBs Skipped:

3 Take the system out of maintenance mode by running the command: esxcli system maintenanceMode
set -e false

The system exits system maintenance mode, starting any VMs configured to auto-start. All other VMs
on the host remain paused or powered off.

Your IO Filter VIB is no longer on the ESXi host. You are now able to install a new IO Filter VIB.
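If you script this procedure, you may want to confirm the removal from the command output rather than by eye. The following Python sketch parses text shaped like the sample Removal Result shown in Step 2. It assumes the output keeps that "Key: value" layout, and it assumes that multiple VIB IDs, if present, are separated by commas or spaces; both are assumptions, and the function names are hypothetical.

```python
def parse_removal_result(text):
    """Split 'Key: value' lines of the removal output into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:  # skip lines without a colon, such as the 'Removal Result' title
            fields[key.strip()] = value.strip()
    return fields


def vib_was_removed(text, vib_id):
    """Return True if vib_id appears in the 'VIBs Removed' field."""
    removed = parse_removal_result(text).get("VIBs Removed", "")
    return vib_id in removed.replace(",", " ").split()
```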

Removing an IO Filter VIB from an ESXi Instance using Workbench

This topic builds on the content presented in “Removing an IO Filter VIB from an ESXi Instance using the
ESXi CLI,” on page 62. It has, essentially, the same prerequisites and purpose. However, this topic discusses
how to complete the same task using the Remote System Explorer component of VMware Workbench
instead of the ESXi CLI.

One difference is that you need to have configured your ESXi host in the Remote System Explorer (RSE)
before you can follow this procedure.

The workflow for removing a VIB via RSE in Workbench is similar to the workflow for deploying a VIB via
RSE in Workbench as described in “Deploying an IO Filter VIB to an ESXi Instance Using Workbench,” on
page 55. The key differences are:

n The procedure uses the VMware Uninstall Wizard instead of Install Wizard

n You invoke the Wizard differently

To remove an IO Filter VIB from an ESXi host using RSE in Workbench, follow these steps from within the
Workbench GUI:

Procedure
1 If you have not already done so, start VMware Workbench.

2 Select the Remote System Explorer Perspective.

One way to do this is to select Window > Choose Perspective > Remote System Explorer.

In one of its panes, Workbench displays the RSE perspective similar to the following:

Figure 3‑20. Workbench: Remote System perspective, with ESXi hosts added

3 Right-click the ESXi host from which you would like to remove the VIB and then select VMware >
Package Manager.

Workbench displays the Package Manager dialog similar to the following:

Figure 3‑21. Workbench: VMware Package Manager dialog

4 Scroll through the list of installed VIBs and select the VIB for your IO Filter Solution.

When you select your VIB, the Package Manager enables the Remove button.

5 With your VIB selected, click Remove.

The Package Manager launches the Uninstall Wizard, which displays its first dialog similar to the
following:

Figure 3‑22. Workbench: VMware Uninstall Wizard

6 Check Maintenance Mode, and then click Next >.

Checking Maintenance Mode causes the Uninstall Wizard to (attempt to) place the host in maintenance
mode before it tries to uninstall the VIB, and then take the host out of maintenance mode after the
operation. Again, you can only remove IO Filter VIBs when a host is in maintenance mode.

The Uninstall Wizard displays the following dialog:

Figure 3‑23. Workbench: VMware Uninstall Wizard, second dialog

7 Click Uninstall.

The Uninstall Wizard starts the uninstall process, displaying a dialog similar to the following as it
progresses:

Figure 3‑24. Workbench: VMware Uninstall Wizard, Progress dialog

After the operation completes, the Uninstall Wizard displays another dialog with the results, similar to
the following:

Figure 3‑25. Workbench: VMware Uninstall Wizard, Completion dialog

NOTE The text displayed in this dialog is the output from the ESXi Installer as discussed in “Removing
an IO Filter VIB from an ESXi Instance using the ESXi CLI,” on page 62.

8 If the uninstall failed, fix the problem, then use the < Back button to return to the beginning and try
again.

9 Click Finish.

Workbench closes the Uninstall Wizard dialog, leaving the Package Manager dialog.

10 Click Close.

Workbench closes the Package Manager dialog.

Your IO Filter VIB is no longer on the ESXi host. You are now able to install a new IO Filter VIB.

Testing with a Cluster


The supported customer workflow for deploying and using IO Filters is for the customer's administrators to:

n Deploy / upgrade / remove IO Filters to a cluster managed by a vCenter Server using the VIM API (and
a supported SDK for that API)

n Create an SPBM policy that uses the IO Filter with its properties set to certain values. If they want the
filter to use different property values for different VMDKs, they must create a separate policy for each
set of values.

n Apply the desired SPBM policy to various VMDKs of VMs. Applying the policy to a VMDK causes the
IO Filter Framework to attach the IO Filter in the SPBM policy to the VMDK. Removing the policy
causes the IO Filter Framework to detach the IO Filter from the VMDK.

Administrators use the VWC to perform all of these actions, except deploying, upgrading, and removing IO
Filters. Again, these operations currently require invoking functions in the VIM API.

The following sub-topics provide detailed steps for completing each of these tasks.

Deploying an IO Filter Solution to a Cluster using the MOB

Building an IO Filter also builds its offline bundle file. Once the build succeeds, you must stage the
bundle on a web server. Do this in a manner analogous to the procedure given in “Staging your VIB for
Deployment from the ESX CLI,” on page 53, except:

n Stage the bundle file, not the VIB file

n You must stage the bundle to a web server (http or https). You cannot stage it to an FTP or other server.

VMware has enhanced the VIM API with several functions for managing IO Filters on a cluster. Common
parameters to these functions include:

n clusterMOID: the Managed Object ID (MOID) of the cluster to which you wish to deploy the IO
Filter.

n filterID: a string that concatenates these items, each separated by an underscore: the 3-letter
vendor_code in the filter's SCONS file; the string bootbank; the filter's name from its SCONS file;
and the filter's version from its SCONS file, followed by a dash and a string assigned by the build
system, for example "1OEM.600.0.0.2329218" (the digits after the last period are the build number of
the ESXi release that the SDK targets). For example, ZZZ_bootbank_countio_1.0.0-1OEM.600.0.0.2329218.
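The composition rule for filterID can be expressed as a short Python sketch. The function and parameter names are hypothetical; the expected value in the usage note matches the VIB name reported by the esxcli removal output earlier in this chapter.

```python
def make_filter_id(vendor_code, filter_name, version, build_suffix):
    """Compose a filterID from the values in a filter's SCONS file.

    build_suffix is the string assigned by the build system,
    for example "1OEM.600.0.0.2329218".
    """
    if len(vendor_code) != 3:
        raise ValueError("vendor_code must be exactly 3 letters")
    # Items are joined by underscores; the version and the build suffix
    # are joined by a dash.
    return f"{vendor_code}_bootbank_{filter_name}_{version}-{build_suffix}"
```

For instance, make_filter_id("ZZZ", "countio", "1.0.0", "1OEM.600.0.0.2329218") yields ZZZ_bootbank_countio_1.0.0-1OEM.600.0.0.2329218.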

The functions added to the API include:

n InstallIoFilter_Task(string bundleURL, string clusterMOID): bundleURL is the URL of the location
where you staged the bundle file.

This function creates a new task on the vCenter Server against which you run this function. The task, in
turn, attempts the deployment, which includes deploying the IO Filter's VIB (enclosed in the bundle
file) to each host in the cluster, and making the IO Filter available for SPBM policies.

NOTE The bundleURL can use HTTPS; however, vSphere (VC/EAM) currently does not validate the
certificate and provides no way for partners to supply credentials for the bundle server.

n QueryDisksUsingFilter(string filterID, string clusterMOID): This function returns an array of
VirtualDiskId structures, one for each VMDK to which the IO Filter is currently attached, including
each VMDK's MOID.

n QueryIoFilterInfo(string clusterMOID): This function returns an array of ClusterIOFilterInfo
structures, one element for each IO Filter deployed to the cluster. A key member of the
ClusterIOFilterInfo structure is the filter's filterID.

n QueryIoFilterIssues(string filterID, string clusterMOID): This function returns an array of
IoFilterHostIssue structures, one for each host that has an outstanding issue with an IO Filter. For
example, a host has an outstanding issue if it could not remove an IO Filter's VIB because the filter
was in use when someone or something invoked UninstallIoFilter_Task() (discussed later in this
chapter).

n ResolveInstallationIssuesOnCluster_Task(string filterID, string clusterMOID): This function makes
one attempt to resolve any issues that prevented the installation of the specified filter to the
specified cluster.

n ResolveInstallationIssuesOnHost_Task(string filterID, string hostMOID): This function makes one
attempt to resolve any issues that prevented the installation of the specified filter to the specified
host.

n UpgradeIoFilter_Task(string filterID, string clusterMOID, string vibURL): This function currently
just invokes the UninstallIoFilter_Task() and then InstallIoFilter_Task() methods.

VMware provides several officially supported SDKs with which you can develop programs using the VIM
API, including SDKs for Perl and Java, and the Power CLI. It also provides a community-supported SDK for
the VIM API that uses Python (pyvmomi). Currently, you can access the IO Filter functions of the VIM API
only via the officially supported SDKs and the Managed Object Browser (MOB) of the vCenter Server
containing the cluster to which you wish to deploy your IO Filter Solution.

Since the MOB is language-independent, and accessible via any web browser, this topic demonstrates
deploying the IO Filter Solution via this interface. To deploy your IO Filter Solution to your cluster via the
MOB, follow these steps:

Prerequisites
To deploy an IO Filter Solution to a cluster, you must have:

n Already built the IO Filter, specifically its bundle (located in the file build/bundle/filter_name-
offline-bundle.zip under the Filter's source directory)

n Staged the bundle file on a web server

n Credentials for an administrator account on the vCenter Server that contains the cluster

n Removed the IO Filter from the cluster, if you had deployed it previously

Procedure
1 Browse to https://IP/mob.

Where IP is the IP address (or host name) of the vCenter Server containing the cluster to which you
wish to deploy your IO Filter Solution.

The MOB requires authentication, so your browser displays a standard browser authentication dialog.

2 In the authentication dialog, enter credentials of a vCenter Server user with administrator privileges.

By default, vCenter Servers install with the user administrator@vsphere.local. You set the password for
this user during deployment of the vCenter Server.

On successful authentication, the browser displays a page similar to the following:

Figure 3‑26. MOB home page

3 Click content (in the Value column in the Properties section)

The browser displays a page with the MOB's content similar to the following:

Figure 3‑27. MOB: Content page, with IOFilterManager outlined for emphasis

4 Click IOFilterManager (outlined for emphasis).

The browser displays a page with the IO Filter Manager VIM functions, similar to the following:

Figure 3‑28. MOB: IOFilterManager page, with InstallIoFilter_Task outlined for emphasis

5 Click InstallIoFilter_Task (outlined for emphasis).

The browser displays a page with fields for each of the parameters for the function, similar to the
following:

Figure 3‑29. MOB: IOFilterManager, InstallIoFilter_Task page

6 Enter the URL of the staged IO Filter bundle file in the text field labeled vibURL (outlined with a
dashed line in the preceding figure for emphasis). Despite its label, this field expects the URL of a
bundle, not of a VIB.

7 Replace the text MOID with the managed object ID of the cluster to which you wish to deploy your IO
Filter Solution.

The following figure shows the form filled out with the URL for the countIO bundle, staged at
https://172.16.193.11, and with the MOID of the target cluster, domain-c7:

Figure 3‑30. MOB: IO Filter Manager, InstallIOFilter_Task Example

8 Click Invoke Method.

The MOB creates a vCenter Server task that invokes the InstallIoFilter_Task() function, passing the
arguments you placed in this web form. The browser displays the task ID in another form at the bottom
of the web form. The following figure shows an example result page:

Figure 3‑31. MOB: IO Filter Manager, InstallIOFilter_Task Method Results

9 Click the task ID (outlined for emphasis).

The MOB queries the vCenter Server for properties of the task, which the browser displays, similar to
the following:

Figure 3‑32. MOB: IO Filter Manager, InstallIOFilter_Task Task Properties page

10 Click info (outlined for emphasis).

The MOB queries the vCenter Server for information about the task, which the browser displays,
similar to the following:

Figure 3‑33. MOB: IO Filter Manager, InstallIOFilter_Task Task Info page

This figure shows that the task state is "running" (outlined for emphasis). This means that the vCenter
Server is deploying the IO Filter VIBs in the bundle to the ESXi hosts, creating the appropriate SPBM
entities, and so on.

11 Continue to refresh the task info page until the state is either "success" or "error".

On success, you are ready to create SPBM policies, attach the policies to VMDKs, etc.

On error, you need to explore the log files (especially /var/log/hostd.log on each of the ESXi hosts) to
determine why the operation failed.
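Step 11 amounts to a polling loop. The following Python sketch shows that loop under stated assumptions: get_state is a hypothetical caller-supplied function that returns the task's current state string (for example, by wrapping a call in a supported SDK), and the interval and timeout values are arbitrary. The terminal states "success" and "error" are the ones named in Step 11.

```python
import time


def wait_for_task(get_state, poll_seconds=5.0, timeout_seconds=600.0,
                  sleep=time.sleep):
    """Poll get_state() until the task reports "success" or "error".

    Returns the final state, or raises TimeoutError if the task is still
    in another state (such as "running") after timeout_seconds of polling.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state in ("success", "error"):
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"task still in state {state!r}")
        sleep(poll_seconds)
        waited += poll_seconds
```

The sleep parameter exists only so that a test can substitute a no-op; on "error", the caller would then inspect the log files as described above.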

What to do next
After successfully deploying an IO Filter Solution to a cluster, you need to create an SPBM policy that uses
the filter.

Creating a Storage Policy-Based Management (SPBM) Policy for an IO Filter

In a production environment, to get an IO Filter attached to a disk, administrators must first create
an SPBM policy that includes a rule for that IO Filter. They do this with the VWC. To accomplish this
task, follow these steps using the VWC:

Prerequisites
You must have deployed an IO Filter's bundle to a cluster before you can create an SPBM policy that uses it.
You must also log into the VWC using credentials of a user with administrator privileges, such as
administrator@vsphere.local.

Procedure
1 Browse to the Home page, home tab. The browser displays a page similar to the following:

Figure 3‑34. VWC: Home Page, Home Tab, VM Storage Policies icon highlighted

2 Click VM Storage Policies (highlighted for emphasis)

The VWC displays a page with a list of VM Storage Policies (SPBM policies). By default the page looks
like the following:

Figure 3‑35. VWC: VM Storage Policies (SPBM Policies) page

3 Click the icon to create a new SPBM policy.

The VWC displays the Create New VM Storage Policy dialog similar to the following:

Figure 3‑36. VWC: Create New VM Storage Policy dialog, Name and Description page

4 Enter a name and description for the policy in the fields provided, and then click Next.

Each policy must have a unique name, which should provide a brief but accurate indication of what the
policy is for. Use the Description field to provide detailed information about the policy. For IO Filter
policies, you should describe the details of the policy, at a high level. For example, an administrator
might set the name to "countioAcceleration5" and have a description of "CountIO policy with
Acceleration property set to 5."

The VWC displays the Rule-Sets page of the dialog as follows:

Figure 3‑37. VWC: Create New VM Storage Policy dialog, Rule-Sets page

5 Read the information in the Rule-Sets page, and then click Next.

NOTE There are no actionable items on this page.

The VWC displays the Common rules page of the dialog as follows:

Figure 3‑38. VWC: Create New VM Storage Policy dialog, Common rules page

6 Check Use common rules in the VM storage policy.

You must enable common rules because IO Filters are part of the SPBM common rule set.

The VWC enables the <Add rule> pull-down.

7 Click the <Add rule> pull-down.

The VWC displays the list of IO Filter classes currently installed. At present, a class is either
cache or replication, as specified in the CLASS field of a filter's .json file.

8 Select the class of filter for which you want to make a rule in the policy, using the pull-down.

The VWC expands the Rule-set page with another pull-down for you to select the IO Filter of the
selected type. The following figure shows the results of selecting a replication filter class and the pull-
down for selecting the filter (outlined for emphasis):

Figure 3‑39. VWC: Create New VM Storage Policy dialog, Common rules page, adding a rule

9 Use the [Select Value] pull-down to select the IO Filter to add to the rule.

The VWC displays a list of properties (capabilities) exposed by the IO Filter's .json file, along with their
default values and pull-downs / text fields for modifying their values. The following figure shows the
properties for a variant of sampfilt called MyFilter1:

Figure 3‑40. VWC: Create New VM Storage Policy dialog, Common rules page, adding a rule, setting
property values

10 Modify the filter's property values as desired.

NOTE All VMDKs to which you attach this policy will receive the same values for these properties. If
you want different VMDKs to have different values for these properties, you must create separate
policies with the different values.

11 If desired, add additional rules to the policy using the <Add rule> pull-down, repeating Steps 9 and 10
for the additional rule(s).

NOTE You can currently only add two rules to a common rule policy, because there are only two
classes of filters supported and a given VMDK can only have one filter of each class attached at a given
time.

12 Click Next.

The VWC displays the Rule-Set 1 page of the dialog, as follows:

Figure 3‑41. VWC: Create New VM Storage Policy dialog, Rule-Set 1 page

You can use the items on this page to configure datastore rules for your policy. By default, no datastore
rules are present in a vSphere system. Further, they are not needed for testing IO Filters.

13 Un-check Use rule-sets in the storage policy.

The VWC disables the remaining items in this page.

14 Click Next.

The VWC checks the cluster's storage against the datastore rule set and displays the Storage
compatibility page of the dialog similar to the following:

Figure 3‑42. VWC: Create New VM Storage Policy dialog, Storage compatibility page

NOTE Since you did not create any datastore rules, this dialog should list all of your datastores. In any
event, there are no actionable items in this dialog.

15 Click Next to see a summary of the policy configured thus far, or Finish to just complete the policy
creation.

If you click Finish, continue to the discussion of Step 17. If you click Next, the VWC displays the Ready
to complete page similar to the following:

Figure 3‑43. VWC: Create New VM Storage Policy dialog, Ready to complete page

16 Review the content in this page. If anything is wrong, use Back to correct it and then navigate back to
this page.

17 Click Finish when everything is correct.

The VWC causes the vCenter Server to create the new policy as configured. On successful completion,
the VWC displays the VM Storage Policies page as shown earlier in this topic, except that the list of
policies now contains the one you just created.

The cluster now has a VM Storage Policy (SPBM policy) which you can apply to VMDKs in the cluster.
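Because each policy fixes one set of property values, it can help to plan the policies you need before you start clicking through the dialog. The following Python sketch is a planning aid only: it does not call any SPBM interface, the function name is hypothetical, and the naming scheme simply mirrors the "countioAcceleration5" example from Step 4.

```python
def policies_for_value_sets(filter_name, value_sets):
    """Return one (name, description) pair per distinct set of property values."""
    policies = []
    seen = set()
    for values in value_sets:
        key = tuple(sorted(values.items()))
        if key in seen:
            continue  # VMDKs that need identical values can share one policy
        seen.add(key)
        name = filter_name + "".join(f"{prop}{val}" for prop, val in key)
        details = ", ".join(f"{prop} property set to {val}" for prop, val in key)
        policies.append((name, f"{filter_name} policy with {details}."))
    return policies
```

Passing the value sets for three VMDKs, two of which want Acceleration set to 5 and one of which wants 10, produces two policies rather than three.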

What to do next
Apply the policy you created to one or more VMDKs.

NOTE In ESXi 6.0 U1, after upgrading your filter to a newer version, you need to delete the SPBM policy and
recreate it. This is no longer required in ESXi 6.0 U2.

Attaching an IO Filter to, and Detaching it from, a VMDK Using the VWC and SPBM

In production, you attach IO Filters to VMDKs via Storage Policy-Based Management (SPBM) policies. After
creating an SPBM policy that references the IO Filter as a common rule, you then apply that policy to
VMDK(s). This causes the vCenter Server to send requests to the hostd on the ESXi system currently hosting
the VM to attach the filter to the said VMDK(s).

You configure (add/remove/change) storage policies for a VMDK using the same VWC pages you use to
configure other aspects of a VMDK. For example:

n For an existing VM, you assign a policy to any of a VM's VMDKs in the Virtual Hardware tab of the
Edit Settings dialog (reached by right-clicking on a VM and then selecting Edit Settings, and then
selecting the Virtual Hardware tab.)

n While creating a VM, you assign a policy to any of the VM's VMDKs in the Customize hardware page
of the New Virtual Machine dialog (reached via Actions > New Virtual Machine > New Virtual
Machine... with a VM or cluster selected in the Object Navigator).

In either case the VWC displays a list of VM hardware similar to the following:

Figure 3‑44. VWC: VM Hardware Modification page

Clicking the arrow associated with a virtual disk causes the VWC to display the attributes of the VMDK,
similar to the following:

Figure 3‑45. VWC: VM Hardware Modification page, Hard disk 1 expanded

The VM storage policy pull-down allows you to select which storage policy, of those defined on the cluster,
to apply to this VMDK. You can apply different storage policies to each VMDK, as long as the policies do not
conflict.

In summary, follow these steps to add / remove / change the IO Filters attached to a VMDK, using the VWC
and SPBM policies:

Prerequisites
In order to attach an IO Filter to a VMDK using the VWC, you must:

n Have created an SPBM policy that contains the IO Filter

n Be authenticated to the VWC as a user with administrator privileges

Procedure
1 Define a storage policy that references the desired IO Filter(s).

2 Select a VM that contains the VMDK to which you wish to attach the IO Filter(s) in said storage policy,
or create the VM.

3 In the Virtual Hardware tab of the Edit Settings dialog, or Customize hardware page of the New
Virtual Machine dialog, respectively, display the list of properties for the VMDK to which you wish to
attach the IO Filter(s)

4 Select the SPBM Policy you created in Step 1 using the pull-down list associated with the VM storage
policy label.

5 Click OK or Finish, respectively, within the dialog.

This causes the VWC to tell the vCenter Server to direct the hostd running on the ESXi system hosting
the VM to have the IO Filter Framework on said host invoke the diskAttach() callback for the IO
Filter(s) listed in the policy.

You now have a VMDK with the desired IO Filters attached.

Removing an IO Filter Solution from a Cluster Using the Managed Object Browser
(MOB)

This topic builds on the concepts presented in two other topics: “Removing an IO Filter VIB from an ESXi
Instance using the ESXi CLI,” on page 62 and “Deploying an IO Filter Solution to a Cluster using the MOB,”
on page 68. The reason for removing an IO Filter Solution is the same as that for removing an IO Filter VIB
from an ESXi instance. The prerequisites are analogous: you must have installed the bundle, have access
to the MOB with administrator credentials, and know the MOID of the cluster from which you
wish to remove the IO Filter Solution.

However, there are some additional considerations for removing an IO Filter Solution from a cluster:

n You must know the ID (not the name) of the IO Filter you wish to remove. You can retrieve this
information using the QueryIoFilterInfo() VIM API method via the MOB, the Power CLI or a
supported SDK.

n You cannot remove an IO Filter that is used in a rule in an SPBM policy that is currently applied to a
VMDK. You must first edit the policy so that none of its rules contains the IO Filter, or delete the
policy entirely. However, you cannot delete a policy that is currently applied to any VMDKs.

n If you remove an IO Filter that is used in a rule in an SPBM policy that is currently not applied to any
VMDKs, the policy becomes invalid until you either edit the associated rule, or re-deploy the IO Filter
Solution.

n The URL you specified when installing the filter must still be accessible. The vCenter Server does not
store your filter bundle, so it must access that URL to get the bundle information before uninstalling.
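The policy-related considerations above can be summarized as a pre-removal check. The following Python sketch models them with a hypothetical Policy record; it is not an SPBM or VIM API, and you would need to populate the records yourself from whatever inventory information you have.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical record of an SPBM policy, for illustration only."""
    name: str
    filter_ids: list
    applied_vmdks: list = field(default_factory=list)


def removal_blockers(filter_id, policies):
    """Names of policies that use the filter and are applied to a VMDK.

    While any such policy exists, the filter cannot be removed.
    """
    return [p.name for p in policies
            if filter_id in p.filter_ids and p.applied_vmdks]


def policies_invalidated(filter_id, policies):
    """Names of unapplied policies that become invalid if the filter is removed."""
    return [p.name for p in policies
            if filter_id in p.filter_ids and not p.applied_vmdks]
```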

As a reminder, the MOB is one interface for invoking the VIM API IOFilterManager methods. You use the
UninstallIoFilter_Task() VIM API method to remove an IO Filter Solution from a cluster. To invoke this
method via the MOB, follow these steps:

Procedure
1 Use the MOB to access the IOFilterManager by following Steps 1-4 of the procedure in “Deploying an
IO Filter Solution to a Cluster using the MOB,” on page 68.

2 Click UninstallIoFilter_Task.

The browser displays a web form similar to the following:

Figure 3‑46. MOB: UninstallIoFilter_Task web form

3 Enter the ID of the IO Filter in the ID field, and the MOID of the cluster into which the IO Filter has
been deployed in the MOID field, and then click Invoke Method.

The MOB invokes the method on the vCenter Server, which creates a task to remove the filter from each
ESXi host in the cluster, and displays a link to the task in the same web form page.

4 View the task information as discussed in Steps 9-11 of the procedure in “Deploying an IO Filter
Solution to a Cluster using the MOB,” on page 68.

The IO Filter Solution is removed from the cluster, and each ESXi host therein.

Chapter Summary
The topics in this chapter presented details such that you should now be able to:

n List the components of a minimal environment for developing and testing IO Filter Solutions

n Deploy the VAIODK to a supported development platform, including:

n Locate and download the VAIODK

n Install the VAIODK

n List the key folders, files, and tools created by installing the VAIODK, including the sample filters

n Remove the VAIODK

n Build a sample filter and list the results of a successful build

n Deploy an IO Filter to ESXi systems in a cluster

n Test an IO Filter, including:

n Create a new VMDK for a VM

n Add an IO Filter to a VMDK, using the ESXi shell and VWC

n Generate IOs to the VMDK

n View filter log messages in the appropriate log files

Review Question Answers - Deploying and Testing an IO Filter Development Environment
1 VMware recommends that you have at least which components in your Basic Development
Deployment? (choose all that apply)

a One ESXi instance

b Two ESXi instances

c A vCenter Server

d Two vCenter Server instances

e VMware Workbench VM

2 Why does the VAIODK tool-chain compile each Library component in both 32 and 64-bit mode?

a To ensure backward compatibility with previous versions of ESXi

b So that the LI can be linked with the 32 and 64-bit versions of the Daemon

c So that the LI can be linked with 32 and 64-bit user worlds

d To allow LIs to communicate properly across IP connections

3 Which command do you use to compile an IO Filter solution?

a make

b scons

c config; make

d jam

4 In normal operations, to which vSphere component do administrators deploy their IO Filter Solutions?

a ESXi instances

b Clusters

c Datacenters

d vCenter Server

Creating a Basic IO Filter Solution 4
Chapter Objectives and Topics
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:

n Specify where to create your build folder on your build system, and why

n Create a Makefile to build your Solution

n Create appropriate entries in your Solution's SCONS and JSON files, based on the plans for your
Solution

n Create minimal source files for the library and daemon components of your Solution

n Define the prototype for each of the callbacks in a library and daemon component, and state when the IO
Filter Framework invokes each

This chapter includes the following topics:

n “Understanding Strategies for Starting IO Filter Development,” on page 88

n “Creating a Build Folder,” on page 88

n “Adding a Makefile to Your Build Folder,” on page 88

n “How to Build a CIM Provider,” on page 89

n “Creating and Populating a Correct Scons File,” on page 89

n “Creating and Populating a Correct .json File,” on page 98

n “Creating a Skeletal Filter Library Component Source File and Understanding its Entry-points /
Callbacks,” on page 101

n “Manipulating per-VMDK Filter Properties using vmkfstools,” on page 112

n “Creating a Skeletal Daemon Component,” on page 113

n “Creating a Skeletal CIM Provider,” on page 116

n “Understanding How to Localize your Solution,” on page 116

n “Debugging IO Filter Core Dumps,” on page 118

n “Live Debugging,” on page 120

n “Chapter Summary - Creating a Basic IO Filter Solution,” on page 127

n “Review Questions,” on page 127

Understanding Strategies for Starting IO Filter Development


There are two main strategies you can use to create an IO Filter Solution:

n Start with a sample filter that is closest to your product, then morph that code as you need

n Start from scratch

Regardless of the strategy you choose, at a minimum, you need a filter with a library component and a CIM
provider component. Most IO Filter Solutions also include a daemon component. The code in each of these
components contains a set of callback functions that the IO Filter framework or the CIM Provider framework
invokes.

This chapter provides the details you need to create each of these components.

Creating a Build Folder


Create your project directory, with all your source files, under /opt/vmware/vaiodk-6*/src. For
example, /opt/vmware/vaiodk-6.0.0-2198567/src/myfilter is the complete path of the myfilter project
directory. The directory must be located here because:

n This directory is the SCONS root directory

n The Class Executable rule (a VMware extension to SCONS) requires that its sources be under the
SCONS root directory

You must have the VAIODK installed before performing this task. Enter the following commands at the
shell prompt on your development platform:

Procedure
1 cd /opt/vmware/vaiodk-6*/src

2 mkdir myfilter

3 sudo chmod ugo+wxt myfilter

4 cd myfilter
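
The location requirement can be checked before you start a build. The following Python sketch is for illustration only and is not part of the VAIODK. The function name is_under_scons_root is hypothetical, and the comparison is purely lexical: it does not resolve symbolic links or expand the vaiodk-6* wildcard.

```python
from pathlib import PurePosixPath


def is_under_scons_root(project_dir, scons_root):
    """Return True if project_dir is located below scons_root.

    Both arguments are absolute POSIX paths given as strings. The check is
    lexical only; it does not touch the file system.
    """
    project = PurePosixPath(project_dir)
    root = PurePosixPath(scons_root)
    try:
        relative = project.relative_to(root)
    except ValueError:
        # project_dir does not start with scons_root.
        return False
    # The root itself is not a project directory, and ".." could escape it.
    return len(relative.parts) > 0 and ".." not in relative.parts


if __name__ == "__main__":
    root = "/opt/vmware/vaiodk-6.0.0-2198567/src"
    print(is_under_scons_root(root + "/myfilter", root))
    print(is_under_scons_root("/home/dev/myfilter", root))
```

The first call uses the example path from this section; the second uses a directory outside the SCONS root, which is the situation the Class Executable rule does not accept.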

Adding a Makefile to Your Build Folder


As discussed in “Understanding a Filter's Directory Contents and Building a VAIODK Sample Filter,” on
page 47:

n The VAIO SDK uses scons to build filters

n To make it easier for people unfamiliar with scons but familiar with make, the samples provided in the
SDK also include a Makefile with rules that invoke scons to build the filter, clean the build
environment, etc.

The Makefile is the same for all the sample filters.

NOTE As an exercise, use diff to compare the Makefiles in two sample filters' directories.

Once you create your filter's source directory (as discussed in “Creating a Build Folder,” on page 88), create
a Makefile in that directory by copying the generic Makefile from any of the sample filter directories into
your filter's source directory.

NOTE Remember, the source directories for all sample filters are under /opt/vmware/vaiodk-
*/src/partners/samples/iofilter/*. For example the
directory /opt/vmware/vaiodk-6.0.0-2198567/src/partners/samples/iofilter/countIO contains the
countIO sample filter.

After creating a Makefile, you need to create or edit a SCONS file, as discussed in detail in “Creating
and Populating a Correct Scons File,” on page 89.

How to Build a CIM Provider


After installing the VAIODK, you create the build folder as described in “Creating a Build Folder,”
on page 88. Partners should do their development work as non-root users. However, when you try to build
your filter as a non-root user, you encounter the following error:

mkdir -p /opt/vmware/cimpdk-6.0.0-2799832/oss/sfcb/src
mkdir: cannot create directory `/opt/vmware/cimpdk-6.0.0-2799832/oss/sfcb/src': Permission denied
make: *** [generic-prep] Error 1

To resolve this error, change the permissions on the /opt/vmware/cimpdk-6.0.0-2799832/oss/sfcb directory.

Enter the following commands at the shell prompt on your development platform:

Procedure

1 cd /opt/vmware/cimpdk*/oss

2 sudo chmod 1777 sfcb

3 cd /opt/vmware/cimpdk*/oss

You can now successfully compile your filter.

Creating and Populating a Correct Scons File


The VAIODK uses scons, a software construction tool similar to make, to build IO Filter solution
components, VIBs and bundles. The Makefile provided in the VAIODK, which invokes SCONS, assumes the
SCONS rules for building an IO Filter solution are in a file named filter-name.sc, for example sampfilt.sc.
Whatever the file's name, it is simply referred to as the solution's SCONS file.

The SCONS file has precise syntax requirements. Before getting to IO Filter-specific rules, some general
syntax notes are:

n SCONS is built on top of Python, so all of Python's syntax rules apply, including: names are case
sensitive; comments begin with a pound sign (#); you can use either single or double quotes for strings,
as long as you begin and end the string with the same quote; and the escape character is the backslash (\).

n Some of the SCONS rules required to build IO Filter solutions use lists. In Python syntax, lists are
enclosed in square brackets ([]) with items in the list separated by commas.

n SCONS files for VAIO require you to define several dictionaries (a Python data structure) with specific
names, with specific keys within those dictionaries set to values you provide. While it is a gross
injustice to equate Python dictionaries with C structures, the way the VAIO SCONS files use
dictionaries is analogous to defining a C structure with members initialized to values you provide.
While there are several methods for creating dictionaries in Python, the syntax used in VAIO SCONS
files is:

somename = {
'key1' : 'yourvalue1',
'key2' : 'yourvalue2',
}

This snippet defines a dictionary called somename that contains two keys: key1 and key2, with values
yourvalue1 and yourvalue2 respectively.
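
The syntax notes above can be tried in any Python interpreter. The following fragment is a sketch for illustration only; the names source_files and nested and the file names are placeholders, not names that VAIO requires.

```python
# Comments begin with a pound sign.
# Strings may use single or double quotes, as long as both ends match.
single = 'countio'
double = "countio"

# A list: items in square brackets, separated by commas.
source_files = ['myfilter.c', 'helpers.c']

# A dictionary written in the style used by VAIO SCONS files.
somename = {
    'key1' : 'yourvalue1',
    'key2' : 'yourvalue2',
}

# A dictionary value can itself be a list or another dictionary.
nested = {
    'files' : source_files,
    'inner' : somename,
}

if __name__ == "__main__":
    print(single == double)            # both quote styles give the same string
    print(somename['key1'])            # look up a value by its key
    print(nested['inner']['key2'])     # look up a value in a nested dictionary
    print('Key1' in somename)          # keys are case sensitive
```

Referring to one dictionary from inside another, as nested does here, is the same pattern the SCONS file uses when a definition dictionary names the identification dictionary as one of its values.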

VAIO has some specific content requirements for SCONS files to build an IO Filter solution. This topic
discusses the dictionaries you must define and methods you must invoke to successfully define a SCONS
file for building an IO Filter solution.

To understand these requirements, consider the following SCONS file from the countIO sample solution,
followed by a discussion of its content:

1 # Copyright 2014 VMware, Inc.
2 # All rights reserved. -- VMware Confidential
3
4 """build module for the countIO sample filter
5
6 IOFilter definitions for the countIO sample filter.
7
8 When developing an IOFilter for release through the async program:
9 * Adjust the copyright message above as appropriate for
10 your company
11 * set 'vendor' to the name of your company
12 * set 'vendor email' to the contact e-mail provided by your company
13 * increment the version number if the source has come from VMware
14 * remove 'version_bump' if present
15
16 When bringing an async IOFilter inbox at VMware:
17 * leave 'version' as is from the async release
18 * set 'version_bump' to 1
19 * set 'vendor' to 'Vmware, Inc.'
20 * set 'vendorEmail' to the VMware contact e-mail address
21
22 If updating the IOFilter at VMware:
23 * increment 'version bump' or contact the IHV for a new version number
24
25 If updating the IOFilter at an async vendor:
26 * increment the version number (do not use version_bump)
27 """
28
29 #
30 # identification section
31 #
32 countIOIdentification = {
33 'name' : 'countio',
34 'module type' : 'VMIOF',
35 'binary compat' : 'yes',
36 'summary' : 'CountIO Sample Disk IOFilter',
37 'description' : 'CountIO Sample IOFilter for VMware ESX',
38 'version' : '1.0.0',
39 'license' : VMK_MODULE_LICENSE_VMWARE,
40 'vendor' : 'Example, Inc.',
41 'vendor_code' : 'ZZZ',
42 'vendor_email' : '[email protected]',
43 }
44
45 #
46 # IOFilter properties for the countIO filter
47 #
48 countIOVmiofDef = {
49 'identification' : countIOIdentification,
50 'VMIOFTYPE type' : 'disk',
51 'VMIOF version' : '1.0',
52 'capabilities' : 'countIO_config.json'
53 }
54
55 #
56 # user-world build definition for the countIO IOFilter.
57 #
58 countIOuser-worldDef = {
59 'identification' : countIOIdentification,
60 'source files' : ['countIO.c',
61 ],
62 'extra objects' : {
63 '32' : ['blobs/lib-objlib.a', 'blobs/lib-vobuserlib.a'],
64 '64' : ['blobs64/lib-objlib.a', 'blobs64/lib-vobuserlib.a'],
65 }
66 }
67 #
68 # Build the VMIOF shared object
69 #
70 countIOSo = defineVMIOFso(countIOuser-worldDef, countIOVmiofDef)
71
72 #
73 # Build definition for the countIO IOFilter's daemon plugin.
74 #
75 countIOdaemonDef = {
76 'identification' : countIOIdentification,
77 'source files' : ['countIODaemon.c',
78 ],
79 'extra objects' : {
80 '64' : ['blobs64/lib-objlib.a', 'blobs64/lib-vobuserlib.a'],
81 }
82 }
83 #
84 # Build the VMIOF daemon plugin shared object
85 #
86 countIOdaemonSo = defineVMIOFDaemonSo(countIOdaemonDef, countIOVmiofDef)
87
88 #
89 # Definition for the countIO CIM provider
90 #
91 countIOProviderDef = {
92 'identification' : countIOIdentification,
93 'cim location' : 'cim/countIO',
94 'shared include' : [ 'include/common',
95 ],
96 }
97 countIOProvider = defineCimProvider(countIOProviderDef)
98
99 #
100 # Build the Filter's config file
101 #
102 countIOConfig = defineVMIOFconfig(countIOVmiofDef, cim=countIOProviderDef)
103
104 #
105 # VIB build definitions for the filter package
106 #
107 countIOVibDef = {
108 'identification' : countIOIdentification,
109 'payload' : [ countIOSo,
110 countIOdaemonSo,
111 countIOConfig,
112 countIOProvider
113 ],
114 'vib properties' : {
115 'provides' : [],
116 'depends' : [],
117 'conflicts' : [],
118 'replaces' : [],
119 'acceptance-level' : 'community',
120 }
121 }
122 countIOVib = defineVmiofVib(countIOVibDef)
123
124 #
125 # Offline Bundle definition for the filter package
126 #
126 countIOBulletinDef = {
127 "identification" : countIOIdentification,
128 "vibs" : [ countIOVib,
129 ],
130
131 "bulletin" : {
132 # These elements show the default values for the corresponding items
133 # in bulletin.xml file. Uncomment a line if you need to use a
134 # different value.
135 #'severity' : 'general',
136 #'category' : 'Enhancement',
137 #'releaseType' : 'extension',
138 #'urgency' : 'Important',
139
140 'kbUrl' : 'http://kb.vmware.com/kb/example.html',
141
142 # 1. At least one target platform needs to be specified with
143 # 'productLineID'
144 # 2. The product version number may be specified explicitly, like 7.8.9,
145 # or, when it's None or skipped, be a default one for the devkit
146 # 3. 'locale' element is optional
147 'platforms' : [ {'productLineID':'embeddedEsx'},
148 # {'productLineID':'ESXi', 'version':"7.8.9", 'locale':''}
149 ]
150 }
151 }
152 defineOfflineBundle(countIOBulletinDef)

The first 28 lines just contain copyright and comments. The remaining contents are divided into sections as
described in the following sub-topics.

Understanding and Creating an Identification Dictionary Item (32-43)


You must define a dictionary item that provides keys with information used to identify your IO Filter
solution. You typically base the name of the dictionary on the name of the filter, for example
countIOIdentification. Various user interfaces expose some of the keys of this dictionary, such as the filter's
name, version, and description. Some of the keys have additional uses. For example, the version key is used
to name the VIB.

You use this dictionary to define other dictionaries in the SCONS file, as well as with functions you invoke
in the SCONS file. The keys you must provide in this definition, as seen in lines 32-43 of the countIO.sc file,
are:

n name — Set this to the name of the filter. In the example, the name is countio. The name is exposed in
UIs as well as used in the name of the VIB and bundle files. Because it is used for filenames, it cannot
contain spaces or punctuation. You are required to name your filter using the format
VendorCodePRDName, where:

n VendorCode is a 3 letter string assigned by VMware

n PRDName is assigned by you and consists only of letters and numbers

For example, if ExampleCo is assigned the VendorCode exc, and calls their filter product cacheme, the
name should be exccacheme. Only lower case characters can be used.

n module type — Always set this to VMIOF

n binary compat — While you can set this to either yes or no, you must set it to yes for asynchronous
releases and certification

n summary — A longer name of the filter. Unlike the name key, this key can contain spaces, punctuation,
etc. However, it should be brief.

n description — A brief description (1 - 2 sentences) describing this filter

n version — Version number of the filter. This must be strictly numbers and dots. The ESXi installer uses
this number to differentiate revisions of the filter. If you deploy version X to a host, then re-deploy
version X to the same host, the installer believes they are the same filter. To upgrade, the new filter
must have a different version number.

n license — Using one of the following macros, specify the type of license you use to release your IO Filter
solution. The available license macros, and their meaning, are:

n VMK_MODULE_LICENSE_BSD - BSD license agreement

n VMK_MODULE_LICENSE_VMWARE - The same license agreement that VMware uses to release its products

n VMK_MODULE_LICENSE_APACHE - Apache license agreement

n VMK_MODULE_LICENSE_LGPL - LGPL license agreement

n VMK_MODULE_LICENSE_EPL - EPL license agreement

Alternatively, you can set this to something you define yourself, for example,
"MySpecialLicenseAgreement". In this case, the build system considers this a "third party license"
agreement which is unknown to VMware.

n vendor — Set this to your company's legal name

n vendor_code — You must obtain a vendor code from your IO Filter program manager, and set this key to
that value

n vendor_email — Set this to the email address you want customers to use when contacting you about
issues with this filter
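
Several of the rules above (the name format, the fixed module type, and the numbers-and-dots version) lend themselves to a quick check before you run a build. The following Python sketch is not part of the VAIODK: check_identification is a hypothetical helper, the values in the usage example come from the ExampleCo example in this topic, and comparing the vendor code to the name without regard to case is an assumption.

```python
import re


def check_identification(ident):
    """Return a list of problems found in an identification dictionary."""
    problems = []
    name = ident.get('name', '')
    code = ident.get('vendor_code', '')
    version = ident.get('version', '')

    if not re.fullmatch(r'[A-Za-z]{3}', code):
        problems.append("vendor_code must be a 3-letter string")
    if not re.fullmatch(r'[a-z0-9]+', name):
        problems.append("name may contain only lower-case letters and digits")
    elif not name.startswith(code.lower()) or len(name) <= len(code):
        # Assumption: the vendor code prefix is compared in lower case.
        problems.append("name must be the vendor code followed by a product name")
    if ident.get('module type') != 'VMIOF':
        problems.append("module type must be VMIOF")
    if ident.get('binary compat') not in ('yes', 'no'):
        problems.append("binary compat must be yes or no")
    if not re.fullmatch(r'[0-9]+(\.[0-9]+)*', version):
        problems.append("version must consist of numbers and dots")
    return problems


if __name__ == "__main__":
    example = {
        'name' : 'exccacheme',
        'module type' : 'VMIOF',
        'binary compat' : 'yes',
        'version' : '1.0.0',
        'vendor_code' : 'exc',
    }
    print(check_identification(example))
```

The helper returns a list rather than raising on the first problem, so that one run reports every key that needs attention.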

Understanding and Creating a Filter Properties Dictionary Item (48-53)


You must define a dictionary item that provides keys to indicate properties of your IO Filter solution. You
typically base the dictionary name on the filter, for example countIOVmiofDef.

As with the identification dictionary, you use this dictionary in a function you invoke within the
SCONS file. The keys you must provide in this definition, as seen in lines 48-53 of the countIO.sc file, are:

n identification — Set this to the name of the identification dictionary you have previously defined

n VMIOFTYPE type — The type of IO filter implemented in this solution. Currently, you can only set this
value to disk.

n VMIOF version — The version of the IO Filtering framework to use with this IO Filter

WHAT'S NEW Currently this tag is not used. If you build with the 60U1 VAIODK, the VIB/bundle
depends on VMIOF API version 1.0. If you build with the 60U2 VAIODK, the VIB/bundle depends on
VMIOF API version 1.1.

n capabilities — The path, relative to the filter's directory, of a .json file that contains settings required for
an IO Filter solution as described in “Creating and Populating a Correct .json File,” on page 98.

Understanding and Creating a Library Instance Dictionary Item (58-66)


You must define a dictionary item that provides keys that define the source files that make up your IO Filter
solution's library instance (LI). This dictionary definition can also include keys for passing flags and
definitions to the C compiler. You typically base its name on the name of the filter, for example,
countIOuser-worldDef.

As with other dictionaries, you use this dictionary's definition with a function you invoke within the SCONS
file. The keys you must provide in this definition, as seen in lines 58-66 of the countIO.sc file, are:

n identification — Set this to the name of the identification dictionary you have previously defined

n source files — This must be a list of files whose source make up your filter's LI. In the example, the LI
only has one source file, countIO.c.

In addition, you can provide the following keys:

n cc defs / cc flags / cc warnings - These are optional flags that can be passed to the compiler. cc defs are
flags that are passed to the C preprocessor. Because the compiler invokes the preprocessor, they are also
included on the C compiler command line. Defining them separately makes it possible to use the
preprocessor independently of the compiler. cc flags are flags that are given only to the C compiler.
cc warnings are flags passed to the C compiler that determine which categories of warnings are enabled
or suppressed. Warnings are diagnostic messages that report constructions that are not inherently
erroneous but that are risky or suggest that there may have been an error.

n extra objects - You can pass pre-compiled, partner-provided, opaque binary ELF files into the user-world
build information of your IO filter.

NOTE You must provide both 32-bit and 64-bit versions of these files, because they can be loaded into
either 32-bit or 64-bit cartels, such as vmx and vmkfstools.
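
A check along the following lines can catch a missing architecture before you build. This is a sketch under stated assumptions: missing_architectures is a hypothetical helper, not a VAIODK function, and it assumes that extra objects is a dictionary keyed by '32' and '64', as in the countIO listing.

```python
def missing_architectures(definition, required):
    """Return the architectures in `required` that have no extra objects.

    `definition` is a build definition dictionary from a SCONS file.
    Because 'extra objects' is optional, a definition without the key
    is reported as having nothing missing.
    """
    extra = definition.get('extra objects')
    if extra is None:
        return []
    # An architecture counts as missing if its key is absent or its list is empty.
    return [arch for arch in required if not extra.get(arch)]


if __name__ == "__main__":
    library_def = {
        'source files' : ['countIO.c'],
        'extra objects' : {
            '32' : ['blobs/lib-objlib.a', 'blobs/lib-vobuserlib.a'],
            '64' : ['blobs64/lib-objlib.a', 'blobs64/lib-vobuserlib.a'],
        },
    }
    print(missing_architectures(library_def, ('32', '64')))
```

Passing the required architectures as an argument lets the same helper be used for a definition that needs only one of them, such as the daemon plug-in definition in the listing, which supplies only a '64' list.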

Invoking the defineVMIOFso Function (70)


To build the shared object for your filter, you use the defineVMIOFso() function. It takes the following parameters:

n The library instance dictionary item previously defined in this topic

n The IO Filter properties definition previously defined in Filter Properties dictionary item

Understanding and Creating a Daemon Plug-In Dictionary Item (75-82)


If you include an IO filter daemon in your build, you provide the information for this build in this section of
the SCONS file. The keys you must provide in this definition, as seen in lines 75-82 of the countIO.sc file,
are:

n identification—Set this to the name of the identification dictionary you have previously defined

n source files - This must be a list of the source files that make up your IO Filter's daemon code. In the
example, the daemon has only one source file, countIODaemon.c.

n cc defs / cc flags / cc warnings - These are optional flags that can be passed to the compiler. cc defs are
flags that are passed to the C preprocessor. Because the compiler invokes the preprocessor, they are also
included on the C compiler command line. Defining them separately makes it possible to use the
preprocessor independently of the compiler. cc flags are flags that are given only to the C compiler.
cc warnings are flags passed to the C compiler that determine which categories of warnings are enabled
or suppressed. Warnings are diagnostic messages that report constructions that are not inherently
erroneous but that are risky or suggest that there may have been an error.

n extra objects - You can pass pre-compiled, partner-provided, opaque binary ELF files into the user-world
build information of your IO filter.

NOTE You provide only a 64-bit version of each binary ELF file, because it is loaded into the 64-bit iofilterd
cartel.

Invoking the defineVMIOFDaemonSo Function (86)


To build the shared object for your IO filter's daemon plug-in, you use the defineVMIOFDaemonSo() function.
It takes the following parameters:

n The IO Filter's Daemon plug-in dictionary item previously defined in this topic

n The IO Filter properties definition previously defined in Filter Properties dictionary item in this topic

Understanding and Creating the CIM Provider Dictionary Item (91-96)


A CIM provider enables you to provide a means of monitoring and managing your IO filter. You must
provide the identification and location information for all files necessary to build the provider, in this
section of the SCONS file. The keys you must provide in this definition, as seen in lines 91-96 of the
countIO.sc file, are:

n identification— Set this to the name of the identification dictionary previously defined in this topic

n cim location— The location of the directory containing the CIM provider project files. This location is
relative to the directory holding the .sc file.

n shared include— The location of any files necessary to the CIM provider build that are shared with the
IO filter. This location is relative to the IO filter project directory.

Invoking the defineCimProvider Function (97)


Now that you have defined the identification and location information, you build the CIM provider using
defineCimProvider(). It takes the following parameter:

n The CIM provider dictionary item previously defined in this topic

Invoking the defineVMIOFconfig Function (102)


You build the IO filter configuration file using defineVMIOFconfig(). It takes the following parameters:

n IO filter properties definition as per Filter Properties Dictionary Item previously defined in this topic

n CIM provider definition as per CIM Provider Dictionary Item previously defined in this topic

Understanding and Creating the VIB Dictionary Item (107-121)


The VIB section provides the information needed to construct a VIB (vSphere Installation Bundle). VIBs
are used to package installable entities for installation on ESXi. The keys you must provide in this definition,
as seen in lines 107-121 of the countIO.sc file, are:

n identification— Set this to the name of the identification dictionary previously defined in this topic

n payload - The items to be included in the VIB. These are the values returned by defineVMIOFso(),
defineVMIOFDaemonSo(), defineVMIOFconfig(), and defineCimProvider(), as discussed previously in
this topic.

n vib properties - The VIB properties provide the ESXi installer with the information about your IO
Filter that it needs to install the filter correctly. You must set these fields to appropriate values as
required for your IO Filter.

n provides - Lists the interfaces or virtual packages that this VIB provides. Each entry has two
members: a required name field and an optional version field.

n depends - Lists the VIBs that this VIB depends on.

n conflicts - Lists the VIBs that must not be installed along with this VIB.

n replaces - Lists the VIBs, if any, that this VIB replaces. You do not need a replaces entry for
older versions of the same package name, because these are replaced automatically. Use this only
when packages have been renamed.

n acceptance-level - The VIB acceptance level. Valid values are community, partner,
accepted, and certified.

Each IO filter project has standard depends and provides properties. These are automatically added to the
VIB. You can use the VIB section to add vendor-specific depends and provides properties information.
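
The shape of the vib properties entry can be checked with a few lines of Python. This is only a sketch: check_vib_properties is a hypothetical helper, and the requirement that all four list keys be present is an assumption based on the countIO listing, which sets each of them to an empty list.

```python
ACCEPTANCE_LEVELS = ('community', 'partner', 'accepted', 'certified')
LIST_KEYS = ('provides', 'depends', 'conflicts', 'replaces')


def check_vib_properties(props):
    """Return a list of problems found in a 'vib properties' dictionary."""
    problems = []
    for key in LIST_KEYS:
        # Assumption: each of these keys is present and holds a list.
        if not isinstance(props.get(key), list):
            problems.append("'%s' must be a list" % key)
    level = props.get('acceptance-level')
    if level not in ACCEPTANCE_LEVELS:
        problems.append("invalid acceptance-level: %r" % (level,))
    return problems


if __name__ == "__main__":
    sample = {
        'provides' : [],
        'depends' : [],
        'conflicts' : [],
        'replaces' : [],
        'acceptance-level' : 'community',
    }
    print(check_vib_properties(sample))
```

The sample values are the ones used in lines 114-120 of the countIO.sc file, for which the helper reports no problems.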

Invoking the defineVmiofVib Function (122)


To build the VIB, use the function defineVmiofVib(), as shown in line 122 of the countIO.sc file. Its only
parameter is the VIB definition dictionary defined in the preceding section.

Understanding and Creating Offline Bundle Dictionary Item (126-151)


You define the information that enables the build to create an offline bundle .zip file. You can use this .zip file
to install the offline bundle on any ESXi host cluster. The keys you must provide in this definition, as seen in
lines 126-151 of the countIO.sc file, are:

n identification— Set this to the name of the identification dictionary previously defined in this topic

n vibs - The list of VIBs to include in the offline bundle, as built by the function invoked in line 122 of the
countIO.sc file.

n Bulletin properties - The information used to create elements in the bulletin.xml file. It consists of the
following:

n severity - The bulletin's severity. Valid values are: critical, security, general (default).

n category - The purpose of the packaged item, for example, the kind of issue that the VIB addresses.
Valid values are: Security, BugFix, Enhancement (default), Recall, RecallFix, Info, Misc.

n releaseType - The release type. Valid values are: patch, rollup, update, extension (default),
notification, upgrade.

n urgency - The importance of the packaged item. Valid values are: critical, important (default),
moderate, low.

n kbUrl - The URL of a knowledge base article or similar online documentation about the issue.
This field must contain text, but the text can indicate that no URL is available.

n platforms - The platforms for which the IO filter was developed. You must specify at least one
target platform.
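
The bulletin defaults and valid values can be summarized in code. The following Python sketch is not how the build system itself processes the bulletin entry; effective_bulletin is a hypothetical helper, the defaults are taken from the comments in lines 132-138 of the countIO.sc listing, and comparing values without regard to case is an assumption.

```python
# Defaults as shown in the comments of the countIO.sc listing.
DEFAULTS = {
    'severity' : 'general',
    'category' : 'Enhancement',
    'releaseType' : 'extension',
    'urgency' : 'Important',
}

# Valid values as listed in this topic, stored in lower case.
VALID = {
    'severity' : ('critical', 'security', 'general'),
    'category' : ('security', 'bugfix', 'enhancement', 'recall',
                  'recallfix', 'info', 'misc'),
    'releaseType' : ('patch', 'rollup', 'update', 'extension',
                     'notification', 'upgrade'),
    'urgency' : ('critical', 'important', 'moderate', 'low'),
}


def effective_bulletin(bulletin):
    """Merge a 'bulletin' dictionary with the defaults and validate it."""
    merged = dict(DEFAULTS)
    merged.update(bulletin)
    for key, allowed in VALID.items():
        # Assumption: values are compared without regard to case.
        if str(merged[key]).lower() not in allowed:
            raise ValueError("invalid value for %s: %r" % (key, merged[key]))
    if not merged.get('kbUrl'):
        raise ValueError("kbUrl must contain text")
    platforms = merged.get('platforms') or []
    if not any(p.get('productLineID') for p in platforms):
        raise ValueError("at least one platform needs a productLineID")
    return merged


if __name__ == "__main__":
    bulletin = {
        'kbUrl' : 'http://kb.vmware.com/kb/example.html',
        'platforms' : [{'productLineID' : 'embeddedEsx'}],
    }
    print(effective_bulletin(bulletin)['severity'])
```

In the usage example only kbUrl and platforms are supplied, as in the countIO listing, so the remaining four elements take their default values.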

Invoking the defineOfflineBundle Function (152)


Use the function defineOfflineBundle() to build the offline bundle. Its only parameter is the offline bundle
dictionary item defined previously in this topic.

Defining a Test App (optional)


You can, optionally, create a test application for the purpose of driving tests of your IO Filter. To do this,
you must define a dictionary item that provides keys to indicate properties of your IO Filter Test App. You
typically base the dictionary name on the filter, for example sampfiltTestuser-worldDef. After defining the
dictionary item, invoke the defineVmiofTestApp function to create the actual executable.

The following SCONS file snippet illustrates how to create the dictionary item and invoke the function:

1 #
2 # Build the test app
3 #
4 sampfiltTestuser-worldDef = {
5 'identification' : sampfiltIdentification,
6 'source dirs' : ['tests',
7 ]
8 }
9 sampfiltTest = defineVmiofTestApp('test1',
10 sampfiltTestuser-worldDef,
11 sampfiltVmiofDef,
12 testDir='ZZZ_corp_tests')

This sample illustrates the following dictionary definition (lines 4-8):

n identification — Set this to the name of the identification dictionary you have previously defined (Not
shown)

n source dirs - A list of directories that contain the source files of the test app. In the example, the list
contains a single directory, tests.

The sample also illustrates the invocation of defineVmiofTestApp (lines 9-12), which takes the following parameters:


n The name of the test app

n The test app dictionary item previously defined in this topic

n The IO Filter properties definition previously defined in Filter Properties dictionary item (Not shown in
this example)

n The name of the output directory under build/test-apps

NOTE You cannot add the test app to the VIB or offline bundle. You must manually copy it to the ESXi host
in order to run it there (or make it available via an NFS mount).

Creating and Populating a Correct .json File


Each filter's source code must include a .json file that defines certain information required by the SCONS
build system. The filter's SCONS file includes a specific reference to this .json file in the Filter Properties
dictionary item (see “Understanding and Creating a Filter Properties Dictionary Item (48-53),” on page 94).
It doesn't matter what you name the file, as long as you provide the .json file's name in the SCONS file.
However, the name of the .json file is typically based on the filter name, for example sampfilt.json.

NOTE If you are unfamiliar with JSON notation, see http://json.org.

The document vSphere APIs for IO Filtering Development Kit (VAIODK) Guide for the Command Line contains
the topic Editing the .json File (<filter_name>_config.json), which describes the contents of this file well.

To understand these requirements, consider the following .json file from the countIO sample solution,
followed by a discussion of its content:

1. {
2. "INFO" : [
3. " Copyright 2014 VMware, Inc. ",
4. " All rights reserved. -- VMware Confidential "
5. ],
6. "PROPERTIES" : {
7. "TYPE" : "disk",
8. "CLASS" : "cache",
9. "CATALOGS" : {
10. "en" : "catalogs/catalog_en.vmsg"
11. }
12. },
13. "CAPABILITIES": {
14. "numWorkGroups": {
15. "min": 0,
16. "max": 8,
17. "default": 0
18. },
19. "encoding": {
20. "values": ["2", "4", "6"],
21. "default": "2"
22. }
23. },
24. "NETWORK" : {
25. "firewall" : {
26. "inbound" : [ 32768, [32770, 32774], 32790],
27. "outbound" : [ 25, 80, 443, [32770, 32774]]
28. }
29. },
30. "RESOURCES" : {
31. "DAEMON MEMORY RESERVATION" : 100
32. }
33. }

Understanding and Creating an INFO section (2-5)


This section specifies copyright/legal headers for the IO Filter Solution. In this example, we use the VMware
copyright.

Understanding and Creating a PROPERTIES section (6-12)


n TYPE — For now this is set to "disk". In the future there may be different types of IO Filters.

n CLASS — It is important to note that you define the class of the IO Filter Solution in the PROPERTIES
section. This is the only place that you define the class. The IO Filter Framework uses this setting to
determine the order in which to invoke this filter's callbacks, relative to other IO Filters attached to the
same VMDK.

n CATALOGS — You must create one entry in this section for each locale for which you define natural
language (localized) versions of your filter's properties' names and descriptions (see next section). You
only have to provide this section if your filter defines CAPABILITIES / properties. For details on
defining locales and catalogs for them, see “Understanding How to Localize your Solution,” on
page 116.

Understanding and Creating a CAPABILITIES section (13-23)


This section should really be named PROPERTIES, because it describes the list of properties exposed by
your IO Filter solution. The section “Understanding an Processing diskProperites{Valid,Set,Get,Free}
Events,” on page 237 discusses properties in detail.

n LIs are given the list of properties and their initial values when an administrator attaches the IO Filter to
a VMDK

n Administrators can change the values via the vSphere Web Client (VWC) or vmkfstools, which trigger
(optional) callbacks for validating and changing the property's value.

n You specify the initial value of each property using the default attribute of each property.
Administrators can override these initial values, in the VWC or using vmkfstools, when they attach
the IO Filter to a VMDK.

n If you specify the min and max attributes, or the values attribute, of a property, the VWC only allows
settings within the bounds or from the list, respectively.

n LIs typically store the values of each property in a sidecar for the VMDK.

IMPORTANT Currently, each IO Filter solution must define at least one capability / property, even if the
solution does not use it.

NOTE In 60U2, support was added for creating a capability that accepts any value from the user. This can be
used to provide IP addresses, network shares, or other arbitrary string-based filter settings. The default
value must be "*" to pass the SPBM compatibility check.
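
The rules for the CAPABILITIES section can be checked by loading the .json file with Python's standard json module. The sketch below is illustrative only: check_capabilities is a hypothetical helper, and the requirement that each default lie within its own min/max bounds or values list is an assumption inferred from the way the VWC restricts settings, not a documented build check.

```python
import json

# The CAPABILITIES section from the countIO sample, without line numbers.
SAMPLE = """
{
  "CAPABILITIES": {
    "numWorkGroups": {"min": 0, "max": 8, "default": 0},
    "encoding": {"values": ["2", "4", "6"], "default": "2"}
  }
}
"""


def check_capabilities(config_text):
    """Return a list of problems found in the CAPABILITIES section."""
    config = json.loads(config_text)
    capabilities = config.get('CAPABILITIES') or {}
    problems = []
    if not capabilities:
        problems.append("define at least one capability")
    for name, spec in capabilities.items():
        if 'default' not in spec:
            problems.append("%s: no default" % name)
            continue
        default = spec['default']
        if 'values' in spec and default not in spec['values']:
            problems.append("%s: default is not in values" % name)
        # Only numeric defaults are compared against min and max.
        if isinstance(default, (int, float)):
            if 'min' in spec and default < spec['min']:
                problems.append("%s: default is below min" % name)
            if 'max' in spec and default > spec['max']:
                problems.append("%s: default is above max" % name)
    return problems


if __name__ == "__main__":
    print(check_capabilities(SAMPLE))
```

A capability that accepts any value has neither bounds nor a values list, so the helper only confirms that it has a default.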

Understanding and Creating a NETWORK section (24-29)


This section is only required if your IO Filter solution includes a daemon, and that daemon performs
off-host network communications via TCP or UDP. The section defines the ports to which your daemon binds
to accept inbound connections, and the ports on which it creates outbound connections. Partners can use the
management vmknic, or vmknics tagged for other traffic (for example vSAN or vMotion). Your solution
can also present the available vmknics in a UI (for example a VWC plug-in) to let users choose
which vmknic to use.

To prevent port-usage collisions, VMware assigns each IO Filter partner up to 8 discrete ports to which their
solutions can bind to listen for incoming connection requests. VMware limits solutions to creating up to 200
outbound connections to off-host destinations.

You must create a firewall entry within the NETWORK entry, with separate lists of the inbound and outbound
ports used by your solution. When your IO Filter solution's VIB is installed on a host, the installation process
automatically opens the listed ports in the firewall in the indicated directions. For example, the example
entry indicates that the solution listens on (binds to) ports 32768, 32770 through 32774, and 32790, and makes
outbound connections on ports 25, 80, 443, and 32770 through 32774.
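
The port lists in the example entry can be modeled as inclusive ranges. The following is a minimal sketch, not
part of the VAIODK: the PortRange type and the helper names are hypothetical, and only the port numbers and
the limit of 8 listening ports come from the text above. It shows one way to check a port against a list and
to count how many discrete ports a list consumes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one firewall list entry: an inclusive port range. */
typedef struct {
   unsigned first;
   unsigned last;
} PortRange;

/* Per-partner limit on listening ports stated in this section. */
#define MAX_INBOUND_PORTS 8

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Inbound list from the example entry: 32768, 32770-32774, 32790. */
static const PortRange inboundPorts[] = {
   {32768, 32768}, {32770, 32774}, {32790, 32790},
};

/* Outbound list from the example entry: 25, 80, 443, 32770-32774. */
static const PortRange outboundPorts[] = {
   {25, 25}, {80, 80}, {443, 443}, {32770, 32774},
};

/* Return true if the port falls inside any range of the list. */
static bool PortListed(const PortRange *list, size_t n, unsigned port)
{
   for (size_t i = 0; i < n; i++) {
      if (port >= list[i].first && port <= list[i].last) {
         return true;
      }
   }
   return false;
}

/* Count the discrete ports covered by the list. */
static unsigned PortCount(const PortRange *list, size_t n)
{
   unsigned count = 0;
   for (size_t i = 0; i < n; i++) {
      count += list[i].last - list[i].first + 1;
   }
   return count;
}
```

With these lists, the inbound entry covers seven discrete ports, which fits within the limit of 8.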

NOTE The daemon in your IO Filter solution should use SSL to establish secure connections
between its LIs and off-host peer daemons, and should incorporate additional measures to authenticate
such connections, such as magic numbers.

Understanding and Creating a RESOURCES section (30-32)


This section is required if your IO Filter solution includes a daemon. In that case, you must include the
DAEMON MEMORY RESERVATION attribute to indicate how many megabytes of memory your daemon
may use, excluding memory allocated via the VMIOF_Heap* functions. In the example, the filter is
reserving 100MB of space.

If you do not provide this attribute, the build system assumes its value to be zero. Thus, omitting it
does not cause a build issue. However, a value that is too low prevents your daemon from
starting at runtime. For example, if you specify 10MB and your daemon includes 15MB of static
data, the IO Filter Framework fails to start the daemon during ESXi boot or after VIB installation.

REMEMBER Memory allocated using the VMIOF_Heap* API does not count against this limit.

How to properly set DAEMON MEMORY RESERVATION


Since all the heap memory allocated through the VMIOF_Heap* functions is not restricted by "DAEMON MEMORY
RESERVATION", the reservation should not be very large. You can use the following steps to figure out the
correct sizing:

1 Set "DAEMON MEMORY RESERVATION" to a very large value, e.g. 1GB, then build and install the VIB.

2 Get the PID of iofilterd through "ps | grep iofilterd".

3 Get the SchedGroupID for iofilterd through
"vsish -e get /sched/memClients/[iofilterd-PID]/SchedGroupID".

4 Run
"memstats -r group-stats -s name:min:max:eMin:rMinPeak:consumed:flags -g [iofilterd-SchedGroupID] -u mb".

5 Run your IO Filter through a worst-case scenario by opening the maximum number of VMDKs you want
to support at the same time.

6 Run memstats again, and rMinPeak in the first row will be the maximum memory needed by your
daemon. Then add some headroom on top of that.

7 Set "DAEMON MEMORY RESERVATION" to the new value you just figured out.

NOTE Shared memory is not counted by DAEMON MEMORY RESERVATION. But the page table
used to map shared memory is counted, which will be automatically covered using the steps described
above.
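
The arithmetic in steps 6 and 7 can be sketched as follows. This is an illustrative helper only: the function
names and the idea of expressing headroom as a percentage are assumptions, since the text says only to "add
some headroom" on top of rMinPeak.

```c
#include <stdint.h>

/*
 * Return a reservation in MB: the observed rMinPeak plus a headroom
 * percentage. Both inputs are values you choose or measure yourself.
 */
static uint32_t ReservationMB(uint32_t rMinPeakMB, uint32_t headroomPct)
{
   /* Ceiling division so a small headroom is never rounded down to zero. */
   uint64_t extra = ((uint64_t)rMinPeakMB * headroomPct + 99) / 100;

   return (uint32_t)(rMinPeakMB + extra);
}

/* Return nonzero when a reservation is smaller than the memory needed. */
static int ReservationTooLow(uint32_t reservationMB, uint32_t neededMB)
{
   return reservationMB < neededMB;
}
```

For example, an rMinPeak of 80MB with 25 percent headroom gives a reservation of 100MB, while the 10MB
reservation with 15MB of static data described earlier is flagged as too low.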

Creating a Skeletal Filter Library Component Source File and Understanding its Entry-points / Callbacks

Each Library Component's source must contain the following:

n #include(s) — The source must #include <vmiof.h> in order to have definitions for the macro you use
to enumerate the mandatory and optional callbacks present in your Library component

n Callbacks — The source must define certain code for all mandatory callbacks. It may also define code
for the optional callbacks allowed by the IO Filter Framework.

n VMIOF_DEFINE_DISK_FILTER — The code must contain one invocation of this macro, which provides the
IO Filter Framework with pointers to the callbacks in the Library (among other things)

You must reference this file in the SCONS file's Library Instance dictionary item, in the source files key.

The sub-sections that follow provide details for creating a skeletal LI source file.

NOTE The sampfilt sample is essentially the skeletal solution discussed in this topic, except that it has
functions for optional as well as mandatory callbacks.

Procedure
1 Create a C source file in the IO Filter's source directory.

a At the top of the file, #include <vmiof.h>

b At the bottom of the file, use the VMIOF_DEFINE_DISK_FILTER macro (discussed in the next section) to
define the callbacks you will have in your filter's Library component. Initially consider including
calls to VMIOF_Log() in each of the callbacks to see when the IO Filter Framework invokes each of
these functions.

c In the middle of the file, declare and define the body of each callback you specify in the
VMIOF_DEFINE_DISK_FILTER macro.

This chapter provides one topic for each callback you can define in a LI.

2 Add the source file to your SCONS file.

3 Build and test your skeletal solution

You have a starting point for fleshing out an LI with your own code.

Understanding and Invoking the VMIOF_DEFINE_DISK_FILTER Macro


You must invoke the VMIOF_DEFINE_DISK_FILTER macro once in every library instance. The macro expands to
a data structure that contains function pointers to required and optional IO Filter callbacks. When a
user-space cartel loads the filter's LI, the initialization code uses the pointers in this structure so that
IO Filter Framework invocations route to the functions they point to.

An example VMIOF_DEFINE_DISK_FILTER macro invocation for the countIO sample filter, taken from the file
countIO.c, is shown below:

VMIOF_DEFINE_DISK_FILTER(
.diskAttach = &TestDiskAttach,
.diskDetach = &TestDiskDetach,
.diskOpen = &TestDiskOpen,
.diskClose = &TestDiskClose,

.diskPropertiesSet = &TestDiskSetProperties,
.diskPropertiesGet = &TestDiskGetProperties,
.diskRequirements = &TestDiskRequirements,

/*
* The following optional event callback functions could be implemented
* as well but are omitted here for brevity.
* .diskSnapshot = ...,
* .diskCollapse = ...,
* .diskClone = ...,
* .diskVmMigration = ...,
*/

.diskIOStart = &TestDiskStartIO,
.diskIOAbort = &TestDiskAbortIO,
.diskIOsReset = &TestDiskResetIOs,
);

NOTE This macro has pointers to all required callbacks. It mentions, but does not provide pointers to
functions for, some of the optional callbacks. The additional optional callbacks not listed in this code sample
include: diskRelease, diskPropertiesFree, diskPropertiesValid, diskGrow, diskStun, and diskUnstun.
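
To illustrate how a macro of this kind can expand to a table of function pointers, the following self-contained
sketch uses stand-in definitions. Everything prefixed Demo or DEMO is hypothetical and is not the real
vmiof.h content; the sketch only models the mechanism of designated initializers wrapped by a variadic macro,
and a framework-side dispatch that treats a missing optional callback (here diskCollapse) as success.

```c
#include <stddef.h>

/* Stand-in status type; NOT the real VMIOF_Status definition. */
typedef int DemoStatus;
#define DEMO_SUCCESS 0

/* Stand-in callback table with two required and one optional entry. */
typedef struct DemoFilterOps {
   DemoStatus (*diskOpen)(void *handle);
   DemoStatus (*diskClose)(void *handle);
   DemoStatus (*diskCollapse)(void *handle);   /* optional */
} DemoFilterOps;

/* Hypothetical macro: wraps designated initializers into one table. */
#define DEMO_DEFINE_DISK_FILTER(...) \
   const DemoFilterOps demoFilterOps = { __VA_ARGS__ }

static int openCalls;   /* counts diskOpen invocations for the demo */

static DemoStatus DemoDiskOpen(void *handle)
{
   (void)handle;
   openCalls++;
   return DEMO_SUCCESS;
}

static DemoStatus DemoDiskClose(void *handle)
{
   (void)handle;
   return DEMO_SUCCESS;
}

/* Members not named here (diskCollapse) are initialized to NULL. */
DEMO_DEFINE_DISK_FILTER(
   .diskOpen = &DemoDiskOpen,
   .diskClose = &DemoDiskClose
);

/* Framework-side view: a missing optional callback counts as success. */
static DemoStatus DemoInvokeCollapse(void *handle)
{
   if (demoFilterOps.diskCollapse == NULL) {
      return DEMO_SUCCESS;
   }
   return demoFilterOps.diskCollapse(handle);
}
```

Because unnamed members of the table default to NULL, the caller can distinguish callbacks the filter provides
from those it omits, which is how a framework could apply a default result for omitted optional callbacks.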

Understanding VMIOF_Status Results for Functions in the VAIO


Most of the functions in the VAIO return values. A few return void. All values returned by
functions in the VAIO are of type VMIOF_Status, which can have one of the values defined in the
file /opt/vmware/vaiodk-*/src/bora/lib/public/vmiof/vmiof_status.h. The values and their meanings are
replicated here as follows:

n VMIOF_SUCCESS — The Operation completed successfully

n VMIOF_FAILURE — The Operation failed

n VMIOF_NO_MEMORY — Not enough memory available to complete the request

n VMIOF_BAD_PARAM — One or more supplied parameters is invalid

n VMIOF_ASYNC — The IO or the operation is marked as deferred and is handled asynchronously

n VMIOF_NO_IO — The disk is opened with the NO_IO flag

n VMIOF_READ_ONLY — The disk is read only

n VMIOF_NO_SPACE — The disk does not have enough space

n VMIOF_OUT_OF_RANGE — An access was requested that is out of range

n VMIOF_INVALID_DISK — The supplied disk handle is invalid

n VMIOF_NOT_SUPPORTED — The operation mentioned is not supported

n VMIOF_NOT_FOUND — The required object could not be found

n VMIOF_NO_RESOURCES — Not enough resources available to complete the request

n VMIOF_SIDECAR_LIMIT — The maximum number of sidecar files per filter was reached

n VMIOF_CANCELLED — The operation was cancelled

n VMIOF_ABORTED — The IO operation was aborted

n VMIOF_ALREADY_EXISTS — The file already exists

n VMIOF_MISALIGNED — The parameters mentioned do not have correct alignment

n VMIOF_INVALID_ADDRESS - The address is invalid for an IO

n VMIOF_NO_ACCESS - Permission was denied for an IO

n VMIOF_SCSI_RESERVATION_CONFLICT - There was a SCSI Reservation conflict error from the underlying
storage

NOTE The VMIOF_Status type is used by both utility functions (discussed in “Overview of VMIOF IO Utility
Functions,” on page 181) implemented in the IO Filter Framework and entry point / callback functions
(discussed in “Understanding and Processing diskIOStart Events,” on page 169 and “Understanding and
Processing Other IO Filter Events,” on page 208) that you must implement.
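
When logging callback results, a small name lookup can make log output easier to read. The sketch below uses a
stand-in enum covering a few of the statuses listed above; the DemoStatus type, its numeric values, and the
helper names are assumptions made for illustration and do not reproduce vmiof_status.h.

```c
#include <string.h>

/* Stand-in for a subset of the statuses; numeric values are arbitrary. */
typedef enum {
   DEMO_SUCCESS,
   DEMO_FAILURE,
   DEMO_NO_MEMORY,
   DEMO_BAD_PARAM,
   DEMO_ASYNC,
   DEMO_NOT_FOUND
} DemoStatus;

/* Map a status to a short printable name for log messages. */
static const char *DemoStatusName(DemoStatus status)
{
   switch (status) {
   case DEMO_SUCCESS:   return "SUCCESS";
   case DEMO_FAILURE:   return "FAILURE";
   case DEMO_NO_MEMORY: return "NO_MEMORY";
   case DEMO_BAD_PARAM: return "BAD_PARAM";
   case DEMO_ASYNC:     return "ASYNC";
   case DEMO_NOT_FOUND: return "NOT_FOUND";
   }
   return "UNKNOWN";
}

/* A deferred (asynchronous) operation is not an error. */
static int DemoStatusIsError(DemoStatus status)
{
   return status != DEMO_SUCCESS && status != DEMO_ASYNC;
}
```

The second helper reflects the distinction drawn in the list above: an asynchronous result means the
operation was deferred, not that it failed.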

Understanding and Defining diskAttach and diskDetach Callbacks


The IO Filter Framework invokes a filter's diskAttach() callback when an administrator adds a filter to a
VMDK. The VMDK may be newly created, or it may have been created earlier and in use for some time.
A primary task of the diskAttach() callback is any one-time initialization of resources required by your
IO Filter, for example creating and initializing the sidecar to be used by your IO Filter, as discussed in
“Using Sidecars Functions in Library Code to Keep Persistent Per-VMDK Meta Data,” on page 137.

NOTE The diskAttach() callback function must complete synchronously. Hence, if any long-running
task needs to be done as a result of attaching the filter to the disk, it must be done outside the context of
this callback.

The prototype of the diskAttach() callback function is as follows:

VMIOF_Status (*diskAttach)(VMIOF_DiskHandle *handle,
                           const VMIOF_DiskInfo *info,
                           const VMIOF_DiskFilterProperty *const *properties);

See “Understanding and Processing DiskAttach Events,” on page 155 for further information, including
example code.

The IO Filter Framework invokes a filter's diskDetach() callback when an administrator removes the IO
Filter from a VMDK, indicating that the IO Filter is being disassociated from the disk. You must
release any resources that your IO Filter holds and no longer needs, such as the sidecar associated
with the VMDK, which you delete by invoking VMIOF_DiskSidecarDelete(). Refer to
“Using Sidecars Functions in Library Code to Keep Persistent Per-VMDK Meta Data,” on page 137
for details on sidecars.

NOTE The diskDetach() callback function does not need to complete synchronously. The filter must remove
itself from the disk completely, which requires completing any pending work and returning with the
disk data in a consistent state. Depending on the detachFlags member of the VMIOF_DiskDetachInfo
structure, the detach operation could also result in disk deletion. The VMIOF_DiskDetachInfo
structure therefore has progressFunc and completionFunc members, which the filter uses to keep the
filter framework informed of progress and subsequent completion of work in case of asynchronous processing.

The diskDetach() callback function prototype is as follows:

VMIOF_Status (*diskDetach)(VMIOF_DiskHandle *handle,
                           const VMIOF_DiskDetachInfo *info);

For further information on diskDetach() including a code sample refer to section “Understanding and
Processing DiskDetach Events,” on page 158

Understanding and Defining diskOpen and diskClose Callbacks


The IO Filter Framework invokes a filter's diskOpen() callback when the VMDK is opened. The
VMIOF_DiskFlags parameter in the VMIOF_DiskInfo structure specifies the mode in which the disk is opened,
for example read-only, no-IO (explained later in the course), shared, etc.

The prototype for a diskOpen() callback is:

VMIOF_Status (*diskOpen)(VMIOF_DiskHandle *handle,
                         const VMIOF_DiskInfo *info);

Refer to section, “Understanding and Processing diskOpen Events,” on page 159 for further information
including a code sample.

The IO Filter Framework invokes the diskClose() callback when an open VMDK is closed. All work groups must
be waited on to ensure that all queued work has been drained, and all poll callbacks and timer
operations, if any, must be removed before invoking diskClose(). Other cleanup operations, such as
removing any connections to a daemon, freeing the sidecar and instance data structure resources of
your IO Filter, and finally destroying the heap itself, are done as part of diskClose().

The prototype for a diskClose() callback is:

VMIOF_Status (*diskClose)(VMIOF_DiskHandle *handle);

Refer to section “Understanding and Processing DiskClose Events,” on page 165 for further information
including a code sample.

Understanding and Defining the diskSnapshot Callback


VMware provides administrators and developers with methods for taking snapshots of a VM's vDisks, via
management UIs and the vSphere Management API, respectively. When an administrator or program
invokes the snapshot method, the IO Filter Framework invokes a filter's diskSnapshot() callback so that the
filter can take action(s) it requires (discussed in this and later topics).

Each snapshot occurs in the following phases: Prepare, Notification, and possibly Failure. A brief
description of these phases is:

n Prepare - As the name suggests, the prepare phase occurs after snapshot method invocation but before
vSphere starts performing the snapshot itself. It allows an IO Filter to perform any
preparation work, such as flushing dirty cache buffers to the disk, before vSphere performs the
snapshot. For more information on this phase, see “Understanding and Processing a Snapshot Event,”
on page 193.

n Notification - Almost all the snapshot work is done when the IO Filter Framework invokes this callback.
You can consider the snapshot complete when said framework invokes the next diskClose() callback
after this callback.

NOTE It is possible to receive this callback for a Notification phase without receiving it for a Prepare
phase. For example, when vSphere creates a Linked Clone of a VM, the IO Filter Framework invokes
this callback for both Prepare and Notification phases for the VMDK(s) of the source VM. It only
invokes this callback with the Notification phase as vSphere creates each VMDK for the linked clone
(target) VM, doing so as vSphere creates said VMDK(s).

n Failure — Something prevented the snapshot from completing successfully, for example running out of
disk space.

The IO Filter Framework invokes a filter's diskSnapshot() callback function once for each phase of a
snapshot on each disk being snapshotted. The prototype for a diskSnapshot() function is:

VMIOF_Status (*diskSnapshot)(VMIOF_DiskHandle *handle,
                             const VMIOF_DiskSnapshotInfo *info);

The diskSnapshot() callback can return the following:


n VMIOF_SUCCESS — The function succeeded in handling the reported phase of the snapshot

n VMIOF_ASYNC - The function cannot complete processing of the notification at this time. The IO Filter
Framework postpones further snapshot processing until the filter calls the function pointed to by
info->completionFunc (which matches the prototype of VMIOF_DiskOpCompletionFunc), typically in the
context of a work-pile, timer, or poll callback function. If info->completionFunc is NULL, you
cannot return VMIOF_ASYNC from the function.

NOTE Currently, info->completionFunc is only non-NULL for the prepare phase.

Returning any error code other than VMIOF_SUCCESS and VMIOF_ASYNC from the prepare phase causes the IO
Filter Framework to cancel the snapshot process. This causes the IO Filter framework to post a failure
notification to the other IO Filters.

If the snapshot handling involves long-running work, or processes the request asynchronously, it must
invoke the function pointed to by info->progressFunc at least once every 10 seconds to prevent the IO Filter
Framework from timing out the snapshot operation.
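
The return-value rules for the snapshot phases can be sketched with stand-in types. The Snap* names below
are hypothetical and do not reproduce VMIOF_DiskSnapshotInfo or the real phase constants; the sketch
assumes a caching filter that must flush dirty buffers during the prepare phase and shows that deferring is
only possible when a completion function is supplied.

```c
#include <stddef.h>

/* Stand-in status, phase, and info types; NOT the real VAIO types. */
typedef enum { SNAP_OK, SNAP_ASYNC, SNAP_FAILURE } SnapStatus;
typedef enum { SNAP_PREPARE, SNAP_NOTIFY, SNAP_FAILED } SnapPhase;
typedef void (*SnapCompletionFunc)(void *data, SnapStatus status);

typedef struct {
   SnapPhase phase;
   SnapCompletionFunc completionFunc;   /* may be NULL */
   void *completionData;
} SnapInfo;

/* Hypothetical per-disk filter state. */
typedef struct {
   unsigned dirtyBuffers;   /* cached writes not yet on disk */
   int flushQueued;         /* nonzero while a deferred flush is pending */
} SnapFilter;

/* Placeholder completion function, used only to exercise the sketch. */
static void SnapNoopCompletion(void *data, SnapStatus status)
{
   (void)data;
   (void)status;
}

static SnapStatus SnapHandle(SnapFilter *filter, const SnapInfo *info)
{
   switch (info->phase) {
   case SNAP_PREPARE:
      if (filter->dirtyBuffers == 0) {
         return SNAP_OK;            /* nothing to flush */
      }
      if (info->completionFunc == NULL) {
         return SNAP_FAILURE;       /* cannot defer without a completion */
      }
      filter->flushQueued = 1;      /* flush elsewhere, then complete */
      return SNAP_ASYNC;
   case SNAP_NOTIFY:
      return SNAP_OK;
   case SNAP_FAILED:
      filter->flushQueued = 0;      /* drop any preparation state */
      return SNAP_OK;
   }
   return SNAP_FAILURE;
}
```

In a real filter, the deferred flush would run in a separate context, report progress periodically, and then
call the completion function; none of that machinery is modeled here.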

Understanding and Defining the diskCollapse Callback


A virtual disk can have a chain of delta files associated with it. The initial copy of the virtual disk (vmdk file)
is referred to as the parent link. Every time a change is made to the disk, the incremental change is placed
into a separate delta file. Over time, the sequence of incremental changes results in a hierarchy of links. The
process of collapsing all of those incremental changes into a single file is known as a disk collapse. After
completing this activity, the filter framework invokes the diskCollapse() callback routine. The callback
function can do any housekeeping tasks it desires, but cannot reverse the collapse which has been initiated
by the filter framework before the callback was invoked.

The prototype for diskCollapse() function is:

VMIOF_Status (*diskCollapse)(VMIOF_DiskHandle *handle,
                             const VMIOF_DiskCollapseInfo *info);

It returns the following values:

n VMIOF_SUCCESS — The IO Filter allows the collapse to proceed

n VMIOF_FAILURE — The IO Filter is not willing to allow the collapse to proceed. This results in link
hierarchy remaining unaffected. Since the framework finishes a diskCollapse() before calling this
routine, the chain of links which has been deleted cannot be recovered and the parent link will contain
all of the data in a single file.

NOTE This callback is optional. If the Library component does not have it, the IO Filter Framework assumes
it would return VMIOF_SUCCESS.

Understanding and Defining the diskDeleteBlock* Callbacks


The IO Filter Framework invokes two callback functions associated with deleting a set of blocks from
a virtual disk. These callbacks correspond to receiving an UNMAP SCSI command.

The first callback is invoked before deleting the blocks of a virtual disk. The filter can deny this
operation by returning the appropriate error code if the LI is not ready for the UNMAP operation. It also gives
the LI a chance to do any internal bookkeeping necessary to accommodate the operation. The UNMAP can come
from utilities such as vmkfstools, or be initiated inside the guest.

The prototype for this callback is :

VMIOF_Status
(*diskDeleteBlocksPrepare)(VMIOF_DiskHandle *handle, const VMIOF_DiskDeleteBlocksInfo *info);

This callback returns VMIOF_SUCCESS if the operation is allowed to proceed, else it returns an appropriate
error value.

The second callback is called to signal the status of the block deletion to the filter. It is invoked after a set of
virtual disk blocks is deleted, or if the operation fails. This callback allows the LI to complete any
bookkeeping started in the diskDeleteBlocksPrepare callback.

The prototype for this callback is :

void
(*diskDeleteBlocks)(VMIOF_DiskHandle *handle, const VMIOF_DiskDeleteBlocksInfo *info,
VMIOF_Status status);

This callback does not return a value.

Understanding and Defining a diskRequirements Callback


The IO Filter Framework invokes diskRequirements to determine how much heap memory a filter requires
for a disk of a given size. The framework calls this function before the diskOpen and diskAttach callbacks to
determine whether the hypervisor has sufficient memory to proceed. Thus, the function must provide the
maximum possible heap memory required by a filter instance. The Framework remembers this information,
and will not allow heap memory allocations to exceed the values provided by this function (via its
requirements parameter).

To understand how to calculate the memory requirements, see “Understanding and Using a
diskRequirements Callback,” on page 153

void (*diskRequirements)(VMIOF_DiskRequirements *requirements);

This function returns void.

Understanding and Defining the diskClone Callback


The IO Filter Framework calls the diskClone callback function after a disk has been cloned. Cloning is the
process of creating a copy of a disk, which an end user can initiate using the vSphere Client. Because the
callback is called after the disk has been cloned, it is invoked only on the newly cloned VMDK. It is
not called on the original VMDK.

The prototype of this callback is:

VMIOF_Status (*diskClone)(VMIOF_DiskHandle *handle, const VMIOF_DiskCloneInfo *info);

The return values for this callback are:

n VMIOF_SUCCESS — The clone can proceed.

n Any other value indicates the filter does not allow the clone to proceed.

NOTE This callback is optional. If the Library component does not have it, the IO Filter Framework assumes
it would return VMIOF_SUCCESS.

Understanding and Defining the diskVmMigration Callback


The IO Filter Framework invokes the diskVmMigration callback function when a live migration occurs or when a
live migration has failed. A live migration is triggered when the administrator uses vCenter to migrate a VM
from one host to another. The filter framework provides the callback function with an input parameter, phase,
which describes whether the callback was invoked as part of a live migration prepare phase or after a failure
to migrate the VM.

The prototype of this callback is:

VMIOF_Status (*diskVmMigration)(VMIOF_DiskHandle *handle, const VMIOF_DiskVmMigrationInfo *info);

The possible return values for this callback function are:

n VMIOF_SUCCESS - The operation succeeded

n VMIOF_ASYNC - Processing of the migration event continues asynchronously. The framework
postpones the migration until the filter calls completionFunc.

n Any other value is taken as a failure causing the IO Filter Framework to abort the migration.

NOTE This callback function is not allowed to block, nor can it perform any long-running activity.

Understanding and Defining the diskIOStart Callback


The IO Filter Framework invokes a filter's diskIOStart() callback for every IO request submitted to a filtered
VMDK. The function can process each IO request either synchronously or asynchronously. When a filter
chooses to execute asynchronously, the IO Filter Framework suspends processing the request until the filter
later indicates that the framework should continue processing it.

Some examples of diskIOStart() invocations include:

n When the guest VM to which the SCSI device has been added boots up, it reads the first few
sectors of the device, and does the same when the device is mounted

n Mounting a device requires that a filesystem be created on the disk, which in turn requires that the first
few sectors of the device be written

n Regular read() / write() operations on a disk or file thereon. Invocations of these functions may be
coalesced by the guest OS.

The prototype for this function is:

VMIOF_Status (*diskIOStart)(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io);

Return Value : Only two return values are permitted for this callback - VMIOF_SUCCESS and VMIOF_ASYNC. This
callback function is not allowed to block.

n VMIOF_SUCCESS – diskIOStart() callback returns VMIOF_SUCCESS when its operation succeeds.

n VMIOF_ASYNC – diskIOStart() callback returns VMIOF_ASYNC when IO is deferred to be handled
asynchronously. Processing of this IO by the framework is postponed until the IO is continued by the
IO filter.

NOTE The Framework may invoke multiple diskIOStart callbacks for the same diskHandle at the same
time.

Understanding and Defining the diskIOAbort Callback


The IO Filter Framework invokes a filter's diskIOAbort callback for every IO request that gets aborted. In
other words this callback function of a filter is invoked for each IO for which an abort command is issued by
the guest VM, and which should get aborted if it is managed by the filter. This callback function executes
synchronously and is not allowed to block. The filter may want to do some housekeeping work in the event
of an abort.

Prototype :

VMIOF_Status (*diskIOAbort)(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io);

Return Value : Only the following two return values are permitted for this callback –

n VMIOF_IO_ABORTED – The IO request was indeed aborted.

n VMIOF_NOT_FOUND – The IO was not found by this filter.
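
The relationship between deferring an IO in diskIOStart and later answering an abort for it can be sketched
as follows. The Io* names, the fixed-size table, and the status values are all hypothetical stand-ins, not
VAIO definitions. A real filter would also need locking or another synchronization scheme, because the
framework may invoke these callbacks concurrently for the same disk; that is omitted to keep the sketch short.

```c
#include <stddef.h>

/* Stand-in statuses; NOT the real VMIOF_Status values. */
typedef enum { IO_SUCCESS, IO_ASYNC, IO_ABORTED, IO_NOT_FOUND } IoStatus;

#define MAX_PENDING 4

/* Hypothetical per-disk record of IOs the filter has deferred. */
typedef struct {
   const void *pending[MAX_PENDING];   /* NULL marks a free slot */
} IoTracker;

/* Model of diskIOStart: defer the IO if a tracking slot is free. */
static IoStatus IoStart(IoTracker *tracker, const void *io)
{
   for (size_t i = 0; i < MAX_PENDING; i++) {
      if (tracker->pending[i] == NULL) {
         tracker->pending[i] = io;
         return IO_ASYNC;
      }
   }
   return IO_SUCCESS;   /* table full: let the IO continue untracked */
}

/* Model of diskIOAbort: report whether this filter was holding the IO. */
static IoStatus IoAbort(IoTracker *tracker, const void *io)
{
   if (io == NULL) {
      return IO_NOT_FOUND;
   }
   for (size_t i = 0; i < MAX_PENDING; i++) {
      if (tracker->pending[i] == io) {
         tracker->pending[i] = NULL;   /* housekeeping: release the slot */
         return IO_ABORTED;
      }
   }
   return IO_NOT_FOUND;
}
```

Both functions run a bounded loop and never wait, in keeping with the requirement that these callbacks must
not block.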

Understanding and Defining the diskIOsReset Callback


The IO Filter Framework invokes a filter's diskIOsReset() callback in order to reset all outstanding IOs. As a
result all the outstanding IOs to a virtual disk associated with the specific resetIdentifier will be aborted.
This callback function executes synchronously and is not allowed to block. The filter may want to do some
housekeeping operations when such an event occurs.

Prototype :

void (*diskIOsReset)(VMIOF_DiskHandle *handle,
                     VMIOF_DiskResetIdentifier resetIdentifier);

Return Value : The diskIOsReset() callback function returns void.

Understanding and Defining the diskStun and diskUnstun Callbacks


The IO Filter framework invokes the diskStun() callback function to stun all operations of a filter. The act of
stunning all operations means that all pending/transient IOs to a disk should be completed. The framework
will not issue any IOs to the filter in between the calls to diskStun() and diskUnstun() callbacks.

The prototype of this callback is:

VMIOF_Status (*diskStun)(VMIOF_DiskHandle *handle, const VMIOF_DiskStunInfo *info);

In general, the IO Filter framework invokes the diskUnstun() callback to undo a corresponding diskStun()
operation for a given VMDK.

The prototype of this callback is:

void (*diskUnstun)(VMIOF_DiskHandle *handle);

NOTE It is possible for the IO Filter framework to invoke diskUnstun() without a preceding diskStun(). In
that case, this callback function should return without doing anything.

This is an optional callback function. If you do not provide this callback function, the framework assumes
VMIOF_SUCCESS in calling this routine.

Refer to “Understanding and Processing diskStun and diskUnstun Events,” on page 208 for further
information.
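
The requirement that diskUnstun() tolerate being called without a preceding diskStun() can be met with a
simple state flag. The sketch below is illustrative only: the StunState type and function names are
hypothetical, and completing pending IOs is reduced to clearing a counter.

```c
#include <stdbool.h>

/* Hypothetical per-disk state kept by the filter. */
typedef struct {
   bool stunned;           /* true between a stun and its matching unstun */
   unsigned pendingIOs;    /* IOs the filter has not yet completed */
   unsigned resumeCount;   /* how many times normal operation resumed */
} StunState;

/* Model of diskStun: finish outstanding work, then mark the disk stunned. */
static int DemoStun(StunState *state)
{
   state->pendingIOs = 0;   /* stand-in for completing pending IOs */
   state->stunned = true;
   return 0;                /* 0 stands in for a success status */
}

/* Model of diskUnstun: do nothing unless a stun is actually in effect. */
static void DemoUnstun(StunState *state)
{
   if (!state->stunned) {
      return;
   }
   state->stunned = false;
   state->resumeCount++;
}
```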

Understanding and Defining the diskRelease Callback


The IO Filter Framework invokes a filter's diskRelease callback when a cartel attempts to open a VMDK, but
fails because another cartel has locked said VMDK. This can happen for several reasons, including that an
IO Filter's Daemon opened the VMDK to perform offline flushing from a cache to VMDK, or offline
replication (offline meaning: while the VM owning the VMDK is not running). In this scenario, the IO Filter
Framework invokes the diskRelease callback in the LI of the cartel trying to open the VMDK.

NOTE The framework invokes diskRelease without first calling the LI's diskOpen() callback.

In response to this callback, the LI must get whichever cartel has opened the VMDK to close it. For an
explanation of how it does that, see “Understanding and Processing a DiskRelease Event,” on page 228. The
LI reflects its success in getting the other cartel to close the VMDK in its return value, as follows:

n VMIOF_SUCCESS – The other cartel has closed the VMDK. On receiving this return value, the kernel tries
to open the VMDK again. If the retry fails, the operation requesting the open (such as starting the VM)
fails.

n VMIOF_FAILURE – Either the other cartel could not be determined, meaning this disk has not been opened by a
component belonging to this filter, or the other cartel refused to close the VMDK. The result is analogous
to the failed retry described in the preceding bullet.

The prototype of this callback is:

VMIOF_Status (*diskRelease)(VMIOF_DiskHandle *handle);

The parameter to this callback is VMIOF_DiskHandle *handle, an opaque handle to the virtual disk to be
released. You use this parameter in the code that implements this callback, for example to open certain
sidecar files associated with this VMDK. For details on implementing this callback, see “Understanding and
Processing a DiskRelease Event,” on page 228.

NOTE This callback is optional. If the Library component does not have it, the IO Filter Framework assumes
that it would return VMIOF_FAILURE.

Understanding and Defining the diskGrow Callback


The IO Filter Framework invokes a filter's diskGrow() callback before the disk is grown. Growth of the disk is
initiated through VC or an esxcli command. The purpose of this callback is to give the LI a chance to adjust
its internal algorithms, sidecar data, etc. in response to a disk resize. Note that providing this
callback is optional. For a detailed understanding of this callback, see “Understanding
and Processing a diskGrow Event,” on page 233.

The prototype of this callback is:

VMIOF_Status (*diskGrow)(VMIOF_DiskHandle *handle, const VMIOF_DiskGrowInfo *info);

Return VMIOF_SUCCESS to allow the grow to proceed, or any other value to abort the grow operation.

Understanding and Defining the diskExtentGetPre and diskExtentGetPost Callbacks


ESXi creates a disk extent when it writes a block into a child disk. For a caching solution, the IO Filter keeps
track of the blocks residing in the cache but does not share that information with the IO Filter Framework.
The callback functions diskExtentGetPre() and diskExtentGetPost() enable the IO Filter Framework to
query the IO Filter solution to determine the cache state of the disk extents.

The diskExtentGetPre() callback is invoked before a disk extent get operation is performed on the virtual
disk. The filter may modify the start offset if needed. The function prototype for diskExtentGetPre() is:

VMIOF_Status (*diskExtentGetPre)(VMIOF_DiskHandle *handle,
                                 VMIOF_DiskExtentGetInfo *info);

The return values for this callback are :

n VMIOF_SUCCESS – The operation is allowed to proceed.

NOTE Whether to provide the diskExtentGetPre() callback depends on whether the diskExtentGetPost() callback
is provided. A filter should provide either both of these or neither.

The diskExtentGetPost() callback function of a filter is invoked after the disk extent get operation is
completed on the virtual disk. The filter may modify the start offset and the extent's offset and length if
needed. The function prototype for diskExtentGetPost() is:

VMIOF_Status (*diskExtentGetPost)(VMIOF_DiskHandle *handle,
                                  VMIOF_DiskExtentGetInfo *info);

The return values for this callback are :

n VMIOF_SUCCESS – The operation is allowed to proceed.

NOTE Whether to provide the diskExtentGetPost() callback depends on whether the diskExtentGetPre() callback
is provided. A filter should provide either both of these or neither.

Understanding and Defining the diskProperties{Valid,Set,Get,Free} Callbacks


The VAIODK defines four callbacks for processing properties-related events. This topic discusses each of
these.

Understanding the diskPropertiesValid callback


The IO Filter Framework invokes the filter’s diskPropertiesValid() callback prior to invoking diskAttach()
or diskPropertiesSet() to verify whether a given list of filter properties and associated values is valid for a
particular VMDK. If all of the properties and values are valid, as indicated by a return of VMIOF_SUCCESS, the
Framework expects a subsequent call to diskPropertiesSet() to succeed, and a call to
diskAttach() not to fail because of property settings. If any of the property names or their values
are invalid, as indicated by any other return value, the Framework does not invoke the subsequent
callbacks.

The prototype for this callback is:

VMIOF_Status (*diskPropertiesValid)(VMIOF_DiskHandle *handle,
                                    const VMIOF_DiskFilterProperty *const *properties);
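
A validity check of this kind typically walks the supplied list and rejects unknown names or out-of-range
values. The sketch below is a stand-alone model under several assumptions: the DemoProperty layout (a name
and a string value), the NULL-terminated list, the property names, and the numeric bounds are all
hypothetical and are not taken from VMIOF_DiskFilterProperty or any sample filter.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in property: a name and a string value. */
typedef struct {
   const char *name;
   const char *value;
} DemoProperty;

/* Hypothetical rule: a known property name and its allowed bounds. */
typedef struct {
   const char *name;
   long min;
   long max;
} DemoRule;

static const DemoRule demoRules[] = {
   {"cacheSizeMB", 16, 1024},
   {"blockSizeKB", 4, 64},
};

/* Return 1 if value is a whole decimal number within [min, max]. */
static int DemoValueInRange(const char *value, long min, long max)
{
   char *end;
   long parsed;

   if (value == NULL || *value == '\0') {
      return 0;
   }
   parsed = strtol(value, &end, 10);
   return *end == '\0' && parsed >= min && parsed <= max;
}

/* Return 1 if every property in the NULL-terminated list is acceptable. */
static int DemoPropertiesValid(const DemoProperty *const *props)
{
   for (; *props != NULL; props++) {
      const DemoRule *rule = NULL;

      for (size_t i = 0; i < sizeof demoRules / sizeof demoRules[0]; i++) {
         if (strcmp((*props)->name, demoRules[i].name) == 0) {
            rule = &demoRules[i];
         }
      }
      if (rule == NULL ||
          !DemoValueInRange((*props)->value, rule->min, rule->max)) {
         return 0;   /* unknown name or bad value */
      }
   }
   return 1;
}
```

In a real callback, a result of 1 would map to returning VMIOF_SUCCESS and a result of 0 to an error status.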

Understanding the diskPropertiesSet callback


The IO Filter Framework invokes a filter's diskPropertiesSet() callback when an ESXi administrator uses
the vmkfstools command from the ESX CLI to set properties for a filtered virtual disk / vDisk / VMDK. You
can also alter the filter properties in the vSphere Web Client SPBM policy page. The IO Filter Framework
calls diskPropertiesSet() for the Filter Instance associated with the VMDK, providing the list of properties
specified by the administrator. The IO Filter Framework invokes diskPropertiesSet() to validate the
supplied properties and update the filter's private properties. The filter framework expects valid properties
to be overridden with new values.

The prototype for this callback is:

VMIOF_Status
(*diskPropertiesSet)(VMIOF_DiskHandle *handle,
const VMIOF_DiskFilterProperty *const *properties);

The parameters are analogous to those of diskPropertiesValid().

Understanding the diskPropertiesGet callback


The IO Filter Framework invokes a filter's diskPropertiesGet() callback when an administrator uses
vmkfstools to query whether a filter associated with a virtual disk / vDisk / VMDK matches a given set of
properties. The callback queries the filter for all properties that have been configured for that disk. The filter
must provide a pointer to the currently configured disk properties. This pointer references memory that is
owned by the filter and must remain allocated until the filter is closed or until the Framework calls
diskPropertiesFree(). The filter properties configured for a VMDK are also queried when you view them on
the SPBM policy page of the vSphere Web Client.

Prototype:

void (*diskPropertiesGet)(VMIOF_DiskHandle *handle,
                          const VMIOF_DiskFilterProperty *const **properties);

Understanding the diskPropertiesFree callback


The IO Filter Framework invokes a filter’s diskPropertiesFree() callback to tell the filter to free any memory
it may have dynamically allocated during a diskPropertiesGet() callback. This would happen when you
navigate away from the SPBM policy page in vSphere Web Client, after viewing the existing properties.

Prototype:

void
(*diskPropertiesFree)(VMIOF_DiskHandle *handle, VMIOF_DiskFilterProperty **properties);
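The ownership rule for diskPropertiesGet() and diskPropertiesFree() can be sketched as a matched allocate/release pair. This is an illustration under assumptions, not the VAIODK API: SketchProperty and SketchInstance are hypothetical stand-ins, the name/value layout and the NULL-terminated list are assumed, and the disk handle is replaced by a pointer to the filter's own per-disk state.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real types come from vmiof.h. */
typedef struct { const char *name; const char *value; } SketchProperty;
typedef struct { int acceleration; } SketchInstance;   /* per-disk state */

/*
 * Build a NULL-terminated list describing the current configuration.
 * The list remains owned by the filter until SketchPropertiesFree() runs.
 * On allocation failure, *properties is set to NULL.
 */
void
SketchPropertiesGet(const SketchInstance *inst,
                    const SketchProperty *const **properties)
{
   SketchProperty *prop = malloc(sizeof *prop);
   SketchProperty **list = calloc(2, sizeof *list);
   char *value = malloc(16);

   *properties = NULL;
   if (prop == NULL || list == NULL || value == NULL) {
      free(prop);
      free(list);
      free(value);
      return;
   }
   snprintf(value, 16, "%d", inst->acceleration);
   prop->name = "acceleration";   /* string literal, never freed */
   prop->value = value;
   list[0] = prop;                /* list[1] stays NULL as the terminator */
   *properties = (const SketchProperty *const *)list;
}

/* Release everything SketchPropertiesGet() allocated for one list. */
void
SketchPropertiesFree(SketchProperty **properties)
{
   size_t i;

   if (properties == NULL) {
      return;
   }
   for (i = 0; properties[i] != NULL; i++) {
      free((char *)properties[i]->value);
      free(properties[i]);
   }
   free(properties);
}
```

A filter that returns a pointer to a long-lived internal list instead of a fresh allocation would have nothing to release in the free step.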

A Summary Table of Filter Library Callbacks

When it comes to filter library callbacks, there are three things to consider. First, whether the callback is
required or optional. Some of the callbacks are required, meaning the filter library must implement them,
like diskAttach , while some are optional, like diskSnapshot.

Second, whether the callback is synchronous or asynchronous. If a VMIOF_DiskOpCompletionFunc is passed in
as part of a callback's parameters, the callback can return VMIOF_ASYNC. For example, VMIOF_DiskOpCompletionFunc is
part of VMIOF_DiskSnapshotInfo, which is the parameter for diskSnapshot. Otherwise, the callback must complete
synchronously.

Third, whether the callback can perform a long-running operation. If a VMIOF_DiskOpProgressFunc is passed in
as part of a callback's parameters (for example, VMIOF_DiskOpProgressFunc is part of VMIOF_DiskGrowInfo for the
diskGrow callback), the callback is allowed to perform a long-running operation, but it must report progress using
VMIOF_DiskOpProgressFunc at least every 10 seconds. Table 4-1 lists these three attributes for all
the filter library callbacks.

Other than the scenarios where completionFunc or progressFunc is present, an overall non-blocking
rule applies to all callbacks in a Library Instance (LI). In LI callbacks, you should not wait indefinitely, for example
on a semaphore, a socket read, or an eventfd read. Mutexes and spinlocks are allowed.

NOTE The table only summarizes the typical case. You always need to check whether
VMIOF_DiskOpProgressFunc and VMIOF_DiskOpCompletionFunc exist before calling them. For example, when
you detach a filter from a VMDK of a powered-off VM, completionFunc is not set in the diskDetach
callback, so you cannot return VMIOF_ASYNC.
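The check described in this note might look like the following minimal sketch. All names are hypothetical stand-ins: the status values, the SketchDetachInfo layout, and the completion function signature are assumptions for illustration and are not taken from vmiof.h.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the VAIODK status codes and info structure. */
typedef enum { SKETCH_SUCCESS = 0, SKETCH_ASYNC = 1 } SketchStatus;
typedef void (*SketchCompletionFunc)(void *data, SketchStatus status);

typedef struct {
   SketchCompletionFunc completionFunc;   /* NULL when async is not offered */
   void *completionData;
} SketchDetachInfo;

/* One deferred detach at most; a real filter would copy what it needs. */
static const SketchDetachInfo *sketchPendingDetach;

/* Detach callback: go asynchronous only if a completion function exists. */
SketchStatus
SketchDiskDetach(const SketchDetachInfo *info)
{
   if (info->completionFunc == NULL) {
      /* No completion function: finish all detach work before returning. */
      return SKETCH_SUCCESS;
   }
   sketchPendingDetach = info;
   return SKETCH_ASYNC;
}

/* Runs later, off the callback path, and reports the final status. */
void
SketchFinishPendingDetach(void)
{
   const SketchDetachInfo *info = sketchPendingDetach;

   if (info == NULL) {
      return;
   }
   sketchPendingDetach = NULL;
   info->completionFunc(info->completionData, SKETCH_SUCCESS);
}

/* Stand-in for the framework-supplied completion function: counts calls. */
static void
SketchCountCompletion(void *data, SketchStatus status)
{
   if (status == SKETCH_SUCCESS) {
      ++*(int *)data;
   }
}
```

The same guard applies to a progress function: test the pointer before every call rather than assuming the table's typical case.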

Table 4-1. Summary of Filter Library Callbacks

Callback                     Required / Optional   Sync / Async   Long-Running Operation Allowed
diskAttach                   Required              Sync           No
diskDetach                   Required              Async          Yes
diskOpen                     Required              Sync           No
diskClose                    Required              Sync           No
diskRelease                  Optional              Sync           No
diskPropertiesSet            Required              Sync           No
diskPropertiesGet            Required              Sync           No
diskPropertiesFree           Optional              Sync           No
diskPropertiesValid          Optional              Sync           No
diskRequirements             Required              Sync           No
diskSnapshot (PREPARE)       Optional              Async          Yes
diskSnapshot (NOTIFY)        Optional              Sync           No
diskSnapshot (FAIL)          Optional              Sync           No
diskCollapse                 Optional              Sync           No
diskGrow                     Optional              Sync           Yes
diskClone                    Optional              Sync           Yes
diskVmMigration (PREPARE)    Optional              Async          Yes
diskVmMigration (FAIL)       Optional              Sync           No
diskIOStart                  Required              Async          Yes
diskIOAbort                  Required              Sync           No
diskIOsReset                 Required              Sync           No
diskStun                     Optional              Sync           Yes
diskUnstun                   Optional              Sync           No
diskDeleteBlocksPrepare      Optional              Sync           No
diskDeleteBlocks             Optional              Sync           No
diskExtentGetPre             Optional              Sync           No
diskExtentGetPost            Optional              Sync           No

Manipulating per-VMDK Filter Properties using vmkfstools


vmkfstools is a VMware-supplied tool that calls the diskPropertiesSet() callback function of
an IO Filter. The callback processes the list of properties handed to it by
the Framework and chooses which ones to accept or modify. By default, the callback does not
process properties that are not defined, and it returns success for an empty list of properties. The usage of
vmkfstools is as follows:

vmkfstools --iofilters <filtername>:property1="value1":property2="value2":...:propertyN="valueN" <vmdk filename>

An example using a filter named countio with properties numWorkGroups and port would be:
n vmkfstools --iofilters countio:numWorkGroups="3":port="32769" iofilter.vmdk

You can set any number of properties with a single call to vmkfstools.
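On the filter side, a property-setting handler for the countio example might be sketched as follows. The types are hypothetical stand-ins rather than the VAIODK definitions, and the accepted ranges (1 to 64 work groups, ports 1 to 65535) are assumptions chosen for illustration. The sketch skips property names the filter does not define and succeeds on an empty list, matching the default behavior described above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real types come from vmiof.h. */
typedef enum { SKETCH_SUCCESS = 0, SKETCH_BAD_PARAM = 1 } SketchStatus;
typedef struct { const char *name; const char *value; } SketchProperty;
typedef struct { long numWorkGroups; long port; } SketchConfig;

/* Parse a decimal integer in [min, max]; return 1 on success, 0 otherwise. */
static int
SketchParseLong(const char *text, long min, long max, long *out)
{
   char *end = NULL;
   long parsed;

   if (text == NULL || *text == '\0') {
      return 0;
   }
   parsed = strtol(text, &end, 10);
   if (*end != '\0' || parsed < min || parsed > max) {
      return 0;
   }
   *out = parsed;
   return 1;
}

/* Apply a NULL-terminated property list; commit only if every value is valid. */
SketchStatus
SketchPropertiesSet(SketchConfig *cfg, const SketchProperty *const *properties)
{
   SketchConfig updated = *cfg;   /* staged copy of the current settings */
   size_t i;

   for (i = 0; properties != NULL && properties[i] != NULL; i++) {
      const SketchProperty *p = properties[i];

      if (p->name == NULL) {
         return SKETCH_BAD_PARAM;
      }
      if (strcmp(p->name, "numWorkGroups") == 0) {
         if (!SketchParseLong(p->value, 1, 64, &updated.numWorkGroups)) {
            return SKETCH_BAD_PARAM;
         }
      } else if (strcmp(p->name, "port") == 0) {
         if (!SketchParseLong(p->value, 1, 65535, &updated.port)) {
            return SKETCH_BAD_PARAM;
         }
      }
      /* Any other name is not defined by this filter and is skipped. */
   }
   *cfg = updated;
   return SKETCH_SUCCESS;
}
```

Staging the changes in a copy keeps the filter's settings untouched when any single value in the list is rejected.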

Creating a Skeletal Daemon Component


All Daemons must contain the following:

n #include(s) — The source must #include <vmiof.h> in order to have definitions for the macro you use
to enumerate the mandatory callbacks for Daemons

n Callbacks — The source must define code for certain mandatory callbacks. The IO Filter Framework
does not have optional callbacks for Daemons.

You must reference the source files that make up your daemon code in the SCONS file's Daemon Plugin
dictionary item, in the source files key.

The sub-sections that follow provide details for creating a skeletal Daemon source file.

NOTE The countIO sample provides a skeletal Daemon that you can copy into your filter and use as a
starting point.

Understanding and Invoking the VMIOF_DEFINE_DAEMON Macro


Each Daemon must invoke the VMIOF_DEFINE_DAEMON macro somewhere in its source files. The macro
expands to a data structure that, among other things, contains pointers to the required callback functions for
Daemons. Within the macro invocation, you assign each callback pointer to the callback function you write
in your daemon. This is how you expose your callback functions to the IO Filter Framework. On loading
your Daemon's .so file, iofilterd finds this structure and saves the pointers for the IO Filter
Framework to call when needed.

The countIO sample contains the following example of invoking VMIOF_DEFINE_DAEMON :

VMIOF_DEFINE_DAEMON(
.start = &SCDStart,
.stop = &SCDStop,
.cleanup = &SCDCleanup,
);

The next three sub-topics discuss the prototype of each function, when the IO Filter Framework invokes it,
and the typical things found in the body of each.

Understanding and Defining the Daemon Start Callback


The IO Filter Framework invokes a Daemon's start function when the Framework loads the Daemon after
IO Filter installation, update, or system boot.

The prototype for the start function is defined by the type VMIOF_DaemonOpsStartCB, which is defined as:

typedef VMIOF_Status (*VMIOF_DaemonOpsStartCB)(void);

That is, you define the start function similar to the following:

VMIOF_Status
daemonStart(void) {...}

Like a Unix / Linux daemon, an IO Filter Daemon's start function typically does the following:

n Make outbound connections — Only Daemons are allowed to open off-host TCP/IP connections, such
as to replication sites or management interfaces of storage devices, etc.

n Listen for inbound connections — Daemons typically accept connections from Library Instances and
possibly from the filter's CIM provider

n Initialize any global state or related data structures — This includes creating at least one heap from
which to dynamically allocate memory for data structures such as those used to keep track of the state
of each client connection. Daemons may also use System V synchronization primitives (semaphores,
shared memory, etc.) and pthread synchronization primitives (mutexes, etc.) to communicate with LIs
and possibly CIM providers.

Unlike a Unix / Linux daemon, the code in the start function must return a status of VMIOF_SUCCESS in a
timely manner to tell the IO Filter Framework that it has successfully completed performing the setup tasks
listed in the preceding list. Otherwise:

n If the Daemon returns a failure code, the IO Filter Framework logs the error and notes that the Filter is
not available. An administrator may diagnose and resolve the problem and then attempt to restart the
daemon manually from the ESXi CLI.

n If the Daemon fails to return any code in a timely manner, the IO Filter Framework kills and attempts to
restart the Daemon several times before eventually giving up, logging the error, and making the filter
unavailable.

Because of the need for start functions to return in a timely manner, if your Daemon uses sockets, you
cannot block waiting for connection requests on said sockets. Instead, you must use the VMIOF_PollAdd()
function to register callbacks that the IO Filter Framework invokes on receipt of IO on said sockets. The
callback for bound sockets typically:

1 Accepts connections (most commonly from an LI, but possibly from a CIM Provider)

2 Creates a data structure to keep track of the conversation with the client

3 Uses VMIOF_PollAdd() to register a different callback for processing the conversation with said client

NOTE On detecting a close on the socket by the client, the Daemon must remove the poll callback by
invoking VMIOF_PollRemove().
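The register, dispatch, and remove lifecycle can be sketched with plain POSIX poll(). This is not the VAIODK polling API: SketchPollAdd(), SketchPollRemove(), and SketchPollDispatch() are hypothetical stand-ins whose signatures are assumptions, and the dispatch function only imitates one pass of an event loop that the Framework would run for you. The accept step is omitted to keep the sketch short.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_MAX_FDS 8

typedef void (*SketchPollCB)(int fd, void *data);

static struct pollfd sketchFds[SKETCH_MAX_FDS];
static SketchPollCB sketchCbs[SKETCH_MAX_FDS];
static void *sketchData[SKETCH_MAX_FDS];
static nfds_t sketchCount;

/* Stand-in for a poll registration call: watch fd for readable input. */
int
SketchPollAdd(int fd, SketchPollCB cb, void *data)
{
   if (sketchCount == SKETCH_MAX_FDS) {
      return -1;
   }
   sketchFds[sketchCount].fd = fd;
   sketchFds[sketchCount].events = POLLIN;
   sketchFds[sketchCount].revents = 0;
   sketchCbs[sketchCount] = cb;
   sketchData[sketchCount] = data;
   sketchCount++;
   return 0;
}

/* Stand-in for a poll removal call: move the last entry into the gap. */
void
SketchPollRemove(int fd)
{
   nfds_t i;

   for (i = 0; i < sketchCount; i++) {
      if (sketchFds[i].fd == fd) {
         sketchCount--;
         sketchFds[i] = sketchFds[sketchCount];
         sketchCbs[i] = sketchCbs[sketchCount];
         sketchData[i] = sketchData[sketchCount];
         return;
      }
   }
}

/* One event-loop pass; returns the number of callbacks invoked. */
int
SketchPollDispatch(int timeoutMs)
{
   int fired = 0;
   nfds_t i = 0;

   if (poll(sketchFds, sketchCount, timeoutMs) <= 0) {
      return 0;
   }
   while (i < sketchCount) {
      int fd = sketchFds[i].fd;

      if (sketchFds[i].revents & (POLLIN | POLLHUP)) {
         sketchFds[i].revents = 0;
         sketchCbs[i](fd, sketchData[i]);
         fired++;
      }
      /* Advance only if the callback left this slot's entry in place. */
      if (i < sketchCount && sketchFds[i].fd == fd) {
         i++;
      }
   }
   return fired;
}

/* Per-client callback: consume input; when the peer closes, unregister. */
static void
SketchOnClientInput(int fd, void *data)
{
   char buf[64];
   ssize_t n = read(fd, buf, sizeof buf);
   int *bytesSeen = data;

   if (n > 0) {
      *bytesSeen += (int)n;
   } else {
      SketchPollRemove(fd);   /* peer closed or read failed: stop polling */
      close(fd);
   }
}
```

Because no callback blocks, the start function can register the listening descriptor and return promptly, leaving all further work to the dispatch path.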

For details on using the VMIOF_Poll*() functions, see “Understanding and Using the IO Filters Polling
Functions,” on page 141.

NOTE It is recommended that the daemon in your IO Filter Solution use SSL to establish secure connections
with its LIs and off-host peer daemons, and incorporate additional measures to authenticate such
connections, such as magic numbers.

Understanding and Defining the Daemon stop Callback


The IO Filter Framework invokes a Daemon's stop function when the Framework unloads the Daemon, for
example for IO Filter removal, update, or system shutdown/reboot. If an entity (SPBM API on vCenter, ESXi
CLI command, etc.) requests that ESXi remove a VIB associated with an IO Filter, ESXi takes the following
actions:

1 Check if there are any VMs using the LI of the filter. If so, ESXi refuses the request.

2 Tell the IO Filter Framework to stop the daemon (by invoking the stop callback).

When the Framework invokes the function, it passes a pointer to a function (called the stopped function)
that the daemon must call when it has finished stopping. Upon receipt of the stopped callback, the
Framework signals ESXi that daemon shutdown is complete.

3 Remove the VIB and its artifacts.

The prototype for the stop function is defined by the type VMIOF_DaemonOpsStopCB, which is defined as:

typedef void (*VMIOF_DaemonOpsStopCB)(VMIOF_DaemonStoppedCB stopped, void *data);

That is, you define the stop function similar to the following:

void
daemonStop(VMIOF_DaemonStoppedCB stopped, void *data) {...}

The body of this function must undo whatever the start function and subsequent operations have done,
including:

n Signal any VMs with which the daemon is connected that it is shutting down

n Flush the cache

n Close any open virtual disks

n Remove poll callbacks from the socket file descriptors

n Close socket file descriptors

n Free memory associated with client connections

n Destroy the heap(s) that may have been created

n Call the stopped function

The stop function cannot be long running. However, the tasks in the preceding list may take some time.
Thus, this function typically creates a work group and queues an asynchronous function to complete said
tasks. If the stop function uses this pattern, ensure that the worker function calls the stopped function.
Otherwise, the stop function must call the stopped function itself.
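The deferred pattern can be sketched without any VAIODK types. In this illustration, SketchStoppedCB is an assumed shape for the stopped function (the real VMIOF_DaemonStoppedCB signature is not shown in this guide), a single-slot request stands in for a work group, and SketchStopWorker() stands in for the queued asynchronous function.

```c
#include <stddef.h>

/* Assumed shape of the stopped function; the real type is in vmiof.h. */
typedef void (*SketchStoppedCB)(void *data);

typedef struct {
   SketchStoppedCB stopped;
   void *data;
   int pending;
} SketchStopRequest;

static SketchStopRequest sketchStop;
static int sketchOpenConnections = 2;   /* pretend state built up since start */

/* Stop callback: record the request and return immediately. */
void
SketchDaemonStop(SketchStoppedCB stopped, void *data)
{
   sketchStop.stopped = stopped;
   sketchStop.data = data;
   sketchStop.pending = 1;
}

/* Worker: runs later, performs the teardown, then reports completion. */
void
SketchStopWorker(void)
{
   if (!sketchStop.pending) {
      return;
   }
   sketchOpenConnections = 0;             /* stands in for the teardown list */
   sketchStop.pending = 0;
   sketchStop.stopped(sketchStop.data);   /* always the last step */
}

/* Stand-in for the framework-supplied stopped function: sets a flag. */
static void
SketchMarkStopped(void *data)
{
   *(int *)data = 1;
}
```

Calling the stopped function from the worker, after the teardown, is what lets the stop callback itself stay short.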

NOTE In some caching filter design patterns, under certain conditions, daemons on one host act as caches
for VMs running on other hosts. In this case, the daemon must still perform all the steps listed in the
preceding list before ceasing operation.

NOTE The daemon keeps running when ESX enters Maintenance Mode.

Handling Removal vs Upgrade


The vCenter Server can ask ESXi hosts to remove IO Filter VIBs from a host for two reasons:

n The administrator is removing the solution entirely

n The administrator is upgrading the solution from one version to another

It may be desirable for the IO Filter Solution to differentiate between these two conditions. For example,
during a complete removal, the logic in a daemon's stop function (and associated code) should delete cache
files from a host's SSD(s). On an upgrade, the logic may leave the cache files in place.

The IO Filter Framework does not provide a reason for the invocation when it calls stop (host shutdown,
filter upgrade, filter removal, etc.). If a filter requires this reason (context), the IO Filter Solution should
include a VWC plugin that drives its update and removal. Such a plugin can send context to the daemon
(through the CIM provider) before invoking the SPBM API to update or remove the filter. The stop function
can then check for this context to determine what action to take, and erase the context as part of its task
list. If the stop function finds no context when it is invoked, it can assume a default context of host shutdown.
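A minimal sketch of this context handling follows. Every name in it is hypothetical: the enum values, the setter that a CIM-provider message handler would call, and the query used by the stop path are illustrations only. A real filter might persist the context in a file rather than in a variable.

```c
/* Hypothetical reasons a daemon might be stopped. */
typedef enum {
   SKETCH_CTX_SHUTDOWN = 0,   /* default when no context was sent */
   SKETCH_CTX_UPGRADE,
   SKETCH_CTX_REMOVAL
} SketchStopContext;

static SketchStopContext sketchContext = SKETCH_CTX_SHUTDOWN;

/* Called when the plugin's message arrives through the CIM provider. */
void
SketchSetStopContext(SketchStopContext ctx)
{
   sketchContext = ctx;
}

/*
 * Called from the stop path. Returns 1 if cache files should be deleted.
 * The stored context is erased so that a later stop without a fresh
 * message falls back to the host-shutdown default.
 */
int
SketchStopShouldDeleteCache(void)
{
   SketchStopContext ctx = sketchContext;

   sketchContext = SKETCH_CTX_SHUTDOWN;
   return ctx == SKETCH_CTX_REMOVAL;
}
```

Only the removal context deletes the cache here, so both an upgrade and an ordinary host shutdown leave the cache files in place.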

Understanding and Defining the Daemon cleanup Callback


The IO Filter Framework invokes a Daemon's cleanup function after said Framework calls the Daemon's
stop function. It provides the Daemon one last opportunity to perform any "cleanup" that the daemon could
not or did not perform during stop.

The prototype for the cleanup function is defined by the type VMIOF_DaemonCleanupCB, which is defined as:

typedef void (*VMIOF_DaemonCleanupCB)(void);

That is, you define the cleanup function similar to the following:

void
daemonCleanup(void) {...}

Creating a Skeletal CIM Provider


VMware does not define the CIM classes of data, or the relationships, that each IO Filter must expose
through its CIM provider. While you are welcome to define your own CIM classes, to get started use the
skeletal CIM Provider (CIMP) code provided in the IO Filter Samples. Copy the entire CIM
development context from one of the samples to your own filter's source folder.

REMEMBER The CIMP code for IO Filter examples is located in the directory cim/name (where name is the
sample filter name), within the sample filter.

For example, to copy the CIM code from the sampfilt sample into your filter's source directory, follow these
steps:

1 Change directory to your filter's source directory.

2 Make a directory for the CIM content, using the folder name specified in the CIM dictionary definition,
in the cim location key (e.g. cim/myfilter1). For example:

mkdir -p cim/myfilter1

3 Change into that directory. For example:

cd cim/myfilter1

4 Copy the files from an existing sample. For example:

cp -R /opt/vmware/VAIO-6*/src/partners/iofilters/sampfilt/cim/sampfilt/ .

5 In each file in which the filter name occurs, replace the name of the filter from which you copied the
CIM code with your filter's name.

Currently, those files include: Makefile, src/configure.ac, src/Makefile.am, src/sampfilt.mof,
src/sampfilt-Provider.c, src/sampfilt.reg, and src/sampfilt.wsman

Understanding How to Localize your Solution


The VWC exposes the name and a short description of each property of a filter defined in the
CAPABILITIES section of its .json file (see “Creating and Populating a Correct .json File,” on page 98). The
VWC displays the property names and descriptions whenever a user creates or edits a SPBM policy that
uses the IO Filter. As the VWC itself supports localization (by default with the following locales: English /
en_US; German / de_DE; Japanese / ja_JP; Simplified Chinese / zh_CN; French / fr_FR; and Korean / ko_KR),
you should consider providing localized versions of your filter's properties for each of these locales (plus
any other locales where you have large customers). Currently, you are only required to provide an English /
en localized version of your properties' names and descriptions.

To create a localized version of your filter's properties, do two things:

1 Create an entry for the locale in the CATALOGS section of the .json file that specifies the locale you are
adding and an associated catalog file.

2 Create the catalog file you gave in the entry.

The following subsections discuss each of these in greater detail.

NOTE Currently, you are only required to provide localization for English, using the "en" locale, which is
what the VWC defaults to if it cannot find localization files for the locale currently set in the browser in
which it is running.

Creating New Entries in the CATALOGS section of the .json File for Each Locale
Each sample filter's .json file has the following CATALOGS definition in the PROPERTIES section:

"CATALOGS" : {
"en" : "catalogs/catalog_en.vmsg"
}

This definition defines a single locale, en (English), and specifies that the catalog file for this locale is located
in the catalogs/catalog_en.vmsg file (relative to the scons root directory of the filter).

To add a new locale entry, add a new line within the CATALOGS definition. The format for each
entry is:

n locale — a string, within double quotes, specifying the locale defined by ISO 639

n colon (:) — a separator between strings

n filename — a double-quoted string specifying the pathname to the catalog file. The pathname is relative
to the scons root of the filter. The format of this file is defined in the next section.

For example, the following CATALOGS definition provides entries for the six locales supported by the
VWC, plus the default entry:

"CATALOGS" : {
   "en" : "catalogs/catalog_en.vmsg",
   "en_US" : "catalogs/catalog_en_US.vmsg",
   "de_DE" : "catalogs/catalog_de_DE.vmsg",
   "ja_JP" : "catalogs/catalog_ja_JP.vmsg",
   "zh_CN" : "catalogs/catalog_zh_CN.vmsg",
   "fr_FR" : "catalogs/catalog_fr_FR.vmsg",
   "ko_KR" : "catalogs/catalog_ko_KR.vmsg"
}

Remember, for each locale / catalog pair, you must also create the indicated catalog file as discussed in the
next section.

Creating Localization Catalog Files


Catalog files have a key-value-pair format where the keys and values are separated by an equal sign (=).
Each catalog begins with two entries that provide the name and description of the filter itself. The rest of the
catalog must contain two entries for each filter property (that is, each entry in the CAPABILITIES section of
the filter's .json file).

The following content is taken from the countIO sample's .json file:

"CAPABILITIES": {
   "acceleration": {
      "min": 1,
      "max": 5,
      "default": 3
   }
}

This section defines just one property, acceleration.

The following content is taken from the countIO sample's "en" catalog file:

name.label = countio
name.description = Hello, I am the countio IO filter.

capability.acceleration.label = acceleration
capability.acceleration.description = acceleration level

This file has the following entries:

n name.label and name.description — These two keys define the label and description to display for an
IO Filter. For example, the French version of name.label could be: Nombre IO, and the name.description:
Bonjour, Je suis le Nombre IO filtre.

n capability.acceleration.label and capability.acceleration.description — These two keys define
the label and description for the only capability defined in the json file (acceleration). In general, for
each capability defined in the json file, you create two such keys. Each key should have the format:

capability.xxx.label = something
capability.xxx.description = something else

where:

n capability, label, and description are all literals.

n xxx is the name of the capability / property given in the CAPABILITIES section of the .json file

n something and something else are the values assigned to the keys. NOTE that you are not required
to put the values in quotes. If you do, the quotes themselves become part of the value.
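To make the key = value format concrete, here is a minimal sketch of how one catalog line could be split. It is not the VWC's actual parser: the function names are hypothetical, and trimming the whitespace around the equal sign is an assumption based on the layout of the sample catalog. Quotes are deliberately left untouched, so they remain part of the value.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Strip leading and trailing whitespace in place; returns the new start. */
static char *
SketchTrim(char *text)
{
   char *end;

   while (isspace((unsigned char)*text)) {
      text++;
   }
   end = text + strlen(text);
   while (end > text && isspace((unsigned char)end[-1])) {
      end--;
   }
   *end = '\0';
   return text;
}

/*
 * Split one catalog line of the form "key = value" in place at the first
 * equal sign. Returns 0 and sets *key and *value on success, or -1 if the
 * line has no equal sign or an empty key.
 */
int
SketchParseCatalogLine(char *line, char **key, char **value)
{
   char *eq = strchr(line, '=');

   if (eq == NULL) {
      return -1;
   }
   *eq = '\0';
   *key = SketchTrim(line);
   *value = SketchTrim(eq + 1);
   return **key == '\0' ? -1 : 0;
}
```

For the line name.label = "countio", this sketch yields a nine-character value that includes both quote characters.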

NOTE In ESXi 6.0 U1 and 6.0 U2, there is a known issue in which your customers may see the filter name shown as
"iofilters.disk.<filter-name>.name.label" in the VWC instead of the name defined in the catalog file. This is a bug in
the VWC. The workaround is to log out of the VWC and log in again. This bug has been fixed for the ESXi 2016 release.

Debugging IO Filter Core Dumps


This topic provides steps for starting a debugging session to analyze core dumps of user space components
of IO Filter Solutions running on an ESXi host. It explains how to find the core file on the host, transfer it to
the right place in your development environment, and invoke gdb through a special VAIODK-provided
script so that you can perform debugging.

Since this topic involves running commands in multiple environments, the commands shown use different
prompts to indicate both platform and authentication required to perform the task. Specifically:

n DEV$ — You are an ordinary user (or root) on the development platform (for example VMware
Workbench)

n ESXi# — You are root on the ESXi host

Whenever a user-space cartel generates a core dump, ESXi generates a core file of the form command-zdump.#
where command is the name of the program that dumped core, and # is a 3-digit number that sequences the
core files (e.g. 001, 002, etc.). ESXi places many core files in the /var/core directory. One exception is that the
core files for VMX cartels are located in the same folder as the VM's .vmx file. Also in the case of VMX
dumps, the command part of the file is vmx-debug, not just vmx, for example vmx-debug-zdump.001.
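The naming convention can be illustrated with a small sketch that splits a core file name into its command part and sequence number. The function name is hypothetical and the code is not part of the VAIODK. It assumes exactly the command-zdump.### form described above, with a three-digit sequence.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "command-zdump.NNN" into the command (copied into cmd) and the
 * sequence number (returned). Returns -1 if the name does not match the
 * pattern or the command does not fit into cmd.
 */
int
SketchParseZdumpName(const char *name, char *cmd, size_t cmdSize)
{
   static const char marker[] = "-zdump.";
   const char *hit = NULL;
   const char *p = name;
   const char *digits;
   size_t cmdLen;

   /* Use the last marker so commands such as "vmx-debug" parse correctly. */
   while ((p = strstr(p, marker)) != NULL) {
      hit = p;
      p++;
   }
   if (hit == NULL || hit == name) {
      return -1;
   }
   digits = hit + strlen(marker);
   if (strlen(digits) != 3 || strspn(digits, "0123456789") != 3) {
      return -1;
   }
   cmdLen = (size_t)(hit - name);
   if (cmdLen >= cmdSize) {
      return -1;
   }
   memcpy(cmd, name, cmdLen);
   cmd[cmdLen] = '\0';
   return atoi(digits);
}
```

For vmx-debug-zdump.001, this yields the command vmx-debug and sequence number 1, which matches the VMX naming exception noted above.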

Once ESXi generates the core file, to use gdb to debug the issue, follow these steps:

1 Copy the zdump file from the ESXi host to the directory on the development platform that contains the
IO filter project's .sc file. The following example shows copying the dump file while logged into the
DEV system and in the IO Filter's root directory:

DEV$ scp root@<esxi-host>:/var/core/vmkfstools-zdump.001 .

2 In the development machine directory containing the zdump file, type make prep-debug. This displays a
list of the zdump files in the directory and a prompt for specifying the zdump.

NOTE You must run make prep-debug each time you want to examine a new core file.

For example:

DEV$ make prep-debug


00:39:17.875 scons INFO Reading SConscript files ...
00:39:18.009 main INFO loading #scons/init.sc
00:39:18.014 main INFO Searching tree 'bora' for products
00:39:18.021 locals INFO Setting undeclared build options:
00:39:18.021 locals INFO
VIRTUAL_GOBUILD=/opt/vmware/vaiodk-6.0.0-2250664/tools/gobuild/vmiofdk.cache
00:39:18.024 main INFO Setting product to vmiofdk
00:39:18.028 main INFO Enabling SCons implicit dependency caching
...
00:39:18.346 env INFO Customizing the default environment for host linux32
00:39:18.347 env WARNING Product replacing default environment
Select a zdump to debug:
iofilterd-zdump.000 Wed 05 Nov 2014 12:33:22 AM UTC
iofilterd-zdump.001 Wed 05 Nov 2014 12:33:22 AM UTC
Enter some matching characters of a pattern:

3 At the prompt, enter matching characters for the zdump you want and press <ENTER>. This creates a
dstage directory with the necessary tools to run the gdb session.

Example of extracting the zdump:

Select a zdump to debug:


iofilterd-zdump.000 Wed 05 Nov 2014 12:33:22 AM UTC
iofilterd-zdump.001 Wed 05 Nov 2014 12:33:22 AM UTC
Enter some matching characters of a pattern: 000
Using iofilterd-zdump.000 ...
Inferred binary "/usr/lib/vmware/iofilter/bin/iofilterd" from zdump "iofilterd-zdump.000"
Using mirror binary
"/workspace/acme_sampcache/sampcache/dstage//usr/lib/vmware/iofilter/bin/iofilterd"
Extracted dump file
Saving /opt/vmware/vaiodk-6.0.0-2250664/tools/../debug/vmkgdb64-7.2 --command
/workspace/acme_sampcache/sampcache/dstage/gdb.cmd

4 Navigate to the dstage directory and start the gdb session by typing ./gdbiof

Example of starting the gdb Session:

DEV$ cd /workspace/acme_sampcache/sampcache/dstage
DEV$ ./gdbiof
Traceback (most recent call last):
File "<string>", line 35, in <module>
File "/usr/share/gdb/python/gdb/__init__.py", line 23, in <module>
'gdb.function': os.path.join(gdb.PYTHONDIR, 'gdb', 'function'),
NameError: name 'os' is not defined
GNU gdb (GDB) 7.2.0.20100903-cvs (build 2014-05-12)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
warning: core file may not match specified executable file.
[New Thread 1000024241]
[New Thread 1000024243]
Core was generated by `no arguments'.
Program terminated with signal 32, Real-time event 32.
#0 0x000000000e58fbb2 in _start () from
/workspace/acme_sampcache/sampcache/dstage/lib64/ld-linux-x86-64.so.2
(gdb)

5 Begin debugging using gdb commands.

Example of Debugging with gdb Commands:

(gdb) bt
#0 0x05d21092 in _dl_sysinfo_int80 () from
/workspace/acme_sampcache/sampcache/dstage/lib/ld-linux.so.2
#1 0x0bb38ca5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
#2 0x0bb3a4e3 in abort () at abort.c:92
#3 0x0d40594d in TestDiskOpen (handle=0x7d30cde8, diskInfo=0xff9d5904) at
/workspace/acme_sampcache/sampcache/sampcache.c:65
#4 0x0b655eaf in FiltLibDiskOpenFilter (dh=0x7d30e1b4, dlInfo=0x7d30f8c0,
iofilters=0x7d30fb70 "countIO", lightWeightOpen=0 '\000',
outCtx=0x7d30e1e4) at bora/lib/filtlib/filtlibDisk.c:1118
#5 FiltLibDiskOpenAllFilters (dh=0x7d30e1b4, dlInfo=0x7d30f8c0, iofilters=0x7d30fb70
"countIO", lightWeightOpen=0 '\000', outCtx=0x7d30e1e4) at
bora/lib/filtlib/filtlibDisk.c:1154
#6 FiltLibCreateContextFromFiltersInt (dh=0x7d30e1b4, dlInfo=0x7d30f8c0,
iofilters=0x7d30fb70 "countIO", lightWeightOpen=0 '\000',
outCtx=0x7d30e1e4) at bora/lib/filtlib/filtlibDisk.c:1287
#7 FiltLib_CreateContextFromFilters (dh=0x7d30e1b4, dlInfo=0x7d30f8c0,
iofilters=0x7d30fb70 "sampcache", lightWeightOpen=0 '\000',
outCtx=0x7d30e1e4) at bora/lib/filtlib/filtlibDisk.c:1368
#8 0x0b656181 in FiltLib_CreateContext (dh=0x7d30e1b4, dlInfo=0x7d30f8c0,
lightWeightOpen=0 '\000', outCtx=0x7d30e1e4) at
bora/lib/filtlib/filtlibDisk.c:1420
#9 0x0b59ff41 in DiskLibFiltLibInit (diskHandle=0x7d30e1b4, dlInfo=0x7d30f8c0,
forceInit=0 '\000') at bora/lib/disklib/diskLib.c:6241
...
#50 0x0555d7d9 in HostdApp::Run (upgradeVMs=false, autoStartVMs=false) at
bora/vim/hostd/main/hostdApp.cpp:892
#51 0x0555598e in HostdMain (argc=2, argv=0xff9d6e8c) at bora/vim/hostd/main/main.cpp:411
#52 0x047f07c5 in main (argc=2, argv=0xff9d6e8c) at
bora/vim/hostd/main/static/main.cpp:62
(gdb) quit

Live Debugging
While writing your IO Filter solution, you may need to debug a process
live. For this purpose, the VAIODK supports remote live debugging of vmkfstools, iofilterd, test applications
developed for IO Filters, and the vmx process. This section describes the steps for performing live debugging.

Since this topic involves running commands in multiple environments, the commands shown use different
prompts to indicate both platform and authentication required to perform the task. Specifically:
n DEV$ — You are an ordinary user (or root) on the development platform (for example VMware
Workbench)

n ESXi# — You are root on the ESXi host

This section assumes that you have access and credentials for the ESXi host on which you want to perform
live debugging. You must also understand the VAIODK framework and concepts such as vmx,
daemon, hostd, etc. before proceeding.

To enable remote live-debugging, follow these steps:

1 Navigate to your IO Filter source directory on your development environment.

DEV$ cd /opt/vmware/vaiodk-6.0.0-2799832/src/partners/samples/iofilter/sampfilt

2 Run the make command with the live-debug target.

DEV$ make live-debug

3 When you run the make command, you are prompted to enter the ESXi hostname or IP address. Enter
the relevant information and provide the password when prompted.

enter an ESXi hostname or ip address: 10.112.81.235


Checking for ssh RSA key on ESXi host 10.112.81.235
Enter the ESXi root password if prompted ......Done
Starting live debugging session for filter: sampfilt

4 You will be prompted to enter the cartel id/name to which you want to attach the gdb process. If you
want to debug the LI context, you can choose the vmx process. If your filter has a daemon component,
you can choose the daemon process to debug.

Select a remote cartel or command to debug:


==> 1000752474 vmx-debug
/bin/vmx-debug -s sched.group=host/user -# product=2;name=VMware
ESX;version=6.0.0;buildnumber=2799832;licensename=VMware ESX
Server;licenseversion=6.0; -@ transport=3;msgs=ui /vmfs/volumes/558412ff-97d536e3-
d6e6-00505694bc5d/Test_1/Test.vmx
==> utility /bin/vmkfstools

Enter some matching characters or the cartel id of interest: vmx


Matched 1000752474 vmx-debug
/bin/vmx-debug -s sched.group=host/user -# product=2;name=VMware
ESX;version=6.0.0;buildnumber=2799832;licensename=VMware ESX
Server;licenseversion=6.0; -@ transport=3;msgs=ui /vmfs/volumes/558412ff-97d536e3-
d6e6-00505694bc5d/Test_1/Test.vmx ...

Using command path '/bin/vmx-debug' for cartel '1000752474'.


Fetching the build type and build number from ESXi host 10.112.81.235
Found build type beta.
Found build number 2799832 for ESXi host 10.112.81.235.
Enabling the firewall rule for gdbserver on ESXi host 10.112.81.235.
Disabling the iofilter watchdog on host 10.112.81.235.
Using /opt/vmware/vaiodk-symbols-6.0.0-2799832 for its symbols
Exact match for remote host build number 2799832
Using /opt/vmware/cimpdk-6.0.0-2799832 for its symbols
Exact match for remote host build number 2799832
Saving ssh root@10.112.81.235 /bin/gdbserver --attach :50000 1000752474 &
Saving /opt/vmware/vaiodk-6.0.0-2799832/tools/../debug/vmkgdb64-7.2 --command /opt/vmware/vaiodk-6.0.0-2799832/src/partners/samples/iofilter/sampfilt/dstage/gdb.cmd

5 Now navigate to the directory named dstage that was created in your working folder.

DEV$ cd dstage

6 Run the script gdbiof to attach the gdb process to the cartel you selected in Step 4 above.

DEV$ ./gdbiof
Traceback (most recent call last):
File "<string>", line 35, in <module>
File "/usr/share/gdb/python/gdb/__init__.py", line 23, in <module>
'gdb.function': os.path.join(gdb.PYTHONDIR, 'gdb', 'function'),
NameError: name 'os' is not defined
GNU gdb (GDB) 7.2.0.20100903-cvs (build 2014-05-12)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attached; pid = 1000752474
Listening on port 50000
Remote debugging from host 10.112.82.247
warning: Can not parse XML target description; XML support was disabled at compile time
[New Thread 1000752474]
Created trace state variable $trace_timestamp for target's variable 1.
(gdb)
[Thread 1000752474] #1 stopped.
0x0000000022dcd758 in ?? ()
warning: .dynamic section for "/lib64/libgcc_s.so.1" is not at the expected address (wrong
library or version mismatch?)
warning: Could not load shared library symbols for 8 libraries, e.g. /lib64/libX11.so.6.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?

7 gdb stops the process in ppoll, where it is waiting for input.

(gdb) bt
#0 0x0000000022dcd758 in ppoll (fds=0x3ffe671da08, nfds=12, timeout=0x3ffe671b960,
sigmask=0x0) at
../sysdeps/unix/sysv/linux/ppoll.c:58
#1 0x00000000204db2a4 in PollExecuteDevice (polltab=0x3ffe694e010, class=POLL_CLASS_MAIN,
timeout=<value optimized out>)
at bora/vmx/main/pollVMX.c:3012
#2 0x00000000204dbb27 in PollVMXLoopTimeout (loop=0 '\000', exit=0x218f3245 "",
class=<value optimized out>, timeout=1000000)
at bora/vmx/main/pollVMX.c:2255
#3 0x00000000204c7eac in VMXPoweredOnLoop () at bora/vmx/main/vmx.c:2394
#4 VMXPowerOnMainThread () at bora/vmx/main/vmx.c:2307
#5 VMX_Loop () at bora/vmx/main/vmx.c:244
#6 0x00000000204c3aa6 in MainRun (ac=<value optimized out>, av=0x3ffe671edf8) at
bora/vmx/main/main.c:2276
#7 main (ac=<value optimized out>, av=0x3ffe671edf8) at bora/vmx/main/main.c:575
(gdb) info threads
[New Thread 1000752510]
[New Thread 1000752509]
[New Thread 1000752508]
[New Thread 1000752507]
[New Thread 1000752505]
[New Thread 1000752497]
7 Thread 1000752497 (running)
6 Thread 1000752505 (running)
5 Thread 1000752507 (running)
4 Thread 1000752508 (running)
3 Thread 1000752509 (running)
2 Thread 1000752510 (running)
* 1 Thread 1000752474 0x0000000022dcd758 in ppoll (fds=0x3ffe671da08, nfds=12,
timeout=0x3ffe671b960, sigmask=0x0)
at ../sysdeps/unix/sysv/linux/ppoll.c:58

8 You can now use gdb commands and set breakpoints where desired. Note that the process has multiple
threads, and stopping in one thread does not stop processing on the other threads. Once execution
reaches the breakpoint that you set, you need to identify the thread that is executing the
operation.

(gdb) b SampleFilterDiskStartIO
Breakpoint 1 at 0x23cf0070: file partners/samples/iofilter/sampfilt/sampfilt.c, line 200.
(gdb) c
Continuing.
^C
[Thread 1000752474] #1 stopped.
0x0000000022dcd758 in ppoll (fds=0x3ffe671da08, nfds=12, timeout=0x3ffe671b960, sigmask=0x0)
at ../sysdeps/unix/sysv/linux/ppoll.c:58
58 ../sysdeps/unix/sysv/linux/ppoll.c: No such file or directory.
in ../sysdeps/unix/sysv/linux/ppoll.c
(gdb) info threads
7 Thread 1000752497 (running)
6 Thread 1000752505 SampleFilterDiskStartIO (handle=0x32abf7d0, io=0xbbaec230) at
partners/samples/iofilter/sampfilt/sampfilt.c:200
5 Thread 1000752507 (running)
4 Thread 1000752508 (running)
3 Thread 1000752509 (running)
2 Thread 1000752510 (running)
* 1 Thread 1000752474 0x0000000022dcd758 in ppoll (fds=0x3ffe671da08, nfds=12,
timeout=0x3ffe671b960, sigmask=0x0)
at ../sysdeps/unix/sysv/linux/ppoll.c:58
(gdb) thread 6
[Switching to thread 6 (Thread 1000752505)]#0 SampleFilterDiskStartIO (handle=0x32abf7d0,
io=0xbbaec230)
at partners/samples/iofilter/sampfilt/sampfilt.c:200
200 {

9 Once you are in the correct thread-context, you can step through the function/code using gdb
commands.

(gdb) bt
#0 SampleFilterDiskStartIO (handle=0x32abf7d0, io=0xbbaec230) at
partners/samples/iofilter/sampfilt/sampfilt.c:200
#1 0x0000000020b07902 in FiltLibWrapIOStart (handle=0x32abf7d0, io=0xbbaec230) at
bora/lib/filtlib/filtlibTrace.c:212
#2 0x0000000020b03c2e in FiltLibDiskIOHandleIO (io=0xbc3003b0) at
bora/lib/filtlib/filtlibDiskIO.c:133
#3 0x0000000020b092a8 in FiltLibUpcallProcessIOs (data=0x32ac37e0) at
bora/lib/filtlib/filtlibUpcall.c:685
#4 FiltLibUpcallDiskIOFunc (data=0x32ac37e0) at bora/lib/filtlib/filtlibUpcall.c:842
#5 0x0000000000000000 in ?? ()
(gdb) n
201 VMIOF_Log(VMIOF_LOG_INFO, "In the callback %s\n", __func__);
(gdb)
200 {
(gdb) c
Continuing.

Live-debugging a test-app

You can also write your own test-app to test your IO Filter solution. You write the test-app in a
directory named <tests> in your <filter> directory. In addition to the files that contain the code for your
test-app, you also create a subdir.json file that defines the rules for compiling your test-app. You also
need to add the compilation rules to the scons files defined in your <filter> directory; see “Creating and
Populating a Correct Scons File,” on page 89. More information on how to build your test-app can be
found in the document “vSphere APIs for IO Filtering Development Kit (VAIODK) Guide for the Command
Line”.

When you build your IO Filter by running the make command, the test-app is also built. You can verify this
by navigating to the build folder inside the <filter> directory and checking for the <test-apps> folder. The
binary for the test-app resides inside this folder.

DEV$ pwd
/opt/vmware/vaiodk-6.0.0-2799832/src/partners/samples/iofilter/sampfilt/build
DEV$ ls
.cimpdk_clean .cimpdk_stage bundle catalogs cim config init.d shutdown.d test-apps usr
vib
DEV$ cd test-apps
DEV$ ls
ZZZ_corp_tests

Please note the following before live-debugging the test-app.

1 Copy the binary for the test-app to the ESX host.

DEV$ scp sampfilt-test1 root@<ESXi-host>:/tmp/
sampfilt-test1 100% 13KB 13.1KB/s 00:00

2 On the ESX host, start the test-app so that it runs as a process to which the gdb process can attach
remotely. For example, sampfilt-test1 exits right after printing some data, so it must be invoked
repeatedly from a script:

#!/bin/bash
while :
do
    ./sampfilt-test1
    sleep 1
done

To enable remote live-debugging on the test-app, follow these steps:

1 From your <filter> directory, run make live-debug and select the test-app when prompted. Note: For this
example, "vmkfstools -v" is also running in a script.

DEV$ cd /opt/vmware/vaiodk-6.0.0-2799832/src/partners/samples/iofilter/sampfilt
DEV$ make live-debug
enter an ESXi hostname or ip address: 10.112.80.113
Checking for ssh RSA key on ESXi host 10.112.80.113.
Enter the ESXi root password if prompted ......Done
Starting live debugging session for filter: sampfilt

Select a remote cartel or command to debug:

==> utility /bin/vmkfstools


==> test app /ZZZ_corp_tests/sampfilt-test1

Enter some matching characters or the cartel id of interest: test

Matched test app /ZZZ_corp_tests/sampfilt-test1 ...


Enter command arguments (if any):
Using command: /ZZZ_corp_tests/sampfilt-test1
Fetching the build type and build number from ESXi host 10.112.80.113.
Found build type beta.
Found build number 2697104 for ESXi host 10.112.80.113.
Enabling the firewall rule for gdbserver on ESXi host 10.112.80.113.
Disabling the iofilter watchdog on host 10.112.80.113.
Using /opt/vmware/vaiodk-symbols-6.0.0-2697104 for its symbols
Exact match for remote host build number 2697104
Using /opt/vmware/cimpdk-6.0.0-2697104 for its symbols
Exact match for remote host build number 2697104
Saving ssh root@10.112.80.113 /bin/gdbserver :50000 /ZZZ_corp_tests/sampfilt-test1 &
Saving /opt/vmware/vaiodk-6.0.0-2697104/tools/../debug/vmkgdb64-7.2 --
command /opt/vmware/vaiodk-6.0.0-2697104/src/partners
/samples/iofilter/sampfilt/dstage/gdb.cmd

2 Navigate to the dstage directory and run the gdbiof script.

DEV$ cd dstage
DEV$ ls
ZZZ_corp_tests bin etc gdb.cmd gdbiof include init.d lib lib64 sbin share
shutdown.d test-apps usr usr64 var
DEV$ ./gdbiof
Traceback (most recent call last):
File "<string>", line 35, in <module>
File "/usr/share/gdb/python/gdb/__init__.py", line 23, in <module>
'gdb.function': os.path.join(gdb.PYTHONDIR, 'gdb', 'function'),
NameError: name 'os' is not defined
GNU gdb (GDB) 7.2.0.20100903-cvs (build 2014-05-12)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Process /ZZZ_corp_tests/sampfilt-test1 created; pid = 1004367573
Listening on port 50000
Remote debugging from host 10.112.80.155
warning: Can not parse XML target description; XML support was disabled at compile time
[New Thread 1004367573]
Created trace state variable $trace_timestamp for target's variable 1.

3 Once the gdb process is attached to the test-app, you can use gdb commands to do further processing.

(gdb)
[Thread 1004367573] #1 stopped.
0x0000000020a6d270 in ?? ()
bt
#0 0x0000000020a6d270 in _start ()
from /opt/vmware/vaiodk-6.0.0-2697104/src/partners/samples/iofilter/sampfilt/dstage/lib64/ld-
linux-x86-64.so.2
#1 0x0000000000000001 in ?? ()
#2 0x000003fff736ff11 in ?? ()
#3 0x0000000000000000 in ?? ()
(gdb) b main
Breakpoint 1 at 0x2086a6b0: file partners/samples/iofilter/sampfilt/tests/test1.c, line 13.
(gdb) c
Continuing.
warning: Could not load shared library symbols for 2 libraries, e.g. /lib64/libvmiof.so.0.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?

Breakpoint 1, main () at partners/samples/iofilter/sampfilt/tests/test1.c:13


13 {
(gdb) bt
#0 main () at partners/samples/iofilter/sampfilt/tests/test1.c:13
(gdb) n
14 printf("Hello world, my name is: %s\n", VMIOF_XSTR(VMIOF_NAME));
(gdb)
13 {
(gdb)
14 printf("Hello world, my name is: %s\n", VMIOF_XSTR(VMIOF_NAME));
(gdb)
15 printf("Hello world, my vendor is: %s\n", VMIOF_XSTR(VMIOF_VENDOR));
(gdb)
16 printf("Hello world, my version is: %s\n", VMIOF_XSTR(VMIOF_VERSION));
(gdb)
17 printf("Hello world, my class is: %s\n", VMIOF_XSTR(VMIOF_CLASS));
(gdb)
18 printf("Hello world, my class is: %s\n", VMIOF_XSTR(VMIOF_CLASS));
(gdb)
19 printf("Hello world, my 1a include is: %s\n", STRING_1a);
(gdb)
20 printf("Hello world, my 1b include is: %s\n", STRING_1b);
(gdb) Quit
(gdb) q
A debugging session is active.

Inferior 1 [Remote target] will be killed.

Quit anyway? (y or n) [answered Y; input not from terminal]


Killing all inferiors
DEV$

Chapter Summary - Creating a Basic IO Filter Solution


The topics in this chapter presented details such that you should now be able to:

n Specify where to create your build folder on your build system, and why

n Create a Makefile to build your Solution

n Create appropriate entries in your Solution's SCONS and JSON files, based on the plans for your
Solution

n Create minimal source files for the library and daemon components of your Solution

n Define the prototype for each of the callbacks in a library and daemon component, and explain when the
IO Filters Framework invokes each

Review Questions
1 Under which directory do you create IO Filter source?

a Whatever folder you wish

b The src subdirectory of the VAIODK directory

c Under the src/partners/examples subdirectory of the VAIODK directory

d Anywhere under the VAIODK directory

2 In which file do you list the source files that make up the components of your IO Filter Solution?

a Makefile

b xxx.sc where xxx is any non-NULL name you want

c xxx.sc where xxx is the name of the filter

d xxx.sc where xxx is the name of the source folder

3 Which of the following callbacks are optional in an LI? (choose all that apply)

a diskAttach

b diskPropertiesSet

c diskRelease

d diskIOStart

4 In which file do you list the properties / capabilities of an IO Filter Solution?

a The scons file

b Makefile

c The json file

d The CIM provider

5 Which of the following are required callbacks for a Daemon? (choose all that apply)

a start

b IOStart

c stop

d shutdown

6 If your filter dumps core while running in the context of a VMX, where would you find the core file?

a /var/core

b The directory of the VM's .vmx file

c The directory from which you started the VM

d /tmp

7 Which (optional) component of an IO Filter cannot be included in a VIB or bundle?

a Library (LI)

b Daemon

c CIM Provider

d Test application

8 What entry do you use in a JSON file to specify the amount of static memory needed by a daemon?

a "DAEMON MEMORY RESERVATION" within the RESOURCE section

b "MEMORY RESERVATION" within the "DAEMON" section

c You don't put it in the JSON file, but in the SCONS file, as part of the daemon dictionary definition

d None of the above

9 What entry do you use in the SCONS file to set warning flags for C compilation?

a 'warnings' in the Identification section of the entire IO Filter

b 'cc warnings' in the definition of each component for which you want to set said flags

c 'cc warnings' in the last parameter to the function that defines a given component

d You don't do this in a SCONS file, but rather set the MAKE_WARNINGS environment variable in
your shell to the flags you want

5 Fleshing IO Filter Library Component Source
Chapter Objectives and Topics
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:
n Use sidecars in an IO Filter Solution

n Use the IO Filter timer functions in a Solution

n Use the IO Filter polling functions in a Solution

n Process IO transactions in a Library Instance, including:


n Understanding the data structures used in IO transactions

n Creating new and duplicating existing IO transactions

n Submitting new and duplicated IO transactions to the framework

n Process xMigration events in an IO Filter Solution

n Process Snapshot events in an IO Filter Solution

This chapter includes the following topics:

n “Understanding and Using IO Filter Utility Functions Common to Most Solutions,” on page 130

n “Understanding and Processing Primary Disk Events,” on page 153

n “Understanding and Processing IO Events,” on page 167

n “Overview of VMIOF IO Utility Functions,” on page 181

n “Optimizing the Performance of Your IO Filter Solution,” on page 184

n “Understanding and Processing Snapshot-related Events,” on page 192

n “Understanding and Processing Other IO Filter Events,” on page 208

n “Chapter Summary,” on page 250

n “Review Questions - Fleshing IO Filter Library Component Source,” on page 251

Understanding and Using IO Filter Utility Functions Common to Most Solutions
The VAIO provides several utility functions that you call within callback functions to perform various tasks,
including: managing memory, managing and performing IO to sidecar files, setting up and processing
event-based socket IO (polling), and creating and retrieving non-persistent instance-private data. These
tasks are common to filters of any class and to almost any callback within those filters.

The following subsections discuss how to use the utility functions provided for these tasks.

The framework serializes many but not all callbacks. For example, it will not block timer destruction while a
timer callback runs, and it will not invoke a timer callback during a timer destruction call. As a
counterexample, the framework can invoke diskStun and diskUnstun concurrently with other callbacks.
Timer, Workgroup, and Poll callbacks are also not serialized, so it is up to the IO Filter developer to
provide synchronization.

Since the Timer, Workgroup, and Poll callbacks are invoked asynchronously to other callbacks, you need to
ensure that there is no race between diskClose and these other callbacks when diskClose frees the LI private
data. To prevent this race, the diskClose callback should remove any pending Poll and Timer callbacks, and
wait for all remaining Workgroups to complete, before it frees the LI private instance data.
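
The ordering described above can be sketched in plain C. This is a minimal illustration only: the TeardownState structure, its fields, and the three functions are hypothetical names, not part of the VAIO API, and the comments marked "stand-in" show where a real filter would call the framework functions that remove timers and poll callbacks and free the private data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-LI bookkeeping; none of these names come from the VAIO API. */
typedef struct {
   bool timerArmed;          /* a timer callback may still fire */
   bool pollRegistered;      /* a poll callback may still fire */
   atomic_int workInFlight;  /* work items queued or currently running */
   bool freed;               /* set once the private data is released */
} TeardownState;

/* Call when a work item is queued. */
static void work_queued(TeardownState *s)
{
   atomic_fetch_add(&s->workInFlight, 1);
}

/* Call as the last step of the worker function. */
static void work_finished(TeardownState *s)
{
   int prev = atomic_fetch_sub(&s->workInFlight, 1);
   assert(prev > 0);   /* every finish must match an earlier queue */
   (void)prev;
}

/*
 * Close-time ordering: first stop the sources of new asynchronous callbacks,
 * then confirm that no work is in flight, and only then release the data.
 * Returns true when the data was released, false if the caller must wait
 * for outstanding work and try again.
 */
static bool try_teardown(TeardownState *s)
{
   s->timerArmed = false;      /* stand-in for removing the pending timer */
   s->pollRegistered = false;  /* stand-in for removing the poll callback */
   if (atomic_load(&s->workInFlight) != 0) {
      return false;            /* workers still reference the private data */
   }
   s->freed = true;            /* stand-in for freeing the LI private data */
   return true;
}
```

The point of the sketch is the order of operations: the data is released only after nothing can call back into it. How a real filter waits for its Workgroups depends on the Workgroup functions discussed later in this chapter.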

NOTE The VAIO provides utility functions beyond those discussed in the next few sub-sections, for
example to create and submit IO requests. Those utility functions not discussed in the following subsections
are discussed in more appropriate contexts later in this chapter.

Managing Memory in an IO Filter Solution


The IO Filter Framework uses heaps to isolate and manage dynamic memory allocations. These heaps are
distinct from the default cartel heap that ESXi creates along with the cartel and that is managed with calls
to malloc / free, etc., so you manage them with a separate set of functions. The IO Filter Framework heaps
are also accounted for differently.

You define the size of each IO Filter heap you create, ideally using a utility function provided by VAIO
(discussed later in this topic), based on the set of structures your filter would expect to create during normal,
or even stressed, operations. You may define multiple heaps in your filter. Regardless, the IO Filter
Framework enforces memory limits specified in the diskRequirements callback for LIs.

At a high level, there are three sets of functions you use to manage memory within an IO Filter solution:

n Setup — There is a set of functions used to define a heap handle, estimate the size of the heap, and then
actually create the heap. You typically invoke these functions during diskOpen, diskAttach / diskDetach,
diskRelease, and daemonStart callbacks. You may also create separate heaps in your Daemon each time
it receives a connection with a LI, CIM Provider, or another Daemon, with said heaps sized to keep
track of the state of transactions with those entities.

n Operation — There is a set of functions your filter components use during normal operation, during a
diskAttach or diskRelease operation, to allocate and free memory from the heap(s) it set up. Allocations
can be either aligned or unaligned.

n Cleanup — There is a function used during diskClose, diskAttach / diskDetach, diskRelease and
daemonStop / daemonCleanup callbacks, to destroy the heap(s) created during setup.

A key thing to remember is that, whatever heap you create, you must destroy before the context of your
filter goes away. Whatever memory you allocate from a heap, you must free before the heap is destroyed.
Failure to follow these rules causes the framework to crash your filter, which can result in crashing a VM.

DANGER When you create a heap for an LI (typically in the body of a diskOpen() callback), you must only
access that heap while in the context of that LI, including freeing any memory allocated from it and
destroying it. For example, you cannot create a heap in the context of VMDK1, and then destroy it or free
memory from it in the context of VMDK2. Doing this will crash the filter. Put another way, perform all heap
operations within the same LI that created said heap.

Thus, at a high level, the steps for using these functions are:

1 Declare a pointer to VMIOF_HeapHandle. This represents the opaque handle to the heap that you create.
You use this handle later when you dynamically allocate memory units from the heap and also when
you finally destroy the heap when you no longer need it.

2 Estimate the required heap size based on the set of allocations you expect your filter component to
make (for example 1 struct foo per open VMDK, one struct bar per outstanding IO, multiplied by the
number of possible open disks and outstanding IOs, respectively). The estimation of required space also
must include space to provide alignment for certain data structures your filter component may need to
allocate, such as buffers for doing I/O to sidecars.

3 Create the new heap object based on the estimated heap size.

4 Allocate memory, either aligned or non-aligned, from the heap created in Step 3.

5 Free the allocated memory when it is no longer needed.

6 Destroy the heap when it is no longer needed.

NOTE Always free the same unit you allocated. Always free all memory you allocated from a heap before
destroying said heap.
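
To make the lifecycle rules concrete, the following sketch models them with a toy heap. It is not the VAIO implementation and does not call the VMIOF functions; ToyHeap and the toy_* functions are hypothetical names used only to show the create, allocate, free, destroy order. The toy heap is a simple bump allocator over a caller-provided buffer, so it never reuses freed space and makes no alignment guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a framework heap; it is NOT the VAIO implementation. */
typedef struct {
   unsigned char *base;   /* caller-provided backing store */
   size_t capacity;       /* total bytes available */
   size_t used;           /* bytes handed out so far (bump pointer) */
   size_t live;           /* allocations that have not been freed yet */
} ToyHeap;

/* Step 3: create the heap with a size decided up front. */
static void toy_create(ToyHeap *h, unsigned char *buf, size_t size)
{
   h->base = buf;
   h->capacity = size;
   h->used = 0;
   h->live = 0;
}

/* Step 4: allocate; returns NULL when the heap cannot satisfy the request. */
static void *toy_alloc(ToyHeap *h, size_t size)
{
   if (size == 0 || size > h->capacity - h->used) {
      return NULL;
   }
   void *p = h->base + h->used;
   h->used += size;
   h->live++;
   return p;
}

/* Step 5: free exactly what was allocated, exactly once. */
static void toy_free(ToyHeap *h, void *p)
{
   assert(p != NULL && h->live > 0);
   h->live--;
}

/*
 * Step 6: destroy only an empty heap. The toy reports failure; the text
 * above says the real framework crashes the filter instead.
 */
static bool toy_destroy(ToyHeap *h)
{
   if (h->live != 0) {
      return false;
   }
   h->base = NULL;
   h->capacity = 0;
   h->used = 0;
   return true;
}
```

Keeping a count of live allocations, as the toy does, is one way for a filter to check in debug builds that every allocation has been freed before it destroys a heap.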

WHAT'S NEW Starting in release 60U2, the framework supports a single heap shared across different LIs
within the same VMX cartel. However, there is currently no proper way to report this single-heap
requirement, and everything reported in VMIOF_DiskRequirements is counted once for every LI. This will be
fixed in the 2016 release.

An alternative way to have memory accessible to different LIs is to use mmap with MAP_SHARED. Just like
System V Shared Memory Segments, mmap with MAP_SHARED comes from the kernel directly, so there is
no limitation. However, the page table for it comes from different resource pools. For the VMX cartel, it
comes from the uwshmempt page pool (beginning in 60U2), for other cartels, it comes from resource pools
with a fixed limit. You don't need to account for the System V Shared memory itself, but you need to
account for the page table overhead for the Daemon in DAEMON_MEMORY_RESERVATION and in
diskRequirements() for LIs.

memory = mmap(NULL, memSize, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);

munmap(memory, memSize);
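
The two calls above omit error handling. The following sketch wraps them in helper functions that check the result. It uses only the standard POSIX mmap and munmap calls; the helper names shared_region_map and shared_region_unmap are hypothetical and chosen for this illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map an anonymous, shared, read/write region of memSize bytes.
 * mmap reports failure with MAP_FAILED, not NULL, so translate it here.
 */
static void *shared_region_map(size_t memSize)
{
   void *memory = mmap(NULL, memSize, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
   return (memory == MAP_FAILED) ? NULL : memory;
}

/*
 * Unmap a region obtained from shared_region_map(). The length must be
 * the same one that was passed when the region was mapped.
 * Returns 0 on success and -1 on failure, as munmap does.
 */
static int shared_region_unmap(void *memory, size_t memSize)
{
   assert(memory != NULL);
   return munmap(memory, memSize);
}
```

Anonymous mappings start out zero-filled, so unlike memory from the framework heaps, the region does not need to be cleared before first use.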

NOTE malloc, as well as mmap(MAP_PRIVATE | MAP_ANONYMOUS), are not allowed to be used in an IO Filter.

Declaring a pointer to a heap: Understanding the VMIOF_HeapHandle type


The VAIO API provides an opaque data type VMIOF_HeapHandle to represent the heap with which IO Filter
components can perform dynamic memory operations. This handle gets initialized when you create the
heap, and is used for all subsequent references to it, including allocating and freeing memory and
destroying the heap itself.

The type is defined as shown below. In your own code, you declare a pointer to it, for example
VMIOF_HeapHandle *instanceHeapHandlep;

typedef struct VMIOF_HeapHandleInt VMIOF_HeapHandle;

Estimating the size of the heap: Understanding the VMIOF_HeapAllocation type and
VMIOF_HeapEstimateRequiredSize ()

Estimating the size of the heap(s) required by your code is critical to overall system behavior, as well as
functionality of your filter. Creating heaps larger than you need potentially deprives other parts of the
system of memory. Creating heaps smaller than needed will prevent your filter from functioning properly.

NOTE At no time can a LI create heaps that, in aggregate, exceed the total amount of memory returned by
the diskRequirements callback.

To estimate the size of your heap, you need to determine:

n Which data structures you will dynamically allocate from said heap

n The maximum number of those data structures you will need allocated at any given time

n On what, if any, boundary must those data structures be aligned

Instead of calculating the worst-case combination of these structures yourself, use
VMIOF_HeapEstimateRequiredSize() to do this for you. To use this function, you must first create an array of
structures of type VMIOF_HeapAllocation, each of which describes one of the data structures the filter
component will allocate from this heap, its alignment requirements (if any), and the maximum number of
those data structures the heap must support at any given time.

The definition of VMIOF_HeapAllocation is:

typedef struct VMIOF_HeapAllocation {
   /* The size of the allocation request. */
   size_t size;
   /* The required alignment. */
   size_t alignment;
   /* The number of such allocations. */
   size_t count;
} VMIOF_HeapAllocation;

After declaring an array of these structures, initialized with data for the data structures to be allocated from
the heap, use VMIOF_HeapEstimateRequiredSize() to estimate the heap’s size. This function determines the
minimum heap size required such that the heap has enough room to allocate any combination of the data
structures described by the array of VMIOF_HeapAllocation structures. The prototype for this function is:

size_t VMIOF_HeapEstimateRequiredSize(const VMIOF_HeapAllocation *allocations, size_t count);

The parameters to this function are:

n const VMIOF_HeapAllocation *allocations — This input parameter represents the base address of an
allocations array of VMIOF_HeapAllocation structures, that describe all sets and types of memory
allocations to be performed on the heap.

n size_t count — This input parameter is the number of VMIOF_HeapAllocation elements in the allocations
array.

The function returns the estimated heap size as a size_t. On failure, it returns 0.
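
To see why alignment matters for the estimate, consider the following arithmetic sketch. It is not the algorithm used by VMIOF_HeapEstimateRequiredSize(), which is not documented here; it only computes a conservative bound under the assumption that each aligned object may need up to (alignment - 1) bytes of padding. The AllocDesc type mirrors the three fields of VMIOF_HeapAllocation, and both it and estimate_upper_bound() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the three fields of VMIOF_HeapAllocation, for illustration only. */
typedef struct {
   size_t size;        /* bytes per object */
   size_t alignment;   /* required alignment; 0 means none */
   size_t count;       /* maximum number of such objects alive at once */
} AllocDesc;

/*
 * Conservative bound on the bytes needed for all described objects.
 * Assumption: an object with alignment A may waste up to A - 1 bytes of
 * padding. Allocator bookkeeping overhead and arithmetic overflow are
 * ignored, so treat the result as a teaching aid, not a sizing tool.
 */
static size_t estimate_upper_bound(const AllocDesc *descs, size_t n)
{
   size_t total = 0;
   for (size_t i = 0; i < n; i++) {
      size_t pad = descs[i].alignment > 1 ? descs[i].alignment - 1 : 0;
      total += descs[i].count * (descs[i].size + pad);
   }
   return total;
}
```

For example, two 4096-byte buffers aligned on 4096 bytes contribute 2 * (4096 + 4095) = 16382 bytes to this bound, almost twice their nominal size. In a real filter, always obtain the size from VMIOF_HeapEstimateRequiredSize() rather than from arithmetic like this.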

Understanding VMIOF_HeapCreate ()
Use VMIOF_HeapCreate() to create a new heap from which you can dynamically allocate memory. The
prototype of this function is:

VMIOF_Status VMIOF_HeapCreate(size_t size, VMIOF_HeapHandle **heap);

The parameters to this function are:


n size_t size — This input parameter denotes the size of the heap that you intend to create, as returned by
VMIOF_HeapEstimateRequiredSize().

n VMIOF_HeapHandle **heap — This output parameter is the opaque handle to the heap object that gets
created upon success. Use this handle for all subsequent heap operations, including allocating / freeing
memory and destroying the heap.

The function returns the following status codes:


n VMIOF_SUCCESS — The call succeeded, the heap is created, and *heap contains the handle to the new heap.

n Any other value — The heap creation failed. The error code indicates the reason for the failure.

Understanding VMIOF_HeapDestroy ()
When a heap is no longer needed, call VMIOF_HeapDestroy() to destroy the heap created with
VMIOF_HeapCreate(). The prototype for this function is:

VMIOF_Status VMIOF_HeapDestroy(VMIOF_HeapHandle *heap);

The parameter to this function is VMIOF_HeapHandle *heap, the handle to the heap to destroy, which was
initialized by a call to VMIOF_HeapCreate().

The function returns VMIOF_SUCCESS when the heap is successfully destroyed, or some other code on error.

NOTE The heap must be empty before the framework will destroy it. Attempting to destroy a heap that is
not empty results in a crash in the framework.

Understanding VMIOF_HeapAllocate ()
Call VMIOF_HeapAllocate() to dynamically allocate memory from a heap. The prototype of this function is:

void * VMIOF_HeapAllocate(VMIOF_HeapHandle *heap, size_t size);

The parameters to this function are:


n VMIOF_HeapHandle *heap — This input parameter is an opaque handle to the heap from which to allocate
the memory.

n size_t size — This input parameter is the size of the memory block to be allocated.

The function returns a void pointer to the memory allocated, or NULL on failure.

NOTE Allocated memory is not zeroed automatically.

Understanding VMIOF_HeapAllocateAligned ()
Use VMIOF_HeapAllocateAligned() to dynamically allocate memory from a heap, with the memory aligned
on a specified boundary. The prototype for this function is:

void * VMIOF_HeapAllocateAligned(VMIOF_HeapHandle *heap, size_t size, size_t alignment);

The parameters to this function are :


n VMIOF_HeapHandle *heap — This input parameter is an opaque handle to the heap from which to allocate
the memory.

n size_t size — This input parameter is the size of the memory block to be allocated.

n size_t alignment — This input parameter provides the desired alignment of the memory.

The function returns a void pointer to the memory allocated, or NULL on failure.

NOTE Allocated memory is not zeroed automatically.
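
When you debug an aligned allocation, it can help to verify that the alignment value is sensible and that the returned pointer honors it. The following sketch shows one way to do that in plain C. It assumes that alignments are powers of two, which this document does not state for VMIOF_HeapAllocateAligned(); the helper names is_power_of_two and is_aligned are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when x has exactly one bit set (1, 2, 4, 8, ...). */
static bool is_power_of_two(size_t x)
{
   return x != 0 && (x & (x - 1)) == 0;
}

/*
 * True when the address p is a multiple of alignment.
 * Assumption: alignment is a power of two, which allows the check to be
 * done with a bit mask instead of a division.
 */
static bool is_aligned(const void *p, size_t alignment)
{
   assert(is_power_of_two(alignment));
   return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A filter could call is_aligned() in a debug build on the pointer it receives for a sidecar buffer, passing the same alignment value that it gave to the allocation call.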

Understanding VMIOF_HeapFree ()
When you are done using any block of dynamically allocated memory, use VMIOF_HeapFree() to return it to
the heap from whence it came. The prototype of this function is:

void VMIOF_HeapFree(VMIOF_HeapHandle *heap, void *memory);

The parameters to this function are :

n VMIOF_HeapHandle *heap — This input parameter is the handle to the heap from which the memory was
originally allocated.

n void *memory — This is a pointer to the memory to free. It should point to memory that was allocated
from the indicated heap, that is no longer needed, and not already freed. It must be the same pointer
returned by VMIOF_HeapAllocate*(). That is, you cannot free only part of an allocation.

The function returns void.

NOTE Freeing an already freed block or a NULL pointer results in a crash in the framework.
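
Because a double free or a free of NULL is fatal, it can be useful to route every free through a small guard. The following sketch shows the idea in plain C. The ReleaseFn type, release_once(), and the counting stub are hypothetical names for this illustration; a real filter would adapt the wrapper to call VMIOF_HeapFree() with its VMIOF_HeapHandle pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Shape of a hypothetical release routine standing in for the heap free call. */
typedef void (*ReleaseFn)(void *heap, void *memory);

/*
 * Release *slot through `release` at most once. A NULL slot is skipped,
 * and the slot is cleared afterwards so that a repeated call is a no-op.
 * Returns true if memory was actually released.
 */
static bool release_once(ReleaseFn release, void *heap, void **slot)
{
   assert(release != NULL);
   if (slot == NULL || *slot == NULL) {
      return false;
   }
   release(heap, *slot);
   *slot = NULL;
   return true;
}

/* Counting stub used only to exercise the wrapper in this sketch. */
static int g_releaseCalls;

static void counting_release(void *heap, void *memory)
{
   (void)heap;
   (void)memory;
   g_releaseCalls++;
}
```

The guard only protects frees that go through the same pointer variable. If a filter keeps several copies of a pointer, clearing one of them does not prevent a double free through another.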

Example: Using the heap functions


The following code snippet provides an example usage of the heap functions. At a high level, this code
sample, in the context of VMIOF_Status SampleFilterDiskOpen(), estimates the size of heap to be created
using VMIOF_HeapEstimateRequiredSize() and creates it using VMIOF_HeapCreate(). Later on in the code
snippet, you see that a memory block is dynamically allocated from the heap you created, using
VMIOF_HeapAllocate(). Finally, when you are done using the memory block you free it using
VMIOF_HeapFree(). Ensure that you free all the allocated memory blocks before calling VMIOF_HeapDestroy().

1. VMIOF_Status
2. SampleFilterDiskOpen(VMIOF_DiskHandle *diskHandle, const VMIOF_DiskInfo *diskInfo)
3. {
4. VMIOF_Status res;
5. size_t heapSizeEstimate;
6. InstanceData_t *id;
7. uint32_t idx, count=diskInfo->linksInChain;
8. VMIOF_HeapAllocation allocations[] = {
9. { sizeof(InstanceData_t), 0, 1 }, /* instance data */
10. { MY_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN, 1 }, /* sidecar */
11. { MY_OWNER_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN, 1 }, /* sidecar */
12. { sizeof(SampfiltIOXact_t), 0, MAX_OUTSTANDING_IOTS}, /* io transactions */
13. };
14. size_t numAllocations = sizeof(allocations)/sizeof(VMIOF_HeapAllocation);
15. VMIOF_HeapHandle *instanceHeapHandlep;
16. char mbuf[MAX_FLAGS_STRING_SIZE];
17. VMIOF_DiskSidecar *ownerSidecarHandlep;
18. char *base;
19. ...
20. heapSizeEstimate = VMIOF_HeapEstimateRequiredSize(allocations,
21. numAllocations);
22. res = VMIOF_HeapCreate(heapSizeEstimate, &instanceHeapHandlep);
23. if( VMIOF_SUCCESS != res ) {
24. /* couldn't create the heap */
25. VMIOF_Log(VMIOF_LOG_ERROR,"NoopfiltLI(%s): error creating filter heap "
26. "(%d)\n", __func__, res);
27. return res;
28. }
29. VMIOF_Log(VMIOF_LOG_ERROR,"NoopfiltLI: created heap\n");
30. ...
31. id = VMIOF_HeapAllocate(instanceHeapHandlep, sizeof(InstanceData_t));
32. if(!id) {
33. VMIOF_Log(VMIOF_LOG_ERROR,"Sampfilt(%s): Could not allocate instance "
34. "id. Failing the open.\n", __func__);
35. VMIOF_HeapDestroy(instanceHeapHandlep);
36. return VMIOF_NO_MEMORY;
37. }
38. VMIOF_Log(VMIOF_LOG_ERROR,"SampFilt: did ID heap allocation\n");
39. bzero(id,sizeof(*id)); /* Heap allocations are not zero'd */
40. ...
41. /* allocate buffer for sidecar from heap aligned as required */
42. id->mySidecarp = (SampfiltSidecar_t *)VMIOF_HeapAllocateAligned(
43. id->heapHandlep, MY_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN);
44. if (NULL == id->mySidecarp) {
45. VMIOF_Log(VMIOF_LOG_ERROR,"NoopfiltLI(%s): could not allocate sidecar "
46. "buffer\n", __func__);
47. close(id->daemonSockFD);
48. VMIOF_HeapFree(id->heapHandlep, id);
49. VMIOF_HeapDestroy(instanceHeapHandlep);
50. return VMIOF_NO_MEMORY;
51. }
52. }
53. …
54. VMIOF_Status
55. SampleFilterDiskIOStart(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io) {
56. …
57. /* create an I/O transaction for this request */
58. if (NULL == (iotp = VMIOF_HeapAllocate(id->heapHandlep, sizeof(SampfiltDelayIO_t)))) {
59. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskIOStart): can't HeapAlloc iotp for local worker. just continuing the IO for now\n");
60. return VMIOF_SUCCESS;
61. } /* heap alloc */
62. iotp->io = io;
63. iotp->handle = handle;
64. iotp->sequence = sequence++;
65. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client: Client(DiskStartIO): async-local queueing work\n");
66. res = VMIOF_WorkQueue(id->workGroupp, SampfiltDelayIOWorker, (void *)iotp);
67. if( VMIOF_SUCCESS != res) {
68. /* well, that didn't work */
69. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskStartIO): could not add delayIO to the work pile (%d).",res);
70. return res;
71. } else { /* succeeded in adding to the work pile */
72. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskStartIO): added delay IO to the work pile\n");
73. }
74. return VMIOF_ASYNC;
75. }
76. …
77. VMIOF_Status
78. SampleFilterDiskClose(VMIOF_DiskHandle *diskHandle)
79. {
80. …
81. /* free the sidecar buffers BEFORE freeing the instance data */
82. VMIOF_HeapFree(id->heapHandlep,id->mySidecarp);
83. /* free our instance data from the heap */
84. tmp = id->heapHandlep; /* save this pointer, first */
85. VMIOF_HeapFree(id->heapHandlep,id);
86. VMIOF_HeapDestroy(tmp);
87. …
88. }

The key sections of the code are as follows :


n Line 4-18 : Declare and initialize local variables
n Line 4 : res : This variable captures VMIOF_Status values, for example the return value of
VMIOF_HeapCreate()

n Line 5: Declare a variable heapSizeEstimate to hold the return value of
VMIOF_HeapEstimateRequiredSize() on line 20. You later use this size value to create the heap via
VMIOF_HeapCreate() on line 22.

n Line 6: *id: This variable is a pointer to the instance data.

n Line 8-13: allocations[]: An array of type VMIOF_HeapAllocation with the elements we will be
allocating over the life of the heap.

n Line 14: numAllocations: The number of elements in the allocations[] array.

n Line 15: *instanceHeapHandlep : A pointer to a VMIOF_HeapHandle. This variable is set upon
successful execution of VMIOF_HeapCreate().

n Line 20 : Estimate the size of the heap to be created and capture the return value of
VMIOF_HeapEstimateRequiredSize() in heapSizeEstimate.

n Line 22-28 : Create the heap of size heapSizeEstimate via VMIOF_HeapCreate(). Capture the return value in
local variable res, and a pointer to the heap in instanceHeapHandlep. If heap creation is not successful,
send an error message to the log and return the failure status.

n Line 31-37 : Allocate a structure of type InstanceData_t from the heap and capture the return value of
VMIOF_HeapAllocate() in pointer id. If the allocation is not successful, send an appropriate error message
to the log, destroy the heap, and return VMIOF_NO_MEMORY.

n Line 39: Upon successful allocation of structure InstanceData_t from heap, zero out the structure as
Heap allocations are not automatically zeroed.

n Line 42-51: Allocate a SampfiltSidecar_t structure using VMIOF_HeapAllocateAligned(), specifying the
proper alignment. Capture the returned pointer in id->mySidecarp. If the heap allocation is not successful,
send an error message to the log, free the instance data, destroy the heap, and return VMIOF_NO_MEMORY.

n Line 58-61: Allocate a SampfiltDelayIO_t structure using VMIOF_HeapAllocate(). If the allocation is not
successful, send an error message to the log and return VMIOF_SUCCESS so that the IO continues.

n Line 66-73: You are ready to do asynchronous local queueing of work. Enqueue the work into the
workgroup queue. The function associated with the work thread is SampfiltDelayIOWorker() and it
accepts the pointer iotp. If the enqueue is not successful, send an error message to the log.

n Line 80 : In the context of SampleFilterDiskClose() ...

n Line 82 : Free the previously allocated sidecar structure pointed to by id->mySidecarp, once you are done
using it, by calling VMIOF_HeapFree().

n Line 85 : Free the previously allocated structure of type InstanceData_t pointed to by id, after saving off
the heap pointer in a temporary variable (tmp), by calling VMIOF_HeapFree().

n Line 86: Once you are done using the heap, destroy it using VMIOF_HeapDestroy(). Ensure that you free
all previously allocated memory blocks before calling this function.
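The ordering rules described above (free every block, save the heap pointer before freeing the structure that holds it, then destroy the heap) can be sketched without the VAIODK. The following is a minimal illustration only: MockHeap and the MockHeap*() functions are hypothetical stand-ins built on malloc(), not the VMIOF_Heap*() implementation, and Instance, InstanceOpen() and InstanceClose() are invented names for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a VAIO heap: it only counts live blocks so the
 * ordering rule (free every block, then destroy) can be checked. */
typedef struct MockHeap {
   int liveBlocks;
} MockHeap;

static MockHeap *MockHeapCreate(void)
{
   return calloc(1, sizeof(MockHeap));
}

static void *MockHeapAllocate(MockHeap *heap, size_t size)
{
   void *p = malloc(size);
   if (p != NULL) {
      heap->liveBlocks++;
   }
   return p;
}

static void MockHeapFree(MockHeap *heap, void *p)
{
   assert(p != NULL);            /* mirrors the rule: never free NULL */
   free(p);
   heap->liveBlocks--;
}

/* Returns the number of leaked blocks; 0 means a clean teardown. */
static int MockHeapDestroy(MockHeap *heap)
{
   int leaked = heap->liveBlocks;
   free(heap);
   return leaked;
}

/* Hypothetical instance data that remembers which heap it came from. */
typedef struct Instance {
   MockHeap *heap;
   void *sidecarBuf;
} Instance;

static Instance *InstanceOpen(size_t sidecarSize)
{
   MockHeap *heap = MockHeapCreate();
   Instance *inst;

   if (heap == NULL) {
      return NULL;
   }
   inst = MockHeapAllocate(heap, sizeof *inst);
   if (inst == NULL) {
      goto destroyHeap;
   }
   memset(inst, 0, sizeof *inst);   /* allocations are not zeroed */
   inst->heap = heap;
   inst->sidecarBuf = MockHeapAllocate(heap, sidecarSize);
   if (inst->sidecarBuf == NULL) {
      goto freeInstance;
   }
   return inst;

freeInstance:                       /* unwind in reverse order of setup */
   MockHeapFree(heap, inst);
destroyHeap:
   (void)MockHeapDestroy(heap);
   return NULL;
}

static int InstanceClose(Instance *inst)
{
   MockHeap *heap = inst->heap;     /* save before freeing its container */

   MockHeapFree(heap, inst->sidecarBuf);
   MockHeapFree(heap, inst);
   return MockHeapDestroy(heap);
}
```

The goto-based unwinding keeps each failure path from repeating the cleanup calls by hand, which is where the sample listing is easiest to get wrong as more allocations are added.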

Using Sidecars Functions in Library Code to Keep Persistent Per-VMDK Meta Data
The VAIO provides a set of functions and associated data structures that allow filters to manage sidecar files
(see “Understanding Filter-Private Data: Instance Data and Sidecars,” on page 24). The file vmiof_disk.h
contains all sidecar-related definitions. Within this file:

n Functions and data structures all begin with VMIOF_DiskSidecar

n #defines all begin with VMIOF_DISK_SIDECAR

The following sub-sections provide information on using each of the sidecar-related functions.

Understanding VMIOF_DISK_SIDECAR_ALIGN and VMIOF_DiskSidecar


The VAIODK defines the macro VMIOF_DISK_SIDECAR_ALIGN that you use in several ways when performing
sidecar operations, including:

n The size of each sidecar must be a whole multiple of VMIOF_DISK_SIDECAR_ALIGN bytes

n Each read from or write to a sidecar file must be at an offset that is a whole multiple of
VMIOF_DISK_SIDECAR_ALIGN bytes

n The address of the memory (RAM) from which you read or to which you write data must be on a
VMIOF_DISK_SIDECAR_ALIGN boundary

It is important that you use this macro in your code rather than hard-coding its current value, as VMware
reserves the right to change the value associated with this macro at any time. Failure to use the appropriate
values for offsets, sizes, or addresses causes the sidecar-related functions to fail, returning appropriate status
codes.
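The three alignment rules above lend themselves to small helper functions that a filter can call before each sidecar operation. The sketch below is illustrative only: the Sidecar*() helper names are hypothetical, and the fallback value 4096 exists solely so the code compiles without the VAIODK headers. Real filter code must take VMIOF_DISK_SIDECAR_ALIGN from vmiof_disk.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in so the sketch compiles without the VAIODK headers. The value is
 * an assumption for illustration; never hard-code it in filter code. */
#ifndef VMIOF_DISK_SIDECAR_ALIGN
#define VMIOF_DISK_SIDECAR_ALIGN 4096u
#endif

/* True when a size or file offset is a whole multiple of the alignment. */
static bool
SidecarIsAligned(uint64_t value)
{
   return value % VMIOF_DISK_SIDECAR_ALIGN == 0;
}

/* Round a byte count up to the next whole multiple of the alignment. */
static uint64_t
SidecarRoundUp(uint64_t numBytes)
{
   uint64_t rem = numBytes % VMIOF_DISK_SIDECAR_ALIGN;
   uint64_t rounded =
      rem == 0 ? numBytes : numBytes + (VMIOF_DISK_SIDECAR_ALIGN - rem);

   assert(SidecarIsAligned(rounded));
   return rounded;
}

/* True when a memory address sits on an alignment boundary. */
static bool
SidecarIsAlignedPtr(const void *buffer)
{
   return (uintptr_t)buffer % VMIOF_DISK_SIDECAR_ALIGN == 0;
}

/* Check all three rules before issuing a sidecar read or write. */
static bool
SidecarRequestIsValid(const void *buffer, uint64_t numBytes, uint64_t byteOff)
{
   return SidecarIsAlignedPtr(buffer) && SidecarIsAligned(numBytes) &&
          SidecarIsAligned(byteOff);
}
```

A typical use would be to pass SidecarRoundUp(sizeof(MyMetadata)) as the sidecar size, where MyMetadata is whatever structure the filter persists, so that the size rule holds regardless of the macro's actual value.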

The VAIODK defines the type VMIOF_DiskSidecar which is analogous to a file descriptor in POSIX file
functions. Your code receives a pointer to a valid VMIOF_DiskSidecar structure when it successfully creates or
opens a sidecar. Your code must pass this pointer into subsequent functions to read from / write to / delete /
close sidecar files.

Understanding and Using VMIOF_DiskSidecarCreate ()


Use this function to create a sidecar file associated with a VMDK file. You may call this function only from
the context of a diskAttach or diskPropertiesSet callback. The sidecar file is associated with the VMDK to
which the filter is being attached.

You can create a small number of sidecar files for each VMDK that your code filters. To distinguish between
sidecars, your code must assign a unique key to each. Keys are of type uint32_t and must be within the range
VMIOF_DISK_SIDECAR_KEYMIN to VMIOF_DISK_SIDECAR_KEYMAX. Thus, the number of sidecars allowed is
VMIOF_DISK_SIDECAR_KEYMAX - VMIOF_DISK_SIDECAR_KEYMIN.

The prototype for this function is:

VMIOF_Status VMIOF_DiskSidecarCreate(VMIOF_DiskHandle *handle, uint32_t key, uint64_t size,
VMIOF_DiskSidecar **outh);

The parameters you pass to this function are:

n VMIOF_DiskHandle *handle — The handle of the VMDK as passed into the diskAttach or
diskPropertiesSet callback

n uint32_t key — The key to assign to the sidecar

n uint64_t size — The size of the sidecar file. This must be a whole multiple of
VMIOF_DISK_SIDECAR_ALIGN

n VMIOF_DiskSidecar **outh — A pointer to a pointer to a VMIOF_DiskSidecar. If this function returns
VMIOF_SUCCESS (see the return values that follow), it sets *outh to point to a valid VMIOF_DiskSidecar.
Otherwise, the value of *outh is undefined.

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded. It created the sidecar and set outh to point to the
VMIOF_DiskSidecar handle to be used in subsequent operations with said sidecar.

n VMIOF_MISALIGNED — The function failed because size is not a multiple of VMIOF_DISK_SIDECAR_ALIGN

n VMIOF_NOT_SUPPORTED —The function failed because it was not invoked in the context of a diskAttach or
diskPropertiesSet callback

n VMIOF_SIDECAR_LIMIT — The function failed because the LI has already created the maximum number of
sidecar files per LI for this VMDK

Understanding and Using VMIOF_DiskSidecarDelete ()


Use this function to delete a sidecar with a given key. You may call this function only from the context of a
diskDetach or diskPropertiesSet callback. The key of type uint32_t must be within the range
VMIOF_DISK_SIDECAR_KEYMIN and VMIOF_DISK_SIDECAR_KEYMAX.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarDelete(VMIOF_DiskHandle *handle,
uint32_t key);

The parameters you pass to this function are:

n VMIOF_DiskHandle *handle — The handle of the VMDK as passed into the diskDetach or
diskPropertiesSet callback

n uint32_t key — The key representing the sidecar being deleted

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded and the sidecar file was deleted

n VMIOF_NOT_SUPPORTED — The function failed because it was not invoked in the context of a diskDetach or
diskPropertiesSet callback

n VMIOF_BUSY — The function was invoked for an already opened sidecar. The filter must close the sidecar
using VMIOF_DiskSidecarClose before calling this function.

Understanding and Using VMIOF_DiskSidecarOpen ()


Use this function to open a closed sidecar. Note that a newly created sidecar, created via
VMIOF_DiskSidecarCreate(), is opened implicitly. You may call this function in the context of diskOpen,
diskClose, diskAttach, diskPropertiesSet, diskDetach, diskGrow, diskClone, diskCollapse,
diskVmMigration, diskSnapshot and diskRelease callback functions.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarOpen(VMIOF_DiskHandle *handle,
uint32_t key,
VMIOF_DiskSidecar **outh);

The parameters you pass to this function are:

n VMIOF_DiskHandle *handle— The handle of the VMDK as passed into the callback

n uint32_t key — The key representing the sidecar. It must be within the range
VMIOF_DISK_SIDECAR_KEYMIN and VMIOF_DISK_SIDECAR_KEYMAX.

n VMIOF_DiskSidecar **outh — A pointer to a pointer to a VMIOF_DiskSidecar. If this function returns
VMIOF_SUCCESS, it sets *outh to point to a valid VMIOF_DiskSidecar that is opened. Otherwise, the value of
*outh is undefined.

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded and the sidecar file was indeed opened. outh now contains a
valid handle to an opened sidecar

n VMIOF_NOT_SUPPORTED — The function was invoked from an unsupported filter operation. In other
words, the function was not invoked in the context of a diskOpen, diskClose, diskAttach,
diskPropertiesSet, diskDetach, diskGrow, diskClone, diskCollapse, diskVmMigration, diskSnapshot or
diskRelease callback function.

Understanding and Using VMIOF_DiskSidecarClose ()


Use this function to close an opened sidecar. You must close the sidecar before you can delete it. You may
call this function in the context of diskOpen, diskClose, diskAttach, diskPropertiesSet, diskDetach,
diskGrow, diskClone, diskCollapse, diskVmMigration, diskSnapshot and diskRelease callback functions.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarClose(VMIOF_DiskSidecar *handle);

The parameter you pass to this function is:

n VMIOF_DiskSidecar *handle — The handle to the sidecar that you are closing

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded in closing the sidecar file

n VMIOF_NOT_SUPPORTED — The function was invoked from an unsupported filter operation. In other
words, the function was not invoked in the context of a diskOpen, diskClose, diskAttach,
diskPropertiesSet, diskDetach, diskGrow, diskClone, diskCollapse, diskVmMigration, diskSnapshot or
diskRelease callback function.

Understanding and Using VMIOF_DiskSidecarRead ()


Use this function to read data from the sidecar. You may call this function in the context of diskOpen,
diskClose, diskAttach, diskPropertiesSet, diskDetach, diskGrow, diskClone, diskCollapse,
diskVmMigration, diskSnapshot and diskRelease callback functions. If the filter implements the diskStun
and diskUnstun callbacks, then in stunned state, this function can be used in all LI callbacks, while in
unstunned state, this function can be used in any context.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarRead(VMIOF_DiskSidecar *handle,
void *buffer,
uint64_t numBytes,
uint64_t byteOff);

The parameters you pass to this function are :

n VMIOF_DiskSidecar *handle — The handle to the sidecar that you are reading

n void *buffer — A pointer to the buffer into which the sidecar data is read. This memory
must be VMIOF_DISK_SIDECAR_ALIGN aligned.

n uint64_t numBytes — The number of bytes to read from the sidecar file into the buffer. This value must
be a multiple of VMIOF_DISK_SIDECAR_ALIGN

n uint64_t byteOff — The offset in the sidecar file at which to start reading. This offset must be a
multiple of VMIOF_DISK_SIDECAR_ALIGN

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded and the buffer now contains the data that was read from the
sidecar file

n VMIOF_MISALIGNED — The buffer address, numBytes, or byteOff is not a multiple of VMIOF_DISK_SIDECAR_ALIGN

n VMIOF_OUT_OF_RANGE — You are trying to read beyond the size of the sidecar's capacity

Understanding and Using VMIOF_DiskSidecarWrite ()


Use this function to write data into the sidecar. You may call this function in the context of diskOpen,
diskClose, diskAttach, diskPropertiesSet, diskDetach, diskGrow, diskClone, diskCollapse,
diskVmMigration, diskSnapshot and diskRelease callback functions. If the filter implements the diskStun
and diskUnstun callbacks, then in stunned state, this function can be used in all LI callbacks, while in
unstunned state, this function can be used in any context.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarWrite(VMIOF_DiskSidecar *handle,
void *buffer,
uint64_t numBytes,
uint64_t byteOff);

The parameters you pass to this function are:

n VMIOF_DiskSidecar *handle — The handle to the sidecar that you are writing into

n void *buffer — A pointer to a buffer with the data to be written into the sidecar file. This
memory must be VMIOF_DISK_SIDECAR_ALIGN aligned.

n uint64_t numBytes — The number of bytes to write from the buffer into the sidecar. This value must be
a multiple of VMIOF_DISK_SIDECAR_ALIGN

n uint64_t byteOff — The offset in the sidecar file at which to start writing. This offset must be a
multiple of VMIOF_DISK_SIDECAR_ALIGN

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded and the data in the buffer is written into the sidecar file

n VMIOF_MISALIGNED — The buffer address, numBytes, or byteOff is not a multiple of VMIOF_DISK_SIDECAR_ALIGN

n VMIOF_OUT_OF_RANGE — You are trying to write beyond the size of the sidecar's capacity
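Because every sidecar read and write must start at an aligned offset and cover an aligned length, updating a small record usually means reading the enclosing aligned region, changing the bytes of interest, and writing the region back. The following sketch shows only the offset arithmetic for that pattern. SidecarWindow and SidecarWindowFor() are hypothetical names, and the fallback value 4096 is an assumption used so the code compiles without the VAIODK headers.

```c
#include <assert.h>
#include <stdint.h>

#ifndef VMIOF_DISK_SIDECAR_ALIGN
#define VMIOF_DISK_SIDECAR_ALIGN 4096u   /* assumed value, illustration only */
#endif

/* Hypothetical helper type: the aligned region that covers a byte range. */
typedef struct SidecarWindow {
   uint64_t byteOff;    /* aligned offset to pass as byteOff */
   uint64_t numBytes;   /* aligned length to pass as numBytes */
   uint64_t delta;      /* where the caller's data starts inside the window */
} SidecarWindow;

/* Compute the smallest aligned window that contains [off, off + len). */
static SidecarWindow
SidecarWindowFor(uint64_t off, uint64_t len)
{
   const uint64_t align = VMIOF_DISK_SIDECAR_ALIGN;
   SidecarWindow w;
   uint64_t end = off + len;                                 /* one past last byte */
   uint64_t alignedEnd = (end + align - 1) / align * align;  /* round up */

   w.byteOff = off / align * align;                          /* round down */
   w.numBytes = alignedEnd - w.byteOff;
   w.delta = off - w.byteOff;
   assert(w.numBytes % align == 0 && w.delta < align);
   return w;
}
```

With the window in hand, a filter would read w.numBytes bytes at w.byteOff into an aligned buffer (for example one obtained from VMIOF_HeapAllocateAligned()), modify the bytes starting at buffer + w.delta, and write the same window back. The caller is still responsible for keeping the window inside the sidecar's current size.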

Understanding and Using VMIOF_DiskSidecarGetSize () and VMIOF_DiskSidecarSetSize ()


VMIOF_DiskSidecarGetSize — Use this function to retrieve the current size of the sidecar file. You can call
this function in any context except while a VM is stunned.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarGetSize(VMIOF_DiskSidecar *handle,
uint64_t *size);

The parameters you pass to this function are:

n VMIOF_DiskSidecar *handle — The handle to the sidecar whose size you are trying to retrieve

n uint64_t *size — A pointer to the variable that receives the size of the sidecar file

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded and size now contains the retrieved sidecar file size

VMIOF_DiskSidecarSetSize — Use this function to resize your sidecar file. You can call this function only in
the context of diskAttach, diskPropertiesSet and diskGrow.

The prototype for this function is :

VMIOF_Status VMIOF_DiskSidecarSetSize(VMIOF_DiskSidecar *handle,
uint64_t size);

The parameters you pass to this function are:

n VMIOF_DiskSidecar *handle — The handle to the sidecar that you are resizing

n uint64_t size — The new size of the sidecar in bytes

The return values for this function include:

n VMIOF_SUCCESS — The function succeeded and the sidecar file now has the requested size

n VMIOF_NOT_SUPPORTED — The function failed because it was invoked from an unsupported filter operation,
that is, outside the context of diskAttach, diskPropertiesSet and diskGrow

Important Notes
The VAIO implementation limits when you can invoke certain sidecar functions, including:

n You can only invoke VMIOF_DiskSidecarCreate() during the diskAttach or diskPropertiesSet callbacks.

n You can only invoke VMIOF_DiskSidecarDelete() during the diskPropertiesSet, diskAttach or
diskDetach callbacks.

n You cannot call any sidecar functions while a VM is stunned. Thus you need to keep the current stun
status of a VMDK's VM in its VMDK private data, not in its sidecar.

The IO Filter framework imposes no limitation on the size of the sidecar files. However, the downside of
having large sidecars is the time spent in copying the sidecar during snapshot operations. A VM can see
large stun times during the snapshot create operation if the sidecars are large because the sidecars are
copied from parent disk to child disk while the VM is stunned. The same is true when the sidecars are
copied from child disk to parent disk during the snapshot consolidate operation.

NOTE The vm-support bundle, used for crash analysis, can collect the first 1MB of each sidecar file.
However, this is not enabled by default; you have to specify it explicitly as follows. Please note that this
feature doesn't work on VVOL and vSAN datastores.

ESXi# vm-support -a VirtualMachines:sidecarfiles

Example: Example Code


Several topics in this chapter provide examples of using these functions, including: “Understanding and
Processing DiskAttach Events,” on page 155, “Understanding and Processing DiskDetach Events,” on
page 158, etc.

Understanding and Using the IO Filters Polling Functions


VAIO provides functions and data structures that allow your IO Filter solution to poll certain of its file
descriptors for pending reads and writes in a manner that is analogous to using the select() system call in
POSIX environments. A key reason for these poll functions lies in the resource limitations of IO Filter
solutions vs typical POSIX environments, specifically limitations on threads.

For example, two coding patterns for server programs that use sockets are :

n Many-simple threads:

n Create a thread to process connection requests on a bound socket

n For each connection accepted, spawn a new thread to process and respond to requests on the file
returned by accept()

n Call select() in a busy-wait loop

Neither of these patterns works well in IO Filters, because:

n IO Filters are discouraged from creating additional threads in either their daemon or Library Instance
code as excessive threads may cause performance problems and may even cause ESXi to crash

n Library Instances and Daemons only get invoked through their entry-point functions. The Library
Instance routines must eventually return to continue the IO flow on the VM. Daemons must
return in a timely manner or they will be killed by the IO Filter Framework's Daemon watchdog.

Instead of using either of the patterns described above, the proper way to wait for file IO within the IO Filter
Solution is to use its poll functions. At a high level, your code registers each file descriptor for which it
wants to be notified of pending IO with the framework via the VMIOF_PollAdd() function (discussed in
detail later in this topic), associating a callback function with the file descriptor. Whenever there is specific
IO pending on this file descriptor, the IO Filter Framework invokes the associated callback to process the IO.

NOTE The poll functions available in the IO Filter framework currently support operations on sockets and pipes
only.

Given the above details, common examples of using the poll functions include:

n Daemon code waiting for connection requests on a bound socket

n Daemon code waiting on incoming requests / commands from either Library Instances or CIM
Providers

n Library instances waiting for results from deferred IO requests sent to a Daemon for fulfillment

The following sub-sections provide details and an example of using the VAIO poll functionality.

NOTE Poll functions won't be called during the callbacks of diskOpen and diskClose. The same thread that
calls all the poll functions is also responsible for these two callbacks. As a workaround, we recommend you
cancel all timers, and poll callbacks in diskClose, and then use poll/select directly.

Understanding VMIOF_PollHandle
The VAIO API provides an opaque data type VMIOF_PollHandle. This represents a handle to the poll callback
function that gets registered via VMIOF_PollAdd() (discussed in detail in the next topic). This function is
invoked upon occurrence of the event being polled, for example a read or write operation on the file descriptor
(socket/pipe).

VMIOF_PollHandle **poll;

Understanding VMIOF_PollAdd ()
Use VMIOF_PollAdd() to register a poll callback for a certain event, such as a read or write operation on a socket or
pipe file descriptor. The prototype for VMIOF_PollAdd() is in vmiof_poll.h.

VMIOF_Status
VMIOF_PollAdd(VMIOF_FileHandle file, VMIOF_PollEvent event, VMIOF_PollCallback
callback, void *data, VMIOF_PollHandle **poll);

The parameters to this function are :

n VMIOF_FileHandle file — This is the file descriptor of the socket or pipe that is polled

n VMIOF_PollEvent event — This refers to the event being polled. It is a read or write operation on the
socket or pipe.

n VMIOF_PollCallback callback - This function pointer is the poll callback function that is registered to be
invoked upon occurrence of VMIOF_PollEvent

n void *data — A pointer to data to be passed to VMIOF_PollCallback function by the IO Filter Framework.

n VMIOF_PollHandle **poll - A pointer to a pointer to VMIOF_PollHandle. Your code declares a
VMIOF_PollHandle*, say called foo, and then passes a pointer to foo in the last parameter. You use this
handle if/when you wish to unregister/remove the VMIOF_PollCallback. You would do this if you want
to stop polling the socket or pipe for the read or write operation.

The function returns the following status codes:

n VMIOF_SUCCESS — VMIOF_PollAdd() successfully registered the VMIOF_PollCallback function for the
specified VMIOF_PollEvent

n VMIOF_BAD_PARAM — VMIOF_PollAdd() returns VMIOF_BAD_PARAM if the event specified by VMIOF_PollEvent
is invalid.

n VMIOF_NO_MEMORY — VMIOF_PollAdd() returns VMIOF_NO_MEMORY if there is a memory allocation failure for the
VMIOF_PollHandle

NOTE In both LIs and Daemons, no blocking functions may be called in Poll callbacks.

Understanding VMIOF_PollRemove ()
Use VMIOF_PollRemove() to remove or unregister the VMIOF_PollCallback(). You use this function when you
no longer want to poll the event (read or write) associated with the socket or pipe file descriptor.
VMIOF_PollRemove() has the prototype in vmiof_poll.h.

VMIOF_Status
VMIOF_PollRemove(VMIOF_PollHandle *poll);

The parameter to this function is:

n VMIOF_PollHandle *poll - This is the opaque handle to the poll callback function that you remove or
unregister, associated with the event (read or write) on the file descriptor of the socket or pipe.

The function returns the following status codes:

n VMIOF_SUCCESS — VMIOF_PollRemove() returns this value upon successful removal of the
VMIOF_PollCallback() callback function.

n VMIOF_NOT_FOUND — VMIOF_PollRemove() returns VMIOF_NOT_FOUND if the handle to the poll callback
function VMIOF_PollCallback() is either invalid or not found.

IMPORTANT Be sure to call this function on a file descriptor before you close the file descriptor. If you fail to
remove the poll before you close the file descriptor, the framework will call the callback function
continuously.
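To make the register / dispatch / remove-before-close sequence concrete, here is a small POSIX-only model of the pattern. It is a sketch under stated assumptions, not the framework's implementation: DemoPollEntry, DemoPollAdd(), DemoPollRemove() and DemoPollDispatchOnce() are hypothetical names that mimic what VMIOF_PollAdd() and VMIOF_PollRemove() do on your behalf, and the dispatcher uses the standard poll() call on a single descriptor.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical miniature of a poll registration; the real framework keeps
 * its own bookkeeping behind VMIOF_PollAdd() and VMIOF_PollRemove(). */
typedef void (*DemoPollCallback)(void *data);

typedef struct DemoPollEntry {
   int fd;                     /* descriptor being watched, -1 when unused */
   DemoPollCallback callback;  /* invoked when fd becomes readable */
   void *data;                 /* passed back to the callback untouched */
} DemoPollEntry;

static void
DemoPollAdd(DemoPollEntry *entry, int fd, DemoPollCallback cb, void *data)
{
   entry->fd = fd;
   entry->callback = cb;
   entry->data = data;
}

static void
DemoPollRemove(DemoPollEntry *entry)
{
   entry->fd = -1;
   entry->callback = NULL;
}

/* One pass of the dispatcher: returns 1 if the callback ran, 0 otherwise. */
static int
DemoPollDispatchOnce(DemoPollEntry *entry, int timeoutMs)
{
   struct pollfd pfd;

   if (entry->fd < 0 || entry->callback == NULL) {
      return 0;                /* nothing registered */
   }
   pfd.fd = entry->fd;
   pfd.events = POLLIN;
   pfd.revents = 0;
   if (poll(&pfd, 1, timeoutMs) > 0 && (pfd.revents & POLLIN)) {
      entry->callback(entry->data);
      return 1;
   }
   return 0;
}

/* Example callback state and body: consume one byte without blocking,
 * since the descriptor is known to be readable when this runs. */
typedef struct DemoReader {
   int fd;
   char last;
   int calls;
} DemoReader;

static void
DemoReadWorker(void *data)
{
   DemoReader *r = data;

   if (read(r->fd, &r->last, 1) == 1) {
      r->calls++;
   }
}
```

The point of the model is the ordering: the registration is removed first and the descriptor is closed afterwards, so the dispatcher never polls a stale descriptor.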

Example: Example of using the Poll Functions


The following code snippet provides an example usage of the poll functions. At a high level, this code
sample, in the context of SampleFilterDiskOpen(), creates a socket and adds it to the poll list to monitor it for
read event using VMIOF_PollAdd(). Upon occurrence of the event the registered callback function is invoked.
The code, in the context of SampleFilterDiskClose() removes or unregisters the callback function causing
the event on the socket not to be polled subsequently.

0. char Mbuf[1024];
1. /* Declare the poll handle for the callback function to be registered via VMIOF_PollAdd()*/
2. VMIOF_PollHandle *SampfiltLISocketPollHandle;
3. int SampfiltSockFD;
4. void SampfiltDaemonReadWorker(void *datap) {
5. SampfiltDelayIO_t delayIO;
6. if(-1 == read(SampfiltSockFD,&delayIO,sizeof(delayIO))) {
7. /*Unable to read Daemon socket data*/
8. sprintf(Mbuf,"DaemonReadWorker: could not read Daemon socket data: %d\n",errno);
9. IOFLOG1(Mbuf);
10. } else { /* read worked, continue the IO */
11. sprintf(Mbuf,"DaemonReadWorker: received ack for delayed IO seq %d. Continuing the IO.\n",delayIO.sequence);
12. IOFLOG1(Mbuf);
13. …
14. }
15. return;
16. } /* End of SampfiltDaemonReadWorker() */
17. …
18. VMIOF_Status
19. SampleFilterDiskOpen(VMIOF_DiskHandle *diskHandle, const VMIOF_DiskInfo *diskInfo) {
20. VMIOF_Status res;
21. /* open a socket to the Daemon */
22. if(0 > (SampfiltSockFD = socket(AF_UNIX, SOCK_STREAM, 0))) {
23. sprintf(Mbuf,"can't socket in LI: %d\n",errno);
24. IOFLOG1(Mbuf);
25. return((VMIOF_Status)errno);
26. }
27. …
28. /* now we should be ready to talk to the daemon. Add the socket to the Poll list, for read */
29. if(VMIOF_SUCCESS != (res = VMIOF_PollAdd((VMIOF_FileHandle)SampfiltSockFD, VMIOF_POLL_EVENT_READ,
30. SampfiltDaemonReadWorker, (void *)NULL, &SampfiltLISocketPollHandle))) {
31. sprintf(Mbuf,"Error adding Poll for Daemon socket %d: %d\n",SampfiltSockFD, res);
32. IOFLOG1(Mbuf);
33. return res;
34. }
35. …
36. } /* End of SampleFilterDiskOpen () */
37. VMIOF_Status
38. SampleFilterDiskClose(VMIOF_DiskHandle *diskHandle) {
39. …
40. VMIOF_PollRemove(SampfiltLISocketPollHandle);
41. …
42. } /* end of SampleFilterDiskClose() */

The key sections of code are as follows:

n Line 0 : Define a character buffer, Mbuf that the code uses for buffering log messages

n Line 2 : Define VMIOF_PollHandle *SampfiltLISocketPollHandle to be an opaque handle to the poll
callback function that gets registered and is invoked upon occurrence of an event (read or write) on the
specified file descriptor (socket or pipe).

n Line 3 : Define SampfiltSockFD, an integer that holds the file descriptor of the socket

n Line 4-16 : SampfiltDaemonReadWorker() is the callback function that is registered via VMIOF_PollAdd().
This function reads data from the socket.

n Line 5: Define SampfiltDelayIO_t delayIO - This is a structure of information needed when
processing locally delayed IO.

n Line 6-9: Read from socket using its file descriptor SampfiltSockFD. In case the read fails log an
appropriate error message indicating the failure

n Line 10 - 12 : Upon successful read from socket, continue IO operation.

n Line 15 - 16: Return from SampfiltDaemonReadWorker().

n Line 18-36 : Define SampleFilterDiskOpen(). Not shown in this code snippet, the Library Instance defines
this function as the entry point for the diskOpen event. Specifically:

n Line 20: Declare res to hold the return values from certain VAIO functions called by this function

n Line 22 - 26 : Create the socket, an endpoint for communication, and store the return value in the
socket file descriptor SampfiltSockFD. Log appropriate error message upon failure to create the
socket.

n Line 28 - 34 : Upon successful creation of the socket, add it to the poll list to monitor it for read
event using VMIOF_PollAdd(). The first parameter to VMIOF_PollAdd() is the socket file descriptor
SampfiltSockFD, the second parameter VMIOF_POLL_EVENT_READ indicates that the event being polled
is a read event on the socket, the third parameter SampfiltDaemonReadWorker() is the callback
function to invoke upon occurrence of the read event, the fourth parameter (void *)NULL indicates
that the callback function SampfiltDaemonReadWorker() being registered does not expect any
parameter. The fifth parameter is an output parameter that holds an opaque handle to the callback
function SampfiltDaemonReadWorker() being registered. This handle is needed for any future
manipulation, for example unregistering the callback function when you no longer want to poll the
socket for the event. The return value of VMIOF_PollAdd() is captured in res and checked for
success. Appropriate error message is logged upon failure.

n Lines 37-42 : Define SampleFilterDiskClose(). Not shown in this code snippet, the Library Instance defines
this function as the entry point for the diskClose event. Specifically:

n Line 40: Removes or unregisters the poll callback function SampfiltDaemonReadWorker() via the
VMIOF_PollHandle. The event on the socket will no longer be polled or monitored subsequently.
Normally code should check the return value of this function. For simplicity, this function discards
and does not check the return value

Thus, in general, whenever ESXi opens a VMDK associated with this filter, the library code creates a socket
to talk to the daemon (that might be providing caching or replication services on a different ESXi host)
and polls it for read events. Upon occurrence of the event the callback function SampfiltDaemonReadWorker() is
called and IO continues. When ESXi closes the disk, the registered callback is removed via
VMIOF_PollRemove().

Understanding and Using Filter-Private Data Functions to Keep Non-Persistent Per-VMDK Meta Data
There is only one address space for each VMX, iofilterd / daemon, or other user-space cartel in which filter
components run, regardless of the number of VMDKs tagged for processing by the Filter Solution. However,
Libraries (and maybe Daemons) likely want to keep certain information on a per-VMDK basis, for example:

n Handles to timer ( “Understanding and Using the IO Filters Timer Functions,” on page 171 ), poll
( “Understanding and Using the IO Filters Polling Functions,” on page 141 ), and worker callbacks
( “Understanding and Using the IO Filters Worker Functions,” on page 174).

n VM stun state, for example to determine whether you can perform Sidecar operations as discussed in
the next section

n A cached copy of sidecar data that changes quickly, where continually updating the sidecar would have
an adverse performance impact

The VAIO provides a method for filters to associate arbitrary information (defined by the Filter Solution,
called Filter-Private Data) with a VMIOF_DiskHandle pointer, and then retrieve that data on demand. The
functions are:
n void VMIOF_DiskFilterPrivateDataSet(VMIOF_DiskHandle *handle, void *data)

Use this function to associate Filter-Private Data with handle.

n void *VMIOF_DiskFilterPrivateDataGet(VMIOF_DiskHandle *handle)

Use this function to retrieve the Filter-Private Data associated with handle.

Because the data is meant to be per-VMDK, and filters should be designed to support an arbitrary number
of VMDKs, filter libraries typically:
n Define the Filter-Private Data structure in a .h file

n Dynamically allocate the Filter Solution-defined data (using VMIOF_HeapAllocate() or
VMIOF_HeapAllocateAligned(); see “Managing Memory in an IO Filter Solution,” on page 130) and
associate it with the disk handle (using VMIOF_DiskFilterPrivateDataSet()) in the diskOpen()
entry-point code

n Retrieve the data (using VMIOF_DiskFilterPrivateDataGet()) in each of the other callbacks that require
the data. You may also consider passing the Filter-Private Data as the parameter to timer, poll, and
worker callback functions (discussed in the previously referenced sections).

n Free the Filter-Private Data in the diskClose() entry-point code

NOTE In addition to the example content listed previously, Filter-Private Data typically includes a pointer
to the VMIOF_DiskHandle with which the data is associated. This allows timer/poll/worker callback functions,
which do not receive the disk handle as a parameter, to obtain it from the data when the IO Filter
Framework invokes those callbacks.

NOTE Library Instances should not share Heaps with each other. Filter Private Data should be allocated
from its own heap.
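
The allocate, associate, retrieve, and free lifecycle described above can be sketched as follows. This is a minimal, self-contained illustration rather than real filter code: DiskHandle, PrivateDataSet(), and PrivateDataGet() are hypothetical stand-ins for the VAIO handle type and the two Filter-Private Data functions, and calloc()/free() stand in for the VAIO heap functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; a real filter uses the VAIO types and functions. */
typedef struct DiskHandle { void *filterPrivate; } DiskHandle;

static void
PrivateDataSet(DiskHandle *h, void *data) { h->filterPrivate = data; }

static void *
PrivateDataGet(DiskHandle *h) { return h->filterPrivate; }

/* Per-VMDK state, with a back-pointer to the owning disk handle. */
typedef struct InstanceData {
   DiskHandle *handle;
   char stunStatus;       /* 0 = unstunned, 1 = stunned */
   unsigned timerFires;
} InstanceData;

/* diskOpen-style step: allocate zeroed state and attach it to the handle. */
static int
DemoOpen(DiskHandle *h)
{
   InstanceData *id = calloc(1, sizeof *id);  /* stands in for the VAIO heap */

   if (id == NULL) {
      return -1;
   }
   id->handle = h;
   PrivateDataSet(h, id);
   return 0;
}

/* Timer-style callback: receives only the data pointer, recovers the handle. */
static DiskHandle *
DemoTimerCallback(void *data)
{
   InstanceData *id = data;

   id->timerFires++;
   return id->handle;
}

/* diskClose-style step: detach the state from the handle, then free it. */
static void
DemoClose(DiskHandle *h)
{
   InstanceData *id = PrivateDataGet(h);

   if (id != NULL) {
      PrivateDataSet(h, NULL);
      free(id);
   }
}
```

The back-pointer is what lets DemoTimerCallback() reach the disk handle even though its only argument is the data pointer, which mirrors the situation of timer, poll, and worker callbacks.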

Example: Using Filter-Private Data


The following code snippet shows the use of these APIs as outlined above. Though it contains invocations of
functions detailed later in this chapter, the example is limited to timers and the VMIOF_Heap*() functions,
which you (hopefully) conceptually understand without detailed knowledge of them.

The following snippet contains a sample definition of Filter-Private data in sampfilt_instanceData.h:

/* data needed per filtered VMDK. Allocated on DiskOpen() and freed on DiskClose() */
typedef struct InstanceData_s {
   VMIOF_DiskHandle *handle;
   int SampfiltSockFD;
   VMIOF_PollHandle *SampfiltLISocketPollHandle;
   VMIOF_DiskSidecar *mySidecarHandlep;
   SampfiltSidecar_t *mySidecarp;       /* pointer to mmaped space for reading/writing sidecar data */
   VMIOF_WorkGroup *workGroupp;
   VMIOF_WorkHandle *workHandlep;
   VMIOF_TimerHandle *timerHandlep;     /* handle for the 30-second timer created in the diskOpen code below */
   VMIOF_TimerHandle *snapTimerHandlep; /* handle for snapshot progress timer */
   uint64_t snapCount;                  /* how far we've progressed through the simulated 'work' of a snapshot */
   pthread_cond_t snapCondVar;          /* used for signalling when snap progress is complete */
   pthread_mutex_t snapMutex;
   char stunStatus;                     /* 0=unstunned, 1=stunned */
} InstanceData_t;

The following code snippet contains Filter Library code:

1. /* **************************************************************************
2. * Copyright 2014 VMware, Inc. All rights reserved.
3. * -- VMware Confidential
4. * **************************************************************************/
5.
6. #include "vmiof/vmiof_disk.h"
7.
8. #include "sampfilt_delayIO.h"
9. #include "sampfilt_instanceData.h"
10.
11. #define THIRTY_SECONDS (uint64_t)30000000 /* 30 million microseconds = 30 seconds */
12. void
13. TimerCallback(void *datap) {
14. static int count = 1;
15. VMIOF_Status res;
16. InstanceData_t *id = (InstanceData_t *)datap;
17.
18. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(TimerCallback): Disk %p: Timer fired!\n", id->handle);
19. ...
20. return;
21. } /* TimerCallback() */
22.
23. VMIOF_Status
24. SampleFilterDiskOpen(VMIOF_DiskHandle *diskHandle, const VMIOF_DiskInfo *diskInfo)
25. {
26. VMIOF_Status res;
27. struct sockaddr_in serv_addr;
28. struct hostent *server;
29. size_t heapSizeEstimate;
30. InstanceData_t *id;
31. uint32_t idx, count=diskInfo->linksInChain;
32.
33. ...
34. /* Heap is created before this*/
35. id = VMIOF_HeapAllocate(SampfiltHeapHandlep, sizeof(*id));
36. if(!id) {
37. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): Could not allocate instance id. Failing the open\n");
38. return VMIOF_NO_MEMORY;
39. } else {
40. bzero(id,sizeof(*id)); /* just to be sure */
41. }
42. pthread_cond_init(&id->snapCondVar, NULL); /* initialize the condition variable */
43. pthread_mutex_init(&id->snapMutex, NULL); /* initialize the mutex variable */
44. id->handle = diskHandle;
45. /* create a timer */
46. if( VMIOF_SUCCESS != (res = VMIOF_TimerAdd(THIRTY_SECONDS, TimerCallback, (void *)id, &id->timerHandlep))) {
47. /* couldn't create the timer */
48. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): creating timer failed (%d)",res);
49. return res;
50. } else {
51. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): created timer\n");

52. } /* if VMIOF_TimerAdd() */
53. VMIOF_DiskFilterPrivateDataSet(diskHandle, (void *)id);
54. ...
55. return res;
56. } /* SampleFilterDiskOpen() */
57.
58. VMIOF_Status
59. SampleFilterDiskClose(VMIOF_DiskHandle *diskHandle)
60. {
61. VMIOF_Status res = VMIOF_SUCCESS;
62. InstanceData_t *id;
63.
64. /* announce we are here, and print the parameters */
65. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskClose): handle=%p\n",diskHandle);
66. id = (InstanceData_t *)VMIOF_DiskFilterPrivateDataGet(diskHandle);
67. ...
68. /* cancel the timer, remove the workgroup, remove the poll */
69. (void)VMIOF_TimerRemove(id->timerHandlep); /* don't care about the return value here */
70. ...
71. /* free our instance data from the heap */
72. VMIOF_HeapFree(SampfiltHeapHandlep, id); /* same heap that line 35 allocated from */
73. ...
74. return res;
75. } /* SampleFilterDiskClose() */

This code works as follows:


n Lines 6-9 just include necessary header files. There are others included in the real source that are not
shown here, such as vmiof_timers.h.

n Line 11 defines a constant used when creating a timer that fires every 30 seconds (see line 46).

n Lines 12-21 define the function TimerCallback(), which is registered as the callback for the 30-second
timer in line 46. NOTE that this callback expects to receive an InstanceData_t * in its data parameter.
The code shown prints out the VMIOF_DiskHandle associated with this InstanceData. The handle is
stored in the InstanceData in line 44.

n Lines 23-56 define the diskOpen() callback, SampleFilterDiskOpen(), which does the following:
n Lines 26-31 declare local variables, including a pointer to an InstanceData_t, called id. This topic
does not discuss the other variables, but may provide food for thought for items that are
appropriate to be local instead of in Filter-Private Data.

n Lines 34-41 attempt to allocate a new InstanceData_t structure from the heap, storing the result in
id. If the allocation fails, the code logs the failure and returns failure to the caller. If the allocation
succeeds, the code initializes the structure to all zeros.

n Lines 42-44 initialize some of id's members.

n Lines 45-52 attempt to create a timer that fires every 30 seconds, logging an error and returning
failure to the caller if the creation fails. Specifically, on line 46, the call passes: the interval
THIRTY_SECONDS in the first parameter; the callback TimerCallback in the second parameter; id in the
third parameter (this is the parameter to pass to TimerCallback()); the address of the timerHandlep
member of id in the fourth parameter. This handle is used to cancel/remove the timer in line 69.

n Line 53 associates id with handle so that it can be recalled in various other routines, such as on line
66.

n Lines 58-75 define the diskClose() callback, SampleFilterDiskClose(), which does the following:

n Line 66 retrieves the InstanceData associated with diskHandle, storing it in id. The code missing in
line 67 should check that id is non-NULL before proceeding.

n Line 69 cancels/removes the timer whose handle is id->timerHandlep, which was created in line 46.

n Line 72 frees the memory allocation for id.

Getting the UUID of a VMDK


The VMIOF_DiskUuidGet() utility function returns a unique identifier for the specified virtual disk. The UUID
is a random number, VMIOF_DISK_UUID_SIZE bytes long, that is generated when the disk is first created. The
UUID is not guaranteed to be globally unique, and it may change over the lifetime of the virtual disk.

NOTE vSphere keeps the UUID the same during move or rename operations. vSphere assigns a new UUID
to copied VMDKs. In the case of a linked clone or snapshot, vSphere keeps the same UUID. In some cases
(e.g., Windows VSS), the snapshot may be opened on the same VM, and will have the same UUID. The same
VMDK will never be opened twice simultaneously on the same VM, but two separate (but related) VMDKs
with the same UUID may be opened simultaneously. Combining the disk UUID with the VM's instance
UUID yields an identifier that may be sufficiently unique for logging and reporting.

The prototype of this function is:

VMIOF_Status VMIOF_DiskUuidGet(VMIOF_DiskHandle *handle,
                               uint8_t uuid[VMIOF_DISK_UUID_SIZE]);

The parameters are:

n VMIOF_DiskHandle *handle - This input parameter is the handle to the disk.

n uint8_t uuid[VMIOF_DISK_UUID_SIZE] - This output parameter is used to capture the UUID of the disk
upon successful return of VMIOF_DiskUuidGet().

NOTE VMIOF_DISK_UUID_SIZE is the number of bytes in the UUID and is defined as follows:

#define VMIOF_DISK_UUID_SIZE 16

The function returns the following status codes:

n VMIOF_SUCCESS - The function completed successfully. The uuid variable contains the UUID of the
VMDK.

n VMIOF_NOT_FOUND - The UUID of the disk is not available. All virtual disks have a valid UUID
when created. However, VMIOF_NOT_FOUND may be returned if the virtual disk descriptor file has been
modified or corrupted.

n VMIOF_NOT_SUPPORTED - The operation is not supported on this disk handle. This can happen when the
disk chain is only partially opened, that is, when only a subset of its links is open.

NOTE Upon successful execution of VMIOF_DiskUuidGet(), the UUID returned matches the UUID that is
available in the vSphere API under vim.vm.device.VirtualDisk.backing.uuid and also in the virtual
disk descriptor file with the key ddb.uuid, in the Disk Data Base section.

NOTE Usually the UUID doesn't change, however it can change if someone invokes the public vSphere
API to change it.

The following code snippet shows the use of the VMIOF_DiskUuidGet() function outlined above:

1. static inline VMIOF_Status
2. PrintDiskID(VMIOF_DiskHandle* handle)
3. {
4. uint8_t uuid[VMIOF_DISK_UUID_SIZE];
5. VMIOF_Status status = VMIOF_DiskUuidGet(handle, uuid);
6. int i = 0;

7. if (status != VMIOF_SUCCESS) {
8. LOG("ERROR: could not get a disk uuid for this filter\n");
9. return (status);
10. }
11. LOG("Disk UUID for this filter is:");
12. for (i = 0; i < VMIOF_DISK_UUID_SIZE; i++) {
13. LOG("%02x ", uuid[i]);
14. }
15. LOG("\n");
16. return (status);
17. }

The key sections of the code are as follows:

n Lines 1-3: The definition of the PrintDiskID() function begins.

n Line 4: Declare the local variable uuid to capture the UUID of the disk.

n Line 5: Invoke VMIOF_DiskUuidGet(), passing the disk handle and uuid. Here uuid
is an output parameter that is populated with the disk UUID upon successful return of
VMIOF_DiskUuidGet().

n Line 6: Declare the loop variable i and initialize it to zero.

n Lines 7-10: Upon unsuccessful return from VMIOF_DiskUuidGet(), log a message to convey that the code
could not get the disk UUID for this filter, and return the failure status.

n Lines 11-14: Upon successful return from VMIOF_DiskUuidGet(), log the disk UUID.
n This corresponds to the uuid field of the disk data base in the Disk Descriptor File of the virtual
disk to which your filter is attached. For example, line 13 logged "60 00 C2 98 e1 85 37 85-62 d7 25
0d fe 1e 6e 5e", which corresponded to the entry iofilter1.vmdk:ddb.uuid = "60 00 C2 98 e1 85 37
85-62 d7 25 0d fe 1e 6e 5e" in the iofilter*.vmdk file. This is the VMDK to which the IO Filter is
attached.

n Lines 15-16: Log a newline and return to the calling function.
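
Logging one byte per LOG() call may scatter the UUID across several log entries, so it can be convenient to build the whole string first. The following sketch formats a 16-byte UUID in the same layout as the ddb.uuid entry quoted above (eight bytes, a hyphen, eight bytes). FormatDiskUuid() and the DEMO_ constants are hypothetical names, and the sketch assumes the size equals VMIOF_DISK_UUID_SIZE (16); hex digits are emitted in lower case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_UUID_SIZE   16                    /* assumed equal to VMIOF_DISK_UUID_SIZE */
#define DEMO_UUID_STRLEN (DEMO_UUID_SIZE * 3)  /* 2 hex digits + separator or NUL per byte */

/*
 * Formats uuid as "xx xx xx xx xx xx xx xx-xx xx xx xx xx xx xx xx".
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int
FormatDiskUuid(const uint8_t uuid[DEMO_UUID_SIZE], char *out, size_t outLen)
{
   size_t pos = 0;
   int i;

   if (outLen < DEMO_UUID_STRLEN) {
      return -1;
   }
   for (i = 0; i < DEMO_UUID_SIZE; i++) {
      pos += (size_t)snprintf(out + pos, outLen - pos, "%02x", (unsigned)uuid[i]);
      if (i < DEMO_UUID_SIZE - 1) {
         out[pos++] = (i == 7) ? '-' : ' ';   /* hyphen between the two halves */
      }
   }
   out[pos] = '\0';
   return 0;
}
```

A caller would pass the buffer filled in by VMIOF_DiskUuidGet() and then emit the resulting string with a single log call.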

Getting the ESX Version Info

WHAT'S NEW In 60U2, we introduced the utility function VmkuserVersion_GetUniqueSystemVersion() to
allow partners to get the ESXi version information for the system.

The prototype of this function is:

VmkuserStatus_Code
VmkuserVersion_GetUniqueSystemVersion(
VmkuserVersion_UniqueSystemVersionInfo *versionInfo);

The parameters are:

n VmkuserVersion_UniqueSystemVersionInfo *versionInfo - This output parameter points to the version
information structure that the framework fills in.

The function returns the following status codes :


n VMK_OK — The function completed successfully.

n VMK_FAILURE — The function failed.

The structure of VmkuserVersion_UniqueSystemVersionInfo is defined as follows:

typedef struct VmkuserVersion_UniqueSystemVersionInfo {
   char product[VMKUSER_SYSVERSION_STRING_LEN];
   char productVersion[VMKUSER_SYSVERSION_STRING_LEN];
   char buildType[VMKUSER_SYSVERSION_STRING_LEN];
   char buildDate[VMKUSER_SYSVERSION_STRING_LEN];
   char buildTime[VMKUSER_SYSVERSION_STRING_LEN];
   char releaseUpdate[VMKUSER_SYSVERSION_STRING_LEN];
   char releasePatch[VMKUSER_SYSVERSION_STRING_LEN];
   unsigned int vmkernelBuild;
} VmkuserVersion_UniqueSystemVersionInfo;
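
As a rough illustration of how a filter might turn this structure into a single log line, consider the sketch below. DemoVersionInfo mirrors only a subset of the fields shown above, DEMO_STRING_LEN is an assumed value (the real VMKUSER_SYSVERSION_STRING_LEN may differ), and DescribeVersion() is a hypothetical helper, not part of the VAIO.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_STRING_LEN 64  /* assumed; the real VMKUSER_SYSVERSION_STRING_LEN may differ */

/* Stand-in mirroring a subset of the fields of the structure shown above. */
typedef struct DemoVersionInfo {
   char product[DEMO_STRING_LEN];
   char productVersion[DEMO_STRING_LEN];
   char releaseUpdate[DEMO_STRING_LEN];
   unsigned int vmkernelBuild;
} DemoVersionInfo;

/*
 * Writes "<product> <version> (update <U>, build <N>)" into out.
 * Returns 0 on success, -1 if the text did not fit in outLen bytes.
 */
static int
DescribeVersion(const DemoVersionInfo *v, char *out, size_t outLen)
{
   int n = snprintf(out, outLen, "%s %s (update %s, build %u)",
                    v->product, v->productVersion, v->releaseUpdate,
                    v->vmkernelBuild);

   return (n < 0 || (size_t)n >= outLen) ? -1 : 0;
}
```

In a real filter the structure would be filled in by VmkuserVersion_GetUniqueSystemVersion(), and the helper would be called only after checking that the function returned VMK_OK.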

Using the IO Filter Failure Reporting Functionality


Use VMIOF_FailureReportDisabled() to report catastrophic failure, for example if the SSD used in cache
solutions or the remote side of a replication solution becomes unreachable. Both LI and daemon code can
invoke this function.

Invoking this function causes the IO Filter Framework to generate a certain VMKEvent, which ESXi sends to
the vCenter Server, which in turn has two effects:

n It generates an alert, visible in the vSphere Web Client (VWC), that informs administrators of the issue

n It advises vCenter Server not to power on or provision VMs on the host from which the event was
received

This function has no effect, with respect to IO Filter Framework events, for the invoking VM or any other
VMs using this filter. It is up to each filter instance / daemon to decide how to proceed in the face of the
specific error it encounters. Consider the following examples:

n Suppose an administrator has configured two VMDKs (A and B) to be replicated to two different sites
rA and rB and that the VM with these VMDKs is currently running. Suppose the connection to rB is
dropped due to a cable cut, but not rA.

In this scenario

n The LI for B invokes this function, with the previously described effects, and then must decide
what to do with its pending IOs and how else to proceed. For example, it may:

n Track changed but un-replicated blocks in a bitmap in a sidecar

n Ask a daemon on another host in the cluster to attempt sending the replication IOs

n Crash the VM (very drastic action, probably more appropriate for caching errors than the one
described in this scenario).

n The LI for A continues without an issue

n Suppose a caching solution detects that an entire SSD it uses has died, and there are currently two
VMDKs (A and B) with the solution's filter attached in one or more running VMs. Call the LIs for these
VMDKs lA and lB. Next, suppose that the daemon for the solution controls the cache and detected the
issue. Finally, suppose that lA has IOs in flight to the cache but lB does not. In this scenario:

n The daemon invokes VMIOF_FailureReportDisabled(), and (probably) sends control messages to the
LIs.

n lA may wish to crash the VM to prevent it from assuming that its in-flight IOs succeeded, because
they did not.

n lB may just fail any future IOs (return a failure code for each diskIOStart() invocation), since its
VMDK is intact but it cannot perform any additional IOs. This allows the guest to give up, attempt
recovery, and so on.
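
The per-instance decision in the caching scenario can be expressed as a small policy function. This is one possible policy under the assumptions of that scenario, not a required behavior; LiAction and ChooseActionOnCacheLoss() are hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible reactions of a Library Instance (hypothetical names). */
typedef enum LiAction {
   LI_CONTINUE,          /* backing cache is healthy, nothing to do */
   LI_FAIL_FUTURE_IOS,   /* return a failure code from each new diskIOStart() */
   LI_CRASH_VM           /* in-flight IOs to the cache cannot be trusted */
} LiAction;

/*
 * Mirrors the lA / lB reasoning: an instance with IOs in flight to the dead
 * cache takes the drastic option, one without simply fails later IOs.
 */
static LiAction
ChooseActionOnCacheLoss(bool cacheLost, unsigned inflightCacheIOs)
{
   if (!cacheLost) {
      return LI_CONTINUE;
   }
   return (inflightCacheIOs > 0) ? LI_CRASH_VM : LI_FAIL_FUTURE_IOS;
}
```

Keeping the decision in one function makes it straightforward to swap in a gentler policy, such as tracking un-replicated blocks in a sidecar, without touching the IO path.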

Understanding VMIOF_FailureReportDisabled()
This utility function has the following prototype (in vmiof_failure.h):

static inline VMIOF_Status VMIOF_FailureReportDisabled(const char *reason);

The parameters to this function are:

n const char *reason — This input parameter is the string describing the reason for failure.

The function returns an error code describing the result of the operation :

n VMIOF_SUCCESS — The function succeeded and the IO Filter's operational state was successfully set to
disabled.

n VMIOF_FAILURE — The function was unable to set the IO Filter's operational state to disabled.

NOTE If you plan to use this API, please work with VMware to provide a link to your documentation so
that we can update our KB article accordingly.

Enabling a Failed IO Filter on a Host


Once someone remedies the issue causing the failure report, an administrator must manually enable that IO
Filter to undo the actions taken to disable it. To do this, run:

esxcli storage iofilter enable -f filtername

where filtername is the name of the IO Filter. Invoking this command sends a different VMKEvent than the one
described earlier, which ESXi sends to the vCenter Server, which in turn has two effects:

n Advises vCenter Server that it is now OK to power-on and provision VMs on this host

n Resolves the alert previously posted

You can also list all of the filters installed on the host with the following command:

esxcli storage iofilter list

Understanding VmkuserVob_Notify

WHAT'S NEW From version 60U2, VAIO provides this function to send a notification about a critical
problem, or some critical observation which may help in identifying a root cause of a problem. This call
should not be used for reporting general error conditions without specific, known solutions.

This utility function has the following prototype (in vmkuservob.h):

VmkuserStatus_Code VmkuserVob_Notify(VmkuserVob_Metadata *metadata, const char *fmt, ...);

The parameters to this function are:

n VmkuserVob_Metadata *metadata - Pointer to a VmkuserVob_Metadata instance that describes the observation.

n const char *fmt — format string.

The function returns an error code describing the result of the operation :

n VMK_OK — Function was successful.

n VMK_FAILURE — Function failed.

The structure VmkuserVob_Metadata has the following definition:

typedef struct VmkuserVob_Metadata {
   VmkuserVob_Urgency urgency;
   const char source[VMKUSERVOB_MAX_SOURCE_NAME];
   const char kbLinkUrl[VMKUSERVOB_MAX_KB_LINK_LEN];
} VmkuserVob_Metadata;

The members of this structure are:

n VmkuserVob_Urgency urgency - Urgency level of the observation, which can be
VMKUSERVOB_URGENCY_INFO, VMKUSERVOB_URGENCY_WARNING, or VMKUSERVOB_URGENCY_ERROR.

n const char source[VMKUSERVOB_MAX_SOURCE_NAME] - Application/Provider name to be associated with
the event. Use a descriptive module name for this member. VMKUSERVOB_MAX_SOURCE_NAME is currently
defined as 32.

n const char kbLinkUrl[VMKUSERVOB_MAX_KB_LINK_LEN] - Link to a knowledge base article that
describes the remedy steps or explains the observation. Providing a Knowledge Base URL
link is mandatory. It should be a fully resolved URL. VMKUSERVOB_MAX_KB_LINK_LEN is currently
defined as 512.
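
Because source and kbLinkUrl are const arrays, they have to be filled in when the structure is initialized rather than copied into afterwards. The sketch below shows that with a stand-in type: DemoVobMetadata, DemoUrgency, and MetadataLooksValid() are hypothetical, the sizes 32 and 512 are the values quoted above, the URL is a placeholder, and treating "fully resolved" as "starts with http:// or https://" is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_SOURCE_NAME 32    /* value the text gives for VMKUSERVOB_MAX_SOURCE_NAME */
#define DEMO_MAX_KB_LINK_LEN 512   /* value the text gives for VMKUSERVOB_MAX_KB_LINK_LEN */

typedef enum DemoUrgency {
   DEMO_URGENCY_INFO,
   DEMO_URGENCY_WARNING,
   DEMO_URGENCY_ERROR
} DemoUrgency;

/* Stand-in with the same member layout as the structure shown above. */
typedef struct DemoVobMetadata {
   DemoUrgency urgency;
   const char source[DEMO_MAX_SOURCE_NAME];
   const char kbLinkUrl[DEMO_MAX_KB_LINK_LEN];
} DemoVobMetadata;

/* const array members can only be set through an initializer. */
static const DemoVobMetadata cacheLostMeta = {
   .urgency   = DEMO_URGENCY_ERROR,
   .source    = "sampfilt-cache",
   .kbLinkUrl = "https://kb.example.com/sampfilt/cache-device-lost",
};

/* Returns 1 if both strings are terminated, non-empty, and the link is a URL. */
static int
MetadataLooksValid(const DemoVobMetadata *m)
{
   if (memchr(m->source, '\0', sizeof m->source) == NULL ||
       memchr(m->kbLinkUrl, '\0', sizeof m->kbLinkUrl) == NULL) {
      return 0;   /* a field filled the whole array without a terminator */
   }
   if (m->source[0] == '\0') {
      return 0;   /* no module name supplied */
   }
   return strncmp(m->kbLinkUrl, "http://", 7) == 0 ||
          strncmp(m->kbLinkUrl, "https://", 8) == 0;
}
```

A check of this kind can run once at filter load time, so that a missing knowledge base link is caught before the first notification is ever sent.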

Understanding and Processing Primary Disk Events

Understanding and Using a diskRequirements Callback


The diskRequirements callback function is invoked by the IO Filter Framework to query the filter's memory
requirements. The data supplied by the callback function represents the maximum possible value an
instance of the filter requires. The data returned by this callback must be accurate; otherwise, the filter
will fail when using the VMIOF_Heap* functions. The prototype of this function is as follows:

void (*diskRequirements)(VMIOF_DiskRequirements *requirements);

The parameters to this callback are:

n VMIOF_DiskRequirements *requirements - Object describing the memory requirements of the filter.
Filters are expected to fill in values for all attributes. The object is defined as:

typedef struct VMIOF_DiskRequirements {
   uint64_t requiredMemoryPerDiskMiB;
   uint64_t requiredMemoryPerIO;
   uint64_t requiredStaticMemory;
} VMIOF_DiskRequirements;

n The members of this structure are:

n uint64_t requiredMemoryPerDiskMiB - The memory required, in bytes, for every megabyte of
virtual disk.

n uint64_t requiredMemoryPerIO - The memory required, in bytes, per IO. Internally, the framework
computes the heap overhead as requiredMemoryPerIO * VMIOF_DiskMaxOutstandingIOsGet(). This value
can become relatively high for a large per-IO overhead, so you can instead decide to support
a smaller number of in-flight IOs and return VMIOF_NO_MEMORY when that limit is exceeded. In that
case, your computation should be: perIO = (real per IO overhead * own_max_supported_IOs /
VMIOF_DiskMaxOutstandingIOsGet());

n uint64_t requiredStaticMemory — The required memory amount should include all the memory
that is not accounted for by requiredMemoryPerDiskMiB or requiredMemoryPerIO, even if its total
amount may depend on a customizable variable through the filter policy.

This function returns void.
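
The scaled per-IO computation described for requiredMemoryPerIO can be worked through with concrete numbers. In the sketch below, ScaledPerIO() and CeilDiv() are hypothetical helpers, the division rounds up (an addition of this sketch, so that the reported value never undershoots), and the figures in the usage note are illustrative rather than real framework limits.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounded up; b must be greater than zero. */
static uint64_t
CeilDiv(uint64_t a, uint64_t b)
{
   return (a + b - 1) / b;
}

/*
 * Per-IO value to report when the filter supports only ownMaxIOs in-flight
 * IOs but the framework multiplies the reported value by frameworkMaxIOs.
 */
static uint64_t
ScaledPerIO(uint64_t realPerIO, uint64_t ownMaxIOs, uint64_t frameworkMaxIOs)
{
   return CeilDiv(realPerIO * ownMaxIOs, frameworkMaxIOs);
}
```

For example, with a real overhead of 4096 bytes per IO, an own limit of 64 IOs, and a framework maximum of 256, the filter would report 1024 bytes per IO. The framework would then reserve 1024 * 256 = 262144 bytes, which is exactly 4096 * 64.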

Example: Code example from countIO.c

The following code snippet provides an example of the proper use of the diskRequirements callback:

1 static void
2 TestDiskRequirements(VMIOF_DiskRequirements *requirements)
3 {
4 uint32_t maxIOs = VMIOF_DiskMaxOutstandingIOsGet();
5 VMIOF_HeapAllocation staticMemory[] = { /* size, alignment, # */
6 { sizeof(CountIOPrivateData), 0, 1 }, /* instance data */
7 { sizeof(CountIOSidecar),VMIOF_DISK_SIDECAR_ALIGN, 1}, /* sidecar data */

8 };
9 size_t countStatic = sizeof(staticMemory)/sizeof(VMIOF_HeapAllocation);
10 VMIOF_HeapAllocation perIO[] = { /* size, alignment, # */
11 { sizeof(CountIOWorkItem), 0, maxIOs },
12 };
13 size_t countPerIO = sizeof(perIO)/sizeof(VMIOF_HeapAllocation);
14
15 requirements->requiredMemoryPerDiskMiB = 0;
16 requirements->requiredMemoryPerIO =
17 (VMIOF_HeapEstimateRequiredSize(perIO, countPerIO) + maxIOs - 1) / maxIOs;
18
19 requirements->requiredStaticMemory = VMIOF_HeapEstimateRequiredSize(staticMemory,
20 countStatic);
21 }

The key sections of the code are as follows:

n Lines 1-13 : Declare and initialize local variables.

n Line 2: requirements is a pointer to the VMIOF_DiskRequirements structure that the framework
passes in. The function fills in this structure.

n Line 4: maxIOs holds the maximum number of outstanding IOs you will get from the framework.

n Lines 5-8 : Initialize the members of the staticMemory[] array.

n Line 7: Note that the sidecar data is aligned to VMIOF_DISK_SIDECAR_ALIGN.

n Line 9 : countStatic holds the number of elements in the staticMemory[] array.

n Lines 10-12: Initialize the members of the perIO[] array.

n Line 13: countPerIO holds the number of elements in the perIO[] array.

n Line 15: Set requiredMemoryPerDiskMiB to 0, since this filter has no per-megabyte disk requirements.

n Lines 16-17: Set requiredMemoryPerIO using the VMIOF_HeapEstimateRequiredSize() function, which
takes the proper alignment into account when sizing the heap. Because perIO[] describes maxIOs work
items, the estimate is divided by maxIOs, rounding up, to obtain a per-IO value.

n Lines 19-20: Set requiredStaticMemory using the VMIOF_HeapEstimateRequiredSize() function.

NOTE This function can be called at any point after the filter is loaded, and the code in this function
should not assume any state information. That is, the framework may query the requirements of the filter
even before initializing the filter instance or attaching the filter to a specific disk.

NOTE You don't need to account for System V shared memory segments or mmap with MAP_SHARED,
because that memory comes directly from the kernel. However, you need to account for the page table
overhead in diskRequirements.

When DRS tries to power on a VM, it checks the number reported by this callback. If that number is larger
than the remaining memory capacity of any host in the cluster, DRS fails to power on the VM. However, if a
user performs a manual power-on, ESX does not perform this admission check.
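
To get a feel for the size of the number involved, the three reported values can be combined into one estimate. The text only states how the per-IO part is multiplied; adding the static part and a per-MiB part scaled by disk size is an assumption of this sketch, and DemoRequirements and EstimateFilterMemory() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in with the same members as VMIOF_DiskRequirements. */
typedef struct DemoRequirements {
   uint64_t requiredMemoryPerDiskMiB;
   uint64_t requiredMemoryPerIO;
   uint64_t requiredStaticMemory;
} DemoRequirements;

/*
 * Assumed combination: static part, plus per-IO part times the maximum
 * number of outstanding IOs, plus per-MiB part times the disk size in MiB.
 */
static uint64_t
EstimateFilterMemory(const DemoRequirements *r, uint64_t diskSizeMiB,
                     uint64_t maxIOs)
{
   return r->requiredStaticMemory
        + r->requiredMemoryPerIO * maxIOs
        + r->requiredMemoryPerDiskMiB * diskSizeMiB;
}
```

With 64 KiB of static memory, 1024 bytes per IO at 256 outstanding IOs, and 8 bytes per MiB on a 10 GiB disk, this gives 65536 + 262144 + 81920 = 409600 bytes.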

Understanding and Processing DiskAttach Events


As discussed in section “Understanding and Defining diskAttach and diskDetach Callbacks,” on page 103,
the IO Filter Framework invokes a filter's diskAttach callback when an administrator associates said filter
with a VMDK. In this function, add code to do any filter-specific initialization, including creating and
initializing any sidecar files for the VMDK.

NOTE You can only create sidecar files from the context of a diskAttach function and diskPropertiesSet
function.

For details on sidecar-related functions, see “Using Sidecars Functions in Library Code to Keep Persistent
Per-VMDK Meta Data,” on page 137.

Recall that data written to a sidecar file must be aligned on a boundary specified by the
VMIOF_DISK_SIDECAR_ALIGN macro. Further, recall that the VAIO provides the VMIOF_HeapAllocateAligned()
function to allocate aligned memory from the heap. Thus, if your diskAttach() writes initial data to sidecar
files, you must first create a heap and then allocate aligned memory for this purpose. The size of the heap
needed in a diskAttach() is different from that needed for processing IOs once the filter is attached. Be sure
to invoke VMIOF_HeapEstimateRequiredSize() in your diskAttach() before calling VMIOF_HeapCreate(), and to
create the heap with just enough size to perform the diskAttach(). For details on using these heap
functions, see “Managing Memory in an IO Filter Solution,” on page 130.

At a minimum, consider storing the following data in sidecars associated with a VMDK:

n Disk parameters, for example the disk size. You can update this information in the callbacks that
the framework invokes when it changes, such as diskGrow, diskCollapse, diskSnapshot, etc.

n Filter properties. As with disk parameters, you can update property values in your
VMIOF_DiskPropertiesSet callback.

n If you plan on supporting the diskRelease callback, have a separate sidecar that contains the pathname
and, optionally, an IP address and port number. When the diskOpen callback runs in the context of a
Daemon, it will fill in these latter fields with the IP information. This course refers to this as the owner
sidecar. Initialize the IP and port number to zero. If you don't plan to support diskRelease, put the
pathname in some other sidecar.
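
One way to picture the owner sidecar described in the last item is as a small fixed-layout record. The sketch below is only an illustration: OwnerSidecar, its field names, DEMO_PATH_MAX, and the two helpers are hypothetical, and a real filter would place the record in a buffer aligned to VMIOF_DISK_SIDECAR_ALIGN before writing it to the sidecar.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PATH_MAX 256   /* assumed limit for this sketch */

/* Hypothetical layout of the owner sidecar contents. */
typedef struct OwnerSidecar {
   char     path[DEMO_PATH_MAX];  /* pathname recorded at attach time */
   uint32_t ownerIp;              /* 0 until a daemon opens the disk */
   uint16_t ownerPort;            /* 0 until a daemon opens the disk */
} OwnerSidecar;

/* diskAttach-style initialization: record the path, zero IP and port. */
static int
OwnerSidecarInit(OwnerSidecar *sc, const char *path)
{
   if (strlen(path) >= sizeof sc->path) {
      return -1;                  /* path would not fit with its terminator */
   }
   memset(sc, 0, sizeof *sc);
   strcpy(sc->path, path);
   return 0;
}

/* Non-zero once a daemon has filled in both the IP address and the port. */
static int
OwnerSidecarIsClaimed(const OwnerSidecar *sc)
{
   return sc->ownerIp != 0 && sc->ownerPort != 0;
}
```

Initializing the IP address and port to zero gives later code a simple test for whether a daemon has recorded itself as the owner.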

The IO Filters Framework passes this information into the diskAttach() function using parameters as
follows:

VMIOF_Status (*diskAttach)(VMIOF_DiskHandle *handle,
                           const VMIOF_DiskInfo *info,
                           const VMIOF_DiskFilterProperty *const *properties);

Parameters:

n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk and is valid only for the
filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskInfo *info - This is a pointer to a structure of type VMIOF_DiskInfo that describes disk
information such as the disk capacity, the number of links that compose the disk, and the files that
compose the disk.

n const VMIOF_DiskFilterProperty *const *properties - This is a NULL-terminated array of
VMIOF_DiskFilterProperty pointers. It may be NULL if no properties are present. Note that
VMIOF_DiskFilterProperty is a structure that describes a disk filter property attribute, such as its
name and value.
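
Walking the properties array means handling both cases described above: the pointer itself may be NULL, and otherwise the array ends at the first NULL element. The following sketch shows that loop with a stand-in type; DemoProperty, its name and value fields, and FindPropertyValue() are assumptions made for illustration, since the members of VMIOF_DiskFilterProperty are not listed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a filter property (name/value pair). */
typedef struct DemoProperty {
   const char *name;
   const char *value;
} DemoProperty;

/*
 * Returns the value of the named property, or NULL if the array is NULL
 * or contains no property with that name.
 */
static const char *
FindPropertyValue(const DemoProperty *const *properties, const char *name)
{
   size_t i;

   if (properties == NULL) {
      return NULL;                       /* no properties were supplied */
   }
   for (i = 0; properties[i] != NULL; i++) {
      if (strcmp(properties[i]->name, name) == 0) {
         return properties[i]->value;
      }
   }
   return NULL;
}
```

A diskAttach implementation could use a lookup of this kind to read policy values before deciding how large its sidecars need to be.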

Return Value :

n Every callback function should return a valid VMIOF_Status (see “Understanding
VMIOF_Status Results for Functions in the VAIO,” on page 102).

The following code snippet shows the activities associated with the diskAttach callback:

1. VMIOF_Status
2. SampleFilterDiskAttach(VMIOF_DiskHandle *handle,
3. const VMIOF_DiskInfo *diskInfo,
4. const VMIOF_DiskFilterProperty *const *properties)
5. {
6. VMIOF_DiskSidecar *schp;
7. VMIOF_Status res;
8. SampfiltSidecar_t *sidecarp;
9. size_t heapSizeEstimate;
10. VMIOF_HeapHandle *heapHandlep;
11. VMIOF_HeapAllocation allocations[] = {
12. { MY_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN, 1 }, /* sidecar */
13. };
14. size_t numAllocations = sizeof(allocations)/sizeof(VMIOF_HeapAllocation);
15. ...
16. heapSizeEstimate = VMIOF_HeapEstimateRequiredSize( allocations, numAllocations);
17.
18. /* Heap Creation */
19. res = VMIOF_HeapCreate(heapSizeEstimate, &heapHandlep);
20. if( VMIOF_SUCCESS != res) {
21. /* couldn't create the heap */
22. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): error creating" " filter heap (%d)\n",res);
23. return res;
24. } else {
25. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): Created new heap " "for disk\n");
26. }
27. ...
28. /*Sidecar Creation */
29. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarCreate(handle, MY_SIDECAR_KEY,
30. MY_SIDECAR_SIZE, &schp))) {
31. /* well, that didn't work either. log that */
32. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): could not create sidecar (%d).\n",res);
33. return res;
34. } else {
35. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): sidecar created (%p)\n",schp);
36. }
37. ...
38. /* map buffer for sidecar from heap aligned as required */
39. if(NULL == (sidecarp = (SampfiltSidecar_t *)VMIOF_HeapAllocateAligned(
40. heapHandlep, MY_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN))) {
41. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): could not "
42. "allocate sidecar buffer\n");
43. ...
44. } else {
45. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): sidecarp ="
46. " %p\n",sidecarp);

47. }
48. ...
49. /* Sidecar Write */
50. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarWrite(schp,
51. sidecarp, MY_SIDECAR_SIZE, (uint64_t)0))) {
52. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): writing initial "
53. "sidecar failed (%d)\n",res);
54. ...
55. return res ;
56. } else { /* it worked */
57. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): initial sidecar"
58. " written\n");
59. }
60. ...
61. /* Close the sidecar */
62. if( schp && (VMIOF_SUCCESS != (res = VMIOF_DiskSidecarClose(schp)))) {
63. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): closing sidecar "
64. "failed (%d)\n",res);
65. ...
66. return res;
67. } else { /* it worked */
68. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskAttach): sidecar closed\n");
69. } /* if sidecar close */
70.
71. ...
72.
73. /* Destroy the heap */
74. VMIOF_HeapDestroy(heapHandlep);
75. ...

The key sections of code are as follows:

n Lines 1-5 : Defines SampleFilterDiskAttach() function.

n Lines 6-14 : Declare and initialize the local variables needed for heap creation and sidecar creation,
and the variable res that holds the return values of VAIO function calls.

n Lines 18-24 : Create the heap via VMIOF_HeapCreate(). If this fails, log a message indicating the
failure and return the failure status via res.

n Lines 24-26 : Upon successful creation of the heap, log an appropriate message.

n Lines 28-33 : Create your sidecar using VMIOF_DiskSidecarCreate(). If not successful, log appropriate
message indicating failure and return the failure status via res.

n Lines 34-36 : Log appropriate message indicating successful creation of sidecar.

n Lines 38-43 : Map a buffer from the heap you created for your sidecar file using
VMIOF_HeapAllocateAligned(). Note that data written to a sidecar file must be aligned on a boundary
specified by the VMIOF_DISK_SIDECAR_ALIGN macro. Initialization of different members of this structure is
not shown specifically in this code sample.

n Lines 44-47 : Log appropriate message upon successful mapping of the buffer from the heap via
VMIOF_HeapAllocateAligned().

n Lines 49-55 : Write to the sidecar using VMIOF_DiskSidecarWrite(). In case of a failure, log an appropriate
error message and return the failure status via res.

n Lines 56-59 : Log message indicating successful write to sidecar.


n Lines 61-66 : Now that you are done with your sidecar, close it using the VMIOF_DiskSidecarClose()
function. In case of failure, log an appropriate message and return the failure status via res.

n Lines 67-69 : Log message indicating successful sidecar close operation.

n Lines 73-74 : Destroy the heap using VMIOF_HeapDestroy().
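
The allocation table and the alignment requirement discussed above can be illustrated without the VAIO headers. The following is a minimal sketch under stated assumptions: SketchAllocation is a hypothetical mirror of the { size, alignment, count } rows in the sample, SKETCH_SIDECAR_ALIGN is a placeholder value standing in for VMIOF_DISK_SIDECAR_ALIGN, and payload_lower_bound() only sums the payload bytes. It is not the algorithm used by VMIOF_HeapEstimateRequiredSize(), which may add per-allocation overhead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: placeholder value. Real code uses VMIOF_DISK_SIDECAR_ALIGN. */
#define SKETCH_SIDECAR_ALIGN 4096u

/* Element count of a true array (not a pointer). */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical mirror of one { size, alignment, count } row. */
typedef struct SketchAllocation {
    size_t size;
    size_t align;   /* 0 means no alignment requirement */
    size_t count;
} SketchAllocation;

/* Round n up to the next multiple of align (align == 0 leaves n unchanged). */
size_t round_up(size_t n, size_t align)
{
    if (align == 0) {
        return n;
    }
    return ((n + align - 1) / align) * align;
}

/* True if p satisfies the alignment (align == 0 accepts any pointer). */
bool is_aligned(const void *p, size_t align)
{
    return align == 0 || ((uintptr_t)p % align) == 0;
}

/* Payload bytes only: a lower bound, not the framework's estimate. */
size_t payload_lower_bound(const SketchAllocation *rows, size_t numRows)
{
    size_t total = 0;
    for (size_t i = 0; i < numRows; i++) {
        total += rows[i].count * round_up(rows[i].size, rows[i].align);
    }
    return total;
}
```

Dividing by the size of the first element, as ARRAY_COUNT() does, keeps the count correct even if the element type of the table changes later.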

Understanding and Processing DiskDetach Events


As discussed in section “Understanding and Defining diskAttach and diskDetach Callbacks,” on page 103,
the IO Filter Framework invokes a filter's diskDetach() callback when an administrator disassociates the
filter from a VMDK. In your diskDetach() callback function, release any resources that your IO Filter
holds and no longer needs. For example, remove the sidecar
associated with the VMDK as part of diskDetach(). This is done by invoking the VMIOF_DiskSidecarDelete()
function. Refer to section “Using Sidecars Functions in Library Code to Keep Persistent Per-VMDK
Meta Data,” on page 137 for details on sidecars.

The prototype for this callback is:

VMIOF_Status (*diskDetach)(VMIOF_DiskHandle *handle,
                           const VMIOF_DiskDetachInfo *info);

The parameters to this callback are:

n VMIOF_DiskHandle *handle — An opaque handle to the virtual disk from which the IO Filter is being
detached

n const VMIOF_DiskDetachInfo *info— Information about the detach event, defined as:

typedef struct VMIOF_DiskDetachInfo {
    VMIOF_DiskDetachFlags detachFlags;
    VMIOF_DiskOpProgressFunc progressFunc;
    VMIOF_DiskOpCompletionFunc completionFunc;
} VMIOF_DiskDetachInfo;

The members of this structure are:

n VMIOF_DiskDetachFlags detachFlags - Only one flag value is currently defined:
VMIOF_DISK_DETACH_DELETE. The IO Filter Framework sets this flag when the vdisk from
which the filter is being detached is also being deleted. One special case is a diskCollapse.
In that case, the IO Filter Framework performs a series of diskAttach / diskDetach callbacks for
each delta disk, setting the VMIOF_DISK_DETACH_DELETE flag for the delta disks that are deleted as
part of the collapse.

n VMIOF_DiskOpProgressFunc progressFunc - Call this function at least every 10 seconds while the
callback does its work, whether synchronously or asynchronously.

n VMIOF_DiskOpCompletionFunc completionFunc - Call this function when the asynchronous work is
completed, if the callback previously returned VMIOF_ASYNC. This tells the IO Filter
Framework that the callback's work is done and whether it succeeded.

This callback can return these values:

n VMIOF_SUCCESS — The callback completed its work successfully

n VMIOF_ASYNC — The callback is continuing its work asynchronously. It must call completionFunc to
indicate when it is done.

n Any other value indicates that the detach failed.
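
The requirement to call progressFunc at least every 10 seconds can be met by pacing reports between units of detach work. The following is a minimal sketch, assuming a 5-second interval to stay well inside the limit. ProgressPacer, ReportFn, and pacer_tick() are hypothetical names introduced for illustration; the real argument list of VMIOF_DiskOpProgressFunc is not shown in this section, so the sketch wraps it behind a generic callback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: report well inside the 10-second limit described above. */
#define PROGRESS_INTERVAL_SEC 5u

/* Hypothetical pacing state; not part of the VAIO headers. */
typedef struct ProgressPacer {
    uint64_t lastReportSec;   /* time of the last progress report */
    unsigned reportsSent;     /* number of reports issued so far */
} ProgressPacer;

/* Hypothetical wrapper around the call to info->progressFunc. */
typedef void (*ReportFn)(void *ctx);

/*
 * Call between units of detach work. Issues a report when at least
 * PROGRESS_INTERVAL_SEC seconds have passed since the previous one.
 * Returns true if a report was issued.
 */
bool pacer_tick(ProgressPacer *p, uint64_t nowSec, ReportFn report, void *ctx)
{
    if (nowSec - p->lastReportSec < PROGRESS_INTERVAL_SEC) {
        return false;
    }
    if (report != NULL) {
        report(ctx);
    }
    p->lastReportSec = nowSec;
    p->reportsSent++;
    return true;
}
```

Each unit of work between two calls to pacer_tick() must itself be short, otherwise the interval between reports can still exceed the limit.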


The following code snippet shows the activities associated with the diskDetach() callback:

1. VMIOF_Status SampleFilterDiskDetach(VMIOF_DiskHandle *handle,
2. const VMIOF_DiskDetachInfo *info)
3. {
4. VMIOF_Status res;
5. InstanceData_t *id = (InstanceData_t *)VMIOF_DiskFilterPrivateDataGet(handle);
6. ....
7. /* the sidecar may be opened. Attempt to close it first */
8. if(id->mySidecarHandlep) {
9. (void)VMIOF_DiskSidecarClose(id->mySidecarHandlep);
10. id->mySidecarHandlep = NULL;
11. }
12. /* destroy the sidecar */
13. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarDelete(handle,
14. MY_SIDECAR_KEY))) {
15. /* well, that didn't work either. log that */
16. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskDetach): could not delete"
17. " sidecar (%d).",res);
18. return res;
19. } else {
20. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskDetach): deleted sidecar\n");
21. }
22. return VMIOF_SUCCESS;
23. }

The key sections of code are as follows:

n Lines 1-3 : Define SampleFilterDiskDetach function. The Library Instance defines this function as the
entry point for the diskDetach event.

n Line 4 : Variable res declared to hold the return value of type VMIOF_Status (Refer to section
“Understanding VMIOF_Status Results for Functions in the VAIO,” on page 102)

n Line 5 : Retrieve the private instance data that your IO Filter stored for this disk via
VMIOF_DiskFilterPrivateDataGet() and keep it in a pointer to a structure of type InstanceData_t. You
later use this pointer, id, to reference the sidecar that you first close and then delete as part of
the diskDetach operation.

n Lines 6-11 : If the sidecar that you intend to delete is still open, close it first. This is done by
invoking VMIOF_DiskSidecarClose(). Note that you reference the sidecar to be closed via id.

n Lines 12-18 : After ensuring that you closed the sidecar in the previous step, you now remove it by
invoking VMIOF_DiskSidecarDelete(). You accept the return value of this function in variable res. If not
successful, you log an appropriate message and return failure via res.

n Lines 19-23 : Upon successfully deleting the sidecar, log a message indicating that the sidecar was
deleted, and return VMIOF_SUCCESS.

Understanding and Processing diskOpen Events


As discussed in section “Understanding and Defining diskOpen and diskClose Callbacks,” on page 104, the
IO Filter Framework invokes a filter's diskOpen() callback function when a cartel attempts to open a VMDK.
Recall that the prototype of this function is:

VMIOF_Status (*diskOpen)(VMIOF_DiskHandle *handle,
                         const VMIOF_DiskInfo *info);


The VMIOF_DiskFlags member of the VMIOF_DiskInfo structure specifies the mode in which the disk is
opened. During open, check the disk size and free any resources in excess of what the current disk
size requires. Excess resources can exist when the filter reserved resources during a disk grow
operation that subsequently failed.

Following are some of the operations that are done in the context of diskOpen():

1 Determine the size of heap that you need to create using VMIOF_HeapEstimateRequiredSize().

2 Create the heap of the required size as determined above, using VMIOF_HeapCreate(). This heap is used
for all your dynamic memory allocation requirements like creating the instance data structure for your
IO Filter and mapping buffer for your sidecar.

NOTE A filter instance MUST create its own heap, and you may not share a heap with another filter
instance.

3 Allocate memory for your instance data structure from the heap you created using
VMIOF_HeapAllocate(). If you are unable to do this, return with VMIOF_NO_MEMORY after invoking
VMIOF_HeapDestroy() on the heap created in Step 2. Upon successful allocation of the instance data
structure, you initialize its members to appropriate values. Remember that the instance data should
contain a list of IOs owned by the instance. Ensure that you initialize the list to an empty state.

4 If your LI needs to connect with the filter's Daemon component, establish a UNIX socket connection
between them. Using a UNIX socket allows you to pass file descriptors (such as a crossFD, discussed in
the next item) between cartels using control messages.

NOTE It is recommended that the daemon in your IO Filter Solution use SSL to establish secure
connections with its LIs and off-host peer daemons, and that it incorporate additional measures to authenticate
such connections, such as magic numbers.

5 If you want the Daemon to share memory using the VAIO crossFD utility functions, use the appropriate
functions to create a crossFD and manage the address space shared between the LI and Daemon. When
the LI does its initial handshake with the Daemon, pass the crossFD's file descriptor to the Daemon in
the control part of a POSIX msghdr structure.

6 Allocate memory from the heap to process IOs to/from your sidecar(s) using
VMIOF_HeapAllocateAligned().

NOTE You must have created the sidecar(s) using VMIOF_DiskSidecarCreate() in your diskAttach() or
diskPropertiesSet(). You cannot create sidecar files in the context of diskOpen().

7 Open the sidecar(s) using VMIOF_DiskSidecarOpen().

8 Upon successful open of the sidecar(s), read the values into structures allocated from a heap, caching
the values in the VMDK's instance data. Remember that you may have multiple sidecar(s) for a filter.

9 If the open is happening after a failed diskGrow event, the code for processing the grow may have
allocated resources that are no longer needed. To detect this, after a diskGrow, determine the size of the
disk (using information in the info parameter) and compare it with the disk size-related resources used
by your filter. If appropriate, free any unused resources.

10 If you need to react to a storage migration, determine whether the pathname passed into open is the
same as the one last written into the sidecar in which you keep the VMDK's pathname. After processing
the migration, update the pathname in the sidecar.

11 If you want to implement the diskRelease callback:

a Determine if your open is happening because of the filter Daemon's call to
VMIOF_VirtualDiskOpen(). How you do this is up to you, but examples include setting an
environment variable in the filter and then checking for it in the library code, or checking the name of the
invoking program.

b If so:

1 Write the VMDK's pathname, the IP address of this host, and the port number on which the
Daemon is listening (if the port is not static) into the ownership sidecar.

c Else:

1 Write the VMDK's pathname, but zeros into the IP address fields.

d Close the ownership sidecar. It is VITAL that you keep this sidecar closed except during
diskAttach, diskOpen, and diskClose callbacks. Otherwise, you can't implement diskRelease.

12 Create a workgroup via VMIOF_WorkGroupAlloc() to which you can later submit workers.

NOTE This callback must complete synchronously. Hence, if there is any long running task that needs to be
done as a result of opening the disk, it must be done outside the context of this callback.
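
Steps 9 and 10 above reduce to two comparisons against state that the filter keeps in a sidecar. The following is a minimal sketch under stated assumptions: OpenState is a hypothetical structure holding values cached from a sidecar, the 1 MiB tracking granularity and 256-byte path field are invented for the example, and the caller passes in the capacity and pathname it obtained from the info parameter.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TRACK_BLOCK_SIZE (1024u * 1024u)  /* assumption: 1 MiB tracking granularity */
#define SKETCH_PATH_LEN 256               /* assumption: size of the sidecar path field */

/* Hypothetical values cached from a sidecar at open time. */
typedef struct OpenState {
    uint64_t trackedBlocks;           /* blocks the filter holds resources for */
    char lastPath[SKETCH_PATH_LEN];   /* pathname written at the previous open */
} OpenState;

/* Number of tracking blocks needed to cover a disk of the given capacity. */
uint64_t blocks_for_capacity(uint64_t capacityBytes)
{
    return capacityBytes / TRACK_BLOCK_SIZE
         + (capacityBytes % TRACK_BLOCK_SIZE != 0 ? 1 : 0);
}

/* Step 9: drop resources beyond the current capacity. Returns blocks released. */
uint64_t trim_after_failed_grow(OpenState *st, uint64_t capacityBytes)
{
    uint64_t needed = blocks_for_capacity(capacityBytes);
    uint64_t excess = 0;

    if (st->trackedBlocks > needed) {
        excess = st->trackedBlocks - needed;
        st->trackedBlocks = needed;
    }
    return excess;
}

/* Step 10: detect a changed pathname and remember the new one. */
bool path_changed_since_last_open(OpenState *st, const char *currentPath)
{
    if (strncmp(st->lastPath, currentPath, SKETCH_PATH_LEN) == 0) {
        return false;
    }
    strncpy(st->lastPath, currentPath, SKETCH_PATH_LEN - 1);
    st->lastPath[SKETCH_PATH_LEN - 1] = '\0';
    return true;
}
```

In a real filter, both functions would be followed by a sidecar write so that the trimmed block count and the new pathname persist. Because diskOpen() must complete synchronously, any lengthy migration processing belongs outside the callback.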

The prototype of the diskOpen() callback is as follows:

VMIOF_Status (*diskOpen)(VMIOF_DiskHandle *handle,
                         const VMIOF_DiskInfo *info);

The parameters to the callback are:

n VMIOF_DiskHandle *handle — An opaque handle to the disk being opened.

n const VMIOF_DiskInfo *info - This structure describes the disk being opened as follows:

typedef struct VMIOF_DiskInfo {
    uint64_t capacity;
    uint32_t linksInChain;
    const char *const *filesInChain;
    VMIOF_DiskFlags diskFlags;
} VMIOF_DiskInfo;

The members of this structure are:


n uint64_t capacity — The size of the disk, in bytes

n uint32_t linksInChain — The number of elements in the array pointed to by filesInChain

n const char *const *filesInChain - An array of pointers to strings with the absolute path names
of the base and delta disks of the virtual disk (see “Understanding and Processing Snapshot-related
Events,” on page 192). The most recent delta disk is the first in the array, while the base disk
is the last in the array. Logically, this can be thought of as:

const char *const filesInChain[]

n VMIOF_DiskFlags diskFlags — Zero or more of the following flags to indicate how the disk is being
opened:

n VMIOF_DISK_NO_IO — The filter is not allowed to do IO to the virtual disk, but can perform IO
to its sidecar file(s) and to cache files

n VMIOF_DISK_RO — The virtual disk is read-only. The filter cannot perform writes to the virtual
disk or its sidecar file(s)

n VMIOF_DISK_CREATE — The virtual disk is newly created (this is the first open on the disk)

n VMIOF_DISK_SHARED — The virtual disk can have multiple writers on the disk at the same time.
This can occur if the VMDK's Sharing property is set to multi-writer or if the vHBA to which
the VMDK is attached has its SCSI Bus Sharing property set to Virtual or Physical.

n VMIOF_DISK_SYNC — The virtual disk cannot be made dirty by the filter (any caches should be
write through)
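
The flag descriptions above translate into a few permission checks that a filter might make once at open time. The following is a minimal sketch. The SK_DISK_* bit values are invented for illustration because this section does not list the real values of the VMIOF_DISK_* flags; production code should test the constants from the VAIO headers instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit values invented for this sketch, not taken from VAIO headers. */
enum {
    SK_DISK_NO_IO  = 1 << 0,  /* stands in for VMIOF_DISK_NO_IO  */
    SK_DISK_RO     = 1 << 1,  /* stands in for VMIOF_DISK_RO     */
    SK_DISK_CREATE = 1 << 2,  /* stands in for VMIOF_DISK_CREATE */
    SK_DISK_SHARED = 1 << 3,  /* stands in for VMIOF_DISK_SHARED */
    SK_DISK_SYNC   = 1 << 4   /* stands in for VMIOF_DISK_SYNC   */
};

/* Writes to the virtual disk are ruled out by either NO_IO or RO. */
bool may_write_vdisk(uint32_t diskFlags)
{
    return (diskFlags & (SK_DISK_NO_IO | SK_DISK_RO)) == 0;
}

/* Reads from the virtual disk are ruled out only by NO_IO. */
bool may_read_vdisk(uint32_t diskFlags)
{
    return (diskFlags & SK_DISK_NO_IO) == 0;
}

/* Sidecar writes remain possible under NO_IO but not under RO. */
bool may_write_sidecar(uint32_t diskFlags)
{
    return (diskFlags & SK_DISK_RO) == 0;
}

/* SYNC means the filter must not leave the disk dirty: caches write through. */
bool must_write_through(uint32_t diskFlags)
{
    return (diskFlags & SK_DISK_SYNC) != 0;
}
```

A filter that updates an open counter in its sidecar during diskOpen(), as the sample later in this section does, would check the equivalent of may_write_sidecar() before attempting the write.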


The values to return from this callback are:


n VMIOF_SUCCESS — Your code succeeded in processing the open event

n Any other value indicates your code failed to process the open event

The following code snippet provides a brief example of processing a diskOpen event:

1. VMIOF_Status SampleFilterDiskOpen(VMIOF_DiskHandle *diskHandle,
const VMIOF_DiskInfo *diskInfo)
2. {
3. VMIOF_Status res;
4. size_t heapSizeEstimate;
5. InstanceData_t *id;
6. ....
7. VMIOF_HeapAllocation allocations[] = {
8. { sizeof(InstanceData_t), 0, 1 }, /* instance data */
9. { MY_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN, 1 }, /* sidecar */
10. { sizeof(NoopfiltIOXact_t), 0, MAX_OUTSTANDING_IOTS}, /* io transactions */
11. };
12. size_t numAllocations = sizeof(allocations)/sizeof(VMIOF_HeapAllocation);
13. VMIOF_HeapHandle *instanceHeapHandlep;
14. heapSizeEstimate = VMIOF_HeapEstimateRequiredSize( allocations, numAllocations);
15.
16. if( VMIOF_SUCCESS != (res = VMIOF_HeapCreate(heapSizeEstimate, &instanceHeapHandlep))) {
17. /* couldn't create the heap */
18. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): error creating filter heap (%d)\n",res);
19. return res;
20. } else {
21. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): Created new heap " "for disk\n");
22. }
23. id = VMIOF_HeapAllocate(instanceHeapHandlep, sizeof(InstanceData_t));
24. if(!id) {
25. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): Could not allocate "
26. "instance id. Failing the open on\n");
27. VMIOF_HeapDestroy(instanceHeapHandlep); return VMIOF_NO_MEMORY; /* destroy the heap first, per Step 3 */
28. } else {
29. bzero(id,sizeof(*id)); /* just to be sure */
30. /* the following buffer initialization must be done before connecting to daemon */
31. *(int *)id->buffer = (int)CROSSFD_TEST_MAGIC;
32. id->buffer[(sizeof(id->buffer))-1] = 0; /* force termination */
33. VMIOF_Log(VMIOF_LOG_ERROR,"address of id->buffer=%p\n",id->buffer);
34. }
35. /* try to connect to filter's daemon */
36. if(VMIOF_SUCCESS != (res = SampleFIlterConnectToDaemon(id))) {
37. VMIOF_HeapFree(instanceHeapHandlep, id); /* id->heapHandlep is not set until line 41 */
38. VMIOF_HeapDestroy(instanceHeapHandlep);
39. return res;
40. }
41. id->heapHandlep = instanceHeapHandlep;
42. VMIOF_DiskFilterPrivateDataSet(diskHandle, (void *)id);
43. pthread_cond_init(&id->snapCondVar, NULL);
44. pthread_mutex_init(&id->snapMutex, NULL);
45. pthread_cond_init(&id->delayCondVar, NULL);
46. pthread_mutex_init(&id->delayMutex, NULL);
47. pthread_mutex_init(&id->transListMutex, NULL);
48. id->handle = diskHandle;


49. id->transListCount=0;
50. if(VMIOF_SUCCESS != (res = SampleFilterCrossfdSetup(id))) {
51. /* cleanup */
52. close(id->daemonSockFD);
53. VMIOF_HeapFree(id->heapHandlep, id);
54. VMIOF_HeapDestroy(instanceHeapHandlep);
55. return res;
56. }
57. /* map buffer for sidecar from heap aligned as required */
58. if(NULL == (id->mySidecarp = (SampfiltSidecar_t *)VMIOF_HeapAllocateAligned(
59. id->heapHandlep, MY_SIDECAR_SIZE, VMIOF_DISK_SIDECAR_ALIGN))) {
60. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): could not allocate sidecar buffer\n");
61. close(id->daemonSockFD);
62. VMIOF_HeapFree(id->heapHandlep, id);
63. VMIOF_HeapDestroy(instanceHeapHandlep);
64. return VMIOF_NO_MEMORY;
65. } else {
66. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): mySidecarp = %p\n",id->mySidecarp);
67. }
68. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarOpen(diskHandle, MY_SIDECAR_KEY, &id->mySidecarHandlep))) {
69. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): no sidecar found on open (%d)...\n",res);
70. /* NOTE: on failure mySidecarHandlep is left unchanged. That is, it should be NULL */
71. } else { /* open succeeded! */
72. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): sidecar opened"
73. " handlep=%p\n",id->mySidecarHandlep);
74. /* go read the sidecar data */
75. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarRead(id->mySidecarHandlep,
76. id->mySidecarp, MY_SIDECAR_SIZE, (uint64_t)0))) {
77. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): reading sidecar "
78. "data failed (%d), going to recreate it\n", res);
79. /* for now, just recreate the initial data */
80. bzero(id->mySidecarp,MY_SIDECAR_SIZE);
81. strncpy(id->mySidecarp->signature, "MySidecarSignature", MY_SIDECAR_SIG_SIZE-1);
82. id->mySidecarp->open_count = 1; /* we've had an open */
83. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarWrite(id->mySidecarHandlep,
84. id->mySidecarp, MY_SIDECAR_SIZE, (uint64_t)0))) {
85. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): re-writing initial sidecar failed (%d)\n",res);
86. } else { /* reinitializing write worked */
87. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): re-initial sidecar written\n");
88. }
89. } else { /* read succeeded */
90. /* add one to 'open_count' */
91. id->mySidecarp->open_count++;
92. /* now write it out */
93. if( VMIOF_SUCCESS != (res = VMIOF_DiskSidecarWrite(id->mySidecarHandlep,
94. id->mySidecarp, MY_SIDECAR_SIZE, (uint64_t)0))) {
95. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): writing updated sidecar failed (%d)\n",res);
96. } else { /* it worked */
97. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): sidecar open_count updated\n");
98.
99. } /* if write update */
100. } /* if read sidecar */
101. } /* if can't open it */
102. /* create a workgroup (work-pile thread model) with 1 thread */
103. if( VMIOF_SUCCESS != (res = VMIOF_WorkGroupAlloc(1 /* thread */, &id->workGroupp))) {
104. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): creating work group failed (%d)\n",res);
105. close(id->daemonSockFD);
106. VMIOF_PollRemove(id->SampfiltLISocketPollHandle);
107. VMIOF_HeapFree(id->heapHandlep, id->mySidecarp);
108. VMIOF_HeapFree(id->heapHandlep, id);
109. VMIOF_HeapDestroy(instanceHeapHandlep);
110. return res;
111. } else {
112. VMIOF_Log(VMIOF_LOG_ERROR,"VMIOF-Client(DiskOpen): created work group\n");
113. } /* if VMIOF_WorkGroupAlloc() */
114. return res;
115. }

The key sections of code are as follows:

n Lines 1-2 : Define SampleFilterDiskOpen().

n Lines 3-6 : Declare local variables, res to hold return value of type VMIOF_Status from function calls
(Refer to section “Understanding VMIOF_Status Results for Functions in the VAIO,” on page 102),
variable heapSizeEstimate to get the estimated heap size to create, variable id that is a pointer to the
instance data structure etc.

n Lines 7-15 : Initialize the required variables and estimate the heap size that is required to be created
using VMIOF_HeapEstimateRequiredSize()

n Lines 16-19 : Create the heap using VMIOF_HeapCreate(). If not successful, log appropriate message and
return the failure via res

n Lines 20-22 : If successful in creating the heap, log appropriate message indicating success.

n Lines 23-27 : Allocate memory from heap for the instance data structure. If not successful, log
appropriate failure message and return VMIOF_NO_MEMORY

n Lines 28-34 : Upon successful allocation of the instance data structure, zero it out, store the
CROSSFD_TEST_MAGIC value in its buffer, and log the buffer's address.

n Lines 35-40 : Connect to the filter daemon using SampleFIlterConnectToDaemon(). If not successful,
clean up by freeing the memory of the instance data structure using VMIOF_HeapFree() and destroying the
heap via VMIOF_HeapDestroy().

n Lines 41-49 : Initialize various instance id data structure members.

n Lines 50-56 : CrossFD setup using SampleFilterCrossfdSetup(). Upon failure, do cleanup like closing
the daemon socket, freeing up the instance data structure and destroy the heap.

n Lines 57-64 : Allocate memory mapping buffer for your sidecar using VMIOF_HeapAllocateAligned().
Upon failure, do cleanup like closing the daemon socket connection, freeing up the instance data
structure and destroy the heap after logging appropriate error message.

n Lines 65-67 : Upon successful allocation of the mapping buffer, log its address.


n Lines 68-70 : Open the sidecar. Upon failure, log appropriate failure message.

n Lines 71-73 : Log an appropriate message indicating successful opening of the sidecar.

n Lines 74-88 : Read the sidecar data. If not successful, log an appropriate message, recreate the
initial data, and write it to the sidecar using VMIOF_DiskSidecarWrite().

n Lines 89-101 : Upon successful read of the sidecar data, increment open_count and write the updated
data back to the sidecar using VMIOF_DiskSidecarWrite().

n Lines 102-110 : Create a workgroup with a single thread. If not successful, log a failure message followed
by a thorough cleanup: close the daemon socket, remove the poll, release the memory for the sidecar
buffer and the instance data structure, and destroy the heap.

n Lines 111-115 : Log message indicating successful creation of the workgroup and return.
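
The sample repeats its cleanup calls on every failure path, which is how the free and destroy calls can drift out of step with the order of acquisition. One common alternative in C is to unwind through labels in reverse order of acquisition. The following is a minimal sketch of that pattern only: the resources are plain malloc() blocks standing in for the heap, instance data, and sidecar buffer, and every name in it is hypothetical rather than part of the VAIO.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the heap, instance data, and sidecar buffer. */
typedef struct SketchInstance {
    void *heap;
    void *instance;
    void *sidecarBuf;
} SketchInstance;

/* Outstanding acquisitions; a nonzero value after a failed open is a leak. */
int liveResources = 0;

/* Acquire one resource, or fail on request to simulate an error. */
void *sketch_acquire(bool fail)
{
    void *p = fail ? NULL : malloc(16);
    if (p != NULL) {
        liveResources++;
    }
    return p;
}

void sketch_release(void *p)
{
    if (p != NULL) {
        free(p);
        liveResources--;
    }
}

/* failAt: 0 = every step succeeds; 1..3 = that step fails. Returns 0 or -1. */
int sketch_open(SketchInstance *out, int failAt)
{
    void *heap;
    void *instance;
    void *sidecarBuf;

    heap = sketch_acquire(failAt == 1);
    if (heap == NULL) {
        goto fail;
    }
    instance = sketch_acquire(failAt == 2);
    if (instance == NULL) {
        goto fail_heap;
    }
    sidecarBuf = sketch_acquire(failAt == 3);
    if (sidecarBuf == NULL) {
        goto fail_instance;
    }
    out->heap = heap;
    out->instance = instance;
    out->sidecarBuf = sidecarBuf;
    return 0;

    /* Unwind in reverse order of acquisition. */
fail_instance:
    sketch_release(instance);
fail_heap:
    sketch_release(heap);
fail:
    return -1;
}

/* Release everything in reverse order of acquisition. */
void sketch_close(SketchInstance *s)
{
    sketch_release(s->sidecarBuf);
    sketch_release(s->instance);
    sketch_release(s->heap);
}
```

Adding a new acquisition step then requires one new label and one new release call, rather than an edit to every existing failure branch.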

Understanding and Processing DiskClose Events


As discussed in section “Understanding and Defining diskOpen and diskClose Callbacks,” on page 104, the
IO Filter Framework invokes a filter's diskClose() callback when the VMDK is closed. It is essential that you
perform all necessary cleanup operations in this callback: removing timers, draining workgroup queues, removing the poll
callback and the daemon connection (if any), and releasing the heap memory associated with your sidecar and IO filter
instance data structure before finally destroying the heap. If there are in-flight IOs generated by your filter,
you must wait for them to complete before returning from diskClose().

Long-running processing is not allowed in the diskClose() callback. If your solution is a cache solution and
you have dirty disk data on SSD when diskClose() is called, do not flush the dirty data in this
callback. Instead, let your daemon open the disk using VMIOF_VirtualDiskOpen() and flush the dirty data. For
details on using VMIOF_VirtualDiskOpen(), see “Understanding and Using the VMIOF_VirtualDisk*
Functions,” on page 231.

The prototype for diskClose() is as follows :

VMIOF_Status (*diskClose)(VMIOF_DiskHandle *handle);

Parameters :
n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk. Almost every callback
function has this handle to the disk as the first parameter.

Return Value :
n VMIOF_SUCCESS : diskClose() returns VMIOF_SUCCESS upon successfully closing the VMDK.

n Any other value indicates failure.

The following code snippet shows the activities associated with a diskClose event:

1 VMIOF_Status
2 SampfiltDiskClose(VMIOF_DiskHandle *diskHandle)
3 {
4 VMIOF_Status res = VMIOF_SUCCESS;
5 InstanceData_t *id;
6 VMIOF_HeapHandle *tmp;
7
8 /* announce we are here, and print the parameters */
9 VMIOF_Log(VMIOF_LOG_ERROR,"SampfiltLI(%s): handle=%p\n",__func__,diskHandle);
10 id = (InstanceData_t *)VMIOF_DiskFilterPrivateDataGet(diskHandle);
11 /* wait for all async I/Os to complete before continuing */
12 VMIOF_WorkGroupWait(id->workGroupp);
13
14 /* update close count and close the sidecar */
15 if(id->mySidecarHandlep) {
16 id->mySidecarp->close_count++;


17 res = VMIOF_DiskSidecarWrite(id->mySidecarHandlep, id->mySidecarp,
18 MY_SIDECAR_SIZE, (uint64_t)0);
19 if( VMIOF_SUCCESS != res ) {
20 VMIOF_Log(VMIOF_LOG_ERROR,"SampfiltLI(%s): writing updated sidecar "
21 "failed (%d)\n",__func__, res);
22 } else { /* it worked */
23 VMIOF_Log(VMIOF_LOG_ERROR,"SampfiltLI(%s): sidecar close_count "
24 "updated\n",__func__);
25 } /* if write update */
26 /* close the sidecar */
27 res = VMIOF_DiskSidecarClose(id->mySidecarHandlep);
28 if( VMIOF_SUCCESS != res ) {
29 VMIOF_Log(VMIOF_LOG_ERROR,"SampfiltLI(%s): closing sidecar failed "
30 "(%d)\n", __func__, res);
31 } else { /* it worked */
32 VMIOF_Log(VMIOF_LOG_ERROR,"SampfiltLI(%s): sidecar closed\n",__func__);
33 id->mySidecarHandlep = NULL;
34 } /* if sidecar close */
35 }
36 /* cancel the timer, remove the workgroup, remove the poll */
37 (void)VMIOF_WorkGroupFree(id->workGroupp);
38 (void)VMIOF_PollRemove(id->SampfiltLISocketPollHandle);
39
40 /* remove polling and close the connection to the daemon */
41 close(id->daemonSockFD);
42 close(id->crossFD);
43 /* free the sidecar buffer BEFORE freeing the instance data */
44 VMIOF_HeapFree(id->heapHandlep,id->mySidecarp);
45 /* free our instance data from the heap */
46 tmp = id->heapHandlep; /* save this pointer, first */
47 VMIOF_HeapFree(id->heapHandlep,id);
48 VMIOF_HeapDestroy(tmp);
49
50 return res;
51 } /* SampfiltDiskClose() */

The key sections of code are as follows:


n Lines 1-2 : Define the SampfiltDiskClose() function. The Library Instance defines this function as the
entry point to the diskClose event.

n Lines 3-6 : Declare and initialize local variables

n Line 9: Just debugging

n Line 10: Retrieve instance data structure for the VMDK

n Line 12: Wait for any workers currently running to complete.

n Line 15: Check to see if the sidecar handle is valid. If the IO Filter Framework invoked diskDetach()
before close, that callback will have set the sidecar handle to NULL before closing and deleting the
sidecar.

n Lines 16-35: Only run if the sidecar handle is valid.


n Lines 16-25: Update the cached sidecar data and attempt to write the updated information to the
sidecar.

n Lines 26-34: Attempt to close the sidecar. Accept the return value of VMIOF_DiskSidecarClose() in local
variable res. Upon failure to close the sidecar, log an appropriate message. Upon success, set the
sidecar handle to NULL.


n Lines 37 & 38: Remove work group and poll callback.

n Lines 41 & 42: Close socket and crossFD.

n Line 44: Free the sidecar buffer that was allocated from the heap.

n Lines 46 & 47: Free the instance structure, which was also allocated from the heap. First, save the
heap handle stored in the instance structure in the local variable tmp. Once you free the instance
structure, its contents are no longer valid, so without the saved handle you would be unable to
destroy the heap in the next step.

n Line 48: Destroy the heap via VMIOF_HeapDestroy(), passing it the variable tmp that holds the heap
handle.

n Line 50: Return res, which is VMIOF_SUCCESS unless closing the sidecar failed.
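
The ordering in lines 44 through 48 matters because the instance data both lives in the heap and holds the only stored copy of the heap handle. The following is a minimal sketch of that teardown order. DemoHeap is a hypothetical stand-in that merely counts live allocations on top of calloc() and free(); it makes no claim about how a VAIO heap behaves internally.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VAIO heap: it only counts live allocations. */
typedef struct DemoHeap {
    int live;
} DemoHeap;

/* Hypothetical instance data that records the heap it was allocated from. */
typedef struct DemoInstance {
    DemoHeap *heap;
    void *sidecarBuf;
} DemoInstance;

void *demo_heap_alloc(DemoHeap *h, size_t n)
{
    void *p = calloc(1, n);
    if (p != NULL) {
        h->live++;
    }
    return p;
}

void demo_heap_free(DemoHeap *h, void *p)
{
    if (p != NULL) {
        free(p);
        h->live--;
    }
}

/* Destroys the heap and returns how many allocations were still live. */
int demo_heap_destroy(DemoHeap *h)
{
    int leaked = h->live;
    free(h);
    return leaked;
}

DemoInstance *demo_open(void)
{
    DemoHeap *h = calloc(1, sizeof(*h));
    DemoInstance *inst;

    if (h == NULL) {
        return NULL;
    }
    inst = demo_heap_alloc(h, sizeof(*inst));
    if (inst == NULL) {
        demo_heap_destroy(h);
        return NULL;
    }
    inst->heap = h;
    inst->sidecarBuf = demo_heap_alloc(h, 64);
    if (inst->sidecarBuf == NULL) {
        demo_heap_free(h, inst);
        demo_heap_destroy(h);
        return NULL;
    }
    return inst;
}

/* Returns the number of leaked allocations (0 when teardown is complete). */
int demo_close(DemoInstance *inst)
{
    DemoHeap *tmp = inst->heap;             /* save the handle first */

    demo_heap_free(tmp, inst->sidecarBuf);  /* buffer before its owner */
    demo_heap_free(tmp, inst);              /* inst is invalid from here on */
    return demo_heap_destroy(tmp);
}
```

Reading inst->heap after the second demo_heap_free() call would be a use-after-free, which is exactly what the tmp variable in the sample avoids.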

Understanding and Processing IO Events

Understanding the DiskIO Structure


The VAIO uses the VMIOF_DiskIO structure to describe a single IO request. The request may contain one or
more scatter-gather elements, each described by a single VMIOF_DiskIOElem structure. In addition to scatter-gather
elements, the IO request includes flags indicating whether it is a read or write request and whether the
structure was created by a VAIO utility function.

The following figure summarizes the layout of a VMIOF_DiskIO structure, with members of the structure
defined in the following text:

Figure 5‑1. The VMIOF_DiskIO Structure


Specifically, this structure has the following definition:

typedef struct VMIOF_DiskIO {
    /** Handle used to identify IOs belonging to a reset command. */
    VMIOF_DiskResetIdentifier resetIdentifier;
    /** The flags describing the disk IO request. */
    VMIOF_IOFlags ioFlags;
    /** Reserved. */
    uint8_t _reserved[3];
    /** The number of elements this IO request has. */
    uint32_t numElems;
    /** The offset of the first element inside the disk. */
    uint64_t offset;
    /** The array of disk IO elements. */
    VMIOF_DiskIOElem elems[0];
} VMIOF_DiskIO;

The members of this structure are:

n VMIOF_DiskResetIdentifier resetIdentifier - This is essentially a large integer. By default, all IOs belonging
to a diskHandle have the same resetIdentifier. However, there might be cases where two IOs belonging
to the same diskHandle have different resetIdentifiers.

n VMIOF_IOFlags ioFlags —Flags that describe the IO request, defined as:

typedef enum VMIOF_IOFlags {
    /** The IO request is for reading data. */
    VMIOF_READ_OP = 0x01,
    /** The IO request is for writing data. */
    VMIOF_WRITE_OP = 0x02,
    /** The IO request is a zero copy. */
    VMIOF_ZERO_COPY = 0x04,
    /** The IO originated in a VM */
    VMIOF_VM_IO = 0x08,
} _VMIOF_PACKED_ VMIOF_IOFlags;

An IO will have either VMIOF_READ_OP or VMIOF_WRITE_OP set, not both.

The other flags may or may not be set with the VMIOF_READ_OP or VMIOF_WRITE_OP flag. When set, they
have the following meaning:

n VMIOF_VM_IO — As the comment suggests, if this flag is set, the IO request originated in a VM. This
contrasts with IOs that may originate in Filter Library code using VMIOF_DiskIOAlloc() (see
“Understanding and Using the IO Filters VMIOF_DiskIOAlloc Function,” on page 182).

n VMIOF_ZERO_COPY - The original semantics of this flag were "The IO has not been copied via
VMIOF_DiskIODup()." The implication was that the IO came from a VM, meaning that
the addr members of the VMIOF_DiskIOElem items (discussed later in this section) pointed at
memory in the guest VM. It followed that filters could not change the data pointed
to (except when completing a VMIOF_READ_OP).

The semantics have changed to mean that, except when completing a VMIOF_READ_OP, the
buffer pointed to by the addr member of the VMIOF_DiskIOElem items is read-only. There are many
reasons why the framework may set this flag, including the original semantic (that it is guest
memory). However, Filter Library code creating new IOs (via VMIOF_DiskIOAlloc()) can also set this flag
to prevent filters lower in the stack from writing over their data.

NOTE VMIOF_DiskIODup() does not replicate this flag in the duplicated IO structure it creates.
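
The flag rules above can be captured in a few small predicates. The following is a minimal sketch that reuses the numeric values from the VMIOF_IOFlags enum shown earlier, renamed with an SK_ prefix so that it compiles without the VAIO headers. The function names are hypothetical, and the sketch assumes that a buffer is writable whenever VMIOF_ZERO_COPY is clear, which the text implies but does not state outright.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the VMIOF_IOFlags enum in the text; names are local. */
enum {
    SK_READ_OP   = 0x01,
    SK_WRITE_OP  = 0x02,
    SK_ZERO_COPY = 0x04,
    SK_VM_IO     = 0x08
};

/* An IO carries exactly one of the read and write flags. */
bool io_direction_is_valid(uint8_t ioFlags)
{
    bool isRead = (ioFlags & SK_READ_OP) != 0;
    bool isWrite = (ioFlags & SK_WRITE_OP) != 0;
    return isRead != isWrite;
}

/*
 * May the filter modify the buffers behind the addr members?
 * Assumption: writable when ZERO_COPY is clear. When it is set, the only
 * permitted modification is filling the buffer while completing a read.
 */
bool io_buffer_is_writable(uint8_t ioFlags, bool completingRead)
{
    if ((ioFlags & SK_ZERO_COPY) == 0) {
        return true;
    }
    return completingRead && (ioFlags & SK_READ_OP) != 0;
}

/* Did the IO originate in a VM rather than in Filter Library code? */
bool io_from_vm(uint8_t ioFlags)
{
    return (ioFlags & SK_VM_IO) != 0;
}
```

A filter that transforms write data in place, for example to encrypt it, would check the equivalent of io_buffer_is_writable() first and work on a separate buffer when the check fails.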


n uint64_t offset — The offset into the file at which to start reading / writing data. The amount of data to read /
write and the address of the RAM buffer are in the VMIOF_DiskIOElem structures described later in
this list.

n uint32_t numelems — This member indicates how many VMIOF_DiskIOElem structures are in the array
pointed to by the elems member.

n VMIOF_DiskIOElem elems — This is an array of VMIOF_DiskIOElem structures, each of which is
essentially an element of a scatter-gather list and has the following definition:

typedef struct VMIOF_DiskIOElem {
/**
* The starting virtual address of the contiguous virtual address
* region this element represents.
*/
uint64_t addr;
/** The number of bytes represented by this element. */
uint64_t length;
} VMIOF_DiskIOElem;

This structure has the following fields :

n uint64_t addr — The address in memory from which the data is to be written or to which the data
is to be read. To convert from addr to a pointer or a pointer to an addr, use the A2P() and P2A()
macros found in the proxy and sampcache samples.

n uint64_t length — The length of the block.

NOTE The offset and the total length of an IO are always sector (512 bytes) aligned, but this may
not be true for the individual elements.
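The scatter-gather layout described above can be illustrated with a short sketch. It is not taken from the VAIO headers or samples: the Sketch-prefixed type mirrors the VMIOF_DiskIOElem layout shown above, the SKETCH_A2P() / SKETCH_P2A() macros are only an assumption of what the samples' A2P() and P2A() macros might look like, and both helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u

/* Hypothetical equivalents of the samples' A2P() / P2A() macros. */
#define SKETCH_A2P(a) ((void *)(uintptr_t)(a))
#define SKETCH_P2A(p) ((uint64_t)(uintptr_t)(p))

/* Mirrors the VMIOF_DiskIOElem layout shown above (stand-in type). */
typedef struct SketchIOElem {
    uint64_t addr;    /* start of the contiguous region */
    uint64_t length;  /* number of bytes in the region */
} SketchIOElem;

/* Total number of bytes described by a scatter-gather list. */
uint64_t sketch_total_length(const SketchIOElem *elems, uint32_t numElems)
{
    uint64_t total = 0;
    for (uint32_t i = 0; i < numElems; i++) {
        total += elems[i].length;
    }
    return total;
}

/* The offset and the total length are sector aligned; single elements need not be. */
bool sketch_io_is_sector_aligned(uint64_t offset, const SketchIOElem *elems,
                                 uint32_t numElems)
{
    return offset % SKETCH_SECTOR_SIZE == 0 &&
           sketch_total_length(elems, numElems) % SKETCH_SECTOR_SIZE == 0;
}
```

Note that the alignment check sums the element lengths first, because an individual element (for example 100 bytes followed by 412 bytes) may be unaligned even when the IO as a whole is not.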

Understanding and Processing diskIOStart Events


The diskIOStart() callback is the heart of most IO Filter Solutions. The IO Filter Framework invokes
this function for each IO submitted against a VMDK to which the filter is attached. The body of this
function, or its helper functions (as described later in this topic), performs the actual filtering of the IO read from /
written to a VMDK.

The prototype for this function is:

VMIOF_Status (*diskIOStart)(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io);

Return Value : Only two return values are permitted for this callback - VMIOF_SUCCESS and VMIOF_ASYNC. This
callback function is not allowed to block.

n VMIOF_SUCCESS – diskIOStart() callback returns VMIOF_SUCCESS when its operation succeeds.

n VMIOF_ASYNC – diskIOStart() callback returns VMIOF_ASYNC when the IO is deferred to be handled
asynchronously. Processing of this IO by the framework is postponed until the IO is continued by the
IO filter.

As discussed in “Understanding Synchronous vs Asynchronous Processing in Callbacks,” on page 22, each
LI must keep track of all IOs it owns. Thus, one of the first things this callback must do is add the IO request
(passed into the callback) to its list of owned IOs. Then, as also discussed in the referenced topic, the LI must
process the IO in one of two ways:

n SYNC processing — Perform its filter processing, if any, possibly including setting up a completion
callback, and then return VMIOF_SUCCESS to the IO Filter Framework. In this case, the IO Filter
Framework continues the IO through the rest of the IO stack. This may mean passing the IO to the next
IO Filter (if there is one), or sending the request to the next kernel module in the IO stack. Before
returning one of the synchronous return codes, the LI must remove the IO from its list of owned IOs.


n ASYNC processing — Process the request asynchronously if the processing involves calls
to remote services, such as Daemons for caching solutions or replication sites for replication solutions.
To process an IO asynchronously:

a Perform any necessary preliminary processing, such as setting up a completion callback

b Enqueue the IO to some service (a worker callback, a timer callback, or the Daemon)

c Return VMIOF_ASYNC to the IO Filter Framework

In this case, the IO Filter Framework suspends processing of the IO request until the LI tells the
Framework that it has finished processing the IO, by either completing or continuing it. Whatever code
completes processing the IO must also remove it from the LI's list of owned IOs.
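The ownership bookkeeping for the two processing modes might be sketched as follows. This is a simplified model rather than VAIO code: every Sketch-prefixed type, the needsRemoteWork field, and the helper functions are hypothetical, and a real Library Instance would also need to protect the list with a lock.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SKETCH_SUCCESS, SKETCH_ASYNC } SketchStatus;

typedef struct SketchIO {
    struct SketchIO *next;  /* link in the owned-IO list */
    bool needsRemoteWork;   /* hypothetical: true if a daemon must be consulted */
} SketchIO;

typedef struct SketchInstance {
    SketchIO *owned;        /* IOs this Library Instance currently owns */
} SketchInstance;

/* Add an IO to the head of the owned list. */
void sketch_own(SketchInstance *li, SketchIO *io)
{
    io->next = li->owned;
    li->owned = io;
}

/* Remove an IO from the owned list; returns false if it was not on the list. */
bool sketch_disown(SketchInstance *li, SketchIO *io)
{
    for (SketchIO **pp = &li->owned; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == io) {
            *pp = io->next;
            io->next = NULL;
            return true;
        }
    }
    return false;
}

/* Shape of a diskIOStart-style handler: own the IO, then go sync or async. */
SketchStatus sketch_io_start(SketchInstance *li, SketchIO *io)
{
    sketch_own(li, io);
    if (io->needsRemoteWork) {
        /* A real filter would enqueue the IO to a worker, timer, or daemon here. */
        return SKETCH_ASYNC;   /* still owned until it is completed or continued */
    }
    /* Inline, non-blocking filtering would happen here. */
    sketch_disown(li, io);     /* give up ownership before returning */
    return SKETCH_SUCCESS;
}
```

In the asynchronous branch, the IO deliberately stays on the list; whichever code later completes or continues it is responsible for calling the equivalent of sketch_disown().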

Whether processing the IO synchronously or asynchronously, the callback may register a further callback
function for the IO Filter Framework to invoke when the IO request has been completed (for example when
the data has been read from or written to a VMDK). It does this by invoking
VMIOF_DiskIOCompletionCallbackSet(). In addition to the IO request and VMIOF_DiskHandle parameters,
this function takes a pointer to a function of type VMIOF_DiskIOCompletionCallback that you provide in your
LI, and a pointer to opaque data your LI can associate with this IO request.

NOTE The framework may invoke multiple completion callback functions at the same time.

The prototype for VMIOF_DiskIOCompletionCallbackSet is as follows:

VMIOF_DiskIOCompletionCallbackSet(VMIOF_DiskHandle *handle,
VMIOF_DiskIO *io,
VMIOF_DiskIOCompletionCallback callback,
void *data);

The parameters you pass to this function are:

n VMIOF_DiskHandle *handle — The handle of the VMDK as passed into the diskIOStart callback.

n VMIOF_DiskIO *io — An IO request as supplied to diskIOStart.

n VMIOF_DiskIOCompletionCallback callback — A callback function to be invoked when the IO is
completed.

n void * data — User-defined data to pass to 'callback'.

The prototype for VMIOF_DiskIOCompletionCallback is as follows:

typedef VMIOF_Status (*VMIOF_DiskIOCompletionCallback)(VMIOF_DiskHandle *handle,
VMIOF_DiskIO *io,
void *data,
VMIOF_Status ioStatus);

The parameters you get from this callback are:

n VMIOF_DiskHandle *handle — The handle of the VMDK.

n VMIOF_DiskIO *io — IO that was completed.

n void * data — User-defined data as passed to VMIOF_DiskIOCompletionCallbackSet

n VMIOF_Status ioStatus —The status code with which the IO was completed.

NOTE The utility function and callback type are very similarly named. Remember that the utility function
ends in "Set" and the callback type does not.


The IO Filter Framework passes each of these parameters to your callback, including the completion status of the
IO (for example, whether the read / write succeeded and whether any other IO Filter failed it). The completion function can
further delay the IO by returning VMIOF_ASYNC, or continue the completion by returning
VMIOF_SUCCESS.
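A completion callback that uses its opaque data pointer might look like the following sketch. The types and names are stand-ins (assumptions) for the VAIO definitions, with the handle and IO reduced to void pointers. Because the framework may invoke several completion callbacks at the same time, the sketch updates its shared counters with C11 atomics.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in status codes (assumption: real code uses VMIOF_Status values). */
typedef enum { SKETCH_SUCCESS, SKETCH_ASYNC, SKETCH_IO_ERROR } SketchStatus;

/* Per-filter statistics passed to the callback through the data pointer. */
typedef struct SketchStats {
    atomic_uint succeeded;
    atomic_uint failed;
} SketchStats;

/* Same parameter order as VMIOF_DiskIOCompletionCallback. */
SketchStatus sketch_on_complete(void *handle, void *io, void *data,
                                SketchStatus ioStatus)
{
    SketchStats *stats = data;
    (void)handle;   /* unused in this sketch */
    (void)io;       /* unused in this sketch */
    if (ioStatus == SKETCH_SUCCESS) {
        atomic_fetch_add(&stats->succeeded, 1u);
    } else {
        atomic_fetch_add(&stats->failed, 1u);
    }
    return SKETCH_SUCCESS;   /* let the completion continue; no further delay */
}
```

The callback does no blocking work and returns the success code, so the completion continues up the stack without further delay.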

NOTE Generally, you do not need to call VMIOF_DiskIOComplete() in the completion callbacks, but if you do,
the completion callback needs to return VMIOF_ASYNC.

Sometimes, an IO Filter may wish to short-circuit the normal flow of an IO request, completing the request
without having the rest of the IO stack process said request. For example, a caching solution may find the
desired data in its cache file, in which case there is no reason for the request to progress through the rest of
the IO stack to the VMDK, etc. Call VMIOF_DiskIOComplete() to cause the IO Filter Framework to turn the IO
around and start it heading back to its requestor.

NOTE If an LI (say for filter X) has registered a completion callback on an IO request (via
VMIOF_DiskIOCompletionCallbackSet()), and a subsequent LI (say for filter Y) invokes
VMIOF_DiskIOComplete() on the same IO request, the IO Filter Framework invokes the completion callback
for filter X.

IMPORTANT If your filter requires stable data, remember to check the ioFlags member of the request
VMIOF_DiskIO structure against VMIOF_ZERO_COPY. If the flag is set, the data pointed to by the addr
members of the request's VMIOF_DiskIOElem elements can change while the IO is still in flight. So if your IO
Filter solution needs stable data in the IO request, it should use VMIOF_DiskIODup() to create a duplicate of the
data and process the duplicate. This is similar to a bounce buffer on a Linux or BSD system, which copies data for
devices that have specific addressing requirements for DMA access.
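The flag checks discussed in this section reduce to a few bit tests, sketched below. The enumerators only mirror the flag values listed for VMIOF_IOFlags; they and the three helper functions are hypothetical stand-ins, and real code would use the definitions from the VAIO headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins that mirror the documented flag values. */
enum {
    SKETCH_READ_OP   = 0x01,
    SKETCH_WRITE_OP  = 0x02,
    SKETCH_ZERO_COPY = 0x04,
    SKETCH_VM_IO     = 0x08
};

/* An IO has either the read flag or the write flag set, never both or neither. */
bool sketch_flags_valid(uint32_t flags)
{
    bool isRead  = (flags & SKETCH_READ_OP) != 0;
    bool isWrite = (flags & SKETCH_WRITE_OP) != 0;
    return isRead != isWrite;
}

/* With ZERO_COPY set, buffers are read-only except when completing a read. */
bool sketch_may_modify_buffers(uint32_t flags, bool completing)
{
    if ((flags & SKETCH_ZERO_COPY) == 0) {
        return true;
    }
    return completing && (flags & SKETCH_READ_OP) != 0;
}

/* A filter that needs stable data should duplicate a ZERO_COPY IO first. */
bool sketch_needs_duplicate(uint32_t flags, bool filterNeedsStableData)
{
    return filterNeedsStableData && (flags & SKETCH_ZERO_COPY) != 0;
}
```

A filter that checksums or encrypts write data, for example, would pass true for filterNeedsStableData and work on a duplicate whenever the helper returns true.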

NOTE ESXi will kill the VMX cartel if it is stuck for over 120 seconds in the IO processing thread. Please be
aware of time spent in both diskIOStart callback and VMIOF_DiskIOCompletionCallback callbacks.

Understanding and Using the IO Filters Timer Functions

Timers in the IO Filter Framework invoke a specified function in an IO Filter with a specified
periodicity (though some latency can be expected, since ESXi is not a real-time operating
system). Examples of reasons to use timers in an IO Filter solution include:

n Creating timeouts for an ACK for each IO request sent to a replication site

n Invoking a function every 10 seconds or less to indicate the progress of a given long-running operation,
such as required by certain entry points including diskAttach(), diskSnapshot(), and diskVmMigration()

At a high level, the steps for using these functions are:

1 Define callback functions for your timer(s) as discussed in the sub-section “Understanding
VMIOF_TimerCallback(),” on page 172.

2 Allocate a timer handle as discussed in the sub-section “Understanding VMIOF_TimerHandle,” on page 172.

3 Create timers with specific periodicity and an associated callback function as discussed in the
sub-section “Understanding VMIOF_TimerAdd(),” on page 172.

4 If / when you no longer wish the timer to fire, remove it as discussed in the sub-section “Understanding
VMIOF_TimerRemove(),” on page 173.

The VAIO defines all of the data types and function prototypes it uses to implement timers in the file
vmiof_timer.h.


Understanding VMIOF_TimerHandle
The VAIO API provides an opaque data type VMIOF_TimerHandle to represent each timer you create in your
code. If your code uses a fixed number of timers, as in the case of a timer for a progress function, you may
consider declaring handles for such timers as regular variables. If your code needs to create timers
dynamically, as in the case of a timer per IO request being sent to a replication site, use dynamic memory
allocation to create space for the timer handles.

Because most VAIO timer functions require a pointer to a VMIOF_TimerHandle instead of an actual
VMIOF_TimerHandle, a typical declaration looks similar to the following:

VMIOF_TimerHandle *timerHandlep;

Understanding VMIOF_TimerCallback ()
Whenever a timer fires (expires), the IO Filter Framework invokes a callback function that you specify when
you create the timer. All callback functions must be of type VMIOF_TimerCallback, which is defined in
vmiof_timer.h as:

typedef void (*VMIOF_TimerCallback)(void *data);

That is, your callback must take a pointer to some data (the data itself is opaque to the
IO Filter framework) and return void. You associate the data pointer with the timer when you create the
timer. Continuing the preceding replication timeout example, the replication code could associate a pointer
to the IO request that it is sending to the replication site and whose ACK the replication code is still
awaiting. Alternatively, in some instances, solutions choose to associate no data, that is a NULL pointer, with
a timer. In this latter case, when the IO Filter Framework invokes the callback, data is NULL.

NOTE In both LIs and Daemons, no blocking functions may be called in Timer callbacks.
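As a minimal sketch of the replication timeout idea, a callback of this shape could mark a pending request as timed out. The SketchPendingReq structure and the function name are hypothetical; only the void (*)(void *) signature comes from the VMIOF_TimerCallback definition above. The callback tolerates a NULL data pointer and does no blocking work.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request state that a replication filter might keep. */
typedef struct SketchPendingReq {
    bool acked;     /* set when the replication site acknowledges the IO */
    bool timedOut;  /* set by the timer callback if no ACK arrived in time */
} SketchPendingReq;

/* Has the same shape as VMIOF_TimerCallback: takes void *, returns void. */
void sketch_ack_timeout(void *data)
{
    SketchPendingReq *req = data;
    if (req == NULL) {
        return;               /* the timer was created with no associated data */
    }
    if (!req->acked) {
        req->timedOut = true; /* only flag the request; never block here */
    }
}
```

A real filter would follow up on the timedOut flag outside the callback, for example from a worker function.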

Understanding VMIOF_TimerAdd ()
You create a timer using VMIOF_TimerAdd(), which has the following prototype (in vmiof_timer.h):

VMIOF_Status
VMIOF_TimerAdd(uint64_t delay, VMIOF_TimerCallback callback, void *data,
VMIOF_TimerHandle **timer);

The parameters to this function are:

n uint64_t delay — The periodicity of the timer, in microseconds

n VMIOF_TimerCallback callback — A pointer to a function, of type VMIOF_TimerCallback, that the
framework invokes every delay microseconds

n void *data — A pointer to the data to associate with the timer. Again, this pointer is passed to callback by
the IO Filter Framework

n VMIOF_TimerHandle **timer — A pointer to a pointer to a VMIOF_TimerHandle. That is, your code
allocates / declares a VMIOF_TimerHandle *, say called foo, and then passes a pointer to foo in the last
parameter. You use this handle if / when you wish to stop the timer from firing.

The function returns the following status codes:

n VMIOF_SUCCESS — The function succeeded in creating the timer

n VMIOF_BAD_PARAM — The value of delay is too large; currently the maximum value is INT_MAX

n VMIOF_NO_MEMORY — The IO Filter Framework was unable to allocate the memory it needed to create the
timer
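Because the delay is expressed in microseconds and is capped at INT_MAX, it can be convenient to validate a delay before calling VMIOF_TimerAdd(). The helper below is a hypothetical sketch, not part of the VAIO API. On platforms where int is 32 bits, the INT_MAX limit works out to roughly 2147 seconds.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_USEC_PER_SEC 1000000ull

/*
 * Converts whole seconds into a microsecond delay. Returns false, leaving
 * *delayUs untouched, if the result would exceed the documented INT_MAX limit.
 */
bool sketch_seconds_to_delay(uint64_t seconds, uint64_t *delayUs)
{
    if (seconds > (uint64_t)INT_MAX / SKETCH_USEC_PER_SEC) {
        return false;
    }
    *delayUs = seconds * SKETCH_USEC_PER_SEC;
    return true;
}
```

Checking the limit up front lets the caller report a configuration error directly instead of interpreting a VMIOF_BAD_PARAM return.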


Understanding VMIOF_TimerRemove ()
Once your code creates a timer, the IO Filter framework continues to fire the timer with the given
periodicity, theoretically forever. To stop the timer, your code must invoke VMIOF_TimerRemove(), provided by the
VAIO. This function has the following prototype (in vmiof_timer.h):

VMIOF_Status
VMIOF_TimerRemove(VMIOF_TimerHandle *timer);

This function takes only one parameter, the VMIOF_TimerHandle * returned by VMIOF_TimerAdd().

The function returns the following status codes:

n VMIOF_SUCCESS — The function succeeded in removing the timer. That is, the timer will no longer fire.

n VMIOF_NOT_FOUND — timer does not point to a valid VMIOF_TimerHandle as returned by VMIOF_TimerAdd()

NOTE The framework synchronizes removing a Timer with its own lock, eliminating a race between the
removal of the timer and its firing.

Example: Using the Timer Functions


The following code snippet provides an example usage of the timer functions. At a high level, this code
sample, in the context of SampleFilterDiskOpen(), creates a timer, whose handle is stored in SampFiltTimerHandlep, that fires every 30 seconds,
calling the function TimerCallback(). The code, in the context of SampleFilterDiskClose(), cancels / removes
the timer, thereby preventing it from firing again.

0. char Mbuf[1024];
1. #define THIRTY_SECONDS (uint64_t)30000000
2. /* Declare the timer handle for the new timer to be created */
3. VMIOF_TimerHandle *SampFiltTimerHandlep;
4. void
5. TimerCallback(void *datap) {
6. IOFLOG1("It's Time!\n");
7. } /* TimerCallback() */
8.
9. VMIOF_Status
10.SampleFilterDiskOpen(VMIOF_DiskHandle *diskHandle, const VMIOF_DiskInfo *diskInfo) {
11. VMIOF_Status res;
12. …
13. /* create a timer */
14. if( VMIOF_SUCCESS != (res = VMIOF_TimerAdd(THIRTY_SECONDS, TimerCallback, NULL,
&SampFiltTimerHandlep))) {
15. /* couldn't create the timer */
16. sprintf(Mbuf,"creating timer failed (%d)",res);
17. IOFLOG1(Mbuf);
18. } else {
19. IOFLOG1("created timer\n");
20. }
21. …
22. } /* End of SampleFilterDiskOpen () */
23.
24. VMIOF_Status
25. SampleFilterDiskClose(VMIOF_DiskHandle *diskHandle) {
26. …
27. /* cancel the timer */
28. (void)VMIOF_TimerRemove(SampFiltTimerHandlep);


29. …
30. } /* end of SampleFilterDiskClose() */

The key sections of code are as follows:

n Line 0 : Defines a character buffer, Mbuf that the code uses for buffering log messages

n Line 1 : Defines THIRTY_SECONDS to be the number of microseconds in 30 full seconds. The code uses this
value for the periodicity of the timer.

n Line 3 : Defines a timer handle called SampFiltTimerHandlep that the code uses for its one and only timer

n Lines 4-7 : Define TimerCallback(), the callback function that the code associates with its timer on Line 14.
The function has just one line of code, Line 6, which logs It's Time! when the IO
Filter Framework invokes it

n Line 9-22 : Define SampleFilterDiskOpen(). Not shown in this code snippet, the Library Instance defines
this function as the entry point for the diskOpen event. Specifically:

n Line 11: Defines res to hold the return values from certain VAIO functions called by this function

n Line 14: Invokes VMIOF_TimerAdd() to create a new timer with a periodicity of THIRTY_SECONDS, that
calls TimerCallback() when the timer fires, passes NO data to TimerCallback() when the IO Filter
Framework invokes the function, and stores a handle to the timer in SampFiltTimerHandlep. The
code evaluates the return value of the invocation.

n Lines 15-17: Log an appropriate error message in the event of failure adding the timer on Line
14

n Line 19: Logs a confirmation message in the event the timer add succeeds on Line 14

n Lines 24-30 : Define SampleFilterDiskClose(). Not shown in this code snippet, the Library Instance defines
this function as the entry point for the diskClose event. Specifically:

n Line 28: Cancels / removes the timer created / added on Line 14. Normally, code should check the
return value of this function. For simplicity, this example discards the return
value

Thus, in general, whenever ESXi opens a VMDK covered by this filter, the code starts a timer that fires every
30 seconds, calling TimerCallback(). The timer continues to fire until ESXi closes the disk, at which time the
code cancels the timer.

Understanding and Using the IO Filters Worker Functions

One of the patterns for writing multi-threaded code is called work pile (see Pthreads Programming by Buttlar,
Farrell, and Nichols from O'Reilly Media, September 1996). This model contains:

1 A single queue that contains the objects of work to be done at any time, indicated by a function to call
and data to pass to said function.

2 Threads that add work to the queue whenever they need to

3 A pool of threads that perform the work. Each thread pulls an item off the head of the queue, invokes
the indicated function, passing the indicated data. When the function finishes, the thread pulls the next
item from the queue, rinse and repeat, until there are no items in the queue.

The queue is called the work pile. One distinguishing attribute of this pattern is that it does not require a
thread to coordinate which worker thread performs which item of work. (The previously referenced book calls a pattern
that does use such a coordinating thread a boss / worker model.)
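The queue discipline of a work pile can be modeled in a few lines. The sketch below is deliberately single-threaded and is not how VAIO implements work groups: it only shows work items as function / data pairs and the pull-until-empty loop that each worker thread would run. A real pool would run several such loops concurrently and guard the queue with a lock. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*SketchWorkFunc)(void *data);

/* One item of work: a function to call and the data to pass to it. */
typedef struct SketchWorkItem {
    SketchWorkFunc func;
    void *data;
} SketchWorkItem;

#define SKETCH_PILE_CAP 16

/* Fixed-capacity FIFO ring buffer holding the queued work. */
typedef struct SketchWorkPile {
    SketchWorkItem items[SKETCH_PILE_CAP];
    size_t head;   /* index of the next item to run */
    size_t count;  /* number of items currently queued */
} SketchWorkPile;

/* Any thread may add work; returns false if the pile is full. */
bool sketch_pile_add(SketchWorkPile *pile, SketchWorkFunc func, void *data)
{
    if (pile->count == SKETCH_PILE_CAP) {
        return false;
    }
    size_t tail = (pile->head + pile->count) % SKETCH_PILE_CAP;
    pile->items[tail].func = func;
    pile->items[tail].data = data;
    pile->count++;
    return true;
}

/* What each worker does: pull from the head until the pile is empty. */
size_t sketch_pile_drain(SketchWorkPile *pile)
{
    size_t ran = 0;
    while (pile->count > 0) {
        SketchWorkItem item = pile->items[pile->head];
        pile->head = (pile->head + 1) % SKETCH_PILE_CAP;
        pile->count--;
        item.func(item.data);
        ran++;
    }
    return ran;
}

/* Example work function: increments the int that data points to. */
void sketch_increment(void *data)
{
    (*(int *)data)++;
}
```

No coordinator assigns items to workers; whichever worker is free simply takes the item at the head of the queue.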


VAIO provides a set of functions that implement a work-pile pattern, though it uses the term work group.
The main reason to queue work to a work-pile in IO Filters is to prevent the current thread from blocking,
for example:

n Performing an asynchronous operation such as querying the Daemon for I/O processing

n Waiting on a condition variable signal / broadcast (mutex/spinlock)

There are several reasons to avoid blocking in IO Filter code, including:

n There are a limited number of threads that can process poll events in a Library / Daemon (currently,
only one!). If you block in a poll thread, no other poll-driven processing can occur until that thread
unblocks. If the code that unblocks the poll code in turn depends on a poll event to run, the Solution
will deadlock.

n Increase performance. If you block in certain functions, you prevent the Framework from submitting
additional events to the Solution. Thus, many events allow a VMIOF_ASYNC return to complete at a
later time. You use the work group functions to complete the event. Examples of this include:

n You may want diskDetach() to be done asynchronously, for example to flush cache data or remove
encryption, either of which may take significant time

n The Prepare activity of diskSnapshot() or diskVmMigration() may take significant time

n A replication filter may have to duplicate incoming IOs, send them to the replication sites, and wait
for acknowledgements. If it does so synchronously, it blocks the caller in the meantime. To avoid this
blocking and allow the IO Filter Framework to continue processing events, perform these long-
running actions in worker functions. The workers must inform the framework when they finish
work for those events.

The pattern for using the work group VAIO functions is:

n Once — Declare a work group handle to keep track of the work group.

n Once — Create (allocate) a work group, specifying the maximum number of worker threads the
framework will allocate to work on the items in the queue.

n As necessary — Queue work items into the work group. For each item, specify the function that the
thread must call to perform the work, and the data on which the function must work.

n When desired, especially before shutting down / closing — Wait for all the items in the work group to
complete. You cannot add new work items to the queue while waiting for the existing items to
complete.

n When shutting down / closing — Free the work group.

The remaining sub-topics discuss how to use the VAIO work group functions to implement this pattern.

Understanding VMIOF_WorkGroup
The VAIO API provides an opaque data type VMIOF_WorkGroup to represent each work group you create in
your code. You normally declare the handle to a work group as a regular static variable. After allocating a new
work group, you use this handle for all subsequent references to the work group: for example, to
enqueue a work item into the work group, to wait until all the work items in the work
group have finished executing, or to free the work group when you no longer need it.

Because many VAIO work functions require a pointer to the work group that they operate upon, a typical
declaration looks similar to the following:

VMIOF_WorkGroup *workGroupp;


Understanding VMIOF_WorkGroupAlloc () :
Use VMIOF_WorkGroupAlloc() to allocate a new work group. Creating work groups is a means of grouping
various logically related work items together. These work items will be implemented as separate threads
that get executed asynchronously to accomplish their specified tasks. You can wait for the enqueued work
items to finish before proceeding further. This provides a mechanism for the master to co-ordinate the
various tasks that it is implementing via multiple threads or work items. When VMIOF_WorkGroupAlloc()
returns VMIOF_SUCCESS, it returns an opaque handle to VMIOF_WorkGroup in the group parameter. Work group
creation could fail in case of memory allocation failure with a return value of VMIOF_NO_MEMORY.

VMIOF_Status
VMIOF_WorkGroupAlloc(uint32_t maxThrds, VMIOF_WorkGroup **group);

The parameters to this function are :

n uint32_t maxThrds— This input parameter indicates the maximum number of threads for this work
group.

n VMIOF_WorkGroup**group— This output parameter is the opaque handle to the work group that gets
allocated.

The function returns the following status codes :

n VMIOF_SUCCESS — This value is returned upon successful creation of the work group.

n VMIOF_NO_MEMORY — This value is returned if work group allocation fails due to memory allocation
failure.

Understanding VMIOF_WorkQueue () :
Use VMIOF_WorkQueue() to enqueue the work item into the work group queue. Each work item or thread
executes asynchronously. The master is thereby able to co-ordinate the activities performed by the work
threads. VMIOF_WorkQueue() returns VMIOF_SUCCESS upon successful execution, indicating that the work
thread was indeed en-queued into the work group queue.

VMIOF_Status VMIOF_WorkQueue(VMIOF_WorkGroup *group, VMIOF_WorkFunc func, void *data);

The parameters to this function are :

n VMIOF_WorkGroup *group — This input parameter is an opaque handle to the work group into which the
work thread needs to be en-queued.

n VMIOF_WorkFunc func — This input parameter is a function pointer. It points to the function that will be
invoked to perform the actual task.

n void *data — This input parameter passes any user-defined data / parameter that
VMIOF_WorkFunc can operate upon.

The function returns the following status codes :

n VMIOF_SUCCESS — Upon successful execution of VMIOF_WorkQueue(), VMIOF_SUCCESS is returned, indicating
that the work item was indeed enqueued.

NOTE In LIs, no blocking functions may be called in VMIOF_WorkFunc functions.

Understanding VMIOF_WorkGroupWait () :
VMIOF_WorkGroupWait() can be used by the master to wait for all the work items or threads in the given work
group to finish before the master proceeds further. Since it is watching the work group it needs a handle to
the group on which it needs to wait, as the input parameter.

void VMIOF_WorkGroupWait(VMIOF_WorkGroup *group);


The parameters to this function are :

n VMIOF_WorkGroup*group — This input parameter is an opaque handle to the work group that is watched
for all its work items to finish execution.

This function does not return a value.

Understanding VMIOF_WorkGroupFree () :
When you no longer need the work group, you free it up using VMIOF_WorkGroupFree(). It takes a single
input parameter VMIOF_WorkGroup, an opaque handle to the work group that needs to be freed. All enqueued
work items must have finished execution or else you should have executed VMIOF_WorkGroupWait() before
you call VMIOF_WorkGroupFree().

void VMIOF_WorkGroupFree(VMIOF_WorkGroup *group);

The parameters to this function are :

n VMIOF_WorkGroup *group — This input parameter refers to the opaque handle of the work group that will
be freed.

This function does not return a value.

IMPORTANT You should call VMIOF_WorkGroupWait() before calling this function. This function waits for any
existing work function to complete before destroying the work group.

Example: Using the Worker Functions


/* workgroup macro and variables, and worker function */

1. VMIOF_WorkGroup *workGroupp;
2. …
3. void Worker(void *datap) {
4. int count = *(int *)datap;
5. sprintf(Mbuf,"worker: count is %d\n",count);
6. IOFLOG1(Mbuf);
7. return;
8. }
9. /* create a workgroup (work-pile thread model) with 1 thread */
10. if( VMIOF_SUCCESS != (res = VMIOF_WorkGroupAlloc(1 /* thread */, &workGroupp))) {
11. /* couldn't create the workgroup */
12. sprintf(Mbuf,"creating work group failed (%d)",res);
13. IOFLOG1(Mbuf);
14. } else {
15. IOFLOG1("created work group\n");
16. } /* if VMIOF_WorkGroupAlloc() */
17. …
18. /* add a call to the Worker function to the "work pile" pointed to by workGroupp (set in
DiskOpen) */
19. if( VMIOF_SUCCESS != (res = VMIOF_WorkQueue(workGroupp, Worker, (void *)&count))) {
20. /* well, that didn't work */
21. sprintf(Mbuf,"could not add work to the pile (%d).",res);
22. IOFLOG1(Mbuf);
23. } else {
24. IOFLOG1("added work to the pile\n");
25. }


26. …
27. VMIOF_WorkGroupWait(workGroupp);
28. VMIOF_WorkGroupFree(workGroupp);
29. …

The key sections of code are as follows:

n Line 1 : Declare a pointer to VMIOF_WorkGroup. It is an opaque handle to the work group that gets created
via VMIOF_WorkGroupAlloc().

n Line 3-8 : This is the function associated with the work item that is passed to VMIOF_WorkQueue(). It
provides the functionality that the work item is expected to perform. In this example, the function
writes the current value of the count variable to the log.

n Line 9-13 : Work group allocation happens here. The work group uses at most one worker
thread. A pointer to the work group handle, workGroupp, is passed to VMIOF_WorkGroupAlloc() as an
output parameter. If VMIOF_WorkGroupAlloc() fails, a log message is sent conveying that work group
creation failed.

n Line 14-16 : Upon successful execution of VMIOF_WorkGroupAlloc() the message conveying successful
creation of the work group is sent to the log.

n Line 18-22 : The work item, which calls Worker, is enqueued into the work group workGroupp. A pointer to count is
the parameter passed to the Worker function.

n Line 23-25 : Upon successful execution of VMIOF_WorkQueue(), log a message to convey that work was
added to the work pile.

n Line 27-28 :When the work group is no longer needed, you call VMIOF_WorkGroupWait() to wait for it to
complete and then VMIOF_WorkGroupFree() to free it, passing them the work group handle workGroupp.

Understanding and Processing the diskDeleteBlocks* Events


As discussed previously in Chapter 4, there are 2 callbacks for the delete block operation. Again, the
prototype for the diskDeleteBlocksPrepare() callback is :

VMIOF_Status
(*diskDeleteBlocksPrepare)(VMIOF_DiskHandle *handle, const VMIOF_DiskDeleteBlocksInfo *info);

The parameters to this callback are:


n VMIOF_DiskHandle *handle —This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to.

n const VMIOF_DiskDeleteBlocksInfo *info — This input parameter is a pointer to a structure that
lists the disk blocks to be deleted. The structure is defined as follows:

typedef struct VMIOF_DiskDeleteBlocksInfo {
uint32_t numBlockDescs;
uint32_t unused;
const VMIOF_DiskDeleteBlockDesc descs[0];
} VMIOF_DiskDeleteBlocksInfo;

n uint32_t numBlockDescs —The number of delete block descriptors in the structure

n uint32_t unused — Reserved for future use

n const VMIOF_DiskDeleteBlockDesc descs[0] — An array of delete block descriptors, defined as
follows:

typedef struct VMIOF_DiskDeleteBlockDesc {
uint64_t offset;
uint64_t length;
} VMIOF_DiskDeleteBlockDesc;


The fields of a delete block descriptor are:

n uint64_t offset — Offset (in bytes) to the first disk block to be deleted

n uint64_t length — Length (in bytes) of the block to be deleted

This callback returns VMIOF_SUCCESS if the operation is allowed to proceed, else it returns an appropriate
error value.
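A prepare callback typically needs to walk all numBlockDescs descriptors rather than only the first one. The following sketch assumes a stand-in descriptor type that mirrors VMIOF_DiskDeleteBlockDesc; the helper functions are hypothetical, take the descriptor array and its count as separate arguments, and show how a caching filter might test whether a cached byte range is affected by a delete request.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the VMIOF_DiskDeleteBlockDesc layout shown above (stand-in type). */
typedef struct SketchDeleteBlockDesc {
    uint64_t offset;  /* byte offset of the first block to delete */
    uint64_t length;  /* number of bytes to delete */
} SketchDeleteBlockDesc;

/* Total number of bytes covered by all descriptors in the request. */
uint64_t sketch_total_deleted_bytes(const SketchDeleteBlockDesc *descs,
                                    uint32_t numBlockDescs)
{
    uint64_t total = 0;
    for (uint32_t i = 0; i < numBlockDescs; i++) {
        total += descs[i].length;
    }
    return total;
}

/* True if the byte range [offset, offset + length) overlaps any descriptor. */
bool sketch_range_is_affected(const SketchDeleteBlockDesc *descs,
                              uint32_t numBlockDescs,
                              uint64_t offset, uint64_t length)
{
    for (uint32_t i = 0; i < numBlockDescs; i++) {
        uint64_t delEnd = descs[i].offset + descs[i].length;
        if (offset < delEnd && descs[i].offset < offset + length) {
            return true;
        }
    }
    return false;
}
```

The overlap test treats ranges as half-open, so a cached range that starts exactly where a deleted range ends is not considered affected.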

The second callback is called to signal the status of the block deletion to the filter. It is invoked after a set of
virtual disk blocks were deleted, or if the operation failed. The prototype for this callback is :

void
(*diskDeletedBlocks)(VMIOF_DiskHandle *handle, const VMIOF_DiskDeleteBlocksInfo *info,
VMIOF_Status status);

The parameters to this callback are:

n VMIOF_DiskHandle *handle —This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to.

n const VMIOF_DiskDeleteBlocksInfo *info — This input parameter is a pointer to a structure that
lists the disk blocks that were to be deleted.

n VMIOF_Status status —The status of the block deletion operation.

This callback does not return a value.

These two callbacks must not perform long-running operations. If your filter is a cache solution and has cached
data for the blocks being deleted, it should hold the data and delete it only after the diskDeletedBlocks callback is
received.

NOTE While some blocks are being deleted, you may see parallel read / writes from a guest OS, but they
should not be the same blocks.

As an example, the following simple code was added to the sampfilt example.

void
SampleFilterDiskDeletedBlocks(VMIOF_DiskHandle *handle,
                              const VMIOF_DiskDeleteBlocksInfo *info,
                              VMIOF_Status status)
{
   /* info is const to match the diskDeletedBlocks callback prototype */
   VMIOF_Log(VMIOF_LOG_ERROR, "In callback %s\n", __func__);
   VMIOF_Log(VMIOF_LOG_ERROR, "info->numBlockDescs = 0x%x\n", info->numBlockDescs);
   VMIOF_Log(VMIOF_LOG_ERROR, "info->descs[0]: offset = 0x%lx length = 0x%lx\n",
             (long unsigned int)info->descs[0].offset,
             (long unsigned int)info->descs[0].length);
   VMIOF_Log(VMIOF_LOG_ERROR, "status = 0x%x\n", status);
}

VMIOF_Status
SampleFilterDiskDeleteBlocksPrepare(VMIOF_DiskHandle *handle,
                                    const VMIOF_DiskDeleteBlocksInfo *info)
{
   /* info is const to match the diskDeleteBlocksPrepare callback prototype */
   VMIOF_Log(VMIOF_LOG_ERROR, "In callback %s\n", __func__);
   VMIOF_Log(VMIOF_LOG_ERROR, "info->numBlockDescs = 0x%x\n", info->numBlockDescs);
   VMIOF_Log(VMIOF_LOG_ERROR, "info->descs[0]: offset = 0x%lx length = 0x%lx\n",
             (long unsigned int)info->descs[0].offset,
             (long unsigned int)info->descs[0].length);
   return VMIOF_SUCCESS;
}


The code is then invoked using vmkfstools, which is an easy way to send the SCSI UNMAP command. The
filter has already been attached to test.vmdk.

# vmkfstools -v 3 --punchzero test.vmdk


DISKLIB-VMFS : "/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/test-flat.vmdk" : open
successful (10) size = 10485760, hd = 79230. Type 3
DISKLIB-DSCPTR: Opened [0]: "test-flat.vmdk" (0xa)
DISKLIB-LINK : Opened '/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/test.vmdk' (0xa):
vmfs, 20480 sectors / 10 MB.

In the callback SampleFilterDiskStartIO
Hole Punching: 100% done.FiltLib: Received delete blocks prepare notification
In callback SampleFilterDiskDeleteBlocksPrepare
info->numBlockDescs = 0x1
info->descs[0]: offset = 0x0 length = 0xa00000
FiltLib: Received deleted blocks notification
In callback SampleFilterDiskDeletedBlocks
info->numBlockDescs = 0x1
info->descs[0]: offset = 0x0 length = 0xa00000
status = 0x0
In the callback SampleFilterDiskClose

Understanding and Processing IO Abort (Cancel) Events (diskIOAbort())


Remember from “Understanding and Defining the diskIOAbort Callback,” on page 108 that the IO Filter
Framework invokes a filter's diskIOAbort() callback with a specific IO that needs to be cancelled/failed. That
is, the IO Filter Framework invokes this callback for each IO for which some other component issued an IO
abort command (a guest, a kernel module, or another filter). Remember that each LI must keep track of the
IOs it owns, typically in its instance data.

The body of this callback must search for the IO being aborted to see if said IO is on the LI's list of owned
IOs. If so, it must invoke VMIOF_DiskIOComplete() on the IO being aborted with a status of VMIOF_IO_ABORTED,
and then return with that same status. If not, it must return VMIOF_NOT_FOUND. It may not return any other
values.

If a LI aborts an IO it owns, it must:

n Remove said IO from its list of owned IOs, as aborted IOs no longer exist

n Cancel workers, timers, etc. associated with processing the IO

n Abort any subordinate IOs it may have created and submitted with VMIOF_DiskIOSubmit() by calling
VMIOF_DiskIOAbort()

Ensure the callback cancels workers, timers, etc. that may be associated with processing the IO as part of
cancelling it.

This callback is defined as follows :

VMIOF_Status (*diskIOAbort)(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io);

The parameters to this callback are:

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for the
filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n VMIOF_DiskIO *io— This input parameter describes a disk IO request that should get aborted by this
filter.


Return Values:

n VMIOF_IO_ABORTED — The LI owns the IO and has successfully aborted it by invoking
VMIOF_DiskIOComplete() with a status of VMIOF_IO_ABORTED

n VMIOF_NOT_FOUND — The LI does not own the IO and therefore cannot abort it

NOTE The diskIOAbort() callback function is not allowed to block. It is also not allowed to defer the request to
another context (poll or worker); if the IO is found, it must complete the IO with a status of VMIOF_IO_ABORTED
from within the calling context.

NOTE If you want to test diskIOAbort(), you can simply delay your IO processing; this will trigger
diskIOAbort(). At some later point, the framework will issue the diskIOsReset() callback.
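
The abort rules described in this topic can be sketched as follows. This is a minimal, self-contained illustration and does not use the VAIO headers: the Mock types, MockDiskIOComplete(), SketchOwnIO(), and SketchDiskIOAbort() are hypothetical stand-ins invented for this example so that it compiles on its own. In a real LI you would use VMIOF_DiskIO, VMIOF_DiskIOComplete(), and the VMIOF status codes, keep the list in your instance data, and protect it with whatever locking your LI already uses.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the VAIO status codes and IO structure. */
typedef enum { MOCK_SUCCESS, MOCK_IO_ABORTED, MOCK_NOT_FOUND } MockStatus;

typedef struct MockIO {
    struct MockIO *next;     /* link in the LI's list of owned IOs */
    int completed;           /* nonzero once the IO has been completed */
    MockStatus completedAs;  /* status the IO was completed with */
} MockIO;

typedef struct {
    MockIO *owned;           /* head of the list of IOs this LI owns */
} MockInstance;

/* Stand-in for VMIOF_DiskIOComplete(): only records the completion. */
static void MockDiskIOComplete(MockIO *io, MockStatus status)
{
    io->completed = 1;
    io->completedAs = status;
}

/* Record an IO as owned by this LI (for example, from diskIOStart). */
static void SketchOwnIO(MockInstance *li, MockIO *io)
{
    io->next = li->owned;
    li->owned = io;
}

/*
 * Abort callback body: find the IO, unlink it, complete it as aborted, and
 * return that same status. Nothing here blocks or defers to another context.
 */
static MockStatus SketchDiskIOAbort(MockInstance *li, MockIO *io)
{
    for (MockIO **link = &li->owned; *link != NULL; link = &(*link)->next) {
        if (*link == io) {
            *link = io->next;   /* aborted IOs leave the owned list */
            io->next = NULL;
            /* A real LI would also cancel timers and workers, and abort
             * any subordinate IOs it submitted, at this point. */
            MockDiskIOComplete(io, MOCK_IO_ABORTED);
            return MOCK_IO_ABORTED;
        }
    }
    return MOCK_NOT_FOUND;      /* this LI does not own the IO */
}
```

Walking the list through a pointer-to-link lets the sketch unlink the head and interior entries with the same code path, which keeps the callback short enough to run entirely in the calling context.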

Understanding and Processing Disk IO Reset Requests (diskIOsReset())


Remember from “Understanding and Defining the diskIOsReset Callback,” on page 108 that the IO Filter
Framework invokes a filter's diskIOsReset() callback in response to a SCSI reset (or similar) being sent down
from the guest VM (or other user-space cartel that loaded the filter). In other words this callback function of
a filter is invoked to abort all outstanding IOs of a virtual disk associated with the specified resetIdentifier.
Further, remember that each LI must keep track of all IOs it owns.

The body of this callback must search its list of owned IOs to find any owned IOs whose resetIdentifier
matches the resetIdentifier passed into the callback. For each IO found with the matching resetIdentifier, the
callback must invoke VMIOF_DiskIOComplete() on it with a status of VMIOF_IO_ABORTED.

If a LI aborts an IO it owns during a diskIOsReset callback, it must remove said IO from its list of owned
IOs, as aborted IOs no longer exist. Ensure the callback cancels workers, timers, etc. that may be associated
with processing the IOs being reset.

This callback is defined as:

void (*diskIOsReset)(VMIOF_DiskHandle *handle, VMIOF_DiskResetIdentifier resetIdentifier);

Parameters:

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for the
filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n VMIOF_DiskResetIdentifier resetIdentifier — IOs associated with this identifier should get aborted

The function has no return value.

NOTE The diskIOsReset() callback function is not allowed to block. It is also not allowed to defer the request to
another context (poll or worker); it must complete the matching IOs with a status of VMIOF_IO_ABORTED from
within the calling context.

NOTE If you want to test diskIOsReset(), you can simply delay your IO processing; this will trigger
diskIOAbort(), and at some later point the framework will issue the diskIOsReset() callback.
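
The reset handling described in this topic can be sketched in the same style. Again, this is an illustration under stated assumptions rather than VAIO code: the Mock types, the resetIdentifier field on MockIO, MockDiskIOComplete(), and SketchDiskIOsReset() are hypothetical names chosen for this example. The real callback returns void; the sketch returns the number of IOs it aborted only so that the behavior is easy to check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the VAIO status codes and IO structure. */
typedef enum { MOCK_SUCCESS, MOCK_IO_ABORTED } MockStatus;

typedef struct MockIO {
    struct MockIO *next;       /* link in the LI's list of owned IOs */
    uint64_t resetIdentifier;  /* identifier the IO was issued under */
    int completed;             /* nonzero once the IO has been completed */
    MockStatus completedAs;    /* status the IO was completed with */
} MockIO;

typedef struct {
    MockIO *owned;             /* head of the list of IOs this LI owns */
} MockInstance;

/* Stand-in for VMIOF_DiskIOComplete(): only records the completion. */
static void MockDiskIOComplete(MockIO *io, MockStatus status)
{
    io->completed = 1;
    io->completedAs = status;
}

/*
 * Reset callback body: complete every owned IO whose identifier matches as
 * aborted, and drop it from the owned list. Returns the number aborted.
 */
static unsigned SketchDiskIOsReset(MockInstance *li, uint64_t resetIdentifier)
{
    unsigned aborted = 0;
    MockIO **link = &li->owned;

    while (*link != NULL) {
        MockIO *io = *link;
        if (io->resetIdentifier == resetIdentifier) {
            *link = io->next;  /* unlink; do not advance past the new *link */
            io->next = NULL;
            /* A real LI would also cancel timers and workers for this IO. */
            MockDiskIOComplete(io, MOCK_IO_ABORTED);
            aborted++;
        } else {
            link = &io->next;  /* keep this IO and move on */
        }
    }
    return aborted;
}
```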

Overview of VMIOF IO Utility Functions


The VAIO provides many functions that you can use to manipulate IO requests from within a LI. Daemons
can also use these functions if they force-load a LI instance into their context. The sub-topics that follow
discuss the functions that the VAIO provides to manipulate IO requests.


Understanding and Using the IO Filters VMIOF_DiskIODup Function


The VAIO provides VMIOF_DiskIODup() to allow an IO Filter to duplicate an existing VMIOF_DiskIO structure
into a new VMIOF_DiskIO structure allocated by this function (see “Understanding the DiskIO Structure,” on
page 167). In other words VMIOF_DiskIODup() allocates a new disk IO request and new buffers to issue from
the invoking filter.

This function has the following prototype:

VMIOF_Status VMIOF_DiskIODup(VMIOF_DiskHandle *handle, VMIOF_HeapHandle *heap,
                             const VMIOF_DiskIO *origIO, VMIOF_DiskIO **outIO);

Parameters :

n VMIOF_DiskHandle *handle — The handle to the VMDK for which origIO is a request

n VMIOF_HeapHandle *heap — The handle to the heap to be used for allocating IO memory for the
duplicated request returned in outIO

n const VMIOF_DiskIO *origIO — The IO request to duplicate

n VMIOF_DiskIO **outIO — The new IO request allocated from heap

Return Value :

n VMIOF_SUCCESS — The function succeeded and outIO points to a newly allocated disk IO object

n VMIOF_NO_MEMORY — The function could not allocate the memory necessary to duplicate the IO request

A common use of this function is to create a snapshot of an IO request whose ioFlags element includes
VMIOF_ZERO_COPY, as the data pointed to by the addr members of said request's VMIOF_DiskIOElem
elements can change while the IO is still in flight. If your IO Filter Solution needs stable data in the IO
request, it should use this function to create a duplicate of the data and process the duplicate.

Understanding and Using the IO Filters VMIOF_DiskIOAlloc Function


The VAIO provides VMIOF_DiskIOAlloc() to allow LIs (including LIs loaded into a Daemon) to create a
new IO request, in the form of a VMIOF_DiskIO structure, to submit to a VMDK to which they have a handle.
This function has the following prototype:

VMIOF_Status VMIOF_DiskIOAlloc(VMIOF_DiskHandle *handle, VMIOF_HeapHandle *heap,
                               uint32_t numElems, VMIOF_DiskIO **outIO);

Parameters :

n VMIOF_DiskHandle *handle — The handle to the VMDK for which the new IO request is created

n VMIOF_HeapHandle *heap — The handle to the heap to be used for allocating IO memory for the
VMIOF_DiskIOElem structures allocated by the function

n uint32_t numElems — The number of VMIOF_DiskIOElem structures the function must allocate and place
in the structure pointed to by *outIO.

n VMIOF_DiskIO **outIO — The new IO request structure, allocated from heap. The structure will have
numElems VMIOF_DiskIOElem structures in it. However, none of the elements' members are set, nor is space
allocated for them. Further, the function sets the ioFlags member of the VMIOF_DiskIO structure to zero
(0).

Return Value :

n VMIOF_SUCCESS — The function succeeded and outIO points to a newly allocated disk IO object


n VMIOF_NO_MEMORY — The function could not allocate the memory necessary to fulfill the request

A common use for this function is to create and issue IOs against VMDKs from within Daemons that have
performed a VMIOF_VirtualDiskOpen() on said VMDK.

Understanding and Using the IO Filters VMIOF_DiskIOSubmit Function


After creating an IO request with VMIOF_DiskIODup() or VMIOF_DiskIOAlloc(), and performing any other
necessary operations on the request's data, submit the IO to the IO Filter Framework, and then the VMDK,
by calling VMIOF_DiskIOSubmit(). This function has the following prototype :

VMIOF_Status
VMIOF_DiskIOSubmit(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io);

The parameters are:

n VMIOF_DiskHandle *handle — The VMDK to which the IO request is submitted

n VMIOF_DiskIO *io — The IO request created by VMIOF_DiskIOAlloc() or VMIOF_DiskIODup()

The function returns VMIOF_SUCCESS on success, or some other status indicating why it failed (for
example VMIOF_BAD_PARAM for an invalid pointer to the IO structure).

IMPORTANT The IO Filter Framework provides a default completion mechanism that frees the IO structures
allocated by VMIOF_DiskIOAlloc() or VMIOF_DiskIODup(). If you want to process the IO after it
completes, you must call VMIOF_DiskIOCompletionCallbackSet() before you invoke
VMIOF_DiskIOSubmit().

NOTE Filters using the VMIOF_DiskIOSubmit() interface should not overload the storage hardware. Your code
should monitor latencies and, if they rise, decrease the load it places on the hardware.
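
One possible way to act on the preceding NOTE is to cap the number of IOs the filter keeps in flight and to adjust that cap from observed completion latency. The sketch below is an illustration only. The SketchThrottle structure, its function names, the 1/8 smoothing weight, and the additive-increase, halve-on-overload policy are all assumptions made for this example and are not part of the VAIO; a real solution would pick thresholds from measurements on its target storage.

```c
#include <stdint.h>

/* Hypothetical in-flight limiter driven by a smoothed completion latency. */
typedef struct {
    uint32_t limit;     /* current cap on in-flight IOs */
    uint32_t minLimit;  /* never throttle below this */
    uint32_t maxLimit;  /* never grow above this */
    uint64_t targetUs;  /* back off when smoothed latency exceeds this */
    uint64_t avgUs;     /* smoothed latency in microseconds */
} SketchThrottle;

static void SketchThrottleInit(SketchThrottle *t, uint32_t minLimit,
                               uint32_t maxLimit, uint64_t targetUs)
{
    t->limit = minLimit;   /* start conservatively and grow */
    t->minLimit = minLimit;
    t->maxLimit = maxLimit;
    t->targetUs = targetUs;
    t->avgUs = 0;
}

/* Call once per completed IO with the latency measured for that IO. */
static void SketchThrottleOnCompletion(SketchThrottle *t, uint64_t latencyUs)
{
    /* Exponentially weighted moving average with weight 1/8. */
    if (t->avgUs == 0) {
        t->avgUs = latencyUs;
    } else {
        t->avgUs = (t->avgUs * 7 + latencyUs) / 8;
    }

    if (t->avgUs > t->targetUs) {
        /* Latency is rising: halve the load, but respect the floor. */
        uint32_t halved = t->limit / 2;
        t->limit = halved < t->minLimit ? t->minLimit : halved;
    } else if (t->limit < t->maxLimit) {
        t->limit++;        /* latency is fine: probe for more throughput */
    }
}

/* Returns nonzero if another IO may be submitted right now. */
static int SketchThrottleMaySubmit(const SketchThrottle *t, uint32_t inFlight)
{
    return inFlight < t->limit;
}
```

A filter using this approach would check SketchThrottleMaySubmit() before each VMIOF_DiskIOSubmit() call and queue the IO internally when the answer is no.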

Understanding and Using the VMIOF_DiskIOFree Function


You must free any IO requests that you dynamically allocate via VMIOF_DiskIOAlloc() or VMIOF_DiskIODup().
Invoke the VMIOF_DiskIOFree() function to free said IO requests. This function has the following signature:

VMIOF_Status
VMIOF_DiskIOFree(VMIOF_DiskHandle *handle, VMIOF_DiskIO *io);

The parameters to this function are:

n VMIOF_DiskHandle *handle — The handle passed into the function that created the IO request

n VMIOF_DiskIO *io — The IO to free that was previously allocated either via VMIOF_DiskIOAlloc() or
VMIOF_DiskIODup()

This function returns VMIOF_SUCCESS after freeing the IO request, or a descriptive result on error.

IMPORTANT This function frees the data structure pointed to by the io parameter. It also frees the memory
pointed to by the addr members of the IO request's DiskIOElem structures if the IO request was allocated by
VMIOF_DiskIODup(), but not if it was allocated by VMIOF_DiskIOAlloc().

NOTE A VMIOF_DiskIO structure cannot be reused. When an IO completes, you need to free the VMIOF_DiskIO
structure and allocate new structures for any other IOs to be submitted.


Optimizing the Performance of Your IO Filter Solution


The performance of any IO Filter Solution depends on many factors. VMware cannot prescribe a single set of
practices that guarantees the maximum possible performance, but VMware expects you to use the following
three techniques in your IO Filter Solutions to improve performance:

n Use the VMIOF_Crossfd* functions to share memory in your LIs with your Daemon

n Use VMIOF_AIO* functions to perform asynchronous IO to and from cache files and buffers in your
Daemon or memory in your Crossfd-shared memory

n After the Daemon makes off-host TCP/IP connections, it should pass the file descriptors for those
sockets to the LIs (using standard fd-passing techniques through UNIX domain sockets), so that said
LIs can communicate directly with off-host entities. This obviates the need for the Daemon to proxy IOs
between the LIs and off-host entities.

The following sub-topics discuss these points in detail.

Understanding and Using the IO Filters CrossFD Functions

The purpose of the CrossFD functions is to allow the Daemon to perform disk IO on behalf of a filter instance
without needing to copy data. In other words, the purpose is to share memory between entities within a
filter, but using the file IO abstraction instead of traditional shared memory APIs such as mmap() or System V
shared memory segments.

This set of functions creates a special kind of file descriptor, called a crossfd, that can be used to read and
write memory of the cartel that creates it. Said cartel then associates with the crossfd the specific memory to
which it is willing to grant access. It then passes the crossfd to the other cartels with which it is willing to
share the associated memory (using sendmsg() on a UNIX domain socket). The other cartels, after receiving the
crossfd (using recvmsg() on a UNIX domain socket), can use the VAIO asynchronous IO (AIO) functions to perform
IO directly to the associated memory in the sharing cartel. They can also use file IO functions such as
pread()/pwrite() to access that memory.

The programming flow is:

1 In the cartel that wants to share its memory (typically a LI):

a Use VMIOF_CrossfdCreate() to create a crossfd representing the current cartel's memory.

b Restrict the file descriptor to a given range using VMIOF_CrossfdGrantAccessToRange(). By default,
other cartels have no access to this process's memory. For security reasons, you should only grant
access to specific memory that you want to share, rather than your cartel's entire address space.

c Send the crossfd via a UNIX domain socket to another cartel using sendmsg().

2 In the cartel that wishes to access the shared memory (typically a Daemon):

a Retrieve the crossfd via a UNIX domain socket using recvmsg().

b Choice 1: Use AIO functions to perform IO directly to memory represented by the crossfd. When
the kernel processes such AIO requests, it writes directly into the address space of the cartel that
created the crossfd. For more details about AIO, see “Understanding and Using the IO Filters AIO
Functions (for cache file IO),” on page 186


c Choice 2: Perform file IO to read/write from the crossfd file descriptor.

NOTE The purpose of CrossFD is to allow the Daemon to perform AIO on behalf of a filter instance. If
AIO is not needed, pread() and pwrite() can be used to access data through the crossfd, but because every
access introduces an additional system call (pread() or pwrite()), the performance is probably not
as good as expected. As a preferable alternative, System V shared memory segments can be used.
System V shared memory segments come from the kernel directly, so there is no limitation. However,
the page table bookkeeping overhead comes from a different resource pool. For the VMX, it comes
from the uwshmempt page pool (60U2 release); for other cartels, it comes from resource pools with
a fixed limit. You do not need to account for the System V shared memory itself, but you do need to
account for the page table overhead in DAEMON_MEMORY_RESERVATION for the Daemon and in
diskRequirements() for LIs.
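
Steps 1c and 2a of the programming flow rely on the standard technique of passing a file descriptor as SCM_RIGHTS ancillary data over a UNIX domain socket. The following sketch shows that technique using only POSIX socket calls. It is a minimal example, not code from the sampfilt sources: the function names SketchSendFd() and SketchRecvFd() are hypothetical, and any open file descriptor can stand in for the crossfd when you try it out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected UNIX domain socket.
 * Returns 0 on success, -1 on error. */
static int SketchSendFd(int sock, int fdToSend)
{
    char byte = 'F';  /* at least one byte of ordinary data must accompany the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {           /* the union keeps the control buffer suitably aligned */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fdToSend, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor sent by SketchSendFd().
 * Returns the new descriptor, or -1 on error. */
static int SketchRecvFd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;    /* no descriptor was attached to the message */
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same underlying object, so the sender can keep using or close its own copy independently.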

The doxygen for these functions provides details on their syntax. The following sub-sections provide a
synopsis of that information:

Understanding the VMIOF_CrossfdCreate() Function


Use VMIOF_CrossfdCreate() to create a crossfd. This function has the following prototype:

VMIOF_Status VMIOF_CrossfdCreate(int *pcrossfd);

As shown, the function takes a single output parameter, pcrossfd, a pointer to an integer that is the file
descriptor of the crossfd.

The function returns the following values:

n VMIOF_SUCCESS — The function call succeeded and int *pcrossfd now points to a crossfd.

n VMIOF_NO_RESOURCES — The function call failed, and the file descriptor was not created, because the
system did not have enough resources. The value pointed to by pcrossfd is undefined.

NOTE You can use the standard close() function on the crossfd

Understanding the VMIOF_CrossfdGrantAccessToRange() Function


Use this function to grant access to a range of memory in the cartel that created a crossfd. By default, other
cartels cannot access any memory in the cartel that created the crossfd. Thus, you must use this function to
grant access to whatever memory you want to make accessible to other cartels. This function has the
following prototype:

VMIOF_Status
VMIOF_CrossfdGrantAccessToRange(int crossfd, uintptr_t start, unsigned long length);

This function takes the following parameters:

n int crossfd — A crossfd assigned by VMIOF_CrossfdCreate()

n uintptr_t start — The starting address of memory range (within the calling cartel) to which you are
granting access.

REMEMBER The size of a uintptr_t varies according to the execution environment. Typically, it is 32 bits
in 32-bit environments and 64 bits in 64-bit environments.

n unsigned long length — The amount of memory, starting at start, to which you wish to grant access.

The function returns the following values:

n VMIOF_SUCCESS — The grant request succeeded

n VMIOF_ALREADY_EXISTS — The grant failed because a range covering the entire region or a part of it
already exists.


n VMIOF_BAD_PARAM — The grant failed because one of the parameters is invalid (for example start is an
invalid address)

n VMIOF_NO_RESOURCES — The grant failed because the system did not have enough resources to keep track
of it internally

Understanding the CrossfdRevokeAccessToRange() Function


Use this function to revoke access to memory that was previously granted via a call to
VMIOF_CrossfdGrantAccessToRange(). This function has the following prototype:

VMIOF_Status
CrossfdRevokeAccessToRange(int crossfd, uintptr_t start, unsigned long length);

As shown, this function takes the same parameters as VMIOF_CrossfdGrantAccessToRange(), with the same
semantics. The difference is that start and length define the range of memory to which the function revokes
access for other cartels.

The return values are similarly analogous, except that VMIOF_NO_RESOURCES is not a value returned by the
function.

Understanding and Using the IO Filters AIO Functions (for cache file IO)
The VAIO provides a set of functions to manage asynchronous IO transactions to cache files (see
“Understanding and Using the VMIOF_Cache*() Functions,” on page 253) using scatter / gather lists that
increase your IO Filter performance vs alternative IO methods. For example, without the AIO functions, a
Daemon for a caching Filter that wants to write n sets of blocks to a cache file would have to invoke pwrite()
(or similar function) n separate times, waiting for each to complete, blocking the Daemon each time,
probably causing context switches for each. Using the AIO functions, the Daemon can create a list of n AIO
scatter / gather structures (of type VMIOF_AIO), and then submit the list to the IO Filter Framework with a
single function invocation, and then receive and process callbacks as said Framework completes each of the
items on the list.

NOTE The AIO functions and data structures in IO Filters are analogous to, but different from, those
defined by POSIX.

Further, the VAIO AIO functions have been enhanced to allow (but not require) code to perform IO between
a file (such as a cache file) and a crossFD file (shared memory) rather than just to a cartel's memory. This
further increases performance in IO Filter component by eliminating the need to copy data between cartel-
private and cartel-shared memory.

NOTE The older version of these functions forced the developer to use the poll callback mechanism. This
is no longer the case.

NOTE While you can't use AIO with a socket, you are able to use crossFD with sendfile().

This topic provides a discussion of the data structures and functions related to performing AIO within IO
Filter Daemon and Library components.


Understanding the VMIOF_AIO structure


The VMIOF_AIO structure describes a single IO operation for the IO Filter Framework to complete. Your
code must allocate and populate one of these structures for each contiguous set of blocks it wants to read
from / write to the disk defined in the fd member. The code submits the request to the IO Filter Framework
using the VMIOF_AIOSubmit() function discussed later in this topic.

typedef struct VMIOF_AIO {
   uint8_t reserved[64];
   VMIOF_AIOOpcode opcode;     /* Operation type: VMIOF_AIO_READ, VMIOF_AIO_WRITE,
                                  VMIOF_AIO_READV, VMIOF_AIO_WRITEV. */
   int fd;                     /* File descriptor against which the IO is performed. */
   uint64_t offset;            /* Byte offset in file. */
   uint64_t length;            /* Number of bytes to transfer. */
   uint64_t bufaddr;           /* Address of buffer to read into / write from. */
   int crossFD;                /* Cartel that the buffer exists in, -1 if current cartel. */
   VMIOF_AIOCallback callback; /* Function to invoke when the IO completes. */
   void *data;                 /* Pointer to pass to the "callback" function. */
} VMIOF_AIO;

The comments in the code above describe the usage of each member.

NOTE You can perform vectored IO via VMIOF_AIO_READV or VMIOF_AIO_WRITEV. This allows you to describe
several buffers with a single VMIOF_AIO structure.

NOTE The length of each AIO request must be a multiple of 512 bytes. For vectored IO, the overall length of
all the vectors must meet this criterion, since together they compose one IO request.
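
Because a request that violates the 512-byte rule is only rejected after submission, it can be convenient to check lengths before populating a VMIOF_AIO structure. The helpers below are a hypothetical sketch of such a check. Their names, the SKETCH_AIO_ALIGN constant, and the decision to treat a zero-length request as invalid are assumptions of this example, and the vectored variant takes a plain array of lengths because this guide does not show how the vectors themselves are laid out.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_AIO_ALIGN 512u  /* required granularity, per the NOTE above */

/* Returns nonzero if a single (non-vectored) request length is acceptable.
 * Rejecting zero is an assumption of this sketch. */
static int SketchAioLengthOk(uint64_t length)
{
    return length != 0 && length % SKETCH_AIO_ALIGN == 0;
}

/* Returns nonzero if the combined length of a vectored request is acceptable.
 * Individual vectors may have any length; only the total is checked. */
static int SketchAioVectorLengthOk(const uint64_t *lengths, size_t count)
{
    uint64_t total = 0;

    for (size_t i = 0; i < count; i++) {
        if (lengths[i] > UINT64_MAX - total) {
            return 0;          /* the sum would overflow 64 bits */
        }
        total += lengths[i];
    }
    return SketchAioLengthOk(total);
}
```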

The type VMIOF_AIOCallback defines the prototype for all callback functions, which is:

void (*VMIOF_AIOCallback)(void *data, VMIOF_Status status);

The parameters passed to the callback by the IO Filter Framework are:

n void *data — The value passed in the data member of the VMIOF_AIO structure associated with the IO
that just completed

n VMIOF_Status status — The result of the IO operation.

The status values are:

n VMIOF_SUCCESS — The IO operation completed successfully.

n VMIOF_BAD_PARAM — Bad parameter, need to reformat the AIO structure.

n VMIOF_INVALID_ADDRESS — Invalid address, need to reformat the AIO structure.

n VMIOF_NO_ACCESS — Permission denied, need to reformat the AIO structure.

n VMIOF_READ_ONLY — Disk read-only, need to reformat the AIO structure.

n VMIOF_NOT_SUPPORTED — Operation not supported, need to reformat the AIO structure.

n VMIOF_NO_RESOURCES — Not enough resources, you can retry.

n VMIOF_NO_MEMORY — Memory allocation failed, you can retry.

n VMIOF_BUSY — Busy, you can retry.

n VMIOF_RETRY — Retry operation

n VMIOF_IO_ERROR — IO error, you can retry.

n VMIOF_IO_ABORTED — IO was aborted, you can retry.


n VMIOF_NO_CONNECTION — No connection to the device, you can retry, but only after some time.

n VMIOF_NO_SPACE — Not enough space.

n VMIOF_PERM_DEV_LOSS — Device is permanently unavailable

n VMIOF_FAILURE — Other fatal failure.

Understanding the VMIOF_AIOContextProperties structure


The VMIOF_AIOContextProperties structure describes the context used to submit AIO requests. Your code
must allocate and populate this structure before calling the VMIOF_AIOContextCreate() function discussed
later in this topic.

typedef struct VMIOF_AIOContextProperties {
   VMIOF_HeapHandle *heap;       /* Heap to allocate the context from. */
   VMIOF_AIOContextFlags flags;  /* AIO context behavior flags (VMIOF_AIO_CONTEXT_FLAGS_NONE
                                    or VMIOF_AIO_CONTEXT_CUSTOM_EVENTFD). */
   int eventfd;                  /* Custom eventfd if VMIOF_AIO_CONTEXT_CUSTOM_EVENTFD is
                                    set. */
} VMIOF_AIOContextProperties;

The comments in the code above describe the usage of each member.

Understanding the VMIOF_AIOContextCreate() function


All AIO operations must be submitted via a context, of type VMIOF_AIOContext, allocated by your code. The
AIO context is allocated from the provided heap, and will be freed back to the heap when destroyed. The
prototype of this function is:

VMIOF_Status VMIOF_AIOContextCreate(const VMIOF_AIOContextProperties *props,
                                    VMIOF_AIOContext **pcontext);

The parameters are:

n VMIOF_AIOContextProperties *props — The properties from which the function allocates the context
structure

n VMIOF_AIOContext **pcontext — A pointer to a VMIOF_AIOContext pointer. The function writes the
context pointer into this pointer.

The return values are:

n VMIOF_SUCCESS — The function successfully created the context

n VMIOF_NO_MEMORY — The function could not create the context because of insufficient memory. Either the
specified heap was out of space or the system failed the underlying memory allocation to the heap

n VMIOF_NO_RESOURCES — The function could not create the context because it lacked some resource other
than memory.

When the function returns any value other than VMIOF_SUCCESS the value in pcontext is undefined. You can
retry the function a few times with a delay before finally giving up.

Each cartel can create up to 32 AIO contexts.

Understanding the VMIOF_AIOContextDestroy() function


This function destroys an AIO context and returns its memory to the original heap. The AIO context must not
have any outstanding IOs. The prototype of this function is:

void VMIOF_AIOContextDestroy(VMIOF_AIOContext *context);


The parameters are:

n VMIOF_AIOContext *context — A pointer to a VMIOF_AIOContext to be destroyed.

Understanding the VMIOF_AIOSubmit() function


Use this function to submit the AIO requests to the kernel. The AIO context is used to track the submitted
IOs, and must not be freed until all outstanding IOs have been completed.

The prototype of this function is:

VMIOF_Status VMIOF_AIOSubmit(VMIOF_AIOContext *context, VMIOF_AIO **aios, uint32_t count,
                             uint32_t *submitted);

The parameters to this function are:

n VMIOF_AIOContext *context — A pointer to the context returned by VMIOF_AIOContextCreate()

n VMIOF_AIO **aios — An array of pointers to the AIOs to submit

n uint32_t count — The number of AIOs to submit.

n uint32_t *submitted — This optional output parameter receives the number of AIOs that were submitted. If
the system did not have all the resources needed, this value is also the index of the first AIO that
was not submitted.

The return values are:

n VMIOF_SUCCESS — The framework tried to submit the specified AIOs.

n VMIOF_BAD_PARAM — The AIOs were not submitted because a parameter was malformed.

NOTE If submitted is specified and an AIO cannot be submitted, the framework does not try to submit the
remaining AIOs and instead returns the number of submitted AIOs in submitted. If submitted is not specified,
the AIOs that were not submitted have their completion callbacks invoked with an error status.

NOTE There is no strict limit for the number of AIOs you submit, but you might hit fast slab allocation
failures if you have more than 4k or 8k outstanding IOs, since they are all allocated from the same slab on a
per context basis. If you want to go way beyond that, you should create individual contexts and should be
able to push the IO limits well into the 16k to 32k range.

Understanding the VMIOF_AIOProcessCompletions() function


Use this function to have the IO Filter Framework process AIOs that are handled by an AIO context created
with the VMIOF_AIO_CONTEXT_CUSTOM_EVENTFD flag.

The prototype for this function is:

void VMIOF_AIOProcessCompletions(VMIOF_AIOContext *context, uint64_t maxIOs);

n VMIOF_AIOContext *context — the context to which the aio was previously submitted

n uint64_t maxIOs — The maximum number of IOs for which the completion callback will be called.

This function returns void.

NOTE The caller of this function is responsible for calling eventfd_read() to determine the maxIOs value.


Understanding the VMIOF_AIOAbort() function


Use this function to cancel / abort a specific VMIOF_AIO that was previously submitted to the specified
context. The prototype for this function is:

VMIOF_Status VMIOF_AIOAbort(VMIOF_AIOContext *context, const VMIOF_AIO *aio,
                            VMIOF_Status *aioStatus);

The parameters to this function are:

n VMIOF_AIOContext *context — the context to which the aio was previously submitted

n VMIOF_AIO *aio — The aio to abort

n VMIOF_Status *aioStatus — Receives the status of the AIO, but only in the case where the AIO is complete
and its callback has not yet been called.

The return values are:

n VMIOF_SUCCESS — The AIO was aborted. A completion callback for this IO has not been called and will not
be called.

n VMIOF_NOT_FOUND — The AIO was not aborted as it was not found.

n VMIOF_ASYNC — The abort will be completed asynchronously.

The Standard Coding Pattern for AIO


Given the information about the VAIO AIO functions, the following steps outline the general programming
pattern employed to perform AIO in an IO Filter with a 1:1 mapping for the LI/Daemon pair:

1 Create a new AIO context in a Daemon's Start callback, and destroy it in the Cleanup callback.

2 For each IO operation in a group of operations (for example, for each VMIOF_DiskIOElem in a
VMIOF_DiskIO structure):

a Allocate a new VMIOF_AIO structure from a VMIOF_HeapHandle.

b Populate the VMIOF_AIO structure's members.

c Add the VMIOF_AIO structure to the AIO list.

3 Submit the AIOs using the VMIOF_AIOSubmit() function. As the IO Filter Framework processes each AIO, it
calls the appropriate callback.

There are now several ways to Submit/Complete the AIOs.

Using a custom EventFD and using a Poll callback

1 Create an eventfd using eventfd()

2 When creating the context, set the eventfd and set the flag to VMIOF_AIO_CONTEXT_CUSTOM_EVENTFD

3 Register a read Poll callback on the eventfd.

4 Submit a batch of AIOs using VMIOF_AIOSubmit()

5 You will get called in your eventfd Poll callback whenever there are pending IOs. The Poll callback
needs to:

a Call eventfd_read(eventfd, &count) to get the number of AIOs that require completion

b Call VMIOF_AIOProcessCompletions() to perform completions.

c Subsequently, VMIOF_AIOProcessCompletions() shall invoke the AIO's completion routines.

6 Remove the Poll callback

7 Destroy the context


Using a custom EventFD and NOT using a Poll callback

1 Create an eventfd using eventfd()

2 When creating the context, set the eventfd and set the flag to VMIOF_AIO_CONTEXT_CUSTOM_EVENTFD

3 Submit a batch of AIOs using VMIOF_AIOSubmit()

4 Sit on a blocking call (for example, a blocking eventfd_read()) and loop until all AIOs are complete

VMIOF_AIOSubmit(ctx, aios, naios, NULL);
pending = naios;
while (pending > 0) {
   /* block waiting for at least one AIO to finish */
   eventfd_read(fd, &count);
   /* process completions */
   VMIOF_AIOProcessCompletions(ctx, count);
   pending -= count;
}

5 You will get called in your AIO completion routine.

6 Destroy the context
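
The blocking loop in step 4 depends on standard eventfd counter semantics: each eventfd_read() returns the count accumulated since the previous read and resets the counter to zero. The following self-contained sketch demonstrates only that accounting, using the Linux-style eventfd interface from <sys/eventfd.h>. It does not call the VAIO. SketchSignalCompletions() is a hypothetical stand-in for the framework signalling completed AIOs, SketchWaitForAll() is a hypothetical name for the loop, and the point where a Daemon would call VMIOF_AIOProcessCompletions() is marked in a comment.

```c
#include <stdint.h>
#include <sys/eventfd.h>

/* Hypothetical stand-in for the framework: signal n completed AIOs. */
static int SketchSignalCompletions(int efd, uint64_t n)
{
    return eventfd_write(efd, (eventfd_t)n);
}

/*
 * Block until `pending` completions have been accounted for.
 * Returns the number of eventfd_read() calls made, or -1 on error.
 */
static int SketchWaitForAll(int efd, uint64_t pending)
{
    int reads = 0;

    while (pending > 0) {
        eventfd_t count = 0;

        /* Blocks until at least one completion has been signalled. */
        if (eventfd_read(efd, &count) != 0) {
            return -1;
        }
        reads++;

        /* A real Daemon would call VMIOF_AIOProcessCompletions(ctx, count)
         * here so that the completion callbacks run. */

        if (count > pending) {
            return -1;  /* more completions than AIOs submitted */
        }
        pending -= count;
    }
    return reads;
}
```

Because the counter accumulates, several completions that arrive between two reads are reported by a single eventfd_read(), which is why the loop subtracts count rather than decrementing by one.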

NOT using a custom EventFD and NOT using a Poll callback

1 When creating the context, set eventfd to -1 and set the flag to VMIOF_AIO_CONTEXT_FLAGS_NONE

2 Submit a batch of AIOs using VMIOF_AIOSubmit()

3 You will get called in your AIO callback for each completed AIO.

4 You don’t need to call VMIOF_AIOProcessCompletions() because the AIO is complete once your callback is
called. In fact, calling VMIOF_AIOProcessCompletions() is strictly forbidden if you do not specify a
custom eventfd.

5 Destroy the context

NOTE In our internal testing, we observed the best performance from the second option: a custom eventfd and
no Poll callback. The thread that submits the IOs should be the one that waits for them and processes their
completions.

The programming pattern between LIs and Daemons deserves further explanation in cache solutions.
Generally, in this case, the Daemon will create and manage the cache file, but the LI receives the
diskIOStart callbacks with the IO requests. Thus, the transaction involves two programming patterns: One
in the LI; one in the Daemon, as follows:

In the LI:

1 Create a set of buffers to share memory between the LI and Daemon. (Optional: You can directly map
the guest's IO buffers)

2 Create a crossFD and grant access to the buffers created in the previous step as described in the topic
“Understanding and Using the IO Filters CrossFD Functions,” on page 184. Do this in the LI's diskOpen
callback. Close the resulting file descriptor in the LI's diskClose callback. Send the crossFD file
descriptor to the Daemon.

3 In the diskIOStart callback, send a control message to the Daemon indicating how many IO requests to
perform to fulfill this IO's request. Then, for each VMIOF_DiskIOElem in a VMIOF_DiskIO structure that you
want the Daemon to fulfill:

a For read operations, assign a buffer in shared memory into which the Daemon must write and send
control information to the daemon with this information


b For write operations, assign a buffer in shared memory, copy the data from the VMIOF_DiskIOElem
into said buffer, and then send control information to the daemon with this information

4 Send a control message to the Daemon to perform the IO operations
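
The per-element bookkeeping in step 3 can be sketched as follows. The control record layout, the SketchOp values, and the function name are all hypothetical, chosen only for illustration; the document does not define a wire format between the LI and the Daemon. The sketch assigns each element a slice of the shared region and, for writes, copies the data into that slice before the record is sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operation codes and control record for one IO element. */
typedef enum { SKETCH_OP_READ = 0, SKETCH_OP_WRITE = 1 } SketchOp;

typedef struct {
    uint32_t op;         /* SKETCH_OP_READ or SKETCH_OP_WRITE */
    uint64_t diskOffset; /* where on the virtual disk the IO applies */
    uint64_t bufOffset;  /* offset of the data inside the shared region */
    uint64_t length;     /* number of bytes to transfer */
} SketchCtrlRecord;

/* Reserve 'length' bytes of the shared region for one element and fill
 * in its control record. For writes, the data is copied into the shared
 * buffer first so the Daemon can read it. Returns 0 on success, -1 if
 * the shared region has no room left. */
static int sketch_prepare_elem(uint8_t *shared, size_t sharedSize,
                               size_t *used, SketchOp op,
                               uint64_t diskOffset, const void *data,
                               size_t length, SketchCtrlRecord *out)
{
    assert(shared != NULL && used != NULL && out != NULL);
    if (*used > sharedSize || length > sharedSize - *used) {
        return -1;
    }
    if (op == SKETCH_OP_WRITE) {
        assert(data != NULL);
        memcpy(shared + *used, data, length);
    }
    out->op = (uint32_t)op;
    out->diskOffset = diskOffset;
    out->bufOffset = *used;
    out->length = length;
    *used += length;
    return 0;
}
```

For reads, the record only names the slice; the Daemon is expected to fill it before signalling completion.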

In the Daemon:

1 For each LI connection, create an AIO context. Destroy it, performing any appropriate cleanup, when
the LI disconnects. Receive the crossFD from the LI during initial handshake with it.

2 Upon receiving a new IO request control packet from the LI:

a Receive each control packet with individual IO element information.

b Allocate a new VMIOF_AIO structure and populate its members with the addresses of the buffers,
lengths, etc. specified in the control packet received in the preceding step. Remember to set the
crossFD member to the one received at handshake.

c Add the VMIOF_AIO structure to the Daemon's list.

d Upon receiving all of the IO elements for a given request, submit the list.

IMPORTANT Your code must provide any synchronization necessary between submit / drain operations and
abort operations. For example, extending the preceding programming pattern, if a Daemon receives a
control message to abort an IO element after it has been submitted, said request can arrive while the callback
for the given IO element is running. Therefore the code must synchronize access to any data structures
shared between the callback and abort code.
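
One possible way to provide this synchronization is sketched below, assuming a per-element state guarded by a POSIX mutex. The SketchIoElem type and the function names are hypothetical and are not part of the VAIO API. Whichever path (completion callback or abort handler) claims the element first wins, and the other path sees that it lost and must not touch the element's resources.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef enum {
    SKETCH_IO_SUBMITTED,
    SKETCH_IO_COMPLETED,
    SKETCH_IO_ABORTED
} SketchIoState;

/* Hypothetical per-element record shared by the completion callback
 * and the abort path. */
typedef struct {
    pthread_mutex_t lock;
    SketchIoState state;
} SketchIoElem;

static void sketch_io_init(SketchIoElem *io)
{
    assert(io != NULL);
    pthread_mutex_init(&io->lock, NULL);
    io->state = SKETCH_IO_SUBMITTED;
}

/* Move the element out of SUBMITTED exactly once.
 * Returns true if this caller performed the transition. */
static bool sketch_io_claim(SketchIoElem *io, SketchIoState newState)
{
    bool won = false;

    pthread_mutex_lock(&io->lock);
    if (io->state == SKETCH_IO_SUBMITTED) {
        io->state = newState;
        won = true;
    }
    pthread_mutex_unlock(&io->lock);
    return won;
}

/* Called from the AIO completion callback. */
static bool sketch_on_complete(SketchIoElem *io)
{
    return sketch_io_claim(io, SKETCH_IO_COMPLETED);
}

/* Called when an abort control message arrives. */
static bool sketch_on_abort(SketchIoElem *io)
{
    return sketch_io_claim(io, SKETCH_IO_ABORTED);
}
```

A real Daemon would likely protect more than a single state field, but the same claim-once pattern applies to any structure both paths can reach.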

NOTE You need to protect against data leakage from a VM by ensuring that a LI can’t read data in the cache
that is from a different VM. Typically, you do this by using some level of indirection in your cache
implementation.

Understanding and Processing Snapshot-related Events


A snapshot is a set of data that represents the state of a VM, including its RAM, CPUs, and virtual disk
contents, at a specified point in time. Currently, administrators or programs using appropriate APIs must
specifically tell hypervisors to take snapshots of a VM. Storage systems can implement snapshots in
different ways. For example:

n NFS and VMFS-based datastores use delta disks for VMDKs. Each time a snapshot happens:

a The existing VMDK is frozen. No new writes are allowed to this VMDK file. It is now called a
parent VMDK.

b The hypervisor creates a new VMDK file, called the child (relative to the parent discussed in the
preceding step). All writes to the virtual disk occur to this child VMDK file. This is why it is called a
delta disk.

The hypervisor satisfies reads by first checking for the block in the child VMDK file. If the child file
does not contain the block, the hypervisor looks for the block in the parent VMDK file.

c The virtual disk now consists of this chain of parent and child VMDK files. The VMDK file that
exists before any snapshots are taken is referred to as the base VMDK.

Should someone take another snapshot, the hypervisor repeats these 3 steps, and the virtual disk then
consists of a chain of 3 VMDK files etc. Reads may take longer now because the hypervisor may have to
check 3 files before it finds the block it needs. Long chains of delta disks can have a significant impact
on the performance of a VM.

n For native snapshots (VVOL and NFS VAAI), no delta disk is created. Instead, the whole disk is
copied, so you will not see the disk chain opened.
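
The read path through a delta-disk chain can be illustrated with a small in-memory model. This is only a sketch of the lookup order described above, not the hypervisor's implementation; the SketchLink type and SKETCH_BLOCKS constant are hypothetical. The chain is ordered newest child first, matching the order in which filesInChain is printed in the sample logs later in this chapter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BLOCKS 8

/* Hypothetical model of one VMDK link: which blocks it contains. */
typedef struct {
    const char *name;
    bool present[SKETCH_BLOCKS];
} SketchLink;

/* chain[0] is the newest child; chain[links - 1] is the base VMDK.
 * Returns the index of the link that satisfies the read, or -1 if no
 * link holds the block. *filesChecked reports how many files were
 * examined, which grows with the length of the chain. */
static int sketch_resolve_read(const SketchLink *chain, size_t links,
                               size_t block, size_t *filesChecked)
{
    assert(chain != NULL && block < SKETCH_BLOCKS);
    if (filesChecked != NULL) {
        *filesChecked = 0;
    }
    for (size_t i = 0; i < links; i++) {
        if (filesChecked != NULL) {
            *filesChecked = i + 1;
        }
        if (chain[i].present[block]) {
            return (int)i;
        }
    }
    return -1;
}
```

A block that lives only in the base VMDK costs one lookup per link in the chain, which is why long chains slow reads down.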


The IO Filter Framework passes several snapshot-related events onto IO Filters, including:

n Taking a Snapshot — When an administrator or program takes a snapshot on a VM, the IO Filters
attached to the VM's VMDKs are notified of the snapshot event (unless the VMDK is marked as
independent by the vSphere Administrator). A successful snapshot operation results in the host having a
complete image of the VM's state, including that of its VMDKs, such that the vSphere environment may
later return (revert) the VM to this state at will. For that to be possible, each IO Filter must do its part to
make the VMDKs to which it is attached clean and complete so that the snapshot can proceed, or
fail the event to prevent the snapshot from happening.

n Reverting to a Snapshot — When an administrator reverts a VM to a given snapshot, the vSphere
system must return the VM and its VMDKs to the state represented in a given snapshot. To do this,
vSphere restores the memory and VMDK's contents to what they had been at the time of the snapshot.
For this, the IO Filter Framework:

a Deletes the current delta disk

b Creates a new delta disk

c Invokes the filter's diskSnapshot callback

n Deleting a Snapshot — When an administrator or program deletes a snapshot, the hypervisor takes the
data from that snapshot and rolls it forward into the earlier VMDK file. After the snapshots are deleted,
the IO Filters attached to the VM's VMDKs are notified of a disk collapse event (unless the VMDK is
marked as independent by the Administrator). They must determine which snapshots were deleted and
update their state accordingly.

The following subsections provide details for the callbacks invoked for snapshot and collapse events. They
also discuss functions a filter can use to scan for blocks that have changed since the most recent snapshot.

NOTE Currently we don't support attaching filters to a VM/VMDK that has existing snapshots.

Understanding and Processing a Snapshot Event


Remember from “Understanding and Defining the diskSnapshot Callback,” on page 104 that the IO Filter
Framework invokes a filter's diskSnapshot callback several times in response to snapshot requests invoked
by administrators or programs. The framework invokes this callback before a snapshot is taken, once a
new child disk is created, and if the snapshot operation encounters an error.

The callback has the following prototype:

VMIOF_Status (*diskSnapshot)(VMIOF_DiskHandle *handle, const VMIOF_DiskSnapshotInfo *info);

Parameters :
n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for the
filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskSnapshotInfo *info — This is a pointer to a structure of type
VMIOF_DiskSnapshotInfo that describes the phase in which the callback is invoked, and provides functions
to notify the framework of progress so as to avoid timeouts and to acknowledge completion of
asynchronous processing.

typedef struct VMIOF_DiskSnapshotInfo {
   /**
    * \brief The phase in which the callback was invoked.
    *
    * This member describes the phase in which the 'diskSnapshot'
    * callback is invoked. Possible values are: \ref
    * VMIOF_SNAPSHOT_PREPARE to describe the prepare phase of a snapshot
    * operation, \ref VMIOF_SNAPSHOT_NOTIFY indicating that a snapshot
    * is being taken and should be considered a success, and
    * \ref VMIOF_SNAPSHOT_FAIL signaling a snapshot failure.
    */
   VMIOF_DiskSnapshotPhase phase;
   /**
    * \brief Function to post progress updates to framework.
    *
    * If the progressFunc is specified, the filter can post progress
    * updates to the framework as part of its asynchronous processing
    * of the notification.
    */
   VMIOF_DiskOpProgressFunc progressFunc;
   /**
    * \brief Function to acknowledge completion of notification.
    *
    * If the completionFunc is specified, the filter can defer the
    * snapshot request in which case it can later acknowledge completion
    * of the asynchronous processing to the framework by means of this
    * function.
    */
   VMIOF_DiskOpCompletionFunc completionFunc;
} VMIOF_DiskSnapshotInfo;

typedef enum VMIOF_DiskSnapshotPhase {
   /*
    * Prepare for snapshot operation. In this phase filter should stun
    * all its background operations and flush all its dirty data to disk
    * to make the disk consistent. It may also perform additional work
    * that is required by a snapshot. IOs to the disk must continue to
    * be processed while this notification is in progress. On completion,
    * the filter must continue to process IOs keeping the disk in a
    * consistent state until the disk is closed or a
    * \ref VMIOF_SNAPSHOT_FAIL notification is received. There is no
    * guarantee that the snapshot will be created immediately after filter
    * finishes its prepare stage. Note that neither
    * \ref VMIOF_SNAPSHOT_NOTIFY nor \ref VMIOF_SNAPSHOT_FAIL may be seen
    * in the event of a catastrophic failure.
    */
   VMIOF_SNAPSHOT_PREPARE = 1,
   /*
    * A snapshot has been created. This callback is fired on the new
    * disk. Filter should consider this a success and ignore
    * \ref VMIOF_SNAPSHOT_FAIL after this.
    */
   VMIOF_SNAPSHOT_NOTIFY = 2,
   /*
    * A snapshot operation has failed. If the \ref VMIOF_SNAPSHOT_NOTIFY
    * phase has not been seen, then it should no longer be expected. On
    * failure, a filter can resume background work and the disk is not
    * required to be in a consistent state.
    */
   VMIOF_SNAPSHOT_FAIL = 3,
} VMIOF_DiskSnapshotPhase;


Return Value :

n VMIOF_SUCCESS — The function succeeded in handling the reported phase of the snapshot

n VMIOF_ASYNC — diskSnapshot returns VMIOF_ASYNC when the work is deferred to be handled
asynchronously. The framework postpones the snapshot until VMIOF_DiskOpCompletionFunc()
is invoked.

NOTE This return value is only allowed in the prepare phase. It must not be used in the notify or failure
phase.

n Any other value will indicate failure, which will fail the snapshot operation. Returning an error code is
only allowed in the phases of VMIOF_SNAPSHOT_PREPARE and VMIOF_SNAPSHOT_NOTIFY, while the phase
VMIOF_SNAPSHOT_FAIL may only return VMIOF_SUCCESS.

NOTE Usually the IO Filter framework expects filters to fail the snapshot operation in the
VMIOF_SNAPSHOT_PREPARE phase. Remember that there are cases where VMIOF_SNAPSHOT_PREPARE is not
delivered, e.g. when vSphere creates VMDKs for linked clone VMs. In such cases filters may fail the
snapshot operation in VMIOF_SNAPSHOT_NOTIFY phase.

Prepare Phase — During the prepare phase, the handler is invoked with the phase set to
VMIOF_SNAPSHOT_PREPARE. In this phase, the filter instance should stun all its background operations and
flush all its dirty data to disk to make the disk consistent. It may also perform additional work that is
required by a snapshot. IOs to the disk must continue to be processed while this notification is in progress.
On completion, the filter must continue to process IOs keeping the disk in a consistent state until the disk is
closed or a VMIOF_SNAPSHOT_FAIL notification is received. There is no guarantee that the snapshot will be
created immediately after the filter finishes its prepare stage. Note that neither VMIOF_SNAPSHOT_NOTIFY nor
VMIOF_SNAPSHOT_FAIL may be seen in the event of a catastrophic failure.

If there is no asynchronous work required by the filter instance, a value of VMIOF_SUCCESS is returned. This
is the state where no action is needed and everything is ready for the next phase.

When there is work to be done by the filter instance, a value of VMIOF_ASYNC is returned. The filter should
stun all its background operations and flush all its "dirty data" to disk to make the disk consistent. The filter
instance must report the progress of its actions continuously to the Filter Framework using the
VMIOF_DiskOpProgressFunc() pointer in the VMIOF_DiskSnapshotInfo structure. This function includes two
parameters, the disk handle and the percentage complete, and allows for updating the progress on a
granularity of one percent. This function can report the same progress multiple times but decreasing the
progress is prohibited. Progress is reported approximately every 10 seconds until the necessary actions are
completed.
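
The progress rules (one-percent granularity, repeats allowed, decreases prohibited) can be enforced with a small helper such as the following sketch. The function name and its parameters are hypothetical; the value it returns is what a filter would pass as the percentage argument of the progress function.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the next percentage to report from the amount of work done.
 * The result is clamped to 0..100 and is never lower than the value
 * reported last time, so progress cannot go backwards.
 * Assumes done * 100 does not overflow a 64-bit integer. */
static unsigned sketch_next_progress(unsigned lastReported,
                                     uint64_t done, uint64_t total)
{
    assert(lastReported <= 100);
    unsigned pct = 100; /* no work, or all work finished */

    if (total > 0 && done < total) {
        pct = (unsigned)((done * 100) / total);
    }
    return pct < lastReported ? lastReported : pct;
}
```

For example, if 40 percent was already reported and a recalculation yields 25 percent because more dirty data was discovered, the helper keeps reporting 40.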

Completion of work is indicated to the Filter Framework using the VMIOF_DiskOpCompletionFunc() pointer in
the VMIOF_DiskSnapshotInfo structure. If the filter instance reports VMIOF_SUCCESS through this function, the
Filter Framework is allowed to proceed with the snapshot. However, if the filter reports any other status
value, the Prepare Phase ends for all filters and the snapshot is terminated.

Once all filters have returned VMIOF_SUCCESS, the Prepare Phase is over. At this point, all disks associated
with the VM are stunned/closed. The Filter Framework issues a diskStun()/diskClose() to all the filters.

The snapshot process now moves into the Notify Phase.

Notify Phase — The following diagram (taken from a VMware KB Article) shows an overview of a disk
with 3 snapshots.


Figure 5‑2. Disk Chain

Assume we take the first snapshot. Afterwards, we have the parent VMDK file vm.vmdk and the
child VMDK file vm-001.vmdk. All sidecar files are also copied for the child VMDK file. So if you have a
sidecar file vm.vmfd for vm.vmdk, then after the snapshot you will have a copy of vm.vmfd, named
vm-001.vmfd, associated with the child VMDK file vm-001.vmdk.

In case of failure, the diskSnapshot() callback will be seen with the phase VMIOF_SNAPSHOT_FAIL. On failure, a
filter instance can resume background work and the disk is not required to be in a consistent state.

First, the child VMDK vm-001.vmdk will be opened with VMIOF_DiskFlags in VMIOF_DiskInfo set to
VMIOF_DISK_NO_IO. At this time, IOs to the disk are not allowed, but IOs to sidecar files are allowed. All
access to the sidecar files will go to the ones associated with the child VMDK, which is vm-001.vmfd in our
example. If a filter needs to track parent–child disk relationships, VMware recommends that the child's
sidecar be updated so that the relationship can be inferred from it. Then vm-001.vmdk will receive the
diskSnapshot callback with VMIOF_DiskSnapshotPhase in VMIOF_DiskSnapshotInfo set to
VMIOF_SNAPSHOT_NOTIFY, and the diskClose callback afterwards.


Then, the whole VMDK chain, vm.vmdk and vm-001.vmdk, will be opened together. In other words, in
VMIOF_DiskInfo, linksInChain will be 2, while filesInChain will refer to the two VMDK path names,
vm.vmdk and vm-001.vmdk. Access to the sidecar file will continue to go to vm-001.vmfd. At this point, all
IOs are still suspended until the diskUnstun callback is received. Then the VMDKs will resume seeing IOs.

NOTE A Linked Disk Clone is implemented through the Disk Snapshot mechanism, so the LI should expect
only a diskSnapshot callback, not a diskClone callback. In the current release, there is no way to
distinguish a Snapshot from a Linked Clone, but in the next release we will add a flag in the diskSnapshot
callback to distinguish them. For a Linked Clone, the LI will not receive diskSnapshot PREPARE, but only
diskSnapshot NOTIFY. A caching solution does not need to flush dirty data for a Linked Clone, because
the Linked Clone is on top of a Snapshot, so the dirty data should already have been flushed as part of the
Snapshot process.

NOTE Independent-Nonpersistent VMDKs are also implemented using Snapshot. When a VM is powered
on, the base Independent-Nonpersistent disk is opened in read-only mode, then a delta disk is created as
the “redo log”, the diskSnapshot callback is delivered with the Notify phase, and then the disk chain is
opened for IOs. When the VM is powered off, the delta disk is deleted and its content is discarded, causing
the diskDetach callback to be invoked with the VMIOF_DISK_DETACH_DELETE flag set.

The following log messages from vmware.log show a snapshot operation while the VM is running. In this
case, the disk name is cent55.vmdk:

| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 32AC1900
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 32AC1900
| vmx| I120: In the callback SampleFilterDiskSnapshot
| vmx| I120: >> handle is 32AC1900
| vmx| I120: >> Snapshot phase is 1
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStun
| Upcall-bc85c3| I120: >> handle is 32AC1900
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 32AC1900
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 3291E350
| vcpu-0| I120: >> diskFlags is 1
| vcpu-0| I120: >> linksInChain is 1
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55-000001.vmdk
| vcpu-0| I120: In the callback SampleFilterDiskSnapshot
| vcpu-0| I120: >> handle is 3291E350
| vcpu-0| I120: >> Snapshot phase is 2
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 3291E350
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 3291E240
| vcpu-0| I120: >> diskFlags is 0
| vcpu-0| I120: >> linksInChain is 2
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55-000001.vmdk
| vcpu-0| I120: >> the 2(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55.vmdk
| Upcall-cda5d4| I120: In the callback SampleFilterDiskUnstun
| Upcall-cda5d4| I120: >> handle is 3291E240
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO

| Upcall-bc85c3| I120: >> handle is 3291E240
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 3291E240
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 3291E240

Quiesce Snapshot

When customers take a snapshot, they have the option to "Quiesce the guest file system".
vSphere needs VMware Tools to orchestrate this application-level quiescing inside the guest operating
system. After application-level quiescing is complete, applications are allowed to modify the quiesced state
of the disk, mostly to clean up unwanted data. Snapshots are read-only, so to allow applications to write
to the quiesced state, vSphere creates one more child from the quiesced disk state and then hot-adds that
disk to the guest so that applications can write to it. The second disk is always opened with SYNC flags
set, and a caching-based IO Filter Solution should write through to this disk. After the snapshot operation
is complete, the snapshot points to the second child. The IO Filter Solution can then open this second child
and the base disk as a chain for a consistent snapshot image backup.

Currently, writable snapshots are taken only on Win2k3, Win2k8, Win2k8r2 and Win8server.

Both VMFS and NFS Native Snapshot create a second child disk for a quiesced snapshot. On VVOL, however,
no second child disk is created for a quiesced snapshot; instead, the parent disk sees the writes.

Reverting a Snapshot

A snapshot revert creates a new child disk, which invokes the snapshot callback, and deletes the current
running point, which invokes the detach callback. The following example shows the series of callbacks
invoked when the revert is performed from the vSphere Web Client GUI.


In this example, a VM (TVM) was created with two snapshots, snap1 and snap2. The VM
was in a powered-on state. Using the VC GUI, the "Revert to the latest snapshot" operation was performed.


First, the VMDK chain (TVM.vmdk, TVM-000001.vmdk, TVM-000002.vmdk) is closed and the VM is
powered off, as shown in vmware.log:

448:2015-11-12T17:12:38.714Z| vmx| I120: In the callback SampleFilterDiskOpen
450:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>capacity = 17179869184
451:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>links = 3
452:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>/vmfs/volumes/56166dfd-e2a858ea-
b4d1-005056b6fe3e/TVM/TVM-000002.vmdk
453:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>diskFlags = 0
454:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>/vmfs/volumes/56166dfd-e2a858ea-
b4d1-005056b6fe3e/TVM/TVM-000001.vmdk
455:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>diskFlags = 0
456:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>/vmfs/volumes/56166dfd-e2a858ea-
b4d1-005056b6fe3e/TVM/TVM.vmdk
457:2015-11-12T17:12:38.714Z| vmx| I120: <Debug>diskFlags = 0
1225:2015-11-12T17:12:39.208Z| Upcall-1e62062| I120: <Debug>In the callback SampFilterdiskUnstun
1393:2015-11-12T17:12:43.789Z| Upcall-1e62062| I120: In the callback SampleFilterDiskStartIO
1716:2015-11-12T17:34:35.455Z| vmx| I120: In the callback SampleFilterDiskClose

A new snapshot is created based on the chain of TVM.vmdk and TVM-000001.vmdk, by forming a new
child TVM-000003.vmdk. After that, the child disk TVM-000002.vmdk is deleted. This is reflected in
hostd.log:

2015-11-12T17:34:34.917Z info hostd[31043B70] [Originator@6876
sub=Vmsvc.vm:/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/TVM.vmx
opID=b1995ac0-1c2c-4f19-95c1-55d61552406a-49877-ngc-5a-0-884a
user=vpxuser:VSPHERE.LOCAL\Administrator] State Transition
(VM_STATE_ON -> VM_STATE_REVERT_SNAPSHOT)
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskOpen
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>capacity = 0
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>links = 2
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs]
<Debug>/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/
TVM/TVM-000001.vmdk
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>diskFlags = 0
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs]
<Debug>/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/TVM.vmdk
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>diskFlags = 0
2015-11-12T17:34:37.866Z info hostd[2FFEFB70] [Originator@6876 sub=Libs] <Debug>In the callback
SampFilterdiskUnstun
2015-11-12T17:34:37.866Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>In the callback
SampFilterSnapshot
2015-11-12T17:34:37.867Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>Snapshot phase
is 1
2015-11-12T17:34:37.892Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskClose
2015-11-12T17:34:37.892Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskRequirements
2015-11-12T17:34:38.151Z info hostd[306C2B70] [Originator@6876 sub=DiskLib] DISKLIB-
LIB_CREATE : CREATE CHILD: "/vmfs/volumes/
56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/TVM-000003.vmdk" -- vmfsSparse cowGran=1 allocType=0
policy=
2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskOpen
2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>capacity = 0

2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>links = 1
2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs]
<Debug>/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/
TVM-000003.vmdk
2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>diskFlags = 1
2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>In the callback
SampFilterSnapshot
2015-11-12T17:34:38.365Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>Snapshot phase
is 2
2015-11-12T17:34:38.422Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskClose
2015-11-12T17:34:38.422Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskRequirements
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskOpen
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>capacity = 0
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>links = 2
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs]
<Debug>/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/
TVM-000001.vmdk
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>diskFlags = 0
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs]
<Debug>/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/TVM.vmdk
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>diskFlags = 0
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] FiltLib: sampfilt:
diskOpen successful.
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskClose
2015-11-12T17:34:38.487Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskRequirements
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskOpen
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>capacity = 0
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>links = 1
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs]
<Debug>/vmfs/volumes/56166dfd-e2a858ea-b4d1-005056b6fe3e/TVM/
TVM-000002.vmdk
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>diskFlags = 1
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] FiltLib: sampfilt:
diskOpen successful.
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskDetach
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] <Debug>detachFlag
(VMIOF_DISK_DETACH_DELETE)is 1
2015-11-12T17:34:39.114Z info hostd[306C2B70] [Originator@6876 sub=Libs] In the callback
SampleFilterDiskClose

Understanding and Processing a DiskCollapse Event


Recall from “Understanding and Defining the diskCollapse Callback,” on page 105 that the IO Filter
Framework invokes a filter's diskCollapse callback after the completion of the consolidate operation. This
indicates that the data in the current disk link has been collapsed to the parent link. The parent link now
contains the combined data representing both itself and the current link.


The callback has the following prototype:

VMIOF_Status (*diskCollapse) (VMIOF_DiskHandle *handle, const VMIOF_DiskCollapseInfo *info);

The parameters to this callback function are:

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskCollapseInfo *info — This input parameter is a pointer to a structure of type
VMIOF_DiskCollapseInfo that is currently unused.

typedef struct VMIOF_DiskCollapseInfo {
   /** \brief currently unused. */
   void *unused;
} VMIOF_DiskCollapseInfo;

The values this callback can return are:

n VMIOF_SUCCESS — The function succeeded in handling the disk collapse event.

n VMIOF_FAILURE — This return value indicates a failure. The framework will fail the disk collapse
operation, but the link hierarchy will be unaffected. Because the deletion of any delta disk (or analogous
object on VVOLS) has already occurred and cannot be reverted, the parent link now contains the
combined disk contents of the child and parent.

Unless the deleted snapshot was at the tail of the snapshot chain, the disks are consolidated: the delta
disk that followed the deleted snapshot is merged into its parent in order to maintain consistency. For
example, suppose that at time T, a virtual disk had the following chain:

[LinuxVM.vmdk ] ------> Snap 1
   |
   |--[LinuxVM-000001.vmdk] -------> Snap 2
         |
         |--[LinuxVM-000002.vmdk] -----> Snap 3
               |
               |--[LinuxVM-000003.vmdk] -----> You are here

Suppose an administrator (or program) deletes Snap 2. To effect this, the hypervisor takes the contents of
LinuxVM-000002.vmdk and writes them into LinuxVM-000001.vmdk, then moves the label of Snap 3 to
LinuxVM-000001.vmdk. For convenience, we express the combined disk as LinuxVM-000001.vmdk
+LinuxVM-000002.vmdk. There are no changes to the base disk or LinuxVM-000003.vmdk.

The following diagram shows the resulting chain:

[LinuxVM.vmdk ] ------> Snap 1
   |
   |--[LinuxVM-000001.vmdk] -------> Snap 3 [LinuxVM-000001.vmdk + LinuxVM-000002.vmdk ]
         |
         |--[LinuxVM-000003.vmdk] -----> You are here
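
The roll-forward of LinuxVM-000002.vmdk into LinuxVM-000001.vmdk can be pictured with a small in-memory model. This is an illustration only, not the hypervisor's on-disk algorithm; the SketchDelta type and SKETCH_BLOCKS constant are hypothetical. Wherever the child holds a block, its newer contents replace the parent's; blocks the child never wrote are left as they were.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BLOCKS 4

/* Hypothetical model of a delta disk: which blocks it holds and a
 * token value standing in for each block's contents. */
typedef struct {
    bool present[SKETCH_BLOCKS];
    int data[SKETCH_BLOCKS];
} SketchDelta;

/* Merge the child's blocks into the parent. After this call the parent
 * represents the combined contents of both links. */
static void sketch_collapse(SketchDelta *parent, const SketchDelta *child)
{
    assert(parent != NULL && child != NULL);
    for (size_t b = 0; b < SKETCH_BLOCKS; b++) {
        if (child->present[b]) {
            parent->present[b] = true;
            parent->data[b] = child->data[b];
        }
    }
}
```

A filter that keeps per-link metadata, for example in sidecar files, would need an analogous merge of its own state when it handles the diskCollapse callback.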

Let's see a real example to show what the IOFilter Framework does and what Library callbacks are invoked
during the process of disk consolidation when a VM is running.

Before the operation, the disk chain, parent disk "cent55.vmdk" and child disk "cent55-001.vmdk", is open
and expecting IOs. There are two sidecar files, "cent55.vmfd" and "cent55-001.vmfd", associated with the
parent and child VMDK respectively. Access to the sidecar file goes to the child file "cent55-001.vmfd".

During disk consolidation operation, the IOFilter Framework takes the following steps:

1 Since the VM is running, the Framework needs to first stun the VM. It does this by invoking diskStun
and then the diskClose callback on the disk chain.


2 In order to consolidate the two VMDKs, both are opened and then closed separately.

3 The framework then opens the disk chain and unstuns it, so that the VM can resume running.

4 Consolidation occurs by copying the content from "cent55-001.vmdk" to "cent55.vmdk" while the VM is
actively running.

5 The framework then stuns the VM again by invoking the diskStun and the diskClose callbacks to the
chain. It then copies the content of sidecar file "cent55-001.vmfd" to "cent55.vmfd" replacing the original
content.

6 The framework then opens the disk chain, invokes the diskCollapse callback, and then closes the disk
chain.

7 It then updates the vmx file to point to the consolidated disk "cent55.vmdk".

8 The next step is to open the child "cent55-001.vmdk" with VMIOF_DiskFlags in VMIOF_DiskInfo set to
VMIOF_DISK_NO_IO. The diskDetach callback is then invoked with the VMIOF_DiskDetachFlags in
VMIOF_DiskDetachInfo set to VMIOF_DISK_DETACH_DELETE. The framework then closes the child disk.

9 The last step involves opening the parent disk (cent55.vmdk), unstunning it, and beginning to issue
IOs. Access to the sidecar file is now directed to "cent55.vmfd".

Example log messages from vmware.log follow. In the log, the child disk appears as cent55-000001.vmdk.

| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO


| Upcall-bc85c3| I120: >> handle is 3291E240
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 3291E240
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStun
| Upcall-bc85c3| I120: >> handle is 3291E240
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 3291E240
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 32788F10
| vcpu-0| I120: >> diskFlags is 0
| vcpu-0| I120: >> linksInChain is 1
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55-000001.vmdk
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 327E5290
| vcpu-0| I120: >> diskFlags is 0
| vcpu-0| I120: >> linksInChain is 1
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55.vmdk
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 32788F10
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 327E5290
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 32ABFF70
| vcpu-0| I120: >> diskFlags is 0
| vcpu-0| I120: >> linksInChain is 2
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55-000001.vmdk
| vcpu-0| I120: >> the 2(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55.vmdk
| Upcall-be05d7| I120: In the callback SampleFilterDiskUnstun
| Upcall-be05d7| I120: >> handle is 32ABFF70
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStun

| Upcall-bc85c3| I120: >> handle is 32ABFF70


| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 32ABFF70
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 3291E710
| vcpu-0| I120: >> diskFlags is 0
| vcpu-0| I120: >> linksInChain is 2
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55-000001.vmdk
| vcpu-0| I120: >> the 2(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55.vmdk
| vcpu-0| I120: In the callback SampleFilterDiskCollapse
| vcpu-0| I120: >> handle is 3291E710
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 3291E710
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 32C1E1F0
| vcpu-0| I120: >> diskFlags is 1
| vcpu-0| I120: >> linksInChain is 1
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55-000001.vmdk
| vcpu-0| I120: clock_gettime: 1437993029 sec, 45138745 nsec
| vcpu-0| I120: FiltLib: sampfilt: diskOpen successful.
| vcpu-0| I120: In the callback SampleFilterDiskDetach
| vcpu-0| I120: In the callback SampleFilterDiskClose
| vcpu-0| I120: >> handle is 32C1E1F0
| vcpu-0| I120: In the callback SampleFilterDiskOpen
| vcpu-0| I120: >> handle is 328B2F50
| vcpu-0| I120: >> diskFlags is 0
| vcpu-0| I120: >> linksInChain is 1
| vcpu-0| I120: >> the 1(th) file in filesInChain is /vmfs/volumes/55a904a2-
faa26b55-7b06-005056b5cf7d/cent55/cent55.vmdk
| Upcall-be05d7| I120: In the callback SampleFilterDiskUnstun
| Upcall-be05d7| I120: >> handle is 328B2F50
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 328B2F50
| Upcall-bc85c3| I120: In the callback SampleFilterDiskStartIO
| Upcall-bc85c3| I120: >> handle is 328B2F50

Understanding and Processing the diskExtentGet* Events


When ESXi writes a block to a child disk, ESXi creates a disk extent. For caching solutions, the filter keeps
track of the blocks residing in the cache, but does not share that information with the IO Filter Framework.
The callbacks discussed in this topic (diskExtentGetPre and diskExtentGetPost) allow the IO Filter
Framework to query the filter to determine the cache state of the disk extents.

NOTE These two callbacks concern only cached write data, not the read cache. Because all write data goes to the
current delta disk, the filter can simply ignore these two callbacks if currentDelta is not set.

The framework invokes diskExtentGetPre before a disk extent get operation (performed by vSphere).

The prototype for the diskExtentGetPre() callback is :

VMIOF_Status
(*diskExtentGetPre)(VMIOF_DiskHandle *handle, VMIOF_DiskExtentGetInfo *info);

The parameters to this callback are:

n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to.

n const VMIOF_DiskExtentGetInfo *info - This input parameter is a pointer to a structure that describes
the disk extent. The structure is defined as follows:

typedef struct VMIOF_DiskExtentGetInfo {
   uint64_t startOffset;
   uint64_t extentOffset;
   uint64_t length;
   bool currentDelta;
} VMIOF_DiskExtentGetInfo;

n uint64_t startOffset - The start offset, in bytes, of the scan.

n uint64_t extentOffset - The offset, in bytes, of the extent found by the scan.

n uint64_t length - The length, in bytes, of the extent found by the scan.

n bool currentDelta - Indicates whether the scan includes the current disk state.

This callback returns VMIOF_SUCCESS if the operation is allowed to proceed, else it returns an appropriate
error value.

The framework invokes diskExtentGetPost after vSphere completes a disk extent get operation on a
VMDK.

The prototype for this callback is :

VMIOF_Status
(*diskExtentGetPost)(VMIOF_DiskHandle *handle, VMIOF_DiskExtentGetInfo *info);

The parameters to this callback are:

n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to.

n const VMIOF_DiskExtentGetInfo *info - This input parameter is a pointer to a structure that describes
the disk extent. The structure is defined as follows:

typedef struct VMIOF_DiskExtentGetInfo {
   uint64_t startOffset;
   uint64_t extentOffset;
   uint64_t length;
   bool currentDelta;
} VMIOF_DiskExtentGetInfo;

n uint64_t startOffset - The start offset, in bytes, of the scan.

n uint64_t extentOffset - The offset, in bytes, of the extent found by the scan.

n uint64_t length - The length, in bytes, of the extent found by the scan.

n bool currentDelta - Indicates whether the scan includes the current disk state.

This callback returns VMIOF_SUCCESS if the operation is allowed to proceed, else it returns an appropriate
error value.

NOTE For a caching solution, the filter needs to change extentOffset and length, if at all, only in
diskExtentGetPost, not in diskExtentGetPre. For example, suppose a diskExtentGetPost callback receives
startOffset=0x1000, extentOffset=0x1400, length=0x300. If the filter is caching a block at 0x1200 with
length=0x100, it needs to update extentOffset to 0x1200 and length to 0x100. If it is caching a block at 0x1200 with
length=0x200, the cached block is contiguous with the found extent, so it needs to update extentOffset to 0x1200
and length to 0x500.
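
The adjustment described in this note can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the VAIODK: ExtentInfo is a stand-in that mirrors the fields of VMIOF_DiskExtentGetInfo, and adjust_extent_for_cache, cacheOffset, and cacheLength are hypothetical names. The sketch assumes the filter tracks a single cached write block that starts at or after startOffset and before the extent found by the scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the fields of VMIOF_DiskExtentGetInfo (illustration only). */
typedef struct ExtentInfo {
   uint64_t startOffset;
   uint64_t extentOffset;
   uint64_t length;
   bool currentDelta;
} ExtentInfo;

/*
 * Hypothetical helper: fold one cached write block into the extent found by
 * the scan. Returns true if the reported extent was changed.
 */
static bool
adjust_extent_for_cache(ExtentInfo *info, uint64_t cacheOffset, uint64_t cacheLength)
{
   assert(info != NULL);

   /* Cached write data only concerns the current delta disk. */
   if (!info->currentDelta || cacheLength == 0) {
      return false;
   }
   /* Assumption: only blocks between the scan start and the found extent matter here. */
   if (cacheOffset < info->startOffset || cacheOffset >= info->extentOffset) {
      return false;
   }

   uint64_t cacheEnd = cacheOffset + cacheLength;
   uint64_t extentEnd = info->extentOffset + info->length;

   if (cacheEnd < info->extentOffset) {
      /* Gap between the cached block and the extent: report the cached block alone. */
      info->length = cacheLength;
   } else {
      /* Cached block touches or overlaps the extent: report the merged region. */
      uint64_t mergedEnd = (cacheEnd > extentEnd) ? cacheEnd : extentEnd;
      info->length = mergedEnd - cacheOffset;
   }
   info->extentOffset = cacheOffset;
   return true;
}
```

With the values from the note, a cached block at 0x1200 of length 0x100 yields extentOffset 0x1200 and length 0x100, while a cached block at 0x1200 of length 0x200 yields extentOffset 0x1200 and length 0x500.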

Using the Extent Scanning Functions


The VAIO provides the following functions to scan extents:

VMIOF_DiskScanBegin() - Use this function to set up a scan of disk extents within a virtual disk chain. In
other words, this function initializes the state used in scanning a disk for used or changed blocks.

The function prototype looks as follows:

VMIOF_Status VMIOF_DiskScanBegin(VMIOF_DiskHandle *handle,
                                 VMIOF_HeapHandle *heap,
                                 VMIOF_DiskScan **pscan);

The parameters to this function are:

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n VMIOF_HeapHandle *heap — This input parameter refers to the heap used to allocate the scan state.

n VMIOF_DiskScan **pscan — This output parameter refers to the initialized scan state

The values this function returns are:

n VMIOF_SUCCESS — This indicates that the operation succeeded.

n VMIOF_NO_MEMORY — This return value indicates that heap has insufficient space to allocate the scan state.

VMIOF_DiskScanEnd() - Use this function to conclude a scan of disk extents within a virtual disk chain. In
other words, this function destroys the state used in scanning a disk for used or changed blocks.

The function prototype looks as follows:

void VMIOF_DiskScanEnd(VMIOF_DiskScan *scan);

The parameters to this function are:

n VMIOF_DiskScan *scan — This input parameter refers to the scan state to destroy

This function returns a void.

VMIOF_DiskExtentGetChanged() - Use this function to find the blocks that are private to the current (most
recent) snapshot of a virtual disk chain. In other words, this function gets a region that has changed since the
last snapshot. It scans the disk starting at the provided startOffset, and returns the offset and length of a
region of the disk that is private to the most current snapshot.

The function prototype looks as follows:

VMIOF_Status VMIOF_DiskExtentGetChanged(VMIOF_DiskScan *scan,
                                        uint64_t *startOffset,
                                        uint64_t *length);

The parameters to this function are:

n VMIOF_DiskScan *scan — This input parameter refers to the scan state returned from
VMIOF_DiskScanBegin()

n uint64_t *startOffset — As input parameter this refers to the byte offset where the search should
start. As output parameter it refers to the start of a private region or start of next search if length is 0.

n uint64_t *length — This output parameter refers to the length of the region. It can be 0.

The values this function returns are:


n VMIOF_SUCCESS — This indicates that the operation succeeded.

n VMIOF_OUT_OF_RANGE — This return value indicates that the start offset is beyond the disk capacity.

VMIOF_DiskExtentGetUsed() - Use this function to find the blocks of a virtual disk chain that exist in any of
the snapshot disks. That is, this function finds blocks that have changed since the first snapshot was taken
from the base disk of a virtual disk chain. It scans the disk starting at the provided startOffset, and returns
the offset and length of a region of the disk that contains valid data, that is, a disk region that has been
written to by the current or any prior snapshot. The offset and length are in bytes.

The function prototype looks as follows:

VMIOF_Status VMIOF_DiskExtentGetUsed(VMIOF_DiskScan *scan,
                                     uint64_t *startOffset,
                                     uint64_t *length);

The parameters to this function are:


n VMIOF_DiskScan *scan — This input parameter refers to the scan state returned from
VMIOF_DiskScanBegin()

n uint64_t *startOffset — As input parameter this refers to the byte offset where the search should
start. As output parameter it refers to the start of a valid region or start of next search if length is 0.

n uint64_t *length — This output parameter refers to the length of the region. It can be 0.

The values this function returns are:


n VMIOF_SUCCESS — This indicates that the operation succeeded.

n VMIOF_OUT_OF_RANGE — This return value indicates that the start offset is beyond the disk capacity.

An example of using these functions is present in the sampfilt VAIODK sample:

{
   ...
   /* list changed extents */
   status = VMIOF_DiskScanBegin(handle, heap, &scan);
   if (status != VMIOF_SUCCESS) {
      /* This excerpt only logs the failure; scan must not be used if this call fails. */
      VMIOF_Log(VMIOF_LOG_WARNING, "Failed to begin scan.\n");
   }
   offset = 0;
   length = 0;
   start = 0;
   total = 0;

   for (;;) {
      status = VMIOF_DiskExtentGetChanged(scan, &offset, &length);
      if (status != VMIOF_SUCCESS) {
         break;
      }
      /* A gap ends the current run of changed blocks: log it and start a new run. */
      if (offset != start + total) {
         if (total != 0) {
            VMIOF_Log(VMIOF_LOG_INFO, "Changed @ %lu length %lu\n",
                      (unsigned long)start, (unsigned long)total);
         }
         start = offset;
         total = 0;
      }
      total += length;
      offset += length;   /* the next search starts just past this region */
   }
   /* Log the final run, if any. */
   if (total != 0) {
      VMIOF_Log(VMIOF_LOG_INFO, "Changed @ %lu length %lu\n",
                (unsigned long)start, (unsigned long)total);
   }

   VMIOF_DiskScanEnd(scan);
   ...
}

Here is an example of input/output while using these functions.

Start: Offset = 0, Length = 0
1. Offset = 0x1000, Length = 0xFF. Now we set Offset = 0x10FF
2. Offset = 0x3000, Length = 0x400. Now we set Offset = 0x3400
3. Offset = 0x4000, Length = 0. This indicates the start of the next search, so Offset = 0x4000
4. Offset = 0x8000, Length = 0x1000. Now we set Offset = 0x9000
5. VMIOF_Status != VMIOF_SUCCESS so our search is now done.
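
The walk-through above can be replayed without an ESXi host by feeding the same results through a mock. In the sketch below, MockScan and mock_get_changed are hypothetical stand-ins for VMIOF_DiskScan and VMIOF_DiskExtentGetChanged, and the integer return codes are an assumption made for the illustration. The mock ignores the requested start offset and simply replays canned results; only the offset-advancing loop mirrors the sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One canned scan result: a region's offset and length (length may be 0). */
typedef struct Step {
   uint64_t offset;
   uint64_t length;
} Step;

/* Hypothetical stand-in for the VMIOF_DiskScan state. */
typedef struct MockScan {
   const Step *steps;
   size_t count;
   size_t next;
} MockScan;

/* Stand-in for VMIOF_DiskExtentGetChanged: returns 0 on success, -1 when the scan is done. */
static int
mock_get_changed(MockScan *scan, uint64_t *startOffset, uint64_t *length)
{
   assert(scan != NULL && startOffset != NULL && length != NULL);
   if (scan->next >= scan->count) {
      return -1;
   }
   *startOffset = scan->steps[scan->next].offset;
   *length = scan->steps[scan->next].length;
   scan->next++;
   return 0;
}

/* Walk the scan as the sample does and return the total number of changed bytes. */
static uint64_t
total_changed_bytes(MockScan *scan)
{
   uint64_t offset = 0;
   uint64_t length = 0;
   uint64_t total = 0;

   while (mock_get_changed(scan, &offset, &length) == 0) {
      total += length;    /* a zero-length result adds nothing */
      offset += length;   /* the next search starts just past this region */
   }
   return total;
}
```

Replaying steps 1 through 4 gives a total of 0xFF + 0x400 + 0 + 0x1000 = 0x14FF changed bytes.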

NOTE If you apply VMIOF_DiskExtentGetUsed or VMIOF_DiskExtentGetChanged against the base VMDK of a
thin-provisioned VMDK, you will get the valid regions that have been provisioned. If you apply them
against the base VMDK of a lazy-zeroed thick-provisioned VMDK, you will get the valid regions that have
been accessed.

Understanding and Processing Other IO Filter Events


The following sub-topics discuss how to flesh the remaining IO Filter events.

Understanding and Processing diskStun and diskUnstun Events

Processing the diskStun callback


The IO Filter framework invokes the diskStun() callback function in response to the hypervisor stunning the
VM or at certain other times. In either case, receipt of this callback indicates to the Filter that it must cease IO
operations to both the VMDK and said VMDK's sidecars.

Generally, invocations of diskStun callbacks are paired with invocations of diskUnstun callbacks. However,
there are exceptions to this rule, discussed later in this topic. Further, diskStun / diskUnstun callbacks may
be nested. That is, it is possible to get a diskStun, then another diskStun before receiving a diskUnstun. A
filter must keep a stun level in its instance data (not its sidecar, since sidecars cannot be modified while the
disk is stunned). The level starts at one during processing of a diskOpen event, and is logically reset to zero
on receipt of a diskClose event (because instance data is destroyed at diskClose). The filter must increment
the stun level in its diskStun callback and decrement it in its diskUnstun callback. That said, if the filter
receives a diskUnstun callback while the stun level is zero, it must simply return from the callback,
essentially ignoring it. This can happen when a VM is resumed after a suspend or migration.
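
The stun-level bookkeeping described above can be sketched as follows. This is a minimal illustration, not the sampfilt implementation: FilterInstance and the instance_on_* helpers are hypothetical names, and a real filter would call them from its diskOpen, diskStun, and diskUnstun callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-disk instance data; only the stun level is shown. */
typedef struct FilterInstance {
   unsigned stunLevel;
} FilterInstance;

/* diskOpen: the stun level starts at one. */
static void
instance_on_open(FilterInstance *inst)
{
   assert(inst != NULL);
   inst->stunLevel = 1;
}

/* diskStun: returns true on the 0 -> 1 transition, when pending IOs must be completed. */
static bool
instance_on_stun(FilterInstance *inst)
{
   assert(inst != NULL);
   return inst->stunLevel++ == 0;
}

/* diskUnstun: returns true on the 1 -> 0 transition, when IO may resume. */
static bool
instance_on_unstun(FilterInstance *inst)
{
   assert(inst != NULL);
   if (inst->stunLevel == 0) {
      return false;   /* unpaired unstun, e.g. after a resume or migration: ignore it */
   }
   return --inst->stunLevel == 0;
}
```

Keeping the counter in instance data means it disappears with the instance at diskClose, which matches the logical reset to zero described above.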

NOTE As an exception, after the diskOpen() callback but before the diskUnstun() callback, access to sidecar files is
allowed, but access to the disk is not.

To be clear, a filter is permitted to do IO to its VMDK and sidecars during the processing of the diskStun
callback taking the stun level from zero to one (0 -> 1), but not after returning from said callback. A filter
may resume IO to its VMDK and sidecars as soon as it enters the diskUnstun event that takes the stun level from one
back to zero (1 -> 0).

Thus, upon receipt of a stun event taking the stun level from zero to one, a filter must complete all pending
IOs that it owns before returning from the callback. The framework will not issue any IOs to the filter until
it returns from the diskUnstun event that returns the stun level to zero.

NOTE A disk notification for another disk may require stunning the VM while a disk notification is already
in progress for the current disk. In this case, the filter should halt the operations it is
performing for the earlier disk notification and resume them once it has received the
diskUnstun notification for the current disk. During this period, the filter is permitted to post progress updates
that report the same progress value several times. For example, suppose the disk is performing a long-running
diskDetach operation and there is an in-guest VM reboot. The reboot triggers a stun notification,
so the filter should suspend the detach operation and keep posting the same progress value until it
receives the unstun.
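
One way to picture the behavior in this note is a long-running operation that advances in chunks and pauses while the disk is stunned. The sketch below is an illustration under stated assumptions: LongOp and long_op_tick are hypothetical names, and the value returned is what the filter would pass to the progress function supplied by the framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a long-running operation such as diskDetach. */
typedef struct LongOp {
   uint32_t chunksDone;
   uint32_t chunksTotal;
   bool stunned;          /* set in diskStun, cleared in diskUnstun */
} LongOp;

/*
 * Advance the operation by one chunk unless the disk is stunned, and return
 * the percentage to post. While stunned, the same value is returned each time.
 */
static uint32_t
long_op_tick(LongOp *op)
{
   assert(op != NULL && op->chunksTotal > 0);
   if (!op->stunned && op->chunksDone < op->chunksTotal) {
      op->chunksDone++;
   }
   return (uint32_t)(((uint64_t)op->chunksDone * 100) / op->chunksTotal);
}
```

The operation makes no progress between the stun and the unstun, yet it can still report a value on every tick.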

The prototype of this callback is:

VMIOF_Status (*diskStun)(VMIOF_DiskHandle *handle, const VMIOF_DiskStunInfo *info);

The parameters to this callback are:

n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to.

n const VMIOF_DiskStunInfo *info - This input parameter is a structure which provides a flag and a
progress function callback that can be updated by the filter to indicate the progress of the stun activity.

typedef struct VMIOF_DiskStunInfo {
   /** Flags that describe the stun operation. */
   VMIOF_DiskStunFlags stunFlags;
   /**
    * Function to post progress updates to the framework.
    *
    * The filter can post progress updates to the framework as part of
    * its processing of the notification.
    */
   VMIOF_DiskOpProgressFunc progressFunc;
} VMIOF_DiskStunInfo;

n At this time, stunFlags can only have the value 0x1. The enumerated type is defined as
follows:

typedef enum VMIOF_DiskStunFlags {
   VMIOF_DISK_STUN_FLUSH_DIRTY_DATA = 0x1,
} VMIOF_DiskStunFlags;

The VMIOF_DISK_STUN_FLUSH_DIRTY_DATA flag is not used in the current release. It is triggered only
by the command "vmkfstools --flushvirtualdisk" for testing purposes. We are considering
removing it in the next release.

This callback must return VMIOF_SUCCESS.

Processing the diskUnstun callback


The diskUnstun() callback function is invoked by the IO filter framework to unstun all operations of the
filter. The filter is allowed to resume its regular operations after control returns from this function.

The prototype of this callback is:

void (*diskUnstun)(VMIOF_DiskHandle *handle);

The only parameter to this function is the disk handle.

This callback function returns void.

The code should check to see if the stun level is zero, in which case it should ignore the call, unless a
Migration is in progress. For processing in this latter case, see “Understanding and Processing an
xMigration (diskVmMigration) Event,” on page 216.

NOTE This callback is optional.

Understanding and Processing a DiskClone Event


Remember from section “Understanding and Defining the diskClone Callback,” on page 106 that the IO
Filter Framework invokes a filter's diskClone() callback after the completion of the Clone operation.

The prototype for diskClone() callback is as follows :

VMIOF_Status (*diskClone)(VMIOF_DiskHandle *handle, const VMIOF_DiskCloneInfo *info);

Parameters :

n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskCloneInfo *info - This is a pointer to a structure of type VMIOF_DiskCloneInfo that
provides details about the clone operation.

typedef struct VMIOF_DiskCloneInfo {
   /*
    * \brief Function to post progress updates to the framework.
    * The filter can post progress updates to the framework as part of
    * its processing of the notification.
    */
   VMIOF_DiskOpProgressFunc progressFunc;
} VMIOF_DiskCloneInfo;

Return Value :

n VMIOF_SUCCESS - This return value indicates that the operation succeeded.

n Any other value indicates an error.

NOTE A Linked Disk Clone is implemented through the Disk Snapshot mechanism. The LI should only
expect a diskSnapshot callback, and not a diskClone callback.

Usually, you need to track the status of the VMDK (for example, whether it is dirty) and the position of the
cache in the sidecar file. As part of the Clone, both the VMDK and the sidecar are copied. After that, you get
a diskClone callback on the newly cloned VMDK. From the sidecar file, you know the current status
and the cache location, which you can then use to talk to the original cache. If the disk is dirty, you can
transfer the data from the original cache and flush it later. Because the disk is in a stunned state when the
diskClone callback is delivered, you cannot flush it inside diskClone.

sMigration while the VM is powered off (aka VMDK cold migration, or a VMDK relocation) is implemented
using Disk Clone, so you get only a diskClone callback, not a diskVmMigration callback. One way to
distinguish the two cases is to use the disk UUID: vSphere assigns a new UUID to copied VMDKs, while
the UUID stays the same for sMigration. Another way to deal with the two cases is to keep a
reference count on the cache that records how many VMDKs are using it. When you talk to the original cache on
behalf of the cloned disk, increase the reference count. Then, if it is a real clone, both the original VMDK and
the newly cloned VMDK use the same cache, and you can separate them at a later time of your choice. If
it is a VMDK cold migration, the original VMDK is opened and deleted. Because the reference
count is more than 1 at that point, you do not delete the cache as a result. Then, since only the new VMDK
uses the cache, you can continue to use it or create a new one later.
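
The reference-counting approach can be sketched as follows. This is a minimal illustration: SharedCache, cache_attach, and cache_release are hypothetical names that are not part of the VAIO API, and a real filter would add locking and persist whatever state it needs in its sidecar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache shared by one or more VMDKs. */
typedef struct SharedCache {
   unsigned refCount;   /* number of VMDKs currently using this cache */
   bool destroyed;
} SharedCache;

/* Called when a VMDK (original or clone) starts using the cache. */
static void
cache_attach(SharedCache *cache)
{
   assert(cache != NULL && !cache->destroyed);
   cache->refCount++;
}

/*
 * Called when a VMDK stops using the cache, for example when it is deleted.
 * Returns true if this was the last user and the cache was destroyed.
 */
static bool
cache_release(SharedCache *cache)
{
   assert(cache != NULL && cache->refCount > 0);
   if (--cache->refCount == 0) {
      cache->destroyed = true;
      return true;
   }
   return false;
}
```

In the cold migration case, the diskClone callback on the new VMDK attaches first, so the later deletion of the source VMDK releases one reference and leaves the cache in place for the new VMDK.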

Following is a code snippet describing the activities associated with diskClone() callback :

1. VMIOF_Status SampleFilterDiskClone(VMIOF_DiskHandle *handle, const VMIOF_DiskCloneInfo *info)
2. {
3.    VMIOF_Status status;
4.    uint32_t i;
5.    VMIOF_Log(VMIOF_LOG_ERROR, "In the callback %s\n", __func__);
6.    VMIOF_Log(VMIOF_LOG_ERROR, "The progressFunc is %p\n", (void*)info->progressFunc);
7.    /* Report progress in 25% increments */
8.    if (info->progressFunc) {
9.       for (i = 25; i <= 100; i += 25) {
10.         status = info->progressFunc(handle, i);
11.         VMIOF_Log(VMIOF_LOG_ERROR, "The progress is %u percent\n", i);
12.         usleep(1000);
13.      }
14.   }
15.   return VMIOF_SUCCESS;
16. }

The key sections of code are as follows:


n Lines 8 to 14: If the framework supplied a progressFunc, you call it (line 10) to indicate the level of completion.

NOTE You cannot return VMIOF_ASYNC in this function, so all tasks must be completed synchronously.

To trigger the callback, invoke vmkfstools as follows:

vmkfstools --verbose 3 --clonevirtualdisk test.vmdk disk2.vmdk

The following is a sample log file showing a diskClone event :

** The framework opens the source disk in Read Only Mode.


In the callback SampleFilterDiskOpen
(SampleFilterDiskOpen): handle=1F05F840, flags=2(Read Only). File Chains=1. Files in chain: <--
Note: opened in READ ONLY mode
Name[0]='/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/test.vmdk'
clock_gettime: 1440486889 sec, 382127414 nsec
FiltLib: sampfilt: diskOpen successful.
DISKLIB-LIB : Opened "test.vmdk" (flags 0xe, type vmfs).
Destination disk format: VMFS zeroedthick
Cloning disk 'test.vmdk'...
DISKLIB-LIB_CREATE : CREATE: "disk2.vmdk" -- vmfs capacity=0 (0 bytes) adapter=lsilogic
info=cowGran=0 allocType=3 objType=file policy=''
In the callback SampleFilterDiskClose
In the callback SampleFilterDiskRequirements
Changed @ 0 length 10485760
FiltLib: heap sampfilt statistics: numGrowthOps 0 mem bytes 0 numShrinkOps 0 mem bytes 0
numSuccAllocs 1 numFailedAllocs 0.
FiltLib: sampfilt: diskClose successful.

** The framework creates the new disk, and attaches the filter
FiltLib: DisableDiskIO upcallThread 1000017310 initalized 1
FiltLib: DetachFromFiltMod 1F05F5E0
DISKLIB-LIB_CLONE : Failed to clone disk using Object Cloning
DISKLIB-LIB_CREATE : CREATE: "disk2.vmdk" -- vmfs capacity=20480 (10 MB) adapter=lsilogic
info=cowGran=1 allocType=3 objType=file policy=''
DISKLIB-LIB_CREATE : CREATE: Creating disk backed by 'default'
Clone: 10% done.DISKLIB-DSCPTR: "disk2.vmdk" : creation successful.
DISKLIB-VMFS : "/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2-flat.vmdk" : open

successful (17) size = 10485760, hd = 0. Type 3


FiltLib: There are no io filters for this disk.
DISKLIB-VMFS : "/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2-flat.vmdk" :
closed.
DISKLIB-VMFS : "/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2-flat.vmdk" : open
successful (131608) size = 10485760, hd = 119097.
Type 3
DISKLIB-DSCPTR: Opened [0]: "disk2-flat.vmdk" (0x20218)
DISKLIB-LINK : Opened '/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2.vmdk'
(0x20218): vmfs, 20480 sectors / 10 MB.
FiltLib: There are no io filters for this disk.
DISKLIB-LIB : Opened "disk2.vmdk" (flags 0x20218, type vmfs).
DDB: "longContentID" = "9fb637dc8ba7ad587b0522d9a49a5b1a" (was
"4a8968c515b046a014c31640fffffffe")
VAAI-NAS [vmfsNasPlugin : /vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136] : CLONE
[/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2-flat.vmdk] failed.
DISKLIB-LINK : DiskLinkNativeVmfsCloneExisting: Failed to create native clone. (24).
DISKLIB-LIB_CLONE : Failed to create native clone on destination handle : The specified
feature is not supported by this version (24).
Clone: 100% done.PluginLdr_Load: Loaded plugin libvmiof-disk-sampfilt.so
from /usr/lib/vmware/plugin/libvmiof-disk-sampfilt.so
VTHREAD start thread 8 "Upcall-1d139" pid 1000017311
FiltLib: Context 1F066200: initialized the upcall thread.

** New disk is opened in R/W mode, and the diskClone callback is invoked.
In the callback SampleFilterDiskOpen
(SampleFilterDiskOpen): handle=1F0663B0, flags=0(). File Chains=1. Files in chain:
Name[0]='/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2.vmdk'
clock_gettime: 1440486892 sec, 940818835 nsec
FiltLib: sampfilt: diskOpen successful.
In the callback SampleFilterDiskClone
The progress is 25 percent
The progress is 50 percent
The progress is 75 percent
The progress is 100 percent
FiltLib: sampfilt: diskClone successful.
In the callback SampleFilterDiskClose
In the callback SampleFilterDiskRequirements
Changed @ 0 length 10485760
FiltLib: heap sampfilt statistics: numGrowthOps 0 mem bytes 0 numShrinkOps 0 mem bytes 0
numSuccAllocs 1 numFailedAllocs 0.
FiltLib: sampfilt: diskClose successful.
FiltLib: DisableDiskIO upcallThread 1000017311 initalized 1
FiltLib: DetachFromFiltMod 1F066200
FiltLib: DisableDiskIO upcallThread 4294967295 initalized 0
DISKLIB-VMFS : "/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/disk2-flat.vmdk" :
closed.
PluginLdr_Load: Loaded plugin libvmiof-disk-sampfilt.so from /usr/lib/vmware/plugin/libvmiof-
disk-sampfilt.so
VTHREAD start thread 9 "Upcall-11136" pid 1000017312
FiltLib: Context 1F06B460: initialized the upcall thread.

** The framework reopens the original disk in Read Only mode.


In the callback SampleFilterDiskOpen
(SampleFilterDiskOpen): handle=1F05F5E0, flags=2(Read Only). File Chains=1. Files in chain:

Name[0]='/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/test.vmdk'
clock_gettime: 1440486893 sec, 569226610 nsec
FiltLib: sampfilt: diskOpen successful.
In the callback SampleFilterDiskClose
In the callback SampleFilterDiskRequirements
Changed @ 0 length 10485760
FiltLib: heap sampfilt statistics: numGrowthOps 0 mem bytes 0 numShrinkOps 0 mem bytes 0
numSuccAllocs 1 numFailedAllocs 0.
FiltLib: sampfilt: diskClose successful.
FiltLib: DisableDiskIO upcallThread 1000017312 initalized 1
FiltLib: DetachFromFiltMod 1F06B460
DISKLIB-VMFS : "/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/test-flat.vmdk" : closed.

BACKGROUND The framework always opens source sidecars in READONLY mode during a clone
operation. This is due to an issue where the sidecars were opened in write exclusive mode, as the disk was
opened with only the OPEN_NOIO flag. During a parallel linked clone, the same sidecar could be
opened concurrently, so the solution is to always open the sidecar in READONLY mode.

Cloning A Disk Of a Running VM


Besides cloning a powered-off VM, vSphere also supports cloning a running VM. In order to clone a VMDK
from a consistent state, vSphere creates a snapshot of the VMDK, and then clones from the snapshot. The
following callback sequence shows an example of cloning the VMDK cent55.vmdk of a running VM.

First, vSphere creates a snapshot, so it freezes the original disk cent55.vmdk, and creates a delta disk
cent55-000001.vmdk. The VM keeps running, and all new data goes to the delta disk. This is reflected in
vmware.log.

2015-12-22T18:56:58.626Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO


2015-12-22T18:56:58.626Z| Upcall-16c6a7a| I120: >> handle is 32AB3740
2015-12-22T18:56:58.630Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:56:58.630Z| Upcall-16c6a7a| I120: >> handle is 32AB3740
2015-12-22T18:58:58.867Z| vmx| I120: In the callback SampleFilterDiskSnapshot
2015-12-22T18:58:58.867Z| vmx| I120: >> handle is 32AB3740
2015-12-22T18:58:58.867Z| vmx| I120: >> Snapshot phase is 1
2015-12-22T18:59:00.013Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:59:00.013Z| Upcall-16c6a7a| I120: >> handle is 32AB3740
2015-12-22T18:59:00.013Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:59:00.013Z| Upcall-16c6a7a| I120: >> handle is 32AB3740
2015-12-22T18:59:00.065Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStun
2015-12-22T18:59:00.065Z| Upcall-16c6a7a| I120: >> handle is 32AB3740
2015-12-22T18:59:00.093Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:00.093Z| vcpu-0| I120: >> handle is 32AB3740
2015-12-22T18:59:01.130Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:01.130Z| vcpu-0| I120: >> handle is 32C452F0
2015-12-22T18:59:01.130Z| vcpu-0| I120: >> diskFlags is 1
2015-12-22T18:59:01.130Z| vcpu-0| I120: >> linksInChain is 1
2015-12-22T18:59:01.130Z| vcpu-0| I120: >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55-000001.vmdk
2015-12-22T18:59:01.130Z| vcpu-0| I120: In the callback SampleFilterDiskSnapshot
2015-12-22T18:59:01.130Z| vcpu-0| I120: >> handle is 32C452F0
2015-12-22T18:59:01.130Z| vcpu-0| I120: >> Snapshot phase is 2
2015-12-22T18:59:01.218Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:01.218Z| vcpu-0| I120: >> handle is 32C452F0
2015-12-22T18:59:01.777Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:01.777Z| vcpu-0| I120: >> handle is 327E3DC0

2015-12-22T18:59:01.777Z| vcpu-0| I120: >> diskFlags is 0


2015-12-22T18:59:01.777Z| vcpu-0| I120: >> linksInChain is 2
2015-12-22T18:59:01.777Z| vcpu-0| I120: >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55-000001.vmdk
2015-12-22T18:59:01.777Z| vcpu-0| I120: >> the 2(th) file in filesInChain
is /vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55.vmdk
2015-12-22T18:59:01.779Z| Upcall-e06a7e| I120: In the callback SampleFilterDiskUnstun
2015-12-22T18:59:01.779Z| Upcall-e06a7e| I120: >> handle is 327E3DC0

Then, vSphere clones the base disk cent55.vmdk and creates a new VMDK called cloned.vmdk. As part of
the process, the IOFilter framework invokes the diskClone callback of the filter. This is reflected in vpxa.log.

2015-12-22T18:59:35.214Z info vpxa[FFF99B70] In the callback SampleFilterDiskOpen
2015-12-22T18:59:35.214Z info vpxa[FFF99B70] >> handle is 1F37C168
2015-12-22T18:59:35.215Z info vpxa[FFF99B70] >> diskFlags is 0
2015-12-22T18:59:35.215Z info vpxa[FFF99B70] >> linksInChain is 1
2015-12-22T18:59:35.215Z info vpxa[FFF99B70] >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cloned/cloned.vmdk
2015-12-22T18:59:35.216Z info vpxa[FFF99B70] In the callback SampleFilterDiskClone
2015-12-22T18:59:35.216Z info vpxa[FFF99B70] >> handle is 1F37C168
2015-12-22T18:59:35.303Z info vpxa[FFF99B70] In the callback SampleFilterDiskClose
2015-12-22T18:59:35.303Z info vpxa[FFF99B70] >> handle is 1F37C168
2015-12-22T18:59:35.487Z info vpxa[FFF99B70] In the callback SampleFilterDiskOpen
2015-12-22T18:59:35.488Z info vpxa[FFF99B70] >> handle is 1F64B290
2015-12-22T18:59:35.488Z info vpxa[FFF99B70] >> diskFlags is 2
2015-12-22T18:59:35.488Z info vpxa[FFF99B70] >> linksInChain is 1
2015-12-22T18:59:35.488Z info vpxa[FFF99B70] >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55.vmdk
2015-12-22T18:59:35.488Z info vpxa[FFF99B70] In the callback SampleFilterDiskClose
2015-12-22T18:59:35.488Z info vpxa[FFF99B70] >> handle is 1F64B290

Finally, vSphere deletes the snapshot created earlier: it consolidates the disks cent55.vmdk and
cent55-000001.vmdk, and deletes the delta disk cent55-000001.vmdk. From then on, all new data goes to the
base disk cent55.vmdk. This is reflected in vmware.log.

2015-12-22T18:59:37.793Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:59:37.793Z| Upcall-16c6a7a| I120: >> handle is 327E3DC0
2015-12-22T18:59:37.847Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:59:37.847Z| Upcall-16c6a7a| I120: >> handle is 327E3DC0
2015-12-22T18:59:38.896Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStun
2015-12-22T18:59:38.896Z| Upcall-16c6a7a| I120: >> handle is 327E3DC0
2015-12-22T18:59:38.907Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:38.907Z| vcpu-0| I120: >> handle is 327E3DC0
2015-12-22T18:59:39.045Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:39.045Z| vcpu-0| I120: >> handle is 327E2040
2015-12-22T18:59:39.045Z| vcpu-0| I120: >> diskFlags is 0
2015-12-22T18:59:39.045Z| vcpu-0| I120: >> linksInChain is 1
2015-12-22T18:59:39.045Z| vcpu-0| I120: >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55-000001.vmdk
2015-12-22T18:59:39.059Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:39.059Z| vcpu-0| I120: >> handle is 3278A0D0
2015-12-22T18:59:39.059Z| vcpu-0| I120: >> diskFlags is 0
2015-12-22T18:59:39.059Z| vcpu-0| I120: >> linksInChain is 1
2015-12-22T18:59:39.059Z| vcpu-0| I120: >> the 1(th) file in filesInChain
is /vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55.vmdk
2015-12-22T18:59:39.060Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:39.060Z| vcpu-0| I120: >> handle is 327E2040
2015-12-22T18:59:39.141Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:39.141Z| vcpu-0| I120: >> handle is 3278A0D0
2015-12-22T18:59:39.382Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:39.382Z| vcpu-0| I120: >> handle is 32C43910
2015-12-22T18:59:39.382Z| vcpu-0| I120: >> diskFlags is 0
2015-12-22T18:59:39.382Z| vcpu-0| I120: >> linksInChain is 2
2015-12-22T18:59:39.382Z| vcpu-0| I120: >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55-000001.vmdk
2015-12-22T18:59:39.382Z| vcpu-0| I120: >> the 2(th) file in filesInChain
is /vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55.vmdk
2015-12-22T18:59:39.389Z| Upcall-fa8aad| I120: In the callback SampleFilterDiskUnstun
2015-12-22T18:59:39.389Z| Upcall-fa8aad| I120: >> handle is 32C43910
2015-12-22T18:59:39.536Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStun
2015-12-22T18:59:39.536Z| Upcall-16c6a7a| I120: >> handle is 32C43910
2015-12-22T18:59:40.129Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:40.129Z| vcpu-0| I120: >> handle is 32C43910
2015-12-22T18:59:40.511Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:40.511Z| vcpu-0| I120: >> handle is 32C46AA0
2015-12-22T18:59:40.511Z| vcpu-0| I120: >> diskFlags is 0
2015-12-22T18:59:40.512Z| vcpu-0| I120: >> linksInChain is 2
2015-12-22T18:59:40.512Z| vcpu-0| I120: >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55-000001.vmdk
2015-12-22T18:59:40.512Z| vcpu-0| I120: >> the 2(th) file in filesInChain
is /vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55.vmdk
2015-12-22T18:59:40.512Z| vcpu-0| I120: In the callback SampleFilterDiskCollapse
2015-12-22T18:59:40.512Z| vcpu-0| I120: >> handle is 32C46AA0
2015-12-22T18:59:40.512Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:40.512Z| vcpu-0| I120: >> handle is 32C46AA0
2015-12-22T18:59:40.879Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:40.879Z| vcpu-0| I120: >> handle is 32C0D120
2015-12-22T18:59:40.879Z| vcpu-0| I120: >> diskFlags is 1
2015-12-22T18:59:40.879Z| vcpu-0| I120: >> linksInChain is 1
2015-12-22T18:59:40.879Z| vcpu-0| I120: >> the 1(th) file in filesInChain is
/vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55-000001.vmdk
2015-12-22T18:59:40.879Z| vcpu-0| I120: In the callback SampleFilterDiskDetach
2015-12-22T18:59:40.879Z| vcpu-0| I120: In the callback SampleFilterDiskClose
2015-12-22T18:59:40.879Z| vcpu-0| I120: >> handle is 32C0D120
2015-12-22T18:59:41.243Z| vcpu-0| I120: In the callback SampleFilterDiskOpen
2015-12-22T18:59:41.243Z| vcpu-0| I120: >> handle is 32C0D120
2015-12-22T18:59:41.243Z| vcpu-0| I120: >> diskFlags is 0
2015-12-22T18:59:41.243Z| vcpu-0| I120: >> linksInChain is 1
2015-12-22T18:59:41.243Z| vcpu-0| I120: >> the 1(th) file in filesInChain
is /vmfs/volumes/566608d3-add9468f-0912-0050569625f1/cent55/cent55.vmdk
2015-12-22T18:59:41.245Z| Upcall-fa8aad| I120: In the callback SampleFilterDiskUnstun
2015-12-22T18:59:41.245Z| Upcall-fa8aad| I120: >> handle is 32C0D120
2015-12-22T18:59:41.744Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:59:41.744Z| Upcall-16c6a7a| I120: >> handle is 32C0D120
2015-12-22T18:59:41.744Z| Upcall-16c6a7a| I120: In the callback SampleFilterDiskStartIO
2015-12-22T18:59:41.744Z| Upcall-16c6a7a| I120: >> handle is 32C0D120
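
When reading traces such as the ones above, it can help to reduce each log line to the name of the callback it reports, so that the callback sequence is easier to follow. The following C sketch does this for lines that contain the "In the callback" marker printed by the sample filter. The function name extract_callback_name and the buffer handling are illustrative assumptions and are not part of any VMware API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Marker text printed by the sample filter in the traces above. */
static const char kMarker[] = "In the callback ";

/*
 * Copies the callback name that follows the marker into out (NUL-terminated).
 * Returns 1 if the line contains the marker and the name fits, 0 otherwise.
 * Hypothetical helper for reading logs; not a VMware function.
 */
int extract_callback_name(const char *line, char *out, size_t outSize)
{
    assert(line != NULL && out != NULL && outSize > 0);

    const char *p = strstr(line, kMarker);
    if (p == NULL) {
        return 0;                        /* not a callback-entry line */
    }
    p += sizeof kMarker - 1;             /* skip past the marker text */

    size_t len = strcspn(p, " \r\n");    /* name ends at whitespace or end of line */
    if (len == 0 || len >= outSize) {
        return 0;                        /* empty name or buffer too small */
    }
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}
```

Applied line by line to the snapshot-deletion trace, a helper like this would yield the callback names in the order they were logged, while the ">> handle is" detail lines are skipped.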


Understanding and Processing an xMigration (diskVmMigration) Event


The IO Filter framework invokes the diskVmMigration callback function when a live migration has been
requested or when a live migration has failed. The prototype of this callback is:

VMIOF_Status (*diskVmMigration)(VMIOF_DiskHandle *handle, const VMIOF_DiskVmMigrationInfo *info);

The parameters to this callback are:

n VMIOF_DiskHandle *handle - An opaque handle to the disk

n const VMIOF_DiskVmMigrationInfo *info - A pointer to a structure of type
VMIOF_DiskVmMigrationInfo, which is defined as:

typedef struct VMIOF_DiskVmMigrationInfo {
   VMIOF_DiskMigrationPhase phase;
   VMIOF_DiskVmMigrationType type;
   VMIOF_DiskVmMigrationIpSpec ipSpec;
   VMIOF_DiskOpProgressFunc progressFunc;
   VMIOF_DiskOpCompletionFunc completionFunc;
} VMIOF_DiskVmMigrationInfo;

The members of this structure are:

n VMIOF_DiskMigrationPhase phase - The phase in which the diskVmMigration callback is invoked.
Possible values are:

   n VMIOF_MIGRATION_PREPARE - The migration has been requested but not started

   n VMIOF_MIGRATION_FAILED - The migration failed

NOTE There is no "migration in progress" or "migration completed" phase. A filter knows that a migration
has completed when it gets a diskUnstun callback after a migration callback with a prepare phase.

n VMIOF_DiskVmMigrationType type - The type of migration for which the callback is invoked. Possible
values are:

   n VMIOF_MIGRATION_INVALID - The IO Filter Framework sets this value to indicate a failed migration.

   n VMIOF_MIGRATION_VMOTION - This value indicates that a migration has been performed on a
   VM. The datastore may also change as part of the migration.

   n VMIOF_MIGRATION_SVMOTION - This value indicates that only the virtual disks belonging to the
   VM have been migrated to a destination datastore on the same host.

n VMIOF_DiskVmMigrationIpSpec ipSpec - The IP address of the target host for VM migration. For
migration failure notification, the IO Filter Framework sets this to
VMIOF_MIGRATION_IP_ADDR_INVALID.

n VMIOF_DiskOpProgressFunc progressFunc - If not NULL, the function to call, at least every 10
seconds, to post progress of migration preparation

n VMIOF_DiskOpCompletionFunc completionFunc - If not NULL and the callback previously returned
VMIOF_ASYNC, call this function to notify the IO Filter Framework that the callback's work is completed

The possible return values for this callback function are:

n VMIOF_SUCCESS - The operation succeeded

n VMIOF_ASYNC - Processing of the migration event continues asynchronously. The framework
postpones the migration until the filter calls completionFunc.

n Any other value is taken as a failure, causing the IO Filter Framework to abort the migration.

NOTE This callback function is not allowed to block and must not perform any long-running activity.

NOTE If a filter returns VMIOF_ASYNC from a VMIOF_MIGRATION_PREPARE event, and a VMIOF_MIGRATION_FAILED
event occurs before the filter has called the completion routine, the filter still needs to call the
completion routine, but with a VMIOF_FAILED status.
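
The return-value rules above can be sketched as a small state machine. The C fragment below is a minimal illustration, assuming hypothetical stand-in types (SkStatus, SkPhase, SkCompletionFunc, SkDisk) in place of the real VMIOF definitions, whose exact signatures are not shown in this section. It models a callback that returns at once when there is nothing to flush, returns an asynchronous status when work remains, and still reports completion (with a failure status) if a failed-phase event arrives first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the framework types named in the text. */
typedef enum { SK_SUCCESS, SK_ASYNC, SK_FAILED } SkStatus;
typedef enum { SK_PHASE_PREPARE, SK_PHASE_FAILED } SkPhase;
typedef void (*SkCompletionFunc)(void *disk, SkStatus status);

/* Per-disk bookkeeping kept by the filter (illustrative only). */
typedef struct {
    int              asyncPending;    /* 1 while an async prepare is unfinished */
    SkCompletionFunc completion;      /* saved from the prepare invocation */
    size_t           dirtyBlocks;     /* work that must finish before migrating */
    int              completionCalls; /* demo bookkeeping: completions reported */
    SkStatus         lastReported;    /* demo bookkeeping: last reported status */
} SkDisk;

/* Demonstration-only completion hook: records what the filter reported. */
void sk_record_completion(void *disk, SkStatus status)
{
    SkDisk *d = disk;
    d->completionCalls++;
    d->lastReported = status;
}

/* Models the migration callback: it returns quickly and never blocks. */
SkStatus sk_disk_vm_migration(SkDisk *d, SkPhase phase, SkCompletionFunc completion)
{
    assert(d != NULL);
    if (phase == SK_PHASE_PREPARE) {
        if (d->dirtyBlocks == 0) {
            return SK_SUCCESS;        /* nothing to do, migration may proceed */
        }
        d->asyncPending = 1;          /* the flush continues elsewhere */
        d->completion = completion;
        return SK_ASYNC;
    }
    /* Failed phase: an unfinished async prepare must still be completed,
     * but with a failure status. */
    if (d->asyncPending && d->completion != NULL) {
        d->asyncPending = 0;
        d->completion(d, SK_FAILED);
    }
    return SK_SUCCESS;                /* assumption: nothing else to report */
}

/* Called by the filter's own worker when the background flush finishes. */
void sk_flush_done(SkDisk *d)
{
    assert(d != NULL);
    d->dirtyBlocks = 0;
    if (d->asyncPending && d->completion != NULL) {
        d->asyncPending = 0;
        d->completion(d, SK_SUCCESS);
    }
}
```

Clearing asyncPending before calling the completion hook keeps the hook from being invoked twice if the flush finishes after a failure has already been reported.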

The course VMware Fundamentals for Developers and the various VMware vSphere documentation topics
discuss that VMware vSphere products (ESXi and vCenter Server), when configured into a cluster, can
migrate VMs from one host to another, even while the VM continues to run. VMware calls this feature
vMotion. Further, since the introduction of vMotion, vSphere has gained the ability to migrate the virtual
disks of a VM, for example from one datastore to another, even while the VM is running and continues to
perform IOs to the disk. VMware calls this feature Storage vMotion. For clarity, this course uses the terms
xMigration and sMigration to refer to the two different types of migrations, respectively. There is an
additional scenario where vSphere performs an xMigration: when a VM's hardware is changed while the
VM is running, for example the hot-adding / removing of a vNIC, vHBA, vDISK, etc.

At a low level, all migrations are invoked by functions provided by various APIs. At a high level, except in
the hardware hot-add case, the migration functions are invoked in one of two ways:
n A human starts the migration via one of VMware's management UIs (vSphere Client, vSphere Web
Client, esxcli). For example, in the VWC, if you right-click on a VM that is running on a host in a cluster
and select "All vCenter Actions" > "Migrate", your browser displays a wizard dialog to migrate the VM
from host to host or its disks from one datastore to another.

n A management automation product starts a migration. For example, the DRS feature of vSphere may
initiate an xMigration on a VM to balance CPU and/or RAM load between hosts in a cluster.

All migration events are significant to vSphere IO Filter Solutions. For example:
n A caching solution must flush the dirty blocks in the SSD back to the disk before either type of
migration:
n For xMigration from host H1 to H2, the caching filter on H1 will have used a local SSD to cache the
data. When the VM gets to H2, the filter there will use its own local SSD to cache appropriate
VMDK data.

n For sMigration, the data must be flushed to the VMDK or the copy of said VMDK on the
destination datastore will contain stale (wrong) data.

n A replication solution, at a minimum, must perform the following for the different types of migration:
n For xMigration from host H1 to H2, the filter's Daemon on H1 should close its socket connection to
the Daemon on the replication host if it is not replicating any other VMDKs there. The filter's
Daemon on H2 must then open a connection to the replication host so that it can start sending
writes there.

n For sMigration, the filter's Daemon will almost certainly need to obtain the pathname of the VMDK
on the destination datastore and advise the filter's Daemon on the replication host of that change,
for housekeeping purposes if not others.

n An xMigration takes place in three main phases:

n Preparation - The Framework notifies appropriate Library Instances that a migration was initiated.
The Library Instance must prepare for a migration to occur and tell the Framework when it is ready
for the migration to happen. Alternatively, the Library Instance may inform the Framework that it
cannot allow a migration at this time, in which case the migration is terminated entirely. Only
when all conditions have been met for migration, including all IO Filters agreeing to proceed, does
the migration go to the next phase.

n The actual migration - During the migration, the Framework invokes several callbacks (including
diskClose(), diskOpen(), diskStun() and diskUnStun()).


n The migration completes or fails - On failure, the Framework notifies the Library Instance
explicitly. Currently, the Framework does not send a similar notification to the Library Instance for
success. Instead, the Library Instance can infer the success by noting the Framework's invocation of
its diskUnStun() callback after an invocation of its diskOpen() callback on H2.
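
Because success is only implied, a filter needs some per-VMDK record of whether a prepare notification is outstanding. The following C sketch shows one possible shape for that record. The type and function names (MigState, MigEvent, VmdkMigTracker, mig_tracker_on_event) are hypothetical, and the sketch keeps the state in memory only; how the state is made visible to the Library Instance on the destination host is outside its scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-VMDK migration state and the events that drive it. */
typedef enum { MIG_IDLE, MIG_PREPARED } MigState;
typedef enum { EV_PREPARE, EV_FAILED, EV_UNSTUN } MigEvent;

typedef struct {
    MigState state;
    unsigned completed;   /* migrations inferred to have succeeded */
    unsigned failed;      /* migrations explicitly reported as failed */
} VmdkMigTracker;

/* Returns 1 when the event lets the filter conclude a migration succeeded. */
int mig_tracker_on_event(VmdkMigTracker *t, MigEvent ev)
{
    assert(t != NULL);
    switch (ev) {
    case EV_PREPARE:
        t->state = MIG_PREPARED;      /* a migration has been requested */
        return 0;
    case EV_FAILED:
        if (t->state == MIG_PREPARED) {
            t->failed++;              /* explicit failure notification */
            t->state = MIG_IDLE;
        }
        return 0;                     /* failure without prepare: ignore */
    case EV_UNSTUN:
        if (t->state != MIG_PREPARED) {
            return 0;                 /* unstun unrelated to a migration */
        }
        t->state = MIG_IDLE;          /* unstun after prepare: success */
        t->completed++;
        return 1;
    }
    return 0;
}
```

An unstun that arrives while the tracker is idle, for example during the snapshot operations shown earlier, leaves the counters untouched.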

The following diagram illustrates the basic sequence of interactions between the IO Filter Framework and a
Solution's Library Instance during an xMigration (after the migration is initiated by DRS, SDRS, or user
interaction):

Figure 5‑3. xMigration Sequence

The details of each step are:

1 The Framework on H1 iterates through each Filter attached to the affected VMDKs, invoking its
diskVmMigration() callback. The Framework passes a handle to the VMDK affected by the migration and a
VMIOF_DiskVmMigrationInfo structure. The structure's phase member is set to VMIOF_MIGRATION_PREPARE to
indicate that this is the preparation phase of a migration, and its progressFunc member points to a
function of type VMIOF_DiskOpProgressFunc, generically referred to as progress.


2 The Library Instance must take whatever steps are necessary for the migration. For example, a caching
solution may notify the daemon to flush all dirty blocks for the VMDK.

NOTE Until the preparation is complete, the Library Instance must call progress at least once every 10
seconds. Each invocation of progress must pass the current amount of work done and the total amount
of work to do. For example, in a caching solution, suppose that at the start of the event the Library
Instance discovers there are 170 dirty blocks to be flushed, and saves that number in a well-known location.
Then, as the Filter Instance writes the blocks to the VMDK, it can update a counter of how many have been
written, also stored in a well-known location.

3 diskVmMigration() returns VMIOF_SUCCESS when it is ready for the migration to proceed, or some other
code, which prevents the migration from proceeding. In the latter case, the iteration is restarted,
invoking diskVmMigration() with the phase member set to VMIOF_MIGRATION_FAILED.

NOTE Given this, it is possible for the Framework to invoke a filter's diskVmMigration() callback with
the failed phase before it has invoked the callback with the prepare phase. In this case, the filter
should treat the invocation as a no-op. To detect this condition, filters should keep migration state for
each VMDK.

4 The Framework continues iterating through the other filters on the VMDK.

5 If all conditions are met for a migration, vSphere begins the second phase, the actual migration.
During this time, the VM continues to run, so it can continue to do IOs to its VMDKs, and
the Framework continues to invoke the Filter Library's callbacks (for example
diskStartIO()) as appropriate. Code in these callbacks should check the VMDK's migration state and act
accordingly. For example, a caching solution may keep its cache synced with the VMDK. vSphere
performs migrations by copying all of the RAM (xMigration) or disk (sMigration) to the destination,
keeping track of what changes during the copy. It then repeats this step for just the changes.
Eventually the change set converges to a relatively small number of pages or blocks, respectively.

6 When the migration change set converges, vSphere must stop the VM from running while it finishes
copying the remaining changes (typically a very short period of time). To do this, the ESXi kernel stuns
the VM. When it does this, the Framework invokes the Library Instance's diskStun() callback, passing
the VMDK handle and a VMIOF_DiskStunFlags parameter.

7 The LI returns VMIOF_SUCCESS if everything goes well.

8 vSphere must close the VMDK on H1 as that host will no longer access it. The Framework invokes the
LI's diskClose() callback.

9 As with all callbacks, diskClose() must return VMIOF_SUCCESS to indicate that the migration can
proceed. Any other return aborts the migration as discussed earlier in this sequence.

10 vSphere must open the VMDK on H2 as that host will now perform IOs to it. The Framework
invokes the LI's diskOpen() callback.

11 The LI must connect to the Daemon and inform it that the disk is open.

12 The filter must return VMIOF_SUCCESS on success of its diskOpen() callback. Any other result causes the
migration to abort. At this stage, this involves moving the VM back to H1, if it can. If the move back to
H1 fails, the VM is left paralyzed.

13 vSphere un-stuns the VM. The Framework invokes the Library Instance's diskUnStun() callback (on H2).
The Library Instance can match the diskUnStun() call with the migration prepare state, and conclude
that the migration has succeeded. In this case, the callback should perform any necessary cleanup from
the migration operation.

NOTE It is possible to receive diskUnStun() callbacks without a corresponding diskStun() event. The
diskUnStun() callback should treat such cases as no-ops.


14 diskUnStun() returns VMIOF_SUCCESS, or an appropriate error code.

NOTE It is possible for this callback to return something other than VMIOF_SUCCESS. The
results would be similar to having diskOpen() fail, as discussed in step 12.
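
The progress requirement from step 2 can be sketched as follows. This is a minimal illustration, assuming a hypothetical progress signature (ProgressFn) that takes the work done and the total work, since the real VMIOF_DiskOpProgressFunc prototype is not shown in this section. The 2-second safety margin and all type and function names are also assumptions. Timestamps are passed in by the caller so that the logic can be exercised without waiting.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define PROGRESS_INTERVAL_SEC 10   /* upper bound stated in the text */
#define REPORT_MARGIN_SEC      2   /* hypothetical safety margin */

/* Hypothetical progress signature: work done, total work, caller context. */
typedef void (*ProgressFn)(size_t done, size_t total, void *ctx);

typedef struct {
    size_t     total;        /* e.g. 170 dirty blocks found at prepare time */
    size_t     done;         /* blocks written to the VMDK so far */
    time_t     lastReport;   /* when progress was last posted */
    ProgressFn progress;
    void      *ctx;
} FlushProgress;

/* Demonstration-only progress hook: counts how often it was called. */
void count_reports(size_t done, size_t total, void *ctx)
{
    (void)done;
    (void)total;
    ++*(int *)ctx;
}

/* Record the total amount of work and post an initial report. */
void flush_progress_start(FlushProgress *fp, size_t total, time_t now,
                          ProgressFn fn, void *ctx)
{
    assert(fp != NULL && total > 0);
    fp->total = total;
    fp->done = 0;
    fp->lastReport = now;
    fp->progress = fn;
    fp->ctx = ctx;
    if (fn != NULL) {
        fn(0, total, ctx);
    }
}

/*
 * Record one flushed block. Posts progress when the interval is nearly
 * used up or when the work is finished. Returns 1 if a report was posted.
 */
int flush_progress_block_done(FlushProgress *fp, time_t now)
{
    assert(fp != NULL && fp->done < fp->total);
    fp->done++;

    int finished = (fp->done == fp->total);
    int due = difftime(now, fp->lastReport) >=
              PROGRESS_INTERVAL_SEC - REPORT_MARGIN_SEC;
    if (!finished && !due) {
        return 0;
    }
    fp->lastReport = now;
    if (fp->progress != NULL) {
        fp->progress(fp->done, fp->total, fp->ctx);
    }
    return 1;
}
```

This sketch only posts progress when a block completes. If a single write could take longer than the interval, a real filter would presumably also need a timer-driven report.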

The following is an example of the detailed log messages that appear in vmware.log for an xMigration.
These are the events received on the source host (where the xMigration was initiated):

2015-08-26T11:50:11.944Z| vmx| I120: Received migrate 'start' request for mig id
1440589812446319, dest world id 1000018452.
2015-08-26T11:50:11.944Z| vmx| I120: MigrateSetState: Transitioning from state 1 to 2.
2015-08-26T11:50:11.951Z| vmx| I120: MigratePlatformInitMigration: DiskOp file set to
/vmfs/volumes/55dcb5a8-1efb5567-f271-005056948bdc/VM1/VM1-
2015-08-26T11:50:11.967Z| Worker#0| I120: WORKER: Creating new group with numThreads=1 (4)
2015-08-26T11:50:11.968Z| vthread-8| I120: VTHREAD start thread 8 "vthread-8" pid 1000018794
2015-08-26T11:50:11.968Z| Worker#1| I120: MigratePrepareEventNotifyThread: Prepare event
start.
2015-08-26T11:50:11.968Z| Worker#1| I120: FiltLib_MigrationPrepare: received prepare for
migration notification
>>>>>>>> Migration notification received by the framework
2015-08-26T11:50:11.968Z| Worker#1| E105: SampleFilterfiltLI(SampleFilterfiltVmMigration):
handle=32AC5C10, info=32C0AFB8, phase=PREPARE
>>>>>>>> Migration callback invoked by the iofilter's library instance
2015-08-26T11:50:11.968Z| Worker#1| I120: MigratePrepareEventLog: Prepare event callback
complete.
2015-08-26T11:50:11.969Z| vmx| I120: MigratePrepareEventLog: Prepare multiwriter disk handoff
complete.
2015-08-26T11:50:11.972Z| vmx| I120: MigratePrepareEventLog: Prepare destination complete.
2015-08-26T11:50:13.025Z| vcpu-0| I120: FiltLib: FiltLib_DiskSync: Requesting FiltMod to abort
all outstanding IOs.
2015-08-26T11:50:13.025Z| vcpu-0| I120: VMMon_VSCSIStopVports: No such target on adapter
2015-08-26T11:50:13.026Z| Upcall-f0f6f| I120: FiltLib: Received stun notification
2015-08-26T11:50:13.026Z| Upcall-f0f6f| I120: FiltLib: FiltLib_DiskStun : Stunning all IO
filters.
>>>>>>>>> Framework receiving the stun-notification
2015-08-26T11:50:13.026Z| Upcall-f0f6f| E105: SampleFilterfiltLI(SampleFilterfiltDiskStun):
handle=32AC5C10, flags=0,
progressFunc=0 entry stunLevel=0
>>>>>>>>> Appropriate callback invoked for diskStun by the iofilter library instance
2015-08-26T11:50:13.030Z| vcpu-0| I120: Done Cpt monModules(3).
2015-08-26T11:50:13.030Z| vcpu-0| I120: Closing disk scsi0:0
>>>>>>>>> diskClose notification
2015-08-26T11:50:13.030Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose):
handle=32AC5C10 stunLevel = 1
2015-08-26T11:50:13.030Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): WG Wait
2015-08-26T11:50:13.030Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): WG Free
2015-08-26T11:50:13.030Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): have
sidecar hdl
2015-08-26T11:50:13.030Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): sidecar
close
2015-08-26T11:50:13.032Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): close
mutex lock
2015-08-26T11:50:13.032Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): daemon
poll remove
2015-08-26T11:50:13.032Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): sending
close to daemon
2015-08-26T11:50:13.032Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): daemon
sock close
2015-08-26T11:50:13.032Z| vcpu-0| I120: FiltLib: heap SampleFilterfilt statistics: numGrowthOps
2 mem bytes 20480
numShrinkOps 0 mem bytes 0 numSuccAlloc
2015-08-26T11:50:13.032Z| vcpu-0| E105: SampleFilterfiltLI(SampleFilterfiltDiskClose): returning
0
2015-08-26T11:50:13.032Z| vcpu-0| I120: FiltLib: SampleFilterfilt: diskClose
successful.
2015-08-26T11:50:13.032Z| vcpu-0| I120: FiltLib: DisableDiskIO upcallThread 1000018710
initalized 1
2015-08-26T11:50:13.032Z| vcpu-0| I120: FiltLib: DetachFromFiltMod 32AC4320

These are the events received on the destination host (where the VM was migrated to):

2015-08-26T11:50:09.036Z| vmx| I120: Migrate: Automatically handling hint.
2015-08-26T11:50:09.045Z| vmx| I120: DISKLIB-VMFS : "/vmfs/volumes/55dcb5a8-1efb5567-
f271-005056948bdc/VM1/VM1-flat.vmdk" :
open successful (524309) size = 21474836480, hd = 0. T
2015-08-26T11:50:09.050Z| vmx| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-
SampleFilterfilt.so from
/usr/lib64/vmware/plugin/libvmiof-disk-SampleFilterfilt.so
2015-08-26T11:50:09.051Z| vmx| E105: SampleFilterfiltLI(SampleFilterfiltDiskRequirements): JUST
return requirements
2015-08-26T11:50:09.051Z| vmx| I120: FiltLib: IOFilter param "SampleFilterfilt.per-mb" = "32".
2015-08-26T11:50:09.051Z| vmx| I120: FiltLib: IOFilter param "SampleFilterfilt.per-io" = "48".
2015-08-26T11:50:09.051Z| vmx| I120: FiltLib: IOFilter param "SampleFilterfilt.static" = "9608".
2015-08-26T11:50:10.504Z| vmx| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-
SampleFilterfilt.so
from /usr/lib64/vmware/plugin/libvmiof-disk-SampleFilterfilt.so
2015-08-26T11:50:10.506Z| Upcall-f187| I120: VTHREAD start thread 7 "Upcall-f187" pid 1000018460
2015-08-26T11:50:10.506Z| vmx| I120: FiltLib: Context 32AC5230: initialized the upcall thread.
2015-08-26T11:50:10.506Z| vmx| E105: SampleFilterfiltLI(SampleFilterfiltDiskOpen):
handle=32AC6AC0, flags=0(). File Chains=1.
Files in chain:
>>>>>>>>>>>>>>> diskopen callback invoked
2015-08-26T11:50:10.506Z| vmx| E105: Name[0]='/vmfs/volumes/55dcb5a8-1efb5567-
f271-005056948bdc/VM1/VM1.vmdk'
2015-08-26T11:50:10.506Z| vmx| E105: SampleFilterfiltDiskOpen called from /bin/vmx-debug.
inDaemon=0
2015-08-26T11:50:10.506Z| vmx| E105: SampleFilterfiltLI(SampleFilterfiltDiskRequirements): JUST
return requirements
2015-08-26T11:50:10.506Z| vmx| E105: SampleFilterfiltLI: created heap
2015-08-26T11:50:10.506Z| vmx| E105: SampleFilterfiltLI: did ID heap allocation
2015-08-26T11:50:10.510Z| vmx| E105: SampleFilterfiltLI: opened sidecar successfully
2015-08-26T11:50:10.513Z| vmx| E105: SampleFilterfiltLI: opened owner sidecar successfully
2015-08-26T11:50:10.516Z| vmx| E105: sizeof header: 4
2015-08-26T11:50:10.516Z| vmx| E105: sendmsg sent 4 bytes.
2015-08-26T11:50:10.516Z| vmx| E105: sizeof offset: 8
2015-08-26T11:50:10.516Z| vmx| E105: sizeof pnl: 4
2015-08-26T11:50:10.516Z| vmx| E105: sizeof pathname: 62
2015-08-26T11:50:10.516Z| vmx| E105: sendmsg sent 75 bytes.
2015-08-26T11:50:10.516Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)
2015-08-26T11:50:10.516Z| vmx| E105: SampleFilterfiltLI(SampleFilterfiltDiskOpen): Open
returning SUCCESS
2015-08-26T11:50:10.516Z| vmx| I120: FiltLib: SampleFilterfilt: diskOpen successful.
2015-08-26T11:50:10.549Z| vmx| I120: FiltLib: handle 32AC5230 adapter 0
2015-08-26T11:50:10.852Z| Upcall-f187| I120: FiltLib: Received unstun notification
>>>>>>>>>>>>>>>>> Disk Unstun notification received by the framework
2015-08-26T11:50:10.852Z| Upcall-f187| I120: FiltLib: FiltLib_DiskUnstun : Unstunning all IO
filters.
2015-08-26T11:50:10.852Z| Upcall-f187| E105: SampleFilterfiltLI(SampleFilterfiltDiskUnStun):
handle=32AC6AC0, entry stunLevel=-1
>>>>>>>>>>>>>>>>> diskUnstun callback invoked
2015-08-26T11:50:10.852Z| vcpu-0| I120: Migrate: cleaning up migration state.
2015-08-26T11:50:10.852Z| vcpu-0| I120: Migrate: Final status reported through Vigor.
2015-08-26T11:50:10.852Z| vcpu-0| I120: MigrateSetState: Transitioning from state 12 to 0.
2015-08-26T11:50:10.853Z| vcpu-0| I120: Migrate: Final status reported through VMDB.
2015-08-26T11:50:10.911Z| vcpu-0| I120: CPT: vmstart
2015-08-26T11:50:10.911Z| vcpu-1| I120: CPT: vmstart
2015-08-26T11:50:10.957Z| Upcall-f187| E105: SampleFilterfiltLI(SampleFilterfiltDiskStartIO):
handle=32AC6AC0, io=48C49000, sequence=0, resetID=73679445206144
>>>>>>>>>>>>>>>>> IO operations (re)start on the disk
2015-08-26T11:50:10.957Z| Upcall-f187| E105: SampleFilterfiltLI(SampleFilterfiltDiskStartIO):
iotp->handle=32AC6AC0, iotp->io=48C49000, sequence=0
2015-08-26T11:50:10.959Z| Upcall-f187| E105: SampleFilterfiltLI(SampleFilterfiltDiskStartIO):
handle=32AC6AC0, io=48C49000, sequence=1, resetID=73679445206144

The same set of events is received when an sMigration is performed; however, the notifications are sent only
to the source disk(s). An example is a write-back caching filter that has queued up some writes during an
sMigration to a different datastore, while the VM resides on the same host. The filter stores the cache
location in the sidecar. After the sMigration, the destination gets a copy of the sidecar and continues to use
the same cache location. On receiving the prepare-migration notification during an sMigration, the source
filter can flush all its caches and continue in write-through mode during the migration. On the source,
after the sMigration succeeds, the framework sends a detach callback and sets the delete flag. The filter on
the destination is unaware of the migration and may try to delete or reset the cache, so communication
should be set up through the sidecar or via the Daemon in order to maintain cache consistency.

NOTE sMigration while the VM is powered off (aka VMDK cold migration, or a VMDK relocation) is
implemented using Disk Clone.
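
The write-back example above can be sketched as a mode switch. The C fragment below is a minimal illustration under stated assumptions: the names WbCache, wb_cache_on_prepare, wb_cache_write and wb_cache_on_migration_end are hypothetical, and simple counters stand in for real block IO. It shows a cache that flushes its queued writes on the prepare notification and then passes writes straight through until the migration ends.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CACHE_WRITE_BACK, CACHE_WRITE_THROUGH } CacheMode;

/* Illustrative cache bookkeeping; counters stand in for real block IO. */
typedef struct {
    CacheMode mode;
    size_t    queuedWrites;   /* dirty writes held only in the cache */
    size_t    diskWrites;     /* writes that have reached the VMDK */
} WbCache;

/* Prepare notification: flush everything queued, then stop queuing. */
void wb_cache_on_prepare(WbCache *c)
{
    assert(c != NULL);
    c->diskWrites += c->queuedWrites;
    c->queuedWrites = 0;
    c->mode = CACHE_WRITE_THROUGH;
}

/* Guest write: queued in write-back mode, sent straight to disk otherwise. */
void wb_cache_write(WbCache *c)
{
    assert(c != NULL);
    if (c->mode == CACHE_WRITE_BACK) {
        c->queuedWrites++;
    } else {
        c->diskWrites++;
    }
}

/* Migration finished or failed: write-back caching may resume. */
void wb_cache_on_migration_end(WbCache *c)
{
    assert(c != NULL);
    c->mode = CACHE_WRITE_BACK;
}
```

Staying in write-through mode for the duration of the copy means the VMDK never lags behind the cache, so the copy on the destination datastore does not depend on cache contents.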

The following is an example of the detailed log messages that appear in vmware.log for an sMigration:

2015-11-19T06:14:06.373Z| vmx| I120: Received migrate 'start' request for mig id
1447913334240451, dest world id 1001889007.
2015-11-19T06:14:06.373Z| vmx| I120: MigrateSetState: Transitioning from state 1 to 2.
2015-11-19T06:14:06.417Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)
2015-11-19T06:14:06.417Z| Worker#0| I120: MigratePrepareEventNotifyThread: Prepare event start.
2015-11-19T06:14:06.417Z| Worker#0| I120: FiltLib_MigrationPrepare: received prepare for
migration notification
2015-11-19T06:14:06.417Z| Worker#0| I120: FLDiskNotificationProcess: no filters registered
2015-11-19T06:14:06.417Z| Worker#0| I120: FiltLib_MigrationPrepare: received prepare for
migration notification

>> We get the Migration callback in our LI

2015-11-19T06:14:06.417Z| Worker#0| I120: In the callback TestDiskVmMigration
2015-11-19T06:14:06.417Z| Worker#0| E105: TestDiskVmMigration:649: Migration Phase is PREPARE
2015-11-19T06:14:06.417Z| Worker#0| E105: TestDiskVmMigration:655: Migration Type is svMotion
2015-11-19T06:14:06.417Z| Worker#0| I120: FiltLib_MigrationPrepare: received prepare for
migration notification
2015-11-19T06:14:06.417Z| Worker#0| I120: FLDiskNotificationProcess: no filters registered
2015-11-19T06:14:06.417Z| Worker#0| I120: MigratePrepareEventLog: Prepare event callback
complete.
2015-11-19T06:14:06.418Z| vmx| I120: MigratePrepareEventLog: Prepare multiwriter disk handoff
complete.
2015-11-19T06:14:06.422Z| vmx| I120: MigratePrepareEventLog: Prepare destination complete.
2015-11-19T06:14:06.422Z| vmx| I120: MigratePrepareEventLog: Prepare event complete.
2015-11-19T06:14:06.422Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)
2015-11-19T06:14:06.422Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)
2015-11-19T06:14:06.422Z| vmx| I120: WORKER: Creating new group with numThreads=1 (4)
2015-11-19T06:14:06.422Z| vmx| I120: SVMotion: Enter Phase 1
2015-11-19T06:14:06.426Z| vmx| I120: Unable to find file CD/DVD drive 0
2015-11-19T06:14:06.830Z| vmx| I120: SVMotionDiskGetSrcInfo: disk scsi0:0: type: 11, allocType:
2, capacity: 41943040, grain:
0, numlinks: 1, rdm: null.
2015-11-19T06:14:07.530Z| vmx| I120: SVMotionDiskGetDstInfo: disk scsi0:0: type: 11, allocType:
2, capacity: 41943040, grain:
0, numlinks: 1, rdm: null.
2015-11-19T06:14:07.530Z| vmx| I120: SVMotionDiskSetup: Adding disk scsi0:0: isRDM: 0, isRemote:
0, skipZeros: 1.
2015-11-19T06:14:07.678Z| vmx| I120: MigrateWriteHostLog: Writing to log file took 145893 us.
2015-11-19T06:14:08.441Z| vmx| I120: SVMotionDiskGetSrcInfo: disk scsi0:1: type: 11, allocType:
0, capacity: 28672, grain:
0, numlinks: 1, rdm: null.
2015-11-19T06:14:09.239Z| vmx| I120: SVMotionDiskGetDstInfo: disk scsi0:1: type: 11, allocType:
0, capacity: 28672, grain:
0, numlinks: 1, rdm: null.
2015-11-19T06:14:09.240Z| vmx| I120: SVMotionDiskSetup: Adding disk scsi0:1: isRDM: 0, isRemote:
0, skipZeros: 0.
2015-11-19T06:14:09.427Z| vmx| I120: MigrateWriteHostLog: Writing to log file took 186052 us.
2015-11-19T06:14:09.792Z| vmx| I120: MigrateWriteHostLog: Writing to log file took 363912 us.
2015-11-19T06:14:10.084Z| vmx| I120: SVMotionDiskGetSrcInfo: disk scsi0:2: type: 11, allocType:
1, capacity:
32768, grain: 0, numlinks: 1, rdm: null.
2015-11-19T06:14:10.764Z| vmx| I120: SVMotionDiskGetDstInfo: disk scsi0:2: type: 11, allocType:
1, capacity:
32768, grain: 0, numlinks: 1, rdm: null.
2015-11-19T06:14:10.764Z| vmx| I120: SVMotionDiskSetup: Adding disk scsi0:2: isRDM: 0, isRemote:
0, skipZeros: 1.
2015-11-19T06:14:11.042Z| vmx| I120: MigrateWriteHostLog: Writing to log file took 276894 us.
2015-11-19T06:14:11.043Z| vmx| I120: MigrateSetState: Transitioning from state 2 to 3.
2015-11-19T06:14:11.043Z| vmx| I120: MigratePlatformLogCpuStats: 1447913334240451 SRC VM gid:
13483421 MainVmxThr:
1001888845 MigSendThr: 0 MigRecvThr: 0
2015-11-19T06:14:11.043Z| vmx| I120: MigratePlatformLogCpuStats: 1447913334240451 (begin) VM:
used: 64777141
2015-11-19T06:14:11.043Z| vmx| I120: MigratePlatformLogCpuStats: 1447913334240451 (begin)
MainVmxThr: ready: 590244 used: 3657336

>>>> Now ESXi will create the disk on the new datastore

2015-11-19T06:14:16.963Z| Worker#1| I120: MigrateWriteHostLog: Writing to log file took 148831
us.
2015-11-19T06:14:16.964Z| Worker#1| I120: SVMotionDiskGetCreateExtParams: not using a storage
policy to create disk
'/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk'
2015-11-19T06:14:16.964Z| Worker#1| I120: DISKLIB-LIB_CREATE : CREATE:
"/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk"
-- vmfs capacity=28672 (14 MB) adapter=lsi
logic info=cowGran=0 allocType=3 objType= policy=''
2015-11-19T06:14:16.964Z| Worker#1| I120: DISKLIB-LIB_CREATE : CreateObjExtParams: Object
backing type 0 is invalid. Figuring out the most
suitable backing type...
2015-11-19T06:14:16.967Z| Worker#1| I120: DISKLIB-LIB_CREATE : CREATE: Creating disk backed by
'default'
2015-11-19T06:14:17.651Z| Worker#1| I120: DISKLIB-DSCPTR:
"/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk" :
creation successful.
2015-11-19T06:14:18.119Z| Worker#1| I120: DISKLIB-VMFS :
"/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1-flat.vmdk" :
open successful (17) size = 14680064, hd = 0. Type
3
2015-11-19T06:14:18.676Z| Worker#1| I120: FiltLib: There are no io filters for this disk.
2015-11-19T06:14:19.201Z| Worker#1| I120: DISKLIB-VMFS :
"/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1-flat.vmdk" : closed.
2015-11-19T06:14:19.373Z| Worker#1| I120: MigrateWriteHostLog: Writing to log file took 170668
us.
2

2015-11-19T06:14:29.695Z| Worker#1| I120: DISKLIB-DSCPTR: Opened [0]: "WB21_1-flat.vmdk" (0x20a)
2015-11-19T06:14:29.695Z| Worker#1| I120: DISKLIB-LINK : Opened
'/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk'
(0x20a): vmfs, 28672 sectors / 14 MB.
2015-11-19T06:14:29.696Z| Worker#1| I120: FiltLib: There are no io filters for this disk.
2015-11-19T06:14:29.696Z| Worker#1| I120: DISKLIB-LIB : Opened
"/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk"
(flags 0x20a, type vmfs).
2015-11-19T06:14:33.486Z| Worker#1| I120: DDB: "longContentID" =
"47efb6fe4e1ef2bf24148069ba8e8bc8" (was "26cb0eae164c3e31ac0aba15fffffffe")
2015-11-19T06:14:35.731Z| Worker#1| I120: DDB: "uuid" = "60 00 C2 9c a0 83 c1 1c-93 25 54 0b 01
a5 65 e3"
(was "60 00 C2 9b 30 34 a8 cf-0b f4 43 7e 64 ea 2c 51")
2015-11-19T06:14:37.383Z| Worker#1| I120: SVMotionLocalDiskQueryInfo: Got block size 1048576 for
filesystem VMFS.

>> We stun the VM (not shown) and close the source disk so we can copy from a consistent state

2015-11-19T06:14:44.687Z| vcpu-0| I120: Closing disk scsi0:1


2015-11-19T06:14:44.687Z| vcpu-0| I120: In the callback TestDiskClose
2015-11-19T06:14:44.687Z| vcpu-0| I120: TestDiskClose:358: Received close call for handle
32ACE470
2015-11-19T06:14:44.715Z| vcpu-0| I120: FiltLib: heap countio statistics: numGrowthOps 1 mem
bytes 12288 numShrinkOps 0 mem bytes 0
numSuccAllocs 2 numFailedAllocs 0.
2015-11-19T06:14:44.716Z| vcpu-0| I120: FiltLib: countio: diskClose successful.
2015-11-19T06:14:44.716Z| vcpu-0| I120: FiltLib: DisableDiskIO upcallThread 1001888852
initalized 1
2015-11-19T06:14:44.716Z| vcpu-0| I120: FiltLib: DetachFromFiltMod 32ACBAC0
2015-11-19T06:14:44.739Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1-flat.vmdk" : closed.


2015-11-19T06:14:44.739Z| vcpu-0| I120: Closing disk scsi0:0


2015-11-19T06:14:44.739Z| vcpu-0| I120: FiltLib: DisableDiskIO upcallThread 4294967295
initalized 0
2015-11-19T06:14:44.762Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21-flat.vmdk" : closed.

>> The LI is reopened on the source. This is so we can install the “mirror” driver.
>> From the vSphere 5.0 Storage Technical Whitepaper: Mirror Mode enables a single-pass block
copy of the source disk to the destination
disk by mirroring I/Os of copied blocks.

2015-11-19T06:14:44.899Z| vcpu-0| I120: DISK: OPEN scsi0:1
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' persistent R[]
2015-11-19T06:14:44.952Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1-flat.vmdk" :
open successful (10) size = 14680064, hd = 30710396.
Type 3
2015-11-19T06:14:44.952Z| vcpu-0| I120: DISKLIB-DSCPTR: Opened [0]: "WB21_1-flat.vmdk" (0xa)
2015-11-19T06:14:44.952Z| vcpu-0| I120: DISKLIB-LINK : Opened
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' (0xa):
vmfs, 28672 sectors / 14 MB.
2015-11-19T06:14:44.978Z| vcpu-0| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-countio.so
from
/usr/lib64/vmware/plugin/libvmiof-disk-countio.so
2015-11-19T06:14:44.985Z| Upcall-1d49a7c| I120: VTHREAD start thread 10 "Upcall-1d49a7c" pid
1001889156
2015-11-19T06:14:44.985Z| vcpu-0| I120: FiltLib: Context 32ACD3D0: initialized the upcall thread.
2015-11-19T06:14:44.985Z| vcpu-0| I120: In the callback TestDiskOpen
2015-11-19T06:14:45.048Z| vcpu-0| I120: FiltLib: countio: diskOpen successful.
2015-11-19T06:14:45.048Z| vcpu-0| I120: DISKLIB-LIB : Opened
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk"
(flags 0xa, type vmfs).
2015-11-19T06:14:45.048Z| vcpu-0| I120: DISK: Disk
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk'
has UUID '60 00 c2 9c a0 83 c1 1c-93 25 54 0b 01 a5 65 e3'
2015-11-19T06:14:45.048Z| vcpu-0| I120: DISK: OPEN
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk'
Geo (14/64/32) BIOS Geo (0/0/0)

>> The LI gets an unstun

2015-11-19T06:14:45.052Z| Upcall-1d49a7c| I120: FiltLib: Received unstun notification


2015-11-19T06:14:45.052Z| Upcall-1d49a7c| I120: FiltLib: FiltLib_DiskUnstun : Unstunning all IO
filters.
2015-11-19T06:14:45.052Z| Upcall-1d49a7c| I120: In the callback TestDiskUnstun
2015-11-19T06:14:45.052Z| Upcall-1d49a7c| E105: TestDiskUnstun:637: DiskAdapterGet status = 0x0
2015-11-19T06:14:45.052Z| Upcall-1d49a7c| E105: TestDiskUnstun:638: adaptertype = 0x2
2015-11-19T06:14:45.052Z| Upcall-1d49a7c| E105: TestDiskUnstun:639: adapter = 0x0

>> At some point in the future, the Library indicates that the copy is complete

2015-11-19T06:37:25.849Z| Worker#1| I120: Disk/File copy started
for /vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk.
2015-11-19T06:37:28.276Z| vmx| I120: SVMotion: scsi0:1: Disk copy completed for total 14 MB at
5906 kB/s.


>> The LI gets closed on the source

2015-11-19T06:37:28.786Z| vcpu-0| I120: Closing disk scsi0:1


2015-11-19T06:37:28.802Z| vcpu-0| I120: In the callback TestDiskClose
2015-11-19T06:37:28.802Z| vcpu-0| I120: TestDiskClose:358: Received close call for handle
32ACC330
2015-11-19T06:37:28.802Z| vcpu-0| I120: FiltLib: heap countio statistics: numGrowthOps 1 mem
bytes 12288 numShrinkOps 0 mem bytes 0
numSuccAllocs 2 numFailedAllocs 0.
2015-11-19T06:37:28.828Z| vcpu-0| I120: FiltLib: countio: diskClose successful.
2015-11-19T06:37:28.828Z| vcpu-0| I120: FiltLib: DisableDiskIO upcallThread 1001888852
initalized 1
2015-11-19T06:37:28.828Z| vcpu-0| I120: FiltLib: DetachFromFiltMod 32ACD3D0
2015-11-19T06:37:28.922Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1-flat.vmdk" : closed.

>> Once the storage migration is complete, the LI on the destination now gets an Open. Note the
new Path, which is the clue that we are
now in the destination LI.

2015-11-19T06:37:33.396Z| Worker#1| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-countio.so
from
/usr/lib64/vmware/plugin/libvmiof-disk-countio.so
2015-11-19T06:37:33.404Z| Upcall-1aedfe0| I120: VTHREAD start thread 9 "Upcall-1aedfe0" pid
1001890223
2015-11-19T06:37:33.404Z| Worker#1| I120: FiltLib: Context 32AD84B0: initialized the upcall
thread.
2015-11-19T06:37:33.404Z| Worker#1| I120: In the callback TestDiskOpen
2015-11-19T06:37:33.419Z| Worker#1| E105: TestDiskOpen:303: GetUUID status = 0x0
2015-11-19T06:37:33.419Z| Worker#1| E105: TestDiskOpen:304: UUID = TestDiskOpen:306: 60
TestDiskOpen:306: 0 TestDiskOpen:306: c2
TestDiskOpen:306: 9c TestDiskOpen:306: a0 TestDiskOpen:
306: 83 TestDiskOpen:306: c1 TestDiskOpen:306: 1c TestDiskOpen:306: 93 TestDiskOpen:306: 25
TestDiskOpen:306: 54 TestDiskOpen:306:
b TestDiskOpen:306: 1 TestDiskOpen:306: a5 TestDiskOp
en:306: 65 TestDiskOpen:306: e3 TestDiskOpen:308:
2015-11-19T06:37:33.420Z| Worker#1| E105: TestDiskOpen:311: DiskAdapterGet status = 0x0
2015-11-19T06:37:33.420Z| Worker#1| E105: TestDiskOpen:312: adaptertype = 0x2
2015-11-19T06:37:33.420Z| Worker#1| E105: TestDiskOpen:313: adapter = 0x0
2015-11-19T06:37:33.420Z| Worker#1| E105: TestDiskOpen:314: target = 0x1
2015-11-19T06:37:33.420Z| Worker#1| I120: FiltLib: countio: diskOpen successful.
2015-11-19T06:37:33.420Z| Worker#1| I120: DISKLIB-LIB : Opened
"/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk"
(flags 0xa, type vmfs).
2015-11-19T06:37:33.421Z| Worker#1| I120: DISK: Disk
'/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk'
has UUID '60 00 c2 9c a0 83 c1 1c-93 25 54 0b 01 a5 65 e3'
2015-11-19T06:37:33.421Z| Worker#1| I120: DISK: OPEN
'/vmfs/volumes/549dec0c-38a54e37-14f1-0050569436e5/WB21/WB21_1.vmdk'
Geo (14/64/32) BIOS Geo (0/0/0)


>> Destination LI gets an Unstun and I/O’s continue as normal

2015-11-19T06:37:34.303Z| Upcall-1aedfe0| I120: FiltLib: Received unstun notification


2015-11-19T06:37:34.303Z| Upcall-1aedfe0| I120: FiltLib: FiltLib_DiskUnstun : Unstunning all IO
filters.
2015-11-19T06:37:34.304Z| Upcall-1aedfe0| I120: In the callback TestDiskUnstun
2015-11-19T06:37:34.304Z| Upcall-1aedfe0| E105: TestDiskUnstun:637: DiskAdapterGet status = 0x0
2015-11-19T06:37:34.304Z| Upcall-1aedfe0| E105: TestDiskUnstun:638: adaptertype = 0x2
2015-11-19T06:37:34.304Z| Upcall-1aedfe0| E105: TestDiskUnstun:639: adapter = 0x0
2015-11-19T06:37:34.304Z| vcpu-0| I120: Migrate: cleaning up migration state.
2015-11-19T06:37:34.307Z| vcpu-0| I120: Migrate: Final status reported through Vigor.
2015-11-19T06:37:34.307Z| vcpu-0| I120: MigrateSetState: Transitioning from state 12 to 0.
2015-11-19T06:37:34.308Z| vcpu-0| I120: Migrate: Final status reported through VMDB.

>> The source LI gets an open/detach/close with the flag set to VMIOF_DISK_DETACH_DELETE (1)

2015-11-19T06:37:32.757Z| Worker#1| I120: SVMotion: Enter Phase 12


2015-11-19T06:37:32.758Z| vmx| I120: Migrate: VM successfully stunned.
2015-11-19T06:37:32.758Z| vmx| I120: MigrateSetState: Transitioning from state 4 to 5.
2015-11-19T06:37:32.759Z| vmx| I120: Closing all the disks of the VM.
2015-11-19T06:37:32.769Z| Worker#1| I120: Migrate: Remote Log: Destination waited for 1406.52
seconds.
2015-11-19T06:37:32.769Z| Worker#1| I120: Migrate: Remote Log: Beginning checkpoint restore.
2015-11-19T06:37:32.769Z| Worker#1| I120: Migrate: Remote Log: Switching to checkpoint state.
2015-11-19T06:37:34.067Z| vmx| I120: VMXVmdb_SetMigrationHostLogState: hostlog state transits to
success for migrate 'to' mid 1447913334240451
2015-11-19T06:37:34.277Z| vmx| I120: MigrateWriteHostLog: Writing to log file took 209805 us.
2015-11-19T06:37:34.277Z| vmx| I120: MigrateSetStateFinished: type=1 new state=6
2015-11-19T06:37:34.277Z| vmx| I120: MigrateSetState: Transitioning from state 5 to 6.
2015-11-19T06:37:34.277Z| vmx| A115: ConfigDB: Setting cleanShutdown = "TRUE"
2015-11-19T06:37:34.422Z| vmx| I120: Migrate: Powering off
2015-11-19T06:37:34.422Z| vmx| I120: VigorTransport_ServerSendResponse opID=79711827-0e35-4c7f-
a436-23081d7450f6-99934-ngc-fb-eb-ef-2a57
seq=30373: Completed Migrate request.
2015-11-19T06:37:34.422Z| vmx| I120: Stopping VCPU threads...
2015-11-19T06:37:34.467Z| vmx| I120: SVMotion: Enter Phase 13
2015-11-19T06:37:34.467Z| vmx| I120: SVMotion_Cleanup: Scheduling cleanup thread.
2015-11-19T06:37:34.467Z| vmx| I120: SVMotionMirroredMode: Worker thread exited.
2015-11-19T06:37:34.467Z| vmx| I120: Closing all the disks of the VM.
2015-11-19T06:37:34.467Z| vmx| I120: SVMotionCleanupThread: Waiting for SVMotion Bitmap thread
to complete.
2015-11-19T06:37:34.467Z| vmx| I120: SVMotionCleanupThread: Waiting for SVMotion thread to
complete.
2015-11-19T06:37:34.467Z| vmx| I120: SVMotionCleanupThread: Cleaning up files.
2015-11-19T06:37:34.474Z| vmx| I120: SVMotionCleanupThread: Cleaning up disks.
2015-11-19T06:37:35.352Z| vmx| I120: MigrateDeleteHostLogItem: Attempting to delete
/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk (0).
2015-11-19T06:37:35.358Z| vmx| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1-flat.vmdk" :
open successful (1115153) size = 0, hd = 0. Type 3
2015-11-19T06:37:35.381Z| vmx| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-countio.so
from /usr/lib64/vmware/plugin/libvmiof-disk-countio.so
2015-11-19T06:37:35.381Z| vmx| I120: In the callback TestDiskOpen


2015-11-19T06:37:35.438Z| vmx| E105: TestDiskOpen:303: GetUUID status = 0x0


2015-11-19T06:37:35.438Z| vmx| E105: TestDiskOpen:304: UUID = TestDiskOpen:306: 60 TestDiskOpen:
306: 0 TestDiskOpen:306: c2 TestDiskOpen:306:
9c TestDiskOpen:306: a0 TestDiskOpen:306:
83 TestDiskOpen:306: c1 TestDiskOpen:306: 1c TestDiskOpen:306: 93 TestDiskOpen:306: 25
TestDiskOpen:306: 54 TestDiskOpen:306: b TestDiskOpen:306:
1 TestDiskOpen:306: a5 TestDiskOpen:30
6: 65 TestDiskOpen:306: e3 TestDiskOpen:308:
2015-11-19T06:37:35.439Z| vmx| E105: TestDiskOpen:311: DiskAdapterGet status = 0xb
2015-11-19T06:37:35.439Z| vmx| E105: TestDiskOpen:312: adaptertype = 0x3ff
2015-11-19T06:37:35.439Z| vmx| E105: TestDiskOpen:313: adapter = 0x0
2015-11-19T06:37:35.439Z| vmx| E105: TestDiskOpen:314: target = 0xff0a0000
2015-11-19T06:37:35.439Z| vmx| I120: FiltLib: countio: diskOpen successful.
2015-11-19T06:37:35.439Z| vmx| I120: In the callback TestDiskDetach
2015-11-19T06:37:35.439Z| vmx| E105: TestDiskDetach:188: Detach info->detachFlags = 0x1
2015-11-19T06:37:36.238Z| vmx| I120: In the callback TestDiskClose
2015-11-19T06:37:36.238Z| vmx| I120: TestDiskClose:358: Received close call for handle 327B9E00
2015-11-19T06:37:36.238Z| vmx| I120: FiltLib: heap countio statistics: numGrowthOps 1 mem bytes
12288 numShrinkOps 0 mem bytes 0
numSuccAllocs 2 numFailedAllocs 0.
2015-11-19T06:37:36.238Z| vmx| I120: FiltLib: countio: diskClose successful

Understanding and Processing a DiskRelease Event


Remember from “Understanding and Defining the diskRelease Callback,” on page 109 that the IO Filter
Framework invokes the diskRelease() callback of a cartel trying to open a VMDK if said VMDK is locked by
another cartel at that time. This can happen for several reasons, including that the filter's daemon has
opened the disk to do offline processing (flushing the cache or replicating the disk while the VM is not
running).

On receipt of this event, the filter (between its LI and Daemon) must determine whether a Daemon has the
VMDK open for offline processing and, if so, get it to stop that processing and close the VMDK so that the
current cartel can open it. If the VMDK is closed by whoever had it open, the library code must return
VMIOF_SUCCESS to the Framework, which causes the Framework to retry the open. If the disk has not been
opened by this filter, the Library Instance should return VMIOF_FAILURE. The LI can also return other error
codes to indicate different error scenarios. If all the filters return failure or error, the Framework fails to
open the disk. The following figure illustrates a simplified version of this sequence:


Figure 5‑4. Processing diskRelease: Single host case

The callback, defined in a library, has the following prototype:

VMIOF_Status (*diskRelease)(VMIOF_DiskHandle *handle);

Parameters :

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for the
filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

Return Value :

n VMIOF_SUCCESS — The function returns VMIOF_SUCCESS when its operation succeeds and the disk is
indeed closed. The disk open operation can then be retried.

n VMIOF_FAILURE — The function returns VMIOF_FAILURE when the disk could not be closed, for example
because it was not opened by a component belonging to this filter.

NOTE Returning VMIOF_ASYNC is prohibited. Provision for this callback by a filter is optional. If not
provided, the behaviour is the same as if VMIOF_FAILURE was returned.
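
The return-value contract described above can be sketched as follows. This is a minimal, self-contained illustration and not SDK code: DemoStatus and DemoDiskHandle are stand-ins for VMIOF_Status and VMIOF_DiskHandle (the real handle is opaque, and the pathname would come from the filter's own bookkeeping), and demo_daemon_release() is a hypothetical helper representing whatever mechanism the filter uses to ask its Daemon to close the VMDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the SDK types; the real definitions come from the VAIO headers. */
typedef enum { DEMO_SUCCESS, DEMO_FAILURE } DemoStatus;
typedef struct DemoDiskHandle {
    const char *vmdkPath;   /* assumption: pathname recovered by the filter itself */
} DemoDiskHandle;

/* Hypothetical Daemon state: the one VMDK it currently has open, if any. */
static const char *demoDaemonOpenVmdk = NULL;

/* Hypothetical request to the Daemon: stop offline work and close the VMDK.
 * Returns true only if the Daemon had the disk open and has now closed it. */
static bool demo_daemon_release(const char *vmdkPath)
{
    assert(vmdkPath != NULL);
    if (demoDaemonOpenVmdk != NULL && strcmp(demoDaemonOpenVmdk, vmdkPath) == 0) {
        demoDaemonOpenVmdk = NULL;   /* the Daemon closes the disk */
        return true;
    }
    return false;                    /* not opened by this filter's Daemon */
}

/* Shaped like the diskRelease callback: success tells the caller to retry the open. */
static DemoStatus demo_disk_release(DemoDiskHandle *handle)
{
    if (handle == NULL || handle->vmdkPath == NULL) {
        return DEMO_FAILURE;
    }
    return demo_daemon_release(handle->vmdkPath) ? DEMO_SUCCESS : DEMO_FAILURE;
}
```

Note that the sketch never returns an asynchronous status, in line with the NOTE above.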

Successfully implementing this callback requires several pieces:

n The library code must maintain a separate owner sidecar that, at a minimum, includes the pathname of
the VMDK. This is required because, as shown in the prototype above, the diskRelease callback only
receives a VMIOF_DiskHandle, not a VMIOF_DiskInfo with the pathname of the VMDK. This sidecar
may also include the IP address of the host whose Daemon has opened the VMDK for offline
processing, if any. The diskOpen callback must write the host's IP address in the owner sidecar when it
is invoked in the context of a Daemon (which it must determine on its own), and set it back to zeros (or
some other magic number) during the corresponding diskClose callback. The reason to keep the IP
address is explained later in this topic.

n The Daemon must keep a list of all VMDKs it opens for background processing. On receipt of a request
from an LI to release a disk, the Daemon can reject the request if it does not have the disk open. If it
does have the disk open, it can stop processing and honor the request.

n All Daemons for the Solution within a cluster must be able to communicate with one another. This is so
that, if the Daemon that opens the VMDK and the VM are on different hosts (say host1 and host2
respectively), the Solution on host2 can ask the Daemon on host1 to close it.


n The Library code must keep the owner sidecar open only for a short period of time during the
diskAttach, diskOpen, and diskClose callbacks. Failing to keep this sidecar closed at all other times
prevents diskRelease from opening it to read the VMDK's pathname (and possibly owner IP) and start
processing the event.
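
The owner-sidecar bookkeeping listed above might look like the following sketch. It models only the record contents in memory. The DemoOwnerRecord layout, the function names, and the use of an IPv4 address stored as a 32-bit integer are all assumptions for illustration; reading and writing the actual sidecar file is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PATH_MAX 256
#define DEMO_NO_OWNER 0u   /* "zeros" marks a VMDK that no Daemon has open */

/* Hypothetical in-memory image of the owner sidecar contents. */
typedef struct DemoOwnerRecord {
    char     vmdkPath[DEMO_PATH_MAX]; /* needed because diskRelease gets only a handle */
    uint32_t ownerIp;                 /* host whose Daemon opened the VMDK, if any */
} DemoOwnerRecord;

/* At attach time: remember the pathname; there is no owner yet. */
static bool demo_owner_init(DemoOwnerRecord *rec, const char *vmdkPath)
{
    assert(rec != NULL && vmdkPath != NULL);
    if (strlen(vmdkPath) >= sizeof rec->vmdkPath) {
        return false;                 /* refuse rather than store a truncated path */
    }
    strcpy(rec->vmdkPath, vmdkPath);
    rec->ownerIp = DEMO_NO_OWNER;
    return true;
}

/* At open time: record the host IP only when running in a Daemon context. */
static void demo_owner_on_open(DemoOwnerRecord *rec, bool inDaemon, uint32_t hostIp)
{
    if (inDaemon) {
        rec->ownerIp = hostIp;
    }
}

/* At the matching close: reset the owner field back to zeros. */
static void demo_owner_on_close(DemoOwnerRecord *rec, bool inDaemon)
{
    if (inDaemon) {
        rec->ownerIp = DEMO_NO_OWNER;
    }
}
```

In a real filter, each of these steps would open the sidecar, update it, and close it again straight away, so that a later diskRelease can still open it.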

With these things in place, the diskRelease code can perform the steps shown in the preceding figure.
However, that diagram assumed that the VM attempting to open a locked VMDK is on the same host as the
Daemon that has locked it. The following figure provides the same sequence, but dives deeper into step 5,
where the LI asks the Daemon to release the disk, and the Daemon that has locked it is on a separate host. If
the Daemon writes its IP (or other identifying information) into the owner sidecar, the Daemon on the host
with the VM attempting to open the VMDK can proxy the release request to the Daemon that has actually
opened the VMDK:

Figure 5‑5. Processing diskRelease: Multi-host case

Alternatives to Request Proxy


While this diagram and the preceding paragraph described a multi-host solution where daemons proxied
release requests to one another, there are alternative solutions that do not require the owner sidecar to keep
owner information, which must be maintained in the diskOpen / diskClose callbacks. One alternative is:

1 A Daemon (for example, on H2) receives a release request. Call this the local Daemon.

2 The local Daemon checks its VMDK list. If it does not find the VMDK on its list, it broadcasts messages
to the other Daemons asking if any of them have it open.

3 Each Daemon responds with an ACK or NAK. If all of them NAK, the local Daemon NAKs to the LI. If one
of them ACKs, the local Daemon ACKs to the LI.

4 The LI returns a result to the framework based on the ACK/NAK from the local Daemon.
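
The broadcast alternative in steps 1 through 3 can be sketched as below. The DemoDaemon structure and both functions are hypothetical, and the "broadcast" is modeled as a simple loop over in-memory peers; a real Solution would send messages between hosts over whatever transport it uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_OPEN 4

/* Hypothetical per-host Daemon: the VMDKs it has open for background work. */
typedef struct DemoDaemon {
    const char *openVmdks[DEMO_MAX_OPEN];
} DemoDaemon;

/* Returns true (ACK) if this Daemon had the VMDK open and has closed it,
 * false (NAK) if the VMDK is not on its list. */
static bool demo_daemon_try_release(DemoDaemon *d, const char *vmdkPath)
{
    for (size_t i = 0; i < DEMO_MAX_OPEN; i++) {
        if (d->openVmdks[i] != NULL && strcmp(d->openVmdks[i], vmdkPath) == 0) {
            d->openVmdks[i] = NULL;   /* stop processing and close the disk */
            return true;
        }
    }
    return false;
}

/* Local check first, then ask every peer. ACK if any Daemon released the disk. */
static bool demo_handle_release_request(DemoDaemon *local, DemoDaemon *peers,
                                        size_t numPeers, const char *vmdkPath)
{
    assert(local != NULL && vmdkPath != NULL);
    if (demo_daemon_try_release(local, vmdkPath)) {
        return true;
    }
    for (size_t i = 0; i < numPeers; i++) {
        if (demo_daemon_try_release(&peers[i], vmdkPath)) {
            return true;
        }
    }
    return false;                     /* every Daemon answered NAK */
}
```

The LI would then map the boolean result to the status it returns to the Framework, as in step 4.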


What If No Daemon Owns the Sidecar?


There are several scenarios where a VMDK is locked, preventing a VM from starting, but it is not locked by
one of the filter's Daemons. For example:

n For VMDKs with multiple filters attached, a Daemon for a different filter may have opened the VMDK.
In this case, the Framework sends diskRelease to each of the filters, in the same order in which filters
receive IOs, until either one of the filters' diskRelease callbacks returns VMIOF_SUCCESS or all filters
return failure. In the latter case, the Framework fails the open.

n A VMDK could be locked by tools such as vmkfstools. Because the IO Filters attached to a VMDK are
opaque to these tools, there is no (good) way for a diskRelease callback to get the tool to close the
VMDK. Here, the VM start will simply fail.

NOTE If the VM is powered off, the Daemon has opened the disk for offline flushing, and a user then tries
to clone the VM, the diskRelease callback is delivered to the filter. During the whole cloning
process, the IO Filter Framework keeps the disk open in read-only mode, so the Daemon cannot
open it again. However, the filter sees the diskClose callback before the cloning happens and another
diskOpen callback after the clone is complete.

Understanding and Using the VMIOF_VirtualDisk* Functions


By default, IO Filter Daemons cannot perform IO to the VMDKs they help filter, in part because the VAIO
IO utility functions, such as VMIOF_DiskIOAlloc() and VMIOF_DiskIOSubmit() require a VMIOF_DiskHandle as a
parameter, and this parameter is only provided to library callbacks.

Should you need your Daemon to perform IO to a VMDK, open it with VMIOF_VirtualDiskOpen(). This
function causes the IO Filter Framework to load an instance of the filter's library into the context of the
Daemon, and call its diskOpen() callback, whence said daemon receives a VMIOF_DiskHandle as one of the
parameters to diskOpen(). Once your Daemon has a handle to the disk, it can invoke VMIOF_DiskIOAlloc()
and VMIOF_DiskIOSubmit(). That said, Daemons typically communicate IO requests to their LI via a UNIX
socket used as a control plane, and a crossFD shared memory area used as a data plane.

The signature of VMIOF_VirtualDiskOpen() is:

VMIOF_Status
VMIOF_VirtualDiskOpen(const char *path, VMIOF_DiskFlags flags, VMIOF_VirtualDiskHandle **handle);

The parameters to this function are:

n const char *path — The pathname to the VMDK file to open

n VMIOF_DiskFlags flags — The flags to pass in the flags parameter to the filter's diskOpen() callback

n VMIOF_VirtualDiskHandle **handle — A pointer to the virtual disk handle pointer that is created on
success

NOTE You cannot use a VMIOF_VirtualDiskHandle with any function other than
VMIOF_VirtualDiskClose(). Further, you cannot pass VMIOF_DiskHandle objects between entities. They
only have meaning in the context in which they are issued.


Upon success, this function returns VMIOF_SUCCESS. On failure, the function returns a status indicating the
reason for failure, for example VMIOF_BAD_PARAM for invalid flag combinations. The handle is only valid if the
function returns VMIOF_SUCCESS.

NOTE VMIOF_DISK_NO_IO, VMIOF_DISK_RO, and VMIOF_DISK_SHARED are the supported flag values. A disk
opened with the VMIOF_DISK_NO_IO flag is not locked. This means that other applications can access the
disk, but no IO is permitted to the VMDK; only sidecar operations are allowed. To access a sidecar in
VMIOF_DISK_NO_IO mode, either the sidecar file must first be closed so that the caller of this function can
open it, or all applications that have opened the sidecar must have opened the VMDK with VMIOF_DISK_RO.
Opening the virtual disk in VMIOF_DISK_SHARED and VMIOF_DISK_RO mode is not allowed.

When you have finished performing IO to a VMDK, such as during Daemon shutdown, your code must call
VMIOF_VirtualDiskClose(). VMIOF_VirtualDiskClose() takes only one parameter, the handle returned by
VMIOF_VirtualDiskOpen().

The signature of VMIOF_VirtualDiskClose() is:

VMIOF_Status
VMIOF_VirtualDiskClose(VMIOF_VirtualDiskHandle *handle);

The parameters to this function are:


n VMIOF_VirtualDiskHandle *handle — This input parameter is a handle to the opened virtual disk
obtained via VMIOF_VirtualDiskOpen().

Return Value :
n VMIOF_Status — VMIOF_SUCCESS on success

n Some other value on failure.

If you want the Daemon to open an existing VMDK that already has a filter attached, use
VMIOF_VirtualDiskOpen(). However, if you want the Daemon to open a VMDK that does not exist, you
need to call VMIOF_VirtualDiskCreate(). This function creates a new VMDK file with a single filter attached.

The signature of VMIOF_VirtualDiskCreate() is:

VMIOF_Status
VMIOF_VirtualDiskCreate(const char *path, VMIOF_VirtualDiskType type, uint64_t size,
const char *filter, const VMIOF_DiskFilterProperty * const props[],
uint64_t numProps);

NOTE This function is deprecated and will be removed once alternative functionality is available.

The parameters to this function are:


n const char *path — The pathname to the VMDK file

n VMIOF_VirtualDiskType — The type of the virtual disk (The valid disk-types are
VMIOF_VIRTUAL_DISK_TYPE_VMFS_THIN, VMIOF_VIRTUAL_DISK_TYPE_VMFS_THICK_EAGER_ZERO, and
VMIOF_VIRTUAL_DISK_TYPE_VMFS_THICK_LAZY_ZERO).

n uint64_t size — The size of the virtual disk (in bytes).

n const char *filter — The name of the filter to attach to the disk.

n const VMIOF_DiskFilterProperty * const props[] — The list of filter properties.

n uint64_t numProps — The number of properties.

Return Value :
n VMIOF_SUCCESS — The function succeeded and the vmdk was created.

n VMIOF_MISALIGNED — The vmdk could not be created because the size of the disk is not a multiple of the
sector size (512 bytes).
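
Because a size that is not a multiple of 512 bytes is rejected with VMIOF_MISALIGNED, a Daemon may want to validate or round the size before calling the create function. The following is a small sketch of that arithmetic only; the helper names are hypothetical and are not part of the VAIO API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u   /* sector size quoted in the text above */

/* True if sizeBytes is a whole number of sectors. */
static bool demo_size_is_sector_aligned(uint64_t sizeBytes)
{
    return sizeBytes % DEMO_SECTOR_SIZE == 0;
}

/* Round sizeBytes up to the next sector boundary.
 * Returns 0 if rounding up would overflow uint64_t (an input of 0 also yields 0). */
static uint64_t demo_round_up_to_sector(uint64_t sizeBytes)
{
    uint64_t rem = sizeBytes % DEMO_SECTOR_SIZE;
    if (rem == 0) {
        return sizeBytes;
    }
    uint64_t pad = DEMO_SECTOR_SIZE - rem;
    if (sizeBytes > UINT64_MAX - pad) {
        return 0;
    }
    uint64_t rounded = sizeBytes + pad;
    assert(demo_size_is_sector_aligned(rounded));
    return rounded;
}
```

For example, a requested size of 1000 bytes would be rounded to 1024 bytes before being passed as the size argument.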


Following is a code snippet showing a Daemon start callback that creates a VMDK by calling
VMIOF_VirtualDiskCreate():

1. static VMIOF_Status
2. VmiofTestCreateDiskDaemonStart(void)
3. {
4. const char *vmdk = getenv("VMDK_PATH");
5. const char *vmdkType = getenv("VMDK_TYPE");
6. const char *filter = "vmiofTestCreateDisk";
7. const size_t size = 10 * 1024 * 1024;
8. VMIOF_VirtualDiskType type;
9. const VMIOF_DiskFilterProperty prop1 = {
10. .name = "success", .value = "true",
11. };
12. const VMIOF_DiskFilterProperty prop2 = {
13. .name = "notsuccess", .value = "false"
14. };
15. const VMIOF_DiskFilterProperty * const props[3] = {
16. &prop1, &prop2, NULL,
17. };
18. if (strcmp(vmdkType, "VMIOF_VIRTUAL_DISK_TYPE_VMFS_THIN") == 0) {
19. type = VMIOF_VIRTUAL_DISK_TYPE_VMFS_THIN;
20. } else if (strcmp(vmdkType, "VMIOF_VIRTUAL_DISK_TYPE_VMFS_THICK_EAGER_ZERO") == 0) {
21. type = VMIOF_VIRTUAL_DISK_TYPE_VMFS_THICK_EAGER_ZERO;
22. } else if (strcmp(vmdkType, "VMIOF_VIRTUAL_DISK_TYPE_VMFS_THICK_LAZY_ZERO") == 0) {
23. type = VMIOF_VIRTUAL_DISK_TYPE_VMFS_THICK_LAZY_ZERO;
24. } else {
25. VERIFY(false);
26. }
27. LOG("%s: daemon starting up and creating vmdk \"%s\" of type %s", __FUNCTION__, vmdk, vmdkType);
28. return VMIOF_VirtualDiskCreate(vmdk, type, size, filter, props, ARRAYSIZE(props));
29.}

The key sections of the code are as follows:

Lines 1-2: Define the VmiofTestCreateDiskDaemonStart() function. This is the callback for a daemon start
and, for this example, acts as the entry point for invoking the VMIOF_VirtualDiskCreate() function.
Lines 4-26: Initialize the various parameters to be passed to the VMIOF_VirtualDiskCreate() function.

Line 4: Specifies the path of the vmdk file to be created, read from the VMDK_PATH environment variable.

Line 5: Specifies the type of vmdk file, read from the VMDK_TYPE environment variable.

Line 6: Specifies the filter name to attach to the disk.

Line 7: Sets the size of the disk in bytes (10 MB).

Lines 9-17: Initialize the properties to be passed to the filter and the disk.

Lines 18-26: Map the requested type string to the type of the disk to be created.

Line 27: Logs the path and type of the vmdk about to be created.

Line 28: Calls the VMIOF_VirtualDiskCreate() function with the desired parameters.

Understanding and Processing a diskGrow Event


The IO Filter Framework invokes a filter's diskGrow() callback before the disk is grown.


Prototype :

VMIOF_Status (*diskGrow)(VMIOF_DiskHandle *handle, const VMIOF_DiskGrowInfo *info);

Parameters :

n VMIOF_DiskHandle *handle - This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n VMIOF_DiskGrowInfo *info – This input parameter is a pointer to a structure that describes the disk
grow notification. The capacity member of this structure conveys the new capacity to which the disk is
about to be grown. While it processes the notification, the filter can report the progress of diskGrow()
to the IO Filter Framework through the progressFunc member, a function pointer of type
VMIOF_DiskOpProgressFunc.

typedef struct VMIOF_DiskGrowInfo {
uint64_t capacity;
VMIOF_DiskOpProgressFunc progressFunc;
} VMIOF_DiskGrowInfo;

Return VMIOF_SUCCESS to allow the grow to continue. Returning any other value aborts the disk grow
operation.

NOTE The provision of this callback by a filter is optional. If not provided, the behaviour is the same as if
VMIOF_SUCCESS was returned.
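
As a rough illustration of the allow-or-abort decision, the sketch below accepts a grow only when the new capacity fits limits that the filter tracks. Everything in it is a stand-in: the Demo* types mirror the shape of the structure shown above, the progress function signature (a single percentage argument) is an assumption because the real VMIOF_DiskOpProgressFunc prototype is not shown here, and the per-disk DemoFilterState, including its maxCapacity limit, is hypothetical. A real callback receives a VMIOF_DiskHandle and would look up its own state from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for SDK types; names, values, and the progress signature are assumptions. */
typedef enum { DEMO_SUCCESS, DEMO_FAILURE } DemoStatus;
typedef void (*DemoProgressFunc)(unsigned percentDone);

typedef struct DemoDiskGrowInfo {
    uint64_t         capacity;      /* new capacity the disk is about to grow to */
    DemoProgressFunc progressFunc;  /* progress sink; may be NULL in this sketch */
} DemoDiskGrowInfo;

/* Hypothetical per-disk filter state. */
typedef struct DemoFilterState {
    uint64_t currentCapacity;
    uint64_t maxCapacity;           /* largest disk the filter's metadata can describe */
} DemoFilterState;

/* Example progress sink that just remembers the last value reported. */
static unsigned demoLastProgress;
static void demo_record_progress(unsigned percentDone)
{
    demoLastProgress = percentDone;
}

/* Success lets the grow continue; any other value aborts it. */
static DemoStatus demo_disk_grow(DemoFilterState *state, const DemoDiskGrowInfo *info)
{
    assert(state != NULL && info != NULL);
    if (info->capacity < state->currentCapacity || info->capacity > state->maxCapacity) {
        return DEMO_FAILURE;
    }
    state->currentCapacity = info->capacity;   /* record the pending capacity */
    if (info->progressFunc != NULL) {
        info->progressFunc(100);               /* filter-side preparation is done */
    }
    return DEMO_SUCCESS;
}
```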

The following is a sample log file that helps explain the diskGrow event sequence:

>> First the framework stuns the VM and closes all the disks. Note: In this scenario, the
filter is only attached to disk 2 (WB21_1)

2015-09-16T22:18:25.420Z| vcpu-0| I120: Destroying virtual dev for scsi0:0 vscsi=8202


2015-09-16T22:18:25.420Z| vcpu-0| I120: FiltLib: Sync set with 0 IOs in flight.
2015-09-16T22:18:25.420Z| vcpu-0| I120: VMMon_VSCSIStopVports: No such target on adapter
2015-09-16T22:18:25.421Z| vcpu-0| I120: Destroying virtual dev for scsi0:1 vscsi=8203
2015-09-16T22:18:25.421Z| vcpu-0| I120: FiltLib: Sync set with 0 IOs in flight.
2015-09-16T22:18:25.421Z| vcpu-0| I120: FiltLib: FiltLib_DiskSync: Requesting FiltMod to abort
all outstanding IOs.
2015-09-16T22:18:25.421Z| vcpu-0| I120: VMMon_VSCSIStopVports: No such target on adapter
2015-09-16T22:18:25.422Z| Upcall-1f78e2| I120: FiltLib: Received stun notification
2015-09-16T22:18:25.422Z| Upcall-1f78e2| I120: FiltLib: FiltLib_DiskStun : Stunning all IO
filters.
2015-09-16T22:18:25.422Z| Upcall-1f78e2| I120: In the callback TestDiskDiskStun
2015-09-16T22:18:25.477Z| vcpu-0| I120: Done Cpt monModules(3).
2015-09-16T22:18:25.477Z| vcpu-0| I120: Closing all the disks of the VM.
2015-09-16T22:18:25.477Z| vcpu-0| I120: Closing disk scsi0:1
2015-09-16T22:18:25.477Z| vcpu-0| I120: In the callback TestDiskClose
2015-09-16T22:18:25.477Z| vcpu-0| I120: TestDiskClose:338: Received close call for handle
32ABB5E0

>> The framework only opens the disk that it is about to Grow

2015-09-16T22:18:25.695Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1-flat.vmdk" : open
successful (24) size = 12582912, hd = 2873569. Type 3
2015-09-16T22:18:25.695Z| vcpu-0| I120: DISKLIB-DSCPTR: Opened [0]: "WB21_1-flat.vmdk" (0x18)
2015-09-16T22:18:25.695Z| vcpu-0| I120: DISKLIB-LINK : Opened
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' (0x18):
vmfs, 24576 sectors / 12 MB.
2015-09-16T22:18:25.704Z| vcpu-0| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-countio.so
from
/usr/lib64/vmware/plugin/libvmiof-disk-countio.so
2015-09-16T22:18:25.709Z| Upcall-2bd8e1| I120: VTHREAD start thread 9 "Upcall-2bd8e1" pid
1000158970
2015-09-16T22:18:25.710Z| vcpu-0| I120: FiltLib: Context 32ABAD70: initialized the upcall thread.
2015-09-16T22:18:25.710Z| vcpu-0| I120: In the callback TestDiskOpen
2015-09-16T22:18:25.710Z| vcpu-0| I120: diskFlags = 0x0
2015-09-16T22:18:25.744Z| vcpu-0| I120: FiltLib: countio: diskOpen successful.
2015-09-16T22:18:25.744Z| vcpu-0| I120: DISKLIB-LIB : Opened
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk"
(flags 0x18, type vmfs).

>> The framework has reopened the disk in order to notify the library that the disk is about to
be grown

2015-09-16T22:18:25.745Z| vcpu-0| I120: In the callback TestDiskDiskGrow


2015-09-16T22:18:25.745Z| vcpu-0| I120: info->capacity = 14680064
2015-09-16T22:18:25.745Z| vcpu-0| I120: In the callback TestDiskClose
2015-09-16T22:18:25.745Z| vcpu-0| I120: TestDiskClose:338: Received close call for handle
32AB9800
2015-09-16T22:18:25.755Z| vcpu-0| I120: FiltLib: heap countio statistics: numGrowthOps 1 mem
bytes 12288 numShrinkOps 0 mem bytes 0
numSuccAllocs 2 numFailedAllocs 0.
2015-09-16T22:18:25.755Z| vcpu-0| I120: FiltLib: countio: diskClose successful.

>> The framework has closed the disk and proceeds to grow it.

2015-09-16T22:18:25.762Z| vcpu-0| I120: DISKLIB-LIB : Growing disk
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' :
createType = vmfs
2015-09-16T22:18:25.762Z| vcpu-0| I120: DISKLIB-LIB : Growing disk
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' :
capacity = 24576 sectors - 12 MB
2015-09-16T22:18:25.762Z| vcpu-0| I120: DISKLIB-LIB : Growing disk
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' :
new capacity = 28672 sectors - 14 MB
2015-09-16T22:18:27.589Z| vcpu-0| I120: DDB: "geometry.cylinders" = "14" (was "12")
2015-09-16T22:18:28.069Z| vcpu-0| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-countio.so
from
/usr/lib64/vmware/plugin/libvmiof-disk-countio.so

>> Now the framework reopens the disk then closes the disk, giving the filter a chance to update
its metadata (sidecars).

2015-09-16T22:18:28.072Z| vcpu-0| I120: In the callback TestDiskOpen


2015-09-16T22:18:28.072Z| vcpu-0| I120: diskFlags = 0x0
2015-09-16T22:18:28.103Z| vcpu-0| I120: FiltLib: countio: diskOpen successful.
2015-09-16T22:18:28.103Z| vcpu-0| I120: In the callback TestDiskClose
2015-09-16T22:18:28.103Z| vcpu-0| I120: TestDiskClose:338: Received close call for handle
32ABF280
2015-09-16T22:18:28.114Z| vcpu-0| I120: FiltLib: heap countio statistics: numGrowthOps 1 mem
bytes 12288 numShrinkOps 0 mem bytes 0
numSuccAllocs 2 numFailedAllocs 0.
2015-09-16T22:18:28.115Z| vcpu-0| I120: FiltLib: countio: diskClose successful.

>> The framework now opens all disks

2015-09-16T22:18:28.168Z| vcpu-0| I120: DISK: OPEN scsi0:0
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21.vmdk' persistent R[]
2015-09-16T22:18:28.191Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21-flat.vmdk" :
open successful (10) size = 21474836480, hd = 2889953. Type 3
2015-09-16T22:18:28.191Z| vcpu-0| I120: DISKLIB-DSCPTR: Opened [0]: "WB21-flat.vmdk" (0xa)
2015-09-16T22:18:28.191Z| vcpu-0| I120: DISKLIB-LINK : Opened
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21.vmdk' (0xa):
vmfs, 41943040 sectors / 20 GB.
2015-09-16T22:18:28.192Z| vcpu-0| I120: FiltLib: There are no io filters for this disk.
2015-09-16T22:18:28.192Z| vcpu-0| I120: DISKLIB-LIB : Opened
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21.vmdk"
(flags 0xa, type vmfs).
2015-09-16T22:18:28.192Z| vcpu-0| I120: DISK: Disk
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21.vmdk' has
UUID '60 00 c2 9f 35 1c 92 6c-0b b6 30 8d 38 6c 51 06'
2015-09-16T22:18:28.192Z| vcpu-0| I120: DISK: OPEN
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21.vmdk'
Geo (2610/255/63) BIOS Geo (0/0/0)
2015-09-16T22:18:28.210Z| vcpu-0| I120: DISK: OPEN scsi0:1
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' persistent R[]
2015-09-16T22:18:28.234Z| vcpu-0| I120: DISKLIB-VMFS :
"/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1-flat.vmdk" :
open successful (10) size = 14680064, hd = 2136290. Type 3
2015-09-16T22:18:28.234Z| vcpu-0| I120: DISKLIB-DSCPTR: Opened [0]: "WB21_1-flat.vmdk" (0xa)
2015-09-16T22:18:28.234Z| vcpu-0| I120: DISKLIB-LINK : Opened
'/vmfs/volumes/53601316-8ca9ccc0-175e-000c290c3136/WB21/WB21_1.vmdk' (0xa):
vmfs, 28672 sectors / 14 MB.
2015-09-16T22:18:28.244Z| vcpu-0| I120: PluginLdr_Load: Loaded plugin libvmiof-disk-countio.so
from
/usr/lib64/vmware/plugin/libvmiof-disk-countio.so

2015-09-16T22:18:28.252Z| vcpu-0| I120: In the callback TestDiskOpen


2015-09-16T22:18:28.252Z| vcpu-0| I120: diskFlags = 0x0
2015-09-16T22:18:28.271Z| vcpu-0| I120: FiltLib: countio: diskOpen successful.

>> The VM is now unstunned and regular operations resume

2015-09-16T22:18:28.272Z| Upcall-2098e2| I120: FiltLib: Received unstun notification


2015-09-16T22:18:28.272Z| Upcall-2098e2| I120: FiltLib: FiltLib_DiskUnstun : Unstunning all IO
filters.
2015-09-16T22:18:28.273Z| Upcall-2098e2| I120: In the callback TestDiskDiskUnstun
2015-09-16T22:18:28.273Z| vcpu-0| I120: Creating virtual dev for scsi0:1
2015-09-16T22:18:28.273Z| vcpu-0| I120: DumpDiskInfo: scsi0:1 createType=11, capacity = 28672,
numLinks = 1, allocationType = 0
2015-09-16T22:18:28.273Z| vcpu-0| I120: SCSIDiskESXPopulateVDevDesc: Using FS backend
2015-09-16T22:18:28.273Z| vcpu-0| I120: DISKUTIL: scsi0:1 : geometry=14/64/32
2015-09-16T22:18:28.274Z| vcpu-0| I120: FiltLib: handle 32ABEE20 adapter 0
2015-09-16T22:18:28.283Z| vcpu-0| I120: Vigor_UpdateSchedulingPolicy: results: 1 args: normal
18446744073709551615 18446744073709551614 0
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk0.ddb.thinProvisioned" = "1"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk1.label" = "scsi0:1"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk0.ddb.toolsVersion" = "2147483647"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk0.ddb.uuid" = "60 00 C2 9f 35 1c 92 6c-0b b6 30
8d 38 6c 51 06"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk1.ddb.iofilters" = "countio"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk1.ddb.longContentID" =
"47efb6fe4e1ef2bf24148069ba8e8bc8"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk1.ddb.adapterType" = "lsilogic"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk0.ddb.geometry.heads" = "255"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk1.ddb.geometry.heads" = "64"
2015-09-16T22:18:28.285Z| vcpu-0| I120: # "#disk1.ddb.uuid" = "60 00 C2 9c a0 83 c1 1c-93 25 54
0b 01 a5 65 e3"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.ddb.adapterType" = "lsilogic"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk1.ddb.sidecars" = "countio_1,WB21_1-
a5df5f3ae9d2d1b5.vmfd"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.label" = "scsi0:0"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.ddb.geometry.cylinders" = "2610"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.ddb.geometry.sectors" = "63"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.ddb.virtualHWVersion" = "11"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk1.ddb.geometry.cylinders" = "14"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.ddb.longContentID" =
"0b5b332e78a517e8399fec717fb7afcb"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk1.capacityMB" = "14"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk1.ddb.geometry.sectors" = "32"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk1.ddb.virtualHWVersion" = "11"
2015-09-16T22:18:28.286Z| vcpu-0| I120: # "#disk0.capacityMB" = "20480"

Understanding and Processing diskProperties{Valid,Set,Get,Free} Events


Remember from the section “Understanding and Defining the diskProperties{Valid,Set,Get,Free} Callbacks,”
on page 110 that the VAIO provides four callbacks for processing events related to managing the
capabilities / properties associated with a VMDK.

There are two ways to set and change VMDK capabilities / properties. One way is to use vmkfstools, either
when attaching a filter to a VMDK or afterwards. This method is intended only for development and testing
and should not be used in a production environment.

vmkfstools --iofilters countio:numWorkGroups="4" vm.vmdk

The preferred way is to configure an SPBM policy, using either the vSphere Web Client or the vSphere
Storage Policy API. SPBM is designed to apply one configuration to a large number of virtual disks, not to
manage individual virtual disks at a very fine grain. If you need fine-grained control over each virtual
disk, have your VWC Plugin talk to your CIM provider or daemon to pass additional configuration
information to the Library Instance.


When you change the filter properties through an SPBM policy, you are asked whether you want to apply the
change immediately. If you choose yes, diskPropertiesValid / diskPropertiesSet is called with the changed
properties for each disk. Keep in mind that these callbacks can also be invoked when other filters on the
disk change, so you might see them even though your filter's properties are unchanged.

NOTE If you change filter properties for a VMDK of a running VM, the VM will be stunned first.

NOTE If you switch to a different SPBM policy and both policies contain your filter, your LI sees only
diskPropertiesValid / diskPropertiesSet callbacks, not detach/attach callbacks.

As a brief review, the prototype and purpose of each callback is as follows:

diskPropertiesValid() — Determine if the list of properties and their associated values are valid. In other
words given a list of filter properties, report if the properties can be applied to the virtual disk.

VMIOF_Status(*diskPropertiesValid)(VMIOF_DiskHandle *handle,
const VMIOF_DiskFilterProperty *const *properties);

Parameters —

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskFilterProperty *const *properties — This is a NULL-terminated array of pointers to
structures of type VMIOF_DiskFilterProperty. It may be NULL if no properties are present.

typedef struct VMIOF_DiskFilterProperty {
   /** Property name. */
   const char *name;
   /** Property value. */
   const char *value;
} VMIOF_DiskFilterProperty;

n const char *name — A pointer to an ASCII string containing the name of the property to be
validated, set, retrieved, or freed.

n const char *value — For valid and set, this points to the ASCII value being validated or set,
respectively.

Return Values —

n A return value of VMIOF_SUCCESS indicates that the specified filter properties are valid for this filter
instance. A return value of VMIOF_FAILURE indicates that the specified filter properties are not valid or
cannot be set at this time.

NOTE Provisioning of this callback by a filter is optional. If not provided, the filter is expected to
normally succeed when setting properties.

NOTE The diskHandle in diskPropertiesValid might be NULL depending on the context. For example,
the first time you attach the filter to a disk, the pointer will be NULL, since the filter hasn't been
attached to the disk yet. If the filter has already been attached to the disk, and you change the policy,
you will see a valid diskHandle.
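The following sketch shows one way a diskPropertiesValid callback might walk the NULL-terminated array while tolerating a NULL array and a NULL handle. The type definitions are local stand-ins for the VAIO declarations (the status values are placeholders). The property name numWorkGroups is borrowed from the vmkfstools example above, but its accepted range of 1 to 64 is an assumption made for illustration, as are the function names. Skipping unknown property names rather than rejecting them is a design choice of this sketch, not a rule stated by the framework.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins; a real filter gets these from the VAIO headers. */
typedef enum { VMIOF_SUCCESS = 0, VMIOF_FAILURE = 1 } VMIOF_Status;
typedef struct VMIOF_DiskHandle VMIOF_DiskHandle;   /* opaque */
typedef struct VMIOF_DiskFilterProperty {
   const char *name;
   const char *value;
} VMIOF_DiskFilterProperty;

/* Parse a decimal string into [lo, hi]. Returns 0 on success, -1 otherwise. */
static int
SampleParseRange(const char *s, long lo, long hi, long *out)
{
   char *end;
   long v;

   if (s == NULL || *s == '\0') {
      return -1;
   }
   errno = 0;
   v = strtol(s, &end, 10);
   if (errno != 0 || *end != '\0' || v < lo || v > hi) {
      return -1;
   }
   *out = v;
   return 0;
}

/* Hypothetical diskPropertiesValid callback. */
static VMIOF_Status
SampleDiskPropertiesValid(VMIOF_DiskHandle *handle,
                          const VMIOF_DiskFilterProperty *const *properties)
{
   size_t i;
   long n;

   (void)handle;               /* may be NULL on the first attach */
   if (properties == NULL) {
      return VMIOF_SUCCESS;    /* no properties were supplied */
   }
   for (i = 0; properties[i] != NULL; i++) {
      if (properties[i]->name == NULL) {
         return VMIOF_FAILURE;
      }
      if (strcmp(properties[i]->name, "numWorkGroups") == 0 &&
          SampleParseRange(properties[i]->value, 1, 64, &n) != 0) {
         return VMIOF_FAILURE; /* assumed range: 1..64 */
      }
      /* Names this version does not know are skipped, not rejected. */
   }
   return VMIOF_SUCCESS;
}
```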

diskPropertiesSet() — Update the values of the specified properties with their associated values, typically
by writing them to a sidecar.

VMIOF_Status (*diskPropertiesSet)(VMIOF_DiskHandle *handle,
                                  const VMIOF_DiskFilterProperty *const *properties);


Parameters —

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskFilterProperty *const *properties — This is a NULL-terminated array of pointers to
structures of type VMIOF_DiskFilterProperty. It may be NULL if no properties are present.

Return Values —

n VMIOF_SUCCESS — The specified filter properties have been accepted and will be applied.

n VMIOF_FAILURE — The specified filter properties have not been accepted and will not be applied.

NOTE Long-running work that needs to take effect as a result of the change in properties must happen
outside of the context of this callback.

diskPropertiesGet() — Retrieve the values of the specified properties. The function is expected to
dynamically allocate the space for the retrieved values.

void (*diskPropertiesGet)(VMIOF_DiskHandle *handle,
                          const VMIOF_DiskFilterProperty *const **properties);

Parameters —

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n const VMIOF_DiskFilterProperty *const **properties — This output parameter is the address of a pointer
that the callback sets to a NULL-terminated array of pointers to structures of type
VMIOF_DiskFilterProperty.

Return Values —

n none

NOTE Repeated calls may return the same pointer.

diskPropertiesFree() — Free the space dynamically allocated during a diskPropertiesGet().

void (*diskPropertiesFree)(VMIOF_DiskHandle *handle,
                           VMIOF_DiskFilterProperty **properties);

Parameters —

n VMIOF_DiskHandle *handle — This input parameter is an opaque handle to the disk and is valid only for
the filter that it is passed to. Almost every callback function has this handle to the disk as the first
parameter.

n VMIOF_DiskFilterProperty **properties — This is the NULL-terminated array of pointers to structures of
type VMIOF_DiskFilterProperty that the filter returned from diskPropertiesGet().


Return Values —

n none

NOTE Implementing this callback by a filter is optional. If it is not provided, the filter is expected to free the
associated memory on close.

WHAT'S NEW In ESX 6.0 U1, after upgrading your filter from one version to another, you need to delete the
SPBM policy and recreate it. This has been fixed in 6.0 U2.

In newer versions of your filter, you can introduce new properties, but you can never remove old
properties.

Assume you install filter version-1 on Cluster-1 and create SPBM Policy-1, then install filter version-2,
which introduces some new properties in addition to those of version-1, on Cluster-2 and create SPBM
Policy-2. SPBM Policy-1 has only the old properties, but if you edit it, you see the new properties, and it
can be applied to both Cluster-1 and Cluster-2. SPBM Policy-2 includes all the old and new properties, but it
can only be applied to Cluster-2.

There are two implications for filters. First, a filter should not crash if it receives an unexpected
property, since it might be a property that is valid in a future filter version. Second, you cannot
necessarily tell from an SPBM policy to which filter versions it applies fully and to which only partially.

It is important to understand the sequence in which the IO Filter Framework invokes these callbacks.

n For setting properties during filter attach (diskAttach()):

a If it is present in the library, invoke diskPropertiesValid().

b If diskPropertiesValid() was invoked and returned VMIOF_SUCCESS, invoke diskAttach(). If it
wasn't present and therefore not invoked, invoke diskAttach() anyway.

n For setting properties after filter attach (diskPropertiesSet()):

a If it is present in the library, invoke diskPropertiesValid().

b If diskPropertiesValid() was invoked and returned VMIOF_SUCCESS, invoke diskPropertiesSet(). If it
wasn't present and therefore not invoked, invoke diskPropertiesSet() anyway.

n For retrieving properties:

a diskPropertiesGet()

b some time later, diskPropertiesFree()
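The ordering for setting properties after attach can be modelled with a small dispatcher. This is only a sketch of the sequence as described above, not the Framework's actual code. The structure SampleFilterOps, the function SampleApplyProperties, and the two demonstration callbacks are hypothetical, and the type definitions are local stand-ins for the VAIO declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; a real filter gets these from the VAIO headers. */
typedef enum { VMIOF_SUCCESS = 0, VMIOF_FAILURE = 1 } VMIOF_Status;
typedef struct VMIOF_DiskHandle VMIOF_DiskHandle;   /* opaque */
typedef struct VMIOF_DiskFilterProperty {
   const char *name;
   const char *value;
} VMIOF_DiskFilterProperty;

typedef VMIOF_Status (*SamplePropsFn)(VMIOF_DiskHandle *handle,
                                      const VMIOF_DiskFilterProperty *const *properties);

/* Hypothetical callback table; diskPropertiesValid may be NULL (optional). */
typedef struct SampleFilterOps {
   SamplePropsFn diskPropertiesValid;
   SamplePropsFn diskPropertiesSet;
} SampleFilterOps;

/* Model of the described order: validate if possible, then set. */
static VMIOF_Status
SampleApplyProperties(const SampleFilterOps *ops, VMIOF_DiskHandle *handle,
                      const VMIOF_DiskFilterProperty *const *properties)
{
   if (ops->diskPropertiesValid != NULL) {
      VMIOF_Status status = ops->diskPropertiesValid(handle, properties);

      if (status != VMIOF_SUCCESS) {
         return status;   /* validation failed, so set is never invoked */
      }
   }
   return ops->diskPropertiesSet(handle, properties);
}

/* Demonstration callbacks used to observe the ordering. */
static int sampleSetCalls;

static VMIOF_Status
SampleRejectAll(VMIOF_DiskHandle *handle,
                const VMIOF_DiskFilterProperty *const *properties)
{
   (void)handle;
   (void)properties;
   return VMIOF_FAILURE;
}

static VMIOF_Status
SampleCountSet(VMIOF_DiskHandle *handle,
               const VMIOF_DiskFilterProperty *const *properties)
{
   (void)handle;
   (void)properties;
   sampleSetCalls++;
   return VMIOF_SUCCESS;
}
```

With SampleRejectAll installed as the validator, SampleCountSet is never reached. With no validator installed, it is called directly.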

It is also important to understand how the IO Filter Framework passes properties and their values into
these callbacks. The second parameter in each of these callbacks is either:

const VMIOF_DiskFilterProperty *const *properties

for Valid / Set (diskPropertiesGet adds one more level of indirection), or:

VMIOF_DiskFilterProperty **properties

for Free.

This syntax should actually be written with *properties replaced by properties[] because properties is
actually an array of pointers to VMIOF_DiskFilterProperty structures, not a pointer to a pointer to a single
structure of that type. The Framework sets the last element in the array to NULL.


Each VMIOF_DiskFilterProperty structure (an element that the array points to) is defined as:

typedef struct VMIOF_DiskFilterProperty {
   /** Property name. */
   const char *name;
   /** Property value. */
   const char *value;
} VMIOF_DiskFilterProperty;

The two members of this structure are:

n const char *name — A pointer to an ASCII string containing the name of the property to be validated,
set, retrieved, or freed.

n const char *value — For valid and set, this points to the ASCII value being validated or set,
respectively.

For diskPropertiesGet, the function is expected to set this pointer to an address that contains an ASCII
representation of the property's value. If the property is naturally ASCII (a string), or you keep the
ASCII representation (uncommon), the code can just set this member to the address of that value. In the
more likely case that an integer value is not kept in ASCII, the code should dynamically allocate
memory to store an ASCII representation of the current value of the named property and set this member
to the address of the allocated memory, which is freed some time later when the Framework calls
diskPropertiesFree().

Having pointers for values for string properties pointing to existing memory and pointers for other
values dynamically allocated adds a small layer of complexity to the solution such that when the
Framework calls diskPropertiesFree, the code for that callback has to distinguish which were
dynamically allocated and which were not.

The diskPropertiesGet callback is similar, except that the second parameter is

const VMIOF_DiskFilterProperty *const **properties

Here properties is the address of a pointer that your code sets to a NULL-terminated array of pointers to
VMIOF_DiskFilterProperty structures. That is, your code must allocate one VMIOF_DiskFilterProperty
structure for each property in your filter and place a pointer to each structure in the array.

For diskPropertiesFree, properties contains the address the code set during diskPropertiesGet. The
code must free the memory pointed to if it was dynamically allocated.
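One way to tell dynamically allocated values apart from values that point at existing memory is to wrap each property in a slightly larger structure that records ownership. The sketch below takes that approach. It uses malloc() and free() as stand-ins for the filter's heap functions. The wrapper SampleOwnedProperty, the two sample properties, and all function names are hypothetical, and the type definitions are local stand-ins for the VAIO declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Local stand-ins; a real filter gets these from the VAIO headers. */
typedef struct VMIOF_DiskHandle VMIOF_DiskHandle;   /* opaque */
typedef struct VMIOF_DiskFilterProperty {
   const char *name;
   const char *value;
} VMIOF_DiskFilterProperty;

/* Hypothetical wrapper: prop is the first member, so a pointer to the
 * wrapper and a pointer to prop refer to the same address. */
typedef struct SampleOwnedProperty {
   VMIOF_DiskFilterProperty prop;
   bool valueOwned;   /* true if prop.value was dynamically allocated */
} SampleOwnedProperty;

static const char *sampleMode = "writeback";   /* kept as a string */
static unsigned sampleNumWorkGroups = 4;       /* kept as an integer */

static SampleOwnedProperty *
SampleMakeProperty(const char *name, const char *value, bool owned)
{
   SampleOwnedProperty *p = malloc(sizeof *p);

   if (p != NULL) {
      p->prop.name = name;
      p->prop.value = value;
      p->valueOwned = owned;
   }
   return p;
}

/* Hypothetical diskPropertiesGet: on allocation failure *properties is NULL. */
static void
SampleDiskPropertiesGet(VMIOF_DiskHandle *handle,
                        const VMIOF_DiskFilterProperty *const **properties)
{
   VMIOF_DiskFilterProperty **arr;
   SampleOwnedProperty *mode, *groups;
   char *num;

   (void)handle;
   *properties = NULL;
   arr = calloc(3, sizeof *arr);   /* two properties plus the NULL terminator */
   if (arr == NULL) {
      return;
   }
   mode = SampleMakeProperty("mode", sampleMode, false);
   num = malloc(16);               /* enough for any 32-bit unsigned value */
   if (num != NULL) {
      snprintf(num, 16, "%u", sampleNumWorkGroups);
   }
   groups = (num != NULL) ? SampleMakeProperty("numWorkGroups", num, true) : NULL;
   if (mode == NULL || groups == NULL) {
      free(mode);
      free(groups);
      free(num);
      free(arr);
      return;
   }
   arr[0] = &mode->prop;
   arr[1] = &groups->prop;
   *properties = (const VMIOF_DiskFilterProperty *const *)arr;
}

/* Hypothetical diskPropertiesFree: frees only what Get allocated. */
static void
SampleDiskPropertiesFree(VMIOF_DiskHandle *handle,
                         VMIOF_DiskFilterProperty **properties)
{
   size_t i;

   (void)handle;
   if (properties == NULL) {
      return;
   }
   for (i = 0; properties[i] != NULL; i++) {
      SampleOwnedProperty *p = (SampleOwnedProperty *)properties[i];

      if (p->valueOwned) {
         free((void *)p->prop.value);
      }
      free(p);
   }
   free(properties);
}
```

The ownership flag keeps diskPropertiesFree from having to compare addresses or re-derive which properties are strings.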

The code for most functions is somewhat obvious once you understand the parameters. That said, here are
some important notes:

n Return VMIOF_SUCCESS from diskPropertiesValid if ALL of the parameters and their proposed values
are valid. The validity of values may depend on dynamic factors such as the current time (for
example with licenses), the size of the disk, available cache memory, etc. Return VMIOF_FAILURE if any
property or its value is invalid.

n Return VMIOF_SUCCESS from diskPropertiesSet only if you can set ALL of the parameters to their proposed
values. That is, a set operation should be atomic: the code changes either all of them or none of them.

n diskPropertiesValid is optional. If the Library component does not provide this callback, the
Framework just calls diskPropertiesSet() or diskAttach() as though it was present and returned
VMIOF_SUCCESS.

n diskPropertiesFree is optional. Your library only needs to provide it if your diskPropertiesGet
dynamically allocates memory for properties it returns.
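The all-or-nothing requirement for diskPropertiesSet can be met by staging the new values in a copy of the configuration and committing the copy only after every property has parsed. The following is a minimal sketch of that idea. The SampleConfig structure, the cacheMB property, and both accepted ranges are hypothetical, the configuration is kept in a static variable purely to keep the fragment short, and the type definitions are local stand-ins for the VAIO declarations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins; a real filter gets these from the VAIO headers. */
typedef enum { VMIOF_SUCCESS = 0, VMIOF_FAILURE = 1 } VMIOF_Status;
typedef struct VMIOF_DiskHandle VMIOF_DiskHandle;   /* opaque */
typedef struct VMIOF_DiskFilterProperty {
   const char *name;
   const char *value;
} VMIOF_DiskFilterProperty;

/* Hypothetical filter configuration and its current (active) values. */
typedef struct SampleConfig {
   long numWorkGroups;
   long cacheMB;
} SampleConfig;

static SampleConfig sampleActive = { 1, 16 };

/* Parse a decimal string into [lo, hi]. Writes *out only on success. */
static int
SampleParseRange(const char *s, long lo, long hi, long *out)
{
   char *end;
   long v;

   if (s == NULL || *s == '\0') {
      return -1;
   }
   errno = 0;
   v = strtol(s, &end, 10);
   if (errno != 0 || *end != '\0' || v < lo || v > hi) {
      return -1;
   }
   *out = v;
   return 0;
}

/* Hypothetical diskPropertiesSet callback: stage, then commit. */
static VMIOF_Status
SampleDiskPropertiesSet(VMIOF_DiskHandle *handle,
                        const VMIOF_DiskFilterProperty *const *properties)
{
   SampleConfig staged = sampleActive;   /* work on a copy */
   size_t i;

   (void)handle;
   for (i = 0; properties != NULL && properties[i] != NULL; i++) {
      const VMIOF_DiskFilterProperty *p = properties[i];
      int rc = 0;

      if (p->name == NULL) {
         return VMIOF_FAILURE;
      }
      if (strcmp(p->name, "numWorkGroups") == 0) {
         rc = SampleParseRange(p->value, 1, 64, &staged.numWorkGroups);
      } else if (strcmp(p->name, "cacheMB") == 0) {
         rc = SampleParseRange(p->value, 1, 1024, &staged.cacheMB);
      }
      if (rc != 0) {
         return VMIOF_FAILURE;   /* sampleActive has not been touched */
      }
   }
   sampleActive = staged;        /* commit every change at once */
   return VMIOF_SUCCESS;
}
```

A real filter would typically also persist the committed values, for example to a sidecar, and defer any long-running consequences of the change to a later context.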


Understanding and Using the VMIOF SCSI Functions


VAIO provides functions and data structures that allow your IO Filter solution to access SCSI devices and
issue SCSI commands to physical host SCSI devices.

NOTE The SCSI commands issued from the IO Filter are batched and executed asynchronously.

Understanding the ScsiCommand Structure


SCSI commands can be issued from the IO Filter context (typically the Daemon). Note that the
framework does not perform any validation of the SCSI commands. It attempts to submit all the
commands, and the kernel fails any that are invalid. The SCSI command structure has the following
definition:

typedef struct VMIOF_ScsiCommand {
   VMIOF_ScsiCallback done;      /* Callback to issue when command is done. */
   void *doneData;               /* Data for done callback. */
   void *buffer;                 /* Buffer to perform IO against. */
   ssize_t length;               /* Desired transfer length. */
   uint64_t timeoutMS;           /* Command timeout in millisec. */
   VMIOF_ScsiCommandFlags flags; /* Command flags. */
   uint8_t *cdb;                 /* The SCSI CDB. */
   uint8_t cdbLen;               /* Length of the CDB. */
   uint8_t scsiStatus;           /* Command status. */
   uint16_t hostStatus;          /* Host status. */
   uint16_t driverStatus;        /* Driver status. */
   uint32_t bytesXferred;        /* Actual bytes transferred. */
   uint32_t senseSize;           /* Sense buffer size/length. */
   uint8_t *sense;               /* Sense buffer. */
   uint8_t reserved[48];         /* Reserved for internal/future use. */
} VMIOF_ScsiCommand;

The members of this structure are:

n VMIOF_ScsiCallback done — This is essentially a callback that is invoked by the IO filter framework for
each SCSI command issued to an iSCSI target.

n void *doneData — The parameter to be passed to the SCSI command callback function

n void *buffer — The buffer is the placeholder for the read/write operations performed by the SCSI
commands.

n ssize_t length — This is the size of the buffer being used for IO operations.

n uint64_t timeoutMS — This specifies the time in milliseconds after which, if no response has been
received, the SCSI command times out.

n VMIOF_ScsiCommandFlags flags — Flags that describe the SCSI commands are as follows:

typedef enum {
   VMIOF_SCSI_FLAG_NONE                  = 0,       /* Empty flag. */
   VMIOF_SCSI_FLAG_DIRECTION_TO_DEVICE   = 1u << 0, /* Direction of transfer is to device. */
   VMIOF_SCSI_FLAG_DIRECTION_FROM_DEVICE = 1u << 1, /* Direction of transfer is from device. */
   VMIOF_SCSI_FLAG_NO_RETRY              = 1u << 2, /* Upon transient failure, do not retry command. */
} VMIOF_ScsiCommandFlags;

NOTE If the command is bidirectional, then set both VMIOF_SCSI_FLAG_DIRECTION_TO_DEVICE and
VMIOF_SCSI_FLAG_DIRECTION_FROM_DEVICE, causing the buffer to serve as both the input buffer and
output buffer.

n uint8_t *cdb — This is the pointer to the SCSI CDB in which commands are specified.

n uint8_t cdbLen — The length of the CDB structure specified above.

n uint8_t scsiStatus — The status of the SCSI command issued.

n uint16_t hostStatus — This is the host adapter status (and error codes)

n uint16_t driverStatus — This is the status of the driver.

n uint32_t bytesXferred — This is a SCSI statistic to specify the actual number of bytes of data transferred
by SCSI.

n uint32_t senseSize — This is the size of the sense buffer.

n uint8_t *sense — This is the pointer to the sense buffer that holds status information.

NOTE The memory for the SCSI command and the buffers and cdb should be allocated from the Heap.
Similarly, at the end of the command, you should also free the memory.

NOTE The LI is allowed to use SCSI APIs, but the SCSI device must be opened by the Daemon.
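As a small illustration of how the direction flags combine, the sketch below builds a flags value from three booleans and tests for the bidirectional case described in the NOTE above. The enum is copied from the definition shown earlier so that the fragment compiles on its own. The helper names SampleScsiFlags and SampleScsiIsBidirectional are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the enum shown earlier so the sketch is self-contained. */
typedef enum {
   VMIOF_SCSI_FLAG_NONE                  = 0,
   VMIOF_SCSI_FLAG_DIRECTION_TO_DEVICE   = 1u << 0,
   VMIOF_SCSI_FLAG_DIRECTION_FROM_DEVICE = 1u << 1,
   VMIOF_SCSI_FLAG_NO_RETRY              = 1u << 2,
} VMIOF_ScsiCommandFlags;

/* Hypothetical helper: OR together the flags that apply to a command. */
static VMIOF_ScsiCommandFlags
SampleScsiFlags(bool toDevice, bool fromDevice, bool noRetry)
{
   unsigned flags = VMIOF_SCSI_FLAG_NONE;

   if (toDevice) {
      flags |= VMIOF_SCSI_FLAG_DIRECTION_TO_DEVICE;
   }
   if (fromDevice) {
      flags |= VMIOF_SCSI_FLAG_DIRECTION_FROM_DEVICE;
   }
   if (noRetry) {
      flags |= VMIOF_SCSI_FLAG_NO_RETRY;
   }
   return (VMIOF_ScsiCommandFlags)flags;
}

/* Hypothetical helper: true if the buffer serves as both input and output. */
static bool
SampleScsiIsBidirectional(VMIOF_ScsiCommandFlags flags)
{
   const unsigned both = VMIOF_SCSI_FLAG_DIRECTION_TO_DEVICE |
                         VMIOF_SCSI_FLAG_DIRECTION_FROM_DEVICE;

   return ((unsigned)flags & both) == both;
}
```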

Understanding VMIOF_ScsiHandle
The VAIO API provides an opaque data type VMIOF_ScsiHandle. This represents a handle to the SCSI disk to
which SCSI commands need to be issued. The handle is returned on a successful call to the
VMIOF_ScsiDiskOpen() function (discussed in detail in the next topic).

Understanding VMIOF_ScsiCallback()
The IO filter framework invokes the VMIOF_ScsiCallback() function for each SCSI command issued to an
iSCSI target. The function prototype is as follows:

typedef
void (*VMIOF_ScsiCallback) (void *data, VMIOF_Status status);

The parameters to this function are:

n void *data — This parameter holds any data pertaining to the SCSI command issued.

n VMIOF_Status status — This is the status of the operation performed.

Understanding VMIOF_ScsiEstimateHeapSize()
To perform SCSI operations from an IO Filter, you must dynamically allocate memory for the SCSI
components. To this end, you need to estimate the memory requirement by specifying the maximum
number of devices and SCSI commands that will be allocated on the user-defined heap (as described in the
Section Managing Memory in an IO Filter Solution). Use VMIOF_ScsiEstimateHeapSize to determine how
much heap space is required for the SCSI device handle and the command tracking information.

Once the heap-size that is required for SCSI operations is calculated, you should add this value to the total
heap required and create the heap using VMIOF_HeapCreate() functions described in the topic “Managing
Memory in an IO Filter Solution,” on page 130 .


The prototype of the function is as follows:

size_t
VMIOF_ScsiEstimateHeapSize(uint32_t numDevs, uint32_t numCmds)

The parameters to this function are:


n uint32_t numDevs — This is the number of SCSI devices to which scsi commands may be issued from
the IO Filter.

n uint32_t numCmds — This is the maximum number of SCSI commands that will be issued.

The function returns the size of the heap that will be required for SCSI operations. Include this amount
in the heap that you create.
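The following sketch shows the arithmetic of adding the SCSI estimate to the filter's own heap requirement, with a guard against size_t overflow. Because the real VMIOF_ScsiEstimateHeapSize() is only available in the VAIO environment, the estimator is passed in as a function pointer, and SampleFakeEstimate is a made-up stand-in whose numbers have no relation to the real function. The names SampleScsiEstimateFn and SampleTotalHeapSize are also hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the VMIOF_ScsiEstimateHeapSize() prototype shown above. */
typedef size_t (*SampleScsiEstimateFn)(uint32_t numDevs, uint32_t numCmds);

/* Made-up estimator for demonstration only; real values come from
 * VMIOF_ScsiEstimateHeapSize(). */
static size_t
SampleFakeEstimate(uint32_t numDevs, uint32_t numCmds)
{
   return (size_t)numDevs * 4096u + (size_t)numCmds * 512u;
}

/* Add the SCSI estimate to the filter's own requirement.
 * Returns false if the sum would overflow size_t. */
static bool
SampleTotalHeapSize(size_t filterBytes, uint32_t numDevs, uint32_t numCmds,
                    SampleScsiEstimateFn estimate, size_t *total)
{
   size_t scsiBytes = estimate(numDevs, numCmds);

   if (scsiBytes > SIZE_MAX - filterBytes) {
      return false;
   }
   *total = filterBytes + scsiBytes;
   return true;
}
```

In a real filter you would pass VMIOF_ScsiEstimateHeapSize as the estimator and hand the resulting total to the heap creation function.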

Understanding VMIOF_ScsiDiskOpen()
The prototype of the function is as follows:

VMIOF_Status
VMIOF_ScsiDiskOpen(VMIOF_HeapHandle *heap, const char *name, VMIOF_ScsiHandle **phandle)

The parameters to this function are:


n VMIOF_HeapHandle *heap — This is the opaque handle to the heap from which to allocate the memory.

n const char *name — This is the name of the SCSI disk to which the commands should be issued. Note
that the name does not include the path to the disk, because the IO Filter framework supplies the
“/dev/disk” path component itself.

n VMIOF_ScsiHandle **phandle — The handle to the SCSI disk to which commands will be issued.

The function returns the status of the operation: VMIOF_SUCCESS if the operation is successful, or an
appropriate failure status otherwise.

Understanding VMIOF_ScsiDiskClose()
Once you are done performing operations on the SCSI disk, you can close the disk by calling the
VMIOF_ScsiClose() function. The prototype of the function is as follows:

VMIOF_Status
VMIOF_ScsiClose(VMIOF_ScsiHandle *handle)

The parameters to this function are:


n VMIOF_ScsiHandle *handle — The handle to the SCSI disk which needs to be closed.

The function returns VMIOF_SUCCESS if the operation is successful or returns VMIOF_BAD_PARAM if the handle is
invalid.

Understanding VMIOF_ScsiCommandsIssue()
SCSI commands are issued to the disk using VMIOF_ScsiCommandsIssue(). The commands are
batched and completed asynchronously. On completion, each command invokes its callback
function (defined in the VMIOF_ScsiCommand structure). The prototype of the function is as follows:

VMIOF_Status
VMIOF_ScsiCommandsIssue(VMIOF_ScsiHandle *handle, VMIOF_ScsiCommand **cmds, uint32_t count,
uint32_t *submitted)

The parameters to this function are:


n VMIOF_ScsiHandle *handle — The handle to the SCSI disk to which the commands are issued.

n VMIOF_ScsiCommand **cmds — The array of pointers to commands that are batched and sent to the
SCSI disk. When a command completes, the memory allocated for it must be freed. Each command is freed
separately.


n uint32_t count — is the number of commands in the cmds array

n uint32_t *submitted — On return, this holds the number of commands that were successfully submitted.
Malformed commands are not submitted, and neither are any commands that follow the first malformed
command in the array.

The function returns VMIOF_SUCCESS if all the commands were submitted. It returns VMIOF_BAD_PARAM if the
command is malformed. It returns VMIOF_NO_MEMORY if there is not enough heap space; VMIOF_NO_RESOURCES if
the system does not have enough resources.
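When fewer commands are submitted than were passed in, the caller still owns the commands at index submitted and beyond. The sketch below releases them, on the assumption (implied by the description above but not stated outright) that unsubmitted commands never reach their done callback. The VMIOF_ScsiCommand definition is a trimmed local stand-in, free() stands in for the filter's heap functions, and the names SampleFreeCommand and SampleReleaseUnsubmitted are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Trimmed stand-in for the real structure; only buffer is modelled. */
typedef struct VMIOF_ScsiCommand {
   void *buffer;
} VMIOF_ScsiCommand;

typedef void (*SampleCmdReleaseFn)(VMIOF_ScsiCommand *cmd);

/* Stand-in release routine; a real filter would return the memory to its heap. */
static void
SampleFreeCommand(VMIOF_ScsiCommand *cmd)
{
   free(cmd->buffer);
   free(cmd);
}

/* Release every command from index 'submitted' onward and clear its slot.
 * Returns the number of commands released. */
static uint32_t
SampleReleaseUnsubmitted(VMIOF_ScsiCommand **cmds, uint32_t count,
                         uint32_t submitted, SampleCmdReleaseFn release)
{
   uint32_t i;
   uint32_t released = 0;

   if (submitted > count) {
      submitted = count;   /* defensive: never index past the array */
   }
   for (i = submitted; i < count; i++) {
      if (cmds[i] != NULL) {
         release(cmds[i]);
         cmds[i] = NULL;
         released++;
      }
   }
   return released;
}
```

Commands below index submitted are left alone, since their completion callbacks are expected to free them.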

Example: Using the VMIOF_Scsi* functions


1. #include <stdlib.h>
2. #include <string.h>
3. #include <stdbool.h>
4. #include <semaphore.h>
5. #include <vmiof/vmiof_log.h>
6. #include <vmiof/vmiof_daemon.h>
7. #include <vmiof/vmiof_scsi.h>
8. #include <vmiof/vmiof_timer.h>
9. #include <vmiof/vmiof_work.h>
10. #include "assert.h"
11. #define __LOG(level, fmt, ...) \
12.     VMIOF_Log(level, "%s:%u: " fmt, __FUNCTION__, __LINE__, ##__VA_ARGS__)
13. #define LOG(fmt, ...) __LOG(VMIOF_LOG_INFO, fmt, ##__VA_ARGS__)
14. #define ERR(fmt, ...) __LOG(VMIOF_LOG_ERROR, fmt, ##__VA_ARGS__)
15. #define WARN(fmt, ...) __LOG(VMIOF_LOG_WARNING, fmt, ##__VA_ARGS__)
16. /* scsi command buffer size */
17. #define MIN_BUFSZ 255
18. #define MAX_BUFSZ (MIN_BUFSZ + 512)
19. /* scsi command size */
20. #define CMDSZ 6
21. /* scsi sense buffer size */
22. #define SENSE_BUFSZ 64
23. /* bad sense buffer address to trigger IO failure */
24. #define BAD_SENSE_BUF_ADDR ((void *)0xdeadbeef)
25. /* number of commands to issue */
26. #define NUM_CMDS 5
27. static VMIOF_ScsiHandle *scsiDisk;
28. static VMIOF_HeapHandle *heap;
29. static const char *diskName;
30. static sem_t sem;
31. static VMIOF_TimerHandle *timer;
32. static VMIOF_WorkGroup *workGroup;
33. static void ScsiTestCallback(void *data, VMIOF_Status status)
34. {
35.     VMIOF_ScsiCommand *c = data;
36.     LOG("cmd=%p status=%d flags=%#x\n", (void *)c, status, c->flags);
37.     VMIOF_HeapFree(heap, c->buffer);
38.     VMIOF_HeapFree(heap, c->cdb);
39.     if (c->sense != BAD_SENSE_BUF_ADDR) {
40.         VMIOF_HeapFree(heap, c->sense);
41.     }
42.     VMIOF_HeapFree(heap, c);
43. }
44. static VMIOF_ScsiCommand *ScsiTestMakeCommand(void)
45. {
46.     VMIOF_ScsiCommand *c;
47.     uint32_t len;
48.     c = VMIOF_HeapAllocate(heap, sizeof *c);
49.     assert(c != NULL);
50.     memset(c, 0, sizeof *c);
51.     len = MIN_BUFSZ + (rand() % (MAX_BUFSZ - MIN_BUFSZ));
52.     c->done = ScsiTestCallback;
53.     c->doneData = c;
54.     c->buffer = VMIOF_HeapAllocate(heap, len);
55.     assert(c->buffer != NULL);
56.     c->length = len;
57.     c->timeoutMS = 0;
58.     c->flags = VMIOF_SCSI_FLAG_NONE;
59.     c->cdb = VMIOF_HeapAllocate(heap, CMDSZ);
60.     assert(c->cdb != NULL);
61.     c->cdb[0] = 0x12; /* Opcode for Inquiry Command */
62.     c->cdb[1] = 0; /* Misc CDB Info */
63.     c->cdb[2] = 0;
64.     c->cdb[3] = (len >> 8) & 0xff; /* MSB of Allocation Length (len can exceed 255) */
65.     c->cdb[4] = len & 0xff; /* LSB of Allocation Length; an Inquiry Command needs at least 5 */
66.     c->cdb[5] = 0; /* Control information */
67.     c->cdbLen = CMDSZ;
68.     assert(CMDSZ >= 6);
69.     c->sense = VMIOF_HeapAllocate(heap, SENSE_BUFSZ);
70.     assert(c->sense != NULL);
71.     c->senseSize = SENSE_BUFSZ;
72.     LOG("Inquiry Command CDB: %x %x %x %x %x %x \n",
73.         c->cdb[0], c->cdb[1], c->cdb[2], c->cdb[3], c->cdb[4], c->cdb[5]);
74.     return c;
75. }
76. static void ScsiTestCleanupTimerCb(void *data)
77. {
78.     int exitVal = (int)(uintptr_t)data;
79.     VMIOF_TimerRemove(timer);
80.     VMIOF_WorkGroupWait(workGroup);
81.     VMIOF_WorkGroupFree(workGroup);
82.     workGroup = NULL;
83.     VMIOF_ScsiClose(scsiDisk);
84.     VMIOF_HeapDestroy(heap);
85.     exit(exitVal);
86. }
87. static void ScsiTestWorker(void *data)
88. {
89.     VMIOF_ScsiCommand *cmds[NUM_CMDS];
90.     VMIOF_ScsiCommand *invalidCmd;
91.     uint32_t i, numPending;
92.     VMIOF_Status status;
93.     for (i = 0; i < NUM_CMDS; i ++) {
94.         cmds[i] = ScsiTestMakeCommand();
95.     }
96.     status = VMIOF_ScsiCommandsIssue(scsiDisk, cmds, i, &numPending);
97.     LOG("VMIOF_ScsiCommandsIssue: status=%d i=%u numPending=%u\n", status, i, numPending);
98.     if (numPending == 0) {
99.         VMIOF_TimerAdd(1, ScsiTestCleanupTimerCb, (void *)1 /* exit status */, &timer);
100.         return;
101.     }
102.     VMIOF_TimerAdd(1, ScsiTestCleanupTimerCb, (void *)0 /* exit status */, &timer);
103. }
104. static VMIOF_Status ScsiTestStart(void)
105. {
106.     VMIOF_Status status;
107.     size_t heapSize;
108.     int err;
109.     diskName = getenv("TEST_DISK");
110.     if (diskName == NULL) {
111.         return VMIOF_BAD_PARAM;
112.     }
113.     heapSize = VMIOF_ScsiEstimateHeapSize(1, NUM_CMDS);
114.     {
115.         VMIOF_HeapAllocation allocs[] = {
                 { .size = sizeof(VMIOF_ScsiCommand), .count = NUM_CMDS },
                 { .size = MAX_BUFSZ, .count = NUM_CMDS },
                 { .size = CMDSZ, .count = NUM_CMDS },
                 { .size = SENSE_BUFSZ, .count = NUM_CMDS },
             };
116.         heapSize += VMIOF_HeapEstimateRequiredSize(allocs, 4);
117.     }
118.     status = VMIOF_HeapCreate(heapSize, &heap);
119.     if (status != VMIOF_SUCCESS) {
120.         ERR("Failed to create %zu sized heap.\n", heapSize);
121.         return status;
122.     }
123.     assert(status == VMIOF_SUCCESS);
124.     status = VMIOF_ScsiDiskOpen(heap, diskName, &scsiDisk);
125.     if (status != VMIOF_SUCCESS) {
126.         ERR("Failed to open %s :%d\n", diskName, status);
127.         VMIOF_HeapDestroy(heap);
128.         return status;
129.     }
130.     status = VMIOF_WorkGroupAlloc(1, &workGroup);
131.     if (status != VMIOF_SUCCESS) {
132.         ERR("Failed to create work group: %d\n", status);
133.         VMIOF_ScsiClose(scsiDisk);
134.         VMIOF_HeapDestroy(heap);
135.         return status;
136.     }
137.     status = VMIOF_WorkQueue(workGroup, ScsiTestWorker, NULL);
138.     if (status != VMIOF_SUCCESS) {
139.         ERR("Failed to start ScsiTestWorker: %d\n", status);
140.         VMIOF_WorkGroupFree(workGroup);
141.         VMIOF_ScsiClose(scsiDisk);
142.         VMIOF_HeapDestroy(heap);
143.         return status;
144.     }
145.     return VMIOF_SUCCESS;
146. }
147. static void ScsiTestStop(VMIOF_DaemonStoppedCB stoppedCb, void *data)
148. {
149.     VMIOF_ScsiClose(scsiDisk);
150.     VMIOF_HeapDestroy(heap);
151.     stoppedCb(data);
152. }
153. static void ScsiTestCleanup(void)
154. {
155.     /* Any required cleanup action */
156. }
157. VMIOF_DEFINE_DAEMON(
158.     .start = ScsiTestStart,
159.     .stop = ScsiTestStop,
160.     .cleanup = ScsiTestCleanup
161. );

This code works as follows:

n Lines 1-32: include the necessary header files and define the macros, constants, and static variables that
the sample code uses.

n Lines 33-43: define the callback that is invoked when an issued SCSI command completes. The callback logs
the status of the command, frees the memory allocated for the IO buffer, the CDB, and the sense buffer,
and then frees the memory allocated to the SCSI command itself.

n Lines 44-75: create a sample SCSI INQUIRY command.

n Line 48: allocates memory for the SCSI command.

n Line 52: sets the callback function to invoke when the command completes.

n Lines 61-66: fill the CDB with the SCSI command parameters.

n Lines 76-86: define the timer callback that waits for the outstanding work group, releases the allocated
resources, and exits.

n Lines 87-103: define the worker function that issues the SCSI commands to the SCSI disk.

n Line 94: calls the function that fills a VMIOF_ScsiCommand structure with the appropriate command
parameters.

n Lines 96-97: issue the commands and log the result. On return, numPending holds the number of
commands that were successfully submitted and are therefore pending completion.

n Lines 104-146: define the function that is invoked when the daemon starts.

n Lines 109-112: obtain the name of the disk that the SCSI commands are issued against. The disk name
is read from the TEST_DISK environment variable, and the function returns an error if the variable is
not set. See the section "How to get the SCSI disk name" to find the disk name.

n Lines 113-117: estimate the memory requirements for the SCSI commands.

n Line 118: creates the heap and returns the heap handle that is used to allocate memory.

n Line 124: opens the SCSI disk.

n Lines 130-144: create the work group and submit the worker that is responsible for issuing the
SCSI commands.

n Lines 147-152: define the daemon stop callback.

n Lines 153-156: define the daemon cleanup callback.

n Lines 157-161: register the callback functions for the daemon.
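
The buffer length chosen on line 51 can be larger than 255, so the allocation length does not fit in a single CDB byte and has to be spread over bytes 3 and 4. The following standalone sketch shows one way to encode it. It assumes the two-byte allocation length field of the 6-byte INQUIRY CDB; BuildInquiryCdb and the two macros are hypothetical names used only for illustration and are not part of the VMIOF API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INQUIRY_OPCODE  0x12  /* SCSI INQUIRY operation code */
#define INQUIRY_CDB_LEN 6     /* INQUIRY uses a 6-byte CDB */

/*
 * Hypothetical helper (not part of the VMIOF API): fill a 6-byte INQUIRY CDB
 * for a data buffer of bufLen bytes. Returns the allocation length that was
 * actually encoded, which is bufLen clamped to the 16-bit field.
 */
uint16_t BuildInquiryCdb(uint8_t cdb[INQUIRY_CDB_LEN], size_t bufLen)
{
    uint16_t allocLen = (bufLen > UINT16_MAX) ? UINT16_MAX : (uint16_t)bufLen;

    memset(cdb, 0, INQUIRY_CDB_LEN);      /* bytes 1, 2, and 5 stay zero */
    cdb[0] = INQUIRY_OPCODE;
    cdb[3] = (uint8_t)(allocLen >> 8);    /* allocation length, MSB */
    cdb[4] = (uint8_t)(allocLen & 0xff);  /* allocation length, LSB */
    return allocLen;
}
```

For a 766-byte buffer, the largest size the sample can pick, this stores 0x02 in byte 3 and 0xFE in byte 4.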

How to run the sample program


In order to run the above sample program, you will need to first compile the code as the daemon component
for one of the filters.

1 Change directory to the countIO directory

$ pwd
/opt/vmware/vaiodk-6.0.-2897841/src/partners/samples/iofilters
$ cd countIO

2 Replace the countIODaemon.c file with the sample code shown above. Note that the sample code issues
SCSI commands from within the daemon context of the framework.

3 Compile the filter and copy the VIB as described in the section “Chapter Summary”.

4 After you have installed the countio IO Filter on the ESX host and identified the name of the SCSI disk
that you want to target (see the section "How to get the SCSI disk name"), type the following on your
ESX console:

# TEST_DISK=eui.636f6b6570652064 /usr/lib/vmware/iofilter/bin/iofilterd --filter countio --mempool=100

5 The logs for this operation are written to /var/log/iofilterd-countio.log.

Sample logs:

2015-07-28T16:57:37Z iofilterd-countio[1000546996]: Starting the daemon for filter countio
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: WORKER: Creating new group with numThreads=1 (1)
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestMakeCommand:216: Inquiry Command CDB: 12 0 0 0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestMakeCommand:216: Inquiry Command CDB: 12 0 0 0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestMakeCommand:216: Inquiry Command CDB: 12 0 0 0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestMakeCommand:216: Inquiry Command CDB: 12 0 0 0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestMakeCommand:216: Inquiry Command CDB: 12 0 0 0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestWorker:290: VMIOF_ScsiCommandsIssue: status=0 i
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestCallback:84: cmd=B043F4D0 status=0 flags=0x4
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestCallback:85: scsiStatus=0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestCallback:86: hostStatus=0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestCallback:87: driverStatus=0
2015-07-28T16:57:37Z iofilterd-countio[1000546996]: ScsiTestCallback:88: bytesXferred=36

How to get the SCSI disk name


The SCSI functions described in the section “Understanding and Using VMIOF SCSI functions” need the name
of the SCSI disk to operate on. This section lists the steps to identify that name. You need access to an
ESX server with the SCSI disk attached to it.

On the ESX console, type the following command:

# esxcli storage core device list | grep "Display Name"


Display Name: Local TSSTcorp CD-ROM (mpx.vmhba36:C0:T0:L0)
Has Settable Display Name: false
Display Name: Local DELL Disk (naa.6b083fe0bf1f32001bd84a99083d89bd) =======> This is the SCSI disk
Has Settable Display Name: true
Display Name: Local DP Enclosure Svc Dev (t10.DP______BACKPLANE000000)
Has Settable Display Name: true

The name of the SCSI disk is the value within the parentheses for the parameter “Display Name”. In the
above example, the name of the SCSI disk is “naa.6b083fe0bf1f32001bd84a99083d89bd”.
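
If you collect the disk name in a tool rather than by eye, the parenthesized value can be pulled out of a "Display Name" line with a few string calls. The following is a minimal sketch, assuming the line format shown in the output above; ExtractDiskName is a hypothetical helper name and is not part of esxcli or the VMIOF API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the text between the last '(' and the last ')'
 * of an esxcli "Display Name:" line into out. Returns 0 on success, or -1
 * if the line has no parenthesized name or out is too small.
 */
int ExtractDiskName(const char *line, char *out, size_t outSize)
{
    const char *lparen = strrchr(line, '(');
    const char *rparen = strrchr(line, ')');
    size_t len;

    /* Reject lines without a non-empty "(...)" group. */
    if (lparen == NULL || rparen == NULL || rparen <= lparen + 1) {
        return -1;
    }
    len = (size_t)(rparen - lparen - 1);
    if (len >= outSize) {
        return -1;  /* no room for the name plus the terminating NUL */
    }
    memcpy(out, lparen + 1, len);
    out[len] = '\0';
    return 0;
}
```

Lines such as "Has Settable Display Name: true" contain no parentheses, so the helper returns -1 for them and they can be skipped.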

NOTE The SCSI disk that you issue SCSI commands against must not be in active use.

NOTE You can also use the coredump partition as a SCSI disk. This should not be done in a production
environment.

1 On the ESX console, type the following command to list the coredump partition

# esxcli system coredump partition list


Name                                    Path                                                        Active  Configured
--------------------------------------  ----------------------------------------------------------  ------  ----------
naa.600508b1001cec2e3a4f5c263327695d:2  /vmfs/devices/disks/naa.600508b1001cec2e3a4f5c263327695d:2  true    true

2 Set the "Active" parameter to "false"

# esxcli system coredump partition set -e false

You can now use this coredump partition as a valid SCSI disk against which you can issue SCSI commands.

WHAT'S NEW The 60U2 release of the IO Filter Framework fixes a race condition between VMIOF_ScsiClose and
SCSI command callbacks. It is now acceptable to call VMIOF_ScsiClose from the last SCSI command callback.

In 60U1, you can work around this as follows: first, make sure that all the SCSI command callbacks have been
called. Then call VMIOF_ScsiClose from a poll callback or a timer callback to close the device, and
afterwards call VMIOF_HeapDestroy to destroy the heap for the device.
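
One way to know that all the SCSI command callbacks have been called is to count the commands that are still outstanding. The following is a minimal sketch of such a counter using C11 atomics. The PendingTracker type and its functions are hypothetical names for illustration only, and the sketch assumes that your toolchain provides <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tracker for commands that were submitted but not yet completed. */
typedef struct {
    atomic_uint outstanding;
} PendingTracker;

void PendingInit(PendingTracker *t)
{
    atomic_init(&t->outstanding, 0);
}

/* Call after a successful submit, with the number of commands submitted. */
void PendingSubmitted(PendingTracker *t, unsigned int count)
{
    atomic_fetch_add(&t->outstanding, count);
}

/* Call from each command callback. Returns true for the last outstanding command. */
bool PendingCompleted(PendingTracker *t)
{
    return atomic_fetch_sub(&t->outstanding, 1) == 1;
}

/* True when no callbacks are pending, so it is safe to schedule the close. */
bool PendingIsIdle(PendingTracker *t)
{
    return atomic_load(&t->outstanding) == 0;
}
```

In this design the command callback only records the completion. A poll or timer callback checks PendingIsIdle() and, once it returns true, closes the device and then destroys the heap.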

Chapter Summary
The topics in this chapter presented details such that you should now be able to:

n Use sidecars in an IO Filter Solution

n Use the IO Filter timer functions in a Solution

n Use the IO Filter polling functions in a Solution

n Process IO transactions in a Library Instance, including:

n Understanding the data structures used in IO transactions

n Creating new and duplicating existing IO transactions

n Submitting new and duplicated IO transactions to the framework

n Process xMigration events in an IO Filter Solution

n Process Snapshot events in an IO Filter Solution

Review Questions - Fleshing IO Filter Library Component Source


1 Which of the following should a diskClose callback do? (choose all that apply)

a Reset the stun/unstun level to zero

b Close any sidecar file(s) opened for the VMDK

c Destroy any heap(s) created during diskOpen

d Stop the VM

2 Which of the following should a diskOpen callback do? (choose all that apply)

a Create sidecar file(s) needed for the VMDK

b Create Heap(s) needed for the VMDK

c Start doing background IOs for replication filters

d Record the pathname of the VMDK in the sidecar file

3 What is the difference between instance data and sidecar data? (fill in the blank)

4 What is the difference between a diskAttach and a diskOpen event? (fill in the blank)

5 When does a filter know that a diskVmMigration has completed successfully? (fill in the blank)

6 Which two callbacks does the IO Filter Framework use to set capabilities / filter properties? (fill in the
blanks)

7 How does a filter know when its VMDK is getting deleted? (fill in the blank)

Chapter 6 Using Cache Solution-Specific Functions
Chapter Objectives and Topics
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:
n Use the Cache functions in the vSphere IO Filter API

This chapter includes the following topics:

n “Understanding and Using the VMIOF_Cache*() Functions”

n “Updating the Dirty State of the VMDK using VMIOF_DiskContentsDirtySet()”

n “Chapter Summary”

n “Review Questions - Using Cache Solution-Specific Functions”

Understanding and Using the VMIOF_Cache*() Functions

The VAIO includes functions to manage caches, often backed by SSD devices. Each ESXi instance that has
caching media maintains a single vFlash File System (VFFS) volume that aggregates all of its caching
media. The VAIO VMIOF_Cache*() functions provide methods for IO Filters to create, manage, and delete
cache files on the VFFS volume.

For cache class filters, VMware highly recommends a design pattern in which the filter's Daemon performs
all operations on the cache file. This greatly simplifies many other issues related to filter design. With
the Daemon owning the cache file, the LIs must communicate with the Daemon to request cache entries for
reads and to update the cache for writes. The protocol used between the LI and the Daemon is completely
up to the implementor. That said, VMware strongly suggests that the design include the following:

n The LIs should set up buffers for IO blocks, and share them with the Daemon via the VMIOF_crossfd*()
functions.

n The Daemon should use the VMIOF_AIO*() functions to perform scatter-gather IO between these crossfd
buffers and the cache file. The Daemon must place blocks for read hits into crossfd buffers, and copy
write data from crossfd buffers into the cache file.

The Daemon uses the functions discussed in the following sub-sections to manage the cache file.

Using the VMIOF_CacheFileVolumeIsAvailable() function


Use this function to determine whether the ESXi host has a VFFS volume. This function takes no parameters.
It returns true if the ESXi host contains a VFFS volume, and false otherwise. You can only create a cache file
on hosts that contain a VFFS volume. Use this function before attempting to create a cache file on the VFFS
volume.

This function has the following prototype:

bool VMIOF_CacheFileVolumeIsAvailable(void);

Here, bool is defined by <stdbool.h>, which is included by <vmiof_cache.h>, which is included by <vmiof.h>.

Using the VMIOF_CacheFileVolumeGetAvailableSpace() function


Use this function to determine how much space is available on the VFFS volume, if any. This function has
the following prototype:

VMIOF_Status
VMIOF_CacheFileVolumeGetAvailableSpace(uint64_t *spaceInMB);

If this function returns VMIOF_SUCCESS, it stores the number of megabytes available for new cache files in the
spaceInMB parameter. If it is unable to find the cache volume on the host, it returns VMIOF_NOT_FOUND. If it
is unable to get the free space available, it returns VMIOF_FAILURE.

Using the VMIOF_CacheFileCreate() function


Use this function to create a cache file once you have determined that a VFFS exists and that it has sufficient
space for your needs. The prototype for this function is:

VMIOF_Status VMIOF_CacheFileCreate(const VMIOF_CacheFileParam *param);

The only parameter is a structure of type VMIOF_CacheFileParam, which has the following three members:

typedef struct VMIOF_CacheFileParam {
   /** Name of the cache which uniquely identifies it */
   const char *name;
   /** Size of a cache file in MB */
   uint64_t sizeInMB;
   /**
    * Flag to ensure that the space for the file is
    * pre-allocated in one contiguous chunk.
    */
   bool needContiguous;
} VMIOF_CacheFileParam;

n const char *name — A name for the cache file to be created. To prevent collisions between competing
users of a VFFS, use a URI-style name of the form com.vendor.filter_name. If you decide to have
multiple cache files in your filter, use the form com.vendor.filter_name.cache_nameX.

n uint64_t sizeInMB — The size of the cache file to create, in megabytes

n bool needContiguous — Set to true if the solution requires the space in the cache file to be contiguous
within the VFFS volume. If it is true, space is pre-allocated. If it is false, writes can fail due
to lack of space, and it is even possible to create a cache file that is larger than the VFFS partition.

NOTE VMIOF_CacheFileCreate does not fail due to fragmentation. Even if you set needContiguous to true,
you can still consume the entire VFFS volume.

The return values are:

n VMIOF_SUCCESS — The function successfully created the cache file.

n VMIOF_ALREADY_EXISTS — A cache file with the given name already exists.

n VMIOF_NO_SPACE — There is not enough space to create the cache file.

n VMIOF_FAILURE — The function was unable to create the cache file.

NOTE While you may consider having one cache file per VMDK processed by a filter, VMware
recommends that each IO Filter Solution have a single cache file that it uses for all VMDKs it processes.
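
The naming convention for cache files can be applied with a small formatting helper. The following is a minimal sketch that builds names of the form com.vendor.filter_name and com.vendor.filter_name.cache_nameX. FormatCacheFileName is a hypothetical helper, not part of the VMIOF API, and the vendor and filter strings in the usage note are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "com.<vendor>.<filter>" or, when cacheName is
 * not NULL, "com.<vendor>.<filter>.<cacheName>". Returns 0 on success and
 * -1 if the result does not fit in out.
 */
int FormatCacheFileName(char *out, size_t outSize, const char *vendor,
                        const char *filter, const char *cacheName)
{
    int n;

    if (cacheName != NULL) {
        n = snprintf(out, outSize, "com.%s.%s.%s", vendor, filter, cacheName);
    } else {
        n = snprintf(out, outSize, "com.%s.%s", vendor, filter);
    }
    /* snprintf reports the untruncated length, so n >= outSize means truncation. */
    return (n < 0 || (size_t)n >= outSize) ? -1 : 0;
}
```

For example, FormatCacheFileName(name, sizeof name, "example", "countio", NULL) produces "com.example.countio", which you could then store in the name member of VMIOF_CacheFileParam.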

Using the VMIOF_CacheFileHandleAllocationSize() function

Use this function to estimate the allocation size for a VMIOF_CacheFileHandle.

The prototype for this function is:

size_t VMIOF_CacheFileHandleAllocationSize(void);

The return value is the minimum allocation size required to accommodate the cache file handle.

Using the VMIOF_CacheFileOpen() function

Use this function to open a cache file and get its handle. Use the returned file descriptor if you want to
read from or write to the file using the POSIX APIs.

The prototype for this function is:

VMIOF_Status VMIOF_CacheFileOpen(const char *name, VMIOF_HeapHandle *heap,
                                 VMIOF_CacheFileHandle **phandle, int *fd)

n const char *name — The name of the existing cache file.

n VMIOF_HeapHandle *heap — The heap from which the handle gets allocated.

n VMIOF_CacheFileHandle **phandle — Pointer to the handle of the cache file.

n int *fd — An optional file descriptor for the cache file.

The return values are:

n VMIOF_SUCCESS — The function successfully opened the cache file.

n VMIOF_NOT_FOUND — The cache file with the given name was not found.

n VMIOF_BUSY — The cache file is currently in use.

n VMIOF_NO_MEMORY — There is not enough memory to allocate the handle.

n VMIOF_FAILURE — The function was unable to get a handle for the cache file.

NOTE You are expected to close the cache file by calling VMIOF_CacheFileClose() once all operations are
complete.

Using the VMIOF_CacheFileClose() function

Use this function to close the cache file.

The prototype for this function is:

void VMIOF_CacheFileClose(VMIOF_CacheFileHandle *handle)

n VMIOF_CacheFileHandle *handle — Handle to the cache file

Using the VMIOF_CacheFileDelete() function

When your filter no longer needs a cache file, it should delete that file. To do this, invoke
VMIOF_CacheFileDelete().

The prototype for this function is:

VMIOF_Status VMIOF_CacheFileDelete(const char *name)

n const char *name — The name of the existing cache file.

The return values are:

n VMIOF_SUCCESS — The function successfully removed the cache file.

n VMIOF_NOT_FOUND — The cache file with the given name was not found.

n VMIOF_BUSY — The cache file is currently in use.

NOTE You are expected to close the cache file before calling this function.

Using the VMIOF_CacheFileResize() function

If you determine that the initial cache file size was incorrect, you can use VMIOF_CacheFileResize()
to change the size of the file. You can use this function only while the cache file is closed.

The prototype for this function is:

VMIOF_Status VMIOF_CacheFileResize(const char *name, uint64_t sizeInMB)

n const char *name — The name of the existing cache file.

n uint64_t sizeInMB — The new size of the cache file in MB.

The return values are:

n VMIOF_SUCCESS — The function successfully resized the cache file.

n VMIOF_NOT_FOUND — The cache file with the given name was not found.

n VMIOF_BUSY — The cache file is currently in use.

n VMIOF_NO_SPACE — There is not enough space available to resize the cache file.

n VMIOF_FAILURE — The function was unable to resize the cache file.

Using the VMIOF_CacheFileDiscardRange() function

During normal filter operations, you will need to remove data from the cache, for example when a VM
whose VMDKs are cached is turned off. Use VMIOF_CacheFileDiscardRange() to free blocks no longer needed
in the cache. This function tells the underlying SSD to free blocks mapped to the given range. The prototype
of this function is:

VMIOF_Status VMIOF_CacheFileDiscardRange(VMIOF_CacheFileHandle *handle, uint64_t byteOffset,
                                         uint64_t numBytes, uint64_t *bytesFreed);

The parameters are:

n VMIOF_CacheFileHandle *handle — The handle to the cache file provided by VMIOF_CacheFileOpen()

n uint64_t byteOffset — The offset into the cache file to start freeing space

n uint64_t numBytes — The number of bytes to free

n uint64_t *bytesFreed — If the function returns VMIOF_SUCCESS, it writes the number of bytes actually
freed into this parameter

The return values are:

n VMIOF_SUCCESS — The function successfully freed the given number of bytes.

n VMIOF_BAD_PARAM — The given byteOffset is not within the file.

n VMIOF_NOT_SUPPORTED — The operation is not supported on files which are not contiguous.

n VMIOF_MISALIGNED — The byteOffset and/or numBytes parameters are not aligned to the correct block
size.

n VMIOF_FAILURE — The function was unable to discard the given range.

You can only perform this operation on non-contiguous cache files. You cannot free less than one block of
data on a cache file, as provided by VMIOF_CacheFileVolumeGetBlockSize(). Further, byteOffset and numBytes
should be multiples of the block size provided by said function.
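
Because byteOffset and numBytes must be multiples of the block size, a range that you want to discard usually has to be trimmed to whole blocks first. The following is a minimal sketch of that arithmetic. AlignDiscardRange is a hypothetical helper, not part of the VMIOF API, and blockSize is assumed to be the value reported by VMIOF_CacheFileVolumeGetBlockSize().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: shrink [byteOffset, byteOffset + numBytes) to the
 * largest sub-range whose start and length are multiples of blockSize.
 * Returns false if the range does not contain one whole block.
 */
bool AlignDiscardRange(uint64_t byteOffset, uint64_t numBytes, uint32_t blockSize,
                       uint64_t *alignedOffset, uint64_t *alignedBytes)
{
    uint64_t start, end;

    if (blockSize == 0 || numBytes > UINT64_MAX - byteOffset) {
        return false;  /* invalid block size, or the range end would overflow */
    }
    /* Round the start up and the end down to block boundaries. */
    start = byteOffset + (blockSize - byteOffset % blockSize) % blockSize;
    end = byteOffset + numBytes;
    end -= end % blockSize;
    if (start < byteOffset || end <= start) {
        return false;  /* rounding overflowed, or no whole block remains */
    }
    *alignedOffset = start;
    *alignedBytes = end - start;
    return true;
}
```

With a 4096-byte block size, a request for offset 1000 and length 10000 is trimmed to offset 4096 and length 4096, the one whole block inside the range.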

Using the VMIOF_CacheFileVolumeGetBlockSize() function

Use VMIOF_CacheFileVolumeGetBlockSize() to determine the block size used for files in the VFFS volume on a
host. The prototype of this function is:

VMIOF_Status VMIOF_CacheFileVolumeGetBlockSize(uint32_t *blockSize);

As shown, this function takes a single parameter. It places the size of the blocks in the VFFS volume in the
space pointed to by blockSize parameter upon returning VMIOF_SUCCESS. A return value of VMIOF_NOT_FOUND
indicates that it was unable to find the cache volume on the host. A return value of VMIOF_FAILURE indicates
that it was unable to get the block size of the cache file.

Updating the Dirty State of the VMDK using VMIOF_DiskContentsDirtySet()

It is vital that ESXi always knows when a VMDK is dirty (it has updates pending in memory / cache). In the
event of a user-space cartel crash before the Filter flushes the cached updates to the VMDK, ESXi must know
that the disk is dirty. ESXi prevents certain operations, such as cloning and snapshotting, of dirty VMDKs.

VAIO provides the VMIOF_DiskContentsDirtySet() function to set the dirty / clean state of a VMDK. Call this
function to mark a VMDK dirty when it has pending updates, and call it again to mark the VMDK clean once
those updates have been flushed. The prototype of this function is:

VMIOF_Status VMIOF_DiskContentsDirtySet(VMIOF_DiskHandle *handle, VMIOF_DirtyState dirty);

The parameters to this function are:

n VMIOF_DiskHandle *handle — Set this to the handle passed into the callback from which you invoke this
function

n VMIOF_DirtyState dirty — Set this to VMIOF_DISK_CLEAN to indicate the VMDK is clean, or
VMIOF_DISK_DIRTY to indicate that the disk is dirty.

VMIOF_DiskContentsDirtySet() returns VMIOF_SUCCESS if the operation succeeded and VMIOF_NO_MEMORY if
there is not enough memory to complete the request.

NOTE You can only use this function from the diskOpen and the diskClose callbacks.

Typically, only write-back caching solutions need to use this routine.

Chapter Summary
The topics in this chapter presented details such that you should now be able to:

n Use the Cache functions in the vSphere IO Filter API

Review Questions - Using Cache Solution-Specific Functions


1 Which function would you call to get a file descriptor to the cache file to use with the VAIO AIO
functions?

a VMIOF_CacheFileCreate

b VMIOF_CacheFileGetFD

c open

d VMIOF_CacheFileOpen

2 Which IO Filter component does VMware recommend manage and perform IOs to cache files?

a Library

b Daemon

c CIM Provider

d Either the Library or Daemon

3 How many cache files does VMware recommend a caching IO Filter Solution create?

a One

b One per VMDK it filters

c Two per VMDK it filters, one for clean data and one for dirty data

d VMware does not have a specific recommendation in this realm

Chapter 7 Understanding Additional Rules Tips and Tricks
Chapter Objectives and Topics
To achieve the tactical objectives of this course, after successfully completing this chapter, you should be
able to:

n Specify the components in which you cannot, should not, and may create threads using
pthread_create()

n Understand the use of blocking rules in Poll and Timer Callbacks, and WorkGroup Functions

n Understand the lifetime requirements of the Bundle Deployment URL

n Understand how to troubleshoot a vSphere Cluster and Filter Build Configuration

n Understand how to use a RAM Disk

n Understand when you can and cannot hold a lock across calls to the VAIO

n Understand the rules regarding the use of Sidecar Functions

n Understand how to debug the Heap

n Understand how to debug with different versions of ESXi

n Understand how to use the realtime clock functions

This chapter includes the following topics:

n “Understanding Rules for Using pthread_create()”

n “Blocking Rules in Poll and Timer Callbacks and WorkGroup Functions”

n “Bundle Deployment URL lifetime”

n “Troubleshooting vSphere Cluster and Filter Build Configuration (Checklist)”

n “Using the RAMDisk”

n “Holding a Lock across calls”

n “Understanding Sidecar Functions Invocations”

n “Heap Debugging”

n “Debugging With Different Versions of ESX”

n “Using the realtime clock functions”

n “Using Libraries in your Library Instance and Daemon”

n “Development Tips”

n “Frequently Asked Questions”

n “Chapter Summary”

Understanding Rules for Using pthread_create()

While you (as a filter developer) may want to create additional threads to increase the performance of your
solution, VMware has created the following rules and advice governing the use of pthread_create() in each of
the three components built by the VAIODK:

1 Filter Library Component — You cannot use pthread_create() in Filter Library code. The SDK looks
for pthread_create() in the source code at compile time and raises an error, preventing compilation. The
reason for this restriction is that Filter Libraries may run in resource-constrained environments, where
creating additional threads may harm the system. One example of this is hostd, which is constrained to
200 total threads in vSphere 6.0.

Instead, use VMIOF_WorkGroupAlloc() to create a thread pool and VMIOF_WorkQueue() to submit work to
the thread pool.

2 Filter Daemon Component — You may, but should not, use pthread_create() in Filter Daemon code.
VMware prefers that you use VMIOF_WorkGroupAlloc() to create a thread pool and VMIOF_WorkQueue() to
submit work to the thread pool.

3 CIM Provider — You may use pthread_create() as you wish within CIM provider code. That said,
please be considerate of others and limit the number of threads you create to just those you need to
perform CIM operations.

Understanding an Example of Why You Should Not Use pthread_create() in Library Code

This section explains why you should not use pthread_create() in the context of the library instance.
As prerequisites, you should understand the VAIO framework and the fact that it runs in a separate cartel
from the hostd cartel. You should also know how control is passed between threads and how callback
functions are invoked.

When you use pthread_create() in the context of a library instance, the thread is not allocated from the
thread pool. Instead, the thread is created directly in hostd. When the thread function runs, the context
that originally created the thread cannot be identified. As a result, you end up crashing hostd.

The following code snippet shows how the use of pthread_create() in the context of the library instance
can crash hostd. When a snapshot is requested, the diskSnapshot callback function
SampleFilterDiskSnapshot() is invoked. During the VMIOF_SNAPSHOT_PREPARE phase, it creates a thread via
pthread_create(). The thread function SampleFilterThreadSnapshotCallback() then tries to invoke the
progressFunc() callback. Because this thread is actually created in the hostd context, it has no knowledge
of either progressFunc or completionFunc, and the call crashes hostd. Note that the structure
SnapshotCallbackInfo is passed as a parameter to the thread created in the diskSnapshot callback.

1. typedef struct snCallbackInfo_s {
2.     VMIOF_DiskHandle *snhandle;
3.     VMIOF_DiskOpCompletionFunc completionFunc;
4.     VMIOF_DiskOpCompletionFunc progressFunc;
5. } SnapshotCallbackInfo;
6.
7. void *SampleFilterThreadSnapshotCallback(void *buf)
8. {
9.     SnapshotCallbackInfo *sninfo = (SnapshotCallbackInfo *)buf;
10.     VMIOF_Log(VMIOF_LOG_INFO, "In the callback %s\n", __func__);
11.
12.     sleep(5);
13.
14.     if (sninfo->progressFunc != NULL) {
15.         sninfo->progressFunc(sninfo->snhandle, VMIOF_SUCCESS);
16.     }
17.     if (sninfo->completionFunc != NULL) {
18.         sninfo->completionFunc(sninfo->snhandle, VMIOF_SUCCESS);
19.     }
20.     return NULL; /* a pthread start routine returns void *, not a VMIOF_Status */
21. }
22.
23. VMIOF_Status SampleFilterDiskSnapshot(VMIOF_DiskHandle *handle,
24.                                       const VMIOF_DiskSnapshotInfo *info)
25. {
26.     int ret = 0;
27.     SnapshotCallbackInfo *buf = NULL;
28.     buf = (SnapshotCallbackInfo *)malloc(sizeof(SnapshotCallbackInfo));
29.     buf->snhandle = handle;
30.     buf->progressFunc = info->progressFunc;
31.     buf->completionFunc = info->completionFunc;
32.
33.     VMIOF_Log(VMIOF_LOG_INFO, "In the callback %s\n", __func__);
34.
35.     if (info->phase == VMIOF_SNAPSHOT_PREPARE) {
36.         pthread_t thread_info;
37.         pthread_attr_t attr;
38.         pthread_attr_init(&attr);
39.         pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
40.         ret = pthread_create(&thread_info, &attr,
41.                              SampleFilterThreadSnapshotCallback, buf);
42.         VMIOF_Log(VMIOF_LOG_INFO, "%s : Created thread\n", __func__);
43.         return VMIOF_ASYNC;
44.     }
45.
46.     return VMIOF_SUCCESS;
47.
48. }

The code snippet above is explained in detail below:

n Lines 1-5 : Define the structure SnapshotCallbackInfo, which stores information about the disk that
is being snapshotted.

n Lines 7-21 : Define the thread function that runs in the thread created using pthread_create().

n Line 9 : The parameter buf is cast to SnapshotCallbackInfo * and stored in sninfo.

n Lines 14-19 : Call progressFunc and completionFunc. This is where control should be
transferred to the LI context. However, because the thread is created in the context of hostd and
has no knowledge of either progressFunc or completionFunc, hostd crashes.

n Lines 23-48 : Define the diskSnapshot callback function, SampleFilterDiskSnapshot().

n Lines 28-31 : Allocate memory for a variable of type SnapshotCallbackInfo and assign values to the
members of this structure.


n Lines 33-44 : Check whether the diskSnapshot callback was invoked in the VMIOF_SNAPSHOT_PREPARE
phase. If so, create a thread using pthread_create(), with SampleFilterThreadSnapshotCallback() as
its thread function, and return VMIOF_ASYNC on line 43.

When this code sample is executed by initiating a snapshot operation on a disk to which the
sampfilt filter is attached, hostd crashes. The stack trace looks as follows :

Program terminated with signal 6, Aborted.


(gdb) bt
#0 0x0664a092 in _dl_sysinfo_int80 () from /tmp/debug-uw.blRNZmhk/lib/ld-linux.so.2
#1 0x0c4ddca5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67
#2 0x0c4df4e3 in abort () at abort.c:92
#3 0x06d447c3 in Vmacore::System::SignalTerminateHandler (info=0x26b7f888, ctx=0x26b7f904)
at bora/vim/lib/vmacore/posix/defSigHandlers.cpp:65
#4 0x003ef002 in ?? ()
#5 0x070e540b in Vmomi::PropertyCollectorInt::PropertyCollectorImpl::TriggerProcessGUReqs
(this=0x25c5ade8,
filter=0x255d7248) at bora/vim/lib/vmomi/propertyCollector.cpp:1410
#6 0x070e6adc in Vmomi::PropertyCollectorInt::FilterImpl::NotifyChange (this=0x255d7248,
moRef=0x25696f78,
destroyed=false) at bora/vim/lib/vmomi/propertyCollector.cpp:423
#7 0x070fb3cd in Vmomi::PropertyJournalImpl::RecordAndNotifyChangeInt (this=0x256aafe0,
moRef=0x25696f78, changes=...)
at bora/vim/lib/vmomi/propertyJournal.cpp:771
#8 0x070fb5c5 in Vmomi::PropertyJournalImpl::CommitChangesAndNotify (this=0x256aafe0,
moRef=0x25696f78, changes=...,
values=...) at bora/vim/lib/vmomi/propertyJournal.cpp:953
#9 0x0710234e in Vmomi::PropertyProviderMixin::BeforeWriteLockRelease (this=0x256ab5ac)
at bora/vim/lib/vmomi/propertyProvider.cpp:1823
#10 0x05cce2bd in Unlock (this=0x256ab5ac) at bora/vim/lib/public/vmacore/system/syncImpl.h:137
#11 HostdCommon::HaManagedObjectImpl::Unlock (this=0x256ab5ac) at
bora/vim/hostd/common/HaManagedObjectImpl.cpp:81
#12 0x05cb647c in HostdCommon::AuthEntityBaseImpl::Unlock (this=0x256ab5ac) at
bora/vim/hostd/common/baseEntityImpl.cpp:380
#13 0x05e9d810 in ~WriteSynchronized (this=0x256ab5a8, progress=0) at
bora/vim/lib/public/vmacore/system/syncRaii.h:288
#14 Vimsvc::HaTaskImpl::SetProgress (this=0x256ab5a8, progress=0) at
bora/vim/hostd/vimsvc/HaTaskImpl.cpp:563
#15 0x05c88c64 in VigorVim::UpdateTask (task=0x256ab648, percent=0) at
bora/vim/lib/vigorVim/vigorVimOp.cpp:149
#16 0x05c8aed4 in operator()<void (*)(Vimsvc::HaTask*, int), boost::_bi::list1<int&> >
(function_obj_ptr=..., a0=0)
at compcache/boost1550_lin32_gcc463/ob-1944553/linux/include/boost/bind/bind.hpp:313
#17 operator()<int> (function_obj_ptr=..., a0=0)
at compcache/boost1550_lin32_gcc463/ob-1944553/linux/include/boost/bind/bind_template.hpp:32
#18 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, void (*)
(Vimsvc::HaTask*, int), boost::_bi::list2<boost::_bi::value<Vmacore::Ref<Vimsvc::HaTask> >,
boost::arg<1> > >, void, int>::invoke (function_obj_ptr=..., a0=0)
at compcache/boost1550_lin32_gcc463/ob-1944553/linux/include/boost/function/function_template.hpp:153
#19 0x05c89f9c in operator() (this=0x259bbcc0, current=0, maximum=512)
at compcache/boost1550_lin32_gcc463/ob-1944553/linux/include/boost/function/function_template.hpp:767
#20 VigorVim::VigorVimOp::ReportProgress (this=0x259bbcc0, current=0, maximum=512)
at bora/vim/lib/vigorVim/vigorVimOp.cpp:340
#21 0x0ac8a00a in VigorCpp::VigorOp::ResultCbWrapper (cbData=0x259bae58, result=0x26b8004c)
at bora/lib/vigorCpp/VigorOp.cpp:181
#22 0x0aca5d3c in SnapshotVigorProgressCB (progressData=0x1f4a93e4, cur=0, max=512)
at bora/lib/vigorOffline/offlineSnapshot.c:387
#23 0x0c011d34 in SnapshotProgress (cbData=0x2627f83c, pos=0, end=100) at
bora/lib/snapshot/snapshot.c:1789
#24 SnapshotProgress (cbData=0x2627f83c, pos=0, end=100) at bora/lib/snapshot/snapshot.c:1766
#25 0x0bfb7925 in FLDiskNotificationReportProgressToBE (info=0x257fc5a0) at
bora/lib/filtlib/flDiskNotification.c:85
#26 FLDiskNotificationReportProgressToBE (info=0x257fc5a0) at
bora/lib/filtlib/flDiskNotification.c:75
#27 0x0bfb7a8b in FLDiskNotificationProgressCb (handle=0x257de0e0, percentage=0)
at bora/lib/filtlib/flDiskNotification.c:1122
#28 0x0de37151 in ?? ()
#29 0x0c49dd6a in start_thread (arg=0x26b80b70) at pthread_create.c:301
#30 0x0c586d9e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:133
(gdb)

Now go to frame 5 in the stack trace and list the code. You see that it calls GetThreadPool() to
schedule a work item on the thread pool.

(gdb) frame 5
#5 0x070e540b in Vmomi::PropertyCollectorInt::PropertyCollectorImpl::TriggerProcessGUReqs
(this=0x25c5ade8, filter=0x255d7248)
at bora/vim/lib/vmomi/propertyCollector.cpp:1410
1410 this, &PropertyCollectorImpl::ProcessGUReqs));
(gdb) list
1405 /* schedule the thread
1406 */
1407
1408 GetThreadPool()->ScheduleWorkItem(
1409 MakeFunctor(
1410 this, &PropertyCollectorImpl::ProcessGUReqs));
1411 }
1412 }
1413 }
1414

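If you look at the disassembly, you see that GetThreadPool() is in fact returning a NULL pointer: the
faulting instruction, mov (%eax),%edx, dereferences %eax, which holds the value returned by the
preceding call to GetThreadPool().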

(gdb) x/i $eip
=> 0x70e540b <Vmomi::PropertyCollectorInt::PropertyCollectorImpl::TriggerProcessGUReqs(Vmomi::PropertyCollectorInt::FilterImpl*)+235>: mov (%eax),%edx

(gdb) disassemble $eip
Dump of assembler code for function Vmomi::PropertyCollectorInt::PropertyCollectorImpl::TriggerProcessGUReqs(Vmomi::PropertyCollectorInt::FilterImpl*):
   0x070e5401 <+225>: jg 0x70e5480 <Vmomi::PropertyCollectorInt::PropertyCollectorImpl::TriggerProcessGUReqs(Vmomi::PropertyCollectorInt::FilterImpl*)+352>
   0x070e5403 <+227>: call 0x70ce420 <_ZN7Vmacore6System13GetThreadPoolEv@plt>
   0x070e5408 <+232>: lea -0x38(%ebp),%edi
=> 0x070e540b <+235>: mov (%eax),%edx

GetThreadPool() returns NULL because the progressFunc callback referred to on line 15 of the code
snippet shown above is not running on the thread pool. It is in fact running on a thread created
from the iofilters thread pool. The thread that pthread_create() created directly in hostd has no
knowledge of, and no control over, the threads of the iofilter Library Instance context.

Blocking Rules in Poll and Timer Callbacks and WorkGroup Functions

In both LIs and Daemons, no blocking functions may be called in Poll callbacks and Timer callbacks. There
is a single thread in the IOFilter framework, for each VMX and for iofilterd, that calls all Poll callbacks and
Timer callbacks. As a result, if any callback blocks, all other callbacks have to wait. Furthermore, if one
callback depends on another callback, for example if it waits for a semaphore to be posted by another
callback, it deadlocks, since the other callback never gets the chance to run and post the semaphore.

In LIs, no blocking functions may be called in WorkGroup functions. There are several worker threads in the
IOFilter framework that are responsible for processing all the WorkGroup functions in LIs. But in some
application environments, the number of worker threads might be as few as one. If any WorkGroup
function blocks, the worker thread blocks and cannot process other functions. As with Poll callbacks and
Timer callbacks, if WorkGroup functions depend on each other, they are subject to deadlock.
However, Daemons can use a set of the available worker threads (64 currently) for blocking operations.

You might wonder what you can do if you really need to block and wait for something to complete before
you can proceed. Our recommendation is to schedule a timer callback to periodically check the status of the
thing you want to wait for. One example is the diskDetach callback, in which you need to detach your filter
and do some clean-up, such as removing sidecar files, but your filter might be issuing its own IOs. You need
to wait for these IOs to complete before you can detach the filter. In this situation, you can schedule a timer
callback that periodically checks whether all the IOs have completed. You can also use this timer callback to
call the VMIOF_DiskOpProgressFunc provided in VMIOF_DiskDetachInfo. Once all the IOs have completed,
you can perform the remaining clean-up steps.
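
The polling approach recommended above can be sketched as follows. This is a self-contained simulation, not VAIO code: DetachState, detach_poll_tick() and run_detach_simulation() are hypothetical names that stand in for your filter's own state, the body of a timer callback, and the framework firing that timer. The real timer registration call and the signature of VMIOF_DiskOpProgressFunc are not shown here.

```c
#include <stdbool.h>

/* Hypothetical per-disk state kept by a filter; not part of the VAIO API. */
typedef struct {
    int  outstanding_ios;   /* filter-issued IOs that have not completed yet */
    int  total_ios;         /* IOs that were outstanding when detach began */
    int  last_progress;     /* last percentage reported, 0..100 */
    bool cleanup_done;      /* set once the remaining detach steps have run */
} DetachState;

/* Stand-in for the body of a timer callback. It never waits: it inspects
 * the state, records progress, and returns. Returns true when no further
 * ticks are needed. */
static bool detach_poll_tick(DetachState *st)
{
    if (st->outstanding_ios > 0) {
        int done = st->total_ios - st->outstanding_ios;
        /* A real filter would report progress to the framework here. */
        st->last_progress = (st->total_ios > 0) ? (done * 100) / st->total_ios : 0;
        return false;               /* not finished; let the timer fire again */
    }
    if (!st->cleanup_done) {
        st->cleanup_done = true;    /* e.g. remove sidecar files here */
        st->last_progress = 100;
    }
    return true;
}

/* Simulates the framework firing the timer; one IO completes per tick.
 * Returns the number of ticks used, or -1 if max_ticks was not enough. */
static int run_detach_simulation(DetachState *st, int max_ticks)
{
    int ticks = 0;
    while (ticks < max_ticks) {
        ticks++;
        if (detach_poll_tick(st)) {
            return ticks;
        }
        st->outstanding_ios--;      /* pretend an IO completion arrived */
    }
    return -1;
}
```

The point of the design is that each tick returns immediately, so the single thread that runs all Poll and Timer callbacks is never held up while the filter waits for its IOs.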

Bundle Deployment URL Lifetime

The URL you specify in InstallIOFilter_Task when installing the IOFilter Bundle must remain accessible
until the Filter is uninstalled. The reason is that VC doesn't store the bundle, so it has to access
that URL to get the bundle information every time a new host joins the cluster or a stateless host reboots
and reconnects to VC, as well as when the Filter is uninstalled from the cluster.

If the URL is not accessible, you will observe various errors. If the error message on vCenter Server is not
clear enough, you can examine the vpxd.log and eam.log in vCenter Server to figure out the reason.

For example, if the URL is not accessible when you try to uninstall the Filter, the task fails and
vCenter Server reports an error message such as:

Unable to access agent VIB module at http://10.20.72.161:8005/sampfilt-offline-bundle-10-2-1.zip
(IoFilter-ZZZ_bootbank_sampfilt)


Troubleshooting vSphere Cluster and Filter Build Configuration (Checklist)
If you receive an error invoking IOFilterManager methods (for example installing or updating a filter), it may
be because your cluster is not (currently) configured properly. Even if you have been using the cluster
successfully, you (or a colleague) may have made a change that you either forgot, or of which you were
unaware, that is preventing the current operation from succeeding.

This topic provides a brief list of things to check in your cluster and build configuration:

n Each host is in Community Supported mode — The InstallIOFilter_Task method requires this mode
to install unsigned VIBs.

n VIB's 'acceptance-level' set to 'community' in the SCONS file — The InstallIOFilter_Task method
requires this acceptance level to install unsigned VIBs.

n During install/uninstall, EAM puts the host into maintenance mode first and takes it out of maintenance
mode afterwards. EAM does so successfully; however, it reports an error in the Task List in the
vSphere Web Client UI. You can ignore this error message for now; it will be fixed in the next release.

n DRS is properly set - When uninstalling or upgrading, DRS migrates running VMs to other hosts. For
DRS to do so, (1) DRS needs to be set to Fully Automated mode, (2) the VMs need to be on shared
storage, and (3) the vMotion traffic tag must be set on one of the vmknics on every host.

n Hosts / vCenter date / time synchronized - The authentication system used between vCenter Server
and ESXi hosts is time sensitive. If the times on the systems are too far out of sync, certain cluster
operations may fail. A best practice is to configure the vCenter Server and ESXi hosts to use the same
Network Time Protocol (NTP) server.

Using the RAMDisk


The IOFilter framework provides a ramdisk mounted at /var/run/iofilters. Its main purpose is to store files
that are used for System V shared memory keys. It can also be used to store small amounts (a few KB) of
non-persistent data. The size of this ramdisk is 32MB for the current release, but the number of files allowed
on this ramdisk is fairly large, so that partners can create one file per VMDK. File names in this ramdisk
should use the com.company.product style.

The following code snippet shows how to use this ramdisk to generate a shared memory key.

key_t key;
int shmid;
char *data;
size_t size = 1024*1024*2;
key = ftok("/var/run/iofilters/com.companyA.productA", 'R');
shmid = shmget(key, size, 0644 | IPC_CREAT);
data = shmat(shmid, (void *)0, 0);
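
The snippet above omits error handling. The following is a minimal sketch of the same sequence with each call checked, assuming a standard POSIX/System V environment. The helper names attach_shared_segment() and release_shared_segment() are hypothetical, and the key file path is passed in by the caller rather than hard-coded. Note that ftok() fails if the key file does not exist, so the sketch creates it first.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <unistd.h>

/* Attaches (creating if needed) a System V shared memory segment keyed by
 * key_path. Returns the mapped address, or NULL on failure. On success the
 * segment id is stored in *shmid_out when shmid_out is not NULL. */
static void *attach_shared_segment(const char *key_path, size_t size, int *shmid_out)
{
    /* ftok() requires an existing file, so create it if it is missing. */
    int fd = open(key_path, O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return NULL; }
    close(fd);

    key_t key = ftok(key_path, 'R');
    if (key == (key_t)-1) { perror("ftok"); return NULL; }

    int shmid = shmget(key, size, 0644 | IPC_CREAT);
    if (shmid < 0) { perror("shmget"); return NULL; }

    void *data = shmat(shmid, NULL, 0);
    if (data == (void *)-1) { perror("shmat"); return NULL; }

    if (shmid_out != NULL) {
        *shmid_out = shmid;
    }
    return data;
}

/* Detaches the segment and, if remove_segment is non-zero, marks it for
 * removal. Returns 0 on success, -1 on failure. */
static int release_shared_segment(void *data, int shmid, int remove_segment)
{
    if (shmdt(data) != 0) { perror("shmdt"); return -1; }
    if (remove_segment && shmctl(shmid, IPC_RMID, NULL) != 0) {
        perror("shmctl");
        return -1;
    }
    return 0;
}
```

On an ESXi host the key file would live under /var/run/iofilters and follow the com.company.product naming style described above.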

Holding a Lock Across Calls


An IO Filter is not allowed to hold a lock, for example a mutex, across calls to the following utility
functions, since these functions might trigger inter-filter deadlocks:

n VMIOF_DiskIOContinue(): This function continues processing of an IO request. It pushes the IO
through the remaining filters and the rest of the IO stack. Each filter can either defer or complete the IO.
However, this function is intended to be called for IOs that were deferred or processed asynchronously,
even though calling it from a synchronous context is allowed.

n VMIOF_DiskIOComplete(): This function is used to complete processing of an IO request prematurely.
No further filters will receive the diskIOStart callback for this IO request.

n VMIOF_DiskIOSubmit(): This function is used to submit a previously allocated disk IO object to the
kernel.

n VMIOF_DiskIOAbort(): This function is used to abort a previously submitted disk IO.

Consider a filter A that invokes the following functions in sequence in its diskIOStart callback function —

pthread_mutex_lock(M);
VMIOF_DiskIOSubmit();
pthread_mutex_unlock(M);

Now, if this thread tries to acquire the same mutex M in its completion callback, it deadlocks against
itself. This can happen because a filter underneath might complete the IO immediately, so that the
completion path is executed inside VMIOF_DiskIOSubmit(), while M is still held.

Understanding Sidecar Function Invocations

The following table lists the Sidecar functions along with the callback functions that are allowed to
invoke them:

Sidecar Function              Valid Calling Functions

VMIOF_DiskSidecarCreate()     diskAttach(), diskPropertiesSet()

VMIOF_DiskSidecarDelete()     diskDetach(), diskPropertiesSet()

VMIOF_DiskSidecarOpen()       diskOpen(), diskClose(), diskAttach(),
                              diskPropertiesSet(), diskDetach(), diskGrow(),
                              diskClone(), diskCollapse(), diskVmMigration(),
                              diskSnapshot(), diskRelease()

VMIOF_DiskSidecarClose()      diskOpen(), diskClose(), diskAttach(),
                              diskPropertiesSet(), diskDetach(), diskGrow(),
                              diskClone(), diskCollapse(), diskVmMigration(),
                              diskSnapshot(), diskRelease()

VMIOF_DiskSidecarRead()       diskOpen(), diskClose(), diskAttach(), diskDetach(),
                              diskGrow(), diskClone(), diskCollapse(),
                              diskVmMigration(), diskSnapshot(), diskRelease()

VMIOF_DiskSidecarWrite()      diskOpen(), diskClose(), diskAttach(), diskDetach(),
                              diskGrow(), diskClone(), diskCollapse(),
                              diskVmMigration(), diskSnapshot(), diskRelease()

VMIOF_DiskSidecarGetSize()    ANY

VMIOF_DiskSidecarSetSize()    diskAttach(), diskPropertiesSet(), diskGrow()

NOTE If the filter implements the diskStun and diskUnstun callbacks, then in stunned state,
VMIOF_DiskSidecarRead() and VMIOF_DiskSidecarWrite() can be used in all LI callbacks, while in unstunned
state, they can be used in any context.

Heap Debugging
When the Library Instance exits, an ASSERT() will occur if there are any memory leaks. In order to debug
the heap, you will require the heap.gdb script referenced in this topic. At the current time, it is not available
in the Dev Kit, so please open a DCPN case requesting this script.


As an example of using this script, the following code was added to the countIO sample in the
TestDiskStartIO() function. The IO it allocates is never freed, so it leaks:

if (data->ioCount == 0) {
   // Allocate an I/O
   status = VMIOF_DiskIOAlloc(handle, data->heap, 1, &outIO);
}

Running this filter causes an ASSERT() when the filter is unloaded, and the heap.gdb script is used to assist
in determining the leak.

Program terminated with signal 32, Real-time event 32.


#0 VMKernel_LiveCoreDump () at bora/vmkernel/public/uwvmk.h:1337
1337 bora/vmkernel/public/uwvmk.h: No such file or directory.
in bora/vmkernel/public/uwvmk.h
(gdb) bt
#0 VMKernel_LiveCoreDump () at bora/vmkernel/public/uwvmk.h:1337
#1 Sig_CoreDump () at bora/lib/sig/sigPosix.c:2357
#2 0x01c19802 in CoreDump_CoreDump () at bora/lib/coreDump/coreDumpPosix.c:78
#3 0x01c79866 in Panic_Panic (format=0x215b86c "Non-empty heap (%s) being destroyed (avail is
%d, should be %d).\n",
args=0xfffc9694 "\b`\222+\320L") at bora/lib/panic/panic.c:529
#4 0x01c7989f in Panic (fmt=0x215b86c "Non-empty heap (%s) being destroyed (avail is %d, should
be %d).\n") at
bora/lib/panicDefault/panicDefault.c:43
#5 0x01f880f8 in HeapDestroy (heap=0x2b926000) at bora/lib/filtlib/heap.c:743
#6 Heap_Destroy (heap=0x2b926000) at bora/lib/filtlib/heap.c:800
#7 0x01f81128 in FiltLibDestroyHeap (heap=0x2b926000) at bora/lib/filtlib/filtlibHeap.c:257
#8 0x025d9aae in ?? ()
#9 0x025d30f1 in TestDiskClose (handle=0x1f05faf0) at
partners/samples/iofilter/countIO/countIO.c:295
#10 0x01f851a7 in FiltLibWrapClose (handle=0x1f05faf0) at bora/lib/filtlib/filtlibTrace.c:212
#11 0x01f741da in FiltLibDiskCloseFilter (context=0x1f05f938, firstClass=<value optimized out>,
lockPoll=1 '\001') at
bora/lib/filtlib/filtlibDisk.c:1276
#12 FiltLibDiskCloseFiltersFrom (context=0x1f05f938, firstClass=<value optimized out>,
lockPoll=1 '\001') at
bora/lib/filtlib/filtlibDisk.c:1315
#13 0x01f78668 in FiltLibDiskCloseAllFilters (ctx=0x1f05f938) at bora/lib/filtlib/filtlibDisk.c:1769
#14 FiltLib_DestroyContext (ctx=0x1f05f938) at bora/lib/filtlib/filtlibDisk.c:2006
#15 0x01eb8c1d in DiskLibFiltLibExit (diskHandle=0x1f05efa4) at bora/lib/disklib/diskLib.c:6550
#16 0x01eb8e4b in DiskLib_Close (diskHandle=0x1f05efa4) at bora/lib/disklib/diskLib.c:4608
#17 0x01b10e13 in DoIOToDisk (source=0xfffc9df2 "test.vmdk", ioType=IOTYPE_WRITE_ZEROS,
startSector=0,
args=<value optimized out>) at bora/apps/vmkfstools/fstools.c:2953
#18 0x01b11c3f in WriteZeros (args=0xfffc9b54) at bora/apps/vmkfstools/fstools.c:2983
#19 0x01b08ef6 in main (argc=5, argv=0xfffc9d24) at bora/apps/vmkfstools/fstools.c:1424
(gdb) source heap.gdb
(gdb) heapprint heap
No symbol "heap" in current context.

The heap symbol is not visible in frame 0, so change to a frame where it is valid.

(gdb) frame 5
#5 0x01f880f8 in HeapDestroy (heap=0x2b926000) at bora/lib/filtlib/heap.c:743
743 bora/lib/filtlib/heap.c: No such file or directory.


in bora/lib/filtlib/heap.c
(gdb) heapprint heap
heap: countio
dlmalloc segment 0: base=0x2b928000, size=12288, next=0x2b927fe0
INUSE: mchunkptr: 0x2b928000 (raw addr 0x2b928008); mchunk size=160
Poison @0x2b928094 OK; bytes: 132, sufixBytes: 24, callerPC: 0x1f74931
<FiltLibVmiofDiskAllocIO+97>:
mov -0x38(%ebp),%edx
dlmalloc segment 1: base=0x2b92608c, size=8052, next=(nil)
FREE: mchunkptr: 0x2b926268 (raw addr 0x2b926270); mchunk size=7536 fd=0x2b926268
bk=0x2b926268

The output shows one allocated (INUSE) object on the heap.

(gdb) p ((FiltLibDiskIO*)0x2b928008)->context.ioCount.value
$1 = 1

This shows that one I/O is allocated.

(gdb) p *((FiltLibDiskIO*)0x2b928008)
$2 = {diskIO = 0x2b928064, heap = 0x2b926000, completionStatus = VMIOF_SUCCESS, sharedDataIndex
= 0, context = 0x1f05f938,
currentClass = FILTLIB_DISK_FILTER_CLASS_CACHE, currentClassGuard =
FILTLIB_DISK_FILTER_CLASS_INVALID, referenceCount = {value = 1},
submit = 0x1f7acc0 <FiltLibSubmitUserIO>, finalize = 0x1f73e30 <FiltLibFreeIO>, completionPairs
= {{callback = 0, data = 0x0},
{callback = 0, data = 0x0}, {callback = 0, data = 0x0}, {callback = 0, data = 0x0}, {callback =
0, data = 0x0},
{callback = 0x1f744a0 <FiltLibDefaultUserIOCompletionCb>, data = 0x0}}, debugInfoIndex = 1}
(gdb)

Debugging With Different Versions of ESX


Sometimes you might need to debug with an ESX build whose build number differs from that of the
VAIO SDK. One scenario is that you are provided with a sandbox build (internal test build) for a specific
bug. Another is that you get a core dump from a customer environment where an ESX patch release
is deployed. To debug with different versions of ESX, you need to install the symbol VAIO RPMs
that match those ESX versions. The RPM file name is
vmware-esx-vaiodk-symbols-[build-number]-6.0.0-1.17.[build-number].i386.rpm. Multiple symbol VAIO
RPMs can be installed on the same Workbench VM. For example, after installing two symbol VAIO RPMs,
the Workbench VM has the following two directories:

/opt/vmware/vaiodk-symbols-6.0.0-2799832
/opt/vmware/vaiodk-symbols-6.0.0-2897841

Then "make prep-debug" and "make live-debug" automatically choose the right symbol files to use.
In the following example, the symbol files of build 2799832 are chosen:

proma-2n-dhcp211:/opt/vmware/vaiodk-6.0.0-2897841/src/partners/samples/iofilter/sampfilt # make prep-debug

Select a zdump to debug:

vmx-debug-zdump.002 Wed Aug 5 15:26:25 2015

Enter some matching characters of a pattern: vmx-debug-zdump.002

Using vmx-debug-zdump.002 ...
Using /opt/vmware/vaiodk-symbols-6.0.0-2799832 for its symbols
Exact match for zdump build number 2799832

Using the Realtime Clock Functions


The librt library is available, but not all of its functions are fully implemented. For example,
clock_gettime() only supports CLOCK_REALTIME, CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW. If you need
thread-specific time, or you are in a performance-critical code path, you should use the read timestamp
counter (RDTSC) opcode instead. An example of calling RDTSC:

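static inline uint64
RDTSC(void)
{
#ifdef VM_X86_64
   uint64 tscLow;
   uint64 tscHigh;

   /* RDTSC returns the low 32 bits in EAX and the high 32 bits in EDX. */
   __asm__ __volatile__(
      "rdtsc"
      : "=a" (tscLow), "=d" (tscHigh)
   );
   return (tscHigh << 32) | tscLow;
#elif defined(VM_X86_32)
   uint64 tim;

   /* On 32-bit x86, the "A" constraint maps the 64-bit result to EDX:EAX. */
   __asm__ __volatile__(
      "rdtsc"
      : "=A" (tim)
   );

   return tim;
#endif
}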

Using Libraries in Your Library Instance and Daemon


The VAIODK includes the following libraries:

n libvmiof – This library provides the interfaces specific to an IO Filtering solution.

n libvmkuserlib – This is a stable interface exported by the VMkernel to userlevel applications that want
to access VMkernel specific functions.

n librt – This library provides most of the POSIX Realtime Extension interfaces. VMware supports a
limited subset of these interfaces.

n glibc – the GNU Project's implementation of the C standard library.

These four libraries are dynamically linked automatically. All you need to do is include the header files
and run make.


If you want to use any other library, either public or proprietary, there are two cases. If the library has no
dependencies, you can statically link it into your Library Instance or Daemon by using the "extra objects"
keyword in the scons file. If the library has any dependency, for example on glibc, you have to extract the
code you need from the library and create your own static library. You are responsible for any legal
obligations involved.

Please let us know if you depend heavily on a public library and creating a static library as described
above is not feasible for you. We will consider adding it to our supported list of dynamic libraries in a
future release.

Development Tips
The following tips can help speed up your development efforts :

n Use the built in NFS client on ESXi to NFS mount the Linux Dev System:

n Aids command line install (no scp needed)

n Aids copying core files to the dev system

n Can be used with symlinks instead of doing VIB installs after rebuilding your code (only after the first
VIB install)

n In order to reload the daemon, you will need to stop/start it using the instructions in the document
"vSphere APIs for IO Filtering Development Kit (VAIODK) Guide for the Command Line" in the
section "Starting and Stopping the IO Filter Daemon"

n Create a test user-space program to interact with the filter. See the section "Building a Test Application
for Your IO Filter" in the document "vSphere APIs for IO Filtering Development Kit (VAIODK) Guide
for the Command Line".

n Use cache files for local databases (even if it's not a caching solution)

n Use "extern char *program_invocation_name" within the Library Instance code to determine which
program loaded the library. This is important if you need to know whether you are being loaded in the
context of the VMX, or a hostd process such as vmkfstools

n Use atomics instead of pthread mutexes where possible, for speed

n If your code creates a sidecar and you need to remove it manually because of a bug, you can do so
by editing the VMDK file with vi or an equivalent text editor. An example of the relevant line in the
VMDK file is as follows:

ddb.sidecars = "sampcache_1,iofilter-97b1142038b77f6d.vmfd"
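
The program_invocation_name tip above can be sketched as follows. program_invocation_name is a glibc-provided variable; base_name_of() and loaded_by() are hypothetical helper names, and the process names you compare against (for example "vmx") are assumptions that you should confirm on your own host.

```c
#include <string.h>

/* Set by glibc to the name that was used to invoke the current process. */
extern char *program_invocation_name;

/* Returns the final path component of a program path. */
static const char *base_name_of(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Returns 1 if the process hosting this code was started under the given
 * name, otherwise 0. The names to test for are assumptions, not values
 * defined by the VAIO API. */
static int loaded_by(const char *expected)
{
    return strcmp(base_name_of(program_invocation_name), expected) == 0;
}
```

A Library Instance could call loaded_by() once during initialization and cache the result, rather than repeating the string comparison on every callback.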

Frequently Asked Questions

Q: Is the barrier IO tag available at the VAIO framework level?


No, the barrier IO tag is not visible to either partner Filters or the VAIO framework. For user level IOs,
those sync commands are absorbed by the VMkernel buffer cache. For VM IO requests, this information is
stripped away in the vSCSI layer, so the PSA layer and a driver wouldn't see it either.

Q: Will TRIM/UNMAP commands be sent to the SSD device?


TRIM/UNMAP commands are sent to the SSD device only when the following two conditions are met:

1 The cache file is created as a thin file.

2 The ESXi configuration option EnableBlockDelete is set, which is not the default. However, setting this
option has additional ramifications, because vSphere then sends TRIM/UNMAP commands regardless of
whether the underlying device is a local SSD and regardless of whether the volume is VMFS or VFFS.

Q: What are the rules around catching POSIX Signals?


The rule is to not catch POSIX Signals. All signals are caught and handled by the Framework.

Q: Is the memory of a Filter pageable?


The data segment and code segment of a Library Instance and Daemon are non-pageable. The heap is
currently non-pageable. We are considering adding a flag to allow a Filter to specify whether the memory
should be pageable or non-pageable in a future release.

Q: How do I persist configuration data on an ESXi host?


VAIO does not provide a set of functions for doing this on a filter-wide basis, in part because ESXi,
with filters, can be installed on a host with no local disks. Consider the following alternatives for storing
configuration data instead:

n Put the information in a sidecar or VMDK that is created and used only by the daemon

n Create a file in the VFFS if it's available on the host

n Store it in a vCenter Extension and retrieve it through the CIM provider

Q: Does IOFilter support RDM (Raw Device Mapping)?


pRDM is not supported, since its IO goes from the vSCSI layer directly to the device, bypassing the IO
Filter layer. vRDM is supported, since its IO goes through the IO Filter layer.

Q: Is a shared VMDK supported by VAIO?


Yes, the framework provides the flag VMIOF_DISK_SHARED in VMIOF_DiskInfo, and it is up to you to decide
whether you want to implement it.

If you don't want to implement shared VMDK (multi-writer) support for guests such as Oracle, you should
fail the open call if you see the flag VMIOF_DISK_SHARED.

The IO Filter framework won't allow attaching a filter to a shared VMDK if the VM is powered on. If you
attach a filter to a shared VMDK when the VM is powered off, the attach succeeds, but the
VMIOF_DISK_SHARED flag is not present in either the diskAttach or the diskOpen callback, because the
Framework does not have that information. You only see the VMIOF_DISK_SHARED flag in diskOpen when the
VM is being powered on. If you return VMIOF_NOT_SUPPORTED, the customer sees the error message "The
specified feature is not supported by this version" on VC. You can also log your own message in vmware.log
to assist customer troubleshooting.
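
The check described above can be sketched as follows. To keep the example self-contained, it uses stand-in types instead of the VAIO headers: MockStatus, MockDiskInfo, MOCK_DISK_SHARED and the flags field are all hypothetical, and their values and layout are assumptions. In a real filter you would test VMIOF_DISK_SHARED against the corresponding member of VMIOF_DiskInfo and return VMIOF_NOT_SUPPORTED.

```c
#include <stdint.h>

/* Stand-ins so that the sketch compiles on its own. The real definitions
 * come from the VAIO headers; these names and values are assumptions. */
typedef enum {
    MOCK_SUCCESS = 0,
    MOCK_NOT_SUPPORTED = 1
} MockStatus;

#define MOCK_DISK_SHARED (1u << 0)   /* stands in for VMIOF_DISK_SHARED */

typedef struct {
    uint32_t flags;                  /* hypothetical field name */
} MockDiskInfo;

/* Check intended to run at the start of a diskOpen callback in a filter
 * that does not implement shared (multi-writer) VMDK support. */
static MockStatus reject_shared_disk(const MockDiskInfo *info)
{
    if ((info->flags & MOCK_DISK_SHARED) != 0) {
        /* A real filter would also log its own message here so that the
         * reason for the failure appears in vmware.log. */
        return MOCK_NOT_SUPPORTED;
    }
    return MOCK_SUCCESS;
}
```

Because the flag is only visible in diskOpen at power-on, placing the check there (rather than in diskAttach) is what makes the rejection effective.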

Q: What is the maximum combined size of all the elems in a DiskIO structure?
We don't have any limit on the total size of an IO. Besides, since the filters are chained, if there is a filter
above your filter, it can issue IOs of arbitrary size.

Q: How do I generate a live core file?


Use the following commands to force vSphere to generate a core file for a specific user-world cartel, e.g.
your daemon:

vsish -e set /userworld/cartel/<cartel-id>/debug/coreDumpEnabled 1


vsish -e set /userworld/cartel/<cartel-id>/debug/livecore 1


Use the following command to generate a backtrace for a specific world in vmkernel.log:

vsish -e set /world/<worldID>/logbacktrace on

Use the following commands to force vSphere to generate a kernel core file:

localcli --plugin-dir /usr/lib/vmware/esxcli/int debug livedump perform


esxcfg-dumppart -C -D active

Q: What performance profiling tools are available?


Valgrind exists for ESXi at https://developercenter.vmware.com/web/dp/tool/valgrind/5.5

Once you have installed valgrind for ESXi, issue the following command:

/opt/valgrind/bin/valgrind --tool=helgrind <heavy_memory_program>

Q: Can I open() files on /dev/disks/... from the Daemon?


You are permitted to open files on /dev/disks/... The restriction is that you can only do so from the Daemon.
Once you open the file, daemons are free to pass this FD to the LI via standard POSIX techniques.
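
One standard POSIX technique for passing an open file descriptor between processes is an SCM_RIGHTS control message sent over a Unix domain socket. The following is a minimal sketch of that technique only. The helper names send_fd() and recv_fd() are hypothetical, and how the Daemon and the LI establish the connected socket between them is not shown here.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sends the descriptor fd over a connected Unix domain socket.
 * Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';   /* at least one byte of ordinary data must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;          /* forces correct alignment */
    } ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receives a descriptor sent with send_fd().
 * Returns the new descriptor, or -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0) {
        return -1;
    }

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file, so the sender can close its own copy after a successful send.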

Q: How do I detect whether VC or ESX is IOFilter capable?


For VC, there is no programmatic way to do it. You cannot tell by checking the VC version, because
vSphere doesn't expose the upgrade level in the vSphere API. The build number is not a reliable indicator
either. So you have to rely on documentation, for example by telling your customers to install your filter
only on VC version 6.0U1 or higher.

For the ESX Host, you can run "esxcli software vib get -n esx-base" and search for vmiof_, then you will
know whether it is IOFilter capable, and the IOFilter versions supported.

Q: The function VMIOF_DiskAdapterGet() fails with the status VMIOF_NOT_FOUND after a new
disk has been added to a running VM. What is the workaround?
This is a known bug, and likely won't be fixed since this function is marked as deprecated. The workaround
is to call the function from the first diskUnstun() callback after diskOpen().

Q: Will the Filter be uninstalled if I move an ESX host out of a Cluster?


If you move the host into a different cluster without the filter installed, or into the Datacenter as a
standalone host, the filter will be uninstalled. However, if you remove the host from the VC inventory, the
filter will not be uninstalled.

Q: How can I get the physical memory size of an ESX host?


You can invoke localcli hardware memory get to get the physical memory size of the host. There is no API
to get the free memory size on the host, but you should be able to obtain it through the vSphere API and
then send that information to your Daemon/LI through your CIM Provider.
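
If the command output is captured as text, the size can be extracted with a small parser. The sketch below assumes the output contains a line of the form "Physical Memory: <number> Bytes". That format is an assumption and should be checked against the output on your host. The function name parse_physical_memory is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract the byte count that follows "Physical Memory:" in text.
 * Returns 0 and stores the value in *bytes_out on success, -1 otherwise. */
int parse_physical_memory(const char *text, unsigned long long *bytes_out)
{
    static const char key[] = "Physical Memory:";  /* assumed label */
    const char *p;

    assert(text != NULL && bytes_out != NULL);

    p = strstr(text, key);
    if (p == NULL) {
        return -1;
    }
    /* %llu skips the whitespace between the label and the number. */
    return sscanf(p + sizeof(key) - 1, "%llu", bytes_out) == 1 ? 0 : -1;
}
```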

Chapter Summary
Having worked through the topics in this chapter, you should now be able to:

• Specify in which components you cannot, may not, and may create threads using
pthread_create()

• Understand the use of blocking rules in Poll and Timer Callbacks, and WorkGroup Functions

• Understand the lifetime requirements of the Bundle Deployment URL

• Understand how to troubleshoot a vSphere Cluster and Filter Build Configuration

• Understand how to use a RAM Disk

• Understand when you can and cannot hold a lock across calls to the VAIO

• Understand the rules regarding the use of Sidecar Functions

• Understand how to debug the Heap

• Understand how to debug with different versions of ESXi

• Understand how to use the realtime clock functions


Index

Symbols
/opt/vmware 43
/var/log/hostd.log 50
/var/log/iofilter-init.log 50
/var/log/syslog.log 50

Numerics
32-bit 33
64-bit 33

A
Acknowledgements 7
Asynchronous 22

B
BC 20
Buffer Cache 20
build 48
buildId 43
bundle 48
Bundle 27, 264

C
cache 15
Capabilities 74
CATALOG 116
Chapter Objectives 13
Chapter summary 12
checklist 265
CIM 26, 89, 116
CIM Provider (CIMP) 116
CIMP 116
class 15
cluster 27
ClusterIOFilterInfo 68
clusterMIOD 68
Common Information Model (CIM) 26
Common rule set 74
CommunitySupported 50
compression 15
config.xml 53
countIO 45
Course Prerequisites 9
crossFD 184

D
Daemon 25, 113
DAEMON MEMORY RESERVATION 100
Daemon Start Callback 113
Daemon Stop Callback 114
Development 270
Development Environment Requirements 32
diskAttach 103, 155
diskClone 106, 210
diskClose 104, 165
diskCollapse 105, 201
diskDeleteBlocks 105, 178
diskDeleteBlocksPrepare 105, 178
diskDetach 61, 103, 158
diskExtentGetPost 109, 204
diskExtentGetPre 109, 204
diskGrow 109, 233
diskIOAbort 108, 180
diskIOReset 181
diskIOsReset 108
diskIOStart 107, 257
DiskIOStart 169
DiskLib 17, 21
diskOpen 104, 159
diskRelease 109
DiskRelease 228
diskRequirements 106, 153
diskSnapshot 104
diskStun 108, 208
diskUnstun 108, 208
diskVmMigration 107
downloading VAIODK 34

E
encryption 15
esxcli 62
esxcli software vib install 53
esxcli system maintenanceMode set -e true 62

F
FAQ 270
filter class 74
Filter class 22
filterID 68


FSS 19

G
gdb 118, 120
gdbiof 118, 120
glossary 7

H
Heap 266
hostd 17, 33

I
init.d 53
inspection 15
InstallIOFilter_Task 68
instance data 24
intended audience 7
IO Filter Architecture 28
IO Stack 14
iofiltd 53
iofilterd 25, 33, 53
IOFilterManager 68, 265

J
json 98

L
LI 23
Library Instance (LI) 23
live-debug 120
locale 116
localization 116

M
maintenanceMode 62
make 47
Makefile 88, 89
Managed Object Browser (MOB) 68, 83
meta data 24
MOB 68, 83

N
name.description 116
name.label 116
NFS 19
non-persistent 24

O
Offline filtering 17
Online filtering 17
overview 9

P
persistent 24
Poll Callback 264
properties 60
Properties 74
property 60
proxy 45
pthread_create 260
python 53

Q
QueryIOFilterInfo 68

R
RAMDisk 265
RDTSC 269
Remote System Explorer (RSE) 55
replication 15
requiredMemoryPerDiskMiB 106, 153
requiredMemoryPerIO 106, 153
requiredStaticMemory 106, 153
revert 192, 198
RSE 55
Rule-Set 1 74

S
sampcache 45
sampfilt 45
scons 47, 88
SCONS 25, 89
scp 53
SecretSauce Library 22
sidecar 24
sidecar data 24
sidecars 24
SimpleHTTPServer 53
snapshot 192, 198
Snapshot 193
software acceptance 50
SPBM Policy 74
SSLib 21, 22
SSMod 21
static library 269
stdout 50
Strategic Objective 10
Summary 29
Synchronous 22

T
Tactical objectives 10
Timer Callback 264


U
uninstall 47
UninstallIoFilter_Task 68
Unix domain socket 184
unwrapping VAIODK 34
upcall 21
UpgradeIoFilter 68
UpgradeIoFilter_Task 68
User Object Module 19
User space cartels 16
utility functions 181

V
VAIO 88
VAIODK 88, 89
vCenter Server 27
VCSA 27
VFFS 25
vFlash 25
VFS 19
VIB 27, 48, 53, 89
vixDiskLib 17, 21
VM Storage Policy 74
VMFS 19
vmiof_aio.h 44
VMIOF_AIOAbort 186
VMIOF_AIOCallback 186
VMIOF_AIOQueueCreate 186
VMIOF_AIOQueueDestroy 186
VMIOF_AIOQueueDrain 186
VMIOF_AIOSubmit 186
VMIOF_ASYNC 22
VMIOF_cache 253
vmiof_cache.h 44
vmiof_crossfd.h 44
VMIOF_CrossfdCreate() 184
VMIOF_CrossfdGrantAccessToRange() 184
VMIOF_CrossfdRevokeAccessToRange() 184
vmiof_daemon.h 44
VMIOF_DEFINE_DISK_FILTER 101
VMIOF_DirtyState 257
VMIOF_DISK_CLEAN 257
VMIOF_DISK_DIRTY 257
VMIOF_DISK_SIDECAR_ALIGN 137
VMIOF_DISK_SIDECAR_KEYMAX 137
VMIOF_DISK_SIDECAR_KEYMIN 137
VMIOF_DISK_STUN_FLUSH_DIRTY_DATA 208
vmiof_disk.h 44
VMIOF_DiskCloneInfo 106, 210
VMIOF_DiskCollapseInfo 105, 201
VMIOF_DiskContentsDirtySet 257
VMIOF_DiskDeleteBlockDesc 178
VMIOF_DiskDeleteBlocksInfo 178
VMIOF_DiskDetachInfo 158
VMIOF_DiskFilterPrivateDataGet 145
VMIOF_DiskFilterPrivateDataSet 145
VMIOF_DiskFilterProperty 110
VMIOF_DiskGrowInfo 109
VMIOF_DiskHandle 104
VMIOF_DiskInfo 104
VMIOF_DiskIO 107, 108, 167, 169, 182
VMIOF_DiskIOAlloc 169, 182, 183
VMIOF_DiskIOComplete 107, 169
VMIOF_DiskIOCompletionCallback 183
VMIOF_DiskIOCompletionCallbackSet 107, 169, 183
VMIOF_DiskIOContinue 169
VMIOF_DiskIODup 169, 182, 183
VMIOF_DiskIOElem 107, 167, 169
VMIOF_DiskIOFree 183
VMIOF_DiskIOSubmit 169, 183
VMIOF_DiskMaxOutstandingIOsGet 169
VMIOF_DiskMigrationPhase 107
VMIOF_DiskOpCompletionFunc 107
VMIOF_DiskOpProgressFunc 106–108
VMIOF_DiskResetIdentifier 107, 108, 169
VMIOF_DiskSidecar 137
VMIOF_DiskSidecarClose 137
VMIOF_DiskSidecarCreate 137
VMIOF_DiskSidecarDelete 137
VMIOF_DiskSidecarGetSize 137
VMIOF_DiskSidecarOpen 137
VMIOF_DiskSidecarRead 137
VMIOF_DiskSidecarSetSize 137
VMIOF_DiskSidecarWrite 137
VMIOF_DiskSnapshotInfo 104
VMIOF_DiskStunFlags 108
VMIOF_DiskStunInfo 108
VMIOF_DiskUuidGet 149
VMIOF_DiskVmMigrationInfo 107, 216
VMIOF_DiskVmMigrationIpSpec 107
VMIOF_DiskVmMigrationType 107
VMIOF_FailureReportDisabled 151
vmiof_heap.h 44
VMIOF_HeapAllocation 130
VMIOF_HeapDestroy 130
VMIOF_HeapEstimateRequiredSize 130
VMIOF_HeapFree 130
VMIOF_HeapHandle 130
VMIOF_IOFlags 107, 167, 169
VMIOF_LOG_ERROR 50


VMIOF_LOG_INFO 50
VMIOF_LOG_PANIC 50
VMIOF_LOG_TRIVIAL 50
VMIOF_LOG_VERBOSE 50
vmiof_log.h 44
vmiof_poll.h 44
VMIOF_PollAdd 141
VMIOF_PollHandle 141
VMIOF_PollRemove 141
VMIOF_READ_OP 167
vmiof_scsi.h 44
VMIOF_ScsiCallback 242
VMIOF_ScsiClose 242
VMIOF_ScsiCommandsIssue 242
VMIOF_ScsiDiskOpen 242
VMIOF_ScsiEstimateHeapSize 242
VMIOF_ScsiHandle 242
VMIOF_STATUS 102
vmiof_status.h 44
VMIOF_SUCCESS 22
vmiof_timer.h 44
VMIOF_TimerAdd 171
VMIOF_TimerHandle 171
VMIOF_TimerRemove 171
VMIOF_VirtualDiskClose 231
VMIOF_VirtualDiskCreate 231
VMIOF_VirtualDiskOpen 231
VMIOF_VM_IO 167
vmiof_work.h 44
VMIOF_WRITE_OP 167
VMIOF_ZERO_COPY 167
vmiof.h 44
vmkaccess 231
vmkfstools 20, 33, 60, 112
VmkuserVersion_GetUniqueSystemVersion 150
VMware Workbench 32
vmware.log 50
VSAN 19
vSCSI Module 19
vSphere Cluster Configuration 265
vSphere Web Client (VWC) 28
VWC 28

W
WorkGroup Function 264

X
xMigration 216

Z
zdump 118
