VDI Design Guide Part 2
Design Guide
Part II
Advanced Design Topics
Version 1.0
Between February 2018 and summer 2021, the world didn’t stop
turning. I got the opportunity to work on more great customer projects and got to meet awesome people. All the new lessons I
learned have been bundled in the book you are reading right now.
Even though Johan and I have both been in the EUC industry since
the late nineties (as he likes to say), I didn’t actually meet him
until after I joined VMware’s EUC Office of the CTO in 2018. I had
taken an 18-month sabbatical in between BrianMadden.com and
VMware, and when I first traveled to The Netherlands, I met this
super tall, super awesome, bearded VDI geek who had worked
with VMware and NVIDIA to build an F1 racing simulator rig that
was completely running via VDI and the Blast remoting protocol!
(Johan talks more about that project in this book. It’s dope!)
Until then, I’m excited for you to read this book and up your VDI
game like I have!
Enjoy reading!
Cheers,
Johan
Another special thank you goes out to Tobias Kreidl. Besides being
a great community friend, Tobias is my “grammar and spelling
conscience.” I’m certainly anything but a native English speaker, and where Grammarly sometimes has a hard time recommending the right tone of voice or just the proper word, Tobias has helped
me a lot!
Finally, and I would like to hand him his own podium: Age Roskam is a great colleague and someone who reminds me of myself. Age joined ITQ during the pandemic as an EUC
ACKNOWLEDGEMENTS
Chasing dreams and setting goals is what gets me energized. But dreaming and setting goals is one thing; keeping yourself motivated to make those dreams happen is another. One essential quality I always thought you need to achieve those goals is discipline. I always thought that discipline is something you have or you don’t. And, for a long, long time, I thought I lacked a bit of discipline. What I learned from my peers at ITQ (and a little bit from Yoda) is that there’s no such thing as discipline. It all has to do with planning and sticking to that plan. Prioritize the things you do and ask yourself if they are more important than locking yourself up in your office and working on your goal. Don’t get me wrong, I also like to put meat on my grill, drink a nice glass of Italian wine with friends, and meet up with family. But writing this sequel sometimes had a higher priority. This lesson (among other lessons) I learned from Francisco Perez van der Oord (one of my mentors at ITQ), and I am forever grateful for it.
Another special thank you goes out to Bertwin Oudenampsen, one of the best managers I’ve ever had the pleasure of working with and someone who fully supported my journey to get this book done. Just another reason to join the ITQ family if you’d like to take your career to the next level. ☺
One thing that was really cool after the first book launched was that people could buy it at the VMworld bookstore during both the US and EU events. My brother, Rick van Amersfoort, designed the cover for the first book, and because of the awesome work he did, it was an absolute no-brainer to ask him to design this cover as well. I think he took the original design to the next level and hid some easter eggs in it, which you might find if you look closely. Don’t hesitate to drop me a message on Twitter if you find one. ☺
Now, enough with all the introductory stuff; let’s dive into the world of EUC.
The VDI Design Guide had a couple of goals. I have spent most of
my career installing, configuring, managing, designing,
developing, supporting, updating, upgrading, and, quite often, breaking solutions that were used (and sometimes still are used) with a general purpose: empower an end user to do their work in the most efficient way. That’s also my view of end-user computing (EUC).
The second goal of the first book was to help increase the number
of VMware Certified Design Experts in the Desktop & Mobility
certification track (VCDX-DTM). I was the fourteenth person to achieve the certification, in 2016, and wanted to increase that number by helping architects understand the VCDX design methodology. Although it did help a bit (there are 19 certified VMware EUC architects as of summer 2021), we aren’t there yet.
The first book has sold thousands of copies, which means that a lot
of people are now familiar with the design methodology, and that might be even more valuable. Of course, I still hope the number of
certified architects will increase, but time will tell.
The third and final goal was to show the maturity of the Virtual
Desktop Infrastructure technology. I believe that goal has been
achieved, as well. I have seen so many different use cases that
were able to successfully run on a VDI without any issues or
negative User Experience (UX). This made me realize that although
it shouldn’t be a goal of a project, you could design a VDI for all of
your use cases.
In the first four weeks of the pandemic, IT teams did what they could to meet the sudden demand for remote work. All
resources, devices, old hardware, etc. were used to enable
organizations to continue their critical business processes, but with
a new form of flexibility in mind. IT people, and EUC specialists in
particular, were the corporate heroes of a new era in IT.
I strongly believe the end user has changed. This “new end user”
and the expected UX have probably changed, as well. While they might have been used to getting a Lamborghini to do their work with, they were forced to use a Fiat instead during the pandemic. And
it probably worked well enough for the essential business
processes to continue. I have conducted many interviews with
people in all sorts of organizations to understand how their
The second goal is to tell you about a variety of use cases. On the
previous page, I explained the increased number of use cases that
are capable of running successfully on a VDI. I will dive into those
use cases, as well, and share my experiences, design
considerations, best practices and new/creative ideas to get the
most out of your own VDI projects. Gamers, data scientists, and
video editors are some of those examples. The nature of their work
in many cases demanded a physical PC because of the resources,
operating systems, or applications. The maturity level has increased again and now enables those complex use cases to be virtualized.
As you may know, organizations like Microsoft and Google have benefited hugely from the pandemic. They were practically giving away Teams and Hangouts to enable organizations to collaborate while people were working from home. And this led to another shit storm. Many companies had invested in VDI but didn’t see the benefit of including graphics processing units (GPUs) in their infrastructure. With environments that were never designed to run video conferencing tools and every employee heavily depending on those tools, a sudden demand for either offloading those tools or adding GPUs arose. These “new”
requirements will for sure be covered. I say new, because we as
EUC specialists already knew this day was coming, just not that
soon.
Also, a lot has happened since 2018. VMware Workspace ONE has been thriving and has been included in nearly every VMware Horizon project I’ve been part of. VMware also acquired Carbon Black and heavily invested in its security stack. On the NVIDIA side of things, they completed their acquisition of Mellanox Technologies in April 2020 and are busy with their intended
acquisition of ARM. There’s a lot more happening, but I’m quite
sure we could cover the rest of the page with the stuff I’m
forgetting. The third goal is to inform you about these great
developments.
As you can read, the current world of EUC is in many ways a lot different from what it was before. We had a lot of fun declaring every year to be the Year of VDI, and I don’t think any year should be the Year of VDI, but the importance of a well-designed EUC solution like a VDI has never been greater.
HORIZON
I still remember that in the early days of virtualization we created our own hardware designs and had to incorporate things like growth. Hyperconverged infrastructures with linear scale didn’t exist, so we had to think about the ideal sizing and ratio between the different components. RAM and CPUs were quite easy to size, but when looking at networking, storage, and the scale of building blocks, it became a lot more complex. Now, I have had my fair
share of PEBKAC (Problem Exists Between Keyboard And Chair)
incidents and learned a lot from them. I think this is also where
most of the inspiration for this book came from. Although I believe
I’m starting to become a prehistoric mammal (I started my own
EUC career in October 1999, which I always refer to as the late
90s), there are always people in the EUC space who can be
considered a dinosaur. One of those people is my good friend
Spencer Pitts. Spencer has been part of VMware for many, many
years and spent pretty much all of that time in VMware’s EUC
business units. He started in VMware’s professional services
organization, and through various other roles is currently Chief Technologist for the Digital Workspace at VMware, covering EMEA. Over the past years, Spencer and I have had quite a few good conversations about VDI, and I always enjoyed his anecdotes about
the early days. I thought it was a good idea to share those
anecdotes with the rest of the world, so here goes.
Me: I like the fact that you talk about the home side of things,
especially now that a lot of people are mainly working from home.
Like you mentioned, you don’t want any hassle when stuff breaks.
Automation became pretty important, also in end-user computing.
What have you seen changing in that space?
Me: It's a bit like the Pets versus Cattle analogy, right?
Me: Now let's talk about VMware. You've been part of VMware
for like a gazillion years, but what was VMware to you prior to the moment they jumped into the end-user computing space?
And in the early days, people didn't trust it unless it was a PC.
You'd have a laptop, but it was still at that time when there were more PCs than laptops. Laptops were still a bit of a status symbol
for a lot of end users at the time rather than the actual PC device.
So, having a PC and plugging into a monitor, or a projector as well
(which I also used to watch football on when I got home). So, my
initial first kind of exposure to VMware is that I didn't want to
keep carrying around the PC. So, I used VMs. I'd use VMware
Workstation in a very, very, very early version, and I could use that
for testing, testing my images and everything. However, when I
used to show customers, they were like, “Hold on, so you're
building something virtual, what's this virtual machine thing?”
And some people didn't get it, right. So, it worked for me, and it
was great, but it took a while until customers got virtual machines.
Me: Nice. So, you moved over to VMware in 2007 you just
mentioned. Virtual Desktop Manager (VDM, which was released
prior to VMware View) was released in 2008. How was that year
prior to that initial VDI release?
Me: So, I heard rumors that the original broker was kind of based
on VB script and an Excel sheet as a database. Is that rumor true?
Spencer: Right, so the short answer is yes. But actually, what
happened with that use case is they started running it. And they
had an Excel spreadsheet that had some IP addresses in it. They
basically just assigned an IP address to a user. They connect to the
IP address with that username and password. This worked for a while, but they started getting problems when they got to about 150 users. I think it was at the point where people were forgetting IP addresses, and nobody was updating the master spreadsheet. People started connecting to the wrong machines and got annoyed. Remember that we didn’t have any non-persistency
included yet.
So that was just before I joined. I wasn't there. At that point, then
we'd actually bought a company called Propero, which is where
Matt Coppinger came along, but that was in stealth mode. It was
one of the very earliest acquisitions and they had their own
product called Propero Workspaces. That solution was based on a
Linux back-end architecture, as well. After the acquisition, the first
thing we did is we rewrote it all to Windows because we didn't
think customers wanted Linux-based appliances. ☺
Me: How did customers cope with managing them, and especially
when talking about large numbers?
Spencer: The thing about the early days is that we didn't think that
people would do it at scale. We weren't proactively saying to them
that they could use this for 10,000 machines. After a while, they
were asking us for those numbers. And I remember going through
it at a customer in the UK, which was a government customer, and
we were still in the days of Windows Vista. And they said to us:
“We love this VDM! We tested it. We got 50 machines or
something, and now we want to do this for 10,000 machines.” And
if you think about it as well, this was still back in the day where
SAN was expensive and shared storage hadn't come along that well,
either. And Vista wasn't very good, especially with storage. If
you're taking 50 Vista machines and running them, and then scale
them out to 10,000, the calculations of the storage size alone
required a shit ton of hardware. It wasn’t just about speed and
performance, but also because there was no deduplication yet.
If your machine was 250 GB, you’d have to times that by however
many machines you needed. There was no optimization in that
way. I mean, there was memory overcommit and things like that.
And, you know, the early new concepts of virtualization, but from
the storage side, there was nothing. So, to your point, what did we
tell customers? “Well, get your checkbook out, because you need
to go and buy a big fucking SAN,” was pretty much what we said.
That's what happened in the early days. Storage was killing our
ability to scale. And we used to lose a lot to Citrix for that very
reason, because we didn't do any app publishing, we didn't use
terminal services or the MetaFrame stuff or any of the RDS
services as you know them today. We literally had VM/Desktop
brokering, using RDP for remote connectivity. And we could also
Spencer: It wasn't for a lot of customers. But the other thing you
need to understand here is it wasn't like End-User Computing
teams were coming to us. The people that started VDI were mostly
the datacenter people that thought, “Hey let's get the desktops on
there, as well.” And they didn’t necessarily understand how a
desktop worked. So, in the early days I was having to go into
customers and explain to desktop guys about why VDI could be a
good thing. From a business case perspective, it was really tough
to make it stack up. So, what you ended up doing is you went for
some of the main use cases like remote workers. Because if you
think about what you needed to enable remote workers and
introduce flexibility, you could have a conversation. We knew it was
expensive, but these were the benefits you got out of it as well. So
yeah, it didn't come cheap. We were waiting, waiting, and waiting
until View Composer came along, together with compression at the storage level, and then suddenly the company’s storage costs were cut in half. It didn't sort the problem out, but it cut the storage costs in
terms of physical space for sure. But yeah, the early days, really
tough. The business cases were really hard to get over the line.
Spencer: It was the biggest thing that they wanted. They wanted
non-persistent VDI. So, if you take non-persistency, what that really meant was: I can reset your machine back to a golden image. What they liked about that was that it secured the hygiene around the virtual desktops. It's like PCs: you turn them on, and they've already gotten slower. It's kind of the way I look at it, and if you leave them on for any length of time, they get even slower. So, it's nice every now and again to reset them back. Every time somebody logs on there, it’s still quick. There are fewer support
So yeah, persistent versus non-persistent was drawn out of the fact that if it worked, it was non-persistent. People didn't moan, and you were able to tell them what they were going to have. Persistent
was also for people that had a voice. If they were loud, they were
important and thus they were the ones that normally got persistent
VDI. A bit like you had a PC or a laptop in the early days. If you
had a laptop, you were a little bit higher up the hierarchy. It's the
same with VDI. If you are the lowest of the low, you got a non-
persistent desktop because you couldn't do anything about it. And
you know what I mean by that.
Me: I still remember the early days of the first tablet PCs. There
was a Compaq tablet (the TC1000); it was one of the first Windows-based tablets with a pen. It worked really well, and I actually owned one. All of our senior management wanted one as a status thing. Managing those devices was a real shit storm. They came with a specific tablet edition of Windows XP
that was hard to deploy and hard to manage. But still, we needed
Spencer: Yeah, but in those days it was different. When they first
saw VDI, a lot of people thought it would be the death of the PC,
which I didn't think would be the case because history tells us that
there's always something that avoids that. Mainframes are still
there, which says it all. So, you're never going to fully get rid of
them all. There's always going to be these use cases. But the
mindset of “It’s the Year of VDI” means we're going to get
everybody onto it. But until we do, it won't be the Year of VDI,
which is why we joke about it, right? We didn’t think VDI would
be great for an X amount of use cases. We were just on this hell-
bent kind of thing to try and make sure there was no reason why a
user couldn't get into VDI. It's called fixation blindness. That's how
IT treated VDI back then. “You're going to be on VDI, you need to
have a valid reason why you're not going to be on that.” And nobody
would ever challenge IT. But then you had some failed projects
where it didn’t work, right? And, in the early days, as well. Don't
forget, you know, there were some use cases like VOIP, which
wasn't a big thing back then, but you did have it a little bit, which
was a bit problematic. And then you started getting into all of
these offloading things. And let's think about thin clients and zero
clients. Back in the day, you didn't have anything local because
then there's nothing to go wrong. It's a disposable item, there's no
data there. But, you know, now we're in this kind of hybrid
chubby client, not a thin client. I call them chubby clients, where
there's a lot of offloading and there's a lot of local CPU and
Me: Something else: When looking back at the last 15 years, what's
the most memorable use case you have either seen or deployed on
VDI and why?
Me: What’s really cool is that the golf use case will be covered later
on in the book, as well. Topgolf uses VDI for one of their latest
games. ☺
Let’s dive into the last topic. So, in your current role, you talk a lot
with C-level management and other important decision makers. If
IT admins or architects need to convince those people to innovate
and invest in their EUC platforms, what would be your main key
takeaway?
Spencer: I think the main point here is that this isn't just about IT
making their cost structure better. Because as I mentioned before,
in the early days, it was about saving money with this, even
though the business case of storage was a bit weird. This isn't just
about IT. This is about better user experience and flexibility, as
OK, so very hypothetically, 2020 could be the year. It’s also not strange that 2020 could have been the year. Besides the fact that the technology has really matured, it has also become very versatile. Both of those factors introduced new use cases to run on VDI. It still
doesn’t mean you need to build a VDI-for-All strategy, but you
certainly can.
Since the rest of the book will cover most of the current state of
VDI, let’s dive a bit into the future.
The VDI technology itself is state of the art. If you don’t believe
me, you should take a look at modern companies like NVIDIA,
Microsoft, and Google, who have built gaming platforms that
stream games to an endpoint through a remoting protocol.
NVIDIA GeForce NOW, Microsoft xCloud, and Google Stadia use
the same set of solutions to bring a “traditional” application like a
game to an endpoint and offer the same advantages over installing
the application locally on your device:
“After a nuclear war, only three things will still exist: cockroaches,
Twinkies, and Windows apps. Denying that is simply foolish.”
I strongly believe he’s right. I have met many customers and EUC
enthusiasts around the world who quite often are still using
applications in their organization that are over 20 years old. Many
of these apps are mission critical (I shit you not). And apparently, no real alternative is available from the cloud (like a SaaS application). Quite often they aren’t even suitable for a different
deployment technology, and there you have it: another reason that
forces you into technology choices which might be really cool
(because let’s be honest, VDI is cool) but unnecessary because of
Now, some of you might have seen me present at either one of the
VMworlds or at a VMUG and talk about the Formula 1 simulator
that I’ve built together with a couple of my ITQ colleagues and the help of VMware/NVIDIA. The idea of the project is to show potential customers who have misconceptions about VDI that it is capable of running even the worst apps with a great UX. You might ask why I think an F1 game is the worst. The reason is fairly
simple. The game we tested was F1 2018. The behavior of the game
can be compared to a VDI’s worst nightmare:
This use case (which I’ll cover in the use cases section) led to another one that was really cool and shows the future potential
of VDI as well. VMware launched a new project called Project VXR
OUTCOME-BASED APPROACH
In the VDI Design Guide, I covered quite a bit involving business
cases. In my opinion, every business case should Start with “Why”.
Why do you want to change? Change isn’t done easily and doesn’t
happen in a day or week. Change takes time and depending on the
type of change, it will cost effort – an effort that should not be
underestimated, either.
You see where this is going. I think the main reason for an IT Defined Workspace to be successful is that IT was pretty predictable. Nothing really changed that much, outside of operating system upgrades or things like major hardware replacements. One of the companies I worked for decided to combine desktop replacements with operating system migrations, and because most aspects of such a migration were also quite predictable, we kind of automated most of it. There were
scripts to power off an old desktop that completely backed up
important data from the old machine, stored it on the network and
I met Luke at a birthday party, and because of a shared love for sneakers (I was wearing a pair of limited-edition Jordan 1s, and so was he), we ended up having a nice conversation about the organization he worked for. They were a small municipality in the northern part of the Netherlands that for many years had been driving their business on the same unchanging demand from their customer: the citizens of Tatooine. What you need to know is that an organization like that is quite unpopular with graduates. They fully embrace bureaucracy, have a process for basically everything, and are commonly known for their slow way of embracing change. It’s mostly older people who work at such
organizations. Luke saw that many years ago and wanted to
change that image of his organization. His first challenge was to
reshape the outside image of the organization. This was about 10
years ago. People in the Netherlands need to pay a visit to city hall
for public services like requesting a new driver’s license or passport, announcing the birth of a child, etc. Most of the tasks involved in
these processes can be automated and offered to citizens remotely.
That’s exactly what they did. They were one of the first in the
Netherlands to introduce a self-service portal in which their
customer could manage a lot of their own services and schedule
appointments. This gave Luke the ability to let his employees
focus on improving the business processes instead of mostly
repetitive and administrative work. The overall image of their
organization slowly started to change because the citizens of
THE JOURNEY
The idea behind the model is that, based on the individual topics, you determine what your maturity is within each topic. If
you are already running the majority of applications from a
Microsoft 365 portal, it would probably also mean you have fully
integrated into a public identity provider and thus will be very
mature in that part of the model. But, on the other hand, if you don’t have a BYOD policy in place or have a hard time enforcing and reporting security compliance, you will need some more effort to move from that maturity level to the Digital Enterprise.
The idea here is not to treat everything as one massive project, but
to make the journey in smaller steps.
Apple does this a bit differently. With every new version of their
Mac Operating System, they introduce subtle changes so people
can slowly adapt to them. When I now touch an old OS X version (like Lion or Snow Leopard), I miss a lot of features that I use on a daily basis. So, again, applying changes in small steps will for sure
help!
For impatient people, it can be hard if you have to wait for changes
to happen. My wife, who works in HR, taught me a great lesson
about this.
If you change or improve 1% of the things you do per day, for a whole
year, you’ll end up thirty-seven times better by the time you’re done.
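For the mathematically inclined, that is simply compounding at work: 1.01^365 ≈ 37.8, which is where the “thirty-seven times better” comes from.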
Since the mid-90s, we (as in IT people) have worked with technologies to remote the end user into our company networks and application servers. Those technologies have always had a certain stigma or prejudice attached regarding the user experience, not from us,
but mainly from the businesses. Especially when talking about
such remoting technology for the masses. At my employer ITQ, we
have tried many things to convince customers and take away all
their prejudices. Running proofs of concept with their own
business-critical applications, showing the F1 simulator on VDI,
running remoted sessions from another continent to show how
higher latencies can still work -- we have tried it all. But nothing
has worked as well as the sudden work-from-home demand. For a
long time, I was wondering why that was, but I think I found an
answer.
You see, change is good. Even if it’s not good, it eventually will be
good. It’s easy to find disadvantages of change. If you break up with your spouse, it’s a massive change that might not seem right at first. After a few weeks or months or maybe a year, you will realize that, in that case too, change is good. And trust me, I was in that exact position many years ago, and look at me now. I found out that the very person I would eventually marry was my
There are multiple reasons why the world has changed so rapidly.
For instance, the technology itself is mature and capable of coping with such an insane scale. IT departments have changed and adapted to
become more end-user focused. Senior management is changing to
outcome-based monitoring instead of micro-managing employees.
What the pandemic has taught us is that user experience has also been subject to change. This is what has changed my own
mindset forever. I have been part of a lot of pre-2020 VDI projects
and the most complex part of those projects has always been
application related. The complexity is not only related to the best
delivery model or strategy for a certain application, but also to the
effect it has on the rest of the project. Imagine you have 500
applications that need to land on the new platform. You can
migrate 1 application per day, which leads to 1.5 years of work
until you have not only migrated all of the apps, but also the users.
Without all of their applications, end users can’t work, right?
That’s how we always ran our projects. We defined a list of
general apps, a list per department and a list per user. Based on
those lists, we migrated apps to the new platform and as soon as
all of the apps were migrated for the group of users, we migrated
the actual users, as well.
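The arithmetic behind that way of working is worth writing down, because it shows that the migration rate, not the platform build, dominates the timeline. Here is a back-of-the-envelope sketch in Python (the numbers simply restate the example above):

# Application migration timeline: the rate, not the platform, drives duration.
apps = 500
apps_per_day = 1                  # one migrated application per day
years = apps / apps_per_day / 365
print(round(years, 1))            # ~1.4 years, roughly the 1.5 years mentioned above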
Robert: I think I’m built that way… It truly gives me energy and
joy if I’m able to help someone or to feel that as a team we can be
successful. Furthermore, I truly love to see differences in people.
For instance, during my military service, when I was 20, I explicitly experienced the big differences between all the people in the company I was part of. At that time, I learned a lot about how to take people’s interests and preferences into account. In those days it was purely intuitive; later, I learned more explicitly via management trainings about the importance of diversity and how to increase success and joy by focusing on the talents of every profile and combining these different profiles in teams.
At that same time, ITQ was growing and looking for someone to help in the next phase. I have great respect and admiration for Francisco Perez van der Oord, both (co)founder and CEO at that time, for being able to reflect on what the company needed and on the fact that he should embrace a different responsibility. Being able to
Me: You have been in the seat of both customer and supplier. Is applying changes different in those seats?
Robert: Yes, as a supplier you can learn, apply and learn again
constantly. Inside a customer organization, normally the same
change doesn’t occur that often. However, another view is
ownership. Inside the customer organization you can really own
the change. As a supplier, you are always positioned as an advisor,
maybe implementer, but never the one that will be the owner
across the whole lifecycle of the change. Furthermore, inside the
customer organization, you are part of the culture. That is both an advantage and a disadvantage: it’s easier to read and feel the culture, but you are also part of that same culture and less capable of thinking and acting out of the box or including previous cultural experiences in the same type of change…
Me: How did the past 4 years impact your view on the people &
process aspect of End-User Computing?
Me: It’s hard to predict the future, but where do you see End-User
Computing going, from a people and process perspective?
If you think you can manage the change purely within the
IT arena, you are mistaken. The end-users in the business
are crucial stakeholders and you need an explicit approach
to manage the change that they are going to experience.
Just to reinforce the above words: a few years ago, we were not successful at all in a big project. The technology was working, but we managed the change from a technical perspective. By the time we found out the end users considered this change unwanted, it was already far too late. We didn’t get that feedback via direct communication. The resistance started as part of the informal talks at the coffee machine, then reached business management, who in turn talked to their management, and it finally found its way to the board of the company. No communication had yet been made to the IT project when the CIO of the company got the feedback at board level… You can imagine, this was a disruptive event for the image of this EUC project. It never got finished. Luckily, we learned our lessons, and since that time we have included an explicit approach, as explained via the three main principles.
If you want to know more about Robert and his work as the CEO
of ITQ, you can follow him on Twitter: @roberthellings or find him
on LinkedIn.
THE END-USER COMPUTING FAMILY
Where the VDI Design Guide primarily touches on the VMware Horizon Suite, this book covers some more components (from a
VMware EUC perspective) which might be necessary when
looking at advanced designs. VMware continues to innovate in the
VDI space, but when looking at the bigger picture, they are heavily
investing in the entire end-user computing portfolio. The bigger
picture here means the triangle between the identity, the device,
and the apps.
WHAT IS WORKSPACE ONE?
The short answer is that it’s the Digital Workspace Suite from
VMware, which includes a full set of solutions to manage the
identity, devices, apps, user experience, and security of the end
users. And it does so in a modern way. It’s basically VMware’s answer to Microsoft’s Modern Management strategy -- a “new” approach to managing the workspace of an end user. The reason “new” is in quotes is that it’s new to many organizations. Modern management itself isn’t new. Windows 8.1 was introduced with a new set of APIs that made it possible to manage devices running that OS like you would manage a smartphone or tablet. Modern management strongly focuses on self-service, continuous delivery of apps, simplified update mechanisms, and security based on the identity and context of the
end user.
In the first phases of the project that would follow, we were full of
confidence about the solution and the outcome that we were going
to realize. After finalizing the design and building the production
The idea is that every end user uses Workspace ONE Access as
their main entry point to the corporate resources, which can either
be internally hosted, cloud hosted, or a combination of both. The
main portal has a consistent user experience, independent of the
type of device the end user is running the portal from. A modern
browser which supports HTML5 should be sufficient. The
following figure shows the Workspace ONE Intelligent Hub from
a modern HTML5-based browser.
https://www.youtube.com/watch?v=MReIFlS8z00
Design Considerations
Deployment Considerations
One thing that you need to keep in mind is that the cloud version is hosted. If the service becomes unavailable, users won’t be able to authenticate with the platform, and thus won’t be able to access applications.
https://techzone.vmware.com/resource/workspace-one-
access-architecture#workspace-one-access-connector
• I’m not really sure why, but if you need VMware Horizon integration with Workspace ONE Access, you can’t use the modern connector software, which is on the
same version as Workspace ONE Access. As of this
writing, VMware Horizon 2103 and Citrix Virtual Apps
and Desktops support the legacy connector only.
• In case you are using the legacy connector for VMware
Horizon, it automatically means that you will use the same
connector for the on-premises AD sync as well.
• Workspace ONE Access has the ability to provision
ThinApped applications directly to a Windows-based
Sizing
https://docs.vmware.com/en/VMware-Workspace-ONE-
Access/19.03/identitymanager-connector-win/GUID-A401F9EA-
0BD5-42E3-BF62-41F278724C85.html
Identity Federation
One of the first things to configure after the initial setup is the Identity Provider (IDP). Where will your users primarily be authenticated? There are a great number of different IDPs that are fully supported. Which one to use depends on your situation. It’s even possible to use multiple IDPs in the same Workspace ONE Access, if needed. The following list shows some of the IDPs that are supported:
• Not every IDP works in the same way. Some use SAML,
some use other protocols. Check the documentation on the
IDP to know how you can federate identities to Workspace
ONE Access.
• If you’d like to federate an external IDP to access internal Horizon resources, be sure to include True SSO in your Horizon deployment (otherwise you need to enter your password again when accessing Horizon resources).
Conditional Access
For more information about the design and creation of those rules,
I’d highly recommend checking the following video:
https://youtu.be/h5XIu5K6Pes
ENDPOINT MANAGEMENT
When designing VDI solutions, does it really matter what
endpoint is used and how you manage the endpoint? Absolutely!
In the VDI Design Guide, I covered that topic quite thoroughly,
but not from a Workspace ONE perspective. As mentioned in the introduction of this section, VMware acquired AirWatch, which
opened up the ability to extend VMware’s focus within the EUC
space. Back then, AirWatch was the undisputed leader in mobile
device management, mobile application management, mobile
content management, etc. Their solution covered all aspects of
mobility management, with a major focus on doing all of the
previously named types of management, for a broad set of device and OS types. AirWatch always claimed to offer day 1
support for new versions of operating systems. A new version of
iOS got released? No worries, AirWatch offered same day support.
I don’t know if that form of support was unique to the world of
mobile device management, but I do know it helped AirWatch to
become the leader in this space.
After the acquisition, nothing really changed for a long time. For a
couple of years, AirWatch just remained AirWatch. In 2016, the
world saw the first integration possibilities between AirWatch and
VMware Identity Manager. The transition to VMware Workspace
ONE Unified Endpoint Management took two more years and was
finalized in 2018.
Is there any link with VDI? Actually, there is! For a long time, we
have been creating as many non-persistent desktops as possible. It
can save time, reduce complexity, and possibly increase the uptime of the
VDI. Now, non-persistency (especially when you look back at the
interview with Spencer Pitts) was primarily built to decrease the
footprint of the virtual desktops on the infrastructure and also
decrease the Total Cost of Ownership (TCO). And, quite honestly,
https://docs.vmware.com/en/VMware-Horizon-Cloud-
Service/services/hzncloudmsazure.admin15/GUID-E4675019-
DA01-4173-99EE-DC6E51CCCBC9.html
https://techzone.vmware.com/resource/horizon-active-
passive-service-using-stretched-vsan-cluster
With all the stuff that’s going on right now with UEM, my guess is
that UEM will also become the primary solution to manage virtual
desktops. At this moment, you aren’t able to enroll a non-
persistent desktop into UEM, but my estimate is that this will
come in the coming years. And if you think about it, it will
totally make sense for a number of reasons:
https://www.youtube.com/watch?v=SgxEa4Wc87o
Design Considerations
There isn’t that much that can be designed up front that will heavily impact your UEM deployment. There are a couple of things to take into account, though:
https://www.vmware.com/content/dam/digitalmarketing/
vmware/en/pdf/products/workspace-one/workspace-one-
editions-comparison.pdf
https://docs.vmware.com/en/VMware-Workspace-ONE-
UEM/2102/UEM_Recommended_Architecture/GUID-
CDC49AFA-98EF-4FBD-897D-
A561FACB9915.html#vmware-tunnel-and-unified-
access-gateway-tunnel-9
https://docs.vmware.com/en/VMware-Workspace-ONE-
UEM/services/GUID-AWT-FREESTYLE-
ORCHESTRATOR/GUID-freestyle-orchestrator.html
https://techzone.vmware.com/resource/workspace-one-uem-
architecture
INTELLIGENCE
I originally wanted to end the End-User Computing Family section
with Workspace ONE Intelligence, simply because it’s the glue
that sticks everything together. As you may know or have read,
most of the EUC solutions from VMware were acquired. That’s no
problem, but it does come with some challenges. The most
important one is the integration. Integrating acquired solutions
might pose a challenge if the solution doesn’t include a strong API
(or any API for that matter). That challenge didn’t really exist
when VMware acquired Apteligent in 2017. The Workspace ONE
portfolio with AirWatch and Identity Manager was good for most
of the use cases, but it lacked analytics and automation. This is
why Apteligent was brought in as the first step in building
Workspace ONE Intelligence. Apteligent was already built with a
strong API because of the nature of the solution. It needed to easily
integrate to gather data and be able to run analytics on it for a
variety of use cases such as mobile performance management,
business insights and application behavior analytics. The
https://marketplace.cloud.vmware.com/
You can filter for security solutions that are part of the Trust Network.
One of the cool things about DEEM is that it doesn’t just focus on monitoring; it also enables you to automatically remediate certain issues, like application crashes (based on the automation engine in
Workspace ONE Intelligence). I think this is a very welcome
feature in the EUC portfolio, although from a VDI-perspective it
does lack some telemetry data.
DEEM Considerations
If you’d like to know more about DEEM, check out this article:
https://docs.vmware.com/en/VMware-Workspace-
ONE/services/intelligence-documentation/GUID-
19_intel_deem.html
https://docs.vmware.com/en/VMware-Workspace-
ONE/services/intelligence-documentation/GUID-
21_intel_automations.html
https://docs.vmware.com/en/VMware-Workspace-
ONE/services/intelligence-documentation/GUID-
06_intel_install_connector.html
https://techzone.vmware.com/onboarding-windows-10-
using-command-line-enrollment-workspace-one-
operational-tutorial#_273303
https://techzone.vmware.com/resource/workspace-one-
intelligence-architecture
And this is exactly one of the things that needs to change. Because
hackers and cybercriminals don’t work in teams. They start by
compromising an identity or device and start moving laterally
through the network to eventually compromise workloads and
data.
https://www.cyberark.com/resources/blog/remote-work-survey-
how-cyber-habits-at-home-threaten-corporate-network-security
Intrinsic Security
The security officer who was part of the project brought to the team’s attention that the hospital wanted to comply with the NEN 7510 regulation, and he had already downloaded the ADMX files with which the hospital would immediately comply with the standard, without looking at any setting within that policy or understanding what it would mean when we implemented it.
When taking the intrinsic security approach, you bring tools like
Carbon Black into the desktop environment and both teams will
start working closely together because all alerts generated will be
Proactive
Where the average administrator would start to act after the shit
has hit the fan and then try to reverse engineer the attacks of
yesterday, the administrator will now start acting proactively to
prevent that attack on the system.
Built-in
The greatest thing of all is that the product, Carbon Black, is built
into the technology the administrator is using, whether it is a
network admin who controls the network traffic with NSX, a
server admin in charge of the workloads, or a desktop admin
who’s responsible for the devices and identities managed with
Workspace ONE.
https://youtu.be/LSU9HQopuIY
Deployment considerations
Let’s start with one of the most asked questions these days: Am I
going to the cloud, or do I stay on-premises? In my opinion, the
answer is simple. Always cloud, unless you have a very, very strict
no-cloud policy, and here is why:
https://www.ivandemes.com/automating-the-vmware-carbon-
black-local-mirror-configuration-for-windows/
Sensor considerations
VDI Considerations
Regarding VDI, there are a couple of things you have to take into account when installing the sensor on the various types of clones. As linked clones no longer exist in the latest versions of Horizon, I will focus on the considerations for instant and full clones, or non-persistent and persistent, so to speak.
Before you start deploying the Carbon Black Cloud sensors in your environment, you need to have a VDI policy in which you configure some essential settings for the virtual desktops, starting with bypass rules (exclusions):
**\Program Files\VMware\**,
**\SnapVolumesTemp**,
**\SVROOT**,
**\SoftwareDistribution\DataStore**,
**\System32\Spool\Printers**,
**\ProgramData\VMware\VDM\Logs**,
Next to the above bypass rules, it’s best practice to disable on-
access file scan mode and signature updates. The local scan
feature adds network overhead and increases resource utilization. The Carbon Black Cloud can pull reputation and enforce policy in real time from the cloud because most VDI environments maintain 99% uptime.
https://techzone.vmware.com/resource/antivirus-considerations-
vmware-horizon-environment
There are many more options to set, but the last one I’d like to point out is the “Auto-deregister VDI sensors that have been inactive for” setting. I would recommend enabling this setting only on a policy for non-persistent virtual desktops. Enabling it removes any clones from the management console that have been inactive for a specified duration. Check the following link for more information:
https://docs.vmware.com/en/VMware-Carbon-Black-
Cloud/services/cbc-sensor-installation-guide/GUID-D2BC3455-
B8EB-414F-A5FE-31D40C193ABE.html
https://techzone.vmware.com/integrating-workspace-one-
intelligence-and-vmware-carbon-black-workspace-one-
operational-tutorial
INNOVATIONS
As a nerd who spends a lot of time in his home lab, I have a big
interest in hardware and how hardware can best fit the
requirements you might have. Hardware can really make a
difference for things like user experience, total cost of ownership,
but also security and recoverability. I’m still amazed by the number of requests I get to run an assessment or health check on an existing VDI platform, only to find out that the platform hasn’t been designed correctly or optimally. And in this case, I don’t mean it should have been designed according to my own standards; the platform has simply been designed incorrectly when measured against vendor best practices. A great example is an organization that
contacted us to talk about Microsoft Teams integration into their
new VDI platform. Like every other company, they wanted to use
The customer got some advice from a certain company: since they wanted to achieve the highest density possible on a single host with mainly task users, they were advised to go for a card with a lot of framebuffer. The company recommended running two NVIDIA Quadro RTX8000s in a single host, as a single card has 48 GB of framebuffer and, combined, the two could offer 96 1B profiles to end users. This would mean that they could run 96 users on a single
host.
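As a sanity check on that advice, the framebuffer math itself is trivial, which is exactly the problem: it ignores CPU, RAM, encoder sessions, and licensing. Here is a minimal Python sketch of the framebuffer-only calculation (all values simply restate the example above; nothing here is a sizing recommendation):

# Framebuffer-only density calculation (illustrative, not a sizing tool).
def users_per_host(cards: int, framebuffer_gb: int, profile_gb: float) -> int:
    # Theoretical number of vGPU profiles that fit, based on framebuffer alone.
    return cards * int(framebuffer_gb // profile_gb)

# Two 48 GB cards with a 1 GB "1B" profile:
print(users_per_host(2, 48, 1.0))   # 96, hence the "96 users per host" claim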
By the way, all of the use case sections which you can find later in
this book include background information on the different
hardware choices and considerations. Those considerations are
based on the experience I gained by being able to test and validate
those use cases on the different platforms. I need to thank
companies like HPE, Supermicro, NVIDIA, Intel, AMD, and
VMware (obviously) for their support in testing and validating
those components.
CPUS
Central Processing Units (CPUs) have always been a hot topic. Am
I going to use a lot of cores or fewer cores? Do I need a high clock
speed or a lower clock speed? Do I need multiple physical CPUs in
a single system? What about Intel versus AMD? And with VMware investing in ARM and NVIDIA acquiring ARM, is ARM a thing? Questions, questions. The answer isn’t always that
simple, but the design methodology in the VDI Design Guide will
help you to narrow down the choices you have related to CPUs for
your specific use cases. Before we go a bit more into the
considerations, let’s dive into the trends and different choices first.
When looking at VDI solutions, the idea is to get the best density
while retaining a good user experience for the end user. Until a
couple of years ago, VDI was mostly used for desktop workloads
with a similar footprint. The footprint was mostly tied to your task
worker use cases as they could have a positive impact on your
density. Power users were always a challenge, especially when
looking at either single threaded applications or requirements for a
high clock speed. When Intel released the Skylake architecture
with the 18-core Intel Xeon Gold 6154 (3 GHz / 36 threads) in 2017,
this opened up the possibility for higher densities, even with
power users. For a couple of years, this CPU and its successor, the
Intel Xeon Gold 6254 (3.1 GHz / 36 threads), have been part of
almost all major VDI projects I have worked on.
Intel has always been a strong partner with VMware. I guess that
since Pat Gelsinger (VMware’s former CEO) announced he would
leave VMware in early 2021 to become Intel’s new CEO, that
partnership has only become stronger.
Intel and AMD remained partners until the mid 1990s. The
partners ended up becoming each other’s competitors, and in a long-lasting dispute, the US courts decided that AMD could still use the Intel x86 architecture up to the 80486 processor family. In 1996, AMD released its first x86 CPU that was fully developed in-house. It was called the AMD K5 and was available with clock speeds varying from 75 MHz to 100 MHz. True story: the first PC I bought with the money I earned delivering newspapers was actually based on this CPU. Mine ran at 90 MHz and was fast
enough to play games like Doom, Mortal Kombat, and Command
and Conquer.
For a couple of years, AMD was really able to compete with Intel
in the CPU market. Many architectures followed the K5.
Personally, I owned a K6 and an Athlon as well, before I moved
over to Intel CPUs, even though AMD had some really nice
features in their later architectures (such as Turbo Core technology
to switch from 6 cores running at 100% speed to 3 cores running at
200% speed) -- technology we currently see in most CPUs as a so-
called Turbo Boost.
I stepped out of the hardcore gaming scene for a while and lost track of what happened with CPU trends. While working for a system integrator from the mid-2000s to the early 2010s, I literally installed tens of thousands of PCs for all sorts of
customers. One of the things that came to me while planning this
section was that I never ever installed a PC with an AMD CPU
during that era. Intel made a couple of really good deals with
Before I tell you about that project, I want to emphasize that AMD
didn’t really leave the market. In fact, their CPU sales (and GPU
sales, since they acquired ATI in 2006) didn’t drop at all. The
acquisition of ATI brought a whole new market to the table.
Suddenly, AMD was able to deliver GPUs and, because of their long-lasting knowledge of the CPU market, even to design a hybrid CPU/GPU called the Fusion APU (Accelerated Processing
Unit). This new market was formed by major game console
manufacturers like Nintendo and Microsoft. The Xbox 360 and
Nintendo Wii used an ATI-based GPU and basically set a new
standard for game console GPUs created by AMD. Where Sony’s
PlayStation 3 used an NVIDIA-based GPU, the PlayStation 4, Xbox
For the task worker use case, we ran the benchmark 10 times per
host and ended up with the following metrics, averaged based on
the last run in which the density reached the maximum number
per host. For the task worker persona, every user had:
• Windows 10 1809
• 4 vCPUs
• 6 GB RAM
The following table shows the average duration per action and the differences between the two configurations.
As you can clearly see, the metrics are relatively similar. The thing
that completely stands out is the fact that the AMD-based
configuration has a much higher density (98 vs 68) until it reaches
a threshold. Up to these densities for both configurations,
applications responded quite well, and the user experience was
acceptable. As soon as more virtual desktops were loaded, the
CPU ready times began to increase on the physical hosts and the
performance started to drop significantly.
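To put those density numbers into perspective, a rough vCPU-to-logical-processor check is often the first thing I do when CPU ready times start climbing. The sketch below is purely illustrative: the 64-core, SMT-enabled host is an assumption for the example and not the exact benchmark hardware.

# Rough vCPU oversubscription check (assumed host specs, illustrative only).
def oversubscription(desktops: int, vcpus_per_desktop: int,
                     physical_cores: int, smt: bool = True) -> float:
    # Ratio of scheduled vCPUs to logical processors on the host.
    logical = physical_cores * (2 if smt else 1)
    return (desktops * vcpus_per_desktop) / logical

# The two observed task worker densities on an assumed 64-core host:
print(round(oversubscription(98, 4, 64), 2))   # ~3.06:1
print(round(oversubscription(68, 4, 64), 2))   # ~2.12:1
# Once the ratio exceeds what the workload tolerates, CPU ready times rise
# and the user experience drops, which matches the behavior described above.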
Power user
Next, it was time to run the same benchmark for the power users.
Alongside the task worker applications, we ran the benchmark for
the additional applications, as well. Due to the required resources
for the power users, we used the following configuration:
• Windows 10 1809
• 6 vCPUs
• 12 GB RAM
• vGPU T4_1B profile
Lastly, we ran the benchmark for the heavy power users. These
users run two applications (in addition to the apps the previous use cases run) that are both graphically and computationally
intensive and require a lot of resources. The single-threaded
application is a research/data science application which requires a
high clock speed, a GPU with CUDA support, and 8 vCPUs to
achieve the best performance from the application. The virtual
desktops for this benchmark used the following specs:
• Windows 10 1809
• 8 vCPUs
• 16 GB RAM
• vGPU T4_2Q profile
For the last three years, I've been a part of the CTO office of the
cloud platform business unit, which is responsible for the core
features and multi-cloud strategies. This is where I focus on
generating strategies for modern workloads and developing new
product concepts. Understanding customer direction is essential.
That means I have to recognize how the customer uses our
products today and how our products should align with their
technology strategies for tomorrow.
Going from the moment I saw vMotion for the first time to being allowed to develop future product concepts has been a journey I'm very proud of.
What's not to like? Well... it was one CPU package, that was for sure; however, under the hood lay two six-core Istanbul CPUs, with a HyperTransport (an AMD brand) cache-coherent connection between the two Istanbuls and an interconnect between the two sockets. That meant that there were now four NUMA nodes in a system with two sockets. With this concept, the IT world departed from the concept of a NUMA node being equal to a CPU socket. Twelve years later, and we still need to get used to this concept.
a result, ESXi showed four NUMA nodes and VM sizing needed to
align to the physical layout of the internal structure of the CPU
package to get the most consistent performance. Many
applications are not adequately NUMA optimized. As a result,
application performance boils down to VM sizing. Unfortunately,
this concept was confusing to customers who switched from Intel
and their Single Socket - Single NUMA node design.
But AMD believes this is the future, and even within the modern
EPYC architecture, they are sticking to their guns with their Multi-
Chip Module architecture. The first design was an authentic roller
coaster design. You can compare it to introducing someone who
was brought up with the most elegant system designed by
The world was not ready for this design. Operating systems are
designed with the notion that every CPU within the same socket
can access all the cache and memory with a uniform memory
access pattern. The VMkernel uses relational scheduling
optimizations and attempts to schedule worlds (vCPUs) on CPUs
that share the same LLC domain to take advantage of cache
latency instead of memory latency (10 ns vs. 70 ns). But now, with
the AMD EPYC, the LLC domain was microscopic compared to an
Intel system. Some customer systems were seeing massive
oversubscriptions on a tiny part of the CPU package. Other
operating systems had similar problems. VMware immediately
started to work on optimizations, and with each update, new
optimizations are introduced. AMD took some feedback to heart
and introduced a simpler architecture in their second generation
(Rome) EPYC. Instead of four NUMA domains per socket, there was now only one. Yet there were still 16 LLC domains, with four cores per CCX. And it's this layout that is detrimental to the
performance of VMs that go beyond the four cores.
Now the good news: the third generation has increased the size of the CCX; AMD bumped the core count to eight. Very interesting to see, and hopefully more people are getting used to this roller coaster ride and size accordingly. I'm sure they take the
recommendations in this book to heart.
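To make Frank’s sizing advice a bit more tangible, here is a minimal sketch that checks whether a desktop VM fits within a single NUMA node and whether its vCPU count aligns with the CCX (LLC domain) size he describes. The node and memory figures are assumptions for the example, not data for a specific EPYC SKU.

# NUMA/CCX alignment check (illustrative assumptions, not vendor data).
def fits_numa(vcpus: int, ram_gb: int, cores_per_node: int,
              mem_gb_per_node: int) -> bool:
    # True if the VM fits inside a single NUMA node.
    return vcpus <= cores_per_node and ram_gb <= mem_gb_per_node

def ccx_aligned(vcpus: int, cores_per_ccx: int) -> bool:
    # True if the vCPU count is a multiple of the CCX (LLC domain) size.
    return vcpus % cores_per_ccx == 0

# Assumed 16-core / 128 GB NUMA node and a Rome-style 4-core CCX,
# checked against the 6 vCPU / 12 GB power-user desktop used earlier:
print(fits_numa(6, 12, 16, 128))   # True: fits inside one NUMA node
print(ccx_aligned(6, 4))           # False: 6 vCPUs span two CCXs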
Me: You are one of the subject matter experts on core vSphere
features like vSphere HA and DRS. How have you seen the
different CPU architectures and innovations impact those core
features?
Me: What trends are you currently seeing in the field of x86 CPUs
and virtualization?
Me: Let’s take it back to VDI use cases. In VDI, we mostly see
homogenous VMs running on a host (because of how we design
desktop pools). When taking this homogeneity in mind, what
recommendations do you have when talking about CPU choices
and architectures?
Me: If people want to know more about NUMA and the impact of
AMD and Intel CPUs on virtualization, what sources would you
recommend?
https://youtu.be/VnfFk1W1MqE
If you want to know more about Frank and the stuff he’s doing,
follow him on twitter: @frankdenneman or on his blog:
What the future will look like, no one knows, but let’s dive into the
options first.
Let’s start with NVIDIA, first. I think NVIDIA did an excellent job
in continuously improving and innovating their datacenter GPUs.
They finally made a split in their compute-only accelerators and
the ones purposely built for graphical workloads. While
accelerators can handle both of those workloads perfectly fine, an
accelerator specific to graphics can benefit from a different
architecture compared to the compute ones (and vice versa). I
guess that NVIDIA’s focus area is more and more shifting from
gaming to AI. The release of their DGX platform (monstrous supercomputers used for computational workloads), the NVIDIA GPU
Cloud (NGC, an online repository where you can find Docker
images with entire AI applications), and the acquisition of
Mellanox are some of the examples which show that renewed
focus. I’m wondering what their idea was when they created the
CUDA framework. Was it just a fun project to run computational
instructions on a GPU? Or did they know back then that they
would shape the future of computational acceleration like it is
now? To top it all, NVIDIA announced an extensive partnership
with VMware during VMworld 2020. In the partnership they are
developing a solution to commoditize AI platforms with
NVIDIA’s GPUs, network interfaces, NGC platform, and
VMware’s VMware Cloud Foundation infrastructure platform
(VCF) and Tanzu platform for container/application management.
I’m a strong believer in innovations like this, because you want to
move away from customized, whitebox solutions which host your
mission-critical applications. The key is running them on an
enterprise-grade platform which can be supported in the same
way as your current virtual infrastructure. What does this have to
do with VDI? Not a lot. Although I do think we might see
innovations from this front coming to the VDI stack, as well. One
of the things which NVIDIA released in early 2021 is something
they call Multi Instance GPUs (MIG). MIG enables you to divide a
physical GPU into logical instances. vGPU does this too, but vGPU
does it only from a framebuffer perspective. The cores on a GPU
are scheduled through a GPU scheduler which resides on the
hypervisor layer. In the case of MIG, cores are directly tied to a set of framebuffer and won't be shared. The advantage is that with MIG, each instance gets isolated resources, which results in more predictable and consistent performance.
NVIDIA did release some cool new stuff for us VDI people,
though. In 2018, GPUs based on the Maxwell, Pascal, and Volta
architecture were widely available for different types of use cases.
Specifically, the Tesla M10, P4, P6, P40, P100, and V100 were
available for a wide variety of use cases. Not long after the release
of the first VDI Design Guide, NVIDIA announced their Turing
architecture which brought us the Tesla T4 (which is now called
the NVIDIA T4) and the Quadro RTX6000 and Quadro RTX8000
cards. Early 2021 brought the announcement of yet another lineup of products: the RTX cards have already been succeeded by cards based on the Ampere architecture, as has the T4, which remains the go-to GPU for VDI workloads until its Ampere-based replacement is generally available. The new density-focused Ampere card has the following specifications:
• 4 Ampere-based GPUs
• 64 GB of framebuffer
• H.264, H.265, VP9, and AV1 CODEC support
• 250 Watt power consumption
• 2x the encoder throughput of a single M10
• PCIe Gen 4.0 dual slot
3 of these will fit in a single host, which could potentially get the theoretical density to a blazing 192 users when using a 1B or 1Q profile! This will hopefully decrease the cost of VDI and create a GPU-for-all situation at an investment which is a no-brainer. The biggest competing solution for NVIDIA GPUs isn't the GPU of a competitor; it's just more CPUs. Quite often, customers choose to invest in additional hosts to absorb the overhead caused by encoding and rendering, which will certainly offer enough resources, but will also limit the user experience.
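The math behind that density claim is simple enough to put in a few lines. A quick sketch, purely based on framebuffer (in reality CPU, RAM, encoder sessions, and licensing will limit you long before that):

```python
# Theoretical vGPU density based purely on framebuffer.
# Real-world density is also bound by CPU, RAM, encoder sessions, and licensing.
cards_per_host = 3
framebuffer_per_card_gb = 64
profile_size_gb = 1          # 1B or 1Q profile

users = cards_per_host * framebuffer_per_card_gb // profile_size_gb
print(f"Theoretical density: {users} users per host")  # prints 192
```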
AMD
Intel
The other example was related to one of the latest projects I’ve
worked on. The primary goal was to move a team of video editors
to a VDI. Prior to the pandemic, the team was mainly working
from a central location at the customer’s HQ. After the pandemic,
they were spread across the country and experienced a massive
challenge in collaborating on their video projects. The customer
chose to move them to a VDI platform to enable them to
collaborate better. The first challenge was to virtualize their
workstations to the central platform. With just a crapload of
resources, that was fairly easy. The next step was to move their
project data to a so-called Media Asset Management (MAM)
solution and enable them to run their project workflows from their
virtual desktops, but with data which resides on the MAM
appliance. If I tell you that they normally work on 4K raw video
data, you can imagine what challenge we faced. Without spoiling
everything, the solution was also found in an insanely fast
network: 100 GbE network interfaces with a switch that could offer
enough bandwidth for the entire team to work simultaneously.
So, the question is, how is this innovation? Well, I don’t think it is
innovation from a datacenter perspective. But, since emerging use
cases like data scientists or solutions like VDI by Day, Compute by
Night might have a network constraint with (traditional) 10 GbE,
utilizing those technologies in a modern VDI stack can be
considered as innovation.
INDUSTRIES
The maturity of a Virtual Desktop Infrastructure platform has
reached a level in which I strongly believe it will be hard to find a
use case that can’t run on it. Sure, it will be hard to connect to a
VDI from the International Space Station or from the Mariana
Trench (although I think that if you have 11 kilometers/6.8 miles
of fiber cable attached to the sub, it should be possible). And
although some use cases might be suitable for a VDI, the outcome
is the most important thing. Choosing the right approach is
I was part of a presales team, and our main goal was to first see if
their use case was viable for a more modern way of management,
increase the availability, and reduce the potential downtime in
case of a failure.
If you think of a use case that needs rich graphics with a high
frame rate and a low latency, the first thing that comes to mind is a
gamer. Without going into a lot of details (you can find these in the
Gaming on VDI section), we built the F1 simulator on VDI. From a
rich graphics perspective, we could perfectly show how smooth
the game played in the virtual desktop (with 60 frames per
second). From a latency perspective, we introduced a USB-redirected steering wheel and gas/brake pedals. If there is any latency, you will notice it in a split second. So, being able to deliver a great user experience in such a game would make sense. We
showed the customer the F1 simulator on VDI, and after driving a
couple of laps themselves, they were convinced this would be the
right solution for the job. At first, it might not make a lot of sense
to build something like this. But it does have its use cases.
1. Success Criteria
2. Test procedure
It’s important to define a begin state. Quite often, the begin state is
an assumed configuration which will most likely run the use case
as expected, or nearly as expected. Obviously, this depends on
your experience with certain use cases. If you have a lot of
experience in the field of healthcare, chances are you can easily set
up a new healthcare proof of concept. If you have a lot of experience on the security side of things, piloting a use case for a government won't be that different from piloting one for a commercial organization that needs to protect its intellectual property through a VDI.
In this case, the begin state was a bit different. The customer
wanted to see how these media use cases differed from the other GPU-accelerated use cases. So, we used the virtual desktop
configuration from the other use cases as a starting point, which
was the following:
Physical hosts:
• 2 x Intel Xeon Gold 6248R CPUs (24 cores/48 threads, 3
GHz clock speed).
• VMware vSAN All-Flash storage with Intel Optane
o (2 x DC P4800X for cache, 6 x DC P4510 for
capacity)
• 4 x NVIDIA T4s
• Mellanox ConnectX-5 NICs (dual port 25 GbE)
Management components:
• VMware vSphere 7.0
• VMware Horizon 2012
• NVIDIA vGPU 12.1
Connection Protocol:
• Blast Extreme
• H.264 Encoder
• minQP: 10
• maxQP: 36
• 30 Frames per Second
Before you know where you are heading, you also need to know where you are coming from. This is why you should run the workflow a first time and validate that the test procedure can actually run on the begin state. Of course, running it a "first time" means going through the procedure in three iterations and recording the average of those three iterations. What I always do is document the steps and the result per step, followed by a general outcome of the entire run:
Test: Initial benchmark run with the begin state configuration
• Connect to the virtual desktop from your home location. Positive result: 35 seconds.
• Start Adobe Premiere Pro. Positive result: starts fine.
• Start a new project. Positive result: no issues.
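A simple way to keep that bookkeeping honest is to script it: run every step three times, record the measurements, and store the average as the begin state. A minimal sketch (the step names and timings below are placeholders, not the actual PoC numbers):

```python
# Record three iterations per test step and keep the average as the baseline.
# Step names and measured values are placeholders for illustration only.
from statistics import mean

iterations = {
    "Connect to the virtual desktop": [36.1, 34.8, 35.1],    # seconds
    "Start Adobe Premiere Pro":       [4.2, 3.9, 4.0],
    "Import 150 GB of local 4K media": [5.6, 5.3, 5.3],
}

baseline = {step: round(mean(values), 1) for step, values in iterations.items()}
for step, avg in baseline.items():
    print(f"{step}: {avg} s (average of 3 runs)")
```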
5. Suggest a change
It might be that the outcome of the last run doesn’t meet the
success criteria. In that case, you want to improve something that
could get you closer to that goal. There are different opinions for
change management and the application of changes. In my
approach I’m focusing on a single change at a time, the main
reason being that I want to be in control over the impact. If you
change multiple things like the number of vCPUs, a vGPU profile,
and Blast Extreme settings before starting another run, how will
you know which of the applied changes had which impact?
Looking at the outcome of the last run, my gut feeling said the VM had a serious vCPU resource constraint. We validated that by checking the performance statistics in vCenter and came to the same conclusion. The suggested change in this case was to add 4 additional vCPUs to the virtual desktop.
After applying the change, you start a run with the new settings
included and document the outcome.
Test: Second run: added 4 vCPUs to the virtual desktop
• Connect to the virtual desktop from your home location. Positive result: 34 seconds.
• Start Adobe Premiere Pro. Positive result: starts fine.
• Start a new project. Positive result: no issues.
• Import 4K media which is stored locally in the virtual desktop into the project. Positive result: importing a 150 GB video takes 5.4 seconds.
After the second run, we noticed that although the VM did have a
vGPU profile, the render process utilized a great number of CPU
resources. This quite often means that these processes haven’t been
offloaded to the GPU. Adobe Premiere Pro does support
offloading to a GPU, but this requires Quadro features. We also
noticed in Windows Task Manager that the framebuffer of the
GPU was fully utilized. Because of this, we suggested moving to a T4-4Q vGPU profile instead.
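Besides Windows Task Manager, you can confirm from inside the guest that the framebuffer really is the bottleneck by querying nvidia-smi. A small sketch, assuming the NVIDIA guest driver (and therefore nvidia-smi) is installed in the virtual desktop:

```python
# Check framebuffer usage inside the virtual desktop via nvidia-smi.
# Assumes the NVIDIA guest driver (and thus nvidia-smi) is present.
import subprocess

out = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total,memory.used",
     "--format=csv,noheader,nounits"],
    capture_output=True, text=True).stdout

for line in out.strip().splitlines():
    name, total_mb, used_mb = [x.strip() for x in line.split(",")]
    pct = 100 * int(used_mb) / int(total_mb)
    print(f"{name}: {used_mb}/{total_mb} MiB framebuffer in use ({pct:.0f}%)")
```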
After the change was applied, the third run showed serious
improvements.
Test: Third run: added a T4-4Q vGPU profile to the virtual desktop
• Connect to the virtual desktop from your home location. Positive result: 34 seconds.
• Start Adobe Premiere Pro. Positive result: starts fine.
• Start a new project. Positive result: no issues.
• Import 4K media which is stored locally in the virtual desktop into the project. Positive result: importing a 150 GB video takes 5.4 seconds.
• Play the media in the timeline. Result: audio and video are in sync, but frames seem to drop.
• Open up the MAM solution through a plugin in Premiere Pro. Positive result: opening the MAM solution takes 2 seconds.
• Scrub through the media. Negative result: impossible, a lot of frames drop.
• Import a 4K video from the MAM solution. Positive result: importing a 100 GB video takes 5 seconds.
• Create a transition between the local video and the remote video. Result: transition created in 2 seconds, adjusting the transition time
The video plays at 50 frames per second, but Blast Extreme is configured with a framerate limit of 30 frames per second. As a result, the next suggested change was to increase the maximum FPS to 60.
Test: Fourth run: increased the FPS limit of Blast Extreme to 60
• Connect to the virtual desktop from your home location. Positive result: 34 seconds.
• Start Adobe Premiere Pro. Positive result: starts fine.
• Start a new project. Positive result: no issues.
• Import 4K media which is stored locally in the virtual desktop into the project. Positive result: importing a 150 GB video takes 5.4 seconds.
• Play the media in the timeline. Positive result: audio and video are fully in sync without frame drops.
The final PoC run showed the result we were hoping for.
Test: Fifth run: changed the NIC to 100 GbE
RESULTS
The result was what we hoped for and perfectly showed that even a use case as complex as a video editor can work on a virtual desktop. Of course, there are some additional things to keep in mind:
CHAKRABORTY
Me: How have you seen UX and end user satisfaction change in
the past years?
Me: In this section, the customer wanted to run their video editors
on a VDI. Have customers surprised you with comparable use
cases in which the UX was beyond expected?
Me: Have we gotten the most out of the potential of the connection
protocols yet?
If you'd like to know more about Anirban Chakraborty, you can find
him on twitter: @anirbanzonly
HEALTHCARE DIFFERENT?
First and foremost, lives might heavily depend on IT systems in a
healthcare environment. Keep this in mind.
https://www.hhs.gov/hipaa/for-professionals/security/laws-
regulations/index.html
https://techzone.vmware.com/resource/vmware-horizon-
deployment-guide-healthcare
I’d like to dive into a couple of other examples of use cases that
required things like Workspace ONE Access or Workspace ONE
UEM to get similar result. The full EUC picture includes all
solutions in the entire Workspace ONE and Horizon Suite, but it
also heavily leans on the VMware SDDC suite. While this section is
dedicated to designing the ideal healthcare platform, I think it’s
reasonably impossible to cover everything in this section. I could
probably dedicate a whole book just on healthcare. I might even
do it when this book is finished. Let’s dive into a couple of
examples of healthcare-related EUC use cases first.
Unified Endpoints
If you think a solution like this doesn’t exist from a single vendor, I
have some news. Please don’t see this as a sales pitch, but a
solution like Workspace ONE UEM in conjunction with
Workspace ONE Access and Workspace ONE Intelligence can
seriously offer you the ability to introduce a single device
management strategy for all the different endpoints you might use
in a healthcare environment.
For Project VXR (which you can find in a later section) I worked
with the VMware team to test some things out with an Oculus
Quest, but the first Quest didn’t have proper device management
support yet. Everything changed when the Oculus Quest 2 was
released, and specifically the Oculus Quest 2 for Business. With the
normal Quest 2, you would need a Facebook account in order to
use it, but with the Oculus for Business models, a device
management solution can be used to manage the device, as well.
Workspace ONE UEM fully supports these devices. Without going
into a lot of detail (I don’t want to spoil all the fun of the Project
VXR section), the customer was able to fully manage all of the VR
goggles. The goggles are configured in kiosk mode, so patients
can just use the VR apps which are provisioned on them, and
that’s it. Like I mentioned earlier, the use case involving the
distraction of a patient during chemotherapy was successful and
has been fully embraced by the customer. Pain relief studies look
good, but no results have been posted by the customer yet.
The idea came from one of my EUC friends, Huib Dijkstra. Both customers had already invested in Workspace ONE Access but were still in the process of migrating to it so it could become their main authentication portal for people coming from the outside. The idea of Workspace ONE Access is simple: you sign into the portal, and when you open an application from the portal, your credentials are used to authenticate into the application. The SAML standard for identity federation enables an end user to work with single sign-on (SSO).
Now, Huib’s idea was simple. What if you could use single sign-
on between the two institutes and fully federate the identity from
one institute into the other? This would enable users to just sign in
to one platform and start applications at the other institute without
having to go through all of the authentication steps. It would then
also be possible to enable them to use a per-app VPN instead of a
device VPN and establish it when needed.
The following article from Peter Björk will help you build a similar
environment yourself. Although it’s already from 2017, it is still
very useful!
https://blogs.vmware.com/horizontech/2017/05/using-vmware-
identity-manager-transform-users-active-directory-domains.html
I don’t have to tell what this outcome did to IT. First, they
expressed a lot of disbelief, which luckily led into the quest for a
solution. Second, it gave us the opportunity to focus on helping the
customer solve this challenge.
There are multiple ways and an endless number of tools which can
help you to normalize application sprawl. What you want to
avoid, though, is to introduce point solutions to manage all of the
individual applications. This is where the full VMware EUC stack
steps in. All of the different types of applications can be
provisioned to an end user and in such a way that shadow IT
could become history. Here is an overview showing some of the
types of apps and the delivery method of the app:
The list is obviously a lot bigger. Instead of sharing the list and the
ideal delivery methodology, I’d rather guide you to my blog. In
Now run nvidia-smi.exe and check the result in the table. The
table shows an overview of all of the processes which utilize the
GPU. The processes could be marked with a C (compute), a G
(graphics) or C+G (compute + graphics) in the Type column. In
case a process is using the GPU for computational reasons, the
table will show either C or C+G.
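If you'd rather script that check than eyeball the table, nvidia-smi can also list just the processes that use the GPU for compute; anything it returns corresponds to a C or C+G entry in the Type column. A minimal sketch:

```python
# List processes that use the GPU for compute (the C / C+G entries).
# Assumes nvidia-smi is available inside the guest (or on the host).
import subprocess

out = subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
     "--format=csv,noheader"],
    capture_output=True, text=True).stdout.strip()

if out:
    print("Processes using the GPU for compute:")
    print(out)
else:
    print("No compute processes found; the GPU is only being used for graphics.")
```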
Now, for more mainstream solutions like Zoom and Teams, there
are other options, too. Sure, using a GPU inside the virtual desktop
is probably the easiest to implement and also the one that can be
used from a wide variety of endpoints. The only thing is that you
surely need to optimize the connection protocol for such a
workload. Optimizing unfortunately isn’t just done by selecting a
certain checkbox. What I would recommend here is to download
the Blast Extreme Optimization Guide. It will cover most of the
options which you can tune to get the most out of the connection
protocol for video conferencing solutions, accelerated through
GPUs. You can find the guide here:
https://techzone.vmware.com/resource/vmware-blast-extreme-
optimization-guide
You can find the optimization plug-in for Microsoft Teams and
related information here:
https://techzone.vmware.com/resource/microsoft-teams-
optimization-vmware-horizon
https://support.zoom.us/hc/en-us/articles/360031096531-Getting-
started-with-VDI
Monitoring
https://docs.vmware.com/en/Management-Packs-for-vRealize-
Operations-Manager/10/care-systems-analytics.pdf
As there is more than just Epic in the world, you might also be looking at a different solution for monitoring your EMR application. Another solution which provides out-of-the-box support for EMR solutions (in this case for many different ones) is the Performance & Application Monitor from Goliath. Goliath offers support for major systems like Epic, Cerner, and MEDITECH.
Application considerations
Security considerations
Hardware considerations
Endpoint considerations
What we’ve seen during the pandemic is that the risk of suddenly
having to cope with an enormous growth is real. Being able to
handle such an increase in the number of end users on a platform
requires an EUC solution to be designed with scalability in mind.
How to scale out depends on your use cases. You can imagine that
cloud services like Workspace ONE Access and UEM are relatively
easy to scale out. The only thing you basically need is a platinum
credit card. ☺ Your services are designed and built for scalability,
so that shouldn’t be an issue. I do recommend reaching out to
VMware to check if the type of service/tenant you consume is
capable of handling the theoretical maximum. When looking at
VDI use cases, it will be a bit different, though.
Scaling out is the easiest when your entire platform (hosts, storage,
and networking) is capable of scaling out. I covered that topic
quite thoroughly in the VDI Design Guide, so I won’t go that much
into detail here. However, there are some things we learned in the
past year which I would like to share.
• Scaling a VDI out sounds easy, but make sure you have the equipment ready. The pandemic taught us that during such an event, it can be hard to find hardware because big companies (such as the hyperscalers) might have contracts with suppliers which give them priority over your own organization. Which brings me to the next point.
https://youtu.be/nTYOPu6BShU
Me: You’ve been around in the EUC space for many years, how
did you end up in EUC?
Me: When looking back at the EUC side of your career, what was
your favorite EUC project and why?
Me: Hahahaha, let’s talk about your recent move towards the
security side of the business, why this interesting move?
Huib: Remember that picture from 2017 where a guy was mowing the lawn as a tornado approached his house? I loved the quote from his wife (Cecilia Wessels), who reportedly said "he was keeping an eye on it."
So, a lot has changed, and the topic of prevention isn’t as sexy as
delivering a new capacity to your colleagues. That combined with
how technically advanced attacks have become makes it a hard
topic.
The big debate in security is “to go to the cloud, yes or no?”. Big
data algorithms, AI and machine learning by nature work better
Me: If you could share one key takeaway regarding security and
EUC, what would it be?
• Assess
• Design
• Validate
• Deploy
• Migrate
First of all, they have an infinite demand for GPU power. Their
work mainly consists of analyzing data, creating deep learning
models to support the analysis and using these models to improve
certain processes within the organization. As the datasets are
exponentially growing, the resources required to train their
models are increasing, as well. The more horsepower you have in
When looking at the user/desktop count, you can see that between 8:00 and 17:00 there is high concurrency, but it quickly drops at 18:00. Between 18:00 and 7:00, the environment had a 10% concurrency.
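Those concurrency figures translate directly into capacity that sits idle overnight. A quick back-of-the-envelope sketch; the 10% off-hours concurrency comes from the assessment, but the desktop count is a made-up example:

```python
# Back-of-the-envelope: how much GPU capacity sits idle during off-hours?
# The 10% off-hours concurrency comes from the assessment; the desktop
# count is a made-up example value.
desktops = 500                    # hypothetical pool size
off_hours = 13                    # 18:00 to 07:00
off_hours_concurrency = 0.10      # from the assessment

idle_desktops = desktops * (1 - off_hours_concurrency)
idle_desktop_hours = idle_desktops * off_hours
print(f"~{idle_desktops:.0f} idle desktops for {off_hours} hours "
      f"= {idle_desktop_hours:.0f} desktop-hours of harvestable capacity per night")
```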
The output of the assessment showed that over 80% of the end users could benefit from a GPU, either at the application or the operating system level.
In the following figure you can see that VDI resources are still
available and also have the highest priority (for the same user
experience reason), but the majority of resources will be allocated
to the deep learning machines.
The concept itself sounds really cool, easily doable at first, but
along the way we found out that it’s not so easy to implement. The
main reasons are:
DESIGN CONSIDERATIONS
Because it had never been done before on a large scale, we basically had to invent the wheel ourselves. This is why I love being a VCDX. The VCDX
methodology helped me to figure out what the proposed
architecture needed to look like, based on requirements,
constraints, assumptions, and risks.
The world of data science is one that changes rapidly. Due to the
fact it has never been so easy to collect large amounts of data, the
data aspect will impact your network and storage, as well. Not
every data science team works with the same kind of data.
Financial organizations like banks and insurance companies might
work a lot with tabular data, while medical institutes might work
with medical images. The size and the amount will certainly differ
per data science team, so make sure to know what you are sizing
for. One of the teams I have worked with uses medical images
generated from so-called widefield microscopes. Every image that
is created varies between 500 GB and 2 TB in size and is stored on
an extremely fast and ridiculously large NAS. Running ML
training jobs on such data requires different networking hardware
in a host than tabular data that can easily be moved close to a
compute VM. Please consider the following when choosing the
right networking and storage hardware for the job:
VMware vSphere
The main reason for this is that vSphere wasn’t built from the
ground up to support accelerators, in any form. We can assign an
accelerator to a VM, but what happens if an accelerator becomes
fully utilized? Or what happens if an accelerator fails? In 2020,
VMware announced a tight partnership with NVIDIA to work on
the accelerator support and let’s hope that this will change, indeed.
It would be great if vSphere DRS will be capable of using
accelerator metrics to declare a VM to be a noisy neighbor or to
predict a failure and run a vMotion to move a compute VM from
one host to another. vSphere vMotion currently already supports a
vGPU vMotion, so why not extend DRS support for DRS Initial
Placement and Predictive DRS, as well? We will see what happens,
only time will tell. When looking at the vSphere Layer, please
consider the following:
https://kb.vmware.com/s/article/57297
VMware Horizon
https://flings.vmware.com/horizon-ova-for-ubuntu
• Python 3
• TensorFlow 1.15
• 16 GB of framebuffer
• 8 vCPUs
• 64 GB of RAM
I started with the build process, and these are the steps I took:
Me: When did you first hear about the idea of repurposing
compute resources?
Tony: The idea of VDI by Day, Compute by Night really took form for me in 2016, a few weeks after the GPU Tech Conference. That was my first GTC, and it really helped me to synthesize the concept of repurposing resources.
The idea was really simple, and at the core of what I’ve been
working on it remains simple. And that is, several years ago,
VMware had been championing any workload can be virtualized.
That means workloads such as high-performance compute (HPC),
machine learning (ML), deep learning (DL), and so forth that have
continued to grow increasingly dependent on GPUs can be
virtualized. And if they can be virtualized then they must adhere
to the same rules as other VMs. This also means that they could
use the same underlying infrastructure as other workloads.
If they (AI, ML, DL, or HPC) could use the same infrastructure as, say, VDI, then all I needed to do was turn the VMs on and off at the
Now for the history buffs, you’re probably thinking a lot of the
functionality wasn’t available for GPUs until the T4 GPU and
associated virtualization software in late 2017. And you’re right!
That is when it became possible to harvest spare resources,
because that release included the ability to suspend VMs with
vGPUs and release those resources back to the underlying host
along with some important placement and scheduling capabilities.
This allowed me to move it from a hypothesis floating around in
my head to a proof of concept (POC).
Me: Since you work at one of the biggest OEMs in the world, is my
assumption correct that you are seeing a bigger adoption of the
concept?
So, at this time it has not been adopted as fast as I want, but
looking at all the indicators, hang on for a wild ride in the coming
years.
I expect to see adoption in many of the areas where users are now remote and VDI/EUC is being used. They have all of these resources that are being consumed for 8 to 12 hours a day and then just sit doing nothing. On top of that, they have all this new (big) data on what the “new normal” looks like, and to gain and hold a competitive advantage they will need additional computing. I think resource harvesting is the best source for those computing needs. I think the ones I’m describing here will be larger retailers and mid-sized enterprises.
Me: Do you think other GPU vendors will jump into this space?
Tony: I’ve worked with some others in the industry around GPUs
and K8s specifically around scalability. While K8s expose a lot of
potential for many workloads, there are many more that can’t
easily make the transition to containers, specifically VDI. Because
of that I expect we will continue to see VDI as a workload that
shares many similarities with K8s but cannot necessarily be
converted to a true container-based workload.
Tony: Right now, it’s still a science experiment that’s being done in
a few folks’ garages and home labs, with some organizations
seeing potential in it and jumping in. It’s really in its infancy right
now and it’s a fun time to be part of it.
Where I’d like to go with it, and I’m getting there slowly, is to
build a basic packaged vApp for users to download with a basic
API and UI. This will hopefully allow it to move beyond a science
project to something more finished and useable by organizations
large and small.
All of these silos will be broken down again and I think resource
harvesting is one tool in the IT admin’s arsenal to do this. To do
more with fewer resources. Just remember to hold on, as this will
be a wild ride.
If you'd like to know more about Tony and his adventures as the Wondernerd, check him out on twitter: @wonder_nerd
ON VDI
A use case closely related to the VDI by Day, Compute by Night
section is the data scientist. Although a lot is already written about
it, I still want to explain a bit more about this use case and how
you would be able to support them on a VDI platform.
Due to the nature of their work, a data scientist is a use case that
could really push the limits of a VDI platform. They work with
data, sometimes a serious amount of data (multiple TBs). That data
quite often needs to be analyzed, which they do with applications
they either build themselves or which are slightly adjusted to their
work or industry. They utilize Artificial Intelligence frameworks
within those applications to accelerate the applications on GPU
hardware. And typically, those are consumer-grade GPUs or other
accelerators. In the same project as mentioned in the previous
section, I got to work with multiple different data science teams
and got a pretty good understanding of their goals, their
challenges, what drives them to get to work in the morning, and
how a VDI could for sure help them in getting the most out of their
workday.
Before we dive into the details, I think it’s a good idea to explain a
bit about Artificial Intelligence, Machine Learning, and Deep
Learning. These are terms which are quite commonly used within
data science departments and can mean the same to certain people
and completely different things to others.
The container needs access to the data. Like mentioned earlier, this
can be relatively small, but don’t be surprised if they need access
to multiple terabytes of data which will hopefully reside on a
This is the section with the longest title, and also the one that I enjoyed a lot in terms of research. Some customers just like a lot of
simplicity or simply don’t have the funding to build massive GPU
clusters to run their data science workloads on. The phrase “but we
already have GPUs for the virtual desktops, so why would I invest in a
separate GPU cluster?” is also quite commonly heard. And what
about “Yeah, but they are now working with cheap consumer-grade
GPUs, so it will only be better if we migrate them to a VDI with GPUs,
Based on the answers, you need to figure out the following things:
DESIGN CONSIDERATIONS
Supportability is key. Find out which combination of Linux,
VMware hypervisor, VMware Horizon, NVIDIA drivers, and
frameworks like CUDA, TensorFlow and PyTorch are compatible.
https://vhojan.nl/mastering-voodoo-building-linux-vdi-for-deep-
learning-workloads-part-2/
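When validating such a combination, it helps to capture the exact versions inside the Linux virtual desktop so you can hold them against the support matrices. A minimal sketch, assuming the NVIDIA guest driver is installed; the TensorFlow and PyTorch imports are optional and are simply skipped if the frameworks aren't present:

```python
# Print the versions that matter for the support matrix and verify that the GPU
# is visible to the frameworks. Assumes the NVIDIA guest driver is installed;
# TensorFlow/PyTorch are optional and may not be present in every image.
import subprocess

print(subprocess.run(["nvidia-smi", "--query-gpu=name,driver_version",
                      "--format=csv,noheader"],
                     capture_output=True, text=True).stdout.strip())

try:
    import tensorflow as tf
    # experimental.list_physical_devices works in both TF 1.15 and TF 2.x.
    gpus = tf.config.experimental.list_physical_devices("GPU")
    print("TensorFlow", tf.__version__, "GPUs:", gpus)
except ImportError:
    print("TensorFlow not installed")

try:
    import torch
    print("PyTorch", torch.__version__, "CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed")
```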
HARDWARE CONSIDERATIONS
The hardware considerations for data science use cases can be
quite similar to the ones of VDI by Day, Compute by Night. Most
of the workloads I have seen so far are really resource intensive on
most of the hardware resources you might want to virtualize.
GPUs
• Make sure you get the right GPU for the job. The right
GPU should be determined based on the framebuffer size,
the number of cores, and its computational capabilities
(such as double precision calculations).
• Determine the best way of assigning a GPU to a virtual
desktop. NVIDIA vGPU will offer the best flexibility but
could cause a bit of overhead. Passing through a whole
GPU will reduce flexibility but might increase the
performance of the accelerated process. I have seen both
but would surely recommend testing the various options
with a proof of concept.
• VMware vSphere Bitfusion might be an option to
virtualize a GPU. In this case, the GPU will be virtualized
over the network. The biggest advantage is that it’s really
easy to share the GPU amongst different end users. The
downside is that Bitfusion only supports CUDA
offloading. This means that if you use a GPU for both
graphics and computations, Bitfusion won’t be an option
(unless you want to use Bitfusion just for computations
and vGPU for graphics). If you want to know more about
CPUs
• Like the GPU, it’s important to choose the right CPU for
the job. Your end users will most likely run a virtual
desktop with a larger number of vCPUs than “normal”
users. 6 to 10 virtual CPUs is quite common.
• I have done some extensive testing with both AMD CPUs (EPYC Naples and Rome) and Intel Xeon CPUs, and due to (again) the NUMA boundaries, I found that when exceeding 4 vCPUs on an AMD-based system, performance drops significantly compared to the Intel ones as soon as you scale up the number of VMs on a single host. More about this can be found in this blog by Frank Denneman:
https://frankdenneman.nl/2019/10/14/amd-epyc-naples-
vs-rome-and-vsphere-cpu-scheduler-updates/
https://www.mycugc.org/blogs/tobias-
kreidl/2019/04/30/a-tale-of-two-servers-part-3
RAM
Networking
Me: You have been with VMware since 2007. How did you end up
with VMware?
Me: How did you get into the ML field and develop your interest
and passion for it?
Justin: Yes, you are right, Johan. VMware has been in a much
closer partnership with NVIDIA for over a year now though we
have worked with NVIDIA on vGPUs and other areas for many
years. We recognized that we have to be partners with the biggest
player in the accelerated AI/ML business and that today is
definitely NVIDIA. They sell not only the acceleration hardware,
but they also have vast experience with customers in both HPC
and ML and have built software frameworks, APIs, libraries and
containers that really jumpstart the data scientist and developer in
their work in ML. You can find a real treasure trove of tools and
platforms for this at the NVIDIA GPU Cloud (NGC) – a repository
for containers, Helm charts, applications, pre-trained neural
networks and lots of vertical-specific solutions like Clara for
Healthcare or Jarvis for language recognition or Isaac for robotics
applications. What NVIDIA saw in VMware was a leader in
Me: Why do you think the technology is ready now, and not, let’s
say, 5 years ago?
Me: If someone is new in this space and would like to get a jump-
start. What resources would you recommend?
Justin: Well, if you are interested in the programming side of
machine learning, there is a wealth of tools and libraries out there
to download and learn about ML. You can get hold of TensorFlow
and PyTorch with all their associated libraries fairly easily and run
them on your laptop or desktop. One flavor of TensorFlow runs on
mobile phones! There are also lots of example applications around
to look at and try them out. If you are at the very beginning, I
would look at the TensorFlow Playground application online just
to get a feel for what the various parameters of a neural network are,
Me: For anyone who would like to build their own effective
demos, what three takeaways would you have?
Justin: Just some guidelines here – there are outfits that will teach
you how to build effective demos
2. Keep the whole demo very short – stay within 2-3 minutes
if you can
Funny enough, they all claim the primary use case is user
experience (the game will run closer to gaming servers) and even
gamers with a lower bandwidth (around 10 Mbps) should be able to run
games in full HD with a good framerate. The reality is that the
choice to offer those platforms is really financially driven. Look at
Microsoft. Lots of people own an Xbox. They can play games on
the device locally. Microsoft already earns money on selling games
(as they claim that the consoles are sold without profit) to those
Xbox owners. The whole idea is that they can offer the same games
to other people, as well, as you don’t need an Xbox console to
stream their games from the hosted gaming platform. At the time
of this writing, an Android tablet or phone with an Xbox controller is sufficient. In 2020, almost 75% of all people in the world who owned a smartphone had one powered by Google Android. Sure,
not all of those phones are powerful enough to run those games
with a comparable performance to an Xbox console, but you get
the point. Billions of people are potentially new customers for
Microsoft and can stream games without buying them, but rather
just pay on-demand. And it’s not just games from Microsoft. You
can play all sorts of games on the platform, just like with Google
and NVIDIA (who don’t even develop games). This brings another
interesting shift to the table, a financial one. Gaming studios
normally invest in designing games for multiple platforms: Xbox,
PlayStation, Windows 10, macOS, Android, iOS, etc. Especially for
Android and iOS, Google and Apple earn money for every app (or
game in this case) that gets sold in their app stores. With the
remoted gaming platforms, that financial transaction is moved to
the platform owner instead. This is also the reason why (as of this writing) Apple still doesn't allow game streaming apps in its app store.
I can imagine that you might be a bit mistrusting about this, but
please don’t forget that gamers are one of the pickiest and most
critical users you can imagine. They enjoy gaming and heavily
invest in gaming PCs with powerful GPUs and fast CPUs. The
smoother the game runs, the better it is for the gameplay. And the
faster it runs, the lower the latency will be. And this is key.
Latency in a first-person shooter, for instance, will probably get you killed more easily. It's that simple.
I like to take the art of the possible into projects. Not just to show
the potential of a solution, but also to see if certain aspects of such
a futuristic showcase might actually be viable for a production
version of the solution. Building awesome showcases is one of the
things I like most about my job at ITQ, especially showcases which
Just for the sake of the experiment, the idea was to run the game
on a fast local datastore, so we went for an NVMe drive. The Intel
Xeon E5-2660 V3 has a base frequency of 2.6 GHz, which to us
seemed enough.
So, how do you translate the key principles into Blast Extreme optimizations? Blast Extreme has a large number of optimization settings, but only a few are relevant for high-quality use cases which require a high FPS. The following list is a good starting point. You can easily add the settings to the registry of the virtual desktop or set them in a GPO with the Blast Extreme ADMX template. The full registry path is:
HKEY_CURRENT_USER\SOFTWARE\VMware, Inc.\VMware Blast\Config
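The exact values belong in the optimization guide referenced below, but as an illustration of what applying such settings could look like, here is a sketch using Python's winreg module. The value names (H264minQP, H264maxQP, EncoderMaxFPS) and the REG_SZ type are assumptions on my side, derived from the kind of begin-state settings described earlier; verify the names, types, and recommended values against the Blast Extreme Optimization Guide and the ADMX template before you use them.

```python
# Sketch: write Blast Extreme tuning values to the registry path mentioned above.
# Run inside the virtual desktop on Windows. The value names and the REG_SZ type
# are assumptions; verify them against the Blast Extreme Optimization Guide.
import winreg

BLAST_KEY = r"SOFTWARE\VMware, Inc.\VMware Blast\Config"

settings = {
    "H264minQP": "10",       # quality floor for the H.264 encoder
    "H264maxQP": "36",       # quality ceiling (lower means better quality, more bandwidth)
    "EncoderMaxFPS": "60",   # allow up to 60 frames per second
}

with winreg.CreateKeyEx(winreg.HKEY_CURRENT_USER, BLAST_KEY) as key:
    for name, value in settings.items():
        winreg.SetValueEx(key, name, 0, winreg.REG_SZ, value)
        print(f"Set {name} = {value}")
```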
The above settings are a good starting point, optimized for gaming and high-quality multimedia use cases, but there is another option if you want to push things even further. You have the
option to go for a different codec called High Efficiency Video
Coding (HEVC). This codec increases the display quality and uses
H.265 instead of H.264. Switching over to H.265 can also have a
positive impact on bandwidth consumption, but it requires an
endpoint that is capable of decoding it, as well. If you have such an
endpoint, please note that decoding H.265 will probably require
more resources on the endpoint.
https://rdanalyzer.com
If you want to find out more about Blast Extreme and how to
optimize it for your use cases, check out the following guide:
https://techzone.vmware.com/resource/vmware-blast-extreme-
optimization-guide
It’s quite hard to show the result in a book, but one of the coolest
things happened a few months after we finalized the build. I’m
part of a team that organizes a yearly EUC conference called VMware EUC TechCon (vEUC TechCon). The team thought it
would be awesome to have a Playseat at the conference so people
could sit down and experience the performance themselves. One
of those people was my friend Brian Madden, who had just moved
to VMware and was about to present the keynote at our event. He
played a couple of laps on the simulator and was amazed by the
UX. There’s a short clip of him playing on YouTube where you can
see this for yourself:
https://youtu.be/FzAwvqOM_gc
https://youtu.be/kd77JdfZZtY
CONSIDERATIONS
Building a new VDI platform is mostly an IT-driven project. I
always like to have a little fun while working on such a platform.
One of my colleagues once built an Unreal Tournament ThinApp
package, just to test Horizon, GPUs, and Instant Clone pools in a
massive multiplayer session with the entire IT team. If that’s not
an example of having fun in such a project, I don’t know what is!
☺
https://alain.xyz/blog/comparison-of-modern-graphics-
https://www.nvidia.com/en-us/data-
center/resources/vgpu-evaluation/
People don’t want to wait for services anymore. That’s why self-
service has such a big impact on society. While the elderly might have
a hard time using self-service portals from organizations like
utility companies, the government, or financial companies,
millennials basically grew up with this and can’t imagine a world
without. The same holds true for a consumption-based model. I
still own hundreds of music albums because I love to listen to
I guess that this will be extended to gaming even more. At the time
of this writing (March 2021) there is a major shortage of GPUs due
to various reasons. The price of Bitcoin and other cryptocurrencies
exploded (which made it profitable for miners to start mining
Bitcoin and others again), the pandemic has caused production
delays in China, and the introduction of raytracing in cards from
both NVIDIA and AMD has caused an explosion in demand for
GPUs. I have seen prices double on the black market and at sites
like eBay. Why would you still invest thousands of dollars if you could run a game just as well on a cheap endpoint with fewer
resources? You then don’t have to replace your GPU every two
years because your favorite game requires it. And best of all, you
just pay for the remoted game services as soon as you start
playing.
Me: When was the first time you got introduced into remoting
technologies such as those from Citrix, VMware, and Microsoft?
Christian: I'm going to think back here and say that the first time I
encountered Citrix technologies was in the early days of
WinFrame - so probably 1996. I had joined Bechtel in 1995 and was
part of the Infrastructure Engineering team in the UK. Even back
then, Bechtel had a lot of locations due to the engineering &
construction project-based nature of the company, and so we used
WinFrame to centralize certain applications because even on an
office LAN, some of them didn't work very well (as they had been
designed to run literally on a local machine). Of course, like most
other enterprise companies around that time, we were big on
Microsoft technologies, so I got introduced to Windows NT and
the TSE (Terminal Server Edition) around a similar time period,
maybe a year or so later. I guess my real love of Citrix inside of
Bechtel came in around 2007/2008 when we rebuilt our entire
global infrastructure (for a 60,000 person company) and delivered
pretty much all our apps on a top-to-bottom Citrix stack. That
became a really cool story for both companies. Hard to believe it
was 14 years ago.
Me: Was this just a fun showcase for Citrix, or did it have a use
case, as well?
Me: Did it just work out of the box, or did you have to do some
tweaking and tuning to get the best user experience?
Me: During your time at Citrix, have you seen customers build
VDI platforms to run similar applications (games, simulators) on
their platform?
Christian: Yes, 100%. If you think about it, there’s probably more
money (margin) for the big companies in gaming subscriptions
than in selling hardware. There’s also a massive advantage in
being able to bring new games to market and update existing ones
without having to manufacture them onto DVD or, even, have
customers wait to download them in full to the storage on their
existing consoles. As we’ve seen with things like Minecraft,
Roblox, Call of Duty and tons of other titles, the world is moving
to multi-player, collaborative, immersive experiences. That’s also a
lot easier to deal with server-side.
Me: Let’s get back to F1. What’s your favorite racetrack and why?
Me: If you have one key takeaway to share on this topic, what
would it be?
Christian: One takeaway…I would say from what we’ve seen over
the past few years in these trends, the challenges we’ve seen from
global remote working during COVID, and the ways in which
we’ve been able to push the technology to deliver rich experiences
from “cloud” should tell us that the future of fully immersive
collaboration platforms that connect people for work and play is
pretty much here. The next two or three years will bring us even
more cool stuff, such as Microsoft Mesh, which will allow us to
Me: How long have you been working with VDI solutions like
VMware Horizon?
Me: Without any further ado, what awesome use case have you
built?
Me: Have you ever come across such a use case before?
Scott and Jason: Over the past few years, we’ve been able to
confirm that our use case is entirely unique. As of this writing,
there are no other deployments on the planet with our use case,
software stack or hardware. That’s both a blessing and a curse,
because while it makes troubleshooting issues rather difficult, it
also makes creativity in design a necessity. It’s been a challenge,
Me: How did you come up with the idea to remote the application
through VMware Horizon?
Me: What challenges did you face while building the platform?
Me: Your end user isn’t a colleague, but a customer. How is this
different?
Scott and Jason: Deploying the solution using VDI has incredible
advantages including supportability, scalability, and availability to
name a few. The average Topgolf venue has 102 hitting bays and
Scott and Jason: For Topgolf, the advantages have far outweighed
the disadvantages. The maturity of NVIDIA integrations with VMware has been a major disadvantage, but we are starting to
see those integrations become more enterprise ready in recent
releases. NVIDIA’s architecture created challenges with licensing,
instant clones, DRS limitations and more. Partnering with NVIDIA
and VMware to solve those issues has proven invaluable, and
we’re starting to see a major shift towards GPU in the enterprise
because of those partnerships. Barrier to entry is also a
disadvantage. Licensing and hardware required to deploy a
solution this complex can be very expensive – and even cost
prohibitive for some customers. Finally, getting people who can
understand the complexities of our platform on support calls has
been a major disadvantage. While it’s getting better, there are still
communication challenges between partners.
Scott and Jason: We’re already seeing GPUs being delivered with
cloud computing today. In 2019 NVIDIA built partnerships with
Microsoft to deliver AI and ML solutions that leverage NVIDIA
GPUs, and Amazon followed in 2020. Gaming is a multibillion-
dollar industry, with revenues projected to surpass 140 billion
dollars in 2021. It’s very likely that the use case for VDI, and even
containerized GPU gaming will become more common. We’re
seeing the first steps in a complete overhaul in game delivery right
now, in fact. NVIDIA launched GeForce Now, Google launched
Stadia, and Microsoft is launching Project xCloud soon.
There aren’t many companies that have a need to leverage VDI for
gaming the way that Topgolf does, delivering experiences to
We think that 5G will also play a huge role at the edge and
provide new opportunities to interconnect locations around the
globe that would normally have latency issues. We’re seeing an
influx of VR experiences like Zero Latency, and as those
experiences scale and expand to new markets around the world
there would be operational advantages to virtualization, not to
mention the advantages virtualizing those 3D gaming experiences
offers instead of housing the compute on the player's back.
Me: Do you have any plans to extend the platform with other use
cases or applications?
Scott and Jason: Absolutely! Whether we can talk about them here
is another story. We’ve got to maintain that competitive
advantage! Thankfully, Topgolf leadership fosters and develops
innovations and creativity. One of our Core Values is Edgy Spirit,
and it’s all about having curious intelligence and pushing the
limits of what we think is possible. We’re already using parts of
the technology to run AI and ML workloads, we’re integrating
other venue systems into the hardware stack, and we’re testing
several opportunities with other guest-facing systems in the venue
and new proprietary technologies that will continue to evolve the
guest experience at Topgolf. Stay tuned!
https://topgolf.com/us/play/games/
If you'd like to know more about Scott and Jason, you can find them on
twitter: @stforehand and @jasonsova
In the section about the Healthcare Workplace, you may have read
that at one of my customers we are working with Virtual Reality
(VR). Their use case of pain relief and distraction during
chemotherapy is a perfect example of how the demand for devices
is changing. In this case, several researchers had demand for VR
devices to run certain apps on. Based on the three ways to handle
those demands, IT has multiple options:
This is what Project VXR from VMware is all about. VMware has a
department called xLabs. It is led by the office of the CTO and
focusses on technologies even before they become emerging. ESXi
on a Raspberry Pi4, Bitfusion and native K8s support, and SD-
WAN for LTE are examples of such projects. Another one is
Project VXR. At VMworld 2019 in San Francisco, the team behind
this project demonstrated this project for the first time. Project
VXR focusses on bringing technologies like virtual reality and
augmented reality (AR) to the enterprise. Where my customer use
case is just a simple example, VR and AR are being used by many Fortune 500 companies. Several major airplane builders use AR for quality assurance purposes, for instance. Through AR, an
engineer can see inside a plane what screw or tiny cable should be
https://www.youtube.com/watch?v=BLnojMZ_gOo
LANDING ZONE
Like on any other device, a standalone HMD has an interface
which lets you navigate through different apps, configure the
device, and browse through an app store. I’ve conducted a lot of
research with the Oculus Quest and in regard to the menu
structure, the settings, and the overall UX, I’m quite satisfied. It’s
very easy to navigate through the virtual world, but the interface is
proprietary to the Oculus Quest. The main challenge with that is
related to support for other platforms. Every type of device can
come with its own user interface. What you would ideally like to
offer to those different devices is a landing zone that is consistent
and similar for all of them. Within the landing zone, you would
like to provide (conditional) access to corporate applications.
Sound familiar? The first use case for Project VXR is basically a
VR version of the Workspace ONE Intelligent Hub. Kind-of. After
starting the VXR application, you will be transferred to a room
with a view over a valley. The room offers access to the actual
Workspace ONE Intelligent Hub and will show specific VR-
capable applications. Those applications are the ones you can publish through VMware Horizon, and they can be accessed from the VXR application if you are entitled to them. More about those
remote apps later. You might ask yourself why the native browser
One of the best features of the VXR app is the ability to SSO
through the different apps after you have initially signed into
Workspace ONE. In my lab, I have managed to federate
Workspace ONE Access with Google Workspace and my local
Active Directory. As a result, I am now able to use my Google
identity to sign into the Workspace ONE Intelligent Hub and SSO
into my VXR remoted applications without reentering my
password.
After enrollment, you now have the ability to fully control the
device. You are able to remotely configure it, push security
policies, install applications, and monitor the device.
VR APPLICATION REMOTING
Managing an HMD, publishing apps to it, and SSO to these apps
through Workspace ONE Access is really cool. Don’t get me
wrong here. But, when I saw the first remote rendering demo from
the VXR team, I was blown away. Not because it just looked cool,
but mostly because one of the issues which always held me back
from investing in an HMD, was the fact that powerful ones require
a cable attached to a really beefy gaming PC to give the best UX.
That, quite honestly, is a bit of overkill for most of the applications.
Besides the lack of mobility, the alternative HMDs are also
relatively expensive with prices varying between $500 and $2000. I
have played around with the first Oculus Quest for over a year
now and most of the applications run really well. The native ones
that have been built for the Quest are fun and show a great
performance on an HMD which basically is equipped with a
similar CPU and GPU as a smartphone. Now, it’s the high-
performance VR apps that still require a gaming PC which can’t
natively run on a relatively cheap HMD.
Sure, the Oculus Quest comes with a USB-C cable which can set
the HMD in link mode and treat the HMD like an Oculus Rift, but
you are still losing mobility and don’t want to walk around with a
laptop in your backpack (attached with the cable to the HMD).
Was the link cable the only way to provide access to the apps on
the gaming PC? Well, this book isn’t called the VDI Design Guide
Part II for nothing! ☺
When I first met with Matt Coppinger (director of the VXR team),
they had some different options to bring those applications to the
HMD. The obvious option was to use Blast Extreme or PCoIP, but
http://www.yellow-bricks.com/2020/01/02/seeing-green-only-on-
your-hmd-when-using-alvr-to-stream-an-app/
CloudXR
Sure, you would probably be able to pull the same thing off with a traditional remoting protocol (from a displaying perspective), but those aren't optimized for remoting VR, which makes them pretty much worthless for it. But remotely displaying a SteamVR menu, which is quite static, isn't where all the magic happens. Traditional remoting protocols were initially built for your office workers and, as you have read in this book, can also be tweaked and used for a wide variety of powerful and graphically intense use cases. Unfortunately, VR isn't one of them. Where they lack usability for VR is in how they are optimized to handle latency. As soon as the latency increases, a combination of dropping frames and reducing the image quality occurs. And that's perfect for most use cases. Reducing 60 FPS to 50 FPS is fine, even for most gamers who experience a bit of latency. And don't forget that even with latencies up to 100 ms round trip, 50 FPS is really good.
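To put some rough numbers on why that is acceptable for a desktop but not for an HMD: at 50 to 60 FPS, a new frame arrives every 16 to 20 ms, while a commonly used rule of thumb for VR motion-to-photon latency is roughly 20 ms (that target is general VR guidance, not a number from this chapter). A quick sketch:

```python
# Rough latency budget comparison: desktop remoting vs. VR remoting.
# The ~20 ms motion-to-photon target for VR is a general rule of thumb,
# not a figure from this chapter.
def frame_time_ms(fps):
    return 1000 / fps

network_rtt_ms = 100
vr_motion_to_photon_target_ms = 20

print(f"Frame time at 60 FPS: {frame_time_ms(60):.1f} ms")
print(f"Frame time at 50 FPS: {frame_time_ms(50):.1f} ms")
print(f"Network round trip:   {network_rtt_ms} ms "
      f"({network_rtt_ms / vr_motion_to_photon_target_ms:.0f}x the VR target of "
      f"{vr_motion_to_photon_target_ms} ms)")
```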
https://itrinegy.com/virtual-appliance-network-emulators/
For the second WAN test, I wanted to actually see what a true
WAN did. My neighbors are connected through the same ISP and
In the final test, I wanted to see what CloudXR does when both
bandwidth and latency are constrained. I configured my phone for
tethering my internet connection and so remoted the application
over a 4G internet connection. In this case, my 4G connection had
around 30-40 ms latency, but lacked a serious amount of
bandwidth. The bandwidth varied between 6 and 15 Mbps,
which is far below the minimum requirements for a low-quality
UX. As a result, the entire application was unusable. I couldn’t really see moving parts or control anything. What did work, however, was looking around. The virtual environment I was in seemed to be preloaded, and because of that I could look around and see the walls, the floor, and the ceiling with a reasonable amount of detail. Again, this was something that surprised me.
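If you’d like to reproduce this kind of constrained-link test in your own lab without depending on your mobile provider, a Linux box routing the traffic can emulate it with tc/netem. Below is a minimal Python sketch under my own assumptions: the interface name and the delay/rate/loss values are placeholders, and the script needs root privileges and the tc tooling installed.

import subprocess

IFACE = "eth0"  # assumption: the interface facing the HMD/client network

def run(cmd: str) -> None:
    """Run a tc command, echoing it first so you can see what gets applied."""
    print(f"+ {cmd}")
    subprocess.run(cmd.split(), check=True)

def emulate_4g(delay_ms: int = 40, rate_mbit: int = 10, loss_pct: float = 0.1) -> None:
    """Roughly mimic the tethered 4G link from the test above."""
    run(f"tc qdisc replace dev {IFACE} root netem "
        f"delay {delay_ms}ms rate {rate_mbit}mbit loss {loss_pct}%")

def clear() -> None:
    """Remove the emulation again."""
    run(f"tc qdisc del dev {IFACE} root")

if __name__ == "__main__":
    emulate_4g()  # apply the constraints, run the CloudXR session, then...
    # clear()     # ...remove them when you are done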
https://developer.nvidia.com/nvidia-cloudxr-sdk-early-access-
program
Now, you might ask yourself how all of this ties into VMware
Horizon. The simple answer is that as of this writing, it doesn’t.
But Project VXR is a VMware-led project focused on
delivering VR and AR applications to HMDs. So, purely out of
speculation, I think something might be coming. But what could it
be?
GPU resources
Network considerations
https://www.xrspace.io/us/headset
If you’d like to read more about the tests that have been conducted for Project VXR, please take a look at the following site:
https://pathfinder.vmware.com/activity/projectvxr
On the site, you can see some videos and download a whitepaper
that contains most of the tests which have been conducted.
http://www.yellow-bricks.com/tag/vxr/
https://www.vmworld.com/en/video-
library/search.html#text=%22spatial%22&year=2020
In 2014, we met for the first time. I just started working at ITQ as
the first EUC consultant and got invited for a VMware
Professional Services training week in Staines, UK, at the former European HQ of VMware. That is probably the best
training course I’ve ever been in. Matt organized it, together with
Spencer Pitts. The room was filled with approximately 20 VMware
employees, and me. I still have no clue how I ended up there, but I
am forever thankful that I was able to attend it. That week sparked
Me: You have seen the VDI market grow to what it is now. Did
you ever expect it to become so successful?
Matt: It's the year of the desktop, right? :) Having worked at and
with various enterprises deploying Windows NT, Windows 98,
Me: You have had different EUC roles within VMware. What was
your favorite role and why?
Matt: That’s like asking which one of your children is your favorite
:) I’m fortunate to say I’ve loved every role at VMware. Each one
presented a new challenge and pushed me in many ways. VMware
has certainly given me the opportunity to develop my career in so
many ways and with interesting roles. I enjoyed working with
customers as a PSO consultant. It gives you a great understanding
of customers, their problems and the pros/cons of your products.
I think every software engineer, product manager, and product marketer should, where possible, take the opportunity to ride along with a PSO consultant - an invaluable experience. My favorite
role though has to be my current one, but probably not for the
Me: In your current role, you lead the development of state of the
art EUC solutions. This is probably as innovative as it gets. Before
we dive into Spatial Computing, what is it like to be working on
the forefront of innovation?
Me: LOL. Could you explain what Spatial Computing is and why
it matters?
Matt: As I’ve said before, two key challenges at the moment are
spatial computing device management and device capabilities.
We’re helping address those challenges through Workspace ONE
and Project VXR. The other two challenges I see are enterprise
security/access on these devices and enterprise user experience.
VMware is also working towards solving these issues. VR devices
need to become a little smaller in form factor and the user
experience in VR needs to mature. On the AR front, the devices are
not as mature as VR or as widely adopted. Microsoft HoloLens 2 is
a great development, but the price point is still high. There are
some considerable near-eye optics challenges that need to be
solved for AR to become more mainstream. However, as there is
more adoption, we’ll see more content, applications and use cases.
This time reminds me very much of the early days of VDI. Plenty
of skepticism, but clear-cut use cases and demand in the
enterprise, somewhat hampered by technology maturity. Going
back to my previous answers, I believe in the fundamental benefits
Spatial Computing brings and it's only a matter of time until the
challenges highlighted here are solved.
Matt: COVID-19 was a turning point for many in the EUC space.
Now more than ever, a digital workspace is needed by every
single organization on the planet. Any device, any app, anywhere
rings true now more than ever. VMware EUC is in a strong
position to help our customers succeed in delivering quickly on
the digital workspace, which gives us a solid foundation for the
future. Our digital workspace is also a foundation for our
customers. A foundation that we can develop on top of to deliver
not only a platform but better employee experience with the
digital workspace. Digital Employee Experience Management is
the next battleground, lots of innovators and players in the space,
helping customers deliver a consumer simple, enterprise secure
experience. DEEM enables organizations to measure, manage and
optimize the services they deliver and the user experience.
Allowing employees to seamlessly conduct business operations
across different applications and platforms helps increase
employee productivity. Now, we need to do all of this “outside the
firewall”. If organizations were not embracing Zero Trust before,
they need to now. Prior to COVID, many were aware of the risks
of just trusting devices on the corporate network. With many
working remotely, Zero Trust is even more important. Zero Trust
delivers a more secure architecture but comes with challenges
around user experience. VMware Workspace ONE and other
Me: Any key takeaway which you would like to share regarding
Spatial Computing or VMware EUC in general?
If you’d like to know more about Matt or follow his adventures, you can find him on Twitter: @Mcopping.
The beast
I used “The beast” (that’s what I called it) for a little over a year. I
tweaked most of the fans, so it was less noisy. This was probably
around 2002, so silent fans like the ones from Noctua didn’t exist
yet. The beast was insanely fast, but also had a massive problem.
As I was still living in my parents’ house, I didn’t have to pay the utility bill. Until the moment they got the new yearly bill, that
Like any other project, it’s essential to start with your own
assessment and determine your requirements, constraints,
potential assumptions and risks. This may sound a bit like
overdoing it, but you will see that it will help you a lot when
designing your lab. I created a list of questions with some guiding
answers which will help you define your own conceptual design:
INVESTING IN HARDWARE
Based on the answers to the questions in the previous section,
there are a couple of main decisions to be made related to
hardware:
http://vmwa.re/homelab
As mentioned, there are multiple options, and let’s dive into those
options first.
I think the first option, and one that can be very budget-friendly, is going for second-hand hardware. When looking at William’s list of community labs, investments vary between $500 and $150,000. In
my opinion, they don’t always show “the art of the possible”. I
know for a fact that you can get more for less if you aren’t tied to a
time constraint. Take your time to look around and keep track of
what’s happening on sites like eBay. My first VMware lab was an
HP ProLiant ML350 G6 with 64 GB of RAM which I bought in 2012
for just $450 on eBay. It was a bargain, and because I spent a
couple of months looking for it, I think I got it for such a low
investment. It was relatively quiet, had out-of-band management,
and could perfectly run VMware View with all of the supporting
components without any issues. I didn’t run it 24/7, but it worked
fine. Sticking to trusted sellers on eBay helps you avoid scams. It might even be possible to spend less on a lab, but in that case, you might need to make some concessions. In some cases, it’s also an
option to search for complete clusters of hardware. One of my
colleagues bought four Dell 1U nodes including CPUs, RAM,
disks, and networking for a little bit less than $2000. This means
you are basically up and running for that amount of money. The
only thing you might need is a network switch. Hardware that’s 4-5 years old might be old for an organization, but as long as the software you would like to run is supported, you are good. One important lesson: check the VMware HCL so you are certain that you can run the lab components on the hardware.
To sum up, these are some of the pros and cons of investing in
second-hand hardware.
Pros
Cons
This is probably the most expensive option but does offer the best
support and durability. It’s also the one that potentially has the
lowest WAF (due to the relatively high investment). You do have
some relatively cheap options, but they do come with some
constraints.
If you would like to spend a bit more, multiple OEMs offer Xeon-based workstations that are sometimes supported by the hypervisor as well. Some of my colleagues have built their labs on HPE or Dell workstations with full support of the hardware, and thanks to the powerful CPUs and support for GPUs, they are able to run an entire VMware Horizon and NVIDIA vGPU lab. Because the hardware is new, it’s also quite simple to
expand. In my case, I started with two Supermicro E300 nodes in
direct-connect for VMware vSAN. A year later I bought a third one
and a 10 GbE switch, and I had more resources in my cluster. To
me, that was a big plus.
Pros
Cons
Non-HCL-based barebones/systems
As mentioned earlier, I think this is the single most popular lab system. Take an Intel NUC or comparable system: it’s small, has low power consumption, relatively high performance, and a relatively low price point. Some of them are even seriously quiet and thus would fit perfectly on your desk. With the addition of external high-speed NICs, you could even run an all-flash VMware vSAN platform on them. The nice thing about NUCs is that you can just start with a single one and relatively easily scale out. Just add more NUCs. ☺
https://twitter.com/mabelard
Pros
• Full warranty
• Relatively easy to scale out
• Small form factor
• Quiet!
• Low power consumption
• Relatively cheap for a complete system
• For the Intel NUC, there is a large community which can
offer support
Custom builds
I think this actually shows the biggest pro of a custom build: you can build something fully tailored to your needs. In the past five years, I have had four different lab configs and just added the newest custom-built host. It’s an AMD Ryzen 9-based host with 128 GB of RAM and NVMe storage, and it is capable of running VR and extremely resource-intensive workloads because the CPU has 24 logical cores with a 4.2 GHz base clock speed and the host runs an NVIDIA RTX6000. I ended up with this system because of the relatively low investment for the insane performance it delivers. It did come with some virtualization issues, though, which I needed to tweak manually in the BIOS (which obviously is one of the cons). But it now works like a charm!
Pros
Cons
• https://www.reddit.com/r/homelab/
• https://www.reddit.com/r/HomeNetworking/
If you are a VMware vExpert and would like to know more about
labs, follow the Homelab channel in the vExpert Slack. You can
find it here:
• https://vexpert.slack.com
• https://williamlam.com/
Network
• Local flash
• NAS/SAN
• Virtual SAN
https://williamlam.com/2020/10/vsan-witness-using-
raspberry-pi-4-esxi-arm-fling.html
• In case you would like to start with two nodes but want to scale the cluster out later, you can simply do so by adding more nodes. My general recommendation is to add identical hosts to avoid potential compatibility issues within the cluster (see the sketch right after this list).
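As a quick sanity check for that last point, a short pyVmomi script can show what every host in a cluster reports for CPU model, core count, and memory, so mismatches stand out before they bite you. This is only a sketch under my own assumptions: the vCenter address, credentials, and cluster name are placeholders, and skipping certificate verification is something you should only do in a lab.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VCENTER = "vcenter.lab.local"           # placeholder: your vCenter
USER = "administrator@vsphere.local"    # placeholder
PASSWORD = "changeme"                   # placeholder
CLUSTER = "Compute cluster"             # placeholder: your cluster name

ctx = ssl._create_unverified_context()  # lab only: skip certificate checks
si = SmartConnect(host=VCENTER, user=USER, pwd=PASSWORD, sslContext=ctx)

try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    for cluster in view.view:
        if cluster.name != CLUSTER:
            continue
        for host in cluster.host:
            hw = host.summary.hardware
            print(f"{host.name}: {hw.cpuModel}, {hw.numCpuCores} cores, "
                  f"{hw.memorySize // (1024 ** 3)} GiB RAM")
    view.DestroyView()
finally:
    Disconnect(si)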
VMware licenses:
https://my.vmware.com
https://vexpert.vmware.com
https://www.vmug.com/membership/vmug-advantage-
membership
Microsoft licenses:
VMware offers a free-to-use service called TestDrive. You can find
prebuilt services including demo scripts, but even more valuable,
you can spin up your own (sort-of permanent) instance of VMware
Workspace ONE Access and UEM.
https://kb.vmtestdrive.com/hc/en-
us/articles/360001372254-Getting-Started-with-TestDrive
• (If you don’t work at a partner, you can also get access
through the VMUG Advantage subscription.) An
alternative is to become vExpert EUC, as you will get the
subscription, as well.
https://developer.microsoft.com/en-us/microsoft-365
Playing with other IDPs for your EUC stack is something you
might like to do, as well.
https://www.okta.com/free-trial/
https://developer.okta.com/signup/
https://www.pingidentity.com/en/trials/p14e-trial.html
https://developer.salesforce.com/signup
https://www.atlassian.com/software/jira/free
https://www.ibm.com/partners/start/watson-assistant/
https://developer.servicenow.com/dev.do#!/guides/quebe
c/developer-program/pdi-guide/personal-developer-
instance-guide-introduction
There’s a really great blog post from Reinhart Nel (VMware EUC
Instructor) about setting up a VMware EUC SaaS lab. You can find
it here:
https://www.livefire.solutions/euc/build-your-own-vmware-euc-
saas-lab/
https://vhojan.nl/building-a-low-power-vsan-vgpu-homelab/
https://vhojan.nl/low-noise-vgpu-and-vsan-homelab-
optimizations/
For the new lab, the workloads I wanted to work with required a
different type of GPU. VR workloads and simulators can work on
T4s, but to get the most out of the user experience, I went for a pair of Quadro RTX6000s. They all support NVIDIA’s vGPU, but the RTX6000s deliver better performance and, even better, they are actively cooled.
Management cluster:
AI Cluster:
Test Cluster:
The result
The new lab has met most of my requirements pretty well. The
three-node vSAN cluster consumes around 100 Watts of power.
They are powerful enough to run all of the 24/7 workloads and
even some non-essential ones.
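To put that 100 Watts into perspective, here is a tiny back-of-the-envelope calculation. The energy math is straightforward; the price per kWh is purely an assumption, so plug in your own tariff.

# Yearly energy use and cost of a lab drawing ~100 W around the clock.
power_watts = 100
hours_per_year = 24 * 365                            # 8,760 hours
kwh_per_year = power_watts / 1000 * hours_per_year   # = 876 kWh

price_per_kwh = 0.30                                 # assumption: your local rate
yearly_cost = kwh_per_year * price_per_kwh
print(f"~{kwh_per_year:.0f} kWh per year, roughly {yearly_cost:.0f} "
      f"in energy costs at {price_per_kwh} per kWh")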
The host for testing beta builds doesn’t have to offer a great user experience; I just want to be able to test functionality, which the
Xeon-D-based platform does really well.
It took me around two months to fully build it, but for the first
time I think it might be a lab that will be here for multiple years.
And because of the low power consumption, it has a great WAF,
as well.
Me: I always enjoy it when you talk about labs. Why do you have
such an interest in labs?
William: My career and everything that I have learned thus far has
been a direct result of being able to explore and experiment in a
VMware-based lab environment. Automation, which is a passion
of mine, has allowed me to build repeatable infrastructure
scenarios and be able to share that with others to benefit. This
ultimately allowed me to better understand how things work,
Me: If people would like to work on their own fling, where should
they start?
William: Honestly, it was pure luck and hope. In the Fall of 2018,
Apple announced the new Mac Mini which officially supports
64GB (2x32GB DIMM) but the availability of 32GB SODIMM for
individual purchase did not arrive in the market until Spring of
2019. At the time, I had access to an Intel Hades Canyon (8th
Generation) NUC and based on the CPU spec, it technically
supported 64GB of memory although the NUC specification stated
32GB as the limit. I figured I would give it a shot and purchased two of the 32GB DIMMs, which at the time were going for $298 per DIMM (today, they are $168), to see what would happen. I was
not that surprised that it just worked on the Hades Canyon NUC
given the CPU was fairly recent, but I was totally shocked when I
realized the 64GB also showed up on my 6th Gen NUC, which I
bought back in 2016! This totally changed the game for home labs;
memory has always been a constraint for most folks but now you
have a fairly inexpensive way of upgrading your existing
hardware going as far back as the 6th Gen NUC all the way to the latest generation, which now officially supports 64GB. I had so many folks
share their success across other NUC generations and how happy
they were to be able to leverage their existing investment. I think
someone would have eventually figured this out, as well, but I
guess it was just good timing and luck on my part.
If you want to know more about William and the stuff he’s doing,
follow him on Twitter: @williamlam.
The ICA protocol that Citrix used (and still uses) was so
impressive that the user experience of the remote desktop and
applications blew my mind. How could it be that a desktop
protocol was able to present a user with remote applications and
desktops without the user even noticing it? And that was over 20
years ago.
The cool thing was that the more I got involved in the SaaS
project, the more I got to spend time with the VMware ESX
architecture. That's when I knew the direction I wanted to take my
career. Until 2013, I worked on the SaaS platform, which was
running on VMware ESX 4, managed by vCenter Server,
connected to EMC CLARiiON CX4 shared storage. The Windows RDS machines were running on the platform in a big farm, load balanced with 2X (now known as Parallels RAS) and provisioned
with applications by Microsoft App-V 4.5.
Late in 2013, it was time to make a career change. But what? It was
kind of easy. When you mix server virtualization with SBC and
application virtualization, what do you get? The answer is simple:
end-user computing (as we knew it back then).
Directly after the book launch, I went to VMworld in the US. That
week in 2018 was my most memorable moment in my IT career. I
got the opportunity to launch the book at the Inside Track
community event, followed by a book signing at both NVIDIA and
Liquidware. Especially at the booth of NVIDIA, there was a
massive number of people lined up who all wanted a signed copy
of the book. The icing on the cake was seeing my own book in the
VMworld bookstore.
If you read the book prior to reading this bio, then you know what
happened between 2018 and 2021 in terms of the large number of
different use cases I got to work with. Something which eventually
led to the creation of this book. You might expect that writing a second book is similar to writing the first one. In a lot of aspects, it was
similar (creating a story line, finding the right topics, creating the
interviews). Where it differed (a lot), was with the research of the
topics and use cases. I obviously wanted to have my facts straight,
which sometimes led to a journey through the rabbit holes of IT.
But this is also the way I learn. I learn by diving into
technology. Building, breaking, rebuilding, breaking again, and
finally working towards a solution which is viable for customer
projects.
After finishing this book, I’m not really sure what my next side
project will be. I’d love to develop a VMware Fling, work on some
new podcast ideas, or maybe write a children’s book about VDI ☺
We’ll see. I’m going to enjoy a couple of months without a side
project, and focus on my family expansion first...