TRANSCRIPT
Storage Efficiencies with Hyper-V at the Virtual and Physical Layer
The Perils & Benefits of Thin Provisioning, UNMAP & Snapshots
Didier Van Hoye, Technical Architect
© 2016 Veeam Software
Contents

Introduction
Thin Provisioning and UNMAP
  Thin provisioning
    Thin provisioning at the physical layer
    Thin provisioning at the virtual layer
    Thin provisioning at the physical and virtual layer
  UNMAP
    Space efficiencies at the virtual layer
    Elasticity at the virtual layer
  Performance issues with thin provisioning and UNMAP
Snapshots and checkpoints
  Storage arrays
    Snapshots are space efficient, not magic
    Don't run off to buy more storage before verification of the issue!
  Hyper-V checkpoints
  The effect of Snapshots and/or checkpoints on UNMAP
Is fragmentation an issue in all of this?
  On the storage array
  In regards to Hyper-V
    On the Hyper-V host
    Inside the VM
    The internal VHDX structure
  Closing thoughts on fragmentation
Other capabilities enhancing these efficiencies
  ODX
  ReFS
Conclusion
About the Author
About Veeam Software
Introduction

You might have read some articles or blog posts about the dangers of thin provisioning, UNMAP and snapshots. It's wise to point out these risks. Any powerful technology used in the wrong way is dangerous. In the end, no matter how automated our processes are, we are in control. So, the advice is to master the technologies we leverage and not become mindless slaves to them. Why? Because these technologies are about achieving storage efficiencies: efficiencies in delivering storage and offering data protection. We can use these technologies at the physical layer (storage array), at the virtual layer (Hyper-V) or a combination of both. What we choose depends on the environment, budget and workloads. It's about cost reduction, the ease and speed of recovering to points in time and offering data to Development Operations (DevOps) teams.
Make no mistake — when you have economies of scale, this is of the utmost importance. Saving 20% on $10 million of storage CapEx is serious money. When you're a small company, not having to buy another 10 TB of storage can make or break the budget for that year. It's a cost-cutting edge you need to compete at cloud scale. These efficiencies help Hyper-V surpass challenges beyond the ability of delivering a few hundred to over one million IOPS to virtual machines (VMs). These features are valuable to users of all sizes and types when used properly. So, let's take a closer look at the benefits, the risks and the smoke and mirrors. A lot of this is based on many years of running and supporting Hyper-V.
Storage Efficiencies with Hyper-V at the Virtual and Physical Layer
4© 2016 Veeam Software
Thin Provisioning and UNMAP

Thin provisioning

The benefits are clear. You don't waste storage space by handing it out only for it never to be used. The caveat with thin provisioning comes when you consume the maximum space allocated in the dynamically expanding VHDX files or on your storage array. You can deplete your storage capacity because you're basically lying to the operating system: it doesn't know that you've run out of storage under the hood.
This problem exists at both the virtual and the physical layer. You can create a 4 TB dynamically expanding VHDX file on a 2 TB LUN — thin provisioned or fully provisioned, it doesn't even matter here. Once your virtual hard disk outgrows the space available, your LUN won't have any operating room left and your VMs will suffer downtime. Likewise, when you create a 50 TB LUN on a SAN that only has 30 TB in total capacity, you'll be in trouble when you actually try to copy that much data onto the LUN.
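To make the risk concrete, here's a minimal sketch (illustrative only — the numbers come from the examples above, and the helper is not a real Hyper-V or SAN API) that computes how far storage has been promised beyond what physically exists:

```python
# Sketch: spotting over-commitment at either layer.
# Any ratio above 1.0 means the OS has been promised more space
# than is physically there to back it.

def overcommit_ratio(provisioned_tb, physical_tb):
    """Ratio of space promised to space actually available."""
    return provisioned_tb / physical_tb

# A 4 TB dynamically expanding VHDX on a 2 TB LUN:
print(overcommit_ratio(4, 2))    # 2.0: the VHDX can outgrow the LUN

# A 50 TB thin LUN on a SAN with only 30 TB of raw capacity:
print(overcommit_ratio(50, 30))  # about 1.67: trouble once real data nears 30 TB
```

Nothing breaks at the moment of provisioning; the downtime only hits when real data approaches the physical limit, which is exactly why monitoring matters.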
Note that Hyper-V, just like a decent SAN, will protect your data by no longer allowing it to be used. The actions of Hyper-V and/or your storage array prevent major data corruption issues. But you will suffer downtime for your virtual machines, because storage can't be generated out of thin air.
Barring the risk of running out of real, usable storage capacity, there are some other, performance-related risks with a thin provisioned LUN that we'll address when we discuss UNMAP and fragmentation.
It takes careful management and monitoring to make sure that you add storage capacity when it's needed. This is the responsibility of the administrator. Your organization must be able to handle ordering and deploying storage in a timely fashion for that to happen. Make sure you know your processes and timelines. Don't just assume them, or your extra storage capacity might never arrive or get provisioned. Also, know the procedure to actually deliver this storage to your VMs. Not all storage solutions are created equal, and not all can increase storage indefinitely.
So, plan ahead. Know your storage. Make sure you have some headroom and a safety margin. It's all about checks and balances. Best practices dictate that you do not overprovision and that you monitor your environment when the workloads are mission critical. Does this sound like a lot of work? Well no, not really. After the initial learning curve, it comes down to actually monitoring your environment and responding to alerts. You have to do it — and doing it well is far superior to not doing it at all or aiming for perfection. Trust me, managing storage is a hundred times less complex and expensive than managing people. You probably have dozens of middle managers doing that, so a meager 20% FTE for storage monitoring and management shouldn't be an issue.
Thin provisioning at the physical layer brings a lot of benefits, even when all your virtual disks are fixed. But it can also be combined with thin provisioning at the virtual layer. Dynamically expanding virtual disks provide thin provisioning for people who don't have a storage array with thin provisioning. We'll discuss these below.
But let's not just talk about the risks. There are real benefits, and you can do really interesting things that will save you loads of otherwise lost capacity. Performance issues are often not due to thin provisioning, because modern storage is often tiered or all-flash.
Thin provisioning at the physical layer
Thin provisioning at the physical layer means thin provisioning in the storage array. Let's take a look at this 10.5 TB LUN on a SAN where we disabled snapshots. It shows that the actual space we are using on the SAN is 402 GB; the rest of that 10.5 TB is not allocated.
Figure 1: A 10.5 TB LUN with 402.01 GB of actual storage consumed on the SAN
We'll delete about 200 GB of files and see that amount of space recuperated on the SAN.

Figure 2: That same 10.5 TB LUN after we deleted about 200 GB worth of data on it. After UNMAP — which happens automatically in Windows Server 2012 (R2) — the SAN is informed that this data has been deleted, and the space is reclaimed!
Do note that your SAN is often more capable than the OS at detecting truly used storage space. The OS doesn't know that a fixed VHDX is only really consuming 10 GB out of 50 GB or 127 GB. The SAN does. The same goes for dynamically expanding VHDX files on which space has been recovered on the SAN but where the file hasn't shrunk yet. So, the OS reports far higher storage use on a CSV LUN full of VMs with fixed-size or dynamically expanding VHDX files than the SAN does. As an extreme example, take a look at the same LUN as above.
Figure 3: The CSV reports the size the OS sees, not what’s consumed on the SAN.
There's a nice collection of fixed VHDX files on this CSV that are filled with just an OS and nothing more. So, they consume little space on the thin provisioned LUN on the SAN, but Windows doesn't know any better and reports the fixed VHDX file sizes.
Figure 4: Windows reporting 2.59 TB in use while in reality only 402 GB is consumed on the SAN.
Thin provisioning at the virtual layer
Even if you don't have a storage array that provides you with thin provisioning, you can enjoy its benefits thanks to the dynamically expanding VHDX. No storage capacity is required until you actually put data into the volume(s) on these virtual disks.
Figure 5: The real size consumed by the dynamically expanding virtual disks is reported on the Hyper-V host.
You might notice the difference in size between an empty VHD and VHDX. This is due to the new internal structure of the VHDX, which grows by default in block sizes of 32 MB versus 2 MB with a VHD. It also allocates some blocks as a buffer so it can fill requests fast while it expands with new blocks. The file sizes on the Hyper-V host reflect this when we write data to the dynamically expanding virtual disks.
Figure 6: The real size of the dynamically expanding virtual disks on the Hyper-V host has grown.
A thing to note here is that if the fixed VHDX files reside on a thin provisioned LUN on a storage array, they don't consume that space either; it's only consumed when data is written to them.
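The block-based growth described above can be sketched as a small calculation. The helper below is hypothetical and ignores VHDX metadata overhead (headers, block allocation table); it only illustrates that a dynamically expanding disk grows in whole blocks, and that the larger VHDX block means fewer growth operations:

```python
import math

def blocks_needed(data_mb, block_mb):
    # A dynamically expanding disk grows in whole blocks, so the data
    # area of the file is the written data rounded up to the block size.
    return math.ceil(data_mb / block_mb) * block_mb

# Writing 100 MB of data:
print(blocks_needed(100, 32))  # 128 -> VHDX: 4 grow operations of 32 MB
print(blocks_needed(100, 2))   # 100 -> VHD: 50 separate 2 MB grow operations
```

The slight over-allocation on the VHDX side (128 MB for 100 MB of data) is the buffer effect mentioned above: the disk can absorb new writes quickly instead of growing on every small write.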
The operating system inside the VM reports the size of the LUN as assigned, not the real size of the dynamically expanding virtual disks. The free space is never consumed on the host. This will not cause any issues as long as the space assigned is available when needed.

Figure 7: Inside the VM, the OS reports the assigned size of the LUNs and their free space.
I'll mention this for completeness, but today you should be using the VHDX format and not VHD. Much has been discussed about the benefits of VHDX, so I'm not going to repeat it here. Unless you need to move VMs back and forth between an older Windows Server 2008 R2 environment and your modern Windows Server 2012 R2 Hyper-V environment, you should really use VHDX and not look back. The performance is way better and the dynamically expanding VHDX is faster, on top of better resilience and data protection.
I didn't use dynamically expanding virtual disks in production at all prior to the introduction of the VHDX virtual disk file format. It was a lab/dev-only use case in the environments I worked at. This was because of a noticeable reduction in performance and an increased fragmentation of the VHD files on the LUN when too many dynamically expanding VHD files resided there and expansion happened frequently. The newer VHDX format has changed this significantly, as these disks now grow in larger block sizes. The performance is really good even during growth, and we've started using them in production for new deployments of "general purpose" VMs where we cannot predict the storage needs accurately. This avoids overprovisioning and saves wasted storage space on the storage array. It also prevents having to resize (shrink/extend) the virtual hard disk. Even if this can be done online while leveraging ODX for performance, it's not as easy as avoiding most of the issue. We are gradually trying larger VM sizes with dynamically expanding VHDX files, and their usage is growing overall. Having ODX helps with performance during the extension of the virtual disks, but this is most noticeable in cases where significant growth happens. We still leverage fixed VHDX files for really IO-intensive workloads.
Thin provisioning at the physical and virtual layer
There used to be a time when some SAN vendors didn't support dynamically expanding virtual hard disks on their thin provisioned LUNs. That was back in the days of VHDs and Windows 2008 (R2). But that's behind us, and it's perfectly fine with most vendors to combine thin provisioning at the physical and virtual layer. Dynamically expanding virtual disks and thin provisioned LUNs on a SAN complement each other.
UNMAP

This is the feature that allows both the virtual disk and the physical storage to deliver space efficiencies. It does this in multiple ways and at both the virtual and physical layer. Since Windows Server 2012, we have native support for UNMAP in the OS. This has made our lives a lot easier when it comes to telling the storage arrays that data has been deleted and the space can be recuperated. This used to be time consuming and required scripting and storage vendor agents to be deployed, causing an impact on performance (zeroing out the volumes).
With both a fixed VHDX and a dynamically expanding VHDX, UNMAP marks the free space that becomes available after the deletion of data as "not in use." This is passed via the host to the storage array which, if it's thinly provisioned, can also reclaim that space and use it elsewhere in the array.
Space efficiencies at the virtual layer
At the virtual layer, the free space in the virtual disk can be reused, which means that a dynamically expanding VHDX will not have to grow beyond what's really needed, as it knows very well that it has reusable free space. You may think that's also the case with a dynamically expanding VHD — and if it's a single volume, you're right — the experience is much the same. The benefit a VHDX with UNMAP has over a VHD becomes very visible, and more important, when a virtual disk has multiple volumes. Volumes in a VHD don't know about the free space at the disk level or whether space has been freed by another volume on that virtual disk. This means that when copying data into other volumes on the same dynamically expanding VHD, that disk will have to grow, which comes at a performance cost because it is allocating more space than is actually being used.
Another benefit of UNMAP is that a backup and Availability solution like Veeam® doesn't have to back up hard-deleted data in a VM. UNMAP has signaled to the VM that the data is deleted and that the space is free. This means Veeam Backup & Replication™ doesn't back it up either, reducing the overhead at your backup target. Starting from Veeam Availability Suite™ v9, thanks to the new BitLooker feature, we can skip blocks where deleted files were situated, even without UNMAP support from the guest OS/hypervisor.
Here's a logical example of UNMAP as an illustration. Let's assume it's a dynamically expanding virtual disk (VHD/VHDX). It has two volumes, which each have 50% of the maximum capacity of the dynamically expanding virtual disk.
Figure 8: A dynamically expanding VHDX/VHD with two volumes and no free space.
Figure 9: We copy data into volume 1. The VHDX/VHD grows as the volume fills.
Figure 10: We delete data from volume 1. There is free space.
Figure 11: We copy data into volume 2. UNMAP and VHDX allow the free disk space to be reused. The virtual disk does not have to expand.
Figure 12: We copy data into volume 2. VHD doesn't support UNMAP. The virtual disk does need to expand to provision space for volume 2. The result is a loss of performance and storage capacity.
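The sequence in Figures 8 through 12 can be mimicked with a toy model. This is not the on-disk VHD/VHDX format, just a hypothetical sketch of the accounting: with UNMAP, space freed in one volume is reusable disk-wide, so the file doesn't grow; without it, the disk must expand:

```python
# Toy model of a dynamically expanding disk holding two volumes.
class DynamicDisk:
    def __init__(self, supports_unmap):
        self.supports_unmap = supports_unmap
        self.file_size = 0   # space the virtual disk file occupies on the host
        self.free_pool = 0   # blocks known (via UNMAP) to be reusable

    def write(self, amount):
        if self.supports_unmap:
            # Reuse freed blocks first; only grow for the remainder.
            reused = min(amount, self.free_pool)
            self.free_pool -= reused
            self.file_size += amount - reused
        else:
            # Freed space is invisible at the disk level: always grow.
            self.file_size += amount

    def delete(self, amount):
        if self.supports_unmap:
            self.free_pool += amount  # UNMAP marks the space reusable

vhdx, vhd = DynamicDisk(True), DynamicDisk(False)
for disk in (vhdx, vhd):
    disk.write(100)   # Figure 9: copy data into volume 1
    disk.delete(50)   # Figure 10: delete data from volume 1
    disk.write(50)    # Figures 11/12: copy data into volume 2

print(vhdx.file_size)  # 100 -> freed space was reused, no expansion
print(vhd.file_size)   # 150 -> the VHD had to grow anyway
```

The extra 50 units on the VHD side are exactly the performance and capacity loss Figure 12 describes.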
Elasticity at the virtual layer
UNMAP allows a dynamically expanding VHDX to shrink. When you shut down the VM, the dynamically expanding VHDX will shrink. This provides efficiencies when you do not have a thin provisioned storage array. If you do have one, the UNMAP command is propagated via the Hyper-V host to the SAN, which can reclaim the space for use elsewhere.
There is a big difference to note between the virtual layer and the storage array. At the virtual layer, shrinking a dynamically expanding VHDX is possible. When you shut down the VM, the dynamically expanding VHDX will be shrunk by the amount of available free space. This requires the VM to shut down; it doesn't happen during restarts. That's OK. The use case for this is a large change in the space required in a dynamically expanding virtual disk. You do not want the yo-yo effect of a dynamically expanding virtual disk extending and shrinking all the time. This is senseless overhead which can affect performance, just like thinly provisioned LUNs at the physical layer. Highly volatile LUNs where large amounts of data are being written and deleted should not be thinly provisioned. That's one use case for thick provisioning via fixed virtual hard disks. I have described a similar situation in the blog post Mind the UNMAP Impact on Performance in Certain Scenarios.
Storage arrays are generally only good at expanding LUNs. Shrinking them isn't commonly available, to the best of my knowledge. True elasticity is only available at the virtual layer. Even with fixed VHDX there is elasticity, but it's not automatic: you have to shrink or expand the VHDX manually. Shrinking is only possible if you have free space on the disk. Also, don't forget that this requires a guest OS where you can resize volumes.
Performance issues with thin provisioning and UNMAP
It goes without saying that if you have heavy IO, fixed VHDX is the way to go. No matter how fast a dynamically expanding VHDX can grow, growth still has an impact on performance. ODX can mitigate part of this, but not everything. There are other issues you need to watch out for. When a dynamically expanding VHDX has shrunk at shutdown of the VM, it will have to grow again, needlessly. UNMAP itself comes at a cost when being executed. In most cases this is not an issue, but there are scenarios where it will hurt performance too much. This is the case with LUNs where large amounts of data are constantly written and then deleted after a certain amount of time, as with some monitoring systems, backups, etc.
Figure 13: A real-life example: a huge increase in the time needed to delete expired backups on the day UNMAP became active on the target host. After figuring this out, fully provisioning the LUNs was the solution.
It's wise to provision thick LUNs on the storage array to prevent performance issues in these scenarios. I prefer this to turning UNMAP off at the host level, which affects all LUNs on that host. A fully provisioned LUN on a SAN doesn't take away the thin provisioning/UNMAP benefits from the other LUNs. To understand when to use, or not use, thin provisioning, start with Plan and Deploy Thin Provisioning.
Snapshots and checkpoints

Both Hyper-V checkpoints (as snapshots are now called) and storage array snapshots provide us with a lot of flexibility for ease of recovery, data protection or delivering read-only data to certain workloads or test data to DevOps. They are fast and space efficient. That doesn't mean they are free: they can have an impact on your performance, and you can't make endless snapshots. They do consume storage space and memory on controllers, and they negatively influence performance, especially under certain conditions.
There are also some risks associated with ill-advised use of snapshots. But in our discussion here we'll mainly focus on the effect snapshots have on storage consumption, how they affect UNMAP and what their performance impact is.
You need to use them wisely and manage them with intent. Performance will suffer if you don't. Too much storage space will be consumed, and if you really get into trouble, they can cause downtime or even data loss. Knowledge of the technologies and of how to use them in your environment is key to optimizing the benefits they deliver while avoiding any, or most, issues.
Storage arrays

The technical details will differ between vendors, models, makes and versions of the storage arrays. So, to demonstrate that management and monitoring are needed to enjoy the benefits that snapshots deliver, we'll look at some real-life examples.
Note that most SANs support two types of snapshots: crash consistent ones and application consistent ones. The latter require a hardware Volume Shadow Copy Service (VSS) provider in order to create them. This is the type of snapshot you can leverage with your backup solution. Snapshots by themselves do not replace backups, as they violate the 3-2-1 rule.
Snapshots are space efficient, not magic
Snapshots are space efficient, but that doesn’t translate to endlessly creating snapshots without ever
running out of storage space .
Here's an example of a thinly provisioned LUN for SQL Server virtualization. We sized it at 1.9 TB (growing is easy, shrinking is not, so we're conservative in increasing LUNs). Thin provisioning at the physical layer (the SAN) means we only use 1.21 TB of the available storage for the actual data.
Figure 14: Note the space consumed by the snapshots of the LUN (light green).
Depending on the number of snapshots you retain, how frequently you take them and how volatile your environment is in adding and deleting data (leading to large deltas), the amount of storage consumed by snapshots on a LUN and in your SAN can become significant. Don't assume that just because snapshots are space efficient and only capture a delta, they are insignificant!
You can also see the RAID overhead (yellow). That's why the available storage is always less than the raw capacity of a storage array.
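As a rough back-of-the-envelope check (the formula and the numbers are illustrative assumptions, not a vendor sizing rule), you can estimate the space snapshots pin on a LUN from the change rate and the retention:

```python
def snapshot_overhead_tb(daily_change_tb, retention_days):
    """Rough upper bound: every retained day of snapshots pins roughly
    one day's worth of changed (overwritten or deleted) data."""
    return daily_change_tb * retention_days

# A volatile LUN changing 0.2 TB/day with 14 days of snapshots retained:
print(round(snapshot_overhead_tb(0.2, 14), 1))  # 2.8 -> TB pinned by snapshots alone
```

Even modest daily churn adds up quickly over a retention window, which is why "space efficient" never means "free."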
Don’t run off to buy more storage before verification of the issue!
Let me share another practical example of the importance of managing the snapshots in your storage arrays. In this case, some were already suggesting they urgently needed to buy extra storage capacity. But we dove into the environment first to see if the numbers made sense. They did, but not because the storage was being used for their real workloads. The actual storage needed for that, including snapshots, was way below the alerting threshold. The extra capacity "lost" was due to abandoned, orphaned snapshots in combination with a volatile, elastic environment where snapshots can be big.
Figure 15: Space reclaimed on the SAN by doing some long overdue maintenance.

So just doing maintenance — cleaning up snapshots that didn't expire for some reason, removing orphaned snapshots and making sure UNMAP was enabled on the hypervisor — recuperated over 80 TB of capacity. Monitoring and maintenance are one thing; you also need to know and understand your storage ecosystem, what's happening to it and why. A smooth sales guy could have earned a nice bonus here.
This was caused mainly by the orphaned snapshots, which also prevented UNMAP from doing its job. But just consider people who are not leveraging UNMAP with Hyper-V at all. You can often reclaim 15% to 20% of space after a number of years in production by turning it on!
Even if the above is an extreme example, do realize that you should always verify what's going on before you rush out to order additional capacity. You need to manage UNMAP and snapshots on both your Hyper-V hosts and your storage array. Remember that even when your storage array doesn't provide thin provisioning or UNMAP support, you'll gain benefits at the virtual layer, and that might be just what you need to make your resources go 15 to 20% further. That's real money right there — especially when you need to balance your budget.
Hyper-V checkpoints

Like storage based snapshots, Hyper-V checkpoints provide a space efficient and fast way of creating a point in time to revert back to. Much has been written and said about them. I'll summarize the most important aspects — both the benefits and the risks. The good news is that it's all done at the virtual layer, meaning it's a great feature for all of us who do not have hardware snapshot capabilities at our disposal.
When you want to test patches, deploy a new driver, or upgrade software or even the guest OS, it's a life saver. It saves us a ton of time in the lab and it's a blessing for demos. As soon as you notice that things are not working out, you can revert back to the original situation. You can chain checkpoints and revert back to distinct steps in the process that you're testing or performing. That's a great capability to have.
Do note that "soon" is important here. If you need to go back after a week or longer with a stateful application, you'll be in trouble. Basically, checkpoints are not supported in production; they are intended for very fast consumption shortly after being made, in test/development, by people who know what they're doing, and they shouldn't be kept around for a long time. You should use supported backup solutions, not checkpoints, for production. That's also the case with certain applications that have their own extensive and complex mechanisms for dealing with their data consistency, such as SQL Server, Exchange and Active Directory (AD). Sure, AD has become more intelligent about being virtualized, but people still get into trouble. Reading the documentation is very important here. One way to mitigate the issues above is taking down the VM before taking the checkpoint. That's actually the safest way of using them.
Checkpoints also allow for cloning of a VM. That means you can give a copy of a production VM to a developer to test an application upgrade, to a security auditor, or to an engineer to reproduce issues. These are nice capabilities, but you should not use them in production environments, and even for the above use cases you must really know your workloads to know what you can get away with.
Checkpoints have other issues you need to be aware of:
• They consume disk space and can become large in volatile environments
• People forget they need space for a checkpoint merge to happen
• They are bad for performance, and it only gets worse the bigger the checkpoint chain becomes — you pay even more in lost performance during backups
• They can complicate migrations to different hardware or Hyper-V versions, which matters when you need to replace the Hyper-V host or migrate to a newer version of the parent partition
• They can wreck applications or even wreak havoc in your environment leading to data loss, service
down time and the cost of recovering from this — they do NOT replace backups
• You can’t extend a VM’s disks that have checkpoints because it will cause problems and potential data loss
• Some things have improved over time, since we can now merge checkpoints online. In the past, people who deleted a checkpoint (or snapshot, as it was called then) and forgot about it were surprised that it still had to merge when they shut down the VM. That led to issues ranging from longer downtime, because a large avhdx takes time to merge, to outright failure: you need sufficient free space to merge the avhd(x) into the vhd(x), which might not be available after a longer period of time. But the good news is, merging checkpoints can leverage ODX!
• Tracking and managing large numbers of checkpoints is tedious and error prone
Figure 16: Those checkpoints can add up. Don’t leave them lingering around without a reason.
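The merge-space problem from the bullet list above can be sketched as a simple pre-flight check. The rule of thumb in the code is a hedged simplification (enough free space for the largest differencing disk), not Microsoft's exact space requirement, and the sizes are hypothetical:

```python
# Sketch: before deleting checkpoints, the .avhdx files must merge back
# into their parent, which temporarily needs free space on the volume.

def can_merge(avhdx_sizes_gb, free_space_gb):
    # Simplified rule of thumb: worst case, merging needs roughly as much
    # free space as the largest differencing disk still to be merged.
    return free_space_gb >= max(avhdx_sizes_gb)

chain = [120, 45, 80]          # a hypothetical chain of three avhdx files, in GB
print(can_merge(chain, 100))   # False: the 120 GB avhdx won't fit
print(can_merge(chain, 200))   # True
```

Running a check like this before cleanup day is far cheaper than discovering mid-merge that the volume is full.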
Note that we now have a second type of checkpoint in Windows Server 2016: “Production Checkpoints.” Production checkpoints use VSS from within the VM. This is similar to a hardware VSS-assisted SAN snapshot and behaves much like a backup: the VM and the applications are aware of what happened. It’s not just a point-in-time freeze. When you apply a production checkpoint, the VM will be turned off and you’ll need to boot it to restore the VM with application consistency. Applications like SQL Server and Exchange should be able to handle production checkpoints just like they handle normal backups. They have their own requirements and conditions regarding backups that you still have to meet, but this is a supported and feasible way of using a production checkpoint. Do note, however, that production checkpoints by themselves are no substitute for backups!
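As a quick sketch of what that looks like in practice (the VM name is a placeholder, and this assumes a Windows Server 2016 host with the Hyper-V PowerShell module):

```powershell
# Force production checkpoints for this VM; use -CheckpointType Production
# instead if you want a fallback to a standard checkpoint when the
# in-guest VSS request fails
Set-VM -Name 'SQLVM01' -CheckpointType ProductionOnly

# Take an application-consistent checkpoint via the guest's VSS writers
Checkpoint-VM -Name 'SQLVM01' -SnapshotName 'Pre-patch'
```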
The effect of Snapshots and/or checkpoints on UNMAP
When you are leveraging UNMAP together with snapshots or checkpoints, you might notice that it doesn’t seem to work, or that when it does, it’s with a certain time delay. It seems unpredictable to people. To understand why, we need to understand how the two technologies affect each other.
Snapshots on SANs are used for automatic data tiering, data protection and various other use cases. As long as those snapshots live, so does the data in them: UNMAP will not free up space on thinly provisioned LUNs on the SAN. This is normal, because the data is still stored on the SAN for those snapshots, and hard-deleting it from the VM or host has no impact on the space the SAN uses until those snapshots are deleted or expire. Only the active portion of the data is directly affected.
This is also the case with Hyper-V checkpoints. When you create a checkpoint, the VHDX is kept and you write to the avhdx (differencing disk), meaning that any UNMAP activity will only affect data in the active avhdx file and not in the “frozen” parent file. When you have a longer chain, data is spread over even more avhdx files and the results of UNMAP become even less predictable.
The good news is, it all still works. As your snapshots expire or your checkpoints are deleted or merged, the benefits will materialize. This should not be an issue, as the true value of thin provisioning and UNMAP is realized over time. If it is a concern for certain VMs, don’t use checkpoints or snapshots on the LUN where they reside.
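If you want to see the space come back sooner once checkpoints are merged or SAN snapshots have expired, you can trigger UNMAP manually instead of waiting for the maintenance job. A minimal sketch, run inside the guest against a placeholder D: volume:

```powershell
# Report how much space could be reclaimed on the volume
Optimize-Volume -DriveLetter D -Analyze -Verbose

# Send UNMAP (retrim) for all free space down the storage stack
Optimize-Volume -DriveLetter D -ReTrim -Verbose
```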
Is fragmentation an issue in all of this?
You’ll most likely have more urgent concerns than fragmentation in your Hyper-V environment, at least on modern hardware with a well-designed Hyper-V deployment and the correct choice of virtual disk. But it does exist, and both extremes (“it’s a huge, widespread issue” and “it doesn’t matter at all”) aren’t very helpful. Let’s take a quick look at fragmentation, especially in relation to thin provisioning and dynamically expanding virtual disks.
On the storage array
Great news: the internal structure of the storage array isn’t your problem. It’s the vendor’s responsibility. In modern storage arrays with auto-tiering, snapshots, deduplication and virtualized storage systems, the LUNs aren’t exactly contiguous files, and Windows doesn’t know anything about that anyway. Then there’s the fact that with SSDs, “classic” fragmentation is a non-issue, and we’re seeing ever more SSDs out there. However, even with SSDs and “virtualized storage” in storage arrays, fragmentation still comes into play for performance. The point is that storage vendors make their money from delivering robust, reliable, highly performant and capable storage. They need to worry about how they solve the technical challenges.
The storage array is an abstraction layer, and the only thing you can do is ask your storage vendor for advice on what to do about defragmentation on the Windows and Hyper-V side of things. There’s a 95% chance they’ll tell you that they handle it “under the hood,” tell you to “ask Microsoft,” or say nothing at all. I’ve learned to stop worrying and choose a good storage solution that I love.
In regards to Hyper-V
There are three types of disk fragmentation you might need to deal with in regards to Hyper-V:
1. Fragmentation of the file system on the host LUN where the VMs reside
2. Fragmentation of the file systems on the LUNs inside of the VM
3. Block fragmentation of the VHDX itself: a potential issue with dynamically expanding and differencing disks
On the Hyper-V host
When you use a lot of dynamically expanding virtual disks, and they all grow frequently over time, fragmentation will build up. This may or may not cause performance issues; things aren’t as simple on a modern storage array as they used to be on a single HDD. The good news is, when this becomes a problem and you’ve weighed the impact on other subsystems, you can defragment your LUNs, and even the CSVs, without down time. Check out the Microsoft blog post on the subject, How to Run ChkDsk and Defrag on Cluster Shared Volumes in Windows Server 2012 R2. Decent third-party tools will support this by default.
Take a look at the Optimize Drives GUI from a Hyper-V host that has both local, non-CSV LUNs and CSV LUNs.
Figure 17: A Hyper-V cluster host with local and SAN drives.
The local disks are identified as hard disk drives; the SAN LUNs, both non-CSV and CSV, are identified as thin provisioned drives. Pretty smooth. Here’s another from a lab server of mine. You can see that Windows tries to identify the type of disk and will run the optimization based on that. If you think that SSD/flash storage means absolutely no fragmentation, I invite you to read Scott Hanselman’s excellent post on the subject, The real and complete story — Does Windows defragment your SSD?.
Figure 18: A server with local direct attached storage only, showing both solid state and hard disk drives
Inside the VM
That’s your standard, run-of-the-mill fragmentation within the OS and data partitions. Whether this is problematic and requires regular maintenance depends on the characteristics of the workload. Since Windows Server 2012, we have the storage optimizer, which runs as a maintenance job in Windows. The storage optimizer will optimize things for you as best it can in an intelligent way. Please note that its capabilities have evolved a lot over the various Windows versions, so read What’s new in defrag for Windows Server 2012/2012R2.
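You can verify that this maintenance job is actually enabled, and see what the optimizer would do, without changing anything. A sketch, assuming the default scheduled task location:

```powershell
# The storage optimizer runs as this built-in scheduled task
Get-ScheduledTask -TaskPath '\Microsoft\Windows\Defrag\' -TaskName 'ScheduledDefrag' |
    Select-Object TaskName, State

# Analyze only: reports fragmentation without touching the volume
Optimize-Volume -DriveLetter C -Analyze -Verbose
```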
I do need to clarify one thing. When you look at Optimize Drives inside a Windows Server 2012 R2 VM, you’ll see all disks identified as thin provisioned drives.
Figure 19: Optimize Drives in the guest OS reporting the media type — It’s all thin provisioned drives.
The screenshot is of a VM that has every type of disk attached. The Optimize Drives GUI inside this VM shows you a mix of fixed and dynamically expanding VHD/VHDX virtual disks. The VM resides on a CSV provided by a modern SAN. So why are they all identified as “thin provisioned drives”? A VHD doesn’t know about UNMAP, does it? How does that compute? Well, my understanding is that all virtual disks, dynamically expanding or fixed, both VHDX and VHD, are identified as thin provisioned disks no matter what type of physical disk they reside on (CSV, SAS, SATA, SSD, shared/non-shared). This allows UNMAP commands to be sent from the guest to the Hyper-V storage stack below. If it’s a VHD, those UNMAP commands are basically black-holed, just like they’d never be passed down to a local SATA HDD (on the host) that has no idea what UNMAP is or what it’s used for. The physical layer will figure out what to do.
But, to get back to the part about fragmentation: unless you see serious performance degradation, you’ll be fine without intervention. The key is to know your workloads and data/file behavior, so you know where fragmentation might cause issues, and to monitor for that.
The internal VHDX structure
The least-known one to watch is internal fragmentation of the VHDX file itself. Not because it’s a huge problem, but because few people are aware of it. It’s easy to spot via PowerShell, and it’s fixed by creating a new virtual disk using the fragmented one as the source. See Hyper-V and Disk Fragmentation. But that would cause down time, so the best way to handle it is to leverage storage live migration. If you can leverage ODX, it’s also pretty fast. So, always leave yourself some margin on storage space to be able to perform maintenance operations via storage live migration. However, there is little guidance on what percentage is acceptable. I wouldn’t go through the effort for less than 15% fragmentation. Your mileage may vary.
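As a sketch of both the check and the fix (the VM name and destination path are placeholders), block fragmentation shows up as the FragmentationPercentage property reported by Get-VHD, and a storage live migration rewrites the file at the destination:

```powershell
# Report block fragmentation for every virtual disk attached to the VM
Get-VM -Name 'FileServer01' | Get-VMHardDiskDrive | Get-VHD |
    Select-Object Path, VhdType, FragmentationPercentage

# If the percentage is high (say, above ~15%), move the storage; the
# copy lays the VHDX down contiguously, without down time for the VM
Move-VMStorage -VMName 'FileServer01' `
    -DestinationStoragePath 'C:\ClusterStorage\Volume2\FileServer01'
```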
Closing thoughts on fragmentation
Dynamically expanding virtual disks are more prone to fragmentation than fixed virtual disks. However, experience shows me that heavy IO workloads with many reads, writes and deletes, and potentially a large delta due to the volatile nature of the data, are not good candidates for dynamically expanding virtual disks in terms of performance anyway. So choose your workloads for dynamically expanding virtual disks wisely, and your fragmentation problems will be kept to a minimum as well.
Read up on the subject in What’s new in defrag for Windows Server 2012/2012R2 and you’ll learn that running defragmentation is not a simple yes-or-no question. Defragmentation itself has changed in what it does and how it does it. It has evolved with the versions of Windows and with modern tiered storage, SSD/flash, deduplication and the like. Just doing classic, old-school defragmentation out of habit is pure institutional inertia at work. Don’t be like that. Find out whether there is a real issue, and if so, solve it in an intelligent manner. Stating that it’s not an issue anymore is overly simplistic. Observe, decide, act (ODA) is the way to go. Don’t just “act” by default “just in case.” If you do decide to leverage third-party solutions for defragmentation, choose a good-quality, modern product that knows how to deal with modern storage and not just hard disk drives.
Other capabilities enhancing these efficiencies
ODX
In all situations where the storage array (or arrays) supports offloaded data transfer (ODX), you’ll see a significant increase in performance when copying or moving data. The array or arrays do the heavy lifting under the hood.
You’ll enjoy the benefits from ODX when:
• Resizing VHD/VHDX virtual hard disks
• Creating VHD/VHDX virtual hard disks
• Merging checkpoints or differencing disks
• Storage live migration or shared nothing live migration
• Copying data on the same LUN or between LUNs on the same servers
(physical and virtual) or different servers
As you can see, Hyper-V efficiencies are enhanced by ODX when it’s available at both the virtual and the physical layer.
There’s only one rule: the data has to live within a storage array, or a cluster of arrays, that supports ODX. At a high level, ODX works with a token.
Figure 20: ODX
Windows ODX allows direct data transfer within or between compatible storage devices, bypassing
the servers . This minimizes latencies, maximizes throughput and reduces CPU and network load on the
server hosts . It all happens transparently and requires no change in processes .
ODX uses a token for reading and writing data within or between intelligent storage arrays. The token is only exchanged between the source and the target. This token is what allows the process to be transparent to the operating system while the actual data transfer happens in or between the storage arrays. ODX avoids all the overhead associated with “ordinary” data copies or moves, which burn CPU cycles, incur multiple data copy actions and require context switches to push the data across the TCP/IP stack between servers.
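A quick way to confirm ODX is in play on a host, and to see the effect, is sketched below (the registry value comes from Microsoft’s ODX documentation; the file paths are placeholders):

```powershell
# 1 disables ODX on this host; 0, or the value being absent, means
# ODX is enabled
Get-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem' |
    Select-Object FilterSupportedFeaturesMode

# On an ODX-capable array, a copy between two LUNs on that array is
# offloaded: it finishes at array speed with barely any host CPU or I/O
Measure-Command { Copy-Item 'E:\VMs\template.vhdx' 'F:\VMs\clone.vhdx' }
```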
ReFS
In Windows Server 2016, ReFS is enhanced with capabilities that make it extremely fast at file operations that are important and useful in Hyper-V scenarios. The creation and merging of virtual disks now happens at lightning speed. For this set of actions, it’s like getting ODX behavior without needing a specialized storage array: an impressive and very economical way of getting even better results from commodity storage.
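You can see this for yourself with a quick experiment (the drive letters are placeholders for an ReFS and an NTFS volume on the same Windows Server 2016 host):

```powershell
# Creating a large fixed VHDX on ReFS is near-instant, while NTFS has
# to zero out the full 100 GB
Measure-Command { New-VHD -Path 'R:\refs-test.vhdx' -SizeBytes 100GB -Fixed }
Measure-Command { New-VHD -Path 'N:\ntfs-test.vhdx' -SizeBytes 100GB -Fixed }
```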
Conclusion
Thin provisioning, UNMAP and snapshots are very important tools, available to you to provide efficiencies and speed at both the virtual and the physical storage layer. It would be a tremendous waste to ignore them, whether by not realizing what they can do for you or out of fear of the risks. Learn to master your tools and use them to your advantage. That means knowing their strengths and weaknesses so you can manage the risk while achieving your goals. The raw power of storage (speed, low latency and capacity) can only shine brighter when assisted by correctly used intelligent features that help us deliver efficiencies. Other storage enhancements like ODX and ReFS act as force multipliers and help storage, which has become very scalable and capable in Windows Server 2012 R2 Hyper-V, shine even more. All we need to do is remember that with great power comes great responsibility. Knowledge is your friend. Thin provisioning, UNMAP and snapshots will deliver great value when used correctly for the right workloads and scenarios.
About the Author
Didier Van Hoye is an IT veteran with over 17 years of expertise in Microsoft technologies, storage, virtualization and networking. He works mainly as a subject matter expert advisor and infrastructure architect in Wintel environments, leveraging DELL hardware to build the best possible high-performance solutions with great value for money. He contributes his experience and knowledge to the global community as a Microsoft MVP in Hyper-V, a Veeam Vanguard, a member of the Microsoft Extended Experts Team in Belgium and a DELL TechCenter Rockstar. He shares his expertise through blogging, writing and public speaking.
Twitter @workinghardinit
Blog http://blog.workinghardinit.work
LinkedIn http://be.linkedin.com/in/didiervanhoye
About Veeam Software
Veeam® recognizes the new challenges companies across the globe face in enabling the Always-On Business™, a business that must operate 24/7/365. To address this, Veeam has pioneered a
new market of Availability for the Always-On Enterprise™ by helping organizations meet recovery
time and point objectives (RTPO™) of less than 15 minutes for all applications and data, through
a fundamentally new kind of solution that delivers high-speed recovery, data loss avoidance,
verified protection, leveraged data and complete visibility . Veeam Availability Suite™, which
includes Veeam Backup & Replication™, leverages virtualization, storage, and cloud technologies
that enable the modern data center to help organizations save time, mitigate risks, and
dramatically reduce capital and operational costs .
Founded in 2006, Veeam currently has 37,000 ProPartners and more than 183,000 customers
worldwide . Veeam's global headquarters are located in Baar, Switzerland, and the company has
offices throughout the world . To learn more, visit http://www.veeam.com .