VMware memory reclamation techniques are very well covered on the internet; a quick search will yield lots of information. This post covers the topic at a fairly high level to help educate people new to VMware virtualization. I will also explain host memory states, which determine when and in which order the memory reclamation techniques are employed.
Memory reclamation, as the term suggests, is a method of reclaiming memory that has been assigned to VMs. ESXi assigns memory resources to VMs according to the amount of memory those VMs have been configured to use. If ESXi has free or spare memory after satisfying all VMs’ configured memory demands then there is no need to reclaim memory. Memory reclamation only comes into play when the host begins to run out of physical memory and cannot allocate any more to VMs.
There are 4 memory reclamation techniques in total:
- Transparent Page Sharing
- Ballooning
- Hypervisor swapping
- Memory compression
Ballooning and Hypervisor swapping are dynamic in that they expand or contract the amount of memory allocated to VMs based on the amount of free memory on the host.
TPS (Transparent Page Sharing)
In a typical virtual environment, it is very likely that a high proportion of VMs will be running the same operating system and will therefore load the same pages of data into memory. In such situations the hypervisor employs TPS to keep just a single copy of each identical page and safely eliminate the redundant copies; it is basically memory deduplication. TPS reduces the host memory consumed by the VMs. TPS is enabled by default, and its effectiveness depends on the number of identical pages that can be identified in memory and subsequently reclaimed. TPS can reduce memory consumption by up to 70% (for example in VDI environments with many identical operating systems), it is transparent to the VMs, and it incurs no performance penalty.
With TPS, a workload running within a VM often consumes less memory than it would when running on a physical machine.
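The deduplication idea behind TPS can be sketched in a few lines of Python. This is only an illustration of the concept, not how ESXi implements it: the function name `deduplicate`, the 4KB page size and the use of SHA-1 as the hash are all assumptions made for the example. The hash is used as a hint to find candidate matches, and a full byte comparison confirms that two pages really are identical before they share a copy.

```python
import hashlib

PAGE_SIZE = 4096  # assumption: 4KB pages, chosen for illustration


def deduplicate(pages):
    """Keep one stored copy per distinct page (hypothetical helper).

    Returns (unique, mapping): `unique` holds the stored copies and
    `mapping[i]` is the index in `unique` that backs input page i.
    """
    unique = []    # the single stored copy of each distinct page
    by_hash = {}   # digest -> indices into `unique` with that digest
    mapping = []   # backing copy for every input page

    for page in pages:
        digest = hashlib.sha1(page).digest()
        for idx in by_hash.get(digest, []):
            # The hash is only a hint; compare the bytes to be sure.
            if unique[idx] == page:
                mapping.append(idx)
                break
        else:
            unique.append(page)
            by_hash.setdefault(digest, []).append(len(unique) - 1)
            mapping.append(len(unique) - 1)
    return unique, mapping


# Four guest pages, but only two distinct contents.
zero_page = bytes(PAGE_SIZE)
os_page = b"A" * PAGE_SIZE
unique, mapping = deduplicate([zero_page, os_page, zero_page, os_page])
print(f"{len(mapping)} guest pages backed by {len(unique)} stored pages")
```

The more VMs load identical pages, the smaller `unique` is relative to `mapping`, which is where the memory saving comes from.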
Ballooning
Ballooning is a dynamic memory reclamation technique used to reclaim memory pages when host memory is in demand and the available physical pages cannot meet requirements. A memory balloon driver (vmmemctl) is loaded into the guest OS running within a VM. The balloon driver communicates directly with the host and is made aware of its low memory status. The guest OS itself is aware of neither this communication nor the low memory status of the host. The host indicates the number of physical pages it needs to reclaim, which determines the balloon size. The balloon driver then inflates to that size by allocating the corresponding amount of memory within the guest OS, and pins the allocated pages so that the guest OS doesn’t page or swap them out. The balloon driver then informs the host which pages it has allocated. The host finds the physical memory that backs those pinned pages and uses it to satisfy the memory demands of other running workloads.
Ballooning creates artificial memory pressure within the guest OS, which can force it to use its own native memory management techniques to handle the changing conditions. Once memory pressure on the host has eased, the host returns those pages to the guest OS by deflating the balloon. VMware Tools contains the balloon driver and must be installed for ballooning to work on a VM.
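To make the inflate and deflate cycle more concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not the vmmemctl driver: the `Guest` class, its fields and the page counts are all hypothetical. It shows the key effect described above: when the balloon asks for more pages than the guest has free, the guest has to page out some of its own data to make room.

```python
class Guest:
    """Toy guest OS with a balloon driver (hypothetical model)."""

    def __init__(self, total_pages, used_pages):
        self.total_pages = total_pages
        self.used_pages = used_pages   # pages holding guest data
        self.balloon_pages = 0         # pages pinned by the balloon
        self.guest_swapped = 0         # pages the guest paged out itself

    def free_pages(self):
        return self.total_pages - self.used_pages - self.balloon_pages

    def inflate(self, target):
        """Pin `target` more pages and return how many the host can reclaim."""
        # The balloon can never pin more than the guest actually has.
        target = min(target, self.total_pages - self.balloon_pages)
        shortfall = max(0, target - self.free_pages())
        # Artificial memory pressure: the guest pages out its own data.
        self.used_pages -= shortfall
        self.guest_swapped += shortfall
        self.balloon_pages += target
        return target

    def deflate(self, pages):
        """Release pinned pages back to the guest once pressure has eased."""
        released = min(pages, self.balloon_pages)
        self.balloon_pages -= released
        return released


guest = Guest(total_pages=1000, used_pages=700)
guest.inflate(200)   # fits within the 300 free pages
guest.inflate(200)   # only 100 free now, so the guest pages out 100
print(guest.balloon_pages, guest.guest_swapped)
guest.deflate(400)   # host pressure has eased, balloon is emptied
```

The model leaves out the host side entirely; in the real system the host reuses the physical memory behind the pinned pages for other workloads.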
Hypervisor swapping
Swapping is also a dynamic memory reclamation technique. To be clear, this is hypervisor swapping, not guest OS swapping. It is used as a last resort, when the other memory reclamation techniques have failed to relieve memory pressure on the host. In this technique, the VM’s swap file, which is created when the VM is powered on, is used to swap out memory from the VM in order to reclaim it. It is a last resort because swapping memory to a file is very slow and has a significant performance impact. If you have reached the point where hypervisor swapping is being used, you need to fix the problem fast and order more physical memory!
Memory compression
Memory compression is employed when the host is under memory pressure and ballooning has already been used. It is enabled by default. It compresses memory pages and stores them in a cache in the host’s main memory. ESXi decides whether to compress a page by checking its compression ratio: a page is compressed if its compression ratio is greater than 50%. Otherwise, no performance benefit is derived from compressing it, and the page is swapped out. Because compressed memory stays in RAM and can be retrieved by decompression, it is still much faster than hypervisor swapping, which involves writing pages to a file. Memory compression can significantly improve application performance when the host is under memory pressure.
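The compress-or-swap decision described above can be sketched as follows. This is an illustration only: zlib stands in for whatever compressor ESXi actually uses, `reclaim_page` is a hypothetical name, and the "greater than 50%" rule is read here as "the page shrinks to less than half its original size".

```python
import os
import zlib

PAGE_SIZE = 4096  # assumption: 4KB pages


def reclaim_page(page):
    """Return ("compress", data) or ("swap", page) for one page (hypothetical)."""
    compressed = zlib.compress(page)
    # Assumed reading of the 50% rule: keep the page in the compression
    # cache only if it shrinks to less than half its original size.
    if len(compressed) * 2 < len(page):
        return "compress", compressed
    return "swap", page


zero_page = bytes(PAGE_SIZE)          # highly compressible
random_page = os.urandom(PAGE_SIZE)   # effectively incompressible

print(reclaim_page(zero_page)[0])
print(reclaim_page(random_page)[0])
```

A compressed page can be brought back with `zlib.decompress`, which is a memory-to-memory operation and therefore far cheaper than reading the page back from a swap file.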
Host free memory states
So at what point does ESXi actually start to reclaim memory? Which memory reclamation technique does it use first? Both are determined by the host free memory state ESXi is in at the time. So how do you know which state that is? There are 5 host free memory states (in vSphere 6), each defined by a threshold expressed as a percentage of minFree, a dynamic value determined by the total amount of physical memory in the host. Comparing the host’s free memory with these thresholds tells you which state it is in. The table below provides some context:
| Memory State | Threshold | Reclamation Action |
|---|---|---|
| High | < 400% of minFree | TPS breaks large pages when below this threshold |
| Clear | < 100% of minFree | TPS breaks large pages and actively collapses pages |
| Soft | < 64% of minFree | TPS + Ballooning |
| Hard | < 32% of minFree | TPS + Compression + Swapping |
| Low | < 16% of minFree | Compression + Swapping + Blocking |
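The table can be turned into a small lookup, sketched here in Python. The function name `host_memory_state` is hypothetical, and so is the treatment of hosts with free memory at or above 400% of minFree, which the table does not cover: this sketch assumes they are still in the High state with no reclamation action. Thresholds are checked from most to least severe so that the lowest matching one wins.

```python
# (upper bound as % of minFree, state, action), most severe first
THRESHOLDS = [
    (16, "Low", "Compression + Swapping + Blocking"),
    (32, "Hard", "TPS + Compression + Swapping"),
    (64, "Soft", "TPS + Ballooning"),
    (100, "Clear", "TPS breaks large pages and actively collapses pages"),
    (400, "High", "TPS breaks large pages"),
]


def host_memory_state(free_mb, min_free_mb):
    """Map free memory to (state, action) using the table above."""
    pct_of_min_free = free_mb / min_free_mb * 100
    for upper_bound, state, action in THRESHOLDS:
        if pct_of_min_free < upper_bound:
            return state, action
    # Assumption: plenty of free memory, nothing to reclaim.
    return "High", "no reclamation action"


# Hypothetical host: 700MB free against a minFree of 1000MB (70%).
print(host_memory_state(700, 1000))
```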
So how do you calculate minFree? I found this excellent article that explained how to calculate it.
You calculate minFree as follows:
- Note the amount of physical memory in the host.
- For the first 28GB of physical memory in the host, minFree = 899MB.
- Add 1% of the remaining physical memory to that 899MB.
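The steps above can be written as a short Python function. Treat it as a sketch: the name `min_free_mb` is made up for this post, it assumes 1GB = 1000MB to keep the arithmetic round, and it assumes hosts with 28GB or less simply get the flat 899MB, which the steps above do not actually spell out.

```python
FIRST_TIER_GB = 28             # the first 28GB of physical memory...
FIRST_TIER_MIN_FREE_MB = 899   # ...contribute a flat 899MB
MB_PER_GB = 1000               # assumption: decimal conversion


def min_free_mb(physical_gb):
    """Return minFree in MB for a host with `physical_gb` of RAM."""
    remaining_gb = max(0, physical_gb - FIRST_TIER_GB)
    # 1% of whatever is left above the first 28GB
    one_percent_mb = remaining_gb * MB_PER_GB / 100
    return FIRST_TIER_MIN_FREE_MB + one_percent_mb


# Hypothetical host with 128GB of physical memory.
print(f"{min_free_mb(128):.0f}MB")
```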
Let’s illustrate this with an example where our host has 50GB of physical memory, of which only 500MB is free. The table below calculates the minFree value for our host as 1119MB:
| Component | Value |
|---|---|
| First 28GB of RAM | 899MB |
| 1% of remaining RAM ((50GB - 28GB) x 1%) | 220MB |
| minFree | 1119MB |
We now compare minFree, which in our example is 1119MB, with the amount of free RAM in ESXi, which is 500MB. That works out to roughly 45% of minFree (500 / 1119). Checking the thresholds table above, 45% is below the 64% Soft threshold but above the 32% Hard threshold, so the host would be in the Soft state, which means it would be employing TPS and Ballooning to reclaim memory from VMs.
vSphere 6 ESXi memory state and reclamation techniques
What happens at which vSphere memory state?
vSphere 5 memory management explained (part 2)
vSphere Resource Management
VMware Memory Management Part 1 – Understanding ESXi Host Memory States