What is vNUMA?

I am occasionally asked what vNUMA is, so I decided to write this post for those inquisitors. It aims to explain things at a fairly high level and will hopefully enable readers to understand what vNUMA is and why it is important.

VMware introduced vNUMA (Virtual Non-Uniform Memory Access) in vSphere 5. It is a technology feature that exposes the underlying NUMA architecture of the hypervisor to the VMs running on it. Assuming that those VMs run operating systems that are NUMA-aware, they could potentially gain significant performance increases from seeing the underlying NUMA architecture.

In order to fully understand vNUMA and its benefits we must first explain UMA (Uniform Memory Access) and NUMA (Non-Uniform Memory Access).

UMA

Uniform Memory Access is a shared memory architecture where all the processors access the same physical memory with uniform latency. This configuration is also known as a Symmetric Multi-Processing (SMP) system. The graphic below illustrates the UMA architecture:

[Diagram: UMA architecture – two processors sharing the same memory over a single bus]

As you can see, both processors have direct access to the same memory over the same bus, and this access is uniform or equal, meaning that no processor has a performance advantage over another when accessing memory addresses. This architecture is suitable for general-purpose, time-critical applications used by multiple users, as well as for large single programs in time-critical applications.

The problem with UMA was that the demand for much larger server systems resulted in more processors sharing the same memory bus, which increased memory access latency. This in turn impacted operating system and application performance.

NUMA

NUMA architecture works by linking memory directly to a processor to create a NUMA node. Here, all processors have access to all memory, but that memory access is not uniform or equal. The graphic below illustrates this:

[Diagram: NUMA architecture – each processor with its own local memory, linked by an interconnect for remote access]

As you can see, each processor has direct access to its own memory; this is known as local memory. It can also access memory attached to the other processor; this is known as remote memory. Access to remote memory is significantly slower than access to local memory, hence why the memory access is non-uniform.

Memory access times are not uniform and depend on the location of the memory and the node from which it is accessed, as the technology’s name implies.

The main benefit of NUMA architecture is reduced memory latency and improved application memory performance. To realise these benefits the operating system must be NUMA-aware, so that it can place applications within specific NUMA nodes and prevent them from crossing NUMA-node boundaries.

vNUMA

As mentioned earlier, vNUMA exposes the underlying NUMA architecture of the hypervisor to the VMs running on it. As long as those VMs run operating systems that are NUMA-aware, they can make the most intelligent and efficient use of the underlying processors and memory.

vNUMA is a technology feature that was introduced with vSphere 5. In earlier versions of vSphere, a VM whose vCPUs spanned multiple physical sockets would think it was running on a UMA system, which undermined the guest operating system's NUMA-aware management features. This could significantly impact the performance of the VM.

With the increased demand for ever larger VMs, we are now seeing wide or monster VMs becoming the norm in enterprise cloud environments, especially when those VMs are running critical workloads. The latest version of vSphere at the time of writing, vSphere 6, supports VMs with 128 vCPUs and 4TB of memory. As VMs move towards these upper limits in size, they will almost certainly span multiple NUMA nodes, and this is where vNUMA can significantly improve the system and application performance of these large, high-performance VMs.
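If you want to check what topology a NUMA-aware guest actually sees, most Linux distributions can report it from inside the VM. As a rough sketch (assuming the numactl package is installed in the guest, which is outside the scope of this post):

numactl --hardware
lscpu | grep -i numa

A VM that has been presented with a vNUMA topology will report more than one node here; without vNUMA the guest sees a single node regardless of how the host is laid out.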

Below are some important points relating to vNUMA:

  • vNUMA requires VMs to run virtual hardware version 8 or above.
  • The hypervisor must run vSphere 5.0 or above.
  • The hypervisor must be running on NUMA-enabled hardware.
  • vNUMA is automatically enabled for VMs with more than 8 x vCPUs (so 9 x vCPUs or more).
  • To enable vNUMA on VMs with 8 x vCPUs or fewer, it must be done manually. It can be set in the VM’s Configuration Parameters, as shown in the example after this list.
  • vNUMA will not be used on a VM with vCPU hotplug enabled; in fact, the VM will use UMA with interleaved memory access instead.
  • A VM’s vNUMA topology is set based on the NUMA topology of the hypervisor it is started on. It retains that vNUMA topology even if it is migrated to another hypervisor in the same cluster. This is why it is good practice to build clusters with identical physical hosts/hardware.
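As a rough example of the manual route mentioned above, the advanced setting that controls this threshold is numa.vcpu.min; the value shown below is an illustrative assumption, not a recommendation. Adding the following line to a VM’s Configuration Parameters (or to its .vmx file while the VM is powered off) exposes a virtual NUMA topology to VMs with 4 or more vCPUs instead of the default 9:

numa.vcpu.min = "4"

As with any advanced setting, test a change like this on a non-production VM first.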

VM Sizing

VMware recommends sizing VMs so they align with physical NUMA boundaries. So, if your hypervisor has 8 x cores per socket (octo-core), the NUMA node is assumed to contain 8 cores, and you should size your virtual machines in multiples of 8 x vCPUs (8 vCPUs, 16 vCPUs, 24 vCPUs, 32 vCPUs, etc.).
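If you are unsure of your host’s physical layout, you can check it from the ESXi shell before settling on VM sizes. A quick sketch using standard esxcli namespaces (the exact output fields vary between ESXi builds):

~ # esxcli hardware memory get
~ # esxcli hardware cpu global get

The first command reports the NUMA node count and the second reports CPU packages, cores and threads; dividing cores by packages gives you the cores per NUMA node to size against.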

VMware recommends sizing VMs with the default value of 1 core per socket (with the number of virtual sockets therefore equal to the number of vCPUs). So in our octo-core server, if we require 8 x vCPUs, we would configure the VM as below:

[Screenshot: VM CPU settings with 8 virtual sockets and 1 core per socket]

By changing the configuration to 8 cores per socket, it still aligns correctly with the NUMA node as there are 8 cores:

[Screenshot: VM CPU settings with 1 virtual socket and 8 cores per socket]

However, manually changing the cores-per-socket value can result in reduced performance if the resulting topology does not match the physical one. According to this VMware article, such configurations were found to be less than optimal and resulted in significant increases in application execution time. The conclusion was that the corespersocket configuration of a virtual machine does indeed have an impact on performance when the manually configured vNUMA topology does not optimally match the physical NUMA topology.

If a VM were sized incorrectly and did not match the underlying NUMA topology, it would suffer reduced performance. Using our example where the hypervisor has 8 cores per socket, if a VM with 10 virtual sockets (a total of 10 cores) were created, it would breach the NUMA boundary: 8 of the cores would come from one NUMA node and 2 would come from another. This would result in reduced performance, as applications would be forced to access remote memory and incur a performance hit. The size of that hit depends on factors unique to the specific VM.

References:
vNUMA: What it is and why it matters
Understanding vNUMA (Virtual Non-Uniform Memory Access)
Sizing VMs and NUMA nodes
ESX 4.1 NUMA Scheduling
SQL Server Virtual Machine vNUMA Sizing
Many Cores per Socket or Single-Core Socket Mystery
Does corespersocket Affect Performance?
Cores Per Socket and vNUMA in VMware vSphere
Performance Best Practices for VMware vSphere® 6.0

Cloning a VM in ESXi 5.x standalone hypervisor using vmkfstools

If you run the ESXi 5.5 hypervisor in a standalone setup, not managed by vCenter, you do not have the ability to clone a VM via the GUI. Any VM that is managed by vCenter can be cloned by simply right-clicking on it and selecting Clone, per the below:

[Screenshot: Clone option in the right-click menu of a vCenter-managed VM]

When you have spent many man-hours configuring a VM, its guest OS and the applications that run within it, it makes no sense to redo all the same work when you need to deploy a new server that will run the same applications and perform the same role. This is one of the main reasons cloning a VM is such a useful feature. My current setup contains only 1 x CentOS VM called lnx-svr-01.vsysad.local, per the below:

[Screenshot: standalone ESXi inventory showing the single VM lnx-svr-01.vsysad.local]

As you can see, the clone option is not available in the GUI:

[Screenshot: right-click menu on the standalone host with no Clone option]

The aim here is to clone lnx-svr-01.vsysad.local to create a new VM called lnx-svr-02.vsysad.local. I will clone the existing VM via the CLI using vmkfstools to produce the new one. To proceed, perform the following steps:

1. Shut down the VM you are going to clone; in my case this is lnx-svr-01.vsysad.local.

2. Connect to the ESXi hypervisor via ssh.

3. Browse the datastore that contains the VM folders; the datastore in question here is hyp1-local-1:

~ # ls /vmfs/volumes/hyp1-local-1/
ISO lnx-svr-01.vsysad.local
Imports vmkdump
~ #

As you can see from the above the only VM folder is lnx-svr-01.vsysad.local because it’s the only one I have running on this hypervisor at the moment.

4. Create a new folder called lnx-svr-02.vsysad.local. This is where our new VM is going to reside on the datastore. To do so run:

~ # mkdir /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local

Then run ls to verify it has been created:

~ # ls /vmfs/volumes/hyp1-local-1/
ISO lnx-svr-01.vsysad.local vmkdump
Imports lnx-svr-02.vsysad.local
~ #

We can see from the output that a folder called lnx-svr-02.vsysad.local has been successfully created.

5. Now list the contents of the folder lnx-svr-01.vsysad.local by running:

~ # ls /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local
lnx-svr-01.vsysad.local-dabdb809.vswp
lnx-svr-01.vsysad.local-flat.vmdk
lnx-svr-01.vsysad.local.nvram
lnx-svr-01.vsysad.local.vmdk
lnx-svr-01.vsysad.local.vmsd
lnx-svr-01.vsysad.local.vmx
lnx-svr-01.vsysad.local.vmx.lck
lnx-svr-01.vsysad.local.vmxf
lnx-svr-01.vsysad.local.vmx~
vmware-1.log
vmware.log
vmx-lnx-svr-01.vsysad.local-3669866505-1.vswp

We can see all the VM files above. The one we are going to copy is called lnx-svr-01.vsysad.local.vmx.

6. Copy lnx-svr-01.vsysad.local.vmx to the folder lnx-svr-02.vsysad.local we created in step 4 above by running:

~ # cp /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmx /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmx

Notice that we are copying it and renaming it to lnx-svr-02.vsysad.local.vmx.

7. Next, we use the vmkfstools command to clone the virtual hard disk (vmdk) of lnx-svr-01.vsysad.local into a new vmdk. If you look back at step 5 you will see the file called lnx-svr-01.vsysad.local.vmdk – we will clone this file into a new one called lnx-svr-02.vsysad.local.vmdk located in the lnx-svr-02.vsysad.local folder we created in step 4. To proceed, I run the following:

~ # vmkfstools -i /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmdk /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmdk -d zeroedthick

You will see the following output:

~ # vmkfstools -i /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmdk /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmdk -d zeroedthick
Destination disk format: VMFS zeroedthick
Cloning disk '/vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmdk'...
Clone: 10% done.

Eventually it will complete successfully:

~ # vmkfstools -i /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmdk /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmdk -d zeroedthick
Destination disk format: VMFS zeroedthick
Cloning disk '/vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmdk'...
Clone: 100% done.
~ #

Please note the -d zeroedthick option used in the vmkfstools command. I wanted the disk to be created as zeroed thick; you can change this option to thin if you want it to be thin provisioned. More options are listed in this VMware KB article.
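For example, a thin-provisioned version of the same clone would look like the command below – a sketch of the command above with only the disk format changed, so adjust the paths for your own environment:

~ # vmkfstools -i /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmdk /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmdk -d thin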

8. So at this stage, we have a new folder called lnx-svr-02.vsysad.local that contains three new files:

~ # ls /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/
lnx-svr-02.vsysad.local-flat.vmdk lnx-svr-02.vsysad.local.vmdk
lnx-svr-02.vsysad.local.vmx
~ #

We copied the lnx-svr-02.vsysad.local.vmx file, and lnx-svr-02.vsysad.local.vmdk was created by the clone process. But you’ll notice a third file, lnx-svr-02.vsysad.local-flat.vmdk. This is the actual virtual disk that contains the data; lnx-svr-02.vsysad.local.vmdk is a descriptor file that simply points to it. The clone process created this file as well.
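If you are curious, you can view the descriptor to see how it references the flat file; it is a small plain-text file, so cat is enough (the exact contents will differ on your system):

~ # cat /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmdk

Look for the extent line that names lnx-svr-02.vsysad.local-flat.vmdk – that line is the link between the small descriptor and the large data file.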

9. The next thing we need to do is edit the lnx-svr-02.vsysad.local.vmx file using vi. I am making an assumption that you know how to use vi to edit files. If you don’t, see this link, which covers the basics. To do so, run:

~ # vi /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmx

Find any string that contains lnx-svr-01.vsysad.local and change it to lnx-svr-02.vsysad.local. So in my case the following lines were changed:

nvram = "lnx-svr-02.vsysad.local.nvram"
displayName = "lnx-svr-02.vsysad.local"
extendedConfigFile = "lnx-svr-02.vsysad.local.vmxf"
scsi0:0.fileName = "lnx-svr-02.vsysad.local.vmdk"
sched.swap.derivedName = "/vmfs/volumes/55627693-62f7625a-65ce-a82066349979/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local-15fdf1aa.vswp"

Save the changes made to lnx-svr-02.vsysad.local.vmx.
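If you prefer not to edit the file by hand, the same rename can be done at copy time with sed in place of the cp in step 6. This is a sketch, assuming the busybox sed shipped with your ESXi build behaves as expected – verify the resulting file against the lines listed above before registering the VM:

~ # sed 's/lnx-svr-01.vsysad.local/lnx-svr-02.vsysad.local/g' /vmfs/volumes/hyp1-local-1/lnx-svr-01.vsysad.local/lnx-svr-01.vsysad.local.vmx > /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmx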

10. So we are now ready to register our VM. Run this command to register it:

~ # vim-cmd solo/registervm /vmfs/volumes/hyp1-local-1/lnx-svr-02.vsysad.local/lnx-svr-02.vsysad.local.vmx

In the vSphere Client you will see the corresponding task:

[Screenshot: Register virtual machine task shown in the vSphere Client]

And BOOM! There it is in your inventory:

[Screenshot: lnx-svr-02.vsysad.local listed in the inventory]
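You can also confirm the registration from the CLI. A quick sketch using vim-cmd (the numeric VM ID returned will differ on your host):

~ # vim-cmd vmsvc/getallvms

The new VM should appear in the list alongside lnx-svr-01.vsysad.local, together with the ID that other vim-cmd vmsvc commands expect.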

A couple of points to mention:

When you power the VM on you will see the following question:

[Screenshot: virtual machine question asking whether the VM was moved or copied]

Select I Copied It and click OK, and the VM will power on without issue.

The next thing to mention is that you should make sure the new VM doesn’t cause an IP conflict with the VM it was cloned from when it is powered on. If the original VM has a static IP configured, an IP conflict will ensue when you power on the new/cloned version. To avoid this, disable the NIC in the VM settings of the new/cloned VM by un-checking Connect at power on, like so:

[Screenshot: VM network adapter settings with Connect at power on un-checked]

Then you can power on the VM, log in to the console, re-IP it, and then check Connected and Connect at power on in the VM settings. Those actions will avoid an IP conflict and potential havoc on your network.
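For reference, re-IPing a CentOS guest like this one typically means editing the interface file and restarting networking. A sketch, assuming CentOS 6 with the traditional network-scripts layout (the interface name and settings are examples only):

vi /etc/sysconfig/network-scripts/ifcfg-eth0
service network restart

In the ifcfg file, change IPADDR and, if HWADDR is present, update or remove it, since the clone has a new MAC address. Once the new address is confirmed, reconnect the NIC in the VM settings as described above.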

Reference:
Cloning and converting virtual machine disks with vmkfstools (1028042)
Registering or adding a virtual machine to the inventory on vCenter Server or on an ESX/ESXi host (1006160)