vNUMA CPU Alignment – doing it right

As I’ve said before, the majority of our VMware environment is running on Cisco UCS blade servers, and the majority of these are running dual hex-core CPUs. With the broad spectrum of Operating Systems and applications running across our many hundreds of VMs, there are inevitably many, many VMs with multiple vCPUs.

This shows the CPU Ready/Usage stats before and after re-aligning vCPU configuration from single core-multiple sockets, to multiple core-single socket

NUMA, in a nutshell, utilises the host CPU’s local memory bus to allow faster memory access time for workloads on that specific CPU. This is particularly important for latency sensitive workloads. vNUMA is VMware’s implementation of utilising NUMA to reduce memory latency for VMs running on ESXi.

When looking at CPU performance issues with a guest VM, no host level contention for CPU resources existed. Through the VM performance graphs in the vSphere client, CPU Ready times could be seen to be spiking often, this was on a VM with multiple CPUs set to 2 x socket and 8 x cores (16 vCPU).

This shows the CPU Ready/Usage stats before and after re-aligning vCPU configuration from single core-multiple sockets, to multiple core-single socket; a marked difference

 

This problem is fairly well documented, and is detailed in VMware KB 1026063; when utilising the NUMA features of VMware, it is important to configure the vCPU layouts for your VMs to align with the physical characteristics of your hosts if possible; this will help to guarantee that the VM guest can be vMotioned to other hosts in the cluster, with different physical CPU configurations. A better solution, where identical CPU configurations can not be guaranteed across all your hosts, is to define all vCPUs as x sockets with a single core.

In this case our physical CPU architecture is 2 sockets x 6 cores, and the VM was configured with 1 socket x 8 cores. This prevents the hypervisor and guest OS from completely utilising either one, or both sockets in our physical host, and is therefore missing out on the speed benefits which NUMA can bring. And can be exhibited as high or consistently high ready States for this VM guest.

There is a great VMware blog article about the performance implications of vNUMA design selection here, which echoes the VMware Best Practices guide, stating that the best way to approach this is to set your vCPU configuration ‘flat and wide’. this means that if your VM requires 8 cores, then configure your vCPU with 8 sockets with 1 core each, rather than 2 quad-core or 4 dual-core sockets.

This allows the vNUMA technology to balance the load as it best sees fit and should prevent CPU Ready spike issues. There are of course edge cases, where specific software licensing may force you to use as few sockets as possible, in which case alignment to physical host CPU should always be attempted, in the case above, the VM could have been configured to 1 socket x 6 cores, or 2 sockets x 6 cores. Be aware, that stepping outside of the ‘flat and wide’ model will prevent vNUMA from doing its job, and will bow to your judgement of vCPU configuration; this means you had better have got it right!

What are the problems in my environment?

I have worked as a SysAdmin for around 7 years, working across a wide range of technologies. My current role is a 3rd Line Server Infrastructure Engineer, specifically supporting VMware and Cisco UCS platforms providing IaaS for over 50 clients totalling around 1500 VMs.

The virtualised environment I support is almost exclusively running on vSphere 5.1, we have 5 different Production vCenter servers, none of them linked, and running on a variety of hardware. The majority of the compute is running on Cisco UCS blades, which are a challenge to manage in themselves, but more on that another time. Most of our storage is EMC VNX, with some IBM SVC fronted kit thrown in for good measure, all running on Fibre Channel, and our network stack is running on Cisco Nexus switches, using the simultaneously great, and terrible, Cisco Nexus 1000v for our vDS.

I guess this is all pretty standard stuff, so what problems do I see on a daily basis? Well I plan to talk in this blog about the problems I see, how we can get around them, and what steps we can take to more effectively manage the issues thrown up by this infrastructure.