vRealize Orchestrator and Site Recovery Manager – The Missing Parts (or how to hack SOAP APIs to get what you want)

vRealize Orchestrator (vRO) forms the backbone of vRealize Automation (vRA), and provides the XaaS (Anything-as-a-Service) functionality for this product. vRO has plugins for a number of technologies, both those made by VMware and those from other vendors. Having used vRO to automate various products for the last 6 months or so, I have found that these plugins vary in quality, and some cover more of the underlying product’s functionality than others.

Over the last couple of weeks, I have been looking at the Site Recovery Manager (SRM) plugin (specifically version 6.1.1, in association with vRO 7.0.1, and SRM 6.1), and while this provides some of the basic functionality of SRM, it is missing some key features which I needed to expose in order to provide full-featured vRA catalog services. Specifically, the plugin documentation lists the following as being missing:

  • You cannot create, edit, or delete recovery plans.
  • You cannot add or remove test network mapping to a recovery plan.
  • You cannot rescan storage to discover newly added replicated devices.
  • You cannot delete folder, network, and resource pool mappings.
  • You cannot delete protection groups.
  • The unassociateVms and unprotectVms methods are not available in the plug-in. You can use them by using the Site Recovery Manager public API.

Some of these are annoying, but the last ones, around removing VMs from Protection Groups, are pretty crucial for the catalog services I was looking to put together. I had to find another way to do this, outside of the hamstrung plugin.

I dug out the SRM API Developers Guide (available here), and had a read through it, but whilst describing the API in terms of Java and C# access, it wasn’t particularly of use in helping me to use vRO’s JavaScript based programming to do what I needed to do. So I needed another way to do this, which utilised the native SOAP API presented by SRM.

Another issue I saw when using the vRO SRM plugin was that when trying to add a second SRM server (the Recovery site), the plugin fell apart. It seems that the general idea is you only automate your Protected site with this plugin, and not both sites through a single vRO instance.

I tried adding a SOAP host to vRO using the ‘Add a SOAP host’ workflow, but even after adding the WSDL available on the SRM API interface, this was still not particularly friendly, so this didn’t help too much.

Using PowerCLI, we can do some useful things using the SRM API, see this post, and this GitHub repo, for some help with doing this. Our general approach to using vRO is to avoid using a PowerShell host, as this adds a bunch of complexity around adding a host, and generally we would rather do things using REST hosts with pure JavaScript code. So we need a way to figure out how to use this undocumented SOAP API to do stuff.

Now before we go on, I appreciate that the API is subject to change, and that by using the following method to do what we need to do, the methods of automation may change in a future version of SRM. As you will see, this is a fairly simple method of getting what you need, and it should be easy enough to refactor the payloads we are using if and when the API changes. In addition to this, this method should work for any kind of SOAP or REST based API which you can access through .NET type objects in PowerShell.

So the first thing we need to do is to install Fiddler. There are probably other products about, but this is the easiest tool I found to get what I wanted, and I liked it. Fiddler is a web debugging tool, which I would imagine a lot of web developers are familiar with; it can be obtained here. What I like about it is how simple it makes setting up a man-in-the-middle (MitM) attack to pull out the detail of what is going on. This is particularly useful with PowerShell, because your client machine is the endpoint, so the proxy injection is straightforward without too much messing about.

NOTE: Because this is doing MitM attacks on the traffic, it is

I’m not going to go into installing Fiddler here, as it’s a standard Windows wizard. Once installed, launch the program and you should see something like this:

[Screenshot 1]

If you click in the bottom right, next to ‘All Processes’, you will see it change to ‘Capturing’:

[Screenshot 2]

We are now ready to start capturing some API calls. So open PowerShell. Now to limit the amount of junk traffic we capture, we can set to only keep a certain number of sessions (in this case I set it to 1000), and target the process to capture from (by dragging the ‘Any Process’ button to our PowerShell window).

[Screenshot 3]

[Screenshot 9]

Run the following to connect to vCenter:

Import-Module VMware.VimAutomation.Core
Connect-VIServer -Server $vcenter -Credential (Get-Credential)

[Screenshot 4]

You should see some captures appearing in the Fiddler window. We can ignore these for now, as they are just connections to the vCenter server.

You can inspect this traffic in any case, by selecting a session, and selecting the ‘Raw’ tab in the right hand pane:

[Screenshot 5]

Here we can see the URI (https://<redacted>/sdk), the HTTP method (POST), the headers (User-Agent, Content-Type, SOAPAction, Host, Cookie etc.), and the body (<?xml version….). This shows us exactly what the PowerShell client is doing to talk to the API.

Now we can connect to our local and remote SRM sites using the following command:

$srm = connect-srmserver -RemoteCredential (Get-Credential -Message 'Remote Site Credential') -Credential (Get-Credential -Message 'Local Site Credential')

If you examine the sessions in your Fiddler window now, you should see a session which looks like this:

[Screenshot 6]

This shows the URI as our SRM server, on HTTPS port 9086, with the suffix ‘/vcdr/extapi/sdk’. This is the URI we use for all the SRM SOAP calls. It also shows the body we send (which contains usernames and passwords for both sites), and the response, which has a ‘Set-Cookie’ header with a session ticket in it. This session ticket will be sent back in a ‘Cookie’ header on each of our following calls to the SOAP API.
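As a rough illustration of that last step, here is a minimal sketch in plain JavaScript (the language vRO actions are written in) of pulling just the session pair out of a Set-Cookie value, so that it can be replayed in a Cookie header. The function name and the sample header value are my own assumptions for illustration; only the ‘vmware_soap_session’ cookie name comes from the captures.

```javascript
// Hypothetical helper: keep only the "name=value" pair from a Set-Cookie
// header value, dropping attributes such as Path or HttpOnly.
function extractSessionCookie(setCookieValue) {
    if (typeof setCookieValue !== "string" || setCookieValue.length === 0) {
        throw new Error("No Set-Cookie value supplied");
    }
    // Attributes follow the first semicolon; the cookie pair comes before it.
    var pair = setCookieValue.split(";")[0].trim();
    if (pair.indexOf("=") === -1) {
        throw new Error("Unexpected Set-Cookie format: " + setCookieValue);
    }
    return pair;
}

// Example with a made-up header value in the shape Fiddler shows
var sample = 'vmware_soap_session="d8ba0e7d"; Path=/; HttpOnly';
console.log(extractSessionCookie(sample));
```

Whether SRM cares about the extra attributes being echoed back is something I have not tested, so treat the trimming as a tidy-up rather than a requirement.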

Let’s try to do something with the API through PowerShell now, and see what the response looks like. Run the following in your PowerShell window:

$srmApi = $srm.ExtensionData
$protectionGroups= $srmApi.Protection.ListProtectionGroups()

This session will show us the following:

[Screenshot 7]

Here we can see the URI is the same as earlier, and that there is a header with the name ‘Cookie’ and a value of ‘vmware_soap_session="d8ba0e7de00ae1831b253341685201b2f3b29a66"’, which ties in with the cookie returned by the last call. The response gives us some ManagedObjectReference (MoRef) names, ‘srm-vm-protection-group-1172’ and ‘srm-vm-protection-group-1823’, which represent our Protection Groups. This is great, but how do we tie these into the Protection Group names we set in SRM? Well, if we run the following commands in our PowerShell window, we can look at the output:

foreach ($pg in $protectionGroups) {
    Write-Output $($pg.MoRef.Value+" is equal to "+$pg.GetInfo().Name)
}

The responses in Fiddler look like this:

[Screenshot 8]

This shows us a query being sent, with the Protection Group MoRef, and the returned Protection Group name.

We can repeat this process for any of the methods available through the SRM API exposed in PowerCLI, and build up a list of the bodies we have for querying, and retrieving data, and use this to build up a library of actions. As an example we have the following methods already:

Query for Protection Groups:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><ListProtectionGroups xmlns="urn:srm0"><_this type="SrmProtection">SrmProtection</_this></ListProtectionGroups></soap:Body></soap:Envelope>

Get the name of a Protection Group from its MoRef:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><GetInfo xmlns="urn:srm0"><_this type="SrmProtectionGroup">MOREFNAME</_this></GetInfo></soap:Body></soap:Envelope>
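Since every captured body shares the same envelope and only the method, the ‘_this’ reference and any parameters change, the bodies can be generated rather than pasted. The following is a minimal sketch in plain JavaScript, assuming nothing beyond the envelope shape seen in the captures; the function names ‘xmlEscape’ and ‘buildSrmEnvelope’ are hypothetical, made up for this example.

```javascript
// Hypothetical helpers for building the SOAP bodies captured above.

// Escape characters that are special in XML, so that values such as
// passwords containing & or < do not break the envelope.
function xmlEscape(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
}

// Wrap a method call in the envelope seen in the Fiddler captures.
// method:   SOAP method name, e.g. "GetInfo"
// thisType: managed object type, e.g. "SrmProtectionGroup"
// thisRef:  managed object reference, e.g. "srm-vm-protection-group-1172"
// params:   optional array of { name, value } child elements
function buildSrmEnvelope(method, thisType, thisRef, params) {
    var inner = '<_this type="' + xmlEscape(thisType) + '">' + xmlEscape(thisRef) + '</_this>';
    (params || []).forEach(function (p) {
        inner += '<' + p.name + '>' + xmlEscape(p.value) + '</' + p.name + '>';
    });
    return '<?xml version="1.0" encoding="utf-8"?>' +
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" ' +
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ' +
        'xmlns:xsd="http://www.w3.org/2001/XMLSchema">' +
        '<soap:Body><' + method + ' xmlns="urn:srm0">' + inner +
        '</' + method + '></soap:Body></soap:Envelope>';
}

console.log(buildSrmEnvelope("GetInfo", "SrmProtectionGroup", "srm-vm-protection-group-1172"));
```

The escaping matters most for the login body, where a password containing an ampersand would otherwise produce invalid XML.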

So how do we take these, and turn them into actions in vRO? Well, we first need to add a REST host to vRO using the built-in ‘Add a REST host’ workflow, pointing to ‘https://<SRM_Server_IP>:9086’, and then write actions to make calls against this. There is more detail on doing this around on the web, and this site has a good example. For the authentication method we can do:

// let's set up our variables first, these could be pushed in through parameters on the action, which would make more sense, but keep it simple for now
var localUsername = "administrator@vsphere.local";
var localPassword = "VMware1!";
var remoteUsername = "administrator@vsphere.local";
var remotePassword = "VMware1!";

// We need our XML body to send
var content = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><SrmLoginSites xmlns="urn:srm0"><_this type="SrmServiceInstance">SrmServiceInstance</_this><username>'+localUsername+'</username><password>'+localPassword+'</password><remoteUsername>'+remoteUsername+'</remoteUsername><remotePassword>'+remotePassword+'</remotePassword></SrmLoginSites></soap:Body></soap:Envelope>';

// create the session request (RestHost is assumed to be the REST host added above, passed in as an input)
var SessionRequest = RestHost.createRequest("POST", "/vcdr/extapi/sdk", content);
// set the headers we saw on the request through Fiddler
SessionRequest.setHeader("SOAPAction","urn:srm0/4.0");
SessionRequest.setHeader("Content-Type","text/xml; charset=utf-8");
var SessionResponse = SessionRequest.execute();

// show the content
System.log("Session Response: " + SessionResponse.contentAsString);

// get the headers
var responseHeaders = SessionResponse.getAllHeaders();
// and just the one we want
var token = responseHeaders.get("Set-Cookie");

// log the token we got
System.log("Token: " + token);

// return our token
return token;

This will return us the token we can use for making calls against the API. So how do we use that to return a list of Protection Groups? Like this:

// token is assumed to be the value returned by the authentication action above,
// and RestHost the REST host added for the SRM server, both passed in as inputs

// We need our XML body to send, this just queries for the Protection Group MoRefs
var content = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><ListProtectionGroups xmlns="urn:srm0"><_this type="SrmProtection">SrmProtection</_this></ListProtectionGroups></soap:Body></soap:Envelope>';

// create the request
var SessionRequest = RestHost.createRequest("POST", "/vcdr/extapi/sdk", content);
// set the headers we saw on the request through Fiddler
SessionRequest.setHeader("SOAPAction","urn:srm0/4.0");
SessionRequest.setHeader("Content-Type","text/xml; charset=utf-8");
SessionRequest.setHeader("Cookie",token);
var SessionResponse = SessionRequest.execute();

// show the content
System.log("Session Response: " + SessionResponse.contentAsString);

// take the response and parse it into an XML document
var XmlContent = XMLManager.fromString(SessionResponse.contentAsString);

// let's get the Protection Group MoRefs from the response
var PGMoRefs = XmlContent.getElementsByTagName("returnval");

// declare an array of Protection Group names to return
var returnedPGs = [];

// iterate through each Protection Group MoRef
for (var index = 0; index < PGMoRefs.length; index++) {
    // extract the actual MoRef value
    var thisMoRef = PGMoRefs.item(index).textContent;
    // and insert it into the body of the new call
    content = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><GetInfo xmlns="urn:srm0"><_this type="SrmProtectionGroup">'+thisMoRef+'</_this></GetInfo></soap:Body></soap:Envelope>';
    // do another call to the API to get the Protection Group name
    SessionRequest = RestHost.createRequest("POST", "/vcdr/extapi/sdk", content);
    SessionRequest.setHeader("SOAPAction","urn:srm0/4.0");
    SessionRequest.setHeader("Content-Type","text/xml; charset=utf-8");
    SessionRequest.setHeader("Cookie",token);
    SessionResponse = SessionRequest.execute();
    // parse this response and add the Protection Group name to our array
    XmlContent = XMLManager.fromString(SessionResponse.contentAsString);
    returnedPGs.push(XmlContent.getElementsByTagName("name").item(0).textContent);
}

// return our Protection Group names
return returnedPGs;

Through building actions like this, we can build up a library to call the API directly. This should be a good starting point for building your own libraries for vRO to interact with SRM via the API, rather than the plugin. As stated earlier, using Fiddler, or something like it, you should be able to use this to capture anything being done through PowerShell, and I have even had some success with capturing browser clicks through this method, depending on how the web interface is configured. This method certainly made creating some integration with SRM through vRO less painful than trying to use the plugin.
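The request boilerplate (path, SOAPAction, Content-Type, Cookie) is identical in every action, so one way to start such a library is to wrap it once. This is only a sketch under my own assumptions: the names ‘callSrm’ and ‘makeFakeHost’ are hypothetical, treating any non-200 status as a failure is my guess rather than documented SRM behaviour, and the fake host exists purely so the wrapper can be exercised outside vRO. Inside vRO you would pass the real REST host instead.

```javascript
// Hypothetical wrapper around the request pattern used in the actions above.
// restHost is expected to offer createRequest(method, path, body), and the
// request setHeader(name, value) and execute(), as the vRO REST host does.
function callSrm(restHost, token, body) {
    var request = restHost.createRequest("POST", "/vcdr/extapi/sdk", body);
    request.setHeader("SOAPAction", "urn:srm0/4.0");
    request.setHeader("Content-Type", "text/xml; charset=utf-8");
    if (token) {
        // The login call has no token yet; every later call needs it
        request.setHeader("Cookie", token);
    }
    var response = request.execute();
    // Assumption: anything other than HTTP 200 is treated as a failed call
    if (response.statusCode !== 200) {
        throw new Error("SRM call failed with HTTP status " + response.statusCode);
    }
    return response.contentAsString;
}

// Stand-in for a real REST host, so the sketch can be run outside vRO
function makeFakeHost(cannedBody, statusCode) {
    var host = { lastHeaders: {}, lastPath: null };
    host.createRequest = function (method, path, body) {
        host.lastPath = path;
        host.lastHeaders = {};
        return {
            setHeader: function (name, value) { host.lastHeaders[name] = value; },
            execute: function () {
                return { statusCode: statusCode, contentAsString: cannedBody };
            }
        };
    };
    return host;
}

var fake = makeFakeHost("<returnval>srm-vm-protection-group-1172</returnval>", 200);
console.log(callSrm(fake, 'vmware_soap_session="abc"', "<body/>"));
```

With something like this in place, each new action shrinks to building a body, calling the wrapper, and parsing the XML that comes back.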


vSphere HTML5 Client Fling Deployment Script

So yesterday, VMware released the HTML5 vSphere Client as a fling, which is available for download here. I have put together a PowerShell script to deploy this to your vSphere environment.

It seems unusual for this to take the form of an OVA, but at least this means that it does not touch your existing vCenter, so should be deployable with less apprehension.

The client itself is issued an IP address from an IP Pool, and therefore has a different IP address to access from vCenter. Deployment of the OVA is pretty straight forward, and instructions for use and setup are in the link above.

There are already a tonne of posts around the features present, and not present, in the vSphere HTML5 Client. I am not going to go over that here; suffice to say, it is a fling for a reason.

This script (at first release) assumes that a valid, enabled IP Pool already exists in vCenter for the IP you allocate to the VM, I will add functionality in the next release to create an IP Pool if one is not already present.

Other than that, you should just need to replace the variables at the top of the script to use it for deployment. The script is available on my GitHub repository at this link.

 

vSphere PowerCLI 6.3 Release 1 – in the wild…

Yesterday VMware released PowerCLI 6.3 Release 1, alongside a fairly exciting set of new releases across the VMware portfolio, including:

  • vSphere 6.0 Update 2
  • vRealize Automation 7.0.1
  • vCloud Director 8.0.1

While I am not rushing to update to the new version of vCenter or ESXi in production environments, upgrading to this latest version of PowerCLI is far less risky, so I immediately upgraded and checked out the new features.

The latest release adds the following new support:

  • Support for Windows 10 and PowerShell 5.0 – as a Windows 10 user (for my personal laptop and home PC at least), this is a welcome addition. Windows Server 2016 is just around the corner as well, so this should ensure that PowerCLI 6.3R1 works here too. Not seen any problems with running the previous version of PowerCLI on my Windows 10 machines, but at least this is officially tested and supported now anyway
  • Support for vCloud Director 8.0 – VMware are driving vCD forward again, so if you are using the latest versions, and use PowerCLI to help make your life easier (and if you’re not, then why not?), this will be a welcome addition
  • Support for vRealize Operations Manager 6.2 – there are still only 12 cmdlets available in the VMware.VimAutomation.vROps module, but this bumps up support for the latest version anyway

And adds the following new features:

  • Added Content Library support – I haven’t really got into the whole Content Library thing just yet, but this feature was introduced in vSphere 6.0, and was previously only automatable through the new vSphere REST API. This release of PowerCLI includes cmdlets to let you work with the Content Library; I will probably do a follow-up post on configuring the Content Library at a later date
  • Get-EsxCli functionality updated – for those that don’t know, Get-EsxCli lets you run esxcli commands via PowerShell on a target host. This is useful for certain things which are not really possible through the standard PowerCLI host management cmdlets. This release brings in advanced functionality in this area
  • Get-VM command – this command has been streamlined to more quickly return results, which should help in larger environments

So all in all, some minor improvements, some new features, and some updates to support for newer VMware products. A solid release which will keep PowerCLI relevant as a tool in a vSphere admin’s arsenal. If you’re not already using PowerCLI, then get on the bandwagon, there are some great books and videos out there, and a fantastic community to help you along.

PowerCLI – where to start

I began using PowerShell around 18 months ago while working for a small UK based Managed Service provider. Prior to this, my coding/scripting experience consisted of an A-Level in Computing, which introduced me to Visual Basic 6.0 and databases, a void of around 7 years, and then some sysadmin VBScript and batch file type goodness for a few years.


Until I started at said company, I had only been exposed to systems running Windows Server 2003, and with a look to security über alles, no access to PowerShell, or any other exciting languages was available, so VBScript became our automation tool of choice.

I have posted before about good resources to use to learn PowerShell, this is more a rundown of how I learned, and the joy and knowledge it gave me to do this.

My first taste of PowerShell was working with Exchange 2010 servers, doing stuff like this to report on mailbox items over a certain age.

Get-Mailbox "username" | New-MailboxSearch -Name search123 -SearchQuery "Received:<01/01/2014" -estimateonly

Were it not for the necessity to use PowerShell to do anything remotely useful in Exchange 2010, I would have been happy to continue to use batch files and VBScript to automate some of the things, I was confident in using these tools, and could achieve time savings, albeit fairly slowly. But PowerShell I must, so PowerShell I did.

Around this time, I became more keen on working with infrastructure, than applications, and got transferred to a role solely looking after our fairly sizeable Cisco UCS and VMware estate. I had plenty of years of experience of VMware, and none of Cisco UCS, but was excited by the new challenge.

I was quickly steered by the senior engineers, towards Cisco PowerTool, and VMware’s PowerCLI, to help to automate some of the administrative, and reporting type tasks I would soon be inundated with, so I picked them up and learned as I went.

I started small, and Google was my friend. Scripting small tasks to save incrementally larger amounts of time. Stuff like this:


$podcsv = Import-Csv .\UCS_Pods.csv
$credcsv = Import-Csv .\UCS_Credentials.csv
$ucsuser = $credcsv.username
$ucspasswd = $credcsv.password
$secpasswd = ConvertTo-SecureString $ucspasswd -AsPlainText -Force
$ucscreds = New-Object System.Management.Automation.PSCredential ($ucsuser,$secpasswd)
$datetime = Get-Date -UFormat "%C%y%m%d-%H%M"
foreach ($pod in $podcsv)
{
    $podname = $pod.name
    $podip = $pod.ip
    # Connect to this pod, export its faults to CSV, then disconnect
    Connect-Ucs -Credential $ucscreds $podip
    Get-UcsFault | Select-Object ucs,id,lasttransition,descr,ack,severity | Export-Csv -Path .\$datetime-$podname-errors.csv
    Disconnect-Ucs
}

This dumps out the alerts we had in multiple UCS systems to CSV files. It would save 20-30 minutes a day, which is nothing major, but clicking buttons is boring, and I can always find better things to do with my time.

On the VMware side of things, I started really small, with stuff like this which would tell you the version of VMTools on all of your virtual machines:


# Ask for connection details, then connect using these
$vcenter = Read-Host "Enter vCenter Name or IP"
# Set up our constants for logging
$datetime = Get-Date -UFormat "%C%y%m%d-%H%M"
$OutputFile = ".\" + $datetime + "_" + $vcenter + "_VMTools_Report.txt"
# Connect to vCenter (you will be prompted for credentials if they are needed)
$Connection = Connect-VIServer $vcenter
foreach ($Cluster in Get-Cluster) {
    foreach ($esxhost in ($Cluster | Get-VMHost | Where-Object { ($_.ConnectionState -eq "Connected") -or ($_.ConnectionState -eq "Maintenance")} | Sort-Object Name)) {
        # Report the VMware Tools version and status for every VM on this host
        Get-VMHost $esxhost | Get-VM | ForEach-Object { Get-View -Id $_.Id } | Select-Object Name, @{ Name="ToolsVersion"; Expression={$_.config.tools.toolsVersion}}, @{ Name="ToolStatus"; Expression={$_.Guest.ToolsVersionStatus}}, @{Name="Host";Expression={$esxhost}}, @{Name="Cluster";Expression={$cluster.name}} | Format-Table | Out-File -FilePath $OutputFile -Append
    }
}
Disconnect-VIServer * -Confirm:$false

This is a real time saver, and great for getting quick figures out of your environment. As I wrote these scripts, I learned more and more what I could do, picking up ways of doing different things here and there: for/next loops, do/while loops, arrays. As I picked up these concepts again, concepts I had learned years earlier and not used to great effect, my scripts became more complex, and delivered more value in the output they gave, and the time saved. Scripts like this one, which reports on any datastores over 90% utilisation, soon became a part of our daily reporting regime:


$datetime = Get-Date -UFormat "%C%y%m%d-%H%M"
$vcentercsv = Import-Csv .\VCenter_Servers.csv
# Configure connection settings using Read Only account
$credcsv = Import-Csv .\VMware_Credentials.csv
$vmuser = $credcsv.username
$vmpasswd = $credcsv.password
$secpasswd = ConvertTo-SecureString $vmpasswd -AsPlainText -Force
$vmcreds = New-Object System.Management.Automation.PSCredential ($vmuser,$secpasswd)
$report = @()
foreach ($vcenter in $vcentercsv)
{
    $vcentername = $vcenter.name
    Connect-VIServer $vcenter.ip -Credential $vmcreds
    # Pick out non-local datastores which are more than 90% used
    foreach ($datastore in (Get-Datastore | Where-Object {$_.name -notlike "*local*" -and [math]::Round(100-($_.freespacegb/$_.capacitygb)*100) -gt 90}))
    {
        $row = '' | Select-Object Name,FreeSpaceGB,CapacityGB,vCenter,PercentUsed
        $row.Name = $datastore.name
        $row.FreeSpaceGB = $datastore.freespacegb
        $row.CapacityGB = $datastore.capacitygb
        $row.vCenter = $vcentername
        $row.PercentUsed = [math]::Round(100-($datastore.freespacegb/$datastore.capacitygb)*100)
        $report += $row
    }
    Disconnect-VIServer * -Confirm:$false
}
$report | Sort-Object PercentUsed | Export-Csv -Path .\$datetime-datastore-overuse.csv

My knowledge of how to do things, and confidence in what I was doing grew rapidly, and the old thing of ‘the more I know, the more I realise I don’t know’ came to pass. I am still learning at a rapid rate how better to put these things together, and new cmdlets, new modules, new ways to do things. It’s a fun journey though, one which leaves you with extremely useful and admired skills, and one which will continue to develop you as an IT technician throughout your career.

I am now doing the biggest PowerShell datacenter automation project I have ever done, it is around 5000 lines now, and growing every day. I feel like anything can be achieved with PowerShell, and the various modules released by vendors, and finding ways of solving the constant puzzles which hit me in the face is exciting and rewarding in equal measure.

Everywhere you look in IT now, it is automation and DevOps. It has been said many times that IT engineers who do not learn some form of automation are going to be automated out of a job, and to some extent I agree with this. The advent of software defined storage, networking, everything, shows that automation, and policy driven configuration, is really changing the world of IT infrastructure. If you’re in IT then you probably got in because you love technology, well get out there and learn new skills, whatever those may be, you will enjoy it more than you think.

Transparent Page Sharing – for better or worse

Transparent Page Sharing (TPS) is one of the cornerstones of memory management in the ESXi hypervisor. It is one of the many technologies VMware have developed to allow higher VM consolidation ratios on hosts, through intelligent analysis of VM memory utilisation, and deduplication based on this analysis.

There are two types of TPS, intra-VM and inter-VM. The first scans the memory in use by a single VM and deduplicates common block patterns, which lowers the total host memory consumption of a VM without significantly impacting VM performance. The second type, inter-VM TPS, does the same process, looking for common blocks across the memory used by all VMs on a given host.

Historically this has been a successful reclamation technique, and has led to decent savings in host memory consumption. However, most modern Operating Systems seen in virtualised environments (most Linux distributions, Windows Server 2008 onwards) now use address space layout randomisation (ASLR) by default, so the chance of the TPS daemon finding common blocks becomes less and less likely.

If CPU integrated hardware-assisted memory virtualisation features (AMD Rapid Virtualisation Indexing (RVI) or Intel Extended Page Tables (EPT)) are utilised for an ESXi host, then the hypervisor will use 2MB block size for its TPS calculations, rather than the normal 4KB block size. Attempting to deduplicate in 2MB chunks is far more resource intensive, and far less successful, than running the same process in 4KB chunks, thus ESXi will not attempt to deduplicate and share large memory pages by default.

The upshot of this is that 2MB pages are scanned, and the 4KB blocks within these large pages are hashed in preparation for memory sharing should the host come under memory contention, in an effort to prevent swapping. Pre-hashing these 4KB chunks means that TPS is able to react quickly, deduplicating and sharing the pages should the need arise.

All good so far, this technique should help us to save a bit of memory, although since memory virtualisation features in modern CPUs are widespread, and larger amounts of host memory more common, the potential TPS savings should hopefully never be needed or seen.

At the back end of last year, VMware announced that they would be disabling TPS by default in ESXi, following an academic paper which showed a potential security vulnerability in TPS which, if exploited, could result in sensitive data being made available from one VM to another utilising memory sharing. It should be noted that the technique used to exploit this was demonstrated under highly controlled, laboratory-style conditions and requires physical access to the host in question, and that the researcher highlighting it never actually managed to glean any data by this method.

Despite the theoretical nature of the published vulnerability, VMware took the precautionary approach, and so in ESXi 5.0, 5.1, 5.5 and now 6.0, with the latest updates, TPS is now disabled by default. What does this mean for the enterprise though, and what choices do we have?

1. Turn TPS on regardless – if you have a dedicated internal-only infrastructure, then it may be that you do not care about the risks exposed by the research. If you and your team of system administrators are the only ones with access to your ESXi servers, and the VMs within, as is common in many internal IT departments, then there are likely far easier ways to get access to sensitive data than by utilising this theoretical technique anyway

2. Turn off TPS – if you are in a shared Service Provider style infrastructure, or in an environment requiring the highest security, then this should be a no-brainer. The integrity and security of the client data you have on your systems should be foremost in your organisation’s mind, and in the interests of precaution, and good practice, you should disable TPS on existing systems and leave it off

These were the options presented by VMware until about a week ago, when an article was published which described a third option being introduced in the next ESXi updates:

3. Use TPS with guest VM salting – this allows selective inter-VM TPS enablement among sets of selected VMs, allowing you to limit the potential areas of vulnerability, using a combination of edits to .vmx files, and advanced host settings. This may be a good middle ground if you are reliant on the benefits provided by TPS, but security policy demands that it not be in the previous default mode

So these are our options, and regardless of which one you choose, you need to know what the difference in your environment will be if you turn off TPS, this is going to be different for everyone. Your current savings being delivered by TPS can be calculated, and this should give you some idea of what the change in host memory utilisation will be following the disabling of TPS.

The quickest way to see this for an individual host is via the Performance tab in vSphere Client, if you look at real time memory usage, and select the ‘Active’, ‘Shared’ and ‘Shared Common’ counters then you will be able to see how much memory is consumed in total by your host, and how much of this is being saved through TPS:

[Screenshot: TPS_vSphere_Client]

Here we can see:

TPS %age saving = (Shared – Shared common) / (Consumed – Used by VMkernel) * 100%
= (5743984 – 112448) / (160286356 – 3111764) * 100%
= 5631536 / 157174592 * 100%
= 3.58%

So TPS is saving around 5.6GB or 3.6% of total memory on the host being consumed by VMs. This is a marker of the efficiency of TPS.

The same figures can be taken from esxtop if you SSH to an ESXi host, run esxtop, and press ‘m’ to get to memory view.

[Screenshot: TPS_esxtop]

Here we are looking at the PSHARE value. We can see the saving is 5607MB (which ties up with the figure above from the vSphere Client), and the memory consumed by VMs can be seen under PMEM/other, in this case 153104MB. Again we can calculate the percentage saving TPS is giving us by dividing the saving by the memory consumed by VMs and multiplying by 100%:

TPS %age saving = PSHARE saving / PMEM other * 100% = 5607 / 153104 * 100% = 3.66%
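To double check the arithmetic in both calculations, here is a small sketch in plain JavaScript. It is only a calculator for the two formulas, using the figures quoted in this post; the function name ‘tpsSavingPercent’ is made up for the example, and the inputs just need to be in the same unit as each other.

```javascript
// Percentage of VM-consumed memory saved by TPS, rounded to 2 decimal places.
// saved and consumedByVms must be in the same unit (both KB, or both MB).
function tpsSavingPercent(saved, consumedByVms) {
    if (consumedByVms <= 0) {
        throw new Error("consumedByVms must be greater than zero");
    }
    return Math.round((saved / consumedByVms) * 10000) / 100;
}

// vSphere Client counters (KB): saving = Shared - Shared common,
// VM consumption = Consumed - Used by VMkernel
var fromClient = tpsSavingPercent(5743984 - 112448, 160286356 - 3111764);

// esxtop values (MB): PSHARE saving and PMEM other
var fromEsxtop = tpsSavingPercent(5607, 153104);

console.log(fromClient + "% from the vSphere Client counters");
console.log(fromEsxtop + "% from esxtop");
```

The two results agree with the 3.58% and 3.66% worked out by hand, and the small gap between them is just the two tools sampling slightly different figures.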

So this is how we can calculate the saving for each host, but what if you have dozens, or hundreds of hosts in your environment, wouldn’t it be great to get these stats for all your hosts? Well, the easiest way to get this kind of information is usually through PowerCLI so I put the following script together:


# Ask for connection details, then connect using these
$vcenter = Read-Host "Enter vCenter Name or IP"

# Set up our constants for logging
$datetime = Get-Date -UFormat "%C%y%m%d-%H%M"
$OutputFile = ".\" + $datetime + "_" + $vcenter + "_TPS_Report.csv"

# Connect to vCenter
$Connection = Connect-VIServer $vcenter

$myArray = @()

foreach ($Cluster in Get-Cluster) {
    foreach ($esxhost in ($Cluster | Get-VMHost | Where-Object { ($_.ConnectionState -eq "Connected") -or ($_.ConnectionState -eq "Maintenance")} | Sort-Object Name)) {
        $vmdetails = "" | Select-Object hostname,clustername,memsizegb,memshavg,memshcom,tpssaving,percenttotalmemsaved,tpsefficiencypercent
        $vmdetails.hostname = $esxhost.name
        $vmdetails.clustername = $cluster.name
        $hostmem = Get-VMHost $esxhost | Select-Object -ExpandProperty memorytotalgb
        $vmdetails.memsizegb = "{0:N0}" -f $hostmem
        # Shared and shared common memory for the host, both reported in KB
        $vmdetails.memshavg = [math]::Round((Get-VMHost $esxhost | Get-Stat -Stat mem.shared.average -MaxSamples 1 -Realtime | Select-Object -ExpandProperty value),2)
        $vmdetails.memshcom = [math]::Round((Get-VMHost $esxhost | Get-Stat -Stat mem.sharedcommon.average -MaxSamples 1 -Realtime | Select-Object -ExpandProperty value),2)
        # TPS saving in KB = shared minus shared common
        $vmdetails.tpssaving = $vmdetails.memshavg-$vmdetails.memshcom
        $vmdetails.percenttotalmemsaved = [math]::Round(([int]$vmdetails.tpssaving/([int]$vmdetails.memsizegb*1024*1024))*100,2)
        # Memory consumed by VMs = consumed minus used by VMkernel
        $consumedmemvm = [math]::Round(((Get-VMHost $esxhost | Get-Stat -Stat mem.consumed.average -MaxSamples 1 -Realtime | Select-Object -ExpandProperty value)-(Get-VMHost $esxhost | Get-Stat -Stat mem.sysUsage.average -MaxSamples 1 -Realtime | Select-Object -ExpandProperty value)),2)
        $vmdetails.tpsefficiencypercent = [math]::Round(([int]$vmdetails.tpssaving/$consumedmemvm)*100,2)
        $myArray += $vmdetails
    }
}
Disconnect-VIServer * -Confirm:$false

# Sort on hostname (the report objects have no Name property) and export
$myArray | Sort-Object hostname | Export-Csv -Path $OutputFile

This script will dump out a CSV with every host in your vCenter, and tell you the percentage of total host memory saved by TPS, and the efficiency of TPS in your environment. This should help to provide some idea of what the impacts of TPS being turned off will be.

Ultimately, your organisation’s security policies should define what to do after the next ESXi updates, and how you should act in the meantime, TPS is definitely a useful feature, and does allow for higher consolidation ratios, but security vulnerabilities should not be ignored. Hopefully this post will give you an idea of how TPS is currently impacting your infrastructure.