rbvami – Managing the vCSA VAMI using Ruby

I have been putting together a module for managing vCSA’s VAMI using Ruby. This uses the vCenter 6.5 REST API, and the long term plan is to build it out to cover the entire REST API.

My intention is to use this module as the basis for a Chef cookbook for managing VAMI configuration; this is mainly a learning exercise rather than a practical one.

The project is on my GitHub site; feel free to contact me if there is functionality you would like to see added.

Deploying NX-OSv 9000 on vSphere

Cisco have recently released (1st March 2017) an updated virtual version of their Nexus 9K switch, and the good news is that this is now available as an OVA for deployment onto ESXi. We used to use VIRL in a lab, which was fine until a buggy earlier version of the virtual 9K was introduced which prevented core functionality like port channels. This new release doesn’t require the complex environment that VIRL brings, and lets you deploy a quick pair of appliances in vPC to test code against.

The download is available here, and while there are some instructions available, I did not find them particularly useful in deploying the switch to my ESXi environment. As a result, I decided to write up how I did this to hopefully save people spending time smashing their face off it.

Getting the OVA

NOTE: you will need a Cisco login to download the OVA file. My login has access to a bunch of bits so not sure exactly what the requirements are around this.

There are a few versions available from the above link, including a qcow2 (KVM) image, a .vmdk file (for rolling your own VM), a VirtualBox image (for use with VirtualBox and/or Vagrant), and an OVA (for use with Fusion, Workstation, ESXi).

Once downloaded we are ready to deploy the appliance. There are a few things to bear in mind here:

  1. This can be used to pass VM traffic between virtual machines: there are 6 connected vNICs on deployment, 1 of these simulates the mgmt0 port on the 9K, and the other 5 are able to pass VM traffic.
  2. vNICs 2-6 should not be attached to the management network (best practice)
  3. We will need to initially connect over a virtual serial port through the host, this will require opening up the ESXi host firewall temporarily

Deploying the OVA

You can deploy the OVA through the vSphere Web Client or the new vSphere HTML5 Web Client. I’ve detailed how to do this via PowerShell here, because who’s got time for clicking buttons?


# Simulator is available at:
# https://software.cisco.com/download/release.html?mdfid=286312239&softwareid=282088129&release=7.0(3)I5(1)&relind=AVAILABLE&rellifecycle=&reltype=latest
# Filename: nxosv-final.7.0.3.I5.2.ova
# Documentation: http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/nx-osv/configuration/guide/b_NX-OSv_9000/b_NX-OSv_chapter_01.html

Function New-SerialPort {
  # stolen from http://access-console-port-virtual-machine.blogspot.co.uk/2013/07/add-serial-port-to-vm-through-gui-or.html
  Param(
    [string]$vmName,  # name of the VM to add the serial port to
    [string]$hostIP,  # IP of the ESXi host that will listen for telnet connections
    [string]$prt      # TCP port on the host to listen on
  ) #end
  $dev = New-Object VMware.Vim.VirtualDeviceConfigSpec
  $dev.operation = "add"
  $dev.device = New-Object VMware.Vim.VirtualSerialPort
  $dev.device.key = -1
  $dev.device.backing = New-Object VMware.Vim.VirtualSerialPortURIBackingInfo
  $dev.device.backing.direction = "server"
  $dev.device.backing.serviceURI = "telnet://"+$hostIP+":"+$prt
  $dev.device.connectable = New-Object VMware.Vim.VirtualDeviceConnectInfo
  $dev.device.connectable.connected = $true
  $dev.device.connectable.StartConnected = $true
  $dev.device.yieldOnPoll = $true

  $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
  $spec.DeviceChange += $dev

  $vm = Get-VM -Name $vmName
  # apply the device change to the VM
  $vm.ExtensionData.ReconfigVM($spec)
}

# Variables - edit these...
$ovf_location = '.\nxosv-final.7.0.3.I5.1.ova'
$n9k_name = 'NXOSV-N9K-001'
$target_datastore = 'VBR_MGTESX01_Local_SSD_01'
$target_portgroup = 'vSS_Mgmt_Network'
$target_cluster = 'VBR_Mgmt_Cluster'

$vi_server = ''
$vi_user = 'administrator@vsphere.local'
$vi_pass = 'VMware1!'

# set this to $true to remove non-management network interfaces, $false to leave them where they are
$remove_additional_interfaces = $true

# Don't edit below here
Import-Module VMware.PowerCLI

Connect-VIServer $vi_server -user $vi_user -pass $vi_pass

$vmhost = $((Get-Cluster $target_cluster | Get-VMHost)[0])

$ovfconfig = Get-OvfConfiguration $ovf_location

$ovfconfig.NetworkMapping.mgmt0.Value = $target_portgroup
$ovfconfig.NetworkMapping.Ethernet1_1.Value = $target_portgroup
$ovfconfig.NetworkMapping.Ethernet1_2.Value = $target_portgroup
$ovfconfig.NetworkMapping.Ethernet1_3.Value = $target_portgroup
$ovfconfig.NetworkMapping.Ethernet1_4.Value = $target_portgroup
$ovfconfig.NetworkMapping.Ethernet1_5.Value = $target_portgroup
$ovfconfig.DeploymentOption.Value = 'default'

Import-VApp $ovf_location -OvfConfiguration $ovfconfig -VMHost $vmhost -Datastore $target_datastore -DiskStorageFormat Thin -Name $n9k_name

if ($remove_additional_interfaces) {
  Get-VM $n9k_name | Get-NetworkAdapter | ?{$_.Name -ne 'Network adapter 1'} | Remove-NetworkAdapter -Confirm:$false
}

New-SerialPort -vmName $n9k_name -hostIP $($vmhost | Get-VMHostNetworkAdapter -Name vmk0 | Select -ExpandProperty IP) -prt 2000

$vmhost | Get-VMHostFirewallException -Name 'VM serial port connected over network' | Set-VMHostFirewallException -Enabled $true

Get-VM $n9k_name | Start-VM

This should start the VM. We will be able to telnet into the host on port 2000 to reach the VM console, but it will not be ready for us to do that until this screen is reached:


Now when we connect we should see:


At this point we can enter ‘n’ and go through the normal Nexus 9K setup wizard. Once the management IP and SSH are configured you should be able to connect via SSH. The virtual serial port can then be removed via the vSphere Client, and the ‘VM serial port connected over network’ rule should be disabled on the host firewall.
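
Before trying the telnet connection described above, it can be handy to check whether the host is accepting connections on the serial port at all. The following is a minimal Python sketch of that check; the function name, timeout values and the example address are my own assumptions, and it only confirms that the TCP port is open, not that the switch has finished booting.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("192.0.2.10", 2000)` would poll a (hypothetical) ESXi host address on the port used for the serial port above.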

Pimping things up

Add more NICs

Obviously here we have removed the additional NICs from the VM, which makes it only talk over the single management port. We can add a bunch more NICs and the virtual switch will let us use them to talk on. This could be an interesting use case to pass actual VM traffic through the 9K.

Set up vPC

The switch is fully vPC (Virtual Port Channel) capable, so we can spin up another virtual N9K and put the pair in vPC mode, which is useful for experimenting with that feature.

Bring the API!

The switch is NXAPI capable, which was the main reason for me wanting to deploy it, so that I could test REST calls against it. Enable NXAPI by entering the ‘feature nxapi’ command.
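
Once NXAPI is enabled, a first test call could look something like the sketch below. It assumes the JSON-RPC flavour of NX-API posted to the switch's /ins endpoint with basic authentication; the function names and the idea of skipping certificate checks (lab only) are my own choices rather than anything from the switch setup above.

```python
import base64
import json
import ssl
import urllib.request


def build_nxapi_payload(commands):
    """Build a JSON-RPC body with one 'cli' call per command."""
    return [
        {"jsonrpc": "2.0", "method": "cli",
         "params": {"cmd": cmd, "version": 1}, "id": index}
        for index, cmd in enumerate(commands, start=1)
    ]


def run_commands(switch, username, password, commands):
    """POST the commands to https://<switch>/ins and return the decoded JSON."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{switch}/ins",
        data=json.dumps(build_nxapi_payload(commands)).encode(),
        headers={"Content-Type": "application/json-rpc",
                 "Authorization": f"Basic {credentials}"},
    )
    # Lab only: accept the switch's self-signed certificate
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Calling `run_commands(switch_ip, "admin", password, ["show version"])` against the management IP configured in the setup wizard should then return the command output as JSON.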


Hopefully this post will help people struggling to deploy this OVA, or wanting to test out NXOS in a lab environment. I found the Cisco documentation a little confusing so thought I would share my experiences.

Replacing the ‘All Services’ Icon in vRealize Automation

I had a conversation with Ricky El-Qasem (@rickyelqasem) on Twitter this week about the ‘All Services’ logo in vRealize Automation, and whether this could be replaced programmatically.

For those who don’t know the pain of this particular element of vRA: when browsing the service catalog, groups of services are listed down the left hand side of the page with icons next to them:

Screen Shot 2017-03-17 at 17.52.42.png

These can all be changed, but until recently the top icon would remain as a blue lego brick, which can make the otherwise slick portal look unsightly. This is shown on the image below:

Screen Shot 2017-03-17 at 17.56.25.png

Now luckily, from vRA 7.1, this has been replaceable through the API, and the steps have been documented in the accompanying guide here. This uses the REST API, and means you need to convert the PNG image into Base-64 encoding in order to push it to the API, which is a little too manual for me!

So I quickly threw vRA 7.2 up in my home lab and got to work. I chose to script it using Python because I found that I could easily convert the image to Base-64, and I knew I could do the REST calls using the excellent ‘requests’ Python package (info available here). The code I used is available on my GitHub, and is shown below. I also created a script to delete the custom icon, and return things to vanilla state, you know, just in case 😉

Anyway, I hope this is useful for people who want to quickly and easily replace the icon.

#!/usr/bin/env python
# required packages, install with pip if not present
import base64
import json

import requests
import urllib3

# disable self-signed cert warnings
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# replace these variables
filename = 'service.png'
vra_ip = ''
vra_user = 'administrator@vsphere.local'
vra_pass = 'VMware1!'
vra_tenant = 'vsphere.local'

# don't replace anything from here
# open file and encode it in b64 (b64encode does not add line breaks)
with open("./" + filename, "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# get our authorization token
uri = 'https://' + vra_ip + '/identity/api/tokens'
headers = {'Accept': 'application/json', 'Content-Type': 'application/json'}
payload = json.dumps({"username": vra_user, "password": vra_pass, "tenant": vra_tenant})
r = requests.post(uri, headers=headers, verify=False, data=payload)
token = 'Bearer ' + str(r.json()["id"])

# send the new icon to the API
uri = 'https://' + vra_ip + '/catalog-service/api/icons'
headers = {'Accept': 'application/json', 'Content-Type': 'application/json', 'Authorization': token}
payload = json.dumps({"id": "cafe_default_icon_genericAllServices", "fileName": filename,
                      "contentType": "image/png", "image": encoded})
r = requests.post(uri, headers=headers, verify=False, data=payload)
if r.status_code == 201:
    print("Replacement successful")
else:
    print("Expected return code 201, got " + str(r.status_code) + ", something went wrong")
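
The delete script is not reproduced in this post, so the following is only a sketch of what resetting the icon could look like. It assumes the same icons endpoint accepts a DELETE on the icon id used above, and that `token` is the ‘Bearer …’ string obtained in the same way as in the replacement script; the function names are hypothetical.

```python
ICON_ID = "cafe_default_icon_genericAllServices"


def icon_url(vra_host, icon_id=ICON_ID):
    """Build the URL of a single icon in the catalog service."""
    return "https://{}/catalog-service/api/icons/{}".format(vra_host, icon_id)


def delete_icon(vra_host, token, icon_id=ICON_ID):
    """Send the DELETE and return the HTTP status code (assumed endpoint)."""
    import requests  # imported here so icon_url works without the package

    headers = {"Accept": "application/json", "Authorization": token}
    # verify=False matches the lab setup above (self-signed certificate)
    response = requests.delete(icon_url(vra_host, icon_id), headers=headers, verify=False)
    return response.status_code
```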

vSphere Automation SDKs

This week VMware open sourced their SDKs for vSphere using REST APIs, and Python. The REST API was released with vSphere 6.0, while the Python SDK has been around for nearly four years now. I’m going to summarise the contents of this release below, and where these can help us make more of our vSphere environments.


The vSphere REST API has been growing since the release of vSphere 6 nearly two years ago, and brings access to the following areas of vSphere with its current release:

  • Session management
  • Tagging
  • Content Library
  • Virtual Machines
  • vCenter Server Appliance management

These mainly cover the new features from vSphere 6.0 (formerly known as the vCloud Suite SDK), plus some of the new pieces put together to modernise API access in vSphere 6.5. The Virtual Machine management is particularly useful, as it lets you start using REST-based methods to carry out operations and report on VMs in your environment. This is very handy for people looking to write quick integrations with things like vRealize Orchestrator, where the built-in plugins do not do what you want.
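
To give a feel for the session management and Virtual Machine areas listed above, here is a minimal Python sketch. It assumes the vSphere 6.5 style endpoints (/rest/com/vmware/cis/session to log in, /rest/vcenter/vm to list VMs) and a lab vCenter with a self-signed certificate; the helper names are my own.

```python
import base64
import json
import ssl
import urllib.request


def lab_context():
    """SSL context that skips certificate checks (lab use only)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def basic_auth_header(username, password):
    """Encode credentials for the initial session request."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"


def create_session(vcenter, username, password):
    """Log in and return the session id used on subsequent calls."""
    request = urllib.request.Request(
        f"https://{vcenter}/rest/com/vmware/cis/session",
        method="POST",
        headers={"Authorization": basic_auth_header(username, password)},
    )
    with urllib.request.urlopen(request, context=lab_context()) as response:
        return json.load(response)["value"]


def list_vms(vcenter, session_id):
    """Return the list of VM summaries visible to the session."""
    request = urllib.request.Request(
        f"https://{vcenter}/rest/vcenter/vm",
        headers={"vmware-api-session-id": session_id},
    )
    with urllib.request.urlopen(request, context=lab_context()) as response:
        return json.load(response)["value"]
```

The session id returned by `create_session` is passed in the `vmware-api-session-id` header on every later request, which is the same pattern the Postman collection walks through.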

The new material, available on GitHub, contains two main components:

Postman Collection

Screen Shot 2017-03-12 at 10.28.54.png

Postman is a REST client used to explore APIs, providing a nice graphical display of the request-response type methods used for REST. This is a great way to get your head round what is happening with requests, and helps to build up an idea of what is going on with the API.

Pre-built packs of requests can be gathered together in Postman ‘Collections’; these can then be distributed (in JSON format) and loaded into another instance of Postman. This can be crucially important in documenting the functionality of APIs, especially when the documentation is lacking.

There are some instructions on how to set this up here; if you are new to REST APIs, or just want a quick way to have a play with the new vSphere REST APIs, you could do far worse than starting here.

Node.js Sample Pack

Node.js has taken over the world of server side web programming, and thanks to the simple syntax of JavaScript, is easy to pick up and get started with. This pack (available here) has some samples of Node.js code to interact with the REST API. This is a good place to start with seeing how web requests and responses are dealt with in Node, and how we can programmatically carry out administrative tasks.

These could be integrated into a web based portal to do the requests directly, or I can see these being used in the future as part of a serverless administration platform, using something like AWS Lambda along with a monitoring platform to automate the administration of a vSphere environment.

Python SDK

Python has been an incredibly popular language for automation for a number of years. Its very low barrier to getting started makes it ideal to pick up and learn, with a wealth of possibilities for building on solid simple foundations to make highly complex software solutions. VMware released their ‘pyvmomi’ Python SDK back in 2013, and it has received consistent updates since then. While not as popular, or as promoted as their PowerCLI PowerShell module, it has nevertheless had strong usage and support from the community.

The release on offer as part of the vSphere Automation SDKs consists of scripts to spin up a demo environment for developing with the Python SDK, as well as a number of sample scripts demonstrating the functionality of the new APIs released in vSphere 6.0 and 6.5.

The continued growth in popularity of Python, along with leading automation toolsets like Ansible using a Python base, means that it is a great platform to push this kind of development and publicity in. As with Node.js, serverless platforms widely support Python, so this could be integrated with Lambda, Fission, or other FaaS platforms in the future.


It’s great to see VMware really getting behind developing and pushing their automation toolkits in the open. They are, to my mind, a leader in the industry in terms of making their products programmable, and I hope they continue at this pace and in this vein. The work shown in this release will help make it easier for people new to automation to get involved and start reaping the benefits that it can bring, and the possibilities for combining these vSphere SDKs with serverless administration will be an interesting area to watch.

Installing vSphere Integrated Containers

This document details installing and testing vSphere Integrated Containers, which went v1.0 recently. This has been tested against vSphere 6.5 only.
Download VIC from my.vmware.com.
Release notes available here.
From Linux terminal:
root@LOBSANG:~# tar -xvf vic_0.8.0-7315-c8ac999.tar.gz
root@LOBSANG:~# cd vic
root@LOBSANG:~/vic# tree .
├── appliance.iso
├── bootstrap.iso
├── ui
│   ├── vCenterForWindows
│   │   ├── configs
│   │   ├── install.bat
│   │   ├── uninstall.bat
│   │   ├── upgrade.bat
│   │   └── utils
│   │       └── xml.exe
│   ├── VCSA
│   │   ├── configs
│   │   ├── install.sh
│   │   ├── uninstall.sh
│   │   └── upgrade.sh
│   └── vsphere-client-serenity
│       ├── com.vmware.vicui.Vicui-0.8.0
│       │   ├── plugin-package.xml
│       │   ├── plugins
│       │   │   ├── vic-ui-service.jar
│       │   │   ├── vic-ui-war.war
│       │   │   └── vim25.jar
│       │   └── vc_extension_flags
│       └── com.vmware.vicui.Vicui-0.8.0.zip
├── vic-machine-darwin
├── vic-machine-linux
├── vic-machine-windows.exe
├── vic-ui-darwin
├── vic-ui-linux
└── vic-ui-windows.exe

7 directories, 25 files
Now we have the files ready to go we can run the install command as detailed in the GitHub repository for VIC (here). We are going to use Linux here:
root@VBRPHOTON01 [ ~/vic ]# ./vic-machine-linux
  vic-machine-linux - Create and manage Virtual Container Hosts

  vic-machine-linux [global options] command [command options] [arguments...]


  create Deploy VCH
  delete Delete VCH and associated resources
  ls List VCHs
  inspect Inspect VCH
  version Show VIC version information
  debug Debug VCH

  --help, -h show help
  --version, -v print the version

root@VBRPHOTON01 [ ~/vic ]#
On all hosts in the cluster you are using, create a bridge network (it has to be on a vDS; mine is called vDS_VCH_Bridge), and disable the ESXi firewall by doing this.
To install we use the command as follows:
root@VBRPHOTON01 [ ~/vic ]# ./vic-machine-linux create --target --image-store VBR_MGTESX01_Local_SSD_01 --name VBR-VCH-01 --user administrator@vsphere.local --password VMware1! --compute-resource VBR_Mgmt_Cluster --bridge-network vDS_VCH_Bridge --public-network vSS_Mgmt_Network --client-network vSS_Mgmt_Network --management-network vSS_Mgmt_Network --force --no-tlsverify
This is in my lab; I’m deploying to a vCenter with a single host, and don’t care about security. The output should look something like this:
INFO[2017-01-06T21:52:14Z] ### Installing VCH ####
WARN[2017-01-06T21:52:14Z] Using administrative user for VCH operation - use --ops-user to improve security (see -x for advanced help)
INFO[2017-01-06T21:52:14Z] Loaded server certificate VBR-VCH-01/server-cert.pem
WARN[2017-01-06T21:52:14Z] Configuring without TLS verify - certificate-based authentication disabled
INFO[2017-01-06T21:52:15Z] Validating supplied configuration
INFO[2017-01-06T21:52:15Z] vDS configuration OK on "vDS_VCH_Bridge"
INFO[2017-01-06T21:52:15Z] Firewall status: DISABLED on "/VBR_Datacenter/host/VBR_Mgmt_Cluster/vbrmgtesx01.virtualbrakeman.local"
WARN[2017-01-06T21:52:15Z] Firewall configuration will be incorrect if firewall is reenabled on hosts:
WARN[2017-01-06T21:52:15Z] "/VBR_Datacenter/host/VBR_Mgmt_Cluster/vbrmgtesx01.virtualbrakeman.local"
INFO[2017-01-06T21:52:15Z] Firewall must permit dst 2377/tcp outbound to VCH management interface if firewall is reenabled
INFO[2017-01-06T21:52:15Z] License check OK on hosts:
INFO[2017-01-06T21:52:15Z] "/VBR_Datacenter/host/VBR_Mgmt_Cluster/vbrmgtesx01.virtualbrakeman.local"
INFO[2017-01-06T21:52:15Z] DRS check OK on:
INFO[2017-01-06T21:52:15Z] "/VBR_Datacenter/host/VBR_Mgmt_Cluster/Resources"
INFO[2017-01-06T21:52:15Z] Creating virtual app "VBR-VCH-01"
INFO[2017-01-06T21:52:15Z] Creating appliance on target
INFO[2017-01-06T21:52:15Z] Network role "client" is sharing NIC with "public"
INFO[2017-01-06T21:52:15Z] Network role "management" is sharing NIC with "public"
INFO[2017-01-06T21:52:16Z] Uploading images for container
INFO[2017-01-06T21:52:16Z] "bootstrap.iso"
INFO[2017-01-06T21:52:16Z] "appliance.iso"
INFO[2017-01-06T21:52:22Z] Waiting for IP information
INFO[2017-01-06T21:52:35Z] Waiting for major appliance components to launch
INFO[2017-01-06T21:52:35Z] Checking VCH connectivity with vSphere target
INFO[2017-01-06T21:52:36Z] vSphere API Test: vSphere API target responds as expected
INFO[2017-01-06T21:52:38Z] Initialization of appliance successful
INFO[2017-01-06T21:52:38Z] VCH Admin Portal:
INFO[2017-01-06T21:52:38Z] Published ports can be reached at:
INFO[2017-01-06T21:52:38Z] Docker environment variables:
INFO[2017-01-06T21:52:38Z] DOCKER_HOST=
INFO[2017-01-06T21:52:38Z] Environment saved in VBR-VCH-01/VBR-VCH-01.env
INFO[2017-01-06T21:52:38Z] Connect to docker:
INFO[2017-01-06T21:52:38Z] docker -H --tls info
INFO[2017-01-06T21:52:38Z] Installer completed successfully
Now we can check the state of our remote VIC host with:
root@VBRPHOTON01 [ ~ ]# docker -H tcp:// --tls info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: v0.8.0-7315-c8ac999
Storage Driver: vSphere Integrated Containers v0.8.0-7315-c8ac999 Backend Engine
vSphere Integrated Containers v0.8.0-7315-c8ac999 Backend Engine: RUNNING
 VCH mhz limit: 2419 Mhz
 VCH memory limit: 27.88 GiB
 VMware Product: VMware vCenter Server
 VMware OS: linux-x64
 VMware OS version: 6.5.0
Execution Driver: vSphere Integrated Containers v0.8.0-7315-c8ac999 Backend Engine
 Network: bridge
Operating System: linux-x64
OSType: linux-x64
Architecture: x86_64
CPUs: 2419
Total Memory: 27.88 GiB
Name: VBR-VCH-01
ID: vSphere Integrated Containers
Docker Root Dir:
Debug mode (client): false
Debug mode (server): false
Registry: registry-1.docker.io
root@VBRPHOTON01 [ ~ ]#
This shows us it’s up and running. Now we can run our first container on here by doing:
root@VBRPHOTON01 [ ~ ]# docker -H tcp:// --tls run hello-world
Unable to find image 'hello-world:latest' locally
Pulling from library/hello-world
c04b14da8d14: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:548e9719abe62684ac7f01eea38cb5b0cf467cfe67c58b83fe87ba96674a4cdd
Status: Downloaded newer image for library/hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
  executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
  to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker Hub account:

For more examples and ideas, visit:

root@VBRPHOTON01 [ ~ ]#
We can see this under vSphere as follows:


So our container host itself is a VM under a vApp, and all containers are spun up as VMs under the vApp. As we can see here, the container ‘VM’ is powered off. This can be seen further by running ‘docker ps’ against our remote host:
root@VBRPHOTON01 [ ~ ]# docker -H tcp:// --tls ps
root@VBRPHOTON01 [ ~ ]# docker -H tcp:// --tls ps -a
24598201e216 hello-world "/hello" 56 seconds ago Exited (0) 47 seconds ago silly_davinci
root@VBRPHOTON01 [ ~ ]# docker -H tcp:// --tls rm 24598201e216
root@VBRPHOTON01 [ ~ ]# docker -H tcp:// --tls ps -a
root@VBRPHOTON01 [ ~ ]#
This container is now tidied up in vSphere:
So now we have VIC installed and can spin up containers. In the next post we will install VMware Harbor and use that as our trusted registry.

Incorrectly Reported Separated Network Partitions in VSAN Cluster

I’ve been playing around with VSAN, automating the build of a 3 node Management cluster using ESXi 6.0 Update 1. I came across an issue where I moved one of my hosts to another cluster and then back into the VSAN cluster, and when it came back it showed as a separate network partition, and had a separate VSAN datastore.

The VSAN Disk Management page under my cluster in the Web Client showed that the Network Partition Group was different for this host to my other two hosts, despite the network being absolutely fine.

It turned out that the host had not rejoined the VSAN cluster, but had created its own 1-node cluster. I resolved this by running the following commands:

On the partitioned host:

esxcli vsan cluster get
Cluster Information
   Enabled: true
   Current Local Time: 2016-09-21T10:23:35Z
   Local Node UUID: 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Local Node Type: NORMAL
   Local Node State: MASTER
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Sub-Cluster Backup UUID:
   Sub-Cluster UUID: 3451e257-cedd-8772-4b31-0cc47ab460e8
   Sub-Cluster Membership Entry Revision: 0
   Sub-Cluster Member Count: 1
   Sub-Cluster Member UUIDs: 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Sub-Cluster Membership UUID: 9c5fe257-e053-7716-ca0a-0cc47ab46218

This shows the host in a single node cluster

On a surviving host:

esxcli vsan cluster get
Cluster Information
   Enabled: true
   Current Local Time: 2016-09-21T11:14:55Z
   Local Node UUID: 57e006b6-71ab-c8f6-7d1d-0cc47ab460e8
   Local Node Type: NORMAL
   Local Node State: MASTER
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: 57e006b6-71ab-c8f6-7d1d-0cc47ab460e8
   Sub-Cluster Backup UUID: 57e0f22f-3071-fe1a-fd8e-0cc47ab460ec
   Sub-Cluster UUID: 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Sub-Cluster Membership Entry Revision: 0
   Sub-Cluster Member Count: 2
   Sub-Cluster Member UUIDs: 57e0f22f-3071-fe1a-fd8e-0cc47ab460ec, 57e006b6-71ab-c8f6-7d1d-0cc47ab460e8
   Sub-Cluster Membership UUID: 3451e257-cedd-8772-4b31-0cc47ab460e8

This showed me there were only 2 nodes in the cluster. We will use the Sub-Cluster UUID from here in a moment.

On the partitioned host:

esxcli vsan cluster leave
esxcli vsan cluster join -u 57e0040c-83a9-add9-ec1f-0cc47ab46218
esxcli vsan cluster get
Cluster Information
   Enabled: true
   Current Local Time: 2016-09-21T10:24:26Z
   Local Node UUID: 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Local Node Type: NORMAL
   Local Node State: AGENT
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: 57e006b6-71ab-c8f6-7d1d-0cc47ab460e8
   Sub-Cluster Backup UUID: 57e0f22f-3071-fe1a-fd8e-0cc47ab460ec
   Sub-Cluster UUID: 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Sub-Cluster Membership Entry Revision: 1
   Sub-Cluster Member Count: 3
   Sub-Cluster Member UUIDs: 57e0f22f-3071-fe1a-fd8e-0cc47ab460ec, 57e006b6-71ab-c8f6-7d1d-0cc47ab460e8, 57e0040c-83a9-add9-ec1f-0cc47ab46218
   Sub-Cluster Membership UUID: 3451e257-cedd-8772-4b31-0cc47ab460e8

Now we see all three nodes back in the cluster. The data will take some time to rebuild on this node, but once done, the VSAN health check should show as Healthy, and there should be a single VSAN datastore spanning all hosts.
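
If you are automating builds like I was, the manual comparison above could be scripted. The following Python sketch parses the text output of ‘esxcli vsan cluster get’ (however you collect it, for example over SSH) and checks whether a host lists another node among its members. The function names are my own, and the parsing simply assumes the ‘Key: value’ layout shown above.

```python
def parse_cluster_get(output):
    """Turn the 'Key: value' lines of `esxcli vsan cluster get` into a dict."""
    info = {}
    for line in output.splitlines():
        # split on the first colon only, so timestamps keep their colons
        key, sep, value = line.strip().partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def member_uuids(info):
    """Return the set of node UUIDs this host believes are in its cluster."""
    raw = info.get("Sub-Cluster Member UUIDs", "")
    return {uuid.strip() for uuid in raw.split(",") if uuid.strip()}


def sees_node(info, node_uuid):
    """True if node_uuid appears in this host's member list."""
    return node_uuid in member_uuids(info)
```

With the outputs above, the partitioned host does not list the surviving host's Local Node UUID among its members before the rejoin, and does afterwards.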

vRealize Orchestrator and Site Recovery Manager – The Missing Parts (or how to hack SOAP APIs to get what you want)

vRealize Orchestrator (vRO) forms the backbone of vRealize Automation (vRA), and provides the XaaS (Anything-as-a-Service) functionality for this product. vRO has plugins for a number of technologies; both those made by VMware, and those which are not. Having been using vRO to automate various products for the last 6 months or so, I have found that these plugins have varying degrees of quality, and some cover more functionality of the underlying product than others.

Over the last couple of weeks, I have been looking at the Site Recovery Manager (SRM) plugin (specifically version 6.1.1, in association with vRO 7.0.1, and SRM 6.1), and while this provides some of the basic functionality of SRM, it is missing some key features which I needed to expose in order to provide full-featured vRA catalog services. Specifically, the plugin documentation lists the following as being missing:

  • You cannot create, edit, or delete recovery plans.
  • You cannot add or remove test network mapping to a recovery plan.
  • You cannot rescan storage to discover newly added replicated devices.
  • You cannot delete folder, network, and resource pool mappings.
  • You cannot delete protection groups.
  • The unassociateVms and unprotectVms methods are not available in the plug-in. You can use them by using the Site Recovery Manager public API.

Some of these are annoying, but the last ones, around removing VMs from Protection Groups, are pretty crucial for the catalog services I was looking to put together. I had to find another way to do this task, outside of the hamstrung plugin.

I dug out the SRM API Developers Guide (available here), and had a read through it, but whilst describing the API in terms of Java and C# access, it wasn’t particularly of use in helping me to use vRO’s JavaScript based programming to do what I needed to do. So I needed another way to do this, which utilised the native SOAP API presented by SRM.

Another issue I saw when using the vRO SRM plugin was that when trying to add a second SRM server (the Recovery site), the plugin fell apart. It seems that the general idea is you only automate your Protected site with this plugin, and not both sites through a single vRO instance.

I tried adding a SOAP host to vRO using the ‘Add a SOAP host’ workflow, but even after adding the WSDL available on the SRM API interface, this was still not particularly friendly, so this didn’t help too much.

Using PowerCLI, we can do some useful things using the SRM API, see this post, and this GitHub repo, for some help with doing this. Our general approach to using vRO is to avoid using a PowerShell host, as this adds a bunch of complexity around adding a host, and generally we would rather do things using REST hosts with pure JavaScript code. So we need a way to figure out how to use this undocumented SOAP API to do stuff.

Now before we go on, I appreciate that the API is subject to change, and that by using the following method to do what we need to do, the methods of automation may change in a future version of SRM. As you will see, this is a fairly simple method of getting what you need, and it should be easy enough to refactor the payloads we are using if and when the API changes. In addition to this, this method should work for any kind of SOAP or REST based API which you can access through .NET type objects in PowerShell.

So the first thing we need to do is to install Fiddler. This is the easiest tool I found to get what I wanted; there are probably other products about, but I found and liked this one. Fiddler is a web debugging tool, which I would imagine a lot of web developers are familiar with; it can be obtained here. What I like about it is the simplicity it gives in setting up a man-in-the-middle (MitM) attack to pull the detail of what is going on. This is particularly useful when using it with PowerShell, because your client machine is the endpoint, so the proxy injection is straightforward without too much messing about.

NOTE: Because this is doing MitM attacks on the traffic, it is

I’m not going to go into installing Fiddler here; it’s a standard Windows wizard. Once installed, launch the program and you should see something like this:


If you click in the bottom right, next to ‘All Processes’, you will see it change to ‘Capturing’:


We are now ready to start capturing some API calls. So open PowerShell. Now to limit the amount of junk traffic we capture, we can set to only keep a certain number of sessions (in this case I set it to 1000), and target the process to capture from (by dragging the ‘Any Process’ button to our PowerShell window).



Run the following to connect to vCenter:

Import-Module VMware.VimAutomation.Core
Connect-VIServer -Server $vcenter -Credential (Get-Credential)


You should see some captures appearing in the Fiddler window; we can ignore these for now as they are just connections to the vCenter server:

You can inspect this traffic in any case, by selecting a session, and selecting the ‘Raw’ tab in the right hand pane:


Here we can see the URI (https://<redacted>/sdk), the HTTP method (POST), the headers (User-Agent, Content-Type, SOAPAction, Host, Cookie etc.), and the body (<?xml version….). This shows us exactly what the PowerShell client is doing to talk to the API.

Now we can connect to our local and remote SRM sites using the following command:

$srm = connect-srmserver -RemoteCredential (Get-Credential -Message 'Remote Site Credential') -Credential (Get-Credential -Message 'Local Site Credential')

If you examine the sessions in your Fiddler window now, you should see a session which looks like this:


This shows the URI as our SRM server, on HTTPS port 9086, with the suffix ‘/vcdr/extapi/sdk’; this is the URI we use for all the SRM SOAP calls. It also shows the body we send (which contains usernames and passwords for both sites), and the response, which has a ‘Set-Cookie’ header with a session ticket in it. This session ticket will be added as a header to each of our following calls to the SOAP API.
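
As a rough illustration of that cookie handling outside of PowerShell, the Python sketch below pulls the session ticket out of a ‘Set-Cookie’ header value and builds the ‘Cookie’ header for later calls. The cookie name is the one seen in the Fiddler captures; the function names and the exact quoting of the value are assumptions on my part.

```python
import re

# URI suffix used for all SRM SOAP calls, as seen in the capture
SRM_SDK_PATH = "/vcdr/extapi/sdk"


def extract_session(set_cookie_header):
    """Return the vmware_soap_session ticket from a Set-Cookie value, or None."""
    match = re.search(r'vmware_soap_session="?([^";]+)"?', set_cookie_header)
    return match.group(1) if match else None


def cookie_header(session_id):
    """Build the Cookie header to send on subsequent SOAP requests."""
    return {"Cookie": 'vmware_soap_session="{}"'.format(session_id)}
```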

Let’s try and do something with the API through PowerShell now, and see what the response looks like. Run the following in your PowerShell window:

$srmApi = $srm.ExtensionData
$protectionGroups= $srmApi.Protection.ListProtectionGroups()

This session will show us the following:


Here we can see that the URI is the same as earlier, and that there is a header named ‘Cookie’ with a value of ‘vmware_soap_session="d8ba0e7de00ae1831b253341685201b2f3b29a66"’, which ties in with the cookie returned by the last call. The response contains the ManagedObjectReference (MoRef) names ‘srm-vm-protection-group-1172’ and ‘srm-vm-protection-group-1823’, which represent our Protection Groups. This is great, but how do we tie these into the Protection Group names we set in SRM? Well, if we run the following in our PowerShell window, and look at the output:

foreach ($pg in $protectionGroups) { Write-Output ($pg.MoRef.Value + " is equal to " + $pg.GetInfo().Name) }

The responses in Fiddler look like this:


This shows us a query being sent, with the Protection Group MoRef, and the returned Protection Group name.

We can repeat this capture process for any of the methods available through the SRM API exposed in PowerCLI, collecting the request bodies used for querying and retrieving data, and use these to build up a library of actions. As an example, we have the following bodies already:

Query for Protection Groups:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><ListProtectionGroups xmlns="urn:srm0"><_this type="SrmProtection">SrmProtection</_this></ListProtectionGroups></soap:Body></soap:Envelope>

Get the name of a Protection Group from its MoRef:

<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><GetInfo xmlns="urn:srm0"><_this type="SrmProtectionGroup">MOREFNAME</_this></GetInfo></soap:Body></soap:Envelope>
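
Since every captured body shares the same envelope and differs only in the method name and the ‘_this’ element, one way to keep a library of them is a small builder function. This is a minimal sketch, not something SRM or vRO provides: the function names are made up, and escaping the MoRef before substituting it is a defensive assumption rather than something seen in the capture.

```javascript
// Shared envelope wrapper, copied from the bodies captured in Fiddler.
var SOAP_PREFIX = '<?xml version="1.0" encoding="utf-8"?>' +
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" ' +
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ' +
    'xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body>';
var SOAP_SUFFIX = '</soap:Body></soap:Envelope>';

// Escape characters that would otherwise break the XML (hypothetical helper).
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Build a body for a method that takes only the "_this" object reference.
function buildSrmBody(method, thisType, thisValue) {
    return SOAP_PREFIX +
        '<' + method + ' xmlns="urn:srm0">' +
        '<_this type="' + thisType + '">' + escapeXml(thisValue) + '</_this>' +
        '</' + method + '>' + SOAP_SUFFIX;
}

// Query for Protection Groups.
function listProtectionGroupsBody() {
    return buildSrmBody("ListProtectionGroups", "SrmProtection", "SrmProtection");
}

// Get the details (including the name) of one Protection Group.
function getProtectionGroupInfoBody(moRef) {
    return buildSrmBody("GetInfo", "SrmProtectionGroup", moRef);
}
```

Calling listProtectionGroupsBody() should produce the same string as the first body above, and getProtectionGroupInfoBody("srm-vm-protection-group-1172") the second body with MOREFNAME filled in. Methods that take extra parameters would need their own builder.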

So how do we take these and turn them into actions in vRO? We first need to add a REST host to vRO using the built-in ‘Add a REST host’ workflow, pointing to ‘https://<SRM_Server_IP>:9086’, and then write actions that make calls against it. There is more detail on doing this around on the web; this site has a good example. For the authentication method we can do:

// let's set up our variables first, these could be pushed in through parameters on the action, which would make more sense, but keep it simple for now
var localUsername = "administrator@vsphere.local";
var localPassword = "VMware1!";
var remoteUsername = "administrator@vsphere.local";
var remotePassword = "VMware1!";

// We need our XML body to send
var content = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><SrmLoginSites xmlns="urn:srm0"><_this type="SrmServiceInstance">SrmServiceInstance</_this><username>'+localUsername+'</username><password>'+localPassword+'</password><remoteUsername>'+remoteUsername+'</remoteUsername><remotePassword>'+remotePassword+'</remotePassword></SrmLoginSites></soap:Body></soap:Envelope>';

// create the session request (RestHost is the REST host we added to vRO, passed in as an input to the action)
var SessionRequest = RestHost.createRequest("POST", "/vcdr/extapi/sdk", content);

// set the headers we saw on the request through Fiddler
SessionRequest.setHeader("Content-Type","text/xml; charset=utf-8");

var SessionResponse = SessionRequest.execute();

// show the content
System.log("Session Response: " + SessionResponse.contentAsString);

// take the response and turn it into a string
var XmlContent = SessionResponse.contentAsString;

// get the headers
var responseHeaders = SessionResponse.getAllHeaders();

// and just the one we want
var token = responseHeaders.get("Set-Cookie");

// log the token we got
System.log("Token: " + token);

// return our token
return token;

This returns the token we can use for calls against the API. Now, how do we use that to return a list of Protection Groups? Like this:

// We need our XML body to send, this just queries for the Protection Group MoRefs
var content = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><ListProtectionGroups xmlns="urn:srm0"><_this type="SrmProtection">SrmProtection</_this></ListProtectionGroups></soap:Body></soap:Envelope>';

// create the request
var SessionRequest = RestHost.createRequest("POST", "/vcdr/extapi/sdk", content);

// set the headers we saw on the request through Fiddler
// "token" is assumed to be an input to this action, holding the value returned by the login action;
// depending on what the Set-Cookie value contains, it may need trimming to just the name=value pair
SessionRequest.setHeader("Content-Type","text/xml; charset=utf-8");
SessionRequest.setHeader("Cookie", token);

var SessionResponse = SessionRequest.execute();

// show the content
System.log("Session Response: " + SessionResponse.contentAsString);

// take the response string and parse it into an XML document
var XmlContent = XMLManager.fromString(SessionResponse.contentAsString);

// let's get the Protection Group MoRefs from the response
var PGMoRefs = XmlContent.getElementsByTagName("returnval");

// declare an array of Protection Group names to return
var returnedPGs = [];

// iterate through each Protection Group MoRef
for (var index = 0; index < PGMoRefs.getLength(); index++) {
    // extract the actual MoRef value
    var thisMoRef = PGMoRefs.item(index).textContent;

    // and insert it into the body of the new call
    content = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><GetInfo xmlns="urn:srm0"><_this type="SrmProtectionGroup">'+thisMoRef+'</_this></GetInfo></soap:Body></soap:Envelope>';

    // do another call to the API to get the Protection Group name
    SessionRequest = RestHost.createRequest("POST", "/vcdr/extapi/sdk", content);
    SessionRequest.setHeader("Content-Type","text/xml; charset=utf-8");
    SessionRequest.setHeader("Cookie", token);
    SessionResponse = SessionRequest.execute();

    // parse the response and add the Protection Group name to our array
    XmlContent = XMLManager.fromString(SessionResponse.contentAsString);
    returnedPGs.push(XmlContent.getElementsByTagName("name").item(0).textContent);
}

// return our Protection Group names
return returnedPGs;
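
When working out which element names to pull from a captured response, it can be handy to try the extraction outside vRO first. The sketch below does this with a regular expression in plain JavaScript. It is only a rough stand-in for a proper XML parser (inside vRO, XMLManager is the better choice), and the sample response is made up: only the ‘returnval’ element name and the two MoRef values come from the captures above, while the wrapper element name is an assumption.

```javascript
// Sketch: collect the text of every <tagName>...</tagName> element in a string.
// extractTagText is a hypothetical helper; it ignores nesting, namespace
// prefixes and CDATA, so treat it as a quick check rather than a parser.
function extractTagText(xml, tagName) {
    // Allow attributes on the opening tag, capture text up to the closing tag.
    var pattern = new RegExp("<" + tagName + "(?:\\s[^>]*)?>([^<]*)</" + tagName + ">", "g");
    var values = [];
    var match;
    while ((match = pattern.exec(xml)) !== null) {
        values.push(match[1]);
    }
    return values;
}

// Illustrative response fragment, not a real capture.
var sampleResponse = '<ListProtectionGroupsResponse xmlns="urn:srm0">' +
    '<returnval type="SrmProtectionGroup">srm-vm-protection-group-1172</returnval>' +
    '<returnval type="SrmProtectionGroup">srm-vm-protection-group-1823</returnval>' +
    '</ListProtectionGroupsResponse>';

var moRefs = extractTagText(sampleResponse, "returnval");
```

Pasting a real response body from Fiddler in place of sampleResponse gives a quick way to confirm that ‘returnval’ and ‘name’ are the right elements before writing the vRO action.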

By building actions like the ones above, we can build up a library to call the API directly. This should be a good starting point for building your own vRO libraries that interact with SRM via the API rather than the plugin. As stated earlier, Fiddler, or something like it, should let you capture anything being done through PowerShell. I have even had some success capturing browser clicks this way, depending on how the web interface is configured. This method certainly made creating some integration with SRM through vRO less painful than trying to use the plugin.