Teradici PCoIP firmware 3.5 USB performance comparison

Recently, Teradici released version 3.5 firmware for PCoIP Zero clients based on the Portal processor.

One of the main enhancements is the ability to fully use the USB 2.0 capable hardware that has always existed in these devices but was restricted to USB 1.1 performance by previous firmware.

Version 3.5 firmware removes this restriction when using a PCoIP Zero client against a VMware View desktop.

I thought I would do a quick test to see what the real-world differences are between version 3.4 and 3.5 firmware on a Wyse P20 Zero client.

The Test setup
Wyse P20 Zero client (on gigabit network)
VMware View 5.0 and vSphere 5.0 Infrastructure
Windows XP Virtual Desktop (1 vCPU and 1.5GB RAM) running in a View 5.0 floating pool
Teradici PCoIP Management Console 1.7

I took a basic 2GB Sandisk USB memory stick and connected it to the Wyse P20 running V3.4 firmware. I then copied a 150MB test file from the local disk of my VM to the memory stick.

I then logged off the desktop and refreshed it so that caching would not skew the result.

The Wyse P20 was then updated to V3.5 firmware, pushed out from the Teradici Management Console.

Logging back into a fresh View XP desktop, I copied the same 150MB file to the USB memory stick.

The results:

V3.4 firmware: 4m:35s to copy 150MB file

V3.5 firmware: 1m:48s to copy 150MB file

So, roughly a 2.5-fold increase in USB performance (275 seconds down to 108 seconds).
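As a sanity check on the timings above, here is a small sketch that converts them into an average transfer rate and a speed-up ratio. The function names `rate` and `speedup` are made up for illustration, and the only inputs are the file size and copy times from this test.

```shell
#!/bin/sh
# rate <size_mb> <minutes> <seconds>: average transfer rate in MB/s
rate() {
  awk -v mb="$1" -v m="$2" -v s="$3" \
    'BEGIN { printf "%.2f\n", mb / (m * 60 + s) }'
}

# speedup <old_min> <old_sec> <new_min> <new_sec>: ratio of the two copy times
speedup() {
  awk -v om="$1" -v os="$2" -v nm="$3" -v ns="$4" \
    'BEGIN { printf "%.2f\n", (om * 60 + os) / (nm * 60 + ns) }'
}

echo "V3.4: $(rate 150 4 35) MB/s"
echo "V3.5: $(rate 150 1 48) MB/s"
echo "Speed-up: $(speedup 4 35 1 48)x"
```

With these numbers it reports about 0.55 MB/s for V3.4, 1.39 MB/s for V3.5, and a speed-up of about 2.55x.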

This will make the Teradici based zero clients more acceptable in situations that require moving files via USB memory sticks and removable hard disks, as well as applications involving USB based CD/DVD burners.

In my environment, this is the best feature of the latest V3.5 firmware – a must have.

Configuring iSCSI with ESXCLI in vSphere 5.0

This is mostly me learning my way around some of the new namespaces in esxcli as part of vSphere 5.0.

If you want to know what’s new in esxcli in vSphere 5.0, please read these two posts from Duncan Epping and William Lam.

I wanted to see what needed to be done to configure a load balanced iSCSI connection with two VMkernel portgroups.

So here goes, all done against a standard vSwitch.

Enable software iSCSI on the ESXi host
~ # esxcli iscsi software set --enabled=true

Add a portgroup to my standard vSwitch for iSCSI #1
~ # esxcli network vswitch standard portgroup add -p iSCSI-1 -v vSwitch0

Now add a VMkernel NIC (vmk1) to my portgroup
~ # esxcli network ip interface add -i vmk1 -p iSCSI-1

Repeat for iSCSI #2
~ # esxcli network vswitch standard portgroup add -p iSCSI-2 -v vSwitch0
~ # esxcli network ip interface add -i vmk2 -p iSCSI-2

Set the VLAN for both my iSCSI VMkernel port groups - in my case VLAN 140
~ # esxcli network vswitch standard portgroup set -p iSCSI-1 -v 140
~ # esxcli network vswitch standard portgroup set -p iSCSI-2 -v 140

Set the static IP addresses on both VMkernel NICs as part of the iSCSI network 
~ # esxcli network ip interface ipv4 set -i vmk1 -I 10.190.201.62 -N 255.255.255.0 -t static
~ # esxcli network ip interface ipv4 set -i vmk2 -I 10.190.201.63 -N 255.255.255.0 -t static

Set a manual override fail-over policy so each iSCSI VMkernel portgroup has one active physical vmnic
~ # esxcli network vswitch standard portgroup policy failover set -p iSCSI-1 -a vmnic0
~ # esxcli network vswitch standard portgroup policy failover set -p iSCSI-2 -a vmnic3

Bind each of the VMkernel NICs to the software iSCSI HBA
~ # esxcli iscsi networkportal add -A vmhba33 -n vmk1
~ # esxcli iscsi networkportal add -A vmhba33 -n vmk2

Add the IP address of your iSCSI array or SAN as a dynamic discovery sendtarget
~ # esxcli iscsi adapter discovery sendtarget add -A vmhba33 -a 10.190.201.102

Re-scan your software iSCSI HBA to discover volumes and VMFS datastores
~ # esxcli storage core adapter rescan --adapter vmhba33
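The per-port steps above repeat for each VMkernel port, so they can be sketched as a small dry-run helper that only prints the commands for review. This is a sketch, not a tested deployment script: the function name `iscsi_port_cmds` and the variable names are made up for illustration, and the default values are the ones used in this post.

```shell
#!/bin/sh
# Dry-run helper: prints (does not run) the esxcli commands for one iSCSI
# VMkernel port. Defaults mirror the values used in this post; override
# them via environment variables for a different setup.
VSWITCH="${VSWITCH:-vSwitch0}"
VLAN="${VLAN:-140}"
NETMASK="${NETMASK:-255.255.255.0}"
HBA="${HBA:-vmhba33}"

# usage: iscsi_port_cmds <portgroup> <vmk> <ip> <uplink>
iscsi_port_cmds() {
  pg="$1"; vmk="$2"; ip="$3"; uplink="$4"
  echo "esxcli network vswitch standard portgroup add -p $pg -v $VSWITCH"
  echo "esxcli network ip interface add -i $vmk -p $pg"
  echo "esxcli network vswitch standard portgroup set -p $pg -v $VLAN"
  echo "esxcli network ip interface ipv4 set -i $vmk -I $ip -N $NETMASK -t static"
  echo "esxcli network vswitch standard portgroup policy failover set -p $pg -a $uplink"
  echo "esxcli iscsi networkportal add -A $HBA -n $vmk"
}

iscsi_port_cmds iSCSI-1 vmk1 10.190.201.62 vmnic0
iscsi_port_cmds iSCSI-2 vmk2 10.190.201.63 vmnic3
```

Printing rather than executing keeps the helper safe to run anywhere; once the output looks right it can be pasted into the ESXi shell. Enabling software iSCSI, adding the sendtarget and the rescan are one-off steps and are left out.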

Nice Changes to Maintenance Mode settings in vSphere 5

There have been some small but very useful changes to the feature set of Maintenance Mode in vSphere Update Manager 5.

It’s easier to see when comparing screenshots of a vSphere 4.1 environment to a vSphere 5.0 setup.

Here are the settings presented in VUM 4.1

And here are the settings in VUM 5.0. I’ve highlighted the changes.

Starting at the top (orange), when you remediate one or more hosts, you can choose to change the power state on the VMs if you desire.

The options include shutting down the VMs or suspending them. This is useful in test/dev environments where shutting down or suspending a large number of VMs may be quicker than migrating them across to other hosts.

The second change (green) simply makes the retry timing more aggressive should a host fail to go into maintenance mode. By default it will now retry up to 3 times, every 5 minutes (instead of every 30 minutes in vSphere 4).

The last set of changes (purple) are welcome enhancements, particularly if you have large clusters, and they offer more automation around migrating powered-off VMs.

The most prominent feature in my opinion is parallel remediation for hosts. In vSphere 4, hosts were remediated in a serial fashion and this could be very time consuming in a large cluster.

Now vSphere 5 will automatically determine how many hosts it can patch in parallel using information from DRS. As an admin you can override this at remediation time and manually select how many hosts you want patched at the same time. (see screenshot)

Powered-off and suspended VMs can now be auto configured to migrate when a host goes into maintenance mode, another nice time saver.

The final new feature is the ability to apply patches to ESXi hosts deployed via vSphere Auto Deploy (i.e. PXE booted hosts).

Be aware, only patches that don’t require a host reboot work with this, as PXE booted ESXi hosts are stateless and any patch applied won’t be there next time you reboot the host from the Auto Deploy repository. Any permanent patch would need to be applied to the ESXi host image in the repository.

Overall, the new VUM changes lessen the manual tasks on larger clusters and make patching a less time consuming process.

VMware View Desktop boot-storm on SSD – illustrated

Just a couple of screen shots illustrating what happens with I/O when 30+ Windows XP virtual desktops boot at the same time on local SSD storage.

The hardware used was an 8-core (16 thread) Nehalem 5520 IBM x3650 M2 rack server with 64GB RAM running ESXi 4.1, with two local 160GB Intel 320 MLC SSDs in RAID-0 providing the VMFS datastore for the desktops.

The desktops were the linked clone variety in a VMware View 4.6 floating pool, and were running XP SP3 with 1 vCPU and 1.5GB RAM. Sophos anti-virus was also running in the VMs.

The VMs were bulk selected and all 30+ desktops were force-reset at the same time via the vSphere client.

Firing up esxtop on the ESXi host shows the disk latency and I/O:

Booting the VMs hits the read-only replica at this point, so read IOps reach 8,200, with 493 write IOps against the linked clones. You can see latency is under 1ms, although it did peak at 3ms on occasion.

Just running two Intel SSDs in RAID-0 gives anywhere between 8,000 and 11,000 random 4k read/write IOps with very low latency.

With normal 15,000rpm SAS or FC disks at 180 IOps, you can see you may need 45-60 spindles to get this type of performance, along with the associated cost and rack space required.
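The spindle estimate above is simple division, sketched here with a made-up helper name (`spindles`). It assumes a flat 180 IOps per disk and ignores RAID write penalties and controller cache, so treat the result as a rough lower bound.

```shell
#!/bin/sh
# spindles <target_iops> <iops_per_disk>: disks needed, rounded up
spindles() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

echo "8,000 IOps:  $(spindles 8000 180) disks"
echo "11,000 IOps: $(spindles 11000 180) disks"
```

This gives 45 disks for 8,000 IOps and 62 for 11,000 IOps, in line with the range quoted above.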

This makes local SSD the best bang for buck for VDI deployments, as it mitigates boot-storms and AV update storms as well as enhancing deployment speed.

It also provides the nice side effect of a much snappier, smoother and consistently fast desktop experience for the end user – and that’s what it’s all about.

Running Veeam Backup 5 in a VM

One of the best features of most modern virtual backup products is that they support the vStorage APIs for Data Protection (VADP), including changed block tracking (CBT) and hot-add.

More here on VADP: kb.vmware.com/kb/1021175

To take advantage of the vSphere hot-add ability, the backup server itself must be a VM with access to the same storage and datastores as the VMs it is backing up.

Veeam calls this virtual appliance mode and similar techniques are used by other products from PHD Virtual and VMware (VDR). Ultimately it means backups are very fast, LAN free and are “localised” to the backup server.

Veeam fully supports this configuration, but depending on how many VMs you are backing up, the capacity required and the backup window, it does have some small challenges.

Firstly, we all know the storage Achilles heel of vSphere 4 is its 2TB (minus 512 bytes) limit on VMDKs or RAW disks passed through to the VM.

If you plan to keep a month of compressed and de-duped backups of 100 server VMs, 2TB may well be a drop in the ocean of data.

In my environment, backing up 80+ VMs and keeping versions for 30 days results in around 12TB of data. This includes SQL servers, Exchange, file and content management servers, and the usual suspects in a corporate Windows environment.

So, do we present multiple 2TB VMDKs or RAW disks to the Veeam Backup VM and then spread jobs across the drives? Perhaps aggregate and stripe them using Windows disk manager? Both of these options will work, but I preferred a simpler and cleaner single disk/volume solution.
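To get a feel for how many disks the multi-disk approach would involve, here is a rough sketch. It uses the 2TB minus 512 bytes limit quoted above and a made-up helper name (`disks_needed`), and it assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Largest virtual disk in vSphere 4: 2TB minus 512 bytes (figure from this post).
# Needs a shell with 64-bit integer arithmetic.
MAX_DISK_BYTES=$(( 2 * 1024 * 1024 * 1024 * 1024 - 512 ))

# disks_needed <capacity_tb>: number of maximum-size disks required
disks_needed() {
  need=$(( $1 * 1024 * 1024 * 1024 * 1024 ))
  echo $(( (need + MAX_DISK_BYTES - 1) / MAX_DISK_BYTES ))
}

echo "12TB of backups: $(disks_needed 12) disks"
echo "20TB volume:     $(disks_needed 20) disks"
```

Because each disk falls 512 bytes short of 2TB, a full 12TB needs 7 disks rather than 6, and 20TB needs 11. That is a lot of volumes to juggle or stripe.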

Using the Windows iSCSI initiator inside the VM allows me to present huge volumes directly to the 64-bit Windows server VM from the iSCSI SAN – avoiding the current 32-bit SCSI-2 limits of vSphere.

So onto some technical aspects on how I have this configured.

The Windows VM itself is Server 2008 R2.
It is configured with 6 vCPU and 8GB of RAM. As I said previously, depending on your workload this may be overkill, or perhaps not enough.

I find that two running jobs can saturate a 6 vCPU VM running on top of a Westmere based blade server.

Presented to this is a 20TB volume from an HP P4000 G2 SAN (iSCSI).

The VM has two dedicated vNICs (VMXNET3) for iSCSI. They are separated onto different dedicated virtual machine port groups on the ESX host side, both on the iSCSI VLAN. (See two images below)

Each of the port groups has been configured with a manual fail-over order to set the preferred active physical adapter: iSCSI Network 1 uses vmnic2 and iSCSI Network 2 uses vmnic3. (See two images below)

This means that the two virtual nics in the VM traverse separate physical nics coming out of the ESX hosts.

Then, inside Windows 2008 R2, I used the iSCSI initiator software and installed the HP DSM driver for iSCSI multipathing to the P4000 (other iSCSI vendors should also have MPIO packages for Windows).

Docs on how to configure this in Windows are available from HP:
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c01865547/c01865547.pdf

This allows Windows to use both nics and perform round-robin load balancing to the volume on the iSCSI storage array. In the two images below, you can see the iSCSI target tab and both paths to the volume.

The multi-pathed volume appears in device manager like this:

Data traverses both paths evenly: (VMXNET3 drivers show as 10Gbps)

The end result is a gigantic multi-pathed, multi-terabyte volume available for Veeam Backup 5 to drop its backups onto, with all the advantages of using your virtual infrastructure (HA/DRS etc.).

More info on Veeam Backup and Replication at their site:
http://www.veeam.com/vmware-esx-backup.html