Running Veeam Backup 5 in a VM

One of the best features of most modern virtual backup products is that they support the vStorage APIs for Data Protection (VADP), including Changed Block Tracking (CBT) and hot-add.

More here on VADP: kb.vmware.com/kb/1021175

To take advantage of the vSphere hot-add capability, the backup server itself must be a VM with access to the same storage and datastores as the VMs it is backing up.

Veeam calls this virtual appliance mode, and similar techniques are used by other products from PHD Virtual and VMware (VDR). Ultimately it means backups are very fast, LAN-free and “localised” to the backup server.

Veeam fully supports this configuration, but depending on how many VMs you are backing up, the capacity required and the length of your backup window, it does have some small challenges.

Firstly, we all know the storage Achilles' heel of vSphere 4 is its 2TB (minus 512 bytes) limit on VMDKs or raw disks passed through to the VM.

If you plan to keep a month of compressed and de-duplicated backups of 100 server VMs, 2TB may well be a drop in the ocean of data.

In my environment, backing up 80+ VMs and keeping versions for 30 days results in around 12TB of data. This includes SQL Server, Exchange, file and content management servers, and the usual suspects in a corporate Windows environment.

So, do we present multiple 2TB VMDKs or raw disks to the Veeam Backup VM and then spread jobs across the drives? Perhaps aggregate and stripe them using Windows Disk Management? Both of these options will work, but I preferred a simpler and cleaner single disk/volume solution.

Using the Windows iSCSI initiator inside the VM allows me to present huge volumes directly to the 64-bit Windows Server VM from the iSCSI SAN, avoiding vSphere's nasty 2TB SCSI-2 addressing limit entirely.

So, on to some technical aspects of how I have this configured.

The Windows VM itself is Server 2008 R2.
It is configured with 6 vCPUs and 8GB of RAM. As I said previously, depending on your workload this may be overkill, or perhaps not enough.

I find that two running jobs can saturate a 6 vCPU VM running on top of a Westmere-based blade server.

Presented to this VM is a 20TB volume from an HP P4000 G2 SAN (iSCSI).

The VM has two dedicated vNICs (VMXNET3) for iSCSI. Each is separated onto its own dedicated virtual machine port group on the ESX host side, on the iSCSI VLAN (see the two images below).

Each of the port groups has been configured with a manual failover order to set the preferred active physical adapter: iSCSI Network 1 uses vmnic2 and iSCSI Network 2 uses vmnic3 (see the two images below).

This means that the two virtual NICs in the VM traverse separate physical NICs coming out of the ESX host.
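
If you prefer to script that override rather than click through the vSphere Client, a minimal PowerCLI sketch along these lines should do it (assuming the port group and vmnic names above; adjust to suit your hosts):

#Pin each iSCSI port group to its preferred physical adapter, with the other as standby
Connect-VIServer -Server yourvcenterservername
Get-VirtualPortGroup -VMHost esx01.domain.local -Name "iSCSI Network 1" | Get-NicTeamingPolicy | Set-NicTeamingPolicy -MakeNicActive vmnic2 -MakeNicStandby vmnic3
Get-VirtualPortGroup -VMHost esx01.domain.local -Name "iSCSI Network 2" | Get-NicTeamingPolicy | Set-NicTeamingPolicy -MakeNicActive vmnic3 -MakeNicStandby vmnic2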

Then, using the iSCSI initiator software inside Windows 2008 R2, I installed the HP DSM driver for iSCSI multipathing to the P4000 (other iSCSI vendors should also have MPIO packages for Windows).

Docs on how to configure this in Windows are available from HP:
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c01865547/c01865547.pdf

This allows Windows to use both NICs and perform round-robin load balancing to the volume on the iSCSI storage array. In the two images below, you can see the iSCSI target tab and both paths to the volume.
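
If you prefer the command line over the iSCSI initiator GUI, the basic portal and login steps can also be done from an elevated PowerShell prompt with the built-in iscsicli tool. A rough sketch only (the portal IP and target IQN below are placeholders; the second session for the other NIC and the DSM load-balancing policy were set up as per the HP document above):

#Add the P4000 cluster VIP as a target portal, list the targets it exposes, then log in
iscsicli QAddTargetPortal 10.10.10.10
iscsicli ListTargets
iscsicli QLoginTarget iqn.2003-10.com.lefthandnetworks:mgmt-group:42:veeam-repo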

The multi-pathed volume appears in Device Manager like this:

Data traverses both paths evenly (the VMXNET3 drivers show as 10Gbps):

The end result is a gigantic multi-pathed, multi-terabyte volume available for Veeam Backup 5 to drop its backups onto, with all the advantages of running on your virtual infrastructure (HA/DRS etc.).

More info on Veeam Backup and Replication at their site:
http://www.veeam.com/vmware-esx-backup.html

Using vCLI (vMA) or PowerCLI for ESXi server maintenance mode

Just a quick post (mostly for documentation's sake) to highlight how to take your ESX(i) boxes into and out of maintenance mode using either the VMware vCLI/vMA or VMware PowerCLI (PowerShell).

This can be handy if you have a whole bunch of new blades that need drivers or ESXi updates applied via vihostupdate. You can put all this in a script to make your admin life a little easier.

Using vCLI installed on your workstation or via the vMA

You can enter and exit maintenance mode as well as reboot individual ESX hosts:

vicfg-hostops --server youresxservername --operation enter
vicfg-hostops --server youresxservername --operation exit
vicfg-hostops --server youresxservername --operation reboot

Example: vicfg-hostops --server esx01.domain.local --operation enter

Or you can do it against an entire cluster (useful if all your new servers are in a new cluster):

vicfg-hostops --server yourvcenterserver --cluster "yourcluster" --operation enter
vicfg-hostops --server yourvcenterserver --cluster "yourcluster" --operation exit

Example: vicfg-hostops --server vcenter41.domain.local --cluster "ProdCluster" --operation enter

Similar things from the PowerShell side:

#Connect to your vcenter server
connect-viserver -server yourvcenterservername

#Put ESX host into maintenance mode
Set-VMHost -vmhost youresxservername -state maintenance

#Take ESX host out of maintenance mode
Set-VMHost -vmhost youresxservername -state connected

#Reboot your ESX host (it must be in maintenance mode, or also use -Force in the command)
Restart-VMHost -vmhost youresxservername
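
And if you would rather sweep an entire cluster from PowerCLI, a rough equivalent of the vCLI cluster example above (using the same example cluster name):

#Put every host in the cluster into maintenance mode (running VMs need DRS or -Evacuate to move them off)
Get-Cluster -Name "ProdCluster" | Get-VMHost | Set-VMHost -State Maintenance

#And bring them all back out again
Get-Cluster -Name "ProdCluster" | Get-VMHost | Set-VMHost -State Connected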

vSphere VAAI Performance on the HP P4000 G2

With the recent release of SAN/iQ 9.0, which supports the VAAI vStorage offload features of vSphere 4.1, I thought I would run a quick block zeroing test to see the difference.

The setup was as follows:

4 x HP P4500 G2 nodes in a cluster (48 x 15K SAS disks)
500GB Network RAID-10 (2-Way mirror) volume
VMFS datastore with an 8MB block size

From the CLI of the ESXi host, I created a zeroed out VMDK file 50GB in size:

vmkfstools -c 50G -d eagerzeroedthick P4000vaai-test.vmdk

This was run once with VAAI enabled on the ESXi host and once with VAAI disabled (via the advanced DataMover options).
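
For reference, the same toggle can be done from PowerCLI rather than the vSphere Client advanced settings dialog. A minimal sketch, assuming the placeholder host name used earlier (block zeroing is governed by DataMover.HardwareAcceleratedInit; set it back to 1 to re-enable):

#Disable the VAAI block zeroing (WRITE SAME) offload on the host
Set-VMHostAdvancedConfiguration -VMHost (Get-VMHost youresxservername) -Name DataMover.HardwareAcceleratedInit -Value 0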


The results are pretty conclusive. For block zeroing on a VMDK, VAAI accelerates the operation by 4-5x.

VAAI enabled: 109 seconds
VAAI disabled: 482 seconds
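
Put another way, that is roughly 470MB/s of effective zeroing throughput with the offload (50GB in 109 seconds) versus around 106MB/s without it (50GB in 482 seconds), or about a 4.4x speed-up.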

HA error when upgrading ESXi 4.0 Update 2 to ESXi 4.1

Ran into this annoying HA error after upgrading some blades all running ESXi 4.0 Update 2.

All hosts were upgraded using VUM from a vCenter 4.1 server. The upgrades all went well, taking only a few short minutes per host.

Upon trying to re-enable HA, all hosts failed with the error:

HA agent on esxhostname in cluster clustername in datacentername has an error:  Error while running health check script

Disabling HA on the cluster or reconfiguring HA on hosts had no effect. The error persisted.

From previous HA errors I remembered that this knowledge base article from VMware (KB1007234) might do the trick.

After enabling remote Tech Support Mode (SSH) on the ESXi host and connecting in,

run the Legato Automated Availability Manager (AAM) uninstall script:

/opt/vmware/aam/VMware-aam-ha-uninstall.sh

Then restart management services on the ESXi host:

services.sh restart

Then re-enable HA on your cluster and it should install the AAM HA agents from scratch on each ESXi host.

I’ve only run into this issue with ESXi, not ESX classic with a COS.

A list of vSphere 4.1 VAAI compatible arrays?

This list of VAAI array claim rules used by the Pluggable Storage Architecture (PSA) appears in /etc/vmware/esx.conf on an ESX 4.1 host.

It appears these are there to support the vendors whose arrays support VAAI now, or will in the future with a firmware update. (A sketch for listing the same rules on a live host follows the list below.)

/storage/PSA/Filter/claimrule[65430]/match/model = "SYMMETRIX"
/storage/PSA/Filter/claimrule[65430]/match/vendor = "EMC"
/storage/PSA/Filter/claimrule[65430]/plugin = "VAAI_FILTER"
/storage/PSA/Filter/claimrule[65430]/type = "vendor"
/storage/PSA/Filter/claimrule[65431]/match/vendor = "DGC"
/storage/PSA/Filter/claimrule[65431]/plugin = "VAAI_FILTER"
/storage/PSA/Filter/claimrule[65431]/type = "vendor"
/storage/PSA/Filter/claimrule[65432]/match/vendor = "EQLOGIC"
/storage/PSA/Filter/claimrule[65432]/plugin = "VAAI_FILTER"
/storage/PSA/Filter/claimrule[65432]/type = "vendor"
/storage/PSA/Filter/claimrule[65433]/match/vendor = "NETAPP"
/storage/PSA/Filter/claimrule[65433]/plugin = "VAAI_FILTER"
/storage/PSA/Filter/claimrule[65433]/type = "vendor"
/storage/PSA/Filter/claimrule[65434]/match/vendor = "HITACHI"
/storage/PSA/Filter/claimrule[65434]/plugin = "VAAI_FILTER"
/storage/PSA/Filter/claimrule[65434]/type = "vendor"
/storage/PSA/Filter/claimrule[65435]/match/vendor = "LEFTHAND"
/storage/PSA/Filter/claimrule[65435]/plugin = "VAAI_FILTER"
/storage/PSA/Filter/claimrule[65435]/type = "vendor"
/storage/PSA/MP/claimrule[0101]/match/model = "Universal Xport"
/storage/PSA/MP/claimrule[0101]/match/vendor = "DELL"
/storage/PSA/MP/claimrule[0101]/plugin = "MASK_PATH"
/storage/PSA/MP/claimrule[0101]/type = "vendor"
/storage/PSA/VAAI/claimrule[65430]/match/model = "SYMMETRIX"
/storage/PSA/VAAI/claimrule[65430]/match/vendor = "EMC"
/storage/PSA/VAAI/claimrule[65430]/plugin = "VMW_VAAIP_SYMM"
/storage/PSA/VAAI/claimrule[65430]/type = "vendor"
/storage/PSA/VAAI/claimrule[65431]/match/vendor = "DGC"
/storage/PSA/VAAI/claimrule[65431]/plugin = "VMW_VAAIP_CX"
/storage/PSA/VAAI/claimrule[65431]/type = "vendor"
/storage/PSA/VAAI/claimrule[65432]/match/vendor = "EQLOGIC"
/storage/PSA/VAAI/claimrule[65432]/plugin = "VMW_VAAIP_EQL"
/storage/PSA/VAAI/claimrule[65432]/type = "vendor"
/storage/PSA/VAAI/claimrule[65433]/match/vendor = "NETAPP"
/storage/PSA/VAAI/claimrule[65433]/plugin = "VMW_VAAIP_NETAPP"
/storage/PSA/VAAI/claimrule[65433]/type = "vendor"
/storage/PSA/VAAI/claimrule[65434]/match/vendor = "HITACHI"
/storage/PSA/VAAI/claimrule[65434]/plugin = "VMW_VAAIP_HDS"
/storage/PSA/VAAI/claimrule[65434]/type = "vendor"
/storage/PSA/VAAI/claimrule[65435]/match/vendor = "LEFTHAND"
/storage/PSA/VAAI/claimrule[65435]/plugin = "VMW_VAAIP_LHN"
/storage/PSA/VAAI/claimrule[65435]/type = "vendor"
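
To see how these rules are actually loaded on a running host, they can be listed with esxcli, either from the Tech Support Mode console or via the vCLI. A rough sketch (as I recall the 4.1 syntax; check the command help if the class option differs on your build):

# List the VAAI filter and VAAI plugin claim rules currently loaded
esxcli corestorage claimrule list --claimrule-class=Filter
esxcli corestorage claimrule list --claimrule-class=VAAI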

Some great blogs about the new vSphere 4.1 array integration features are here and here

USB passthrough in vSphere 4.1

One of the new features of vSphere/ESX 4.1 is the ability to pass-through up to 20 USB devices from the ESX host to a VM or VMs.

It is really simple to setup and test.

I am using a Server 2008 R2 VM as a test in this case.

Right click on the VM and select edit settings.

Add a USB Controller and then click OK and exit the edit settings screen.

Edit settings on the VM again and add a USB device.

At this point the wizard will show you any visible/compatible devices you have plugged into the underlying ESX 4.1 host.

A device HCL is here on VMware's support site: http://kb.vmware.com/kb/1021345

You cannot multi-select devices at this stage; add them one by one.

There is an option to allow vMotion of the VM while the USB device is connected.

VMware documentation states: “You can migrate a virtual machine to another ESX/ESXi host in the same datacenter and maintain the USB passthrough device connections to the original host.”

I tested vMotion with a USB mass storage device attached and it does indeed work across ESX hosts as promised.

In the screen below I have now added two pass-through USB devices to my VM (a Kingston USB drive and a SafeNet/Rainbow dongle).

Inside the Windows VM, looking at Device Manager, the devices have appeared.

Both devices work correctly as intended.

I have tested numerous brands of USB mass storage devices (Kingston, SanDisk, Lexar, Imation) as well as a couple of security dongles, and they all work well.

Also, please check out the USB pass-through section in the Virtual Machine Administration Guide PDF that is part of the vSphere 4.1 documentation.

http://www.vmware.com/support/pubs/vs_pubs.html

10GigE – changing VMware vSwitch best practices

As 10GbE adapter prices (for rack and blade servers) and per-port switch prices plummet, this tech is fast becoming the easy choice when implementing new servers and storage.

In the past few years, best practice for ESX vSwitches has always been to separate out traffic types onto different vSwitches with dedicated and aggregated 1GigE links.

Most ESX hosts would have a setup similar to this:

Service Console/vMotion on one vSwitch with 2 x 1GigE NICs

LAN/VM Network traffic on another vSwitch with 2 x 1GigE NICs

iSCSI/VMkernel traffic on a third vSwitch with 2 x 1GigE NICs

This meant that you had at least six Ethernet cables snaking their way around your rack into structured cabling or top of rack switches.

With an armada of rack servers or a few blade chassis, cabling and network management can soon become an issue.

Enter affordable 10GigE adapters, cheap SFP+ direct attach cables (up to 8.5m) and dropping prices on 10GigE switch ports: cable mayhem is history.

So now the networking configuration of your average ESXi host might look like this:

Two fat 10Gig pipes into a single vSwitch, everything separated via port groups and VLANs. Simple, secure and manageable with plenty of performance.

Add some NIC team override settings and you can have different traffic nicely split across both paths.
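
As a rough illustration, the host side of that design boils down to something like this PowerCLI sketch (the VLAN IDs and vmnic names are made-up examples; the VMkernel interfaces for management, vMotion and iSCSI would still be created on their port groups afterwards, e.g. with New-VMHostNetworkAdapter):

#Attach both 10GigE uplinks to a single vSwitch
$vs = Get-VirtualSwitch -VMHost youresxservername -Name vSwitch0
Set-VirtualSwitch -VirtualSwitch $vs -Nic vmnic0,vmnic1

#Separate the traffic types with port groups and VLANs
New-VirtualPortGroup -VirtualSwitch $vs -Name "VM Network" -VLanId 100
New-VirtualPortGroup -VirtualSwitch $vs -Name "vMotion" -VLanId 101
New-VirtualPortGroup -VirtualSwitch $vs -Name "iSCSI" -VLanId 102

#Override teaming per port group so traffic is split across the two uplinks
Get-VirtualPortGroup -VMHost youresxservername -Name "vMotion" | Get-NicTeamingPolicy | Set-NicTeamingPolicy -MakeNicActive vmnic1 -MakeNicStandby vmnic0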

Of course one config may not suit all, but I would wager that this would cover most ESX workloads out there.

This is easily my new best practice for ESX vSwitch configuration.