Clustering Exchange on VMware ESX Server

Introduction

This article gives an overview of using VMware ESX Server to provide high availability (HA) for Microsoft Exchange 2003.

VMware provides IT operations with cost savings in server provisioning and server hosting. One area that has always proved difficult for smaller and budget-constrained IT departments is highly available clustering, mainly because it requires duplicate hardware that would rarely be utilised. Not implementing HA has two main pitfalls: users face longer downtime during maintenance, and in the event of hardware failure the downtime is longer still while restoration and continuity plans are carried out.

Within my infrastructure I currently use VMware to host applications such as SQL and Exchange. This approach has allowed me to build a cost-effective, low-overhead clustering solution without wasting a replica set of passive servers. My clustered Exchange server supports around 1200 users, and the total store size is approximately 200GB, which clearly requires serious horsepower during normal business hours. To provide this, a physical HP BL25p server hosts my Exchange services in production as much as possible. With the improved performance of ESX in VMware VI3, I may consider moving both nodes of the cluster onto ESX in the future to consolidate further.

The MSCS solution gives me an almost zero-cost cluster node to host Exchange services during outage periods such as weekend ESEUTIL maintenance or Service Pack upgrades.

ESX Setup and Configuration

Because my ESX farm is connected to my SAN, I can use Raw Device Mappings (RDMs), available in ESX 2.5.2 and above. Previously, with older versions of ESX 2.5, clustering required shared disks on the VMware file system (VMFS). That did not allow an external physical host to take part in the cluster, which is what I needed. RDMs make this possible, and they also make it easier to take SAN snapshots.

Exchange Setup and Configuration

The setup for an Exchange cluster in this configuration is no different from a completely physical solution, because both nodes think they are physical nodes. For reference, a link to the configuration guide for MSCS solutions on ESX is available at the end of the article.

Exchange Disk configurations

Following Microsoft best practices for Storage Group, Mailbox Store and Log provisioning, I have chosen to separate the DB and Log files for each individual storage group. This means presenting multiple volumes from my SAN to both the physical host and the ESX virtual host. Users are grouped into storage groups using a Platinum, Gold, Silver and Bronze naming structure. This layout gives us the option to align the groups with the HR structure, to apply management policies on mailbox size and message size, and to set SLAs for backup and restore. For instance, if a disaster affected the Platinum storage group, it would have a stricter restore SLA than the Bronze group because of the importance of its users.
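The tiering idea above can be sketched as a small policy table. This is only an illustration: the class name, the field names and every figure in it are hypothetical placeholders, not values from my environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageGroupPolicy:
    name: str
    mailbox_limit_mb: int   # hypothetical mailbox quota
    message_limit_mb: int   # hypothetical maximum message size
    restore_sla_hours: int  # hypothetical target time to restore


# All figures below are placeholders for illustration only.
POLICIES = [
    StorageGroupPolicy("Bronze", 100, 5, 48),
    StorageGroupPolicy("Platinum", 1000, 20, 4),
    StorageGroupPolicy("Silver", 250, 10, 24),
    StorageGroupPolicy("Gold", 500, 10, 8),
]


def restore_order(policies):
    """Return storage group names, tightest restore SLA first."""
    ranked = sorted(policies, key=lambda p: p.restore_sla_hours)
    return [p.name for p in ranked]


if __name__ == "__main__":
    print(restore_order(POLICIES))
```

Keeping the policy per storage group in one place like this makes it easy to see which group should be restored first after a disaster.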

A high-level diagram of my architecture is shown below; it is similar to a Physical/Physical two-node cluster:

[Diagram: exchange-vmware]

Cluster Configuration Caveats

First, configure your disks on the SAN, then present them to your clustered nodes. This is relatively standard practice for a physical host. For a virtual node you need to present the raw LUN to the VMware host server, NOT to the virtual guest. When I presented the LUN to the VMware host, the physical host was holding a regular SCSI reservation on my HP EVA SAN, which caused problems when I rescanned the ESX host for the newly presented LUNs. The resolution was to power off the physical host to release the reservation, then present the disks to the VMware server again and rescan for new LUNs. The key to this type of setup is to configure the disks optimally at installation time.

Another issue lies with the number of disks required. I have found that the number of disks that can be presented on a virtual SCSI adapter in ESX is limited. This causes issues when following the practice outlined above of separating DB and Log drives, as the number of mounted disks can be as many as 12-15. To combat this we have implemented a shared volume between the Platinum and Gold storage groups, and another between Silver and Bronze.
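The disk-count arithmetic can be sketched as follows. This is a rough illustration under stated assumptions: it assumes each shared pair keeps one DB volume and one Log volume, and the adapter limit and the number of extra shared disks are placeholders that you should replace with the figures for your own ESX version and cluster.

```python
def dedicated_layout(groups):
    """One DB volume and one Log volume per storage group."""
    return [f"{g}-{kind}" for g in groups for kind in ("DB", "Log")]


def paired_layout(pairs):
    """One DB volume and one Log volume shared by each pair of groups."""
    return [f"{a}+{b}-{kind}" for a, b in pairs for kind in ("DB", "Log")]


def fits(volumes, extra_disks, adapter_limit):
    """True if the data volumes plus extra shared disks fit on one adapter."""
    return len(volumes) + extra_disks <= adapter_limit


GROUPS = ["Platinum", "Gold", "Silver", "Bronze"]
PAIRS = [("Platinum", "Gold"), ("Silver", "Bronze")]

# Placeholders only: check the VMware documentation for the real limit
# and count your own extra shared disks (quorum and so on).
ADAPTER_LIMIT = 7
EXTRA_DISKS = 1

if __name__ == "__main__":
    for label, layout in (("dedicated", dedicated_layout(GROUPS)),
                          ("paired", paired_layout(PAIRS))):
        ok = fits(layout, EXTRA_DISKS, ADAPTER_LIMIT)
        print(label, len(layout), "volumes, fits:", ok)
```

Pairing the storage groups halves the number of data volumes, which is the trade-off I accepted to stay within the adapter limit.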

RDM Files

RDM files were introduced in VMware ESX 2.5.x to support configurations such as the one I have implemented. An RDM file is basically a metadata file that points to a raw LUN and acts as the proxy disk for the virtual guest. It appears as a physical disk but is easily identified as an RDM. Interestingly, and something I had not known beforehand, it also appears with the same size as the physical LUN it proxies. On my ESX farm I have a VMFS volume dedicated to RDM files, which makes administration easier. How you choose to implement your solution is down to you. I have provided links to VMware resources on RDM theory and best practices at the end of the article so that you can make your own choice!

VMFS is VMware's proprietary file system. It can support clustering, BUT not in the setup that I have implemented. The reason is that a VMFS volume is presented to an ESX host and is only seen by the virtual guests native to that ESX host, so you cannot share it with a physical server as I have done. You can, however, present the disk in a “Shared” formatted mode to another ESX Server that has the LUN presented to it. This still allows you to perform clustering, but only between two ESX Servers. For more details on clustering and disk mode support I have included relevant links at the end of this document.

Performance

Microsoft Exchange makes relatively heavy use of disk I/O. The increased hypervisor performance in VMware VI3 makes it practical to run I/O-intensive products such as Exchange and SQL. It may be beneficial to capacity plan and test your configuration by running highly I/O-intensive operations on the virtual guest with a tool such as IOmeter. Tools such as Jetstress from the Exchange toolkit (see the links section) give the option to stress the disk subsystem with Exchange-specific data. Overall, the initial results from my setup are very good, with the most important factor being no user complaints! Users are still being migrated onto the new platform, so we can monitor the performance of the virtual node as we go along.
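As a starting point for capacity planning, a target figure to test against can be estimated from the user count. This is a minimal sketch, not a sizing method: the per-mailbox I/O rate and the headroom are placeholders that you should replace with measurements from your own servers, and the function name is my own.

```python
def estimated_iops(mailboxes, iops_per_mailbox, headroom=0.2):
    """Peak database IOPS target: users x per-mailbox rate, plus headroom."""
    if mailboxes < 0 or iops_per_mailbox < 0 or headroom < 0:
        raise ValueError("inputs must be non-negative")
    return mailboxes * iops_per_mailbox * (1 + headroom)


USERS = 1200            # user count quoted earlier in this article
IOPS_PER_MAILBOX = 0.5  # placeholder: measure this on your own servers

if __name__ == "__main__":
    target = estimated_iops(USERS, IOPS_PER_MAILBOX)
    print(f"Target for IOmeter/Jetstress runs: about {target:.0f} IOPS")
```

If the virtual node cannot sustain the target figure under test, that is a sign to revisit the disk layout before migrating more users.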

Support statements

Some will say that this solution is not supported by Microsoft. My view is that there are enough tools on the market to port and migrate to a physical setup should Microsoft say that it’s not supported. [Editor: Personally, I feel that this setup should be validated by MS, as running an HA solution which is not supported can be a bad area to get into, especially when a disaster situation occurs!]

End Statement

My company is like many: low on budget and becoming more and more dependent on cost-effective solutions such as this method of clustering. A completely physical solution would not be possible, both financially and because of the space and power a second physical node would need. To obtain buy-in from Exchange admins and SQL admins I have taken the stance that using a second virtual node is better than running with all of your eggs in one basket. Within my environment I am confident that in a failure scenario, or in the event of an urgent maintenance requirement, the virtual solution would give users a more than acceptable level of performance.

Resources and links

VMware RDM Clustering: http://www.vmware.com/pdf/esx25_rawdevicemapping.pdf

VMware clustering guide for VI3: http://www.vmware.com/pdf/vi3_vm_and_mscs.pdf

Installing and configuring an Exchange 2003 cluster: http://technet.microsoft.com/en-us/library/bb123612.aspx

My Blog: http://www.danieleason.co.uk


