Juggling server virtualisation and database workloads

As the conversation moves on from generic virtualisation of ‘quick win’ workloads such as web servers and print servers, development and test environments, one-off applications and so on, the question arises – where does server virtualisation go next?

One potential area is to see virtual machines as a target for database workloads such as SQL Server, MySQL and Oracle. So, just how good a combination are virtualisation and databases, and what can we learn from early adopters?

Before we kick off we should say that ‘database virtualisation’ can mean different things to different people. It can refer, for example, to:

• A single VM running on a server with direct attached storage, in which the database engine and the database itself both exist within the confines of the virtual machine – for example the VMDK file in VMware terms, or the VHD file in Microsoft terms.

• A VM running a database engine, but with the database repository existing on disks running within a SAN, accessed (for example) via a Host Bus Adaptor (HBA) on the physical server.

(While outside the scope of this piece, note that the term has also been used to describe how the principles of virtualisation have been applied to the database engine itself. In this case, a ‘virtual database’ can be managed as a single entity even though it may exist across several physical servers.)

Considering engine-and-repository-in-VM first, this approach benefits from the same gains as more generic virtualisation – that is, more flexible configuration and provisioning, reduced hardware cost etc. However, running multiple databases in multiple VMs on the same server can only be of limited use, for the simple reason that they will all be accessing the same disk.

So, if databases are only subjected to limited numbers of transactions, there’s plenty to like, but you may find that you quickly reach a threshold beyond which performance starts to plummet. The same disk contention applies to database backup (and as a quick tip, it isn’t a fantastic idea to schedule all VM backups from the same server at the same time, as Reg readers have told us).
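As a rough illustration of the backup tip, here is a minimal Python sketch that spreads backup start times across a window so that VMs on the same host are not all hitting the disk at once. The VM names, the 01:00 start and the 45-minute slot are invented for the example, and a real schedule would need to reflect how long each backup actually takes.

```python
from datetime import datetime, timedelta


def stagger_backups(vm_names, window_start, slot_minutes):
    """Give each VM on one host its own backup start time, one slot apart."""
    return {
        name: window_start + timedelta(minutes=i * slot_minutes)
        for i, name in enumerate(vm_names)
    }


# Hypothetical VM names and timings, purely for illustration.
schedule = stagger_backups(
    ["sql-vm-1", "mysql-vm-2", "oracle-vm-3"],
    datetime(2010, 1, 1, 1, 0),
    45,
)
for name, start in schedule.items():
    print(name, start.strftime("%H:%M"))
# sql-vm-1 01:00
# mysql-vm-2 01:45
# oracle-vm-3 02:30
```

Fixed slots are the simplest option; they only avoid overlap if each backup finishes inside its slot.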

These bottleneck issues can be reduced if the storage is independent of the server, for example in a SAN. In this case, at the very least, different database repositories can be physically stored on different disks, reducing the contention. But bottlenecks still exist – not least at the network ports and HBAs.

Again from Reg feedback, we know that storage bottlenecks are a significant challenge for organisations looking to up their virtualisation game. Hardware manufacturers are falling over themselves (in a good way) to develop new products that can manage the levels of I/O driven by virtual machines – this includes companies like Intel with the Boxboro chipset, and HBA manufacturers such as Emulex. But as hardware architects know, throughput is an ‘end-to-end challenge’ – solve the problem in one place, and it moves to another.
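The ‘end-to-end’ point can be put in a few lines of Python: the throughput of a data path is capped by its slowest stage, so upgrading that stage simply hands the bottleneck to the next slowest. This is a toy sketch, and the stage names and MB/s figures are made up for illustration rather than measured from any real kit.

```python
def bottleneck(stages):
    """Return (name, capacity) of the slowest stage in a data path.

    End-to-end throughput cannot exceed this stage's capacity.
    """
    name = min(stages, key=stages.get)
    return name, stages[name]


# Illustrative capacities in MB/s; invented numbers, not measurements.
path = {"disk array": 400, "HBA": 800, "network port": 1000}
print(bottleneck(path))  # ('disk array', 400)

# Fix the disk array, and the limit moves to the HBA.
path["disk array"] = 1600
print(bottleneck(path))  # ('HBA', 800)
```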

Another area of potential difficulty comes with moving workloads. This is a key selling point of virtualisation, at least to the operations staff who are enjoying (you tell us) the benefits of being able to shift a virtual machine from one server to another. Moving a VM is straightforward enough when it doesn’t have dependencies on the outside world, but difficulties start to appear if data paths need to be kept going between dependent entities – database and storage via an HBA, for example.

For all of these reasons, rather than considering virtualisation as an overlay onto existing infrastructure for consolidation-driven cost-saving reasons, there does seem to be an argument for considering the up-scaling of virtualisation as more of a green field that needs to be planted. To deal with issues of scalability in advance of hitting the bottlenecks, a virtualisation-based environment to support database workloads needs to be designed rather than adapted from what is already there.