Disk write speed testing different XenServer configurations – single disk vs mdadm vs hardware raid

In our virtual environment on of the VM Host servers has a hardware raid controller on it .  so natuarally we used the hardware raid.

The server is a on a Dell 6100 which uses a low featured LSI SAS RAID controller.
One of the ‘low’ features was that it only allows two RAID volumes at a time.  Also it does not do RAID 10

So I decided to create a RAID 1 with two SSD drives for the host,  and we would also put the root operating systems for each of the Guest VMs there.   It would be fast and redundant.   Then we have upto 4 1TB disks for the larger data sets.  We have multiple identically configured VM Hosts in our Pool.

For the data drives,  with only 1 more RAID volume I could create without a RAID 10,  I was limited to either a RAID V,   a mirror with 2 spares,   a JBOD.  In order to get the most space out of the 4 1TB drives,   I created the RAIDV.   After configuring two identical VM hosts like this,  putting a DRBD Primary / Primary connection between the two of them and then OCFS2 filesystem on top of it.  I found I got as low as 3MB write speed.   I wasnt originally thinking about what speeds I would get,  I just kind of expected that the speeds would be somewhere around disk write speed and so I suppose I was expecting to get acceptable speeds beetween 30 and 80 MB/s.   When I didn’t,  I realized I was going to have to do some simple benchmarking on my 4 1TB drives to see what configuration will work best for me to get the best speed and size configuration out of them.

A couple of environment items

  • I will mount the final drive on /data
  • I mount temporary drives in /mnt when testing
  • We use XenServer for our virtual environment,  I will refer to the host as the VM Host or dom0 and to a guest VM as VM Guest or domU.
  • The final speed that we are looking to get is on domU,  since that is where our application will be,  however I will be doing tests in both dom0 and domU environments.
  • It is possible that the domU may be the only VM Guest,  so we will also test raw disk access from domU for the data (and skip the abstraction level provided by the dom0)

So,  as I test the different environments I need to be able to createw and destroy the local storage on the dom0 VM Host.  Here are some commands that help me to do it.
I already went through xencenter and removed all connections and virtual disk on the storage I want to remove,  I had to click on the device “Local Storage 2” under the host and click the storage tab and make sure each was deleted. {VM Host SR Delete Process}

xe sr-list host=server1 #find and keep the uuid of the sr in my case "c2457be3-be34-f2c1-deac-7d63dcc8a55a"
xe pbd-list   sr-uuid=c2457be3-be34-f2c1-deac-7d63dcc8a55a # find and keep the uuid of the pbd connectig sr to dom0 "b8af1711-12d6-5c92-5ab2-c201d25612a9"
xe pbd-unplug  uuid=b8af1711-12d6-5c92-5ab2-c201d25612a9 #unplug the device from the sr
xe pbd-destroy uuid=b8af1711-12d6-5c92-5ab2-c201d25612a9 #destroy the devices
xe sr-forget uuid=c2457be3-be34-f2c1-deac-7d63dcc8a55a #destroy the sr

Now that the sr is destroyed,  I can work on the raw disks on the dom0 and do some benchmarking on the speeds of differnt soft configurations from their.
Once I have made  a change,  to the structure of the disks,  I can recreate the sr with a new name on top of whatever solution I come up with by :

xe sr-create content-type=user device-config:device=/dev/XXX host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk ‘{print $NF}’` name-label=”Local storage XXX on `cat /etc/hostname`” shared=false type=lvm

Replace the red XXX with what works for you

Most of the tests were me just running dd commands and writing the slowest time,  and then what seemed to be about the average time in MB/s.   It seemed like,  the first time a write was done it was a bit slower but each subsequent time it was faster and I am not sure if that means when a disk is idle,  it takes a bit longer to speed up and write?  if that is the case then there are two scenarios,   if the disk is often idle,  the it will use the slower number,  but if the disk is busy,  it will use the higher average number,  so I tracked them both.  The idle disk issue was not scientific and many of my tests did not wait long enough for the disk to go idle inbetween tests.

The commands I ran for testing were dd commands

dd if=/dev/zero of=data/speetest.`date +%s` bs=1k count=1000 conv=fdatasync  #for 1 mb
dd if=/dev/zero of=data/speetest.`date +%s` bs=1k count=10000 conv=fdatasync  #for 10 mb
dd if=/dev/zero of=data/speetest.`date +%s` bs=1k count=100000 conv=fdatasync  #for 100 mb
dd if=/dev/zero of=data/speetest.`date +%s` bs=1k count=1000000 conv=fdatasync  #for 1000 mb

I wont get into the details of every single command I ran as I was creating the different disk configurations and environments but I will document a couple of them

Soft RAID 10 on dom0

dom0>mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb2 --assume-clean
dom0>mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd2 --assume-clean
dom0>mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/md0 /dev/md1 --assume-clean
dom0>mkfs.ext3 /dev/md10
dom0>xe sr-create content-type=user device-config:device=/dev/md10 host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk ‘{print $NF}’` name-label=”Local storage md10 on `cat /etc/hostname`” shared=false type=lvm

Dual Dom0 Mirror – Striped on DomU for an “Extended RAID 10”

dom0> {VM Host SR Delete Process} #to clean out 'Local storage md10'
dom0>mdadm --manage /dev/md2 --stop
dom0>mkfs.ext3 /dev/md0 && mkfs.ext3 /dev/md1
dom0>xe sr-create content-type=user device-config:device=/dev/md0 host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk ‘{print $NF}’` name-label=”Local storage md0 on `cat /etc/hostname`” shared=false type=lvm
dom0>xe sr-create content-type=user device-config:device=/dev/md1 host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk ‘{print $NF}’` name-label=”Local storage md1 on `cat /etc/hostname`” shared=false type=lvm
#at this  point use Xen Center to add and attach disks from each of the local md0 and md1 disks to the domU (they were attached on my systems as xvdb and xvdc
domU> mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
domU> mkfs.ext3 /dev/md10  && mount /data /dev/md10

Four disks SR from dom0, soft raid 10 on domU

domU>umount /data
domU> mdadm --manage /dev/md10 --stop
domU> {delete md2 and md1 disks from the storage tab under your VM Host in Xen Center}
dom0> {VM Host SR Delete Process} #to clean out 'Local storage md10'
dom0>mdadm --manage /dev/md2 --stop
dom0>mdadm --manage /dev/md1 --stop
dom0>mdadm --manage /dev/md0 --stop
dom0>fdisk /dev/sda #delete partition and write (d w)
dom0>fdisk /dev/sdb #delete partition and write (d w)
dom0>fdisk /dev/sdc #delete partition and write (d w)
dom0>fdisk /dev/sdd #delete partition and write (d w)
dom0>xe sr-create content-type=user device-config:device=/dev/sda host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sda on `cat /etc/hostname`" shared=false type=lvm
dom0>xe sr-create content-type=user device-config:device=/dev/sdb host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sdb on `cat /etc/hostname`" shared=false type=lvm
dom0>xe sr-create content-type=user device-config:device=/dev/sdc host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sdc on `cat /etc/hostname`" shared=false type=lvm
dom0>xe sr-create content-type=user device-config:device=/dev/sdd host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sdd on `cat /etc/hostname`" shared=false type=lvm
domU>mdadm --create /dev/md10 -l10 --raid-devices=4 /dev/xvdb /dev/xvdc /dev/xvde /dev/xvdf
domU>mdadm --detail --scan >> /etc/mdadm/mdadm.conf 
domU>echo 100000 > /proc/sys/dev/raid/speed_limit_min #I made the resync go fast, which reduced it from 26 hours to about 3 hours
domU>mdadm --grow /dev/md0 --size=max