Asides
Matraex has moved
Matraex has outgrown our space at 1101 W Grove St in Boise. The space was great and everyone LOVES being downtown, but we were too big for the tight space.
We will be hiring 2 to 3 people in 2015, starting with a new developer on January 5th, so we decided to treat ourselves this holiday season and find a larger space.
So we found an office with more than twice the space (at the historic Alaska Building, about a block closer into downtown Boise) and moved in on Monday the 15th of December.
With a giant bank of windows looking out at Bogus Basin, we have been very happy with the amount of space and light.
Over the coming weeks we will be adding our unique blend of geeky design to the office space.
For all those that would like to use the mail to send us anything, the new address is below:
1020 W Main St
Suite 250
Boise, Idaho 83702
8th Annual Matraex River Trip (Survivors)
We completed the 8th¹ Annual Matraex Trip and everyone returned safe and sound. We all got wet to varying degrees, everyone got tossed around, and one was tossed half out (but wholly rescued). Other than one knee-bruised rib and a single rib-bruised knee, we ended the day with smiles all around. Here are some highlights of the trip this year:
1 – The 8th annual Matraex river trip is only one year after the second 6th² annual Matraex river trip.
2 – Basically this was the 8th because the 6th³ happened twice, and there never was a 7th.
3 – It was actually the 5th that happened twice, and there never was a 6th; the 7th trip actually happened.
Converting an Assembla subversion repository to a Jira / Stash hosted git repository
We were tasked by our client to move several subversion repositories, hosted at Assembla, to Git repositories hosted in Stash on dedicated Windows 2008 servers with SQL Server 2008.
First, install Git for Windows (http://git-scm.com/download/win).
Next I had to download and install Jira (https://www.atlassian.com/software/jira/download) and Stash (https://www.atlassian.com/software/stash/download) from Atlassian. This was very straightforward: just create an account at Atlassian, download on the webserver, and install using the evaluation key.
I installed the apps using their default options with the default ports, and when prompted I just changed ‘localhost’ to the IP of the server I was working on.
Once they were installed I accessed them at
http://serverip:8080/ – Jira
http://serverip:7990/ – Stash
Each one of the programs takes you through a setup wizard, which prompts you to connect to a database server. This was straightforward; you just have to make sure you set up separate databases for each of them. I used the same database server and username. The only gotcha comes with setting up the Stash database: here you will want to just use SQL to set up the DB.
CREATE DATABASE stash
USE stash
ALTER DATABASE stash SET ALLOW_SNAPSHOT_ISOLATION ON
ALTER DATABASE stash SET READ_COMMITTED_SNAPSHOT ON
ALTER DATABASE stash COLLATE SQL_Latin1_General_CP1_CS_AS
SET NOCOUNT OFF
This setup is a requirement of Stash so the DB is case sensitive. After both databases are set up, you can set up your user with create, drop, insert, and select privileges (I just gave it db_owner).
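As a minimal sketch of that grant from the Windows command line (assuming the sqlcmd utility that ships with SQL Server; the login name and passwords here are hypothetical, so substitute your own):
REM hypothetical login and passwords -- substitute your own values
sqlcmd -S localhost -U sa -P "SaPassword" -Q "CREATE LOGIN stashuser WITH PASSWORD = 'StashPassword1!'"
sqlcmd -S localhost -U sa -P "SaPassword" -d stash -Q "CREATE USER stashuser FOR LOGIN stashuser; EXEC sp_addrolemember 'db_owner', 'stashuser'"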
Set up Jira first; then, when setting up Stash, you can answer some basic questions about where your Jira installation is and combine the user accounts, so you only manage one set of users.
Now that the defaults are set up, add a project and repository to Stash. This is basic: the repository is empty, and the resulting screen shows you how to push an existing repository into Stash.
In the Stash repository screen you will see the commands you will use to push the desired repository to Stash. Take note of where to find them; we are going to first convert our repository from SVN to Git.
Log into a ‘working’ Linux installation; I have an Ubuntu server.
Delete the temporary folder and download your SVN repository into a temporary directory. (I had seen some odd errors when I tried to clone again into a partially cloned git repo, so I decided I will always delete the git repo first and start from scratch if there is an error, to make sure it is 100% clean.)
- rm -rf /tmp/temp.git
- git svn clone <svnurl> --no-metadata -A users.txt --stdlayout /tmp/temp.git
This clone command may take a while depending on how large the repository is as well as your bandwidth to pull down all of the code.
While it is difficult to tell WHERE in the process you are in order to estimate how long it may take, I did find a couple of indicators which may help if it is a very lengthy process.
Find out some information on your existing repository: 1) the current revision, and 2) the total size of your repository on the server (http://stackoverflow.com/questions/1740543/determine-total-size-of-svn-directory-trunk):
- svn info <svn repository url>
- svn list -vR <svn repository url>|awk '{if ($3 != "") sum+=$3; i++} END {print "\ntotal size= " sum/1024000 " MB" "\nnumber of files= " i/1000 " K"}'
You can use the information above in a couple of ways
- Watch the output from your command; depending on the size of the commits to your repository, you will frequently see which revision number is currently being worked on (r17 is revision 17). You can compare this against the most recent revision on the server to determine how far you are in the import.
- Even though your git repository will be smaller than the SVN repository (between 0 and 30%), you may be able to compare sizes to estimate how far along the process is: cd /tmp/temp.git and run 'du --max-depth=0'.
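If you want this hands-off, a rough progress loop along these lines (using the same paths as in this post) polls the growing repo size once a minute for comparison against the SVN total from 'svn list -vR' above:
while true; do du -h --max-depth=0 /tmp/temp.git; sleep 60; done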
Ignore the warnings about using --prefix; the recommendation to set prefix=origin/ just caused me headaches, so by NOT using it this will work.
If you receive an error which says you need to have ‘usera’, add usera to users.txt and run the rm and git svn clone commands again:
- echo "usera = User A <usera@com.com>" >> users.txt
This method works if you only have a couple of users, but some of my repositories had LOTS of users, so I used the following command and then edited the user file by hand to make sure it worked.
- svn log -q <svnurl> | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > users.txt
If you receive the error “Unexpected HTTP status 405 ‘Method Not Allowed’ on ‘/svn/wssus.admindash’”, it means you have some protected directories set up in Assembla. Open the project within Assembla > Settings > Protected branch, remove the directories, and run the rm and git svn clone commands again.
If I don't have any ignore files in SVN, I fake one so I have at least one commit:
- cd /tmp/temp.git
- touch .gitignore #if I don't have any ignore files
- git svn show-ignore > .gitignore #if I do have ignore files in svn; this will return an error if there are no ignore files
- git add .gitignore
- git commit -m 'Converting Properties from SVN to GIT' .gitignore
Create another bare repo and set the symbolic ref for the ‘trunk’ coming from SVN:
- mkdir /tmp/bare.git
- cd /tmp/bare.git
- git init --bare
- git symbolic-ref HEAD refs/heads/trunk
Go back to the temp git repo, push it to the new bare repo, and rename the ‘trunk’ to ‘master’:
- cd /tmp/temp.git
- git remote add bare /tmp/bare.git
- git config remote.bare.push 'refs/remotes/*:refs/heads/*'
- git push bare
- cd /tmp/bare.git
- git branch -m trunk master
Clean up branches and tags (thanks to http://john.albin.net/git/convert-subversion-to-git for most of this)
- git for-each-ref --format='%(refname)' refs/heads/tags |
cut -d / -f 4 |
while read ref
do
git tag “$ref” “refs/heads/tags/$ref”;
git branch -D “tags/$ref”;
done
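As a quick sanity check after the cleanup loop, the SVN tags should now be native git tags and no tags/* branches should remain:
- git tag -l #should list your converted SVN tags
- git branch -a | grep tags/ #should print nothing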
Now you have a correctly formatted repo at /tmp/bare.git. You can push it to your Stash git repository:
- cd /tmp/bare.git
- git remote add origin <your git repository url from stash>
- git push origin master
This may take a bit of time depending on how much code you have, but once it is complete you can browse your repository in Stash.
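Before moving on, a quick sanity check may be worthwhile: clone the repository back from Stash and compare the history. (The clone URL below is hypothetical; use the one shown on your Stash repository screen.)
- git clone http://serverip:7990/scm/myproject/myrepo.git /tmp/verify
- cd /tmp/verify
- git log --oneline | head #the recent history should match what you saw in svn log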
Next step: setting up Git workflows to implement a required SDLC.
Converting a Subversion Repository (hosted on Assembla) to a GIT repository (hosted on a dedicated Linux Host)
We were tasked by our client to move several subversion repositories, hosted at Assembla, to GIT on a dedicated internal Linux server. The first step was actually installing the Ubuntu Server on their ESXi4 server.
- I downloaded the Ubuntu 14.04 LTS iso
- Uploaded it to one of the datastores on the ESXi server
- Created a new VM with 4 CPUs, 8GB RAM, and 100GB of disk space
- I loaded the Ubuntu 14.04 iso into the VM CD drive and started the VM
- I installed Ubuntu with all of the defaults
Next I installed GIT on the server.
- apt-get install git-core
- apt-get install git-svn
Next I cloned the SVN repository into git.
- git svn clone <svnurl> --no-metadata -A authors-transform.txt --stdlayout /svn/tempreponame/
Ignore the warnings about using --prefix; the recommendation to set prefix=origin/ just caused me headaches, so by NOT using it this will work.
Enter your username and password. Each time I got an error saying I needed to have a user such as ‘usera’, I would add usera to users.txt and run the git svn clone command again:
- echo "usera = User A <usera@com.com>" >> users.txt
If you receive the error “Unexpected HTTP status 405 ‘Method Not Allowed’ on ‘/svn/wssus.admindash’”, it means you have some protected directories set up in Assembla. Open the project within Assembla > Settings > Protected branch, remove the directories, and run the git svn clone command again.
I don't have any ignore files in SVN, so I fake one:
- cd /svn/tempreponame/
- touch .gitignore
- git add .gitignore
- git commit -m 'Converting Properties from SVN to GIT' .gitignore
Move the git repository to a new bare repo and set the symbolic ref for the ‘trunk’ coming from SVN:
- mkdir /git/newreponame
- cd /git/newreponame
- git init --bare
- git symbolic-ref HEAD refs/heads/trunk
Go back to the temporary git repo, push it to the new bare repo, and rename the ‘trunk’ to ‘master’:
- cd /svn/tempreponame
- git remote add bare /git/newreponame
- git config remote.bare.push 'refs/remotes/*:refs/heads/*'
- git push bare
- cd /git/newreponame
- git branch -m trunk master
Clean up branches and tags (thanks to http://john.albin.net/git/convert-subversion-to-git for most of this)
- cd /git/newreponame
git for-each-ref --format='%(refname)' refs/heads/tags |
cut -d / -f 4 |
while read ref
do
git tag “$ref” “refs/heads/tags/$ref”;
git branch -D “tags/$ref”;
done
Now you have a new repo at /git/newreponame. Confirm it worked by checking it out and verifying that you have the same log history as your original SVN repo:
- cd /tmp/
- git clone /git/newreponame
- find . #this should list all of your files
- cd newreponame
- git log
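As one more check (assuming you still have the SVN url handy), the commit counts should line up, allowing for the one .gitignore commit we added:
- svn log -q <svnurl> | grep -c '^r' #number of SVN revisions
- git log --oneline | wc -l #number of git commits; should be the SVN count plus one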
Disk write speed testing different XenServer configurations – single disk vs mdadm vs hardware raid
In our virtual environment, one of the VM Host servers has a hardware RAID controller on it, so naturally we used the hardware RAID.
The server is a Dell 6100, which uses a low-featured LSI SAS RAID controller.
One of the ‘low’ features is that it only allows two RAID volumes at a time; it also does not do RAID 10.
So I decided to create a RAID 1 with two SSD drives for the host, where we would also put the root operating systems for each of the Guest VMs. It would be fast and redundant. That leaves up to four 1TB disks for the larger data sets. We have multiple identically configured VM Hosts in our pool.
For the data drives, with only one more RAID volume I could create and no RAID 10 available, I was limited to a RAID 5, a mirror with two spares, or a JBOD. To get the most space out of the four 1TB drives, I created the RAID 5. After configuring two identical VM hosts like this, I put a DRBD Primary / Primary connection between the two of them with an OCFS2 filesystem on top of it, and found I got as low as 3MB/s write speed. I wasn't originally thinking about what speeds I would get; I just expected they would be somewhere around disk write speed, so I suppose I was expecting acceptable speeds between 30 and 80 MB/s. When I didn't get them, I realized I was going to have to do some simple benchmarking on my four 1TB drives to see which configuration would give me the best speed and size tradeoff.
A couple of environment items
- I will mount the final drive on /data
- I mount temporary drives in /mnt when testing
- We use XenServer for our virtual environment, I will refer to the host as the VM Host or dom0 and to a guest VM as VM Guest or domU.
- The final speed that we are looking to get is on domU, since that is where our application will be, however I will be doing tests in both dom0 and domU environments.
- It is possible that the domU may be the only VM Guest, so we will also test raw disk access from domU for the data (and skip the abstraction level provided by the dom0)
So, as I test the different environments, I need to be able to create and destroy the local storage on the dom0 VM Host. Here are some commands that help me do it.
First I went through XenCenter and removed all connections and virtual disks on the storage I want to remove: I had to click on the device “Local Storage 2” under the host, click the Storage tab, and make sure each was deleted. {VM Host SR Delete Process}
xe sr-list host=server1 #find and keep the uuid of the sr in my case "c2457be3-be34-f2c1-deac-7d63dcc8a55a"
xe pbd-list sr-uuid=c2457be3-be34-f2c1-deac-7d63dcc8a55a # find and keep the uuid of the pbd connecting sr to dom0 "b8af1711-12d6-5c92-5ab2-c201d25612a9"
xe pbd-unplug uuid=b8af1711-12d6-5c92-5ab2-c201d25612a9 #unplug the device from the sr
xe pbd-destroy uuid=b8af1711-12d6-5c92-5ab2-c201d25612a9 #destroy the devices
xe sr-forget uuid=c2457be3-be34-f2c1-deac-7d63dcc8a55a #destroy the sr
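Since I repeated this many times, a small helper script along these lines saves retyping (a sketch, assuming the SR's virtual disks were already deleted in XenCenter; pass the SR uuid as the only argument):
#!/bin/bash
SR_UUID=$1 #uuid from 'xe sr-list host=server1'
for PBD in `xe pbd-list sr-uuid=$SR_UUID --minimal | tr ',' ' '`; do
    xe pbd-unplug uuid=$PBD #unplug the device from the sr
    xe pbd-destroy uuid=$PBD #destroy the device
done
xe sr-forget uuid=$SR_UUID #destroy the sr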
Now that the SR is destroyed, I can work on the raw disks in dom0 and do some benchmarking on the speeds of different soft configurations from there.
Once I have made a change to the structure of the disks, I can recreate the SR with a new name on top of whatever solution I come up with by:
xe sr-create content-type=user device-config:device=/dev/XXX host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage XXX on `cat /etc/hostname`" shared=false type=lvm
Replace the XXX with what works for you.
Most of the tests were me just running dd commands and writing down the slowest time, and then what seemed to be about the average time in MB/s. It seemed like the first time a write was done it was a bit slower, but each subsequent time it was faster. I am not sure if that means that when a disk has been idle it takes a bit longer to get going and write; if that is the case, there are two scenarios: if the disk is often idle it will see the slower number, but if the disk is busy it will see the higher average number, so I tracked them both. The idle-disk observation was not scientific, and many of my tests did not wait long enough for the disk to go idle in between tests.
The commands I ran for testing were dd commands
dd if=/dev/zero of=/data/speedtest.`date +%s` bs=1k count=1000 conv=fdatasync #for 1 MB
dd if=/dev/zero of=/data/speedtest.`date +%s` bs=1k count=10000 conv=fdatasync #for 10 MB
dd if=/dev/zero of=/data/speedtest.`date +%s` bs=1k count=100000 conv=fdatasync #for 100 MB
dd if=/dev/zero of=/data/speedtest.`date +%s` bs=1k count=1000000 conv=fdatasync #for 1000 MB
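The same four tests can be wrapped in a quick loop that keeps only the MB/s summary line from each run (a sketch, using the same /data mount point):
for count in 1000 10000 100000 1000000; do
    dd if=/dev/zero of=/data/speedtest.$count bs=1k count=$count conv=fdatasync 2>&1 | tail -n1
done
rm -f /data/speedtest.*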
I won't get into the details of every single command I ran as I was creating the different disk configurations and environments, but I will document a couple of them.
Soft RAID 10 on dom0
dom0> mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb2 --assume-clean
dom0> mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd2 --assume-clean
dom0> mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/md0 /dev/md1 --assume-clean
dom0> mkfs.ext3 /dev/md10
dom0> xe sr-create content-type=user device-config:device=/dev/md10 host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage md10 on `cat /etc/hostname`" shared=false type=lvm
Dual Dom0 Mirror – Striped on DomU for an “Extended RAID 10”
dom0> {VM Host SR Delete Process} #to clean out 'Local storage md10'
dom0> mdadm --manage /dev/md2 --stop
dom0> mkfs.ext3 /dev/md0 && mkfs.ext3 /dev/md1
dom0> xe sr-create content-type=user device-config:device=/dev/md0 host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage md0 on `cat /etc/hostname`" shared=false type=lvm
dom0> xe sr-create content-type=user device-config:device=/dev/md1 host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage md1 on `cat /etc/hostname`" shared=false type=lvm
domU> #at this point use Xen Center to add and attach disks from each of the local md0 and md1 SRs to the domU (they were attached on my systems as xvdb and xvdc)
domU> mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
domU> mkfs.ext3 /dev/md10 && mount /dev/md10 /data
Four disks SR from dom0, soft raid 10 on domU
domU> umount /data
domU> mdadm --manage /dev/md10 --stop
domU> {delete md2 and md1 disks from the storage tab under your VM Host in Xen Center}
dom0> {VM Host SR Delete Process} #to clean out 'Local storage md10'
dom0> mdadm --manage /dev/md2 --stop
dom0> mdadm --manage /dev/md1 --stop
dom0> mdadm --manage /dev/md0 --stop
dom0> fdisk /dev/sda #delete partition and write (d w)
dom0> fdisk /dev/sdb #delete partition and write (d w)
dom0> fdisk /dev/sdc #delete partition and write (d w)
dom0> fdisk /dev/sdd #delete partition and write (d w)
dom0> xe sr-create content-type=user device-config:device=/dev/sda host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sda on `cat /etc/hostname`" shared=false type=lvm
dom0> xe sr-create content-type=user device-config:device=/dev/sdb host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sdb on `cat /etc/hostname`" shared=false type=lvm
dom0> xe sr-create content-type=user device-config:device=/dev/sdc host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sdc on `cat /etc/hostname`" shared=false type=lvm
dom0> xe sr-create content-type=user device-config:device=/dev/sdd host-uuid=`grep -B1 -f /etc/hostname <(xe host-list)|head -n1|awk '{print $NF}'` name-label="Local storage sdd on `cat /etc/hostname`" shared=false type=lvm
domU> mdadm --create /dev/md10 -l10 --raid-devices=4 /dev/xvdb /dev/xvdc /dev/xvde /dev/xvdf
domU> mdadm --detail --scan >> /etc/mdadm/mdadm.conf
domU> echo 100000 > /proc/sys/dev/raid/speed_limit_min #I made the resync go fast, which reduced it from 26 hours to about 3 hours
domU> mdadm --grow /dev/md10 --size=max
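While a resync like that last one is running, watching /proc/mdstat shows the rebuild progress and the effect of the speed_limit_min change:
domU> watch -n 10 cat /proc/mdstat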
Working with GB-large MySQL dump files – splitting insert statements
Recently I had to restore a huge database from a huge MySQL dump file.
Since the dump file had all of the create statements mixed with insert statements, I found the recreation of the database took a very long time, with the possibility that it might error out and roll back all of the transactions.
So I came up with the following script which processes the single MySQL dump file and splits it out so we can run the different parts separately.
This creates files that can be run individually, called:
- mysql.tblname.beforeinsert
- mysql.tblname.insert
- mysql.tblname.afterinsert
cat mysql.dump.sql | awk '
BEGIN { TABLE="table_not_set" }
{
    if ($1=="CREATE" && $2=="TABLE") {
        TABLE=$3
        gsub("`","",TABLE)
        inserted=false
    }
    if ($1!="INSERT") {
        if (!inserted) {
            print $0 > "mysql."TABLE".beforeinsert"
        } else {
            print $0 > "mysql."TABLE".afterinsert"
        }
    } else {
        print $0 > "mysql."TABLE".insert"
        inserted=true
    }
}'
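The split files can then be run individually; a hypothetical restore sequence for one table into a database named mydb would look like:
mysql mydb < mysql.tblname.beforeinsert #create statements and anything else before the first insert
mysql mydb < mysql.tblname.insert #just the bulk inserts, which can be re-run on failure
mysql mydb < mysql.tblname.afterinsert #whatever followed the inserts in the dump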
Creating a Bootable USB Install Thumb drive for XenServer
We have a couple of sites with XenServer VM machines, so part of our redundancy / failure plan is to be able to quickly install / reinstall a XenServer hypervisor.
There are plenty of more involved methods with setting up PXE servers, etc., but the quickest low-tech method is to have a USB thumbdrive on hand.
We could use one of the plethora of tools to create a USB thumbdrive (unetbootin, USB to ISO, etc.), but they all seem to have problems with the ISO (OS not found, error with install, etc.).
So I found one that works well:
http://rufus.akeo.ie/
It appears he keeps his software up to date. Download it and run it, select your USB drive, then check the box to ‘create a bootable disk using ISO Image’ and select the image to use from your hard drive. I downloaded the iso image from
http://xenserver.org/overview-xenserver-open-source-virtualization/download.html
– XenServer Installation ISO
Then just boot from the USB drive and the install should start.
Setting up DRBD with OCFS2 on a Ubuntu 12.04 server for Primary/Primary
We run in a virtual environment, so we thought we would go with the virtual kernel for the latest Linux kernels.
We learned that we should NOT do this in the case where we want to use the OCFS2 distributed locking file system: ocfs2 did not have the correct modules for the virtual kernel, and we would have had to do a custom build of the modules, so we decided against it. We just went with the latest standard kernel and installed the ocfs2 tools from the package manager.
DRBD, on the other hand, had to be downloaded, compiled, and installed regardless of kernel. Here are the procedures; these must be run on each of a pair of machines.
We assume that /dev/xvdb is a similarly sized device on both machines.
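A quick way to confirm that (run on each server; the two numbers should match, or at least be very close):
blockdev --getsize64 /dev/xvdb #prints the device size in bytes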
apt-get install make gcc flex
wget http://oss.linbit.com/drbd/8.4/drbd-8.4.4.tar.gz
tar xzvf drbd-8.4.4.tar.gz
cd drbd-8.4.4/
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-km
make all
Configure both systems to be aware of each other without DNS in /etc/hosts:
192.168.100.10 server1
192.168.100.11 server2
Create a configuration file at /etc/drbd.d/disk.res
resource r0 {
protocol C;
syncer { rate 1000M; }
startup {
wfc-timeout 15;
degr-wfc-timeout 60;
become-primary-on both;
}
net {
#requires a clustered filesystem (ocfs2) for 2 primaries, mounted simultaneously
allow-two-primaries;
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
cram-hmac-alg sha1;
shared-secret "sharedsanconfigsecret";
}
on server1 {
device /dev/drbd0;
disk /dev/xvdb;
address 192.168.100.10:7788;
meta-disk internal;
}
on server2 {
device /dev/drbd0;
disk /dev/xvdb;
address 192.168.100.11:7788;
meta-disk internal;
}
}
Configure DRBD to start on reboot, verify that DRBD is running on both machines, then reboot and verify again:
update-rc.d drbd defaults
/etc/init.d/drbd start
drbdadm -- --force create-md r0
drbdadm up r0
cat /proc/drbd
At this point you should see that both devices are connected Secondary/Secondary and Inconsistent/Inconsistent.
Now we start the sync fresh, on server1 only. Both sides are blank, so DRBD should manage any changes from here on; cat /proc/drbd will show UpToDate/UpToDate.
Then we mark both primary and reboot to verify everything comes back up
server1> drbdadm -- --clear-bitmap new-current-uuid r0
server1> drbdadm primary r0
server2> drbdadm primary r0
server2> reboot
server1> reboot
I took a snapshot at this point
Now it is time to set up the OCFS2 clustered file system on top of the device. First set up /etc/ocfs2/cluster.conf:
cluster:
    node_count = 2
    name = mycluster
node:
    ip_port = 7777
    ip_address = 192.168.100.10
    number = 1
    name = server1
    cluster = mycluster
node:
    ip_port = 7777
    ip_address = 192.168.100.11
    number = 2
    name = server2
    cluster = mycluster
Get the needed packages, configure them, and set up for reboot. When reconfiguring, remember to enter the name of the cluster you want to start at boot (mycluster). Run the below on both machines:
apt-get install ocfs2-tools
dpkg-reconfigure ocfs2-tools
mkfs.ocfs2 -L mycluster /dev/drbd0 #only run this on server1
mkdir -p /data
echo "/dev/drbd0 /data ocfs2 noauto,noatime,nodiratime,_netdev 0 0" >> /etc/fstab
mount /data
touch /data/testfile.`hostname`
stat /data/testfile.*
rm /data/testfile* # you will only have to run this on one machine
reboot
So, everything should be running on both computers at this point. When things come back up, make sure everything is connected.
You can run these commands from either server
/etc/init.d/o2cb status
cat /proc/drbd
Setting DRBD in Primary / Primary — common commands to sync resync and make changes
As we have been setting up our farm with an NFS share, the DRBD primary / primary connection between servers is important.
We are setting up a group of /customcommands/ that we will be able to run to help us keep track of all of the common status and maintenance commands we use. But for when we have to create, make changes to the structure, sync and resync, recover, grow, or move the servers, we need to document our ‘Best Practices’ and how we can recover.
From a base server install:
apt-get install gcc make flex
wget http://oss.linbit.com/drbd/8.4/drbd-8.4.1.tar.gz
tar xvfz drbd-8.4.1.tar.gz
cd drbd-8.4.1/
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-km
make KDIR=/lib/modules/3.2.0-58-virtual/build
make install
Set up /etc/drbd.d/disk.res:
resource r0 {
protocol C;
syncer { rate 1000M; }
startup {
wfc-timeout 15;
degr-wfc-timeout 60;
become-primary-on both;
}
net {
#requires a clustered filesystem (ocfs2) for 2 primaries, mounted simultaneously
allow-two-primaries;
after-sb-0pri discard-zero-changes;
after-sb-1pri discard-secondary;
after-sb-2pri disconnect;
cram-hmac-alg sha1;
shared-secret "sharedsanconfigsecret";
}
on server1 {
device /dev/drbd0;
disk /dev/xvdb;
address 192.168.100.10:7788;
meta-disk internal;
}
on server2 {
device /dev/drbd0;
disk /dev/xvdb;
address 192.168.100.11:7788;
meta-disk internal;
}
}
Setup your /etc/hosts
192.168.100.10 server1
192.168.100.11 server2
Setup /etc/hostname with
server1
Reboot, verify your settings, and SAVE A DRBDVMTEMPLATE: clone your VM to a new server called server2.
Setup /etc/hostname with
server2
Start drbd with /etc/init.d/drbd start. This will likely try to create the connection, but this is where we are going to ‘play’ to learn the commands and how we can sync, etc.
cat /proc/drbd #shows the status of the connections
server1> drbdadm down r0 #turns off the drbd resource and connection
server2> drbdadm down r0 #turns off the drbd resource and connection
server1> drbdadm -- --force create-md r0 #creates a new set of metadata on the drive, which erases drbd's memory of the sync status in the past
server2> drbdadm -- --force create-md r0 #creates a new set of metadata on the drive, which erases drbd's memory of the sync status in the past
server1> drbdadm up r0 #turns on the drbd resource and connection; they should connect without a problem, with no memory of a past sync history
server2> drbdadm up r0 #turns on the drbd resource and connection; they should connect without a problem, with no memory of a past sync history
server1> drbdadm -- --clear-bitmap new-current-uuid r0 #creates a new 'disk sync image', essentially telling drbd that the servers are blank so no sync needs to be done; both servers are immediately UpToDate/UpToDate in /proc/drbd
server1> drbdadm primary r0
server2> drbdadm primary r0 #make both servers primary; now when you put a filesystem on /dev/drbd0 you will be able to read and write on both systems as though they are local
So, let's do some failure scenarios. Say we lose a server; it doesn't matter which one since they are both primaries, but in this case we will say server2 failed. Create a new VM from DRBDVMTEMPLATE, which already had drbd made on it with the configuration, or create another one using the instructions above.
Open /etc/hostname and set it to
server2
Reboot. Make sure drbd is running (/etc/init.d/drbd start).
server1> watch cat /proc/drbd #watch the status of drbd; it is very useful and telling about what is happening. You will want DRBD to be Connected Primary/Unknown UpToDate/DUnknown
server2> drbdadm down r0
server2> drbdadm wipe-md r0 #this is an optional step that is used to wipe out the metadata. I have not seen that it does anything different than creating the metadata using the command below, but it is useful to know the command in case you want to get rid of md on your disk
server2> drbdadm -- --force create-md r0 #this makes sure that there is no partial resync data left over from where you cloned it from
server2> drbdadm up r0 #this brings drbd server2 back into the resource and connects them; it will immediately start syncing. You should see SyncSource Primary/Secondary UpToDate/Inconsistent on server1; for me it was going to take 22 hours for my test of 1TB (10 MB/second)
Let's get funky: what happens if you stop everything in the middle of a sync?
server1> drbdadm down r0 #we shut down the drbd resource that has the most up-to-date information. On server2, /proc/drbd shows Secondary/Unknown Inconsistent/DUnknown: server2 does not know about server1 any more, but server2 still knows that server2 is inconsistent. (An insertable step here could be, on server2: drbdadm down r0; drbdadm up r0, with no change to the effect)
server1> drbdadm up r0 #this brings server1 back online; /proc/drbd on server1 shows SyncSource, server2 shows SyncTarget. server1 came back up as the UpToDate server, server2 was Inconsistent, and drbd figured it out
Where things started to go wrong and become less ‘syncable’ was when both servers were down and had to be brought back up again separately, with a new uuid created on each separately. So let's simulate that the drbd config fell apart and we have to put it together again.
server2> drbdadm disconnect r0; drbdadm -- --force create-md r0; drbdadm connect r0 #start the sync process over
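A related scenario worth knowing: if both sides diverged while each was primary and disconnected (split brain), drbd will refuse to reconnect until one side's changes are discarded. A sketch of the usual recovery, assuming server2's changes are the disposable ones:
server2> drbdadm secondary r0
server2> drbdadm connect --discard-my-data r0 #throw away server2's changes and resync from server1
server1> drbdadm connect r0 #only needed if server1 shows StandAlone in /proc/drbd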
Xen Center – import from exported XVA file for restoring – does not create a new VM
Building a backup procedure for Xen Center
This started when defining the procedure for how to back up VMs ‘off site’ in a way that would later allow us to restore them, should some sort of unrecoverable error occur.
The concept is: we want to be able to take a VM which is currently running and get a backup of some kind which can then be restored at a later point.
First I will explain what I have found to be the way that Xen Center currently allows. I don't know what the suggested ‘Best Practice’ is for this procedure; I couldn't find it by searching, so I explored the options available within Xen Center and at the command line.
Exporting (stopping a VM first)
This method requires you to stop a VM first, which means it does not work well for VMs which are already in production, so it is not a viable “backup” solution for us; I explain it here to make the distinction between the different types of exporting. This method would work well for backing up defined templates to an offsite location, but not for saving a full running VM in a way that could be restored from an offsite location.
Xen Center allows you to export a VM, which you later import. First, shut down your VM, right click on it, go to Export, and follow the steps in the wizard to put the export on your local drive. You can conveniently export multiple stopped VMs; the progress is displayed in the Logs tab for the pool. I told the export process to ‘verify’ the exported files, which added A LONG time to the process, so be prepared for this.
Once the export is complete, you can move these files wherever you want. To restore, you simply right click on your pool, go to Import, select the file from disk, and follow the wizard; the VMs will be started. (I am not sure what happens with any kind of MAC address collisions here if the same VM you exported is currently running.)
Exporting a live VM
It seems reasonable that a currently running VM cannot be exported directly to a file, because the VM is running and the changes that occur in a running VM would be inconsistent during the process of the export. Here is how we work around this.
In short, here are the steps
- create a snapshot of a vm
- export the snapshot to a file offsite using Xen Center (we are backed up)
- start the restore by creating a VM martyr, by installing a new VM or template (hopefully a small one)
- destroy the martyr's existing vdi
- import the snapshot from a file (could be on the same server or pool, or a completely new build)
- attach the imported vdi as a bootable disk
- rename from martyr to your correct name and start the new VM
With more detail:
First take a snapshot of the running VM, then go to the Snapshots tab for the VM and export that snapshot. Follow the wizard and save it to a file.
When it comes time to re-import, we have a little preparation to do; keep track of these numbers.
#Find the UUID for the SR that you will be importing the file back to
xe sr-list host=HOSTNAME
>079eb2c5-091d-5c2f-7e84-17705a8158cf
#get a list of all of the CURRENT uuids for the virtual disks on the sr
xe vdi-list sr-uuid=079eb2c5-091d-5c2f-7e84-17705a8158cf|grep uuid
>uuid ( RO) : bb8015b0-0672-45af-aed5-c5308f60b914
>uuid ( RO) : f0b67634-25bc-486d-b38e-0e8294de7df6
>uuid ( RO) : cdc13e40-9ffe-497c-91ff-d426a52aaf2a
Now we import the file. Right click on the host you would like to restore it to and click ‘Import’. The import process asks you for a couple of pieces of information about the restore (host name, network, etc.); go through the steps and click Finish. The VM will be imported again, and the progress will be shown in the Logs tab of the host and pool. When complete, we have a virtual disk unattached to any VM, which we need to attach to a VM.
Here things are a bit more complex. First we create a VM ‘martyr’. This is what I call a VM that we create through some other method, solely for the purpose of attaching our reimported snapshot to it. We will take the guts out of whatever VM we create and put the guts from our import into it: on the technical side, we take a VM, disconnect the existing bootable vdi, and reconnect the vdi we just imported. I create the VM using a template or install (I don't cover that here) and name it martyr_for_import.
#get a list of the latest uuids for the virtual disks on the sr
xe vdi-list sr-uuid=079eb2c5-091d-5c2f-7e84-17705a8158cf|grep uuid
>uuid ( RO) : bb8015b0-0672-45af-aed5-c5308f60b914
>uuid ( RO) : f0b67634-25bc-486d-b38e-0e8294de7df6
>uuid ( RO) : cdc13e40-9ffe-497c-91ff-d426a52aaf2a
>uuid ( RO) : 04a7f80e-e108-4468-9bd3-fada613e9a42
#each time I have done this, the imported uuid is listed last, but I run the list before and after just to make sure; in this case my vdi is: 04a7f80e-e108-4468-9bd3-fada613e9a42
#find the current vbds attached to this vm
xe vbd-list vm-name-label=martyr_for_import
>uuid ( RO)             : b0f4cb5e-5285-bbec-13a3-f581c6e6d287
>    vm-uuid ( RO)      : 708b633a-683d-859f-1f1f-bf8495d17fe8
>    vm-name-label ( RO): martyr_for_import
>    vdi-uuid ( RO)     : a36d6025-039b-4f6e-9d19-f7eb7d1d4c46
>    empty ( RO)        : false
>    device ( RO)       : xvdd
>uuid ( RO)             : eb12fdac-c36c-78fa-8eb6-67fa3a5a1d85
>    vm-uuid ( RO)      : 708b633a-683d-859f-1f1f-bf8495d17fe8
>    vm-name-label ( RO): martyr_for_import
>    vdi-uuid ( RO)     : cdc13e40-9ffe-497c-91ff-d426a52aaf2a
>    empty ( RO)        : false
>    device ( RO)       : xvda
#shut down the vm
xe vm-shutdown uuid=708b633a-683d-859f-1f1f-bf8495d17fe8
#destroy the vdi virtual disk that is attached to our martyr as the current xvda vbd
xe vdi-destroy uuid=cdc13e40-9ffe-497c-91ff-d426a52aaf2a
#you can verify that it has been destroyed and detached by running xe vbd-list vm-name-label=martyr_for_import again
#now attach our snapshot vdi as a new bootable vbd device as xvda (note the bootable=true and type=Disk)
xe vbd-create vm-uuid=708b633a-683d-859f-1f1f-bf8495d17fe8 vdi-uuid=04a7f80e-e108-4468-9bd3-fada613e9a42 bootable=true device=xvda type=Disk
#okay, we are attached (you can verify by running xe vbd-list vm-name-label=martyr_for_import again)
#go ahead and start the vm through Xen Center or run this command
xe vm-start uuid=708b633a-683d-859f-1f1f-bf8495d17fe8