Asides
awk Command to remove Non IP entries from /etc/hosts and /etc/hosts.deny
We had a script on one of our servers that automatically added malicious IPs to our /etc/hosts.deny file.
The script went awry and ended up putting hundreds of thousands of non-IP entries into the file, with the malicious IP addresses mixed in.
I used this awk script to clean it up: it removes all of the non-IP addresses and makes the list unique.
awk '/ALL/ && $NF ~ /^([0-9]{1,3}\.){3}[0-9]{1,3}$/' /etc/hosts.deny | sort -n -k2 | uniq > /etc/hosts.deny2
Once I had inspected /etc/hosts.deny2, I replaced the original:
mv /etc/hosts.deny2 /etc/hosts.deny
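If you want to sanity-check the filter before overwriting anything, you can run it against a small sample file first. The sample entries and the strict dotted-quad pattern here are my own illustration, not from the original incident:

```shell
# Build a throwaway sample file with valid IP entries, a non-IP entry, and a duplicate
cat > /tmp/hosts.deny.sample <<'EOF'
ALL: 203.0.113.7
ALL: not-an-ip.example.com
ALL: 198.51.100.22
ALL: 198.51.100.22
EOF

# Keep only 'ALL' lines whose last field is a dotted quad, then de-duplicate
awk '/ALL/ && $NF ~ /^([0-9]{1,3}\.){3}[0-9]{1,3}$/' /tmp/hosts.deny.sample | sort -u
```

The non-IP entry is dropped and the duplicate collapses to a single line.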
Mdadm – Failed disk recovery (unreadable disk)
Well, after 9 more months I ran into another disk failure. (The first disk failure is covered here: https://www.matraex.com/mdadm-failed-disk-recovery/)
But this time the system was unable to read the disk at all:
#fdisk /dev/sdb
This process just hung for a few minutes. It seemed I couldn't simply run a few commands like before to remove and add the disk back to the software RAID, so I had to replace the disk. Before I went to the datacenter I ran
#mdadm /dev/md0 --remove /dev/sdb1
I physically went to our data center and found the disk that showed the failure (it was disk sdb, so I 'assumed' it was the center disk out of three, but I was able to verify since it was not blinking with normal disk activity). I removed the disk, swapped it out for one that I had sitting there waiting for this to happen, and replaced it. Then I ran a command to make sure the disk was correctly partitioned to fit into the array:
#fdisk /dev/sdb
This command did not hang, but responded with 'cannot read disk'. Darn; it looks like some error happened within the OS or on the backplane that made a newly added disk unreadable. I scheduled a restart on the server. Later, when the server came back up, fdisk could read the disk. It looked like I had used the disk for something before, but since I had put it in my spare disk pile, I knew I could delete it, and I partitioned it with one partition to match what the md was expecting (same as the old disk):
#fdisk /dev/sdb
>d 2 - deletes the old partition 2
>d 1 - deletes the old partition 1
>n - creates a new partition
>p - sets the new partition as primary
>1 - sets the new partition as number 1
> <ENTER> - accept the default starting cylinder
> <ENTER> - accept the default ending cylinder
>w - write the partition changes to disk (fdisk exits after writing)
Now the partition is ready to add back to the raid array
#mdadm /dev/md0 --add /dev/sdb1
And we can immediately see the progress
#mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Wed Jul 18 00:57:18 2007
     Raid Level : raid5
     Array Size : 140632704 (134.12 GiB 144.01 GB)
    Device Size : 70316352 (67.06 GiB 72.00 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Sat Feb 22 10:32:01 2014
          State : active, degraded, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 64K
 Rebuild Status : 0% complete
           UUID : fe510f45:66fd464d:3035a68b:f79f8e5b
         Events : 0.537869
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       3       8       17        1      spare rebuilding   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
And then to see the progress of rebuilding
#cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sdb1[3] sda1[0] sdc1[2]
      140632704 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
      [==============>......]  recovery = 71.1% (50047872/70316352) finish=11.0min speed=30549K/sec
md1 : active raid1 sda2[0]
      1365440 blocks [2/1] [U_]
Wow, in the time I have been blogging this it has already rebuilt 71 percent! But wait, what is this, md1 is failed? I checked my monitor, and what did I find but another message showing that md1 failed with the reboot. I was so used to getting the notice saying md0 was down that I did not notice md1 did not come back up with the reboot! How can this be?
It turned out that sdb was in use in both md1 and md0, but even though sdb could not be read at all and /dev/sdb1 had failed out of the md0 array, somehow the raid subsystem had not noticed and degraded the md1 array, even though the entire sdb disk was not responding (perhaps sdb2 WAS responding back then, just not sdb; who knows at this point). Maybe the errors on the old disk could have been corrected by the reboot if I had tried that before replacing the disk, but that doesn't matter any more. All I know is that I had to repartition the sdb device to support both the md0 and md1 arrays.
I had to wait until sdb finished rebuilding, then remove it from md0, use fdisk to destroy the partitions, build new partitions matching sda, and add the disk back to both md0 and md1.
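The second pass can be sketched as a short sequence. The device names are this server's; cloning the partition table from the healthy disk with sfdisk is my shortcut, not the interactive fdisk session shown above:

```shell
# Sketch: repartition sdb to match sda, then re-add it to both arrays
sfdisk -d /dev/sda | sfdisk /dev/sdb   # clone sda's partition table onto sdb
mdadm /dev/md0 --add /dev/sdb1         # rebuild the raid5 array
mdadm /dev/md1 --add /dev/sdb2         # rebuild the raid1 array
watch cat /proc/mdstat                 # follow both rebuilds
```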
Deleting Orphaned Disks in Citrix XenServer
I found that while building my virtual environment with templates ready to deploy I created quite a few templates and snapshots.
I did a pretty good job of deleting the extras when I didn’t need them any more, but in some cases when deleting a VM I no longer needed, I forgot to check the box to delete the snapshots that went WITH that VM.
I could see under the dom0 host -> Storage tab that space was still allocated to the snapshots (Usage was higher than the combined visible usage of servers and templates, and Virtual allocation was way higher than it should be).
But there was no place that listed the snapshots that were taking up the space. When looking into how to delete these orphaned snapshots (and the disk snapshots that went with them) I found some cumbersome command line methods.
Like this old method that someone used - http://blog.appsense.com/2009/11/deleting-orphaned-disks-in-citrix-xenserver-5-5/
After a bit more digging, I found that by just clicking on the Local Storage under the host, then clicking on the 'Storage' tab under there, I could see a list of all of the storage elements that are allocated. I saw some that were for snapshots without a name; it turns out those were the ones that were orphaned. If they were allocated to a live server the delete button would not be highlighted, so I just deleted those old ones.
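If you prefer the command line, a rough way to spot candidates is to walk the VDIs in a storage repository and flag any that have no VBD attaching them to a VM. The SR uuid is a placeholder, and the output should be treated as a list to inspect, not an automatic delete list:

```shell
# For each VDI in the storage repository, check whether any VBD references it
for vdi in $(xe vdi-list sr-uuid=XXXXX-XXXXX-XXXX params=uuid --minimal | tr ',' ' '); do
  if [ -z "$(xe vbd-list vdi-uuid="$vdi" --minimal)" ]; then
    echo "no VBD attached, possibly orphaned: $vdi"
  fi
done
```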
Resizing a VDI on XenServer using XenCenter and Commandline
Occasionally I have a need to change the size of a disk, perhaps to allocate more space to the OS.
To do this, on the host I unmount the disk
umount /data
Click on the domU server in XenCenter and click on the Storage tab, select the storage item I want to resize and click ‘Detach’
at the command line on one of the dom0 hosts
xe sr-list host=dom0hostname
write down the uuid of the SR which the Virtual Disk was in. (we will use XXXXX-XXXXX-XXXX)
xe vdi-list sr-uuid=XXXXX-XXXXX-XXXX
write down the uuid of the disk that you want to resize (we will use YYYY-YYYY-YYYYY)
Also, note the virtual-size parameter that shows. VDIs cannot be shrunk, so you will need a disk size LARGER than the size displayed here.
xe vdi-resize uuid=YYYY-YYYY-YYYYY disk-size=9887654
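As a sanity check on the units: disk-size is in bytes by default, and on the XenServer versions I have used the xe CLI also accepts a size suffix, which is harder to get wrong (suffix support is an assumption to verify on your version; plain bytes always work):

```shell
# Grow the VDI to 10 GiB using a size suffix instead of raw bytes
xe vdi-resize uuid=YYYY-YYYY-YYYYY disk-size=10GiB
```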
Two step recipe – upgrading from Postgres 8.4 to 9.3 and then implementing 9.3 hot_standby replication
Upgrade an existing PostgreSQL database from 8.4 to 9.3, then implement a 9.3 hot_standby replication server so all backups and slow select queries can run from it.
The setup: two servers. The current primary database server (it will continue to be the primary database server when using 9.3, but we will call it the master in the replication), and a second server that will act as the standby.
First get and install postgres 9.3 using the postgres apt repositories
master and standby> vi /etc/apt/sources.list
- deb http://apt.postgresql.org/pub/repos/apt/ UNAME-pgdg main
  #UNAME EXAMPLES: precise, squeeze, etch, etc
master and standby> apt-get update
master> apt-get install postgresql-9.3 postgresql-client-9.3 nfs-kernel-server nfs-client
standby> apt-get install postgresql-9.3 postgresql-client-9.3 postgresql-contrib-9.3 nfs-kernel-server nfs-client
Next create some shared directories via NFS for file-based archiving
standby> mkdir -p /data/dbsync/primaryarchive
master> mkdir -p /data/dbsync-archiveto
master> vi /etc/exports
- /var/lib/postgresql/9.3/main 192.168.*.*(ro,sync,no_subtree_check,no_root_squash)
standby> vi /etc/exports
- /data/dbsync/primaryarchive 192.168.*.*(rw,sync,no_subtree_check)
master> vi /etc/fstab
- SECONDARYSERVERIP:/data/dbsync/primaryarchive /data/dbsync-archiveto nfs ro 0 1
standby> mkdir -p /mnt/livedb
standby> mount PRIMARYSERVERIP:/var/lib/postgresql/8.4/main/ /mnt/livedb
master> mount /data/dbsync-archiveto
Now, configure postgres on the master to allow replication and restart. Put it on port 5433 so there are no conflicts with 8.4.
master> vi /etc/postgresql/9.3/main/pg_hba.conf
- host replication all SECONDARYSERVERIP trust
master> vi /etc/postgresql/9.3/main/postgresql.conf
- wal_level = hot_standby
- archive_mode = on
- port = 5433
- archive_command = 'test -f /data/dbsync-archiveto/archiveable && cp %p /data/dbsync-archiveto/%f'
master> /etc/init.d/postgresql restart
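The `test -f ... && cp` in archive_command is worth understanding: archiving only happens while the `archiveable` flag file exists, and when the test fails the whole command exits nonzero, so Postgres keeps the WAL segment and retries it later. A minimal stand-alone demonstration of that gating, using throwaway /tmp paths in place of the real archive directory:

```shell
# Stand-in paths for the archive destination and a WAL segment
mkdir -p /tmp/archiveto
rm -f /tmp/archiveto/archiveable /tmp/archiveto/segment
echo "wal-segment-data" > /tmp/segment

# Flag absent: the && short-circuits, nothing is copied, exit status is nonzero
test -f /tmp/archiveto/archiveable && cp /tmp/segment /tmp/archiveto/segment \
  || echo "flag missing: segment held for retry"

# Flag present: the copy runs
touch /tmp/archiveto/archiveable
test -f /tmp/archiveto/archiveable && cp /tmp/segment /tmp/archiveto/segment
```

Touching or removing the flag file on the master is then a simple way to pause and resume archiving without editing postgresql.conf.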
Configure postgres on the standby to allow it to run as a hot_standby
standby> vi /etc/postgresql/9.3/main/postgresql.conf
- wal_level = hot_standby
- hot_standby = on
standby> vi /var/lib/postgresql/9.3/main/recovery.conf (restore_command and recovery_end_command are recovery.conf settings; the pg_basebackup -R below generates this file, so add them there)
- restore_command = '/usr/lib/postgresql/9.3/bin/pg_standby -d -t /tmp/pgsql.trigger.5432 /data/dbsync/primaryarchive %f %p %r 2>>/var/log/postgresql/standby.log'
- recovery_end_command = 'rm -f /tmp/pgsql.trigger.5432'
standby> /etc/init.d/postgresql stop
Now let's get a base backup on the standby
standby> cd /var/lib/postgresql/9.3; mv main main.old
standby> pg_basebackup -D main -R -h192.168.120.201 -p5433 -x -Upostgres
standby> chown postgres.postgres main/ -R
standby> /etc/init.d/postgresql start
That's it! You should now have a working replication server.
primary> create table tmp as select now();
secondary> select * from tmp;
#check the progress several ways: the postgres log, which files and recovery processes are running, and by being able to connect to the secondary and see updates from the master
standby> tail /var/log/postgresql/postgresql-9.3-main.log
standby> grep 'database system is ready to accept read only connections' /var/log/postgresql/postgresql-9.3-main.log
standby> ps ax|grep post
- postgres: wal receiver process streaming 3/43000000
master> psql -Upostgres -c 'select pg_switch_xlog()'
and the xlog segment being recovered on the standby would switch
standby> ps ax|grep post
- postgres: startup process recovering 000000010000000300000037
That was all to make sure that replication works on 9.3. Now that I am comfortable with it working, I am going to turn off the replication, copy the data from 8.4 to 9.3, and recreate the replication.
First let's stop the postgresql daemon on the standby server so the VERY heavy load from copying the db from 8.4 to 9.3 is not duplicated
standby> /etc/init.d/postgresql stop
Next, copy the database from 8.4 to 9.3. I have heard there may be some problems converting some objects between 8.4 and 9.3, but not for me; this went great.
master> pg_dump -C -Upostgres mydatabase | psql -Upostgres -p5433
Once that is successful, let's switch ports on the 9.3 and 8.4 servers so 9.3 can take over
master> vi /etc/postgresql/9.3/main/postgresql.conf
- port = 5432
master> vi /etc/postgresql/8.4/main/postgresql.conf
- port = 5433
master> /etc/init.d/postgresql reload
Last step, get a base backup and start again.
standby> cd /var/lib/postgresql/9.3; mv main main.old
standby> pg_basebackup -D main -R -h192.168.120.201 -x -Upostgres
standby> chown postgres.postgres main/ -R
standby> /etc/init.d/postgresql start
standby> rm /var/lib/postgresql/9.3/main.old* -rf
Now..... to figure out what to do with the archivedir method we are currently using..... It seems that it is just building up; when do we use it?
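A quick sanity check after the port swap: 9.3 should now be the server answering on the default port, with 8.4 parked on 5433 (user and port values as used above):

```shell
# 9.3 should answer on 5432, 8.4 on 5433
psql -Upostgres -p5432 -c 'select version();'
psql -Upostgres -p5433 -c 'select version();'
```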
Employee Appreciation Night – Wahooz
The team at Matraex went to Wahooz on Tuesday, February 4th for some pizza, laser tag, go karts and video games.
We ended up with a group of 15 with family and all.
It was lots of fun and we will probably end up doing it again!
PHP to reset all primary key sequences in your postgresql database
Use the following PHP code to reset all of the primary key sequences with the max(id) currently in the db.
We use wrapper functions: db_query(), which returns an array from the db when a select statement is run, and db_exec(), which runs an update or insert command against the db.
[code language="php"]$sql = "SELECT t.relname as related_table,
a.attname as related_column,
s.relname as sequence_name
FROM pg_class s
JOIN pg_depend d ON d.objid = s.oid
JOIN pg_class t ON d.objid = s.oid AND d.refobjid = t.oid
JOIN pg_attribute a ON (d.refobjid, d.refobjsubid) = (a.attrelid, a.attnum)
JOIN pg_namespace n ON n.oid = s.relnamespace
WHERE s.relkind = 'S'
AND n.nspname = 'public'";
$qry = db_query($sql);
foreach($qry as $row)
{
$outsql = "select setval('$row[sequence_name]',(select max($row[related_column]) from $row[related_table]))";
db_exec($outsql);
}[/code]
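For a single table you can do the same thing directly in psql; this is what the loop above runs once per sequence (the database, table, column, and sequence names here are hypothetical):

```shell
# Reset one sequence to the current max(id) of its table
psql -Upostgres mydatabase -c "select setval('users_id_seq', (select max(id) from users))"
```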
Setting up postgres warm standby
Mainly just notes, I haven’t gone through this in too much detail yet…. -MB
Setup two postgres servers with the same version, both with the same data directory layout.
AS ROOT
primary> apt-get install postgresql-server nfs-kernel-server nfs-client
secondary> apt-get install postgresql-server nfs-kernel-server nfs-client postgresql-contrib-8.4
secondary> mkdir -p /data/dbsync/primaryarchive
primary> mkdir -p /data/dbsync-archiveto
primary> vi /etc/exports
#to temporarily allow access to the base data for getting a base
/var/lib/postgresql/8.4/main 192.168.*.*(ro,sync,no_subtree_check,no_root_squash)
secondary> vi /etc/exports
#to provide a location on the secondary, that the primary can write WAL logs to
/data/dbsync/primaryarchive 192.168.*.*(rw,sync,no_subtree_check)
primary> vi /etc/fstab
SECONDARYSERVERIP:/data/dbsync/primaryarchive /data/dbsync-archiveto nfs ro 0 1
primary> mount /data/dbsync-archiveto
primary> vi /etc/postgresql/8.4/main/postgresql.conf
#to turn on archiving from primary to the secondary
-archive_mode = on
-archive_command = 'cp %p /data/dbsync-archiveto/%f'
secondary>mkdir -p /mnt/livedb
secondary> mount PRIMARYSERVERIP:/var/lib/postgresql/8.4/main/ /mnt/livedb
At this point we should have a postgres db running on both primary and secondary in /var/lib/postgresql/8.4/main
we should have a mount on the primary pointing to the secondary and WAL logs being written to the secondary. If the mount to the secondary fails, the WAL logs will build up on the primary in /var/lib/postgresql/8.4/main/pg_xlog.
we should have a mount on the secondary pointing to the base install of the primary database so we can copy the base
secondary> /etc/init.d/postgresql stop
secondary> rsync -Cqtar --delete /mnt/livedb/* /var/lib/postgresql/8.4/main/.
secondary> vi /var/lib/postgresql/8.4/main/recovery.conf
While that is going, set up the configuration on the secondary to monitor the archive directory on the primary and process the archive entries.
secondary> vi /var/lib/postgresql/8.4/main/recovery.conf
restore_command = '/usr/lib/postgresql/8.4/bin/pg_standby -d -t /tmp/pgsql.trigger.5432 /data/dbsync/primaryarchive %f %p %r 2>>/var/log/postgresql/standby.log'
recovery_end_command = 'rm -f /tmp/pgsql.trigger.5432'
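One useful property of pg_standby's -t option: to later promote the secondary out of recovery and bring it up read-write, you just create the trigger file it is watching for, and recovery_end_command cleans it up afterwards.

```shell
# On the secondary, when you want it to stop recovering and come up read-write:
touch /tmp/pgsql.trigger.5432
# recovery_end_command ('rm -f /tmp/pgsql.trigger.5432') removes it once recovery ends
```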
———————
(the above is ideal, but sometimes NFS mount issues cause problems, here are some shortcuts I have used….)
- use the mount from the primary to the secondary to do the initial base transfer.
- On the secondary stop postgres (/etc/init.d/postgresql stop)
- mv /var/lib/postgresql/8.4/main to /data/dbsync/primaryarchive (giving /data/dbsync/primaryarchive/main)
- from the primary start a backup and sync to the secondary
- primary> psql -Upostgres -c "select pg_start_backup('backmeup')" && time rsync --delete -Ctar /var/lib/postgresql/8.4/main/* /data/dbsync-archiveto/main/. --progress --exclude='*pg_xlog/*' && psql -Upostgres -c "select pg_stop_backup()"
- mv main back to the running location (mv /data/dbsync/primaryarchive/main /var/lib/postgresql/8.4/.)
- make sure all files are properly owned by postgres.postgres (chown postgres.postgres -R /var/lib/postgresql/8.4/main)
- vi /var/lib/postgresql/8.4/main/recovery.conf
- restore_command = '/usr/lib/postgresql/8.4/bin/pg_standby -d -t /tmp/pgsql.trigger.5432 /data/dbsync/primaryarchive %f %p %r 2>>/var/log/postgresql/standby.log'
- recovery_end_command = 'rm -f /tmp/pgsql.trigger.5432'
- /etc/init.d/postgresql start
XenCenter – live migrating a vm in a pool to another host
When migrating a vm server from one host to another host in the pool I found it to be very easy at first.
In fact, it was one of the first tests I did after setting up my first vm on a host in a pool. 4 steps:
- Simply right click on the vm in XenCenter ->
- Migrate to Server ->
- Select from your available servers.
- Follow the wizard
In building some servers, I wanted to get some base templates which are 'aware' of the network I am putting together. This would involve adding some packages and configuration, taking a snapshot, and then turning that snapshot into a template that I could easily restart next time I wanted a similar server. Then, when I went to migrate one of the servers to its final resting place, I found an interesting error.
- Right click on the vm in XenCenter ->
- Migrate to Server ->
- All servers listed – Cannot see required storage
I found this odd since I was sure that the pool could see all of the required storage (in fact I was able to start a new VM on the available storage, so I knew the storage was there).
I soon found out that the live migrate feature just doesn't work when there is more than one snapshot. I will have to look into how I want to manage my snapshots now, but basically I found that by removing old snapshots so that the VM only had one snapshot (I left one that was a couple of days old), I was able to follow the original 4 steps.
Note: the way I found out about the limitation of the number of snapshots was by
- Right click on the vm in XenCenter ->
- Migrate to Server ->
- The available servers are all grayed out, so Select “Migrate VM Wizard”
- In the wizard that comes up select the current pool for “Destination”
- This populates a list of servers in the destination pool to choose as the Home Server for the VM (my understanding is that this will move the VM to that server AND make that new server the "Home Server" for that VM)
- When you attempt to select from the drop down list under Home Server, you see a message: "You attempted to migrate a VM with more than one snapshot"
Using that information I removed all but one snapshot and was able to migrate. I am sure there is some logical reason behind the snapshot/migration limitation, but for now I will work around it and come up with some other way to handle my snapshots than just leaving them under the snapshot tab of the server.
apt-get – NO_PUBKEY – how to add the pubkey
I have run into this situation many times on Ubuntu and Debian so I thought I would finally document the fix.
When you run into an apt-get error where there is NO_PUBKEY available for a package you want to install, you get this error:
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY xxxxxxxxxxxxxxxxxxxxxx
This means your system does not trust the signature, so if you trust MIT's keyserver, you can do this to fix it:
root@servername:~# gpg --keyserver pgp.mit.edu --recv-keys xxxxxxxxxxxxxxxxxxxxxx
root@servername:~# gpg --armor --export xxxxxxxxxxxxxxxxxxxxxx | apt-key add -
This solves it for me every time so far. At some point I might run into a situation where MIT does not have the keys, but for now this works, and I trust them.
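On systems where it is available, apt-key can fetch and add the key in one step, which does the same thing as the two-command gpg pipeline:

```shell
# Fetch the missing key from the keyserver and add it to apt's trusted keys
apt-key adv --keyserver pgp.mit.edu --recv-keys xxxxxxxxxxxxxxxxxxxxxx
```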
Entire script below
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY xxxxxxxxxxxxxxxxxxxxxx
root@servername:~# gpg --keyserver pgp.mit.edu --recv-keys xxxxxxxxxxxxxxxxxxxxxx
root@servername:~# gpg --armor --export xxxxxxxxxxxxxxxxxxxxxx | apt-key add -
OK