Monday, 19 November 2012

Wham! Bam! Thank you, mdadm!

For those who don't know: mdadm is a software RAID tool for Linux. Now that we've got that out of the way...

Like any other Über-geek, I have a Linux server that runs some software and stores all of my precious data and, of course, my software repositories. Having had the pleasure of data loss a couple of years back, I decided to use RAID 5 for my server. I'm a student (at the time of writing), and I don't have a lot of money because I'm not a greedy bastard. :-) So naturally I went for software RAID, with mdadm as my weapon of choice.

My server is an old desktop computer (I won't name the brand, but I can assure you it is notorious for... well... there is no other way to put this: sucking). After infecting the PC with penguins, I found it made a pretty damn good server. Lucky me :-D
The *cough* server *cough* has an AMD Athlon 64 X2 3800+. Its 2GHz isn't too bad for downloading crap and serving Redmine plus some software repositories like SVN and Mercurial. (No Git, I should be ashamed.) It came with, IIRC, 1GB of RAM, which I naturally upgraded to 2GB as I had 1GB lying around. After stripping out some useless hardware like the graphics card, I popped in a cheap-ass soft-RAID card and 4x500GB. You can guess the rest.

After some time I wanted to upgrade the thing, since I had another system that I wasn't using. The upgrade would go from dual core to quad core, from 2GB to 5GB of RAM (2x2GB plus 2x512MB), and from 4x500GB to 4x1TB.
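For a rough idea of what the drive swap buys, here is a back-of-the-envelope sketch. RAID 5 spends one drive's worth of space on parity and treats every member as if it were the size of the smallest one. The function name `raid5_usable_gb` is made up for illustration, and the sizes are nominal gigabytes, not what the filesystem will actually report.

```python
def raid5_usable_gb(drive_sizes_gb):
    """Usable capacity of a RAID 5 array, in the same unit as the input.

    One drive's worth of space goes to parity, and every member is
    truncated to the size of the smallest drive.
    """
    # This sketch assumes an array of at least 2 members.
    if len(drive_sizes_gb) < 2:
        raise ValueError("need at least 2 drives")
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)


old = raid5_usable_gb([500] * 4)                       # the original 4x500GB
halfway = raid5_usable_gb([1000, 500, 500, 500])       # one drive swapped
new = raid5_usable_gb([1000] * 4)                      # all four swapped
print(old, halfway, new)  # 1500 1500 3000
```

The middle number is the annoying part: the extra space only shows up once every drive has been replaced.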

The memory upgrade was successful, but the CPU upgrade didn't work because the motherboard doesn't support the quad core.

The hard drives would be tricky, as it meant yanking out one drive at a time and replacing it with a bigger one. As mdadm has f*cked me royally in the past, I decided to back up my RAID first. I had a 1TB external drive lying around and used rsync to create a full backup (yeah, I have a sh*tload of hardware I don't actually use but that comes in handy sometimes).
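The one-drive-at-a-time swap can be sketched roughly as below. This is not the exact sequence I ran, just the general shape of it. The helper `drive_swap_plan` is hypothetical and only builds the commands as strings so they can be read before anything is executed. It assumes each new disk shows up under the same device name as the one it replaces, which is not guaranteed. The filesystem still has to be resized afterwards with whatever tool fits the filesystem in use.

```python
def drive_swap_plan(array, members):
    """Return the mdadm commands for swapping every member, one at a time.

    Nothing is executed here. Each --add has to finish resyncing before
    the next drive is failed, because RAID 5 only survives one missing
    member at a time.
    """
    plan = []
    for dev in members:
        plan.append(f"mdadm {array} --fail {dev} --remove {dev}")
        plan.append(f"# swap the physical disk behind {dev}")
        plan.append(f"mdadm {array} --add {dev}")
        plan.append("# wait until /proc/mdstat shows all members up again")
    # Only after the last resync can the array claim the extra space.
    plan.append(f"mdadm --grow {array} --size=max")
    return plan


members = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
for line in drive_swap_plan("/dev/md0", members):
    print(line)
```

Building the plan as text first is deliberate: with a tool that can eat an array, reading the commands before running them is cheap insurance.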

Today I spent a couple of hours upgrading, and soon it was time for the big drive swap. mdadm had reported earlier today that one of the drives had failed, so that one was first to be replaced. I used hdparm to read the serial numbers, matched the failed drive to the physical disk, and replaced it with a 1TB disk. After running "mdadm /dev/md0 --add /dev/new-drive" it started recovery. Of course this is where mdadm screwed me over again and showed me something like "[U_U_]" or "[_UU_]". (U means a member drive is up and _ means it is missing or failed. RAID 5 survives only one missing drive, so two underscores means the array is down.)
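To make the "[U_U_]" business concrete, here is a small sketch that reads the member map out of a /proc/mdstat style line and checks whether a RAID 5 array could still run. The function names are invented for this post, and the sample line (including its block count and chunk size) is made up to look like mdstat output rather than copied from my server.

```python
import re


def parse_mdstat_status(line):
    """Extract the [UU_U]-style member map from a /proc/mdstat line.

    Returns the raw flags plus the number of members up and down.
    """
    # Only matches brackets containing U and _ characters, so the
    # "[4/2]" counter earlier on the line is skipped.
    match = re.search(r"\[([U_]+)\]", line)
    if match is None:
        raise ValueError("no member status map found")
    flags = match.group(1)
    return flags, flags.count("U"), flags.count("_")


def raid5_can_run(flags):
    """RAID 5 tolerates at most one missing member."""
    return flags.count("_") <= 1


# Invented sample line, shaped like /proc/mdstat output.
sample = "      1464765696 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]"
flags, up, down = parse_mdstat_status(sample)
print(flags, up, down, raid5_can_run(flags))  # U_U_ 2 2 False
```

One underscore is a degraded but working array; two is where the swearing starts.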

As I have a ramdisk (initramfs, initrd, etc.) with mdadm built in and busybox as a rescue shell, I tried to force mdadm to assemble the array. Assembly didn't work, so I tried re-creating the array (keeping the data) by issuing "mdadm --create /dev/md0 --raid-devices=4 --level=5 --assume-clean /dev/sd[abc] missing". This worked... until I invited /dev/sdd to the party (what a party pooper).

I am now creating a fresh RAID 5 from the 4x1TB drives using mdadm (I guess I never learn). I'll probably be restoring backups sometime tomorrow.

Lesson learned: never trust mdadm! (or make backups before screwing around with mdadm)