Friday, October 23, 2009

Replacing multiple hard drives, including the one containing /boot, under Linux with LVM

I had a server with three old hard drives: two IDE and one SCSI.

One of the IDE discs held the /boot partition (~200MB). The rest of that disc was an LVM partition, which belonged to a single volume group together with the other drives. The group had two logical volumes: one for the root filesystem and a 2GB one for swap.

I decided to move everything to one SATA drive. It turned out to be a relatively simple and safe process.

The first step is to create the boot partition on the new drive, with the same size (~200MB). I used fdisk for that. Then I created a second partition spanning the rest of the disk and changed its type (the t command in fdisk) to 8e, which is the Linux LVM type.

Next I mounted the new boot partition (it needs a filesystem on it first): mount /dev/sdc1 /mnt/newboot
Then I copied the old boot to the new one: cp -afv /boot/* /mnt/newboot
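
One detail worth knowing about the copy: the /boot/* glob skips hidden files. The following is a small sketch, not what I ran at the time, of a copy that includes dotfiles and then checks the result. The function name copy_tree is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory tree (dotfiles included) and verify it.
copy_tree() {
    src="$1"
    dst="$2"
    # "src/." copies the directory's contents including hidden files,
    # which the "src/*" glob would skip.
    cp -a "$src/." "$dst/" || return 1
    # diff -r exits non-zero if any file differs or exists on one side only.
    diff -r "$src" "$dst"
}
```

For the move described here it would be called as copy_tree /boot /mnt/newboot, as root and with the new partition already mounted.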

Add the new disc's LVM partition to the volume group:

pvcreate /dev/sda2
vgextend VolGroup00 /dev/sda2

Time to do the LVM transfer now: simply pvmove each of the old discs' partitions to the new one and then remove it from the group:

pvmove /dev/hda2 /dev/sda2
vgreduce VolGroup00 /dev/hda2

pvmove /dev/hdb1 /dev/sda2
vgreduce VolGroup00 /dev/hdb1

pvmove /dev/hdd1 /dev/sda2
vgreduce VolGroup00 /dev/hdd1
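
The same move/reduce pair repeats for every old partition, so it can be written as a loop. This is only a sketch: the function name migrate_pvs and the DRY_RUN variable are made up for this example. By default it just prints the pvmove and vgreduce commands so they can be reviewed before anything touches the discs.

```shell
#!/bin/sh
# Hypothetical helper: print (or run) pvmove + vgreduce for each old PV.
# DRY_RUN defaults to 1, so nothing is executed unless DRY_RUN=0 is set.
migrate_pvs() {
    vg="$1"
    new_pv="$2"
    shift 2
    for old_pv in "$@"; do
        for cmd in "pvmove $old_pv $new_pv" "vgreduce $vg $old_pv"; do
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "$cmd"
            else
                # Unquoted on purpose: split the string into command + args.
                $cmd || return 1
            fi
        done
    done
}

migrate_pvs VolGroup00 /dev/sda2 /dev/hda2 /dev/hdb1 /dev/hdd1
```

Running it as shown prints the six commands listed above. Setting DRY_RUN=0 (as root) would execute them and stop at the first failure.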

All done. Now power off and remove the old discs.

Boot from a live CD, a USB stick or Super Grub Disk, install GRUB on the new drive, and voilà!

Albatron PX865PE Lite (V 2.0) booting from USB

Just a small note for this motherboard, which has a tricky boot sequence selection.

In general, if you have multiple boot devices connected in the same option group (e.g. Hard Disk, which includes USB sticks), you must disable all but the one you wish to boot from, boot once, then reboot and re-enable the rest.

That is, the motherboard will choose the last known bootable device.

For instance, you may have a bootable SATA drive and a bootable IDE drive. You usually boot from the IDE one, but now you want to change to the SATA one. Simply disable the IDE drive (set it to None in the BIOS), boot, reboot, change the option back (to Auto, for instance), and it will now always boot from the SATA drive.

Weird.

Sunday, October 18, 2009

ddclient hangs while getting IP address under certain circumstances

DDClient is a great Perl script for dynamic IP updating which supports many providers and has cool features like grabbing the IP from your modem instead of from some server on the Internet.

However, with the latest version (3.8.0) I had an issue where ddclient blocked while trying to read the response from the modem.

I am not an expert in Perl, but I can code, so I had a look at the source. I managed to find the problematic calls around line 1800:
my $timeout = 0;
local $SIG{'ALRM'} = sub { $timeout = 1; msg(qq{Alarm!\n}); };

$0 = sprintf("%s - reading from %s port %s", $program, $peer, $port);

alarm(opt('timeout')) if opt('timeout') > 0;

while (!$timeout && ($_ = <$sd>)) {
    $0 = sprintf("%s - read from %s port %s", $program, $peer, $port);
    verbose("RECEIVE:", "%s", define($_, ""));
    $reply .= $_ if defined $_;
}
if (opt('timeout') > 0) {
    alarm(0);
}

I remembered from my early days of learning Perl that the perldoc was clear: signal handling in Perl is different from other languages.

So although in C/C++ you would expect the read to fail with EINTR when the process is interrupted by SIGALRM, that does not happen in Perl (not on all platforms anyway). Yes, the signal is caught and the handler is called, but the read is not interrupted, and therefore the $timeout sentinel is never checked.

To make things worse, the timeout option in the socket constructor does nothing for a blocked read.

The solution is to force the read to end using an eval/die pair:

$0 = sprintf("%s - reading from %s port %s", $program, $peer, $port);
eval {
    local $SIG{'ALRM'} = sub { die "timeout"; };
    alarm(opt('timeout')) if opt('timeout') > 0;
    while (($_ = <$sd>)) {
        $0 = sprintf("%s - read from %s port %s", $program, $peer, $port);
        verbose("RECEIVE:", "%s", define($_, ""));
        $reply .= $_ if defined $_;
    }
    if (opt('timeout') > 0) {
        alarm(0);
    }
};

close($sd);

# if ($timeout) {
if ($@ and $@ =~ /timeout/) {
    warning("TIMEOUT: %s after %s seconds", $to, opt('timeout'));
    $reply = '';
}

The commented-out line is the original code.
The code in the eval block runs as if it were a small Perl program within our program. When the signal is caught, the handler kills that little program with die, and execution resumes in the main program right after the eval block.

The variable $@ holds the error message from the last eval (here, the string passed to die), so that is how we test whether a timeout has occurred.

And now this works! I can finally see the timeout messages in the logs. I suspect that some people may never hit this limitation of ddclient: as long as ddclient gets some response from the HTTP server, even a trashy one, it will not hang. I guess that my modem is buggy and simply does not return a response but holds the connection open, which makes ddclient block.

I have filed a patch with the code changes for ddclient.