Linux Leap-Second problem

So, this weekend was quite an interesting one: on July 1st at 02:00 local time (00:00 UTC) a leap second was inserted, as announced via NTP. This caused serious problems for all my Java Virtual Machines and MySQL databases.

If your system has printed the following line (in dmesg), a leap-second has been added recently:

Clock: inserting leap second 23:59:60 UTC

On most of my systems the JVMs spiked to 100% CPU load across all cores, and MySQL seems to do the same.

The work-around at this time is to set the system clock to its current value (as root):

date `date +"%m%d%H%M%C%y.%S"`
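As a rough sketch of how the check and the work-around could be combined, the following only resets the clock when the kernel log contains the leap-second message. The function names are hypothetical, and the clock reset has to run as root:

```shell
#!/bin/sh
# Hypothetical helper names; a sketch, not a tested production script.

# Succeeds if the log text on stdin contains the leap-second message.
leap_second_inserted() {
    grep -q 'Clock: inserting leap second'
}

# Re-set the clock to its current value, but only if dmesg shows
# that a leap second was inserted. Must be run as root.
reset_clock_if_leap_second() {
    if dmesg | leap_second_inserted; then
        date "$(date +'%m%d%H%M%C%y.%S')"
    fi
}
```

Nothing runs until reset_clock_if_leap_second is called. The message stays in the dmesg buffer, so calling it again simply sets the clock again.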

Hopefully this will be fixed in the kernel before the next leap second is added, which could be as soon as 2013/01/01, though it will probably be later.

Two new ActiveMQ monitoring scripts for Nagios (and compatible monitoring systems)

I’ve recently written two new check scripts for ActiveMQ. These Nagios scripts keep an eye on ActiveMQ’s internal storage usage and on the status of its queues.

Both scripts make use of ActiveMQ’s administrative web interface (located at hostname:port/admin/ by default).

Check_ActiveMQ_Mem
Usage:
check_activemq_mem url warn crit

The URL to use is the ‘printable’ version of the admin page, as that’s a lot simpler to parse. It can be found at http://hostname:port/admin/index.jsp?printable=true

Example output:

check_activemq_mem http://hostname:port/admin/index.jsp?printable=true 5 10
WARNING: ActiveMQ memory usage: 7/0/0 (store/memory/temp)
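To illustrate how the warn and crit arguments could map onto a Nagios state, here is a minimal sketch. It assumes that the highest of the three percentages is compared against the thresholds and that a value equal to a threshold already triggers it. Neither detail is confirmed by the script itself, and mem_state is a hypothetical name. The exit codes 0, 1 and 2 are the usual Nagios plugin convention for OK, WARNING and CRITICAL.

```shell
#!/bin/sh
# mem_state WARN CRIT STORE MEMORY TEMP
# Prints a status line and returns the matching Nagios exit code.
# Assumption: the highest of the three percentages decides the state.
mem_state() {
    warn=$1 crit=$2 store=$3 memory=$4 temp=$5
    worst=$store
    if [ "$memory" -gt "$worst" ]; then worst=$memory; fi
    if [ "$temp" -gt "$worst" ]; then worst=$temp; fi

    if [ "$worst" -ge "$crit" ]; then
        state=CRITICAL code=2
    elif [ "$worst" -ge "$warn" ]; then
        state=WARNING code=1
    else
        state=OK code=0
    fi
    echo "$state: ActiveMQ memory usage: $store/$memory/$temp (store/memory/temp)"
    return "$code"
}
```

Called as mem_state 5 10 7 0 0, this reproduces the WARNING line from the example output above.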

Check_ActiveMQ_queues
Usage:
check_activemq_queues url

The URL to use is http://hostname:port/admin/xml/queues.jsp
The configuration is done in the check script itself. At the top there is a configuration block (a Perl hash) that looks like this:

my %config = (
        "ActiveMQ.DLQ"  => [
                        "* * * * *", "0" ],
        "sig-io.startJob" => [
                        "40-59 20 * * *", 5000,
                        "0-15 21 * * *", 5000,
                        "* * * * *", 100 ],
        "sig-io.finishJob" => [
                        "0-45 21 * * *", 5000,
                        "* * * * *", 100 ],
        "default"       => [ "* * * * *", "10" ],
);

In this config block you can specify a number of crontab(5)-style rules, each followed by a limit. The limit is checked against the number of items in the named queue during the specified time period. If no rule matches the current time slot, or the queue name is not listed, the ‘default’ values are used.

You can specify a limit value of 0 or a positive integer as a maximum amount of queue-items in the named queue. When this limit is passed, the script will go into warning or error state.

Specifying a limit of -1 signifies that this queue is to be completely ignored.
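The rule lookup can be sketched roughly as follows. This is not the code from the check script: the function names are hypothetical, only the minute and hour fields are evaluated, only ‘*’, single numbers and ranges are supported, and the first matching rule wins. All of these are assumptions made to keep the example short.

```shell
#!/bin/sh
# field_matches FIELD VALUE
# FIELD is "*", a single number, or a range such as "40-59".
field_matches() {
    case $1 in
        '*') return 0 ;;
        *-*) [ "$2" -ge "${1%-*}" ] && [ "$2" -le "${1#*-}" ] ;;
        *)   [ "$2" -eq "$1" ] ;;
    esac
}

# limit_for MINUTE HOUR RULE LIMIT [RULE LIMIT ...]
# Prints the limit of the first rule whose minute and hour fields match.
limit_for() {
    minute=$1 hour=$2
    shift 2
    while [ $# -ge 2 ]; do
        rule=$1 limit=$2
        shift 2
        r_min=${rule%% *}      # first field of the rule
        rest=${rule#* }
        r_hour=${rest%% *}     # second field of the rule
        if field_matches "$r_min" "$minute" && field_matches "$r_hour" "$hour"; then
            echo "$limit"
            return 0
        fi
    done
    return 1
}
```

With the rules for sig-io.startJob from the configuration block, limit_for 45 20 "40-59 20 * * *" 5000 "0-15 21 * * *" 5000 "* * * * *" 100 selects the evening limit of 5000, while the same call for 12:30 falls through to the catch-all rule and its limit of 100.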

Example output:

check_activemq http://hostname:port/admin/xml/queues.jsp
OK: ActiveMQ: Everything reported OK (url)

check_activemq_mem.pl


check_activemq.pl

Resetting failed drive in linux mdadm raid array

Today I was greeted with a failed drive in an mdadm RAID array. The drive had some transient errors and was kicked out of the array, but testing showed that the drive still seemed to work just fine.


The following procedure will remove the drive from the array, remove it from the system, re-probe for disks, and then re-add the drive back into the array(s).

  • Mark the failed drive as faulty (if md has not already done so) and remove it from the array; in this case it was /dev/sdb:
    • mdadm --manage /dev/md0 --set-faulty /dev/sdb1
    • mdadm --manage /dev/md0 --remove /dev/sdb1
  • Make sure nothing on this disk is being used (mounts, other arrays, etc)
  • Reseat the drive, either physically or by deleting it from the kernel and rescanning the SCSI host (adjust host1 to the host the drive is attached to):
    • echo 1 > /sys/block/sdb/device/delete
    • echo "- - -" > /sys/class/scsi_host/host1/scan
  • Check if the drive is found again, and check if it works correctly
    • check dmesg output, or look at /proc/partitions
    • try reading the whole disk, for example: pv < /dev/sdb > /dev/null
  • Re-add the drive to the array(s)
    • mdadm /dev/md0 -a /dev/sdb1
    • cat /proc/mdstat
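To see which members md currently flags as failed, before and after the procedure above, the /proc/mdstat output can be filtered for the (F) marker. This is a small sketch; failed_members is a hypothetical name, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Print the member devices marked as failed, "(F)", in mdstat-style
# text read from stdin, one device name per line.
failed_members() {
    grep -o '[a-z0-9]*\[[0-9]*](F)' | sed 's/\[.*//'
}

# Typical use on a live system:
#   failed_members < /proc/mdstat
```

No output means that no member is currently marked as failed.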

That should do the trick…

ImageMagick 6.7.6 and libwebp 0.1.3 packages for CentOS/RHEL 6.2

In my CentOS 6 repository you can now find up-to-date packages for ImageMagick, compiled with support for Google’s new WebP image format. These packages were built on a CentOS 6 build host, because the packages from the ImageMagick project are built on CentOS 5.7 and are not really compatible with CentOS 6.x.

Designing a low-cost, high-capacity storage server

My customers have a rather large hunger for storage capacity, but not the budgets to buy expensive SANs or NAS heads. This post describes a low-cost, high-capacity storage platform that should be able to provide decent performance at a cost of about 110 euro per TB of usable capacity (given RAID-6, with 6+2 disks per disk group).

The base system consists of:

  1. Norco RPC4224 4U 24-Bay hot-swap server cabinet (est price 440,-)
  2. 6x SFF Cables (96,-)
  3. 850Watt Seasonic PSU (128,-)
  4. Rail kit (32,-)
  5. Supermicro X8SIA-F Mainboard (200,-)
  6. Supermicro AOC-SASLP-MV8 Sata controller (3x, total price 300,-)
  7. 32 GB SSD, 2x 50,- (just to boot from) (100,-)
  8. Xeon X3440 Boxed CPU (200,-)
  9. 16GB ECC/Registered DDR3 Ram KVR1333D3D4R9SK2/16GB (130,-)

This brings the base-system price to about 1600,-

To this base system we can then add disks as needed, in sets of 8. Currently I would recommend the following disks:

  • Hitachi Deskstar 7k3000 (3tb, ~200 euro)
  • Hitachi Deskstar 7k4000 (4tb, ~270 euro)
  • Seagate Constellation ES2 (3tb, ~270 euro)

When adding 24 4TB disks, the total system price would end up at ~8100 euro, and have a total capacity of about 72 TB in 3 diskgroups of 24TB each (with 8 disks, of which any 2 may fail). To spread out the costs over a longer period, and to be able to grow, you could consider buying the disks in sets of 8. I would however recommend getting at least 1 or 2 disks as cold-spare to be able to quickly swap out broken disks.
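The figures above can be checked with a little shell arithmetic. This is only a sketch: storage_cost is a hypothetical helper, the base price of 1626 is the sum of the nine items listed above, and the integer division rounds the per-TB price down.

```shell
#!/bin/sh
# storage_cost DISKS PRICE_PER_DISK TB_PER_DISK
# Prints: total price (euro), usable capacity (TB), price per usable TB.
storage_cost() {
    disks=$1 price=$2 tb=$3
    base=1626          # sum of the base-system items listed above
    group=8 parity=2   # RAID-6 disk groups of 6+2 disks
    total=$(( base + disks * price ))
    usable=$(( disks / group * (group - parity) * tb ))
    echo "$total $usable $(( total / usable ))"
}

# Full chassis with 24 Deskstar 7k4000 disks (4 TB, ~270 euro each):
#   storage_cost 24 270 4
```

The same function shows what happens when you start with a single group of 8 disks: the base system is spread over only 24 TB, so the price per usable TB is noticeably higher until more groups are added.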

On the software side, install your Linux distribution of choice, use md-raid (mdadm) and LVM, and put XFS or ext4 filesystems on top. In the future Btrfs might be a better solution, but for now it’s still a bit too immature. A Solaris or BSD install with ZFS would also be possible, but is not something I’ve tested on this setup yet.

Run-time editing of limits in Linux

On CentOS and RHEL Linux (with kernels >= 2.6.32) you can modify resource limits (ulimit) at run time through /proc/<pid>/limits. On older kernels this file is read-only and can only be used to inspect the limits that are in effect for the process. On newer systems you can modify the limits with echo:

cat /proc/pid/limits
echo -n "Max open files=soft:hard" > /proc/pid/limits
cat /proc/pid/limits

On older systems you will have to modify limits before starting a process.

(See also the post on serverfault)

If you are not running CentOS/RHEL, you can use the ‘prlimit’ command, which does the same thing, but doesn’t rely on a patch that’s no longer present in current kernels.
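As a small sketch of the prlimit route: the helper below extracts the current soft and hard open-files limits from limits-style text, and the wrapper calls prlimit from util-linux to change them. Both function names are hypothetical. Raising the hard limit of another process normally requires root.

```shell
#!/bin/sh
# Print "soft hard" for the open-files limit, reading the contents
# of a /proc/<pid>/limits file from stdin.
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }'
}

# set_nofile PID SOFT HARD
# Change the open-files limit of a running process with prlimit.
set_nofile() {
    prlimit --pid "$1" --nofile="$2:$3"
}

# Typical use:
#   nofile_limits < /proc/1234/limits
#   set_nofile 1234 4096 8192
```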

Offsite backups

Sig-I/O is currently working on an off-site backup service for its customers. We will offer secure off-site backup capacity in a different datacenter from our main operations. Customers will be able to rent backup space on a quarterly basis, from 1 gigabyte up to several terabytes.

More information will be released soon. Please contact us if you are interested or looking for backup capacity.

Autorai 2011

Some images from the visit to the 2011 Autorai.


Online resizing of multipath devices in Linux dm-multipath

Linux doesn’t automatically resize multipath devices, so the following procedure is needed to resize one online.
(An offline resize is automatic: just remove the mapping and reload it.)

  • This example uses the multipath device /dev/mpath/testdisk, backed by the SCSI disks /dev/sdx and /dev/sdy
  • Resize the lun on the underlying storage layer (iscsi / san)
  • Check which sd? devices are relevant, and re-scan these:
    • multipath -ll testdisk
    • blockdev --rereadpt /dev/sdx
    • blockdev --rereadpt /dev/sdy
    • blockdev --getsz /dev/sdx
    • blockdev --getsz /dev/sdy
  • Take note of the new size returned by getsz.
  • Dump the dmsetup table to a file (and a backup)
    • dmsetup table testdisk | tee mapping.bak mapping.cur
  • Edit the table stored in ‘mapping.cur’
    • vi mapping.cur, replace field 2 (size) with the new size from getsz
  • Suspend I/O, reread the table, and resume I/O
    • dmsetup suspend testdisk; dmsetup reload testdisk mapping.cur; dmsetup resume testdisk
  • The multipath device should now be resized:
    • multipath -ll

You can now resize the filesystem on the multipath device, or the LVM-PV if you use LVM on the LUN.
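The manual edit of mapping.cur in the procedure above can also be scripted. The sketch below replaces field 2 of a device-mapper table line with a new size. set_table_size is a hypothetical name, and it assumes the table consists of lines whose second field is the size in 512-byte sectors, as reported by blockdev --getsz.

```shell
#!/bin/sh
# set_table_size NEW_SIZE
# Reads a dmsetup table from stdin and prints it with field 2
# (the size in sectors) replaced by NEW_SIZE.
set_table_size() {
    awk -v size="$1" '{ $2 = size; print }'
}

# Possible use in place of the tee and vi steps:
#   dmsetup table testdisk > mapping.bak
#   set_table_size "$(blockdev --getsz /dev/sdx)" < mapping.bak > mapping.cur
```

Compare mapping.bak and mapping.cur before running the suspend, reload and resume step, so that a bad table is never loaded.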

IBM has a slightly more complex, but possibly ‘better’ procedure at http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105262

Service overview online

Under the heading ‘Diensten’ (Services) you will find an overview of some of the standard services that SIG-I/O automatisering offers.