Posts about techtip

drive linux mdadm raid reset sata

Resetting failed drive in linux mdadm raid array

Today I was greeted with a failed drive in a mdadm raid array. The drive had some transient errors and was kicked out of the array, but testing showed that the drive still seemed to work just fine.

The following procedure will remove the drive from the array, remove it from the system, re-probe for disks, and then re-add the drive back into the array(s).

  • Mark the failed drive's partition as faulty and remove it from the array; in this case the drive was /dev/sdb:
    • mdadm --manage --set-faulty /dev/md0 /dev/sdb1
    • mdadm --manage /dev/md0 --remove /dev/sdb1
  • Make sure nothing on this disk is being used (mounts, other arrays, etc)
  • Reseat the drive, either physically, or by deleting it from the kernel and rescanning its SCSI host (host1 in this case, the host number differs per controller):
    • echo 1 > /sys/block/sdb/device/delete
    • echo "- - -" > /sys/class/scsi_host/host1/scan
  • Check if the drive is found again, and check if it works correctly
    • check dmesg output, or look at /proc/partitions
    • try reading the whole disk: ‘pv < /dev/sdb > /dev/null’
  • Re-add the drive to the array(s)
    • mdadm /dev/md0 -a /dev/sdb1
    • cat /proc/mdstat
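
The steps above can be strung together in a small script. This is only a sketch under assumptions: the function names (run, reset_md_member) and the DRY_RUN variable are made up for illustration, and the array, disk, partition and SCSI host are passed in by hand. It uses --fail (the same operation as --set-faulty) and includes an explicit --remove, since mdadm normally wants a failed member out of the array before it can be added again. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the reset procedure. Nothing is executed unless DRY_RUN=0.

# Print a command line; run it through sh only when DRY_RUN=0.
run() {
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        sh -c "$*"
    fi
}

# Usage: reset_md_member ARRAY DISK PARTITION SCSI_HOST
# Example arguments (assumed): /dev/md0 sdb sdb1 host1
reset_md_member() {
    array="$1"; disk="$2"; part="$3"; host="$4"

    # Take the member out of the array.
    run "mdadm $array --fail /dev/$part"
    run "mdadm $array --remove /dev/$part"

    # Drop the disk from the kernel, then rescan the SCSI host to find it again.
    run "echo 1 > /sys/block/$disk/device/delete"
    run "echo '- - -' > /sys/class/scsi_host/$host/scan"

    # Put the member back and show the rebuild status.
    run "mdadm $array -a /dev/$part"
    run "cat /proc/mdstat"
}
```

Calling reset_md_member /dev/md0 sdb sdb1 host1 prints the six commands; running the same call as root with DRY_RUN=0 in front of it executes them. The read test with pv is left out on purpose, so check the disk by hand before switching off the dry run.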

That should do the trick…

kernel limit linux proc resource ulimit

Run-time editing of limits in Linux

On CentOS and RHEL (with kernels >= 2.6.32) you can modify the resource limits (ulimit) of a running process through /proc/<pid>/limits. On older kernels this file is read-only and can only be used to inspect the limits that are in effect for the process. On newer systems you can modify the limits with echo:

cat /proc/pid/limits
echo -n "Max open files=soft:hard" > /proc/pid/limits
cat /proc/pid/limits

On older systems you will have to modify limits before starting a process.
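
Setting a limit before starting a process can be done in a subshell, so the calling shell keeps its own limits. A minimal sketch; the helper name run_with_nofile is made up for illustration, and it only lowers the soft "open files" limit:

```shell
#!/bin/sh
# Usage: run_with_nofile LIMIT COMMAND [ARGS...]
# Sets the soft "open files" limit inside a subshell, then replaces the
# subshell with the command, which inherits the limit.
run_with_nofile() {
    limit="$1"
    shift
    ( ulimit -S -n "$limit" && exec "$@" )
}
```

For example, run_with_nofile 64 sh -c 'ulimit -S -n' starts a shell that reports a soft limit of 64, while the shell that called it is unaffected.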

(See also the post on serverfault)

If you are not running CentOS/RHEL, you can use the ‘prlimit’ command, which does the same thing, but doesn’t rely on a patch that’s no longer present in current kernels.
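
With prlimit (part of util-linux) the same change looks roughly like this. The two wrapper names are made up for illustration; the prlimit options themselves (--pid and --nofile) are the real ones:

```shell
#!/bin/sh
# Usage: set_nofile PID SOFT HARD
# Change the "open files" limit of a running process.
set_nofile() {
    prlimit --pid "$1" --nofile="$2:$3"
}

# Usage: show_nofile PID
# Show the current "open files" limit of a running process.
show_nofile() {
    prlimit --pid "$1" --nofile
}
```

Lowering a limit on your own processes works as a normal user; raising a hard limit needs root.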

dmsetup iscsi linux lvm multipath resize

Online resizing of multipath devices in Linux dm-multipath

Linux doesn’t automatically resize multipath devices, so use the procedure below to resize a multipath device online. (Offline resizing is automatic: just remove the mapping and reload.)

  • This example uses multipath device /dev/mpath/testdisk, with SCSI disks /dev/sdx and /dev/sdy
  • Resize the LUN on the underlying storage layer (iSCSI / SAN)
  • Check which sd? devices are relevant, and re-scan these:
    • multipath -ll testdisk
    • blockdev --rereadpt /dev/sdx
    • blockdev --rereadpt /dev/sdy
    • blockdev --getsz /dev/sdx
    • blockdev --getsz /dev/sdy
  • Take note of the new size returned by getsz.
  • Dump the dmsetup table to a file (and a backup)
    • dmsetup table testdisk | tee mapping.bak mapping.cur
  • Edit the table stored in ‘mapping.cur’
    • vi mapping.cur, replace field 2 (size) with the new size from getsz
  • Suspend I/O, reread the table, and resume I/O
    • dmsetup suspend testdisk; dmsetup reload testdisk mapping.cur; dmsetup resume testdisk
  • The multipath device should now be resized:
    • multipath -ll
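
The manual edit of the table can also be done with awk instead of vi. A minimal sketch, assuming the table is a single line (one target) and that the helper name set_table_size is made up for illustration. Both blockdev --getsz and the dm table count in 512-byte sectors, so the number can be used as-is:

```shell
#!/bin/sh
# Usage: set_table_size NEW_SIZE < old_table > new_table
# Replaces field 2 (the length in 512-byte sectors) of a one-line dm table.
set_table_size() {
    awk -v size="$1" '{ $2 = size; print }'
}

# Intended use (as root, not executed here):
#   newsize=$(blockdev --getsz /dev/sdx)
#   dmsetup table testdisk | tee mapping.bak | set_table_size "$newsize" > mapping.cur
```

Compare mapping.bak and mapping.cur before the suspend/reload/resume step; only the second field should differ.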

You can now resize the filesystem on the multipath device, or the LVM-PV if you use LVM on the LUN.