<p>Resetting failed drive in linux mdadm raid array, by Mark Janssen, 2012-04-12 (<a class="reference external" href="https://sig-io.nl/posts/resetting-failed-drive-in-linux-mdadm-raid-array/">https://sig-io.nl/posts/resetting-failed-drive-in-linux-mdadm-raid-array/</a>)</p>
<p>Today I was greeted with a failed drive in an mdadm RAID array. The drive had some transient errors and was kicked out of the array, but testing showed that it still seemed to work fine.</p>
<figure>
<a class="reference external image-reference" href="https://sig-io.nl/images/harddisks.jpg"><img alt="Harddisks" src="https://sig-io.nl/images/harddisks.thumbnail.jpg"></a>
<figcaption>
<p>Image by Martin Abegglen (<a class="reference external" href="https://www.flickr.com/photos/twicepix/3333710952">https://www.flickr.com/photos/twicepix/3333710952</a>)</p>
</figcaption>
</figure>
<p>The following procedure removes the drive from the array, removes it from the system, re-probes for disks, and then adds the drive back into the array(s).</p>
<ul class="simple">
<li><p>Mark the failed drive's partition as faulty and remove it from the array. In this case the drive was /dev/sdb:</p>
<ul>
<li><p>mdadm --manage --set-faulty /dev/md0 /dev/sdb1</p></li>
<li><p>mdadm --manage /dev/md0 --remove /dev/sdb1</p></li>
</ul>
</li>
<li><p>Make sure nothing on this disk is still in use (mounts, other arrays, etc.)</p></li>
<li><p>Reseat the drive, either physically or by deleting the device and rescanning its SCSI host (host1 here; the host number depends on your system):</p>
<ul>
<li><p>echo 1 > /sys/block/sdb/device/delete</p></li>
<li><p>echo "- - -" > /sys/class/scsi_host/host1/scan</p></li>
</ul>
</li>
<li><p>Check if the drive is found again, and check if it works correctly</p>
<ul>
<li><p>check dmesg output, or look at /proc/partitions</p></li>
<li><p>try reading the whole disk: ‘pv /dev/sdb > /dev/null’</p></li>
</ul>
</li>
<li><p>Re-add the drive to the array(s)</p>
<ul>
<li><p>mdadm /dev/md0 -a /dev/sdb1</p></li>
<li><p>cat /proc/mdstat</p></li>
</ul>
</li>
</ul>
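<p>The steps above can be collected into a small script. This is only a sketch under assumptions: the names reset_drive, run, write_sysfs and the DRY_RUN switch are made up for illustration, the 5 second pause after the rescan is an arbitrary guess, and the explicit --remove step is an addition to the commands listed above. It skips the manual "does the drive work" check, so do that before trusting the re-add. By default it only prints what it would do.</p>

```shell
#!/bin/sh
# Sketch of the reset procedure. DRY_RUN defaults to 1, so commands are only
# printed; set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# write_sysfs VALUE PATH: write VALUE to a sysfs file (hypothetical helper).
write_sysfs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ echo \"$1\" > $2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

# reset_drive ARRAY DISK PARTITION HOST
# Example: reset_drive /dev/md0 sdb sdb1 host1
reset_drive() {
    array=$1 disk=$2 part=$3 host=$4

    # Mark the member as faulty and take it out of the array.
    run mdadm --manage "$array" --set-faulty "/dev/$part"
    run mdadm --manage "$array" --remove "/dev/$part"

    # Drop the disk from the kernel, then rescan the SCSI host to find it again.
    write_sysfs 1 "/sys/block/$disk/device/delete"
    write_sysfs "- - -" "/sys/class/scsi_host/$host/scan"

    # Assumption: a short pause is enough for the disk to reappear.
    run sleep 5

    # Add the partition back and show the rebuild status.
    run mdadm "$array" -a "/dev/$part"
    run cat /proc/mdstat
}

# Values from the post; adjust the array, disk, partition and host as needed.
reset_drive /dev/md0 sdb sdb1 host1
```

<p>Run it once as-is to review the printed commands, then run it again as root with DRY_RUN=0 once the device names and host number match your system.</p>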
<p>That should do the trick…</p>