[nSLUG] Multiply-claimed blocks?

George N. White III gnwiii at gmail.com
Thu Sep 1 13:25:34 ADT 2011


On Tue, Aug 30, 2011 at 2:36 AM, Mike Spencer <mspencer at tallships.ca> wrote:
>
> My first-ever apparently unprovoked Linux crash in 12 years.
> Everything stopped working.
>
> Cold reboot produced stuff [much snipped to avoid boredom] like:
>
>    /sbin/agetty: error while loading shared libraries lib/libc.so,6:
>    invalid ELF header
>
>    INIT: Id "c3" respawning too fast: disabled for 5 minutes
>    INIT: Id "c4" respawning too fast: disabled for 5 minutes
>    INIT: Id "c5" respawning too fast: disabled for 5 minutes
>    INIT: Id "c6" respawning too fast: disabled for 5 minutes
>
>    INIT: no more processes left in this run level
>
> From an install DVD boot, e2fsck -p reported:
>
>    /dev/hda: File /lib/libc-2.3.6.so (inode #720234, mod time Thu Sep 14
>    07:54:43 2006)
>
>       has 1 multiply-claimed block(s) shared with 1 file(s):
>
>    /dev/hda: File /lib/libank-2.3.6.so (inode #720235, mod time Thu Sep
>    14 07:54:43 2006)
>
>    /dev/hda3: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY
>         (i.e. without -a or -p options)
>
>
> Did that. e2fsck reported fixing a bunch of stuff (because I answered
> 'y' to all the questions) but normal HD reboot still failed on errors
> indicating that libc was bad.
>
> [Much tedious messing about omitted...]
>
> Finally fixed it by booting from the Slackware install DVD and
> reinstalling the glibc-solibs package. Everything seems to be fine now.
>
> Are multiply-claimed blocks a common occurrence?  A reliable sign of
> impending disk failure?  A black swan?

Not common, and they may or may not result from impending disk failure,
but why take chances?  If the contents of the disk are important, I would
certainly consider replacing it, especially if it is out of warranty.
The rotating mass storage version of a black swan is a UDE:

"Though remarkably reliable, disk drives do fail occasionally. Most
failures can be detected immediately; moreover, such failures can be
modeled and addressed using technologies such as RAID (Redundant
Arrays of Independent Disks). Unfortunately, disk drives can
experience errors that are undetected by the drive-- which we refer to
as undetected disk errors (UDEs). These errors can cause silent data
corruption that may go completely undetected (until a system or
application malfunction) or may be detected by software in the storage
I/O stack. Continual increases in disk densities or in storage array
sizes and more significantly the introduction of desktop-class drives
in enterprise storage systems are increasing the likelihood of UDEs in
a given system. Therefore, the incorporation of UDE detection (and
correction) into storage systems is necessary to prevent increasing
numbers of data corruption and data loss events. In this paper, we
discuss the causes of UDEs and their effects on data integrity. We
describe some of the basic techniques that have been applied to
address this problem at various software layers in the I/O stack and
describe a family of solutions that can be integrated into the RAID
subsystem." -- IBM Journal of Research and Development Volume 52 Issue
4, July 2008
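
As a rough illustration of the "detected by software" case in that
abstract: one simple application-level check is to record a checksum
for every file while the data is known to be good, then compare later.
The sketch below is only that, a sketch.  The function names
(sha256_of, build_manifest, changed_files) are made up for this
example, and it does not reflect how any particular RAID or filesystem
implements UDE detection.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """Return relative paths that are missing or whose digest has changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(rel)
    return bad
```

A manifest built right after a package install and kept on other media
would flag a file such as a damaged libc on the next comparison, but it
cannot tell corruption apart from a legitimate update; it only reports
that the bytes changed.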


>
>
> - Mike
>
> PS: I have no idea how anybody can recover from a crash like that or
>    even from many simpler things if they have only one computer. How
>    do you look up stuff if you can't grovel through the docs?

You can do a lot with a live Linux distro, but I would not spend a lot of
time trying to recover the system, because a disk you don't trust has a
negative value.  If a quick fix doesn't get things going, I would first
focus on deciding whether this disk can be trusted.

I would first run the vendor's diagnostics, then exercise the disk by
running a "wipe disk" tool.  If it survives all that without more errors,
then I would install the OS and restore the system from backups.  I would
not trust the disk without this sort of testing, and I would not invest
this much time in an older, out-of-warranty disk.  A relatively new disk
I have tested is actually more valuable (i.e., more trusted) than a
brand-new, straight-out-of-the-antistatic-bag but untested drive.
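
To make the "exercise the disk" idea concrete, here is a minimal
write-then-verify sketch.  It is an assumption-laden toy, not a
replacement for vendor diagnostics or a real wipe tool: it writes to an
ordinary scratch file rather than the raw device, the helper names
(pattern_block, write_pattern, verify_pattern) are invented for the
example, and a read-back shortly after writing may be served from the
OS cache rather than the platters.

```python
import hashlib
import os


def pattern_block(index, block_size):
    """Build a deterministic pseudo-random block derived from its index."""
    seed = hashlib.sha256(index.to_bytes(8, "big")).digest()
    reps = -(-block_size // len(seed))  # ceiling division
    return (seed * reps)[:block_size]


def write_pattern(path, blocks, block_size=4096):
    """Write `blocks` pattern blocks to path and ask the OS to flush them."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(i, block_size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path, blocks, block_size=4096):
    """Return the indices of blocks that differ from what was written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(block_size) != pattern_block(i, block_size):
                bad.append(i)
    return bad
```

Because each block's contents depend on its index, a block that lands
in the wrong place shows up as a mismatch as well as one that is
damaged in place.  An empty list from verify_pattern only means that
nothing was caught on this pass.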


-- 
George N. White III <aa056 at chebucto.ns.ca>
Head of St. Margarets Bay, Nova Scotia


