[nSLUG] Multiply-claimed blocks?
nslug at fop.ns.ca
Wed Aug 31 15:19:10 ADT 2011
On Wed, 31 Aug 2011, Daniel Morrison wrote:
> The fact that the prof's computer was a shared resource doesn't seem relevant.
The shared resource is the whole reason he did this: everyone else said
there were no problems with storage because no errors were reported. He
did it to demonstrate that errors in the storage subsystem can easily be
missed or not reported at all.
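
As a rough illustration of how silent errors can be caught when the
storage layer itself reports nothing, here is a minimal sketch. It is not
necessarily what the professor did; the function names and the choice of
SHA-256 are assumptions made for the example. It records a checksum for
every file under a directory, then later recomputes them and lists any
file whose contents no longer match.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each regular file under root (hypothetical path) to its digest."""
    return {
        str(p): sha256_of(p)
        for p in sorted(Path(root).rglob("*"))
        if p.is_file()
    }


def changed_files(old, root):
    """List files from the old snapshot that are now different or missing."""
    new = snapshot(root)
    return [name for name, digest in old.items() if new.get(name) != digest]
```

A mismatch only means the bytes changed; on files that are supposed to be
static, with no legitimate writes in between, that points at corruption
the storage subsystem never reported.
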
> It's my OPINION that there must have been something wrong with that
> computer (not just cosmic rays).
Statistics, I think. The storage subsystem was used heavily enough that
transient errors were more common than in a standalone system.
> I think that if a bit flip on disk once a month was "normal", our
> computers would be crashing a whole lot more often.
Not as often as you'd think; I have a firewall machine (AMD K6-350, which
should say everything right there) with a stick of bad RAM in it and a
hard drive that's reporting 0 hours lifetime remaining, chundering along
to see how long before it will finally die. It never crashes, just causes
stuff to segfault, and running monit allows any dead or hung daemons to be
restarted. Current uptime is 28 days, since the last power outage that
outlasted the UPS. Checking the logs, it's had 31 errors related to RAM, or
just over one a day. Since it started reporting errors about 4 years ago,
it has crashed and locked up only once that I recall.