[nSLUG] Strange crash

George White aa056 at chebucto.ns.ca
Sat Jul 19 08:45:20 ADT 2003


On Wed, 16 Jul 2003, Dop Ganger wrote:

> On Wed, 16 Jul 2003 bdavidso at supercity.ns.ca wrote:
> 
> > The other day a linux-based server I babysit sort-of crashed.  I say
> > sort-of because the box became mostly unresponsive but didn't actually
> > generate any error messages.
> 
> That's usually (in my experience) a lockup on the IDE interface, quite
> often caused by overheating. The only way to grab this sort of thing is to
> have a serial console capturing output, usually the machine isn't (quite)
> hung enough that it can't do serial output.

A marginal memory chip is more likely to screw up at higher temps.  I
only use ECC systems, and there is a definite correlation between memory
errors and higher temps.
 
> > So...  Any ideas?  I would like to have a sense of what happened and how
> > to prevent it in the future.
> 
> If you use mrtg, I'd recommend getting hddtemp and monitoring the drive
> temperature. Hangs quite often correlate with excessive temperature; HDL
> turn off HVAC in our office at weekends, and I've got a desktop server
> with a few drives that has overheated once or twice as a result;
> monitoring with hddtemp showed the drive temp spiking up to around 42C
> before the machine hung. IDE drives seem to max out at around 38C before
> they start flaking out, in my experience.

This is my experience as well.

I put a cheap indoor/outdoor digital thermometer near the CPU with the
probe on the power supply exit.  The thermometer has a min/max feature, so
I can tell if the system got too hot overnight or on the weekend.  At
least this gives me a hint if I come in on Mon. and the system is dead but
the room is cool.  A simple thermometer is much easier to explain to
building maintenance staff than entries in a computer log. The worst
overheating we had was when they turned on the heat last fall before they
had the thermostats connected for the new heating plant, but the peak
temps were during daytime, so it was easy to convince the boss to buy some
drive-bay cooling fans.
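
A rough software equivalent of the min/max thermometer, as a minimal
sketch only.  It assumes hddtemp is installed and that its -n option
prints just the number.  The device name /dev/hda, the MinMax class and
the read_drive_temp helper are made up for illustration, and the 38C
limit is only the rule of thumb quoted above, not a vendor spec.

```python
import subprocess


def read_drive_temp(device="/dev/hda"):
    """Return the drive temperature in degrees C as reported by hddtemp."""
    # Assumption: `hddtemp -n DEVICE` prints only the numeric temperature.
    result = subprocess.run(
        ["hddtemp", "-n", device],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


class MinMax:
    """Remember the lowest and highest readings seen so far."""

    def __init__(self):
        self.low = None
        self.high = None

    def update(self, temp):
        # Widen the recorded range if this reading falls outside it.
        if self.low is None or temp < self.low:
            self.low = temp
        if self.high is None or temp > self.high:
            self.high = temp

    def too_hot(self, limit=38):
        # True once any reading has reached the (assumed) limit.
        return self.high is not None and self.high >= limit
```

Feeding read_drive_temp() into update() every few minutes from a
long-running loop gives the same "did it get too hot over the weekend"
answer as the thermometer, provided the machine stays up long enough to
record the reading.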

--
George White <aa056 at chebucto.ns.ca> 
Head of St. Margarets Bay, Nova Scotia



