[nSLUG] Tip: check your BBU on raid controllers

Rory rory at unixism.org
Fri Jan 25 20:56:28 AST 2013


This may well explain something I've been experiencing on our 2950's.

Thanks muchly!

On 13-01-25 02:19 PM, D G Teed wrote:
>
> Thought I'd share a server tip.  This is specific to PERC controllers
> and the use of OpenManage, but there could be other brands which
> also disable write-back (for the controller cache) in a similar way
> as the BBU degrades.
>
> We have a few Dell 2950s that are about 4 years old.  They have PERC 5/i
> controllers.  It turns out the battery status reported by Dell's
> OpenManage can say "Ok" while the battery has actually dropped below the
> charge needed to support a Write Policy of "Write Back".  If your server
> has a significant IO load, losing write-back will kill IO performance.
> I learned of one system where the battery charge had dropped below 30%,
> which triggered the controller to go into "Write Through" mode.
>
> Useless battery check:
>
> # omreport storage battery controller=0
> Battery 0 on Controller PERC 5/i Integrated (Embedded)
>
> Controller PERC 5/i Integrated (Slot Embedded)
> ID                  : 0
> Status              : Ok
> Name                : Battery 0
> State               : Ready
> Recharge Count      : Not Applicable
> Max Recharge Count  : Not Applicable
> Learn State         : Idle
> Next Learn Time     : 59 days 21 hours
> Maximum Learn Delay : 7 days 0 hours
>
> State of write policy:
>
> # omreport storage vdisk
> List of Virtual Disks in the System
>
> Controller PERC 5/i Integrated (Embedded)
> ID                        : 0
> Status                    : Ok
> Name                      : Virtual Disk 0
> State                     : Ready
> Hot Spare Policy violated : Not Assigned
> Encrypted                 : Not Applicable
> Layout                    : RAID-5
> Size                      : 836.63 GB (898319253504 bytes)
> Device Name               : /dev/sda
> Bus Protocol              : SAS
> Media                     : HDD
> Read Policy               : No Read Ahead
> Write Policy              : Write Through
> Cache Policy              : Not Applicable
> Stripe Element Size       : 64 KB
> Disk Cache Policy         : Disabled
>
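
Since the battery status alone can be misleading, the write policy is the thing to watch. A minimal sketch of automating that check, assuming the 'omreport storage vdisk' output keeps the "Field : Value" layout shown above. The function name list_write_through is made up for illustration; it only parses text on stdin.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenManage): reads 'omreport storage vdisk'
# output on stdin and prints the name of every virtual disk whose
# Write Policy is "Write Through".
list_write_through() {
    awk -F' *: *' '
        { sub(/[ \t\r]+$/, "") }                  # drop trailing whitespace
        $1 == "Name" { name = $2 }                # remember the current vdisk
        $1 == "Write Policy" && $2 == "Write Through" { print name }
    '
}
```

On a host with OpenManage installed this could be run as 'omreport storage vdisk | list_write_through', for example from cron, with any output treated as a warning.
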
> Hints in the messages log:
>
> # grep 'Server Administrator' /var/log/messages
> { ... }
> Jan 18 07:25:35 myserv Server Administrator: Storage Service EventID: 2335  Controller event log: BBU disabled; changing WB virtual disks to WT:  Controller 0 (PERC 5/i Integrated)
>
> BBU is the battery backup unit for the controller.  WB means write back.
> WT means write through.
>
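
A small sketch for pulling just these events out of a log, assuming the message text matches the sample line above (other OpenManage versions may word it differently). The function name wb_to_wt_events is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog lines on stdin and prints only the
# Server Administrator events that report a write-back to write-through
# change. The matched text is copied from the sample log line above.
wb_to_wt_events() {
    grep 'Server Administrator' | grep 'changing WB virtual disks to WT'
}
```

Example use: 'wb_to_wt_events < /var/log/messages'. The timestamps of the matches show when the controller dropped out of write-back.
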
> These status messages can also appear at boot up as the write policy
> flips back and forth, or during "auto-learn" of the BBU at 90 day
> cycles.  You can confirm the battery level by exporting the controller
> log:
>
> # omconfig storage controller controller=0 action=exportlog
>
> # grep Absolute /var/log/lsi_0125.log
> T42:     Absolute State of Charge  : 25 %
>
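
The charge check can be scripted along these lines. This is only a sketch: it assumes the exported log uses the "Absolute State of Charge : NN %" format shown above, and the 30% default is the figure from the anecdote earlier in this post, not a documented controller limit. The function names last_charge and charge_is_low are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints the last "Absolute State of Charge" percentage
# found on stdin (lines in the format shown in the exported log above).
last_charge() {
    sed -n 's/.*Absolute State of Charge *: *\([0-9][0-9]*\) *%.*/\1/p' | tail -n 1
}

# Succeeds (exit status 0) when the charge read from stdin is below the
# threshold given as $1. The default of 30 is an assumption taken from the
# anecdote in this post, not a documented limit.
charge_is_low() {
    threshold=${1:-30}
    charge=$(last_charge)
    [ -n "$charge" ] && [ "$charge" -lt "$threshold" ]
}
```

Example use, with the log file name from above: 'charge_is_low 30 < /var/log/lsi_0125.log && echo "BBU charge is low"'.
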
> The solution is to replace the BBU.  Until then, the server's IO is poor
> (manifested by high load and high %util in 'iostat -x 4 4' when compared
> to the same server's historical performance).
