Re: 20040526: bigbird status (cont.)
- Subject: Re: 20040526: bigbird status (cont.)
- Date: Wed, 26 May 2004 20:56:53 -0500
Tom Yoksas wrote:
From: Gerry Creager N5JXS <address@hidden>
Organization: Texas A&M University -- AATLT
Keywords: 200405262241.i4QMfhtK005798 LDM RAID JFS
Hi Gerry,
IM==Instant Messaging... Sometimes convenient for on-line communications
while remote troubleshooting...
I should have known...
I've seen the RAID failure on reboot several times.
Interesting... Did you run fsck (or variant) to get things patched
up before remounting the RAID filesystem?
Yes. Took a few (10 or less) min. This time was taking a lot longer.
I really want to
get rid of this card and get into a 3Ware card. New 'Net find today
suggests that, as suspected, Promise's proprietary RAID is less than
advertised. They said nicer things 'bout HighPoint, but the only "real
RAID" comments were reserved for Adaptec and 3Ware... noting Adaptec
followed 3Ware's lead.
I believe that Pete Pokrandt of U Wisc/AOS is using a 3Ware card in
his Linux PC.
That's where I should have gone.
Effectively, when rebooting, the system now times out while flushing the
product queue.
Product queue? If you mean LDM product queue, that is on a different
file system. Also, I did not see a startup script for the LDM
in /etc/init.d, so I added one:
Was, or should have been, embedded in rc.local
Since this wasn't there on reboot, the LDM queue would not have
been checked by it.
That apparently is tied to the RAID corruption in
some manner. If we really saw a RAID corruption while running today,
that's a first for me on this system. Further, there are spares. It
should have alarmed and fixed itself.
I agree, but the load average did go up to 400...
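As a rough illustration of how a degraded array could be spotted without waiting for an alarm, the sketch below parses text in the format of /proc/mdstat. It assumes the array is a Linux md device (as the /dev/md0 name in this thread suggests); the function name degraded_arrays and the SAMPLE text are hypothetical and not taken from bigbird.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected, active)} for arrays missing members.

    Expects text in the format of /proc/mdstat, where each array has a
    header line such as "md0 : active raid5 ..." followed by a status
    line ending in something like "[3/2] [UU_]".
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            # "_" marks a slot whose member device is missing or failed.
            if active < expected or "_" in status.group(3):
                degraded[current] = (expected, active)
    return degraded


# Hypothetical sample for illustration only, not output from bigbird.
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 sdb1[1] sda1[0]
      312576512 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a live system the input would come from reading /proc/mdstat; a run of the load average up to 400 could be checked alongside it with os.getloadavg().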
I'll keep looking. By doing the reboot, we did salvage the messages
logs, and there might be some clues.
Thanks for spotting the problem. I was working on bigfoot and didn't
even look at bigbird today, save to place it on a KVM. Hmmm. It's
possible that caused a hiccup, but it shouldn't have. It's been in idle
state WRT the monitor, keyboard, mouse for weeks. The reboot for the
box is serendipitous. I wasn't planning to reboot 'til needed anyway,
so I'd not have had console access (at least for X) 'til I did.
Keyboard and video worked as expected...
Later, gerry
fsck.jfs is still running on /dev/md0, and it will
take time to finish. I will try to look in on bigbird later tonight
or early tomorrow morning. As soon as fsck.jfs finishes, I will try
to mount /data and crank up the LDM.
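The sequence described above (finish the filesystem check, mount /data, start the LDM) might be scripted roughly as follows. This is only a sketch under stated assumptions: that /data has an /etc/fstab entry so a bare mount works, and that ldmadmin is on the PATH of the account running it (normally the LDM user). The helper names recovery_plan and run_plan are hypothetical.

```python
import subprocess


def recovery_plan(device="/dev/md0", mount_point="/data"):
    """Return the recovery commands in the order they should run."""
    return [
        ["fsck.jfs", device],    # check the JFS filesystem on the array
        ["mount", mount_point],  # assumes an /etc/fstab entry for /data
        ["ldmadmin", "start"],   # start the LDM once /data is available
    ]


def run_plan(plan, dry_run=True):
    """Run each command in order, stopping at the first failure.

    With dry_run=True nothing is executed; the commands are only
    collected so the plan can be reviewed first.
    """
    done = []
    for cmd in plan:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit,
            # so a failed fsck prevents the mount and the LDM start.
            subprocess.run(cmd, check=True)
        done.append(" ".join(cmd))
    return done


for line in run_plan(recovery_plan()):
    print(line)
```

Stopping on the first failure is deliberate: mounting /data or starting the LDM on a filesystem that did not pass its check would risk compounding the corruption.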
Time to head home...
It's taking a lot longer than it ever has before....
I'll be writing most of the night. Proposal time.
TTFN, Gerry
Gerry Creager -- address@hidden
Network Engineering -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843