This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: Gerry Creager N5JXS <address@hidden>
>Organization: Texas A&M University -- AATLT
>Keywords: 200405262241.i4QMfhtK005798 LDM RAID JFS

Hi Gerry,

>IM==Instant Messaging... Sometimes convenient for on-line communications
>while remote troubleshooting...

I should have known...

>I've seen the RAID failure on reboot several times.

Interesting... Did you run fsck (or a variant) to get things patched up
before remounting the RAID filesystem?

>I really want to get rid of this card and get into a 3Ware card. A new
>'Net find today suggests that, as suspected, Promise's proprietary RAID
>is less than advertised. They said nicer things about HighPoint, but the
>only "real RAID" comments were reserved for Adaptec and 3Ware... noting
>that Adaptec followed 3Ware's lead.

I believe that Pete Pokrandt of U Wisc/AOS is using a 3Ware card in his
Linux PC.

>Effectively, when rebooting, the system now times out while flushing the
>product queue.

Product queue? If you mean the LDM product queue, that is on a different
file system. Also, I did not see a startup script for the LDM in
/etc/init.d, so I added one:

/etc/init.d/ldmd

Since this wasn't there, the LDM queue would not have been checked on
reboot.

>That apparently is tied to the RAID corruption in some manner. If we
>really saw RAID corruption while running today, that's a first for me on
>this system. Further, there are spares; it should have alarmed and fixed
>itself.

I agree, but the load average did go up to 400...

>I'll keep looking. By doing the reboot, we did salvage the message logs,
>and there might be some clues in them.

OK.

>Thanks for spotting the problem. I was working on bigfoot and didn't
>even look at bigbird today, save to place it on a KVM. Hmmm. It's
>possible that caused a hiccup, but it shouldn't have. It's been idle
>with respect to the monitor, keyboard, and mouse for weeks. The reboot
>for the box is serendipitous: I wasn't planning to reboot 'til needed
>anyway, so I'd not have had console access (at least for X) 'til I did.
>Keyboard and video worked as expected...

OK.

>Later, gerry

fsck.jfs is still running on /dev/md0, and it will take time to finish.
I will try to look in on bigbird later tonight or early tomorrow
morning. As soon as fsck.jfs finishes, I will try to mount /data and
crank up the LDM.

Time to head home...

Tom

From: Tom Yoksas <address@hidden>
To: Gerry Creager N5JXS <address@hidden>

>From: Gerry Creager N5JXS <address@hidden>
>Organization: Texas A&M University -- AATLT
>Keywords: 200405262241.i4QMfhtK005798 LDM RAID JFS

The RAID was not mounted on bootup. Is there supposed to be an entry in
/etc/fstab to mount it? I couldn't find the entry I thought would be
there. We tried mounting /dev/md0 as /data, but got a bad superblock
message. We are now running /sbin/fsck.jfs to check for problems.

The RAID having failed and then not being available would fit as a cause
for the load average ramping up to 400: all processes would be waiting
to write to a resource that no longer existed.

More later as we discover stuff...

Tom

--
+-----------------------------------------------------------------------------+
* Tom Yoksas                                      UCAR Unidata Program        *
* (303) 497-8642 (last resort)                    P.O. Box 3000               *
* address@hidden                                  Boulder, CO 80307           *
* Unidata WWW Service                             http://www.unidata.ucar.edu/ *
+-----------------------------------------------------------------------------+
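
The repair sequence described in this thread (check the JFS filesystem on
the RAID device, then remount it) would look roughly like the commands
below. This is a sketch only: /dev/md0 and /data come from the messages
above, but the fsck.jfs flags were not recorded in the thread, so the
ones shown are typical choices rather than what was actually run.

    # Unmount the filesystem first (harmless if it is not mounted)
    umount /data

    # Force a full check and automatic repair of the JFS filesystem;
    # on a large array this can run for hours, as it did here
    fsck.jfs -f -p /dev/md0

    # Remount the repaired filesystem
    mount -t jfs /dev/md0 /data

A "bad superblock" error on mount, as seen here, is exactly the kind of
damage a full fsck.jfs pass attempts to repair from the backup superblock.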
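
Tom asks whether there should be an /etc/fstab entry for the RAID. A
plausible entry, assuming the device and mount point named in the thread
and stock JFS mount options (none of which were confirmed on bigbird),
would be:

    # device    mount point   type   options    dump  fsck order
    /dev/md0    /data         jfs    defaults   1     2

An fsck-order field of 2 asks the boot-time fsck to check the RAID after
the root filesystem; setting it to 0 would skip the boot-time check
entirely.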
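
The /etc/init.d/ldmd startup script that Tom added is not reproduced in
the archive. A minimal SysV-style script for starting and stopping the
LDM might look like the sketch below; it assumes the LDM is installed in
the 'ldm' user's home directory and is managed with the standard ldmadmin
utility, which is how the LDM is conventionally run, but the actual
script on bigbird may have differed.

    #!/bin/sh
    # /etc/init.d/ldmd -- start/stop the Unidata LDM at boot/shutdown
    # Sketch only: LDMHOME is an assumed installation path

    LDMHOME=/home/ldm

    case "$1" in
      start)
        # Run as the ldm user; ldmadmin start brings up the LDM server
        su - ldm -c "$LDMHOME/bin/ldmadmin start"
        ;;
      stop)
        su - ldm -c "$LDMHOME/bin/ldmadmin stop"
        ;;
      restart)
        su - ldm -c "$LDMHOME/bin/ldmadmin restart"
        ;;
      *)
        echo "Usage: $0 {start|stop|restart}"
        exit 1
        ;;
    esac
    exit 0

Hooking such a script into the boot sequence (via chkconfig or hand-made
rc?.d symlinks) is distribution-specific and is omitted here.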