This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: Gerry Creager N5JXS <address@hidden>
>Organization: Texas A&M University -- AATLT
>Keywords: 200401021656.i02GuEp2026352

Hi Gerry,

Sorry I couldn't get to your note this past weekend, but I was
consumed by activities related to changing of the Unidata-Wisconsin
datastream.

>Tom, this is a follow-up to your inquiry about RAID and product queues.

OK.

>I've started seeing journal corruption and kernel panics on bigbird over
>the last month, predating the ramp-up with CRAFT and CONDUIT.

Hmm... Do you think that this has to do with the less than optimum
RAID support in Linux?

>Yesterday
>I moved the product queue from the RAID partition to a system partition
>to see if going to a different controller and disk will resolve anything.

I did the following on the system I am putting together:

- abandoned the TX2000 hardware RAID card and put in a TX2 UDMA-133
  IDE interface
- upgraded from RH 9 to Fedora Core 1; told the OS to create a software
  RAID using the two Samsung 160 GB HDs
- ran the system with the LDM queue on the software RAID while
  ingesting all IDD data. The latencies dropped to essentially
  zero
- cranked up McIDAS decoding; the latencies remained at zero
- cranked up GEMPAK decoding; the latencies started to climb,
  so I moved the LDM queue back to the system disk. From that
  point the latencies have stayed at/near zero

>It's my intent to rebuild bigbird, regardless of the results, using
>either reiserfs or xfs, probably next week.

I have followed the responses to your inquiry with great interest.
I know Daryl has a lot of experience with Linux, so I would probably
go with XFS.

>However, if the system
>stays stable as currently configured, I _may_ let it run another week.

Sounds prudent.

>I installed gempak 5.6L today. I'm seeing a lot of write failures and
>table failures. Chiz and I (and I think, you and I) ran thru this once
>before, but I slept since then... Could we arrange to get on the phone
>and hammer thru them again?

I will talk with Chiz about this tomorrow.

>I'm home today (last blissful day of
>vacation) at 979.695.6878 if you're working. If not, try the cellphone
>(below) sometime when you've an interest, and I'll make for a terminal
>and work on it.

Well, my weekend was not a vacation. I worked about 12 hours yesterday,
but I did it from home so I got real stressed out with my terrible
internet connection (dial-up, ugh!).

>Hope y'all had a wonderful Holiday season!

You too!

Cheers,

Tom

>From address@hidden Tue Jan 6 08:31:01 2004

Morning!

Tom Yoksas wrote:
>>From: Gerry Creager N5JXS <address@hidden>
>>Organization: Texas A&M University -- AATLT
>>Keywords: 200401021656.i02GuEp2026352
>
>
> Hi Gerry,
>
> Sorry I couldn't get to your note this past weekend, but I was
> consumed by activities related to changing of the Unidata-Wisconsin
> datastream.

I saw the announcements.

>>Tom, this is a follow-up to your inquiry about RAID and product queues.
>
>
> OK.
>
>
>>I've started seeing journal corruption and kernel panics on bigbird over
>>the last month, predating the ramp-up with CRAFT and CONDUIT.
>
>
> Hmm... Do you think that this has to do with the less than optimum
> RAID support in Linux?

Probably, and the fact that Promise and High Point, while releasing
drivers, are not releasing all their info so other driver writers can
derive better drivers...

>>Yesterday
>>I moved the product queue from the RAID partition to a system partition
>>to see if going to a different controller and disk will resolve anything.
>
>
> I did the following on the system I am putting together:
>
> - abandoned the TX2000 hardware RAID card and put in a TX2 UDMA-133
>   IDE interface
> - upgraded from RH 9 to Fedora Core 1; told the OS to create a software
>   RAID using the two Samsung 160 GB HDs
> - ran the system with the LDM queue on the software RAID while
>   ingesting all IDD data. The latencies dropped to essentially
>   zero
> - cranked up McIDAS decoding; the latencies remained at zero
> - cranked up GEMPAK decoding; the latencies started to climb,
>   so I moved the LDM queue back to the system disk. From that
>   point the latencies have stayed at/near zero

We are bringing up an Opteron (uP) with a TX2000. I will test it in
ingest mode and see what we can do there. We loaded the 2.6.0 kernel
and recompiled (or at least it was compiling when I went to the "next"
meeting, yesterday). S/W RAID is incorporated therein. Once we have a
week or 2 of datapoints, we'll re-define the RAID to S/W (the TX2000
will allow "regular" IDE control, so this'll simply be a re-build of
the arrays and flushing the RAID BIOS) and rerun the operations. That
should give us info on what we need to do. At that point, between the
2 of us, I suspect we'll be comfortable answering these questions to
others.

All that said, however, I want to get a 3Ware controller in and work
with it. They're supposed to have superior Linux support...

>>It's my intent to rebuild bigbird, regardless of the results, using
>>either reiserfs or xfs, probably next week.
>
>
> I have followed the responses to your inquiry with great interest.
> I know Daryl has a lot of experience with Linux, so I would probably
> go with XFS.

In going back thru the Beowulf list info about RAID, I found the
following links, which might be useful to both of us in the future.

http://www.linux-ide.org/chipsets.html
http://www.1u-raid5.net

I was looking for a response I got privately from Stonie Cooper, who
was adamant that reiserfs was the right answer, based on his
experience. The Opteron system is currently Reiserfs. I'm going to
have bigbird reformatted as xfs. We should be able to draw a few
conclusions.

>>However, if the system
>>stays stable as currently configured, I _may_ let it run another week.
>
>
> Sounds prudent.
>
>
>>I installed gempak 5.6L today. I'm seeing a lot of write failures and
>>table failures. Chiz and I (and I think, you and I) ran thru this once
>>before, but I slept since then... Could we arrange to get on the phone
>>and hammer thru them again?
>
>
> I will talk with Chiz about this tomorrow.

Thanks!

>>I'm home today (last blissful day of
>>vacation) at 979.695.6878 if you're working. If not, try the cellphone
>>(below) sometime when you've an interest, and I'll make for a terminal
>>and work on it.
>
>
> Well, my weekend was not a vacation. I worked about 12 hours yesterday,
> but I did it from home so I got real stressed out with my terrible
> internet connection (dial-up, ugh!).

Sorry to hear about both. I finally got DSL at home, and I'm looking
at a wireless link back to the campus (it's good to be the research
network engineer in my "spare time"...) that'll give me 25 Mb or so to
play with unfettered. It'll be IPv4/IPv6 for some testing, and may
incorporate some mesh networking as a research sidelight!

>>Hope y'all had a wonderful Holiday season!
>
>
> You too!

Thanks again,
Gerry
--
Gerry Creager -- address@hidden
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020
FAX: 979.847.8578 Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843