Re: 20040527: bigbird rises from the ashes :-)
- Subject: Re: 20040527: bigbird rises from the ashes :-)
- Date: Thu, 27 May 2004 12:46:37 -0500
Unidata Support wrote:
From: Gerry Creager N5JXS <address@hidden>
Organization: Texas A&M University -- AATLT
Keywords: 200405270011.i4R0BjtK010667 LDM Linux RAID JFS
Hi Gerry,
I do not yet propose to rename it 'phoenix'...
I was going to recommend that you do if all goes well :-) I just
watched Flight of the Phoenix again (tenth time?) the other night, so
this was on my mind.
I want to see it fly farther than the next crash before we rename it. I
love that movie!
re: chunk-size setup in /etc/raidtab
I did find several web references recently that recommended 4k chunking.
I'd done larger chunks with the 'hardware' RAID, and we know the
results, although these were likely muddied by other issues. I don't
mind re-chunking if you want to try it, and I'd be willing to go to 128k
for a test.
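
To make the 4k versus 128k trade-off concrete, here is a minimal sketch of how many chunks a single contiguous write touches at each chunk size. The 64 KiB write size is a hypothetical figure chosen for illustration, not a measurement from bigbird, and the function assumes the write starts on a chunk boundary.

```python
def chunks_touched(io_bytes: int, chunk_kib: int) -> int:
    """Count the RAID chunks spanned by one contiguous, chunk-aligned I/O."""
    chunk_bytes = chunk_kib * 1024
    # Ceiling division in integer arithmetic; even a tiny I/O touches one chunk.
    return max(1, -(-io_bytes // chunk_bytes))


if __name__ == "__main__":
    hypothetical_write = 64 * 1024  # assumed size, for illustration only
    for chunk_kib in (4, 128):
        n = chunks_touched(hypothetical_write, chunk_kib)
        print(f"chunk-size {chunk_kib}k: {n} chunk(s) touched")
```

Small chunks spread one write over more chunks (and so potentially more disks), while large chunks keep it on fewer; which is better depends on the workload, which is why a test at both sizes is worth doing.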
Well, bigbird is cruising right now, so it is hard to argue with the 4K
chunk-size:
-- tail end of ~ldm/logs/bigbird.uptime
20040527.1628 0.91 1.26 1.82 8 0 8 1970 49M 0 0 scourBYnumber
20040527.1629 0.56 1.10 1.73 8 0 8 1983 48M 0 0 scourBYnumber
20040527.1630 1.13 1.17 1.71 8 0 8 2010 50M 0 0 scourBYnumber
20040527.1631 2.07 1.49 1.80 8 0 8 2021 48M 0 0 scourBYnumber
20040527.1632 1.21 1.34 1.72 8 0 8 2056 49M 0 0 scourBYnumber
20040527.1633 0.83 1.24 1.66 8 0 8 2067 49M 0 0 scourBYnumber
20040527.1634 0.65 1.12 1.59 8 0 8 2082 48M 0 0 scourBYnumber
20040527.1635 0.88 1.12 1.56 8 0 8 2101 49M 0 0 scourBYnumber
20040527.1636 0.50 0.96 1.48 9 0 9 2123 48M 0 0 scourBYnumber
20040527.1637 1.11 1.08 1.49 8 0 8 2155 49M 0 0 scourBYnumber
20040527.1638 0.98 1.06 1.45 8 0 8 2184 48M 0 0 scourBYnumber
20040527.1639 0.96 1.04 1.42 8 0 8 2204 48M 0 0 scourBYnumber
20040527.1640 1.87 1.24 1.46 8 0 8 2228 49M 0 0 scourBYnumber
20040527.1641 1.74 1.35 1.49 8 0 8 2258 48M 0 0 scourBYnumber
20040527.1642 0.87 1.18 1.42 8 0 8 2280 49M 0 0 scourBYnumber
20040527.1643 0.76 1.08 1.37 8 0 8 2288 49M 0 0 scourBYnumber
20040527.1644 1.56 1.19 1.38 8 0 8 2315 49M 0 0 scourBYnumber
These are the lowest load averages I have ever seen on bigbird while
the LDM and decoders are running!
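
For anyone who wants to track this over time, the following is a rough sketch of summarizing such a log. It assumes that the three numbers after the timestamp are the 1-, 5-, and 15-minute load averages; the remaining columns are ignored because their meaning is not stated here. The sample lines are copied from the listing above, and the function names are made up for this example.

```python
from statistics import mean

# Last five lines of the listing above, copied verbatim.
SAMPLE = """\
20040527.1640 1.87 1.24 1.46 8 0 8 2228 49M 0 0 scourBYnumber
20040527.1641 1.74 1.35 1.49 8 0 8 2258 48M 0 0 scourBYnumber
20040527.1642 0.87 1.18 1.42 8 0 8 2280 49M 0 0 scourBYnumber
20040527.1643 0.76 1.08 1.37 8 0 8 2288 49M 0 0 scourBYnumber
20040527.1644 1.56 1.19 1.38 8 0 8 2315 49M 0 0 scourBYnumber
"""


def parse_loads(text):
    """Return (timestamp, load1, load5, load15) tuples; skip malformed lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            loads = tuple(float(x) for x in fields[1:4])
        except ValueError:
            continue  # not a data line
        rows.append((fields[0],) + loads)
    return rows


def summarize(rows):
    """Mean and maximum of the assumed 1-minute load average column."""
    one_minute = [r[1] for r in rows]
    return {"mean": mean(one_minute), "max": max(one_minute)}


if __name__ == "__main__":
    stats = summarize(parse_loads(SAMPLE))
    print(f"1-min load: mean {stats['mean']:.2f}, max {stats['max']:.2f}")
```

Pointing `parse_loads` at the full contents of the uptime file instead of `SAMPLE` would give the same summary for a whole day.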
I suspect going back to ext3 may also help that. It's a fairly
efficient journal scheme, and jfs, while it's theoretically better for
the filesizes we're seeing, isn't necessarily efficient in journaling.
Curiosity: was the setup error in raidtab specifying that there were
spare disks that weren't there? I didn't study the differences between
raidtab and raidtab.old last night...
I had 2 disks identified in there that didn't exist and I'd forgotten to
remove them. I was trying to run spares on secondary channels of IDE and
that wasn't helping. I've pruned the RAID size by one disk and made the
pruned disk a spare now.
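
A mistake like this can be caught before the array is built. The sketch below reads raidtab-style text and reports devices that are listed but not present, plus any mismatch between the declared and listed disk counts. The keywords (`raiddev`, `nr-raid-disks`, `nr-spare-disks`, `device`, `raid-disk`, `spare-disk`) are the usual raidtools ones; the sample stanza, its RAID level, and its device names are hypothetical and are not bigbird's actual configuration.

```python
import os

# Hypothetical stanza for illustration only; not the real /etc/raidtab.
SAMPLE_RAIDTAB = """\
raiddev /dev/md0
    raid-level            5
    nr-raid-disks         3
    nr-spare-disks        1
    chunk-size            4
    persistent-superblock 1
    device                /dev/hda1
    raid-disk             0
    device                /dev/hdc1
    raid-disk             1
    device                /dev/hde1
    raid-disk             2
    device                /dev/hdg1
    spare-disk            0
"""


def parse_raidtab(text):
    """Return one dict per raiddev stanza with declared counts and devices."""
    arrays, pending = [], None
    for raw in text.splitlines():
        parts = raw.split("#", 1)[0].split()  # drop comments, split on spaces
        if len(parts) < 2:
            continue
        key, value = parts[0], parts[1]
        if key == "raiddev":
            arrays.append({"name": value, "nr_raid": 0, "nr_spare": 0,
                           "raid": [], "spare": []})
            pending = None
        elif not arrays:
            continue  # ignore settings that appear before any raiddev line
        elif key == "nr-raid-disks":
            arrays[-1]["nr_raid"] = int(value)
        elif key == "nr-spare-disks":
            arrays[-1]["nr_spare"] = int(value)
        elif key == "device":
            pending = value  # role is given by the next raid-disk/spare-disk
        elif key in ("raid-disk", "spare-disk") and pending is not None:
            role = "raid" if key == "raid-disk" else "spare"
            arrays[-1][role].append(pending)
            pending = None
    return arrays


def find_problems(array, exists=os.path.exists):
    """List count mismatches and devices that the exists() check cannot find."""
    problems = []
    for role, declared in (("raid", array["nr_raid"]),
                           ("spare", array["nr_spare"])):
        listed = len(array[role])
        if listed != declared:
            problems.append(f"{array['name']}: {declared} {role} disk(s) "
                            f"declared, {listed} listed")
    for dev in array["raid"] + array["spare"]:
        if not exists(dev):
            problems.append(f"{dev}: listed in raidtab but not present")
    return problems


if __name__ == "__main__":
    for arr in parse_raidtab(SAMPLE_RAIDTAB):
        for problem in find_problems(arr):
            print(problem)
```

The `exists` parameter defaults to `os.path.exists`, so the check can be run against real device nodes or exercised with a stand-in function on a machine that lacks them.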
I am all for staying with this setup if it works well. Again, I am
just trying to learn as much as possible about RAID on Linux. More and
more Unidata sites are moving to Linux (and Linux clusters), and
installing RAIDs since disks are so cheap (too bad the same can't be
said for memory!).
Indeed. I'd love to help identify a good config for hard disk/RAID and
memory. I think that's going to be important in the long run.
By the way, my Google searches last night showed that O'Reilly has a
book out on Linux and RAIDs: Managing RAID on Linux. I will be picking
up a copy of this tomorrow if it is in the store, otherwise I will be
ordering it ASAP.
I've got it. It's rather disparaging of s/w RAID, and of IDE RAID in
general. While I would love to do a SCSI RAID, I can't afford the disks
and most of the SCSI disks are considerably smaller... Most of what the
author offers is alternatives, with very few concrete suggestions. I used it
as a guide, with interpolation from SCSI to IDE, for RAIDTAB settings.
There's a brief discussion of chunk sizing. I've seen better offerings
from O'Reilly. I have a pretty complete library for reference.
Gotta go into the office. I've got a PlanetLab node dead and I've got
to troubleshoot it before Dell will send a technician on-site to repair.
Gerry
--
Gerry Creager -- address@hidden
Network Engineering -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.847.8578
Page: 979.228.0173
Office: 903A Eller Bldg, TAMU, College Station, TX 77843