[LDM #JGZ-326819]: LDM - LDM is killing the system
- Date: Tue, 21 Feb 2006 16:05:34 -0700
Hi Angel,
Long time no hear!
> Institution: University of Miami
> Package Version: ldm 6.4.1
> Operating System: SuSE Linux 9.3 (x86-64)
> Hardware Information: 4 processor dual-core
> Inquiry: Problem #1: scour never completes
>
> Problem #2: when the LDM runs the machine is very unresponsive. I know this
> is kinda vague but that's as much as I know. They are saving a pretty large
> subset of available data and writing to a RAIDed disk on a 3ware card. Any
> hints where to look first?
Our experience with "home built" RAID systems (meaning a RAID built by adding
a RAID card and attaching hard disks) on Linux is NOT positive! We have tried
virtually every file system available on the RAID (except GFS), and have been
disappointed with all of them. We have been told that RAID performance when
using 3Ware cards is better, but my experience working with Gerry Creager
(address@hidden) on his 3Ware-based RAID setup has not been stellar. Sources
at NCAR claim that they get very good RAID performance with external boxes
that appear as SCSI devices to the system.
The biggest performance problem occurs when one puts the LDM queue on the RAID
AND then writes LOTS of files to it. In a test on a Fedora Core 1 machine with
a Promise TX2000 RAID card, I found that putting a 2 GB LDM queue on the RAID
resulted in receipt time latencies that rapidly ramped up to 1 hour. When the
queue was moved to a "local" ext3 filesystem, the latencies dropped to
fractions of a second. Gerry and I also noticed that scouring on his RAID was
very sluggish, so much so that I investigated writing new scour routines in
other scripting languages to see if I could minimize the problems. I was
marginally successful in implementing scouring in Tcl, but not so much so that
I can positively say that this is a "solution". By the way, at the time of our
collaborative testing Gerry's machine was running Fedora Core 2; it is now
running CentOS Linux. It is a dual, hyperthreaded Xeon (32-bit) machine with
4 GB of RAM. The 2 TB RAID is built from multiple 300 GB Maxtor IDE drives.
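For readers unfamiliar with what scouring involves, here is a minimal sketch
of an age-based scour pass written in Python. It is NOT the Tcl routine
mentioned above and NOT the scour script shipped with the LDM; the function
name `scour`, its parameters, and the example directory are all hypothetical.
It only illustrates the basic operation (walk a directory tree and remove
regular files older than a cutoff), which is exactly the kind of
metadata-heavy workload that can be slow on a struggling RAID.

```python
import os
import stat
import time


def scour(root, max_age_days, now=None, dry_run=True):
    """Find regular files under `root` older than `max_age_days`.

    Returns the list of matching paths. Files are only deleted when
    dry_run is False. All names here are hypothetical illustrations.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    matched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode) or st.st_mtime >= cutoff:
                continue
            if not dry_run:
                try:
                    os.remove(path)
                except OSError:
                    continue  # could not remove; do not report it
            matched.append(path)
    return matched


# Hypothetical usage: list files older than 7 days without deleting them.
# old_files = scour("/data/ldm/surface", 7, dry_run=True)
```

Running with `dry_run=True` first and timing the call gives a rough idea of
how long the directory walk alone takes on a given filesystem, before any
deletions are attempted.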
As a starting point, I recommend immediately moving your LDM queue off of the
RAID _if_ it is currently on it, and seeing whether there is a noticeable
improvement.
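If it is not obvious whether the queue lives on the RAID, one rough check is
to compare the device IDs of the queue file and the RAID mount point. The
sketch below assumes a POSIX system; the function name and both paths in the
usage comment are hypothetical and need to be replaced with the real queue
location and RAID mount point on your machine. Bind mounts and some layered
storage setups can make this comparison misleading, so treat the answer as a
hint rather than proof.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Hypothetical usage: both paths below are placeholders, not LDM defaults.
# if same_filesystem("/path/to/ldm.pq", "/path/to/raid/mount"):
#     print("The queue appears to be on the RAID filesystem")
```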
By the way, Steve says hi and asks how things are going in Miami!
Cheers,
Tom
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: JGZ-326819
Department: Support LDM
Priority: Normal
Status: Closed