20021120: LDM startup error message
- Subject: 20021120: LDM startup error message
- Date: Wed, 20 Nov 2002 09:28:18 -0700
>From: Ken Scheeringa <address@hidden>
>Organization: Purdue
>Keywords: 200211201420.gAKEKV425218 LDM ldmadmin ldmd.pq
Ken,
>Yesterday I rebooted my RS6000 box running AIX 4.3.2
>for the first time in a while, but LDM wouldn't restart.
>
>I haven't made any changes to the LDM configs in a very
>long time so I am wondering why this new error.
It could be a corrupt queue.
>I have attached a bit of ldmd.log that may help figure
>this out. In the first part of the log, I am attempting
>to connect to my failover "sunset" at the U of Wisconsin,
>and in the second part I am attempting to connect to
>my usual upstream "anvil" here at Purdue.
>
>Both fail with errors in pq.c, but not exactly the same error.
>I don't know much about the internals of LDM, I just know
>that "it works" for my purposes of capturing a small portion
>of the DDPLUS feed.
>
>I appreciate any assistance you can provide. Thanks!
-- ldmd.log --
Nov 20 14:00:55 shadow localhost[19500]: Connection from localhost
Nov 20 14:00:55 shadow localhost[19500]: Connection reset by peer
Nov 20 14:00:55 shadow localhost[19500]: Exiting
Nov 20 14:00:58 shadow pqexpire[23156]: assertion "found != 0" failed: file "pq.c", line 627
Nov 20 14:01:04 shadow rpc.ldmd[21944]: child 23156 terminated by signal 6
Nov 20 14:01:04 shadow rpc.ldmd[21944]: Killing (SIGINT) process group
Nov 20 14:01:04 shadow rpc.ldmd[21944]: Interrupt
Nov 20 14:01:04 shadow rpc.ldmd[21944]: Exiting
Nov 20 14:01:04 shadow udp.ldmd[20012]: Interrupt
Nov 20 14:01:04 shadow udp.ldmd[20012]: Exiting
Nov 20 14:01:04 shadow pqact[20596]: Interrupt
Nov 20 14:01:04 shadow pqact[20596]: Exiting
Nov 20 14:01:04 shadow pqbinstats[20904]: Interrupt
Nov 20 14:01:04 shadow pqbinstats[20904]: Exiting
Nov 20 14:01:05 shadow rpc.ldmd[21944]: Terminating process group
Nov 20 14:01:05 shadow sunset[20342]: Interrupt
Nov 20 14:01:05 shadow sunset[20342]: Exiting
Nov 20 14:03:58 shadow rpc.ldmd[23502]: Starting Up (built: Jan 30 1996 10:54:55)
Nov 20 14:03:58 shadow anvil[20598]: run_requester: Starting Up: anvil.eas.purdue.edu
Nov 20 14:03:58 shadow pqbinstats[20346]: Starting Up (23502)
Nov 20 14:03:58 shadow pqact[20906]: Starting Up
Nov 20 14:03:58 shadow pqexpire[21180]: Starting Up
Nov 20 14:03:59 shadow udp.ldmd[20014]: Starting Up
Nov 20 14:03:59 shadow localhost[19504]: Connection from localhost
Nov 20 14:03:59 shadow localhost[19504]: Connection reset by peer
Nov 20 14:03:59 shadow localhost[19504]: Exiting
Nov 20 14:04:00 shadow pqexpire[21180]: assertion "rp->prev != OFF_NONE" failed: file "pq.c", line 696
The assertion failure in pqexpire points to a corrupt queue as being the
most likely problem. To remake the queue, do the following:
<login as 'ldm'>
cd ~ldm
ldmadmin stop <- to make sure nothing is running
ldmadmin delqueue
ldmadmin mkqueue
ldmadmin start
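If you want to confirm the diagnosis before deleting the queue, a quick scan of the log for assertion failures can help. The following is only a sketch: the function names are made up for illustration, and the log file is whichever ldmd.log your installation writes.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the LDM distribution.

# Print every assertion failure recorded in the given log file.
# The pattern matches only the 'assertion "..." failed' text, because
# long syslog lines may be wrapped before the source file name appears.
find_assertions() {
    grep 'assertion ".*" failed' "$1"
}

# Return 0 if the log contains at least one assertion failure, non-zero otherwise.
has_assertion_failure() {
    find_assertions "$1" >/dev/null
}

# Example (the path is an assumption):
#   has_assertion_failure ~ldm/logs/ldmd.log && echo "consider remaking the queue"
```

An assertion failure by itself does not prove the queue is corrupt, but failures reported from pq.c, as in the log above, make it the first thing to rule out.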
In taking a closer look at the log file you sent in, I see that
you are using a very old version of the LDM:
rpc.ldmd[23502]: Starting Up (built: Jan 30 1996 10:54:55)
I strongly suggest that you upgrade to a current version of the LDM:
<login as 'ldm'>
cd ~ldm
ftp ftp.unidata.ucar.edu
<user> anonymous
<pass> your_full_email_address
cd pub/ldm5
binary
get ldm-5.2.2.tar.Z
quit
zcat ldm-5.2.2.tar.Z | tar xvf -
cd ldm-5.2.2/src
./configure
make
make install
sudo make install_setuids <- assumes your system has sudo installed.
If it doesn't, 'root' will need to
run 'make install_setuids'
cd ../bin
<edit ldmadmin and make sure that the line:
$hostname = "@HOSTNAME@";
is modified so that '@HOSTNAME@' is replaced by the fully qualified hostname
of your machine running the LDM.
Also check to make sure that the path to Perl is correct and that
you have enough room for the default 400 MB queue.>
<continuing>
cd ~
rm runtime
ln -s ldm-5.2.2 runtime
cd etc
<edit ldmd.conf: comment out the line that starts pqexpire (it is not needed
in newer LDMs) and add the following line>
exec "rtstats -h rtstats.unidata.ucar.edu"
cd ~
ldmadmin delqueue
ldmadmin mkqueue <- will take some time
ldmadmin start
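The two hand edits in the steps above (the hostname in ldmadmin and the changes to ldmd.conf) could also be scripted. This is a rough sketch, not something shipped with the LDM: the function names and the example hostname are hypothetical, and it assumes the pqexpire entry is an ordinary 'exec' line in ldmd.conf. Each function saves a .orig copy before changing anything.

```shell
#!/bin/sh
# Hypothetical helpers; file locations and the hostname are assumptions.

# Replace the @HOSTNAME@ placeholder in ldmadmin with a fully qualified name.
set_ldmadmin_hostname() {
    file=$1
    fqdn=$2
    cp "$file" "$file.orig" || return 1
    # Write back into the existing file so its permission bits are kept.
    sed "s/@HOSTNAME@/$fqdn/" "$file.orig" > "$file"
}

# Comment out any exec line that starts pqexpire, then add the rtstats
# entry unless the file already mentions it.
update_ldmd_conf() {
    conf=$1
    cp "$conf" "$conf.orig" || return 1
    sed '/^[[:space:]]*exec[[:space:]].*pqexpire/s/^/#/' "$conf.orig" > "$conf" || return 1
    grep 'rtstats -h rtstats.unidata.ucar.edu' "$conf" >/dev/null 2>&1 ||
        echo 'exec "rtstats -h rtstats.unidata.ucar.edu"' >> "$conf"
}

# Example (paths and hostname are placeholders):
#   set_ldmadmin_hostname ~ldm/ldm-5.2.2/bin/ldmadmin myhost.example.edu
#   update_ldmd_conf ~ldm/etc/ldmd.conf
```

It is still worth looking at both files afterwards, since the Perl path and the queue size have to be checked by hand anyway.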
You can probably get away with not upgrading your LDM, but since you are
going to have to do this at some point anyway, why not do it now?
Tom Yoksas
>From address@hidden Wed Nov 20 11:17:04 2002
>Subject: Re: 20021120: LDM startup error message
On Wed, 20 Nov 2002, Unidata Support wrote:
> Ken,
> The assertion failure in pqexpire points to a corrupt queue as being the
> most likely problem. To remake the queue, do the following:
Yes, that was it: a corrupt queue!
I never had that happen to me before so
I hadn't thought of that.
Thanks much for your help!!
*******************************************************************************
Ken Scheeringa               Indiana Climate Page
State Climatologist          http://shadow.agry.purdue.edu
Agronomy Dept
Purdue University            featuring climate data archives:
e-mail: address@hidden       daily coop stations : 1994+
fax: 765.496.2926            hourly airport data : Jul 1996+
phone: 765.494.8105          30-min autostation : 1999+
                             updated daily
                             Also monthly/daily normals
*******************************************************************************