Re: 20001029: LDM 5.1.2 on solarisx86 not letting products out of queue
- Date: Mon, 30 Oct 2000 14:46:10 -0700
>From: Tom Yoksas <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200010291355.e9TDt4406706 LDM 5.1.2 queue pqact
Tom,
I just looked at the shemp LDM logs, and the last "pq_del_oldest: conflict"
message was at 7:33pm last night, about 19 hours ago:
Oct 29 19:33:19 shemp.unidata.ucar.edu wsihcsn[7973]: pq_del_oldest: conflict on 1148054576
Oct 29 19:33:19 shemp.unidata.ucar.edu wsihcsn[7973]: comings: pqe_new: Resource temporarily unavailable
Oct 29 19:33:19 shemp.unidata.ucar.edu wsihcsn[7973]: : 7975d55ce2b4a1407ea86f0ce58a7728 11798 20001029193319.149 WSI 412 NEX/HMO/PRE1
Oct 29 19:33:19 shemp.unidata.ucar.edu wsihcsn[7973]: Connection reset by peer
Oct 29 19:33:19 shemp.unidata.ucar.edu wsihcsn[7973]: Disconnect
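For anyone who wants to repeat this check on their own logs, here is a minimal Python sketch that scans syslog-style lines for the last "pq_del_oldest: conflict" message. The line layout in LINE_RE is an assumption taken from the excerpt above, the function name last_conflict is made up for illustration, and the year has to be supplied because syslog timestamps omit it. It also assumes each log entry is on a single line (not wrapped as in this mail).

```python
import re
from datetime import datetime

# Assumed layout, based on the excerpt: "Mon DD HH:MM:SS host prog[pid]: message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (\S+) (\S+)\[(\d+)\]: (.*)$"
)

def last_conflict(lines, year):
    """Return (timestamp, host, program) of the last conflict message, or None."""
    last = None
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and m.group(5).startswith("pq_del_oldest: conflict"):
            # syslog omits the year, so the caller must supply it
            stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            last = (stamp, m.group(2), m.group(3))
    return last

def hours_since(stamp, now):
    """Elapsed hours between a log timestamp and a reference time."""
    return (now - stamp).total_seconds() / 3600.0
```

Passing the lines of the log file to last_conflict() and then calling hours_since() with the current time gives the "how long ago" figure directly, provided both times are in the same time zone.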
It looks like someone shut down the LDM at 7:34pm:
Oct 29 19:34:37 shemp.unidata.ucar.edu rpc.ldmd[7957]: child 7958 terminated by signal 11
Oct 29 19:34:37 shemp.unidata.ucar.edu rpc.ldmd[7957]: Killing (SIGINT) process group
and then restarted it just after 8:00pm:
Oct 29 20:00:05 shemp.unidata.ucar.edu rpc.ldmd[24961]: Starting Up (built: Aug 25 2000 10:53:07)
Oct 29 20:00:05 shemp.unidata.ucar.edu motherlode[24965]: run_requester: Starting Up: motherlode.ucar.edu
The new, more informative "pq_del_oldest: conflict" messages showed
that the products on which locks were being held were of every
feedtype, so that shoots down the theory that a McIDAS decoder was
holding a lock on them. Some of them also seem to be very recently
ingested products, which points to an error in determining which is
the oldest product. But motherlode has been getting the same products
for at least the last 4 days without a single "pq_del_oldest: conflict"
message in its logs. I'm beginning to wonder if shemp may be getting
a disk read error that would cause something like this.
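To check the "every feedtype" and "very recently ingested" observations more systematically, something like the following Python sketch could be used. The field order (signature, size, creation time, feedtype, sequence number, identifier) is an assumption read off the product line in the log excerpt, and parse_product, age_seconds and feedtype_counts are hypothetical helper names. The age is only meaningful if the log timestamps and the product creation times are in the same time zone.

```python
from collections import Counter
from datetime import datetime

def parse_product(info):
    """Split a product-info string into named fields (assumed field order)."""
    sig, size, stamp, feedtype, seqno, ident = info.split(None, 5)
    return {
        "signature": sig,
        "size": int(size),
        # creation time as logged, e.g. 20001029193319.149
        "created": datetime.strptime(stamp, "%Y%m%d%H%M%S.%f"),
        "feedtype": feedtype,
        "seqno": int(seqno),
        "ident": ident.strip(),
    }

def age_seconds(product, log_time):
    """Age of the product at the time the log line was written."""
    return (log_time - product["created"]).total_seconds()

def feedtype_counts(products):
    """Tally parsed products by feedtype."""
    return Counter(p["feedtype"] for p in products)
```

A product whose age comes out at only a few seconds when the conflict is logged is clearly not the oldest one in the queue, which is the symptom described above.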
But one other user, Tom McDermott <address@hidden>,
just reported getting a bunch of "pq_del_oldest: conflict" messages
too, so the disk read error doesn't seem that likely.
Also, why haven't we seen any of these errors on shemp since last
night???
--Russ