[LDM #UAK-912261]: data flow problems
- Date: Wed, 28 May 2008 09:14:18 -0600
Hi Karen,
> I'm having an interesting problem with a couple of my machines. I have
> "pluto" which gets its data over a private connection that seems to be
> working fine. It should feed the data to dontpanic but that doesn't
> seem to be working. It was fine last week, and even this morning part
> of the data was getting through, but not all... same feed type,
> different patterns, but only 1 feed was getting through even though
> dontpanic only has one request line to pluto.
>
> Now I've restarted ldm on both machines and even stopped ldm, rebuilt
> queues, and restarted on both machines, and I have no data flowing now.
> I even rebooted the upstream machine. I haven't rebooted the downstream
> machine yet, I can but I hope not to as it is my primary data server
> for a number of realtime systems.
>
> Pings and ldmpings work between the machines, I'm actually concerned
> that on the upstream machine I can see the data coming in using ldmadmin
> watch, but when I try to run notifyme against the local queue using this
> command:
>
> notifyme -v -l - -h localhost
>
> I don't get any notifications of the data arriving in the queue. I have
> a feeling this is why the data isn't getting downstream. I am seeing
> this in the log:
>
> May 27 19:39:45 pluto localhost(noti)[12784]: Starting Up(6.0.14/5):
> 20080527193945.809 TS_ENDT {{ANY, ".*"}}
> May 27 19:39:45 pluto localhost(noti)[12784]: topo:
> localhost.localdomain ANY
> May 27 19:43:18 pluto localhost(noti)[12528]:
> nullproc5(localhost.localdomain): RPC: Unable to receive
The first two log messages show normal startup of an upstream LDM
process in response to a notifyme(1) process. The third log message
is from a different upstream LDM process (different PID).
Were there any other log messages from PID 12784?
Try running two xterm(1) windows. In one, run "ldmadmin watch"; in
the other, run notifyme(1). They should show the same data-products,
although the notifyme(1) might lag the "ldmadmin watch". If there's
a discrepancy, then find the PID of the upstream LDM that was started
in response to the notifyme(1), fgrep(1) just its log messages (e.g.,
"fgrep '[nnnnn]' $HOME/logs/ldmd.log"), and send them to me.
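The fgrep(1) step can be sketched as a small shell function. This is only
an illustration: the function name ldm_log_for_pid and its arguments are
hypothetical and not part of the LDM, and the default log path is the one
quoted above. It uses "grep -F", which behaves like fgrep(1), so the
brackets around the PID are matched literally rather than as a bracket
expression.

```shell
#!/bin/sh
# Print only the log lines written by one LDM process, identified by PID.
# ldm_log_for_pid is a hypothetical helper, not an LDM utility.
#   $1 = PID of the upstream LDM started in response to notifyme(1)
#   $2 = log file (optional; defaults to the path mentioned above)
ldm_log_for_pid() {
    pid=$1
    logfile=${2:-$HOME/logs/ldmd.log}
    # -F makes "[12784]" a fixed string instead of a character class.
    grep -F "[$pid]" "$logfile"
}
```

For example, "ldm_log_for_pid 12784" would print just the messages logged
by that process, which can then be pasted into a reply.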
> Same kind of response when I try from the downstream machine.
>
> This was working last week, but as a sanity check I rebuilt my ldm from
> source and checked all the configurations. It is a slightly older
> version 6.0.14. I even double checked to make sure iptables and
> SELinux weren't running. There is no firewall between the machines as
> they are both on our internal network. I also checked to make sure the
> rpc/services files still had the proper settings.
>
> I have exhausted all my ideas, looking for any ideas of what to try
> next. I'd rather exhaust all my options on the upstream machine
> (especially as it seems that is where the problem is -- considering the
> notifyme failures) before trying anything on the downstream machine.
>
> --
> -------------------------------------------
>
> There are 2 kinds of people in the world:
>
> 1) Those who can extrapolate from incomplete data.
>
> -------------------------------------------
> address@hidden
>
> Phone: 405-325-6982
> Cell: 405-834-8559
> SAIC/Systems Analyst
> National Severe Storms Laboratory
Regards,
Steve Emmerson
Ticket Details
===================
Ticket ID: UAK-912261
Department: Support LDM
Priority: Normal
Status: On Hold