20040917: Possible pqact issue in LDM?
- Subject: 20040917: Possible pqact issue in LDM?
- Date: Fri, 17 Sep 2004 11:22:48 -0600
Steven,
>Date: Thu, 16 Sep 2004 20:53:48 -0500
>From: "Steven Danz" <address@hidden>
>Organization: Aviation Weather Center
>To: Steve Emmerson <address@hidden>
>Subject: Re: 20040913: Possible pqact issue in LDM?
> Keywords: 200409142045.i8EKjBnJ016694 LDM-6 ldmadmin pqact.conf
The above message contained the following:
> So far so good, but I'm wanting to give it some time. :-)
Same here. Roughly how often has your pqact(1) been missing
data-products?
> I've been mulling it around and had a few questions.
>
> 1) Noticed the changes in pqact and was wondering if the rpc.ldmd uses
> the same or a similar method to determine which products go to
> downstream clients.
The changes to the pqact(1) program should not be relevant to the
problem you're seeing: they addressed other issues.
> If so, could it be possible that this might
> cause a server to not pass products down to a client? I've had a
> problem once in a blue moon where a product upstream doesn't show
> up down, but I've always seemed to convince myself it was 'something
> else'.
Both pqact(1) and sending rpc.ldmd(1) processes use the pq_sequence(3)
function of the pq(3) module. This function is responsible for
sequencing through all desired data-products in the product-queue. I
hope that changes to this module will stop pqact(1) from skipping
data-products. This should also eliminate any similar behavior by
sending rpc.ldmd(1) processes.
> 2) The timestamp that is assigned when the product is placed in the
> queue, is it preserved as the product passes from server to client?
No. What travels with a data-product as part of its metadata is its
creation-time. The insertion-time of a data-product into a
product-queue is local to that system and is not communicated between
LDMs.
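
As a rough illustration of that distinction, here is a toy model in
Python. It is not LDM code: the names Product, LocalQueue, and relay,
the identifier string, and the fixed clock values are all hypothetical
and exist only for this sketch.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical stand-in for a data-product and its metadata."""
    ident: str
    creation_time: float  # set once by the originator; travels with the product


class LocalQueue:
    """Hypothetical stand-in for one host's product-queue."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self.entries = []  # (insertion_time, product) pairs, local to this host

    def insert(self, product):
        # The insertion-time is assigned here, from the receiving host's clock.
        insertion_time = self._clock()
        self.entries.append((insertion_time, product))
        return insertion_time


def relay(product, downstream):
    """Pass along only the product and its metadata, never an insertion-time."""
    return downstream.insert(product)


# Fixed clocks (made-up values) so the two hosts visibly disagree.
upstream = LocalQueue(clock=lambda: 1000.0)
downstream = LocalQueue(clock=lambda: 1002.5)

p = Product(ident="example-product", creation_time=999.0)
t_up = upstream.insert(p)
t_down = relay(p, downstream)
```

In this sketch the creation-time is identical on both hosts, while each
queue records its own, different insertion-time.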
> If so, if there were 4 data sources feeding one client, then
> there exists a chance (regardless of the speed of the systems)
> that a duplicate timestamp could be created by the data sources.
It does, indeed, seem possible that, on a fast system, more than one
data-product could have the same insertion-time -- especially during a
reconnection when the downstream LDM is "catching up".
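
To make the concern concrete, here is a toy sketch in Python of how
equal insertion-times could, in principle, cause a reader to skip a
product. It assumes a reader that resumes strictly after the last
insertion-time it saw; that is an assumption made for illustration, not
a description of what pq_sequence(3) actually does, and next_after is a
hypothetical helper.

```python
def next_after(queue, last_time):
    """Return the first entry inserted strictly later than last_time, else None."""
    for t, name in sorted(queue):
        if t > last_time:
            return (t, name)
    return None


queue = [(100, "A")]  # (insertion_time, product) pairs; made-up values
seen = []
cursor = -1

# The reader processes A and remembers A's insertion-time.
entry = next_after(queue, cursor)
seen.append(entry[1])
cursor = entry[0]

# B then arrives with the same insertion-time as A (for example, from
# another inserting process within the same clock tick), followed by C.
queue.append((100, "B"))
queue.append((101, "C"))

# The reader resumes strictly after time 100, so B is never visited.
while (entry := next_after(queue, cursor)) is not None:
    seen.append(entry[1])
    cursor = entry[0]

skipped = [name for _, name in queue if name not in seen]
```

Under that assumption, B ends up in skipped while A and C are seen.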
> 3) Even if the timestamp is reassigned by the client, if the client
> had multiple upstream sources, and therefore multiple rpc.ldmd
> processes inserting products in the queue, it would seem (especially
> on multiple CPU systems) that the multiple client rpc.ldmd processes
> would stand a pretty good chance of creating duplicate timestamps.
Warren Blanchard at the national NWS HQ has reported missing about 8
NEXRAD Level II data-products per 100,000 data-products (a 0.008% loss
rate). This seems similar to your pqact(1) problem.
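
For reference, the arithmetic behind that figure, as a small Python
check (the variable names are only for this sketch):

```python
missed = 8          # data-products reported missing
total = 100_000     # data-products sent

loss_fraction = missed / total
loss_percent = 100 * loss_fraction

print(f"{loss_percent:.3f}%")  # prints "0.008%"
```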
> True, for a single-CPU system it would take a pretty fast system
> (and pretty small products), but I would think that with an n-way
> system it would be more likely.
I agree. Hopefully, the new pq(3) module will fix this.
> Like I said, just some random thoughts. Thanks for the work on this!
>
> Steven
>
> Steven Danz
> Senior Software Development Engineer
> Aviation Weather Center (NOAA/NWS/NCEP)
> 7220 NW 101st Terrace, Room 101
> Kansas City, MO 64153-2371
>
> Email: address@hidden
> Phone: 816.584.7251
> Fax: 816.880.0650
> URL: http://aviationweather.gov/
>
> The opinions expressed in this message do not necessarily reflect those
> of the National Weather Service, or the Aviation Weather Center.
Regards,
Steve Emmerson