[CONDUIT #CEL-625898]: 20190423: CONDUIT feed latencies
- Subject: [CONDUIT #CEL-625898]: 20190423: CONDUIT feed latencies
- Date: Tue, 23 Apr 2019 10:34:30 -0600
Hi Carissa,
This is a quick note about two different topics related to the latencies
that Unidata top level relays have seen for the IDD CONDUIT
data feed:
- first, it would be most useful (and interesting!) to know what, if
anything, was done at NCEP to drastically reduce the observed latencies
in the CONDUIT feed
FYI:
- 20-way feed REQUEST splits by Penn State and Unidata did result in
reduced latencies, but the 20-way split at Unidata did not return
the latencies to earlier levels
- I contacted Pete Pokrandt (UW/AOS) to see if he had changed, or
would change, his 10-way REQUEST split for CONDUIT
Before Pete could start changing his existing 10-way split to a
20-way split, the CONDUIT latencies plummeted and have remained
as low as they have historically been during good periods.
We would _love_ to know what was done to effect the drop in
latencies at NCEP (or at some point in the network managed by
NOAA). If nothing was done (really nothing that is), it would also
be good to know as this might indicate that there was a problem in
Internet2. While it is _extremely_ rare for there to be problems
in Internet2, we and UW/SSEC did experience a situation where the
Internet2 gateway connection to AWS East was significantly
underperforming, and the slowness affected our ability to uplink
NEXRAD Level 2 data to the AWS S3 bucket that we have been populating
for over 3 years as part of the NOAA Big Data project.
- secondly, CONDUIT latencies have always exhibited a variation from near
zero to 30 seconds
These latencies are caused by the process(es) that are inserting
products into the LDM queue on the CONDUIT origination machine NOT
sending a CONT (continue) signal to the LDM to inform it that new
products are available in the LDM queue. The default behavior of
an LDM is to process (relay to downstreams, etc.) all products it
finds in its queue; sleep for 30 seconds; and then wake up and
check to see if there are any products to process. The result of
this default behavior is latencies that range from zero/near-zero
to 30 seconds. The step that needs to be done to eliminate this
"artificial" 0-30 second latency is for the process that is
inserting the product(s) into the LDM queue to send a CONT signal
to the negative of the process ID of the lead LDM process. In
practice, this looks like:
/bin/kill -s CONT -`cat ~ldm/ldmd.pid`
NB:
- the system version of 'kill' must be used; 'kill' provided by,
for instance, BASH does not work correctly
- the file ~ldm/ldmd.pid contains the process ID of the lead
LDM process for a running LDM
- the CONT signal must be passed to the negative of the process ID
for the lead LDM process
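As a rough sketch only, the signalling step could be wrapped in a small
shell helper that an insertion script calls after each product is added.
The function names (ldm_signal_target, wake_ldm) and the KILL_CMD
variable are made up for illustration and are not part of the LDM; the
PID file path is passed in by the caller:

```shell
#!/bin/sh
# Illustrative helper, not part of the LDM distribution.

# Print the signal target: the negative of the PID stored in the given file.
ldm_signal_target() {
    pid_file=$1
    [ -r "$pid_file" ] || { echo "cannot read $pid_file" >&2; return 1; }
    # Strip whitespace/newlines so only the digits of the PID remain.
    pid=$(tr -d '[:space:]' < "$pid_file")
    case $pid in
        ''|*[!0-9]*) echo "invalid PID in $pid_file" >&2; return 1 ;;
    esac
    printf '%s\n' "-$pid"
}

# Send CONT to the negative of the lead LDM process ID.
# KILL_CMD (hypothetical) defaults to the system kill, not the shell builtin.
wake_ldm() {
    target=$(ldm_signal_target "$1") || return 1
    "${KILL_CMD:-/bin/kill}" -s CONT "$target"
}
```

An insertion script would then run something like 'wake_ldm ~ldm/ldmd.pid'
right after it finishes adding a product to the queue.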
So, my question to you is how we can go about enhancing the product
insertion processes in use at NCEP to include sending the CONT
signal to the LDM so that it knows each time a new product is
inserted into its queue?
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************