[IDD #EEC-117639]: Missing, and slow satellite data
- Subject: [IDD #EEC-117639]: Missing, and slow satellite data
- Date: Wed, 18 Aug 2021 11:45:45 -0600
Hi,
re:
> For about a week now we have been missing multiple channels and times of
> Level 1 GOES-16 and GOES-17 data. Has something changed?
Over a week ago, we moved idd.unidata.ucar.edu back to a cluster that is
housed in the NCAR-Wyoming Supercomputer Center (NWSC) in Cheyenne, WY.
This email is the first we've heard from any site about high latencies.
I see that freshair1 is redundantly feeding from idd.unidata.ucar.edu
and iddb.unidata.ucar.edu:
https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/siteindex?freshair1.atmos.washington.edu
topology list for DIFAX:
https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_topo_nc?DIFAX+freshair1.atmos.washington.edu
We did not move/change the setup for iddb.unidata.ucar.edu, so I would
expect that the feed latencies from it should not have changed
recently. The fact that your latencies are very high suggests that
there may have been some change closer to you (campus, department, ?).
re:
> I see that the latency
> for both DIFAX, and NIMAGE are way up, and all over the place, but not so
> for other feeds.
The FNEXRAD latency is very high right now too, and so are the
latencies in the NEXRAD2 feed:
FNEXRAD latencies for freshair1:
https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?FNEXRAD+freshair1.atmos.washington.edu
NEXRAD2 latencies for freshair1:
https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?NEXRAD2+freshair1.atmos.washington.edu
This is at the same time that the CONDUIT latencies have been climbing.
These plots suggest that the latencies are correlated with the volume
of data in a feed. Consider the following snapshot of volumes taken
from a randomly chosen real-server backend of the idd.unidata.ucar.edu
cluster:
https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?node7.unidata.ucar.edu
Data Volume Summary for node7.unidata.ucar.edu
Maximum hourly volume 132304.840 M bytes/hour
Average hourly volume 79591.983 M bytes/hour
Average products per hour 550130 prods/hour
Feed          Average [% of total]   Maximum         Products
              (M byte/hour)          (M byte/hour)   (number/hour)
CONDUIT       15876.831 [19.948%]    52229.672         93306.256
SATELLITE     14278.176 [17.939%]    20442.923          6289.372
NEXRAD2       12585.120 [15.812%]    15088.628        128220.628
NIMAGE         7652.056 [ 9.614%]    12035.722          5817.698
NGRID          7477.583 [ 9.395%]    12546.217         66062.860
NOTHER         6373.055 [ 8.007%]    10156.915         10751.023
FNEXRAD        5063.719 [ 6.362%]     5398.921          9911.023
HDS            3735.609 [ 4.693%]     9333.490         29897.767
NEXRAD3        3427.554 [ 4.306%]     3985.472        134177.465
FNMOC          1500.523 [ 1.885%]     4975.851          5484.279
UNIWISC         897.155 [ 1.127%]     1118.399           849.605
GEM             636.047 [ 0.799%]     4471.495          3674.023
IDS|DDPLUS       84.774 [ 0.107%]       98.752         55307.302
LIGHTNING         3.619 [ 0.005%]        8.096           347.163
GPS               0.098 [ 0.000%]        0.968             1.047
FSL2              0.065 [ 0.000%]        0.558            32.116
As you can see, the SATELLITE (aka DIFAX), NEXRAD2 and NIMAGE feeds are
some of the most voluminous in the IDD. While CONDUIT, on average, is
more voluminous, you are likely using split REQUESTs to get CONDUIT data
and, as shown in your ldmd.conf below, single REQUESTs for each of the other feeds where
the latencies are now very high. This again suggests that the latencies
you are experiencing are a function of the volume of data in a feed.
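For reference, a split CONDUIT REQUEST typically looks something like
the following sketch (the five patterns key off the trailing sequence
number that CONDUIT product IDs carry, so they are disjoint; adjust the
upstream host(s) to match your setup):

  request CONDUIT "[09]$" idd.unidata.ucar.edu
  request CONDUIT "[18]$" idd.unidata.ucar.edu
  request CONDUIT "[27]$" idd.unidata.ucar.edu
  request CONDUIT "[36]$" idd.unidata.ucar.edu
  request CONDUIT "[45]$" idd.unidata.ucar.edu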
Question:
- is it possible that per-connection volume limiting was imposed somewhere
on the UWashington campus?
re:
> https://a.atmos.washington.edu/~ovens/ldmstats.html
>
> I have this in my ldmd.conf for them:
> request DIFAX ".*" idd.unidata.ucar.edu
> request DIFAX ".*" iddb.unidata.ucar.edu
> request NIMAGE ".*" idd.unidata.ucar.edu
> request NIMAGE ".*" iddb.unidata.ucar.edu
>
> Do you have any suggestions?
I suggest two things:
- contact your campus network folks to see if they have implemented
  any per-connection limitations, or if they have installed something
  like a Palo Alto firewall that is doing packet inspection and thus
  slowing things down
- try splitting your feed REQUEST for each of the feeds that are
  showing high latencies (the notifyme sketch just below is one way
  to scope out which patterns to split on)
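Before picking split patterns for a feed, it can help to eyeball the
product IDs that are actually flowing. A notifyme invocation along
these lines (run on freshair1; NIMAGE is just the example feedtype)
will print the IDs of products received in the last hour:

  notifyme -vl- -h idd.unidata.ucar.edu -f NIMAGE -o 3600

The IDs it prints should make clear which substrings (satellite name,
channel, etc.) can serve as disjoint REQUEST patterns.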
Another site, Embry-Riddle Aeronautical University in FL, was forced
to split their SATELLITE (aka DIFAX) feed REQUEST into multiple,
disjoint REQUESTs about 2 years ago. Since making the split, their
latencies have stayed as low as possible.
re:
> The only thing on our end that has changed is
> that I have a replacement ldm server on our network feeding off our
> existing ones to sync up before using it to replace our current primary.
> Could that have pushed it over the edge?
I don't think so, unless the feeds to the new machine have saturated
your local network.
Questions:
- do you know what speed your local area network runs at?
- do you know what speed the Ethernet interfaces on freshair1 and
  freshair2 run at? (one quick way to check is sketched below)
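Assuming freshair1 and freshair2 run Linux, something like the
following will report the negotiated link speed (eth0 is a
placeholder; 'ip link' will list the real interface names):

  ethtool eth0 | grep -i speed

A 1 Gbps interface will report 'Speed: 1000Mb/s'.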
We found that we had to bond two 1 Gbps Ethernet interfaces together
to increase the throughput of the machines that used to make up the
idd.unidata.ucar.edu cluster. Our newer machines, however, all have
10 Gbps interfaces, so the need to bond two Ethernet interfaces
together has not returned.
re:
> However, it's been running for
> more than a week, and no other feeds seem to be having problems.
I think that the latency plots for the NEXRAD2 and FNEXRAD feeds
tell a different story.
re:
> Do you have any suggestions? I have not been able to find any obvious
> system issues.
Please try splitting the NIMAGE feed as a first test to see if
that reduces the NIMAGE latencies. I suggest a two-way split
where one REQUEST is for GOES-East (GOES16) and the other is
for GOES-West (GOES17) products. If this change helps,
do the same thing for your SATELLITE (aka DIFAX) feed REQUEST.
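The shape of that two-way split would be something like the following
sketch. NB: the GOES16/GOES17 strings are assumptions about what
appears in your NIMAGE product IDs, so verify them (e.g., with
notifyme as above) before committing the change:

  # hypothetical patterns -- confirm against actual product IDs
  request NIMAGE "GOES16" idd.unidata.ucar.edu
  request NIMAGE "GOES17" idd.unidata.ucar.edu

and similarly for the redundant REQUESTs to iddb.unidata.ucar.edu.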
Aside:
The current release of the LDM is v6.13.15. I see that freshair1
is still running v6.13.6, which is why your stats show DIFAX
instead of SATELLITE for the GOES-R/S GRB data. Upgrading to
the latest LDM will likely _not_ solve your latency problem,
but it may help in other areas.
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: EEC-117639
Department: Support IDD
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.