- Subject: [IDD #NME-796833]: Re: IDD feed question
- Date: Mon, 05 Apr 2021 15:30:11 -0600
Hi Pete and Steve,
re:
> To be honest, I'm not sure.. Mike helped me set this up a few years ago.
> ldm requests come into the frontend machine (director node?) and are
> passed on to either idd1 or idd2, the worker nodes. Looking back at our
> emails from 2018, we're using LVS to do the farming out of requests to
> one or the other. I'm not sure how, or if, it takes load or number of
> connections, or ?? to determine where the next connection should go..
Assuming that your setup is like the one we use on our clusters (a very good
assumption), the heuristic being used is simply the number of downstream
connections. Where this is not so great is when lots of HIGH volume feeds end
up on a particular real-server backend of the cluster in question. The
situation for you might be worse than it would be for us because we have on
the order of 8 active real-server backends in the idd.unidata.ucar.edu
cluster, so the load tends to get distributed more evenly. That being said,
we have seen, and continue to see, greater loads on the real-server backends
that are providing relatively more HIGH volume feeds to downstreams.
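
For reference, here is a minimal sketch of what a least-connection LVS setup
for the LDM port (388) typically looks like when configured directly with
ipvsadm. The virtual IP is a placeholder and the use of direct routing is an
assumption on my part, so your director's actual configuration may well differ:

  # on the director: define the virtual service with the least-connection scheduler
  ipvsadm -A -t <frontend-VIP>:388 -s lc
  # add the two real-server backends (direct routing assumed here)
  ipvsadm -a -t <frontend-VIP>:388 -r idd1.aos.wisc.edu:388 -g
  ipvsadm -a -t <frontend-VIP>:388 -r idd2.aos.wisc.edu:388 -g
  # list the virtual service and per-backend connection counts
  ipvsadm -L -n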
The other factor that comes into play is that _all_ feeds to a downstream
site will be forced to be served from the same real-server backend of
your/our clusters. The REQUEST that UWM is making is for a LOT of data in a
LOT of products. To get a better idea of the volumes and numbers of products
being REQUESTed, consider the cumulative volume summary listings from the
real-server backends of your IDD relay cluster:
idd1.aos.wisc.edu [6.13.12]

Data Volume Summary for idd1.aos.wisc.edu

Maximum hourly volume      114987.562 M bytes/hour
Average hourly volume       67995.993 M bytes/hour
Average products per hour      463229 prods/hour

Feed                 Average                 Maximum     Products
               (M byte/hour)           (M byte/hour)  number/hour
SATELLITE    15461.342  [ 22.739%]   20256.913      6706.804
CONDUIT      14420.399  [ 21.208%]   50165.255     92519.652
NIMAGE        8462.043  [ 12.445%]   11787.471      6268.783
NGRID         6866.707  [ 10.099%]   11659.715     64266.435
NOTHER        6810.090  [ 10.015%]    9749.042     12111.196
NEXRAD2       6212.300  [  9.136%]    7841.242     88200.109
HDS           3792.085  [  5.577%]    9463.515     38592.370
FNMOC         1785.414  [  2.626%]    5442.408      7152.935
NEXRAD3       1687.847  [  2.482%]    2090.884     88206.174
UNIWISC        947.133  [  1.393%]    1069.297       922.739
GEM            694.456  [  1.021%]    3023.284      4179.413
FSL2           656.582  [  0.966%]     739.354       138.261
FNEXRAD        112.432  [  0.165%]     143.094       105.652
IDS|DDPLUS      87.162  [  0.128%]      97.570     53858.630
idd2.aos.wisc.edu

Data Volume Summary for idd2.aos.wisc.edu

Maximum hourly volume      114987.612 M bytes/hour
Average hourly volume       68022.282 M bytes/hour
Average products per hour      463581 prods/hour

Feed                 Average                 Maximum     Products
               (M byte/hour)           (M byte/hour)  number/hour
SATELLITE    15463.222  [ 22.733%]   20256.913      6709.152
CONDUIT      14421.820  [ 21.202%]   50165.255     92586.565
NIMAGE        8463.912  [ 12.443%]   11787.471      6270.543
NGRID         6872.471  [ 10.103%]   11659.715     64374.870
NOTHER        6813.772  [ 10.017%]    9749.042     12116.065
NEXRAD2       6215.027  [  9.137%]    7841.242     88237.435
HDS           3793.737  [  5.577%]    9463.515     38614.261
FNMOC         1785.296  [  2.625%]    5442.408      7152.348
NEXRAD3       1689.094  [  2.483%]    2090.884     88266.500
UNIWISC        947.313  [  1.393%]    1069.297       923.000
GEM            699.810  [  1.029%]    3023.284      4211.391
FSL2           657.203  [  0.966%]     739.354       138.391
FNEXRAD        112.432  [  0.165%]     143.094       105.652
IDS|DDPLUS      87.173  [  0.128%]      97.570     53874.804
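
As a rough sense of scale (my back-of-the-envelope arithmetic, not part of
the listings above, taking "M bytes" as 10^6 bytes): an average of about
68,000 M bytes/hour is 68,000/3600, or roughly 19 M bytes/second, i.e. about
150 Mbps of sustained output per backend, and the ~115,000 M bytes/hour peaks
work out to roughly 255 Mbps.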
Aside: you may well wonder why splitting feed REQUESTs is so beneficial.
The answer is twofold: first, the TCP collision-detection/backoff mechanism
allows for greater throughput when the set of data being sent is divided
into several mutually exclusive subsets, each on its own connection.
Second, it is more efficient for multiple 'ldmd' invocations to bring in
the data than for a single instance to do so. The biggest reason to split,
however, is the way that TCP works; the LDM considerations are secondary.
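
As an illustration (my addition, assuming the LDM runs as the 'ldm' user):
once the REQUESTs are split, each REQUEST line in ldmd.conf is handled by its
own downstream ldmd process over its own TCP connection to the upstream,
which can be roughly spot-checked with something like:

  # expect one downstream ldmd process per REQUEST line
  ps -fu ldm | grep ldmd
  # and one established TCP connection per REQUEST to the upstream's LDM port (388)
  netstat -tn | grep ':388'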
Here is the original single REQUEST line that I saw in earlier exchanges
(it would appear as one line in ldmd.conf):

REQUEST NEXRAD2|NEXRAD3|NPORT|NIMAGE|FNEXRAD|CONDUIT|FSL5|FSL4|FSL3|FSL2|UNIDATA ".*" idd.aos.wisc.edu
Comments:
- this is a VERY large amount of data to receive in a single REQUEST
- FSL3, FSL4 and FSL5 can be removed from the REQUEST since the UWisc/AOS
  IDD cluster does not relay these feeds
- NPORT is a compound feed type that is the union of the NTEXT, NGRID,
  NPOINT, NGRAPH and NOTHER feeds. Of these feed types, the only ones that
  actually have data are NGRID and NOTHER.
  On top of this, the _great_ majority of the products in NOTHER are
  available in NIMAGE, but in a "preprocessed" form. The tiles that comprise
  full images for the FullDisk and CONUS coverages are stitched together
  here at Unidata, and the resultant full scenes are distributed in NIMAGE.
  All of the other Level 2 products in NOTHER and HDS are also sent in
  NIMAGE, but with the NOAAPort broadcast header and trailer stripped off.
  In short, all of the products sent in NIMAGE are directly usable netCDF4
  files, while the ones in NOTHER have to be processed to extract the
  netCDF4 files that contain the data.
- in addition to splitting the CONDUIT REQUEST out of the single REQUEST and
  dividing it N ways, I would split the remaining REQUEST up even further
Here is what I recommend that Clark use:
REQUEST NEXRAD2                ".*"    idd.aos.wisc.edu
REQUEST NEXRAD3                ".*"    idd.aos.wisc.edu
REQUEST NGRID                  ".*"    idd.aos.wisc.edu
REQUEST NIMAGE                 ".*"    idd.aos.wisc.edu
REQUEST FNEXRAD|FSL2|UNIDATA   ".*"    idd.aos.wisc.edu
REQUEST CONDUIT                "[0]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[1]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[2]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[3]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[4]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[5]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[6]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[7]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[8]$"  idd.aos.wisc.edu
REQUEST CONDUIT                "[9]$"  idd.aos.wisc.edu
If the few remaining products that are available in NOTHER are
desired, I can figure out what extended regular expression should
be used to limit the NOTHER REQUEST.
If the loads on Pete's two-machine relay cluster remain very high (a load of
14 might be considered very high, depending on the number of CPUs the machine
has), then I would move some of the REQUESTs to different IDD relays such as
Penn State (idd.meteo.psu.edu) or Unidata (idd.unidata.ucar.edu).
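
For example (my illustration, not a directive), a high-volume feed like
NEXRAD2 could either be moved wholesale to another relay, or REQUESTed
redundantly from two relays; the LDM discards products it has already
received, so redundant REQUESTs cost little and add resilience:

  # move the feed entirely to another relay ...
  REQUEST NEXRAD2 ".*" idd.unidata.ucar.edu
  # ... or REQUEST it redundantly from two relays (duplicates are rejected)
  REQUEST NEXRAD2 ".*" idd.aos.wisc.edu
  REQUEST NEXRAD2 ".*" idd.meteo.psu.edu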
Please let us know if the above was written well enough to be understandable!
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: NME-796833
Department: Support IDD
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.