This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Justin,

The current feeds should be:

  Illinois connected to ncepldm=ldm1
  Wisconsin connected to ldm2
  NSF with a primary request to ldm2 and an alternate request to ldm1

I believe that all of those hosts use a five-way split in the request of
data (e.g., each request line asks for 20% of the data). Are you able to
use "netstat" to view the number of connections?

Steve

On Wed, 2007-06-20 at 13:10 -0400, Justin Cooke wrote:
> Steve,
>
> That's great that you're able to see our stats.
>
> I'm on a conference call right now with Chi and people from NCEP, and
> the question came up of how many request feeds you have to our LDM
> server.
>
> Justin
>
> On Jun 20, 2007, at 12:54 PM, Steve Chiswell wrote:
>
> > Justin,
> >
> > I am receiving the stats from node6:
> >
> > Latency:
> > http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+node6.woc.noaa.gov
> > Volume:
> > http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+node6.woc.noaa.gov
> >
> > The latency there to ldm1 is climbing on the initial connection, and
> > will start off by catching up on the last hour's worth of data in
> > the upstream queue. After that, we can see what the latency is
> > doing.
> >
> > Steve
> >
> > On Wed, 2007-06-20 at 12:43 -0400, Justin Cooke wrote:
> >> Steve and Chi,
> >>
> >> I tried to ping rtstats.unidata.ucar.edu but was unable to.
> >>
> >> Chi, would you be able to set up a static route from node6 to
> >> rtstats.unidata.ucar.edu like Steve mentions?
> >>
> >> I actually am unable to connect to ncepldm.woc.noaa.gov either.
> >> However, I did set up a feed to "ldm1" and am receiving CONDUIT
> >> data currently.
> >>
> >> Steve, how tough would it be to do the pqact step you mention, and
> >> to get the stats reports from those, if Chi is unable to get the
> >> static route going?
> >>
> >> Thanks for all the help,
> >>
> >> Justin
> >>
> >> On Jun 20, 2007, at 12:16 PM, Steve Chiswell wrote:
> >>
> >>> Justin,
> >>>
> >>> Is that box capable of sending stats to our
> >>> rtstats.unidata.ucar.edu host? That is, is it allowed to connect
> >>> outside your domain?
> >>>
> >>> The LDM won't need to run pqact to test out the throughput and
> >>> network, but will need these ldmd.conf lines:
> >>>
> >>> EXEC "rtstats -h rtstats.unidata.ucar.edu"
> >>> request CONDUIT ".*" ncepldm.woc.noaa.gov
> >>>
> >>> The pqact EXEC action can be commented out. The request line will
> >>> start the feed from ncepldm, which flood.atmos.uiuc.edu is
> >>> pointing to and which is showing high latency. If you are able to
> >>> feed from ncepldm without the latency that outside hosts are
> >>> showing, then it would isolate the problem further, to the border
> >>> of your network with the outside. If you do show similar latency,
> >>> then it would be either the LDM configuration itself or the local
> >>> router that the machines are on.
> >>>
> >>> If you are able to send rtstats out to us, then we can monitor the
> >>> stats on our web pages. Your network might require a static route
> >>> to be added for sending that outside your domain (that would be
> >>> something your networking folks would know). rtstats sends a small
> >>> text report about every 60 seconds, so not a lot of traffic.
> >>>
> >>> If you can't configure your host to send rtstats, then we could
> >>> create a pqact.conf action to file the .status reports and
> >>> calculate the latency from those.
> >>>
> >>> Thanks,
> >>>
> >>> Steve
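For reference, the throughput test Steve describes needs only a minimal
ldmd.conf. A sketch under the assumptions in this thread (the two lines
quoted above, with everything else left at the LDM defaults):

    # Report real-time statistics to Unidata so the feed shows up on
    # the rtstats pages linked earlier in the thread.
    EXEC    "rtstats -h rtstats.unidata.ucar.edu"
    # Request the full CONDUIT feed from the NCEP relay under test.
    request CONDUIT ".*" ncepldm.woc.noaa.gov
    # pqact is not needed for a pure throughput test:
    # EXEC "pqact"

If rtstats cannot reach outside the domain, the fallback Steve mentions
would be a pqact.conf entry that files the CONDUIT .status products so
latencies can be computed from their timestamps afterward. A hypothetical
entry (feedtype and pattern are tab-separated on the first line, the
action line begins with a tab, and the output path is illustrative):

    CONDUIT	^\.status
    	FILE	data/conduit_status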
> >>> On Wed, 2007-06-20 at 12:03 -0400, Justin Cooke wrote:
> >>>> Steve,
> >>>>
> >>>> If you provide us a pqact.conf, I can have the box Chi set up
> >>>> feed off of ldm1 and see how its latencies are.
> >>>>
> >>>> Justin
> >>>>
> >>>> On Jun 20, 2007, at 11:36 AM, Steve Chiswell wrote:
> >>>>
> >>>>> Justin,
> >>>>>
> >>>>> Since the change at 13Z, dropping daffy.unidata.ucar.edu out of
> >>>>> the top-level nodes, the ldm2 feed to NSF is showing little/no
> >>>>> latency at all. The ldm1 feed to NSF, which is connected using
> >>>>> the alternate LDM mode, is only delivering the .status messages
> >>>>> it creates, as all the other products are duplicates of products
> >>>>> already being received from ldm2, and that is showing high
> >>>>> latency:
> >>>>> http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+atm.cise-nsf.gov
> >>>>>
> >>>>> This configuration is getting data out to the community at the
> >>>>> moment. The downside here is that it puts a single point of
> >>>>> failure at NSF in getting the data to Unidata, but I'll monitor
> >>>>> that end.
> >>>>>
> >>>>> It seems that ldm1 is either slow, or it is showing network
> >>>>> limitations (since flood.atmos.uiuc.edu is feeding from ncepldm,
> >>>>> which is apparently pointing to ldm1, there is load on ldm1
> >>>>> besides the NSF feed). ldm2 is feeding both NSF and
> >>>>> idd.aos.wisc.edu (and Wisconsin looks good since 13Z as well),
> >>>>> so it is able to handle the throughput to two downstreams, but
> >>>>> adding daffy as the third seems to cross some point in the
> >>>>> volume of what can be sent out.
> >>>>>
> >>>>> Steve
> >>>>>
> >>>>> On Wed, 2007-06-20 at 09:45 -0400, Justin Cooke wrote:
> >>>>>> Thanks Steve,
> >>>>>>
> >>>>>> Chi has set up a box on the LAN for us to run LDM on; I am
> >>>>>> beginning to get things running on there.
> >>>>>>
> >>>>>> Have you seen any improvement since dropping daffy?
> >>>>>>
> >>>>>> Justin
> >>>>>>
> >>>>>> On Jun 20, 2007, at 9:03 AM, Steve Chiswell wrote:
> >>>>>>
> >>>>>>> Justin,
> >>>>>>>
> >>>>>>> Yes, this does appear to be the case. I will drop daffy from
> >>>>>>> feeding directly and instead move it to feed from NSF. That
> >>>>>>> will remove one of the top-level relays of data having to go
> >>>>>>> out of NCEP, and we can see if the other nodes show an
> >>>>>>> improvement.
> >>>>>>>
> >>>>>>> Steve
> >>>>>>>
> >>>>>>> On Wed, 20 Jun 2007, Justin Cooke wrote:
> >>>>>>>
> >>>>>>>> Steve,
> >>>>>>>>
> >>>>>>>> Did you see a slowdown to ldm2 after Pete and the other sites
> >>>>>>>> began making connections?
> >>>>>>>>
> >>>>>>>> Chi, considering Steve saw a good connection to ldm1 before
> >>>>>>>> the other sites connected, doesn't that point toward a
> >>>>>>>> network issue?
> >>>>>>>>
> >>>>>>>> All of our queue processing on the diskserver has been
> >>>>>>>> running without any problems, so I don't believe anything on
> >>>>>>>> that system would be impacting ldm1/ldm2.
> >>>>>>>>
> >>>>>>>> Justin
> >>>>>>>>
> >>>>>>>> On Jun 20, 2007, at 12:04 AM, Chi Y Kang wrote:
> >>>>>>>>
> >>>>>>>>> I set up the test LDM server for the NCEP folks to test the
> >>>>>>>>> local pull from the LDM servers. That should tell us whether
> >>>>>>>>> this is a network or a system related issue. We'll handle
> >>>>>>>>> that tomorrow. I am a little bit concerned that the slowdown
> >>>>>>>>> all occurred at the same time as the ldm1 crash last week.
> >>>>>>>>>
> >>>>>>>>> Also, can NCEP check if there are any bad dbnet queues on
> >>>>>>>>> the backend servers? Just to verify.
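Two details in this exchange are easy to miss. The "alternate LDM mode"
Steve mentions is not a separate configuration: when the same data is
requested from two upstream hosts, the LDM runs one connection in primary
mode and the other in alternate mode, and the product queue discards
whatever arrives second, which is why the alternate ldm1 feed delivers
only its own .status messages. The five-way split mentioned at the top of
the thread is done by pattern-matching the sequence number that ends each
CONDUIT product identifier. A sketch of what the NSF request lines might
look like (hostnames from the thread; the bracket patterns are the ones
conventionally used for a five-way CONDUIT split, each matching products
whose sequence number ends in one of two digits, i.e., about 20% of the
feed):

    # Five-way split of the CONDUIT feed from the primary upstream...
    request CONDUIT "[09]$" ldm2.woc.noaa.gov
    request CONDUIT "[18]$" ldm2.woc.noaa.gov
    request CONDUIT "[27]$" ldm2.woc.noaa.gov
    request CONDUIT "[36]$" ldm2.woc.noaa.gov
    request CONDUIT "[45]$" ldm2.woc.noaa.gov
    # ...and the same five lines again for the alternate upstream,
    # ldm1.woc.noaa.gov.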
> >>>>>>>>> Steve Chiswell wrote:
> >>>>>>>>>> Thanks Justin,
> >>>>>>>>>>
> >>>>>>>>>> I also had a typo in my message: ldm1 is running slower
> >>>>>>>>>> than ldm2.
> >>>>>>>>>>
> >>>>>>>>>> Now if the feed to ldm2 all of a sudden slows down when
> >>>>>>>>>> Pete and other sites add a request to it, it would really
> >>>>>>>>>> signal some sort of total bandwidth limitation on the I2
> >>>>>>>>>> connection. It seemed a little coincidental that we had a
> >>>>>>>>>> short period of good connectivity to ldm1, after which it
> >>>>>>>>>> slowed way down.
> >>>>>>>>>>
> >>>>>>>>>> Steve
> >>>>>>>>>>
> >>>>>>>>>> On Tue, 2007-06-19 at 17:01 -0400, Justin Cooke wrote:
> >>>>>>>>>>> I just realized the issue. When I disabled the "pqact"
> >>>>>>>>>>> process on ldm2 earlier today, it caused our monitor
> >>>>>>>>>>> script (in cron, every 5 min) to kill the LDM and restart
> >>>>>>>>>>> it. I have removed the check for the pqact in that
> >>>>>>>>>>> monitor... things should be a bit better now.
> >>>>>>>>>>>
> >>>>>>>>>>> Chi.Y.Kang wrote:
> >>>>>>>>>>>> Huh, I thought you guys were on the system. Let me take
> >>>>>>>>>>>> a look on ldm2 and see what is going on.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Justin Cooke wrote:
> >>>>>>>>>>>>> Chi.Y.Kang wrote:
> >>>>>>>>>>>>>> Steve Chiswell wrote:
> >>>>>>>>>>>>>>> Pete and David,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I changed the CONDUIT request lines at NSF and Unidata
> >>>>>>>>>>>>>>> to request data from ldm1.woc.noaa.gov rather than
> >>>>>>>>>>>>>>> ncepldm.woc.noaa.gov after seeing lots of
> >>>>>>>>>>>>>>> disconnects/reconnects to the ncepldm virtual name.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The LDM appears to have caught up here as an interim
> >>>>>>>>>>>>>>> solution.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Still don't know the cause of the problem.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Steve
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I know NCEP was stopping and starting the LDM service
> >>>>>>>>>>>>>> on the ldm2 box, where the VIP address is pointed at
> >>>>>>>>>>>>>> this time. How is the current connection to ldm1? Is
> >>>>>>>>>>>>>> the speed of the CONDUIT feed acceptable?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> Chi, NCEP has not restarted the LDM on ldm2 at all
> >>>>>>>>>>>>> today. But looking at the logs it appears to be dying
> >>>>>>>>>>>>> and getting restarted by cron.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I will watch and see if I see anything.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Justin
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Chi Y. Kang
> >>>>>>>>> Contractor
> >>>>>>>>> Principal Engineer
> >>>>>>>>> Phone: 301-713-3333 x201
> >>>>>>>>> Cell: 240-338-1059
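Justin's fix above is worth generalizing: a cron monitor should test
whether the LDM server itself is down, not whether an optional child such
as pqact happens to be running. A minimal sketch, assuming the ldmadmin
"isrunning" subcommand (its exit status indicates whether the server is
up) and a five-minute cron slot:

    #!/bin/sh
    # Hypothetical LDM monitor: restart only when the server is down,
    # never because pqact or another child process is absent.
    if ! ldmadmin isrunning >/dev/null 2>&1; then
        ldmadmin restart
    fi

As for Steve's "netstat" question at the top of the thread, on a Linux
host the established connections into the LDM port (388) could be counted
with something like:

    netstat -tn | awk '$6 == "ESTABLISHED" && $4 ~ /:388$/' | wc -l

Swapping $4 (the local address) for $5 (the foreign address) would count
outbound feed requests instead.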
> >>>>> --
> >>>>> Steve Chiswell <address@hidden>
> >>>>> Unidata

> >>> --
> >>> Steve Chiswell <address@hidden>
> >>> Unidata

> > --
> > Steve Chiswell <address@hidden>
> > Unidata

--
Steve Chiswell <address@hidden>
Unidata