This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Justin,

The latencies are still looking bad using ldm2. This has led to the continuation of large numbers of missing fields (identical on both receiving machines, NCAR and Unidata) from all model cycles over the weekend, continuing this morning.

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+idd.unidata.ucar.edu+LOG
Doug

On Jun 18, 2007, at 6:32 AM, Justin Cooke wrote:
You're right Chi, I misread the graph, we are on ldm2 and have been since Saturday. My apologies.

Justin

Chi Y Kang wrote:

Justin Cooke wrote:

Chi,

It looks like you switched us back to ldm1 on Saturday but according to the graphs from Steve they experienced the same delays.

Running on ldm2 right now. It looks like the send-Q on our end seems to be okay. All of these connections are going via I2, let me see what rate limit there is set on the I2 connection coming out of the campus.

Justin

Steve Chiswell wrote:

Chi & Justin,

The latency of data today has been high like yesterday, even with the switch to ldm2. The throughput looks restricted either by a router or firewall/packet shaping, but I was wondering if coincident with Justin's restart was that the connections had to be re-established, so changes took effect at that time.

Thanks for all your efforts,

Steve Chiswell
Unidata User Support

On Fri, 15 Jun 2007, Chi Y Kang wrote:

Wait a minute here, 128.117.140.208 isn't in the mix. The other hosts are.

I updated the LDM access list. Should we just have some class C ranges to have access rather than one IP at a time?

Also, I noticed that the send-Qs are pretty normal on the ldm2 server right now but were pretty high on ldm1. Might be just an issue with the ACL list.

128.117.12.2
128.117.12.3
128.117.130.220
128.117.140.208
128.117.140.220
128.117.149.220
128.117.156.220
128.174.80.16
128.174.80.47
140.90.193.19
140.90.193.227
140.90.193.228
140.90.193.99
140.90.226.201
140.90.226.202
140.90.226.203
140.90.226.204
140.90.37.12
140.90.37.13
140.90.37.15
140.90.37.16
140.90.37.40
144.92.130.88
144.92.131.244
150.9.117.128
192.12.209.57
192.58.3.194
192.58.3.195
192.58.3.196
192.58.3.197
193.61.196.74
198.181.231.53
208.64.117.128

Justin Cooke wrote:

Chi,

The reboot doesn't seem to have helped. Is there anything else that may be causing these issues? Network related after I performed the restart of LDM?
Steve has a few possibilities:

"It seems to be network related at your end, but strange that it occurred at the time when you restarted the LDM, unless there was some sort of firewall or packet filter that occurred when the LDMs re-connected."

Justin

Steve Chiswell wrote:

Justin,

I haven't seen any improvement from ncepldm to the top level relays daffy.unidata.ucar.edu (Unidata), idd.aos.wisc.edu (U. Wisconsin), flood.atmos.uiuc.edu (U. Illinois) or atm.cise-nsf.gov (NSF, DC).

It seems to be network related at your end, but strange that it occurred at the time when you restarted the LDM, unless there was some sort of firewall or packet filter that occurred when the LDMs re-connected.

Thanks for your time in looking at this,

Steve

On Fri, 2007-06-15 at 15:31 -0400, Justin Cooke wrote:

Steve and Doug,

I just got a call from Chi at the WOC, he rebooted LDM1 after noticing an unusual load on the machine. LDM is again running on that box and it remains primary, can you check to see how the latencies are now?

Thanks,

Justin

Doug Schuster wrote:

Justin,

28,079 products are missing from the 12Z cycle. You'll be getting the automated email shortly.

-Doug

On Jun 15, 2007, at 12:48 PM, Justin Cooke wrote:

Steve,

I've turned off the feed to LDM2. There is no other load on the ldm1 system except for LDM.

Doug, are you missing many of the TIGGE params for 12Z?

Justin

Steve Chiswell wrote:

Justin,

That didn't change the behavior. Still seeing latency. Perhaps turning off the other feed. Is there any load other than LDM on the system?
Steve

On Fri, 2007-06-15 at 12:56 -0400, Justin Cooke wrote:

Steve,

I've recreated the queue, let me know if you are still seeing issues. If so I'll turn off the feed to ldm2 to see if that corrects things.

Justin

Steve Chiswell wrote:

Justin,

I don't know if they saw a disk space problem with log files not being rotated, but it might just be best today to build a new queue:

    ldmadmin stop
    ldmadmin delqueue
    ldmadmin mkqueue
    ldmadmin start

That will mean some queued data would be lost, but if users aren't getting it anyway, then it's best to ensure that the queue isn't corrupt for the weekend.

Happy Friday....

Thanks,

Steve

On Fri, 2007-06-15 at 12:13 -0400, Justin Cooke wrote:

Steve,

Our logs on the primary ldm system "ldm1" had not rotated for nearly a week. I sent email to the WOC support and this was the response:

Looks like the seed file was missing after we brought the system back up from the last outage. Should be good now.

Justin Cooke wrote:

WOC,

I noticed that our logs for LDM have not been rotated on machine ldm1 since 06/05/2007. We have a cron entry that runs "ldmadmin newlog" at 00Z every day.

I attempted to run the command by hand and got the following back:

    ldm@ldm1:~$ bin/ldmadmin newlog
    hupsyslog: couldn't open /var/run/syslogd.pid

I checked but /var/run/syslogd.pid is not there but it is on ldm2.

Could there be a problem with syslogd on ldm1?

Justin

Also around that time I turned on our backup feed to the ldm2 system which had been off since that system had issues a few weeks ago (we were asked by WOC to turn it back on). I have sent email to their support group asking if both ldm1 and ldm2 are responding to the ncepldm.woc.noaa.gov address or if something else is going on.
Justin

Steve Chiswell wrote:

Justin,

Yesterday just after 18Z, the data flow from ncepldm.woc.noaa.gov to top level sites at NSF and Unidata both began showing high latency:

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+atm.cise-nsf.gov

and

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+daffy.unidata.ucar.edu

Data volume out has dropped as a result:

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+atm.cise-nsf.gov

Since the behavior is similar at both sites at separate locations, the problem would appear to be near your end. Since that coincides with your restart of the LDM, could you fill me in on the issues you were experiencing?

Thanks

Steve Chiswell
Unidata User Support

On Fri, 2007-06-15 at 11:38 -0400, Justin Cooke wrote:

Doug,

I had to restart our LDM yesterday right before the 18Z cycle, we had an issue with our logging but none of the configuration files changed. Could one of your feeds have lost the connection to our LDM during that restart?

Justin

Douglas Schuster wrote:

Yes, we've received partial cycles. More than half of the expected fields have been missing in each cycle from June 14, 18Z to June 15, 06Z. The number of missing fields varies between each cycle.

Doug

On Jun 15, 2007, at 9:11 AM, Justin Cooke wrote:

Doug,

Have you received any GEFS data from us today? Or is it just certain fields you are missing?

Justin

--
Chi Y. Kang
Contractor
Principal Engineer
Phone: 301-713-3333 x201
Cell: 240-338-1059
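
Chi's question in the thread above, whether to allow class C ranges rather than one IP at a time, can be explored with a short Python sketch. It groups the addresses quoted in the access list by their enclosing /24 network, which shows where a range would actually replace several entries. The names `ALLOWED_HOSTS` and `group_by_slash24` are hypothetical and exist only for this illustration; the sketch assumes "class C" means a /24 prefix and does not touch any LDM configuration.

```python
import ipaddress
from collections import defaultdict

# Addresses copied from the access list quoted in the thread.
ALLOWED_HOSTS = """
128.117.12.2 128.117.12.3 128.117.130.220 128.117.140.208
128.117.140.220 128.117.149.220 128.117.156.220 128.174.80.16
128.174.80.47 140.90.193.19 140.90.193.227 140.90.193.228
140.90.193.99 140.90.226.201 140.90.226.202 140.90.226.203
140.90.226.204 140.90.37.12 140.90.37.13 140.90.37.15
140.90.37.16 140.90.37.40 144.92.130.88 144.92.131.244
150.9.117.128 192.12.209.57 192.58.3.194 192.58.3.195
192.58.3.196 192.58.3.197 193.61.196.74 198.181.231.53
208.64.117.128
""".split()


def group_by_slash24(addresses):
    """Map each enclosing /24 network to the listed addresses inside it."""
    groups = defaultdict(list)
    for text in addresses:
        addr = ipaddress.ip_address(text)
        # strict=False lets a host address stand in for its network.
        net = ipaddress.ip_network(f"{addr}/24", strict=False)
        groups[net].append(addr)
    return dict(groups)


if __name__ == "__main__":
    groups = group_by_slash24(ALLOWED_HOSTS)
    # Print the networks with the most listed hosts first.
    for net, hosts in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        print(f"{net}: {len(hosts)} listed host(s)")
```

Note the trade-off: a /24 entry admits every host in that range, not only the ones listed, so a range is broader than the explicit per-IP list and is mainly worthwhile where several listed hosts share a network.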