This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD

Bob,

>As we have switched to 'event mode' with the hurricane in the gulf, I
>have had to drop the network investigation.  Today is out, and at least
>part of tomorrow.  Also, I have lost email over the last 4 days.
>Please resend the emails you sent from Friday on.
>
>Speaking of hurricanes, our computer "Hurricane" died.  I am in the
>process of rebuilding it with Gempak to bring to the Office of Emergency
>Preparedness.  Murphy's Law... if it can fail, it will!

The most important thing I asked for in the email I sent since last
Friday was for you to contact the telecomm folks at LSU and/or LANET to
see what they possibly did over the weekend to first make the HDS
latencies from seistan to zero.unidata.ucar.edu drop significantly
starting on Friday evening, AND then to rise back starting on Sunday
afternoon.  Whatever was done holds the information for finally closing
out the feed problems being experienced by sites downstream of LSU.

Tom

Here are all of the messages I sent to you since last Friday morning:

From address@hidden Sun Jun 29 20:06:45 2003
To: address@hidden
cc: address@hidden, Kevin Robbins <address@hidden>
Subject: 20030628: HDS feed to/from seistan (cont.)

>From: Unidata Support <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD

Bob,

Well, after most of a weekend of pretty good HDS latencies from seistan
to zero.unidata.ucar.edu, the feed problems reappeared.  This can be seen
in the 'latency' plot from the real time statistics page:

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?HDS+zero.unidata.ucar.edu

The questions now are:

- what changed at LSU/LANET on Saturday to make the latencies drop to
  near what they should be
- what changed at LSU/LANET on Sunday afternoon to make the latencies
  climb to their previous bad levels

I think a call to the LSU telecomm folks is in order.  If you can't get
anywhere with them (please try, you should have more clout with them
than us), can you send along their contact information?

From address@hidden Sat Jun 28 07:53:30 2003
To: address@hidden
cc: address@hidden, Kevin Robbins <address@hidden>
Subject: 20030627: HDS feed to/from seistan (cont.)

>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD

Hi Bob,

>Take a look at the following two cases.  Notice the LSU to ULM hop is
>via the network address translation firewall: dynip422.nat.ulm.edu
>(line 6).  Interestingly, ULM does not pass through the same
>NAT/firewall process.  I believe this could offer a clue.
>The traceroute report is missing the last 3 hops, and until the
>firewall at ULM is opened to allow you to ping tornado we will not have
>a complete picture.

I don't think that this has anything to do with the feed problems we
were seeing from LSU to others.  It only explains the inability to do
complete traceroutes to ULM.  This is/was not part of the feed problems
we have been seeing.

>Somehow two different paths are connecting ULM.  And this suggests a
>reason why it takes more time to issue packets from Seistan to Tornado.

It does not explain the asymmetry in feed to/from UCAR.  ULM has been
out of the picture as far as high volume data feeds from seistan for
well over a week now.  Ever since I switched them to feed from CU/CIRES
(rainbow.al.noaa.gov), their HDS latencies have been at or very near
zero.
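For reference, a switch like that is normally just a change to the
request line in the downstream host's ldmd.conf, together with a
matching allow entry on the new upstream.  The fragment below is only an
illustrative sketch; the actual entries used at ULM and CU/CIRES are not
shown in this exchange:

    # downstream (tornado.geos.ulm.edu) ldmd.conf -- illustrative only
    # old upstream, commented out:
    # request HDS ".*" seistan.srcc.lsu.edu
    # new upstream at CU/CIRES:
    request HDS ".*" rainbow.al.noaa.gov

    # matching entry in the upstream (rainbow.al.noaa.gov) ldmd.conf:
    allow   HDS ^tornado\.geos\.ulm\.edu$

The LDM on each host has to be restarted for changes like these to take
effect.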
Now, back to the problem at hand.  Something significant changed last
night:

- the HDS latencies from seistan to zero.unidata.ucar.edu dropped to
  near zero after a spike at around 7Z
- for the first time since setting up the feed test from
  emo.unidata.ucar.edu to seistan and then back out to
  zero.unidata.ucar.edu, all HDS data was relayed from seistan to
  zero.unidata.ucar.edu
- latencies for all feeds from seistan to tornado.geos.ulm.edu (e.g.,
  FSL2, IDS|DDPLUS, UNIWISC, and NNEXRAD) dropped significantly

Given these three observations from the real time statistics page:

http://www.unidata.ucar.edu/staff/chiz/rtstats/siteindex.shtml

for seistan.srcc.lsu.edu, zero.unidata.ucar.edu, and
tornado.geos.ulm.edu, I conclude that something changed in the network
path out of LSU or in LANET.  Did you receive a change notification from
the LSU telecomm folks?  If not, will you contact them to find out
exactly what was done?  A complete picture of what went wrong and its
fix will help others if they run into similar problems.
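As an aside, the numbers behind those latency plots can be pulled
straight from the iddstats_nc CGI referenced earlier.  The short Python
sketch below does that for one feed/host pair; it assumes, purely for
illustration, that the CGI returns plain-text rows whose last
whitespace-separated field is the latency in seconds -- that format is
an assumption, not a documented interface, so check the actual output
before relying on it:

    import urllib.request

    # Hypothetical check: fetch the latency series a host reports for a
    # feed and print the worst value seen.  The row format assumed here
    # (last field = latency in seconds) is for illustration only.
    URL = ("http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc"
           "?HDS+zero.unidata.ucar.edu")

    def worst_latency(url):
        worst = 0.0
        with urllib.request.urlopen(url) as resp:
            for raw in resp:
                fields = raw.decode("ascii", "replace").split()
                if not fields:
                    continue                # skip blank lines
                try:
                    worst = max(worst, float(fields[-1]))
                except ValueError:
                    continue                # skip header/non-numeric rows
        return worst

    if __name__ == "__main__":
        print("worst reported HDS latency (s):", worst_latency(URL))

If the CGI's output format differs, only the per-row parsing needs to
change.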
From address@hidden Fri Jun 27 12:11:39 2003
To: address@hidden
cc: address@hidden, Kevin Robbins <address@hidden>
Subject: 20030627: HDS feed to/from seistan (cont.)

>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD

Hi Bob,

re: ULM rerouted their traffic from I2 to "I1"

>I did not know this happened, but it explains why ULM is able to
>communicate with rainbow.al.noaa.gov.

The ULM folks told us that during a total outage at LSU at some point in
the past they fed from thelma.ucar.edu and experienced no problems.
This predated either your or ULM's upgrade to LDM-6 by quite a bit.
Here is a portion of the original note we received on the problems ULM
was having feeding from srcc.lsu.edu:

"For more than a year, we have been having serious data feed problems
when our upstream site is at LSU (sirocco).  We have tried everything
that we can, including contacting LSU repeatedly, but cannot seem to
resolve the situation satisfactorily.  We have worked extensively with
our network people and believe that the problem is at LSU.  We are
basing this conclusion on the fact that, while sirocco was down and we
were feeding from Unidata's thelma machine, everything was fine.  We
received all data without significant losses.  However, once sirocco
came on-line again and we switched over to them, we began to experience
substantial losses of data.  Our fallback site is OU's stokes machine
and we have used them in the past, but they are feeding so many sites
that we tend to fall significantly behind in the data feed.  Can you
help us resolve this problem?"

>It would be interesting to also force an I1 connection to LSU and
>repeat the test.

I agree, running feed tests using a different route to/from LSU would
certainly be welcome.

re: "I1"

>Internet one?

That is what we asked.

>A better question in this case is, what is I2 in the context of the
>LANET SONET connecting ULM to LANET?

Here is the route from ULM to seistan.srcc.lsu.edu:

Matt's traceroute  [v0.49]
tornado.geos.ulm.edu                              Fri Jun 27 10:56:14 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                         Packets              Pings
 Hostname                            %Loss  Rcv  Snt  Last  Best  Avg  Worst
 1. 10.16.0.1                           0%   18   18     1     1    1      1
 2. 10.1.1.1                            0%   18   18     0     0    0      1
 3. 198.232.231.1                       0%   18   18     0     0    0      1
 4. laNoc-ulm.LEARN.la.net              0%   17   17    13    13   19     76
 5. lsubr-laNoc.LEARN.la.net            0%   17   17    14    14   15     26
 6. howe-e241a-4006-dsw-1.g1.lsu.edu    0%   17   17    18    15   22     50
 7. seistan.srcc.lsu.edu                0%   17   17    15    14   19     42

This can be compared with LSU's route from seistan to
tornado.geos.ulm.edu:

Matt's traceroute  [v0.49]
seistan.srcc.lsu.edu                              Fri Jun 27 10:58:56 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                         Packets              Pings
 Hostname                            %Loss  Rcv  Snt  Last  Best  Avg  Worst
 1. 130.39.188.1                        0%   11   11     4     1    2      5
 2. lsubr1-118-6509-dsw-1.g2.lsu.edu    0%   11   11     1     0    1      1
 3. laNoc-lsubr.LEARN.la.net            0%   11   11     2     1    2      4
 4. ulm-laNoc.LEARN.la.net              0%   11   11    14    14   36     91
 5. 198.232.231.2                       0%   11   11    29    14   41    127
 6. dynip422.nat.ulm.edu                0%   11   11    16    15   25     61
 7. tornado.geos.ulm.edu                0%   10   10    15    14   16     23
Resolver: Received error response 2. (server failure)

>My limited understanding of what I2 is, is that traffic is I2 if it
>passes through Abilene's system.

I believe that is correct.

>That being the case, unless ULM is passing through Abilene's routers,
>ULM is really on I1 anyway.

Please see the route above.  This, at least, reflects ULM's current
connection to LSU.  UCAR's connection to ULM, however, traverses I2
until Houston, where the bridge is made to LEARN.La.Net:

zero.unidata.ucar.edu -> tornado.geos.ulm.edu:

Matt's traceroute  [v0.44]
zero.unidata.ucar.edu                             Fri Jun 27 12:02:58 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                         Packets              Pings
 Hostname                            %Loss  Rcv  Snt  Last  Best  Avg  Worst
 1. flra-n140.unidata.ucar.edu          0%   71   71     0     0    0     29
 2. gin-n243-80.ucar.edu                0%   71   71     0     0    0      6
 3. frgp-gw-1.frgp.net                  0%   71   71     1     1    2     25
 4. 198.32.11.105                       0%   71   71     1     1    1      6
 5. kscyng-dnvrng.abilene.ucaid.edu     0%   71   71    12    12   13     26
 6. hstnng-kscyng.abilene.ucaid.edu     0%   71   71    27    27   27     27
 7. laNoc-abileneHou.LEARN.La.Net       0%   71   71    33    32   33     36
 8. ulm-laNoc.LEARN.La.Net              0%   70   70    45    45   46     71
 9. ???

tornado.geos.ulm.edu -> zero.unidata.ucar.edu:

Matt's traceroute  [v0.49]
tornado.geos.ulm.edu                              Fri Jun 27 13:04:05 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                         Packets              Pings
 Hostname                            %Loss  Rcv  Snt  Last  Best  Avg  Worst
 1. 10.16.0.1                           0%    4    4     1     1    1      1
 2. 10.1.1.1                            0%    4    4     0     0    0      0
 3. 198.232.231.1                       0%    4    4     0     0    0      0
 4. laNoc-ulm.LEARN.la.net              0%    4    4    13    13   13     13
 5. abileneHou-laNoc.LEARN.la.net       0%    4    4    18    18   25     45
 6. kscyng-hstnng.abilene.ucaid.edu     0%    3    3    34    34   34     34
 7. dnvrng-kscyng.abilene.ucaid.edu     0%    3    3    44    44   44     44
 8. 198.32.11.106                       0%    3    3    44    44   44     45
 9. gin.ucar.edu                        0%    3    3    46    45   45     46
10. flrb.ucar.edu                       0%    3    3    45    45   46     46
11. zero.unidata.ucar.edu               0%    3    3    56    45   49     56

re: ULM rerouted away from the problematic I2 connection

>LANET indicated this trouble ticket has been open for "some time".  We
>do not know what "some time" means in terms of days or months.

It would be useful to know how long that trouble ticket has been open.

>CRC and retransmission errors are consistent with delays in network
>traffic.

I agree.

re: is CRC and retransmission (trouble ticket at LANET) affecting LSU
also

>I think the communication issue will require resolving before we will
>know.

The really strange part is the asymmetry in the problem.  We are feeding
seistan.srcc.lsu.edu the HDS stream from emo.unidata.ucar.edu with no
latencies, while at the same time we are _unable_ to feed the data back
to a different machine here at the UPC, zero.unidata.ucar.edu (zero and
emo are in the same room on the same subnet), so perhaps a look at the
route from Unidata to seistan and back again would be instructive:

zero.unidata.ucar.edu -> seistan.srcc.lsu.edu

Matt's traceroute  [v0.44]
zero.unidata.ucar.edu                             Fri Jun 27 10:16:40 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                         Packets              Pings
 Hostname                            %Loss  Rcv  Snt  Last  Best  Avg  Worst
 1. flra-n140.unidata.ucar.edu          0%    8    8    10     0    1     10
 2. gin-n243-80.ucar.edu                0%    8    8     0     0    0      0
 3. frgp-gw-1.frgp.net                  0%    8    8     1     1    1      2
 4. 198.32.11.105                       0%    8    8     1     1    1      1
 5. kscyng-dnvrng.abilene.ucaid.edu     0%    8    8    22    12   13     22
 6. hstnng-kscyng.abilene.ucaid.edu     0%    8    8    27    27   27     27
 7. laNoc-abileneHou.LEARN.La.Net       0%    8    8    33    33   33     33
 8. lsubr-laNoc.LEARN.La.Net            0%    8    8    34    34   34     34
 9. howe-e241a-4006-dsw-1.g2.lsu.edu    0%    8    8    39    35   37     42
10. seistan.srcc.lsu.edu                0%    7    7    34    34   34     35

seistan.srcc.lsu.edu -> zero.unidata.ucar.edu

Matt's traceroute  [v0.49]
seistan.srcc.lsu.edu                              Fri Jun 27 11:15:53 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                         Packets              Pings
 Hostname                            %Loss  Rcv  Snt  Last  Best  Avg  Worst
 1. 130.39.188.1                        0%   14   14     1     1    3     16
 2. lsubr1-118-6509-dsw-1.g2.lsu.edu    0%   14   14     0     0    1      6
 3. laNoc-lsubr.LEARN.la.net            0%   14   14     2     1    2      5
 4. abileneHou-laNoc.LEARN.la.net       0%   14   14     8     7   16     46
 5. kscyng-hstnng.abilene.ucaid.edu     0%   14   14    23    22   22     23
 6. dnvrng-kscyng.abilene.ucaid.edu     0%   14   14    33    33   36     71
 7. 198.32.11.106                       0%   14   14    34    33   36     59
 8. gin.ucar.edu                        0%   14   14    35    34   35     45
 9. flrb.ucar.edu                       0%   14   14    34    34   35     45
10. zero.unidata.ucar.edu               0%   13   13    34    34   36     57

The major difference in the routes that I notice is that the route from
zero to seistan goes through howe-e241a-4006-dsw-1.g2.lsu.edu, while the
route from seistan to zero goes through
lsubr1-118-6509-dsw-1.g2.lsu.edu.  Perhaps this is a big clue that we
are overlooking?  Could it be that there is something amiss on the
howe-e241a-4006-dsw-1.g2.lsu.edu gateway/router?
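One way to make a forward/reverse comparison like that mechanical is to
capture just the hop list from each end and set the two lists side by
side.  The Python sketch below is an illustration only (it is not
something that was run as part of these tests); it assumes a standard
traceroute binary is installed and that each output line after the
header starts with the hop number followed by the hop's address:

    import subprocess

    def hops(target):
        """Return the address of each hop from 'traceroute -n target'."""
        out = subprocess.run(["traceroute", "-n", target],
                             capture_output=True, text=True,
                             check=True).stdout
        result = []
        for line in out.splitlines()[1:]:    # first line is the header
            fields = line.split()
            if len(fields) >= 2:
                result.append(fields[1])     # hop address ('*' if no reply)
        return result

    # Run this on each end against the other host, then compare the two
    # hop lists to see where the forward and reverse paths diverge.
    if __name__ == "__main__":
        for n, hop in enumerate(hops("seistan.srcc.lsu.edu"), start=1):
            print(n, hop)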
re: What did the telecomm folks have to say about the asymmetry seen
moving data to/from srcc.lsu.edu from zero.unidata.ucar.edu?

>The issue of asymmetry was not the paramount issue with telecom.
>Again, the telecom guys want to wait and see if the communications
>issues are fixed, as they believe the errors in the circuit are causing
>the problems between LSU and ULM.

The problem is not _just_ between LSU and ULM.  We
(zero.unidata.ucar.edu) are seeing the exact same problem that ULM was
seeing when trying to feed HDS from seistan.srcc.lsu.edu.  Moreover, we
saw the exact same problem during our test of feeding the HDS stream
from seistan.srcc.lsu.edu to the University of South Florida machine,
metlab.cas.usf.edu.  The problem most likely exists between seistan and
Jackson State, but we can't verify this because they are not reporting
stats AND we do not have current contact information for them.

If the LSU telecomm folks are under the impression that the only problem
is between LSU and ULM, then they need to be contacted and made aware of
the problems going to such diverse sites as UCAR and USF.

From address@hidden Fri Jun 27 07:34:11 2003
To: address@hidden
cc: Kevin Robbins <address@hidden>, address@hidden
Subject: 20030626: 20030624: HDS feed to/from seistan (cont.)

>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD

Hi Bob,

>In talking with our telecommunications people:
>
>1) The Louisiana Office of Telecommunications ("LANET") was contacted
>with the problem, and LANET reports Bell South (the state's
>communications provider) has an open trouble ticket on the Public
>Switched SONET network connecting ULM to the LANET.  The trouble ticket
>reports CRC and retransmission errors.  This is a DS-3 Private Virtual
>Circuit (PVC) on the Public Switched SONET network connecting ULM to
>the LANET.

This sounds like the problem we uncovered at ULM.  They contacted their
service provider and rerouted their traffic from I2 to "I1".  We never
did get a reply from them as to what "I1" means.  After they rerouted
away from their problematic I2 connection, we were able to feed all of
HDS to them with virtually no latency.

>LANET indicated this trouble ticket has been open for "some time".  We
>do not know what "some time" means in terms of days or months.  CRC and
>retransmission errors are consistent with delays in network traffic.

Is this also affecting the LSU connection?  If not, there is still a
problem to be solved.

>2) Concerning Ping (ICMP):
>
>  A) LSU has limitations placed on ICMP payload sizes to limit "the
>Ping of Death" hacks.  So it is interesting that even though LSU has
>this policy in place, we can demonstrate large ICMP traffic to
>correctly query systems other than ULM but not ULM.

OK.

>  B) The telecommunications people pointed out that Cisco router
>interface ping (ICMP) buffers have a hard limitation of 18,000 bytes.
>Unix/Linux systems do not have this issue.  So the theory goes...  Ping
>LSU's Cisco border router, then LANET's Cisco border router, and
>problems seem apparent.  Yet ping a UNIX device with a large payload
>beyond the Cisco device, and travel time delays suddenly do not seem
>excessive.

I understand.  Even so, the pings with large ICMP packets from seistan
(RedHat 7.2 Linux) to zero.unidata.ucar.edu (Sun Solaris SPARC 5.9) show
dramatic round trip time increases after the ping packet size exceeds
20 KB.  The 18,000 byte limit you note does seem like what we were
seeing when trying to ping laNoc-lsubr.LEARN.la.net.
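The size-dependent behavior described above is easy to chart with a
simple sweep over payload sizes.  The Python sketch below is
illustrative only and assumes a Linux iputils-style ping that takes -s
for the ICMP payload size and -c for the packet count (Solaris ping
spells its options differently):

    import re
    import subprocess

    TARGET = "zero.unidata.ucar.edu"   # example host from this thread

    def avg_rtt_ms(host, payload_bytes):
        """Average RTT (ms) of a few pings at the given payload size."""
        proc = subprocess.run(
            ["ping", "-c", "3", "-s", str(payload_bytes), host],
            capture_output=True, text=True)
        # iputils summary: rtt min/avg/max/mdev = 34.1/35.2/36.0/0.7 ms
        match = re.search(r"= [\d.]+/([\d.]+)/", proc.stdout)
        return float(match.group(1)) if match else None

    if __name__ == "__main__":
        for size in range(56, 32001, 2048):      # 56 bytes up to ~32 KB
            rtt = avg_rtt_ms(TARGET, size)
            print(size, "bytes:", rtt if rtt is not None else "no reply")

A jump in the reported averages once the payload passes a particular
size would show up as a clear step in this output.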
>3) It would be interesting to know who ULM is feeding HDS from.
>Chances are, the communications circuit they are currently using is the
>same DS-3 circuit that LANET uses.

Right now, ULM is feeding HDS from rainbow.al.noaa.gov (this is a
CU/CIRES lab here in Boulder).  We also fed them with no latency from
emo.unidata.ucar.edu.

>4) Limitations placed on ICMP payload sizes on any devices in a
>network's path will cause problems in using ICMP round trip time to
>measure network metrics.  But at this time, I do not have an
>alternative method to measure network latencies.  My network guy said
>network latency issues are handled by the circuit provider.  No help
>there.

The ping packet size issue was just an interesting observation.  The
real issue is the latency when feeding the HDS stream out of LSU as
compared to virtually no latency when feeding the HDS stream _into_
LSU.  This observation is something that the telecomm people should be
able to use to help isolate where the throttling is occurring on or near
the LSU campus.  The fact that we can feed ULM all of the HDS data from
at least two other sites, and that we can feed HDS into seistan but
cannot feed it back out, shows us that the problem is not at ULM, but at
LSU.

What did the telecomm folks have to say about the asymmetry seen moving
data to/from srcc.lsu.edu?

Tom

>From address@hidden Mon Jun 30 12:30:54 2003
>To: Unidata Support <address@hidden>
>Subject: Re: 20030630: IDD feeds from LSU to any non LSU downstream
>sites (cont.)

>Tom,
>thanks for sending the email to me.
>The LSU telcom folks report no changes were made with the LSU network
>configuration over the weekend.  The LANET part of this remains to be
>answered, and our telcom will contact them.
>Just to let you know, we have not made any changes to Seistan either.