20030627: HDS feed to/from seistan (cont.)
- Subject: 20030627: HDS feed to/from seistan (cont.)
- Date: Fri, 27 Jun 2003 12:11:38 -0600
>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD
Hi Bob,
re: ULM rerouted their traffic from I2 to "I1":
>I did not know this happened, but it explains why ULM is able to communicate
>with rainbow.al.noaa.gov.
The ULM folks told us that during a total outage at LSU at some point in
the past they fed from thelma.ucar.edu and experienced no problems. This
predated either your or ULM's upgrade to LDM-6 by quite a bit.
Here is a portion of the original note we received on problems ULM was
having feeding from srcc.lsu.edu:
"For more than a year, we have been having serious data feed problems
when our upstream site is at LSU (sirocco). We have tried everything
that we can, including contacting LSU repeatedly, but cannot seem to
resolve the situation satisfactorily. We have worked extensively with
our network people and believe that the problem is at LSU. We are
basing this conclusion on the fact that, while sirocco was down and we
were feeding from Unidata's thelma machine, everything was fine. We
received all data without significant losses. However, once sirocco
came on-line again and we switched over to them, we began to experience
substantial losses of data. Our fallback site is OU's stokes machine
and we have used them in the past, but they are feeding so many sites
that we tend to fall significantly behind in the data feed.
Can you help us resolve this problem?"
>It would be interesting to also force an I1 connection to LSU and repeat
>the test.
I agree, running feed tests using a different route to/from LSU would
certainly be welcome.
re: "I1"
>Internet one?
That is what we asked.
>A better question in this case is, what is I2 in the context
>to the LANET sonnet connecting ULM to LANET?
Here is the route from ULM to seistan.srcc.lsu.edu:
Matt's traceroute [v0.49]
tornado.geos.ulm.edu Fri Jun 27 10:56:14 2003
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. 10.16.0.1 0% 18 18 1 1 1 1
2. 10.1.1.1 0% 18 18 0 0 0 1
3. 198.232.231.1 0% 18 18 0 0 0 1
4. laNoc-ulm.LEARN.la.net 0% 17 17 13 13 19 76
5. lsubr-laNoc.LEARN.la.net 0% 17 17 14 14 15 26
6. howe-e241a-4006-dsw-1.g1.lsu.edu 0% 17 17 18 15 22 50
7. seistan.srcc.lsu.edu 0% 17 17 15 14 19 42
This can be compared with LSU's route from seistan to tornado.geos.ulm.edu:
Matt's traceroute [v0.49]
seistan.srcc.lsu.edu Fri Jun 27 10:58:56 2003
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. 130.39.188.1 0% 11 11 4 1 2 5
2. lsubr1-118-6509-dsw-1.g2.lsu.edu 0% 11 11 1 0 1 1
3. laNoc-lsubr.LEARN.la.net 0% 11 11 2 1 2 4
4. ulm-laNoc.LEARN.la.net 0% 11 11 14 14 36 91
5. 198.232.231.2 0% 11 11 29 14 41 127
6. dynip422.nat.ulm.edu 0% 11 11 16 15 25 61
7. tornado.geos.ulm.edu 0% 10 10 15 14 16 23
Resolver: Received error response 2. (server failure)
>My limited understanding of
>what I2 is, is that traffic is I2 if it passes through Abilene's system.
I believe that is correct.
>That being the case, unless ULM is passing through Abilenes routers, ULM
>is really on I1 anyway.
Please see the route above. This, at least, reflects ULM's current
connection to LSU. UCAR's connection to ULM, however, traverses I2
until Houston where the bridge is made to LEARN.La.Net:
zero.unidata.ucar.edu -> tornado.geos.ulm.edu:
Matt's traceroute [v0.44]
zero.unidata.ucar.edu Fri Jun 27 12:02:58 2003
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. flra-n140.unidata.ucar.edu 0% 71 71 0 0 0 29
2. gin-n243-80.ucar.edu 0% 71 71 0 0 0 6
3. frgp-gw-1.frgp.net 0% 71 71 1 1 2 25
4. 198.32.11.105 0% 71 71 1 1 1 6
5. kscyng-dnvrng.abilene.ucaid.edu 0% 71 71 12 12 13 26
6. hstnng-kscyng.abilene.ucaid.edu 0% 71 71 27 27 27 27
7. laNoc-abileneHou.LEARN.La.Net 0% 71 71 33 32 33 36
8. ulm-laNoc.LEARN.La.Net 0% 70 70 45 45 46 71
9. ???
tornado.geos.ulm.edu -> zero.unidata.ucar.edu
Matt's traceroute [v0.49]
tornado.geos.ulm.edu Fri Jun 27 13:04:05 2003
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. 10.16.0.1 0% 4 4 1 1 1 1
2. 10.1.1.1 0% 4 4 0 0 0 0
3. 198.232.231.1 0% 4 4 0 0 0 0
4. laNoc-ulm.LEARN.la.net 0% 4 4 13 13 13 13
5. abileneHou-laNoc.LEARN.la.net 0% 4 4 18 18 25 45
6. kscyng-hstnng.abilene.ucaid.edu 0% 3 3 34 34 34 34
7. dnvrng-kscyng.abilene.ucaid.edu 0% 3 3 44 44 44 44
8. 198.32.11.106 0% 3 3 44 44 44 45
9. gin.ucar.edu 0% 3 3 46 45 45 46
10. flrb.ucar.edu 0% 3 3 45 45 46 46
11. zero.unidata.ucar.edu 0% 3 3 56 45 49 56
re: ULM rerouted away from the problematic I2 connection
>LANET indicated this trouble ticket
>has been open for "some time". We do not know what "some time" means in terms
>of days or months.
It would be useful to know how long that trouble ticket has been open.
>CRC, and retransmission errors are consistent with delays
>in network traffic.
I agree.
re: is CRC and retransmission (trouble ticket at LANET) affecting LSU also
>I think the communication issue will require resolving before we will
>know.
The really strange part is the asymmetry in the problem. We are
feeding seistan.srcc.lsu.edu the HDS stream from emo.unidata.ucar.edu
with no latencies, while at the same time we are _unable_ to feed the
data back to a different machine here at the UPC, zero.unidata.ucar.edu
(zero and emo are in the same room on the same subnet). Perhaps a look
at the route from Unidata to seistan and back again would be
instructive:
zero.unidata.ucar.edu -> seistan.srcc.lsu.edu
Matt's traceroute [v0.44]
zero.unidata.ucar.edu Fri Jun 27 10:16:40 2003
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. flra-n140.unidata.ucar.edu 0% 8 8 10 0 1 10
2. gin-n243-80.ucar.edu 0% 8 8 0 0 0 0
3. frgp-gw-1.frgp.net 0% 8 8 1 1 1 2
4. 198.32.11.105 0% 8 8 1 1 1 1
5. kscyng-dnvrng.abilene.ucaid.edu 0% 8 8 22 12 13 22
6. hstnng-kscyng.abilene.ucaid.edu 0% 8 8 27 27 27 27
7. laNoc-abileneHou.LEARN.La.Net 0% 8 8 33 33 33 33
8. lsubr-laNoc.LEARN.La.Net 0% 8 8 34 34 34 34
9. howe-e241a-4006-dsw-1.g2.lsu.edu 0% 8 8 39 35 37 42
10. seistan.srcc.lsu.edu 0% 7 7 34 34 34 35
seistan.srcc.lsu.edu -> zero.unidata.ucar.edu
Matt's traceroute [v0.49]
seistan.srcc.lsu.edu Fri Jun 27 11:15:53 2003
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. 130.39.188.1 0% 14 14 1 1 3 16
2. lsubr1-118-6509-dsw-1.g2.lsu.edu 0% 14 14 0 0 1 6
3. laNoc-lsubr.LEARN.la.net 0% 14 14 2 1 2 5
4. abileneHou-laNoc.LEARN.la.net 0% 14 14 8 7 16 46
5. kscyng-hstnng.abilene.ucaid.edu 0% 14 14 23 22 22 23
6. dnvrng-kscyng.abilene.ucaid.edu 0% 14 14 33 33 36 71
7. 198.32.11.106 0% 14 14 34 33 36 59
8. gin.ucar.edu 0% 14 14 35 34 35 45
9. flrb.ucar.edu 0% 14 14 34 34 35 45
10. zero.unidata.ucar.edu 0% 13 13 34 34 36 57
The major difference in routes that I notice is that the route from
zero to seistan enters LSU through howe-e241a-4006-dsw-1.g2.lsu.edu,
but the route from seistan to zero leaves LSU through
lsubr1-118-6509-dsw-1.g2.lsu.edu. Perhaps this is a big clue that we
are overlooking? Could it be that there is something amiss on the
howe-e241a-4006-dsw-1.g2.lsu.edu gateway/router?
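For anyone who wants to repeat the comparison, here is a rough Python
sketch of it. This is only an illustration, not a diagnostic tool: the
hop lists are copied by hand from the two zero <-> seistan traceroutes
above, and the helper name campus_hops is made up for this note. Since
router interfaces are named per direction, nearly every hop name differs
between the two runs, so the sketch looks only at the intermediate hops
inside lsu.edu. Hops that report only an IP address (such as
130.39.188.1) are not matched by the domain test.

```python
# Hop names copied by hand from the two mtr runs (zero <-> seistan).
ZERO_TO_SEISTAN = [
    "flra-n140.unidata.ucar.edu",
    "gin-n243-80.ucar.edu",
    "frgp-gw-1.frgp.net",
    "198.32.11.105",
    "kscyng-dnvrng.abilene.ucaid.edu",
    "hstnng-kscyng.abilene.ucaid.edu",
    "laNoc-abileneHou.LEARN.La.Net",
    "lsubr-laNoc.LEARN.La.Net",
    "howe-e241a-4006-dsw-1.g2.lsu.edu",
    "seistan.srcc.lsu.edu",
]
SEISTAN_TO_ZERO = [
    "130.39.188.1",
    "lsubr1-118-6509-dsw-1.g2.lsu.edu",
    "laNoc-lsubr.LEARN.la.net",
    "abileneHou-laNoc.LEARN.la.net",
    "kscyng-hstnng.abilene.ucaid.edu",
    "dnvrng-kscyng.abilene.ucaid.edu",
    "198.32.11.106",
    "gin.ucar.edu",
    "flrb.ucar.edu",
    "zero.unidata.ucar.edu",
]
ENDPOINTS = {"zero.unidata.ucar.edu", "seistan.srcc.lsu.edu"}


def campus_hops(path, domain, endpoints):
    """Return the intermediate hops in `path` whose name ends in `domain`.

    Hypothetical helper for this note; matching is case-insensitive and
    the two end hosts themselves are excluded.
    """
    suffix = "." + domain.lower()
    return [h for h in path if h.lower().endswith(suffix) and h not in endpoints]


inbound = campus_hops(ZERO_TO_SEISTAN, "lsu.edu", ENDPOINTS)
outbound = campus_hops(SEISTAN_TO_ZERO, "lsu.edu", ENDPOINTS)

# Named LSU hops that appear in one direction but not the other.
only_inbound = sorted(set(inbound) - set(outbound))
only_outbound = sorted(set(outbound) - set(inbound))

print("LSU hops seen only on the way in: ", only_inbound)
print("LSU hops seen only on the way out:", only_outbound)
```

A difference here does not by itself prove a fault on either device; it
only narrows down where the inbound and outbound paths diverge inside
LSU.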
re: What did the telecomm folks have to say about the asymmetry seen moving
data to/from srcc.lsu.edu from zero.unidata.ucar.edu?
>The issue of asymmetry was not the paramount issue with telecom. Again, the
>telecom guys want to wait and see the communications issues are fixed, as
>they believe the errors in the circuit are causing the problems between LSU
>and ULM.
The problem is not _just_ between LSU and ULM. We (zero.unidata.ucar.edu)
are seeing the exact same problem that ULM was seeing when trying to
feed HDS from seistan.srcc.lsu.edu. Moreover, we saw the exact same
problem during our test of feeding the HDS stream from
seistan.srcc.lsu.edu to the University of South Florida machine,
metlab.cas.usf.edu. The problem most likely exists between seistan
and Jackson State, but we can't verify this because they are not reporting
stats AND we do not have current contact information for them.
If the LSU telecomm folks are under the impression that the only
problem is between LSU and ULM, then they need to be contacted and made
aware of the problems going to such diverse sites as UCAR and USF.
Tom