- Subject: 20030623: HDS feed to/from seistan (cont.)
- Date: Mon, 23 Jun 2003 17:43:29 -0600
>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD
Hi Bob,
>>General question: is it OK to give all srcc.lsu.edu machines total
>>access to seistan and sirocco?
>For testing I have no problem.
OK. What we were looking at was the set of rules listed by
'ipchains -L' on seistan. It just seemed to us that the list is much
more complicated than it perhaps needs to be, and it even has some
contradictory ACCEPTs and DENYs.
We tried adding another rule for 'ldm' (port 388) that allowed all
traffic (so it would appear at the top of the list) to see if that
would improve the data transfer from seistan to zero, but it
essentially changed nothing.
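For reference, the temporary rule was along these lines (inserted at the
head of the input chain so it would be checked first; the exact form we
used may have differed slightly):

  ipchains -I input 1 -p tcp -d 130.39.188.204 388 -j ACCEPT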
To check another possibility, we temporarily added an entry to
/etc/hosts for zero.unidata.ucar.edu. We wanted to know if DNS access
was slowing things down at all; it wasn't.
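For the record, that was just an ordinary /etc/hosts line of the usual
form; the address below is a placeholder rather than zero's real one:

  128.117.NN.NN   zero.unidata.ucar.edu   zero

We removed the entry again after the test.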
The next test was to repeat a test you tried a couple of days ago: ping
a downstream host using different packet sizes. This test was
informative to say the least. We ran a series of ping tests to
zero.unidata.ucar.edu while varying the packet size from the default
(64 ICMP data bytes) to the maximum allowed (65507 data bytes):
packet size [bytes]   round-trip min/avg/max/mdev
-------------------   ---------------------------
default               25.093/29.574/45.186/4.434 ms
1000                  27.265/29.108/34.205/2.739 ms
2000                  26.212/41.732/79.915/17.509 ms
4000                  27.295/62.053/631.038/134.159 ms
6000                  28.032/48.721/128.113/35.800 ms
8000                  28.550/67.500/192.124/51.161 ms
10000                 29.861/43.767/233.212/44.220 ms
12000                 32.321/99.727/935.382/206.396 ms
14000                 31.093/151.241/401.941/117.003 ms
16000                 33.061/120.564/287.026/87.274 ms
18000                 35.566/99.640/289.394/91.020 ms
20000                 30.985/115.274/371.706/109.435 ms
30000                 272.666/533.249/1326.722/347.923 ms
40000                 88.554/733.970/998.542/194.308 ms
50000                 1520.142/1610.229/1756.897/66.452 ms
65507                 2248.710/4534.717/7079.395/1642.487 ms
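Each row in that table came from a run of roughly this form (only the
-s value changed from run to run; the loop is just a sketch of what we
did by hand):

  for size in 1000 2000 4000 6000 8000 10000 12000 14000 16000 18000 20000 30000 40000 50000 65507; do
      ping -c 20 -s $size zero.unidata.ucar.edu
  done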
While the average ('avg') round-trip times are not strictly monotonic,
they do show that once packet sizes get large, the round-trip time grows
rapidly. We believe this is what is being experienced in the feed of HDS
data to any downstream feed site from either seistan or datoo (our tests
feeding from datoo earlier today showed exactly the same pattern as
feeding from seistan).
While playing around with the ping tests, we decided to do pings to
machines along the route from seistan to zero. We found something
that was interesting, but have no idea if it means anything.
Here is the route from seistan to zero and traceroute times (output from
mtr):
1. 130.39.188.1 0% 19 19 4 1 4 15
2. lsubr1-118-6509-dsw-1.g2.lsu.edu 0% 19 19 0 0 9 142
3. laNoc-lsubr.LEARN.la.net 0% 19 19 2 1 4 25
4. abileneHou-laNoc.LEARN.la.net 0% 19 19 7 7 8 15
5. kscyng-hstnng.abilene.ucaid.edu 0% 19 19 23 22 23 33
6. dnvrng-kscyng.abilene.ucaid.edu 0% 19 19 34 33 34 45
7. 198.32.11.106 0% 19 19 34 33 34 40
8. gin.ucar.edu 0% 18 18 39 34 35 39
9. flrb.ucar.edu 0% 18 18 35 34 35 41
10. zero.unidata.ucar.edu 0% 18 18 35 34 35 39
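For reference, that report came from a command along the lines of the
one below; the columns appear to be loss%, packets received/sent, and
then the last/best/average/worst round-trip times in milliseconds:

  mtr --report --report-cycles 19 zero.unidata.ucar.edu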
In our ping tests, we found that the largest ping payload that
laNoc-lsubr.LEARN.la.net would respond to was 17997 bytes (18025 bytes
on the wire once the 8-byte ICMP and 20-byte IP headers are added):
# ping -c 20 -s 17997 laNoc-lsubr.LEARN.la.net
PING laNoc-lsubr.LEARN.la.net (162.75.0.9) from 130.39.188.204 : 17997(18025)
bytes of data.
18005 bytes from laNoc-lsubr.LEARN.la.net (162.75.0.9): icmp_seq=0 ttl=253
time=243.957 msec
...
[root@seistan bin]# ping -c 20 -s 17998 laNoc-lsubr.LEARN.la.net
PING laNoc-lsubr.LEARN.la.net (162.75.0.9) from 130.39.188.204 : 17998(18026)
bytes of data.
-- no response --
Next, I decided to run a data movement test using something other than
the LDM. I used scp to move a 132 MB GOES-12 VIS image from zero to
seistan and then back again. Here are the results:
scp test:
zero.unidata.ucar.edu -> seistan.srcc.lsu.edu
AREA1234 100% |********************************| 132 MB 03:06
seistan.srcc.lsu.edu -> zero.unidata.ucar.edu
AREA1234 100% |********************************| 132 MB 04:41
Both tests were "pull" tests: the scp was initiated on the receiving
system.
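In other words, each copy was started on the machine receiving the file,
e.g. on seistan (the path is a placeholder):

  scp zero.unidata.ucar.edu:/path/to/AREA1234 .

with the mirror-image command run on zero for the other direction.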
As you can see, it took about 50% more time to move the data from seistan
to zero than to move the file from zero to seistan. This parallels the
observation that we can move HDS data from zero to seistan with little
to no latency, but cannot do the same going the other direction.
Since products in the HDS feed are considerably larger than those in the
IDS|DDPLUS feed (which seistan is relaying to ULM with no significant
latencies), and since the number of products in the HDS feed is a couple
of orders of magnitude higher than in the UNIWISC feed (even though
individual UNIWISC products are a lot larger than HDS products), it
appears that the HDS feed problem is a function of moving lots of large
packets. This severely limits the value of LSU's being a top-level
IDD relay.
Would it be possible for you to take
- the results of the ping tests above
- the fact that we can send seistan large volumes of HDS data with
virtually no latency, but we can't receive the same data back to
a different machine in our network
- the results of using scp to copy data between the same systems
to the LSU networking support group and enlist their aid in finding out
what is limiting the traffic out of your network? We can run a number
of tests from here, but we don't have the same facilities for tracing
down a problem that the network group there should have.
>my general firewall philosophy, is to allow that which needs allowing.
>And no more. As every hosts at LSU is connected the open internet. We
>do not have safe areas behind a firewall. The issue of network security
>is of the most importance. Not all SRCC hosts need LDM and that is why
>it is set the way it is.
The LDM has its own security facilities. Adding a firewall rule for
each machine that wants to get a feed just makes the list of rules
longer and longer, and the net effect of that is to make packet
processing take longer and longer. We feel that it would be more
efficient -- while still being secure -- to allow open access to the
LDM port (388) and remove all of the host-specific rules for LDM access.
Again, the 'allow' lines in the ~ldm/etc/ldmd.conf file take care of
deciding which sites may access the LDM server.
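For reference, those host-level controls are just 'allow' lines in
~ldm/etc/ldmd.conf; the entries below are examples only, not your
actual configuration:

  allow ANY         ^zero\.unidata\.ucar\.edu$
  allow IDS|DDPLUS  ^.*\.srcc\.lsu\.edu$

A host that does not match any allow line is refused by the LDM itself,
no matter what the firewall permits on port 388.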
Additionally, you have multiple rules in place for the exact same
host(s). Here are the duplicates from seistan:
ACCEPT all ------ aqua.nsstc.uah.edu anywhere n/a
ACCEPT all ------ aqua.nsstc.uah.edu anywhere n/a
ACCEPT all ------ atm.geo.nsf.gov anywhere n/a
ACCEPT all ------ atm.geo.nsf.gov anywhere n/a
ACCEPT all ------ betsy.jsums.edu anywhere n/a
DENY all ------ betsy.jsums.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ mistral.srcc.lsu.edu anywhere n/a
ACCEPT all ------ sirocco.srcc.lsu.edu anywhere n/a
ACCEPT all ------ sirocco.srcc.lsu.edu anywhere n/a
ACCEPT all ------ weather.admin.niu.edu anywhere n/a
ACCEPT all ------ weather.admin.niu.edu anywhere n/a
ACCEPT all ------ weather2.admin.niu.edu anywhere n/a
ACCEPT all ------ weather2.admin.niu.edu anywhere n/a
ACCEPT all ------ weather3.admin.niu.edu anywhere n/a
ACCEPT all ------ weather3.admin.niu.edu anywhere n/a
ACCEPT all ------ weather3.admin.niu.edu anywhere n/a
ACCEPT tcp ------ anywhere anywhere any -> https
ACCEPT tcp ------ anywhere anywhere any -> https
ACCEPT tcp ------ anywhere anywhere any -> smtp
ACCEPT tcp ------ anywhere anywhere any -> smtp
ACCEPT udp ------ anywhere anywhere any -> ntp
ACCEPT udp ------ anywhere anywhere any -> ntp
ACCEPT udp ------ anywhere anywhere bootps:bootpc -> bootps:bootpc
ACCEPT udp ------ anywhere anywhere bootps:bootpc -> bootps:bootpc
Each packet received is checked against the rules in order until a match
occurs, regardless of whether a rule duplicates one already checked.
Put simply, the longer the list of rules, the longer it takes to process
each packet.
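One quick way to see which rules are actually matching traffic (and
which duplicates never fire) is to list the chain with its packet and
byte counters:

  ipchains -L input -v -n

Rules whose counters stay at zero over a period of normal use are good
candidates for removal.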
>The rules that are in place, allow the
>services we need. The guild lines presented by the SANS org have been
>followed in reguard to firewall setup. I am open to suggestions in this
>area, but just know any changes we make must be secure.
We agree that security is _the_ most important consideration. Our
recommendations would do nothing to compromise your security. Rather,
they would be aimed at making your setup more efficient and useful
to potential IDD sites.
>From address@hidden Mon Jun 23 14:50:05 2003
>Datoo should allow you to login. The pw is now the one you have. The SunOs
>does not use ipchains. Just tcp-wrappers.
Thanks for the access. The fact that datoo only uses TCP wrappers
answers some questions we had.
re: ipchains rule on seistan
>Changing the order is fine.
re:
1) flush the IP chains rule set that is in place right now on seistan
2) install a new rule set that consolidates the restrictions you currently
have in place
>Tom, consolidating is fine.
3) return the HDS feed from seistan to zero.unidata.ucar.edu to see if
the large latencies drop to zero
>Ok.
We did return the feed of HDS back to seistan after noting that the feed
from datoo had virtually the same latencies.
re: any reason to not run tests
>I am comfortable with the security as it is. But making improvements in access
>control efficiency is fine as long as security is not compromised..
OK. Since the tests run since our previous email to you indicate that
the ipchains setup is not _the_ reason for the large HDS latencies,
changes to your setup are not needed in order to proceed. We still stand
by our observation that things would be more efficient if duplicate
rules were eliminated and the rules were ordered so that the more
general rules come first, followed by the more specific ones.
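As a rough sketch only (the address and services here are illustrative,
and any real rule set would of course need to be checked against your
security requirements first), a consolidated ordering might look
something like:

  ipchains -F input
  ipchains -A input -p tcp -d 130.39.188.204 388   -j ACCEPT   # LDM; per-host control left to ldmd.conf
  ipchains -A input -p tcp -d 130.39.188.204 smtp  -j ACCEPT
  ipchains -A input -p tcp -d 130.39.188.204 https -j ACCEPT
  ipchains -A input -p udp -d 130.39.188.204 ntp   -j ACCEPT

followed by the remaining host-specific ACCEPTs and DENYs, each listed
once.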
>The issue
>of access control on Seistan or Sirocco being a source of trouble surprises
>me. The system performance indicators have not pointed to this as a critical
>problem area or a bottleneck.
As you can see above, that is our observation also.
>If you would have told me your tests found
>Sirocco had problems, I could believe it has had loading issues. It is a
>single CPU is a 450MHZ system. Seistan on the other hand is a dual 400Mhz
>machine. The loading on Seistan is much lower then Sirocco and is the reason
>I moved the downstream users off of Sirocco.
OK. The problem does not appear to be related to system loading, since
we can move HDS data to seistan with little to no latency. The problem
is strictly one of being able to move high-volume data off of seistan to
machines that are not located in the srcc.lsu.edu and, presumably,
lsu.edu domains. Also, since the latency appears to be introduced on a
per-connection basis (latencies for HDS can be quite high while those
for UNIWISC and IDS|DDPLUS are quite low), it would almost seem to point
to something like packet shaping being used to limit transfers.
Tom