20060105: reboot of yakov leads to high latencies from ECMWF (cont.)
- Subject: 20060105: reboot of yakov leads to high latencies from ECMWF (cont.)
- Date: Thu, 05 Jan 2006 08:16:46 -0700
>From: Mike Schmidt <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200601051054.k05AsGP1017802 TIGGE IDD latencies FC4 system tuning
Hi Mike,
re:
>Sorry for the delay getting back to you.
No worries.
(Just so you know, a number of my comments below are "for the files".)
yakov is currently running Fedora Core 4 64-bit.
uname -a
Linux yakov.unidata.ucar.edu 2.6.14-1.1653_FC4smp #1 SMP Tue Dec 13 21:55:55
EST 2005 x86_64 x86_64 x86_64 GNU/Linux
It is a dual 3.2 GHz Intel Xeon EM64T platform with 4 GB of RAM. FC4
recognizes the Xeon hyperthreading capabilities and configures itself
as if there are 4 CPUs.
>Here are the TCP tuning parameters for yakov;
>
># echo 2500000 > /proc/sys/net/core/wmem_max
># echo 2500000 > /proc/sys/net/core/rmem_max
># echo "4096 5000000 5000000" > /proc/sys/net/ipv4/tcp_rmem
># echo "4096 65536 5000000" > /proc/sys/net/ipv4/tcp_wmem
>
>in addition, I've been starting an iperf server for testing with;
>
># iperf -s -m -w1m >> /iperf.server 2>&1 &
Thanks. I performed all of the above as 'root' on yakov as soon as I
saw your note this morning. I immediately did an 'ldmadmin watch' to
see if tuning would affect existing rpc.ldmd connections; it did
_NOT_. Because of this, I restarted the LDM:
ldmadmin restart
After the restart, the latencies started dropping immediately.
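For the files, a minimal sketch of how one might confirm that the four settings quoted above are actually in effect before restarting the LDM. The function names and the PROC_ROOT override are hypothetical (they are not part of the LDM or of any system tool); the expected values are simply the ones from the 'echo' commands above.

```shell
#!/bin/sh
# Compare live kernel settings against the values quoted above.
# PROC_ROOT is a hypothetical override so the check can be dry-run
# against a scratch directory; it defaults to the real /proc/sys.
PROC_ROOT="${PROC_ROOT:-/proc/sys}"

# check_setting RELATIVE_PATH EXPECTED_VALUE...
# Prints one status line and returns non-zero on a mismatch.
check_setting() {
    rel="$1"; shift
    expected="$*"
    # The kernel separates multi-value settings with tabs; re-splitting
    # the line in awk normalizes the separators to single spaces.
    actual=$(awk '{ $1 = $1; print }' "$PROC_ROOT/$rel")
    if [ "$actual" = "$expected" ]; then
        echo "ok       $rel = $actual"
    else
        echo "MISMATCH $rel = $actual (expected $expected)"
        return 1
    fi
}

# check_all: run every check and return non-zero if any of them failed.
check_all() {
    status=0
    check_setting net/core/wmem_max 2500000 || status=1
    check_setting net/core/rmem_max 2500000 || status=1
    check_setting net/ipv4/tcp_rmem 4096 5000000 5000000 || status=1
    check_setting net/ipv4/tcp_wmem 4096 65536 5000000 || status=1
    return $status
}
```

Calling check_all after the 'echo' commands should print four "ok" lines; any "MISMATCH" line means a setting did not take. Reading these files does not require 'root'.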
>I'll add these to a startup script in the next day or so.
I just added the sequence of 'echos' to /etc/rc.local. I am not sure
if this is the appropriate place to make the change because of the
following comment in rc.local:
----- /etc/rc.local -----
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
...
----- /etc/rc.local -----
If this means that an autostart of the LDM would precede the mods, then
rc.local is _not_ the place to make the change. The reason I say this
is I didn't see the latencies fall in existing LDM feeds from
ensemble.ecmwf.int until I restarted the LDM. It might be the case
that the tuning steps would best be put into the LDM autostart script
(which does not yet exist on yakov).
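One alternative to rc.local or an LDM autostart script is /etc/sysctl.conf, which Red Hat-style init normally applies early in the boot sequence; whether that happens before any LDM autostart on FC4 is an assumption that should be checked on yakov. A minimal sketch that converts the /proc/sys paths used in the 'echos' into sysctl.conf-style lines follows. The function names are hypothetical, and the output is only printed, not installed.

```shell
#!/bin/sh
# proc_to_sysctl_key PATH
# Turn /proc/sys/net/core/wmem_max into the sysctl key net.core.wmem_max.
proc_to_sysctl_key() {
    printf '%s\n' "$1" | sed -e 's|^/proc/sys/||' -e 's|/|.|g'
}

# emit_sysctl_conf
# Print "key = value" lines for the four settings from the 'echo' commands.
emit_sysctl_conf() {
    # "read" puts the first word in path and the rest of the line in value,
    # so the three-number tcp_rmem/tcp_wmem settings stay intact.
    while read -r path value; do
        printf '%s = %s\n' "$(proc_to_sysctl_key "$path")" "$value"
    done <<'EOF'
/proc/sys/net/core/wmem_max 2500000
/proc/sys/net/core/rmem_max 2500000
/proc/sys/net/ipv4/tcp_rmem 4096 5000000 5000000
/proc/sys/net/ipv4/tcp_wmem 4096 65536 5000000
EOF
}
```

The lines printed by emit_sysctl_conf could be reviewed and appended to /etc/sysctl.conf by hand; 'sysctl -p' as 'root' then loads that file without a reboot. Existing rpc.ldmd connections would presumably still need an 'ldmadmin restart' to pick up the new values, as observed above.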
It is _very_ interesting to note:
- without the tuning mods, yakov would only receive 2 GB/hr from
ensemble -- lots of data was being lost. This occurred even though
the feed request was split 4 ways (one request each for 10, 20, 30,
and 60 MB products).
- with the tuning mods AND a restart of the LDM, the latencies
dropped fairly quickly
Comment: I am not sure why the volume received before tuning
was pegged at 2 GB/hr. This bears further thought/investigation.
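One possible line of investigation, offered as a hypothesis only: a TCP connection can have at most one window of data in flight per round trip, so its throughput is bounded by window / RTT. The sketch below does that arithmetic. The function names are hypothetical, GB is taken as 10^9 bytes, and no round-trip time to ensemble.ecmwf.int was measured here, so any RTT fed to it is an assumption.

```shell
#!/bin/sh
# window_limited_gb_per_hr WINDOW_BYTES RTT_MS CONNECTIONS
# Upper bound on hourly volume (GB, 10^9 bytes) if every connection is
# limited to WINDOW_BYTES in flight per round trip of RTT_MS milliseconds.
window_limited_gb_per_hr() {
    awk -v w="$1" -v rtt_ms="$2" -v n="$3" 'BEGIN {
        bytes_per_sec = n * w * 1000 / rtt_ms
        printf "%.1f\n", bytes_per_sec * 3600 / 1e9
    }'
}

# implied_window_bytes GB_PER_HR RTT_MS CONNECTIONS
# Inverse calculation: the per-connection window (bytes, truncated) that
# would produce the observed hourly volume at the given round-trip time.
implied_window_bytes() {
    awk -v gb="$1" -v rtt_ms="$2" -v n="$3" 'BEGIN {
        bytes_per_sec_per_conn = gb * 1e9 / 3600 / n
        printf "%d\n", bytes_per_sec_per_conn * rtt_ms / 1000
    }'
}
```

For example, 'implied_window_bytes 2 RTT_MS 4' with a measured RTT (from ping, say) gives the effective per-connection window that a 2 GB/hr ceiling over 4 requests would imply. If that figure is close to the pre-tuning buffer limits, window size is a plausible explanation; if not, the cause lies elsewhere.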
Given the dramatic results, we should consider:
- recommending that Manuel and Waldenio do similar things on their
TIGGE test machines
- making the same modifications on the idd.unidata.ucar.edu cluster
data servers (uni1, uni2, and uni4) and on the cluster collector
frontends oliver and emo
Thanks for the tuning instructions!
Cheers,
Tom
>> From: Tom Yoksas <address@hidden>
>> Subject: 20060104: reboot of yakov leads to high latencies from ECMWF
>>
>> >From: Unidata User Support <address@hidden>
>> >Organization: Unidata Program Center/UCAR
>> >Keywords: TIGGE IDD latencies FC4 system tuning
>>
>> Hi Mike,
>>
>> I don't know if you are reading email, but I rebooted yakov yesterday
>> afternoon because of some desktop weirdness I was seeing AND because a
>> new kernel had been put in /boot but was not yet being used.
>>
>> After the reboot, the latencies for the data coming from ECMWF went
>> from about 15 seconds to an hour. I seem to remember that you did some
>> tweaking on yakov after the last reboot, but I can't remember exactly
>> what was needed. Can you tell me what tuning needs to be done after a
>> reboot of yakov?
>>
>> Thanks in advance...
>>
>> Tom
--
+-----------------------------------------------------------------------------+
* Tom Yoksas UCAR Unidata Program *
* (303) 497-8642 (last resort) P.O. Box 3000 *
* address@hidden Boulder, CO 80307 *
* Unidata WWW Service http://www.unidata.ucar.edu/*
+-----------------------------------------------------------------------------+