20020730: LDM latency problem - followup (cont.)
- Subject: 20020730: LDM latency problem - followup (cont.)
- Date: Wed, 31 Jul 2002 09:48:55 -0600
>From: Mike Voss <address@hidden>
>Organization: SJSU
>Keywords: 200207302019.g6UKJL905705 IDD latency
Mike,
>No smoke here :-)
I am relieved ;-)
>Well, at first I suspected that aeolus was having latency issues and so I
>sent this email to Larry last Thursday:
>---------- Forwarded message ----------
>Date: Thu, 25 Jul 2002 07:59:52 -0700 (PDT)
>From: Mike Voss <address@hidden>
>To: address@hidden
>Subject: ldm feed
>Hi Larry,
>My primary feed is Arizona, but I have been feeding from your machine
>recently. Since about Sunday (7-21) the HDS and MCIDAS data has been
>inconsistent...I believe because of latency issues. I've been trying to
>figure out if my machine is not handling the load or what. When I do a
>notifyme on aeolus I get the following:
>----snip----
>Jul 25 14:45:13 notifyme[25457]: NOTIFYME(aeolus.ucsd.edu): OK
>Jul 25 14:45:14 notifyme[25457]: 925 20020725144514.924 IDS|DDPLUS
>171 FPUS73 KEAX 251444 /pNOWMCI
>Jul 25 14:45:14 notifyme[25457]: 13639 20020725144514.409 NNEXRAD
>40018202 SDUS54 KEWX 251444 /pN0VEWX
>Jul 25 14:45:14 notifyme[25457]: 15347 20020725144514.509 NNEXRAD
>40018203 SDUS54 KEWX 251444 /pN0SEWX
>Jul 25 14:45:14 notifyme[25457]: 616 20020725144514.530 NNEXRAD
>40018205 SDUS52 KJAX 251439 /pNVLVAX
>Jul 25 14:45:14 notifyme[25457]: 7051 20020725144514.580 NNEXRAD
>40018207 SDUS72 KJAX 251439 /pN1VJAX
>Jul 25 14:45:14 notifyme[25457]: 9776 20020725144514.661 NNEXRAD
>40018210 SDUS24 KLUB 251439 /pN2RLBB
>Jul 25 14:45:14 notifyme[25457]: 6560 20020725144514.708 NNEXRAD
>40018212 SDUS76 KLOX 251440 /pN1VVTX
>Jul 25 14:45:14 notifyme[25457]: 109 20020725144515.146 IDS|DDPLUS
>181 SAUS45 KCYS 251444 /pMTRBFU
>Jul 25 14:45:14 notifyme[25457]: 8559 20020725144516.069 HDS 184
>SDUS83 KGRB 251441 /pDPAGRB
>----snip---
>...which looks good. But when I add a time offset, i.e.:
>---snip-----
>rossby:~>notifyme -vl - -h aeolus.ucsd.edu -o 11000
>Jul 25 14:54:16 notifyme[25479]: Starting Up: aeolus.ucsd.edu:
>20020725115056.124 TS_ENDT {{ANY, ".*"}}
>Jul 25 14:54:16 notifyme[25479]: NOTIFYME(aeolus.ucsd.edu): OK
>Jul 25 14:54:16 notifyme[25479]: 5926 20020725115056.237 IDS|DDPLUS
>699 SXUS56 KSCS 251100 /pSCNGA
>Jul 25 14:54:16 notifyme[25479]: 3478 20020725115056.249 IDS|DDPLUS
>700 SXUS56 KSCS 251100 /pSCNIA
>Jul 25 14:54:16 notifyme[25479]: 7484 20020725115056.284 IDS|DDPLUS
>702 SXUS86 KSGX 251150 /pOMRSGX
>Jul 25 14:54:17 notifyme[25479]: 6520 20020725115056.141 NNEXRAD
>39914748 SDUS35 KGGW 251148 /pN3SGGW
>Jul 25 14:54:17 notifyme[25479]: 6652 20020725115056.183 NNEXRAD
>39914749 SDUS23 KFSD 251148 /pN2SFSD
>Jul 25 14:54:17 notifyme[25479]: 2537 20020725115056.202 NNEXRAD
>39914750 SDUS35 KMSO 251146 /pNVWMSX
>Jul 25 14:54:17 notifyme[25479]: 612 20020725115056.457 HDS 704
>SFUS41 KWBC 251149
>-----snip---
>...which tells me there is data over three hours old in your queue; maybe
>that is on purpose.
>Anyway, I just thought I would check and see whether all your data is coming
>in in a timely manner.
>Thanks,
>Mike
>----end of forwarded message
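The "over three hours old" reading follows from the offset arithmetic: the "Starting Up" time in the output above is the time notifyme was run minus the -o value. A minimal Python sketch of that check, using only the two timestamps shown in the log (the function name notifyme_start is made up for illustration and is not part of the LDM):

```python
from datetime import datetime, timedelta

def notifyme_start(now: datetime, offset_seconds: int) -> datetime:
    """Return the 'from' time implied by running notifyme with -o offset_seconds."""
    return now - timedelta(seconds=offset_seconds)

# Time of the "Starting Up" log line in the forwarded message (UTC).
run_time = datetime(2002, 7, 25, 14, 54, 16)
start = notifyme_start(run_time, 11000)

print(start.strftime("%Y%m%d%H%M%S"))   # 20020725115056
print(round(11000 / 3600, 2), "hours")  # 3.06 hours
```

The computed start matches the 20020725115056 stamp in the "Starting Up" line, so products listed right at that stamp were roughly three hours old when the command was run.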
I agree with your reasoning in asking Larry if it was by design that his
queue has HRS data that is 3 hours old.
>Larry responded that things looked good on his end and he copied support:
>http://www.unidata.ucar.edu/glimpse/idd/5776
Like I said, I wasn't in the loop until I sent you my first email. Right
now we are holding workshops, so lots of folks are involved elsewhere
(mine starts on this coming Monday).
>I was on vacation last week, and didn't pursue this any further. Now here we
>are.
>
>LDM access to rossby:
>
>First ssh to metsun1.met.sjsu.edu as "ldm" and xxx
>then ssh to rossby from metsun1 as "ldm" and xxx
OK, I'm on.
>Sorry for the sorry state of my config files, I've been blasting away on
>them trying different options.
No problem.
>Notes:
>
>- I did change the ldmd.conf on rossby to allow all the HDS in.
I see this.
>- the RPC errors in ldmd.conf are recent...I believe since I upgraded to
>  LDM-5.2 this morning.
OK.
>- notifyme does not seem to work right now on the local host...gives an RPC
>  error in the log. "ldmadmin watch" works fine....
Hmm... Not good.
>- I'll be looking at this stuff from home tonight to see how the 00Z HDS flood
> is handled
I see LOTS of RECLASS messages in your ~ldm/logs/ldmd.log file.
>cheers, and thanks for the help!
You have to hold up the thanks until I do something.
Tom
>From address@hidden Wed Jul 31 09:29:27 2002
>Subject: Re: 20020730: 20020730: LDM latency problem - more followup
Tom,
Here is some notifyme output from this morning, I think it sheds light
on the problem. When I do a notifyme on aeolus.ucsd.edu I only need to
go out to an offset of 200 seconds to get data:
rossby:~>notifyme -vl - -h aeolus.ucsd.edu -f HDS -o 200
Jul 31 15:19:09 notifyme[10452]: Starting Up: aeolus.ucsd.edu:
20020731151549.730 TS_ENDT {{HDS, ".*"}}
Jul 31 15:19:10 notifyme[10452]: NOTIFYME(aeolus.ucsd.edu): OK
Jul 31 15:19:11 notifyme[10452]: 9102 20020731151550.668 HDS 396
JUSA42 KWNO 311400
Jul 31 15:19:11 notifyme[10452]: 5969 20020731151550.769 HDS 404
SDUS84 KMOB 311512 /pDPAEVX
Jul 31 15:19:11 notifyme[10452]: 5126 20020731151550.868 HDS 406
SDUS82 KTBW 311513 /pDPATBW
But on rossby.met.sjsu.edu I need to go out to 3700 seconds before I
get anything:
rossby:~>notifyme -vl - -f HDS -o 3400 -T 30
Jul 31 15:25:13 notifyme[10771]: Starting Up: localhost: 20020731142833.144
TS_ENDT {{HDS, ".*"}}
Jul 31 15:25:13 notifyme[10771]: NOTIFYME(localhost): OK
Jul 31 15:25:44 notifyme[10771]: Timed out after 30 seconds inactivity
Jul 31 15:25:44 notifyme[10771]: Disconnect
rossby:~>notifyme -vl - -f HDS -o 3700 -T 30
Jul 31 15:26:12 notifyme[10784]: Starting Up: localhost: 20020731142432.116
TS_ENDT {{HDS, ".*"}}
Jul 31 15:26:12 notifyme[10784]: NOTIFYME(localhost): OK
Jul 31 15:26:12 notifyme[10784]: 3264 20020731142436.272 HDS 045
YPID91 KWBF 311200 /mNGM
Jul 31 15:26:12 notifyme[10784]: 6930 20020731142436.289 HDS 046
YPQD91 KWBF 311200 /mNGM
Jul 31 15:26:12 notifyme[10784]: 2050 20020731142436.301 HDS 047
YPND91 KWBF 311200 /mNGM
This all tells me the latency is between me and aeolus... correct?
I will be in a meeting all morning. Feel free to log in to rossby as
indicated in the prior email.
Cheers,
Mike
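The comparison above can be made numerically by subtracting the product timestamp in each notifyme line from the time the line was logged. The Python sketch below is only an illustration under stated assumptions: it expects each entry joined onto one line (the archive wraps them), it treats the 14-digit stamp as the product time in UTC, and it borrows the year from that stamp because the log prefix has none. The function name product_age_seconds is hypothetical and not part of the LDM.

```python
import re
from datetime import datetime, timedelta

# Log prefix, product size, then the YYYYMMDDhhmmss.mmm product timestamp.
LINE = re.compile(
    r"^(\w{3}) +(\d+) (\d\d:\d\d:\d\d) notifyme\[\d+\]: +(\d+) (\d{14})\.(\d{3}) "
)

def product_age_seconds(line: str) -> float:
    """Seconds between the log time and the product timestamp on one line."""
    m = LINE.match(line)
    if m is None:
        raise ValueError("not a notifyme product line")
    month, day, clock, _size, stamp, millis = m.groups()
    created = datetime.strptime(stamp, "%Y%m%d%H%M%S") + timedelta(
        milliseconds=int(millis)
    )
    # Assumption: the log line was written in the same year as the product.
    logged = datetime.strptime(
        f"{created.year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
    )
    return (logged - created).total_seconds()

rossby = ("Jul 31 15:26:12 notifyme[10784]: 3264 20020731142436.272 "
          "HDS 045 YPID91 KWBF 311200 /mNGM")
aeolus = ("Jul 31 15:19:11 notifyme[10452]: 9102 20020731151550.668 "
          "HDS 396 JUSA42 KWNO 311400")

print(round(product_age_seconds(rossby), 3))  # 3695.728
print(round(product_age_seconds(aeolus), 3))  # 200.332
```

These ages mostly reflect the -o offsets that were requested. The more telling result is that rossby returned nothing at -o 3400 while aeolus returned products at -o 200, which is consistent with the HDS data on rossby running roughly an hour behind.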
>From address@hidden Wed Jul 31 13:40:07 2002
>Subject: Re: 20020730: LDM latency problem - followup (cont.)
Tom,
Sorry to keep spamming you all; I know you're busy with the workshops. I
found something interesting (I don't know why this didn't jump out at
me before). I have our ingest machine monitored with MRTG, which clearly
shows a big drop-off in data flow 10 days ago. Look at the "monthly" graph:
http://130.65.81.201/mrtg/130.65.80.62.4007.html
This may indicate a router problem on our end....I have our network
folks investigating this now. cheers,
Mike