[CONDUIT #MJC-410449]: Re: [conduit] Huge CONDUIT latencies, lost data starting ~ 00 UTC last night
- Date: Fri, 01 Mar 2013 12:07:52 -0700
Hi Becky,
re:
> Let me jump in here a bit late in the game. I know Justin was working
> this with the ldm users group, but he's out sick today.
OK.
Question:
- are folks there monitoring posts to both the address@hidden
and address@hidden email lists?
I thought that you and others were subscribed to address@hidden
so you would see posts related to CONDUIT content and other problems. I
have also been working under the assumption that folks there were
_not_ subscribed to address@hidden mainly since there is a lot
of chatter on that list that likely has nothing to do with anything you
can help with.
re:
> What I understand from yesterday was that users weren't getting the SREF
> data.
That is one issue. There is another that I will reiterate after replies
to your current list of comments.
re:
> We looked and realized that we'd only done the changes for the
> August 2012 SREF upgrade on the Silver Spring system. Not Boulder. So
> first question -- I'm guessing you all were getting the SREF from Silver
> Spring only. Can anyone give me proof that you stopped being able to
> access Silver Spring in the last week? And therefore this issue surfaced?
Hmm... this does cover part of the other issue...
We, the Unidata Program Center, had not received any products from
ncepldm1.woc.noaa.gov since 08:51:58 on 20130215; we started
receiving products from ncepldm1.woc.noaa.gov today, however:
Real-time CONDUIT volume statistics for daffy.unidata.ucar.edu:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+daffy.unidata.ucar.edu
The AOS group at the University of Wisconsin at Madison has been
having problems receiving CONDUIT data for the past several days now
both from ncepldm4.woc.noaa.gov (high latencies, overall bad service)
and ncepldm1.woc.noaa.gov (no data; perhaps being denied their feed
REQUEST(s)?). Things got so bad at UW/AOS that they started feeding
from us (idd.unidata.ucar.edu) this morning.
re:
> Now, we did put the SREF implementation on Boulder last night around 5PM
> our time.
Very good, thanks.
re:
> What issues are you seeing now?
Feed issues to UW/AOS. Here are some snippets from emails we have received
from Pete Pokrandt yesterday and today:
Posted on: 20130228.1045 MST
I never got a resolution as to why I am unable to ingest the CONDUIT
feed on idd.aos.wisc.edu from ncepldm1.woc.noaa.gov since about 9:30 AM
CST on 2/15/2013. Prior to that date I was able to connect.
I should be set up to ingest conduit from ncepldm1.woc.noaa.gov and from
ncepldm4.woc.noaa.gov
Posted on: 20130228.1137 MST
Steve, (cc Tom Yoksas)
The only contact I have ever had regarding the conduit feed has been
through the Unidata folks - Steve Chiswell in the olden days <TM>, and
Tom Yoksas more recently. I have never had any direct contact between
myself and noaa.
I have attached the 10/15/2010 email that asked me to switch my ingest
of the conduit feed to the redundant servers of ncepldm1.woc.noaa.gov
and ncep4.woc.noaa.gov. This has been working since then, up until
2/15/2013.
There is an email for a Rebecca Cosgrove at Noaa - is she still our
contact for the CONDUIT ldm servers? Tom, should I contact her or
someone direct or should this go through you?
My Comment: I was out of the office yesterday and unavailable in the
evening, so I was not around to respond to Pete's inquiries.
Posted on: 20130228.1206 MST
I am currently feeding CONDUIT from ncepldm4, that has always worked.
The problem is there is no redundancy now. If ncepldm4 drops offline, we
have no CONDUIT data.
It is good to know that you also are unable to receive conduit data from
ncepldm1. Sounds like maybe the ldm on ncepldm1 went down?
My Comment: Pete's post does not reflect his previous comments about
very high latencies while feeding from ncepldm4.woc.noaa.gov.
Posted on: 20130301.0850 MST
All,
We are losing lots of CONDUIT data, huge latencies beginning near 00 UTC
or so last night.
I don't think it is just us because the problem shows up on other sites
as well. I have attached two latency plots - unfortunately most of the
time the begin/end times aren't working on these plots, but I did look
at them yesterday and the latencies had not begun yet, so the big
increase began sometime late yesterday. Did something change?
My users reported lost data beginning with the 00 UTC model cycle.
Also, I still am unable to connect to ncepldm1.woc.noaa.gov.
Posted on: 20130301.0852 MST
By the way, I just began requesting CONDUIT also from
idd.unidata.ucar.edu since they appear to have a connection with lower
latencies.
re:
> so I was hoping we'd have some reports from you guys of what
> problems you're seeing
The two issues seem to be/have been:
- lack of SREF data
- the set of ALLOWs on the CONDUIT top-level injection machines is not uniform
re:
> So... are you all having problems accessing some or all of the CONDUIT
> boxes? If so, since when?
We are getting data from ncepldm1 again (as per the real-time CONDUIT volume
plot URL I included above). UW/AOS (idd.aos.wisc.edu) may still not be
able to REQUEST data from ncepldm1.
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: MJC-410449
Department: Support CONDUIT
Priority: Normal
Status: Closed