
[CONDUIT #APE-897679]: Missing data on idd.unidata.ucar.edu feed



Hi Justin,

re:
> Today we got a report from Doug Schuster (TIGGE) about missing GEFS
> datasets on the CONDUIT feed. We did not notice any processing errors
> on our end and Doug didn't see any LDM errors on his side.
> 
> Doug gets his data from idd.unidata.ucar.edu and idd.cise-nsf.gov,
> do you maintain those systems?

Yes, we maintain these systems.

CONDUIT data gets to idd.unidata.ucar.edu via four paths:

idd.aos.wisc.edu, idd.meteo.psu.edu, daffy.unidata.ucar.edu, and 
idd.cise-nsf.gov

CONDUIT data gets to idd.cise-nsf.gov via atm.cise-nsf.gov which,
in turn, gets the data from idd.unidata.ucar.edu.

So, the independent sources of CONDUIT are idd.aos.wisc.edu,
idd.meteo.psu.edu and daffy.unidata.ucar.edu.  daffy.unidata.ucar.edu
maintains redundant requests for all CONDUIT data from the WOC top levels
back east and here in Boulder:

request CONDUIT "[09]$" ncepldm1.woc.noaa.gov
request CONDUIT "[18]$" ncepldm1.woc.noaa.gov
request CONDUIT "[27]$" ncepldm1.woc.noaa.gov
request CONDUIT "[36]$" ncepldm1.woc.noaa.gov
request CONDUIT "[45]$" ncepldm1.woc.noaa.gov

request CONDUIT "[09]$" ncepldm4.woc.noaa.gov
request CONDUIT "[18]$" ncepldm4.woc.noaa.gov
request CONDUIT "[27]$" ncepldm4.woc.noaa.gov
request CONDUIT "[36]$" ncepldm4.woc.noaa.gov
request CONDUIT "[45]$" ncepldm4.woc.noaa.gov
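
As a rough illustration of why the feed is requested in five pieces, the
sketch below checks that the five patterns above split products by the final
digit of the product identifier, with each digit 0-9 matched by exactly one
request. It assumes that CONDUIT product identifiers end in a sequence number
(not stated in this message) and it uses Python's re module as a stand-in for
the regular expression matching that LDM performs; the function names are
hypothetical.

```python
import re

# The five request patterns from the ldmd.conf excerpt above.
PATTERNS = ["[09]$", "[18]$", "[27]$", "[36]$", "[45]$"]


def matching_patterns(product_id):
    """Return every request pattern whose regex matches the product ID."""
    return [p for p in PATTERNS if re.search(p, product_id)]


def covers_all_digits():
    """True if each final digit 0-9 is matched by exactly one pattern."""
    # "example product" is a made-up identifier prefix; only the last
    # character matters to these patterns.
    return all(
        len(matching_patterns("example product " + digit)) == 1
        for digit in "0123456789"
    )
```

Calling covers_all_digits() confirms that no product is requested twice from
the same upstream host and none is left out, so each of the five connections
to a host carries a disjoint share of the feed.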

re:
> Can you check the LDM logs and see if you had any errors during the time
> the 12Z GEFS is disseminated (16:40Z - 17:20Z)?

I have attached the LDM log file from daffy.unidata.ucar.edu for today
(March 8).  This log file shows several problems on daffy: some were
name server lookup failures and others were connections reset by peer.

Here are two snippets from the log file, each of which shows one of the problems:

Name server problem:

Mar  8 10:03:03 daffy ncepldm1.woc.noaa.gov[6412] ERROR: Couldn't get IP address for upstream host ncepldm1.woc.noaa.gov; Couldn't resolve "ncepldm1.woc.noaa.gov" to an Internet address in 25.053 seconds; hostname lookup timeout

Connection reset by peer:

Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6412] ERROR: readtcp(): read() error on socket 4
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6412] ERROR: one_svc_run(): RPC layer closed connection
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6412] ERROR: Disconnecting due to LDM failure; Connection to upstream LDM closed
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6412] NOTE: LDM-6 desired product-class: 20110308162024.578 TS_ENDT {{CONDUIT,  "[18]$"},{NONE,  "SIG=36e2c8b2b3c72b005a5774c9212d3537"}}
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6410] ERROR: Connection reset by peer
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6410] ERROR: readtcp(): read() error on socket 4
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6410] ERROR: one_svc_run(): RPC layer closed connection
Mar  8 10:20:24 daffy ncepldm1.woc.noaa.gov[6410] ERROR: Disconnecting due to LDM failure; Connection to upstream LDM closed
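
For anyone who wants to line entries like these up against the 16:40Z - 17:20Z
dissemination window, here is a minimal sketch that parses log lines of the
form shown above and keeps the ERROR entries that fall inside a UTC window.
It rests on two assumptions that this message does not state: that the log
timestamps are Mountain Standard Time (UTC-7), and that the year is 2011
(taken from the product-class time in the NOTE line). The function and
constant names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: log timestamps are Mountain Standard Time (UTC-7).
MST = timezone(timedelta(hours=-7))

# 12Z GEFS dissemination window quoted in the request, in UTC.
WINDOW_START = datetime(2011, 3, 8, 16, 40, tzinfo=timezone.utc)
WINDOW_END = datetime(2011, 3, 8, 17, 20, tzinfo=timezone.utc)

# month, day, time, logging host, upstream host, pid, level, message
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) (\S+) (\S+)\[(\d+)\] (\w+): (.*)$"
)


def parse_line(line, year=2011):
    """Parse one log line; return a dict, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    mon, day, clock, host, upstream, pid, level, msg = m.groups()
    local = datetime.strptime(f"{year} {mon} {day} {clock}", "%Y %b %d %H:%M:%S")
    return {
        "utc": local.replace(tzinfo=MST).astimezone(timezone.utc),
        "host": host,
        "upstream": upstream,
        "pid": int(pid),
        "level": level,
        "message": msg,
    }


def errors_in_window(lines, start=WINDOW_START, end=WINDOW_END):
    """Return parsed ERROR entries whose UTC time lies in [start, end]."""
    hits = []
    for line in lines:
        rec = parse_line(line)
        if rec and rec["level"] == "ERROR" and start <= rec["utc"] <= end:
            hits.append(rec)
    return hits
```

Passing the lines of the attached log to errors_in_window() gives the list of
errors to compare against the missing GEFS products. If the log turns out to
be written in UTC or another zone, only the MST constant needs to change.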

re:
> I'm particularly interested if you had any connection errors to our
> CONDUIT system.

It seems that the answer is yes: we did experience problems today.  Whether
or not they resulted in missing products cannot be determined from the log
output, but it is likely that the outage Doug reported is related to, or
caused by, the problems we saw.

re:
> We have seen occasional reports from other users of
> ftp.ncep.noaa.gov and NOMADS about connection issues starting last
> week. Our system administrators are troubleshooting, but so far there
> are no smoking guns. This is the first report of CONDUIT possibly have
> a connection issue.

I believe that we have seen the name service problem before.  I have a vague
recollection of this happening when the CONDUIT WOC toplevel here in Boulder
was first being established.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: APE-897679
Department: Support CONDUIT
Priority: Normal
Status: Closed