[IDD #OMZ-874415]: GOESR ldm feed
- Subject: [IDD #OMZ-874415]: GOESR ldm feed
- Date: Wed, 26 Sep 2018 13:41:09 -0600
Hi Carol,
Steve, Mike and I just finished discussing the situation that you ran
into on typhoon... very strange indeed, and something that we have
never seen before.
The following is more "for the files" than anything else:
- the problem encountered was an ldmd process trying to write a
new product into the LDM queue but being unable to do so because
all of the products in the queue were locked
The 6.13.6 ldmd process exited when it was unable to write the
product, and the LDM kept running. Steve assured us that this
was valid behavior.
- we (I) recommended increasing the LDM queue size from 500M to
8G, but noted that this was to keep in line with the recommendation
that about an hour be kept in the local queue
This recommendation allows the LDM to do better duplicate product
detection and elimination.
NB: we do NOT think that using a 500M queue was the cause of the
problem seen!
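For the files, the arithmetic behind the "about an hour" recommendation can be
sketched as follows. This is only an illustration: the ingest rate used is
hypothetical (it was not measured on typhoon), the function names are made up
for this note, and M and G are treated as powers of 1000 purely for simplicity.

```python
# Rough queue-sizing arithmetic; not an LDM utility.

def queue_size_for_residency(ingest_bytes_per_s: float,
                             residency_s: float = 3600.0) -> float:
    """Bytes of queue needed to hold `residency_s` seconds of incoming data."""
    return ingest_bytes_per_s * residency_s


def residency_for_queue(queue_bytes: float, ingest_bytes_per_s: float) -> float:
    """Seconds of data a queue of `queue_bytes` holds at a steady ingest rate."""
    return queue_bytes / ingest_bytes_per_s


M = 1_000_000
G = 1_000_000_000

# ASSUMPTION: a steady 2 MB/s ingest rate, chosen only to make the numbers easy.
assumed_rate = 2 * M

print(queue_size_for_residency(assumed_rate) / G)   # 7.2 (GB for one hour)
print(residency_for_queue(500 * M, assumed_rate))   # 250.0 (seconds in a 500M queue)
print(residency_for_queue(8 * G, assumed_rate))     # 4000.0 (seconds in an 8G queue)
```

At a rate like the one assumed here, a 500M queue would hold only a few minutes
of data, while an 8G queue would hold a little over an hour. The real figure
depends on the actual volume of the feeds being requested.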
- the only scenario that makes any sense to us given the symptoms
we see represented by LDM log file messages is that somehow the
500M queue got corrupted, and this was manifested in locks on
products in the queue not being released
The creation of a new queue, 8G in your case, should have remedied
the problem in the 500M queue if there was one. Your reporting that
you ran into the same problem sometime around 15:46 yesterday afternoon
_after_ making a new, 8G queue, does not support the guesstimate that
the problem was a corrupted queue.
Question:
- is it possible that the ordering of making a new queue was different
than what you reported?
I.e., is it possible that the LDM was restarted while the existing 500M
queue was still being used, and the problem was run into again at more
or less 15:46? We ask this since we are trying to reconcile the LDM
restarts we see reflected in the LDM log files in ~ldm/var/logs.
Closing comment:
- the LDM registry entry for <datadir-path> in the <pqact> section
is currently:
/usr/local/ldm/var/data
<datadir-path> will be the current working directory for 'pqact'
invocations. This is why the log file for the 'grbfile.sh' process
is located in /usr/local/ldm/var/data/logs given that the
log file specified in the ~ldm/etc/pqact_satellite.conf pattern-action
file is the relative 'logs/grbfile.log'. If you want the log files
for the 'grbfile.sh' process to be put in the same directory as the
LDM log files, ~ldm/var/logs, you will need to either change the
<datadir-path> directory to /usr/local/ldm/var, or change the values
specified in pqact_satellite.conf.
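The way a relative log file name ends up under <datadir-path> can be
illustrated with a few lines of Python. This is a sketch of ordinary
relative-path resolution, not of anything in the LDM source; the helper name
is made up, and it assumes that ~ldm is /usr/local/ldm on your machine.

```python
import posixpath


def resolve_action_path(working_dir: str, action_path: str) -> str:
    """Resolve a path the way a process whose current working directory is
    `working_dir` would: relative paths are joined to it, absolute paths win."""
    return posixpath.normpath(posixpath.join(working_dir, action_path))


# Current setup: <datadir-path> is /usr/local/ldm/var/data
print(resolve_action_path("/usr/local/ldm/var/data", "logs/grbfile.log"))
# /usr/local/ldm/var/data/logs/grbfile.log

# Option 1: change <datadir-path> to /usr/local/ldm/var
print(resolve_action_path("/usr/local/ldm/var", "logs/grbfile.log"))
# /usr/local/ldm/var/logs/grbfile.log

# Option 2: keep <datadir-path>, but give an absolute path in the action
# (ASSUMPTION: ~ldm expands to /usr/local/ldm)
print(resolve_action_path("/usr/local/ldm/var/data",
                          "/usr/local/ldm/var/logs/grbfile.log"))
# /usr/local/ldm/var/logs/grbfile.log
```

Either option puts grbfile.log alongside the LDM log files. Keep in mind that
changing <datadir-path> also changes where every other relative path in your
pattern-action files resolves.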
I find it most useful if all of the LDM related log files are located
in the same directory, so I recommend modifying either the actions
in pqact_satellite.conf or changing <datadir-path> in the LDM registry.
NB: if you change definitions in the LDM registry, the LDM will need
to be restarted for the change(s) to become active.
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: OMZ-874415
Department: Support IDD
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.