[McIDAS #TAL-270196]: Not getting MD data
- Subject: [McIDAS #TAL-270196]: Not getting MD data
- Date: Thu, 28 Jul 2016 10:02:45 -0600
Hi Heather,
re:
> So we stopped decoding the MD data again. Exactly at the end of the day
> yesterday. Not good!
I agree, this is not good!
re:
> I did the greps that you ask me to do. The first one came up with nothing,
> and the second grep came up just fine.
I should have had you do one or two more 'ps' invocations to make sure that
what I was expecting to see would be listed. On all of our systems, the
'ps -eaf | grep DM | grep -v grep' invocation lists all of the McIDAS-XCD
data monitors running on the system. I assumed that this would be the
case for your system since ** I think ** you are running either RedHat Enterprise
6.x or CentOS 6.x. I guess it might be possible that the data monitors wouldn't
show up in this 'ps' invocation, but the real executables (e.g., dmsfc.k) would
in that case.
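A rough sketch of a check that covers both cases mentioned above. The
function name xcd_procs is made up for illustration, and the dm*.k pattern
is an assumption based only on the dmsfc.k example; adjust it to match what
your system actually shows:

```shell
# Hypothetical helper: reads 'ps -eaf' style output on stdin and prints the
# lines that mention a data monitor (anything containing "DM") or an
# executable named like dmsfc.k (dm<name>.k, an assumed naming pattern).
# Lines belonging to grep itself are skipped, like 'grep -v grep'.
xcd_procs() {
    awk '!/grep/ && (/DM/ || /^dm[a-z0-9]+\.k/ || /[^a-zA-Z0-9]dm[a-z0-9]+\.k/)'
}

# Usage:
#   ps -eaf | xcd_procs
```

If this prints nothing, neither form of the data monitor processes is in the
process table.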
re:
> Yesterday while everything was running okay, both
> came up with the processes that you said they would.
OK, good. This answered a couple of questions that I was
ready to pose.
re:
> Here is a screen grab of what I am seeing:
>
> [ldm@npxcd ~]$ cd /data/
> [ldm@npxcd data]$ ls -ltr MDXX*
> -rw-rw-r-- 1 ldm ldm 5604680 Jul 27 09:13 MDXX0018
> -rw-rw-r-- 1 ldm ldm 2897136 Jul 27 09:13 MDXX0028
> -rw-rw-r-- 1 ldm ldm 542880 Jul 27 09:13 MDXX0057
> -rw-rw-r-- 1 ldm ldm 7097768 Jul 27 10:50 MDXX0109
> -rw-rw-r-- 1 ldm ldm 5961852 Jul 27 13:25 MDXX0068
> -rw-rw-r-- 1 ldm ldm 14156120 Jul 27 18:07 MDXX0119
> -rw-rw-r-- 1 ldm ldm 12630748 Jul 27 20:00 MDXX0038
> -rw-rw-r-- 1 ldm ldm 49844936 Jul 27 20:00 MDXX0008
> -rw-rw-r-- 1 ldm ldm 8819736 Jul 27 20:03 MDXX0058
> -rw-rw-r-- 1 ldm ldm 6020504 Jul 27 21:02 MDXX0069
> -rw-rw-r-- 1 ldm ldm 6615720 Jul 27 22:05 MDXX0019
> -rw-rw-r-- 1 ldm ldm 4817376 Jul 27 22:07 MDXX0029
> -rw-rw-r-- 1 ldm ldm 3569768 Jul 27 22:50 MDXX0110
> -rw-rw-r-- 1 ldm ldm 50471136 Jul 27 23:02 MDXX0009
> -rw-rw-r-- 1 ldm ldm 12976672 Jul 27 23:47 MDXX0039
> -rw-rw-r-- 1 ldm ldm 657952 Jul 27 23:53 MDXX0030
> -rw-rw-r-- 1 ldm ldm 1020352 Jul 27 23:53 MDXX0020
> -rw-rw-r-- 1 ldm ldm 10408436 Jul 27 23:59 MDXX0010
> -rw-rw-r-- 1 ldm ldm 8863248 Jul 27 23:59 MDXX0059
> -rw-rw-r-- 1 ldm ldm 2319312 Jul 27 23:59 MDXX0060
> -rw-rw-r-- 1 ldm ldm 12438832 Jul 27 23:59 MDXX0040
> -rw-rw-r-- 1 ldm ldm 1095628 Jul 27 23:59 MDXX0070
OK. For reference, here is a long listing of the sizes of our
SFCHOURLY MD files from one of our motherlode-class machines:
% ls -alt /data/ldm/pub/decoded/mcidas/RTPTSRC/SFCHOURLY
total 860468
-rw-rw-r-- 1 ldm ustaff 35359336 Jul 28 15:53 MDXX0010
-rw-rw-r-- 1 ldm ustaff 50468136 Jul 28 15:51 MDXX0009
-rw-rw-r-- 1 ldm ustaff 50517136 Jul 28 00:00 MDXX0008
-rw-rw-r-- 1 ldm ustaff 50553936 Jul 27 00:00 MDXX0007
-rw-rw-r-- 1 ldm ustaff 50517136 Jul 25 21:48 MDXX0006
-rw-rw-r-- 1 ldm ustaff 50468136 Jul 24 21:55 MDXX0005
-rw-rw-r-- 1 ldm ustaff 50546936 Jul 24 00:00 MDXX0004
-rw-rw-r-- 1 ldm ustaff 50530636 Jul 22 21:18 MDXX0003
-rw-rw-r-- 1 ldm ustaff 50530636 Jul 21 21:05 MDXX0002
Notice that a complete SFCHOURLY (METAR) MD file should be
on the order of 50 MB per day.
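As a rough way to apply that rule of thumb, here is a sketch that lists
files smaller than a chosen size. The function name small_md_files and the
45000000 byte threshold in the usage line are assumptions for illustration,
not part of McIDAS-XCD. Which MDXX numbers hold SFCHOURLY data on your
system is also something to confirm; the file names in the usage line are
taken from the listings above:

```shell
# Hypothetical helper: print "size name" for each given file whose size in
# bytes is below the threshold passed as the first argument.
small_md_files() {
    min_bytes=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue
        # Arithmetic expansion strips any padding that 'wc -c' may add.
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -lt "$min_bytes" ]; then
            printf '%s %s\n' "$size" "${f##*/}"
        fi
    done
}

# Usage (threshold and file list are assumptions):
#   small_md_files 45000000 /data/MDXX0008 /data/MDXX0009 /data/MDXX0010
```

The file for the current day will legitimately be smaller until the day is
complete, so a small size there is only a concern if the file has also
stopped growing.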
re:
> [ldm@npxcd data]$ date
> Thu Jul 28 06:42:31 EDT 2016
> [ldm@npxcd data]$ ps -eaf | grep DM | grep -v grep
The fact that this listing was empty is disturbing. XCD is designed
to automatically restart data monitors that exit, so even if they
died or were killed, they should have been restarted.
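Since the 'ps' listing alone may not tell the whole story, another way to
spot a stall is to see whether any MD file has been written recently. This
is only a sketch: the function name recent_md_files is made up, the 60
minute window is an arbitrary choice, and it relies on the -maxdepth and
-mmin options of 'find', which are extensions found in GNU and BSD find:

```shell
# Hypothetical helper: list MDXX* files directly inside a directory that
# were modified within the last N minutes.
recent_md_files() {
    dir=$1
    minutes=$2
    find "$dir" -maxdepth 1 -name 'MDXX*' -type f -mmin "-$minutes"
}

# Usage (directory from the listing above; window is an assumption):
#   recent_md_files /data 60
```

An empty result while the ingesters are still running would suggest that
decoding has stopped even though data is still arriving.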
re:
> [ldm@npxcd data]$ ps -eaf | grep inge | grep -v grep
> ldm 21115 21102 0 Jul27 ? 00:00:00 ingetext.k DDS
> ldm 21116 21102 0 Jul27 ? 00:00:00 ingebin.k HRS
> ldm 21135 21116 1 Jul27 ? 00:12:13 ingebin.k HRS
> ldm 21138 21115 0 Jul27 ? 00:01:55 ingetext.k DDS
> [ldm@npxcd data]$
This looks correct.
re:
> Any idea why my mcidas decoder is stopping? How can I fix this?
Unfortunately, I don't have an answer to either of these questions.
Comment:
- if your McIDAS-XCD decoders continue to run with no problems
when using the previously installed version of the LDM on
your machine, you should switch back to it immediately
Again, I have _no_ idea why/how this would/could be the case,
but we can sort those issues out later.
re:
> Thanks!
Sorry for your problems...
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: TAL-270196
Department: Support McIDAS
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.