19990319: ROUTE PP BATCH failures at UVa (cont.)
- Date: Sat, 20 Mar 1999 12:39:01 -0700
>From: "Jennie L. Moody" <address@hidden>
>Organization: UVa
>Keywords: 199903091926.MAA12825 McIDAS ROUTE SYSIMAGE.SAV
Jennie,
re: unsetting MCPATH statement not appearing when things are working
>I don't know if that's true. Now, with the processing working
>(so far so good), the message of "unsetting MCPATH" is gone????
This has got to be a big clue somehow!
re: turn off verbose logging
>Okay. later.
re: changing of permissions on /home/mcidas/workdata
>Right. When I reread that last message, I noticed you didn't have
>to do this, you thought ldma couldn't write to the /home/mcidas/workdata
>but I saw you actually had tried to touch a file in /home/mcidas/data,
Actually, the 'touch' I did try was for /home/mcidas/workdata. The
mistake was in the email I sent back to you.
>I don't think this was actually ever the problem (nice to know I'm
>not the only one who makes mistakes)
Mistake is my middle name :-(
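The write check discussed above can be repeated with a small probe like the one below. This is only a sketch: the function name can_write and the probe file name are made up for illustration, and it assumes a POSIX shell. Run it while logged in as the user in question (e.g. 'ldma').

```shell
# can_write DIR: succeed if the current user can create a file in DIR.
# (Hypothetical helper, not part of McIDAS or the LDM.)
can_write() {
    probe="$1/.write_probe.$$"      # $$ = shell PID, keeps the name unique
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"              # clean up so no stray file is left behind
        return 0
    fi
    return 1
}

# Example use, as 'ldma':
#   can_write /home/mcidas/workdata && echo "writable" || echo "NOT writable"
```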
re: mcscour.sh has deleting of ROUTEPP.LOG
>Okay, forgot to notice that.
re: add writing message in batch.k before 'mcenv' invocation
>Okay, I'll try adding this.
This will prove/disprove the theory that GU core dumping is causing mcenv
to die.
re: another user using /home/mcidas/workdata
>I checked for this and didn't find any problem. Could that have
>happened if we had something like a power outage and a remote user
>was thrown off (windfall has an uninterruptible power supply, but
>aerial doesn't, and I know one day I was knocked off in such a manner...
Could be, but I am not sure.
>I do not recollect whether I had actually been running mcidas, but
>I am sure I had been logged into windfall....so it's possible.....)
>This was back in February (which, if I recall, was when those lingering
>segments were time-stamped).
re: make sure to check all directories in 'mcidas' MCPATH using dmap.k
>OK.
re: GU dumping causing mcenv to die
>bummer.
The correct response is "bummer, dude!"
re: others' sessions
>I looked at everyone's path...they all seem fine. Unless they are running
>some process that resets their path? Don't know what that might be...
OK.
re: think of anything else
The setup I have been recommending is a little different than the one
we use here at the UPC, but the difference should really only affect
XCD decoding. Here is what I did:
o create a /home/mcidas/upcworkdata directory and _copy_ (not link)
the files from /home/mcidas/workdata to it
o edit xcd_run and change MCDATA to point to /home/mcidas/upcworkdata
instead of /home/mcidas/data
I did this to obviate the possible impact of my building and reinstalling
McIDAS-X,-XCD frequently (testing, betas from SSEC, multiple platforms,
etc.).
My ROUTE PP BATCH processing, on the other hand, still uses
/home/mcidas/workdata as MCDATA. So, the upshot of the difference would
be that ROUTE PP BATCHing would be separated from XCD processing. While
I don't think that this should have any salutary effects, it may. Since
it is spring break for you, you might want to give this a shot. If you
decide to do this, make sure to stop the LDM before making the changes
(making the new directory; copying the files to the new directory; and
editing xcd_run that is being used by the LDM) and start it afterwards.
Of course, you will have to ensure that the read/write permissions on
the new directory and files are such that 'ldma' can read/write.
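The copy step above might be sketched roughly as follows. Treat this as an illustration only: the function name copy_workdata is made up, the group-write chmod assumes that 'ldma' and 'mcidas' share a group, and the 'ldmadmin' commands in the comments assume a standard LDM installation. The MCDATA line in xcd_run still has to be edited by hand.

```shell
# copy_workdata SRC DST: create DST and copy (not link) the files from SRC.
# (Hypothetical helper; paths are passed in rather than hard-coded.)
copy_workdata() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # -p keeps modes and timestamps; "$src"/. copies the directory contents
    cp -pR "$src"/. "$dst"/ || return 1
    # Assumption: group read/write is enough for 'ldma' to use the files
    chmod -R g+rwX "$dst"
}

# Suggested order, run as 'mcidas':
#   1. stop the LDM                      (e.g. ldmadmin stop, as 'ldma')
#   2. copy_workdata /home/mcidas/workdata /home/mcidas/upcworkdata
#   3. edit MCDATA in the xcd_run used by the LDM
#   4. start the LDM again               (e.g. ldmadmin start, as 'ldma')
```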
>Sorry to have become a pain in the ass. That's how it feels on this end.
No problem. This would be more interesting to me IF I wasn't working
so hard to get a new distribution out of the door.
Tom
>From address@hidden Sat Mar 20 14:41:10 1999
>re: mcidas.log message must be a clue
Struck me that way....., so when post-processing dies, the mcidas
decoders (not the xcd-decoders mind you) give this unique message
in the mcidas log?
> re: turn off verbose logging
I haven't done this yet, maybe I'll wait to see if it
happens next time (arghhh)
>re: I actually checked /home/mcidas/workdata for write
Are you sure then, because it seemed really strange to me that ldma
couldn't write to that directory (recall it *had* been writing to
that directory previously) and ldma and mcidas are in same group,
etc....it really made no sense.
> re: another user using /home/mcidas/workdata
actually, what I wrote below didn't have anything to do
with another user using /home/mcidas/workdata, it was
my response to your wondering if we need some kind of
shared memory patch (I was wondering if that kind of crash
could cause a process to die without releasing shared
memory; I am not sure if this is the concept?)
>re: The correct response is "bummer, dude!"
I *hate* the phrase dude, I strongly discourage my sons from
saying it.