20021113: ldm having issues starting up related to McIDAS-XCD (cont.)
- Subject: 20021113: ldm having issues starting up related to McIDAS-XCD (cont.)
- Date: Wed, 13 Nov 2002 14:01:33 -0700
>From: William C Klein <address@hidden>
>Organization: Valparaiso
>Keywords: 200211131551.gADFpqL24688 LDM ldmd.conf xcd_run McIDAS-XCD
Bill,
I logged onto aeolus as 'ldm'. Here is what I looked for and found:
-> Time to move on to the rest:
re: permissions on the McIDAS-XCD executables that 'xcd_run' tries to run
These look OK.
re: the permissions on the directories that XCD decoders want to write to
These also look OK.
re: there are no inter process communication handles left on the system.
This is the problem. As 'ldm', I ran ipcs and got a long listing of
interprocess communication handles that needed to be removed:
[ 35 ] > ipcs
IPC status from <running system> as of Wed Nov 13 14:30:23 CST 2002
T ID KEY MODE OWNER GROUP
Message Queues:
Shared Memory:
m 0 0x50000d2f --rw-r--r-- root root
m 1 0 --rw------- ldm vumcidas
m 2 0 --rw------- ldm vumcidas
m 3 0 --rw------- ldm vumcidas
m 4 0 --rw------- ldm vumcidas
m 5 0 --rw------- ldm vumcidas
m 6 0 --rw------- ldm vumcidas
m 7 0 --rw------- ldm vumcidas
m 8 0 --rw------- ldm vumcidas
m 9 0 --rw------- ldm vumcidas
m 10 0 --rw------- ldm vumcidas
m 11 0 --rw------- ldm vumcidas
m 12 0 --rw------- ldm vumcidas
...
m 97 0 --rw------- ldm vumcidas
m 98 0 --rw------- ldm vumcidas
m 99 0 --rw------- ldm vumcidas
Semaphores:
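A listing this long is easier to read as a per-owner count. The following is a minimal sketch in POSIX sh (not the csh used in this session). It assumes the Solaris-style column layout shown above (T ID KEY MODE OWNER GROUP); the function name count_shm_by_owner is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count shared memory segments per owner.
# Reads `ipcs -m` style output on stdin and assumes the column
# layout from the listing above: T ID KEY MODE OWNER GROUP.
count_shm_by_owner() {
    # Only lines whose first field is "m" describe a segment;
    # the owner is the fifth field.
    awk '$1 == "m" { n[$5]++ } END { for (o in n) print o, n[o] }' | sort
}

# Example against a live system (uncomment to use):
# ipcs -m | count_shm_by_owner
```

A large count for 'ldm' with nothing running under that account would point at leftover segments like the ones listed above.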
My suspicion that the ~ldm/.mctmp directory would contain lots of
subdirectories was also confirmed:
[ aeolus : ldm : ~ ]
[ 39 ] > ls .mctmp
1 116 126 136 20 30 40 5 6 7 78 88 97
10 117 127 138 21 31 403 50 60 70 79 89 98
102 118 128 14 22 32 41 51 61 701 8 9 99
103 119 129 15 23 33 42 52 62 702 80 90
104 12 13 16 24 34 43 53 63 71 81 91
105 120 130 17 25 35 44 54 64 72 82 92
106 121 131 1742 26 36 45 55 65 73 83 93
108 122 132 1743 27 37 46 56 66 74 84 94
11 123 133 18 28 38 47 57 67 75 85 946
114 124 134 19 29 39 48 58 68 76 86 95
115 125 135 2 3 4 49 59 69 77 87 96
I removed the .mctmp subdirectories and cleaned up the IPC handles:
[ aeolus : ldm : ~/.mctmp ]
[ 42 ] > rm -rf *
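A more conservative variant of the cleanup above removes only the numerically named subdirectories, which is all the listing shows. This is a sketch in POSIX sh rather than csh. The function name clean_mctmp is hypothetical, and it assumes that no McIDAS session is still using the directory.

```shell
#!/bin/sh
# Hypothetical helper: remove numerically named subdirectories
# under a McIDAS temporary directory such as ~/.mctmp.
# Assumption: no McIDAS session is active while this runs.
clean_mctmp() {
    dir="$1"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    for sub in "$dir"/*; do
        name=$(basename "$sub")
        case "$name" in
            ''|*[!0-9]*) continue ;;   # skip names that are not purely digits
        esac
        if [ -d "$sub" ]; then
            rm -rf "$sub"
        fi
    done
    return 0
}

# Usage (uncomment to use):
# clean_mctmp "$HOME/.mctmp"
```

Passing the directory explicitly means a wildcard is never expanded in the wrong working directory.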
The next step was to delete the shared memory segments owned by 'ldm'
(IDs 1 through 99):
set COUNT=1
while ( $COUNT <= 99 )
  echo COUNT = $COUNT
  ipcrm -m $COUNT
  @ COUNT = $COUNT + 1
end
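The loop above hardcodes the ID range 1 through 99. As an alternative, the IDs can be taken from the ipcs output itself so that only segments owned by a given user are touched. This is a minimal sketch in POSIX sh (not csh), assuming the Solaris-style ipcs columns shown earlier. The function name remove_shm_for and the DRY_RUN variable are made up for illustration. On systems whose default awk lacks -v, nawk would be needed instead.

```shell
#!/bin/sh
# Hypothetical helper: remove every shared memory segment owned by
# one user. Reads `ipcs -m` style output on stdin and assumes the
# column layout: T ID KEY MODE OWNER GROUP.
# Set DRY_RUN=1 to print the ipcrm commands instead of running them.
remove_shm_for() {
    owner="$1"
    awk -v owner="$owner" '$1 == "m" && $5 == owner { print $2 }' |
    while read -r id; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ipcrm -m $id"
        else
            ipcrm -m "$id"
        fi
    done
}

# Usage (uncomment to use); review the dry run before removing anything:
# ipcs -m | DRY_RUN=1 remove_shm_for ldm
# ipcs -m | remove_shm_for ldm
```

Filtering on the owner column leaves the root-owned segment with ID 0 alone, which the hardcoded loop achieves only by starting its count at 1.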
After doing this, I decided to log in as 'mcidas' and see if I could
create a McIDAS session (since this would exercise the shared memory
system on aeolus). Here is what happened:
<login as 'mcidas'>
cd workdata
mcenv
ld.so.1: mcenv: fatal: relocation error: file mcenv: symbol __s_rsFe_pv:
referenced symbol not found
Killed
This indicates some sort of a shared memory system problem on aeolus.
The next step I would normally suggest is a reboot, but I see that
aeolus has only been up for just over 7 hours.
Question: did your problems start after a reboot earlier today?
Tom