This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: "Jennie L. Moody" <address@hidden> >Organization: UVa >Keywords: 200205161823.g4GINca18632 LDM Hi Jennie, >Well, it was inevitable that eventually I would have to start >paying attention to things and trying to fix problems. Yup. Things were working pretty well awhile ago. I worked with Tony to cut down on the amount of grib data that gets decoded with McIDAS-XCD. This was crutial at the time since windfall kept running out of disk space (in /p4 ?). >Our webpage stopped updating yesterday, and it looks like we lost >our connection to our upstream site. Today I got on to see >if I could just restart the ldm. After realizing that I >had to make a new password just to get in (I thought >I new the old one?), I found that there is plenty >I have forgotten. I was thinking I could just >stop the ldm, ldmadmin stop >and then restart it >ldmadmin start. That is the correct sequence with the exception that you have to wait until all LDM processes exit before restarting. >But I get the message that there is still a server running: > >windfall: /usr/local/ldm/etc $ ps -lf -u ldm > F S UID PID PPID C PRI NI ADDR SZ WCHAN > >STIME TTY TIME CMD > 8 R ldm 469 467 34 85 22 60d1c1f0 37463 > >14:10:02 ? 942:46 pqact -d /usr/local/ldm -q /usr/loc > 8 S ldm 467 1 0 47 22 608c4010 275 608c4080 > >14:10:01 ? 0:00 rpc.ldmd -q /usr/local/ldm/data/ldm > 8 O ldm 470 467 33 75 22 60d1c8b0 37452 > >14:10:02 ? 928:12 pqbinstats -d /p4/logs -q /usr/loca > 8 R ldm 471 467 34 85 22 60d1b470 37472 > >14:10:02 ? 962:03 rpc.ldmd -q /usr/local/ldm/data/ldm > 8 S ldm 20640 14588 0 51 20 60d1adb0 204 60d1ae20 It is strange to see pqact in the list of still active processes. Did you check to see if there was still available disk space? More down below. >13:49:00 pts/1 0:00 -ksh >windfall: /usr/local/ldm/etc $ whoami >ldm >windfall: /usr/local/ldm/etc $ ldmadmin stop >stopping the LDM server... >LDM server stopped >windfall: /usr/local/ldm/etc $ ps -lf -u ldm > F S UID PID PPID C PRI NI ADDR SZ WCHAN > >STIME TTY TIME CMD > 8 R ldm 469 467 34 85 22 60d1c1f0 37463 > >14:10:02 ? 942:59 pqact -d /usr/local/ldm -q /usr/loc > 8 S ldm 467 1 0 47 22 608c4010 275 608c4080 > >14:10:01 ? 0:00 rpc.ldmd -q /usr/local/ldm/data/ldm > 8 R ldm 470 467 32 95 22 60d1c8b0 37452 > >14:10:02 ? 928:24 pqbinstats -d /p4/logs -q /usr/loca > 8 O ldm 471 467 34 75 22 60d1b470 37472 > >14:10:02 ? 962:16 rpc.ldmd -q /usr/local/ldm/data/ldm > 8 S ldm 20640 14588 0 51 20 60d1adb0 204 60d1ae20 > >13:49:00 pts/1 0:00 -ksh > >So this didn't seem to do anything, using the dumb approach of >thinking that some of these processes wouldn't stop if the >delqueue wasn't run, I tried that (don't ask me why I thought >this would work...the mental equivalent of pushing buttons) At this point, I would forcably kill all processes that refuse to die; verify that they are no longer running; delete and remake the queue; and then restart. >windfall: /usr/local/ldm $ ldmadmin stop >stopping the LDM server... >LDM server stopped >windfall: /usr/local/ldm $ ldmadmin delqueue >May 16 18:13:20 UTC windfall.evsc.Virginia.EDU : delete_pq: A > >server is running, cannot delete the queue Right. The processes that access the queue will have a lock on it, so you shouldn't be able to delete it. >So, I don't know whats up.....sadly, I need a refresher course, >but in the meantime, maybe someone out there can tell me what to >do, or jump in here....I will happily share the new access info > >for user ldm... OK, I just logged on. 
What I did was:

windfall: /usr/local/ldm $ ps -u ldm
   PID TTY      TIME CMD
   469 ?      966:43 pqact
   467 ?        0:00 rpc.ldmd
   470 ?      951:45 pqbinsta
   471 ?      986:39 rpc.ldmd
 25380 pts/8    0:00 ksh
windfall: /usr/local/ldm $ kill -9 469 467 470 471
windfall: /usr/local/ldm $ ldmadmin delqueue
windfall: /usr/local/ldm $ ldmadmin mkqueue
windfall: /usr/local/ldm $ ldmadmin start
windfall: /usr/local/ldm $ ps -u ldm
   PID TTY      TIME CMD
 25487 ?        0:00 ingetext
 25452 ?        0:00 ingetext
 25467 ?        0:00 startxcd
 25485 ?        0:00 ingebin.
 25468 ?        0:00 dmsfc.k
 25448 ?        0:00 pqbinsta
 25446 ?        0:00 startxcd
 25470 ?        0:00 dmgrid.k
 25449 ?        0:00 rpc.ldmd
 25453 ?        0:00 ingebin.
 25447 ?        0:00 pqact
 25469 ?        0:00 dmraob.k
 25445 ?        0:00 rpc.ldmd
 25380 pts/8    0:00 ksh
windfall: /usr/local/ldm $ ldmadmin watch
(Type ^D or ^C when finished)
May 16 18:47:26 pqutil: 25724 20020516174856.095     HDS 145  YHWB90 KWBG 161700 /mRUC2
May 16 18:47:26 pqutil: 19326 20020516174856.200     HDS 150  YTWB90 KWBG 161700 /mRUC2
May 16 18:47:26 pqutil: 19326 20020516174856.298     HDS 155  YVWB85 KWBG 161700 /mRUC2
May 16 18:47:26 pqutil: 19326 20020516174856.415     HDS 160  YUWB90 KWBG 161700 /mRUC2
May 16 18:47:26 pqutil: 19326 20020516174856.558     HDS 165  YVWB90 KWBG 161700 /mRUC2
May 16 18:47:27 pqutil:  4821 20020516174856.583     HDS 167  SDUS84 KLZK 161744 /pDPALZK
...

So, windfall is again feeding from ldm.meteo.psu.edu.  The 'ldmadmin watch'
shows that products are being received as expected, so LDM-related things
(including McIDAS-XCD decoders) are running.  How this relates to your web
page generation of products we can't say, but presumably they will come
back as data gets decoded and cron-initiated scripts run.

Since things were kinda messed up, I took the opportunity to do some
further cleaning up:

ldmadmin stop
<verify that all LDM processes exit>
cd ~ldm/.mctmp
/bin/rm -rf *

This cleans up subdirectories created by LDM-initiated McIDAS processes.
There were a few left in .mctmp that were fairly old (listing done before
the 'ldmadmin stop' above):

windfall: /usr/local/ldm/.mctmp $ ls -alt
total 46
drwx------  23 ldm      mcidas       512 May 16 14:54 ./
drwx------   2 ldm      mcidas       512 May 16 14:46 2902/
drwx------   2 ldm      mcidas       512 May 16 14:46 3300/
drwx------   2 ldm      mcidas       512 May 16 14:46 3601/
drwxr-xr-x  11 ldm      mcidas      1024 May 16 14:46 ../
drwx------   2 ldm      mcidas       512 May 15 13:40 66406/
drwx------   2 ldm      mcidas       512 May 14 07:20 702/
drwx------   2 ldm      mcidas       512 May 14 07:20 801/
drwx------   2 ldm      mcidas       512 May 12 22:10 401/
drwx------   2 ldm      mcidas       512 May 12 22:10 302/
drwx------   2 ldm      mcidas       512 May 12 21:50 202/
drwx------   2 ldm      mcidas       512 Apr 25 06:50 301/
drwx------   2 ldm      mcidas       512 Apr 23 14:50 571600/
drwx------   2 ldm      mcidas       512 Apr 23 14:50 5801/
drwx------   2 ldm      mcidas       512 Apr 23 14:50 778602/
drwx------   2 ldm      mcidas       512 Apr  4 14:26 570300/
drwx------   2 ldm      mcidas       512 Apr  4 14:26 777502/
drwx------   2 ldm      mcidas       512 Mar 19 13:20 4501/
drwx------   2 ldm      mcidas       512 Feb 18 12:37 1090511/
drwx------   2 ldm      mcidas       512 Feb 16 07:37 100716/
drwx------   2 ldm      mcidas       512 Apr 19  2001 683109/
drwx------   2 ldm      mcidas       512 Feb 19  2001 43502/
drwx------   2 ldm      mcidas       512 Feb 19  2001 84901/

The old ones in the list (ones previous to May 16) are the result of
aborted processes.
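[As an aside, not part of the original exchange: stale .mctmp subdirectories
like the ones above could also be removed selectively rather than with a
blanket 'rm -rf *'.  The sketch below assumes a one-day age cutoff, which is
an arbitrary choice, and should only be run while the LDM is stopped.

    cd ~ldm/.mctmp
    # list subdirectories not modified in more than a day
    find . -type d ! -name . -mtime +1 -print
    # after reviewing the list, remove them
    find . -type d ! -name . -mtime +1 -prune -exec rm -rf {} \;
]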
Cleaning them up is a _good thing_ :-)

After making sure that there were no shared memory segments still
allocated to 'ldm' (again, McIDAS use), I restarted the LDM:

ldmadmin start

Things appear to be running smoothly, and the load on windfall is low:

last pid: 27157;  load averages: 0.26, 0.35, 0.91                15:04:54
66 processes:  65 sleeping, 1 on cpu
CPU states: 76.2% idle, 18.0% user, 3.8% kernel, 2.0% iowait, 0.0% swap
Memory: 384M real, 5920K free, 58M swap in use, 1266M swap free

>thanks in advance, Tom or Anne or whomever....

Tom here...

>(by the way, this isn't any really time-sensitive
>issue, no operational or quasi-operational work
>going on here)

No problem.  This was a quick fix.

Talk to you later...

Tom

>From address@hidden Thu May 16 18:51:27 2002

Thanks so much Tom!  My instinct was to just kill the processes, so I
don't know why I didn't, just confusion I guess.

Jennie
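[For reference, the shared-memory check Tom mentions above (making sure no
segments were still allocated to 'ldm' before restarting) can be done with the
standard System V IPC tools.  The commands below are a sketch, not part of the
original session, and any segment id passed to 'ipcrm' must be verified by
hand first.

    ipcs -m            # list shared memory segments; look for OWNER 'ldm'
    ipcrm -m <shmid>   # remove a leftover segment by id, only once you are
                       # sure no McIDAS or LDM process is still using it
]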