This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
anne wrote:
>
> anne wrote:
> >
> > Hi Russ and Mike,
> >
> > Larry Riddle's LDM, on aeolus, an OSF1 alpha, is having problems. It
> > keeps shutting down with same error message reported in the log:
> >
> > ldmd.log.1:Feb 05 22:51:59 aeolus motherlode[4249]: run_requester:
> > Starting Up: motherlode.ucar.edu
> > ldmd.log.1:Feb 05 22:59:29 aeolus motherlode[4249]: run_requester:
> > 20020205215159.112 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.1:Feb 05 22:59:30 aeolus motherlode[4249]:
> > FEEDME(motherlode.ucar.edu): OK
> > ldmd.log.1:Feb 05 22:59:31 aeolus motherlode[4249]: RECLASS:
> > 20020205215931.091 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.1:Feb 05 22:59:31 aeolus motherlode[4249]: skipped:
> > 20020205215159.267 (451.825 seconds)
> > ldmd.log.1:Feb 05 22:59:32 aeolus motherlode[4249]: assertion "n > 0"
> > failed: file "pq.c", line 2172
> > -----
> > ldmd.log.2:Feb 05 22:15:54 aeolus motherlode[3932]: run_requester:
> > Starting Up: motherlode.ucar.edu
> > ldmd.log.2:Feb 05 22:23:23 aeolus motherlode[3932]: run_requester:
> > 20020205211554.650 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.2:Feb 05 22:23:23 aeolus motherlode[3932]:
> > FEEDME(motherlode.ucar.edu): OK
> > ldmd.log.2:Feb 05 22:23:24 aeolus motherlode[3932]: RECLASS:
> > 20020205212324.304 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.2:Feb 05 22:23:24 aeolus motherlode[3932]: skipped:
> > 20020205211554.685 (449.618 seconds)
> > ldmd.log.2:Feb 05 22:23:25 aeolus motherlode[3932]: assertion "n > 0"
> > failed: file "pq.c", line 2172
> > -----
> > ldmd.log.3:Feb 05 17:00:29 aeolus motherlode[1329]: run_requester:
> > Starting Up: motherlode.ucar.edu
> > ldmd.log.3:Feb 05 17:00:29 aeolus motherlode[1329]: run_requester:
> > 20020205160029.865 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.3:Feb 05 17:00:30 aeolus motherlode[1329]:
> > FEEDME(motherlode.ucar.edu): OK
> > ldmd.log.3:Feb 05 17:41:04 aeolus motherlode[1329]: RECLASS:
> > 20020205164104.746 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.3:Feb 05 17:41:04 aeolus motherlode[1329]: skipped:
> > 20020205160304.032 (2280.714 seconds)
> > ldmd.log.3:Feb 05 18:03:47 aeolus motherlode[1329]: RECLASS:
> > 20020205170346.979 TS_ENDT {{FSL2|UNIDATA, ".*"},{NNEXRAD,
> > ".*"},{DIFAX, ".*"}}
> > ldmd.log.3:Feb 05 18:03:47 aeolus motherlode[1329]: skipped:
> > 20020205164524.036 (1102.943 seconds)
> > ldmd.log.3:Feb 05 20:59:38 aeolus motherlode[1329]: assertion "n > 0"
> > failed: file "pq.c", line 2172
> >
> > The function that is failing is this:
> > /*
> >  * Hash function for signature.
> >  */
> > static size_t
> > sx_hash(size_t nchains, const signaturet sig)
> > {
> >         size_t h;
> >         int i;
> >         unsigned int n;
> >
> >         n = 0;
> >         for(i=0; i<4; i++)
> >                 n = 256*n + sig[i];
> >         assert(n > 0);
> >         h = n % nchains;
> >         return h;
> > }
> >
> > Perhaps the signatures are being corrupted?
> >
> > It's interesting that the latencies on these skipped products are
> > terrible. ldmpings from motherlode to aeolus aren't very good,
> > including some in the hundreds of milliseconds:
> >
> > motherlode.ucar.edu% ldmping -i2 aeolus.ucsd.edu
> > Feb 06 01:08:44 State       Elapsed   Port  Remote_Host
> > rpc_stat
> > ... (aeolus LDM started here)
> > Feb 06 01:09:40 RESPONDING  0.092502  388   aeolus.ucsd.edu
> > Feb 06 01:09:42 RESPONDING  0.065875  388   aeolus.ucsd.edu
> > Feb 06 01:09:44 RESPONDING  0.038995  388   aeolus.ucsd.edu
> > Feb 06 01:09:46 RESPONDING  0.039381  388   aeolus.ucsd.edu
> > Feb 06 01:09:48 RESPONDING  0.038904  388   aeolus.ucsd.edu
> > Feb 06 01:09:51 RESPONDING  0.039140  388   aeolus.ucsd.edu
> > Feb 06 01:09:53 RESPONDING  0.047059  388   aeolus.ucsd.edu
> > Feb 06 01:09:55 RESPONDING  0.039036  388   aeolus.ucsd.edu
> > Feb 06 01:09:57 RESPONDING  0.039950  388   aeolus.ucsd.edu
> > Feb 06 01:09:59 RESPONDING  0.040719  388   aeolus.ucsd.edu
> > Feb 06 01:10:01 RESPONDING  0.104465  388   aeolus.ucsd.edu
> > Feb 06 01:10:03 RESPONDING  0.050099  388   aeolus.ucsd.edu
> > Feb 06 01:10:05 RESPONDING  0.118380  388   aeolus.ucsd.edu
> > Feb 06 01:10:07 RESPONDING  0.039413  388   aeolus.ucsd.edu
> > Feb 06 01:10:09 RESPONDING  0.050446  388   aeolus.ucsd.edu
> > Feb 06 01:10:11 RESPONDING  0.044901  388   aeolus.ucsd.edu
> > Feb 06 01:10:13 RESPONDING  0.041743  388   aeolus.ucsd.edu
> > Feb 06 01:10:15 RESPONDING  0.039329  388   aeolus.ucsd.edu
> > Feb 06 01:10:17 RESPONDING  0.044745  388   aeolus.ucsd.edu
> > Feb 06 01:10:19 RESPONDING  0.040108  388   aeolus.ucsd.edu
> > Feb 06 01:10:21 RESPONDING  0.050392  388   aeolus.ucsd.edu
> > Feb 06 01:10:23 RESPONDING  0.040905  388   aeolus.ucsd.edu
> > Feb 06 01:10:25 RESPONDING  0.039391  388   aeolus.ucsd.edu
> > Feb 06 01:10:27 RESPONDING  0.058450  388   aeolus.ucsd.edu
> >
> > The queue seems ok:
> >
> > aeolus.ucsd.edu> pqmon -i2
> > Feb 06 01:19:37 pqmon: Starting Up (5892)
> > Feb 06 01:19:37 pqmon: nprods nfree nempty nbytes    maxprods
> > maxfree minempty maxext age
> > Feb 06 01:19:37 pqmon: 108327 1     74777  749998248 158714
> > 12 24390 3928 20276
> > Feb 06 01:19:39 pqmon: 108321 1     74783  749993144 158714
> > 12 24390 9032 20271
> > Feb 06 01:19:41 pqmon: 108318 1     74786  750001632 158714
> > 12 24390 544 20267
> > Feb 06 01:19:43 pqmon: 108321 1     74783  749998984 158714
> > 12 24390 3192 20267
> > Feb 06 01:19:45 pqmon: 108328 1     74776  749999056 158714
> > 12 24390 3120 20265
> > Feb 06 01:19:47 pqmon: 108334 1     74770  749997760 158714
> > 12 24390 4416 20266
> > Feb 06 01:19:49 pqmon: 108360 1     74744  750001048 158714
> > 12 24390 1128 20265
> > Feb 06 01:19:51 pqmon: 108372 1     74732  749995496 158714
> > 12 24390 6680 20265
> > Feb 06 01:19:53 pqmon: 108383 1     74721  749997800 158714
> > 12 24390 4376 20262
> > Feb 06 01:19:55 pqmon: 108415 1     74689  749996816 158714
> > 12 24390 5360 20263
> > Feb 06 01:19:55 pqmon: Interrupt
> > Feb 06 01:19:55 pqmon: Exiting
> >
> > I do see some messages in the system log that make me suspicious - these
> > are for Mike:
> >
> > Feb 4 11:40:03 aeolus vmunix: RFS3_WRITE, client address =
> > 132.239.94.91, errno 22
> > Feb 5 07:56:31 aeolus vmunix: panic (cpu 0): vm_page_activate: already
> > active
> > Feb 5 07:56:31 aeolus vmunix: syncing disks... 237 122 30 done
> > Feb 5 07:56:31 aeolus vmunix: DUMP.prom: dev SCSI 0 6 0 0 300 0
> > FLAMG-IO, block 722079
> > Feb 5 07:56:31 aeolus vmunix: DUMP.prom: dev SCSI 0 6 0 0 300 0
> > FLAMG-IO, block 722079
> > Feb 5 07:56:31 aeolus vmunix: Alpha boot: available memory from
> > 0xbc4000 to 0xe000000
> > Feb 5 07:56:31 aeolus vmunix: Compaq Tru64 UNIX V5.0A (Rev. 1094); Thu
> > Nov 29 07:51:09 PST 2001
> > ...
> > Feb 5 07:57:58 aeolus vmunix: fta0: Link Unavailable.
> > Feb 5 07:58:51 aeolus vmunix: Mouse/Tablet has failed to reset.
> > Feb 5 07:59:19 aeolus last message repeated 2 times
> > Feb 5 08:59:16 aeolus vmunix: Memory error corrected by system
> > Feb 5 08:59:16 aeolus vmunix: biu_stat = 0000000000000240
> > Feb 5 08:59:16 aeolus vmunix: biu_addr = 00000001d4000018
> > Feb 5 08:59:16 aeolus vmunix: dc_stat = 0000000000000007
> > Feb 5 08:59:16 aeolus vmunix: fill_syndrome = 0000000000000000
> > Feb 5 08:59:16 aeolus vmunix: fill_addr = 0000000000065350
> > Feb 5 08:59:16 aeolus vmunix: bc_tag = 003c090000005428
> > Feb 5 08:59:16 aeolus vmunix: ident = 0
> >
> > Do you have any ideas about this?
> >
> > My next step will be to rebuild the queue. I'll save the old queue just
> > in case it might be useful.
> >
> > Anne
>
> --
> ***************************************************
> Anne Wilson                     UCAR Unidata Program
> address@hidden                  P.O. Box 3000
>                                 Boulder, CO 80307
> ----------------------------------------------------
> Unidata WWW server       http://www.unidata.ucar.edu/
> ****************************************************
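
A note on the assertion quoted in the message: sx_hash() packs the first four bytes of the signature into n, so assert(n > 0) can only fail when all four of those bytes are zero. The sketch below reproduces that arithmetic outside the LDM to make the condition explicit. It is illustrative only: the 16-byte signaturet typedef is an assumption (the message does not show it), and sig_prefix() and would_trip_assert() are hypothetical helper names, not LDM functions.

```c
#include <stddef.h>

/* Assumption: a 16-byte signature (for example an MD5 digest).
 * The real typedef is not shown in the message. */
typedef unsigned char signaturet[16];

/* Same arithmetic as the loop in the quoted sx_hash(): pack the first
 * four signature bytes, most significant byte first. */
static unsigned int
sig_prefix(const signaturet sig)
{
        unsigned int n = 0;
        int i;

        for (i = 0; i < 4; i++)
                n = 256*n + sig[i];
        return n;
}

/* Hypothetical helper: returns 1 if assert(n > 0) in the quoted
 * sx_hash() would fail for this signature, 0 otherwise. */
static int
would_trip_assert(const signaturet sig)
{
        return sig_prefix(sig) == 0;
}
```

For an all-zero signature would_trip_assert() returns 1; for a signature whose first four bytes are 0, 0, 0, 0x2a, sig_prefix() returns 42 and would_trip_assert() returns 0. A uniformly distributed digest starts with four zero bytes only about once in 2^32 products, so a failure that recurs on every restart is more consistent with a zeroed or corrupted signature reaching the queue than with a legitimate product.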