Re: 19991108: I need some assistance
- Subject: Re: 19991108: I need some assistance
- Date: Tue, 30 Nov 1999 13:00:48 -0700 (MST)
Karli,
The log messages still say "Que corrupt:". I would go to the data
directory and delete the ldm.pq file; somehow the file might not have
been deleted. Also, the data directory needs to be on a local disk drive.
If you are still having a problem, can I get a login to the machine?
Robb...
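
For reference, the queue cleanup suggested above can be sketched as a small
shell function. This is only a sketch under assumptions: the function name
remove_queue is made up for illustration, the queue is assumed to be
data/ldm.pq under the LDM home directory, and the LDM is assumed to be
stopped before it runs. The df output is there so that a data directory on a
non-local (for example NFS) filesystem is easy to spot.

```sh
#!/bin/sh
# Sketch: remove a possibly corrupt LDM product queue file and confirm
# that it is really gone. remove_queue is a hypothetical helper name.
# Assumption: the LDM has already been stopped.

remove_queue() {
    queue_file=$1                          # e.g. data/ldm.pq (assumed location)
    queue_dir=$(dirname "$queue_file")

    # Print the filesystem that holds the data directory so it can be
    # checked by eye that it is a local disk.
    df -k "$queue_dir"

    rm -f "$queue_file"

    # The concern in this thread is that an earlier delete did not take
    # effect, so check explicitly instead of trusting rm.
    if [ -e "$queue_file" ]; then
        echo "queue file still present: $queue_file" >&2
        return 1
    fi
    echo "queue file removed: $queue_file"
}
```

Run from the LDM home directory as `remove_queue data/ldm.pq`; a non-zero
return status means the file is still there. The queue will most likely have
to be recreated by whatever procedure the installation normally uses before
the LDM is restarted.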
On Sun, 28 Nov 1999, McIDAS wrote:
> The machine's clock shows the correct time. This is an Octane Machine
> with 128MB Ram running IRIX 6.4 and has two partitions with 1.3GB and
> 5.0GB free respectively. It is running McIDAS 7.5, ldm-5.0.5 and
> ldm-mcidas-7.1.1 (or 7.1.3 if it had been configured correctly).
>
> After commenting out the line 'exec "pqact"' I got this output:
> -----------------------------------------------------------------------
> ldm@breeze 45% alias sverb "bin/rpc.ldmd -vl - etc/ldmd.conf"
> ldm@breeze 46% sverb
> Nov 28 19:01:34 rpc.ldmd[5644]: Starting Up (built: Aug 22 1997 12:07:40)
> Nov 28 19:01:34 aqua[5646]: run_requester: Starting Up: aqua.atmos.uah.edu
> Nov 28 19:01:34 striker[5593]: run_requester: Starting Up: striker.atmos.albany.edu
> Nov 28 19:01:35 udp.ldmd[5647]: Starting Up
> Nov 28 19:01:59 aqua[5646]: lastmatch: c9896c74abb279dea769a3a091a1b891 44766 19991128180616.338 MCIDAS 000 LWTOA3 205 DIALPROD=U3 99332 180612
> Nov 28 19:01:59 aqua[5646]: run_requester: 19991128180616.338 TS_ENDT {{FSL2|MCIDAS, ".*"}}
> Nov 28 19:01:59 striker[5593]: lastmatch: b9144b67fb6c5fd60c2fb5938b418cef 84 19991128185500.141 NLDN 000 99332184853
> Nov 28 19:01:59 striker[5593]: run_requester: 19991128185500.141 TS_ENDT {{NLDN, ".*"}}
> Nov 28 19:01:59 striker[5593]: FEEDME(striker.atmos.albany.edu): OK
> Nov 28 19:01:59 aqua[5646]: FEEDME(aqua.atmos.uah.edu): reclass: 19991128180616.338 TS_ENDT {{MCIDAS, ".*"}}
> Nov 28 19:01:59 striker[5593]: hereis: dup: 84 19991128185500.141 NLDN000 99332184853
> Nov 28 19:01:59 aqua[5646]: FEEDME(aqua.atmos.uah.edu): OK
> Nov 28 19:02:00 striker[5593]: Que corrupt: ftbl
> Nov 28 19:02:00 striker[5593]: 84 19991128190100.696 NLDN 000 99332185459
> Nov 28 19:02:00 aqua[5646]: dup : 44766 19991128180616.338 MCIDAS 000 LWTOA3 205 DIALPROD=U3 99332 180612
> Nov 28 19:02:03 aqua[5646]: 189889 19991128181050.145 MCIDAS 000 LWTOA3 193DIALPROD=U1 99332 181048
> Nov 28 19:02:03 aqua[5646]: assertion "rp->prev == OFF_NONE" failed: file "pq.c", line 678
> Nov 28 19:02:05 rpc.ldmd[5644]: child 5648 terminated by signal 6
> Nov 28 19:02:05 rpc.ldmd[5644]: Killing (SIGINT) process group
> Nov 28 19:02:05 rpc.ldmd[5644]: Interrupt
> Nov 28 19:02:05 rpc.ldmd[5644]: Exiting
> Nov 28 19:02:05 striker[5593]: Interrupt
> Nov 28 19:02:05 striker[5593]: Exiting
> Nov 28 19:02:05 udp.ldmd[5647]: Interrupt
> Nov 28 19:02:05 udp.ldmd[5647]: Exiting
> Nov 28 19:02:06 rpc.ldmd[5644]: Terminating process group
> Nov 28 19:02:29 rpc.ldmd[5644]: child 5646 terminated by signal 6
> Nov 28 19:02:29 rpc.ldmd[5644]: Killing (SIGINT) process group
> -----------------------------------------------------------------------
> after eliminating all requests I still had the same problem:
> -----------------------------------------------------------------------
>
> ldm@breeze 56% !s
> sverb
> Nov 28 19:31:04 rpc.ldmd[5454]: Starting Up (built: Aug 22 1997 12:07:40)
> Nov 28 19:31:05 udp.ldmd[5767]: Starting Up
> Nov 28 19:31:45 rpc.ldmd[5454]: child 5717 terminated by signal 6
> Nov 28 19:31:45 rpc.ldmd[5454]: Killing (SIGINT) process group
> Nov 28 19:31:45 rpc.ldmd[5454]: Interrupt
> Nov 28 19:31:45 rpc.ldmd[5454]: Exiting
> Nov 28 19:31:45 udp.ldmd[5767]: Interrupt
> Nov 28 19:31:45 udp.ldmd[5767]: Exiting
> Nov 28 19:31:45 rpc.ldmd[5454]: Terminating process group
> Nov 28 19:31:45 rpc.ldmd[5454]: child 5791 terminated by signal 15
> -----------------------------------------------------------------------
> and this is what top was showing me while I ran LDM without any
> requests:
> -----------------------------------------------------------------------
> IRIX64 breeze 6.4 02121744 IP30 Load[0.00,0.07,0.08] 20:08:28 50 procs
> user pid pgrp %cpu proc pri size rss time command
> karli 6011 6011 0.37 0 20 115 75 0:00 top
> root 1038 1038 0.06 * 20 140 60 10:30 mediad
> root 1117 1112 0.03 * 20 1078 34 8:25 clogin
> root 1102 1102 0.03 * 20 879 96 6:49 Xsgi
> root 5708 171 0.01 * 20 111 53 0:00 telnetd
> root 261 261 0.01 * +0 121 121 2:36 xntpd
> root 1039 171 0.01 * 20 120 45 0:50 fam
>
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.13,0.09,0.09] 20:08:33 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 11.54 * 20 6353 780 0:00 pqexpire
> ldm 6014 5950 3.67 * 20 287 61 0:00 dmmisc.k
> ldm 6021 5950 3.48 * 20 283 59 0:00 dmsyn.k
> ldm 6022 5950 3.36 * 20 271 58 0:00 dmraob.k
> karli 6011 6011 3.20 0 20 116 76 0:00 top
> ldm 6030 5950 2.11 * 20 279 59 0:00 dmsfc.k
> root 5708 171 0.12 * 20 111 53 0:00 telnetd
> root 1102 1102 0.06 * 20 879 96 6:49 Xsgi
> root 1117 1112 0.06 * 20 1078 34 8:25 clogin
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.13,0.09,0.09] 20:08:35 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 7.84 * 20 6353 1212 0:00 pqexpire
> karli 6011 6011 1.11 0 20 116 76 0:00 top
> root 1117 1112 0.07 * 20 1078 34 8:25 clogin
> root 5708 171 0.07 * 20 111 53 0:00 telnetd
> root 1102 1102 0.05 * 20 879 96 6:49 Xsgi
> root 261 261 0.02 * +0 121 121 2:36 xntpd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.18,0.11,0.10] 20:08:38 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 10.93 * 20 6353 2192 0:00 pqexpire
> karli 6011 6011 1.20 0 20 116 76 0:00 top
> root 1117 1112 0.07 * 20 1078 34 8:25 clogin
> root 1102 1102 0.06 * 20 879 96 6:49 Xsgi
> root 5708 171 0.03 * 20 111 53 0:00 telnetd
> ldm 6005 5950 0.02 * 20 6364 30 0:00 rpc.ldmd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.18,0.11,0.10] 20:08:39 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 4.14 * 20 6353 2338 0:00 pqexpire
> karli 6011 6011 0.97 0 20 116 76 0:00 top
> root 1038 1038 0.15 * 20 140 60 10:30 mediad
> root 5708 171 0.06 * 20 111 53 0:00 telnetd
> root 1117 1112 0.06 * 20 1078 34 8:25 clogin
> root 1102 1102 0.05 * 20 879 96 6:49 Xsgi
> root 1039 171 0.03 * 20 120 45 0:50 fam
> root 261 261 0.02 * +0 121 121 2:36 xntpd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.24,0.12,0.10] 20:08:43 60 procs
> user pid pgrp %cpu proc pri size rss time command
> karli 6011 6011 1.76 0 20 116 76 0:00 top
> root 1117 1112 0.05 * 20 1078 34 8:25 clogin
> root 5708 171 0.05 * 20 111 53 0:00 telnetd
> root 1102 1102 0.04 * 20 879 96 6:49 Xsgi
> ldm 5950 5950 0.02 * 20 6363 31 0:00 rpc.ldmd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.29,0.13,0.10] 20:08:46 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 6.30 * 20 6353 4050 0:01 pqexpire
> karli 6011 6011 0.95 0 20 116 76 0:00 top
> root 816 816 0.74 * 20 120 58 2:10 sendmail
> root 1117 1112 0.06 * 20 1078 34 8:25 clogin
> root 5708 171 0.05 * 20 111 53 0:00 telnetd
> root 1102 1102 0.05 * 20 879 96 6:49 Xsgi
> root 261 261 0.02 * +0 121 121 2:36 xntpd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.33,0.14,0.11] 20:08:50 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 10.59 * 20 6353 4457 0:01 pqexpire
> karli 6011 6011 0.75 0 20 116 70 0:00 top
> root 1117 1112 0.07 * 20 1078 23 8:25 clogin
> root 5708 171 0.06 * 20 111 51 0:00 telnetd
> root 1102 1102 0.06 * 20 879 88 6:49 Xsgi
> ldm 5950 5950 0.02 * 20 6363 23 0:00 rpc.ldmd
> root 261 261 0.01 * +0 121 121 2:36 xntpd
> ldm 6005 5950 0.01 * 20 6364 23 0:00 rpc.ldmd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.37,0.15,0.11] 20:08:54 60 procs
> user pid pgrp %cpu proc pri size rss time command
> ldm 5997 5950 10.17 * 20 6353 4860 0:01 pqexpire
> karli 6011 6011 0.91 0 20 116 70 0:00 top
> root 1038 1038 0.32 * 20 140 54 10:30 mediad
> root 1102 1102 0.13 * 20 879 85 6:49 Xsgi
> root 261 261 0.11 * +0 121 121 2:36 xntpd
> root 1117 1112 0.11 * 20 1078 21 8:25 clogin
> root 5708 171 0.07 * 20 111 50 0:00 telnetd
>
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.37,0.15,0.11] 20:08:56 53 procs
> user pid pgrp %cpu proc pri size rss time command
> karli 6011 6011 0.88 0 20 116 70 0:00 top
> ldm 5950 5950 0.24 * 20 6363 38 0:00 rpc.ldmd
> root 1117 1112 0.14 * 20 1078 32 8:25 clogin
> root 1102 1102 0.13 * 20 879 91 6:49 Xsgi
> ldm 6024 5950 0.09 * 20 222 40 0:00 startxcd.k
> root 5437 171 0.09 * 20 111 49 0:00 telnetd
> root 5708 171 0.05 * 20 111 50 0:00 telnetd
> ldm 6005 5950 0.04 * 20 6364 35 0:00 rpc.ldmd
> root 77 0 0.02 * 20 96 43 0:02 syslogd
> root 165 0 0.02 * 20 92 42 0:07 portmap
> root 261 261 0.02 * +0 121 121 2:36 xntpd
>
> IRIX64 breeze 6.4 02121744 IP30 Load[0.87,0.26,0.15] 20:08:57 50 procs
> user pid pgrp %cpu proc pri size rss time command
> karli 6011 6011 0.96 0 20 116 70 0:00 top
> root 1 0 0.59 * 20 26 18 0:27 init
> root 1038 1038 0.15 * 20 140 60 10:30 mediad
> root 77 0 0.13 * 20 96 50 0:02 syslogd
> root 165 0 0.12 * 20 92 46 0:07 portmap
> root 1039 171 0.11 * 20 120 45 0:50 fam
> root 5708 171 0.06 * 20 111 50 0:00 telnetd
> root 5437 171 0.06 * 20 111 49 0:00 telnetd
> ldm 5463 5463 0.05 * 20 36 16 0:00 csh
> root 1117 1112 0.05 * 20 1078 32 8:25 clogin
> root 1102 1102 0.04 * 20 879 91 6:49 Xsgi
> root 261 261 0.01 * +0 121 121 2:36 xntpd
> -----------------------------------------------------------------------
> Karli Lopez
>
>
> Robb Kambic wrote:
> >
> > On Mon, 22 Nov 1999, McIDAS wrote:
> >
> > > Rob,
> > > thanks for the tip. Executing the command yielded some pretty
> > > interesting output:
> > >
> > > ---------------------------------------------------------------------
> > > ldm@breeze 1% bin/rpc.ldmd -vl - etc/ldmd.conf
> > > Nov 22 19:59:03 rpc.ldmd[21390]: Starting Up (built: Aug 22 1997 12:07:40)
> > > Nov 22 19:59:03 aqua[21329]: run_requester: Starting Up: aqua.atmos.uah.edu
> > > Nov 22 19:59:03 striker[21395]: run_requester: Starting Up: striker.atmos.albany.edu
> > > Nov 22 19:59:04 udp.ldmd[21382]: Starting Up
> > > Nov 22 19:59:30 aqua[21329]: pq_sequence: xdr_prod_info() failed
> > > Nov 22 19:59:30 striker[21395]: pq_sequence: xdr_prod_info() failed
> > > Nov 22 19:59:30 aqua[21329]: pq_last: seq:I/O error (errno = 5)
> > > Nov 22 19:59:30 aqua[21329]: run_requester: 19991122185903.945 TS_ENDT {{UNIDATA, ".*"},{FSL2|MCIDAS, ".*"}}
> > > Nov 22 19:59:30 striker[21395]: pq_last: seq:I/O error (errno = 5)
> > > Nov 22 19:59:30 striker[21395]: run_requester: 19991122185903.951 TS_ENDT {{NLDN, ".*"}}
> >
> > Karli,
> >
> > The first thing to check is that your machine time is correct. Also,
> > comment out the "exec pqact ...." line in your ldmd.conf file. I would
> > also comment out the other request lines in the ldmd.conf until it runs
> > correctly. What type of machine is this? What's the output of top?
> >
> > Robb...
> >
> > > Nov 22 19:59:36 rpc.ldmd[21390]: child 21416 terminated by signal 6
> > > Nov 22 19:59:36 rpc.ldmd[21390]: Killing (SIGINT) process group
> > > Nov 22 19:59:36 rpc.ldmd[21390]: Interrupt
> > > Nov 22 19:59:36 rpc.ldmd[21390]: Exiting
> > > Nov 22 19:59:36 striker[21395]: Interrupt
> > > Nov 22 19:59:36 striker[21395]: Exiting
> > > Nov 22 19:59:36 aqua[21329]: Interrupt
> > > Nov 22 19:59:36 aqua[21329]: Exiting
> > > Nov 22 19:59:36 udp.ldmd[21382]: Interrupt
> > > Nov 22 19:59:36 udp.ldmd[21382]: Exiting
> > > Nov 22 19:59:36 rpc.ldmd[21390]: Terminating process group
> > > ldm@breeze 2%
> > >
> > > ---------------------------------------------------------------------
> > > I got this output in less than a minute. My guess is that the data
> > > stream is failing (but this wouldn't cause it to die) or something is
> > > externally killing it.
> > > Karli
> > >
> > > Robb Kambic wrote:
> > > >
> > > > Karli,
> > > >
> > > > Run the LDM from its home directory on the command line with the
> > > > messages sent to the screen, i.e.
> > > >
> > > > % bin/rpc.ldmd -vl - etc/ldmd.conf
> > > >
> > > > This should give us a clue of the problem.
> > > >
> > > > Robb...
>
> > ===============================================================================
> > Robb Kambic Unidata Program Center
> > Software Engineer III Univ. Corp for Atmospheric Research
> > address@hidden WWW: http://www.unidata.ucar.edu/
> > ===============================================================================
>
> --
>
> ====================================================================
> Amos Winter address@hidden
> Director
> Puerto Rico Climatology Center
> P.O. Box 9013
> Department of Marine Sciences phone: (787) 265-5416
> University of Puerto Rico - Mayaguez fax: (787) 265-2195
> Mayaguez, PR 00681-9013
>
===============================================================================
Robb Kambic Unidata Program Center
Software Engineer III Univ. Corp for Atmospheric Research
address@hidden WWW: http://www.unidata.ucar.edu/
===============================================================================