This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Below is the set of messages relevant to the diagnosis of a problem with LDM 5.1.2 pqcreate resulting in a SIGBUS error on SGI/IRIX 32-bit platforms for certain combinations of queue size and number of products.

--Russ

To: address@hidden, address@hidden
From: address@hidden (Pete Pokrandt)
Reply-to: address@hidden
Subject: Sunset.meteor.wisc.edu major problems
Date: Sat, 23 Sep 2000 11:10:50 -0500

>To: address@hidden
>From: address@hidden (Pete Pokrandt)
>Subject: Re: 20000923: Sunset.meteor.wisc.edu major problems
>Organization: Dept of Atmos & Oceanic Sciences, University of Wisconsin-Madison
>Keywords: sigbus, bus error, SGI/IRIX

Hi all,

Anyone feeding from sunset.meteor.wisc.edu, please fail over to your backup until further notice. I'm having major problems with the ldm and/or machine crashing regularly. I suspect either a bad disk or perhaps a memory problem, but I can't go in to deal with it right now, since there's a UW/Northwestern Football game happening 2 blocks away from our building.

I'll try to get in tonight to have a look and try to see what's going on. If it looks like an extended outage, I'll try to get everyone set up on profhorn.meteor.wisc.edu as a backup.

Unidata support: can you verify that profhorn.meteor.wisc.edu is allowed to feed from motherlode? And if not, can it be added until I figure out what's up with sunset?

Thanks. Sorry for the hassles..
Pete

--
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+
^ Pete Pokrandt                    V 1447 AOSS Bldg 1225 W Dayton St  ^
^ Systems Programmer               V Madison, WI 53706                ^
^                                  V address@hidden                   ^
^ Dept of Atmos & Oceanic Sciences V (608) 262-3086 (Phone/voicemail) ^
^ University of Wisconsin-Madison  V 262-0166 (Fax)                   ^
+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+

To: address@hidden
Subject: Re: Sunset.meteor.wisc.edu major problems
Date: Sat, 23 Sep 2000 10:55:58 -0600
From: Russ Rew <address@hidden>

Hi Pete,

> Unidata support: can you verify that profhorn.meteor.wisc.edu
> is allowed to feed from motherlode? And if not, can it be
> added until I figure out what's up with sunset? Thanks.

I've verified that you should be able to feed from motherlode, because its ldmd.conf contains the following line:

  allow UNIDATA|FSL2 ^(sunset|profhorn)\.meteor\.wisc\.edu$

--Russ

To: address@hidden, address@hidden
From: address@hidden (Pete Pokrandt)
Subject: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
Date: Sat, 23 Sep 2000 13:34:21 -0500

Hi all,

Thanks to Unidata support, working hard on a Saturday, anyone who normally feeds from sunset.meteor.wisc.edu can instead feed from profhorn.meteor.wisc.edu until sunset is fixed and happy again.

One word of caution, I'm going to be slowly piping through the data from the UIUC archive site that I've missed since 0800 UTC, so you may end up getting more data than you are expecting until the backlog flushes through.

Also, profhorn.meteor.wisc.edu is the machine that I use to also ingest the high bandwidth NMC2 feed, so I'm not sure if the 10 mbps line into profhorn will handle the load of everyone feeding from it in addition to the NMC2 feed. I'll keep an eye on it and let you all know if it seems to be a problem.

I'll be in this evening to try to figure out what's up on sunset. Very frustrating, at first, the ldm was crashing, but now I can't even get pqcreate to run.
It dumps a core as soon as the queue file has grown to its complete size.. I've tried it on different disk drives as well, so it's not a bad disk. Strange..

I'm going to try first swapping in some different RAM, and if that doesn't work, maybe a new mother board.. Nice to just happen to have a few spare parts lying around.. Unidata Support: does this sound to you like a memory problem? I have not seen any bad memory info in my system logs.

Pete

To: address@hidden
Cc: support-ldm
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
Date: Sat, 23 Sep 2000 15:44:12 -0600
From: Russ Rew <address@hidden>

Pete,

> I'll be in this evening to try to figure out what's up on sunset. Very
> frustrating, at first, the ldm was crashing, but now I can't even get
> pqcreate to run. It dumps a core as soon as the queue file has grown to
> its complete size.. I've tried it on different disk drives as well, so
> it's not a bad disk. Strange..

Please send us (address@hidden) the command line you use to invoke pqcreate and, if possible, a traceback from when it crashes. You can get the traceback by running pqcreate until it crashes and leaves a "core" file, then running "dbx" (or whatever debugger you use; I'm not sure what platform you are running this on) with the pqcreate executable and the core file as arguments, something like:

  % dbx /usr/local/ldm/bin/pqcreate core

At this point dbx may produce a bunch of output, but when it finally gives you a prompt, type "where" and then cut and paste the output to me, along with how you invoked pqcreate.
Also it's just worth checking that you are creating the product queue on a local disk rather than a remotely mounted disk. The latter won't work, but it should give an error message rather than just dumping core ...

> I'm going to try first swapping in some different RAM, and if that
> doesn't work, maybe a new mother board.. Nice to just happen to have a
> few spare parts lying around.. Unidata Support: does this sound to you
> like a memory problem? I have not seen any bad memory info in my system
> logs.

Good luck. It doesn't sound like a memory problem to me, but I haven't had any memory problems recently, so I'm not sure what the symptoms would be. The system should do a memory check when you reboot it, which should catch most memory errors.

--Russ

To: Russ Rew <address@hidden>
cc: address@hidden
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Sat, 23 Sep 2000 15:44:12 MDT." <address@hidden>
Date: Sat, 23 Sep 2000 17:53:07 -0500
From: Pete Pokrandt <address@hidden>

Russ,

This is the same exact setup that has been running mostly flawlessly for months. Every so often the ldm will die, usually seems to be related to an increase in the data volume. Usually deleting and re-making the queue will solve the problem and it'll run for weeks with no problems.

Yesterday the ldm crashed, so I redid the queue and restarted, then last night the machine hung, so I rebooted, redid the queue and started again. It ran for about 1/2 hour and died, so I redid the queue again, then again.. you get the picture.. Then this morning after another reboot I tried again to make the queue and started getting the core dumps.

One strange thing is, it works ok for a 2.5 Mb (yes, that small, I've tried lots of things :) queue, but 5 Mb, 25 Mb, 250 Mb, 400 Mb, and 600 Mb (my normal queue size as of late) all dump a core. It is on a local disk, not an nfs mounted one.
I'm running on an SGI R4000 with IRIX 6.5, 192 Mb of RAM, roughly 200 Mb of swap.

As for starting it, I'm just running a normal ldmadmin mkqueue. I believe the command that it is spawning is:

  pqcreate -q /usr2/ldm/ldm.pq -s 25000000

  sunset 10% ldmadmin mkqueue
  Sep 23 22:43:37 UTC sunset.meteor.wisc.edu : make_pq: mkqueue failed

Here's the output from dbx:

  sunset 31% dbx /usr/local/ldm/bin/pqcreate core
  dbx version 7.2.1 patch 2991 May 14 1998 17:09:10
  Core from signal SIGBUS: Bus error
  (dbx) where
  >  0 sx_init(sx = 0x5833a64, nalloc = 6103) ["/usr/local/ldm/ldm-5.1.2/src/pq/pq.c":2200, 0x1000a418]
     1 ctl_init(pq = 0x10033fe0, align = 8) ["/usr/local/ldm/ldm-5.1.2/src/pq/pq.c":3783, 0x1000ec4c]
     2 pq_create(path = 0x7fff3012 = "/usr2/ldm/ldm.pq", mode = 438, pflags = 0, align = 8, initialsz = 25000000, nproducts = 6103, pqp = 0x7fff2dec) ["/usr/local/ldm/ldm-5.1.2/src/pq/pq.c":4306, 0x100105c4]
     3 main(ac = 7, av = 0x7fff2e64) ["/usr/local/ldm/ldm-5.1.2/src/pqcreate/pqcreate.c":186, 0x10003790]
     4 __start() ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M3/csu/crt1text.s":177, 0x10003184]

Let me know if this helps at all, I'm still planning to go in tonight to start swapping hardware to see if that makes a difference. You think maybe I should recompile the ldm? Perhaps some of the binaries got fu-bar'd somehow?

Thanks for the help!
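One quick check that can be made on the traceback above: a SIGBUS can be raised for a misaligned memory access, so it is worth seeing whether the addresses dbx reports are multiples of the `align = 8` value shown in the ctl_init and pq_create frames. The sketch below only does that arithmetic on the two pointer values printed in the traceback. Whether sx_init actually needs the structure at `sx` to be 8-byte aligned is not stated anywhere in this thread, so the result is a hint at most; the helper name `misalignment` is made up for this illustration.

```python
# Illustration only: remainder of each traceback address modulo the
# 'align = 8' value shown in the ctl_init and pq_create frames.
# That sx_init requires 8-byte alignment is NOT established in the thread.
ALIGN = 8

# Pointer values copied from the dbx traceback in the message above.
ADDRESSES = {
    "sx (sx_init frame)": 0x5833A64,
    "pq (ctl_init frame)": 0x10033FE0,
}


def misalignment(address: int, align: int = ALIGN) -> int:
    """Return how many bytes 'address' sits past the previous multiple of 'align'."""
    return address % align


for name, address in ADDRESSES.items():
    print(f"{name}: {address:#x} % {ALIGN} = {misalignment(address)}")
```

The `sx` address leaves a remainder of 4, so it is 4-byte aligned but not 8-byte aligned, while the `pq` address is a multiple of 8.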
Pete

Date: Sat, 23 Sep 2000 19:43:32 -0600 (MDT)
From: Steve Chiswell <address@hidden>
To: Pete Pokrandt <address@hidden>
cc: address@hidden, address@hidden
Subject: 20000923: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-Reply-To: <address@hidden>

Pete,

pqcreate would core dump if you ran out of disk space while trying to create the queue.....or if creating the queue was exercising some bad disk blocks. Assuming you have plenty of disk space, you might want to try the format utility to test the disk for bad sectors - and map them out if found.

Steve Chiswell

To: Steve Chiswell <address@hidden>
cc: address@hidden, address@hidden
Subject: Re: 20000923: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Sat, 23 Sep 2000 19:43:32 MDT." <address@hidden>
Date: Sat, 23 Sep 2000 21:00:15 -0500
From: Pete Pokrandt <address@hidden>

In a previous message to me, you wrote:

>Pete,
>
>pqcreate would core dump if you ran out of disk space while trying to create the
>queue.....or if creating the queue was exercising some bad disk blocks.
>Assuming you have plenty of disk space, you might want to try the format
>utility to test the disk for bad sectors - and map them out if found.
>
>Steve Chiswell

Steve,

The disk is not full, and in fact I have tried it on more than one disk, and get the same results on both. I'll try it on a third and see if it still happens. I also just recompiled the ldm, I'll see if that makes any difference.
Thanks,

Pete

To: Steve Chiswell <address@hidden>
cc: address@hidden, address@hidden
Subject: Re: 20000923: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Sat, 23 Sep 2000 19:43:32 MDT." <address@hidden>
Date: Sat, 23 Sep 2000 21:15:07 -0500
From: Pete Pokrandt <address@hidden>

Steve and all,

Recompiled the ldm, still dumps core. Tried to build the queue on yet a third disk, still dumps core. Here's the stack from dbx on the pqcreate core file:

  sunset 18% dbx ~/bin/pqcreate core
  dbx version 7.2.1 patch 2991 May 14 1998 17:09:10
  where
  Core from signal SIGBUS: Bus error
  (dbx) >  0 sx_init(sx = 0x5833a64, nalloc = 6103) ["/usr/local/ldm/ldm-5.1.2/src/pq/pq.c":2200, 0x1000a418]
     1 ctl_init(pq = 0x10033fe0, align = 8) ["/usr/local/ldm/ldm-5.1.2/src/pq/pq.c":3783, 0x1000ec4c]
     2 pq_create(path = 0x7fff300c = "/cool.pretty/ldm/ldm.pq", mode = 438, pflags = 1, align = 8, initialsz = 25000000, nproducts = 6103, pqp = 0x7fff2dec) ["/usr/local/ldm/ldm-5.1.2/src/pq/pq.c":4306, 0x100105c4]
     3 main(ac = 5, av = 0x7fff2e64) ["/usr/local/ldm/ldm-5.1.2/src/pqcreate/pqcreate.c":186, 0x10003790]
     4 __start() ["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M3/csu/crt1text.s":177, 0x10003184]
  (dbx)

If I make the queue size small enough - ridiculously small, a bit under 280000 bytes - then it is successful. Check out this sequence of pqcreate commands (I deleted the ldm.pq in between each one from a different window):

  sunset 22% pqcreate -q /cool.pretty/ldm/ldm.pq -v -f -s 250000
  Creating /cool.pretty/ldm/ldm.pq, 250000 bytes, 61 products.
  pqcreate: create "/cool.pretty/ldm/ldm.pq" failed: File exists
  sunset 23% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 400000
  Creating /cool.pretty/ldm/ldm.pq, 400000 bytes, 97 products.
  Bus error (core dumped)
  sunset 24% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 300000
  Creating /cool.pretty/ldm/ldm.pq, 300000 bytes, 73 products.
  Bus error (core dumped)
  sunset 25% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 260000
  Creating /cool.pretty/ldm/ldm.pq, 260000 bytes, 63 products.
  sunset 26% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 270000
  Creating /cool.pretty/ldm/ldm.pq, 270000 bytes, 65 products.
  sunset 27% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 280000
  Creating /cool.pretty/ldm/ldm.pq, 280000 bytes, 68 products.
  Bus error (core dumped)
  sunset 28% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 275000
  Creating /cool.pretty/ldm/ldm.pq, 275000 bytes, 67 products.
  sunset 29% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 278000
  Creating /cool.pretty/ldm/ldm.pq, 278000 bytes, 67 products.
  sunset 30% pqcreate -q /cool.pretty/ldm/ldm.pq -v -s 279000
  Creating /cool.pretty/ldm/ldm.pq, 279000 bytes, 68 products.
  Bus error (core dumped)

For whatever reason, 67 products is ok, but 68 is a no-go. The exact same behavior is exhibited no matter what local disk I try to create the queue on:

  sunset 32% pqcreate -q /usr2/ldm/ldm.pq -v -s 278000
  Creating /usr2/ldm/ldm.pq, 278000 bytes, 67 products.
  sunset 33% pqcreate -q /usr2/ldm/ldm.pq -v -s 279000
  Creating /usr2/ldm/ldm.pq, 279000 bytes, 68 products.
  Bus error (core dumped)

I'm really stumped here.. could it be something with the memory mapping? In all cases, it seems to create the entire length of the file, and right at the very end, when the queue size is almost at, or at its proper size, that's when the core dump occurs.

I'm going to swap in some different RAM and if that doesn't work, a new mother board, to see if either of those make any difference.
Pete

To: Steve Chiswell <address@hidden>
cc: address@hidden, address@hidden
Subject: Re: 20000923: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Sat, 23 Sep 2000 19:43:32 MDT." <address@hidden>
Date: Sat, 23 Sep 2000 21:27:19 -0500
From: Pete Pokrandt <address@hidden>

In a previous message to me, you wrote:

>Pete,
>
>pqcreate would core dump if you ran out of disk space while trying to create the
>queue.....or if creating the queue was exercising some bad disk blocks.
>Assuming you have plenty of disk space, you might want to try the format
>utility to test the disk for bad sectors - and map them out if found.
>
>Steve Chiswell

Steve and all,

Update number n+3.. new motherboard, new memory, same problem. Still dumping core. I suppose it is possible that all 3 of the disks that I am running this on have bad blocks on them, I'll give the format util a try and see if I can find anything along those lines.
Pete

To: Pete Pokrandt <address@hidden>
Cc: support-ldm, chiz, rkambic
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
Date: Sat, 23 Sep 2000 23:41:23 -0600
From: Russ Rew <address@hidden>

Pete,

Sorry to hear replacing the memory and the other things you've tried haven't fixed the problem.

The dbx traceback you sent showing the bus error seems to indicate an alignment problem, as if something is being stored at an address that is not properly aligned for the type of data that is stored there, for example trying to store a 32-bit integer at an odd byte address. I can't remember seeing anything quite like that, and I couldn't reproduce the problem on an SGI/IRIX 6.5 platform here.

Your experiment with changing queue sizes to show that 67 products works but 68 products doesn't leads me to believe you might be able to explicitly set the number of products to a larger number using the "-S" option to pqcreate. While you're at it, you should probably be using the "-c" (clobber) option as well, so you don't have to manually delete the queue each time before you create a new one.

pqcreate just divides the queue size by 4096 to get the number of product slots to use, but you can specify a different number with the -S option, something like:

  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6101

for example to make the queue have 6101 product slots instead of 6103. If you played around with this, you might find a value that worked with a large queue and there might be a pattern to the bus errors that depends on the number of product slots.
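To make the default described above concrete, the following sketch divides each queue size tried earlier in the thread by 4096 and compares the result with the product count that pqcreate printed. Truncating integer division is an assumption on my part (the message only says "divides"), and the function name `default_slots` is invented for this illustration; it is not taken from the LDM source.

```python
# Sketch of the default slot count described in the message above.
# Assumption: the division truncates (integer division).
SLOT_DIVISOR = 4096


def default_slots(queue_bytes: int) -> int:
    """Default number of product slots for a queue of 'queue_bytes' bytes."""
    return queue_bytes // SLOT_DIVISOR


# (queue size in bytes, product count printed by "pqcreate -v" in this thread)
REPORTED = [
    (250000, 61), (260000, 63), (270000, 65), (275000, 67), (278000, 67),
    (279000, 68), (280000, 68), (300000, 73), (400000, 97), (25000000, 6103),
]

for size, reported in REPORTED:
    computed = default_slots(size)
    status = "matches" if computed == reported else "differs"
    print(f"{size:>9} bytes: computed {computed}, reported {reported} ({status})")
```

Every size in the list reproduces the reported count, which is consistent with truncating division, though it does not prove that is how pqcreate computes it.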
This is pure speculation since I can't reproduce the problem, but maybe you are compiling with a compiler flag or optimization level that changes the alignment restrictions. For example, if you set the highest level of optimization when compiling, maybe that requires strict alignment, whereas if you don't specify optimization but instead use the debugging flag "-g", looser alignment works.

I'm afraid I'll have to wait until Monday to pursue this, but a little more information might help:

 - Do you have the CFLAGS environment variable set when you build the LDM? If so, what value?

 - Is this the first time you've tried LDM 5.1.2 on this SGI/IRIX platform (sunset)? If so, what version were you running with successfully before?

 - What kind of platform is profhorn? Are you using LDM 5.1.2 on it?

You may have found a platform-specific bug in LDM 5.1.2, but until we can reproduce it, we'll have trouble fixing it ...

--Russ

To: Russ Rew <address@hidden>
cc: address@hidden
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Mon, 25 Sep 2000 12:44:23 MDT." <address@hidden>
Date: Mon, 25 Sep 2000 14:11:22 -0500
From: Pete Pokrandt <address@hidden>

In a previous message to me, you wrote:

>Pete,
>
>> I'm running on an SGI R4000 with IRIX 6.5, 192 Mb of RAM, roughly 200 Mb of swap
>
>> Recompiled the ldm, still dumps core.
>
>Could you please try using our precompiled binary for SGI/IRIX
>platforms on sunset, instead of what you compiled? Maybe just use
>pqcreate out of our binary to see if it fails the same way yours
>does. This would eliminate a lot of the possible sources of problems,
>such as which compiler with which flags and libraries you used to
>build LDM 5.1.2.
>

Russ,

Your pqcreate also dumps core. Also rebuilt the kernel and no luck.

>
>Also, did you get the message I sent Saturday night? If not, I've
>appended another copy.

Yes, but in the flurry of things I was trying I totally forgot about it.
I'll go through that now and get back to you.

Thanks again,

Pete

To: Russ Rew <address@hidden>
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Mon, 25 Sep 2000 12:44:23 MDT." <address@hidden>
Date: Mon, 25 Sep 2000 14:57:07 -0500
From: Pete Pokrandt <address@hidden>

In a previous message to me, you wrote:

> Pete,
>
> Your experiment with changing queue sizes to show that 67 products
> works but 68 products doesn't leads me to believe you might be able to
> explicitly set the number of products to a larger number using the
> "-S" option to pqcreate. While you're at it, you should probably be
> using the "-c" (clobber) option as well, so you don't have to manually
> delete the queue each time before you create a new one.
>
> pqcreate just divides the queue size by 4096 to get the number of
> product slots to use, but you can specify a different number with the
> -S option, something like:
>
>   pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6101
>
> for example to make the queue have 6101 product slots instead of 6103.
> If you played around with this, you might find a value that worked
> with a large queue and there might be a pattern to the bus errors that
> depends on the number of product slots.

Russ,

  sunset 35% pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6101
  Creating /cool.pretty/ldm/ldm.pq, 25000000 bytes, 6101 products.

No core dump (woohoo!). I'll play around with it some more and see if that queue actually works with the ldm.. Shouldn't be any reason why it wouldn't, right?
> I'm afraid I'll have to wait until Monday to pursue this, but a little
> more information might help:
>
>  - Do you have the CFLAGS environment variable set when you build the
>    LDM? If so, what value?

Shouldn't be, I'm just running with a straight ./configure with no CFLAGS env variable set.

>
>  - Is this the first time you've tried LDM 5.1.2 on this SGI/IRIX
>    platform (sunset)? If so, what version were you running with
>    successfully before?

I have been running ldm-5.1.2 on sunset since Sept 2, and a beta version before that since August 4. Both ran just fine up until Friday. That's the most bizarre part of this, I didn't change anything, it just stopped working.. Kinda scary.

>
>  - What kind of platform is profhorn? Are you using LDM 5.1.2 on it?

profhorn is Red Hat Linux:

  Red Hat Linux release 6.1 (Cartman)
  Kernel 2.2.14 on an i686

It is running ldm-5.1.2 beta1 since August 7.

>
> You may have found a platform-specific bug in LDM 5.1.2, but until we
> can reproduce it, we'll have trouble fixing it ...
>
> --Russ
>

To: Russ Rew <address@hidden>
cc: address@hidden
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Mon, 25 Sep 2000 12:44:23 MDT." <address@hidden>
Date: Mon, 25 Sep 2000 16:03:57 -0500
From: Pete Pokrandt <address@hidden>

Russ,

I have been playing a bit more with the queue sizes.. It seems that you are correct, that only certain values for the number of products work.
I have had success with these:

-----
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6101
    (where the default would have been 6103)
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6099
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6098
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6097
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6094
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6093
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6089

and

  pqcreate -c -q /usr3/ldm/data/ldm.pq -v -s 650000000 -S 158689
    (where the default would have been 158691)
-----

The following all failed:

Default:

  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000
  Creating /cool.pretty/ldm/ldm.pq, 25000000 bytes, 6103 products.
  Bus error (core dumped)

  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6102
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6100
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6096
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6095
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6092
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6091
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6090
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6088
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6087
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6086
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6085
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6084
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6083
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6082
  pqcreate -c -q /cool.pretty/ldm/ldm.pq -v -s 25000000 -S 6081

Pete

To: Russ Rew <address@hidden>
Subject: Re: sunset downstream sites may feedfrom profhorn.meteor.wisc.edu
In-reply-to: Your message of "Mon, 25 Sep 2000 15:56:31 MDT." <address@hidden>
Date: Mon, 25 Sep 2000 17:05:48 -0500
From: Pete Pokrandt <address@hidden>

In a previous message to me, you wrote:

>Pete,
>
>Thanks for trying our binary and for reporting back on LDM 5.1.2
>pqcreate values that worked and the ones that caused bus errors on
>SGI/IRIX. You're the first one to report this bug, and we have now
>reproduced it here so we have a chance of fixing it. The bus error
>occurs under the following circumstances:
>
> - SGI/IRIX 32-bit platform (things seem to work fine on 64-bit IRIX
>   platforms when compiled with -64 flag)
>
> - LDM 5.1.2 (things seem to work with LDM 5.1.2beta3, so this bug was
>   introduced late in development)
>
> - certain values of queue size and number of products, as you have
>   reported
>
>The workaround, to try different values of number of products with
>"-S" option to pqcreate, will get you going until I can deliver the
>real fix.

Russ,

Got it, yeah, I am running now with the 650 Mb queue I produced with Default - 2 and it's running fine. I must have just been lucky with the previous size queues I had been running with. Glad I could help find the bug.. well, kinda.. :)

>
>I'll put some sort of announcement onto the ldm-users mailing list
>about this bug and the work-around soon.
>
>Thanks again for your persistence, and sorry we didn't catch this
>during testing ...

No problem, I'm just happy to have a solution that works..
Pete
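The workaround discussed in this thread, trying nearby values for "-S" until pqcreate stops dumping core, could be scripted roughly as follows. This is only a sketch under stated assumptions: it uses just the pqcreate options that appear in the thread (-c, -q, -v, -s, -S), it assumes the default slot count is the queue size divided by 4096 with truncation, and it assumes pqcreate exits with status 0 on success. The function names are invented for this example. Because of "-c", every attempt overwrites any existing queue at the given path.

```python
import subprocess
from typing import Callable, Iterator, List, Optional

SLOT_DIVISOR = 4096  # divisor given in the thread for the default slot count


def candidate_commands(queue_path: str, queue_bytes: int,
                       max_below_default: int = 16) -> Iterator[List[str]]:
    """Yield pqcreate command lines, starting at the assumed default slot
    count and stepping down one slot at a time."""
    default_slots = queue_bytes // SLOT_DIVISOR
    lowest = max(default_slots - max_below_default, 1)
    for slots in range(default_slots, lowest - 1, -1):
        yield ["pqcreate", "-c", "-q", queue_path, "-v",
               "-s", str(queue_bytes), "-S", str(slots)]


def run_command(cmd: List[str]) -> bool:
    """Run one command and report success.

    Assumption: exit status 0 means the queue was created. A process
    killed by a signal such as SIGBUS gives a negative returncode."""
    return subprocess.run(cmd).returncode == 0


def first_working(queue_path: str, queue_bytes: int,
                  run: Callable[[List[str]], bool] = run_command,
                  max_below_default: int = 16) -> Optional[List[str]]:
    """Return the first candidate command for which run() reports success,
    or None if every candidate fails."""
    for cmd in candidate_commands(queue_path, queue_bytes, max_below_default):
        if run(cmd):
            return cmd
    return None


if __name__ == "__main__":
    # Only print a few candidates here; nothing is executed.
    for cmd in candidate_commands("/cool.pretty/ldm/ldm.pq", 25000000, 3):
        print(" ".join(cmd))
```

Passing the runner in as a parameter lets the search be tried against a stand-in function before pointing it at a real queue path. With the results reported above for a 25000000-byte queue, such a search would be expected to stop at 6101 slots.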