This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
>From: "Jennie L. Moody" <address@hidden> >Organization: UVa >Keywords: 199907281535.JAA02200 McIDAS GRIB DMGRID Jennie, re: differences between the 7.1 and 7.5 versions of dmgrid.pgm >I just looked to see if the codes were identical (noted they were not), >and looked to see if I could find a parameter POINTER in each, but >I didn't spend a lot of time looking at the code. There is no help file >for either versions decoder, so it isn't obvious that there are any >parameters. I would have assumed that something was majorly different as well especially given your expectation that by specifying the POINTER= keyword you would see a corresponding change in GRIBDEC.PRO. re: failure of LWU POKE GRIBDEC.PRO >Well, you had me there for a minute, I thought it might be the >write permission, but not so: > >TFILE did OPEN on window 0 >DMAP GRIB* >PERM SIZE LAST CHANGED FILENAME DIRECTORY >---- --------- ------------ --------------------- --------- >-rw- 1878 Dec 30 1996 GRIBDEC.CFG /home/mcidas/710/workdata >-rw- 20252 Aug 10 16:30 GRIBDEC.OUT /home/mcidas/710/workdata >-rw- 4 Aug 19 15:37 GRIBDEC.PRO /home/mcidas/710/workdata >-rw- 4 Jul 27 15:42 GRIBDEC.PRO.bak /home/mcidas/710/workdata >-rw- 4 Aug 10 15:59 GRIBDEC.PRObeforepoke /home/mcidas/710/workdata >22142 bytes in 5 files File permissions look OK. > its in the path > >REDIRECT LIST >Number of active redirection entries=22 >AREA00* /incoming/data/mcidas >AREA01* /incoming/data/mcidas >AREA02* /incoming/data/mcidas >GRID5* /incoming/data/mcidas/xcd >HRS.SPL /incoming/data/mcidas/xcd >IDXALIAS.DAT /incoming/data/mcidas/xcd >MDXX00* /incoming/data/mcidas/xcd >RAOB.RAP /incoming/data/mcidas/xcd >RAOB.RAT /incoming/data/mcidas/xcd >ROUTE.SYS /incoming/data/mcidas >SAOMETAR.RAP /incoming/data/mcidas/xcd >SAOMETAR.RAT /incoming/data/mcidas/xcd >SYNOPTIC.RAP /incoming/data/mcidas/xcd >SYNOPTIC.RAT /incoming/data/mcidas/xcd >SYSKEY.TAB /incoming/data/mcidas >TERMFCST.RAP /incoming/data/mcidas/xcd >TERMFCST.RAT /incoming/data/mcidas/xcd >TEXTPROD.DAT /incoming/data/mcidas/xcd >WXWATCH.DAT /incoming/data/mcidas/xcd >*.IDX /incoming/data/mcidas/xcd >*.IDT /incoming/data/mcidas/xcd >*.XCD /incoming/data/mcidas/xcd >REDIRECT: Done > there is no redirection messing me up Right, there is NO REDIRECTion for GRIBDEC.PRO. This means that McIDAS will search the directories in MCPATH to find GRIBDEC.PRO. >-rw-rw-rw- 2 mcidas usr 1878 Dec 30 1996 GRIBDEC.CFG >-rw-r--r-- 1 mcidas usr 20252 Aug 10 16:30 GRIBDEC.OUT >-rw-r--r-- 1 mcidas usr 4 Aug 19 17:30 GRIBDEC.PRO >-rw-r--r-- 1 mcidas usr 4 Jul 27 15:42 GRIBDEC.PRO.bak >-rw-r--r-- 1 mcidas usr 4 Aug 10 15:59 GRIBDEC.PRObeforepoke > >its owned by mcidas.... Right. >this did get me wondering about the >fact that ldmadmin is running xcd_run, but that shouldn't matter >since it only exectes ingebin and generates the spool, and the >dmgrid process is started by me (as user mcidas) from a mcidas session. Whoa! I guess that I am not understanding the process here. You are running the LDM to try and get this to work? Running ldmadmin will start the LDM and, if ldmd.conf is so configured, startup startxcd.k which will, in turn, start DMGRID if the decoder is enabled. But you knew that... If it were me, I would bypass the ldmadmin step and run DMGRID directly from a McIDAS session on aeolus. This way you could specify the POINTER= keyword or not. When DMGRID is started from startxcd.k, the POINTER= keyword is not specified. >There isn't any reason that the ingestor needs to know about this >pointer, correct. Right. 
The binary ingester, ingebin.k, simply reads from stdin and writes to the
spool file specified in GRIBDEC.CFG, the default for which is HRS.SPL.

>It's just that when the spool gets changed (through
>ingebin), the data monitor/decoder needs to be able to look up and
>say, hey, new data in the spool, the last byte I read was here
>(looks at pointer), guess I better start decoding.  Then, when it
>stops decoding, it writes the new location of the last byte read,
>correct?

Right you are.  It also uses information in the spool file itself to know
how far it can read in the spool file.

re: let's figure out why things that should work are not working

>Okay...but at the moment, Tony has written a script which he has running
>through a cron, starts a new cat to xcd_run every 30 minutes, to finish
>the decoding of a bunch of data he is working with (remember this is on
>aeolus, so it takes about 20 minutes to decode each set of grib files
>(it's the eta model, and he is only decoding the initialization through
>the 12 hour forecast, but aeolus is a low performance machine!).  Anyway,
>since we are retrieving the 0 and 12Z runs for several days, he has a few
>hours worth of crons running.

OK.

>Yesterday we identified one of the problems (and the reason he was
>getting errors but I wasn't when we tried to decode the same data,
>which was pissing him off).  He didn't always wait for the decoder to
>finish....he would get a prompt back from having catted data into
>xcd_run, and he would wait a while, but he didn't make sure that
>the decoder had stopped processing data in the spool

'ingebin.k' simply writes to the spool file and then terminates on EOF
(i.e. no more data to read from stdin).  The data monitor, DMGRID, will
wake up after sleeping for a while, see that there is data beyond the
last read point in the spool, and begin decoding.  As you note, aeolus is
slow, so the decoding takes some time.

Just for interest: is the directory into which DMGRID writes McIDAS GRID
files on a local file system, or is it on an NFS mounted file system?  If
it is on an NFS mounted file system, the writing of the output will take
a LOT longer than if the file system is local to aeolus.

>(there is no
>easy way to "see" this except for watching the time/size of the
>output grid and ascertaining that it has stopped changing/growing).

Another way to see this is by observing when the pointer in GRIBDEC.PRO
stops changing.  Since the pointer is updated only after each product is
completed, there will be plenty of times when the pointer is not changing
simply because DMGRID is busy decoding a product.

Still another way would be to look at the operating system's process
listing for DMGRID and see if it is active or sleeping.  DMGRID should
not go to sleep while there is still data to be processed.  Be careful,
if you take this approach, not to mistake DMGRID being swapped out for
its being asleep.  For my money, 'top' is the easiest thing to use to see
the state of running processes.
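As an illustration of the "watch the pointer" approach, here is a rough
shell loop that declares the decoder idle once GRIBDEC.PRO has gone
unchanged for a few minutes.  The path and the interval are assumptions;
as noted above, a quiet pointer can also just mean DMGRID is in the
middle of one large product, so the wait should be generous.

  #!/bin/sh
  # Sketch: crude "is DMGRID done?" check -- wait until GRIBDEC.PRO
  # (the last-read pointer) has been unchanged for IDLE_SECS seconds.
  # Path and interval are assumptions; tune them for your setup.
  PRO=/home/mcidas/710/workdata/GRIBDEC.PRO
  IDLE_SECS=300       # generous: a quiet pointer can also mean DMGRID
                      # is still chewing on a single big product

  prev=`cksum "$PRO"`
  while :; do
      sleep "$IDLE_SECS"
      cur=`cksum "$PRO"`
      if [ "$cur" = "$prev" ]; then
          echo "pointer unchanged for $IDLE_SECS seconds; assuming DMGRID is idle"
          break
      fi
      prev=$cur
  done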
Take for instance a 'top' run on our LDM machine:

last pid: 18321;  load averages:  1.09,  0.99,  0.90                14:12:47
60 processes:  58 sleeping, 2 on cpu
CPU states: 27.5% idle, 34.2% user, 10.6% kernel, 27.8% iowait, 0.0% swap
Memory: 512M real, 15M free, 58M swap in use, 821M swap free

  PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
17889 ldm        1   0    0 7056K 2792K cpu0     1:38 39.06% dcgrib
 1325 ldm        1  48    0 1270M   26M sleep   28:12  2.61% rpc.ldmd
 1765 ldm        1  58    0 1271M   80M sleep   14:55  1.39% rpc.ldmd
 1327 ldm        1  58    0 1270M 5900K sleep   26:17  0.80% rpc.ldmd
 1322 ldm        1  58    0 1271M   90M sleep   17:20  0.78% pqact
 1321 ldm        1  58    0 1270M   22M sleep    8:44  0.56% pqbinstats
17849 ldm        1  38    0 1972K  784K sleep    0:02  0.42% ingebin.k
16511 ldm        1  58    0 1271M   22M sleep    1:07  0.19% rpc.ldmd
 1324 ldm        1  58    0 1270M 3528K sleep    5:38  0.14% rpc.ldmd
 1526 ldm        1  58    0   12M 1188K sleep   15:50  0.13% dmgrid.k
 5510 ldm        1  52    0 3040K  888K sleep    0:32  0.02% dmsfc.k
  935 ldm        1  58    0 3004K  632K sleep    0:14  0.02% dmsyn.k
17820 ldm        1  32    0 4948K 1448K sleep    0:00  0.02% dchrly
18121 ldm        1  58    0 3076K 2464K sleep    0:00  0.02% ldmConnect
17822 ldm        1  54    0 5020K 4248K sleep    0:01  0.01% metar2nc

Notice in this listing that dmgrid.k is in the sleep state.  After
watching the top output for a while, it is pretty easy to discern that
there is no grid data that it needs to process right now.

>Then he would start a new ingebin process (catting to xcd_run).  I
>have tried to logically understand why this throws things off, but
>I cannot say I understand it...if the spool changes while the decoder
>is off running, but it is keeping track of the last byte it read, it
>should be able to finish its process, write out the pointer, then
>notice that there is new data to decode, and pick up reading where
>it left off, no?

Yes.

>It doesn't seem to work this way, but if
>you wait until the last data was decoded, and then initiate a new
>ingebin, it does pick up and start working.

What may be going on is that the next slug of data being catted to
ingebin.k is big enough that the spool wraps around past the point where
the decoder is reading.  Remember that the spool file is "circular".
Data is written to the logical end, and the logical end moves around the
file.  My theory is that the decoder is munching away at the data in the
spool and the next data gets put in past the read point.  Perhaps a
(bad) picture would help me explain:

State 1: ingebin.k has finished writing to the spool and dmgrid.k is
munching away decoding data into output GRID files

                        +----------+
                        +          +  product to be read
                        +          +  <- product to be read ends here
                        +          +
                        +          +
                        +          +
  dmgrid reading here ->+          +  <- product to be read begins here
                        +          +
                        +          +  product to be read
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +         "
                        +----------+         "

State 2: ingebin.k is run again, adding new data at the "fill" point; the
size of the new product puts the end-of-data pointer beyond where
dmgrid.k was already reading: bad things happen

                        +----------+
                        +          +  product to be read
                        +          +  <- new data starts getting written here
                        +          +
                        +          +
                        +          +
  dmgrid reading here ->+          +  new data extends beyond where
                        +          +  dmgrid.k is already reading
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +         "
                        +          +  <- end of new data is here
                        +----------+

>There are plenty of things I don't understand (generalizable statement
>for sure), but things like this bug me, because I want to understand
>how it works (and be able to explain it)....

Did the above help?  The thing to recognize is that ingebin.k will NOT
wait to write to the spool file.
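The practical way to avoid the "State 2" picture is simply not to feed
the next batch until the decoder has caught up.  The sketch below is one
way to do that from the shell.  It assumes the pointer-idle check shown
earlier has been saved as a script (wait_for_dmgrid.sh is a hypothetical
name), that the case-study GRIB files live under an assumed path, and
that data is fed with the same cat-to-xcd_run invocation Tony's cron job
uses (the HRS argument here is a placeholder for whatever that is).

  #!/bin/sh
  # Sketch: feed case-study GRIB batches one at a time, waiting for the
  # decoder to go idle between batches so new data never laps the point
  # dmgrid.k is still reading (the "State 2" picture above).
  # wait_for_dmgrid.sh is a hypothetical wrapper around the pointer
  # check shown earlier; the xcd_run arguments are whatever your
  # existing cron job uses.
  for f in /case/study/eta_*.grib; do
      ./wait_for_dmgrid.sh           # block until GRIBDEC.PRO goes quiet
      cat "$f" | xcd_run HRS         # same cat-to-xcd_run feed as the cron job
  done

This replaces the fixed 30-minute cron interval with an explicit "decoder
is idle" condition, which is the point of the whole discussion above.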
It knows nothing about dmgrid.k; all it knows is that it is charged with
writing to the spool in a circular fashion.  So, if the spool file is big
enough, and the data products are added slowly enough, then the decoder
will always keep ahead of the filling.  If the spool file is small, or if
the decoder is running very slowly, then the new data will overwrite
spool file locations before dmgrid.k gets a chance to read them.  At this
point, all bets are off, as you might imagine.

>anyway, recently I went
>off to look a little more closely at how the real-time data gets
>decoded, and sadly I have really confused myself further.

Again, the big picture is that ingebin.k knows nothing about dmgrid.k
running, and dmgrid.k knows nothing about ingebin.k filling the spool
(other than seeing the read and end-of-information pointers in the spool
file itself).  This structure allows products to be written to the spool
file even if dmgrid.k is not running.

>So, let me
>make a few observations, and ask a few questions here:
>
>When we used to get grid files from Wisconsin, these grid files came
>over the mcidas feed, and the grids were reliably uniform...ie., the
>same field (met variable) was in the same location of the grid file
>all the time, so in fact we had written lots of batch files (long ago)
>that read specific fields (like mid-tropospheric temperatures) just
>by referencing their grid numbers within grid files and writing out
>new grids into new specified grid numbers, etc.

Right.  This was needed back in the days when McIDAS commands accepted
only the grid number within the GRID file.  If you wanted to do automatic
data processing, you had to be certain of what field was in what grid in
the GRID file.

>I realize that
>the Wisconsin folks were "making" these mcidas grids, so it's easy
>to see that they would be standardized.

Right.

By the way, I put a new routine into the 7.60 release that can create the
GRID files that used to be in the Unidata-Wisconsin datastream from XCD
decoded GRID files.  The new routine is called UWGRID.  I grabbed this
routine from SSEC, renamed it, and added the ability for it to use the
file routing table for output grid file numbers AND to kick off
PostProcess BATCH files.  I also added a Unix shell script that contains
the McIDAS environment variable setting lines that allows one to run
UWGRID from the Unix shell.  This shell script is cleverly called
uwgrid.sh :-).

>With the new grids that we receive (processed through decoders), the order
>of grids in a file is quite variable, and there are frequent repetitions of
>grids, sometimes just a few grids appear to be repeated, sometimes a
>lot of grids get repeated.

Right, DMGRID does not try to detect duplicate grids when writing to the
output GRID file.

>They *are* identical grids when one compares
>them with an expanded listing of the grdlist command (actually, I might
>have been using the old IGG FORM=EXP, nevertheless...).

You are correct.  If there are multiple GRIB products containing the
exact same data, then there will be multiple grids containing the exact
same data in a GRID file.

>I presume this
>is okay, and does not indicate there being anything wrong with our data
>ingestion or decoding.  Can you reassure me on this?

I can and do reassure you that this is normal.

>Also, I am trying
>to imagine where this duplication originates?

In the datastream itself.

>I suppose that data (in
>the form of packets of grib files?) could be resent if there are
>network interruptions....

Exactly correct!
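Because the spool is circular, one obvious (if only partial) sanity check
before feeding a large case-study batch is to compare its size against
the spool file itself: a single batch larger than HRS.SPL cannot avoid
lapping the decoder.  This is a sketch with assumed paths (the HRS.SPL
location is taken from the REDIRECT listing above), and it assumes the
spool file's on-disk size reflects its circular capacity; passing the
check is not a guarantee of safety, since the decoder may still be behind
on earlier batches.

  #!/bin/sh
  # Sketch: refuse to feed a batch of GRIB files that could not possibly
  # fit in the circular spool without overwriting unread data.
  # Paths are assumptions (HRS.SPL location from the REDIRECT list).
  SPOOL=/incoming/data/mcidas/xcd/HRS.SPL

  spool_size=`wc -c < "$SPOOL" | tr -d ' '`
  batch_size=`cat "$@" | wc -c | tr -d ' '`

  if [ "$batch_size" -ge "$spool_size" ]; then
      echo "batch ($batch_size bytes) >= spool ($spool_size bytes): would wrap" >&2
      exit 1
  fi
  echo "batch fits ($batch_size of $spool_size bytes); still wait for DMGRID to catch up"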
>does resent data result in repeated data?

Yes.  Go to the head of the class ;-)

>If this is the case, would it also be that the
>further we are down in the idd distribution, this could get worse??

This should NOT be the case.  The LDM has code that attempts to detect
and eliminate duplicate products to prevent this kind of problem, but the
checksum approach being used cannot detect the same data arriving in two
different products.

>Or maybe I have a really flawed understanding of how this works and
>could use a better education....I'm all ears.

Nope, I think you understand what is going on quite well.

>(These new files do make me appreciate the ADDE commands that allow us
>to forgo any knowledge of where a grid is located in a file since we
>can just refer to it by specific parameters).

Right.  AND you can go to a cooperating site to use their data holdings
if you didn't receive the data by the IDD.  I think that this is
massively cool.

>At this point, as long as we can get out the data, and get on with
>working with it, that's good enough, but it would be nice to
>understand why problems arise

I agree.

>(and I _still_ believe the cleanest
>way to retrieve any individual set of archived data would be
>to reset a clean spool [ie., copy /dev/null and then cat a new set
>of grib files] and run the decoder with a pointer set to start
>reading the spool at the top (in other words, the last byte read
>was "zero")

You could set up a system where there are multiple spool files.  One
would be used for the realtime data; the other(s) could be used for case
study data sets.  Basically, in McIDAS there are usually about 3 to 5
different ways of attacking the same problem.  The more esoteric ones
require a lot more knowledge about how things actually work, however.
This is the main reason that I have not broached this subject before.

Tom
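For completeness, here is what the "reset a clean spool" idea quoted
above might look like from the shell.  Treat it strictly as a sketch of
that proposal, not a recommended procedure: the paths are assumptions,
the spool carries internal bookkeeping that truncation alone may not
leave in a usable state, DMGRID and ingebin.k must not be running while
you do this, and the GRIBDEC.PRO pointer would still need to be reset
separately (e.g. via LWU POKE) before restarting the decoder.

  #!/bin/sh
  # Sketch only: the proposed "clean spool" reset for a case-study run.
  # Assumptions: the paths below, no DMGRID/ingebin.k running, and the
  # GRIBDEC.PRO pointer reset separately so the decoder starts reading
  # the spool from the top.
  SPOOL=/incoming/data/mcidas/xcd/HRS.SPL

  cp /dev/null "$SPOOL"      # empty the spool, as described above
  # (whether HRS.SPL needs re-initialization beyond truncation is not
  #  addressed in this exchange)

  cat /case/study/eta_*.grib | xcd_run HRS   # feed the archived GRIB data;
                                             # the HRS argument is a placeholder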