This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
On Mon, 19 Aug 2002, Benjamin Cotton wrote:

> Anne,
>
> I made the changes to ldmd.conf, ldmadmin and pqact.conf that you
> suggested. I have logs now, but still no satellite data. The rest of
> my data is either incomplete and/or a day late. The interesting thing
> is that the incoming data is only about 20 seconds late. So something
> is getting lost in the shuffle. I knew I shoulda got luggage tags.
> Haha, anyway... well at least we're getting somewhere.
>
> Ben
>
> P.S. My horoscope for today read in part: "The world is your oyster." I
> just can't escape those oysters...
>
> ===================
> Benjamin J. Cotton
> LDM Administrator
> Department of Earth and Atmospheric Science,
> Purdue University
>
> 165 Cary Quadrangle        cell:   (502) 551-5403
> West Lafayette, IN 47906   campus: (765) 49-52298
>
> address@hidden
> www.eas.purdue.edu/~bcotton

Hi Ben,

I'm glad that the logging is working now. Did you see these messages in the log?

Aug 20 16:10:59 flood[16519]: run_requester: 20020820154739.805 TS_ENDT {{NNEXRAD|DIFAX|UNIDATA, ".*"}}
Aug 20 16:10:59 flood[16519]: FEEDME(flood.atmos.uiuc.edu): OK
Aug 20 16:11:00 flood[16519]: pq_del_oldest: conflict on 40654520
Aug 20 16:11:00 flood[16519]: hereis: pq_insert failed: Resource temporarily unavailable: 68934acfb1d8e490a644914e27bbe686 8539 20020820154900.986 NNEXRAD 030 SDUS53 KMQT 201546 /pN0RMQT
Aug 20 16:11:00 flood[16519]: Connection reset by peer
Aug 20 16:11:00 flood[16519]: Disconnect

Your connection to flood is continually being broken and reestablished, presumably because anvil's disk is unavailable. This can be caused by a full disk, but your disk isn't full:

(anvil.eas.purdue.edu) [/project/ldm]% cd data
(anvil.eas.purdue.edu) [/project/ldm/data]% df -k .
Filesystem   1K-blocks      Used     Avail Capacity  Mounted on
/dev/ad0s1g   20202730  15325911   3260601    82%    /net/anvil

So, back to this problem in a moment.
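Incidentally, a small cron-driven check along these lines can give early warning before the queue's filesystem fills up. This is a hypothetical sketch, not part of the LDM distribution; the path and the 90% threshold are assumptions for illustration:

```shell
#!/bin/sh
# Warn when the filesystem holding the LDM product queue is nearly full.
# The queue directory and the 90% threshold are illustrative assumptions.
queue_dir=/project/ldm/data

# df -k prints a header line, then the filesystem line; field 5 is "Capacity".
pct=$(df -k "$queue_dir" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')

if [ "$pct" -ge 90 ]; then
    echo "WARNING: filesystem for $queue_dir is at ${pct}% capacity"
fi
```

Run hourly from cron, this would have flagged a genuinely full disk before pq_insert started failing.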
I see that you are requesting lots of data from flood:

request DIFAX|UNIDATA|NNEXRAD ".*" flood.atmos.uiuc.edu

Do you really want or need the entire NEXRAD feed? I see that you're filing only the N0R products. If you don't need the entire feed for relay purposes, I strongly suggest that you request only the N0R products from flood, as that is a small percentage of the entire feed.

I see that there are lots of 'find' processes running on anvil:

(anvil.eas.purdue.edu) [/project/ldm/etc]% ps -ax | grep find
  875 ??  D   48:19.36 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
 1795 ??  D   27:29.13 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
 6277 ??  D   69:01.29 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
 8663 ??  D  242:41.01 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
10368 ??  D   83:40.45 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
10826 ??  D  199:54.24 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
12099 ??  DN  61:37.30 find -s / ! ( -fstype ufs ) -prune -or -path /tmp -prune -or -path /usr/tmp -prune -or -path
12588 ??  D  144:34.92 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
13471 ??  D  261:46.83 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
13691 ??  D  105:58.58 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
17146 ??  D  117:23.19 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -
23366 ??  DN 184:42.46 find -s / ! ( -fstype ufs ) -prune -or -path /tmp -prune -or -path /usr/tmp -prune -or -path
92142 ??  D    8:20.79 find /net/anvil -xdev -type f ( -perm -u+x -or -perm -g+x -or -perm -o+x ) ( -perm -u+s -or -

These processes are owned by root. I don't know how they are being generated. Are they for security reasons? Are they necessary? Some have been running for many hours. On a disk as full as /net/anvil, they can be *very* disk intensive, and could contribute to the disk being unavailable.

Until I noticed that the find processes were owned by root, I thought perhaps the find was coming from the LDM scour program, so I checked the LDM crontab to see how scour was being invoked. I found this:

(anvil.eas.purdue.edu) [/project/ldm/etc]% crontab -l | grep scour
35 * * * * /project/ldm/bin/scour_anvil > /dev/null

I looked at scour_anvil - it's running your own scour program, so I have no idea what that's doing. Could that be invoking the 'find' processes above?

Given the delays you're experiencing with the filed data, I wondered if pqact was keeping up. So I put it in verbose mode and grabbed this out of the log:

Aug 20 17:57:21 pqact[16518]:   149 20020820140009.143     HDS 121  NXUS65 KPSR 201359 /pGSMIWA
Aug 20 17:57:21 pqact[16518]:  7332 20020820140009.152 NNEXRAD 122  SDUS25 KABQ 201357 /pN1SABX
Aug 20 17:57:21 pqact[16518]:  6158 20020820140009.154 NNEXRAD 123  SDUS23 KGRB 201352 /pN2RGRB
Aug 20 17:57:21 pqact[16518]: 18639 20020820140009.171 NNEXRAD 124  SDUS54 KFWD 201351 /pNCRGRK
Aug 20 17:57:21 pqact[16518]:  1625 20020820140009.311     HDS 130  SDUS83 KMKX 201351 /pDPAMKX
Aug 20 17:57:21 pqact[16518]:  5429 20020820140009.186 NNEXRAD 125  SDUS33 KMKX 201351 /pN3RMKX
Aug 20 17:57:21 pqact[16518]:  8470 20020820140009.188 NNEXRAD 126  SDUS24 KAMA 201358 /pN1RAMA
Aug 20 17:57:21 pqact[16518]:  5685 20020820140009.203 NNEXRAD 127  SDUS75 KABQ 201357 /pN1VABX
Aug 20 17:57:21 pqact[16518]:  9936 20020820140009.205 NNEXRAD 128  SDUS56 KSGX 201357 /pN0RNKX

This shows us that pqact is running almost four hours behind - it's not able to keep up with the volume of data.
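The lag can be checked directly by subtracting a product's creation time from the wall-clock time on the pqact log line. A minimal sketch, assuming GNU date is available; the two timestamps are taken from the log excerpt above:

```shell
#!/bin/sh
# Estimate how far behind pqact is running by comparing the pqact log's
# wall-clock time with a product's creation time (both UTC).
# Requires GNU date for the -d option.
log_time="2002-08-20 17:57:21"    # time pqact logged the product
prod_time="2002-08-20 14:00:09"   # from the product's 20020820140009 stamp

lag=$(( $(date -u -d "$log_time" +%s) - $(date -u -d "$prod_time" +%s) ))
printf 'pqact lag: %dh %02dm\n' $((lag / 3600)) $((lag % 3600 / 60))
```

For the timestamps above this prints "pqact lag: 3h 57m".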
Perhaps killing the 'find' processes will free up some of the disk for pqact.

I also think that your 300MB queue is too small for what you are requesting. If you look at the data volumes in your workshop binder, you'll see that on that day the NEXRAD max was 150MB. (Over the past 24 hours the NEXRAD max was 206MB.) Let's just look at the math based on the hourly volumes (in MB) in the workshop notebook:

NEXRAD      150
HDS         141
IDS|DDPLUS    5.5
DIFAX         7
MCIDAS        7

That totals over 310MB per hour, so you would be unable to keep an hour's worth of data in a 300MB queue. However, pqmon is reporting that you have several hours' worth of data in your queue (see the "age" column, the age of the oldest product in the queue in seconds):

(anvil.eas.purdue.edu) [/project/ldm/data]% pqmon -i2
Aug 20 17:15:47 pqmon: Starting Up (29021)
Aug 20 17:15:47 pqmon: nprods nfree nempty    nbytes maxprods maxfree minempty maxext   age
Aug 20 17:15:47 pqmon:  40458     1  32783 299996608    56208       7    17033   6720 10350
Aug 20 17:15:49 pqmon:  40458     1  32783 299996608    56208       7    17033   6720 10352
Aug 20 17:15:51 pqmon:  40458     1  32783 299996608    56208       7    17033   6720 10354

I assume this is because writes to the queue aren't succeeding, based on the 'Resource temporarily unavailable' message.

So, in summary, I recommend killing all the 'find' processes and figuring out where they are coming from and whether they're necessary. I also recommend cutting back on your NEXRAD request or, if you must have it all, using a bigger queue of at least 500MB. If these steps don't solve the problem, I would stop filing so much data and see if the results are better. Then you can gradually add in more filing until you find the threshold where things start falling apart.

This is a little scattered. Please let me know if you have any questions.

Anne

--
***************************************************
Anne Wilson                     UCAR Unidata Program
address@hidden                  P.O. Box 3000
                                Boulder, CO  80307
----------------------------------------------------
Unidata WWW server      http://www.unidata.ucar.edu/
****************************************************
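As a footnote to the queue-sizing arithmetic above, the hourly maxima quoted from the workshop notebook can be tallied mechanically to show why a 300MB queue cannot hold a full hour of data:

```shell
#!/bin/sh
# Sum the hourly max volumes (MB) quoted from the workshop notebook
# (NEXRAD, HDS, IDS|DDPLUS, DIFAX, MCIDAS) and compare against the
# 300MB queue; awk handles the fractional IDS|DDPLUS figure.
echo "150 141 5.5 7 7" | awk '{
    total = 0
    for (i = 1; i <= NF; i++) total += $i
    printf "total: %.1f MB/hour vs. 300 MB queue\n", total
}'
```

This prints "total: 310.5 MB/hour vs. 300 MB queue", confirming the queue is undersized for the full request.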