This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
Hi Angel,

re: scouring deep hierarchical directories under Linux

> I'm re-visiting this problem... An IDE disk was added to this system
> just to hold the LDM data and it is using reiserfs. This machine is used
> for teaching undergraduate meteorology and they wish to keep over a
> month's worth of data. The pqact is the GEMPAK one, so it creates a ton of
> little files nested very deeply. Again, scour seems to take over 24
> hours and I'm about to let scour run only once a week.

I am very surprised that scouring the GEMPAK tree takes over 24 hours; this is not our experience. I would guess that this is related to your use of ReiserFS. My testing with various file systems under Linux showed that EXT3 was the fastest of the journaled file systems and that EXT2 was the fastest overall. Given our view that data directories are typically expendable, I would recommend switching the file system back to EXT2.

> I looked at the scour script and I can't imagine how it could be sped
> up.

I created a Tcl-based scour script to see if it could be made faster than the C-shell scripts that Steve Chiswell wrote. I found that my scripts were a little faster, but not dramatically so. I think that the biggest inefficiency in 'scour' is its reliance on 'find'; the utility might be more efficient if it were recast in a language that did not depend on 'find'.

> Access to the gempak tree using du or find or recursive rm's
> literally takes forever. How are other sites dealing with such deep
> directory structures?

Sites running OSes other than Linux are not reporting the kinds of problems that sites running under Linux are. Sites using Linux have had some success in quicker scouring when the RAID being written to is a true SCSI-based system. Again, my experience in building RAIDs using RAID interface cards has not been good.
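To illustrate the 'find' dependence discussed above, here is a minimal sketch of the kind of two-pass expiry a find-based scour performs. This is not the actual LDM 'scour' script; the function name, arguments, and example path are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical sketch of a find-based scour pass.  The function name
# 'scour_tree' and its arguments are illustrative assumptions, not part
# of the LDM distribution.

# scour_tree DIR DAYS: delete regular files older than DAYS days, then
# prune any directories emptied by the first pass.
scour_tree() {
    dir=$1
    days=$2
    [ -d "$dir" ] || return 1

    # Pass 1: '-delete' removes each expired file in-place, avoiding a
    # fork of 'rm' for every file as 'find ... -exec rm {} \;' would.
    find "$dir" -type f -mtime +"$days" -delete

    # Pass 2: '-delete' implies a depth-first traversal, so empty
    # subdirectories are pruned from the bottom of the tree upward.
    find "$dir" -mindepth 1 -type d -empty -delete
}

# Example invocation (assumed path): scour_tree /data/ldm/gempak 30
```

Even in this form, every expiry run still walks the entire tree, which is why deep GEMPAK hierarchies on a slow file system dominate the cost regardless of the scripting language wrapped around 'find'.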
My comments are based on lots of experimentation done here at the UPC and on work with community members (TAMU, Universidad de Costa Rica, and the Caribbean Institute for Meteorology and Hydrology). The TAMU machine was the most problematic, since they were keeping a 30-day rolling "archive" of NEXRAD Level II data. Scouring in that case consisted of removing the oldest day of Level II data, which is saved in a hierarchy. I found no significant difference between my Tcl-based scouring and a simple 'rm -rf' run from the top-level directory under which all of the data lived. From my perspective, the real problem is that Linux does not have a simple 'unlink' command that works only on the inode table.

One thing you may want to do is open this up for discussion on the 'ldm-users' email list. Before you can post to any Unidata-maintained email list, however, you must subscribe to it (I note that you are not subscribed to the 'ldm-users' list). You can (un)subscribe to any Unidata-maintained email list online at:

http://www.unidata.ucar.edu/content/support/mailinglist/mailing-list-form.html

If you decide to subscribe and repost your message, please wait until you have received notification of being added to the list before you post.

Cheers,

Tom

****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                         Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                           http://www.unidata.ucar.edu
****************************************************************************

Ticket Details
===================
Ticket ID: JGZ-326819
Department: Support LDM
Priority: Normal
Status: Closed