Hi John - This sounds great - and good timing too! I'd be very happy to try a beta version and test it out on our server at GFDL. Steve and I are actually at GFDL right now, and heading back to Seattle tonight. Can I grab the beta from the usual place on the THREDDS pages?
Thanks!
Kevin

John Caron wrote:
Hi guys:

The good news is that I've found the problem with the caching. Performance now is a lot better, though I don't have a measurement, and a lot may depend on your server.

The bad news (maybe) is that I am only going to fix this in the 4.0 version of NcML/TDS. We are pushing hard to get this out to beta this month. I'd love to have you start to use it, to get feedback on other issues that may be lurking.

The main problem was the "anonymous" inner aggregations. To get the caching right, we need to give them ids, e.g.:

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <netcdf ncoords="36500" id="first100">
      <aggregation type="union">
        <netcdf location="pr_A2.00010101-01001231.nc"/>
        <netcdf location="tasmax_A2.00010101-01001231.nc"/>
        <netcdf location="tasmin_A2.00010101-01001231.nc"/>
      </aggregation>
    </netcdf>
    <netcdf ncoords="36500" id="sec100">
      <aggregation type="union">
        <netcdf location="pr_A2.01010101-02001231.nc"/>
        <netcdf location="tasmax_A2.01010101-02001231.nc"/>
        <netcdf location="tasmin_A2.01010101-02001231.nc"/>
      </aggregation>
    </netcdf>
    <netcdf ncoords="36500" id="third100">
      <aggregation type="union">
        <netcdf location="pr_A2.02010101-03001231.nc"/>
        <netcdf location="tasmax_A2.02010101-03001231.nc"/>
        <netcdf location="tasmin_A2.02010101-03001231.nc"/>
      </aggregation>
    </netcdf>
  </aggregation>
</netcdf>

I might be able to generate auto ids, but for now they have to be added by hand. As I said, this will only be useful in the 4.0 version. I'll get a release out later today in case you want to try it.

John

Steve Hankin wrote:

Hi John,

Thanks for looking into this. At this moment Kevin is modifying the code that creates the NcML aggregation configuration from the contents of our database. It looks like we will be "down to the wire" in seeing how much faster TDS becomes when we start using the improved NcML (the changes are bigger than just moving the ncoords attribute).
Can we ask you to "stand by" and maybe be willing to set your peepers on it later today? Kevin's preliminary tests indicated that we will still be getting the cache hit failures (that is, for unknown reasons TDS rebuilds the aggregation in the cache instead of reusing what it saved previously). But we don't have an up-to-date TDS site to show you yet.

- Steve

John Caron wrote:

Hi Kevin:

Your ftp site is pretty slow (600 KB/sec) - is it throttled, or just overwhelmed? Should I wait until tonight to try to download these files?

Kevin OBrien wrote:

Hi John -

I did as you suggested and moved the ncoords attribute to the outer aggregation, and I was able to get to the aggregation in around 28 seconds. Just to confirm that it wasn't something system-related, I changed the xml configuration back, and verified that when the ncoords attribute was in the inner aggregation, it took around 2 minutes to open. So that's a big speed improvement! I think I will expand the aggregation xml to include full experiments and see how the performance changes.

By the way, you can get all of these files at ftp://nomads.gfdl.noaa.gov/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/ and you'll see there are many more that would actually be configured into the aggregation.

One thing I did notice and have a question about - after I moved the ncoords attribute to the outer aggregation, and I restarted the server - a cache file showed up in the cacheAged directory. When I then just restarted the server to test the use of cache, after I had opened the aggregation (which again took around 30 seconds), I noticed that the cache file in the cacheAged directory had apparently been updated (at least the time stamp of the file was new). If nothing in the aggregation has changed, should it be updating the cache file? Or should it use the cache file already there?
thanks - kevin

John Caron wrote:

Hi Kevin, Steve:

You should try putting the ncoords attribute on the outer aggregation:

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" ncoords="36500">
  <aggregation type="union">
    <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.00010101-01001231.nc" />
    <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.00010101-01001231.nc" />
    <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.00010101-01001231.nc" />
  </aggregation>
</netcdf>

Let me know if that helps. I'd like to test this nested aggregation as a use case. Can I get those 9 files? Thanks.

Steve Hankin wrote:

(This is a continuation of the conversation that Kevin O'Brien started with you.)

Hi John,

Below is the NcML and TDS configuration information. It all "works" ... except the caching. Any clues?

- Steve

=== This from threddsConfig.xml

<AggregationCache>
  <dir>/home/pmel/DataPortal/apache-tomcat-5.5.25/content/thredds/cacheAged/</dir>
  <scour>24 hours</scour>
  <maxAge>90 days</maxAge>
</AggregationCache>

=== And this is the latest NcML that Kevin tested:

"It took nearly two minutes to open the aggregation the first time. After that, accesses were quick -- evidently caching was working. Then I restarted the tomcat server, and again it took nearly two minutes to open the aggregation. I could see that the cache file in the caching directory was again updated after the second tomcat restart (i.e., the cache was rewritten rather than used)..."
<catalog name="test IPCC Datasets"
         xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <service name="thisDODS3" serviceType="OpenDAP" base="/thredds/dodsC/" />
  <dataset ID="CM2Q-d2_1PctTo4x_j1 atmos daily all vars 00010101-03001231 test"
           name="CM2Q-d2_1PctTo4x_j1 atmos daily all vars 00010101-03001231 test"
           urlPath="ipcc_ar4_CM2.0_R1_1to4x-0_daily_atmos_00010101-03001231_test">
    <serviceName>thisDODS3</serviceName>
    <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
      <aggregation dimName="time" type="joinExisting">
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
          <aggregation type="union">
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.00010101-01001231.nc" ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.00010101-01001231.nc" ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.00010101-01001231.nc" ncoords="36500" />
          </aggregation>
        </netcdf>
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
          <aggregation type="union">
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.01010101-02001231.nc" ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.01010101-02001231.nc" ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.01010101-02001231.nc" ncoords="36500" />
          </aggregation>
        </netcdf>
        <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
          <aggregation type="union">
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/pr_A2.02010101-03001231.nc" ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmax_A2.02010101-03001231.nc" ncoords="36500" />
            <netcdf location="file:/data/gfdl_cm2_0/CM2Q-d2_1PctTo4x_j1/pp/atmos/ts/daily/tasmin_A2.02010101-03001231.nc" ncoords="36500" />
          </aggregation>
        </netcdf>
      </aggregation>
    </netcdf>
  </dataset>
</catalog>

Steve Hankin wrote:

Hi John,

We're at the phone number in the signature line below. Will follow this email shortly with some XML fragments ... hoping maybe you have a suggestion.

- Steve

--
Steve Hankin, NOAA/PMEL -- address@hidden
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744

"The only thing necessary for the triumph of evil is for good men to do nothing." -- Edmund Burke
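
The thread notes that the "anonymous" inner aggregations need ids for the caching to work, and that for now these have to be added by hand. Since the NcML here is generated from a database, that step could probably be scripted. The following is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the function name add_ids_to_anonymous_aggregations, the "agg" id prefix, and the shortened sample document are hypothetical and are not part of TDS or NcML.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
Q = "{" + NCML_NS + "}"  # qualified-name prefix used by ElementTree


def add_ids_to_anonymous_aggregations(ncml_text, prefix="agg"):
    """Return NcML text in which every nested <netcdf> that wraps an
    <aggregation> and has neither a location nor an id gets a generated id.
    (Hypothetical helper; the id naming scheme is an assumption.)"""
    # Serialize the NcML namespace as the default namespace (no ns0: prefixes).
    ET.register_namespace("", NCML_NS)
    root = ET.fromstring(ncml_text)
    counter = 0
    # Visit each <aggregation> in document order and inspect its direct children.
    for agg in root.iter(Q + "aggregation"):
        for child in agg.findall(Q + "netcdf"):
            is_anonymous = child.get("location") is None and child.get("id") is None
            wraps_aggregation = child.find(Q + "aggregation") is not None
            if is_anonymous and wraps_aggregation:
                counter += 1
                child.set("id", prefix + str(counter))
    return ET.tostring(root, encoding="unicode")


# Shortened sample modeled on the nested aggregation discussed in the thread.
sample = """<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <netcdf ncoords="36500">
      <aggregation type="union">
        <netcdf location="pr_A2.00010101-01001231.nc"/>
        <netcdf location="tasmax_A2.00010101-01001231.nc"/>
      </aggregation>
    </netcdf>
    <netcdf ncoords="36500">
      <aggregation type="union">
        <netcdf location="pr_A2.01010101-02001231.nc"/>
        <netcdf location="tasmax_A2.01010101-02001231.nc"/>
      </aggregation>
    </netcdf>
  </aggregation>
</netcdf>"""

if __name__ == "__main__":
    print(add_ids_to_anonymous_aggregations(sample))
```

Ids generated this way only help if they stay the same between server restarts, so a generator would presumably need to derive them deterministically (for example from the time range each inner aggregation covers) rather than from a counter whenever the set of inner aggregations can change.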