
Re: THREDDS performance [was Re: THREDDS and grib]

This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.

Subject: Re: THREDDS performance [was Re: THREDDS and grib]
Date: Fri, 20 Jun 2008 14:33:07 -0700

Hi John -

I installed version 3.16.37 of the server and unfortunately, it doesn't seem to make the problem go away. It did kind of seem like things were a bit quicker at times, but it was hard to accurately assess. Of course, the initial access of these large aggregations is still pretty slow, and then subsequent accesses are faster. However, it does seem like a restart of the tomcat server somehow erases the cache information, and so every initial access after a tomcat reboot is slow.

Is there anything else I can do to help further debug the problem?

thanks -
Kevin

John Caron wrote:

Hi Kevin: I made a small fix that looks like it would affect your case, but I'm not convinced it really would cause a huge slowdown. Anyway, I wonder if you would give it a try and let me know?
It's TDS release 3.16.37.

thanks for your patience

Kevin O'Brien wrote:
Hi John -
Not to be a pest - but I was wondering if you'd had a chance to look at these performance issues, or even been able to recreate them?
Thanks -
kevin

John Caron wrote:
These are all good questions - there have been similar reports of the agg cache not working like it should. I will have to reproduce it to see what's happening.
Kevin O'Brien wrote:
Hi John -
I tried what you suggested and it didn't seem to have a significant effect in making the initial access of the aggregated dataset quicker. It still took over a minute and a half to open the dataset. I've pasted the xml config that I used to define the new aggregation below. To be honest, I'm actually kind of glad because I wasn't looking forward to modifying the guts of the application which generates the xml config automatically.... :-)
I guess I can understand and probably even accept the fact that for the first time the dataset is accessed, things will be a little slow. After that, I presume the dataset is available in the cache, and of course subsequent accesses prove that it is because the response is quite quick. However, if the tomcat server is restarted, it seems like whatever is in the cache is ignored and the cache entries have to be rebuilt. I have my aggregation cache set like so:
<AggregationCache>
   <dir>/home/pmel/DataPortal/apache-tomcat-5.5.25/content/thredds/cacheAged/</dir>
   <scour>24 hours</scour>
   <maxAge>90 days</maxAge>
</AggregationCache>

Does that seem correct? Also, as an aside, you mention that you thought this would be quicker because it avoids the OPeNDAP URLs.... Shouldn't there be some client side caching done w/ the OPeNDAP datasets? For example, if I access a remote dataset with ncdump (or Ferret), and my OPeNDAP caching is turned on in my ~/.dodsrc file, it will cache the response in the ~/.dods_cache directory. Does any of that happen when OPeNDAP URLs are accessed through TDS???
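One way to check whether the aggregation cache actually survives a tomcat restart is to look at the cache directory itself before and after the restart. The following is only a minimal sketch: it assumes the cache is persisted as ordinary files under the configured <dir>, and the function name cache_entries and its parameters are made up for illustration, not part of TDS.

```python
import os
import time


def cache_entries(cache_dir, max_age_days=90, now=None):
    """Return (name, age_in_days, expired) for each regular file in cache_dir.

    max_age_days mirrors the <maxAge> setting (90 days in the config above);
    'expired' only compares the file's modification time against that value
    and is an assumption about how the server might judge staleness.
    """
    now = time.time() if now is None else now
    entries = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        age_days = (now - os.path.getmtime(path)) / 86400.0
        entries.append((name, age_days, age_days > max_age_days))
    return entries
```

Calling cache_entries() on the cacheAged directory before and after a restart would show whether the files are still present and whether their modification times changed, which helps separate "the cache files are gone" from "the cache files are there but are not being used".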
Anyway - here's the xml config I used as per your suggestion:
<dataset ID="CM2.1U-D4_1PctTo2X_I1 atmos daily all vars 00010101-02201231_2"
         name="CM2.1U-D4_1PctTo2X_I1 atmos daily all vars 00010101-02201231_2"
         urlPath="ipcc_ar4_CM2.1_R1_1to2x-1_daily_atmos_00010101-02201231_2">
  <serviceName>thisDODS3</serviceName>
  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <aggregation type="union">
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/pr_A2.00010101-01001231.nc" ncoords="36500" />
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/pr_A2.01010101-02001231.nc" ncoords="36500" />
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/pr_A2.02010101-02201231.nc" ncoords="7300" />
        </aggregation>
      </netcdf>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/tasmax_A2.00010101-01001231.nc" ncoords="36500" />
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/tasmax_A2.01010101-02001231.nc" ncoords="36500" />
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/tasmax_A2.02010101-02201231.nc" ncoords="7300" />
        </aggregation>
      </netcdf>
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/tasmin_A2.00010101-01001231.nc" ncoords="36500" />
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/tasmin_A2.01010101-02001231.nc" ncoords="36500" />
          <netcdf location="file:/data/gfdl_cm2_1/CM2.1U-D4_1PctTo2X_I1/pp/atmos/ts/daily/tasmin_A2.02010101-02201231.nc" ncoords="7300" />
        </aggregation>
      </netcdf>
    </aggregation>
  </netcdf>
</dataset>
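
As a sanity check on the config above, the ncoords values in each joinExisting block should add up to the length of the aggregated time dimension. The sketch below sums them per aggregation using only the Python standard library. The helper name join_existing_lengths and the short file names in SAMPLE are hypothetical stand-ins; only the NcML namespace and the ncoords values are taken from the config.

```python
import xml.etree.ElementTree as ET

# NcML namespace as declared in the config above.
NCML_NS = "{http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2}"

# Shortened stand-in for one joinExisting block; the file names are hypothetical.
SAMPLE = """
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <netcdf location="file:/tmp/a.nc" ncoords="36500" />
    <netcdf location="file:/tmp/b.nc" ncoords="36500" />
    <netcdf location="file:/tmp/c.nc" ncoords="7300" />
  </aggregation>
</netcdf>
"""


def join_existing_lengths(ncml_text):
    """Return a (dimName, total ncoords) pair for each joinExisting aggregation."""
    root = ET.fromstring(ncml_text)
    totals = []
    for agg in root.iter(NCML_NS + "aggregation"):
        if agg.get("type") != "joinExisting":
            continue  # e.g. the outer union aggregation
        counts = [
            int(child.get("ncoords"))
            for child in agg.findall(NCML_NS + "netcdf")
            if child.get("ncoords") is not None
        ]
        totals.append((agg.get("dimName"), sum(counts)))
    return totals


print(join_existing_lengths(SAMPLE))
```

For the values in the config, each time axis comes to 36500 + 36500 + 7300 = 80300 steps, which equals 220 x 365 and so is consistent with daily data for years 0001 through 0220 if every year has 365 days (an assumption about the model calendar).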


I'm open to any suggestions or ideas!

thanks -
kevin


John Caron wrote:
Hi Kevin:
I haven't had time to reproduce this yet, but I'm guessing one source of the slowdown is using opendap URLs in the compound aggregation. It would be interesting to time 1) the single aggregations, 2) the compound agg as it exists, and 3) the compound agg, but replace the opendap URLs with direct netcdf files;
see attached file
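
The three-way timing comparison suggested above could be scripted roughly as follows. This is a minimal sketch, not a TDS tool: time_calls and fetch are hypothetical helper names, and the idea of timing a plain HTTP request against the server is an assumption about how one might measure "time to open".

```python
import time
import urllib.request


def fetch(url):
    """Download a URL completely and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_calls(open_fn, repeats=3):
    """Call open_fn() `repeats` times and return the elapsed seconds per call.

    The first entry approximates the cold (uncached) access and the later
    entries the warm accesses, provided nothing else touched the dataset
    in between.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        open_fn()
        timings.append(time.perf_counter() - start)
    return timings
```

A call such as time_calls(lambda: fetch(url)), where url is the OPeNDAP .dds address of each of the three dataset variants on your own server, gives one list of timings per variant. Running it once right after a tomcat restart and once later would also show how much of the delay is rebuilt on every restart.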


--
Kevin O'Brien                   UW/JISAO        
Research Scientist              NOAA/PMEL/TMAP
206-526-6751                    http://tmap.pmel.noaa.gov

"The contents of this message are mine personally and do not necessarily reflect any position of the Government or the National Oceanic and Atmospheric Administration."
