
Re: TDS File Size limit?



OK, it's a funny glitch in the servlet spec on ServletResponse.setContentLength(). There's an easy workaround that I'll add to both 4.1 and 4.2. Do you need it immediately?

thanks for reporting,
John

On 5/20/2010 8:54 AM, Eric Nienhouse wrote:
Hi John,

A few more details.  I'm testing this with TDS 4.1.1, however I can also repro on 4.1.2.

Basic test: download a ~5GB file from TDS with HTTPServer via the Firefox web browser on WinXP (actual file size on disk is 4975605612 bytes)

Results in a (smaller) file of: 680638316 bytes

*Note the resulting size is the actual size truncated to a 32-bit integer.

The TDS threddsServlet.log file follows (indicating success with full size byte count) with no error as far as I can tell.

2010-05-20T08:46:20.654 -0600 [    523273][      20] INFO  - thredds.servlet.FileServerServlet - Remote host: 128.117.8.201 - Request: "GET /thredds/fileServer/datazone/esg/CCSM/csm/b30.040e/atm/proc/tseries/hourly6/b30.040e.cam2.h3.Z3.2066-01-01_cat_2066-12-31.nc?gateway=ESG-NCAR&authzToken=896b3914-70ba-4e02-b82b-14529dfaf4f8 HTTP/1.1"
2010-05-20T08:47:08.991 -0600 [    571610][      20] INFO  - thredds.servlet.ServletUtil - returnFile(): Request Completed - 200 - 4975605612 - 48337

Note that the server I am testing with has several custom filters configured for authorization purposes.  It is possible these filters are affecting the resulting file size in some way (based on initial code reviews this seems unlikely, however).

Thanks for looking into this.  Please let me know if you need any further details.

-Eric


Eric Nienhouse wrote:
Hi John,

I can reproduce this with a number of different files larger than 2GB.

This is using HTTPServer (I have not tried OpenDAP.)

I'll check the logs for errors and get back to you.

Thanks,

-Eric

John Caron wrote:
Eric Nienhouse wrote:
Hi John,

We may be hitting a file size limit in TDS 4.1.  We've begun publishing a number of climate model output NetCDF files that are in excess of 3GB.  When downloading these files from the TDS, we're getting "shorter" files returned.  For example, a file of ~5GB results in a returned file of ~680MB in size:

Actual file size:  4975605540 bytes
Returned file size: 680638244 bytes

This looks suspiciously like an issue related to the storage type and a possible int rollover, as the actual file size exceeds Java's Integer.MAX_VALUE and the resulting size is exactly the actual size truncated to 32 bits.
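
The arithmetic behind that observation can be checked with a few lines of Java. This is only a minimal sketch of a narrowing long-to-int cast using the sizes reported in this thread; the class and method names are made up for illustration and are not taken from the TDS source.

```java
public class ContentLengthTruncation {
    // Hypothetical helper: a narrowing cast keeps only the low 32 bits of the long.
    static int truncateToInt(long size) {
        return (int) size;
    }

    public static void main(String[] args) {
        long actual = 4975605540L;  // actual file size reported above
        // The size does not fit in an int, so the cast drops the high bits.
        System.out.println(actual > Integer.MAX_VALUE);   // true
        // 4975605540 - 2^32 = 680638244, the "returned" size reported above.
        System.out.println(truncateToInt(actual));        // 680638244
    }
}
```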

I've looked at the headers returned from the HTTP request for the file and the Content-Length reflects the "shortened" 680638244 byte size.  If I use an HTTP client like Wget and force the Content-Length to be ignored, the resulting stream is still only 680638244 bytes in length.

I reviewed the Thredds list and Googled this topic, but I have not found anything specific as to this problem.  Has this come up before?

We've typically been working with files of less than 2GB in the past, and as far as I know, this is the first time we've experienced this problem.  (We're running TDS 4.1.2.)  This will become a great need for us once the AR5/CMIP5 datasets start flowing (which may be as early as mid June!)  These data files will likely be in the 2GB - 4GB size range.

I do know the Java Servlet specification defines the setContentLength parameter as an "int" which is a known limitation.  This may be related.
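
One common way around an int-typed setContentLength is to write the Content-Length header as a string whenever the size does not fit in an int. The sketch below only illustrates that idea and is not necessarily the fix that went into TDS. To stay self-contained it uses a hypothetical Response interface that mirrors just two methods of the servlet API (setContentLength(int) and setHeader(String, String)), plus a hypothetical in-memory RecordingResponse for demonstration.

```java
import java.util.HashMap;
import java.util.Map;

public class LargeContentLength {
    // Hypothetical stand-in for the two servlet response methods used here.
    interface Response {
        void setContentLength(int len);
        void setHeader(String name, String value);
    }

    // Use the int setter when the size fits; otherwise write the header directly.
    static void setLength(Response res, long length) {
        if (length <= Integer.MAX_VALUE) {
            res.setContentLength((int) length);
        } else {
            res.setHeader("Content-Length", Long.toString(length));
        }
    }

    // Hypothetical in-memory response that records the headers it is given.
    static class RecordingResponse implements Response {
        final Map<String, String> headers = new HashMap<>();

        public void setContentLength(int len) {
            headers.put("Content-Length", Integer.toString(len));
        }

        public void setHeader(String name, String value) {
            headers.put(name, value);
        }
    }

    public static void main(String[] args) {
        RecordingResponse res = new RecordingResponse();
        setLength(res, 4975605612L);
        System.out.println(res.headers.get("Content-Length")); // 4975605612
    }
}
```

Whether a given servlet container then streams the full body for such a response is a separate question and would need to be tested against the container in use.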

Please let me know if you need further details.

Thanks,

-Eric

Hi Eric:

"When downloading these files from the TDS": 1) are you using HTTPServer or OPeNDAP? 2) is it reproducible? 3) can you look in the threddsServlet.log to see if there's an error message?