[netCDF #ZCE-849683]: clarification on large file support
- Subject: [netCDF #ZCE-849683]: clarification on large file support
- Date: Mon, 12 Jul 2010 09:24:08 -0600
Hi Fabio,
Sorry to have taken so long to respond to your question.
> I am Fabio Milani working in IDS (Ingegneria dei Sistemi), an Italian
> company involved, among others, in electromagnetic simulations.
>
> We currently use the netcdf library for managing our simulator files
> and were interested in large file support. I read the FAQ section on
> your website and I need to clarify the following aspects, thanks to
> your precious support.
>
> Is the following correctly understood?
>
> 1. On 32 bit platforms the new netcdf IS ABLE to write files larger
> than 2GB (up to 2^64), BUT each variable contained in the file CANNOT
> exceed 2GB
That's not quite correct.
It's true that a program on 32-bit platforms linked to netCDF 3.6.x or
netCDF 4.x is able to write files larger than 2GB, assuming the
platform and file system are configured for Large File Support (this
is almost always the case), and that the program is compiled such that
the "off_t" type is 64 bits, such as a "long long" type. The
configure script used to build netCDF correctly sets compile flags to
support a 64-bit off_t type, if possible. You can check the output
from running the "configure" script to make sure it has the line
  checking size of off_t... 8
indicating a 64-bit (8-byte) off_t type.
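As a rough cross-check that does not depend on the configure log, a small C sketch along these lines reports the off_t width produced by a given set of compiler flags. The helper names are hypothetical and not part of netCDF, and the sketch assumes a POSIX system where <sys/types.h> defines off_t.

```c
#include <limits.h>     /* CHAR_BIT */
#include <stddef.h>     /* size_t */
#include <sys/types.h>  /* off_t (POSIX) */

/* Hypothetical helper: size of off_t in bytes for this build. */
static size_t off_t_size_bytes(void)
{
    return sizeof(off_t);
}

/* Hypothetical helper: 1 if file offsets are at least 64 bits wide, else 0. */
static int off_t_is_64bit(void)
{
    return sizeof(off_t) * CHAR_BIT >= 64;
}
```

Printing off_t_size_bytes() from a program built with the same flags as your application should give 8 when large file offsets are in effect. On 32-bit Linux with glibc, that usually requires compiling with -D_FILE_OFFSET_BITS=64.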
A variable in the file cannot exceed 4GB (not 2GB) in netCDF
versions after 3.6.1, including the current netCDF 4.1.1. The actual
maximum size of a variable on a 32-bit platform is (2^32 - 4) bytes.
Part of the confusion is a documentation error here:
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html#Classic-Limitations
which I just discovered hasn't been updated since the size limit on a
single variable was changed from 2GB (2^31 - 4) to 4GB (2^32 - 4) in
versions since netCDF 3.6.1. It should say
  If you don't use the unlimited dimension, only one variable can
  exceed 4 GiB in size, but it can be as large as the underlying file
  system permits. It must be the last variable in the dataset, and the
  offset to the beginning of this variable must be less than about 2
  GiB.
The limit is really 2^32 - 4. If you were to specify a variable size
of 2^32 - 3, for example, it would be rounded up to the nearest
multiple of 4 bytes, which would be 2^32, which is larger than the
largest unsigned 32-bit integer.
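The rounding argument above can be sketched in a few lines of C. This is only an illustration of the arithmetic, not netCDF library code, and the function names are hypothetical. It uses 64-bit arithmetic so that the overflow past 2^32 is visible instead of wrapping around.

```c
#include <stdint.h>

/* Round a byte count up to the next multiple of 4.
 * Done in 64 bits so that a result of 2^32 is representable. */
static uint64_t pad_to_4(uint64_t nbytes)
{
    return (nbytes + 3u) & ~(uint64_t)3;
}

/* 1 if the padded size still fits in an unsigned 32-bit integer, else 0. */
static int fits_in_32bit_size(uint64_t nbytes)
{
    return pad_to_4(nbytes) <= UINT32_MAX;
}
```

With these helpers, 4294967292 (2^32 - 4) is already a multiple of 4 and fits, while 4294967293 (2^32 - 3) pads to 4294967296 (2^32) and does not.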
It's also true that even on 32-bit platforms, one variable in the file
(the last) can exceed 4 GB in size, as explained in the FAQ, as long
as the system supports a 64-bit off_t type.
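For planning a file layout, the rule as stated in this message can be sketched as a simple check over the variable sizes in definition order. This is a simplified illustration under stated assumptions: it covers only fixed-size variables (no unlimited dimension), it ignores the separate limit on the starting offset of the last variable, and the function name is hypothetical rather than part of netCDF.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest variable size that fits the 32-bit limit: 2^32 - 4 bytes. */
#define MAX_32BIT_VSIZE (UINT32_MAX - 3u)

/* sizes[i] is the size in bytes of the i-th variable, in definition order.
 * Returns 1 if every variable except the last is within the 32-bit limit,
 * 0 otherwise. The last variable is allowed to be larger. */
static int only_last_may_be_oversized(const uint64_t *sizes, size_t nvars)
{
    for (size_t i = 0; i + 1 < nvars; i++) {
        if (sizes[i] > MAX_32BIT_VSIZE)
            return 0;
    }
    return 1;
}
```

For example, sizes of {100, 4294967292, 5000000000} pass the check because only the last entry exceeds the limit, while {5000000000, 100} fails.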
> 2. On 64 bit platforms the new netcdf is able to write files larger than
> 2GB (up to 2^64), AND each variable contained in the file CAN exceed
> 2GB
Yes, that's true. But note that most 32-bit platforms support a
64-bit off_t type for file offsets, so the 64-bit offset variant of
the netCDF format is fully supported for reading or writing data on a
32-bit platform (except that you can only access at most 2GB at once,
due to the 32-bit size_t type on 32-bit platforms).
Note that with the netCDF-4 HDF5-based format, variables can also be
larger than 4GB.
--Russ
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu
Ticket Details
===================
Ticket ID: ZCE-849683
Department: Support netCDF
Priority: Normal
Status: Closed