Re: 20040630: adding another entry to unlimited dimension
- Date: Wed, 30 Jun 2004 09:56:37 -0600
>To: address@hidden
>From: Stacy Brodzik <address@hidden>
>Subject: adding another entry to unlimited dimension
>Organization: > Stacy Brodzik <address@hidden>
>Keywords: 200406300005.i5U05WWb002233 netCDF time
Hi Stacy,
> I've added variables, attributes, etc to netcdf files but I've never
> tried to add another time offset and its accompanying data to a
> netcdf file. I've looked through the functions and don't really see
> an easy way to do it. If there's a way to do it without opening the
> dataset, copying all the data out of it into arrays, etc, adding the
> new data, and creating a completely new netcdf file, I'd be
> interested in hearing back from you.
If by "add another time offset" you mean just add all the data
associated with another time record to all the variables that use the
time dimension, where the time dimension is declared to be unlimited
and is the first dimension of each variable that uses it, then you can
do what you want by merely writing the variable data slices using one
of the nc_put_vara C interfaces, for example. That will merely append
the data efficiently to the netCDF file without copying, and is what
the unlimited dimension is designed to support.
But I suspect you probably know that and mean something else by "add
another time offset", for example adding a new dimension and some new
variables that use it. If it's a fixed-size dimension, then you can
add it and also add some additional fixed size variables efficiently,
but only if you anticipated the need by calling the "underbar
underbar" version of nc_enddef(), namely nc__enddef(), which has
additional parameters for reserving space in the header for
additional dimensions, variables, and attributes, and space before
the first record for additional fixed-size variable data.
This function is currently only documented in the man page reference
documentation, which should be available online, but the script that
produces it is currently broken. So I've appended the relevant part
of that below.
If you didn't reserve any extra space in the header and fixed-variable
data section, then when you call nc_redef(), add new dimensions,
variables, and attributes, then call nc_enddef(), the data will be
copied to make space for the new data. And if you want to add a new
record variable, this also requires a copy of all the data, because
the original format unfortunately doesn't allow for reserving extra
space in records for later addition of new record variable data.
In netCDF-4, currently under development, you will be able to
efficiently add new variables, attributes, and dimensions without
restriction, and without worrying about the data being copied.
--Russ
_____________________________________________________________________
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu/staff/russ
int nc__enddef(int ncid, size_t h_minfree, size_t v_align,
size_t v_minfree, size_t r_align)
Like nc_enddef() but has additional performance tuning
parameters.
Caution: this function exposes internals of the netcdf
version 1 file format. It may not be available on
future netcdf implementations.
The current netcdf file format has three sections: the
"header" section, the data section for fixed size
variables, and the data section for variables which have
an unlimited dimension (record variables). The header
begins at the beginning of the file. The offsets of the
beginnings of the other two sections are contained in the
header. Typically, there is no space between the
sections. This causes copying overhead to accrue if one
wishes to change the size of the sections, as may happen
when changing names of things, text attribute values,
adding attributes or adding variables. Also, for buffered
i/o, there may be advantages to aligning sections in
certain ways.
The minfree parameters allow one to control costs of
future calls to nc_redef(), nc_enddef() by requesting
that minfree bytes be available at the end of the
section. The h_minfree parameter sets the pad at the end
of the "header" section. The v_minfree parameter sets
the pad at the end of the data section for fixed size
variables.
The align parameters allow one to set the alignment of
the beginning of the corresponding sections. The
beginning of the section is rounded up to an offset which
is a multiple of the align parameter. The flag value
NC_ALIGN_CHUNK tells the library to use the chunksize
(described earlier in the man page) as the align
parameter. The v_align parameter controls the alignment
of the beginning of the data section for fixed size
variables. The r_align parameter controls the alignment
of the beginning of the data section for variables which
have an unlimited dimension (record variables).
The file format requires mod 4 alignment, so the align
parameters are silently rounded up to multiples of 4.
The usual call, nc_enddef(ncid) is equivalent to
nc__enddef(ncid, 0, 4, 0, 4).
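The effect of the minfree and align parameters on the section offsets can be sketched with a little arithmetic. This is a simplified model built only from the description above, not the library's actual code: the function and struct names are hypothetical, NC_ALIGN_CHUNK is not modeled, and treating an align of 0 as 4 is an assumption.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be > 0). */
size_t round_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* The format requires mod 4 alignment, so align values are first
 * rounded up to a multiple of 4 (0 is treated as 4: an assumption). */
size_t effective_align(size_t align)
{
    return align == 0 ? 4 : round_up(align, 4);
}

typedef struct {
    size_t begin_var;  /* offset of the fixed-size variable data */
    size_t begin_rec;  /* offset of the record variable data */
} layout;

/* hdr_size and fixed_size are the bytes actually used by the header
 * and by the fixed-size variable data; the pads come after them. */
layout section_offsets(size_t hdr_size, size_t h_minfree, size_t v_align,
                       size_t fixed_size, size_t v_minfree, size_t r_align)
{
    layout l;
    l.begin_var = round_up(hdr_size + h_minfree, effective_align(v_align));
    l.begin_rec = round_up(l.begin_var + fixed_size + v_minfree,
                           effective_align(r_align));
    return l;
}
```

In this model a 210-byte header with the default parameters puts the fixed-size data at offset 212, while h_minfree = 1024 with v_align = 512 moves it to 1536 and leaves room for the header to grow without shifting any data.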
The file format does not contain a "record size" value;
this is calculated from the sizes of the record
variables. This unfortunate fact prevents us from
providing minfree and alignment control of the "records"
in a netcdf file. If you add a variable which has an
unlimited dimension, the third section will always be
copied with the new variable added.
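Why a new record variable forces a copy can be seen from how the record size is derived. The sketch below is a simplified model with hypothetical names: each record variable contributes its per-record bytes padded to a multiple of 4, and the special case of a file with exactly one record variable is ignored.

```c
#include <stddef.h>

typedef struct {
    size_t elem_size;   /* bytes per value, e.g. 8 for a double */
    size_t per_record;  /* values per record (product of the non-record dims) */
} rec_var;

/* Pad a byte count up to a multiple of 4. */
size_t pad4(size_t n)
{
    return (n + 3) / 4 * 4;
}

/* Record size: the sum of the padded per-record sizes of all record
 * variables.  It is computed, not stored in the file. */
size_t record_size(const rec_var *vars, size_t nvars)
{
    size_t total = 0;
    for (size_t i = 0; i < nvars; i++)
        total += pad4(vars[i].elem_size * vars[i].per_record);
    return total;
}

/* Offset of record number rec, relative to the start of the record section. */
size_t record_offset(const rec_var *vars, size_t nvars, size_t rec)
{
    return rec * record_size(vars, nvars);
}
```

With one double and three floats per record, the record size in this model is 20 bytes and record 5 starts at offset 100. Adding a third variable of three shorts raises the record size to 28, so record 5 would have to move to offset 140, and likewise for every record after the first.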