Re: 20010808: netcdf 3.4 help
- Subject: Re: 20010808: netcdf 3.4 help
- Date: Fri, 10 Aug 2001 16:45:22 -0600
>To: address@hidden
>From: "Alan S. Dawes" <address@hidden>
>Subject: netcdf 3.4 help
>Organization: UCAR/Unidata
>Keywords: 200108082235.f78MZM110902, huge files, large file support, record
Hi Alan,
Here's the CDL for a small file that has two "record variables", x and
y, of different shapes:
netcdf big2 {
dimensions:
    m = 4 ;
    n = 5 ;
    r = UNLIMITED ; // (3 currently)
variables:
    float x(r, m) ;
    float y(r, n) ;
data:
    x =
        2, 3, 4, 5,
        3, 4, 5, 6,
        4, 5, 6, 7 ;
    y =
        1, 2, 3, 4, 5,
        2, 4, 6, 8, 10,
        3, 6, 9, 12, 15 ;
}
A small Fortran program that will write the netCDF file corresponding
to the above CDL is appended. This program was mostly generated by
the "ncgen -f" utility, except I edited the output from that utility a
bit for this example.
In this example, if r were an ordinary dimension declared to be of
length 3, then all the values of x would be stored in the file,
followed by all the values of y. However, since r is declared to be
the unlimited dimension, the first slice of x (corresponding to r=1)
is followed by the first slice of y, then the second slices of x and
y, and so on. All your netCDF data access calls for reading and
writing the data are nevertheless the same as if r were a fixed-size
dimension. It's just that with r an UNLIMITED dimension, the data is
organized differently in the file, and it's possible to append more
data in the r direction efficiently.
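To make the interleaving concrete, here is a small sketch (not part of
the generated program) that prints where each slice would start,
counted in bytes from the beginning of the record data. It assumes
4-byte floats and ignores the file header, whose size is not modelled
here; the names recsize, xoff and yoff are just illustrative.

```fortran
      program layout
*     Sketch: byte offsets of each record slice, measured from the
*     start of the record data (header size is not modelled here).
      integer m, n, numrecs, fbytes
      parameter (m = 4, n = 5, numrecs = 3, fbytes = 4)
      integer irec, recsize, xoff, yoff
*     one record holds one x slice followed by one y slice
      recsize = (m + n) * fbytes
      do irec = 1, numrecs
*        x slice comes first within the record, then the y slice
         xoff = (irec - 1) * recsize
         yoff = xoff + m * fbytes
         print *, 'record', irec, ' x at', xoff, ' y at', yoff
      enddo
      end
```

For the small example, each record is 36 bytes, so the x and y slices
start at offsets 0 and 16 for the first record, 36 and 52 for the
second, and 72 and 88 for the third.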
To have this program generate a 6 Gbyte file, corresponding to the
similar CDL:
netcdf big2 {
dimensions:
    m = 400000 ;
    n = 600000 ;
    r = UNLIMITED ; // (1500 currently)
variables:
    float x(r, m) ;
    float y(r, n) ;
data:
    x =
        ... // long list of values
    y =
        ... // long list of values
}
it's only necessary to change the three parameters in the Fortran
program to
parameter(MFIXED=400000)
parameter(NFIXED=600000)
parameter(NUMRECS=1500)
and link the resulting Fortran program against the netCDF library
compiled with large file support. So even though x is a 1500 x 400000
array of 600,000,000 floats (requiring 2.4 Gbytes to store) and y is a
1500 x 600000 array of 900,000,000 floats (requiring 3.6 Gbytes to
store), both variables can be written into and read from the netCDF
file, because they are record variables, stored only a slice at a
time, with the x slice for r=1 followed by the y slice for r=1, and so
on. To simplify this example, I haven't included any fixed-size
variables, but they don't really change anything as long as the total
size of all fixed-size variables is < 2 Gbytes.
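The size arithmetic above can be checked with a few lines of Fortran.
This is only a sketch: it uses double precision because the byte
counts would overflow a default 32-bit integer, and it assumes 4-byte
floats. The variable names are illustrative and do not come from the
appended program.

```fortran
      program sizes
*     Sketch: size arithmetic for the large example. Double precision
*     is used because the byte counts overflow a 32-bit integer.
      double precision m, n, nrecs, fbytes
      double precision xbytes, ybytes, recbytes, limit
      parameter (m = 400000d0, n = 600000d0)
      parameter (nrecs = 1500d0, fbytes = 4d0)
*     largest value a signed 32-bit file offset can hold
      limit = 2d0**31 - 1d0
      xbytes = nrecs * m * fbytes
      ybytes = nrecs * n * fbytes
      recbytes = (m + n) * fbytes
      print *, 'x total bytes:    ', xbytes
      print *, 'y total bytes:    ', ybytes
      print *, 'file data bytes:  ', xbytes + ybytes
      print *, 'bytes per record: ', recbytes
      print *, 'exceeds 2**31-1?  ', xbytes + ybytes .gt. limit
      end
```

The totals come to 2.4e9 bytes for x, 3.6e9 bytes for y and 6.0e9
bytes together, while a single record (one x slice plus one y slice)
is only 4,000,000 bytes.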
I hope this clarifies one way to write very large netCDF files on a
32-bit platform with large file support. I don't think any special
Fortran flags are required for this, but the C library had to be built
with
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
--Russ
      program fgennc
      parameter(MFIXED=4)
      parameter(NFIXED=5)
      parameter(NUMRECS=3)
      include 'netcdf.inc'
* error status return
      integer iret
* netCDF id
      integer ncid
* dimension ids
      integer m_dim
      integer n_dim
      integer r_dim
* dimension lengths
      integer m_len
      integer n_len
      integer r_len
      parameter (m_len = MFIXED)
      parameter (n_len = NFIXED)
      parameter (r_len = NF_UNLIMITED)
* variable ids
      integer x_id
      integer y_id
* rank (number of dimensions) for each variable
      integer x_rank
      integer y_rank
      parameter (x_rank = 2)
      parameter (y_rank = 2)
* variable shapes
      integer x_dims(x_rank)
      integer y_dims(y_rank)
* data variables
      real x(m_len)
      real y(n_len)
* starts and counts for array sections of record variables
      integer x_start(x_rank), x_count(x_rank)
      integer y_start(y_rank), y_count(y_rank)
* enter define mode
      iret = nf_create('big2.nc', NF_CLOBBER, ncid)
      call check_err(iret)
* define dimensions
      iret = nf_def_dim(ncid, 'm', MFIXED, m_dim)
      call check_err(iret)
      iret = nf_def_dim(ncid, 'n', NFIXED, n_dim)
      call check_err(iret)
      iret = nf_def_dim(ncid, 'r', NF_UNLIMITED, r_dim)
      call check_err(iret)
* define variables
      x_dims(2) = r_dim
      x_dims(1) = m_dim
      iret = nf_def_var(ncid, 'x', NF_REAL, x_rank, x_dims, x_id)
      call check_err(iret)
      y_dims(2) = r_dim
      y_dims(1) = n_dim
      iret = nf_def_var(ncid, 'y', NF_REAL, y_rank, y_dims, y_id)
      call check_err(iret)
* leave define mode
      iret = nf_enddef(ncid)
      call check_err(iret)
* Write record variables one record at a time
      do irec=1, NUMRECS
* store some arbitrary values in data variable slices
         do ix = 1, m_len
            x(ix) = ix + irec
         enddo
         do iy = 1, n_len
            y(iy) = iy * irec
         enddo
* store x slice
         x_start(1) = 1
         x_start(2) = irec
         x_count(1) = m_len
         x_count(2) = 1
         iret = nf_put_vara_real(ncid, x_id, x_start, x_count, x)
         call check_err(iret)
* store y slice
         y_start(1) = 1
         y_start(2) = irec
         y_count(1) = n_len
         y_count(2) = 1
         iret = nf_put_vara_real(ncid, y_id, y_start, y_count, y)
         call check_err(iret)
      enddo
      iret = nf_close(ncid)
      call check_err(iret)
      end

      subroutine check_err(iret)
      integer iret
      include 'netcdf.inc'
      if (iret .ne. NF_NOERR) then
         print *, nf_strerror(iret)
         stop
      endif
      end
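
Reading works the same way, one slice at a time. The following is a
rough sketch rather than anything generated by ncgen: it assumes the
small big2.nc written by the program above is in the current
directory, and it reads back record 2 of x. The program name readrec
is just an illustrative choice.

```fortran
      program readrec
*     Sketch: read back record 2 of x from the small big2.nc file.
      include 'netcdf.inc'
      integer MFIXED
      parameter (MFIXED = 4)
      integer iret, ncid, x_id, ix
      integer x_start(2), x_count(2)
      real x(MFIXED)
      iret = nf_open('big2.nc', NF_NOWRITE, ncid)
      call check_err(iret)
      iret = nf_inq_varid(ncid, 'x', x_id)
      call check_err(iret)
*     first index is m (all of it), second index is the record r
      x_start(1) = 1
      x_start(2) = 2
      x_count(1) = MFIXED
      x_count(2) = 1
      iret = nf_get_vara_real(ncid, x_id, x_start, x_count, x)
      call check_err(iret)
      print *, (x(ix), ix = 1, MFIXED)
      iret = nf_close(ncid)
      call check_err(iret)
      end

      subroutine check_err(iret)
*     stop with the library's message on any netCDF error
      integer iret
      include 'netcdf.inc'
      if (iret .ne. NF_NOERR) then
         print *, nf_strerror(iret)
         stop
      endif
      end
```

With the small file, this should print the second row of x from the
CDL (3, 4, 5, 6). Only one slice is ever held in memory, so the same
approach should carry over to the 6 Gbyte case with MFIXED changed to
400000.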