[netCDF #BVD-982935]: Problems with the fuction nc_open
- Subject: [netCDF #BVD-982935]: Problems with the fuction nc_open
- Date: Thu, 13 May 2010 09:03:17 -0600
Hi Marcone,
Sorry to take so long to respond to your question ...
> My name is Marcone Magnus. I'm a graduating student of computer science
> at the Federal University of Santa Catarina (http://www.ufsc.br/). I'm doing
> research with Lapesd (http://www.lapesd.inf.ufsc.br/) and the Cyclops Group
> (http://www.cyclops.ufsc.br/) on high-performance computing for a health
> care system that we are developing, and I'm using netCDF to store medical
> data.
>
> My problem is that I need to open and close my netCDF file several times, and
> as the file grows it becomes slower to open.
> In fact, opening the file has become a critical point in my system. I am using
> the C interface and netCDF-4 with the hierarchical format. Is there a way to
> make nc_open take a fixed amount of time?
When a netCDF-4 file is opened, the library reads all of the file's metadata
into memory once, so that later references to the metadata can be resolved
quickly. By "metadata", I mean
- names and sizes of dimensions
- names, shapes, and types of variables
- names and types of attributes
- names of groups and their subgroups
- definitions of all user-defined types
This could be slow if you add a large amount of metadata to the file before
closing and re-opening it. However, if you are just adding more data to the
file (values for variables already defined), it should not slow down the
nc_open calls significantly. Most of the use cases for netCDF-4 that we have
seen benefit from reading in all of the metadata when the file is first opened,
to speed up access to the data and metadata on subsequent calls while the file
is still open.
The underlying HDF5 library works differently, only reading in metadata as
needed, so it is faster for cases such as a large number of nested groups where
the common case is to only read data from a small subset of those groups before
closing the file. That makes the open much faster, but makes each read that has
to access metadata slower.
We have considered implementing an optional "fast open" by following the HDF5
model, but so far there has not been enough demand for that feature to make it
a high priority for development.
The only suggestions I have are to
- consider making more use of data and less use of metadata for representing
  your data structures. For example, instead of using thousands of separate
  small variables, use a smaller number of variables with indexing, or use
  large multidimensional variables instead of many small variables.
- similarly, if you have thousands of deeply nested groups, consider a design
  that uses indexing in a few groups instead of relying on recursion in deeply
  nested groups.
- consider using HDF5 directly instead of netCDF-4, to see if its model of lazy
  evaluation of metadata is better suited for your data representations
- try to keep the file open while it is used, to amortize the cost of opening
  and reading in all the metadata
--Russ
Russ Rew UCAR Unidata Program
address@hidden http://www.unidata.ucar.edu
Ticket Details
===================
Ticket ID: BVD-982935
Department: Support netCDF
Priority: Critical
Status: Closed