Re: Uniform data indexing and querying
- Date: Tue, 23 May 2006 09:59:09 -0600
Hi Sveta:
Most of these questions are probably better addressed to the opendap
principals, Peter Cornillon or James Gallagher. However, I have briefly added
my own opinions below.
Sveta Shasharina wrote:
Dear John,
Your name came up in our conversation with Mike Folk (NCSA). We are
discussing a common project on adding a "remoting" layer to the indexing
HDF5 API (Mike and Rishi Sinha). The idea is to take this API, which allows
one to index (bitmap index) and efficiently query/access data, extract a
"format-agnostic" interface (hopefully working for both HDF5 and NetCDF), and
use this interface in a service that will allow for remote operation.
The client of this service will be in some kind of 4GL (for visualization).
This is all with regard to a possible NSF/SBIR proposal (due June 13)
addressing the topic of "Visualization of large data."
I see overlaps with what you and your colleagues are doing (I googled you
:-) and think we could collaborate or, at least, exchange ideas for
future collaboration. I work at Tech-X Corporation in Boulder (see
http://www.txcorp.com and http://grid.txcorp.com) and we have many
scientific computing and data management projects (mostly for DOE, some for
DOD and NASA).
So if you find time, could you please answer a couple of questions?
1. Is opendap used outside of earth systems?
It is also used in space physics (HAO/NCAR).
I am not sure where else.
2. What benefits does it have as a transfer mechanism (compared, say, to
gridftp, soap, corba etc)?
opendap is a subsetting data access protocol. Subsetting is crucial for large
datasets, since a client can request only the part it needs.
Compared to:
gridftp: does not subset; it only does bulk transfer.
soap: not a data access protocol. opendap 4 will use SOAP.
corba: opendap is not a distributed object system, but rather client/server,
so the client does not have to worry about the lifecycles of the server's objects.
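To make the subsetting point concrete, here is a minimal sketch of how a client might ask for only part of one variable. It assumes a DAP2-style constraint expression of the form var[start:stride:stop] with an inclusive stop index, and a ".dods" suffix for the binary data response. The server URL, the variable name "sst", and the helper names hyperslab and subset_url are made up for illustration; they are not part of any opendap library.

```python
def hyperslab(var, *slices):
    """Build a DAP2-style constraint expression, e.g. sst[0:1:9][4:2:10].

    Each argument is a Python slice with an explicit stop. Python slices
    exclude the stop index, while DAP2 ranges include it, so the last
    selected index is computed explicitly.
    """
    parts = []
    for s in slices:
        start = s.start or 0
        step = s.step or 1
        if s.stop is None or s.stop <= start:
            raise ValueError("each slice needs an explicit stop greater than start")
        # Last index actually selected by range(start, stop, step).
        last = start + ((s.stop - 1 - start) // step) * step
        parts.append(f"[{start}:{step}:{last}]")
    return var + "".join(parts)


def subset_url(dataset_url, var, *slices):
    """Append the constraint to a (hypothetical) dataset URL.

    Assumption: the server returns binary data for the ".dods" suffix and
    accepts unescaped brackets in the query string.
    """
    return f"{dataset_url}.dods?{hyperslab(var, *slices)}"


# Hypothetical dataset: first 10 time steps, every 2nd latitude from 4 to 10.
url = subset_url("http://example.org/opendap/sst.nc", "sst",
                 slice(0, 10), slice(4, 12, 2))
print(url)
```

Only the requested hyperslab would cross the network, which is the difference from a bulk transfer of the whole file.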
3. What is the status of unifying NetCDF and HDF5? And is NetCDF4 widely
supported? I heard that parallel NetCDF is superior to HDF5, so why then
NetCDF4 (based on HDF5)?
We don't really unify HDF5 and NetCDF. Rather, NetCDF4 is a profile (subset) of HDF5.
NetCDF4 is brand new, and will not be complete until HDF5 version 1.8 is released (this fall, we hope).
HDF5 has a richer data model than NetCDF, so we are taking advantage of various new features, not just parallel I/O.
See: http://www.unidata.ucar.edu/software/netcdf/netcdf-4/
We are also looking at indexing in order to provide remote access to large collections of data. I would be interested in hearing what approaches you might take.
Regards,
John Caron