Re: Small THREDDS catalogs and the Proposed new specification for THREDDSS Catalogs
- Subject: Re: Small THREDDS catalogs and the Proposed new specification for THREDDSS Catalogs
- Date: Mon, 26 Apr 2004 10:20:10 -0600
Hi Benno:
Benno Blumenthal wrote:
John Caron wrote:
A proposed new version of the THREDDS Dataset Inventory Catalog is
ready for your comments. Please send them to
address@hidden, or to me.
I am glad to see some expansion of the spec so that we can convey more
information about our datasets. I have a question, however, about
switching from DTD to XML Schema.
My view of THREDDS DIC in general is that it allows a data provider to
describe their collection of data sets at a level of utility not
generally available. I understand there is a considerable effort to
generate catalogs automatically for existing collections, and that is
important, but I think one can provide information in the THREDDS
format that cannot be simply expressed by, for example, a collection
of typed files in a directory. So if providers have the detailed
information about their datasets, it is probably a set of links on an
HTML page (extended documentation in HTML or PDF files, perhaps links
to metadata files, etc), along with documentation for the collection
as a whole. Also, many providers only provide a few datasets. So the
path of least resistance for many is to simply write the THREDDS file.
In light of this, I have been encouraging groups to write THREDDS xml
files (version 0.6) to describe their collections. Not that I have
had a tremendous amount of success so far, but given some more time, I
am hopeful of more to show. For example, there is
http://www.ecco-group.org/thredds/sioeccoCatalog.xml
which is displayed in my (Ingrid) interface at
http://iridl.ldeo.columbia.edu/%28http://www.ecco-group.org/thredds/sioeccoCatalog.xml%29readthredds/
or
http://iridl.ldeo.columbia.edu/SOURCES/.SIO/.ECCO/
So given the THREDDS catalog generation options, they chose to write
it by hand, and I taught them the nuances of the tags. Pretty
reasonable given the small number of datasets and the additional
documentation that we wanted to link in. I am a little embarrassed
that the next THREDDS version is so different from the first, though I
guess that is my problem more than anyone else's.
So
1) What are the benefits of changing from DTD to XML Schema?
Well, it's not as great as some would claim, and schema validation is
just now becoming stable. I can't say I'm crazy about it. OTOH, writers
don't actually have to use it, and readers don't have to validate with it.
Schema is certainly more expressive than DTDs.
The integrated namespace handling is one clear win; we are using it for
both version control and "foreign" metadata parsing. I think in the long
run Schema will be worthwhile, esp as we learn what subsets of it are
best for what purposes.
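
A minimal sketch of how a reader could use the root element's namespace for version detection and for spotting "foreign" metadata. The namespace URIs and the `catalog_version` and `foreign_elements` helpers are assumptions made up for illustration; they are not taken from the THREDDS specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, one per catalog version (illustrative only).
KNOWN_VERSIONS = {
    "http://example.org/thredds/catalog/v0.6": "0.6",
    "http://example.org/thredds/catalog/v1.0": "1.0",
}


def namespace_of(element):
    """Return the namespace URI of an element, or None if it has none."""
    # ElementTree renders a namespaced tag as "{uri}localname".
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return None


def catalog_version(xml_text):
    """Map the root element's namespace to a catalog version string."""
    root = ET.fromstring(xml_text)
    return KNOWN_VERSIONS.get(namespace_of(root))


def foreign_elements(xml_text):
    """List local names of elements whose namespace differs from the root's."""
    root = ET.fromstring(xml_text)
    home = namespace_of(root)
    return [
        el.tag.split("}", 1)[-1]
        for el in root.iter()
        if namespace_of(el) != home
    ]


SAMPLE = (
    '<catalog xmlns="http://example.org/thredds/catalog/v1.0" '
    'xmlns:dc="http://example.org/foreign-metadata">'
    '<dataset name="a"><dc:title>Example</dc:title></dataset>'
    "</catalog>"
)
```

Here `catalog_version(SAMPLE)` looks up the root namespace in the table, and `foreign_elements(SAMPLE)` picks out the `dc:title` element because it lives in a different namespace from the catalog itself.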
There will be great benefits in switching to the 1.0 spec, as you note
above, esp in connecting to Digital Libraries and providing the raw data
for search services.
Does it include an automatic editing interface easily accessible to
everyone?
Yes, we are working on an editing interface to make this easy.
2) Can you provide a conversion utility on the web so that anyone that
has written an old-format file can instantly get a new-format version?
Yes. I'll try to get a prototype web service running soon so anyone can
try it.
If the new format is much easier to edit than the old (because
interfaces are readily available), I think the sell for switching can
be made. Otherwise, users may be annoyed, i.e. slow to switch and/or
adopt.
Yup, good points. We have been aware of the backwards compatibility
issues. The main problems are around inheritance, which is a big pain in
general, but seems awfully useful from a writer's POV. Anyone have any
thoughts about how useful inheritance is (e.g., putting a dataType or
serviceName in a parent dataset element and having it be inherited by
its nested datasets) ?
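
To make the inheritance question concrete, here is a minimal sketch of how a reader might resolve inherited attributes. It assumes that `dataType` and `serviceName` are written as attributes on `dataset` elements and that a nested dataset's own value overrides the inherited one; the sample catalog, its values, and the `resolve` helper are hypothetical and not taken from either version of the spec.

```python
import xml.etree.ElementTree as ET

# Attributes assumed to be inheritable by nested datasets.
INHERITED = ("dataType", "serviceName")

# Hypothetical hand-written catalog: the parent sets both attributes once.
CATALOG = """
<catalog>
  <dataset name="parent" dataType="Grid" serviceName="dods">
    <dataset name="child-a"/>
    <dataset name="child-b" dataType="Station"/>
  </dataset>
</catalog>
"""


def resolve(element, inherited=None, out=None):
    """Walk the tree and record the effective attributes of each dataset."""
    inherited = dict(inherited or {})  # copy so siblings do not affect each other
    out = {} if out is None else out
    if element.tag == "dataset":
        for attr in INHERITED:
            if attr in element.attrib:
                # A locally declared value overrides the inherited one.
                inherited[attr] = element.attrib[attr]
        out[element.get("name")] = dict(inherited)
    for child in element:
        resolve(child, inherited, out)
    return out


resolved = resolve(ET.fromstring(CATALOG))
```

With this catalog, `child-a` picks up both values from its parent, while `child-b` keeps the parent's `serviceName` but uses its own `dataType`. The writer types each value once; the cost is that every reader has to implement this walk.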