
DODS workshop observations



Hi,

After the DODS meetings last week and a few brief conversations at the AMS meetings this week, I thought it would be useful to summarize the issues that came up at the DODS meetings that I feel are important from my own (admittedly limited) THREDDS perspective.

When I get a chance, I'll try to capture this on a web page with all the relevant links, etc., but I wanted to get it out for discussion (especially for corrections by others who were at the DODS meetings) before it falls through the cracks.

Have a nice MLK weekend.

-- Ben

=======================================================

Granularity:

Under this heading, I include the discussions regarding what comprises a dataset, what's an aggregation, what's a catalog, a collection, etc. and how these relate to files, data objects within files, inventories, lists, directories, etc. I came away from the meetings with the sense that there are clear definitions for only a few of these. Within THREDDS, we need to come up with some working definitions that allow us to work with the data hierarchy in a systematic fashion. This is somewhat complicated by the fact that the Digital Library community uses some of these terms (for example, "collection") in its own fashion.

There is a related THREDDS issue that was not discussed much at the DODS meetings, namely, that we envision third-party metadata contributions in the form of "catalogs" that reference files on multiple data servers. This means that a given dataset or file can be a member of many hierarchies.
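
To make the "many hierarchies" point concrete, here is a rough sketch in Python. It is only an illustration, not a proposed THREDDS design: the server URLs, catalog names, and the nested-dict layout are all made up. The idea is that datasets are identified by URL, each catalog is its own tree, and a reverse index shows every place a given dataset appears.

```python
from collections import defaultdict

# Hypothetical dataset URLs on two different data servers (assumed names).
SST = "http://server-a.example/dods/sst.nc"
WIND = "http://server-b.example/dods/wind.nc"

# Each catalog is a nested dict: keys are collection names,
# leaves are lists of dataset URLs. The layout is an assumption.
catalogs = {
    "provider_a": {"ocean": {"surface": [SST]}},
    "storm_events": {"event_1": [SST, WIND]},
}

def walk(node, path=()):
    """Yield (path, dataset_url) for every leaf in a nested catalog."""
    if isinstance(node, list):
        for url in node:
            yield path, url
    else:
        for name, child in node.items():
            yield from walk(child, path + (name,))

def memberships(catalogs):
    """Map each dataset URL to every hierarchy path that references it."""
    index = defaultdict(list)
    for cat_name, tree in catalogs.items():
        for path, url in walk(tree, (cat_name,)):
            index[url].append("/".join(path))
    return dict(index)

index = memberships(catalogs)
```

Here `index[SST]` lists two paths, one in the provider's own catalog and one in the third-party catalog, while the dataset itself stays on its original server.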

Metadata Schemas:

The DODS DDS (Dataset Descriptor Structure) and DAS (Dataset Attribute Structure) will not be sufficient for THREDDS. We have to determine how THREDDS fits in with externally defined "standards" such as those of ISO, FGDC, OpenGIS, GCMD, Dublin Core, ESML, etc. Recently we learned of another in the area of software metadata: BIDM (Basic Interoperability Data Model). Our data provider sites are required to conform to some of these standards, and the DL community is adopting Dublin Core with some extensions.

Metadata Creation Tools:

These are needed in the form of crawlers, scanners, and tools to aid human input. This includes hybrid tools where some of the metadata common to many datasets is input by hand one time and is then combined automatically with metadata specific to individual datasets or files. It is important that such tools be able to traverse data holdings where the metadata (and perhaps the datasets themselves) are held in databases and generated on the fly as needed. Some of this work is going on in DODS, some in the DL community, and some at Unidata, so this is an area where coordination of efforts is needed.
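
As a rough illustration of the hybrid idea, the Python sketch below combines a block of metadata entered once by hand with per-file metadata that a scanner might produce. The field names, file names, and the rule that file-specific values override shared ones are all assumptions for the sake of the example, not something decided at the meetings.

```python
# Metadata common to many datasets, entered by hand one time (hypothetical fields).
shared = {
    "institution": "Example University",
    "contact": "data-admin@example.edu",
    "title": "Untitled dataset",
}

# Metadata a scanner might extract from each file (hypothetical values).
scanned = {
    "run1.nc": {"title": "Model run 1", "time_coverage": "2001-01-01/2001-01-07"},
    "run2.nc": {"time_coverage": "2001-01-08/2001-01-14"},
}

def combine(shared, specific):
    """Return shared metadata overlaid with file-specific metadata.

    Assumption: when both define a field, the file-specific value wins.
    """
    merged = dict(shared)
    merged.update(specific)
    return merged

records = {name: combine(shared, meta) for name, meta in scanned.items()}
```

With this rule, `run1.nc` keeps its own title, while `run2.nc` falls back to the hand-entered default; both inherit the institution and contact fields.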

Metadata Presentation Tools:

Several approaches to making metadata available were discussed at the meeting: DBMS systems, LDAP, simple directory/file systems, and full-text indexing facilities. As noted above, it's important for metadata "harvesting" tools to be able to "traverse" all the metadata at a site, even though it is made available in different ways.
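
One way to picture such a harvester is to give every metadata source the same simple interface, so the tool that walks a site does not care how the metadata is stored. The Python sketch below is only that, a sketch: the table name, column names, and record fields are invented, an in-memory SQLite database stands in for a DBMS, and a plain dict stands in for a directory listing.

```python
import sqlite3

def from_database(conn):
    """Yield metadata records stored as rows in a DBMS table (assumed schema)."""
    query = "SELECT name, title FROM metadata ORDER BY name"
    for name, title in conn.execute(query):
        yield {"dataset": name, "title": title, "source": "dbms"}

def from_listing(listing):
    """Yield records from a directory-style listing mapping path -> title."""
    for path in sorted(listing):
        yield {"dataset": path, "title": listing[path], "source": "files"}

def harvest(*sources):
    """Traverse every source in turn and yield records in one uniform shape."""
    for source in sources:
        yield from source

# Stand-in for a site's DBMS-held metadata.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (name TEXT, title TEXT)")
conn.execute("INSERT INTO metadata VALUES ('sst', 'Sea surface temperature')")

# Stand-in for metadata kept alongside files in a directory tree.
listing = {"model/run1.nc": "Model run 1"}

records = list(harvest(from_database(conn), from_listing(listing)))
```

An LDAP directory or a full-text index would be one more generator function with the same output shape; the harvesting loop itself would not change.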

Third-party Metadata Catalog Servers and the DODS Auxiliary Information Servers:

I believe these two concepts can be closely related. Whereas the AIS is currently viewed as a way of adding a "delta" of metadata to the main metadata source at the data provider's site, the concept could be extended to include sites that serve catalogs of metadata organized in a completely different fashion. For example, some of the catalogs might point to collections of datasets on different servers that illustrate different scientific concepts, or to collections of datasets on different servers that relate to certain events: hurricanes, major storms, floods, etc.