This archive contains answers to questions sent to Unidata support through mid-2025. Note that the archive is no longer being updated. We provide the archive for reference; many of the answers presented here remain technically correct, even if somewhat outdated. For the most up-to-date information on the use of NSF Unidata software and data services, please consult the Software Documentation first.
1) an entity that is considered as a unit by human beings
2) an entity that can be operated on as a unit by the THREDDS API
3) an entity that can be operated on as a unit by a data access protocol

Right now, only the entities described by "access" tags meet all of 1, 2, and 3.
The tags "dataset" and "collection" both describe entities that meet only 1 and 2. Thus I agree with Benno that there is not a very meaningful distinction between them (and I reconsider my listing of them as orthogonal concepts in my previous message).
I wonder if it would be a good idea to merge these concepts and use a less loaded word, say "entry", to refer to an entity that has meaning to THREDDS and to end users, but not to a data access protocol, i.e.
<catalog>
  <service name="X"/>
  <service name="Y"/>
  ...
  <entry name="my_dataset">
    <metadata name="global-metadata" url="..."/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="..."/>
      <access name="X-with-COARDS" serviceType="X" url="..."/>
      <access name="X-with-no-COARDS" serviceType="X" url="..."/>
      <access name="X-flattened-to-2D" serviceType="X" url="http://..."/>
      <access name="Y" serviceType="Y" url="..."/>
      ....
    </entry>
  </entry>

- Joe

Daniel Holloway wrote:
> Benno Blumenthal wrote:
>
> > John Caron wrote:
> >
> > > Much harder question is the distinction between a dataset and a collection, since a dataset is a collection of data. I have conceptualized it as follows: a dataset is something that can be selected, and then it is processed in a protocol-dependent way. A collection is a protocol-independent mechanism for grouping datasets.
> >
> > I think this is what is getting us into trouble. The concept of a dataset should be independent of the services available for it: a dataset served from two different servers could very well have different services/protocols available, depending on the server. (the aggregation server converts collections to datasets, for example). Yet from the THREDDS/educational point of view, it is the same object.
>
> I agree with this as well. I've been trying to reconcile how a catalog might look for a particular multifile 'dataset' which has both WMS and DODS access available for it. For WMS (for multifile) datasets the access point would be at the collection level, while for 'non-aggregated' datasets the DODS access would be lower than the collection level, at the THREDDS dataset level. It seems that the concept of a dataset resides more at the collection level, maybe the service access binding is too tightly coupled to the dataset concept in the current draft.
>
> Dan
>
> > Benno
> >
> > --
> > Dr. M. Benno Blumenthal address@hidden
> > International Research Institute for climate prediction
> > Lamont-Doherty Earth Observatory of Columbia University
> > Palisades NY 10964-8000 (845) 680-4450
--
Joe Wielgosz
address@hidden / (707)826-2631
---------------------------------------------------
Center for Ocean-Land-Atmosphere Studies (COLA)
Institute for Global Environment and Society (IGES)
http://www.iges.org
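
To make the nested "entry" proposal in the message above more concrete, here is a minimal Python sketch of how a client might walk such a catalog and list the access points found at each level. It is only an illustration, not part of any THREDDS API: the function name `list_access_points`, the closing `</catalog>` tag, and the `example.org` URLs are assumptions added so the sample parses, since the original sketch elides the URLs and leaves the catalog open.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed variant of the catalog sketched in the message.
# The example.org URLs are placeholders; the original message elides them.
CATALOG_XML = """
<catalog>
  <service name="X"/>
  <service name="Y"/>
  <entry name="my_dataset">
    <metadata name="global-metadata" url="http://example.org/meta/global"/>
    <access name="global-X-access"/>
    <entry name="monthly-data">
      <metadata name="monthly-metadata" url="http://example.org/meta/monthly"/>
      <access name="X-with-COARDS" serviceType="X" url="http://example.org/x/coards"/>
      <access name="X-with-no-COARDS" serviceType="X" url="http://example.org/x/plain"/>
      <access name="X-flattened-to-2D" serviceType="X" url="http://example.org/x/2d"/>
      <access name="Y" serviceType="Y" url="http://example.org/y"/>
    </entry>
  </entry>
</catalog>
"""


def list_access_points(element, path=()):
    """Yield (entry path, access name, serviceType) for every <access> tag.

    Entries may nest to any depth, so the walk is recursive. Only <access>
    elements are reported, because they are the only entities a data access
    protocol can operate on.
    """
    for entry in element.findall("entry"):
        entry_path = path + (entry.get("name"),)
        for access in entry.findall("access"):
            yield "/".join(entry_path), access.get("name"), access.get("serviceType")
        # Descend into child entries after reporting this level's access tags.
        yield from list_access_points(entry, entry_path)


catalog = ET.fromstring(CATALOG_XML)
access_points = list(list_access_points(catalog))
for entry_path, name, service_type in access_points:
    print(entry_path, name, service_type)
```

In this sketch an access point can sit at any level of the hierarchy, which is the situation Dan describes: an aggregate-level access on the outer entry and per-protocol access on the inner one. An `<access>` tag without a `serviceType` attribute is reported with `None`.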