Hello all,
As a data provider, I must admit that I am somewhat alarmed by the
potential for having to provide multiple metadata representations for
thousands (or millions) of datasets. I noted in John Weatherley's seminar on
OAI metadata harvesting that the original source materials were DCXML
files (see http://dublincore.org/documents/2000/07/14/dcmes-xml/ for a
discussion and DTD). I could easily imagine a situation where I had to
create and maintain these files and a parallel set for FGDC
representations. This is, of course, relatively straightforward in a
world of static metadata. DC seems much more static than FGDC, so maybe
this is not a huge problem. In a dynamic metadata situation, where data
providers, data managers, or data processing systems are interacting with
the metadata on essentially random time schedules, it seems like this could
turn into a massive file management headache. BTW, John Caron's seminar
suggested that I was going to need a bunch of other XML files hanging
around to define collections. This only adds to the problem.
My approach to avoiding this problem is to try to produce multiple
metadata representations from a single source (in my case a relational
database). The content of that database is essentially FGDC, although I
expect that it will soon migrate to ISO 19115. What's important about
this is that it is a "fatter" standard (it has more stuff). The desire to
have more stuff is what led me to agree with Jeff's earlier e-mail
suggesting that it might be difficult to recover from starting small. In
that case, the problem Jeff and Stefano have discussed becomes one of
revealing different subsets of information from the database in response
to different requests.
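
As a rough sketch of the single-source idea described above (not the actual database): the table, its column names, the view names, and the sample record below are all hypothetical, and a real DCXML file would also need the proper namespaces and DTD. The point is only that each representation is a different list of fields pulled from the same row.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical single-source table; the schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset "
    "(id TEXT, title TEXT, originator TEXT, abstract TEXT, pubdate TEXT)"
)
conn.execute(
    "INSERT INTO dataset VALUES (?, ?, ?, ?, ?)",
    ("ds-001", "Example Grid", "Example Center", "A sample abstract.", "20001226"),
)

# Each view names a root element and lists (output element, source column) pairs.
# A "fatter" view simply lists more columns than a "thin" one.
VIEWS = {
    "dc": ("metadata", [
        ("title", "title"),
        ("creator", "originator"),
        ("description", "abstract"),
        ("date", "pubdate"),
    ]),
    "brief": ("record", [("title", "title")]),
}

def render(conn, dataset_id, view):
    """Build one XML representation of a dataset from the shared table."""
    root_name, fields = VIEWS[view]
    columns = ", ".join(column for _, column in fields)
    row = conn.execute(
        f"SELECT {columns} FROM dataset WHERE id = ?", (dataset_id,)
    ).fetchone()
    if row is None:
        raise KeyError(dataset_id)
    root = ET.Element(root_name)
    for (element, _), value in zip(fields, row):
        ET.SubElement(root, element).text = value
    return ET.tostring(root, encoding="unicode")

print(render(conn, "ds-001", "dc"))
print(render(conn, "ds-001", "brief"))
```

Because every representation is generated on request, an update to the row shows up in all of them at once, and there is no parallel set of files to keep in sync.
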
In any case, I was driven to explore the DC-FGDC crosswalk in the hope
that I could easily create DC from FGDC (what the heck, it's the day
after Christmas and I'm at work!). I was interested to see that this
crosswalk was not referenced in the big list of crosswalks
(http://www.ukoln.ac.uk/metadata/interoperability/). Is there an obvious
reason for that? My initial efforts are in the attached file. It looks to
me like this crosswalk is rather straightforward. The most serious
omission is the identifier field. As far as I know, FGDC does not include
this concept, unfortunately. It could be added as an extension. I also think
OGC is working on an interesting approach to unique identifiers.
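
To make the idea of an FGDC-to-DC crosswalk concrete, here is a minimal sketch. It is not the attached crosswalk: the pairings below are assumptions chosen for illustration, using FGDC short tag names, and the sample record is made up. Note that nothing maps to the DC identifier element, which matches the omission described above.

```python
import xml.etree.ElementTree as ET

# Illustrative pairings only (DC element -> FGDC path, relative to <metadata>).
# These are assumptions for the sketch, not the attached crosswalk.
CROSSWALK = {
    "title": "idinfo/citation/citeinfo/title",
    "creator": "idinfo/citation/citeinfo/origin",
    "date": "idinfo/citation/citeinfo/pubdate",
    "publisher": "idinfo/citation/citeinfo/pubinfo/publish",
    "description": "idinfo/descript/abstract",
    "subject": "idinfo/keywords/theme/themekey",
}

def fgdc_to_dc(fgdc_xml):
    """Return a dict of DC element name -> list of values found in the FGDC record."""
    root = ET.fromstring(fgdc_xml)
    dc = {}
    for dc_name, path in CROSSWALK.items():
        values = [
            el.text.strip()
            for el in root.findall(path)
            if el.text and el.text.strip()
        ]
        if values:
            dc[dc_name] = values
    # No entry is produced for "identifier": this sketch has no FGDC source for it.
    return dc

# Made-up sample record for demonstration.
SAMPLE = """
<metadata>
  <idinfo>
    <citation><citeinfo>
      <origin>Example Center</origin>
      <pubdate>20001226</pubdate>
      <title>Example Grid</title>
    </citeinfo></citation>
    <descript><abstract>A sample abstract.</abstract></descript>
    <keywords><theme>
      <themekt>Example Thesaurus</themekt>
      <themekey>temperature</themekey>
      <themekey>salinity</themekey>
    </theme></keywords>
  </idinfo>
</metadata>
"""

for name, values in fgdc_to_dc(SAMPLE).items():
    print(name, values)
```
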
This crosswalk may raise some interesting questions about the list of
metadata elements Ben presented
(http://www.smete.org/nsdl/workgroups/standards/current_element_set.html).
BTW, the definition of the identifier element is broken in that list.
When one follows the crosswalk to FGDC land, one often lands in the
middle of a section that has a bunch of required elements that are not
included in DC. This, of course, makes going from DC to FGDC impossible,
but it raises the question of whether NSDL might want to beef up this
list. What good are keywords from a controlled vocabulary if you don't
know which controlled vocabulary they come from? Or identifiers from a
specific context if you don't know what context it is?
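
The keyword question can be illustrated with a small sketch. In FGDC each theme block carries its thesaurus name (themekt) alongside its keywords (themekey), so the pairing can be kept if the target element set has somewhere to put it. The function name, the "unknown" fallback, and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

def qualified_keywords(fgdc_xml):
    """Return (vocabulary, keyword) pairs so each keyword keeps its source thesaurus."""
    root = ET.fromstring(fgdc_xml)
    pairs = []
    for theme in root.findall("idinfo/keywords/theme"):
        # "unknown" is a placeholder chosen for this sketch, not an FGDC convention.
        vocabulary = theme.findtext("themekt", default="unknown")
        for key in theme.findall("themekey"):
            pairs.append((vocabulary, key.text))
    return pairs

# Made-up record with keywords drawn from two different vocabularies.
SAMPLE_KEYWORDS = """
<metadata><idinfo><keywords>
  <theme>
    <themekt>Example Thesaurus A</themekt>
    <themekey>temperature</themekey>
    <themekey>salinity</themekey>
  </theme>
  <theme>
    <themekt>Example Thesaurus B</themekt>
    <themekey>buoys</themekey>
  </theme>
</keywords></idinfo></metadata>
"""

for vocabulary, keyword in qualified_keywords(SAMPLE_KEYWORDS):
    print(f"{keyword} (from {vocabulary})")
```

A flat subject list would keep only the three keywords and lose which vocabulary each one belongs to.
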
I am a real neophyte in this business, so I could be making some simple
errors. In any case, it is also a rough draft!
Happy New Year to all!
Ted Habermann