
Re: use of the "date" metadata element in THREDDS catalog

Subject: Re: use of the "date" metadata element in THREDDS catalog
Date: Fri, 01 Jun 2007 10:24:06 -0400

Hi John,

Thanks for your response. Yes, we can use some heuristic methods rather than frequently scanning through an entire catalog. Apparently some catalog nodes are updated more often than others. Thus, the approach we have adopted for now is to create separate CSW catalogs for different collections, such as NEXRAD and NCEP forecast, and to update the different CSWs at different frequencies based on our estimate of each THREDDS data collection's update rate.
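A minimal Python sketch of the per-collection schedule described above. The class name, function name, and the intervals (10 minutes for NEXRAD, 6 hours for NCEP forecast) are illustrative assumptions, not measured update rates or part of any THREDDS or CSW API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CollectionSchedule:
    """Crawl schedule for one THREDDS collection mirrored into its own CSW catalog."""
    name: str
    interval: timedelta   # estimated update rate (assumed, tuned by hand)
    last_crawl: datetime

    def is_due(self, now: datetime) -> bool:
        # Re-crawl once the estimated update interval has elapsed.
        return now - self.last_crawl >= self.interval


def due_collections(schedules, now):
    """Return the names of collections whose CSW catalog should be refreshed."""
    return [s.name for s in schedules if s.is_due(now)]


start = datetime(2007, 6, 1, 0, 0)
schedules = [
    CollectionSchedule("NEXRAD", timedelta(minutes=10), start),        # hypothetical rate
    CollectionSchedule("NCEP forecast", timedelta(hours=6), start),    # hypothetical rate
]
print(due_collections(schedules, start + timedelta(hours=1)))  # ['NEXRAD']
```

Keeping one schedule per collection means a slowly changing collection is never re-crawled just because a fast one is due.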


Wenli

At 07:39 PM 5/31/2007, John Caron wrote:

Hi Wenli:

You're right that dynamic catalog generation is (sort of) the problem.
As a practical matter, it's not hard to figure out how often the datasets come in, and use that as a heuristic on how often to crawl.
In principle, the "Last-Modified" HTTP header could tell you if the catalog has been modified, or perhaps "Expires" is better. The problem is that the server doesn't actually know what that should be, but perhaps we can figure out a way to add that. This would only be approximate, but I assume that would be good enough for your purposes?
Realtime data is challenging. For gridded data, the granularity is large enough that "Expires" is probably useful. For datasets like radar and surface obs, the answer is always "yes, it changed since last time you asked".
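
If a server did supply these headers, a crawler could use them roughly as follows. This is only a sketch in Python: the function name needs_recrawl and the decision order (trust "Expires" first, then "Last-Modified", otherwise crawl) are assumptions, and the thread itself notes that the server does not currently know good values for either header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date header; return None if it is missing or malformed."""
    if value is None:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None


def needs_recrawl(headers, last_crawl, now):
    """Decide from response headers whether a catalog should be re-crawled.

    headers: mapping of header name to value (names matched case-insensitively)
    last_crawl, now: timezone-aware datetimes
    """
    h = {k.lower(): v for k, v in headers.items()}
    expires = _parse_http_date(h.get("expires"))
    if expires is not None and expires > now:
        return False  # server claims the previous copy is still fresh
    modified = _parse_http_date(h.get("last-modified"))
    if modified is not None:
        return modified > last_crawl
    return True  # no usable hint, so fall back to crawling


now = datetime(2007, 6, 1, 12, 0, tzinfo=timezone.utc)
last = datetime(2007, 6, 1, 6, 0, tzinfo=timezone.utc)
print(needs_recrawl({"Last-Modified": "Fri, 01 Jun 2007 10:24:06 GMT"}, last, now))  # True
```

For radar and surface obs this check would return True on nearly every call, which matches the point above that such headers mainly help for coarser-grained data.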
John


Wenli Yang wrote:
Hello,
We are doing a project on ingesting THREDDS catalogs into OGC catalogs (Catalog Service for Web, or CSW). We find that we have to go through an entire THREDDS catalog to update an ingested CSW server, because we cannot tell whether the THREDDS catalog has been modified without traversing it exhaustively.
There is a "date" element in the threddsMetadataGroup. The element can be used to identify the modified (or created, valid, issued, available, etc.) date/time of an individual dataset and/or a collection. This element is very useful not only at the individual dataset level but also at the data collection level. For example, suppose a data collection A contains another collection AA, which contains another collection AAA, which contains datasets a, b, c, and d (i.e., A>AA>AAA>a,b,c,d). If the "modified" date stamp is applied to all the dataset nodes, individual as well as collection, a returning user would not need to follow the complete path to find out whether a dataset has been added or modified in data collection AAA and/or any other collection in the hierarchy.
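
The idea above can be sketched in Python as a traversal that skips unchanged subtrees. The Node class and changed_datasets function are hypothetical, and the sketch assumes each collection's "modified" stamp is at least as recent as the newest stamp beneath it, which THREDDS does not guarantee today.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Node:
    """A dataset or collection with a "modified" stamp (hypothetical model)."""
    name: str
    modified: datetime
    children: List["Node"] = field(default_factory=list)


def changed_datasets(node, since):
    """Return names of leaf datasets modified after `since`, pruning unchanged subtrees."""
    if node.modified <= since:
        return []  # nothing below can be newer if stamps propagate upward
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(changed_datasets(child, since))
    return found


old, new = datetime(2007, 5, 30), datetime(2007, 6, 1)
aaa = Node("AAA", new, [Node("a", old), Node("b", old), Node("c", old), Node("d", new)])
tree = Node("A", new, [Node("AA", new, [aaa])])
print(changed_datasets(tree, datetime(2007, 5, 31)))  # ['d']
```

With stamps on every level, a returning client that last visited after the newest change stops at A and never descends.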
However, it seems that this "date" element is not widely used, if at all, at the data collection level. In fact, I randomly browsed some of the data paths in Unidata's motherlode catalog (http://motherlode.ucar.edu:8080/thredds/catalog.html) and didn't find any "Last Modified" information until I got to the final dataset level.
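
A check like the one I did by hand could be automated along these lines in Python. The SAMPLE catalog is a made-up, trimmed fragment rather than real motherlode output, and the sketch only looks at "date" elements that are direct children of a dataset, ignoring any wrapped in a "metadata" element.

```python
import xml.etree.ElementTree as ET

THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
NS = {"t": THREDDS_NS}

# Hypothetical catalog fragment for illustration only.
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="AAA">
    <date type="modified">2007-06-01T10:00:00Z</date>
    <dataset name="a" urlPath="AAA/a.nc">
      <date type="modified">2007-05-30T00:00:00Z</date>
    </dataset>
    <dataset name="b" urlPath="AAA/b.nc"/>
  </dataset>
</catalog>"""


def modified_dates(catalog_xml):
    """Map each dataset name to its own "modified" date string, or None if absent."""
    root = ET.fromstring(catalog_xml)
    result = {}
    for ds in root.iter("{%s}dataset" % THREDDS_NS):
        # Only direct children, so a collection does not pick up a child's date.
        dates = [d.text for d in ds.findall("t:date", NS) if d.get("type") == "modified"]
        result[ds.get("name")] = dates[0] if dates else None
    return result


print(modified_dates(SAMPLE))
```

Datasets that map to None are the ones a crawler would have to open and scan to learn whether anything changed.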
I guess that the reason the THREDDS catalog does not show the modified date/time at the collection level is that the catalog is not automatically updated when a new dataset is inserted into the database/file system connected to the catalog. Once a user browses down to a catalog, the server scans the immediate child nodes to get all the available datasets/data collections. Thus, a user browsing down the hierarchy will always be presented with the most current datasets, although the catalog does not update itself when new datasets are inserted. The disadvantage of this approach is that a user always needs to go to the bottom level to find out whether any new datasets have been inserted. Similarly, in order to update our CSW catalog, our THREDDStoCSW ingestor will have to scan through an entire THREDDS catalog, which can be very large, such as the Unidata catalog.
Any comments/suggestions will be highly appreciated.
Wenli Yang
George Mason University


