Hi all:

I think this issue is in the category of "hmm, I never thought of using the TDS that way!".

So if you review the way one can restrict dataset access at

  http://www.unidata.ucar.edu/projects/THREDDS/tech/tds4.2/reference/RestrictedAccess.html

you will see this example:

<?xml version="1.0" encoding="UTF-8"?>
<catalog name="TDS Catalog" xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <service name="thisDODS" serviceType="OpenDAP" base="/thredds/dodsC/" />

1) <datasetRoot path="test" location="/data/testdata/"/>
  <dataset name="Test Single Dataset" ID="testDataset" serviceName="thisDODS"
           urlPath="test/testData.nc" restrictAccess="tiggeData">
    <dataset name="Nested" ID="nested" serviceName="thisDODS" urlPath="test/nested/testData.nc" />
  </dataset>

2) <datasetScan name="Test all files in a directory" ID="testDatasetScan" path="testAll"
               location="/data/testdata" restrictAccess="ccsmData" >
    <metadata inherited="true">
      <serviceName>thisDODS</serviceName>
    </metadata>
  </datasetScan>
</catalog>

Example 1) uses datasetRoot, and 2) uses datasetScan. A datasetScan defines an implicit data root (in this case path="testAll" location="/data/testdata"). If one removes the datasetScan, the data root also goes away. But removing the dataset doesn't remove the datasetRoot.

Apparently ESG defines data roots in one place, and then defines explicit <dataset> elements for each file. (This is the "hmm, I never thought of that".) I'm assuming you don't want to remove the datasetRoot element because other datasets use it?

Anyway, just to clarify:

1) Removing the dataset means that a user can no longer find it in a public catalog.
2) But if you leave the datasetRoot, and they "just know" the URL, it will get served.

So the implication is that you have to be careful where you put your data.
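As a rough illustration of points 1) and 2) above, here is a toy model in Python. It is not TDS code: DATA_ROOTS, CATALOG, resolve and required_role are hypothetical names, and the real server's lookup is certainly more involved. It only shows that "is it listed in a catalog" and "does a data root map this URL to a file" are two separate lookups.

```python
# Toy model only: all names here are hypothetical, not TDS classes or APIs.

# Data roots map a URL path prefix to a directory on disk
# (mirrors <datasetRoot path="test" location="/data/testdata/"/>).
DATA_ROOTS = {"test": "/data/testdata/"}

# Datasets listed in the catalog, keyed by urlPath; the value is the
# restrictAccess role required to read the dataset.
CATALOG = {"test/testData.nc": "tiggeData"}


def resolve(url_path):
    """Return the file a data root maps url_path to, or None if no root matches."""
    for prefix, location in DATA_ROOTS.items():
        if url_path.startswith(prefix + "/"):
            return location + url_path[len(prefix) + 1:]
    return None


def required_role(url_path):
    """Return the role needed for url_path, or None if no catalog entry restricts it."""
    return CATALOG.get(url_path)


if __name__ == "__main__":
    path = "test/testData.nc"
    print("published:  ", resolve(path), required_role(path))

    # "Unpublish" by dropping the catalog entry but leaving the data root.
    del CATALOG[path]
    print("unpublished:", resolve(path), required_role(path))
```

In this model, once the catalog entry is dropped the path still resolves to a file and no role is attached to it, which is the behavior described above.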
Anyway, a solution might be to allow data roots to be restricted, e.g.:

  <datasetRoot path="test" location="/data/testdata/" restrictAccess="ccsmData"/>

I will investigate how easy that is. But this will restrict all datasets using that data root; I'm not sure if that's what you want.

John.

On 6/27/2011 1:22 PM, Drach, Bob wrote:

Hi Estani,

You're correct, and it's worth emphasizing this behavior to data publishers. I've highlighted the same information in:

- the ESG publisher tutorial
- the publisher reference guide
- the installation script

As I see it, the dataset roots are treated much like the DocumentRoot directive in Apache. If there is a simple configuration to block access to files under a dataset root unless otherwise cataloged/configured, I would be interested to know about it.

--Bob

On 6/27/11 9:18 AM, "Estanislao Gonzalez" <address@hidden> wrote:

Hi,

I might have missed someone interested in this, so forward it properly if you happen to know.

We have a problem with the current Thredds usage. I'm almost certain this is intended to be a TDS feature, but it's not working for us. I'm sending a copy to John as maybe there's some way to turn off this default (aka undesired for us) behavior.

So here's the problem. Everything defined in the thredds_root is served by the fileServer. If there's no catalog (and thus no security policy being defined), it's apparently assumed to be "unprotected". This causes problems in multiple ways:

1) For example, the mere act of unpublishing data (removing the TDS catalogs) makes it widely accessible.
   - a workaround for this is either leaving the catalogs and retracting the publication from the gateway only, or removing the thredds_root entry altogether (and publishing anything so this change is picked up!)
2) Any other data placed in one of those directories is instantly served; I don't think the publisher is aware of this.
3) Links are followed everywhere, even outside of the defined directory.
   A mere "my_path | /" added to any configuration file opens the machine for read access to whatever tomcat can read.

Because omitting a thredds_root entry will result in those catalogs not being published at all, systems relying on multiple publishers have to manage a central esg.ini file with all possible thredds_root entries. This will very likely mean that thredds_root directories that are no longer used are never deleted and remain open.

Well, I might think harder to find other problems, but in general I think that our intention was not to serve any file that's not contained in any catalog, or am I seeing things upside down?

Were you all aware of this? I wasn't...

Thanks,
Estani