
   
Schema

Many of the queries that one wishes to perform on a dataset should logically be addressed to the dataset's schema, e.g. questions as to what feature types the dataset holds, what attribute names and types are defined for each feature type, etc. If we consider a set of related datasets, e.g. a set of versions, then we can see that the schema itself has a broader existence than each individual dataset and might be better identified with a ``scope'' (Sect. [*]). Clearly, if versions of the ``same feature'' may be found in several different datasets, those datasets should have some of the schema in common (but they need not have the same internal structure: versioning should be able to cope with restructuring, e.g. of a directory tree [5]).
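As an illustration only, the following Python sketch shows queries being addressed to a schema object that several dataset versions share, rather than to any single dataset. The names `Schema`, `Dataset`, `feature_type_names` and `attributes_of`, and the road data, are hypothetical and are not part of any system described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Schema:
    """Hypothetical schema: feature type name -> {attribute name: attribute type}."""
    feature_types: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def feature_type_names(self) -> List[str]:
        # "What feature types does the dataset hold?"
        return sorted(self.feature_types)

    def attributes_of(self, feature_type: str) -> Dict[str, str]:
        # "What attribute names and types are defined for this feature type?"
        return dict(self.feature_types[feature_type])


@dataclass
class Dataset:
    """Hypothetical dataset that refers to a schema instead of owning one."""
    name: str
    schema: Schema


# Assumed example data: two versions of a dataset sharing one schema.
roads = Schema({
    "Road": {"name": "string", "lanes": "integer"},
    "Junction": {"id": "integer"},
})
v1 = Dataset("roads-v1", roads)
v2 = Dataset("roads-v2", roads)

print(v1.schema.feature_type_names())
print(v2.schema.attributes_of("Road"))
```

Because both versions refer to the same `Schema` object, a schema query gives the same answer whichever version it is reached through, which is the sense in which the schema belongs to the scope rather than to one dataset.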

Consider for a moment the case where we have a real world feature description. Here the different feature representations of the real entity may have nothing in common: for example, the London suburb ``Richmond'' appears as a feature in the London Tube map and also as part of the UK postal code coverage, but these two representations share nothing, and it is sensible to consider them as two different features (software representations).

Initially it would be simpler just to assert that the schemas must be identical in all datasets in the same scope where a feature identifier may be used. However, we must bear in mind that serious geographic information applications are approaching 24-hour, 365-day operation, so some allowance for dynamic schema evolution will certainly be needed in the near future.
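The two positions can be contrasted in a short Python sketch. It assumes a schema is represented as a plain mapping from feature type name to attribute names and types. The helper names `schemas_identical` and `common_schema` are hypothetical, and the second function is only one possible reading of ``some of the schema in common'', not a proposal for how evolution should be handled.

```python
from typing import Dict, Iterable

# Assumed representation: feature type name -> {attribute name: attribute type}.
SchemaDict = Dict[str, Dict[str, str]]


def schemas_identical(schemas: Iterable[SchemaDict]) -> bool:
    """Strict rule: every dataset in the scope carries exactly the same schema."""
    schemas = list(schemas)
    return all(s == schemas[0] for s in schemas[1:])


def common_schema(a: SchemaDict, b: SchemaDict) -> SchemaDict:
    """Relaxed rule: keep only the feature types and attributes on which both agree."""
    common: SchemaDict = {}
    for feature_type in a.keys() & b.keys():
        shared = {
            attr: typ
            for attr, typ in a[feature_type].items()
            if b[feature_type].get(attr) == typ
        }
        if shared:
            common[feature_type] = shared
    return common


# Assumed example: the newer schema adds one attribute to "Road".
old = {"Road": {"name": "string", "lanes": "integer"}}
new = {"Road": {"name": "string", "lanes": "integer", "surface": "string"}}

print(schemas_identical([old, old]))
print(schemas_identical([old, new]))
print(common_schema(old, new))
```

Under the strict rule, the added attribute makes the two schemas incompatible. Under the relaxed rule, the older schema survives as the common part that both datasets can rely on.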


Tom Conversion Service