Resolving Scoped Feature Identifiers

Philip Sargent
16th October 1998

This note briefly records our conclusions from the Vienna OGC meeting.

It will be provisionally edited into a draft of Topic 5, with supporting information, proper references, reconciliation with current IETF work, and the nomenclature already established by the Catalog WG.

This note says nothing about update and maintenance of identifiers or scope resolving objects. That was essentially dealt with in the note "Feature Identifier Registries: Incremental Publishing" put on the Feature SIG email archive on October 1st, though it will need extensive re-editing to fit in Topic 5.

This note says nothing about access (see below), only identifier resolution.

This note says nothing about the reasons why we want permanent, immutable feature identifiers or why they need to be names not locations. See the other recent feature SIG documents for background.

Fundamental Ideas

  1. A service, a "Scope", which resolves an identifier, is itself something that requires a permanent identifier.
  2. Scopes can be nested, but do not form a strict tree: a directed acyclic graph (DAG) is possible.
  3. At the "bottom" are the leaf-objects, which are explicit identifiers which do not then need resolving in any other scope.
  4. At the "top" we need to have some "Well Known Scopes" written into the implementation specification.
  5. Leaf objects can be feature identifiers, scope identifiers, or any other type of identifier. We just deal with feature and scope identifiers here. Note that we only deal with identifiers.
  6. Using a fully-resolved "bottom" leaf feature identifier to actually retrieve a feature's data is access, not resolution, and is a different issue.

New concept to the Abstract Specification

The key idea is that a Scope is a permanent object with an immutable name that allows it to be located indefinitely, and which can answer queries about identifiers "in its scope" by resolving them to more explicit identifiers.

Key supporting concept

Any scoped identifier (simply called an "identifier" in this note) is conceptually, and perhaps actually, constructed in two parts: a prefix, which names a Scope, and a suffix, which is handed to that Scope for resolution.

There is always a third, invisible, part: the scope in which the whole thing, prefix and suffix, should be interpreted. This is because an identifier is always "inside" a scope.

Example

"The Handle System®" from CNRI is an example of a 1-level scope system. An identifier looks like this:

  • hdl://cnri.test/abcd/efg/ijk#123
  • Where

    The Handle System includes a large number of scopes, but all of them have "hdl://" as the first part of their prefix.

    This example illustrates another useful but non-essential characteristic: the invisible scope in which the identifier should be resolved is indicated informally by the initial part of the text string of the prefix. Any informed person reading "hdl://" has a pretty good idea that we are dealing with a variety of URL (a URI, to be pedantic) and that we need to find which protocol "hdl:" represents. This capability is most useful at the "top" level, when we have to decide which Well Known Scope to start with. Lower scopes could use pure binary for prefixes and suffixes.
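As a rough illustration of choosing a Well Known Scope from the readable start of a prefix, the sketch below splits an identifier at the first colon and looks the result up in a table. The table contents, the function name `pick_well_known_scope`, and the choice of the colon as the split point are assumptions made for this example only; they are not part of the Handle System or of any OGC specification.

```python
# Hypothetical table of Well Known Scopes, keyed by the readable
# leading part of the prefix. The values are placeholder names.
WELL_KNOWN_SCOPES = {
    "hdl": "Handle System",
    "urn": "URN resolver",
}


def pick_well_known_scope(identifier):
    """Return (scope name, remainder) for the scheme that starts the identifier."""
    scheme, sep, rest = identifier.partition(":")
    if not sep or scheme not in WELL_KNOWN_SCOPES:
        raise LookupError("no Well Known Scope for %r" % identifier)
    return WELL_KNOWN_SCOPES[scheme], rest


scope_name, remainder = pick_well_known_scope("hdl://cnri.test/abcd/efg/ijk#123")
```

Everything after the scheme is passed on untouched, since only the top-level choice relies on the prefix being readable.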

    What resolution does

    [Figure: a scoped id being resolved into a sub-scope and a new (scoped) id]

    An identifier is resolved by giving it to the Scope object in which it is defined (the invisible one), which (conceptually) strips off the prefix, finds the Scope object that the prefix names, and hands the suffix to that Scope object. When the process reaches a leaf identifier, the stack unwinds and the leaf identifier is returned together with whatever information is necessary for the original enquirer to make use of it.
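The resolution procedure described above can be sketched as a short recursive routine. This is a toy model under stated assumptions: the `Scope` class, its `sub_scopes` and `leaves` tables, the "/" separator between prefix and suffix, and the sample identifiers are all invented for illustration. It returns only the leaf identifier, not the additional information an enquirer might need.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """A toy Scope: maps prefixes to sub-scopes and local ids to leaf ids."""
    name: str
    sub_scopes: dict = field(default_factory=dict)  # prefix -> Scope
    leaves: dict = field(default_factory=dict)      # local id -> leaf identifier

    def traverse(self, identifier):
        """Strip the prefix; return (sub-scope, suffix), or None if none matches."""
        prefix, sep, suffix = identifier.partition("/")
        if sep and prefix in self.sub_scopes:
            return self.sub_scopes[prefix], suffix
        return None

    def resolve(self, identifier):
        """Resolve down to a leaf identifier, recursing through sub-scopes."""
        step = self.traverse(identifier)
        if step is None:
            # No further scope to descend into: look the identifier up here.
            return self.leaves[identifier]  # KeyError if unknown in this scope
        sub_scope, suffix = step
        return sub_scope.resolve(suffix)


# Three nested scopes (all names hypothetical).
repo = Scope("repo", leaves={"road-17": "leaf:0001"})
agency = Scope("agency", sub_scopes={"repo": repo})
top = Scope("top", sub_scopes={"agency": agency})

leaf_id = top.resolve("agency/repo/road-17")
```

Each call hands only the suffix to the next scope, so no scope needs to understand the syntax used by the scopes below it.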

     

    Resolution, discovery and access

    For software specification it is best to keep orthogonal functions specified entirely separately as distinct interfaces. Any implementation may choose to support one or several interfaces. We have three quite separate specifications to think about: resolution, discovery and access.

    Handles and descriptors

    We understand a feature descriptor to be something which is resolved by a Catalog Service. A feature descriptor contains "sufficiently unique metadata".

    This discussion of scoped identifiers covers what we previously called a feature handle.

    Permanent scope objects

    Anything could offer an identifier resolution service within an "understood" scope, but we need to ensure that scoped-identifiers are only issued containing scope object prefixes for scope objects with certain minimum required qualities of permanence and immutability.

    Catalog services are something which we could also cover by this same scoped identifier mechanism.

    Uniqueness

    The same feature can be in several Scopes. There does not seem to be any way to avoid this since anyone can set up an identifier resolution service which then defines its own new scope.

    Thus the same feature (a specific software representation of a real world object) can have several identifiers. It will have the same leaf identifier, but this could appear in published form hidden inside several different scoped identifiers via different scopes.

    Thus we lose equality by value for published identifiers; however we can define an equality method on identifiers if the method resolves them down to the lowest common denominator scope. Because scopes form a DAG not a tree, and because identifier length does not tell you anything about how many prefixes may be hidden inside opaque suffixes, this means resolving all the way down to the leaf identifier in each case.
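A minimal sketch of such an equality method follows, assuming a toy resolver. The two lookup tables, the scope names, the "/" separator and the sample identifiers are hypothetical. Scope "A" is reachable by two routes, so the scopes form a DAG rather than a tree.

```python
# Hypothetical tables: (scope, prefix) -> next scope, and
# (scope, local identifier) -> leaf identifier.
SUB_SCOPES = {
    ("top", "agencyA"): "A",
    ("top", "agencyB"): "B",
    ("B", "mirror"): "A",   # second route into scope A
}
LEAVES = {
    ("A", "road-17"): "leaf:0001",
    ("A", "road-18"): "leaf:0002",
}


def resolve_to_leaf(scope, identifier):
    """Follow prefixes scope by scope until a leaf identifier is reached."""
    while True:
        prefix, sep, suffix = identifier.partition("/")
        if sep and (scope, prefix) in SUB_SCOPES:
            scope, identifier = SUB_SCOPES[(scope, prefix)], suffix
        else:
            return LEAVES[(scope, identifier)]


def is_equal(scope, id_a, id_b):
    """Two published identifiers are equal if they resolve to the same leaf."""
    return resolve_to_leaf(scope, id_a) == resolve_to_leaf(scope, id_b)


same = is_equal("top", "agencyA/road-17", "agencyB/mirror/road-17")
different = is_equal("top", "agencyA/road-17", "agencyA/road-18")
```

The two identifiers in the first comparison differ as strings and in length, yet name the same feature, which is why comparison by value on the published form is not enough.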

    We need to make the following restriction:

    This is tricky, since data repositories can always be copied. So we need to think about how to define scopes for copies, and to consider replication possibilities to see if we can relax this condition under certain carefully-controlled circumstances.

    More formal notes

    What properties must an identifier have?

    A Well Known Scope is a Scope Environment, e.g. X.500, CORBA Locator Service, DNS (machine identifiers), URL (file identifiers if the location is permanent).

    Other Scope Environments might be:

    A Well Known Scope is a scope of scope identifiers.

    Internal identifiers and internal scopes

    All identifiers discussed in this note are published, external identifiers. Any repository can use any internal identification mechanism it likes so long as it can support permanent, immutable external identifiers.

    As an example, consider a system which uses 8-digit numbers as internal feature identifiers. Such a system might want to be able to represent foreign identifiers (e.g. to implement some feature-feature relationships) and it might do so by internally using identifiers of the form "9xxxxxxx", the initial "9" indicating an external identifier. Such a system would then have a bit of software in its export subsystem which published its own identifiers as scoped-identifiers and replaced the foreign identifiers with the original foreign scoped identifier (stored in some local registry).

    The interesting thing about this example is that the initial "9" can itself be considered to be a scope-prefix inside the proprietary system. This illustrates the general points that

    The important property of an identifier is that it is resolvable, not that it is readable.
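The export step in the 8-digit example might look something like the sketch below. The scope prefix `example-repo`, the registry contents, the "/" separator and the function name are assumptions made for illustration; the note itself does not prescribe any of them.

```python
# Assumed published scope prefix for this repository (hypothetical).
OWN_SCOPE_PREFIX = "example-repo"

# Local registry: internal "9xxxxxxx" number -> original foreign scoped identifier.
FOREIGN_REGISTRY = {
    "90000001": "hdl://cnri.test/abcd/efg/ijk#123",
}


def export_identifier(internal_id):
    """Turn an internal 8-digit identifier into a published scoped identifier."""
    if len(internal_id) != 8 or not internal_id.isdigit():
        raise ValueError("expected an 8-digit internal identifier")
    if internal_id.startswith("9"):
        # The leading "9" acts as an internal scope prefix for "the outside world".
        return FOREIGN_REGISTRY[internal_id]
    return OWN_SCOPE_PREFIX + "/" + internal_id


own = export_identifier("12345678")
foreign = export_identifier("90000001")
```

The internal numbers never leave the system: own features gain the repository's scope prefix, and foreign ones are swapped back to the identifier under which they were originally published.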

    Scope aliases

    Some scopes may exist only to provide short-forms of identifiers, e.g. in the above example a single digit "9" represented the entire outside world. A deeply-nested identifier could accumulate a very long string of prefixes, so within an organisation or information community, a scope resolver which provided short-form aliases could be useful.

    Scopes can cross-reference each other; thus the opaque part of an X.500 scope may be a Handle System server.
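A short-form alias scope of the kind described above could be as simple as a lookup table. In this sketch the alias `ext`, the long prefix chain and the "/" separator are all hypothetical.

```python
# Hypothetical alias table held by an organisation's alias scope:
# short form -> the long chain of prefixes it stands for.
ALIASES = {
    "ext": "wks-example/agencyA/regional/repo",
}


def expand_alias(short_id, separator="/"):
    """Replace a leading alias with its full prefix chain; pass others through."""
    alias, sep, suffix = short_id.partition(separator)
    if sep and alias in ALIASES:
        return ALIASES[alias] + separator + suffix
    return short_id


long_form = expand_alias("ext/road-17")
unchanged = expand_alias("agencyB/road-17")
```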

    Published identifiers

    Any identifier published, e.g. in an email or quoted by some other piece of general purpose software for any value-adding purpose, must quote scope prefixes all the way back to the most-global Well Known Scope. This is where it is particularly useful if the prefixes understood by Well Known Scopes are not opaque but are readable.

    The Handle System includes a large number of scopes, but all of them have "hdl://" as the first part of their prefix.

    FeatureCollection

    Conceptually, there exists a FeatureCollection of all the features whose identifiers are in the Scope, but they are just implicitly associated. The FeatureCollection may not be instantiable even in theory for some Scopes.

    Methods

    A Scope Object has these methods:

    {scope, id} Traverse(id)
    boolean IsEqual(id,id)
    boolean IsLeaf(id)
    object GetLeaf(id)
    feature GetLeafAsFeature(id)
    id GetLeafId(Object)
    id ComposeId(scope,id)

    This last method is arguably implementable because we always know where we are in the scope sequence: we must always have come "down" some route through the scopes' DAG to get to the scope we are "in" now. Being able to produce a publishable identifier is clearly a requirement; how it is done should be left to responses to an RFP.

    A Scope is probably an Interface, not a Class (using Java nomenclature), i.e. the set of Scope methods could be supported by many different objects of different classes.

    In discussion, we provisionally decided that we did not want or need a method on Feature where you give it a scope and ask what its publishable identifier would be from that scope.
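The idea that a Scope is an interface rather than a class can be sketched with a structural type. The method names below are snake-case renderings of the seven methods listed in this section; the parameter and return types, the `FlatScope` class, and in particular the meaning given to `compose_id` are assumptions for illustration only, since the note leaves their behaviour to be specified.

```python
from typing import Any, Protocol, Tuple, runtime_checkable


@runtime_checkable
class Scope(Protocol):
    """Any object offering these seven methods can act as a Scope."""
    def traverse(self, identifier: str) -> Tuple["Scope", str]: ...
    def is_equal(self, id_a: str, id_b: str) -> bool: ...
    def is_leaf(self, identifier: str) -> bool: ...
    def get_leaf(self, identifier: str) -> Any: ...
    def get_leaf_as_feature(self, identifier: str) -> Any: ...
    def get_leaf_id(self, obj: Any) -> str: ...
    def compose_id(self, scope: "Scope", identifier: str) -> str: ...


class FlatScope:
    """Hypothetical leaf-only scope: every identifier it knows is a leaf."""

    def __init__(self, prefix, features):
        self.prefix = prefix
        self.features = dict(features)  # leaf identifier -> feature object

    def traverse(self, identifier):
        return self, identifier  # nothing below this scope to descend into

    def is_equal(self, id_a, id_b):
        return id_a == id_b  # leaf identifiers compare by value

    def is_leaf(self, identifier):
        return identifier in self.features

    def get_leaf(self, identifier):
        return self.features[identifier]

    def get_leaf_as_feature(self, identifier):
        return self.features[identifier]

    def get_leaf_id(self, obj):
        for leaf_id, feature in self.features.items():
            if feature is obj:
                return leaf_id
        raise KeyError("object is not held in this scope")

    def compose_id(self, scope, identifier):
        # Assumed meaning: prepend the given scope's prefix to the identifier.
        return scope.prefix + "/" + identifier


road = {"name": "road 17"}
flat = FlatScope("repo", {"leaf:0001": road})
supports_scope = isinstance(flat, Scope)
```

`FlatScope` never inherits from `Scope`; it qualifies simply by offering the methods, which is the sense in which many unrelated classes could support the same interface.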

    To be done...

    Further Reading

    Added on 22 November 1998:

    This list includes papers which I discovered only after I wrote the above paper. They are taken into account, with this paper, in current working drafts of OGC Abstract Specification Topic 5 (copyright © OGC, 1998 and only available on the member area of the OGC website). None of this appears in the public, released copy of the Abstract Specification yet.

    1. Feature Identifier Notes: why we need them.
    2. Uniform Resource Names (urn) Charter
    3. Uniform Resource Names (URN) Progress Report
    4. URNs: URL replacement project
    5. W3C: URNs and URIs
    6. [URNDNS] Daniel, R., Mealling, M.: Resolution of Uniform Resource Identifiers using the Domain Name System, Internet Engineering Task Force (IETF) Request For Comment (RFC) 2168, June 1997, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2168.txt
    7. [URNhttp] Daniel, R.: A Trivial Convention for using HTTP in URN Resolution, Internet Engineering Task Force (IETF) Request For Comment (RFC) 2169, June 1997, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2169.txt
    8. [URN] Moats, R.: Uniform Resource Name Syntax, Internet Engineering Task Force (IETF) Request For Comment (RFC) 2141, May 1997, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2141.txt
    9. [URNres] Sollins, K.: Architectural Principles of Uniform Resource Name Resolution, Internet Engineering Task Force (IETF) Request For Comment (RFC) 2276, January 1998, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2276.txt
    10. [URIres] Mealling, M., Daniel, R.: URI Resolution Services Necessary for URN Resolution, Internet-Draft version 6, Internet Engineering Task Force (IETF) URN Working Group Work In Progress, March 1998.
    11. The Handle System
    12. Kahn/Wilensky Architecture
    13. Sun, S.X.: Handle System: A Persistent Global Name Service -- Overview and Syntax, Internet Engineering Task Force (IETF), Work in progress -- Internet Draft July 16, 1998, Document draft-sun-handle-system-01.txt
    14. Leach, P.J., Salz, R.: UUIDs and GUIDs, Internet Engineering Task Force (IETF), Work in progress -- Internet Draft February 4, 1998, Document draft-leach-uuids-guids-01.txt, see also http://www.ics.uci.edu/~ejw/authoring/
    15. The Digital Object Identifier
    16. Research in Digital Libraries
    17. Architecture for Information in Digital Libraries
    18. Electronic References and Scholarly Citations
    19. UKOLN Uniform Resource Names
    20. An Index of WWW Addressing Schemes
    21. [URI] Berners-Lee, T., Fielding, R., Masinter, L.: Uniform Resource Identifiers (URI): Generic Syntax, Internet Engineering Task Force (IETF) Request For Comment (RFC) 2396, August 1998, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2396.txt
    22. [Bishr99] A Globally Unique Persistent Object ID for Geospatial information Sharing, Yaser A. Bishr, Interop'99 submission. http://www.opengis.org/members/fid.wg/index.htm
    23. [Sargent99] Feature Identities, Descriptors and Handles, Philip Sargent, Interop'99 submission. http://purl.oclc.org/NET/sargents/Philip/feature-ids/base.html
    24. Feature Identifier Registries for Update and Maintenance: Incremental Publishing, Philip Sargent, 1 October 1998.
    25. [Arctur98] Issues and prospects for the next generation of the spatial data transfer standard (SDTS), David Arctur, David Hair, George Timson, E.Paul Martin, Robin Fegeas. IJGIS (1998) 12 (4) 403-425.
    26. [Hair97] Feature Maintenance Concepts, Requirements, and Strategies, Version 3.0, May 28, 1997, David Hair, EROS Data Center, George Timson, Mid-Continent Mapping Center, Paul Martin, Rocky Mountain Mapping Center. Published by U.S. Geological Survey/National Mapping Division. http://www.opengis.org/members/fid.wg/index.htm

     


    This work was performed at the European Commission Joint Research Centre.
