LSID
Life Science Identifiers (LSIDs)[1][2] are a way to name and locate pieces of information on the web. An LSID is a unique identifier for a piece of data, and the LSID protocol specifies a standard way to locate that data and a standard way to describe it. In this respect LSIDs resemble the DOIs used by many publishers.
An LSID is represented as a uniform resource name (URN) with the following format:
- urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]
The lsid: namespace, however, is not registered with the Internet Assigned Numbers Authority (IANA), and so these are not strictly URNs or URIs.[3]
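The format above can be illustrated with a minimal Python sketch that splits an LSID into its parts. The names `LSID` and `parse_lsid` are hypothetical and not part of any LSID library. The sketch also assumes that the `urn:lsid:` prefix may be matched case-insensitively and that no component contains a colon.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]."""
    authority: str
    namespace: str
    object_id: str
    version: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its components (illustrative helper)."""
    parts = text.strip().split(":")
    # Expect "urn", "lsid", three mandatory fields and an optional version.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    # Assumption of this sketch: the prefix is compared case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty LSID component in: {text!r}")
    # A missing or empty version field is reported as None.
    version = (parts[5] or None) if len(parts) == 6 else None
    return LSID(authority, namespace, object_id, version)


if __name__ == "__main__":
    print(parse_lsid("urn:lsid:zoobank.org:pub:CDC8D258-8F57-41DC-B560-247E17D3DC8C"))
```

For the ZooBank identifier used in this article, the authority is `zoobank.org`, the namespace is `pub`, the object ID is the UUID-like string, and no version is present.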
LSIDs may be resolved through HTTP URLs that embed the identifier, e.g. http://zoobank.org/urn:lsid:zoobank.org:pub:CDC8D258-8F57-41DC-B560-247E17D3DC8C
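The URL form shown above can be sketched in Python as follows. The helpers `lsid_to_url` and `url_to_lsid` are hypothetical names, and the pattern of appending the LSID as the path of a resolver's base URL is inferred only from the ZooBank example. Other resolvers may use a different layout.

```python
from urllib.parse import urlparse


def lsid_to_url(lsid: str, resolver_base: str) -> str:
    """Append an LSID to a resolver base URL (pattern assumed from the example)."""
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError(f"not an LSID: {lsid!r}")
    return resolver_base.rstrip("/") + "/" + lsid


def url_to_lsid(url: str) -> str:
    """Recover the LSID from a URL whose path is the LSID itself."""
    candidate = urlparse(url).path.lstrip("/")
    if not candidate.lower().startswith("urn:lsid:"):
        raise ValueError(f"no LSID found in: {url!r}")
    return candidate


if __name__ == "__main__":
    lsid = "urn:lsid:zoobank.org:pub:CDC8D258-8F57-41DC-B560-247E17D3DC8C"
    url = lsid_to_url(lsid, "http://zoobank.org")
    print(url)
    print(url_to_lsid(url))
```

The resolver base is passed in explicitly rather than derived from the authority field, since the host that serves an LSID need not be identical to its authority string.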
Controversy over the use of LSIDs
LSIDs have attracted considerable interest in both the bioinformatics and the biodiversity communities, and the latter continues to use them to identify species in global catalogues.[4] However, as understanding has grown of how HTTP URIs can perform a similar naming task,[5][6] the use of LSIDs as identifiers has been criticized[7] as violating the Web Architecture good practice of reusing existing URI schemes.[8] Nevertheless, three features were crucial parts of the LSID and its resolution specification and have not been successfully mimicked by an HTTP-only approach: the explicit separation of data from metadata, a specified method for discovering multiple locations for data retrieval, and the ability to discover multiple independent sources of metadata for any identified thing.
The World Wide Web provides a globally distributed communication framework that is essential for almost all scientific collaboration, including bioinformatics. However, it was thought to have several limits and inadequacies, one of which was the inability to programmatically identify locally named objects that may be widely distributed over the network. This perceived shortcoming would have limited the ability to integrate multiple knowledgebases, each of which gives partial information about a shared domain, as is common in bioinformatics. The Life Science Identifier (LSID) and the LSID Resolution System (LSRS) were designed to solve this problem by extending existing internet technologies, in a manner consistent with the Semantic Web and the semantic grid. However, it has since been pointed out that some of these perceived shortcomings are not intrinsic to HTTP URIs, and that much (though not all) of the functionality LSIDs provide can be obtained with properly crafted HTTP URIs.[5]
See also
- Resource Description Framework
- MicroArray and Gene Expression
- ZooBank
- MycoBank
- MIRIAM URIs
- International Geo Sample Number
- SciCrunch
Notes
- ↑ Clark T., Martin S., Liefeld T. Briefings in Bioinformatics 5.1:59-70, March 1, 2004.
- ↑ Technology Report on OMG Life Sciences Identifiers Specification (LSID)
- ↑ http://wiki.tdwg.org/twiki/bin/view/GUID/LSID
- ↑ Andrew C Jones, Richard J White and Ewen R Orme, Identifying and Relating Biological Concepts in the Catalogue of Life. Journal of Biomedical Semantics 2011, 2:7
- ↑ Booth, David: "Converting New URI Schemes or URN Sub-Schemes to HTTP"
- ↑ Thompson, Henry S.: "A precedent suggesting a compromise for the SWHCLS IG Best Practices", publicly archived email message
- ↑ Mendelsohn, Noah: "My conversation with Sean Martin about LSIDs", public email to W3C TAG mailing list 25-Jul-2006
- ↑ W3C Architecture of the World Wide Web
External links
- Life Sciences Identifiers: OMG Final Available Specification
- LSID Resolution Project
- LSID best practices from IBM
- LSID Assigning and Resolution Authority from The University of Texas at Austin
- Life Science Identifier from BioPathways Consortium
- LSID Tester
- A Position on LSIDs - Reflections from someone involved in the implementation and roll-out of LSIDs
- http://identifiers.org