rdmpage
Post subject: A structured wiki of all taxonomic names Posted: Tue Apr 07, 2009 3:55 pm
Joined: Tue Apr 07, 2009 1:07 pm Posts: 8 Location: University of Glasgow
Background
One of the most basic things biodiversity science lacks is a list of taxonomic names. We have various online lists of names, none of which are complete, few of which are linked to the original source of the names, and all of which contain errors. These lists are typically "closed", in the sense that there is no quick and easy way for a user who discovers an error to correct it, or to add additional information. Furthermore, each web site assigns its own identifier to a name, resulting in a multiplicity of identifiers for the same name (see http://ispiders.blogspot.com/2008/05/li ... s-why.html ).
Wiki
One approach to this problem is to create a wiki of taxonomic names, where the initial content for the wiki comes from bulk population from existing databases. A wiki enables users to annotate, edit, and correct entries in the underlying database. I have been exploring using a structured wiki to store and integrate information on taxonomic names, classifications, publications, specimens, images, collections, and genomic sequences. Unlike previous efforts (such as WikiSpecies, http://species.wikimedia.org ), this wiki is built upon Semantic MediaWiki ( http://semantic-mediawiki.org/ ), which provides tools for expressing relationships between objects (such as names, specimens, and publications), as well as between identifiers. For background on this approach see http://iphylo.blogspot.com/search?q=wiki
Demo
A developmental version of this wiki is online at http://itaxon.org/wikidev/ . Examples of what can be done are:
Data for this wiki comes from existing taxonomic and specimen databases, from a literature resolver ( http://bioguid.info/openurl ), and from data mining scientific publications.
Challenge
My goals are to:
demonstrate how data can be integrated into the wiki;
demonstrate how a scientific publication can be integrated into the wiki;
demonstrate how linking data from different sources adds value to an individual data record;
demonstrate the value of linking data using global identifiers.
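To give a rough idea of what "expressing relationships between objects" can look like in Semantic MediaWiki, here is a minimal Python sketch that renders inline annotations of the form [[Property::Value]] for a name page. The property names, the page title of the publication, and the identifier are made up for illustration and are not taken from the actual wiki.

```python
def smw_annotations(properties):
    """Render property -> value(s) as Semantic MediaWiki inline annotations."""
    lines = []
    for prop, values in properties.items():
        # Accept a single string or a list of strings for each property.
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append(f"[[{prop}::{value}]]")
    return "\n".join(lines)


# Hypothetical properties for a taxonomic name page.
wikitext = smw_annotations({
    "Published in": "Example publication page",
    "Has identifier": ["urn:lsid:example.org:names:1"],
})
print(wikitext)
```

Each annotation links the page to another page or value through a named property, which is what allows the wiki to be queried as structured data rather than free text.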

qgroom
Post subject: Re: A structured wiki of all taxonomic names Posted: Thu Apr 09, 2009 12:56 am
Joined: Wed Feb 11, 2009 10:00 am Posts: 19
While I agree with what you are saying, I don't think a new wiki is required. I think that the CoL etc. should adopt some aspects of the wiki idea. Perhaps editing cannot be as free as in a wiki, but one should be able to annotate a record.
I watch the progress of Wikispecies with interest, and while it is rather unscientific it may prove to be a useful resource. I expect the CoL and Wikispecies will evolve.
Regarding LSIDs: I feel they should relate to unique objects, whether physical or digital. A taxonomic name is not a unique object, but a concept. I assume the LSIDs in the CoL refer to the entries in the CoL, not the names themselves. One thing I have not seen is the propagation of LSIDs from one source to another. Shouldn't the CoL be passing on LSIDs from the sources of their data? Of course, most of these sources don't have LSIDs enabled yet.
regards
Quentin

rdmpage
Post subject: Re: A structured wiki of all taxonomic names Posted: Thu Apr 09, 2009 5:09 am
Joined: Tue Apr 07, 2009 1:07 pm Posts: 8 Location: University of Glasgow
Wiki
I'm making a wiki partly to demonstrate the potential of the approach. If others adopt some of these techniques, that's great. I'd argue that more than annotation is required. All databases have errors, and the quickest way to fix those is to make the data editable. Databases may claim that their records have been scrutinised by experts, but experts still make mistakes. For Wikispecies to be useful I think it will need to be more structured, and it desperately needs more data. I don't think the project has been a success, especially given that Wikipedia is frequently more informative and better referenced.
Identifiers
Reuse of identifiers is crucial if we are going to link things together (see http://dx.doi.org/10.1093/bib/bbn022 for more on this; a preprint is available at http://hdl.handle.net/10101/npre.2008.1760.1 ). Catalogue of Life does link to other LSIDs where they are available (e.g., Index Fungorum). I can't show you a live example because the Catalogue of Life LSID service is currently down (biodiversity informatics has a poor record of service availability, but that's another issue), but a cached example is here: http://linnaeus.zoology.gla.ac.uk/~rpag ... 1c4692.rdf .
Names
I view taxonomic names as tags. I think it makes sense to have an identifier (such as an LSID) for a name, which would resolve to information on the name and its publication details. These names are used to tag data, observations, etc. Of course, not all data tagged with the same name may refer to the same taxon, but that's a separate issue. I do think the proliferation of identifiers for names, resulting in multiple identifiers for the same name, is hindering efforts to integrate data.
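To make the identifier proliferation point concrete, here is a small Python sketch that groups the identifiers different sources have assigned to the same name string. The source labels and identifiers are invented for the example; the only real assumption is that records arrive as (source, identifier, name) triples.

```python
from collections import defaultdict


def group_identifiers(records):
    """Map each name string to the set of identifiers assigned to it."""
    by_name = defaultdict(set)
    for _source, identifier, name in records:
        # Collapse runs of whitespace so trivially different strings match.
        normalised = " ".join(name.split())
        by_name[normalised].add(identifier)
    return dict(by_name)


# Hypothetical records: source-a mints an LSID, source-c reuses it,
# source-b mints its own identifier for the same name.
records = [
    ("source-a", "urn:lsid:example.org:names:101", "Pinnotheres atrinicola"),
    ("source-b", "example-b:77", "Pinnotheres  atrinicola"),
    ("source-c", "urn:lsid:example.org:names:101", "Pinnotheres atrinicola"),
]

groups = group_identifiers(records)
for name, identifiers in groups.items():
    print(name, sorted(identifiers))
```

When a source reuses an existing identifier (as source-c does here), the set for that name does not grow; every newly minted identifier adds another entry that someone has to reconcile later.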

qgroom
Post subject: Re: A structured wiki of all taxonomic names Posted: Thu Apr 09, 2009 5:32 am
Joined: Wed Feb 11, 2009 10:00 am Posts: 19
Regarding Wiki & Identifiers, I totally agree.
Regarding identifiers for names: if you use an LSID for a name and not for a single usage of that name, then the name itself becomes redundant. What if a taxonomist splits a taxon into two, or lumps taxa? What do we do with the LSIDs then, if they refer to the name? However, I think we both agree that the proliferation of meaningless LSIDs is a bad thing. The CoL should be passing on LSIDs from their sources before creating new ones.
Regards
Quentin

rdmpage
Post subject: Re: A structured wiki of all taxonomic names Posted: Thu Apr 09, 2009 6:58 am
Joined: Tue Apr 07, 2009 1:07 pm Posts: 8 Location: University of Glasgow
Each usage of a name pairs a name (for which we need an identifier) with a use of that name (such as a publication, for which we also need an identifier). I don't see the reason for not having a single identifier for a name. Two different usages are still usages of the same name. I think we need to clearly separate names (as in text strings published in a formal way) from what we take a name to mean. Names, in a sense, are simple objective facts (e.g., I published the name Pinnotheres atrinicola in 1983). Claims about what data that name covers are another matter.
My own view is that having an identifier for each name usage is probably unnecessary, and if names are tags then the interpretation of a name is basically a query (what information is tagged with that name?). Different sources may have different views on whether all information so tagged applies to the same organism. For example, the NCBI Taxonomy database will have a view on what data tagged with a name comprise the same taxon.
Why not try and keep things simple...?
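As a minimal sketch of the "names are tags, interpretation is a query" idea: each usage pairs a name identifier with the identifier of the thing tagged, and asking what a name means becomes a lookup over those pairs. All identifiers below are placeholders, and the Usage class is only an illustration, not a proposed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    name_id: str    # identifier for the name (one per name)
    source_id: str  # identifier for the publication, specimen, sequence, ...


def tagged_with(usages, name_id):
    """Return the identifiers of everything tagged with the given name."""
    return [u.source_id for u in usages if u.name_id == name_id]


# Placeholder identifiers, for illustration only.
usages = [
    Usage("name:1", "publication:A"),
    Usage("name:1", "specimen:B"),
    Usage("name:2", "publication:C"),
]

print(tagged_with(usages, "name:1"))
```

No identifier is minted per usage here: the pair (name_id, source_id) already identifies it, and whether everything returned really belongs to one taxon is left to whoever interprets the result.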

DavidRemsen
Post subject: Re: A structured wiki of all taxonomic names Posted: Mon Apr 13, 2009 11:42 am
Joined: Fri Mar 13, 2009 4:35 am Posts: 1
I'd like to jump in here with some thoughts now that I got my login figured out.
First, this is a topic that could get too detailed too fast and I don't want to swamp the post. So I'll cut this into a set of simpler statements and risk generalising.
My motivations in this area are grounded in the same assertion Rod made at the beginning of the thread. Biodiversity science lacks a complete list of taxonomic names, and the result is a number of issues that can significantly hamper finding and integrating information about taxa. When I started uBio and the precursors to it, it was simply to have a place to store names I found in content that I was curating, and to tie them to taxonomic checklists that I could identify and index and that treated those names in some way. One issue that has cropped up repeatedly throughout my interactions with people occupying different roles in biodiversity informatics concerns this nebulous area in which names, concepts and usages all overlap. This impacts the use of identifiers as well as what those identifiers return when they are resolved.
There is a superset of all usages that pair a scientific name with some object of interest to someone. It includes the 1 million+ occurrence records for Passer domesticus that sit in the GBIF index now. Or the hundreds of thousands of nucleotide sequences for E. coli in GenBank. It includes millions of pages in the BHL and other literature repositories. These pairings of a name with the source object it is tied to cover a lot of material across a lot of knowledge domains. GBIF, BHL, NCBI and others are indexing some of it, and the term I made up for the whole mess is the "BIG index" (BIG for "biodiversity indexing group", because it described it nicely and I like how it sounds).
A subset of the BIG index consists of usages that a taxonomist might find relevant. This includes various nomenclatural acts, references to specimens, publications that contain taxonomic assertions. These came up at a TDWG luncheon and we are currently funding something called the Global Names Usage Bank (Rich Pyle term) that is exploring developing an index of these as a precursor to a more unified framework for nomenclators and taxonomic databases. Conceptually, it's those usages that provide the taxonomic and nomenclatural context for all the other usages.
What I am looking for are sources of information ABOUT names, specifically semantic and syntactic information about names. Logically, I am interested in identifiers that provide nomenclatural and taxonomic information about names and that distinguish between the two. The former is based on facts and can be leveraged more widely than the latter, which is based on opinion. I do think that names exist as objects independent of usages, or rather have properties that transcend individual usages. I have found it convenient to refer to names as having at least three properties which must be collectively addressed and properly collated. That is, names as strings, as nomenclatural acts, and as taxonomic concepts.
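One way to picture those three properties side by side is a record that keeps them as separate fields, so the fact-based parts can be shared without dragging the opinion-based part along. This is only a Python sketch under my own assumptions: the class, the field names and the example concept label are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameRecord:
    name_string: str                           # the name as a text string
    nomenclatural_act: Optional[str] = None    # fact: the act that published it
    concepts: List[str] = field(default_factory=list)  # opinion: taxon concepts

    def facts(self):
        """Return only the fact-based properties of the name."""
        return {
            "name_string": self.name_string,
            "nomenclatural_act": self.nomenclatural_act,
        }


record = NameRecord("Pinnotheres atrinicola", nomenclatural_act="published 1983")
# A made-up label standing in for one source's opinion about the taxon.
record.concepts.append("hypothetical concept from checklist X")

print(record.facts())
```

Keeping the concepts in their own list means several competing opinions can hang off the same name string and nomenclatural act without duplicating either.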
What I am most interested in is a collective framework for inventorying and accessing taxonomic opinion in a consistent manner that is tied to a reconciled and complete nomenclatural index. This must of course deal with the full and messy orthography of reality. I see this as including but transcending the efforts of the Catalogue of Life, as it ties into a number of other initiatives and includes a much wider range of taxonomic opinion. Ultimately we should have the capacity to tie a usage either to what I refer to as a "nominal concept", meaning one where the conceptual reference is unknown or ambiguous, or to a more specific usage. That usage should be defined and resolvable to give it meaning. I think if we had a library of defined concepts, accessible and tied to a common nomenclature that itself was resolved to originating nomenclatural acts, we could in many cases find that the taxon concept issue was actually not a problem and, in cases where it is, could develop methods for estimating the conceptual intent of a usage with high confidence. Ultimately, I'd like to see a simple mechanism for tying a prospective usage to a taxon identifier and a reasonable retrospective framework for doing the same.
Enough for now. I guess I should be putting this into the Global Names Architecture OCC.
David Remsen

mauri
Post subject: Re: A structured wiki of all taxonomic names Posted: Mon Apr 13, 2009 12:20 pm
Joined: Tue Feb 10, 2009 8:53 am Posts: 184 Location: University of Helsinki, FINLAND
Hi David, I agree with everything you wrote. Biodiversity informatics education would also benefit from the implementation of a structured wiki of all taxonomic names.
By the way, you have taken interesting nature photos, e.g.: http://www.bioshare.net/dabuh/collectio ... index_html
Your blog on global bees is also an interesting theme from an educational point of view: http://globalbees.editwebrevisions.info/en/user/13
Your and your colleagues' scientific material could also be used for educational purposes, to promote lifelong biodiversity informatics education and sustainable use of biodiversity, if you are interested in these kinds of issues.
With good wishes
Mauri
_________________ Dr. Mauri Ahlberg FLS Professor of Biology and Sustainability Education University of Helsinki, FINLAND