Repatriation of knowledge about insects and types through the DORSA virtual museum (Digital Orthoptera Specimen Access)

With 4 figures

DORSA (Digital Orthoptera Specimen Access) is a virtual museum joining information on Orthoptera types and voucher specimens scattered over the major German museum collections in a single database. Data for about 16,000 specimen records, including types and vouchers in over 4,000 species, are searchable online via the SYSTAX database (www.biologie.uni-ulm.de/systax), which is linked to both the GBIF (Global Biodiversity Information Facility) and BIOCASE (Biological Collection Access Service for Europe) portals. Roughly 8,000 type specimens (with about 2,300 primary types) are documented with over 30,000 images. 12,000 sound files are also available, as are geographical information and maps of the specimens in the database. DORSA specimen information is reciprocally linked to the Orthoptera Species File (OSF), which forms the taxonomic backbone for all taxon names used by DORSA. All DORSA data are freely available on the world-wide web. In this way, the knowledge about type specimens collected since colonial times is repatriated to the countries of origin.

Taxonomic literature is scattered over numerous journals and books, which are often not easily available because most of the classical references, and also many of the new publications, are held in only a few libraries worldwide. A recent bibliography on Orthoptera lists over 14,000 titles with taxonomic significance (INGRISCH & WILLEMSE 2004). Moreover, the knowledge of the diversity of insects has greatly multiplied since the introduction of binominal nomenclature nearly 250 years ago. For the Orthoptera alone, the number of known species rose from only 59 taxa in LINNÉ's (1758) genus Gryllus to almost 30,000 species group names currently listed in the Orthoptera Species File (osf2x.orthoptera.org). Together with the increase of knowledge, the definition of the species concept has changed repeatedly and taxonomic methods have shown great progress. It is thus understandable that historical descriptions are usually insufficient to identify specimens at hand. Fortunately, descriptions of biological species are based on type specimens. In former times, travelling to several museums or borrowing specimens on loan was necessary to compare specimens for the identification of new material. With the rise of the world-wide web, it is now possible to provide museum information on-line, thereby facilitating access to type material, pictures or any other information related to type specimens (RIEDE 2003).

[...] Tibet 1903-1905 (KARNY 1908), Schlaginhaufen to New Guinea (KARNY 1912), the German Central-Africa Expedition 1907-1908 (REHN 1914), the Kaiserin-Augustafluss-Expedition to New Guinea 1912-1913 (e.g. KARNY 1928), and Ramme's revisions of African, Southeast Asian, Southeast European and West Asian Acrididae (e.g. RAMME 1929, 1941, 1951).

Beitr. Ent. 55 (2005)

Due to those and many other expeditions, including modern ones, German museum collections hold type specimens from all over the world (Fig. 1) and account for about ten percent of the species group names in Orthoptera, although the relative numbers vary between subgroups (INGRISCH et al. 2004a, Fig. 2). Owing to Germany's federal political structure, there are several museum collections of similar importance, which means that the material is even more scattered than in other European countries such as Great Britain and France (Fig. 3). This scattered information was digitised and published on the world-wide web as one "Virtual Museum" by DORSA (Digital Orthoptera Specimen Access), thus facilitating access and providing a "virtual centralisation". The type specimen information is provided together with images and songs.

The Orthoptera Species File (OSF, EADES 2001) forms the taxonomic backbone for all taxon names used by DORSA. OSF stores the original name as well as the currently valid name, together with general information on the type specimens. There is a fundamental difference between the two databases. Species databases such as OSF deal with names (taxa), i.e. human concepts. Specimen-based databases such as DORSA deal with real-world objects, such as pinned museum specimens. The two databases are linked by the type specimen, which is the real-world voucher on which a species description is based. DORSA and OSF are two separate relational databases, optimised to administer specimens and names (taxonomic concepts), respectively.

Cooperation between the two databases, and the links between them, allowed multiple verification of the data. The species database gets its data mainly from published information, while the specimen database collects its information from the specimens in the collection and their labels. When the two sets of data disagreed, a closer investigation of the fate of the type specimens became necessary. As a result, type specimen information in OSF could be updated with the information provided by DORSA, while the information in OSF helped to discover type specimens in museum collections that were not properly labelled (INGRISCH et al. 2004b).
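The cross-verification between a name-based and a specimen-based database can be illustrated with a minimal sketch. It is not the actual DORSA or OSF implementation: the record classes, field names, taxon names and museum names below are hypothetical placeholders, and the only assumption is that both data sets can be joined on the taxon name of the type specimen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    """Hypothetical species-database entry: a name and its published type depository."""
    name: str
    type_depository: str


@dataclass(frozen=True)
class SpecimenRecord:
    """Hypothetical specimen-database entry: a physical specimen and its label data."""
    specimen_id: str
    name: str
    museum: str
    labelled_as_type: bool


def cross_check(names, specimens):
    """Return {name: message} for names whose published data disagree with the specimens."""
    # Collect the museums in which a labelled type was actually found, per name.
    types_by_name = {}
    for s in specimens:
        if s.labelled_as_type:
            types_by_name.setdefault(s.name, set()).add(s.museum)

    conflicts = {}
    for n in names:
        museums = types_by_name.get(n.name, set())
        if not museums:
            # Published as deposited somewhere, but no labelled type was databased.
            conflicts[n.name] = "no labelled type found in " + n.type_depository
        elif n.type_depository not in museums:
            # A type exists, but not where the literature says it should be.
            conflicts[n.name] = (
                "type found in " + ", ".join(sorted(museums))
                + ", published as " + n.type_depository
            )
    return conflicts


# Placeholder data, invented purely for illustration.
names = [
    NameRecord("Aus bus", "Museum A"),
    NameRecord("Cus dus", "Museum B"),
    NameRecord("Eus fus", "Museum A"),
]
specimens = [
    SpecimenRecord("S1", "Aus bus", "Museum A", True),
    SpecimenRecord("S2", "Cus dus", "Museum C", True),
    SpecimenRecord("S3", "Eus fus", "Museum A", False),
]
conflicts = cross_check(names, specimens)
for name, message in sorted(conflicts.items()):
    print(name + ": " + message)
```

In this sketch, each reported conflict corresponds to a case that would call for a closer investigation of the type material, either to update the name database or to locate a type that is not properly labelled.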
DORSA compiles data from over 8,000 type specimens, of which about 2,300 are primary types, in almost 3,000 species. Additional voucher specimens for sound recordings and important taxonomic revisions are also included, bringing the total to about 16,000 specimen records in over 4,000 species (Tab. 1). Moreover, 30,000 images of type specimens and 12,000 sound files are included in the DORSA documentation. The sound files were provided by private researchers and became publicly available through DORSA. Sound files help to identify living specimens in the field, because printed journals can only paraphrase Orthoptera songs, which does not give the same impression as hearing the sound itself. Label information is included in the documentation so that mistakes made during databasing can be traced. All DORSA data are freely available via the SYSTAX database (www.systax.de). In this way, the knowledge on the type specimens collected since colonial times is also repatriated to the countries of origin.

The data provided by DORSA can be easily accessed from a variety of sources over the world-wide web. SYSTAX is linked to GBIF (Global Biodiversity Information Facility; www.gbif.org) and BIOCASE (Biological Collection Access Service for Europe; www.biocase.org). Moreover, DORSA specimen information is reciprocally linked to the Orthoptera Species File (OSF). Information about type material in German museums can be retrieved both ways: via SYSTAX or via OSF, the latter of which includes all synonyms and taxonomic references (Fig. 4). As OSF now works as the supplier of systematic information on Orthoptera in biodiversity projects such as Species 2000 (www.sp2000.org), ITIS (www.itis.usda.gov), NCBI (National Center for Biotechnology Information, used by GenBank), and indirectly the Catalogue of Life and GBIF, DORSA specimen information can also be traced from those sources.

The geographic information for many specimens in DORSA had to be updated, as the information given on labels was either not precise, or the names of places or their affiliation to countries had changed since the time of collection. Fortunately, several expedition reports allowed the tracing of localities with sufficient precision. Thus a majority of the locality data could be geo-referenced with latitude/longitude co-ordinates (3,100 out of roughly 5,800 localities). This will help future users to find this information easily, and it allows mapping by a geographical information system (GIS). Distribution maps are available from the DORSA Map Server (www.dorsa.de), which is based on approximately 1,200 localities of katydid (Tettigonioidea) sound records made by K.-G. Heller (cf. WILLEMSE & HELLER 2001), and for all type specimens via SYSTAX, which uses the Canadian Biodiversity Information Facility (www.cbif.gc.ca/home_e.php).

A cricket song classification software based on neural networks was developed (for details see PALM & DIETRICH 2001). It can be used in rapid assessment programs as a non-invasive technique to classify and map acoustic diversity in the field. To put this useful tool into practice, it should now be calibrated with new sound data from a representative number of species in a given area. In many areas of the world, this means that recordings have to be made together with the collecting of specimens and taxonomic revisions.

Fig. 2: Species group names of Orthoptera worldwide (valid names and synonyms) and type specimens in German collections. The category holotype also includes syntypes, lectotypes and neotypes. [From INGRISCH et al. 2004a].

Fig. 3: Orthoptera (crickets and grasshoppers) collections in Germany: number of holotypes in major museums. A similarly scattered pattern is observed for type material of other groups of organisms housed in German museums or research institutions.

Fig. 4: Documentation of DORSA specimen data on the internet. Left: species page in SYSTAX (red arrows mark scrolling down); top right: species page in OSF; middle right: image and sound file pages in SYSTAX; bottom right: distribution map from a map server (Canadian Biodiversity Information Facility). External links are marked by yellow arrows, internal links by green arrows.
A cricket song classification software was developed, based on neural networks trained by data based song (for details see PALM &