Barcoding cryptic bumblebee taxa: B. lucorum, B. cryptarum

Queens of the five taxa of the subgenus Bombus sensu stricto (Bombus sporadicus, B. terrestris, B. lucorum, B. cryptarum and B. magnus) were collected in spring at various localities across Europe in order to rear artificial colonies. Mitochondrial cytochrome oxidase subunit I (COI) was sequenced from 40 samples (partial sequences, 1005 bp in length). Sequence divergence between taxa amounts to about 30 to 60 base substitutions, with Tamura-Nei genetic distances of 0.05–0.25, whereas divergence within taxa amounts to only 1 to 6 base substitutions, with Tamura-Nei genetic distances of 0.002–0.007. In addition to the clusters for B. sporadicus and B. terrestris, the phylogram shows three further clusters: cluster α for B. lucorum, cluster β for B. cryptarum and cluster γ for B. magnus. Clusters α, β and γ of the taxa of the so-called lucorum-complex are clearly separated, with low variability, no overlap and no terminal unit of unclear position. As the COI sequences contain no gaps, single nucleotides can be used as positional homologies. Each taxon has about 8–12 substitutions of its own, which can be used as diagnostic positions to characterize the taxon. Using the classical tools of cladistics, a tree was built from these diagnostic positions. A barcode query correctly identified all doubtful specimens. The topological position of GenBank sequences from misidentified specimens and from specimens with degraded DNA is discussed. Museum specimens of three Asian taxa of unknown affiliation were sequenced to test how far museum specimens with degraded DNA can also be assigned with the help of diagnostic positions.
Identification by morphological and genetic characters is discussed, and the identification of critical specimens by tree (= genetic distance) and by diagnostic positions is compared.


Introduction
With the recent publications of PEDERSEN (1996, 2002), KAWAKITA et al. (2004), HINES et al. (2006) and CAMERON et al. (2007) we have, for the first time, a good general picture of the phylogenetic relationships of most bumblebee species. These molecular investigations show that the long-lasting work on bumblebee taxonomy based on morphological characters has produced reliable results: at the level of subgenera only minor corrections are necessary (mainly in New World subgenera), there are a few conflicting results, and there is more insight into the deeper nodes of the phylogeny. However, at the terminal units of the branches many questions remain, and we need more specimens from a broad range of geographical localities to investigate the genetic polymorphism of the taxa.

In recent years, the availability of genetic information has increased enormously. The inclusion of molecular information in taxonomic research can help to distinguish between species (equivalent to species identification or species diagnosis) and to discover new species (equivalent to species delimitation or species description). Species description and identification are among the most important tasks in biology, because biologists can neither report empirical results nor access published information on a study organism until it is correctly named. HEBERT & GREGORY (2005) described DNA barcoding as a novel system designed to provide rapid, accurate, and automated species identifications by using short, standardized gene regions of cytochrome oxidase subunit I (COI) sequences as internal species tags. MUNCH et al. (2008) provided a statistical method for DNA barcoding based on a Bayesian phylogenetic approach, using automated database sequence retrieval. There is an intense debate about the pros (e. g. TAUTZ et al. 2002, 2003; HEBERT et al. 2003; HEBERT & GREGORY 2005) and cons (e. g. WILL & RUBINOFF 2004; WILL et al. 2005; MEIER et al. 2006; WHEELER 2004, 2008) of these methods. Instead of offering comprehensive theoretical considerations, this study aimed to test empirically whether, despite all the theoretical challenges, DNA barcoding can deliver reliable species identifications, and to compare the results of the morphological and the molecular approach. The critical taxa B. lucorum, B. cryptarum and B. magnus of the so-called Bombus lucorum-complex were used as a case study. Initial reports demonstrated that these taxa can be safely separated by COI sequences (PEDERSEN 2002; BERTSCH et al. 2005; MURRAY et al. 2008).

The identification of many species of the subgenus Bombus sensu stricto (syn. Terrestribombus VOGT) is often difficult because most species share a similar general appearance in colour and morphology, and there is a long-standing discussion about which taxa of the subgenus Bombus have species status and which might be subspecies belonging to a broader species. In Europe, there are five known taxa in the subgenus Bombus s. str.: Bombus (Bombus) terrestris (LINNAEUS, 1758), B. (B.) lucorum (LINNAEUS, 1761), B. (B.) cryptarum (FABRICIUS, 1775), B. (B.) sporadicus NYLANDER, 1848, and B. (B.) magnus VOGT, 1911. Their taxonomic status has been extensively examined based on morphology (KRÜGER 1939, 1951, 1954, 1956, 1958; LØKEN 1973; PEKKARINEN 1979; RASMONT 1984; RASMONT et al. 1986), enzyme electrophoretic data (SCHOLL & OBRECHT 1983; PAMILO et al. 1984; SCHOLL et al. 1992), analyses of the compounds of the male labial glands (PAMILO et al. 1997; BERTSCH 1997; URBANOVÁ et al. 2001; BERTSCH et al. 2004), and DNA data (PEDERSEN 1996, 2002; BERTSCH et al. 2005; HINES et al. 2006; CAMERON et al. 2007; MURRAY et al. 2008). The species status of B. sporadicus, B. terrestris and B. lucorum is generally accepted; however, the taxonomic status of B. magnus and B. cryptarum is still in dispute. Whereas RASMONT (1983), RASMONT et al. (1984), BERTSCH et al. (2004, 2005) and MURRAY et al. (2008) treated both taxa as separate species, WILLIAMS (1991, 1998) grouped them with B. lucorum "interpreted in the broadest sense, to include a complex of similar taxa" (see WILLIAMS 2008).

Bumblebee samples
Females of all five European taxa of the subgenus Bombus s. str. were collected in spring from different localities throughout Europe (see Table 1). After collection, the bumblebees were kept alive in a cool-box. Sometimes the characters essential for identification, such as the tufts of hair on the thorax and abdomen, were soaked and stuck together, especially in wet weather. In such cases, the bees were kept in flight cages with some honey-water, where they cleaned and brushed their hair by themselves, which restored all the essential characters. Morphological details were studied using a stereo microscope (Wild M16, Planar 1.0, Oculars 10x/21). As previously reported by E. KRÜGER (1928, p. 363), hair details are best studied in diffuse light (diffusing filter and indirect light with Novoflex Macrolight Plus) at high magnification by stroking the hair with a fine artist's brush or an insect pin. In this way the distribution of hair on different parts of the thorax, and especially at the end of the collare below the tegulae (border of pronotal lobe and episternite), was carefully investigated.

Identification of specimens
The identification of females () of the so-called lucorum-complex is still under debate, but most fresh specimens can be identified without any problems.Specimens  MAG-01 -MAG-09, CRY-02 -CRY-09, and LUC-02 -LUC-08 were identified without problems by the characters described in RASMONT (1984) andBERTSCH et al. (2004).Using the queens collected in the field artificial colonies were reared in an air-conditioned greenhouse and the morphological identification of the founder female was verified by investigation of the male labial glands in each case.Males of B. lucorum, B. cryptarum and B. magnus can be identified by their specific labial gland secretions (BERTSCH 1997, BERTSCH et al. 2004, 2005).In all cases, labial gland secretions from males of artificial colonies confirmed the identification of the founder queen.Only specimen CRY-03, which was identified by morphological characters as B. magnus, was a misidentification, as the male labial gland secretions (and the DNA sequences) identified this specimen as B. cryptarum.

Critical and unidentified specimens
To test the different methods of identification (by morphology, by male labial glands and by DNA), four females were included whose identification by morphological characters proved to be problematic (MAG-10, CRY-01, LUC-09, and LUC-10). Specimen MAG-10 from Milde was identified as B. magnus, but because the parts of the collare below the tegulae were relatively short there was some uncertainty whether the specimen might belong to B. cryptarum. Specimen CRY-01 from the Orkney Islands was a typical light form of B. cryptarum, but as this species has not yet been recorded from the Orkney Islands it was classified as uncertain. Specimen LUC-09 from Central Spain had a very broad collare reaching below the tegulae and habitually looked like

Tab. 1: List of 40 Bombus samples (MAG = magnus, CRY = cryptarum, LUC = lucorum, TER = terrestris, and SPO = sporadicus) used in the present analysis, with identification codes and collection locality information. Q = , aC → M = artificial colonies with production of males. Shading and ? mark specimens which could not be safely identified.

GenBank data
GenBank data were included (Table 2) in order to enlarge the database and to compare DNA sequences from different laboratories (Belfast, Inuyama/Kyoto, and Copenhagen). The specimens of B. magnus (GenBank accession nos. EF362738, EF362736, AY530014, AY630015), B. cryptarum (AY530011, AY530012), and B. lucorum (AY694095, AY530010) were from the artificial colonies from which I had collected and identified the founder queens. These colonies produced males, and male labial gland secretions verified the morphological identification of these females.
In living organisms, DNA damage is repaired by various enzymatic mechanisms. However, once the metabolic pathways of a cell cease to operate, the DNA molecules progressively decay. The decay rate is influenced by a variety of factors related to the storage conditions. Biochemical processes subsequent to cell death may alter nucleotide sequence information in many ways.
Several of these post-mortem DNA modifications can block amplification during the polymerase chain reaction (PCR), whereas others allow PCR products to be obtained, but with incorrect bases incorporated and maintained in the amplification products. These kinds of PCR artefacts, termed miscoding lesions, are commonly represented by two types of transitions: type I (A → G) (T → C) and type II (C → T) (G → A). Miscoding lesions can lead to higher estimated substitution rates at the degraded sites and consequent overestimates of levels of polymorphism.
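The two classes of miscoding lesions named above can be tallied mechanically once a sequence from degraded DNA has been aligned with a trusted reference. The following is a minimal sketch, not the procedure used in this study: the function names and the toy sequences are hypothetical, and the direction of each change is read as reference base → observed base.

```python
# Transition classes as named in the text (reference base -> observed base).
TYPE_I = {("A", "G"), ("T", "C")}
TYPE_II = {("C", "T"), ("G", "A")}


def classify_change(ref_base: str, obs_base: str) -> str:
    """Classify one aligned site of a reference/observed sequence pair."""
    ref, obs = ref_base.upper(), obs_base.upper()
    if ref not in "ACGT" or obs not in "ACGT":
        return "other"  # gaps, N, or ambiguity codes are not classified
    if ref == obs:
        return "match"
    if (ref, obs) in TYPE_I:
        return "type I transition"
    if (ref, obs) in TYPE_II:
        return "type II transition"
    return "transversion"


def tally_changes(reference: str, observed: str) -> dict:
    """Count the change classes between two aligned sequences of equal length."""
    if len(reference) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    counts = {}
    for ref, obs in zip(reference, observed):
        label = classify_change(ref, obs)
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical toy data: A->G, T->C, C->T, G->A and one transversion (A->T).
print(tally_changes("ATCGA", "GCTAT"))
```

A tally of this kind only describes the observed differences; it cannot by itself distinguish a miscoding lesion from a genuine substitution, which is why inspection of the original traces remains necessary.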
The overall number of transitions attributed to damage processes is suspected to be inflated because it may include some errors caused by the PCR technique itself (GILBERT et al. 2007). The amplification of DNA consists of iterative steps, which form the chain reaction. Because these biochemical processes produce errors, PCR is not a deterministic process. Known experimental parameters that influence PCR performance are the quality of the polymerase, the buffer composition and the temperature of primer annealing. The purification of the PCR products is also a critical step in the procedure. Therefore, great care was taken to obtain high-quality sequences by adjusting buffer composition and annealing temperature, and only sequences for which the forward and the reverse primer delivered flawless and identical reads were used. Two independent PCR products were investigated where necessary. All positions were checked by carefully inspecting the original ABI traces, because the software used to analyze and edit trace files (4PEAKS) sometimes produces erroneous results. In the investigation and interpretation of sequences from museum specimens, this thorough inspection of the ABI traces proved to be essential to detect miscoding lesions and to interpret doubtful positions of degraded DNA (HOFREITER et al. 2001; HAJIBABEI et al. 2006; JUNQUEIRA et al. 2002; SEFC et al. 2007).

Analysis of sequence divergence and tree topology of mitochondrial COI
The absolute numbers of substitutions were counted by pairwise comparison of COI sequences. The nucleotide frequencies and the parameters necessary for computer models were estimated from the sequence data, and Tamura-Nei genetic distances were calculated. The tree topology was inferred as a maximum likelihood tree based on the general time reversible model (GTR) of base substitution with gamma distribution, calculated by Bayesian analysis using MRBAYES (HUELSENBECK & RONQUIST 2001). Tree topology was also calculated as a neighbour-joining tree (NJ) and as a most parsimonious tree (MP) with bootstrap sampling, using MEGA 4.0 (TAMURA et al. 2007). GENEIOUS Pro 4.5 (Biomatters Ltd.) was used to analyze the alignment and to detect diagnostic positions, and the GREENBUTTON plugin (InterGrid) was used to run the time-consuming MRBAYES calculations on a supercomputer cluster. MACCLADE 3.04 was used to examine the nucleotide changes on cladograms. The COI sequence of B. soroeensis was used as the outgroup (GenBank accession no. AY181159, PEDERSEN 2002), and a few sequences from the genetically nearest subgenus Alpinobombus (PEDERSEN 2002; CAMERON et al. 2005) were also included.

Nucleotide frequencies, substitution parameters and COI divergence
The aligned data matrix of 1005 bp for 40 sequences (Table 1) included 134 variable sites. Of these variable positions, five were uninformative (singleton substitutions = noise) and 129 were informative (= signal). However, most of these informative sites were at silent positions, and translation resulted in amino acid sequences of 335 amino acids with only 11 variable sites. Most of the amino acid sequence variability was in B. sporadicus (7 variable sites out of 11); within the lucorum-complex only three sites out of 335 amino acids were variable. The nucleotide frequencies were pi(A) = 34.0 %, pi(C) = 12.2 %, pi(G) = 11.7 % and pi(T) = 42.2 %, demonstrating the known strong A + T bias typical for sequences of Hymenoptera. Therefore the Tamura-Nei model of base substitution was used (TAMURA & NEI 1993), which accommodates this bias in its assumptions about sequence evolution. Gamma-distributed rates (α = 0.16) were used as a model for rate heterogeneity.
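For readers who wish to check the composition figures on their own alignment, base frequencies and the A + T proportion can be computed as sketched below. This is an illustration only; the function names are hypothetical and the toy sequences are not data from this study. Applied to the frequencies reported above, the A + T proportion is 34.0 % + 42.2 % = 76.2 %.

```python
from collections import Counter


def base_frequencies(sequences):
    """Pooled relative frequencies of A, C, G and T over a list of sequences.

    Characters other than A, C, G, T (gaps, ambiguity codes) are ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return {base: counts[base] / total for base in "ACGT"}


def at_content(freqs):
    """Combined proportion of adenine and thymine."""
    return freqs["A"] + freqs["T"]


# Frequencies as reported in the text (rounded values, so they sum to 1.001).
reported = {"A": 0.340, "C": 0.122, "G": 0.117, "T": 0.422}
print(round(at_content(reported), 3))

# Hypothetical toy alignment.
print(base_frequencies(["AATT", "ACGT"]))
```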
The 1005 base-pair sequences of COI were used in analyses of sequence divergence among the five European taxa. Table 3 presents the matrix of genetic distances estimated by the Maximum Composite Likelihood model (MEGA) with gamma-distributed rates among sites. The intraspecific genetic variability was low for all taxa (1-6 nucleotide substitutions, genetic distance 0.002-0.007), even when the specimens of each taxon were collected in geographically distant localities (Table 4). In contrast, the interspecific genetic variability was approximately one order of magnitude larger (30-65 nucleotide substitutions, genetic distance 0.046-0.266).
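The comparison of intraspecific and interspecific divergence rests on counting substitutions in every pair of aligned sequences and sorting the pairs by whether both members belong to the same taxon. A minimal sketch of that bookkeeping follows. It works with raw substitution counts only (not Tamura-Nei distances), the function names are hypothetical, and the short toy sequences are invented for illustration.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def divergence_ranges(samples):
    """samples: list of (taxon, sequence) pairs.

    Returns two (min, max) tuples of pairwise substitution counts:
    one for pairs within a taxon and one for pairs between taxa.
    A range is None if no pair of that kind exists.
    """
    within, between = [], []
    for (taxon_a, seq_a), (taxon_b, seq_b) in combinations(samples, 2):
        diff = count_differences(seq_a, seq_b)
        (within if taxon_a == taxon_b else between).append(diff)

    def span(values):
        return (min(values), max(values)) if values else None

    return span(within), span(between)


# Hypothetical toy data, 10 sites per sequence.
toy = [
    ("lucorum", "AAAAAAAAAA"),
    ("lucorum", "AAAAAAAAAT"),
    ("magnus", "CCCCAAAAAA"),
    ("magnus", "CCCCAAAAGA"),
]
print(divergence_ranges(toy))
```

If the maximum of the within-taxon range lies well below the minimum of the between-taxon range, as it does for the real data reported above, the taxa are separated by a gap in raw divergence.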

Tree building by maximum likelihood models
The maximum likelihood tree (Fig. 1), generated using Bayesian MCMC (Markov chain Monte Carlo) analysis, was based on the general time reversible model (GTR) of base substitution with gamma distribution, run for 5 000 000 generations to achieve equilibrium, sampling every 50 generations, with a "burn-in" of 5 000 generations. Phylogenetic trees were also generated using the neighbour-joining (NJ) and most-parsimonious (MP) methods with 1 000 bootstrap replicates. As expected (SUZUKI et al. 2002; DOUADY et al. 2003), the reliability of nodes measured by bootstrap percentages (BP) was slightly lower than that measured by Bayesian posterior probabilities (PP). However, the data were quite robust, and irrespective of the model used we obtained five distinct clusters: one cluster for B. sporadicus, one cluster for B. terrestris, cluster α for the operational taxonomic units (OTUs) lucorum, cluster β for the OTUs cryptarum and cluster γ for the OTUs magnus.
The three clusters α, β and γ, representing OTUs of the so-called lucorum-complex, were well separated, with low variability, no intergrading and no terminal units of unclear position.

Tree building by diagnostic characters
As there are no gaps in the alignments of the COI sequences, single nucleotide sites can be used as positional homologies (HILLIS 1994). The alignment file (Fig. 2) shows quite clearly that each taxon is characterized by about 8 to 12 substitutions, which are unique ("private") and can be used as diagnostic characters to define and identify that taxon. In MACCLADE the changes at the nodes and the diagnostic characters at the last branch of the terminal units can be investigated in detail, and a tree can be built with the classical tools for morphological characters (Fig. 3). With the large number of diagnostic characters available, it is normal that not all of these changes are unambiguous. However, each of the three taxa of the so-called lucorum-complex is characterized by about 8 to 12 unambiguous diagnostic characters.
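The notion of a private substitution can be made explicit with a short routine. The sketch below is an assumption about how such positions might be extracted from a gap-free alignment, not a description of the GENEIOUS or MACCLADE workflow used here: a site is reported as diagnostic for a taxon when all sequences of that taxon share one base and that base occurs in no sequence of any other taxon. The function name and the toy alignment are hypothetical.

```python
def diagnostic_positions(alignment):
    """Find private (diagnostic) sites in a gap-free alignment.

    alignment: dict mapping taxon name -> list of aligned sequences.
    Returns a dict mapping taxon name -> {1-based position: base}.
    """
    all_seqs = [seq for seqs in alignment.values() for seq in seqs]
    length = len(all_seqs[0])
    if any(len(seq) != length for seq in all_seqs):
        raise ValueError("all sequences must have the same aligned length")

    result = {taxon: {} for taxon in alignment}
    for i in range(length):
        for taxon, seqs in alignment.items():
            own = {seq[i] for seq in seqs}
            if len(own) != 1:
                continue  # polymorphic within the taxon, so not diagnostic
            base = next(iter(own))
            others = {
                seq[i]
                for other, other_seqs in alignment.items()
                if other != taxon
                for seq in other_seqs
            }
            if base not in others:
                result[taxon][i + 1] = base
    return result


# Hypothetical toy alignment with five sites.
toy = {
    "lucorum": ["ATGCA", "ATGCA"],
    "cryptarum": ["ACGCA", "ACGCT"],
    "magnus": ["ATGGA", "ATGGA"],
}
print(diagnostic_positions(toy))
```

In the toy alignment, site 5 is variable but not diagnostic, because the variation lies within one taxon; this mirrors the distinction between informative sites in general and private positions in particular.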
All specimens of B. cryptarum from alpine habitats (CRY-09, CRY-10, AY181123 and AY181124) differed from the rest of the cryptarum sequences at diagnostic position 1101 with a (T → C) replacement, and all specimens of B. magnus from the UK differed from the rest of the magnus sequences at the diagnostic positions 409 (T → C), 579 (A → G) and 603 (C → T). More material is needed, but as the sequences were obtained from different laboratories the possibility of stochastic variability is very low. Diagnostic position 409 in B. magnus was one of the three sites within the lucorum-complex that result in an amino acid replacement; all specimens of B. magnus from the UK differed from the rest by having the amino acid proline instead of serine at amino acid position 137.
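The relation between nucleotide position 409 and amino acid position 137 follows from simple codon arithmetic, sketched below under the assumption that position 1 is the first base of the first codon of the reading frame (which is consistent with the positions given in the text). The function name is hypothetical.

```python
def codon_of(nt_position: int):
    """Map a 1-based nucleotide position to (codon number, position in codon).

    Both returned values are 1-based. Assumes that nucleotide 1 is the
    first base of codon 1.
    """
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


for position in (409, 579, 603, 1101):
    print(position, codon_of(position))
```

Position 409 is the first base of codon 137, where a T → C change converts a serine codon (TCN) into a proline codon (CCN), in agreement with the replacement described above. Under the same assumption, positions 579, 603 and 1101 all fall on third codon positions.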

Morphologically problematic specimens and misidentifications
Specimen CRY-03 was identified by morphological characters as B. magnus, but the labial gland secretions from males reared in an artificial colony from this queen identified this specimen as B. cryptarum. This identification by labial gland secretions was confirmed by the DNA data; specimen CRY-03 was integrated into the cryptarum-cluster β. As discussed in BERTSCH et al. (2005), the specimens AY181117 (from Austria) and AY181119 (from Denmark) identified by PEDERSEN (2002) as B. cryptarum were morphological misidentifications, and both specimens cluster with the lucorum-cluster α. The observed differences of 4 bp were within the observed intraspecific variability of B. lucorum (Table 4). Specimens AY181123 and AY181124 from alpine habitats, which were identified by PEDERSEN (2002)

Identification by similarity
To check the "promises" in HEBERT & GREGORY (2005) of a novel system for rapid and accurate species identification, the sequences of all critical or doubtful specimens were submitted to the Barcode identification engine. The species identity of a query (unknown) sequence is assigned on the basis of its similarity to a set of reference (identified) sequences. All three doubtful specimens of B. lucorum were identified as B. lucorum: the male LUC-01 from the Orkney Islands with 100 % similarity to B. lucorum reference sequences, the female LUC-07 from Spain with 99.8 % similarity and the female LUC-08 from Transbaikal with 99.8 % similarity. There was a clear gap (barcoding gap) between those and the next most similar sequences, with 96 % similarity. Specimen MAG-06, the most cryptarum-like of all B. magnus specimens from Milde/Bergen, was identified with 100 % similarity as B. magnus, and again there was a clear gap between that and the next most similar sequences, with 97 % similarity. The male CRY-01 from the Orkney Islands was identified

species in summer 2006, when I obtained a strange sequence from a B. magnus specimen (from Leerstetten / Bavaria, Germany) that was sequenced in Belfast, and which was very similar (simple BLAST request, 99.4 % similarity) to the sequence from Bombus sp. BVP-A from Norway. Because labial gland secretions from males of the artificial colony verified my morphological identification of the specimen from Leerstetten as B. magnus, something had to be wrong with both strange sequences. New sequences from both specimens turned out to be quite different; they confirmed the identification of B. magnus from Germany (EF362728) and tentatively identified Bombus sp. BVP-A from Norway as B. cryptarum. This is clearly a specimen with degraded DNA, which thus cannot be identified by similarity but only by diagnostic positions (Table 5).

5) and the large difference in base substitutions is probably also due to degraded DNA. Sequences of B. lucorum from Scotland (LUC-01, AY694095) are more or less identical to sequences of B. lucorum from the continent.
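Identification by similarity, together with the size of the gap to the next candidate taxon, can be illustrated with a few lines of code. This is a simplified sketch and does not reproduce the algorithm of the Barcode identification engine or of BLAST: it assumes pre-aligned sequences of equal length, scores plain percent identity, and uses hypothetical function names and invented toy sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical sites between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def identify_by_similarity(query: str, references):
    """Assign a query to the taxon of its most similar reference sequence.

    references: list of (taxon, sequence) pairs.
    Returns (best taxon, best identity in percent, gap), where gap is the
    difference between the best identity and the best identity reached by
    any reference of a different taxon (None if there is no other taxon).
    """
    scored = [(percent_identity(query, seq), taxon) for taxon, seq in references]
    best_identity, best_taxon = max(scored)
    rivals = [score for score, taxon in scored if taxon != best_taxon]
    gap = best_identity - max(rivals) if rivals else None
    return best_taxon, best_identity, gap


# Hypothetical toy references, 10 sites each.
references = [
    ("lucorum", "AAAAAAAAAA"),
    ("magnus", "AAAAAAAACC"),
    ("cryptarum", "AAAAAACCCC"),
]
print(identify_by_similarity("AAAAAAAAAA", references))
```

Note that a routine of this kind always returns a best match, however poor; the size of the gap is the only indication of how much confidence the assignment deserves.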

More specimens with degraded DNA
To investigate how far the use of diagnostic positions allows the identification of museum specimens in which degraded DNA could be expected, three specimens of the subgenus Bombus s. str. (1974a, p. 322, collected VI, 1973) from the Nepalese Himalayas, B. lucorum terrestricoloratus KRÜGER (1951, p. 195; Nr. 0277.2 VOGT collection, Amsterdam) from North Tibet and B. magnus turkestanicus KRÜGER (1954, p. 274; Nr. 0281.885 VOGT collection, Amsterdam, collected IV, 1909) from Central Asia. As expected, the DNA of all three specimens was degraded with miscoding lesions, which are quite obvious when inspecting the ABI traces. Figure 5 compares partial sequences from fresh (deep frozen) DNA with hundred-year-old DNA from a museum specimen.
The overall quality of the museum DNA is quite good (shaded area = quality about 60 %), but three sites of this partial sequence are ambiguous, with double peaks: a first miscoding lesion (T → C, marked Y by the ABI output file), a second miscoding lesion (T → C, not recognised by the ABI output file) and a third miscoding lesion (G → A, marked R). A barcoding request identified all three specimens as part of the cryptarum-cluster (Fig. 4), and analysis of the diagnostic positions also revealed a close relationship with B. cryptarum (Table 5). None of the diagnostic positions characteristic for B. lucorum or B. magnus was found in these specimens.
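The codes Y and R mentioned above are IUPAC ambiguity symbols (Y = C or T, R = A or G). When a sequence from degraded DNA is compared with a table of diagnostic positions, such calls can be treated as compatible with either of the bases they stand for. The sketch below illustrates this idea; it is not the procedure behind Table 5, the function name is hypothetical, and the positions and bases in the toy example are invented.

```python
# Subset of the IUPAC nucleotide codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"},
    "N": {"A", "C", "G", "T"},
}


def score_diagnostics(query_calls, diagnostics):
    """Compare base calls of a query with diagnostic positions per taxon.

    query_calls: dict mapping position -> base call (may be an IUPAC code).
    diagnostics: dict mapping taxon -> {position: diagnostic base}.
    Returns dict mapping taxon -> (compatible sites, scored sites); positions
    missing from the query are skipped and do not count as scored.
    """
    result = {}
    for taxon, sites in diagnostics.items():
        compatible = scored = 0
        for position, base in sites.items():
            call = query_calls.get(position)
            if call is None:
                continue
            scored += 1
            if base in IUPAC.get(call.upper(), set()):
                compatible += 1
        result[taxon] = (compatible, scored)
    return result


# Hypothetical diagnostic table and query; positions are for illustration only.
diagnostics = {
    "cryptarum": {10: "C", 25: "A"},
    "magnus": {10: "T", 40: "G"},
}
query_calls = {10: "Y", 25: "A", 40: "A"}
print(score_diagnostics(query_calls, diagnostics))
```

In the toy example the ambiguous call Y at position 10 is compatible with both taxa and therefore carries no information; the assignment rests on the unambiguous positions, which is why several diagnostic positions per taxon are needed for degraded material.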

Discussion
Facing the facts: morphology versus molecules
So far, accurate identification of specimens of B. magnus and B. cryptarum by morphological characters is only possible with females. Besides the characteristic colouration and shape at the lateral ends of the collare (BERTSCH et al. 2004), the main morphological characters used are (RASMONT 1984):
• forms (e. g. form of the labrum),
• sculptures (e. g. surface of tergite 2),
• numbers (e. g. of "micropunctures" in the lateral corner of the ocellar field),
• measures (e. g. length of malar space, diameter of ocelli),
• and morphometric indices (e. g. labral-index, ocellar-index).
However, the interspecific differences in all these characters are quite small, there is overlap, and measuring length in three-dimensional space is not that simple. With much experience, it is possible to identify most females by a combination of these characters, but as can be seen in most museum collections the number of misidentifications is substantial. I do not know of any attempt to extract all these morphological characters from a large number of unclassified specimens and to demonstrate that the result is not a continuum of characters, but character clusters separated by gaps. As it is quite simple to obtain large numbers of specimens from artificial colonies from a wide range of geographical provenances, it would be interesting to see which morphological character or combination of characters is best suited to identify specimens classified independently by male labial gland secretions or DNA sequences. Much work is waiting for the morphological taxonomists. Empirical science relies on the ability to verify results independently in different laboratories.
For identification of critical taxa and validation of morphological characters, this would imply that measurements (for instance of the ocellar index = distance from right ocellus to preoccipital ridge / distance from ocellus to compound eye, measured by LØKEN 1973 to separate B. lucorum and B. magnus) can be repeated independently in the same specimens. However, LØKEN's measurements are available only in a complex diagram (LØKEN 1973, Fig. 53) and there is no reference to individual specimens. Whereas LØKEN (1973) came to the conclusion that B. lucorum and B. magnus can be separated by measurements of the ocellar index, a view confirmed by TKALCU (1974), PEKKARINEN (1979) came to the conclusion that the observed differences are caused by allometry and that species separation is not possible. This is a typical situation when dealing with morphological characters of specimens of the lucorum-complex: contradictory results and no possibility of checking the original data. For these taxa Pierre Rasmont is most probably the only person who has the experience necessary to identify critical specimens. Morphological characters can always be coded for cladistic investigations, trees can be constructed, and homologies and possible character developments can be derived. With enough imagination or suitable mathematical models, any form can be changed into any other form and connected by intermediates. The question is whether such results are reliable. A good example of the problems involved in such approaches might be P. WILLIAMS' "reappraisal of morphology" (1985, 1994). A total of forty-four morphological characters (21 from the male genital capsule) of bumblebees were used and coded to construct a strict consensus tree. A minimally and a maximally resolved tree with all character state changes were given, and one of the results of this investigation was a close phylogenetic relationship between

(BOOKSTEIN 1991, 1994) and why landmark data can be useful for delimitation and identification of taxa but are unsuitable to derive homologies and cladistically relevant trees should always be kept in mind.
Compared to morphological evidence the use of genetic evidence is relatively new, and what has been achieved in about 20 years is quite impressive. First, the sequence data are deposited in a public database, so the original data are available. Projects involving the long-term storage of well-documented DNA are underway (e. g. DNA Bank at the Zoologische Staatssammlungen München), and in the future it will be possible to extend and complement previous studies and to reinvestigate doubtful material. The practice of making reference to specimens deposited in an accessible museum collection must be improved (RUEDAS et al. 2000), and material preserved in alcohol can be restored such that it is useful for morphological inspection, including characters of colouration and hair (MILLIRON 1971, p. 29). Many reference sequences used by the Barcode identification engine are GenBank sequences, and difficulties with misidentifications within GenBank data are well known (e. g. HARRIS 2003; HEBERT et al. 2003; SEBERG 2004; VILGALYS 2003); the possibilities to correct misidentified sequences should be improved.
For the moment I prefer a simple BLAST (Basic Local Alignment Search Tool, ALTSCHUL et al. 1990) request, because with GenBank data there is immediate access to all necessary information associated with the sequences (e. g. author, laboratory, publication, geographic provenance), whereas in the Barcode engine databank the original GenBank numbers have been changed and it is not straightforward to get this useful supplementary information. Thus the basis for every empirical science, that is, the ability to reproduce and check results independently, is guaranteed. Computer software for special purposes is increasing (e. g.

In the future, it will be easier to check and correct morphological misidentifications by DNA methods than vice versa.
There is a certain uneasiness in relying on data from only one gene, but even in a very conserved gene like COI a sequence of 1000 bp delivers enough genetic variability: in the European taxa of the subgenus Bombus s. str. each taxon delivers up to 12 unambiguous diagnostic positions, enough to guarantee accurate identification. And as COI is a coding gene without indels, the alignment lacks gaps and inconsistencies, so all base substitutions at diagnostic positions can be used as homologies to reconstruct phylogenetic relationships. Instead of the ongoing discussion about molecules versus morphology, close cooperation using both methods could bring rapid progress in difficult and controversial cases.

Molecular taxon identification: tree-based versus character-based
Most recently published approaches using DNA data have utilized distance measures to make the inference regarding species designation. Distances are generally measured in two ways. The first is a simple BLAST-based approach where a raw similarity score determines the nearest neighbour to the query sequence. The second approach utilizes distances in tree building (HEBERT et al. 2003). A major shortcoming of using distances in DNA data is that all classical studies and taxonomic schemes are character-based, making the union of classical and DNA data a difficult process. Character-based methods have the logical advantage that when diagnostic character data are lacking they will fail, allowing at least some hypothesis testing, whereas similarity scores will always give a nearest neighbour. However, this nearest neighbour is sometimes not the nearest relative (KOSKI & GOLDING 2001). There is also a lack of an objective set of criteria to delineate taxa when using distances. A universal similarity cutoff to determine species status simply does not exist. Like distance methods, each of the many available variations of phylogeny estimation via maximum likelihood relies on an explicit underlying model of character transformation. Because methods that rely on explicit, a priori models of evolution are acknowledged to be poor estimators of hierarchical patterns when the assumptions of the models are violated (YANG et al. 1994; FELSENSTEIN 2004), a model has to be chosen on empirical grounds. Different models frequently produce the same best-supported tree for the same data: the maximum-likelihood approach seems robust to violation of some assumptions. However, caution is needed because false or overly simple models can be misleading about the reliability of the estimated tree, tending to suggest that the tree is significantly supported when, in fact, it is not. A practical alternative is the exploration of character diagnostics in the DNA sequences themselves, without reference to trees. In this way morphological and molecular "characters" can easily be integrated, and the work follows the two-step procedure of traditional taxonomic studies, in which relationships among species are assessed only after the minimal biological units appropriately employed as terminal units have been identified by diagnostic characters. This approach, its relevance to diagnosing entities in nature and its relevance to species delimitation have been discussed at length from both the technical and the theoretical standpoint (DAVIS & NIXON 1992; WHEELER 2004; WILL & RUBINOFF 2004; EBACH & HOLDREGE 2005; DESALLE et al. 2005).
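The logical difference between the two approaches, namely that a similarity search always returns a nearest neighbour whereas a character-based test can fail, can be shown with a toy example. The sketch below is illustrative only: the function names, taxon labels, sequences and diagnostic positions are hypothetical, and the character-based rule used here (all diagnostic positions of exactly one taxon must match) is one simple choice among several possible ones.

```python
def nearest_neighbour(query: str, references: dict) -> str:
    """Return the taxon whose reference sequence has the fewest mismatches.

    references: dict mapping taxon -> aligned reference sequence.
    Some taxon is always returned, however distant the best match is.
    """
    def mismatches(taxon):
        return sum(1 for a, b in zip(query, references[taxon]) if a != b)

    return min(references, key=mismatches)


def character_based(query: str, diagnostics: dict):
    """Return the single taxon whose diagnostic positions all match, else None.

    diagnostics: dict mapping taxon -> {1-based position: diagnostic base}.
    """
    hits = [
        taxon
        for taxon, sites in diagnostics.items()
        if all(query[position - 1] == base for position, base in sites.items())
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical references and diagnostic positions for two taxa.
references = {"alpha": "ACGTAC", "beta": "ATGTAA"}
diagnostics = {"alpha": {2: "C", 6: "C"}, "beta": {2: "T", 6: "A"}}

query = "AGGTAC"  # matches neither taxon at diagnostic position 2
print(nearest_neighbour(query, references))   # a taxon is returned regardless
print(character_based(query, diagnostics))    # no taxon satisfies its diagnostics
```

For this query the similarity search assigns a taxon although one diagnostic position contradicts it, while the character-based test returns no assignment and thereby flags the specimen for closer inspection.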

Conclusions
Morphological, physiological, and molecular operational taxonomic units (OTUs) clearly separate the specimens of the Bombus lucorum complex into three clusters that correspond with the taxa defined as B. lucorum, B. cryptarum and B. magnus. The differences in morphological characters, the composition of the species recognition signals (male labial gland secretions) and genetic distance are consistent with other taxa of Bombus where the species status is not in debate. All three taxa are thus good morphological, biological, and phylogenetic species.
It seems appropriate to first define terminal biological units as entities in nature and to use and discuss the logical class species in a second step.One hundred and fifty years ago, CH.DARWIN (1859) found a good formulation for such a two-step procedure: The endless disputes whether or not some fifty species of British brambles are true species will cease.Systematists will have only to decide (not that this will be easy) whether any form be sufficiently constant and distinct from other forms, to be capable of definition; and if definable, whether the differences be sufficiently important to deserve a specific name.

Fig. 1: Tree topology calculated as Maximum-Likelihood tree using Bayesian MCMC analysis with the general time-reversible model of base substitutions, gamma distribution and 5 000 000 generations.

Fig. 2.1-4: Alignment of all parsimony-informative triplets (with uninformative sites deleted -), and with a pointer for position number (numbered for total COI) and codon position. Diagnostic (= private) positions marked with colour: green = Thymine, violet = Cytosine, red = Adenine and yellow = Guanine.

Fig. 3: Observed diagnostic character changes with position numbers mapped onto the Maximum-Likelihood tree. Black box = unambiguous diagnostic character change, grey box = ambiguous diagnostic character change, and white box = unambiguous character change.
PEDERSEN (2002) also discussed in detail a specimen of B. lucorum from Scotland (GenBank AY181121), which shows an exceptionally large difference in base substitutions compared to all other specimens of B. lucorum, adding 38 polymorphic sites. As a consequence this specimen has a different topological position when identified by tree building (Fig. 5 in PEDERSEN 2002) and seems to be a near neighbour to B. magnus. But as the B. magnus in PEDERSEN is really B. cryptarum, it is obviously related to B. cryptarum. This specimen AY181121 corresponds with B. cryptarum at all diagnostic positions (Table

Fig. 5: Comparison of ABI colour traces of fresh (CRY-09) and 100-year-old museum DNA (B. magnus turkestanicus), with two miscoding lesions of type T → C, one marked by Y (mixed bases IUPAC code for T/C) in the ABI output file, and one miscoding lesion of type G → A marked by R (mixed bases IUPAC code for G/A) in the ABI output file.
as B. magnus, clustered with the cryptarum- […]

DOI: 10.21248/contrib.entomol.59.2.287-310

[…]lpine habitats that were misidentified as B. magnus by PEDERSEN (2002). This explains the strange identification result of the Barcode identification engine in the case of CRY-01 and CRY-10. A second cryptarum-cluster β2 contains specimens of B. albocinctus from the Russian Far East and B. moderatus from North America, a topological position that proves that both taxa are separate from B. lucorum. So all five doubtful specimens were identified by the Barcode identification engine without any problem. This task requires much professional skill and experience if based on morphology, and requires a lot of time and facilities if carried out using artificial colonies and male labial gland secretions. It should be emphasized that the COI sequences in this investigation covered only part of the 658 bp barcoding region (overlap 435 bp), and that the reference sequences were not from the validated reference barcode database but from the species-level records barcode database.
A new species of the subgenus Bombus s. str. was detected by PEDERSEN (2002), and the sequence is available as "unclassified Bombus sp. BVP-A" from Norway (GenBank AY181116). No morphological details about this new species are available. I became interested in this unidentified Bombus […]

Tab. 5: Diagnostic positions for B. lucorum, B. magnus and B. cryptarum and for misidentified specimens AY181124, AY181123 and AY181121. A new sequence for AY181116 (indet Norway), and sequences from museum specimens B. reinigi, B. lucorum terrestricoloratus KRÜGER, B. magnus turkestanicus KRÜGER. T interpreted as type II miscoding lesions (C → T) caused by degraded DNA.
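
Mixed-base calls such as Y and R, as produced by aged DNA, can be accommodated when a query is compared against the diagnostic positions of a taxon. The following is a minimal sketch under stated assumptions: the position numbers, expected bases and the function name are hypothetical and do not reproduce the values of Tab. 5; only the IUPAC ambiguity codes are standard.

```python
# Standard IUPAC nucleotide ambiguity codes (subset sufficient for this sketch).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"},            # purine: A or G
    "Y": {"C", "T"},            # pyrimidine: C or T
    "N": {"A", "C", "G", "T"},  # any base
}


def score_diagnostics(query_bases, diagnostic):
    """Classify each diagnostic position of one taxon for a query sequence.

    query_bases: dict mapping position -> base call (may be an IUPAC code)
    diagnostic:  dict mapping position -> expected diagnostic base
    Returns lists of positions that match, are ambiguous (the expected base
    is one of several possible bases), or conflict with the expectation.
    """
    result = {"match": [], "ambiguous": [], "conflict": []}
    for pos, expected in sorted(diagnostic.items()):
        possible = IUPAC[query_bases[pos]]
        if possible == {expected}:
            result["match"].append(pos)
        elif expected in possible:
            result["ambiguous"].append(pos)
        else:
            result["conflict"].append(pos)
    return result


# Hypothetical diagnostic positions and a hypothetical museum-specimen read.
diagnostic = {12: "C", 57: "T", 301: "G"}
museum_read = {12: "Y", 57: "T", 301: "A"}
print(score_diagnostics(museum_read, diagnostic))
```

A position scored as ambiguous is still compatible with the taxon, whereas a conflict has to be judged separately, for instance as a possible miscoding lesion; how such conflicts should be weighed is not settled by this sketch.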
(CAMERON et al. 2007) […] rufipes, B. (Pressibombus) pressus, B. (Bombus) sporadicus and B. (Bombus) terrestris. Figure 5 in WILLIAMS (1994) shows in detail and convincingly how the change in forms of male genitalia (penis valve) of these species may have occurred. However, recently published trees based on genetics (CAMERON et al. 2007) contradict this reconstruction based on "morphological evidence", indicating that there is no close genetic relationship between the subgenus Bombus and B. rufipes (the rufipes-group is part of the subgenus Melanobombus) or B. pressus (which from genetic evidence is surprisingly part of the subgenus Pyrobombus). The nearest group to the species of the subgenus Bombus is the subgenus Alpinobombus. The theoretical considerations on why morphometric data and the concept of biological homologies are incompatible