1. Introduction

A. Challenge – Significance of Digitizing North American Lepidoptera

The visibility and beauty of many moths and butterflies have captivated amateur collectors and professional entomologists for centuries. Lepidoptera are disproportionately common in research collections, with many museums boasting tens of thousands if not millions of specimens. Their popularity has undoubtedly made them the best known and most collected of all insects. Unfortunately, their abundance in collections has led to bioinformatic neglect. While collection data and images from millions of plant and vertebrate specimens are entering the public domain, similar efforts for insects, particularly Lepidoptera, have been viewed as too daunting to undertake. This oversight has prevented Lepidoptera from joining 21st-century “big data” science. Revolutionary advances in macroecological [1] and ecological niche modeling [2] methods are therefore unavailable to entomologists, as there are no “big data” to analyze.

Insects comprise more than 60% of all described species, and Lepidoptera are one of four “mega-diverse” orders that make up more than half of all insect life [3]. Because of their popular appeal, they are the best known of all insects and therefore the most logical taxon with which to begin data-driven, collections-based research. Plants and vertebrates cannot serve as surrogates for studying insects, the planet’s most diverse multicellular taxon. The majority of plants require broadly similar biotic and abiotic conditions, and most vertebrates have generalized diets. On the other hand, most Lepidoptera require specific species of host plants as larval food. Thus, the life histories of most insects depend on specialized interactions with one or more other species. This differs fundamentally from the biotic requirements of most vertebrates, which can generally prey or graze upon a taxonomically broad array of food items. Specialized interactions play an important role in insect evolution [4], including coevolution with plants [5] and other food items [6].

Although they are undoubtedly the best-characterized insect group, many Lepidoptera, particularly moths, remain undescribed, even though specimens of many taxa probably already exist in collections. The problem of connecting taxonomists with specimens can easily be overcome with technology. Lepidoptera are ideally suited to digital imaging because their large and colorful wings are two-dimensional and perfect for still photography, and many specimens contain much taxonomically useful information. Increased dissemination of specimen images will therefore not only aid citizen science efforts through automated specimen identification, but will also accelerate alpha taxonomy.

Insect herbivores and their host plants dominate terrestrial biomes, and may constitute as much as half of the earth’s macroorganismal species diversity [7]. As both herbivores and pollinators, butterflies and moths (Lepidoptera) are one of the most important insect orders causally linked to the radiation of flowering plants [8, 9]. They are among the most diverse of insect orders, with more than 157,000 described species [10], and include the most damaging groups of insect pests to agriculture [11]. They are the primary source of food for many vertebrates as well as the hyperdiverse parasitoid wasps and flies [9]. Lepidoptera have served as model systems for studies of genetics, physiology, development, and many aspects of ecology and evolutionary biology including insect/plant coevolution, conservation biology and biogeography [12-20]. Since the pioneering work of Ehrlich and Raven (1964) [21] on the coevolution of butterflies and their hosts, there has been great interest in trying to detect and understand macroevolutionary patterns in insect-plant associations [22-26]. Several studies have started to utilize continental-scale data to understand broad-scale herbivore community dynamics [27-29]. However, these studies are often hampered by insufficient data, much of which is hidden on specimens in museum collections.

Lepidoptera are nearly always abundant in natural history collections of all sizes, from large national museums to regional collections and countless smaller private ‘enthusiast’ collections [22,23]. In North America alone, there are more than 14,300 documented Lepidoptera species in 86 families [4], but species diversity is thought to be as high as 31,700 if the poorly documented Mexican fauna and undescribed species hidden in collections are included [24]. Many databases have been accumulating this information from specimens and publications on a continental scale [30]. Data from LepNet will mobilize millions of distribution records and additional host data hidden within collections.

Considering the group, region, and status quo of accessibility, the relationship between potential and realized impact is uniquely unbalanced. Despite their abundance in collections, the central role that Lepidoptera play across biological disciplines, and their general interest to the broader public, only about 10% of North American Lepidoptera species have sufficient, accessible occurrence data to enable reliable predictions concerning their habitat use and susceptibility to global change [26]. The needed data are inaccessible or insufficiently integrated to foster systematic research because historical distributions and phenological data are focused solely on particular species or regions [27–29], and are not applicable to studies at a continental scale. Consequently, significant changes in public policy and expenditures are being driven by incomplete and/or geographically constrained studies [30].

From the perspective of advancing digitization, Lepidoptera are a paradigmatic case that embodies the challenges involved in digitizing the vast number of arthropod specimens in North American collections. Arthropods comprise at least 60% of multicellular species diversity [31], and there are ~250 million arthropod specimens housed in North American research collections [26]. However, arthropods represent only a small fraction (18%) of the 23 million extant iDigBio records [26]. Including GBIF records, there are only ~5 million vetted records of North American arthropods that are publicly available, and Lepidoptera comprise just 466,000 (<10%) of these records.

In short, our community is faced with the challenge of leveraging foreseeable levels of funding in order to digitize 80% of research collection holdings in the United States by 2050. Arthropods are undoubtedly the most striking piece of that challenge, and we contend that Lepidoptera are the best-suited group to demonstrate continued progress and attract public attention at a large scale. In response to the ambitious goals of the ADBC program, we feel that a large-scale TCN is essential to broaden national digitization efforts.

B. Emergence of LepNet – The Lepidoptera of North America Network

The Lepidoptera of North America Network (LepNet) was formed in 2014 to provide a suitable digitization model focused on a taxon that is species-rich but still achievable to digitize. Currently, LepNet includes 29 collections distributed across 26 states (Figure 1), with physical holdings in excess of 20 million specimens. Our proposed TCN integrates and builds on existing insect TCNs (SCAN, Tri-trophic), and incorporates advancements from other TCNs and iDigBio. Hence, LepNet is well positioned to engage a broadly representative community of researchers and citizen scientists in documenting the North American diversity of the largest herbivore-specific clade, and to serve as a model for additional large arthropod-based TCNs that will advance the overall goals of the ADBC program.

Figure 1. Distribution of LepNet research collections. The number of specimen records to digitize (blue) and specimens to image (red) are shown for each institution.

24. Kergoat, G.J., et al., Parallels in the evolution of the two largest New and Old World seed-beetle genera (Coleoptera, Bruchidae). Molecular Ecology, 2005. 14: p. 4003-4021.

25. Opler, P.A., Oaks as evolutionary islands for leaf-mining insects: the evolution and extinction of phytophagous insects is determined by an ecological balance between species diversity and area of host occupation. American Scientist, 1974: p. 67-73.