Background & Summary

Lovebirds are a group of small sub-Saharan African parrots that belong to the genus Agapornis. Nine species are currently recognized by avian checklists and the International Union for Conservation of Nature (IUCN)1. Multiple aspects of the phylogeny and historical biogeography of the genus remain poorly understood, and genomics presents an opportunity to gain insight into research questions regarding evolution and approaches to conservation2. Past phylogenetic studies based on genetic data have either not included all known species3, relied on only one or two molecular markers3,4, or relied on samples from captive-bred individuals5,6,7,8,9,10. The most recent phylogenetic study8 included archival individuals (i.e. specimens held in museum collections) from the wild, but for most species the ancestry and precise geographical provenance of the specimens used are unknown, the exception being three A. canus specimens. The use of specimens from captivity or of uncertain provenance is problematic because lovebirds hybridize readily and crossbreeding has been common practice in some captive collections. Naturalised populations of lovebirds have also become established in multiple localities following introductions outside their natural ranges. As a result, cryptic hybrids may exist within captive and introduced populations11,12,13.

Here we present full genome data for all nine species of lovebird, derived from archival, geo-referenced individuals collected within the natural historical range of each species. The aim is to provide sequences from individuals with a high certainty of being free from artificial hybridisation. Each genome is linked to a voucher specimen housed in a recognized natural history collection (Table 1).

Table 1 Information on nine voucher specimens used to produce genomic data.

Methods

A toepad was sampled from a representative wild-caught individual of each species (collected within its natural range2, Fig. 1) held at the Academy of Natural Sciences of Drexel University (ANSP) or the Field Museum of Natural History (FMNH). To minimise the possibility of including hybrid material resulting from past human interference, we took the following into account when selecting samples. (1) Locality (Fig. 1): samples came from sites that were remote and geographically separate from known centres of human habitation, where the risk of hybridisation as a result of escaped captive birds is low. Samples were also taken from the core of each species' range and not from possible hybrid zones or areas where the distributions of species overlap. The distributions of Agapornis swindernianus and A. pullarius do overlap (Fig. 1); however, these two species are ecologically separated and are not known to hybridize12. (2) Age: most samples (apart from A. canus, collected in 1996) were collected before the peak of exploitation and mass trade of lovebirds, which occurred in the 1980s. (3) Phenotypic characters: the specimens included in our study did not show phenotypic characteristics, such as unusual plumage, that might indicate hybridisation.

DNA was extracted from the toepad tissue using the Qiagen DNeasy genomic extraction kit (QIAGEN N.V.) following the standard protocol. Paired-end sequencing libraries were constructed using the Illumina TruSeq kit according to the manufacturer’s instructions. Each library was sequenced at 50–100x coverage on an Illumina Hi-Seq platform in paired-end (haploid), 2 × 150 bp format by GENEWIZ (Azenta Life Sciences). The resulting fastq files were trimmed of adapter/primer sequence and low-quality regions using Trimmomatic v0.3314. The trimmed sequences were then assembled using SPAdes v2.515, followed by a finishing step performed in Zanfona16, which uses a series of related species that function as references for each other. Each genome was assembled to scaffold level. At no point were reads from one species mapped to the chromosomes of another species.
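The relationship between the stated 50–100x coverage and the 2 × 150 bp read format can be illustrated with a short calculation. The following is a minimal sketch only: the genome size used here (1.2 Gb) is a placeholder assumption and is not a value reported in this study, and the function names are hypothetical.

```python
import math

# ASSUMPTION: placeholder genome size for illustration only; not a value from this study.
ASSUMED_GENOME_SIZE_BP = 1_200_000_000
READ_LENGTH_BP = 150  # 2 x 150 bp paired-end format, as stated in the Methods


def read_pairs_for_coverage(genome_size_bp, coverage, read_length_bp=READ_LENGTH_BP):
    """Read pairs needed so that (total sequenced bases / genome size) reaches `coverage`."""
    bases_per_pair = 2 * read_length_bp
    return math.ceil(genome_size_bp * coverage / bases_per_pair)


def coverage_from_pairs(n_pairs, genome_size_bp, read_length_bp=READ_LENGTH_BP):
    """Expected mean coverage obtained from `n_pairs` read pairs."""
    return n_pairs * 2 * read_length_bp / genome_size_bp


if __name__ == "__main__":
    for target in (50, 100):
        pairs = read_pairs_for_coverage(ASSUMED_GENOME_SIZE_BP, target)
        print(f"{target}x coverage needs about {pairs:,} read pairs")
```

Under the assumed genome size, 50x and 100x correspond to 200 million and 400 million read pairs respectively; a different genome size scales these figures proportionally.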

Fig. 1

Specimen sample locations and BirdLife distribution for each of the nine species1.

Data Records

All raw reads and assembled genomes have been deposited in GenBank (National Center for Biotechnology Information, NCBI; Table 2), where they are publicly available.

Table 2 Genomic data for each of the nine recognized lovebird species sequenced for this study.

Technical Validation

The specimens selected for sequencing were collected from the wild and were accompanied by precise geo-referenced information (including latitude and longitude to the nearest minute). We used the default parameters for SPAdes15, Trimmomatic14, and Zanfona16 to assess the validity and quality of the data. Decontamination was performed using FCS-GX17 and the NCBI’s in-house libraries. No gap closing was performed.
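For readers who wish to plot the voucher localities, coordinates recorded to the nearest minute can be converted to decimal degrees as sketched below. This is an illustrative helper only: the function name and the example coordinates are hypothetical and do not correspond to any specimen in this study.

```python
# One minute of latitude is one nautical mile, i.e. about 1.852 km, so coordinates
# given to the nearest minute locate a site to within roughly a kilometre or two.
MINUTE_OF_LATITUDE_KM = 1.852


def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and minutes plus a hemisphere letter (N, S, E, W) to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not 0 <= minutes < 60:
        raise ValueError("minutes must be in the range [0, 60)")
    value = degrees + minutes / 60.0
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in ("S", "W") else value


if __name__ == "__main__":
    # HYPOTHETICAL coordinates, for illustration only.
    lat = dm_to_decimal(15, 30, "S")
    lon = dm_to_decimal(28, 45, "E")
    print(lat, lon)
```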