Whole genome amplification decreases the quality of whole genome sequencing for century old ETOH preserved fishes




Roberts, Roy
Garcia, Eric
Bird, Christopher E.






The genomes of organisms stored in museums hold a wealth of information but are challenging to sequence. Recent success in sequencing desiccated museum insects involved whole genome amplification (WGA) and enzymatic repair of DNA damage (NEBNext FFPE Repair Mix), but these techniques have not been tested on fishes. The Smithsonian Museum currently hosts one of the largest single collections of fishes, which was collected over a century ago on board the RV Albatross. It consists of over 27,000 jars, or ‘lots’, of marine fishes from the Philippines alone, all preserved in rum distillates and stored in 70% EtOH, which should enable DNA sequencing. Here, we use factorial treatment combinations to test for the effects of WGA, enzymatic repair, and amount of DNA input on whole genome shotgun sequencing of RV Albatross and contemporary samples (preserved in 95% EtOH) of three species of Philippine marine fishes. A total of 74 libraries (30 WGA, 44 NoWGA) were sequenced (2 × 150 bp), using one individual per era and species when possible. After adapter trimming, quality filtering, and removal of contaminant reads, contemporary libraries were assembled de novo, and all libraries were mapped to the 100 longest contigs from the best genome assembly of each species, as judged by N50 and BUSCO analysis. Contrary to expectation, WGA had a negative effect on depth of coverage and number of informative positions for historical libraries (p < 0.05). For both contemporary and historical libraries, neither enzymatic repair nor DNA concentration had a consistent effect on number of SNPs or depth of coverage (p > 0.05). Overall, we recovered enough DNA to meaningfully test for differences between contemporary and historical libraries and to perform downstream population genomic analyses, using a standard shotgun library preparation protocol and $75 of sequencing per library.
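
The abstract mentions ranking assemblies by N50 and mapping reads to the 100 longest contigs. As a rough illustration of those two steps only, a minimal Python sketch follows. The function names `n50` and `longest_contigs` and the in-memory dictionary of contigs are assumptions made for this example; they are not taken from the authors' pipeline, and the BUSCO completeness assessment is not covered.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def longest_contigs(contigs, k=100):
    """Return the k longest contigs as (name, sequence) pairs.

    `contigs` is a hypothetical dict mapping contig name -> sequence string.
    """
    ranked = sorted(contigs.items(), key=lambda item: len(item[1]), reverse=True)
    return ranked[:k]


# Toy example: lengths 10, 8, 6, 4, 2 sum to 30, and the two longest
# contigs already cover 18 >= 15 bases, so the N50 is 8.
toy_n50 = n50([10, 8, 6, 4, 2])

toy_contigs = {"a": "ACGT", "b": "AC", "c": "ACGTAC"}
toy_top = longest_contigs(toy_contigs, k=2)
```

In practice the contig lengths would be read from an assembly FASTA file, and the selected contigs would be written out as the mapping reference. Those steps are omitted here to keep the sketch self-contained.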



Museomics, next generation sequencing, short read assembly, bioinformatic processing, reef habitat conservation



Attribution-NonCommercial-NoDerivatives 4.0 International