Effectiveness of reduced representation sequencing on century-old, ethanol-preserved museum fishes




French, Martin George





Museum specimens have largely underutilized potential to let biologists study rare, ancient, or extinct organisms using genomic methods. However, museum samples often contain degraded and fragmented DNA, which makes them more difficult to sequence. Reduced representation sequencing has proven affordable and effective for population genomic applications but is sensitive to the degradation inherent in museum samples. Here, sequence quality and error rates were compared between reduced representation libraries constructed from century-old, ethanol-preserved museum samples and contemporary samples of two fishes (Atherinomorus duodecimalis and Siganus spinus), with a focus on the barcoded adapter and SbfI restriction site expected to occur at the beginning of every sequence read as a result of library preparation. Museum specimens had a larger proportion of reads filtered due to adapter dimer and low base call quality, and yielded a smaller proportion of reads with the expected adapter sequence. Elevated error rates in the adapter (synthetic sequence) and the last two positions of the restriction site (fish sequence, positions 7 and 8) of museum samples indicate that the specificity of both the DNA polymerase and the restriction enzyme, respectively, was impaired by a contaminant. Errors in the last two positions of the restriction site were not independent, indicating that if the restriction enzyme misrecognized position 7, then it also misrecognized position 8. Overall, sequencing of degraded museum specimens preserved in ethanol for >100 years is possible but, all else being equal, can result in more sequence substitution errors, unintended loci, and decreased depth of coverage compared to contemporary samples, owing to altered enzymatic activity during library preparation. Consequently, up to 24% more DNA per museum specimen needs to be sequenced to achieve results comparable to those from contemporary fish.
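As an illustration of the kind of comparison described above, the following is a minimal Python sketch of how per-position mismatch rates in the restriction site, and the dependence between errors at positions 7 and 8, might be tallied from raw reads. It is not the method used in this work. It assumes that SbfI recognizes CCTGCAGG and cuts after CCTGCA, so that each read carries a barcode followed by the remnant TGCAGG (site positions 3 to 8); the function names, the fixed barcode length, and the toy reads are all hypothetical.

```python
from collections import Counter

# Assumption: reads begin with a fixed-length barcode followed by the
# SbfI remnant TGCAGG, which corresponds to site positions 3-8 of CCTGCAGG.
SITE_REMNANT = "TGCAGG"
FIRST_SITE_POSITION = 3


def site_error_rates(reads, barcode_len):
    """Return {site position: mismatch rate} over reads long enough to score."""
    mismatches = Counter()
    total = 0
    for read in reads:
        observed = read[barcode_len:barcode_len + len(SITE_REMNANT)]
        if len(observed) < len(SITE_REMNANT):
            continue  # read too short to cover the whole remnant
        total += 1
        for offset, (obs, exp) in enumerate(zip(observed, SITE_REMNANT)):
            if obs != exp:
                mismatches[FIRST_SITE_POSITION + offset] += 1
    if total == 0:
        return {}
    last = FIRST_SITE_POSITION + len(SITE_REMNANT)
    return {pos: mismatches[pos] / total
            for pos in range(FIRST_SITE_POSITION, last)}


def joint_error_ratio(reads, barcode_len):
    """Observed P(err7 and err8) divided by P(err7) * P(err8).

    A value near 1 is what independent errors would give; larger values
    mean errors at positions 7 and 8 tend to occur together.
    Returns None when either position has no errors.
    """
    n = e7 = e8 = both = 0
    for read in reads:
        observed = read[barcode_len:barcode_len + len(SITE_REMNANT)]
        if len(observed) < len(SITE_REMNANT):
            continue
        n += 1
        err7 = observed[4] != SITE_REMNANT[4]  # site position 7
        err8 = observed[5] != SITE_REMNANT[5]  # site position 8
        e7 += err7
        e8 += err8
        both += err7 and err8
    if e7 == 0 or e8 == 0:
        return None
    return (both / n) / ((e7 / n) * (e8 / n))


# Toy reads with a hypothetical 5-base barcode; the third read has
# mismatches at both positions 7 and 8.
reads = ["AAAAATGCAGGCCC", "AAAAATGCAGGCCC", "AAAAATGCATTCCC", "AAAAATGCAGGCCC"]
rates = site_error_rates(reads, barcode_len=5)
ratio = joint_error_ratio(reads, barcode_len=5)
```

A real analysis would also need to separate sequencing error from enzymatic error and apply a formal test of independence; the ratio here is only a descriptive summary.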



High-throughput Sequencing, Restriction-site-associated DNA sequencing, RADseq, Sequence Error Modeling, Star Activity



Attribution-NonCommercial-ShareAlike 4.0 International, This material is made available for use in research, teaching, and private study, pursuant to U.S. Copyright law. The user assumes full responsibility for any use of the materials, including but not limited to, infringement of copyright and publication rights of reproduced materials. Any materials used should be fully credited with its source. All rights are reserved and retained regardless of current or future development or laws that may apply to fair use standards. Permission for publication of this material, in part or in full, must be secured with the author and/or publisher.