Abstract

The genome of Bordetella pertussis is complex, with a high G+C content and many repeats, each longer than 1000 bp. Long-read sequencing offers the opportunity to produce single-contig B. pertussis assemblies using sequencing reads that are longer than the repetitive sections, with the potential to reveal genomic features that were previously unobservable in multi-contig assemblies produced by short-read sequencing alone. We used an R9.4 MinION flow cell and barcoding to sequence five B. pertussis strains in a single sequencing run. We then trialled combinations of the many long-read analysis tools built by the nanopore user community to establish the current optimal assembly pipeline for B. pertussis genome sequences. This pipeline produced closed genome sequences for four strains, allowing visualization of inter-strain genomic rearrangement. Read mapping to the Tohama I reference genome suggests that the remaining strain contains an ultra-long duplicated region (almost 200 kbp), which was not resolved by our pipeline. Further investigation also revealed that a second strain, seemingly resolved by our pipeline, may contain an even longer duplication, albeit in a small subset of cells. We have therefore demonstrated the ability to resolve the structure of several B. pertussis strains on a single barcoded nanopore flow cell, but the most complex genomes (e.g. those with very large duplicated regions) remain only partially resolved using the standard library preparation and will require an alternative library preparation method. For full strain characterization, we recommend hybrid assembly of long and short reads together; for comparison of genome arrangement, assembly using long reads alone is sufficient.
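
The read-mapping observation above rests on a simple idea: a region that is duplicated in the sequenced strain but present once in the reference should attract roughly twice the expected read depth. The following is a minimal sketch of that reasoning, not the pipeline used in this study. It assumes per-base depths are already available as a list of numbers (for example, parsed from the output of a depth-reporting tool), and the function name, window size and ratio threshold are illustrative choices only.

```python
from statistics import median


def flag_high_coverage_windows(depths, window=1000, ratio=1.8):
    """Return merged (start, end) intervals with unusually high read depth.

    depths: per-base read depths along the reference (hypothetical input).
    window: window size in bases (illustrative default).
    ratio:  a window is flagged when its mean depth is at least
            ratio * the median of all window means (illustrative default).
    Intervals are 0-based and half-open.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")

    # Mean depth of each non-overlapping window.
    window_means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        window_means.append((start, start + len(chunk), sum(chunk) / len(chunk)))

    if not window_means:
        return []

    # The median window mean serves as the single-copy baseline.
    baseline = median(mean for _, _, mean in window_means)
    if baseline == 0:
        return []

    # Collect flagged windows, merging those that are adjacent.
    regions = []
    for start, end, mean in window_means:
        if mean >= ratio * baseline:
            if regions and regions[-1][1] == start:
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Toy example: 30x baseline with a 2 kbp stretch at 60x.
toy_depths = [30] * 5000 + [60] * 2000 + [30] * 3000
print(flag_high_coverage_windows(toy_depths))  # [(5000, 7000)]
```

Elevated depth alone is only suggestive. A duplication carried by a small subset of cells, as described for the second strain, would raise depth by much less than twofold and would need a lower, more carefully chosen threshold.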

/content/journal/mgen/10.1099/mgen.0.000234
2018-11-21
2019-10-21
