Variable recombination dynamics during the emergence, transmission and ‘disarming’ of a multidrug-resistant pneumococcal clone
- Nicholas J Croucher1, 2, 3,
- William P Hanage1,
- Simon R Harris2,
- Lesley McGee4,
- Mark van der Linden5,
- Herminia de Lencastre6, 7,
- Raquel Sá-Leão6,
- Jae-Hoon Song8,
- Kwan Soo Ko9,
- Bernard Beall4,
- Keith P Klugman10, 11,
- Julian Parkhill2,
- Alexander Tomasz7,
- Karl G Kristinsson12 and
- Stephen D Bentley2, 13Email author
© Croucher et al.; licensee BioMed Central Ltd. 2014
Received: 9 April 2014
Accepted: 12 June 2014
Published: 23 June 2014
Pneumococcal β-lactam resistance was first detected in Iceland in the late 1980s, and subsequently peaked at almost 25% of clinical isolates in the mid-1990s largely due to the spread of the internationally-disseminated multidrug-resistant PMEN2 (or Spain6B-2) clone of Streptococcus pneumoniae.
Whole genome sequencing of an international collection of 189 isolates estimated that PMEN2 emerged around the late 1960s, developing resistance through multiple homologous recombinations and the acquisition of a Tn5253-type integrative and conjugative element (ICE). Two distinct clades entered Iceland in the 1980s, one of which had acquired a macrolide resistance cassette and was estimated to have risen sharply in its prevalence by coalescent analysis. Transmission within the island appeared to mainly emanate from Reykjavík and the Southern Peninsular, with evolution of the bacteria effectively clonal, mainly due to a prophage disrupting a gene necessary for genetic transformation in many isolates. A subsequent decline in PMEN2’s prevalence in Iceland coincided with a nationwide campaign that reduced dispensing of antibiotics to children in an attempt to limit its spread. Specific mutations causing inactivation or loss of ICE-borne resistance genes were identified from the genome sequences of isolates that reverted to drug susceptible phenotypes around this time. Phylogenetic analysis revealed some of these occurred on multiple occasions in parallel, suggesting they may have been at least temporarily advantageous. However, alteration of ‘core’ sequences associated with resistance was precluded by the absence of any substantial homologous recombination events.
PMEN2’s clonal evolution was successful over the short-term in a limited geographical region, but its inability to alter major antigens or ‘core’ gene sequences associated with resistance may have prevented persistence over longer timespans.
Despite the high frequency of horizontal sequence transfer in the pneumococcal population, the spread of multidrug-resistant (MDR) Streptococcus pneumoniae has primarily reflected the global dissemination of particular clones rather than the acquisition of resistance by resident, sensitive genotypes . MDR pneumococci were first identified in 1977 , with the phenotype becoming very common in some countries during the 1980s due to the spread of several MDR clones [3, 4]. However, the absence of such bacteria from Iceland meant no penicillin resistance was detected among S. pneumoniae on the island until December 1988 . This was in spite of the country having the highest per capita consumption of antibiotics of the Nordic countries at the time .
Levels of resistance subsequently rose to a peak in 1995, by which point 24.2% of Icelandic clinical pneumococcal isolates were penicillin non-susceptible [7, 8]. The majority of these were of serotype 6B and also exhibited resistance to co-trimoxazole, chloramphenicol, tetracycline and macrolides [6, 7, 9]. Genotyping demonstrated that the pneumococci sharing these characteristics were a clonal outbreak closely related to isolates found in Spain in the late 1980s, defined as the Spain6B-2 or PMEN2 lineage , suggesting a transmission from Western Europe .
Consequently, an effort was initiated to reduce consumption of antibiotics  that achieved a 35% fall in dispensing to Icelandic children between 1992 and 1997 . The prevalence of PMEN2 fell over subsequent years but despite these alterations in clinical practice and the presence of other MDR genotypes , serotype 6B pneumococci still constituted around 3% of Icelandic penicillin non-susceptible clinical isolates in 2010 . The genotype was found to be variably prevalent in different communities, disappearing from some but persisting in others in a manner that was suggested to relate to local antibiotic prescribing rates and herd immunity . Furthermore, some PMEN2 representatives with losses of particular resistances were identified, which may have been a sign of adaptation to reduced antibiotic consumption . In the case of tetracycline and macrolides, reversions to susceptibility appeared to have occurred through more than one type of mutation, as they were associated with both the presence and absence of the associated resistance gene.
As well as being found throughout Western Europe from the 1980s onwards [6, 14–17], the PMEN2 lineage has also been detected in the United States [18–21], South America [22, 23], the Middle East , South-East Asia [25–27], Australia  and Japan . Multilocus sequence typing (MLST) generally finds such isolates to be of, or similar to, sequence type (ST) 90. This suggests it is closely related to the PMEN22 lineage, first identified in countries around the Mediterranean [30, 31] and subsequently found in Western Europe and the United States [32, 33]. The PMEN22 genotype is associated with ST273, which differs from ST90 at only two of the seven MLST loci, and generally has a similar resistance profile to PMEN2 with the exception that representatives are typically β-lactam susceptible. An international collection of 189 isolates was, therefore, assembled to identify the relationship between these two lineages and trace the emergence and spread of the PMEN2 clone. This included 118 isolates from Iceland to reconstruct the outbreak on this island in detail, and thereby characterise its initial success and subsequent population dynamics in response to altered antibiotic dispensing practices.
Multiple resistance elements across a diverse collection
A second clade corresponded to the PMEN22 lineage (Figure 1). This clade had a relatively high r/m: 18.84 overall and 13.73 excluding recombinations within MGEs. These isolates all inherited the insertion of a single, novel large Tn5253-type ICE, consequently named ICESp6BST273 (Figure 2 and Additional file 6: Figure S5). Macrolide resistance across the clade was mainly the result of a Tn917 cassette  within the ICE carrying a ribosomal RNA methylase gene (ermB). Chloramphenicol resistance was the result of a cat chloramphenicol acetyltransferase within a pC194-type sequence  as part of a gene cluster very similar to the Ωcat(pC194) element found within Tn5253 [see Additional file 6: Figure S5]. De novo assemblies indicated this sequence could transpose to different loci within the genome, sometimes integrated into the ICE or Pneumococcal Pathogenicity Island-1, as was observed for the archetypal element . This represents a further example of sequence exchange between ICE and this island [42, 46]. The PMEN22 clade was generally β-lactam sensitive, except for a single resistant isolate (SN33007) that was predicted to have imported 23 segments of sequence, totaling just under 254 kb in length, that affected the penicillin binding protein genes pbp1a, pbp2x and pbp2b, but not pbp2a or murMN.
Diversification and dissemination of PMEN2
β-lactam resistance was independently acquired by the largest clade, corresponding to PMEN2, all isolates of which were distinguished from the Outgroup and PMEN22 clades by 17 shared segments of recombinant sequence that totaled 156 kb in length. This resulted in the import of resistance-associated alleles of pbp1a, pbp2x and pbp2b, but no contemporaneous recombinations affecting pbp2a or murMN were detected. Resistance to tetracycline and chloramphenicol resulted from the acquisition of ICESp6BST90 on the same branch of the phylogeny, which was shared by all PMEN2 representatives through common descent [see Additional file 6: Figure S5]; this contained both a Tn916-type element and a Ωcat(pC194)-like element [44, 45] [see Additional file 7: Figure S6]. Therefore, the antibiotic resistance determinants of PMEN2 and PMEN22 appear to have been entirely independently acquired. Many isolates were also resistant to macrolides owing to the acquisition of the ermB gene carried on either the Omega cassette  or Tn917, both of which appear to have been acquired more than once during the evolution of PMEN2 [see Additional file 5: Figure S4]. Based on one atypical Tn916-type sequence [see Additional file 8: Figure S7], it seems likely that both cassettes were acquired by a recent common ancestor of one particular clade of three isolates, thereby arranging two highly similar ermB genes in tandem. A subsequent intragenomic recombination between them appears to have deleted the intervening sequence, resulting in a reversion to tetracycline susceptibility owing to the loss of the tetM gene, but leaving the aminoglycoside resistance gene aph3’ and ermB in its place.
A plot of the root-to-tip distance of the isolates within the PMEN2 clade relative to their date of isolation revealed a significant linear correlation (n = 168, R2 = 0.63, P value <2.2 × 10-16) that suggested a date of origin around 1967 [see Additional file 9: Figure S8A]. A Bayesian coalescent analysis of the clade estimated the most likely date of origin to be 1969 (95% credibility interval, 1962 to 1974), implying a point mutation rate of 1.11 × 10-6 substitutions per site per year (95% credibility interval of 9.82 × 10-7 to 1.24 × 10-6 substitutions per site per year). This origin is likely to have occurred in Western Europe (Figure 1), in accordance with the site of the first identification of the clone . PMEN2 then seems to have spread to other continents on multiple occasions, concurrent with isolates undergoing extensive diversification across much of the genome.
Clonal evolution within Iceland
By contrast, the Icelandic isolates formed two clades (IC1 and IC2), neither of which exhibited strong evidence of having engaged in homologous recombination (Figure 1). Clade IC2 appears to represent a relatively unsuccessful transmission to the island; all six representatives were macrolide sensitive, as they carried versions of ICESp6BST90 lacking an ermB-containing cassette. The most recent common ancestor of IC2 was estimated to have existed around 1984 (95% credibility interval, 1980 to 1987) shortly before the first clinical isolate was detected, but no isolates belonging to the clade were detected from 1994 onwards. The IC2 isolates corresponded to three similar pulsed field gel electrophoresis (PFGE) patterns ; although no putative homologous recombinations were identified within this group that might have caused such diversification, there were five recombination events within MGEs indicating that the acquisition of autonomously mobile elements may have caused the alterations. The larger IC1 clade was composed of 112 isolates and appears to have entered Iceland at a similar time, around 1986 (95% credibility interval, 1985 to 1987). These isolates were generally macrolide resistant due to the insertion of Tn917 into ICESp6BST90 (Figure 2).
Only 20 recombinations were detected within clade IC1, of which 14 occurred within prophage and appear to represent the movement of viral sequences. Each of the remaining six events imported no more than eight substitutions, far below the mean of 104 substitutions per recombination in the wider collection. Consequently, all of the isolates tested near the start of the outbreak corresponded to a single PFGE type , and the clade’s r/m was 0.18, falling to 0.024 when recombinations within MGEs were excluded. This seems largely due to the insertion of a prophage, henceforth referred to as ΦIC1, into the comYC (also known as comGC) gene, which encodes a component of the competence pilus essential for the uptake of exogenous DNA into the cell . The integration of prophage into this locus has previously been experimentally associated with the loss of competence for transformation . Sequence read mapping to ΦIC1 revealed that related viruses with very similar integrases, which seem to target the same locus based on cases where the prophage could be assembled de novo, appear to have inserted into PMEN2 representatives across the world on several occasions [see Additional file 10: Figure S9 and Additional file 11: Figure S10]. This analysis of prophage distribution also showed that the ΦIC1 virus is absent from the chromosomes of many isolates within clade IC1, indicative of frequent loss. Based on de novo assemblies, the comYC gene seems to have returned to a functional form in these samples, demonstrating that the MGE’s excision is precise. However, loss of the prophage was not associated with a detectably elevated rate of sequence import, as only one of the putative homologous recombination events was associated with isolates carrying ‘repaired’ comYC genes.
Phylodynamics of the Icelandic outbreak
The phylogeny supported the hypothesis that PMEN2 entered Iceland from Western Europe  (Figure 1 and Additional file 4: Figure S3), but not the suggestion of Spain in particular being the source, which was based on genotyping of isolates SP522 and SP681 (Figure 1). The isolate most closely related to clade IC1 was from Germany, albeit from 1998, post-dating the estimated dates of clades IC1 and IC2 arriving in Iceland by around a decade. The transmission and population dynamics of clade IC1 subsequent to its entry into Iceland were reconstructed using a coalescent model to combine the sequence alignment with the detailed epidemiological information available. This analysis could be performed independently of the rest of the collection, as this clade alone showed a significant correlation between root-to-tip distance within the clade and day of isolation (n = 112, R2 = 0.56, P value <2.2 × 10-16; Additional file 9: Figure S8B). This estimated the date of IC1’s common ancestor as around October 1987 (95% confidence interval September 1986 to August 1988), consistent with the overall analysis of the PMEN2 clade.
‘Disarming’ of an MDR clone
Group 3 isolates were again tetracycline sensitive, but differed in also being susceptible to macrolides, having apparently lost both the tetM and ermB genes. This phenotype appears to have emerged twice within clade IC1; in both cases, it was associated with the excision of the Tn916-type element. This loss was evident both from comparison of de novo assemblies with the reference, as well as an absence of sequence read mapping to the element (Figure 5). The removal of the ICE was precise and led to the restoration of a gene encoding a zinc-dependent peptidase that appeared to have been split by the insertion of the Tn916-type component in the formation of the composite ICE. Aside from the deletion within Tn916 driven by the tandem ermB genes, this element was stable across the rest of the PMEN2 clade, as observed in other lineages .
Group 4 isolates were tetracycline resistant but macrolide sensitive, despite retaining the ermB gene and, by implication, the Tn916-type element. This phenotype was found in a single clade of three isolates within IC1. De novo assembly revealed this to be the consequence of 8 bp of the ermB CDS being replaced by 7 bp of divergent sequence, introducing a frameshift mutation that truncated the resistance gene. The disruptive 7 bp sequence is an inverted copy of the same pattern of bases found downstream within ermB, suggesting that the change may have been driven by an unusual intragenomic recombination. This mutation only appears to have occurred once across the entire collection.
Group 5 isolates were defined as those displaying susceptibility to chloramphenicol; however, it was not previously possible to determine whether this was correlated with the loss of the cat chloramphenicol acetyltransferase gene. Analysis of the de novo genome assemblies revealed that the frequent emergence of this phenotype in the collection was, once more, associated with isolates both possessing, and having lost, the ICE-encoded gene. One clade of isolates had a 9.36 kb deletion within the Tn5252-type component of the ICE that eliminated part of the Ωcat(pC194)-like element, including the cat gene; this appears to have resulted from an intragenomic recombination between the long tandemly-repeated sequences on either side (Figure 5D). This unstable element also appears to have been lost outside of Iceland on several occasions [see Additional file 6: Figure S5]. However, no obvious candidate mutations causing chloramphenicol susceptibility could be found affecting the Ωcat(pC194)-like element in other group 5 representatives, although ambiguities in establishing the element’s mobility and insertion site made it difficult to fully characterize in all isolates. Group 6 isolates, susceptible to chloramphenicol, tetracycline and macrolides despite retaining the required genes, similarly lacked any obvious mutation underlying their resistance profile.
The PMEN2 lineage appears to have emerged in the late 1960s, around the same time as PMEN1 (Spain23F-1), when β-lactam non-susceptible pneumococci were first being detected . Therefore, PMEN2 is likely to have been one of the first such genotypes. Both PMEN1 and PMEN2 seem to have acquired near-identical penicillin-binding protein sequences, and both acquired cat from pC194, and tetM from Tn916, as part of a Tn5253-type element [42, 47]. While PMEN1 was more successful in many countries, it was PMEN2 that predominated in Iceland, despite the presence of other MDR clones (PMEN1 among them) . More specifically, clade IC1 of PMEN2 appears to have been much more successful than IC2, which may reflect the macrolide resistance cassette carried by clade IC1. This is supported by the observation that treatment with erythromycin was associated with double the risk of carrying penicillin resistant pneumococci as treatment with β-lactams in Iceland at the time .
The fall in dispensing of antibiotics to children that started in 1992 is likely to have reduced the selection pressure driving PMEN2’s success within Iceland . In the mid-1990s, this seems to be reflected in the instances of the MDR clone being ‘disarmed’. Each of the reversions to chloramphenicol, tetracycline and macrolide sensitivity occurred both in the presence and absence of the relevant resistance gene; furthermore, some of the specific mutations were homoplasic within clade IC1. Such convergent evolution is likely driven by selection; this would suggest that these resistance mechanisms impose a fitness cost upon the host cell and, therefore, tend to degrade as they became less advantageous . However, these variants appear only to have enjoyed limited success, and primarily seem to have been observed when the PMEN2 clone was present at a comparatively high frequency relative to the use of antibiotics in children selecting for resistance. Both the count of clinical isolates and apparent change in Neτ indicate a drop in the prevalence of clade IC1 beginning in the mid- to late-1990s. Once the clone fell to a lower, stable frequency in response to this change in prescribing, the pressure to revert to a sensitive phenotype is likely to have been reduced. This may explain why no further reversions to macrolide resistance were observed from 1997 onwards. Alternatively, the limited success of these variants may indicate that such mutations are only adaptive over the short-term to atypically low antibiotic use in a certain community .
By contrast, non-ICE ‘core’ sequences that caused antibiotic resistance were stable within clade IC1. For instance, resistance to penicillins and co-trimoxazole was retained despite the declining use of these drugs. This is inevitable in the context of the clonal evolution of clade IC1; in the absence of allelic replacement at dyr (encoding dihydrofolate reductase, the sequence of which determines susceptibility to trimethoprim), folP (encoding dihydropteroate synthase, the sequence of which determines susceptibility to sulphonamides) or penicillin-binding protein genes, reversion to the sensitive ancestral sequences was not possible. This is similar to the pattern of evolution observed in serotype 3 isolates  and in contrast to the rapid import of divergent sequence observed in PMEN1  and PMEN2 isolates from outside Iceland in this collection. This absence of transformation largely appears to be the consequence of the disruption of comYC by the insertion of prophage ΦIC1, although it is not clear why the isolates lacking this element still do not import sequence diversity from other pneumococci. It may be that the prophage is unstable only during disease or laboratory culturing, and that, in fact, the prophage remained inserted in all carried IC1 isolates. This failure to adapt to changing levels of antibiotic use in children through horizontal import of sequence may have played a part in the subsequent decline of clade IC1 in Iceland after the mid-1990s.
Alternatively, the lack of antigenic diversification resulting from the absence of recombination may have contributed to the genotype’s fall in prevalence. The high density of recombination events affecting the pspC locus outside of the IC1 clade suggests a selective pressure to alter such antigen-encoding genes. Therefore, the decrease in clade IC1’s frequency may be the consequence of the resident population accumulating herd immunity to the genotype ; this may be particularly acute in Iceland, as the phylogeography of the lineage suggests that only Reykjavík and the Southern Peninsular region sustained the genotype long-term, potentially acting as a source for the rest of the country. Declining transmission within these communities may have resulted in an ensuing fall in the rate of spread of the clade to other locations. Furthermore, as the acquisition of a capsule differing extensively from 6B seems unlikely within clade IC1, it seems destined for elimination following the introduction of polysaccharide conjugate vaccines (which induces immunity against both serotype 6A and 6B) in 2011 .
PMEN2 emerged in the 1960s, and a single transmission into Iceland in the late 1980s appears to have been particularly successful despite its lack of sequence import through homologous recombination. While this limitation did not appear to inhibit the loss of antibiotic resistance found on mobile genetic elements, it prevented modification of ‘core’ sequences associated with resistance and antigenic diversification, which likely contributed to its decline after the mid-1990s.
Selection of isolates for sequencing
Isolates with an MLST profile similar to that of ST90 were supplied by the sources listed in Additional file 1: Table S1. In the case of the Icelandic representatives of PMEN2, isolates were selected to produce a temporally even sample over the course of the lineage’s spread across the island when possible; these were supplemented by a set specifically chosen as exhibiting atypical resistance profiles. Isolates were supplied by the same set of clinical centres over the course of the study, but the region from which they received samples was reduced from the entire island pre-1995 to only the Capitol and Southwestern regions post-1995. However, these two districts accounted for the vast majority of isolates across all years.
DNA sequencing and phylogenomics
DNA was sequenced as multiplexed libraries on the Illumina Genome Analyzer II and HiSeq platforms, and the raw data deposited in the European Nucleotide Archive (ENA), as described by the read lengths and accession codes in Additional file 1: Table S1. Serotype and sequence type were identified as described in  to check the integrity of sample handling. Sequence reads were mapped against the reference sequence of S. pneumoniae 670-6B (EMBL accession code: CP002176), an annotated complete genome , using SMALT v0.6.4 as described previously . MGEs were identified in the reference genome through alignments with other complete pneumococcal genomes. Polymorphisms were identified using criteria defined in . Samples showing signs of contamination, based on the quality of de novo assemblies and signs of the presence of multiple alleles in mapped sequence read data, were discarded. Only samples with a mean coverage across the entire reference genome above 25 fold and able to call bases at >90% of reference positions, were used in the phylogenomic analysis. This generated a reference-based whole genome alignment for phylogenetic and clustering analyses. The identification of putative recombination events, and generation of a phylogeny based on vertically-inherited point mutations, was performed as described previously . This method identifies recombinations as regions of the genome affected by high densities of base substitutions on individual branches of the phylogeny, which may reflect homologous recombination or the movement of MGEs. Therefore, to exclude horizontal sequence exchange likely driven by the latter mechanism, homologous recombination events were defined as those occurring outside of the annotated MGEs in the reference sequence.
Independently, the same whole genome alignment was analysed using BRATNextGen , assuming three clusters, using a learned value of alpha, a window size of one kilobase and a significance threshold P value of 0.05 (as calculated from 100 permutations).
Bayesian coalescent analyses
For the analysis of the PMEN2 clade, the topology of the phylogeny was fixed as that of the relevant clade of the overall phylogeny, in order to maintain consistency with the prediction of recombinations. Analysis with Path-O-Gen  revealed that this phylogeny displayed a significant correlation between the root-to-tip distance of a sample and its year of isolation [see Additional file 9: Figure S8A]. The nucleotide alignment used was the subset of positions at which base substitutions occurred within the clade, with those substitutions introduced by recombination excluded from the alignment, and the dates of isolation used were the year, or range of years, provided in Additional file 1: Table S1. Samples for which no information regarding date of isolation was available were each associated with a uniform prior distribution across the range of years of isolation of the rest of the collection. Analysis with Bayesian evolutionary analysis using sampling trees (BEAST)  used a general time-reversible (GTR) substitution model with a single rate category and a lognormal relaxed molecular clock rate . A Bayesian skyline plot was used as the population demography prior . Ten chains of 100 million states each were run; 50 million generations were removed as burn-in and the remaining data combined. All parameters were estimated with estimated sample size (ESS) values above 200, with the exception of the year of isolation of samples SPN11926 and SN11927. However, joint marginal plots demonstrated that neither of these parameters showed a significant correlation with any of the values described in the text. This model was compared to four others: a strict molecular clock, a random molecular clock, a lognormal relaxed clock with four rate categories and a lognormal relaxed clock with a Hasegawa, Kishino and Yano (HKY) substitution model. These analyses used the same input data and were processed in the same way, and the same parameters were estimated with ESS values above 200. As assessed by Bayes factors , these alternative models all fitted the data less well than that used for the analysis described in the text [see Additional file 12: Table S2]. In the case of the model using four rate categories the fit was only slightly worse as judged by Bayes factors. However, the coefficient of variation in the absence of multiple rate categories (median estimate of 0.42; 95% credibility interval of 0.32 to 0.52) was not found to provide evidence of extensive rate heterogeneity, hence this alternative model was rejected.
For the analysis of the IC1 clade, the day of isolation was used, where available; otherwise, the 15th day of the month of isolation was used as an approximation of the day of isolation. Again, a significant correlation between root-to-tip distance and day of isolation was observed [see Additional file 9: Figure S8B]. The analysis was otherwise conducted as described for the overall PMEN2 clade. The displayed tree and Bayesian skyline plot was calculated using 10,000 trees (matched to the equivalent number of states) derived through removing 50 million generations from each of the ten 100 million generation chains as burn-in, then resampling the remaining generations at a frequency of 1 per 50,000. All values were estimated with an ESS greater than 200. This model was also compared to the same four alternative models as described above [see Additional file 13: Table S3], and once more the only other model to give a similarly good fit to the data as the described analysis was that using a lognormal relaxed molecular clock with four rate categories. However, the inclusion of the rate categories did not significantly improve the fit, and the coefficient of variation in the described analysis with a single rate category had a median value of 0.33 (95% credibility interval of 0.22 to 0.45), which suggested there was not strong evidence of extensive rate heterogeneity. Hence, the model with multiple rate categories was rejected on the grounds of parsimony.
A discrete state phylogeographic model was used to reconstruct the transmission of the IC1 clade across Iceland . When testing for significance of the observed patterns through permutations, 100 versions of the alignment were generated in which the locations were randomly reassigned to isolates. The tree, sequences and dates of isolation were held constant. Each permutation was analysed with BEAST using the same model, but with a chain length of 10 million generations; this was sufficient for the parameters of the discrete states phylogeographic model to have converged. One million of these were removed as burn-in and the remaining trees (sampled at a frequency of 1 per 5,000 generations) used to determine the robustness of the phylogeographic reconstruction.
Analysis of accessory genome distribution
For identification of novel accessory genome components, Illumina sequence reads were assembled using Velvet . This was run iteratively to identify the k-mer that gave the highest N50 value, while maintaining an expected level of coverage above 20, as described previously . Scaffolds were then generated using SSPACE2 , which were subsequently improved using the post assembly genome improvement toolkit (PAGIT) pipeline . Sequence alignments were performed using BLAT  with default settings and displayed using ACT . The ICE displayed in Figure 2 have been submitted to the ENA with the following accession codes: ICESp22664 (HG799489), ICESpDCC1902 (HG799491), ICESpDCC1738 (HG799492), ICESpDCC1524 (HG799493) and ICESp6BST273 (HG799495). ICESpSPN8332, displayed in Additional file 8: Figure S7, has been submitted to the ENA with accession code HG799498. The prophage sequence in Additional file 11: Figure S10 have been submitted to the ENA with the following accession codes: ϕDCC1738 (HG799497), ϕK13-0810 (HG799496) and ϕIC1 (HG799490).
To ascertain the distribution of sequence reads across the collection, reads were mapped to the accessory genome component using BWA v0.7.3 . The coverage plots were then generated using Samtools . Heatmaps were generated using Biopython .
This work was funded by Wellcome Trust grant number 098051. NJC was funded by an AXA Foundation postdoctoral fellowship. We thank the core sequencing and bioinformatic teams at the Wellcome Trust Sanger Institute for their support. We thank members of the Active Bacterial Core surveillance program of the Emerging Infections Program of the Centers for Disease Control and Prevention (CDC), and Global Strain Bank Project, for contribution of strains provided by LM. The Global Strain Bank Project is a PATH-funded collaborative project between Emory University, the CDC and external contributors. We thank Dr. Thibaut Jombart and Prof. Christophe Fraser for helpful comments, and attending authors are grateful for the opportunity to discuss the project at the PERMAFROST workshop.
- Klugman KP: The successful clone: the vector of dissemination of resistance inStreptococcus pneumoniae. J Antimicrob Chemother. 2002, 50: 1-5.PubMedView ArticleGoogle Scholar
- Jacobs MR, Koornhof HJ, Robins-Browne RM, Stevenson CM, Vermaak ZA, Freiman I, Miller GB, Witcomb MA, Isaacson M, Ward JI, Austrian R: Emergence of multiply resistant pneumococci. N Engl J Med. 1978, 299: 735-740.PubMedView ArticleGoogle Scholar
- Fenoll A, Martin Bourgon C, Munoz R, Vicioso D, Casal J: Serotype distribution and antimicrobial resistance ofStreptococcus pneumoniaeisolates causing systemic infections in Spain, 1979–1989. Rev Infect Dis. 1991, 13: 56-60.PubMedView ArticleGoogle Scholar
- Marton A, Gulyas M, Munoz R, Tomasz A: Extremely high incidence of antibiotic resistance in clinical isolates of Streptococcus pneumoniae in Hungary. J Infect Dis. 1991, 163: 542-548.PubMedView ArticleGoogle Scholar
- Kristinsson KG, Hjalmarsdottir MA, Steingrimsson O: Increasing penicillin resistance in pneumococci in Iceland. Lancet. 1992, 339: 1606-PubMedView ArticleGoogle Scholar
- Soares S, Kristinsson KG, Musser JM, Tomasz A: Evidence for the introduction of a multiresistant clone of serotype 6B Streptococcus pneumoniae from Spain to Iceland in the late 1980s. J Infect Dis. 1993, 168: 158-163.PubMedView ArticleGoogle Scholar
- Sá-Leão R, Vilhelmsson SE, de Lencastre H, Kristinsson KG, Tomasz A: Diversity of penicillin-nonsusceptibleStreptococcus pneumoniae circulating in Iceland after the introduction of penicillin-resistant clone Spain6B-2. J Infect Dis. 2002, 186: 966-975.PubMedView ArticleGoogle Scholar
- Hjálmarsdóttir M, Kristinsson K: Epidemiology of penicillin-non-susceptible pneumococci in Iceland, 1995–2010. J Antimicrob Chemother. 2013, 69: 940-946.PubMedView ArticleGoogle Scholar
- Vilhelmsson SE, Tomasz A, Kristinsson KG: Molecular evolution in a multidrug-resistant lineage of Streptococcus pneumoniae: emergence of strains belonging to the serotype 6B Icelandic clone that lost antibiotic resistance traits. J Clin Microbiol. 2000, 38: 1375-1381.PubMed CentralPubMedGoogle Scholar
- McGee L, McDougal L, Zhou J, Spratt BG, Tenover FC, George R, Hakenbeck R, Hryniewicz W, Lefevre JC, Tomasz A, Klugman KP: Nomenclature of major antimicrobial-resistant clones of Streptococcus pneumoniae defined by the pneumococcal molecular epidemiology network. J Clin Microbiol. 2001, 39: 2565-2571.PubMed CentralPubMedView ArticleGoogle Scholar
- Kristinsson KG: Effect of antimicrobial use and other risk factors on antimicrobial resistance in pneumococci. Microb Drug Resist. 1997, 3: 117-123.PubMedView ArticleGoogle Scholar
- Kristinsson KG: Modification of prescribers’ behavior: the Icelandic approach. Clin Microbiol Infect. 1999, 5: S43-S47.PubMedView ArticleGoogle Scholar
- Arason VA, Sigurdsson JA, Erlendsdottir H, Gudmundsson S, Kristinsson KG: The role of antimicrobial use in the epidemiology of resistant pneumococci: a 10-year follow up. Microb Drug Resist. 2006, 12: 169-176.PubMedView ArticleGoogle Scholar
- Lefevre JC, Bertrand MA, Faucon G: Molecular analysis by pulsed-field gel electrophoresis of penicillin-resistant Streptococcus pneumoniae from Toulouse, France. Eur J Clin Microbiol Infect Dis. 1995, 14: 491-497.PubMedView ArticleGoogle Scholar
- Reichmann P, Varon E, Gunther E, Reinert RR, Luttiken R, Marton A, Geslin P, Wagner J, Hakenbeck R: Penicillin-resistant Streptococcus pneumoniae in Germany: genetic relationship to clones from other European countries. J Med Microbiol. 1995, 43: 377-385.PubMedView ArticleGoogle Scholar
- Enright MC, Fenoll A, Griffiths D, Spratt BG: The three major Spanish clones of penicillin-resistant Streptococcus pneumoniae are the most common clones recovered in recent cases of meningitis in Spain. J Clin Microbiol. 1999, 37: 3210-3216.PubMed CentralPubMedGoogle Scholar
- Munoz R, Musser JM, Crain M, Briles DE, Marton A, Parkinson AJ, Sorensen U, Tomasz A: Geographic distribution of penicillin-resistant clones of Streptococcus pneumoniae: characterization by penicillin-binding protein profile, surface protein A typing, and multilocus enzyme analysis. Clin Infect Dis. 1992, 15: 112-118.PubMedView ArticleGoogle Scholar
- Versalovic J, Kapur V, Mason EO, Shah U, Koeuth T, Lupski JR, Musser JM: Penicillin-resistant Streptococcus pneumoniae strains recovered in Houston: identification and molecular characterization of multiple clones. J Infect Dis. 1993, 167: 850-856.PubMedView ArticleGoogle Scholar
- Brown SD, Farrell DJ, Morrissey I: Prevalence and molecular analysis of macrolide and fluoroquinolone resistance among isolates of Streptococcus pneumoniae collected during the 2000–2001 PROTEKT US Study. J Clin Microbiol. 2004, 42: 4980-4987.PubMed CentralPubMedView ArticleGoogle Scholar
- Gherardi G, Whitney CG, Facklam RR, Beall B: Major related sets of antibiotic-resistant Pneumococci in the United States as determined by pulsed-field gel electrophoresis and pbp1a-pbp2b-pbp2x-dhf restriction profiles. J Infect Dis. 2000, 181: 216-229.PubMedView ArticleGoogle Scholar
- Gertz RE, McEllistrem MC, Boxrud DJ, Li Z, Sakota V, Thompson TA, Facklam RR, Besser JM, Harrison LH, Whitney CG: Clonal distribution of invasive pneumococcal isolates from children and selected adults in the United States prior to 7-valent conjugate vaccine introduction. J Clin Microbiol. 2003, 41: 4194-4216.PubMed CentralPubMedView ArticleGoogle Scholar
- Quintero B, Araque M, van der Gaast-de Jongh C, Hermans PW: Genetic diversity of Tn916-related transposons among drug-resistant streptococcus pneumoniae isolates colonizing healthy children in Venezuela. Antimicrob Agents Chemother. 2011, 55: 4930-4932.PubMed CentralPubMedView ArticleGoogle Scholar
- Gherardi G, Inostrozo JS, O’Ryan M, Prado V, Prieto S, Arellano C, Facklam RR, Beall B: Genotypic survey of recent β-lactam-resistant pneumococcal nasopharyngeal isolates from asymptomatic children in Chile. J Clin Microbiol. 1999, 37: 3725-3730.PubMed CentralPubMedGoogle Scholar
- Greenberg D, Dagan R, Muallem M, Porat N: Antibiotic-resistant invasive pediatric Streptococcus pneumoniae clones in Israel. J Clin Microbiol. 2003, 41: 5541-5545.PubMed CentralPubMedView ArticleGoogle Scholar
- Ip M, Lyon DJ, Yung RW, Tsang L, Cheng AF: Introduction of new clones of penicillin-nonsusceptible Streptococcus pneumoniae in Hong Kong. J Clin Microbiol. 2002, 40: 1522-1525.PubMed CentralPubMedView ArticleGoogle Scholar
- Shi ZY, Enright MC, Wilkinson P, Griffiths D, Spratt BG: Identification of three major clones of multiply antibiotic-resistant Streptococcus pneumoniae in Taiwanese hospitals by multilocus sequence typing. J Clin Microbiol. 1998, 36: 3514-3519.PubMed CentralPubMedGoogle Scholar
- Choi EH, Kim SH, Eun BW, Kim SJ, Kim NH, Lee J, Lee HJ: Streptococcus pneumoniae serotype 19A in children, South Korea. Emerg Infect Dis. 2008, 14: 275-PubMed CentralPubMedView ArticleGoogle Scholar
- Aanensen DM, Spratt BG: The multilocus sequence typing network: mlst.net. Nucleic Acids Res. 2005, 33: W728-W733.PubMed CentralPubMedView ArticleGoogle Scholar
- Imai S, Ito Y, Ishida T, Hirai T, Ito I, Maekawa K, Takakura S, Iinuma Y, Ichiyama S, Mishima M: High prevalence of multidrug‒resistant Pneumococcal molecular epidemiology network clones among Streptococcus pneumoniae isolates from adult patients with community‒acquired pneumonia in Japan. Clin Microbiol Infect. 2009, 15: 1039-1045.PubMedView ArticleGoogle Scholar
- Syrogiannopoulos GA, Ronchetti F, Dagan R, Grivea I, Ronchetti MP, Porat N, Davies TA, Ronchetti R, Appelbaum PC, Jacobs MR: Mediterranean clone of penicillin-susceptible, multidrug-resistant serotype 6B Streptococcus pneumoniae in Greece, Italy and Israel. Int J Antimicrob Agents. 2000, 16: 219-224.PubMedView ArticleGoogle Scholar
- Syrogiannopoulos GA, Grivea IN, Beratis NG, Spiliopoulou AE, Fasola EL, Bajaksouzian S, Appelbaum PC, Jacobs MR: Resistance patterns of Streptococcus pneumoniae from carriers attending day-care centers in southwestern Greece. Clin Infect Dis. 1997, 25: 188-194.PubMedView ArticleGoogle Scholar
- Sá-Leão R, Tomasz A, Sanches IS, Brito-Avô A, Vilhelmsson SE, Kristinsson KG, de Lencastre H: Carriage of internationally spread clones of Streptococcus pneumoniae with unusual drug resistance patterns in children attending day care centers in Lisbon. Port J Infect Dis. 2000, 182: 1153-1160.View ArticleGoogle Scholar
- Sá-Leão R, Tomasz A, de Lencastre H: Multilocus sequence typing of Streptococcus pneumoniae clones with unusual drug resistance patterns: genetic backgrounds and relatedness to other epidemic clones. J Infect Dis. 2001, 184: 1206-1210.PubMedView ArticleGoogle Scholar
- Harris SR, Feil EJ, Holden MT, Quail MA, Nickerson EK, Chantratita N, Gardete S, Tavares A, Day N, Lindsay JA, Edgeworth JD, de Lencastre H, Parkhill J, Peacock SJ, Bentley SD: Evolution of MRSA during hospital transmission and intercontinental spread. Science. 2010, 327: 469-474.PubMed CentralPubMedView ArticleGoogle Scholar
- Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP, Bentley SD: Rapid pneumococcal evolution in response to clinical interventions. Science. 2011, 331: 430-434.PubMed CentralPubMedView ArticleGoogle Scholar
- Croucher NJ, Finkelstein JA, Pelton SI, Mitchell PK, Lee GM, Parkhill J, Bentley SD, Hanage WP, Lipsitch M: Population genomics of post-vaccine changes in pneumococcal epidemiology. Nat Genet. 2013, 45: 656-663.PubMedView ArticleGoogle Scholar
- Croucher NJ, Mitchell AM, Gould KA, Inverarity D, Barquist L, Feltwell T, Fookes MC, Harris SR, Dordel J, Salter SJ, Browall S, Zemlickova H, Parkhill J, Normark S, Henriques-Normark B, Hinds J, Mitchell TJ, Bentley SD: Dominant role of nucleotide substitution in the diversification of serotype 3 pneumococci over decades and during a single infection. PLoS Genet. 2013, 9: e1003868-PubMed CentralPubMedView ArticleGoogle Scholar
- Croucher NJ, Harris SR, Barquist L, Parkhill J, Bentley SD: A high-resolution view of genome-wide pneumococcal transformation. PLoS Pathog. 2012, 8: e1002745-PubMed CentralPubMedView ArticleGoogle Scholar
- Brooks-Walter A, Briles DE, Hollingshead SK: The pspC gene of Streptococcus pneumoniae encodes a polymorphic protein, PspC, which elicits cross-reactive antibodies to PspA and provides immunity to pneumococcal bacteremia. Infect Immun. 1999, 67: 6533-6542.PubMed CentralPubMedGoogle Scholar
- Mavroidi A, Godoy D, Aanensen DM, Robinson DA, Hollingshead SK, Spratt BG: Evolutionary genetics of the capsular locus of serogroup 6 pneumococci. J Bacteriol. 2004, 186: 8181-8192.PubMed CentralPubMedView ArticleGoogle Scholar
- Ayoubi P, Kilic AO, Vijayakumar MN: Tn5253, the pneumococcal omega (cat tet) BM6001 element, is a composite structure of two conjugative transposons, Tn5251 and Tn5252. J Bacteriol. 1991, 173: 1617-1622.PubMed CentralPubMedGoogle Scholar
- Croucher NJ, Walker D, Romero P, Lennard N, Paterson GK, Bason NC, Mitchell AM, Quail MA, Andrew PW, Parkhill J, Bentley SD, Mitchell TJ: Role of conjugative elements in the evolution of the multidrug-resistant pandemic clone Streptococcus pneumoniaeSpain23F ST81. J Bacteriol. 2009, 191: 1480-1489.PubMed CentralPubMedView ArticleGoogle Scholar
- Shaw JH, Clewell DB: Complete nucleotide sequence of macrolide-lincosamide-streptogramin B-resistance transposon Tn917 in Streptococcus faecalis. J Bacteriol. 1985, 164: 782-796.PubMed CentralPubMedGoogle Scholar
- Widdowson CA, Adrian PV, Klugman KP: Acquisition of chloramphenicol resistance by the linearization and integration of the entire staphylococcal plasmid pC194 into the chromosome of Streptococcus pneumoniae. Antimicrob Agents Chemother. 2000, 44: 393-395.PubMed CentralPubMedView ArticleGoogle Scholar
- Iannelli F, Santoro F, Oggioni MR, Pozzi G: Nucleotide sequence analysis of integrative conjugative element Tn5253 of Streptococcus pneumoniae. Antimicrob Agents Chemother. 2013, 58: 1235-1239.PubMedView ArticleGoogle Scholar
- Wyres KL, van Tonder A, Lambertsen LM, Hakenbeck R, Parkhill J, Bentley SD, Brueggemann AB: Evidence of antimicrobial resistance-conferring genetic elements among pneumococci isolated prior to 1974. BMC Genomics. 2013, 14: 500-PubMed CentralPubMedView ArticleGoogle Scholar
- Wyres KL, Lambertsen LM, Croucher NJ, McGee L, von Gottberg A, Liñares J, Jacobs MR, Kristinsson KG, Beall BW, Klugman KP: The multidrug-resistant PMEN1 pneumococcus is a paradigm for genetic success. Genome Biol. 2012, 13: R103-PubMed CentralPubMedView ArticleGoogle Scholar
- Pestova EV, Morrison DA: Isolation and characterization of three Streptococcus pneumoniae transformation-specific loci by use of a lacZ reporter insertion vector. J Bacteriol. 1998, 180: 2701-2710.PubMed CentralPubMedGoogle Scholar
- Drummond AJ, Rambaut A, Shapiro B, Pybus OG: Bayesian coalescent inference of past population dynamics from molecular sequences. Mol Biol Evol. 2005, 22: 1185-1192.PubMedView ArticleGoogle Scholar
- Su Y, He P, Clewell D: Characterization of the tet (M) determinant of Tn916: evidence for regulation by transcription attenuation. Antimicrob Agents Chemother. 1992, 36: 769-778.PubMed CentralPubMedView ArticleGoogle Scholar
- Hansman D, Bullen M: A resistant pneumococcus. Lancet. 1967, 2: 264-265.View ArticleGoogle Scholar
- Arason VA, Kristinsson KG, Sigurdsson JA, Stefansdottir G, Mölstad S, Gudmundsson S: Do antimicrobials increase the carriage rate of penicillin resistant pneumococci in children? Cross sectional prevalence study. BMJ. 1996, 313: 387-PubMed CentralPubMedView ArticleGoogle Scholar
- Goossens H, Ferech M, Vander Stichele R, Elseviers M: Outpatient antibiotic use in Europe and association with resistance: a cross-national database study. Lancet. 2005, 365: 579-587.PubMedView ArticleGoogle Scholar
- Donati C, Hiller NL, Tettelin H, Muzzi A, Croucher NJ, Angiuoli SV, Oggioni M, Dunning Hotopp JC, Hu FZ, Riley DR, Covacci A, Mitchell TJ, Bentley SD, Kilian M, Ehrlich GD, Rappuoli R, Moxon ER, Masignani V: Structure and dynamics of the pan-genome of Streptococcus pneumoniae and closely related species. Genome Biol. 2010, 11: R107-PubMed CentralPubMedView ArticleGoogle Scholar
- Marttinen P, Hanage WP, Croucher NJ, Connor TR, Harris SR, Bentley SD, Corander J: Detection of recombination events in bacterial genomes from large population samples. Nucleic Acids Res. 2012, 40: e6-e6.PubMed CentralPubMedView ArticleGoogle Scholar
- Path-O-Gen. [http://tree.bio.ed.ac.uk/software/pathogen/]
- Drummond AJ, Suchard MA, Xie D, Rambaut A: Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012, 29: 1969-1973.PubMed CentralPubMedView ArticleGoogle Scholar
- Drummond AJ, Ho SY, Phillips MJ, Rambaut A: Relaxed phylogenetics and dating with confidence. PLoS Biol. 2006, 4: e88-PubMed CentralPubMedView ArticleGoogle Scholar
- Suchard MA, Weiss RE, Sinsheimer JS: Bayesian selection of continuous-time Markov chain evolutionary models. Mol Biol Evol. 2001, 18: 1001-1013.PubMedView ArticleGoogle Scholar
- Lemey P, Rambaut A, Drummond AJ, Suchard MA: Bayesian phylogeography finds its roots. PLoS Comput Biol. 2009, 5: e1000520-PubMed CentralPubMedView ArticleGoogle Scholar
- Zerbino DR, Birney E: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008, 18: 821-829.PubMed CentralPubMedView ArticleGoogle Scholar
- Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W: Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011, 27: 578-579.PubMedView ArticleGoogle Scholar
- Swain MT, Tsai IJ, Assefa SA, Newbold C, Berriman M, Otto TD: A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs. Nat Protoc. 2012, 7: 1260-1284.PubMed CentralPubMedView ArticleGoogle Scholar
- Kent WJ: BLAT–the BLAST-like alignment tool. Genome Res. 2002, 12: 656-664.PubMed CentralPubMedView ArticleGoogle Scholar
- Carver T, Berriman M, Tivey A, Patel C, Bohme U, Barrell BG, Parkhill J, Rajandream MA: Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database. Bioinformatics. 2008, 24: 2672-2676.PubMed CentralPubMedView ArticleGoogle Scholar
- Li H, Durbin R: Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009, 25: 1754-1760.PubMed CentralPubMedView ArticleGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The sequence alignment/map format and SAMtools. Bioinformatics. 2009, 25: 2078-2079.PubMed CentralPubMedView ArticleGoogle Scholar
- Cock PJ, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, Friedberg I, Hamelryck T, Kauff F, Wilczynski B: Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009, 25: 1422-1423.PubMed CentralPubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.