DNA fingerprinting of Shiga-toxin producing Escherichia coli O157 based on Multiple-Locus Variable-Number Tandem-Repeats Analysis (MLVA)
Annals of Clinical Microbiology and Antimicrobialsvolume 2, Article number: 12 (2003)
The ability to react early to possible outbreaks of Escherichia coli O157:H7 and to trace possible sources relies on the availability of highly discriminatory and reliable techniques. The development of methods that are fast and has the potential for complete automation is needed for this important pathogen.
In all 73 isolates of shiga-toxin producing E. coli O157 (STEC) were used in this study. The two available fully sequenced STEC genomes were scanned for tandem repeated stretches of DNA, which were evaluated as polymorphic markers for isolate identification.
The 73 E. coli isolates displayed 47 distinct patterns and the MLVA assay was capable of high discrimination between the E. coli O157 strains. The assay was fast and all the steps can be automated.
The findings demonstrate a novel high discriminatory molecular typing method for the important pathogen E. coli O157 that is fast, robust and offers many advantages compared to current methods.
Escherichia coli O157:H7 is a common cause of a variety of illnesses, including bloody diarrhea and the hemolytic uremic syndrome (HUS). The O157:H7 serotype was first described in the literature in 1983 following two outbreaks of hemorrhagic colitis in a fast-food restaurant chain in Oregon and Michigan in 1982 [1, 2]. Illness was characterized by watery diarrhea, which progressed to bloody diarrhea and abdominal cramps [1, 2]. E. coli O157:H7 belongs to a group of diarrheagenic E. coli named Shiga toxin-producing E. coli (STEC) . STEC produce several distinct Shiga toxins (Stx)  and it is believed that HUS results from the systemic action of Stx on vascular endothelial cells . The most frequent mode of transmission for E. coli O157:H7 infection is through consumption of contaminated food and water, and several outbreaks have been caused by ground beef. Approximately 1% of healthy cattle may have the organism in their intestines.
The ability to react early to possible outbreaks of E. coli O157:H7 and to trace possible sources relies on the availability of highly discriminatory and reliable techniques. Of particular interest are the DNA based molecular typing methods. Previous work in our laboratory has shown that the pulsed-field gel electrophoresis (PFGE) method of Xba I digested whole genomes gave good discrimination and was highly reproducible . We previously compared PFGE with the newer method of Amplified-Fragment Length Polymorphism (AFLP) and found that while AFLP was faster, PFGE offered the best resolution for typing E. coli O157:H7 . Typing E. coli O157:H7 by PFGE is currently the preferred typing method in our laboratory. However, the rapid sequencing of complete bacterial genomes including the sequence of two E. coli O157:H7 strains [7, 8]. gave us the opportunity to search the genomes of these two strains in order to look for Variable Number of Tandem Repeats (VNTRs) that could be used as a source of genetic polymorphisms and be used in a typing assay. Polymorphic VNTR regions which can be used for typing purpouses have been identified and tested in Yersinia pestis [9–11], Francisella tularensis , Mycobacterium tuberculosis [13–16], Xylella fastidiosa , Haemophilus influenzae [18–20], Bacillus anthracis [21–24], and we have recently developed a VNTR based typing assay for the important enteropathogen Salmonella Typhimurium .
In all 73 isolates of shiga-toxin producing Escherichia coli O157 were used in this study. The isolates included both domestic (32 isolates) strains and strains from Sweden (11 isolates), Finland (6 isolates), UK (5 isolates) and from the USA (19 isolates). The majority, 69, of the isolates have previously been typed both by Xba I PFGE and AFLP , and the collection included strains from well characterized outbreaks as well as strains designated as sporadic isolates. Table 1 [see Additional file 1] summarizes the strain collection.
Suspensions of bacterial cells were boiled for 15 min and used directly in the PCR reactions after a brief centrifugation at 3500 rpm for 3 min., or genomic DNA was extracted using a commercial kit ("Easy-DNA"; Invitrogen BV, Leek, The Netherlands).
The genomic sequences of both the E. coli O157 strains that have been fully sequenced (accession numbers AE005174 and NC_002695) were searched for tandem repeated DNA motifs with the Tandem Repeat Finder , GeneQuest (DNASTAR, Madison WI) and Kodon (Applied-Maths, Kortrijk, Belgium) software. Several regions containing tandem repeated motifs were found, of which seven were chosen to constitute a set for high level discrimination of E. coli O157 isolates. Six of the VNTRs were localized on the main chromosome, and one was localized on the E. coli O157 large virulence plasmid (accession no. AF074613). Table 2 [see Additional file 2] gives an overview over the repeat characteristics. The VNTRs were situated in five genes and two intergenic regions and had repeated motifs ranging from 6 to 30 basepairs. Some of the VNTRs displayed size variations between the EDL933 and Sakai strains when their GenBank sequences were examined, and the predicted amplicons are shown in table 2 [see Additional file 2]. The actual repeats with leader and tailing sequence of 500 bps were used to design specific PCR primers (Table 3) [see Additional file 3]. Care was taken to match both annealing temperature and sizes of the produced PCR amplicons to make a convenient set for PCR and capillary electrophoresis separation. The PrimerSelect (DNASTAR) software automatically calculated the primer annealing temperatures, and analyzed how the different primers could interact in multiplex PCR. The forward primers for all the seven VNTR regions were labelled with 6 FAM (6'-carboxyfluorescein). The different primer-sets were tested against each other in several combinations. A combination of primers with Vhec1 alone annealed at 50°C, Vhec4 and Vhec5 annealed at 53°C, Vhec2 and Vhec3 annealed at 50°C and Vhec6 and Vhec7 annealed at 50°C. The temperature profile was: 94°C denaturation for 5 min followed by 30 cycles of 94°C for 30 s, annealing temperature from above for 60 s and 72°C for 50 s and finally a 10 min extension step at 72°C performed on a GeneAmp PCR system 9700 (Applied-Biosystems, Foster City, CA). Five microliters of each of the PCR products were pooled together into a single tube for each sample and thoroughly mixed. Five microliters of the resuspended solution mixed with 8 μl formamid and 1 μl of size standard was then used for capillary electrophoresis, after a 2 min denaturation at 94°C, on an ABI-310 Genetic Analyzer (Applied-Biosystems) with POP4-polymer and Genescan TAMRA-500 or TAMRA-2500 as internal standard in each sample (Applied-Biosystems). The capillary electrophoresis was run for 50 min at 60°C. The samples were injected into the capillary by applying a 15 kV voltage for 5 s. The run voltage was 15 kV. The size of each VNTR locus was determined by the GeneScan software (Applied-Biosystems).
The electropherograms were imported into the BioNumerics software package (Applied-Maths) and a phylogenetic tree was constructed using Dice coefficients and cluster analysis with the unweighted pair group method with arithmetic averages (UPGMA) from the ABI trace files. The constructed phylogeny was based on the resulting compound band-pattern from all amplified loci for each isolate, thus, the separate VNTR loci were not analysed individually.
The same primers used to amplify the VNTR regions were used for sequencing at 3.2 pmol concentration. The PCR products were cleaned with the Qiaquick PCR purification kit (QIAGEN, Hilden, Germany) and both forward and reverse strands were sequenced two times each using the ABI Cycle-sequencing kit V3.0 (Applied-Biosystems), the sequencing products were cleaned with the DyeEx kit (QIAGEN) to remove unincorporated dyes before applied on an ABI-3100 automated sequencer.
The Tandem Repeat Finder software  found 36 repeats when it was set to search for repeated motifs of 6 or more nucleotides repeated in tandem with 3 or more copies, which were more than 60% internally homogenous between repeat units. Both the GeneQuest (DNASTAR) and Kodon software (Applied-Maths) found these repeats and added the possibility of viewing graphics of the location of each VNTR locus with surrounding area, and information about melting temperatures and GC content in these genomic areas. The Tandem Repeat Finder offered by far the fastest search routine, however, both GeneQuest and Kodon also searched for other classes of repeated DNA e.g. inverted repeats and dyad repeats.
The MLVA assay displayed 47 distinct profiles among the 73 strains analysed. The resulting electropherogram of the pooled amplicons showed a clear and easily interpretable banding patterns (Figure 1). The identification of each peak in the pooled run with the correct VNTR region was done by running each primer-pair separately in every isolate. This was done primarily for the development of the method, and the resulting phylogeny was based on the resulting banding-pattern without identifying each locus separately. Selected isolates were run repeated times by different individuals, and always displayed identical banding patterns. Some minor variations in signal strength between runs were noted which possibly could be explained by variations in PCR efficiency or in the amount of DNA available as template since the boiling of isolates is likely to produce different amount of DNA. Not all the VNTR loci gave amplification products in all isolates or some of the peaks were masked by two VNTRs producing the same sized PCR products. The majority of samples displayed all amplification products, although some of these were masked by two different VNTRs having the same size in the electropherograms. Eight isolates (11%) displayed a loss of one band, and only one isolate had less than 6 bands (Table 4) [see Additional file 4]. All primer-pairs showed loss of amplification products except Vhec2, Vhec6 and Vhec7, which were present in all our isolates. All the VNTR loci displayed heterogeneity in our material. In table 5 [see Additional file 5] the data for allele size ranges in basepairs, number of alleles at each locus and the polymorphism index of each marker is shown. In table 4 [see Additional file 4] the allele distribution among all the distinct MLVA patterns are shown. All unique alleles differing by gain or loss of repeat units were given allele numbers, thus PCR amplicons of a VNTR locus with the same size in different isolates were given the same number in table 4 [see Additional file 4]. Different allele sizes at the different VNTR loci were results of loss or gains of repeat unites, which were verified by direct sequencing of the PCR amplicons using the VNTR primers. In figure 2 is an alignment of the Vhec4 primer products from isolates G5300 and IHE5344 showing a 10 repeat unit difference of the AAATAG motif.
The MLVA assay was easy to perform, consisting of four separate PCR reactions with subsequent pooling of the products, and capillary electrophoresis. The assay was tested both with extracted DNA and with just boiling the bacterial cells for 15 min. Both methods gave the same PCR products, however with boiled cells as the starting material primer Vhec1, and in some cases, Vhec3 and Vhec4 gave reduced amplification products in the multiplexed PCR reactions compared to what was achieved with extracted DNA, but with the sensitivity of the capillary system with fluorescently labelled DNA we had no difficulties in finding all products. Selected isolates were run repeated times by different individuals, and always displayed identical banding patterns. The assay was thus proven to be robust and highly reproducible. The strain collection used was very well described both in literature, and by previous typing . This constituted a good basis for assessing the resolution of this novel method. We have previously typed 69 isolates in this collection both by Xba I PFGE and EcoR I(0)+Mse I(+C) AFLP , and the MLVA assay displayed 45 distinct profiles in this subset. This was comparable to PFGE, which displayed 44 distinct, profiles and considerably higher than the 23 profiles revealed by AFLP (Table 1) [see Additional file 1]. Some of the PFGE profiles were, however, quite similar (one band difference). The resulting phylogeny generated from the MLVA assay showed good accordance to the epidemiological data available, and with PFGE typing (Table 1) [see Additional file 1]. The majority of strains, which were clustered together with PFGE, also were co-clustered with the MLVA assay with some exceptions. Outbreaks were differentiated and clustered into distinct groups. Some exceptions were, however, noted. One isolate (G4921) from outbreak #7 (USA) displayed a different MLVA profile (V28) than the rest of the outbreak #7 strains (V30). This isolate was also different by PFGE typing indicating that this strain might not be a part of outbreak #7, or that this outbreak had several strains involved. Strain G4919 (V30) of outbreak #7 displayed a different PFGE profile (P17) from the main PFGE pattern of this outbreak (P16), but by MLVA this strain was identical to the main outbreak pattern V30. The P16 and P17 PFGE profiles were, however, quite similar and had a one-band difference  making the MLVA data credible. Outbreak #6 from the USA was also divided into different profiles by MLVAtyping. Strains H0619 and H0620 were identical (V31), while H0617 (V32), H0618 (V34) and H0616 (V35) displayed different profiles. Here only the H0618 isolate was also different by PFGE. The H0617 and H0616 isolates were thus identical with H0619 and H0620 by PFGE while they had different patterns by MLVA (Table 1) [see Additional file 1]. The outbreak #6 isolates did however cluster together, but strain 174/99 (V33), a Norwegian isolate, was included in this cluster by MLVA. The cattle isolates H90/95, H82/95, H83/95 and H89/95 were isolated from the same herd, and believed to be identical. This was confirmed by MLVA typing, which gave identical patterns for these isolates (V36). The PFGE typing, however, displayed minor differences between these isolates, giving H90/95 and H83/95 identical patterns (P37) as well as different patterns for H82/95 (P38) and H89/95 (P39). The PFGE profiles for patterns P37, P38 and P39 were, however, highly similar and related . No size variations in the repeated motifs in any of the seven VNTR loci were detected between isolates H82/95 and H90/95. For outbreak #14 (V42 profile), the MLVA assay included two isolates, one human isolate previously considered unrelated 3108/97 (V42) and one isolate from cattle 127/97 (V42). The suspected outbreak isolate 127/98 (V43) (outbreak #14) was given a different profile from the other outbreak strains by MLVA (Fig. 1). The inclusion of strain 3108/97 into outbreak #14 was confirmed by PFGE, while the exclusion of 127/98 was not indicated neither by PFGE nor AFLP. Figure 1 shows the MLVA patterns for three isolates that were identical by PFGE and AFLP. Two of these isolates (131/97 and 126/97) were also identical by MLVA (V1), while strain 127/97 displayed a clearly unrelated profile (V42). Some of the MLVA types in table 1 [see Additional file 1] displayed less than seven VNTR bands. The reason for this was that some primer-pairs did not give an amplification product in all isolates. It can also be viewed in figure 1 where the 127/97 isolate in the lower panel has no amplification product from the Vhec5 primer-pairs.
The main finding was a remarkable high co-clustering of the MLVA and PFGE profiles. This high co-clustering was a bit unexpected given that these methods were based on quite different mechanisms to generate the profile date. The presence of Xba I recognition sites for PFGE and the variability of tandem repeated motifs in MLVA. The good correlation between PFGE and MLVA is important since the majority of E. coli O157 has been typed by PFGE and the clusters generated with MLVA will be familiar. The speed of the MLVA assay combined with its resolution makes it an attractive alternative for typing E. coli O157 isolates which are considered to be the most important pathogens so far among the shiga-toxin producing E. coli serotypes. We thus present here a novel VNTR based typing method for STEC O157 strains, which is both considerably faster, less labour intensive and yet offers a slightly higher discriminatory power compared to PFGE. An additional benefit compared with PFGE is the potential for fully automating the analysis. The main tasks were liquid handling, which can be performed by laboratory robots, and by using multi-dye electrophoresis the analysis of the data can also be automated.
Riley LW, Remis RS, Helgerson SD, McGee HB, Wells JG, Davis BR, Hebert RJ, Olcott ES, Johnson LM, Hargrett NT, Blake PA, Cohen ML: Hemorrhagic colitis associated with a rare Escherichia coli serotype. N Engl J Med. 1983, 308: 681-685.
Wells JG, Davis BR, Wachsmuth IK, Riley LW, Remis RS, Sokolow R, Morris GK: Laboratory investigation of hemorrhagic colitis outbreaks associated with a rare Escherichia coli serotype. J Clin Microbiol. 1983, 18: 512-520.
Paton JC, Paton AW: Pathogenesis and diagnosis of Shiga toxin-producing Escherichia coli infections. Clin Microbiol Rev. 1998, 11: 450-479.
Melton-Celsa AR, Rogers JE, Schmitt CK, Darnell SC, O'Brien AD: Virulence of Shiga toxin-producing Escherichia coli (STEC) in orally-infected mice correlates with the type of toxin produced by the infecting strain. Jpn J Med Sci Biol. 1998, 51 (Suppl): S108-S114.
Louise CB, Obrig TG: Shiga toxin-associated hemolytic uremic syndrome: combined cytotoxic effects of shiga toxin and lipopolysaccharide (endotoxin) on human vascular endothelial cells in vitro. Infect Immun. 1992, 60: 1536-1543.
Heir E, Lindstedt B-A, Vardund T, Wasteson Y, Kapperud G: Genomic fingerprinting of shigatoxin-producing Escherichia coli (STEC) strains: comparison of pulsed-field gel electrophoresis (PFGE) and fluorescent amplified-fragment-length polymorphism (FAFLP). Epidemiol Infect. 2000, 125: 537-548. 10.1017/S0950268800004908
Hayashi T, Makino K, Ohnishi M, Kurokawa K, Ishii K, Yokoyama K, Han CG, Ohtsubo E, Nakayama K, Murata T, Tanaka M, Tobe T, Iida T, Takami H, Honda T, Sasakawa C, Ogasawara N, Yasunaga T, Kuhara S, Shiba T, Hattori M, Shinagawa H: Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res. 2001, 8: 11-22.
Perna NT, Plunkett G, Burland V, Mau B, Glasner JD, Rose DJ, Mayhew GF, Evans PS, Gregor J, Kirkpatrick HA, Posfai G, Hackett J, Klink S, Boutin A, Shao Y, Miller L, Grotbeck EJ, Davis NW, Lim A, Dimalanta ET, Potamousis KD, Apodaca J, Anantharaman TS, Lin J, Yen G, Schwartz DC, Welch RA, Blattner FR: Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature. 2001, 409: 529-533. 10.1038/35054089
Adair DM, Worsham PL, Hill KK, Klevytska AM, Jackson PJ, Friedlander AM, Keim P: Diversity in a variable-number tandem repeat from Yersinia pestis. J Clin Microbiol. 2000, 38: 1516-1519.
Klevytska AM, Price LB, Schupp JM, Worsham PL, Wong J, Keim P: Identification and characterization of variable-number tandem repeats in the Yersinia pestis genome. J Clin Microbiol. 2001, 39: 3179-3185. 10.1128/JCM.39.9.3179-3185.2001
Le Fleche P, Hauck Y, Onteniente L, Prieur A, Denoeud F, Ramisse V, Sylvestre P, Benson G, Ramisse F, Vergnaud G: A tandem repeats database for bacterial genomes: application to the genotyping of Yersinia pestis and Bacillus anthracis. BMC Microbiol. 2001, 1: 2- 10.1186/1471-2180-1-2
Johansson A, Göransson I, Larsson P, Sjöstedt A: Extensive allelic variation among Francisella tularensis strains in a short-sequence tandem repeat region. J Clin Microbiol. 2001, 39: 3140-3146. 10.1128/JCM.39.9.3140-3146.2001
Frothingham R, Meeker-O'Connell WA: Genetic diversity in the Mycobacterium tuberculosis complex based on variable numbers of tandem DNA repeats. Microbiology. 1998, 144 (Pt 5): 1189-1196.
Skuce RA, McCorry TP, McCarroll JF, Roring SM, Scott AN, Brittain D, Hughes SL, Hewinson RG, Neill SD: Discrimination of Mycobacterium tuberculosis complex bacteria using novel VNTR-PCR targets. Microbiology. 2002, 148: 519-528.
Supply P, Mazars E, Lesjean S, Vincent V, Gicquel B, Locht C: Variable human minisatellite-like regions in the Mycobacterium tuberculosis genome. Mol Microbiol. 2000, 36: 762-771. 10.1046/j.1365-2958.2000.01905.x
LeFleche P, Fabre M, Denoeud F, Koeck JL, Vergnaud G: High resolution, on-line identification of strains from the Mycobacterium tuberculosis complex based on tandem repeat typing. BMC Microbiol. 2002, 2: 37- 10.1186/1471-2180-2-37
Coletta-Filho HD, Takita MA, de Souza AA, Aguilar-Vildoso CI, Machado MA: Differentiation of strains of Xylella fastidiosa by a variable number of tandem repeat analysis. Appl Environ Microbiol. 2001, 67: 4091-4095. 10.1128/AEM.67.9.4091-4095.2001
van Belkum A, Melchers WJ, Ijsseldijk C, Nohlmans L, Verbrugh H, Meis JF: Outbreak of amoxicillin-resistant Haemophilus influenzae type b: variable number of tandem repeats as novel molecular markers. J Clin Microbiol. 1997, 35: 1517-1520.
van Belkum A, Scherer S, van Leeuwen W, Willemse D, van Alphen L, Verbrugh H: Variable number of tandem repeats in clinical strains of Haemophilus influenzae. Infect Immun. 1997, 65: 5017-5027.
Weiser JN, Maskell DJ, Butler PD, Lindberg AA, Moxon ER: Characterization of repetitive sequences controlling phase variation of Haemophilus influenzae lipopolysaccharide. J Bacteriol. 1990, 172: 3304-3309.
Andersen GL, Simchock JM, Wilson KH: Identification of a region of genetic variability among Bacillus anthracis strains and related species. J Bacteriol. 1996, 178: 377-384.
Keim P, Price LB, Klevytska AM, Smith KL, Schupp JM, Okinaka R, Jackson P, Hugh-Jones ME: Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J Bacteriol. 2000, 182: 2928-2936. 10.1128/JB.182.10.2928-2936.2000
Kim W, Hong YP, Yoo JH, Lee WB, Choi CS, Chung SI: Genetic relationships of Bacillus anthracis and closely related species based on variable-number tandem repeat analysis and BOX-PCR genomic fingerprinting. FEMS Microbiol Lett. 2002, 207: 21-27. 10.1016/S0378-1097(01)00544-4
Smith KL, DeVos V, Bryden H, Price LB, Hugh-Jones ME, Keim P: Bacillus anthracis diversity in Kruger National Park. J Clin Microbiol. 2000, 38: 3780-3784.
Lindstedt B-A, Heir E, Gjernes E, Kapperud G: DNA Fingerprinting of Salmonella enterica subsp. enterica Serovar Typhimurium with Emphasis on Phage Type DT104 Based on Variable Number of Tandem Repeat Loci. J Clin Microbiol. 2003, 41: 1469-1479. 10.1128/JCM.41.4.1469-1479.2003
Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27: 573-580. 10.1093/nar/27.2.573
B-AL had primary responsibility for study design, collection of data and writing the manuscript. EH performed the comparison with the PFGE and AFLP data and had intellectual contribution, EG and TV verified the results and had intellectual contributions. GK had intellectual contribution.