Multiplex ligation reaction based on probe melting curve analysis: a pragmatic approach for the identification of 30 common Salmonella serovars

Background: While Salmonella serotyping is of paramount importance for disease intervention in salmonellosis, a fast and easy-to-operate molecular serotyping solution remains elusive. We have developed a multiplex ligation reaction based on probe melting curve analysis (MLMA) for the identification of 30 common Salmonella serovars. Methods: Serovar-specific primers and probes were designed based on a comparison of gene targets (wzx and wzy, encoding somatic antigen biosynthesis; fliC and fljB, encoding flagellar antigens) from 5868 Salmonella genomes. The ssaR gene, a type III secretion system component, was included for the confirmation of Salmonella. Results: All gene targets were detected and gave expected Tm values during assay evaluation. Cross reactions were not observed between the 30 serovars (n = 211), or with an additional 120 serovars (n = 120) and other Enterobacteriaceae (n = 3). The limit of identification for all targets ranged from 1.2 ng/μL to 1.56 ng/μL of DNA. The intra- and inter-assay standard deviations and coefficients of variation were no more than 0.5 °C and less than 1%, respectively, indicating high reproducibility. In a multicenter study, consecutive outpatient stool samples (n = 3590) collected over a 10-month period at 11 sentinel hospitals in Shenzhen, China, were processed in parallel using the traditional Salmonella identification workflow and the MLMA assay workflow. For the Salmonella isolates (n = 496, 13.8%) recovered by both workflows, total agreement (kappa = 1.0) between the MLMA assay and conventional serotyping was demonstrated. Conclusions: With an assay time of 2.5 h, this simple assay has shown promising potential to provide rapid and high-throughput identification of Salmonella serovars for clinical and public health laboratories to facilitate timely surveillance of salmonellosis.


Background
Salmonella is a leading cause of foodborne illness and represents a major public health burden globally. Approximately 93.8 million cases of gastroenteritis and 155,000 deaths are estimated to be caused by Salmonella species annually worldwide [1]. In China, Salmonella has been the most prevalent bacterial foodborne pathogen among diarrheal patients [2]. According to our sentinel surveillance data for infectious diarrhea, the infection rate of Salmonella has surpassed that of Vibrio parahaemolyticus, making Salmonella the primary cause of foodborne illness in Shenzhen. Currently, 2659 Salmonella serovars have been described on the basis of a combination of 46 somatic (O) antigens and 114 flagellar (H) antigens [3,4]. Different Salmonella serovars have shown different genetic structures and host specificity, resulting in distinct epidemiology and clinical presentation, as well as differences in drug resistance and treatment [5]. For example, non-typhoidal Salmonella (NTS) serovars such as S. Typhimurium and S. Enteritidis have a broad host range and cause a less severe, self-limiting gastroenteritis, compared to the more severe typhoid and enteric fever caused by serovars Typhi and Paratyphi [6]. Certain serovars such as S. Choleraesuis and S. Dublin have been shown to cause severe systemic illness in humans [7,8]. Therefore, serovar identification plays a vital role in the clinical diagnosis and treatment of salmonellosis, in assessing trends in antibiotic resistance, and in epidemiological surveillance, outbreak detection and source attribution.
Currently, the phenotypic identification of Salmonella surface antigens by means of serum agglutination according to the White-Kauffmann-Le Minor scheme is regarded as the gold standard for Salmonella serotyping [9]. However, this conventional serological method has some important limitations, including subjectivity in result interpretation, laborious procedures requiring highly trained personnel, slow turnaround, expensive maintenance and quality control of antisera, and inconsistency between batches. Thus, rapid molecular approaches for Salmonella serovar identification that exploit serovar-specific gene loci have attracted continued interest over the years. Many studies have utilized multiplex PCR and quantitative PCR strategies [10-21], including those that have directly targeted O- and H-antigen-encoding genes to mirror the antigenic diversity discerned by conventional serotyping [19-21]. Other strategies have included the use of high-resolution melting analysis [22], bead-based suspension arrays [23] and pyrosequencing assays [24]. However, these methods also suffer from practical limitations, including low throughput; the ability to identify only a limited number of serovars simultaneously; ambiguous results arising from gel electrophoresis; the need for supplementary PCRs for full functionality; cumbersome procedures involving the preparation of microsphere beads; and the reliance on expensive proprietary equipment and software. More recently, whole genome sequencing (WGS) data have been utilized for Salmonella serovar prediction in North America [25,26], and have shown promise to replace conventional and molecular serotyping approaches in the future. However, WGS is currently still more costly and technically demanding, requires bioinformatics expertise and suffers from difficulties associated with the standardization and validation of methodologies [27,28]. In contrast, simpler molecular methods are often more practical in terms of providing the serotyping information required within hours of DNA extraction.
Multiplex ligation reaction based on probe melting curve analysis (MLMA), a robust molecular typing method previously applied to the simultaneous identification of ten bacterial pathogen species [29], has been shown to reliably detect specific nucleotide sequence polymorphisms of gene targets through the combined use of oligonucleotide target probes with unique melting temperature (Tm) tags and fluorescently labelled detection probes. Based on the principles of MLMA, we developed an assay to identify the 30 Salmonella serovars most commonly seen clinically over a 10-year period (2006-2016) in Shenzhen, China.

Reference and clinical bacterial isolates
A total of 334 reference bacterial isolates were used for the initial development and evaluation of the MLMA assay. These comprised reference isolates (n = 211) for each of the 30 clinically most common Salmonella serovars observed during a 10-year period (2006-2016), obtained from the Shenzhen Center for Disease Control and Prevention reference collection and the National Center for Medical Culture Collections of China (CMCC) (Table 1); and one reference isolate each for an additional 120 Salmonella serovars, Escherichia coli, Shigella and Proteus mirabilis for detecting cross reactions (see Additional file 1: Table S1). A total of 496 clinical isolates were obtained from 3590 consecutive outpatient stool samples in a multicenter study conducted between January and October 2017 at 11 sentinel hospitals in Shenzhen.

Bacterial culture and conventional serotyping
Stool specimens from sentinel hospitals in the multicenter study were enriched in 10 mL selenite cystine broth (

DNA extraction
For reference bacterial isolates, which were used during initial development and for evaluating analytical performance characteristics including the limits of identification and intra-assay and inter-assay reproducibility, DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions to allow accurate quantification. For clinical isolates during the multicenter study, DNA templates were prepared using the boiling method. A single colony was suspended in 50 μL of distilled water, boiled at 100 °C for 8 min and centrifuged at 12,000×g for 10 min, and the supernatant was used as the DNA template. The concentrations of all DNA templates were measured spectrophotometrically (Nanodrop ND-1000, Thermo Fisher Scientific).

Primer and probe design
The ssaR gene, a type III secretion system component, was incorporated into the assay as the target gene for the confirmation of Salmonella as previously described [30,31]. The wzx and wzy genes encoding biosynthesis of the somatic (O) antigen; and the fliC and fljB genes encoding the phase 1 and 2 flagellar (H) antigens, were chosen as the target gene loci on the basis of their substantial antigenic diversity and heterogeneity of alleles between serovars as represented in conventional serotyping. Extensive sequence analyses were performed to identify serovar-specific polymorphisms at the target gene loci using (

Development of the MLMA assay
Briefly, the MLMA assay consists of two key steps: (a) a hybridization-ligation process followed by (b) PCR amplification and melting curve analysis. In the hybridization-ligation process, a pair of oligonucleotide probes hybridizes with the DNA template on either side of the target sequence, and the pair is ligated upon complete hybridization. The ligated product is first amplified using a universal PCR primer pair via Linear-After-The-Exponential (LATE) PCR, then hybridized with fluorescently labelled detection probes. Salmonella serovars are determined from the unique melting temperatures (Tm) that the corresponding fluorescent detection probes produce in their respective fluorophore channels during melting curve analysis.
The hybridization-ligation reaction was performed on a T3 Thermocycler (Biometra, Germany). Each 10 μL reaction mix contained 1.5 μL ligation probe mix (1-4 fmol of each specific ligation oligonucleotide), 1U Taq DNA ligase and 1 μL DNA ligase buffer (New England Biolabs, Beijing, China), 1.5 μL diethylpyrocarbonate (DEPC) water and 5 μL genomic DNA. Reaction conditions comprised an initial DNA denaturation step at 98 °C for 5 min with only the genomic DNA and ligation probe mix present, a hold at 75 °C to add the remainder of the reaction mixture (3.5 μL), incubation at 60 °C for 80 min, 95 °C for 5 min, and finally cooling to 4 °C.
The PCR amplification and melting curve analysis were performed on a Bio-Rad CFX 96 real-time PCR system (Bio-Rad Inc., Hercules, CA). Each 50 μL reaction mix contained 1 × PCR buffer, 3 mM MgCl2, 2 μL of deoxynucleoside triphosphates (2.5 mM) and 1U Taq polymerase (TaKaRa Biotech Co., Dalian, China), 0.015 μM limiting primer, 0.4 μM excess primer, 0.08 μM to 0.16 μM of fluorogenic probes and 5 μL of ligation product from the hybridization-ligation reaction. Amplification consisted of an initial denaturation at 95 °C for 3 min, followed by 40 cycles of 95 °C for 5 s, 56 °C for 15 s and 74 °C for 15 s. Melting curve analysis then began with denaturation at 95 °C for 2 min and hybridization at 40 °C for 2 min, followed by a stepwise temperature increase (1 °C per 5 s) from 40 °C to 85 °C. Carboxyfluorescein (FAM), carboxy-X-rhodamine (ROX) and indodicarbocyanine 5 (Cy5) fluorescence signals were collected and recorded.
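As a rough illustration of how a Tm can be read from a melting curve such as the one recorded above, the following minimal Python sketch takes the Tm as the temperature at which the negative derivative of fluorescence with respect to temperature (-dF/dT) is largest. This is a generic approach and is not taken from the instrument software used in this study; the function name `melt_peak_tm`, the assumption that fluorescence falls as the detection probe dissociates, and the synthetic curve centred on 70 °C are all illustrative.

```python
import math

def melt_peak_tm(temps, fluorescence):
    """Return the temperature at which -dF/dT is largest (a simple Tm estimate).

    Hypothetical helper: assumes fluorescence decreases as the probe melts off.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length lists with at least three points")
    best_tm, best_slope = None, float("-inf")
    # Central differences on the interior points approximate -dF/dT.
    for i in range(1, len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_tm, best_slope = temps[i], slope
    return best_tm

# Synthetic data only: 40 °C to 85 °C in 1 °C steps, as in the melt programme,
# with an invented sigmoidal signal loss centred on 70 °C.
temps = [float(t) for t in range(40, 86)]
signal = [1.0 / (1.0 + math.exp((t - 70.0) / 1.5)) for t in temps]
print(melt_peak_tm(temps, signal))
```

With 1 °C steps the estimate is only resolved to the nearest degree; a real analysis would typically smooth the curve and interpolate around the peak.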

Analytical performance of the MLMA assay
The limit of identification was determined using a series of tenfold dilutions (from 10 to 0.01 ng) of purified DNA. The reproducibility of the assay was evaluated by using two sets of tenfold serial dilutions (3 × 10^10 and 3 × 10^8 copies/mL) of genomic DNA measured in triplicate.
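The reproducibility statistics can be sketched in a few lines of Python. The example below computes the standard deviation (SD) and coefficient of variation (CV) of replicate Tm readings; the replicate values are invented for illustration, the helper name `sd_and_cv` is hypothetical, and the use of the sample SD (n - 1 denominator) is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev

def sd_and_cv(tm_values):
    """Return (SD in °C, CV in %) for a list of replicate Tm readings."""
    m = mean(tm_values)
    sd = stdev(tm_values)      # sample SD; assumed, not stated in the text
    cv = 100.0 * sd / m        # CV expressed as a percentage of the mean Tm
    return sd, cv

# Hypothetical triplicate Tm readings (°C) for one target in one run
replicates = [70.4, 70.5, 70.6]
sd, cv = sd_and_cv(replicates)
print(f"SD = {sd:.2f} °C, CV = {cv:.2f}%")
```

Intra-assay values would be computed from replicates within one run, and inter-assay values from the same target measured across separate runs.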

Multicenter study
All clinical specimens from sentinel hospitals were analyzed using the traditional Salmonella identification workflow and the MLMA assay workflow in parallel. Both workflows began with bacterial culture and isolation from the same stool specimen. Subsequently, the traditional identification workflow proceeded with biochemical testing and conventional serotyping, whereas the MLMA workflow proceeded with DNA extraction and the MLMA assay. All procedures were performed as described in the sections above. The agreement between the MLMA assay and the conventional serological method was determined by calculating the kappa value. A kappa value of 0 indicates agreement no better than that expected by chance, and a value of 1 indicates total agreement between the two methods under comparison.
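For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's kappa computed over paired serovar calls, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the example calls are illustrative and are not study data; the study reports kappa for each serovar, which would correspond to applying the same formula to a per-serovar "is / is not this serovar" labelling rather than to the multi-class labels used here.

```python
from collections import Counter

def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of isolates given the same label by both methods.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the two methods' marginal label frequencies.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both methods use a single label")
    return (observed - expected) / (1.0 - expected)

# Hypothetical paired calls for four isolates
conventional = ["Typhimurium", "Enteritidis", "Typhimurium", "London"]
mlma = ["Typhimurium", "Enteritidis", "Typhimurium", "London"]
print(cohens_kappa(conventional, mlma))
```

Identical call lists give a kappa of 1, while calls that agree only as often as chance predicts give a kappa of 0.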

Development of the MLMA assay for the detection of antigen genes
A two-tube system, each tube with three fluorescence channels (ROX, FAM and Cy5), was developed (Table 2). In the first tube, which detects 14 antigens at distinct Tm values (listed here from high to low within each channel), the ROX channel detects Salmonella O antigens Group E, Group B, Group A, Group C1, Group C2-C3, Group D1 and the SUC2 gene as an internal control (IC); the FAM channel detects the Salmonella targets (z4,z23), ssaR, b, a and (f,g,s/g,s,t); whereas the Cy5 channel detects the Salmonella H2 antigens (1,7), (1,2) and z6. In the second tube, which detects 12 antigens at distinct Tm values (again listed from high to low), the ROX channel detects Salmonella H antigens (e,n,x/e,n,z15), r, i, v and (g,m); the FAM channel detects Salmonella H1 antigens d, (e,h), (f,g) and c; whereas the Cy5 channel detects Salmonella H2 antigens D2, (1,5) and (1,6). Table 3 details the experimental setup for each tube, the fluorophore channel, designated melting temperatures and corresponding antigens in the assay.

Performance characteristics of the MLMA assay
The MLMA assay was able to detect all gene targets and gave expected Tm values for each serovar (Fig. 1). Cross reactions were not detected between the 30 serovars (n = 211), or with the additional Salmonella serovars (n = 120), Escherichia coli (n = 1), Shigella (n = 1) and Proteus mirabilis (n = 1) (see Additional file 1: Table S1). Due to the inaccessibility of certain rare serovars with closely related antigenic formulae, potential cross reactions with these rare clinical strains could not be tested, as detailed in the Discussion. The limit of identification ranged from 1.2 to 1.56 ng/μL of genomic DNA for each serovar (see Additional file 1: Table S4). The standard deviations (SD) and the coefficients of variation (CV) for intra-assay and inter-assay reproducibility were calculated (see Additional file 1: Table S5). The largest SD value for mean Tm was no more than 0.5 °C, whereas the CVs were less than 1% for both intra-assay and inter-assay reproducibility analyses.
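To illustrate how antigen-level results combine into a serovar call, the sketch below maps a detected O group, phase 1 (H1) antigen and phase 2 (H2) antigen onto a serovar name, after first requiring the ssaR Salmonella marker. The lookup holds only a few well-known antigenic formulae from the White-Kauffmann-Le Minor scheme and is not the assay's Table 3; the function name, data structure and return strings are hypothetical.

```python
# (O group, H1 antigen, H2 antigen) -> serovar; None means no phase 2 antigen.
# Illustrative subset only, not the 30-serovar panel of the assay.
FORMULAE = {
    ("B", "i", "1,2"): "Typhimurium",
    ("D1", "g,m", None): "Enteritidis",
    ("D1", "d", None): "Typhi",
    ("C1", "r", "1,5"): "Infantis",
}

def call_serovar(ssar_detected, o_group, h1, h2=None):
    """Combine detected targets into a serovar call (hypothetical helper)."""
    if not ssar_detected:
        # Without the ssaR signal the isolate is not confirmed as Salmonella.
        return "not Salmonella"
    return FORMULAE.get((o_group, h1, h2), "Salmonella, serovar not in lookup")

print(call_serovar(True, "B", "i", "1,2"))
print(call_serovar(True, "D1", "g,m"))
print(call_serovar(True, "E", "l,v", "1,6"))
```

Keying on the full antigen combination means that an unexpected combination falls through to an explicit "not in lookup" result instead of being forced onto the nearest serovar.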

Discussion
Salmonella infection remains a significant public health challenge worldwide. Given the paramount importance of serotyping as a tool for epidemiological investigations and surveillance of salmonellosis, there is a continued need for a rapid and accurate molecular serotyping method that is also simple to operate and economical for high-throughput application. In this study, we have developed a multiplex ligation reaction based on probe melting curve analysis (MLMA) [29] for the identification of clinically common Salmonella serovars. The newly developed assay was able to detect all gene targets and gave expected Tm values with robust performance characteristics. The assay demonstrated high reproducibility: the largest SD value for mean Tm was no more than 0.5 °C and the CVs were less than 1% in both intra-assay and inter-assay reproducibility analyses. The detection limit for the 30 Salmonella serovars ranged from 1.2 to 1.56 ng/μL. Overall, these performance characteristics are similar to those observed in an assay developed for the rapid identification of foodborne pathogens [29]. In the multicenter study we conducted, the MLMA assay showed total concordance with the conventional serological method. Serovars for all of the isolates (77.2%) covered by the assay were correctly identified and no false positive identification of non-target serovars was observed, as indicated by kappa values of 1.0 for each of the Salmonella serovars identified. This can be attributed to the high fidelity with which the DNA ligase recognizes gaps between nucleic acid sequences, and to the requirement that the oligonucleotide probes be completely hybridized with the specific flanking nucleotides of the target sequence for ligation to occur. The MLMA assay workflow presented here was 12-18 h faster than the traditional workflow, which required overnight incubation for biochemical identification tests prior to conventional serotyping. From DNA extraction onwards, the assay consists of only two thermocycling steps, with minimal hands-on time and a total assay time of approximately 2.5 h to obtain both Salmonella confirmation and serotyping results at once. Therefore, the current assay presents a prospective method for the simple and rapid detection of Salmonella and serovar identification.
In developing this assay, we included for rapid identification the 30 Salmonella serovars most commonly isolated (87.9%) from human infections over a 10-year period (2006-2016) in Shenzhen, China. Incidentally, these serovars also covered 72.8% of the most common serovars isolated from diarrheal patients domestically in China [32]; internationally, these serovars were well represented among the 20 most frequently identified clinical isolates for countries in Asia (n = 14), Europe (n = 11), North America (n = 11), Latin America (n = 12) and Oceania (n = 10) [33].
Table 3 Determination of Salmonella serovars using O and H antigens identified from the two-tube MLMA system
The table below illustrates the experimental setup for the detection of each specific antigen in the assay, including the tube number, the fluorophore channel, the designated melting temperature (Tm, °C) and the antigen being detected. As an example, the detection of O antigen in S. Typhimurium is represented by "1-ROX-70.5-O:4 (B)", where "1" represents the first tube, "ROX" indicates the fluorophore channel, 70.

In recent years, various molecular approaches have been investigated and developed for the identification of Salmonella serovars to circumvent the limitations of conventional serotyping. Among these, multiplex PCRs and quantitative PCRs have proved to be popular methods [10-21]. Other methods have adapted the Bio-Plex system and the Luminex platform: a DNA sensor-based suspension array for the identification of eight common Salmonella serovars was developed with a total assay time of 3.5 h [23]. A PCR-pyrosequencing assay has also been exploited as a rapid serotyping method for the seven most common serovars in Canada [24]. However, the application of these methods has a number of limitations, including the small number of serovars that can be identified simultaneously; the need for gel electrophoresis; the reliance on expensive proprietary molecular biology equipment and analyzers; time-consuming protocols such as the coupling of oligonucleotide probes to microsphere beads; and the demand for sequence analysis knowledge and software. A high-resolution melting (HRM) analysis has also been described that uses arbitrary melting curve profiles found in several surrogate genomic markers to index and differentiate Salmonella serovars [22]. However, this methodology produced identical melting curve profiles for different serovars and requires a separate PCR assay for full functionality. In contrast, the MLMA methodology is fundamentally different in its combined use of serovar-specific oligonucleotide target probes with designated Tm tags and fluorescently labelled detection probes, resulting in a highly robust and reliable assay.
In the current assay, we selected the genes encoding the somatic (O) and flagellar (H) antigens as our target genes for the identification of Salmonella serovars because they are directly correlated with the White-Kauffmann-Le Minor scheme. This allows direct comparison with conventional serotyping results and has an advantage over indirect approaches that rely on surrogate molecular markers assumed to be serovar-specific for the prediction or correlation of serovars. The selection of the Salmonella surface antigen genes is crucial to ensure sufficient genetic diversity to avoid cross reaction with other Salmonella serovars or other bacterial species. The rfb gene regulon contains a cluster of genes required for O antigen biosynthesis and assembly, and is thereby responsible for the substantial antigenic diversity of the Salmonella O antigens that is exploited for serogroup identification in conventional serotyping [34]. In particular, the O antigen transporter (wzx) and O antigen polymerase (wzy) genes [35] were chosen as ideal targets for the current assay as they share little sequence homology between different O antigen serogroups, even at the amino acid level [36]. Similarly, it has been shown that the fliC and fljB genes encoding the phase 1 and phase 2 flagellar (H) antigens in Salmonella are sufficiently heterologous among different alleles but remain homologous within the same alleles [37], making them a practical choice of target in our assay. Based on the current number of O and H antigens detected in the assay for the common serovars, combinations of these antigens could potentially identify additional serovars. For example, in additional tests the assay successfully identified serovars such as Salmonella Hidalgo, which belongs to the O:8 (C2-C3) O antigen group. This shows that the current system has the potential to detect additional Salmonella serovar antigens and is amenable to modification to increase the number of serovars; we plan to expand the current assay in the future to include up to 100 serovars identified worldwide. There are several limitations to our study. In the current assay configuration, certain rare serovars with closely related antigenic formulae containing the Salmonella antigens/complexes E1/E4; f,g,s/g,[s],t and e,n,x/e,n,z15 may not be sufficiently discerned, and hence cross-reaction with the 30 common serovars is possible. Although these serovars are rare, further development of the assay would still be required to address these limitations, and the acquisition of rare isolates at our laboratory would be essential.
There are a number of aspects in which the current assay could be further improved. The detection throughput could be increased by adding more melting point tags and using additional fluorophore channels. A total of 26 genes are detected using three fluorophore channels in the current assay configuration. Theoretically, however, each fluorophore channel could carry six and up to eight Tm tags, which we have achieved experimentally using the ROX channel (data not shown). Accordingly, our assay could potentially be enhanced to detect up to 32 genes by using four fluorophore channels, which may in turn allow additional Salmonella serovars to be identified. However, as the number of genes detected increases, we anticipate greater difficulty in establishing a stable assay, given the influence of the additional probes introduced. Another aspect worth exploring would be the application of the assay to clinical and food samples by direct amplification of target genes, incorporating inhibitor-resistant polymerases into the assay to avoid the tedious process of isolation and DNA extraction from culture.
In recent years, rapid advancements in whole genome sequencing (WGS) technologies have facilitated an expansion of important studies of genomic epidemiology [38], population structure [39] and microbial evolution [40] for Salmonella, and WGS has emerged as an alternative to replace current molecular subtyping methods [25,26]. However, gaps still exist for the routine use of WGS in terms of cost, consistent standards and bioinformatics resources, and it remains impractical and challenging for most public health laboratories, especially in low- and middle-income countries with less developed public health infrastructure [27,28]. Indeed, we have already embarked on developing WGS-based methods with preliminary success; the very impetus behind developing the current assay, however, was to bridge the gap until WGS is sufficiently ready to gain widespread acceptance and adoption in public health laboratories in China.

Conclusions
In this study, we have developed and demonstrated the promising utility of an MLMA assay for the identification of the clinically most common Salmonella serovars. Further development of the assay, by increasing the number of serovars identified and addressing the current limitations, will be our research priority. We have shown that, using relatively basic molecular biology procedures and equipment, the MLMA methodology has great potential to serve as a pragmatic and inexpensive solution for the timely and effective public health intervention of salmonellosis.