Modification of the mycobacteriophage Ms6 attP core allows the integration of multiple vectors into different tRNAala T-loops in slow- and fast-growing mycobacteria

Background: Mycobacteriophage Ms6 integrates into the Mycobacterium smegmatis and M. bovis BCG chromosomes at the 3' end of tRNAala genes. Homologous recombination occurs between the phage attP core and the attB site located in the T-loop. Integration-proficient vectors derived from Ms6 are useful genetic tools, but their insertion sites in the BCG chromosome remain poorly defined. The primary objective of this study was to identify Ms6 target genes in M. smegmatis and BCG. We then aimed to modify the attP site in Ms6-derived vectors, to switch integration to other tRNAala loci. This provided the basis for the development of recombinant M. bovis BCG strains expressing several reporter genes inserted into different tRNAala genes.

Results: The three tRNAala genes are highly conserved in M. smegmatis and BCG. However, in the T-loop of tRNAalaU and tRNAalaV containing the attB site, a single base difference was observed between the two species. We observed that the tRNAalaU gene was the only site into which Ms6-derived integration-proficient vectors integrated in M. smegmatis, whereas in BCG, the tRNAalaV gene was used as the target. No integration occurred in the BCG tRNAalaU T-loop, despite a difference of only one base from the 26-base Ms6 attP core. We mutated the attP core to give a perfect match with the other tRNAala T-loops from M. smegmatis and BCG. Modification of the seven-base T-loop decreased integration efficiency, identifying this site as a possible site of strand exchange. Finally, two Ms6 vectors were constructed to integrate two reporter genes into the tRNAalaU and tRNAalaV T-loops of the same BCG chromosome.

Conclusion: Small changes in the 7 bp T-loop attP site of Ms6 made it possible to use another attB site, albeit with a lower integration efficiency. These molecular studies on BCG tRNAala genes made it possible to create valuable tools for the site-directed insertion of several genes in the same BCG strain. These tools will be useful for the development of novel multivalent vaccines and genetic studies.


Background
Temperate phages integrate into the bacterial chromosome through a site-specific recombination event catalyzed by a phage-encoded recombinase. This process involves a common core present in the phage attP and the bacterial attB genomic DNA sequences, which are identical [1]. Genetic tools based on phage systems have furthered research into the biology of Mycobacterium tuberculosis, a pathogen responsible for about two million deaths each year [2]. L5 [3] and Ms6 [4], both temperate mycobacteriophages, integrate into genes encoding tRNAs. L5 integrates into a tRNA gly gene in the genome of the fast-growing species M. smegmatis or the slow-growing species M. bovis bacillus Calmette-Guérin (BCG), which is used as a vaccine against tuberculosis [5,6]. Integration-proficient vectors containing the attP site and either a tyrosine-integrase [5,7] or a serine-integrase [8] system integrate into the attB site of the mycobacterial chromosome. Most of these integrative vectors do not retain the phage recombination directionality factors that mediate excision [9] and are generally stably maintained in the mycobacterial genome, even in the absence of antibiotic selection [10]. However, plasmid loss can occur through low-level integrase-mediated excision [11]. These vectors are of considerable use both for the development of recombinant BCG (rBCG) strains with long-term heterologous gene expression for vaccine development [10] and for overcoming the problems associated with the use of multicopy extrachromosomal plasmids in genetic studies. Ms6-derived vectors integrate efficiently into the genomes of both slow- and fast-growing mycobacteria. In M. smegmatis, the attB core site overlaps the 3' end of a tRNA ala gene containing the 7 bp T-loop [7]. The precise point of strand exchange between attP and attB is unknown.

In silico scanning of the genomes of M. smegmatis and M. tuberculosis showed that there were three tRNA ala genes containing 3' end attB sequences either identical or similar to the Ms6 attP core site. We show here that Ms6 integrates into different tRNA ala genes in M. smegmatis and BCG, according to the presence of identical attB and attP cores. We also used site-directed mutagenesis of the attP core to construct vectors integrating into tRNA ala genes other than that targeted by the wild-type attP. This led to the development of a method for integrating two vectors, carrying different heterologous genes, into two different loci of the same BCG chromosome.

Genetic organisation of tRNA ala genes from M. smegmatis and BCG
A BLASTn search of the M. tuberculosis genome [12] for sequences similar to the Ms6 attP core region [13] revealed three possible attB sites at the 3' ends of three tRNA ala genes, named after their anticodon: GGC for tRNA alaU , CGC for tRNA alaV , and TGC for tRNA alaT . In tRNA alaV , the attB core sequence was identical to the 26 bases of the attP core region [7]. In tRNA alaU and tRNA alaT , one and three mismatches, respectively, were observed between the 26 bp Ms6 attP and attB core sequences. Alignment of the attP sequence with the genome of M. bovis [14] revealed the presence of the same three attB sites (data not shown). None of the tRNA ala genes mapped to deletion regions described in BCG [15,16].
Alignment of the Ms6 attP core sequence with the genome of M. smegmatis [17] revealed three possible attB sites displaying sequence similarities. As for BCG, the three possible attB sites were located in tRNA alaU (anticodon GGC), tRNA alaV (anticodon CGC) and tRNA alaT (anticodon TGC). The three M. smegmatis and BCG tRNA ala structures were analysed with tRNAscan-SE [18]. A sequence identity block common to attP and attB was identified that encompassed the tRNA ala T-loop located at the 3' end, without extending to the variable region (see additional file 2).
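The search for attB-like sites described above amounts to finding genomic windows that differ from the attP core by at most a few bases. The following is a minimal Python sketch of that idea, not the BLASTn procedure used in the study. The function names and the toy sequences are hypothetical and chosen only for illustration; the real 26-base Ms6 attP core and the mycobacterial genomes are not reproduced here, and only the forward strand is scanned.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_core_like_sites(genome: str, core: str, max_mismatches: int):
    """Return (0-based start, window, mismatches) for every forward-strand
    window of the genome within max_mismatches of the core sequence."""
    genome = genome.upper()
    core = core.upper()
    hits = []
    for start in range(len(genome) - len(core) + 1):
        window = genome[start:start + len(core)]
        mismatches = hamming(window, core)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical toy data: NOT the real Ms6 attP core or a real genome.
toy_core = "GGTTCGAATC"
toy_genome = "AAAC" + "GGTTCGAATC" + "TTTT" + "GGTTCGAGTC" + "CCCA"

hits = find_core_like_sites(toy_genome, toy_core, max_mismatches=1)
for start, window, mismatches in hits:
    print(start, window, mismatches)
```

In this toy case the scan reports one identical site and one site with a single mismatch, mirroring the situation in which one tRNA ala gene matches the attP core exactly and another differs by one base.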
In both species, the tRNA ala-encoding genes mapped to distant regions of the chromosome. The three tRNA ala genes of M. smegmatis and BCG were highly similar, but not identical. Interestingly, the tRNA alaV T-loop from BCG was identical to the tRNA alaU T-loop from M. smegmatis. In both species, the T-loop from two other tRNA ala genes differed from the 26 bp core attP site from Ms6 by one to four bases (Fig. 1A).

The integration of Ms6-derived vectors requires 100% identity of the 26 base cores of attP and attB
We have previously described the construction of pAV-SIV, an Ms6-derived integration-proficient vector, for the production of rBCG candidate vaccines genetically stable in vivo and expressing genes from the simian immunodeficiency virus SIVmac251 [10]. Here, we analysed the insertion locus of pAV-SIV in the BCG chromosome. We carried out Southern blot analysis on genomic DNA using a probe (see additional file 1 for all primers, plasmids and strains used in this study) designed from the tRNA alaU gene from M. tuberculosis, as we had previously shown that the target insertion sequence corresponded to tRNA alaU [7]. However, we observed no band disruption when the genomic DNA hybridisation profiles of pAV-SIV integrants and wild-type BCG were compared, indicating that the integration-proficient plasmids had not inserted into the tRNA alaU gene (Fig. 1B, left panel). We then amplified another probe derived from the tRNA alaV sequence and observed, on a Southern blot of DNA from wild-type BCG, two specific bands: a 9 kb band showing a strong positive signal and a weakly hybridising 7 kb band (Fig. 1B, right panel). Following transformation with pAV-SIV, the 9 kb band disappeared, giving rise to two bands of 8 and 2 kb. The 7 kb band, due to cross-hybridisation of tRNA alaU attB with the tRNA alaV probe, remained intact. This confirmed the specific integration of pAV-SIV into the tRNA alaV gene displaying identity with the attP rather than into the attB site from tRNA alaU , which has a base mismatch. We then investigated whether Ms6-derived vectors also targeted the M. smegmatis attB site displaying 100% identity with the Ms6 attP core, the tRNA alaU gene. A Southern blot analysis with two probes containing either the tRNA alaU or the tRNA alaV gene sequence revealed that the 3 kb band hybridising with the tRNA alaU probe in genomic DNA from wild-type M. smegmatis was disrupted in DNA from pAV-SIV integrants, giving 2.5 kb and 1.3 kb bands (Fig. 1C, left panel). When DNA was probed with tRNA alaV , no difference in hybridisation profile was observed between the wild type and integrants, demonstrating an absence of integration into the tRNA alaV locus (Fig. 1C, right panel).
We investigated whether integration into an attB site carrying mismatches with the attP core site was possible, as reported for L5 [5], by designing three primer pairs to amplify tRNA alaU , tRNA alaV or tRNA alaT regions both from M. smegmatis and BCG genome sequences. The PCR amplification of genomic DNA from fifty pAV-SIV BCG integrants identified tRNA alaV as the only gene target for Ms6 integration. By contrast, in the 50 M. smegmatis integrants analysed, pAV-SIV had inserted into the tRNA alaU gene (data not shown).

Figure caption: Southern blot of genomic DNA from wild-type and pAV6950-transfected M. bovis BCG and M. smegmatis integrants.

Modification of the attP core makes integration into the T-loop of other tRNA ala genes possible
The tRNA alaU gene (anticodon GGC) of BCG and the tRNA alaV gene of M. smegmatis (anticodon CGC) differ from the attP core by one base in the T-loop, whereas the tRNA alaT genes differ from this core sequence by three and four bases, respectively (Fig. 1A). We then investigated whether modification of the attP site to match these T-loops would allow us to target Ms6-derived integrative vectors to other tRNA ala genes. Site-directed mutagenesis was performed on pAV6950, carrying the natural Ms6 attP core sequence, to construct four vectors carrying modified 26 bp attP cores identical to the various tRNA ala T-loops.
In M. smegmatis, the integration efficiency of pSV, carrying an attP core identical to the tRNA alaV T-loop, was only 12% that of the non-mutated pAV6950 targeting tRNA alaU (Table 1). PCR amplification of M. smegmatis pSV integrants confirmed that the vector had targeted tRNA alaV in all clones, as expected. When four base changes were introduced, to match the tRNA alaT T-loop sequence (plasmid pST), integration efficiency was only 0.05% that with the non-mutated pAV6950.
In BCG, the integration efficiency of pBU, in which the sequence TTCGAA was mutated to TTCGAG to match the tRNA alaU T-loop, was 34% that of the non-mutated pAV6950 targeting tRNA alaV (Table 1). In all integrants, pBU targeted the tRNA alaU T-loop, as expected. When three base changes were introduced, to match the tRNA alaT T-loop, integration efficiency was 18% that of the non-mutated attP. In all clones tested, integration into tRNA alaT had occurred.
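The relative efficiencies quoted in this section can be converted into percentage decreases, which is the form used later in the Discussion. The short Python sketch below does only that arithmetic. The percentages are the values stated in the text (Table 1 itself is not reproduced here), the name pBT for the three-base-change vector in BCG is taken from the Discussion, and the function and variable names are illustrative assumptions.

```python
# Integration efficiency of each mutated vector, expressed as a percentage
# of the non-mutated pAV6950 in the same species (values quoted in the text).
relative_efficiency_pct = {
    ("M. smegmatis", "pSV"): 12.0,
    ("M. smegmatis", "pST"): 0.05,
    ("BCG", "pBU"): 34.0,
    ("BCG", "pBT"): 18.0,
}


def percent_decrease(relative_pct: float) -> float:
    """Convert 'X% of the reference efficiency' into a percentage decrease."""
    if relative_pct < 0:
        raise ValueError("relative efficiency cannot be negative")
    return 100.0 - relative_pct


decreases = {
    key: percent_decrease(value)
    for key, value in relative_efficiency_pct.items()
}

for (species, vector), drop in decreases.items():
    print(f"{species} {vector}: {drop:.2f}% decrease")
```

On these figures, pBU and pBT correspond to decreases of 66% and 82% in BCG, which the Discussion rounds to about 70% and 80%.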

Ms6-derived vectors targeting two different tRNA ala T-loops in BCG can be used to construct bivalent recombinant BCG strains
As pBU made integration into tRNA alaU possible, we investigated whether both tRNA alaU and tRNA alaV could be targeted in the same BCG strain. BCG was electroporated with pNIP46, a pAV6950-derived vector (targeting the tRNA alaV T-loop in BCG) containing the SIVmac251 gagp26 gene and a gene conferring resistance to hygromycin. The recombinant BCG::pNIP46 strain expressing gagp26 was then transformed with pBU-lacZ, containing the Escherichia coli lacZ gene and a kanamycin resistance gene. Recombinant clones, selected on medium supplemented with kanamycin and X-gal, displayed β-galactosidase activity (Fig. 2A). The expression of gagp26 was also detected in these clones by western blotting (Fig. 2B). PCR was used to determine the site of integration of the various plasmids in the BCG integrants (Fig. 2C). pNIP46 had inserted into tRNA alaV and pBU-lacZ into tRNA alaU . In the double integrant, BCG::pNIP46::pBU-lacZ, both tRNA alaU and tRNA alaV were disrupted. A diagram of this double integration of an Ms6-derived vector carrying natural attP in BCG tRNA alaV and an Ms6-derived vector with modified attP in BCG tRNA alaU is provided in Figure 2D.

Discussion
Determination of the complete genome sequences of M. tuberculosis [12] and M. smegmatis [17] made it possible to analyse the possible integration targets of Ms6-derived vectors. In contrast to expectations [7], we observed that Ms6 targeted two different tRNA ala genes in the fast-growing species M. smegmatis and the slow-growing species M. bovis BCG. The critical factor defining the exclusive integration site was the presence in the attB region of a core sequence identical to that of Ms6 attP. Thus, whereas in M. smegmatis 100% of the transformants obtained with Ms6-derived integration-proficient vectors displayed insertion into tRNA alaU , in BCG, all transformants carried the vector in tRNA alaV . In the case of L5, another temperate mycobacteriophage widely used for genetic studies [3], integration into the attB site of the BCG chromosome may occur in a tRNA glyV gene carrying one mismatch with the attP minimal core [5]. However, analysis of the three tRNA gly genes of BCG showed that there was no tRNA gly carrying an attB sequence identical to L5 attP in BCG (data not shown). L5-derived vectors therefore target an attB sequence carrying one mismatch with attP. This mismatch does not seem to affect the efficiency of L5 integration in BCG [5]. However, this mismatch maps outside the tRNA glyV 7 bp anticodon loop in which strand exchange between attP and attB occurs [6].

Figure caption: Bivalent recombinant BCG strain containing two pAV6950-derived vectors inserted into the tRNA alaU and tRNA alaV T-loops.
The three tRNA ala gene sequences displayed a high degree of similarity between M. smegmatis and BCG. Only two base differences in tRNA alaU and one base difference in tRNA alaV were observed. Interestingly, whereas the first 25 bases of the tRNA alaU and tRNA alaV anticodon loops were identical, the nucleotides involved in the formation of anticodon loop flanking symmetry differed in the tRNA alaU and tRNA alaV genes. This may be a signature of ancestral recombination between the tRNA ala genes during evolution. The T-loops were remarkably conserved between tRNA alaU and tRNA alaV and between the species. However, one base difference occurred in this region containing the attB site. Point mutations occurring separately in the two species during evolution may account for these observations. Alternatively, the Ms6 phage or its ancestors may have introduced some of these mutations, as mycobacteriophages have actively participated in remodelling of the bacterial chromosome [19].
We next investigated whether the two "attB-like" sequences located in the two other tRNA ala T-loops were potential target sites for modified Ms6. Indeed, with an integrase trained on one gene, a small change in the attP sequence may allow the element to switch to another tRNA ala gene. We therefore mutated the 26-base attP core and assessed integration into the other mycobacterial loci.
In BCG, one base difference between natural attP (TTCGAA) and pBU attP-like (TTCGAG) sequences reduced integration efficiency by about 70%, despite identity with attB (tRNA alaU ). The mutation introduced in attP was located in the 7 bp identical to the T-loop of the tRNA ala gene but did not directly affect flanking symmetry (for a review, see [1]). This change, although minor in nature, caused a dramatic drop in integration efficiency. Two additional changes in the 26-base sequence (pBT) only slightly reduced integration efficiency further in BCG, from a decrease of about 70% with pBU (one base change) to a decrease of about 80% with pBT (three base changes). This identifies the 7 bp T-loop sequence TTCGAA as important for Ms6 integration. Indeed, when M. smegmatis was electroporated with pBU (TTCGAG), despite there being only one base difference between this mutated attP and the three attB sites available in the genome, very few integrants were obtained (99% decrease in efficiency). Phages have been classified according to integration site sublocation within tRNA or tmRNA genes [20]. Class I phage integrases target the tRNA anticodon loop, whereas class II integrases target the tRNA T-loop. The precise crossover segment has been examined in class I phages, in which it maps to the 7 bp anticodon-encoding loop. This is also the case for mycobacteriophage L5 [6], which belongs to class I [20]. The precise location for crossover in class II phages remains unknown. Our data suggest that strand exchange between the class II phage Ms6 attP and mycobacterial attB occurs precisely in the 7 bp T-loop from tRNA ala .
Integration-proficient vectors derived from the temperate mycobacteriophage Ms6 are particularly useful for constructing recombinant BCG strains that are genetically stable in vivo [10]. BCG is also of considerable interest for use in the development of vaccines for simultaneous immunisation against several pathogens. The next step was to use Ms6 integration-proficient vectors targeting different tRNA ala T-loops in BCG to construct multivalent vaccine strains. We therefore constructed one vector carrying the natural attP core (TTCGAA/tRNA alaV ), a SIVmac251 gene encoding Gagp26 and another carrying an attP site targeting tRNA alaU (TTCGAG) and carrying the lacZ gene from E. coli. In the rBCG::pNIP46::pBU-lacZ strain transformed with the two vectors, we observed expression of both the gagp26 and lacZ genes. Interestingly, the level of expression of gagp26 in the bivalent strain was similar to that observed in the monovalent rBCG::pNIP46 strain. This method may therefore be useful for the future development of new rBCG vaccines carrying multiple heterologous genes that will help in immunisation programmes.

Bacterial strains and plasmids
The E. coli, M. smegmatis and M. bovis BCG strains and the plasmids and primers used in this study are described in additional file 1. Mycobacterial strains were electroporated as previously described [21]. The plasmid pAV-SIV [10] was derived from pAV6950 [7], containing the attP-int region from mycobacteriophage Ms6 and an origin of replication from E. coli. It contains SIVmac251 genes inserted into the single EcoRI site of pAV6950 located upstream from the attP core and int gene. Similarly, pBU-lacZ contains the gene encoding β-galactosidase from E. coli inserted upstream from the modified attP core. Both pAV-SIV and pBU-lacZ express the aph3' gene from Tn5, conferring resistance to kanamycin (20 μg ml⁻¹). pNIP46 is a pAV6950 derivative containing the hygB gene from Streptomyces hygroscopicus conferring resistance to hygromycin (50 μg ml⁻¹) and the gagp26 gene from SIVmac251. BCG transformants were selected on solidified Middlebrook 7H11 medium (Difco) supplemented with OADC (0.05% oleic acid, 5% bovine serum albumin fraction V, 2% dextrose, 0.004% beef catalase, 0.85% NaCl) and the appropriate antibiotic. β-galactosidase activity in BCG transformed with pBU-lacZ was visualised by adding X-gal (80 μg ml⁻¹, MP Biomedicals, Inc.) to the medium.