RNA interference is a potent method of gene silencing that has rapidly gained importance over the past years and is now widely used for experimental as well as therapeutic purposes. However, the method harbours several pitfalls, one of them being the artefactual dysregulation of non-target genes. This effect can be due either to i) induction of the interferon response in mammalian cells after transfection of RNAi molecules or to ii) unintended targeting of genes that have only a low level of sequence homology to the RNAi molecule. While the former unwanted effect is avoidable, the latter artefact cannot be predicted. This means that even siRNAs with no predicted physiological target sequences, which are often used as negative controls for RNAi experiments, will have specific off-target effects, and these are thought to be caused by sequence similarity of very short seed sequences [13, 20]. This has prompted calls for rigorous standards in siRNA experiments, especially in large-scale screens. Here, we show that siRNA molecules that are commercially distributed and widely used as negative controls actually target endogenous genes with important roles in several pathways, even though there are only very small regions of sequence homology between the siRNA and the mRNA molecules.
We used siRNA directed against GFP as negative control in a series of unrelated knockdown experiments that were analyzed by expression microarrays (manuscript in preparation). Because of the ongoing discussion concerning the specificity of RNAi molecules, we rigorously screened the data and paid special attention to genes that were commonly deregulated in all of these expression profiling experiments, which were unrelated except for the use of GFP siRNA as negative control. Indeed, we detected strong dysregulation of the genes CYLD and SOAT in all of these experiments (Figure 1a) and in various cell lines (Figure 2a). Only EVSAT cells did not show reduced expression of CYLD and SOAT after transfection of GFP siRNA. This is probably due to the lower transfection efficiency of EVSAT cells compared to HEK, HeLa or U2OS cells. Furthermore, we show that the strong dysregulation of CYLD and SOAT is independent of the synthetic origin of the siRNA molecules (Figure 2b). Therefore, the commonly used siRNA directed against GFP has sequence-dependent off-target effects in human cells, with the genes CYLD and SOAT showing the most pronounced deregulation.
Are these effects sequence-dependent or caused by sequence-independent mechanisms? In general, unwanted effects like activation of the interferon response are more likely to occur when high concentrations of siRNA are used. How low the concentration of transfected dsRNA molecules must be to prevent unspecific effects is controversial: while there are reports that siRNA concentrations of ≤ 20 nM usually do not lead to induction of the interferon response, others have detected unspecific effects using siRNA concentrations as low as 10 nM. In this study, we transfected increasing amounts of siRNA directed against GFP (Figure 3). In HEK as well as in HeLa cells, we could show that off-target knockdown of CYLD and SOAT correlates with the concentration of the transfected siRNA. Even a concentration of 5 nM siRNA resulted in down regulation of CYLD and SOAT as off-target genes. At the same time, the interferon response was activated only after transfection of more than 54 nM siRNA, pointing to a directed targeting of CYLD and SOAT rather than a down regulation that is concomitant to the interferon response. In order to shed light on this gene-specific mechanism, we performed a genome-wide screen to identify all mRNA transcripts deregulated by the GFP siRNA. Intriguingly, we found a similar number of mRNAs overrepresented as mRNAs underrepresented after transfection of GFP siRNA, which suggests that roughly half of the deregulation that we detected is due to secondary effects. An earlier study could not detect off-target effects for a siRNA directed against GFP. In the light of the recent progress of the field and several reports on off-target effects, this finding seems highly unlikely, since the majority of siRNA molecules, even those with a non-physiological target, will be able to pair with a set of mRNAs with partial homology. In fact, we reproducibly found a large number of genes specifically deregulated after transfection of GFP siRNA.
In order to shed light on the mechanism of action of the GFP siRNA, we looked for sequence homologies between the GFP siRNA and the deregulated mRNAs. In our set of mRNAs that were underrepresented after transfection of GFP siRNA, we found significantly more hits with the proper GFP siRNA sequence than with the shuffled siRNA sequences for sequence homologies of more than 7 bp in length. Of the 207 down modulated genes, 116 (56 %) contained perfect matches of more than 7 bp. 22 mRNAs showed homology to both the sense strand and the antisense strand of the GFP siRNA.
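The perfect-match search described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names and the toy sequences are hypothetical, and two assumptions are made for illustration, namely that "more than 7 bp" means a contiguous stretch of at least 8 nt, and that a hit for a given strand is a site in the mRNA that can base-pair perfectly with that strand.

```python
# Illustrative sketch only: toy sequences, hypothetical helper names.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def has_perfect_match(strand: str, mrna: str, min_len: int = 8) -> bool:
    """True if at least min_len contiguous bases of `strand` can pair
    perfectly (no mismatches, no gaps) with some site in `mrna`."""
    site = reverse_complement(strand)  # sequence the mRNA must contain
    return any(
        site[i:i + min_len] in mrna
        for i in range(len(site) - min_len + 1)
    )


def classify(mrna: str, sense: str, antisense: str, min_len: int = 8) -> str:
    """Label an mRNA by which siRNA strand(s) it can pair with."""
    s = has_perfect_match(sense, mrna, min_len)
    a = has_perfect_match(antisense, mrna, min_len)
    if s and a:
        return "both"
    if a:
        return "antisense"
    if s:
        return "sense"
    return "none"


# Toy data (assumption): NOT the GFP siRNA used in the study.
toy_sense = "GGAUCCAUGCUAGCUUAAGCU"
toy_antisense = reverse_complement(toy_sense)
# Toy mRNA carrying 9 nt that are identical to part of the sense strand,
# i.e. a site that can pair with the antisense strand.
toy_mrna = "AAAA" + toy_sense[3:12] + "CCCC"

print(classify(toy_mrna, toy_sense, toy_antisense))
```

Running the same classification over each of the down modulated transcripts, and again with shuffled versions of the siRNA, would give the kind of hit counts that are compared in the text.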
In contrast to the current model, in which binding of several siRNA molecules results in more consistent down modulation of the mRNA, we did not find these 22 mRNAs to be more strongly underrepresented than the mRNAs with homologies to only one of the siRNA strands. This could be because, owing to computational limitations, we scored only perfect homologies without mismatches. Interestingly, in our system both strands of the transfected siRNA duplex show a significant number of homologies to the underrepresented mRNAs as compared to the shuffled control sequences. While there is a bias towards usage of the antisense strand, as required, both strands of the siRNA are loaded into the RISC complex. Loading of the sense strand could be reduced by proper design of the GFP siRNA sequence, which would prevent loading of both sense and antisense molecules into the RISC complex. Down regulation of the remaining 91 (44 %) transcripts without perfect matches to the siRNA sequences is probably due to pairing with partial mismatches, whose in-silico prediction is almost impossible. Additionally, a proportion of these remaining 44 % of deregulated mRNAs without perfect match to the GFP siRNA is likely to be modulated indirectly as a secondary effect, just like the 190 mRNAs that are up regulated after transfection of the siRNAs. While the indirect target mRNAs may be of little interest for the molecular mechanism, these secondary effects will be just as confounding in experiments using GFP siRNA as the direct effects. According to our data, possibly as many as 14 % of sequence-specific off-target effects could be avoided by designing the siRNA so that one of the two strands of the transfected duplex is excluded from the RISC complex [8, 9]. Very recently, several vendors introduced a new generation of chemically modified siRNAs that are supposed to ensure that only the antisense strand of the siRNA is loaded into the RISC complex, thereby reducing off-target effects.
With increasing length of homology, relatively more hits were scored for the proper, non-shuffled GFP siRNA in the set of actually deregulated genes as compared to all negative controls (Figure 4; Additional file 3). It is remarkable that preserving the dinucleotides while shuffling the GFP siRNA sequence results in a larger number of sequence homologies in deregulated genes. It seems that the dinucleotides in the GFP siRNA sequence have a background homology in human genes, which is abrogated when shuffling completely disrupts the dinucleotide composition.
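The difference between the two kinds of shuffled control can be made concrete with a short Python sketch. It only demonstrates that a complete (mononucleotide) shuffle keeps the base composition while generally changing the dinucleotide counts; it does not implement a dinucleotide-preserving shuffle, and the sequence, seed and function names are assumptions chosen for illustration, not the controls used in this study.

```python
import random
from collections import Counter


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides (positions 1-2, 2-3, ...)."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def mononucleotide_shuffle(seq: str, seed: int = 0) -> str:
    """Completely shuffle the bases; only base composition is preserved."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


# Toy sequence (assumption): NOT the GFP siRNA used in the study.
toy_sirna = "GGAUCCAUGCUAGCUUAAGCU"
shuffled = mononucleotide_shuffle(toy_sirna)

# Base composition is identical before and after the shuffle ...
print(Counter(toy_sirna) == Counter(shuffled))
# ... whereas the dinucleotide profile is generally not.
print(dinucleotide_counts(toy_sirna).most_common(3))
print(dinucleotide_counts(shuffled).most_common(3))
```

A control that additionally keeps the dinucleotide counts fixed retains more of the local sequence context of the original siRNA, which is consistent with the higher background of homologies reported for that control.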
Interestingly, in contrast to previous findings [13, 20], homologous sequences were not clustered in the 5' seed sequence of the GFP siRNA, nor were they clustered in the 3'UTR or any other region of the target mRNAs. Jackson et al. performed transcriptome-wide time course analyses to identify off-target mRNA transcripts, which allowed them to limit the number of primary off-target genes to only 9. Within these 9 genes, the authors could search for regions with only partial and very small homology (e.g. 5/11 bp identities). In contrast, we could analyze only a single time point and found 207 down modulated genes, which is why we had to restrict our homology search to perfect matches, and this could be a confounding factor.
How should sequence-specific off-target effects be handled in siRNA experiments? Proper design of the siRNA sequence using in-silico target gene prediction will not avoid down regulation of off-target mRNA transcripts, as shown here and discussed previously. The aim should therefore not be to avoid these effects but rather to identify false positives and exclude them from the set of deregulated mRNAs. In order to help identify false positive target genes that really are off-target genes in future knockdown experiments, we here present the endogenous genes that are down regulated by one of the most commonly used control siRNAs, the siRNA directed against GFP. Another strategy to identify off-target genes arising from the control siRNA is to use several different control siRNAs that are directed against different sites in non-endogenous genes like green fluorescent protein and luciferase, in addition to a mock-transfected control. Also, the sequence of the siRNA directed against the gene of interest could be scrambled and the resulting molecule used as a negative control. However, the most straightforward approach is to employ several different siRNA molecules to target the mRNA of interest and to identify true targets by their property of being knocked down by all of these siRNAs. The most comprehensive approach of this kind would be to use esiRNAs, which should then be produced from the whole target mRNA sequence. In this case, while all of the resulting esiRNA molecules would target the single mRNA of interest, the off-target effects of each individual esiRNA, which is present only at a very low concentration compared to all other esiRNA molecules, would be minimal and probably below the detection level. Furthermore, delivery of shRNA constructs by lenti- or retroviruses is an option for reducing off-target effects, since the level of stable shRNA expression is comparatively modest, which results in minimal off-target effects.
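The multi-siRNA strategy described above amounts to a set intersection, which the following Python sketch illustrates. The function name, the siRNA labels and the placeholder gene names are hypothetical; only CYLD and SOAT are taken from this study, where they were down regulated by the GFP control siRNA.

```python
def classify_hits(hits_per_sirna, control_off_targets):
    """Split deregulated genes into candidate true targets and flagged genes.

    hits_per_sirna: dict mapping an siRNA label to the set of genes
        deregulated after transfection of that siRNA.
    control_off_targets: set of genes known to be affected by the
        control siRNA itself.
    """
    gene_sets = [set(genes) for genes in hits_per_sirna.values()]
    # Genes deregulated by every independent siRNA against the target.
    shared = set.intersection(*gene_sets)
    # Shared genes that are not explained by the control siRNA.
    candidates = shared - set(control_off_targets)
    # Any hit that is a known off-target of the control siRNA.
    flagged = set.union(*gene_sets) & set(control_off_targets)
    return candidates, flagged


# Toy example (assumption): placeholder gene names except CYLD and SOAT.
hits = {
    "siRNA_1": {"TARGET", "GENE_A", "CYLD"},
    "siRNA_2": {"TARGET", "GENE_B", "CYLD"},
    "siRNA_3": {"TARGET", "SOAT"},
}
control = {"CYLD", "SOAT"}

candidates, flagged = classify_hits(hits, control)
print(sorted(candidates))
print(sorted(flagged))
```

Genes such as GENE_A and GENE_B, which respond to only one of the siRNAs, fall into neither output set and would be treated as likely sequence-specific off-targets of that particular siRNA.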
A second, more laborious approach is to rescue the observed siRNA phenotype by transfecting a recombinant cDNA that is mutated at the siRNA target sequence(s) and thus rendered non-responsive to the siRNA. Alternatively, after using a siRNA directed against the 3'UTR of a certain gene, rescue of expression can be achieved by expressing the mRNA without its normal 3'UTR sequence. Ideally, both approaches, i) the use of several siRNA molecules and ii) rescue of the phenotype by non-responsive cDNA plasmids, should be combined in experimental strategies. Only these precautions will make it possible to definitively exclude that the observed phenotype of the siRNA knockdown is due to unwanted artefactual off-target effects.