Evaluation of real-time PCR endogenous control genes for analysis of gene expression in bovine endometrium

Background Quantitative real-time PCR gene expression results are generally normalised using endogenous control genes. These reference genes should be expressed at a constant level across all sample groups in a study, and should not be influenced by study treatments or conditions. There has been no systematic investigation of endogenous control genes for bovine endometrium to date. The suitability of both commonly used and novel endogenous control genes was evaluated in this study, with the latter being selected from stably expressed transcripts identified through microarray analysis of bovine endometrium. Fifteen candidate endogenous control genes were assessed across different tissue subtypes in pregnant and cycling Holstein-Friesian dairy cows from two divergent genetic backgrounds. Results The expression profiles of five commonly used endogenous control genes (GAPDH, PPIA, RPS9, RPS15A, and UXT) and 10 experimentally derived candidate endogenous control genes (SUZ12, C2ORF29, ZNF131, ACTR1A, HDAC1, SLC30A6, CNOT7, DNAJC17, BBS2, and RANBP10) were analysed across 44 samples to determine the most stably expressed gene. Gene stability was assessed using the statistical algorithms GeNorm and Normfinder. All genes presented with low overall variability (0.87 to 1.48% CV of Cq). However, when used to normalise a differentially expressed gene (oxytocin receptor - OXTR) in the samples, the reported relative gene expression levels were significantly affected by the control gene chosen. Based on the results of this analysis, SUZ12 is proposed as the most appropriate control gene for use in bovine endometrium during early pregnancy or the oestrus cycle. Conclusion This study establishes the suitability of novel endogenous control genes for comparing expression levels in endometrial tissues of pregnant and cycling bovines, and demonstrates the utility of microarray analysis as a method for identifying endogenous control gene candidates.


Background
Quantitative real-time reverse transcription PCR (RT-PCR) is an extremely sensitive technique that allows the precise measurement of gene expression across more than seven orders of magnitude [1,2]. RT-PCR is often considered the gold standard for quantifying gene expression, and is commonly used to validate techniques with greater throughput but less overall sensitivity, such as microarray analysis [3][4][5]. RT-PCR relies on the use of fluorescent dyes to quantify transcript amplification, with the amplification cycle number at which these dyes/transcripts are detected (above background) giving an indication as to the relative abundance of the target molecules. The sensitivity of RT-PCR makes it a powerful tool for gene expression measurement, especially when sample quantities are limited or a transcript is expressed at a low level. However, this sensitivity also means that a great deal of care must be taken with regards to experimental design and implementation of the procedure.
When designing an experiment to evaluate gene expression in a group of samples, a number of critical factors must be kept constant. These include RNA extraction, DNase treatments, and cDNA synthesis. Normalisation of RT-PCR results is required to control inter-sample differences that may arise as a result of these sample processing steps, and ensure the gene expression of target transcripts are robustly quantifiable [6,7].
The most common method for normalising RT-PCR data involves the use of one or more endogenous control genes. An ideal endogenous control gene is one that is stably expressed within the samples to be compared, regardless of tissue differences, experimental conditions, or treatments. Choosing an endogenous control gene to normalise gene expression data is one of the most crucial steps in the experimental design. Genes used as endogenous controls in RT-PCR experiments are often chosen with little prior knowledge of their expression over the experimental conditions examined, and are often selected arbitrarily from a pool of commonly used endogenous control genes such as GAPDH, and β-actin [8]. The most widely used endogenous control gene in studies of endometrial gene expression is GAPDH [9][10][11][12]. However, the suitability of GAPDH as an endogenous control gene has recently come into question, especially due to its potential regulation in a wide variety of physiological states [13], making it a questionable choice for RT-PCR normalisation [14].
Over the past three decades, genetic selection for milk production has resulted in a significant decline in dairy cattle fertility [15]. The fertilisation rate in dairy cattle is around 90% and does not differ between low-moderate and highproducing animals. However, the calving rate in lower producing animals is approximately 55%, whereas in high-producing animals this rate is approximately 35%. Pregnancy losses are thought to occur primarily during the pregnancy recognition/pre-implantation period [16], making studies of endometrial gene expression critical to further understanding of pregnancy establishment, recog-nition and maintenance within the bovine reproductive cycle.
The primary aim of this study was to identify suitable endogenous control genes for analysis of endometrial tissues from pregnant and cycling bovines. This study also aimed to investigate the potential use of microarray data analysis for identification of novel endogenous control genes, and the effect of endogenous control gene selection on the calculated expression of a target gene.
A total of 15 candidate endogenous control genes were analysed in 44 samples representing two different tissues (intercaruncular and caruncular) from 22 animals. These animals were either pregnant or cycling at day 17 of the reproductive cycle, and represented Holstein-Friesian cows from two divergent genetic backgrounds (North American (NA), and New Zealand (NZ)).
Two strategies were employed to identify the candidates. Five genes were selected on the basis that they had been previously used as housekeeping genes [17][18][19], and an additional 10 novel genes were derived from a microarray experiment based around the same 44 samples used in the current analysis. Genespring GX software was used to generate a list through filtering on expression stability across the 44 samples. This list was subjected to GeNorm [20] and Normfinder [21] analysis to identify the 10 most suitable genes. The suitability of all 15 genes was then tested through statistical analyses, including a comparison of expression stability as determined by GeNorm and Normfinder algorithms. The effect of using these endogenous control genes was then evaluated using relative quantification of a gene known to be differentially expressed in the study.

Results and Discussion
Microarray analysis of the 44 endometrial samples revealed 27 transcripts with a high degree of expression stability (filtering on expression level -upper limit 1.2, lower limit 0.833). GeNorm and Normfinder were utilised to identify the 10 most stably expressed transcripts for further analysis. For RT-PCR design, full length transcripts were identified by querying microarray probe sequences against the bovine genome (Btau3,1) using NCBI BLAST http://blast.ncbi.nlm.nih.gov/Blast.cgi.
Gene expression levels of the candidate endogenous control genes (expressed in Cq values) are displayed in Table  1 and Figure 1. Cq values for sample replicates had very low variability with a mean intra-assay coefficient of variation (CV) of 0.41%. All genes had low overall variability, with the Cq range between 1.06 and 2.04 cycles, standard deviations ranging from 0.25 to 0.53 cycles, and CV values ranging from 0.87 to 1.48% Table 1.
Significance calculations between gene expression data for pregnant and cycling animals were performed on Cq and relative concentration values, as estimated through absolute quantification using the Roche LC480 software. No significant differences (P > 0.05) in means or variances (Cq or concentration) between pregnant and cycling endometrial tissues were apparent for any candidate genes (except BBS2, which had a p-value of 0.04 (F-test) for variance between pregnant and cycling animals). The means and variances between the two different strains of Holstein-Friesians also lacked significance (data not shown).
Cqs and relative concentration variances between the two different tissue types showed no significant differences, however means were significantly different for all genes, except RPS15A (P = 0.39 (T-test), P = 0.032 (ANOVA)).
The differences in means calculated between tissue subtypes likely reflect the distinct morphological and functional differences between caruncular and intercaruncular endometrium, which relate to their respective roles in reproduction. The 'caruncules' of the endometrium are specialised projections that are the site of embryo attachment. Caruncules become highly vascularised, and are the major site for small molecule and gaseous exchange. In comparison, intercaruncular tissue is highly glandular and responsible for early nourishment of the embryo through secretions of large molecules into the uterus [22][23][24]. The intercaruncular tissue is often thought to be more important in early pregnancy and, therefore, the majority of expression studies in pre-implantation bovine endometrium focus solely on intercaruncular tissue gene expression [25][26][27]. There is very little reported expression analysis of caruncular endometrium and no known studies comparing expression profiles of the two tissues.

Expression stability testing of candidate genes
To further analyse the suitability of the candidate genes for use as endogenous controls for bovine endometrial tissues, expression stability was assessed using the GeNorm [20] and Normfinder [21] algorithms. GeNorm rankings for the 15 genes tested are presented in Table 2 and Figure 2A. GeNorm identified SUZ12 and ZNF131 as the most stably expressed genes of the 15 candidates. Four of the five most stable genes were those derived from microarray data; the other was RPS15A, which was selected from the literature. By contrast, four of the five least stable genes were chosen from the literature, including GAPDH. The most stably expressed gene identified by Normfinder was SUZ12 (Table 2), which was also one of the two most stable genes identified in the GeNorm analysis. The best combination of genes identified by Normfinder was SUZ12 and C2ORF29. The five most stably expressed genes according to Normfinder consisted entirely of microarray-derived genes, while the least stable five contained three out of five genes selected from the literature, again including GAPDH. The comparative ranking of all genes for both GeNorm and Normfinder algorithm analyses is displayed in Table 2.
GeNorm calculates an expression stability value (M) for each candidate gene based on pairwise comparisons of variability. Each gene is compared to every other gene to determine the two genes that contain the least variation. The stability value calculated for each gene is used to rank genes from least to most stable. The authors of the method give an M-value of 1.5 as a cut-off for suitability as an endogenous control gene. The principal behind the pairwise stability measure ranking is that two ideal candidate normalisation genes should have an equal expression ratio in all samples [28]. In the present study, all genes are well below the stipulated 1.5 M-value. The program then calculates a normalisation factor in each sample for the most stable genes (data not presented). GeNorm also calculates the optimal number of endogenous control genes to be used in the analysis of gene expression (Table 3). This value is determined by locating the point where addition of the next most stable gene does not significantly affect the normalisation factor, using a cut-off value of 0.15. In this study, the value for 2 genes was 0.042, suggesting that 2 genes should be sufficient to normalise the experimental data.
Normfinder is another freely available tool for the identification of stable endogenous control genes. The main point of difference between the two methods is that Normfinder takes into account both inter-and intragroup variability. The program not only identifies the most stable pair of endogenous control genes but also identifies the best overall endogenous control gene. The calculation of variability between groups is especially important in the present study considering the significant expression differences between the two tissue subtypes. The use of the two most stably expressed genes, in this case SUZ12 and C2ORF29 (Normfinder), should provide sufficient normalisation for tissue comparisons as Normfinder selects the best combination of genes whilst taking into account any grouping effects such as tissue type.
The differences in rankings of gene stability using the two algorithms could be due to the fact that they use very different methods to assess gene stability. GeNorm selects genes based on the pairwise variation between genes. The two most stably expressed genes are therefore those genes that share an expression profile. In contrast, Normfinder uses a model based algorithm that takes into account overall stability as well as any groups that may be present in the sample set. For example, if there are any grouping Expression levels of candidate endogenous control genes in pregnant (P) and cycling (Cy) endometrial tissue samples effects on gene expression a gene would be ranked lower than one that demonstrated variability not associated with any particular group.

Effect of Endogenous control gene on target gene relative quantification
Temporal down-regulation of endometrial oxytocin receptor (OXTR) expression is a hallmark of early pregnancy, with embryonic interferon tau (IFNτ) thought to elicit this response [29,30]. Given the expectation of differential OXTR expression between pregnant and cycling animals, the effect of control gene stability on gene expression values of OXTR in the 44 endometrial samples was tested. Figure 3 presents relative OXTR expression levels when normalised with endogenous control genes of varying stability -the most stable gene identified by Normfinder (SUZ12), the two most stable genes identified by GeNorm (SUZ12 and ZNF131), and two of the least stably expressed genes identified by both Normfinder and GeNorm (GAPDH and UXT). Normfinder identified SUZ12 and C2ORF29 as the best combination of genes, but the relative expression was not significantly different from that calculated using only SUZ12 or a combination of SUZ12 and ZNF131 (data not presented). Correlation of normalised RT-PCR data to OXTR microarray expression data was not affected by choice of normalisation strategy. When compared to microarray reported expression, all calculated expression values had correlation coefficients of 0.79.
OXTR expression was significantly greater in cycling than in pregnant cows regardless of the endogenous control gene used (Figure 3, ANOVA, P < 0.01). Notably, OXTR expression in pregnant animals was greater on average and more variable in NA animals (which have lower fertility in general [31]) than for NZ animals ( Figure 3A and 3B), and was also related to embryo size (data not presented). The use of different endogenous control genes had no effect on OXTR expression differences in these group comparisons. The average calculated fold change difference between pregnant and cycling animals was not affected by the choice of normalisation gene, possibly due to the large difference in expression level for this comparison (10-fold average difference between pregnant and cycling animals). However, the choice of reference gene could be important when normalising genes that exhibit more subtle variation between experimental groups, given the considerable variation in expression shown between individual samples.

Conclusion
This study provides the first reported assessment of endogenous control genes for use in expression studies in bovine endometrium. Normalisation is a critical factor in reporting RT-PCR expression data, providing a necessary control for error associated with sample preparation. Normalisation using endogenous control genes provides a means of controlling this error, provided the gene(s) used are stably expressed across all samples under investigation. The study described here tested 15 candidate reference genes across 44 bovine endometrial samples representing a range of physiological states and tissue subtypes. This study evaluated the suitability of both commonly used and novel experimentally derived reference genes for use in normalisation of RT-PCR data. Candidates derived via microarray analysis were superior to existing, commonly used endogenous control genes, demonstrating the suitability of using microarray data for deriving novel endogenous control genes. This study also highlighted the importance of accurate normalisation with a stable endogenous control gene, by demonstrating relative expression of a differentially expressed gene when normalised using control genes of varying stability. SUZ12 was ranked first for stability across samples as determined by the statistical algorithms used in GeNorm and Normfinder, and is therefore proposed as the best gene for normalisation of RT-PCR data in the current study.

Methods
All animal manipulations were carried out with the approval of the Ruakura Animal Ethics Committee (Ham-ilton, New Zealand). This work was conducted at No 5 Dairy, DairyNZ Ltd (Hamilton, New Zealand).  GeNorm calculates the minimum number of genes required for accurate normalisation. In this case, two genes were sufficient (V2/3 = 0.04). The number of genes necessary for accurate normalisation is defined as the point at which the addition of a gene has a non-significant effect on the calculation of the normalisation factor, the threshold for a non-significant difference being 0.15.

Relative gene expression results for OXTR when normalised with endogenous control genes of different stability in pregnant (A) and cycling (B) animals
x the reproductive cycle. The animals consisted of 12 pregnant and 10 cycling, further divided into North American (NA) and New Zealand (NZ) genetic ancestry (Table 4). This genetic strain model was chosen because NA cows have been reported to have poorer reproductive performance than NZ cows [31].
Total RNA was extracted using a Qiagen RNeasy kit (QIA-GEN). All samples were DNase treated using the Ambion DNA-free kit (Ambion, Austin, TX) according to the manufacturer's instructions. RNA quantity was determined by spectrophotometry in a Nanodrop ND-1000 (Nanodrop Technologies, Wilmington, DE). RNA integrity was checked using the Agilent 2100 Bioanalyzer with a RNA 6000 Nano LabChip kit (Agilent Technologies, Palo Alto, CA).

cDNA Synthesis
One μg of an endometrial RNA sample was used for cDNA synthesis using the Invitrogen Superscript III Supermix kit (Invitrogen Corporation, Carlsbad, California). Total RNA was reverse transcribed according to the manufacturer's instructions using a final concentration of 27 μM of random pentadecamers primers. Briefly, RNA and random primers were mixed and denatured at 65 C for 5 minutes, followed by 1 minute on ice. Annealing buffer and Superscript/RNase was added to samples; these were then incubated for 10 minutes at 25 C (primer annealing), followed by 50 minutes at 50 C, and finally 5 minutes at 85 C to inactivate the enzyme. Reverse transcription (RT) negative controls were performed to test for the presence of genomic DNA contamination in RNA samples. Duplicate experimental samples were processed for cDNA synthesis as described above, but without the inclusion of the reverse transcriptase enzyme. Amplification was then tested for all genes using RT-PCR followed by assessment on a 3% agarose gel. No amplification was found in any of these samples.

Candidate endogenous control genes
Fifteen potential endogenous control genes were selected either from a literature search or were identified from a microarray study through statistical analysis using Genespring GX (Agilent Technologies) software in combination with the GeNorm [20]and Normfinder [21] algorithms.
Briefly, Genespring GX software was used to analyse an array data set (Agilent 44k 60-mer oligonucleotide bovine array) representing 44 bovine endometrial samples collected as described above. All array data had undergone standard quality control and statistical analysis, including filtering on flags (present or marginal in all samples) and filtering on raw expression level of 200 to obtain reliable data. A list of 27 genes were derived from this dataset by filtering on normalised expression level (upper limit 1.2 and lower limit 0.833) to obtain genes that had stable expression across all 44 samples. This dataset was further analysed using the Microsoft Excel applets GeNorm and Normfinder. These programs were used to determine the 10 most stably expressed genes in the array data set.

Quantitative Real-Time PCR
Real-time PCR using the Roche Lightcycler 480 (Roche) was performed on 15 candidate endogenous control genes for each of the 44 bovine endometrial samples using the Roche real-time PCR master mix (Lightcycler 480 Probes Master) in combination with Roche Universal Probe Library (UPL) assays. Assays were designed to publicly available bovine gene sequences (NCBI) using Roche UPL design software (ProbeFinder, v.2.45). All assays were designed to span an intron-exon boundary to prevent amplification of DNA. The primer and probe sequences are presented in Table 5.
The PCR reaction volume was 10 μL consisting of 0.5 μM of each primer and 0.1 μM of probe. Standard cycling conditions were used [95 C for 10 minutes, (95 C for 10 seconds, 60 C for 30 seconds) × 50 cycles, 40 C for 40 seconds].
To quantify gene expression, cDNA was diluted 100-fold for all genes except ACTR1A, DNAJC17, HDAC1,  Roche Universal probe library assays were designed for 15 candidate endogenous control genes including 10 experimentally derived genes, five commonly used genes and one target gene -OXTR. method, a non-linear regression line method. A six point relative standard curve of serial dilutions of cDNA was used with an estimated starting concentration of 1.0 and final concentration of 1.6E-03.

Relative quantification of OXTR
The Roche Lightcycler 480 Software was used to perform advanced relative quantification analysis of OXTR gene expression. SUZ12, UXT and GAPDH were each used as a reference for the quantification of OXTR expression. Relative quantification was also performed using the normalisation factor of the two most stably expressed genes identified through GeNorm analysis (SUZ12 and ZNF131). Normalisation factors were calculated by taking the geometric mean of the two genes for each sample.

Data analysis
Statistical analyses were performed using the Excel applets GeNorm (version 3.5) [20]and Normfinder [21] to estimate expression stability of the candidate endogenous control genes. GeNorm requires expression data to be input as concentrations determined via quantification, taking into consideration PCR efficiencies ( Table 5). The program then estimates the most stable genes based upon pairwise comparisons of sample variability. The two most stable genes are identified and a normalisation factor calculated. Normfinder analyses the stability of the candidate genes taking into consideration inter-group variability. The program then ranks genes based on a stability value, with the lowest value indicating the most stably expressed gene. Significance tests were also performed on the data. Microsoft Excel was used to perform T-tests (with Bonferroni multiple testing correction applied) to test for significant differences between the means, and Ftests were used to assess differences in variance between experimental groups for each gene. ANOVA was also used to test for any significant difference in the means (Gen-STAT). Pearson correlation calculation was used to assess the correlation of the microarray reported expression for the target gene (OXTR) and the RT-PCR reported expression.