Reference genes are routinely used to normalize measurements of gene expression. The ideal reference gene should be expressed at a constant level throughout the plant and should not be influenced by exogenous treatment [1, 5]. Housekeeping genes, such as those involved in basic cellular processes (EF1α, UBQ and CYP) or cell structure maintenance (ACT, TUB), have been extensively used, but it has become increasingly apparent that their expression level is not as independent of experimental conditions as had been expected [6–8, 13, 14, 18, 48]. This implies a need to test in advance the expression stability of any proposed reference gene(s), a procedure which is often not followed in the literature. Normalization based on several reference genes is becoming the standard, supported by the development of software such as geNorm and NormFinder [17, 21]. However, the prior validation of reference genes remains uncommon in plant research, although it is the norm in human and animal research [22–25, 32, 49–54].
Soybean has been used as a model plant for the study of photoperiod-induced floral induction, but the molecular mechanism underlying this induction remains poorly understood. In soybean, ACT, TUB and UBQ are the most frequently used reference genes (Additional file 1), but there is increasing evidence that their expression is not particularly stable under certain conditions. More recently, some alternative reference genes have emerged [36, 37]. Although four of these (SKIP16, MTP, PEPKR1 and UKN2) have been shown by RT-qPCR to be stably expressed under certain limited experimental conditions, no detailed validation has to date been carried out to test their suitability in experiments involving photoperiodic treatments.
In the present study, we used more finely subdivided samples to make the data more representative (Additional file 5). To our knowledge, this is the first systematic study of the expression stability of reference genes across such a large number of samples under varied light regimes (SD/LD/DD/LL, RL and BL) in soybean. The 14 reference genes in general out-performed the conventional housekeeping genes, and the poor performance of commonly used genes such as ACT2/7 and TUB4 was of particular note (Figure 3). SKIP16, UKN1 and UKN2 were overall the most stable and were good candidates for the normalization of general gene expression. However, each set of samples had its own best reference genes (Figure 3). For example, ACT11 was one of the best reference genes for both the tissue and the photoperiod samples, whereas TIP41 performed better than ACT11 in samples harvested under different light qualities (blue and red light), and SKIP16 was the best reference for developmental material.
The weakness of ACT2 in soybean, rice, potato and sugarcane has been noted previously [32, 37, 39, 40], while ACT2/7 was seen to be rather variable in A. thaliana. However, ACT2/7 was judged to be the most stable of a set of ten conventional housekeeping genes across 21 soybean samples, covering a range of developmental stages. Similarly, TUB performed poorly as a reference gene in grape, potato and soybean [16, 36, 39]. UBQ10, which ranked poorly in the present experiments, was previously deemed unsatisfactory as a reference in soybean and in grape, but enjoyed very stable expression in A. thaliana and Brachypodium sp. [9, 33]. EF1b was among the most stable genes both in this study and in a previous study of soybean, while in both potato and rice, EF1α was very stably expressed under conditions of biotic and abiotic stress. The same gene was also identified as being highly stable in its expression across tissues of rice, but was unstable across tissues and organs of tomato at various developmental stages. TUA5 was identified as being highly stable across development in soybean, while in poplar, TUA was very stably expressed across different tissues. Here, TUA5 expression was hardly affected by changes in photoperiod. Globally, the best-performing genes were SKIP16, UKN1, UKN2 and TIP41, while the worst were PEPKR1 and HDC. TIP41 and UKN2 have been noted as showing stable expression across tissues and development in both tomato and aspen. However, TIP41 performed poorly during grape berry development, and in the roots and leaves of A. thaliana plants suffering cadmium or copper stress. In aspen cambial cells, UKN2 expression was too unstable for the gene to be used for normalization. Thus, overall, while certain reference genes are stably expressed in one plant species, they may not be well suited for use in others.
As a consequence, prior validation of reference genes needs to be carried out under the specific experimental conditions to be applied in gene expression studies.
We report the application of various mathematical and statistical models to minimize bias in the quantification of gene expression in soybean. The first was a conventional statistical test that calculated the coefficient of variation (CV) of the Cq values, which allowed an assessment of an individual gene's expression stability. However, because of its low sensitivity and reliability, this method cannot clearly identify the most stably expressed reference genes. The second exploited the geNorm software, which showed that the stability of the various candidate reference genes varied considerably across the sets of samples (Figure 1). The third used the alternative program NormFinder, which ranks the reference genes according to their expression stability. The ranking of genes produced by NormFinder was largely the same as that generated by geNorm (Table 3). Except for TUB4, all the candidate reference genes were represented in the Genevestigator database, and most of the expression patterns revealed by the Genevestigator microarray data were consistent with the outputs of geNorm and NormFinder in the present data set (Additional files 6 and 7).
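The first two stability measures described above can be illustrated with a minimal sketch. It assumes that Cq values are converted to relative quantities with a fixed amplification efficiency of 2, and that the geNorm-style stability value M of a gene is the mean standard deviation of its log2 expression ratios against every other candidate, as we understand the published geNorm measure. The function names, the efficiency default and the data layout are hypothetical and are not taken from the geNorm software or from this study.

```python
import math
import statistics


def cq_cv(cq_values):
    """Coefficient of variation (in %) of the raw Cq values of one gene."""
    return 100.0 * statistics.stdev(cq_values) / statistics.mean(cq_values)


def relative_quantities(cq_values, efficiency=2.0):
    """Convert Cq values to relative quantities, scaled to the lowest Cq.

    The efficiency of 2.0 (perfect doubling per cycle) is an assumption.
    """
    min_cq = min(cq_values)
    return [efficiency ** (min_cq - cq) for cq in cq_values]


def genorm_m(quantities):
    """geNorm-style stability value M for each gene.

    `quantities` maps a gene name to its relative quantities, with the
    samples in the same order for every gene. A lower M means that the
    gene's expression ratio to the other candidates is more constant.
    """
    m_values = {}
    for gene, q_gene in quantities.items():
        pairwise_sds = []
        for other, q_other in quantities.items():
            if other == gene:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(q_gene, q_other)]
            pairwise_sds.append(statistics.stdev(log_ratios))
        m_values[gene] = statistics.mean(pairwise_sds)
    return m_values
```

Because M is built from ratios between genes, two candidates that rise and fall together receive a low pairwise variation even when neither is constant in absolute terms, whereas the CV of the Cq values looks at each gene in isolation.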
It has been argued that co-regulation of genes may confound geNorm analyses, because of the software's tendency to select genes with a similar expression profile. Among the set of genes tested, two pairs (TUA5/TUB4 and ACT2/7/ACT11) each belong to the same gene family, and thus may be prone to co-regulation. However, co-regulation of the ACT and TUA genes appears unlikely in this study (Figure 3), given that ACT11 and TUA5 were consistently ranked above ACT2/7 and TUB4, the only exception being that TUB4 ranked above TUA5 in the comparison of different cultivars.
The transcript abundance of many genes, like that of GmFTL3, is never very high, so any variation in their expression pattern is inevitably subtle. In this study, we normalized the expression of GmFTL3 with a total of seven normalization factors, based either on individual control genes or on combinations of two, three and four control genes, and obtained similar patterns, although the abundance levels differed. Normalization against a combination of more genes nevertheless improved accuracy. This suggests that the number of reference genes required depends on the researcher's purpose. If the aim is only to show the approximate expression pattern of a gene, one reference gene may suffice, provided that it has been confirmed to be stably expressed. However, if the aim is to compare expression among different samples or to quantify the expression level accurately, more reference genes (the number being guided by the geNorm threshold of 0.15) should be used. This may be partially explained by the fact that the geNorm threshold is not a strict cut-off and that the observed trend in the pairwise variation values is equally informative [17, 33, 56].
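Normalization against one or several reference genes, and the pairwise variation compared with the 0.15 threshold, can be sketched as follows. The sketch assumes that a normalization factor is the per-sample geometric mean of the reference genes' relative quantities and that the pairwise variation between two successive normalization factors is the standard deviation of their log2 ratios across samples, following the usual description of geNorm. All function names are hypothetical, and the code is not the procedure used to generate the results reported here.

```python
import math
import statistics


def normalization_factors(ref_quantities):
    """Per-sample geometric mean of the reference genes' relative quantities.

    `ref_quantities` is a list with one inner list per reference gene,
    each holding one relative quantity per sample (same sample order).
    """
    factors = []
    for sample in zip(*ref_quantities):
        mean_log = sum(math.log(q) for q in sample) / len(sample)
        factors.append(math.exp(mean_log))
    return factors


def normalize(target_quantities, factors):
    """Divide the target gene's relative quantities by the factors."""
    return [t / f for t, f in zip(target_quantities, factors)]


def pairwise_variation(factors_n, factors_n_plus_1):
    """geNorm-style V(n/n+1) between two successive normalization factors."""
    log_ratios = [math.log2(a / b) for a, b in zip(factors_n, factors_n_plus_1)]
    return statistics.stdev(log_ratios)


def extra_gene_needed(factors_n, factors_n_plus_1, threshold=0.15):
    """True if adding the (n+1)th gene changes the factor by more than the threshold."""
    return pairwise_variation(factors_n, factors_n_plus_1) > threshold
```

In such a scheme, a factor built from a single gene is simply that gene's relative quantity, and the comparison against 0.15 is best read alongside the trend in successive V values rather than as a strict cut-off.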