- Research article
- Open Access
Statistical modelling of transcript profiles of differentially regulated genes
© Eastwood et al; licensee BioMed Central Ltd. 2008
- Received: 07 August 2007
- Accepted: 23 July 2008
- Published: 23 July 2008
The vast quantities of gene expression profiling data produced in microarray studies, and the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of variance (ANOVA) and the clustering of genes based on simple models fitted to their expression profiles over time. We report the novel application of statistical non-linear regression modelling techniques to describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E. coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models provides a more precise description of expression profiles, reducing the "noise" of the raw data to produce a clear "signal" given by the fitted curve, and describing each profile with a small number of biologically interpretable parameters. This approach then allows the direct comparison and clustering of the shapes of response patterns between genes and potentially enables a greater exploration and interpretation of the biological processes driving gene expression.
Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Split-line" or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification of genes into those with primary and secondary responses. Five-day profiles were modelled using the biologically-oriented, critical exponential curve, y(t) = A + (B + Ct)R^t + ε. This non-linear regression approach allowed the expression patterns for different genes to be compared in terms of curve shape, time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory patterns were identified for the five genes studied. Applying the regression modelling approach to microarray-derived time course data allowed 11% of the Escherichia coli features to be fitted by an exponential function, and 25% of the Rattus norvegicus features could be described by the critical exponential model, all with statistical significance of p < 0.05.
The statistical non-linear regression approaches presented in this study provide detailed biologically oriented descriptions of individual gene expression profiles, using biologically variable data to generate a set of defining parameters. These approaches have application to the modelling and greater interpretation of profiles obtained across a wide range of platforms, such as microarrays. Through careful choice of appropriate model forms, such statistical regression approaches allow an improved comparison of gene expression profiles, and may provide an approach for the greater understanding of common regulatory mechanisms between genes.
- Transcript Level
- Fruiting Body
- Microarray Dataset
- Gill Tissue
- Statistical Regression Modelling
Various statistical approaches have been specifically developed to summarise the vast quantities of data that are produced in microarray studies [1–3], employing analysis of variance (ANOVA), clustering and network modelling. Analysis of variance (ANOVA) has been used to identify those gene expression responses that are most affected by different treatments, often taking account of particular forms of treatment structure, such as the correlations between sample times in a time-course study . Approaches for clustering genes with similar responses range from simple methods for observed data, the calculation of correlations between genes , through to clustering based on linear  or polynomial regression  or spline models . Network models are used to reconstruct transcription factor activity  or infer regulatory networks , assuming a particular mechanistic model for the behaviour of each regulation function based on observed microarray gene expression data.
This paper aims to use standard statistical non-linear regression models to enhance the biological interpretation of individual gene expression profiles. Such regression models provide accessible methods to describe the shape of each gene expression profile as a function of time, thus providing an insight into the underlying processes rather than simply identifying significant differences. For example, non-linear models can be used to identify the time of a particular event in a gene expression profile, such as the time of rapid up- or down-regulation. Similarly, modelling transcript changes using parametric equations that allow biological interpretation can further allow the comparison or clustering of the shapes of the expression profiles based on biological interpretable parameters. Such non-linear regression techniques are commonly used in agronomic studies to describe responses to a range of quantitative input variables, but are not commonly used in the examination of gene expression data.
The initial model system used to investigate the potential of statistical parametric, non-linear regression approaches for gene profiling was fungal morphogenesis with data provided by quantitative reverse transcriptase PCR (qRT-PCR) which provides a more precise method than either Northern analysis or microarrays . This system encompasses a range of growth forms from vegetative mycelium to multicellular organs which enable the fungus to respond to changes in nutrition and environment, and undergo pathogenesis or reproduction. The fruiting bodies of the basidiomycete fungus Agaricus bisporus, the cultivated mushroom, are ideal for studying fungal morphogenesis as they are macroscopic, the tissues are clearly de-lineated (stipe, caps and gills) and the initiation of fruiting body morphogenesis is controlled environmentally. Differential screening and targeted gene cloning procedures have identified genes up-regulated post-harvest in A. bisporus fruiting bodies, based on Northern analysis [12–14]. Genes have also been identified which are expressed in developing fruiting bodies of several other fungi, including Lentinula edodes , Pleurotus ostreatus , Flammulina velutipes  and Coprinus cinereus (Coprinopsis cinerea) [18, 19].
This study investigated how expression profiles, generated from qRT-PCR data of differentially regulated genes in A. bisporus fruiting bodies, could be statistically modelled both to estimate the time of up-regulation and determine similar temporal expression patterns. The five genes chosen for profiling are functionally distinct, and therefore unlikely to have obvious common regulatory mechanisms: they are cruciform DNA binding protein, cytochrome P450II, β (1–6) glucan synthase, glucuronyl hydrolase and riboflavin aldehyde-forming enzyme [12, 20]. Whilst being functionally distinct, these genes were expected to show broadly similar patterns of expression following harvest, allowing the fitting of a single model form to all five responses. This enables the comparison of the profiles via biologically interpretable parameters rather than simply clustering genes based on the observed data. To determine the time when transcription first increased for each gene, transcript levels of each gene were examined at 3 h intervals for 24 h. Transcript levels were also measured at 24 hour intervals over 5 days. These profiles were modelled to provide directly interpretable parameter values, offering an insight into the regulation of the genes. Spatial control of gene expression was also assessed by comparing transcription in tissues of the harvested mushroom during the first 48 hours post-harvest storage.
Furthermore, this study has applied these regression modelling approaches to publicly-available microarray data sets from published studies, to identify groups of genes showing similar regulatory patterns. The potential application of this approach to fully exploit the large quantities of high-throughput data is discussed.
The qRT-PCR-derived data were initially assessed to ensure suitability for further study. Standard curves show that the PCR reactions were operating at close to 100% efficiency. Melting curve analyses showed that in all cases a single product was obtained showing that the specificity of the reaction was high. The PCR control treatments showed no evidence of DNA contamination or active reverse transcription after the 85°C treatment (data not shown).
Comparison of methods of measuring gene expression in A. bisporus
Transcription in the first 24 hours (A. bisporus)
Transcription profiling over 5 days post-harvest (A. bisporus)
For all genes, there was a significant improvement in the fit when choosing the critical exponential rather than the single exponential, but no significant improvement when choosing the double exponential over the critical exponential (data not shown). Simultaneous fitting of the critical exponential curve to the data for all five genes allowed a comparison of the fitted parameters, and hence the detection of degrees of commonality between the patterns of expression for the genes. This analysis identified that there was no significant improvement to the fit when allowing parameter R (related to the curvature of the response) to be different for the five genes, and so this parameter could be constrained to be the same. However, constraining the other three parameters to be the same across all five genes resulted in significantly worse fits, though for some pairs of genes the fitted (and derived) parameters did suggest some similarities (Table 1). The different parameters (both fitted and derived) can be interpreted in terms of particular features of the shape of the response. The size of parameter C indicates the magnitude of the decline from the maximum expression response, whilst the ratio of B over C is related to the time to the maximum response. Parameter A measures the asymptotic response level after lengthy storage, A + B is the response at time t = 0, and the maximum response is dependent on all four parameters, obtained by inserting the time of maximum response into the critical exponential equation (Equation 1).
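The derived quantities described above follow directly from the curve y = A + (B + Ct)R^t. Differentiating gives dy/dt = R^t (C + (B + Ct) ln R), which is zero at t = -1/ln(R) - B/C; this is why the ratio of B over C is related to the time to the maximum response. The short sketch below illustrates the calculation. The function names and the parameter values are hypothetical and are not taken from Table 1; the stationary point is a maximum only under the assumption that 0 < R < 1 and C > 0.

```python
import math

def critical_exponential(t, A, B, C, R):
    """Mean response of the critical exponential curve, y = A + (B + C*t) * R**t."""
    return A + (B + C * t) * R ** t

def time_of_maximum(B, C, R):
    """Stationary point of the curve, found by solving dy/dt = 0 for t.

    dy/dt = R**t * (C + (B + C*t) * ln R), which is zero at
    t = -1/ln(R) - B/C. For 0 < R < 1 and C > 0 this is a maximum.
    """
    return -1.0 / math.log(R) - B / C

# Hypothetical parameter values, for illustration only (not from Table 1).
A, B, C, R = 1.0, -1.0, 2.0, 0.5
t_max = time_of_maximum(B, C, R)                 # time of maximum response
y_max = critical_exponential(t_max, A, B, C, R)  # maximum response
y_start = critical_exponential(0.0, A, B, C, R)  # equals A + B
```

Inserting t_max back into the curve gives the maximum response, which therefore depends on all four parameters, whereas the starting level depends only on A and B and the asymptote only on A.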
Critical exponential curve models for the transcription patterns of the 5 genes during long-term A. bisporus fruitbody post-harvest storage.
Transcript expression between tissues (A. bisporus)
Application of regression approaches to microarray datasets (E. coli and R. norvegicus)
For both studies the chosen functions allow the description of a number of distinct forms of response. The fitted responses for the 20 most significant fits from each study demonstrate the variety of profiles that can be described by each of these models (Figures 7 and 8).
Gene expression studies were conducted to investigate the benefit of applying statistical regression approaches to the fitting of mathematical models for the analysis of transcriptional data. In this study these approaches have been developed using precisely measured transcript levels for a small number of genes and application of the approaches has been further demonstrated for microarray datasets. As microarray technologies continue to be developed, the variability of gene expression data from such technologies will be reduced, leading to the widespread application of the regression modelling techniques developed in this study, thus allowing the comparative analysis of larger numbers of gene expression profiles. A range of statistical regression techniques, both linear and non-linear, are readily available. The selection of appropriate techniques is critically dependent on the specific question being addressed and the data that are collected.
For A. bisporus, the increases in transcript levels in the first 24 hours following harvest are largely due to transcription rather than losses due to mRNA turnover. 'Split-line' or 'broken-stick' regression analysis was used to calculate the time when transcription was initiated, at 13.5 h post-harvest for cytochrome P450II and 16.5 h post-harvest for glucuronyl hydrolase. The novel application of this approach to transcript profiling demonstrates the potential value of this simple mathematical model, which has been used previously in such diverse applications as estimating the thresholds for patch size in ecological studies , humidity levels in plant pathogen germination experiments  and the mineral density of bones . Successful application of the 'split-line' or 'broken-stick' model for cruciform DNA-binding protein and β (1–6) glucan synthase would require more sampling points in the first 6 hours, to provide sufficient time points for fitting the baseline response (the zero-slope line) and hence for estimating the time of initial transcription. For the riboflavin aldehyde-forming enzyme, further data are needed beyond 24 hours (but still at 3-hour intervals) to allow estimation of the second linear regression segment, and again to allow calculation of the time of initial transcription.
Increased transcription began at different times in the first 24 hours post-harvest for at least 4 of the 5 tested genes. This suggests that the response of the mushroom to harvest is not under the control of a single regulatory pathway, such as the signal transduction pathways described for fungal oxidative and osmotic stress responses . The controlling events in the mushroom are likely to be affected by a range of stimuli, such as stress, nutrient limitation, continued maturation and spore formation, which might elicit both primary (immediate) responses and secondary responses. Here we observed increased transcription first of cruciform DNA-binding protein (approximately 3–6 h), followed by β (1–6) glucan synthase (approximately 9 h), cytochrome P450II (13.5 h), glucuronyl hydrolase (16.5 h) and riboflavin aldehyde-forming enzyme (> 24 h). Cruciform DNA-binding protein and β (1–6) glucan synthase may be part of a primary response, while cytochrome P450II, glucuronyl hydrolase and riboflavin aldehyde-forming enzyme may be part of a secondary response or be caused by a later stimulus.
Statistical regression modelling showed similar expression patterns for two A. bisporus genes, glucuronyl hydrolase and riboflavin aldehyde-forming enzyme, the latter of which is known to be up-regulated during the development of the non-harvested mushroom . Further study is required to determine whether the common pattern observed between the genes post-harvest is also observed in the morphogenesis of the non-harvested fruiting body. The pattern of β (1–6) glucan synthase gene expression during 5 days storage was different from the other genes studied, with transcript levels falling greatly after 2 d following an initial increase in gene expression. The gene has been hypothesised to be involved in cell wall synthesis . The initial increase in gene expression coincides with the period when hyphae in the cap and stipe elongate, i.e. in the first 2 days post-harvest, a process for which cell wall synthesis is important. Similarly, the reduction in expression after day 2 coincides with the cessation of cell wall synthesis following the full extension of the cap [27, 28].
The fitted critical exponential models for the 5-day profiles of cruciform DNA-binding protein and cytochrome P450II had similar shapes, whilst the times of initial transcriptional increase for these two genes, as determined by split-line regressions, were markedly different (3–6 h and 13.5 h respectively). This apparent paradox illustrates the importance of considering transcript responses over a number of different time-scales. The precise function of cruciform DNA-binding protein is not known; however, it is unlikely to be involved in recombination as transcript levels are low in the gill tissue where meiosis occurs. In other organisms, proteins with similar cruciform DNA-binding activity (i.e. HMG proteins) cause the increased and decreased transcription of genes [29, 30]. It is possible that the early and abundant transcription of cruciform DNA-binding protein in A. bisporus fruiting bodies acts to regulate the expression of other genes.
The spatial transcript levels were different between tissues and for all genes were significantly lower in the gill tissue, indicating a physiological difference from the stipe and cap. The gills are actively growing, respiring and producing spores via meiosis. They are a nutrient 'sink' and are therefore physiologically different from the stipe and cap tissues, which export nutrients and are subject to stress from cell damage. However the increased respiration in gill tissue [31, 28] may result in increased ribosomal, and therefore 18S rRNA, levels, to which transcript levels from qRT-PCR are normalised, so the quantitative differences between gills and stipe/cap tissues may be influenced by this. While pairs of genes showed similar patterns of response when described using the critical exponential model, the spatial distribution of transcripts between different tissues varied in some cases. For example, cytochrome P450II and cruciform DNA-binding protein showed similar overall patterns, but cytochrome P450II showed a delayed initial increase in transcription in the gill tissue. Further and more detailed studies of the expression of these genes in the separate tissues are needed to understand the potentially different regulatory mechanisms acting in each tissue.
The use of a statistical non-linear regression approach to model the gene expression profiles over an extended period offers the opportunity to compare the shapes of the response curves between genes. The critical exponential curve, selected by the observed response shapes and goodness of fit, can be explained in terms of the combination of two processes, in this case RNA synthesis and degradation. A more complex model for such a situation would be the double exponential curve, which is a natural function for a two-timescale process. The critical exponential is the degenerate case of this model and occurs when two processes in a system have the same timescale. In this case, however, there was no evidence for choosing the more complex model. Choice of an appropriate model can be important, but many common non-linear models are based on functional forms derived from observed biological processes. Increased gene transcription is responsible for the initial rapid increase in transcript levels between days 0 to 2. Transcript levels continue to rise until a maximum is reached, followed by a decline towards a steady level, possibly as a consequence of a balance between transcription and degradation (transcript turnover).
The critical exponential model was successful in identifying genes with quantitatively similar patterns of response, which could not have been predicted from their putative protein functions. This approach, therefore, offers a new method by which a large number of genes could be classified according to their initial transcription regulation and subsequent turnover.
Our approach of first modelling and then clustering allows more genes to be considered and potentially gives greater insight into the system. The application of this approach to a microarray dataset allows the screening of genes to identify responses that can be described by a particular mathematical function. Thus each gene profile is reduced from noisy observations to a smaller set of biologically-interpretable parameters. In the analysis of the microarray datasets, different shapes of profiles fitted by the exponential or critical exponential functions can be identified (Figures 7 and 8), allowing the grouping of genes based on the parameter values. Eliminating the inherent variability in the data through the regression modelling approach allows a more precise comparison of gene profiles and thus improved clustering. For example, the 9% of R. norvegicus genes identified as following the critical exponential curve at the p < 0.001 level represents approximately 720 genes, compared with the 200 genes identified and modelled by Jin et al. (2003) . The groupings of genes generated by the improved clustering then suggest hypotheses of regulatory association between genes. For example, the aim of the E. coli microarray study was to identify those genes co-regulated with the main regulatory gene, soxS, which was demonstrated to have an exponential-type response following the application of paraquat . By using an initial regression analysis to identify the subset of genes that can be described by an exponential function, subsequent cluster analyses can focus on this subset of genes with similar, but not identical (see Figure 7), shapes of expression profiles. The fitted exponential function parameters for this gene subset could then be used to better identify those genes most closely co-regulated with soxS.
Whilst our study demonstrates the application of a regression modelling approach to describe gene expression profiles, this approach can be expanded to fully exploit microarray datasets. Application of a wider range of functional forms (for example including functions with similar shapes but also some temporal variation or time delays) offers the potential to develop regulatory networks based on relationships between the shapes of expression profiles as captured by the fitted parameters. Further interpretation of the parameters, alongside knowledge of gene function, might allow the identification of the stimuli driving the observed gene expression responses.
Compared with standard clustering for gene profiles of microarray data, the statistical regression modelling of mathematical functions to describe these profiles eliminates the inherent variability in the data and allows the direct comparison of profile shapes.
This study has illustrated how the use of standard statistical modelling approaches (analysis of variances (ANOVA), linear regression modelling, non-linear regression modelling) commonly used in plant, microbial and ecological sciences, can be used to aid and extend the interpretation of gene expression profiles obtained from qRT-PCR. These approaches have been applied to model profiles of larger numbers of genes obtained from expression microarray studies, and could be further applied to other high-throughput "-omic" technologies. A wide range of statistical approaches have been specifically developed to analyse the vast quantities of data generated in microarray-based studies , assessing both similarities and differences between genes and between treatments. Similarly a number of approaches have been developed to generate mathematical models for assumed networks of gene pathways (based on simple mathematical assumptions) [33, 34]. However, there appears to have been little statistical consideration for the detailed modelling of individual gene expression profiles. The statistical regression modelling approaches applied in this study allow the estimation of parameters which succinctly describe the shapes of gene expression profiles. These parameter estimates (or combinations of them) can then be related directly to the processes stimulating and driving the expression of these genes. Comparison of the parameters and expression profiles for a set of genes could then indicate that a sub-set of these genes are co-regulated, with the potential to hypothesise a common regulatory mechanism. Hence, consideration of a wide range of non-linear regression models could provide building blocks for the development of more biologically realistic models of gene expression profiles.
Agaricus bisporus strain A15 (Sylvan, UK) was used throughout the study. Mushrooms were grown on composted wheat straw according to commercial practice at the Warwick HRI BioConversion Unit. Mushrooms were harvested at morphogenetic stage 2  and were either frozen immediately under liquid nitrogen, termed time 0, or stored for a specified period in a controlled environment, 18°C and 95–95% relative humidity, before freezing under liquid nitrogen. Stored mushrooms were sampled for gene expression profiling over i) 0 to 24 hour time course post-harvest (three hourly intervals), ii) 0 to 5 day time course following harvest (24 hourly intervals), and iii) 0 to 48 hours post-harvest (24 hour intervals), with mushrooms dissected into stipe, cap and gill tissues. Three replicate mushrooms were taken for each sampling point and frozen samples were stored at -80°C.
RNA was isolated from mushroom tissues according to established phenol/chloroform extraction protocols . Absorbance measurements at 260 nm and 280 nm were used to assess RNA concentration and purity. RNA integrity was determined with formaldehyde agarose gel electrophoresis . For experiments involving reverse transcriptase, RNA samples were treated with RQ1 RNAse-free DNAse enzyme (Promega, Southampton, UK) according to the manufacturer's instructions.
Quantitative RT-PCR (qRT-PCR)
Transcript levels were determined using the ABI Prism 7900 HT sequence detector (TaqMan™) and SYBR® Green fluorescent reporter dye. Reverse transcription was carried out using the Thermoscript™ RT-PCR system (Invitrogen, Life Technologies, Paisley, UK) in 20 μl volumes containing 50 ng μl⁻¹ random hexamers, 1 μg total RNA, 1 μl Thermoscript reverse transcriptase (15 U μl⁻¹), 4 μl 5× Thermoscript™ buffer, 1 μl 0.1 M DTT and 1 μl RNaseOUT™ (40 U μl⁻¹). Reactions were carried out at 25°C for 10 minutes, followed by 50 minutes at 50°C and terminated at 85°C for 5 minutes. Each cDNA sample was treated with RNAse H according to the manufacturer's instructions and diluted to 100 μl final volume. cDNA samples were taken from three replicate mushrooms per time point.
Oligonucleotides used in the qPCR for the selected Agaricus bisporus genes
Cruciform DNA-binding protein
Reverse: 5'- CAGCGATTTGGTCCGTCATA-3'
Reverse: 5'- GCGCAGGCTTGATATCGAA-3'
β (1–6) Glucan synthase
Reverse: 5'- TGCGCAAACAACCTATTCC-3'
Reverse: 5'- AGCGATAGTTGCTGCTGAAGAA-3'
Riboflavin aldehyde-forming enzyme
Reverse: 5'- TGACTTTCACGTATTTGCTTTGT-3'
Reverse: 5'- GACGCTGACAGTCCCTCTAAGAA-3'
Data analysis utilised the ABI PRISM sequence detector® software (SDS) (version 2.0) to determine the cycle threshold of each sample (Ct-Target) which was normalised to the cycle threshold of the 18S rRNA qRT-PCR product (Ct-Control) for the same sample [37, 38]. The ΔCt equation (ΔCt = 2^(Ct-Control − Ct-Target)) was used to calculate the amount of each target transcript relative to the amount of 18S rRNA.
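As a minimal illustration of this normalisation, the sketch below computes the relative transcript level from a pair of cycle thresholds. The function name and the example Ct values are hypothetical. The base of 2 assumes that the amount of product doubles in every cycle, that is, a PCR efficiency close to 100%.

```python
def relative_transcript_level(ct_target, ct_control):
    """Amount of target transcript relative to the 18S rRNA control.

    Implements 2 ** (Ct_control - Ct_target); the base 2 assumes the
    product doubles each cycle (PCR efficiency close to 100%).
    """
    return 2.0 ** (ct_control - ct_target)

# Hypothetical cycle thresholds: the target crosses the threshold
# 10 cycles later than the 18S rRNA control.
level = relative_transcript_level(ct_target=25.0, ct_control=15.0)
```

A target that reaches the threshold later than the control (larger Ct) gives a value below 1, so each additional cycle of delay halves the estimated relative amount.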
The control treatments were (a) water control using sterilised diethylpyrocarbonate (DEPC)-treated water in the place of the cDNA sample to detect environmental DNA contamination and primer-based artefacts (b) DNAse-treated RNA to assess for contaminating DNA in the RNA samples, and (c) the absence of primers during the reverse transcription step, but present during PCR, to detect contaminating DNA and the possibility of active reverse transcriptase present during PCR.
Northern hybridisation analysis
Total RNA, ~10 μg, from each sample was separated by formaldehyde agarose gel electrophoresis and immobilised onto nylon membranes as per established protocols . Hybridisation was carried out using randomly primed [α-32P]dCTP probes and post-hybridisation washes carried out using established protocols . To produce the probes, phagemid clones containing the cDNAs were restricted with HindIII and BamHI and fragments separated by agarose gel electrophoresis, excised and purified using the Qiagen gel purification protocol. Purified fragments were used as templates for random priming incorporating [α-32P]dCTP (Redi prime kit, Amersham Pharmacia Biotech., Buckinghamshire, UK). Agaricus bisporus 28S rRNA gene was used as a loading control as described previously [12, 39]. Hybridisation intensity of each gene-specific probe used in Northern analysis was determined using scanning densitometry (Personal Densitometer SI, Molecular Dynamics, CA, USA). Transcript levels for each gene were calculated relative to the hybridisation intensity recorded for the 28S rRNA gene probe for each sample tested. Northern analysis was performed on total RNA from two replicate mushrooms per time sample or time × tissue sample for transcripts of each gene examined.
Comparisons of the transcript levels, as determined by Northern analysis scanning densitometry and qRT-PCR, were made for each gene by calculating correlation coefficients, and by fitting linear and exponential regression responses to explain the Northern analysis measurements in terms of those from the qRT-PCR. Within each experiment, two replicate Northern analysis measurements were paired with the qRT-PCR values obtained from the same replicate mushroom RNA extracts (note that for one replicate of each sampling point within each experiment no Northern analysis measurement was obtained).
Quantitative RT-PCR data of transcription levels for each gene were analysed using analysis of variance (ANOVA) for each experiment (0–24 h, 0–5d and 0–48 h between different tissues) separately. Three replicate mushrooms were assayed at each time or for each tissue-by-time combination. Prior to analysis, the data were subjected to a logarithm (base 10) transformation to satisfy the ANOVA assumption of homogeneity of variance. The significance of the overall treatment effects (time only in two experiments, time, tissue and the interaction between these factors in the third) was assessed using an F-test, and the significance of differences between individual treatment means was assessed by comparison with appropriate standard errors of differences (SEDs). Treatment differences noted in the text are significant at the 5% level unless stated otherwise.
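The time-only analysis can be sketched as a one-way ANOVA on log10-transformed values. The code below is a simplified illustration rather than the analysis actually run: the function name and the data are hypothetical, only the F statistic is computed, and the two-factor (time by tissue) analysis and the standard errors of differences are not covered.

```python
import math

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA; groups is a list of lists of values."""
    k = len(groups)                      # number of treatment levels (times)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: variation of group means about the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: variation of replicates about their group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical relative transcript levels, three replicates per sampling time (h).
raw = {0: [1.0, 10.0, 100.0], 24: [100.0, 1000.0, 10000.0]}
# The log10 transformation is applied before analysis to stabilise the variance.
log_groups = [[math.log10(v) for v in reps] for reps in raw.values()]
f_stat = one_way_anova_f(log_groups)
```

The F statistic would then be referred to an F distribution with (k − 1, n − k) degrees of freedom to obtain a significance level.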
For the qRT-PCR data only, regression analyses were used to model the gene expression changes over time. 'Split-line' or 'broken-stick' regression analysis of transcription levels from 0–24 h was applied to estimate the time when the up-regulation of each gene commenced. The 'broken-stick' model consists of two linear regression segments fitted to distinct subsets of the data, with separate estimates of slope and intercept for each segment. In this case the first line segment was constrained to have a slope of zero. A sequence of models was fitted to the data for each gene, splitting the data set into two parts (time ≤ x hours: time > x hours, for each of the observed values of x). The best model for each gene was chosen as the one with the minimum sum of residual sums of squares for the two regressions. The time point where the two lines crossed was postulated as the time when increased gene transcription began.
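The search described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the software used in the study: the function names and the data are hypothetical, the first segment is fitted as a mean (a zero-slope line), and splits that leave fewer than two distinct times for the sloped segment are skipped because its slope could not be estimated.

```python
def fit_line(ts, ys):
    """Ordinary least-squares line; returns (intercept, slope, rss)."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    rss = sum((y - intercept - slope * t) ** 2 for t, y in zip(ts, ys))
    return intercept, slope, rss

def broken_stick(ts, ys):
    """Try every observed split time x: flat line for t <= x, sloped line for t > x.

    Returns (breakpoint, baseline, intercept, slope, total_rss) for the split
    with the smallest combined residual sum of squares.
    """
    best = None
    for x in sorted(set(ts)):
        first = [y for t, y in zip(ts, ys) if t <= x]
        second = [(t, y) for t, y in zip(ts, ys) if t > x]
        # The sloped segment needs at least two distinct times to be estimable.
        if len({t for t, _ in second}) < 2:
            continue
        baseline = sum(first) / len(first)          # zero-slope segment
        rss_flat = sum((y - baseline) ** 2 for y in first)
        intercept, slope, rss_line = fit_line([t for t, _ in second],
                                              [y for _, y in second])
        if slope == 0:
            continue                                # parallel lines never cross
        total = rss_flat + rss_line
        if best is None or total < best[4]:
            # Time at which the flat and sloped lines cross.
            breakpoint_time = (baseline - intercept) / slope
            best = (breakpoint_time, baseline, intercept, slope, total)
    return best

# Hypothetical noise-free profile: flat until 13.5 h, then rising at 0.5 per hour.
times = [0, 3, 6, 9, 12, 15, 18, 21, 24]
levels = [1.0, 1.0, 1.0, 1.0, 1.0, 1.75, 3.25, 4.75, 6.25]
breakpoint_time, baseline, intercept, slope, total_rss = broken_stick(times, levels)
```

Because the estimate is the crossing point of the two fitted lines, it can fall between sampling times, which is how a value such as 13.5 h can arise from data collected at 3-hour intervals.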
The long-term gene transcript profiles (0–5d) were modelled using the critical exponential curve (Equation 1), fitted to the log10-transformed data.
y(t) = A + (B + Ct)R^t + ε (Equation 1)
where A, B, C and R are parameters, y is the gene expression response (log10 transformed), t is storage time, and ε represents the errors, assumed to follow a Normal distribution with mean zero and a constant variance. This form of curve was selected following an initial graphing of the responses, as it can be used to describe a rapidly increasing phase followed by a decline or plateau, and after assessment of how well it fitted the observed data compared with both the simpler exponential model and the more complex double exponential model. The parameters of this non-linear response can be interpreted in terms of a postulated mechanism driving the observed gene expression responses, in this case potentially quantifying the relationship between transcript synthesis and degradation. The fitted parameters, and hence the shapes of the fitted curves, were compared between genes using a parallel curves analysis, either constraining each parameter to be the same across all five genes, or allowing variation in the values taken by each parameter between genes. This analysis provides a basis for comparing a sequence of possible models, and assessment of the change in residual variance between models allows the most appropriate model for the observed data to be determined.
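One simple way to fit Equation 1 to a single profile is sketched below. It is an assumption made for illustration, not the fitting method reported here: for any fixed value of R the curve is linear in A, B and C, so R is searched over a grid and the remaining parameters are obtained by ordinary least squares at each grid value. The function names, the grid and the data are hypothetical, and the parallel curves comparison between genes is not covered.

```python
def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_critical_exponential(ts, ys, r_grid):
    """Fit y = A + (B + C*t) * R**t by profiling R over r_grid.

    For each candidate R the model is linear in A, B and C, so those three
    are estimated from the normal equations. Returns (A, B, C, R, rss)
    for the grid value with the smallest residual sum of squares.
    """
    best = None
    for R in r_grid:
        X = [[1.0, R ** t, t * R ** t] for t in ts]   # design matrix rows
        XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)]
               for i in range(3)]
        Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(3)]
        A, B, C = solve_linear(XtX, Xty)
        rss = sum((y - (A + (B + C * t) * R ** t)) ** 2 for t, y in zip(ts, ys))
        if best is None or rss < best[4]:
            best = (A, B, C, R, rss)
    return best

# Hypothetical noise-free log10 responses at days 0 to 5, generated with
# A = 1, B = -1, C = 2 and R = 0.5 (illustrative values only).
ts = [0, 1, 2, 3, 4, 5]
ys = [1.0 + (-1.0 + 2.0 * t) * 0.5 ** t for t in ts]
A, B, C, R, rss = fit_critical_exponential(ts, ys, [i / 100 for i in range(5, 96)])
```

The grid resolution limits the precision of R in this sketch; a general-purpose non-linear least-squares routine would estimate all four parameters jointly and also provide standard errors.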
Application of regression modelling to microarray datasets
The regression modelling approach was applied to previously published, publicly available microarray datasets from two different organisms, E. coli and R. norvegicus [21, 22]. The datasets were selected because each had an appropriate time course, with a fixed time point at which a treatment was applied to generate an expression response, and because there was evidence that gene profiles could be described by a standard response function. In the E. coli study [21], the master regulatory gene soxS demonstrated an exponential-type response to the application of paraquat. To identify all genes with a similar exponential shape of response, an exponential function was fitted using the regression modelling approach to all gene expression profiles from this microarray study, and the significance of each fit was determined. Similarly, a number of genes from R. norvegicus liver tissue treated with corticosteroid displayed profiles that appeared to follow the same critical exponential curve as was fitted to the A. bisporus data (Equation 1) [22]. The critical exponential function was therefore fitted to all genes in this dataset to determine the proportion of gene profiles that were adequately described by this function.
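Screening every feature on an array for a given curve shape can be sketched as a loop over profiles. The example below makes several assumptions that are not taken from the text: the exponential function is written as y = A + B·R^t with 0 < R < 1, R is searched over a coarse grid, and the significance of each fit is assessed with a permutation test in place of whatever test was actually applied. The function names, the two profiles and the permutation settings are invented for illustration, and no correction for multiple testing is made.

```python
import random


def exp_fit_rss(times, values, r_grid):
    """Smallest residual sum of squares for y = A + B*R**t over a grid of R."""
    n = len(times)
    y_bar = sum(values) / n
    syy = sum((y - y_bar) ** 2 for y in values)
    best = float("inf")
    for r in r_grid:
        x = [r ** t for t in times]
        x_bar = sum(x) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, values))
        # RSS of the straight-line regression of y on R**t (intercept A, slope B).
        best = min(best, syy - sxy ** 2 / sxx)
    return best


def screen_profiles(times, profiles, alpha=0.05, n_perm=999, seed=0):
    """Return {feature: p} for features whose exponential fit has p < alpha.

    p is a permutation p-value: the share of time-shuffled profiles that
    fit at least as well as the observed profile.
    """
    rng = random.Random(seed)
    r_grid = [i / 100 for i in range(5, 100, 5)]  # assumed grid for R
    hits = {}
    for name, values in profiles.items():
        observed = exp_fit_rss(times, values, r_grid)
        shuffled = list(values)
        as_good = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)
            if exp_fit_rss(times, shuffled, r_grid) <= observed:
                as_good += 1
        p = (1 + as_good) / (1 + n_perm)
        if p < alpha:
            hits[name] = p
    return hits


# Invented profiles: one saturating exponential rise and one trendless zigzag.
times = list(range(10))
profiles = {
    "responder": [2.0 - 2.0 * 0.6 ** t for t in times],
    "zigzag": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
}
hits = screen_profiles(times, profiles)
print(sorted(hits), len(hits) / len(profiles))
```

The final ratio is the proportion of features adequately described by the curve, which is the quantity reported for each dataset. The same loop would apply to the critical exponential function by swapping the fitting routine.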
Funding was provided by the UK Government Department for Environment, Food and Rural Affairs (DEFRA), project HH2116SMU.
- Dopazo J, Zanders E, Dragoni I, Amphlett G, Falciani F: Methods and approaches in the analysis of gene expression data. J Immunol Methods 2001, 250(1–2):93-112. 10.1016/S0022-1759(01)00307-6.
- Tamames J, Clark D, Herrero J, Dopazo J, Blaschke C, Fernandez JM, Oliveros JC, Valencia A: Bioinformatics methods for the analysis of expression arrays: data clustering and information extraction. J Biotechnol 2002, 98: 269-283. 10.1016/S0168-1656(02)00137-2.
- Reimers M: Statistical analysis of microarray data. Addict Biol 2005, 10: 23-35. 10.1080/13556210412331327795.
- Tai YC, Speed TP: A multivariate empirical Bayes statistic for replicated microarray time course data. Ann Stat 2006, 34(6):2387-2412. 10.1214/009053606000000759.
- Manfield IW, Jen CH, Pinney JW, Michalopoulos I, Bradford JR, Gilmartin PM, Westhead DR: Arabidopsis Co-expression Tool (ACT): web server tools for microarray-based gene expression analysis. Nucleic Acids Res 2006, 34: 504-509. 10.1093/nar/gkl204.
- Persson S, Wei H, Milne J, Page GR, Somerville CR: Identification of genes required for cellulose synthesis by regression analysis of public microarray data sets. Proc Natl Acad Sci USA 2005, 102(24):8633-8638. 10.1073/pnas.0503392102.
- Conesa A, Nueda MJ, Ferrer A, Talón M: maSigPro: A method to identify significantly differential expression profiles in time-course microarray experiments. Bioinformatics 2006, 22(9):1096-1102. 10.1093/bioinformatics/btl056.
- Heard NA, Holmes CC, Stephens DA: A quantitative study of gene regulation involved in the immune response of Anopheline mosquitoes: An application of Bayesian hierarchical clustering of curves. J Am Stat Assoc 2006, 101(473):18-29. 10.1198/016214505000000187.
- Khanin R, Vinciotti V, Mersinias V, Smith CP, Wit P: Statistical reconstruction of transcription factor activity using Michaelis-Menten kinetics. Biometrics 2007, 63(3):816-823. 10.1111/j.1541-0420.2007.00757.x.
- Nachman I, Regev A, Friedman N: Inferring quantitative models of regulatory networks from expression data. Bioinformatics 2004, 20(suppl 1):i248-i256. 10.1093/bioinformatics/bth941.
- Liss B: Improved quantitative real-time RT-PCR for expression profiling of individual cells. Nucleic Acids Res 2002, 30(17):89-98. 10.1093/nar/gnf088.
- Eastwood DC, Kingsnorth CS, Jones HJ, Burton KS: Genes with increased transcript levels following harvest of the sporophore of Agaricus bisporus have multiple physiological roles. Mycol Res 2001, 105(10):1223-123. 10.1016/S0953-7562(08)61993-0.
- Kingsnorth CS, Eastwood DC, Burton KS: Cloning and post-harvest expression of serine proteinase transcripts in the cultivated mushroom Agaricus bisporus. Fungal Gen Biol 2001, 32: 135-144. 10.1006/fgbi.2001.1257.
- Wagemaker MJM, Eastwood DC, Wellboren W, Burton KS, Drift C, Jetten MSM, Van Griensven LJLD, Op Den Camp HJM: Argininosuccinate synthetase and argininosuccinate lyase: two ornithine cycle enzymes from Agaricus bisporus. Mycol Res 2007, 111: 493-502. 10.1016/j.mycres.2007.01.016.
- Miyazaki Y, Nakamura M, Babasaki K: Molecular cloning of developmentally specific genes by representational difference analysis during fruiting body formation in the basidiomycete Lentinula edodes. Fungal Gen Biol 2005, 42: 493-505. 10.1016/j.fgb.2005.03.003.
- Lee S-H, Kim B-G, Kim K-J, Lee J-S, Yun D-W, Hahn J-H, Kim G-H, Lee K-H, Suh D-S, Kwon S-T, Lee C-S, Yoo Y-B: Comparative analysis of sequences expressed during liquid-cultured mycelia and fruit body stages of Pleurotus ostreatus. Fungal Gen Biol 2002, 35: 115-134. 10.1006/fgbi.2001.1310.
- Yamada M, Sakuraba S, Shibata K, Taguchi G, Inatomi S, Okazaki M, Shimosaka M: Isolation and analysis of genes specifically expressed during fruiting body development in the basidiomycete Flammulina velutipes by fluorescence differential display. FEMS Microbiol Lett 2006, 254: 165-172. 10.1111/j.1574-6968.2005.00023.x.
- Kues U: Life history and developmental processes in the basidiomycete Coprinus cinereus. Microbiol Mol Biol Rev 2000, 64(2):316-353. 10.1128/MMBR.64.2.316-353.2000.
- Kamada T: Molecular genetics of sexual development in the mushroom Coprinus cinereus. BioEssays 2002, 24: 449-459. 10.1002/bies.10083.
- Sreenivasaprasad S, Eastwood DC, Browning N, Lewis SMJ, Burton KS: Differential expression of a putative riboflavin-aldehyde-forming enzyme (raf) gene during development and post-harvest storage and in different tissue of the sporophore in Agaricus bisporus. Appl Microbiol Biotechnol 2006, 70: 470-476. 10.1007/s00253-005-0084-9.
- Blanchard JL, Wholey W-Y, Conlon EM, Pomposiello PJ: Rapid changes in gene expression dynamics in response to superoxide reveal SoxRS-dependent and independent transcriptional networks. PLoS ONE 2007, 2(11):e1186. 10.1371/journal.pone.0001186.
- Jin JY, Almon RR, Dubois DC, Jusko J: Modelling of corticosteroid pharmacogenomics in rat liver using gene microarrays. J Pharmacol Exp Ther 2003, 307: 93-109. 10.1124/jpet.103.053256.
- Bascompte J, Rodriguez MA: Habitat patchiness and plant species richness. Ecol Lett 2001, 4: 417-420. 10.1046/j.1461-0248.2001.00242.x.
- Carroll JE, Wilcox WF: Effects of humidity on the development of grapevine powdery mildew. Phytopathology 2003, 93: 1137-1144. 10.1094/PHYTO.2003.93.9.1137.
- Price RI, Walters MJ, Retallack RW, Henderson NK, Kerr D, Henzell S, Dhaliwal S, Prince RL: Impact of the analysis of a bone density reference range on determination of the T-score. J Clin Densitom 2003, 6(1):51-62. 10.1385/JCD:6:1:51.
- Ikner A, Shiozaki K: Yeast signalling pathways in oxidative stress response. Mutat Res-Fund Mol Mech Mutagen 2005, 569(1–2):13-27. 10.1016/j.mrfmmm.2004.09.006.
- Umar MH, van Griensven LJLD: Morphological studies on the life span, development stages, senescence and death of fruiting bodies of Agaricus bisporus. Mycol Res 1997, 101: 1409-1422. 10.1017/S0953756297005212.
- Braaksma A, van Doorn AA, Kieft H, van Aelist AC: Morphometric analysis of ageing mushrooms (Agaricus bisporus) during post-harvest development. Postharvest Biol Technol 1998, 13: 71-79. 10.1016/S0925-5214(97)00069-0.
- Ge H, Roeder RG: The high-mobility group protein HMG1 can reversibly inhibit class-II gene-transcription by interaction with the TATA-binding protein. J Biol Chem 1994, 269(25):17136-17140.
- Stros M, Ozaki T, Bacikova A, Kageyama H, Nakagawara A: HMGB1 and HMGB2 cell-specifically down-regulate the p53- and p73-dependent sequence-specific transactivation from the human Bax gene promoter. J Biol Chem 2002, 277(9):7157-7164. 10.1074/jbc.M110233200.
- Hammond JBW, Nichols R: Changes in respiration and carbohydrates during the post-harvest storage of mushrooms (Agaricus bisporus). J Sci Food Agric 1975, 26: 835-842. 10.1002/jsfa.2740260615.
- Stekel D: Microarray Bioinformatics. Cambridge University Press, Cambridge, UK; 2003.
- Rangel C, Angus J, Ghahramani Z, Lioumi M, Sotheran E, Gaiba A, Wild DL, Falciani F: Modelling T-cell activation using gene expression profiling and state-space models. Bioinformatics 2004, 20(9):1361-1372. 10.1093/bioinformatics/bth093.
- Beal MJ, Falciani F, Ghahramani Z, Rangel C, Wild DL: A Bayesian approach to reconstructing genetic regulatory networks with hidden factors. Bioinformatics 2005, 21(3):349-356. 10.1093/bioinformatics/bti014.
- Sambrook J, Russell DW: Molecular cloning: a laboratory manual. 3rd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; 2001.
- Rosen KM, Villa-Komaroff L: An alternative method for the visualization of RNA in formaldehyde agarose gels. Focus 1993, 12(2):23-24.
- Goidin D, Mamessier A, Statquet M.-J, Schmitt D, Berthier-Vergnes O: Ribosomal 18S RNA prevails over Glyceraldehyde-3-phosphate dehydrogenase and β-actin genes as internal standards for quantitative comparison of mRNA levels in invasive and noninvasive human melanoma cell subpopulations. Anal Biochem 2000, 295(1):17-21. 10.1006/abio.2001.5171.
- Lekanne Deprez RH, Fijnvandraat AC, Ruijter JM, Moorman AFM: Sensitivity and accuracy of quantitative real-time polymerase chain reaction using SYBR green I depends on cDNA synthesis conditions. Anal Biochem 2002, 307(1):63-69. 10.1016/S0003-2697(02)00021-0.
- de Groot PWJ, Schaap PJ, van Griensven LJLD, Visser J: Isolation of developmentally regulated genes from the edible mushroom Agaricus bisporus. Microbiology 1997, 143: 1993-2001.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.