- Research article
- Open Access
Flexible promoter architecture requirements for coactivator recruitment
BMC Molecular Biology volume 7, Article number: 16 (2006)
The spatial organization of transcription factor binding sites in regulatory DNA, and the composition of intersite sequences, influences the assembly of the multiprotein complexes that regulate RNA polymerase recruitment and thereby affects transcription. We have developed a genetic approach to investigate how reporter gene transcription is affected by varying the spacing between transcription factor binding sites. We characterized the components of promoter architecture that govern the yeast transcription factors Cbf1 and Met31/32, which bind independently, but collaboratively recruit the coactivator Met4.
A Cbf1 binding site was required upstream of a Met31/32 binding site for full reporter gene expression. Distance constraints on coactivator recruitment were more flexible than those for cooperatively binding transcription factors. Distances from 18 to 50 bp between binding sites support efficient recruitment of Met4, with only slight modulation by helical phasing. Intriguingly, we found that certain sequences located between the binding sites abolished gene expression.
These results yield insight into the influence of both binding site architecture and local DNA flexibility on gene expression, and can be used to refine computational predictions of gene expression from promoter sequences. In addition, our approach can be applied to survey promoter architecture requirements for arbitrary combinations of transcription factor binding sites.
In most eukaryotes, the sequences that regulate transcription integrate multiple signals, through the binding of different transcription factors, to modulate levels of gene expression. When bound to DNA, transcription factors anchor the assembly of multiprotein complexes that influence the recruitment of RNA polymerase. Efficient assembly depends on optimally spaced protein-protein interactions among transcription factors and auxiliary proteins [1–4]. Since transcription factors recognize specific sites on DNA, the distance between these binding sites can influence how transcription factors interact with each other and other proteins. For example, overlapping sites may prevent two transcription factors from binding simultaneously, while sites too distant from each other may hinder bound transcription factors from recruiting necessary cofactors. Furthermore, some distantly spaced sites can only properly interact when the DNA between them is looped, a process influenced by the composition of the looped DNA.
Computational approaches take into account the multifactorial nature of transcriptional regulation when discovering transcription factor binding sequence motifs. Several methods search for pairs of enriched sequences, while allowing either fixed or variable spacing between them [5–8]. Other approaches start with binding specificities for pairs of known transcription factors, then assess whether the predicted occurrences of their binding sites are closely spaced [9–11]. Notably, most of these methods assess the spacing between binding sites with arbitrary criteria, such as minimum and maximum distance thresholds between binding sites. However, the biological relevance and suitability of these thresholds has seldom been verified experimentally.
Several mechanisms that govern transcription factor interactions have been previously described. Transcription factors may bind cooperatively to adjacent sites in DNA, thus increasing the stability of the ternary DNA-protein complex. Since this effect is mediated by direct protein-protein interactions, sites for cooperatively binding transcription factors are usually spaced within 20 bp (e.g., [12–15]). Slight alterations in spacing between the binding sites can drastically reduce gene expression unless helical phasing is preserved. Computational analyses suggest that proper helical phasing between predicted binding sites may be a common property of transcriptional control regions [16, 17].
Alternatively, transcription factors may bind to DNA independently and cooperatively recruit a coactivator protein. Co-recruitment of such activators is analogous to an "AND gate" in logic. Coincident binding of two proteins increases the fidelity and specificity of signal detection [2, 18, 19]. The network of transcription factors that regulates sulfur gene derepression in yeast provides a model system to dissect the promoter architecture requirements for coactivator recruitment. Among these transcription factors, only the coactivator Met4 contains a strong activation domain. However, Met4 does not bind to DNA directly, but is recruited in sulfur limitation conditions by Cbf1 and Met28 to the MET16 promoter, as well as by Met28 and Met31/32 on regions from the MET3 and MET28 promoters [20, 21]. In addition, yeast two-hybrid assays with Met4 truncation mutants revealed distinct regions that mediate interaction with Cbf1 and Met31/32 [21, 22]. Taken together, these studies suggest a model by which the coactivator Met4 is coordinately recruited by the transcription factors Cbf1, Met28 and Met31/32 to the promoters of sulfur utilization genes (reviewed in ). Nevertheless, the effects of distance constraints and sequence context between Cbf1 and Met31/32 binding sites have not been characterized.
We use the term promoter architecture to refer to distance constraints and sequence context effects that govern interactions among transcription factor binding sites. Our goal is to understand how promoter architectures differ for transcription factors that participate in coactivator recruitment, versus those that bind cooperatively. In this work, we developed a synthetic promoter assay to characterize how various distances between Cbf1 and Met31/32 binding sites influenced gene expression in response to methionine starvation. The relative order of binding sites affected reporter gene expression. We discovered that distance constraints on coactivator recruitment were more flexible than those for cooperatively binding transcription factors. Distances from 18 to 50 bp between binding sites could support efficient recruitment of Met4, with only slight modulation by helical phasing. Intriguingly, we found that certain sequence contexts between the binding sites abolished gene expression. Finally, we noted that the probability of coactivator recruitment could be affected by the bendability of the spacer sequence between transcription factor binding sites.
Annotated promoters of sulfur-regulated genes contain closely spaced binding sites
We first surveyed the promoter architectures of 19 annotated genes in S. cerevisiae that were co-expressed under sulfur-limiting conditions. All of these promoters contained Cbf1 and Met31/32 binding sites whose sequences were identical among at least 3 of 4 closely related yeast species (Figure 1). We assessed binding site conservation based on identity within a multiple sequence alignment, but not the distances between the sites. Due to the small sample size, we did not distinguish between the possible relative orientations of the binding sites. Notably, each sulfur-regulated promoter included a Cbf1 binding site upstream of a Met31/32 binding site. These conserved binding sites could occur in either forward or reverse orientation.
In each promoter, we searched for the Cbf1 binding site upstream of the Met31/32 binding site that yielded the smallest distance between them in S. cerevisiae. A histogram of these smallest distances showed a peak between 10 and 30 bp (Figure 2A). This peak suggested that an optimal distance between the transcription factors was necessary for efficient Met4 recruitment. When we investigated whether the distances between the closest pairs of binding sites were helically phased, we could not detect an enrichment of distances on a particular face of the DNA (Figure 2B). Finally, the vast majority of annotated promoters contained Met31/32 binding sites within 100 to 350 bp upstream of the translation start site (Figure 2C).
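The closest-pair search and the phasing check described above can be illustrated with a short sketch. It is not the authors' code. It assumes that site positions are given as site centers in coordinates that increase toward the coding region, and it assumes a helical repeat of 10.5 bp per turn, a value that is not stated in the text. The function names are hypothetical.

```python
# Assumed B-DNA helical repeat (bp per turn); not specified in the article.
HELICAL_REPEAT_BP = 10.5


def closest_upstream_pair(cbf1_centers, met_centers):
    """Smallest distance from a Cbf1 site to a Met31/32 site downstream of it.

    Coordinates increase toward the coding region, so "upstream" means a
    smaller coordinate. Returns None if no Cbf1 site lies upstream of any
    Met31/32 site.
    """
    distances = [m - c for c in cbf1_centers for m in met_centers if c < m]
    return min(distances) if distances else None


def helical_phase(distance_bp, repeat=HELICAL_REPEAT_BP):
    """Fraction of a helical turn (0 up to, but not including, 1) that
    remains after removing whole turns from the distance."""
    return (distance_bp % repeat) / repeat
```

Binning the phase values for all promoters would give a histogram comparable in spirit to the phasing analysis, under the assumptions stated above.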
Cbf1 binding sites are required upstream of Met31/32 binding sites
A larger collection of sulfur-regulated promoters would provide more statistical power to define key components of promoter architecture. To explore sequence space more diverse than that found in the yeast genome, we developed a synthetic genetic approach to select for sulfur-regulated promoters from a plasmid library. We engineered a single-copy plasmid that fused a minimal promoter upstream of the HIS3 reporter gene (Figure 3). To test their effects on transcription levels, different promoter architectures were embedded in the context of the minimal promoter from the S. cerevisiae MEL1 gene . This promoter was chosen for its low background expression, compared to promoters derived from the S. cerevisiae CYC1 gene. Promoter architectures with combinations of regulatory sequences that supported sufficient expression of the HIS3 reporter gene enabled the parental yeast strain BY4742 to grow in media lacking histidine. In addition, semiquantitative measurements of HIS3 expression can be assayed by titration with 3-amino-1,2,4-triazole (3-AT), a competitive inhibitor of the HIS3 gene product . Faster growth rates in the presence of higher concentrations of 3-AT correspond to higher expression levels of the HIS3 gene.
We sought to define the minimal regulatory information that was sufficient to induce reporter gene expression in response to sulfur limitation. As a negative control, none of the synthetic promoters were able to induce enough HIS3 expression in a repressive concentration of methionine to support growth on a low level (5 mM) of 3-AT (Figure 4A). In addition, neither the minimal promoter alone nor a single Met31/32 binding site could support growth in the absence of methionine with 10 mM 3-AT. A single Cbf1 binding site supported weak growth on 10 mM 3-AT. In the wild-type MET14 promoter, a Cbf1 binding site was found 35 bp upstream from a Met31/32 binding site, as measured by center-to-center distance. Two Cbf1 binding sites placed at the same distance showed moderate HIS3 expression. However, two Met31/32 binding sites were unable to support growth. A promoter with a Cbf1 binding site upstream of a Met31/32 binding site showed the highest level of HIS3 expression. In contrast, a promoter with the Met31/32 binding site found upstream of the Cbf1 binding site was unable to support growth on 10 mM 3-AT.
We compared these results obtained from a minimal promoter system to the average effects of promoter architecture in endogenous yeast genes. For each gene, we estimated the average number of mRNA copies per cell by multiplying the basal transcript levels in rich growth media and the average ratio of gene expression change from published microarray studies of sulfur limitation conditions [27–29]. We then averaged these inferred absolute transcript levels for sets of genes whose promoters shared combinations of Cbf1 or Met31/32 binding sites (Figure 4B). The presence of Cbf1 binding sites or Met31/32 binding sites was associated with a significant increase in inferred transcript levels compared to the rest of the genome. However, two Cbf1 or two Met31/32 binding sites did not contribute to higher transcript levels. Genes whose promoters shared a Cbf1 binding site upstream of a Met31/32 binding site showed significantly higher transcript levels compared to genes with only a single Cbf1 binding site. In contrast, the opposite order of binding sites was not associated with a significant increase in transcript levels. Thus, the constraint on the order of binding sites was consistent with computational predictions of the regulatory effects for various promoter architectures.
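As a rough illustration of the calculation described above, the sketch below multiplies a basal transcript level by an average expression ratio and then averages the result over a gene set. It assumes an arithmetic mean of the ratios across studies; the text does not say which kind of average was used. The function names and data layout are hypothetical.

```python
def inferred_transcripts(basal_copies, fold_changes):
    """Basal mRNA copies per cell multiplied by the mean expression ratio.

    The arithmetic mean is an assumption made for this sketch.
    """
    mean_ratio = sum(fold_changes) / len(fold_changes)
    return basal_copies * mean_ratio


def gene_set_mean(genes, basal, ratios):
    """Average inferred transcript level over a set of genes.

    `basal` maps gene name to basal copies per cell; `ratios` maps gene
    name to a list of expression ratios, one per study.
    """
    levels = [inferred_transcripts(basal[g], ratios[g]) for g in genes]
    return sum(levels) / len(levels)
```

Calling `gene_set_mean` once per promoter architecture class (for example, genes with a Cbf1 site upstream of a Met31/32 site) would give one bar of a comparison like Figure 4B.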
High cooperativity between Cbf1 and Met31/32 binding sites spaced at least 18 bp apart
We predicted that efficient recruitment of Met4 to the promoters of sulfur utilization genes should depend on the spacing between Cbf1 and Met31/32 binding sites. To investigate the effect of varied spacing on reporter gene activation, we constructed a set of promoter libraries that differed by 2-bp increments from 6 bp to 34 bp, as well as 5-bp increments from 40 bp to 50 bp. Each promoter library had a fixed size but degenerate nucleotide sequences between the Cbf1 and Met31/32 binding sites. The binding sites were flanked by 10 bp of sequence from the MET16 promoter of S. bayanus, which lacks an adjacent Gcn4 site.
By pooling hundreds of yeast transformants for each library, we reasoned that the contribution of nucleotide composition on Met4 recruitment and subsequent gene activation would be averaged out. Growth rates for each promoter library thus represent the aggregate effect of a certain distance on reporter gene expression. At several steps in the procedure, we took care to reduce the potential of selecting only the fastest-growing strains from each pool. First, we picked transformants of similar colony size. Second, we amplified these pools overnight in dropout leucine media, which ensured that the plasmids were retained but did not select for levels of promoter expression. Third, we measured growth rates for 7 hours after promoter induction and selection with 3-AT, which corresponded to fewer than 2.5 doubling times. Finally, we verified the complexity of the library pools by isolating single colonies on non-selective growth plates and sequencing of the promoters in twenty different colonies.
Pooled measurements of growth rates in sulfur limiting conditions determined that a minimum distance between Cbf1 and Met31/32 binding sites was required for the highest levels of gene expression (Figure 5). Yeast harboring promoter libraries of varying sizes grew at similar rates in the absence of 3-AT, indicating low levels of leaky transcription from the reporter construct. Expression levels of the HIS3 reporter gene were titrated with the addition of 1 mM 3-AT; similar results were obtained with different concentrations of 3-AT (data not shown). Binding sites whose centers were spaced fewer than 14 bp apart promoted weak reporter gene expression. At these close distances, Cbf1 and Met31/32 may be sterically constrained from assembling a complex with Met4. Reporter gene expression increased sharply as the distance between binding sites was increased from 14 bp to 18 bp. The highest levels of gene expression were observed for promoter libraries with binding sites spaced from 18 bp to 50 bp apart, whereas helical phasing modulated the average growth rate by less than 20%.
Sequence context between binding sites can inhibit gene activation
In addition to characterizing the aggregate effects of binding site spacing, we also examined the effects of different spacer sequences on reporter gene expression. We assayed the growth rates of individual yeast transformants on solid media containing 10 mM or 25 mM 3-AT. Each of the 70 to 72 transformants tested for a certain distance harbored a promoter with a different, random sequence between the Cbf1 and Met31/32 binding sites. We observed reproducible variability in growth rates among transformants with the same distance, but different spacer sequences, between Cbf1 and Met31/32 binding sites (Figure 6).
At each distance surveyed, a certain proportion of intervening sequences was compatible with reporter gene expression. Since the pooled growth rates in liquid media were qualitatively similar over this distance range, we interpret these proportions as the probability that a random intervening sequence would support gene expression at a given distance. At a distance of 12 bp between sites, less than 30% of the sequences supported reporter gene expression. At distances between 16 and 50 bp, the proportion of transformants that showed moderate to high levels of growth on 25 mM 3-AT varied from 38% to 60%. We observed a modest dependence of this proportion on helical phasing in the distance between binding sites.
To investigate what features of spacer sequences correlated with gene activation, we sequenced a sample of 28 promoters with distances of 12 bp, as well as 41 promoters with distances of 20 bp, between the Cbf1 and Met31/32 binding sites (Table 2). Promoters that supported gene expression (positives) were similar in nucleotide composition to promoters that inhibited gene expression (negatives). Since no trimers or tetramers were enriched in the positive or negative promoter sets, additional sequence-specific transcription factors probably did not contribute to gene expression. The most discriminating feature of negative promoters was a shared G or T immediately 5' to the Met31/32 binding site in 15 of 17 examples of distance 12, as well as in all 13 examples of distance 20. However, about half of the positive examples contained a G or T at that position, as expected.
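A minimal sketch of how trimers or tetramers could be tallied in the two promoter sets is shown below. It only counts overlapping k-mers and reports raw count differences. It is not a statistical enrichment test, and the text does not describe which test was applied. The function names are hypothetical.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a list of spacer sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def count_difference(positives, negatives, k):
    """Positive-set count minus negative-set count for each observed k-mer."""
    pos = kmer_counts(positives, k)
    neg = kmer_counts(negatives, k)
    # Counter returns 0 for k-mers that are absent from one of the sets.
    return {kmer: pos[kmer] - neg[kmer] for kmer in set(pos) | set(neg)}
```

With sets of unequal size, the counts would need to be normalized before any comparison.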
We searched for additional residues that could discriminate among sequences that shared a G or T at the most 3' position of the spacer region using WebLogo [30, 31]. We compared sequence logos between the positive and negative promoters to calculate whether any nucleotides were enriched at particular positions in the spacer sequences (Figure 7A). By focusing on the three most informative positions, we derived nucleotide combinations that predicted negative promoters with an overall sensitivity of 80% and a specificity of 89% (Table 3).
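For reference, the sensitivity and specificity of a sequence rule that predicts negative promoters can be computed as in the sketch below. The example rule tests only for a G or T at the most 3' position of the spacer. The three-position combinations of Table 3 are not reproduced here, and the function names are hypothetical.

```python
def ends_in_g_or_t(spacer):
    """Example rule: G or T at the most 3' position of the spacer."""
    return spacer[-1].upper() in ("G", "T")


def evaluate_rule(rule, negatives, positives):
    """Return (sensitivity, specificity) of a rule that flags negatives.

    Sensitivity: fraction of negative promoters that the rule flags.
    Specificity: fraction of positive promoters that the rule does not flag.
    """
    flagged_negatives = sum(1 for s in negatives if rule(s))
    unflagged_positives = sum(1 for s in positives if not rule(s))
    return (flagged_negatives / len(negatives),
            unflagged_positives / len(positives))
```

Here the "negative" promoters are the class being predicted, so they play the role of true positives in the usual definition of sensitivity.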
To test whether the A11-T17 nucleotide combination was sufficient to inhibit gene expression in spacer sequences of length 20, we identified five promoters with a B11-T17 sequence combination and converted the nucleotide at position 11 to an adenine by site-directed mutagenesis. Similar levels of reporter gene expression were driven by the original and mutant promoters, as assayed by serial dilutions on media containing 10 mM or 25 mM 3-AT (Figure 7B). Thus, the effects of sequence context are not encoded by specific positions within the primary nucleotide sequence.
Promoter architecture features of yeast sulfur utilization genes
We have developed a synthetic promoter assay to test how various features of promoter architecture affected HIS3 reporter gene expression in the context of a common minimal promoter. Although this reporter gene assay is indirect, it has been successfully used to obtain semi-quantitative measurements of transcript levels . We applied this system to characterize the collaborative recruitment of the coactivator Met4 by the transcription factors Cbf1 and Met31/32 in response to methionine starvation. We found that the relative order of binding sites was crucial, since a Cbf1 binding site was required upstream of a Met31/32 binding site for full gene expression. The influence of Cbf1 and Met31/32 binding site order on reporter gene expression implies that the spatial orientation of the Met4 activation domain is required for the recruitment of downstream targets. Two Cbf1 binding sites could moderately increase reporter gene expression, yet the mechanism for this enhanced activation is unclear. Synergistic activation of reporter gene expression occurred when Cbf1 and Met31/32 binding sites were spaced at least 18 bp apart. Notably, the allowed distances for coactivator recruitment extend beyond the maximal range for cooperatively binding transcription factors. Finally, we discovered that different sequence contexts between binding sites produced considerable heterogeneity of reporter gene expression, whereas helical phasing showed comparatively little effect.
Although the transcription factors Cbf1 and Met31/32 lack canonical activation domains, they can serve as activators via collaborative recruitment of the coactivator Met4 when they are jointly bound to the promoters of sulfur utilization genes (reviewed in ). Our genome-wide computational survey found that genes with single Cbf1 or Met31/32 binding sites in their promoters were associated with significantly higher transcript levels, on average, when compared to the rest of the genome (Figure 4B). However, the presence of two binding sites for the same transcription factor was not associated with a further increase in transcript levels on average. In contrast, we found that two Cbf1 binding sites separated by 35 bp in a minimal promoter conferred increased reporter gene expression (Figure 4A). This discrepancy could be explained by distinct distance constraints between two binding sites for the same transcription factor.
Our data could not rule out the possibility that Met31/32 may serve as transcriptional repressors when bound to other promoters that lack Cbf1 binding sites. We found that reporter gene expression in methionine starvation was lower for minimal promoters with one or two Met31/32 binding sites, compared to a minimal promoter with a Cbf1 binding site alone (Figure 4A). Several models could explain how the binding of Cbf1 could convert Met31/32 from a repressor to an activator. Since the recruitment of the coactivator Met4 requires interactions with both Cbf1 and Met31/32, Met4 could displace a corepressor that may be constitutively bound to Met31/32. In contrast, the binding of Cbf1 could recruit enzymes that confer posttranslational modifications on Met31/32. These modifications could induce a conformational change that relieves repressive activity of Met31/32. Kinetic analyses of transcription factor binding and subsequent recruitment of multiprotein regulatory complexes by chromatin immunoprecipitation could help distinguish between these models.
Distinct promoter architecture requirements for different transcription factor combinations
The promoter architecture requirements for Met4 coactivator recruitment differ considerably from previously characterized yeast promoters. For instance, the transcription factor Rap1 can efficiently recruit Gcr1/2 only when their binding sites are found 13 or 23 bp apart. Notably, a distance of 18 bp that altered the helical phasing between these factors abolished gene activation. Similarly, helical phasing between Pho2 and Swi5 binding sites modulates cooperative binding by almost three-fold. As an extreme case, the insertion of a single base pair between the a1 and α2 binding sites abolishes cooperative binding. In contrast with the above transcription factor pairs that bind cooperatively, levels of reporter gene activation were fairly consistent when Cbf1 and Met31/32 binding sites were spaced between 18 bp and 50 bp apart. The tolerance of Met4 coactivator recruitment for a wide distance range contradicts the model that transcription factor interactions are predominantly determined by the precise spacing between their binding sites. Intriguingly, the recruitment of Met4 to a common minimal promoter seems to depend more on the sequence context between Cbf1 and Met31/32 binding sites than on the distance between them, provided that the minimum distance requirements are met. In light of these results, previous studies that varied distances between transcription factor binding sites should be reassessed, since they usually considered only a single sequence context for each distance.
The rather flexible distance constraints between Cbf1 and Met31/32 binding sites suggest that Met4 recruitment may not require rigid, simultaneous protein-protein interactions among the bound transcription factors. Taken together, these experiments suggest that the process of Met4 recruitment differs considerably from the lock-and-key arrangements of bound transcription factors that govern the mammalian interferon beta enhanceosome [1, 2]. Instead, an intrinsic property of the intervening sequence context, such as DNA bendability, may facilitate an induced fit between the bound transcription factors and Met4. Whereas the distance between binding sites plays a diminished role in bridging bound transcription factors, intervening sequences with low intrinsic bendability could impair coactivator recruitment. Thus, the key requirements of promoter architecture may rely heavily on the molecular mechanism of transcription factor interactions at a particular set of co-regulated promoters.
Possible effects of sequence context between transcription factor binding sites
Sequence context could alter Met4 recruitment in several ways. First, residues adjacent to binding sites could reduce the binding affinity of Cbf1 or Met31/32. Accordingly, we found that all spacer sequences that were incompatible with reporter gene expression contained a guanine or thymine immediately 5' to the Met31/32 binding site. Secondly, the DNA bendability of the spacer sequence could alter the conformation of Cbf1, which bends DNA by approximately 68° . Conformational changes in Cbf1 could affect its protein-protein interactions with Met28 or Met4, thus reducing Met4 recruitment. A requirement for DNA bendability on protein-protein interactions has been recently shown for the transcription factor Mcm1, which bends DNA by 66°, comparable to the bend angle induced by Cbf1 . A point mutant of Mcm1 with a DNA bending angle of 46° had a lower affinity for cooperative binding with Fkh2 than a mutant with a DNA bending angle of 49°, suggesting that a certain threshold of DNA bending was required for ternary complex formation in vitro . Circular permutation assays on promoters with different sequence contexts could test whether the extent of bendability correlates with reporter gene activation. In addition, chromatin immunoprecipitation studies could identify the transcription factors whose binding in vivo is affected by sequence context.
Whereas the influence of sequence context on gene activation has been widely reported (e.g., [34–36]), the key determinants of sequence context have been poorly defined. Except for the residue adjacent to the Met31/32 binding site, we could not identify features of the primary nucleotide sequence that correlated with gene activation. Previous studies have reported that protein-DNA interactions can be affected by physicochemical properties of DNA, such as twist. Although we assessed several dinucleotide parameters, we could not find any significant correlation between the average parameter value of a spacer sequence and reporter gene activation (data not shown).
Epigenetic effects could account for some of the observed variability in gene activation among promoters with different sequence contexts. By examining multiple independent serial dilutions for several promoter sequences (Supplementary Figure 1), we believe that this variability is reproducible and not due to stochastic effects on individual clones. In order to sample a large number of promoter architectures, we assayed reporter gene expression from a single-copy plasmid, which yields over 10,000-fold higher transformation efficiency than chromosomal integration. We have not explored how the flanking sequence composition of wild-type promoters may affect the basal or Met4-induced nucleosomal accessibility of Cbf1 and Met31/32 binding sites in the genome. Cbf1 can also modulate nucleosome positioning and recruit the Isw1 chromatin remodeling complex [38, 39]. Thus, additional determinants of local sequence context that affect the binding or DNA bending of Cbf1 may influence Met4 recruitment and gene activation in a chromosomal context.
Implications for computational predictions of transcription factor interactions
The development of computational methods to predict the transcriptional output of an arbitrary regulatory sequence has attracted considerable interest, as reviewed in [40, 41]. Most computational approaches assess the enrichment of predicted binding sites within a large sequence region, while ignoring the spatial arrangement of the binding sites. Moreover, only a handful of methods explicitly consider whether binding sites are more closely spaced than expected [9, 10, 42–44]. These methods typically specify minimum and maximum distance thresholds between which transcription factors are predicted to interact. Whereas the use of thresholds roughly approximates the range of transcription factor interactions, our above experiments suggest two major improvements for more accurate predictions.
First, different mechanisms of transcription factor interactions may impose distinct distance constraints between their binding sites. We found that the minimum spacing between Cbf1 and Met31/32 binding sites was the key distance constraint on reporter gene activation. We interpret this minimum distance to be a consequence of the coactivator's role in bridging the bound transcription factors. Various coactivators likely have different minimum distance requirements, based on their size and the relative locations of their interaction surfaces with DNA-bound transcription factors. Experimental studies on the promoter architecture requirements for other common coactivators should provide empirical distance thresholds that could improve the prediction accuracy of their regulated target genes.
In addition, the pronounced effects of sequence context on reporter gene activation suggest that highly accurate predictions of target gene regulation may not be easily extrapolated from targeted experimental studies. Further investigations of promoter architecture may benefit from a framework that formalizes how enthalpy gains from protein-protein interactions are offset by the entropy loss of multiprotein complex formation. Thermodynamic measurements on promoter variants with different spacing and sequence contexts between transcription factor binding sites could then be associated with changes in gene activation. Such a theory on the energetics of multiprotein complex formation could provide the quantitative precision needed to predict how a particular transcriptional control region adopts a conformation that enables transcriptional activation.
The main goal of this work was to characterize the influence of various components of promoter architecture on transcription factor interactions. We found that the requirements for Met4 coactivator recruitment were considerably more flexible than those for cooperatively binding transcription factors. The characteristic requirements of Met4 recruitment included the precise order of Cbf1 and Met31/32 binding sites, a large distance range between the binding sites that was insensitive to helical phasing, and the pronounced inhibitory effects of sequence context. Given the modular design of our synthetic promoter system, our approach can be readily used to characterize the promoter architecture constraints between arbitrary combinations of yeast transcription factors.
Plasmid pDC204 was constructed in five steps. 1) The HIS3 coding region was PCR amplified from S. cerevisiae genomic DNA using the primers HIS3_F_BamHI and HIS3_R (Table 1) and cloned downstream of the MEL1 minimal promoter (P MEL1 ) by ligating into the BamHI + EcoRV-cleaved plasmid YIpMELβ2 from EUROSCARF . Two changes were then made to the MEL1 minimal promoter. 2) An NcoI site was introduced into P MEL1 31 bp upstream of the existing XhoI site by site-directed mutagenesis (oligos MEL1_NcoI_W and MEL1_NcoI_C). 3) An out-of-frame ATG codon located 17 bp upstream of the HIS3 coding region was removed by site-directed mutagenesis (oligos ATG_W and ATG_C). 4) The P MEL1 -HIS3 fusion construct was PCR amplified (primers pMH14-F_ApaI & pMH14-R_AscI-SacII) and cloned into the ApaI + SacII-cleaved plasmid pRS314 . 5) The Kluyveromyces lactis LEU2 gene was PCR amplified from pUG73 (primers pUG73_F and pUG73_R)  and cloned into the AscI site of the above plasmid. Restriction digests confirmed the same-strand orientation of the HIS3 and LEU2 coding regions, and sequencing verified the promoter and coding regions.
Promoter library construction
Degenerate oligonucleotides were designed with a Cbf1 binding site at a fixed distance upstream of a Met31/32 binding site (Operon) (Table 1). Ten bp of flanking sequence upstream of the Cbf1 binding site and downstream of the Met31/32 binding site were included from the wild-type MET16 promoter. Double-stranded DNA was synthesized by Bio-X-Act polymerase (Bioline) from the primer MET16_reverse (Table 1), digested with NcoI and XhoI and ligated into pDC204.
Yeast strains and media
Strain BY4742 (MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0) was obtained from Invitrogen. Growth media were prepared by mixing yeast nitrogen base (Bio101), 2% dextrose and amino acid supplements lacking leucine or lacking histidine, leucine and methionine (BD Biosciences). Histidine or 3-amino-1,2,4-triazole (3-AT) (Sigma) was added at the indicated concentrations.
Pooled growth rates for promoter libraries
Plasmids containing promoter libraries with the indicated spacings between binding sites were introduced into the BY4742 parental strain by lithium acetate transformation. Transformants that harbored these single-copy plasmids were selected by growth on dropout medium lacking leucine. For each growth rate experiment, over 100 yeast colonies from a separate transformation were pooled and amplified by culturing overnight in dropout media lacking leucine.
To induce reporter gene expression, the pooled yeast cultures were diluted to early log phase (OD600 ~ 0.04) in 20 mL of dropout media lacking leucine, histidine and methionine and grown at 30°C with shaking at 250 rpm for 3 hours (OD600 ~ 0.1). Each culture was then split in half and 3-AT was added to one half, to a final concentration of 1 mM. To acclimatize the yeast cultures to 3-AT, the cultures were grown for a further 2.5 hours. Subsequently, we measured the OD600 of each pooled culture every 45 minutes until 7 hours after 3-AT addition. These measurements were log2-transformed and fitted by linear regression against time in Excel. The number of doublings per hour corresponds to the slope of the regression line for a single growth curve.
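The growth rate calculation can be illustrated with a short Python sketch. It is not the authors' Excel worksheet: the function name `doublings_per_hour` and the OD600 readings are hypothetical, and the sampling times simply assume one reading every 45 minutes from 2.5 to 7 hours after 3-AT addition.

```python
import math

def doublings_per_hour(hours, od600):
    """Slope of the least-squares line through (time, log2 OD600) points."""
    if len(hours) != len(od600) or len(hours) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log2(v) for v in od600]
    n = len(hours)
    mean_t = sum(hours) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope: covariance(t, y) / variance(t).
    sxx = sum((t - mean_t) ** 2 for t in hours)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(hours, logs))
    return sxy / sxx

# Hypothetical data: readings every 0.75 h for a culture that doubles every 2 h.
times = [2.5 + 0.75 * i for i in range(7)]
ods = [0.1 * 2 ** ((t - 2.5) / 2.0) for t in times]
rate = doublings_per_hour(times, ods)
```

Because the fit is performed on log2-transformed values, the slope is read directly in doublings per hour, so no conversion from a natural-log growth constant is needed.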
Computational association of promoter architectures with gene expression
The regulatory information associated with a particular sequence, such as a transcription factor binding site, can be quantified as the average change in gene expression for all genes that contain that sequence in their transcriptional control regions. We followed the standard practice of defining yeast transcriptional control regions as the 500 bp upstream of each coding region, as obtained from the Saccharomyces Genome Database. We searched for exact matches to the core recognition sequences for Cbf1 (TCACGTG) or Met31/32 (TGTGGC) on either strand of these upstream regions.
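A minimal Python sketch of an exact-match search on both strands is given here for illustration. The helper names `reverse_complement` and `find_sites` are hypothetical and do not come from the original analysis; the test sequence is the clone 8A/8B insert reported in Additional File 1, used only because its sequence is stated in this article.

```python
CBF1_CORE = "TCACGTG"
MET31_32_CORE = "TGTGGC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_sites(upstream, core):
    """Return sorted (start, strand) pairs for exact matches on either strand.

    `start` is the 0-based position on the given (forward) strand; a '-' hit
    means the reverse complement of `core` occurs at that position.
    """
    upstream = upstream.upper()
    hits = set()
    for strand, pattern in (("+", core), ("-", reverse_complement(core))):
        start = upstream.find(pattern)
        while start != -1:
            hits.add((start, strand))
            start = upstream.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)

# Clone 8A/8B insert from Additional File 1 (whitespace removed).
clone_8 = "TCACGTGTTTACAAACTAGGGGCCACA"
cbf1_hits = find_sites(clone_8, CBF1_CORE)
met_hits = find_sites(clone_8, MET31_32_CORE)
```

In this clone the Met31/32 core is found only as its reverse complement (GCCACA), which is why both strands have to be scanned. For a genome-wide search, the same function would be applied to each 500 bp upstream region.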
We first computed the relative ratio of transcript levels between sulfur starvation and complete media for each gene. The relative ratio for each gene was obtained by averaging the log base 2 expression ratios for the first four timepoints of an amino acid starvation microarray experiment, as well as four replicates of a 1 mM cadmium treatment [28, 29]. To convert these average relative ratios to absolute mRNA levels, we multiplied these relative ratios by the average basal transcript levels in rich growth media as reported by . We reported the average of these mRNA levels for all genes that shared each promoter architecture under consideration.
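The conversion described above can be sketched in Python as follows. This is one reading of the text rather than the authors' code: it assumes the averaged log2 ratio is back-transformed to a fold change before being multiplied by the basal level, and the function names and numbers are hypothetical.

```python
from statistics import mean

def induced_mrna_level(log2_ratios, basal_level):
    """Estimate an absolute mRNA level from averaged log2 expression ratios.

    Assumption: the mean log2 ratio is converted back to a linear fold
    change (2 ** mean) before scaling by the basal transcript level.
    """
    fold_change = 2 ** mean(log2_ratios)
    return fold_change * basal_level

def architecture_average(genes):
    """Average estimated mRNA level over genes sharing a promoter architecture.

    `genes` is a list of (log2_ratios, basal_level) pairs.
    """
    return mean(induced_mrna_level(ratios, basal) for ratios, basal in genes)

# Hypothetical values: eight log2 ratios per gene (four starvation time points
# plus four cadmium replicates) and a basal level in arbitrary units.
gene_a = ([1, 1, 1, 1, 3, 3, 3, 3], 1.5)
gene_b = ([0, 0, 0, 0, 0, 0, 0, 0], 2.0)
average_level = architecture_average([gene_a, gene_b])
```

Averaging in log2 space corresponds to a geometric mean of the fold changes, which keeps a single strongly induced array from dominating the estimate.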
Carey M: The enhanceosome and transcriptional synergy. Cell. 1998, 92: 5-8. 10.1016/S0092-8674(00)80893-4
Merika M, Thanos D: Enhanceosomes. Curr Opin Genet Dev. 2001, 11: 205-208. 10.1016/S0959-437X(00)00180-5
Ogata K, Sato K, Tahirov TH: Eukaryotic transcriptional regulatory complexes: cooperativity from near and afar. Curr Opin Struct Biol. 2003, 13: 40-48. 10.1016/S0959-440X(03)00012-5
Remenyi A, Scholer HR, Wilmanns M: Combinatorial control of gene expression. Nat Struct Mol Biol. 2004, 11: 812-817. 10.1038/nsmb820
van Helden J, Rios AF, Collado-Vides J: Discovering regulatory elements in non-coding sequences by analysis of spaced dyads. Nucleic Acids Res. 2000, 28: 1808-1818. 10.1093/nar/28.8.1808
GuhaThakurta D, Stormo GD: Identifying target sites for cooperatively binding factors. Bioinformatics. 2001, 17: 608-621. 10.1093/bioinformatics/17.7.608
Liu X, Brutlag DL, Liu JS: BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pac Symp Biocomput. 2001, 6: 127-138.
Robin S, Daudin JJ, Richard H, Sagot MF, Schbath S: Occurrence probability of structured motifs in random sequences. J Comput Biol. 2002, 9: 761-773. 10.1089/10665270260518254
Klingenhoff A, Frech K, Quandt K, Werner T: Functional promoter modules can be detected by formal models independent of overall nucleotide sequence similarity. Bioinformatics. 1999, 15: 180-186. 10.1093/bioinformatics/15.3.180
Pilpel Y, Sudarsanam P, Church GM: Identifying regulatory networks by combinatorial analysis of promoter elements. Nat Genet. 2001, 29: 153-159. 10.1038/ng724
Bulyk MJ, McGuire AM, Masuda N, Church GM: A motif co-occurrence approach for genome-wide prediction of transcription-factor-binding sites in Escherichia coli. Genome Res. 2004, 14: 201-208. 10.1101/gr.1448004
Amin J, Fernandez M, Ananthan J, Lis JT, Voellmy R: Cooperative binding of heat shock transcription factor to the Hsp70 promoter in vivo and in vitro. J Biol Chem. 1994, 269: 4804-4811.
Drazinic CM, Smerage JB, Lopez MC, Baker HV: Activation mechanism of the multifunctional transcription factor repressor-activator protein 1 (Rap1p). Mol Cell Biol. 1996, 16: 3187-3196.
Brazas RM, Bhoite LT, Murphy MD, Yu Y, Chen Y, Neklason DW, Stillman DJ: Determining the requirements for cooperative DNA binding by Swi5p and Pho2p (Grf10p/Bas2p) at the HO promoter. J Biol Chem. 1995, 270: 29151-29161. 10.1074/jbc.270.49.29151
Jin Y, Mead J, Li T, Wolberger C, Vershon AK: Altered DNA recognition and bending by insertions in the α2 tail of the yeast a1/α2 homeodomain heterodimer. Science. 1995, 270: 290-292.
Ioshikhes I, Trifonov EN, Zhang MQ: Periodical distribution of transcription factor sites in promoter regions and connection with chromatin structure. Proc Natl Acad Sci USA. 1999, 96: 2891-2895. 10.1073/pnas.96.6.2891
Makeev VJ, Lifanov AP, Nazina AG, Papatsenko DA: Distance preferences in the arrangement of binding motifs and hierarchical levels in organization of transcription regulatory information. Nucleic Acids Res. 2003, 31: 6016-6026. 10.1093/nar/gkg799
Naar AM, Lemon BD, Tjian R: Transcriptional coactivator complexes. Annu Rev Biochem. 2001, 70: 475-501. 10.1146/annurev.biochem.70.1.475
Spiegelman BM, Heinrich R: Biological control through regulated transcriptional coactivators. Cell. 2004, 119: 157-167. 10.1016/j.cell.2004.09.037
Kuras L, Barbey R, Thomas D: Assembly of a bZIP-bHLH transcription activation complex: Formation of the yeast Cbf1-Met4-Met28 complex is regulated through Met28 stimulation of Cbf1 DNA binding. EMBO J. 1997, 16: 2441-2451. 10.1093/emboj/16.9.2441
Blaiseau PL, Thomas D: Multiple transcriptional activation complexes tether the yeast activator Met4 to DNA. EMBO J. 1998, 17: 6327-6336. 10.1093/emboj/17.21.6327
Kuras L, Cherest H, Surdin-Kerjan Y, Thomas D: A heteromeric complex containing the centromere binding factor 1 and two basic leucine zipper factors, Met4 and Met28, mediates the transcription activation of yeast sulfur metabolism. EMBO J. 1996, 15: 2519-2529.
Thomas D, Surdin-Kerjan Y: Metabolism of sulfur amino acids in Saccharomyces cerevisiae. Microbiol Mol Biol Rev. 1997, 61: 503-532.
Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES: Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature. 2003, 423: 241-254. 10.1038/nature01644
Melcher K, Sharma B, Ding WV, Nolden M: Zero background yeast reporter plasmids. Gene. 2000, 247: 53-61. 10.1016/S0378-1119(00)00124-4
Horecka J, Sprague GFJ: Use of imidazoleglycerolphosphate dehydratase (His3) as a biological reporter in yeast. Methods Enzymol. 2000, 326: 107-119.
Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, Green MR, Golub TR, Lander ES, Young RA: Dissecting the regulatory circuitry of a eukaryotic genome. Cell. 1998, 95: 717-728. 10.1016/S0092-8674(00)81641-4
Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO: Genomic expression programs in the response of yeast cells to environmental changes. Mol Biol Cell. 2000, 11: 4241-4257.
Fauchon M, Lagniel G, Aude JC, Lombardia L, Soularue P, Petat C, Marguerie G, Sentenac A, Werner M, Labarre J: Sulfur sparing in the yeast proteome in response to sulfur demand. Mol Cell. 2002, 9: 713-723. 10.1016/S1097-2765(02)00500-2
Crooks GE, Hon G, Chandonia JM, Brenner SE: WebLogo: A sequence logo generator. Genome Res. 2004, 14: 1188-1190. 10.1101/gr.849004
Schneider TD, Stephens RM: Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990, 18: 6097-6100.
Niedenthal RK, Sen-Gupta M, Wilmen A, Hegemann JH: Cpf1 protein induced bending of yeast centromere DNA element I. Nucleic Acids Res. 1993, 21: 4726-4733.
Lim FL, Hayes A, West AG, Pic-Taylor A, Darieva Z, Morgan BA, Oliver SG, Sharrocks AD: Mcm1p-induced DNA bending regulates the formation of ternary transcription factor complexes. Mol Cell Biol. 2003, 23: 450-461. 10.1128/MCB.23.2.450-461.2003
Elledge SJ, Davis RW: Position and density effects on repression by stationary and mobile DNA-binding proteins. Genes Dev. 1989, 3: 185-197.
Mai X, Chou S, Struhl K: Preferential accessibility of the yeast HIS3 promoter is determined by a general property of the DNA sequence, not by specific elements. Mol Cell Biol. 2000, 20: 6668-6676. 10.1128/MCB.20.18.6668-6676.2000
Tornoe J, Kusk P, Johansen TE, Jensen PR: Generation of a synthetic mammalian promoter library by modification of sequences spacing transcription factor binding sites. Gene. 2002, 297: 21-32. 10.1016/S0378-1119(02)00878-8
Olson WK, Gorin AA, Lu X-J, Hock LM, Zhurkin VB: DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. Proc Natl Acad Sci USA. 1998, 95: 11163-11168. 10.1073/pnas.95.19.11163
Moreau JL, Lee M, Mahachi N, Vary J, Mellor J, Tsukiyama T, Goding CR: Regulated displacement of TBP from the PHO8 promoter in vivo requires Cbf1 and the Isw1 chromatin remodeling complex. Mol Cell. 2003, 11 (6): 1609-1620. 10.1016/S1097-2765(03)00184-9
Kent NA, Eibert SM, Mellor J: Cbf1p is required for chromatin remodeling at promoter-proximal CACGTG motifs in yeast. J Biol Chem. 2004, 279 (26): 27116-27123. 10.1074/jbc.M403818200
Futcher B: Transcriptional regulatory networks and the yeast cell cycle. Curr Opin Cell Biol. 2002, 14 (6): 676-683. 10.1016/S0955-0674(02)00391-5
Istrail S, Davidson EH: Logic functions of the genomic cis-regulatory code. Proc Natl Acad Sci USA. 2005, 102: 4954-4959. 10.1073/pnas.0409624102
Wagner A: Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes. Bioinformatics. 1999, 15: 776-784. 10.1093/bioinformatics/15.10.776
Chiang DY, Moses AM, Kellis M, Lander ES, Eisen MB: Phylogenetically and spatially conserved word pairs associated with gene-expression changes in yeasts. Genome Biol. 2003, 4: R43. 10.1186/gb-2003-4-7-r43
Beer MA, Tavazoie S: Predicting gene expression from sequence. Cell. 2004, 117: 185-198. 10.1016/S0092-8674(04)00304-6
Sikorski RS, Hieter P: A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics. 1989, 122: 19-27.
Gueldener U, Heinisch J, Koehler GJ, Voss D, Hegemann JH: A second set of loxP marker cassettes for Cre-mediated multiple gene knockouts in budding yeast. Nucleic Acids Res. 2002, 30: e23. 10.1093/nar/30.6.e23
Gietz RD, Woods RA: Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods Enzymol. 2002, 350: 87-96.
Balakrishnan R, Christie KR, Costanzo MC, Dolinski K, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hong EL, Nash R, Oughtred R, Skrzypek M, Theesfeld CL, Binkley G, Dong Q, Lane C, Sethuraman A, Weng S, Botstein D, Cherry JM: Fungal BLAST and model organism BLASTP best hits: new comparison resources at the Saccharomyces Genome Database (SGD). Nucleic Acids Res. 2005, 33: D374-D377. 10.1093/nar/gki023
Cliften PF, Sudarsanam P, Desikan A, Fulton L, Fulton B, Majors J, Waterston RH, Cohen BA, Johnston M: Finding functional features in Saccharomyces genomes by phylogenetic footprinting. Science. 2003, 301: 71-76. 10.1126/science.1084337
We thank Alan Moses, Angela DePace, Jeremy Thorner and Caroline Kane for insightful discussions. In addition, Brandon Davies, Jacqueline Dumais, Paul Jorgensen, Traci Lee and Yiwen Zhu shared experimental tips. D.Y.C. is a Howard Hughes Medical Institute Predoctoral Fellow, and M.B.E. is a Pew Scholar in the Biomedical Sciences. This work was conducted under the US Department of Energy contract No. DE-AC03-76SF00098.
DYC, DAN, APG and MBE conceived and designed the experiments; DYC performed the experiments; DYC, DAN and RKS analyzed the data; DYC, DAN, RKS, APG and MBE wrote and approved the manuscript.
Electronic supplementary material
Additional File 1: Reproducibility of growth assays. Two different sequence contexts were tested between Cbf1 and Met31/32 binding sites with a center-to-center distance of 20 bp. The sequences of these clones were confirmed by isolating plasmids and sequencing. Clones 8A and 8B had the sequence TCACGTG TTTACAAACTAGGGGCCACA; clones 12A and 12B had the sequence TCACGTG GGCATTTATGGGAAGCCACA. These plasmids were transformed independently into yeast strains. Serial dilutions of separate isolates were plated on the indicated growth media. (PDF 138 KB)