Cruciform structures are a common DNA feature important for regulating biological processes

DNA cruciforms play an important role in the regulation of natural processes involving DNA. These structures are formed by inverted repeats, and their stability is enhanced by DNA supercoiling. Cruciform structures are fundamentally important for a wide range of biological processes, including replication, regulation of gene expression, nucleosome structure and recombination. They have also been implicated in the evolution and development of diseases including cancer, Werner's syndrome and others. Cruciform structures are targets for many architectural and regulatory proteins, such as histones H1 and H5, topoisomerase IIβ, HMG proteins, HU, p53, the proto-oncogene protein DEK and others. A number of DNA-binding proteins, such as the HMGB-box family members, Rad54, BRCA1 protein, as well as PARP-1 polymerase, possess weak sequence-specific DNA binding yet bind preferentially to cruciform structures. Some of these proteins are, in fact, capable of inducing the formation of cruciform structures upon DNA binding. In this article, we review the protein families that are involved in interacting with and regulating cruciform structures, including (a) the junction-resolving enzymes, (b) DNA repair proteins and transcription factors, (c) proteins involved in replication and (d) chromatin-associated proteins. The prevalence of cruciform structures and their roles in protein interactions, epigenetic regulation and the maintenance of cell homeostasis are also discussed.

representation of inverted repeats, which occurs nonrandomly in the DNA of all organisms, has been noted in the vicinity of breakpoint junctions, promoter regions, and at sites of replication initiation [3,7,8]. Cruciform structures may affect the degree of DNA supercoiling, the positioning of nucleosomes in vivo [9], and the formation of other secondary structures of DNA. Cruciforms contain a number of structural elements that serve as direct protein-DNA targets. Numerous proteins have been shown to interact with cruciforms, recognizing features such as DNA crossovers, four-way junctions, and curved or bent DNA. Structural transitions in chromatin occur concomitantly with DNA replication or transcription and in processes that involve a local separation of DNA strands. Such transitions are believed to facilitate the formation of alternative DNA structures [10,11]. Transient supercoils are formed in the eukaryotic genome during DNA replication and transcription, and these often involve protein binding [12]. Indeed, active chromatin remodeling is a typical feature for many promoters and is essential for gene transcription [13]. Notably, DNA supercoiling can have a strong impact on gene expression [14]. Using microarrays covering the E. coli genome, it was recently shown that expression of 7% of genes was rapidly and significantly affected by a loss of chromosomal supercoiling [15]. Several complexes that involve extensive DNA-protein interactions, whereby the DNA wraps around the protein, can only occur under conditions of negative DNA supercoiling [10]. Other proteins are reported to interact with the supercoiled DNA (scDNA) at crossing points or on longer segments of the interwound supercoil [16,17]. Interestingly, the eukaryotic genome has been shown to contain a percentage of unconstrained supercoils, part of which can be attributed to transcriptional regulation [3]. 
The spontaneous generation of DNA supercoiling is also a requirement for genome organization [18]. Transient supercoils are formed both in front of and behind replication forks as superhelical stress is distributed throughout the entire replicating DNA molecule [19]. A number of additional processes may operate to create transient and localized superhelical stresses in eukaryotic DNA.
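The superhelical stress referred to above is conventionally expressed as a superhelical density, sigma = (Lk - Lk0) / Lk0, where Lk is the linking number of a closed DNA domain and Lk0 is that of the same domain when relaxed. The Python sketch below is illustrative only: the function names are hypothetical, the helical repeat of 10.5 bp per turn is an assumed textbook value for B-DNA, and the estimate of the turns absorbed by extrusion simply treats the extruded inverted repeat as fully removed from the duplex.

```python
def superhelical_density(linking_number, n_bp, bp_per_turn=10.5):
    """Return sigma = (Lk - Lk0) / Lk0 for a closed DNA domain of n_bp base pairs.

    bp_per_turn is an assumed helical repeat for relaxed B-DNA.
    """
    lk0 = n_bp / bp_per_turn  # linking number of the relaxed domain
    return (linking_number - lk0) / lk0


def turns_relaxed_by_extrusion(repeat_bp, bp_per_turn=10.5):
    """Estimate the negative superhelical turns absorbed when an inverted
    repeat of repeat_bp base pairs (both arms plus spacer) extrudes as a cruciform.
    """
    return repeat_bp / bp_per_turn


if __name__ == "__main__":
    # A hypothetical 1050 bp plasmid with six turns fewer than the relaxed state.
    print(superhelical_density(94, 1050))
    # A hypothetical 21 bp inverted repeat extruding as a cruciform.
    print(turns_relaxed_by_extrusion(21))
```

Under these assumptions, extrusion of even a short inverted repeat removes a measurable fraction of the negative supercoils in a small domain, which is one way to see why negative supercoiling favors cruciform formation.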
The recognition of cruciform DNA seems to be critical not only for the stability of the genome, but also for numerous, basic biological processes. As such, it is not surprising that many proteins have been shown to exhibit cruciform structure-specific binding properties. In this review, we focus on these proteins, many of which are involved in chromatin organization, transcription, replication, DNA repair, and other processes. To organize our review, we have divided cruciform binding proteins into four groups (see Table 1) according to their primary functions: (a) junction-resolving enzymes, (b) transcription factors and DNA repair proteins, (c) replication machinery, and (d) chromatin-associated proteins. For each group, we describe in detail recent examples of research findings. Lastly, we review how dysregulation of cruciform binding proteins is associated with the pathology of certain diseases found in humans.

Formation and presence of cruciform structures in the genome
Cruciform structures are important regulators of biological processes [3,5]. Both stem-loops and cruciforms are capable of forming from inverted repeats. Cruciform structures consist of a branch point, a stem and a loop, where the size of the loop is dependent on the length of the gap between inverted repeats (Figure 1). Direct inverted repeats lead to formation of a cruciform with a minimal single-stranded loop. The formation of cruciforms from indirect inverted repeats containing gaps is dependent not only on the length of the gap, but also on the sequence in the gap. In general, AT-rich gap sequences increase the probability of cruciform formation. It is also possible that the gap sequence can form an alternative DNA structure. The formation of DNA cruciforms has a strong influence on DNA geometry, whereby sequences that are normally distal from one another can be brought into close proximity [20,21]. The structure of cruciforms has been studied by atomic force microscopy [22-24]. These studies have identified two distinct classes of cruciforms. One class of cruciforms, denoted as unfolded, has a square planar conformation characterized by a 4-fold symmetry in which adjacent arms are nearly perpendicular to one another. The second class comprises a folded (or stacked) conformation where the adjacent arms form an acute angle with the main DNA strands (Figure 2). Two of the three structural motifs inherent to cruciforms, the branch point and stem, are also found in Holliday junctions. Holliday junctions are formed during recombination, double-strand break repair, and fork reversal during replication. Resolving Holliday junctions is a critical process for maintaining genomic stability [25,26]. These junctions are resolved by a class of structure-specific nucleases: the junction-resolving enzymes.
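The relationship between inverted repeats, the gap between them and the resulting stem and loop can be illustrated with a small sequence scan. The following Python sketch is a minimal illustration under stated assumptions, not a published prediction method: the function name and the parameters `min_arm` and `max_gap` are hypothetical, only perfect (mismatch-free) repeats are reported, and no account is taken of supercoiling or of the thermodynamics of extrusion.

```python
# Watson-Crick partner of each base; any other symbol (e.g. N) never pairs.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_inverted_repeats(seq, min_arm=6, max_gap=10):
    """Scan seq for perfect inverted repeats.

    Returns a list of (start, arm_length, gap_length) tuples, where start is the
    0-based index of the first base of the left arm, arm_length is the number of
    bases in each arm (the potential stem), and gap_length is the number of
    bases between the arms (the potential loop).
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for left_end in range(1, n):  # the left arm ends just before index left_end
        for gap in range(max_gap + 1):
            right_start = left_end + gap
            arm = 0
            # Extend outward from the gap while the two arms stay complementary.
            while (
                left_end - 1 - arm >= 0
                and right_start + arm < n
                and PAIR.get(seq[left_end - 1 - arm]) == seq[right_start + arm]
            ):
                arm += 1
            if arm >= min_arm:
                hits.append((left_end - arm, arm, gap))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: a 7 bp arm, a 3 bp spacer, and the reverse
    # complement of the arm, flanked by unrelated bases.
    example = "AA" + "CAGGTCA" + "TTT" + "TGACCTG" + "CC"
    for start, arm, gap in find_inverted_repeats(example):
        print(start, arm, gap)
```

Because a shorter stem with a longer loop can be drawn from the same repeat, one repeat may be reported more than once with different arm and gap lengths; a practical tool would typically keep only the longest arm for each center.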
Cruciforms are not thermodynamically stable in naked linear DNA due to branch migration [27]. Cruciform structure formation in vivo has been shown in both prokaryotes and eukaryotes using several methodological approaches. The presence of the cruciform structure was first described in circular plasmid DNA where the negative superhelix density can stabilize cruciform formation. Plasmids with native superhelical density usually contain cruciform structures in vitro and in vivo [28]. For example, higher order structure in the pT181 plasmid was shown to exist in vivo using bromoacetaldehyde treatment [29]. Deletion of the sequence which forms this structure at the ori site leads either to a reduction or failure in replication [30]. Similarly, deletion of the cruciform binding domain in 14-3-3 proteins results in reduced origin binding which affects the initiation of DNA replication in budding yeast [31]. Monoclonal antibodies against cruciform structures have also been used successfully to isolate cruciform-containing segments of genomic DNA. Furthermore, these sequences were able to replicate autonomously when transfected into HeLa cells [32]. Stabilization of the cruciform structures by monoclonal antibodies 2D3 and 4B4, with anti-cruciform DNA specificity, resulted in a 2- to 6-fold enhancement of replication in vivo [33]. 14-3-3 sigma was found to associate in vivo with the monkey origins of DNA replication ors8 and ors12 in a cell cycle-dependent manner, as assayed by a chromatin immunoprecipitation (ChIP) assay that involved formaldehyde cross-linking, followed by immunoprecipitation with anti-14-3-3 sigma antibody and quantitative PCR [34]. Similarly, the 14-3-3 protein homologs from Saccharomyces cerevisiae, Bmh1p and Bmh2p, have cruciform DNA-binding activity and associate in vivo with ARS307 [35]. Several studies show that transcription is regulated directly by the presence of cruciform structure in vivo. One example is the ability of the d(AT)n-d(AT)n insert to spontaneously adopt a cruciform state in E. coli, resulting in a block of protein synthesis [36]. Using site-directed mutational analysis and P1 nuclease mapping, it was demonstrated that the formation of a cruciform structure is required for the repression of enhancer function in transient transfection assays and that Alu elements may contribute to regulation of the CD8 alpha gene enhancer through the formation of secondary structure that disrupts enhancer function [37]. Transcriptionally driven negative supercoiling also mediates cruciform formation in vivo and enhanced cruciform formation correlates with an elevation in promoter activity [38]. It was also shown that the secondary DNA structures of the ATF/CREB element play a vital role in protein-DNA interactions and its cognate transcription factors play a predominant role in the promoter activity of the RNMTL1 gene [39]. Hypo-methylation of inverted repeats by the Dam methylase is consistent with these sequences adopting an unusual secondary structure, such as a DNA cruciform or hairpin, in vivo [40].
Table 1 (partial; the caption and earlier rows are missing from this copy)
Protein; Source; References
(protein name not recovered); H. sapiens, S. cerevisiae; [34,110]
Rmi-1; yeast; [157]
Crp-1; S. cerevisiae; [158]
HMG protein family; all; [47,159-161]
Smc; S. cerevisiae; [118,162]
Hop1; S. cerevisiae; [163,164]
ER estrogen receptor; mammals; [58]
Chromatin-associated proteins
DEK; mammals; [84,85]
BRCA1; mammals; [49,50,91,93]
HMG protein family; eukaryotes; [47,159-161]
Rad54; eukaryotes; [48]
Rad51ap; eukaryotes; [81]
Topoisomerase I; eukaryotes; [101,165]
Replication
S16; E. coli; [113]
GF14, homolog of 14-3-3; plants; [35]
MLL (leukemia); H. sapiens; [125,126]
WRN (Werner syndrome); H. sapiens; [129]
AF10; H. sapiens; [114]
14-3-3; eukaryotes; [34,110]
DEK; mammals; [84,85]
DNA-PK; eukaryotes; [166]
Vlf-1; baculoviruses; [119]
HU; E. coli; [105,167,168]
Helicases (59,44, and others); all; [55]
The in vivo effects of cruciform formation during transcription have been studied in detail by Krasilnikov et al. [4]. Interestingly, hairpin-capped linear DNA (in which the replication of hairpin-capped DNA and cruciform formation and resolution play central roles) was stably maintained for months in a human cancer cell line as numerous extra-chromosomal episomes [41]. Long palindromes can also induce DNA breaks after assuming a cruciform structure. Palindromes in S. cerevisiae are resolved, in vivo, by structure-specific enzymes. In vivo resolution requires either the Mus81 endonuclease or, as a substitute, the bacterial HJ resolvase RusA. These findings provide confirmation of cruciform extrusion and resolution in the context of eukaryotic chromatin [42]. Taken together, these studies show that cruciforms have been detected in vivo using a variety of independent techniques and that they are an intriguing and integral phenomenon of DNA biology and biochemistry.

Proteins involved in interactions with cruciform structures

Junction-resolving enzymes
There are a large number of proteins that recognize cruciforms (summarized in Table 1) and, of these, the junction-resolving enzymes have been studied extensively. These proteins have been identified in many organisms from bacteria (and their phages) to yeast, archaea and mammals [43]. The majority of the junction-resolving enzymes can be divided into one of two superfamilies [44]. Those in the first superfamily target specific DNA sequences for enzymatic activity, although they will bind equally well to junctions of any sequence. This superfamily includes E. coli RuvC, the yeast integrases, Cce1, Ydc2, and RNase H. The second superfamily includes the phage T7 endonuclease I, RecU, the Hjc and Hje resolving enzymes, the MutH protein family and related restriction enzymes. The X-ray structures of the junction-resolving enzymes in complex with four-way junctions highlight the flexibility inherent to DNA (Figure 3) [25] in that these enzymes recognize and distort the junction. This enables them to carry out such key roles as the cleavage of allogene DNAs and maintenance of genomic stability, to name but a few. The recognition of non-B-DNA structure by junction-resolving enzymes has been the subject of several reviews [25,43,45,46].

Proteins involved in transcription and DNA repair
The maintenance of a cell's genomic stability is achieved through several independent mechanisms. Arguably, the most important of these mechanisms is DNA repair. Protein binding to damaged DNA and to the local alternative DNA structures is therefore a key function of these processes. The promoter regions of genes are often characterized by the presence of inverted repeats that are capable of forming cruciforms in vivo. A number of DNA-binding proteins, such as those of the HMGB-box family [47], Rad54 [48], BRCA1 protein [49,50], as well as PARP-1 (poly(ADP-ribose) polymerase-1) [51], display only a weak sequence preference but bind preferentially to cruciform structures. Moreover, some proteins can induce the formation of cruciform structures upon DNA binding [51,52]. Among the DNA repair proteins which bind to cruciforms are the junction-resolving enzymes Ruv and RuvB [53,54], DNA helicases [55], XPG protein [56], and multifunctional proteins like HMG-box proteins [57], BRCA1, the 14-3-3 protein family including the homologs Bmh1 and Bmh2 from S. cerevisiae, and GF14 from plants. Footprinting analysis of the gonadotropin-releasing hormone gene promoter region indicated the human estrogen receptor (ER) to be another potential cruciform binding protein. In this case, extrusion of the cruciform structure allowed the estrogen response element motifs to be accessed by the ER protein [58].
PARP-1
PARP-1 is an abundant nuclear zinc-finger protein present at ~1 enzyme per 50 nucleosomes. It has a high affinity for damaged DNA and becomes catalytically active upon binding to DNA breaks [59]. In the absence of DNA damage, the presence of PARP-1 leads to the perturbation of histone-DNA contacts, allowing DNA to be accessible to regulatory factors [60]. PARP-1 activity is also linked to the coordination of chromatin structure and gene expression in Drosophila [61].
It was reported that PARP can bind to the DNA hairpins in heteroduplex DNA and that the auto-modification of PARP in the presence of NAD+ inhibited its hairpin binding activity. Atomic force microscopy studies revealed that, in vitro, PARP protein has a preference for the promoter region of the PARP gene in superhelical DNA where the dyad symmetry elements form hairpins (Figure 4) [62]. PARP-1 recognizes distortions in the DNA backbone, allowing it to bind to three- and four-way junctions [63]. Kinetic analysis has revealed that the structural features of non-B form DNA are important for PARP-1 catalysis activated by undamaged DNA. The order of PARP-1's substrate preference has been shown to be: cruciforms > loops > linear DNA. These results suggest a link between PARP-1 binding to cruciform structures in the genome and its function in the modulation of chromatin structure in cellular processes. Moreover, it was shown that the binding of PARP-1 to DNA can induce changes in DNA topology, as was demonstrated using plasmid DNA targets [51].
p53
P53 is arguably one of the most intensively studied tumor suppressor genes. More than 50% of all human tumors contain p53 mutations and the inactivation of this gene plays a critical role in the induction of malignant transformation [64]. Sequence-specific DNA binding is crucial for p53 function. P53 target sequences, which consist of two copies of the sequence 5'-RRRC(A/T)(T/A)GYYY-3', often form inverted repeats [65]. It was reported that p53 binding is temperature sensitive and dependent on DNA fragment length [66,67]. Moreover, it was demonstrated, in vivo, that p53 binding to its target sequence is highly dependent on the presence of an inverted repeat at the target site. Preferential binding of p53 to superhelical DNA has also been described [68,69].
Non-canonical DNA structures such as mismatched duplexes, cruciform structures [70], bent DNA [71], structurally flexible chromatin DNA [13], hemicatenated DNA [72], DNA bulges, three- and four-way junctions [73], or telomeric t-loops [74] can all be bound selectively by p53. There is a strong correlation between the cruciform-forming targets and an enhancement of p53 DNA binding [75]. Target sequences capable of forming cruciform structures in topologically constrained DNA bound p53 with a remarkably higher affinity than did the internally asymmetrical target site [76]. These results implicate DNA topology as having an important role in the complex, with possible implications in modulation of the p53 regulon.
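The tendency of p53 target sequences to form inverted repeats follows in part from the consensus itself: reading 5'-RRRC(A/T)(T/A)GYYY-3' on the opposite strand gives the same degenerate pattern. The Python sketch below is illustrative only; the function names are hypothetical, the two example decamers are constructed test strings rather than documented p53 sites, and the scan covers single copies of the consensus, not the complete two-copy target.

```python
import re

# R = A or G, Y = C or T; (A/T)(T/A) permits A or T at both central positions.
# The lookahead lets overlapping matches be reported.
HALF_SITE = re.compile(r"(?=([AG]{3}C[AT][AT]G[CT]{3}))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_half_sites(seq):
    """Return (position, decamer) for every match to the consensus decamer."""
    return [(m.start(), m.group(1)) for m in HALF_SITE.finditer(seq.upper())]


def is_perfect_palindrome(site):
    """True if the site reads identically on both strands (a perfect inverted repeat)."""
    return site == reverse_complement(site)


if __name__ == "__main__":
    for decamer in ("GGGCATGCCC", "AGACATGCCT"):
        other_strand = reverse_complement(decamer)
        print(
            decamer,
            is_perfect_palindrome(decamer),
            bool(find_half_sites(other_strand)),
        )
```

In this sketch a decamer that matches the consensus always has a reverse complement that also matches, but only some decamers are perfect palindromes, which is consistent with the distinction drawn above between cruciform-forming and internally asymmetrical target sites.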

Chromatin-associated proteins
The chromatin-associated proteins cover a broad spectrum of the proteins localized in the cell nucleus. They are partly involved in modulating chromatin structure, but are also implicated in a range of processes associated with DNA function. They fine-tune transcriptional events (DEK, BRCA1) and are involved in both DNA repair and replication (HMG proteins, Rad51, Rad51ap, topoisomerases). Another family of enzymes deemed important in these processes is that of topoisomerases. These enzymes occur in all known organisms and play crucial roles in the remodeling of DNA topology. Topoisomerase I binds to Holliday junctions [77], and topoisomerase II recognizes and cleaves cruciform structures [78] and interacts with the HMGB1 protein [57]. These processes are particularly important for maintaining genomic stability due to their ability to diffuse the stresses that are levied upon a DNA molecule during transcription, replication and the resolving of long cruciforms that would otherwise hinder DNA chain separation. The Rad54 protein plays an important role during homologous recombination in eukaryotes [79]. Yeast and human Rad54 bind specifically to Holliday junctions and promote branch migration [80]. The binding preference for the open conformation of the X-junction appears to be common for many proteins that bind to Holliday junctions. Human Rad54 binds preferentially to the open conformation of branched DNA as opposed to the stacked conformation [48]. Similarly, RAD51AP1, the RAD51 accessory protein, specifically stimulates joint molecule formation through the combination of structure-specific DNA binding and by interacting with RAD51. RAD51AP1 has a particular affinity for branched-DNA structures that are obligatory intermediates during joint molecule formation [81]. The recognition of branched structures during homologous recombination is a critical step in this process.
DEK
The human DEK protein is an abundant nuclear protein of 375 amino acids that occurs in numbers greater than 1 million copies per nucleus [82]. Its interactions with transcriptional activators and repressors suggest that DEK may have a role in the formation of transcription complexes at promoter and enhancer sites [reviewed in [83]]. The binding of DEK to DNA is not sequence specific and DEK has a clear preference for supercoiled DNA and four-way junctions [84]. Work with isolated and recombinant DEK has shown that it has intrinsic DNA-binding activity with a preference for four-way junction and superhelical DNA over linear DNA and introduces positive supercoils into relaxed circular DNA [83,85]. DEK has two DNA-binding domains. The first domain is centrally located and harbors a conserved sequence element, the SAF (scaffold attachment factor) box. The second DNA-binding domain is located at the C-terminus of DEK, which is also post-translationally modified by phosphorylation. In fact, the DNA-binding properties of DEK are clearly influenced by phosphorylation, as phosphorylated DEK binds with a weaker affinity to DNA than does unmodified DEK and induces the formation of DEK multimers [86,87]. DEK's monomeric SAF box (residues 137-187) does not appear to interact with DNA in solution. However, when many SAF boxes are brought into close proximity, cooperativity drives DNA binding. A DEK construct spanning amino acids 87-187 binds to DNA much like the intact DEK, preferring four-way DNA junctions over linear DNA. This fragment forms large aggregates in the presence of DNA and is also able to introduce supercoils into relaxed circular DNA. Interestingly, the 87-187 amino acid peptide induces negative DNA supercoils [88].
BRCA1
BRCA1 is a multifunctional tumor suppressor protein having roles in cell cycle progression, transcription, DNA repair and chromatin remodeling. Mutations to the BRCA1 gene are associated with a significant increase in the risk of breast cancer.
The function of BRCA1 likely involves interactions with both DNA and an array of proteins. BRCA1 associates directly with RAD51 and both proteins co-localize to discrete subnuclear foci that redistribute to sites of DNA damage under genotoxic stress [89]. BRCA1 also co-localizes with phosphorylated H2AX (γH2AX) in response to double-strand breaks [90]. The central region of human BRCA1 binds strongly to negatively supercoiled plasmid DNA with native superhelical density [50] and binds with high affinity to cruciform DNA [91]. The BRCA1-cruciform DNA complex must dissociate to allow the nuclease complex to work in DNA recombinational repair of double-strand breaks. BRCA1 also acts as a scaffold for assembly of the Rad51 ATPase, which is responsible for homologous recombination in somatic cells. The full-length BRCA1 protein binds strongly to supercoiled plasmid DNA and to junction DNA. The difference in affinity was on the order of 6- to 7-fold between linear and junction DNA in reactions containing physiological levels of magnesium [92]. BRCA1 230-534 binds with a higher affinity to four-way junction DNA as compared to duplex and single-stranded DNA [91]. Residues 340-554 of BRCA1 have been identified as the minimal DNA-binding region [93]. The highest affinity among the different DNA targets which mimic damaged DNA (four-way junction DNA, DNA mismatches, DNA bulges and linear DNA) was for DNA four-way junctions. To this end, a 20-fold excess of linear DNA was unable to compete off any of the BRCA1 230-534 bound to DNA molecules mimicking damaged DNA [49]. Furthermore, the loss of the BRCA1 gene prevents cell survival after exposure to DNA cross-linkers such as mitomycin C [94]. These results speak to the importance of BRCA1's ability to recognize cruciform structures.
HMGB family
The high-mobility group (HMG) proteins are a family of abundant and ubiquitous non-histone proteins that are known to bind to eukaryotic chromatin. The three HMG protein families comprise the (a) HMGA proteins (formerly HMGI/Y) containing A/T-hook DNA-binding motifs, (b) HMGB proteins (formerly HMG1/2) containing HMG-box domain(s), and (c) HMGN proteins (formerly HMG14/17) containing a nucleosome-binding domain [95].
HMGB proteins bind DNA in a sequence-independent manner and are known to bind to certain DNA structures (four-way junctions, DNA minicircles, cis-platinated DNA, etc.) with high affinity as compared to linear DNA [96,97]. The chromatin architectural protein HMGB1 can bind with extremely high affinity to DNA structures that form DNA loops [72], while other studies have shown that the HMG box of different proteins can induce DNA bending [98-100]. The HMG box is an 80 amino acid domain found in a variety of eukaryotic chromosomal proteins and transcription factors. HMG box binding to DNA is associated with distortions in DNA structure. Members of the HMG protein family are involved in transcription [101-103] and DNA repair [57,104,105]. The HMG protein T160 was found to be co-localized with DNA replication foci [106]. The fact that all HMG box domains bind to four-way DNA junctions suggests that a common feature in the binding targets of this protein family must exist. Single HMG box domains interact exclusively with the open square form of the junction, and conditions that stabilize the stacked X-structure conformation significantly weaken the HMG box-DNA interaction [107]. Binding of the isolated A domain of HMGB1 protein to four-way junction DNA substrates is abolished by mutation of both Lys2 and Lys11 together to alanine, indicating that these residues play an important role in DNA binding [108].

Proteins involved in replication
Transient transitions from B-DNA to cruciform structures are correlated with DNA replication and transcription [109]. It has been shown that cruciforms serve as recognition signals at or near eukaryotic origins of DNA replication [110-112]. There are a large number of proteins involved in replication which bind to cruciform structures (see Table 1). We focus here primarily on the 14-3-3 protein family and the MLL and WRN proteins. We will comment briefly on other systems of interest. S16 is a structure-specific DNA-binding protein displaying preferential binding for cruciform DNA structures [113]. The AF10 protein binds cruciform DNA via a specific interaction with an AT-hook motif and is localized to the nucleus by a defined bipartite nuclear localization signal in the N-terminal region [114]. The structural maintenance of chromosomes (SMC) protein family, with members from lower and higher eukaryotes, may be divided into four subfamilies (SMC1 to SMC4) and two SMC-like protein subfamilies (SMC5 and SMC6) [115-117]. Members of this family are implicated in a large range of activities that modulate chromosome structure and organization. SMC1 and SMC2 proteins have a high affinity for cruciform DNA molecules and for AT-rich DNA fragments including fragments from the scaffold-associated regions [118]. The baculovirus very late expression factor 1 (VLF-1), a member of the integrase protein family, does not bind to single- and double-stranded structures, but it does bind (listed with increasing affinity) to Y-forks, three-way junctions and cruciform structures. This protein is involved in the processing of branched DNA molecules at the late stages of viral genome replication [119].
14-3-3
The 14-3-3 protein family consists of a highly conserved and widely distributed group of dimeric proteins which occur as multiple isoforms in eukaryotes [120].
There are at least seven distinct 14-3-3 genes in vertebrates, giving rise to nine isoforms (α, β, γ, δ, ε, ζ, η, σ and τ) and at least another 20 have been identified in yeast, plants, amphibians and invertebrates [110]. A striking feature of the 14-3-3 proteins is their ability to bind a multitude of functionally diverse signaling proteins, including kinases, phosphatases, and transmembrane receptors. This plethora of proteins allows 14-3-3s to modulate a wide variety of vital regulatory processes, including mitogenic signal transduction, apoptosis and cell cycle regulation [121]. The 14-3-3 proteins are found mainly within the nucleus and are involved in eukaryotic DNA replication via binding to the cruciform DNA that forms transiently at replication origins at the onset of the S phase [122].
The prevalence of the 14-3-3 family proteins in all eukaryotes combined with a high degree of sequence conservation between species is indicative of their importance. Genetic studies have shown that knocking out the yeast homologs of the 14-3-3 proteins is lethal [124]. Moreover, 14-3-3 proteins are involved in interactions with numerous transcription factors and it has been reported that several functions of the 14-3-3 proteins are associated with their cruciform binding properties.
Mixed lineage leukemia (MLL) protein
The MLL gene encodes a putative transcription factor with regions of homology to several other proteins including the zinc fingers and the so-called "AT-hook" DNA-binding motif of high mobility group proteins [125]. The 11q23 chromosomal translocation, found in both acute lymphoid and myeloid leukemias, results in disruption of the MLL gene. Leukemogenesis is often correlated with alterations in chromatin structure brought about by either a gain or loss in function of the regulatory factors due to their being disrupted by chromosomal translocations. The MLL gene, a target of such translocation events, forms a chimeric fusion product with a variety of partner genes [126].
The MLL AT-hook domain binds cruciform DNA, recognizing the structure rather than the sequence of the target DNA. This interaction can be antagonized both by Hoechst 33258 dye and distamycin. In a nitrocellulose protein-DNA binding assay, the MLL AT-hook domain was shown to bind to AT-rich SARs, but not to non-SAR DNA fragments [125]. MLL appears to be involved in chromatin-mediated gene regulation. In translocations involving MLL, the loss of the activation domain combined with the retention of a repression domain alters the expression of downstream target genes, thus suggesting a potential mechanism of action for MLL in leukemia [126]. AF10 translocations to the vicinity of genes other than MLL also result in myeloid leukemia. A biochemical analysis of the MLL partner gene AF10 showed that its AT-hook motif is able to bind to cruciform DNA, but not to double-stranded DNA, and that it forms a homo-tetramer in vitro [114].
WRN
The Werner syndrome protein belongs to the RecQ family of evolutionarily conserved 3'→5' DNA helicases [127]. WRN encodes a single polypeptide of 162 kDa that contains 1432 amino acids. Prokaryotes and lower eukaryotes generally have one RecQ member, while higher eukaryotes possess multiple members, and five homologs have been identified in human cells. All RecQ members share a conserved helicase core with one or two additional C-terminal domains, the RQC (RecQ C-terminal) and HRDC (helicase and RNaseD C-terminal) domains. These domains bind both to proteins and DNA. Eukaryotic RecQ helicases have N- and C-terminal extensions that are involved in protein-protein interactions and have been postulated to lend unique functional characteristics to these proteins [55,128]. WRN has been shown to bind at replication fork junctions and to Holliday junction structures. Binding to junction DNA is highly specific because little or no WRN binding is visualized at other sites along these substrates [129].
Upon binding to DNA, WRN assembles into a large complex composed of four monomers.

Cruciform binding proteins and disease
The recognition of DNA junctions and cruciform structures is critical for genomic stability and for the regulation of basic cellular processes. The resolution of Holliday junctions and long cruciforms is necessary for genomic stability, and the dysregulation of the proteins involved can lead to DNA translocations, deletions, loss of genomic stability and carcinogenesis. The large numbers of proteins which bind to these DNA structures work together to keep the genome intact. We believe that the formation of cruciform structures serves as a marker for the proper timing and initiation of some very basic biological processes. The mutations and epigenetic modifications that alter the propensity for cruciform formation can have drastic consequences for cellular processes. Thus, it is unsurprising that the dysregulation of cruciform binding proteins is often associated with the pathology of disease.
As stated above, the cruciform binding proteins including p53, BRCA1, WRN and the proto-oncogenes DEK, MLL and HMG are also associated with cancer development and/or progression. Some of these proteins play such important roles that their mutation and/or inactivation result in severe genomic instability and sometimes lethality. For example, Brca1-/- mouse embryonic stem cells show spontaneous chromosome breakage, profound genomic instability and hypersensitivity to a variety of damaging agents (e.g. γ radiation), all of which suggests a defect in DNA repair. The connection between BRCA1 mutation and breast cancer is well known. P53's transcriptional regulation is fine-tuned by its timely binding to promoter elements. The formation of a cruciform structure in p53 recognition elements may be an important determinant of p53 transcription activity.
The dHMGI(Y) family of "high mobility group" nonhistone proteins comprises architectural transcription factors whose overexpression is highly correlated with carcinogenesis, increased malignancy and metastatic potential of tumors in vivo [95]. 14-3-3 proteins are related to several diseases, including cancer, Alzheimer's disease, the neurological Miller-Dieker and spinocerebellar ataxia type 1 diseases, and spongiform encephalopathy. The deletion of 14-3-3σ in human colorectal cancer cells leads to the loss of the DNA damage checkpoint control [130]. The human DEK protein was discovered as a fusion with a nuclear pore protein in a subset of patients with acute myeloid leukemia. It was also identified as an autoantigen in a relatively high percentage of patients with autoimmune diseases. In addition, DEK mRNA levels are higher in transcriptionally active and proliferating cells than in resting cells, and elevated mRNA levels are found in several transformed and cancer cells [6,7]. Werner syndrome is an autosomal recessive disorder characterized by features of premature aging and a high incidence of uncommon cancers [127]. The Werner syndrome protein (WRN) plays central roles in maintaining the genomic stability of organisms [131]. Individuals harboring mutations in WRN have a rare, autosomal recessive genetic disorder manifested by early onset of symptoms characteristic of aged individuals.

Conclusions
Cruciform structures are fundamentally important for a wide range of biological processes, including DNA transcription, replication, recombination, control of gene expression and genome organization. The putative mechanistic roles of cruciform binding proteins in transcription, DNA replication, and DNA repair are shown in Figure 5. Alternative DNA structures, including cruciforms, are often formed at sites of negatively supercoiled DNA by perfect or imperfect inverted repeats of 6 or more nucleotides. Longer DNA palindromes present a threat to genomic stability as they are recognized by junction-resolving enzymes. Shorter palindromic sequences are essential for basic processes like DNA replication and transcription. The presence of cruciform structures may also play an important role in epigenetics, in that cruciform structures are protected from DNA methylation. For example, the Dam methylase is not able to modify its GATC target site when it occurs in a cruciform or hairpin conformation. The center of a long perfect palindrome located in bacteriophage lambda has also been shown to be methylation-resistant in vivo [40]. Moreover, the centers of long palindromes are hypomethylated as compared to identical sequences in non-palindromic conformations [40]. In this way, transient cruciforms can directly influence DNA methylation and therefore provide another layer for regulation of the DNA code. Proteins that bind to cruciforms can be divided into several categories. In addition to a well-defined group of junction-resolving enzymes, we have classified cruciform binding proteins into groups involved in transcription and DNA repair (PARP, BRCA1, p53, 14-3-3), chromatin-associated proteins (DEK, BRCA1, HMG protein family, topoisomerases), and proteins involved in replication (MLL, WRN, 14-3-3, helicases) (see Table 1). Within these groups are proteins indispensable for cell viability, as well as tumor suppressors, proto-oncogenes and DNA remodeling proteins.
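The sequence requirement stated in this section, an inverted repeat with arms of 6 or more nucleotides, can be illustrated with a minimal Python sketch. It is offered only as an illustration and is not a method used in the work reviewed here. The function name `find_inverted_repeats` and the parameters `min_arm` and `max_spacer` are hypothetical, and the spacer limit of 4 nucleotides is an arbitrary assumption. The sketch reports perfect inverted repeats only, and it says nothing about whether a cruciform would actually extrude, which also depends on supercoiling and protein binding.

```python
# Watson-Crick complements; any other character never pairs.
COMP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_inverted_repeats(seq, min_arm=6, max_spacer=4):
    """Return (start, arm_length, spacer_length) for perfect inverted repeats.

    start is the 0-based index of the first base of the left arm.
    min_arm=6 follows the length quoted in the text; max_spacer is an
    illustrative assumption.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for spacer in range(max_spacer + 1):
        # i: last base of the left arm, j: first base of the right arm
        for i in range(n - spacer - 1):
            j = i + spacer + 1
            # Skip centers whose spacer could itself pair; the same repeat
            # is reported once, with the smaller spacer and longer arms.
            if spacer >= 2 and seq[i + 1] == COMP.get(seq[j - 1]):
                continue
            arm = 0
            # Extend the arms outward while the bases are complementary.
            while (i - arm >= 0 and j + arm < n
                   and seq[i - arm] == COMP.get(seq[j + arm])):
                arm += 1
            if arm >= min_arm:
                hits.append((i - arm + 1, arm, spacer))
    return hits


if __name__ == "__main__":
    # Left arm ACGTGC, spacer TTT, right arm GCACGT (its reverse complement)
    print(find_inverted_repeats("ACGTGCTTTGCACGT"))
```

Each reported tuple gives the candidate stem (arm length) and loop (spacer length) of a potential hairpin on each strand. Imperfect repeats, which the text notes can also form cruciforms, would need a mismatch-tolerant extension step that this sketch omits.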
Figure 5. Scheme of the putative mechanistic roles of cruciform binding proteins in transcription, DNA replication, and DNA repair. A) A model for the structure-specific binding of transcription factors to a cognate palindrome-type cruciform implicated in transcription. The equilibrium between classic B-DNA and the higher order cruciform favors duplex DNA, but, when cruciform binding proteins are present, they either preferentially bind to and stabilize the cruciform or bind to the classic form and convert it to the cruciform. This interaction results in both an initial melting of the DNA region covered by transcription factor and an extension of the melt region in both directions. The melting region continues to extend in response to the needs of the active transcription machinery. B) A model for the initiation of replication enhanced by extrusion to a cruciform structure. Dimeric cruciform binding proteins interact with and stabilize the cruciform structure. The replisome is assembled concomitantly and is assumed to include polymerases, single-strand binding proteins and helicases. C) Model for the influence of cruciform binding proteins on DNA structure in DNA damage regulation. Naked cruciforms are sensitive to DNA damage and are covered by proteins in order to protect these sequences from being cleaved. In these cases, a deficiency in cruciform binding proteins can lead to DNA breaks. Here, cruciform-DNA complexes can also serve as scaffolds to recruit the DNA damage machinery.

Similarly, triplet repeat expansion, a phenomenon important in several genetic diseases, including Friedreich's ataxia, cardiomyopathy, myotonic dystrophy type I and other neurological disorders, can change the spectrum of cruciform binding proteins. Lastly, single nucleotide polymorphisms and/or insertion/deletion mutations at inverted repeats located in promoter sites can also influence cruciform formation, which might be manifested through altered gene regulation. A deeper understanding of the processes related to the formation