Skip to main content


We’d like to understand how you use our websites in order to improve them. Register your interest.

The STAR RNA binding proteins GLD-1, QKI, SAM68 and SLM-2 bind bipartite RNA motifs



SAM68, SAM68-like mammalian protein 1 (SLM-1) and 2 (SLM-2) are members of the K homology (KH) and STAR (signal transduction activator of RNA metabolism) protein family. The function of these RNA binding proteins has been difficult to elucidate mainly because of lack of genetic data providing insights about their physiological RNA targets. In comparison, genetic studies in mice and C. elegans have provided evidence as to the physiological mRNA targets of QUAKING and GLD-1 proteins, two other members of the STAR protein family. The GLD-1 binding site is defined as a hexanucleotide sequence (NACUCA) that is found in many, but not all, physiological GLD-1 mRNA targets. Previously by using S ystematic E volution of L igands by EX ponential enrichment (SELEX), we defined the QUAKING binding site as a hexanucleotide sequence with an additional half-site (UAAY). This sequence was identified in QKI mRNA targets including the mRNAs for myelin basic proteins.


Herein we report using SELEX the identification of the SLM-2 RNA binding site as direct U(U/A)AA repeats. The bipartite nature of the consensus sequence was essential for SLM-2 high affinity RNA binding. The identification of a bipartite mRNA binding site for QKI and now SLM-2 prompted us to determine whether SAM68 and GLD-1 also bind bipartite direct repeats. Indeed SAM68 bound the SLM-2 consensus and required both U(U/A)AA motifs. We also confirmed that GLD-1 also binds a bipartite RNA sequence in vitro with a short RNA sequence from its tra-2 physiological mRNA target.


These data demonstrate that the STAR proteins QKI, GLD-1, SAM68 and SLM-2 recognize RNA with direct repeats as bipartite motifs. This information should help identify binding sites within physiological RNA targets.


The K homology (KH domain) is a prevalent RNA binding domain that is an evolutionarily conserved domain initially identified as a repeated sequence in the heteronuclear ribonucleoprotein particle (hnRNP) K [1]. The KH domain is a small protein module consisting of 70 to 100 amino acids and it is the second most prevalent RNA binding domain next to the RRM (RNA recognition motif) [2]. The RNA binding property of the KH domain was initially shown for FMRP, the gene product of the human fragile X syndrome and hnRNP K [3]. The KH domain is often found in multiple copies within proteins (15 in vigilin) and there is a subfamily that contains a single copy KH domain that is larger referred to as a maxi-KH domain [4].

The KH domain makes direct protein-RNA interactions with a three-dimensional β1α1α2β2β3 topology with an additional C-terminal α helix (α3) for maxi-KH domains [1]. The feature of KH domains is an invariant GXXG loop located between α12 that provides close contact with the phosphate groups such that the neighboring nucleotides can form Watson and Crick base pairing with conserved amino acids within the KH domain [5, 6]. The structure determination of the KH domains has also been solved with single-stranded DNA, demonstrating that certain KH domains may accommodate either RNA or ssDNA within their active site [1, 79].

There exists a subfamily of KH domains that contain extended loops between β1/α1 and β2/β3 and that contain an additional C-terminal helix in their topography [10]. These maxi-KH domain proteins contain conserved sequences immediately at the N- and C-terminal of the KH domain. The entire region is referred to as the STAR/GSG (signal transduction activator of RNA metabolism/GRP33, SAM68, GLD-1) domain [4, 11, 12]. Although STAR proteins contain single KH domains, dimerization is required for RNA binding [13]. The STAR proteins are mammalian Sam68, SLM-1, SLM-2, QKI, SF1, C. elegans GLD-1, Drosophila How, KEP1, Sam50 and Artemia Salina GRP33 [4]. STAR proteins have been shown to function in pre-mRNA splicing [1418], mRNA export [1921], mRNA stability [22, 23] and protein translation [2428]. Genetic evidence has implicated the STAR RNA binding proteins in many cellular processes. These include the role of the QKI isoforms in the process of myelination of the central nervous system [29], GLD-1 in the germline determination [3032], How in muscle and tendon differentiation [4], Kep1 in cell death processes [33] and Sam68 in bone marrow mesenchymal cell fate [34] and motor defects [35]. Genetic data has also implicated simple KH domain proteins FMRP in mental retardation and Nova in paraneoplastic neurologic disorders [2].

SF1 or branch point binding protein (BBP) was shown to recognize the branchpoint site RNA sequence (UACUAAC) [6, 36] and structure determination has shown that there is direct protein-RNA contact [6]. These studies have provided necessary information about the contact sites of maxi-KH domains and their similarities/differences with simple KH domains proteins such as Nova. Based on this information, Ryder and coworkers showed that GLD-1 binds a hexanucleotide sequence (NACUCA) and proposed it as the STAR binding site [37]. In a previous effort by using S ystematic E volution of L igands by EX ponential enrichment (SELEX) [38], we defined the QKI RNA binding consensus sequence to be a bipartite motif consisting of a core NACUAAY (where Y is a pyrimidine) sequence with an neighboring half-site (UAAY) [39]. In the present study, we define for the first time the RNA binding specificity of the mammalian STAR protein, SLM-2. We identified using SELEX the SLM-2 consensus sequence as a direct U(U/A)AA repeat. The bipartite nature of the consensus RNA sequence was essential for high affinity RNA binding activity to SLM-2. The identification of a bipartite mRNA binding site for QKI [39] and now for SLM-2 prompted us to further determine whether SAM68 and GLD-1 also bound bipartite direct repeats. Indeed SAM68 and GLD-1 required bipartite RNAs, demonstrating that the STAR proteins SLM-2, SAM68, QKI and GLD-1 bind direct RNA repeats as a bipartite motif in target RNAs.


The identification of the SLM-2 RNA binding site by using SELEX

To identify the binding motif for the SLM-2 RNA binding protein, we performed SELEX to enrich for high affinity RNA ligands. Bacterial recombinant SLM-2 expressed as a histidine epitope tagged fusion protein was generated and purified for the assay. Synthetic RNAs were transcribed with the T7 RNA polymerase from DNA pools of 52-nucleotide random-mers estimated at a complexity of 1.0 × 1014 and we randomly sequenced 20 RNA molecules from the initial library and noted, as expected, that each sequence was unique [39]. The transcribed RNAs were generated in the presence of 32P-α-UTP such that the amount of specific SLM-2 bound RNAs could be measured after each round. After six cycles of selection, we observed an approximately 10% of binding of the initial input (not shown), demonstrating that we indeed had enriched specific sequences. To confirm the SELEX amplification of the SLM-2 specific RNA ligands, we performed a gel electromobility shift assays (EMSA) with purified pools of RNA transcripts isolated from rounds 2, 4 and 6. The RNAs were 32P-labelled and incubated with buffer or increasing concentration of His-SLM-2. The SLM-2/RNA complexes were observed as slow migrating complexes on native gel electrophoresis (Fig. 1A). More efficient RNA binding was observed in round 6 than rounds 2 and 4 (compare the free probe remaining from lanes 2 and 6 with lane 10). After round 6, the SLM-2 bound RNAs were converted into cDNAs, subcloned and sequenced. The sequence of 43 clones revealed that 11 clones were unique (Table 1). The clones were referred to as SLM-2 response element (SRE)-1 to 11. Class I RNAs contained a bipartite motif consisting of direct repeats of the sequence U(U/A)AA (Table 1). Our data show that the selected RNA aptamers contained a bipartite motif with direct repeats and the spacing between the repeats varied from 3 (SRE-3) to 25 (SRE-7) nucleotides (Table 1 and Fig. 1B). The 3 RNAs identified that did not contain the bipartite sequence (SRE-9, -10, -11) were grouped in Class II and since ~10% of RNAs from round 6 bound SLM-2, Class II RNAs are likely to represent non-binders. No apparent secondary structure was identified in the SREs using the prediction of RNA secondary structure program MFOLD (data not shown). Taken together, we have identified a bipartite motif consisting of direct repeats of the sequence U(U/A)AA as the SLM-2 RNA binding site.

Figure 1

SLM-2 RNA ligands identified. (A) EMSAs of pooled RNAs identified in rounds 2, 4 and 6 using increasing concentrations of His-SLM-2. The protein/RNA complex was separated from the free probe on a native PAGE. The migration patterns of unbound RNAs (free probe) and protein bound RNAs (SLM-2/RNA complex) are indicated on the left. (B) The sequences of 8 unique RNAs bound to SLM-2 after six cycles of SELEX. Both identified motifs are aligned and black undermark. Illustrated, underneath the sequences is the probability matrix (graphic logo) based on all the 8 different sequences, depicting the relative frequency of each residue at each position within the selected motif.

Table 1 Selected SLM-2 bound RNA ligands

A direct repeat of U(U/A)AA defines the SLM-2 RNA binding consensus sequence

To define the characteristics of the SLM-2 RNA binding motif, we performed RNA binding assays with SRE-4 and SRE-7. We chose SRE-4 and -7 for further analysis because SRE-4 was the most frequently[40] identified RNA and SRE-7 contains a guanine-rich sequence at its 5'end in addition to the UUAA repeats. The 52 mer identified for SRE-4 was trimmed to a 38 mer conserving 7 nucleotides on the 5' and 3' end of the U(U/A)AA consensus sequence and designated this as SRE-4wt. This synthetic RNA bound SLM-2 with a high affinity dissociation constant of ~16 nM (Fig. 2A and Table 2) like the 52 mer SRE-4 sequences (not shown). The substitution of the either or both U(U/A)AA motifs with CCCC abolished SLM-2 binding (Fig. 2A and Table 2, SRE-4m1, m2, m3). Similarly, the replacement of the UAAA with UACC abolished RNA binding (SRE-4m4). These finding demonstrate that both tetra-nucleotide motifs (U(U/A)AA) are required for SLM-2 high affinity RNA binding.

Figure 2

Defining the SLM-2 response element as a bipartite RNA sequence. EMSAs with the selected SRE-4 with decreasing concentrations of recombinant His-SLM-2 (A) and the SAM68 GSG domain (B) (by a factor of 2 from 1 μM) or with buffer alone. The RNA sequence and mutants (m1-m4) used in the reaction are shown in Table 2. Migration patterns of unbound RNAs (free probe) and protein bound RNAs (protein-RNA complex) are indicated on the left.

Table 2 Binding affinity of SLM-2 for selected and mutated RNA ligands

We analyzed SRE-7 and identified a G-rich sequence that may represent a G quartet. We first proceeded by replacing the G-rich nucleotides with U-rich sequence and this had little effect on SLM-2 RNA binding activity (compare SRE-7m1 and SRE-7wt; Table 2). Interestingly, the replacement of the G-rich sequences with AU-rich sequences such as to introduce a third U(U/A)AA motif enhanced SLM-2 RNA binding to this RNA species (SRE-7m2; Table 2). The substitution of the downstream uUUAAu sequence with CGACGC abolished SLM-2 RNA binding consistent with the U(U/A)AA requirement (SRE-7m2). Numerous 5' and 3' deletions were performed and a minimal sequence of 40 nucleotides was identified containing both U(U/A)AA motifs that bound with a Kd of ~22.3 nM (SRE-7d9; Table 2). The substitution of the 5' or 3' U(U/A)AA motifs reduced the SLM-2 high affinity binding site (Table 2; SRE-7d9m2, m3), demonstrating that indeed SLM-2 binds RNA with high-affinity to direct repeats of U(U/A)AA.

SAM68 binds the SLM-2 response element

SELEX has been performed with recombinant SAM68 and a UAAA consensus was defined as a necessary RNA binding site [41]. As there is 69% sequence identity between the SLM-2 and SAM68 STAR/GSG domains [42], we tested the possibility that the SLM-2 consensus (SRE-4) may be bound by Sam68. Using EMSA with recombinant Sam68 containing only the STAR/GSG domain, we observed that indeed the GSG domain of SAM68 bound the SRE-4wt RNA aptamer, but not the variants that contain mutated U(U/A)AA motifs (Fig. 2B). There was one variant of SRE-4 (SRE-4m2) that retained some binding and this is likely due to the polyuridine stretch (UUUU) that remained between the two U(U/A)AA motifs (Table 2). These findings demonstrate that SAM68 also has the capabilities to bind a bipartite U(U/A)AA consensus.

Defining the bipartite nature of the QKI response element within the mRNAs of myelin basic protein

The mRNAs encoding the myelin basic proteins (MBP) are known QKI targets [19, 43]. The QKI RNA binding site was defined to be a core (NACUAAC) with a neighboring half-site (UAAY) [39]. The MBP QREs were defined as QRE-1 and QRE-2 [39, 44]. QRE-2 is interesting as it contains two regions with an overlapping imperfect core (underlined) and half-site (bold) (UACACAC UAAC, QRE-2:wt) as well as downstream perfect half-site (UAAC) (Fig. 3). Alternatively, region A is recognized as the imperfect half-site (bold) (UACA CACUAAC) and as the perfect core (underlined). To define the requirements of QRE-2, we performed EMSA with various combinations of region A and B. The QRE-2 sequences with regions A and B bound QKI with high affinity (QRE-2:wt, Kd ~121 nM) and the substitution of the UAAC half-site in region A or region B diminished considerably the RNA binding affinity (Fig. 3, QRE-2:m1, m2). The substitution of the UACA to GAGA in region A bound with high affinity demonstrating that region A supplies the perfect core (CACUAGG) and region B supplies the half-site (UAAC) of the bipartite motif. These findings demonstrated that region A without region was unable to serve as a high affinity site for QKI (QRE-2:m1). Ryder and Williamson showed that region A alone was bound with high affinity by QKI. We next centered region A and this considerably improved QKI binding with a Kd of ~168 nM (Fig. 3, QRE-2:m4). The substitution of either the UAUA to GAGA (QRE-2:m5) or the UAAC to GAGC (QRE-2:m6) significantly reduced QKI RNA binding (Fig. 3B). These findings define the QRE-2 as requiring a bipartite motif located in region A or in region A plus region B.

Figure 3

Defining the high affinity QRE within the MBPmRNA. (A) EMSAs of selected RNAs with increasing concentrations of recombinant GST-QKI-5 (by a factor of 2 from 2 nM) or with buffer alone. The RNAs used for the EMSAs are the MBP QRE-2 and variation mutants of each (m1-m6). Migration patterns of unbound RNAs (free probe) and QKI-5 bound RNAs (QKI-RNA complex) are indicated on the left. (B) RNA species tested in (A) are shown. The black bars denote sequences that are unaltered between the wild-type and the mutant versions.

GLD-1 binds a bipartite RNA motif containing the hexanucleotide

A high affinity RNA binding site has been defined for C. elegans GLD-1 that consists of a hexanucleotide (NACU(C/A)A) [37]. To examine whether the GLD-1 hexanucleotide sequence also requires a similar half-site, we performed EMSA assays with a segment of the tra2 and gli repeated element (TGE) containing the hexanucleotide (UACUCAU) and its neighboring half site (UAAU)(Fig. 4A, TGE-wt). GLD-1 bound this wild-type TGE sequence and a variation of it (TGE-m2) with approximate Kd ~104 nM, defining a short sequence for GLD-1 high affinity binding (Fig. 4B). These data are consistent with previous competition experiments that defined the GLD-1 Kd ~10 nM that defined the hexanucleotide as (UACU(C/A)A) [37]. The nucleotide substitution of the half-site (UAAU to GAGU) abolished RNA binding (Fig. 4B, TGE-m1), consistent with the need for a half-site in addition to the hexanucleotide. Similar binding experiments were performed with QKI and we observed that TGE-m2 is essentially a QRE bound with high affinity, whereas the wild-type TGE bound with a moderate affinity of approximately 300 nM (Fig. 4C). The TGE-m1 was not bound by QKI (Fig. 4C). In summary, these data identify the GLD-1 RNA binding motif as bipartite as observed with SLM-2, QKI, and Sam68.

Figure 4

GLD-1 binding to the tra2 /gli element analysis. (A) RNA species tested in (B) and (C) are shown. The black bars denote sequences that are unaltered between the wild-type and the mutant versions. EMSAs of the tra2/Gli element with increasing concentrations of GLD-1 (B) and QKI (C) (by a factor of 2 from 2 nM) or with buffer alone. Migration patterns of unbound RNAs (free probe) and protein/bound RNAs (GLD-1/RNA or QKI/RNA complexes) are indicated on the left.

Discussion & Conclusion

In the present study, we identified a SLM-2 consensus sequence as direct U(U/A)AA repeats using SELEX. The bipartite nature of the consensus sequence was essential for SLM-2 high affinity RNA binding. The identification of a bipartite mRNA binding site for QKI [39] and now SLM-2 prompted us to determine whether SAM68 and GLD-1 also bound bipartite direct repeats. Indeed SAM68 bound the SLM-2 consensus and required both U(U/A)AA motifs. Also, GLD-1 required sequences within the UAAY half-site in addition to its conservative consensus NACU(C/A)A, defining a GLD-1 bipartite motif. Taken together, these data demonstrate that the STAR proteins SLM-2, SAM68, QKI and GLD-1 bind direct RNA repeats.

Defining the SLM-2 RNA binding site as U(U/A)AA repeats

We identified SLM-2 in 1999 by searching databases with SAM68 sequences [42]. Independently SLM-2 (called T-STAR) was identified as an interacting protein of RBM, an RNA binding protein involved in spermatogenesis [45]. SLM-2 is known to bind homopolymeric RNA [42], localize to SAM68 nuclear bodies (SNBs) [46], regulate alternative splicing [16] and dimerize with SAM68 and SLM-1 [42]. SLM-2 is post-translationally modified to contain methylarginines [47] and phosphotyrosines, the latter impairs its ability to associate with RNA [48]. The expression of SLM-2 is mainly restricted to testis and brain, but its function in these tissues remains unknown [49].

Previously we showed that SLM-2 had a preference for poly (G) rich homopolymeric RNA [42], therefore, we searched the SELEX hits for poly (G) rich sequences that could possibly resemble a G-quartet as bound by FMRP [50]. The SLM-2 selected RNA (SRE-7) contained a variation of this sequence (GGnGGGnGGnnnnnnnGG), but its deletion did not affect SLM-2 RNA binding. Therefore, we next focused on the U(U/A)AA rich repeats that resemble the consensus identified with SAM68 SELEX [41]. Indeed we mapped the SLM-2 consensus sequence to direct repeats of the U(U/A)AA sequence, defining a SLM-2 RNA binding site as a bipartite motif. This motif is too frequently found in mRNAs especially in 3'-UTR to perform a bioinformatic analysis to identify the SLM-2 mRNA targets (not shown). Thus the specificity in SLM-2 function is most likely contributed by its tissue specific expression and post-translational modifications may alter its RNA binding specificity and/or accessibility.

Sam68 is known to bind cellular RNA as well as DNA [51]. Sam68 is known to have a preference for poly (U) and poly (A) homopolymeric RNA and this association is abrogated with tyrosine phosphorylation by Src kinases and BRK [52, 53]. Differential display and cDNA representation difference analysis identified 29 potential RNA binding targets of which 10 bind in a KH-dependent manner [54]. Sam68 binding sequences on hnRNP A2/B1 and β-actin mRNAs were mapped to UAAA and UUUUUU nucleotide motifs, respectively and both motifs occur within specific loop structures [54]. Sam68 has also been shown to transport unspliced HIV RNAs [20]. The knockout Sam68 mice are protected against the development of osteoporosis pointing towards an enhancement of the mesenchymal stem cell differentiation along the osteogenic rather than the adipocyte pathway [34]. The mice also have motor coordination defects [35]. The identification of Sam68 in these physiological processes will help direct the search for specific physiological mRNA targets. The work performed herein demonstrates that the STAR/GSG domain of Sam68 has similar RNA binding capabilities to SLM-2, as suggested by their 69% sequence identity within their STAR/GSG domains [42].

QUAKING: a regulator of myelination

The quaking viable (qkv) mice represent an animal model of dysmyelination [55]. The defect is summarized as an incomplete maturation of the myelin sheath. This is due to the lack in proper oligodendrocyte differentiation, resulting in the failure to transport intracellular myelin components such as the MBP mRNAs [55]. QKI null animals have been generated, but the embryos die at ~E9.5–10.5 day, providing little information about the role of QKI in myelination [56]. By using a gain-of-function approach with ectopic expression of the QKI isoforms, we showed previously that QKI-6 and QKI-7 promote oligodendrocyte differentiation by up-regulating p27KIP1, confirming the role for the QKI isoforms during myelination [22]. The QKI response element was defined as a core NACUAAY [44] with a neighboring UAAY [39, 44]. This led to the identification of two binding sites within the mRNAs for the MBPs [39, 44]. QRE-1 contains 3' adjacent half-sites that function as a moderate affinity site. In the present study, we demonstrate that region A in QRE-2 (Fig. 4) shown previously to mediate binding [44], becomes a better site with the presence of the half site from region B (Fig. 4). Our findings show that QRE-2 within the 3'UTR of MBP mRNAs is indeed a bipartite consensus sequence with a core NACUAAY and a neighboring UAAY.

The MBP mRNAs are localized at the distal processes of oligodendrocytes in intact tissue [57]. The factors necessary for MBP mRNA localization are oligodendrocyte-specific, as transfected MBP mRNA into non-glial cells did not properly localize to the cell membrane [58]. Studies performed in living cells by microinjection have shown that the MBP mRNA forms granules, which appear dispersed in the perikaryon and are transported down the processes [59]. MBP is not the only mRNA known to be localized to the distal processes of oligodendrocytes, as myelin oligodendrocytes basic protein (MOBP), alpha-CAMKII, tau, amyloid precursor protein (APP) and others are also transported to the site of myelination [60]. Transport and localization elements have been mapped in the 3' UTR of rat and mouse MBP mRNA. A 21-nucleotide sequence named RNA transport signal (RTS) mapped at nucleotide 794 to 814 of rat MBP or nucleotide 798 to 818 of mouse MBP has been identified as a transport element [61]. This sequence is homologous to several other localized mRNAs, suggesting a general transport signal. In rat oligodendrocytes, another localization element has been mapped to nucleotides 1130 to 1473 named the RNA localization region (RLR), but the region 667 to 953 containing the RTS and QRE-1 is sufficient for localization [61]. HnRNP A2 has been shown to be one of the component which binds the RTS sequence [62], and insertion of the RTS into GFP resulted in enhanced translation [63]. The mapping of QRE-2 (UACUAAC-13nt-UAAC) constitutes another element that may be necessary for proper export of the MBP mRNA into the cytoplasm and subsequent production of the MBP at its site of synthesis. It is likely that QKI works in combination with hnRNP A2 and the other components of the RNP granule in the proper transport of the MBP mRNA, its localization and its translation.

The C. elegans homolog of QKI is GLD-1, a known protein translation inhibitor required for germ-line differentiation [64, 65]. Many GLD-1 mRNA targets have been identified [2528] and a conservative consensus sequence of NACU(C/A)A was defined by comparing the binding specificity with SF1 [37]. Most, but not all mRNA targets [37], contain this conversed consensus sequence. The demonstration that GLD-1 like QKI requires a neighboring half-site is consistent with the ~50% sequence identity within their STAR/GSG domains.

Sam68/SLM-2 tetranucleotide versus QKI/GLD-1 hexanucleotide sequence requirements

The RNA binding domain of STAR/GSG proteins consist in a maxi-KH domain flanked by two conserved sequences (Fig. 5). The NK/QUA1 and CK/QUA2 region refer to the N- and C-terminal region, respectively, flanking the KH domain. Based on the structure of the KH domain of SF-1 associated with its binding RNA molecule U1A2C3U4A5A6C7, the CK region makes important contacts with the RNA. All STAR domain containing proteins have the most important GXXG sequence located in a loop between the two first alpha helices of the KH domain. This sequence of residues is absolutely conserved among the STAR domain proteins and makes the contact with the RNA especially with the bases U4A5A6C7. By looking closely at the residues in the CK region that make important association with the RNA bases, we find that two residues (asterix on Fig. 5) seems to confer the SLM-2/SAM68 specificity versus the QKI/GLD-1 specificity. The SLM-2/SAM68 residues are a threonine or a serine and a conserved glutamic acid while the QKI/GLD-1/SF1 residues consist in a conserved alanine and a conserved arginine. These residues make important contact with base A2 which specificity is lost in the Slm-2/Sam68 consensus binding sequence. In fact, SLM-2/Sam68 binding sequence resembles in all points to the QKI/GLD-1/SF1 core binding sequence but lacking U1A2C3 bases.

Figure 5

STAR/GSG domain protein alignment. (A) Diagram representing the structural and functional region of the STAR/GSG domain containing proteins. (B) The STAR/GSG domain of mouse SLM-2, human SAM68, mouse QKI, C. elegans GLD-1 and human SF-1 were aligned using ClustalW. Secondary structure, beta sheets and alpha helices, are shown on top of the sequences and region NK/QUA1, the KH domain and region CK/QUA2 are shown beneath the sequences. The critical loop between helices alpha 1 and alpha 2 with the GXXG sequence is also shown. (a) Based on [6] the RNA bases UACUAAC that contact with the specific SF-1 residues are numbered as follow U1A2C3U4A5A6C7. (b) Arginine 160 makes contact with U4 and A6. (c) Valine 183 makes contact with A6 and C7.

The STAR protein SF1 structure was determined and the amino acids that contact the RNA were identified [6]. Based on these contact amino acids, it explains why SF1, QKI and GLD-1 have near identical binding specificity. The Sam68, SLM-1 and SLM-2 subfamily have different amino acids in the RNA contact position and it should be possible by amino acid substitution to convert a Sam68 domain into a GLD-1 domain that will bind the NACUA(C/A)C GLD-1 consensus sequence. Lehmann-Blount & Williamson (2005) have performed such experiments and were unable by mutagenesis to identify an amino acid 'code' that would dictate GLD-1-like versus Sam68-like specificity [66]. This led them to propose that Sam68 and hence SLM-1 and SLM-2 might not be RNA binding proteins or possess an RNA binding specificity that is fundamentally unlike that of GLD-1 [66]. The identification of a high affinity RNA target for SLM-2 with the characteristics of a GLD-1/QKI bipartite motif, demonstrates that Sam68, SLM-1 and SLM-2 subfamily are indeed RNA binding proteins, but does not exclude the possibility that they may also bind ssDNA. The challenge ahead will be to identify the physiological RNA targets linking with the phenotypes observed in mammals.


SELEX assay

S ystematic E volution of L igand by EX pornential enrichement (SELEX) was performed as previously described [67]. Essentially, an oligonucleotides harboring a 52-bp random sequence surrounded by two primer binding sites, with an estimated complexity of 1 × 1015, were synthesized by (Invitrogen). The oligonucleotides were amplified by PCR using corresponding forward and reverse primers as previously described [67]. After PCR amplification, the sequences of 24 random clones were determined; each clone was unique and the overall base composition whoed similarity among the clones (average composition: A, 20%; U, 30%; C, 22%; G, 28%; data not shown). A purified DNA library (1 × 1013 molecules) was transcribed in vitro using the T7 RNA polymerase (Promega) and (α-32P)-UTP. RNA was purified from denaturing TBE-acrylamide gels, heated to 65°C fro 5 min, and precleared using TALON Metal Affinity Resin (BD Bioscience) to absorb non-specifically bound RNAs. Unbound RNAs were incubated in binding buffer (50 mM Tris-HCl (pH 8.0), 590 mM KCl) with the recombinant His-SLM-2 for 30 min, then with TALON Metal Affinity Resin for another 30 min. After four washes with binding buffer, the RNAs were eluted by TRIzol extraction (Invitrogen). The purified RNAs were ethanol precipitated and resuspended in water with RNase-free DNase for a 15 min reaction. The DNase reaction was quenched for 10 min at 65°C. Reverse transcriptions were performed using M-MLV reverse transcriptase (Promega) and a reverse oligonucleotide annealing to the 3' primer binding site. cDNAs were then generated by PCR amplification with the reverse oligonucleotide and the forward oligonucleotide annealing to the 5' primer binding site containing the T7 promoter. After round 6, the cDNAs were amplified with the reverse primer and a forward primer containing the Eco RI restriction site. The DNA fragments were digested with Eco RI and Bam HI and subcloned into pBluescript SK+ (Stratagene) for blue/white selection. Forty-three random white colonies were selected, their plasmid were purified and the SELEX sequence was identified by DNA sequencing (Genome Quebec).

RNA preparation, purification and E lectroM obility S hift A ssays (EMSAs)

RNAs were prepared by run-off in vitro transcription of oligonucleotides harboring a T7 binding site in the presence of 32P- UTP, using T7 MegaShortscript (Ambion) according to the manufacturer's protocols. RNAs were purified on TBE-acrylamide gels before use. For EMSAs, a constant concentration of 32P-labeled RNA (100 pmol) was incubated alone with buffer or with increasing or decreasing concentrations of the corresponding tested proteins in the following buffer: 20 mM HEPES (pH7.4), 330 mM KCl, 10 mM MgCl2, 0.1 mM EDTA, 0.1 mg/ml heparin and 0.01% IGEPAL CA630 (Sigma). The 30 μl reaction were incubated at room temperature for 1 h, and 3.3 μl of RNA loading dye (glycerol containing 0.25% (w/v) bromophenol blue, 0.25% (w/v) xylene cyanol) was added to each. A portion (15 μl) of each sample was separated on native Tris-glycine 8%-acrylamide gels. The gels were dried and the bound and unbound RNAs were quantified using a Storm Phosphorimager (Amersham). The fraction of bound RNA was determined and plotted using the software program Prism 3.0 (GraphPad Software).

DNA and protein preparation

Recombinant GST-QKI-5 was described previously [39]. Maltose binding protein fused to GLD-1 was a generous gift of Min-Ho Lee (University of Albany). His-SLM-2 was prepared from subcloning the coding region from GFP-SLM-2 [42] into pQE Trisystem (Qiagen) using Bam H1 and Xho I directionnal cloning. His-GSG(SAM68) was prepared by subcloning the GSG domain of mouse Sam68 into pET-18c. Protein purification was performed as per the manufacturer's instructions.


  1. 1.

    Valverde R, Edwards L, Regan L: Structure and function of KH domains. FEBS J. 2008, 275: 2712-2726. 10.1111/j.1742-4658.2008.06411.x

  2. 2.

    Lukong KE, Chang KW, Khandjian EW, Richard S: RNA-binding proteins in human genetic disease. Trends Genet. 2008

  3. 3.

    Glisovic T, Bachorik JL, Yong J, Dreyfuss G: RNA-binding proteins and post-transcriptional gene regulation. FEBS Lett. 2008, 582: 1977-1986. 10.1016/j.febslet.2008.03.004

  4. 4.

    Volk T, Israeli D, Nir R, Toledano-Katchalski H: Tissue development and RNA control: "HOW" is it coordinated?. Trends Genet. 2008, 24: 94-101. 10.1016/j.tig.2007.11.009

  5. 5.

    Lewis HA, Musunuru K, Jensen KB, Edo C, Chen H, Darnell RB, Burley SK: Sequence-specific RNA binding by a Nova KH domain: implications for paraneoplastic disease and the fragile X syndrome. Cell. 2000, 100: 323-332. 10.1016/S0092-8674(00)80668-6

  6. 6.

    Liu Z, Luyten I, Bottomley MJ, Messias AC, S H-M, R S, Zanier K, Kramer AMS: Structural basis for recognition of the intron branch site RNA by splicing factor 1. Science. 2001, 294: 1098-1102. 10.1126/science.1064719

  7. 7.

    Backe PH, Messias AC, Ravelli RB, Sattler M, Cusack S: X-ray crystallographic and NMR studies of the third KH domain of hnRNP K in complex with single-stranded nucleic acids. Structure. 2005, 13: 1055-1067. 10.1016/j.str.2005.04.008

  8. 8.

    Du Z, Lee JK, Tjhen R, Li S, Pan H, Stroud RM, James TL: Crystal structure of the first KH domain of human poly(C)-binding protein-2 in complex with a C-rich strand of human telomeric DNA at 1.7 A. J Biol Chem. 2005, 280: 38823-38830. 10.1074/jbc.M508183200

  9. 9.

    Braddock DT, Baber JL, Levens D, Clore GM: Molecular basis of sequence-specific single-stranded DNA recognition by KH domains: solution structure of a complex between hnRNP K KH3 and single-stranded DNA. EMBO J. 2002, 21: 3476-3485. 10.1093/emboj/cdf352

  10. 10.

    Di Fruscio M, Chen T, Bonyadi S, Lasko P, Richard S: The identification of two Drosophila KH domain proteins: KEP1 and SAM are members of the Sam68 family of GSG domain proteins. J Biol Chem. 1998, 273: 30122-30130. 10.1074/jbc.273.46.30122

  11. 11.

    Lukong KE, Richard S: Sam68, the KH domain-containing superSTAR. Biochim Biophys Acta. 2003, 1653: 73-86.

  12. 12.

    Vernet C, Artzt K: STAR, a gene family involved in signal transduction and activation of RNA. Trends in Genet. 1997, 13: 479-484. 10.1016/S0168-9525(97)01269-9.

  13. 13.

    Chen T, Damaj BB, Herrera C, Lasko P, Richard S: Self-association of the single-KH-domain family members Sam68, GRP33, GLD-1, and Qk1: role of the KH domain. Mol Cell Biol. 1997, 17 (10): 5707-5718.

  14. 14.

    Paronetto MP, Achsel T, Massiello A, Chalfant CE, Sette C: The RNA-binding protein Sam68 modulates the alternative splicing of Bcl-x. J Cell Biol. 2007, 176: 929-939. 10.1083/jcb.200701005

  15. 15.

    Arning S, Gruter P, Bilbe G, Kramer A: Mammalian splicing factor SF1 is encoded by variant cDNAs and binds to RNA. RNA. 1996, 2: 794-810.

  16. 16.

    Stoss O, Novoyatleva T, Gencheva M, Olbrich M, Benderska N, Stamm S: P59 fyn-mediated phosphorylation regulates the activity of the tissue-specific splicing factor rSLM-1. Mol Cell Neurosci. 2004, 27: 8-21. 10.1016/j.mcn.2004.04.011

  17. 17.

    Matter N, Herrlich P, Konig H: Signal-dependent regulation of splicing via phosphorylation of Sam68. Nature. 2002, 420: 691-695. 10.1038/nature01153

  18. 18.

    Chawla G, Lin CH, Han A, Shiue L, Ares MJ, Black DL: Sam68 regulates a set of alternatively spliced exons during neurogenesis. Mol Cell Biol. 2008

  19. 19.

    Larocque D, Pilotte J, Chen T, Cloutier F, Massie B, Pedraza L, Couture R, Lasko P, Almazan G, Richard S: Nuclear retention of MBP mRNAs in the Quaking viable mice. Neuron. 2002, 36: 815-829. 10.1016/S0896-6273(02)01055-3

  20. 20.

    Reddy TR, Xu W, Mau JKL, Goodwin CD, Suhasini M, Tang H, Frimpong K, Rose DW, Wong-Staal F: Inhibition of HIV replication by dominant negative mutants of Sam68, a functional homolog of HIV-1 Rev. Nature Medicine. 1999, 5: 635-642. 10.1038/9479

  21. 21.

    Coyle JH, Guzik BW, Bor YC, Jin L, Eisner-Smerage L, Taylor SJ, Rekosh D, Hammarskjold ML: Sam68 enhances the cytoplasmic utilization of intron-containing RNA and is functionally regulated by the nuclear kinase Sik/BRK. Mol Cell Biol. 2003, 23: 92-103. 10.1128/MCB.23.1.92-103.2003

  22. 22.

    Larocque D, Galarneau A, Liu HN, Scott M, Almazan G, Richard S: Protection of the p27KIP1 mRNA by quaking RNA binding proteins promotes oligodendrocyte differentiation. Nat Neurosci. 2005, 8: 27-33. 10.1038/nn1359

  23. 23.

    Nabel-Rosen H, Dorevitch N, Reuveny A, Volk T: The balance between two isoforms of the Drosophila RNA-binding protein How controls tendon cell differentiation. Mol Cell. 1999, 4: 573-584. 10.1016/S1097-2765(00)80208-7

  24. 24.

    Saccomanno L, Loushin C, Jan E, Punkay E, Artzt K, Goodwin EB: The STAR protein QKI-6 is a translational repressor. Proc Natl Acad Sci USA. 1999, 96 (22): 12605-12610. 10.1073/pnas.96.22.12605

  25. 25.

    Jan E, Motzny CK, Graves LE, Goodwin EB: The STAR protein, GLD-1, is a translational regulator of sexual identity in Caenorhabditis elegans. EMBO J. 1999, 18: 258-269. 10.1093/emboj/18.1.258

  26. 26.

    Lee MH, Schedl T: Translation repression by GLD-1 protects its mRNA targets from nonsense mediated mRNA decay in C. elegans. Genes & Dev. 2004, 18: 1047-1059. 10.1101/gad.1188404

  27. 27.

    Schumacher B, Hanazawa M, Lee M-H, Nayak S, Volkmann K, Hofmann R, Hengartner M, Schedl T, Gartner A: Translational Repression of C. elegans p53 by GLD-1 Regulates DNA Damage-Induced Apoptosis. Cell. 2005, 120: 357-368. 10.1016/j.cell.2004.12.009

  28. 28.

    Lee M-H, Schedl T: Identification of in vivo mRNA targets of GLD-1, a maxi-KH motif containing protein required for C. elegans germ cell development. Genes & Dev. 2001, 15: 2408-2420. 10.1101/gad.915901

  29. 29.

    Larocque D, Richard S: Quaking KH domain proteins as regulators of glial cell fate and differentiation. RNA Biology. 2005, 2: 37-40.

  30. 30.

    Francis R, Barton MK, Kimbel J, Schedl T: Control of oogenesis, germline proliferation and sex determination by the C. elegans gene gld-1. Genetics. 1995, 139: 579-606.

  31. 31.

    Francis R, Maine E, Schedl T: Gld-1 a cell-type specific tumor suppressor gene in C. elegans. Genetics. 1995, 139: 607-630.

  32. 32.

    Crittenden SL, Bernstein DS, Bachorik JL, Thompson BE, Gallegos M, Petcherski AG, Moulder G, Barstead R, Wickens M, Kimble J: A conserved RNA-binding protein controls germline stem cells in Caenorhabditis elegans. Nature. 2002, 417: 660-663. 10.1038/nature754

  33. 33.

    Di Fruscio M, Styhler S, Wikholm E, Boulanger MC, Lasko P, Richard S: Kep1 interacts genetically with dredd/caspase-8, and kep1 mutants alter the balance of dredd isoforms. Proc Natl Acad Sci USA. 2003, 100 (4): 1814-1819. 10.1073/pnas.0236048100

  34. 34.

    Richard S, Torabi N, Franco GV, Tremblay GA, Chen T, Vogel G, Morel M, Cleroux P, Forget-Richard A, Komarova S: Ablation of the Sam68 RNA binding protein protects mice from age-related bone loss. 2005, 1: e74-

  35. 35.

    Lukong KE, Richard S: Motor coordination defects in mice deficient for the Sam68 RNA-binding protein. Behav Brain Res. 2008, 189: 357- 10.1016/j.bbr.2008.01.010

  36. 36.

    Peled-Zehavi H, Berglund JA, Rosbash M, Frankel AD: Recognition of RNA branch point sequences by the KH domain of splicing factor 1 (mammalian branch point binding protein) in a splicing factor complex. Mol Cell Biol. 2001, 21: 5232-5241. 10.1128/MCB.21.15.5232-5241.2001

  37. 37.

    Ryder SP, Frater LA, Abramovitz DL, Goodwin EB, Williamson JR: RNA target specificity of the STAR/GSG domain post-transcriptional regulatory protein GLD-1. Nat Struct Mol Biol. 2004, 11: 20-28. 10.1038/nsmb706

  38. 38.

    Tuerk C, Gold L: Systemic evolution of ligands by expontential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science. 1990, 249: 505-510. 10.1126/science.2200121

  39. 39.

    Galarneau A, Richard S: Target RNA motif and target mRNAs of the Quaking STAR protein. Nat Struct Mol Biol. 2005, 12: 691-698. 10.1038/nsmb963

  40. 40.

    Boisvert FM, Hendzel MJ, Masson JY, Richard S: Methylation of MRE11 regulates its nuclear compartmentalization. Cell Cycle. 2005, 4: 981-989.

  41. 41.

    Lin Q, Taylor SJ, Shalloway D: Specificity and determinants of Sam68 RNA binding. J Biol Chem. 1997, 272: 27274-27280. 10.1074/jbc.272.43.27274

  42. 42.

    Di Fruscio M, Chen T, Richard S: Two novel Sam68-like mammalian proteins SLM-1 and SLM-2: SLM-1 is a Src substrate during mitosis. Proc Natl Acad Sci USA. 1999, 96: 2710-2715. 10.1073/pnas.96.6.2710

  43. 43.

    Li Z, Zhang Y, Li D, Feng Y: Destabilization and mislocalization of the myelin basic protein mRNAs in quaking dysmyelination lacking the Qk1 RNA-binding proteins. J Neurosci. 2000, 20: 4944-4953.

  44. 44.

    Ryder SP, Williamson JR: Specificity of the STAR/GSG domain protein Qk1: implications for the regulation of myelination. RNA. 2004, 10: 1449-1458. 10.1261/rna.7780504

  45. 45.

    Venables JP, Vernet C, Chew SL, Elliot DJ, Cowmeadow RB, Wu J, Cooke HJ, Artzt K, Eperon IC: T-STAR/ETOILE: a novel relative of Sam68 that interacts with an RNA-binding protein implicated in spermatogenesis. Hum Mol Genetics. 1999, 8: 959-969. 10.1093/hmg/8.6.959.

  46. 46.

    Chen T, Boisvert FM, Bazett-Jones DP, Richard S: A role for the GSG domain in localizing Sam68 to novel nuclear structures in cancer cell lines. Mol Biol Cell. 1999, 10 (9): 3015-3033.

  47. 47.

    Côté J, Boisvert FM, Boulanger MC, Bedford MT, Richard S: Sam68 RNA binding protein is an in vivo substrate for protein arginine N-methyltransferase 1. Mol Biol Cell. 2003, 14 (1): 274-287. 10.1091/mbc.E02-08-0484

  48. 48.

    Haegebarth A, Heap D, Bie W, Derry JJ, Richard S, Tyner AL: The nuclear tyrosine kinase BRK/Sik phosphorylates and inhibits the RNA-binding activities of the Sam68-like mammalian proteins SLM-1 and SLM-2. J Biol Chem. 2004, 279: 54398-54400. 10.1074/jbc.M409579200

  49. 49.

    Elliott DJ: The role of potential splicing factors including RBMY, RBMX, hnRNPG-T and STAR proteins in spermatogenesis. Int J Androl. 2004, 27: 328-334. 10.1111/j.1365-2605.2004.00496.x

  50. 50.

    Darnell JC, Jensen KB, Jin P, Brown V, Warren ST, Darnell RB: Fragile X mental retardation protein targets G quartet mRNAs important for neuronal function. Cell. 2001, 107: 489-499. 10.1016/S0092-8674(01)00566-9

  51. 51.

    Wong G, Muller O, Clark R, Conroy L, Moran MF, Polakis P, McCormick F: Molecular cloning and nucleic acid binding properties of the GAP-associated tyrosine phosphoprotein p62. Cell. 1992, 69 (3): 551-558. 10.1016/0092-8674(92)90455-L

  52. 52.

    Wang LL, Richard S, Shaw AS: p62 association with RNA is regulated by tyrosine phosphorylation. J Biol Chem. 1995, 270: 2010-2013. 10.1074/jbc.270.5.2010

  53. 53.

    Derry JJ, Richard S, Carvajal HV, Ye X, Vasioukhin V, Cochrane AW, Chen T, Tyner AL: Sik (BRK) phosphorylates Sam68 in the nucleus and negatively regulates its RNA binding activity. Mol Cell Biol. 2000, 20: 6114-6126. 10.1128/MCB.20.16.6114-6126.2000

  54. 54.

    Itoh M, Haga I, Li Q-H, Fujisawa J-I: Identification of cellular mRNA targets for RNA-binding protein Sam68. Nucl Acids Res. 2002, 30: 5452-5464. 10.1093/nar/gkf673

  55. 55.

    Chenard CA, Richard S: New implications for the QUAKING RNA binding protein in human disease. J Neurosci Res. 2008, 86: 233-242. 10.1002/jnr.21485

  56. 56.

    Li Z, Takakura N, Oike Y, Imanaka T, Araki K, Suda T, Kaname T, Kondo T, Abe K, Yamamura K: Defective smooth muscle development in qkI-deficient mice. Dev Growth Differ. 2003, 45 (5–6): 449-462. 10.1111/j.1440-169X.2003.00712.x

  57. 57.

    Verity AN, Campagnoni AT: Regional expression of myelin protein genes in the developing mouse brain: In situ hybridization studies. J Neurosci Res. 1988, 21: 238-248. 10.1002/jnr.490210216

  58. 58.

    Boccaccio GL, Colman DR: Myelin basic protein mRNA localization and polypeptide targeting. J Neurosci Res. 1995, 42: 277-286. 10.1002/jnr.490420216

  59. 59.

    Ainger K, Avossa D, Morgan F, Hill SJ, Barry C, Barbarese E, Carson JH: Transport and localization of exogenous myelin basic protein mRNA microinjected into oligodendrocytes. J Cell Biol. 1993, 123: 431-441. 10.1083/jcb.123.2.431

  60. 60.

    Kindler S, Wang H, Richter D, Tiedge H: RNa transport and local control of translation. Annu Rev Cell Dev Biol. 2005, 21: 223-245. 10.1146/annurev.cellbio.21.122303.120653

  61. 61.

    Ainger K, Avossa D, Diana AS, Barry C, Barbarese E, Carson JH: Transport and localization elements in myelin basic protein mRNA. J Cell Biol. 1997, 138: 1077-1087. 10.1083/jcb.138.5.1077

  62. 62.

    Hoek KS, Kidd GJ, Carson JH, Smith R: hnRNP A2 selectively binds the cytoplasmic transport sequence of myelin basic protein mRNA. Biochemistry. 1998, 37: 7021-7029. 10.1021/bi9800247

  63. 63.

    Kwon S, Barbarese E, Carson JH: The cis-acting RNA trafficking signal from myelin basic protein mRNA and its cognate trans-acting ligand hnRNP A2 enhance cap-dependent translation. J Cell Biol. 1999, 147: 247-256. 10.1083/jcb.147.2.247

  64. 64.

    Lee MH, Schedl T: RNA-binding proteins. WormBook. 2006, 18: 1-13.

  65. 65.

    Kimble J, Crittenden SL: Controls of germline stem cells, entry into meiosis, and the sperm/oocyte decision in Caenorhabditis elegans. Annu Rev Cell Dev Biol. 2007, 23: 405-433. 10.1146/annurev.cellbio.23.090506.123326

  66. 66.

    Lehmann-Blount KA, Williamson JR: Shape-specific nucleotide binding of single-stranded RNA by the GLD-1 STAR domain. J Mol Biol. 2005, 346: 91-104. 10.1016/j.jmb.2004.11.049

  67. 67.

    Buckanovich RJ, Darnell RB: The Neuronal RNA Binding Protein Nova-1 Recognizes Specific RNA Targets In Vitro and In Vivo. Mol Cell Biol. 1997, 17: 3194-3201.

Download references


We thank Min-Ho Lee for recombinant GLD-1 protein and stimulating discussions. This work was supported by MT-13377 from the Canadian Institutes of Health Research (CIHR) and the Multiple Sclerosis Society of Canada. A.G. was a Research Student of the National Cancer Institute of Canada supported with funds provided by the Terry Fox Run. S.R. is a Chercheur-National of the Fonds de la recherche en Santé du Québec.

Author information



Corresponding author

Correspondence to Stéphane Richard.

Additional information

Authors' contributions

AG performed the experiments and analyzed the data. AG and SR conceived the experiments and wrote the manuscript.

Authors’ original submitted files for images

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and Permissions

About this article

Cite this article

Galarneau, A., Richard, S. The STAR RNA binding proteins GLD-1, QKI, SAM68 and SLM-2 bind bipartite RNA motifs. BMC Molecular Biol 10, 47 (2009).

Download citation


  • Myelin Basic Protein
  • Star Protein
  • Talon Metal Affinity Resin
  • Myelin Basic Protein mRNA
  • Hexanucleotide Sequence