The STAR RNA binding proteins GLD-1, QKI, SAM68 and SLM-2 bind bipartite RNA motifs

Background: SAM68, SAM68-like mammalian protein 1 (SLM-1) and 2 (SLM-2) are members of the K homology (KH) and STAR (signal transduction activator of RNA metabolism) protein family. The function of these RNA binding proteins has been difficult to elucidate, mainly because of the lack of genetic data providing insight into their physiological RNA targets. In comparison, genetic studies in mice and C. elegans have provided evidence as to the physiological mRNA targets of QUAKING and GLD-1, two other members of the STAR protein family. The GLD-1 binding site is defined as a hexanucleotide sequence (NACUCA) that is found in many, but not all, physiological GLD-1 mRNA targets. Previously, using Systematic Evolution of Ligands by EXponential enrichment (SELEX), we defined the QUAKING binding site as a hexanucleotide sequence with an additional half-site (UAAY). This sequence was identified in QKI mRNA targets, including the mRNAs for myelin basic proteins.

Results: Using SELEX, we report herein the identification of the SLM-2 RNA binding site as direct U(U/A)AA repeats. The bipartite nature of the consensus sequence was essential for SLM-2 high affinity RNA binding. The identification of a bipartite mRNA binding site for QKI and now SLM-2 prompted us to determine whether SAM68 and GLD-1 also bind bipartite direct repeats. Indeed, SAM68 bound the SLM-2 consensus and required both U(U/A)AA motifs. We also confirmed in vitro that GLD-1 binds a bipartite RNA sequence, using a short RNA sequence from its physiological mRNA target tra-2.

Conclusion: These data demonstrate that the STAR proteins QKI, GLD-1, SAM68 and SLM-2 recognize RNA with direct repeats as bipartite motifs. This information should help identify binding sites within physiological RNA targets.


Background
The K homology (KH) domain is a prevalent, evolutionarily conserved RNA binding domain initially identified as a repeated sequence in heterogeneous nuclear ribonucleoprotein (hnRNP) K [1]. The KH domain is a small protein module of 70 to 100 amino acids and is the second most prevalent RNA binding domain after the RRM (RNA recognition motif) [2]. The RNA binding property of the KH domain was initially shown for hnRNP K and for FMRP, the gene product associated with human fragile X syndrome [3]. The KH domain is often found in multiple copies within proteins (15 in vigilin), and there is a subfamily that contains a single, larger KH domain referred to as a maxi-KH domain [4].
The KH domain makes direct protein-RNA contacts through a three-dimensional β1α1α2β2β3 topology, with an additional C-terminal α helix (α3) in maxi-KH domains [1]. A defining feature of KH domains is an invariant GXXG loop located between α1 and α2 that makes close contact with the phosphate groups, such that the neighboring nucleotides can form Watson-Crick-like hydrogen bonds with conserved amino acids within the KH domain [5,6]. Structures of KH domains have also been solved in complex with single-stranded DNA, demonstrating that certain KH domains may accommodate either RNA or ssDNA within their binding site [1,7-9].
There exists a subfamily of KH domains that contain extended loops between β1/α1 and β2/β3 and an additional C-terminal helix in their topology [10]. These maxi-KH domain proteins contain conserved sequences immediately N- and C-terminal to the KH domain. The entire region is referred to as the STAR/GSG (signal transduction activator of RNA metabolism/GRP33, SAM68, GLD-1) domain [4,11,12]. Although STAR proteins contain single KH domains, dimerization is required for RNA binding [13]. The STAR proteins are mammalian Sam68, SLM-1, SLM-2, QKI, SF1, C. elegans GLD-1, Drosophila How, KEP1, Sam50 and Artemia salina GRP33 [4]. STAR proteins have been shown to function in pre-mRNA splicing [14-18], mRNA export [19-21], mRNA stability [22,23] and protein translation [24-28]. Genetic evidence has implicated the STAR RNA binding proteins in many cellular processes. These include the role of the QKI isoforms in myelination of the central nervous system [29], GLD-1 in germline determination [30-32], How in muscle and tendon differentiation [4], Kep1 in cell death processes [33] and Sam68 in bone marrow mesenchymal cell fate [34] and motor defects [35]. Genetic data have also implicated the simple KH domain proteins FMRP in mental retardation and Nova in paraneoplastic neurologic disorders [2]. SF1, or branch point binding protein (BBP), was shown to recognize the branchpoint site RNA sequence (UACUAAC) [6,36], and structure determination has shown that there is direct protein-RNA contact [6]. These studies have provided necessary information about the contact sites of maxi-KH domains and their similarities to and differences from simple KH domain proteins such as Nova. Based on this information, Ryder and coworkers showed that GLD-1 binds a hexanucleotide sequence (NACUCA) and proposed it as the STAR binding site [37].
In a previous effort using Systematic Evolution of Ligands by EXponential enrichment (SELEX) [38], we defined the QKI RNA binding consensus sequence to be a bipartite motif consisting of a core NACUAAY sequence (where Y is a pyrimidine) with a neighboring half-site (UAAY) [39]. In the present study, we define for the first time the RNA binding specificity of the mammalian STAR protein SLM-2. Using SELEX, we identified the SLM-2 consensus sequence as a direct U(U/A)AA repeat. The bipartite nature of the consensus RNA sequence was essential for high affinity RNA binding by SLM-2. The identification of a bipartite mRNA binding site for QKI [39] and now for SLM-2 prompted us to further determine whether SAM68 and GLD-1 also bind bipartite direct repeats. Indeed, SAM68 and GLD-1 required bipartite RNAs, demonstrating that the STAR proteins SLM-2, SAM68, QKI and GLD-1 bind direct RNA repeats as a bipartite motif in target RNAs.

The identification of the SLM-2 RNA binding site by using SELEX
To identify the binding motif for the SLM-2 RNA binding protein, we performed SELEX to enrich for high affinity RNA ligands. Bacterial recombinant SLM-2 expressed as a histidine epitope tagged fusion protein was generated and purified for the assay. Synthetic RNAs were transcribed with T7 RNA polymerase from DNA pools of 52-nucleotide random-mers with an estimated complexity of 1.0 × 10^14. We randomly sequenced 20 RNA molecules from the initial library and noted, as expected, that each sequence was unique [39]. The RNAs were transcribed in the presence of [α-32P]UTP so that the amount of RNA specifically bound by SLM-2 could be measured after each round. After six cycles of selection, approximately 10% of the input RNA was bound (not shown), demonstrating that we had indeed enriched specific sequences. To confirm the SELEX amplification of SLM-2 specific RNA ligands, we performed electrophoretic mobility shift assays (EMSA) with purified pools of RNA transcripts isolated from rounds 2, 4 and 6. The RNAs were 32P-labelled and incubated with buffer or increasing concentrations of His-SLM-2. The SLM-2/RNA complexes were observed as slow migrating complexes on native gel electrophoresis (Fig. 1A). More efficient RNA binding was observed in round 6 than in rounds 2 and 4 (compare the free probe remaining in lanes 2 and 6 with lane 10). After round 6, the SLM-2 bound RNAs were converted into cDNAs, subcloned and sequenced. Sequencing of 43 clones revealed 11 unique sequences (Table 1). The clones were referred to as SLM-2 response element (SRE)-1 to 11. Class I RNAs contained a bipartite motif consisting of direct repeats of the sequence U(U/A)AA, with the spacing between the repeats varying from 3 (SRE-3) to 25 (SRE-7) nucleotides (Table 1 and Fig. 1B).
The 3 RNAs that did not contain the bipartite sequence (SRE-9, -10, -11) were grouped in Class II; since only ~10% of the RNAs from round 6 bound SLM-2, Class II RNAs are likely to represent non-binders. No apparent secondary structure was identified in the SREs using the RNA secondary structure prediction program MFOLD (data not shown). Taken together, we have identified a bipartite motif consisting of direct repeats of the sequence U(U/A)AA as the SLM-2 RNA binding site.
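The bipartite definition above lends itself to a simple sequence scan. The following is a minimal sketch, not the analysis used in this study: it reports every pair of U(U/A)AA tetramers separated by a spacer of a given length. The function names, the example sequence and the use of the 3 to 25 nucleotide spacing range as a hard cutoff are assumptions made for illustration.

```python
import re

# U(U/A)AA tetranucleotide: UUAA or UAAA. The lookahead lets overlapping
# occurrences (for example UUAAA) be reported as separate hits.
MOTIF = re.compile(r"(?=(U[UA]AA))")


def find_motifs(rna):
    """Return (start, motif) for every U(U/A)AA occurrence in the sequence."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(rna)]


def bipartite_pairs(rna, min_gap=3, max_gap=25):
    """Return (start1, motif1, start2, motif2, gap) for each motif pair.

    The default gap bounds mirror the spacing observed among the selected
    SREs; using them as a strict filter is an assumption of this sketch.
    """
    hits = find_motifs(rna)
    pairs = []
    for i, (s1, m1) in enumerate(hits):
        for s2, m2 in hits[i + 1:]:
            gap = s2 - (s1 + 4)  # nucleotides between the two tetramers
            if min_gap <= gap <= max_gap:
                pairs.append((s1, m1, s2, m2, gap))
    return pairs


# Hypothetical sequence for illustration; it is not one of the SREs in Table 1.
example = "GGCUUAACGCGCUAAAGC"
print(bipartite_pairs(example))
```

Given how common short A/U-rich tetramers are, a scan of this kind is expected to return many hits in long transcripts, so it is better suited to checking candidate sites than to predicting targets.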

A direct repeat of U(U/A)AA defines the SLM-2 RNA binding consensus sequence
To define the characteristics of the SLM-2 RNA binding motif, we performed RNA binding assays with SRE-4 and SRE-7. We chose SRE-4 and -7 for further analysis because SRE-4 was the most frequently [40] identified RNA and SRE-7 contains a guanine-rich sequence at its 5' end in addition to the UUAA repeats. The 52 mer identified for SLM-2 RNA ligands identified  Table 2, SRE-4m1, m2, m3). Similarly, the replacement of the UAAA with UACC abolished RNA binding (SRE-4m4). These findings demonstrate that both tetranucleotide motifs (U(U/A)AA) are required for SLM-2 high affinity RNA binding.
We analyzed SRE-7 and identified a G-rich sequence that may represent a G quartet. We first replaced the G-rich nucleotides with a U-rich sequence, and this had little effect on SLM-2 RNA binding activity (compare SRE-7m1 and SRE-7wt; Table 2). Interestingly, replacing the G-rich sequences with AU-rich sequences so as to introduce a third U(U/A)AA motif enhanced SLM-2 binding to this RNA species (SRE-7m2; Table 2). The substitution of the downstream uUUAAu sequence with CGACGC abolished SLM-2 RNA binding, consistent with the U(U/A)AA requirement (SRE-7m2). Numerous 5' and 3' deletions were performed, and a minimal sequence of 40 nucleotides containing both U(U/A)AA motifs was identified that bound with a Kd of ~22.3 nM (SRE-7d9; Table 2). The substitution of the 5' or 3' U(U/A)AA motif reduced SLM-2 high affinity binding (Table 2; SRE-7d9m2, m3), demonstrating that SLM-2 indeed binds with high affinity to direct repeats of U(U/A)AA.

SAM68 binds the SLM-2 response element
SELEX has been performed with recombinant SAM68, and a UAAA consensus was defined as a necessary RNA binding site [41]. As there is 69% sequence identity between the SLM-2 and SAM68 STAR/GSG domains [42], we tested the possibility that the SLM-2 consensus (SRE-4) may be bound by Sam68. Using EMSA with recombinant Sam68 containing only the STAR/GSG domain, we observed that the GSG domain of SAM68 indeed bound the SRE-4wt RNA aptamer, but not the variants that contain mutated U(U/A)AA motifs (Fig. 2B). One variant of SRE-4 (SRE-4m2) retained some binding, likely due to the polyuridine stretch (UUUU) that remained between the two U(U/A)AA motifs (Table 2). These findings demonstrate that SAM68 can also bind a bipartite U(U/A)AA consensus.

Defining the bipartite nature of the QKI response element within the mRNAs of myelin basic protein
The mRNAs encoding the myelin basic proteins (MBP) are known QKI targets [19,43]. The QKI RNA binding site was defined to be a core (NACUAAC) with a neighboring half-site (UAAY) [39]. The MBP QREs were defined as QRE-1 and QRE-2 [39,44]. QRE-2 is interesting as it contains two regions: one with an overlapping imperfect core (underlined) and half-site (bold) (UACACACUAAC, QRE-2:wt) and one with a downstream perfect half-site (UAAC) (Fig. 3). Alternatively, region A can be read as an imperfect half-site (bold) and a perfect core (underlined) (UACACACUAAC). To define the requirements of QRE-2, we performed EMSA with various combinations of regions A and B. The QRE-2 sequence with regions A and B bound QKI with high affinity (QRE-2:wt, Kd ~121 nM), and the substitution of the UAAC half-site in region A or region B considerably diminished the RNA binding affinity (Fig. 3, QRE-2:m1, m2). The RNA with the UACA to GAGA substitution in region A bound with high affinity, demonstrating that region A supplies the perfect core (CACUAGG) and region B supplies the half-site (UAAC) of the bipartite motif. These findings demonstrated that region A without region B was unable to serve as a high affinity site for QKI (QRE-2:m1). Ryder and Williamson showed that region A alone was bound with high affinity by QKI. We next centered region A and this considerably improved QKI binding with a Kd of ~168 nM (Fig. 3, QRE-2:m4). The substitution of either the UAUA to GAGA (QRE-2:m5) or the UAAC to GAGC (QRE-2:m6) significantly reduced QKI RNA binding (Fig. 3B). These findings define the QRE-2 as requiring a bipartite motif located in region A or in region A plus region B.

Table legend (columns: Ligand, Sequence, n). The UAAA and UUAA conserved motifs are shown in bold. n = number of times identified.

Figure 2. Defining the SLM-2 response element as a bipartite RNA sequence. EMSAs with the selected SRE-4 with decreasing concentrations of recombinant His-SLM-2 (A) and the SAM68 GSG domain (B) (by a factor of 2 from 1 μM) or with buffer alone. The RNA sequence and mutants (m1-m4) used in the reaction are shown in Table 2. Migration patterns of unbound RNAs (free probe) and protein bound RNAs (protein-RNA complex) are indicated on the left.
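The bipartite QKI response element discussed above (a core NACUAAY with a neighboring UAAY half-site) can be expressed as a degenerate-motif search. The sketch below is illustrative only and was not part of this study: the helper names, the 20-nucleotide maximum spacing, the choice to accept a half-site on either side of the core, and the poly(G) spacer in the example are all assumptions.

```python
import re

# IUPAC nucleotide codes needed for the motifs discussed in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "N": "[ACGU]",  # any base
    "Y": "[CU]",    # pyrimidine
    "R": "[AG]",    # purine
    "M": "[AC]",    # A or C, e.g. NACU(C/A)A written as NACUMA
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as NACUAAY into a regex fragment."""
    return "".join(IUPAC[base] for base in motif)


def find_sites(rna, motif):
    """Return (start, matched_sequence) for all matches, overlaps included."""
    pattern = re.compile("(?=(%s))" % iupac_to_regex(motif))
    return [(m.start(), m.group(1)) for m in pattern.finditer(rna)]


def bipartite_sites(rna, core="NACUAAY", half="UAAY", max_gap=20):
    """Pair each core with half-sites lying within max_gap nucleotides.

    Half-sites that overlap the core itself are ignored. The max_gap
    default is a hypothetical value chosen for this sketch.
    """
    rna = rna.upper().replace("T", "U")
    results = []
    for c_start, c_seq in find_sites(rna, core):
        c_end = c_start + len(core)
        for h_start, h_seq in find_sites(rna, half):
            h_end = h_start + len(half)
            if h_start >= c_end:
                gap = h_start - c_end      # half-site downstream of the core
            elif h_end <= c_start:
                gap = c_start - h_end      # half-site upstream of the core
            else:
                continue                   # overlaps the core
            if gap <= max_gap:
                results.append((c_start, c_seq, h_start, h_seq, gap))
    return results


# Core, 13-nt spacer and half-site; the poly(G) spacer is a placeholder,
# not the actual MBP sequence.
example = "UACUAAC" + "G" * 13 + "UAAC"
print(bipartite_sites(example))
```

Other STAR motifs can be passed through the `core` and `half` parameters, for example `core="NACUMA"` for the GLD-1 hexanucleotide, again purely as an illustration.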

GLD-1 binds a bipartite RNA motif containing the hexanucleotide
A high affinity RNA binding site has been defined for C. elegans GLD-1 that consists of a hexanucleotide (NACU(C/A)A) [37]. To examine whether the GLD-1 hexanucleotide sequence also requires a similar half-site, we performed EMSAs with a segment of the tra2 and gli repeated element (TGE) containing the hexanucleotide (UACUCAU) and its neighboring half-site (UAAU) (Fig. 4A, TGE-wt). GLD-1 bound this wild-type TGE sequence and a variation of it (TGE-m2) with a Kd of ~104 nM, defining a short sequence for GLD-1 high affinity binding (Fig. 4B). These data are consistent with previous competition experiments that measured a GLD-1 Kd of ~10 nM and defined the hexanucleotide as UACU(C/A)A [37]. The nucleotide substitution of the half-site (UAAU to GAGU) abolished RNA binding (Fig. 4B, TGE-m1), consistent with the need for a half-site in addition to the hexanucleotide. Similar binding experiments were performed with QKI, and we observed that TGE-m2 is essentially a QRE bound with high affinity, whereas the wild-type TGE bound with a moderate affinity of approximately 300 nM (Fig. 4C). TGE-m1 was not bound by QKI (Fig. 4C). In summary, these data identify the GLD-1 RNA binding motif as bipartite, as observed with SLM-2, QKI and Sam68.

Discussion & Conclusion
In the present study, we identified a SLM-2 consensus sequence as direct U(U/A)AA repeats using SELEX. The bipartite nature of the consensus sequence was essential for SLM-2 high affinity RNA binding. The identification of a bipartite mRNA binding site for QKI [39] and now SLM-2 prompted us to determine whether SAM68 and GLD-1 also bound bipartite direct repeats. Indeed SAM68 bound the SLM-2 consensus and required both U(U/A)AA motifs. Also, GLD-1 required sequences within the UAAY half-site in addition to its conservative consensus NACU(C/A)A, defining a GLD-1 bipartite motif. Taken together, these data demonstrate that the STAR proteins SLM-2, SAM68, QKI and GLD-1 bind direct RNA repeats.

Defining the SLM-2 RNA binding site as U(U/A)AA repeats
We identified SLM-2 in 1999 by searching databases with SAM68 sequences [42]. Independently, SLM-2 (called T-STAR) was identified as an interacting protein of RBM, an RNA binding protein involved in spermatogenesis [45]. SLM-2 is known to bind homopolymeric RNA [42], localize to SAM68 nuclear bodies (SNBs) [46], regulate alternative splicing [16] and dimerize with SAM68 and SLM-1 [42]. SLM-2 is post-translationally modified to contain methylarginines [47] and phosphotyrosines; the latter impair its ability to associate with RNA [48]. The expression of SLM-2 is mainly restricted to testis and brain, but its function in these tissues remains unknown [49].
Previously we showed that SLM-2 had a preference for poly(G)-rich homopolymeric RNA [42]; we therefore searched the SELEX hits for poly(G)-rich sequences that could resemble a G-quartet as bound by FMRP [50]. The SLM-2 selected RNA (SRE-7) contained a variation of this sequence (GGnGGGnGGnnnnnnnGG), but its deletion did not affect SLM-2 RNA binding. Therefore, we next focused on the U(U/A)AA-rich repeats that resemble the consensus identified with SAM68 SELEX [41]. Indeed, we mapped the SLM-2 consensus sequence to direct repeats of the U(U/A)AA sequence, defining the SLM-2 RNA binding site as a bipartite motif. This motif occurs too frequently in mRNAs, especially in 3'-UTRs, for a bioinformatic analysis to identify the SLM-2 mRNA targets (not shown). Thus, the specificity of SLM-2 function is most likely conferred by its tissue-specific expression, and post-translational modifications may alter its RNA binding specificity and/or accessibility.
Sam68 is known to bind cellular RNA as well as DNA [51]. Sam68 has a preference for poly(U) and poly(A) homopolymeric RNA, and this association is abrogated by tyrosine phosphorylation by Src kinases and BRK [52,53]. Differential display and cDNA representation difference analysis identified 29 potential RNA binding targets, of which 10 bind in a KH-dependent manner [54]. Sam68 binding sequences on hnRNP A2/B1 and β-actin mRNAs were mapped to UAAA and UUUUUU nucleotide motifs, respectively, and both motifs occur within specific loop structures [54]. Sam68 has also been shown to transport unspliced HIV RNAs [20]. Sam68 knockout mice are protected against the development of osteoporosis, pointing towards an enhancement of mesenchymal stem cell differentiation along the osteogenic rather than the adipocyte pathway [34]. The mice also have motor coordination defects [35]. The implication of Sam68 in these physiological processes will help direct the search for specific physiological mRNA targets. The work performed herein demonstrates that the STAR/GSG domain of Sam68 has RNA binding capabilities similar to those of SLM-2, as suggested by the 69% sequence identity within their STAR/GSG domains [42].

QUAKING: a regulator of myelination
The quaking viable (qkv) mice represent an animal model of dysmyelination [55]. The defect is summarized as an incomplete maturation of the myelin sheath. This is due to a lack of proper oligodendrocyte differentiation, resulting in the failure to transport intracellular myelin components such as the MBP mRNAs [55]. QKI null animals have been generated, but the embryos die at ~E9.5-10.5, providing little information about the role of QKI in myelination [56]. By using a gain-of-function approach with ectopic expression of the QKI isoforms, we showed previously that QKI-6 and QKI-7 promote oligodendrocyte differentiation by up-regulating p27KIP1, confirming the role of the QKI isoforms during myelination [22]. The QKI response element was defined as a core NACUAAY [44] with a neighboring UAAY [39,44]. This led to the identification of two binding sites within the mRNAs for the MBPs [39,44]. QRE-1 contains 3' adjacent half-sites that function as a moderate affinity site. In the present study, we demonstrate that region A in QRE-2 (Fig. 3), shown previously to mediate binding [44], becomes a better site in the presence of the half-site from region B (Fig. 3). Our findings show that QRE-2 within the 3'UTR of MBP mRNAs is indeed a bipartite consensus sequence with a core NACUAAY and a neighboring UAAY.
The MBP mRNAs are localized at the distal processes of oligodendrocytes in intact tissue [57]. The factors necessary for MBP mRNA localization are oligodendrocyte-specific, as MBP mRNA transfected into non-glial cells did not properly localize to the cell membrane [58]. Studies performed in living cells by microinjection have shown that the MBP mRNA forms granules, which appear dispersed in the perikaryon and are transported down the processes [59]. MBP is not the only mRNA known to be localized to the distal processes of oligodendrocytes, as the mRNAs for myelin-associated oligodendrocyte basic protein (MOBP), alpha-CAMKII, tau, amyloid precursor protein (APP) and others are also transported to the site of myelination [60]. Transport and localization elements have been mapped in the 3' UTR of rat and mouse MBP mRNA. A 21-nucleotide sequence named the RNA transport signal (RTS), mapped at nucleotides 794 to 814 of rat MBP or nucleotides 798 to 818 of mouse MBP, has been identified as a transport element [61]. This sequence is homologous to sequences in several other localized mRNAs, suggesting a general transport signal. In rat oligodendrocytes, another localization element, named the RNA localization region (RLR), has been mapped to nucleotides 1130 to 1473, but the region 667 to 953 containing the RTS and QRE-1 is sufficient for localization [61]. HnRNP A2 has been shown to be one of the components that binds the RTS sequence [62], and insertion of the RTS into GFP resulted in enhanced translation [63]. The mapping of QRE-2 (UACUAAC-13nt-UAAC) identifies another element that may be necessary for proper export of the MBP mRNA into the cytoplasm and subsequent production of MBP at its site of synthesis. It is likely that QKI works in combination with hnRNP A2 and the other components of the RNP granule in the proper transport of the MBP mRNA, its localization and its translation.

GLD-1 binding to the tra2/gli element analysis
The C. elegans homolog of QKI is GLD-1, a known protein translation inhibitor required for germ-line differentiation [64,65]. Many GLD-1 mRNA targets have been identified [25-28], and a conservative consensus sequence of NACU(C/A)A was defined by comparing the binding specificity with SF1 [37]. Most, but not all, mRNA targets [37] contain this conserved consensus sequence. The demonstration that GLD-1, like QKI, requires a neighboring half-site is consistent with the ~50% sequence identity within their STAR/GSG domains.

Sam68/SLM-2 tetranucleotide versus QKI/GLD-1 hexanucleotide sequence requirements
The RNA binding domain of STAR/GSG proteins consists of a maxi-KH domain flanked by two conserved sequences (Fig. 5). The NK/QUA1 and CK/QUA2 regions refer to the N- and C-terminal regions, respectively, flanking the KH domain. Based on the structure of the KH domain of SF-1 in complex with its target RNA U1A2C3U4A5A6C7, the CK region makes important contacts with the RNA. All STAR domain containing proteins have the critical GXXG sequence located in a loop between the first two alpha helices of the KH domain. This sequence of residues is absolutely conserved among the STAR domain proteins and contacts the RNA, especially the bases U4A5A6C7. Looking closely at the residues in the CK region that make important contacts with the RNA bases, we find that two residues (asterisks in Fig. 5) seem to confer the SLM-2/SAM68 specificity versus the QKI/GLD-1 specificity. The SLM-2/SAM68 residues are a threonine or a serine and a conserved glutamic acid, while the QKI/GLD-1/SF1 residues are a conserved alanine and a conserved arginine. These residues make important contacts with base A2, a specificity that is lost in the SLM-2/Sam68 consensus binding sequence. In fact, the SLM-2/Sam68 binding sequence resembles the QKI/GLD-1/SF1 core binding sequence in all respects, except that it lacks the U1A2C3 bases.
The structure of the STAR protein SF1 was determined and the amino acids that contact the RNA were identified [6]. These contact amino acids explain why SF1, QKI and GLD-1 have near identical binding specificity. The Sam68, SLM-1 and SLM-2 subfamily has different amino acids at the RNA contact positions, and it should be possible by amino acid substitution to convert a Sam68 domain into a GLD-1 domain that will bind the NACUA(C/A)C GLD-1 consensus sequence. Lehmann-Blount & Williamson (2005) performed such experiments and were unable by mutagenesis to identify an amino acid 'code' that would dictate GLD-1-like versus Sam68-like specificity [66]. This led them to propose that Sam68, and hence SLM-1 and SLM-2, might not be RNA binding proteins or might possess an RNA binding specificity that is fundamentally unlike that of GLD-1 [66]. The identification of a high affinity RNA target for SLM-2 with the characteristics of a GLD-1/QKI bipartite motif demonstrates that the Sam68, SLM-1 and SLM-2 subfamily members are indeed RNA binding proteins, but does not exclude the possibility that they may also bind ssDNA. The challenge ahead will be to identify the physiological RNA targets linked with the phenotypes observed in mammals.

SELEX assay
Systematic Evolution of Ligands by EXponential enrichment (SELEX) was performed as previously described [67]. Essentially, oligonucleotides harboring a 52-nucleotide random sequence flanked by two primer binding sites, with an estimated complexity of 1 × 10^15, were synthesized by Invitrogen. The oligonucleotides were amplified by PCR using the corresponding forward and reverse primers as previously described [67]. After PCR amplification, the sequences of 24 random clones were determined; each clone was unique and the overall base composition showed similarity among the clones (average composition: A, 20%; U, 30%; C, 22%; G, 28%; data not shown). A purified DNA library (1 × 10^13 molecules) was transcribed in vitro using T7 RNA polymerase (Promega) and [α-32P]UTP. RNA was purified from denaturing TBE-acrylamide gels, heated to 65°C for 5 min, and precleared using TALON Metal Affinity Resin (BD Bioscience) to remove RNAs that bind the resin non-specifically. Unbound RNAs were incubated in binding buffer (50 mM Tris-HCl (pH 8.0), 590 mM KCl) with the recombinant His-SLM-2 for 30 min, then with TALON Metal Affinity Resin for another 30 min. After four washes with binding buffer, the RNAs were eluted by TRIzol extraction (Invitrogen). The purified RNAs were ethanol precipitated and resuspended in water with RNase-free DNase for a 15 min reaction. The DNase reaction was quenched for 10 min at 65°C. Reverse transcriptions were performed using M-MLV reverse transcriptase (Promega) and a reverse oligonucleotide annealing to the 3' primer binding site. cDNAs were then generated by PCR amplification with the reverse oligonucleotide and the forward oligonucleotide annealing to the 5' primer binding site containing the T7 promoter. After round 6, the cDNAs were amplified with the reverse primer and a forward primer containing an EcoRI restriction site. The DNA fragments were digested with EcoRI and BamHI and subcloned into pBluescript SK+ (Stratagene) for blue/white selection. Forty-three random white colonies were selected, their plasmids were purified and the SELEX sequences were determined by DNA sequencing (Genome Quebec).

Figure 5. STAR/GSG domain protein alignment. The STAR/GSG domains of mouse SLM-2, human SAM68, mouse QKI, C. elegans GLD-1 and human SF-1 were aligned using ClustalW. Secondary structure elements (beta sheets and alpha helices) are shown on top of the sequences, and region NK/QUA1, the KH domain and region CK/QUA2 are shown beneath the sequences. The critical loop between helices alpha 1 and alpha 2 with the GXXG sequence is also shown. (a) Based on [6], the RNA bases UACUAAC that contact the specific SF-1 residues are numbered as follows: U1A2C3U4A5A6C7. (b) Arginine 160 makes contact with U4 and A6. (c) Valine 183 makes contact with A6 and C7.
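As a back-of-the-envelope check on the library figures quoted in the SELEX assay, the short calculation below compares the number of possible sequences for a 52-position random region with the estimated library complexity. It is plain arithmetic rather than part of the protocol, and the constant and function names are hypothetical.

```python
RANDOM_LENGTH = 52  # length of the randomized region, as stated in the text

# Four possible bases at each randomized position.
sequence_space = 4 ** RANDOM_LENGTH


def sampled_fraction(n_molecules, length=RANDOM_LENGTH):
    """Upper bound on the fraction of sequence space present in a library.

    Assumes every molecule in the library is distinct, which is the
    best case for coverage.
    """
    return n_molecules / 4 ** length


print(f"possible 52-mers: {sequence_space:.3e}")
print(f"fraction sampled by 1e15 molecules: {sampled_fraction(1e15):.1e}")
```

Because the library covers only a minute fraction of the possible sequences, finding only unique clones among a couple of dozen sequenced from the starting pool is the expected outcome.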

RNA preparation, purification and ElectroMobility Shift Assays (EMSAs)
RNAs were prepared by run-off in vitro transcription of oligonucleotides harboring a T7 binding site in the presence of 32P-UTP, using T7 MegaShortscript (Ambion) according to the manufacturer's protocols. RNAs were purified on TBE-acrylamide gels before use. For EMSAs, a constant concentration of 32P-labeled RNA (100 pmol) was incubated with buffer alone or with increasing or decreasing concentrations of the tested proteins in the following buffer: 20 mM HEPES (pH 7.4), 330 mM KCl, 10 mM MgCl2, 0.1 mM EDTA, 0.1 mg/ml heparin and 0.01% IGEPAL CA630 (Sigma). The 30 μl reactions were incubated at room temperature for 1 h, and 3.3 μl of RNA loading dye (glycerol containing 0.25% (w/v) bromophenol blue, 0.25% (w/v) xylene cyanol) was added to each. A portion (15 μl) of each sample was separated on native Tris-glycine 8%-acrylamide gels. The gels were dried and the bound and unbound RNAs were quantified using a Storm Phosphorimager (Amersham). The fraction of bound RNA was determined and plotted using the software program Prism 3.0 (GraphPad Software).
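To illustrate how an apparent Kd can be extracted from the fraction of bound RNA, the sketch below fits a one-site binding hyperbola to a two-fold protein dilution series. The text does not state which model was fitted in Prism, so the one-site model (which assumes the RNA concentration is well below the Kd), the grid-search fit, the function names and the simulated data are all assumptions made for illustration.

```python
def fraction_bound(protein_nM, kd_nM):
    """One-site binding: fraction of RNA bound at a given protein concentration."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, bound, lo=0.1, hi=1e5, steps=4000):
    """Estimate Kd by least squares over a log-spaced grid of candidate values.

    The search range (0.1 nM to 100 uM) and grid density are arbitrary
    choices for this sketch.
    """
    best_kd, best_err = None, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        err = sum((fraction_bound(p, kd) - f) ** 2
                  for p, f in zip(protein_nM, bound))
        if err < best_err:
            best_kd, best_err = kd, err
    return best_kd


# Two-fold dilution series starting at 1 uM (1000 nM), eight points.
concentrations = [1000 / 2 ** i for i in range(8)]

# Simulated, noise-free data generated with a Kd of 22.3 nM; these are not
# measurements from the study.
simulated = [fraction_bound(p, 22.3) for p in concentrations]

print(f"fitted Kd: {fit_kd(concentrations, simulated):.1f} nM")
```

With real phosphorimager data the fractions would be noisy, and a nonlinear least-squares routine with confidence intervals would be preferable to a grid search.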

DNA and protein preparation
Recombinant GST-QKI-5 was described previously [39]. Maltose binding protein fused to GLD-1 was a generous gift of Min-Ho Lee (University of Albany). His-SLM-2 was prepared by subcloning the coding region from GFP-SLM-2 [42] into pQE Trisystem (Qiagen) using BamHI and XhoI directional cloning. His-GSG(SAM68) was prepared by subcloning the GSG domain of mouse Sam68 into pET-18c. Protein purification was performed as per the manufacturer's instructions.