Bioinformatic and phylogenetic analyses of Sox30
In the present study, a novel Sry-related gene was isolated by chance from the Nile tilapia during the cloning of Sox9b. A subsequent BLAST search against GenBank identified it as Sox30 on the basis of its relatively high identity with its mammalian orthologs, and this assignment was further confirmed by phylogenetic and syntenic analyses. To date, Sox30 has been reported only in mammals [16, 18, 19], and not in any non-mammalian vertebrate (including fish) or invertebrate. Our data provide, for the first time, solid evidence for the existence of Sox30 in a teleost fish, the Nile tilapia.
Subsequently, the genome databases of several fishes, including zebrafish, medaka (Oryzias latipes), stickleback (Gasterosteus aculeatus), takifugu (Takifugu rubripes) and tetraodon (Tetraodon nigroviridis), were searched for possible Sox30-like genes. However, no such gene was found in any of these teleost genome databases. EST databases were then searched, and complete or partial Sox30 sequences were successfully retrieved from the channel catfish, guppy, fathead minnow, little skate and dogfish. Further screening of the genome sequences of other animals showed that Sox30 is also present in other chordates, including chicken, anole lizard, frog, sea squirts, lancelets and acorn worm (where it is also called SoxH in the last three species), and in non-chordate invertebrates, including the California mussel, snail and sea anemone (also called SoxH), whereas it was not found in the nematode (Caenorhabditis elegans), fruit fly or sea lamprey. A recent report described a Sox30-like sequence in sea urchin (Strongylocentrotus purpuratus), but we failed to find it in the genome and EST databases by BLAST search. These findings suggest that Sox30 is not specific to mammals but also exists in other vertebrates and in invertebrates.
Why, then, did we fail to find Sox30-like sequences in any of the sequenced teleost genomes? There are two possible explanations. One is that the available genome sequences of these species do not yet cover the whole genome; the other is that Sox30 was secondarily lost in these species during evolution. However, the latter possibility is unlikely for two reasons. 1) Cyprinids and ictalurids, such as fathead minnow and channel catfish, are relatively primitive teleosts, whereas cichlids and poeciliids, such as tilapia and guppy, are relatively advanced teleosts; it is unlikely that Sox30 was lost in all other teleosts but retained in both primitive and advanced species. 2) Fathead minnow and zebrafish belong to the same family, Cyprinidae, and it is difficult to believe that Sox30 was secondarily lost in zebrafish but not in the closely related fathead minnow. Therefore, we favor the former explanation. To confirm this, Sox30 needs to be cloned from additional teleosts.
Sox30 was found not only in chordates such as human, mouse, chicken, frog, tilapia, channel catfish, guppy, fathead minnow, little skate, dogfish, sea squirts, lancelets and acorn worm, but also in non-chordate invertebrates such as the California mussel, snail and sea anemone, indicating that it had already appeared before or with the emergence of coelenterates. It is therefore of interest to know from which ancestral gene Sox30 was derived. Group H, which consists only of Sox30, shared the highest similarity with group F among the 10 Sox groups (Additional file 1, Fig. S1), suggesting that the two groups may share a common ancestor. However, the exact timing of their divergence is still unclear. The presence of Sox30 in a relatively primitive invertebrate, the sea anemone (no Sox30 or other Sox genes have been found in protozoa or bacteria to date), and its position as the outermost clade in the phylogenetic tree (Additional file 1, Fig. S1) suggest that Sox30 is one of the oldest members of the Sox family, which is consistent with the previous conclusion.
Sequence and gene structure analyses of Sox30
Compared with its mammalian counterparts, the tilapia Sox30 genomic sequence is smaller because of compression of the intron sequences. A similar phenomenon has been found for Sox30 in sea squirts, lancelets and acorn worm. Alignment of the amino acid sequences indicated that Sox30 proteins are poorly conserved, even within the normally highly conserved HMG-box, among both distantly and closely related species, suggesting high diversity and rapid evolution of these proteins. This also explains why Sox30 had not been isolated from non-mammalian animals until the accidental cloning of tilapia Sox30 in this study. Gene structure analysis showed that in tilapia Sox30, as in some other Sox genes such as Sox8, Sox9 and Sox10, the HMG-box is split by an intron. The same is true of Sox30 in acorn worm, lancelets, sea squirts, frog, chicken, mouse and human, and the location of the intron is conserved among these species. These findings indicate that this intron had already been fixed before or with the emergence of chordates.
Alternative splicing (AS) is emerging as one of the most important mechanisms controlling vertebrate gene expression. Existing data indicate that as many as 76% of genes generate alternatively spliced products. AS has been linked to tissue- or stage-specific splicing mechanisms through which expression may be regulated. This regulation may be achieved by the introduction of premature stop codons that function as an on-off switch for gene expression. Previous studies have shown that AS generates two different mRNA isoforms each of Sox17 and Sox30 in mouse, and two different mRNA isoforms of Sox9 in frog [15, 16, 32]. In mice, the Sox17 isoform is expressed at a low level in the testis throughout postnatal development, whereas the t-Sox17 isoform is expressed abundantly in the testis, predominantly in postmeiotic germ cells. Based on these results, the authors suggested that a switch from Sox17 to the t-Sox17 isoform may alter the function of Sox17 at the meiotic and postmeiotic phases of spermatogenesis in mice. Our data demonstrate that tilapia Sox30 has four alternatively spliced isoforms. A premature stop codon was found in the isoform-II and -IV transcripts of tilapia Sox30, suggesting that the mechanism described above may also regulate Sox30 expression in tilapia. In addition, as with mouse Sox17, the four Sox30 isoforms were expressed in a non-parallel manner during gonadal development in tilapia. Thus, switches from one isoform to another may alter the function of Sox30 during gonadal development. It is worth noting that the HMG-box region was partially deleted in isoform-III and completely deleted in isoform-IV, resulting in truncated proteins that lack most or all of the HMG-box domain and the nuclear localization signal.
It would be interesting to know whether these protein isoforms are actually produced, but no suitable antibody is currently available: because Sox30 is poorly conserved among different classes of animals, the available antibodies against mammalian Sox30 are not suitable for use in fish. Alternatively spliced isoforms of Sox17 and Sox30 that lack a functional DNA-binding domain or C-terminus, and are thus very similar to the tilapia Sox30 isoforms, have been reported to be translated into proteins in mammals [15, 16]. Based on those findings, we speculate that all isoforms of Sox30 in tilapia may also be translated into protein products. As Sox30, like Sry and some other Sox genes, uses a single HMG-box for DNA binding, it would be very interesting to know what these truncated isoforms do and how they function. Whether the truncation results in a different capacity for transcriptional activation, e.g. completely non-functional factors or dominant negative mutants of the wild-type Sox30, is still an open question.
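The premature-stop mechanism discussed above can be illustrated with a minimal Python sketch. It scans a coding sequence codon by codon and reports whether an in-frame stop codon occurs before the final codon. The two sequences are hypothetical toy examples invented for illustration, not tilapia Sox30 transcripts, and the function names are our own; the sketch assumes the standard genetic code and a reading frame that starts at the first nucleotide.

```python
from typing import Optional

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds: str, frame: int = 0) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = cds.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def has_premature_stop(cds: str) -> bool:
    """True if an in-frame stop codon occurs before the last complete codon."""
    stop = first_in_frame_stop(cds)
    last_codon_start = (len(cds) // 3 - 1) * 3
    return stop is not None and stop < last_codon_start


# Hypothetical toy sequences, NOT real Sox30 data.
TOY_FULL_LENGTH = "ATGGCTAAGCCTTGA"  # ATG GCT AAG CCT TGA: stop only at the end
TOY_TRUNCATED = "ATGGCTTAGCCTTGA"    # ATG GCT TAG ...: stop at the third codon

if __name__ == "__main__":
    for name, seq in [("full-length", TOY_FULL_LENGTH), ("truncated", TOY_TRUNCATED)]:
        print(name, first_in_frame_stop(seq), has_premature_stop(seq))
```

Applied to real transcripts, a check of this kind only flags candidate truncations; whether the shortened products are translated or functional still has to be tested experimentally.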
The expression pattern and functional relevance of Sox30 in tilapia
Previous reports in mammals showed that human and mouse Sox30 are expressed exclusively in the normal adult testis, specifically in germ cells. This expression pattern suggests that Sox30 may be involved in mammalian spermatogonial differentiation and spermatogenesis. However, it is still unclear in which cell type Sox30 is expressed in the mammalian ovary, even though one report indicated its possible expression in mouse oocytes. In the present study, Sox30 expression in tilapia gonads began at 10 dah, earlier than the period of morphological gonadal differentiation (about 25 dah), and showed a gonad-specific pattern, at least in adults. Of course, expression of Sox30 in extra-gonadal tissues at other stages of development cannot be completely excluded. Meanwhile, the expression of each alternatively spliced isoform showed a clear sexual dimorphism in the gonads. However, none of the four Sox30 isoforms was detected at 5 dah, the critical period of molecular sex determination in tilapia, which excludes a role for Sox30 as the sex-determining gene in tilapia. In sex-reversed adult gonads, three types of Sox30 also exhibited an expression pattern related to phenotypic sex. In male gonads, Sox30 expression was restricted to the germ cells at 10 dah and later to sperm, indicating its possible involvement in spermatogonial differentiation and spermatogenesis in male fish, as in human and mouse. In females, it was expressed in the somatic cells of the gonads at 10, 120 and 210 dah.
Sox30 is a strong regulator of Sf-1, the most important steroidogenic factor found in all vertebrates, including fish [34, 35]. Our in situ hybridization data clearly demonstrated that Sox30 is expressed in the somatic cells of the ovary, especially the steroidogenic cells, where it colocalizes with Sf-1 in tilapia. Taken together, we speculate that Sox30 may also be an important regulator of somatic cell differentiation and steroidogenesis in female fish. Both isoforms I and IV are expressed in the ovary after 35 dah. Although isoform IV is expressed at a much higher level than isoform I, it lacks the DNA-binding HMG-box, and it is therefore unlikely that this isoform regulates Sf-1 expression by binding directly to the promoter. However, the possibility that isoform IV functions as a dominant negative mutant of the wild-type molecule (isoform I) cannot be excluded, because it retains the other functional domains, such as the transactivation domain. Further work is needed to resolve this.
In addition to the gonad-specific expression of Sox30 in human, mouse and tilapia, all Sox30 EST sequences were derived from gonads (testis and ovary in chicken and channel catfish; testis in frog, guppy and fathead minnow; male gonads in snail), suggesting that Sox30 may also be a gonad-specific gene in these species. These data further support the view that Sox30 plays a key role in gonadal differentiation and development, a role that might be conserved across the animal kingdom. Moreover, the four alternatively spliced Sox30 isoforms exhibited different temporal and spatial expression patterns in tilapia gonads. Alternative splicing of Sox30 mRNAs and different temporal expression patterns of the spliced isoforms have also been reported in human and mouse. Therefore, we speculate that, through AS, Sox30 may be involved in gonadal differentiation and development in both sexes, at different stages and in different gonadal cell types across the animal kingdom. Further studies of Sox30 in other species are required to confirm this.
The low similarity between tilapia Sox30 and its mammalian counterparts raises the question of whether tilapia Sox30 is a genuine Sox30 or merely a new member of Sox group H. To answer this, we analyzed the synteny of Sox30 and its adjacent genes in human, mouse, chicken, anole lizard and frog (tilapia was not included because its genome sequence has not yet been completed and released). Sox30 and its adjacent genes Thg1l and Adam19 were located on chromosomes 5, 11 and 13 and on scaffold_69 and scaffold_177 in human, mouse, chicken, anole lizard and frog, respectively. This conservation of synteny between non-mammalian species and mammals indirectly supports the conclusion that the Sox30 sequences we isolated are genuine orthologs of mammalian Sox30.
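The synteny argument above can be restated as a small consistency check. The Python sketch below is illustrative only: it encodes just the chromosome and scaffold assignments given in the text and asks which genes are co-located with Sox30 in every species. Gene order and coordinates are not given in the text and are therefore not modelled, and the function name and data layout are our own assumptions rather than the tool used in this study.

```python
from typing import Dict, Set

# Chromosome/scaffold assignments as stated in the text above.
LOCATIONS: Dict[str, Dict[str, str]] = {
    "human": {"Sox30": "chromosome 5", "Thg1l": "chromosome 5", "Adam19": "chromosome 5"},
    "mouse": {"Sox30": "chromosome 11", "Thg1l": "chromosome 11", "Adam19": "chromosome 11"},
    "chicken": {"Sox30": "chromosome 13", "Thg1l": "chromosome 13", "Adam19": "chromosome 13"},
    "anole lizard": {"Sox30": "scaffold_69", "Thg1l": "scaffold_69", "Adam19": "scaffold_69"},
    "frog": {"Sox30": "scaffold_177", "Thg1l": "scaffold_177", "Adam19": "scaffold_177"},
}


def conserved_neighbours(locations: Dict[str, Dict[str, str]], query: str) -> Set[str]:
    """Return the genes that share a chromosome/scaffold with `query` in every species."""
    shared = None
    for genes in locations.values():
        if query not in genes:
            return set()  # query gene missing in one species: nothing is conserved
        same = {g for g, loc in genes.items() if g != query and loc == genes[query]}
        shared = same if shared is None else shared & same
    return shared or set()


if __name__ == "__main__":
    print(sorted(conserved_neighbours(LOCATIONS, "Sox30")))
```

A gene that sat on a different chromosome in even one species would drop out of the returned set, which is the sense in which conserved co-location is used here as indirect support for orthology.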