Genome-wide identification and phylogenetic analysis of wheat ZIFL
To identify wheat ZIFL genes and gain insight into their possible evolutionary relationships, two complementary approaches were used. First, a genome-wide sequence search for the MFS_1 family (PF07690) was performed using Pfam BLAST; second, a homology-based analysis was carried out in the Ensembl database using previously reported ZIFL genes from different plant species. These approaches identified one hundred seventy-nine sequences. To validate their identity, the sequences were checked for the MFS_1 domain using Pfam and the NCBI conserved domain database (CDD) (Additional file 1: Table S1). These sequences were then used to build a phylogenetic tree together with previously known ZIFL protein sequences from different plants (Additional file 1: Table S1 and Additional file 2: Figure S1). The arrangement of the tree placed the ZIFL cluster in a clade distinct from the remaining MFS_1 proteins, indicating that ZIFL proteins form a distinct, tightly clustered group of MFS transporters (Additional file 2: Figure S1). This distribution was further confirmed using signature sequences specific to ZIFL proteins; the presence of either of these signature sequences validated the assignment. The two wheat-specific signature sequences were (i) W-G-x(3)-D-[RK]-x-G-R-[RK] (found in all except TaZIFL2.5-5D) and (ii) S-x(8)-[GA]-x(3)-G-P-x(2)-G-G, with A instead of G at the 10th position of signature (ii) in TaZIFL2. In addition, sequences similar to the ZIFL-specific cysteine (Cys) and histidine (His) signatures were used to identify TaZIFL proteins: (iii) C-[PS]-G-C, absent in six TaZIFL sequences (TaZIFL2.5-5D, TaZIFL3-4B, TaZIFL4.2-4B, TaZIFL5-5D, TaZIFL7.1-4B, TaZIFL7.2-4B), probably because of missing sequence information, and (iv) [PQ]-E-[TS]-[LI]-H-x-[HKLRD] (an insertion of the sequence ETLYCRHEHRYSIFISLD within the motif was found in TaZIFL7.2-4A) [18].
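The four signatures above are written in a PROSITE-like notation, so they can be translated mechanically into regular expressions and searched against a protein sequence. The following is a minimal sketch of that idea, not the procedure used in the study; the helper names `prosite_to_regex` and `find_signatures` and the toy peptide in the usage note are hypothetical.

```python
import re

# The four signatures as written in the text (PROSITE-like notation).
SIGNATURES = {
    "i": "W-G-x(3)-D-[RK]-x-G-R-[RK]",
    "ii": "S-x(8)-[GA]-x(3)-G-P-x(2)-G-G",
    "iii": "C-[PS]-G-C",
    "iv": "[PQ]-E-[TS]-[LI]-H-x-[HKLRD]",
}

# One pattern element: 'x', a single residue, or a residue set,
# optionally followed by a repeat count such as '(3)'.
ELEMENT_RE = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a pattern such as 'W-G-x(3)-D-[RK]' into a Python regex."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT_RE.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = match.groups()
        token = "." if residue == "x" else residue  # 'x' means any residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_signatures(protein: str) -> dict:
    """Return the 0-based start of the first hit per signature, or None."""
    hits = {}
    for name, pattern in SIGNATURES.items():
        found = re.search(prosite_to_regex(pattern), protein.upper())
        hits[name] = found.start() if found else None
    return hits
```

For example, `prosite_to_regex(SIGNATURES["i"])` yields `WG.{3}D[RK].GR[RK]`, and calling `find_signatures` on a made-up peptide such as `MAAWGVFADKYGRKPLL` reports a hit for signature (i) only.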
These signatures guided the identification of wheat ZIFL proteins among the remaining MFS_1 members. This analysis confirmed a total of thirty-five wheat ZIFL sequences, including individual TaZIFL genes and their respective homoeologs from the different wheat sub-genomes (Additional file 3: Table S2). To examine their distribution relative to other plant species, ZIFL protein sequences from O. sativa, Z. mays and Arabidopsis were used to build a rooted phylogenetic tree by the neighbor-joining (NJ) method (Fig. 1). Because of the genome duplication events in wheat, a single gene is likely to be represented by multiple homoeologous copies. Hence, the resulting 35 putative wheat ZIFL sequences represent 15 genes once the respective homoeologs are grouped (Fig. 1). To provide a uniform nomenclature, TaZIFL genes were named according to their closest known orthologs from rice. Among the wheat ZIFL proteins, TaZIFL4.1 and TaZIFL4.2 were the most similar, with 95.2% identity. In cross-species comparisons, the maximum identity of 87% was found between TaZIFL2.2-3D and AtZIFL2. With rice, the highest identity, also 87%, was observed between TaZIFL2.2-3A and OsZIFL2. Divergence was observed between TaZIFL3-4B and OsZIFL13, which shared only 50% identity.
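Collapsing the 35 sequences into 15 genes amounts to grouping sequence names by their gene part and collecting the chromosome and sub-genome suffix. A minimal sketch is given below, assuming names follow the `TaZIFL<gene>-<chromosome><sub-genome>` pattern seen in the text (for example TaZIFL7.2-4A); the function name `group_homoeologs` is hypothetical, and the tolerance for stray spaces and underscores is an assumption about how names may appear in exported tables.

```python
import re
from collections import defaultdict

# Gene part, separator, chromosome number (1-7), sub-genome letter (A, B or D).
NAME_RE = re.compile(r"^(TaZIFL[\d.]+)[-_]([1-7])([ABD])$")


def group_homoeologs(names):
    """Group sequence names by gene and list the chromosomes carrying each."""
    groups = defaultdict(list)
    for raw in names:
        name = raw.replace(" ", "")  # tolerate 'TaZIFL 3-4B' style spacing
        match = NAME_RE.match(name)
        if match is None:
            raise ValueError(f"unrecognised sequence name: {raw!r}")
        gene, chromosome, subgenome = match.groups()
        groups[gene].append(f"{chromosome}{subgenome}")
    return {gene: sorted(locations) for gene, locations in groups.items()}


if __name__ == "__main__":
    example = ["TaZIFL7.2-4A", "TaZIFL 7.2-4B", "TaZIFL2.5-5D"]
    for gene, locations in group_homoeologs(example).items():
        print(gene, locations)
```

Applied to the full list in Additional file 3: Table S2, the number of keys in the returned dictionary would be the gene count and the total number of locations the sequence count.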
Molecular structure and genome organization
The predicted protein length of the identified wheat ZIFL sequences ranged from 300 to 562 amino acids (Additional file 3: Table S2). Most wheat ZIFL proteins had 10–12 predicted trans-membrane (TM) domains, as reported in rice [18]. Specifically, 16 wheat ZIFL proteins were predicted to have 12 TM domains, 15 to have 10–11 TM domains, 3 to have 8–9 TM domains and only one to have 4 TM domains (Additional file 3: Table S2). Genomic organization analysis revealed the presence of ZIFL genes in all three sub-genomes (A, B and D). The largest numbers were found on the B and D sub-genomes, with 13 and 12 genes, respectively (Fig. 2a). TaZIFL1.2, TaZIFL2.2, TaZIFL3, TaZIFL5, TaZIFL6.2, TaZIFL7.1 and TaZIFL7.2 are present in all three sub-genomes, whereas TaZIFL2.3 and TaZIFL2.4 are each present on only one (5B and 5D, respectively) (Additional file 3: Table S2). Chromosomal distribution mapping showed that TaZIFLs are present only on chromosomes 3, 4 and 5, with a maximum of 17 sequences on chromosome 4 (Fig. 2b, c). Next, the genomic structure was analyzed and the regions corresponding to introns and exons were marked (Fig. 3). TaZIFL genes that clustered in the same group shared a similar exon/intron distribution pattern. The number of introns and exons varied from 14 to 18 across the TaZIFL genomic sequences (Fig. 3, Additional file 3: Table S2).
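The counts reported in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text. The A sub-genome count and the combined count for chromosomes 3 and 5 are not stated in the text; they are inferred here on the assumption that each of the 35 sequences is assigned to exactly one sub-genome and one chromosome.

```python
TOTAL_SEQUENCES = 35  # confirmed wheat ZIFL sequences reported in the text

# Number of proteins per predicted TM-domain class, as reported.
TM_CLASS_COUNTS = {"12": 16, "10-11": 15, "8-9": 3, "4": 1}

# Sequences per sub-genome; only B and D are stated explicitly.
REPORTED_SUBGENOME_COUNTS = {"B": 13, "D": 12}

# Sequences per chromosome; only chromosome 4 is stated explicitly.
REPORTED_CHROMOSOME_COUNTS = {"4": 17}


def implied_remainder(total, reported):
    """Return how many sequences are left after the reported categories."""
    remainder = total - sum(reported.values())
    if remainder < 0:
        raise ValueError("reported counts exceed the total")
    return remainder


tm_total = sum(TM_CLASS_COUNTS.values())
a_subgenome = implied_remainder(TOTAL_SEQUENCES, REPORTED_SUBGENOME_COUNTS)
chromosomes_3_and_5 = implied_remainder(TOTAL_SEQUENCES, REPORTED_CHROMOSOME_COUNTS)

print(tm_total, a_subgenome, chromosomes_3_and_5)  # prints: 35 10 18
```

The TM-domain classes sum to the 35 sequences, and under the stated assumption the A sub-genome would carry 10 sequences and chromosomes 3 and 5 together 18.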
Protein motif analysis reveals the presence of diverse domains
To understand the similarity and variation in motif composition and distribution among TaZIFL proteins, 15 sequences, one representing each ZIFL transcript, were subjected to MEME analysis. The analysis revealed fifteen motifs (Fig. 4a, Additional file 4: Table S3). Of these, six were conserved in all ZIFL proteins, while some proteins lacked a few motifs. Four motifs (12, 13, 14 and 15) were specific to particular groups. Motifs 14 and 12 (Fig. 4b, Additional file 4: Table S3) were found in TaZIFL2.1, TaZIFL2.2, TaZIFL2.3, TaZIFL2.4 and TaZIFL2.5, which indicates that TaZIFL2 members might share similar functions. Motif 14 was also present in TaZIFL6.2. Another set of group-specific motifs, motifs 13 and 15, was found in TaZIFL4.1 and TaZIFL4.2, which may indicate a function different from that of the other TaZIFLs. The canonical MFS signature WG[V/M/I][F/V/A/I]AD[K/R][Y/I//H/L]GRKP was mostly present in the cytoplasmic loop between TM2 and TM3 (Additional file 2: Figure S2, Additional file 5: Table S4), and the anti-porter signature S-x(8)-G-x(3)-G-P-[A/T/G]-[L/I]-G-G was found mainly in TM5. These results suggest that ZIFL proteins share unique signatures and high similarity, indicating that they are a distinct group within the MFS family. The previously reported conserved cysteine (Cys)-containing motif CPGC was also present in most of the wheat ZIFL proteins [18]. This motif was absent in TaZIFL2.5-5D, TaZIFL3-4B, TaZIFL4.2-4B, TaZIFL5-5D, TaZIFL7.1-4B and TaZIFL7.2-4B, possibly because of missing sequence information. Where present, it was located in the cytoplasmic N-terminal loop for TaZIFL groups 2, 4, 5 and 6 and in the non-cytoplasmic N-terminal loop for groups 1, 3 and 7 (Additional file 2: Figure S2 and Additional file 5: Table S4).
Another conserved motif, the histidine (His)-containing motif PET[L/I]H, was present in a cytoplasmic loop whose position ranged from between TM2 and TM3 to between TM6 and TM7, most frequently between TM6 and TM7 (Additional file 2: Figure S2).
Analysis of conserved cis-elements in the promoter of wheat ZIFL genes
To find molecular clues to the regulation of wheat ZIFL transcripts, the 1.5 kb promoter regions of all identified wheat ZIFL genes were explored. The analysis revealed a large number of cis-elements in these promoters. Predominantly, the promoters were enriched in core binding sites for iron-deficiency responsive element binding factor 1 (IDEF1) and iron related transcription factor 2 (IRO2), and in heavy metal responsive elements (HMRE) (Additional file 6: Table S5). The presence of these promoter elements suggests that wheat ZIFL genes might respond to heavy metals and to important micronutrients such as Fe and Zn. Interestingly, the IDE1 cis-element was present in the promoters of all wheat ZIFL genes, suggesting that they could respond to Fe deficiency. A few of these promoters contain multiple such cis-elements, suggesting diverse functions in plants (Additional file 6: Table S5).
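A promoter scan of this kind can be sketched as a search for degenerate consensus sequences on both strands. The example below is illustrative only: the section does not give the consensus sequences or the tool used, so the motif strings in `EXAMPLE_MOTIFS` are assumptions based on commonly cited core sequences, and the names `count_sites` and `EXAMPLE_MOTIFS` are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# ASSUMPTION: illustrative core sequences, not values taken from this section.
EXAMPLE_MOTIFS = {"IDE1": "CATGC", "IRO2": "CACGTGG", "HMRE": "TGCRCNC"}


def reverse_complement(consensus: str) -> str:
    """Reverse-complement a consensus that may contain IUPAC codes."""
    return consensus.translate(COMPLEMENT)[::-1]


def count_sites(promoter: str, consensus: str) -> int:
    """Count matches of a consensus on both strands, overlaps included."""
    promoter = promoter.upper()

    def starts(motif):
        # A lookahead lets overlapping occurrences be reported separately.
        regex = "(?=" + "".join(IUPAC[base] for base in motif) + ")"
        return {m.start() for m in re.finditer(regex, promoter)}

    forward = starts(consensus)
    rc = reverse_complement(consensus)
    # A palindromic motif matches both strands at the same site; count it once.
    reverse = starts(rc) if rc != consensus else set()
    return len(forward) + len(reverse)
```

Running `count_sites` over each 1.5 kb promoter for each motif in `EXAMPLE_MOTIFS` would give a per-gene table of site counts comparable in shape to Additional file 6: Table S5.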
Expression patterns of wheat ZIFL genes under Zn and Fe stress
ZIFL genes are primarily known to respond to Zn excess (+Zn); therefore, the expression of wheat ZIFL genes was studied in roots and shoots under this condition. The qRT-PCR analysis suggested a tissue-specific expression response. A total of eight genes (TaZIFL1.2, TaZIFL2.2, TaZIFL2.3, TaZIFL4.1, TaZIFL4.2, TaZIFL5, TaZIFL6.1 and TaZIFL6.2) showed significantly higher expression at one of the time points under Zn surplus (Fig. 5). Among them, the fold change was highest for TaZIFL4.1 (~ sevenfold) at 3 days after treatment (DAT) relative to control roots (Fig. 5a). A few genes, such as TaZIFL1.1, TaZIFL7.1 and TaZIFL7.2, remained unaffected by the +Zn condition in roots. In shoots, TaZIFL1.1, TaZIFL1.2, TaZIFL6.1 and TaZIFL6.2 showed significant transcript accumulation at either 3 or 6 DAT (Fig. 5b). Notably, TaZIFL1.2, TaZIFL6.1 and TaZIFL6.2 showed enhanced transcript accumulation in both tissues. In contrast, a few wheat ZIFL genes were down-regulated in shoots but not in roots. Overall, the expression data under Zn surplus suggest a differential response of wheat ZIFL genes to the treatment.
Previous evidence indicated that plant ZIFL genes not only respond to Zn excess but are also affected by Fe-limiting conditions [19]. Therefore, the expression of wheat ZIFL genes was examined in roots and shoots of seedlings under Fe starvation (−Fe). Interestingly, in roots, TaZIFL4.1, TaZIFL4.2 and TaZIFL7.2 were up-regulated under Fe limitation at both 3 and 6 DAT. Of the remaining genes, TaZIFL2.3, TaZIFL6.2 and TaZIFL7.1 showed significant transcript abundance at one time point or the other (Fig. 6a). In shoots, TaZIFL1.1, TaZIFL1.2, TaZIFL3, TaZIFL4.1, TaZIFL4.2, TaZIFL5 and TaZIFL7.1 were up-regulated only at 3 DAT, suggesting a coordinated response in shoots (Fig. 6b). Under the −Fe condition, TaZIFL4.1 and TaZIFL4.2 showed high transcript accumulation in both roots and shoots. The remaining genes were unaffected by Fe stress (Fig. 6b). Overall, the expression data suggest that wheat ZIFL genes do respond to Fe limitation, pointing to a common link for this gene family between Zn and Fe homeostasis.
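The fold changes reported for the +Zn and −Fe treatments are relative expression values from qRT-PCR. This section does not state how they were calculated; a minimal sketch assuming the widely used 2^(−ΔΔCt) calculation with a single reference gene is shown below. The function name and the Ct values in the usage note are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) calculation.

    Each argument is a mean threshold cycle (Ct): the target gene and the
    reference gene, measured in the treated and in the control sample.
    """
    # Normalise the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare treated against control; a lower Ct means more transcript.
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)
```

With made-up values, `fold_change_ddct(22.0, 18.0, 25.0, 18.0)` gives 8.0, meaning the target is eightfold more abundant in the treated sample after normalisation; values below 1 indicate down-regulation.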
Candidate TOM genes show expression response to the presence of heavy metals
The phylogenetic arrangement of the wheat MFS-ZIFL proteins together with those of rice also revealed possible wheat homologs of the TOM transporters. Based on the clade distribution and the percentage identity with the rice TOM proteins (TOM1-OsZIFL4, TOM2-OsZIFL5 and TOM3-OsZIFL7), the wheat TOM transporters were identified as TaZIFL4.1/TaZIFL4.2, TaZIFL5 and TaZIFL7.1/TaZIFL7.2. Rice TOM1 and TOM2 fall in the same sub-clade as wheat TaZIFL5 and TaZIFL6; we therefore included these wheat genes when studying the response to heavy metals.
Our promoter analysis of wheat ZIFL genes indicated the presence of multiple HMRE, suggesting that a few of these genes could respond to heavy metals (Additional file 6: Table S5). Because of the importance of TOM genes in micronutrient mobilization, the expression of these transcripts was studied in wheat seedlings (shoots and roots) after exposure to the heavy metals Co, Ni and Cd. All the seedlings showed phenotypic defects when exposed to heavy metals (data not shown). The expression analysis suggested that wheat ZIFL genes show metal-specific responses. For example, TaZIFL4.2 and TaZIFL7.1 were significantly up-regulated in both roots and shoots on exposure to any of the metals tested (Fig. 7). In contrast, the transcripts of TaZIFL5 and TaZIFL6.2 remained unaffected by these heavy metals. Expression of TaZIFL7.2 showed almost no change in the presence of Ni in either tissue, yet it was specifically up-regulated in roots exposed to Cd or Co. Similarly, TaZIFL6.1 was significantly up-regulated only in roots upon exposure to Ni and Co (Fig. 7). Overall, these data indicate that specific heavy metals influence the expression of wheat ZIFL genes in a tissue-dependent manner.
Expression of wheat ZIFL transcripts in different wheat tissues and during stress condition
ZIFL gene expression was also analyzed in different wheat tissues and developmental stages using qRT-PCR and publicly available transcript expression data. Developing grains are an important reservoir for micronutrients; therefore, the expression of the putative wheat TOM genes was examined during grain development (7, 14, 21 and 28 days after anthesis, DAA). The qRT-PCR analysis showed a differential expression pattern for the majority of the candidate genes (Additional file 2: Figure S3). In general, most of the putative candidate TOM genes showed high expression at the late phase of grain maturation, i.e. 21 and 28 DAA. In addition, the wheat expression browser expVIP (http://www.wheat-expression.com/) was used to extract expression values as transcripts per million (TPM). TaZIFL expression values in different tissues (aleurone-al, starchy endosperm-se, seed coat-sc, leaf, root, spike, shoot) and at various developmental stages were extracted (Additional file 7: Table S6) and depicted as a heatmap (Fig. 8). In the grain tissue developmental time course (GTDT) [27], the highest expression was seen for TaZIFL1.2 (3B, 3D) and TaZIFL5 (5A, 5B), with an increase in expression in “al” at 20 dpa and in “al” and “se” at 30 dpa. In the grain tissue-specific expression data (at 12 dpa) [28], TaZIFL1.2 was not expressed but, as in the GTDT study, TaZIFL5 (5A, 5B) was expressed in “al” as well as “se” (Fig. 8). In “sc” tissue, TaZIFL2.2-3D and TaZIFL7.1 (4A, 4D) had the highest expression compared with the other ZIFL genes. Among the other tissues, TaZIFL2.2 was abundant in spike and TaZIFL1.2 in leaf and root. TaZIFL5 was predominantly expressed in all the tested tissues (leaf, root, spike and shoot).
The transcripts expressed exclusively in root were TaZIFL2.4-5D, TaZIFL2.5-5B, TaZIFL6.1-5A and TaZIFL7.2-4D, while TaZIFL4.1 (4B, 4D), TaZIFL4.2-4A and TaZIFL6.2 (4A, 4B, 4D) were highly induced at the three-leaf and flag-leaf stages compared with the seedling stage. The highest induction was seen for TaZIFL4.2-4D. In addition, the highest expression overall across the five tissues was observed for TaZIFL1.2 (3A, 3B, 3D) in leaves at the seedling as well as the tillering stage, TaZIFL2.2-3D in spike, TaZIFL3-4A in leaf, TaZIFL5-5A in grain, and TaZIFL7.1-4D in grain and leaf.
Under abiotic stress, no major changes were observed for most TaZIFLs, but TaZIFL4.1 (4B, 4D) and TaZIFL4.2 (4A, 4D) were significantly downregulated, by ~ 14-fold, under phosphate deficiency, and TaZIFL6.2-4D was downregulated by threefold (Additional file 2: Figure S4a, Additional file 7: Table S6). Under heat, drought and combined heat-drought stress (Additional file 2: Figure S4a, Additional file 7: Table S6), TaZIFL7.1 (4A, 4D) and TaZIFL7.2-4D were induced by up to ~ sevenfold and ~ twofold, respectively, whereas TaZIFL1.2 (3A, 3B, 3D) and TaZIFL5 (5A, 5B) were downregulated by 6- and 7.5-fold, respectively. No significant changes in TaZIFL gene expression were observed in Fusarium head blight infected spikelets (Additional file 2: Figure S4b, Additional file 7: Table S6). In Septoria tritici infected seedlings, a ~ twofold induction was observed for TaZIFL1.2 (3A, 3B) after 4 days of infection, whereas prolonged infection (13 days) resulted in its downregulation. Other ZIFLs with altered expression were TaZIFL1.2-3D and TaZIFL2.2-3A (> twofold up-regulation) and TaZIFL3-4B (up to 2.7-fold downregulation). TaZIFL1.2 (3A, 3B, 3D) was also downregulated (up to ~ fourfold) in seedlings with stripe rust infection, while only TaZIFL1.2-3D was downregulated under powdery mildew infection. These expression data suggest that specific ZIFLs are differentially regulated under infection and show perturbed expression under abiotic stresses.
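Fold changes such as those above are derived by comparing TPM values between a stress sample and its control. How the ratios were computed is not described in this section; the sketch below shows one common convention, assuming a pseudocount of 1 TPM to avoid division by zero for unexpressed transcripts. The function names and the TPM values in the usage note are hypothetical.

```python
import math


def tpm_ratio(tpm_treated, tpm_control, pseudocount=1.0):
    """Ratio of treated to control TPM after adding a pseudocount."""
    if tpm_treated < 0 or tpm_control < 0:
        raise ValueError("TPM values cannot be negative")
    return (tpm_treated + pseudocount) / (tpm_control + pseudocount)


def log2_fold_change(tpm_treated, tpm_control, pseudocount=1.0):
    """Log2 of the ratio; positive means induced, negative means repressed."""
    return math.log2(tpm_ratio(tpm_treated, tpm_control, pseudocount))


def signed_fold(tpm_treated, tpm_control, pseudocount=1.0):
    """Express the change as +n-fold (up) or -n-fold (down)."""
    ratio = tpm_ratio(tpm_treated, tpm_control, pseudocount)
    return ratio if ratio >= 1 else -1.0 / ratio
```

With made-up values, a transcript at 0 TPM under stress and 13 TPM in the control has `signed_fold` of about -14 (a 14-fold downregulation), while 7 TPM against 0 TPM gives a `log2_fold_change` of 3.0.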