What is normal? Next generation sequencing-driven analysis of the human circulating miRNAOme
BMC Molecular Biology volume 17, Article number: 4 (2016)
MicroRNAs (miRNAs) are short non-protein-coding RNA species that have a regulatory function in modulating protein translation and degradation of specific mRNAs. MicroRNAs are estimated to target approximately 60 % of all human mRNAs and are associated with the regulation of all physiological processes. Similar to many messenger RNAs (mRNA), miRNAs exhibit marked tissue specificity, and appear to be dysregulated in response to specific pathological conditions. Perhaps, one of the most significant findings is that miRNAs are detectable in various biological fluids and are stable during routine clinical processing, paving the way for their use as novel biomarkers. Despite an increasing number of publications reporting individual miRNAs or miRNA signatures to be diagnostic of disease or indicative of response to therapy, there is still a paucity of baseline data necessary for their validation. To this end, we utilised state of the art sequencing technologies to determine the global expression of all circulating miRNAs within the plasma of 18 disease-free human subjects.
In excess of 500 miRNAs were detected in our study population with expression levels across several orders of magnitude. Ten highly expressed miRNAs accounted for 90 % of the total reads that mapped showing that despite the range of miRNAs present, the total miRNA load of the plasma was predominated by just these few species (50 % of which are blood cell associated). Ranges of expression were determined for all miRNA detected (>500) and a set of highly stable miRNAs identified. Finally, the effects of gender, smoking status and body mass index on miRNA expression were determined.
The data contained within will be of particular use to researchers performing miRNA-based biomarker screening in plasma and allow shortlisting of candidates a priori to expedite discovery or reduce costs as required.
MicroRNAs (miRNAs) are short non-protein-coding RNA species that have a regulatory function in modulating protein translation from specific mRNAs. MicroRNAs are estimated to target approximately 60 % of all human mRNAs and are associated with the regulation of all physiological processes. Similar to many messenger RNAs (mRNA), miRNAs exhibit marked tissue specificity [9, 12], and appear to be dysregulated in response to specific pathological conditions. Perhaps most significant is the finding that miRNAs are detectable in various biological fluids and are stable during routine clinical processing, paving the way for their use as novel biomarkers. Whilst the presence of extra-cellular miRNAs in a range of biological fluids has been consistently described within the recent literature, the precise mechanisms via which they leave their “host cell” and enter the circulation are still under investigation. Nevertheless, it is now accepted that miRNAs are not simply shed from necrotic or apoptotic cells but rather, are subject to selective export from specific cells and function as extra-cellular signalling molecules, retaining their biological activity in recipient cells.
In considering the use of miRNAs as novel biomarkers, it is essential to generate baseline data that describe which miRNA species are present in a given biological fluid, where they likely originate from, and what is considered normal in terms of their patterns of expression. Indeed, we take such data for granted when we consider other routine clinical analyses such as the measurement of plasma/serum aspartate aminotransferase and alanine aminotransferase for the determination of liver function. Despite an increasing number of publications (5897 articles indexed by PubMed at the time of publication) reporting individual miRNAs or miRNA signatures as biomarkers of various conditions, there is still a paucity of baseline data necessary for their validation. To date, no comprehensive assessment of plasma miRNA expression using unbiased methodology (i.e. where the miRNA targets need not be specified a priori) in a representative set of healthy human subjects has been reported. To this end, we utilised state-of-the-art sequencing and bioinformatic techniques to determine, with high resolution, the global expression of all circulating miRNAs within the plasma of 18 disease-free human subjects. We report data that address the key questions outlined above, including (1) a comprehensive list of all miRNAs found within human plasma, (2) the expected range of expression in a disease-free cohort and (3) the likely tissue of origin for a selection of the most highly expressed miRNAs. Furthermore, we report the effects of sex, smoking status and differing body mass index (BMI) on these parameters.
Results and discussion
All data presented herein are derived from 18 human plasma specimens, details of which can be found in Table 1. Circulating RNA was extracted from 5 mL of each plasma sample, yielding an average RNA mass of 0.01 µg (Table 2). The mass of RNA obtained was consistent irrespective of gender (P > 0.05; two-tailed t test; n = 9). Next generation sequencing for small RNA molecules yielded an average of 8,080,000 high quality (≥Q30) sequencing reads per sample (Fig. 1). Raw sequencing data in fastq format can be obtained from the Short Read Archive (SRA), accession number SRP064104. Mapping of the raw sequencing data to miRBase version 20.0 (Fig. 2a) identified 548 different miRNAs across the 18 samples analysed. Of these, only 53 were present in all samples (Table 3). The number of reads mapped and the number of miRNAs identified in each plasma sample is detailed in Fig. 2b, c respectively.
Normalised miRNA expression values were used to compare the expression of individual miRNAs between plasma samples (Additional file 1: S1). In the first instance, we investigated which miRNAs were most abundantly expressed in human plasma. Based upon the mean expression of 18 independent samples, microRNAs 486-5p, 10b-5p, 320a, 423-5p, 92a-3p, 22-3p, 10a-5p, 181a-5p, 151a-3p, and let-7f-5p were found to be most abundantly expressed in the plasma of our study participants. Surprisingly, the most abundant miRNA (miR-486-5p) accounted for almost 60 % of the total number of sequencing reads mapping to miRNAs and, furthermore, the ten most abundant miRNAs accounted for approximately 90 % of mapped reads. To verify these striking findings, we reviewed the results of thirty-seven publicly available plasma miRNA-seq datasets obtained via miRmine online (Guan Laboratory, University of Michigan; Additional file 1: S1, sheet 2) and undertook a complete re-analysis (from raw sequencing data) of three randomly selected plasma sequencing runs from the short read archive (accession numbers SRR1005875, SRR1005877 and SRR1005876). The top ten miRNAs accounted for 68 and 74 % of the total mapped read count in the reviewed and re-analysed datasets respectively. The fraction of the total plasma miRNA read count attributed to each of the top 10 most highly expressed miRNAs is detailed in Fig. 3 (left-hand axis). The normalised expression (reads per million, RPM, expressed on a log2 scale) of each of these miRNAs in human whole blood, serum and plasma (using publicly available data via miRmine) is included on the right-hand axis. Comparison of plasma data from this study and those derived from miRmine revealed no clear correlation; however, significant variation within the latter dataset may explain this finding (Fig. 3, right-hand axis error bars).
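The "top ten share" statistic quoted above can be illustrated with a minimal sketch. The function name `top_n_share` and the example counts are hypothetical and are not taken from the study's pipeline or data; the sketch only shows how a fraction of miRNA-mapped reads attributable to the most abundant species might be computed from a table of per-miRNA read counts.

```python
def top_n_share(read_counts, n=10):
    """Fraction of all miRNA-mapped reads taken by the n most abundant miRNAs."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    # Sort counts from most to least abundant and keep the first n.
    top = sorted(read_counts.values(), reverse=True)[:n]
    return sum(top) / total


# Hypothetical counts for illustration only (not data from the study).
example_counts = {
    "miR-A": 600,
    "miR-B": 200,
    "miR-C": 100,
    "miR-D": 60,
    "miR-E": 40,
}

print(f"top 2 share: {top_n_share(example_counts, n=2):.2f}")
```

Applied to a real count table with n = 10, the same calculation would give the kind of figure reported here (approximately 90 % in this study, 68 and 74 % in the comparison datasets).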
Next we sought to determine the origin of these highly expressed miRNAs. As per the available literature, five out of the ten most highly expressed miRNAs were previously identified as being highly expressed by cells of the blood. MicroRNAs 486-5p and 92a-3p were shown to be highly expressed by erythrocytes, whilst miRNAs 181a-5p, 151a-3p and let-7f-5p were highly expressed by various blood cell types [11, 15]. To ensure our highly expressed miRNAs were naturally present at such levels, as opposed to the result of haemolysis, we determined the expression variation in both blood cell and non-blood cell associated miRNAs with the expectation that these should be equivalent in the absence of haemolysis. Variation in our blood cell associated miRNAs was in fact less than that of the non-blood cell associated miRNAs (coefficient of variation = 95.8 vs 116.3 %, N = 18), supporting our findings. Leveraging over 300 publicly available next generation sequencing datasets (including 66, 13 and 37 in human blood, serum and plasma respectively), we were able to determine the tissue expression profile of each highly expressed miRNA. Whilst each miRNA exhibited a different tissue-based expression pattern, nine of the ten highly expressed miRNAs identified herein (miR-151a-3p was the exception) had median expression levels in either blood, serum or plasma in excess of the median expression for all tissues, suggesting that the blood is a significant source of these. Furthermore, in considering only those miRNA-seq datasets generated from human plasma, nine of the ten top expressed miRNAs identified in this study (miR-10b-5p was the exception) appeared within the upper quartile of expression, confirming that our results mirror those of other studies conducted using similar methods in the same biological sample type. Comprehensive tissue expression data for the most highly expressed miRNAs identified in human plasma are included within Fig. 4.
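The haemolysis check described above compares coefficients of variation (CV) between two groups of miRNAs. The text does not state how the group-level CV was aggregated, so the sketch below assumes the per-miRNA CV (sample standard deviation divided by the mean) is averaged within each group. The function names and example values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


def mean_group_cv(expression, mirnas):
    """Average the per-miRNA CV over a group (an assumed aggregation)."""
    return mean(coefficient_of_variation(expression[name]) for name in mirnas)


# Hypothetical normalised expression values across three samples.
expression = {
    "blood_associated_1": [10, 12, 8],
    "non_blood_1": [10, 20, 0],
}

blood_cv = mean_group_cv(expression, ["blood_associated_1"])
other_cv = mean_group_cv(expression, ["non_blood_1"])
print(f"blood-associated CV: {blood_cv:.1f} %, other CV: {other_cv:.1f} %")
```

Under this reading, a blood-associated CV that is no larger than the non-blood CV is consistent with the absence of sample-specific haemolysis.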
Our realisation that a number of the most highly expressed miRNAs were of blood cell origin led us to consider the potential implications that this may have for biomarker development (the challenges associated with miRNA biomarker development have been comprehensively reviewed elsewhere). The routine clinical sampling of blood involves venepuncture, transfer to a suitable storage vessel with or without anticoagulant, and delivery to the analytical laboratory. The preparation of plasma or serum involves a further step during which blood cells are removed by centrifugation to leave a “cell free” supernatant. At each point, there is potential to influence the number of blood cell associated miRNAs present in the remaining sample, either through haemolysis following venepuncture or during transportation, or via variation in the time and/or intensity of centrifugation. In support of this, Pritchard and colleagues have previously demonstrated significant increases in erythrocyte-associated miRNAs (miRs-451, 16, 92a, 486) in haemolysis, and explained miR-150 and 223 expression as a function of lymphocyte and neutrophil count respectively. Thus, small variations in blood sample preparation may lead to individual samples being enriched or depleted in certain blood cell types and thus have significant impacts upon the expression of specific miRNAs. Of concern is the fact that all of the blood cell associated miRNAs identified herein have been identified in biomarker screens and/or proposed as novel biomarkers for various conditions (miR-486-5p, gastric adenocarcinoma; miR-92a-3p, colorectal cancer; miR-181-5p, endometrial carcinoma; miR-151a-3p, paracetamol toxicity; let-7f-5p, Alzheimer’s disease) and may thus be subject to the aforementioned issues.
In order to determine the normal range of circulating miRNA expression within our disease-free population, the mean and the standard error of the normalised expression values were calculated and the miRNAs ranked according to their apparent stability. The most stably expressed miRNAs included microRNAs 486-5p, 25-3p, 10b-5p, 99b-5p, let-7f-5p, 10a-5p, 423-3p, 101-3p, 532-5p, and 103a-3p (Fig. 5). Although several of the most abundantly expressed miRNAs were identified, these highly stable miRNAs spanned several orders of expression magnitude, including both highly abundant and very lowly expressed miRNAs. Wang et al. have previously reported the stability of a restricted set of plasma and serum miRNAs and, despite differences in experimental design and measurement techniques, various miRNAs were identified as highly stable in both studies (Table 4).
The sole inclusion criterion for this study was the requirement to be free from overt disease at the time of blood donation, and thus our cohort was diverse in gender, smoking status and BMI. It was therefore possible to determine the effects of these factors on circulating miRNA expression. Perhaps unsurprisingly, the gender of the study participant had a significant effect on their circulating miRNA profile. Seventy-five miRNAs were differentially regulated between the two genders (Fig. 6a); however, rather surprisingly, 74 of these were up-regulated in females (in the absence of any significant differences in RNA mass obtained, number of clean sequencing reads or indeed the percentage of reads mapped to miRNAs; Fig. 6b, c). MicroRNA 486-5p, present in erythrocytes, was the only miRNA to be expressed at a higher level in males than in females, and we consider that this is likely due to the higher erythrocyte count typical of males. In considering why almost all of the deregulated miRNAs were elevated in females, we determined the genomic location at which they were encoded, with the hypothesis that there would be an over-representation of miRNAs encoded by the X chromosome. In fact, of the 75 miRNAs deregulated, only three were encoded at a locus on the X chromosome, indicating that X chromosome copy number is unlikely to be responsible for this phenomenon. A review of the available literature reveals that few studies have previously investigated gender-specific miRNA expression, and there is an absence of those that have specifically considered gender-specific expression in healthy human subjects. In order to validate our findings, we therefore returned to the 37 publicly available plasma miRNA-seq datasets, 27 of which had gender information. Considering the 74 miRNAs more highly expressed in female than in male subjects in our study, we confirmed that 53 of these were also more highly expressed in females in the publicly available datasets.
We next considered the impact of smoking status on circulating miRNA expression. Of the 18 participants, 8 (44 %) were self-reported smokers (an equal number of males and females). Smoking was associated with the down-regulation of 27 plasma miRNAs (Fig. 7), including several previously identified as performing tumour suppressor-like functions: let-7i-5p, miR-148a-5p, miR-218-5p, miR-29-3p, miR-133a, miR-296-5p and miR-370. We consider this finding particularly interesting; however, these results should be confirmed by analysis of a larger cohort of smokers incorporating robust statistical techniques to control for false discovery. Various publications have investigated the impact of smoking on miRNA expression in the lungs and lung cells and have reported an apparent global down-regulation of miRNA expression, potentially mediated via disruption of the Erk/dicer/trbp pathway. Here, we show the same phenomenon in the plasma; however, there is little agreement at present between the dysregulated miRNAs identified in each individual study, perhaps reflecting a difference in the smoking cohort or experimental setup, or suggesting that there are tissue-specific effects.
Finally, we considered the effects of obesity on the circulating miRNA profile of our study participants. Of the 18 donors, 8 had a BMI greater than or equal to 30 and were considered obese (2 males and 6 females). Sixteen miRNAs were dysregulated in obese individuals, and in all cases were more highly expressed than in normal weight control subjects (Fig. 8a). Given that our donors had a range of BMI values, we sought to determine whether any of the obesity-associated miRNAs correlated with BMI. The expression of both miR-129-5p and miR-30e-3p was found to correlate with BMI (R = 0.67 and 0.64 respectively) (Fig. 8b, c). In both cases, the relationship was stronger amongst female participants; however, it is highly likely that this effect reflects the higher number of obese females than obese males.
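The correlation between miRNA expression and BMI can be sketched as follows. The text reports R values without naming the coefficient; the example assumes Pearson's product-moment correlation, and the function name and input values are hypothetical rather than study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical donor BMI values and normalised expression of one miRNA.
bmi = [20, 25, 30, 35]
expression = [1.0, 2.0, 3.0, 4.0]
print(f"R = {pearson_r(bmi, expression):.2f}")
```

With 18 donors and an unbalanced split of obese males and females, a coefficient computed this way is sensitive to a few individuals, which is consistent with the caution expressed above about the apparent sex difference.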
The potential of miRNAs to serve as novel biomarkers is under intense investigation. To date, numerous studies have attempted to correlate miRNA expression with a range of endpoints. Despite the number of studies reporting such miRNA biomarkers, there is still a relative paucity of baseline data reporting the miRNA species expressed in a given biological fluid, what is considered normal in terms of their patterns of expression, and further, how this expression is altered by gender, obesity and smoking status. In this study we utilised state of the art sequencing technology to provide data to support the above key questions. In excess of 500 miRNAs were detected in our study population with expression across several orders of magnitude. However, only 53 of those miRNAs were detected in all participants with some miRNAs being detected in just a single individual. In considering relative expression between miRNAs, the top 10 most highly expressed candidates accounted for 90 % of the total reads that mapped to all miRNAs suggesting that despite the range of miRNAs present, the total miRNA load of the plasma was predominated by just 10 different species. Furthermore, many of the most abundant miRNAs have been shown to be highly expressed in cells of the blood and thus, perhaps with the exception of haematological biomarkers, their use as biomarkers should be approached with caution. Ranges of expression were determined for all miRNA detected (> 500) and a set of highly stable miRNAs identified. Finally, the effects of gender, smoking status and BMI on miRNA expression were determined. 
These data provide researchers with the ability to (1) determine the presence or absence of individual miRNAs within the plasma of a disease-free population, (2) consider the penetrance of each individual miRNA, (3) gain insight into the relative expression levels and “normal range” of individual miRNAs, ensuring that only suitably highly expressed and stable candidates are taken forward, and (4) obtain an indication of whether individual miRNAs are regulated by gender, smoking or obesity. Perhaps most striking and of most relevance here is the finding that, of the large pool of miRNAs detected in disease-free plasma, only a very small proportion may house suitable biomarker candidates (the 10 most highly expressed miRNAs account for approximately 90 % of the total miRNA pool, and the remaining 10 % shrinks further once miRNAs with low penetrance, those associated with cells of the blood, and those with high levels of inter-individual variation are discounted). Thus, the information contained within will assist researchers in designing more targeted miRNA biomarker studies.
Ethical approval and consent to participate
Sample collection was undertaken by Sera Lab ltd., who obtained ethical approval from an Institutional Review Board (Schulman Associates IRB #201209850). Written informed consent to participate was obtained from each study participant prior to sample collection.
Consent for publication
The supplying company (Sera Lab ltd) obtained written consent to produce reports or articles about the study from each participant, subject to their names being omitted from any such publications.
Samples, RNA isolation and sequencing
This study utilised human control material purchased from SeraLab ltd. All data were analysed anonymously; donor details are included in Table 1. Whole blood was drawn into EDTA-containing tubes and stored on ice prior to centrifugation at 1000×g to obtain the plasma component. Plasma samples were frozen at −20 °C immediately after production. RNA was extracted from 5 mL of each plasma sample using the QIAamp Circulating Nucleic Acid extraction kit in accordance with the manufacturer’s standard instructions. The quantity and quality of all RNA extracts were determined using the Qubit fluorimeter (Invitrogen) and Agilent BioAnalyzer (with Small RNA chip) respectively. Small RNA sequencing libraries were prepared using the TruSeq Small RNA kit according to the manufacturer’s standard directions. Sequencing was conducted on the Illumina HiSeq 2000 platform with the aim of obtaining >5,000,000 reads per sample.
All sequencing data underwent stringent quality control measures prior to analysis. Briefly, reads were quality clipped to retain only reads with a median quality of ≥Q30, sequencing adapters were removed, and read length distributions were assessed to ensure these met the expectations of a miRNA sequencing run. An extract and count routine was utilised to condense the millions of sequencing reads into sequence-tab-count format to increase computational efficiency. Reads shorter than 15 nucleotides or longer than 35 nucleotides were removed, given that these were unlikely to map to miRNAs. Following parsing to remove superfluous data columns, the resulting tally tables comprising the read sequence and count were processed by miRanalyzer V0.2. Reads were mapped to the human genome (hg18) and to miRBase version 20.0, permitting one mismatch between the sequencing reads and each index. Reads mapping to known miRNAs were counted and a relative expression value determined by dividing the number of reads mapping to each particular miRNA by the total number of reads mapped to all miRNAs. MicroRNAs evidenced by fewer than 10 sequencing reads or present in fewer than three study individuals were excluded prior to further analysis. Differential expression analysis and P value estimation were performed using Qlucore Omics Explorer. The significance of differences between the various factors (BMI, smoking status and gender) was determined using a two-tailed t-test. In each case, contributions from other factors were eliminated using Qlucore Omics Explorer.
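The read tallying, length filtering, relative expression and exclusion steps described above can be sketched in a few lines. This is an illustrative reconstruction and not the authors' pipeline or miRanalyzer itself: the function names are hypothetical, and the exclusion rule assumes that "fewer than 10 reads" refers to the total across samples and that "present" means a non-zero count, neither of which the text specifies.

```python
from collections import Counter

MIN_LEN, MAX_LEN = 15, 35        # read-length window stated in the text
MIN_READS, MIN_SAMPLES = 10, 3   # exclusion thresholds stated in the text


def tally_reads(reads):
    """Collapse reads into sequence -> count, keeping only 15-35 nt reads."""
    return Counter(r for r in reads if MIN_LEN <= len(r) <= MAX_LEN)


def relative_expression(mirna_counts):
    """Divide each miRNA's read count by the total reads mapped to all miRNAs."""
    total = sum(mirna_counts.values())
    if total == 0:
        raise ValueError("no reads mapped to miRNAs")
    return {name: n / total for name, n in mirna_counts.items()}


def passes_filters(counts_per_sample):
    """Assumed reading: >= 10 reads summed over samples, detected in >= 3 samples."""
    detected = sum(1 for n in counts_per_sample if n > 0)
    return sum(counts_per_sample) >= MIN_READS and detected >= MIN_SAMPLES


# Hypothetical reads: two copies of a 22-mer, one 14-mer, one 36-mer, one 15-mer.
reads = ["A" * 22, "A" * 22, "C" * 14, "G" * 36, "T" * 15]
tally = tally_reads(reads)
print(dict(tally))

# Hypothetical per-miRNA counts for one sample.
print(relative_expression({"miR-X": 30, "miR-Y": 10}))

# Hypothetical per-sample counts for one miRNA across three donors.
print(passes_filters([5, 4, 3]), passes_filters([20, 0, 0]))
```

Collapsing identical sequences before mapping means each distinct read is aligned once and its count carried along, which is the efficiency gain the tally format is intended to provide.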
Boon RA, Vickers KC. Intercellular transport of microRNAs. Arterioscler Thromb Vasc Biol. 2013;33(2):186–92.
Boyerinas B, Park S, Hau A, Murmann AE, Peter ME. The role of let-7 in cell differentiation and cancer. Endocr Relat Cancer. 2010;17(1):F19–36.
Cortez MA, Bueso-Ramos C, Ferdin J, Lopez-Berestein G, Sood AK, Calin GA. MicroRNAs in body fluids—the mix of hormones and biomarkers. Nat Rev Clin Oncol. 2011;8(8):467–77.
Dong Y, Zhao J, Wu C, Zhang L, Liu X, Kang W, Leung W, Zhang N, Chan FK, Sung JJ. Tumor suppressor functions of miR-133a in colorectal cancer. Mol Cancer Res. 2013;11(9):1051–60.
Graff JW, Powers LS, Dickson AM, Kim J, Reisetter AC, Hassan IH, Kremens K, Gross TJ, Wilson ME, Monick MM. Cigarette smoking decreases global microRNA expression in human alveolar macrophages. PLoS ONE. 2012;7(8):e44066.
Hackenberg M, Rodriguez-Ezpeleta N, Aransay AM. miRanalyzer: an update on the detection and analysis of microRNAs in high-throughput sequencing experiments. Nucleic Acids Res. 2011;39(Web Server issue):W132–8.
Hogg DR, Harries LW. Human genetic variation and its effect on miRNA biogenesis, activity and function. Biochem Soc Trans. 2014;42(part 4):1184–9.
Izzotti A, Calin GA, Arrigo P, Steele VE, Croce CM, de Flora S. Downregulation of microRNA expression in the lungs of rats exposed to cigarette smoke. FASEB J. 2009;23(3):806–12.
Laterza OF, Lim L, Garrett-Engele PW, Vlasakova K, Muniappa N, Tanaka WK, Johnson JM, Sina JF, Fare TL, Sistare FD, Glaab WE. Plasma microRNAs as sensitive and specific biomarkers of tissue injury. Clin Chem. 2009;55(11):1977–83.
Lee K, Lin F, Hsu T, Lin J, Guo J, Tsai C, Lee Y, Lee Y, Chen C, Hsiao M. MicroRNA-296-5p (miR-296-5p) functions as a tumor suppressor in prostate cancer by directly targeting Pin1. Biochim Biophys Acta (BBA)-Mol Cell Res. 2014;1843(9):2055–66.
Leidinger P, Backes C, Meder B, Meese E, Keller A. The human miRNA repertoire of different blood compounds. BMC Genom. 2014;15:474.
Mestdagh P, Lefever S, Pattyn F, Ridzon D, Fredlund E, Fieuw A, Ongenaert M, Vermeulen J, de Paepe A, Wong L, Speleman F, Chen C, Vandesompele J. The microRNA body map: dissecting microRNA function through integrative genomics. Nucleic Acids Res. 2011;39(20):e136.
Mitchell PS, Parkin RK, Kroh EM, Fritz BR, Wyman SK, Pogosova-Agadjanyan EL, Peterson A, Noteboom J, O’Briant KC, Allen A, Lin DW, Urban N, Drescher CW, Knudsen BS, Stirewalt DL, Gentleman R, Vessella RL, Nelson PS, Martin DB, Tewari M. Circulating microRNAs as stable blood-based markers for cancer detection. Proc Natl Acad Sci USA. 2008;105(30):10513–8.
Powers L, Dickson A, Gerke A, Hansdottir S, Wilson M, Gross T, Monick M, Sears R. Smoking disrupts the Erk/dicer/trbp/microrna pathway resulting in global down-regulation of microrna expression. Am J Respir Crit Care Med. 2011;183:A1048.
Pritchard CC, Kroh E, Wood B, Arroyo JD, Dougherty KJ, Miyaji MM, Tait JF, Tewari M. Blood cell origin of circulating microRNAs: a cautionary note for cancer biomarker studies. Cancer Prev Res (Phila). 2012;5(3):492–7.
Tiberio P, Callari M, Angeloni V, Daidone MG, Appierto V. Challenges in using circulating miRNAs as cancer biomarkers. BioMed Res Int. 2015;2015:731479.
Uesugi A, Kozaki K, Tsuruta T, Furuta M, Morita K, Imoto I, Omura K, Inazawa J. The tumor suppressive microRNA miR-218 targets the mTOR component Rictor and inhibits AKT phosphorylation in oral cancer. Cancer Res. 2011;canres.0368.2011.
Wang K, Yuan Y, Cho J, McClarty S, Baxter D, Galas DJ. Comparing the MicroRNA spectrum between serum and plasma. PLoS ONE. 2012;7(7):e41561.
Weber JA, Baxter DH, Zhang S, Huang DY, Huang KH, Lee MJ, Galas DJ, Wang K. The microRNA spectrum in 12 body fluids. Clin Chem. 2010;56(11):1733–41.
Wu Y, Crawford M, Mao Y, Lee RJ, Davis IC, Elton TS, Lee LJ, Nana-Sinkam SP. Therapeutic delivery of microRNA-29b by cationic lipoplexes for lung cancer. Mol Ther Nucleic Acids. 2013;2(4):84.
Yungang W, Xiaoyu L, Pang T, Wenming L, Pan X. miR-370 targeted FoxM1 functions as a tumor suppressor in laryngeal squamous cell carcinoma (LSCC). Biomed Pharmacother. 2014;68(2):149–54.
Zheng B, Liang L, Wang C, Huang S, Cao X, Zha R, Liu L, Jia D, Tian Q, Wu J. MicroRNA-148a suppresses tumor cell invasion and metastasis by downregulating ROCK1 in gastric cancer. Clin Cancer Res. 2011;17(24):7574–83.
DPT and TWG conceived and designed the study; DPT performed the laboratory work and prepared the manuscript. Both authors read and approved the final manuscript.
Funding for this study was provided from the development fund of Public Health England and the University of Keele.
The authors declare that they have no competing interests.
Additional file 1: S1. Sheet 1: Comprehensive list of raw and normalised expression values for all plasma miRNAs detected. Raw Sequencing Data—The raw sequencing data (in fastq format) supporting the results of this article is available in the Short Read Archive (SRA) repository, [SRP064104 http://www.ncbi.nlm.nih.gov/sra]. Sheet 2: Experiment IDs, PubMed IDs and metadata relating to the publically accessible datasets accessed for result validation.
Tonge, D.P., Gant, T.W. What is normal? Next generation sequencing-driven analysis of the human circulating miRNAOme. BMC Molecular Biol 17, 4 (2016). https://doi.org/10.1186/s12867-016-0057-9
- Plasma sequencing
- Small RNA