Study design
Frozen samples from six patients were retrieved from the tissue bank (-80°C) owned by the South Swedish Breast Cancer Group. In order to obtain RNA of different quality, four equally sized pieces (by weight) from each invasive breast cancer sample were placed at room temperature for four different lengths of time: 50 seconds, 2–3 minutes, 10 minutes, and 30 minutes, after which the samples were placed in liquid nitrogen.
The ethical committee at Lund University approved this project.
RNA isolation and quality control
The samples were pulverized with a Micro-dismembrator II (B. Braun Biotech Int., Germany). RNA was extracted using Trizol reagent (Invitrogen, Carlsbad, CA) and purified with Qiagen RNeasy Midi columns (Qiagen, Chatsworth, CA). The RNA concentration was determined using a Nanodrop Spectrophotometer (NanoDrop Technologies, Wilmington, DE). The RNA quality was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) together with the reagents in the RNA 6000 Nano LabChip kit. All samples were within the kit capacity (5–500 ng/μl). The Agilent 2100 Bioanalyzer generates an electropherogram and a gel-like image and displays results such as the sample RNA concentration and the so-called ribosomal ratio, i.e. the ratio between the ribosomal subunits, 28S/18S.
The electropherogram can be evaluated in three ways. With visual inspection (Manual method), the quality of RNA is considered good if the electropherogram shows two distinct peaks, one for 28S and one for 18S, and a flat baseline (e.g. Fig. 1a, 50 sec.). The electropherogram of a degraded sample contains many small peaks and a highly elevated baseline (e.g. Fig. 1b, 30 min.). Between the good and the degraded samples are the partly degraded samples, in which two peaks are visible but the baseline is elevated (e.g. Fig. 1c, 50 sec.). Most of these are considered good enough for further analysis, i.e. to proceed to the hybridization step. However, methods that rely on visual inspection are subjective and tend to vary over time.
A more objective way to evaluate the quality of RNA may be to use a threshold for the 28S/18S ratio as a cut-off (Ratio method). From previous studies, we have established a threshold for the Bioanalyzer ratio at ≥ 0.65 (data not shown).
A more recent approach is the RNA Integrity Number (RIN) method, which is a standardization of RNA quality control [8, 19]. It is a software algorithm developed to extract information about RNA sample integrity from the Bioanalyzer electrophoretic trace and to eliminate the effect of individual interpretation on RNA quality control. It takes the entire electropherogram into consideration and is based on a numbering system from 1 to 10, where 1 represents the most degraded RNA and 10 represents intact RNA. When the RIN tool was developed, input data included approximately 1,300 total RNA samples from various tissues, all with varying levels of RNA integrity [19]. After a threshold value has been established, it can be used in the RNA quality control procedure, but if any experimental parameter is changed (e.g. type of organism, type of tissue, type of microarray platform, or RNA extraction procedure) the validation procedure needs to be repeated. There are thus no established cut-off values, and each laboratory needs to establish its own.
We have previously compared RIN values with results from the Manual method in a series of 163 breast tumors used in other projects. In these projects the samples were extracted in essentially the same way as in the present study. All samples considered to be of good RNA quality with the Manual method had RIN values between 6 and 8 (median 7). The median values for the partly degraded and degraded samples were 6 (range: 3–7) and 4 (range: 2–6), respectively. Based on these results, we considered values greater than or equal to 6 to represent good RNA. This cut-off was therefore also used in the present study.
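The two numerical cut-offs described above (28S/18S ratio ≥ 0.65 and RIN ≥ 6) can be illustrated with a minimal sketch. The sample identifiers and values below are invented for illustration only and are not data from this study; the function names are likewise hypothetical.

```python
# Cut-offs stated in the text.
RIN_CUTOFF = 6.0      # RIN >= 6 is taken to represent good RNA
RATIO_CUTOFF = 0.65   # Bioanalyzer 28S/18S ratio >= 0.65


def passes_rin(rin, cutoff=RIN_CUTOFF):
    """Return True if the RIN value meets the cut-off (RIN method)."""
    return rin >= cutoff


def passes_ratio(ratio, cutoff=RATIO_CUTOFF):
    """Return True if the 28S/18S ratio meets the cut-off (Ratio method)."""
    return ratio >= cutoff


# Hypothetical samples; the values are made up for this example.
samples = [
    {"id": "tumor1_50s", "rin": 7.4, "ratio": 1.20},
    {"id": "tumor1_30min", "rin": 4.1, "ratio": 0.40},
    {"id": "tumor2_10min", "rin": 6.0, "ratio": 0.60},
]

# Map each sample to (RIN call, Ratio call).
calls = {s["id"]: (passes_rin(s["rin"]), passes_ratio(s["ratio"])) for s in samples}

for sample_id, (rin_ok, ratio_ok) in calls.items():
    print(sample_id, "RIN ok:", rin_ok, "ratio ok:", ratio_ok)
```

The third hypothetical sample shows that the two methods need not agree for a sample close to the cut-offs.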
cDNA microarrays
Five micrograms of tumor RNA was labeled with Cy3® dCTP (Amersham Biosciences, Piscataway, NJ), and 5 μg of reference RNA (Stratagene, La Jolla, CA), consisting of a pool of ten different tumor cell lines, was labeled with Cy5® dCTP (Amersham Biosciences, Piscataway, NJ), according to the manufacturer's instructions using the reagents in the ChipShot™ labeling system kit (Corning Inc., Corning, NY).
Arrays were produced by the Swegene DNA Microarray Resource Centre, Department of Oncology at Lund University, Sweden, using a set of 26,819 human oligonucleotide probes, each 70 base pairs long (Operon Ver. 2.1 and Ver. 2.1.1 upgrade, Cat. No. 810516 and 810518), which were obtained from Operon Biotechnologies, Inc. (Huntsville, AL). The probes represent 16,641 gene symbols.
Prior to hybridization, slides were UV cross-linked at 800 mJ/cm² and pre-treated using the Pronto!™ Plus System 6 (Corning, Inc., Corning, NY), according to the manufacturer's instructions. Arrays were scanned at two wavelengths using an Agilent G2505A DNA microarray scanner (Agilent Technologies, Santa Clara, CA) with 10 μm resolution. GenePix Pro 4.0 software (Axon Instruments, Inc., Union City, CA) was used for image analysis. Gene names were linked to the spots, and spots of poor quality were manually excluded. Raw data are available at Gene Expression Omnibus [20].
Data analysis
Cy3 and Cy5 intensities were background corrected using the median feature and the median local background intensities provided in the data matrix. Within arrays, intensity ratios for individual features were calculated as the background-corrected intensity of the tumor sample divided by the background-corrected intensity of the reference sample. The data matrix was uploaded to BASE [11], where the data analysis took place.
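As a minimal sketch of the calculation described above, the background-corrected intensity can be taken as the median feature intensity minus the median local background intensity, with the ratio formed as tumor (Cy3) over reference (Cy5). The subtraction form, the function names, and the choice to return None for non-positive corrected intensities are assumptions made for this illustration. They are not a description of the BASE implementation.

```python
def background_corrected(feature_median, background_median):
    """Median feature intensity minus median local background (assumed form)."""
    return feature_median - background_median


def intensity_ratio(cy3_feature, cy3_background, cy5_feature, cy5_background):
    """Tumor (Cy3) over reference (Cy5) ratio for one spot.

    Returns None when either corrected intensity is not positive.
    Treating zero like a negative value is an assumption of this sketch,
    made to avoid division by zero.
    """
    tumor = background_corrected(cy3_feature, cy3_background)
    reference = background_corrected(cy5_feature, cy5_background)
    if tumor <= 0 or reference <= 0:
        return None
    return tumor / reference


# Hypothetical spot: (1200 - 200) / (600 - 100)
example_ratio = intensity_ratio(1200, 200, 600, 100)
print(example_ratio)
```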
Spots with intensities lower than zero, and spots that were flagged as bad or not found, were excluded. Reporters that were not present in 100% of the arrays were filtered out, and the data were normalized using Lowess [21], resulting in 14,288 reporters in the final analysis. Unsupervised hierarchical clustering, using Euclidean distance, was performed in BASE. Concentrating on RIN values, we divided the data into two groups, RIN ≥ 6 and RIN < 6, and compared the gene expression profiles of these two groups to see whether there was a significant difference between the groups for any given reporter. We performed a gene score analysis in BASE to assess statistical significance in terms of false discovery rates (FDR), and a permutation test was performed to obtain an estimate of the rate of differentially expressed reporters.
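The reporter filter, the RIN grouping, and the Euclidean distance used for clustering can be sketched as follows. This is an illustration under stated assumptions, not the BASE workflow: the data layout (one dictionary of reporter values per array, with None marking an excluded spot), the array names, and all values are hypothetical, and Lowess normalization, the clustering itself, and the gene score analysis are not reproduced here.

```python
import math


def reporters_present_in_all(arrays):
    """Return reporters that have a usable value on every array."""
    names = list(arrays)
    common = set(arrays[names[0]])
    for name in names:
        # Keep only reporters whose value on this array is not None.
        common &= {rep for rep, value in arrays[name].items() if value is not None}
    return sorted(common)


def split_by_rin(rin_by_array, cutoff=6.0):
    """Assign arrays to the RIN >= cutoff group or the RIN < cutoff group."""
    good = sorted(a for a, rin in rin_by_array.items() if rin >= cutoff)
    poor = sorted(a for a, rin in rin_by_array.items() if rin < cutoff)
    return good, poor


def euclidean(arrays, a, b, reporters):
    """Euclidean distance between two arrays over the given reporters."""
    return math.sqrt(sum((arrays[a][r] - arrays[b][r]) ** 2 for r in reporters))


# Hypothetical data: three arrays, three reporters.
arrays = {
    "P1_50s": {"rep1": 0.5, "rep2": -1.0, "rep3": 0.0},
    "P1_30min": {"rep1": 0.1, "rep2": -0.7, "rep3": None},  # rep3 excluded
    "P2_50s": {"rep1": 0.5, "rep2": -1.0},                  # rep3 not found
}
rin_by_array = {"P1_50s": 7.5, "P1_30min": 3.9, "P2_50s": 6.0}

kept = reporters_present_in_all(arrays)
good, poor = split_by_rin(rin_by_array)
distance = euclidean(arrays, "P1_50s", "P1_30min", kept)
print(kept, good, poor, distance)
```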
Ontological mapping using the publicly available software GoMiner [12] was performed to investigate the most significantly affected GO categories. A p-value ≤ 0.05 was used, and only categories with ≥ 3 changed genes were considered in the analysis. The percentage of changed genes in each category was calculated.
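The category selection rule above (p-value ≤ 0.05, at least 3 changed genes, percentage of changed genes per category) can be sketched as a simple filter. The record layout and the category names and counts are hypothetical. They do not reflect GoMiner's output format or any result of this study.

```python
def select_categories(categories, p_max=0.05, min_changed=3):
    """Keep categories meeting both criteria and report the percent changed."""
    selected = []
    for cat in categories:
        if cat["p_value"] <= p_max and cat["changed"] >= min_changed:
            percent_changed = 100.0 * cat["changed"] / cat["total"]
            selected.append((cat["name"], percent_changed))
    return selected


# Hypothetical categories, invented for illustration.
categories = [
    {"name": "category_A", "p_value": 0.01, "changed": 3, "total": 12},
    {"name": "category_B", "p_value": 0.001, "changed": 2, "total": 4},   # too few genes
    {"name": "category_C", "p_value": 0.20, "changed": 5, "total": 20},   # p too high
    {"name": "category_D", "p_value": 0.05, "changed": 4, "total": 16},
]

selected = select_categories(categories)
print(selected)
```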
Validating the RIN threshold value
To validate the RIN cut-off value, the RIN values were compared to the Pearson correlation coefficients (Fig. 4). In BASE, Pearson correlation coefficients were obtained by relating the gene expression of each sample at the different time points at room temperature to the gene expression of the corresponding sample left for 50 seconds at room temperature. Poor correlations should correspond to lower RIN values, and good correlations should correspond to higher RIN values (Fig. 4). If the correlation coefficient of the gene expression is taken as the true measure of RNA quality, only two samples did not obtain RIN values as expected, i.e. they had a low correlation coefficient and a RIN value above the cut-off, or vice versa (Fig. 4). Both samples had a RIN value close to the cut-off. This strengthened the choice of 6 as the cut-off for the RIN method in the present study.
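The comparison described above can be sketched as a Pearson correlation against the 50-second profile, followed by a check of whether the RIN call and the correlation call agree. The text does not state a numerical boundary between low and high correlation, so the `r_cutoff` value of 0.9 below is purely an assumption for illustration, as are the function names and the example values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_discordant(rin, r, rin_cutoff=6.0, r_cutoff=0.9):
    """True when the RIN call and the correlation call disagree.

    r_cutoff is a hypothetical threshold; the text gives no value for it.
    """
    return (rin >= rin_cutoff) != (r >= r_cutoff)


# Hypothetical expression values for one sample at two time points.
profile_50s = [0.5, -1.0, 0.2, 1.5]
profile_30min = [0.4, -0.8, 0.1, 1.2]

r = pearson(profile_50s, profile_30min)
print(r, is_discordant(rin=4.0, r=r))
```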