Identifying differentially expressed genes in DNA microarray data

  • DNA Microarrays
  • Bioinformatics
  • Statistics
  • Computer Engineering
  • Biology


We have developed two new nonparametric statistical tests for identifying differentially expressed genes in DNA microarray data: the average difference score (ADS) and the mean difference score (MDS). The ADS generalizes the independently consistent expression (ICE) discriminator proposed by Bijlani and co-workers. The MDS extends the Welch t-test and the Fisher correlation score. The new tests replace the serial noise estimator used in existing tests with a parallel noise estimator. The result is better detection of changes in the variance of expression levels, which t-test-type criteria tend to under-emphasize. We compare the new tests with several commonly used nonparametric tests: the nonparametric Welch t-test, the Fisher correlation score, the Wilcoxon rank sum test, and ICE. The comparison uses two standard feature selection performance criteria, feature selection accuracy and classification accuracy, together with a new criterion that we developed, the ensemble diversity. On these criteria, ADS and MDS outperform the other tests, exhibiting higher sensitivity and comparable specificity, which makes them more useful for identifying differentially expressed genes. We support this claim with synthetic data generated from normal and mixed normal models and with real biological data obtained from acute lymphoblastic leukemia and acute myeloid leukemia patients. ADS flags several biologically important genes that are missed by the nonparametric Welch t-test.
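
The abstract does not give the ADS or MDS formulas, so they are not reproduced here. As a rough illustration of the baseline only, the following minimal sketch computes the standard Welch t statistic for three synthetic "genes" drawn from normal models. The gene names, sample size, and distribution parameters are assumptions chosen for illustration and do not come from the paper. It is meant to show why a gene whose variance changes while its mean stays fixed tends to score low under a t-test-type criterion.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t statistic for two samples with possibly unequal variances."""
    var_a = statistics.variance(a)  # unbiased sample variance
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def normal_sample(rng, mean, sd, n):
    """Draw n values from a normal distribution (hypothetical helper)."""
    return [rng.gauss(mean, sd) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 50                  # samples per class; an illustrative assumption

# Each entry holds the expression values of one gene in two classes.
genes = {
    "mean_shift": (normal_sample(rng, 0, 1, n), normal_sample(rng, 2, 1, n)),
    "variance_change": (normal_sample(rng, 0, 1, n), normal_sample(rng, 0, 3, n)),
    "unchanged": (normal_sample(rng, 0, 1, n), normal_sample(rng, 0, 1, n)),
}

scores = {name: welch_t(a, b) for name, (a, b) in genes.items()}

# Rank genes by the magnitude of the statistic, largest first.
for name in sorted(scores, key=lambda g: abs(scores[g]), reverse=True):
    print(f"{name:16s} t = {scores[name]: .3f}")
```

In this sketch the gene whose mean shifts receives a far larger absolute statistic than the gene whose variance alone changes, because the numerator of the Welch statistic depends only on the difference in class means. This is the kind of gene that, according to the abstract, the new tests are designed to detect more reliably.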
