Bosio M, Bellot P, Salembier P, Oliveras A. Feature set enhancement via hierarchical clustering for microarray classification. In IEEE International Workshop on Genomic Signal Processing and Statistics, GENSIPS 2011. 2011. pp. 226-229.

Abstract

A new method for gene expression classification is proposed in this paper. In a first step, the original feature set is enriched with new features, called metagenes, produced via hierarchical clustering. In a second step, a reliable classifier is built through a wrapper feature selection process. The selection relies on two criteria: the classical classification error rate and a new reliability measure. The result is a classifier with good predictive ability that uses as few features as possible, which reduces the risk of overfitting. The method has been tested on three public cancer datasets: leukemia, lymphoma and colon. It obtained interesting classification results, and the experiments confirmed the utility of both the metagenes and the feature ranking criterion in improving the final classifier.
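
The first step, enriching the feature set with metagenes, can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the merging rule, so this example assumes Pearson correlation as the similarity and the element-wise mean of the two merged profiles as the metagene. The names `pearson` and `add_metagenes` are hypothetical, and the code is not the authors' implementation. The wrapper selection step and the reliability measure are not sketched, since the abstract does not define them.

```python
from itertools import combinations
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def add_metagenes(genes):
    """Return the original profiles plus one metagene per agglomerative merge.

    genes: dict mapping a gene name to its expression values (one per sample).
    Assumptions (not taken from the paper): the closest pair is the most
    correlated one, and a metagene is the element-wise mean of the two
    profiles it merges.
    """
    enriched = dict(genes)  # final feature set: genes plus metagenes
    active = dict(genes)    # current cluster representatives
    next_id = 0
    while len(active) > 1:
        # Pick the most correlated pair among the current representatives.
        a, b = max(
            combinations(active, 2),
            key=lambda pair: pearson(active[pair[0]], active[pair[1]]),
        )
        merged = [(x + y) / 2 for x, y in zip(active[a], active[b])]
        name = f"metagene_{next_id}"
        next_id += 1
        # The merged pair is replaced by its metagene in the hierarchy.
        del active[a], active[b]
        active[name] = merged
        enriched[name] = merged
    return enriched


# Toy data: three genes measured on four samples (made-up values).
genes = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
enriched = add_metagenes(genes)
```

With n original genes, this construction yields n - 1 metagenes, so the enriched set holds 2n - 1 candidate features for the subsequent selection step.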