Abstract
Microarray data classification is an open and active research field.
The development of more accurate algorithms is of great interest, in part because many of the techniques can be applied directly to the analysis of other kinds of omics data.
In this work, an ensemble feature selection algorithm is applied within a binary classification framework from [1] that has already achieved good predictive results. Ensemble feature selection is a rich field of research.
Ensemble techniques combine individual experts (i.e., classifiers) through a voting scheme in order to improve on the results of each individual expert.
In this case, a thinning algorithm is proposed that starts with all the available experts and removes them one by one, with the aim of improving the ensemble vote.
Two versions of an ensemble thinning algorithm have been tested, and three key elements have been introduced to work with microarray data: the ensemble cohort definition; the nonexpert notion, which defines a set of experts excluded from the thinning process; and a rule to break ties in the thinning process. Experiments were carried out on seven public datasets from the Microarray Quality Control study, MAQC [2].
The studied ensemble technique improves on state-of-the-art results by producing classifiers with significantly better performance, and the proposed key elements have proven useful for prediction performance.
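
The thinning idea summarized above can be illustrated with a minimal sketch. It is not the algorithm evaluated in this work: it assumes binary 0/1 labels, a plain majority vote, accuracy as the quality measure, and a hypothetical tie-breaking rule (drop the lowest-indexed candidate). The cohort definition and the nonexpert notion are not modeled, and all function names are illustrative.

```python
from typing import List, Sequence, Tuple


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine 0/1 predictions from several experts (vote ties go to class 1)."""
    n_experts = len(predictions)
    n_samples = len(predictions[0])
    votes = []
    for i in range(n_samples):
        ones = sum(p[i] for p in predictions)
        votes.append(1 if 2 * ones >= n_experts else 0)
    return votes


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of samples on which the prediction matches the true label."""
    return sum(int(p == t) for p, t in zip(pred, truth)) / len(truth)


def thin_ensemble(
    predictions: Sequence[Sequence[int]], truth: Sequence[int]
) -> Tuple[List[int], float]:
    """Greedy backward thinning: start from all experts and drop one at a
    time, as long as the drop strictly improves the ensemble vote accuracy."""
    kept = list(range(len(predictions)))
    best = accuracy(majority_vote([predictions[i] for i in kept]), truth)
    while len(kept) > 1:
        # Score the ensemble obtained by removing each remaining expert.
        candidates = []
        for e in kept:
            rest = [predictions[i] for i in kept if i != e]
            candidates.append((accuracy(majority_vote(rest), truth), e))
        top_acc = max(acc for acc, _ in candidates)
        if top_acc <= best:
            break  # no removal improves the vote
        # Assumed tie-break: among equally good removals, drop the lowest index.
        drop = min(e for acc, e in candidates if acc == top_acc)
        kept.remove(drop)
        best = top_acc
    return kept, best


if __name__ == "__main__":
    truth = [1, 0, 1, 0]
    experts = [
        [1, 0, 1, 0],  # always right
        [0, 1, 0, 1],  # always wrong
        [1, 0, 0, 0],  # right on three of four samples
    ]
    print(thin_ensemble(experts, truth))
```

In this toy run the always-wrong expert is removed and thinning then stops, since no further removal raises the vote accuracy. In practice the thinning would be driven by held-out data rather than the labels used for the final evaluation.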