Massively Parallel Feature Selection Based on Ensemble of Filters and Multiple Robust Consensus Functions for Cancer Gene Identification

  • Conference paper
Intelligent Systems in Science and Information 2014 (SAI 2014)

Part of the book series: Studies in Computational Intelligence ((SCI,volume 591))

Abstract

Cancer remains a leading health concern worldwide. Selecting appropriate biomarkers for early cancer detection can improve patient care, and such discoveries have often driven revolutions in medicine. Statistical and machine learning techniques have been broadly investigated for biomarker identification, especially feature selection, in which researchers try to identify the most discriminative genes in order to predict cancer subtypes more accurately. The robustness of the selected signature remains a crucial goal in personalized medicine. Ensemble and parallel feature selection are promising techniques for addressing this problem and have seen increasing use in biomarker discovery. In this chapter, we focus on the principal aspects of using ensemble feature selection in biomarker discovery. Furthermore, we propose a massively parallel meta-ensemble of filters (MPME-FS) to select a robust and parsimonious subset of genes. Two types of filters (ReliefF and Information Gain) are investigated in this study. We evaluate the proposed approach on five publicly available cancer datasets in terms of robustness, classification power, and the biological meaning of the selected signatures. The results indicate that the MPME-FS approach can effectively identify a small subset of biomarkers and improve both robustness and classification accuracy.
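The general idea of ensemble feature selection with a filter can be illustrated with a minimal sketch. It is not the MPME-FS method itself: it uses only an Information Gain filter (no ReliefF), runs sequentially rather than in parallel, and aggregates by mean rank, which is one common consensus function chosen here for illustration. The median-based binarization, the function names (`information_gain`, `rank_features`, `ensemble_select`), and the synthetic data are all assumptions introduced for this example.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one feature after binarizing it at its median.

    The median split is an illustrative discretization, not taken from the paper.
    """
    threshold = sorted(column)[len(column) // 2]
    groups = {True: [], False: []}
    for value, label in zip(column, labels):
        groups[value >= threshold].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values() if g)
    return entropy(labels) - conditional


def rank_features(X, y):
    """Return the rank of every feature (0 = highest information gain)."""
    n_features = len(X[0])
    scores = [information_gain([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: -scores[j])
    ranks = [0] * n_features
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks


def ensemble_select(X, y, n_bootstraps=30, k=2, seed=0):
    """Rank features on bootstrap samples and keep the k best by mean rank."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    rank_sums = [0] * n_features
    for _ in range(n_bootstraps):
        # Draw a bootstrap sample (sampling rows with replacement).
        idx = [rng.randrange(n) for _ in range(n)]
        ranks = rank_features([X[i] for i in idx], [y[i] for i in idx])
        for j, r in enumerate(ranks):
            rank_sums[j] += r
    # Consensus step: a lower mean rank means a more consistently relevant feature.
    return sorted(range(n_features), key=lambda j: rank_sums[j] / n_bootstraps)[:k]


# Synthetic data: features 0 and 1 follow the class label, features 2-5 are noise.
data_rng = random.Random(42)
y = [i % 2 for i in range(40)]
X = [
    [label + data_rng.gauss(0, 0.1), label + data_rng.gauss(0, 0.1)]
    + [data_rng.gauss(0, 1) for _ in range(4)]
    for label in y
]
selected = ensemble_select(X, y)
print(selected)
```

Because each bootstrap ranking is computed independently, the loop in `ensemble_select` is the natural place to parallelize in a larger setting; the sketch keeps it sequential for clarity.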

Notes

  1. http://levis.tongji.edu.cn/gzli/data/mirror-kentridge.html.

  2. http://www.gems-system.org.

Author information

Corresponding author

Correspondence to Anouar Boucheham.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Boucheham, A., Batouche, M. (2015). Massively Parallel Feature Selection Based on Ensemble of Filters and Multiple Robust Consensus Functions for Cancer Gene Identification. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems in Science and Information 2014. SAI 2014. Studies in Computational Intelligence, vol 591. Springer, Cham. https://doi.org/10.1007/978-3-319-14654-6_6

  • DOI: https://doi.org/10.1007/978-3-319-14654-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14653-9

  • Online ISBN: 978-3-319-14654-6

  • eBook Packages: Engineering, Engineering (R0)
