Multi-Class Cancer Classification with SVM Using Wrapper Forward and Backward Feature Selection for Dimension Reduction


  • May Myat Myat Khaing, Faculty of Computer Science, University of Computer Studies, Yangon, Myanmar
  • May Mar Oo, Faculty of Information and Communication Technology, University of Technology (Yatanarpon Cyber City), Pyin Oo Lwin, Myanmar


Cancer type detection, gene expression data, Minimum Redundancy Maximum Relevance, wrapper-based feature selection method, forward feature selection, backward feature elimination, ICMR, SVM, logistic regression


The use of machine learning (ML) in healthcare has grown enormously in recent years. The efficacy of supervised ML models depends heavily on the quality of the training data, and feature selection is a crucial factor in model performance, especially in complex tasks such as multi-class cancer classification. This research investigates wrapper-based forward feature selection and backward feature elimination in combination with logistic regression. The features selected by these approaches are then used to classify cancer types with support vector machines (SVM). The study uses a partially complete gene dataset obtained from the Indian Council of Medical Research (ICMR). In multi-class cancer classification, our approach achieved a success rate of 96% with features selected by forward selection and 97% with features obtained by backward elimination.
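The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `SequentialFeatureSelector` as the wrapper, uses synthetic three-class data from `make_classification` in place of the ICMR gene dataset, and picks the feature count, cross-validation folds, and linear SVM kernel purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a gene expression matrix (assumption: the real
# ICMR data is not available here). 300 samples, 20 features, 3 classes.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=6, n_redundant=2,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def select_and_classify(direction, n_features=8):
    """Wrapper selection scored by logistic regression, then an SVM.

    direction is "forward" (add features greedily) or "backward"
    (remove features greedily). n_features is an illustrative choice.
    """
    model = make_pipeline(
        StandardScaler(),
        SequentialFeatureSelector(
            LogisticRegression(max_iter=1000),
            n_features_to_select=n_features,
            direction=direction,
            cv=3,
        ),
        SVC(kernel="linear"),
    )
    model.fit(X_train, y_train)
    # Boolean mask over the original columns marking the kept features.
    mask = model.named_steps["sequentialfeatureselector"].get_support()
    return mask, model.score(X_test, y_test)


results = {d: select_and_classify(d) for d in ("forward", "backward")}
for d, (mask, acc) in results.items():
    print(f"{d}: kept {int(mask.sum())} features, test accuracy {acc:.3f}")
```

Placing the selector inside the pipeline means the features are chosen from the training split only, so the held-out accuracy is not inflated by selection on test data. The accuracies this sketch prints on synthetic data are unrelated to the 96% and 97% reported in the study.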






How to Cite

May Myat Myat Khaing, & May Mar Oo. (2024). Multi-Class Cancer Classification with SVM Using Wrapper Forward and Backward Feature Selection for Dimension Reduction. International Journal of Computer (IJC), 51(1), 43–69. Retrieved from