Bagging-Based Ensemble of Optimized SVM Classifiers for Robust Breast Cancer Prediction

Authors

  • Satish Kumar Kalagotla
  • Thoudam Basanta
  • Mutum Bidyarani Devi

Keywords

Bagging ensemble, Bootstrap aggregating, Breast cancer diagnosis, Heterogeneous ensemble, Model diversity, Support vector machine, Variance reduction, Weighted voting

Abstract

Background: A single Support Vector Machine (SVM) classifier, even with optimized feature selection and hyperparameters, exhibits high variance, i.e., sensitivity to perturbations of the training data, which limits its reliability in critical medical diagnosis applications. Ensemble methods, particularly bagging, improve robustness and accuracy by combining multiple diverse classifiers.

Objective: This paper proposes a novel heterogeneous bagging ensemble framework that integrates five optimized SVM variants, namely DT-SVM (missing-value handling), Correlation-SVM (multicollinearity-aware), ABC-SVM (feature-optimized), GS-GA-SVM (parameter-optimized), and a standard SVM, to achieve robust and accurate breast cancer prediction.
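
For concreteness, a minimal sketch of how such a five-member pool could be assembled with scikit-learn pipelines is shown below. The median imputer, the univariate feature filter, and the fixed C/gamma values are illustrative stand-ins (assumptions, not the authors' components) for the paper's decision-tree-based imputation, correlation analysis, artificial-bee-colony feature selection, and grid-search/genetic-algorithm tuning, none of which are detailed in this abstract.

    # Illustrative stand-ins only; the paper's actual preprocessing and tuned
    # settings for each variant are not reproduced here.
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.impute import SimpleImputer
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.svm import SVC

    def make_base_learners():
        """Return the five heterogeneous SVM variants as named pipelines."""
        return {
            # DT-SVM: handles missing values before the SVM (median imputation
            # as a stand-in for decision-tree-based handling).
            "dt_svm": make_pipeline(SimpleImputer(strategy="median"),
                                    StandardScaler(), SVC(probability=True)),
            # Correlation-SVM: drops redundant features (a univariate filter
            # as a stand-in for multicollinearity analysis).
            "corr_svm": make_pipeline(SimpleImputer(strategy="median"),
                                      SelectKBest(f_classif, k=5),
                                      StandardScaler(), SVC(probability=True)),
            # ABC-SVM: feature subset assumed pre-selected by artificial bee colony.
            "abc_svm": make_pipeline(SimpleImputer(strategy="median"),
                                     StandardScaler(), SVC(probability=True)),
            # GS-GA-SVM: C and gamma assumed pre-tuned by grid search + GA
            # (the values below are placeholders).
            "gs_ga_svm": make_pipeline(SimpleImputer(strategy="median"),
                                       StandardScaler(),
                                       SVC(C=10.0, gamma=0.01, probability=True)),
            # Standard SVM: default settings as the baseline ensemble member.
            "std_svm": make_pipeline(SimpleImputer(strategy="median"),
                                     StandardScaler(), SVC(probability=True)),
        }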

Methods: The proposed framework draws a separate bootstrap sample for each base learner, trains each SVM variant on its sample with out-of-bag validation, and aggregates the predictions via weighted voting, with the weights optimized on validation performance. The framework was evaluated on four benchmark medical datasets (Wisconsin Breast Cancer, PIMA Indian Diabetes, Hepatitis, and Mammographic Mass) and compared against the individual base learners and a homogeneous SVM bagging ensemble using five repeats of 10-fold cross-validation.
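
A minimal sketch of this training and aggregation loop follows, assuming NumPy arrays X and y and the make_base_learners() pool from the previous sketch. Out-of-bag accuracy serves here as a simple proxy for the paper's validation-optimized voting weights.

    import numpy as np
    from sklearn.base import clone
    from sklearn.metrics import accuracy_score

    def fit_bagged_ensemble(learners, X, y, seed=0):
        """Train each SVM variant on its own bootstrap sample and weight it
        by out-of-bag accuracy (a proxy for the optimized weights)."""
        rng = np.random.default_rng(seed)
        n = len(y)
        models, weights = [], []
        for name, est in learners.items():
            boot = rng.integers(0, n, size=n)         # sample n rows with replacement
            oob = np.setdiff1d(np.arange(n), boot)    # rows never drawn (~36.8% of data)
            model = clone(est).fit(X[boot], y[boot])
            models.append(model)
            weights.append(accuracy_score(y[oob], model.predict(X[oob])))
        w = np.asarray(weights)
        return models, w / w.sum()                    # normalize to voting weights

    def predict_weighted(models, weights, X):
        """Soft weighted voting over the base learners' class probabilities."""
        probs = sum(w * m.predict_proba(X) for m, w in zip(models, weights))
        return models[0].classes_[probs.argmax(axis=1)]

Soft voting relies on SVC(probability=True), as set in the pool above; a larger ensemble could be formed by drawing several bootstrap replicates per variant.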

Results: The heterogeneous bagging ensemble achieved 98.76% accuracy on the Wisconsin dataset, significantly outperforming the individual SVM variants (average 95.8%) and standard bagging with homogeneous SVMs (97.1%). The ensemble reduced the standard deviation of its predictions by 67.7% relative to single classifiers (0.0042 vs. 0.013). Diversity analysis revealed moderate dependence among the base learners (mean Q-statistic 0.52, mean pairwise correlation 0.65), confirming complementary strengths; the optimized weighting assigned the highest weights to ABC-SVM (0.24) and GS-GA-SVM (0.23). Cross-dataset validation showed consistent improvements: PIMA Indian Diabetes (88.67%), Hepatitis (89.51%), and Mammographic Mass (90.83%). Robustness testing demonstrated superior performance under label noise: at a 20% noise level, accuracy degraded by only 5.9%, compared to 10.0% for a standard SVM.
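
For reference, the Q-statistic used in the diversity analysis is Yule's Q, Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10), computed from the joint correctness pattern of a classifier pair; a direct implementation is sketched below. Q near 1 means the pair errs on the same cases, near 0 means roughly independent errors, and negative values mean complementary errors.

    import numpy as np

    def q_statistic(pred_a, pred_b, y_true):
        """Yule's Q-statistic for a pair of classifiers. Assumes the
        denominator is nonzero (neither classifier is perfect on y_true)."""
        a = np.asarray(pred_a) == np.asarray(y_true)  # correctness of classifier A
        b = np.asarray(pred_b) == np.asarray(y_true)  # correctness of classifier B
        n11 = np.sum(a & b)       # both correct
        n00 = np.sum(~a & ~b)     # both wrong
        n10 = np.sum(a & ~b)      # only A correct
        n01 = np.sum(~a & b)      # only B correct
        return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)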

Conclusion: The heterogeneous bagging ensemble of optimized SVMs provides a robust, high-performance framework for breast cancer prediction, significantly reducing variance while improving accuracy. The diversity among the base learners and the optimized weighting scheme contribute to its superior generalization, making it suitable for clinical deployment where prediction stability is paramount.

Published

2026-04-10

How to Cite

Satish Kumar Kalagotla, Thoudam Basanta, & Mutum Bidyarani Devi. (2026). Bagging-Based Ensemble of Optimized SVM Classifiers for Robust Breast Cancer Prediction. Journal of Information Technology and Sciences, 12(1), 47–76. Retrieved from https://matjournals.net/engineering/index.php/JOITS/article/view/3422
