Multiple Disease Prediction based on User Symptoms using Machine Learning Algorithms

Authors

  • Sadvika Alli, Postgraduate Student, Department of Computer Science & Engineering, Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India
  • Shyam Sunder Pabboju, Assistant Professor, Department of Computer Science & Engineering, Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India
  • K. Sreekala, Assistant Professor, Department of Computer Science & Engineering, Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India
  • C. R. K. Reddy, Professor, Department of Computer Science & Engineering, Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India
  • A. Nagesh, Professor, Department of Computer Science & Engineering, Mahatma Gandhi Institute of Technology, Hyderabad, Telangana, India

Keywords:

Decision Trees, K-Nearest Neighbors (KNN), Logistic Regression, Naïve Bayes, Random Forests, Support Vector Machines (SVM)

Abstract

We propose a system that predicts multiple diseases from user-reported symptoms using machine learning. The prediction models are trained on a large dataset of medical records and symptom-disease relationships. Decision Trees, Support Vector Machines (SVM), Naïve Bayes, Logistic Regression, K-Nearest Neighbors (KNN), and Random Forests are used to classify the input symptoms. A custom backend handles data preprocessing and powers a user-centered interface for symptom entry. The trained models are stored so that predictions can be served in real time. Patient data is kept confidential in a secure database that complies with healthcare information privacy standards. Deployed on scalable platforms, the system is intended to support healthcare professionals in early diagnosis and personalized treatment planning. In doing so, it aims to enhance healthcare delivery, reduce costs, and improve patient outcomes. Symptom-based disease prediction, backed by machine learning, holds the potential to transform clinical decision-making.
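
The pipeline described in the abstract (encode symptoms, train several classifiers, store them, then serve predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the use of scikit-learn and pickle, the symptom vocabulary, the toy records, the hyperparameters, and the helper names `encode` and `predict_disease` are all assumptions made for illustration only.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical symptom vocabulary and toy records (not the paper's dataset).
SYMPTOMS = ["fever", "cough", "headache", "rash", "nausea", "fatigue"]

TOY_RECORDS = [
    (["fever", "cough", "fatigue"], "flu"),
    (["fever", "cough"], "flu"),
    (["cough", "fatigue"], "flu"),
    (["headache", "nausea"], "migraine"),
    (["headache", "nausea", "fatigue"], "migraine"),
    (["headache"], "migraine"),
    (["rash", "fever"], "measles"),
    (["rash"], "measles"),
    (["rash", "fever", "fatigue"], "measles"),
]


def encode(symptoms):
    """Turn a list of symptom names into a binary feature vector."""
    present = set(symptoms)
    return [1 if name in present else 0 for name in SYMPTOMS]


X = np.array([encode(symptoms) for symptoms, _ in TOY_RECORDS])
y = np.array([disease for _, disease in TOY_RECORDS])

# The six algorithm families named in the abstract, with illustrative settings.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": SVC(kernel="linear"),
    "naive_bayes": BernoulliNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=3),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

for model in models.values():
    model.fit(X, y)

# Serialize the fitted models once; a backend could load them at startup.
stored = pickle.dumps(models)
loaded = pickle.loads(stored)


def predict_disease(symptoms):
    """Return each loaded model's predicted disease for the given symptoms."""
    features = np.array([encode(symptoms)])
    return {name: str(m.predict(features)[0]) for name, m in loaded.items()}


predictions = predict_disease(["fever", "cough", "fatigue"])
for name, disease in predictions.items():
    print(f"{name}: {disease}")
```

In a real deployment the stored models would more likely be written to disk or a model registry than kept in memory, and a held-out test set would be needed before comparing the classifiers; the toy data here is far too small for that.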

Published

2025-07-14

Section

Articles