A Smart Learning Analytics Model for Dropout Risk Estimation Using Classification Techniques

Authors

  • Pranita Pramod Patil

Keywords

Decision Tree, Learning Management Systems (LMS), Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machine

Abstract

Student dropout remains a significant concern in modern education systems, affecting institutional performance, learner progression, and socio-economic development. The expansion of digital learning environments has generated large volumes of educational data, enabling data-driven predictive analysis. This study presents a predictive framework for early detection of student dropout risk using machine learning classification techniques. The proposed system combines academic performance indicators, engagement metrics from Learning Management Systems (LMS), and demographic attributes into a single predictive model. Preprocessing steps, including missing-value handling, normalization, and categorical encoding, are applied to improve data quality. Several classification algorithms, including Logistic Regression, Decision Tree, Random Forest, Naïve Bayes, and Support Vector Machine, are implemented and evaluated using accuracy, precision, recall, and F1-score. Experimental findings indicate that ensemble-based models, particularly Random Forest, achieve higher predictive accuracy and greater stability than the other classifiers evaluated. The framework enables early identification of at-risk students and thereby supports proactive intervention. The results highlight the value of learning analytics and machine learning in turning educational data into actionable insights for improving student retention and academic success.
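
The workflow outlined in the abstract (preprocessing, five classifiers, four evaluation metrics) can be illustrated with a minimal sketch. This is not the study's implementation: it assumes scikit-learn and pandas, and the feature names (gpa, lms_logins_per_week, assignments_submitted, enrollment_type), the synthetic data generator, the labelling rule, and all hyperparameters are hypothetical stand-ins for the academic, LMS engagement, and demographic attributes the abstract mentions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature names; the study's actual attributes are not listed.
NUMERIC = ["gpa", "lms_logins_per_week", "assignments_submitted"]
CATEGORICAL = ["enrollment_type"]


def make_synthetic_students(n=400, seed=0):
    """Generate toy student records (an assumption, not the study's data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "gpa": rng.normal(2.8, 0.6, n),
        "lms_logins_per_week": rng.poisson(5, n).astype(float),
        "assignments_submitted": rng.integers(0, 11, n).astype(float),
        "enrollment_type": rng.choice(["full_time", "part_time"], n),
    })
    # Invented labelling rule: lower grades and engagement raise dropout risk.
    risk = (1.5 - 0.6 * df["gpa"] - 0.15 * df["lms_logins_per_week"]
            - 0.1 * df["assignments_submitted"] + rng.normal(0, 0.5, n))
    y = (risk > risk.median()).astype(int)  # 1 = at risk of dropping out
    # Blank out some grades so the missing-value step has work to do.
    df.loc[rng.random(n) < 0.05, "gpa"] = np.nan
    return df, y


def build_preprocessor():
    """Impute and scale numeric columns; impute and one-hot encode categoricals."""
    numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                        ("scale", StandardScaler())])
    categorical = Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                            ("encode", OneHotEncoder(handle_unknown="ignore"))])
    # sparse_threshold=0 forces dense output, which GaussianNB requires.
    return ColumnTransformer([("num", numeric, NUMERIC),
                              ("cat", categorical, CATEGORICAL)],
                             sparse_threshold=0)


def evaluate_models(seed=0):
    """Fit the five classifiers and return their metrics on a held-out split."""
    X, y = make_synthetic_students(seed=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Naive Bayes": GaussianNB(),
        "Support Vector Machine": SVC(),
    }
    results = {}
    for name, clf in models.items():
        pipe = Pipeline([("prep", build_preprocessor()), ("clf", clf)])
        pipe.fit(X_tr, y_tr)
        pred = pipe.predict(X_te)
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_te, pred, average="binary", zero_division=0)
        results[name] = {"accuracy": accuracy_score(y_te, pred),
                         "precision": precision, "recall": recall, "f1": f1}
    return results


if __name__ == "__main__":
    for name, metrics in evaluate_models().items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Wrapping the preprocessing and the classifier in one pipeline means the imputation and scaling statistics are learned from the training split only, which avoids leaking test information into the model. Scores obtained on the synthetic data say nothing about the relative performance reported in the study.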

Published

2026-02-27
