Employee Attrition Model: A Data-driven Approach to Workforce Analytics
Keywords:
Employee attrition, HR analytics, IBM HR dataset, Machine learning, Predictive analytics, Random Forest, Workforce retention, XGBoost
Abstract
Employee attrition is a significant challenge for organizations, imposing considerable costs through recruitment, training, and reduced productivity. To address it, this study develops a predictive model that identifies employees at risk of leaving and determines the factors that contribute most to attrition. Using publicly available data such as the IBM HR Analytics dataset, along with real-world HR records, the study applies several machine learning (ML) algorithms (logistic regression, Random Forest, XGBoost, and support vector machines) together with data preprocessing steps that include feature engineering and class imbalance handling. The results indicate that the ensemble-based models, particularly Random Forest and XGBoost, outperform the baseline algorithms on accuracy, precision, recall, and F1-score, and they highlight monthly income, job satisfaction, and years at the company as key predictors. Overall, the findings show that ML-driven predictive analytics can help HR managers design targeted retention initiatives and improve workforce stability. Future research could incorporate explainable AI (XAI) techniques to improve interpretability and ease deployment in dynamic organizational settings.
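As a brief illustration of why the evaluation reports precision, recall, and F1-score alongside accuracy, the sketch below computes all four metrics for a binary attrition label. It is only a minimal example under stated assumptions: the function name `classification_metrics` and the toy labels are hypothetical and are not taken from the study's data or code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 1 = employee left, 0 = employee stayed.
# Only 2 of 10 employees left, mimicking an imbalanced attrition dataset.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A trivial baseline that predicts "stayed" for everyone.
always_stay = [0] * 10
# A hypothetical model that finds one leaver and raises one false alarm.
model_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

baseline = classification_metrics(y_true, always_stay)
model = classification_metrics(y_true, model_pred)
print("baseline:", baseline)
print("model:   ", model)
```

On these toy labels both predictors reach the same accuracy, yet the trivial baseline never identifies a leaver, so its recall and F1-score are zero. This is the reason class imbalance handling and the additional metrics matter when comparing attrition models.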