Predictive Modelling of Student Outcomes Using Ensemble Regression and Classification Methods
DOI:
https://doi.org/10.5281/zenodo.17599070
Keywords:
Student Performance Prediction, Ensemble Learning, HistGradientBoostingClassifier, Educational Data Mining, Multiclass Classification, Machine Learning
Abstract
Accurate prediction of student academic outcomes is vital for developing data-driven interventions in education. This study proposes a robust ensemble learning framework based on the HistGradientBoostingClassifier (HGB) to classify student grades using behavioral and academic features such as self-study hours, attendance, class participation, and total performance scores. Leveraging a large-scale synthetic dataset of 1,000,000 student records, we benchmarked the proposed HGB model against widely used ensemble classifiers including XGBoost, LightGBM, CatBoost, and Random Forest. Comprehensive experiments demonstrated that HGB consistently outperformed all baselines, achieving a testing accuracy of 99.6%, with macro-averaged precision, recall, and F1-score of 0.99. The model also showed strong generalization across both majority and minority grade categories, as confirmed by confusion matrix analysis. These results highlight the effectiveness of histogram-based boosting in educational data mining and support its application in real-time academic performance monitoring and intervention systems.
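The abstract reports macro-averaged precision, recall, and F1-score alongside a confusion matrix analysis. As a minimal sketch of how those quantities relate, the Python below builds a confusion matrix and macro-averages the per-class scores. The grade labels and predictions are hypothetical toy values, not data from the study, and the helper names `confusion_matrix` and `macro_scores` are illustrative only. Macro F1 is taken here as the unweighted mean of per-class F1, which is one common convention.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count outcomes; rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def macro_scores(matrix):
    """Return macro-averaged (precision, recall, F1) from a confusion matrix."""
    n = len(matrix)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[r][k] for r in range(n))  # column total
        actual_k = sum(matrix[k])                          # row total
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    # Every class contributes equally, regardless of how many samples it has.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels: a majority grade "A" and a minority grade "C".
labels = ["A", "B", "C"]
y_true = ["A"] * 6 + ["B"] * 3 + ["C"]
y_pred = ["A"] * 5 + ["B"] + ["B"] * 3 + ["B"]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(matrix[k][k] for k in range(len(labels))) / len(y_true)
macro_precision, macro_recall, macro_f1 = macro_scores(matrix)

print("confusion matrix:", matrix)
print("accuracy:", round(accuracy, 3))
print("macro precision:", round(macro_precision, 3))
print("macro recall:", round(macro_recall, 3))
print("macro F1:", round(macro_f1, 3))
```

In this toy case the single minority-class sample is misclassified, so the macro-averaged scores fall well below plain accuracy. That gap is why macro averages and the confusion matrix are useful complements to accuracy when grade categories are imbalanced.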
License
Copyright (c) 2025 Mary Teresa, Sukerthi Sutraya, Y Vijaya Sambhavi, Saritha Dasari, J K Neelima

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
