Predictive Modelling of Student Outcomes Using Ensemble Regression and Classification Methods

Authors

  • Mary Teresa, Department of CSE-AIML, Guru Nanak Institutions Technical Campus, Hyderabad, India
  • Sukerthi Sutraya, Department of Computer Science and Engineering (Data Science), G. Narayanamma Institute of Technology and Science, Hyderabad, India
  • Y Vijaya Sambhavi, Department of EEE, Annamacharya Institute of Technology and Sciences (Autonomous), Tirupati, India
  • Saritha Dasari, Department of Computer Science and Engineering (Data Science), G. Narayanamma Institute of Technology and Science, Hyderabad, India
  • J K Neelima, Department of E.C.E, Narayana Engineering College, Nellore, India

DOI:

https://doi.org/10.5281/zenodo.17599070

Keywords:

Student Performance Prediction, Ensemble Learning, HistGradientBoostingClassifier, Educational Data Mining, Multiclass Classification, Machine Learning

Abstract

Accurate prediction of student academic outcomes is vital for developing data-driven interventions in education. This study proposes a robust ensemble learning framework based on the HistGradientBoostingClassifier (HGB) to classify student grades using behavioral and academic features such as self-study hours, attendance, class participation, and total performance scores. Leveraging a large-scale synthetic dataset of 1,000,000 student records, we benchmarked the proposed HGB model against widely used ensemble classifiers including XGBoost, LightGBM, CatBoost, and Random Forest. Comprehensive experiments demonstrated that HGB consistently outperformed all baselines, achieving a testing accuracy of 99.6%, with macro-averaged precision, recall, and F1-score of 0.99. The model also showed strong generalization across both majority and minority grade categories, as confirmed by confusion matrix analysis. These results highlight the effectiveness of histogram-based boosting in educational data mining and support its application in real-time academic performance monitoring and intervention systems.
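The abstract reports accuracy together with macro-averaged precision, recall, and F1-score, and points to confusion matrix analysis for the minority grade categories. As a minimal sketch of how those macro-averaged figures can be derived from a confusion matrix, the Python below computes them by hand. The function name `macro_metrics` and the 3-class matrix `toy_cm` are hypothetical and purely illustrative; they are not the study's data or code. Macro F1 is taken here as the unweighted mean of the per-class F1 values, which is one common convention and an assumption about how the reported score was obtained.

```python
from typing import List, Tuple


def macro_metrics(cm: List[List[int]]) -> Tuple[float, float, float, float]:
    """Return (accuracy, macro precision, macro recall, macro F1).

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: predicted as k
        actual_k = sum(cm[k])                          # row sum: truly class k
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Macro averaging weights every class equally, regardless of its size.
    return (
        correct / total,
        sum(precisions) / n,
        sum(recalls) / n,
        sum(f1s) / n,
    )


# Hypothetical, imbalanced 3-class confusion matrix (illustrative only).
toy_cm = [
    [90, 10, 0],  # majority class
    [5, 40, 5],
    [0, 2, 8],    # minority class
]

acc, prec, rec, f1 = macro_metrics(toy_cm)
print(f"accuracy={acc:.4f} macro_P={prec:.4f} macro_R={rec:.4f} macro_F1={f1:.4f}")
```

In this toy matrix the macro-averaged scores come out below the overall accuracy because the smallest class is predicted less reliably. That gap is why macro averages are a useful complement to accuracy when grade categories are imbalanced.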

Published

2025-11-13

How to Cite

Mary Teresa, Sukerthi Sutraya, Y Vijaya Sambhavi, Saritha Dasari, & J K Neelima. (2025). Predictive Modelling of Student Outcomes Using Ensemble Regression and Classification Methods. International Journal of Human Computations and Intelligence, 5(1), 664–677. https://doi.org/10.5281/zenodo.17599070