Predicting Dropout in MENA STEM Higher Education Using Explainable AI: A Machine Learning Approach
This study aims to develop an explainable machine learning–based early warning system to predict dropout risk among Science, Technology, Engineering, and Mathematics (STEM) students in the MENA region. Using longitudinal data from 6,798 undergraduate STEM students enrolled at a major UAE university, we evaluated six supervised classifiers: XGBoost, Gradient Boosting Machine (GBM), Random Forest, CART, Logistic Regression, and K-Nearest Neighbors. Models were trained on institutional student information system (SIS) data spanning ten cohorts (2010–2019), with class imbalance addressed through ROSE sampling. The top-performing models (XGBoost, GBM, and Random Forest) achieved AUC-ROC scores exceeding 0.91 and F1-scores above 0.84, significantly outperforming baseline models. Key predictors of dropout included the number of withdrawn semesters, second-term credit load, academic probation history, and performance in mathematics and physics. To improve interpretability, we applied SHapley Additive exPlanations (SHAP) analysis, enabling both global and individual-level feature attribution. The system offers scalable, real-time predictive capabilities using only routinely available SIS data, with no need for external surveys or learning management system inputs. The novelty of this research lies in its integration of explainable AI into a regional context, enabling early, transparent, and actionable interventions to reduce dropout. These findings contribute to data-driven retention strategies in higher education systems where predictive tools remain underutilized.
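The individual-level feature attribution mentioned above rests on Shapley values. The following is a minimal sketch of that idea, not the study's pipeline: it computes exact Shapley values for a toy logistic risk model by enumerating feature coalitions. The feature names, weights, bias, and student records are hypothetical and chosen only for illustration. Production SHAP tooling uses faster, model-specific algorithms and a background dataset rather than a single baseline record.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical weights for a toy logistic risk model (NOT taken from the study).
WEIGHTS = {"withdrawn_semesters": 1.2, "on_probation": 0.9, "term2_credits": -0.15}
BIAS = -0.5


def risk(x):
    """Toy dropout-risk score in (0, 1) for a feature dict x."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in x.items())
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, student, baseline):
    """Exact Shapley values of `student` relative to a single `baseline` record.

    Features outside a coalition are set to their baseline value. The cost is
    exponential in the number of features, so this only suits tiny examples.
    """
    names = list(student)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: student[f] if f in subset else baseline[f] for f in names}
                with_feature = dict(without)
                with_feature[name] = student[name]
                total += weight * (model(with_feature) - model(without))
        phi[name] = total
    return phi


# Hypothetical records: an at-risk student and a reference "typical" student.
student = {"withdrawn_semesters": 2, "on_probation": 1, "term2_credits": 9}
baseline = {"withdrawn_semesters": 0, "on_probation": 0, "term2_credits": 15}

phi = shapley_values(risk, student, baseline)
for feature, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {contribution:+.3f}")
```

The attributions sum to the difference between the student's predicted risk and the baseline's predicted risk, which is what lets an advisor read each value as that feature's share of the elevated risk.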
- This work (including HTML and PDF Files) is licensed under a Creative Commons Attribution 4.0 International License.



















