A Comparative Analysis of Machine Learning Models for Predicting EFL Student Language Performance in Smart Learning Environments

Banchakarn Sameephet; Wirapong Chansanam; Mahboubeh Rakhshandehroo; Chawin Srisawat; Kittichai Nilubol; Arnon Jannok; Bhirawit Satthamnuwong; Kornwipa Poonpon

doi:10.28991/ESJ-2025-09-02-07

Authors

Banchakarn Sameephet, Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand
Wirapong Chansanam (wirach@kku.ac.th), Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand. ORCID: http://orcid.org/0000-0001-5546-8485
Mahboubeh Rakhshandehroo, Faculty of Culture and Representation, Doshisha Women's College of Liberal Arts, Kyoto, Japan; Center for Multilingual Education, Osaka University, Osaka, Japan
Chawin Srisawat, Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand
Kittichai Nilubol, Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand
Arnon Jannok, Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand
Bhirawit Satthamnuwong, Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand
Kornwipa Poonpon, Smart Learning Innovation Research Center and Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand

Vol. 9 No. 2 (2025): April

Research Articles

Abstract

Integrating smart learning environments into modern education systems creates significant opportunities to use data analysis techniques to predict students' English language performance. This study evaluates the performance of various machine learning models for predicting the performance of English as a foreign language (EFL) students, with an emphasis on data preprocessing and feature selection. The dataset was gathered from 181 students in eight middle schools in Thailand. The students' data were exported from the Smart Learning Project and cover 14 PISA-like English quizzes spanning 27 competencies. The study compares the predictive performance of machine learning models, including Random Forest, Support Vector Regression, AdaBoost, Bayesian Ridge, K-Nearest Neighbors, ElasticNet, XGBoost, Gradient Boosting, and Stacking Ensemble, using mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). The results indicate that ensemble models, particularly XGBoost and Stacking Ensemble, performed best in predicting students' English language performance. These models can efficiently capture complex relationships in educational data. Data preprocessing and feature selection also play a significant role in improving model performance. This study highlights the potential of advanced machine learning techniques in educational data analysis. The results can contribute to the development of personalized learning strategies and early intervention, supporting an efficient and adaptive education system and advancing smart learning and data-driven instruction.
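
As an illustration of how the four evaluation metrics named in the abstract relate to one another, the following minimal Python sketch computes MSE, RMSE, MAE, and R² from paired actual and predicted scores and ranks two candidate predictors by RMSE. It is not the authors' code. The function name `regression_metrics`, the labels `model_a` and `model_b`, and all score values are hypothetical and were invented for this example; they are not taken from the study's dataset or results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R^2 for two equal-length lists of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n          # mean squared error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = mse * n                                    # residual sum of squares
    # R^2 is undefined when the actual values have no variance
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical quiz scores and predictions (illustrative values only)
actual = [62.0, 75.0, 58.0, 90.0, 81.0]
predictions = {
    "model_a": [60.0, 78.0, 61.0, 85.0, 80.0],
    "model_b": [70.0, 70.0, 70.0, 70.0, 70.0],  # constant guess
}

results = {name: regression_metrics(actual, pred) for name, pred in predictions.items()}

# Rank candidates from lowest to highest RMSE
for name, metrics in sorted(results.items(), key=lambda kv: kv[1]["RMSE"]):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Lower MSE, RMSE, and MAE indicate smaller prediction errors, while R² closer to 1 indicates that more of the variance in the actual scores is explained. A negative R², as the constant guess in this toy example produces, means the predictor does worse than always predicting the mean of the actual scores.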


