A Data Science Maturity Model Applied to Students' Modeling

L. Cavique, Paulo Pombinho, Luís Correia


Maturity models define a series of levels, each representing increased complexity in information systems. Data Science is discussed in the Business Intelligence (BI) and Business Analytics (BA) literature. This work applies the _IABE maturity model, which adds two levels: Data Engineering (DE) at the bottom and Business Experimentation (BE) at the top. The study uses the _IABE model for students' modeling in the ModEst project. The Public Administration body involved is the Directorate-General for Statistics of Education and Science (DGEEC) of the Portuguese Ministry of Education. DGEEC provided extensive data on two million students per year in the Portuguese school system, from preschool to doctoral programs. This work presents the comprehensible _IABE maturity model as a means of extracting new knowledge from the DGEEC dataset. In the _IABE method, once the DE level is complete, wh-questions are formulated and answered with the most appropriate techniques at each maturity level. The novelty of this work lies in applying the _IABE maturity model to a unique dataset for the first time. At the BI level, wh-questions are posed and answered with data summarization; at the BA level, predictive models are built; and at the BE level, counterfactual approaches are presented.


DOI: 10.28991/ESJ-2023-07-06-08

Full Text: PDF


Keywords: Maturity Model; Wh-question; Students' Modeling; Business Intelligence; Business Analytics; Causality.



Copyright (c) 2023 Luís Cavique