T-CER-Net: Attention-Based Temporal Cross-Eye Regression for Noise-Resilient Detection of Intermittent Strabismus
Automated strabismus screening using video is difficult in unconstrained settings, where brief events such as blinking, head movement, or tracking errors can easily be mistaken for true ocular misalignment. The objective of this study is to improve diagnostic specificity while maintaining sensitivity in automated pre-screening scenarios. To address this problem, a temporal analysis framework, termed the Temporal Cross-Eye Regression Network (T-CER-Net), is proposed. The method introduces the Cross-Eye Regression Error (CERE), a scale- and position-invariant temporal signal that characterizes deviations in binocular coordination by measuring prediction error between the two eyes. Rather than relying on frame-level deviation estimates, the approach analyzes extended CERE sequences using a Transformer Encoder to assess temporal consistency. In addition, the training procedure explicitly accounts for real-world variability through oversampling of normal sequences containing common artifacts and the use of class weighting. The proposed method was evaluated against static threshold-based classifiers and a CNN–LSTM temporal baseline. On a held-out test set, T-CER-Net achieved an area under the ROC curve of 0.9140, with a sensitivity of 0.8421 and a specificity of 0.8500, showing improved robustness to noise-induced false positives. The findings suggest that treating binocular misalignment as a temporal pattern, together with attention-based sequence analysis, offers a practical and robust basis for automated strabismus pre-screening in real-world settings.
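To make the idea of a scale- and position-invariant cross-eye signal concrete, the sketch below builds a toy version of it on synthetic pupil trajectories. This is not the paper's implementation: the actual CERE uses a learned cross-eye regressor, whereas here an identity mapping between the two normalized pupil positions stands in for it, and all landmark geometry (corner positions, eye width, noise levels) is invented for illustration. Normalizing each pupil position by its own eye's corner-to-corner width removes head translation and camera-distance scale, so only genuine binocular miscoordination raises the residual.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
t = np.linspace(0, 2 * np.pi, T)

# Shared gaze trajectory (as a fraction of eye width), plus head
# translation and a camera-distance scale change that corrupt raw
# pixel coordinates but should not affect the invariant signal.
gaze = 0.5 + 0.2 * np.sin(t)
head_x = 20 * np.sin(0.5 * t)          # head translation (px)
scale = 1.0 + 0.3 * t / t.max()        # apparent eye size grows

def eye_pixels(offset, deviation=0.0):
    """Hypothetical pixel coordinates for one eye's landmarks."""
    inner = offset + head_x                         # inner corner x
    outer = inner + 40 * scale                      # outer corner x
    pupil = inner + (gaze + deviation) * (outer - inner)
    pupil = pupil + rng.normal(0, 0.5, T)           # tracking jitter
    return pupil, inner, outer

dev = np.zeros(T)
dev[40:60] = 0.15                      # intermittent misalignment episode
lp, li, lo = eye_pixels(100.0)
rp, ri, ro = eye_pixels(200.0, deviation=dev)

# Scale- and position-invariant normalized pupil position per eye.
left_n = (lp - li) / (lo - li)
right_n = (rp - ri) / (ro - ri)

# Stand-in CERE: per-frame residual between the normalized positions
# (the paper learns a cross-eye regressor; identity is assumed here).
cere = np.abs(left_n - right_n)
print(round(cere[:40].mean(), 3), round(cere[40:60].mean(), 3))
```

The residual stays near the jitter floor during coordinated frames and jumps during the deviated segment; in the proposed method, windows of this sequence (rather than single frames) are what the Transformer encoder classifies, which is what lets transient artifacts such as blinks be separated from sustained misalignment.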
This work (including HTML and PDF files) is licensed under a Creative Commons Attribution 4.0 International License.