Machine Learning-Based Spatiotemporal Modeling for Detecting Disease Hotspots in Primary Care Data

Authors

  • Rinna Rachmatika, Universitas Pamulang, Indonesia
  • Teti Desyani, Universitas Pamulang, Indonesia
  • Khoirudin, Universitas Semarang, Indonesia

DOI:

https://doi.org/10.70062/globalscience.v1i4.188

Keywords:

Spatiotemporal Modeling, Machine Learning, Disease Surveillance, Health Informatics, Predictive Analytics

Abstract

Diseases seen in primary health services exhibit complex spatiotemporal dynamics driven by urbanization and population mobility. Conventional surveillance approaches struggle to capture these patterns adaptively. Spatiotemporal modeling based on machine learning (ML) offers an alternative that can detect disease clusters automatically and with high precision. Research Objectives: This research aims to develop a machine learning model that detects disease hotspots from primary care data in Indonesia, with a focus on improving prediction accuracy, interpretability, and relevance to health policy. Methodology: A 2024 primary care dataset (5,343 entries) was analyzed using three ML models: Gradient Boosting Machine (GBM), Temporal Random Forest (TRF), and Multi-EigenSpot, with spatial (village) and temporal (week, month) features. Performance was evaluated with predictive metrics (AUC, F1-score) and spatial metrics (Moran's I, Spatio-Temporal Correlation Index, STCI). Results: Multi-EigenSpot achieved the best performance (AUC = 0.91; F1 = 0.86) and detected dominant hotspots in Sungai Asam and Beringin Villages. A Moran's I value of 0.63 indicates strong spatial autocorrelation, while an STCI of 0.57 indicates moderate temporal stability. Conclusions: ML-based spatiotemporal models are effective in identifying hidden disease patterns and have the potential to be integrated into national digital surveillance systems. This approach supports precision public health by providing a scientific basis for real-time, location- and time-based intervention policies.
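
For readers unfamiliar with the spatial metric reported above, the following is a minimal sketch of how global Moran's I is commonly computed. It is not the authors' implementation: the function name `morans_i`, the four-village case counts, and the binary adjacency matrix are all hypothetical, and the abstract does not state which spatial weights the study used.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for one value per area and a spatial weight matrix.

    Hypothetical helper for illustration only; `weights[i][j]` is assumed to be
    nonzero when areas i and j are neighbors, with zeros on the diagonal.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    z = x - x.mean()                          # deviations from the mean
    s0 = w.sum()                              # sum of all spatial weights
    numerator = (w * np.outer(z, z)).sum()    # weighted cross-products of neighbors
    denominator = (z ** 2).sum()              # total squared deviation
    return (n / s0) * numerator / denominator


# Assumed toy data: four villages in a row, each adjacent to its immediate neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered_counts = [10, 9, 2, 1]      # high counts sit next to high counts
alternating_counts = [10, 1, 10, 1]   # high counts sit next to low counts

print(round(morans_i(clustered_counts, adjacency), 3))    # positive: similar neighbors
print(round(morans_i(alternating_counts, adjacency), 3))  # negative: dissimilar neighbors
```

Values near +1 indicate that similar case counts cluster in neighboring areas, values near 0 indicate no spatial pattern, and negative values indicate that neighbors tend to differ. Under this reading, the reported 0.63 points to clear clustering of cases across villages.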

Published

2025-12-29

How to Cite

Rachmatika, R., Desyani, T., & Khoirudin. (2025). Machine Learning-Based Spatiotemporal Modeling for Detecting Disease Hotspots in Primary Care Data. Global Science: Journal of Information Technology and Computer Science, 1(4), 17–31. https://doi.org/10.70062/globalscience.v1i4.188