Quantifying Safety-II in aviation maintenance: an integrated design-and-validation framework for communication-resilience KPIs
DOI: https://doi.org/10.3846/aviation.2026.26028
Abstract
This study addresses human factors in aviation maintenance by converting routine e-log text into computable communication-resilience indicators – closure-loop ratio, read-back adherence, ambiguity density, temporal/referential completeness, error-catch latency, and cross-shift continuity – and testing whether strengthening these signals reduces defects with minimal operational burden. An integrated design-and-validation pipeline was deployed in a Maintenance, Repair and Overhaul (MRO) setting using a phased rollout (Baseline → Assist → Nudge), and causal effects were estimated via interrupted time-series analysis and, where applicable, a stepped-wedge Generalized Linear Mixed Model (GLMM). A Natural Language Processing (NLP) stack (Term Frequency–Inverse Document Frequency (TF-IDF) + regularized logistic regression, with an optional compact transformer) extracts linguistic cues; the predicted probabilities are calibrated to support reliable dashboard thresholds. Results show immediate level reductions and sustained slope improvements in sign-off error rates after Assist, with larger step-downs under Nudge. Mediation analyses indicate that gains operate through improved communication KPIs rather than generic attentional effects. Model diagnostics highlight strong discrimination with low calibration error; robustness checks and a cross-shift/fleet evaluation show stable transfer with minimal recalibration. Governance emphasizes de-identification, advisory-only AI with human-in-the-loop, and transparent, non-punitive use. Findings operationalize Safety-II as quantifiable communication behavior and demonstrate a scalable, low-friction pathway – advisory Assist plus light User Interface (UI) nudges – that advances Air Transport Technologies & Development while improving safety and quality in maintenance operations.
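The "level" and "slope" effects reported above are the two quantities a segmented (interrupted time-series) regression estimates. The sketch below is a minimal illustration on synthetic, noise-free numbers, not the study's actual model: it assumes a single intervention point, plain ordinary least squares, and no autocorrelation or mixed-effects terms. The function names, the weekly error-rate series, and the intervention week are all hypothetical.

```python
from typing import Dict, List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank-deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_its(y: Sequence[float], start: int) -> Dict[str, float]:
    """Fit y_t = b0 + b1*t + b2*post_t + b3*(t - start)*post_t by OLS.

    post_t is 1 from the intervention period `start` onward, else 0.
    b2 is the immediate level change; b3 is the change in slope.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, post * (t - start)])
    k = 4
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    b0, b1, b2, b3 = solve_linear_system(xtx, xty)
    return {
        "baseline_level": b0,
        "baseline_slope": b1,
        "level_change": b2,
        "slope_change": b3,
    }


# Hypothetical weekly sign-off error rates (%): 12 baseline weeks, then 12
# weeks after a made-up intervention with a step-down and a steeper decline.
INTERVENTION_WEEK = 12
series = [
    5.0 - 0.02 * t
    + (-1.0 - 0.05 * (t - INTERVENTION_WEEK) if t >= INTERVENTION_WEEK else 0.0)
    for t in range(24)
]

effects = fit_its(series, INTERVENTION_WEEK)
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

In this framing, a negative `level_change` corresponds to an immediate step-down at rollout and a negative `slope_change` to a sustained improvement in trend. A real analysis would also need to handle noise, serial correlation, and the count or proportion nature of error rates, none of which this sketch attempts.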
Keywords:
Safety-II, aviation maintenance, human factors engineering, communication resilience, electronic log systems (e-logs), interrupted time series analysis, calibrated natural language processing (NLP)
License
Copyright (c) 2026 The Author(s). Published by Vilnius Gediminas Technical University.

This work is licensed under a Creative Commons Attribution 4.0 International License.