Autonomous modular construction strategy using robotized crane based on deep learning and reinforcement learning
DOI: https://doi.org/10.3846/jcem.2025.24043
Abstract
Modular construction offers significant advantages, including faster construction, better quality control, and lower environmental impact. To build on these advantages, advanced robotic construction technologies are being developed. This research develops an automated modular construction framework for a robotized crane that combines robot kinematics, deep learning, and deep reinforcement learning. The proposed strategy uses YOLOv5-S to identify and localize modular containers. An improved proximal policy optimization algorithm (PPO-I) is developed and implemented for collision-free three-dimensional (3D) lifting path planning and modular container transportation. The states and rewards of PPO-I and the robot kinematics of a real mobile crane are designed. The feasibility of the proposed strategy is verified through four case studies in 3D virtual environments. A success rate of more than 97% is observed, indicating that the strategy can be implemented on the robotized crane to localize the modular container and transport it to the target position while avoiding collisions. The results indicate the potential of the proposed robot-assisted modular construction strategy in the field of automated construction.
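The abstract does not describe how PPO-I modifies proximal policy optimization, so the following is only a minimal sketch of the standard PPO clipped surrogate objective that such a method would build on. The function name `ppo_clip_objective`, its parameters, and the default clipping value of 0.2 are illustrative assumptions, not details taken from the paper.

```python
import math
from typing import Sequence


def ppo_clip_objective(
    new_log_probs: Sequence[float],
    old_log_probs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; the paper's setting is not given here
) -> float:
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    This is the baseline PPO objective, not the paper's PPO-I variant.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps] to limit the policy update.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the clipped term caps how much a large ratio can raise the objective, which discourages overly large policy updates; with a negative advantage, the unclipped term is kept so that harmful updates are still fully penalized.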
Keywords:
modular construction, mobile crane, robotized crane, machine learning, deep learning, deep reinforcement learning, 3D path planning, collision avoidance
License
Copyright (c) 2025 The Author(s). Published by Vilnius Gediminas Technical University.

This work is licensed under a Creative Commons Attribution 4.0 International License.