Prediction of sewage pipeline construction duration by introducing machine learning and deep learning approaches

    Sang-Jun Park
    Norhane Nour
    Kang Young Lee
    Ju-Hyung Kim
DOI: https://doi.org/10.3846/jcem.2025.23472

Abstract

Establishing project costs in construction is crucial for project success and is typically done through regression-based prediction. Although conventional regression methods are common, novel regression methods are rarely practiced in construction management. This study explores both traditional and modern regression techniques, analyzing data from 83 sewage pipeline projects in South Korea. It implements state-of-the-art frameworks, including hyperparameter optimization and k-fold cross-validation, to evaluate statistical, machine learning, and deep learning-based regression models using the R² score, RMSE, MAE, and MSE. The results reveal that performance metrics do not always align with predictive accuracy. For instance, the random forest regressor achieved the best R² score of 0.847 but ranked fifth in prediction accuracy. Moreover, polynomial regression outperformed the novel methods, with an accuracy of 98.790% across the validation dataset.
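
As a rough illustration of the evaluation described above, the following minimal sketch computes MSE, RMSE, MAE, and the R² score from paired observed and predicted values, and splits 83 samples into k folds. It is not the authors' implementation: the function names, the choice of k = 5, the unshuffled contiguous folds, and the duration values are all hypothetical, since the abstract does not state these details.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired observed/predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when all observed values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# Hypothetical construction durations in days (not data from the study).
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(observed, predicted)

# 83 matches the number of projects in the study; k = 5 is an assumption.
folds = list(k_fold_indices(83, 5))
```

In a cross-validated comparison, each model would be fitted on `train_idx` and scored on `val_idx` for every fold, with the metrics then averaged across folds. Because R² measures explained variance while RMSE and MAE measure absolute error size, a ranking by one metric need not match a ranking by another.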

Keywords:

construction management, sewage pipeline construction, statistical regression, machine learning regression, deep learning regression

How to Cite

Park, S.-J., Nour, N., Lee, K. Y., & Kim, J.-H. (2025). Prediction of sewage pipeline construction duration by introducing machine learning and deep learning approaches. Journal of Civil Engineering and Management, 31(7), 687–709. https://doi.org/10.3846/jcem.2025.23472

Published in Issue
August 14, 2025