RISK PREDICTION AND DIAGNOSIS OF WATER SEEPAGE IN OPERATIONAL SHIELD TUNNELS BASED ON RANDOM FOREST

Water seepage (WS) is a paramount defect during tunnel operation and directly affects the operational safety of tunnels. Effectively predicting and diagnosing WS are problems that urgently need to be solved. This paper presents a standard and an evaluation index system for WS grades and constructs a sample dataset from monitoring records for demonstration purposes. First, we use bootstrap resampling to build a random forest (RF) seepage risk prediction model. Second, the optimal branch and parameters are selected by the 5-fold cross-validation method to establish the RF prediction training model. Additionally, to illustrate the effectiveness of the method, the operational stage of Wuhan Metro Line 3 in China is taken as a case study. The results indicate that the segment spalling area, crack width, and loss rate of the rebar cross-section have a strong influence on WS. Finally, the test data are predicted, and the prediction result error index is calculated. Compared with the predictions of some traditional machine learning methods, such as support vector machines and artificial neural networks, RF prediction has the highest accuracy and is the closest to the true value, which demonstrates the accuracy of the model and its application potential.


Introduction
With the increase in urbanization and population influx around the world, urban problems such as insufficient urban space capacity and frequent traffic jams are constantly emerging (Zho et al., 2019; He et al., 2019; Zhang et al., 2020a, 2020b). Urban metro systems are popular due to their energy savings, high passenger flow, safety, green transportation characteristics, and so on (Rahim et al., 2015; Qian et al., 2019; Pan et al., 2019c). By the end of 2019, metros had been built and had begun operating in 40 areas in mainland China, with a total length of approximately 6700 km. During this rapid development of urban rail transit, the safety management of metro operation should not be neglected (Li et al., 2017a). Due to the special structure of the shield tunnel, in the process of tunnel operation, under the influence of the tunnel itself and external use, various defects appear in its structure, which leads to a risk of function reduction and threatens operational safety.
Among the common defects that occur during the operation of shield tunnels are water seepage (WS), cracks, uneven settlement, concrete cracking, dislocation, and bolt failure. Under the interaction of these defects, other defects continue to develop, which have a significant impact on the operational safety and service life of the tunnel. Of these, WS has one of the highest occurrence probabilities and some of the most severe consequences. According to the statistics, the cases of damage caused by WS account for 70% of all types of hazards, and the loss of the use of tunnels due to WS occurs in 30% of all cases of WS. Cheng and Huang (2014) found a total of approximately 20,000 WS occurrences by investigating all operational metro routes in Shanghai, and WS is an important risk factor for other defects. Dong et al. (2017) investigated shield tunnels in Beijing and found that approximately 77% of the defects were related to WS. This is due to the special structure of the shield tunnel, such as the use of segment splicing and grouting holes on each segment, which increases the possibility of metro tunnel leakage. Furthermore, WS causes and widens cracks in the tunnel structure, rusts the rails, and corrodes the interior structure. If the leakage continues to worsen, the tunnel structure itself is endangered and its stability is compromised, directly affecting the safety of metro operation. In severe cases, these defects cause extensive casualties and great property loss (Hu et al., 2019).
The above security risks have led to growing public concerns about the safety of tunnel operations. Therefore, the analysis, prediction, and diagnosis of WS risk in the tunnel operation period is not only of engineering significance for the safe operation and management of metro tunnels but also significant in the development of the economy and society. Risk prediction plays an important role in safe metro operation to illustrate the potential safety risks and the risk factors' contribution to the occurrence of an accident. Critical potential risks and risk factors can then be diagnosed to assist operators and maintainers in determining critical safety checkpoints in the operation stage (Zhang et al., 2016). However, the shield method is a new construction method, and the development of subways in China is relatively new, so there are few studies on the shield method in tunnel construction. Moreover, at the present stage, tunnels constructed by the shield method are mostly in the early stage of their operational cycle. Compared with the risks in the shield construction stage, there are fewer risks in the operation stage. Therefore, the existing studies on the risks of shield tunnels are mostly focused on the construction stage (Ding et al., 2014).
The aim of our study is to establish a prediction and diagnosis method for WS in operational shield tunnels based on a random forest (RF) algorithm that includes the impact of the combined effects of uncertain and complex multisource environmental factors (Zhou et al., 2017). Hence, we explore a combination of quantitative and qualitative methods to manage the leakage risk of shield tunnels during the operation period by considering the coupling relationship of various risk indicators. To determine the security status of operating tunnels in a timely manner and study areas where WS has already appeared in an operating tunnel, we rationally design corresponding defect treatment measures based on the symptoms of water seepage to ensure the safe operation of metro tunnels and ensure that they reach their expected service life.
The remainder of this paper is organized as follows: Section 1 reviews the literature related to this study. In Section 2, the RF method is presented. Section 3 constructs the WS risk prediction model for operational shield tunnels. In Section 4, a case study based on RF verifies the effectiveness and applicability of our approach. Section 5 presents the results of a case study in Wuhan Metro Line 3 and a discussion. The last section presents the research conclusions and future work.

Literature review
Generally, numerical simulation methods and qualitative analysis tools have been the major approaches in tunnel safety regarding WS issues. For example, Wang et al. (2013) developed a prediction method for WS based on real-time monitoring data during tunnel construction. Mao et al. (2020) applied 3D numerical simulations to determine the distribution of the total water head and water pressure in cracks for all combinations of water leakage positions during the operation periods of multi-arch tunnels in loess areas. Shi et al. (2013) introduced the improved analytic hierarchy process (AHP) into leakage risk assessment for a highway tunnel. Based on the analysis of the risk system, the degree of influence of the risk indicators on the leakage risk was determined, and the related rankings were obtained. Huang and Li (2017) used a fully convolutional network to study tunnel leakage through image recognition algorithms. Gao et al. (2019) used numerical simulations and 3D physical model tests to study the occurrence mechanisms and evolution laws of WS in the operation of the Kaiyuansi Tunnel.
The application of the current main methods at different stages of different types of tunnels is discussed in the literature. Currently, shield metro tunnels in China are mostly in the initial stage of their operation, so the existing research on shield tunnel risk is focused mostly on the construction stage. Related studies on WS during tunnel operation are generally performed for other types of tunnels, such as railway and highway tunnels. The risk factors for WS in tunnels have a certain degree of interdependence rather than being purely independent of each other. Existing studies focus mostly on the unilateral impact of a single risk indicator on leakage without considering the correlation between the indicators, and they cannot update the risk assessment results in real time based on the measured data (Liu et al., 2018b).
Based on the above observations, we used the RF algorithm, which creates mathematical expressions to fit a set of datasets (Zhou et al., 2020a, 2020b). Multiple classification and regression tree models are generated and used as base models. The RF algorithm has excellent prediction and diagnosis performance because the predicted results are derived from the integrated predictions of many decision trees (Zhang et al., 2020a). Compared with machine learning algorithms such as backpropagation (BP) neural networks, support vector machines (SVMs), and decision trees, the RF algorithm has higher prediction accuracy (Zhang et al., 2020d). At present, it is mainly used in the fields of medicine (Pan et al., 2017), economics (Behrens, 2020), and management (Grushka-Cockayne et al., 2017; Mueller, 2020). In the field of engineering, the RF algorithm has been studied for crack prediction (Bhattacharya & Mishra, 2018), energy evaluation, and construction management (Pan & Zhang, 2020). The results of the above studies have demonstrated the appealing performance of RF in solving both regression and classification problems.

RF regression algorithm
RF is a combination algorithm based on classification trees proposed by Breiman (2001), which integrates two powerful machine learning techniques, bootstrap aggregating and random subspaces (Ho, 1998).

Basic principles of RFs
Similar to the traditional regression model, RF regression can explain the influence of some independent variables on dependent variables. Suppose that there are n observations (that is, n cases) for the dependent variable Y, and there are k independent variables that have an influence on it. In the process of constructing the regression tree, RF randomly extracts some observed values of the dependent variable Y by the bootstrap resampling method and randomly selects a specified number of variables from among the k independent variables to determine the node of the classification tree. In this way, each constructed regression tree may be different due to randomness. Based on this, RF can usually randomly generate several hundred or even thousands of classification trees, from which the tree with the highest degree of repetition is selected as the final result. The combined model is constituted from the regression trees $\{h(X, \theta_j)\}$, and the predicted value of the RF regression model is determined by averaging the predicted values of the $j$ regression trees $h(X, \theta_j)$. The model satisfies the condition that the multiple training sets forming the RFs are independent, so the mean-square generalization error of the

RF algorithm steps
The steps of the RF algorithm are shown diagrammatically in Figure 1.
(1) From the n cases of the original dataset, the bootstrap method is used to extract b training sample sets repeatedly, and b regression trees are constructed. Each time a training sample set is drawn, the cases not selected are out-of-bag (OOB) samples, and these form the test sample set.
(2) When constructing a regression tree, at the branch nodes of each tree, $m_{try}$ ($m_{try} < k$) variables are randomly selected from the k independent variables as the candidate branch variables, and then the optimal branch is selected from them according to the branch goodness criterion. When the R software application is used to establish the RF regression model, the default parameter is $m_{try} = k/3$.
(3) Each regression tree branches recursively from top to bottom and grows continuously. Reaching n trees is the termination condition for regression tree growth.
(4) The b generated regression trees constitute an RF regression model, and the model estimation effect is evaluated by the accuracy of the OOB data prediction, that is, measured by the mean-square error of the test set. Assuming the number of data samples outside the bag is m, we have
$$MSE_{OOB} = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2, \qquad R_{RF}^2 = 1 - \frac{MSE_{OOB}}{\hat{\sigma}_y^2} \qquad (1)$$
where $y_i$ represents the true value of the dependent variable in the OOB samples, $\hat{y}_i$ represents the predicted value obtained using an RF regression model, and $\hat{\sigma}_y^2$ represents the variance in the OOB predicted value.
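The four steps can be illustrated with a minimal Python sketch. It is not the R implementation used in this study: each tree is reduced to a single-split stump to keep the example short, and the function names (`bootstrap_indices`, `fit_stump`, `oob_mse`) and the toy data are assumptions introduced here for illustration only.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices with replacement; cases never drawn are out-of-bag (OOB)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    oob = sorted(set(range(n)) - set(in_bag))
    return in_bag, oob

def fit_stump(X, y, rows, features):
    """Best single split (minimum squared error) among the candidate features."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, ml, mr)
    if best is None:  # no valid split: fall back to the in-bag mean
        m = sum(y[i] for i in rows) / len(rows)
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    f, t, ml, mr = stump
    return ml if x[f] <= t else mr

def oob_mse(X, y, n_trees=50, mtry=1, seed=0):
    """Mean-square error of the averaged OOB predictions, as in the first part of Eqn (1)."""
    rng = random.Random(seed)
    n, k = len(X), len(X[0])
    sums, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        in_bag, oob = bootstrap_indices(n, rng)
        candidates = rng.sample(range(k), mtry)  # mtry randomly chosen variables
        stump = fit_stump(X, y, in_bag, candidates)
        for i in oob:
            sums[i] += predict_stump(stump, X[i])
            counts[i] += 1
    scored = [i for i in range(n) if counts[i] > 0]
    return sum((y[i] - sums[i] / counts[i]) ** 2 for i in scored) / len(scored)

# Toy data (hypothetical, not the monitoring data of this study).
X_toy = [[float(i), float((i * 7) % 5)] for i in range(20)]
y_toy = [0.0 if i < 10 else 1.0 for i in range(20)]
mse_toy = oob_mse(X_toy, y_toy, n_trees=50, mtry=2, seed=0)
print("OOB mean-square error:", mse_toy)
```

A full RF would grow each tree recursively rather than stopping after one split; the stump is used here only so that the bootstrap, random variable selection, and OOB evaluation remain visible in a few lines.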

Analysis on evolution mechanism of seepage water
Various defects appear in the structure of the tunnel under the influence of the specific structure of the shield tunnel itself and external use during operation, which cause a risk of reduced functionality and threats to operational safety. Most defects, such as material deterioration and uneven settlement, cause deformation and destruction in the tunnel structure, eventually resulting in WS and severe water inrush. Therefore, to determine the occurrence, development mechanism, and mutual influence of various defects, the risk mechanism of water leakage during the operation period should be understood to establish the WS risk index system in the following section and control the risk to ensure the safe operation of the metro (Jeyisanker & Gunaratne, 2009). The main defects and their mechanisms in shield tunnels are as follows:
(1) Segment cracks. Because a shield tunnel is constructed of segments supporting the soil and surrounding rock around the tunnel, the segments are important parts of the shield tunnel. Without considering the quality of segment delivery and construction, the main causes of cracks in segments of shield tunnels during the operation period can be divided into internal causes and external causes. The main internal causes are as follows: in long-term contact with the surrounding soil, the reinforcing materials in the segment react with water and other substances in the environment, resulting in corrosion, which makes the concrete of the segment expand and gradually fall off, and cracks appear.
The main external causes are as follows: due to the distribution of soil and surrounding rock around the tunnel, uneven settlement or a sharp increase in the external load, a stress concentration area may appear in the segment, and then microcracks gradually develop into cracks under the action of water in the soil; under long-term pressure from the surrounding soil and rock, there may be large internal forces in the segment, which cause the segment to warp. If the bolts used to connect and fix the segment fall off or fail, a local stress concentration will develop in the segment, leading to segment cracks (Liu et al., 2018a).
(2) Excessive opening of segment joints. There is a gap between the segments supporting the tunnel rock and soil, and bolts between the segments are intended to prevent the gap from being too large; failure leads to defects such as leakage and bolt failure (Liu et al., 2020). In addition to the problems of bolt quality and the initial opening of joints caused by errors in segment assembly during construction, the bolt connection is usually a stress concentration area under external forces during the tunnel operation period.
In the contact between the segment and the bolt, the concrete can easily crack because of the large tensile stress, which leads to problems with the connection between the bolt and the segment. The pretightening force of the bolt is released continuously during this process, and the bolt is unable to effectively connect the adjacent segments. The distance between the segments then gradually increases under external forces; that is, the joint opening becomes increasingly large. The increase in the joint opening leads to the gradual failure of bolts, cracks in segments, and aggravated leakage problems.
(3) Differential settlement of segments. This refers to the uneven settlement of tunnel segments caused by uneven external loads or differing structures of the rock and soil of the tunnel during the operation period (Pan et al., 2019c). The geological factors around the tunnel, such as the interface between different geological environments and uneven soil, cause differing settlement of the segments in different parts of the tunnel during the operation period; tunnel segments are also easily affected by external forces, such as the water around the tunnel (Hu et al., 2018). In the case of uneven or sudden changes in pressure and earth pressure, the difference in force between different parts can be large, resulting in the uneven settlement of the segments. Other tunnel defects, such as bolt failure, excessive joint opening, misalignment of segments, and WS, aggravate the differential settlement of the tunnel segments, and the differential settlement of the segments also causes the dislocation of the segments to increase along with the opening of the joints and the failure rate of bolts.
(4) Segment dislocation. This refers to large and small dislocations between different segments relative to the rock and soil surface of a tunnel.
The main reason for the misalignment of segments during the operation of the tunnel is that an uneven load causes uneven settlement of the rock and soil in the tunnel, which causes different forces on different areas. Compared with the rock and soil mass, the segment of the tube produces different displacements; the deformation of the tunnel structure caused by the shield construction or surrounding construction also causes segments in different positions to become misaligned. A staggered tube segment also provides leakage channels for the moisture in the soil and then causes WS; a staggered tube segment also causes the loss of the pretightening force of the bolts connecting the tube segments, leading to bolt failure and cracking in the tube segments. In addition, the number of joints between the segments is increased.
(5) Bolt failure. Bolts are important components that connect longitudinal segments and circumferential segments to ensure the structural rigidity of the tunnel, and the water stop between the segments is closely connected with the bolts to ensure good sealing performance between the tunnel segments. In addition to the quality of bolts and irregular construction, the reasons for bolt failure during the tunnel operation period are as follows: the water stop and bolts are corroded under the action of the tunnel rock and soil, and their performance is gradually degraded; that is, there is no water stop functionality. Regarding the effect of connecting the segments properly, if the external force is too large or uneven, a large internal force or stress concentration area is formed at the bolt connection, which leads to bolt failure (Wang et al., 2014). The failure of the bolts causes the joints to open and the tube segments to crack in addition to causing dislocation, WS, and other types of defects.
(6) Seepage water.
There are three kinds of seepage water during the operation period of a shield tunnel: joint leakage, crack leakage, and grouting hole leakage. Leakage in a joint is mainly due to poor sealing and water resistance at the joint; the leakage channel is formed by the water supply, producing leakage water. Therefore, a possible cause of joint leakage is that the waterproof material becomes corroded through long-term contact with rock and soil and gradual aging, and it loses the functions of water stopping and waterproofing (Qiu et al., 2020). Another possibility is that the waterproof sealing material cannot protect the joint by preventing deformation, which makes gaps appear in the joint, resulting in leakage; in addition to hole position errors caused by construction, grouting hole leakage is mainly due to the failure of the waterproof plug at the grouting hole; crack leakage mainly refers to the leakage of water. Segment stagger and bolt failure indirectly lead to the excessive opening of joints and cracks in segments, which leads to water leakage, which in turn aggravates the development of segment cracks, increases the amount of segment stagger, increases the joint openings, etc. As one of the most common defects in tunnel operation, leakage not only aggravates other defects in tunnels but also threatens the safety of tunnel operation to a great extent, such as by affecting the stability of tunnels and increasing the risk of train instability, as described below:
- WS leads to adverse consequences, such as weathering of the tunnel structure and a certain degree of corrosion, to a large extent. If the situation becomes more severe, it may cause the overall loss of the structure and increase the safety risks for operating subway tunnels. In addition to water leakage itself, weathering, corrosion, and other chemical substances are likely to cause severe structural damage.
- The corrosion of the tunnel structure by WS also causes substantial damage to other necessary equipment. The combination of the two types of damage aggravates the damage caused by water leakage. Serious cases of mud in tunnels and other severe disasters increase the risk of personal safety accidents.
- The hazard of WS is a continuously developing process. When leakage water remains in a tunnel for a long time, it leads to a vicious cycle in the tunnel, and internal corrosion hazards may often spread to the outside of the tunnel. Such a vicious cycle leads to a decrease in tunnel durability during tunnel operation.

Risk prediction model for WS in operating tunnels
The technical roadmap of the risk prediction model for WS in operating tunnels based on the RF algorithm is shown in Figure 2.

Construction of the seepage water risk assessment system
(1) Evaluation system construction. Relevant influencing factors are obtained by analyzing the formation mechanism of WS. Based on a large amount of practical experience and related references, a WS risk assessment index system is constructed, and risk levels are delineated.
(2) Establishment of the original training set samples. With the indexes of the index system as the variables of the RF, the index-related data are taken as the original training set $T = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$, and the bootstrap sampling method is used to extract $k$ samples from $T$, each with sample size $n$, to form $k$ independent training sets.

Optimal parameter determination and training model establishment
(1) K-fold cross-validation. First, the K-fold cross-validation method is used to divide the initial sample into K subsamples. A single subsample is retained as the data for the verification model, and the other K-1 subsamples are used for training. This is repeated so that each subsample serves once as the verification data. Finally, the average value of the prediction accuracy of the K models is used as the final estimated value of the model prediction accuracy, and the split mode with the highest prediction accuracy is selected as the optimal branch (Zhou et al., 2021a).
(2) Optimal parameter selection. M features are randomly selected from all feature sets during tree generation, and then an optimal eigenvalue mtry is selected as the split variable value according to the criterion of the maximum information gain ratio. Through the establishment of the RF model, the trend of the mean-square error with ntree is observed, and the ntree value corresponding to the minimum root-mean-square error (RMSE) is chosen as the best ntree value, that is, the number of regression trees (Zhou et al., 2021b).
(3) Training model establishment. The optimal branch is used as the RF input, and the node is divided into two branches according to its characteristics. The best features are found from the remaining features to construct the branches of the classification tree recursively so that the regression tree can grow to the maximum extent without clipping and generate a decision tree. The process is repeated to establish an RF training model.

Variable importance evaluation and model fitting prediction
(1) Variable importance evaluation. The corresponding OOB data for each RF tree are used to calculate its OOB data error, which is recorded as errOOB1. Random noise interference is then applied to feature X in all samples of the OOB data, and the OOB data error is calculated again, which is recorded as errOOB2.
Hence, the importance of feature X is as follows:
$$I(X) = \frac{1}{N}\sum_{i=1}^{N}\left(errOOB2_i - errOOB1_i\right)$$
where $N$ is the number of trees in the RF and the subscript $i$ denotes the $i$-th tree.
(2) Model fitting prediction. First, the test set is input into the training model, and the test set data are predicted by the RF to establish the prediction model. The average of the output value of all decision trees is taken as the prediction value of the RF, and the RF training model fitted by the training set and the RF prediction model predicted by the test set are visualized. Finally, the model fitting map and prediction map are obtained. The prediction results of the RF regression model are as follows:
$$f_r(x) = \frac{1}{N}\sum_{i=1}^{N} h_i(x)$$
where $f_r(x)$ represents the predicted value of the random forest regression model and $h_i(x)$ is the predicted value of the single regression tree model.
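The importance measure and the averaged forest prediction can be illustrated with a short Python sketch. Two simplifications are assumed here: the noise interference is implemented as a random shuffle of one feature column (permutation importance), and the error is computed for the whole forest on one evaluation set rather than tree by tree on OOB data. The toy "trees" are plain functions and all names are hypothetical.

```python
import random

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def forest_predict(trees, x):
    """RF regression output: the mean of the individual tree predictions."""
    return sum(tree(x) for tree in trees) / len(trees)

def permutation_importance(trees, X, y, feature, seed=0):
    """Error increase (errOOB2 - errOOB1) after shuffling one feature column."""
    rng = random.Random(seed)
    err1 = mse(y, [forest_predict(trees, x) for x in X])
    column = [x[feature] for x in X]
    rng.shuffle(column)
    X_perm = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, column)]
    err2 = mse(y, [forest_predict(trees, x) for x in X_perm])
    return err2 - err1

# Toy forest (hypothetical): both trees use feature 0 only, so feature 1 is irrelevant.
toy_trees = [lambda x: 2.0 * x[0], lambda x: 2.0 * x[0] + 0.1]
X_toy = [[float(i), float(i % 3)] for i in range(30)]
y_toy = [2.0 * i for i in range(30)]

imp_0 = permutation_importance(toy_trees, X_toy, y_toy, feature=0)
imp_1 = permutation_importance(toy_trees, X_toy, y_toy, feature=1)
print("importance of feature 0:", imp_0)
print("importance of feature 1:", imp_1)
```

A feature that the model relies on produces a large error increase when disturbed, whereas an unused feature leaves the error unchanged, which is the behavior the importance measure is meant to capture.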

Prediction result analysis
(1) Error analysis. An SVM without feature selection and an artificial neural network (ANN) are selected for modeling and comparative analysis, and the RMSE and goodness of fit ($R^2$) are selected to evaluate the prediction accuracy of the model (Zhou et al., 2020a), as shown in Eqns (6) and (7):
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (6)$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \qquad (7)$$
where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the true values, and $n$ is the number of samples.
(2) Sensitivity analysis. To analyze the interaction between the safety indexes, the global sensitivity evaluation of the input index is carried out by using Sobol's index method. The change in variance caused by the change in the input parameters reflects the importance of the research parameters and the contribution to the change in the model results. The first-order sensitivity reflects only the direct contribution of the uncertainty of a certain parameter to the output variance in the model. The total sensitivity of a parameter reflects the sum of its direct contribution and the contributions of its interactions with other parameters to the output variance in the model. According to the first-order sensitivity and total sensitivity of Sobol's method, the equations are as follows:
$$S_i = \frac{V_i}{V}, \qquad ST_i = 1 - \frac{V_{\sim i}}{V}$$
where $S_i$ is the first-order sensitivity value of the parameter; $V_i$ is the variance in the parameter; $V$ is the total variance in the system; $ST_i$ is the total sensitivity value of the parameter; $V_{\sim i}$ is the variance attributable to all parameters other than parameter $i$; and $N_s$ is the number of parameter samples used to estimate the variances.
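As an illustration of the sensitivity analysis, the following Python sketch estimates first-order and total Sobol indices by Monte Carlo sampling. It assumes independent inputs uniformly distributed on [0, 1] and uses the Saltelli estimator for the first-order index and the Jansen estimator for the total index; these estimator choices, the function name `sobol_indices`, and the test function are assumptions made here, not details taken from this study.

```python
import random

def sobol_indices(f, n_params, n_samples=20000, seed=0):
    """Monte Carlo estimates of first-order (S_i) and total (ST_i) Sobol indices."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B with n_samples rows each.
    A = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    fA = [f(a) for a in A]
    fB = [f(b) for b in B]
    mean = sum(fA + fB) / (2 * n_samples)
    var = sum((v - mean) ** 2 for v in fA + fB) / (2 * n_samples)  # total variance V
    first, total = [], []
    for i in range(n_params):
        # Matrix A with column i replaced by column i of B.
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_i = sum(fb * (fab - fa) for fa, fb, fab in zip(fA, fB, fABi)) / n_samples
        vt_i = sum((fa - fab) ** 2 for fa, fab in zip(fA, fABi)) / (2 * n_samples)
        first.append(v_i / var)
        total.append(vt_i / var)
    return first, total

# Hypothetical additive test function: f = x1 + 2*x2.
def toy_model(x):
    return x[0] + 2.0 * x[1]

S, ST = sobol_indices(toy_model, n_params=2)
print("first-order indices:", S)
print("total indices:", ST)
```

For this additive test function the analytic values are $S_1 = 0.2$ and $S_2 = 0.8$, and the total indices equal the first-order indices because there are no interactions; the estimates should approach these values as the number of samples grows.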

Engineering background
This paper reports a case study on shield tunnel lining damage induced by WS occurring in Wuhan Metro Line 3, China, through both field monitoring and numerical simulation. The starting point is Zhuanyang Avenue Station, the terminal station is Hongtu Avenue Station, the mileage range is DK0 000~DK28 000, and the length of the whole line is 28.0 km; in DK9 600~DK9 920, the tunnel passes through the Hanjiang River (Pan et al., 2019b). Because of the high probability of leakage in the river section, the research scope of this paper is selected from the right line Wangzong interval design starting point at mileage 696.728 and extending in the Zongguan direction at approximately 315 m, as shown in Figure 3.

Seepage water risk indicator system
In operational shield tunnels, several degradation modes are usually active at the same time and mutually interact. Under normal circumstances, the deterioration of materials and the action of external loads cause deformation and destruction of the tunnel structure and ultimately produce leakage and water inrush, the latter of which is severe. According to Section 2.2 of this paper, the occurrence mechanism of shield tunnel degradation issues and the relationship between the various issues are analyzed. In addition, the selection criteria of risk indicators are combined with the principles of representativeness, monitoring, and objective integrity. On the basis of a large amount of practical experience and related references (Li et al., 2018;Pan et al., 2019a), 4 secondary risk indexes and 11 tertiary risk indexes are selected to determine the WS risk. The constructed WS risk evaluation index system is shown in Figure 4.

Risk classification of seepage water
Based on the domestic experience of water leakage engineering, data monitoring and acquisition, and the experience of the classification of leakage risk in railway tunnels, this paper divides the leakage risk status of operating subways into five grades according to the severity of leakage: wetting, infiltration, dripping water, water leakage and water gushing (Li et al., 2017b). The five grades correspond to five states, namely, A (very safe), B (safe), C (minimally safe), D (dangerous), and E (very dangerous), and the corresponding relationships are shown in Table 1.

Establishment of the original training set of sample data
In this paper, the WS risk grade is taken as the output variable to obtain the monitoring data of 100 groups in the monitoring interval, as shown in Table 2.

K-fold cross-validation method for selecting the optimal branch
For small amounts of sample data, the test results of this method are more reliable than the results of dividing the original samples into training sets and test sets. The method is as follows: first, the original data are randomly divided into K groups, and then each subset is used as a test set to test the model; the remaining K-1 sets are used as training sets to train the model, and K models are obtained. K = 5 is taken in this paper. Considering the randomness of the data partition, to obtain more consistent calculation results, the model is constructed and verified by a 5-fold cross-validation method.
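The 5-fold procedure can be sketched in Python as follows. The model here is a placeholder that always predicts the training mean; it stands in for the RF only so that the data partition and the averaging of the K validation errors are easy to follow. The function names and toy data are assumptions for illustration.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_mse(X, y, fit, predict, k=5, seed=0):
    """Average validation mean-square error over the k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        errors = [(y[i] - predict(model, X[i])) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / k

# Placeholder model (hypothetical): predicts the mean of the training targets.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, x):
    return model

# Toy data with 100 samples, matching only the sample count used in this paper.
X_toy = [[float(i)] for i in range(100)]
y_toy = [float(i % 5) for i in range(100)]
folds = k_fold_indices(100, k=5)
cv_error = cross_val_mse(X_toy, y_toy, fit_mean, predict_mean, k=5)
print("fold sizes:", [len(fold) for fold in folds])
print("cross-validated MSE:", cv_error)
```

With 100 samples and K = 5, each model is trained on 80 samples and validated on the remaining 20, and every sample is used for validation exactly once.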

Selection of parameters mtry and ntree
To determine the value of mtry (a small part of the total number of predictors), RF modeling uses regression instead of classification trees. The default value should be approximately one-third of the total number of variables to minimize the correlation between the generalization errors and decision trees. In our study, the optimal tuning parameters are selected using 5-fold cross-validation, and the commonly used R language software program is used to construct the RF model, to improve the performance of the RF model and handle overfitting. When optimizing the parameters mtry and ntree, 80 of the 100 sets of data in Table 2 are randomly selected as the training set and used to establish the model. The remaining 20 sets of data are used as the test sets to test the effect of the model. The specific method is as follows: (1) Set ntree = 300, 400, 500, and 600; use 5-fold cross-validation to establish the model. The mtry value corresponding to the minimum value of the mean-square error on the validation set is the optimal parameter mtry of the model. (2) Set mtry to 5, 6, and 7, and set ntree to increase gradually. Use the 5-fold cross-validation method to verify the model. According to the verification set, the corresponding ntree value is the optimal parameter ntree when the mean-square error stabilizes.
The above method is used to construct the model and visualize the output of the model, as shown in Figures 5 and 6. In Figure 5, each trend line shows the change in the mean-square error with mtry when ntree is fixed; the four trend lines correspond to ntree = 300, 400, 500, and 600. Figure 5 shows that as the value of the parameter mtry increases, the overall RMSE first declines and then increases. When the value of mtry in the model is 6, the RMSE evaluation index reaches the optimal value. This study therefore sets mtry to 6.
When a curve in Figure 6 has a fixed value of mtry, as the value of ntree changes, the curve of the mean-square error changes; the three curves are the mean-square deviations when mtry = 5, 6, and 7. The error of the ntree value below 400 fluctuates greatly, and above 400, the fluctuation trend gradually decreases. When the number of decision trees is greater than 500, the error of the model tends to stabilize. Therefore, this paper ultimately takes the value of the number of decision trees as 500.
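The two selection rules, the minimum validation error for mtry and the point at which the error stabilizes for ntree, can be expressed as a short Python sketch. The error values below are hypothetical numbers chosen only to mimic the shape of the curves described above; they are not read from Figures 5 and 6, and the tolerance `tol` is an assumed parameter.

```python
def best_mtry(cv_error):
    """cv_error maps each candidate mtry to its cross-validated error."""
    return min(cv_error, key=cv_error.get)

def stable_ntree(ntree_values, errors, tol=0.01):
    """Smallest ntree after which every consecutive relative change is within tol."""
    for start in range(len(errors) - 1):
        pairs = zip(errors[start:], errors[start + 1:])
        if all(abs(b - a) <= tol * abs(a) for a, b in pairs):
            return ntree_values[start]
    return ntree_values[-1]

# Hypothetical validation errors: decline, then increase with mtry.
mtry_errors = {4: 0.30, 5: 0.26, 6: 0.21, 7: 0.24, 8: 0.29}

# Hypothetical error curve: large fluctuation at first, then levelling off.
ntree_grid = [100, 200, 300, 400, 500, 600, 700]
ntree_errors = [0.30, 0.24, 0.27, 0.22, 0.211, 0.210, 0.2105]

chosen_mtry = best_mtry(mtry_errors)
chosen_ntree = stable_ntree(ntree_grid, ntree_errors)
print("selected mtry:", chosen_mtry)
print("selected ntree:", chosen_ntree)
```

In practice the error curves would come from the 5-fold cross-validation runs, and the tolerance would be set according to how much residual fluctuation is acceptable.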

Model fitting prediction
The RF package is first loaded in R, and the 100 sets of monitoring data are input as the original dataset. Second, 80 groups of data are randomly selected as the training set, and the remaining 20 groups of data are used as the test set. The related parameters are mtry = 6 and ntree = 500. Finally, the leakage risk in the training set is fitted to establish an RF training model, the test set is input into the trained model, and the WS risk in the test set is predicted. The results are shown in Figures 7 and 8. Figure 7 shows that the simulated values are very close to the actual values, so the fit is good. Figure 8 shows that the values predicted by the trained RF model on the test set are close to the real values.
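The random 80/20 partition used here can be made reproducible by fixing a seed. The following Python sketch only illustrates the partition step; the function name and the seed are assumptions and do not come from the R workflow of the study.

```python
import random


def train_test_split(n_samples, n_train, seed=0):
    """Randomly partition sample indices into a training and a test list."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = train_test_split(n_samples=100, n_train=80, seed=1)
print(len(train_idx), len(test_idx))  # 80 20
```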

Results and discussion
During the operation of metro tunnels, the impermeability performance may degrade due to the combined effects of uncertain multisource factors and the complex environment. Developing a method for diagnosing and predicting WS in operating tunnels is therefore challenging because WS is related to many intrinsic and extrinsic factors. The intrinsic factors include the deterioration of the lining materials, the corrosion of reinforcing bars, and segment assembly error, while the extrinsic environmental factors include unexpected earth pressure from geological conditions and nearby geotechnical activity. Two measures help limit the overfitting that can arise from the small amount of training data in this study. First, the selection of key factors by RF reduces the dimensionality of the input information. Second, the tuning parameters are selected by 5-fold cross-validation, which mitigates overfitting.

Algorithm error analysis
SVMs are based on statistical learning theory, so they have a strict theoretical and mathematical basis, which distinguishes them from ANNs, whose structural design depends on the designer's empirical and prior knowledge. However, SVMs are difficult to apply to large-scale training samples and do not readily handle multiclass classification problems. To test the superiority of the RF model, this study chooses two widely used machine learning models, an SVM and an ANN, for modeling and comparative analysis (Zhou et al., 2015). Machine learning algorithms such as RF, ANN, and SVM act as "black boxes" with respect to the underlying physical mechanism, and the models become harder to interpret as the number of variables increases. Therefore, this section uses the 11 risk indicator variables analyzed in Section 4.2.1 as the input parameters of the simulation. First, with Python 2.7 and MATLAB 2014 as the calculation platforms, the measured values of the level 3 risk indicators and the risk category labels of the level 2 risk indicators of 30 monitoring points in the Wangzong section of Wuhan Metro Line 3 are taken as the training data of the model. Second, the three models are trained separately, and the optimal parameters and the model error indicators for each model are shown in Table 3 (Yu et al., 2021). Finally, RMSE as given by Eqn (6) and R² as given by Eqn (7) in Section 3.4 are used to measure the prediction accuracy of the model.
For the three prediction models, RF, SVM, and ANN, the RMSE values are 0.047, 0.244, and 0.219, and the R² values are 0.991, 0.969, and 0.955, respectively. The RMSE of the RF model prediction results is the smallest, and its R² is closest to 1, indicating that the prediction results of the RF model are the closest to the actual values and that its accuracy is the highest of the three models.
Moreover, the comparison results show that RF outperforms both competitors in solving the research problem, indicating its potential as a promising tool for the risk prediction of WS in operational shield tunnels.
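For reference, the two error indices used in this comparison can be computed as in the short Python sketch below. It assumes the standard definitions, RMSE as the square root of the mean squared residual and R² as one minus the ratio of the residual sum of squares to the total sum of squares, which are presumed to match Eqns (6) and (7). The sample values are hypothetical and are not results from the case study.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical risk values, not taken from the case study.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(round(rmse(actual, predicted), 3), round(r_squared(actual, predicted), 3))
```

A smaller RMSE and an R² closer to 1 both indicate predictions that are closer to the actual values, which is the criterion applied to the three models above.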

Sensitivity analysis
Sobol's index method was used to analyze the first-order and global sensitivity of the indicators of the 100 sets of monitored data. Taking the water leakage risk level as the objective function and based on the actual data distribution law, each indicator is made to obey a Gaussian distribution, and the first-order sensitivity and the global total sensitivity of the objective function are obtained, as shown in Figure 9. The indicator with the highest first-order sensitivity and global total sensitivity, 0.533 and 0.547, respectively, is the segment spalling area, whose sensitivity is significantly higher than that of the other parameters. The first-order sensitivity and global total sensitivity of the crack width are 0.266 and 0.285, respectively. The first-order sensitivity and global total sensitivity of the loss rate of the steel section are 0.052 and 0.155, respectively. This indicates that these two variables also have a great impact on the risk of WS. The first-order sensitivity and total sensitivity of the variables are relatively close, which suggests that the indicators influence the risk of water leakage mainly through their individual effects rather than through interactions with other indicators. Therefore, in the case of this project, the most effective way to reduce the risk of WS in this area is to successively reduce the risk status of the segment spalling area, the crack width, and the loss rate of the steel section. This approach can be used to realize risk evaluation and effective decision making regarding WS in the shield tunnel operation period.

Table 4. WS management methods
[...]
(2) Pay attention to foundation treatment; otherwise, weathering of the lining surface can easily occur.
(3) Insufficient attachment during construction.
Coating method
(1) Commonly used in areas with a small WS range and a light degree of WS, where the tunnel lining surface is relatively flat.
(2) The coating needs a certain thickness to achieve the waterproof effect, and the cost is high.
Waterproofing membrane method
(1) Commonly used when the seepage area is large and there is sufficient clear section.
(2) Mainly used for WS in the arch.
(3) Prefabricated components are simple in construction and have a good waterproof effect.
(4) Effective for preventing peeling.
Waterproof board method
(1) Often used where the WS area is large and the amount of WS is low.
(2) Usually used in combination with waterproof film.
(3) Mostly used in mountain tunnels constructed by the mining method; a waterproof layer is formed between the shotcrete and the secondary masonry to block water leakage.
Slip casting method
(1) Reinforces the strata to increase the tunnel bearing capacity and fills holes in the lining to make the lining force distribution uniform.
(2) Prevents the lining structure from continuing to deform or be damaged.
(3) Repairs cracks in lining concrete structures.
(4) Less impact on tunnel operation.
(5) Less manpower and time required.
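To illustrate how the first-order and total sensitivities discussed in the sensitivity analysis can be estimated, the following Python sketch applies a Monte Carlo scheme with independent Gaussian inputs. It is a sketch under stated assumptions rather than the procedure of this study: the commonly used Saltelli (first-order) and Jansen (total) estimators are swapped in because the estimator is not specified here, the inputs are standard normal, and `toy_model` is a hypothetical additive function that stands in for the fitted risk model.

```python
import random


def sobol_indices(model, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order and total Sobol' indices for independent N(0, 1) inputs."""
    rng = random.Random(seed)

    def draw():
        return [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(n_samples)]

    A, B = draw(), draw()  # two independent sample matrices
    fA = [model(x) for x in A]
    fB = [model(x) for x in B]
    ys = fA + fB
    mu = sum(ys) / len(ys)
    var_y = sum((v - mu) ** 2 for v in ys) / len(ys)

    first, total = [], []
    for i in range(n_inputs):
        # AB_i: rows of A with column i replaced by column i of B.
        fAB = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # First-order index (Saltelli-type estimator).
        first.append(sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fAB))
                     / n_samples / var_y)
        # Total index (Jansen-type estimator).
        total.append(sum((ya - yab) ** 2 for ya, yab in zip(fA, fAB))
                     / (2 * n_samples) / var_y)
    return first, total


def toy_model(x):
    """Hypothetical additive model; analytic indices are 9/14, 4/14, and 1/14."""
    return 3 * x[0] + 2 * x[1] + x[2]


first_order, total_order = sobol_indices(toy_model, n_inputs=3)
print([round(s, 2) for s in first_order])
print([round(s, 2) for s in total_order])
```

For an additive model such as this one, the first-order and total indices of each input coincide up to sampling error; a total index that is clearly larger than the first-order index points to interactions with other inputs.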

WS management methods
According to the actual situation of the tunnel in the case study, the early warning level of WS, and expert experience, the library outputs five kinds of WS management methods, as shown in Table 4.

Conclusions and future prospects
Key findings
Because of their advantages, such as safety and predictability, metro systems have met the increased traffic demand brought by population growth. As people increasingly rely on urban rail transit, the safety of metro tunnel operations has also received increasing attention. The shield method has become the preferred method for metro tunnel construction due to its high efficiency and safety. However, because of the specific structure of the shield tunnel, many defects can occur during the operation of metro tunnels. As a defect that occurs with high frequency, has strong connections with other defects and can lead to severe consequences, WS should be given special attention in tunnel operation safety management. Therefore, based on the analysis of WS risk mechanisms, this paper proposes a water leakage risk analysis and management method based on an intelligent RF algorithm during the operation period. This paper contributes to the literature on the risk prediction and diagnosis of WS in operational shield tunnels in several ways.
(1) Based on the analysis of the mechanism of leakage in the shield tunnel operation period, the relationships between the development of leakage and other tunnel defects are analyzed. According to the selection principle of the indexes in the risk evaluation index system, the relevant literature, and expert experience, 12 third-level risk indexes and four second-level risk indexes are selected to construct a three-level index system of WS risk evaluation. Combined with research on and experience with tunnel WS risk classification around the world, WS risk is divided into five safety states and grades, which are convenient for subsequent risk assessment, prediction and management.
(2) Among many machine learning algorithms, the RF method has the advantage of reducing the overfitting phenomenon in model training; the training results are stable, and the generalization ability of the model is strong. In this paper, the RF regression algorithm is used to predict the leakage risk grade of an operating tunnel, and a prediction model based on this method is established. The corresponding prediction process and steps are also proposed. This provides an effective way to perform WS risk grade prediction.
(3) Our paper takes the Wangzong section of Wuhan Metro Line 3 as an example. A data training set is established, and the optimal parameters mtry = 6 and ntree = 500 are selected to construct a training model and evaluate the importance of the indicators. The key risk indicators that have a great impact on the risk of WS in the operation of the tunnel in this section are, from greatest to least impact, the spalling area of the segment (V6), crack width (V5 2 ), and loss rate of the steel section (V8), which can be considered the three most important indicators for safety control. Therefore, if the risk of WS increases in this case study, the status of these three risk indicators can be checked and treated in turn. In addition, some of the actual project data are input into the model as the test set data, and the predicted results are compared with the actual values. The results show that the error index is small, which verifies the accuracy and reliability of the RF model. Finally, Sobol's index method is used to analyze the global sensitivity of the variables, and the results are consistent with the variable importance evaluation.
(4) To further verify the reliability of the RF prediction model, the WS risk prediction results of the RF model are compared with those of the SVM and BP-ANN models. The results show that, compared with SVM and BP-ANN, the RF model has a greater goodness of fit and the smallest RMSE; thus, the prediction results of the RF model are found to be more accurate and stable than those of the other two models.

Future work
Controlling the WS of metro tunnel engineering is a complex task that requires comprehensive consideration of many factors, including the complexity of engineering construction, maintenance cost, and engineering maintenance. The durability of the protection and other factors should also be considered when selecting appropriate WS treatment measures. In addition, the design of future water leakage treatment schemes should not sacrifice the original technical standards but should consider the integrity of the original structure, in accordance with the principle of combining drainage and interception, to reduce the damage to the structure and protect the environment (Qi & Tang, 2018;Zhang et al., 2020c). Several knowledge gaps related to our research should be considered in further studies. The RF method can become trapped in locally optimal solutions, and its tuning depends on user experience. A combined model that integrates metaheuristic particle swarm optimization in the RF could be used to enhance the robustness of the RF model. Furthermore, we aim to implement the decision-making recommendations and compare the leakage status of the improved operational shield tunnels with the expected and predicted statuses to determine the reliability of the model and method at the actual engineering level. According to this feedback, we can make appropriate changes to the optimized model to improve its application to the risk management of WS during shield tunnel operation. Future studies can focus on developing an enhanced RF method to improve the comprehensive safety status of operating tunnels, including building other important prediction models, such as models of tunnel lining stability and of the effects of shield tunneling on adjacent buildings. Meanwhile, more datasets can be collected from different tunnels to further validate the reliability of the proposed model and increase its application scope in the field of engineering.
The failure modes of operational shield tunnels include uneven settlement of the tunnel, lining corrosion, cracking, and others, which reduce the durability and bearing capacity of the structure, shorten the service time, and thus seriously threaten the safety, comfort, and normal operation of the tunnel. In addition, only 11 risk indicators are used as the input variables of the three models in the present study. In the future, more or different input variables can be used to develop RF models to predict and diagnose WS in operational shield tunnels.