Modeling the effect of pollutant gas on PM2.5 in China with computational intelligence

    Xu Wang
    Kai Zhang
    Peishan Han
    Xianjun Li
    Qiong Pan
DOI: https://doi.org/10.3846/jeelm.2026.25791

Abstract

This study employs four computational intelligence techniques, namely gene expression programming (GEP), back-propagation neural network (BPNN), support vector regression (SVR), and linear regression (LR), to model the quantitative relationship between pollutant gases (PGs) and PM2.5 concentrations using 2021 environmental data from 12 Chinese cities. Model performance was compared using the correlation coefficient (R), root mean squared error (RMSE), and mean absolute error (MAE). Across all models, R between predicted and actual PM2.5 concentrations ranged from –0.7579 to 0.9802. SVR and LR were the most robust, with average R values of 0.8656 and 0.8671, respectively. LR also yielded the lowest average RMSE (0.12) and MAE (0.06) across the cities. GEP was able to find highly accurate explicit models, reaching a maximum R of 0.9766. The LR models consistently identified CO and PM10 as the pollutants with the greatest impact on PM2.5 concentrations. The correlation formulas derived from GEP and LR can support further PM2.5 analysis. These findings offer insights into PM2.5 formation mechanisms and can inform pollution control strategies.
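
The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration only: the data are synthetic (not the 2021 city data), the two predictors merely stand in for normalized CO and PM10, and the helper names `evaluate`, `fit_linear`, and `predict_linear` are hypothetical. It is not the authors' implementation; it only shows how R, RMSE, and MAE are computed for an ordinary least-squares fit.

```python
import numpy as np


def evaluate(y_true, y_pred):
    """Return (R, RMSE, MAE) between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r = np.corrcoef(y_true, y_pred)[0, 1]            # Pearson correlation coefficient
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # root mean squared error
    mae = np.mean(np.abs(y_true - y_pred))           # mean absolute error
    return r, rmse, mae


def fit_linear(X, y):
    """Ordinary least squares with an intercept; intercept is coef[0]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the fitted coefficients to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# Synthetic illustration (assumed data, NOT the study's measurements):
# two hypothetical normalized predictors standing in for CO and PM10.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 0.5 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(0.0, 0.05, size=200)

coef = fit_linear(X, y)
r, rmse, mae = evaluate(y, predict_linear(coef, X))
print(f"coefficients={coef}, R={r:.4f}, RMSE={rmse:.4f}, MAE={mae:.4f}")
```

In a sketch like this, the relative size of the fitted coefficients on comparably scaled inputs is one simple way to read off which predictor contributes most, which is the kind of comparison behind the CO and PM10 finding; the study's own procedure may differ.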

Keywords:

PM2.5, pollutant gas, GEP, BP neural network, SVR, LR

How to Cite

Wang, X., Zhang, K., Han, P., Li, X., & Pan, Q. (2026). Modeling the effect of pollutant gas on PM2.5 in China with computational intelligence. Journal of Environmental Engineering and Landscape Management, 34(2), 124–138. https://doi.org/10.3846/jeelm.2026.25791

Published in Issue
April 29, 2026
