Effect of noise reduction on PLSR modeling in near infrared spectroscopy using denoising autoencoder
DOI: https://doi.org/10.3846/ntcs.2025.24139
Abstract
This study proposes a deep learning-based denoising autoencoder to make near-infrared (NIR) spectroscopy data more robust to random noise and to improve quantitative modeling accuracy. Artificial Gaussian noise at four signal-to-noise ratio (SNR) levels (10, 15, 20, and 25 dB) was added to NIR spectra obtained from milk samples to mimic real measurement conditions. The noisy spectra were then denoised with an autoencoder architecture consisting of fully connected layers. Noise removal performance was evaluated quantitatively using both theoretical and measured SNR values. The results show that the autoencoder significantly improves spectral signal quality at all SNR levels. In particular, at the lowest SNR level (10 dB), the autoencoder raised the SNR to 29.6 dB, nearly three times the input value. At all other levels, an average increase of 18-20 dB was observed in the SNR of the denoised spectra. In the second stage of the study, Partial Least Squares Regression (PLSR) models were built using both the noisy and the denoised spectra and evaluated on the test set with root mean square error (RMSE) and coefficient of determination (R²). The PLSR models built with the denoised spectra achieved lower RMSE and higher R² values at all SNR levels. At the 10 dB level in particular, R² increased from 0.44 to 0.71, while RMSE decreased from 0.60 to 0.43. These results show that the deep learning-based autoencoder architecture can effectively reduce random noise in NIR spectral data and significantly improve both spectral signal quality and quantitative modeling performance. The approach offers an effective way to improve model reliability and accuracy in NIR spectroscopy analysis.
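The noise injection and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the spectrum is a synthetic stand-in rather than milk data, and signal power is assumed to be the mean square of the spectrum, a convention the abstract does not specify.

```python
import numpy as np

def add_gaussian_noise(spectrum, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the target SNR (dB)."""
    signal_power = np.mean(spectrum ** 2)  # assumed definition of signal power
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=spectrum.shape)
    return spectrum + noise

def measured_snr_db(clean, estimate):
    """SNR of `estimate` relative to the clean reference spectrum, in dB."""
    residual_power = np.mean((clean - estimate) ** 2)
    return 10.0 * np.log10(np.mean(clean ** 2) / residual_power)

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 5000)
clean = 1.0 + 0.5 * np.sin(2 * np.pi * 3 * axis)  # synthetic stand-in spectrum

# The four noise levels used in the study.
for target in (10, 15, 20, 25):
    noisy = add_gaussian_noise(clean, target, rng)
    print(target, round(measured_snr_db(clean, noisy), 2))
```

Applying `measured_snr_db` to a denoised spectrum instead of `noisy` would give the kind of before-and-after comparison the abstract reports, and `rmse` and `r2` would be applied to the test-set predictions of a regression model.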
Keywords:
autoencoder, milk analysis, deep learning, near-infrared spectroscopy, noise removal, PLSR
License
Copyright (c) 2025 The Author(s). Published by Vilnius Gediminas Technical University.
This work is licensed under a Creative Commons Attribution 4.0 International License.