Combination of machine learning-based automatic valuation models for residential properties in South Korea

    Jengei Hong; Woo-sung Kim


The applicability of machine learning (ML) techniques has recently been expanding to include automatic real estate valuation models. The main advantage of these techniques is that they can better capture the complexity of the value determination process, and they have accordingly been shown to outperform conventional models. In this paper, the latest ML algorithms (i.e., support vector machine, random forest, XGBoost, LightGBM, and CatBoost) are examined as automatic valuation models, and several combination methods are proposed to improve the models’ predictive power. We applied the ML models to approximately 57,000 records of apartment transactions that occurred in Seoul in 2018, provided by South Korea’s Ministry of Land, Infrastructure, and Transport. The results are as follows. First, ML-based predictors (especially the latest decision tree-based algorithms) perform better than conventional models. Second, the prediction error of one model can be partially offset by another model’s error, which implies that an efficient averaging of the predictors improves predictive accuracy. Third, the models’ relative performance may itself be learned by the ML algorithms, which means that they can also be used to recommend which algorithm should be selected for making predictions.
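
The second finding, that one model's errors can partly offset another's, can be illustrated with a minimal sketch. The abstract does not specify the paper's combination methods, so the inverse-MSE weighting below is an assumed, generic averaging scheme, and the prices and model outputs are hypothetical numbers chosen only for illustration.

```python
import math

def rmse(pred, actual):
    """Root mean squared error between predictions and observed values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def inverse_mse_weights(preds, actual):
    """Weight each predictor by 1/MSE, normalised so the weights sum to 1.

    Assumption: a generic weighting rule, not necessarily the paper's method.
    """
    inv = [1.0 / (rmse(p, actual) ** 2) for p in preds]
    total = sum(inv)
    return [w / total for w in inv]

def combine(preds, weights):
    """Weighted average of several predictors, observation by observation."""
    n = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n)]

# Hypothetical transaction prices and two hypothetical model outputs
# whose errors tend to have opposite signs.
actual = [500.0, 620.0, 710.0, 830.0, 950.0]
model_a = [520.0, 600.0, 735.0, 815.0, 960.0]
model_b = [485.0, 640.0, 700.0, 850.0, 935.0]

weights = inverse_mse_weights([model_a, model_b], actual)
combined = combine([model_a, model_b], weights)

print("RMSE model A :", round(rmse(model_a, actual), 2))
print("RMSE model B :", round(rmse(model_b, actual), 2))
print("RMSE combined:", round(rmse(combined, actual), 2))
```

In this toy setting the combined predictor has a lower RMSE than either input because the two error series largely cancel. The weights here are fitted on the same observations used for scoring; in practice they would be estimated on a separate validation set.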

Keywords: automatic valuation model, mass appraisal, machine learning (ML) techniques, combined approach, decision tree-based algorithms

How to Cite
Hong, J., & Kim, W.-S. (2022). Combination of machine learning-based automatic valuation models for residential properties in South Korea. International Journal of Strategic Property Management, 26(5), 362–384.
Published in Issue
Nov 17, 2022
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

