Can machine learning algorithms associated with text mining from internet data improve housing price prediction performance?

    Jian-qiang Guo; Shu-hen Chiang; Min Liu; Chi-Chun Yang; Kai-yi Guo


Housing frenzies in China have attracted widespread global attention over the past few years, but establishing an effective real estate policy depends on forecasting housing prices more accurately. Drawing on the ubiquity and immediacy of Internet data, this research adopts a broader version of text mining to search for keywords related to housing prices and then evaluates their predictive ability using machine learning algorithms. Our findings indicate that this new method, especially random forest, not only detects turning points but also offers predictive ability that clearly outperforms traditional regression analysis. Overall, prediction based on online search data through a machine learning mechanism helps us better understand the trends of house prices in China.
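
The abstract compares forecasting methods on two criteria: overall prediction accuracy and the ability to detect turning points. The sketch below shows one way such a comparison could be scored. It is an illustration only: the abstract does not state which metrics the paper uses, so root mean squared error and the turning-point hit rate are assumptions, the function names are hypothetical, and the price series are made-up numbers rather than data from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def turning_points(series):
    """Indices where the series changes direction (local peak or trough)."""
    points = []
    for t in range(1, len(series) - 1):
        before = series[t] - series[t - 1]
        after = series[t + 1] - series[t]
        if before * after < 0:  # sign flip means the trend reversed at t
            points.append(t)
    return points


def turning_point_hit_rate(actual, predicted):
    """Share of actual turning points where the forecast also turns."""
    actual_tp = turning_points(actual)
    if not actual_tp:
        return float("nan")
    predicted_tp = set(turning_points(predicted))
    return sum(t in predicted_tp for t in actual_tp) / len(actual_tp)


# Made-up illustrative numbers, NOT data from the paper.
actual = [100, 102, 105, 103, 101, 104, 108]
forecast_a = [100, 103, 104, 103, 102, 104, 107]  # hypothetical model that tracks reversals
forecast_b = [100, 101, 102, 103, 104, 105, 106]  # hypothetical straight-trend model

for name, forecast in [("A", forecast_a), ("B", forecast_b)]:
    print(name, round(rmse(actual, forecast), 3),
          turning_point_hit_rate(actual, forecast))
```

In this toy setup the straight-trend forecast has no reversals at all, so it misses every turning point no matter how close its level is, which is why a turning-point measure can be worth reporting alongside an error measure.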

First published online 10 June 2020

Keywords: housing frenzies, Internet search, text mining, machine learning

How to Cite
Guo, J.-Q., Chiang, S.-H., Liu, M., Yang, C.-C., & Guo, K.-Y. (2020). Can machine learning algorithms associated with text mining from internet data improve housing price prediction performance? International Journal of Strategic Property Management, 24(5), 300–312.
Published in Issue
Aug 14, 2020
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

