Critical review of text mining and sentiment analysis for stock market prediction

    Zuzana Janková


This paper presents a critical review of the literature on text mining and sentiment analysis for stock market prediction. It focuses on the latest findings of research articles dealing strictly with stock markets, represented by stock indices or individual stocks. The review examines and critically analyzes the methods used to extract sentiment from textual data, with special regard to the generalizability and transferability of research results. Accordingly, the literature is treated analytically and organized critically, with attention to completeness, coherence, and consistency. Based on the selected criteria, 260 articles on the subject were selected from the Web of Science and Scopus databases. These studies are mapped graphically through bibliometric analysis. The selection was subsequently narrowed to 49 articles. The outputs are synthesized, and the main findings and limitations of the current state of research are highlighted, together with possible directions for future research.

Keywords: bibliometric analysis, financial market, literature review, sentiment analysis, stock market, text mining

How to Cite
Janková, Z. (2023). Critical review of text mining and sentiment analysis for stock market prediction. Journal of Business Economics and Management, 24(1), 177–198.
Published in Issue
Apr 5, 2023

This work is licensed under a Creative Commons Attribution 4.0 International License.

