Critical review of text mining and sentiment analysis for stock market prediction

    Zuzana Janková

Abstract

This paper presents a critical review of the literature on text mining and sentiment analysis for stock market prediction, with particular attention to recent research articles focused strictly on stock markets as represented by stock indices or individual stocks. The review examines and critically analyzes the methods used to extract sentiment from textual data, with special regard to whether research results can be generalized and transferred. To this end, the literature is approached analytically and organized critically, with emphasis on completeness, coherence, and consistency. Based on the selected criteria, 260 articles on the subject are selected from the Web of Science and Scopus databases and mapped graphically through bibliometric analysis. The selection is then narrowed to 49 articles. The outputs are synthesized, and the main findings and limitations of the current state of research are highlighted, together with possible directions for future research.

Keywords: bibliometric analysis, financial market, literature review, sentiment analysis, stock market, text mining

How to Cite
Janková, Z. (2023). Critical review of text mining and sentiment analysis for stock market prediction. Journal of Business Economics and Management, 24(1), 177–198. https://doi.org/10.3846/jbem.2023.18805
Published in Issue
Apr 5, 2023
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
