Stock Price Prediction Using News Article Sentiment Analysis: A Comparative Study of NLP Classification Techniques
Keywords:
Long short-term memory, Named entity recognition, Natural language processing, Predictive accuracy, Stock price movements, Support vector machines
Abstract
This paper presents an approach for predicting stock price movements from sentiment analysis of financial news articles. We evaluate a range of classifiers, including traditional lexicon-based methods, Naive Bayes, and Support Vector Machines (SVM), alongside modern deep learning architectures such as FinBERT and Long Short-Term Memory (LSTM) networks. Using the Reuters news corpus combined with Quandl stock price data, we conduct experiments to assess classifier performance across multiple temporal prediction horizons, including one-day and two-day forecasts. Experimental results show that FinBERT consistently outperforms the classical models, achieving an accuracy of 65.7% for two-day price movement predictions. Additionally, we perform statistical significance testing to validate the robustness of our results, accompanied by a detailed error analysis that identifies common sources of misclassification. We also investigate the impact of varying labeling thresholds on classifier performance, highlighting the trade-offs between precision and recall in sentiment-driven prediction tasks. Our findings suggest that while sentiment analysis provides valuable signals for stock movement prediction, integrating domain-specific Named Entity Recognition (NER) and contextual modeling is essential to fully capture the complexities inherent in financial text. Future work will explore incorporating event-driven representations and multi-modal data sources, such as social media and market indicators, to further improve predictive accuracy and reliability.
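To make the notion of a labeling threshold over a prediction horizon concrete, the following is a minimal sketch. It is not the procedure used in this study: the function name `label_movements`, the use of relative close-to-close change, the three-way up/neutral/down scheme, and the sample prices are all assumptions chosen for illustration.

```python
from typing import List, Optional


def label_movements(
    closes: List[float], horizon: int = 1, threshold: float = 0.0
) -> List[Optional[int]]:
    """Label each day by the relative price change `horizon` days ahead.

    Hypothetical scheme (an assumption, not taken from the paper):
      +1  if the change exceeds +threshold
      -1  if the change is below -threshold
       0  if the change lies inside the threshold band
      None if no price is available `horizon` days ahead
    """
    labels: List[Optional[int]] = []
    for t in range(len(closes)):
        if t + horizon >= len(closes):
            labels.append(None)  # not enough future data for this day
            continue
        change = (closes[t + horizon] - closes[t]) / closes[t]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


# Illustrative closing prices (made up for this example).
closes = [100.0, 102.0, 101.5, 104.0, 102.0]
one_day = label_movements(closes, horizon=1, threshold=0.01)
two_day = label_movements(closes, horizon=2, threshold=0.01)
```

Under this assumed scheme, widening the threshold moves more days into the neutral class. The remaining up and down labels then correspond to larger price moves, which is one way a threshold choice can shift the balance between precision and recall.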