FA692 Natural Language Processing for Financial Applications

Course Catalog Description


This course focuses on natural language processing (NLP) models and their applications to finance. Building on fundamental machine learning theory and practice, the course covers advanced topics in natural language processing for analyzing financial reports and news. Learning and building from financial data sets, the lectures will introduce machine learning models in quantitative investing, portfolio management, algorithmic trading, risk management, client-relationship management, and beyond. A final project on related topics is required.
Prerequisite: Students must have taken FA590 or a comparable introduction to machine learning methods.

Campus Fall Spring Summer
On Campus X
Online X


Professor | Email | Office
Zachary Feinstein | zfeinste@stevens.edu | Babbio Center 628

Course Objectives

This is an advanced course in the FINTECH and Machine Learning concentration of the Financial Analytics program.

In this course, students will (generally):

  • Be able to apply NLP to financial problems

  • Be able to evaluate the performance of different methods to determine the best method and hyperparameters

  • Learn how to interpret results of NLP in a financial context

Course Outcomes

  • Create and apply NLP models for analyzing financial reports and news.

  • Evaluate performance of trained NLP models for sentiment analysis.

  • Understand applicability of diverse NLP techniques.

Course Resources


  1. Dixon, M. F., Halperin, I., & Bilokon, P. (2020). Machine Learning in Finance: From Theory to Practice. Germany: Springer International Publishing.
  2. Bird, S. (2006, July). NLTK: The natural language toolkit. In Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions (pp. 69-72).
  3. Gupta, A., Dengre, V., Kheruwala, H. A., & Shah, M. (2020). Comprehensive review of text-mining applications in finance. Financial Innovation, 6(1), 1-25.
  4. Araci, D. (2019). FinBERT: Financial sentiment analysis with pre-trained language models. arXiv preprint arXiv:1908.10063.
  5. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.


Grading Policies

Attendance and Short Assignments 20%
Homework 35%
Project 30%
Presentation 15%
Total grade 100%

Lecture Outline

Week | Topics | Readings | HW
1 | Review of Machine Learning | Dixon et al. (2020), Chapter 1 | Homework 1: Review of Python (assigned)
2 | Introduction to NLP with Applications to Financial Data | Bird (2006) | Homework 1 (due); Homework 2: Tokenization and preprocessing (assigned); Project (assigned)
3 | Text Mining of Financial Reports and News | Gupta et al. (2020) | Homework 2 (due); Homework 3: Application of text mining (assigned); Project proposal (due)
4 | Sentiment Analysis of Financial Reports and News | Araci (2019) | Homework 3 (due); Homework 4: Application of neural networks for time series (assigned)
5 | Topic Modeling in Finance | Blei et al. (2003) | Homework 4 (due); Homework 5: Application of topic modeling (assigned)
6 | Applications to Finance / Special Topics | Dependent on topic area chosen | Homework 5 (due)
7 | Presentations | | Project (due)