Application of Statistical and Machine Learning Techniques to Detect Rare Events in High Frequency Financial Data And Assess Corporate Credit Rating
Author: Parisa Golbayani
Degree: Ph.D. in Financial Engineering
Year: 2019
Advisory Committee: Dr. Ionuț Florescu, Dr. Charles Suffel, Dr. Rupak Chatterjee, Dr. Dragos Bozdog
Abstract: A financial crisis generally refers to a situation in which one or more classes of financial assets lose value. Failure to assess the risk of such losses at many important corporations, financial institutions in particular, is a key cause of this type of crisis. Every financial crisis has a large negative impact on society, the economy, political events, and even individuals' lives. Detection and analysis of crises, or of any extreme event in financial markets, is an important problem that both industry and academia have investigated over the past decade. This dissertation presents two separate conceptual frameworks to detect and potentially mitigate risk in financial markets. First, it presents a statistical model for the detection and analysis of rare events in high-frequency financial data. We investigate the joint probability distribution of price, volume, and time and implement a multivariate criterion for detecting potential rare events. The results identify rare events as large price movements with relatively small traded volume over a short time span in transaction-level high-frequency data. This approach could potentially be extended to construct an early warning system for market-wide rare event monitoring. The second part of this research analyzes various machine learning techniques and their applicability to corporate credit score prediction. This is an important step in assessing the risk associated with investment decisions, for example future credit loans. To achieve this goal, we review most machine learning algorithms applied to credit rating assessment and design a methodology to compare the performance of the most successful methods. We use the same financial statement data for all models. We introduce a new performance measure relevant to credit rating prediction and provide recommendations based on this measure, which go beyond conventional accuracy and computational efficiency.
The final part of this dissertation considers the nonlinear structure of the data more deeply and applies modern deep learning techniques to predict corporate credit scores more accurately. We provide recommendations for the neural network that consistently outperforms the others with respect to the temporal dimension of the data and the selection of input features. These recommendations are useful when assessing the creditworthiness of a company that has not previously been rated.
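
Illustrative note: the abstract characterizes a rare event as a large price movement on relatively small traded volume within a short time span. The Python sketch below is only a minimal illustration of that three-way condition. It replaces the dissertation's joint-distribution criterion with fixed, hand-picked thresholds, which is an assumption made for brevity and not the author's method. The function name `flag_rare_events`, its parameters, and the toy trade data are all hypothetical.

```python
from typing import List, Sequence


def flag_rare_events(
    prices: Sequence[float],
    volumes: Sequence[float],
    times: Sequence[float],
    min_abs_return: float,
    max_volume: float,
    max_seconds: float,
) -> List[int]:
    """Return indices i where the move from trade i-1 to trade i is
    large in price, small in volume, and short in elapsed time.

    Fixed thresholds are an assumption of this sketch; they stand in
    for a criterion derived from a fitted joint distribution.
    """
    flagged = []
    for i in range(1, len(prices)):
        # Absolute simple return between consecutive trades.
        abs_return = abs(prices[i] / prices[i - 1] - 1.0)
        # Seconds elapsed since the previous trade.
        elapsed = times[i] - times[i - 1]
        if (
            abs_return >= min_abs_return
            and volumes[i] <= max_volume
            and elapsed <= max_seconds
        ):
            flagged.append(i)
    return flagged


# Toy trades (made-up numbers, for illustration only).
prices = [100.0, 100.1, 100.0, 98.0, 98.1]
volumes = [500, 400, 600, 50, 450]
times = [0.0, 1.0, 2.0, 2.2, 3.5]  # seconds

events = flag_rare_events(
    prices, volumes, times,
    min_abs_return=0.01,  # at least a 1% move
    max_volume=100,       # on at most 100 units traded
    max_seconds=0.5,      # within half a second
)
print(events)
```

In the toy data only the fourth trade (index 3) meets all three conditions: a 2% drop on 50 units within 0.2 seconds. In practice the thresholds would come from the estimated joint behavior of price, volume, and time rather than being set by hand.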