Author: Margarita Zaika
Degree: M.S. in Financial Engineering
Year: 2021
Advisory Committee: Dr. Dragos Bozdog, Dr. Ionut Florescu

Abstract: In this thesis, we use high-frequency financial data and multidimensional liquidity measures to formulate multiple methodologies for the detection and analysis of rare events. We show that liquidity can be used to detect rare events during high-frequency trading throughout the day. In our analysis, we examine and utilize most of the known liquidity measures, which are based on Trade and Quote (TAQ) and Limit Order Book (LOB) data. To reduce noise in the data, we lower the dimensionality of the data sets by applying Principal Component Analysis (PCA). This transformation also makes it less computationally intensive to fit the models that identify outliers. Three methods that are widely used in data science are compared and contrasted: Mahalanobis Distance (MD), Isolation Forest (IForest), and Local Outlier Factor (LOF). To address the task of classifying extreme liquidity as high or low, these outlier detection methods are extended by our proposed methodologies: "hyper-plane" and "cone". Rare event detection is reviewed and compared using trading data from the beginning of the COVID-19 outbreak. Practical considerations regarding the minimization of computational complexity for better performance are discussed. A new measure is developed to quantify and visualize extreme liquidity intensity, and we study whether the outlier detection methods highlight the same clusters of extreme observations. We also analyze the relationship between rare events and the underlying stocks' prices, volume, and volatility.
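
The pipeline described in the abstract (reduce dimensionality with PCA, then score observations with an outlier detector) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes synthetic data in place of real TAQ/LOB liquidity measures, covers only the Mahalanobis Distance step (IForest, LOF, and the "hyper-plane" and "cone" extensions are not shown), and uses an empirical 99% quantile as an arbitrary cutoff. The function names, the number of components, and the data dimensions are all hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project standardized rows of X onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def mahalanobis_sq(S):
    """Squared Mahalanobis distance of each row of S from the sample mean.

    Assumes S has at least two columns so the sample covariance is a matrix.
    """
    centered = S - S.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(S, rowvar=False))
    # Row-wise quadratic form: centered_i^T @ cov_inv @ centered_i
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

rng = np.random.default_rng(0)

# Hypothetical stand-in for liquidity data: 500 intraday observations of
# 6 correlated measures driven by 2 latent factors plus small noise.
latent = rng.normal(size=(500, 2))
latent[:5] += 6.0  # five artificial "rare events" in the latent factors
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))

scores = pca_scores(X, n_components=2)
d2 = mahalanobis_sq(scores)

# Assumed cutoff: flag the observations above the empirical 99% quantile.
threshold = np.quantile(d2, 0.99)
outliers = np.flatnonzero(d2 > threshold)
print("flagged observation indices:", outliers)
```

In this sketch the distances are computed on two principal component scores rather than on all six raw measures, which is the sense in which the PCA step lightens the outlier model. On real data, the number of retained components and the cutoff would need to be chosen from the data rather than fixed as they are here.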