Author: Zhiyuan Yao
Advisors: Dr. Ionut Florescu, Dr. Chihoon Lee
Date: October 1, 2024
Department: Financial Engineering
Degree: Doctor of Philosophy
Advisory Committee:
Dr. Ionut Florescu, Chairman
Dr. Chihoon Lee, Chairman
Dr. Rong Liu
Dr. Zachary Feinstein
Dr. Jia Xu
Abstract: We investigate the applicability of reinforcement learning (RL) to portfolio optimization problems. Training RL agents is challenging due to two primary issues: 1) communication latency between traders and trading systems, which degrades performance, and 2) the need for a realistic market simulator for effective agent learning and interaction. Our three studies tackle these challenges and propose a hierarchical trading framework in which two RL agents work in tandem to maximize utility.
First, we propose an RL method to mitigate the performance degradation that occurs when environments have delayed feedback. This degradation is more pronounced in environments with highly stochastic transitions, such as a stock market. We focus on deterministic delays and propose a model-based RL method to counteract the effects of latency. Our approach recovers the optimal policy in environments with deterministic transitions. We demonstrate its effectiveness through comparisons with previous methods and apply it to various Atari games to further analyze performance.
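To make the delayed-feedback setting concrete, the following is a minimal sketch and not the dissertation's method. It assumes a toy deterministic environment with a known transition model and a fixed observation delay; the names `step_model`, `predict_current_state`, `undelayed_policy`, and `run_episode` are hypothetical. The idea illustrated is that, with a deterministic delay, rolling a model forward through the actions already sent but not yet reflected in the observation gives an estimate of the current state.

```python
def step_model(state, action):
    """Hypothetical deterministic transition: a position on a line, clipped to [0, 10]."""
    return max(0, min(10, state + action))


def predict_current_state(delayed_state, pending_actions, model):
    """Roll the model forward through actions taken since the delayed observation."""
    state = delayed_state
    for action in pending_actions:
        state = model(state, action)
    return state


def undelayed_policy(state, goal=7):
    """Policy that is optimal without delay: take one step toward the goal."""
    return (goal > state) - (goal < state)


def run_episode(delay=3, start=0, steps=15, compensate=True):
    """Simulate an agent that only sees the state from `delay` steps ago."""
    states = [start]  # true states s_0 .. s_t (hidden from the agent beyond the delay)
    actions = []      # actions a_0 .. a_{t-1}
    for t in range(steps):
        obs_index = max(0, t - delay)
        delayed_state = states[obs_index]
        if compensate:
            # Actions issued since the observed state was produced.
            pending = actions[obs_index:]
            estimate = predict_current_state(delayed_state, pending, step_model)
        else:
            # Naive agent: treats the stale observation as the current state.
            estimate = delayed_state
        action = undelayed_policy(estimate)
        actions.append(action)
        states.append(step_model(states[-1], action))
    return states


if __name__ == "__main__":
    print("compensated:", run_episode(compensate=True))
    print("naive:      ", run_episode(compensate=False))
```

In this toy setting the compensated agent reaches the goal and stays there, while the naive agent overshoots because it keeps acting on stale observations. With stochastic transitions the forward roll-out would only approximate the current state, which is consistent with the abstract's remark that the problem is harder in such environments.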
Second, we aim to build an agent-based market simulator driven by RL agents. Market simulators are designed to help us better understand the dynamics and properties of the market, and to quickly generate large amounts of data for training models. To simulate a realistic market, the system should include different types of agents, each with its own utility. We design a simulation framework that not only captures the stylized facts but also mirrors real-world market behaviors. Additionally, we examine how RL agents respond to external shocks such as flash crashes, demonstrating their adaptability to market events.
Third, we introduce a hierarchical trading framework consisting of two RL agents that work together to optimize the overall utility. Traditional asset management involves asset selection and portfolio optimization, but the success of an actively traded portfolio also depends on order execution. Our framework integrates these two RL-based agents so that each reinforces the other's decisions to maximize investment returns. We showcase its effectiveness by training and testing it on the U.S. equity market and by exploring optimal training methods for this dual-agent system.