Researchers

Sijie Wen, M.S. in Financial Analytics, Graduated May 2020
Yamin Tang, M.S. in Financial Analytics, Graduated May 2020

Advisor:
Dr. Cristian Homescu, Director, Portfolio Analytics, Chief Investment Office at Bank of America Merrill Lynch

Abstract

In this paper, we propose a goal-based investment model suitable for personalized retirement management, implemented with two reinforcement learning (RL) algorithms: Q-learning and Deep Deterministic Policy Gradient (DDPG). Compared with traditional portfolio investment, our model shows statistically significant profitability and better performance.

Keywords: Goal-based Investing, Reinforcement Learning, Portfolio Management, Retirement Planning

Main Results

To implement the two RL algorithms, Q-learning and DDPG, for goal-based investing, we assume the investor's initial wealth is held entirely in cash and is then allocated between cash and the following stock and bond funds over the investment horizon. Table 1 below summarizes the statistical characteristics of these asset classes.


Components of portfolio              Expected Return (%)*    Standard Deviation (%)*
S&P 500                              10.50                   10.14
iShares Russell Top 200              14.05                   11.99
iShares Russell Mid-Cap              13.88                   14.52
SPDR MSCI EM                          4.26                   18.13
Bloomberg Barclays High Yield         4.20                    4.78
Bloomberg Barclays Municipal Bond     6.60                    7.98

* Expected returns and standard deviations are estimated over January 2010 to December 2019 and annualized.
Table 1: Portfolio Construction
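
For reference, statistics of this kind can be reproduced from daily price data along the following lines. This is a minimal sketch: the `prices` DataFrame, the simple-return convention, and the 252-day annualization factor are our assumptions, not details taken from the paper.

```python
# Sketch: annualized expected return and volatility from daily prices,
# assuming `prices` is a pandas DataFrame of daily adjusted closes
# (one column per asset, indexed by date, Jan 2010 through Dec 2019).
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed annualization factor

def annualized_stats(prices: pd.DataFrame) -> pd.DataFrame:
    """Annualized mean return and standard deviation per asset."""
    daily_returns = prices.pct_change().dropna()
    expected_return = daily_returns.mean() * TRADING_DAYS         # annualized mean
    std_dev = daily_returns.std() * np.sqrt(TRADING_DAYS)         # annualized volatility
    return pd.DataFrame({
        "Expected Return (%)": 100 * expected_return,
        "Standard Deviation (%)": 100 * std_dev,
    })
```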

Figure 1: Q-learning result (left) vs. DDPG result (right). Portfolio value vs. market value, with no cash installments and weights adjusted daily.

Figure 1 shows that the DDPG algorithm clearly outperforms Q-learning. Figure 2 demonstrates goal achievement and asset-allocation outcomes. When the investor sets a conservative goal such as 18 ($180,000), the model keeps investing in safe assets at every stage, with municipal bonds accounting for the largest share. On the other hand, if we raise the goal to a more aggressive level such as 22 ($220,000), the model has no choice but to allocate more to risky assets and reduce the proportion of risk-free assets such as cash in order to reach the goal. The DDPG model achieves the financial goal efficiently, and the weight assigned to each asset is reasonable given the expected returns and standard deviations in Table 1.
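
To make the allocation mechanics concrete, the sketch below shows one plausible way a DDPG actor's continuous action can be mapped to portfolio weights, how wealth evolves over a rebalancing period, and how a sparse goal-based reward can be defined. The softmax mapping, the rebalancing step, and the 0/1 terminal reward are illustrative assumptions; the paper's exact reward and constraints may differ.

```python
# Sketch of the interface between a DDPG actor and the portfolio, assuming
# the actor emits one raw score per asset (cash plus the six assets in
# Table 1) and weights are made long-only and fully invested via a softmax.
import numpy as np

def scores_to_weights(raw_scores: np.ndarray) -> np.ndarray:
    """Map unbounded actor outputs to non-negative weights summing to 1."""
    shifted = raw_scores - raw_scores.max()   # shift for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

def step_portfolio(wealth: float, weights: np.ndarray,
                   asset_returns: np.ndarray) -> float:
    """One rebalancing period: grow wealth by the weighted asset returns."""
    return wealth * (1.0 + weights @ asset_returns)

def goal_reward(terminal_wealth: float, goal: float) -> float:
    """Sparse goal-based reward: success only if the wealth target is met."""
    return 1.0 if terminal_wealth >= goal else 0.0
```

The softmax guarantees a long-only, fully invested portfolio, which matches the budget constraint implicit in allocating all wealth across cash and the six assets.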

Figure 2: Total wealth and weights for incremental goal achievement.

Conclusion

To summarize, we proposed a reinforcement learning, goal-based approach to the retirement optimization problem. For a capital-preservation client with retirement goals, the strategy learned by the DDPG algorithm can outperform conventional asset-allocation strategies when a set of constraints and a clear final wealth goal are specified. Setting a reasonable target is decisive for the success of a given investment strategy.

Furthermore, our proposed method requires only the initial wealth, future installments, and the goal at the end of the investment horizon as inputs, which makes it applicable to automated financial advising services, also known as robo-advisors.
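
As an illustration of that interface, a robo-advisor wrapper might look like the hypothetical sketch below; `trained_actor` and the state encoding are assumptions for illustration, not part of the paper.

```python
# Hypothetical robo-advisor entry point showing the three required inputs.
# `trained_actor` stands in for a trained DDPG policy network and is assumed
# to map a state vector to one raw score per asset.
import numpy as np

def recommend_allocation(initial_wealth: float,
                         monthly_installment: float,
                         goal: float,
                         months_remaining: int,
                         trained_actor) -> np.ndarray:
    """Suggest current portfolio weights from a trained policy network."""
    state = np.array([initial_wealth, monthly_installment,
                      goal, months_remaining], dtype=float)
    raw_scores = trained_actor(state)                   # actor forward pass
    exp_scores = np.exp(raw_scores - raw_scores.max())  # stable softmax
    return exp_scores / exp_scores.sum()                # long-only weights
```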

Check out the Interactive Visualizer here.