Researchers
Sijie Wen, M.S. in Financial Analytics, Graduated May 2020
Yamin Tang, M.S. in Financial Analytics, Graduated May 2020
Advisor:
Dr. Cristian Homescu, Director, Portfolio Analytics, Chief Investment Office at Bank of America Merrill Lynch
Abstract
In this paper, we propose a goal-based investment model suitable for personalized retirement management, implemented with two reinforcement learning (RL) algorithms: Q-learning and Deep Deterministic Policy Gradient (DDPG). Compared with traditional portfolio investment, our model shows statistically significant profitability and better performance.
Keywords: Goal-based Investing, Reinforcement Learning, Portfolio Management, Retirement Planning
Main Results
To implement the two RL algorithms, Q-learning and DDPG, for goal-based investing, we assume the investor's initial wealth is fully allocated to cash and is then allocated between cash and the following stocks and bonds over the investment horizon. Table 1 below summarizes the statistical characteristics of these asset classes.
| Components of Portfolio | Expected Return (%) | Standard Deviation (%) |
|---|---|---|
| S&P 500 | 10.50 | 10.14 |
| iShares Russell Top 200 | 14.05 | 11.99 |
| iShares Russell Mid-Cap | 13.88 | 14.52 |
| SPDR MSCI EM | 4.26 | 18.13 |
| Bloomberg Barclays High Yield | 4.20 | 4.78 |
| Bloomberg Barclays Municipal Bond | 6.60 | 7.98 |
Table 1: Portfolio Construction
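The goal-based setup above can be sketched with tabular Q-learning. The paper does not give its exact state/action discretization, so everything below is a hypothetical illustration: yearly steps, integer wealth levels in units of \$10,000, three stylized portfolios loosely in the spirit of Table 1, and a terminal reward of 1 only if the wealth goal is reached.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discretization (the paper's exact grids are not given):
# yearly steps, integer wealth in units of $10,000, and a few candidate
# portfolios summarized by (expected return, volatility).
T = 10                       # investment horizon in years
W_MAX = 40                   # wealth grid: 0 .. 40 (i.e. $0 .. $400,000)
GOAL = 18                    # terminal goal of 18, i.e. $180,000
PORTFOLIOS = [(0.01, 0.00),  # cash-like
              (0.05, 0.05),  # bond-heavy
              (0.10, 0.15)]  # equity-heavy

# Optimistic initialization (Q = 1, the maximum possible return)
# pushes the greedy policy to try every portfolio early on.
Q = np.ones((T, W_MAX + 1, len(PORTFOLIOS)))
alpha, gamma, eps = 0.1, 1.0, 0.1

def step(w, a):
    """One-year wealth transition under portfolio a (Gaussian return noise)."""
    mu, sigma = PORTFOLIOS[a]
    growth = 1.0 + mu + sigma * rng.standard_normal()
    return int(np.clip(round(w * growth), 0, W_MAX))

for episode in range(20000):
    w = 10                                    # initial wealth: $100,000 in cash
    for t in range(T):
        a = int(rng.integers(len(PORTFOLIOS))) if rng.random() < eps \
            else int(np.argmax(Q[t, w]))
        w2 = step(w, a)
        # Reward is 1 only if terminal wealth reaches the goal, else 0.
        if t == T - 1:
            target = 1.0 if w2 >= GOAL else 0.0
        else:
            target = gamma * Q[t + 1, w2].max()
        Q[t, w, a] += alpha * (target - Q[t, w, a])
        w = w2

# Q[0, 10].argmax() is the learned first-year portfolio choice;
# Q[0, 10].max() approximates the probability of reaching the goal.
print(Q[0, 10])
```

Because the reward is a goal indicator and gamma is 1, each Q-value stays in [0, 1] and can be read as an estimated probability of reaching the goal from that state, which is what makes this formulation "goal-based" rather than return-maximizing.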


Figure 1 shows that the DDPG algorithm significantly outperforms Q-learning. Figure 2 demonstrates goal achievement and asset allocation outcomes. When the investor sets a conservative goal such as 18 (\$180,000), the model continues to invest in safe assets at all stages (the municipal bond accounts for the largest part). On the other hand, if we raise the goal to a more aggressive level such as 22 (\$220,000), the model has no choice but to allocate more to risky assets and reduce the proportion of risk-free assets such as cash in order to achieve the goal. The DDPG model performs efficiently and reaches the financial goal successfully, and the weight of each asset is also reasonable given the expected returns and standard deviations in Table 1.
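One practical detail behind such allocations is how a DDPG actor's unconstrained output is turned into valid long-only portfolio weights (non-negative, summing to one). The paper does not specify its constraint handling, so the softmax projection below is one common, assumed choice; the asset ordering is also illustrative.

```python
import numpy as np

def to_weights(actor_output):
    """Map an unconstrained actor output (e.g. the raw scores from a
    DDPG policy network) to long-only portfolio weights: each weight
    is non-negative and the weights sum to 1 (a softmax projection)."""
    z = np.asarray(actor_output, dtype=float)
    z = z - z.max()          # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

# Illustrative raw scores for
# [cash, S&P 500, Russell Top 200, Russell Mid-Cap, MSCI EM,
#  High Yield, Municipal Bond]
raw = [0.2, 1.0, 1.5, 0.8, -0.5, 0.1, 0.4]
w = to_weights(raw)
print(np.round(w, 3))
```

Other projections (e.g. clipping followed by renormalization) would also satisfy the budget constraint; softmax has the advantage of being smooth, which keeps the actor differentiable end to end.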




Conclusion
To summarize, we proposed a goal-based reinforcement learning approach to the retirement optimization problem. For a capital-preservation client with retirement goals, the strategy obtained by the DDPG algorithm can outperform conventional asset allocation strategies under a series of constraints and a clear final wealth goal. A reasonable target determines the success of a given investment strategy.
Furthermore, our proposed method requires only the initial wealth, future contributions, and the goal at the end of the investment horizon as inputs, which makes it applicable to automated financial advising services, also known as robo-advisors.