31. Philosophical Foundations of Backtesting

Published on 2022-11-09 (UTC)

Investors often experiment with a lot of third-party software. They may find many promising trading algorithms yet still lose hard-earned money in a real trading environment. The following five philosophies, applied during the testing phases, can help uncover the root cause.

Backtesting Can Always Result in Perfect Scores

It is often said that almost any algorithm can look like the holy grail on historical data, given the right combination of parameters.

To see why, consider an algorithm with no real alpha, that is, no genuine insight. Run across the full grid of parameter combinations, it returns a roughly normal distribution of profits before fees, taxes, and slippage. An algorithmic trader can then pick the best-performing parameters and start trading in the real environment. Unsurprisingly, most such algorithms fail in live trading, because the selected parameters were fitted to historical noise during backtesting and optimization rather than to a real edge.
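The effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: each hypothetical parameter set is modeled as pure zero-mean Gaussian noise (no alpha), daily returns are summed rather than compounded, and the names `simulate_no_alpha_backtests`, `n_param_sets`, and `daily_vol` are invented for this example.

```python
import random
import statistics


def simulate_no_alpha_backtests(n_param_sets=1000, n_days=252,
                                daily_vol=0.01, seed=42):
    """Return one total backtest return per hypothetical parameter set.

    Assumption: with no alpha, every daily return is zero-mean Gaussian
    noise, so all parameter sets are really the same coin-flip strategy.
    Daily returns are summed (no compounding) to keep the sketch simple.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_param_sets):
        total = sum(rng.gauss(0.0, daily_vol) for _ in range(n_days))
        totals.append(total)
    return totals


results = simulate_no_alpha_backtests()
mean_return = statistics.mean(results)   # average over all parameter sets
best_return = max(results)               # the set an optimizer would pick

print(f"mean over all parameter sets: {mean_return:+.2%}")
print(f"best parameter set:           {best_return:+.2%}")
```

The average over all parameter sets stays close to zero, while the single best one looks like a strong strategy even though nothing in the simulation has any edge. Selecting that parameter set and trading it live is exactly the trap described above.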

The Real Trading Result Is the Only Thing That Matters

After all, the purpose of testing on historical data is to form confident predictions about real trading performance. Suppose backtesting predicts a 20% annual return. What is the confidence level of that prediction in real trading? An algorithm is ready to run in a real-life environment only if the trader can expect a confidence level of at least 50%.
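The text does not say how such a confidence level should be measured. One possible approach, shown here purely as a sketch, is to bootstrap the daily returns observed in testing and count how often a resampled year reaches the target. The function name `probability_of_reaching_target`, its parameters, and the synthetic `history` data are all assumptions made for illustration. Resampling also assumes that daily returns are independent and that the future resembles the tested period, which is optimistic if the returns come from an optimized backtest.

```python
import random


def probability_of_reaching_target(daily_returns, target=0.20,
                                   horizon_days=252, n_resamples=1000,
                                   seed=0):
    """Estimate P(annual return >= target) by resampling daily returns.

    Each resample draws `horizon_days` returns with replacement and
    compounds them into one simulated year.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        growth = 1.0
        for _ in range(horizon_days):
            growth *= 1.0 + rng.choice(daily_returns)
        if growth - 1.0 >= target:
            hits += 1
    return hits / n_resamples


# Synthetic stand-in for daily returns observed in testing (made-up data).
data_rng = random.Random(1)
history = [data_rng.gauss(0.0008, 0.01) for _ in range(500)]

confidence = probability_of_reaching_target(history)
print(f"estimated chance of a 20% year: {confidence:.0%}")
```

Under the rule stated above, the algorithm would go live only if this estimate were at least 50%.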

Correlation Ensures Reliability

A high degree of correlation between the results of the testing phases increases confidence in those results. Conversely, a low correlation indicates that a testing phase has issues and its results may not be reliable. There are four main stages of algorithm evaluation:

In-sample backtesting;
Out-of-sample backtesting;
Paper trading;
Small account test.

The higher the correlation between these four stages, the more consistent and reliable the algorithm is likely to be in real trading.
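The text does not specify how the correlation between stages should be computed. One simple reading, sketched below, is the Pearson correlation between the daily returns of two stages that cover the same days, for example paper trading and a small account test run side by side. The helper `pearson` and the synthetic return series are assumptions made for illustration, not real trading data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily returns for two stages over the same 60 days.
rng = random.Random(7)
paper_returns = [rng.gauss(0.0005, 0.01) for _ in range(60)]
# Assumption: the small account tracks paper trading, minus a fixed
# cost per day, plus some execution noise.
small_account_returns = [r - 0.0002 + rng.gauss(0.0, 0.002)
                         for r in paper_returns]

stage_correlation = pearson(paper_returns, small_account_returns)
print(f"paper vs small account correlation: {stage_correlation:.2f}")
```

A value close to 1 suggests the two stages behave alike. A low value would point to a problem in one of them, such as unrealistic fills in paper trading. Stages that cover different periods, such as in-sample and out-of-sample backtests, cannot be aligned day by day and would need a different comparison, for example of summary metrics.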

Out-of-Sample Backtesting May Not Be as Reliable

Many professional traders rely on out-of-sample backtesting after in-sample testing to evaluate an algorithm's reliability. Although this technique is widely accepted in machine learning, it is easy to misapply in algorithmic trading. The main reason is that out-of-sample data is still historical data, which traders may already know about.

For example, regardless of testing technique, investors know that 2020-2021 was a strong bull market after the onset of the Covid-19 pandemic, and that 2022 is a year of recession with high interest rates and high inflation. An active trader is already aware of this before running an out-of-sample backtest, and that knowledge can influence how the algorithm is designed and tuned. The result may therefore be unreliable, since out-of-sample data is never completely unseen.

Adding Forward Testing to Ensure Objectivity

Forward testing on never-before-seen data ensures objectivity when evaluating algorithm performance. Paper trading serves to evaluate performance objectively, while the small account test confirms both the technical implementation and the real performance together.