Over time, new ideas for enhancing a live algorithm will inevitably emerge, because backtesting and forward testing alone cannot fully capture an algorithm's real performance. Sooner or later, these ideas become candidate features. The challenge is how an algorithmic trader can confidently transition from the original algorithm to an updated version while ensuring a substantial improvement. This article outlines a standard procedure for updating features in a live-run algorithm.
Beta System
The idea is straightforward: gaining confidence in an improvement requires comparing two live versions of the algorithm, which means running them in parallel. Algorithmic traders may propose simply allocating a smaller initial capital to the alternative version. However, for safety and to reduce the risk of untested technical bugs, running a dedicated Beta System is recommended as good practice.
A Beta system is as sophisticated as the production system but differs in key aspects. First, it operates with limited capital to minimize risk. Second, it does not host the full set of live algorithms that the production system runs to maximize capital utilization. Third, it allows frequent (even daily) updates to an algorithm and manual adjustments to trade logs, which is nearly impossible in a large-scale production system.
Although setting up and maintaining a parallel system may be costly at first, the long-term benefits are well worth it. Refactoring the system into independent core functions can significantly reduce setup costs, since both systems can then share the same core functions.
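One way to picture this shared-core idea is a single execution function used by both systems, with each system contributing only a thin configuration. This is a minimal sketch; all names, the 1% sizing rule, and the capital figures are hypothetical, not a real implementation.

```python
# Sketch: one shared core function, two thin configurations.
# All names and numbers below are hypothetical illustrations.

def execute_signal(symbol, signal, capital):
    """Shared core logic used by both systems (stubbed: buy 1% of capital)."""
    size = capital * 0.01 if signal == "buy" else 0.0
    return {"symbol": symbol, "size": size}

# The Beta system runs with limited capital and permits manual adjustments;
# production runs at full scale with stricter controls.
PRODUCTION = {"capital": 1_000_000, "allow_manual_edits": False}
BETA = {"capital": 10_000, "allow_manual_edits": True}

order = execute_signal("AAPL", "buy", BETA["capital"])
print(order)  # {'symbol': 'AAPL', 'size': 100.0}
```

Because only the configuration differs, a bug fix or feature in the core function is automatically shared by both systems, which is where the setup-cost saving comes from.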
Benchmark
Algorithms are usually benchmarked against the risk-free rate or market indices. However, in this case, an alternative benchmark is necessary: the original version of the algorithm itself.
When upgrading features, it's common to test performance in a live environment. However, this alone isn't sufficient to determine the value of a new feature. Algorithmic traders should be careful not to adopt a new version solely based on positive results, as the original might yield even better outcomes.
To assess which version is superior, treat each day as a race and analyze the trade log produced by each version. The version that consistently performs better is the stronger candidate to become the production version.
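The day-as-a-race idea can be sketched as a simple tally over the two daily return series. The function name and the return figures below are illustrative assumptions, not real data.

```python
# Sketch: treat each day as a "race" between the two versions
# and count wins from their daily trade-log returns.

def daily_race(original_returns, upgraded_returns):
    """Count the days each version outperforms the other."""
    wins = {"original": 0, "upgraded": 0, "tie": 0}
    for orig, upg in zip(original_returns, upgraded_returns):
        if upg > orig:
            wins["upgraded"] += 1
        elif orig > upg:
            wins["original"] += 1
        else:
            wins["tie"] += 1
    return wins

# Hypothetical daily returns for five trading days.
original = [0.002, -0.001, 0.003, 0.000, 0.001]
upgraded = [0.003, -0.001, 0.004, 0.001, 0.000]
print(daily_race(original, upgraded))  # {'original': 1, 'upgraded': 3, 'tie': 1}
```

A consistent majority of wins for the upgraded version over a long enough window is the signal to look for; a narrow or unstable win rate is not.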
Comparison Criteria
The upgraded version will be deployed in the production system if it successfully meets all three criteria during the comparison phase.
Excess Alpha: The difference in alpha between the upgraded version and the original version is recommended as the first key criterion.
Excess Alpha = Alpha of Upgraded version – Alpha of Original version
When plotted over both historical and live-running data, the excess alpha is expected to be positive: a positive excess alpha indicates that the upgraded version outperforms the original. If the excess alpha remains positive and stable during live running, consistent with historical backtesting results, it is a robust indicator of a significant feature improvement.
Example plot of excess alpha
In the example above, a negative excess alpha indicates that the feature is not yet suitable for deployment in the production system. To pass this criterion, the accumulated excess alpha must remain positive and stable over a specific period of time or number of trades, depending on the algorithm's nature.
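The accumulated excess alpha described above can be computed as a running sum of the per-period alpha difference. This is a minimal sketch; the function name and the alpha series are illustrative assumptions.

```python
# Sketch: accumulated excess alpha = cumulative sum of
# (upgraded alpha - original alpha) per period.

def accumulated_excess_alpha(upgraded_alpha, original_alpha):
    """Cumulative excess alpha series, rounded to avoid float noise."""
    cumulative, total = [], 0.0
    for upg, orig in zip(upgraded_alpha, original_alpha):
        total += upg - orig
        cumulative.append(round(total, 6))
    return cumulative

# Hypothetical per-period alphas for each version.
upgraded = [0.004, 0.002, 0.005, 0.001]
original = [0.003, 0.002, 0.003, 0.002]
series = accumulated_excess_alpha(upgraded, original)
print(series)  # [0.001, 0.001, 0.003, 0.002]

# Passing criterion: the series stays positive over the chosen window.
print(all(x > 0 for x in series))  # True
```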
MDD of Excess Alpha: Maximum drawdown of excess alpha.
This criterion serves as a stop-loss threshold in case the performance of the upgraded version falls significantly below that of the original. To pass, the maximum drawdown (MDD) of the excess alpha observed during the live comparison phase should not exceed the threshold observed in backtesting.
Return: Regardless of how good the excess alpha and its maximum drawdown are, an upgraded version must still achieve a positive return relative to the risk-free rate or a market index. One possible case is an upgraded version that reduces risk but also lowers reward: the algorithm may not improve overall return, yet it limits risk exposure. Ensuring a positive return relative to the algorithm's benchmark is therefore the third criterion.
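The third criterion reduces to a simple benchmark comparison. The rate and return figures below are hypothetical assumptions for illustration:

```python
# Sketch: the return criterion, a plain comparison against the benchmark
# (risk-free rate or market index). Numbers are hypothetical.

def passes_return_criterion(upgraded_return, benchmark_return):
    """Third criterion: the upgraded version must beat its benchmark."""
    return upgraded_return > benchmark_return

RISK_FREE_RATE = 0.04            # assumed annualized risk-free rate
upgraded_annual_return = 0.07    # hypothetical live-run result

print(passes_return_criterion(upgraded_annual_return, RISK_FREE_RATE))  # True
```

An upgrade that beats the original (positive excess alpha) but loses to the benchmark would still fail this gate, which is exactly the risk-reduced, reward-reduced case described above.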
In conclusion, following this procedure allows algorithmic traders to confidently upgrade live-running algorithms. It's essential to define the time horizon, in number of trades or duration, before entering the comparison phase. If the upgraded version doesn't qualify but still offers value, a paper-trading comparison can help assess the feature's robustness. After paper trading, algorithmic traders can begin a new comparison phase.