Overfitting Definition
In statistics, overfitting is the result of an analysis that fits one particular data set too closely. The resulting model fails to perform well on other data sets or to predict future data. It occurs when noise (the residual, unexplained variation in the data) is absorbed into an unnecessarily complex model instead of being attributed to randomness. Because the complex pattern is shaped by random factors, it is unlikely to hold in the future.
In algorithmic trading, overfitting occurs when a fine-tuned algorithm uses models and parameter values that improve in-sample test performance only by chance. That chance outcome is unlikely to repeat in the future, which leads to incorrect predictions.
The following figure shows in-sample overfitting (the cubic curve) in a regression problem.
For example, in algorithmic trading, we can have many versions of the same algorithm. Each version performs better on the in-sample data than the previous one because of added rules, and each is more complex than its predecessors. The performance of these versions is shown in the table below.
The first two versions of the algorithm perform well, with positive returns on out-of-sample data and an acceptable gap relative to their in-sample performance. Version 3 adds new rules that increase in-sample performance but significantly reduce out-of-sample performance compared with version 2. In this case, we can conclude that version 3 is overfit and that version 1 is underfit, since it can be improved to perform better on both in-sample and out-of-sample data.
Techniques to Avoid Overfitting
Here are five techniques to avoid overfitting. Note that these techniques only help to eliminate basic overfitting, not the cases that cannot be detected this way. For the latter, we need to run the algorithm testing process after optimization.
Understand Why a Rule Is Right
When we find a rule that has produced profits in the past, it is important to explain why it produces such returns on the basis of financial theory or human behavior, rather than hoping that history will repeat itself. This explanation is what makes the rule valuable. By combining financial knowledge with market conditions, investors can properly interpret and assess the algorithm's performance when monitoring the trading system. As a result, algorithmic traders can recognize the moment when the underlying assumption no longer holds and promptly stop the system or change the algorithm before a disaster occurs.
Split the Data Into Training and Validation Sets
After splitting the data set, we use only the training set to optimize the algorithm’s performance. The validation set is used to find a version on the optimization path with a reasonable fit, that is, one with comparable performance on both data sets (Figure 12).
The idea is that the randomness in the training set differs from the randomness in the validation set, but the two sets share a common hidden pattern. Therefore, if a version captures too much of the randomness that exists only in the training set, it will perform poorly on the validation set, where that randomness does not occur. If a version performs well on both data sets, it’s likely the hidden pattern has been found, and the correct rules and parameters have been identified.
In Table 3, using out-of-sample data as a validation set, we can select version 2 as the reasonable fit.
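The selection step can be sketched in a few lines of Python. This is a minimal illustration, not a definitive procedure: the return figures are hypothetical (they are not the values from Table 3), and the names VersionResult, select_reasonable_fit, and max_gap are assumptions introduced only for this example. The sketch keeps the versions whose validation return is positive and whose training/validation gap stays within a chosen tolerance, then picks the one with the best validation return.

```python
from dataclasses import dataclass


@dataclass
class VersionResult:
    name: str
    train_return: float       # performance on the training (in-sample) set
    validation_return: float  # performance on the validation set


def select_reasonable_fit(results, max_gap):
    """Return the version with the best validation return among those whose
    training/validation gap does not exceed max_gap, or None if none qualify.

    max_gap is a hypothetical tolerance chosen by the analyst.
    """
    candidates = [
        r for r in results
        if r.validation_return > 0
        and r.train_return - r.validation_return <= max_gap
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.validation_return)


# Hypothetical numbers, for illustration only.
results = [
    VersionResult("version 1", train_return=0.08, validation_return=0.06),
    VersionResult("version 2", train_return=0.12, validation_return=0.09),
    VersionResult("version 3", train_return=0.20, validation_return=0.01),
]

best = select_reasonable_fit(results, max_gap=0.05)
print(best.name if best else "no version qualifies")
```

With these made-up figures, the third version has the highest training return but is rejected because its validation return collapses, which mirrors the reasoning used above to prefer version 2.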
Choose a Simple Preliminary Algorithm to Start With
When an expert proposes a preliminary algorithm to test on the target market, there is a set of versions that can be applied. These versions share the same basic algorithm but have different enhancement rules and parameter values. During optimization, these versions are tested to find the one most likely to perform well in the real market. If the preliminary algorithm is simple enough, these versions will have a lower chance of overfitting.
In Figure 11, the linear model y = 0.5x + 1.5 is simpler than the cubic polynomial y = x^3 + (x + 0.3)^2 - x + 1. In real life, hidden patterns are mostly simple.
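To make the contrast concrete, here is a small sketch in plain Python. It assumes the hidden pattern is the linear model above, y = 0.5x + 1.5, and uses four training points with fixed, made-up noise offsets. The cubic here is not the polynomial from the figure: it is the unique cubic that passes exactly through the four noisy training points (Lagrange interpolation), used as a stand-in for an overly complex model. All function names are illustrative.

```python
def true_pattern(x):
    # Assumed hidden pattern: the linear model from the text.
    return 0.5 * x + 1.5


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that passes exactly through all (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Training data: the true pattern plus fixed, made-up noise offsets.
train_x = [0.0, 1.0, 2.0, 3.0]
noise = [0.4, -0.5, 0.5, -0.4]
train_y = [true_pattern(x) + e for x, e in zip(train_x, noise)]

# Held-out data: noise-free points from the same pattern, for simplicity.
test_x = [0.5, 1.5, 2.5, 3.5]
test_y = [true_pattern(x) for x in test_x]

slope, intercept = fit_line(train_x, train_y)


def line(x):
    return slope * x + intercept


def cubic(x):
    return lagrange_predict(train_x, train_y, x)


line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
cubic_train, cubic_test = mse(cubic, train_x, train_y), mse(cubic, test_x, test_y)

print(f"line : train MSE {line_train:.4f}, held-out MSE {line_test:.4f}")
print(f"cubic: train MSE {cubic_train:.4f}, held-out MSE {cubic_test:.4f}")
```

The cubic reproduces the training points exactly, noise included, so its training error is zero, yet its error on the held-out points is much larger than that of the line. The simpler model is the one that tracks the hidden pattern.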
Drop the Preliminary Algorithm if There Are Too Few Versions for Profit
After evaluating different versions of the initial algorithm, if too few of them are profitable, the algorithm's hypothesis may not be valid for the target market. The few versions with positive performance may owe it to randomness or to overfitting the training data. We should drop the preliminary algorithm at this stage.
Prefer a Stable Optimal Point to a Sharp Optimal Point
Financial markets are always volatile, so the optimal parameter value may change in the future. If a sharp, unstable optimal point is selected, performance is likely to drop rapidly even if the optimal parameter value shifts only slightly.
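One simple way to apply this check is to compute the share of tested versions that ended with a positive return. The sketch below is only an illustration: the returns are hypothetical, and the 30% cutoff (min_fraction) is an arbitrary assumption, since no specific threshold is prescribed here. The function names are also made up for this example.

```python
def profitable_fraction(returns):
    """Share of tested versions with a strictly positive return."""
    if not returns:
        raise ValueError("at least one version is required")
    return sum(1 for r in returns if r > 0) / len(returns)


def should_drop(returns, min_fraction=0.3):
    """True if too few versions are profitable.

    min_fraction is a hypothetical cutoff; choose it to suit the strategy.
    """
    return profitable_fraction(returns) < min_fraction


# Hypothetical returns of ten versions of one preliminary algorithm.
version_returns = [-0.04, -0.01, 0.02, -0.03, -0.06,
                   -0.02, 0.01, -0.05, -0.02, -0.01]

print(profitable_fraction(version_returns))
print(should_drop(version_returns))
```

Here only two of the ten versions are profitable, so under the assumed cutoff the preliminary algorithm would be dropped rather than tuned further.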
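A rough way to tell the two kinds of optimum apart is to score each parameter value by the average performance of its neighborhood rather than by its own performance alone. The sketch below assumes a single parameter tested on an evenly spaced grid. The scores are hypothetical, and the names sharp_optimum, stable_optimum, and radius are introduced only for this example.

```python
def sharp_optimum(params, scores):
    """Parameter value with the single highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return params[best]


def stable_optimum(params, scores, radius=1):
    """Parameter value whose neighborhood has the highest average score.

    radius is the number of grid neighbors considered on each side.
    """
    def neighborhood_mean(i):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        return sum(window) / len(window)

    best = max(range(len(scores)), key=neighborhood_mean)
    return params[best]


# Hypothetical performance for parameter values 1 through 9.
params = [1, 2, 3, 4, 5, 6, 7, 8, 9]
scores = [0.02, 0.03, 0.15, 0.01, 0.07, 0.09, 0.10, 0.09, 0.07]

print("sharp optimum :", sharp_optimum(params, scores))
print("stable optimum:", stable_optimum(params, scores))
```

In this made-up grid, the value 3 has the highest score but is surrounded by poor results, while the value 7 sits on a plateau where nearby values perform almost as well. If the true optimum drifts by one step, the plateau loses little and the spike loses most of its performance.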