In this article we outline the steps for developing a trading strategy for the XLF ETF. The out-of-sample performance is analyzed on the Quantopian platform and is shown to significantly outperform the SPY benchmark.
The following steps are used to develop the strategy:
- Data partition
- Statistical analysis
- Workspace construction
- Results and Code generation
- Out-of-sample performance
1. Data partition
We first partition the XLF data into an in-sample and an out-of-sample. The in-sample covers inception (12/22/1998) to 12/31/2010. The out-of-sample covers the period from 01/03/2011 to 01/20/2017. We use the Data Partition tool in DLPAL to effect the data split for all 12 ETFs in directory C:/ETFDATA, as follows:
The in-sample files are saved in a subdirectory called Insample and the out-of-sample files are saved in the subdirectory Ossample. These subdirectories are generated automatically by DLPAL.
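As a rough illustration of what a date-based partition does, here is a minimal Python sketch. It is not DLPAL's implementation: the function name `partition_by_date` and the sample prices are hypothetical, and only the cutoff date of 12/31/2010 comes from the article.

```python
from datetime import date

def partition_by_date(rows, last_in_sample):
    """Split (date, close) rows at last_in_sample, which stays in the in-sample."""
    in_sample = [r for r in rows if r[0] <= last_in_sample]
    out_of_sample = [r for r in rows if r[0] > last_in_sample]
    return in_sample, out_of_sample

# Made-up closing prices, used only to show the split around the cutoff.
rows = [
    (date(2010, 12, 30), 15.90),
    (date(2010, 12, 31), 15.95),
    (date(2011, 1, 3), 16.28),
    (date(2011, 1, 4), 16.29),
]

in_sample, out_of_sample = partition_by_date(rows, date(2010, 12, 31))
print("in-sample bars:", len(in_sample))
print("out-of-sample bars:", len(out_of_sample))
```

Splitting strictly by date keeps every out-of-sample bar later than every in-sample bar, so no future data can leak into strategy development.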
2. Statistical analysis
Before developing a strategy for XLF or any other security it is important to perform basic statistical analysis in the in-sample to determine the characteristics of the price series. We can use the Price Series Statistics tool of DLPAL for this purpose.
It may be seen that the in-sample includes a wide variety of market conditions: sideways, uptrend, downtrend and then a shorter uptrend followed by more sideways activity. This is very important because we would like to avoid developing strategies on data that includes limited price action conditions, such as one long uptrend or just a downtrend.
Note that 95.37% of the daily returns fall between -4.58% and 4.62%. Therefore, a percent profit target and stop-loss of 4% appear to be good exit levels to use in DLPAL. We want our strategy to possess timing ability. If proper exits are not used, we risk over-fitting to the data or getting stopped out too often. Normally we should try to use profit target and stop-loss levels that are realistic.
3. Workspace construction
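The kind of check described above can be sketched in a few lines of Python. The article does not say how DLPAL derives its band, so this sketch assumes a band of the mean plus or minus two standard deviations, which is one common way to obtain a range that holds roughly 95% of observations. All function names and the sample closes are hypothetical.

```python
import statistics

def daily_returns(closes):
    """Percent change from each close to the next."""
    return [100.0 * (b / a - 1.0) for a, b in zip(closes, closes[1:])]

def two_sigma_band(returns):
    """Assumed band: mean +/- 2 population standard deviations."""
    mu = statistics.mean(returns)
    sigma = statistics.pstdev(returns)
    return mu - 2.0 * sigma, mu + 2.0 * sigma

def share_within(returns, lower, upper):
    """Percentage of returns that fall inside [lower, upper]."""
    inside = sum(1 for r in returns if lower <= r <= upper)
    return 100.0 * inside / len(returns)

# Made-up closes, used only to exercise the functions.
closes = [20.0, 20.4, 19.9, 20.1, 21.0, 20.2, 20.3, 19.6, 19.8, 20.0]
rets = daily_returns(closes)
low, high = two_sigma_band(rets)
print("band: %.2f%% to %.2f%%" % (low, high))
print("share inside band: %.2f%%" % share_within(rets, low, high))
```

If most daily moves fall inside the band, exits placed near its edges are reached by a sustained move rather than by a single ordinary day, which is the reasoning behind choosing exits close to 4% here.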
For strategy development we use a search workspace where we define the in-sample data, the exit levels, the performance constraints, the major feature cluster to use and the test sample, as shown below:
For this search we selected the in-sample XLF data and a T/S file that corresponds to 4% exits. We also specify a minimum win rate of 66% for each strategy reported, long or short, a minimum profit factor of 1.5 and at least 30 trades (> 29). We choose the Extended feature cluster and a test sample size of 500 bars. The trade input is the open of the next bar and the exits are of percent type. There are other options available for more advanced search operations that we do not consider in this example.
4. Results and code generation
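To make the three performance constraints concrete, here is a minimal Python sketch that checks a list of per-trade percent results against them. The thresholds (66% win rate, 1.5 profit factor, 30 trades) come from the article. The function name is hypothetical, and treating the constraints as a simple filter over a trade list is an assumption, not a description of DLPAL's search.

```python
import math

def passes_constraints(trade_pnls, min_win_rate=66.0,
                       min_profit_factor=1.5, min_trades=30):
    """Return True if the trade list meets all three minimums."""
    n = len(trade_pnls)
    if n < min_trades:
        return False
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]
    win_rate = 100.0 * len(wins) / n
    gross_profit = sum(wins)
    gross_loss = -sum(losses)
    # Profit factor is gross profit divided by gross loss.
    profit_factor = math.inf if gross_loss == 0 else gross_profit / gross_loss
    return win_rate >= min_win_rate and profit_factor >= min_profit_factor

# Idealized trades that exit exactly at the 4% target or the 4% stop.
print(passes_constraints([4.0] * 21 + [-4.0] * 9))   # 70% winners
print(passes_constraints([4.0] * 19 + [-4.0] * 11))  # about 63% winners
```

In this idealized case, where every trade exits exactly at +4% or -4%, a 66% win rate already implies a profit factor of about 1.94, so the win rate is the binding constraint. Real trades exit at other levels because of gaps and fills on the next open, so the two constraints are not redundant in practice.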
The results of the search are shown below:
DLPAL identified 33 distinct strategies, 11 long and 22 short. Note that by definition, the performance of the strategies in the in-sample is always good because they are selected to fulfill the performance objectives. We can use Test Strategies to verify that. In this case we specify “No multiple positions”.
It may be seen that the equity curve is nearly a straight line and performance is exceptional. However, this is not useful as far as determining whether the strategies have timing ability because they may be random. Note that random strategies can be the result of any combination of the following:
- Machine learning is inefficient
- Parameters were not chosen correctly
- Market is too efficient for strategy timing
- Data is insufficient or covers too few market conditions
- Other known and unknown factors
We can test any of the strategies individually in the out-of-sample, or elect to test a system that combines all strategies with the OR Boolean operator. In this case we generate code for the Quantopian platform and test the strategies in the out-of-sample. This is a much more realistic backtest because the Quantopian code generated by DLPAL applies the exits on 1-minute data and the platform accounts for partial fills using a proprietary liquidity algorithm.
The Quantopian code is generated quickly and pasted into a new algorithm. Below is what the code looks like, with the entry conditions truncated for the purpose of this example.
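Combining strategies with the OR operator means the system signals an entry on any bar where at least one strategy's conditions are met. The short Python sketch below illustrates only that idea. It is not the code DLPAL generates for Quantopian, and the function name and sample signals are hypothetical.

```python
def combine_with_or(signals_per_strategy):
    """Combine per-bar boolean signals: True where any strategy fires.

    signals_per_strategy holds one equal-length list of booleans per strategy.
    """
    return [any(bar) for bar in zip(*signals_per_strategy)]

# Three made-up strategies evaluated over five bars.
strategy_a = [True, False, False, False, False]
strategy_b = [False, False, True, False, False]
strategy_c = [False, False, True, False, True]

combined = combine_with_or([strategy_a, strategy_b, strategy_c])
print(combined)
```

Because the combined system trades whenever any single strategy fires, it trades more often than each component does alone, which is why a rule such as no multiple positions matters when testing it.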
5. Out-of-sample performance
The out-of-sample performance from 01/03/2011 to 01/20/2017 in the Quantopian platform is shown below:
Total return is 164.3% versus 102.5% for the SPY benchmark in the same period. Note that buy and hold XLF performance is 96.9%. Alpha is 0.19 and beta is -0.11. The Sharpe ratio is 1.02, which is a good result.
The T-statistic (not shown in the screenshot above) is about 2.44, which indicates statistically significant performance at the 5% level (p-value < 0.05). Maximum drawdown is 21.3%.
We can also use the Random Trading Simulation tool of DLPAL to test the significance of the strategy in the out-of-sample. We select long-only random positions to make the test stricter, since the buy and hold return is positive. We specify 164.3% for the test return and DLPAL performs the simulation, as shown below:
It may be seen that the minimum significant return for long-only random trading is 24.49%. Our test return of 164.3% exceeds that by a wide margin. The p-value based on this test is 0.
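The article does not state how the T-statistic was computed. A common approach, sketched below in Python under that assumption, is a one-sample t-statistic on periodic returns: the mean return divided by its standard error. A related rule of thumb multiplies the annualized Sharpe ratio by the square root of the number of years. Both function names are hypothetical.

```python
import math
import statistics

def t_statistic(returns):
    """One-sample t-statistic of the mean return against zero."""
    n = len(returns)
    mean = statistics.mean(returns)
    sd = statistics.stdev(returns)  # sample standard deviation
    return mean / (sd / math.sqrt(n))

def t_from_sharpe(annual_sharpe, years):
    """Rule-of-thumb t-statistic: annualized Sharpe times sqrt(years)."""
    return annual_sharpe * math.sqrt(years)

# Sharpe of 1.02 is from the article; 6.05 years approximates
# the span from 01/03/2011 to 01/20/2017.
print(round(t_from_sharpe(1.02, 6.05), 2))
```

With the reported Sharpe ratio of 1.02 over roughly 6.05 years, the rule of thumb gives about 2.5, in the same neighborhood as the reported 2.44. The difference may come from the risk-free rate, the return frequency or the exact day count, none of which the article specifies.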
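The article does not describe how the Random Trading Simulation tool works internally. The Python sketch below shows one generic way to run such a test and should be read as an assumption, not as DLPAL's method: on each bar a coin toss decides whether to be long or flat, the compounded return of each random run is recorded, and the test return is compared with that distribution. All names, the 50% exposure and the 95th percentile threshold are illustrative choices.

```python
import random

def random_long_only_returns(daily_returns, runs=1000, seed=42):
    """Total percent return of each run of coin-toss long-or-flat trading."""
    rng = random.Random(seed)  # fixed seed makes the simulation reproducible
    totals = []
    for _ in range(runs):
        equity = 1.0
        for r in daily_returns:
            if rng.random() < 0.5:  # long this bar, otherwise flat
                equity *= 1.0 + r
        totals.append(100.0 * (equity - 1.0))
    return totals

def percentile(values, pct):
    """Simple order-statistic percentile, used as a significance threshold."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(pct / 100.0 * len(ordered)))
    return ordered[idx]

def p_value(random_totals, test_return):
    """Share of random runs that did at least as well as the test return."""
    hits = sum(1 for t in random_totals if t >= test_return)
    return hits / len(random_totals)

# Made-up daily returns as fractions, used only to exercise the functions.
daily = [0.01, -0.005] * 100
totals = random_long_only_returns(daily, runs=200, seed=1)
print("threshold:", round(percentile(totals, 95), 2))
print("p-value for a 1000% return:", p_value(totals, 1000.0))
```

A p-value of 0 in a test like this means that none of the simulated random runs matched the test return. It is bounded below by one over the number of runs, so it indicates a very small probability but not an impossibility.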
The results of DLPAL are fully reproducible for the same data and parameters. This is a key aspect of DLPAL that allows scientific testing. Several programs claim to develop strategies, but their output is random: each run produces different results, which cannot be reproduced and therefore cannot be analyzed for significance. DLPAL has elevated strategy design and machine learning to the status of the scientific method by offering fully reproducible results.
If you have any questions or comments, I am happy to connect on Twitter: @priceactionlab