Feature history generation function

This function generates and updates files with historical values of features that can be used in the development of a variety of trading strategies and with machine learning algorithms. The function allows:

– Creating historical feature files for a single security, a group of securities and an ensemble of securities
– Updating historical feature files with new instances as new data becomes available for a universe of securities
– Creating train/test and score files for a universe of securities from the historical feature files
– Saving and loading default settings
– Maintaining feature and train/test files

Please read below before using this function:

1. This function should be used with data in the daily or higher timeframe.
2. All operations of creating historical data files, train/test and scoring files are performed in the directory specified in History parameters.
3. To generate history files for a group of securities, the Select all files box must be marked.
4. Always make sure that you have enough historical data for the securities considered before applying this function.
5. The History length (in bars) should be no larger than 50% of the history length of the security with the shortest history in the group.
6. Initial generation of history files may take a few hours or longer, depending on the number of securities in the group and the number of instances (bars) specified.
7. After the initial generation of history files, updating them and generating train/test files are faster operations.
8. Files with historical feature values have the following header: Date, Open, High, Low, Close, PLong, PShort, Pdelta, S. The files have the same name as the original data file and the extension .pih.
9. Train/test and scoring files have the following header: Date, Open, High, Low, Close, PLong, PShort, Pdelta, S. A target class label is added to train files. Train files have the extension .pit and scoring files the extension .pis.
10. The target is a binary class of 0 or 1 based on the sign of the return one period in the future. For EOD data, 1 means that the next day's close is higher than today's close, and 0 means the reverse.
11. If you would like to create history, train/test and scoring files for just one security, you can save only the historical data for that security in a directory and then use the Select all files option.
12. Ensemble history files have the following header: Date, avgPL, avgPS, Pratio, avgPLS, avgPSS. They are saved as pi.pie in the relevant directory.
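
The target definition in note 10 can be sketched in a few lines of Python. This is an illustration only, not the program's own code. The function name next_period_targets is hypothetical, and labelling an unchanged close as 0 is an assumption, since the text only defines 1 as a higher next close.

```python
def next_period_targets(closes):
    """Label each bar 1 if the next close is higher, else 0.

    The last bar has no next close, so it gets no label (None).
    Assumption: an unchanged close is labelled 0.
    """
    if not closes:
        return []
    labels = []
    for today, tomorrow in zip(closes, closes[1:]):
        labels.append(1 if tomorrow > today else 0)
    labels.append(None)  # no future bar to compare against
    return labels


# Made-up closing prices, for illustration only.
closes = [100.0, 101.5, 101.0, 101.0, 103.2]
print(next_period_targets(closes))  # [1, 0, 0, 1, None]
```

The most recent bar cannot be labelled until the next bar arrives, which is one plausible reason for keeping train and scoring files separate.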

For more information about the meaning of the parameters in the headers of the history files please read the section on Definitions and Interpretations.

Using the feature history generation function

From Tools select Create feature history. Below is the screen with the available options.

You may select just one security to create a history file or a group of securities. If you select only one security, then the option to create history for all files will be ignored.

The following must be specified on the workspace.

1. T/S file
2. Data file(s)
3. Trade parameters: Exits based on percentages (%) of entry price, points (pts) added to entry price or Next Close (NC). Trade Inputs are the Open or Close
4. Major Cluster Type: Normal, Aggressive or Moderate. Typical use is with the Normal cluster
5. Detrending option: If the option is marked the results will be adjusted for the presence of a trend based on a proprietary algorithm

To Select a T/S file

Select a T/S file from the appropriate directory. Click on a file to highlight it and then click the hand icon pointing down to move it into the selection field. Alternatively, you can double-click a file and it will be selected automatically. To change the T/S file, click the hand icon pointing up and repeat the selection process. Note: if next close (NC) is selected as the exit parameter, a dummy T/S file with just one pair of target/stop values must still be selected, but the program will not use these values.

To Select data file(s)

Select the directory where the data files are located and mark the box next to the Select all files option to select all files in the specified directory (default option). If this option is checked, the same T/S file, Trade Parameters, Major Cluster Type and Detrend option will apply to all data files. Note: To select a single file from a directory, you must first uncheck the box next to the Select all files option and then highlight the file to scan.

Trade Parameters

Specify the type of exits (percent or points) to be used with the profit target and stop-loss values in the T/S file, or next close (NC), by selecting % for percentages of entry price, pts for points added to entry price, or NC for next close exit. Specify the type of trade input by selecting either Open or Close. If Open is selected then the entry price will be the open of the next bar. If Close is specified, then the entry price will be the close.
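
As a rough illustration of how the three exit types relate to the entry price, the sketch below computes exit levels for a long trade. It is not the program's code: the function name long_exit_prices is hypothetical, and placing the stop-loss below the entry price for a long trade is an assumption.

```python
def long_exit_prices(entry, target, stop, exit_type):
    """Return (profit_target_price, stop_loss_price) for a long trade.

    exit_type: "%" for percentages of entry price, "pts" for points,
    "NC" for next-close exits, where target/stop values are not used.
    Assumption: for a long trade the stop is placed below the entry.
    """
    if exit_type == "%":
        return entry * (1 + target / 100), entry * (1 - stop / 100)
    if exit_type == "pts":
        return entry + target, entry - stop
    if exit_type == "NC":
        return None, None  # the trade is closed at the next close
    raise ValueError(f"unknown exit type: {exit_type!r}")


# Entry at 50.0 with a 2-point target and a 1-point stop.
print(long_exit_prices(50.0, 2.0, 1.0, "pts"))  # (52.0, 49.0)
```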

Major Cluster Type

The Aggressive cluster has 5 sub-clusters, Normal has 10 sub-clusters, Dynamics has 7 sub-clusters and Conservative has 15 sub-clusters. Note that the more sub-clusters that are involved in the extraction of features, the more conservative the results, because more market conditions are taken into account.

Detrend option

If the option is marked (default choice) the results will be adjusted for the presence of a trend based on a proprietary algorithm.

Significance option

The minimum significance S in the results can be set to 1. This is useful in the case of limited samples or slower timeframes, such as weekly. The default setting keeps the original values and is recommended. Note: if this option is used, it is not saved in the workspace and must be set manually each time the workspace is retrieved.

History options

History length (bars) specifies the number of instances in the output file(s).
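
The 50% guideline for History length given in the notes above can be checked before a run with a small helper. This is a sketch only: the function name max_history_length is hypothetical, and the bar counts in the example are made up.

```python
def max_history_length(bars_per_security):
    """Largest History length allowed by the 50% guideline.

    bars_per_security maps a symbol to the number of bars in its data file.
    The limit is half the bar count of the security with the shortest history.
    """
    if not bars_per_security:
        raise ValueError("no securities supplied")
    shortest = min(bars_per_security.values())
    return shortest // 2


# Made-up bar counts for three data files.
bars = {"AAA": 5000, "BBB": 4200, "CCC": 1201}
print(max_history_length(bars))  # 600
```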

Create history for all files instructs the program to generate files with features for all data files in the specified directory. If only one file is selected, then this option is ignored.

Create ensemble history instructs the program to create ensemble features taking into account all data files in the specified directory.

To create a new History parameters line, click the hand icon pointing downwards. To delete a line, click the line and press the DEL key, or use the hand icon pointing upwards to remove its contents.

The following additional options are available.

Save default settings

Click this option to save the History parameters settings in the computer registry. No parameters are saved other than those shown in History parameters.

Load default settings

Click this option to load the saved settings.

Delete history files

This option will delete all history files in the directory specified in History parameters only. Be extra careful when using this option.
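
Before using this option, it can help to check which generated files a directory contains. The sketch below groups files by the extensions mentioned on this page and does not modify anything. The function name list_generated_files and the grouping labels are hypothetical.

```python
from pathlib import Path

# Extensions documented on this page, mapped to descriptive labels.
GENERATED_SUFFIXES = {
    ".pih": "history",
    ".pit": "train/test",
    ".pis": "scoring",
    ".pie": "ensemble",
}


def list_generated_files(directory):
    """Group generated files in `directory` by kind, without changing them."""
    found = {kind: [] for kind in GENERATED_SUFFIXES.values()}
    for path in sorted(Path(directory).iterdir()):
        kind = GENERATED_SUFFIXES.get(path.suffix.lower())
        if kind and path.is_file():
            found[kind].append(path.name)
    return found
```

Calling list_generated_files on the directory set in History parameters returns a dictionary such as {"history": [...], "train/test": [...], "scoring": [...], "ensemble": [...]}, which can be reviewed before any files are deleted.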

Update history files

This option will update all history files (*.pih) in the directory specified in History parameters. Ensemble files (pi.pie) cannot be updated in this version and must be updated manually.

Any .pih files that correspond to text files in the input lines of a program workspace can be updated from the workspace using the “Update linked .pih files only” tool. In that case, this option is not required.

Delete train and score

This option will delete all train/test and scoring files in the relevant directory under Data File in History parameters.

Create train and score

This option will create two files for each history file in the directory specified in History parameters. The Select all files option must be checked. For a single security, save the historical data in a directory just for that security and then mark Select all files.

Click Run to create files with historical values for features.

Examples

Below is a screenshot of a file with features for AAPL.

The features are added for each instance after the Date, open, high, low, close values.
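
A features file with this layout can be read with a few lines of Python. This sketch assumes the file is comma-delimited text whose first row is the header listed earlier; the delimiter and date format are not specified here, so both are assumptions. The function name read_history is hypothetical and the sample values are made up.

```python
import csv
import io

HISTORY_COLUMNS = ["Date", "Open", "High", "Low", "Close",
                   "PLong", "PShort", "Pdelta", "S"]


def read_history(text):
    """Parse history-file text into a list of dicts, one per instance.

    Assumption: comma-delimited text whose first row is the header.
    Dates are kept as strings because the date format is not documented here.
    """
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    if reader.fieldnames != HISTORY_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for raw in reader:
        row = {"Date": raw["Date"]}
        for name in HISTORY_COLUMNS[1:]:
            row[name] = float(raw[name])
        rows.append(row)
    return rows


# Placeholder values, not real market data.
sample = (
    "Date, Open, High, Low, Close, PLong, PShort, Pdelta, S\n"
    "2020-01-02, 10.0, 11.0, 9.5, 10.5, 55.0, 45.0, 10.0, 2\n"
)
rows = read_history(sample)
print(rows[0]["Close"])  # 10.5
```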

Below is a screenshot of an ensemble file for all Dow 30 stocks.

Note: Ensemble files contain only the date and the values of the ensemble features. They cannot be updated automatically; any updating must be done manually by defining the proper number of bars.