The Indicator Optimizer, an option of the Indicator Package, is designed to do two things:
1. Find the best parameters for the indicator you choose to enhance your data. For example, if you choose lagged moving average, what is the best number of periods to average and what is the best number of periods to lag? The Optimizer will answer these questions based upon your particular data.
2. Find the best indicators to enhance your data. The Optimizer will try all of the indicators in a search group and a range of parameters for those indicators. You will get a list of the 10 best indicators with their optimal parameters for your particular data. Of course, you can choose to get more or fewer than 10 indicators if you desire.
The new indicators will be added to the variable list, with a number placed before the indicator name. This number gives an indication of a variable's predictive ability. A number approaching 1.0 indicates very good predictive ability, while a number approaching 0.0 indicates poor predictive ability.
Warning: This is a very complex module to use and you may have to carefully read the following documentation and "experiment" with it before you fully understand it. Please be familiar with the Indicator Package documentation and operation before reading this Optimizer option documentation. Call Technical Support at (301) 662-7950 if you need assistance.
The method used to determine the optimal indicators is related to correlation analysis. The number placed before the indicator name can be roughly thought of as the degree of correlation between an indicator and the variable you wish to predict.
In correlation analysis:
1. You must assume the type of equation you wish to fit to the data, for example y = Ax + B (linear), y = Ax^2 + Bx + C (quadratic), or y = log x (logarithmic),
2. Use regression analysis to determine the coefficients for the chosen equation, and then
3. Calculate how well the original data corresponds to the predicted value using this equation.
The advantage of the method used by the Indicator Optimizer Package is that it requires no assumption about the type of equation but still uses the mathematical theory behind correlation analysis as a basis for determining the best indicators.
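The three steps of classical correlation analysis can be sketched in a few lines of Python. This is only an illustration of the conventional approach with an assumed linear equation; it is not the Optimizer's own method, which is not specified here. The function names and the sample data are hypothetical.

```python
from math import sqrt

def fit_line(xs, ys):
    """Step 2: least-squares coefficients (A, B) for the assumed form y = A*x + B."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def correlation(actual, predicted):
    """Step 3: Pearson correlation between the original and the predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)

# Made-up sample data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)                 # Step 1 was the choice of a linear equation
predicted = [a * x + b for x in xs]
r = correlation(ys, predicted)          # close to 1.0 means a good fit
```

A value of r near 1.0 plays the same role as the number placed before an indicator name: it summarizes how closely one series follows another.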
The Indicator Optimizer uses the same Menus as the Indicator package, except that there is one additional menu item, Optimal, which has submenus.
Specify Optimal Search
You must call this menu selection to tell the optimizer package the variable being predicted, how to conduct its searches for the best parameters and the best indicators, and the number of best indicators to display. Use it before trying to use either the Add optimal parameter(s) or Add optimal indicator(s) menu selections.
The Variable Being Predicted scroll box displays the variable which you choose to predict. Use the mouse or the up and down arrow keys to choose the variable (which must be one of your outputs). Subsequent optimal indicator searches will have the goal of finding indicators most able to predict the selected variable.
It is important to note that the nature of the variable you choose to predict will directly affect the nature of the indicators found to be optimal. Take for example the difference between predicting tomorrow's price and predicting the change in price between today and tomorrow.
When predicting tomorrow's price, indicators which convey a price level (such as moving average of price and lag of price) will "correlate" with tomorrow's price much better than indicators which convey a change or trend in price level (such as change in price and linear regression of price). When predicting the change in price, indicators which convey a change or trend in price level will "correlate" much better than indicators which convey a price level.
To clarify the above example, suppose you were told to try to predict tomorrow's change in price but were only given today's price, which is 255.34. You probably wouldn't be able to predict the correct price change for tomorrow, say +0.9, with such limited information. But if you were told to predict tomorrow's change in price and were given the last observed change in price, say +0.5, you would probably come closer to picking the correct change (+0.9).
The advantage of using a neural network is that it has many inputs upon which to base its prediction. The more these variables correlate to the output(s), the better the network will perform.
The Indicator Parameter Specification box allows you to specify the parameter search space. Only the parameters specified in this section will be used in trying to determine optimal indicators. In order to keep processing time down, you should try to limit the search space if you can by limiting the number of parameter values that must be tried for each indicator. For example, if predicting tomorrow's price, it may not be of value for indicators such as lag to be lagged more than a few days. The price yesterday probably has more influence over tomorrow's price than the price 60 days ago.
The five parameter specification entry areas correspond to the five different types of parameters used by the different indicators. The numbers you enter in these edit boxes will be the parameters that are tried by the Optimizer each time it tries to find the best parameters for an indicator. The defaults are a good representative set, but you can change these values if you want to speed up processing. If your pattern file is large, you may need to speed up processing.
Time Periods are used by the majority of indicators (simple moving average, lag, change, etc.). For indicators which average over or look at all of the points in the time period (simple moving average, Bollinger Bands, etc.), the larger the time period, the more points considered. Obviously the difference between an indicator value taken over 2 time periods and one taken over 3 time periods could be significant. However, the difference between an indicator value taken over 50 time periods and one taken over 51 time periods is probably much smaller. For this reason, it is important to specify a search which covers the smaller time periods fairly completely (e.g., 2, 3, 4, 6, and 8). Cover larger periods more sparsely (e.g., 20, 30, 50, 75, 100).
For indicators which only use the point n time periods ago (lag, change, etc.), the difference between two consecutive time periods (2, 3 or 52, 53) depends upon the volatility of the variable and not how far back we are looking. For a fairly smooth price, the difference will not be that great, whereas for a very volatile trading volume variable, the difference will probably be much greater.
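The distinction between the two kinds of time-period indicators can be sketched as follows. This is a minimal illustration, not the package's implementation: the function names are hypothetical, and marking rows without enough history as None is an assumption.

```python
def simple_moving_average(values, periods):
    """Looks at ALL points in the window of the last `periods` values."""
    result = []
    for i in range(len(values)):
        if i + 1 < periods:
            result.append(None)          # not enough history yet (assumed handling)
        else:
            window = values[i + 1 - periods:i + 1]
            result.append(sum(window) / periods)
    return result

def lag(values, periods):
    """Uses ONLY the single point `periods` rows back."""
    return [values[i - periods] if i >= periods else None
            for i in range(len(values))]

prices = [10, 11, 13, 12, 15]            # made-up sample data
sma3 = simple_moving_average(prices, 3)
lag2 = lag(prices, 2)
```

Because the moving average blends every point in its window, adding one more period to a long window changes the result very little, whereas the lag simply picks a different single point.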
Standard Deviation Multiplier is used by most indicators which involve a standard deviation. As used by the Bollinger Band indicators, the standard deviation multiplier is the value that the standard deviation is multiplied by in order to create different size bands around the price moving average. The multiplier should range up to 3. Adjustment of the standard deviation multiplier controls the relative width of the band and thus determines the strength of the price movement needed for a price to exceed or be less than the values in the band (movement above or below the band). A higher standard deviation multiplier increases the band width (stronger price movement needed for a breakout), while a lower standard deviation multiplier decreases the band width (weaker price movement needed for a breakout). In statistics, if data is normally distributed, then about 68% of the data will occur within one standard deviation of the mean, 95.5% within two standard deviations of the mean, and 99.7% within three standard deviations of the mean. Because prices may or may not be normally distributed, this information should only be used as an estimate for setting the standard deviation multiplier.
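The effect of the multiplier on band width can be sketched as below. This assumes the common definition of Bollinger Bands (moving average plus or minus a multiple of the standard deviation) and uses the population standard deviation; the package may use a different variant. The function name and sample prices are hypothetical.

```python
from statistics import mean, pstdev

def bollinger_bands(prices, periods, multiplier):
    """Return (lower, middle, upper) computed over the most recent `periods` prices."""
    window = prices[-periods:]
    middle = mean(window)
    half_width = multiplier * pstdev(window)   # assumption: population std deviation
    return middle - half_width, middle, middle + half_width

prices = [2, 4, 4, 4, 5, 5, 7, 9]              # made-up data with mean 5, std dev 2
narrow = bollinger_bands(prices, 8, 1)         # multiplier 1: band from 3 to 7
wide = bollinger_bands(prices, 8, 2)           # multiplier 2: band from 1 to 9
```

Doubling the multiplier doubles the distance between the bands, so a stronger price movement is needed before the price falls outside them.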
Exponential Moving Average Factor is used by all indicators based upon one or more exponential moving average calculations (Exp Mov Avg, RSI, etc.). The factor creates a moving average as follows:
New Average = Factor * Current Value + (1 - Factor) * previous average
The higher the factor, the more the recent values contribute. The lower the factor, the more the older values contribute. If you are unsure of what factor to use, a good guideline is to set the factor to 2/(n + 1) where n is the number of periods back you want the values to contribute.
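The update rule and the 2/(n + 1) guideline can be sketched as follows. Seeding the average with the first value is an assumption; the package's start-up behaviour is not described here, and the function names are hypothetical.

```python
def exponential_moving_average(values, factor):
    """Apply: new average = factor * current value + (1 - factor) * previous average."""
    average = values[0]                  # assumption: seed with the first value
    result = [average]
    for value in values[1:]:
        average = factor * value + (1 - factor) * average
        result.append(average)
    return result

def factor_for_periods(n):
    """Guideline from the text: factor = 2 / (n + 1)."""
    return 2 / (n + 1)

ema = exponential_moving_average([10, 20, 30], 0.5)   # [10, 15.0, 22.5]
nine_period_factor = factor_for_periods(9)            # 0.2
```

With a factor of 0.5 the average moves halfway toward each new value; a smaller factor, such as the 0.2 suggested for 9 periods, responds more slowly.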
Note that the values specified for this data entry box must be less than 1. If you specify only a range such as 0.025: .02, without also specifying an increment smaller than 1 such as .2, you will receive the following error message: "Invalid parameter specification. (Must specify a loop increment)." This happens because the default setting for all edit boxes is to increment by 1. The correct values would be the following: 0.025: .02 :.002.
Standard Deviation Time Periods are used by all indicators which calculate a standard deviation (Standard Deviation, Bollinger Bands, Commodity Channel Index), and by all Daily Change indicators. Since standard deviation calculations look at all of the points in a time period, the larger the time period, the more points considered. Obviously the difference between a standard deviation taken over 2 time periods and one taken over 3 time periods could be significant. However, the difference between a standard deviation taken over 50 time periods and one taken over 51 time periods is probably much smaller. For this reason, it is important to specify a search which covers the smaller time periods fairly completely (e.g., 2, 3, 4, 6, and 8). Cover larger periods more sparsely (e.g., 20, 30, 50, 75, 100).
Linear Regression Time Periods are used by all indicators which are based upon linear regression. Since linear regression calculations look at all of the points in a time period, the larger the time period, the more points considered. Obviously the difference between a linear regression based calculation taken over 2 time periods and one taken over 3 time periods could be significant. However, the difference between a linear regression based calculation taken over 50 time periods and one taken over 51 time periods is probably much smaller. For this reason, it is important to specify a search which covers the smaller time periods fairly completely (e.g., 2, 3, 4, 6, and 8). Cover larger periods more sparsely (e.g., 20, 30, 50, 75, 100).
Note: If any of these edit boxes are left empty, then any indicators based upon that parameter type will not be a part of the optimal indicator search.
Parameters may be entered in the same notation as the standard indicator package. An entry of 1:5, 15 would thus be translated to the parameters 1, 2, 3, 4, 5, and 15. An entry of 1:5:2, 15 would be translated to the parameters 1, 3, 5, 15, with 2 as the incremental number within the range of 1 to 5.
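The notation can be illustrated with a small parser. This is a sketch of how such entries might be expanded, not the package's actual parsing code; the function name is hypothetical, and it does not reproduce the program's error checking.

```python
def expand_parameters(spec):
    """Expand entries such as '1:5, 15' or '1:5:2, 15' into a list of values."""
    values = []
    for item in spec.split(","):
        parts = [float(p) for p in item.split(":")]
        if len(parts) == 1:
            values.append(parts[0])      # a single value such as 15
            continue
        start, stop = parts[0], parts[1]
        step = parts[2] if len(parts) == 3 else 1.0   # default increment is 1
        # Small tolerance guards against floating-point error with fractional steps.
        count = int((stop - start) / step + 1e-9) + 1
        values.extend(start + i * step for i in range(count))
    return values

simple_range = expand_parameters("1:5, 15")      # 1, 2, 3, 4, 5, 15
stepped_range = expand_parameters("1:5:2, 15")   # 1, 3, 5, 15
```

The values are returned as floating-point numbers so that the same routine can also handle fractional entries with an explicit increment.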
The Options Box contains options which control how many indicators are added for each optimal indicator search. The edit box on the left hand side sets the upper bound on the number of optimal indicators added to the variable list.
The Multiple Additions per Indicator check box controls whether or not the same indicator can be placed in the optimal list several times with different parameters. If checked, it may allow lags of 1 day, 2 days, 3 days, and 4 days to be optimal indicators. If not checked only one of these lags, the best one, will appear as an optimal indicator. This may allow other indicator types to appear in the list in case all of the lags have higher correlations. Usually, you will not want to check this box under the assumption that you would not want a predominance of one kind of indicator in the list that you apply to the file.
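The effect of the check box can be sketched as a selection rule over ranked candidates. The data structure, function name, and scores below are hypothetical and serve only to illustrate the behaviour described above.

```python
def select_best(candidates, limit, multiple_per_indicator):
    """Pick up to `limit` candidates, each a (score, indicator_name, parameters) tuple."""
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    chosen = []
    seen_names = set()
    for score, name, params in ranked:
        if len(chosen) >= limit:
            break
        if not multiple_per_indicator and name in seen_names:
            continue                     # keep only the best entry per indicator
        chosen.append((score, name, params))
        seen_names.add(name)
    return chosen

# Made-up scores for illustration only.
candidates = [
    (0.91, "Lag", (1,)),
    (0.89, "Lag", (2,)),
    (0.88, "Lag", (3,)),
    (0.80, "Simple Mov Avg", (5,)),
    (0.75, "Change", (1,)),
]
checked = select_best(candidates, 3, multiple_per_indicator=True)
unchecked = select_best(candidates, 3, multiple_per_indicator=False)
```

With the box checked, all three slots go to lags; with it unchecked, only the best lag is kept and the remaining slots go to other indicator types.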
The Options for Add Most Predictive Indicator box controls the depth of the optimal indicator search. When talking about depth of search, there are two important terms: indicator application depth and parameter depth. Searches are controlled by one or the other of these depth concepts.
Indicator application depth is simply the number of indicators used in the calculation. The change in a variable is an example of a calculation with an indicator application depth of 1. The standard deviation of the change in a variable is an example of an indicator application depth of 2. The moving average of the standard deviation of the change in a variable is an example of an indicator application depth of 3. In other words, do you want only indicators applied to variables, or do you want indicators applied to other indicators, or do you want indicators applied to indicators applied to indicators? These choices correspond to 1 level, 2 levels, and 3 levels on the buttons adjacent to Search by Indicator Application Depth. The deeper the level you choose, the more time a search for optimal indicators will take.
To control the depth of a search by application depth, you will select one of the 3 levels. Then for each level to the maximum level selected, you must also decide the type of indicators that will be tried, based on the number of parameters an indicator has. At each application level, 3 check boxes will appear allowing your search to use 1 parameter indicators (such as lag), 2 parameter indicators (such as lagged moving average), or 3 parameter indicators (such as MACD). Obviously, the more types you check on at every level, the more time your optimal searches will take.
Parameter depth is simply the total number of parameters necessary for making the calculation. You can control depth this way instead of by using application depth. Simple indicators such as lag and simple moving average have only one parameter, and thus a parameter depth of 1. More complicated indicators such as Bollinger bands have two parameters and thus have a parameter depth of 2. Thus the simple moving average of a Bollinger band of a variable would have a parameter depth of 3 (the sum of 1 parameter for the simple moving average and 2 parameters for the Bollinger band). In order to control the depth by parameters, click on one of the buttons labelled 1 Deep, 2 Deep, etc.
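The two depth measures can be compared on the same calculation. In this sketch a calculation is modelled as a chain of indicators, each described by a hypothetical (name, parameter count) pair; the parameter counts are the ones given in the text.

```python
# Hypothetical descriptions: (indicator name, number of parameters).
LAG = ("Lag", 1)
SIMPLE_MOVING_AVERAGE = ("Simple moving average", 1)
BOLLINGER_BAND = ("Bollinger band", 2)

def application_depth(chain):
    """Number of indicators applied one on top of another."""
    return len(chain)

def parameter_depth(chain):
    """Total number of parameters needed by all indicators in the chain."""
    return sum(count for _name, count in chain)

# Simple moving average of a Bollinger band of a variable.
sma_of_bollinger = [SIMPLE_MOVING_AVERAGE, BOLLINGER_BAND]
```

The same chain has an application depth of 2 but a parameter depth of 3, which is why the two search modes can cover different sets of calculations.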
It is very important to note that the time required for the optimal parameter search directly relates to the maximum parameter depth. As an example suppose that you had entered 10 numbers in each of the parameter specification text boxes. For every indicator having only one parameter, 10 different parameter combinations are tried in the optimal search. For every indicator having two parameters, 100 different parameter combinations are tried in the optimal search. For every indicator having three parameters, 1000 different parameter combinations are tried. In general, if n is the number of possible parameters, the indicator search space is of the order n to the parameter depth power.
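The growth of the search space can be checked by enumerating the combinations directly. As a simplifying assumption, this sketch uses the same list of 10 candidate values for every parameter, as in the example above; the function name is hypothetical.

```python
from itertools import product

def parameter_combinations(values, parameter_depth):
    """All combinations tried for an indicator with `parameter_depth` parameters."""
    return list(product(values, repeat=parameter_depth))

values = list(range(1, 11))              # 10 candidate values per parameter
sizes = [len(parameter_combinations(values, depth)) for depth in (1, 2, 3)]
# sizes is [10, 100, 1000], i.e. n ** depth with n = 10
```

Halving the number of candidate values per parameter therefore cuts a 3-parameter search by a factor of 8, which is why trimming the parameter lists is the most effective way to shorten a search.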
The optimal indicator search may be based upon either the total indicator application depth or the total parameter depth. When basing the optimal search upon indicator application depth, the check boxes may be used to control the parameter depth of the indicators at each level.
Note that an optimal search with an indicator application depth of 3 will also perform calculations with indicator application depths of 1 and 2. The same is true of an optimal search based upon parameter depth. An optimal search with a parameter depth of 5 will also perform all calculations having a parameter depth of less than 5.
Once you have specified the optimal search information, you are ready to either:
1) Find the best parameters for an indicator, or
2) Find the best indicators, each with its optimal parameters.
Note that these optimal indicators and parameters are only for the particular problem data in the pattern file! Another set of data may produce different optimal indicators or parameters.
If you want to simply find the optimal parameters for an indicator you select, you would select the variable and then select the indicator as usual. Then select the following menu item:
Add optimal parameter(s)
Adds the optimal parameter or parameters for the selected indicator when applied to the selected variable. After selecting this menu option, the optimizer will begin its search of possible parameters based on how you specified your optimal search (see previous discussion). When the search is complete, the selected indicator will be added to the bottom of the variable list with the optimal parameter(s). You do not have to select the button "Add Indicator to End of Variables". The (pseudo) correlation factor will be on the front of the indicator to let you know how valuable the indicator is.
Interrupt add optimal parameter(s)
Interrupts the addition of optimal parameters.
If you decide you want the Optimizer to find the optimal indicator for you with parameters, you will select the following menu item:
Add optimal indicator(s)
Adds the optimal indicators for the selected variable to the file. When you select this menu item, the Optimizer will start a long search based on how you set the boxes in Specify Optimal Search.
During the process a percentage completion bar will appear. This bar will tell you the estimated time remaining in the search because some searches may take years on some machines! However, as the search proceeds, the estimated time may go up or down or even be erratic as the program gains more experience with the data. Therefore, do not trust the first time estimate you see.
When the search is complete, all of the optimal indicators will have been added to the end of the variable list with their (pseudo) correlation factors. You don't have to click the "Add Indicator to the End of Variables" button. You can view the results and delete indicators with low correlation so as not to put too many variables in the file.
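If the variable names are exported to a list, the leading (pseudo) correlation factor can be used to screen them. The sketch below assumes names of the form "0.8196 Wilder RSI(6) of NYSE close", with the factor first and separated by a space. The function names, the second sample name, and the threshold are hypothetical.

```python
def split_score(variable_name):
    """Separate the leading correlation factor from the indicator description."""
    score_text, _, description = variable_name.partition(" ")
    return float(score_text), description

def keep_strong(variable_names, threshold):
    """Keep only the names whose leading factor is at least `threshold`."""
    return [name for name in variable_names
            if split_score(name)[0] >= threshold]

names = [
    "0.8196 Wilder RSI(6) of NYSE close",
    "0.1200 Lag(60) of NYSE close",      # made-up low-scoring example
]
strong = keep_strong(names, 0.5)         # arbitrary illustrative threshold
```

The threshold is a judgment call; the text only suggests removing indicators with low correlation so that the file does not hold too many variables.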
Interrupt add optimal indicator(s)
Interrupts the addition of optimal indicators.
Resume add optimal indicator(s)
Resumes the addition of optimal indicators from the point at which it was interrupted. Note that even when the optimal indicator search is interrupted and the program is exited or another problem loaded, the search can be resumed when the program is rerun or the problem is reloaded.
Multiple Variable Mode
Selecting this mode allows you to add Optimal Indicators to the single variable that is highlighted in the Variable Selection box and to all of the subsequent variables in the list (except for lead variables). When you have selected this mode and then do an Optimal Indicator Search, indicators will be added for many variables, not just one. This can be a time saver for you. The names of the variables added will contain labels to show you which variables they apply to, such as 0.8196 Wilder RSI(6) of NYSE close.
Multiple Lead Mode
Selecting this mode has the effect of running multiple searches, each with a different variable being predicted (see Specify Optimal Search). In other words, it saves you from constantly changing the variable being predicted if there are several outputs or potential outputs in the pattern file. The only catch is that it will identify these outputs only if they are "leads" which you have previously placed into the pattern file with the Indicator package. In fact you must have previously "Applied Indicators to Pattern File" after these leads were selected, so the lead columns will contain values. Remember that one way to produce outputs, e.g., tomorrow's close, is to lead the close variable. You may want to lead the close 1 day, 3 days, 1 week, and 10 days; then use the Multiple Lead Mode to see which of these outputs are best correlated with the inputs you have. This may change your outlook on what to predict. (Remember that predictive nets work best when there is only one output.) The names of the variables added will indicate the output (lead) to which they apply.
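A lead is the mirror image of a lag: each row receives the value from a later row. The sketch below shows how several lead columns might be built from one close series; the function name, the sample data, and the use of None for rows with no future value are assumptions.

```python
def lead(values, periods):
    """Shift values so that each row holds the value `periods` rows ahead."""
    return [values[i + periods] if i + periods < len(values) else None
            for i in range(len(values))]

close = [100, 101, 103, 102, 105]        # made-up closing prices

# One candidate output column per horizon, as Multiple Lead Mode would examine.
outputs = {days: lead(close, days) for days in (1, 3)}
```

Each column in outputs is a candidate variable to predict; the longer the lead, the more rows at the end of the file have no value available.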
The Multiple Variable Mode and Multiple Lead Mode may be used independently of one another or in combination. In other words, in combination, they can find the best indicators for a number of variables as applied to a number of outputs.