Walk Forward Analysis Using The Acceleration System On E-Mini 1min Bars
The Dangers of Data Mining and Curve Fitting Real Prices
Optimizing the input parameters of a technical analysis system so that they produce the best
performance results on past data has become especially easy using today's fast computers and
software. Optimization here is defined as the combinatorial search, over a range of system input values
on price data defined over a fixed number of bars, for the cases that produce the best system net profits or
some other selected performance variable. This type of optimization is often dubbed curve
fitting. Unfortunately, the input parameters that produce the best results on past data don't necessarily
produce good results in real time. Why is that? Optimization is performed on test data consisting of
some set of past price data over some fixed number of bars. The price data consists of some
deterministic signal or pattern (if any) plus spurious random price movements, which we refer to here as
noise. The computer optimization of a system's input parameters on the test data produces performance
results that fit both the signal and the noise. It is certain that the noise process will not be repeated in the
same way in the out-of-sample data or in real time (out-of-sample data are prices that the system input
parameters have not been optimized on). After an optimization run on the test data, it is difficult to
determine whether the input variables chosen from the optimization run have curve fit the noise, which
will not be repeated, or have successfully modeled the price signal. The out-of-sample profits
from choosing the input parameters that generate the best performance results in the optimization fit
on the test data will usually disappear if anomalous price movements or noise are a large part of the price
data.
To illustrate the points above we will examine two different systems on two different sets of data and
compare the test performance results with the out-of-sample performance results. For the first system we
will apply the Noise Channel Breakout System to IBM 5min bars. For the second system we will apply
the Velocity System to E-Mini 1min bars.
Buy Rule:
If price crosses above the highest high price of the last nhi bars by an amount greater than or equal to f,
then buy at the market. In addition, when short, and when calculating the highest high price (hhp), that
hhp cannot be higher than the previously calculated hhp as the nhi lookback period moves forward and
previous highs are dropped out of the lookback window. In other words, when short the hhp can only stay
the same or go lower; it cannot go higher.
Sell Rule:
If price crosses below the lowest low price of the last nlo bars by an amount greater than or equal to f,
then sell at the market. In addition, when long, and when calculating the lowest low price (llp), that llp
cannot be lower than the previously calculated llp as the nlo lookback period moves forward and previous
lows are dropped out of the lookback window. In other words, when long the llp can only stay the same or
go higher; it cannot go lower.
Exit Rule:
Close any position five minutes before the New York Stock Exchange close (no trades are carried
overnight).
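
As an illustration only (not the author's TradeStation code), here is a minimal Python sketch of these
channel rules, under the simplifying assumption that signals are checked on bar closes; the names nhi,
nlo and f come from the rules above, and the end-of-day exit rule is omitted:

    # Minimal sketch of the Noise Channel Breakout rules, assuming signals
    # are evaluated on bar closes. highs, lows, closes are lists of floats
    # for the 5min bars; nhi, nlo and f are the inputs named above.
    def channel_signals(highs, lows, closes, nhi, nlo, f):
        position = 0            # +1 long, -1 short, 0 flat
        hhp = llp = None        # ratcheted channel boundaries
        signals = []
        for i in range(max(nhi, nlo), len(closes)):
            raw_hhp = max(highs[i - nhi:i])  # highest high of last nhi bars
            raw_llp = min(lows[i - nlo:i])   # lowest low of last nlo bars
            # When short, hhp can only stay the same or go lower.
            hhp = raw_hhp if (position != -1 or hhp is None) else min(hhp, raw_hhp)
            # When long, llp can only stay the same or go higher.
            llp = raw_llp if (position != 1 or llp is None) else max(llp, raw_llp)
            if position != 1 and closes[i] >= hhp + f:
                position = 1
                signals.append(("buy", i))
            elif position != -1 and closes[i] <= llp - f:
                position = -1
                signals.append(("sell", i))
        # The exit rule (close any position five minutes before the NYSE
        # close) is omitted here for brevity.
        return signals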
Buy Rule:
IF Velocity is greater than the threshold amount vup then buy at the market.
Sell Rule:
IF Velocity is less than the threshold amount -vdn then sell at the market.
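
The text does not spell out here how Velocity is computed; in the author's related least squares papers it
comes from a polynomial fit, so the sketch below makes the assumption that velocity is the slope of a
least squares straight line over the last N closes (N, vup and vdn are the system inputs):

    import numpy as np

    # Hedged sketch of the Velocity rules; the velocity definition (least
    # squares line slope over the last N closes) is an assumption.
    def velocity_signal(closes, N, vup, vdn):
        t = np.arange(N)
        slope = np.polyfit(t, closes[-N:], 1)[0]  # slope = velocity estimate
        if slope > vup:
            return "buy"    # Buy Rule
        if slope < -vdn:
            return "sell"   # Sell Rule
        return None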
Results
Table 1 presents the best twenty cases ranked by Net Profit (NP). The statistics are based
upon trading 1,000 shares of IBM. Slippage and commissions are ignored. Next to the NP column, the
out-of-sample net profit (OSNP) column was inserted. Recall that the net profit numbers are calculated for
the period of May 16, 2001 through June 15, 2001. The out-of-sample net profit (OSNP) is separately
calculated for each case from the period outside this test period, namely June 18, 2001 through June 22,
2001. Using Excel's cut and paste, the out-of-sample net profit column was inserted next to the test
section's Net Profit column for easy comparison.
Sadly, the promise of future profitability suggested by these large test section net profits is just an
illusion. As we can observe from Table 1, the Net Profit performance variable in this example is not a
good criterion for choosing input parameters that will perform well in the future. The out-of-sample profits
are mostly negative. The few cases of positive out-of-sample net profits seem to be distributed at random,
without defining features with respect to the other statistics.
Table 2 presents the best 20 cases sorted by out-of-sample net profit (OSNP). The case
numbers from the Net Profit ranking were retained in column 1 for comparison. As one can see from this
table, the top 20 best net profits didn't come close to producing any of the top 20 out-of-sample net
profits. As a matter of fact, not even the top 10% (202 cases) of the net profits produced any of the top 20
out-of-sample net profits.
Figure 2 presents a 3D histogram of Net Profits vs Out-of-Sample Net Profits. The first thing one
observes is that there are many more out-of-sample losses than there are gains. The actual count is 278
gains and 1747 losses out of 2025 cases. This indicates that for this time period the noise channel was
unable to perform well. Also, as can be seen from the histogram, the number of out-of-sample gains
declines as the test data Net Profits increase. This means that for this system and this time period, the
test data Net Profit is not a good performance criterion for selecting input parameters that will produce
gains in the out-of-sample period. This histogram shows the essence of the curve fitting of random data.
The noise portion of the data did not repeat in the out-of-sample data, resulting in losses from the
input parameters that fit the noise portion of the test data and produced large net profits.
The Profit Factor is often suggested as a good performance statistic for selecting input parameters that
will give good out-of-sample gains. Figure 3 presents a 3D histogram of the Profit Factor vs Out-of-
Sample Net Profits. As can be seen from the histogram, the number of out-of-sample gains declines
as the Profit Factor increases. This is just the opposite of what is supposed to happen: larger Profit
Factors are supposed to predict larger out-of-sample gains. This means that for this system the Profit
Factor is also not a good performance criterion for selecting parameters that will produce gains in the out-
of-sample period. Of particular interest are profit factors above 2. These profit factors produce the
smallest number of gains of all. This indicates that higher values of the Profit Factor represent the curve
fitting of the noise portion of the test data and thus will be useless in producing good out-of-sample
performance.
Results
Figure 4 presents a 3D histogram of the Net Profits vs Out-of-Sample Net Profits for the Velocity System
applied to E-Mini 1min bars. Once again we can see that large test sample Net Profits generate out-of-
sample losses. This illustrates the dangers of data mining using the test sample Net Profit. The system
has fit the spurious price movements quite well, creating big profits in the test section and mostly losses in
the out-of-sample section. The number of out-of-sample gains declines as the test data Net Profits
increase. This histogram shows the essence of curve fitting a system to past prices. The noise portion of
the data did not repeat in the out-of-sample data, resulting in losses from the input parameters that fit the
noise portion of the test data.
The Profit Factor is often suggested as a good performance statistic for selecting input parameters that
will give good out-of-sample gains. Figure 5 presents a 3D histogram of the Profit Factor vs Out-of-
Sample Net Profits. As can be seen from the histogram, the number of out-of-sample gains declines
and the losses increase as the Profit Factor increases above 2. This is just the opposite of the common
belief that larger Profit Factors are supposed to be good. Once again, for a completely different
system and set of data, the Profit Factor was not a good performance criterion for selecting parameters
that will produce gains in the out-of-sample period. Of particular interest are profit factors above 2. These
profit factors produced the smallest number of gains of all. This indicates that higher values of the Profit
Factor represent the curve fitting of the noise portion of the test data and thus will be useless in producing
good out-of-sample performance. We will use this observation later in constructing a good filter or
selection criterion.
Conclusion
We have shown that when the input parameters of a system are chosen based on the best net profits or
the best profit factor, out-of-sample losses are likely to occur. The optimization run produces many input
parameter cases that have excellent net profits, profit factors and other statistics in the test section but
losses in the out-of-sample section. This is how we are tricked by spurious price moves that do not
repeat. From just one optimization event we have no idea whether the performance results we've
obtained come from curve fitting the noise or from an accurate measurement of some signal. Just looking
at the test section performance gives one the illusion that future profits will occur. However, as we have
seen from the tables and figures, such is not the case. It is rather difficult to separate the cases that
produce out-of-sample profits from those that produce losses using the above statistics from the test data
section. More sophisticated statistical techniques are needed to combine the various test data window
performance statistics in a way that will select cases that produce out-of-sample profits with greater
probability than the simple performance statistics and sorting method above.
To demonstrate these points we will construct two different random price series. The first series will be a
random price series of 3,968 five minute bars. The price series will start at a price level of 992, and the
price changes of each bar from the previous bar will have the same standard deviation as the S&P500
futures 5 minute bars, so as to mimic a real price series in terms of the magnitude of price changes. The
second series will be a random price series of 11,341 one minute bars. The price series will start at a
price level of 1100, and the price changes of each bar from the previous bar will have the same standard
deviation as the E-Mini futures 1 minute bars, so as to mimic a real price series in terms of the magnitude
of price changes. We will add real dates and times to each generated random price point so that
TradeStation can import this series onto a chart. We will apply the Channel Breakout System described
in the previous section to the random 5 minute bar data. We will apply the Velocity System described in
the previous section to the random 1 minute bars. We will then use the TradeStation optimization module
to optimize the input parameters of these two systems over the same range of values described in the
previous section. We will then examine the optimization output to see the performance results of these
two systems on random data.
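
A minimal sketch of how such a random close series can be generated (only closes are sketched; the
sigma value below is a placeholder for the standard deviation measured from the real S&P500 or E-Mini
bars):

    import numpy as np

    # Random walk whose bar-to-bar changes have standard deviation sigma,
    # mimicking the magnitude of real futures price changes. Only a close
    # series is generated; real bars would also need open/high/low.
    def random_price_series(n_bars, start_price, sigma, seed=None):
        rng = np.random.default_rng(seed)
        changes = rng.normal(0.0, sigma, n_bars)  # random price changes
        return start_price + np.cumsum(changes)

    # e.g. 3,968 five minute bars starting at a price level of 992
    # (sigma=0.8 is a hypothetical placeholder)
    series_5min = random_price_series(3968, 992.0, sigma=0.8)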
Results
Table 3 presents the results for the best twenty cases ranked by Net Profit (NP) for the Channel System
on the 5min random data. The performance statistics are based upon trading 1 contract of the random
price series with each contract point equal to $250. Slippage and commissions are ignored.
As we can see from Table 3, there are many cases that look like they present good input parameters to
use. The Profit Factors in Table 3 look good: they are not too high to alert one to curve fitting and not too
low to prevent assumed profitable future trading. If I didn't know that this was random data, then
case 3 would look particularly interesting. The average trade profit is $294, the profit factor is 1.81, the
largest losing trade is only 4% of the Net Profits and the drawdown is only 16% of the net profits. In
addition, case 3's Net Profit was based on 133 trades. Many traders believe that if your system shows a
profit on over 100 trades then the input parameters will be successful in future trading. This number-of-
trades criterion is obviously not correct here. Remember, this is random data, so there are no good input
parameters to use.
Figure 6 presents a chart of the 5min random data with the buy and sell signals generated from the input
values of case 3 in Table 3. As can be seen from the chart, the buy and sell signals from the case 3 curve
fit look great. Observing this type of performance on past data is how the siren call of the past
performance curve fit fools us into believing we will be able to repeat this performance in the future.
Sadly, this promise of future profitability is just an illusion.
Table 4 presents the results for the best twenty cases ranked by Net Profit (NP) for the Velocity System
on the 1min random data. This table uses the spreadsheet output from the Power Walk Forward
Optimizer described in the next section. The performance statistics are based upon trading 1 contract of
the random price series with each contract point equal to $50. Slippage and commissions are ignored. In
this table the test section statistics were generated from 9,316 random 1min bars from 3/24/04 to 4/23/04,
and the out-of-sample section statistics were generated from 2,025 random 1min bars immediately
following the test section. Once again we can observe that the Velocity System produced excellent Net
Profits (tnp column) on the random data in the test section. There is little hint from observing these top 20
tnp's that these results are from applying the system to completely random data. Even the Losers in a
Row (lr) stats are excellent. However, moving across any given row to the out-of-sample Net Profit (osnp)
column, we can see that the osnp's are all losses except for line 15.
Figure 7 presents a 3D histogram of the Net Profits vs Out-of-Sample Net Profits for the Velocity System
applied to the random 1min bars. The largest number of out-of-sample gains occurred when the test
sample net profits were losses. This histogram indicates the randomness of the association between test
net profits and out-of-sample net profits.
Figure 8 presents a 3D histogram of the Profit Factor vs Out-of-Sample Net Profits for the Velocity
System applied to the random 1min bars. For any given profit factor range the out-of-sample gains and
losses are about equal. If we compare this with the Velocity System on real data histogram of Figure 5,
there are some interesting differences that we can exploit later in the walk forward section.
Conclusion
We have shown that system performance on random data can produce excellent optimization results.
How are we to determine that the results we obtain from optimization are due to repeatable price
dynamics and not from randomness? One way is called walk forward analysis or out-of-sample testing.
In out-of-sample testing you run your optimization on some test data, pick your input parameters based
upon some criteria, and then run the system on data that the system has not been tested on, namely out-
of-sample data. If this test data/out-of-sample data procedure is performed enough times and the
average returns on these runs are satisfactory, then you can gain confidence that your system is
capturing some repeatable price patterns that you can trade.
Table 3: Best twenty cases ranked by Net Profit for the Channel Breakout System on the random 5min data.
Case# nhi nlo NetPrft AvgTrd %Prft PFact #Trds #WTrds #LTrds LLT AvgW/L WR LR MaxDD
1 4 14 42648 151.8 40 1.55 281 113 168 -1045 2.31 6 8 -9533
2 14 14 42178 283.1 47 1.83 149 71 78 -1663 2.01 5 7 -6348
3 18 14 39180 294.6 45 1.81 133 60 73 -1663 2.20 6 6 -6348
4 16 14 38568 273.5 44 1.74 141 63 78 -1663 2.16 5 7 -6733
5 4 16 36938 140.4 39 1.49 263 105 158 -1045 2.25 6 8 -11955
6 20 14 36848 285.6 44 1.76 129 58 71 -1663 2.15 6 6 -7688
7 14 16 36008 255.4 44 1.70 141 63 78 -1663 2.11 4 7 -7203
8 8 14 35903 168.6 45 1.52 213 96 117 -1440 1.86 6 5 -8845
9 6 14 35548 150.0 43 1.49 237 102 135 -1353 1.97 6 8 -8983
10 4 10 33573 97.9 37 1.36 343 128 215 -1045 2.29 6 10 -13478
11 4 18 33523 134.6 38 1.46 249 95 154 -1045 2.37 5 8 -12070
12 14 18 32393 243.6 46 1.62 133 62 71 -1663 1.86 4 7 -7203
13 18 16 32205 253.6 41 1.63 127 53 74 -1663 2.27 4 6 -8148
14 8 16 32083 161.2 45 1.50 199 91 108 -1530 1.78 6 6 -10373
15 4 12 31978 100.2 36 1.36 319 118 201 -1045 2.32 6 9 -12288
16 6 16 31663 143.3 43 1.46 221 97 124 -1353 1.86 5 5 -11198
17 16 16 31213 231.2 41 1.57 135 56 79 -1663 2.22 4 7 -8093
18 20 16 30768 250.1 42 1.61 123 52 71 -1663 2.19 4 6 -9488
19 12 14 30728 175.6 42 1.49 175 74 101 -1663 2.04 5 7 -10375
20 18 18 30340 255.0 42 1.60 119 51 68 -1663 2.13 4 7 -7373
p(t)EST = a + b*t + c*t²

Where p(t)EST is the estimated price found from the least squares fit at time t, a is the initial value of the
line, b is the slope of the line, 2c is the acceleration and t is the time of the bar. Recall that velocity is
defined as the change of position per unit time and acceleration is defined as the rate of change of
velocity per unit time. Using the formula above, acceleration is the second derivative of p(t)EST with
respect to time, which is 2c.
Where
c1=30/[N(N-1)(N-2)]
c2=-180(N+1)/[N(N+2)(N-1)(N-2)]
c3=180/[N(N+1)(N+2)(N-1)(N-2)]
While these equations are correct, implementing them directly can cause a few computational problems.
The first problem is slow speed: computing the polynomial coefficients at each bar using the equations
above is very slow. The second problem is floating point overflow: the summations in the polynomial
coefficients can become very large, causing floating point overflow and computational errors. To solve
these problems I developed a fast, efficient algorithm, similar to the Lagrange coefficients method
discussed in the Nth Order Polynomial papers, that avoids the numerical floating point overflow problems
noted above and is very fast.
Here we will use the acceleration formula of the least squares 2nd order polynomial line to create a
system. The least squares acceleration has the advantage that it is a natural random price noise
inhibitor. We can create a system such that unless the acceleration from fitting N past price bars with a
2nd order polynomial is greater than some threshold, we will not buy or sell. A large percentage of price
noise generates a lot of back and forth movements of small magnitudes over a short period of time. With
a lot of systems this back and forth movement creates many false buy and sell signals. However, using
the least squares acceleration over N past prices, we can filter out many of the small price noise
movements by requiring that the acceleration be greater than some threshold before we act.
Buy Rule:
IF Acceleration is greater than the threshold amount aup then buy at the market.
Sell Rule:
IF Acceleration is less than the threshold amount -adn then sell at the market.
Testing The Least Squares Acceleration System Using Walk Forward Optimization
There are three system parameters to find:
1. N, the lookback period used to calculate the Acceleration = 2c. Note that c is a very small number,
on the order of 0.001. In order to scale c we multiply it by 2000 instead of 2.
2. aup, the threshold amount that the acceleration has to be greater than to issue a buy signal.
3. adn, the threshold amount that the acceleration has to be less than (as -adn) to issue a sell signal.
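
For reference, here is a simple (slow but numerically stable) Python sketch of the acceleration
computation and the buy/sell rules above; it fits the 2nd order polynomial directly with numpy.polyfit
rather than reproducing the author's fast coefficient algorithm, and applies the 2000x scaling of c noted
in item 1:

    import numpy as np

    # Least squares 2nd order polynomial acceleration over the last N
    # closes. Fits p(t) = a + b*t + c*t**2; acceleration is 2c, and c is
    # scaled by 2000 instead of 2 because c is on the order of 0.001.
    def acceleration(closes, N):
        t = np.arange(N)
        c = np.polyfit(t, closes[-N:], 2)[0]  # coefficient of t**2
        return 2000.0 * c

    def acceleration_signal(closes, N, aup, adn):
        a = acceleration(closes, N)
        if a > aup:
            return "buy"    # Buy Rule
        if a < -adn:
            return "sell"   # Sell Rule
        return None         # below threshold: treated as noise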
To test this system we will use 1 minute bar prices of the E-Mini futures contract traded on the CME
known by the symbol ES.
What is walk forward analysis? Optimization is a misnomer on today's trading platforms. Optimization
should really be called a combinatorial search. That is, we have the software output certain performance
parameters for every combination of a range of input parameters. We then sort that output on some
performance parameter. Whenever you run an optimization on noisy data over a fixed time interval, the
best performance will almost always be due to curve fitting the noise and the signal. This curve fitting is
also called "data mining" or "data snooping". A fixed number of prices on a fixed time interval has many
spurious movements, which are also called noise. When we run many different combinations of the input
parameters, the best performance will come from those cases that are able to capture profits from the
spurious movements as well as the price signal dynamics. While the signal dynamics, if there, will repeat,
the same spurious price movements will not. If the spurious movements that were captured by a certain
set of input parameters were a large part of the net profits, then choosing these input parameters would
produce losses when traded on future data. These losses occur because the spurious movements will
not be repeated in the same way. The input parameters chosen from the test section performance are
"cherry picked" to perform well on only those exact same spurious movements. This is why curve fitted
systems with no out-of-sample testing cause losses in the out-of-sample section from something that
looked great in the test section.
Walk forward analysis attempts to minimize curve fitting by using the law of averages from the Central
Limit Theorem on the out-of-sample performance. In walk forward analysis the data is broken up into
many test and out-of-sample sections. Usually for any system one has some performance parameter
selection procedure, or filter, used to select the input parameters from the optimization run. For instance,
a filter might be all cases that have a profit factor (PF) greater than 1 and less than 3. From the cases
that are left we might select the case that had the best percent profit. This procedure would leave you
with one case in the test section output and its associated input parameters. Now suppose we ran our
optimization on each of our many test sections and applied our filter to each test section output. We
would then use the input parameters found by the filter in each test section on the out-of-sample section
immediately following that test section. The input parameters found in each test section and applied to
each out-of-sample section would produce independent net profits and losses for each of the out-of-
sample sections. Using this method we now have "x" number of independent out-of-sample section
profits and losses from our filter. If we take the average of these out-of-sample section net profits and
losses, then we will have an estimate of how our system will perform on average. Due to the Central Limit
Theorem, as our sample size increases, the spurious noise results in the out-of-sample section
performance should average out to zero in the limit, leaving us with what to expect from our system and
filter. Mathematical note: this assumes that the out-of-sample returns are from probability
distributions that have a finite variance.
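
The bookkeeping just described can be written schematically as below, where optimize, apply_filter
and run_system are hypothetical stand-ins for the optimizer, the selection filter and the trading system:

    # Schematic walk forward loop; optimize, apply_filter and run_system
    # are hypothetical stand-ins, not real library calls.
    def walk_forward(prices, test_len, oos_len, optimize, apply_filter, run_system):
        oos_profits = []
        start = 0
        while start + test_len + oos_len <= len(prices):
            test = prices[start : start + test_len]
            oos = prices[start + test_len : start + test_len + oos_len]
            cases = optimize(test)          # every input combination
            params = apply_filter(cases)    # e.g. PF>1 and PF<3, best %profit
            if params is not None:          # filter may reject every case
                oos_profits.append(run_system(oos, params))
            start += oos_len                # slide forward one oos section
        # By the Central Limit Theorem the noise in the oos results should
        # average out, leaving an estimate of expected performance.
        return sum(oos_profits) / len(oos_profits), oos_profits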
This will produce 1296 different cases or combinations of the input parameters.
As discussed above, one thing we do know about data mining is that the curve fit of the signal and noise
will always produce the best performance results. Thus, if we eliminate the optimization cases with the
very best performance results, we are sure to eliminate many of the data mined system input parameters
that fitted the past spurious movements. We know from experience that very few good intraday systems
can sustain profit factors above 3.0 over time. If we eliminate from our optimization all cases that have
profit factors above 3.0, we will eliminate many of the cases whose performance came mainly from curve
fitting the noise.
The walk forward method described above is very time consuming and laborious. I use a TradeStation
add-on product called the Power Walk Forward Optimizer (PWFO) in combination with a software
product called the Walk Forward Performance Explorer (WFPE). With the PWFO, you tell it the test
section length and the out-of-sample section length and it generates all the necessary test section and
out-of-sample section files in Excel csv format. For instance, to set up, I input the E-Mini data into a
TradeStation chart from 12/03/03 to 07/09/04. I add the PWFO as a strategy in the TradeStation Strategy
window along with the Acceleration Strategy. I tell the PWFO that I want a test period of 30 calendar
days and an out-of-sample period of 7 calendar days. When the TradeStation optimization run is finished
the PWFO generates 27 test/out-of-sample files. All files have the output format shown in Appendix A. I
then use the Walk Forward Performance Explorer (WFPE) software on the PWFO optimization output of
each of those 27 test sections to find the input parameters N, aup, and adn that will be used in the out-of-
sample section immediately following each test section. The WFPE eliminates all cases in the PWFO test
files that do not meet certain Profit Factor (PF) and Losers in a Row (LR) criteria, as discussed above,
and chooses the input parameters from the cases that are left. The WFPE explores a range of PF and LR
criteria, generating a number of searches in one run, and automatically displays the best average PF-LR
filter in Excel.
Below are the best ten results of the WFPE run sorted by the Student t statistic. The t statistic is
calculated from the weekly out-of-sample net profits generated by the given PF-LR filter. The Student t
statistic tells us the probability that the average out-of-sample profit for each PF-LR filter is due to pure
chance. The higher the t statistic, the lower the probability that the results are due to chance.
Filter: PF>1 and PF<=2 and LR<=4, #Trds>=16 and highest mKr.
Where:
PF = Profit Factor in the test optimization section
LR = Maximum consecutive losses in a row in the test optimization section
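
As a sketch, this filter and the t statistic can be applied to a PWFO-style optimization output loaded into
a pandas DataFrame; the lowercase column names pf, lr, trds and mkr below are assumptions, not the
actual PWFO file headers:

    import math
    import pandas as pd

    # Filter: PF>1 and PF<=2 and LR<=4 and #Trds>=16, then keep the case
    # with the highest mKr. Column names are assumed.
    def apply_pf_lr_filter(cases: pd.DataFrame):
        kept = cases[(cases["pf"] > 1) & (cases["pf"] <= 2)
                     & (cases["lr"] <= 4) & (cases["trds"] >= 16)]
        if kept.empty:
            return None                        # no case satisfied the filter
        return kept.loc[kept["mkr"].idxmax()]  # row with highest mKr

    # Student t statistic of the weekly oos net profits produced by a
    # filter: the higher the t, the less likely the average is chance.
    def t_statistic(weekly_oos_profits):
        x = list(weekly_oos_profits)
        n = len(x)
        mean = sum(x) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
        return mean / (sd / math.sqrt(n))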
Table 6 presents a specialized percentage trade-by-trade summary of the last four weeks of out-of-
sample data segments, from 4/12/04 to 5/7/04, where the out-of-sample input values for those out-of-
sample weeks were taken from Table 5.
At the bottom of Table 5 are some statistics that are of interest. The first statistic, ave=419.4, is the
average weekly net profit for the 27 oos weeks. Std=544 is the standard deviation of the weekly return.
Given a process that generates the above average and standard deviation, it would be interesting to
know, statistically, the probabilistic outcome from trading this system for 13 weeks (one quarter). There
is a statistical technique that we can use to determine this called the bootstrap, Ref [1]. Using the
bootstrap technique we would randomly choose 13 of the twenty-seven weekly profits in Table 5, with
replacement, and take the sum of those 13 randomly chosen oos net profits. "With replacement" means
we don't eliminate a randomly chosen weekly profit from being chosen again. Let us suppose that we
repeat this random choosing of 13 weeks and summing the results 200 times. We would then have 200
different 13 week net profit summations. If we take the average and standard deviation of those 200
different 13 week net profit summations, we will have an estimate of what to expect from this system by
trading it for 13 weeks.
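
A minimal sketch of this bootstrap computation, where weekly_profits would hold the 27 weekly oos net
profits from Table 5:

    import numpy as np

    # Resample n_weeks weekly profits with replacement, sum them, and
    # repeat n_resamples times to estimate one quarter's distribution.
    def bootstrap_quarter(weekly_profits, n_weeks=13, n_resamples=200, seed=0):
        rng = np.random.default_rng(seed)
        sums = [rng.choice(weekly_profits, size=n_weeks, replace=True).sum()
                for _ in range(n_resamples)]
        return np.mean(sums), np.std(sums)

    # avg, sd = bootstrap_quarter(weekly_profits)
    # Two standard deviation range: (avg - 2*sd, avg + 2*sd)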
For this system the 200 bootstrap average of 13 week net profits is $5453 and the standard deviation is
$1856. This means that at two standard deviations we can expect our 13 week return from this system to
be between $1741 and $9165, 95% of the time. Please note that slippage and commissions were not
taken into account.
References
1. Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap, New York: Chapman &
Hall/CRC.
Where:
PF = Profit Factor in the test optimization section
LR = Maximum consecutive losses in a row in the test optimization section
osnp = Out-of-sample net profit for that week
Equity = Running sum of the weekly out-of-sample net profits
ollt = Largest losing trade in that week
odd = Largest close drawdown in that week
ont = The number of trades in the out-of-sample week
Avosnp = The average out-of-sample trade profit in that week
Trade# EntryDate EntryTime Pos EntryPrice ExitTime ExitPrice BarsInTrd TradeP&L Equity MaxPft MaxPftTime MaxDD MaxDDTime
1 04/12/04 1001 Sell 1143.50 1610 1145.00 369 ($75) ($75) $63 1326 ($125) 1036
2 04/13/04 1001 Sell 1142.25 1610 1129.00 369 $663 $588 $775 1509 $0 1001
3 04/14/04 1001 Buy 1125.75 1023 1126.75 22 $50 $638 $175 1013 ($13) 1001
4 04/14/04 1023 Sell 1126.75 1610 1128.50 347 ($88) $550 $288 1517 ($238) 1108
5 04/15/04 1003 Sell 1128.75 1542 1128.75 339 $0 $550 $475 1412 ($63) 1016
6 04/15/04 1542 Buy 1128.75 1556 1127.75 14 ($50) $500 $13 1549 ($50) 1555
7 04/15/04 1556 Sell 1127.75 1610 1126.25 14 $75 $575 $75 1610 ($25) 1556
8 04/16/04 1001 Sell 1126.25 1610 1134.00 369 ($388) $188 $13 1002 ($475) 1251
9 04/19/04 1020 Buy 1129.75 1610 1134.25 350 $225 $413 $275 1531 ($50) 1042
10 04/20/04 1124 Buy 1134.50 1540 1123.50 256 ($550) ($138) $25 1126 ($575) 1536
11 04/20/04 1540 Sell 1123.50 1610 1115.00 30 $425 $288 $488 1604 ($13) 1541
12 04/21/04 1003 Buy 1121.00 1610 1122.75 367 $88 $375 $163 1425 ($313) 1139
13 04/22/04 1001 Buy 1123.50 1610 1138.00 369 $725 $1,100 $925 1459 ($163) 1029
14 04/23/04 1023 Buy 1136.75 1610 1139.50 347 $138 $1,238 $200 1438 ($150) 1132
15 04/26/04 1020 Sell 1139.25 1610 1136.25 350 $150 $1,388 $400 1512 ($50) 1023
16 04/27/04 1015 Buy 1145.75 1031 1143.50 16 ($113) $1,275 $0 1015 ($125) 1024
17 04/27/04 1031 Sell 1143.50 1610 1137.75 339 $288 $1,563 $413 1447 ($38) 1339
18 04/28/04 1001 Buy 1130.25 1610 1122.50 369 ($388) $1,175 $25 1005 ($500) 1526
19 04/29/04 1007 Buy 1123.25 1030 1126.00 23 $138 $1,313 $225 1016 $0 1007
20 04/29/04 1030 Sell 1126.00 1445 1111.75 255 $713 $2,025 $788 1437 ($75) 1102
21 04/29/04 1445 Buy 1111.75 1608 1113.00 83 $63 $2,088 $138 1556 ($238) 1522
22 04/29/04 1608 Sell 1113.00 1610 1113.25 2 ($13) $2,075 $0 1608 ($25) 1608
23 04/30/04 1001 Sell 1114.50 1051 1114.75 50 ($13) $2,063 $213 1034 ($125) 1008
24 04/30/04 1051 Buy 1114.75 1140 1113.00 49 ($88) $1,975 $138 1120 ($100) 1137
25 04/30/04 1140 Sell 1113.00 1543 1108.75 243 $213 $2,188 $350 1525 ($125) 1343
26 04/30/04 1543 Buy 1108.75 1604 1105.25 21 ($175) $2,013 $88 1551 ($175) 1603
27 04/30/04 1604 Sell 1105.25 1610 1105.00 6 $13 $2,025 $38 1609 $0 1604
28 05/04/04 1426 Buy 1119.00 1513 1121.75 47 $138 $2,163 $375 1500 ($113) 1427
29 05/04/04 1513 Sell 1121.75 1556 1118.50 43 $163 $2,325 $338 1539 $0 1513
30 05/04/04 1556 Buy 1118.50 1610 1115.75 14 ($138) $2,188 $13 1559 ($138) 1610
31 05/06/04 1001 Buy 1115.00 1031 1114.25 30 ($38) $2,150 $175 1004 ($50) 1026
32 05/06/04 1031 Sell 1114.25 1610 1113.50 339 $38 $2,188 $475 1244 ($50) 1532
33 05/07/04 1009 Sell 1109.00 1610 1096.75 361 $613 $2,800 $613 1609 ($200) 1048
Note:
For the PWFO we use the median instead of the average whenever we can. The median is a much more
robust statistic than the average. As an example, look at row 5 in the above spreadsheet. The median
trade profit is -$12.50 while the average trade profit is +$13.23. The average is skewed by a few outliers
while the median is immune to them.
- The first part of the filter is the profit factor filter, designated by pf{ (2 digit number with decimal
place after the first number) or xx }. The number after pf means <=num1.num2. Thus pf35 would
mean pf<=3.5. pfxx would mean any or all pf's; that is, the xx means that the filter doesn't filter
for pf.
- The second part of the filter is the losers in a row (LR) filter. This is denoted by a -LR after the pf
filter part, as in pf35-5. pf35-5 filters on pf<=3.5 and LR<=5. If LR is set to "a", as in pf35-a, then the
filter doesn't filter for LR and all LR's are included. A filter of pfxx-a would mean that
there is no pf or LR filter and the filter is just the maximum of the VAR column.
- The third part of the filter is the Min # Trds. This is designated by a ":number" after the LR portion
of the filter, as in pf35-5:19. Here the ":19" means that the minimum number of trades in the test
section has to be greater than or equal to 19. (A small parser for this naming scheme is sketched
after this list.)
- The fourth part of the filter is the performance variable (Var). After the PF and LR filters are
applied to a PWFO output file there are anywhere from 0 to hundreds of rows or cases left. If there
are 0 cases left after the application of the PF-LR filter, this means that none of the cases
satisfied the filter requirements. Excluding the PF and LR columns, there are 28 columns of
performance variables. Out of the rows that are left, the program finds the maximum or minimum
(see below for the minimum variables) of each of the 28 columns. This creates 28 different filters for
any given PF-LR filter. As an example, the first filter above is pf20-3:19 Var=r22. This means that
for the filter pf20-3:19 the maximum value in the r22 column is chosen, leaving only one case.
So the filter should really be called pf20-3:19-maxR22. Note: if there is more than one row with the
maximum value of a variable, the tie is broken with the row that has the highest "t" value.
- The variables that use a minimum rather than a maximum are: nT, std, mlb, lbr, mru, mlos, dd,
llt, dev. These variables are defined on pages II-1 and II-2 of the PWFO manual.
- Var = The PWFO variable. For this PF-LR filter, the maximum or minimum of this variable produced
the average statistics on this line.
- t = The Student t statistic for the 27 weekly oos profits. The higher the t statistic, the higher the
probability that this result was not due to pure chance.
- # = The number of weeks this filter produced a case. Note that for some weeks there may be no
cases that satisfy a given filter's criteria.
- b00 = The straight line trend of the oos equity curve in $/week.
- Blw = The maximum number of weeks the oos equity curve was below the straight line equity fit.
- BE = Break even weeks. Assuming the average and standard deviation are from a normal
distribution, this is the number of weeks you would have to trade to have a 99% probability that your
oos equity is above zero.
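
For illustration, here is a small parser for the filter naming scheme described in the first three bullet
points above (e.g. pf35-5:19 or pfxx-a):

    import re

    # Parse names like "pf35-5:19" into (pf limit, LR limit, min trades);
    # None means that part of the filter is absent.
    def parse_filter_name(name):
        m = re.fullmatch(r"pf(xx|\d\d)-(a|\d+)(?::(\d+))?", name)
        if not m:
            raise ValueError("unrecognized filter name: " + name)
        pf_part, lr_part, trds_part = m.groups()
        pf_max = None if pf_part == "xx" else int(pf_part[0]) + int(pf_part[1]) / 10.0
        lr_max = None if lr_part == "a" else int(lr_part)
        min_trds = None if trds_part is None else int(trds_part)
        return pf_max, lr_max, min_trds

    # parse_filter_name("pf35-5:19") -> (3.5, 5, 19)
    # parse_filter_name("pfxx-a")    -> (None, None, None)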