In 1900
Louis Bachelier received a doctorate from the University of Paris with a
dissertation entitled “Théorie de la Spéculation”, an event that marked the
first time a serious academic paper addressed the behavior of markets [1]. In his dissertation, Bachelier proposed that
market prices could be modeled as something called Brownian motion. Slowly, his ideas were adopted by the
financial community and are now the foundation of modern financial
engineering. The idea of Brownian
motion arose when a botanist named Robert Brown described the chaotic behavior
of pollen grains suspended in a fluid and viewed under a microscope. Their motion was later shown to be
due to large numbers of random molecular forces impinging on the
grains. Using similar reasoning,
Bachelier assumed that market prices vary due to large numbers of random
effects, such as the whims of individual traders, and hence may be modeled as
Brownian motion.

Three
critical assumptions underlie the Brownian model, namely:

1. price changes are statistically independent,

2. price changes are normally distributed, and

3. price-change statistics do not vary over time.

The first assumption means
that price changes behave like coin tosses, where the current change is not
influenced by past changes and has no influence on future changes. The second assumption says the changes
follow a bell-shaped curve. This assumption
is relevant whenever random behavior is due to many small influences. It provides a distribution function
characterized by only two parameters, the mean and standard deviation, and
implies a certain ‘contained’ behavior of the changes. The third assumption says the mean and
standard deviation do not change with time.

Knowledgeable investors might take exception to one or all
of these assumptions. In fact, there is
ample evidence that these assumptions are often violated in the real
markets. The recent book The
(mis)Behavior of Markets [2] documents many of these violations. It discusses, for example, the 1987 stock
market crash, in which the Dow Jones Industrial Average changed by
about 18 standard deviations, an event with a probability of about one
in 10^17 if the second assumption is true. A glance at price changes for many of the more volatile stocks
over a long enough time span suggests that the third assumption is often
violated as well. The violation of
either assumption 2 or 3 may produce similar effects, namely, larger-than-usual
excursions in price. A question is: are
these excursions accompanied by precursors, that is, smaller changes in the same
direction, as earthquakes are preceded by foreshocks? If so,
then the changes are locally correlated, in violation of assumptions 1 and 3,
and it may be possible to harness those correlations by means of a simple
linear predictor and some statistical data. This article describes some results from an attempt to do just
that. We focus on the statistical
behavior of the stock charts and ignore real-world issues such as commissions
on trades.

2. BROWNIAN MOTION AND STOCK PRICES

The three
assumptions listed in the Introduction constitute the technical definition of a white noise. Consequently, a
Brownian motion is characterized by the fact that its changes are a white
noise. That is, if we suppose a
sequence of stock closing prices P0, P1, …, PN is a sample from a Brownian motion process, then its price changes W1,
W2, …, WN, defined by

Wn = Pn - Pn-1,   n = 1, …, N,

are a sample of white noise.

A larger
standard deviation of W means a broader bell curve and greater volatility in
the prices. Figure 1 shows a graph of 253 days of price changes for General Motors, GM, starting
11/01/04 and ending 10/28/05 (data source: Yahoo Finance, adjusted
closes). The numbers have been
normalized so that vertical units are standard deviations (sigmas): their mean
(-0.039) has been subtracted and the result divided by their standard deviation
(s = 0.727). For reference, Figure 2 shows a graph of 253 points of computer-generated
white noise having the same mean and standard deviation. This sequence is typical in that its values
rarely exceed two sigmas, in contrast to GM, whose largest change is more than
seven sigmas. Our goal is to
capitalize on these large-sigma changes.
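The normalization used for Figure 1 can be sketched as follows. This is a minimal illustration, not the spreadsheet actually used: the function names and the sample closes are made up, and the choice of the population form of the standard deviation is an assumption, since the text does not say which form was applied.

```python
import statistics

def price_changes(prices):
    """Return the changes W_n = P_n - P_(n-1) for n = 1..N."""
    return [b - a for a, b in zip(prices, prices[1:])]

def to_sigma_units(changes):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.fmean(changes)
    # Assumption: population standard deviation; the source does not specify.
    sigma = statistics.pstdev(changes)
    return [(w - mean) / sigma for w in changes]

# Made-up closing prices for illustration only (not GM data).
closes = [50.0, 50.4, 49.9, 50.1, 49.2, 49.6]
w = price_changes(closes)
z = to_sigma_units(w)
print([round(v, 2) for v in z])
```

After this step the values have zero mean and unit standard deviation, so a plotted value of 7 reads directly as a seven-sigma change.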

Prices
may be recovered from their changes according to the simple formula

Pk = P0 + (W1 + W2 + … + Wk),

where P0 is the starting price. In the case of
GM, P0 is known to be 37.16 on 11/01/04 and the resulting prices are
shown in Figure 3. Using the same starting price with the computer-generated values in
Figure 2 gives the simulated prices shown in Figure 4; the
obvious differences between Figures 1 and 2 are not so obvious in Figures 3 and
4. As pointed out in [2], “fake charts”
like Figure 4 are virtually indistinguishable from real stock charts.
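A "fake chart" of this kind can be produced with a few lines of code. The sketch below is only an illustration: the function name and the seed are hypothetical, while the starting price, mean and standard deviation are the GM values quoted above. It draws normally distributed changes and recovers prices by cumulative summation.

```python
import random
from itertools import accumulate

def simulate_prices(p0, n_days, mean, sigma, seed=0):
    """Draw white-noise changes and rebuild prices as P_k = P_0 + W_1 + ... + W_k."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    changes = [rng.gauss(mean, sigma) for _ in range(n_days)]
    prices = list(accumulate(changes, initial=p0))
    return changes, prices

# Starting price, mean and standard deviation as quoted for GM in the text.
changes, prices = simulate_prices(p0=37.16, n_days=253, mean=-0.039, sigma=0.727)
print(len(prices), round(prices[-1], 2))
```

Changing the seed gives a different chart each time, which is one way to explore how much simulated price paths vary from run to run.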

The summation that appears in the last equation represents
the difference in the stock’s price between day 0 and day k. Consequently

Fk = (W1 + W2 + … + Wk) / P0

represents the fractional change in price between day 0
and day k; that is, an investment of D dollars on day 0 results in a
profit/loss of D·Fk on day k. For this reason, Fk is called the Fortune indicator. But suppose we modify the equation slightly to get

Fk = (A0·W1 + A1·W2 + … + Ak-1·Wk) / P0,

where the
An’s have the special (magic?) property that

An-1 = +1 if Wn > 0

An-1 = -1 if Wn < 0

making An-1Wn always positive (the case Wn = 0 is ignored). In this case, a graph of Fk starts at zero and increases thereafter, a sort of Holy Grail for traders! Of course, the problem is that An-1 must be calculated on the day before the measurement of Wn,
that is, predicted on the previous day. A trading scheme that predicts the An’s in an attempt to
capture this behavior is described next.
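The effect of the "magic" An's can be seen in a small sketch. It assumes the modified Fortune indicator simply weights each change Wn by An-1 and keeps the division by P0; the function names and the sample changes are illustrative only.

```python
def fortune(p0, changes, positions):
    """Return F_0..F_N, where positions[n-1] is A_(n-1) applied to changes[n-1] = W_n."""
    total, path = 0.0, [0.0]
    for a, w in zip(positions, changes):
        total += a * w / p0  # assumption: every term is scaled by the starting price P_0
        path.append(total)
    return path

def perfect_foresight(changes):
    """The 'magic' positions: +1 for a rise, -1 for a fall, 0 when W_n = 0."""
    return [(w > 0) - (w < 0) for w in changes]

changes = [0.5, -0.3, 0.2, -0.4]  # made-up price changes
hold_path = fortune(10.0, changes, [1] * len(changes))
magic_path = fortune(10.0, changes, perfect_foresight(changes))
print(hold_path[-1], magic_path[-1])
```

With every position set to +1 the function reproduces the plain fractional price change, while the perfect-foresight positions give a path that never decreases. No real predictor has access to those positions, which is the point made in the text.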

3. TRADING SCHEMES AND THE FORTUNE INDICATOR

A trading scheme can be thought of here as a device for
generating buy/sell signals when presented with a set of market data. The device employed in this scheme is a
one-day-ahead predictor, which is based on the previous behavior of the market
item. A buy signal occurs when the
predicted price change is large compared with the standard deviation of the most recent changes. A wager of one unit is placed and the
resulting profit/loss taken at the close on the following day (there is no
explicit sell signal). Each day’s
profit/loss is added to the previous day’s total to get the Fortune to
date.

It is called a wager because this is not a buy and hold
scheme; rather the scheme is designed primarily to exploit any mis-behavior in
the data. It is unrealistic in the
sense that no one can trade at exactly the closing price each time. However, if the scheme is to be implemented
in the real world, one can monitor the stock’s intraday price and buy just
before the close. Similarly, the stock
may be sold any time before the close on the following day.

A long or short position is determined by the predicted
direction of the price movement. The
question of when to place the wager is decided as follows. If ΔP’ denotes the predicted closing price change, and σ the
standard deviation of the most recent price changes, then the Alpha Indicator is defined as their ratio:

α = ΔP’ / σ .

(Notice that alpha is dimensionless.) If alpha is greater than one or less than
negative one, the predicted price change is larger in magnitude than the standard
deviation of the recent changes. Relatively large, positive
values of alpha indicate a long position, and relatively large, negative values
of alpha indicate a short position. On
a signal, a position of one investing unit is taken, and after the next
close, the position is closed and the profit/loss taken. Then, tomorrow’s position is calculated and
the procedure repeats. Profits and
losses are accumulated in the trader’s Fortune.

This then
is the trading scheme, or algorithm. It
has been assumed that correlation within the price changes will be captured in
the forecast to produce a larger ΔP’ than usual, which is in turn captured
in α, whose calculation is now considered.

Let Pn denote the closing price of a stock and ΔPn = Pn - Pn-1 its price change on day n. The
algorithm for finding α on day n, αn, is as follows; it is assumed that all computations are carried out in Excel.

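· use the built-in FORECAST function to calculate Pn+1’, the estimate of Pn+1, based on the 3 most recent prices: Pn, Pn-1, Pn-2;

· form the predicted price change ΔP’ = Pn+1’ - Pn;

· calculate σn, the standard deviation of the 7 most recent price changes;

· set αn = ΔP’ / σn.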

A
forecast lag of three was chosen because it is the smallest value that can
capture a trend, and yields more potential wagers than larger values. In calculating σn it was
found that a lag of seven produced
enough averaging, while not involving data from the remote past. If the lag is too short, small values of
σn produce large values of αn, which may lead to losses. Figure 5 shows αn values computed from the GM data in Figure 3. Values range roughly between –2 and 2, and a
question that arises is exactly what constitutes the “relatively large” value
mentioned above.
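Outside Excel, the Alpha calculation might look like the sketch below. It rests on several assumptions: FORECAST is taken to be an ordinary least-squares line fit, the x-values 0, 1, 2 are arbitrary (any equally spaced values give the same extrapolation), the sample form of the standard deviation is used, and all function names are hypothetical.

```python
import statistics

def linear_forecast(ys):
    """Fit a least-squares line through (0, ys[0]), (1, ys[1]), ... and evaluate it at the next x."""
    m = len(ys)
    x_mean = (m - 1) / 2
    y_mean = statistics.fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in range(m))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return y_mean + slope * (m - x_mean)

def alpha(prices, n, forecast_lag=3, sigma_lag=7):
    """Alpha on day n from closes up to and including day n (requires n >= sigma_lag)."""
    predicted = linear_forecast(prices[n - forecast_lag + 1 : n + 1])
    predicted_change = predicted - prices[n]
    recent = [prices[i] - prices[i - 1] for i in range(n - sigma_lag + 1, n + 1)]
    sigma = statistics.stdev(recent)  # assumption: sample standard deviation
    if sigma == 0:
        return 0.0  # flat history: treat as no signal rather than divide by zero
    return predicted_change / sigma

# Made-up closes ending in three aligned rises.
closes = [10.0, 10.1, 10.0, 10.2, 10.1, 10.3, 10.2, 10.4, 10.6, 10.8]
print(round(alpha(closes, 9), 2))
```

For three prices the fitted line extrapolates to (4·Pn + Pn-1 - 2·Pn-2) / 3, so three closes rising in step always yield a positive predicted change.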

A crucial
parameter in this algorithm is the Alpha cutoff, denoted by C, where αn > C signals a long position, and αn < -C a
short position. For a long position, An-1 is set to +1, and for a short
position it is set to –1; the 1 represents one investing unit. Otherwise it is set to 0, and there is no
investment on day n. Thus the algorithm
for finding An is

An = +1 if αn > C

An = -1 if αn < -C

An = 0 if -C ≤ αn ≤ C
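The three cases above translate directly into code. The function name is hypothetical; the example values are the Alpha and cutoff quoted for GM elsewhere in this article.

```python
def position(alpha_n, cutoff):
    """Return A_n: +1 (long), -1 (short) or 0 (no wager)."""
    if alpha_n > cutoff:
        return 1
    if alpha_n < -cutoff:
        return -1
    return 0

print(position(1.59, 1.04), position(-2.0, 1.04), position(0.3, 1.04))
```

An Alpha exactly equal to C or -C produces no wager, matching the non-strict inequalities in the third case.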

However,
unlike the magic numbers described in Section 2, these An’s will not always yield a positive factor An-1Wn. Figure 6 shows a graph of the Fortune
indicator for GM using C =
1.04. Notice that the graph is
generally increasing with time, while GM is generally decreasing over the same
time period. The Excel spreadsheet was
programmed to test values of C between 1 and 4 in increments of
0.01. The value that maximized the
Fortune on the last day (the LDF) was chosen as optimal C (1.04 in GM’s
case). The LDF for GM was 0.294, with
the maximum Fortune of 0.305 occurring at day 239, just 14 days before the end of the
data. The number of wagers was 49, with
26 winners and 23 losers for a win-ratio of 0.531 (see Figure 7). The two factors contributing to the rise in
the fortune are the win-ratio and the sizes of the wins and losses. In the case of GM, the winners were
generally larger than the losers.
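The search for the optimal C can be sketched as a simple scan. Several things here are assumptions rather than a description of the spreadsheet: the function names, the tie-breaking rule (the smallest C wins), and in particular the profit per wager, which is taken as the next day's price change divided by the current close because one unit is wagered afresh each time.

```python
def last_day_fortune(prices, alphas, cutoff):
    """Accumulate one-unit wagers; alphas[n] is Alpha at the close of day n (None if unavailable)."""
    fortune, wagers, wins = 0.0, 0, 0
    for n in range(len(prices) - 1):
        a = alphas[n]
        if a is None or abs(a) <= cutoff:
            continue  # A_n = 0: no wager on the next change
        side = 1 if a > 0 else -1
        # Assumption: profit per unit wagered is the next change over the current close.
        gain = side * (prices[n + 1] - prices[n]) / prices[n]
        fortune += gain
        wagers += 1
        wins += gain > 0
    return fortune, wagers, wins

def best_cutoff(prices, alphas, lo=1.0, hi=4.0, step=0.01):
    """Scan C from lo to hi and keep the value with the largest last-day Fortune."""
    best = None
    for i in range(round((hi - lo) / step) + 1):
        c = round(lo + i * step, 2)
        ldf, wagers, wins = last_day_fortune(prices, alphas, c)
        if best is None or ldf > best[1]:  # ties keep the smaller C (an assumption)
            best = (c, ldf, wagers, wins)
    return best

# Made-up prices and Alpha values for illustration.
prices = [100.0, 101.0, 103.0, 102.0, 104.0]
alphas = [None, 1.5, -2.5, 0.5, 3.0]
print(best_cutoff(prices, alphas))
```

Because C is selected with hindsight on the same data it is then scored on, the resulting LDF is an in-sample figure.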

For
comparison the so-called “buy-and-hold” position was also calculated as

B&H = (PN - P0) / P0.

This
represents the fraction of the wager on day zero that is won or lost by simply
waiting for day N, and must be compared with the LDF. In the case of GM, which decreased in value, the B&H position
was –0.266, hence the difference between the LDF and B&H was 0.560. This means a trader employing the current
scheme realizes a return 56 percentage points higher than that of an investor holding the stock.
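The comparison is simple arithmetic, shown here with the GM figures quoted above. The function name and the 100-to-73.4 example are illustrative only.

```python
def buy_and_hold(p_first, p_last):
    """Fraction of a day-0 wager won or lost by holding until the last day."""
    return (p_last - p_first) / p_first

ldf_gm = 0.294   # last-day Fortune reported for GM
bh_gm = -0.266   # buy-and-hold position reported for GM
diff_gm = ldf_gm - bh_gm
print(round(diff_gm, 3))

# Illustrative check of the formula: a fall from 100 to 73.4 is a B&H of -0.266.
print(round(buy_and_hold(100.0, 73.4), 3))
```

Both quantities are fractions of one investing unit, so their difference is measured in the same units.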

The
Fortune indicator for the Brownian motion in Figure 4 is shown in Figure 8. Here C = 1.42, the LDF is 0.117, the
maximum Fortune is 0.160 occurring at day 144, the number of wagers is 30, with
18 winners and 12 losers for a win-ratio of 0.60. The B&H value for this simulation was 0.45, with a difference
of –0.333. The two graphs in Figures 6 and 8 are similar in that they are
both increasing on average, but the LDF in the Brownian case is somewhat
smaller. Figure 9 shows fewer wagers than GM, as would be expected from a comparison of Figures 1
and 2. However, this behavior is not
“typical”. After numerous experiments,
it appears that almost any other LDF, from zero to greater than one, can be
obtained with a different computer-generated white noise sequence, depending on
how the prices are patterned.

Price
patterns can be understood by looking more closely at price data in the
vicinity of a win or loss. Figure 10 shows a segment of GM data near day 128 (May 4,
2005). On this date GM jumped by 4.88 points, while price changes for the
previous 7 days had a standard deviation of only 0.319. The 3 previous closes were nicely aligned
yielding a predicted close of 27.45, with a change of 0.51, and an Alpha equal
to 1.59. Since this exceeded the cutoff
of 1.04, the wager was set to +1 and a win of 0.181 was realized, a clear
example of a large change being preceded by several smaller ones, all in the
same direction. (Note that this win was
immediately followed by a loss of 0.059 when the stock had a loss of 1.89 on a
predicted gain of 2.03.) This win can
be traced directly back to the +7 sigma outlier on day 128, an apparent
violation of the Brownian assumption. The other large outlier in Figure 1 is about –7 sigmas at day 94 (March
16, 2005) and yielded a win of 0.140. The obvious question now is: how many other stocks enjoy this type of
mis-behavior?
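The numbers quoted for this example can be checked against one another. The sketch below uses only figures from the paragraph above. Its variable names are hypothetical, and reading the win and loss as the price change divided by the previous close is an inference from the numbers, not something the text states.

```python
predicted_close = 27.45    # forecast for the day of the jump
predicted_change = 0.51    # forecast minus the previous close
sigma_7 = 0.319            # standard deviation of the previous 7 changes
jump = 4.88                # actual change on the day of the jump
next_change = -1.89        # actual change on the following day

close_before = predicted_close - predicted_change   # close on the day before the jump
alpha_before = predicted_change / sigma_7            # Alpha that triggered the wager
win = jump / close_before                            # profit on one unit held long
close_after = close_before + jump                    # close on the day of the jump
loss = next_change / close_after                     # result of the next long wager

print(round(close_before, 2), round(alpha_before, 2), round(win, 3), round(loss, 3))
```

With these rounded inputs Alpha comes out at about 1.6, in line with the quoted 1.59. The win and loss match the quoted 0.181 and 0.059 when each change is divided by the close on the day before it.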

4. MORE EXAMPLES

Thirty-two
more experiments were conducted on twenty-eight stocks/indexes, and four
simulated Brownian motions. The stock
charts were selected at random from Yahoo’s most active list in some cases, and
well-known items were chosen from the Dow and Nasdaq in others. The results are summarized in Table I; items BM-1 through BM-4 are the simulated Brownian
motions. Column headings Sdate and
Edate represent start and end dates, respectively, for the item in the first
column. Approximately one year of data
was used in each case. The optimal C value is in the column labeled C; MaxF, B&H and LDF are,
respectively, the maximum Fortune achieved in the date range, the buy-and-hold
position and the last-day Fortune value. The difference between the LDF and the B&H position is shown in column
DIFF; the table is ordered by decreasing values of DIFF. Column #W is the total number of wagers and
WR is the win-ratio.

Several
observations can be made.

· Items with a positive DIFF fare better under the scheme than buy-and-hold positions; 22 of the 32 items are in this category, including all four simulations.

· The win-ratio in the last column usually exceeds 0.5, indicating the linear predictor usually had more successes than failures. It also appears to be uncorrelated with the DIFF.

· Stocks that performed the best by DIFF ordering generally had a negative B&H, the notable exception being BM-1, the top performer. This is because the Fortune indicator can increase on a short position.

· The worst performers were those with a large B&H value, like APPL and GOOG, and, presumably, “poor” price patterns. Figure 11 is a graph of Differences versus Buy & Hold positions, which reveals a downward trend in agreement with this observation.

· Items GE, ZMH and WFMI appear in their respective locations with zero wagers, indicating there is no value of C that produces a positive LDF.

· Other items that appear near the top, like SUNW, have very small LDFs, which are likely to disappear if commissions are involved.

· WFMI is interesting in that its B&H value is relatively large yet no wagers were made. Close examination of its price-change data suggests that the signs changed too often from plus to minus for the linear predictor to be effective; this stock does not mis-behave.

5. CONCLUSIONS AND OPEN QUESTIONS

The
intent of this analysis was to test the feasibility of harnessing any
mis-behavior in stock prices, which amounted to calculating the An’s
that appear in the equation for the Fortune indicator. In this regard, the majority of items in
Table I did indeed beat their buy-and-hold position, the reason for which was
traced to a particular price pattern as exemplified in Figure 10. Essentially,
the price must move in the same direction for four consecutive days and, if the
last change is large enough, a profit is realized.

The
related questions of whether most stocks are Brownian motions and whether this has
an effect on the Fortune indicator are not as clear. It appears as though most items are approximately
Brownian motions and Bachelier’s model works well most of the time. But when it fails, e.g., during large price
excursions, the scheme often captures this behavior with an accompanying increase
in the Fortune indicator. For GM, the
pattern had a seven-sigma price change, reducing the likelihood that it is a
Brownian motion. However, the data in
Table I suggest there is no difference between true Brownian motions (the
BM-items) and real-world market items, as far as the trading scheme is
concerned.

The results suggest there are many items which would yield
a positive DIFF under this scheme. One
outstanding question is how to choose them. A closely related issue is: will an item continue to behave as it did
historically, or will assumption 3 in Section 1 be violated with disastrous
consequences? In general, how do we
choose the optimal C in the real world? More questions:

· Predictions were made using a simple linear extrapolation; can the Fortune be increased by using a more sophisticated technique, such as a Kalman filter, which can capture curvature in the trend?

· What effect will commissions have on the optimal C and the performance of the trading scheme?

· Will the Fortune increase in some cases if C is allowed to be smaller than one?