While experimenting with some of the BTD models, I discovered a slight modification that outperforms almost every model class we've considered thus far in Q1 starting periods. Before I go on, and before dwade and others start shooting methodological arrows at me, let me say that this may appear to be shameless data mining, and in a sense it is. However, I think there may be a way to address that issue at least to some extent, which I'll get into after the results are presented.

As we all know, the standard BTD models sort first for high yield, and then for low price among the top 10 high yield stocks. It turns out that a simple variation on this theme enhances returns considerably. If we examine the returns for the individual stocks in the HY sort before performing the LP sort, it becomes apparent that the highest yielding stock (regardless of price) gives the most 'bang for the buck' of any of the HY10 stocks, provided we use the UV procedure of skipping high yield/low price (HY/LP) combinations. This variation is the basis for the models described below.

My friend MontanaFool suggested the name Empirical Yield (EY) for these models, since we are letting the numbers tell us what works and what doesn't, and the procedure in a nutshell is this. Always buy the high yield stock regardless of price (except when it is also the lowest priced). After that, fill out the portfolio with the usual BTD sort. The mechanics of constructing the models are as follows. Beginning with the 30 Dow stocks, sort (descending) by yield. Among the top 10 high yield stocks, always leave the highest yield stock (HY1) in first position, and then sort the remaining nine stocks (ascending) by price. EYn buys equal dollar amounts of the first 'n' stocks. If there is a HY/LP combination, skip the HY/LP stock (HY1) and move everything down one position.

PLEASE NOTE: the 'juiced' versions of the EY models are denoted EYn+, but they double the allocation of the first stock only!
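Since the mechanics take a moment to absorb, here is a rough sketch of the selection rule in code. The naming and the (ticker, yield, price) tuples are my own placeholders; a real run would of course use the 30 Dow stocks from the database on the trade date.

```python
# Sketch of the EYn / EYn+ selection rule described above.
# Each stock is a (ticker, yield, price) tuple.

def ey_select(stocks, n, juiced=False):
    # Sort descending by yield and keep the top 10 (the HY10 group).
    hy10 = sorted(stocks, key=lambda s: s[1], reverse=True)[:10]
    hy1 = hy10[0]
    # Leave HY1 in first position; sort the other nine ascending by price.
    rest = sorted(hy10[1:], key=lambda s: s[2])
    if hy1[2] == min(s[2] for s in hy10):
        # HY/LP combination: skip HY1 and move everything down one spot.
        ranked = rest
    else:
        ranked = [hy1] + rest
    picks = [s[0] for s in ranked[:n]]
    if juiced:
        # EYn+ doubles the dollar allocation of the first stock only;
        # repeating the ticker here stands in for the double weighting.
        picks = [picks[0]] + picks
    return picks
```

So EY2 is `ey_select(stocks, 2)` and EY2+ is `ey_select(stocks, 2, juiced=True)`.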

Below are the returns in multiplier form for these models by starting quarter for four quarter holding periods. Standard deviations and Sharpe ratios are shown below each model's returns. The Sharpe ratio is based on the one year Treasury bill rate, and has been multiplied by 100 and rounded to the nearest integer. For comparison purposes, the returns for similar UV and ERP models are also shown. Fool4 is the 'Old' Fool4, and UV5+ has been replaced by UV5/6+. The DsPc variation of ERP introduced by RayVT is denoted DP below. The sample period is 1965-96 (last portfolio ends in 97:Q4).
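For anyone who wants to reproduce the convention, the calculation amounts to the sketch below. The annual return and T-bill series are stand-in arguments; taking the standard deviation over the excess returns (as in Sharpe's original paper) rather than the raw returns is my reading of the procedure.

```python
# Sharpe ratio as reported in the tables: mean excess return over the
# one-year T-bill rate, divided by the standard deviation of the excess
# returns, multiplied by 100, and rounded to the nearest integer.
from statistics import mean, stdev

def reported_sharpe(returns, tbill):
    """returns, tbill: lists of annual returns as decimals (0.21 for 21%)."""
    excess = [r - t for r, t in zip(returns, tbill)]
    return round(100 * mean(excess) / stdev(excess))
```

Dividing the reported figure by 100 recovers the conventional ratio.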

As with almost all high yield models, the returns start to unravel for starting periods other than Q1. It might be noted, however, that for Q3, which is the worst starting period for almost all of the models, both ERP and DP hold up relatively well.

Turning to Q1, among the two stock models, EY2+ offers by far the best return. That return comes at a price, however, namely higher volatility, which means a lower Sharpe ratio, about which more shortly. We seldom hear about three stock models because few of them perform very well, but as the numbers show, EY3+ does quite well, offering an expected return of more than 21% with a bit more diversification than the two stock models and outperforming any other two or three stock model outside the class, although the difference between EY3+ and ERP2 is insignificant. Among the four stock models, EY4 outperforms F4 and UV4, but lags behind ERP4. EY4+, however, outperforms even ERP4, albeit with more volatility (and hence a lower Sharpe ratio). Among the five stock models, EY5+ shows the best returns of the lot, with DP5 and UV5/6+ running close behind. It is interesting to note that in Q1 the DP models in general do not do as well as their ERP counterparts, but DP5 is an exception to that rule.

Elan and others have remarked that the Sharpe ratios may be, if not irrelevant, certainly not the be-all and end-all for our purposes, and several of us share that opinion. If a higher return is the goal, and we have to pay a price of higher volatility to get a higher expected return, so be it. Up to a point, at least, volatility is really only an issue if one faces the prospect of having to sell to raise cash in a down market, and that should not be the case if the planning has been done correctly up front. The differences in the Sharpe ratios are also exaggerated because they have been multiplied by 100. If they are calculated as they were in Sharpe's original paper (divide the numbers above by 100), the differences are considerably less dramatic. The point of all this is that judging models by Sharpe ratio alone can be rather misleading, and perhaps that should be the last criterion used in evaluating the decision rather than the first.

The Fools have recently begun to show results from 1971 forward, excluding the 'flat' years of the 1960s. Let me say at the outset that I strongly disagree with this practice, partly because it is being done on the grounds that 'things were different' then, and it is far from obvious how anyone would know that. Nonetheless, the question of comparative model performance from 1971 forward will doubtless arise, so below are the returns for all three model classes for the 1971-96 sample period, starting in Q1.

The relative performance of the various models stays roughly the same, with all boats being lifted by the rising tide. Some of the absolute returns, however, are rather amazing. EY2+ comes in with an eye popping figure of almost 28%, truly a remarkable return for any method that restricts us only to Dow stocks. Among the two stock models, EY2 is second, with a bit over 26%, ERP2 comes in third, with a return of over 25%, and UV2 is last but still produces a very respectable return of over 23%. Among the 'non-juiced' four stock models, ERP4 is the winner with a return of almost 24%, but EY4+ again leads by a small (insignificant) margin, with a return just over 24%. For what it's worth, all of the Sharpe ratios rise significantly, and that of ERP4 goes through the roof. That may not be entirely good news, however, a point we'll return to shortly. To return momentarily to the question about Sharpe ratios and the possibility that they are of limited use at best when comparing models, look at EY4+ and EY5. EY5 has a slightly better Sharpe ratio, but EY4+ has a higher expected return by more than 4%. In the end, this probably boils down to one's tolerance for volatility (as opposed to 'risk'), but for some of us at least, the higher expected return is worth the higher volatility regardless of what the Sharpe ratio says.

Finally, let me turn to the issue of data mining, after which I promise I'll shut up :^). As Doug observed recently, the only real test of any model is to see how it performs five years or more beyond the sample period used to construct the model. In principle, I agree with that completely. Unfortunately, the practical significance is limited because we all know very well that a bunch of number crunchers aren't going to sit around for five years after discovering a promising class of models before they announce the results. Given that, how might we proceed to minimize or at least to recognize the effects of data mining? The heart of the problem is that the entire sample period has been used to fit the models to the data. On the other hand, suppose that same model or class of models had been discovered in, say, 1989. If it still performed well by 1998 we'd be rather impressed, and would likely be convinced that the relationship was neither a fluke nor just the result of data mining. Working that logic backwards, suppose we take the model results and eliminate certain years in the sample period. Then we look at how the returns perform over the restricted sample period. If the models perform relatively the same except possibly for a 'scale' factor due to market conditions, that at least suggests that we would have discovered the same relationship had we been looking for promising strategies in 1989. One version of this in fact is just what was done above, restricting the sample period to 1971 forward, although because of the periods involved there are good reasons to take those results with a grain of salt. A more informative procedure might be to eliminate significant portions of the 1990s bull market and observe how the model results change. Below is another table which shows returns for the three classes of models with the 1989-97 period eliminated, again considering only Q1 starts.
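The elimination exercise itself is mechanically trivial; it amounts to something like the following sketch, with hypothetical year-by-year returns standing in for the real series.

```python
# Restricted-sample check: drop a block of years (here, e.g., the 1989-97
# bull market) and compare the average return on what remains against the
# full sample. returns_by_year: dict of year -> annual return.
from statistics import mean

def with_and_without(returns_by_year, drop_start, drop_end):
    full = mean(returns_by_year.values())
    kept = [r for y, r in returns_by_year.items()
            if not (drop_start <= y <= drop_end)]
    return full, mean(kept)
```

If the two numbers stay close (up to a 'scale' factor for market conditions), that is at least suggestive that the relationship isn't an artifact of the excluded years.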

Looking first at the EY models, none of them loses even a full percentage point in the returns by eliminating the bull market years, and one model (EY3) actually shows better returns for the 1965-88 sample period than for the full sample. Considering the run that the market has been on since 1989 (until recently, that is :^)) the models exhibit impressive stability. For what they are worth, the Sharpe ratios also exhibit quite a bit of stability, all of them differing by four points or less. All of this suggests that the same relationship existed, and would have been discovered, had we been working only with the data through 1989. Since they contain many of the same stocks, it isn't surprising that the UV models exhibit much of the same stability. Fool4 has the biggest problem, giving up almost 1.5%, and UV4 gives up slightly over 1% in return by eliminating the bull market, but other than those two the returns are fairly stable. MontanaFool's UV5/6+ actually outperforms F4 and UV4 with the bull market removed, which should be of more than passing interest given events of the last several weeks. The Sharpe ratios are a bit more variable, with two of them (UV2 and UV5) differing by five points or more, but that is still not a dramatic shift. For the ERP models, the picture is a bit more troubling. In particular, ERP4 gives up over 1.4% return when we eliminate the bull market from the sample, and ERP5 loses almost as much. Also ERP4 now underperforms EY4+ by almost two percentage points, and also underperforms EY4 and UV4+, although by smaller margins. The Sharpe ratios also seem rather volatile, spiking markedly when we restrict the sample to 1971 forward, and then falling rather dramatically when the bull market period is removed from the sample. 
Except for the 36% fall in the Sharpe ratio from one restricted sample to the other, most of these differences are not large, and frankly I'm still uncertain as to just what if anything they are telling us, but together the results may suggest that the returns for the ERP class in general and ERP4 in particular depend more heavily on the bull market of the 1990s for their superior historical performance than either of the other two model classes.

As Doug has pointed out, there is more than one way to mine data. I've tried above to address one facet of the issue, but completely sidestepped the question of whether or not significant strategies exist for starts in other quarters or months. I don't have much to offer on that issue, except to say that those who believe such relationships exist should be busy looking for them :^). RayVT has suggested two such strategies (DsPc and Sig4) which do fairly well in quarters other than Q1, but neither of them in their best starting quarter can touch the best Q1 performers. If all that is solely the result of data mining, then I'd have to say that we miners have been astonishingly lucky :^). I personally doubt that non-Q1 strategies exist that will beat known Q1 results, because I believe the January Effect is real, but that is speculation and is certainly open to debate. All I can say is the EY models are another Q1 based strategy, and even with the increased volatility due to the HY1 stock, they look like they might be promising. So fire away, Doug, because in a very real sense, I probably deserve it :^).

My best to all you Fools,

TimberFool (Hinds Wilson - ffxres@erols.com)

P.S. And thanks to Orangeblood for what was indeed a posting tip to beat them all :^) [see #10537].

Great work Timberfool. I'm really glad there's people like you, elann, Rayvt, and numerous others who seem to enjoy all this number crunching :) (Once upon a time I liked math. Then I went to college.) Also, excellent job on the formatting. Didn't realize just how nice until you referenced Orangeblood's message.

The only downside with all these strategies is it makes it that much harder to choose one to stick with!

Thanks for the kind words (and also to Zathrus in the message following yours). Re checking the statistical significance of the EY models, I'm not sure which sort of significance you mean. Are you asking about significantly different from the Dow 30, or from the other models? Regardless of which you mean, the answer at this point is 'no, I haven't' :^). But if you'll let me know which you're interested in, I'll do it Real Soon Now, as soon as I get back from the holiday weekend :^).

zgriner said: Regarding the charge of data mining, might I suggest, as has been suggested before, that you test this on odd and even years, separately, and report back.

Incog opines: This still seems to me to be not quite satisfactory, since even or odd-numbered years have a fixed relationship to each other. Strictly speaking, for testing on a subgroup, the holdout sample should be randomly drawn from the entire set of available data. That is, a subset of years should be randomly selected from the set of available years. If the method can be shown to work on the holdout sample as well as it does on the entire sample (within reasonable limits of statistical significance) then it is much easier to rebut charges of data mining or "special pleading."

Just a thought from a methodology freak.
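To make the suggestion concrete, a random holdout draw might look like this sketch (the year-by-year returns would come from Timberfool's database; here they are placeholders).

```python
# Draw a random holdout subset of years and compare average performance
# on it against the full sample. returns_by_year: dict of year -> return.
import random
from statistics import mean

def holdout_check(returns_by_year, holdout_frac=0.3, seed=0):
    years = sorted(returns_by_year)
    rng = random.Random(seed)
    holdout = rng.sample(years, max(1, int(len(years) * holdout_frac)))
    full = mean(returns_by_year[y] for y in years)
    held = mean(returns_by_year[y] for y in holdout)
    return full, held
```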

<<Incog opines: This still seems to me to be not quite satisfactory, since even or odd-numbered years have a fixed relationship to each other. Strictly speaking, for testing on a subgroup, the holdout sample should be randomly drawn from the entire set of available data. That is, a subset of years should be randomly selected from the set of available years. If the method can be shown to work on the holdout sample as well as it does on the entire sample (within reasonable limits of statistical significance) then it is much easier to rebut charges of data mining or "special pleading."

Just a thought from a methodology freak.>>

I would think that this argument would have been stronger if the method had been developed on a subset of data and tested on another sample. That is, if the holdout sample had been held out of the analysis.

Also, what would be the best sample size for the type of comparison you are suggesting? There are obvious problems with using too large or too small a sample. There must be a way to calculate the optimal sample size.

Incog said: If the method can be shown to work on the holdout sample as well as it does on the entire sample (within reasonable limits of statistical significance) then it is much easier to rebut charges of data mining or "special pleading."

ataloss insightfully noted: I would think that this argument would have been stronger if the method had been developed on a subset of data and tested on another sample. That is, if the holdout sample had been held out of the analysis.

Incog responds: You're quite right about that, of course. Unfortunately, until I finish work on the time machine, we can't go back to before the method was developed and start over. (If we could go back, of course, we'd be rather silly to spend time developing a stock-picking method instead of just buying the stocks we knew would be going up!)

ataloss further asked: Also, what would be the best sample size for the type of comparison you are suggesting? There are obvious problems with using too large or too small a sample. There must be a way to calculate the optimal sample size.

To which Incog evasively replies: Yep, I'm sure there's an answer to this question, and I'm sure that someone who knows statistics better than I do can tell us what it is. The objective would be to make the randomly selected holdout sample as small as practical while keeping the variance of the result as narrow as possible in order to have statistical confidence in the outcome. How to do it? I dunno.

Incredible piece of work, Hinds. Just goes to show you what a good database, a brilliant mind and a flash of insight can do. :)

Someone along the way made the point that your model just makes it hard to pick one and use it consistently. This thought has troubled me. Now, I am wondering what would happen if one switched models arbitrarily? I can't see that switching to a different model every year or two would necessarily be a bad thing to do as long as each switch was to a better model. Not every model is going to perform well every year, but, given that each model is equally valid, should it make a difference, in any practical sense, over the long term?

Anyone got any idea how to test such a premise?
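One crude way to frame such a test, at least: given annual returns for several equally valid models, compare buying and holding one model against switching models at random each year. The returns below are placeholders; real per-model, per-year figures would come from the backtest tables.

```python
# Compound a series of annual returns, and simulate a random-switching path.
import random

def compound(returns):
    total = 1.0
    for r in returns:
        total *= 1 + r
    return total

def random_switch(model_returns, seed=0):
    """model_returns: dict of model name -> list of annual returns."""
    rng = random.Random(seed)
    names = sorted(model_returns)
    n_years = len(model_returns[names[0]])
    path = [model_returns[rng.choice(names)][y] for y in range(n_years)]
    return compound(path)
```

Running many seeds and comparing the distribution of switching outcomes against each buy-and-hold outcome would give a rough answer to whether switching hurts in any practical sense.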

One other comment. In his magnum opus, Hinds wrote:

>>The Fools have recently begun to show results from 1971 forward, excluding the 'flat' years of the 1960s. Let me say at the outset that I strongly disagree with this practice, partly because it is being done on the grounds that 'things were different' then, and it is far from obvious how anyone would know that.>>

I'm not arguing the point but I want to explain our thinking a bit better. We aren't trying to leave out the flat 60's. Well, not exactly. :) What bothers me about quoting figures from 1962 on is that there is no particular reason to have started then--it just happens to be as far back as our data goes.

That's not my idea of a significant starting point. :)

If we had data back to 1960, or, better yet, 1950, I would see it as a less arbitrary cut-off. Since we didn't have data back to an even year, what we decided to do was start quoting Foolish Four returns, publicly, in the same way that mutual funds quote their returns--from the present back. Past 1 year, 3 years, 5 years, 10 years. . . etc. We settled on past 25 year returns as our standard quoted return because it was a nice "round number."

I'm about 51% of the opinion that 25 years is more relevant to the present, and 49% of the opinion that 35 years is better. :) But whenever I quote returns for a long time into the future, like for projecting retirement income, I plan to use the 35 year returns.

I suspect that this is getting into data-dredging, but it is additional confirming data to what I have come to strongly believe from our data mining: In the universe of DJIA stocks, dividend yield is the best predictor of exceptional returns.

The EY strategy is moving pretty close to SIG4, which looks exclusively at yield.

<<The Fools have recently begun to show results from 1971 forward, excluding the 'flat' years of the 1960s. Let me say at the outset that I strongly disagree with this practice, partly because it is being done on the grounds that 'things were different' then...>>

I totally agree. Every time I read this phrase, a chapter title of JOS comes to mind, "No, things are NOT different this time."

<<As Doug observed recently, the only real test of any model is to see how it performs five years or more beyond the sample period used to construct the model. In principle, I agree with that completely. Unfortunately, the practical significance is limited ...>>

I think you are being too harsh on yourself (and our models) here. I believe that we can do a pretty good approximation by looking at individual 5 or 10 year periods. If the returns in different periods are reasonably close to one another, then we can probably reasonably say that the phenomenon is real. Quite some time ago, I posted some results by decade, and IIRC this condition held.

<<For the ERP models, the picture is a bit more troubling....the results may suggest that the returns for the ERP class in general and ERP4 in particular depend more heavily on the bull market of the 1990s for their superior historical performance.>>

This has troubled me, too, for quite some time. AFAIK, nobody has managed to explain why the Q1/H4 RP4 substantially outperforms the other strategies and other periods. Absent a cogent explanation, I'm inclined to think that it was just luck, and therefore probably will not continue in the future.

If you think it worthwhile, I'll add this ranking to my program, and run the results for rolling H4 portfolios. Just let me know.

<<And thanks to Orangeblood for what was indeed a posting tip to beat them all :^) [see #10537].>>

DITTO!

Good work Timber (& Montana). Two points:

1.) To test your model, why don't you ask someone with the data for BTSP to try the same modification? Elann appears to be keeping a database for this model, but I believe he includes the Dow stocks in the basic screen. Ideally the BTSP stock data would exclude the Dow from the beginning.

2.) Data mining: there is nothing wrong with data mining--it's the way any theory gets started--somebody notices a pattern. The problem is that we jump to the conclusion that the pattern will continue even though we haven't yet understood the underlying causative factors. So in this spirit of "Mine On", can you post the CAGRs for '65-'96 for each position of the three basic models? HY1-10, DP/ERP1-5, & UV1-6. (I'm not sure how your data is organized. Hopefully this is a simple task.)

<<ataloss further asked: Also, what would be the best sample size for the type of comparison you are suggesting? There are obvious problems with using too large or too small a sample. There must be a way to calculate the optimal sample size.

To which Incog evasively replies: Yep, I'm sure there's an answer to this question, and I'm sure that someone who knows statistics better than I do can tell us what it is. The objective would be to make the randomly selected holdout sample as small as practical while keeping the variance of the result as narrow as possible in order to have statistical confidence in the outcome. How to do it? I dunno.>>

I guess my point is that since the "holdout sample" wasn't held out, and since we need the sample to be small and large....I don't think the issue can be settled.

The model is attractive, and I admire the work that went into it. I may use it, but it may not really be better than the alternatives.

Good info, but how are you calculating your Sharpe Ratios? According to the definition at http://www-sharpe.stanford.edu/sr.htm and the Sharpe ratios I see on the Fool results pages, I expect numbers in the range 0.8 - 1.7, typically. Yours are in the 50s to 70s. What conversion factor can I use to compare your results to those I see elsewhere?

Referring to using a holdout sample to evaluate Timberfool's EY model, ataloss said: I guess my point is that since the "holdout sample" wasn't held out, and since we need the sample to be small and large....I don't think the issue can be settled.

Incog stubbornly replies: Well, I think you're being a bit too demanding. As RayVT pointed out in a recent post, if the method can be shown to work about equally well in two or more independently selected samples, that's strong evidence that it's not simply an artifact of the particular sample on which the screen was originally developed. Admittedly, it would be a bit more "pure" if the technique had been developed on one subsample without reference to the remaining sample, but the fact that it wasn't doesn't invalidate the technique.

Stocky Guy: <<Good info, but how are you calculating your Sharpe Ratios? According to the definition at http://www-sharpe.stanford.edu/sr.htm and the Sharpe ratios I see on the Fool results pages, I expect numbers in the range 0.8 - 1.7, typically. Yours are in the 50s to 70s. What conversion factor can I use to compare your results to those I see elsewhere?>>

Timberfool in msg 11266: <<Below are the returns in multiplier form for these models by starting quarter for four quarter holding periods. Standard deviations and Sharpe ratios are shown below each model's returns. The Sharpe ratio is based on the one year Treasury bill rate, and has been multiplied by 100 and rounded to the nearest integer. [emphasis added ;-)]>>

<<If we had data back to 1960, or, better yet, 1950, I would see it as a less arbitrary cut-off. Since we didn't have data back to an even year, what we decided to do was start quoting Foolish Four returns, publicly, in the same way that mutual funds quote their returns--from the present back. Past 1 year, 3 years, 5 years, 10 years. . . etc. We settled on past 25 year returns as our standard quoted return because it was a nice "round number."

I'm about 51% of the opinion that 25 years is more relevant to the present, and 49% of the opinion that 35 years is better. :) But whenever I quote returns for a long time into the future, like for projecting retirement income, I plan to use the 35 year returns.>>

But Ann, deciding that 25 years is a nice round number is just as arbitrary as any other number. Think of it this way: if we were Sumerians, 36 years would be a nice round number. It's a shame to drop off any of the history, because it reduces the statistical confidence of any conclusion you may want to draw.

<<Incog stubbornly replies: Well, I think you're being a bit too demanding. As RayVT pointed out in a recent post, if the method can be shown to work about equally well in two or more independently selected samples, that's strong evidence that it's not simply an artifact of the particular sample on which the screen was originally developed.>>

Agreed. We just don't have independent samples, right? I guess I'll just have to patiently await your completion of the time machine.

Have you calculated how the EY methods stand up to . . . the test of significant difference from the Dow 30. (The test that you performed and reported in message 10289. Matched pair t-test?)

I probably should have done these in the first place, although the results of the tests don't contain any surprises. Here are one-tailed, matched pairs t-test results for the EY models vs the Dow 30 for a Q1 start and 4Q holding period for the full sample (1965-96).

Taking EY5 as an example (rounding the probability to 0.004), the t-test probabilities tell us that under the assumption that the EY5 returns and the Dow 30 returns are really equal, the chances that we would observe returns for EY5 as different from (greater than) the Dow 30 in repeated sampling are roughly four out of 1000. Rounding the probabilities to three digits, EY5 is the only one of the eight models tested that will not pass a significance test at the 0.001 significance level, which is an extremely stringent test, so the conclusion seems inescapable that the EY models do in fact beat the Dow with very high probability.
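For anyone who wants to reproduce the statistic behind these probabilities, here is a sketch using stdlib tools only (the data below is hypothetical; the one-tailed p-value then comes from a t distribution with n-1 degrees of freedom, via a table or a stats package):

```python
# Matched-pairs t statistic: pair each year's model return with the
# benchmark (Dow 30) return and test whether the mean difference is
# greater than zero.
from math import sqrt
from statistics import mean, stdev

def paired_t(model, benchmark):
    """model, benchmark: equal-length lists of annual returns."""
    diffs = [m - b for m, b in zip(model, benchmark)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```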

If I understand the details correctly, EY is a variant of UV. In other words, you sort the 30 Dow stocks by yield, throw away the first if it happens to also be the lowest-priced, and take the remaining top-yielder. With the remaining 29 or 28 stocks you sort by price and take the (n-1) lowest-priced ones.

How about trying a similar variant of RP instead? For example, sort the 30 Dow stocks by yield, throw away the first if it happens to also be the lowest-priced, and take the remaining top-yielder. With the remaining 29 or 28 stocks sort by Yield^2/Price and take the (n-1) with the highest ratio.

The hypothetical benefits would be the same as with RP vs. UV: stocks will be less likely to jump on and off the list from one day to the next, and (maybe) the CAGR will be better.
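In sketch form, the variant might look like this (my own naming; I've taken "lowest-priced" over all the stocks passed in, which is an assumption, since the EY write-up measures it among the HY10):

```python
# Proposed RP-style variant: keep the top yielder (unless it is also the
# lowest-priced stock, in which case throw it away and take the next top
# yielder), then rank the remainder by yield**2 / price and take the
# (n-1) highest ratios. Stocks are (ticker, yield, price) tuples.

def ey_rp_select(stocks, n):
    by_yield = sorted(stocks, key=lambda s: s[1], reverse=True)
    if by_yield[0][2] == min(s[2] for s in stocks):
        by_yield = by_yield[1:]          # HY/LP combination: skip it
    first, rest = by_yield[0], by_yield[1:]
    ranked = sorted(rest, key=lambda s: s[1] ** 2 / s[2], reverse=True)
    return [s[0] for s in [first] + ranked[:n - 1]]
```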

Idea 2

Data-mining for sure:

Why take just the first high-yield stock before applying the price sort? What happens if you take the top two high-yielders? Or three?

As Ray has pointed out, eventually this approach would converge to his SIG method.

Anyhow, if you haven't already tried these variants, it might be interesting to apply them to both your entire database and to each 5- or 10-year period.

<<So fire away, Doug, because in a very real sense, I probably deserve it :^).>>

I just read Timberfool's note, and I'm about 250 messages behind at this point (I've been working long hours lately) and I should probably catch up before replying but it might be a while so I'll just dash off something quick. Please forgive me if this has all been hashed out in the last 250 messages. I'll try to have something intelligent to say when I catch up.

But in the meantime my initial reaction is that if you're going to mine the hell out of the data and go for the highest yield possible you might as well go all the way.

To state that in a somewhat more complex fashion, since I already believe that the screens with first stock rules for Q1 only are unlikely to be valid going forward, it doesn't particularly bother me if you want to push things even further. Everybody has to draw their 'plausibility line' in the sand somewhere, and I think it's good that you came up with a screen that extends that continuum without getting totally dopey (like buying stocks that end in 1/4). But you won't find me investing any money in it!

I've been out of the loop for a few days, and this board moves so rapidly that I'm once again behind the power curve. It's really difficult to be a loyal and consistent BTD reader/participant and also have a real life :^).

Re your questions in #11482, here are some quick thoughts.

If I understand the details correctly, EY is a variant of UV.

Yes, but . . . . :^).

In other words, you sort the 30 Dow stocks by yield, throw away the first if it happens to also be the lowest-priced, and take the remaining top-yielder.

No. HY1 is the only stock selected solely on the criterion of yield. If it is also the LP stock, skip it. The procedure in general is to sort all stocks by yield, highest first, then throw away all but the top 10. Isolate the highest yielder, HY1, so it isn't in the next sort, then sort the remaining nine by price, lowest first. These two sorts give you your group of 10 from which to choose, which now looks like HY1, HY/LP2, HY/LP3, etc. Ordinarily, buy the first 'n' stocks for EYn, and double up on the first stock only for EYn+. In case HY1 is also LP1, which in Q1 occurred in 7 of 33 possible cases for the 1965-97 period, skip HY1. Then your feasible set begins with HY/LP2 and proceeds to HY/LP3, etc.

How about trying a similar variant of RP instead?

That's on my list of things to do Real Soon Now :^) Seriously, it would be worth looking at, and I just haven't gotten around to it yet.

Why take just the first high-yield stock before applying the price sort? What happens if you take the top two high-yielders? Or three?

This one I have gotten around to already. Only HY1 shows a consistent 'kick', which was what piqued my interest when I first stumbled across this procedure. HY2 is so-so and HY3 is even less reliable. Once we move below HY1, the HY/LP sort produces better results.

As Ray has pointed out, eventually this approach would converge to his SIG method.

I don't think so, because the procedures are conceptually different. Generically, the EY class of models (like UV and ERP) says "buy the first 'n' stocks that meet such-and-such criteria" for an 'n' stock model. The Sig procedure OTOH says "only buy as many stocks as fit such-and-such a criterion". The number of stocks in the model 'floats' from one number to another depending on the year chosen, although Sig4 caps the number at 4. This, and the fact that Sig4 does so poorly in Q1 (Sig3 actually does better), troubles some of us. As a believer in the value of diversification, I would like to be the one deciding how many stocks to buy rather than having the model decide it for me.

Conceptually, the Sig procedure (without the cap) is a continuum of models, each of which says "buy all stocks whose yields are n*sigma above the mean". Thus far, we've only looked at n=1, but there is no obvious reason to limit ourselves to '1' that I can see except that it is the first number we learned when we were growing up. For all we know, n = 0.865 might be better. [This same issue has come up with the ERP models and the exponent on the yield.]
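In code, that continuum reads something like this sketch (n is the sigma multiple; the optional cap mimics Sig4's limit of four stocks):

```python
# Generalized Sig rule: buy every stock whose yield is at least n standard
# deviations above the mean yield. n = 1 is the case studied so far.
from statistics import mean, stdev

def sig_select(yields, n=1.0, cap=None):
    """yields: dict of ticker -> dividend yield."""
    mu, sigma = mean(yields.values()), stdev(yields.values())
    picks = sorted((t for t, y in yields.items() if y >= mu + n * sigma),
                   key=lambda t: yields[t], reverse=True)
    return picks[:cap] if cap is not None else picks
```

Sweeping n over a grid and backtesting each value would show whether '1' really is special or just familiar.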

Anyhow, if you haven't already tried these variants, it might be interesting to apply them to both your entire database and to each 5- or 10-year period.

I hope to have the Dow quarterly database (DQD) in the hands of others who have expressed interest in it within the next week or so, and then we'll have a lot more folks testing these variations. Until then, with the inconvenient necessity of making a living and other time constraints, there's only so much one person can do :^).

Great work on Post #11266!! You and friend MontanaFool have possibly pushed the DDA to the limit and may have identified the best 1, 2, 3, 4, and 5 stock strategies extant, though I did not see the column showing returns for the PPP or EY1.

These tables are certainly a confirmation of the apparent seasonality of the starting quarter and the advantage of starting in Q1. The job may be considered 33% complete. Is it possible that there is another sweet spot in the market for starting in months other than Jan, Apr, Jul, or Oct??

Where is the information on the selection for DP, ERP?

Have the '87-'97 year-by-year total return values for the EYn strategies been posted?? If not, would it be possible to post that data for all to see, study, analyse, and compare with the many other strategies available??

Also, I don't find that the Sharpe ratio tells me anything useful except as a number, and in a sense it penalizes positive deviations, which we should all applaud and cheer. I prefer to know what the maximum portfolio drawdown might be and how long (months, years) it takes to get back to where it started (possibly plus risk-free interest). This tells me in real terms the % risk and the time period of the loss. These numbers are real to me and I understand them.

To put this in the language of mathematics, I'd be interested to know the standard deviation of the difference between EY models and index funds (or Dow30) over 5-year and 10-year periods. Any equity investment should be made with long-term funds (IMHO), so shorter time horizons aren't really relevant (at least to me).
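One way to compute the number that poster asks for: treat each overlapping 5- or 10-year span as one observation of the annualized return difference between the model and the index, then take the standard deviation across spans. A sketch under my own naming, assuming returns are given as annual multipliers (1.12 for +12%):

```python
from statistics import stdev

def rolling_excess_sd(model, index, window):
    """Std. dev. of the annualized-return difference (model - index)
    over all overlapping `window`-year periods.

    model, index: equal-length lists of annual return multipliers.
    """
    diffs = []
    for start in range(len(model) - window + 1):
        # Compound each series over the window, then annualize
        # via the geometric mean.
        m = 1.0
        b = 1.0
        for r_m, r_b in zip(model[start:start + window],
                            index[start:start + window]):
            m *= r_m
            b *= r_b
        diffs.append(m ** (1 / window) - b ** (1 / window))
    return stdev(diffs)
```

With 32 years of data and a 5-year window this yields 28 overlapping observations; they are not independent, which ties back to the subsampling discussion further down the thread.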

TimberFool,
Just catching up on the boards and saw your work. Makes for some
great reading! Congratulations and thanks for your work!
Thought I'd toss in my $0.02 into the conversation:
To summarize before I start this long winded monologue, I'll be
commenting on some of the "datamining" problems, offering a solution,
and I'd also like to offer to do the number crunching (or help if I
can).
Zgriner and Incog (msg #11272) suggested different ways of subsampling
the data to avoid the effects of "datamining." This caught my
attention, because I remember this same problem being discussed in a
time-series analysis class I took while working on my MS in statistics
a few years back <- shameless attempt to gain credibility 8*0.
To restate the problem, we wish to find the mean and standard
deviation of the annual yield. The problem is that the estimates
that have been used all assume that the data points are *independent*
samples from the population. Since we are using data from consecutive
years, we are obviously violating this assumption. While many
statistical methods are robust against violations of certain
assumptions, that is *definitely* not the case here.
One way to get around this difficulty, since we have 32 years of data
('66 - '97) is to use only the 1st, 6th, 11th, 16th, ..., and 31st
data points. That would give us 7 samples to play with. Use those 7
samples to compute the mean and std.dev. Sure, this is clearly not a
"random" sample from the population, but it is a far more
statistically independent sample, which is a much more serious issue
when we are concerned with avoiding bias in our mean and variance
estimates.
I can see people getting concerned over estimating the mean from only
7 samples. My response there would be to go ahead and estimate the
variance of the estimate as well. What we are doing here is this:
m = m~ + em
s = s~ + es
m = the true mean
m~ = our estimate of 'm' (a random variable)
em = a random error term (a random variable)
s = the true std. dev. of 'em'
s~ = our estimate of 's' (a random variable)
es = a random error term (a random variable)
So, what I am saying is to get a handle on if we've done something
unacceptable by reducing our data set to 7 samples, look at the
std. dev. of our estimate of the mean, m~. (I'll spare you the
details of how to do that). If this variance is small enough, then
we've nothing to worry about. If it is too large, maybe we'll want to
pick every 4th point in the set instead of every 5th (and get more and
more nervous as the samples get less and less independent).
BTW, might as well compute the std. dev. of s~ while we're at it.
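The thin-then-estimate procedure is short enough to sketch in code (my
own naming throughout; the std. dev. of m~ is taken as s~/sqrt(n),
the usual standard error of a sample mean):

```python
import math
from statistics import mean, stdev

def subsample_stats(returns, step=5):
    """Thin a run of consecutive annual figures to every `step`-th
    year, then estimate the mean (m~), the sample std. dev. (s~),
    and the std. dev. of the mean estimate itself.

    returns: annual values in chronological order.
    """
    sub = returns[::step]            # the 1st, 6th, 11th, ... points
    n = len(sub)
    m_hat = mean(sub)                # m~, our estimate of the true mean
    s_hat = stdev(sub)               # s~, the sample standard deviation
    se_mean = s_hat / math.sqrt(n)   # std. dev. of the estimate m~
    return m_hat, s_hat, se_mean
```

Dropping step to 4 gives more samples at the cost of less independence,
which is exactly the trade-off described above.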
Now then, having suggested all this number crunching, (TimberFool, are
you listening? 8*) I'll be glad to do it myself if I could have
access to the data set. TimberFool, is that allowed? If the data set
is proprietary, I'll understand. Otherwise, I'd be happy to dirty
my fingers in some raw data!
What do you say?
Nordman <ele@khoral.com>

Sorry to bother you, but could you provide me with annual returns for your Q1 models, including 1997 (and 1998 ytd if you have it)? I would like to examine whether your strategies improve the performance of various portfolios I'm considering for 1999 (see my killer portfolio and comparative data posts in the workshop board).

The best place to start with the Dow board is definitely Dave Goldman's updated-almost-daily FAQ. You can find it at <http://boards.fool.com/Message.asp?id=1030001003364000>; I don't know if that's the most recent version but it's certainly recent enough to give you a lot to mull over!

There is the basic EYn and then there is the EYn+, which doubles the investment in the number one stock. However, there are two possibilities for the number one stock: highest yield/not lowest price (HY1/LP>1), or high yield/2nd lowest price (HY/LP2), which occurs when the highest yielder has the lowest price and is thrown out.

The main basis for the EY methodology is inclusion of the highest yielder. However, when HY1 is also LP1 it is eliminated, and we are left with HY2-10 in the number one position.

My hypothesis is that using EYn+ works better when the doubled stock is HY1. If the stock in the first position is not the highest yielder, use EYn instead, investing equal dollar amounts in all n stocks.
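That conditional rule can be stated in a few lines of code. This is a sketch of the hypothesis only, under my own naming; the 'first_is_hy1' flag is hypothetical and nothing in the posted tables tests it yet.

```python
def allocation_weights(n, first_is_hy1):
    """Weights for an n-stock EY portfolio under the conditional rule:
    double the first position only when it is the true highest yielder
    (HY1); otherwise fall back to plain EYn with equal dollar amounts.
    """
    if first_is_hy1:
        raw = [2.0] + [1.0] * (n - 1)   # EYn+ : double the first stock
    else:
        raw = [1.0] * n                 # EYn  : equal weights
    total = sum(raw)
    return [w / total for w in raw]
```

So a 4-stock portfolio led by HY1 puts 40% in the first stock and 20% in each of the rest; if HY1 was thrown out as the HY/LP stock, all four positions get 25%.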

In Post #11548 Timberfool touches on this subject. He was asked: Why take just the first high-yield stock before applying the price sort? What happens if you take the top two high-yielders? Or three?

His reply: This one I have gotten around to already. Only HY1 shows a consistent 'kick', which was what piqued my interest when I first stumbled across this procedure. HY2 is so-so and HY3 is even less reliable. Once we move below HY1, the HY/LP sort produces better results.

It appears as if this board is on the way out. The method has supposedly been discredited, but I am still not sure that it doesn't work. I didn't follow the system much, and I'm not convinced that the various successes were within the range of pure chance, which is what the detractors seem to be saying. At least that is what I remember of the whole story.