I expect to be posting a few more of these "Matter/scatter" charts over the next few weeks, so, for those who want to know, here are some notes on the methodology behind them.

One main thing I want to stress is that the methodology does nothing to solve the terrible problem of endpoint dependence. The "historical returns," as shown by the central marks on the chart, are simply the actual final values over whatever time period is shown, subject to all the issues that implies. The purpose of the exercise is to put whatever differences there are in the context of the intrinsic variability of financial asset returns. Since the general message is that the variability is huge compared to whatever differences in historic returns there are, it is reasonable to ask "is it really that large?" That is, am I producing crazy overestimates? On the contrary, I think I am producing very conservative measures that are almost surely large underestimates.

In effect we could get the same results just by pulling some reasonable standard deviation number out of the air and plotting ± so many standard deviations. What I have done, with great effort, is to try to tie the scatter measure as closely as possible to the monthly history over the chosen time period, so that the spread shown for a portfolio is based on the actual details of the month-to-month performance of the whole portfolio, and based on the time period shown (we are not plotting bounds for the last thirty years that include data from the Great Depression, etc.).

Procedure

In all cases, the raw data source is portfoliovisualizer.com's Backtest Portfolio tool, using tabulated "Monthly Returns." These are used as inputs to my own methods and calculations, which do not duplicate anything available in PortfolioVisualizer. No PV visual content or numeric values are directly reproduced. Time ranges are limited to available data in PortfolioVisualizer.

For the Monte Carlo simulation, I divided the time period into two-month intervals, each consisting of an even-numbered and an odd-numbered month. The simulated return for the first month is an actual historic return, chosen at random from either the even- or the odd-numbered month. The simulated return for the second month is an independent random choice between the same two months. That is, if the historic returns for, say, January and February 2012 are A and B, then in the simulation the returns for those two months might be A-A, A-B, B-A, or B-B.
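The pairing procedure can be sketched in a few lines of Python (the function and variable names here are mine, not the OP's):

```python
import random

def simulate_path(monthly_returns, rng=random):
    """One Monte Carlo path: within each two-month pair, each simulated
    month is an independent coin flip between the pair's two actual
    historical returns -- yielding A-A, A-B, B-A, or B-B per pair."""
    sim = []
    for i in range(0, len(monthly_returns) - 1, 2):
        pair = monthly_returns[i:i + 2]   # (even month, odd month)
        sim.append(rng.choice(pair))      # simulated first month of the pair
        sim.append(rng.choice(pair))      # simulated second month, independent
    return sim
```

Each pair's four outcomes occur with equal probability, so on average half of the simulated months equal the historical ones.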

The simulation hews closely to the historic record; it is not based simply on long-term statistical parameters. The Monte Carlo simulations crash in 2008-2009, boom in 1995, and vary in volatility along with the actual market. If the market shows long-term mean reversion, the simulation shows it. Furthermore, since each simulated monthly return is a random choice between the historic value and the value from a date one month different, about half of the monthly returns in the simulation will be identical to the historic returns.

Does it overestimate variability?

I'm pretty sure it underestimates variability, for three reasons.

Reason one: it's smaller than variabilities estimated by two other methods

First, a reality check against other approaches gives spreads that are broadly similar in magnitude, with mine considerably smaller. Here's my simulation of VFINX compared to PortfolioVisualizer's, and to the historic record for "the US stock market" as compiled by Robert Shiller.

We group the historical data into adjacent pairs of values. For each simulation run, for each month, we randomly choose either the actual historical return for that month or the return of the other month within the same pair. This means that, on average, 50% of all values will be the same as the historic values.

Now, I'd wondered whether there was any short-term systematic tendency for successive months to compensate each other, to exhibit one-month mean reversion as it were. If so, then we might be exaggerating the differences. However, I expected momentum, not mean reversion, and that's what I found in the Shiller data:
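One way to test for momentum versus one-month mean reversion is the lag-1 autocorrelation of the monthly return series: positive values indicate momentum, negative values indicate reversion. A small stdlib sketch (the helper name is hypothetical):

```python
import statistics

def lag1_autocorr(returns):
    """Sample lag-1 autocorrelation: positive suggests momentum,
    negative suggests one-month mean reversion."""
    mean = statistics.fmean(returns)
    dev = [r - mean for r in returns]
    return sum(a * b for a, b in zip(dev, dev[1:])) / sum(d * d for d in dev)

# A trending series is positively autocorrelated (momentum)...
assert lag1_autocorr([1, 2, 3, 4, 5, 4, 3, 2, 1]) > 0
# ...while an alternating series is negatively autocorrelated (reversion).
assert lag1_autocorr([1, -1, 1, -1, 1, -1]) < 0
```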

Wonder if this thread is still open for a little contemplation.
I spent plenty of time tracing through the logic of this, but this was primarily a learning experience.

Some open (and closed?) comments and questions:

I take it that this is meant to examine which one of two different funds/ports was better over the chosen history by slightly juggling the sequence of returns.

A mea culpa for me: volatility drag, or variance tax, truly is an estimation from the Taylor series of CAGR, log(1+r) ≈ r − (σ^2)/2, truncated to two terms. I'll skip any more math.
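The two-term approximation can be checked numerically. A quick Python sketch using synthetic Gaussian monthly returns (not any particular fund's data):

```python
import math
import random
import statistics

# Synthetic monthly returns: roughly 0.5% mean, 4% standard deviation.
rng = random.Random(42)
rets = [rng.gauss(0.005, 0.04) for _ in range(1200)]

arith = statistics.fmean(rets)        # arithmetic mean monthly return
var = statistics.pvariance(rets)      # variance of monthly returns

# Exact geometric mean (monthly CAGR) vs. the two-term Taylor estimate:
geo = math.exp(statistics.fmean([math.log(1 + r) for r in rets])) - 1
approx = arith - var / 2              # "volatility drag" estimate

assert geo < arith                    # drag: geometric mean sits below arithmetic
assert abs(geo - approx) < 1e-3       # the two-term estimate is quite close
```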

One question: did you (nisiprius) use the same iteration of the random sequence on both ports, rather than independent random iterations?
My observation on why your returns and SD were lower than actual history and PV's MC sim:
Markets (and bonds) have a mostly (net) positive return. Your random process would not change time-weighted returns, nor standard deviation (I believe), but it definitely introduces sequence-of-returns risk for a dollar-cost-averaging investor. In other words, the dollar-weighted return of the month sequence A-then-B differs from B-then-A if B is usually higher than A. Actually, let me backtrack: that would be momentum, or exponential momentum. But I believe it is still true, and therefore the dollar-weighted outcomes of A-B and B-A differ (those are the other two of the four possible combinations: AA, BB, AB, BA).
So two of the random combinations reduce momentum while accumulating, causing lower return and lower SD of return.
And they also reduce the (fewer) examples of negative momentum in down markets.
For bond funds, generally the same.
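The distinction being drawn here, that reordering returns leaves the lump-sum (time-weighted) outcome unchanged but alters the dollar-cost-averaged (dollar-weighted) outcome, can be verified directly. A small sketch, assuming start-of-month contributions for simplicity (PV uses end-of-month):

```python
def lump_sum_final(returns, start=100.0):
    """Grow a single initial investment through the return sequence."""
    bal = start
    for r in returns:
        bal *= 1 + r
    return bal

def dca_final(returns, monthly=100.0):
    """Invest a fixed amount at the start of each month (a simplifying
    assumption), then apply that month's return."""
    bal = 0.0
    for r in returns:
        bal = (bal + monthly) * (1 + r)
    return bal

seq = [0.10, -0.05, 0.02, 0.07]
rev = list(reversed(seq))

# Reordering never changes the lump-sum (time-weighted) outcome...
assert abs(lump_sum_final(seq) - lump_sum_final(rev)) < 1e-9
# ...but it does change the dollar-cost-averaged (dollar-weighted) outcome.
assert abs(dca_final(seq) - dca_final(rev)) > 1e-6
```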

Last edited by MIretired on Tue Sep 11, 2018 12:53 pm, edited 1 time in total.

As usual, excellent work. I have not had a chance to delve deep into your new line of threads like this one, but I look forward to it once my kids and wife leave me alone for a weekend (which will likely be never).

Good luck.

"The stock market [fluctuation], therefore, is noise. A giant distraction from the business of investing.” |
-Jack Bogle

...I take it that this is meant to examine which one of two different funds/ports was better over the chosen history by slightly juggling the sequence of returns...

No. It adds nothing to the question of "which is better" or "which has, had, or will have the highest return." This is Siamond's point, I think. I haven't done anything to deal with the (intractable) problem of endpoint dependency in historical results.

It is simply an attempt to put some rough analog of error bands on the historical returns. To say: "even if our mathematical expectation for the future is the same as the actual historic results over this time period, how important is that compared to some estimate of the expected scatter?"

One could probably do about the same thing by pulling some number out of the air and saying "take the historic return and add ±F times the standard deviation of the monthly returns," where F is calculated in... some way.

I use the result of cumulative investments for no particular reason except a) to be a little different from what's usually done, and b) to reduce, though not eliminate, the issue of extreme dependence on the starting point.

I don't think it's crazy to use the actual monthly returns over the time period, rather than some grand standard deviation over the whole time period, because that at least links the scatter of the estimate to the actual volatility the investments really had over that time period.

I think I was saying the same thing in not enough words. In this case it's a lot of random scatter of two-month return sequences. It could be a scatter by ±F, not necessarily at 2 months. Maybe I just threw in that it tests whether, if one port had the higher return, it was because of its 2-month sequences?

Right: eliminated the starting endpoint.
But I think I went through it from enough angles that you did not change the time-weighted return, only the dollar-weighted.
Understand that I know nothing of the math of randomness, i.e., add a random time variation to an average return and get the same average return, which seems to be the case here.

But these plotted data points are not statistically independent. If I understand things they are from a series of highly overlapped time periods. My take is that they are small variations of the same data plotted over and over. The result you show is what I would expect from such data.

Please enlighten me, but I don't believe that you can conclude much of anything from the plots you made.

But these plotted data points are not statistically independent. If I understand things they are from a series of highly overlapped time periods. My take is that they are small variations of the same data plotted over and over. The result you show is what I would expect from such data.

Please enlighten me, but I don't believe that you can conclude much of anything from the plots you made.

To phrase it another way, it is a set of random walks around the historical data, based on the month-to-month variability within that specific period of time.

And what I conclude from it is that even though you are looking at small variations of the same data plotted over and over again, the small variations add up to large variations that are much larger than the performance differences in the two portfolios. Even if you believed the differences were certainties, the uncertainties far exceed the certainties. Maybe that is to be expected, but the normal mindset of writers comparing performance is to make no attempt at all to estimate or visualize uncertainty at all, and to assume that all differences are meaningful.

There's no easy answer, because if you compare nonoverlapping data, by the time you get to reasonably long periods of time, you have unreasonably few periods of time to look at. And, furthermore, you never know whether you are looking at intrinsic variability or looking at different time periods when the assets you are looking at were behaving differently than they do now.

I think I was saying the same thing in not enough words. In this case it's a lot of random scatter of two-month return sequences. It could be a scatter by ±F, not necessarily at 2 months. Maybe I just threw in that it tests whether, if one port had the higher return, it was because of its 2-month sequences?

Right: eliminated the starting endpoint.
But I think I went through it from enough angles that you did not change the time-weighted return, only the dollar-weighted.
Understand that I know nothing of the math of randomness, i.e., add a random time variation to an average return and get the same average return, which seems to be the case here.

I see something in the randomness period. Average two-month randomness; four equal possibilities. Therefore, the probability of any one month's variance from the (still the same) mean over 26 yrs has twice the error as before? (AA, BB vs. AB, BA.) Or maybe that becomes SQRT(4) more error at any one-month point, for a random error period of 8 months (2 months × 4 possibilities).

I don't have much to add; we already had a good discussion in Part 1, and I documented in this post the issues I see with such an approach.

To summarize, I just don't see that the scatter clouds are adding much value to the points (big cross) on which they are centered, which are themselves heavily start/end-date dependent, hence quite misleading. Sorry, I realize that assembling those graphs required a lot of effort, and I applaud the intent, but unfortunately I don't see the value added.

I don't have much to add; we already had a good discussion in Part 1, and I documented in this post the issues I see with such an approach.

To summarize, I just don't see that the scatter clouds are adding much value to the points (big cross) on which they are centered, which are themselves heavily start/end-date dependent, hence quite misleading. Sorry, I realize that assembling those graphs required a lot of effort, and I applaud the intent, but unfortunately I don't see the value added.

Do you see an alternative way of visualizing or otherwise judging the relationship between relative performance (over a specific time range) and the amount of intrinsic uncertainty, variability, etc.? Political surveys are always published with a stated margin of error; financial performance measurements virtually never are. I didn't want to make a boatload of obviously invalid statistics-101 assumptions, so I just did something that felt reasonable to me and presented the results without claiming to calculate significance levels or anything else. The whole point is to provide something that would do the job of an error bar, or a box-and-whiskers plot.

I'll keep presenting them, but I'll include a link to this thread and note that some forum members question their value.

The whole point is to provide something that would do the job of an error bar, or a box-and-whiskers plot.

Well, the only answer I can think of is to vary the start-date while using historical data, and when possible, get out-of-sample (e.g. in another country). PortfolioCharts.com did a wonderful job with such concept. I know you find this type of answer overly 'filamentous', but this doesn't bother me that much, and I can't think of anything better. As I said, I applaud your intent. I just don't know how to improve on the drawbacks.

Since I remain intrigued by the OP's approach, I decided to learn how to create such a simulation myself. With heavy use of the RAND() function and a big copy & paste of 500 simulations, this turned out to be reasonably easy to do. Besides learning, I was curious about the shape of the distribution of portfolio balance outcomes. I was expecting an overly regular distribution (as a side effect of the 'A-A, A-B, B-A, or B-B' approach). I was dead wrong. The distribution turns out not to be a normal distribution, nor normal-ish with fat tails (as in real life when varying the starting date), but skewed towards the lower outcomes. I don't quite understand why. I tried multiple instances of the simulation and always get the same outcome. I tried other stock data series (VFINX S&P 500; NAESX US Small-Caps; VTRIX Int'l Value), same outcome.

Note that I only included one data series here (VTSMX), and that I didn't make the 'Y' axis logarithmic (I am not quite sure why the OP elected to do so), so that we better appreciate the distribution of outcomes on the main graph. Click on the image to see a bigger display.

Then I tried with bonds (VBMFX, Total-Bonds). I was expecting something similar by now, and I was wrong again. This time, it looks somewhat (kinda) close to a normal distribution. Huh?

I wasn't expecting those outcomes, but this reinforces my view that the scatter clouds depicted by this representation are very artificial, and not quite representative of real-life outcomes. Discussing the shape of such scatter clouds seems rather fruitless, I am afraid.

PS. I assumed an initial investment of $100, and then a monthly (end of the month) extra investment of $100, as PV appears to use such end-of-the-period logic.

Siamond, although this is tangential to your comments, and may not interest you, I've been thinking that rather than display the end state as a scatter cloud, it might be better to display the entire swath of time courses for the simulations. That is, the historic course, with the 500 different walks around it, spreading out with time. Some experimentation on how to display each line would be needed--probably plotting it in a transparent way, so that lines build up density as they overlay each other. So that instead of seeing two lines, you would see two pale, translucent spreading bands.

The problem for me is that my current set of tools uses Excel, which is really lousy for doing what I want to do (display 500 curves in green, plus 500 curves in red). I won't swear that it can't be done using Visual Basic for Excel, but I'm not interested in trying. So, to do what I want, I would have to stop using Excel and write my own program to create and display the charts, which is doable but not trivial; it's a little project in itself.
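For what it's worth, the numbers behind such a "pale spreading band" can be computed without any plotting library at all, as per-month percentiles across the simulated trajectories. A rough stdlib sketch, using synthetic returns in place of real fund data:

```python
import random
import statistics

rng = random.Random(7)
hist = [rng.gauss(0.005, 0.04) for _ in range(120)]   # stand-in for fund data

def one_path():
    """Compound one pair-shuffled walk (per the OP's procedure) into a
    month-by-month balance trajectory."""
    bal, path = 1.0, []
    for i in range(0, len(hist) - 1, 2):
        pair = hist[i:i + 2]
        for _ in range(2):
            bal *= 1 + rng.choice(pair)
            path.append(bal)
    return path

paths = [one_path() for _ in range(500)]

# Per-month 10th/90th percentile band: the skeleton of the translucent swath.
band = []
for month_values in zip(*paths):
    q = statistics.quantiles(month_values, n=10)
    band.append((q[0], q[-1]))   # (10th percentile, 90th percentile)

# The band spreads out over time, as the description anticipates.
assert band[-1][1] - band[-1][0] > band[0][1] - band[0][0]
```

Plotting the 500 trajectories with a high transparency (e.g. matplotlib's `alpha`) would then build up density where the curves overlay each other.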

With regard to the skew, it doesn't surprise me, and I don't think it bothers me. I think I "expected" it: it's just the result of compounding. The effect of those little "cumulative" departures is multiplicative, not additive. It's one of the reasons I use a semilog plot.

The general idea remains the same: how to display the historic results with some kind of "error bars" around them, and to do so without making a set of unrealistic Statistics 101 assumptions. I am very open to some other way to do it. Have you got any ideas?

Siamond, although this is tangential to your comments, and may not interest you, I've been thinking that rather than display the end state as a scatter cloud, it might be better to display the entire swath of time courses for the simulations

PortfolioCharts.com has something like that, I think, see this calculator. Without the regular $100 addition, mind you, and solely based on varying the start date. I am mentioning it because the author uses Excel as the underlying engine, and it shouldn't be too hard to do something similar based on the 500 scenarios, if that is close enough to what you meant.

Otherwise, to do something more akin to what cfiresim does (with a bunch of individual trajectories), we'd need more powerful tools for sure. I have on my TODO list to learn to program in 'R', as I've been told this should be the answer to more heavy-duty data analysis and more sophisticated graphical representations. I didn't start though. Winter project!

With regard to the skew, it doesn't surprise me, and I don't think it bothers me. I think I "expected" it: it's just the result of compounding. The effect of those little "cumulative" departures is multiplicative, not additive. It's one of the reasons I use a semilog plot.

I don't get it. Please elaborate? Why are stocks compounding in such a skewed way in the A/B model? I don't see why randomly changing the order of returns would do that? And why would such strong departure from reality not bother you more than that?

The general idea remains the same: how to display the historic results with some kind of "error bars" around them, and to do so without making a set of unrealistic Statistics 101 assumptions. I am very open to some other way to do it. Have you got any ideas?

Besides what I mentioned two posts ago (vary the start date, vary the geography), no, I don't. I'll keep thinking about it, but I am skeptical this can be done with any proper dose of reality. You're probably as close as can be, and yet very far, I'm afraid.

Lognormal. That's the word I was searching for. We expect the distribution to be lognormal, not normal. And it isn't unrealistic, real-world distributions of financial outcomes are lognormal, not normal. I think I heard that somewhere, anyway.

It isn't randomizing just the order, it is randomizing the subset of monthly returns that are applied cumulatively. If, by chance, you get 100 that are higher than the mean, then you are compounding, not adding, 100 higher numbers. If you get 100 that are lower than the mean, you are compounding, not adding, 100 lower numbers. If it is lognormal, then the skew ought to go away on a semilog chart... and it does.
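This compounding argument is easy to demonstrate: compound many random monthly returns and the distribution of terminal balances comes out right-skewed (roughly lognormal), while the distribution of their logarithms is roughly symmetric. A sketch with synthetic Gaussian monthly returns:

```python
import math
import random
import statistics

rng = random.Random(1)

def final_balance(n_months=120):
    """Compound random monthly returns into a terminal balance."""
    bal = 1.0
    for _ in range(n_months):
        bal *= 1 + rng.gauss(0.005, 0.04)
    return bal

outcomes = [final_balance() for _ in range(2000)]
logs = [math.log(x) for x in outcomes]

# Raw balances are right-skewed: the mean is pulled above the median...
assert statistics.fmean(outcomes) > statistics.median(outcomes)
# ...while the log balances are roughly symmetric (mean close to median).
assert abs(statistics.fmean(logs) - statistics.median(logs)) < 0.1
```

This is why the skew "goes away" on a semilog chart: taking logs turns an approximately lognormal distribution into an approximately normal one.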

Lognormal. That's the word I was searching for. We expect the distribution to be lognormal, not normal. And it isn't unrealistic, real-world distributions of financial outcomes are lognormal, not normal. I think I heard that somewhere, anyway.

It isn't randomizing just the order, it is randomizing the subset of monthly returns that are applied cumulatively. If, by chance, you get 100 that are higher than the mean, then you are compounding, not adding, 100 higher numbers. If you get 100 that are lower than the mean, you are compounding, not adding, 100 lower numbers. If it is lognormal, then the skew ought to go away on a semilog chart... and it does.

Ugh. Duh. You're absolutely right. The results should indeed be lognormal. I modified the horizontal axis of my distribution charts to reflect that, and yup, good point, the skew basically disappears. Then we're back to distribution charts closer to what I expected, fairly regular, although less regular than I was anticipating, in truth, and somewhat varying in shape depending on the simulation run. Definitely more regular than a historical distribution chart (varying the start date) though. Hence I remain very skeptical those scatter clouds reflect anything close to reality, but... my criticism was overly skewed!

A pending question from the discussion we had in the OP's first scatter/matter post. So far, I did not adjust the $100 for (monthly) inflation in my own models, and this isn't consistent with the way PV does it. Yeah, I should fix that. Please clarify how you do it first.

[*]The performance is not based on the growth of a single investment in the fund. It is, instead, based on the outcome of making periodic fund purchases of $100/month over the time period shown. It thus is closer to the way most of us actually invest.

Are the $100 being added a constant (nominal) quantity, or adjusted for inflation? I assume the latter?

A pending question from the discussion we had in your first scatter/matter post. So far, I did not adjust the $100 for (monthly) inflation in my own models, and this isn't consistent with the way PV does it. Yeah, I should fix that. Please clarify how you do it first.

[*]The performance is not based on the growth of a single investment in the fund. It is, instead, based on the outcome of making periodic fund purchases of $100/month over the time period shown. It thus is closer to the way most of us actually invest.

Are the $100 being added a constant (nominal) quantity, or adjusted for inflation? I assume the latter?

I'm not using PV's calculations of growth. I'm using my own. They do not adjust for inflation. They probably should and that's something I'll probably get around to.

A really serious question that I don't begin to know how to answer is whether it is reasonable to model the simulations on the assumption that the real world is the month-to-month cumulative effects of a random walk, or whether there is some kind of mean reversion that would tend to pull all of the alternate universes closer to the historical results. I couldn't figure out any sensible way to deal with this, other than to make what I think are ultraconservative choices on everything else.

A really serious question that I don't begin to know how to answer is whether it is reasonable to model the simulations on the assumption that the real world is the month-to-month cumulative effects of a random walk, or whether there is some kind of mean reversion that would tend to pull all of the alternate universes closer to the historical results.

I think Mandelbrot and others made it crystal clear, and it isn't too hard to verify based on long data series: this is NOT a random walk. And Taleb certainly had a powerful point emphasizing that black swan (fat tail) events have an outsized effect on trajectories. I seriously doubt that the intricacies of a real-world trajectory can be modeled with any fairly simple formula, random walk or otherwise.

What keeps vexing me is to be sure that the scatter cloud adds value to the historical data. If the scatter cloud is a simple mathematical derivation from the historical data (for a fixed period of time), then there is little value added; we just see two fairly deterministic clouds side by side, and this doesn't add much to the two historical data points side by side. And even if the two clouds overlap a lot, we just don't know whether there is a solid deterministic reason for it (e.g. bond funds with distinct durations), or whether it just means that the two asset classes are barely distinguishable (e.g. S&P 500 vs. Total Market). Sorry, I guess I am going in circles with this.

PS. for the sake of healthy skepticism, I'll run all 30-year cycles of MITTX (a very old fund which started in 1925), and see how the distribution works out.

Since I remain intrigued by the OP's approach, I decided to learn how to create such a simulation myself.

For reference before I get to the next posts, here is the updated chart for VTSMX, using the Monte-Carlo-like simulation approach described by the OP, with a more adequate logarithmic scale for the distribution chart. Note that all numbers are nominal and that the $100 extra monthly investment increases with inflation. I also increased the number of random simulations to 750.

I kept the vertical scale for the scatter non-logarithmic, so that we can better appreciate the range of possible outcomes. As we can see, there is a wide distribution (from $50k to $400k!), following a fairly regular pattern.

To more easily compare with the next charts I am going to post, here is another Monte-Carlo run, but this time, I used real (inflation-adjusted) monthly returns for VTSMX, and a fixed $100 extra monthly investment. Note that this MC run displayed a 'fatter' tail towards the left side, it happened fairly often in my tests (always on this side of the graph!), probably due to a weird quirk of the Excel random number generator.

PS. for the sake of healthy skepticism, I'll run all 30-year cycles of MITTX (a very old fund which started in 1925), and see how the distribution works out.

I got back to this. No Monte-Carlo here, only historical cycles, starting at every possible month since the fund's inception, faithfully investing an extra $100 every month for 30 years. MITTX (MFS Massachusetts Investors Tr A) is the oldest mutual fund in existence; it started in 1926! It is essentially a large-cap fund. All numbers are real (inflation-adjusted).

A few observations:
- the scatter cloud is much broader, primarily due to the values on the X axis (this is the standard deviation of portfolio balances for a given 30yrs cycle).
- the distribution table (horizontal log scale, portfolio balances after 30yrs of savings) looks VERY irregular.
- I assembled the same kind of chart with PIORX, another very old fund (started in 1928), and the outcome is very similar.
- I did the same with 20-year cycles of real-life funds starting in the 70s (e.g. VTSMX, NAESX, VTRIX); the outcome is quite similar too.

The randomness of the distribution baffled me a bit, so I came back to the historical monthly returns, and assembled a distribution chart with those monthly returns (real). Horizontal scale is regular in this case, not logarithmic. And we're back to what we know, this looks like a normal distribution, but with extra long tails (those few odd data points at both ends of the chart).

Then I looked at the distribution of the CAGR for all the 30yrs cycles that were included on the scatter graph (all 745 of them). Definitely more irregular! And yet the distribution table of portfolio outcomes (first graph) looks even more irregular.

So... I think we can safely conclude that the sequence of returns factor introduces a BIG amount of randomness. Of course, this comes on top of the effect of compounding, which makes a relatively small difference in average (geometric) returns turn into a really big difference at the end. Both empirical observations can easily be validated with a modicum of algebra. Quite the lottery!
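The compounding half of that observation takes only a line of arithmetic: a one-percentage-point CAGR gap, held for 30 years, ends up as roughly a 32% gap in terminal wealth.

```python
# A one-point CAGR gap, compounded over 30 years:
years = 30
terminal_7 = 1.07 ** years        # growth factor at 7%/yr
terminal_6 = 1.06 ** years        # growth factor at 6%/yr
ratio = terminal_7 / terminal_6   # (1.07/1.06)**30, roughly 1.32
assert 1.3 < ratio < 1.4          # ~32% more terminal wealth
```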

Now back to the OP's primary goal (find a way to visualize semi-realistic error bars), this seems rather hopeless to me. The real life outcomes proved just so random... And yet, there are some signals (e.g. factors premiums) hidden behind the extreme noise of a scatter graph... Overall, this was an interesting endeavor, but I am not convinced that this is a terribly useful representation, whether such scatter graph displays historical or Monte-Carlo data points.

Just for kicks, I entered the monthly historical data for total (US) bonds, using the corresponding Barclays index. Here it is. Unsurprisingly, much more compact outcomes, but also significantly lower portfolio balances at the end. Note that the standard-deviation remains quite high, but remember that this tracks the portfolio trajectory while it grows over 20 years, so sure, there is a good deal of variation, if only upwards.

Makes me think that the X axis is really not very informative in those graphs. It would probably be better to show the maximum drawdown, or, better, the Ulcer Index. Oh well, this will be for another time...

Last edited by siamond on Wed Sep 19, 2018 1:59 pm, edited 1 time in total.