In general, yes, goals scored largely follow a Poisson distribution. It is not perfect,

We don't need perfect, but does it do a good job of estimating ultra-rare events? It might do a great job of modelling "regular" games but fail at the extremes. Which means you shouldn't use it to estimate rare events.

I've got a simple theory: once a team scores 10 goals in a game the chances of them scoring another goal, until (conservatively) about 1000 goals, declines much less rapidly than a Poisson distribution assumes. Goals are not as independent a variable as the distribution models and there are potential feedback loops that make scoring that many goals far more likely.

What if the chance of scoring 13 goals in a game is actually closer to 1 in 6,000,000? Still incredibly rare but @vividox's estimation is off by over a factor of 100!

I get that the argument will be that the massive skill difference between teams (or different era) makes the probability different than in an MLS game, and I concede that, but suspect that 1 in 665,000,000 is a gross overestimation of the unlikelihood. In fact, I wouldn't be surprised if it happens in my lifetime in MLS. There's just too many weird crazy things that can happen in a football game.
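For anyone who wants to poke at this, here is a rough sketch of the comparison being argued about: a plain Poisson tail versus an overdispersed alternative (a negative binomial with the same mean), which is one common way to model goals that cluster. The mean of 1.3 goals per game and the dispersion value are assumptions picked for illustration; they are not vividox's inputs, and the negative binomial is just a stand-in for the "feedback loop" idea, not a fitted model.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_tail(k_min: int, lam: float) -> float:
    """P(X >= k_min), summed term by term instead of 1 - CDF so tiny values survive."""
    return sum(poisson_pmf(k, lam) for k in range(k_min, k_min + 200))

def negbin_pmf(k: int, mean: float, r: float) -> float:
    """P(X = k) for a negative binomial with the given mean and dispersion r.

    Smaller r means more clustering; as r grows the distribution approaches Poisson.
    """
    p = r / (r + mean)
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def negbin_tail(k_min: int, mean: float, r: float) -> float:
    """P(X >= k_min) for the negative binomial, summed term by term."""
    return sum(negbin_pmf(k, mean, r) for k in range(k_min, k_min + 2000))

LAM = 1.3  # hypothetical mean goals per team per game (assumption)
R = 5.0    # hypothetical dispersion parameter (assumption)

p_pois = poisson_tail(13, LAM)
p_nb = negbin_tail(13, LAM, R)
print(f"Poisson:           about 1 in {1 / p_pois:,.0f}")
print(f"Negative binomial: about 1 in {1 / p_nb:,.0f}")
```

Both models have the same average, so they look alike for ordinary scorelines; the gap only opens up far out in the tail, which is exactly where the 13-goal question lives.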

But San Jose is still burnt toast.

No, Poisson distributions are not extremely good at estimating games with huge numbers of goals. However, this is BigSoccer, not a betting house, and considering that we are talking about MLS, not international soccer or lower-division English soccer from the middle of last century, I'd say his estimation is good enough for this forum.

By the way, I can give you even more extreme examples in Women's soccer from just last year. USWNT vs Dominican Republic 14-0, USWNT vs Guatemala 13-0.

Oh, that's not even the worst of it...
--Kuwait's WNT lost their three Asian Cup qualification games by a grand total of 1-51.
--New Zealand's WNT regularly destroys its OFC competition, as did Australia in their later OFC years.
--I thought there was another 20-goal game recently but I can't seem to find it....
That said, it's probably not fair to compare the women's game to the men's, considering the men's game is much better-established in many countries worldwide. Also probably not fair to compare leagues and other club competitions to national-team competitions in general.

Didn't Australia beat one of the Samoas 32 or 33 to 0 in a WCQ before Australia moved to the AFC?

Australia 31-0 American Samoa in 2001.

Australia had beaten Tonga 22-0 just two days before this game to break the qualifier record. American Samoa had passport issues and most of their regular squad was unable to travel to Australia. They played several teenagers in the game, including some who had reportedly never played a 90-minute game before.

It was actually 0-0 after 10 minutes but then Australia scored 5 in 7 minutes and the rout was on. It was 16-0 at halftime. Archie Thompson scored 13 goals in the game himself.

That's a great question, I'm glad you asked. I think I'll do a goodness of fit analysis tonight.

The problem with rare events (at least in terms of our situation, super-high-scoring games) is that they're so rare that you wouldn't have the sample size to justify arguing one distribution over another, at least assuming both tail off on the high end - upwards of eight goals or so I'd guess. But good luck!
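A quick way to see the sample-size problem is to ask how many high-scoring games a Poisson model would even expect in a realistic pile of matches. The sketch below uses made-up inputs (5,000 team-games and a mean of 1.4 goals, both assumptions for illustration, not real MLS figures).

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def expected_tail_games(n_games: int, lam: float, k_min: int) -> float:
    """Expected number of games, out of n_games, with at least k_min goals."""
    tail = sum(poisson_pmf(k, lam) for k in range(k_min, k_min + 200))
    return n_games * tail

N_GAMES = 5000  # hypothetical number of team-games in the sample (assumption)
LAM = 1.4       # hypothetical mean goals per team per game (assumption)

for k_min in (4, 6, 8, 10):
    expected = expected_tail_games(N_GAMES, LAM, k_min)
    print(f"{k_min}+ goals: expect about {expected:.4f} games")
```

Once the expected count for a bin drops below one game, the data can't really say whether the Poisson tail or a slightly fatter one is the better description.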

Perhaps the better question isn't whether it has ever happened in soccer history, but whether it has ever happened in MLS history. The #1 reason for massive blowouts like the ones that have been mentioned up to this point is a massive difference in quality between the teams playing, and that simply doesn't exist in MLS. Australia dropping 30 goals on American Samoa shouldn't be a shocker, because Australia is just that much better than American Samoa. However, San Jose dropping 13 goals on Colorado? That simply shouldn't happen, because the teams are much more comparable in quality.

That's why I said earlier that, in general, comparing club competitions to NT games probably isn't a good idea. In fact it's not just MLS; even leagues that see blowouts rarely see double-digit blowouts. (The only top-division league I can think of where that happens normally is the French Women's league, though admittedly I don't know many leagues from countries outside the main "Westernized" nations.) The largest victories in MLS have been two 7-goal margins and four 6-goal margins.

You're asking some very good questions.

For the kind of stuff we are estimating, 1 in 6,000,000 and 1 in 665,000,000 aren't very "different". Yes, they are off by a factor of 100+, but for any relevant sample size (in this case, we are talking about a one-off game, but this applies even if you are talking about a season's worth of games or even all MLS games) you are going to get very similar results using each approximation.

As aperfectring said, it's good enough for what we're doing here on BigSoccer (specifically because of the relatively small sample size). Now, if one were running millions of simulations on these numbers, the choice of estimate would make a very significant difference to our results, so we would care a lot more about the difference between them.
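To put rough numbers on that, here is a small sketch of the chance of seeing the event at least once in n independent games under each of the two per-game estimates. The per-game probabilities are the ones quoted in the thread; the game counts are arbitrary round numbers chosen for illustration, and independence between games is an assumption.

```python
import math

def prob_at_least_once(p: float, n_games: int) -> float:
    """P(the event happens at least once in n_games independent games)."""
    # Equivalent to 1 - (1 - p)**n_games, but log1p/expm1 stay accurate for tiny p.
    return -math.expm1(n_games * math.log1p(-p))

P_LOW = 1 / 665_000_000  # the Poisson-based estimate quoted in the thread
P_HIGH = 1 / 6_000_000   # the "what if" alternative quoted in the thread

# Illustrative sample sizes (assumptions): one game, a few hundred,
# a few thousand, and a simulation-scale ten million.
for n in (1, 300, 5_000, 10_000_000):
    low = prob_at_least_once(P_LOW, n)
    high = prob_at_least_once(P_HIGH, n)
    print(f"n = {n:>10,}: {low:.6%} vs {high:.6%}")
```

At forum-relevant sample sizes both answers round to "basically never"; only at simulation scale does the factor of 100 turn into a visibly different result.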

Well, there is enough to justify choosing between radically different distributions. For instance, you wouldn't use a normal or an exponential or a Weibull or a gamma or a uniform. Of all the major types of distributions, Poisson is the best fit. To get a better fit than that, you have to start splicing together distributions based on specific assumptions about the data you are modeling. For instance, we can all pretty well agree that it would be impossible to score 25 goals against an evenly matched opponent, even though the Poisson distribution is going to give you a non-zero (albeit very, very, very small) probability of that happening.
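As a minimal sketch of the 25-goal point: a plain Poisson always leaves a sliver of probability out there, while the crudest possible "splice" (a hard cap with the remaining probability renormalised) sets it to exactly zero. The mean of 1.3 and the cap of 15 goals are assumptions for illustration only, and a hard cap is just one simple way to encode that kind of belief, not necessarily the splice anyone here has in mind.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable with mean lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def truncated_poisson_pmf(k: int, lam: float, k_max: int) -> float:
    """Poisson pmf restricted to 0..k_max and renormalised; zero above k_max."""
    if k < 0 or k > k_max:
        return 0.0
    norm = sum(poisson_pmf(j, lam) for j in range(k_max + 1))
    return poisson_pmf(k, lam) / norm

LAM = 1.3   # hypothetical mean goals per team per game (assumption)
K_MAX = 15  # hypothetical "physically impossible beyond this" cap (assumption)

print(f"Poisson   P(X = 25): {poisson_pmf(25, LAM):.3e}")
print(f"Truncated P(X = 25): {truncated_poisson_pmf(25, LAM, K_MAX):.3e}")
```

For everyday scorelines the two versions are indistinguishable, since almost none of the Poisson's mass sits above the cap.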

Right, of course; the point I was trying to make about distribution used was essentially what you said (much better) in your response to BHTC Mike. =-) If you have two distributions that aren't qualitatively different in the high-end tail, the differences between minute probabilities you get don't make argumentative sense over the sample size we're dealing with (some number of soccer games, be it one or a season's worth or MLS' history).

Wouldn't a 13-12 win for San Jose and a shutout loss by Colorado also put the Earthquakes in the playoffs? Since the second tie-breaker is GS and not GD, San Jose doesn't need to blow out Dallas by 13 goals, they just need to score 13 goals (or enough to surpass Colorado's number).

This is correct. I am personally hoping for a 13-13 tie. It would be so ironic for them to get the 13 goals they need, but not the win they need. It would also be a hell of a game to watch.

Sounds like something that would happen. Dallas and SJ should do a throwback night in honor of Ramiro's retirement. Not to early MLS, but to early soccer. Forget 4-4-2 or 4-3-3, let's do 2-3-5 or 2-2-6.

Plus, Wondo needs his boot back. If he gets 12 or 13 he should be in a good position to retain it.

The real story of the weekend will be how badly NYRB needs this trophy. The two teams that folded from MLS each have more trophies than NYRB does. Interestingly, the two worst teams in MLS could have an indirect hand in deciding NYRB's fate, either by helping a team get to the trophy or by forcing Chicago to play in the game that decides their playoff fate.

On the other hand, assuming the most recent corresponding (home/road) scores between the teams do repeat themselves in the upcoming games, the following will be the outcome for the Final Week of Regular Season 2013:

(in the absence of the most recent corresponding games this season, the most recent scores regardless of home/road would apply)

0. The Standings at 5 rounds before the end were normalized for each Regular Season, where every team played the same # of games.

1. The points for 1996-1999 were indicative only, based on Shoot-outs=Ties.

2. All but 3 SS Leaders (i.e. S.J. 2002-03 & Seattle 2013) 5 rounds before the end have won the Supporters' Shield.

3. S.J. 2003 were in fact the only such SS Leaders that failed to win their conference.

4. In all but 3 seasons (i.e. Tampa 2000, Houston 2009 & Seattle 2013), the Closest Challengers to SS Leaders 5 rounds before the end have either won or finished as Runners-up to the Supporters' Shield.

5. Despite the 2001 season being cut short due to 9/11, 5 rounds before the end was still defined as Week #23, and Miami was leading by then.

6. The slimmest lead by the SS Leaders 5 rounds before the end was 0 points, as seen in 1999.

7. The largest lead by the SS Leaders 5 rounds before the end was 8 points, as seen in 1997, and it has not been matched since.