This is a continuation of the Lions example we did the other day. Every nicety you can think of has been ignored. I won’t even guarantee that I have copied the win/loss record correctly, as I did this by hand. The probabilities that at least one team wins or loses all its games are printed below.
This table shows, for each team, the probability of winning 0 games, 1 game, …, 16 games. It has been sorted so that the team (the Patriots) with the highest probability of winning 16 is first, and the team (the Lions) with the lowest probability is shown last. All probabilities are rounded: probabilities less than 1% are shown as 0. The most likely number of games won is in bold.
Somebody remind me after the season to check how good these predictions were.
I used data from 2002 until 2008. In 2002, the NFL changed the league structure (they increased the number of divisions), so this felt like a natural point of demarcation. All data weighted equally. No account of the fact that teams are constrained to winning a certain number of games has been taken. For example, suppose there are only two teams in the entire league: it is then impossible that both can win (or lose) all their games. All ties (only one) have been counted as wins.
These are predictive distributions. The Lions truly stink.
Overall: the probability that at least one team wins 0 games is about 2%. The probability that at least one team wins 16 games is about 3%.
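The post does not spell out the model behind these numbers. One plausible reading, and it is only an assumption, is a beta-binomial predictive distribution: each team's wins are treated as binomial, with a uniform prior on the win probability updated by the 2002-2008 record. The sketch below shows that calculation, plus the "at least one team" probability under the independence simplification described above. The function names and the 29-wins-in-112-games record are illustrative, not taken from the post.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def predictive_wins(past_wins, past_games, season_games=16):
    """P(k wins next season) for k = 0..season_games.

    Assumes a uniform Beta(1, 1) prior on the win probability,
    updated by the past record (an assumption, not the post's stated method).
    """
    a = 1 + past_wins
    b = 1 + past_games - past_wins
    return [
        exp(
            log_choose(season_games, k)
            + log_beta(a + k, b + season_games - k)
            - log_beta(a, b)
        )
        for k in range(season_games + 1)
    ]


def prob_at_least_one(team_probs):
    """P(at least one team hits the event), treating teams as independent."""
    none = 1.0
    for p in team_probs:
        none *= 1.0 - p
    return 1.0 - none


# Illustrative record for a weak team: 29 wins in 112 games (7 seasons x 16).
dist = predictive_wins(29, 112)
for k, p in enumerate(dist):
    print(f"{k:2d} wins: {100 * p:5.1f}%")
```

Feeding every team's probability of 0 wins (or of 16 wins) into `prob_at_least_one` gives a league-wide figure of the kind quoted above. Because it treats teams as independent, it ignores the constraint that teams play each other.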
Team | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Patriots | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 5 | 10 | 16 | 21 | 21 | 16 | 8 | 2 |
Colts | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 4 | 8 | 13 | 18 | 20 | 17 | 11 | 5 | 2 | 0 |
Titans | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 5 | 9 | 15 | 19 | 19 | 15 | 9 | 4 | 1 | 0 |
Steelers | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 6 | 10 | 15 | 19 | 19 | 14 | 9 | 4 | 1 | 0 |
Eagles | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 6 | 11 | 16 | 19 | 18 | 14 | 8 | 3 | 1 | 0 |
Giants | 0 | 0 | 0 | 0 | 1 | 2 | 5 | 9 | 14 | 18 | 19 | 15 | 10 | 5 | 2 | 0 | 0 |
Chargers | 0 | 0 | 0 | 0 | 1 | 2 | 5 | 10 | 15 | 18 | 18 | 15 | 9 | 5 | 2 | 0 | 0 |
Panthers | 0 | 0 | 0 | 0 | 1 | 3 | 6 | 11 | 16 | 18 | 17 | 13 | 8 | 4 | 1 | 0 | 0 |
Packers | 0 | 0 | 0 | 0 | 1 | 3 | 7 | 12 | 16 | 19 | 17 | 12 | 7 | 3 | 1 | 0 | 0 |
Broncos | 0 | 0 | 0 | 0 | 1 | 3 | 7 | 12 | 16 | 19 | 17 | 12 | 7 | 3 | 1 | 0 | 0 |
Seahawks | 0 | 0 | 0 | 0 | 1 | 4 | 8 | 13 | 17 | 18 | 16 | 12 | 7 | 3 | 1 | 0 | 0 |
Ravens | 0 | 0 | 0 | 1 | 2 | 4 | 8 | 13 | 17 | 18 | 16 | 11 | 6 | 2 | 1 | 0 | 0 |
Bears | 0 | 0 | 0 | 1 | 3 | 7 | 11 | 16 | 18 | 17 | 13 | 8 | 4 | 1 | 0 | 0 | 0 |
Buccaneers | 0 | 0 | 0 | 1 | 3 | 7 | 11 | 16 | 18 | 17 | 13 | 8 | 4 | 1 | 0 | 0 | 0 |
Jaguars | 0 | 0 | 0 | 1 | 3 | 7 | 12 | 17 | 18 | 17 | 12 | 7 | 3 | 1 | 0 | 0 | 0 |
Cowboys | 0 | 0 | 0 | 1 | 4 | 8 | 13 | 17 | 18 | 16 | 11 | 7 | 3 | 1 | 0 | 0 | 0 |
Vikings | 0 | 0 | 0 | 1 | 4 | 8 | 13 | 17 | 18 | 16 | 11 | 7 | 3 | 1 | 0 | 0 | 0 |
Falcons | 0 | 0 | 0 | 2 | 4 | 9 | 14 | 18 | 18 | 15 | 11 | 6 | 3 | 1 | 0 | 0 | 0 |
Saints | 0 | 0 | 1 | 2 | 5 | 9 | 14 | 18 | 18 | 15 | 10 | 5 | 2 | 1 | 0 | 0 | 0 |
Chiefs | 0 | 0 | 1 | 2 | 5 | 9 | 14 | 18 | 18 | 15 | 10 | 5 | 2 | 1 | 0 | 0 | 0 |
Cardinals | 0 | 0 | 1 | 2 | 5 | 10 | 15 | 18 | 18 | 14 | 9 | 5 | 2 | 1 | 0 | 0 | 0 |
Dolphins | 0 | 0 | 1 | 3 | 7 | 12 | 16 | 18 | 17 | 13 | 8 | 4 | 1 | 0 | 0 | 0 | 0 |
Redskins | 0 | 0 | 1 | 3 | 7 | 12 | 16 | 18 | 17 | 13 | 8 | 4 | 1 | 0 | 0 | 0 | 0 |
Jets | 0 | 0 | 1 | 3 | 7 | 12 | 16 | 18 | 17 | 13 | 8 | 4 | 1 | 0 | 0 | 0 | 0 |
Bills | 0 | 0 | 1 | 3 | 7 | 12 | 17 | 19 | 16 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 |
Bengals | 0 | 0 | 1 | 3 | 7 | 12 | 17 | 19 | 16 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 |
Texans | 0 | 0 | 1 | 3 | 7 | 12 | 17 | 19 | 16 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 |
Rams | 0 | 0 | 1 | 3 | 7 | 12 | 17 | 19 | 16 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 |
49ers | 0 | 1 | 3 | 7 | 13 | 17 | 19 | 17 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 | 0 |
Browns | 0 | 1 | 4 | 9 | 14 | 19 | 19 | 15 | 10 | 6 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |
Raiders | 0 | 2 | 7 | 13 | 18 | 20 | 17 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
Lions | 1 | 5 | 12 | 19 | 21 | 18 | 12 | 7 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Interesting. It appears the probability of winning N games is normally distributed about the mean number of wins. Not sure how useful that is. Going over the Super Bowl records from 2002-2008, the Patriots appeared 4 times while the Colts, Steelers, Eagles and Giants appeared once each and the Titans not at all.
I know you’re interested in the probability of the number of wins per season, but if you really want to determine “worth,” wouldn’t it be better to rank them by total wins / total games?
Sadly though, by that measure, the Lions truly are the worst by a huge margin. Even that measure is questionable as the next worst team appeared in the SB once during the study period (Raiders in 2003).
I think you are on the wrong track. I assume your historic data is won/lost records for 2002-2008. That data is not the determining factor for 2009 wins and losses. If that data is not the cause, how good are your predictions?
Let’s do a thought experiment. Let’s say a good high school team plays an average college team: who wins? A good college team vs. a bad NFL team? The team with players who are bigger-stronger-faster wins.
The reality (or closer to reality) is that the teams with the better players and coaches win more games. The players on a team in 2002 are not the same people in 2009, and each player in 2009 who was playing in 2002-2008 may be with a different team, and his skills are either improving or eroding. And general managers and scouts change over time. Not to mention injuries.
But in any event, picking winners straight up is easier than selecting winners with the point spread. You are the teacher, but doesn’t there have to be a cause and effect relationship in order to make good statistics and good predictions?
Have you considered using a categorical method like a Bradley-Terry-Luce model? Might be interesting to see the parameter estimates for the odds ratios.
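For readers who have not met it: a Bradley-Terry model gives each team a positive strength, and the probability that team i beats team j is strength_i / (strength_i + strength_j). A minimal sketch of fitting it with the standard iterative (minorize-maximize) update follows. The head-to-head counts are made up for illustration, not real NFL results, and `fit_bradley_terry` is a hypothetical name.

```python
def fit_bradley_terry(wins, iterations=200):
    """Fit Bradley-Terry strengths by the iterative MM update.

    wins[i][j] is the number of times team i beat team j (diagonal is 0).
    Returns strengths normalized to sum to 1.
    """
    n = len(wins)
    strength = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i])
            # Games played against each opponent, weighted by current strengths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        scale = sum(updated)
        strength = [s / scale for s in updated]
    return strength


# Hypothetical head-to-head results among three teams (not real data).
wins = [
    [0, 3, 4],
    [1, 0, 3],
    [0, 1, 0],
]
strength = fit_bradley_terry(wins)

# Estimated odds that team 0 beats team 1, and the matching probability.
odds_0_over_1 = strength[0] / strength[1]
prob_0_beats_1 = strength[0] / (strength[0] + strength[1])
print(strength, odds_0_over_1, prob_0_beats_1)
```

The ratios of fitted strengths are the odds the comment asks about. Unlike the win-total approach, this uses who beat whom, so it needs game-level results rather than season records.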
Brian,
Of course selecting by point spread is harder. The bookies want the bet payoff to average to zero as they intend to profit only from their take (at least the smart ones do).
As for cause/effect, it helps to know it and base results on it but it’s not absolutely necessary. For example, it’s possible to predict sunrise tomorrow with surprising accuracy without ever knowing or stating the reason why. The prediction is made solely on past performance. But of course that’s no guarantee. The rotational rate of the Earth changed radically over a period measured in days (from around 8 hours/day to 5 hours/day) during the event that created the Moon.
Robert Burns: “The team with players who are bigger-stronger-faster wins”
Is that assumption warranted? Maybe smarter is the main driving factor. Even if your assumption is correct, which of the three counts more? And exactly what does “stronger” mean? Can lift more weight? Pushes harder? More odorous? Just plain “better”? “Bigger” how? Height? Weight? Girth? Are ‘bigger’ and ‘stronger’ correlated?
Brian, my apologies. All of that should have been directed to RB. I must be getting cross-eyed as I age.
To DAV
By “bigger-stronger-faster” I meant better football players, and that includes being football smarter. And if you have ever played a physical sport and been greatly overmatched, you would agree with my assumption. Of course, if you won, it was because you are smarter. LOL.
I don’t know how to measure any of those factors. But I do know that a team has gone from 1 win in a year to 11 wins the next year.
DAV, Brian,
The method used is categorical and does not use a normal distribution. It cannot be that distribution for three reasons: a normal would give non-zero probability to events like “games won greater than 10.1 and less than 10.9”, which is obviously an impossibility; it would give positive probability to events like “games won less than 0” and “greater than 16”; and it would give zero probability to events like “games won equals 10.”
Normal distribution methods can be used as decent approximations for discrete/categorical problems provided the number of categories is large, so that the problems above are small. 16 cannot be considered large, though, so a normal would give a poor approximation.
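A rough numerical illustration of those three problems, under assumptions that are mine and not the method used for the table: take a plain binomial with 16 games and a hypothetical win probability of 0.25, then match a normal to its mean and standard deviation.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd)."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


# Hypothetical weak team: 16 games, win probability 0.25 (illustrative only).
n, p = 16, 0.25
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
mean = n * p
sd = sqrt(n * p * (1 - p))

# Mass the matched normal puts on impossible outcomes.
below_zero = normal_cdf(0.0, mean, sd)
above_sixteen = 1.0 - normal_cdf(16.0, mean, sd)
between_4_1_and_4_9 = normal_cdf(4.9, mean, sd) - normal_cdf(4.1, mean, sd)

print(f"normal P(wins < 0)       = {below_zero:.4f}")
print(f"normal P(wins > 16)      = {above_sixteen:.2e}")
print(f"normal P(4.1 < wins < 4.9) = {between_4_1_and_4_9:.4f}")
print(f"discrete P(wins = 4)     = {pmf[4]:.4f}  (normal gives exactly 0)")
```

The discrete distribution puts all of its mass on the integers 0 through 16, so each of the first three quantities is probability the normal assigns to outcomes that cannot happen.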
Robert Burns,
I cannot prove that a team’s past record is (at least somewhat) probative of its future performance, but it is certainly rational to think so. In any case, I am willing (before 1 pm Eastern today) to collect other predictions. We can compare after the season ends.
Robert,
Couldn’t one measure of “better” be total won/total games?
One problem with Briggs’ approach is that there are too many cells for the data. The maximum value in any given cell is 7; so the percentages displayed have wider limits. Wide enough that you could say the teams are all about equal with the Patriots tending toward better and the Lions tending toward worse. The TotWon/TotGames method has at best a max value around 10 — also too small to appreciably narrow the limits and, indeed, doesn’t appreciably change the rankings. Using the total win ratio (and assuming my games won recreation from Briggs’ table is close enough), the Patriots beat the Colts by one SD. The Lions are at the bottom with a Z=-2.26 while the Patriots are at Z=+2.59 but don’t put too much stock into that. The Patriots won around 12 games more than the Colts in 7 years — an average of not much more than one more game per season. The comparison remains nearly the same for the rest of the top 10 teams. I’d say they were closely matched.
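For anyone wanting to reproduce that kind of comparison, here is a small sketch of standardizing total-win ratios across teams. The win totals are placeholders, not the actual records, and I am assuming the population standard deviation; the comment does not say which was used.

```python
from statistics import mean, pstdev

# Placeholder seven-season win totals out of 112 games (not real records).
total_wins = {"Team A": 86, "Team B": 74, "Team C": 56, "Team D": 50, "Team E": 29}
games = 112

# Total wins / total games for each team.
ratios = {team: w / games for team, w in total_wins.items()}

# Standardize each ratio against the league-wide mean and spread.
mu = mean(ratios.values())
sigma = pstdev(ratios.values())  # population SD; an assumption
z_scores = {team: (r - mu) / sigma for team, r in ratios.items()}

for team, z in sorted(z_scores.items(), key=lambda item: -item[1]):
    print(f"{team}: ratio {ratios[team]:.3f}, Z = {z:+.2f}")
```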
Still, in an even money wager, I’d bet the Patriots in any given game. The point spread handicap undoubtedly works quite well. Frankly, I think it seriously dampens any hope of actually making real money on football wagers — excepting bookies of course.
Briggs,
Yes. I tend to use “resembles” and “is” interchangeably because I think in images. “Plotting” the table in my head, it appears to me that the cells equidistant from each row mean are roughly equivalued. I suppose I should be more careful in my speech. I usually use a Dirichlet for discrete data. I say ‘usually’ but I can’t remember a time when I used a normal approximation.
Gaghhh! The total win max is 16/season! The max cell value is of course, 112. Maybe I should get some sleep.
Interesting binomial data structure and statistical results.
Football!!! Here is my idea of a Super Bowl.
The NFL is creating a partnership with researchers at Boston University who are studying the long-term effects of brain injuries on players, the Associated Press reported. “It’s huge that the NFL actively gets behind this research,” Robert Cantu, the co-director of the school’s research program, told the AP. “It forwards the research. It allows players to realize the NFL is concerned about the possibility that they could have this problem and that the NFL is doing everything it can to find out about the risks and the preventive strategies that can be implemented.”
Merry Christmas! And have a good time!