February 2, 2007

Statistically out of the playoffs

There's often a lot of talk about the point where a team is "mathematically out of the playoffs". Websites typically put an 'e-' in front of a team for whom it is impossible to make the playoffs. Often, well before you are officially eliminated, the situation required for you to make the playoffs would have to be really bizarre. That is to say, in order for Philadelphia to make the playoffs at this point they'd need to win a lot of games and all the top teams would need to start losing. Sometimes the situation is so unlikely it's better to just conclude it's impossible, statistically speaking.

For example, we currently have about 30 games left. A 50% team has a 43% chance of playing above 50% and a 43% chance of playing worse (leaving 14% for exactly 15 wins out of 30 games). What I care about here is that there's a 5% chance that a 50% team wins 20 or more games. In other words, if a team needs to win 20/30 games to make the playoffs then they have a 5% chance of making the playoffs (if they're average). Of course most bad teams (those who don't make playoffs) are below 50%, but I'm assuming a team might play a little better to make the playoffs. This is done following the simple binomial distribution. As you get closer to the last game of the season there is a higher likelihood that you could win 100% of the remaining games: (1/2)^(games left), and a similar relationship exists for winning 90% or 80% of remaining games.
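As a quick check of the numbers above, here is a minimal Python sketch of the binomial calculation. The function names are my own for illustration, and it assumes each game is an independent coin flip for an average team.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k wins in n games with win probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n, p=0.5):
    """Probability of k or more wins in n games."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

games_left = 30
print(round(prob_at_least(16, games_left), 3))  # better than 50%: about 0.43
print(round(binom_pmf(15, games_left), 3))      # exactly 15 wins: about 0.14
print(round(prob_at_least(20, games_left), 3))  # 20 or more wins: about 0.05
print(0.5 ** games_left)                        # winning every remaining game
```

Changing games_left shows how the chance of winning out grows as the season winds down.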

Of course winning percentage doesn't make much sense in the new NHL. I use a system that assumes each team will get 8 additional points during the course of the season (10% of games) and so winning percentage equals:
w% = (W+OTL/2)/GP-0.05

Due to the fact that the binomial distribution is discrete you can't solve for an exact value so I did a regression on the last 30 games to show the sort of functional relationship between games remaining and statistically impossible winning percentage.
cut-off = 0.763 - 0.452 *ln(ln(ln(games left))) [or = 1 if games left < 7]

Although it looks complicated, the nested 'ln' functions are necessary to get the appropriate curvature.
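To see how the fitted curve compares with the discrete binomial, here is a rough Python sketch. The cutoff function is the equation above; exact_cutoff is my own illustrative helper (not necessarily how the regression was built) that finds the smallest win fraction a 50% team reaches with probability of at most 5%.

```python
from math import comb, log

def cutoff(games_left):
    """Fitted cut-off winning percentage from the equation above."""
    if games_left < 7:
        return 1.0
    return 0.763 - 0.452 * log(log(log(games_left)))

def exact_cutoff(games_left, alpha=0.05):
    """Smallest k/n such that a 50% team wins k or more with probability <= alpha.

    A result above 1 means even winning every game is more likely than alpha.
    """
    n = games_left
    tail = 0.0
    for k in range(n, -1, -1):
        tail += comb(n, k) / 2**n   # add P(exactly k wins)
        if tail > alpha:
            return (k + 1) / n      # k wins is too likely, so the cut-off is k + 1
    return 0.0

for n in (10, 20, 30):
    print(n, round(cutoff(n), 3), round(exact_cutoff(n), 3))
```

The exact values jump in steps of 1/n, which is why a smooth fit can only approximate them.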

Using this equation and a guess for the number of points to make playoffs (~93 points) you can estimate the required winning percentage:
Required% = (93-2*W-OTL)/[2*(82-Games Played)]-0.05.

And if required% > cut-off then you have less than a 5% chance of making the playoffs.

Of course the opposite is also true: if required% < (1-cut-off) then the team is statistically making the playoffs.
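Putting the pieces together, here is a small Python sketch of the whole test. The 93-point target and the formulas come from above; the function names and the three team records are made up purely for illustration, and it assumes at least one game remains.

```python
from math import log

SEASON_GAMES = 82
PLAYOFF_POINTS = 93  # the guessed playoff line used above

def cutoff(games_left):
    """Fitted cut-off winning percentage (1 if fewer than 7 games are left)."""
    if games_left < 7:
        return 1.0
    return 0.763 - 0.452 * log(log(log(games_left)))

def required_pct(wins, otl, games_played, target=PLAYOFF_POINTS):
    """Winning percentage needed over the remaining games to reach the target."""
    games_left = SEASON_GAMES - games_played
    return (target - 2 * wins - otl) / (2 * games_left) - 0.05

def status(wins, otl, games_played):
    """Classify a team as statistically 'out', 'in', or 'undecided'."""
    games_left = SEASON_GAMES - games_played
    req = required_pct(wins, otl, games_played)
    cut = cutoff(games_left)
    if req > cut:
        return "out"
    if req < 1 - cut:
        return "in"
    return "undecided"

# Hypothetical records after 52 games (not real standings)
print(status(15, 5, 52))  # needs 58 of 60 possible points
print(status(36, 4, 52))  # needs 17 of 60 possible points
print(status(25, 5, 52))  # needs 38 of 60 possible points
```

With 30 games left the cut-off is about 0.67, so the first team is out, the second is in, and the third is still undecided.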

Of course, all of this assumes an average team; if the team is better than 50% for whatever reason (star player just got back from long term injury), its chances are better than these numbers suggest.

Most of the results are obvious, but in the West the Blue Jackets, Blackhawks and Kings are out, and in the East just Philadelphia. The Ducks, Predators and Red Wings are making the playoffs in the West, and it appears that Buffalo has secured a spot out East. These statistics are presented on my website.


Anonymous said...

Combine that with linear programming and the sports elimination problem to consider the true elimination.


JavaGeek said...

I liked the paper, but its primary objective is to prove there is a "cutoff value", which I've already assumed is 93. The NHL is more complicated than the paper suggests, as it has the wonderful OTLs, although the 3-point soccer games mentioned would create similar problems (and the paper works with them).

The NHL is organized by conference, and let's say the West won all its games vs. the East. This would boost the cut-off value by 10 points in the West and lower it by 10 points in the East.

Generally one can easily estimate this W* by simply using a normal distribution of points:
For example in the NHL the 8th position fills in 1/15th of the normal distribution at the center:
Cut-off = Z*sqrt(82*0.5*0.5)+82
= 92.7
However given that the western conference is stronger:
West = 1.025*92.7 = 95
East = 92.7/1.025 = 90

I used the 93 value for both for now until it gets a little closer to the end.

However the question I was asking:
What is the probability that the given team can get above the cut-off value assuming they're average?

Anonymous said...

Here's for the NHL (if I remember correctly it's a bit behind (03-04) but I haven't reread it).

While 93-95 is a good estimate, why estimate when you can calculate the true value? Sec. 4.3 addresses incorporating probability into the program.

To be honest, when I do my estimates I use your method because my strength is statistics, not linear programming. I was hoping someone with more time on their hands would work on the current NHL format and combine the Pythagorean puck with this method. It would be an interesting exercise.

JavaGeek said...

We used our method to determine qualification / elimination with respect to R1 on the data from the 2003–2004 NHL season, the most recent completed season (the 2004–2005 was cancelled due to labor dispute), and found it to be an improvement over rules used by the media described in the introduction. For most of the teams the simple rule was able to detect qualification and elimination at the same time as our method, but our method determined elimination of one team early and qualification of one team early. We were able to announce the qualification of the Colorado Avalanche on March 20th instead of March 22nd 2004 as was predicted by the simple rule, and were able to announce the elimination of the Anaheim Mighty Ducks on March 25th instead of March 26th 2004.

This is just to show that the theory doesn't seem to do anything; the results are over 90% the same...