So I’m going over 2005-2006 data to enhance my standings predictions. I was a little shocked that, for example, the Northwest division sorted by point projections was negatively correlated with actual points. In other words, the teams I projected to finish highest tended to finish lowest.

But in the process I discovered something interesting about OTs. Overtime should be a function of skill; that is to say, if two evenly matched teams play each other, they should be more likely to go to overtime than, say, Phoenix and Detroit. However, a quick binary regression shows there’s little significance to this assumption on a game-by-game basis. 22% of all games go to OT, with little of that explained by skill; if anything, the regression shows a slight tendency for good teams to go to overtime less. A better way of saying this is that Ottawa and Detroit went to overtime far fewer times than average teams. Maybe more interesting is that there is virtually no correlation between teams’ winning percentages outside overtime and their records in overtime. It shouldn’t be too much of a surprise, then, that the overtime system in the NHL is completely random. I’m saying all this to say that I have good reason to conclude that overtime occurs randomly given any two teams, and that the results once in overtime are completely random. Basically, every game has about a 1/4 chance of going to overtime, and then each team has a 50% chance of winning.
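To make the null model concrete: if every game reaches overtime with the same league-wide probability regardless of who is playing, a team's OT-game count over a season is just a binomial draw. A minimal sketch (the 22% rate is from the post; everything else is textbook binomial arithmetic):

```python
import math

# Null model from the post: every game goes to OT with the same
# probability p = 0.22, independent of the two teams' skill, so a
# team's OT-game count over an 82-game season is Binomial(82, 0.22).
p, n = 0.22, 82

mean = n * p                     # expected OT games per team
sd = math.sqrt(n * p * (1 - p))  # binomial standard deviation

print(f"expected OT games per team: {mean:.1f} +/- {1.96 * sd:.1f}")
```

If actual team-by-team OT counts mostly fall inside that ±1.96 SD band, there is little room left for a skill effect, which is what the regression above suggests.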

The consequences may not be obvious immediately, but the first thing that comes to mind is that this guarantees 22% of the NHL standings are the result of pure randomness (above the normal randomness you would observe anyway). What I’m saying is that teams will get about 28 free points from overtime games (95% confidence interval of (21, 35)). As you can imagine, a team that only gets 20 of these points will have to be 7 points (10%) better than average just to make the playoffs, while a team that gets 34 points in overtimes can be 10% worse.
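A quick Monte Carlo makes the "free points" figure plausible. The model below is my own sketch, not necessarily the exact calculation behind the (21, 35) interval: each of 82 games goes to OT with probability 0.22, and an OT game is worth 2 points on a win or 1 on a loss, each with probability one half.

```python
import random

random.seed(1)

# Sketch of the "free points" claim: each of a team's 82 games goes to
# OT with probability 0.22; once there, the team earns 2 points (win)
# or 1 point (loss) with equal probability.
def season_ot_points(games=82, p_ot=0.22):
    pts = 0
    for _ in range(games):
        if random.random() < p_ot:  # game reaches overtime
            pts += 2 if random.random() < 0.5 else 1
    return pts

seasons = [season_ot_points() for _ in range(100_000)]
mean = sum(seasons) / len(seasons)
print(f"mean OT points per team: {mean:.1f}")  # about 27
```

The expected value works out to 82 × 0.22 × 1.5 ≈ 27 points, right around the 28 quoted above.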

Obviously overtime is fun to watch; it’s always exciting to see shootouts, there’s no question there. But if you think about what I said, you’d realize that determining who gets the extra point via a coin toss would produce the same results. Personally I find this frustrating; if it’s so important to have a winner, this works, but it simply hurts the overall ranking of teams, and of course this is on top of the scheduling problems. Interestingly, this should make the NHL appear more competitively balanced, since it pushes the results closer to random.

What does this mean to me? Well, I realized my current algorithm, which assumed the better team wins the overtime more often than the worse team, was incorrect, and I was also incorrect that overtimes only occur between teams of similar skill. This means that I will randomly predict overtimes for all games played: every matchup gets the same chance of going to overtime, and once there each team gets a 50% chance of winning.
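The revised rule can be sketched like this. The function name and the skill-based regulation probability are hypothetical stand-ins for whatever the actual algorithm uses; the point is only where skill does and does not enter:

```python
import random

# Hypothetical sketch of the revised prediction rule described above:
# overtime is independent of skill, so every matchup gets the same OT
# probability, and the OT result itself is a coin flip.
P_OVERTIME = 0.22  # league-wide rate from the post

def predict_game(p_home_regulation_win):
    """Return (winner, went_to_ot) for one simulated game.

    p_home_regulation_win is the model's skill-based probability that
    the home team wins *if the game is decided in regulation*.
    """
    if random.random() < P_OVERTIME:
        # OT result is pure chance, regardless of skill.
        return ("home" if random.random() < 0.5 else "away", True)
    return ("home" if random.random() < p_home_regulation_win else "away", False)
```

Skill still drives the regulation outcome; it is only the OT branch that becomes a coin toss.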

## 5 comments:

These results make absolutely no sense from a common-sense point of view. How can the skill level of teams not be a factor in which games go into OT? If we put a team of 15-year-olds in the NHL, are you saying that 22% of their games will go into OT and that they will win half of those games just through pure randomness? That's nonsense.

When you get results like those you describe above, you shouldn't conclude that going to OT is a random event; rather, you should start asking questions like 'Are my assumptions correct?' and 'What am I missing?'.

A great example happened today:

Anaheim lost in a shootout to Chicago.

Obviously over 15 years you might find something (but the new OT rules were started only recently).

It shouldn't be a shock that a 5-minute OT or a shootout is essentially random. You can show this with the Poisson data or some simple probabilities for the shootout.
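Here is roughly what the Poisson argument looks like. The combined scoring rate is an assumption on my part (6 goals per 60 minutes is a round number near the 2005-06 league rate, and 4-on-4 OT play is faster, so treat it as illustrative only):

```python
import math

# Rough Poisson sketch: if combined OT scoring runs at r goals per
# 60 minutes (r = 6 is an assumed round number near the 2005-06 league
# rate), the chance a 5-minute OT produces any goal at all is modest,
# so many OT games reach the shootout and small-sample noise dominates.
r = 6.0                 # assumed combined goals per 60 minutes
lam = r * 5 / 60        # expected goals in a 5-minute OT
p_goal = 1 - math.exp(-lam)
print(f"P(at least one OT goal) = {p_goal:.2f}")
```

With so few expected goals in five minutes, even a real skill gap has very little time to express itself before the shootout.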

Now, how bad teams get to OT just as often as average teams is a little confusing to me. Obviously I'll wait and see if the data changes, but I shouldn't assume something exists if I can't prove it, even if that doesn't make sense.

Obviously the conclusion is overstated; there is never zero correlation. But based on this data, assuming it is all random is the better alternative.

Feel free to prove me wrong; the data I use is freely available from the NHL.

You will have a hard time convincing me that shootouts are random when Dallas was 12-1, Carolina was 8-2, the New York Islanders 9-3, San Jose 1-7, and Pittsburgh 1-6 last year. The odds of Dallas going 12-1 in a random shootout are about .16%. Sorry, that's not randomness.
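The .16% figure checks out as the binomial probability of one specific team going exactly 12-1 in 13 fair coin flips:

```python
import math

# Probability of a specific team going exactly 12-1 in 13 shootouts
# if every shootout is a fair coin flip: C(13,12) / 2^13.
p = math.comb(13, 12) * 0.5**13
print(f"{p:.2%}")  # 0.16%
```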

Now, if we are just considering overtime decisions (not shootouts), then we have Toronto going 7-1, Columbus and Buffalo going 6-1, San Jose 9-4, Phoenix and Tampa 6-2, Minnesota 1-5, Boston, Pittsburgh and the Rangers 4-8, St. Louis 3-7, Washington 2-6, etc. Those numbers don't seem random to me. What are the odds of all those records happening in a totally random environment? Pretty slim I would guess.
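The "what are the odds" question can be answered directly. The sketch below computes, for several of the records listed above, the binomial probability of a record at least that lopsided under a pure coin-flip model:

```python
import math

# For each record, the probability of doing at least that well (or at
# least that badly) in fair coin-flip games: the binomial tail at the
# more extreme side of the record.
def tail_prob(wins, losses):
    n = wins + losses
    k = max(wins, losses)  # how lopsided the record is
    return sum(math.comb(n, j) for j in range(k, n + 1)) / 2**n

for record in [(6, 1), (9, 4), (1, 5), (4, 8), (3, 7), (2, 6)]:
    print(record, f"{tail_prob(*record):.1%}")
```

In samples of only six to thirteen games, each of these tails comes out in the several-percent range, so with 30 teams a few lopsided-looking records are expected even under pure chance.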

As for games going to overtime, I believe there is an explanation there too if you dig for it because your conclusions just don't make any sense.

Just curious, what kind of success rates are you seeing in your algorithm at predicting games when tested against last season's results?

Binary Logistic Regression:

Concordant: 61.6

Discordant: 37.6

This is a prediction algorithm, so it's using past data to predict future games. I'll create a part II for all those interested.
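For anyone unfamiliar with the concordant/discordant figures quoted above: they come from comparing every (win, loss) pair of games and asking whether the model assigned the win the higher predicted probability. A small sketch with made-up probabilities (the numbers are illustrative, not from the actual model):

```python
from itertools import product

# Each entry is (predicted win probability, actual outcome: 1 = win).
# These probabilities are made-up examples, not real model output.
preds = [(0.62, 1), (0.55, 1), (0.48, 0), (0.71, 1), (0.40, 0), (0.58, 0)]

wins = [p for p, y in preds if y == 1]
losses = [p for p, y in preds if y == 0]

# A pair is concordant when the win got the higher predicted
# probability, discordant when the loss did.
conc = sum(pw > pl for pw, pl in product(wins, losses))
disc = sum(pw < pl for pw, pl in product(wins, losses))
total = len(wins) * len(losses)
print(f"concordant {conc / total:.1%}, discordant {disc / total:.1%}")
```

So 61.6% concordant versus 37.6% discordant means the model orders game outcomes better than chance, but not by a huge margin.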
