September 1, 2006

Ownership is 9/10ths of the Law

Most people will agree, at least at some level, that possession of the puck plays a major part in game outcomes. How long you can control the puck may be a lot more important than many people realize. The exact value of possession is hard to quantify (not for lack of ability, but for lack of data). Many commentators talk about the importance of battles along the boards, face-offs, give-aways and take-aways. So how important is it really?

First off, using the limited data available, how do you research possession? It isn’t easy, but certain events say something about possession. For example, you must have possession to take a shot, and if you win a face-off you have possession after the face-off. If you hit a player, you shouldn’t have the puck (otherwise it should be interference). A blocked shot indicates you didn’t have possession. The NHL provides a couple of obvious pieces of possession information in give-aways and take-aways (they say something about both before and after the event). Using this data, one can be reasonably sure who has the puck for a certain portion of the game (I believe it worked out to 50-70%). The rest is still up in the air, but in terms of averages it works out quite well.
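To make the idea concrete, here is a minimal sketch in Python of how the event rules described above could be turned into possession time. It is not the code used for the study. The event labels, the `(seconds, event_type, team)` tuple layout, and the choice to credit an interval when only one of its two endpoints says who had the puck are all assumptions made for illustration.

```python
# Hypothetical event labels; a real play-by-play feed may name them differently.
# Each rule gives (who had the puck BEFORE the event, who has it AFTER),
# relative to the team credited with the event. None means "says nothing".
RULES = {
    "shot": ("self", None),          # you need the puck to shoot
    "faceoff_win": (None, "self"),   # the winner has the puck afterwards
    "hit": ("other", None),          # the hitter should not have the puck
    "block": ("other", None),        # the blocker did not have the puck
    "giveaway": ("self", "other"),   # had it, lost it
    "takeaway": ("other", "self"),   # did not have it, won it
}


def resolve(who, team, teams):
    """Turn 'self'/'other'/None into a concrete team name (or None)."""
    if who is None:
        return None
    if who == "self":
        return team
    return teams[1] if team == teams[0] else teams[0]


def possession_seconds(events, teams):
    """Split the time between consecutive events into per-team possession.

    events: list of (seconds, event_type, team) tuples, sorted by time.
    teams:  pair of team names.
    """
    totals = {teams[0]: 0, teams[1]: 0, "unknown": 0}
    for (t0, kind0, team0), (t1, kind1, team1) in zip(events, events[1:]):
        after = resolve(RULES[kind0][1], team0, teams)    # state leaving event 0
        before = resolve(RULES[kind1][0], team1, teams)   # state entering event 1
        if after is not None and before is not None and after != before:
            owner = "unknown"  # the puck changed hands somewhere in between
        else:
            # Assumption: a single informative endpoint is enough to credit
            # the whole interval to that team.
            owner = after or before or "unknown"
        totals[owner] += t1 - t0
    return totals
```

Intervals where neither event says anything, or where the two events disagree, land in the "unknown" bucket, which corresponds to the part of the game that is still up in the air.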

Now, when I originally did the study I found there was little or even negative correlation between possession and winning individual games (which was a surprise). However, when I averaged the data over the season I got some interesting results. I have two graphs.

The first graph (winning percentage vs. predicted wins) shows the standard Pythagorean prediction of winning using goals for and against [it has an R² of 86%]. The second one (winning percentage vs. possession) compares each team’s winning percentage (or points) over the season to its percentage of possession [it has an R² of 92%]. Judging by R² alone, possession is a better predictor than the Pythagorean prediction. This is a small data set, and I hope to test it on more data, but in general being able to control the puck when you need to will result in more wins.
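For readers who want to reproduce this kind of comparison, here is a rough Python sketch of the two quantities involved: a Pythagorean winning percentage from goals for and against, and the R² of a straight-line fit between a predictor and actual winning percentage. The exponent of 2 is the classic default and is an assumption here, since the exponent used for the graph is not stated. The function names are made up for illustration.

```python
def pythagorean_win_pct(goals_for, goals_against, exponent=2.0):
    """Pythagorean expectation: GF^e / (GF^e + GA^e).

    exponent=2.0 is an assumed default, not necessarily the value
    behind the graph discussed above.
    """
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs.

    With one predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)
```

Feeding one value per team into `r_squared`, once with the Pythagorean prediction and once with possession percentage as `xs`, gives the two numbers being compared.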

Why might this be? If you think about hockey, if you have the puck the other team can’t score, and if you don’t have the puck you can’t score. If you’re winning by one goal in the third period and you control the puck, you are more likely to win; if you are down and control the puck, you’re more likely to get that tying goal and, as such, win (or get an OTL). So puck possession shows the difference between good and bad teams. The question that should be answered next is how a team that is bad possession-wise (8th seed) can do so well in the playoffs against better possession teams. I’ll save that for another day.


Tangotiger said...

Good stuff.

If you remove the shots for and shots allowed from the possession, how does the correlation look? That is, is it possible that almost all of the correlation is due to the shots taken and allowed?

JavaGeek said...

SF and SA determine who has the puck at a given time. Certainly removing them would be possible (I would need to reprocess all the data - couple of hours).

I would certainly expect a high correlation between shots for and against and possession simply because you can't shoot if you don't have the puck (correlation doesn't imply causation), but that's not necessarily bad.

A regression with possession explains 90% of the variability, whereas a regression with just shots for and against explains 45%. The Variance Inflation Factors are reasonable (there's 45% regression between possession and shot winning pct). But if you do a regression with both, shots do not help the regression (and thus should be removed).

I've included a graph showing the shots for and against regression. You can see teams like St. Louis, Chicago and Pittsburgh had enough shots to win more games, but lacked possession. Look at Nashville as well (well, I guess if St. Louis and Chicago are low, another team in their division should be high).

Vic Ferrari said...

Interesting stuff, Chris, I'm amazed that this works so well. The list of teams does seem roughly in the right order though, to my mind anyways. And it's obviously important stuff. Is this just for even strength btw, or the whole game?

The NHL records a measure of this but doesn't publish it. They did put it up on the website for a while, a few years ago, before they even started publishing shift charts (they've since added shift charts for a few years back).

By my memory it was recorded as zone time, home, visitor or neutral. I think that a couple of journalists noticed it, wrote about the boatload of time the game was in the neutral zone ... and the NHL pulled it off of the publicly accessible area of the website.

As an aside: I remember Sens play-by-play guy Dean Brown in a radio interview last spring. He commented that the Sens had the puck in the Sabres end of the rink for a whopping 63% of the time in the recent Game 2 of their playoff series. Not "2/3rds", not "most", but 63%. Staggering, and they still managed to lose. Sometimes the hockey gods are cruel.

I think Corsi's "shots directed at net" measure is pretty sound as well. Of course some teams will shoot from anywhere, and others take the chance to gain a better scoring chance ... obviously depends on the coach and the personnel.

JavaGeek said...

I apologize for this; I recently noticed my algorithm was wrong in this study.

Ownership of puck only correlates at 56%.