November 5, 2007

Corsi Numbers

Corsi numbers have popped up a few times in the last week. Since no one else seemed to be checking whether these numbers are relevant, I thought I'd give it a go:

Corsi Number
The Corsi number is the number of shots directed towards the net while a player is on the ice. It can be broken down by whose net the shots are directed towards (the player's own net (-) or the opponent's net (+)), similar to the plus-minus statistic. The hope, of course, is that the Corsi plus-minus correlates well with the regular plus-minus. And because the counts will be about 16x larger than plus-minus counts, they should be about 4x more accurate, since sampling error shrinks with the square root of the sample size (the square root of 16 is 4).
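To make the definition concrete, here is a minimal sketch of how the Corsi plus-minus could be tallied from a list of shot attempts, along with the square-root rule behind the "16x larger, 4x more accurate" estimate. The `ShotAttempt` record, its field names, and the `kind` labels are hypothetical and only for illustration; they are not the format of any real score sheet.

```python
import math
from dataclasses import dataclass


@dataclass
class ShotAttempt:
    # Hypothetical record: True if directed at the opponent's net (+),
    # False if directed at the player's own net (-).
    toward_opponent_net: bool
    kind: str = "shot"  # assumed labels: "shot", "missed", "blocked"


def corsi_numbers(attempts):
    """Return (plus, minus, plus_minus) for attempts taken while a player is on the ice."""
    plus = sum(1 for a in attempts if a.toward_opponent_net)
    minus = len(attempts) - plus
    return plus, minus, plus - minus


def accuracy_gain(count_ratio):
    """Rough gain in precision when counts are count_ratio times larger.

    Assumes sampling error shrinks with the square root of the sample size.
    """
    return math.sqrt(count_ratio)
```

Every attempt counts the same here regardless of `kind`, which is exactly the property the rest of this post questions.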

Team Regression:
If this statistic is really useful in predicting offense (or winning), it should correlate well with scoring, whether on a team-by-team or a player-by-player basis. So I first looked at teams using last season's results: Goals = 0.09*shots + 0.02*missed (where the 0.02 is +/- 0.04, aka completely insignificant). First off, even if missed shots were significant, one missed shot is still only worth about 1/5 of an actual shot on goal, so it would take 50 missed shots to make 1 goal.
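The arithmetic behind those coefficients can be checked with a short sketch. The two coefficients and the +/- 0.04 come from the regression above; treating the +/- 0.04 as the half-width of an interval around the coefficient is my assumption, and the function names are made up for illustration.

```python
SHOT_COEF = 0.09           # goals per shot on goal, from the team regression
MISSED_COEF = 0.02         # goals per missed shot, from the team regression
MISSED_UNCERTAINTY = 0.04  # the "+/- 0.04", assumed to be an interval half-width


def predicted_goals(shots, missed):
    """Goals = 0.09*shots + 0.02*missed."""
    return SHOT_COEF * shots + MISSED_COEF * missed


def missed_shots_per_goal():
    """How many missed shots the model needs to produce one goal."""
    return 1 / MISSED_COEF


def missed_shot_value_in_shots():
    """Value of one missed shot, measured in shots on goal."""
    return MISSED_COEF / SHOT_COEF


def interval_includes_zero():
    """True if the missed-shot coefficient is indistinguishable from zero."""
    low = MISSED_COEF - MISSED_UNCERTAINTY
    high = MISSED_COEF + MISSED_UNCERTAINTY
    return low <= 0 <= high
```

With these numbers, one goal takes 50 missed shots, a missed shot is worth a bit over 1/5 of a shot on goal, and the interval around the missed-shot coefficient runs from -0.02 to 0.06, so it includes zero.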

Individual level:
The first question: are missed shots combined with regular shots a better predictor of offense than regular shots alone? This is a resounding no. While missed shots don't seem to hurt the results too significantly, they don't seem to add anything except more variability to the model.

Are missed shots significant?
Again, in a regression with shots and missed shots, missed shots are not a significant variable at this point in the season. What was interesting is that missed shots were more important in a model that used "expected goals" as opposed to just shots.
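For anyone who wants to run this kind of check themselves, here is a minimal sketch of a two-variable least-squares fit with a rough significance test. It is not the exact model I used: it assumes a no-intercept model of the form goals = a*shots + b*missed, and the threshold of 2 standard errors is a common rule of thumb rather than anything specific to this data. All function names are hypothetical.

```python
import math


def fit_shots_and_missed(shots, missed, goals):
    """Least-squares fit of goals = a*shots + b*missed (no intercept).

    Solves the 2x2 normal equations directly and returns
    ((a, b), (se_a, se_b)), where se_* are the coefficient standard errors.
    Needs at least 3 observations.
    """
    n = len(goals)
    s11 = sum(s * s for s in shots)
    s22 = sum(m * m for m in missed)
    s12 = sum(s * m for s, m in zip(shots, missed))
    r1 = sum(s * g for s, g in zip(shots, goals))
    r2 = sum(m * g for m, g in zip(missed, goals))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("shots and missed shots are perfectly collinear")
    a = (s22 * r1 - s12 * r2) / det
    b = (s11 * r2 - s12 * r1) / det
    # Residual variance with n - 2 degrees of freedom (two fitted coefficients).
    rss = sum((g - a * s - b * m) ** 2 for s, m, g in zip(shots, missed, goals))
    sigma2 = rss / (n - 2)
    se_a = math.sqrt(sigma2 * s22 / det)
    se_b = math.sqrt(sigma2 * s11 / det)
    return (a, b), (se_a, se_b)


def is_significant(coef, std_err, threshold=2.0):
    """Rule-of-thumb test: coefficient is more than `threshold` standard errors from zero."""
    return std_err > 0 and abs(coef / std_err) > threshold
```

A variable like missed shots "adds nothing" in this sense when its coefficient fails `is_significant` while the shots coefficient passes.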

The problem with Missed Shots:
The simplest, most basic problem with the Corsi index is that missed shots are by definition worse than shots on goal. The only hope Corsi has, is that players who miss the net a lot are likely hitting the net a lot, and in the absence of a decent sample size this is a useful method as a missed shot is better than no shot at all.

Missed shot percentage (missed shots/(missed shots + regular shots))
The higher a player's missed shot percentage, the worse the player is (a player who only hits the net 10% of the time will be sent back to the AHL or worse).

The problem is that in a model where missed shots are included, the missed shot percentage becomes a significant liability. That is to say, unless the missed shots are accompanied by actual shots, they're worthless (this makes sense).
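The missed shot percentage is just missed shots over missed shots plus regular shots. A minimal sketch, where the function name and the choice to return `None` for a player with no attempts are my own assumptions:

```python
def missed_shot_pct(missed, on_goal):
    """Missed shots / (missed shots + regular shots), as a fraction from 0 to 1.

    Returns None when the player has no attempts, since the ratio is undefined.
    """
    attempts = missed + on_goal
    if attempts == 0:
        return None
    return missed / attempts
```

A player who hits the net only 10% of the time, say 1 shot on goal for every 9 misses, comes out at `missed_shot_pct(9, 1)`, a 90% missed shot percentage.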

Missed Shots
Missed shots are a complicated variable that can be both a good thing and a bad thing. A team that chooses to fire shots haphazardly will likely struggle to score compared to a team that focuses on getting the puck on target. Missed shots also cover a wide range depending on how the score sheet records them: a shot that misses by a few inches is quite different from one that misses by 3 feet.

Blocked Shots
Blocked shots are even more complicated than missed shots and similarly do not help predict offense better than regular shots on their own.

Conclusion:
I'll stick with expected goals. That being said, I've posted the Corsi index on my statistics site for those who think it is useful.

1 comment:

Anonymous said...

Interesting....

"The only hope Corsi has, is that players who miss the net a lot are likely hitting the net a lot, and in the absence of a decent sample size this is a useful method as a missed shot is better than no shot at all."

That quote seems to be the takeaway point that would make this index have value. Offhand, I think I'd agree with you that Expected Goals seems to be the better metric.