I was going to post this at Tom Benjamin’s Weblog, but felt it was going on too long and it fits better on this website than anywhere else. It would appear Tom is getting all the good discussions lately.
True. The question is whether the NHL season is long enough for us to get an idea of an individuals' true talent level from those results. I tend to think that it is - I give a lot more credence to the teammate line than the sample size line. If the sample sizes were too small to give any indication, you'd see drastically different lists of scoring leaders each season. You don't. –
Goals/Second
Error is pervasive in hockey, mostly because goal totals are so low. You can approximate the error in the plus and minus statistics with the simple binomial formula sqrt(n*p*(1-p)), where n is seconds on ice and p is goals per second. For the average player n = 50000 s and p = 0.00071003 G/s, which is about 35 goals; the formula gives about 6, and doubling that for a roughly 95% range puts his plus (or his minus, for that matter) at ± 12. The two errors combine as sqrt(12^2 + 12^2) = 17, suggesting the plus-minus statistic has a range of around ± 17. Of course this isn't perfect, as it's the error for a binomial distribution (hockey is arguably Poisson), but it's the best I can quickly do. What it really comes down to is that 35 scoring events don't constitute very good information. Think about flipping a coin 70 times: you expect 35 heads, but 95% of the time you're going to get 35 ± 8. If you get 40 heads and 30 tails, does that make heads more likely? No, it would just appear that way.
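To make the arithmetic above easy to check, here is a minimal Python sketch. It assumes the ± figures are roughly two standard deviations (my reading of the 95% coin example), and that the plus and minus errors are independent so they add in quadrature. The function and variable names are made up for illustration.

```python
import math

def binomial_sd(n, p):
    """Standard deviation of a binomial count: sqrt(n * p * (1 - p))."""
    return math.sqrt(n * p * (1 - p))

# Figures quoted in the text for an average player.
seconds_on_ice = 50_000
goals_per_second = 0.00071003

expected_goals = seconds_on_ice * goals_per_second        # about 35.5 goals
sd_goals = binomial_sd(seconds_on_ice, goals_per_second)  # about 6 goals

# Assumption: the quoted "± 12" is a two-standard-deviation (~95%) band.
band_goals = 2 * sd_goals

# Plus-minus is a difference of two counts; assuming the two errors are
# independent, the bands combine in quadrature.
band_plus_minus = math.sqrt(band_goals**2 + band_goals**2)

# Coin-flip comparison: 70 flips of a fair coin.
band_coin = 2 * binomial_sd(70, 0.5)

print(f"goals: {expected_goals:.1f} +/- {band_goals:.1f}")
print(f"plus-minus band: +/- {band_plus_minus:.1f}")
print(f"heads in 70 flips: 35 +/- {band_coin:.1f}")
```

Because p is so small, (1 - p) is almost exactly 1 and the binomial result is nearly the same as the Poisson one (the square root of the expected count), so the choice between the two distributions hardly changes these numbers.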
Shots
This error cannot easily be fixed within the goal data itself. Because shots occur at a relatively constant rate for each player, the results are similar for everyone: ± 12. Playing less reduces the error slightly but reduces the goals by even more, increasing the percent error from 12/35 = 34% to 8/17 = 47%.
Shots per second are more useful, as there are 10 times as many shots as goals. This works out to 350 ± 18 at one standard deviation, or about 5% error (roughly ± 37, or 11%, at the two-standard-deviation level behind the ± 12 for goals). Either way it is a big improvement, and you might actually get some reproducible results with this information. The problem, of course, is that not all shots are equal, and not all shots are bad. We would likely be in agreement if I said a shot from more than 60’ out has almost no chance (~2%) of going in the net, and that 30’ is a reasonably bad distance to shoot from (~5%). Of course this isn’t the same for all players. This general information generated interesting analysis at Hockey Analytics, which basically leads to a binary logistic regression. This regression provides information about the likelihood of each shot going in. As I showed in this study these shots against are good measures of defense (they ignore quality of goaltending as well). The problem is that they appear to be a poor measure of offense (although after going through the offense error today I might re-analyze how poor).
However, the regression and the analysis (as most people have already discovered) leave much to be desired: a 14’ shot could be a breakaway or just a soft throw at the net, and there’s no way to tell from a score sheet. The distances recorded by the NHL are significantly flawed (distance from the backboards means nothing to me). They’re good approximations, but unless you get real x and y coordinates your data has more error as a result of these problems. In order to understand this error I need to understand logistic regressions better.
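For readers who haven't met logistic curves, here is a rough Python sketch of the idea. It is not the Hockey Analytics regression and it isn't fitted to any shot data. It simply draws the one logistic curve that passes through the two rough figures quoted above (~5% at 30’, ~2% at 60’) and treats distance as the only input, which is a big simplification. All names and the example shot distances are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

def two_point_logistic(d1, p1, d2, p2):
    """Intercept and slope of the logistic curve through two (distance, probability) points."""
    slope = (logit(p2) - logit(p1)) / (d2 - d1)
    intercept = logit(p1) - slope * d1
    return intercept, slope

def goal_probability(distance_ft, intercept, slope):
    """Chance a shot from this distance goes in, under the toy curve."""
    return 1 / (1 + math.exp(-(intercept + slope * distance_ft)))

# Assumption: the rough figures from the text, ~5% at 30 feet and ~2% at 60 feet.
intercept, slope = two_point_logistic(30, 0.05, 60, 0.02)

def expected_goals(shot_distances_ft):
    """Sum of per-shot goal probabilities, i.e. a shot-quality-weighted goal count."""
    return sum(goal_probability(d, intercept, slope) for d in shot_distances_ft)

# Hypothetical shot distances (feet) for illustration only.
shots = [14, 25, 30, 45, 60]
for d in shots:
    print(f"{d:>3} ft: {goal_probability(d, intercept, slope):.1%}")
print(f"expected goals from these shots: {expected_goals(shots):.2f}")
```

A real regression would estimate the intercept and slope from thousands of recorded shots and would likely include more than distance. The sketch also shows the weakness described above: every 14’ shot gets the same probability whether it was a breakaway or a soft throw at the net.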
A few examples
Weinrich played 16 games in Vancouver. He was a -17 and a plus 5 in 13000 seconds. His -17 is ± 8, so 8/17 suggests 47% error, and his plus 5 is about ± 2, or 45% error. Weinrich had an expected goals against of 7 (note that this is outside the expected range) and an expected goals for of 7, so he should have been a 0.
In one game a player might go -2 in 15 minutes of even-strength time; however, the error is over 100% of the data, meaning one cannot really learn anything about a player (by looking at goals) in one game. Shots-against statistics in one game have about 90% error.
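The single-game point can be checked with the same binomial formula. This is a small sketch assuming the league-average goal rate quoted earlier in the post and a two-standard-deviation band; the function names are hypothetical.

```python
import math

GOALS_PER_SECOND = 0.00071003  # average rate quoted earlier in the post

def two_sd_band(seconds, rate):
    """Two-standard-deviation band on a binomial count (assumed ~95% range)."""
    return 2 * math.sqrt(seconds * rate * (1 - rate))

def relative_error(seconds, rate):
    """Band divided by the expected count."""
    return two_sd_band(seconds, rate) / (seconds * rate)

one_game = 15 * 60   # 15 minutes of even-strength time, in seconds
season = 50_000      # average player's season, in seconds

print(f"one game: {relative_error(one_game, GOALS_PER_SECOND):.0%} error")
print(f"season:   {relative_error(season, GOALS_PER_SECOND):.0%} error")
```

The relative error shrinks with the square root of ice time, so it takes roughly four times the ice time to cut the percent error in half.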
Forsberg, for example, has played 10 seasons in the NHL. If you take statistics on his season scoring numbers you get a standard error of the mean of 3.5 (± 7), which is lower than the expected ± 14; of course, there’s error in the standard errors as well. Jagr has a standard error of the mean (for points) of 6.1 (± 12), which is certainly well within what’s expected. There are a number of factors involved that go beyond a player’s skill (how much power-play time, how much ice time, linemates, etc.), but you can see there is enough error in aggregate points to support the above error conclusions.
Generally a statistic that goes out of the bounds of “normal” is probably unstable. For example, Anson Carter’s shooting percentage was 23% (twice the average). This is an unsustainable statistic: while he may hope to score the same number of goals next season, he should be happy getting 25.
Cooke this year was chastised for his poor play (even though he never played all that much), but he could’ve had 8 more goals (or 8 fewer, for that matter) and it would still be in the range of expected error.
Ups and Downs
This information helps explain why a player can have ups and downs among fans: his skill level doesn’t actually change, but the results do. A player may do poorly for a while as a result of the nature of the sport and be chased out of a city very quickly. He can move on and do very well.