September 29, 2006

The Error of Our Ways

I was going to post this at Tom Benjamin’s Weblog, but felt it was going on too long and it fits better on this website than anywhere else. It would appear Tom is getting all the good discussions lately.

True. The question is whether the NHL season is long enough for us to get an idea of an individual's true talent level from those results. I tend to think that it is - I give a lot more credence to the teammate line than the sample size line. If the sample sizes were too small to give any indication, you'd see drastically different lists of scoring leaders each season. You don't. Tyler (mc79hockey)


Error is pervasive in hockey, mostly because goal totals are so low. You can easily approximate the error for plus and minus statistics by using a simple binomial formula, sqrt(n*p*(1-p)), where n is seconds on ice and p is goals per second. For the average player n = 50000 s and p = 0.00071003 G/s, which works out to about 35 goals, and their minus or plus statistic for that matter is ± 12. One can approximate the accuracy of the plus-minus statistic by combining the two errors, sqrt(12^2 + 12^2) = 17, suggesting the plus-minus statistic has a range of around ± 17. Of course this isn't perfect, as it's the error for a binomial distribution (hockey is arguably Poisson), but it's the best I can quickly do. What it really comes down to is that 35 scoring events don’t constitute very good information. Think about flipping a coin 70 times: you expect 35 heads, but 95% of the time you’re going to get 35 ± 8. If you got 40 heads and 30 tails, would that make heads more likely? No, it would just appear that way.
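To make that arithmetic concrete, here is a minimal Python sketch. The function name count_interval and the choice of 1.96 standard deviations (a 95% range) are assumptions made for illustration; the inputs are the average-player figures quoted above.

```python
import math

def count_interval(n, p, z=1.96):
    """Binomial expected count, standard deviation, and z-sigma half-width.

    n: number of trials (here, seconds on ice); p: chance of a goal per trial.
    z=1.96 is an assumed 95% convention, not something fixed by the formula.
    """
    expected = n * p
    sd = math.sqrt(n * p * (1 - p))
    return expected, sd, z * sd

# Average player from the text: 50000 s on ice at 0.00071003 goals per second.
goals, goals_sd, goals_range = count_interval(50_000, 0.00071003)

# Plus and minus are treated as independent, so their ranges add in quadrature.
plus_minus_range = math.sqrt(goals_range**2 + goals_range**2)

# Coin comparison: 70 flips at p = 0.5.
heads, heads_sd, heads_range = count_interval(70, 0.5)

print(f"goals on ice: {goals:.1f} ± {goals_range:.1f} (sd {goals_sd:.1f})")
print(f"plus-minus range: ± {plus_minus_range:.1f}")
print(f"heads in 70 flips: {heads:.0f} ± {heads_range:.1f}")
```

Rounded, the three ranges come out close to the ± 12, ± 17 and ± 8 used in the text.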


One cannot easily fix this error by looking at shots as opposed to goals (as shots occur at a relatively constant rate for each player); the results are similar: ± 12. Playing less reduces the error slightly, but reduces the goals by even more, increasing the percent error from 12/35 = 34% to 8/17 = 47%. Shots per second are more useful, as there are 10 times as many shots as goals: this works out to 350 ± 18 at one standard deviation, or about ± 37 on the same 95% basis as the coin flips above, which is still only about 10% error. You might actually get some reproducible results with this information.

The problem, of course, is that not all shots are equal, and not all shots are bad. We would likely be in agreement if I said a shot from greater than 60’ out has almost no chance (~2%) of going in the net and 30’ constitutes a reasonably bad distance to shoot from (~5%). Of course this isn’t the same for all players. This general information generated interesting analysis at Hockey Analytics, which basically leads to a binary logistic regression. This regression provides information about the likelihood of each shot going in. As I showed in this study, these shots against are good measures of defense (they ignore quality of goaltending as well). The problem is they appear to be a poor measure of offense (although after going through the offense error today I might re-analyze how poor).

However, the regression and the analysis (as most people have already discovered) leave much to be desired: a 14’ shot could be a breakaway or just a soft throw at the net, and there’s no way to tell on a score sheet. The distances recorded by the NHL are significantly flawed (distance from the backboards means nothing to me). They’re good approximations, but unless you get real x and y coordinates your data has more error as a result of these problems. In order to understand this error I need to understand logistic regressions better.
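The goals-versus-shots error comparison above can be sketched in a few lines of Python. This is only an illustration: percent_error is a made-up helper name, and it assumes the range of a count is roughly z * sqrt(count), which is what the binomial formula reduces to when the per-second probability is tiny. Both the 95% convention (z = 1.96) and a one-standard-deviation convention (z = 1) are shown for shots, since the two give quite different percentages.

```python
import math

def percent_error(count, z=1.96):
    """Range of a count (z * sqrt(count)) as a fraction of the count itself.

    Assumes a Poisson-like count, so the standard deviation is sqrt(count).
    """
    return z * math.sqrt(count) / count

# Counts taken from the text: 35 goals, 17 goals with less ice time, 350 shots.
full_season_goals = percent_error(35)
reduced_time_goals = percent_error(17)
shots_95 = percent_error(350)
shots_one_sd = percent_error(350, z=1.0)

for label, value in [
    ("35 goals, 95% range", full_season_goals),
    ("17 goals, 95% range", reduced_time_goals),
    ("350 shots, 95% range", shots_95),
    ("350 shots, one sd", shots_one_sd),
]:
    print(f"{label}: {value:.0%} error")
```

Whichever convention is used, the percent error shrinks as the count grows, which is why shots are the more promising input.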

A few examples

Weinrich played 16 games in Vancouver; he was a -17 and plus 5 in 13000 seconds. His -17 score is ± 8, so 8/17 suggests 47% error; his plus 5 was ± 2 at one standard deviation, or 45% error (closer to ± 4, or about 90%, on the same basis as the ± 8). Weinrich had an expected goals against of 7 (note this is outside of the expected range) and an expected goals for of 7, so he should have been a 0. In St. Louis, Weinrich was -51 ± 14, but had an expected goals against of 43 (this is inside the expected range). Both expected figures are lower than the actual goals against, which does create some suspicion about the accuracy of shots as a measure of skill. It’s conceivable, though, that when Weinrich was on a team that was doing well he would do better than expected (and worse on bad teams).
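The "inside or outside the expected range" check for Weinrich can be written out as a short Python sketch. The helper name expected_in_range is hypothetical, and the range is assumed to be 1.96 * sqrt(observed goals), the same rough approximation that gives the ± 8 and ± 14 above.

```python
import math

def expected_in_range(observed, expected, z=1.96):
    """Check whether an expected goal count falls within observed ± z*sqrt(observed).

    Returns (is_inside, half_width). The sqrt(observed) error is an assumed
    Poisson-like approximation, not an exact interval.
    """
    half_width = z * math.sqrt(observed)
    return abs(observed - expected) <= half_width, half_width

# Goals against from the text: Vancouver 17 actual vs 7 expected,
# St. Louis 51 actual vs 43 expected.
vancouver_ok, vancouver_range = expected_in_range(17, 7)
st_louis_ok, st_louis_range = expected_in_range(51, 43)

print(f"Vancouver: 17 ± {vancouver_range:.0f}, expected 7 inside range: {vancouver_ok}")
print(f"St. Louis: 51 ± {st_louis_range:.0f}, expected 43 inside range: {st_louis_ok}")
```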

In one game a player might go -2 in 15 minutes of even-strength time; however, the error is over 100% of the data, meaning one cannot really learn anything about a player (by looking at goals) in one game. Shots-against statistics in one game have about 90% error.

Forsberg, for example, has played 10 seasons in the NHL. If you take statistics on his season scoring numbers you get a standard error of the mean of 3.5 (± 7), which is lower than the expected (± 14); of course there’s error in the standard errors as well. Jagr has a standard error of the mean (for points) of 6.1 (± 12), which is certainly well within what’s expected. There are a number of factors involved that go beyond a player’s skill (how much power-play time, how much ice time, line mates etc.), but you can see there is enough error in aggregate points to make the above error conclusions.

Generally a statistic that goes out of the bounds of “normal” is probably unstable. For example, Anson Carter's shooting percentage was 23% (twice the average). This is an unsustainable statistic: while he may hope to score the same number of goals next season, he should be happy getting 25 goals.

Cooke this year was chastised for his poor play (even though he never played all that much), but he could’ve had 8 more goals (or 8 fewer for that matter) and it would be in the range of expected error.

Ups and Downs

This information helps explain why a player can have ups and downs among fans: their skill level doesn’t actually change, but the results do. A player may do poorly for a while as a result of the nature of the sport and be chased out of a city very quickly. He can move on and do very well.

Top Scorers

A quick note on top scorers: they have much less percent error because of their higher goal totals. The error does not increase all that much, but the goals do, so the error as a percentage of goals is drastically lower. So the best players in the NHL don't change as much as the average players do.


Vic Ferrari said...

I think that there may be a mistake in your math. Unless I've misunderstood, your numbers should give a standard deviation of about 6, no?

In any case that's pretty much Poisson there, because you've broken it down to seconds. In reality you are probably looking at about one time in a shift that a puck could go in if the hockey gods so choose. Could happen at either end of the rink.

It doesn't matter much to the results, because as Alan Ryder quite rightly points out, at around 100 "rolls of the dice" binomial is pretty much the same distribution as Poisson anyways. If, instead of breaking a one hour game into 3600 one second segments, you take it from the sensible 100 or 110 game segments (perhaps not coincidentally the number of shots directed at net in a typical NHL game, and about 40 seconds, which is a pretty typical shift length), then you get a spread of results that is going to mirror the real results closely.

And the actual results are going to mirror that pretty well. A change of about 6 or 7 in EV+ or EV- rate from year to year for your mythical average NHLer seems about right. Perhaps a shade more because of injury, changes in linemates and opposition, etc. Though the latter two things have an enormous impact, they don't usually change too much from season to season for most guys. So it should ring in close to that.

Alternatively, you could write a wee sim that just assumes "shit happens". That anything can happen at any time. As complicated or as simple as you'd like, it won't matter much. And let the simulation find the time interval that reflects the actual results the best. Should end up around 35 seconds for this past season, just as a guess, and a smidge higher for past years.
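A minimal Python sketch of such a sim, under assumed numbers: it reuses the average-player figures from the post (50000 seconds on ice, 0.00071003 goals per second), chops the ice time into 35-second segments, and lets each segment hold at most one goal. The function name, the seed and the number of simulated seasons are arbitrary choices. It only shows the spread for a given interval; fitting the interval to actual results would need real season data, which is not included here.

```python
import math
import random
import statistics

def simulate_goal_totals(seconds_on_ice=50_000, goals_per_second=0.00071003,
                         segment_seconds=35, n_seasons=2_000, seed=1):
    """Simulate season goal totals where each segment holds at most one goal.

    All defaults are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    n_segments = seconds_on_ice // segment_seconds
    p_goal = goals_per_second * segment_seconds  # chance of a goal in one segment
    totals = [
        sum(rng.random() < p_goal for _ in range(n_segments))
        for _ in range(n_seasons)
    ]
    # Spread predicted by the binomial formula for the same segments.
    binomial_sd = math.sqrt(n_segments * p_goal * (1 - p_goal))
    return statistics.mean(totals), statistics.pstdev(totals), binomial_sd

sim_mean, sim_sd, binomial_sd = simulate_goal_totals()
print(f"simulated goals per season: {sim_mean:.1f}, sd {sim_sd:.2f}")
print(f"binomial sd for the same segments: {binomial_sd:.2f}")
```

Changing segment_seconds moves the spread only slightly at these goal rates, which fits the point that the choice of interval won't matter much.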

This whole "goalscoring is Poisson!" thing, it gets on my tit, because it is fundamentally wrong. "shit happens", "sometimes they go in, sometimes they don't" and "you only get so many chances to score in a game" ... that's absolutely true. Just ask a good goalscorer. The "shit happens" force is so strong that a Poisson distribution happens to fit it reasonably well, even crudely applied. But that's all.

So when a guy has a big change in EV+ rate, or EV- rate, chances are that injury is a factor. If not, then looking at his quality of linemate (did he play with Mario?) or opponent (did he mature to the point where he was the coach's best option against Modano?) usually tells the tale. Failing that, it's probably just coincidence, and if you look at the shooting% when he was on the ice, for both teams, there is probably some spooky stuff there. And if it happened in a good way to Igor Ulanov and Shean Donovan (and it did in 03/04), well, even their parents aren't expecting a repeat of that. And if it happened in a good way to Robert Lang and Marty St. Louis, well then there's at least a chance they can repeat that somewhat, at least on the offensive side of the ledger.

Get enough buggers rolling the dice and some of them are going to have mad stretches of 7s or 2s, just will. There are outliers, in the right numbers, because they have to be. And those are probably the guys overvalued or undervalued by fans, especially the numbers guys. And by GMs too, though to a much lesser extent. Coaches ... much less so, they're the ones counting scoring chances, possession time, gauging level of opposition, etc. So even though Fedorov has been written off by most fans, if you watch the attention that opposing bench coaches give him this season, you'll see that most of them haven't been seduced by the recent results of Sergei on the scoresheet.

And when fans choose to imagine that Chubarov played hard icetime, or some such, then just leave them be.

Nobody wants numbers thrown in their face that destroy a myth they hold dear, or a terrible evaluation they have made of a game or of a player. As illustrated below in the adaptation of Cosby's classic stand-up bit.

Bill Cosby:
"What do drugs do for you?"

They enhance your personality.

"Yeah? But what if you're an a$$hole?"

Vic Ferrari:
"What makes you think that?"

I watched the game!

"Yeah? But what if you're an dumba$$?"

That is all. I'll show myself out. :)

JavaGeek said...

I think that there may be a mistake in your math. Unless I've misunderstood, your numbers should give a standard deviation of about 6, no?

I'm a little lazy, in the statistical sense, about stating what is a confidence interval and what is a standard deviation. A 95% confidence interval is about twice the standard deviation on either side. I should really be stating it more clearly: that the "plus statistic is ± 12, 19 times out of 20". Or: an average player's plus rating will fall within ± 12 of what's expected 95% of the time (12 = 1.96*6).

One standard deviation only covers about 70% of the data...

I hope to have an article on removing the "mario effect" next week.

Vic Ferrari said...

Ya, it's been a long time since I took a stats class, and the application of a confidence interval here seemed dodgy to me.

... and their minus or plus statistic for that matter is ± 12.

That read to me as the fluctuation of your 'average NHLer'. The median change from year to year for that guy is 6, of course. My bad.

Of course if you look at the ability to drive the tempo, the rate at which total goals are scored, that moves up as well as guys play for a less risk-averse coach. Or down if they play for a tightass.

It's something that gets overstated imo. There really isn't that much difference between guys on a team, not over and above what you would expect from pure coincidence. Hell, even leaguewide. And obviously forwards are driving that. The difference between Dman impact on event rate and the 'noise curve' will be next to nothing. And even for the forwards not nearly as much as most people would guess. The puck has to be somewhere.