December 22, 2006

Player Value Part III

It’s been a little less than a week since I released part II of my player evaluations. In this release I’m not using the term “rank”. These are statistics that measure how well these players have played so far; whether that makes them the best players or just lucky is a matter of opinion, as it always is with statistics. If there were no opinion then we could just evaluate players based on salary (perfect correlation of performance to pay). There are still small tweaks I can make to improve the algorithm, however this is about as good as it will get. Remember also that this evaluates players’ performance at even strength, which accounts for 65% of all goals and 72% of ice time.

The most significant change was to regress to the mean. This means that I look at each player’s score and ask the question: “is it more likely this player is average than actually this good?” So a player with seven goals against per hour in one game might be extremely bad, but it’s more likely that they were unlucky and are actually only “semi-bad”. Using an estimated percent error as a function of ice time, I rescale each player’s Z score (how many standard deviations they are from average); if the amount of ice time is less than some threshold (where the percent error exceeds 100%) I just assume the player is average until I have better data. This means that good players who’ve missed a good chunk of the season can’t really be rated. For example, Gaborik has a negative score, which will improve as he gets more ice time.
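The shrinkage step can be sketched as follows. This is my reconstruction, not the site’s actual code, and `pct_error` is a made-up placeholder for the estimated percent error as a function of ice time:

```python
def pct_error(minutes):
    """Hypothetical percent error (as a fraction) vs. even-strength ice time.

    The real curve is estimated from the data; this 1/x form is only an
    illustrative stand-in."""
    return 30.0 / max(minutes, 1.0)

def regressed_z(z, minutes):
    """Shrink a player's Z score toward 0 (the league average).

    Below the ice-time threshold where the percent error exceeds 100%,
    the player is simply assumed to be average."""
    err = pct_error(minutes)
    if err >= 1.0:
        return 0.0          # not enough data: treat as average
    return z * (1.0 - err)  # partial shrinkage toward the mean

print(regressed_z(2.0, 300))  # plenty of ice time: 1.8
print(regressed_z(2.0, 15))   # almost no ice time: 0.0
```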

Since the regression doesn’t care about individual scores, but rather the scores of all possible cross products, I had to rescale every pair using the above logic. So if a pair hasn’t played much together, but got a ton of goals in that time, I just assume it was all luck and give them an average number of goals. This way the regression primarily looks at significant pairings and ignores the insignificant ones. It also prevents the regression from blowing up in both directions and keeps the results in a much smaller band.

A good or poor score can be the result of coaching usage, player confidence, ability, luck, etc. This is not a definitive number telling you how good a player is at even strength; rather it attempts to explain how they have done, and whether that’s an anomaly or not is a matter of opinion. So with the technical discussion over, here’s a short list; the rest is available at my site.






I’ve also been doing the same thing to my PP and SH data, and as a result the columns on my website are significantly different. There is now a NETV score, which represents the goal differential provided by a given player, using even strength, power play and penalty kill data. You can estimate winning percentage by (2.86+NETV)^2/((2.86+NETV)^2 + 2.86^2), which might be more intuitive to readers. PP+, PP-, SH+ and SH- are functions of goals for and shots against and do not represent typical plus-minus numbers; they attempt to credit the player who is responsible for the goal rather than all players equally.

NETV = V
  + (PP+ - PP- - PPavg)*PPhours/(0.15*Games)
  + (SH+ - SH- + SHavg)*SHhours/(0.15*Games)
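As a quick sanity check on the winning-percentage formula (2.86 being the league-average goals-per-game figure used above), a minimal sketch:

```python
def netv_win_pct(netv, avg=2.86):
    """Pythagorean-style estimate: (avg+NETV)^2 / ((avg+NETV)^2 + avg^2)."""
    return (avg + netv) ** 2 / ((avg + netv) ** 2 + avg ** 2)

print(round(netv_win_pct(0.0), 3))   # an average player: 0.5
print(round(netv_win_pct(0.5), 3))   # +0.5 goals/game: about 0.58
```

As expected, a NETV of 0 gives a .500 team, and the estimate rises smoothly with goal differential.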

I’m not yet satisfied enough with the constants used in the algorithm to say these results are very good, but they are useful. All the NETV scores are on my website.

I want to do some more interesting analysis in the next week, so I’ll avoid dealing with my player rating system for a while and keep these results as they are for now. I’m not sure what topic I’ll cover next; it’s been a while since some of my more detailed, better articles came out.

December 18, 2006

Player Ranking attempt II

I’ve been working on one primary scheme to determine who is responsible for a team’s performance. I have chosen a regression on the performance of all of a team’s player cross products to determine who is responsible for the goals for and against. My last attempt was criticized, and for good reason, as I chose to use expected goals, which are a poor measure of actual goals, to predict offensive ability. A player like Iginla, who hasn’t taken many good shots but has scored a bunch of goals (or had goals scored while on the ice), was completely underrated by my old system, and this was pointed out to me. So I chose instead to use goals (I will continue to use expected goals against, as their variability almost perfectly matches expected random variation, with a few exceptions). By using goals, I have chosen to triple my variability in order to have more accurate information. This means it’s a lot easier for a player to get randomly into the top groups, so these numbers measure “possible talent” as opposed to actual talent, and similarly on the low end.

Last time I only used the team to determine skill level; this time I included strength of opposition. I calculated the same regression as in the previous article for offense and defense. Then I multiplied the ice time against every opponent by the score of that opponent and divided by the total number of seconds of matchups, to get an average score of the opposition for the given player. (So if you spend all of your 1000 seconds against only Iginla, I’ll give you an opposition offensive score of 4.3 [Iginla’s score].) Basically I consider the opposition score the expected number of goals against; if a player is better than average they should be able to lower that number (have fewer goals against than average). So I calculate the player’s scores for goals for and against, subtract the opposition’s defensive score from their goals for per hour and the opposition’s offensive score from their goals against per hour, then subtract their “minus” (defensive score) from their “plus” (offensive score):
(Even strength G/hr – Even strength opposition GA/hr)
- (Even strength GA/hr – Even strength opposition G/hr)
This means that if you score a lot of goals against an opposition that allows a lot of goals, your score won’t improve, but if you allow very few goals against a very tough opposition, your score won’t be hurt either. Now, differences in strength of opposition are not that large, but they exist, so a player’s own performance matters much more than their opposition’s score. However, the opposition scores help remove team effects, so these scores should be comparable across teams, and they are not significantly affected by linemates, so a player moved from one team to another should get the same score (plus or minus coaching effects), theoretically speaking of course. These scores are not a definitive measure of talent. Finally, using the scoring rate, I multiply by the amount of time a player is played to calculate their value to the team. A player who gets 20 minutes of even-strength time is arguably twice as valuable as an identical player getting only 10 minutes. Due to the ice time multiplier, defensemen appear on top more frequently than forwards.
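The two steps above, the ice-time-weighted opposition score and the adjusted score, can be sketched as follows; variable names are mine, not from the original scripts:

```python
def opposition_score(seconds_vs, opp_scores):
    """Ice-time-weighted average of opponents' scores.

    seconds_vs: {opponent: seconds of head-to-head ice time}
    opp_scores: {opponent: that opponent's regression score}"""
    total = sum(seconds_vs.values())
    return sum(sec * opp_scores[o] for o, sec in seconds_vs.items()) / total

def adjusted_score(gf_hr, ga_hr, opp_def, opp_off):
    """(G/hr - opposition GA/hr) - (GA/hr - opposition G/hr)."""
    return (gf_hr - opp_def) - (ga_hr - opp_off)

# The Iginla example from the text: all 1000 seconds against one opponent.
print(opposition_score({"Iginla": 1000}, {"Iginla": 4.3}))  # 4.3
```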

What does VAL measure?
The units for VAL are goals per game, where goals means the expected goal differential for that individual. This doesn’t necessarily measure true value, as a lower goal differential is acceptable for a player involved in fewer events, so defensive players will be undervalued. If all the players were identical and goaltending were average, this would be the plus-minus for that player every game, so 82*VAL = plus-minus in an ideal system (all players on the team identical, with average goaltending). A score of 0.33 would work out to +27; considering the best player last season was +35, this is probably a sort of limit value for this statistic (scores above 0.5 probably don’t mean much other than random error). Of course these scores are for individuals, not lines, so one player could easily do much better than his line (Selanne). For these even-strength metrics I like to use Malik, as he is a consistent plus player and former Canuck (+96 over the last 4 seasons). Malik scores an excellent 0.48 in my system, for +40. I'm posting the results for the Northwest division on this site as it's what I'm familiar with; the complete list can be found on my statistics website. I have also adapted my SH and PP scores on the site. I'm not 100% satisfied with those results yet, but feel free to comment on them as well.






Remember these stats are for even strength only: some great players perform amazingly on the power play but are less than impressive at even strength. I would also argue these stats emphasize offense over defense, so they likely undervalue defensive players.

December 15, 2006


A few weeks ago Vic Ferrari and I had a lively debate on face-offs; due to school I was unable to respond to the post properly and felt I should complete the discussion. Before starting I would like to thank Vic Ferrari for pointing me to some excellent resources; much of this article is based on the great face-off article found at that site.

The above graph shows the effects of the different face-offs on goals against (O = offensive zone, D = defensive zone). You should notice a lag for offensive zone face-offs in terms of goals against, as one has to skate the length of the ice before scoring. The primary effects occur in the first 15 seconds; the results after that aren’t clearly dominant like the first 15 seconds, with a net effect (difference between offensive and defensive face-offs) of around 400 goals (I incorrectly estimated it to be 200 in the earlier comments, as my scripts were poor). There is a slight increase in scoring after the initial 15 seconds as well, which could be the result of better teams getting more offensive draws, or a slight effect of zone control persisting 15 seconds after a face-off. 400 goals over the course of 26,000 face-offs works out to 0.016 goals per defensive face-off, so the cost associated with a defensive face-off is 0.016 goals against (this should be a good metric to quantify the cost of icing). The majority of the effect is caused by losing the draw in the defensive zone, so if you win the draw the effect isn’t really there (at least not as much).
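The arithmetic above as a quick back-of-envelope script (numbers taken from the text; note 400/26,000 is nearer 0.015, which the text rounds up to 0.016):

```python
net_goals = 400        # net O-zone vs D-zone effect over the sample
faceoffs = 26_000      # defensive-zone draws in the sample

cost_per_draw = net_goals / faceoffs
print(round(cost_per_draw, 3))           # ~0.015 goals against per draw

# Minnesota's 360 extra defensive-zone draws:
print(round(360 * cost_per_draw, 1))     # ~5.5 goals, roughly one win
```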

In terms of individual players this effect will be minimal, due to the simple fact that most players see a very small difference between offensive zone and defensive zone draws, with a mean of 0 and a standard deviation of around 50, so 95% of players are within around ±100. This results in a standard deviation for the cost to a center of around 0.016*50 = 0.8; in fact only 8% of players saw their plus improve by more than 1 as a result of taking more offensive face-offs, and 10% saw their minus hurt by more than 1. Lecavalier had the largest difference, 207, for an estimated benefit of around +3.3, and that left Andreychuk to pick up the extra defensive zone face-offs (160 defensive vs. 46 offensive). So in terms of individuals this is not a huge issue. One team (Minnesota) saw an extra 360 defensive zone face-offs, which probably cost them about 5 goals against, or basically 1 win. If you’re looking for a more global view: a player who sees 200 defensive draws should see a 5.3% chance of a goal against vs. a 3.4% chance of a goal for in the 45 seconds after each draw, or -11 (sd = 3) and +7 (sd = 2.5), where three of the minuses I would consider the result of the defensive zone face-offs themselves; the rest depend on the responsibility (skill) of the player in question. Of course that player will also have a few offensive draws to make up for those defensive draws, so it balances out in the end.

December 8, 2006

Personnel, Coach or GM?

There’s been a lot of discussion of the lack of scoring by the Canucks this season. Averaging 2.07 goals per game just isn’t good enough in the new NHL. The Canucks’ defense has managed to pull off 13 wins, a number of which were lucky early-season comebacks that the Canucks have been unable to repeat in the last few weeks. If you calculate the Pythagorean prediction with the Canucks’ goals for and against, you get an estimated 40% record, or 75 points over the course of the season. The real question is: who on earth is responsible for the Canucks’ lack of scoring? To answer this question we need to look at scoring per situation: the Canucks are ranked 2nd last in power play scoring, 3rd last in even strength scoring, and they’re the only team without a shorthanded goal. Interestingly, the Canucks have the largest difference in the NHL, 22 goals, between expected goals (80) and goals scored (58). If you were considering the possibility of “luck”, this result is 2.46 standard deviations away from expected, or in the 0.7th percentile of possible “pathetic-ness”. Since about one third of the season is over, we can say that about one team every two years will see this sort of random variation in one of the three thirds of a season (games 0–27, 28–55, 56–82). Watching the Canucks, I know that this probably isn’t the case; rather the team is having some major scoring problems that go far beyond luck.
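The Pythagorean prediction mentioned above, as a minimal sketch (the 75-point figure presumably also counts overtime-loss points, which this ignores; the 71 goals against is an illustrative figure, not the Canucks’ actual total):

```python
def pythagorean_win_pct(gf, ga, exponent=2):
    """Classic Pythagorean expectation: GF^k / (GF^k + GA^k)."""
    return gf ** exponent / (gf ** exponent + ga ** exponent)

# A team scoring at ~82% of its goals-against rate projects to ~40%:
print(round(pythagorean_win_pct(58, 71), 2))  # 0.4
```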

GM: Nonis

Let’s first look at the GM. He signed and traded a number of players [bracketed goals are approximate historical averages], including the loss of Jovanovski (15 goals) and Bertuzzi (30 goals) and the gain of Bulis (15 goals), Chouinard (10 goals) and Pyatt (10 goals). Nonis has also signed three new backups, four if you include Auld, in less than two years: Sabourin, Ouellet, Noronen; none have stellar or even average NHL histories. I would be more than happy to argue that Nonis’ ability to determine talent is lacking. The forwards he chooses to keep aren’t exactly great scorers either: Kesler (10 goals), Cooke (15 goals), Morrison (20 goals). If you add up the expected goals (based on historical averages) of our forwards and defense it works out to something like [arrows represent how they’re doing this year compared to their average]:

Naslund: 40
D. Sedin: 20
Morrison: 20
H. Sedin: 15
Salo: 15
Cooke: 15
Bulis: 15
Chouinard: 10
Pyatt: 10
Green: 10
Kesler: 10
Ohlund: 10
Bieksa: 5
Linden: 5
Burrows: 5
Krajicek: 3
Fitzpatrick: 3
Mitchell: 2

Total: 213 goals, or 2.6 goals per game. If the Canucks had gotten 2.6 goals per game, they’d have 73 goals so far this season, which is perfectly reasonable. They’ve lost around 15 goals (3 wins) as a result of whatever is causing these players not to score.

Coach: Vigneault

Who couldn’t run a competitive defensive system with a goaltender as great as Luongo? The Canucks are only slightly better (2.4 g/hr) than average (2.6 g/hr) at preventing good shots against at even strength, and while they’re ranked 27th in terms of expected goals on the penalty kill, Luongo’s performance allows the Canucks to be ranked 6th in terms of actual goals against. So Luongo is making the Canucks’ porous defense look good. But that’s not the problem; the problem is scoring. Here’s what Vancouver Sun writer Elliott Pap has to say about the Canucks’ offense: “Obviously there is a crisis of confidence and a dearth of proven marksmen but, please, don’t blame coach Alain Vigneault for stifling the offense with his system.” He explains further: “You take 66 shots [directed at the net] and don’t score on any, blame the shooters.” Considering the arrows for the forwards above, I have to wonder how you can’t blame the coach when all the scorers aren’t scoring, Bieksa being the lone exception, largely because he’s seen twice as much power play time in 28 games this season as in all 39 games last season. Considering expected vs. actual goals, we know the Canucks are getting what appear to be statistically good chances, but they haven’t gone in. I will say there is a possibility that the shoot-first-ask-questions-later approach to scoring isn’t working out too well.

If you’re familiar with the Canucks, you would know it’s uncommon to see the same two players skating together two games in a row, unless the Canucks win or the players have the same last name. 28 games into the season, Vigneault still hasn’t figured out what line combinations maximize scoring. He chose to send down Coulombe, who was significantly leading the power play in terms of expected and actual goals; however his 7 goals against per hour at even strength was too much for Vigneault, although I would argue he had some tough luck, given that he has some of the best expected goals against statistics on the team. Either way, Vigneault isn’t letting players play together long enough to figure out if they’re effective together, so every game different players have to adapt to new combinations and score at the same time. As a result, goal scoring is lower. Most combinations of players can play together defensively, which is what we are seeing, but scoring takes time and patience, which Vigneault appears to lack, and it’s hurting the team’s confidence. That lack of confidence can be heard in this quote from Ryan Kesler: “I must have had a bad practice or something, I don’t know…I was pretty excited to get the chance to play with [the Sedin twins], but I’m still playing with a couple of good guys on my line…who knows what’s going on.” There’s not much more to say, but many of the players are frustrated with how this line juggling has worked. Bulis, who was promised top line minutes, is only now getting an opportunity to try out on the top line with the Sedins (they’ll play their 3rd shift of the season together tonight). It would appear all those shots aren’t helping Vigneault; maybe he’s disrupting some sort of chemistry, or it’s just extraordinary bad luck, but either way the Canucks can’t score under Vigneault.

Can we still blame the Players?

The argument from the Vancouver Sun was that if players aren’t scoring on a lot of shots then their confidence is gone, and this is somehow their fault. If you ask me, when there’s a problem with one player you look at that one player; when there’s a problem with all the players, you look at the guy in charge. There are still a few things worth noting about individuals. Last season Naslund had 98 shots and 15 goals in the first 28 games; this season he has 93 shots and 12 goals, well within reasonable random variation. Interestingly, this year’s shots produced more expected goals. If you look at shots as a percentage of the team’s, Naslund took 11.4% in the first 28 games last year and is taking 10.6% this season: certainly not statistically significant, but still worth noting. Daniel Sedin went from 5.6% to 8.5%, which is no surprise as his ice time also improved. The problem is, while Sedin’s shots increased, his shooting percentage fell; in fact with the extra 19 shots he scored one fewer goal. Of course none of these results are statistically significant, as the sample sizes are way too small, but many are disturbing indicators. Scorers need good passes, and if the Canucks aren’t getting those it would certainly make their shooting worse; but since the NHL only records the passes that result in a goal, and not those that result in a shot that is stopped, it’s almost impossible to analyze whether some passers help shooting percentage more than others.


Still, a good chunk of this variation could be randomness; we’ll see if the Canucks can eventually pick up their scoring, but 28 games in it appears the Canucks will struggle to score goals all season.

*I wanted to post something this week, but due to exams I lack the time, so I wrote this quick article about the Canucks. I'll have a lot more time once my last exam is finished on the 13th; until then there'll be little new material here.

November 28, 2006

Individual Player Contributions

I've been working on many methods to scale out the line effects, this one appears to be the most promising at this point. Not to make any comparisons, but David Johnson has created one as well.

I assume player scores add linearly and that the line effects are primarily caused by pairs (this method can be extended to 3's, 4's or even 5's). Before going into the details I'll provide some quick shorthand: I12 = I21 = total ice time player 1 and player 2 spent together; S1 = the score (goals/hour) for player 1; G12 = G21 = expected goals (based on shots for/against) while player 1 and player 2 were on the ice. Now I assume that I12*(S1+S2)/2 ~ G12, that is to say, the average of the two players' goals per hour multiplied by their shared ice time should approximate the number of goals for/against. So if one player is very defensive and the other is terrible defensively, they should be average defensively together. I know G12, that is, how many (expected) goals for and against occurred for any pair of players, and I also know how much ice time every player has spent with every other player. But I do not know S1 and S2, the individual scoring statistics (units: goals/hour). Depending on the team there are 30 or so players who have played a game with the team, so there are 30!/(28!*2!) = 435 equations with 30 unknowns. I can then estimate the coefficients using a regression (with no constant term). I wrote my own regression code for this matrix and as such I don't have error details: I don't know how well it performs.
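The system above can be solved with an off-the-shelf least-squares routine. This toy example (3 players, made-up ice times and goals, numpy in place of the hand-rolled regression) shows the shape of it:

```python
import numpy as np

# Each pair (i, j) contributes one equation: I_ij * (S_i + S_j) / 2 = G_ij.
# Toy data: (player i, player j, shared ice time in hours, expected goals).
pairs = [(0, 1, 10.0, 25.0),
         (0, 2,  4.0,  6.0),
         (1, 2,  8.0, 20.0)]
n_players = 3

A = np.zeros((len(pairs), n_players))
b = np.zeros(len(pairs))
for row, (i, j, ice, goals) in enumerate(pairs):
    A[row, i] = ice / 2.0   # coefficient of S_i
    A[row, j] = ice / 2.0   # coefficient of S_j
    b[row] = goals

# Least squares with no constant term, as in the text.
S, *_ = np.linalg.lstsq(A, b, rcond=None)
print(S)  # individual goals/hour rates: [1.5, 3.5, 1.5]
```

With a real roster the system is overdetermined (435 equations, ~30 unknowns), and `lstsq` would also return residuals, which answers the "I don't have error details" problem.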

  • The benefit is that this algorithm will not alter the actual statistics significantly so if, for example, one Sedin has 1 extra goal compared to the other they will still be rated equally.
  • It also allows significantly different scores for players who do spend significant time together given significant scoring differences, due to the fact that: a lower S1 can be made up by a higher S2.
  • It doesn't chase low-minute players' statistics, as their coefficients are small and contribute less to the squared error.
  • It produces extremes periodically (negative goals for/against), you can't score negative goals for...
  • The scoring rate solutions (S1, S2) aren't very comparable between teams, and it's not always clear how they arise.
  • Since S1 and S2 aren't very logical, this leaves me multiplying by ice time to get an approximate "individual plus minus" statistic.
I haven't tested it with past data, and this season doesn't have enough data to make these results anything but full of problems, but I'm primarily posting this for reader response, that is, for people to criticize or compliment the results so I can see if I should continue this development. I find it interesting to compare the offensive numbers (Plus) vs. the defensive numbers (Minus); D just calculates the difference between them. I'm just posting the Northwest division to start with.











November 26, 2006

Site Update

Most sites provide some interesting power play and penalty kill statistics, the most basic being sites such as ESPN. However, there are a number of stats not available at these sites, such as my preferred expected goals and shot quality neutral save percentage, a measure of shot quality. All of these can indicate, before it is otherwise apparent, that a team has problems in a certain situation. For example, I have Edmonton ranked as the worst power play team in the NHL (slightly below Phoenix), as they have taken very few shots and taken horrible ones. Somehow they’ve managed to score at 6.6 goals per hour, when I would predict 4.3 goals per hour (a pace I doubt will continue over the course of the season). This was better stated by Andy Grabia one week ago: “The Oilers powerplay got a goal tonight. Thank God. But it wasn't on the 5-on-3. That was wasted by MAB faking the shot about 17,000 times, and the continual passing back and forth along the point. Nor was the goal deserved, although the shot was a beauty. Has this team heard of anything other than a one-timer? Sweet Caroline, our powerplay stinks. And it is not personnel. It's coaching.” You’ll note: they got a goal they didn’t deserve, and the statistics agree; Edmonton has gotten a lot of powerplay goals it didn’t deserve. I’ll disagree partly: it’s not all coaching, as I think the defensemen in Edmonton are not conducive to a productive powerplay, but that’s another issue.

In order to somehow “summarize” this data I created the definitive situational table. I decided to include all the data in one massive table because I want to be able to see what happens to even strength data when I sort by power play data. To make things a little simpler I also split the data into separate even strength, power play and penalty kill tables (as it’s easy to do so). All columns are sortable, and in general one can spend hours sorting by all 27 columns. I included a goal differential in the separate tables (there was no room in the big one), but I put little weight on it as there’s a lot of random error in the results. This is more data than most people can stomach in one day, let alone a few minutes, but I’m sure some people will like it as much as I do. If there are any errors let me know.

Also, why on earth do the Canadiens have such an amazing penalty kill?

Face-offs: Part I

I went through the work to calculate this data for Vic Ferrari as a result of comments in this post. I wasn't sure what I was looking for and still don't know. The actual coding is a pain, as the NHL does not record whether a face-off is even strength or power play, so I had to use my standard penalty prediction method to determine when teams are even. Naturally, the more complicated the procedure, the more chance for error, but the results appear reasonable. So all I'm calculating is the probability of a shot and the probability of a goal after a face-off in the offensive zone. There are two situations: a win and a loss; if you lose you can still battle the puck back and get a quick shot off, but obviously the odds of that are much lower.

Using 8 seconds (2005-2006):
Face-offs aren't recorded as EV/PP so I had to use my penalty prediction of PP/EV algorithm.
Even Strength:
win-shot: shots/face-offs = pct%
Win-Shot: 4622/17334 = 26.7%
Win-Goal: 176/17334 = 1.02%
Loss-Shot: 698/17367 = 4%
Loss-Goal: 34/17367 = 0.2%
Power Play offense:
Win-Shot: 1399/5508 = 25.4%
Win-Goal: 74/5508 = 1.34%
Loss-Shot: 195/4460 = 4.4%
Loss-Goal: 4/4460 = 0.31%
Short handed offense:
Win-Shot: 122/616 = 19.8%
Win-Goal: 5/616 = 0.81%
Loss-Shot: 10/806 = 1.2%
Loss-Goal: 0/806 = 0%

It's interesting, but most shots after face-offs are garbage (shooting percentage half of normal).

  • PP face-off win percentage = 55%. (Players who play more PP time will have high win %, PK players will be lower)
  • Shot rates after a win are high (~25%), even though the shots are mostly unsuccessful.
  • I hope there are no bugs in these results...
  • The data is poor; it's possible for a shot to be displayed after a face-off even if it occurred before the face-off.

November 19, 2006

Self Promotion

I know a lot of people don’t go to my statistics website, but in my humble opinion it has a lot of useful features. In the past week I’ve created tougher restrictions for my player lists (at least 10 games played): forwards, defense, goaltenders. I’ve added new features such as team summaries, which show a shooting graph for and against along with general statistics. I’ve also created better navigation from the player lists to my team summaries and team rosters. A while back I downloaded all the hockeydb information so I could create neat player information pages. More importantly, I’ve worked out some of the bugs in my predictions, so they should be much better now, and I’ve also included a graph of how my predictions change over time. As a result of changing my code, my past incorrect predictions will eventually need to be deleted. A new feature I'm working on is a power ranking system, which can be seen in my full NHL list. In any of the lists you can click on the team picture to get to the team page, or on a player's name to get their page. Many of the columns in each table are sortable as well (just click on the names at the top).

These pages are updated every couple of days and should be reasonably accurate. Much of the information presented won't be found anywhere else: a goaltender's shot quality neutral save percentage, a save percentage based on quality and quantity, or a goaltender's ability to stop easy, medium and difficult shots. For forwards and defense I have expected goals for and against for each player (a better measure than plus-minus), all measured as rates rather than absolutes, along with the standard plus-minus measures and points. On each player's information page I display their score in each game; this can give you an idea of how volatile the scores are, and also how the player is performing over time. All these statistics are separated into power play, penalty kill and even strength, as one would expect.

I’m not sure how the website performs under load; when I’ve tested it from remote locations it appears to work well, but my database doesn’t scale well, so as the season progresses some of the pages could take some time to process.

It would also be nice if readers let me know about features or data they want to see on the website. I've primarily made this website for myself, but I feel others can benefit, and it's easy to add more information to these pages. You can post requests in the comments. And of course if you find any bugs, let me know; that would be great.

November 18, 2006

Overtime going into extra innings.

In the last article I concluded that: “I have good reason to conclude that overtime occurs randomly given any two teams and that the results once in overtime are completely random.” My earlier analysis focused on overtimes globally, in order to predict which games will go to overtime and make my point predictions more accurate. The graph below shocked me the most during that analysis: teams with a lot of regulation wins were unable to perform better in the overtime session, in both the shootout and the 5-minute four-on-four. I wrongly concluded that this means the results are random; however, I think most readers will agree that if skill doesn't determine who wins, then what on earth does?

Randomness naturally creates clusters and anomalies; these should be expected. The question to ask is "are there too many anomalies?" Now, a season is only 82 games and there are 30 teams, most of which see around 20 overtimes. One of the beautiful things about this analysis is that we know the distribution of random binary variables: the mean = probability (p) and the standard deviation of the proportion is sqrt(n*p*(1-p))/n [n = number of events]. If a variable is perfectly random with an equal chance for each side to win, then we expect the probability to be 50%, which means the standard deviation is sqrt(n)/(2n), giving 50% ± sqrt(n)/(2n). So if you were to flip a coin 4 times you'd expect to get heads 50% ± 25% of the time, for a 95% confidence interval (two standard deviations) of (0%, 100%). Of course this is really a 100% confidence interval (you can't ever get above 100% or below 0%, so I'm 100% sure that 4 coin flips will land in that range), a consequence of approximating the binomial distribution with the normal distribution at small n.
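A small sketch of the interval calculation above:

```python
import math

def random_win_interval(n, z=2.0):
    """95%-ish interval for a win proportion under the coin-flip null:
    0.5 +/- z * sqrt(n * 0.25)/n = 0.5 +/- z * sqrt(n)/(2n)."""
    sd = math.sqrt(n) / (2 * n)
    return 0.5 - z * sd, 0.5 + z * sd

print(random_win_interval(4))    # 4 coin flips: (0.0, 1.0)
print(random_win_interval(20))   # ~20 overtimes: about (0.28, 0.72)
```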

In the NHL you can always find teams that fall outside the normal 95% range, simply because there are 30 teams: if I'm 95% confident that each team is in this range, then 5% of teams should fall outside it, so 1.5 teams should be outside on average, and 2 teams outside this range is actually reasonable. If you wanted a range that includes all teams, you'd want a 99% or 99.5% range (3 standard deviations).

Modeling the Overtime
I could use theory to analytically calculate the values for winning in overtime, but it's often easier to write a script to simulate the results. First I simulated the 4-on-4 assuming team A was 9% better than team B: team A scored at a rate of 1.2 goals per 20 minutes and team B at 1.1 goals per 20 minutes [defense is ignored]. These numbers accurately reflect actual scoring in overtime. I simulated 50,000 times, so the results should be good to ± 0.2%. The result: a team that is 9% better wins 52% of the time. The shootout relies on skill a little more: a team with a 10% better shooting percentage (whether this comes from shooting or goaltending is irrelevant) wins 54% of the time. So shootout results should correlate with skill about twice as strongly as overtime results. There is only a 50% chance of making it to a shootout once you get to overtime. As I mentioned before, there is a correlation between winning in OT and winning in regulation, but it's not significant. This often means that given more data there could be a relationship (this is always possible), but we are limited because there is only 1 season of data. There is a more significant relationship for winning in the shootout than in the four-on-four portion, as predicted by this simulation. Of course, if you look at the variables and do a regression, nothing comes out as important; for example, save percentage doesn't appear to matter in the shootout. So we have theory that says better teams should win, but they don't appear to, and even if they did win at 52% it would be hard to detect (and having 2% error in a prediction algorithm isn't that bad).
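A reconstruction of the 4-on-4 simulation (not the original script): each team scores as a Poisson process at the stated rates over a 5-minute overtime, and ties are left for the shootout. Because the process is memoryless, the better team should win the decided games at about rate_a/(rate_a+rate_b) ≈ 52%, matching the figure above.

```python
import random

def simulate_ot(rate_a=1.2, rate_b=1.1, ot_minutes=5.0, trials=50_000, seed=1):
    """Fraction of decided overtimes won by team A.

    rate_a, rate_b: goals per 20 minutes of 4-on-4 play (defense ignored)."""
    random.seed(seed)
    wins_a = decided = 0
    for _ in range(trials):
        # minutes until each team's first goal (exponential waiting time)
        t_a = random.expovariate(rate_a / 20.0)
        t_b = random.expovariate(rate_b / 20.0)
        if min(t_a, t_b) < ot_minutes:   # someone scored before the horn
            decided += 1
            if t_a < t_b:
                wins_a += 1
    return wins_a / decided

print(simulate_ot())  # close to 0.52, as in the text
```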

Actual Results
The easiest way to test if data is random is to compare it to what true randomness would predict. Each team has a given winning percentage; compare that to how it would "normally" distribute if it were random, that is to say calculate a z-score = 2*(team score - 0.5)/(sqrt(n)/n) and then plot the z-scores [error/standard deviation]. For example, Dallas' 12-1 works out to 2*(12/13 - 0.5)/(sqrt(13)/13) = 3.05. In other words the result is 3 standard deviations away from average, which is extremely rare (0.25%), but the probability that at least one of 30 teams is 3 standard deviations away in a season is 8%, which is low, but not unreasonable. So Dallas probably was good and lucky. I'm not sure how psychology plays into all this: that is to say, if you go into Dallas knowing they're 11-1, you may think you will lose and hence you lose. Once you have a set of z-scores for each team you can plot them. If they're perfectly normal (mean = 0, standard deviation = 1) then you can conclude that the variability is identical to that of randomness (whether this means it's actually random is of course not determined). Minitab creates nice summaries of this data, including the "standard deviation of the standard deviation". Like all statistics, neither the mean nor the standard deviation is known; we must estimate both, and with every estimate there is error, so the standard deviation has error just like the mean. Now, if this error interval doesn't include 1 (the value required for randomness) at a statistically significant level, I can conclude the data is statistically significantly not random; however, if it does include 1, I must not reject* the possibility that it's the same as randomness. So below are these statistics, and in the red box I have a 95% confidence interval for the standard deviation.
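The z-score arithmetic above is easy to check in a couple of lines (the helper name is mine):

```python
import math

def record_z_score(wins, games):
    """Z-score of a win fraction against the coin-flip null of 50%.

    Same formula as in the text: z = 2*(p - 0.5) / (sqrt(n)/n),
    which simplifies to 2*(p - 0.5)*sqrt(n).
    """
    p = wins / games
    return 2 * (p - 0.5) * math.sqrt(games)

print(round(record_z_score(12, 13), 2))  # Dallas' 12-1: prints 3.05
```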
Summary of the Graphs
Now if you followed any of that, you will have noticed that the 4 on 4, and overtime in general, included 1 while the shootout did not. That is to say there was statistically significantly more variability in the shootout than we would expect if it were random. So Dallas probably wasn't 3 standard deviations from their actual score (maybe 2.5 or 2, who knows), but the others were too close for this small data set to determine whether they are random or not (I will add this year's data at the end of the season and see if it becomes significant). Of course all these standard deviations appear more variable than randomness; it's only at the margins that we can call any of it statistically significant.

My Predictions
The important variable, overtime, includes 1, and as such there is insufficient evidence at this point to consider overtime determined by anything other than randomness. Since overtime is what I care about (who gets the extra point), and not whether the game is decided in a shootout or in the 4 on 4, I have chosen to use a random variable to predict overtimes. I will again remind readers that not only does overtime have very similar properties to randomness, it doesn't correlate with skill, so bad teams win just as often as good teams do. This means there is no useful way to actually predict who will win the OT even if I knew it weren't random. Does this mean that overtimes are the same as random variables? Absolutely not: they are extremely complicated, with 18 skaters, a goalie, a coach (and assistants), referees, dynamic air and ice variables, not to mention psychological factors. However, at this point, based on the data available to me, the best model is a random variable.

Challenge to Everyone
I challenge the readers to prove me wrong, that is to say to show that there exists a statistically significant (95% confidence) variable that can predict the overtime results.

*Hypothesis tests:
A hypothesis test has two hypotheses:
Null hypothesis: the claim initially assumed to be true [the data is random]
Alternative hypothesis: an assertion that contradicts the null hypothesis [the data isn't random]
When we do the test we have two options:
Reject the null hypothesis in favour of the alternative [reject "the data is random" for "the data isn't random"]
Do not reject the null hypothesis and continue with the belief that our initial claim was true ["the data is random"]
This does not prove the data is random, simply that it's not different enough from random that we should conclude otherwise. This is exactly what I'm saying: there's insufficient evidence for me to use anything other than a random variable in my model.

Due to the nature of this site I'm often a little loose with the concepts of "not rejecting" and "accepting" (they're different), just because this isn't supposed to be perfectly formal.

November 17, 2006

Predicting what games go to Overtime

Problems with current predictions

So I’m going over 2005-2006 data to enhance my standings predictions. I was a little shocked that, for example, the northwest division when sorted by point projections was negatively correlated with actual points. In other words, Minnesota with 22 points is projected to be the worst (75 expected points) and the Avalanche with only 16 points are expected to get 109 points. So I tediously went through the 2005-2006 data testing alternative algorithms. Of course one season of testing isn’t going to be accurate, but it’s the only set of data that’s complete enough to test with (I need a lot of shot data, which I don’t have for 2003-2004, plus the NHL rules have changed). My current algorithm simply uses expected goals vs. actual goals against (with a special averaging function that scales down blowouts); this was the best predictor of the last half of the season based on the first half’s data. Certainly Minnesota is falling because they miss Gaborik; I’m not sure my algorithm can pick up changes that quickly, but obviously he’s an important part of the team. The Canucks, who have a negative goal differential, are predicted to get 104 points, which doesn’t make sense either, but then again their shooting percentage is at a low 5.6% when I expected 6.6%, so assuming expected goals accurately predict scoring, the Canucks should score at 6.6% for the rest of the season rather than 5.6% (in other words, the Canucks have seen tougher goaltending than average). It’s hard to argue with a prediction that does better than a standard Pythagorean prediction, so with all its problems I decided to keep it.
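For reference, the "standard Pythagorean prediction" I'm comparing against is just goals for and against run through the Pythagorean formula; a sketch (the exponent of 2 is the textbook choice, and the function name is mine):

```python
def pythagorean_win_pct(goals_for, goals_against, exponent=2.0):
    """Classic Pythagorean expectation: win% from goal totals alone.

    The exponent is a modeling choice; 2 is the traditional value,
    though analysts often fit slightly different exponents for hockey.
    """
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)

# A team scoring 250 and allowing 220 projects to about a 56% team.
```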

About Overtime

But in the process I discovered something interesting about OTs. What’s interesting about overtime is that it should be a function of skill: if two equal teams play each other they should be more likely to go to overtime than, say, Phoenix and Detroit. However, a quick binary regression shows there’s little significance to this assumption on a game by game basis. 22% of all games go to OT with little skill-based reason for it; the regression shows a slight tendency for good teams to go to overtime less, or put another way, Ottawa and Detroit went to overtime a lot fewer times than average teams. Maybe more interesting is that there is virtually no correlation between winning percentage outside overtime and teams’ records in overtime. It shouldn’t be too much of a surprise, then, that the overtime system in the NHL is essentially random. I’m saying all this to say that I have good reason to conclude that overtime occurs randomly given any two teams and that the results once in overtime are completely random. Basically every game has about a 1 in 4 chance of going to overtime and then each team has a 50% chance of winning.

Random Overtime

The consequences may not be obvious immediately, but the first thing that comes to mind is that this guarantees 22% of the NHL standings are the result of pure randomness (above the normal randomness you would observe anyway). What I’m saying is that teams will get about 28 free points (95% confidence interval of (21, 35)). As you can imagine, the team or two who only get 20 points will have to be 7 points better (10% better) than average just to make the playoffs, while the team who gets 34 points in overtimes can be 10% worse. Minnesota, for example, got 20 points in 14 overtimes last year, finishing with 84 points over the season; if they had been average and got 28 points they would’ve had 92 points, probably wouldn’t have traded away Mitchell and Roloson, and possibly would have done very well in the playoffs.
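To see where numbers like these come from, here's a quick simulation of the "free points" idea under the random-overtime model. The exact interval you get depends on assumptions (for example whether the number of OT games per team is treated as fixed or random; this sketch treats it as random), so don't expect it to reproduce my interval exactly:

```python
import random

def free_point_distribution(ot_prob=0.22, games=82, trials=20_000, seed=1):
    """Simulate points a team collects purely from random overtimes:
    one guaranteed point per OT game plus a coin flip for the extra.

    Returns the mean and an empirical 95% interval over many seasons.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        ot_games = sum(rng.random() < ot_prob for _ in range(games))
        extra = sum(rng.random() < 0.5 for _ in range(ot_games))
        totals.append(ot_games + extra)
    totals.sort()
    mean = sum(totals) / trials
    lo, hi = totals[int(0.025 * trials)], totals[int(0.975 * trials)]
    return mean, lo, hi

# The mean comes out near 82 * 0.22 * 1.5 ≈ 27 free points per season.
```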

Obviously overtime is fun to watch; it’s always exciting to see shootouts, there’s no question there. But if you think about what I said, you’d realize that determining who gets the extra point via a coin toss would produce the same results. Personally I find this frustrating; if it’s so important to have a winner, this works, but it simply hurts the overall ranking of teams, and of course this is on top of the scheduling problems. Interestingly, this should make the NHL appear more competitively balanced, since it pushes the results closer to random.

Fixing my Algorithm

What does this mean for me? Well, I realized my current algorithm, which assumed the better team wins overtime more often than the worse team, was incorrect, and I was also incorrect that overtimes only occur between teams of similar skill. This means that I will randomly predict overtimes for all games played. So if a team is predicted to go to overtime with Detroit, for example, this will give them free points they probably wouldn’t have gotten otherwise. This is of course a lot easier to program than the alternative of trying to find correct cutoff percentages (e.g. a game predicted to be won 53% of the time goes to overtime). Of course this will make bad teams look good, because they get a number of free points. For example, St. Louis last year got 29 of their 57 points, over half, in the random overtime.
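Under the random-overtime model, a game's expected points collapse to a simple formula, which is exactly why it's easy to program (the function and parameter names are mine):

```python
def expected_points(p_reg_win, ot_prob=0.22):
    """Expected standings points for one game when every game has a
    fixed chance of going to overtime and OT is a pure coin flip.

    With probability (1 - ot_prob) the game is decided in regulation
    and the team wins 2 points with probability p_reg_win. In an OT
    game both teams bank 1 point, plus a 50% shot at a second.
    """
    regulation = (1 - ot_prob) * (2 * p_reg_win)
    overtime = ot_prob * (1 + 0.5)  # guaranteed point + coin-flip point
    return regulation + overtime

# An average team (50% regulation winner) expects about 1.11 points
# per game; the 0.33 from overtime is the same for everyone.
```

Notice the overtime term doesn't depend on the team at all, which is the formal version of "bad teams get free points."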

November 16, 2006

More on Diving

A third diving infraction results in a $2,000 US fine; a fourth warrants a one-game suspension. "One of the impediments to the enforcement of hooking and holding and interference was the diving," Campbell pointed out. "Or the embellishment of those calls to draw a penalty. We knew this would happen because players are competitive and they do what they have to do to win the game," explained Colin Campbell, the NHL's disciplinarian. "So this is how the players and the managers have asked us to handle it."

Avery was recently fined 0.09% of his annual income, and I guess that means it’s time to write about diving again. A number of people liked my diving article from a while back, which suggested that unless the NHL makes drastic changes to its rules it will be unable to control diving, as players still benefit more than they lose by diving. I wanted to write a continuation, with a little data this time. First I wanted the NHL’s opinion on how they’re doing on the diving front. I discovered that articles about diving are hard to find, because it seems every hockey writer likes to say that a goalie made a “diving save”, which is an entirely different issue, but I did find a little blurb at USAToday, which makes two simple arguments about the positives associated with diving enforcement this season.

At this point last season, Walkom says, about 20 diving penalties had been called. "We are probably closer to 30 this season."

Walkom is correct that diving penalties are up about 50% this year, but as I often say, 50% of nothing is, well, nothing. Last season there were 109 diving calls in total. At the equivalent point I count 25 in 2005-2006 and 35 this year, meaning of course that since the article was written things have been a lot closer to even. So whether the NHL maintains this level of diving calls is questionable (they want to make a statement at the beginning of the season so they can have nice quotes in articles).

The other difference, according to Walkom, is about half of the diving calls this season were called without being connected to another penalty. Last season, diving calls were primarily called in conjunction with a foul. For example, one team's player would be called for hooking, and the "hooked" player would be called for embellishing the fall to draw a referee's attention.

In 2005-2006 there were 109 diving calls, 89 of which were associated with other penalties on the other team (82%). This year, of the 36 diving calls, 22 were associated with other penalties, which works out to 61%, down from last year's 82% (that is to say, standalone diving calls are up year over year, although since that article was written there have been a few less of them). I am always glad to see this increase, although in my opinion, as long as the hook is still called, the diving call has no value (there's no cost). This means that in 266 hockey games there were 14 diving calls without a penalty on the opposing team. Those 14 penalties become about 2.5 goals against. Over the course of the season this is about 65 power plays and about 12 goals against, distributed amongst 30 teams, so this costs each team about 0.07 wins (~$70,000[1]). I’d say most players would call that just the cost of doing business.

There were 1251 “subjective” calls this season, which include hooking, tripping, interference and holding the stick, all of which I consider “dive-able”. I’m sure people could think of more, or debate my choice of penalties, but that won’t affect this analysis significantly. Of these 1251 penalties, 48 were associated with a call on the other team at the same time, so 1203 should result in a power play, which of course is 86 times as many as the diving calls. Just looking at that 86:1 ratio one can figure out you should probably still dive. There is no good way to determine the number of players actually diving, but if you look at 2005-2006 there were 84 players penalized for diving, and I have 762 players listed in my database with more than 20 games played, which works out to the NHL declaring that 11% of players dive. Now why would these players not dive on almost every call? I’ll leave that for the readers to figure out. So let’s say that 11% of the above calls were associated with diving (138); we know there have been 36 diving calls, so 26% of dives resulted in diving calls (this is pretty high), and of those calls about 40% resulted in only the diving call, which works out to 10% of dives resulting in just a diving call. So here’s the outcome distribution for a dive: 74% power play, 16% 4 on 4, 10% just a diving call. It should be obvious that the choice would be to dive, unless that $1,000 fine is that big of a deterrent to someone who makes >$500,000.
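The payoff calculation can be written out explicitly. This sketch uses the outcome split above; the ~17% power-play conversion rate is my assumption, implied by the 472 power plays and roughly 80 goals cited in the next paragraph:

```python
def dive_goal_swing(p_powerplay=0.74, p_four_on_four=0.16,
                    p_dive_only=0.10, goals_per_pp=0.17):
    """Expected goal swing from one dive, per the outcome split above.

    A drawn power play is worth ~goals_per_pp goals for the diver's
    team; a dive-only call hands the same value to the opposition;
    the 4 on 4 outcome is treated as a wash for both teams.
    """
    return (p_powerplay - p_dive_only) * goals_per_pp

# The expected value is solidly positive (~0.11 goals per dive),
# which is why a $1,000 fine alone won't stop diving.
```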

Of course the 138 dives in the first 266 games is about 638 over the season, with 472 power plays for the diver, or about 80 goals distributed amongst 30 teams, or 2.6 goals per team, which works out to about 0.5 of a win or about $500,000[1]. That’s assuming the NHL has a fixed number of divers; if you start to assume every player dives 50% of the time, for example, these numbers really start to get big. But as you can see, $500,000 is greater than $70,000: 2.6 goals for completely dominates the cost of 0.4 goals against. Of course it’s hard to say what percentage of the time a power play would have been called without the dive.

This wouldn’t be as interesting if I just talked about the big picture; each diving call has a diver associated with it and a referee who calls it. For example, last season only 4 players got more than 2 diving calls against them. Ilya Kovalchuk had 4; here’s a dominant player who can skate circles around defenses, so it should be no surprise that a guy like Kovalchuk is on the receiving end of a lot of calls. Whether he’s actually diving, or just getting diving calls because he’s always being hooked and referees randomly call dives, would be hard to determine. There are three players tied for 2nd, and the list includes Gaborik, Zubrus and Afinogenov. Now why this list is made up of Eastern European players (Czechoslovakia or USSR born) is a good question: are they bad actors, do they have less integrity, or do NHL referees have a bias against them (the Don Cherry theory)? For my part I won’t conclude any of the three, but I will say it’s interesting that there are four Eastern European skaters at the top of the list. However, the important thing is that all these skaters are strong, fast, and likely draw a lot of their teams’ penalties; I bet the three diving calls didn’t hurt any of their teams. Of course if Kovalchuk gets 4 again this season he’ll have a one game suspension. If you estimate Kovalchuk’s value over the season to be about 5 wins (based on player contribution), that works out to 0.06 wins for the suspension, which roughly doubles the cost to the team, but still doesn’t exceed the benefits. Of course Kovalchuk would also lose a lot of money with a one game suspension.

It’s interesting, but you need referees to make these calls, and spotting a dive isn’t the same as spotting a hook, as it’s subjective: one referee’s dive is another’s fall. There are a few who call over 0.2 dives per game in 2006-2007: Brad Watson, Dennis LaRue, Dean Morton, Michael McGeough, Brad Meier, Dan O'Rourke; or above 0.2 in 2005-2006: Stephane Auger, Dan O'Rourke, Bob Langdon. But most of the referees hardly call any diving penalties at all (<0.1 per game).

Personally I believe every player dives to a certain extent; it’s when a player goes over a certain NHL-determined line that they call the diving penalties. I’m sure everyone has seen replays of a stick getting very close to someone’s face and their head whipping backwards when it doesn’t touch; this could be a natural reaction to something shoved that close to your face, or it could be embellishment. You could of course do studies on “correct” reactions to things like sticks being shoved close to someone’s face in order to determine what embellishment is. The same goes for a hook: once the player feels the hook, he just stops exerting any balancing force and falls naturally; that’s an undetectable dive. I’m sure everyone has seen this one too: a player holds another player’s stick under his arm, and when that player pulls to get his stick loose, the first player falls. There are many ways players have found to get power plays, and most of them aren’t very honest, but the NHL has chosen to encourage them by calling the penalties and not attaching much, if any, consequence to doing so. I wish the NHL good luck, but their enforcement of diving is going nowhere.

[1] - I approximate the value of a win based on the salary cap of approximately $41 million. That is to say the average number of wins is 41 and the average team spending is $41 million, so one win is worth around $1 million. And it takes about 5.5 marginal goals to get a win.
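In code, the footnote's conversion is straightforward (constants taken directly from the footnote; the function name is mine):

```python
def goals_to_dollars(goal_differential, cap=41_000_000, avg_wins=41,
                     goals_per_win=5.5):
    """Convert marginal goals into an approximate dollar value.

    One win ≈ cap / avg_wins ≈ $1 million, and it takes about 5.5
    marginal goals to buy a win, per the footnote above.
    """
    win_value = cap / avg_wins
    return goal_differential / goals_per_win * win_value

# The 2.6 goals per team from diving works out to roughly $470,000,
# which I rounded to $500,000 in the text.
```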

November 12, 2006

Ottawa Senators

There have been a number of articles written on Ottawa, and for good reason. Here’s a team that’s played well, but is ranked second last in the east (just above Philadelphia). Interestingly, my algorithm is predicting a great 123 point season, or 110 points in the remaining 66 games (77%; I subtracted 8 points for OTLs). According to the Pythagorean prediction the Senators should be playing at 57%. However, in the last 7 games they’ve managed one win: if all those games were coin tosses there’s a 6.25% chance of winning at most one of them, and if they were a 57% team on average there’s a 2.8% chance of those 7 results. Also, those losses include two losses to 13 point Boston. Interestingly, it seems that every win is a blowout and every loss is a one goal affair, as well stated by Michael from LCS Hockey: “But even that number is misleading. Twenty-one of Ottawa's 37 goals came during the club's three-game binge against New Jersey and Toronto, meaning the Sens have managed just 16 goals in their other nine games.”

So what is going on? Obviously luck plays a certain factor here, but I feel that one cannot look at the Senators’ performance without asking what might be causing it. McHockey notes the terrible power play for the Senators. And Michael from LCS Hockey concludes that Ottawa needs a second line center. Kelly Hrudey made some excellent comments on Martin Gerber and how he’s struggling to see pucks through traffic. However, what I’ve noticed is that Ottawa has lost a significant number of games because the other team comes back from a deficit. If you go over the last 7 losses you’ll see that they’ve lost their games as a result of the second or third period. Boston outscored Ottawa 3-2 in the second on the 11th, Atlanta got 2 in the third to win 5 to 4, Carolina got 2 in the third to win 3 to 2, Montreal scored two in the second to win 4 to 2, and finally on the 28th of October Boston got two in the third to win 2 to 1. Even in the 3 game blowout streak Ottawa only managed 1 goal on average in the third period. Of course every game has some chance of a comeback, but losing so many games that should be won raises questions about what’s going on with Ottawa’s defense in the third period. A bad power play or bad goaltending has nothing to do with when you lose the game. With Ottawa’s ability to score, goaltending shouldn’t be a big issue, and neither should the power play if they can get the goals at even strength instead.

Another way of looking at the above is a breakdown of the team’s performance by period. If you look at it this way, the team allowed only 1.7 goals against per hour in the first (and only 24.6 shots against per hour). So without further ado, here’s the table:


Before delving too much into the table above I should say that:
P stands for period.
S stands for shots.
G stands for goals.
EG stands for expected goals – an average number of goals based on the quality of the shots.
SQN% is the save percentage the goaltender would have if he saw average shots (it scales out the fact that some shots are easier or harder to stop).

Now SQN% per period is accurate to about 3%, so a difference of 1 or 2% is perfectly normal, and the differences in scoring and goaltending per period could be just the result of error (randomness). Still, the SQN% shows that Ottawa has been a bit unlucky in the third, as their shooting percentage falls to 7.3% and their save percentage falls below 90%. Scoring rates (EG and G) are both accurate to about 0.7; similarly, a difference of 4 shots is not significant. This makes the shots given to the opposition in the second period statistically significantly different (although actual scoring and expected goals aren’t statistically different). In general this holds across the NHL, which sees about 10% more shots in the second period compared to the first or third. But you can see the Senators were able to more than outperform the chances against them with their chances for, generating an astounding 4.3 goals for per hour in the second compared to 3.6 against, although many of these goals came in the blowouts.

The third period is really where the luck part strikes you. Of course you can’t conclude whether they’re unlucky in the third or lucky in the first, but it’s probably a bit of both. It’s not like Ottawa isn’t shooting the puck in the third or getting good chances; it appears the opposition is simply getting lucky (the good shots aren’t going in). It’s interesting: while there’s a general trend toward worse than average goaltending in the third period (SQN%: 90.9% vs. 90.3%), Ottawa has faced better than average goaltending (their shots are being stopped too often).

Basically, the problem is this: Ottawa hasn’t scored enough in the third period. If Ottawa could play as well in periods 2 and 3 as they do in the first, they’d be doing just fine. If the first period becomes more like the 2nd and 3rd, then they’re in trouble. I don’t have any more to say because I wanted to get this finished before the Canadiens game at 7:30 EST.