July 31, 2006

Canucks: Long Summary

Losses:

Jovanovski was insanely important for the power-play. He was on the top power-play unit almost 100% of the time until he went down, and as soon as he left, the power-play seemed just as injured as he was. A lot of people don’t realize how important he was on the penalty kill, where he was easily Vancouver’s number one penalty killer. His special teams play will be missed. A recent study also showed that Jovanovski was able to control both the quality and quantity of shots against at even strength; potentially the extra goals came from odd-man rushes, but more than likely they were the result of random chance.

Bertuzzi was useful at times. He was effective on the power-play, but other than that you may as well have put out a pylon. I’m not saying he wasn’t useful: he had 25 goals and would’ve had more if Jovanovski had stayed healthy.

Baumgartner appeared to be effective no matter who he played with; however, he could’ve been given easy minutes at even strength that are hiding some of his flaws. That being said, he is one of the “greenest” Canucks on the chart. He is very replaceable, but probably would’ve been worth keeping at his small price tag.

I’ve always liked Allen. He was one of our better penalty killers, but he was average at even strength and didn’t play on the power-play. He is certainly replaceable.

Ruutu will be remembered for so much, but many possibly won’t remember his defensive play, which was tops on the team. His offense was lacking all season, but this is the result of playing with the likes of Linden, Goren and Kesler. He was possibly the number one or number three forward on the penalty-kill (after the Sedins).

Linden is also a good penalty killer, but that’s about it; we should be without him. I realize a lot of people expect him back, but in what role? We have replaced his penalty killing skills with better forwards, and he wasn’t exactly producing any offense the past two years. We’re better off developing young talent than paying Linden.

Additions:

There’s been a lot of talk about Luongo, for good reason. That being said, Luongo is just a goaltender, and a bad penalty kill can make your goalie look bad. It’s hard to say exactly what effect this will have on the team.

Mitchell is just an average defenseman who spends a lot of time on the ice. He is good at what he does, but now we basically have two Ohlunds, which is ok, but we do need some offensive-minded defensemen as well.

What does Chouinard provide? He is an effective power-play center, as he is good at offensive zone face-offs. Defensively he lacks many skills; you might call him the opposite of Linden.

Bulis could be called Linden; it would almost be silly to have both. The only difference between Linden and Bulis is ice time, as Bulis received a few power-play chances; however, he didn’t do so well with them.

I haven’t mentioned Tremblay, mainly because I hope he doesn’t play, but from the looks of it he came back to North America to play in the NHL, not the AHL, so I suspect he will be in the press box or on the ice.

Analysis:


Canucks Power-play:

Power-play goals accounted for approximately one third of a team’s offense; the Canucks were no exception, with 96 power-play goals in 2005-2006. Many would’ve called the Canucks power-play ineffective in the last half of the season, which makes the first half one of the most potent power-plays in the league. I would personally argue that Jovanovski was the largest individual contributor on the power-play: he made or broke it. Once he was injured, the top power-play unit went from amazing to well below average. You can see this relationship in the table. There’s no question in my mind that a power-play without Jovanovski is less powerful. We also lost Bertuzzi, who is reasonably replaceable on the power-play.

The second power-play unit took over for the top unit when Jovanovski went down, not in terms of ice time but in terms of goals scored. The defensemen they used were not great on the power-play by any stretch; that being said, Baumgartner was extremely good on the power-play (keeping the puck in), and it appeared to be the one area Ohlund excelled at (he was paired with Baumgartner). Carter helped the Sedins bang pucks into the net. Interestingly, we’ve lost both Baumgartner and Carter, so we’ve lost half of our power-play defensemen and one third of our forwards. So there are changes in sight.

Nonis has not picked up any spectacular power-play forwards beyond Chouinard, who is not considered to be a big goal scorer. Defensively, I have no idea who will be on the points beyond Ohlund and Salo, neither of whom I would call spectacular power-play specialists (Salo was brutal this past year). It would appear Krajicek will take over Baumgartner’s role; maybe we’ll see more four-forward power-plays (shorthanded goals, anyone?). The recent signing of Tremblay might be a sign that he will be the 4th, but none of these guys are going to be a threat on the power-play, and this means our forwards will be a critical aspect of the coming season.


Penalty kill:

We lost our best penalty killing forward and defenseman (Ruutu and Jovanovski). That being said, hopefully Chouinard can replace Ruutu on the penalty kill, and we still have the Sedins to kill critical penalties. Mitchell is not quite as good as Allen was at killing penalties, but he can replace Allen. I’m slightly curious whether they’ll use Krajicek on the penalty kill or only on the power-play. Either way, the penalty kill should be surprisingly similar; if Luongo can improve on Auld’s performance, then we should see improvements.

Even strength play accounts for two thirds of all goals, so winning at even strength is a good place to start in winning a game. That being said, the power-play contains your best offensive forwards and defensemen, and the penalty kill contains your best defensive players, so the above information transfers. Even strength, however, focuses more on depth: how defensive are your less defensive players, or how offensive are your non-power-play players?

Even strength Defense:

The losses are always the same, and if one is good at the penalty kill, one should be good defensively as well. Jovanovski and Allen will again be the most missed players; that being said, I’m excited about Krajicek, as he is a stellar defensive defenseman at even strength. We still have Salo. The rest are all average or worse than average defensively; this includes Ohlund and Mitchell. There are a few defensemen on the new Canucks team I don’t have data for: Tremblay and Bourdon. I see little in terms of changes in the forwards’ defensive structure, so unless they play a different style, expect similar chances (or worse) against Luongo. In terms of depth we have four arguably marginal players, Bourdon, Bieksa, Krajicek and Tremblay (negative?), three of whom will play on the Canucks. If Bourdon doesn’t make the team, then I’m scared.

Even strength Offense:

Just as with the power-play, we’ve lost a lot of our offense and traded it in for some hopefuls: Pyatt, Chouinard, Cooke, Bulis. Any of these players could have a great season, but if all four fail, expect the Canucks offense to sag. I personally expect Linden not to be back; I just feel Chouinard and Bulis were Linden’s replacements, and there is little incentive for management to get Linden to come back. I see the bottom two lines as follows: the third line as Cooke, Chouinard and Kesler, and the fourth line as Burrows, Reid and a depth player.

July 30, 2006

Shots against: The whole story.

Shots are the bread and butter of hockey games; hits, takeaways, giveaways, faceoffs, etc. all don’t matter if you don’t get shots or prevent them. “The role of the defender is to minimize both the quantity and the quality of shots on goal” (Shot Quality). Strictly speaking this is not quite true: some goaltenders let out bigger rebounds, some defenders screen a shot and others won’t, and current shot statistics don’t tell you how wide open the shooter is or how much time and space he has. I will demonstrate that this is still a reasonable assumption to make given individual player shot data.

As recently as last week, I believe no one knew how many shots each individual player allowed (shots against while the player was on the ice), largely due to the complexity of the data and the NHL publishing poor data. Only recently did the NHL provide enough data to find out which shots went to which goaltender. And now I can tell you how many shots each player faced, in each situation as well. This study begins with 5 on 5 play. This of course is the more complicated situation, because each player is concerned with both offense and defense. An offensive player who spends most of his time in the opponents’ zone will appear effective defensively; however, it is his offense that is preventing chances (which isn’t a bad thing).

Once I acquired this data I first calculated a few useful statistics: shot quality against average [1] and shot quality neutral save percentage (SQN%) [2]. I also calculated the difference between each player’s SQN% and his team’s SQN%. Assuming a player only prevents goals by preventing shots and reducing their quality, this difference should not vary significantly from 0. What this means is that if you consider the shots, the quality of these shots and the goaltender, the number of goals the shots predict should be the number of goals against, plus or minus some "small" error. It’s easier to demonstrate these differences with a graph of shots against vs. expected goals against minus goals against.


What one can see is data whose error grows significantly with the number of shots; the real question is what the error is and what part of it is the result of players affecting shots beyond quality and quantity. If you consider a shot a binary event (in or not in), you can use a binomial distribution and its standard deviation (sqrt(n*p*q)) to determine the errors. One standard deviation should contain about 68% of the data (using the normal approximation). A quick look at my data shows 65% (plus or minus 2%) within this region. Another standard cutoff is a z-score of 1.96, which should contain 95% of the data; I have 93% (plus or minus 1%). For further comparison, a z-score of 2.5 should contain about 99% of the data, and this is where my data begins to fail, with only 97% (plus or minus 1%). There are a number of places where this extra error could be coming from; the list is below:

  • Players affecting shots beyond quality and quantity
  • Shot quality missing some key factors due to NHL reporting or a poor model
  • Errors in my data collection (players not getting the correct number of shots)
  • Players playing with a bad goaltender more frequently than others (different SQN%)

Given the above errors, I will go on to assume that players, as stated by the hockey analytics website, only affect defense by controlling shot quantity and quality. While I agree this is a simplistic model that will miss some important aspects, it represents a good starting point. It also removes a huge amount of error and allows you to see players in a completely different light. If you’re interested in data beyond this you can study how players’ minus ratings compare; however, I would contend that the minus rating has a significant error, on the order of ± 10 for most players, meaning a -10 is potentially a 0, or a +10 is actually a +20 or a 0 (big difference).
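For anyone who wants to repeat the binomial check, here is a minimal sketch in Python. The function names and the four sample rows are made up for illustration (they are not my data), and it assumes each player's goal probability per shot is his expected goals divided by his shots.

```python
import math

def binomial_sd(n, p):
    """Standard deviation of the number of goals on n shots, each scoring with probability p."""
    return math.sqrt(n * p * (1 - p))

def fraction_within(players, z):
    """Share of players whose actual goals against are within z standard deviations of expected."""
    inside = 0
    for shots, expected_goals, actual_goals in players:
        p = expected_goals / shots          # per-shot goal probability implied by shot quality
        sd = binomial_sd(shots, p)
        if abs(actual_goals - expected_goals) <= z * sd:
            inside += 1
    return inside / len(players)

# Hypothetical rows: (shots against, expected goals against, actual goals against)
players = [
    (400, 36.0, 40),
    (900, 81.0, 70),
    (250, 22.5, 31),
    (1000, 90.0, 92),
]

for z in (1.0, 1.96, 2.5):
    print(z, fraction_within(players, z))
```

With real data you would compare these fractions against the 68%, 95% and 99% you expect from the normal approximation.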


There are two pieces of information here that a player controls: one is quality, the other quantity. The question is how one measures these two variables together. It would appear the best way is to consider expected goals against average, or the number of goals a player is expected to have scored against him if he played for one hour. This is basically a goals against average for players. I compiled two lists [3]: top defensive defensemen and top defensive forwards. Some of the players at the top and bottom are to be expected; others are significant surprises. You can see Pronger at the top, which is a good sign, along with Malik, who is a consistent plus player. What is interesting is that Naslund ranks 62nd (tied with the Sedins) defensively, although his record this year (-19) would indicate otherwise. In other words, Naslund got unlucky this year and compiled a huge minus rating (this is likely a little too simplistic an explanation, but it is likely Naslund’s minus rating is exaggerated by error).
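As a rough sketch of the "goals against average for players" idea, the calculation is just expected goals against scaled to 60 minutes of ice time. The function name and the sample numbers are hypothetical, and it assumes ice time is tracked in seconds.

```python
def expected_goals_against_average(expected_goals_against, seconds_on_ice):
    """Expected goals against per 60 minutes (3600 seconds) of ice time."""
    if seconds_on_ice <= 0:
        raise ValueError("seconds_on_ice must be positive")
    return expected_goals_against * 3600 / seconds_on_ice

# Hypothetical player: 45 expected goals against over 1200 minutes (72000 seconds)
print(expected_goals_against_average(45.0, 72000))
```

Lower is better, and because it is built from expected goals it ignores who happened to be in net.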

I should make a special note that these statistics are based on two things, shots and the quality of the shots; they make no statement about actual goals. This might seem strange, and I would have to agree that on the surface it doesn’t make sense, but what we are trying to look at is who is better, not who is lucky, and hockey is full of great examples of lucky individuals and teams. These statistics are also indifferent to whichever goaltender you had in net; it would make no difference statistically if Noronen or Hasek were in net. That being said, a bad goaltender can be a motivator to improve one’s play.

This can be further extended to short-handed situations; however, I won’t comment on these (I’ll skip the statistical error analysis for this). The top penalty killers: defensemen, forwards. There would be very limited benefit to publishing power-play data. I hope to look at offense and the plus statistic in the next article, and both Canucks and Edmonton team change summaries are coming...


[1] or SQA, calculated by: expected goals / ((1 - league save percentage) * shots)
[2] calculated by: 1 - (1 - save percentage)/SQA
[3] for those confused by the charts, the format is: #. Lastname, First initial (expected goals against average) [salary]
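Footnotes [1] and [2] can be sketched in a few lines of Python. This assumes the denominator in [1] is meant to be the number of goals a league-average goaltender would allow on that many shots, i.e. (1 - league save percentage) * shots, so that an SQA of 1.0 means average shot quality. The function names and sample numbers are made up.

```python
def shot_quality_against(expected_goals, shots, league_save_pct):
    """SQA: expected goals relative to what league-average shots would produce."""
    return expected_goals / ((1 - league_save_pct) * shots)

def sqn_save_pct(save_pct, sqa):
    """Shot quality neutral save percentage: 1 - (1 - save percentage) / SQA."""
    return 1 - (1 - save_pct) / sqa

# Hypothetical player: 500 shots against worth 55 expected goals, league save percentage .900
sqa = shot_quality_against(55.0, 500, 0.900)
print(round(sqa, 3))                        # above 1.0 means tougher than average shots
print(round(sqn_save_pct(0.890, sqa), 3))   # the raw .890 adjusted for shot quality
```

Facing harder shots than average pushes the neutral save percentage above the raw one, and facing easier shots pulls it below.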

July 28, 2006

Introduction: Shift Analysis.

Hockey is a team sport, but even as a team sport it’s built up of lines and pairings that determine the outcome. It is very difficult in hockey to evaluate the individual. In sports like baseball, each player has a specific role and can be measured against specific standards at that position. You either hit the ball or you don’t, you throw a strike or a ball, etc. In hockey, every assist has a goal scorer; without the scorer there’s no assist, and every goal has an assist (ok, there are a number of exceptions). Every goal has a supporting cast, and a defense (I will get to this in the future). When considering an individual performance, line mates are key. For example, was Carter successful because of the Sedin twins, or were the Sedin twins successful because of Carter?

In order to discover how players do in certain complicated circumstances, one needs a list of every second the player is on the ice; the problem is this is a lot of data. The NHL provided two sources for analysis, shift charts (optical recognition, anyone...) and time sheets; however, the time sheets were only provided from mid-January until the end of the season. The problem with the shift charts is that they are images 540 (or 520) pixels long, meaning every pixel represents 6 to 8 seconds (depending on the length of the game), resulting in unknown errors. Also, on the score sheets, big black bars represent goals and potentially hide 12-16 seconds of data, creating additional problems. Thankfully the time sheets do not have these problems and will make 2006-2007 data collection a lot more pleasant.
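To put numbers on the pixel problem, here is a quick back-of-the-envelope sketch. It assumes a 60 minute regulation game or a 65 minute game with overtime, spread evenly across the width of the image; the function name is made up.

```python
def seconds_per_pixel(game_minutes, chart_width_px):
    """Game seconds covered by one pixel of a shift chart image."""
    return game_minutes * 60 / chart_width_px

for width in (540, 520):
    for minutes in (60, 65):
        print(width, minutes, round(seconds_per_pixel(minutes, width), 2))
```

Every combination lands between roughly 6.7 and 7.5 seconds per pixel, so a shift boundary read off the image can easily be several seconds off.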

I could use this data for counting 5 on 3 and power-play situations, but I stuck with my older code, which uses penalties to determine when the 5 on 3 situations occur. I will likely convert to using the shift data to determine these events, but at this time this works better. So now I have every second a player is on the ice and the number of players on the ice for each team. I can easily combine them in any way to determine how much 5 on 3 time an individual received, or how often a line played together, as a percentage of each individual’s time or just as an aggregate. In this study, I chose to look only at two players together (displaying the information for three players would be complicated).
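Here is a minimal sketch of how per-second ice time can be combined for two players. It is not my actual code; it assumes each player's shifts are stored as (start, end) pairs in game seconds, and the names and sample shifts are hypothetical.

```python
def seconds_on_ice(shifts):
    """Expand (start, end) shift intervals into the set of game seconds played."""
    seconds = set()
    for start, end in shifts:
        seconds.update(range(start, end))
    return seconds

def shared_time(shifts_a, shifts_b):
    """Seconds two players were on the ice together, plus that time as a share of each player's total."""
    a = seconds_on_ice(shifts_a)
    b = seconds_on_ice(shifts_b)
    together = len(a & b)
    return together, together / len(a), together / len(b)

# Hypothetical shifts for two players in one game
player_a = [(0, 45), (120, 170)]
player_b = [(30, 60), (150, 200)]

together, share_a, share_b = shared_time(player_a, player_b)
print(together, round(share_a, 3), round(share_b, 3))
```

The same set intersection extends to three or more players, or to filtering by manpower situation, by intersecting with the set of seconds that were 5 on 4, 5 on 3 and so on.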

Example

Example*: Jovanovski and Naslund. Naslund was one of the Canucks’ leaders in terms of minutes played on the power-play, but the top power-play unit struggled in the second half of the season. Was this random, or was this a function of a larger problem? Naslund spent 46% of his power-play time with Jovanovski (who missed half the season); however, 61% of the power-play goals scored with Naslund on the ice came with Jovanovski, meaning that in the other 54% of his time he got only 39% of the goals. In the other direction, Jovanovski, who spent 243 minutes on the power-play, spent 84% of that time with Naslund and got 96% of his goals with Naslund. That is quite a tandem!

What the above example shows is two players who both do better when they are together. There are many examples where the benefits only go one way, or where both hurt each other. What’s useful about the above example is that these two spent well over 200 minutes together, so the statistics are reasonably accurate. As an aside, statistics like this make me nervous about the new season as a Canucks fan.

Just as with the example above, the same calculations can be done for every player. Some players will make every player they play with do better, others will make all players do worse; these players will likely lead or “drag down” the team statistically as well, although this may not always be the case. For example, if a player spends a lot of time with bad players he won’t be able to make a big enough difference to improve his individual statistics. Below is the data for both the Canucks and Oilers; I will explain the consequences in regards to lost players in future posts.

The Data


Edmonton: PP SH EV
Canucks: PP SH EV

  • Time% - the percentage of the left-hand player’s ice time that the two players were together.
  • +% - the percentage of the left-hand player’s goals for that were scored with the other player on the ice.
  • -% - the percentage of the left-hand player’s goals against that were scored with the other player on the ice.
  • +R - +%/Time%
  • -R - -%/Time%
  • Time measured in seconds.
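The columns defined above can be sketched as follows. The function names and the raw counts in the first example are hypothetical; the percentages in the second example are the Naslund and Jovanovski power-play numbers quoted earlier.

```python
def rate_ratio(share_of_goals, share_of_time):
    """+R or -R: share of goals divided by share of time (1.0 means no change)."""
    return share_of_goals / share_of_time

def pair_metrics(seconds_together, seconds_total,
                 gf_together, gf_total, ga_together, ga_total):
    """Time%, +%, -%, +R and -R for one player (the row) with one teammate (the column)."""
    time_pct = seconds_together / seconds_total
    plus_pct = gf_together / gf_total if gf_total else 0.0
    minus_pct = ga_together / ga_total if ga_total else 0.0
    return {
        "Time%": time_pct,
        "+%": plus_pct,
        "-%": minus_pct,
        "+R": rate_ratio(plus_pct, time_pct) if time_pct else None,
        "-R": rate_ratio(minus_pct, time_pct) if time_pct else None,
    }

# Hypothetical raw counts: half the time together, 12 of 20 goals for, 1 of 4 goals against
print(pair_metrics(6000, 12000, 12, 20, 1, 4))

# Naslund with Jovanovski: 61% of goals in 46% of time; without him: 39% in 54%
print(round(rate_ratio(0.61, 0.46), 2), round(rate_ratio(0.39, 0.54), 2))
# Jovanovski with Naslund: 96% of goals in 84% of time
print(round(rate_ratio(0.96, 0.84), 2))
```

A +R above 1 means the player scores at a higher rate with that teammate than he does overall, and a -R below 1 means he is scored on at a lower rate.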

So basically how this works is each player plays with certain players a certain amount (these percentages are displayed across the table) and also gets a certain number of goals with these players (also displayed as a percentage across the table). Now, across the table it’s a zero-sum game: if you do better with one player, you have to do worse with someone else, but this person could be off the table due to lack of minutes played. The really useful information comes from looking down the columns. While a player will do better or worse along his row, down the columns a player can in fact do better with everyone. Take more notice of higher Time%’s, as they have more relevance; playing 20 power-play minutes together isn’t very statistically significant, so doing better with 60% of the time means a lot more than doing better with 15% of the time. The tables are colour coded for easy reading.

A good example is Smyth on the power-play: it appears everyone he played with benefited from him (except Reasoner and Dvorak, who weren’t good, so just ignore them). Another would be Moreau on the penalty kill, doing better with everyone except Conklin and Ulanov, though not statistically significantly worse with those guys. Most impressive, however: Moreau spent over 50% of his time with Peca (who wasn’t that great killing penalties), yet Moreau was able to cut down Peca’s goals against rate by over 20%. There was an identical effect (even more extreme) with Bergeron. You can look through the data yourself; I will have a team analysis shortly for both teams (Edmonton and Vancouver). The above sheets take about four hours to create on my 2 GHz processor, so I can’t just make them instantly.


Future Considerations

So what else can be done with such information? This really is the tip of the iceberg. Once you figure out who is good, bad and average, you can see how players fare versus good, bad and average opposition. You can also discover which lines shoot the most and, even better, how many shots against each individual faces. This of course will be a future analysis. I’d be curious whether individual players have better save percentages than others and which players allow the most difficult shots. Maybe this will give some insight into minus statistics.

*I should note: this example only looks at 5 on 4 power-plays, and the “goal scored” statistic is any goal scored by any player on the ice while the individual is on the ice (like the plus statistic for even strength situations).

July 27, 2006

New Hockey Numbers Blog.

The main reason for this blog is to highlight new statistical insights into the game of hockey.

So far I have a free agency tracker site that includes the player contribution statistic.

In future posts I will discuss new time on ice statistics I am compiling at the moment: every second of ice time is recorded, so I can figure out what line combinations work and why players do well (line mates) or do poorly.