My statistics site was getting so many columns that I found it necessary to make a few changes. The biggest change is that I grouped my statistics into two categories: one for goals and one for shots. At the same time I changed the code behind the scenes to make future changes easier. These changes made it possible to highlight the sorted column and to add or delete columns easily as I see fit. You will notice these changes on the team pages as well as the forward and defense pages.
Another feature I decided to add is an items-per-page option. I have always shown only 30 items per page; now there is the option of showing more. Please only use ALL if you really need to see all the players, as it can be a bit slow. Users can also now sort by ice time per game and by second assists (in the shooting statistics section).
I also added the requested column: average shot distance (only for even strength). You can also see how many shots per game each player gets (or shots per 60 minutes).
If there are any problems with the site, please let me know.
December 30, 2007
December 20, 2007
Everything you wanted to know about the Shootout
Edmonton started the season going 10-1 in the shootout (now they are 10-2), which brought about an interesting discussion on shootouts. I've been holding back on posting anything about shootouts, primarily due to lack of data, but there are now two seasons (plus a bit) to work with. It got me thinking: what led to Edmonton doing so well? Was it all luck, or was skill a big part of it?
Team winning percentage
The first thing I looked at was year-over-year comparisons.
Looking at this without knowing any details, it would appear that shootout results are in fact perfectly random. You can sort of see some clustering in the middle, but other than that there are too many anomalies to make sense of anything. Let me, however, make a few key points:
1. LA: Lost Garon (80% shootout save percentage after 2005-2006)
2. Carolina: Dumped Gerber after the playoffs, and Ward has 1 win in three seasons
3. Florida: Lost Luongo
4. Philadelphia: Switched to Niittymaki
5. Pittsburgh: Started to use Fleury more often
6. San Jose: Toskala thankfully only played 1 shootout game in 2006-2007 (0 wins in his career)
Remove the above points and you have a data set with much more structure. Of course, I could probably think of good reasons to exclude every point on this graph. I'm simply trying to show that there are some key points that shouldn't be included because too much changed.
Since it appeared that goalies were a big part of the shootout results, I figured I'd best compare goalies' save percentages in the shootout. In the graph below, BIG = BAD and small = GOOD. The graph shows players' Z-scores. A Z-score of -2 means that only 2.5% of players will do better, and a Z-score of -1 means that 16% of players will do better than that player. A score of 0 means that 50% of players are better and 50% are worse. A score of +1 means that 16% of players are worse, and a score of +2 means that 2.5% of players are worse. [Very simplistic explanation.] These Z-scores are necessary because each goalie sees a different number of shots, and it's easier to stop 100% of the shots if you only see three shots than when you see 100 shots. The main point I want to make is that having a Z-score of 2 (in either direction) one year suggests that you will have a Z-score of about 1 the next (this is called regression to the mean). This regression to the mean is quite small, as it says that, for the average goalie, half of the save percentage results in a given season are due to luck and the other half come from skill. [Individuals could be quite different.]
In translation: Garon's 90% save percentage in the shootout should be closer to (0.90 - 0.66)/2 + 0.66 = 78% (his career save percentage is 46/58 = 79%).
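The two calculations above can be sketched in a few lines of Python. The Z-score formula here (a normal approximation to the binomial, applied to the goals-against rate so that big is bad) and the function names are assumptions for illustration, since the exact method behind the graph isn't spelled out. The 66% league-average save percentage and the one-half regression factor come from the formula above.

```python
from math import sqrt

LEAGUE_SAVE_PCT = 0.66  # league-average shootout save percentage used in the formula above


def shootout_z_score(goals_against, shots, league_save_pct=LEAGUE_SAVE_PCT):
    """Z-score of a goalie's goals-against rate (big = bad, small = good).

    Assumption: normal approximation to the binomial, so a goalie who has
    faced few shots gets a wide standard error and a Z-score nearer to 0.
    """
    league_rate = 1.0 - league_save_pct          # league-average scoring rate
    goalie_rate = goals_against / shots          # this goalie's scoring rate against
    std_err = sqrt(league_rate * (1.0 - league_rate) / shots)
    return (goalie_rate - league_rate) / std_err


def regress_to_mean(observed, league_mean=LEAGUE_SAVE_PCT, skill_share=0.5):
    """Keep only the share of the gap to the league mean attributed to skill."""
    return league_mean + skill_share * (observed - league_mean)


garon_expected = regress_to_mean(0.90)           # (0.90 - 0.66) / 2 + 0.66
garon_career_z = shootout_z_score(goals_against=58 - 46, shots=58)
print(f"expected save percentage: {garon_expected:.2f}")
print(f"career Z-score: {garon_career_z:.2f}")
```

Dividing by the standard error is what lets a goalie with three shots be compared to one with a hundred: the same save percentage gives a much smaller Z-score on a small sample.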
Save percentage By Round
Now there are a few things to keep in mind.
Statistics by round:
1: 616 Shots. Shooting%: 0.3669
2: 616 Shots. Shooting%: 0.3344
3: 460 Shots. Shooting%: 0.2978
4: 83 Shots. Shooting%: 0.3494
5: 40 Shots. Shooting%: 0.3500
6: 24 Shots. Shooting%: 0.2917
7: 14 Shots. Shooting%: 0.286
8: 11 Shots. Shooting%: 0.364
9: 4 Shots. Shooting%: 0.250
10: 2 Shots. Shooting%: 0
11: 2 Shots. Shooting%: 0
12: 2 Shots. Shooting%: 0
13: 2 Shots. Shooting%: 0
14: 2 Shots. Shooting%: 1
15: 2 Shots. Shooting%: 0.5
Note: rounds 9 and later combined saw 16 shots, with a 25% shooting percentage.
In general, shots in later rounds are easier to stop. So if a goalie plays until the 7th or 8th round, his numbers should be better than those of a goalie who only sees the first two shooters.
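For anyone who wants to pool the rounds differently, here is a small Python sketch built on the table above. The helper name is made up for this example, and goal counts are recovered by rounding shots times shooting percentage, which assumes the listed percentages were rounded from whole goals.

```python
# (round, shots, shooting percentage) exactly as listed above
ROUNDS = [
    (1, 616, 0.3669), (2, 616, 0.3344), (3, 460, 0.2978),
    (4, 83, 0.3494), (5, 40, 0.3500), (6, 24, 0.2917),
    (7, 14, 0.286), (8, 11, 0.364), (9, 4, 0.250),
    (10, 2, 0.0), (11, 2, 0.0), (12, 2, 0.0), (13, 2, 0.0),
    (14, 2, 1.0), (15, 2, 0.5),
]


def pooled_shooting_pct(rows, first_round=1, last_round=None):
    """Return (shooting percentage, shots) pooled over a range of rounds."""
    shots = goals = 0
    for rnd, n, pct in rows:
        if rnd < first_round or (last_round is not None and rnd > last_round):
            continue
        shots += n
        goals += round(n * pct)  # recover the whole number of goals
    return goals / shots, shots


late_pct, late_shots = pooled_shooting_pct(ROUNDS, first_round=9)
print(f"rounds 9+: {late_shots} shots, {late_pct:.0%}")
for lo, hi in [(1, 3), (4, 8)]:
    pct, n = pooled_shooting_pct(ROUNDS, first_round=lo, last_round=hi)
    print(f"rounds {lo}-{hi}: {n} shots, {pct:.1%}")
```

Pooling matters because the late rounds have so few shots that individual percentages (0, 0.5, 1) say very little on their own.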
Shooters:
On the surface it would appear that shooting percentage is all luck as there is no correlation between years. That being said, the graph below could just be the result of playing different opposition.
What is a guy like Garon worth in just the shootout? Assuming an average number of shootouts (10), he wins you an additional 1-2 points (or about $0.75M to $1.0M). [Based on this image, a 35% shooting percentage and an 80% save percentage.] Of course, if you're Edmonton and have discovered a way to get to the shootout 3x as often as expected, then Garon is even more useful.
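A rough way to check a number like this is to compute shootout win probabilities directly. The Python sketch below is not the model behind the image referenced above. It assumes three shots per team followed by sudden death, independent shots with fixed percentages, and a league-average goalie who saves 66% (the figure used in the regression formula earlier). The function names are made up for the example.

```python
from math import comb


def binom_pmf(n, k, p):
    """Probability of exactly k goals on n independent shots."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def shootout_win_prob(shoot_pct, opp_shoot_pct, rounds=3):
    """Chance of winning a shootout: `rounds` shots each, then sudden death."""
    win = tie = 0.0
    for ours in range(rounds + 1):
        for theirs in range(rounds + 1):
            p = binom_pmf(rounds, ours, shoot_pct) * binom_pmf(rounds, theirs, opp_shoot_pct)
            if ours > theirs:
                win += p
            elif ours == theirs:
                tie += p
    # Sudden death ends on the first round in which exactly one team scores.
    we_score_only = shoot_pct * (1 - opp_shoot_pct)
    they_score_only = opp_shoot_pct * (1 - shoot_pct)
    sudden_death = we_score_only / (we_score_only + they_score_only)
    return win + tie * sudden_death


SHOOTOUTS_PER_SEASON = 10
average_goalie = shootout_win_prob(0.35, 1 - 0.66)   # opponents score on 34%
garon = shootout_win_prob(0.35, 1 - 0.80)            # opponents score on 20%
extra_points = SHOOTOUTS_PER_SEASON * (garon - average_goalie)
print(f"{average_goalie:.3f} {garon:.3f} {extra_points:.1f}")
```

Under these assumptions the gap works out to roughly two points over ten shootouts. Using a regressed save percentage instead of the full 80% would shrink it.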
December 19, 2007
Brad Stuart
It's hard to miss Brad Stuart. A lot of good things have been said about him, but he seems to have been involved in a lot of bad trades recently (Stuart being the bad part).
He was traded from a losing San Jose team to a losing Boston team for Joe Thornton. Many people might point to Thornton as the reason San Jose made the playoffs, but it appears that the loss of a key defenseman wasn't too big of a problem. (Boston ended up with 74 points.)
From Boston, Stuart (along with Primeau) was traded for a 1st round draft pick, Kobasew and a solid defender: Andrew Ference. Interestingly, after the trade Calgary struggled and just squeezed into a playoff spot.
Stuart is now playing for a team that, as of last year, appeared to only need a goaltender: Los Angeles. Labarbera has provided the goaltending, but now the team lacks any sort of defense. It allows over 30 shots per hour at even strength, and the shots that hit the net are almost 20% more difficult to stop.
Stuart may not be the worst on the team in terms of plus-minus, but he certainly is up there with his -10 (and another -2 vs Detroit tonight).
Looking at a short window of history, Stuart may just have signed with and been traded to bad teams, but his record in the previous 187 games is appalling: a 42% winning percentage.
"[Brad Stuart] is prone to mistakes in pressure situations, which has led him into the coach's doghouse in the past. Doesn't use his size effectively enough." (Sportsnet.ca) What more can I really say...
West continues to dominate East
West vs. East (This season)
30 Wins
24 Losses
4 OTW or SOW
2 OTL or SOL
Winning percentage = 30/54 = 56%
185 GF
163 GA
Pythagorean percentage = 185^{2}/(185^{2}+163^{2}) = 56%
What's interesting is that it has stayed so consistent over the last few years.
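Both percentages above are quick to reproduce. A minimal Python sketch (the function names are just for this example; the exponent of 2 is the one used in the formula above):

```python
def win_pct(wins, losses):
    """Share of games won."""
    return wins / (wins + losses)


def pythagorean_pct(goals_for, goals_against, exponent=2):
    """Expected winning percentage from goals for and against."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


print(f"actual:      {win_pct(30, 24):.0%}")
print(f"pythagorean: {pythagorean_pct(185, 163):.0%}")
```

The two agree here, which suggests the West's record against the East is in line with its goal differential rather than a run of lucky one-goal games.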
December 16, 2007
Buy low - Sell high
I haven't said much (if anything) about Lupul, as I was curious whether he would rebound or disappear. Also, I didn't know much about Lupul, so it was best I kept my mouth shut. I didn't even realize that Lupul was in fact drafted 7th overall.
I decided to create a short list of players who were drafted 7th and had a bad season in their early 20s.
1990: Sydor 24 - 58 GP: 12 Points
1993: Arnott 24 - 70 GP: 33 Points
1995: Doan 22 - 79 GP: 22 Points
1996: Rasmussen 23 - 67 GP: 14 Points
1997: Mara 23 - 75 GP: 24 Points
1998: Malhotra 22 - 59 GP: 10 Points
1999: Beech - 14 GP: 4 Points in 2 years
2001: Komisarek 23 - 71 GP: 6 Points
2002: Lupul 81 GP: 28 Points (29)
Often, if you look at past players, teams trade these players during or right after their bad seasons. However, those teams are quickly disappointed as their high draft pick succeeds in his new environment. What's interesting is that Edmonton didn't like Lupul because of his bad plus-minus, so the team picked up Souray (ranked second last in plus-minus) and Pitkanen (ranked fourth last in plus-minus).
The point I'm trying to make is that all players have bad seasons (often early in their careers), and teams view this as a good predictor of future performance, when draft position is probably a better predictor of future performance than one season. Anaheim was smart to see a player who had done better than expected one year and got a great deal for him (sell high). Edmonton, however, after one bad season, dumped this 1st round draft pick (plus their captain) for Philadelphia's trash.
In conclusion, don't give up on a player after one bad season. Also, don't get too excited by one great year either.
November 19, 2007
Blocked Shots
In regards to James Mirtle's blocked shots post, I have a list of players' blocked shots with their ice times.
I have also completed a regression of blocked shots vs. ice time:
Forwards: 1.2 * EV hours + 7.0 * PK hours = blocked shots
Defense: 3.8 * EV hours + 11.5 * PK hours = blocked shots
The regression was completed using 2006-2007 data.
I did the regression in such a way that the constant term was not significant.
Columns:
ev = even strength ice time in hours
sh = short handed ice time in hours
P = position (D: defense, F: forward)
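The regression above can be turned into an expected blocked shots figure for any player in the list. A minimal Python sketch, where the function name and the sample ice times are made up for illustration and only the coefficients come from the regression:

```python
# Blocked shots per hour of ice time, from the regression above:
# position -> (even strength rate, short handed rate)
BLOCK_RATES = {
    "F": (1.2, 7.0),
    "D": (3.8, 11.5),
}


def expected_blocked_shots(position, ev_hours, sh_hours):
    """Expected blocked shots given position (P), ev and sh ice time in hours."""
    ev_rate, sh_rate = BLOCK_RATES[position]
    return ev_rate * ev_hours + sh_rate * sh_hours


# Hypothetical defenseman: 20 even strength hours, 4 short handed hours
print(expected_blocked_shots("D", ev_hours=20, sh_hours=4))
# The same ice time for a forward
print(expected_blocked_shots("F", ev_hours=20, sh_hours=4))
```

Comparing a player's actual blocked shots to this expectation separates players who block a lot because of their role from those who block a lot for their ice time.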

November 7, 2007
Where have all the OTs gone?
If anyone can answer this simple question it would be great.
In 2003-2004 there were 47 overtimes in 209 games.
In 2005-2006 there were 44 overtimes in 210 games.
In 2006-2007 there were 46 overtimes in 210 games.
Or 870 in 3690 games from 2003 through 2007, which is 23.5%.
In 2007-2008 there have been 29 overtimes in 209 games (13.9%).
This is a 40% reduction in overtimes!
Standard deviation = 6.
I noticed this in the first 100 games, but figured it could be an anomaly until it was repeated in the next 100 games. What is the NHL doing to prevent overtimes? Have teams' incentives changed? Do teams see overtime as a bad thing because it gives the opponents a free point [more than last season]?
Something worth noting:
- Scoring in the third period is the same as last year.
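To see how unusual 29 overtimes is, here is a quick Python check using the numbers above. It treats each game as an independent coin flip with the historical overtime rate, which is a simplifying assumption.

```python
from math import sqrt

historical_rate = 870 / 3690      # overtimes per game, 2003 through 2007
games = 209                       # games played so far in 2007-2008
observed = 29                     # overtimes so far in 2007-2008

expected = games * historical_rate
std_dev = sqrt(games * historical_rate * (1 - historical_rate))  # binomial
z = (observed - expected) / std_dev

print(f"expected overtimes: {expected:.1f}")
print(f"standard deviation: {std_dev:.1f}")
print(f"standard deviations below expected: {-z:.1f}")
```

A gap of more than three standard deviations is very hard to put down to chance alone, which is why the question of changed incentives seems worth asking.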
November 5, 2007
Corsi Numbers
Corsi numbers have popped up a few times in the last week. Since no one else was checking whether these numbers were relevant, I thought I'd give it a go:
Corsi Number
The Corsi number is the number of shots directed towards the net while the player is on the ice. The number can be broken down by whose net the shots are directed towards (their own net (-) and their opponent's net (+)), similar to the plus-minus statistic. The hope, of course, is that the Corsi plus-minus correlates well with the regular plus-minus. Because the numbers will be 16x larger than plus-minus numbers, and random error shrinks with the square root of the sample size, they would be about 4x more accurate than the plus-minus numbers.
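The step from 16x to 4x is the usual square-root rule for counts. A tiny Python sketch, assuming the events behave like independent counts so that relative error goes as one over the square root of the count (function names are just for this example):

```python
from math import sqrt


def relative_error(events):
    """Relative standard error of a count of independent events."""
    return 1 / sqrt(events)


def precision_gain(scale, base_events=25):
    """How much the relative error shrinks when the count grows by `scale`."""
    return relative_error(base_events) / relative_error(base_events * scale)


print(precision_gain(16))
```

The gain does not depend on the starting count, only on the ratio, so the same factor applies to a fourth-liner and a top-line player.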
Team Regression:
If this statistic is really useful in predicting offense (or winning), it should correlate well with scoring, whether on a team-by-team basis or player by player. So I first looked at teams using last season's results: Goals = 0.09*shots + 0.02*missed (where 0.02 is +/- 0.04, aka completely insignificant). Even if missed shots were significant, one missed shot would still only be worth about 1/5 of a shot on goal, and it would take 50 missed shots to make 1 goal.
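The arithmetic behind those statements, as a short Python sketch. The coefficients are the ones from the team regression above; treating the +/- 0.04 as a simple interval around the estimate is an assumption about what the margin means.

```python
GOALS_PER_SHOT = 0.09        # regression coefficient on shots
GOALS_PER_MISSED = 0.02      # regression coefficient on missed shots
MISSED_MARGIN = 0.04         # the +/- on the missed-shot coefficient

shots_per_goal = 1 / GOALS_PER_SHOT
missed_per_goal = 1 / GOALS_PER_MISSED
missed_vs_shot = GOALS_PER_MISSED / GOALS_PER_SHOT

low = GOALS_PER_MISSED - MISSED_MARGIN
high = GOALS_PER_MISSED + MISSED_MARGIN
includes_zero = low <= 0 <= high   # True means "cannot rule out no effect"

print(f"shots per goal:          {shots_per_goal:.1f}")
print(f"missed shots per goal:   {missed_per_goal:.0f}")
print(f"missed shot vs. a shot:  {missed_vs_shot:.2f}")
print(f"interval includes zero:  {includes_zero}")
```

Because the interval around the missed-shot coefficient spans zero, the data cannot distinguish a small positive value from no value at all.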
Individual level:
The first question: are missed shots together with regular shots a better predictor of offense than just regular shots? The answer is a resounding no. While missed shots don't seem to hurt the results too significantly, they don't seem to add anything except more variability to the model.
Are missed shots significant?
Again, in a regression with shots and missed shots, missed shots are not, at this point in the season, a significant variable in the model. What was interesting is that missed shots were more important in a model that used "expected goals" as opposed to just shots.
The problem with Missed Shots:
The simplest, most basic problem with the Corsi index is the fact that missed shots are by definition worse than a shot on goal. The only hope Corsi has is that players who miss the net a lot are likely hitting the net a lot too, and in the absence of a decent sample size this is a useful method, as a missed shot is better than no shot at all.
Missed shot percentage (missed shots/(missed shots + regular shots))
The higher a player's missed shot percentage is, the worse the player is (if a player is only hitting the net 10% of the time, they'll be sent back to the AHL or worse).
The problem is that in a model where missed shots are included, the missed shot percentage becomes a significant liability. That is to say, unless the missed shots are accompanied by actual shots, they're worthless (this makes sense).
Missed Shots
Missed shots are a complicated variable that can be both a good thing and a bad thing. A team that chooses to shoot haphazardly will likely struggle to score compared to a team that focuses on getting the puck on target. Missed shots also cover a wide range, depending on who is recording the score sheet: a shot that misses by a few inches is quite different from one that misses by 3 feet.
Blocked Shots
Blocked shots are even more complicated than missed shots and similarly do not help predict offense better than regular shots on their own.
Conclusion:
I'll stick with expected goals. That being said, I've posted the Corsi index on my statistics site for those who think it is useful.
October 28, 2007
November is Divisional Play Month
For those who don't know, November will be the month in which teams play the most games against teams within their own division.
In fact there are a total of 136 intradivisional games to be played in the month of November (about 9 per team) and 63 games against other opponents. This means that 68% of all games played in November are intradivisional games. To put this number into perspective, there will be 146 intradivisional games played in December, January and February combined (about 50 per month).
By the end of November we should have a good idea where teams stand within their own divisions, but it will still be difficult to tell how these divisions will fit into the overall standings.
Why the NHL chose to do things this way is beyond me. I would expect the NHL to want to distribute these games evenly throughout the year. Last season the NHL was much more balanced in regards to these games, but it appears the NHL wanted to load the intradivisional games into two months.
March is also a big intradivisional month, with 108 games. So November and March account for over 50% of the intradivisional games, but only 1/3 of the season.
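The shares quoted above are easy to check. In this Python sketch the game counts are the ones given in the post, while the season total assumes each of the 30 teams plays 8 games against each of its 4 division rivals, which is not stated in the post itself.

```python
nov_division, nov_other = 136, 63
march_division = 108
dec_to_feb_division = 146

# Assumption: 30 teams, 8 games against each of 4 division rivals,
# and every game is shared by two teams.
TEAMS, RIVALS, GAMES_PER_RIVAL = 30, 4, 8
season_division_games = TEAMS * RIVALS * GAMES_PER_RIVAL // 2

nov_share = nov_division / (nov_division + nov_other)
dec_to_feb_per_month = dec_to_feb_division / 3
nov_march_share = (nov_division + march_division) / season_division_games

print(f"division share of November games: {nov_share:.0%}")
print(f"Dec-Feb division games per month: {dec_to_feb_per_month:.0f}")
print(f"November + March share of division games: {nov_march_share:.0%}")
```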
October 23, 2007
Goaltending
It's been an interesting season for goaltending. A number of goaltenders moved in the off season, which of course changes the style of defense for the goalies who move. However, there have been other surprises as well:
Vancouver: Luongo - 0.896, slow start
Minnesota: Seems every goalie on Minnesota does well, but are they actually good?
Calgary: Kiprusoff doesn't look so good without the nice defense in front.
Nashville: Rolled the dice and lost - Mason is no Vokoun.
Blue Jackets: Leclaire may actually be the real deal...
Phoenix: Sent down Aebischer, who has their best save percentage. Auld and Tellqvist vie for #1, need I say more?
L.A.: Found out how to win when all you have is AHL goaltending - allow 17 shots.
N.J.: No defense - No Brodeur...
Philadelphia: Why was Biron used as a backup last season?
Pittsburgh: Don't worry, Fleury is still the same goalie he was last year, except with a bit more experience.
Boston: Did Fernandez hide behind Minnesota's defense?
Toronto: One more bad game for Raycroft and he might see some AHL action...
Atlanta: Don't blame the goalies please.
Florida: This team needs a lot more than decent goaltending to do well.
Tampa: No changes from last year - still bad goaltending, but great offense.
Early season expected standings
[Expected standings tables: West and East]
The above standings represent the expected winning percentage for each team, based on the quality of the shots each team has generated and allowed. If a team has better than average goaltending, it should outperform the above predictions, and if it has worse than average goaltending, it should underperform them.
This does not account for strength of competition, but is simply calculated by: EGF^{2}/(EGF^{2}+EGA^{2})
Where EGF = expected goals for, EGA = expected goals against.
This is posted mainly to show which teams may currently be ranked higher in the standings than they are likely to be over the course of the season.
October 9, 2007
4 Playoff Team Division
James Mirtle posted a while back that it isn't realistic to expect 4 teams from 1 division to make the playoffs. Tom Benjamin responded that: "I do agree that the most probable outcome will be two or three playoff teams from each division, but I do think four teams making it from one division will happen more frequently than he thinks"
I was under the impression that the divisional schedule would significantly affect the chance that 4 teams from one division make the playoffs. But of the two seasons so far, it happened in 2006-2007, and in 2005-2006 Toronto had 90 points (2 away from a playoff spot), which would have made it 4 that year as well.
So, I decided to look into the chance of this actually happening. I have a script that simulates the whole season to do season predictions. I can randomize team skill or choose a certain skill level manually. A random distribution of skill produced a 63% chance of a 4 playoff team division, and unbalancing one division jumped that number to 68%. I then decided to make every team identical (50% chance for every game), and that produced an even larger 69% chance. Either way, based on my best analysis, 4 teams will make the playoffs from one division about two years out of three. In other words, it's more common to have 4 teams make it from one division than not.
October 8, 2007
Average Predicted Standings


I like NHL standings predictions, because I find it interesting how wrong we can be. The above standings represent the average (P) of several different standings I found on the web (if you know of more I am more than happy to add them).
I have also included the minimum standing spot (L) and maximum standing spot (U) based on the variance of the predictions. For example, the New York Islanders have had a lot of different predictions (from average to great to terrible), so they have a very large expected range (anywhere from 4th to 15th), whereas Phoenix has a very small range (15th). No one has predicted Phoenix to do better than 15th.
The above standings are the average of several different sources:
Bookies
Mirtle
Mirtle's Playoff poll
My opinion
McKeen's Hockey
The Hockey News
Added:
Alain Chantelois
Gaston Therrien
Howard Berger
David Johnson
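The post does not say exactly how L and U are derived from the variance. One plausible reading, sketched here purely as an assumption, is the mean plus or minus a multiple of the standard deviation, clamped to the 15 conference spots. The function name, the multiplier `k`, and the sample predictions are hypothetical.

```python
from statistics import mean, pstdev


def rank_range(predictions, spots=15, k=1.0):
    """Return (average, lowest, highest) predicted standing for one team.

    Assumption: the range is mean +/- k standard deviations, clamped to
    the valid standing spots 1..spots.
    """
    avg = mean(predictions)
    spread = k * pstdev(predictions)
    low = max(1, round(avg - spread))
    high = min(spots, round(avg + spread))
    return avg, low, high


# Hypothetical predictions for a team the sources disagree about,
# and for a team every source places last.
print(rank_range([4, 15, 9, 7]))
print(rank_range([15, 15, 15, 15]))
```

A team that every source places in the same spot gets a range of a single spot, which matches the Phoenix example.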
September 28, 2007
Adjusted bookie standings
I like "bookie" predictions because they are backed by cash instead of just hot air. That being said, even bookie predictions have systemic problems and biases. Some can be fixed, others cannot. Certain teams attract money from bettors and others simply do not. Because very few people cheer for Nashville, it may in fact suffer some devaluation.
However, some predictions are just stupid. For example, the Atlantic division last season was a joke: not only was Philadelphia the worst team by a wide margin, but overall the division was average or below average. This season Pittsburgh doesn't get to play Philadelphia 8 times to pad its stats, as Philadelphia will quite possibly make the playoffs. The bookies' predictions above say that the Atlantic division will go from below average to the best division by a large margin in a single season.
So what I did was adjust the bookies' standings so that all the divisions perform the same this season as they did last season.
When I look at those standings I have a hard time finding fault with anything on there [except the four Northwestern teams making the playoffs, only one Central team, and all four Northeastern teams making it].
I find it interesting either way...
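The adjustment itself is not spelled out in the post. One simple way it could be done, shown here as an assumption rather than the method actually used, is to scale each team's predicted points so that every division's total matches its total from last season. Team names, division names, and point values below are placeholders.

```python
def rescale_to_division_totals(predicted, divisions, last_season_totals):
    """Scale predicted points so each division sums to last season's total.

    predicted: {team: predicted points}
    divisions: {division: [teams]}
    last_season_totals: {division: total points last season}
    """
    adjusted = {}
    for division, teams in divisions.items():
        predicted_total = sum(predicted[team] for team in teams)
        factor = last_season_totals[division] / predicted_total
        for team in teams:
            adjusted[team] = predicted[team] * factor
    return adjusted


# Placeholder data: two divisions of two teams each.
predicted = {"A": 100, "B": 90, "C": 80, "D": 110}
divisions = {"X": ["A", "B"], "Y": ["C", "D"]}
last_season = {"X": 171, "Y": 209}
print(rescale_to_division_totals(predicted, divisions, last_season))
```

Scaling keeps the order of teams within a division but lets teams from different divisions swap places in the overall standings.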
September 25, 2007
Bigger Nets
"The NHL first discussed the idea of larger nets two years ago, when players and league executives met to debate ways of increasing scoring and opening up the game."
"The topic was revisited briefly in June when general managers met in Ottawa." [Luongo vows to quit over bigger nets].
How much would a 1" change in net size (on all sides) matter? In other words, what would happen if we moved the left post out by 1", moved the right post out by 1", and increased the height by 1"?
Since the NHL records how many shots hit the goalposts (and crossbars), we know how often the puck hits the 2 and 3/8 inch posts.
Last season the puck hit the frame 1480 times. Moving each post by 1" would mean that shots hitting the first 1" from the inside would count as goals, while pucks hitting the other 1 and 3/8 inches would still be posts (plus all the new posts) [simple logic, "works well enough"]. Essentially this would convert 1/(2+3/8) of the frame hits into goals, or about 623 new goals per year for every 1" change in net dimensions. With 1230 games in a season, that is roughly half a goal per game, so it would take a 2" change all around the net to increase scoring by 1 goal per game.
Of course, this assumes that goalies and teams don't adjust to the new system. Goalies would attempt to cut off shots even more (quite an adjustment for a goalie like Luongo).
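The arithmetic can be checked with a few lines of code. The frame hits and post width come from the post. The 1230-game season (30 teams playing 82 games each) is an added assumption, and the function name is made up.

```python
from fractions import Fraction

FRAME_HITS = 1480              # frame hits last season (from the post)
POST_WIDTH = Fraction(19, 8)   # post width: 2 and 3/8 inches
GAMES = 30 * 82 // 2           # assumption: 30 teams x 82 games = 1230 games


def extra_goals(inches):
    """Frame hits that become goals if every post moves out by `inches`.

    Same simple logic as the post: hits are spread evenly across the
    width of the post, so the converted share is inches / post width.
    """
    share = min(Fraction(inches) / POST_WIDTH, 1)
    return FRAME_HITS * share


for inches in (1, 2):
    goals = extra_goals(inches)
    print(f'{inches}": {float(goals):.0f} goals, '
          f'{float(goals / GAMES):.2f} per game')
```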
September 23, 2007
My own formatted play-by-play
**UPDATED**
L.A @ ANA 20070913
ANA @ L.A 20070915
PHX @ ANA 20070916
ATL @ STL 20070916
WSH @ CAR 20070916
NSH @ CBJ 20070916
FLA @ CGY 20070916
COL @ PHX 20070917
ANA @ VAN 20070917
PIT @ MTL 20070917
FLA @ EDM 20070917
STL @ DAL 20070918
S.J @ L.A 20070918
CHI @ CBJ 20070918
PIT @ MTL 20070918
TOR @ EDM 20070918
S.J @ ANA 20070919
L.A @ COL 20070919
CGY @ VAN 20070919
CBJ @ CHI 20070919
DAL @ T.B 20070919
COL @ DAL 20070920
PHX @ TOR 20070920
WSH @ OTT 20070920
ATL @ NSH 20070920
EDM @ VAN 20070920
FLA @ CHI 20070920
MIN @ DET 20070920
N.J @ NYR 20070921
NSH @ CAR 20070921
ANA @ S.J 20070921
CBJ @ BUF 20070921
MIN @ CHI 20070921
NYI @ MTL 20070921
PIT @ DET 20070921
TOR @ BOS 20070922
CAR @ NSH 20070922
DAL @ PHX 20070922
OTT @ MTL 20070922
STL @ ATL 20070922
WSH @ T.B 20070922
PHI @ NYR 20070922
VAN @ S.J 20070922
EDM @ CGY 20070922
DET @ PIT 20070922
OK, this is a work in progress. Basically I want something that has the option to hide certain events and show others. If you click on SHOT, BLOCK, etc., it will hide or show those events. This is really nice, as you can break the play-by-play down into just shots (what I care most about) or just faceoffs. In the long run, this is just a great way to see whether I have recorded the data correctly and where any problems are.
Also, I've color-coded the shots based on the likelihood of them going in. I wasn't sure if this would work, but it really gives you an idea of the flow of the game. Bright red = very good shot. Black goal = bad goal (low probability of going in).
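For illustration, a color scale like this could be produced by mapping the goal probability onto the red channel. This is only a sketch: the function name and the `max_probability` cap (the probability that counts as fully bright red) are assumptions, not the values used on the site.

```python
def shot_color(goal_probability, max_probability=0.5):
    """Map a shot's goal probability to a CSS hex color from black to red."""
    # Clamp to [0, max_probability], then scale to 0..1.
    clamped = min(max(goal_probability, 0.0), max_probability)
    red = round(255 * clamped / max_probability)
    return f"#{red:02x}0000"


print(shot_color(0.02))  # a low-percentage shot comes out nearly black
print(shot_color(0.45))  # a dangerous shot comes out close to bright red
```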
I plan on publishing these (possibly live, if I can figure out how to write a program that will download and parse them as games are played). They are more in tune with the old style.
Note: the code runs slowly in IE7 (not sure how it works in IE5.x or IE6).
I will likely attach a game summary to the top, which would include the goalies' save percentages, who scored and who got assists, and a few other details. I also find the information a little overwhelming in this format, so I'll be moving it around to see if it works better.
Nice CSS formatting was done by Chris Waycott.
Preseason Goaltending

I'll update this list "regularly". It only includes games that the NHL published information for (obviously).
There are a significant number of shots recorded from the wrong side of the ice in the NHL data. I've assumed any shot from more than 115 feet was a mistake and should have been recorded on the other half of the ice. Hopefully the NHL gets that sorted out before the beginning of the season.
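A minimal sketch of such a correction is below. The 115-foot cutoff is from the post. How a flagged shot is moved to the other half is not stated, so mirroring the distance across the ice is an assumption, as is the 178-foot goal-line-to-goal-line figure (a 200-foot rink with goal lines 11 feet from the end boards). With only a distance recorded, the mirror is exact only for shots along the middle of the ice.

```python
GOAL_LINE_TO_GOAL_LINE_FT = 178.0  # assumption: 200 ft rink, goal lines 11 ft in
MAX_PLAUSIBLE_FT = 115.0           # cutoff used in the post


def corrected_distance(distance_ft):
    """Mirror an implausibly long shot onto the other half of the ice."""
    if distance_ft > MAX_PLAUSIBLE_FT:
        # Assumption: the shot was measured to the wrong net, so reflect it.
        return abs(GOAL_LINE_TO_GOAL_LINE_FT - distance_ft)
    return distance_ft


print(corrected_distance(30.0))   # plausible, left unchanged
print(corrected_distance(150.0))  # treated as recorded against the wrong net
```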
Is Cloutier finished? (310, 318). He was able to stop all three SO opportunities.
For the most part I have the new NHL format in my database.
September 14, 2007
Adjusted Shot Quality Neutral Save Percentage
Alan Ryder has found a systemic bias in the shot quality data that causes problems in the results. It is best summarized by Ryder himself:
I have been worried that there is a systemic bias in the data. Random errors don’t concern me. They even out over large volumes of data. I seriously doubt that the RTSS scorers bias the shot data in favour of the home team. But I do think that it is a serious possibility that the scoring in certain rinks has a bias towards longer or shorter shots, the most dominant factor in a shot quality model. And I set out to investigate that possibility [Shot Quality Product Recall].
I took a rather simple approach to fixing the problem. I ran a regression on SQ results for all games based on two factors: team shot quality and stadium shot quality [RTSS shot quality]. This simply calculates how far off the RTSS scores are from the standard [how the team normally performs]. Ideally we want no effect from RTSS scores, so all those variables should be 0. I found a rather long list of biases, most of them small, including: Calgary, St. Louis, Columbus, Chicago, Phoenix, New Jersey, New York Rangers, Philadelphia, Buffalo, Carolina, Washington. I deliberately over-selected, so that list likely includes teams that are simply randomly different rather than actually biased, but it doesn't matter. Ideally I would incorporate these issues into the model directly, but the shot quality model is time-consuming to build, and once you get the variables you have to go through the hassle of calculating the percentages for all 7000 shots.
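To show the idea, here is a much simpler stand-in for that regression, not the regression itself: take each game's shot quality, subtract the team's overall average, and average what is left over by rink. A rink whose leftover is far from 0 looks biased. The function name and the sample numbers are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def rink_bias(games):
    """Estimate per-rink bias as the mean residual from team averages.

    games: list of (team, rink, shot_quality) tuples, one per team-game.
    Note: this is a simplified stand-in for a two-factor regression.
    """
    by_team = defaultdict(list)
    for team, _rink, sq in games:
        by_team[team].append(sq)
    team_avg = {team: mean(values) for team, values in by_team.items()}

    residuals = defaultdict(list)
    for team, rink, sq in games:
        residuals[rink].append(sq - team_avg[team])
    return {rink: mean(values) for rink, values in residuals.items()}


# Invented data: both teams look better in rink R1 than in rink R2.
games = [
    ("A", "R1", 1.1), ("A", "R2", 0.9),
    ("B", "R1", 1.2), ("B", "R2", 1.0),
]
print(rink_bias(games))
```

Unlike a regression, this version lets a team's home rink contaminate its own average, so it will understate the bias for rinks where one team plays half its games.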
Simple adjustment on shot quality and its effect on goaltending:

newSQN = adjusted for RTSS bias
oldSQN = no adjustment for RTSS bias.
I'm curious what the RTSS turnover is. That is to say, I wonder if the bias last year will be the same this year.
September 13, 2007
NHL Changes Data Format
Along with the jersey changes, it appears the NHL wanted to make its reporting of games a little more "user friendly". Hopefully they've managed to make sure all the files are the same, although I suspect they'll still have the French versions.
Positives:
+ There is ON ICE information for every event!
+ Nice break down of shots per situation.
+ Moved goalie info to Event Summary
+ Missed shots have distances.
+ Every event has a zone for where the action occurred.
+ Cloutier seems to be himself [310] [couldn't help myself]
+ Break down icetime by situation [NHL publishes 4v3 time and 5v3 time].
Negatives:
- It could take weeks to figure out the formatting and get the information into my database.
- They disabled left clicks for who knows what reason.
- The files are HUGE.
- Play-by-play includes the header every so often, but doesn't explain why [looks like it's for commercial breaks].
- Still doesn't include the XY coordinates for shots.
Overall I think I can work with these, formatting looks simple and consistent.
September 12, 2007
Travel Schedule

Here are the details on this season's schedule. Trips = number of days the team will be on a plane traveling between cities for an NHL game. KM = total kilometers to travel over the course of the upcoming season. KM/trip is just the average trip distance.
You can see the NHL does a pretty good job of balancing the schedule. Also, there is a significant relationship between average trip length and the number of days on a plane.
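Numbers like these can be computed from a schedule roughly as follows. This is a sketch under assumptions: a trip is counted whenever two consecutive games are in different cities, distance is the great-circle (haversine) distance between cities, and the coordinates and the five-game schedule are approximate examples rather than real schedule data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Approximate city coordinates (latitude, longitude), for illustration only.
CITIES = {
    "Edmonton": (53.55, -113.49),
    "Calgary": (51.05, -114.07),
    "Vancouver": (49.28, -123.12),
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def travel_summary(game_cities):
    """Return (trips, total km, km per trip) for games listed in date order."""
    trips = 0
    km = 0.0
    for prev, cur in zip(game_cities, game_cities[1:]):
        if prev != cur:  # a change of city means a travel day
            trips += 1
            km += haversine_km(CITIES[prev], CITIES[cur])
    return trips, km, (km / trips if trips else 0.0)


# Made-up stretch of an Edmonton schedule: two home games, a road trip, home.
schedule = ["Edmonton", "Edmonton", "Calgary", "Vancouver", "Edmonton"]
print(travel_summary(schedule))
```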