September 28, 2007
I like "bookie" predictions because they are backed by cash instead of just hot air. That being said, even bookie predictions have systemic problems or biases. Some can be fixed; others cannot. Certain teams attract money from bettors and others simply do not. Because very few people cheer for Nashville, it may in fact suffer some devaluation.
However, some predictions are just stupid. For example, the Atlantic division last season was a joke: not only was Philadelphia the worst team by a wide margin, but overall the division was average or below average. This season Pittsburgh doesn't get to play Philadelphia 8 times to pad its stats, as Philadelphia will quite possibly make the playoffs. The bookies behind the above prediction say that the Atlantic division will go from below average to the best division by a large margin in a single season.
So I adjusted the bookies' standings so that all the divisions perform the same this season as last season.
When I look at those standings I have a hard time finding fault with anything on there [except the four Northwestern teams making the playoffs, only one Central team, and all four Northeastern teams making it].
I find it interesting either way...
September 25, 2007
"The topic was revisited briefly in June when general managers met in Ottawa." [Luongo vows to quit over bigger nets].
How much would a 1" change in net size (on all sides) change scoring? In other words, what would happen if we moved the left post out by 1", moved the right post out by 1" and increased the height by 1"?
Since the NHL records how many shots hit the goalposts (and crossbars), we know how often the puck hits the 2 3/8" posts.
Last season the puck hit the frame 1480 times. Moving the post by 1" would mean taking the shots that hit the first 1" from the inside and counting those as goals, while the pucks that hit the other 1 3/8" would still be posts (plus all the new posts) [simple logic, "works well enough"]. Essentially this would convert 1/(2+3/8) of the frame hits into goals, or about 623 new goals per year for every 1" change in net dimensions. So it would take a 2" change all around the net to increase scoring by 1 goal per game.
Of course this is assuming goalies and teams don't adjust to the new system. Goalies would attempt to cut off shots even more (quite an adjustment for a goalie like Luongo).
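The arithmetic above can be checked with a rough sketch. The frame-hit count and post width come from the post; the 1230-game season (30 teams x 82 games / 2) is my own assumption about what "per game" means here, and the function names are just for illustration. Like the post, it assumes nobody adjusts to the bigger net.

```python
FRAME_HITS_PER_SEASON = 1480   # frame hits last season, from the post
POST_WIDTH_INCHES = 2 + 3 / 8  # width of the post, from the post
GAMES_PER_SEASON = 1230        # assumption: 30 teams * 82 games / 2


def extra_goals_per_season(inches_added):
    """Frame hits that would become goals if each post moved out by this much.

    Uses the post's simple linear logic: hits are spread evenly across the
    width of the post, so moving it 1" converts 1/2.375 of them into goals.
    """
    return FRAME_HITS_PER_SEASON * inches_added / POST_WIDTH_INCHES


def extra_goals_per_game(inches_added):
    """League-wide extra goals per game for the same change."""
    return extra_goals_per_season(inches_added) / GAMES_PER_SEASON


if __name__ == "__main__":
    for inches in (1, 2):
        print(inches, round(extra_goals_per_season(inches)),
              round(extra_goals_per_game(inches), 2))
```

The linear logic only makes sense for changes smaller than the post itself (2 3/8"); beyond that there are no frame hits left to convert.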
September 23, 2007
L.A @ ANA 2007-09-13
ANA @ L.A 2007-09-15
PHX @ ANA 2007-09-16
ATL @ STL 2007-09-16
WSH @ CAR 2007-09-16
NSH @ CBJ 2007-09-16
FLA @ CGY 2007-09-16
COL @ PHX 2007-09-17
ANA @ VAN 2007-09-17
PIT @ MTL 2007-09-17
FLA @ EDM 2007-09-17
STL @ DAL 2007-09-18
S.J @ L.A 2007-09-18
CHI @ CBJ 2007-09-18
PIT @ MTL 2007-09-18
TOR @ EDM 2007-09-18
S.J @ ANA 2007-09-19
L.A @ COL 2007-09-19
CGY @ VAN 2007-09-19
CBJ @ CHI 2007-09-19
DAL @ T.B 2007-09-19
COL @ DAL 2007-09-20
PHX @ TOR 2007-09-20
WSH @ OTT 2007-09-20
ATL @ NSH 2007-09-20
EDM @ VAN 2007-09-20
FLA @ CHI 2007-09-20
MIN @ DET 2007-09-20
N.J @ NYR 2007-09-21
NSH @ CAR 2007-09-21
ANA @ S.J 2007-09-21
CBJ @ BUF 2007-09-21
MIN @ CHI 2007-09-21
NYI @ MTL 2007-09-21
PIT @ DET 2007-09-21
TOR @ BOS 2007-09-22
CAR @ NSH 2007-09-22
DAL @ PHX 2007-09-22
OTT @ MTL 2007-09-22
STL @ ATL 2007-09-22
WSH @ T.B 2007-09-22
PHI @ NYR 2007-09-22
VAN @ S.J 2007-09-22
EDM @ CGY 2007-09-22
DET @ PIT 2007-09-22
Ok, this is a work in progress. Basically I want something that has the option to hide certain events and show other ones. If you click on SHOT, BLOCK etc. it will hide or show those events. This is really nice as you can break the play-by-play into just shots (what I care most about), or just face-offs. In the long run, this is just a great way to see if I have recorded the data correctly and where any problems are.
Also, I've color-coded the shots based on the likelihood of them going in. I wasn't sure if this would work, but it really gives you an idea of the flow of the game. Bright red = very good shot. Black goal = bad goal (low probability of going in).
I plan on publishing these (possibly live if I can figure out how to write a program that will download and parse them live). They are more in tune with the old style.
Note: the code runs slow in IE7 (not sure how it works in IE5.x or IE6).
I will likely attach a game summary part onto the top, which includes the goalie's save percentages and who scored and got assists. And a few other details. I also find the information a little overwhelming in this format so I'll be moving it around to see if it works better.
Nice CSS formatting was done by Chris Waycott.
There are a significant number of shots recorded from the wrong side of the ice in the NHL data. I've assumed any shot from more than 115 feet was a mistake and should have been recorded on the other half of the ice. Hopefully the NHL gets that sorted out before the beginning of the season.
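Here is a minimal sketch of that kind of cleanup. The 115-foot cutoff is from the paragraph above; the way the distance is flipped is my own assumption, not necessarily what was done in the database. It treats the shot as lying on the line between the two nets, which are taken to be 178 feet apart (a 200-foot rink with goal lines 11 feet from the end boards). The names are hypothetical.

```python
MAX_PLAUSIBLE_DISTANCE_FT = 115.0  # cutoff from the post
GOAL_LINE_TO_GOAL_LINE_FT = 178.0  # assumption: 200 ft rink, goal lines 11 ft in


def correct_shot_distance(recorded_distance_ft):
    """Return (distance, was_flipped) for one recorded shot distance.

    Shots beyond the cutoff are assumed to have been recorded on the wrong
    half of the ice and are mirrored across centre. This is only exact for
    shots straight down the middle; without X-Y coordinates it is a rough fix.
    """
    if recorded_distance_ft > MAX_PLAUSIBLE_DISTANCE_FT:
        return GOAL_LINE_TO_GOAL_LINE_FT - recorded_distance_ft, True
    return recorded_distance_ft, False


if __name__ == "__main__":
    for distance in (40.0, 150.0):
        print(distance, correct_shot_distance(distance))
```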
Is Cloutier finished? (3-10, 3-18). He was able to stop all three SO opportunities.
For the most part I have the new NHL format in my database.
September 14, 2007
I have been worried that there is a systemic bias in the data. Random errors don’t concern me. They even out over large volumes of data. I seriously doubt that the RTSS scorers bias the shot data in favour of the home team. But I do think that it is a serious possibility that the scoring in certain rinks has a bias towards longer or shorter shots, the most dominant factor in a shot quality model. And I set out to investigate that possibility [Shot Quality Product Recall].
I took a rather simple approach to fix the problem. I ran a regression on SQ results for all games based on two factors: team shot quality and stadium shot quality [RTSS shot quality]. This simply calculates how far off the RTSS scores are from the standard [how the team normally performs]. Ideally there would be no effect from the RTSS scores, so all those variables should be 0. I found a rather long list of biases, most of them small, including: Calgary, St. Louis, Columbus, Chicago, Phoenix, New Jersey, New York Rangers, Philadelphia, Buffalo, Carolina, Washington. I have deliberately over-selected, so that list likely includes teams which are simply randomly different as opposed to actually biased, but it doesn't matter. Ideally, I would incorporate these issues into the model directly, but the shot quality model is time-consuming to build, and once you get the variables you have to go through the hassle of calculating the percentages for all 7000 shots.
Simple adjustment on shot quality and its effect on goaltending:
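To give a feel for the two-factor idea, here is a rough sketch and not the actual regression. It assumes each game is a hypothetical (team, rink, shot quality) tuple and fits shot quality = overall average + team effect + rink effect by repeatedly averaging the leftovers for each factor (backfitting), which settles on a least-squares fit for this kind of additive model. A rink effect far from 0 would hint at a scorer bias in that building.

```python
from collections import defaultdict


def _mean_residuals(games, key_index, other_index, other_effects, overall):
    """Average leftover shot quality for each level of one factor."""
    sums, counts = defaultdict(float), defaultdict(int)
    for game in games:
        key, other, shot_quality = game[key_index], game[other_index], game[2]
        sums[key] += shot_quality - overall - other_effects.get(other, 0.0)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def fit_team_and_rink_effects(games, sweeps=200):
    """Fit shot_quality ~ overall + team effect + rink effect.

    games: list of (team, rink, shot_quality) tuples (hypothetical layout).
    Returns (overall, team_effects, rink_effects).
    """
    overall = sum(shot_quality for _, _, shot_quality in games) / len(games)
    team_effects, rink_effects = {}, {}
    for _ in range(sweeps):
        # Re-estimate each factor while holding the other one fixed.
        team_effects = _mean_residuals(games, 0, 1, rink_effects, overall)
        rink_effects = _mean_residuals(games, 1, 0, team_effects, overall)
    return overall, team_effects, rink_effects
```

With real data, each game would contribute one row per team with the home rink as the second field; the small made-up example needed to try it is left out here on purpose.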
oldSQN = no adjustment for RTSS bias.
I'm curious what the RTSS turnover is. That is to say, I wonder if the bias last year will be the same this year.
September 13, 2007
+ There is ON ICE information for every event!
+ Nice break down of shots per situation.
+ Moved goalie info to Event Summary
+ Missed shots have distances.
+ Every event has a zone for where the action occurred.
+ Cloutier seems to be himself [3-10] [couldn't help myself]
+ Break down icetime by situation [NHL publishes 4v3 time and 5v3 time].
- This could take weeks to figure out formatting to get information into my database
- They disabled left clicks for who knows what reason.
- The files are HUGE.
- Play by Play includes the header every so often, but doesn't explain why [looks like it's for com. break]
- Still don't include the X-Y coords for shots.
Overall I think I can work with these, formatting looks simple and consistent.
September 12, 2007
You can see the NHL does a pretty good job balancing the schedule. Also, there is a significant relationship between average trip length and number of days on a plane.
Typically a team sees about 12 back to back games and plays against 12 teams who played the day before. Nashville lucked out this season. They are scheduled to play 24 games (30% of the season) against tired teams [listed to the left] and play only 10 back to back games themselves.
Generally I argue that these games have a cost of about 5% in terms of winning percentage, or 0.05 wins per game. So Nashville loses 0.5 wins in the 10 games they play back to back and gains 1.2 wins in the 24 games they play against a back to back team, a net gain of 0.7 wins. In essence the NHL has given Nashville roughly 1.5 extra points (this is worth approximately $700,000). Of course this isn't huge, but essentially Nashville has a 1.5 point advantage before the season starts.
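The back to back math is easy to redo for any team. A quick sketch, using the 5% cost from above and assuming a win is worth 2 points; the function name is made up for illustration:

```python
B2B_COST_IN_WINS = 0.05  # estimated cost per back to back game, from the post
POINTS_PER_WIN = 2       # assumption: count each win as 2 points


def schedule_edge(own_b2b_games, games_vs_tired_teams, cost=B2B_COST_IN_WINS):
    """Return (net_wins, net_points) handed to a team by the schedule."""
    wins_lost = own_b2b_games * cost
    wins_gained = games_vs_tired_teams * cost
    net_wins = wins_gained - wins_lost
    return net_wins, net_wins * POINTS_PER_WIN


if __name__ == "__main__":
    # Nashville: 10 back to back games, 24 games against tired teams.
    net_wins, net_points = schedule_edge(10, 24)
    print(round(net_wins, 2), round(net_points, 2))
```

For Nashville this works out to about 0.7 wins, or a little under 1.5 points.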
It's interesting, but the majority of these games are in the first half of the season, possibly to get the team an early lead and to attract attention to the team. For example, Nashville plays tired teams for a whole week: Dec 6 to Dec 13.
Last season the NHL was very good at making sure back to back games were scheduled fairly and the largest difference was 5 (Col, S.J, NYI).
James Mirtle posted a nice poll, which asked people to pick the teams that are expected to make the playoffs. I simply created a nice graphic for the predicted standings based on this poll. This graphic was made when there were 330 votes.
September 10, 2007
In order to see if players can affect save percentage I did a season vs. season regression: I checked if the players who outperformed in 2005-2006 continued to outperform in 2006-2007. If there is a significant relationship between the two seasons then obviously players impact save percentage.
The regression does come out significant, which agrees with the theory that players can affect shot quality.
The regression is reasonably simple:
Difference in team's save percentage in 2006 vs. 2007 = D
Player's Save Percentage in 2006 = Save%
Player's Save Percentage in 2007 = 0.7 + 0.235*Save% + 0.618*D
Basically the above equation is a regression towards the mean: it takes extreme values and brings them closer to the average (0.920).
Note: this data only includes even strength shots.
However, this data is worthless for predicting how a player will do next year due to the large amount of variability. For example, Crosby had an excellent save percentage this season of 0.930. This system says he'll likely perform at 0.919 given identical goaltending as last season; however, he could easily do as poorly as 0.890 with bad luck or as well as 0.950 with good luck. A 2% difference in save percentage on 600 shots works out to about 12 goals over the course of a season. So this large range of possible save percentages can have a large impact on a player's plus-minus, often due to just luck.
Anyway, most people will want to see the results.
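For anyone who wants to plug in their own numbers, here is a small sketch of the regression equation and the goal swing. The coefficients are the ones quoted above; reading D as this season's team save percentage minus last season's is my assumption about the sign, and the function names are made up.

```python
def predict_on_ice_save_pct(last_season_save_pct, team_save_pct_change=0.0):
    """Apply the regression above to one player.

    team_save_pct_change is D; it is assumed here to be the new season's
    team save percentage minus the old one (0.0 = identical goaltending).
    """
    return 0.7 + 0.235 * last_season_save_pct + 0.618 * team_save_pct_change


def goal_swing(shots, save_pct_difference):
    """Goals gained or lost from a change in save percentage on some shots."""
    return shots * save_pct_difference


if __name__ == "__main__":
    # Crosby: 0.930 last season, identical goaltending assumed.
    print(round(predict_on_ice_save_pct(0.930), 3))
    # A 2% swing on 600 shots.
    print(round(goal_swing(600, 0.02), 1))
```

The Crosby example comes out at about 0.919, as stated above, and the 2% swing on 600 shots at about 12 goals.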
September 6, 2007
|Atlanta - Lehtonen|
Save %: 0.912, Career Save %: 0.911
Shot Quality Neutral Save %: 0.915
Wins: 34, Wins excluding SOW: 27
Career Wins: 58, Career Games: 110
#2 Pick overall in 2002. Lehtonen has had a great start to what should be a long career. He's young and has a lot of time to improve. Bob Hartley destroyed Lehtonen's confidence by replacing him with Hedberg in game #2 after he allowed 4 goals on almost 40 shots. He was put back into the fire for game #3 allowing 7 goals on 35 shots. I expect him to rebound in the coming season, but I never understood what Hartley was thinking.
|Carolina - Ward|
Save %: 0.897, Career Save %: 0.892
Shot Quality Neutral Save %: 0.906
Wins: 30, Wins excluding SOW: 30
Career Wins: 44, Career Games: 88
#25 overall in 2002. The only reason Ward became Carolina's starting goaltender was his amazing playoff run. However, he has never really impressed me as an outstanding goalie. Ward was able to squeak in a 30 win season, but was limited to only 60 games.
|Florida - Vokoun|
Save %: 0.920, Career Save %: 0.913
Shot Quality Neutral Save %: 0.921
Wins: 27, Wins excluding SOW: 25
Career Wins: 161, Career Games: 384
Mason played well enough in Nashville to steal his spot, leaving Vokoun, a very good goaltender, free for the taking. Many have commented that if Luongo couldn't get Florida into the playoffs, then why should a weaker goalie be able to? Vokoun has missed a lot of important hockey games due to injuries, most importantly the playoffs two years in a row. Florida almost made the playoffs last season; Vokoun should be able to put them over the edge.
|Tampa Bay - Denis & Holmqvist|
Save %: 0.883, Career Save %: 0.903
Shot Quality Neutral Save %: 0.884
Wins: 17, Wins excluding SOW: 13
Career Wins: 111, Career Games: 338
I commented once that both of Tampa's goaltenders were terrible and was criticized and told it's their defense that's terrible and the goalies are doing the best with what they got. Personally I felt Denis was a big reason Columbus slipped into mediocrity for so many years. Denis made the record for most losses for a goalie two seasons in a row, granted he played the most minutes and saw the most shots.
Save %: 0.893, Career Save %: 0.882
Shot Quality Neutral Save %: 0.890
Wins: 27, Wins excluding SOW: 21
Career Wins: 27, Career Games: 52
"Lanky Holmqvist has the size to blot out the majority of the net from shooters and also has good technical ability and reflexes" (The Sports Forecaster 2001-02, p. 101). The one thing I will note of Holmqvist: despite his terrible stats he was able to play close to 50% hockey. Holmqvist doesn't even have a full season of experience to this point and the experience he does have is less than impressive. I don't see all that much positive in his game. John Tortorella chose to use Holmqvist over Denis in the playoffs last season.
|Washington - Kolzig|
Save %: 0.910, Career Save %: 0.927
Shot Quality Neutral Save %: 0.919
Wins: 22, Wins excluding SOW: 21
Career Wins: 276, Career Games: 657
Kolzig is getting older, but still may be the most important piece of a terrible Washington team. His skills are wasted on such a terrible team. Word on the street is that Washington is getting better, although not much. I think enough people know enough about Kolzig; there isn't much to add.
It would appear that there are three teams with excellent goaltenders (Florida, Atlanta and Washington - note these teams all have terrible defense) and two teams with terrible goaltenders (Tampa Bay and Carolina) with terrible defense as well, but they both have extreme offense to compensate with.