This made me wonder: is there really a significant "home bias" effect for counting shots? Who would have thought such a simple task would be subject to such variation?

So let's start with the last 4 years of shot data. The graph below shows a "Matrix Plot" of the ratio Shots@home / Shots@road (first 3 periods only). All four seasons show some sort of positive correlation with one another (I didn't check the significance of each, though). The reader should notice quite quickly that most of the values are greater than 1; that is, shots at home generally outnumber shots on the road. This should not be a huge surprise: home teams also score more goals.

Before I get too far ahead of myself I should explain the above plot. A Matrix Plot is used to compare multiple variables at the same time to see if any of them are related. In the above plot the "rows" represent one season variable and the "columns" another season variable. So for example, the graph in the first row and last column represents "Ratio 2005 vs. Ratio 2008": a scatter plot, along with a regression line, of the 2005 season against the 2008 season. Similarly, if you move over one column you'd have "Ratio 2005 vs. Ratio 2007". The diagonal doesn't contain any graphs because "Ratio 2005 vs. Ratio 2005" is not a very interesting graph (just a bunch of points along the 45 degree line).
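For readers who want to see the mechanics, here is a minimal sketch of what a matrix plot computes behind the scenes: one least-squares line for every off-diagonal pair of seasons. The function names (`fit_line`, `matrix_plot_fits`) and the numbers are placeholders I made up for illustration, not the actual shot data, and I'm assuming the usual convention that the row season goes on the vertical axis and the column season on the horizontal axis.

```python
from itertools import permutations

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def matrix_plot_fits(ratios_by_season):
    """Fit one regression per off-diagonal panel.

    Assumption: the row season is the y variable, the column season the x variable.
    """
    return {
        (row, col): fit_line(ratios_by_season[col], ratios_by_season[row])
        for row, col in permutations(ratios_by_season, 2)
    }

# Placeholder home/road shot ratios for three hypothetical teams (NOT real data).
example = {
    "Ratio 2005": [0.95, 1.05, 1.20],
    "Ratio 2006": [0.97, 1.08, 1.15],
    "Ratio 2007": [1.00, 1.02, 1.18],
    "Ratio 2008": [0.93, 1.06, 1.17],
}

fits = matrix_plot_fits(example)
print(len(fits))  # 12 panels: 4 seasons x 3 other seasons, diagonal skipped
```

With four seasons this gives 12 panels, and each pair of seasons shows up twice (once with the axes swapped).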

Ratio in 2008 = 0.42 + 0.61 x Ratio Average (2005,2006,2007) [p-value=0.001]

or

Ratio in 2008 = 1.07 + 0.61 x (Ratio Average - 1.07)
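A quick sanity check that the two equations above say the same thing. This is only a sketch using the rounded coefficients printed in the post; the function names are mine, and the tiny mismatch between the two forms comes from rounding the intercept to 0.42.

```python
def ratio_2008_raw(avg_2005_2007):
    """First form: intercept plus slope times the three-season average."""
    return 0.42 + 0.61 * avg_2005_2007

def ratio_2008_centered(avg_2005_2007):
    """Second form: the same line written around the point 1.07."""
    return 1.07 + 0.61 * (avg_2005_2007 - 1.07)

# Intercept implied by the centered form: 1.07 - 0.61 * 1.07
implied_intercept = 1.07 - 0.61 * 1.07
print(round(implied_intercept, 2))  # 0.42

# A hypothetical team with a three-season average ratio of 1.20
print(ratio_2008_raw(1.20), ratio_2008_centered(1.20))
```

The centered form makes the interpretation easier to read: a team sitting at 1.07 is predicted to stay at 1.07, and a team above or below that is predicted to keep about 61% of its gap the following season.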

I've also included a data summary for all the ratio observations (4 seasons x 30 teams = 120 observations) to give the reader an idea of how the data is distributed.

Conclusion:

There is obviously something going on here (sorry, I'm trying to keep this short). It's worth noting that Colorado's elevation provides them with a natural home advantage. However, even when you remove Colorado from the data you get similar results.

## 3 comments:

Cool, Chris. Do the three matrix plots for each season represent each period?

If I'm reading the plots correctly, look at NJ on all 12 of those graphs!

Oh, sorry I should have explained it better.

Each graph represents one whole season vs. another whole season. So at row "Ratio 2005" and column "Ratio 2007" you get a dot plot of the "Ratio 2005 points vs Ratio 2007 points".

What the Matrix Plot does is run a regression on every possible pairing (R5 = Ratio 2005, to simplify things here).

First row:

R5 vs. R6; R5 vs. R7; R5 vs. R8

...

So the only regression that doesn't look meaningful is R5 vs. R7

And yes New Jersey is in the lower left quadrant every single season. Anaheim is in the upper right.

