Beyond Make Percentage: Evaluating NFL Kickers
Daniel Tokarz
February 18, 2019
Introduction
Many teams experienced either glee or distress at the hands of their kicker in the 2018-2019 NFL season. As a Bears fan, I know I'm going to be seeing that Cody Parkey miss over and over again in my head for years to come. On the flip side, Greg Zuerlein sending his team to the Super Bowl was a moment of the year for Rams fans, especially as good memories were hard to come by for them two weeks later. While these late-game hardships and heroics often come to define the public perception of kickers, it's worth asking how best to evaluate kickers outside the bubble of emotion that follows a big game. In this project, I sought to find a universal metric that could define the reliability and relative value of NFL kickers, in the hopes of contributing information to the conversation beyond the recounting of last-second kicks we'll never forget.
General overview of methodology
In the style of most modern sports value metrics, I'm seeking to define the contribution of a kicker to their team relative to an average NFL kicker. The obvious first idea was to fit a model over the kicks attempted throughout the season and estimate how many the average kicker would have made. This fails to account for confidence, though. For example, a team with great faith in their kicker might attempt a 60-yard field goal. If he misses, such a metric penalizes him, while a kicker whose team would never have sent him out is unaffected. My metric seeks to move beyond this by taking "team confidence" into account.
To start out, I decided to look at which variables were useful in distinguishing the difficulty of a given field goal. Using play-by-play data from 2009-2017 obtained with the nflscrapR package, I fit a logistic regression model over all field goals attempted to see which variables were statistically significant.
mod <- glm(kick_good ~ TimeSecs + FieldGoalDistance + ScoreDiff + home_kicker + season, data = fgs, family = "binomial")
summary(mod)
##
## Call:
## glm(formula = kick_good ~ TimeSecs + FieldGoalDistance + ScoreDiff +
## home_kicker + season, family = "binomial", data = fgs)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -2.7619   0.2448   0.3958   0.6350   1.5256
##
## Coefficients:
##                     Estimate Std. Error z value Pr(>|z|)
## (Intercept)        5.422e+00  1.878e-01  28.870  < 2e-16 ***
## TimeSecs           2.862e-05  3.067e-05   0.933 0.350679
## FieldGoalDistance -1.024e-01  3.557e-03 -28.777  < 2e-16 ***
## ScoreDiff          2.903e-03  3.373e-03   0.861 0.389498
## home_kickerTRUE    6.539e-02  6.169e-02   1.060 0.289147
## season2010         1.101e-01  1.302e-01   0.846 0.397830
## season2011         2.161e-01  1.304e-01   1.657 0.097527 .
## season2012         3.352e-01  1.283e-01   2.612 0.009003 **
## season2013         5.595e-01  1.355e-01   4.130 3.63e-05 ***
## season2014         3.628e-01  1.319e-01   2.750 0.005966 **
## season2015         4.495e-01  1.351e-01   3.327 0.000879 ***
## season2016         4.537e-01  1.321e-01   3.433 0.000597 ***
## season2017         3.772e-01  1.286e-01   2.932 0.003364 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 7903.9 on 8923 degrees of freedom
## Residual deviance: 6834.3 on 8911 degrees of freedom
## (4 observations deleted due to missingness)
## AIC: 6860.3
##
## Number of Fisher Scoring iterations: 5
While we might talk about the advantage of a kicker being clutch late in the game, of having the home crowd on his side, or of folding under the pressure of a tight game, the results of our regression model show that the only statistically significant variables are season and field goal distance. My thought in separating by season is that kickers collectively get better over time across the NFL, so it's worth accounting for this to even the playing field of evaluation. Our model for the expected make probability of a given field goal will be based solely upon these two variables.
library(ggplot2)
mod1 <- glm(kick_good ~ FieldGoalDistance + season, data = fgs, family = "binomial")
fgs$predicted_make <- predict(mod1, newdata = fgs, type = "response")
# Plot the fitted make probability by distance, one color per season
ggplot(fgs[fgs$FieldGoalDistance <= 65 & !is.na(fgs$FieldGoalDistance), ],
       aes(x = FieldGoalDistance, y = predicted_make, col = season)) +
  geom_point(size = 1.5) + geom_smooth(method = "loess") + theme_bw() +
  labs(x = "Field Goal Distance", y = "Predicted Make Percentage",
       col = "NFL Season", title = "Fitted Field Goal Make Percentage")
As we can see, the three worst seasons for kickers were the first three years in our data set, in line with the hypothesis that kickers have generally improved over time. Now that we have this model, we can match it up with our play by play data.
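To make the make-probability model concrete, here is a minimal sketch of how such a fit can score individual kicks. It does not use the real play-by-play data: `fgs_sim`, `mod_sim`, and `new_kicks` are hypothetical names, and the simulated coefficients (5.4 and -0.10) are assumptions chosen only to give plausible-looking probabilities.

```r
# Simulated stand-in for the `fgs` data frame (NOT the real nflscrapR data).
set.seed(1)
n <- 4000
fgs_sim <- data.frame(
  FieldGoalDistance = sample(19:60, n, replace = TRUE),
  season            = factor(sample(2009:2017, n, replace = TRUE))
)
# Assumed data-generating rule: make probability falls with distance only.
fgs_sim$kick_good <- rbinom(n, 1, plogis(5.4 - 0.10 * fgs_sim$FieldGoalDistance))

# Same model form as mod1: distance plus a season factor.
mod_sim <- glm(kick_good ~ FieldGoalDistance + season,
               data = fgs_sim, family = "binomial")

# Score two hypothetical kicks from the 2017 season.
new_kicks <- data.frame(
  FieldGoalDistance = c(30, 50),
  season            = factor(c("2017", "2017"), levels = levels(fgs_sim$season))
)
new_kicks$predicted_make <- predict(mod_sim, newdata = new_kicks, type = "response")
print(new_kicks)
```

Using `type = "response"` returns probabilities rather than log-odds, which is what lets these values be summed into expected makes later on.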
Fitting Our Model
To fit our model for evaluating kickers, we will look at every fourth down faced by a team. We do this to capture the confidence factor that makes our model unique. Looking at every fourth down gives us a binary variable answering the question, "Did this team feel that their kicker attempting a field goal in this situation was the best option?" Unfortunately, we miss out on the late-game field goals that were taken on one of the first three downs. Although these are often the field goals we remember, our earlier model showed that the time of game is not a good predictor of make percentage. Therefore, losing this small fraction of the data is not a deal breaker. We have compiled the data into a frame called "fourths." The next thing we will do is add in some field-goal-specific variables.
fourths$isfg <- fourths$PlayType == "Field Goal"
isfglm <- glm(isfg ~ TimeSecs + ScoreDiff + distance_to_end * ydstogo, data = fourths, family = "binomial")
fourths$fgprob <- predict(isfglm, newdata = fourths, type = "response")
summary(isfglm)
##
## Call:
## glm(formula = isfg ~ TimeSecs + ScoreDiff + distance_to_end *
## ydstogo, family = "binomial", data = fourths)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -4.1371  -0.1965  -0.0255   0.0000   2.9457
##
## Coefficients:
##                          Estimate Std. Error z value Pr(>|z|)
## (Intercept)             3.542e-01  8.534e-02   4.151 3.31e-05 ***
## TimeSecs                4.249e-04  2.259e-05  18.805  < 2e-16 ***
## ScoreDiff               5.164e-02  2.291e-03  22.542  < 2e-16 ***
## distance_to_end         5.397e-02  2.335e-03  23.115  < 2e-16 ***
## ydstogo                 8.734e-01  2.192e-02  39.854  < 2e-16 ***
## distance_to_end:ydstogo 2.169e-02  6.126e-04  35.414  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 37770 on 34334 degrees of freedom
## Residual deviance: 12136 on 34329 degrees of freedom
## AIC: 12148
##
## Number of Fisher Scoring iterations: 9
Here, we have created a glm that looks at the time of game, score differential, place on the field, and yards to go for every fourth down taken and outputs the percent chance that a field goal would be attempted. We will use this to unravel the confidence factor that goes into team decision making.
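As a rough illustration of how the attempt probabilities turn into a confidence measure, the sketch below sums them by team and compares the total with the attempts actually taken. Everything here is simulated: `fourths_sim`, the three team labels, and the decision rule used to generate `isfg` are assumptions, not the real data or the real coaching behavior.

```r
# Simulated stand-in for the `fourths` data frame (NOT the real data).
set.seed(42)
n <- 3000
fourths_sim <- data.frame(
  team            = sample(c("AAA", "BBB", "CCC"), n, replace = TRUE),
  TimeSecs        = sample(0:3600, n, replace = TRUE),
  ScoreDiff       = sample(-21:21, n, replace = TRUE),
  distance_to_end = sample(1:99, n, replace = TRUE),
  ydstogo         = sample(1:15, n, replace = TRUE)
)
# Invented decision rule: kick more often near the end zone and on long yardage.
p_attempt <- plogis(2 - 0.08 * fourths_sim$distance_to_end +
                      0.15 * fourths_sim$ydstogo)
fourths_sim$isfg <- rbinom(n, 1, p_attempt) == 1

# Same model form as isfglm.
fit <- glm(isfg ~ TimeSecs + ScoreDiff + distance_to_end * ydstogo,
           data = fourths_sim, family = "binomial")
fourths_sim$fgprob <- predict(fit, newdata = fourths_sim, type = "response")

# Expected attempts are the sum of attempt probabilities over a team's fourth downs.
expected <- tapply(fourths_sim$fgprob, fourths_sim$team, sum)
actual   <- tapply(fourths_sim$isfg,   fourths_sim$team, sum)
attempts_added <- actual - expected
print(round(attempts_added, 2))
```

Because a logistic regression with an intercept reproduces the overall number of attempts on the data it was fit to, these differences sum to roughly zero across teams. A positive value means a team kicked more often than the league-wide model would predict in the same situations.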
The next step is matching up each game with the kicker who was on the field for the team. We have compiled this into a frame called kicker_data. The first few rows are shown below.
head(kicker_data)
##    kicker_name fgs_attempted fgs_attempted_exp fgs_made fgs_made_expexted
## 1 S.Gostkowski           274         288.20176      245         241.46773
## 2     C.Santos            99         100.74169       84          82.81871
## 3  C.Catanzaro           109         106.99742       93          87.96339
## 4   S.Hauschka           238         239.57874      209         198.08244
## 5    D.Hopkins            80          74.88821       69          63.09009
## 6    C.Sturgis           127         119.66647      102         100.91555
##   total_points_expected no_confidence_expected_makes attempts_added
## 1              724.4032                    235.53963     -14.201760
## 2              248.4561                     83.92704      -1.741692
## 3              263.8902                     91.84796       2.002577
## 4              594.2473                    201.49786      -1.578737
## 5              189.2703                     68.38280       5.111789
## 6              302.7466                    106.28725       7.333532
##   kicks_added kicks_added_no_confidence team
## 1    3.532273                9.46036736   NE
## 2    1.181291                0.07296434   KC
## 3    5.036605                1.15204029  ARI
## 4   10.917563                7.50213715  SEA
## 5    5.909915                0.61719865  WAS
## 6    1.084455               -4.28724793  MIA
The added columns are as follows.

fgs_attempted_expected is a measure of the total field goals expected to be attempted by fitting our field goal prediction model over every fourth down faced by a team.

fgs_made_expected is built like fgs_attempted_expected, except that each fourth down's probability of a field goal attempt is multiplied by the probability that the kick would be made.

total_points_expected is just fgs_made_expected multiplied by three.

no_confidence_expected_makes predicts how many field goals, of the field goals that a kicker attempted, he should have been expected to make.

attempts_added shows how many more field goals were attempted than the average kicker would have taken.

kicks_added shows how many more field goals were made than the average kicker would have made, accounting for all 4th down field goal opportunities.

kicks_added_no_confidence shows how many more field goals were made than the average kicker would have made, only looking at the field goals that were actually attempted.
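
The column definitions above can be sketched in a few lines of base R. This is one possible reading of those definitions, not necessarily the code used to build kicker_data: the data frame `fourths_demo`, the helper `summarise_kicker`, and the two kickers "A" and "B" are hypothetical, and every number is invented. The column is named fgs_made_expected here, following the description above.

```r
# Hypothetical fourth-down rows for two made-up kickers; all values are invented.
fourths_demo <- data.frame(
  kicker_name    = c("A", "A", "A", "B", "B", "B"),
  fgprob         = c(0.90, 0.50, 0.10, 0.80, 0.60, 0.20), # P(team attempts a FG)
  predicted_make = c(0.95, 0.80, 0.50, 0.90, 0.85, 0.60), # P(make | attempt)
  isfg           = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE),
  kick_good      = c(TRUE, TRUE, FALSE, TRUE, NA, NA)     # NA when no kick was tried
)

# Collapse one kicker's fourth downs into a single summary row.
summarise_kicker <- function(d) {
  out <- data.frame(
    kicker_name                  = d$kicker_name[1],
    fgs_attempted                = sum(d$isfg),
    fgs_attempted_exp            = sum(d$fgprob),
    fgs_made                     = sum(d$kick_good[d$isfg]),
    fgs_made_expected            = sum(d$fgprob * d$predicted_make),
    no_confidence_expected_makes = sum(d$predicted_make[d$isfg])
  )
  out$total_points_expected     <- 3 * out$fgs_made_expected
  out$attempts_added            <- out$fgs_attempted - out$fgs_attempted_exp
  out$kicks_added               <- out$fgs_made - out$fgs_made_expected
  out$kicks_added_no_confidence <- out$fgs_made - out$no_confidence_expected_makes
  out
}

kicker_demo <- do.call(rbind, lapply(split(fourths_demo, fourths_demo$kicker_name),
                                     summarise_kicker))
print(kicker_demo)
```

In this toy example, kicker A is sent out for a kick the model gives only a 10% chance of being attempted, and misses it. Looking only at attempted kicks, he comes out below average (kicks_added_no_confidence of -0.25), but once the extra trust is credited his kicks_added is 0.695. Kicker B makes his only attempt yet ends up at -0.35, because his team passed on two kicks an average team would often have taken.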
Making Plots With Our New Data
Now that we have our cleaned kicker data for 2009-2017, we can look at some plots to analyze kicker performance over this span. First, we can make a plot in ggplot that shows total field goals added over this range. This is our master stat, which I believe is much more effective for evaluation than either make percentage or field goals made above replacement (which wouldn't account for the added attempts). We can see some familiar names at the top of the list.
For a better look, here is the plot zoomed in on the players who have performed most and least admirably in this time frame.
It seems like a who's who of the best-known kickers from the last decade. Sometimes the eye test is confirmed by our analysis. Let's take a look at the list no one wants to see their name on.
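As a small sketch of what such a plot could look like, the code below charts kicks_added for only the six kickers printed by head(kicker_data) earlier, so it is not the full ranking. The name `kicker_head`, the axis labels, and the title are assumptions, and the sketch assumes the ggplot2 package is installed.

```r
library(ggplot2)

# The six rows shown by head(kicker_data); kicks_added is fgs_made minus expected makes.
kicker_head <- data.frame(
  kicker_name = c("S.Gostkowski", "C.Santos", "C.Catanzaro",
                  "S.Hauschka", "D.Hopkins", "C.Sturgis"),
  kicks_added = c(3.532273, 1.181291, 5.036605, 10.917563, 5.909915, 1.084455)
)

# Horizontal bar chart, sorted so the largest value sits at the top.
p <- ggplot(kicker_head,
            aes(x = reorder(kicker_name, kicks_added), y = kicks_added)) +
  geom_col() +
  coord_flip() +
  theme_bw() +
  labs(x = "Kicker", y = "Field Goals Added",
       title = "Field Goals Added, 2009-2017 (first six rows only)")
print(p)
```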