Impact of Defense

Otto von Ballpark · March 20, 2015

How many opportunities does Willingham have like that during a game? One? That's ~150 outlier chances a season, which we would dismiss out of hand were it a hitting metric (RISP, for example).

We dismiss it for RISP because we have an additional sample, three times as large, of a nearly identical task (hitting without RISP).

And while outliers exist all over (Al Newman even hit that one home run!), I don't know that defensive outliers are somehow worse. Willingham botching one or two routine plays isn't going to torpedo his UZR for the season. Willingham only converting 10 out of 50 difficult but not impossible plays, when the average LF converted 30 out of every 50 similar plays over the last 50 seasons, is what is going to hurt his UZR.
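As a rough illustration of why 10-for-50 would stand out, here is a minimal sketch that treats each difficult play as an independent coin flip at the league conversion rate. That is a simplifying assumption, not how UZR is actually computed, and the helper name is made up for this example.

```python
import math

def plays_vs_expected(made, chances, league_rate):
    """Return (plays above/below average, z-score under a simple binomial model)."""
    expected = chances * league_rate
    # Standard deviation of plays made if the fielder were exactly league average.
    sd = math.sqrt(chances * league_rate * (1 - league_rate))
    diff = made - expected
    return diff, diff / sd

# Numbers from the example above: 10 of 50 made, league average 30 of 50.
diff, z = plays_vs_expected(made=10, chances=50, league_rate=30 / 50)
print(f"{diff:+.0f} plays vs. average, z = {z:.2f}")
```

Under that model a league-average fielder lands within a few plays of 30 most of the time, so botching one or two routine plays barely registers while a 20-play shortfall does.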

Otto von Ballpark · March 20, 2015

"Bad" is subjective. The point is that the current metrics are missing a lot of important components. Positioning alone is basically unaccounted for and that matters for more than just shifts. You couple those weaknesses with all of the interpretation going on and you get something less than perfect. It's why it's always wise to take these ratings in larger sample sizes than even a season.

For the last time, those things aren't "unaccounted for".

If a player positions himself well (or his teammates/coaches position him well), that affects the result.

AND THE RESULT IS THE ONLY THING THAT UZR IS ATTEMPTING TO MEASURE.

It's like if team A scores 10 runs, and team B scores 5. Those are results and are very valuable information on their own. Enough of those results and it is pretty clear team A is better than team B. That's what UZR is trying to do with very basic defensive results (out or not an out).

Now obviously it is interesting and valuable too to delve further to see HOW those two teams scored those runs, but that doesn't mean that the high level information is worthless.

And of course UZR isn't perfect. No one claimed that. But including UZR in evaluations (preferably with another defensive metric or two) is better than excluding it. And even using standard UZR ranges is better than guesstimating the effect of defense, I think.

beckmt · March 20, 2015

Range is an important factor: how many balls were reached. Jeter was considered for years a below average fielder, but his positioning and the Yankee veteran pitching hitting their spots made him look much better than he was. Positioning and pitching have a lot to do with how fielding looks, so will pass on determination of range. Standard range has been calculated for years by the Strat-O-Matic baseball game and you would be surprised how much this affects play.

jimmer · March 20, 2015

Range is an important factor: how many balls were reached. Jeter was considered for years a below average fielder, but his positioning and the Yankee veteran pitching hitting their spots made him look much better than he was. Positioning and pitching have a lot to do with how fielding looks, so will pass on determination of range. Standard range has been calculated for years by the Strat-O-Matic baseball game and you would be surprised how much this affects play.

Jeter's biggest skill as a defender was that he was very sure handed on the balls he could get to and had a very accurate arm. This helped him in the fielding percentage/error stat categories (you know, the most accurate way to judge defense out there) :-).

His name helped a lot too, people eager to gloss over his flaws.

Brock Beauchamp · March 20, 2015

Can someone explain the seasonal discrepancies of UZR (and all defensive metrics)? I've never had anything close to a satisfactory answer to this question (and I've read a lot about it).

I think we can all agree (well, we should, anyway) that defense, being based mainly on athleticism and/or instinct, *should be* less prone to fluctuation than either offense or pitching. The guy can run and catch a ball faster than another guy or he can't. It's a pretty simple premise.

Yet we have strange anomalies like this: Franklin Gutierrez.

2009: 1353.1 innings, 31.0 UZR, 32 DRS

2010: 1277.1 innings, 5.9 UZR, 0 DRS

2011: 763.0 innings, 16.0 UZR, 10 DRS

Why are these kinds of metrics so prone to insane fluctuations on defense? Franklin was the greatest centerfielder in baseball one season, completely average the next, and on his way back to greatest centerfielder the third (shortened) season.

Gutierrez isn't unique in this regard. I see swings like this all over baseball on a yearly basis. For example, Torii Hunter went from +10 to -10 from season to season multiple times in his career. We're talking about one player saving 10-20 runs in a season, sliding all the way down to being a defensive liability the next, and then reverting back to his previous glory in a third season.

Ian Kinsler is another one. Here are his DRS since 2008: -9, 22, 7, 18, 1, 11, 20. So he was an awful defender in 2008, became a rockstar in 2009, turned mediocre in 2010, was a rockstar again in 2011, and was completely "meh" in 2012 until climbing back up to rockstar status over two more years.

How does this make sense to anyone? It defies logic that a guy's defense could save one run every five games, decline to zero runs saved in 162 games the following season, and then go back to saving a run every 4-5 games the third season.

jimmer · March 20, 2015

http://www.fangraphs.com/blogs/uzr-2008-to-2009/

Brock Beauchamp · March 20, 2015

I've read that before. It's flawed because it's comparing offense to defense. Guys fluctuate quite a bit on offense. We know that. Pitchers exploit a flaw in his swing, he gets into a rut, changes his swing, whatever. There are a million reasons why a hitter fluctuates from year to year, including luck, which we can track. All you have to do is look at the rampant speculation about Danny Santana's BABIP regression to see how we can determine offensive "luck".

There should be virtually no fluctuation on defense unless the guy is injured or he ages. Defensive metrics should be more consistent than offensive metrics, not less.

And if one guy can tack +30 runs to the team defense in a season and be considered possibly inaccurate, the entire metric breaks down. That's one run every 5.5 games.
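For anyone who wants to check the runs-per-game conversions being thrown around in this thread, a quick sketch follows. It assumes a full 162-game season and spreads the runs evenly, which real seasons obviously don't do; the function name is just for illustration.

```python
GAMES_PER_SEASON = 162  # assumes the player's team plays a full schedule

def games_per_run(runs_saved, games=GAMES_PER_SEASON):
    """Average number of games needed to accumulate one run saved."""
    return games / runs_saved

for runs in (20, 30, 40):
    print(f"+{runs} runs/season is about one run every {games_per_run(runs):.1f} games")
```

By this arithmetic +30 runs works out to one run every 5.4 games, and +40 to roughly one every four.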

jimmer · March 20, 2015

'True talent should be constant from year to year (more or less), and whether the source of variation is chance reflected in one’s metric or the inaccuracy of the metric itself doesn’t particularly matter: both screw up one’s ability to detect true talent. Still, this is an important and here unremarked distinction, as outlier UZRs are sometimes used to indicate an extraordinary season (a la outlier wOBAs) and not just its unreliability as a metric (a la outlier RBI totals)'

jimmer · March 20, 2015

http://www.ussmariner.com/2009/09/10/gutierrezs-defense/

Brock Beauchamp · March 20, 2015

That, if anything, only validates my argument.

"Gutierrez was historically good with the glove last year. Don't expect that to happen again."

Why not? What could possibly happen over the course of two seasons to make a defender go from +30 runs to +0 runs?

Brock Beauchamp · March 20, 2015

'True talent should be constant from year to year (more or less), and whether the source of variation is chance reflected in one’s metric or the inaccuracy of the metric itself doesn’t particularly matter: both screw up one’s ability to detect true talent. Still, this is an important and here unremarked distinction, as outlier UZRs are sometimes used to indicate an extraordinary season (a la outlier wOBAs) and not just its unreliability as a metric (a la outlier RBI totals)'

But don't you see how this could horribly skew team defense? What if a team has two +20 DRS outlier players that don't deserve those numbers? That's roughly 20% of the team's fielding that could skew +40 runs per season, or one run every four games.

Yet we see people using these numbers as if they're gospel. When you're dealing with inaccuracies at the positional level over one season and there are only nine positions on a baseball diamond (of which only seven see consistent fielding opportunities), the potential for insane error margin isn't only high, it should be expected.

jimmer · March 20, 2015

Who says they didn't deserve those numbers for that season? How do we know that he truly wasn't just over the top great that season?

An extraordinary defensive season, and a corresponding great metric to quantify it, just tells us what the player did that year and only affects the team defensive ranking for that year. It doesn't mean he will always do that well. Just like wRC+ fluctuates for most batters and FIP for most pitchers. When the projections come out for the following season, seasons like that extraordinary one won't be expected.

And no one I know of takes any of this as error proof gospel, including the people who make the metrics. This strawman keeps popping up.

Just think it's better than any current alternative.

Brock Beauchamp · March 20, 2015

Who says they didn't deserve those numbers for that season? How do we know that he truly wasn't just over the top great that season?

An extraordinary defensive season, and a corresponding great metric to quantify it, just tells us what the player did that year and only affects the team defensive ranking for that year. When the projections come out for the following season, seasons like that extraordinary one won't be expected.

They didn't deserve those numbers because they don't make any sense. It's not as if Gutierrez went from running a 4.3 40 yard dash in 2009 to running a 5.4 40 yard dash in 2010. He was the same guy in 2010 that he was in 2009.

If these metrics were awarding points on specific plays that had significant impact, that might be different. For example, let's pretend that Gutierrez "got lucky" in 2009 and was able to save three home runs and two bases loaded doubles worth a combined ten runs. The following year, he was only able to do that twice. Those ten runs were actually stopped by Gutierrez. Those plays happened.

But the metrics don't track those plays. They track boolean results and award points to the play. There's no reason for that kind of fluctuation if the input data is accurate and even if there was a reason for that fluctuation, it puts into question the use of these metrics on a single-season basis, even for team play. As I mentioned earlier, when there are only seven significant positions to track on a baseball diamond defensively, 1-2 of those guys being waaaaaay outside the norm breaks the entire metric.

jimmer · March 20, 2015

He could have taken better routes, he could have been healthier, he could have had more chance, whatever. Lots of things change from season to season.

I get that it doesn't make sense to you and many. I can't do anything about that. Neither can the people whose job it is to quantify this stuff.

I just ask, what's a better alternative? Errors? Fielding %, your eyes, my eyes, Nick Nelson's eyes, Parker's eyes?

Outliers happen in all things like this. Pointing to outliers and saying the system is wrong because of them doesn't make much sense to me. Things don't have to be perfect in every instance for me to buy the overall work; I'll take the best way to evaluate over worse ways even if the best isn't perfect. I also understand that what makes sense to some doesn't make sense to others, and vice versa.

Brock Beauchamp · March 20, 2015

He could have taken better routes, he could have been healthier, he could have had more chance, whatever. Lots of things change from season to season.

Guys don't take the best routes in baseball one year, decline to average the next, and then resume being a great route-taker again in a third season.

And I looked it up. I could find no record of Gutierrez being injured in 2010 and considering that he stole nine more bases in 2010 while being caught fewer times, it's highly unlikely that he was suffering from a speed deficiency. For all intents and purposes, he was the same guy. He played in the same stadium behind (mostly) the same pitching staff.

jimmer · March 20, 2015

Guys don't take the best routes in baseball one year, decline to average the next, and then resume being a great route-taker again in a third season.

How do you know? Do we know how much 3-5 worse routes on difficult plays would affect the calculation?

Brock Beauchamp · March 20, 2015

How do you know?

Because it doesn't make any sense whatsoever. Guys don't forget how to take a route any more than I've forgotten how to ride a bicycle even though I haven't ridden one in several months.

jimmer · March 20, 2015

Because it doesn't make any sense whatsoever. Guys don't forget how to take a route any more than I've forgotten how to ride a bicycle even though I haven't ridden one in several months.

It doesn't make sense to you. Do we know how much 3-5 worse routes on difficult plays would affect the calculation?

Even the best have bad days once in a while. I've seen historically great fielders make three boneheaded plays in one day. It happens.

Brock Beauchamp · March 20, 2015

What you're talking about is an acceptable margin of error. A few runs a season.

Not 30.

jimmer · March 20, 2015

How does a pitcher have a great FIP one year and a horrible FIP the next year and then somewhere in the middle the next year? FIP is based on what he does, what he controls, so why the difference? Because just like hitting and defense, pitchers have good years and bad years. A two-run difference in FIP for a pitcher who pitches 30 games' worth of innings would be a 60-run difference. A 1.5 difference in FIP for the same guy would be 45 runs.
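The arithmetic behind those figures, as a small sketch. It assumes FIP is on a per-nine-innings scale like ERA and reads "30 games' worth of innings" as 30 × 9 = 270 innings; the function name is invented for the example.

```python
def fip_run_swing(fip_diff, innings):
    """Runs implied by a change in FIP over a given number of innings."""
    # FIP is expressed per 9 innings, so scale the difference by innings / 9.
    return fip_diff * innings / 9

innings = 30 * 9  # 30 complete games' worth of innings

for diff in (2.0, 1.5):
    print(f"FIP swing of {diff:.1f} over {innings} IP = {fip_run_swing(diff, innings):.0f} runs")
```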

Mike Sixel · March 20, 2015

Luck? Maybe one year he caught most everything he got close to, the next he didn't. Maybe one year his legs weren't as fresh because he was drinking more, or his wife had a kid or three, or maybe he just had a series of off days on the days he had chances that swung his score from good to bad?

Deltas in outcomes are driven by A LOT of variables, that doesn't mean how you measure those outcomes is wrong.

Some days, big companies do everything right. Other days, a system goes down, or someone goes on PTO, or someone gets jury duty, and everything goes wrong. Now, imagine that when there are only 9 people involved in producing the outcomes... any 1 person of 9 having a bad day can really throw things off.

TheLeviathan · March 20, 2015

And of course UZR isn't perfect. No one claimed that. But including UZR in evaluations (preferably with another defensive metric or two) is better than excluding it. And even using standard UZR ranges is better than guesstimating the effect of defense, I think.

Positioning is absolutely unaccounted for in some WAR calculations. Or, at least, calculated very imprecisely. (You can google some SABRheads lamenting this very notion if it feels better coming from someone else. Here's one )

And, to repeat your tone - no one is claiming we exclude it. It was a comment on something being better. Sport Trac will hopefully be better, more accurate, and less subjective. That's why it will be such a fascinating new tool.

Also, and I've said this before, something being the best current option doesn't make it "good". Thank god we didn't take that mentality with....hell...every damn scientific innovation in human history. Science and math are supposed to be about skepticism driving the development of better understanding. So be open to the idea that some of this might benefit from some fleshing out and airing of grievances - it will only make things better understood and better developed. Yeesh.

Otto von Ballpark · March 20, 2015

Yet we have strange anomalies like this: Franklin Gutierrez.

2009: 1353.1 innings, 31.0 UZR, 32 DRS
2010: 1277.1 innings, 5.9 UZR, 0 DRS
2011: 763.0 innings, 16.0 UZR, 10 DRS

Why are these kinds of metrics so prone to insane fluctuations on defense? Franklin was the greatest centerfielder in baseball one season, completely average the next, and on his way back to greatest centerfielder the third (shortened) season.

Gutierrez also went from 105 to 87 to 54 OPS+ those three seasons. And 3.3 to 2.2 to -0.3 oWAR (does not include fielding). Are those "insane fluctuations" too?

Otto von Ballpark · March 20, 2015

There should be virtually no fluctuation on defense unless the guy is injured or he ages. Defensive metrics should be more consistent than offensive metrics, not less.

I know this seems like it should be true, but it really isn't. Maybe he had a nagging injury that impeded his first step. Maybe he or his coaches positioned him sub-optimally, perhaps in response to a new player playing beside him. Maybe there was just some noise in the data plus one of the above effects and he was an outlier one season.

Defensive metrics aren't recording "true talent" any more than offensive metrics. They are trying to approximate results on the field, that add up to the big result (team win/loss).

jimmer · March 20, 2015

I know this seems like it should be true, but it really isn't. Maybe he had a nagging injury that impeded his first step. Maybe he or his coaches positioned him sub-optimally, perhaps in response to a new player playing beside him. Maybe there was just some noise in the data, and he was an outlier one season.

Defensive metrics aren't recording "true talent" any more than offensive metrics. They are trying to approximate results on the field, that add up to the big result (team win/loss).

True talent is usually where the mean is, especially the more data you have.
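A minimal sketch of that idea, using the three Gutierrez UZR seasons quoted earlier in the thread. It treats each season as an equally weighted, independent reading of the same underlying ability, which is an assumption (2011 was a partial season, and real projection systems weight by playing time and regress toward the league average).

```python
import math
import statistics

# Single-season UZR figures for Franklin Gutierrez as quoted in this thread.
uzr_by_season = {2009: 31.0, 2010: 5.9, 2011: 16.0}

def mean_and_standard_error(values):
    """Return the sample mean and the standard error of that mean."""
    values = list(values)
    mean = statistics.mean(values)
    # The standard error shrinks with the square root of the number of seasons.
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se

mean, se = mean_and_standard_error(uzr_by_season.values())
print(f"mean UZR = {mean:.1f}, standard error = {se:.1f}")
```

With only three seasons the standard error is still large relative to the mean, which is the usual argument for looking at multi-year samples instead of any one season.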

TheLeviathan · March 20, 2015

Maybe there was just some noise in the data plus one of the above effects and he was an outlier one season.

See, I think this is the part that some people are stuck on - the "noise in the data". I really hope we can get to the point that we can just look at the other effects and I hope the new technology will help with that.

Of course, then we have to hope the Twins will care to look at it too.

Otto von Ballpark · March 20, 2015

But don't you see how this could horribly skew team defense? What if a team has two +20 DRS outlier players that don't deserve those numbers? That's roughly 20% of the team's fielding that could skew +40 runs per season, or one run every four games.

Do you have an example of this "horrible skewing"? Where multiple players showed extreme single season outliers in UZR?

Otto von Ballpark · March 20, 2015

Positioning is absolutely unaccounted for in some WAR calculations.

It's accounted for in the result of UZR which is primarily what we've been discussing here. This whole debate came off a Fangraphs article quoting UZR.

Otto von Ballpark · March 20, 2015

And, to repeat your tone - no one is claiming we exclude it. It was a comment on something being better. Sport Trac will hopefully be better, more accurate, and less subjective. That's why it will be such a fascinating new tool.

You will have no disagreement from me that SportTrac could be better.

But I think your tone about "absurd on the face of it" suggests you don't actually know or care what UZR claims to estimate, or how it goes about doing so. It seems you've adopted a very adversarial posture to the concept, which isn't really helpful.

TheLeviathan · March 20, 2015

It's accounted for in the result of UZR which is primarily what we've been discussing here. This whole debate came off a Fangraphs article quoting UZR.

We've sort of floated back and forth between a couple different things. UZR has its own issues even if it accounts better for positioning. You keep saying you (and everyone else) don't deny those weaknesses, so I'm not sure why we have to constantly go back and forth on that. If we all accept the weaknesses, then we should all be in agreement this new technology has some exciting potential to minimize those effects and help us get better value measurements.

Impact of defense 44 members have voted

1. Over the course of a 162-game season, how much of a difference is there between the best team defense in MLB and the worst team defense in MLB?
