Re-opening the WAR debate

Jham · November 30, 2017

Any issue with WAR is one of a few things. One, it was never designed as a precise measurement, yet people use it that way. Two, WAR uses many measurements that are imprecise at best, such as base running and fielding. Three, beyond being an imprecise measure, some people fail to understand what it measures. What they think it is and what it is are different. The other distinct possibility in many cases is that some people just like to argue.

Are we talking bWAR or fWAR? Or is that the point?

People forget that the variables we can measure are vastly outnumbered by the ones we can't. Try to figure out at your own job why you are more productive one day versus another. One week versus another. Then ask whether those things that affect your performance (a poor night's sleep, a sick child, an especially entertaining TD game thread) could be measured if you were a baseball player. We hope that those other variables balance out over a season, but again the things we can measure offer only a glimpse.

Statisticians use regression analysis to estimate how much variance within one statistic can be explained by a certain measured variable. Presumably they've run regressions on WAR factors in order to apply proper weight. Regardless, if we were able to get an actual measurement of what contributes to actual wins and put it in a pie chart, WAR factors would take up maybe 25% of the pie, with things like luck, sequencing, umpires, and non-WAR factors dominating the rest. I'm no stats expert. I took one course in business stats and one in research stats. Hated them. Wish they would have had baseball stats back then.

jimmer · November 30, 2017

I will never understand the angst over the fact that different sites put different things into their WAR calculations which makes them come up with different numbers.

PseudoSABR · November 30, 2017

I will never understand the angst over the fact that different sites put different things into their WAR calculations which makes them come up with different numbers.

Because it points to the inherent subjectivity in an endeavor which, well, should be objective.

jimmer · November 30, 2017

Because it points to the inherent subjectivity in an endeavor which, well, should be objective.

it points to the fact that people value things differently. That people look at things differently.

Jham · November 30, 2017

Update: According to Wikipedia, a study had rWAR explaining 83% of the variance between expected wins and actual wins (retroactively; 5 random teams from 5 years, so a SSS of large sample sizes). As a predictor, it accounted for 35% of the variance between expected and actual wins.

https://en.wikipedia.org/wiki/Wins_Above_Replacement
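For anyone wondering what "explaining X% of the variance" means in practice, here is a minimal sketch of the calculation for a one-variable linear regression. The five team totals below are made up purely for illustration; they are not the study's data, and the function name `r_squared` is just a label chosen for this example.

```python
# Hypothetical data: summed team WAR and actual wins for five made-up teams.
# These numbers are invented for illustration and come from no real study.
team_war = [25.0, 32.5, 41.0, 48.5, 55.0]
actual_wins = [70, 81, 86, 97, 101]


def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x.

    For a simple one-variable linear regression this equals the squared
    Pearson correlation between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy ** 2) / (sxx * syy)


print(f"Share of variance explained: {r_squared(team_war, actual_wins):.2f}")
```

A value near 1 means team WAR lines up almost perfectly with wins in that sample; a value near 0 means it tells you very little. With only a handful of teams the number can swing a lot, which is the small-sample caveat above.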

PseudoSABR · November 30, 2017

it points to the fact that people value things differently. That people look at things differently.

Yes. But when those different values are aggregated into one metric, the metric may tell us more about the values of the creator than the thing that it measures.

spinowner · December 1, 2017

it points to the fact that people value things differently. That people look at things differently.

This is the problem. The goal should be The Truth, not one sabermetrician's opinion.

jimmer · December 1, 2017

This is the problem. The goal should be The Truth, not one sabermetrician's opinion.

Yeah, it's not a problem. Different opinions should be explored and shown.

Some people are completely sure a pitcher should be judged a lot on ERA (or RAA, which is closer to the ERA way) some say FIP instead. Who is going to say which one is the truth? Who gets to decide that? Hence bWAR and fWAR for pitchers. I, for one, don't care at all for ERA. I almost never look at it when we're talking about how well a pitcher did in any given season.

So truth gets a bit shaky.

USAFChief · December 1, 2017

To be clear, I agree with you on this point. I was just responding on behalf of Chief's point, which I don't necessarily agree with. In isolation, it shouldn't matter who gets a hit, but for aggregate value, it is definitely more valuable from a tougher to fill defensive position. That fact being baked into the replacement player doesn't bother me.

Sorry Jim, but this is where you and others lose me.

The entire idea behind WAR is that runs=wins. Contribute X number of runs on offense, you have contributed Y number of wins. And they've got entire databases of, uh...data that they use to say a single is worth a certain number of runs, a double, a stolen base, etc.

Same thing on defense. Deny the other team Z number of runs, well, that equals Y number of wins. Forget the fact that the measurements used to determine Z are problematic at best, that's the theory. It's math...what matters is how many runs.

And then they throw the whole thing out and say if a first baseman contributes 30 runs, and saves 10, he's delivered fewer wins than a shortstop who contributes 30 runs and saves 10. Quite a bit fewer, in fact.

So it's not math?? Both of those players accounted for 40 runs.

Mr. Brooks · December 1, 2017

Sorry Jim, but this is where you and others lose me.

The entire idea behind WAR is that runs=wins. Contribute X number of runs on offense, you have contributed Y number of wins. And they've got entire databases of, uh...data that they use to say a single is worth a certain number of runs, a double, a stolen base, etc.

Same thing on defense. Deny the other team Z number of runs, well, that equals Y number of wins. Forget the fact that the measurements used to determine Z are problematic at best, that's the theory. It's math...what matters is how many runs.

And then they throw the whole thing out and say if a first baseman contributes 30 runs, and saves 10, he's delivered fewer wins than a shortstop who contributes 30 runs and saves 10. Quite a bit fewer, in fact.

So it's not math?? Both of those players accounted for 40 runs.

Because it's above replacement, not in a vacuum. A replacement level SS is going to give you a lot less offensive production than a replacement level 1B.

Consider a guy like Kennys Vargas. Would he still be a fringe player if he was a legit SS, or catcher, rather than 1B/DH?
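To put rough numbers on the first baseman versus shortstop example, here is a minimal sketch of how a WAR-style calculation can give two players with the same 40 runs different win totals. Everything in it is an assumption for illustration: the 10 runs per win conversion is a common rule of thumb, and the positional adjustments and replacement-level credit are hypothetical values, not figures from this thread and not guaranteed to match any site's actual formula.

```python
# All constants below are illustrative assumptions, not any site's real values.
RUNS_PER_WIN = 10.0          # rule-of-thumb conversion from runs to wins
REPLACEMENT_RUNS = 20.0      # hypothetical full-season credit over replacement

# Hypothetical positional adjustments in runs per season: harder positions
# get a bonus, easier ones a penalty.
POSITIONAL_ADJUSTMENT = {"SS": 7.5, "1B": -12.5}


def war_sketch(batting_runs, fielding_runs, position):
    """Toy WAR: batting + fielding + positional + replacement runs, in wins."""
    total_runs = (
        batting_runs
        + fielding_runs
        + POSITIONAL_ADJUSTMENT[position]
        + REPLACEMENT_RUNS
    )
    return total_runs / RUNS_PER_WIN


# Both players contribute 30 runs with the bat and save 10 with the glove.
shortstop = war_sketch(30.0, 10.0, "SS")
first_baseman = war_sketch(30.0, 10.0, "1B")
print(f"SS: {shortstop:.2f} wins, 1B: {first_baseman:.2f} wins")
```

In this sketch the batting and fielding runs are identical for both players. The whole gap comes from the positional term, which is exactly the piece being argued about here.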

drjim · December 1, 2017

Because it's above replacement, not in a vacuum. A replacement level SS is going to give you a lot less offensive production than a replacement level 1B.
Consider a guy like Kennys Vargas. Would he still be a fringe player if he was a legit SS, or catcher, rather than 1B/DH?

This is pretty much my thought too.

Another way to look at it, is that the SS in the hypothetical can almost certainly move to 1b and be fine defensively, while the 1b almost certainly can't move to SS and be fine. So even though they put up the same offensive numbers, the SS is clearly more valuable. That is captured in WAR.

spinowner · December 2, 2017

This is pretty much my thought too.

Another way to look at it, is that the SS in the hypothetical can almost certainly move to 1b and be fine defensively, while the 1b almost certainly can't move to SS and be fine. So even though they put up the same offensive numbers, the SS is clearly more valuable. That is captured in WAR.

A player's fielding position should carry absolutely no weight when you are evaluating batting. None. A SS contributes more as a fielder than a 1B but it should be accounted for ONLY in terms of fielding, not how he performs as a batter.

Mr. Brooks · December 2, 2017

A player's fielding position should carry absolutely no weight when you are evaluating batting. None. A SS contributes more as a fielder than a 1B but it should be accounted for ONLY in terms of fielding, not how he performs as a batter.

But a replacement level SS or C is going to contribute less offense than a replacement level 1B or DH. So, it kinda has to value their offense differently. If it didn't it wouldn't be wins above replacement, it would just be wins contributed. Which is fine, someone can create that if they want, but we are discussing WAR.

spinowner · December 2, 2017

Because it's above replacement, not in a vacuum. A replacement level SS is going to give you a lot less offensive production than a replacement level 1B.
Consider a guy like Kennys Vargas. Would he still be a fringe player if he was a legit SS, or catcher, rather than 1B/DH?

This illustrates part of my problem with WAR. An ideal measure of a player's value should be absolute, not relative. There should be no arbitrarily-defined hypothetical replacement player that is different for each position.

spinowner · December 2, 2017

But a replacement level SS or C is going to contribute less offense than a replacement level 1B or DH. So, it kinda has to value their offense differently. If it didn't it wouldn't be wins above replacement, it would just be wins contributed. Which is fine, someone can create that if they want, but we are discussing WAR.

See my reply to your other comment. And I'll say that this is about WAR because it illustrates one reason why I think WAR is a highly flawed measure of a player's value.

Mr. Brooks · December 2, 2017

This illustrates part of my problem with WAR. An ideal measure of a player's value should be absolute, not relative. There should be no arbitrarily-defined hypothetical replacement player that is different for each position.

I feel the exact opposite. Nothing in life is absolute. Everything is, and should be, relative.

In another thread, there is discussion about how many ballots it will take Jim Thome to make it into the HOF.

If he were even a below average fielding SS or catcher, we'd be debating whether he'd be unanimous, and whether he might be the greatest baseball player of all time. And rightly so.

It's not hypothetical that a replacement level SS or C is going to contribute less offense than a replacement level 1B or DH, it's a fact, and common sense.

drjim · December 2, 2017

This illustrates part of my problem with WAR. An ideal measure of a player's value should be absolute, not relative. There should be no arbitrarily-defined hypothetical replacement player that is different for each position.

That's a fair position. You would be very happy with wRC+. Best all encompassing stat that measures strictly offense with no concern about what position the hitter plays.

jimmer · December 2, 2017

You would be very happy with wRC+. Best all encompassing stat that measures strictly offense with no concern about what position the hitter plays.

yeah, I mentioned that earlier.

But, I still can't figure out why it's so controversial when a stat shows that when a shortstop hits the same way a 1B does that the shortstop is more valuable than the 1B.

drjim · December 2, 2017

yeah, I mentioned that earlier.

But, I still can't figure out why it's so controversial when a stat shows that when a shortstop hits the same way a 1B does that the shortstop is more valuable than the 1B.

I have my own problems with WAR, but this is a point in its favor!

spinowner · December 3, 2017

yeah, I mentioned that earlier.

But, I still can't figure out why it's so controversial when a stat shows that when a shortstop hits the same way a 1B does that the shortstop is more valuable than the 1B.

Because a SS should not receive more credit for batting performance than a 1B just because he plays SS. The position a player plays on the field is completely immaterial when his team is batting. A SS is more valuable than a 1B who has equal batting stats but it's because of fielding.

spinowner · December 3, 2017

This illustrates part of my problem with WAR. An ideal measure of a player's value should be absolute, not relative. There should be no arbitrarily-defined hypothetical replacement player that is different for each position.

Quoting oneself is bad form but that's just the kind of guy I am.

I just wanted to add that what predicated this whole thread was the matter of how to determine the AL MVP. Most. Valuable. Player. WAR attempts to determine value relative to a replacement player, which means a player who plays the same position. So even if WAR were accurate, that means comparing a second baseman to a right fielder using WAR is literally analogous to comparing apples to oranges. There certainly can be discussion about how to define value, but it's clear to me that this award is intended to go to the player with the highest absolute value, not the player with the highest relative value.

drjim · December 3, 2017

Because a SS should not receive more credit for batting performance than a 1B just because he plays SS. The position a player plays on the field is completely immaterial when his team is batting. A SS is more valuable than a 1B who has equal batting stats but it's because of fielding.

The SS is more valuable because it is harder to find a replacement for a SS than a 1b. It is a counting stat relative to the positional replacement.

spinowner · December 3, 2017

The SS is more valuable because it is harder to find a replacement for a SS than a 1b. It is a counting stat relative to the positional replacement.

See my post #51, which was posted while you were writing post #52.

prouster · December 3, 2017

I'm not a sabermetrician. I wish I had the time and skill to be one but I don't. That said, I've never been on board with WAR.
Fielding is the most difficult of the four aspects of baseball (the other three being pitching, batting, and base running) to measure. Did an outfielder miss catching a fly ball by one inch because he ran too slowly, because he reacted too slowly, because he misread the ball off the bat, because he took a bad route, because the sun (or artificial lights) affected his vision, because the coach positioned him improperly, because he positioned himself improperly, because his glove was too small, because the ball traveled more slowly due to cold weather, because the ball traveled more rapidly due to hot weather, because he was not 100% healthy, because another nearby fielder spooked him, or some combination of those and myriad other factors? Or was it a fly ball that was played as perfectly as humanly possible but simply wasn't going to be caught? I just think it's impossible to accurately and precisely quantify fielding.
The lack of accuracy and lack of precision are compounded when trying to extrapolate fielding metrics into runs saved. And the lack of accuracy and lack of precision are compounded further when trying to extrapolate runs saved into wins and losses.
The metrics attempting to quantify the other three aspects are also flawed, less so than fielding, although it depends on which metrics are used. However, there is still the same loss of accuracy and precision when trying to extrapolate those metrics into runs, and from there into wins and losses.
So, WAR then. The way I understand it, there are several different formulas, none of which is totally objective, all of which use various imperfect metrics from the four aspects of the game (each of which is affected to some degree by the other three). And it's supposed to determine how many wins a player generates relative to some vague replacement player. While it's a noble goal, I think WAR as it is currently configured is too inaccurate and too imprecise to have much value.

I generally agree. To me, the big issue with sabermetrics is that we often treat it as a set of objective accounts of What Happened, when it's actually a way of narrating events. People commonly talk about narrative and statistics as if they're different things, when they are really just different modes of description. Sabermetrics, to me, suddenly works a lot better after acknowledging that it's slippery and subjective and indeed serves a narrative function.

prouster · December 3, 2017

I haven't read every post in this thread, so apologies if this has already come up.

To me, the trouble with WAR exists in its basic concept: it's an attempt to isolate individual performances in a team sport, wherein individual performances are dependent on team performance and other externalities. I don't think it's possible to separate the individual from his context, because his actions *make* some of the context in which he exists. This is why the problem of including some lucky elements while excluding others comes up in the first place--the outcome of one event is always related to the outcomes of other events.

I suggest revising how we think of individual performance. A player is always responsible for some things, but that player and his actions are nodes in a larger network of people and events. WAR as it currently exists doesn't really allow for that version of reality.

drjim · December 3, 2017

Quoting oneself is bad form but that's just the kind of guy I am.
I just wanted to add that what predicated this whole thread was the matter of how to determine the AL MVP. Most. Valuable. Player. WAR attempts to determine value relative to a replacement player, which means a player who plays the same position. So even if WAR were accurate, that means comparing a second baseman to a right fielder using WAR is literally analogous to comparing apples to oranges. There certainly can be discussion about how to define value, but it's clear to me that this award is intended to go to the player with the highest absolute value, not the player with the highest relative value.

I think it is a big mistake to, for instance, give the MVP strictly to the guy with the highest WAR, or to rank WAR in order and say that is the clear order of best seasons.

I'm not even a huge fan of WAR, but it is a quick and dirty look at what is probably directionally acceptable, and is probably somewhat useful in putting players in comparable tiers, while comparing players across positions.

But you do bring up an important point, it is hard to compare players across positions. There isn't a great way to do it. (One of my biggest problems with WAR is the defensive bonus they give to players down the defensive spectrum, such as how they valued Alex Gordon in previous years. He wasn't good enough to stick at 3B, but suddenly he is an elite defender in LF that gets multiple wins of value, even though any CF or middle infielder could likely play that position just as well).

spinowner · December 3, 2017

I think it is a big mistake to, for instance, give the MVP strictly to the guy with the highest WAR, or to rank WAR in order and say that is the clear order of best seasons.

I'm not even a huge fan of WAR, but it is a quick and dirty look at what is probably directionally acceptable, and is probably somewhat useful in putting players in comparable tiers, while comparing players across positions.

But you do bring up an important point, it is hard to compare players across positions. There isn't a great way to do it. (One of my biggest problems with WAR is the defensive bonus they give to players down the defensive spectrum, such as how they valued Alex Gordon in previous years. He wasn't good enough to stick at 3B, but suddenly he is an elite defender in LF that gets multiple wins of value, even though any CF or middle infielder could likely play that position just as well).

Good example in Gordon. In little league ball your worst fielder usually plays right field, but in major league ball it's left field. That's why Delmon Young and Josh Willingham played there.

Kelly Vance · December 3, 2017

I have always thought WAR was sort of silly as a statistic. For me Altuve was more valuable than Judge because, when I watched the games, Altuve was always involved in the big inning or crucial rallies. Judge hit homers, but swung big and that is hit or miss. Altuve slashed a line drive in the gap when it was a good idea to slash a liner in the gap. He was like the football defender that is always around the ball. It is kind of like Supreme Court Justice Potter Stewart once said, "I know it when I see it." I am old school, but for me, the eyeball test is still the best way of telling who is more valuable.

jimmer · December 3, 2017

I'm always impressed when people can tell who the MVP should be based on the eyeball test. I think about how many games of each team I'd have to watch to get a good enough sampling of how everybody plays and think there's no way I could find the time to watch THAT much baseball. It's impressive some people can do that.

USAFChief · December 3, 2017

This is pretty much my thought too.

Another way to look at it, is that the SS in the hypothetical can almost certainly move to 1b and be fine defensively, while the 1b almost certainly can't move to SS and be fine. So even though they put up the same offensive numbers, the SS is clearly more valuable. That is captured in WAR.

We’re measuring value now? I thought we were measuring wins.
