WAR ‘Misleading,’ Says Baseball Stats Pioneer Bill James

Brendan Miller, November 24, 2017

WAR, huh, good God, what is it good for? Absolutely nothin’! Okay, that might be taking it a little too far, but sabermetrics godfather Bill James recently said WAR is misleading. Finally, someone brought its misuse to center stage.

As a form of human behavior, baseball is inherently difficult to quantify. Capturing all the fine details of a player’s contributions is next to impossible. Despite the challenges, several influential statistics have been developed over the last decade — wOBA, FIP, etc. — which revolutionized how we watch the sport and how front offices orchestrate their operation.

The numbers revolution largely contributed to a Chicago Cubs World Series championship, but new statistics are commonly misused by people not working in front offices. This misuse, from my perspective, is the reason why “old-school” and “new-school” baseball fans can’t seem to agree on how to properly value players. This is especially true of Wins Above Replacement (WAR).

To what extent a player contributes to his team’s success, or lack thereof, is the most fundamental question baseball statistics seek to answer. WAR is one way by which statisticians have attempted to answer it, working backward by isolating a player’s value on a linear scale of runs, then converting runs to wins.
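To make the runs-to-wins step concrete, here is a minimal sketch in Python. It assumes the common rule of thumb of roughly 10 runs per win. Real WAR implementations derive that factor differently and vary in their details, and the function name and sample numbers below are hypothetical.

```python
# Hypothetical sketch: convert a player's runs above replacement into wins.
# RUNS_PER_WIN = 10.0 is an assumed rule of thumb, not a value taken from
# any specific WAR implementation.
RUNS_PER_WIN = 10.0


def wins_above_replacement(runs_above_replacement: float,
                           runs_per_win: float = RUNS_PER_WIN) -> float:
    """Divide runs above replacement by the runs-per-win factor."""
    if runs_per_win <= 0:
        raise ValueError("runs_per_win must be positive")
    return runs_above_replacement / runs_per_win


# Made-up example: a player credited with 45 runs above replacement.
print(wins_above_replacement(45.0))  # 4.5
```

Nothing in this conversion knows when or in what game situation the runs were produced. Every run counts the same, wherever it came from.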

WAR is a quick, insightful method used to determine how a player performed, but it gets messy. For example, Ben Zobrist’s go-ahead double in the 10th inning of the Cubs’ World Series-clinching Game 7 win is treated exactly the same way as Ryan Theriot’s bloop double down the right field line in 2008.

That disparity, we’ll call it the Zobrist/Theriot Conundrum, perfectly reflects James’s opinion. Even as one of the progenitors of modern baseball analytics, he thinks WAR is misused.

We come, then, to the present moment, at which some of my friends and colleagues wish to argue that Aaron Judge is basically even with Jose Altuve, and might reasonably have been the Most Valuable Player. It’s nonsense. Aaron Judge was nowhere near as valuable as Jose Altuve. Why? Because he didn’t do nearly as much to win games for his team as Altuve did. It is NOT close. The belief that it is close is fueled by bad statistical analysis—not as bad as the 1974 statistical analysis, I grant, but flawed nonetheless. It is based essentially on a misleading statistic, which is WAR. Baseball-Reference WAR shows the little guy at 8.3, and the big guy at 8.1. But in reality, they are nowhere near that close. I am not saying that WAR is a bad statistic or a useless statistic, but it is not a perfect statistic, and in this particular case it is just dead wrong. It is dead wrong because the creators of that statistic have severed the connection between performance statistics and wins, thus undermining their analysis.

This is like Picasso coming back to life and furiously refuting society’s interpretation of his paintings. James’s viewpoint sent shockwaves through the baseball community and elicited a response from fellow stats godfather Tom Tango, who actually agreed with what James was saying.

So WAR isn't intended to account for the literal wins that season but for the wins if the offense and defense operated independent of each other. Which works to handle most questions but not necessarily MVP.

— Tangotiger 🍁 (@tangotiger) November 20, 2017

And then Tango put another cherry on top of the sundae that I was already enjoying. He said that he is a big proponent of RE24, a context-dependent metric that credits a hitter for how much each plate appearance changes his team’s run expectancy, which indirectly captures things like batting average with runners in scoring position.

And I'm a big proponent of RE24, especially as it relates to runners on 1B or 3B and less than 2 outs, when "baseball" decisions are being made by pitcher, fielders, batter.

— Tangotiger 🍁 (@tangotiger) November 21, 2017
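For readers unfamiliar with RE24, here is a rough sketch in Python of how a single play is scored: the change in run expectancy between the base-out state before and after the play, plus any runs that scored. The run expectancy numbers below are placeholders invented for illustration, not a published table, and the function and variable names are hypothetical.

```python
# Illustrative placeholders only: NOT a published run expectancy table.
# Keys are (bases occupied, outs); values are the expected runs scored
# from that state through the end of the inning.
RUN_EXPECTANCY = {
    ("empty", 1): 0.25,
    ("1B", 1): 0.50,
    ("2B", 1): 0.65,
    ("3B", 1): 0.95,
}


def re24_for_play(start, end, runs_scored):
    """RE24 credit for one play: change in run expectancy plus runs scored."""
    return RUN_EXPECTANCY[end] - RUN_EXPECTANCY[start] + runs_scored


# A double with the bases empty and one out: nobody scores.
empty_double = re24_for_play(("empty", 1), ("2B", 1), 0)

# A double with a runner on third and one out: the runner scores.
rbi_double = re24_for_play(("3B", 1), ("2B", 1), 1)

print(round(empty_double, 2), round(rbi_double, 2))
```

Under these made-up numbers the run-scoring double earns more credit than the bases-empty double, even though both are the same event in a context-neutral stat.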

James and Tango are saying exactly what both old-school and new-school proponents preach. In one sense, both men advocate for context-neutral statistics in order to isolate individual talent, just like new-school fans. But like old-school fans, they recognize the context-dependent nature of the sport and emphasize that it is an aspect we can’t ignore.

Which brings us full circle to why Theo Epstein decided to let John Mallee go. Under the now-former batting coach in 2017, Cubs batters had the second-best team wRC+ behind only the 104-win Los Angeles Dodgers. But Chili Davis was brought in because the front office wasn’t satisfied with situational hitting.

“[Davis] is excellent at teaching a two-strike approach and teaching situational hitting,” Epstein said of the hire. “He’s really good at helping to get hitters to understand when an elite pitcher’s on his game, you have to sometimes take what he gives you, and have an adjustable swing, an adjustable approach for those situations.”

Bill James and Tom Tango’s statistics changed the way we think about the game. Admittedly, I’m not sure I would be as big a baseball fan as I am were it not for their work. So when the two of them talk about how the fundamental components of their own stats are being misused, we should listen carefully. Old-school fans should listen. New-school fans should listen.

And maybe what we’ll all hear is that — stay with me here, because this is getting deep — it’s important to take into account more than just one viewpoint and to examine measurements of the game in their proper context.
