Statistical Analysis in Baseball Isn't New...At All

By now I'm sure many of you have seen this laughably awful article, "To Create a Winner, You Have to Find the Winners," by Mike Tully that appeared in the New York Times yesterday. In it, the author advances the patently absurd and anti-statistical argument (while, of course, still resorting to less useful statistics to make his point) that the value of players can be measured not by stats typically understood to represent, you know, players' value, but by their team's record in the games they play. He writes:

The consistent winner is not always the fastest runner or the hardest shot. It's the person who can read the subtleties of the game and knows when certain actions can make a difference. It's the person who involves and inspires teammates, creating emotion or a sense of mission. Coaches have a special name for such a player: starter.

Despite significant gains made by proponents of objective, statistical baseball analysis in recent years, it appears as if the "anti-SABR/anti-Moneyball" crowd has mounted something of a small pushback over the past few weeks.  

First, there was the news of The Beauty of Short Hops: How Chance and Circumstance Confound the Moneyball Approach to Baseball. This new book by Alan and Sheldon Hirsch purports to refute Billy Beane's methods for building a competitive team on a small budget using non-traditional statistics, and argues for rejecting statistical analysis in favor of appreciating the game's intrinsic "beauty." As Alan Hirsch writes of his ill-conceived book:

A few decades ago, Bill James was just a numbers-loving baseball nut whose annual "book", a mimeographed collection of his eclectic thoughts, attracted roughly 75 buyers. Today, the movement James pioneered (called "sabermetrics") is taking over baseball. Believers in sabermetrics permeate the media and, increasingly, the front offices of major league teams. The catalyst was Moneyball, Michael Lewis' best-selling book that describes how, in the early 2000s, the Oakland A's thrived because their General Manager, Billy Beane, employed insights culled from sabermetrics.


More importantly, the saber-obsession with numbers occludes a major aspect of baseball's beauty - its narrative richness and relentless capacity to surprise. Baseball, thank goodness, transcends and often defies quantitative analysis. Games are decided by bad hops and bad calls, broken bats, sun and wind, pigeons in the outfield, and fans who obstruct players, among other unforeseeable contingencies. That may seem obvious (apart from the pigeons), but not to the folks who increasingly run the show. Rather than celebrating baseball's delightfully spontaneous quality, sabermetricians deny it or rebel against it.

Needless to say, the book and this article have been widely lambasted by the pro-statistical analysis crowd and even just plain rational people. 

Then came retired New York Times baseball writer and current blogger Murray Chass's response to the response to the Hirsch brothers' book: 

The book slices and dices the thinking behind "Moneyball" in the first chapter, then moves on to sabermetrics. When I have written about the new-age statistics, I have mostly expressed disdain for them and have paid the price, receiving tons of e-mail from promoters of the new statistics.

I could have warned the brothers Hirsch, but they quickly learned how uncivil (in some cases that's putting it mildly) the WAR, VORP and UZR crowd can be. Two things I have learned that the brothers will learn: I am always wrong, and the sabermetric way is the only way that counts.

Chass misses a key point, however. Most reasonable proponents of sabermetrics aren't wholly wedded to the stats (like WAR, VORP, and UZR) themselves, but rather to an approach for analyzing baseball. Namely, they seek to find the metrics that are best able to express player value. Hence, there is constant discussion and debate in the SABR world about which stats are most useful for conveying specific aspects of player ability which, in turn, help us better understand their value. Tully again: 

Sometimes teams will find a player with an X-factor that goes way beyond talent. Leo Durocher once said that second baseman Eddie Stanky could not hit, field or run - all he could do was win. But Stanky was not helpless. He retired in 1953 with a .410 career on-base percentage, among the best in history.

On-base percentage was not valued in the 1940s and '50s, and that is the point. Certain players have always done things that keep them on the winning side, even if they are not always recognized.

In making his argument for "intangibles," Tully actually makes an argument for something that is quite tangible and quite useful in quantifying a player's value: OBP. But Tully, Chass, and the Hirsches are all wrong. Tully is wrong because OBP was indeed valued by some in the 1940s and '50s. Chass and the Hirsches are wrong because the approach employed by "the promoters of the new statistics" they lambaste is not an innovation of Bill James within the last few decades. No, it is much older than that.

In 1947, then-Brooklyn Dodgers General Manager Branch Rickey, one of the game's great innovators, hired a 30-year-old Canadian statistician named Allan Roth. Roth, baseball's first full-time statistician, and Rickey began to promote two new and controversial ideas: that on-base percentage was a more useful statistic than batting average, and that platoon splits exist and can be quantified. Both of these concepts are building blocks of the sabermetric discipline and have long since been absorbed into our baseball common sense.

In a 1954 Life Magazine article entitled "Goodby to Some Old Baseball Ideas," Rickey built on his and Roth's findings to offer an even more comprehensive formula for estimating a team's efficiency. He writes:

The formula, for I so designate it, is what mathematicians call a simple, additive equation:

The symbols, familiar to all baseball fans, are explained in the caption to the picture above. The part of the equation in the first parenthesis stands for a baseball team's offense. The part in the second parenthesis represents defense. The difference between the two - G, for the game or games - represents a team's efficiency.

Crude as the formula is by today's standards, it nevertheless represents a truly fascinating and groundbreaking attempt to do what Chass's "promoters of the new statistics" have been trying to do for decades: understand and quantify what it takes to win baseball games.
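
For reference, the equation Rickey describes is usually transcribed along the following lines. This is a sketch based on how the formula is commonly reproduced elsewhere, not a copy taken from the text quoted here, so the exact grouping of terms should be treated as an assumption and checked against the original article. The symbols are the standard box-score abbreviations: H for hits, BB for bases on balls, HP and HB for hit batsmen, AB for at-bats, TB for total bases, R for runs, ER for earned runs, SO for strikeouts, and F for a fielding term.

```latex
G = \left( \frac{H + BB + HP}{AB + BB + HP}
         + \frac{3\,(TB - H)}{4\,AB}
         + \frac{R}{H + BB + HP} \right)
  - \left( \frac{H}{AB}
         + \frac{BB + HB}{AB + BB + HB}
         + \frac{ER}{H + BB + HB}
         - \frac{SO}{8\,(AB + BB + HB)}
         - F \right)
```

Read this way, the first parenthesis adds up a team's own on-base rate, a measure of extra-base power, and runs scored per baserunner; the second applies comparable measures to what the team allows its opponents, with credit given for strikeouts and fielding.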


(H/t to @leokitty for unearthing this gem. Read the whole article here.)

Rickey's words at the beginning of the article sum up the requirements and goals of engaging in statistical analysis as well as anyone could today: 

Baseball people generally are allergic to new ideas. We are slow to change. For 51 years I have judged baseball by personal observation, by considered opinion and by accepted statistical methods. But recently I have come upon a device for measuring baseball which has compelled me to put different values on some of my oldest and most cherished theories. It reveals some new and startling truths about the nature of the game. It is a means of gauging with a high degree of accuracy important factors which contribute to winning and losing baseball games. It is most disconcerting and at the same time the most constructive thing to come into baseball in my memory.

The next time the SABR-bashers decry the emergence of "new-age stats," they might want to study up on the history of the game.

If only Chass and company weren't so damn allergic.