You have nothing to fear but fear itself

Baseball fans my age or older tend to quake when you tell them such things as that the traditional batting average is an incomplete statistic. It still has isolated, situational value to a certain extent, such as how a batter does with men on base or in various leverage situations. But as a cumulative view it’s really a false picture.

Why? Think of its basic formula: it divides hits by at-bats. That’s all. It doesn’t account for the actual hits and their actual worth. “That batting average turns a blind eye to so many outcomes,” writes MLB.com columnist Anthony Castrovince, “is not even the greatest flaw in its role as a batter barometer. No, the greatest flaw is the implied insistence that all hits are created equal.”
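
To put numbers on that flaw, here is a minimal Python sketch. The two hitters and their totals are invented for illustration, and `HitLine`, `batting_average`, and `slugging` are hypothetical names, not anything taken from Castrovince's book. It sets batting average (hits divided by at-bats) beside slugging percentage (total bases divided by at-bats) to show how the same average can hide very different production.

```python
from dataclasses import dataclass


@dataclass
class HitLine:
    """A hypothetical season line; every number fed in below is made up."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int

    @property
    def hits(self) -> int:
        # Batting average sees only this total, not what kind of hits they were.
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def total_bases(self) -> int:
        # Each hit is weighted by the number of bases it was worth.
        return (self.singles + 2 * self.doubles
                + 3 * self.triples + 4 * self.home_runs)


def batting_average(line: HitLine) -> float:
    return line.hits / line.at_bats


def slugging(line: HitLine) -> float:
    return line.total_bases / line.at_bats


# Two invented hitters with identical hits and at-bats.
slap_hitter = HitLine(at_bats=500, singles=150, doubles=0, triples=0, home_runs=0)
slugger = HitLine(at_bats=500, singles=90, doubles=30, triples=5, home_runs=25)

for name, line in (("slap hitter", slap_hitter), ("slugger", slugger)):
    print(f"{name}: AVG {batting_average(line):.3f}, SLG {slugging(line):.3f}")
```

Both invented hitters bat .300, yet one slugs .300 and the other .530. Batting average alone can't tell them apart.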

For better or worse, I’ve phrased it a little more snarkily in past writings: if you really think a single’s as valuable as a double, a double’s as valuable as a triple, a triple’s as valuable as a home run, a single’s as valuable as all the above, you shouldn’t hang a shingle as a baseball observer or analyst any time soon. Castrovince discusses that and numerous other statistical advances, depths, and challenges, in A Fan’s Guide to Baseball Analytics, in language that’s snarky where appropriate but sensitive, smart, and nuanced all at once.

His book should be required reading for any baseball fan who thinks statistics, the lifeblood of the thinking person’s sport, should conform to prejudice instead of offering the bigger and deeper picture. Sacred cows be damned to steak.

One of the issues with too many articles and books about baseball analytics is that they can be and too often are, well, too analytical. To the average baseball fan they’re the alphabet soup and you can’t even see the soup. Castrovince gives the alphabet—well, the numbers—the places they deserve without letting the soup disappear. He writes a lot more entertainingly about his statistical beliefs than I could hope to write, and he makes plain that he has no intention of burying baseball fans or dismissing them as dumb.

“I’m here to build you up, not break you down,” he writes in his introduction. “While there is plenty of math in this book . . . I’m presenting it as casually as I can. Plus, when things get super-duper complicated, I’ll give you a brief lay of the land instead of wandering too deep into the woods and weeds.”

He explains the newer, deeper numbers in language plain enough that even Yapper McFlapper in the nosebleed seats, who only thinks that he could out-play million-dollar Swinger Swofford or out-think manager Brainy Boner with one arm in a cast and half his cerebrum in formaldehyde, can get it. Yapper might be pleasantly surprised and entertained at once over how he doesn’t have to matriculate back to college to get it.

Castrovince knows it should be child’s play to debunk the traditional batting average and a passel of other old stats that have more flaws than a glass onion. “Stats such as batting average, RBIs, errors, wins, and saves are all baseball backbones,” he writes in the on-deck circle. “. . . But not acknowledging their faults and trusting them as the be-all and end-all is a mistake.”

Then, he checks in at the plate. “There have been .400 hitters who weren’t even the most productive players in their league in a given season, and there have been .300 hitters whose performance, at large, did not rate as positively as players whose averages had a ‘2’ right after the decimal.”

I can make that just as simple. Let me give you two players. Both had major league careers lasting two decades-plus. Their lifetime batting averages are within a single point of each other. Knowing going in that the old-schooler is going to say the wrong player was more valuable at the plate, here are the batting averages:

By one batting average point, Yapper McFlapper pronounces Player A the better hitter. Let’s give Yapper a cookie and admit Player A has more lifetime hits than Player B, and Player B has over 3,000 of those. Time to go a little deeper. Player B has a higher on-base percentage, slugging percentage, and OPS, not to mention that Player B also walked more unintentionally and intentionally and hit twelve more sacrifice flies—all in almost four thousand fewer trips to the plate.

If Yapper McFlapper sees from that that Player A wasn’t half the real presence at the plate that Player B was, why can’t anyone else? And I didn’t even think about measuring them according to my own Real Batting Average (RBA) measure: total bases + walks + intentional walks + sacrifice flies + hit by pitches, divided by total plate appearances. Oh, what the hell:

If Yapper looks at that and still clings to the prejudice that a .303 lifetime traditional batting average makes Player A the slightly better player than Player B, then Yapper’s got some splainin’ to do. That’s without showing Yapper Player A’s three “batting titles” against Player B’s one, by the way.

Castrovince lists the ten ways any trip to the plate ends: hit, walk, out, sacrifice bunt, sacrifice fly, hit by a pitch, reaching base on a fielder’s choice, reaching base on an error, a dropped third strike on which you reach first safely, and defensive interference. You know that five of them don’t count as “at-bats.” (If you don’t . . . )

The so-called “batting title” goes to the hitter from each league who has the highest batting average, yet you need 502 plate appearances . . . to even qualify for the title. So the five outcomes that, for whatever reason, don’t matter when tabulating batting average suddenly matter when assessing who has the best batting average.

It’s enough to drive you batty.

(Why didn’t I include sacrifice bunts in my RBA metric? Sorry, but those are outs made deliberately. You shouldn’t get credit when you make an out on purpose. But you should get credit for the sacrifice fly because it sends home a run and you weren’t trying to hit one right into Leather Sackorocks’s glove.)
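
For anyone who wants to run the RBA arithmetic at home, here is a minimal Python sketch of the formula exactly as worded earlier in this review. The function name and the sample numbers are hypothetical, not any real player's line. Walks and intentional walks are added as two separate terms because that is how the formula is stated, and sacrifice bunts are deliberately left out of the numerator.

```python
def real_batting_average(total_bases: int,
                         walks: int,
                         intentional_walks: int,
                         sacrifice_flies: int,
                         hit_by_pitches: int,
                         plate_appearances: int) -> float:
    """RBA as worded in the review: (TB + BB + IBB + SF + HBP) / PA.

    Assumption: each argument is a plain counting total for the span
    being measured (a season or a career).
    """
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    numerator = (total_bases + walks + intentional_walks
                 + sacrifice_flies + hit_by_pitches)
    return numerator / plate_appearances


# Invented season line, for illustration only.
sample_rba = real_batting_average(
    total_bases=300,
    walks=80,
    intentional_walks=10,
    sacrifice_flies=5,
    hit_by_pitches=5,
    plate_appearances=650,
)
print(f"RBA: {sample_rba:.3f}")
```

Dividing by plate appearances rather than at-bats is the point of the exercise: every trip to the plate counts in the denominator, so nothing the batter did gets waved away.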

That’s not the only thing that drives Castrovince batty. Like me, he thinks runs batted in don’t say as much as Yapper McFlapper and Frostie Fingerflipper think they say about a player’s run productivity and clutch ability. Peel yourselves from the ceiling, Yapper and Frostie.

You can’t drive in the runs if nobody else reaches base ahead of you, unless you hit one out. You can’t look at the RBI total alone and conclude a player’s clutch. Good luck, by the way, scoring runs without a little help from your friends—unless you can steal every base including home every time you reach first. (Well, maybe Rickey Henderson could have, if he wanted to . . . )

Some people accuse the Angels’ all-universe Mike Trout of being a little less than clutch because he isn’t knocking 100+ runs in every full season he plays. “The only thing Mike Trout lacked,” Castrovince writes, with the virtue of truth on his side, “was . . . Mike Trout batting in front of him.” Trout at this writing has a .418 lifetime on-base percentage. Would indeed that he’d had a couple of Mike Trouts batting in front of him.

Here’s one instance where the old batting average does make sense: hitting with men on base. Trout through this writing has hit .306 with men on base and .318 with runners in scoring position. His OPS for the former: 1.082. For the latter: 1.013. (Oh, the futility of the “RISP” stat, because it counts guys on second base or better only. Technically, you’re in scoring position the minute you reach base at all, even just first. If you’re a home run hitter, you’re in scoring position the moment you step into the batter’s box.)

Aside from OBP, SLG, and OPS, Castrovince believes the best ways to measure a batter’s value are runs created, isolated power, weighted on-base average (wOBA), weighted runs created, OPS+, and baserunning. He’ll give you the mathematical formulae and conjugate them in language so simple a schoolboy or schoolgirl can comprehend them a lot more readily than they might algebra or calculus. He’ll tell you why they really matter.

Runs created, whose formula factors the same things my RBA does with a little more complexity: “the central job of a hitter is to help his team score runs.” Isolated power: “batting average does not tell you how often a player’s hits go for extra bases, and slugging percentage does not discriminate between singles and extra-base hits.”

Weighted on-base average: “not all methods of reaching base are equal. OBP goes only so far in measuring offensive value, whereas wOBA assigns the proper value to each event in terms of its impact on scoring runs.” Weighted runs created: “while runs created and OPS were both huge steps forward from more antiquated offensive metrics, neither one is adjusted for the context of a given season or a player’s home park.”

Baserunning (BsR): “with stolen base attempts on a continual decline—and the art of baserunning extending beyond stolen bases—it’s better to look at a context-driven and all-encompassing stat.” Sub-stat: ultimate baserunning, crediting a runner “for advancement on the bases relative to the frequency with which the league average runner advances in the same situation.”
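
Two of the measures above, runs created and isolated power, have simple textbook forms that fit in a few lines of Python. This sketch assumes the commonly cited basic versions: runs created as (hits + walks) × total bases ÷ (at-bats + walks), and isolated power as slugging percentage minus batting average. The book may well present fuller versions. The function names and the sample line are hypothetical. The weighted stats (wOBA, wRC) are left out because their weights change from season to season.

```python
def basic_runs_created(hits: int, walks: int, total_bases: int, at_bats: int) -> float:
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("at_bats + walks must be positive")
    return (hits + walks) * total_bases / opportunities


def isolated_power(hits: int, total_bases: int, at_bats: int) -> float:
    """Isolated power: SLG - AVG, which simplifies to (TB - H) / AB."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return (total_bases - hits) / at_bats


# Invented season line, for illustration only.
rc = basic_runs_created(hits=150, walks=50, total_bases=265, at_bats=500)
iso = isolated_power(hits=150, total_bases=265, at_bats=500)
print(f"Runs created: {rc:.1f}, ISO: {iso:.3f}")
```

Isolated power strips the singles out of slugging percentage, so what remains is the extra bases per at-bat.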

In 2020, the major league average for extra bases taken on followup hits was 42 percent. Think about that. Damn near half the time men reached base they were advancing more than the expected minimum when the next guys swung the bat. (The aforementioned Rickey Henderson did it 55 percent of the time he was on base when the next guy[s] swung the bat[s].) Today’s players are smarter than you think when they reach base.

Castrovince doesn’t let the traditional pitching stats off the hook, either. He thinks pitching wins are baseball’s most deceptive pitching stat and should have been put in their grave when Jacob deGrom won the 2018 National League Cy Young Award. (He won the award with ten wins and nine losses.) “Jacob deGrom’s issue,” Castrovince writes, “wasn’t that he ‘didn’t know how to win.’ It was that he didn’t know how not to be on the 2018 New York Mets.”

DeGrom “won” as many games as the White Sox’s Lucas Giolito in 2018. He posted a 1.70 ERA to Giolito’s 6.13. He also posted a 1.99 fielding-independent pitching rate to Giolito’s 5.56. Trained strictly on what a pitcher actually does control (strikeouts, unintentional walks, hit batsmen, home runs), FIP “is a better tool than ERA—which is influenced by the whims of a pitcher’s defense or the rulings of an official scorer—in evaluating a pitcher’s effectiveness. A pitcher has little control over what happens once the ball is put in play.”
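
For the curious, FIP is usually computed along the lines of this minimal Python sketch. It assumes the widely published form, (13 × HR + 3 × (BB + HBP) − 2 × K) ÷ IP plus a league constant. The constant is recalculated every season so that league FIP lines up with league ERA, and the 3.10 default here is only a placeholder assumption. The function name and sample numbers are hypothetical. Some versions subtract intentional walks from the walk total, which matches the "unintentional walks" wording above, so the caller decides what to pass in.

```python
def fip(home_runs: int,
        walks: int,
        hit_by_pitches: int,
        strikeouts: int,
        innings_pitched: float,
        league_constant: float = 3.10) -> float:
    """Fielding-independent pitching, common published form.

    Assumptions: innings_pitched is a true decimal (200 and one third
    innings is 200.333..., not the box-score shorthand 200.1), and
    league_constant is a placeholder to swap for the season's real value.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    weighted_events = (13 * home_runs
                       + 3 * (walks + hit_by_pitches)
                       - 2 * strikeouts)
    return weighted_events / innings_pitched + league_constant


# Invented pitching line, for illustration only.
sample_fip = fip(home_runs=20, walks=50, hit_by_pitches=5,
                 strikeouts=200, innings_pitched=200.0)
print(f"FIP: {sample_fip:.2f}")
```

Nothing in the formula depends on what the fielders did with a ball in play, and that is the whole idea.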

Castrovince even exhumes the fact that only six pitchers in the live-ball era qualified for the ERA title while posting ERAs and FIPs below 2.00 in the qualifying season: Hal Newhouser (1946), Sandy Koufax (1963), Bob Gibson (1968), Tom Seaver (1971), Clayton Kershaw (2014), and Jacob deGrom (2018). The Cy Young Award wasn’t invented when Newhouser pitched, but only one of the other pitchers didn’t win the Cy Young Award in that season: Seaver, who “won” four fewer than winner Ferguson Jenkins, who also “lost” three more. How does a guy who lost three more beat the guy who lost three fewer?

Want to lean on pitching wins that badly, Yapper and Frostie? Show me the pitcher who strikes 27 straight batters out. (Not even Nolan Ryan ever did that.) Uh oh. Flinger Flounder’s team got shut out through nine, too, though not by 27 up and 27 struck out. They’re going to extra innings, and Flinger’s 27 straight punchouts left him with an arm and shoulder begging for their lives after nine full. Guess who’s going to get credit for the “win”: whoever happens to be the pitcher of record when the winning run scores, even if that’s only in the tenth inning.

The author also loves walks plus hits per inning pitched, WHIP for short, as I do: “Because, as is the case on your morning commute, traffic is bad. WHIP tells us how well a pitcher has performed the very fundamental role of not letting the traffic pile up—obviously an important element in run prevention.” Pitchers and fielders have the opposite job of batters: their job is to keep the other guys from putting more runs on the board than their guys do.
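
The WHIP arithmetic is about as simple as pitching stats get. A minimal Python sketch follows, with a hypothetical function name and invented numbers. It takes outs recorded rather than innings so that the box-score shorthand (where 200.1 means 200 and one third innings) can't sneak in as a decimal.

```python
def whip(walks: int, hits: int, outs_recorded: int) -> float:
    """Walks plus hits per inning pitched: (BB + H) / IP.

    Assumption: innings are derived from outs recorded (three per inning).
    """
    if outs_recorded <= 0:
        raise ValueError("outs_recorded must be positive")
    innings_pitched = outs_recorded / 3
    return (walks + hits) / innings_pitched


# Invented pitching line: 200 innings is 600 outs.
sample_whip = whip(walks=50, hits=150, outs_recorded=600)
print(f"WHIP: {sample_whip:.2f}")
```

A WHIP of 1.00 means one baserunner an inning by walk or hit. The lower the number, the lighter the traffic.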

Castrovince gives fielders their propers, too, meaning you can throw away every defensive stat you grew up with, really, including errors, and focus on defensive runs saved hand in hand with the ultimate zone rating:

[E]rror counts doled out by scorekeepers in the press box barely tell us anything about what makes a successful defender. DRS and UZR are better approximating of defensive value, as they include elements such as range, efficiency on double play chances, and first-step quickness.

The error, he argues, is “the most capricious and arbitrarily (and often unfairly) applied statistic in all of professional sports. The error, which of course generated fielding percentage, tells us not what happened but what an observer of the game felt should have happened. And its uselessness is matched only by its unreliability, because, on a given day, a play ruled an error in one ballpark could very well be ruled a hit in another.”

Would you consider Bill Buckner’s in Game Six of the 1986 World Series the most infamous “error” in baseball history? Do you remember Mookie Wilson’s slow-rolling ground ball up the line taking a wicked skid on the Shea Stadium grass through Buckner’s feet beneath his mitt instead of the tiny hop up into the mitt, leaving Buckner helpless on the play? Do you remember that Wilson would have beaten the play at first base even if the ball had gotten into Buckner’s mitt, because he was about a step ahead of Red Sox pitcher Bob Stanley ambling over to cover first?

It doesn’t let Red Sox manager John McNamara off the hook for failing to do what he normally did, replacing Buckner at first with Dave Stapleton for that should-have-been final inning. (It doesn’t let the Red Sox bullpen off the hook for surrendering the two-out hits that re-tied the game, either.) But it should have made Red Sox Nation and just about all of baseball nation think twice, thrice, and quadruple, before deciding Billy Buck was Beelze Bub incarnate.

Yapper McFlapper and Frostie Fingerflipper haven’t come to terms with wins above replacement, or WAR. Castrovince saves WAR for last in his book, just as I have for this review. Maybe Yapper can’t stop singing the ancient Edwin Starr hit: “War/what is it good for/absolutely nothing.” Maybe Frostie thinks it means baseball during World War II. What the hell is WAR, really?

“A measure,” Castrovince writes, “of a player’s value in all facets of the game by determining how many more wins he is worth than a readily-available replacement at the same position.”

For position players, it’s the number of runs above average Swinger Swofford’s worth through a combination of batting, running, and fielding, adjusted for his field position (some positions are tougher work than others), the league averages thereof, and the number of runs the mere replacement might be worth. For pitchers, it’s either runs allowed per nine innings (earned and unearned) or FIP adjusted to the league averages and the ballparks, relative to Slinger O’Slick’s innings pitched.

Castrovince admits WAR isn’t the final, most perfect measurement, but he knows its best use may be in showing you that there was more than met your eyes when you watched a particular player during a given season. A player with 8 WAR or better is MVP level. A player with 6-8 WAR is a mere superstar. A player with 4-6 WAR is an All-Star level player. A player with 2-4 WAR is a good, dependable regular. A player with 1-2 WAR is a role player. A player with 0-1 WAR is a pine rider. A player under 0 shouldn’t even ride the major league pine.
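
That single-season scale amounts to a lookup table, sketched here in Python. The tier floors and labels come straight from the paragraph above. The function name is hypothetical, and because adjacent ranges share an endpoint (a 6.0 could read as either "4-6" or "6-8"), the sketch assumes a boundary value belongs to the higher tier.

```python
# (floor, label) pairs, highest tier first. Labels follow the scale above.
WAR_TIERS = [
    (8.0, "MVP level"),
    (6.0, "superstar"),
    (4.0, "All-Star level"),
    (2.0, "good, dependable regular"),
    (1.0, "role player"),
    (0.0, "pine rider"),
]


def war_tier(season_war: float) -> str:
    """Map a single-season WAR to its tier; boundary values go to the higher tier."""
    for floor, label in WAR_TIERS:
        if season_war >= floor:
            return label
    # Anything under zero falls below every floor in the table.
    return "below replacement level"


for value in (8.3, 6.0, 3.1, 0.4, -0.5):
    print(f"{value:+.1f} WAR: {war_tier(value)}")
```

Treat the cutoffs as rules of thumb rather than hard lines. A tenth of a win either way is well within the stat's margin for argument.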

But WAR has its uses for measuring a player’s career, too. If they’d measured WAR during Lou Whitaker’s career, that longtime Detroit second base bellwether might have been in the Hall of Fame two decades ago, instead of one-and-done on the writers’ ballot and waiting for an Eras Committee to reconsider him yet again. Whitaker finished with 75.1 career WAR. The average Hall of Fame second baseman’s career WAR is 69.5. Whitaker’s entry into Cooperstown would hike the average a tick or three. Still think WAR’s good for absolutely nothing?

What the old-schooler fears, perhaps, is being left for dead in the woods and weeds with the sabermetric advance. The old-schooler may fear that everything he or she ever learned on baseball cards or in those ancient annual pocket-size volumes of Who’s Who in Baseball turned out to be like an old gag about condensing Romeo and Juliet: a couple of moony teenagers ran off together and died.

You might care to note that was the first time I deployed the S-word here. By design. I, too, have no wish to leave you for dead in the woods and the weeds when I talk or write sabermetrically or analytically. Fellow old-timer, I too grew up with Who’s Who in Baseball as my pocket Bible.

But I also collided happily with The Elias Baseball Analyst most years of the 1980s and Total Baseball pre-Internet. Who’s Who in Baseball was rendered irrelevant by Baseball-Reference.com and Retrosheet, where the basic stats go deeper than the baseball card, and one or two clicks send you to the kind of advanced stats for which Total Baseball cost you an arm (and maybe a wrist, if you weren’t careful with a book as heavy as a foundation block), a leg, and the annual updated supplement.

My God, the Internet’s made statistical diving simpler than all that. What’s to be afraid of? Castrovince is the Franklin D. Roosevelt of baseball analytics: you have nothing to fear but fear itself. I was the world’s worst math student in my school days. My teachers then would flip to see me now diving into the deep stats the way oceanic explorers dive for undersea discoveries. If I can do it, anybody can.

So why should you do it? I was afraid you’d ask. Very well, I surrender—no matter how much of a baseball nut you are, no matter how many subscriptions to ESPN or MLB Network you have and use, you can’t see every last baseball game played when baseball is in season, and you’ve got no other way—not even YouTube clips—to know what the players you couldn’t watch really did above and beyond their surface stats. The box score won’t tell you the whole game story.

You don’t “need” stats to, you know, watch and enjoy the game? Well, you watched and enjoyed the games growing up and couldn’t wait to compare what you saw with what was on those guys’ limited baseball cards or in Who’s Who in Baseball—when you weren’t busy flipping the cards in the schoolyard or on the corner, or clipping them to your bicycle to clatter and fart against the turning spokes.

Pick up A Fan’s Guide to Baseball Analytics with a wide open and fearless mind, and relax with the idea that you’re actually going to get what you wished for, back when you were bound (and gagged?) once upon a time to slog through high school mathematics. The formulae simplified, the concepts making sense, your game eyes not playing tricks on you, and the entertainment as immense and joyous as watching the merry-go-round go ’round on the bases.

Oh. By the way. Refer back to Player A and Player B. Player A is Pete Rose. Player B is Willie Mays. You are now free to ask yourself which lineup, nine Charlie Hustlers or nine Say Hey Kids, will create more runs and hang them on the scoreboard.