There's something utterly insane about Aaron Nola's 2018 season. And I will tell you: I've had a surprisingly hard time writing about it. I've started and stopped this post a dozen times. I don't know why.
In 2018, Aaron Nola was unquestionably one of the best pitchers in the National League. He finished third in the Cy Young voting. He went 17-6 with a 2.37 ERA, he had a fine 224-58 strikeout-to-walk ratio and a superb 0.975 WHIP. It was a breakout season for a brilliant young pitcher whom the Phillies took with the seventh pick back in the 2014 draft. They've been counting on him to lead them into a brighter future. It now seems clear that he's the right guy to do that.
So far, all of that's good.
The question is: Just HOW good was that season?
What's fascinating about that question is that the answer requires us to do something that we've been trying (and failing) to do for 100 years. It requires something that, as Bill James says, we might never fully accomplish.
To figure out just how good Aaron Nola was in 2018, we have to separate pitching from defense.
* * *
In the beginning, pitchers didn't matter. That's how the baseball bible would begin.
In the beginning, pitching didn't matter.
Baseball was without form and void; pitchers existed only to get the action started.
There were no balls. There were no strikes.
Well, it's true. Pitchers were not allowed to throw the ball overhand or sidearm; they were not allowed to do anything to trick the hitter. Batters waited for their pitch as long as they wanted, and they would demand that pitchers throw the ball a little higher, no, a little lower, no, not quite, a little bit more to the left ... baseball was not baseball as we know it now.
It was in this embryonic stage of baseball that the Father of Baseball, a British-born cricket writer named Henry Chadwick, came up with the concept that would forever alter how people would view pitchers. And the funniest part of all this: He wasn't even THINKING about pitchers when he thought up the idea.
In 1867, Chadwick invented the unearned run.
This made sense in 1867. Baseball then was entirely about batting and fielding, nothing else. That's why Chadwick wanted to divide up runs in the first place; he wanted to differentiate between those runs that were cleanly scored with hits (earned runs) and those that were cheaply scored because of defensive errors (unearned runs). Pitchers were not part of the equation.
But the game was changing. Pitchers made it change ... by cheating. We sometimes get high and mighty about cheating in baseball, but the reality is that the game was literally built on cheating. Pitchers started bending the rules to trick the hitters. At first, it was with head fakes. Then during the Civil War came a pitcher named Jim Creighton, who figured out a way to stealthily snap his wrist when pitching (totally illegal). Everybody knew that Creighton was cheating, but he was good at it. Nobody could catch him in the act.*
*Creighton's death might have been an inspiration for Roy Hobbs; he suffered a ruptured abdominal hernia after hitting a home run in a game in 1862 and died four days later.
Pitchers followed Creighton's lead. They grew bolder and bolder in openly flouting the rules. But here was the thing: People loved it. Pitchers had unknowingly discovered that baseball was a better, more interesting, more commercial game when built around the fascinating battle between pitcher and batter.
The rules changed accordingly and rapidly. By 1884, pitchers had just about all of the restrictions stripped away; they could throw overhand and as hard as they wanted. And then came the never-ending effort to balance the game between hitter and pitcher.
1884: Six balls for a walk. Three strikes for a strikeout. Pitchers' box 50 feet from home.
1887: Five balls for a walk. Four strikes for a strikeout. Batter awarded first when hit by a pitch.
1888: Three strikes for a strikeout.
1889: Four balls for a walk.
1893: Pitching rubber added. Pitcher's mound moved to 60 feet, 6 inches from home.
And so on. But even though the game had changed dramatically, there was still this leftover fixation on earned and unearned runs. Chadwick himself hated this, but that's because he fervently believed that no runs scored after stolen bases should be considered earned. Chadwick believed that stolen bases were a reflection of shoddy defense, and the fact that nobody agreed with him drove him up the wall.
Everyone else, basically, used earned and unearned runs, and common wisdom emerged: Earned runs were allowed by the pitcher; unearned runs were allowed by the defense. This was cemented forever in 1912, when a man named John Heydler came up with one of the most famous formulas in sports history.
Earned Runs * 9 / innings pitched
That's ERA, a first-ballot Hall of Fame sports statistic.*
*Memo to self: Remember to start that sports statistic Hall of Fame.
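Heydler's formula is easy to check by hand, but a minimal Python sketch makes the innings bookkeeping explicit. The helper names here are made up for illustration, and Nola's 56 earned runs and 212 1/3 innings in 2018 come from the public record rather than from the text above, so treat them as assumptions.

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert box-score innings like '212.1' (212 and 1/3) to a real number."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def era(earned_runs: int, innings: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    return earned_runs * 9 / innings


# Assumed inputs: Nola's 2018 line from the public record (56 ER, 212.1 IP).
nola_ip = innings_to_float("212.1")
print(f"Nola 2018 ERA: {era(56, nola_ip):.2f}")
```

Note that the ".1" in a box score means one out, not one tenth of an inning, which is why the conversion divides by three.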
And that's how it was for many, many decades. ERA ruled. Earned and unearned runs ruled. The impossibly flawed logic of reverse engineering the game so that you tried to reconstruct things as if errors hadn't happened didn't bother too many people. It bothered various people through the years, and it drove Bill James up the wall in the 1970s, but ERA had too strong a hold on the imagination for things to change.
And along came Voros McCracken.
You're now wondering when we will actually get back to Aaron Nola. I'm wondering the same thing. I hope it will be soon.
* * *
Henry Chadwick, when he invented earned and unearned runs, unintentionally wired our brains to look at baseball in a certain and admittedly irresistible way: Runs are the FAULT of the pitcher or, in the case of errors, the FAULT of the defense. It's tidy and neat and tempting to believe.
Once you go down that earned-unearned road, it's hard to think about pitching any other way. The unifying theory behind ERA is that the pitcher controls EVERYTHING ... except for the defensive screw-ups behind him.
As I say, many people had trouble buying into that. But it was Voros who changed the dynamic. He started with a different question: What exactly DOES a pitcher control? And what he found shocked him and shook baseball.
I think part of the magic of Voros' original hypothesis -- that pitchers have no control over whether balls in play become hits -- is that it manages to be both shocking and obvious at the same time. It's like the ending of The Sixth Sense.
He came upon his hypothesis by simply looking to see the consistency of pitchers year to year in controlling different baseball events. He found -- unsurprisingly -- that pitchers' walks and strikeouts were consistent from year to year. He also found that the number of homers a pitcher allows year to year stays pretty consistent -- not as consistent as walks and strikeouts, but consistent enough for there to be a correlation. So the numbers suggested that pitchers had control over walks, strikeouts and home runs allowed.
But what about controlling balls once they got in play? This was the breakthrough that shook baseball: Voros found almost no correlation at all.
The most famous example of Voros' research was Nolan Ryan, who was famous for how few hits he gave up. The league hit .204 against him over an entire career, and so you would expect him to consistently be well below league average on Batting Average on Balls in Play, the now famous BABIP.
But he wasn't. Some years he was well below average. But some years he was above average. His BABIP fluctuated pretty wildly. And if Nolan Ryan didn't seem to have much control over balls in play, how could anyone else?
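The year-to-year check described above can be sketched in a few lines of Python. This is only an illustration of the mechanics, not a reconstruction of Voros' actual study: the `pearson` helper is written from scratch, and the five pitchers' numbers are made up to mimic the pattern he reported (strikeout rates that carry over from one year to the next, BABIPs that do not).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up numbers for five hypothetical pitchers, NOT real data.
k_rate_year1 = [0.18, 0.22, 0.25, 0.28, 0.31]
k_rate_year2 = [0.19, 0.21, 0.26, 0.27, 0.30]
babip_year1 = [0.280, 0.290, 0.300, 0.310, 0.320]
babip_year2 = [0.305, 0.285, 0.315, 0.290, 0.300]

print("strikeout rate, year 1 vs. year 2:", round(pearson(k_rate_year1, k_rate_year2), 2))
print("BABIP, year 1 vs. year 2:", round(pearson(babip_year1, babip_year2), 2))
```

A correlation near 1 means last year's number tells you a lot about this year's; a correlation near 0 means it tells you almost nothing. A real version would need far more than five pitchers.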
The point here is not to go deep into Voros' theory; many, many articles have been written, and dozens of insights and counter-theories have been launched by Voros' work. One of my favorites: Tom Tango came up with the Defensive Responsibility Spectrum. He breaks down every type of play on the baseball diamond, with the responsibility moving from left (all pitcher's responsibility, such as HBPs) to right (all fielders' responsibility, such as outs made on the bases).
Here you go, the spectrum, to be clipped and put on your refrigerator door:
100% Pitcher -- hit by pitch, balk, pickoff, strikeout, walk, homer, wild pitch, stolen base, caught stealing, double, triple, single, batting outs, passed balls, base running outs -- 100% defense.
But let's move on -- Voros' work led directly to one of the most prominent advanced stats going, one you undoubtedly know well, Fielding Independent Pitching, or FIP.
FIP is basically a way to turn a pitcher's strikeouts, walks and homers allowed into an easy-to-digest ERA form.
The formula is quite elegant, I think:
((13 * HR) + (3 * (BB + HBP)) - (2 * K)) / IP + C
Let's do Nola.
He allowed 17 home runs. Multiply that by 13, you get 221.
He walked 58 and hit seven more. Multiply 65 by three, you get 195, so your total is 416.
Now subtract strikeouts times two; Nola had 224 strikeouts, so that's 448 to subtract.
You get a negative number (416 - 448 = -32). That's what you want in a pitcher: The lower the number the better. Anyway, then there's a little math -- you divide the final number by innings pitched and then you add the C, the Constant, just to make it look like ERA. The constant in 2018 was 3.161.
Nola's FIP in 2018 was 3.01. It ranks him 10th in baseball.
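The arithmetic above can be replayed as a short Python sketch. The function name and argument names are made up for illustration, and the 212 1/3 innings figure comes from the public record rather than from the walkthrough, so treat it as an assumption.

```python
def fip(hr, bb, hbp, k, innings, constant):
    """Fielding Independent Pitching from homers, walks, hit batters and strikeouts."""
    numerator = 13 * hr + 3 * (bb + hbp) - 2 * k
    return numerator / innings + constant


# Assumption: Nola threw 212 1/3 innings in 2018 (public record, not stated above).
NOLA_IP = 212 + 1 / 3

nola_fip = fip(hr=17, bb=58, hbp=7, k=224, innings=NOLA_IP, constant=3.161)
print(f"Nola 2018 FIP: {nola_fip:.2f}")  # 3.01
```

The numerator works out to 221 + 195 - 448 = -32, matching the hand calculation, and the constant does the rest of the work of putting it on the ERA scale.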
That's a very good FIP ... but you'll notice that it's not nearly as good as his 2.37 ERA. It's always good to compare FIP to ERA, because that tells you just how much of a pitcher's success/struggles were due to balls in play. Nola's ERA being so much better than his FIP suggests that he:
A. Got lucky
B. Had good timing
C. Had good defense behind him
D. All or some of the above
And that's one way to separate pitcher from defense. But you know the problems with that. FIP is built on the theory that pitchers have NO control on balls in play, and that probably goes too far. It has been shown that pitchers do have some control, for instance, on the types of hits they give up, whether they're fly balls or ground balls. A Nola fan could argue persuasively that the fact that Nola had one of the game's best BABIPs should not be simply written out of his performance.
Or a Nola fan could simply point to Baseball Reference.
Because according to Baseball Reference, Aaron Nola had an extraordinary, amazing, bonkers 2018 season.
* * *
Here now are the Top 5 pitching seasons by Baseball Reference's Wins Above Average:
No. 1: Dwight Gooden in 1985 (9.8 WAA). He went 24-4, led the league in wins, ERA, innings and K's.
No. 2: Pedro Martinez in 2000 (9.7 WAA). Remember that year? Pedro had a 291 ERA+, a 1.74 ERA, a 284-to-32 strikeout-to-walk ratio.
No. 3: Roger Clemens in 1997 (9.5 WAA). The comeback year, the stick-it-in-Boston's-ear year; he went 21-7 with a 2.05 ERA, led the league in wins, innings, strikeouts, ERA+, etc.
No. 4: Steve Carlton in 1972 (9.3 WAA). One of the most famous seasons in baseball history; Lefty won 27 when his team won just 59, he led the league in basically every category.
No. 5: Aaron Nola, 2018 (8.8 WAA).
OK, so seriously: What gives? How in the WORLD does Aaron Nola's 2018 season -- a good season as I said -- rank fifth in the last half-century-plus, ahead of every single Bob Gibson season, every Tom Seaver season, every Randy Johnson season, every Greg Maddux season, every Sandy Koufax season?
Are we missing something?
Are THEY missing something?
I think you know where this is going. This is especially true if you read my Porcello vs. Verlander piece from a few years ago. It comes back to that eternal puzzle: How do you separate a pitcher from his defense?
Baseball Reference builds its Wins Above Average and Wins Above Replacement formulas on runs allowed. Let's do Wins Above Average for Nola in 2018 and for Bob Gibson in his insane 1.12 ERA season of 1968. As always, I make no promises on the precision of the math.
You start with runs allowed per nine innings. Not just earned runs -- the defensive adjustment will be made later. All runs are included here.
Nola: 2.42 RA9
Gibson: 1.45 RA9
Nola allowed only one unearned run in all of 2018. Gibson allowed 11 unearned runs in 1968. That's important for the calculations.
OK, so what you want to do next is compare that RA9 to the league average. In 2018, NL teams averaged 4.56 runs per nine innings. In 1968, the year of the pitcher, NL teams averaged only 3.35 runs per nine innings. So when you put everything in context, you come up with this:
Nola: 50.4 runs above average
Gibson: 64.3 runs above average
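A rough way to get from RA9 to runs above average is to take the gap to the league rate and scale it by innings pitched. The Python sketch below does only that; it is a simplification, not Baseball Reference's actual procedure, the function name is made up, and the innings totals (212 1/3 for Nola, 304 2/3 for Gibson) are assumptions taken from the public record.

```python
def runs_above_average(pitcher_ra9, league_ra9, innings):
    """Runs saved versus a league-average pitcher over the same innings."""
    return (league_ra9 - pitcher_ra9) * innings / 9


# Assumed innings totals from the public record; RA9 figures are from the text.
nola = runs_above_average(pitcher_ra9=2.42, league_ra9=4.56, innings=212 + 1 / 3)
gibson = runs_above_average(pitcher_ra9=1.45, league_ra9=3.35, innings=304 + 2 / 3)

print(f"Nola 2018:   {nola:.1f} runs above average")
print(f"Gibson 1968: {gibson:.1f} runs above average")
```

Because the inputs are rounded to two decimals, the results land within a fraction of a run of the figures quoted above rather than matching them exactly.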
All right, Baseball Reference has a starting pitcher adjustment because, generally, relievers allow fewer runs per nine innings. That adjustment helps Nola more because (I think) there were many fewer relievers in Gibson's time. After the adjustment (let's leave off the decimal points from here on out):
Nola: 55 runs above average
Gibson: 66 runs above average
There are ballpark adjustments made -- I was surprised to see that Gibson actually loses a bunch of runs because of ballpark adjustments; I guess new Busch Stadium was a pretty extreme pitchers park in 1968. Nola pitched in a slight hitters park, so his numbers go up.
Nola: 57 runs above average
Gibson: 59 runs above average
So, that's probably much closer than you expected. People have very different opinions about context -- lots of people just want to enjoy Gibson's 1.12 ERA without taking into account that the ERA across baseball that year was 2.98. Anyway, that's where the numbers end up, and Gibson is ahead.
But we forgot something: The defensive adjustment. And it's a doozy.
[Photo: Even Nola himself probably doesn't believe how bonkers his 2018 season was.]
Baseball Reference has Gibson's defense being just about average, so there really isn't an adjustment for him.
But BR has Philadelphia's defense so terrible, so abominable, so unbelievably bad, that they add -- get ready for this -- FIFTEEN RUNS to Nola's total. Yeah. Fifteen.
And that puts Nola 72 runs above average. As mentioned, that's way more than Gibson ever had, way more than Greg Maddux ever had, way more than Clayton Kershaw, Justin Verlander, John Smoltz or Tom Glavine ever had in a season. It's more than Roger Clemens' MVP season of 1986. I could go on.
In other words, it's unbelievable.
I mean that in the literal sense of the word. I don't believe it. You don't believe it. Nobody believes it. I suspect that Aaron Nola himself doesn't believe it.
So what the heck happened here?
I'll tell you what I think happened: Baseball Reference double counted. And it shows you just how difficult it is even for some of the smartest baseball minds around the game today to separate pitching from defense.
Baseball Reference makes its defensive adjustment based on the quality of a team's defense. It seems a logical thing to do. The Phillies were, in fact, a terrible defense in 2018. By John Dewan's runs saved, they were 145 runs below average, the worst defensive team in baseball by a mile, one of the worst defensive teams in memory.
So, Baseball Reference spreads around those runs among all the pitchers. Again, that seems logical. It's the same lousy defense for every pitcher, right?
Wrong. It isn't. And this is what I mean by double counting.
By every available measure, the Phillies played pretty good defense behind Aaron Nola. Mitchel Lichtman does Ultimate Zone Rating per pitcher, and he figured that the Phillies defense was league-average behind Nola. You can see this in Batting Average on Balls in Play.
League average BABIP: .295
Phillies BABIP: .306
Nola's BABIP: .254
Nola's BABIP was fourth-lowest in all of baseball. Our pal Mike Petriello used Statcast numbers to show that, based on exit velocity and angle and such things, the league hit 12 points LOWER than you would expect on the balls in play that Nola gave up.
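For reference, BABIP is usually computed as hits that stayed in the park divided by at-bats that ended with a ball in play, with sacrifice flies added back in. The Python sketch below uses that standard definition; the stat line fed into it is entirely hypothetical, chosen only to show the mechanics, and is not Nola's.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


# Hypothetical pitcher line, NOT real data.
example = babip(hits=150, home_runs=20, at_bats=750, strikeouts=200, sac_flies=5)
print(f"BABIP allowed: {example:.3f}")
```

Strikeouts and home runs drop out of both halves of the fraction, which is why the number isolates what happened on balls the fielders had a chance at.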
Why would Nola get better defense than others? It could be for any number of reasons. It could be a statistical illusion, I suppose. It could be luck. It could be that Nola's balls in play happened to be easier to field. It could be that the Phillies played with more concentration behind him. It could be something else. We're not going to be able to pinpoint this right now; but I would say that the larger point is this: The bulk of evidence suggests that Nola got at least average defense behind him, and probably got better than average defense.
So that's counting defense once.
But then on top of that, Baseball Reference gives him a bunch of runs above average because they assume that he had a dreadful defense behind him. That's the double count. It's like giving a poker player a full house AND extra chips.
And that's how you turn a good season into a legendary one.
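The size of the double count is easy to see if you replay the ledger with and without the team-wide defensive adjustment. The Python sketch below uses only the rounded figures quoted earlier (57 and 59 runs above average after the park adjustment, plus 15 and 0 for defense). The "pitcher-specific" line is a hypothetical what-if that assumes the defense behind Nola was exactly league average, as the UZR figure suggests; it is not something Baseball Reference publishes.

```python
# Runs above average after the starter and park adjustments (from the text).
after_park = {"Nola 2018": 57, "Gibson 1968": 59}

# Team-wide defensive adjustment as described in the text.
team_defense_adj = {"Nola 2018": 15, "Gibson 1968": 0}


def total_runs(pitcher, defense_adj):
    """Final runs above average once a defensive adjustment is applied."""
    return after_park[pitcher] + defense_adj


with_team_adj = {p: total_runs(p, team_defense_adj[p]) for p in after_park}

# Hypothetical what-if: league-average defense behind Nola means no adjustment.
nola_pitcher_specific = total_runs("Nola 2018", 0)

for pitcher, runs in with_team_adj.items():
    print(f"{pitcher}: {runs} runs above average (team-wide adjustment)")
print(f"Nola 2018: {nola_pitcher_specific} runs above average (pitcher-specific what-if)")
```

Under that assumption Nola lands just behind Gibson instead of 13 runs ahead of him, and the whole gap comes from the 15-run adjustment.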
* * *
I'm not exaggerating when I say: This was the most baffling post I've ever written. I don't mean that in a literal way -- I've written countless baffling, confusing and inexplicable posts. I mean that for reasons I don't understand, I just couldn't quite get to the end. I wrote probably 10,000 words trying to get these 3,000 words out there. I have no idea why it was so difficult. But since you made it all the way to the bottom, I'll include a couple of the outtakes here.
Sort of a director's cut for this ridiculous post.
Outtake 1: For some reason, at some point, I compared people realizing that separating pitching from defense is futile to NFL referees realizing that actually calling an NFL game is futile.
Something weird happened in the Bears-Eagles playoff game last Sunday ... well, actually, I suppose several weird things happened in the game, including a game-winning field goal being tipped at the line, THEN hitting an upright, THEN hitting the crossbar and THEN bouncing forward, making it just another night Chicago died.
"Wait, shouldn't that kick be good?" my wife Margo asked.
"No. The ball has to go through."
"That would be a home run in baseball, right?" she asked.
"Yes. That's right. If it hits the foul pole and bounces back in, that's a home run."
"Yep," she said. "Just another reason that baseball is better than football."
But that isn't the weird thing I was talking about. Late in the first half -- and I'm not going to look up the specifics because I don't want to get too bogged down in this -- a pass was thrown to a Bears receiver, who wrestled for the ball with an Eagles defender. They wrestled for it for quite some time and then, just as the receiver was going to the ground, the defender ripped the ball out of his hands.
The referees immediately called it an incomplete pass, because it's not humanly possible to officiate an NFL game. I don't mean this facetiously. It's not humanly possible to officiate an NFL game. The game moves too fast, there's too much happening at once, the rules are too vague and teams work harder at breaking the rules than referees could ever work at upholding them.
Calling an NFL game is like trying to police a 42.4 mph speed limit (or maybe it's a 37.3 mph speed limit) on the Autobahn if the cars were (only in specific ways) allowed to crash into each other.
But the point is because the referees immediately called it an incomplete pass, none of the players went to go pick up the ball. Well, why would they?
Then the Bears challenged the call that it was an incomplete pass, and of course, the replay showed that the referee got it wrong. The receiver definitely caught the ball.
And replay also showed that because he caught the ball and it was ripped from his hands, it should have been called a fumble too.
And replay also showed that nobody actually recovered the fumble because the officials called it an incomplete pass in the first place.
Faced with all this madness, the replay official did what anybody faced with the sheer inanity of professional football should do: He threw his hands up in the air and said, "OK, I guess it's an incomplete pass, I mean, what else can we do?"
That's the NFL.
And I think that's how many people have taken on the challenge of separating pitching and defense. At some point, it has become: I mean, what else can we do?
Outtake 2: At some point, I wanted to use Nola's WAR scores to say that maybe we've gone as far down the "one-stop shopping number" as we can go. But before doing that, I wanted to show why WAR, as flawed as it might be, is still very valuable.
I'm as susceptible as the next person -- OK, realistically, I'm much more susceptible than the next person -- to the allure of a one-stop shopping baseball number. I love them. I love f-WAR. I love b-WAR. I love WARP. I love Win Shares. I love the Indis. I love them all.
I love them because they allow us to bring just a little bit of order to the extraordinary disorder of baseball history. For example: Who are the best eligible and non-tainted players who are not in the Baseball Hall of Fame? I can just go over to Baseball Reference, use b-WAR and get a list of potential candidates:
Lou Whitaker, 75.1
Larry Walker, 72.7
Bobby Grich, 71.1
Scott Rolen, 70.2
Edgar Martinez, 68.4
Kenny Lofton, 68.3
Graig Nettles, 68.0
Dwight Evans, 67.1
Buddy Bell, 66.3
Willie Randolph, 65.9
I could then go over to Fangraphs and get a list, though it's a bit harder because they don't have a Hall of Fame filter that I know how to use.
Scott Rolen, 69.9
Bobby Grich, 69.2
Larry Walker, 68.7
Lou Whitaker, 68.1
Andruw Jones, 66.9
Graig Nettles, 65.7
Edgar Martinez, 65.5
Dwight Evans, 65.1
Reggie Smith, 64.6
Jim Edmonds, 64.5
That would have been so much harder without having WAR as a starting point. Now, are these the RIGHT 10 players? There would be those who would say no, that the right 10 players would include Steve Garvey, Dick Allen, Tony Oliva, Dale Murphy, etc. They won't admit it, but they're just using a different formula. I have a friend who believes deeply that Garvey belongs in the Hall of Fame (and now that Harold Baines is in, I think his argument is much stronger). I ask him: Why Garvey? He had a .329 career on-base percentage. He never slugged .500 in a season (and as a first baseman). His career WAR is 38.1.
And he says: WAR is stupid. He was the top RBI man on a dominant Dodgers team. He was a .300 hitter who got 200 hits a year pretty much every year. He won an MVP award. He was one of the biggest stars in the game.
So you can put together my friend's formula pretty easily.
(6 * Fame) + (3 * Leadership) + (4 * All-Star Game starts) + (Credit for statistics I like, such as 200 hits per season) - (0 * statistics I don't care about at all like on-base percentage).
I make fun of it, but the wonderful thing about baseball is that we're allowed to watch it, analyze it and calculate it any way we want.