“There are three kinds of lies: lies, damned lies, and statistics.”
“73.6% of all statistics are made up.”
One of the most refreshing aspects of this season has been the growing acceptance of advanced statistics. On Friday night, Tom McCarthy casually spoke about how he met with Gabe Kapler to talk about wRC+ (adjusted weighted runs created) and ISO (isolated power). The Citizens Bank Park scoreboard now shows on base percentage. Coming from the dark days when our general manager openly admitted that he did not care about on base percentage and it was not even clear he understood the difference between at bats and plate appearances, the change is quite refreshing.
But there is a reason why they are called “advanced” stats. It is easy to understand runs, runs batted in, home runs, stolen bases, walks, strikeouts, and other counting stats. At bats and batting average require a little knowledge regarding how sacrifices and walks are counted towards them but are still pretty simple, and earned runs and earned run average are in the same boat. They can all be calculated quickly and easily. On base percentage, slugging percentage, and on base plus slugging (OPS) have not been as commonly used for as long but are still easy to calculate and easy to understand.
However, it is becoming more and more common to talk about stats like wRC+, OPS+, FIP, xFIP, and the most common composite stat, WAR. Which is great. Honestly, when I first became interested in advanced statistics more than a decade ago, even OBP and SLG were considered too advanced by many people, so hearing the announcers talk about complicated advanced statistics and how they are informing decision-making shows just how far baseball discourse has come in such a short time.
But as the accepted stats become more and more complicated, they also become further and further removed from stats an average person can quickly calculate or double check. Many of the new stats are composites of multiple other advanced stats. Many people refer to stats, especially stats like WAR, without fully understanding what exactly these stats represent. A quick Google News search shows many articles talking about WAR and its relation to the MVP race. Using stats without knowing how they are calculated or what they represent is no good at all.
Did you know that WAR is actually two completely different stats? There are two sites that maintain libraries of advanced statistics: baseball-reference.com and fangraphs.com. Each site displays the WAR of every player for every season going back a century. And despite both stats being named Wins Above Replacement, they are calculated differently, using different component stats, and can disagree somewhat significantly, especially on an individual season level.
Given how quickly the use of these advanced stats has proliferated, I am never sure how much you, my reader, know about a given stat. So I have decided to write this quick primer, so that I – and you – can refer back to this in the future. While I typically default to using FanGraphs (FG) and its stats, I do not think it is better or worse than baseball-reference (bb-ref), and so I will try to address the stats on both sites and their differences.
+ and –: Many stats have a + or – after their acronym. BB-ref displays OPS+ and ERA+. FG displays wRC+, ERA-, and FIP-. A + or – after an acronym indicates that the stat is adjusted for park factors and league averages and then indexed so that average = 100. For a + stat, a number above 100 is better than average; for a – stat, a number below 100 is better than average. If you ever want to compare two players, these + and – stats are generally the best to use. They may not be as complete as something like WAR, but because of the way WAR is calculated, they are typically better for cross-generation comparisons.
Park factors and league averages: One of the best parts of baseball is that every stadium is slightly different, providing a true home field. At the same time, this means that every stadium plays differently. A hitter in Colorado will generally put up better raw numbers than an equally good hitter in Oakland. Similarly, over the course of baseball’s history, scoring has varied for various reasons. Comparing a hitter in the deadball era to a hitter in the steroid era, you would expect the latter to have significantly better raw numbers. Again, when comparing players, using stats that are adjusted for these factors is ideal.
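To make the indexing idea concrete, here is a minimal Python sketch of how a park- and league-adjusted stat could be put on a scale where average = 100. This is not the exact formula either site uses; the function name, the single multiplicative park factor, and every number in the example are assumptions made up for illustration.

```python
def indexed_stat(player_value, league_average, park_factor=1.0):
    """Scale a stat so that a league-average player scores 100.

    park_factor is a hypothetical multiplier: above 1.0 means the
    player's home park inflates the stat, below 1.0 means it suppresses it.
    """
    # Remove the park effect first, then compare to the league average.
    park_adjusted = player_value / park_factor
    return 100 * park_adjusted / league_average


# Made-up example: two hitters compared by OPS in a league averaging .750.
colorado = indexed_stat(0.900, league_average=0.750, park_factor=1.10)
oakland = indexed_stat(0.850, league_average=0.750, park_factor=0.95)

print(round(colorado), round(oakland))
```

In this invented example, the Oakland hitter has the lower raw OPS but the higher indexed number, which is exactly the kind of reversal the + and – stats are designed to reveal.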
wOBA, wRC+, FIP, and xFIP: These are roughly analogous stats for hitting (wOBA and wRC+) and pitching (FIP and xFIP). Ultimately, they are all built on statistical calculations that assign different values to different events and calculate what a player’s “true” performance level is by stripping out the effects of luck or randomness. wOBA is most similar to OPS but is scaled to look like OBP. wRC+ is wOBA adjusted for season, league, and park. FIP is a calculation of how a pitcher has pitched assuming league average outcomes for balls in play, scaled to look like ERA. xFIP is the same as FIP but with a normalized HR/FB rate and is best used for relievers and other pitchers with a small sample size.
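For the curious, here is a rough Python sketch of FIP and xFIP using the commonly published form of the formula, which weights home runs, walks plus hit batters, and strikeouts per inning and then adds a constant to land on the ERA scale. The constant and the league HR/FB rate change every season, so the default values below are placeholders I picked for illustration, and the pitcher's line is invented.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    ip is innings pitched as a true decimal (200 and 1/3 innings is
    200.333, not 200.1). constant is an assumed placeholder; the real
    value is recalculated for each season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, league_hr_per_fb=0.10, constant=3.10):
    """Same as FIP, but actual home runs are replaced by the number
    expected from the pitcher's fly balls at an assumed league HR/FB rate."""
    expected_hr = fb * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Invented season: 200 IP, 20 HR, 50 BB, 5 HBP, 200 K, 150 fly balls.
pitcher_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200)
pitcher_xfip = xfip(fb=150, bb=50, hbp=5, k=200, ip=200)

print(round(pitcher_fip, 2), round(pitcher_xfip, 2))
```

This made-up pitcher allowed 20 home runs where his fly ball total would predict about 15, so his xFIP comes out lower than his FIP. That gap is the "normalized HR/FB rate" at work.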
bWAR and fWAR: While both sites identify their metric as WAR, it is probably more correct to identify which site’s WAR you are using, commonly by putting a b or f in front of it. The really, really short version of WAR is that it attempts to calculate how many runs a player contributes on offense, defense, running, and pitching, then calculates how many runs constitute a win, and then converts the runs to wins. The two sites calculate all four components basically completely differently. One is not better than the other, and it’s most important to understand that these are just two of many ways to calculate what they are attempting to calculate.
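The really, really short version can be sketched in a few lines of Python. To be clear, this is neither site's actual method: the function name is made up, the inputs are assumed to already be runs above a replacement-level player, and the figure of 10 runs per win is only a common rule of thumb that both sites refine in their own ways.

```python
def war_sketch(offense_runs, defense_runs, running_runs, pitching_runs,
               runs_per_win=10.0):
    """Toy WAR: add up runs above replacement, then convert runs to wins.

    runs_per_win=10.0 is an assumed rule-of-thumb value, not either
    site's published figure.
    """
    total_runs = offense_runs + defense_runs + running_runs + pitching_runs
    return total_runs / runs_per_win


# Invented position player: strong bat, good glove, a little baserunning value.
example_war = war_sketch(offense_runs=25, defense_runs=7,
                         running_runs=3, pitching_runs=0)

print(example_war)
```

Every input to this toy function is itself the output of a much longer calculation, and that is where bWAR and fWAR part ways.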
“Ninety percent of the game is half mental.” -Yogi Berra
These are some of the many advanced statistics that are starting to be commonly used throughout baseball. Becoming familiar with them will increase your understanding of the deeper workings of baseball. It’ll also stop you from doing a week’s worth of research, only to discover you got too fancy, used the wrong stats, and your research is completely unusable. Whoops!