Mood Music - Time To Say Goodbye by Andrea Bocelli and Sarah Brightman
Well, it's been a while since I've tried to do something impossible, so I'm going to try and write a few general pieces on the field of sabermetrics and advanced statistics. Part of what makes this so challenging is the extremely wide range of interest in and knowledge of advanced statistics within the community.
I know for a fact that some of you just couldn't care less about "fancy new age acronyms," so it's going to be almost impossible for me to write this without sounding preachy or condescending. So, let me assure you that my sole intention is to inform, and if this information makes you more or less inclined to "believe" in sabermetrics, that's up to you.
If I had to try to give a blanket definition to the entire field of sabermetrics, I would call it an attempt to isolate variables, and to connect those variables to certain outcomes. Take one of the main examples: hitting line drives is good, and will lead to hits, runs, and victories. Not earth-shattering, I know, but still worth considering.
Now, as another aside, before we get into some of the "meat" of this piece, I'd like to make another point. The purpose of any metric is to put a numerical value on past performance, both to catalog that performance, and to give you data to predict future performance. For example, a .330 hitter is better than a .200 hitter, and the .330 hitter is more likely to get a hit in the future.
However, is that the entire story? Who did a better job at the plate, a batter who hit a bloop single off the end of the bat, or a batter who hit a rocket right at the center fielder? Traditional metrics only have room to record the results of hitting, and that turns a bloop single and a line drive single into the same statistical event. The same thing can be said about a lazy pop up and a line out to the warning track. It's an oversimplification, and an important piece of the data is missing. In accepting this as true, we must also accept that luck plays some part in whether a ball in play becomes a hit or an out.
There will be more, post jump, so come along.
Another beef I have with more "traditional" metrics is that marks are assigned to an individual player based on a team accomplishment. For example, scoring a run (in any way other than a solo home run) is a team effort. You can't drive in a run with a single if there is no one in scoring position. And you can't score from a single if there is no one behind you to deliver a base hit. Consequently, runs and RBI more accurately reflect the situations in which a hitter is placed than their actual ability. Guys at the top of the lineup are going to have a lot of runs, and guys in the middle of the lineup are going to have a lot of RBI. It's a silly way to compare players.
So, I don't like that traditional metrics focus more on results than on actual performance and skill, and I think there is a degree of luck involved in all balls in play. If I were a politician, I would stop right here; however, as none of you voted for me, I will actually try and give some solutions and explanations:
The first thing that I would like to stress is the use of On Base Percentage (OBP) instead of traditional batting average. Batting average is meant to tell you how often a batter succeeds at the plate, yet a walk, which is a significant victory for the hitter, is treated as a draw (no hit, no AB). However, drawing a walk does not require the ball to fall in between fielders, does not require a swing, and requires the pitcher to throw at least four pitches. As such, drawing walks is a massively important skill, and should be reflected in the simple "how often do you succeed" ratio. For example, Alex Rodriguez usually hits at or below .300, and Robinson Cano usually hits above .300, but because A-Rod walks and Cano doesn't, A-Rod succeeds in a higher percentage of his plate appearances. And finally, the easiest way to report this in context is with a player's "triple slash" of Batting Average / On Base Percentage / Slugging Percentage.
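To put some numbers on the batting average vs. OBP gap, here is a rough Python sketch. The OBP formula is the standard one, (H + BB + HBP) / (AB + BB + HBP + SF), which isn't spelled out above, and both stat lines are made up for illustration; they are not A-Rod's or Cano's real numbers.

```python
def batting_average(hits, at_bats):
    """Hits per at bat. Walks count for nothing here."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: times on base divided by (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances


# Hypothetical season lines, invented purely for illustration.
patient_hitter = {"hits": 153, "at_bats": 520, "walks": 85,
                  "hit_by_pitch": 5, "sac_flies": 5}
free_swinger = {"hits": 186, "at_bats": 600, "walks": 25,
                "hit_by_pitch": 5, "sac_flies": 5}

for name, line in (("patient", patient_hitter), ("free swinger", free_swinger)):
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

The free swinger wins on batting average, but the patient hitter reaches base in a clearly higher share of his trips to the plate, which is the whole argument for OBP.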
Now, to look at balls that are actually put into play. The things that I look at before anything else are LD%, GB%, FB%, and K%. These "contact peripherals" give you a very good idea of how a hitter's at bats are distributed among line drives, ground balls, fly balls, and strikeouts, and each one points to different hitting characteristics. Obviously, line drives are excellent for a hitter, and anyone with a line drive rate in excess of 20% is considered to be making great contact at the plate. Line drive percentage also gives rise to the idea of "ball in play luck," centered around Batting Average on Balls In Play (BABIP).
The general rule of thumb is that BABIP ~ LD% + .120 (so if you are hitting 20% LD, you would expect a .320 BABIP), and it should give you a good idea of who is having balls fall in for them, and who isn't. For example, Curtis Granderson is hitting line drives at a 25.7% rate (so, using the rule above, we would expect a .377 BABIP); however, his BABIP is currently at .280. Therefore, if Granderson continues to make solid contact at the same rate, we can expect positive regression.
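The rule of thumb is simple enough to sketch in a few lines of Python. The Granderson figures are the ones quoted above; the BABIP formula itself, (H - HR) / (AB - K - HR + SF), is the standard definition and isn't given in this piece, and the function names are just my own labels.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Standard BABIP: hits that stayed in the park per ball put in play."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)


def expected_babip(line_drive_rate, offset=0.120):
    """Rule of thumb from the text: BABIP ~ LD% + .120 (LD% as a fraction)."""
    return line_drive_rate + offset


def babip_gap(actual_babip, line_drive_rate):
    """Negative means fewer balls are falling in than the rule suggests."""
    return actual_babip - expected_babip(line_drive_rate)


# Granderson's numbers as quoted above: 25.7% line drives, .280 BABIP.
granderson_expected = expected_babip(0.257)
granderson_gap = babip_gap(0.280, 0.257)
print(f"expected BABIP {granderson_expected:.3f}, gap {granderson_gap:+.3f}")
```

A gap that far below zero is what you'd want to see before predicting a rebound, though it's only a rule of thumb and not a guarantee.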
Some metrics to consider:
On Base Percentage Plus Slugging Percentage (OPS) - OPS is probably the simplest of the advanced statistics. By simply adding the rate at which a batter gets on base to the number of total bases per at bat, you get a basic understanding of worth at the plate. It is by far the simplest way of giving extra weight to extra base hits.
Gross Production Average (GPA) - One of the main shortcomings of OPS is that a point of OBP and a point of SLG are considered equal. They are not. As the average OBP is considerably lower than the average SLG, each point of OBP must be weighted more. GPA is calculated as (1.8*OBP + SLG)/4 in order to give more weight to the OBP term.
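Here's a quick Python sketch of both formulas, using two invented hitters with identical OPS to show where GPA disagrees. The slugging helper uses the standard total-bases-per-at-bat definition, and the player lines are hypothetical.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp, slg):
    """On base plus slugging: one point of each counts the same."""
    return obp + slg


def gpa(obp, slg):
    """Gross Production Average as given above: (1.8*OBP + SLG) / 4."""
    return (1.8 * obp + slg) / 4


# Two hypothetical hitters with the same .800 OPS.
on_base_guy = {"obp": 0.400, "slg": 0.400}
slugger = {"obp": 0.300, "slg": 0.500}

for name, line in (("on-base guy", on_base_guy), ("slugger", slugger)):
    print(f"{name}: OPS {ops(**line):.3f}, GPA {gpa(**line):.3f}")
```

OPS calls these two a tie, while GPA prefers the hitter who gets on base more often. Dividing by 4 also puts GPA on roughly the same scale as batting average, which makes it easier to eyeball.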
More advanced metrics such as wOBA and wRC+ make use of other offensive feats, such as stealing bases, and assign different weights to different situational batting events. They are probably the most trustworthy if you want an "all inclusive" comparison of batting value.
Another attempt at an all inclusive measurement of value is Wins Above Replacement (WAR). WAR uses wOBA and Ultimate Zone Rating (UZR) in an attempt to determine the total value of a player, both offensively and defensively, over a league average AAA level backup. In other words, how many more wins would the Yankees have with Alex Rodriguez playing 3rd base instead of the average AAA 3rd baseman?
Well, that's about all that I've got as far as a "quick overview" goes. If this is something of interest, there are worlds of information available, and FanGraphs has become an expanding powerhouse of knowledge and data. The thing to remember with this type of analysis is that it's never going to be a perfect system, but hopefully, we can give ourselves better analytical tools to objectively evaluate and compare the skills and performances of individual players.