Using the Saber For Good: Batting
Mood Music - Time To Say Goodbye by Andrea Bocelli and Sarah Brightman
Well, it's been a while since I've tried to do something impossible, so I'm going to try to write a few general pieces on the field of sabermetrics and advanced statistics. Part of what makes this challenging is the extremely wide range of interest in, and knowledge of, advanced statistics within the community.
I know for a fact that some of you just couldn't care less about "fancy new age acronyms," so it's going to be almost impossible for me to write this without sounding preachy or condescending. So, let me assure you that my sole intention is to inform, and if this information makes you more or less inclined to "believe" in sabermetrics, that's up to you.
If I had to give a blanket definition of the entire field of sabermetrics, I would call it an attempt to isolate variables and to connect those variables to certain outcomes. One of the main examples: hitting line drives is good, and will lead to hits, runs, and victories. Not earth-shattering, I know, but still worth considering.
Now, as another aside, before we get into some of the "meat" of this piece, I'd like to make another point. The purpose of any metric is to put a numerical value on past performance, both to catalog that performance, and to give you data to predict future performance. For example, a .330 hitter is better than a .200 hitter, and the .330 hitter is more likely to get a hit in the future.
However, is that the entire story? Who did a better job at the plate, a batter who hit a bloop single off the end of the bat, or a batter who hit a rocket right at the center fielder? Traditional metrics only have room to record the results of hitting, and that turns a bloop single and a line drive single into the same statistical event. The same thing can be said about a lazy pop up and a line out to the warning track. It's an oversimplification, and an important piece of the data is missing. In accepting this as true, we must also accept that luck plays some part in whether a ball in play becomes a hit or an out.
There will be more, post jump, so come along.
Another beef I have with more "traditional" metrics is that marks are assigned to an individual player based on a team accomplishment. For example, scoring a run (in any way other than a solo home run) is a team effort. You can't drive in a run with a single if there is no one in scoring position. And you can't score after hitting a single if there is no one behind you to deliver a base hit. Consequently, runs and RBI more accurately reflect the situations in which a hitter is placed than their actual ability. Guys at the top of the lineup are going to have a lot of runs, and guys in the middle of the lineup are going to have a lot of RBI. It's a silly way to compare players.
So, I don't like that traditional metrics focus more on results than on actual performance and skill, and there is a degree of luck involved in all balls in play. If I were a politician, I would stop right here; however, as none of you voted for me, I will actually try to give some solutions and explanation:
The first thing that I would like to stress is the use of On Base Percentage (OBP) instead of traditional batting average. Batting average is designed to tell you the rate at which a batter reaches base, yet a walk, which is a significant victory for the hitter, is considered a draw (no hit, no AB). However, drawing a walk does not require the ball to fall in between fielders, does not require a swing, and requires the pitcher to throw at least four pitches. As such, drawing walks is a massively important skill, and should be reflected in the simple "how often do you succeed" ratio. For example, Alex Rodriguez usually hits at or below .300, and Robinson Cano usually hits above .300, but because A-Rod walks and Cano doesn't, A-Rod succeeds in a higher percentage of his plate appearances. And finally, the easiest way to report this in context is with a player's "triple slash" of Batting Average / On Base Percentage / Slugging Percentage.
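To put some numbers on the walk argument, here is a minimal Python sketch. The two season lines are invented purely for illustration (they are not anyone's real stats), and the OBP formula is the standard one, which also counts hit-by-pitches and sacrifice flies, two things this post doesn't get into.

```python
def batting_average(hits, at_bats):
    """Hits per official at bat; a walk counts neither for nor against."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: times on base over the plate appearances that count toward it."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances


# Hypothetical season lines, made up for this example.
patient_hitter = {"hits": 150, "walks": 80, "hit_by_pitch": 5, "at_bats": 520, "sac_flies": 5}
free_swinger = {"hits": 170, "walks": 30, "hit_by_pitch": 3, "at_bats": 550, "sac_flies": 5}

for name, line in (("patient hitter", patient_hitter), ("free swinger", free_swinger)):
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

With these made-up lines, the free swinger wins on batting average (about .309 to .288), but the patient hitter reaches base more often (about .385 to .345), which is the same shape as the A-Rod and Cano comparison.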
Now, to look at balls that are actually put into play. The things that I look at before anything else are LD%, GB%, FB%, and K%. These "contact peripherals" give you a very good idea of the distribution of at bats that end in line drives, ground balls, fly balls, and strikeouts, and each points to different hitting characteristics. Obviously, line drives are excellent for a hitter, and a hitter with a line drive rate above 20% is considered to be making great contact at the plate. Line drive percentage also gives rise to the idea of "ball in play luck," centered around Batting Average on Balls In Play (BABIP).
The general rule of thumb is that BABIP ~ LD% + .120 (so if you are hitting 20% LD, you would expect a .320 BABIP), which should give you a good idea of who is having balls fall in for them, and who isn't. For example, Curtis Granderson is hitting line drives at a 25.7% rate (so, using the rule above, we would expect a .377 BABIP); however, his BABIP is currently at .280. Therefore, if Granderson continues to make solid contact at the same rate, we can expect positive regression.
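The rule of thumb is simple enough to sketch in a few lines of Python. The function and variable names below are my own for illustration; the only inputs are the .120 constant and the Granderson numbers quoted above.

```python
LD_TO_BABIP_OFFSET = 0.120  # rule-of-thumb constant from the text


def expected_babip(line_drive_rate):
    """Rule of thumb: expected BABIP is roughly LD% (as a decimal) plus .120."""
    return line_drive_rate + LD_TO_BABIP_OFFSET


def babip_gap(line_drive_rate, actual_babip):
    """Positive gap: fewer balls are falling in than the contact quality suggests."""
    return expected_babip(line_drive_rate) - actual_babip


# Granderson's numbers as quoted in the post: 25.7% LD, .280 BABIP.
granderson_expected = expected_babip(0.257)
granderson_gap = babip_gap(0.257, 0.280)
print(f"expected {granderson_expected:.3f}, gap {granderson_gap:+.3f}")
```

A gap of nearly 100 points is what "we can expect positive regression" means here, assuming the line drive rate holds up.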
Some metrics to consider:
On Base Percentage Plus Slugging Percentage (OPS) - OPS is probably the simplest of the advanced statistics. By simply adding the rate at which a batter gets on base to the number of total bases per at bat, you get a basic understanding of worth at the plate. It is by far the simplest way of giving extra weight to extra base hits.
Gross Production Average (GPA) - One of the main shortcomings of OPS is that a point of OBP and a point of SLG are considered equal. They are not. As the average OBP is considerably lower than the average SLG, each point of OBP must be weighted more heavily. GPA is calculated as (1.8*OBP + SLG)/4 in order to give more weight to the OBP term.
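To see what the extra weight on OBP does, here is a small Python sketch of both formulas. The two hitters are hypothetical, chosen so that they have the same OPS but get there in different ways.

```python
def ops(obp, slg):
    """On base plus slugging: the two rates simply added together."""
    return obp + slg


def gpa(obp, slg):
    """Gross Production Average as given above: (1.8*OBP + SLG) / 4."""
    return (1.8 * obp + slg) / 4


# Hypothetical hitters (OBP, SLG) with identical OPS.
on_base_guy = (0.400, 0.400)
slugger = (0.320, 0.480)

for name, (obp, slg) in (("on-base guy", on_base_guy), ("slugger", slugger)):
    print(f"{name}: OPS {ops(obp, slg):.3f}, GPA {gpa(obp, slg):.3f}")
```

Both hitters come out at an .800 OPS, but GPA rates the on-base guy higher (.280 to .264) because his production is tilted toward OBP.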
More advanced metrics, such as wOBA and wRC+, make use of other offensive feats, such as stealing bases, and assign different weights to different situational batting events. They are probably the most trustworthy if you want an "all inclusive" comparison of batting value.
Another attempt at an all inclusive measurement of value is Wins Above Replacement (WAR). WAR uses wOBA and Ultimate Zone Rating (UZR) in an attempt to determine the total value of a player, both offensively and defensively, over a league-average, AAA-level backup. Meaning, how many more wins would the Yankees have with Alex Rodriguez playing 3rd base instead of the average AAA 3rd baseman?
Well, that's about all that I've got as far as a "quick overview" goes. If this is something of interest, there are worlds of information available, and Fangraphs has become an expanding powerhouse of knowledge and data. The thing to remember with this type of analysis is that it's never going to be a perfect system, but hopefully, we can give ourselves better analytical tools to objectively evaluate and compare the skills and performances of individual players.
Comments
Thank you Duggan
I’ve actually learned quite a bit from this.
"WHO WOULD LEAD?! THE CLOWN?!"
by I'mGivingYouARaise on Jul 14, 2010 2:25 AM EDT
I feel like what you said about RBIs is very important.
So many people seem to live and die by the RBI, which seems unfair because it’s entirely dependent upon a person’s teammates. I actually like the fact that it shouldn’t matter as much. Sure, every RBI helps the team, but as far as using them for personal statistics, I feel like it shouldn’t matter as much as it seems to.
I had a general understanding of BABIP and how it is possible to be “very lucky” (see: Jackson, Austin), but I understand BABIP a little more now. Honestly, I think GPA still goes a bit over my head, and as far as the “advanced stats” go, well, I’d like to learn more and understand them better.
I enjoyed the post, Duggan, and I feel like I’ll learn a lot from the subsequent posts, too.
by WhatwouldJeterdo on Jul 14, 2010 2:33 AM EDT
GPA seems a little too arbitrary to worry about anyway
DB
by DukBudr on Jul 14, 2010 8:08 AM EDT via mobile
Cool, this actually helped a lot.
I still am a little iffy on the WAR, wOBA.
"We're only going to score 17 points?" - Tom Brady
"Well played, Mauer." - Guy from PS3 commercials
ah, thank you
lol
Nobody is smarter than the almighty Duggan
"Winning is the most important thing in my life, after breathing. Breathing first, winning next." -George Michael Steinbrenner III
by Chris McKeown on Jul 14, 2010 9:36 AM EDT
lol
by Chris McKeown on Jul 14, 2010 11:26 AM EDT
Good job, Duggan
Good overview.
It’s worth noting that wOBA’s value scale is the same as OBP, even though they’re calculated very differently. So, in the same way that a .320 OBP is mediocre, a .320 wOBA is mediocre; a .420 OBP is excellent, and so is a .420 wOBA.
"I am a man of great mental power." ~Alfonso Soriano
U must have invented these while sittn in ur mudders bassment.
Strikeouts are boring- Besides that, they're fascist. Throw some ground balls - it's more democratic.
Good explanation
of the “New Age Stats”. Even an old fart like me could follow along and learn something.
"I don't want one of those guys who'll drive in two but let in three every game." Casey Stengel
by tnredneckyankeesfan on Jul 14, 2010 9:25 AM EDT
As always, great stuff Duggan
It opens all of our eyes to the fact that there is a bigger picture than just AVG/HR/RBI/OBP/R
I still
think BABIP falls into the “luck is the residue of design”, possibly? maybe?
I am not a big fan of all the Y2K stats in general, just too much for my taste, but a good post.
His mother has a tattoo that reads, "Son".
Sharks have a week dedicated to HIM.
"It doesn't take more than one person, to talk to a woman.
Stay thirsty my friends."
Um…. I don’t understand what I just read
Join the Lacrosse community at http://www.theprolaxblog.blogspot.com/
"That place was for diehard sports fans. I only follow my team when they're in the playoffs" - Homer Simpson
by bestbostonsports on Jul 14, 2010 12:29 PM EDT
I never really liked the sabermetrics, and I get killed on OTM about it.
by bestbostonsports on Jul 14, 2010 12:31 PM EDT
I don’t know why you even respond if that’s all you have to say
by bestbostonsports on Jul 14, 2010 12:33 PM EDT
it's probably how he racks up comments so fast
"I'm looking at 600 as first base. I want to run right through it and use it as a platform and a spring board for more to come"- Alex Rodriguez
All good, but...
UZR is bogus. As a consequence, WAR is bogus.
UZR doesn’t take the defender’s decisions into account. Are they good at holding runners before making a throw? Did they get the lead runner, or the guy at first? Could they have turned a double play? Was there a shift on (certain managers will make their fielders look terrible in UZR)? Was there a play on? (New UZR tries to fix that a little, but it assumes the manager is making cookie cutter decisions about where to play people based on the runners and the outs, while the manager might decide to pull the infield in or out based on the score, or what he had for lunch.)
OBP, and slugging, are great. But when you start with a shaky, subjective metric, and you start deriving sub-metrics off of that, you’re going to be left with something made up of too many variables to have any predictive value. Then you end up looking at things like “Total Zone Runs” and decide that it’s a good idea to sign Marco Scutaro as your shortstop for run prevention.
While I sit firmly in the pro-sabermetric camp, I do hear what you are saying.
UZR isn’t a perfect metric, and I don’t think its creators or anybody who’s intellectually honest is going to say otherwise, but I think of it as a stepping stone in that direction. Despite its flaws, it does improve upon the statistics it was intended to replace – total chances, fielding percentage, and errors – and in that sense it isn’t bogus.
Signing Marco Scutaro based on his Total Zone Runs is better than signing him based on his fielding percentage, no?
+1, it is what it is, and as Kuri said, the uncertainty that you talk about is kind of built right into the definition.
UZR takes a three year sample size before it even becomes valid. What does that tell you? You need a lot of data and there is a lot of variance. However, it mops the floor with simpleton analysis like “Derek Jeter won a gold glove because his fielding percentage was high.”
Questions or thoughts? Email me at duggan2423(at)gmail(dot)com
Sure, but no team has ever signed a guy based on his fielding percentage alone.
Instead teams scout players.
I have a hard time believing that Theo sufficiently scouted Scutaro before signing him. He’s been in the AL East for a while, and we’ve all seen him play. We all know he’s not particularly impressive. So you have to wonder… Did Theo see him make some spectacular plays against the Sox, and then confirm his opinion using TZR/UZR/WAR?
Also, I don’t believe they would have signed him based on his fielding percentage. Unless they only looked at 2009, which was an uncharacteristically good year for him.
Good piece
I actually learned a good amount.
"He wasn't an astronaut, he was a tv comedian! And he was just using space travel as a metaphor for beating his wife!"
I don't know who IVAN256 is, but I'm with him.
I didn’t vote in the poll, because “Of limited use” was not a choice. As I have said here before, statistics derived from other statistics multiply uncertainty. I also have doubts about classifying batted balls as line drives or pop-ups. What about bloopers, squibbers, flares, etc??
by designatedquitter on Jul 14, 2010 1:28 PM EDT
This was very informative.
I kind of feel like I need to pull out my notebook and take notes for a pop quiz that will be given at a later date. But most of them were not as complicated as I had thought. Good job. NOT TL;DR, after all!
Don’t worry, the quiz is open book, and you get one life line.
Oh, that's a relief.
I will pencil you in as my phone-a-friend.
by CAyankeesfan on Jul 14, 2010 2:00 PM EDT
I’m fairly new
to the new statistics, but I’m becoming more and more informed day by day. This article is certainly insightful and definitely a good read. It’s funny how broadcasters throw around terms like “good RBI hitter” for these middle-of-the-lineup guys. But if you look at Carlos Pena, it just serves as proof that you should take traditional statistics with a grain of salt. Batting 4th in Tampa Bay, he drove in over 100 runs with a .220 batting average, but that didn’t necessarily mean he was a good RBI guy or a bad hitter; you can’t get a clear picture of what his season was like. Then you look at the almighty Cervelli, whose average with RISP is through the roof: would you call him a good RBI guy, even though, if he’s lucky, he may only get 65 this year? Rhetorical questions, obviously. But anyway, great read. You always do an awesome job, Lord Duggan. Keep it up; it helps get me through these long, boring early work days.
by ghandioncesaid on Jul 14, 2010 1:54 PM EDT via mobile
I have never been a fan of the "new age" stats (I know Duggan hates that term, lol)
But it certainly helps when someone like Duggan helps explain how they work. I doubt I will ever embrace these stats unless someone comes calling with an offer to GM a team but I do appreciate the time and work that Duggan puts into these great posts. Keep up the awesome work Duggan.

by 



















