Baseball Statistics and Acronyms Explained
Mood Music - The Circle of Life by Elton John
Baseball analysis is fraught with acronyms, abbreviations, numbers, and statistics. This post is intended to be a cheat sheet and a reference, allowing the average PSAer to make more statistically informed posts, as well as decipher the cryptic messages of others. I will keep a link to this post in my signature, and hopefully it will be helpful in our future discussions of Yankees baseball.
(Batting Average / On Base Percentage / Slugging Percentage) The simplest way to gauge a batter's contribution is the "triple slash" of batting average, on base percentage, and slugging percentage. Batting average measures hits per at bat, on base percentage measures how often a player reaches base per plate appearance, and slugging percentage measures "total bases" per at bat. While each of these statistics has shortcomings and cannot be viewed in a vacuum, the triple slash is a simple and accurate way to gauge the strengths and weaknesses of a hitter.
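As a rough sketch, the triple slash can be computed from basic counting stats. The function name and the sample season line are made up for illustration; the formulas are the standard ones (sacrifice flies count in the OBP denominator).

```python
def triple_slash(ab, h, doubles, triples, hr, bb, hbp, sf):
    """Return (AVG, OBP, SLG) from basic counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab                                   # hits per at bat
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)    # times on base per plate appearance
    slg = total_bases / ab                         # total bases per at bat
    return avg, obp, slg

# Hypothetical season line, invented for this example.
avg, obp, slg = triple_slash(ab=500, h=150, doubles=30, triples=5, hr=20, bb=60, hbp=5, sf=5)
print(f"{avg:.3f} / {obp:.3f} / {slg:.3f}")  # prints 0.300 / 0.377 / 0.500
```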
(On Base Percentage Plus Slugging Percentage) OPS is the most basic composite stat. It combines two abilities of a batter (the ability to get on base and the ability to hit for power) into one number, an indicator of overall value. OPS, and the understanding of the importance of OBP and SLG, led to many of the subsequent breakthroughs in analysis. Its main shortcoming is that it gives equal weight to OBP and SLG. While this issue is addressed in other statistics, OPS is an excellent way to quickly judge the overall offensive contribution of a player. OPS+ is a normalized version of OPS, with 100 being exactly league average.
(Isolated Power) ISO is a measure of a hitter's ability to record extra bases. It is slugging percentage minus batting average, which answers the question: how many extra bases does a player average per at bat? Players with good speed and/or power will record high ISO values, and ISO is often referenced when discussing the "power potential" of a minor league prospect.
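For illustration, OPS, a simplified OPS+, and ISO can each be written in a line or two. ISO is conventionally SLG minus AVG, i.e. extra bases per at bat. The OPS+ here uses the usual league-relative form but leaves out the park adjustment that published OPS+ includes, so treat it as an approximation; the function names and league averages are assumptions.

```python
def ops(obp, slg):
    """On base plus slugging."""
    return obp + slg

def iso(slg, avg):
    """Isolated power: extra bases per at bat."""
    return slg - avg

def ops_plus(obp, slg, lg_obp, lg_slg):
    """Simplified OPS+ (no park adjustment); 100 is league average."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

# Hypothetical hitter and made-up league averages.
print(round(ops(0.377, 0.500), 3))
print(round(iso(0.500, 0.300), 3))
print(round(ops_plus(0.377, 0.500, lg_obp=0.330, lg_slg=0.410)))
```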

(Weighted On Base Average) wOBA assigns a linear weight to each of the possible outcomes of a plate appearance. (In the above formula, NIBB = Non intentional walk, RBOE = Reached base on error) The coefficients of each outcome are determined by their relative correlation to runs being scored as a result of that event. wOBA is one of the most inclusive and revealing statistics for batting analysis. (More)
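A minimal sketch of the linear-weights idea follows. The weights are the commonly cited values from the original published version of wOBA and should be treated as assumptions, since the weights are recomputed from season to season; the dictionary keys and the sample line are invented for the example.

```python
# Assumed weights (commonly cited original wOBA values); real weights vary by season.
WOBA_WEIGHTS = {
    "nibb": 0.72,    # non-intentional walk
    "hbp": 0.75,     # hit by pitch
    "single": 0.90,
    "rboe": 0.92,    # reached base on error
    "double": 1.24,
    "triple": 1.56,
    "hr": 1.95,
}

def woba(events, pa):
    """events maps an outcome name to its count; pa is plate appearances."""
    weighted = sum(WOBA_WEIGHTS[name] * count for name, count in events.items())
    return weighted / pa

# Hypothetical season line.
line = {"nibb": 55, "hbp": 5, "single": 95, "rboe": 5, "double": 30, "triple": 5, "hr": 20}
print(round(woba(line, pa=600), 3))
```

Every outcome not listed (outs, mostly) contributes zero, which is why the result reads on roughly the same scale as OBP.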
(Weighted Runs Created) wRC+ uses the linear weights of wOBA to determine the runs created by a specific batter, and then is park and league adjusted, and normalized to 100. In much the same way as OPS+ is a normalized version of OPS, wRC+ gives you a more detailed way to compare a batter's contribution to the league average. (More)
(Batting Average on Balls in Play) BABIP measures the rate at which batted balls land for hits in the field of play. As any ball in play can result in anything from a triple play to an inside-the-park home run, there is great variance in the result of batting a ball. Also, as this is not a direct result of contact or skill, but merely the trajectory of the ball and the positioning of the defense, BABIP is often used as a measure of a player's luck. BABIP in and of itself does not signal whether a player is getting lucky or unlucky, or whether a level of play is sustainable. When combined with contact ratios (LD%, GB%, FB%, HR/FB), however, these things can be determined with more certainty. The general rule of thumb is that BABIP ~ LD% + .120, and a player's hitting chart can be used to estimate xBABIP (expected BABIP) as another way to judge ball in play luck.
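Here is a small sketch of BABIP and the LD% + .120 rule of thumb. The BABIP formula is the standard one (home runs and strikeouts are removed because they are not balls in play); the function names and numbers are illustrative only, and a commenter below notes that newer xBABIP estimates track actual BABIP better than this rule.

```python
def babip(h, hr, ab, k, sf):
    """Hits on balls in play divided by balls in play."""
    return (h - hr) / (ab - k - hr + sf)

def rule_of_thumb_babip(ld_pct):
    """Rough expected BABIP from line drive rate (as a fraction, e.g. 0.20)."""
    return ld_pct + 0.120

# Hypothetical hitter: compare actual BABIP with the rule-of-thumb estimate.
actual = babip(h=150, hr=20, ab=500, k=100, sf=5)
expected = rule_of_thumb_babip(0.20)
print(round(actual, 3), round(expected, 3))
```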
(Earned Run Average) ERA is a measure of how many earned runs a pitcher gives up per nine innings, that is, relative to how many outs their team is able to record with them on the mound. Like a triple slash, ERA can be the simplest way to measure the overall effectiveness of a pitcher. Its main shortcoming is that it depends on the ball park as well as on the aptitude of the pitcher's defense.
(Walks and Hits per Inning Pitched) WHIP is a measure of how many baserunners a pitcher allows per inning pitched. As the pitching counterpart of OBP, it allows you to judge the ability of opposing batters to reach base. As with OBP, the reason that WHIP shouldn't be used as a standalone metric is that it does not give adequate weight to extra base hits.
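ERA and WHIP are both simple rates over innings pitched. The sketch below takes outs recorded rather than innings, which sidesteps the "200.1 innings" box score notation; that choice and the function names are mine, not part of any official definition.

```python
def era(earned_runs, outs):
    """Earned runs per nine innings (27 outs)."""
    return 27 * earned_runs / outs

def whip(walks, hits, outs):
    """Walks plus hits per inning (3 outs)."""
    return 3 * (walks + hits) / outs

# Hypothetical pitcher: 200 innings = 600 outs.
print(round(era(70, 600), 2))        # prints 3.15
print(round(whip(50, 180, 600), 2))  # prints 1.15
```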
(Fielding Independent Pitching) As referenced earlier, one of the main shortcomings of ERA is that the play of the defense is outside the control of the pitcher. DIPS (Defense Independent Pitching Statistics) were created to isolate the pitcher's contribution to the prevention of runs, so only plays which cannot be affected by the defense are considered (home runs, walks, and strikeouts). The modifier of 3.10 (sometimes 3.08 or 3.20 are used) puts FIP on the same scale as ERA, meaning that a pitcher with a 3.50 FIP would be expected to have a 3.50 ERA if he played with a league average defense.
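As a sketch, FIP in its commonly published form weights home runs by 13, walks by 3, and strikeouts by -2 per inning, then adds the scaling constant. Those coefficients are the standard ones as far as I know; some versions also add hit batsmen to walks, which is left as an optional argument here. The function name and sample numbers are hypothetical.

```python
def fip(hr, bb, k, ip, hbp=0, constant=3.10):
    """Fielding Independent Pitching; constant puts it on the ERA scale."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Hypothetical pitcher: 20 HR, 50 BB, 180 K in 200 innings.
print(round(fip(hr=20, bb=50, k=180, ip=200), 2))  # prints 3.35
```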
(Expected Fielding Independent Pitching) xFIP accounts for differences in park size and luck in home run rates by standardizing the home run term. It replaces a pitcher's actual home runs with the number expected if the league-average 10.6% of his fly balls carried over the fence, which makes xFIP park and luck neutral. It can therefore address some of the major concerns about the reliability and predictive nature of ERA. (More)
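xFIP can be sketched the same way as FIP, with expected home runs (fly balls times the league HR/FB rate) swapped in for actual home runs. The 13/3/-2 coefficients are the usual FIP ones and the 10.6% rate and 3.10 constant come from the text; the function name and inputs are hypothetical.

```python
def xfip(fb, bb, k, ip, lg_hr_per_fb=0.106, constant=3.10):
    """FIP with home runs replaced by fly balls times league HR/FB."""
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * bb - 2 * k) / ip + constant

# Hypothetical pitcher: 200 fly balls, 50 BB, 180 K in 200 innings.
print(round(xfip(fb=200, bb=50, k=180, ip=200), 2))
```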
(True ERA) tERA is an exciting statistic on the frontier of baseball analysis. It uses the speed and trajectory of batted balls (using Hit F/X to track the flight of the ball) to determine the expected outcome of each ball in play. This concept and technology are very new, but once all of the kinks get worked out, this could be a very exciting way to measure what a pitcher's ERA should be. (More)
(Ultimate Zone Rating) UZR is a measure of a defensive player's ability to make plays within certain areas of the field. In order to calculate UZR, the field is split into 78 zones. The player's ability to make plays in each of these zones, in comparison to other fielders at the same position, results in positive and negative run values. UZR is said to need 3 years of data in order to have an adequate sample size, and its reliability does not extend to catcher or first base. It can also be normalized to 150 games with UZR/150, which gives a rate instead of a counting stat. While defense is the hardest to quantify and defensive metrics are the least predictive, UZR provides a sizeable improvement over fielding percentage, gold gloves, or the "eye test." (More and More)

(Wins Above Replacement) WAR is a measurement of how many extra wins a team would be expected to gain as a result of a single player. Each player is compared against a "replacement level" player, who has the talent of an average AAA player or a cheap journeyman signing that is available to any team. From the offensive and defensive contributions of a position player, or from the FIP of a pitcher, a number of runs can be assigned to an individual player. These runs are then converted to a number of wins. (More)
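The last step, turning runs into wins, can be sketched with the rule of thumb mentioned in the comments: about 10 runs of run differential equals one win. The function name is hypothetical, and computing the runs above replacement themselves (replacement level, positional adjustments) is outside this sketch.

```python
def wins_above_replacement(runs_above_replacement, runs_per_win=10.0):
    """Convert runs above replacement to wins using ~10 runs per win."""
    return runs_above_replacement / runs_per_win

# A hypothetical player worth 45 runs above replacement.
print(wins_above_replacement(45))  # prints 4.5
```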
Comments
Rec'd.
Can’t get enough of learning more about these stats. Yet this is the 1st comment. People need to read!
"Don't give up, don't ever give up."
nerds
So I was sitting on the couch, watching Brief Encounter...
also,
rec’d
So I was sitting on the couch, watching Brief Encounter...
by Brian5517209 on Dec 23, 2010 9:50 PM EST
You only posted this to make yourself sound smart.
Good stuff.
Yes, well when I see five weirdo's dressed in togas stabbing a guy in the middle of the park in full view of a hundred people, I shoot the bastards, that's my policy.
Wins is Runs Above Replacement divided by 10, because on average, a swing of 10 runs in runs scored – runs allowed over the course of a season for a team results in one extra win.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
Yeah, the formula for WAR is too complicated to post here. You have to calculate replacement level first, which changes from year to year, then there are positional adjustments that reward playing a tough position, and so on. I don’t even know the formula, I just know how it works so I can give a basic description of what it does.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
replacement level isn’t just the statistical averages of the players at each position? or even the middle ranking of all players at a position?
No, because that would be the average player.
It doesn’t make a difference when determining relative value, but it does with absolute value. An average player at a position is pretty valuable. The point of replacement level is it’s trying to determine exactly how valuable an average player (or any player) is. Even below-average players still help teams as long as they’re above a certain level, and they are worth something. When a player starts hurting their team is when they’re worse than whatever is easily available.
Basically, it’s the idea of opportunity cost. Where is the level that a team would be unwilling to pay anything for a player because similar players are easily available at min salary.
Money!
Mike Marra is the worst Division 1 Starter in college basketball
by TheRealSlimShady on Dec 23, 2010 11:49 PM EST
A few things:
You forgot to include base running in wOBA. +SB•.25 – CS•.5 in the numerator. Also, the weights aren’t correlations, it would be impossible to have a result higher than one, they are the average number of runs above an out (and then scaled to look like OBP) each event will get you, found from parsing years of data.
The LD% + .120 equation for xbabip is obsolete. The correlation for that to actual babip is only 18%, but the new xbabip calculations have a correlation of 59%. Check here to read more about it.
Really good read. I’m glad you didn’t go into more of the defensive stats, because I was planning on writing an article about those soon, lol.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
Too much math
The more advanced stuff still doesn’t make much sense to me. I understand what it’s meant to show but I can’t see how it shows it. It just looks like a bunch of numbers to me
For wOBA, just think, “How many runs is an event worth, on average?” For example, if I hit a single instead of making an out, how many more runs, on average, am I getting for my team? The answer is .9 runs. So just think of the numbers in wOBA as run values.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
Strictly speaking, the most accurate way to think of it is how many runs per plate appearance you get for your team on offense. That’s basically what wOBA says.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
Weird thing, but wOBA isn't park weighted
So even though I personally prefer looking at wOBA, wRC+ is a more accurate stat.
by pkyankeefan on Dec 24, 2010 4:03 AM EST via mobile
i was gonna say something sarcastic
but
spooky language pretty much sums everything up.
I believe in the Church of Baseball
Free FreeBradshaw!
by Frank Campagnola on Dec 23, 2010 10:07 PM EST
Great post
I was unaware of tERA. Nice read.
Gardner for President.
by McDaniel on Dec 23, 2010 10:11 PM EST via mobile
does anybody know how they do the Pythagorean W/L record?
I know they post that record on RAB
This is a very good article…think about submitting it to something bigger?
Section 203 Row 15 Seat 1
Think about submitting it to something bigger
What’s bigger than Pinstripe Alley?
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
Yeah, most sites inclined towards stats already have their own primers
A lot of the SBN sites do, and so do Fangraphs, B-Ref, etc. This is pretty much the right audience for this post.
by pkyankeefan on Dec 24, 2010 3:59 AM EST via mobile
The exponent used to be 2 (hence the Pythagorean name), but they've switched it to 1.83 because it's more accurate.
Win% = (Runs Scored)^1.83 / ((Runs Scored)^1.83 + (Runs Allowed)^1.83)
The Yankees’ Pythag record for last year was
Runs Scored: 847
Runs Against: 693
Pythagorean Record = 0.5908 * 162 = 95.7 wins
So we were right around where we were expected to be, since our actual record was also 95-67. Tampa’s Pythagorean record was also 96-66.
by pkyankeefan on Dec 24, 2010 12:48 AM EST
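The Pythagorean calculation in the comment above can be reproduced in a few lines. This is only a sketch: the function name is made up, and the 1.83 exponent and the 847/693 run totals are taken straight from that comment.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

pct = pythag_win_pct(847, 693)
print(round(pct, 4))        # prints 0.5908
print(round(pct * 162, 1))  # prints 95.7
```

Setting exponent=2 gives the original "Pythagorean" version of the formula.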
Helped a lot
I am big on stats but I’m a lot better with football stats so this helped me brush up on baseball stats thanks man
by Kmillz2525 on Dec 23, 2010 10:32 PM EST via mobile
I think they (fangraphs) actually change the constant on FIP/xFIP yearly
based on some sort of a rolling average. Correct me if I’m wrong though.
Also, if you mention tERA you’d probably also want to mention SIERA. Though I personally don’t think that either is particularly useful. In large sample sizes, use FIP. In small sample sizes, use xFIP.
I prefer tERA over FIP. It’s the same thing but with batted ball profiles added.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
The problem is that batted ball profiles aren't really that reliable.
You could say that ERA is the same thing as FIP but with batted ball profiles, LOB%, and BABIP added. But as we know, BABIP isn’t very reliable year-to-year, so it’s better just to cut that out, even if we acknowledge that BABIP is a pitcher skill.
Or the reason xFIP is popular is because it cuts out HR/FB%. But we know that HR/FB% is a pitcher skill, certain pitchers have the ability to keep it low or high. But it isn’t consistent, so in smaller sample sizes it’s more useful to just cut it out (or really, regress it completely to league average).
All these arguments apply to tERA vs. FIP vs. xFIP as well. tERA does take into account batted ball profiles, but those batted ball profiles are also pretty inconsistent (really, the inconsistent one is LD%), and also can have some systemic biases due to human classification error. But they are also obviously pitcher skills. Since they’re inconsistent, the question is whether you should regress them or not, and how much. tERA does no (I think?) regression, while FIP regresses them a lot (keep in mind that different batted ball profiles are accounted for partially due to home runs). xFIP takes a different approach and regresses HR/FB% all the way, so that you’re taking a more direct measure of batted ball profile (FB%), but a totally regressed measure of HR/FB%, both of which are pitcher skills.
So it’s not enough to say that tERA is simply FIP with batted ball profiles added. More information is not always more useful. It depends on the data you have and what you want to do with it. For me, I generally use xFIP for 1-year samples and FIP for 3+ year samples for starters. While for relievers, I pretty much randomly choose; they have smaller sample sizes but I think they also have larger true variations in HR/FB%.
Line drive rates are luck driven, but groundball rates are not. They are pretty close, but tERA correlates slightly better to ERA.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
I actually haven't seen any data on that.
Could you link me to whoever did that work?
And even if it is, I wouldn’t be surprised if they correlated better. Mainly for the same reasons that ERA is going to correlate better with RA than any of the metrics; LD% has a huge effect on ERA, even if it isn’t a very consistent skill, therefore tERA incorporates information that affects ERA, even if that information is inconsistent.
I would be more interested in its predictive value, which I don’t know. I think recently someone showed on The Book Blog that, on a 3-year sample size, RA (followed by FIP, xFIP, and tERA) was the best predictor of RA, but that it was mostly the result of pitchers staying with the same teams/parks. When looking at pitchers who switched teams, xFIP was the best predictor, followed by RA, FIP, and tERA. So at least in this work, tERA did not fare well at all. It was only a good predictor of RA when the pitcher stayed with his team, but even then it fell short of just using pure RA.
Well, they are different. tERA is not a “what will this guy do in the future” stat, it’s a “how should this guy have done this year” stat. FIP also says what you did this year, but xFIP is different from those two because it attempts to predict your FIP for following years. SIERA is a better predictor than xFIP, though.
It comes to this: if I want to analyze how good a pitcher was this year, I look at tERA. The difference of that and his ERA will also tell you how much help he got from his defense, and how good he was at stranding hitters. If I want to know how good a pitcher will be next year, I would look at SIERA. And btw, the correlations showing that tERA is slightly better than FIP can be found on the articles discussing SIERA on baseball prospectus. They found that tERA was the best estimator of current year performance, but SIERA was the best estimator of future performance.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
I've been wanting folks to do this at OTM for awhile now.
Swell.
Now I have to go to a @#$%ing Yankees blog to figure out what Sox fans are saying.
Galactus does as he pleases. Because Galactus is drunk.
@#$%ing Twit: @blogtard
Lol, that's hilarious
But since we’re all being fair, here’s an option for you to go to a Rays site for another opinion. To be honest, that primer is less intuitive than Duggan’s, but has a few different things (like minimum sample sizes) included.
http://www.draysbay.com/2010/1/28/1274374/the-draysbay-stats-guide-2-0
So there. Now you can go to a Yankees AND Rays site to figure out what people at OTM are saying.
by pkyankeefan on Dec 24, 2010 8:05 AM EST via mobile
LOOGY = Lefty One Out GuY. We didn’t make that up here, though
GGBG = Gritty Gutsy Brett Gardner. I don’t think that was made up here either, though.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
I believe it's Gritty Gutty Brett Gardner.
Same idea though.
People ask me what I do in winter when there's no baseball. I'll tell you what I do. I stare out the window and wait for spring.
by WhatwouldJeterdo on Dec 24, 2010 9:20 PM EST
I don’t like the word gutty. 99% of the time, when people say the word gutty, they really mean gutsy.
Russell Martin is just like the Jewish Pharisees, trying to keep Jesus down.
That's a lot less cool.
People ask me what I do in winter when there's no baseball. I'll tell you what I do. I stare out the window and wait for spring.
by WhatwouldJeterdo on Dec 25, 2010 5:34 PM EST
Not PA but funny
My daughter is a Cubs fan, and one they used for Ryan Theriot (before he was traded) was TOOTBLAN = Thrown Out On The Basepaths Like A Nincompoop. She even had a custom jersey made with his number and TOOTBLAN for the name.
"Screech, you CAN'T elope!"
"Who are you calling a cantaloupe, you melonhead?"
I'm dyslectic and in need of spell check on here
This is baseball and all these stats are just TMI…Try this: catch ball, throw ball, hit ball run!

