The Communication of Baseball Statistics

T. Andrew Finn

“Pitching is 75% of baseball.” (Connie Mack)

“It is nearly impossible to know what is meant by this. However, to the extent that it is meaningful, it’s false.” (Bill James)


Connie Mack was recognized as one of the most experienced insiders in the history of baseball. Bill James never played, managed, or coached a single inning of minor or major league baseball. So who do you believe?

Connie Mack never volunteered any evidence to support his statement. He knew the game and that was enough for most people at the time. Bill James explains what he means. When he questions an old adage he follows a logic that says: first, what does the statement mean? Second, is it true, and if so, is it meaningful and relevant? Third, what else does it imply? (James, 1994). After noting that Connie Mack’s statement is relatively vague, James (1986) points out several consequences that should follow from the notion that pitching is the dominant component of baseball:
· Teams with the best pitching staffs should win the pennant 75% of the time (it doesn’t happen)
· Teams with the better starting pitcher should win the game 75% of the time (it doesn’t happen)
· Pitchers would be the highest paid players (they’re not)
· No team would trade a pitcher for a position player of the same quality (it happens all the time)
· There would be more variability in runs allowed than in runs scored (no, they are virtually the same every season)
· Pitchers would win the MVP awards most years (they do not, and did not even before the Cy Young Award was created)
· Teams would spend most of their time developing pitchers (they do not)
· Most draft choices would be pitchers (they are not)
· The pitcher would be the dominant force determining the outcome of each pitcher/batter confrontation (he is not – for reasons explained in detail by James)

A quote from James (1989), summarizing his view of what we know—and can know—about the relative importance of pitching, is instructive for its measured assessment of the issue:
“In short, none of it is true. We cannot say for certain that baseball is not 75 percent pitching, but we can say these two things: 1) that the statistical patterns which one would expect to manifest themselves in such an event do not occur, and 2) that while people may say that baseball is 75 percent pitching, no one acts in a manner consistent with this belief.” (p. 247)

This brief summary of James’ analysis of the relative importance of pitching to baseball is one small example of what is now known as sabermetrics—the systematic study of performance in baseball.

This paper discusses three main topics. The first is the manner in which the scientific method and statistics have been used to study performance in baseball in recent years. The second is a review of the changes in the way various mass media present the game to the public. The third is probably the most fun and interesting, at least to scholars with an interest in quantitative analyses: a summary of some of the analytical findings of sabermetricans and their implications for our knowledge of baseball.

Basic Principles of Sabermetrics

Sabermetricans subscribe to the same standards as any good researcher. And while they are not averse to examining the qualitative aspects of baseball performance, it should be noted that sabermetrics has a strong bias toward the goals of quantitative research:
· Observe and collect data about a phenomenon or event (in this case, baseball and baseball games)
· Examine the data for meaningful and relevant information and patterns
· Interpret the data to better understand the phenomenon or event.

The rationale for quantitative analysis and a search for patterns across large volumes of data is explained by Grabiner (1997):
“The statistics which are available in baseball are a collected record of observations. An individual fan, sportswriter, or even a player or manager will see most teams thirteen or fewer times during the year. His observations may be of some interest, but they are a small (and often biased) sample. In thirteen games, the difference between a great hitter and a poor hitter is just five hits; thus, if the observer happens to see a mediocre player's two best games of the season, he would get an incorrect impression of the player's ability. In contrast, a player's statistics are a record obtained from all of his games, as observed by the official scorers in the league. This is a much larger collection of observations, and it is converted to a form which can be easily understood; few fans could get a good idea of a player's batting average by watching his 600 plate appearances.” (p. 1)
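Grabiner's "five hits in thirteen games" figure can be checked with rough arithmetic. The sketch below is only an illustration: the four at-bats per game and the two batting averages (.320 for a "great" hitter, .220 for a "poor" one) are assumptions chosen here, not numbers taken from Grabiner.

```python
# Rough check of the "thirteen games, five hits" arithmetic.
# The at-bats per game and both batting averages are illustrative
# assumptions, not figures from Grabiner (1997).

def expected_hits(batting_average, games, at_bats_per_game=4):
    """Expected number of hits over a stretch of games."""
    return batting_average * games * at_bats_per_game

games_seen = 13
great = expected_hits(0.320, games_seen)  # assumed "great" hitter
poor = expected_hits(0.220, games_seen)   # assumed "poor" hitter
hit_gap = great - poor

print(f"Expected gap over {games_seen} games: {hit_gap:.1f} hits")
```

Under these assumptions the gap is about five hits across 52 at-bats, small enough that one or two unusually good games can hide it from an observer.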

Basic Assumptions of Sabermetrics

There are a few basic assumptions that guide the field of sabermetrics. While they may seem obvious, it is important to state them explicitly. First, the roots of team performance are to be found in individual performances. Second, the roots of team wins are to be found in the number of runs scored (and allowed). Third, the roots of runs scored are to be found in getting batters on base and advancing those runners (Thorn & Palmer, 1985, 1993). Each of these points is now discussed in more detail.

The roots of team performance. Team sports are primarily measured at the group level: they are first and foremost about team wins and losses, more than individual performance. But virtually every sport also has individual statistical categories that isolate different aspects of an individual’s game performance.

No sport has more such categories of data than baseball. In part this is due to the individual nature of hitting in baseball. In part it is due to the fact that the pace of the game allows for data collection. A baseball game is the only sporting event at which a substantial number of fans can be seen “scoring” the game. The official records kept about baseball may be more formally collected, but they are still a form of scoring—the coding of game events, as they occur, in text and symbols. Today relatively complete records are also being kept for other major team sports, but baseball clearly has the longest tradition and the most complete records of any sport.

One can analyze a number of dimensions of team performance, but ultimately it comes down to a coordinated collection of individual performances. While the whole may be greater than the sum of its parts, the parts give you a good idea what level of performance the whole can reach. There are numerous cases of underdog winners who rose above expectations by apparently making the sum much more than its parts. And there have been teams with a collection of first class players that disappoint. But by and large, teams appear to rise to the level justified by how much talent is spread across a variety of team roles. In other words, most of the variability in baseball performance is in the abilities at the level of the individual players.

The roots of wins and losses. Across relatively large samples, for example 10 full seasons of major league baseball, there is a high correlation between total wins and total runs scored. In fact, it turns out that the ratio of any team’s wins to losses in any season is nearly identical to the ratio of the square of its runs scored to the square of its runs allowed. James (1984) called this the “Pythagorean Theory.” While the idea that total wins (and losses) are related to total runs scored (and allowed) may seem obvious to most, once reassured that there is a consistent relationship over time, the sabermetrican can focus on factors related to scoring (and preventing) runs.
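The relationship just described can be written as W/L ≈ RS²/RA², which is equivalent to an expected winning percentage of RS² / (RS² + RA²). The following is a minimal sketch of that calculation. The function names, the 162-game default, and the sample run totals are assumptions made for illustration and do not come from James.

```python
# Sketch of the win/run relationship described in the text:
# wins / losses ~ runs_scored**2 / runs_allowed**2.
# Function names and the sample team are hypothetical.

def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def expected_wins(runs_scored, runs_allowed, games=162):
    """Expected wins over a season of `games` games (162 is assumed)."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed)

# A made-up team that scores 800 runs and allows 700.
pct = pythagorean_win_pct(800, 700)
print(f"Expected winning percentage: {pct:.3f}")
print(f"Expected wins in 162 games: {expected_wins(800, 700):.1f}")
```

A team that scores exactly as many runs as it allows comes out at .500, as one would expect.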

The roots of runs scored. At the most basic level there are two factors that contribute to offensive performance. The first is the ability to get batters on base. The second is the ability to advance runners. Sabermetricans begin with these two types of offensive performance, and examine them in the context of how many outs a player uses up while getting on base and advancing runners. Analyzing how well teams and team members accomplish these goals (and prevent their opponents from accomplishing them) is the primary focus of most sabermetric research.
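The source does not name specific measures at this point, but the two abilities are conventionally summarized by on-base percentage (getting on base) and slugging percentage (advancing runners). The sketch below uses the standard definitions of those two statistics. The class name, field names, sample batting line, and the simplified count of outs (at-bats minus hits) are assumptions for illustration.

```python
from dataclasses import dataclass

# Illustrative sketch; the class, its fields, and the sample line are
# hypothetical. The OBP and SLG formulas are the standard definitions.

@dataclass
class BattingLine:
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sacrifice_flies: int = 0

    @property
    def hits(self):
        return self.singles + self.doubles + self.triples + self.home_runs

    def on_base_pct(self):
        """Times on base divided by chances to reach base."""
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks
                   + self.hit_by_pitch + self.sacrifice_flies)
        return reached / chances

    def slugging_pct(self):
        """Total bases per at-bat."""
        total_bases = (self.singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats

    def outs_used(self):
        """Simplified: ignores outs made on the bases or on sacrifices."""
        return self.at_bats - self.hits

line = BattingLine(at_bats=500, singles=100, doubles=30,
                   triples=5, home_runs=15, walks=50)
print(f"OBP {line.on_base_pct():.3f}, SLG {line.slugging_pct():.3f}, "
      f"outs used {line.outs_used()}")
```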

The Effects of Sabermetrics

Baseball lore is full of quotes similar to the one above from Connie Mack that attempt to summarize different aspects of the game. Many have a kernel of truth, but virtually all of them oversimplify the game and even the specific issue at hand. In recent years, however, the rise of sabermetrics has coincided with greatly increased knowledge about the inner working of the game.

This increase in knowledge has, in turn, meant a major change in the way that baseball games are discussed, both among managers, sportswriters, agents, and other baseball professionals and in the presentation of baseball to the general public. The primary change in the public discourse about the games themselves, and the play of particular teams and players, has been an increasing reliance on statistical information and analysis.

This change can be seen in the coverage of baseball by all the traditional media: newspapers, magazines, books, radio, and television. The trend is especially noticeable when one examines the baseball section of modern bookstores. There have always been dozens of books on baseball. Traditionally these have covered non-statistical topics. But today there are numerous volumes with summary statistics and analyses in different mixes, including many with special sections on aiding picks in rotisserie, or fantasy league, baseball. In large part, the root of this change has been the application of the scientific method to baseball.

Science, Statistics, Patterns, and Context

In particular, two components of scientific research have helped shape the fan’s perception of the game today. The first is a long-standing principle of the physical sciences: the simple quantification of events in the game. Baseball has a long tradition of tracking relatively detailed statistics on performance. Even before computers, baseball was tracking a number of offensive and defensive categories that could be tallied simply and easily. Today, computers have greatly simplified compilation, analysis, and presentation of a wealth of statistical information. Not only is the raw data more readily available, but professionals and amateurs alike can use the computer to test hypotheses and search for meaningful patterns in the data.

The second principle of science that has been applied to provide a more complete understanding of the game today is a lesson that both the physical and social sciences have taken to heart only relatively recently: that events must be seen within the specific context in which they occur. Regardless of the phenomenon being studied it is much more common today than it was twenty years ago to see quantitative social science research, including communication research, evaluating events within the particular social context in which they occur. In the same way, there are several aspects of context that affect how we view baseball. There are at least three types of interesting baseball issues that are sensitive to variations in context.

Game to game situational contexts. Perhaps the most important impact on our ability to analyze game performance came with the recognition that the traditional box scores were lacking important contextual information. Ted Williams may have gone two for four in a game in which Whitey Ford gave up eight hits in seven innings. But the box scores of the time never indicated how Williams performed when he faced Ford. Similarly, season statistics provide detailed performance records for each individual, but not for specific pitcher/batter match-ups. Historically, other situational data have been neglected as well, for example switch hitting statistics, and offensive performance with different ball/strike counts or runners in different positions. In the past twenty years, however, there has been a quiet revolution in how games are scored and how data is stored.

Traditionally, the records of baseball games allowed one to tally dozens of variables about offensive and defensive performance but not their relationship to each other. The permanent record of any baseball game was a simple summary of events. But in the late 1970s the data collection of baseball (the scoring of games) was refined to collect all information within the context of the other variables of interest. Today, baseball records are a detailed database of information representing events in the context in which they occurred. This has implications for our general knowledge of the game and for the level of sophistication needed to conduct analyses and communicate summary information in meaningful ways. It also has implications for the level of complexity presented to the fans.
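The difference between a summary record and a contextual record can be sketched in code. In the hypothetical layout below, each plate appearance is stored as one event together with its context, so the same data can be broken down by pitcher/batter match-up or by base situation after the fact. The field names and the sample events are invented for illustration and do not reflect the format of any actual database.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical event record; field names and sample data are made up.

@dataclass(frozen=True)
class PlateAppearance:
    batter: str
    pitcher: str
    runners_on: bool
    is_at_bat: bool   # False for walks and similar outcomes
    is_hit: bool

def average_by(events, key):
    """Batting average grouped by any contextual key function."""
    tallies = defaultdict(lambda: [0, 0])  # key -> [hits, at_bats]
    for ev in events:
        if ev.is_at_bat:
            tallies[key(ev)][0] += ev.is_hit
            tallies[key(ev)][1] += 1
    return {k: hits / at_bats for k, (hits, at_bats) in tallies.items()}

events = [
    PlateAppearance("Batter A", "Pitcher X", False, True, True),
    PlateAppearance("Batter A", "Pitcher X", True, True, False),
    PlateAppearance("Batter A", "Pitcher Y", True, True, True),
    PlateAppearance("Batter A", "Pitcher Y", False, False, False),
    PlateAppearance("Batter A", "Pitcher X", False, True, False),
]

by_matchup = average_by(events, key=lambda ev: (ev.batter, ev.pitcher))
by_situation = average_by(events, key=lambda ev: ev.runners_on)
print(by_matchup)
print(by_situation)
```

A traditional box score would keep only the totals (two hits in four at-bats here), while the event list preserves who was pitching and what the base situation was for each of them.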

Park effects. Knowledgeable fans have known for some time that certain ballparks favor particular types of hitters or pitchers. Baseball professionals have known this since the game began. The position, contour, and height of the outfield fences, the amount of foul territory on the field, and patterns of wind and sunshine have long been known to have an effect on hitting and pitching statistics. More recently, the increase in the number of night games (which impair batters’ visual acuity), the invention of domed stadiums, and the use of artificial turf have provided further evidence that differences among ballparks continue to be a major variable in player performance. With the advent of major league baseball in Denver, the most recent park effect added to the mix is altitude above sea level. Never a serious variable among major league parks before the 1990s, the predictions of knowledgeable sabermetricans have come true with the offensive inflation (pun intended) caused by playing games at mile-high Coors Field in Denver. (This issue is discussed in more detail near the end of this essay.)

Comparisons across eras. Changes in how the game is played over the years account for differences across eras, and these changes include the prevailing rules, equipment, and strategies, as well as the evolution of ballparks. These variations result in changes in the seasonal league averages over time. Fans have argued for years whether Henry Aaron could have topped Babe Ruth in home runs if they had played in the same era, or whether any modern pitcher is as good as Cy Young. It is well known that the seasonal distributions of offensive and defensive production can fluctuate dramatically. There have been extreme “hitters’ years” and “pitchers’ years.” But the accomplishments of players from different eras can be compared by examining their rank among comparable players of their era across a number of relevant categories.
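One simple way to express "rank among comparable players of their era" is a percentile rank within each era's peer group. The sketch below is a minimal illustration of that idea, not a method attributed to any particular sabermetrican. The function name and the home run totals are made up.

```python
# Illustrative only: the function name and both sets of home run
# totals are invented to show the idea of ranking within an era.

def percentile_rank(value, peer_values):
    """Share of peers whose total is at or below `value`."""
    at_or_below = sum(1 for v in peer_values if v <= value)
    return at_or_below / len(peer_values)

low_power_era = [5, 8, 10, 12, 20]     # hypothetical peer totals
high_power_era = [20, 25, 30, 35, 40]  # hypothetical peer totals

rank_a = percentile_rank(20, low_power_era)   # 20 HR leads this era
rank_b = percentile_rank(30, high_power_era)  # 30 HR is mid-pack here
print(f"20 HR in the low-power era:  {rank_a:.0%} of peers at or below")
print(f"30 HR in the high-power era: {rank_b:.0%} of peers at or below")
```

In this toy example the smaller raw total is the more impressive one once each player is measured against his own contemporaries.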

There are additional examples of contextual information that provide interesting insights into other pieces of the research puzzle provided by baseball, such as the context of player age. For example, there are some relatively clear patterns in the rise and fall of performance across the span of the career of different types of players. To keep the task manageable, this article will limit its discussion to the three types of contextual analyses described above. Before addressing these topics, the next section reviews how improvements in data collection have made possible comparisons across a variety of contexts.

A Brief History of Baseball Record-Keeping

There are relatively detailed records of the majority of professional baseball games played in the last 100 years. However, while the handwritten scorecards of individual games may or may not contain relatively detailed information on the sequence of events in those games, the permanent and published records of baseball have simply aggregated offensive and defensive statistics by player, team, and season. Most of the contextual information that was recorded remains buried in handwritten records of the individual games. Some of this information has been lost forever or, in cases such as a player’s defensive range, was never collected in the first place.

Major league baseball recognizes dozens of official statistics. Most of these are offensive stats and are based on the performance of the individual batter. There are also some offensive statistics for runners, and defensive statistics for pitchers and fielders. In recent years, baseball has added a variety of new statistical categories, such as the save (1969) and the game-winning RBI (1980). In general, these attempts to refine our measures of performance have had limited value. In fact, the game-winning (or “victory-important”) RBI was abandoned a few years after it was invented. While such additions are intended to help determine the relative contribution of different team members, sabermetricans are in general agreement that a superior approach to isolating individual performance is the creation of more complex statistics derived from the existing categories of data. As we will see below, measures such as runs created, secondary average, and defensive range, which are based on existing statistical categories, appear to provide more useful information than is made available by creating new categories of raw data. In general, sabermetricans have been less interested in new scoring categories than in data sets that capture the relationship among the variables—the contextual information.

Prior to the 1970s the records of baseball presented an incomplete picture of the events of the game. It was during the 70s that the methods used to collect game statistics were modified to capture events in relation to each other. Today this approach to scoring is common enough that there are essentially four independent repositories of detailed contextual data of the full Major League baseball season. These include the databases of: 1) the Elias Sports Bureau, which began gathering contextual information in 1975, 2) the Major League Baseball IBM Baseball Information System, begun in 1987, 3) STATS, Inc., and 4) Total Sports. STATS, Inc. and Total Sports each collect their information independently. Major League Baseball is somewhat secretive, but is believed to use the same data to supply both the Baseball Information System and the Elias databases.

To be sure, additional informal, personal, and proprietary sources of data have existed as long as baseball itself. One of the most common themes for such data was pitcher/batter match-ups. Pitchers have long kept their own “book” on hitters. Individual teams often kept records of certain events that were not part of the official records. These include charting pitches, tracking where the ball was hit, and any other information the manager deemed useful. As early as the 1940s Allan Roth kept track of various statistics for the Brooklyn Dodgers. In the 1970s Earl Weaver of the Orioles popularized the idea of tracking pitcher/batter match-ups. While Earl kept his stats on index cards, in the mid-80s Mets manager Davey Johnson (one of Weaver’s former players) enlisted the aid of a personal computer (and, presumably, someone to run it!). His goal was to expand on the sample size used by Weaver.

Fans got into this game as well. George Lindsey (1964) published some of the early articles on baseball research. Pete Palmer, Dick Cramer, and Bill James were among a handful of budding sabermetricans studying a variety of performance issues in the 70s. This was the beginning of an explosion of amateur interest in baseball statistics. Dissatisfied with the available data on baseball, some of these fans of the game became interested enough to start collecting relatively detailed information about a subset of the major league games played—for example, those of one particular team. Dick Cramer developed a system for compiling information that was later used by the Athletics and the White Sox. Pete Palmer and Steve Mann developed and sold a tracking system to the Braves and the Phillies. It tracked the type of pitch thrown (fastball, slider, etc.), its speed and location, and where the ball was hit on the field. Not surprisingly, these early sabermetricans were only able to construct partial data sets. The majority of baseball games were scored and recorded as they had been for decades—until the late 1970s.

The Elias Sports Bureau, the official custodian of Major League Baseball records, reports that it began collecting game events in their proper context about 1975. Because Elias regards this data as proprietary, however, there is no public record of exactly what is collected and how it is collected.

The mid-70s was also the time that Bill James was attempting to get Major League Baseball, and the Elias Sports Bureau, to make their databases available to fans who wished to study player and team performance issues. But in Elias’ view, their customers are Major League Baseball, the networks covering the games, and individual teams that may want specific statistical information. The fans had no legitimate claim on the official records of Major League Baseball. Many people were unhappy about this, but Mr. James decided to do something about it.

Led by Bill James, a grass roots effort known as “Project Scoresheet” was begun in 1985. As James wrote in his 1984 Baseball Abstract: “When PROJECT SCORESHEET is in place, all previous measures of performance in baseball will immediately become obsolete, and an entire universe of research options will open up in front of us. With your help, ladies and gentlemen, there is no need for the next generation to be as ignorant as we are” (p. 252).

Project Scoresheet’s Board of Directors settled on a methodology and a standard form for collecting data and enlisted volunteers around the country—primarily readers of the Bill James Baseball Abstract. Within two years 26 groups of volunteers, led by John and Sue Dewan and organized around each major league team/city, were attempting to provide two independent replications of the events of every game played by their respective teams. Since this structure produced two scoresheets per team, it meant Project Scoresheet would actually have four scoresheets per game. Even when one scorer failed to mail in a complete scoresheet this duplicative system produced a relatively reliable record of virtually every game. (See the section on Scoring Games and Preserving Contextual Information below for an explanation of the scoring system.)

By 1987 the most complex aspect of Project Scoresheet was not scoring the games but managing the data produced by three or four sheets for each of the 2106 regular season games played each year. After cross-checking the hard-copy data and reducing the several sheets produced for any one game to one “clean” scoresheet, the data had to be entered into a computer database. The sheer scope of this undertaking clearly required a paid staff at headquarters. To raise the capital needed for this enterprise Project Scoresheet made commitments to provide data to a number of organizations, including STATS, Inc. of Chicago. Unfortunately, several members of the Board of Directors of Project Scoresheet had formal connections with STATS, Inc., which raised conflict of interest concerns for some with Project Scoresheet.
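The cross-checking step can be pictured as a vote among the scorers' sheets for each play. The sketch below is one plausible way to do this and is not a description of Project Scoresheet's actual procedure. The function name, the event codes, and the tie-handling rule are assumptions.

```python
from collections import Counter

# 26 teams, 162 games each, two teams per game.
games_per_season = 26 * 162 // 2  # 2106, matching the figure in the text

def reconcile(sheets):
    """Merge several scorers' sheets for one game into one record.

    Each sheet is a list of event codes, one per play (hypothetical
    format). Returns the merged record and the indexes of plays where
    the scorers split evenly and a human should review the result.
    """
    clean, disputed = [], []
    n_plays = max(len(sheet) for sheet in sheets)
    for i in range(n_plays):
        # Incomplete sheets simply cast no vote for the missing plays.
        votes = Counter(sheet[i] for sheet in sheets if i < len(sheet))
        (code, count), *rest = votes.most_common()
        if rest and rest[0][1] == count:
            disputed.append(i)  # tie for first place
        clean.append(code)
    return clean, disputed

sheets = [
    ["K", "1B", "F8"],
    ["K", "1B", "F9"],
    ["K", "2B", "F8"],
]
clean, disputed = reconcile(sheets)
print(clean, disputed)
```

With three or four sheets per game a clear majority usually exists, and a sheet that stops partway through the game still contributes to the plays it does cover.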

John Dewan joined STATS, Inc. full time in 1987. STATS, Inc. began to pay its scorers, most of whom had been wooed away from Project Scoresheet. Without going into the intrigue and disagreements that occurred between STATS and Project Scoresheet, it will simply be noted that since representatives of both organizations had helped develop the Project Scoresheet scoring system and methodology, it was agreed that each should be allowed to refine it and continue using it. Project Scoresheet died a natural death in the early 90s, and the rights to the data and the format were part of the compensation given to its last Executive Director, Gary Gillette. The rights were passed to Gillette’s company, The Baseball Workshop, which was recently purchased by Total Sports. Thus, today both STATS, Inc. and Total Sports share the legacy of Project Scoresheet. More importantly, together they provide what Bill James wanted when he originally started Project Scoresheet: a competitive market in the provision of seasonal baseball statistics.

Scoring Games and Preserving Contextual Information

If you’ve ever scored a baseball game while at the ballpark, you know how complex things can get—and how difficult it is to make notations about the unusual situations that may arise. As with any data collection effort, recording the events of a baseball game on paper requires a methodology that handles routine events simply but allows for documenting the non-routine.

The approach used by Project Scoresheet embodied many principles that are generally accepted today as a sound approach for scoring games and preserving contextual information. Some of the primary components of this method of data collection include:

This approach allowed the project to capture all the data required for traditional statistical categories, as well as more subtle situational data. Two variables that the Project Scoresheet effort did not attempt to track are type of pitch (fastball, curve, etc.) and pitch location (where the pitch was relative to the strike zone when it crossed the plate). While this system tracks the location where the batted ball is fielded, and whether it was a ground ball or a fly ball, tracking pitch type and pitch location is more difficult. Both require specialized scorer knowledge or a particular view of the game that is often not available to scorers. Such data is collected by most teams, however.

The Influence of Sabermetrics on Communicating Baseball to the Public

Computerized records of game information that preserve contextual information allow a variety of statistical breakdowns across careers, seasons, teams and individual players. Some of the statistical breakdowns that are now commonly reported in the mass media include:

Radio and television coverage of Major League Baseball began regularly reporting some of these breakdowns a few years ago. Sometimes these data are useful; often they are not. Part of the problem stems from the fact that the snippets of data prepared for game announcers to employ in particular situations are simply that: raw data. In addition, some of these data, such as batter/pitcher match-ups, are often based on small samples. Still, it must be remembered that for years television simply introduced each batter with a simple graphic that indicated his current batting average, RBI, and home run total. Not only is this a highly selective set of statistics about overall performance, but RBI and home run totals are instructive only in the context of total at-bats (which the networks have traditionally not included). At the very least, however, the presentation of basic statistical breakdowns on radio and television today helps make the average fan more aware of the contextual nature of performance on the field.
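The point about at-bats can be made concrete by scaling a counting statistic to a common workload. In the sketch below, the 600 at-bat baseline, the function name, and both sample hitters are arbitrary assumptions used only to show why a raw home run total is hard to read on its own.

```python
# Illustrative sketch; the 600 at-bat baseline and both hitters
# are hypothetical.

def per_600_at_bats(count, at_bats):
    """Scale a counting stat (HR, RBI) to a 600 at-bat workload."""
    return count * 600 / at_bats

part_timer = per_600_at_bats(20, 300)  # 20 home runs in 300 at-bats
regular = per_600_at_bats(20, 600)     # 20 home runs in 600 at-bats
print(f"Part-time hitter: {part_timer:.0f} HR per 600 at-bats")
print(f"Everyday hitter:  {regular:.0f} HR per 600 at-bats")
```

Both hitters would appear on screen with "20 HR," yet one is homering at twice the rate of the other.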

It should be noted that the temporal nature of television puts it at a bit of a disadvantage in providing information that is best presented as text and numbers. Technologically, the computer screen and television screen have physical commonalities that do not explain why they are used so differently. Two important factors are (a) the limitation of broadcast radio and television waves to one-way, and therefore non-interactive, communication and (b) the capacity for timed communication (i.e., to provide relatively high-fidelity screen replicas of human activities happening at the rate of human life). The ability to provide audio and video content, the basic forms of timed communication, implies that television and radio are communication systems that are useful for conveying sequences of events, from conversations to action, in everything from movies to baseball. This same temporal attribute of being timed communication has led televised baseball to play to its strengths and provide video replays of pitch sequences. In recent years the networks have edited videotape so they can show particular sequences of pitches and how they are choreographed by the catcher and pitcher with the intention of limiting the hitter’s success. While it was mentioned above that the publicly available data on baseball does not include pitch type and location, television does a particularly good job of presenting the highly contextual aspects of pitch sequences.

While television and radio are best suited to carrying the performance (and pre- and post-game analyses) live, newspapers take the responsibility for providing a relatively complete record of the games in print. Newspapers, and all print media, represent untimed communication. Readers can take as much or little time reading the game summary and box scores as they like (while 30 minutes of sports news is over in 30 minutes).

Taken together, the box score and written game summary provided, for many years, the only view most fans had of any individual game. If you weren’t there, you only knew what you read in the paper. Today, of course, most games are broadcast and all are recorded. Thus, ESPN’s SportsCenter can show highlights from each game. Still, a brief video clip of a key hit can hardly convey a good sense of a particular game. The box score carries a more complete picture of what happened at the game.

Not coincidentally, sabermetrics has had a large impact on the composition of the box score in recent years. In particular, by 1990 USA Today had begun publishing a new type of box score with a great deal of additional data. These box scores were supplied by STATS, Inc. and were made possible by the way in which the data had been collected. They include, for example, the number of pitches thrown by each pitcher and how many were strikes, the number of ground balls vs. fly balls, and the number of batters each pitcher faced. None of these statistics are available in traditional box scores.

The overall effect of communicating more statistics and analysis, through any of these communication systems, is to inject more discussion of performance statistics into the public discourse about baseball. While fans are notoriously poor at remaining objective, many have clearly learned to back a sports argument with statistics. While some argue that sports discussions are fundamentally unresolvable, moving to the realm of sabermetrics means that points of difference are often amenable to rigorous analysis.

The Influence of Sabermetrics on Baseball Professionals

In most areas of business today the desire for a competitive advantage has led successful professionals to become very sophisticated in their approach to the problems that face their particular businesses. The competitive nature of modern business requires that companies conduct research and/or purchase information to remain successful.

Yet in the high stakes world of baseball, sabermetric research typically takes a back seat to other types of research, such as the work of scouts. Estimates from people interviewed for this piece suggest that fewer than half of the major league teams use sabermetric research to any extent.

How heavily a team relies on this type of information tends to be based on the predilections of particular managers or general managers. Earl Weaver and Davey Johnson were mentioned above. There are a handful of teams, such as the White Sox, Tigers, Red Sox, Padres, and Reds, that currently seem to value the insights of sabermetricans.

Yet there are even more teams that apparently take the attitude that the only way to “know” baseball is to be around it day by day. There are a number of reasons why sabermetric knowledge may not be taken as seriously as it could be in the inner circles of professional baseball. Baseball people often bristle at the notion that a sabermetrican can evaluate a player without ever seeing him perform. They resist the notion that being among the individual trees actually may prevent them from seeing the large trends in the forest. Baseball professionals also have a stake in preserving the notion that their brand of insight about the game is the most valid.

From the sabermetrican’s point of view, one of the strengths of the wizened old manager has traditionally been the storehouse of situational knowledge built up over a lifetime. Yet the reality is that today the human mind is no match for the computer when it comes to storing and sorting situational performance data. Most sabermetricans understand that a manager must consider many factors beyond historical situational performance data. They simply believe that baseball professionals should make this quantifiable information a part of the decision-making process.

Sabermetric Analyses: New Measures of Performance

In large part sabermetrics is the search for more meaningful statistical measures. Most new statistics have been created from categories of raw statistical information that have existed for some time. Some of the more useful statistics devised in the last 20 years include:

Sabermetricans have used these and other statistical analyses to uncover a number of important insights about game performance. As discussed above, sabermetricans examine team performance (wins and losses) by evaluating the relative contribution of the individual team members. In part because individual position players (non-pitchers) have much more control over scoring runs than over preventing runs, sabermetric research focuses more on offense than defense. The last two sections of this essay summarize a few of the interesting issues that sabermetrics addresses. Be aware that the discussion of these issues is over-simplified in the interest of keeping each one relatively brief. As with interpretations of scientific data, when examined in detail there is some controversy among experts about the conceptualization, the measurement, and the interpretation of these events.

Sabermetric Analyses: Examples of Park Effects

Park effects are based on the physical conditions of the park and other environmental factors. Differences in the playing surface, field dimensions, weather, the amount of light on the playing field, and even the density of the air are based on the structure and location of the park. These are situational factors, but relatively constant across games.

It should be pointed out that, for sabermetricans, the discussion of park effects does not include the influence of the crowd. Sabermetricans for the most part are content to work with the complex set of numerical data provided by the structure of the game on the field. They don’t deny the home field advantage may be a “home crowd” advantage—it’s there in the statistics each year. They simply don’t try to measure it. Instead, they factor in the data that reflects the average home field advantage in wins and losses, which has been tracked for many years, then look for measurable factors that are correlated with significant variations from that long-term average.

While it is common knowledge in baseball that some parks are hitters’ parks and some are pitchers’ parks, measures of park effects allow us to quantify the size of these effects precisely and to examine what types of hitters and pitchers are helped by particular parks. Park effects are less complex than they might seem at first glance. The park effect for any particular category of performance is calculated in the following manner. First, for a particular team’s home schedule, sum the totals (for example, total runs scored, home runs, or errors) for both teams playing in the 81 home games. Second, sum the same performance totals, again for both teams, for the road games of the same team. The difference between the two totals represents the net park effect in that particular statistical category.
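The two-step calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a published formula: the class and function names are hypothetical, the sample totals are invented, and the percentage version (which compares the per-game home rate with the per-game road rate of the same team) is an assumption about how a raw difference might be turned into a figure comparable to the percentages quoted in this section.

```python
from dataclasses import dataclass


@dataclass
class SplitTotals:
    """Totals for one statistical category, both teams combined."""
    home: int             # total in the team's home games
    road: int             # total in the same team's road games
    home_games: int = 81  # games behind the home total
    road_games: int = 81  # games behind the road total


def net_park_effect(t: SplitTotals) -> int:
    """Raw difference between the home and road totals."""
    return t.home - t.road


def park_effect_pct(t: SplitTotals) -> float:
    """Percentage change of the per-game home rate over the road rate.

    Assumption: the team's own road games stand in for a neutral park.
    """
    home_rate = t.home / t.home_games
    road_rate = t.road / t.road_games
    return (home_rate / road_rate - 1.0) * 100.0


# Invented totals, for illustration only.
home_runs = SplitTotals(home=180, road=125)
print(net_park_effect(home_runs))
print(round(park_effect_pct(home_runs), 1))
```

Counting both teams in every game is what separates the park from the club that plays there: a good-hitting team inflates its own totals at home and on the road alike, so only the park-related difference remains.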

A number of statistical categories are not greatly affected by the park in which a game is played. Offensive categories that typically vary less than ± 15% for any ballpark (from the major league average, across any given season) include: total runs scored, batting average, strikeouts, and stolen bases.

The offensive categories that typically exhibit the greatest variability across ballparks (for any given season) include total home runs (as much as ± 40%) and triples (as much as ± 50%). Artificial turf is a major factor in increasing triples—most years the top ten parks in triples are dominated by turf parks. Incidentally, turf is also a major correlate of parks with the highest number of stolen bases each year. I say correlate, not cause, because bases are not necessarily easier to steal in turf parks—rather, the speedy players needed to cover outfield ground in turf parks also steal a lot of bases.

Wide variations in home run production are primarily caused by the placement of the fences, and secondarily by factors such as wind conditions (Wrigley Field with the wind blowing out is a pitcher’s nightmare). How does this affect our perception of the performance of individual players? In 1987 Andre Dawson hit .287 with 49 home runs and 137 RBI and won the MVP award in his first season with the Cubs (81 games played in the home run-friendly confines of Wrigley Field). Meanwhile, Jack Clark hit a nearly identical .286, with only 35 home runs and 106 RBI for the Cardinals, and was not considered a serious MVP candidate. Yet Clark’s “runs created” for 1987 was 127, while Dawson’s was 111 (James, 1988). In part this was due to the “secondary average” contributions of Clark’s 136 walks compared with only 32 for Dawson. But the other culprit blinding the MVP voters to the relative accomplishments of each man that year was park effects: from 1983-87 batters at Wrigley Field enjoyed a 44% increase in home runs relative to the average park, while hitting in St. Louis’ Busch Stadium reduced home runs by 18%. Reverse Dawson and Clark’s home parks that season and the MVP vote would likely have been very different.

As mentioned above, the introduction of baseball to Coors Field in Denver has meant fans are becoming aware of another variable that sabermetricans have long known has a major effect on home run production: altitude above sea level, or atmospheric pressure (O’Brien, 1986). For the last several years a collection of decent ballplayers has looked like great sluggers due to the severe effects of playing baseball in the Mile High City. One analysis after the 1996 season (Dewan, Zminda, & STATS, 1997) indicated that had the game’s best sluggers played their home games at Coors Field, no fewer than four players would have broken Roger Maris’ record of 61 homers, led by Mark McGwire with a projection of 72 home runs!

On a smaller scale, Boston’s Fenway Park is one of the friendliest hitter’s parks year after year. This is reflected in the fact that Ted Williams, perhaps the greatest pure hitter the game has ever seen, hit 33 points higher at home than on the road over the course of his career. It is also reflected in the fact that Fenway apparently helped Jim Rice, Wade Boggs, Carney Lansford, and countless other hitters who have played there. Conversely, the Houston Astrodome has had as negative an effect on the offensive stats of Jeff Bagwell in the 1990s as it did on Jose Cruz in the 1980s.

Sabermetric Analyses: Examples of Game Performance Effects

Some performance effects occur at the individual and some at the group level. Most are related to the “situational” events that occur on the field in the course of a three hour ball game. The discovery of these effects is related to a careful examination of traditional and new measures of batting, pitching, and fielding events. My intention here is to provide a flavor for the types of insights that sabermetrics can provide; I apologize in advance for a very brief statement of some very complex issues.

Clutch Hitting. I think fans collude with announcers and sportswriters who wish to believe that some players “step up” their game in big situations. We want our heroes to be larger than life; we want them to defy the statistics, and to succeed by sheer force of will. Every season certain players appear to be “clutch hitters”: they perform better than their overall average with men in scoring position, or in the late innings of close games. Unfortunately, different players appear to be the clutch hitters almost every year. In other words, there is little or no evidence that clutch hitting exists as a persistent skill (James, 1982).

The lively ball? Every time there is an increase in offense people argue that the ball has been “juiced up.” Manufacturers and Major League Baseball always maintain that the ball has not been changed. More reasonable explanations for the increase in offense include the dilution of pitching due to expansion and umpires using a smaller strike zone (Dewan, Zminda, & STATS, 1997).

Winning the close games. Years ago Bill James pointed out that many people fundamentally misunderstand the role of luck in baseball when they argue, after their team loses a one-run game, that “the good teams win the close games.” The exact opposite is actually true: the good teams win the blowouts (games decided by five or more runs). Here’s why. Close games can be decided by a bad hop, a broken bat single, or any number of other factors where luck comes into play. One-run games can go either way, so all else being equal, a team should be expected to play .500 ball in one-run games (and many come close). There is often not that much difference between the records of the good and poor teams in one-run games, since luck plays a relatively large part in these games. But when the blowouts are examined, good teams always have better records than poor teams in lopsided games across the course of a season.
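The comparison described above is easy to check against any team’s game log. The sketch below is illustrative only: the function names are hypothetical, the handful of scores is invented, and the five-run threshold for a blowout follows the definition given in the text.

```python
from typing import Iterable, Optional, Tuple

Game = Tuple[int, int]  # (runs scored, runs allowed) for one game


def record_by_margin(
    games: Iterable[Game], min_margin: int, max_margin: Optional[int] = None
) -> Tuple[int, int]:
    """Return (wins, losses) in games decided by a margin in the given range."""
    wins = losses = 0
    for scored, allowed in games:
        margin = abs(scored - allowed)
        if margin < min_margin:
            continue
        if max_margin is not None and margin > max_margin:
            continue
        if scored > allowed:
            wins += 1
        else:
            losses += 1
    return wins, losses


def win_pct(wins: int, losses: int) -> float:
    """Winning percentage, or 0.0 when no games qualify."""
    total = wins + losses
    return wins / total if total else 0.0


# Invented scores, for illustration only.
sample_games = [(3, 2), (2, 3), (9, 1), (8, 2), (1, 7), (5, 4), (4, 5), (10, 3)]

one_run = record_by_margin(sample_games, 1, 1)  # games decided by exactly one run
blowouts = record_by_margin(sample_games, 5)    # games decided by five or more runs
print(one_run, win_pct(*one_run))
print(blowouts, win_pct(*blowouts))
```

Run over a full season for a good team and a poor one, the same two calls would show whether the gap between the clubs is wider in blowouts than in one-run games, which is the pattern James describes.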

Evaluating minor league talent. For years it has been argued in baseball circles that minor league stats are not a good predictor of major league success. James (1985) argued that minor league statistics are as reliable an indicator of success as previous major league stats, provided you adjust for the (more severe) park effects found in the minor leagues.

Speed. Stolen bases have long been the yardstick for calibrating a player’s speed. Sabermetric analyses have shown that the best measures of player “speed” consider not only stolen bases, but also triples, runs scored, grounding into double plays, and defensive range (Dewan, Zminda, & STATS, 1997).

Defensive range factor. The traditional standard for fielding excellence has been fielding percentage (successful chances divided by total chances). With this measure players are not penalized for not getting to a ball, only for getting to it and not fielding it cleanly. One major problem is that the official rules of baseball regarding scoring errors are relatively vague (an error is to be charged if “ordinary effort” would have resulted in a caught ball). Another is that fielding percentage penalizes players who get to more balls, since they have more opportunities to fail. Bill James (1982) presented a persuasive case for the superiority of adjusted range factors (successful defensive chances per game by a player). In brief, the argument is based on the notion that the more balls a player catches (of all the balls hit to his area of the field), the greater the contribution he has made to preventing the other team from scoring runs.
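A small sketch makes the contrast between the two measures concrete. It takes putouts plus assists as the “successful chances,” leaves out the adjustments James applies (which are not detailed here), and uses two invented shortstops; every name and number in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldingLine:
    """One player's season fielding totals (illustrative figures)."""
    putouts: int
    assists: int
    errors: int
    games: int


def fielding_pct(f: FieldingLine) -> float:
    """Successful chances divided by total chances."""
    successful = f.putouts + f.assists
    return successful / (successful + f.errors)


def range_factor(f: FieldingLine) -> float:
    """Successful chances per game played (unadjusted)."""
    return (f.putouts + f.assists) / f.games


# Invented shortstops: one reaches many balls, the other is sure-handed.
rangy = FieldingLine(putouts=250, assists=450, errors=10, games=150)
steady = FieldingLine(putouts=200, assists=380, errors=6, games=150)

for name, line in (("rangy", rangy), ("steady", steady)):
    print(name, round(fielding_pct(line), 3), round(range_factor(line), 2))
```

With these figures the sure-handed player has the better fielding percentage, while the rangy player makes close to one more successful play per game, which is the contribution that fielding percentage alone cannot show.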

Statistics with limited value. That modern creation, the game-winning (“victory-important”) RBI has been eliminated as an official statistic, thank goodness. Many sabermetricans would argue that there are also some time-honored statistics that are not worth the time it takes to collect them. Among these are the distinction between passed balls and wild pitches (James, 1988) and even between an error and a hit (Wright & House, 1989). While the primary rationale is that these judgment calls are too subjective, and often applied in a biased manner, another argument in support of doing away with these measures is that they are traceable in large part to pitching style. That is, a pitcher who throws a knuckleball is probably responsible for the catcher’s increased number of passed balls, and the pitcher who throws a lot of sinkers and forces ground balls bears some responsibility for the errors that result from more ground balls being put into play. James argues that to pretend that the pitcher is not responsible for most of the passed balls is a distortion of the record.

Career Projections. Bill James has created a number of formulae to project where any current player will end up in any particular statistical category. This allows him to estimate the chance of a player hitting 500 home runs, winning 300 games, etc. (James, 1985).

Hall of Fame Criteria and Projections. There are no formal criteria for the Hall of Fame, so James (1986) examined the current members and extrapolated from their accomplishments to determine what it apparently takes to gain entry. His formula allows one to evaluate any current or former player and estimate his chances of being voted into the Hall. As a bonus, James’ (1995) book on the history of the Hall of Fame includes detailed discussions of weak selections and controversial slights. Fascinating reading if you’re interested in who got in and how.

This represents a portion of the topics sabermetricans have studied and debated. As with any field of study, serious researchers disagree about some of the methods and the findings. Yet there is common ground and agreement about some of the relatively clear findings. While not all of the knowledge of sabermetrics has found its way to the mass audience for baseball, fans and game coverage seem to get more sophisticated with each passing year. As ‘Perfessor’ Stengel said many years ago: “You could look it up.” That’s more true today than ever before.

References

Dewan, J., Zminda, D., & STATS, Inc. (1997). STATS 1997 baseball scoreboard. Chicago: STATS Publishing.

Dickson, P. (1989). The Dickson baseball dictionary. New York: Facts on File.

Grabiner, D. (1997). The sabermetric manifesto. Available on the WWW at: http://remarque.berkeley.edu/~grabiner/manifesto.txt.

James, B. (1982-1988). The Bill James baseball abstract. New York: Ballantine Books.

James, B. (1989). This time let’s not eat the bones: Bill James without the numbers. New York: Villard Books.

James, B. (1995). Whatever happened to the Hall of Fame? Baseball, Cooperstown, and the politics of glory. New York: Fireside.

Lindsey, G. (1963). An investigation of strategies in baseball. Operations Research, 11, 447-501.

O’Brien, D. (1986). In B. James (Ed.). The Bill James baseball abstract. New York: Ballantine Books.

Thorn, J. & Palmer, P. (1985). The hidden game of baseball. New York: Doubleday.

Thorn, J. & Palmer, P. (1993). Total baseball. New York: Harper Collins.

Wright, C. R. & House, T. (1989). The diamond appraised. New York: Simon & Schuster.


Author's Biography

T. Andrew Finn is an Associate Professor in the Department of Communication at George Mason University, where he teaches graduate and undergraduate courses in communication and telecommunications. Andrew received his PhD in Social Psychology from Washington University in St. Louis. His research interests center on how communication and information systems affect human communication. He spent a number of years at AT&T in the 1980s and early 90s, working on electronic communication systems from a variety of perspectives, including user interface and system design, technology implementation and training, and product management and marketing. He believes Coors Field should have been built with taller outfield fences. He is not a fan of artificial turf or one-run strategies, but he has warmed up to the DH rule.