Analytical Sport Business
In the last decade, data analytics has become a major basis for
personnel decisions and playing strategies in the world of sports. Although
a complex subject, the concept is straightforward. Analytics involves the
collection of large quantities of statistical data, identifying trends derived
from that information and, ultimately, using the findings to derive conclu-
sions. It has been the basis for predicting outcomes in finance, weather,
sales and nearly every other industry, but this concept was not embraced
in the sports business until recently. Even now, it has not been universally
accepted, with many “purists” deriding the concept as taking the joy
out of sports (Glockner, 2014). Analytics even grabbed popular culture
attention with the release of the book (and later the film) Moneyball.
Analytics Defined
Analytics, generally speaking, is the study of data. The process begins
by recording information, either in predetermined categories or through
mass information mining, which becomes the data used in the study. This
data is analyzed in a variety of ways in hopes of recognizing a pattern
(businessdictionary.com, n.d.). These patterns can then be used as tools
for predictions, decision making, problem solving and more. This is how
your web browser is able to guess your next search or how your credit
card company can detect fraudulent use—the latter case as a result of your
spending behavior not aligning with your past actions.
Analytical predictions have been frequently used in the area of financial
investments. Every second of the day, computers worldwide are collecting
price changes for every stock, bond, commodity and any other finan-
cial instrument that exists. In addition, the state of external factors like
exchange rates, inflation, interest rates, political climate, weather, season-
ality and other stock prices are recorded at that same moment. Once the
data is gathered, computers run analyses on this information to find any
type of correlation between all of those factors and the movement of the
financial instrument being studied. The information gathered is important
Analytics in Sports 293
Analytics in Baseball
In 1977, Bill James self-published a book titled The Baseball Abstract:
Featuring 18 Categories of Statistical Information That You Just Can’t
Find Anywhere Else, which was subsequently updated in 1985 and 2001
(James, 1985, 2001). Throughout the baseball season, James had com-
piled data from baseball box scores of each game and studied it to find
new measures of performance and success. He used statistical calculations
and ratios that had not been analyzed previously. He then made this book
an annual release, offering his subscribers more of the same each year.
In the 1980s, it was very difficult for fans to obtain detailed statistics
on games. Many people did so in a self-help manner: those viewing the
game in the stands or at home liked to keep their own box scores. In
response, James created a nonprofit organization called Project Scoresheet
that worked as a fan network with members committed to logging information
about every play of every game they had access to so that it could be
published and released to the public. This process would collect all of
the data needed to find meaningful knowledge about the performance of the
players. Members of this network went on to form STATS, Inc., which
quickly became the leading sports statistics database in the world and still
continues to this day (Stats.com, 2016).
During this time of increased public awareness and desire for detailed
statistics, James, along with others, was developing a new methodology of
measuring performance in baseball called sabermetrics, which is derived
from the acronym SABR (Society for American Baseball Research) (Sabr.org,
2016). It placed far less weight on the previously utilized measures,
such as batting average, runs batted in (RBIs) and earned run average (ERA),
in favor of a more statistical approach, similar to the work James had been
doing for years. The purpose was to analyze a player’s performance in a
way that excluded extraneous factors, such as the performance of his team-
mates or the particular opposing pitcher and fielders. This would provide a
true representation of that player’s skill and ability, without putting it in the
context of the particular situation he was playing in. For example, RBIs are
mostly dependent on a teammate being on base when you get a hit. If the
teammates who bat ahead of you rarely put themselves in a position to
score, your RBI total, and thus your perceived skill as a hitter, looks
inferior through no fault of your own.
One example of a sabermetric statistic is slugging percentage (SLG),
which is used to measure the power of hitters. This is calculated by tak-
ing the total bases reached on your hits (one for a single, two for a dou-
ble, etc.) divided by your total number of at-bats (smartfantasybaseball.
com, 2014). Another measure is on-base percentage (OBP), which gives
a numerical representation of how often a player gets on base purely
through his own efforts (i.e., discounting plays resulting from the fielders’
actions, such as fielding errors, fielder’s choices and others). It is calculated
by adding hits, walks and hit by pitches and dividing that number by the
sum of at-bats, walks, hit by pitches and sacrifice flies (Fangraphs.com,
2016a). The equation is as follows: OBP = (H + BB + HBP) / (AB + BB +
HBP + SF).
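The two calculations described above can be sketched in a few lines of Python. This is only an illustration: the function names and the season line are hypothetical and do not describe any real player.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases reached on hits divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / opportunities


# Hypothetical season line, invented for illustration only.
# 100 singles + 30 doubles + 5 triples + 25 home runs = 160 hits.
slg = slugging_percentage(singles=100, doubles=30, triples=5,
                          home_runs=25, at_bats=500)
obp = on_base_percentage(hits=160, walks=60, hit_by_pitch=5,
                         at_bats=500, sacrifice_flies=5)
print(f"SLG = {slg:.3f}, OBP = {obp:.3f}")
```

Note that walks raise OBP but do not enter SLG at all, which is why the two numbers are usually read together.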
FIP = (13 * HR + 3 * (BB + HBP) − 2 * K) / IP + C
HR is home runs allowed, BB is walks, HBP is hit by pitch (the number
of batters who reached base this way), K is strikeouts, IP is innings
pitched and C is a constant that changes each season and is based on
statistical averages from that season. The constant simply puts the
resulting number on a scale comparable to ERA (fangraphs.com, n.d.).
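As a rough sketch, the FIP calculation can be written as a single function. The function name, the pitcher's season line and the constant value of 3.10 are assumptions made for illustration. The real constant differs from season to season and has to be looked up.

```python
def fielding_independent_pitching(home_runs, walks, hit_by_pitch,
                                  strikeouts, innings_pitched, constant):
    """FIP = (13*HR + 3*(BB + HBP) - 2*K) / IP + C."""
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings_pitched + constant


# Hypothetical pitcher season; the constant 3.10 is a placeholder,
# not the published value for any particular year.
fip = fielding_independent_pitching(home_runs=20, walks=45, hit_by_pitch=5,
                                    strikeouts=180, innings_pitched=200.0,
                                    constant=3.10)
print(f"FIP = {fip:.2f}")
```

Home runs carry the heaviest weight and strikeouts are the only term that lowers the result, so a pitcher improves his FIP by missing bats and keeping the ball in the park.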
Another important sabermetric tool to mention is Wins Above Replace-
ment (WAR). It is meant to be the ultimate measure of a player’s true value
by calculating the number of wins that player provided his team com-
pared to what a readily available replacement at his position would have
provided (fangraphs.com, 2016c). In other words, if a second baseman’s
WAR is 6.3, it means his team would have won 6.3 fewer games that
season were he to be replaced by an average free agent or minor league
second baseman being paid the league minimum. This measure can be
extremely valuable for general managers who have to decide how much
a player is worth in dollars. But it should be noted that this is a fairly
complex calculation and far from a precise measurement.
These relatively modern concepts were gaining traction among fans
and some analysts, but they were still rejected by “purists,” a group that
encompassed all those working within the sport. The first person credited with
not only embracing, but implementing, these measurements in baseball
decision making is Sandy Alderson, while working as general manager of
the Oakland Athletics. Alderson was an outsider from the start, having no
prior professional baseball experience. He had graduated from Harvard
Law School and was originally the general counsel for the team. In the
mid-1990s, Alderson was given a minimal budget to work with, which
made fielding a competitive team difficult. With no traditional bounds to
restrain his thinking, he did not hesitate to look for advantages in other
ways, which led him to hire Eric Walker, a former aerospace engineer and
baseball writer who studied numerical analysis of the sport (Rubin, 2010).
Walker would become the bridge linking the new-age quantitative ana-
lysts with the inner circle of team management. Alderson would finally
put these studies to practical use. It was taken even further by Alderson’s
successor, Billy Beane, whose widely publicized adoption brought saber-
metrics and sports analytics to the general public when it yielded surpris-
ingly successful results in the face of scrutiny. This became the basis of
the book Moneyball. Today, numerical analysis in baseball is ubiquitous.
Practically every baseball team in the league has incorporated it into their
decision making.
Analytics in Basketball
After becoming widely accepted within baseball, the concept of analytics
began to spread to other sports, but none have utilized it to quite the same
extent. Baseball can be boiled down to a single batter vs. pitcher matchup
broken down pitch by pitch, whereas most other team sports have far too
many variables in play and other teammates involved in everything that
occurs (Holmes, 2014). Of the remaining three major professional Ameri-
can sports, basketball is most conducive to isolating individual player
actions throughout the game. That is why it makes sense that numerical
analysis has made the most progress in that sport.
Traditional basketball box scores track the basic actions of a player
that are easy to recognize and document. They are field goals attempted
(FGA), field goals made (FGM), points (PTS), rebounds (REB), assists (AST),
steals (STL), blocks (BLK), turnovers (TO) and personal fouls (PF). These
numbers can offer a general picture of a player’s abilities, but they do not
take into account any variables such as teammates and game situations.
As an example, two players who average the same number of points per game
are not likely to be equally good if one of them has highly skilled
teammates and the other has duds. The player with
unskilled teammates is bound to score more because no one else on his
team has the competence to do it. Similarly, a player who threads a pass
behind the back through three defenders to a teammate he signaled to
charge the basket will be credited with the same “1 assist” as a player
who simply tossed the ball to the person standing next to him who then
hits a shot. There is a lot more that contributes to a player scoring a basket
than that player taking the shot. It makes sense that, just as in baseball,
a certain subgroup of people looked for ways to account for things like
these when comparing players’ abilities.
Just as baseball comes down to scoring runs, everything that occurs
on a basketball court is done with the ultimate goal of scoring as
many points as possible. Essentially, basketball analytics boils down to
determining the average number of points a team will score when they
possess the ball or the probability that the team scores any points at all in
a given possession. One analytical measure that has been developed is the
“expected possession value,” or EPV (Goldsberry, 2014). This attempts
to account for every possible variable in a given moment, such as which
player is holding the ball, the arrangements of the other players on the
court (as well as who those players are) and the time left on the shot
clock. To quantify exactly how one player contributes to an increase in
this number, his “EPV-added” or “points added” value is determined.
In theory, this represents how a player affects the overall EPV compared
with an average replacement player swapped into the identical situation.
The resulting player’s “points added” is a number representing how many
more points are scored in a game simply by having that player on a team.
For example, if a player has a “points added” rating of 2.1, theoretically,
his team would average 2.1 more points per game by having him instead
of an average replacement. Many players even produce negative points
added, which means their teams would actually be better off without
them. If accurate, this offers obvious advantages to teams valuing players
for trades or for contract negotiations. Further, from a coach’s perspec-
tive, with all of this data available, one could devise plays that put all of
the players in the positions that maximize the EPV of his team on a given
possession.
Another advanced statistic used to boil down a player’s value to a
single number is the Player Efficiency Rating (PER). This has become one
of the more well-known of the new statistics, most likely due in part to
its developer, John Hollinger, who worked at ESPN and used the PER in
front of the mainstream audience. The statistic is designed to measure
a player’s per-minute performance, taking into account the traditional
box score statistics, as well as deducting for negative accomplishments,
like turnovers or fouls. There are some questions and criticisms of the
process, but the resulting data does produce fairly accurate results (Hol-
linger, 2011).
Some other advanced basketball statistics are becoming more common.
A team’s offensive and defensive ratings are calculated by finding the aver-
age points scored or points allowed per possession. An “effective field
goal percentage” (eFG%) is similar to a traditional measure of field goals
made divided by field goals attempted, but it awards a bonus for 3-point
shots made, because they are both more difficult and more valuable.
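A minimal sketch of these two team statistics follows. The text above does not give the size of the 3-point bonus. The code assumes the commonly used convention of counting each made 3-pointer as 1.5 made field goals, and the function names and game totals are hypothetical.

```python
def effective_fg_pct(field_goals_made, three_pointers_made,
                     field_goals_attempted):
    """eFG% under the assumed convention: (FGM + 0.5 * 3PM) / FGA."""
    return (field_goals_made + 0.5 * three_pointers_made) / field_goals_attempted


def points_per_possession(points, possessions):
    """Average points scored (or allowed) per possession."""
    return points / possessions


# Hypothetical single-game team totals, invented for illustration only.
efg = effective_fg_pct(field_goals_made=40, three_pointers_made=10,
                       field_goals_attempted=90)
offensive_rating = points_per_possession(points=110, possessions=100)
defensive_rating = points_per_possession(points=104, possessions=100)
print(f"eFG% = {efg:.3f}")
print(f"Offense: {offensive_rating:.2f}, defense: {defensive_rating:.2f}")
```

The same per-possession function serves for both ratings: points scored give the offensive figure and points allowed give the defensive one.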
Analytics in Football
As of 2016, data analytics is still in its early stages in the NFL, but its
presence is growing (Fleming, 2013). Others claim that football analytics
has not kept pace with other sports (Causey, 2015). Analytics is
growing in college football, although it has been largely limited to major
programs that have the economic resources to utilize it (Myberg, 2014).
Even if a college football team accumulated the massive amount of data
necessary to produce meaningful information, more than likely it does not
have the advanced computers or mathematical experts required to make
sense of it.
However, high-level statistics and sabermetric-style data have been
used to decide on the four teams that would participate in the College
Football Playoff (Schlabach, 2014). The committee choosing the teams
cannot simply look to the best record, because there are so many teams
and there is an extremely wide range in the quality of competition. Instead,
the committee's choice of the four best teams is ultimately subjective,
made as objectively as possible. But instead of the typical Bowl Championship
Series (BCS) methodology of ranking based mostly on record and strength
of schedule, the committee was armed with every conceivable data point
that may be considered when trying to decipher which team is better than
another.
At the time of this writing, there are not as many mainstream or standout
advanced statistics in football as there are in baseball and basketball. The
development of a play is so complex that it makes it difficult to whittle a
player’s contribution down to a single number. But we are still in the early
stages, so it is inevitable that some measures will come to the forefront,
much like they did in basketball.
Analytics in Hockey
More and more NHL teams have embraced analytics in determining
which players to sign, how long players should be on ice, the quality of
play during power play, penalty kills, overtime and types of zone entry
(McIndoe, 2014). In the past, the most common data used to determine
player effectiveness was the “plus/minus” rating, which simply measured
a player’s success by noting whether the player was on the ice for his
team’s goals more often than for goals scored against his team.
Specifically, a player is awarded a “plus” each time he is on the ice
when his club scores an even-strength or shorthanded goal. The player
receives a “minus” if he is on the ice for an even-strength or shorthanded
goal scored by the opposing club. The difference in these numbers is con-
sidered the player’s “plus-minus” statistic (NHL.com, n.d.).
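The plus/minus rule described above can be sketched as a simple tally. The `Goal` record and its field names are hypothetical, introduced only for this illustration. Power-play goals are skipped because the rule counts only even-strength and shorthanded goals.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    scored_by_own_team: bool  # True if the player's club scored
    strength: str             # scoring team's situation: "even",
                              # "shorthanded" or "power_play"
    player_on_ice: bool       # was the player on the ice for the goal?


def plus_minus(goals):
    """Tally +1 / -1 for even-strength and shorthanded goals only."""
    rating = 0
    for goal in goals:
        if not goal.player_on_ice or goal.strength == "power_play":
            continue  # goal does not count toward this player's rating
        rating += 1 if goal.scored_by_own_team else -1
    return rating


# Hypothetical game log, invented for illustration only.
game = [
    Goal(True, "even", True),         # plus
    Goal(True, "power_play", True),   # ignored: power-play goal
    Goal(False, "even", True),        # minus
    Goal(True, "shorthanded", True),  # plus
    Goal(False, "even", False),       # ignored: player not on the ice
]
print(plus_minus(game))
```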
The metrics for measuring success have expanded greatly in recent
years. Teams utilize more sophisticated metrics, which include the major
possession-oriented standards: (1) Corsi and (2) Fenwick (both named
after their respective creators). Before discussing each one, it should be
noted that they are not pure determinants of puck possession, but rather
are proxies or indirect ways to do so. As one commentator stated: “In
order to shoot the puck, you have to possess the puck. Since we do
not have the technology or another practical way to measure time of
possession in hockey, we must employ [Corsi or Fenwick as a] proxy.”
(JenLC, 2013).
Corsi’s formula counts all shots directed at the goal: shots on goal,
shots high, shots wide, shots that get saved, shots that get blocked and
shots that go into the goal. A player who has a positive Corsi has more
shots directed toward the opponent’s net while he is on the ice at even
strength than shots directed toward his own net under the same criteria.
It is a far more in-depth version of the old plus/minus standard. For
example, let’s say that Player A is on the ice for 10 shots on behalf of
his team during a game. The opposing team takes 3 shots while Player A is
on the ice during the game. Player A’s Corsi for the game is 10 − 3, or +7.
Under the Fenwick formula, the idea is the same, but blocked shots are
omitted. So, taking our earlier example, let’s say of
the 10 shots Player A was on the ice for, 2 were blocked by players on the
opposing team. The opposing team had 3 shots while Player A was on the
ice, but 1 of those was blocked. Because Fenwick excludes blocked shots
from the formula, Player A’s numbers would be 8 shots for and 2 shots
against, for a Fenwick of +6.
These figures are often translated into percentages for easier comparison.
So, in this example, Player A would have a Corsi of about 77 percent (10
of the 13 total shots) and a Fenwick of 80 percent (8 of the 10 unblocked
shots). Proponents of these statistical standards
believe these formulas are reliable metrics for possession because the more
shots a team is able to direct toward the net, the longer it controls the
puck. However, a high positive Corsi does not ensure victory. A team
could have many more shots than its opponents and still lose if too many
of these shots are wide or saved by a superb goaltender on a given night.
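The arithmetic of the Player A example can be sketched as follows. The sketch assumes that Corsi counts every shot attempt, blocked ones included, and that Fenwick drops the blocked attempts. It also assumes that percentages are taken as the team's share of all attempts, a common convention; other conventions give different percentages. The function and parameter names are hypothetical.

```python
def possession_proxies(shots_for, shots_against, blocked_for, blocked_against):
    """Corsi and Fenwick differentials plus shot shares for one player.

    blocked_for:     the player's team's attempts blocked by the opponent
    blocked_against: the opponent's attempts blocked by the player's team
    """
    unblocked_for = shots_for - blocked_for
    unblocked_against = shots_against - blocked_against
    return {
        "corsi": shots_for - shots_against,
        "fenwick": unblocked_for - unblocked_against,
        "corsi_share": shots_for / (shots_for + shots_against),
        "fenwick_share": unblocked_for / (unblocked_for + unblocked_against),
    }


# Player A: 10 attempts for (2 blocked), 3 attempts against (1 blocked).
player_a = possession_proxies(shots_for=10, shots_against=3,
                              blocked_for=2, blocked_against=1)
print(player_a["corsi"], player_a["fenwick"])
print(f"{player_a['corsi_share']:.1%}", f"{player_a['fenwick_share']:.1%}")
```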
These formulas deal with the typical five skaters vs. five skaters
situation in hockey (the goaltender is not included). However, penalties in
hockey, more than in other sports, affect the number of skaters on the ice.
If Team A’s center is caught committing a specified infraction, the team
will be short one player on the ice because that player will have to sit for
at least two minutes in the penalty box. This may affect the dynamics of
the game and is not reflected in the Corsi or Fenwick ratings.
So, how do we find analytic information for these situations? What
players are more effective on the power play (for the team with the player
advantage) or the penalty kill (for the team with the player deficit)? One
method is to determine which player(s) “draw” the most penalties in
a given season, meaning that the player is able, usually through his
skills, to lure opposing players into committing penalties, forcing the
power play advantage. So, Player A skates around Player B who is “forced”
to trip Player A to prevent a goal-scoring opportunity. The players who
are best at drawing penalties can give their teams an advantage. If one
assumes that a team has an 18 percent chance of scoring a power play
goal (which is about the NHL average), then these players can “add” a
certain number of goals per season, helping their respective team’s stand-
ings (Tulsky, 2013).
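As a back-of-the-envelope sketch, the goals “added” by a player who draws penalties can be estimated by multiplying the penalties he draws by the power-play conversion rate. The 18 percent default comes from the text above. The function name and the count of 30 drawn penalties are hypothetical. The estimate also assumes that every drawn penalty produces one full power play, ignoring offsetting penalties and shortened advantages.

```python
def expected_goals_from_drawn_penalties(penalties_drawn,
                                        power_play_conversion=0.18):
    """Expected power-play goals created by drawing penalties.

    Simplifying assumption: each drawn penalty yields exactly one
    power play that converts at the given rate.
    """
    return penalties_drawn * power_play_conversion


# Hypothetical season total, invented for illustration only.
added_goals = expected_goals_from_drawn_penalties(penalties_drawn=30)
print(f"Expected goals added: {added_goals:.1f}")
```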
These are just a few samples of the kind of analytics research and prac-
tice that has been utilized in the period from 2010–16. More analytical
standards are being developed and it is likely that this area will involve
more sports, more data and more analysis in the years ahead.
References
Beyondtheboxscore.com (2014). Retrieved May 20, 2016, from
http://www.beyondtheboxscore.com/2014/6/2/5758898/sabermetrics-stats-pitching-stats-learn-sabermetrics
Businessdictionary.com (n.d.). Definition of analytics. Retrieved May 1, 2016,
from http://www.businessdictionary.com/definition/analytics.html
Causey, T. (2015, September 23). The sorry state of football analytics.
Thespread.com. Retrieved May 23, 2016, from http://thespread.us/sorry-state.html
Fangraphs.com (2016a). Fielding independent pitching. Retrieved May 20, 2016,
from http://www.fangraphs.com/library/pitching/fip/
Fangraphs.com (2016b). OPS and OPS+. Retrieved May 24, 2016, from
http://www.fangraphs.com/library/offense/ops/
Fangraphs.com (2016c). What is war? Retrieved May 24, 2016, from
http://www.fangraphs.com/library/misc/war/
Fleming, D. (2013, August 20). The geeks shall inherit the turf. ESPN.com.
Retrieved May 23, 2016, from
http://espn.go.com/nfl/story/_/id/9581177/new-jacksonville-jaguars-coach-gus-bradley-relies-analytics-espn-magazine
Glockner, A. (2014, March 3). Do analytics take the fun out of sports? A dispatch
from Sloan. The big lead. Retrieved May 22, 2016, from
http://thebiglead.com/2014/03/03/do-analytics-take-the-fun-out-of-sports-a-dispatch-from-sloan/
Goldsberry, K. (2014, February 6). Databall. Grantland.com. Retrieved May 24,
2016, from
http://grantland.com/features/expected-value-possession-nba-analytics/
Grochowski, J. (2015, April 27). Baseball by the numbers: OPS+ not perfect, but
it’s a useful tool. Chicago Sun-Times. Retrieved May 24, 2016, from
http://chicago.suntimes.com/sports/baseball-by-the-numbers-ops-not-perfect-but-its-a-useful-tool/
Hollinger, J. (2011, August 8). What is PER? ESPN.com. Retrieved May 24,
2016, from
http://espn.go.com/nba/columns/story?columnist=hollinger_john&id=2850240
Holmes, B. (2014, March 30). New age of NBA analytics: Advantage or
overload? The Boston Globe. Retrieved May 24, 2016, from
https://www.bostonglobe.com/sports/2014/03/29/new-age-nba-analytics-advantage-overload/1gAim4yKYXGUQ2CTAe7iCO/story.html
Infosys.com (2016). Use of big data technologies in capital markets. Retrieved
May 22, 2016, from
https://www.infosys.com/industries/financial-services/white-papers/Documents/big-data-analytics.pdf
James, B. (1985). The Bill James historical baseball abstract. New York: Villard.
James, B. (2001). The new Bill James historical baseball abstract. New York: Free
Press.
JenLC (2013, December 4). Stats made simple Part 1: Corsi & Fenwick.
SBNation. Retrieved March 27, 2016, from
http://www.secondcityhockey.com/2013/12/4/5167404/nhl-stats-made-simple-part-1-corsi-fenwick
Lee, M. (2013, October 27). NBA deputy commissioner Adam Silver backs the
analytical movement. The Washington Post. Retrieved July 15, 2016, from
https://www.washingtonpost.com/news/wizards-insider/wp/2013/10/27/nba-deputy-commissioner-adam-silver-backs-the-analytical-movement/
Lowe, Z. (2013, September 4). Seven ways the NBA’s new camera system can
change the future of basketball. Grantland.com. Retrieved May 24, 2016, from
http://grantland.com/the-triangle/seven-ways-the-nbas-new-camera-system-can-change-the-future-of-basketball/
McIndoe, S. (2014, September 17). The NHL’s analytics awakening. Grantland.
Retrieved March 27, 2016, from
http://grantland.com/the-triangle/the-nhls-analytics-awakening/
Myberg, P. (2014, August 24). Slowly but surely, college football teams embrace
analytics. USA Today. Retrieved May 24, 2016, from
http://www.usatoday.com/story/sports/ncaaf/2014/08/24/college-football-preview-revolution-analytics/14289989/
Plus/minus explained (n.d.). NHL.com. Retrieved March 26, 2016, from
http://www.nhl.com/ice/page.htm?id=26374
Rubin, A. (2010, October 27). Original A’s analyst discusses Alderson.
ESPN.com. Retrieved May 24, 2016, from
http://espn.go.com/blog/new-york/mets/post/_/id/11448/original-as-analyst-discusses-alderson
SABR.org (2016). SABR convention history. Retrieved May 22, 2016, from
http://sabr.org/content/sabr-convention-history
Schlabach, M. (2014, August 21). The CFB Playoff’s stats gurus. ESPN.com.
Retrieved May 24, 2016, from
http://espn.go.com/college-football/story/_/id/11382331/stats-company-sportsource-analytics-inform-college-football-playoff-selection-committee-decisions
Sherpasoftware.com (n.d.). What’s the difference between structured and
unstructured data. Retrieved May 23, 2016, from
http://www.sherpasoftware.com/blog/structured-and-unstructured-data-what-is-it/
Silver, N. (2014, February 19). The search for intelligent life. ESPN.com.
Retrieved May 22, 2016, from
http://espn.go.com/espn/story/_/id/10476210/nba-mlb-embrace-analytics-nfl-reluctant-espn-magazine
Smartfantasybaseball.com (2014). How do I calculate SGP for slugging
percentage? Retrieved May 24, 2016, from
http://www.smartfantasybaseball.com/2014/02/how-do-i-calculate-sgp-for-slugging-percentage/
Sportingcharts.com (n.d.). Effective field goal percentage—eFG%. Retrieved May
24, 2016, from
http://www.sportingcharts.com/dictionary/nba/effective-field-goal-percentage-efg.aspx
Stats.com (2016). About STATS. Retrieved May 22, 2016, from
http://www.stats.com/about/
Tulsky, E. (2013, July 11). Hidden value: Penalty differential. SBNation.
Retrieved March 28, 2016, from
http://www.broadstreethockey.com/2013/7/11/4504236/hidden-value-penalty-differential