The Impact of Movie Genre on Average Budget vs. Box Office Earnings
page count: 12
Introduction and Rationale
There are over 1,000 movies released each year, and I love going to the movie theater to watch
movies in genres ranging from horror to romance. As a movie enthusiast, I love cinema but am
equally intrigued by the factors that make one movie more successful than another. Many factors
contribute to a movie's success, including budget (for sets and actors), genre, and marketing.
In this analysis, I will look at how genre could affect average budget and box office revenue in
2023, which will help me understand whether a genre can indicate a movie's performance. This
analysis combines my interests in movies and statistics, and I will research online to gather
data. I will use this data to examine whether a relationship exists between film genre and
revenue outcome, which might explain why some genres connect better with audiences. I have always
loved watching movies as soon as they are released, and I have sometimes even watched genres I
usually avoid because of their marketing, which is tied to budget. A movie's genre often lets the
audience predict the storyline and indicates its target audience. I chose an array of 20 genres
specifically to reduce bias and to have a representative sample that makes my result meaningful.
While some genres, especially action and science fiction, require bigger budgets because they use
more special effects and demand longer production, other genres such as romance and drama can
perform well financially on small budgets. Through data analysis of genre compared to budget and
actual box office earnings, I will analyze whether there is a correlation between these
variables. The independent variable is the average budget allocated to films across different
genres, while the dependent variable is the average box office earnings generated by those films.
This could help predict how the chosen genre of a film relates to its budget and revenue
estimates and enhance understanding of film industry trends from a quantitative perspective. I
predict that action movies will have the highest box office revenue and historical movies the
lowest, because I see more marketing for action movies than for historical documentaries, and
action movies have a larger audience.
Aim and Approach
My aim is to determine the relationship between the average budget used for a movie genre and
its box office performance in the United States in 2023. By evaluating genres, I can determine
which genres provide a high box office return and how budgets relate to box office earnings,
establishing a genre-performance relationship and identifying trends that will be helpful in the
movie industry going forward, or just help me understand why I like certain movies better. Movies
might have a lower budget because of the story they are telling; horror movies, for example, may
need few sets or little equipment, depending on the director. I will gather data from reliable
online sources that give statistics on box office revenues and budgets for 2023, as listed in my
works cited. To have a representative sample, I will gather data for a minimum of 20 genres out
of over 90. At first, I created a scatter plot using genre as my independent variable and box
office revenue as my dependent variable. I will then use scatter plots of budget against box
office earnings to understand the relationship between the two across genres. The best model to
describe the relationship will be found by testing several regression models in Excel, which
also keeps my data organized and easier to understand. I will then be able to see the patterns
between different genres and their financial success.
Data collection and Results
I have compiled data from various sources, covering genres from Action (high budget) to Horror
(low budget). It took a while to gather the data and make sure it was accurate, because different
sources gave me different information. I was first inclined to use data for 2024, but since the
year had not ended yet there was limited data, so I changed to 2023.
Genre          Average budget       Average box office earnings
               (in millions USD)    (in millions USD)
Action         151                  264
Adventure      138                  266
Animation      83                   176
Drama          23                   41
Comedy         42                   88
Horror         16                   31
Romance        22                   55
Sci-Fi         153                  341
Thriller       49                   91
Fantasy        108                  245
Musical        48                   108
Mystery        43                   112
Biography      35                   145
Historical     75                   174
Documentary    6                    62
Crime          65                   143
War            79                   112
Western        35                   63
Family         67                   227
Sports         38                   50
Table 1: Average budget and average box office earnings (in millions USD) by movie genre for
2023 in America
From the data, I notice that action films had an average budget of $151 million and average
earnings of $264 million, whereas sports films, with an average budget of approximately $38
million, had about $50 million in earnings. Since low-budget genres such as Documentary also earn
less, I can speculate that my prediction will be true. The data matches my expectation and my own
experience, as I tend to watch action movies more often and they require a higher budget because
of the production and animation they use.
I was surprised that romance movies had a low average budget, because it is a popular genre. I
would have expected war movies to be at the lower-budget end, but they surprised me with an
average budget of $79 million. The highest average box office revenue and the highest average
budget both belong to the Sci-Fi genre, in line with my expectation that high-budget genres earn
the most. From this data I created a scatter plot.
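The observations above can be checked directly against Table 1. The following is a minimal
Python sketch and not part of the original Excel workflow; the names `TABLE_1` and `summarize`
are hypothetical, and the values are copied from Table 1 in millions USD.

```python
# Table 1 values: (genre, average budget, average box office earnings), in millions USD.
TABLE_1 = [
    ("Action", 151, 264), ("Adventure", 138, 266), ("Animation", 83, 176),
    ("Drama", 23, 41), ("Comedy", 42, 88), ("Horror", 16, 31),
    ("Romance", 22, 55), ("Sci-Fi", 153, 341), ("Thriller", 49, 91),
    ("Fantasy", 108, 245), ("Musical", 48, 108), ("Mystery", 43, 112),
    ("Biography", 35, 145), ("Historical", 75, 174), ("Documentary", 6, 62),
    ("Crime", 65, 143), ("War", 79, 112), ("Western", 35, 63),
    ("Family", 67, 227), ("Sports", 38, 50),
]


def summarize(rows):
    """Return the extreme genres and the column means of the Table 1 data."""
    budgets = [budget for _, budget, _ in rows]
    earnings = [earned for _, _, earned in rows]
    return {
        "highest_budget": max(rows, key=lambda row: row[1])[0],
        "highest_earnings": max(rows, key=lambda row: row[2])[0],
        "lowest_earnings": min(rows, key=lambda row: row[2])[0],
        "mean_budget": sum(budgets) / len(budgets),
        "mean_earnings": sum(earnings) / len(earnings),
    }


for name, value in summarize(TABLE_1).items():
    print(f"{name}: {value}")
```

Keeping the table as a list of tuples makes it easy to reuse the same rows for a scatter plot or
a regression later on.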
Figure 1: Scatter plot of average budget (in millions USD) vs. average box office earnings
This scatter plot shows the relationship between the average budget and average box office
earnings, both in millions USD. The data points cluster at the lower budgets, under $50 million,
which are associated with low earnings, mostly under $100 million. The data above this budget
level looks more linear and indicates a clearer positive relationship between budget and return.
This scatter plot was the best way to illustrate the data because we can see each data point in
relation to budget and earnings, making it easier to analyze. The plot shows a positive
correlation, but it looks non-linear, so I will find the $r^2$ value. The one noticeable trend is
that as the budget rises, so do box office earnings. This makes sense, as the more budget a movie
producer has, the better the actors, music artists, background sets, designers, and promotion
they can afford. Notably, the Action, Sci-Fi, and Fantasy genres stand out at the top of the
plot, with the highest earnings alongside the highest budgets: according to Table 1, the former
two genres earned $264 million and $341 million on average, respectively, on average budgets of
around $150 million. Such genres are mass-marketed and take advantage of international markets
too, so their high budgets can lead to higher box office returns. My preferred genre, Romance,
sits in the low-budget, low-earnings cluster, which supports my prediction. The overall pattern
in this graph reveals that, although in general the bigger the budget, the bigger the return,
genres such as Horror and Documentary can still receive relatively decent returns for
significantly lower investments.
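The strength of the positive correlation described above can be quantified with the Pearson
correlation coefficient. This is a minimal sketch using the Table 1 values; the helper name
`pearson_r` and the list names are hypothetical, and the coefficient only measures how close the
points are to a straight line, so it does not by itself settle whether a curve fits better.

```python
import math

# Table 1 columns in table order, in millions USD.
BUDGETS = [151, 138, 83, 23, 42, 16, 22, 153, 49, 108,
           48, 43, 35, 75, 6, 65, 79, 35, 67, 38]
EARNINGS = [264, 266, 176, 41, 88, 31, 55, 341, 91, 245,
            108, 112, 145, 174, 62, 143, 112, 63, 227, 50]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(BUDGETS, EARNINGS)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```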
Looking at this data, I think it matches the features of an exponential model because I see a
continuous growth pattern: as the budget increases, the box office earnings also increase at a
constant percentage rate. Visually, the data points form a curve, so I think an exponential
function may work.
Figure 2: Scatter plot with exponential regression line
The exponential function for my data is $y = 65.2e^{0.0108x}$. The data shows a continuous growth
pattern, meaning that as the budget increases, the box office earnings grow at an exponential
rate. The curve starts at the y-intercept and visually follows the pattern of the graph. It has
an $r^2$ value of 0.806, but a logarithmic regression model may also fit because it could show
how earnings grow fast at first and then slow down, representing how they do not grow
exponentially forever. Visually, the data rises quickly at first; as the budgets grow, so do the
revenues, but by smaller amounts. With this in mind, I tested the logarithmic function.
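For readers who want to try an exponential fit outside Excel, one common approach is to take the
natural log of the earnings and run an ordinary least-squares line through the transformed
points. The sketch below does that with the Table 1 values. The function names are hypothetical,
and this is an assumption about method rather than a reproduction of the trendline above:
different tools fit exponentials in different ways (for example on the untransformed data), so
the coefficients and $r^2$ printed here need not match 65.2, 0.0108 and 0.806.

```python
import math

# Table 1 columns in table order, in millions USD.
BUDGETS = [151, 138, 83, 23, 42, 16, 22, 153, 49, 108,
           48, 43, 35, 75, 6, 65, 79, 35, 67, 38]
EARNINGS = [264, 266, 176, 41, 88, 31, 55, 341, 91, 245,
            108, 112, 145, 174, 62, 143, 112, 63, 227, 50]


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy ** 2 / (sxx * syy)


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by regressing ln(y) on x; r^2 is on the log scale."""
    ln_a, b, r2 = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b, r2


a, b, r2 = fit_exponential(BUDGETS, EARNINGS)
print(f"y = {a:.1f} * exp({b:.4f} * x), r^2 (log scale) = {r2:.3f}")
```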
Figure 3: Scatter plot with logarithmic regression line
The logarithmic function for this graph is $y = 90.7\ln(x) - 214$. The $r^2$ value is 0.678,
which is lower than the $r^2$ value for the exponential function. Visually, this regression line
also follows the curve of the data, but not as accurately as the exponential function. Comparing
both regression models, the exponential function fits better both visually and based on its
$r^2$ value of 0.806. I will therefore use the exponential function for further analysis.
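A logarithmic trendline of the form $y = a + b\ln(x)$ can be obtained by running a least-squares
line through the points $(\ln x, y)$. The sketch below applies that idea to the Table 1 values so
the result can be compared against the reported coefficients 90.7 and −214. The function names
are hypothetical, and the fitting method is an assumption about how such a trendline is usually
computed rather than a statement of what Excel did here.

```python
import math

# Table 1 columns in table order, in millions USD.
BUDGETS = [151, 138, 83, 23, 42, 16, 22, 153, 49, 108,
           48, 43, 35, 75, 6, 65, 79, 35, 67, 38]
EARNINGS = [264, 266, 176, 41, 88, 31, 55, 341, 91, 245,
            108, 112, 145, 174, 62, 143, 112, 63, 227, 50]


def fit_logarithmic(xs, ys):
    """Fit y = a + b * ln(x) by least squares on (ln x, y); returns (a, b, r^2)."""
    log_xs = [math.log(x) for x in xs]
    n = len(xs)
    mean_lx = sum(log_xs) / n
    mean_y = sum(ys) / n
    sxx = sum((lx - mean_lx) ** 2 for lx in log_xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((lx - mean_lx) * (y - mean_y) for lx, y in zip(log_xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_lx
    return a, b, sxy ** 2 / (sxx * syy)


a, b, r2 = fit_logarithmic(BUDGETS, EARNINGS)
print(f"y = {a:.1f} + {b:.1f} * ln(x), r^2 = {r2:.3f}")
```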
Modelling and mathematical manipulation of results
To analyze the non-linear relationship between budget and earnings, I tried both exponential and
logarithmic regression models. The exponential model produced the equation $y = 65.2e^{0.0108x}$,
which indicates a continuous growth pattern, showing that as the budget increases, the box office
earnings increase at a constant percentage rate. Exponential continuous growth is a process in
which the rate of change of a quantity is proportional to its current value. In this equation, y
represents box office earnings on the vertical axis (dependent variable), while x represents
movie budgets on the horizontal axis (independent variable), both in millions USD. The constant
65.2 represents the predicted earnings for very low budgets (as x approaches 0), while the
coefficient 0.0108 in the exponent is the rate at which earnings increase: with each additional
$1 million of budget, the predicted box office earnings are multiplied by
$e^{0.0108} \approx 1.011$, an increase of about 1.1%. This model fits well for blockbuster
movies, where high budgets go with high box office earnings, since a huge budget allows for more
marketing and exposure, as I have mentioned before.
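The meaning of the exponent can be made concrete with a short worked calculation based only on
the reported equation. The first line shows the growth factor for one extra $1 million of
budget; the second solves for the extra budget $d$ (a symbol introduced here for illustration)
at which the model's predicted earnings double.

```latex
\[
\frac{y(x+1)}{y(x)} = \frac{65.2\,e^{0.0108(x+1)}}{65.2\,e^{0.0108x}} = e^{0.0108} \approx 1.0109
\]
\[
65.2\,e^{0.0108(x+d)} = 2 \cdot 65.2\,e^{0.0108x}
\;\Longrightarrow\; d = \frac{\ln 2}{0.0108} \approx 64.2
\]
```

In words, the model predicts about 1.1% more earnings per additional $1 million of budget, and a
doubling of earnings for roughly every additional $64 million.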
The strength of the exponential model is that it captures the growth observed on the plot for
higher-budget genres, like Action and Sci-Fi, which bring in higher returns. These films often
realize much higher returns, and their disproportionately large earnings indicate the high
profits that large investments can yield in the market. The limitation of this model is that it
assumes the growth rate is constant no matter how high the budgets are, which may not be
realistic for genres with smaller budgets. It also tends to underrepresent lower-budget films
that can be notably profitable despite a limited budget.
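The point about lower-budget profitability can be illustrated by dividing average earnings by
average budget for each genre in Table 1. This ratio is only a rough stand-in for ROI, an
assumption made for illustration: it ignores marketing spend and any other costs not captured in
the budget figure. The names `TABLE_1` and `earnings_per_budget` are hypothetical.

```python
# Table 1 values: (genre, average budget, average box office earnings), in millions USD.
TABLE_1 = [
    ("Action", 151, 264), ("Adventure", 138, 266), ("Animation", 83, 176),
    ("Drama", 23, 41), ("Comedy", 42, 88), ("Horror", 16, 31),
    ("Romance", 22, 55), ("Sci-Fi", 153, 341), ("Thriller", 49, 91),
    ("Fantasy", 108, 245), ("Musical", 48, 108), ("Mystery", 43, 112),
    ("Biography", 35, 145), ("Historical", 75, 174), ("Documentary", 6, 62),
    ("Crime", 65, 143), ("War", 79, 112), ("Western", 35, 63),
    ("Family", 67, 227), ("Sports", 38, 50),
]


def earnings_per_budget(rows):
    """Map each genre to its average earnings divided by its average budget."""
    return {genre: earned / budget for genre, budget, earned in rows}


ratios = earnings_per_budget(TABLE_1)
# List genres from the highest ratio to the lowest.
for genre, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{genre:<12}{ratio:6.2f}")
```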
On the other hand, the logarithmic regression model, represented by the equation
$y = -214 + 90.7\ln(x)$, represents the data in a different way, as it highlights how ROIs may
diminish as budgets grow higher rather than staying constant forever. In this equation,
$\ln(x)$ stands for the natural logarithm of the budget, which means that earnings rise with
higher budgets, but the increment from each additional unit of budget gets smaller. The
coefficient 90.7 represents the expected gain in box office earnings for every one-unit increase
in the natural log of the budget, and the constant −214 serves as the intercept for this model.
A strength of the logarithmic model is that it helps include the outcome for low-budget films,
which may still perform well even with smaller investments.
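The diminishing-returns behaviour of the logarithmic model can be seen by evaluating the
reported equation at a few budgets. In this sketch (function names are hypothetical), doubling
the budget always adds the same amount, $90.7\ln 2$, to predicted earnings, so each extra dollar
buys less as budgets grow.

```python
import math


def log_model(budget):
    """Reported logarithmic model: predicted earnings for a budget, both in millions USD."""
    return -214 + 90.7 * math.log(budget)


def gain_from_doubling(budget):
    """Extra predicted earnings when the budget is doubled."""
    return log_model(2 * budget) - log_model(budget)


def marginal_gain(budget):
    """Derivative of the model: predicted extra earnings per extra $1 million of budget."""
    return 90.7 / budget


for budget in (20, 40, 80, 160):
    print(f"budget {budget:>3}: doubling adds {gain_from_doubling(budget):.1f}, "
          f"next million adds {marginal_gain(budget):.2f}")
```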
Another strength of the logarithmic model is that it shows diminishing returns, illustrating
that increased spending does not necessarily promise great profits. This model also encapsulates
films with a lower budget and shows that even those genres, like horror and documentary, can be
profitable too. It is, however, a little more complicated to interpret, and it might be
misleading as far as perceived earnings are concerned. Also, the logarithmic model cannot predict
the performance of high-budget films very well, since it underestimates the earnings of
blockbuster outliers.
Comparing these two models, each focuses on different aspects. The exponential model highlights
how very high-budget films show exponential increases in earnings, while the logarithmic model
takes better account of the profitability of a film when budget levels are lower. Overall, both
equations give an understanding of how varying movie budgets impact box office earnings,
capturing the complexities of the film industry and audience behavior across different genres. I
believe the exponential model is best for my data, because it fits the graph, showing a
continuous growth pattern, and has a higher $r^2$ value.
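The comparison above can be examined genre by genre by evaluating both reported equations at
each Table 1 budget and placing the predictions next to the actual average earnings. This is a
minimal sketch with hypothetical function names; it uses the coefficients exactly as reported
and makes no claim about which tool produced them.

```python
import math

# Table 1 values: (genre, average budget, average box office earnings), in millions USD.
TABLE_1 = [
    ("Action", 151, 264), ("Adventure", 138, 266), ("Animation", 83, 176),
    ("Drama", 23, 41), ("Comedy", 42, 88), ("Horror", 16, 31),
    ("Romance", 22, 55), ("Sci-Fi", 153, 341), ("Thriller", 49, 91),
    ("Fantasy", 108, 245), ("Musical", 48, 108), ("Mystery", 43, 112),
    ("Biography", 35, 145), ("Historical", 75, 174), ("Documentary", 6, 62),
    ("Crime", 65, 143), ("War", 79, 112), ("Western", 35, 63),
    ("Family", 67, 227), ("Sports", 38, 50),
]


def exponential_model(budget):
    """Reported exponential model, y = 65.2 * e^(0.0108 x)."""
    return 65.2 * math.exp(0.0108 * budget)


def logarithmic_model(budget):
    """Reported logarithmic model, y = -214 + 90.7 ln(x)."""
    return -214 + 90.7 * math.log(budget)


print(f"{'Genre':<12}{'Budget':>7}{'Actual':>8}{'Exp.':>8}{'Log.':>8}")
for genre, budget, earned in TABLE_1:
    print(f"{genre:<12}{budget:>7}{earned:>8}"
          f"{exponential_model(budget):>8.1f}{logarithmic_model(budget):>8.1f}")
```

Reading across a row shows where each model is close to the data and where it misses, which is a
useful complement to a single $r^2$ figure.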
Analysis and Conclusion
After examining the relationship between average movie budgets and box office earnings across
various genres, I have a greater appreciation for the movie industry, as there is so much more
that goes into making a successful movie aside from the filming itself, such as marketing and
genre. The data revealed a trend in which higher budgets go with higher box office earnings,
although lower-budget movies can still be profitable, just not on the scale of blockbuster genres
like action. The exponential model suggests a strong potential for blockbuster films to achieve
great returns on investment and shows how higher budgets can translate into better marketing and
larger audiences. These findings helped me realize that the film industry is not solely driven
by financial resources but also by storytelling, which depends on genre, and by effective
marketing strategies. It is evident that genres like Action and Sci-Fi, which attract larger
audiences to movie theaters, often dominate box office returns despite their high production
costs. In contrast, lower-budget films, such as Documentaries and Horror movies, can still
achieve success, emphasizing that audience engagement often transcends budget constraints. It is
ultimately in the hands of the filmmakers to use the budget to its fullest extent to make a
successful movie, whatever that budget is.
This analysis has shown the role of context in the interpretation of data within the film
industry. The budget of a movie is very important in determining box office revenue, but so are
audience preference and marketing, and even then there can always be outliers like blockbuster
movies or flops. Understanding this was the aim of this paper.
This analysis of how film budgets relate to box office revenues has highlighted more of the
operational frameworks in the film industry. The data suggests that although there is a strong
relationship between budget and box office earnings, it is affected by different aspects for
each genre. The exponential model helps us understand these nuances, keeping in mind that higher
budgets do not always guarantee success and, conversely, that lower-budget films sometimes do
incredibly well under the right conditions, depending on the filmmakers. Looking through the
results, my research could benefit from more inclusive data, particularly on international
movies and other factors that might explain box office outcomes. Marketing strategies, critical
reception, and audience demographics could be investigated further to improve future analyses.
Moreover, studying this trend over time with a longitudinal approach would help portray how these
relationships evolve in response to the changing dynamics of the industry. It also raises the
question of whether people choose movies based on story and emotion or for the spectacle of the
experience, as with Marvel films. A movie like Sound of Freedom, a thriller with a budget of only
$14.5 million that made over $200 million, shows how the story sometimes matters more. As society
is always changing, it is hard to predict what the audience will like.
My analysis provides a starting point for future research, demonstrating the need for an
inclusive approach to understanding the relationships in the film industry. Knowing the
limitations of this analysis, and by making an effort to include more variables, I hope to better
understand the impact of budget on box office returns without forgetting the important role that
story and audience engagement play in how films succeed.
Evaluation and Extensions
Through reflection on my analysis, there are a few areas of improvement:
More data: I should have collected more data and more genres. Including international sources
would also have strengthened my results and helped explain which external factors affect box
office earnings.
Different/More Variables: Towards the end, I realized there were other factors besides budget
that may have influenced the box office earnings. Elements like marketing expenditure, release
timing, and audience demographic characteristics would give a more holistic view of the factors
that determine box office success. This would give a clearer picture of the relationship between
budget and earnings, but it may have been too many variables to control.
More outlier analysis: It would have been valuable to investigate the outliers, the unusual
elements that make certain movies hugely successful, like "Avatar" or "Avengers: Endgame". I
only mentioned the outliers in my analysis but never went into depth, which may have helped my
results. Understanding the factors related to these anomalies would help in fine-tuning the
models.
Testing more models: I could have explored more models, like polynomial and linear, even if they
did not work, and explained why they did not fit, which would have improved the analytical and
predictive capability of my analysis.
Better data graphs: I could have added more elements to my scatter plots, like error bars and the
genre name on each data point to show which is which, rather than giving only a broad idea. This
would make the plots easier to read and understand.
Longitudinal Analysis: Having data from different years would let me examine trends over an
extended period of time to understand the ever-changing relationship between budget and
earnings. It might show patterns influenced by changes in market dynamics and the tastes of the
audience. I hope future studies can address these limitations and explore these possible
extensions to arrive at a deeper understanding of the relationship between film budgets and box
office performance across genres.
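As a starting point for the "Testing more models" extension listed above, a straight-line model
can be fitted to the Table 1 values and its $r^2$ compared with the values reported for the
exponential and logarithmic models. This is a minimal sketch with hypothetical names, using
ordinary least squares; it was not part of the original analysis.

```python
# Table 1 columns in table order, in millions USD.
BUDGETS = [151, 138, 83, 23, 42, 16, 22, 153, 49, 108,
           48, 43, 35, 75, 6, 65, 79, 35, 67, 38]
EARNINGS = [264, 266, 176, 41, 88, 31, 55, 341, 91, 245,
            108, 112, 145, 174, 62, 143, 112, 63, 227, 50]


def fit_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, sxy ** 2 / (sxx * syy)


intercept, slope, r2 = fit_linear(BUDGETS, EARNINGS)
print(f"y = {intercept:.1f} + {slope:.2f} * x, r^2 = {r2:.3f}")
```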
Bibliography
"Box Office Mojo." Box Office Mojo, IMDb.com, Inc., https://www.boxofficemojo.com/. Accessed
16 Sept. 2024.
"Genre Movies." Statista, https://www.statista.com/search/?q=genre+movies&p=1. Accessed 16
Sept. 2024.
"2024 Box Office Summary." The Numbers, https://www.the-numbers.com/market/2024/summary.
Accessed 16 Sept. 2024.
"Movie Genres in North America by Box Office Revenue Since 1995." Statista,
https://www.statista.com/statistics/188658/movie-genres-in-north-america-by-box-office-revenue-since-1995/.
Accessed 16 Sept. 2024.
Saint Fleur, J. "Movie Box Office Analysis." GitHub,
https://github.com/jsaintfleur/Movie-Box-Office-Analysis. Accessed 16 Sept. 2024.
"Highest Grossing Genres at the 2023 Box Office." Media CSuite,
https://mediacsuite.com/highest-grossing-genres-at-the-2023-box-office/. Accessed 16 Sept. 2024.