Document 8
presidential elections. The dataset spans from 1904 to 2020, comprising a wealth of economic and
non-economic variables. Through thorough examination and visualization of this data, we aim to
uncover trends, patterns, and relationships that could influence election outcomes. The EDA process
involves identifying key variables such as GDP growth rate, inflation rate, and approval ratings, and
assessing their impact on electoral results.
By conducting EDA, we seek to gain valuable insights into the factors shaping US presidential elections,
setting the stage for more advanced modeling and prediction techniques.
Variables considered:
Target variable:
The target variable in this project is the percentage of electoral votes obtained by the incumbent
party's candidate in the US presidential elections. This variable serves as a crucial indicator of election
success, reflecting the level of support and confidence in the incumbent party's leadership. By
analyzing this target variable across election cycles, we aim to understand the factors influencing voter
behavior and ultimately predict future election outcomes.
The decision rule for this variable is straightforward: if the percentage is greater than 50%, the
election is classified as successful for the incumbent party; otherwise, it is classified as
unsuccessful. This binary classification is shown in the graph, where the green area denotes
"yes" and the red area denotes "no." The two outcomes are roughly balanced, so successful and
unsuccessful elections are about evenly represented across the analyzed elections.
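The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name incumbent_won and the threshold parameter are assumptions introduced here, and the two percentages are the 1936 and 1952 values quoted in this section.

```python
def incumbent_won(electoral_pct: float, threshold: float = 50.0) -> bool:
    """Return True when the incumbent party's electoral share exceeds the threshold."""
    return electoral_pct > threshold


# Electoral percentages quoted in this section for 1936 and 1952.
shares = {1936: 98.5, 1952: 16.8}

# Map each election year to the "yes"/"no" label used in the graph.
labels = {year: ("yes" if incumbent_won(pct) else "no") for year, pct in shares.items()}
print(labels)  # {1936: 'yes', 1952: 'no'}
```

Note that a share of exactly 50% is labeled "no" under this rule, because the text requires the percentage to be strictly greater than 50%.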
The waterfall chart of the percentage of electoral votes obtained by the incumbent party's candidate
in US presidential elections reveals a pattern of normal fluctuations over time, with one notable
anomaly between 1936 and 1952.
During this period, the incumbent party's electoral percentage drops sharply, from 98.5% in 1936
to 16.8% in 1952, a fall of 81.7 percentage points.
This drastic decrease prompts further investigation into the underlying causes. If the explanatory
data cannot account for the drop, it may degrade the performance of our predictive model.
By analyzing the explanatory variables associated with these elections, we aim to uncover patterns or
events that may explain this unprecedented shift in electoral performance. The provided dataset,
encompassing electoral percentages and changes across various election years, will be instrumental in
gaining insights into this historical anomaly.
The table presents key summary statistics for the electoral percentage variable, including the mean,
standard deviation, and select percentiles. These statistics offer insights into the central tendency,
variability, and distribution of electoral percentages in US presidential elections.
Complementing this, the graph shows a symmetrical, bell-shaped curve, characteristic of a
normal distribution. It depicts how electoral percentages are distributed across the analyzed
elections. Together, these statistics and the graph contribute to a comprehensive understanding of
the electoral landscape, aiding in the analysis and prediction of future election results.
A review of the previous literature on this topic identified numerous relevant predictor features,
to which we added further features judged to be significant. Features are divided into two types:
economic and non-economic. The features considered in this modeling project are:
Economic variables:
• GDP growth rate: This variable is the percentage change in US real GDP per capita
between the year preceding the election and the election year itself. Real GDP per capita was
selected so that the measure is not distorted by population growth or inflation.
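As a minimal sketch of the definition above, the growth rate can be computed as the relative change between two consecutive annual values. The function name growth_rate and the GDP figures below are hypothetical and for illustration only; they are not taken from the dataset. The result is expressed as a fraction, which appears to match the scale of the descriptive statistics reported for this variable.

```python
def growth_rate(previous: float, current: float) -> float:
    """Fractional change from the pre-election year to the election year."""
    if previous == 0:
        raise ValueError("previous-year value must be non-zero")
    return (current - previous) / previous


# Hypothetical real GDP per capita values (illustrative only, not from the dataset).
gdp_prev, gdp_election = 50_000.0, 51_000.0

rate = growth_rate(gdp_prev, gdp_election)
print(f"{rate:.3f}")  # 0.020
```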
The table of descriptive statistics for the GDP growth rate offers a concise overview of its distribution
over the analyzed election years.
The mean GDP growth rate is approximately 0.017 (about 1.7%, reading the values as fractions),
indicating a modest average increase in real GDP per capita.
The standard deviation is around 0.052, reflecting variability in the economic performance across
different election cycles.
The minimum value observed is -0.151, highlighting significant economic downturns in certain periods,
while the maximum value of 0.122 showcases periods of robust economic growth.
The 25th percentile is -0.00025, the median (50th percentile) is 0.0235, and the 75th percentile is
0.04075, which illustrates the interquartile range where the middle 50% of the GDP growth rates lie.
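The statistics listed above can be reproduced with Python's standard library, as sketched below. The helper name summarize and the sample values are hypothetical, and the "inclusive" quartile method is an assumption, since the text does not say how the percentiles were computed. The last lines use only the quartiles reported above to obtain the width of the interquartile range.

```python
import statistics


def summarize(values):
    """Mean, sample standard deviation, extremes, and quartiles of a sequence."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": max(values),
    }


# Hypothetical growth rates, for illustration only (not the project's data).
sample = [-0.02, 0.01, 0.02, 0.03, 0.05]
stats = summarize(sample)

# Interquartile range computed from the quartiles reported in the text.
iqr_reported = 0.04075 - (-0.00025)
print(round(iqr_reported, 5))  # 0.041
```

With the reported quartiles, the middle 50% of growth rates span a width of about 0.041.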
The waterfall chart of the GDP growth rate visually represents the economic fluctuations over time. A
notable observation from this chart is the significant drop in the GDP growth rate between 1936 and
1952, mirroring the drop observed in the target variable of electoral percentages.
This period of economic volatility coincides with major historical events, such as the Great Depression
and World War II, which likely influenced both economic performance and electoral outcomes.
Further investigation into the explanatory variables during this period is essential to understand the
underlying factors contributing to these changes.
The graph below comparing the GDP growth rate and electoral percentage across election years
reveals a strikingly similar pattern between the two variables.
This visual comparison suggests a potential relationship between economic performance and electoral
success, where periods of positive GDP growth often correlate with higher electoral percentages for
the incumbent party.
The alignment of patterns between these two variables underscores the importance of economic
indicators in understanding voter behavior and predicting election outcomes. By analyzing these
trends, we can gain deeper insights into how economic conditions may influence political support.
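One way to put a number on the visual similarity described above is the Pearson correlation coefficient. The sketch below is an assumption about how such a check could be done, not a method stated in this report; the function name pearson_r and both value lists are hypothetical and are not drawn from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired observations (illustrative only, not from the dataset).
gdp_growth = [-0.05, 0.00, 0.02, 0.04]
electoral_pct = [20.0, 50.0, 48.0, 70.0]

r = pearson_r(gdp_growth, electoral_pct)
print(f"r = {r:.2f}")
```

A value of r near +1 would support the reading that higher GDP growth goes with higher incumbent electoral percentages, although with so few elections any such estimate should be treated with caution.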
• Inflation rate: The US inflation rate in each election year. It is the percentage change in the
general price level of goods and services on an annual basis, and it quantifies the rate at
which the purchasing power of the currency declines from the year preceding the election to
the election year.