Extended Essay First Draft
Extended Essay
How can space debris in low Earth orbit be modelled using statistics?
1. Introduction
b. Hypothesis Testing
b. Piecewise Graph
3. Conclusion
Introduction
On October 4th, 1957, the Soviet Union successfully launched Sputnik 1, using its R-7
launch vehicle to deliver the first artificial satellite into low Earth orbit. As of 2011, “over four
thousand rockets have sent more than six thousand payloads into orbit, greatly improving the
world's capacity to retrieve, transmit, and share information” (Chen, 2011). This already large
figure has continued to grow rapidly in the past decade, due to a renaissance of the space
industry. William Welser IV, the director of Engineering and Applied Sciences at the RAND
Corporation, described this phenomenon in a 2015 article. He writes that the democratization of
the space industry “has made outer space accessible to not only the global superpowers and large
multinationals, but to developing countries, start-ups, universities, and even high schools”
(Welser, 2015). SpaceX, a company that uses reusable rockets to maintain a rapid launch
cadence, had launched over 1,700 of its Starlink satellites as of August 2021 (Sheetz, 2021).
This satellite constellation provides broadband internet access to more remote regions of the
world. As our dependence on infrastructure in low Earth orbit (LEO) increases, it is imperative
To counteract the effects of Earth’s gravitational pull, satellites in orbit must be moving
horizontally at extremely high speeds. As an example, the International Space Station, to which
the universal principles of orbital mechanics apply, orbits 400 kilometers above the Earth and
travels tangentially to the Earth’s surface at a rate of almost eight kilometers per second. It is
estimated that the average relative velocity of space debris and a satellite during a collision is ten
kilometers per second (Chen, 2011). These high speeds allow for very small particles to release
enormous amounts of energy in the event of a collision, breaking satellites into many smaller
pieces.
If a satellite becomes obsolete or malfunctions, its orbit will slowly decay as a result of
atmospheric drag. There are a number of factors that influence the rate of orbital decay, such as
the mass and surface area of a satellite. At lower orbital altitudes (200 km to 600 km), without
correction maneuvers, satellites’ orbits will decay within a few months to a few
years (NASA Goddard Space Flight Center). Geostationary orbits of around 36,000 km in altitude
are useful for weather and communications satellites because they allow a satellite to remain in
one position above the Earth. Because of the negligible air density at this altitude, satellites will
remain there indefinitely. These defunct satellites, as well as spent rocket stages and other
On February 10, 2009, the Iridium-33 communications satellite collided with the retired
Cosmos 2251, a Russian military satellite, at nearly ten kilometers a second. From this one
collision, nearly 1,900 new pieces of space debris were created, revealing just how quickly these
figures can grow. According to a study conducted by NASA in early 2021, “Millions of pieces of
orbital debris exist in low Earth orbit (LEO) - at least 26,000 the size of a softball or larger that
could destroy a satellite on impact; over 500,000 the size of a marble big enough to cause damage
to spacecraft or satellites; and over 100 million the size of a grain of salt that could puncture a
In 1978, NASA scientist Donald J. Kessler proposed a scenario in which low Earth orbit
becomes inaccessible due to the density of space debris. The ramifications of “Kessler
positioning system satellites would be at risk of destruction, and humanity would be trapped on
Earth for hundreds or thousands of years. Even if all space launches were stopped today, it is
likely that the exponential growth of debris could render certain common orbits, such as
geostationary orbit, inaccessible. It is also entirely possible that the number of sequential
collisions required to reach the tipping point will never occur. I will collect data and attempt
to determine a relationship between satellite launch rate and debris in orbit. I will then examine if
To better understand how space debris has increased over the past sixty years of
spaceflight, it is beneficial to visualise this relationship using a graph. Figure 1 shows data
evolutionary, three-dimensional model used for the study of long-term debris environmental
projection (National Aeronautics and Space Administration, 2019). Collisions with objects in
LEO larger than ten centimeters would result in a catastrophic loss of vehicle or crew (NASA’s
Orbital Debris Program Office, 2019). This graph features a clear positive trend, showing that
Figure 1
Linear Regression Model
Linear regression analysis is useful when attempting to model a relationship between the
independent and dependent variables. A linear equation can be fitted to the data in the form
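$y = \beta_0 + \beta_1 x$, where $\beta_0$ is the y-intercept and $\beta_1$ is the slope. The following equations will be used:
$$S_x = \sqrt{\Sigma x^2 - \frac{(\Sigma x)^2}{n}} \qquad S_y = \sqrt{\Sigma y^2 - \frac{(\Sigma y)^2}{n}} \qquad S_{xy} = \Sigma xy - \frac{\Sigma x \Sigma y}{n}$$
$$r = \frac{S_{xy}}{S_x S_y} \qquad \beta_1 = r\left(\frac{S_y}{S_x}\right)$$
To find $\beta_0$, the point $(\bar{x}, \bar{y})$ will be substituted into the equation written in point-slope form.
$$y - \bar{y} = \beta_1(x - \bar{x})$$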
Calculations:
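$$S_x = \sqrt{\Sigma x^2 - \frac{(\Sigma x)^2}{n}} = \sqrt{81375 - \frac{(1953)^2}{62}} \approx 140.9$$
$$S_y = \sqrt{\Sigma y^2 - \frac{(\Sigma y)^2}{n}} = \sqrt{6916407559 - \frac{(546023)^2}{62}} \approx 45909.5$$
$$S_{xy} = \Sigma xy - \frac{\Sigma x \Sigma y}{n} = 23561758 - \frac{(1953)(546023)}{62} = 6362033.5$$
$$r = \frac{S_{xy}}{S_x S_y} = \frac{6362033.5}{(140.9)(45909.5)} \approx 0.984$$
$$\beta_1 = r\left(\frac{S_y}{S_x}\right) = 0.984\left(\frac{45909.5}{140.9}\right) \approx 320.5$$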
Using the point $(\bar{x}, \bar{y})$, or (31.5, 8806.8), the
point-slope form:
$y = 320.5x - 1289.25$
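The regression formulas can be checked numerically. The following is a minimal sketch that takes only the sums quoted in the calculations (n = 62, Σx, Σy, Σx², Σy², Σxy) and applies the same equations; the function name `linear_fit_from_sums` is a hypothetical helper introduced here for illustration, not part of the original working.

```python
import math

# Sums quoted in the essay's calculations (n = 62 yearly observations).
n = 62
sum_x, sum_y = 1953, 546023
sum_x2, sum_y2 = 81375, 6916407559
sum_xy = 23561758


def linear_fit_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (r, slope, intercept) using the S_x, S_y and S_xy formulas."""
    s_x = math.sqrt(sum_x2 - sum_x ** 2 / n)
    s_y = math.sqrt(sum_y2 - sum_y ** 2 / n)
    s_xy = sum_xy - sum_x * sum_y / n
    r = s_xy / (s_x * s_y)
    slope = r * (s_y / s_x)
    # Point-slope form through the mean point (x_bar, y_bar).
    x_bar, y_bar = sum_x / n, sum_y / n
    intercept = y_bar - slope * x_bar
    return r, slope, intercept


r, slope, intercept = linear_fit_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(f"r = {r:.3f}, slope = {slope:.1f}, intercept = {intercept:.1f}")
```

Because this sketch carries full precision instead of rounding each intermediate value, it gives a slope of about 320.4 and an intercept of about -1286.3, slightly different from the hand-rounded 320.5 and -1289.25.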
Columns “xy,” “$x^2$,” and “$y^2$,” as well as the
obtained.
centimeters.
Figure 2
Figure 3
Figure 3 graphs a linear model for this data, using the equation
$y = 320.5x - 1289.25$. The slope of the linear model, $\beta_1$, predicts that the number of
objects larger than 10 cm in low Earth orbit increases by about 320.5 every year. A correlation
coefficient, or r value, of 0.984 indicates a strong positive linear relationship between the number of
objects in LEO greater than ten centimeters in diameter and years since 1957. The negative y-intercept of
-1289.25 shows that the model is not valid near x = 0, as there cannot be a negative number of objects in
orbit.
Visually, the linear model fits the data well for most x values. However, near x=50, the
Residual Analysis of Linear Model
To examine the extent to which this model accounts for variation in the data, a residual
analysis can be performed using the equation $r = x - x_0$, where r is the residual (not the
correlation coefficient), x is the data value, and $x_0$ is the value predicted by the linear model
$y = 320.5x - 1289.25$.
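The residual calculation can be sketched in a few lines. The example below uses the essay's linear model, but the three data points are hypothetical values chosen only to show the arithmetic; they are not the measured debris counts.

```python
def predict_linear(x, slope=320.5, intercept=-1289.25):
    """Value predicted by the essay's linear model for year index x."""
    return slope * x + intercept


def residuals(xs, ys, model=predict_linear):
    """Residual = observed value minus the value predicted by the model."""
    return [y - model(x) for x, y in zip(xs, ys)]


# Hypothetical observations (years after 1957, object count), for illustration only.
sample_x = [10, 30, 50]
sample_y = [2000.0, 8300.0, 14500.0]

sample_residuals = residuals(sample_x, sample_y)
print(sample_residuals)
```

A positive residual means the model underestimates the observed value, and a negative residual means it overestimates it. Plotting these values against x gives a residual plot of the kind used in this analysis.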
Figure 4
Figure 4 graphs residuals from the linear model $y = 320.5x - 1289.25$
against years after 1957. The seemingly random scatter close to the x-axis for x-values between
4 and 28 indicates that this linear model is a good fit for the measured data in that range. The model fails
to account for variation in the data from about x = 42 to x = 57, where a more distinct pattern can be
observed. This result suggests that a linear model does not fit the data in this region of the
domain. To further examine the fit of this linear model, a hypothesis test can be conducted.
Hypothesis Testing
A hypothesis test will be useful in determining how atypical the result is, and if the
significant linear relationship between the independent and dependent variables, then obtaining
the sample data would not be too unlikely. The null hypothesis $H_0$ states that there is no
significant linear relationship between the two variables. At a significance level, α, of 0.05, the
null hypothesis will be rejected if there is a less than 5% chance of obtaining data as extreme as
the sample data if the null hypothesis were true (La Trobe University).
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
To find the standard error of the slope, $S_{\beta_1}$, the following equation can be used, where $y_i$ is the value of the
dependent variable for observation i, $\hat{y}_i$ is the estimated value of the dependent variable for
observation i, $x_i$ is the observed value of the independent variable for observation i, $\bar{x}$ is the mean
of the independent variable, and n is the number of observations:
$$S_{\beta_1} = \frac{\sqrt{\dfrac{\Sigma(y_i - \hat{y}_i)^2}{n - 2}}}{\sqrt{\Sigma(x_i - \bar{x})^2}} = \frac{\sqrt{\dfrac{61{,}823{,}128.71}{60}}}{\sqrt{19{,}855.5}}$$
$$S_{\beta_1} \approx 7.20$$
Test statistic t is defined by the following equation:
$$t = \frac{\beta_1}{S_{\beta_1}} = \frac{320.5}{7.20} \approx 44.5$$
This test statistic can then be compared to a critical value corresponding to a
significance level, α, of 0.05, and 60 degrees of freedom, df. If the absolute value of test statistic
t is greater than the critical value, the null hypothesis is rejected. If it is less than the critical value,
the null hypothesis is not rejected (Stat Trek). A critical value of 1.671 was obtained from a t table.
Because the test statistic of 44.5 is far greater than the critical value, the null hypothesis is
rejected. The sample data provides sufficient evidence of a linear relationship between the number of
years after 1957 and the quantity of debris in low Earth orbit greater than ten centimeters in
diameter.
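The standard error and test statistic can be evaluated directly from the two sums quoted in this section. This is a minimal sketch under stated assumptions: 61,823,128.71 is taken to be the sum of squared residuals and 19,855.5 the sum of squared deviations of x, neither of which is squared again. The two critical values are standard t-table entries for 60 degrees of freedom at α = 0.05, and the variable names are illustrative.

```python
import math

# Values quoted in the essay.
sum_sq_residuals = 61_823_128.71   # sum of (y_i - y_hat_i)^2
sum_sq_x_dev = 19_855.5            # sum of (x_i - x_bar)^2
n = 62
slope = 320.5

# Standard error of the slope: sqrt(SSE / (n - 2)) / sqrt(Sxx).
std_error = math.sqrt(sum_sq_residuals / (n - 2)) / math.sqrt(sum_sq_x_dev)
t_stat = slope / std_error

# Standard t-table critical values for df = 60 at alpha = 0.05.
T_CRIT_ONE_TAILED = 1.671
T_CRIT_TWO_TAILED = 2.000  # the usual choice for a two-sided alternative (beta_1 != 0)

reject_null = abs(t_stat) > T_CRIT_TWO_TAILED
print(f"standard error = {std_error:.2f}, t = {t_stat:.1f}, reject H0: {reject_null}")
```

With either critical value the decision is the same, since the test statistic is more than twenty times larger than both.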
Piecewise Model
Near x = 50, the data begin to deviate significantly from the values predicted by the
linear regression. The pattern
observed in the residual plot (Figure 4) indicates that a non-linear relationship might better
model this data. To account for this, the data can be modeled using a piecewise function. A linear
model will be used when 𝑥 < 50, as the random pattern observed in the residual analysis
indicates this model is appropriate. A cubic model will be used when 𝑥 ≥ 50, as the data first
Cubic Regression Model
Serving a similar function to its linear counterpart, cubic regressions are useful when
modeling a cubic relationship between an independent and dependent variable. A cubic equation
in the form $y = ax^3 + bx^2 + cx + d$ can be fitted to the data. Coefficients a, b, c, and d, will
$$y = (1.22 \cdot 10^{1})x^3 - (2.06 \cdot 10^{3})x^2 + (1.17 \cdot 10^{5})x - (2.17 \cdot 10^{6})$$
Figure 5
$$y = (1.22 \cdot 10^{1})x^3 - (2.06 \cdot 10^{3})x^2 + (1.17 \cdot 10^{5})x - (2.17 \cdot 10^{6})$$
$$R^2 = 0.924$$
The cubic equation graphed in Figure 5 seems to represent the data more accurately in the
domain interval $50 \le x \le 62$. A coefficient of determination, or $R^2$ value, of 0.924 was obtained
using a graphing calculator and indicates a strong cubic relationship between the independent
and dependent variables. To further test the fit of this model, a residual analysis can be
performed. The process to create a residual plot for a cubic model is identical to that of the linear
model. Residuals can be calculated using the equation $r = x - x_0$, where r is the residual, x is the
data value, and $x_0$ is the value predicted by the cubic model
$y = (1.22 \cdot 10^{1})x^3 - (2.06 \cdot 10^{3})x^2 + (1.17 \cdot 10^{5})x - (2.17 \cdot 10^{6})$.
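The coefficient of determination reported by a graphing calculator can also be computed by hand from the observed and predicted values. The sketch below assumes the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares; the four data points are hypothetical and serve only to illustrate the calculation.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and model predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(round(r_squared(observed, predicted), 3))
```

A value close to 1 means the model accounts for most of the variation in the observed data, which is how the value of 0.924 for the cubic model is interpreted.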
Figure 6
Figure 6 graphs residuals from the cubic model against years after 1957. The pattern
appears almost entirely random, indicating a good fit for this model. The cubic function
$y = (1.22 \cdot 10^{1})x^3 - (2.06 \cdot 10^{3})x^2 + (1.17 \cdot 10^{5})x - (2.17 \cdot 10^{6})$ will serve as the second
sub-function in the piecewise model. As previously mentioned, the first sub-function will come
$y = \beta_1 x + \beta_0$
$y = 273.8x - 435.5$
$r = 0.976$
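Combining the two sub-functions with the domains given in the Piecewise Model section, the piecewise model can be written as a single function. The breakpoint at x = 50 comes from that section; the upper bound of 62 is an assumption, taken from the interval over which the cubic model was fitted.

```latex
\[
y =
\begin{cases}
273.8x - 435.5, & x < 50 \\[4pt]
(1.22 \cdot 10^{1})x^3 - (2.06 \cdot 10^{3})x^2 + (1.17 \cdot 10^{5})x - (2.17 \cdot 10^{6}), & 50 \le x \le 62
\end{cases}
\]
```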
Figure 7
Figure 7 graphs the piecewise function shown above. This model is a better fit for the
data than the solely linear model derived earlier, as it accounts for the patterned residuals from
50 ≤ 𝑥 ≤ 62.
Conclusion
Several different methods of modelling data have been discussed in previous sections, as
well as methods for testing accuracy of those models. Regressions allow for the construction of
both linear and non-linear models which can be fitted to a set of data points. Residual analysis
was conducted to test if those models were appropriate for the given data. A hypothesis test
determined there was a less than 5% chance of obtaining data as extreme as the sample data if
In terms of relating these findings to real-world applications, the sample data used was
limited in two significant ways. By only accounting for objects greater than ten centimeters in
diameter, all smaller objects were excluded. Smaller objects can still pose great risk to other
structures in orbit due to the nature of kinetic energy, represented by the equation $KE = \frac{1}{2}mv^2$.
Simply put, because it is squared, velocity has a more significant impact on kinetic energy than
mass. The most dangerous pieces of space junk are too small to be tracked. Consequently,
collisions with these small but fast moving objects cannot be predicted or knowingly avoided.
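A short calculation illustrates why velocity dominates. The sketch below evaluates the kinetic energy equation for a hypothetical one-gram fragment (an assumed mass, chosen only for illustration) at the ten kilometers per second relative velocity cited earlier, then compares the effect of doubling the speed with the effect of doubling the mass.

```python
def kinetic_energy(mass_kg, speed_m_per_s):
    """Kinetic energy in joules: KE = 1/2 * m * v^2."""
    return 0.5 * mass_kg * speed_m_per_s ** 2


# Hypothetical 1 g fragment at the 10 km/s relative velocity cited in the essay.
fragment_energy = kinetic_energy(0.001, 10_000)

# Doubling the speed quadruples the energy; doubling the mass only doubles it.
speed_ratio = kinetic_energy(0.001, 20_000) / fragment_energy
mass_ratio = kinetic_energy(0.002, 10_000) / fragment_energy

print(f"{fragment_energy:.0f} J, speed x2 -> x{speed_ratio:.0f}, mass x2 -> x{mass_ratio:.0f}")
```

Under these assumptions the fragment carries about 50,000 joules, which shows how even a very small object can damage a satellite.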
The space debris included in the sample data is in low Earth orbit, defined as orbits of less than 2,000
kilometers in altitude (National Aeronautics and Space Administration). While collisions with
space debris at these lower altitudes certainly pose a risk to human spaceflight and some
satellites, a far more concerning scenario is one in which geostationary orbit becomes
inaccessible. This orbit of about 36,000 kilometers is essential for weather, communication, and
Statistical analysis is useful when predicting the nature of growth of space debris.
List of Sources
NASA. “NASA's Efforts to Mitigate the Risk Posed by Orbital Debris.” 2021,
NASA Goddard Space Flight Center. “The Hubble Space Telescope Servicing Missions.”
https://orbitaldebris.jsc.nasa.gov/modeling/legend.html#:~:text=LEGEND%20is%20a%20full%2Dscale,long%2Dterm%20debris%20environment%20projection. Accessed 18 August 2021.
National Aeronautics and Space Administration. “LEO Economy FAQs.” NASA, 19 November
Sheetz, Michael. “SpaceX says Starlink has about 90,000 users as the internet service gains
https://www.cnbc.com/2021/08/03/spacex-starlink-satellite-internet-has-about-90000-use
Stat Trek. “Hypothesis Test for Regression Slopes.” Stat Trek, 2021,
Welser, William, and Dave Baiocchi. “The Democratization of Space: New Actors Need New
https://www.foreignaffairs.com/articles/space/2015-04-20/democratization-space.