Final Project

Uploaded by

Aina Canadell

Aina Canadell Castells

26/07/2024
Outline

• Executive Summary
• Introduction
• Methodology
• Results
• Conclusion
• Appendix

Executive Summary

The commercial space age is here, and for that reason our company, SPACE Y, was born. SPACE Y
wants to make space travel affordable for everyone.

Methodologies

- Data Collection from API and Web scraping.
- Data Wrangling.
- Exploratory Data Analysis (EDA) using SQL, Pandas and Matplotlib.
- Interactive Visual Analytics and Dashboard with Folium and Plotly Dash.
- Predictive Analysis (Classification).

Results

- The best hyperparameters for Logistic Regression, SVM, Decision Tree and KNN classifiers.
- The method that performs best using test data.

Introduction

SPACE Y is here to compete in the commercial space race. We are making rocket launches
relatively inexpensive for everyone.

SPACE Y can save millions on every launch of our Eagle rocket because we can reuse its first
stage.
In addition, we can determine whether our competitor's first stage will land and determine the
cost of a launch by using Data Science and Machine Learning models.

Section 1

Methodology

Executive Summary
• Data collection methodology:
    • The data was gathered from the SpaceX REST API and by web scraping wiki pages.
• Perform data wrangling:
    • The data was collected as a JSON object and HTML tables, then converted into a
      Pandas DataFrame for visualization and analysis.
• Perform exploratory data analysis (EDA) using visualization and SQL
• Perform interactive visual analytics using Folium and Plotly Dash
• Perform predictive analysis using classification models:
    • Machine learning is used to determine whether the first stage of Falcon 9 will land
      successfully.
Data Collection

The data was gathered from the SpaceX REST API and web scraped from wiki pages.

Data Collection – SpaceX API

Collect the data from the API and make sure it is in the correct format.
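
A minimal sketch of this step, using only the Python standard library. The inline sample and its field names (`flight_number`, `date_utc`, `rocket`, `success`) are illustrative assumptions, not the project's actual request or schema; a real run would fetch the JSON from the SpaceX REST API instead of reading an inline string.

```python
import json
from datetime import datetime

# Illustrative sample of an API response (assumed shape, not real data).
SAMPLE_RESPONSE = """
[
  {"flight_number": 1, "date_utc": "2010-06-04T18:45:00.000Z",
   "rocket": "rocket-a", "success": true},
  {"flight_number": 2, "date_utc": "2010-12-08T15:43:00.000Z",
   "rocket": "rocket-a", "success": null}
]
"""

def parse_launches(raw: str) -> list[dict]:
    """Decode the JSON text and coerce each field to a consistent type."""
    rows = []
    for item in json.loads(raw):
        rows.append({
            "flight_number": int(item["flight_number"]),
            # Keep only the calendar date; the trailing "Z" marks UTC.
            "date": datetime.strptime(
                item["date_utc"], "%Y-%m-%dT%H:%M:%S.%fZ").date(),
            "rocket": str(item["rocket"]),
            # A missing outcome stays None instead of being guessed.
            "success": item["success"],
        })
    return rows

launches = parse_launches(SAMPLE_RESPONSE)
```

Each row is a flat dictionary, so the list can be handed to a DataFrame constructor or written to a CSV file unchanged.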

Data Collection - Scraping

Perform web scraping to collect Falcon 9 historical launch records from a Wikipedia page.
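
The table extraction can be sketched as follows. This version swaps in the standard library's `html.parser` so that it runs without extra packages; the HTML fragment and its column names are made up for illustration and do not reproduce the real Wikipedia markup.

```python
from html.parser import HTMLParser

# Illustrative HTML fragment (assumed layout, not the real page).
SAMPLE_HTML = """
<table>
  <tr><th>Flight No.</th><th>Launch site</th><th>Payload</th></tr>
  <tr><td>1</td><td>Site A</td><td>Demo payload</td></tr>
  <tr><td>2</td><td>Site B</td><td>Test payload</td></tr>
</table>
"""

class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (such as whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
```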

Data Wrangling

Perform Exploratory Data Analysis (EDA) to find patterns in the data and determine
the label for training supervised models.
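
One way to derive such a label is sketched below. The outcome strings and the rule "the first word is True" are assumptions made for illustration; the values in the real dataset may be encoded differently.

```python
# Hypothetical landing-outcome strings; the real dataset's values may differ.
outcomes = ["True ASDS", "False ASDS", "True RTLS", "None None", "False Ocean"]

def landing_class(outcome: str) -> int:
    """Return 1 when the outcome string reports a successful landing, else 0."""
    return 1 if outcome.split()[0] == "True" else 0

# Binary label used as the target of the supervised models.
labels = [landing_class(o) for o in outcomes]
success_rate = sum(labels) / len(labels)
```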

EDA with Data Visualization

Summary of charts that were plotted:


- Catplot to visualize the relationship between Flight Number and Payload.
- Catplot to visualize the relationship between Flight Number and Launch site.
- Catplot to visualize the relationship between Payload and Launch site.
- Bar chart to visualize the success rate of each Orbit type.
- Catplot to visualize the relationship between Flight Number and Orbit type.
- Catplot to visualize the relationship between Payload and Orbit type.
- Line chart to visualize the launch success yearly trend.
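
The numbers behind the bar chart and the line chart are group averages of the landing label. A small sketch with made-up records (the tuples below are hypothetical, not the project's data) shows the aggregation without any plotting library:

```python
from collections import defaultdict

# Hypothetical records: (year, orbit, landing class).
records = [
    (2015, "LEO", 0), (2015, "GTO", 0),
    (2016, "LEO", 1), (2016, "GTO", 0),
    (2017, "LEO", 1), (2017, "GTO", 1),
]

def mean_by(rows, key_index):
    """Average the class label (last field) for each distinct key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[-1])
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}

rate_by_orbit = mean_by(records, 1)  # bar heights, one per orbit type
rate_by_year = mean_by(records, 0)   # points of the yearly trend line
```

The resulting dictionaries map directly onto the x and y values of the two charts.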

EDA with SQL
SQL queries performed:

Build an Interactive Map with Folium
Summary of map objects that were created and added to the Folium map

Build a Dashboard with Plotly Dash

Predictive Analysis (Classification)
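
The idea of tuning a hyperparameter and then scoring on test data can be sketched as below. This is not the project's pipeline: it swaps in a hand-written KNN and a single validation split so that it runs with the standard library alone, and the toy points are hypothetical.

```python
from collections import Counter

def knn_predict(train, point, k):
    """Majority vote among the k training points nearest to `point`."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, evaluation, k):
    """Fraction of evaluation points whose predicted label is correct."""
    hits = sum(knn_predict(train, x, k) == y for x, y in evaluation)
    return hits / len(evaluation)

# Hypothetical toy data: ((feature1, feature2), landing class).
train = [((1.0, 1.0), 0), ((1.2, 0.8), 0), ((0.9, 1.1), 0),
         ((3.0, 3.0), 1), ((3.2, 2.9), 1), ((2.8, 3.1), 1)]
validation = [((1.1, 1.0), 0), ((3.1, 3.0), 1)]
test = [((0.8, 0.9), 0), ((2.9, 3.2), 1)]

# Try each candidate k on the validation split and keep the best one.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)

# The held-out test split is used once, after k has been chosen.
test_accuracy = accuracy(train, test, best_k)
```

Keeping the test split out of the search is the design choice that matters here: the reported accuracy then reflects data the chosen hyperparameter has never seen.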

Predictive Analysis (Classification)

Results

• Exploratory data analysis results


• Interactive analytics demo in screenshots
• Predictive analysis results

Section 2
Flight Number vs. Launch Site

Payload vs. Launch Site

Success Rate vs. Orbit Type

Flight Number vs. Orbit Type

Payload vs. Orbit Type

Launch Success Yearly Trend

All Launch Site Names

Launch Site Names Begin with 'CCA'
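
A sketch of the two site-name queries, run against an in-memory SQLite database. The table name `launches`, its columns and the rows are hypothetical stand-ins for the project's dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE launches (launch_site TEXT, payload_mass_kg REAL)")
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO launches VALUES (?, ?)",
    [("CCAFS LC-40", 500.0), ("CCAFS LC-40", 2500.0),
     ("KSC LC-39A", 5300.0), ("VAFB SLC-4E", 9600.0)],
)

# Every distinct launch site name.
all_sites = [row[0] for row in conn.execute(
    "SELECT DISTINCT launch_site FROM launches ORDER BY launch_site")]

# Records whose site name begins with 'CCA' (LIKE with a trailing wildcard).
cca_rows = conn.execute(
    "SELECT launch_site, payload_mass_kg FROM launches "
    "WHERE launch_site LIKE 'CCA%' LIMIT 5").fetchall()
```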

Total Payload Mass

Average Payload Mass by F9 v1.1

First Successful Ground Landing Date

Successful Drone Ship Landing with Payload between 4000 and 6000
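
A sketch of how such a filter could be written. The table `launches`, its columns, the outcome strings and the booster names are all hypothetical, and the payload bounds are treated as strict, which is only one reading of "between 4000 and 6000".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE launches "
    "(booster TEXT, payload_mass_kg REAL, landing_outcome TEXT)")
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO launches VALUES (?, ?, ?)",
    [("B-001", 3100.0, "Success (drone ship)"),
     ("B-002", 4700.0, "Success (drone ship)"),
     ("B-003", 5200.0, "Failure (drone ship)"),
     ("B-004", 5300.0, "Success (ground pad)"),
     ("B-005", 5900.0, "Success (drone ship)")],
)

# Strict bounds are used here; SQL's BETWEEN operator would
# include both endpoints instead.
boosters = [row[0] for row in conn.execute(
    "SELECT booster FROM launches "
    "WHERE landing_outcome = 'Success (drone ship)' "
    "AND payload_mass_kg > 4000 AND payload_mass_kg < 6000 "
    "ORDER BY booster")]
```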

Total Number of Successful and Failure Mission Outcomes

Boosters Carried Maximum Payload

2015 Launch Records

Rank Landing Outcomes Between 2010-06-04 and 2017-03-20
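
The ranking can be expressed as a grouped count over the date window, sorted in descending order. In this sketch the table `launches`, its columns and every row are hypothetical; dates are stored as ISO text so that they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE launches (launch_date TEXT, landing_outcome TEXT)")
# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO launches VALUES (?, ?)",
    [("2010-06-04", "Failure (parachute)"),
     ("2014-04-18", "Controlled (ocean)"),
     ("2015-12-22", "Success (ground pad)"),
     ("2016-04-08", "Success (drone ship)"),
     ("2016-05-06", "Success (drone ship)"),
     ("2017-03-30", "Success (drone ship)")],  # falls outside the window
)

# Count each outcome inside the window, most frequent first;
# ties are broken alphabetically so the order is deterministic.
ranking = conn.execute(
    "SELECT landing_outcome, COUNT(*) AS n FROM launches "
    "WHERE launch_date BETWEEN '2010-06-04' AND '2017-03-20' "
    "GROUP BY landing_outcome "
    "ORDER BY n DESC, landing_outcome"
).fetchall()
```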

Section 3
All Launch Sites
All launch sites are very close to the coast and located within restricted areas.

Success/Failed Launches For Each Site
The first map shows clusters for every launch site. The second shows a green marker if a launch
was successful and a red marker if a launch failed.

A Launch Site and Its Proximities
Launch sites are near railways, roads, highways and the coastline. I understand that this is not
just for easy supply or access, but also to maintain a safe distance from nearby cities.
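
Distances like these can be estimated with the haversine formula. The sketch below uses made-up coordinates for a launch site and a coastline point; they are placeholders, not the project's measured locations.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Hypothetical coordinates for a launch site and a nearby coastline point.
site = (28.5000, -80.6000)
coast = (28.5000, -80.5900)
distance_km = haversine_km(*site, *coast)
```

The same function can label a line drawn on the map between the site and a railway, highway or city.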

Section 4
Total Success Launches By Site

KSC LC-39A

Payload vs. Launch Outcome

Section 5
Classification Accuracy

Confusion Matrix
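
For reference, a binary confusion matrix and the accuracy derived from it can be computed as in this sketch. The label lists are hypothetical and do not reproduce the project's test results.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0/1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

# Hypothetical labels: 1 = landed, 0 = did not land.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 1, 0, 1]

(tn, fp), (fn, tp) = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
```

Rows are the actual class and columns the predicted class, so a false positive here means a landing was predicted that did not happen.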

Conclusions

• All the algorithms give the same accuracy, so they perform practically the same.

• By using our machine learning model, we can predict whether our competitor's first
stage will land and determine the cost of a launch.

Appendix

For notebooks, datasets and scripts, follow this GitHub repository link:
Applied Data Science Capstone
