SPACEX
SETHU S
22 JULY 2023
OUTLINE
⁂ EXECUTIVE SUMMARY
⁂ INTRODUCTION
⁂ METHODOLOGY
⁂ RESULTS
⁂ CONCLUSION
⁂ APPENDIX
GITHUB URL:
https://github.com/itsmesethus/courserac10assignments/tree/main/applied%20data%20science%20capstone%20project%20works
EXECUTIVE SUMMARY
SUMMARIES OF METHODOLOGY:
"As long as there are dreams, rockets will forever carry the hope of mankind beyond the horizon."
In this presentation we walk through how our Data Science Team approached a
question facing SpaceX before its next launch: based on the history of previous
launches, will the next rocket launch into orbit be a success or a failure?
Let's dive into the team's work.
METHODOLOGY
❖ DATA COLLECTION (REST API, Web Scraping):
* Using the REST API, we extract the data from the source as JSON, which is then easily converted into a data frame
with Python's Pandas library.
* For web scraping, we use the BeautifulSoup and requests libraries to scrape the data from Wikipedia.
https://github.com/itsmesethus/coursera-c10-assignments/blob/main/applied%20data%20science%20capstone%20project%20works/Week1%20SpaceX%20Falcon%20Data%20Collection-Wrangling.ipynb
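The JSON-to-data-frame step described above can be sketched as follows. This is a minimal illustration, not the code from the team's notebook: the payload is an inline sample rather than a live API response, and the field names (`flight_number`, `date_utc`, `rocket.name`, `success`) are assumptions made for the example.

```python
import json

import pandas as pd

# Hypothetical sample of a launch-records payload (field names are assumed).
raw = """
[
  {"flight_number": 1, "date_utc": "2010-06-04", "rocket": {"name": "Falcon 9"}, "success": true},
  {"flight_number": 2, "date_utc": "2010-12-08", "rocket": {"name": "Falcon 9"}, "success": false}
]
"""

# Parse the JSON text into a list of Python dictionaries.
records = json.loads(raw)

# json_normalize flattens nested objects into dotted column names,
# e.g. {"rocket": {"name": ...}} becomes the column "rocket.name".
launches = pd.json_normalize(records)

print(launches[["flight_number", "rocket.name", "success"]])
```

With a real API, `records` would typically come from the decoded body of an HTTP response instead of an inline string.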
❖ DATA PREPROCESSING:
* In this step we check data integrity and data quality, and handle missing values.
* If the data is not in the correct format, or has other defects, the results may lead to wrong predictions.
https://github.com/itsmesethus/coursera-c10-assignments/blob/main/applied%20data%20science%20capstone%20project%20works/Week1%20SpaceX%20Falcon%20Data%20Collection-Wrangling.ipynb
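A data-quality check of the kind described above might look like the sketch below. The data frame, its column names (`FlightNumber`, `PayloadMass`, `LaunchSite`) and its values are hypothetical, and filling gaps with the column mean is only one common strategy, assumed here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical launch records; one payload mass is missing.
df = pd.DataFrame({
    "FlightNumber": [1, 2, 3, 4],
    "PayloadMass": [500.0, np.nan, 700.0, 900.0],
    "LaunchSite": ["KSC LC-39A", "KSC LC-39A", "Site B", "Site B"],
})

# Count missing values per column to assess data quality.
missing_counts = df.isnull().sum()
print(missing_counts)

# Replace missing payload masses with the mean of the known values.
mean_mass = df["PayloadMass"].mean()
df["PayloadMass"] = df["PayloadMass"].fillna(mean_mass)

print(df)
```

Other strategies, such as dropping incomplete rows, may suit columns where an average would be misleading.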
❖ EDA WITH STRUCTURED QUERY LANGUAGE:
* SQL is well suited to querying large volumes of data. We used it to perform the EDA on the Falcon 9 launch records.
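As a rough sketch of what SQL-based EDA can look like, the example below loads a few hypothetical launch rows into an in-memory SQLite database and computes the success rate per launch site. The table name, column names and rows are all assumptions for illustration and are not the project's dataset or queries.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE launches ("
    "launch_site TEXT, orbit TEXT, payload_mass_kg REAL, success INTEGER)"
)

# Hypothetical rows; success is 1 for a successful launch, 0 otherwise.
rows = [
    ("KSC LC-39A", "GEO", 3500.0, 1),
    ("KSC LC-39A", "LEO", 5200.0, 1),
    ("Site B", "LEO", 4800.0, 0),
    ("Site B", "SSO", 1200.0, 1),
]
conn.executemany("INSERT INTO launches VALUES (?, ?, ?, ?)", rows)

# Launch count and success rate per site.
query = """
SELECT launch_site,
       COUNT(*)     AS launches,
       AVG(success) AS success_rate
FROM launches
GROUP BY launch_site
ORDER BY launch_site
"""
results = conn.execute(query).fetchall()
for site, count, rate in results:
    print(f"{site}: {count} launches, success rate {rate:.2f}")

conn.close()
```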
❖ EDA WITH MATPLOTLIB AND PANDAS:
* Matplotlib and Pandas are among the most versatile Python libraries for creating visualizations and working with data frames.
Figure: Launch Sites (red: failures; green: successes)
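EDA with Pandas often comes down to grouped summaries that are then plotted. The sketch below computes a success rate per orbit on hypothetical data. The column names `Orbit` and `Class` (1 for success, 0 for failure) are assumptions, and the values are not the project's results.

```python
import pandas as pd

# Hypothetical launch outcomes; Class is 1 for success and 0 for failure.
df = pd.DataFrame({
    "Orbit": ["GEO", "LEO", "LEO", "SSO", "LEO"],
    "Class": [1, 0, 1, 1, 0],
})

# The mean of a 0/1 column is the fraction of successes in each group.
success_rate = (
    df.groupby("Orbit")["Class"]
    .mean()
    .sort_values(ascending=False)
)

print(success_rate)
```

The resulting Series can be passed to a plotting call, for example a bar chart, to compare orbits visually.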
❖ PLOTLY:
• The success of a mission can be explained by several factors, such as the launch site, the orbit and especially the number of
previous launches. We can assume that the knowledge gained between launches made it possible to go from launch failures
to successes.
• The orbits with the best success rates are GEO, HEO, SSO and ES-L1.
• Depending on the orbit, payload mass can be a factor in the success of a mission. Some orbits require a light or a heavy
payload, but in general lighter payloads perform better than heavier ones.
• With the current data, we cannot explain why some launch sites perform better than others (KSC LC-39A is the best launch site).
To answer this question, we would need atmospheric or other relevant data.
• For this dataset, we chose the Decision Tree algorithm as the best model, even though the test accuracy of all the models
used is identical, because it has the better train accuracy.
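The selection rule in the last point (compare test accuracy first, then break ties on train accuracy) can be sketched as below. The accuracy values and the names `Model A` and `Model B` are placeholders for illustration only and are not the project's measured scores.

```python
# Placeholder accuracy scores, NOT the project's actual results.
scores = {
    "Model A": {"train": 0.84, "test": 0.80},
    "Model B": {"train": 0.85, "test": 0.80},
    "Decision Tree": {"train": 0.88, "test": 0.80},
}


def pick_best(model_scores):
    """Return the model with the highest test accuracy, using train
    accuracy as the tie-breaker when test accuracies are equal."""
    return max(
        model_scores,
        key=lambda name: (model_scores[name]["test"], model_scores[name]["train"]),
    )


best = pick_best(scores)
print(best)
```

Tuples compare element by element, so the train accuracy only matters when the test accuracies are tied.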
THANK YOU!