4.3 Applied Data Science Capstone-Collecting The Data 1
The document outlines a capstone assignment focused on collecting SpaceX launch data using the SpaceX REST API, which provides details on launches, rockets, payloads, and landing outcomes. It discusses the process of transforming JSON data into a clean dataframe, addressing issues like NULL values, and filtering for Falcon 9 data specifically. Additionally, it includes instructions for implementing the data collection API in a Jupyter notebook within IBM Watson Studio.
Collecting the Data
Objectives: Data Collection Overview; Implement the Data Collection API
Data Collection Overview
In this capstone assignment, we will be working with SpaceX launch data that is gathered from an API, specifically the SpaceX REST API. This API will give us data about launches, including information about the rocket used, payload delivered, launch specifications, landing specifications, and landing outcome. Our goal is to use this data to predict whether SpaceX will attempt to land a rocket or not.
We will work with the endpoint api.spacexdata.com/v4/launches/past, which returns data for past launches.
A GET request to the launches endpoint returns the data as JSON.
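As a rough illustration of this step, the sketch below fetches the past-launch data and decodes the JSON body. It uses only the Python standard library; the `https://` scheme, the helper names `parse_launches` and `fetch_past_launches`, and the assumption that the response is a JSON array are ours, not part of the slides.

```python
import json
from urllib.request import urlopen

# Endpoint from the slides; the https:// scheme is an assumption.
PAST_LAUNCHES_URL = "https://api.spacexdata.com/v4/launches/past"


def parse_launches(body: bytes) -> list:
    """Decode a JSON response body into a list of launch records."""
    launches = json.loads(body)
    # Assumption: the endpoint returns a JSON array, one object per launch.
    if not isinstance(launches, list):
        raise ValueError("expected a JSON array of launch records")
    return launches


def fetch_past_launches(url: str = PAST_LAUNCHES_URL, timeout: float = 30.0) -> list:
    """Send a GET request to the endpoint and return the decoded records."""
    with urlopen(url, timeout=timeout) as response:
        return parse_launches(response.read())
```

Calling `fetch_past_launches()` performs the actual network request; `parse_launches` can be tried offline on any JSON byte string.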
To convert this JSON to a dataframe, we can use the json_normalize function, which turns the structured JSON data into a flat table.
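A minimal sketch of the conversion, assuming pandas is installed. The two records are made-up stand-ins for launch data, and their field names are illustrative assumptions rather than the API's guaranteed schema.

```python
import pandas as pd

# Illustrative records only; field names and values are assumptions.
sample_launches = [
    {"flight_number": 1, "name": "Demo A", "rocket": "rocket-id-1",
     "fairings": {"reused": False, "recovered": False}},
    {"flight_number": 2, "name": "Demo B", "rocket": "rocket-id-2",
     "fairings": {"reused": True, "recovered": None}},
]

# json_normalize flattens nested objects into dotted column names,
# for example "fairings.reused".
launch_df = pd.json_normalize(sample_launches)
print(launch_df.columns.tolist())
```

Each launch becomes one row, and each nested key becomes its own column.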
Another popular source of Falcon 9 launch data is web scraping the related Wiki pages, using the Python BeautifulSoup package.
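To give a feel for the scraping approach, here is a small sketch that parses an HTML table with BeautifulSoup. The HTML snippet is a hypothetical stand-in for a Wiki launch table, not the real page, and it assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a launch table on a Wiki page.
html = """
<table class="launches">
  <tr><th>Flight No.</th><th>Booster</th></tr>
  <tr><td>1</td><td>Booster A</td></tr>
  <tr><td>2</td><td>Booster B</td></tr>
</table>
"""

# "html.parser" is Python's built-in parser, so no extra parser is needed.
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Column names come from the header cells.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Data rows are the rows that contain at least one <td> cell.
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in table.find_all("tr")
    if tr.find("td")
]
```

The resulting `headers` and `rows` lists can then be loaded into a dataframe for further cleaning.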
The raw data must be transformed into a clean dataset. This involves wrangling the data using the API, sampling the data, and dealing with nulls. Sometimes we need to call the API again, targeting another endpoint, to gather specific data for each ID number.
The data we need covers the booster, launchpad, payload, and core.
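The idea of calling a second endpoint for each ID can be sketched as follows for the booster. This is an illustration under assumptions: that each launch record stores a rocket ID under the key `rocket`, that a `/rockets/<id>` endpoint returns an object with a `name` field, and that the same pattern would apply to launchpads, payloads, and cores. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed base URL for the per-ID endpoints.
BASE_URL = "https://api.spacexdata.com/v4"


def fetch_json(url, timeout=30.0):
    """GET a URL and decode its JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read())


def booster_names(launches, fetch=fetch_json):
    """Resolve each launch's rocket ID to a booster name via a second endpoint."""
    cache = {}  # avoid requesting the same rocket ID more than once
    names = []
    for launch in launches:
        rocket_id = launch["rocket"]  # assumed key holding the rocket ID
        if rocket_id not in cache:
            cache[rocket_id] = fetch(f"{BASE_URL}/rockets/{rocket_id}")["name"]
        names.append(cache[rocket_id])
    return names
```

Passing the `fetch` function as a parameter keeps the lookup logic easy to try without network access, since a stub can be supplied in its place.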
Another issue is that the launch data includes the Falcon 1 booster, whereas we only want Falcon 9. We therefore keep only the information related to Falcon 9.
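A small sketch of this filter, assuming pandas and a hypothetical `BoosterVersion` column that holds the booster name for each launch. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows; the column names are assumptions for illustration.
df = pd.DataFrame({
    "FlightNumber": [1, 2, 3, 4],
    "BoosterVersion": ["Falcon 1", "Falcon 9", "Falcon 1", "Falcon 9"],
})

# Keep only the Falcon 9 rows, then renumber the index from zero.
falcon9_df = df[df["BoosterVersion"] == "Falcon 9"].reset_index(drop=True)
print(falcon9_df)
```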
Not all gathered data is perfect: it contains NULL values. To handle the NULL values in PayloadMass, we calculate the mean of the PayloadMass data and replace the nulls with that mean. We leave the NULL values in the LandingPad column as they are, since they indicate that a landing pad was not used.
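The mean replacement described here might look like the sketch below, assuming pandas. The column names `PayloadMass` and `LandingPad` come from the slides; the values are made up.

```python
import pandas as pd

# Made-up values; None marks a missing entry.
df = pd.DataFrame({
    "PayloadMass": [1000.0, None, 3000.0],
    "LandingPad": ["pad-a", None, None],
})

# mean() skips missing values, so the mean is taken over known masses only.
payload_mean = df["PayloadMass"].mean()

# Replace missing payload masses with the mean; LandingPad is left untouched.
df["PayloadMass"] = df["PayloadMass"].fillna(payload_mean)
```

After this step `PayloadMass` has no missing values, while the missing `LandingPad` entries still mark launches where no landing pad was used.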
Implement the Data Collection API
Student task: import and complete the following Jupyter notebook in IBM Watson Studio, then show your results:
jupyter-labs-spacex-data-collection-api.ipynb
Summary: Data Collection Overview; Implement the Data Collection API