4.3 Applied Data Science Capstone-Collecting The Data 1

The document outlines a capstone assignment focused on collecting SpaceX launch data using the SpaceX REST API, which provides details on launches, rockets, payloads, and landing outcomes. It discusses the process of transforming JSON data into a clean dataframe, addressing issues like NULL values, and filtering for Falcon 9 data specifically. Additionally, it includes instructions for implementing the data collection API in a Jupyter notebook within IBM Watson Studio.

Uploaded by

nhunhse183644
Copyright
© All Rights Reserved

Collecting the Data

Objectives
• Data Collection Overview
• Implement the Data Collection API

Data Collection Overview
• In this capstone assignment, we will be working with SpaceX launch data that is gathered from an API, specifically the SpaceX REST API.
• This API will give us data about launches, including information about the rocket used, payload delivered, launch specifications, landing specifications, and landing outcome.
• Our goal is to use this data to predict whether SpaceX will attempt to land a rocket or not.

Data Collection Overview
• We work with the endpoint api.spacexdata.com/v4/launches/past to get past launch data.

Data Collection Overview
• Get data: the API returns JSON data.
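The request step can be sketched as follows. This is a minimal illustration, not the notebook's own code: it uses only the Python standard library, the `https://` scheme is an assumption (the slide gives the endpoint without one), and the helper names `fetch_past_launches` and `parse_launches` as well as the sample record are hypothetical.

```python
import json
from urllib.request import urlopen

# Endpoint named on the slide; the https:// scheme is an assumption.
PAST_LAUNCHES_URL = "https://api.spacexdata.com/v4/launches/past"


def fetch_past_launches(url=PAST_LAUNCHES_URL, timeout=30):
    """Send a GET request and return the decoded JSON body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def parse_launches(body):
    """Decode a JSON response body without touching the network."""
    return json.loads(body)


# Offline demonstration with a hand-written, made-up response body.
sample_body = '[{"flight_number": 1, "name": "Example launch", "success": false}]'
launches = parse_launches(sample_body)
print(len(launches), launches[0]["name"])
```

Calling `fetch_past_launches()` performs the real request; the offline part only shows that the decoded body is an ordinary Python list of dictionaries.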

Data Collection Overview
• To convert this JSON to a dataframe, we can use the json_normalize function, which flattens the structured JSON data into table form.
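As a small sketch of what json_normalize does, the example below flattens two nested records into a dataframe. The records and their field names are made up for illustration and are not the actual API response.

```python
import pandas as pd

# Hypothetical nested records; the field names are illustrative only.
launches = [
    {"flight_number": 1, "name": "Launch A", "rocket": {"id": "r1", "name": "Falcon 1"}},
    {"flight_number": 2, "name": "Launch B", "rocket": {"id": "r9", "name": "Falcon 9"}},
]

# Nested dictionaries become dotted column names such as "rocket.name".
df = pd.json_normalize(launches)
print(df.columns.tolist())
```

Each nested key ends up as its own column, so the result can be filtered and cleaned like any other dataframe.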

Data Collection Overview
• Another popular way to obtain Falcon 9 launch data is to web scrape the related Wiki pages using the Python BeautifulSoup package.
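A minimal sketch of the scraping idea, assuming the launch records sit in an HTML table. The HTML snippet, the `wikitable` class, and the column names are invented for this example; a real Wiki page would need its own inspection.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded Wiki page.
html = """
<table class="wikitable">
  <tr><th>Flight No.</th><th>Launch site</th><th>Payload</th></tr>
  <tr><td>1</td><td>Site A</td><td>Payload X</td></tr>
  <tr><td>2</td><td>Site B</td><td>Payload Y</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", class_="wikitable")

# The header cells give the column names.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Every row after the header row becomes one record.
rows = []
for tr in table.find_all("tr")[1:]:
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append(dict(zip(headers, cells)))

print(rows)
```

The resulting list of dictionaries can be passed to `pandas.DataFrame` to obtain the same kind of table as the API route.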

Data Collection Overview
• Raw data must be transformed into a clean dataset:
   • Wrangling data using an API
   • Sampling data
   • Dealing with nulls
• Sometimes we need to use the API again, targeting another endpoint, to gather specific data for each ID number.
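The second round of API calls can be sketched like this. Everything here is an assumption made for illustration: the helpers `fetch_by_id` and `resolve_names`, the `/v4/<resource>/<id>` URL layout, the `name` field, and the IDs in the offline demonstration.

```python
import json
from urllib.request import urlopen

# Assumed base URL; the slides only name the launches endpoint.
BASE_URL = "https://api.spacexdata.com/v4"


def fetch_by_id(resource, item_id, timeout=30):
    """Fetch one record, e.g. resource="rockets" (hypothetical helper)."""
    with urlopen(f"{BASE_URL}/{resource}/{item_id}", timeout=timeout) as response:
        return json.load(response)


def resolve_names(ids, resource, fetch=fetch_by_id):
    """Return the "name" field for each ID, requesting each distinct ID once."""
    cache = {}
    for item_id in ids:
        if item_id not in cache:
            cache[item_id] = fetch(resource, item_id)["name"]
    return [cache[item_id] for item_id in ids]


# Offline demonstration with a stand-in fetcher and made-up IDs.
fake_records = {"r1": {"name": "Falcon 1"}, "r9": {"name": "Falcon 9"}}
calls = []


def fake_fetch(resource, item_id):
    calls.append((resource, item_id))
    return fake_records[item_id]


names = resolve_names(["r1", "r9", "r9"], "rockets", fetch=fake_fetch)
print(names, len(calls))
```

Passing the fetcher as a parameter keeps the lookup logic testable without network access, and the cache avoids repeating a request for an ID that appears in many launches.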

Data Collection Overview
• The data we need: booster, launchpad, payload, and core.

Data Collection Overview
• Another issue is that the launch data we have includes data for the Falcon 1 booster, whereas we only want Falcon 9.
• We keep only the information related to Falcon 9.
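Filtering to Falcon 9 might look like the sketch below. The column name `BoosterVersion` and the sample rows are assumptions for illustration; the slides do not say which column holds the booster name.

```python
import pandas as pd

# Hypothetical launch table; "BoosterVersion" is an assumed column name.
df = pd.DataFrame({
    "FlightNumber": [1, 2, 3, 4],
    "BoosterVersion": ["Falcon 1", "Falcon 9", "Falcon 1", "Falcon 9"],
})

# Keep only the Falcon 9 rows and renumber the index from zero.
falcon9 = df[df["BoosterVersion"] == "Falcon 9"].reset_index(drop=True)
print(falcon9)
```

Resetting the index is optional, but it avoids gaps in the row labels after the Falcon 1 rows are dropped.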

Data Collection Overview
• Not all gathered data is perfect: it may contain NULL values.
• For NULL values in PayloadMass, we calculate the mean of the PayloadMass data and then replace the NULL values with that mean.
• We leave the NULL values in the LandingPad column as they are, because they indicate that a landing pad was not used.
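The null handling described above can be sketched as follows. The column names PayloadMass and LandingPad come from the slides; the values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with missing values in both columns.
df = pd.DataFrame({
    "PayloadMass": [500.0, np.nan, 1500.0, np.nan],
    "LandingPad": ["pad-a", None, None, "pad-b"],
})

# mean() skips missing values by default, so only 500.0 and 1500.0 count.
mean_mass = df["PayloadMass"].mean()

# Replace the missing payload masses with the mean; LandingPad is left untouched.
df["PayloadMass"] = df["PayloadMass"].fillna(mean_mass)
print(df)
```

Mean imputation is the choice the slides describe; it keeps every row, at the cost of shrinking the variance of PayloadMass slightly.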

Implement the Data Collection API
• Student task: import and complete the Jupyter notebook below in IBM Watson Studio:
   • jupyter-labs-spacex-data-collection-api.ipynb
• Show your results.

Summary
• Data Collection Overview
• Implement the Data Collection API

Q&A
