
Portfolio Project Solution Sheet


Uploaded by

api-708555321


Portfolio Project | Bay Wheels User Analysis

INTRODUCTION: Here’s what you need to know: Lyft purchased its bike share
program from Ford (who owned GoBike) and needs a data analyst – that’s you! – to
help the marketing team use data-driven approaches in their new marketing
efforts. You’ve been tasked by your manager to investigate the differences
between Lyft users and Ford users. Lyft wants to increase memberships in its
rideshare program and needs to determine how their users, both past and present,
use their product.

HOW IT WORKS: Follow the prompts in the questions below to investigate your
data. Post your answers in the provided boxes: the yellow boxes for the queries you
write, purple boxes for visualizations and blue boxes for text-based answers. When
you're done, export your document as a pdf file and submit it on the Milestone page
– see instructions for creating a PDF at the end of the Milestone.

RESOURCES: If you need hints on the Milestone or are feeling stuck, there are
multiple ways of getting help. Attend Drop-In Hours to work on these problems with
your peers, or reach out to the HelpHub if you have questions. Good luck!

PROMPT: Congratulations are in order! You’ve been hired as an intern by Lyft, one
of the largest ride-sharing transportation providers in the country. In your new role,
you’ll be working on the Lyft Bay Wheels product: their latest initiative that provides
rental bikes all across San Francisco through the Lyft app.

SQL App: Here’s that link to our specialized SQL app, where you’ll write your SQL
queries and interact with the data.
— Data Set Description

To begin, you’ll query a total of 3 datasets. You’ll start with the lyft.baywheels and
ford.gobike datasets available in your schema. Later, you will join the sf.weather
dataset.

The lyft.baywheels dataset reports information about rentals made on the Bay
Wheels bike share system. Each row represents a single rental; we will be making
use of the following fields in this project:

● started_date - Date for start of rental
● started_at - Timestamp for start of rental
● ended_at - Timestamp for end of rental
● start_station_name - For rentals that started from a bike dock, the name of
the dock.
● end_station_name - For rentals that ended at a bike dock, the name of the
dock.
● start_lat, start_lng - Latitude and longitude, respectively, of the start of
the rental.
● end_lat, end_lng - Latitude and longitude, respectively, of the end of the
rental.
● member_casual - String indicating whether the rental was made by a system
“member”, who has a monthly subscription with the bikeshare system, or by a
“casual” user, who is making a one-time rental.

The ford.gobike dataset has information very similar to the lyft.baywheels table, but
reports rides prior to Lyft’s takeover of the bikeshare system. One major distinction
between the two tables is different field names. The field names in the ford.gobike
dataset will be explained through the course of the project tasks.

The sf.weather dataset contains daily weather statistics recorded at SF
International Airport through 2020. We will be concerned with the following three
features in this project:
● date - Date of weather recordings
● temperature_avg - Average temperature in Fahrenheit
● precipitation - Recorded precipitation in inches

— Task 1: Top User Engagement

Before you can start analyzing customer activity, you first need to combine the data
needed from Ford and Lyft. The datasets are currently captured in your SQL database
in separate tables, but your manager has assured you that they hold the same data,
just with different column names. Below is a table of equivalent columns, detailing
which columns in the lyft.baywheels table match which columns in the ford.gobike
table.

Lyft Bay Wheels       Ford GoBike
------------------    -----------------------
started_date          start_date
started_at            start_time
ended_at              end_time
start_station_name    start_station_name
end_station_name      end_station_name
start_lat             start_station_latitude
start_lng             start_station_longitude
end_lat               end_station_latitude
end_lng               end_station_longitude
member_casual         user_type

A. Write a query that filters the ford.gobike data to only include data from the
year 2020. HINT: Use the date_part function in SQL!

SELECT *
FROM ford.gobike
WHERE date_part('year', start_date) = 2020
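
The year filter can be tried out locally before running it against the real table. The following is a minimal sketch, assuming an in-memory SQLite database and a made-up three-row table named ford_gobike (SQLite does not support the ford.gobike schema prefix without an attached database). SQLite has no date_part function, so strftime stands in for it here; the project's SQL app keeps using date_part.

```python
import sqlite3

# Hypothetical stand-in for ford.gobike; only two columns are needed for the filter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ford_gobike (start_date TEXT, user_type TEXT)")
conn.executemany(
    "INSERT INTO ford_gobike VALUES (?, ?)",
    [
        ("2019-12-31", "Subscriber"),
        ("2020-01-01", "Customer"),
        ("2020-03-15", "Subscriber"),
    ],
)

# strftime('%Y', ...) returns the year as text, so it is compared to the string '2020'.
rows_2020 = conn.execute(
    "SELECT start_date, user_type FROM ford_gobike "
    "WHERE strftime('%Y', start_date) = '2020' "
    "ORDER BY start_date"
).fetchall()

for row in rows_2020:
    print(row)
```

Only the two 2020 rows come back; the 2019 row is filtered out.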

B. Write a query that unions the ford.gobike dataset and the lyft.baywheels
dataset using the corresponding columns above. Make sure that you are still
filtering to the year 2020 on the Ford data.

Note: You will want the Lyft data to be the first table in your query so that the
column names from the Lyft dataset become the standard ones for the
remainder of your analysis.

SELECT
started_date,
started_at,
ended_at,
start_station_name,
end_station_name,
start_lat,
start_lng,
end_lat,
end_lng,
member_casual
FROM lyft.baywheels
UNION
SELECT
start_date,
start_time,
end_time,
start_station_name,
end_station_name,
start_station_latitude,
start_station_longitude,
end_station_latitude,
end_station_longitude,
user_type
FROM ford.gobike
WHERE date_part('year', start_date) = 2020
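
One detail worth checking in a query like this is that UNION removes duplicate rows, while UNION ALL keeps every row. The sketch below illustrates the difference on made-up data, assuming an in-memory SQLite database, shortened column lists, and strftime in place of date_part. It also shows that the result takes its column names from the first SELECT, which is why the Lyft table goes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lyft_baywheels (started_date TEXT, start_station_name TEXT, member_casual TEXT);
CREATE TABLE ford_gobike (start_date TEXT, start_station_name TEXT, user_type TEXT);
-- Two identical Lyft rows: possible when only a few columns are selected.
INSERT INTO lyft_baywheels VALUES ('2020-06-01', 'Station A', 'casual');
INSERT INTO lyft_baywheels VALUES ('2020-06-01', 'Station A', 'casual');
INSERT INTO ford_gobike VALUES ('2020-01-05', 'Station B', 'Subscriber');
INSERT INTO ford_gobike VALUES ('2019-12-30', 'Station B', 'Subscriber');
""")

template = """
SELECT started_date, start_station_name, member_casual FROM lyft_baywheels
{op}
SELECT start_date, start_station_name, user_type FROM ford_gobike
WHERE strftime('%Y', start_date) = '2020'
"""

cursor = conn.execute(template.format(op="UNION"))
column_names = [d[0] for d in cursor.description]
union_rows = cursor.fetchall()
union_all_rows = conn.execute(template.format(op="UNION ALL")).fetchall()

print(column_names)          # names come from the Lyft (first) SELECT
print(len(union_rows))       # duplicate Lyft row collapsed
print(len(union_all_rows))   # every row kept
```

With full rental records that include timestamps and coordinates, exact duplicates should be rare, so both operators are likely to return nearly the same result. UNION ALL is the safer choice when every rental must be counted.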

After showing the result of the query to your manager, she tells you that she wants
to know which data source is attributed to each row. She asks you to create a new
column called data_source that has the value ‘Lyft’ if the data came from the Lyft
dataset and the value ‘Ford’ if it came from the Ford dataset.

A colleague teaches you a simple method to do this. When writing your query, add
an additional column after your select statement. Here is an example of this for the
Lyft table:

SELECT
*,
'Lyft' AS data_source
FROM lyft.baywheels

Modify your query from part B to include the data_source column.

SELECT
started_date,
started_at,
ended_at,
start_station_name,
end_station_name,
start_lat,
start_lng,
end_lat,
end_lng,
member_casual,
'Lyft' AS data_source
FROM lyft.baywheels
UNION
SELECT
start_date,
start_time,
end_time,
start_station_name,
end_station_name,
start_station_latitude,
start_station_longitude,
end_station_latitude,
end_station_longitude,
user_type,
'Ford' AS data_source
FROM ford.gobike
WHERE date_part('year', start_date) = 2020
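
A quick way to confirm that the constant column behaves as intended is to count rows per data_source on the combined result. This is a sketch on made-up rows, assuming an in-memory SQLite database, shortened column lists, UNION ALL, and strftime in place of date_part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lyft_baywheels (started_date TEXT, member_casual TEXT);
CREATE TABLE ford_gobike (start_date TEXT, user_type TEXT);
INSERT INTO lyft_baywheels VALUES ('2020-06-01', 'casual'), ('2020-06-02', 'member');
INSERT INTO ford_gobike VALUES ('2020-01-05', 'Subscriber'), ('2019-12-30', 'Customer');
""")

# The string literal becomes a column that is constant within each half of the union.
counts = conn.execute("""
SELECT data_source, COUNT(*) AS rentals
FROM (
    SELECT started_date, member_casual, 'Lyft' AS data_source
    FROM lyft_baywheels
    UNION ALL
    SELECT start_date, user_type, 'Ford' AS data_source
    FROM ford_gobike
    WHERE strftime('%Y', start_date) = '2020'
) AS combined
GROUP BY data_source
ORDER BY data_source
""").fetchall()

print(counts)
```

Each source should appear exactly once in the output, with the Ford count reflecting only the 2020 rows.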

Great! Since you and other members on your team will be referencing the output of
your query for deeper analysis, your manager asked the Engineering team to store it
specially in your schema. For the remainder of this project, you’ll query
project.ford_lyft_analysis.

— Task 2: Preparing the Data and Creating New Features

Now that we have combined the Ford and Lyft rental tables, you’ll need to create
additional variables and join in the weather data so that you can perform the
analysis your manager is asking from you.

A. The member_casual column is supposed to indicate whether the rental was
made by a system “member”, who has a monthly subscription, or by a
“casual” user, who is making a one-time rental. You notice that the
member_casual column actually has four different values: ‘member’,
‘Subscriber’, ‘casual’, and ‘Customer’. This is because Ford referred to its
members as ‘Subscriber’ and its casual users as ‘Customer’ in its data.

Write a query that returns all the variables from project.ford_lyft_analysis,
plus a new variable called “member_type”, that contains only values that
match the Lyft classifications: ‘member’ or ‘casual’.

In other words, if member_casual is equal to ‘Subscriber’ your member_type
field should be the string ‘member’ and if member_casual is equal to
‘Customer’, your member_type field should be the string ‘casual’.
Remember that string comparisons here are case sensitive!

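SELECT
*,
CASE
WHEN member_casual = 'Subscriber'
THEN 'member'
WHEN member_casual = 'Customer'
THEN 'casual'
-- Lyft rows already hold 'member' or 'casual'; keep them instead of returning NULL
ELSE member_casual
END AS member_type
FROM project.ford_lyft_analysis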
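
A CASE expression with no ELSE branch returns NULL for any value that matches none of its WHEN conditions, so it is worth checking what happens to rows that already say ‘member’ or ‘casual’. The sketch below compares both forms on the four values named in the prompt, assuming an in-memory SQLite database and a hypothetical one-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ford_lyft_analysis (member_casual TEXT)")
conn.executemany(
    "INSERT INTO ford_lyft_analysis VALUES (?)",
    [("member",), ("Subscriber",), ("casual",), ("Customer",)],
)

# No ELSE: values that match no WHEN branch come back as NULL (None in Python).
without_else = conn.execute("""
SELECT member_casual,
       CASE WHEN member_casual = 'Subscriber' THEN 'member'
            WHEN member_casual = 'Customer' THEN 'casual'
       END AS member_type
FROM ford_lyft_analysis ORDER BY rowid
""").fetchall()

# ELSE member_casual: values already in Lyft's vocabulary pass through unchanged.
with_else = conn.execute("""
SELECT member_casual,
       CASE WHEN member_casual = 'Subscriber' THEN 'member'
            WHEN member_casual = 'Customer' THEN 'casual'
            ELSE member_casual
       END AS member_type
FROM ford_lyft_analysis ORDER BY rowid
""").fetchall()

print(without_else)
print(with_else)
```

In the first result the Lyft rows have no member_type, which would silently drop them from any chart split by Member Type. In the second, every row is labeled ‘member’ or ‘casual’.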

B. Almost there! After going over the table with your manager, she hypothesises
that patterns are driven by changes in weather and wants you to incorporate
weather data into your analysis.
You both decide San Francisco's average daily temperature and amount of
precipitation are the best metrics to base your weather analysis on. These are
located in the temperature_avg and precipitation columns, respectively, of
the sf.weather table.

Modify your query from part A to join the table with the sf.weather data on the
started_date field. From the sf.weather table, return the average daily
temperature and the amount of precipitation.

SELECT
analysis.*,
CASE
WHEN member_casual = 'Subscriber'
THEN 'member'
WHEN member_casual = 'Customer'
THEN 'casual'
-- Lyft rows already hold 'member' or 'casual'; keep them instead of returning NULL
ELSE member_casual
END AS member_type,
weather.temperature_avg,
weather.precipitation
FROM project.ford_lyft_analysis AS analysis
INNER JOIN sf.weather AS weather
ON analysis.started_date = weather.date
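
An INNER JOIN keeps only rentals whose started_date has a matching weather row, so a missing weather day would silently drop that day's rentals. The sketch below shows one way to check for this, assuming an in-memory SQLite database and small made-up tables named analysis and weather; the real tables may have complete coverage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE analysis (started_date TEXT, member_casual TEXT);
CREATE TABLE weather (date TEXT, temperature_avg REAL, precipitation REAL);
INSERT INTO analysis VALUES
    ('2020-01-01', 'member'), ('2020-01-01', 'casual'), ('2020-01-02', 'member');
-- Weather is deliberately missing for 2020-01-02.
INSERT INTO weather VALUES ('2020-01-01', 52.0, 0.0);
""")

total_rentals = conn.execute("SELECT COUNT(*) FROM analysis").fetchone()[0]

# Rows that survive the inner join used in the query above.
inner_count = conn.execute("""
SELECT COUNT(*)
FROM analysis AS a
INNER JOIN weather AS w ON a.started_date = w.date
""").fetchone()[0]

# A LEFT JOIN with an IS NULL filter lists the rental dates that have no weather row.
unmatched_dates = conn.execute("""
SELECT DISTINCT a.started_date
FROM analysis AS a
LEFT JOIN weather AS w ON a.started_date = w.date
WHERE w.date IS NULL
""").fetchall()

print(total_rentals, inner_count, unmatched_dates)
```

If total_rentals and inner_count agree on the real data, the inner join loses nothing.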

That’s it! Now this query will result in almost 2 million records for the year
2020! Since SQLPad will only let you download 150,000 records in a .csv, the
engineering team used some extra tools they have to download the result of
your query. It’s loaded for you in a Tableau Workbook, where you’ll complete
the rest of your project.

— Task 3: Visualizing and Analyzing Using Tableau

Phew! Now that you’ve gotten the query out of the way, you’re ready to dive into
investigating the differences between Lyft users and Ford users so that the
marketing team at Lyft can make the best plan possible to help increase
memberships in its rideshare program. The remaining Tasks will be completed in
Tableau, and will focus on visualizing and analyzing your results. Click this link to
navigate to the workbook you’ll use to complete the remainder of this Project.

Once you’ve published your Tableau Workbook, paste the Share Link in the box
below.

https://prod-useast-b.online.tableau.com/#/site/globaltech/workbooks/746907?:origin=card_share_link

Continue to post your answers in the provided boxes: purple boxes for your
visualizations, and blue boxes for text-based answers.

A. On Sheet 1, start your exploration by plotting the number of rentals made
each week. (Use the Started At field to determine each rental’s week.) You
should also add color to the chart so that you can clearly see when the Data
Source changed over from Ford to Lyft.

Using your visualization, when did operations transfer over from Ford to Lyft?
Are there any major differences in the volume of rentals before and after the
transfer?
The visualization shows that operations changed from Ford to Lyft
most likely around Week 14, or around March 14th, 2020. Before
the switch, Ford's rental volume was much higher, almost double
what Lyft saw afterward. However, Lyft’s rental volume was
significantly more consistent than Ford’s. The visualization also
shows that usage may have plummeted before the switch: Week 12
had a major decline while Ford still owned the program.

B. Next, on Sheet 2, create a bar chart to depict the total number of rides during
each hour of the day. No need to include this visualization in this report just
yet! During which hours of the day are customers most likely to rent a bike?

Customers are most likely to rent a bike in the afternoon and
evening, with rentals peaking at 5pm. This is most likely due to
“rush hour” and people commuting back home after work.

C. Let’s break the hourly usage patterns down by data source. Using the Data
Source field, modify your visualization from part B to create two
side-by-side bar charts: one to illustrate the total number of rides during each
hour of the day for Ford GoBike data, and the other for Lyft Bay Wheels.
Regarding popular hours of the day, what differences do you notice between
Lyft users and Ford users?
Looking at the hourly usage patterns by data source, Ford has
significantly more rides in the morning hours, then plateaus
during mid-day and picks back up again in the evening. Lyft
riders' usage, by contrast, increases gradually from 6am until it
peaks at 5pm, and then declines for the remainder of the day.

D. On Sheet 3, create a line plot of the average temperature on the
horizontal-axis and the number of rides taken on the vertical-axis. Plot one
line for each Member Type. Finally, add Data Source to the column in order to
compare Ford ridership with Lyft ridership. Note: you will have to convert the
Temperature Avg feature into a Dimension first!

How does the temperature affect ridership? Which riders are more willing to
use a bike on cold days, and which riders are more likely to ride on warmer
days?
From the visualization, it appears that in the Ford data members
take more rides than casual riders at every temperature, and ride
most when the temperature is around 55 degrees; casual riders
follow a very similar trend. In the Lyft data, casual riders take
more rides at every temperature, and both casual riders and
members ride most when it is around 60-65 degrees. This is
interesting because there is a 5 to 10 degree difference between
Ford and Lyft riders' “preferred” temperatures. It also seems
that there are more Ford members than there are Lyft members.

— Task 4: Communicating Results

Your manager wants you to share the visualizations you created in parts C and D of
Task 3 with the marketing team for visibility. She asks you to email the visualizations
to the team with a short paragraph explaining what insights can be drawn from them
and any data-based marketing strategies you might recommend to increase
ridership at Lyft Bay Wheels.

A. In a single paragraph, summarize what can be gleaned from your
visualizations. In particular, are there differences between the datasets
representing Ford and Lyft riders? How might Lyft market to customers in
order to build upon the success of Ford’s GoBike program?

From my findings, Lyft needs to focus its attention on gaining
more members. The visualization in Part D shows that Lyft has
more casual riders than actual subscribers, whereas Ford appears
to have had more members. The companies also switched in mid
March, which could explain why Lyft riders seem to prefer a
slightly warmer temperature than Ford riders: the Lyft data
was collected during the warmer spring and summer months. As
for the decrease in rentals that seemingly coincides with Lyft
taking over the Ford program, I believe it was not caused by the
switch in companies, but by the purchase happening within the
same week that the lockdowns began due to the Covid pandemic.
Still, the visualization of usage by hour of the day shows that
Lyft has more usage throughout the day than Ford did, which is a
solid start. All in all, the main recommendation I can make is for
Lyft to incentivize memberships over casual riding.

That’s it! Submit your final project for evaluation, and go celebrate your
achievement! You just completed a rich, complex data analysis project
representing real-world level work. You’ve gained some impressive skills! Well
done, and never stop learning 😀
— LevelUp
The dataset in your Tableau workbook is rich – there’s much more that can be done
with the data! Below you’ll find three additional LevelUp tasks. Have fun exploring
them!
A. Your manager tells you that Lyft is interested in determining the distance
riders travel between start and end points. Take a look in your Tableau
workbook. You’ll find a variable called RIDE DISTANCE that is the distance
between the start and end points on a map.

Note: this is not the same as the total distance traveled on the bike. For
instance, if a ride began and ended at the same location, the distance would
show up as a zero in the data regardless of how long the bike was rented for.
Instead, it lets Lyft know the typical distance riders travel when they start and
end their rides at different points. The formula used is the Haversine distance.
It calculates the distance between two GPS coordinates, taking Earth’s
curvature into consideration.
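
For readers who want to see the calculation, here is a minimal sketch of the Haversine distance. The function name, the Earth radius constant, and the choice of miles as the unit are assumptions made for illustration; the exact definition of RIDE DISTANCE in the Tableau workbook is not shown in this document.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius, in miles


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two (latitude, longitude) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)

    # Haversine of the central angle between the two points.
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding slightly above the valid asin domain.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


# A ride that starts and ends at the same point has zero distance.
same_point = haversine_miles(37.77, -122.42, 37.77, -122.42)

# One degree of latitude along a meridian is roughly 69 miles.
one_degree_lat = haversine_miles(37.0, -122.0, 38.0, -122.0)

print(same_point, round(one_degree_lat, 2))
```

The first value illustrates the note above: a round trip that returns to its starting dock registers as zero regardless of how far the bike actually traveled.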

On Sheet 5, use this new calculated field to plot a histogram of the distance
riders traveled. To make your visualization more useful, filter to values that are
less than 7 miles and use a bin size of 0.1.
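
The filter and binning that Tableau applies can be sketched in a few lines. The distances below are made up for illustration; in the project they come from the RIDE DISTANCE field.

```python
from collections import Counter

BIN_SIZE = 0.1      # miles, as in the prompt
MAX_DISTANCE = 7.0  # miles, as in the prompt

# Hypothetical ride distances in miles.
distances = [0.0, 0.05, 0.83, 0.86, 1.42, 7.5]

# Bin index k covers the interval [k * BIN_SIZE, (k + 1) * BIN_SIZE).
# Values exactly on a bin edge can land one bin low because of floating-point rounding.
histogram = Counter(int(d / BIN_SIZE) for d in distances if d < MAX_DISTANCE)

for k in sorted(histogram):
    print(f"{k * BIN_SIZE:.1f} mi: {histogram[k]} rides")
```

The 7.5-mile ride is excluded by the filter, and the two zero-length or near-zero rides share the first bin, which is where round trips to the same dock would pile up.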

Analyze the histogram: how far do the majority of the rides typically go?

The most common case is a ride that begins and ends at the same
place, so the data shows a distance of zero. The second most
common distance is about 0.8 miles.

B. While you were assigned the analysis against temperature, one of your
colleagues looked at the other weather feature you joined into the data:
precipitation. She has interpreted the data to say that there are no major
differences between Member Types in terms of ridership due to the weather.

She’s asked that you verify her work. Can you create a plot to illustrate how
precipitation affects ridership? Compare between Ford and Lyft users and
again between member and casual riders.
From my visualization, there is a slight difference between casual
riders and members in their willingness to ride, but only when
there is no precipitation. In the Ford data, members ride the
most when there is no precipitation, but casual riders and
members behave similarly once there is any precipitation. In the
Lyft data, casual riders ride the most when there is no
precipitation, but again both groups follow very similar trends
when rain is involved. The line graphs for both sources show
that ALL riders are far less willing to ride in the rain, even
with the slightest amount of precipitation.

C. One of your colleagues has looked at the rentals by temperature plot you
created and the rentals by precipitation plot your colleague created. With
the approaching colder season in San Francisco, they’re afraid of a dropoff in
the amount of casual riders on the system and want to suggest additional
marketing efforts to increase casual rider engagement over the next few
months.

How much do you agree with, or disagree with your colleague’s assessment?
Are there aspects of the data that they haven’t considered in their analysis
that can be addressed with other plots you created? Is there information
outside of the available data that would be useful to make a better judgment
of where to put the marketing focus for the next winter season?

As I said before, I think the best strategy for Lyft is to encourage
memberships rather than casual riding. This would help the
company through the coming winter months by providing a more
“reliable” income than casual users who rent unpredictably. With
the colder season approaching, Lyft can also use discounts to
incentivize people to keep renting bikes during the “not so
pleasant” weather; for example, casual riders could get a
percentage off when signing up to become a member in
December. However, I believe the best way to go about this is for
Lyft to get its hands on Ford’s 2019 GoBike data, so it can
see exactly what the trends look like for the winter months and go
from there.

— Submission
Great work completing your Final Project!!!! To submit your completed project file,
you will need to download / export this document as a PDF and then upload it to the
Milestone submission page. You can find the option to download as a PDF from the
File menu in the upper-left corner of the Google Doc interface.
