
Creating a Streaming Data Pipeline for a Real-Time Dashboard with Dataflow

 Overview
 Set up your environments
 Task 1. Source a public Pub/Sub topic and create a BigQuery dataset
 Task 2. Create a Cloud Storage bucket
 Task 3. Set up a Dataflow Pipeline
 Task 4. Analyze the taxi data using BigQuery
 Task 5. Perform aggregations on the stream for reporting
 Task 6. Stop the Dataflow job
 Task 7. Create a real-time dashboard
 Task 8. Create a time series dashboard
 Congratulations!
 End your lab

1 hour · 1 credit

Overview
In this lab, you own a fleet of New York City taxi cabs and want to monitor how well your
business is doing in real time. You will build a streaming data pipeline to capture taxi revenue,
passenger count, ride status, and more, and visualize the results in a management dashboard.

Set up your environments

Qwiklabs setup
For each lab, you get a new GCP project and set of resources for a fixed time at no cost.

1. Make sure you are signed in to Qwiklabs using an incognito window.

2. Note the lab's access time (for example, 01:00:00) and make sure you can finish in that time block.

There is no pause feature. You can restart if needed, but you have to start at the beginning.

3. When ready, click Start Lab.

4. Note your lab credentials. You will use them to sign in to the Cloud Platform Console.

5. Click Open Google Console.

6. Click Use another account and copy/paste the credentials for this lab into the prompts.

If you use other credentials, you'll get errors or incur charges.

7. Accept the terms and skip the recovery resource page.

Do not click End Lab unless you are finished with the lab or want to restart it. This clears your
work and removes the project.

Task 1. Source a public Pub/Sub topic and create a BigQuery dataset
Pub/Sub is an asynchronous global messaging service. By decoupling senders and receivers, it
allows for secure and highly available communication between independently written
applications. Pub/Sub delivers low-latency, durable messaging.
In Pub/Sub, publisher applications and subscriber applications connect with one another through
the use of a shared string called a topic. A publisher application creates and sends messages to a
topic. Subscriber applications create a subscription to a topic to receive messages from it.

Google maintains a few public Pub/Sub streaming data topics for labs like this one. We'll be
using the NYC Taxi & Limousine Commission’s open dataset.
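If you want to peek at the raw stream before building the pipeline, you can attach a temporary subscription to the public topic from Cloud Shell. This is a minimal sketch, not a required lab step; the subscription name taxi-peek is a placeholder of our choosing, and it assumes the public topic permits attaching subscriptions (the Dataflow template you launch later relies on the same permission):

# Attach a subscription to the public taxi topic.
gcloud pubsub subscriptions create taxi-peek \
  --topic=projects/pubsub-public-data/topics/taxirides-realtime

# Pull and acknowledge a few messages to inspect the JSON payload.
gcloud pubsub subscriptions pull taxi-peek --limit=3 --auto-ack

# Clean up the subscription when done.
gcloud pubsub subscriptions delete taxi-peek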
BigQuery is a serverless data warehouse. Tables in BigQuery are organized into datasets. In this
lab, messages published into Pub/Sub will be aggregated and stored in BigQuery.
To create a new BigQuery dataset:

Option 1: The command-line tool

1. Open Cloud Shell and run the below command to create the taxirides dataset.

bq --location=us-west1 mk taxirides
2. Run this command to create the taxirides.realtime table (an empty schema that you will stream into later).

bq --location=us-west1 mk \
--time_partitioning_field timestamp \
--schema ride_id:string,point_idx:integer,latitude:float,longitude:float,\
timestamp:timestamp,meter_reading:float,meter_increment:float,\
ride_status:string,passenger_count:integer -t taxirides.realtime
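To confirm the table was created with the expected schema, a quick check from Cloud Shell:

# Print the realtime table's schema as JSON.
bq show --schema --format=prettyjson taxirides.realtime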

Option 2: The BigQuery Console UI

Note: Skip these steps if you created the tables using the command line.
1. In the Google Cloud Console, select Navigation menu > Analytics > BigQuery.

2. The Welcome to BigQuery in the Cloud Console message box opens. This message box provides a link to the quickstart guide and lists UI updates.

3. Click the View actions icon next to your Project ID and click Create dataset.

4. Set the Dataset ID to taxirides, for Data location select us-west1 (Oregon), leave all the other fields as they are, and click CREATE DATASET.

5. In the left-hand resources menu, you should see your newly created dataset.

6. Click the View actions icon next to the taxirides dataset and click Open in current tab.

7. Click CREATE TABLE.

8. Name the table realtime.

9. For the schema, click Edit as text and paste in the below:


ride_id:string,
point_idx:integer,
latitude:float,
longitude:float,
timestamp:timestamp,
meter_reading:float,
meter_increment:float,
ride_status:string,
passenger_count:integer
10. Under Partition and cluster settings,
select the timestamp option for the
Partitioning field.

11. Click the CREATE TABLE button.

Task 2. Create a Cloud Storage bucket
Cloud Storage allows world-wide storage and retrieval of any amount of data at any time. You
can use Cloud Storage for a range of scenarios including serving website content, storing data for
archival and disaster recovery, or distributing large data objects to users via direct download. In
this lab, you use Cloud Storage to provide working space for your Dataflow pipeline.

1. In the Cloud Console, go to Navigation menu > Cloud Storage.
2. Click CREATE BUCKET.
3. For Name, paste in your GCP Project ID and then click Continue.
4. For Location type, click Multi-region if it is not already selected.
5. Click CREATE.
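Equivalently, the bucket can be created from Cloud Shell. A minimal sketch, assuming the default US multi-region and using the project ID (exposed in Cloud Shell as $GOOGLE_CLOUD_PROJECT) as the bucket name:

# Make a multi-region US bucket named after the project.
gsutil mb -l us gs://$GOOGLE_CLOUD_PROJECT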

Task 3. Set up a Dataflow Pipeline

Dataflow is a serverless way to carry out data analysis. In this lab, you set up a streaming data
pipeline that reads taxi ride messages from Pub/Sub and writes them out to BigQuery.

Restart the connection to the Dataflow API.

1. In the Cloud Console, enter Dataflow API in the top search bar.
2. Click on the result for Dataflow API.
3. Click Manage.
4. Click Disable API.
5. If asked to confirm, click Disable.
6. Click Enable.

To create a new streaming pipeline:

1. In the Cloud Console, go to Navigation menu > Dataflow.
2. In the top menu bar, click CREATE JOB FROM TEMPLATE.
3. Enter streaming-taxi-pipeline as the Job name for your Dataflow job.
4. Under Regional endpoint, select us-west1 (Oregon).
5. Under Dataflow template, select the Pub/Sub Topic to BigQuery template.
6. Under Input Pub/Sub topic, click Enter Topic Manually and enter projects/pubsub-public-data/topics/taxirides-realtime.
7. Click Save.
8. Under BigQuery output table, enter <myprojectid>:taxirides.realtime.

Note: There is a colon (:) between the project and dataset name and a dot (.) between the dataset and table name.

9. Under Temporary location, enter gs://<mybucket>/tmp/.
10. Click Show Optional Parameters and input the following values:

- Max workers: 2
- Number of workers: 2
- Worker region: us-west1

11. Click the RUN JOB button.


A new streaming job has started! You can now see a visual representation of the data pipeline.

Note: If the Dataflow job fails the first time, re-create the job from the template with a new job name and run it again.
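The same job can also be launched from Cloud Shell. A hedged sketch, assuming the classic Pub/Sub Topic to BigQuery template published at gs://dataflow-templates/latest/PubSub_to_BigQuery and its inputTopic/outputTableSpec parameters; replace <myprojectid> and <mybucket> with your own values:

# Launch the streaming template with the same settings as the console steps above.
gcloud dataflow jobs run streaming-taxi-pipeline \
  --region us-west1 \
  --max-workers 2 --num-workers 2 \
  --staging-location gs://<mybucket>/tmp/ \
  --gcs-location gs://dataflow-templates/latest/PubSub_to_BigQuery \
  --parameters inputTopic=projects/pubsub-public-data/topics/taxirides-realtime,outputTableSpec=<myprojectid>:taxirides.realtime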
Task 4. Analyze the taxi data using BigQuery
To analyze the data as it is streaming:

1. In the Cloud Console, select Navigation menu > BigQuery.
2. Enter the following query in the query EDITOR and click RUN:

SELECT * FROM taxirides.realtime LIMIT 10


3. If no records are returned, wait another minute and re-run the above query (Dataflow takes 3-5 minutes to set up the stream). You will receive a similar output:
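You can run the same check from Cloud Shell without opening the console. A small sketch; the row count grows as the stream runs:

# Count rows landed so far in the streaming table.
bq query --use_legacy_sql=false 'SELECT COUNT(*) AS rows_so_far FROM taxirides.realtime'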
Task 5. Perform aggregations on the stream for reporting
1. Copy and paste the below query and
click RUN.

WITH streaming_data AS (
  SELECT
    timestamp,
    TIMESTAMP_TRUNC(timestamp, HOUR, 'UTC') AS hour,
    TIMESTAMP_TRUNC(timestamp, MINUTE, 'UTC') AS minute,
    TIMESTAMP_TRUNC(timestamp, SECOND, 'UTC') AS second,
    ride_id,
    latitude,
    longitude,
    meter_reading,
    ride_status,
    passenger_count
  FROM
    taxirides.realtime
  WHERE ride_status = 'dropoff'
  ORDER BY timestamp DESC
  LIMIT 1000
)
# calculate aggregations on stream for reporting:
SELECT
  ROW_NUMBER() OVER() AS dashboard_sort,
  minute,
  COUNT(DISTINCT ride_id) AS total_rides,
  SUM(meter_reading) AS total_revenue,
  SUM(passenger_count) AS total_passengers
FROM streaming_data
GROUP BY minute, timestamp
Note: Ensure Dataflow is registering data in BigQuery before proceeding to the next task.
The result shows key metrics by the minute for every taxi drop-off.

Task 6. Stop the Dataflow job

1. Navigate back to Dataflow.
2. Click the streaming-taxi-pipeline or the new job name.
3. Click STOP and select Cancel > STOP JOB.

This will free up resources for your project.
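The job can also be cancelled from Cloud Shell. A minimal sketch: list active jobs to find the job ID, then cancel it.

# Find the ID of the running streaming job.
gcloud dataflow jobs list --region us-west1 --status active

# Cancel it by ID (replace JOB_ID with the value from the list above).
gcloud dataflow jobs cancel JOB_ID --region us-west1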

Task 7. Create a real-time dashboard


1. Open this Google Data Studio link in a
new incognito browser tab.
2. On the Reports page, in the Start with
a Template section, click the [+]
Blank Report template.

3. To complete account setup, select your Country from the drop-down and enter a Company name if applicable.

4. Check the checkbox to acknowledge


the Google Data Studio Additional
Terms, and click Continue.

5. Select No to all the questions, then


click Continue.

6. Switch back to the BigQuery Console.

7. On the BigQuery page, click EXPLORE DATA > Explore with Data Studio.
8. Specify the below settings:

- Chart type: Combo chart
- Date range Dimension: dashboard_sort
- Dimension: dashboard_sort
- Drill Down: dashboard_sort (make sure the Drill down option is turned ON)
- Metric: SUM() total_rides, SUM() total_passengers, SUM() total_revenue
- Sort: dashboard_sort, Ascending (latest rides first)
Your chart should look similar to this:
Note: Visualizing data at a minute-level granularity is currently not supported in Data Studio as
a timestamp. This is why we created our own dashboard_sort dimension.
9. When you're happy with your
dashboard, click Save and share to
save this data source.

10. If prompted with the Review data


access before saving window,
click Acknowledge and save.

11. Click Add to report.

12. Whenever anyone visits your dashboard, it will be up to date with the latest transactions. You can try it yourself by clicking the More options menu and Refresh data.

Task 8. Create a time series dashboard


1. Click this Google Data Studio link to
open Data Studio in a new browser tab.
2. On the Reports page, in the Start with
a Template section, click the [+]
Blank Report template.

3. A new, empty report opens with Add


data to report.

4. From the list of Google Connectors,


select the BigQuery tile.

5. Under CUSTOM QUERY, click qwiklabs-gcp-xxxxxxx > Enter Custom Query, and add the following query.

SELECT
  *
FROM
  taxirides.realtime
WHERE
  ride_status = 'dropoff'
6. Click Add > ADD TO REPORT.
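If the custom query ever errors in Data Studio, you can sanity-check it from Cloud Shell first. A sketch using BigQuery's dry-run mode, which validates the query and reports bytes scanned without actually running it:

# Validate the dashboard query without executing it.
bq query --use_legacy_sql=false --dry_run 'SELECT * FROM taxirides.realtime WHERE ride_status = "dropoff"'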

Create a time series chart

1. In the Data panel, click ADD A FIELD, then click All Fields in the left corner.

2. Change the field timestamp type to Date & Time > Date Hour Minute (YYYYMMDDhhmm).

3. Click Continue and then click Done.

4. Click Add a chart.

5. Choose Time series chart.

6. Position the chart in the bottom-left corner, in the blank space.

7. In the Data panel on the right, change


the following:

- Dimension: timestamp
- Metric: meter_reading(SUM)
Your time series chart should look similar to this:
Note: If the Dimension is timestamp(Date), click the calendar icon next to timestamp(Date) and set the type to Date & Time > Date Hour Minute.

Congratulations!
In this lab, you used Pub/Sub to collect streaming data messages from taxis and fed them through
your Dataflow pipeline into BigQuery.

End your lab


When you have completed your lab, click End Lab. Google Cloud Skills Boost removes the
resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of
stars, type a comment, and then click Submit.

The number of stars indicates the following:

 1 star = Very dissatisfied


 2 stars = Dissatisfied
 3 stars = Neutral
 4 stars = Satisfied
 5 stars = Very satisfied
You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.


Copyright 2021 Google LLC All rights reserved. Google and the Google logo are trademarks of
Google LLC. All other company and product names may be trademarks of the respective
companies with which they are associated.

