Big Query Help

This document provides an overview of an activity where the learner will query large datasets using SQL to understand how data size affects query runtime. It begins by defining units of digital data storage from bits to zettabytes. The learner is then instructed to log into BigQuery and run a sample SQL query on a 10 billion row sample of Wikipedia page view data. The query takes 10-15 seconds to run and processes over 415 gigabytes of data, demonstrating how SQL can efficiently query very large datasets.



Activity overview

In previous activities, you learned about and practiced SQL. In this activity, you’ll work with SQL
queries of different sizes.

By the time you complete this activity, you’ll be familiar with the different sizes used to measure
data storage. This will help you understand how data size affects the amount of time queries take
to run and how valuable tools like SQL can be to data analysts. 

Understand how data is measured

Data is measured by the number of bits it takes to represent it. All information in a computer can
be represented as a binary number consisting solely of 0s and 1s. Each 0 or 1 in a number is a
bit. A bit is the smallest unit of storage in computers. Because computers work in binary (base 2),
the numbers that distinguish one data size from the next are all powers of 2.

A byte is a collection of 8 bits. Take a moment to examine the table below to get a feel for the
difference between data measurements and their relative sizes to one another. 

Unit       Equivalent to    Abbreviation   Real-World Example
Byte       8 bits           B              1 character of text
Kilobyte   1024 bytes       KB             A page of text
Megabyte   1024 Kilobytes   MB             1 song in MP3 format
Gigabyte   1024 Megabytes   GB             ~300 songs in MP3 format
Terabyte   1024 Gigabytes   TB             ~500 hours of HD video
Petabyte   1024 Terabytes   PB             10 billion Facebook photos
Exabyte    1024 Petabytes   EB             ~500 million hours of HD video
Zettabyte  1024 Exabytes    ZB             All the data on the internet
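
To get a feel for how these powers of 2 multiply out, here is a small query you can paste into the
BigQuery editor you'll open later in this activity. The column names are just illustrative labels,
not part of any dataset:

SELECT
  CAST(POW(2, 10) AS INT64) AS bytes_per_kilobyte,  -- 1,024
  CAST(POW(2, 20) AS INT64) AS bytes_per_megabyte,  -- 1,048,576
  CAST(POW(2, 30) AS INT64) AS bytes_per_gigabyte,  -- 1,073,741,824
  415 * CAST(POW(2, 30) AS INT64) AS bytes_in_415_gigabytes;  -- the size of the query you'll run below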


The amount of data in the world is growing at an incredible pace every year. This growth is
largely driven by the more than 4.6 billion people around the world connected to the Internet.
Now that smartphones and other Internet-connected devices are commonplace, they generate a
staggering amount of new data. Many experts believe that the size of all the data on the Internet
will swell to 175 ZB by the end of 2025!

The size of the dataset you're working with usually determines which tool, spreadsheets or SQL,
is best suited for the task. Spreadsheets often start to have performance issues as dataset sizes
grow beyond a few megabytes. SQL databases are much better at working with larger datasets
that have billions of rows and sizes measured in gigabytes. Dataset size still matters here:
larger datasets take longer to query, depending on the query's content and the number of rows
SQL has to process to complete it.
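
Because BigQuery stores tables by column and reports how much data each query scans, one practical
habit is to select only the columns you need. A hypothetical sketch (the table name here is a
placeholder, not one used in this activity):

-- Scans every column in the table: the most data processed.
SELECT *
FROM `my_project.my_dataset.my_table`;

-- Scans only two columns: fewer bytes processed, even for the same rows.
SELECT title, views
FROM `my_project.my_dataset.my_table`;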

Query a large dataset

You’ll now discover for yourself how these runtimes change with dataset size by running some
queries on a huge dataset—Wikipedia!

1. Log in to BigQuery Sandbox. If you have a free trial version of BigQuery, you can use that
instead. On the BigQuery page, click the Go to BigQuery button.

 Note: BigQuery Sandbox frequently updates its user interface. The latest changes may
not be reflected in the screenshots presented in this activity, but the principles remain
the same. Adapting to changes in software updates is an essential skill for data
analysts, and it’s helpful for you to practice troubleshooting. You can also reach out to
your community of learners on the discussion forum for help.

2. If you have never created a BigQuery project before, click CREATE PROJECT on the right
side of the screen. If you have created a project before, you can use an existing one or create a
new one by clicking the project dropdown in the blue header bar and selecting NEW PROJECT.

3. Name your project something that will help you identify it later. You can give it a unique project
ID or use an auto-generated one. Don’t worry about selecting an organization if you don’t know
what to put.

4. Now, you’ll see the Editor interface. In the middle of the screen is a window where you can
type code, and to the left is the Explorer menu where you can search for datasets.

5. Copy and paste the following query into the editor and run it. The formatting is just cosmetic,
so don’t worry if it changes when copied over. The query should take 10-15 seconds to run:

SELECT
  language,
  title,
  SUM(views) AS views
FROM
  `bigquery-samples.wikipedia_benchmark.Wiki10B`
WHERE
  title LIKE '%Google%'
GROUP BY
  language,
  title
ORDER BY
  views DESC;

Note: This query sorts and filters a dataset. You don't need to understand each detail yet.
Coming up, you will learn what each part of this query means and how to use its functions in your
own work.

After the query finishes, examine the results and the job information BigQuery reports.

This query returns a table that displays the total number of times each Wikipedia page with
“Google” in the title has been viewed in each language. Note the information that BigQuery
provides on the query you just ran. As you can infer from the dataset’s title in the query, this
dataset is a sample consisting of 10 billion rows from the Wikipedia public dataset. 
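
If you're curious in the meantime, here is one way to read that query clause by clause (the
comments are added for this overview; the query itself is unchanged):

SELECT
  language,              -- the Wikipedia language edition
  title,                 -- the page title
  SUM(views) AS views    -- total views, added up across rows
FROM
  `bigquery-samples.wikipedia_benchmark.Wiki10B`  -- the 10-billion-row sample
WHERE
  title LIKE '%Google%'  -- keep only titles containing "Google"
GROUP BY
  language, title        -- one result row per language/title pair
ORDER BY
  views DESC;            -- most-viewed pages first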

You’ll find that the query processes over 415 gigabytes of data when run—pretty impressive for
15 seconds! Note that if you run the query again, the runtime will be almost instant (as long as
you haven’t changed the default caching settings). This is because BigQuery caches the query
results to avoid extra work if the query needs to be rerun.
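
One way to see the cache in action, assuming the cache is keyed on the exact query text (so that
even an added comment changes it):

-- Rerun the activity's query unchanged and the cached result returns
-- almost instantly. Adding a comment like this one alters the query
-- text, so BigQuery reprocesses the data instead of using the cache.
SELECT language, title, SUM(views) AS views
FROM `bigquery-samples.wikipedia_benchmark.Wiki10B`
WHERE title LIKE '%Google%'
GROUP BY language, title
ORDER BY views DESC;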
