Loading and Exporting Data
▪ CSV
▪ Avro
▪ Parquet
▪ ORC
Export from BigQuery – to Cloud Storage (CSV, Avro, JSON). Each exported file is limited to 1 GB; use a
wildcard in the destination URI to split the export into multiple files. Command: "bq extract".
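As a minimal sketch of how the wildcard splits an export: BigQuery replaces the `*` in the destination URI with a zero-padded 12-digit shard number, one per output file. The bucket path below is hypothetical.

```python
def expand_wildcard_uri(uri_template, num_shards):
    """Illustrate how a wildcard destination URI expands into one
    object name per exported shard (zero-padded 12-digit suffix)."""
    return [uri_template.replace("*", f"{i:012d}") for i in range(num_shards)]

# Hypothetical destination from a `bq extract` call with a wildcard:
for uri in expand_wildcard_uri("gs://my-bucket/export-*.csv", 3):
    print(uri)
```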
BigQuery Data Transfer Service – imports data from Google marketing apps: Google Ads (AdWords),
DoubleClick, YouTube reports.
External tables
Create a table definition file with the schema; the schema can also be auto-detected.
Use either temporary or permanent external tables. The data is not stored in these tables; only the
table definition/schema is, and queries read the data in place.
Permanent tables are placed in a dataset and can be used for access control and sharing, since the
table itself is access-controlled. Temporary tables are for one-off use.
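A table definition file is a small JSON document naming the source format, source URIs, and schema. This sketch builds one by hand to show the shape (the bucket path and field names are hypothetical; in practice `bq mkdef` can generate the file, and its exact contents may include more options than shown here).

```python
import json

def make_table_definition(source_uris, schema_fields, source_format="CSV"):
    """Build a minimal external-table definition: the table stores only
    this metadata, while the data itself stays in Cloud Storage."""
    return {
        "sourceFormat": source_format,
        "sourceUris": source_uris,
        "schema": {"fields": schema_fields},
    }

definition = make_table_definition(
    ["gs://my-bucket/sales/*.csv"],              # hypothetical path
    [{"name": "sale_id", "type": "INTEGER"},
     {"name": "amount", "type": "FLOAT"}],
)
print(json.dumps(definition, indent=2))
```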
Partitioning
2 options – partition by ingestion time, or by a particular date/timestamp column that already exists.
In the first scheme, a _PARTITIONTIME pseudo-column is added to the table.
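A pure-Python illustration of ingestion-time partitioning, assuming day granularity: each row is tagged with its ingestion day, standing in for the _PARTITIONTIME pseudo-column, and filtering on it is what lets the engine prune partitions. Row contents here are made up.

```python
from datetime import datetime, timezone

def ingest(table, rows, now):
    """Tag each row with its ingestion day, mimicking the
    _PARTITIONTIME pseudo-column (day granularity)."""
    day = now.date().isoformat()
    for row in rows:
        table.append({**row, "_PARTITIONTIME": day})

table = []
ingest(table, [{"event": "click"}], datetime(2024, 1, 1, tzinfo=timezone.utc))
ingest(table, [{"event": "view"}], datetime(2024, 1, 2, tzinfo=timezone.utc))

# WHERE _PARTITIONTIME = '2024-01-02' touches only that day's partition
hits = [r for r in table if r["_PARTITIONTIME"] == "2024-01-02"]
print(hits)
```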
Another approach to partitioning is to shard the data into separate tables (e.g., one table per day).
This has more overhead: with multiple tables, access control and schema must be maintained for each
table separately.
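To make the overhead concrete, this sketch routes rows into per-day tables (the `events_YYYYMMDD` naming is the common convention, assumed here); note that every shard carries its own schema and ACL entry, which is exactly what must be maintained table by table.

```python
from collections import defaultdict

def route_to_shard(dataset, row, day):
    """Sharding: each day gets its own table, so schema and access
    control exist once per table rather than once for the whole set."""
    name = f"events_{day.replace('-', '')}"   # e.g. events_20240101
    dataset[name]["rows"].append(row)
    return name

# Every new shard duplicates the schema/ACL bookkeeping:
dataset = defaultdict(lambda: {"schema": ["event"], "acl": ["analyst"], "rows": []})
route_to_shard(dataset, {"event": "click"}, "2024-01-01")
route_to_shard(dataset, {"event": "view"}, "2024-01-02")
print(sorted(dataset))
```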
Advanced queries
Streaming
Data can be queried while it is being streamed, before it is written to disk. Insertion rate is on the
order of 100,000 rows/second; rows are inserted via the REST API.
Streamed data is available for query within seconds.
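A client streaming over the REST API typically sends rows in request-sized batches. This is a sketch of that batching only; the 500-rows-per-request cap is an assumed illustrative figure, not the API's actual quota.

```python
def batch_rows(rows, max_per_request=500):
    """Split rows into request-sized batches, as a streaming client
    would before posting each batch to the insert endpoint."""
    return [rows[i:i + max_per_request] for i in range(0, len(rows), max_per_request)]

batches = batch_rows([{"n": i} for i in range(1200)], max_per_request=500)
print([len(b) for b in batches])  # [500, 500, 200]
```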
Costing
IAM
Control is at the project, dataset, and view level. Views are virtual tables; there are no direct IAM
roles for controlling view access. Instead, the view is placed in a separate dataset and IAM control is
applied to that dataset, which effectively gives access control on the view. This is called an
authorized view.
Views can also be used to share only particular rows with particular users: the allowed user is stored
in a column, and the view matches it against SESSION_USER() so each user sees only their own rows.
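The row-filtering a view like that performs can be sketched in pure Python; the column name `allowed_user` and the user emails are hypothetical stand-ins.

```python
def view_rows(table, session_user):
    """Mimic a view defined roughly as:
       SELECT * FROM t WHERE allowed_user = SESSION_USER()
    Each caller sees only the rows tagged with their own identity."""
    return [r for r in table if r["allowed_user"] == session_user]

table = [
    {"allowed_user": "alice@example.com", "salary": 100},
    {"allowed_user": "bob@example.com", "salary": 200},
]
print(view_rows(table, "alice@example.com"))  # only Alice's row
```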
Roles – admin, data owner, data editor, data viewer, user (can run queries; more permissions than job
user), and job user. From the primitive roles, project owner, editor, and viewer are available. For
datasets, owner, writer, and reader are available.