Using The BigQuery Datasource - Preview User Guide-V2

This document provides information on using the BigQuery data source in AppSheet, including: 1) BigQuery can only be accessed in read-only mode from AppSheet as it is a data warehouse. 2) Accessing BigQuery requires an Enterprise plan due to the large volumes of data. 3) AppSheet limits access to BigQuery to 100,000 rows and no partitioned tables to ensure performance. 4) Instructions are provided on setting up a service account, custom roles, and views to optimize data access within these limits.

Uploaded by

coreylwy
Using the BigQuery Data Source

Overview
The BigQuery data source is being released to all users as a Preview. The BigQuery data
source allows app creators to access BigQuery datasets in READ-ONLY mode only. We do not
support write/update/delete access because BigQuery is a data warehouse type of database,
not an online transaction processing (OLTP) database that you would normally use to be the
read/write data store for an application (like Cloud SQL). The intent of a data warehouse is to
provide read-only datasets for reporting and data analytics. We do not plan to support write
modes in AppSheet. We expect BigQuery datasets will be refreshed by the systems of record in
a separate process outside of AppSheet (like a nightly ETL batch process, or a stream of data
changes from the system of record using a tool like Cloud Data Fusion).

Licensing
BigQuery is a GCP service that many enterprise customers use for data reporting. The volumes
of data to process can be quite significant. Therefore, we are classifying this data source as an
Advanced Data Connector, which will require a Business- or Enterprise-level plan. Here is a
summary of the plans that will enable access to a BigQuery data source:

New AppSheet plans that are required for BigQuery data source:
● AppSheet Enterprise Standard
● AppSheet Enterprise Plus
For reference: Older AppSheet plans that would have also enabled you to use the
BigQuery data source:
● AppSheet Business
● AppSheet Enterprise

Limits
Because BigQuery has the potential to have massive amounts of data (potentially tens of
millions of rows or more and up to 10,000 columns), we have put some product limits in place to
ensure your apps will be able to load data and perform well. AppSheet will have the following
limits in place for a BigQuery data source:
● Read-only - BigQuery is a data warehouse, and we do not plan to support write
operations from AppSheet.
● 100,000 row hard limit - BigQuery datasets larger than this will be truncated at the limit.
This is the preview limit - it is still TBD what the limit will be for GA based on
performance test results and user feedback.
● Partitioned tables will not work - This is a preview limitation we hope to remove by
GA. Any BigQuery tables that have a “Partitioned Column” will give an error during
configuration in AppSheet. Use a BigQuery View without the partitioned column to get
around this.
● GCP Access - Service Accounts are the only access method we are providing. You
will need to have access to create service accounts and keys in your GCP project, or
you will need to request the creation from an IT Administrator who has this level of
access. They can even create the data source as a Team Data Source and share it with
you. The instructions below show how to set this up.

How to set up your App to be successful within these limits


1. Use Security Filters
a. Use AppSheet expressions, which will be translated to BigQuery SQL and
executed on the server before the data is downloaded to the app.
Examples:
i. Filter by USEREMAIL() to just get rows owned or viewable by the
currently logged-in app user
ii. Filter by DateTime values in the last week or month to get only the most
recent data. Assuming a column in the dataset called DateCreated, you
could use security filter expressions like:
1. For the last week: [DateCreated] >= TODAY() - 7
2. For the last month: [DateCreated] >= TODAY() - 30
iii. Use a combination of user settings and security filters to allow your app
user to limit the data.
1. Using a user setting column named Year and a column in the
dataset called StartTime, this security filter selects only the data
from the chosen year onward (for example, 2018 or later):
a. [StartTime] >= DATETIME("1/1/" & USERSETTINGS("YEAR"))
2. Use BigQuery Views
a. You can create specific views for AppSheet in BigQuery that have potentially
complex SQL queries to create a limited number of rows from the dataset. When
setting-up the BigQuery data source as a table in your app, select from the list of
views in your project as well as from the base datasets.
3. Create a new BigQuery Table
a. BigQuery allows you to set up a copy of a table that can periodically be
re-created based on a scheduled query.
b. This is similar to a view, but is a physical table that contains the subset of data.
This can also be used to reduce the number of columns in order to simplify the
table structure.
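
As a rough illustration of the view approach in step 2, the sketch below builds a BigQuery `CREATE OR REPLACE VIEW` statement that keeps only a few columns and only recent rows. The project, dataset, table, view, and column names are hypothetical placeholders, and the date filter assumes `DateCreated` is a DATE column; adjust it for TIMESTAMP or DATETIME columns. It only produces the SQL text, which you would then run yourself in the BigQuery console.

```python
def build_view_sql(project, dataset, source_table, view_name,
                   columns, date_column, days):
    """Return SQL for a view limited to chosen columns and recent rows."""
    column_list = ", ".join(columns)
    days = int(days)  # guard against non-numeric input in the SQL text
    return (
        f"CREATE OR REPLACE VIEW `{project}.{dataset}.{view_name}` AS\n"
        f"SELECT {column_list}\n"
        f"FROM `{project}.{dataset}.{source_table}`\n"
        f"WHERE {date_column} >= DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)"
    )


# Hypothetical names, for illustration only.
sql = build_view_sql(
    project="my-project",
    dataset="my_dataset",
    source_table="orders",
    view_name="orders_recent",
    columns=["OrderId", "Owner", "DateCreated"],
    date_column="DateCreated",
    days=30,
)
print(sql)
```

Leaving the partitioned column out of the `columns` list is one way to follow the partitioned-table workaround described in the Limits section.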

User Guide

Pre-Requisites in GCP
In order to connect AppSheet to BigQuery, a Service Account must be created in GCP with the
correct role to access BigQuery data. A Service Account is a special kind of account that is
used to grant system to system level access, rather than access by individual users. AppSheet
(a system) will use the service account user to access BigQuery (a system) which will establish
the system-to-system communications. Additional user level access can be put in place in
AppSheet through security filters on the datasource.

Accessing a Public DataSet will only require a BigQuery Job User role.
Accessing a Private DataSet will require a Custom Role in GCP. Please see the instructions in
the Accessing a Private DataSet section in the instructions below. You can skip this section if
you plan to just use a Public DataSet.

Create a Service Account

1. Go to GCP console (https://console.cloud.google.com) / IAM & Admin / Service
Accounts, and click on “+ CREATE SERVICE ACCOUNT”
a. Create service account with the BigQuery Job User role
Choose a Service Account Name and fill out the description, then click the
“Create” button

Grant the BigQuery Job User role

Type “BigQuery Job” into the filter text box and a short list will appear below the
filter text box. Select “BigQuery Job User”

Click on the “Done” button


b. (optional) The role can also be set with the gcloud command line interface, if this
is preferred. Change [PROJECTID] and [SERVICE_ACCOUNT_ID] to the values
appropriate for your account (the member value is the service account's email address).
The backslash tells the shell that the command continues on the next line:
gcloud projects add-iam-policy-binding [PROJECTID] \
--member="serviceAccount:[SERVICE_ACCOUNT_ID]" \
--role="roles/bigquery.jobUser"
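
If you script this step, a small helper can assemble the command so that the member value is formed consistently. This is only a sketch: the account and project names are hypothetical, and it assumes a user-managed service account, whose email has the form `ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com`. It prints the command rather than running it.

```python
import shlex


def service_account_email(account_id, project_id):
    """Email address of a user-managed service account."""
    return f"{account_id}@{project_id}.iam.gserviceaccount.com"


def job_user_binding_command(project_id, account_id):
    """Build the gcloud command that grants the BigQuery Job User role."""
    member = "serviceAccount:" + service_account_email(account_id, project_id)
    args = [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={member}",
        "--role=roles/bigquery.jobUser",
    ]
    return shlex.join(args)  # quotes arguments only where the shell needs it


# Hypothetical values, for illustration only.
command = job_user_binding_command("my-project", "appsheet-bq")
print(command)
```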

Create a JSON key of the Service Account


c. Click on the three dots in the rightmost Actions column of the new Service Account
you created, and select “Manage keys”:
d. Click on the “Add Key” button and select “Create new key”

e. Accept the default selection of “JSON” and Click on the “Create” button
f. Note the file name. You will find this in your browser’s download folder. Click on
the “Close” button. Depending on your browser type, you may see the
downloaded file in the bottom bar of your browser window.
g. Open the file so that you can copy and paste its contents in a few minutes
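
Before pasting the key into AppSheet, it can help to confirm that the downloaded file really is a service account key. The sketch below checks a few fields that service account key files contain (`type`, `project_id`, `private_key`, `client_email`); the helper name and the sample values are hypothetical, and the sample is a placeholder, not a working credential.

```python
import json

REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_text):
    """Return (project_id, client_email) or raise ValueError if the key looks wrong."""
    try:
        key = json.loads(key_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"key file is not valid JSON: {exc}") from exc
    if not isinstance(key, dict):
        raise ValueError("key file must contain a JSON object")
    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']}")
    return key["project_id"], key["client_email"]


# Placeholder content standing in for the downloaded file.
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----\n",
    "client_email": "appsheet-bq@my-project.iam.gserviceaccount.com",
})
project_id, client_email = check_service_account_key(sample_key)
print(project_id, client_email)
```

The `project_id` and `client_email` values are the same Project ID and service account email that later steps ask you to copy from the GCP Console.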

Accessing a Private DataSet


Every GCP account will have different security policies set up, according to the security needs
of the customer account. The instructions in this section should work for most GCP projects
with a highly restrictive set of policies. You may not need all of the permissions in the custom
role in your specific GCP Project, so you can also try removing some of the permissions if you
would like, then add them back in until it works for your project.

You need to create a Custom Role in GCP IAM in order to access a Private DataSet. Please
ignore this section if you only need to access Public DataSets, and skip to the “Add the
BigQuery data source to an AppSheet App” Section.

Creating a Custom Role


GCP provides a shell feature in the GCP Cloud Console called Cloud Shell, which allows you
to use a command line interface. This is the quickest way to create a custom role. Click on the
Cloud Shell icon in the upper right of your GCP console
window. Click on the Continue button when prompted, and then wait for the Terminal window to
boot up in the bottom portion of your screen.

Click on the “Open Editor” Icon in the Cloud Shell Navigation bar (or use vi if you prefer):
Click on the … icon and then the “New File” selection on the pop up menu:
Enter a File name of your choice with a .yaml extension, like: BigQuery.AppSheet.Roles.yaml

Cut and paste the following text into the file (modify the title and description values as needed
for your use case):

title: "bigquery.appsheet"
description: "BigQuery ro role for AppSheet"
stage: "GA"
includedPermissions:
- bigquery.datasets.get
- bigquery.jobs.create
- bigquery.routines.get
- bigquery.routines.list
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.list
- resourcemanager.projects.get

Remove any blank lines at the end by using backspace or delete on your keyboard. You should
see 12 lines in your file, like this:
Click on the “x” in the file tab to close the file (it will automatically save). Then, click on the
“Open Terminal” button to get back to the shell.
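
As an alternative to typing the role file by hand, the sketch below renders the same 12 lines from a list of permissions and confirms there is no trailing blank line. The helper name is hypothetical; the title, description, and permissions are the ones listed above. It builds the text only, so you would still save it to your .yaml file yourself.

```python
ROLE_PERMISSIONS = [
    "bigquery.datasets.get",
    "bigquery.jobs.create",
    "bigquery.routines.get",
    "bigquery.routines.list",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "resourcemanager.projects.get",
]


def render_role_yaml(title, description, permissions):
    """Render the custom role definition with no trailing blank line."""
    lines = [
        f'title: "{title}"',
        f'description: "{description}"',
        'stage: "GA"',
        "includedPermissions:",
    ]
    lines += [f"- {permission}" for permission in permissions]
    return "\n".join(lines)


role_yaml = render_role_yaml(
    "bigquery.appsheet", "BigQuery ro role for AppSheet", ROLE_PERMISSIONS
)
line_count = len(role_yaml.splitlines())
print(role_yaml)
print(line_count)  # 4 header lines + 8 permissions = 12
```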

Copy and paste the following command. Replace the Red text (the placeholders in angle
brackets) with your own values. The backslash tells the shell that the command will
continue on the next line:

gcloud iam roles create <your new role name> \
--project=<your-project-id> \
--file=<your-YAML-File-Name.yaml>
Here is an example of how I filled out the Red text with my values:

gcloud iam roles create bigquery.appsheet3 \
--project=appsheet-scott \
--file=BigQuery.AppSheet.Roles.yaml

You will now need to authorize the shell to call the API which will create the role. Click on
Authorize:

You should see the resulting text which tells you the role was successfully created:

Now, you need to share your Private DataSet with your service account, using the role that
you created, so that it has all the permissions needed by the AppSheet BigQuery data source.
The first thing you need is the user name that was assigned to your service account. Navigate
to the Service Accounts section of Google Cloud Console using the hamburger menu or go to
this URL:
https://console.cloud.google.com/iam-admin/serviceaccounts

Copy the value in the email column for the BigQuery Service Account that we created earlier
in this document, so that you can paste it in a few minutes. You may want to open another tab
for the next step so that you can copy it to your clipboard again if you lose it:

Open a new tab in your browser and go to the BigQuery service in Google Cloud Console. Use
the hamburger menu in the upper left corner, or just navigate to this URL:
https://console.cloud.google.com/bigquery

Find your private dataset and click on the triple dot menu on the right and then on “Open”:
Click on “Share DataSet” in the BigQuery navigation bar:

Paste the Service Account email address into the Add Members text box.
Click on the “Select a role” drop down menu
Select the “Custom >” option
Select the new role you created like “bigquery.appsheet”
Click on the “Add” button.

Note the new role now has a member, which is your service account. Click on the “Done”
button:
Now, you are ready to add the Data Source in AppSheet. If you were using a Public DataSet,
then you can continue from here.
Add the BigQuery data source to an AppSheet App
2. Go to appsheet.com and log in
a. Go to My Account / My Account

b. Click on the “+ New Data Source” button


c. Select “Cloud Database” and, at the top, enter the name that you’d like
to use for your new data source (we used “BigQuery-NYC-Citibikes” as our
name):

d. Choose the “BigQuery” option from the Dropdown list of available DB types
(NOTE: If you don’t see BigQuery in the list, your account may not have this
feature rolled out yet - please wait and try again tomorrow - Don’t worry, all of the
previous work you’ve done is still good! You can start from here.)

e. Next, we will find the 3 input fields from back in our GCP Console and in the key
file we downloaded above. We need the BigQuery DataSet ID, the Google Cloud
Project ID, and the Service Account Key

f. DataSet ID - Go to the Google Cloud Console (https://console.cloud.google.com)
and select “BigQuery” under the “Big Data” section. You may need to scroll down
a fair bit on the main left hand hamburger menu to find it.
h. Find your BigQuery dataset that you’d like to use. We’ll use a Public dataset
called bigquery-public-data:new_york_citibike for demo/instructional purposes, but you
should select the dataset that you would like to use in your app. If you don’t see
the dataset you want, click on the “+ ADD DATA” button and add it.
i. Open the dataset by clicking on the 3 vertical dots menu on the right of
the dataset name. Make sure you are at the top level:

The dataset level, not the Table level

ii. Copy the DataSet ID so that you can Paste it into AppSheet

iii. Now, paste it into AppSheet’s new data source window:


i. Google Cloud Project ID: Go back to the Google Cloud Console. Click on the
Project Name at the top of the Browser, and then copy the Project ID from the
pop-up so that you can paste it into AppSheet.

j. Paste the Project Id into AppSheet


k. Service Account key : This is in the JSON file that was downloaded when we
created the key on the Service Account in GCP Console (IAM & Admin). Find
the file that was downloaded (will be in your browser’s download directory) and
open it with any text editor. Copy the entire contents of the file.

l. Paste into AppSheet:


m. Click on Test and then Authorize Access

Add a BigQuery Table to your App in AppSheet


n. On the Data Tab - Click on “+ New Table”
o. Select the data source you created (we used “BigQuery-NYC-Citibikes”)

p. Select Tables or Views (we will use Tables)


q. Select the table that you want for your App (we selected citibike_stations):

r. Mark the table as “Read-Only” - this will improve your performance as the app
will not try to recache as often. Click on the “Add This Table” button.

s. Now, your app will have access to this data as if it were any other table.
Don’t forget to add security filters to limit the number of rows that will be used with AppSheet.
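
To get a feel for what a security filter such as `[DateCreated] >= TODAY() - 7` combined with a `USEREMAIL()` check does to the row count, here is a small local simulation. It only mimics the filter's meaning on in-memory rows; AppSheet itself translates the expression to BigQuery SQL on the server. The `Owner` column, the sample rows, and the helper name are hypothetical.

```python
from datetime import date, timedelta

ROW_LIMIT = 100_000  # preview row limit described in the Limits section


def visible_rows(rows, user_email, today, days=7):
    """Keep rows owned by the user and created within the last `days` days."""
    cutoff = today - timedelta(days=days)
    return [
        row for row in rows
        if row["Owner"] == user_email and row["DateCreated"] >= cutoff
    ]


# Hypothetical sample data.
rows = [
    {"Owner": "ann@example.com", "DateCreated": date(2024, 1, 14)},
    {"Owner": "ann@example.com", "DateCreated": date(2024, 1, 8)},
    {"Owner": "ann@example.com", "DateCreated": date(2024, 1, 1)},
    {"Owner": "bob@example.com", "DateCreated": date(2024, 1, 14)},
]
result = visible_rows(rows, "ann@example.com", today=date(2024, 1, 15))
within_limit = len(result) <= ROW_LIMIT
print(len(result), within_limit)
```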
