Big Data & Business Analytics
INSTITUTE OF MANAGEMENT
SUBJECT
BIG DATA & BUSINESS ANALYTICS
CLASS - SYMMS
ROLL NO. - 42
TOPIC
TABLEAU BIG DATA ANALYTICS
Google Ads, Google Analytics, Google BigQuery, Google Cloud
SQL, Google Drive
This assignment report is submitted as per the guidelines
and referenced with accurate details.
Submitted By
Ms. Vrushali Anil Rajpure
H & G H MANSUKHANI INSTITUTE OF MANAGEMENT | ROLL NO. 42
TABLEAU
• Tableau is a powerful, fast-growing data visualization tool used in the
business intelligence industry. It helps turn raw data into an easily
understandable format.
• Data analysis is fast with Tableau, and the visualizations are created in
the form of dashboards and worksheets. Visualizations created in Tableau
can be understood by professionals at any level of an organization, and
even a non-technical user can create a customized dashboard.
• The best features of Tableau are:
1. Data Blending
2. Real time analysis
3. Collaboration of data
• A great advantage of Tableau is that it doesn't require technical or
programming skills to operate. The tool has attracted interest from people
in many sectors, including business, research, and various industries.
TABLEAU DESKTOP
Tableau Desktop has a rich feature set and allows you to code and customize
reports. From creating charts and reports to blending them together into a
dashboard, all the necessary work is done in Tableau Desktop.
For live data analysis, Tableau Desktop provides connectivity to data warehouses
as well as various other types of files. The workbooks and dashboards created
here can be shared either locally or publicly.
Based on the connectivity to the data sources and publishing option, Tableau
Desktop is classified into
GOOGLE ADS
Google Ads is Google's online advertising program. Through Google Ads, you
can create online ads to reach people exactly when they're interested in the
products and services that you offer.
i. Sign in to Google Ads using your email or phone, and then select Next to
enter your password. If multiple accounts are listed, select the account that has
the Google Ads data you want to access, and enter the password if you are not
already signed in.
ii. Select Allow to authorize Google to securely share your data with Tableau
Desktop.
iii. Close the browser window when notified to do so.
iv. In Tableau Desktop, select your Account and your Client Customer ID.
v. Then, select the pre-defined report and date filters.
vi. You can also select the columns to show for the selected report.
vii. Select Connect.
i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
ii. By default, the selected report is displayed under Table and is
automatically dragged to the top of the canvas.
iii. Select the sheet tab to start your analysis.
iv. After you select the sheet tab, Tableau imports the data by creating an extract.
Note that Tableau Desktop supports only extracts for Google Ads. You can
update the data by refreshing the extract. For more information, see Extract
Your Data.
v. Creating extracts may take some time depending on the amount of data that is
included.
Here is an example of a Google Ads data source connection using Tableau Desktop on a
Windows computer:
Key considerations
Account Requirements
To use the Google Ads connector, you must be a customer of Google Ads.
When you apply a date filter, it's tempting to gather as much data as possible for your
analysis. However, retrieving records from Google Ads can be time-consuming, and Tableau
doesn't know how much data there is in a particular date range until it retrieves the data. For
this reason, you should restrict your date range at first, and then expand it after you evaluate
performance.
To give you a rough idea of how much time it might take to retrieve data from Google Ads,
tests were conducted using a high-speed connection. This table shows how long it took in the
test environment to retrieve a given number of records.
Number of records    Time to retrieve
1,000                11 seconds
10,000               2 minutes
100,000              18 minutes
While the Google Ads UI can display only one segment at a time, the Google Ads
connector lets you combine multiple segments in the same report. Keep in mind
that the number of rows can increase exponentially with each additional segment field
included in your report.
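As a rough illustration of why row counts grow so quickly, the sketch below computes an
upper bound on report rows when segments are combined. The segment names and their
distinct-value counts are hypothetical, and the actual row count depends on which
combinations really occur in the account's data.

```python
from math import prod
from typing import Mapping


def max_report_rows(base_rows: int, segment_cardinalities: Mapping[str, int]) -> int:
    """Upper bound on rows if every base row is split by every segment value."""
    return base_rows * prod(segment_cardinalities.values())


# Hypothetical report: 500 rows before any segment is added.
segments = {"device": 3, "day_of_week": 7, "network": 4}

rows_one_segment = max_report_rows(500, {"device": 3})
rows_all_segments = max_report_rows(500, segments)
print(rows_one_segment, rows_all_segments)
```

Each additional segment multiplies the bound by its number of distinct values, which is
why restricting segments early keeps extracts manageable.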
The connector returns the data in the same format as the Google Ads API. For money fields
(such as costs and amounts), the Google Ads API returns values in micro currency
units (micros). To get the amount in the account's local currency, divide the value by
1,000,000.
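The conversion can be sketched in a few lines of Python. The function name and the sample
amount are illustrative assumptions; inside Tableau the same division would normally be
done in a calculated field on the money column.

```python
from decimal import Decimal

# One currency unit equals 1,000,000 micros.
MICROS_PER_UNIT = Decimal(1_000_000)


def micros_to_currency(micros: int) -> Decimal:
    """Convert a micros amount to units of the account's local currency."""
    return Decimal(micros) / MICROS_PER_UNIT


cost = micros_to_currency(12_340_000)
print(cost)  # 12.34
```

Decimal is used here instead of float so that currency amounts are not affected by binary
rounding.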
GOOGLE ANALYTICS
Google Analytics is a free Web analytics service that provides statistics
and basic analytical tools for search engine optimization (SEO) and marketing
purposes. The service is available to anyone with a Google account. Google
bought Urchin Software Corporation in April 2005 and used that company’s
Urchin on Demand product as the basis for its current service.
i. Sign in to GA using your email or phone, and then select Next to enter
your password. If multiple accounts are listed, select the account that has
the GA data you want to access, and enter the password, if you are not
already signed in.
ii. Select Allow so that Tableau Desktop can access your GA data.
iii. Close the browser window when notified to do so.
a. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
b. Follow the steps at the top of the data source page to complete the connection.
c. Select the sheet tab to start your analysis. After you select the sheet tab,
Tableau imports the data by creating an extract. Note that Tableau Desktop
supports only extracts for Google Analytics. You can update the data by
refreshing the extract.
Here is an example of a Google Analytics data source connection using Tableau Desktop on a
Windows computer:
You see the following message when Tableau returns all data.
If the query stays within the boundaries of the query restrictions, GA doesn't return sampled
data and you do not see the above message.
If a query continues to return sampled data, consider the following:
• Missing date dimension – You must use the date dimension in your query to return
all data.
• Too much data – Your query might contain too much data. Reduce the date range.
Note that the minimum date range is one day.
• Legacy workbooks – Workbooks created in Tableau Desktop 9.1 and earlier cannot
return all data. Open the legacy workbook in Tableau Desktop 9.2 or later and save
the workbook.
In cases when workbook performance is critical or there are specific dimensions and
measures you want to use in your query that are not supported by Tableau’s default query
process, use sampled data instead. To return sampled data, select the Sample data button.
GOOGLE BIGQUERY
BigQuery, Google's serverless, highly scalable enterprise data warehouse,
is designed to make data analysts more productive with unmatched price-
performance. Because there is no infrastructure to manage, you can focus on
uncovering meaningful insights using familiar SQL without the need for a
database administrator.
This section describes how to connect Tableau to Google BigQuery and set up the data source.
ii. Select Accept so that Tableau Desktop can access your Google BigQuery
data.
iii. Close the browser window when notified to do so.
i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
ii. (Optional) From the Billing Project drop-down list, select a billing project. If
you don't select a billing project, EmptyProject appears in the field after you
have selected the remaining fields.
iii. From the Project drop-down list, select a project. Alternatively,
select publicdata to connect to sample data in BigQuery.
iv. From the Dataset drop-down list, select a data set.
v. Under Table, select a table.
Use custom SQL to connect to a specific query rather than the entire data
source.
Here is an example of a Google BigQuery data source using Tableau Desktop on a Windows
computer:
You can have the customization attributes included in your published workbook or data
source, as long as you specify the attributes before you publish the workbook or data source
to Tableau Online or Tableau Server.
Customization attributes accept integer values and affect both live queries and extract
refreshes for the specified connection.
The following attributes help the most to increase performance of large result sets:
bq-fetch-tasks                  Number of parallel background tasks to use when fetching
                                data using HTTP. The default is 10.
bq-large-fetch-rows             Number of rows to fetch in each batch for spool queries.
                                The default is 50000.
The following attributes are also available and are primarily used for small queries:
bq-fetch-rows                   Number of rows to fetch in each batch for non-spool
                                queries. The default is 10000.
bq-response-rows                Number of rows returned in non-spool non-batched queries.
                                The default is 10000.
This capability setting accepts yes or no values and can be useful when testing:
CAP_BIGQUERY_FORCE_SPOOL_JOB    Force all queries to use the temp table approach. The
                                default value is "no". Change the value to "yes" to turn
                                this attribute on.
Tableau uses two approaches to return rows from BigQuery: the default non-spool approach,
or the temp table (spool) approach:
1. On the first attempt, queries are executed using the default, non-spool query, which
uses the bq-fetch-rows setting.
2. If the result set is too large, the BigQuery API returns an error and the Tableau
BigQuery connector retries the query by saving the results into a BigQuery temp
table. The BigQuery connector then reads from that temp table, which is a spool job
that uses the bq-large-fetch-rows setting.
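The two-step behaviour described above can be pictured with a toy simulation. This is only
a sketch under stated assumptions: it calls no BigQuery or Tableau API, and every function
name, the exception class, and the row limits are hypothetical stand-ins.

```python
class ResponseTooLarge(Exception):
    """Stand-in for the 'result set too large' error described above."""


def run_direct(result, max_direct_rows):
    """Non-spool attempt: fails when the result exceeds the direct limit."""
    if len(result) > max_direct_rows:
        raise ResponseTooLarge(len(result))
    return list(result)


def read_in_batches(temp_table, batch_size):
    """Spool read: yield the temp table's rows in fixed-size batches."""
    for start in range(0, len(temp_table), batch_size):
        yield temp_table[start:start + batch_size]


def fetch_rows(result, max_direct_rows, large_fetch_rows):
    """Try the direct path first, then fall back to the temp-table path."""
    try:
        return run_direct(result, max_direct_rows), "non-spool"
    except ResponseTooLarge:
        temp_table = list(result)  # stands in for saving results to a temp table
        rows = []
        for batch in read_in_batches(temp_table, large_fetch_rows):
            rows.extend(batch)
        return rows, "spool"


small_rows, small_path = fetch_rows(range(100), max_direct_rows=1_000, large_fetch_rows=500)
large_rows, large_path = fetch_rows(range(2_500), max_direct_rows=1_000, large_fetch_rows=500)
print(small_path, len(small_rows))
print(large_path, len(large_rows))
```

In this sketch the large_fetch_rows argument plays the role of bq-large-fetch-rows: it only
matters once the direct attempt has failed and the rows are read back from the temp table.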
The customization attributes in the .tdc file are read and included by Tableau Desktop when
the data source or workbook is published to Tableau Online or Tableau Server.
You can manually embed customization attributes inside the 'connection' tag in the workbook
.twb file or the data source .tds file. The BigQuery customization attributes are bold in the
following example to make them easier for you to see.
When using web authoring or publishing to the web, you cannot use multiple Google
BigQuery accounts in the same workbook. You can have multiple Google BigQuery account
connections in Desktop.
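To illustrate how customization attributes sit inside the 'connection' tag of a .twb or
.tds file, here is a minimal Python sketch that adds them to an XML snippet. The snippet is
a hypothetical, heavily trimmed stand-in for a real data source file, and the helper name
is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for a .tds file; real files carry many more attributes.
TDS_SNIPPET = """<datasource>
  <connection class='bigquery' project='publicdata' />
</datasource>"""


def embed_attributes(xml_text: str, attributes: dict) -> str:
    """Set each customization attribute on the first 'connection' element."""
    root = ET.fromstring(xml_text)
    connection = root.find(".//connection")
    if connection is None:
        raise ValueError("no 'connection' element found")
    for name, value in attributes.items():
        connection.set(name, str(value))
    return ET.tostring(root, encoding="unicode")


updated = embed_attributes(
    TDS_SNIPPET,
    {"bq-fetch-tasks": 10, "bq-large-fetch-rows": 50000},
)
print(updated)
```

The attribute values shown are the defaults listed earlier; in practice they would be tuned
for the size of the result sets being fetched.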
GOOGLE CLOUD SQL
Driver required
This connector requires a driver to talk to the database. You might already have the required
driver installed on your computer. If the driver is not installed on your computer, Tableau
displays a message in the connection dialog box with a link to the Driver Download page
where you can find driver links and installation instructions.
If Tableau can't make the connection, verify that your credentials are correct. If you still
can't connect, your computer is having trouble locating the server. Contact your network
administrator or database administrator.
i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
ii. From the Database drop-down list, select a database or use the text box to
search for a database by name.
iii. Under Table, select a table or use the text box to search for a table by name.
iv. Drag the table to the canvas, and then select the sheet tab to start your
analysis.
Use custom SQL to connect to a specific query rather than the entire data source.
Here is an example of a Google Cloud SQL data source using Tableau Desktop on a
Windows computer.
Sign in on a Mac
If you use Tableau Desktop on a Mac, when you enter the server name to connect, use a fully
qualified domain name, such as mydb.test.ourdomain.lan, instead of a relative domain name,
such as mydb or mydb.test.
Alternatively, you can add the domain to the list of Search Domains for the Mac computer so
that when you connect, you need to provide only the server name. To update the list of
Search Domains, go to System Preferences > Network > Advanced, and then open
the DNS tab.
GOOGLE DRIVE
Google Drive is a free cloud-based storage service that enables users to
store and access files online. The service syncs stored documents, photos and
more across all of the user's devices, including mobile devices, tablets and
PCs. It integrates with the company's other services and systems, including
Google Docs, Gmail, Android, Chrome, YouTube, Google Analytics and
Google+. It competes with Microsoft OneDrive, Apple iCloud, Box, Dropbox
and SugarSync.
The following steps describe how to connect Tableau to Google Drive and set up
the data source.
i. Sign in to Google Drive using your email or phone, and then select Next to
enter your password. If multiple accounts are listed, select the account that has
the Google Drive data you want to access and enter the password, if you're not
already signed in.
ii. Select Allow so that Tableau Desktop can access your Google Drive data.
iii. Close the browser window when notified to do so.
iv. Select a file from the list or use the text box to search for a file by name, and
then select Connect.
a. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data figure out which
data source to connect to.
b. If your Google Drive file has one table, select the sheet tab to start your
analysis.
You can store up to 2 million cells for spreadsheets that are created in Google Drive.
When using web authoring or publishing to the web, you cannot use multiple Google Drive
accounts, even when using different connections. You can have multiple Google Drive
account connections in Desktop.
In Internet Explorer 11 and Edge, you cannot access a server using an unsecured
connection (http). Use a secure connection (https) or switch to another browser.
GOOGLE SHEETS
With Google Sheets, you can create and edit spreadsheets directly in your web
browser; no special software is required. Multiple people can work
simultaneously, you can see people's changes as they make them, and every
change is saved automatically.
i. Sign in to Google Sheets using your email or phone, and then select Next to
enter your password. If multiple accounts are listed, select the account that has
the Google Sheets data you want to access and enter the password, if you're
not already signed in.
ii. Select Allow so that Tableau Desktop can access your Google Sheets data.
iii. Close the browser window when notified to do so.
iv. Select a Google Sheet from the list or use the text box to search for a Google
Sheet by name or by URL, and then select Connect.
i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use
a data source naming convention that helps other users of the data figure out
which data source to connect to.
ii. If your Google Sheets file has one table, select the sheet tab to start your
analysis.
The Select Your Google Sheet dialog box includes the following functionality:
1. The list of sheets that you can select from includes your private sheets, sheets shared
with you, and the public sheets that you've accessed in the past.
2. If you search by URL and the URL doesn't exist or you don't have access to it, an
error displays.
3. You can select the Name and Last opened by me column names to sort the Google
Sheets, and when you select a sheet you can preview it in the right pane. You cannot
sort by Owned by.
You can also connect to a named range the same way you connect to a worksheet. The named
range functions as a table in Tableau.
You create named ranges in Google Sheets by highlighting a range of cells and then
selecting Data > Named ranges. When you connect to a named range in Tableau, an icon
appears next to the sheet in the Data Source tab as shown below.
You can store up to 2 million cells for spreadsheets that are created in or converted to Google
Sheets.
If there are errors in your Google Sheet, such as #DIV/0! or #N/A, Tableau is unable to create
an extract and displays the error message "Internal Error - An unexpected error occurred and
the operation could not be completed." To resolve this issue, wrap the function
with iferror() and have it return a blank, or any value that's appropriate. For example, the sheet
below includes a #DIV/0! error. The solution is to wrap the calculation in
an iferror() calculation.
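As a small sketch of the fix, the Python helper below builds the wrapped formula text for a
given cell formula. The helper name and the sample formula =A2/B2 are illustrative
assumptions; the wrapped formula itself is what would be typed into the Google Sheet.

```python
def wrap_in_iferror(formula: str, fallback: str = '""') -> str:
    """Wrap a Sheets formula in IFERROR so errors become the fallback value."""
    body = formula[1:] if formula.startswith("=") else formula
    return f"=IFERROR({body}, {fallback})"


print(wrap_in_iferror("=A2/B2"))       # =IFERROR(A2/B2, "")
print(wrap_in_iferror("=A2/B2", "0"))  # =IFERROR(A2/B2, 0)
```

With the blank fallback, a division by zero produces an empty cell instead of #DIV/0!, so
the extract can be created.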