Big Data&Business Analytics

The document is a report submitted by Ms. Vrushali Anil Rajpure to the H & G H Mansukhani Institute of Management about Tableau big data analytics. It discusses features of Tableau including data blending, real-time analysis and collaboration. It also describes Tableau Desktop and how it can be used to create charts, reports and dashboards from live data sources. Finally, it provides overviews of how to connect Tableau to Google Ads and Google Analytics data.

Uploaded by

Vrushali Rajpure

H & G H MANSUKHANI

INSTITUTE OF MANAGEMENT

SUBJECT
BIG DATA & BUSINESS ANALYTICS

CLASS - SYMMS
ROLL NO. - 42

TOPIC
TABLEAU BIG DATA ANALYTICS
Google Ads, Google Analytics, Google BigQuery, Google Cloud SQL,
Google Drive
This assignment report is submitted as per the guidelines
and referenced with accurate details.

Submitted By
Ms. Vrushali Anil Rajpure
H & G H MANSUKHANI INSTITUTE OF MANAGEMENT | ROLL NO. 42

TABLEAU
• Tableau is a powerful, fast-growing data visualization tool used in the
business intelligence industry. It helps simplify raw data into an easily
understandable format.
• Data analysis is very fast with Tableau, and the visualizations are created
in the form of dashboards and worksheets. Visualizations created with
Tableau can be understood by professionals at any level in an organization.
It even allows a non-technical user to create a customized dashboard.
• The best features of Tableau are:

1. Data Blending
2. Real time analysis
3. Collaboration of data

• The great thing about Tableau is that it doesn't require any technical or
programming skills to operate. The tool has attracted interest from people
in many sectors, such as business, research and a range of industries.

TABLEAU DESKTOP
Tableau Desktop has a rich feature set and allows you to code and customize
reports. From creating charts and reports to blending them together into a
dashboard, all the necessary work is done in Tableau Desktop.

For live data analysis, Tableau Desktop provides connectivity to data
warehouses as well as various other types of files. The workbooks and
dashboards created here can be shared either locally or publicly.

Based on connectivity to data sources and publishing options, Tableau
Desktop is classified into:

• Tableau Desktop Personal: The development features are similar to
Tableau Desktop. The Personal version keeps the workbook private, and
access is limited. Workbooks cannot be published online, so they must be
distributed either offline or through Tableau Public.
• Tableau Desktop Professional: It is largely similar to Tableau Desktop.
The difference is that work created in Tableau Desktop can be published
online or to Tableau Server. The Professional version also has full access
to all data types. It is best suited to those who want to publish their work
to Tableau Server.


GOOGLE ADS
Google Ads is Google's online advertising program. Through Google Ads, you
can create online ads to reach people exactly when they're interested in the
products and services that you offer.

Before you begin

• Your email address or user ID and password

Make the connection and set up the data source


1. Start Tableau and under Connect, select Google Ads. For a complete list of data
connections, select More under To a Server. Then do the following:

i. Sign in to Google Ads using your email or phone, and then select Next to
enter your password. If multiple accounts are listed, select the account that has
the Google Ads data you want to access, and enter the password if you are not
already signed in.
ii. Select Allow to authorize Google to securely share your data with Tableau
Desktop.
iii. Close the browser window when notified to do so.
iv. In Tableau Desktop, select your Account and your Client Customer ID.
v. Then select the predefined report and date filters.
vi. You can also select the columns to show for the selected report.
vii. Select Connect.

2. On the data source page, do the following:

i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
ii. By default, the selected report is displayed under Table and is
automatically dragged to the top of the canvas.
iii. Select the sheet tab to start your analysis.
iv. After you select the sheet tab, Tableau imports the data by creating an extract.
Note that Tableau Desktop supports only extracts for Google Ads. You can
update the data by refreshing the extract. For more information, see Extract
Your Data.
v. Creating extracts may take some time depending on the amount of data that is
included.


Google Ads data source example

Here is an example of a Google Ads data source connection using Tableau Desktop on a
Windows computer:

Key considerations
Account Requirements

To use the Google Ads connector, you must be a customer of Google Ads.


Date range selections can impact performance

When you apply a date filter, it's tempting to gather as much data as possible for your
analysis. However, retrieving records from Google Ads can be time-consuming. Tableau
doesn't know how much data there is in a particular date range until it retrieves the data. For
this reason, restrict your date range at first, and then expand it after you evaluate
performance.

To give you a rough idea of how much time it might take to retrieve data from Google Ads,
tests were conducted using a high-speed connection. This table shows how long it took in the
test environment to retrieve a given number of records.

Number of Records    Time to Retrieve
1,000                11 seconds
10,000               2 minutes
100,000              18 minutes

Selecting more than one segment can impact performance

In the Google Ads UI, only one segment at a time can be used for display. With the
Google Ads connector, you can combine multiple segments in the same report. Keep in mind
that the number of rows can increase exponentially with each additional segment field
included in your report.
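
To get a feel for this growth, the following Python sketch computes a rough upper bound on the row count. It assumes, purely for illustration, that every segment value can occur with every base row; the function name and the example numbers are hypothetical and do not come from the Google Ads connector.

```python
from math import prod
from typing import Sequence


def max_report_rows(base_rows: int, segment_cardinalities: Sequence[int]) -> int:
    """Upper bound on rows if every segment value appears for every base row.

    This is an illustrative worst case, not a figure reported by Tableau.
    """
    return base_rows * prod(segment_cardinalities)


# Hypothetical report: 100 rows, segmented by 3 devices and 7 weekdays.
print(max_report_rows(100, [3, 7]))
```

Real reports are usually smaller than this bound, because not every combination of segment values has data.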

Use a calculated field to correct money values

The connector returns the data in the same format as the Google Ads API. For money fields
(such as costs and amounts), the Google Ads API returns values in micro currency units
(micros). To get the amount in the account's local currency, divide the value by
1,000,000.
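
The same division can be sketched in Python. The function name is hypothetical, and Decimal is used only to avoid floating-point rounding on money values. In Tableau itself, the calculated field would be a simple division along the lines of [Cost] / 1000000, where Cost is an assumed field name.

```python
from decimal import Decimal

MICROS_PER_UNIT = Decimal(1_000_000)  # 1 currency unit = 1,000,000 micros


def micros_to_currency(micros: int) -> Decimal:
    """Convert a micro-currency amount to the account's local currency."""
    return Decimal(micros) / MICROS_PER_UNIT


# A cost of 2,500,000 micros is 2.5 units of the account currency.
print(micros_to_currency(2_500_000))
```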

Troubleshoot data access


If you see an error when you try to log in using your Google account, for example, "The login
information provided corresponds to a Google account that does not have Ads enabled. Make
sure to login with a valid Ads account....", contact your company's assigned Google Ads
account administrator.


GOOGLE ANALYTICS
Google Analytics is a free Web analytics service that provides statistics
and basic analytical tools for search engine optimization (SEO) and marketing
purposes. The service is available to anyone with a Google account. Google
bought Urchin Software Corporation in April 2005 and used that company’s
Urchin on Demand product as the basis for its current service.

Before you begin


Before you begin, gather this connection information:

• GA email address and password

Make the connection and set up the data source


1. Start Tableau and under Connect, select Google Analytics. For a complete list of
data connections, select More under To a Server. In the tab Tableau opens in your
default browser, do the following:

i. Sign in to GA using your email or phone, and then select Next to enter
your password. If multiple accounts are listed, select the account that has
the GA data you want to access, and enter the password, if you are not
already signed in.


ii. Select Allow so that Tableau Desktop can access your GA data.
iii. Close the browser window when notified to do so.

2. On the data source page, do the following:

a. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.

b. Follow the steps at the top of the data source page to complete the connection.

i. Step 1 – Select an Account, Property, and Profile using the drop-down
menus.
ii. Step 2 – Select filters for a date range and a segment.
1. For Date Range, you can select one of the predefined date
ranges or select specific dates. When selecting a date range, GA
can provide complete data only up to the previous full day. For
example, if you choose Last 30 days, data will be retrieved for
the last 30-day period ending yesterday.
2. For Segment, select a segment to filter your data. Segments are
preset filters that you can set for a GA connection. Default
Segments are defined by Google, and Custom Segments are
defined by the user on the GA website. Segments also help
prevent sampling by filtering the data as defined by the
segment. For example, with a segment, you can get results for a
specific platform, such as tablets, or for a particular search
engine, such as Google.
iii. Step 3 – Add dimensions and measures by using the Add
Dimension and Add Measure drop-down menus, or select a
predefined set of measures from the Choose a Measure Group drop-
down menu. Some dimensions and measures cannot be used together.

c. Select the sheet tab to start your analysis. After you select the sheet tab,
Tableau imports the data by creating an extract. Note that Tableau Desktop
supports only extracts for Google Analytics. You can update the data by
refreshing the extract.
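
The date range rule in Step 2 (complete data is available only up to the previous full day) can be illustrated with a short Python sketch. The function name is hypothetical, and the code only mirrors the arithmetic described above. It is not how Tableau or GA compute the range internally.

```python
from datetime import date, timedelta
from typing import Tuple


def last_n_full_days(n: int, today: date) -> Tuple[date, date]:
    """Return the inclusive (start, end) of the last n full days, ending yesterday."""
    end = today - timedelta(days=1)        # yesterday is the last complete day
    start = end - timedelta(days=n - 1)    # n days inclusive of the end date
    return start, end


# "Last 30 days" evaluated on 31 March 2024 covers 1 March to 30 March.
start, end = last_n_full_days(30, date(2024, 3, 31))
print(start, end)
```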


Google Analytics data source example

Here is an example of a Google Analytics data source connection using Tableau Desktop on a
Windows computer:


All data vs. sampled data returned from a query


GA restricts the amount of data that it returns from a query and provides sampled data
instead. Sampled data is a random subset of your data. When performing analysis on sampled
data, you can miss interesting outliers, and aggregations can be inaccurate. If Tableau detects
that your query might return sampled data, by default, Tableau creates multiple queries from
your query, and then combines the results from the queries to return all data.

You see the following message when Tableau returns all data.

If the query stays within the boundaries of the query restrictions, GA doesn't return sampled
data and you do not see the above message.
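
The idea of splitting one large query into several smaller ones and combining the results can be sketched as follows. This is only an illustration of the general approach, not Tableau's actual implementation: it assumes one query per day and additive measures, and run_query is a hypothetical stand-in for a call to GA.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Callable, Dict, Iterator


def each_day(start: date, end: date) -> Iterator[date]:
    """Yield every date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def fetch_all(start: date, end: date,
              run_query: Callable[[date], Dict[str, int]]) -> Dict[str, int]:
    """Run one small query per day and sum the additive measures."""
    totals: Counter = Counter()
    for day in each_day(start, end):
        totals.update(run_query(day))  # Counter.update adds values per key
    return dict(totals)


# Hypothetical query that reports 10 sessions for every day.
result = fetch_all(date(2024, 1, 1), date(2024, 1, 3), lambda day: {"sessions": 10})
print(result)
```

Summing only works for measures that can be added across days, which is one reason some dimensions and measures cannot be split into multiple queries.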

Troubleshoot issues with returning all data

If your query continues to return sampled data, consider the following:

• Missing date dimension – You must use the date dimension in your query to return
all data.

• Too much data – Your query might contain too much data. Reduce the date range.
Note that the minimum date range is one day.

• Non-aggregatable dimensions and measures – Some dimensions and measures
cannot be separated into multiple queries. If you suspect a problematic dimension or
measure in your query, hover over the All data button to see the tooltip that shows
which dimensions or measures to remove from your query.

• Legacy workbooks – Workbooks created in Tableau Desktop 9.1 and earlier cannot
return all data. Open the legacy workbook in Tableau Desktop 9.2 or later and save
the workbook.

Return sampled data

In cases when workbook performance is critical or there are specific dimensions and
measures you want to use in your query that are not supported by Tableau’s default query
process, use sampled data instead. To return sampled data, select the Sample data button.


GOOGLE BIGQUERY
BigQuery, Google's serverless, highly scalable enterprise data warehouse,
is designed to make data analysts more productive with unmatched price-
performance. Because there is no infrastructure to manage, you can focus on
uncovering meaningful insights using familiar SQL without the need for a
database administrator.

This section describes how to connect Tableau to Google BigQuery and set up the data source.

Before you begin


Before you begin, gather this connection information:

• Google BigQuery email or phone, and password

Make the connection and set up the data source


1. Start Tableau and under Connect, select Google BigQuery. For a complete list of
data connections, select More under To a Server. In the tab Tableau opens in your
default browser, do the following:

i. Sign in to Google BigQuery using your email or phone, and then
select Next to enter your password. If multiple accounts are listed, select the
account that has the Google BigQuery data you want to access and enter the
password, if you're not already signed in.


ii. Select Accept so that Tableau Desktop can access your Google BigQuery
data.
iii. Close the browser window when notified to do so.

2. On the data source page, do the following:

i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
ii. (Optional) From the Billing Project drop-down list, select a billing project. If
you don't select a billing project, EmptyProject appears in the field after you
have selected the remaining fields.
iii. From the Project drop-down list, select a project. Alternatively,
select publicdata to connect to sample data in BigQuery.
iv. From the Dataset drop-down list, select a data set.
v. Under Table, select a table.

Use custom SQL to connect to a specific query rather than the entire data
source.


Google BigQuery data source example

Here is an example of a Google BigQuery data source using Tableau Desktop on a Windows
computer:

Use customization attributes to improve query performance


You can use customization attributes to improve the performance of large result sets returned
from BigQuery to Tableau Online and Tableau Server, and on Tableau Desktop.

You can have the customization attributes included in your published workbook or data
source, as long as you specify the attributes before you publish the workbook or data source
to Tableau Online or Tableau Server.


Use Google BigQuery customization attributes

Customization attributes accept integer values and affect both live queries and extract
refreshes for the specified connection.

The following attributes help the most to increase performance of large result sets:

bq-fetch-tasks         Number of parallel background tasks to use when fetching
                       data using HTTP. The default is 10.

bq-large-fetch-rows    Number of rows to fetch in each batch for spool queries.
                       The default is 50000.

The following attributes are also available and are primarily used for small queries:

bq-fetch-rows          Number of rows to fetch in each batch for non-spool
                       queries. The default is 10000.

bq-response-rows       Number of rows returned in non-spool non-batched queries.
                       The default is 10000.

This capability setting accepts yes or no values and can be useful when testing:

CAP_BIGQUERY_FORCE_SPOOL_JOB    Force all queries to use the temp table approach.
                                The default value is "no." Change the value to
                                "yes" to turn this attribute on.

How Tableau returns rows from Google BigQuery

Tableau uses two approaches to return rows from BigQuery: the default non-spool approach,
or the temp table (spool) approach:

1. On the first attempt, queries are executed using the default, non-spool query, which
uses the bq-fetch-rows setting.
2. If the result set is too large, the BigQuery API returns an error and the Tableau
BigQuery connector retries the query by saving the results into a BigQuery temp
table. The BigQuery connector then reads from that temp table, which is a spool job
that uses the bq-large-fetch-rows setting.
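
The two-step behaviour above can be sketched in Python as a try-then-fall-back pattern. Everything in this sketch is hypothetical (the exception class, the function names and the fake query functions). It only illustrates the control flow described in the list, with the default batch sizes taken from the attribute tables, and is not the connector's real code.

```python
class ResultTooLargeError(Exception):
    """Hypothetical error standing in for BigQuery rejecting a large result."""


def fetch_rows(run_direct, run_via_temp_table,
               fetch_rows_setting=10000, large_fetch_rows_setting=50000):
    """Try the non-spool query first, then fall back to the temp table (spool) job."""
    try:
        # First attempt: default non-spool query, batched by bq-fetch-rows.
        return run_direct(batch_size=fetch_rows_setting)
    except ResultTooLargeError:
        # Retry: save results to a temp table and read it back,
        # batched by bq-large-fetch-rows.
        return run_via_temp_table(batch_size=large_fetch_rows_setting)


# Fake query functions used only to exercise the control flow.
def small_direct(batch_size):
    return ("non-spool", batch_size)


def too_large_direct(batch_size):
    raise ResultTooLargeError("result set too large")


def spool(batch_size):
    return ("spool", batch_size)


print(fetch_rows(small_direct, spool))
print(fetch_rows(too_large_direct, spool))
```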


How to specify the attributes

You can specify attributes in one of two ways: in a Tableau Datasource
Customization .tdc file, or in the workbook or data source XML.

Specify attributes in a .tdc file

To specify customization attributes during a publish workbook or publish data source
operation from Tableau Desktop, follow these steps:

1. Create an XML file that contains the customization attributes.

2. Save the file with a .tdc extension, for example, BigQueryCustomization.tdc.

3. Save the file to the My Tableau Repository\Datasources folder.

The customization attributes in the .tdc file are read and included by Tableau Desktop when
the data source or workbook is published to Tableau Online or Tableau Server.

Example of a .tdc file with recommended settings for large extracts


<connection-customization class='bigquery' enabled='true' version='8.0'>
  <vendor name='bigquery' />
  <driver name='bigquery' />
  <customizations>
    <customization name='bq-fetch-tasks' value='10' />
    <customization name='bq-large-fetch-rows' value='10000' />
  </customizations>
</connection-customization>
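
If you prefer to generate such a file rather than type it by hand, a minimal Python sketch using the standard library's xml.etree.ElementTree could look like this. The function name build_tdc is hypothetical, and the element and attribute names are copied from the example above.

```python
import xml.etree.ElementTree as ET


def build_tdc(settings: dict) -> str:
    """Build the XML text of a BigQuery .tdc file from name/value settings."""
    # "class" is a Python keyword, so attributes are passed as a dict here.
    root = ET.Element("connection-customization",
                      {"class": "bigquery", "enabled": "true", "version": "8.0"})
    ET.SubElement(root, "vendor", name="bigquery")
    ET.SubElement(root, "driver", name="bigquery")
    customizations = ET.SubElement(root, "customizations")
    for name, value in settings.items():
        ET.SubElement(customizations, "customization", name=name, value=str(value))
    return ET.tostring(root, encoding="unicode")


tdc_text = build_tdc({"bq-fetch-tasks": 10, "bq-large-fetch-rows": 10000})
print(tdc_text)
```

The resulting text can then be saved with a .tdc extension in the My Tableau Repository\Datasources folder, as described in the steps above.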

Manually embed attributes in the XML of the workbook or data source file

You can manually embed customization attributes inside the 'connection' tag in the workbook
.twb file or the data source .tds file. In the following example, the BigQuery customization
attributes are bq-fetch-tasks and bq-large-fetch-rows.

Example of manually embedded attributes

<connection CATALOG='publicdata' EXECCATALOG='some-project-123'
REDIRECT_URI='some-url:2.0:oob'
SCOPE='https://www.googleapis.com/auth/bigquery https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email'
authentication='yes' bq-fetch-tasks='10' bq-large-fetch-rows='10000' bq_schema='samples' class='bigquery'
connection-dialect='google-bql' connection-protocol='native-api' login_title='Sign in to Google BigQuery'
odbc-connect-string-extras='' project='publicdata' schema='samples' server='googleapis.com/bigquery' server-oauth=''
table='wikipedia' username=''>

Troubleshoot Google BigQuery issues


Connections to multiple accounts

When using web authoring or publishing to the web, you cannot use multiple Google
BigQuery accounts in the same workbook. You can have multiple Google BigQuery account
connections in Desktop.


GOOGLE CLOUD SQL


Cloud SQL is a fully managed relational database service that makes it easy
to set up, maintain, manage, and administer your PostgreSQL, MySQL, and
SQL Server databases in the cloud. It offers high performance, scalability,
and convenience. Hosted on Google Cloud Platform, it provides a database
infrastructure for applications running anywhere.

Before you begin


• Name of the server that hosts the database you want to connect to

• User name and password

Driver required

This connector requires a driver to talk to the database. You might already have the required
driver installed on your computer. If the driver is not installed on your computer, Tableau
displays a message in the connection dialog box with a link to the Driver Download page
where you can find driver links and installation instructions.

Make the connection and set up the data source


1. Start Tableau and under Connect, select Google Cloud SQL. For a complete list of
data connections, select More under To a Server. Then do the following:

i. Enter the name of the server that hosts the database.


ii. Enter the user name and password, and then select Sign In.

If Tableau can't make the connection, verify that your credentials are correct. If you still
can't connect, your computer might be having trouble locating the server. Contact your
network administrator or database administrator.

2. On the data source page, do the following:

i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data source figure out
which data source to connect to.
ii. From the Database drop-down list, select a database or use the text box to
search for a database by name.
iii. Under Table, select a table or use the text box to search for a table by name.
iv. Drag the table to the canvas, and then select the sheet tab to start your
analysis.

Use custom SQL to connect to a specific query rather than the entire data source.


Google Cloud SQL data source example

Here is an example of a Google Cloud SQL data source using Tableau Desktop on a
Windows computer.

Sign in on a Mac
If you use Tableau Desktop on a Mac, when you enter the server name to connect, use a fully
qualified domain name, such as mydb.test.ourdomain.lan, instead of a relative domain name,
such as mydb or mydb.test.

Alternatively, you can add the domain to the list of Search Domains for the Mac computer so
that when you connect, you need to provide only the server name. To update the list of
Search Domains, go to System Preferences > Network > Advanced, and then open
the DNS tab.


GOOGLE DRIVE
Google Drive is a free cloud-based storage service that enables users to
store and access files online. The service syncs stored documents, photos and
more across all of the user's devices, including mobile devices, tablets and
PCs. It integrates with the company's other services and systems, including
Google Docs, Gmail, Android, Chrome, YouTube, Google Analytics and
Google+. It competes with Microsoft OneDrive, Apple iCloud, Box, Dropbox
and SugarSync.

The steps below describe how to connect Tableau to Google Drive and set up
the data source.

Before you begin


• Google email address and password

Make the connection and set up the data source


1. Start Tableau and under Connect, select Google Drive. For a complete list of data
connections, select More under To a Server. In the tab Tableau opens in your default
browser, do the following:

i. Sign in to Google Drive using your email or phone, and then select Next to
enter your password. If multiple accounts are listed, select the account that has
the Google Drive data you want to access and enter the password, if you're not
already signed in.
ii. Select Allow so that Tableau Desktop can access your Google Drive data.
iii. Close the browser window when notified to do so.
iv. Select a file from the list or use the text box to search for a file by name, and
then select Connect.


2. On the data source page, do the following:

a. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use a data
source naming convention that helps other users of the data figure out which
data source to connect to.

b. If your Google Drive file has one table, select the sheet tab to start your
analysis.

About .ttde and .hhyper files


You might notice .ttde or .hhyper files when navigating your computer's directory. When you
create a Tableau data source that connects to your data, Tableau creates a .ttde or .hhyper file.
This file, also known as a shadow extract, is used to help improve the speed your data source
loads in Tableau Desktop. Although a shadow extract contains underlying data and other
information similar to the standard Tableau extract, a shadow extract is saved in a different
format and can't be used to recover your data. In certain situations, you might need to delete a
shadow extract from your computer.

Troubleshoot Google Drive issues


Data limit in Google Drive

You can store up to 2 million cells for spreadsheets that are created in Google Drive.

Connections to multiple accounts

When using web authoring or publishing to the web, you cannot use multiple Google Drive
accounts, even when using different connections. You can have multiple Google Drive
account connections in Desktop.

Web authoring with Internet Explorer 11 and Edge

In Internet Explorer 11 and Edge, you cannot access a server using an unsecured
connection (http). Use a secure connection (https) or switch to another browser.


GOOGLE SHEETS
With Google Sheets, you can create and edit spreadsheets directly in your web
browser; no special software is required. Multiple people can work
simultaneously, you can see people's changes as they make them, and every
change is saved automatically.

Before you begin


• Google email address and password

Make the connection and set up the data source


1. Start Tableau and under Connect, select Google Sheets. For a complete list of data
connections, select More under To a Server. In the tab Tableau opens in your default
browser, do the following:

i. Sign in to Google Sheets using your email or phone, and then select Next to
enter your password. If multiple accounts are listed, select the account that has
the Google Sheets data you want to access and enter the password, if you're
not already signed in.
ii. Select Allow so that Tableau Desktop can access your Google Sheets data.
iii. Close the browser window when notified to do so.
iv. Select a Google Sheet from the list or use the text box to search for a Google
Sheet by name or by URL, and then select Connect.


2. On the data source page, do the following:

i. (Optional) Select the default data source name at the top of the page, and then
enter a unique data source name for use in Tableau. For example, use
a data source naming convention that helps other users of the data figure out
which data source to connect to.
ii. If your Google Sheets file has one table, select the sheet tab to start your
analysis.

Select Your Google Sheet dialog box functionality

The Select Your Google Sheet dialog box includes the following functionality:

1. The list of sheets that you can select from includes your private sheets, sheets shared
with you, and the public sheets that you've accessed in the past.
2. If you search by URL and the URL doesn't exist or you don't have access to it, an
error displays.
3. You can select the Name and Last opened by me column names to sort the Google
Sheets, and when you select a sheet you can preview it in the right pane. You cannot
sort by Owned by.

Google Sheets data source example

Here is an example of a Google Sheets data source:


Connect to more data

You can connect to more than one table by using a join.

You can also connect to a named range the same way you connect to a worksheet. The named
range functions as a table in Tableau.

You create named ranges in Google Sheets by highlighting a range of cells and then
selecting Data > Named ranges. When you connect to a named range in Tableau, an icon
appears next to the sheet in the Data Source tab as shown below.

About .ttde and .hhyper files


You might notice .ttde or .hhyper files when navigating your computer's directory. When you
create a Tableau data source that connects to your data, Tableau creates a .ttde or .hhyper file.
This file, also known as a shadow extract, is used to help improve the speed your data source
loads in Tableau Desktop. Although a shadow extract contains underlying data and other
information similar to the standard Tableau extract, a shadow extract is saved in a different
format and can't be used to recover your data.

Troubleshoot Google Sheets issues


Data limit in Google Drive

You can store up to 2 million cells for spreadsheets that are created in or converted to Google
Sheets. Error message: Internal Error - An unexpected error occurred and the operation could
not be completed.

If there are errors in your Google Sheet, such as #DIV/0! or #N/A, Tableau is unable to create
an extract and an error message appears. To resolve this issue, wrap the function
with iferror() and have it return a blank, or any value that's appropriate. For example, the
sheet below includes a #DIV/0! error. The solution is to wrap the calculation in
an iferror() calculation.
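
In the sheet itself, the fix is a formula along the lines of =IFERROR(A2/B2, ""), where the cell references are hypothetical. As a complementary check before connecting, the following Python sketch scans rows exported from a sheet for the two error values named above. The function name and the sample rows are assumptions made for illustration, and the set of error values can be extended as needed.

```python
# Error literals mentioned in this section; extend the set if required.
SHEET_ERRORS = {"#DIV/0!", "#N/A"}


def find_error_cells(rows):
    """Return (row, column, value) for every cell holding a sheet error.

    Row and column numbers are 1-based, matching spreadsheet conventions.
    """
    return [
        (r, c, value)
        for r, row in enumerate(rows, start=1)
        for c, value in enumerate(row, start=1)
        if value in SHEET_ERRORS
    ]


# Hypothetical export: the division in row 2 failed because the divisor is 0.
sample = [
    ["total", "count", "ratio"],
    ["1", "0", "#DIV/0!"],
    ["4", "2", "2"],
]
print(find_error_cells(sample))
```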
