
Know your stakeholders and their goals
Previously, you learned about the four different types of stakeholders you might
encounter as a business intelligence professional:

 Project sponsor: A person who provides support and resources for a project and is
accountable for enabling its success.
 Developer: A person who uses programming languages to create, execute, test, and
troubleshoot software applications. This includes application software developers and
systems software developers.
 Systems analyst: A person who identifies ways to design, implement, and advance
information systems in order to ensure that they help make it possible to achieve
business goals.
 Business stakeholders: Business stakeholders can include one or more of the
following groups of people:
 The executive team: The executive team provides strategic and operational
leadership to the company. They set goals, develop strategy, and make sure that
strategy is executed effectively. The executive team might include vice presidents, the
chief marketing officer, and senior-level professionals who help plan and direct the
company’s work.
 The customer-facing team: The customer-facing team includes anyone in an
organization who has some level of interaction with customers and potential customers.
Typically they compile information, set expectations, and communicate customer
feedback to other parts of the internal organization.
 The data science team: The data science team explores the data that’s already out
there and finds patterns and insights that data scientists can use to uncover future
trends with machine learning. This includes data analysts, data scientists, and data
engineers.

The business

In this scenario, you are a BI professional working with an e-book retail company. The
customer-facing team is interested in using customer data collected from the
company’s e-reading app in order to better understand user reading habits, then
optimize the app accordingly. They have asked you to create a system that will ingest
customer data about purchases and reading time on the app so that the data is
accessible to their analysts. But before you can get started, you need to understand all of your stakeholders' needs and goals so you can help them achieve them.

The stakeholders and their goals


Project sponsor
A project sponsor is the person who provides support and resources for a project and is
accountable for enabling its success. In this case, the project sponsor is the team lead
for the customer-facing team. You know from your discussions with this team that they
are interested in optimizing the e-reading app. In order to do so, they need a system
that will deliver customer data about purchases and reading time to a database for their
analysts to work with. The analysts can then use this data to gain insights about
purchasing habits and reading times in order to find out what genres are most popular,
how long readers are using the app, and how often they are buying new books to make
recommendations to the UI design team.

Developers
The developers are the people who use programming languages to create, execute,
test, and troubleshoot software applications. This includes application software
developers and systems software developers. If your new BI workflow includes software
applications and tools, or you are going to need to create new tools, then you’ll need to
collaborate with the developers. Their goal is to create and manage your business’s
software tools, so they need to understand what tools you plan to use and what you
need those tools to do. For this example, the developers you work with will be the ones
responsible for managing the data captured on the e-reading app.

Systems analyst
The systems analyst identifies ways to design, implement, and advance
information systems in order to ensure that they help make it possible to
achieve business goals. Their primary goal is to understand how the business
is using its computer hardware and software, cloud services, and related
technologies, then they figure out how to improve these tools. So the systems
analyst will be ensuring that the data captured by the developers can be
accessed internally as raw data.

Business stakeholders
In addition to the customer-facing team, who is the project sponsor for this
project, there may also be other business stakeholders for this project such
as project managers, senior-level professionals, and other executives. These
stakeholders are interested in guiding business strategy for the entire
business; their goal is to continue to improve business processes, increase
revenue, and reach company goals. So your work may even reach the chief
technology officer! These are generally people who need bigger-picture
insights that will help them make larger scale decisions as opposed to detail-
oriented insights about software tools or data systems.

Conclusion

Often, BI projects encompass a lot of teams and stakeholders who have


different goals depending on their function within the organization.
Understanding their perspectives is important because it enables you to
consider a variety of use cases for your BI tools. And the more useful your
tools, the more impactful they will be!
Job-search resources for business intelligence professionals
As you continue through this course, you'll encounter resources and best practices to help you land
a job as a business intelligence professional or advance your career. This reading provides you with
some resources you can explore and bookmark to use on your job search.

Job search sites


There are a lot of job search sites, and it can be difficult to find ones that are useful in your specific
field. Here are a few resources designed for BI professionals:

 Built In: Built In is an online community specifically designed to connect startups and tech
companies with potential employees. This is an excellent resource for finding jobs specifically in the
tech industry, including BI. Built In also has hubs in some U.S. cities and resources for finding
remote positions.
 Crunchboard: Crunchboard is a job board hosted by TechCrunch. TechCrunch is also the creator
of CrunchBase, an open database with information about start-up companies in the tech industry.
This is another valuable resource for people looking for jobs specifically in tech.
 Dice: Dice is a career marketplace specifically focused on tech professionals in the United States. It
provides insights and information for people on the job search.
 DiversityJobs: DiversityJobs is a resource that hosts a job board, career and resume resources,
and community events intended to help underrepresented job seekers with employers currently
hiring. This resource is not tech specific and encompasses a lot of industries.
 Diversify Tech: Diversify Tech is a newsletter that is designed to connect underrepresented
people with opportunities in the tech industry, including jobs. Their job board includes positions from
entry-level to senior positions with companies committed to diversity and inclusion in the field.
 LinkedIn: You’ve learned about LinkedIn as a great way to start networking and building your
online presence as a BI professional. LinkedIn also has a job board with postings from potential
employers. It has job postings from across the world in all sorts of industries, so you’ll need to
commit some time to finding the right postings for you, but this is a great place to begin your job
search.
You can also search for more specific job boards depending on your needs as a job seeker and your
career interests!

Interview and resume resources


In addition to applying to jobs, you will want to make sure your interview skills and resume are
polished and ready to go. If you completed the Google Data Analytics Career Certificate, you already
learned a lot about these things. Feel free to review these resources anytime. Or, if you are new to
Google Career Certificates, check them out now! They provide useful interview strategies and clear
steps for developing a winning resume!

The many benefits of mentorships


Exploring job boards and online resources is only one part of your job-search process; it is just as
important to connect with other professionals in your field, build your network, and join in the BI
community. A great way to accomplish these goals is by building a relationship with a mentor. In this
reading, you will learn more about mentors, the benefits of mentorship, and how to connect with
potential mentors.

Considering mentorship
Mentors are professionals who share knowledge, skills, and experiences to help you grow and
develop. These people can come in many different forms at different points in your career. They can
be advisors, sounding boards, honest critics, resources, or all of those things. You can even have
multiple mentors to gain more diverse perspectives!

There are a few things to consider along the way:

 Decide what you are searching for in a mentor. Think about your strengths and
weaknesses, what challenges you have encountered, and how you would like to grow as a BI
professional. Share these ideas with potential mentors who might have had similar experiences and
have guidance to share.
 Consider common ground. Often you can find great mentorships with people who share
interests and backgrounds with you. This could include someone who had a similar career path or
even someone from your hometown.
 Respect their time. Often, mentors are busy! Make sure the person you are asking to mentor
you has time to support your growth. It’s also important for you to put in the effort necessary to
maintain the relationship and stay connected with them.
Note that mentors don't have to be directly related to BI. It depends on what you want to focus on
with each individual. Mentors can be friends of friends, more experienced coworkers, former
colleagues, or even teammates. For example, if you find a family friend who has a lot of experience
in their own non-BI field, but shares a similar background as you and understands what you're trying
to achieve, that person may become an invaluable mentor to you. Or, you might fortuitously meet
someone at a casual work outing with whom you develop an instant rapport. Again, even if they are
not in the BI field, they may be able to connect you to someone in their company or network who is
in BI.

How to build the relationship


Once you have considered what you’re looking for in a mentor and found someone with time and
experience to share, you’ll need to build that relationship. Sometimes, the connection happens
naturally, but usually you need to formally ask them to mentor you.

One great way to reach out is with a friendly email or a message on a professional networking
website. Describe your career goals, explain how you think those goals align with their own
experiences, and talk about something you admire about them professionally. Then you can suggest
a coffee chat, virtual meetup, or email exchange as a first step.

Be sure to check in with yourself. It’s important that you feel like it is a natural fit and that you’re
getting the mentorship you need. Mentor-mentee relationships are equal partnerships, so the more
honest you are with them, the more they can help you. And remember to thank them for their time
and effort!

As you get in touch with potential mentors, you might feel nervous about being a bother or taking up
too much of their time. But mentorship is meaningful for mentors too. They often genuinely want to
help you succeed and are invested in your growth. Your success brings them joy! Many mentors
enjoy recounting their experiences and sharing their successes with you, as well. And mentors often
learn a lot from their mentees. Both sides of the mentoring relationship are meaningful!

Resources
There are a lot of great resources you can use to help you connect with potential mentors. Here are
just a few:

 Mentoring websites such as Score.org, MicroMentor.org, or the Mentorship app allow you to
search for mentors with specific credentials that match your needs. You can then arrange dedicated
times to meet up or talk on the phone.
 Meetups, or online meetings that are usually local to your geography. Enter a search for “business
intelligence meetups near me” to check out what results you get. There is usually a posted schedule
for upcoming meetings so you can attend virtually. Find out more information about meetups
happening around the world.
 Platforms including LinkedIn and Twitter. Use a search on either platform to find data science or
data analysis hashtags to follow. Post your own questions or articles to generate responses and
build connections that way.
 Webinars may showcase a panel of speakers and are usually recorded for convenient access and
playback. You can see who is on a webinar panel and follow them too. Plus, a lot of webinars are
free. One interesting pick is the Tableau on Tableau webinar series. Find out how Tableau has used
Tableau in its internal departments.
 Conferences present innovative ideas and topics. The cost varies, and some are pricey. But
many offer discounts to students, and some conferences like Women in Analytics aim to increase
the number of under-represented groups in the field.
 Associations or societies gather members to promote a field such as business intelligence.
Many memberships are free. The Digital Analytics Association is one example. The Cape Fear
Community College Library also has a list of professional associations for analytics, business
intelligence, and business analysis.
 User communities and summits offer events for users of professional tools; this is a chance
to learn from the best. Have you seen the Tableau community?
 Nonprofit organizations that promote the ethical use of data science and might offer events
for the professional advancement of their members. The Data Science Association is one example.
Finding and connecting with a mentor is a great way to build your network, access career
opportunities, and learn from someone who has already experienced some of the challenges you’re
facing in your career. Whether your mentor is a senior coworker, someone you connect with on
LinkedIn, or someone from home on a similar career path, mentorship can bring you great benefits
as a BI professional.

Applications software developer: A person who designs computer or mobile applications, generally for consumers

Business intelligence monitoring: Building and using hardware and software tools to easily and rapidly analyze data and enable stakeholders to make impactful business decisions

Deliverable: Any product, service, or result that must be achieved in order to complete a project
Developer: A person who uses programming languages to create, execute, test, and
troubleshoot software applications

Metric: A single, quantifiable data point that is used to evaluate performance

Project sponsor: A person who has overall accountability for a project and
establishes the criteria for its success

Strategy: A plan for achieving a goal or arriving at a desired future state

Systems analyst: A person who identifies ways to design, implement, and advance
information systems in order to ensure that they help make it possible to achieve
business goals

Systems software developer: A person who develops applications and programs for the backend processing systems used in organizations

Tactic: A method used to enable an accomplishment

Terms and their definitions from previous modules

A
Application programming interface (API): A set of functions and procedures
that integrate computer programs, forming a connection that enables them to
communicate

B
Business intelligence (BI): Automating processes and information channels in
order to transform relevant data into actionable insights that are easily available to
decision-makers

Business intelligence governance: A process for defining and implementing business intelligence systems and frameworks within an organization

Business intelligence stages: The sequence of stages that determine both BI business value and organizational data maturity, which are capture, analyze, and monitor

Business intelligence strategy: The management of the people, processes, and tools used in the business intelligence process

D
Data analysts: People who collect, transform, and organize data
Data governance professionals: People who are responsible for the formal
management of an organization’s data assets

Data maturity: The extent to which an organization is able to effectively use its data
in order to extract actionable insights

Data model: A tool for organizing data elements and how they relate to one another

Data pipeline: A series of processes that transports data from different sources to
their final destination for storage and analysis

Data warehousing specialists: People who develop processes and procedures to effectively store and organize data

E
ETL (extract, transform, and load): A type of data pipeline that enables data to
be gathered from source systems, converted into a useful format, and brought into a
data warehouse or other unified destination system

I
Information technology professionals: People who test, install, repair,
upgrade, and maintain hardware and software solutions

Iteration: Repeating a procedure over and over again in order to keep getting closer
to the desired result

K
Key performance indicator (KPI): A quantifiable value, closely linked to
business strategy, which is used to track progress toward a goal

P
Portfolio: A collection of materials that can be shared with potential employers

Project manager: A person who handles a project's day-to-day steps, scope, schedule, budget, and resources

Database comparison checklist


OLAP versus OLTP

OLAP
Description: Online Analytical Processing (OLAP) systems are databases that have been primarily optimized for analysis.
Uses:
 Provide user access to data from a variety of source systems
 Used by BI and other data professionals to support decision-making processes
 Analyze data from multiple databases
 Draw actionable insights from data delivered to reporting tables

OLTP
Description: Online Transaction Processing (OLTP) systems are databases that have been optimized for data processing instead of analysis.
Uses:
 Store transaction data
 Used by customer-facing employees or customer self-service applications
 Read, write, and update single rows of data
 Act as source systems that data pipelines can pull from for analysis

Row-based versus columnar

Row-based
Description: Row-based databases are organized by rows.
Uses:
 Traditional, easy-to-write database organization typically used in OLTP systems
 Writes data very quickly
 Stores all of a row's values together
 Easily optimized with indexing

Columnar
Description: Columnar databases are organized by columns instead of rows.
Uses:
 Newer form of database organization, typically used to support OLAP systems
 Reads data more quickly and only pulls the necessary data for analysis
 Stores multiple rows' columns together

Distributed versus single-homed

Distributed
Description: Distributed databases are collections of data systems distributed across multiple physical locations.
Uses:
 Easily expanded to address increasing or larger-scale business needs
 Accessed from different networks
 Easier to secure than a single-homed database system

Single-homed
Description: Single-homed databases are databases where all of the data is stored in the same physical location.
Uses:
 Data stored in a single location is easier to access and coordinate across teams
 Cuts down on data redundancy
 Cheaper to maintain than larger, more complex systems

Separated storage and compute versus combined

Separated storage and compute
Description: Separated storage and computing systems are databases where less relevant data is stored remotely, and relevant data is stored locally for analysis.
Uses:
 Run analytical queries more efficiently because the system only needs to process the most relevant data
 Scale computation resources and storage systems separately based on your organization's custom needs

Combined storage and compute
Description: Combined systems are database systems that store and analyze data in the same place.
Uses:
 Traditional setup that allows users to access all possible data at once
 Storage and computation resources are linked, so resource management is straightforward

Four key elements of database schemas


Whether you are creating a new database model or exploring a system in place already, it is
important to ensure that all elements exist in the schema. The database schema enables you to
validate incoming data being delivered to your destination database to prevent errors and ensure the
data is immediately useful to users.

Here is a checklist of common elements a database schema should include:

 The relevant data: The schema describes how the data is modeled and shaped within the
database and must encompass all of the data being described.
 Names and data types for each column: Include names and data types for each column
in each table within the database.
 Consistent formatting: Ensure consistent formatting across all data entries. Every entry is an
instance of the schema, so it needs to be consistent.
 Unique keys: The schema must use unique keys for each entry within the database. These keys
build connections between the tables and enable users to combine relevant data from across the
entire database.
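As an illustration, here is a minimal sketch in Python of how this checklist can be made machine-checkable. The orders table, its column names, and its data types are hypothetical examples, not part of any scenario in this course.

ORDERS_SCHEMA = {
    "columns": {
        "order_id": int,      # unique key
        "customer_id": int,
        "order_date": str,    # consistent formatting assumed: YYYY-MM-DD
        "amount": float,
    },
    "unique_key": "order_id",
}

def validate_record(record, schema):
    """Return a list of problems found in one incoming record."""
    problems = []
    expected = schema["columns"]
    missing = set(expected) - set(record)
    extra = set(record) - set(expected)
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected columns: {sorted(extra)}")
    # Each value must match the declared data type for its column.
    for name, expected_type in expected.items():
        if name in record and not isinstance(record[name], expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
    return problems

# This record stores the amount as text, so the check flags it before loading.
print(validate_record(
    {"order_id": 1, "customer_id": 42, "order_date": "2024-01-15", "amount": "19.99"},
    ORDERS_SCHEMA,
))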
Key takeaways
As you receive more data or business needs change, databases and schemas may also need to
change. Database optimization is an iterative process, which means you may need to check the
schema multiple times throughout the database’s useful life. Use this checklist to help you ensure
that your database schema remains functional.

Seven elements of quality testing


In this part of the course, you have been learning about the importance of quality testing in your ETL
system. This is the process of checking data for defects in order to prevent system failures. Ideally,
your pipeline should have checkpoints built-in that identify any defects before they arrive in the target
database system. These checkpoints ensure that the data coming in is already clean and useful! In
this reading, you will be given a checklist for what your ETL quality testing should be taking into
account.

When considering what checks you need to ensure the quality of your data as it moves through the
pipeline, there are seven elements you should consider:

 Completeness: Does the data contain all of the desired components or measures?
 Consistency: Is the data compatible and in agreement across all systems?
 Conformity: Does the data fit the required destination format?
 Accuracy: Does the data conform to the actual entity being measured or described?
 Redundancy: Is only the necessary data being moved, transformed, and stored for use?
 Timeliness: Is the data current?
 Integrity: Is the data accurate, complete, consistent, and trustworthy? (Integrity is influenced by
the previously mentioned qualities.)

Common issues
There are also some common issues you can protect against within your system to ensure the
incoming data doesn’t cause errors or other large-scale problems in your database system:

 Check data mapping: Does the data from the source match the data in the target database?
 Check for inconsistencies: Are there inconsistencies between the source system and the
target system?
 Check for inaccurate data: Is the data correct and does it reflect the actual entity being
measured?
 Check for duplicate data: Does this data already exist within the target system?
To address these issues and ensure your data meets all seven elements of quality testing, you can
build intermediate steps into your pipeline that check the loaded data against known parameters. For
example, to ensure the timeliness of the data, you can add a checkpoint that determines if that data
matches the current date; if the incoming data fails this check, there’s an issue upstream that needs
to be flagged. Considering these checks in your design process will ensure your pipeline delivers
quality data and needs less maintenance over time.
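Here is a minimal sketch of such a checkpoint in Python, covering the timeliness and redundancy checks described above. The field names (load_date, id) and the idea of tracking existing IDs in the target system are illustrative assumptions.

from datetime import date

def quality_checkpoint(rows, existing_ids):
    """Split incoming rows into clean rows and flagged rows (field names are illustrative)."""
    clean, flagged = [], []
    today = date.today().isoformat()
    for row in rows:
        if row.get("load_date") != today:
            # Timeliness: the data must be current.
            flagged.append((row, "timeliness: load_date does not match the current date"))
        elif row["id"] in existing_ids:
            # Redundancy: only necessary data should be moved and stored.
            flagged.append((row, "redundancy: id already exists in the target system"))
        else:
            clean.append(row)
    return clean, flagged

rows = [
    {"id": 101, "load_date": date.today().isoformat(), "value": 3.2},
    {"id": 102, "load_date": "2023-01-01", "value": 1.8},  # stale, flagged upstream
]
clean, flagged = quality_checkpoint(rows, existing_ids={102})
print(len(clean), "clean;", len(flagged), "flagged")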

Key takeaways
One of the great things about BI is that it gives us the tools to automate certain processes that help
save time and resources during data analysis. Building quality checks into your ETL pipeline system
is one of the ways you can do this! Making sure you are already considering the completeness,
consistency, conformity, accuracy, redundancy, integrity, and timeliness of the data as it moves from
one system to another means you and your team don’t have to check the data manually later on.

Schema-validation checklist
In this course, you have been learning about the tools business intelligence professionals use to
ensure conformity from source to destination: schema validation, data dictionaries, and data
lineages. In another reading, you already had the opportunity to explore data dictionaries and
lineages. In this reading, you are going to get a schema validation checklist you can use to guide
your own validation process.

Schema validation is a process used to ensure that the source system data schema matches the
target database data schema. This is important because if the schemas don’t align, it can cause
system failures that are hard to fix. Building schema validation into your workflow is important to
prevent these issues.
Common issues for schema validation
 The keys are still valid: Primary and foreign keys build relationships between tables in
relational databases. These keys should continue to function after you have moved data from one
system into another.
 The table relationships have been preserved: The keys preserve the relationships used to connect the tables. It's important to make sure that these relationships are preserved, or that they are transformed to match the target schema, so the keys can still be used to connect tables.
 The conventions are consistent: The conventions for incoming data must be consistent with the target database's schema. Data from outside sources might use different conventions for naming columns in tables, so it's important to align these before they're added to the target system.
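A minimal sketch of what such a check could look like in Python, assuming the schemas are available as simple column-to-type mappings; the column names, the data types, and the snake_case naming convention are assumptions for illustration.

# Hypothetical schemas: column name -> data type.
source_schema = {"CustomerID": "INTEGER", "SignupDate": "DATE", "Plan_Type": "VARCHAR"}
target_schema = {"customer_id": "INTEGER", "signup_date": "DATE", "plan_type": "VARCHAR"}

def to_snake_case(name):
    """Align the source's mixed naming conventions with the target's snake_case convention."""
    out = []
    for i, ch in enumerate(name):
        if ch.isupper() and i > 0 and name[i - 1] != "_" and not name[i - 1].isupper():
            out.append("_")
        out.append(ch.lower())
    return "".join(out)

def validate_schemas(source, target):
    """Report target columns with no matching source column or a mismatched data type."""
    issues = []
    normalized = {to_snake_case(k): v for k, v in source.items()}
    for column, dtype in target.items():
        if column not in normalized:
            issues.append(f"target column '{column}' has no matching source column")
        elif normalized[column] != dtype:
            issues.append(f"'{column}': source type {normalized[column]} != target type {dtype}")
    return issues

print(validate_schemas(source_schema, target_schema))  # [] means the schemas align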
Using data dictionaries and lineages
You’ve already learned quite a bit about data dictionaries and lineages. As a refresher, a data
dictionary is a collection of information that describes the content, format, and structure of data
objects within a database, as well as their relationships. And a data lineage is the process of
identifying the origin of data, where it has moved throughout the system, and how it has transformed
over time. These tools are useful because they can help you identify what standards incoming data
should adhere to and track down any errors to the source.

The data dictionaries and lineages reading provided some additional information if more review is
needed.

Key takeaways
Schema validation is a useful check for ensuring that the data moving from source systems to your
target database is consistent and won’t cause any errors. Building in checks to make sure that the
keys are still valid, the table relationships have been preserved, and the conventions are consistent
before data is delivered will save you time and energy trying to fix these errors later on.

Business rules
As you have been learning, a business rule is a statement that creates a restriction on specific parts
of a database. These rules are developed according to the way an organization uses data. They also create efficiencies, allow for important checks and balances, and sometimes exemplify a business's values in action. For instance, if a company values cross-functional collaboration, there may be a rule requiring representatives from at least two teams to sign off on the completion of a dataset. Business rules affect what data is collected and stored, how relationships are defined, what kind of
information the database provides, and the security of the data. In this reading, you will learn more
about the development of business rules and see an example of business rules being implemented
in a database system.

Imposing business rules


Business rules are highly dependent on the organization and their data needs. This means business
rules are different for every organization. This is one of the reasons why verifying business rules is
so important; these checks help ensure that the database is actually doing the job you need it to do.
But before you can verify business rules, you have to implement them.

For example, let’s say the company you work for has a database that manages purchase order
requests entered by employees. Purchase orders over $1,000 need manager approval. In
order to automate this process, you can impose a ruleset on the database that automatically delivers
requests over $1,000 to a reporting table pending manager approval. Other business rules that may
apply in this example are: prices must be numeric values (data type should be integer); or for a
request to exist, a reason is mandatory (table field may not be null).

In order to fulfill this business requirement, there are three rules at play in this system:

1. Order requests under $1,000 are automatically delivered to the approved product order requests
table
2. Requests over $1,000 are automatically delivered to the requests pending approval table
3. Approved requests are automatically delivered to the approved product order requests table
These rules inherently affect the shape of this database system to cater to the needs of this
particular organization.
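To make the example concrete, here is a minimal Python sketch of these rules. The two in-memory lists stand in for the approved and pending reporting tables, and the manager_approved field is an assumption used to represent rule 3.

APPROVED_TABLE = []   # stand-in for the approved product order requests table
PENDING_TABLE = []    # stand-in for the requests pending approval table

def route_purchase_request(request):
    """Apply the business rules described above to a single purchase order request."""
    if not isinstance(request.get("amount"), (int, float)):
        raise ValueError("amount must be a numeric value")        # prices must be numeric
    if not request.get("reason"):
        raise ValueError("a reason is mandatory")                 # field may not be null

    if request["amount"] < 1000:                                  # Rule 1
        APPROVED_TABLE.append(request)
        return "approved"
    if request.get("manager_approved"):                           # Rule 3 (assumed flag)
        APPROVED_TABLE.append(request)
        return "approved"
    PENDING_TABLE.append(request)                                 # Rule 2
    return "pending approval"

print(route_purchase_request({"amount": 250, "reason": "office supplies"}))   # approved
print(route_purchase_request({"amount": 4800, "reason": "new laptops"}))      # pending approval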
Verifying business rules
Once the business rules have been implemented, it’s important to continue to verify that they are
functioning correctly and that data being imported into the target systems follows these rules. These
checks are important because they test that the system is doing the job it needs to, which in this
case is delivering product order requests that need approval to the right stakeholders.

Key takeaways
Business rules determine what data is collected and stored, how relationships are defined, what kind
of information the database provides, and the security of the data. These rules heavily influence how
a database is designed and how it functions after it has been set up. Understanding business rules
and why they are important is useful as a BI professional because this can help you understand how
existing database systems are functioning, design new systems according to business needs, and
maintain them to be useful in the future.

Database performance testing in an ETL context
In previous lessons, you learned about database optimization as part of the database building
process. But it’s also an important consideration when it comes to ensuring your ETL and pipeline
processes are functioning properly. In this reading, you are going to return to database performance
testing in a new context: ETL processes.

How database performance affects your pipeline


Database performance is the rate that a database system is able to provide information to users.
Optimizing how quickly the database can perform tasks for users helps your team get what they
need from the system and draw insights from the data that much faster.

Your database systems are a key part of your ETL pipeline– these include where the data in your
pipeline comes from and where it goes. The ETL or pipeline is a user itself, making requests of the
database that it has to fulfill while managing the load of other users and transactions. So database
performance is not just key to making sure the database itself can manage your organization’s
needs– it’s also important for the automated BI tools you set up to interact with the database.

Key factors in performance testing


Earlier, you learned about some database performance considerations you can check for when a
database starts slowing down. Here is a quick checklist of those considerations:

 Queries need to be optimized
 The database needs to be fully indexed
 Data should be defragmented
 There must be enough CPU and memory for the system to process requests
You also learned about the five factors of database performance: workload, throughput, resources,
optimization, and contention. These factors all influence how well a database is performing, and it
can be part of a BI professional’s job to monitor these factors and make improvements to the system
as needed.

These general performance tests are really important– that’s how you know your database can
handle data requests for your organization without any problems! But when it comes to database
performance testing while considering your ETL process, there is another important check you
should make: testing the table, column, row counts, and Query Execution Plan.

Testing the row and table counts allows you to make sure that the data count matches between the
target and source databases. If there are any mismatches, that could mean that there is a potential
bug within the ETL system. A bug in the system could cause crashes or errors in the data, so
checking the number of tables, columns, and rows of the data in the destination database against
the source data can be a useful way to prevent that.
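Here is a minimal sketch of such a count comparison in Python, using two in-memory SQLite databases to stand in for the source and target systems; the sales table is a made-up example.

import sqlite3

def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables}

def compare_counts(source, target):
    """List every table whose row count differs between source and target."""
    src, tgt = table_row_counts(source), table_row_counts(target)
    return [
        f"{table}: source={src.get(table)} rows, target={tgt.get(table)} rows"
        for table in sorted(set(src) | set(tgt))
        if src.get(table) != tgt.get(table)
    ]

# The target is missing one row, so the check surfaces a potential ETL bug.
source, target = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
source.executescript("CREATE TABLE sales (id INTEGER); INSERT INTO sales VALUES (1), (2);")
target.executescript("CREATE TABLE sales (id INTEGER); INSERT INTO sales VALUES (1);")
print(compare_counts(source, target))  # ['sales: source=2 rows, target=1 rows']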

Key takeaways
As a BI professional, you need to know that your database can meet your organization’s needs.
Performance testing is a key part of the process. Not only is performance testing useful during
database building itself, but it’s also important for ensuring that your pipelines are working properly
as well. Remembering to include performance testing as a way to check your pipelines will help you
maintain the automated processes that make data accessible to users!

Defend against known issues


In this reading, you’ll learn about a defensive check applied to a data pipeline. Defensive checks
help you prevent problems in your data pipeline. They are similar to performance checks but focus
on other kinds of problems. The following scenario will provide an example of how you can
implement different kinds of defensive checks on a data pipeline.

Scenario
Arsha, a Business Intelligence Analyst at a telecommunications company, built a data pipeline that
merges data from six sources into a single database. While building her pipeline, she incorporated
several defensive checks that ensured that the data was moved and transformed properly.

Her data pipeline used the following source systems:

1. Customer details

2. Mobile contracts

3. Internet and cable contracts

4. Device tracking and enablement

5. Billing

6. Accounting

All of these datasets had to be harmonized and merged into one target system for business
intelligence analytics. This process required several layers of data harmonization, validation,
reconciliation, and error handling.

Pipeline layers
Pipelines can have many different stages of processing. These stages, or layers, help ensure that
the data is collected, aggregated, transformed, and staged in the most effective and efficient way.
For example, it’s important to make sure you have all the data you need in one place before you
start cleaning it to ensure that you don’t miss anything. There are usually four layers to this process:
staging, harmonization, validation, and reconciliation. After these four layers, the data is brought into
its target database and an error handling report summarizes each step of the process.

Staging layer
First, the original data is brought from the source systems and stored in the staging layer. In this
layer, Arsha ran the following defensive checks:

 Compared the number of records received and stored

 Compared rows to identify if extra records were created or records were lost

 Checked important fields, such as amounts, dates, and IDs

Arsha moved the mismatched records to the error handling report. She included each unconverted
source record, the date and time of its first processing, its last retry date and time, the layer where
the error happened, and a message describing the error. By collecting these records, Arsha was
able to find and fix the origin of the problems. She marked all of the records that moved to the next
layer as “processed.”
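A minimal Python sketch of these staging-layer checks, assuming each source record arrives as a dictionary. The required fields and the structure of the error handling entries follow the description above but are otherwise illustrative.

from datetime import datetime

REQUIRED_FIELDS = ("id", "date", "amount")   # illustrative "important fields"
error_handling = []                          # stand-in for the error handling report

def stage_records(received, stored_count):
    """Staging-layer defensive checks: record counts and important fields."""
    if len(received) != stored_count:
        print(f"warning: received {len(received)} records but stored {stored_count}")
    processed = []
    for record in received:
        missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
        if missing:
            error_handling.append({
                "source_record": record,
                "first_processed_at": datetime.now().isoformat(),
                "last_retry_at": None,
                "layer": "staging",
                "error": f"missing important fields: {missing}",
            })
        else:
            record["status"] = "processed"   # marked for the next layer
            processed.append(record)
    return processed

ok = stage_records([{"id": 1, "date": "2024-06-01", "amount": 19.99},
                    {"id": 2, "date": None, "amount": 5.00}], stored_count=2)
print(len(ok), "processed;", len(error_handling), "sent to error handling")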

Harmonization layer
The harmonization layer is where data normalization routines and record enrichment are
performed. This ensures that data formatting is consistent across all the sources. To harmonize the
data, Arsha ran the following defensive checks:

 Standardized the date format

 Standardized the currency

 Standardized uppercase and lowercase stylization

 Formatted IDs with leading zeros

 Split date values to store the year, month, and day in separate columns
 Applied conversion and priority rules from the source systems

When a record couldn’t be harmonized, she moved it to Error Handling. She marked all of the
records that moved to the next layer as “processed.”
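A minimal sketch of a few of these harmonization routines in Python; the field names and the assumption that the source uses MM/DD/YYYY dates are illustrative.

from datetime import datetime

def harmonize(record):
    """Apply a few of the harmonization routines listed above to one record."""
    out = dict(record)
    # Standardize the date format (assume the source uses MM/DD/YYYY) and split it
    # into separate year, month, and day columns.
    parsed = datetime.strptime(out["contract_date"], "%m/%d/%Y")
    out["contract_date"] = parsed.strftime("%Y-%m-%d")
    out["year"], out["month"], out["day"] = parsed.year, parsed.month, parsed.day
    # Standardize uppercase and lowercase stylization.
    out["customer_name"] = out["customer_name"].title()
    # Format the ID with leading zeros.
    out["customer_id"] = str(out["customer_id"]).zfill(8)
    return out

print(harmonize({"customer_id": 4521, "customer_name": "jordan ellis",
                 "contract_date": "06/01/2024"}))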

Validations layer
The validations layer is where business rules are validated. As a reminder, a business rule
is a statement that creates a restriction on specific parts of a database. These rules are developed
according to the way an organization uses data. Arsha ran the following defensive checks:

 Ensured that values in the “department” column were not null, since “department” is a crucial
dimension

 Ensured that values in the “service type” column were within the authorized values to be processed

 Ensured that each billing record corresponded to a valid processed contract

Again, when a record couldn't be validated, she moved it to error handling. She marked all the
records that moved to the next layer as “processed.”
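A minimal sketch of these business-rule validations in Python; the authorized service types, the set of processed contract IDs, and the field names are assumptions.

AUTHORIZED_SERVICE_TYPES = {"mobile", "internet", "cable"}   # assumed authorized values
PROCESSED_CONTRACT_IDS = {"C-001", "C-002"}                  # assumed valid processed contracts

def validate_billing_record(record):
    """Run the validations-layer business rules against one billing record."""
    errors = []
    if not record.get("department"):
        errors.append("department is null")                  # crucial dimension may not be null
    if record.get("service_type") not in AUTHORIZED_SERVICE_TYPES:
        errors.append(f"service_type '{record.get('service_type')}' is not authorized")
    if record.get("contract_id") not in PROCESSED_CONTRACT_IDS:
        errors.append("billing record does not correspond to a valid processed contract")
    return errors

print(validate_billing_record(
    {"department": "consumer", "service_type": "satellite", "contract_id": "C-001"}))
# -> ["service_type 'satellite' is not authorized"]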

Reconciliation layer
The reconciliation layer is where duplicate or illegitimate records are found. Here, Arsha ran
defensive checks to find the following types of records:

 Slow-changing dimensions

 Historic records

 Aggregations

As with the previous layers, Arsha moved the records that didn't pass the reconciliation rules to Error
Handling. After this round of defensive checks, she brought the processed records into the BI and
Analytics database (OLAP).

Error handling reporting and analysis


After completing the pipeline and running the defensive checks, Arsha made an error handling report
to summarize the process. The report listed the number of records from the source systems, as well
as how many records were marked as errors or ignored in each layer. The end of the report listed
the final number of processed records.
Key takeaways
Defensive checks are what ensure that a data pipeline properly handles its data. Defensive checks
are an essential part of preserving data integrity. Once the staging, harmonization, validations, and
reconciliation layers have been checked, the data brought into the target database is ready to be
used in a visualization.

Glossary terms from week 3


Accuracy: An element of quality testing used to confirm that data conforms to the actual entity
being measured or described

Business rule: A statement that creates a restriction on specific parts of a database

Completeness: An element of quality testing used to confirm that data contains all desired
components or measures

Conformity: An element of quality testing used to confirm that data fits the required destination
format

Consistency: An element of quality testing used to confirm that data is compatible and in
agreement across all systems

Data dictionary: A collection of information that describes the content, format, and structure of
data objects within a database, as well as their relationships

Data lineage: The process of identifying the origin of data, where it has moved throughout the
system, and how it has transformed over time

Data mapping: The process of matching fields from one data source to another

Integrity: An element of quality testing used to confirm that data is accurate, complete, consistent,
and trustworthy throughout its life cycle

Quality testing: The process of checking data for defects in order to prevent system failures; it
involves the seven validation elements of completeness, consistency, conformity, accuracy,
redundancy, integrity, and timeliness
Redundancy: An element of quality testing used to confirm that no more data than necessary is
moved, transformed, or stored

Schema validation: A process to ensure that the source system data schema matches the
target database data schema

Timeliness: An element of quality testing used to confirm that data is current

Terms and definitions from previous weeks
A
Application programming interface (API): A set of functions and procedures that
integrate computer programs, forming a connection that enables them to communicate

Applications software developer: A person who designs computer or mobile applications, generally for consumers

Attribute: In a dimensional model, a characteristic or quality used to describe a dimension

B
Business intelligence (BI): Automating processes and information channels in order to
transform relevant data into actionable insights that are easily available to decision-makers

Business intelligence governance: A process for defining and implementing business intelligence systems and frameworks within an organization

Business intelligence monitoring: Building and using hardware and software tools to easily
and rapidly analyze data and enable stakeholders to make impactful business decisions

Business intelligence stages: The sequence of stages that determine both BI business value
and organizational data maturity, which are capture, analyze, and monitor

Business intelligence strategy: The management of the people, processes, and tools used
in the business intelligence process

C
Columnar database: A database organized by columns instead of rows

Combined systems: Database systems that store and analyze data in the same place

Compiled programming language: A programming language that compiles coded instructions that are executed directly by the target machine

Contention: When two or more components attempt to use a single resource in a conflicting way
D
Data analysts: People who collect, transform, and organize data

Data availability: The degree or extent to which timely and relevant information is readily
accessible and able to be put to use

Data governance professionals: People who are responsible for the formal management of
an organization’s data assets

Data integrity: The accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle

Data lake: A database system that stores large amounts of raw data in its original format until it’s
needed

Data mart: A subject-oriented database that can be a subset of a larger data warehouse

Data maturity: The extent to which an organization is able to effectively use its data in order to
extract actionable insights

Data model: A tool for organizing data elements and how they relate to one another

Data partitioning: The process of dividing a database into distinct, logical parts in order to
improve query processing and increase manageability

Data pipeline: A series of processes that transports data from different sources to their final
destination for storage and analysis

Data visibility: The degree or extent to which information can be identified, monitored, and
integrated from disparate internal and external sources

Data warehouse: A specific type of database that consolidates data from multiple source
systems for data consistency, accuracy, and efficient access

Data warehousing specialists: People who develop processes and procedures to effectively
store and organize data

Database migration: Moving data from one source platform to another target database

Database performance: A measure of the workload that can be processed by a database, as well as associated costs

Deliverable: Any product, service, or result that must be achieved in order to complete a project

Developer: A person who uses programming languages to create, execute, test, and troubleshoot
software applications

Dimension (data modeling): A piece of information that provides more detail and context
regarding a fact
Dimension table: The table where the attributes of the dimensions of a fact are stored

Design pattern: A solution that uses relevant measures and facts to create a model in support of
business needs

Dimensional model: A type of relational model that has been optimized to quickly retrieve data
from a data warehouse

Distributed database: A collection of data systems distributed across multiple physical locations

E
ELT (extract, load, and transform): A type of data pipeline that enables data to be gathered
from data lakes, loaded into a unified destination system, and transformed into a useful format

ETL (extract, transform, and load): A type of data pipeline that enables data to be gathered
from source systems, converted into a useful format, and brought into a data warehouse or other
unified destination system

Experiential learning: Understanding through doing

F
Fact: In a dimensional model, a measurement or metric

Fact table: A table that contains measurements or metrics related to a particular event

Foreign key: A field within a database table that is a primary key in another table (Refer to
primary key)

Fragmented data: Data that is broken up into many pieces that are not stored together, often as
a result of using the data frequently or creating, deleting, or modifying files

Functional programming language: A programming language modeled around functions

G
Google DataFlow: A serverless data-processing service that reads data from the source,
transforms it, and writes it in the destination location

I
Index: An organizational tag used to quickly locate data within a database system

Information technology professionals: People who test, install, repair, upgrade, and
maintain hardware and software solutions

Interpreted programming language: A programming language that uses an interpreter, typically another program, to read and execute coded instructions

Iteration: Repeating a procedure over and over again in order to keep getting closer to the
desired result
K
Key performance indicator (KPI): A quantifiable value, closely linked to business strategy,
which is used to track progress toward a goal

L
Logical data modeling: Representing different tables in the physical data model

M
Metric: A single, quantifiable data point that is used to evaluate performance

O
Object-oriented programming language: A programming language modeled around data
objects

OLAP (Online Analytical Processing) system: A tool that has been optimized for analysis
in addition to processing and can analyze data from multiple databases

OLTP (Online Transaction Processing) database: A type of database that has been
optimized for data processing instead of analysis

Optimization: Maximizing the speed and efficiency with which data is retrieved in order to ensure
high levels of database performance

P
Portfolio: A collection of materials that can be shared with potential employers

Primary key: An identifier in a database that references a column or a group of columns in which
each row uniquely identifies each record in the table (Refer to foreign key)

Project manager: A person who handles a project’s day-to-day steps, scope, schedule, budget,
and resources

Project sponsor: A person who has overall accountability for a project and establishes the
criteria for its success

Python: A general purpose programming language

Q
Query plan: A description of the steps a database system takes in order to execute a query

R
Resources: The hardware and software tools available for use in a database system

Response time: The time it takes for a database to complete a user request

Row-based database: A database that is organized by rows


S
Separated storage and computing systems: Databases where data is stored remotely,
and relevant data is stored locally for analysis

Single-homed database: Database where all of the data is stored in the same physical
location

Snowflake schema: An extension of a star schema with additional dimensions and, often,
subdimensions

Star schema: A schema consisting of one fact table that references any number of dimension
tables

Strategy: A plan for achieving a goal or arriving at a desired future state

Subject-oriented: Associated with specific areas or departments of a business

Systems analyst: A person who identifies ways to design, implement, and advance information
systems in order to ensure that they help make it possible to achieve business goals

Systems software developer: A person who develops applications and programs for the
backend processing systems used in organizations

T
Tactic: A method used to enable an accomplishment

Target table: The predetermined location where pipeline data is sent in order to be acted on

Throughput: The overall capability of the database’s hardware and software to process requests

Transferable skill: A capability or proficiency that can be applied from one job to another

V
Vanity metric: Data points that are intended to impress others, but are not indicative of actual
performance and, therefore, cannot reveal any meaningful business insights

W
Workload: The combination of transactions, queries, data warehousing analysis, and system
commands being processed by the database system at any given time

So far in this program, you have learned a lot about available business intelligence tools and how
you can use them as a BI professional. These tools will help you to monitor incoming data, generate
visualizations and dashboards, and empower stakeholders with access to reports. This helps
stakeholders make informed decisions.

These tools also have limitations. They may not be able to process complex demands fast enough
or generate complicated visualizations with a lot of metrics. In this reading, you’ll have an opportunity
to review strengths and limitations of business intelligence tools.
Tool performance
Coming up, you are going to learn more about what affects tool performance. But for now, there are
three elements you should keep in mind:

 The scope of a project: How much time needs to be represented in the dashboard? The longer
the timeline, the more data needs to be processed and presented.
 The complexity of the metrics: How many key performance indicators need to be captured
by the dashboard? The more complex your metrics, the more processing speed is affected.
 The processing speed of your tool: How fast can your tools actually respond to requests?
The more requests, the more burden is placed on the system, which can slow down response times.
Comparing common tools
Looker Studio
Strengths:
 Can be connected with most databases and big data platforms
 Intuitive and simple to use
 Easily connects to other Google tools
Limitations:
 Long loading times for larger dashboards
 Not as flexible as other tools
 Requires additional tools for reading data

Tableau
Strengths:
 Versatile and customizable
 Intuitive and simple to use
 Can integrate a variety of data sources
Limitations:
 Long loading times for larger dashboards
 Limited graph selection

Microsoft Power BI
Strengths:
 Intuitive and simple to use
 Can integrate a variety of data sources
 Variety of visualization choices
 Easily connects to other Microsoft tools
Limitations:
 Limited processing power
 Cannot export third-party visuals

MicroStrategy
Strengths:
 Can integrate a variety of data sources
 Intuitive and simple to use
 Mobile support for users
Limitations:
 More difficult to use custom reports
 Includes detailed functionality that can be difficult for users to master
As a BI professional, part of your job will be considering tool limitations, as well as what affects a
tool's performance and how to balance technological requirements with stakeholder needs.
Continuing to think about tool limitations and user expectations will help you develop the best
possible solutions for your projects, no matter what the requirements may be.

Course 3
Compare scope in different contexts
In a previous video, you were introduced to the idea of scope as it relates to dashboard design. You
may have also encountered the word “scope” in terms of project scope. In the business intelligence
world, you might find the word scope being used in a variety of contexts. And, as you’ll recall from
earlier discussions of context, understanding these contexts is key. In this reading, you’ll get a side-
by-side comparison of project scope and dashboard scope and at what stage in a project you will
likely encounter these terms. This will help as you encounter scope in different contexts as a BI
professional so you know what the expectations are in every situation.
Project scope
 Refers to the overall project goals, resources, deliverables, deadlines, collaborators, and stakeholders.
 Determined by team leadership, including project sponsors and managers.
 Outlined at the very beginning of a project to determine the overarching aspects of the project.
 Involves working with key sponsors and stakeholders to better understand and align on the entire project and its goals.

Dashboard scope
 Refers to the breadth of what a dashboard is tracking, including the amount of time and how many metrics it includes.
 Determined by BI teams as they consider project and user requirements.
 Outlined as part of the dashboard creation process based on the specific reporting needs.
 Involves choosing KPIs, how much time should be represented, and how to make important data available and understandable to decision makers through the dashboard.

Key takeaways
Often, as a BI professional, you will encounter language that means different things in different
contexts. By paying close attention, asking questions, and thinking critically, you can ensure that you
and your team stay on the same page. In this case, the difference between project scope and
dashboard scope is useful to understand as you communicate with stakeholders about their
expectations with the dashboard specifically, and not the entire project.

Reduce processing load and maintain dashboard effectiveness
Previously, you learned about the importance of optimizing processing speed for dashboards.
Processing speed describes how quickly a program can update and load a specified amount of
data. Basically, it’s how fast your dashboard can deliver answers to users. Processing speed is
usually determined by the volume of data, the number of measures, and the number of dimensions.
This is another example of a trade-off: you have to balance various factors, often prioritizing one
element while sacrificing another, in order to arrive at the best possible result. In this case, there is a
trade-off between processing speed and workload. This reading will offer solutions that enable you
to reduce processing load while maintaining the effectiveness of your dashboard.

Reduce processing load


One of the primary ways you can work to optimize your processing speed is by reducing the
processing load. You can do this by:

 Pre-aggregating: This is the process of performing calculations on data while it is still in the
database. Pre-aggregating data will transform data into a state that’s closer to what you ultimately
need because some necessary calculations will happen before the data is sent to the data
visualization tool. The trade-off is that your pipeline will involve more steps and the dataset uploaded into the visualization tool will be less flexible, but your users will get the information they need more quickly (see the sketch after this list).
 Using JOINs: JOINs are used to combine rows from two or more tables based on a related column. This essentially merges tables together before they're ever used in the dashboard, which can save a lot of processing load in the actual dashboard. However, joining full tables can place more of a burden on the system because of the dimensionality of the tables. For example, joining a one million row table with a 100 million row table will most likely generate a lot of overhead every time the dashboard is updated. So it's important to think carefully about how you use JOINs to reduce processing load!
 Filtering: Filtering is the process of showing only the data that meets specified criteria while hiding the rest. Filtering the data early in your dashboard's processing means the dashboard doesn't have to sort through data that isn't actually going to be used. The trade-off is that less data is available for your users to explore on their own.
 Linking to external locations: If some of the context for your dashboard's data can live outside of the dashboard, link out to that location for users to explore on their own instead of loading it into the dashboard. This helps cut down on the processing load.
 Avoiding user-defined functions: Users making requests of your dashboard can add a lot of
load to the processing work it’s doing. Consider the kinds of questions that users might have when
designing the dashboard so that you can address them without the users themselves having to input
functions repeatedly.
 Deciding between data views and tables: Tables contain actual data. Data views are the
result of a stored data query that preserves business logic and can be queried like a database. Data
views often require much less processing load because they don’t contain actual data, just a view of
the data. This makes them less flexible, so you’ll want to consider how interactive you need the data
in your dashboard to be.
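Here is a minimal sketch of what filtering early, joining upstream, and pre-aggregating can look like in practice. It uses Python with pandas as a stand-in for wherever your pipeline actually does this work (often SQL in the warehouse), and every file and column name below is hypothetical.

# Filter early, join upstream, and pre-aggregate before the data reaches the
# dashboard tool. All file, table, and column names here are hypothetical.
import pandas as pd

# Raw, event-level purchases and a small dimension table of books.
purchases = pd.read_csv("purchases.csv", parse_dates=["purchase_date"])  # book_id, purchase_date, price
books = pd.read_csv("books.csv")                                         # book_id, genre

# Filter early: keep only the window the dashboard actually reports on.
recent = purchases[purchases["purchase_date"] >= "2024-01-01"]

# Join upstream: attach genre once here, instead of joining full tables
# inside the dashboard every time it refreshes.
enriched = recent.merge(books[["book_id", "genre"]], on="book_id", how="left")

# Pre-aggregate: collapse event-level rows into one row per genre per day so
# the visualization tool only loads a small summary table.
summary = (
    enriched.groupby([enriched["purchase_date"].dt.date, "genre"])["price"]
    .agg(total_revenue="sum", purchases="count")
    .reset_index()
)

# Export the pre-aggregated extract for the dashboard to consume.
summary.to_csv("daily_genre_summary.csv", index=False)

The result is a dataset that is smaller and less flexible than the raw events, but much faster for the dashboard to load, which is exactly the trade-off described above.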
Key takeaways
When designing a dashboard, you'll have to weigh processing speed against processing load and decide how best to balance them to deliver the answers your stakeholders need as quickly as possible. This can be challenging, but you can apply the strategies described in this reading to reduce processing load and improve performance.

Privacy settings in business intelligence tools
As a business intelligence professional, you won’t just create dashboards and visualizations. You will
also share these tools with stakeholders so that they can access the data to get up-to-date
information and make informed decisions. You empower stakeholders with the ability to answer their
own questions—but you also want to ensure that only the people who are supposed to access that
information can do so. This has to do with data privacy and security. In this reading, you will learn
about some of the privacy restrictions that are already included with Tableau as well as other
common BI tools.

Privacy settings in Tableau
Incorporating privacy settings in your dashboard helps ensure that the data remains secure, even
when people from across your organization need to access it for different purposes. Throughout this
program, you will be using Tableau to practice key concepts and get familiar with sharing BI insights.
Luckily, Tableau already has a variety of privacy and security settings built in that you can take
advantage of.

Setting permissions
Tableau gives you the power to set permissions that control how users interact with your dashboards and data sources; you can even use permissions to determine which users can access which parts of a workbook. Tableau organizes permissions into projects and groups, which means you can set permissions based on project needs or for whole groups of users instead of person by person.
You can also use permission settings to choose what metrics users can interact with, show or hide
different sheet tabs, or even add explanations of the data that can be seen by different users
depending on their specific needs.

To learn more about permissions and how to set them yourself in Tableau, you can check out the
Tableau Online Help article about permissions.

Managing user visibility
In addition to determining what permissions users have as they interact with your Tableau dashboards, you can also manage how users are able to interact with each other. By default, all users can view other users’ aliases, project ownership, and comments. But in cases where you have created a tool that’s being used by multiple clients, teams, or users who don’t need to interact, you can determine how much visibility users have of each other.

To learn more about user visibility settings and how to set them yourself in Tableau, you can check
out the Tableau Online Help article about managing user visibility.

Row-level restrictions and filtering
Finally, Tableau allows you to filter the actual rows of data so users can access the data relevant to
their role without having to create an entirely separate view for them. This is especially useful when
working with live data sources or extracts that use multiple tables.

To learn more about row-level restrictions and how to set them yourself in Tableau, you can check out the Tableau Online Help article about user filters and row-level restrictions.
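Row-level restrictions themselves are configured in Tableau, but the underlying idea is easy to illustrate. The sketch below is conceptual only (the entitlement map, user names, and columns are all hypothetical) and is not Tableau's API; it simply shows rows being filtered by a user's entitlement before they reach a view.

# Conceptual sketch of row-level filtering: each user only sees the rows they
# are entitled to. Illustrative only; Tableau applies the same idea through
# user filters rather than application code like this.
import pandas as pd

# Hypothetical entitlement map: which regions each user may see.
USER_REGIONS = {
    "analyst_na": ["North America"],
    "analyst_emea": ["Europe", "Middle East", "Africa"],
}

def rows_for_user(data: pd.DataFrame, username: str) -> pd.DataFrame:
    """Return only the rows in the user's allowed regions (none if unknown)."""
    allowed = USER_REGIONS.get(username, [])
    return data[data["region"].isin(allowed)]

sales = pd.DataFrame({
    "region": ["North America", "Europe", "Asia"],
    "revenue": [1200, 950, 700],
})
print(rows_for_user(sales, "analyst_emea"))  # only the Europe row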

Privacy settings in other tools
Other tools that you might encounter as a professional also use privacy settings that allow you to
determine what data different users can access and view. Here are some resources you can use to
learn more about those tools:

 Data Studio: Sharing, access permissions, and data credentials
 Looker: Access control and permission management
 MicroStrategy: Restricting access to data: security filters
 PowerBI: Power BI Desktop privacy levels

Potential pain points in long-term monitoring
Business intelligence monitoring involves building and using hardware and software tools to easily
and rapidly analyze data and enable stakeholders to make impactful business decisions. As you
design dashboards for long-term monitoring, there are a few common obstacles users might
encounter that can make dashboards harder for them to use. In this reading, you’ll get a short
checklist of potential obstacles, known as “pain points.” These are things to keep in mind when you
build out the case study dashboard—and as you build dashboards in the future! This will help you
build more user-friendly dashboards that your team and stakeholders can use long-term.
Three possible obstacles for long-term monitoring

Pain points can change depending on the scale of the dashboard—the larger the scale, the more
additional context is required to make the data understandable for users. However, there are three
general obstacles you might encounter:

 Poorly defined use cases: The ways a business intelligence tool is actually used and
implemented by the team are referred to as “use cases.” When designing a dashboard that includes
live-monitoring, it’s important to establish how the different views will be used. For example, if you
only include one “executive view” with no way to drill down into specific information different users
might need, it leaves a lot of the interpreting work to users who may not understand or even need to
understand all of the data.
 Isolated snapshots: Snapshots of the latest information can be useful for reports, but if there’s
no way to track the data’s evolution, then these snapshots have a pretty limited utility. Building in
tracking for users to explore will help them understand the snapshots better. Basically, tracking
means including insights about how the data is changing over time.
 Lack of comparisons: When creating a dashboard, implementing comparisons can help users
understand whether the visualizations being presented indicate good or bad performance.
Comparisons place KPIs side-by-side in order to easily examine how similar or different they are.
Similar to adding more context to snapshots, adding comparisons is a fast way to ensure users
understand why the data in the dashboard is useful.
Key takeaways
In upcoming activities, you are going to work with stakeholders to create a dashboard designed to
monitor incoming data and provide as close to real-time updates as possible. When designing
dashboards, it’s important to keep the user in mind. Identifying potential pain points they might
encounter and addressing those problems in your design phase is a great way to guide your process
and generate more useful, accessible, and long-lasting solutions for your team.

Business intelligence presentation examples
In this part of the certificate program, you started thinking about how to present your insights to
stakeholders. Slide decks are a common tool that you can use to present to your stakeholders and
even showcase and explain your dashboards. This reading will provide you with some tips and tricks
for designing slide presentations. For more resources, you can also check out the Google Data
Analytics Certificate content about developing presentations and slideshows.

Tips for building a presentation
Use the following tips and sample layout to build your own presentation.

Tip 1: Always remember audience and purpose
To develop an effective presentation that communicates your point, it’s important to keep your
audience in mind. Ask yourself these two questions to help you define the overall flow and build out
your presentation:

Who is my audience?

 If your intended audience is primarily high-level executives, your presentation should be kept at a
high level. Executives tend to focus on main takeaways that encourage improving, correcting, or
inventing things. Keep your presentation brief and spend most of your time on results and
recommendations, or provide a walkthrough of how they can best use the tools you’ve created. It
can be useful to create an executive summary slide that synthesizes the whole presentation in one
slide.

 If your intended audience is made up of stakeholders and managers, they might have more time to learn about new processes and how you developed the right tools, and to ask more technical questions. Be prepared to provide more details with this audience!

 If your intended audience is made up of analysts and individual contributors, you will have the most freedom, and perhaps the most time, to go into more detail about the data, processes, and results.

 Support all members of your audience by making your content accessible for audience members
with diverse abilities, experiences, and backgrounds.

What is the purpose of my presentation?

 If the goal of your presentation is to request or recommend something at the end, like a sales pitch,
you can have each slide work toward the recommendations at the end.

 If the goal of your presentation is to focus on the results of your analysis, each slide can help mark
the path to the results. Be sure to include plenty of views of the data analysis steps to demonstrate
the path you took with the data.

 If the goal of your presentation is to provide a report on the data analysis, your slides should clearly
summarize your data and key findings. In this case, it is alright to simply offer the data on its own.

 If the goal of your presentation is to showcase how to use new business intelligence tools, your
slides should clearly showcase what your audience needs to understand to start using the tool
themselves.

Tip 2: Prepare talking points and limit text on slides
As you create each slide in your presentation, prepare talking points (also called speaker
notes) on what you will say.
Don’t forget that you will be talking at the same time that your audience is reading your slides. If your
slides start becoming more like documents, you should rethink what you will say so that you can
remove some text from the slides. Make it easy for your audience to skim the slides while still paying attention to what you are saying. In general, follow the five-second rule: your audience should not spend more than five seconds reading any block of text on a slide.

Knowing exactly what you will say throughout your presentation creates a natural flow to your story,
and helps avoid awkward pauses between topics. Slides that summarize data can also be repetitive;
if you prepare a variety of interesting talking points about the data, you can keep your audience alert
and paying attention.

Tip 3: End with your recommendations
Ending your presentation with recommendations and key takeaways brings the presentation to a
natural close, reminds your audience of the key points, and allows them to leave with a strong
impression of your recommendations. Use one slide for your recommendations at the end, and make
them clear and concise. And, if you are recommending that something be done, provide next steps
and describe what you would consider a successful outcome.

Tip 4: Allow enough time for the presentation and questions
Assume that everyone in your audience is busy. Keep your presentation on topic and as short as
possible by:

 Being aware of your timing. This applies to the total number of slides and the time you spend on
each slide. A good starting point is to spend 1-2 minutes on summary slides and 3-5 minutes on
slides that generate discussion.

 Presenting your data efficiently. Make sure that every slide tells a unique and important part of your
data story. If a slide isn’t that unique, you might think about combining the information on that slide
with another slide.

 Saving enough time for questions at the end or allowing enough time to answer questions
throughout your presentation.

Putting it all together: Your slide deck layout
This section covers how to put everything together in a sample slide deck layout. This is just one way you can format your slide presentations; you may find that other layouts work better for you and your presentation style. That's great! But this layout is a concise starting point you can use to develop clear and effective slide decks.

First slide: Agenda
Provide a high-level bulleted list of the topics you will cover and the amount of time you will spend on
each. Every company’s practices are different, but in general, most presentations run from 30
minutes to an hour at most. Here is an example of a 30-minute agenda:

 Introductions (4 minutes)

 Project overview and goals (5 minutes)

 Data and analysis (10 minutes)


 Recommendations (3 minutes)

 Actionable steps (3 minutes)

 Questions (5 minutes)

Second slide: Purpose
Not everyone in your audience is familiar with your project or knows why it is important. They didn’t
spend the last couple of weeks thinking through the BI processes and tools for your project like you
did. This slide summarizes the purpose of the project for your audience and why it is important to the
business.

Here is an example of a purpose statement:

Service center consolidation is an important cost savings initiative. The aim of this project is to
monitor the impact of service center consolidation on customer response times for continued
improvement.

Third slide: Data/analysis
When discussing the data, the BI processes and tools, and how your audience can use them, be
sure to include the following:

 Slides typically have a logical order (beginning, middle, and end) to fully build the story.

 Each slide should logically introduce the slide that follows it. Visual cues from the slides or verbal
cues from your talking points should let the audience know when you will go on to the next slide.

 Remember not to use too much text on the slides. When in doubt, refer back to the second tip on
preparing talking points and limiting the text on slides.

 The high-level information that people read from the slides shouldn’t be the same as the information
you provide in your talking points. There should be a nice balance between the two to tell a good
story. You don’t want to simply read or say the words on the slides.

For extra visuals on the slides, use animations. For example, you can:

 Fade in one bullet point at a time as you discuss each on a slide.

 Only display the visual that is relevant to what you are talking about (fade out non-relevant visuals).

 Use arrows or callouts to point to a specific area of a visual that you are using.

Fourth slide: Recommendations
If you have been telling your story well in the previous slides, the recommendations will be obvious
to your audience. This is when you might get a lot of questions about how your data supports your
recommendations. Be ready to communicate how your data backs up your conclusion or
recommendations in different ways. Having multiple ways to state the same thing also helps if someone is having difficulty with one particular explanation.
Fifth slide: Call to action
Sometimes the call to action can be combined with the recommendations slide. If there are multiple
actions or activities recommended, a separate slide is best.

Recall our example of a purpose statement: Service center consolidation is an important cost
savings initiative. The aim of this project is to monitor the impact of service center consolidation on
customer response times for continued improvement.

Suppose the monitoring reports showed that service center consolidation negatively impacted
customer response times. A call to action might be to examine if processes need to change to bring
customer response times back to what they were before the consolidation.

Wrapping it up: Getting feedback
After you present to your audience, reflect on how you told your data story and seek feedback you can use to improve. Consider asking your manager or a colleague for candid thoughts about your storytelling and the presentation overall. Just like most of the work you'll do as a BI professional, presentations are an iterative process!

Use prediction for better collaboration
In this program, you have been learning all about the role business intelligence professionals fill in
an organization, how they build systems to store and move data where it needs to go, and how they
create visualization and dashboard tools to share insights. You’ve also been learning about how
monitoring can be used to provide stakeholders with updated information to inform decisions.
Monitoring can also be used for predictive analytics. Normally this is not part of a BI professional’s
role, but the tools they create can be used by data scientists to make predictions. In this reading,
you’ll be introduced to predictive analytics and how BI professionals are sometimes involved.

Predictive analytics
Predictive analytics is a branch of data analytics that uses historical data to identify patterns to
forecast future outcomes that can guide decision-making. The goal of predictive analytics is to
anticipate upcoming events and preemptively make decisions according to those predictions. The
predictions can focus on any point in the future—from weekly measurements to revenue predictions
for the next year.

By feeding historical data into a predictive model, stakeholders can make decisions that aren’t just
based on what has already happened in the past—they can make decisions that take into account
likely future events, too!

One example would be a hotel using predictive analytics to determine staffing needs for major
holidays. In the hospitality industry, there are many variables that might affect staffing decisions:

 the number of guests

 what services they’re using the most

 how much it costs to pay employees to be there

Being able to predict needs and schedule employees appropriately is key. So, a hotel might use a
predictive model to consider all of these factors to inform staffing decisions.
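To make this concrete, here is a minimal sketch of that kind of prediction in Python with scikit-learn. The guest counts and staffing numbers are invented for illustration; a real model would draw on many more variables and much more history.

# Fit a simple model on hypothetical past holidays (expected guests -> staff
# scheduled), then forecast staffing for an upcoming holiday.
import numpy as np
from sklearn.linear_model import LinearRegression

guests = np.array([[120], [180], [260], [300], [410]])  # expected guests per holiday
staff_needed = np.array([14, 20, 27, 31, 42])           # staff those holidays required

model = LinearRegression().fit(guests, staff_needed)

predicted_staff = model.predict(np.array([[350]]))      # about 350 guests expected
print(f"Suggested staff: {predicted_staff[0]:.0f}")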
Another example could be a marketing team using predictive analytics to time their advertising campaigns. Based on the successes of previous years, the marketing team can assess what trends are likely to follow in the coming year and plan accordingly.

Presenting dashboards
As a BI professional, you might not be performing predictive analytics as part of your role. However,
the tools you build to monitor or update data might be helpful for data scientists on your team who
will perform this kind of analysis. By presenting dashboards effectively, you can properly
communicate to stakeholders or data scientists what the next step will be in the data pipeline, and
set them up to take the tools you create to the next level.

Key takeaways
BI professionals collaborate with a variety of different teams and experts to support the business
needs of their organization. Predictive analytics likely will not be a task you perform on the job, but
you may work with teams who do. Understanding the basics will help you consider their needs as
you design tools to support all of the teams who rely on your work!