Data Analysis Course
Learning Objectives
Define key concepts involved in data analytics including data, data analysis, and data ecosystem
Identify the key features of the learning environment and their uses
Describe principles and practices that will help to increase one's chances of success in this
certificate
Describe the key concepts to be discussed in the program, including learning outcomes
-------------------------------------------------------------------------------------------------------------------
Data analysis is the collection, transformation and the organisation of data in order to draw conclusions,
make predictions and drive informed decision making.
The process presented as part of the Google Data Analytics Certificate is one that will be valuable to you
as you keep moving forward in your career:
Ask
Prepare
Process
Analyze
Share
Act
Data-driven decision-making is defined as using facts to guide business strategy. A data analyst finds
data, analyzes it, and uses it to uncover trends, patterns, and relationships.
Data alone will never be as powerful as data combined with human experience, observation, and
sometimes even intuition. To get the most out of data-driven decision-making, it's important to include
insights from people who are familiar with the business problem. These people are called subject matter
experts, and they have the ability to look at the results of data analysis and identify any inconsistencies,
make sense of gray areas, and eventually validate choices being made.
The main goal of business analytics is to extract meaningful insights from data to guide organizational
decisions, while data science is focused on turning raw data into meaningful conclusions through using
algorithms and statistical models. Business analysts participate in tasks such as budgeting, forecasting,
and product development, while data scientists focus on data wrangling, programming, and statistical
modeling.
==================================================================================
Module 2
Analytical skills are qualities and characteristics associated with solving problems using facts.
There are several aspects to analytical skills; however, here we will focus on the following five essential
points, which are curiosity, understanding context, having a technical mindset, data design, and data
strategy.
Data design is how you organize information, just like the way you organize the contacts on your phone.
Data strategy is the management of the people (who know how to use the right data to find solutions),
processes (the path to that solution is clear and accessible), and tools (the right technology is being
used for the job) involved in data analysis.
Analytical thinking involves identifying and defining a problem before solving it. Its five key aspects are
visualization, strategy, problem-orientation, correlation, and big-picture and detail-oriented thinking.
Data analysts use a problem-oriented approach to identify, describe, and solve problems. They also look
for correlations, which are relationships between two or more pieces of data.
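As a rough illustration of identifying a correlation between two pieces of data, the sketch below computes a Pearson correlation coefficient in plain Python. The names ad_spend and sales and all of the figures are invented for this example; they do not come from the course.

```python
from math import sqrt

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures, made up for illustration only.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r = pearson(ad_spend, sales)
print(round(r, 2))
```

A value close to 1 or -1 suggests a strong linear relationship. A correlation on its own does not show that one thing causes the other.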
Big-picture thinking helps you zoom out and see possibilities and opportunities, while detail-oriented
thinking is about figuring out all aspects that will help you execute a plan.
There are all kinds of problems in the business world that can benefit from employees who have a big-
picture and detail-oriented way of thinking.
=========================================================================
Plan: Decide what kind of data is needed, how it will be managed, and who will be responsible for it.
Capture: Collect or bring in data from a variety of different sources.
Manage: Care for and maintain the data. This includes determining how and where it is stored and the
tools used to do so.
Analyze: Use the data to solve problems, make decisions, and support business goals.
Archive: Keep relevant data stored for long-term and future reference.
Destroy: Remove data from storage and delete any shared copies of the data.
Warning: Be careful not to mix up or confuse the six stages of the data life cycle (Plan, Capture, Manage,
Analyze, Archive, and Destroy) with the six phases of the data analysis life cycle (Ask, Prepare, Process,
Analyze, Share, and Act). They shouldn't be used or referred to interchangeably.
Data analysis is not a life cycle; it is the process of analyzing data.
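One way to keep the two sequences apart is to treat them as separate types. The sketch below is only an illustration: the class names DataLifeCycle and AnalysisProcess are hypothetical, and the numbering simply reflects the order of the steps.

```python
from enum import Enum

class DataLifeCycle(Enum):
    """Stages that data itself passes through."""
    PLAN = 1
    CAPTURE = 2
    MANAGE = 3
    ANALYZE = 4
    ARCHIVE = 5
    DESTROY = 6

class AnalysisProcess(Enum):
    """Phases an analyst works through."""
    ASK = 1
    PREPARE = 2
    PROCESS = 3
    ANALYZE = 4
    SHARE = 5
    ACT = 6

# Both sequences have a step named "Analyze", but the two members belong to
# different enumerations and never compare equal.
same_step = DataLifeCycle.ANALYZE == AnalysisProcess.ANALYZE
print(same_step)

# Step names that appear in both sequences.
shared = {s.name for s in DataLifeCycle} & {s.name for s in AnalysisProcess}
print(shared)
```

The only name the two sequences share is Analyze, which is part of why they are easy to confuse.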
Working with stakeholders is the best way to ask effective questions and to define the problem you are
trying to solve.
The previous statement is a part of the “ask” phase. It helps you focus on the problem itself and not only
on its symptoms.
In the “share” phase, visualization is a data analyst's best friend.
=====================================================
The first thing you want to do is to ask, “What is the problem that we are trying to solve?”, “What is the
purpose of this analysis?”, and “What are we hoping to learn from it?”
We need to think about what type of data we need to answer those key questions.
After you have done all the hard work of collecting your data, the next step is to process it.
Data analysts are trained to look for patterns, but the data are not our story to tell. This is the point
where we must take a step back and let the data speak for themselves.
We might have a sneaking suspicion as to what the data will tell us.
The next step is to share all the data and insights that you have generated from your analyses. This is
where we use all those data-driven insights to decide which types of interventions we want to introduce.
After you've asked all the right questions and you've wrapped your arms around the scope of the
analysis you need to conduct, the next step is to prepare. We need to be thinking about what type of
data we need to answer those key questions. This could be quantitative or qualitative data. It could be
cross-sectional (a single point in time) or longitudinal (collected over a long period of time).
The most common tools you will see analysts use are spreadsheets, query languages, and visualization
tools. These tools help analysts get the most out of their data, streamline their reporting, and make
their work simpler.
A formula is a set of instructions that performs a specific calculation using the data in a spreadsheet.
Formulas can do basic things, such as add, subtract, multiply, and divide, but they do not stop there.
A function is a preset command that automatically performs a process or task using data.
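The difference can be mimicked outside a spreadsheet. In this sketch, which assumes a few made-up cell values, the "formula" is a calculation spelled out with operators, while the "function" is a preset command (Python's built-in sum, comparable to SUM in a spreadsheet). In a spreadsheet the two would look like =A1+A2+A3 and =SUM(A1:A3).

```python
# Hypothetical values standing in for spreadsheet cells A1 to A3.
cells = {"A1": 120, "A2": 80, "A3": 50}

# "Formula": the instructions are written out by hand, like =A1+A2+A3.
formula_total = cells["A1"] + cells["A2"] + cells["A3"]

# "Function": a preset command does the work, like =SUM(A1:A3).
function_total = sum(cells.values())

print(formula_total, function_total)
```

Both approaches give the same total. The function is shorter and keeps working if more cells are added to the range.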
Data analysts prefer using Tableau because it helps them create visuals that are very easy to understand.
Looker gives them an easy way to create visuals based on the results of a query.
With Looker, you can give stakeholders a complete picture of your work by showing them visualization
data and the actual data related to it.
---------------
Spreadsheets
Data analysts rely on spreadsheets to collect and organize data. Two popular spreadsheet applications
you will probably use a lot in your future role as a data analyst are Microsoft Excel and Google Sheets.
Spreadsheets help you identify patterns and piece the data together in a way that works for each specific
data project.
A database is a collection of structured data stored in a computer system. Some popular Structured
Query Language (SQL) programs include MySQL, Microsoft SQL Server, and BigQuery.
Query languages
- Allow analysts to isolate specific information from a database(s)
- Make it easier for you to learn and understand the requests made to databases
- Allow analysts to select, create, add, or download data from a database for analysis
Visualization tools
Data analysts use a number of visualization tools, like graphs, maps, tables, charts, and more. Two
popular visualization tools are Tableau and Looker.
These tools help stakeholders come up with conclusions that lead to informed decisions and effective
business strategies:
- Tableau's simple drag-and-drop feature lets users create interactive graphs in dashboards and
worksheets
- Looker communicates directly with a database, allowing you to connect your data right to the visual
tool you choose
A career as a data analyst also involves using programming languages, like R and Python, which are used
a lot for statistical analysis, visualization, and other data analysis.
Key takeaway
You have a lot of tools as a data analyst. This is a first glance at the possibilities, and you will explore
many of these tools in-depth throughout this program.
=================================
Depending on which phase of the data analysis process you’re in, you will need to use different tools. For
example, if you are focusing on creating complex and eye-catching visualizations, then the visualization
tools we discussed earlier are the best choice. But if you are focusing on organizing, cleaning, and
analyzing data, then you will probably be choosing between spreadsheets and databases using queries.
Spreadsheets and databases both offer ways to store, manage, and use data. The basic content of both
tools is sets of values. Yet, there are some key differences, too.
Question 1
Overview
Now that you have been introduced to working with data, you can pause for a moment and think about
what you are learning. In this self-reflection, you will consider your thoughts about the data analysis
process and data life cycle, then respond to brief questions.
This self-reflection will help you develop insights into your own learning and prepare you to apply your
knowledge of the phases of data analysis to your data analysis toolbox. As you answer questions—and
come up with questions of your own—you will consider concepts, practices, and principles to help refine
your understanding and reinforce your learning. You’ve done the hard work, so make sure to get the
most out of it: This reflection will help your knowledge stick!
So far you’ve learned about the data analysis process and the data life cycle. They include the following
steps:
Data analysis process: Ask, Prepare, Process, Analyze, Share, Act
Data life cycle: Plan, Capture, Manage, Analyze, Archive, Destroy
For a refresher on the phases of data, you can review the reading on the data analysis process and the
video on the data life cycle.
Reflection
What is the relationship between the data life cycle and the data analysis process? How are the two
processes similar? How are they different?
What is the relationship between the Ask phase of the data analysis process and the Plan phase of the
data life cycle? How are they similar? How are they different?
Reflect on your learning and think about how you can apply the phases of data to future projects.
Now, write 2-3 sentences (40-60 words) in response to each of these questions.
===========================================================================
Answer
The data life cycle and the data analysis process are closely related but distinct processes that are
important for managing and analyzing data.
The data life cycle refers to the series of stages that data goes through, from initial planning to its
eventual archiving or destruction. These stages are Plan, Capture, Manage, Analyze, Archive, and
Destroy.
The data analysis process, on the other hand, specifically refers to the process of transforming raw data
into meaningful insights by using statistical or other analytical methods. It typically involves identifying
patterns, relationships, and trends in the data, and drawing conclusions based on these findings.
The two processes are similar in that they both involve working with data, and both require careful
attention to detail, accuracy, and reliability. Both processes also require expertise in data management
and analysis techniques, as well as an understanding of the underlying data and its context.
However, there are also some important differences between the two processes. The data life cycle is
more focused on the overall management of data, while the data analysis process is more focused on
using data to answer specific questions or generate insights that can inform decision-making.
------------------------------------------------------------------------------------------------------------
The data life cycle and the data analysis process certainly have one thing in common, and that is data.
However, while the data life cycle provides a generic or common framework for how data is managed, the
data analysis process describes the steps involved in the collection, transformation, and organisation
of data to draw meaningful insights and drive informed decision making.
------------------------------------------------------------------------------------------------------------
The Ask phase of the data analysis process and the Plan phase of the data life cycle are closely related
stages in the overall process of data management and analysis.
The Ask phase of the data analysis process involves identifying the key questions or problems that need
to be addressed through data analysis. This includes defining the scope of the analysis, identifying the
relevant data sources, and clarifying the goals and objectives of the analysis.
The Plan phase of the data life cycle, on the other hand, involves the overall planning and management
of the data, including the identification of data sources, data storage and retrieval mechanisms, and the
establishment of data quality standards and policies.
The two phases are similar in that they both involve careful planning and consideration of the goals and
objectives of the analysis or data management process. Both phases also require a deep understanding
of the data being analyzed, and both involve collaboration with stakeholders to ensure that the analysis
or data management meets their needs.
However, there are also some important differences between the two phases. The Ask phase is more
focused on the specific questions that need to be answered through the analysis, while the Plan phase is
more focused on the overall management and governance of the data. Additionally, the Ask phase is
typically iterative and may involve several rounds of refining the questions and objectives of the analysis,
while the Plan phase is typically more linear and involves more structured planning and implementation
of data management processes.
Applying the phases of data to future projects helps to gain insight and uncover trends and patterns
that support meaningful predictions.
Just as humans use different languages to communicate with others, so do computers. Structured Query
Language (or SQL, often pronounced “sequel”) enables data analysts to talk to their databases. SQL is
one of the most useful data analyst tools, especially when working with large datasets in tables. It can
help you investigate huge databases, track down text (referred to as strings) and numbers, and filter for
the exact kind of data you need—much faster than a spreadsheet can.
If you haven’t used SQL before, this reading will help you learn the basics so you can appreciate how
useful SQL is and how useful SQL queries are in particular. You will be writing SQL queries in no time at
all.
What is a query?
A query is a request for data or information from a database. When you query databases, you use SQL to
communicate your question or request. You and the database can always exchange information as long
as you speak the same language.
Every programming language, including SQL, follows a unique set of guidelines known as syntax. Syntax
is the predetermined structure of a language that includes all required words, symbols, and punctuation,
as well as their proper placement. As soon as you enter your search criteria using the correct syntax, the
query starts working to pull the data you’ve requested from the target database.
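To make the idea of a query concrete, here is a minimal sketch that sends one SQL query to a small database using Python's built-in sqlite3 module. The customers table, its columns, and its rows are hypothetical and exist only for this example. The query names the columns to return (SELECT), the table to read (FROM), and the condition rows must meet (WHERE).

```python
import sqlite3

# In-memory database with a hypothetical table, created only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, orders INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", "Lagos", 3), ("Ben", "Accra", 7), ("Chi", "Lagos", 12)],
)

# The request: which columns, from which table, for which rows.
query = """
    SELECT name, orders
    FROM customers
    WHERE city = 'Lagos'
"""
rows = conn.execute(query).fetchall()
print(rows)

conn.close()
```

Only the rows whose city is Lagos come back, and only the two requested columns are included. Changing the WHERE condition changes which rows are returned without touching the rest of the query.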
A SQL query is like filling in a template. You will find that if you are writing a SQL query from scratch, it is
helpful to start a query by writing the SELECT, FROM, and WHERE keywords in the following format: