Data Analytics 1
ANALYTICS FOUNDATION
WHAT IS BUSINESS ANALYTICS?
Business analytics is the use of math and statistics to collect, analyze, and interpret data to make better business
decisions.
There are four key types of business analytics: descriptive, predictive, diagnostic, and prescriptive. Descriptive
analytics is the interpretation of historical data to identify trends and patterns, while predictive analytics centers on
taking that information and using it to forecast future outcomes. Diagnostic analytics can be used to identify the root
cause of a problem. In the case of prescriptive analytics, testing and other techniques are employed to determine which
outcome will yield the best result in a given scenario.
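As a minimal sketch of the difference between descriptive and predictive analytics, the Python snippet below first summarizes a short series of monthly sales and then extrapolates one month ahead with a least-squares line. The sales figures and the names monthly_sales and fit_line are invented for illustration. Real forecasting would usually need more data and a more careful model.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [100, 110, 120, 130, 140, 150]

# Descriptive analytics: summarize what has already happened.
average_sales = mean(monthly_sales)
month_over_month = [later - earlier
                    for earlier, later in zip(monthly_sales, monthly_sales[1:])]


def fit_line(ys):
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    xs = list(range(len(ys)))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive analytics: use the historical trend to forecast the next month.
slope, intercept = fit_line(monthly_sales)
next_month_forecast = slope * len(monthly_sales) + intercept

print("Average monthly sales:", average_sales)
print("Month-over-month change:", month_over_month)
print("Forecast for next month:", next_month_forecast)
```

The descriptive part only reports on the past. The predictive part assumes the past trend continues, which is an assumption you would need to check against the business context.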
Data science, the discipline of making data useful, is an umbrella term that encompasses three disciplines:
machine learning, statistics, and analytics. These are separated by how many decisions you know you want to make
before you begin with them. If you want to make a few important decisions under uncertainty, that is statistics. If you
want to automate, in other words, make many, many, many decisions under uncertainty, that is machine learning and
AI. If you do not know in advance how many decisions you want to make, that is analytics.
The excellence of statistics is rigor. Statisticians are essentially philosophers, epistemologists. They are very, very
careful about protecting decision-makers from coming to the wrong conclusion. If that care and rigor is what you are
passionate about, I would recommend statistics.
Performance is the excellence of the machine learning and AI engineer. You know that's the one for you if someone
says to you, "I bet that you couldn't build an automation system that performs this task with 99.99999 percent
accuracy," and your response to that is, "Watch me."
How about analytics? The excellence of an analyst is speed: how quickly can you surf through vast amounts of data to
explore it and discover the gems, the beautiful potential insights that are worth knowing about and bringing to your
decision-makers? That means working on a lot of different things, looking at a lot of different data sources, and
thinking through vast amounts of information, while promising not to snooze past the important potential insights.
No one has looked at it before; go find something interesting. If that's you, then analytics is probably the best fit for you.
The six steps of the data analysis process that you have been learning in this program are: ask, prepare, process,
analyze, share, and act. These six steps apply to any data analysis. Other life-cycle models divide the work
differently. One of them uses these six phases:
1. Discovery
2. Pre-processing data
3. Model planning
4. Model building
5. Communicate results
6. Operationalize
This model is a little different from the data analysis process on which this program is based, but it has some
core ideas in common: the first phase is about discovery and asking questions; data has to be prepared
before it can be analyzed and used; and then findings should be shared and acted on.
1. Ask
2. Prepare
3. Explore
4. Model
5. Implement
6. Act
7. Evaluate
The SAS model emphasizes its cyclical nature by visualizing the process as an infinity symbol. The process
has seven steps, many of which mirror the other models, like ask, prepare, model, and act. But this
process is also a little different: it includes a step after the act phase designed to help analysts evaluate their
solutions and potentially return to the ask phase again.
3. Pre-processing data
5. Visualizing data
This data analytics project process was developed by Vignesh Prajapati. It doesn’t include the sixth phase, or
the act phase. However, it still covers a lot of the same steps described above. It begins with identifying the
problem, continues with preparing and processing data before analysis, and ends with data visualization.
Big data analytics process
Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics process in their book,
Big Data Fundamentals: Concepts, Drivers & Techniques. Their process is divided into nine steps:
1. Business case evaluation
2. Data identification
3. Data acquisition and filtering
4. Data extraction
5. Data validation and cleaning
6. Data aggregation and representation
7. Data analysis
8. Data visualization
9. Utilization of analysis results
This process appears to have three or four more steps than the previous models. But in reality, the authors have
just broken down what has been referred to as prepare and process into smaller steps. The process emphasizes the
individual tasks required for gathering, preparing, and cleaning data before the analysis phase.
Data ecosystems
Data ecosystems are made up of various elements that interact with one another in order to produce,
manage, store, organize, analyze, and share data. These elements include hardware and software tools, and
the people who use them. The cloud is a place to keep data online, rather than on a computer hard drive.
Instead of being stored somewhere inside your organization's network, that data is accessed over the internet,
so "the cloud" is just a term for that virtual location. The cloud plays a big part in the data
ecosystem, and as a data analyst, it's your job to harness the power of that data ecosystem, find the right
information, and provide the team with analysis that helps them make smart decisions.
Analytical Thinking
There are five key aspects of analytical thinking: visualization, strategy, problem-orientation, correlation,
and finally, big-picture and detail-oriented thinking.
Visualization is the graphical representation of information. Some examples include graphs, maps,
or other design elements. Visuals can help data analysts understand and explain information more
effectively.
Strategic: With so much data available, having a strategic mindset is key to staying focused and on
track. Strategizing helps data analysts see what they want to achieve with the data and how they can
get there. Strategy also helps improve the quality and usefulness of the data we collect. By
strategizing, we know all our data is valuable and can help us accomplish our goals.
Problem-oriented: Data analysts use a problem-oriented approach in order to identify, describe, and
solve problems.
Correlation: Being able to identify a correlation between two or more pieces of data. A correlation is
like a relationship. Correlation does not equal causation. In other words, just because two pieces of
data are both trending in the same direction, that doesn't necessarily mean one causes the other.
Big-picture thinking: This means being able to see the big picture as well as the details. It helps you
zoom out and see possibilities and opportunities. This leads to exciting new ideas or innovations.
There are all kinds of problems in the business world that can benefit from employees who have both
a big-picture and a detail-oriented way of thinking.
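To make the correlation aspect above concrete, here is a small sketch that computes the Pearson correlation coefficient, one common way to measure how closely two series move together. The function name pearson_r and both data series are hypothetical, invented for illustration. A value near 1 means the series rise together, a value near -1 means one rises as the other falls, and a value near 0 means no linear relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    spread_x = sum((x - x_bar) ** 2 for x in xs)
    spread_y = sum((y - y_bar) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical monthly figures, invented for illustration.
ice_cream_sales = [20, 25, 35, 50, 65, 70]
sunburn_cases = [2, 3, 5, 8, 11, 12]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"Correlation between ice cream sales and sunburn cases: {r:.3f}")
```

The two series are strongly correlated, yet ice cream does not cause sunburn. In this made-up scenario both would plausibly be driven by a third factor, such as hot weather, which is why a high correlation is a prompt for further analysis rather than a conclusion.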
The five whys is a simple but effective technique for identifying a root cause. It involves asking "Why?"
repeatedly until the answer reveals itself. This often happens at the fifth "why," but sometimes you'll need
to ask more times, and sometimes fewer.
1. First, think about curiosity and context. The more you learn about the power of data, the more
curious you're likely to become. You'll start to see patterns and relationships in everyday life.
Analysts take their thinking a step further by using context to make predictions, research answers, and
eventually draw conclusions about what they've discovered.
2. Having a technical mindset comes next. Everyone has instincts, or as in the case of our human
resources director example, gut feelings. Data analysts are no different. They have gut feelings too.
But they've trained themselves to build on those feelings and use a more technical approach to
explore them. They do this by always seeking out the facts, putting them to work through analysis,
and using the insights they gain to make informed decisions.
3. Next, we come to data design, which has a strong connection to data-driven decision-making. To
put it simply, designing your data so that it is organized in a logical way makes it easy for data
analysts to access, understand, and make the most of available information. And it's important to
keep in mind that data design doesn't just apply to databases. This kind of thinking can work with all
sorts of real-life situations too. If your decisions are informed by well-organized data, they are more
likely to be effective.
4. The final ability is data strategy, which incorporates the people, processes, and tools used to solve a
problem. Data strategy gives you a high-level view of the path you need to take to achieve your
goals. Also, data-driven decision-making isn't a one-person job. It's much more likely to be
successful if everyone is on board and on the same page, so it's important to make sure specific
procedures are in place and that the technology being used is aligned with your data-driven
strategy.
ASK STEP
Solve problems with data
SMART QUESTIONS
It's important that we ask the right questions. Effective questions follow the SMART methodology.
Specific questions are simple, significant, and focused on a single topic or a few closely related ideas.
This helps us collect information that's relevant to what we're investigating. If a question is too general,
try to narrow it down by focusing on just one element. For example, instead of asking a closed-ended
question like "Are kids getting enough physical activity these days?", ask "What percentage of kids
achieve the recommended 60 minutes of physical activity at least five days a week?" That question is
much more specific and can give you more useful information.
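Because the specific version of the question names an exact threshold, it can be answered directly from data. The sketch below assumes a hypothetical table of minutes of activity per child per day for one week. The names weekly_minutes, meets_recommendation, and percent_meeting, and all the numbers, are invented for illustration.

```python
RECOMMENDED_MINUTES = 60  # daily activity target named in the question
REQUIRED_DAYS = 5         # "at least five days a week"

# Hypothetical minutes of physical activity per day for one week.
weekly_minutes = {
    "kid_a": [60, 75, 30, 90, 60, 65, 0],
    "kid_b": [20, 45, 60, 10, 0, 30, 15],
    "kid_c": [60, 60, 60, 60, 60, 60, 60],
    "kid_d": [90, 0, 0, 120, 61, 59, 60],
}


def meets_recommendation(daily_minutes):
    """True if the target was reached on at least REQUIRED_DAYS days."""
    active_days = sum(1 for minutes in daily_minutes
                      if minutes >= RECOMMENDED_MINUTES)
    return active_days >= REQUIRED_DAYS


def percent_meeting(records):
    """Percentage of children in records who meet the recommendation."""
    meeting = sum(1 for minutes in records.values()
                  if meets_recommendation(minutes))
    return 100 * meeting / len(records)


print(f"{percent_meeting(weekly_minutes):.1f}% of kids meet the recommendation")
```

The vague question ("Are kids getting enough physical activity?") gives you nothing to compute, while the specific one maps onto two constants and a count.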
Measurable questions can be quantified and assessed. An example of an unmeasurable question would
be "Why did a recent video go viral?" Instead, you could ask "How many times was our video shared on
social channels the first week it was posted?" That question is measurable because it lets us count the
shares and arrive at a concrete number.
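A measurable question like the one above translates into a simple count. The sketch below assumes a hypothetical log with one date per share event. The posting date, the log entries, and the function name shares_in_first_week are invented for illustration, and "first week" is taken to mean the seven days starting on the posting date.

```python
from datetime import date, timedelta

# Hypothetical posting date and share log (one entry per share event).
posted_on = date(2024, 3, 1)
share_log = [
    date(2024, 3, 1),
    date(2024, 3, 2),
    date(2024, 3, 2),
    date(2024, 3, 6),
    date(2024, 3, 7),
    date(2024, 3, 8),   # falls just outside the first week
    date(2024, 3, 15),
]


def shares_in_first_week(posted, share_dates):
    """Count shares on the posting date or during the following six days."""
    first_week_end = posted + timedelta(days=7)  # exclusive upper bound
    return sum(1 for shared in share_dates if posted <= shared < first_week_end)


first_week_shares = shares_in_first_week(posted_on, share_log)
print("Shares in the first week:", first_week_shares)
```

The answer is a single number you can compare across videos, which is what makes the question measurable.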
Action-oriented questions encourage change. You might remember that problem solving is about seeing
the current state and figuring out how to transform it into the ideal future state. Action-oriented
questions help you get there. So rather than asking "How can we get customers to recycle our product
packaging?", you could ask "What design features will make our packaging easier to recycle?" This brings
you answers you can act on.
Relevant questions matter, are important, and have significance to the problem you're trying to solve.
Let's say you're working on a problem related to a threatened species of frog, and you asked "Why does it
matter that Pine Barrens tree frogs started disappearing?" This is an irrelevant question because the
answer won't help us find a way to prevent these frogs from going extinct. A more relevant question
would be "What environmental factors changed in Durham, North Carolina between 1983 and 2004 that
could cause Pine Barrens tree frogs to disappear from the Sandhills Regions?" This question would give
us answers we can use to help solve our problem. That's also a great example for our final point,
time-bound questions.
Time-bound questions specify the time to be studied. In the frog example, the time period we want to study
is 1983 to 2004. This limits the range of possibilities and enables the data analyst to focus on relevant data.
Fairness: Fairness means ensuring that your questions don't create or reinforce bias. In the sandwich
example, we had an unfair question because it was phrased to lead you toward a certain answer.
This made it difficult to answer honestly if you disagreed about the sandwich quality. Another common
example of an unfair question is one that makes assumptions. For instance, let's say a satisfaction survey
is given to people who visit a science museum. If the survey asks "What do you love most about our
exhibits?", this assumes that the visitor loves the exhibits, which may or may not be true. Fairness also
means crafting questions that make sense to everyone. It's important for questions to be clear and have
straightforward wording that anyone can easily understand. Unfair questions can also make your job as a
data analyst more difficult. They lead to unreliable feedback and missed opportunities to gain some truly
valuable insights.
Closed-ended questions: questions that ask for a one-word or brief response only