
Data Analytics 1

Uploaded by

Yash Agrawal
Copyright
© © All Rights Reserved

DATA ANALYTICS FOUNDATION
 WHAT IS BUSINESS ANALYTICS?
Business analytics is the use of math and statistics to collect, analyze, and interpret data to make better business
decisions.
There are four key types of business analytics: descriptive, predictive, diagnostic, and prescriptive. Descriptive
analytics is the interpretation of historical data to identify trends and patterns, while predictive analytics centers on
taking that information and using it to forecast future outcomes. Diagnostic analytics can be used to identify the root
cause of a problem. In the case of prescriptive analytics, testing and other techniques are employed to determine which
outcome will yield the best result in a given scenario.
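
To make the difference between descriptive and predictive analytics concrete, here is a minimal Python sketch. The monthly sales figures and the function names `describe` and `forecast_next` are made up for illustration, and the straight-line forecast is just one simple assumption about how a trend might continue, not a recommended forecasting method.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data, not from the notes).
monthly_sales = [100, 110, 120, 130, 140, 150]


def describe(values):
    """Descriptive analytics: summarize what has already happened."""
    return {
        "average": mean(values),
        "total_change": values[-1] - values[0],
    }


def forecast_next(values):
    """Predictive analytics: fit a least-squares line and extend it one period."""
    n = len(values)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    # The next period has index n (periods are numbered from 0).
    return intercept + slope * n


summary = describe(monthly_sales)
next_month = forecast_next(monthly_sales)
print(summary, next_month)
```

The descriptive step only looks backward at the historical data, while the predictive step uses the same data to estimate a value that has not been observed yet.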

 Business Analytics vs. Data Science


It’s important to highlight the difference between business analytics and data science. While both processes use big
data to solve business problems, they’re separate fields.
The main goal of business analytics is to extract meaningful insights from data to guide organizational decisions,
while data science is focused on turning raw data into meaningful conclusions using algorithms and statistical
models. Business analysts participate in tasks such as budgeting, forecasting, and product development, while data
scientists focus on data wrangling, programming, and statistical modeling.

Now, data science, the discipline of making data useful, is an umbrella term that encompasses three disciplines:
machine learning, statistics, and analytics. These are separated by how many decisions you know you want to make
before you begin with them. If you want to make a few important decisions under uncertainty, that is statistics. If you
want to automate, in other words, make many, many, many decisions under uncertainty, that is machine learning and
AI. If you don't yet know how many decisions you want to make and are looking for inspiration, that is analytics.

 Machine Learning VS Statistics VS Analytics

The excellence of statistics is rigor. Statisticians are essentially philosophers, epistemologists. They are very, very
careful about protecting decision-makers from coming to the wrong conclusion. If that care and rigor is what you are
passionate about, I would recommend statistics.
Performance is the excellence of the machine learning and AI engineer. You know that's the one for you if someone
says to you, "I bet that you couldn't build an automation system that performs this task with 99.99999 percent
accuracy," and your response to that is, "Watch me."
How about analytics? The excellence of an analyst is speed: how quickly can you surf through vast amounts of data to
explore it and discover the gems, the potential insights that are worth knowing about and bringing to your
decision-makers? Analysts work on a lot of different things, look at a lot of different data sources, and think through
vast amounts of information, while promising not to snooze past the important potential insights. If you are
comfortable being told, "No one has looked at this data before. Go find something interesting," then analytics is
probably the best fit for you.

The six steps of the data analysis process that you have been learning in this program are: ask, prepare, process,
analyze, share, and act. These six steps apply to any data analysis.

The six phases of data analysis


The data analysis process helps analysts break down business problems into a series of manageable tasks:
1. In the ask phase (Ask: business challenge, objective, or question), you’ll work to understand the challenge to be
solved or the question to be answered. It will likely be assigned to you by stakeholders. As this is the ask
phase, you’ll ask many questions to help you along the way.
2. Next, in the prepare phase (Prepare: data generation, collection, storage, and data management), you’ll find and
collect the data you'll need to answer your questions. You’ll identify data sources, gather data, and verify that
it is accurate and useful for answering your questions.
3. The process phase (Process: data cleaning and data integrity) is when you will clean and organize your data.
Tasks you perform here include removing any inconsistencies; filling in missing values; and, in many cases,
changing the data to a format that's easier to work with. Essentially, you’re ensuring the data is ready before
you begin analysis.
4. The analyze phase (Analyze: data exploration, visualization, and analysis) is when you do the necessary data
analysis to uncover answers and solutions. Depending on the situation and the data, this could involve tasks
such as calculating averages or counting items in categories so you can examine trends and patterns.
5. Next comes the share phase (Share: communicating and interpreting results), when you present your findings
to decision-makers through a report, presentation, or data visualizations. As part of the share phase, you
decide which medium you want to use to share your findings and select the data to include. Tools for
presenting data visually include charts made in Google Sheets, Tableau, and R.
6. Last is the act phase (Act: putting insights to work to solve the problem), in which you and others in the company put
the data insights into action. This could mean implementing a new business strategy, making changes to a
website, or any other action that solves the initial problem.
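
As a small illustration of the analyze phase, the sketch below calculates an average and counts items in categories, the two tasks named in step 4. The order records and variable names are hypothetical and exist only for this example.

```python
from collections import Counter
from statistics import mean

# Hypothetical order records: (product category, order value in dollars).
orders = [
    ("books", 12.0),
    ("toys", 30.0),
    ("books", 18.0),
    ("games", 45.0),
    ("toys", 20.0),
]

# Count how many orders fall into each category.
category_counts = Counter(category for category, _ in orders)

# Calculate the average order value across all orders.
average_order_value = mean(value for _, value in orders)

print(category_counts)
print(average_order_value)
```

In practice the same calculations are often done in a spreadsheet, SQL, or R; the point is only that the analyze phase turns cleaned data into summaries that reveal trends and patterns.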

 EMC's data analysis process


EMC Corporation's data analytics process is cyclical with six steps:

1. Discovery

2. Pre-processing data

3. Model planning

4. Model building

5. Communicate results

6. Operationalize
It differs slightly from the data analysis process on which this program is based, but the two share some core
ideas: the first phase is about discovering and asking questions; data has to be prepared
before it can be analyzed and used; and findings should then be shared and acted on.

 SAS's iterative process


An iterative data analysis process was created by a company called SAS, a leading data analytics solutions
provider. It can be used to produce repeatable, reliable, and predictive results:

1. Ask

2. Prepare

3. Explore

4. Model

5. Implement

6. Act

7. Evaluate

SAS emphasizes the cyclical nature of its model by visualizing it as an infinity symbol. The
process has seven steps, many of which mirror the other models, such as ask, prepare, model, and act. But this
process is also a little different: it includes a step after the act phase designed to help analysts evaluate their
solutions and potentially return to the ask phase again.

 Project-based data analytics process


A project-based data analytics process has five simple steps:

1. Identifying the problem

2. Designing data requirements

3. Pre-processing data

4. Performing data analysis

5. Visualizing data

This data analytics project process was developed by Vignesh Prajapati. It doesn’t include a sixth phase, the
act phase. However, it still covers many of the same steps described above: it begins with identifying the
problem, continues with preparing and processing data before analysis, and ends with data visualization.

 Big data analytics process

Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics process in their book,
Big Data Fundamentals: Concepts, Drivers & Techniques. Their process is divided into nine steps:
1. Business case evaluation
2. Data identification
3. Data acquisition and filtering
4. Data extraction
5. Data validation and cleaning
6. Data aggregation and representation
7. Data analysis
8. Data visualization
9. Utilization of analysis results
This process appears to have three or four more steps than the previous models. In reality, the authors have just
broken down what has been referred to as prepare and process into smaller steps. Their process emphasizes the
individual tasks required for gathering, preparing, and cleaning data before the analysis phase.

 Data ecosystems

Data ecosystems are made up of various elements that interact with one another in order to produce,
manage, store, organize, analyze, and share data. These elements include hardware and software tools, and
the people who use them. The cloud is a place to keep data online, rather than on a computer hard drive. So
instead of storing data somewhere inside your organization's network, that data is accessed over the internet.
So the cloud is just a term we use to describe the virtual location. The cloud plays a big part in the data
ecosystem, and as a data analyst, it's your job to harness the power of that data ecosystem, find the right
information, and provide the team with analysis that helps them make smart decisions.

 Data scientists VS data analysts


Data science is defined as creating new ways of modeling and understanding the unknown by using raw
data. Data scientists create new questions using data, while analysts find answers to existing questions by
creating insights from data sources.
Data analysis is the collection, transformation, and organization of data in order to draw conclusions, make
predictions, and drive informed decision-making. Data analytics in the simplest terms is the science of data.
It's a very broad concept that encompasses everything from the job of managing and using data to the tools
and methods that data workers use each and every day.

 How data informs better decisions

The first step in data-driven decision-making is figuring out the business need. Usually, this is a problem
that needs to be solved. Whatever the problem is, once it's defined, a data analyst finds data, analyzes it, and
uses it to uncover trends, patterns, and relationships. Sometimes the data-driven strategy will build on what's
worked in the past. Other times, it can guide a business to branch out in a whole new direction. By ensuring
that data is built into every business strategy, data analysts play a critical role in their companies' success.
But no matter how valuable data-driven decision-making is, data alone will never
be as powerful as data combined with human experience, observation, and sometimes even intuition. To get
the most out of data-driven decision-making, it's important to include insights from people who are familiar
with the business problem. These people are called subject matter experts, and they have the ability to look
at the results of data analysis and identify any inconsistencies, make sense of gray areas, and eventually
validate the choices being made. Organizations that work this way put data at the heart of every business
strategy, but also benefit from the insights of their people.

 Analytical Thinking

There are five key aspects of analytical thinking: visualization, strategy, problem-orientation, correlation,
and big-picture and detail-oriented thinking.
 Visualization is the graphical representation of information. Some examples include graphs, maps,
or other design elements. Visuals can help data analysts understand and explain information more
effectively.
 Strategy: With so much data available, having a strategic mindset is key to staying focused and on
track. Strategizing helps data analysts see what they want to achieve with the data and how they can
get there. Strategy also helps improve the quality and usefulness of the data we collect. By
strategizing, we know all our data is valuable and can help us accomplish our goals.
 Problem-oriented: Data analysts use a problem- oriented approach in order to identify, describe, and
solve problems.
 Correlation: This means being able to identify a correlation between two or more pieces of data. A correlation is
like a relationship, but correlation does not equal causation. In other words, just because two pieces of
data are both trending in the same direction, that doesn't necessarily mean one causes the other.
 Big-picture and detail-oriented thinking: This means being able to see the big picture as well as the details.
Big-picture thinking helps you zoom out and see possibilities and opportunities, which can lead to exciting new
ideas or innovations. There are all kinds of problems in the business world that can benefit from employees who
have both a big-picture and a detail-oriented way of thinking.
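
The correlation point can be illustrated with a short calculation. The sketch below computes the Pearson correlation coefficient by hand for two hypothetical series; the data, the variable names, and the choice of Pearson's r as the measure are assumptions made for this example.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation: +1 means a perfect rising line, -1 a perfect falling line."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariation = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    spread_x = sum((x - x_bar) ** 2 for x in xs)
    spread_y = sum((y - y_bar) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical weekly figures that both rise over the summer.
ice_cream_sales = [20, 35, 50, 65, 80]
sunburn_cases = [2, 4, 5, 7, 9]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(round(r, 3))
```

The two series are strongly correlated, yet neither causes the other; in this made-up scenario both would plausibly be driven by a third factor such as hot weather. A high coefficient is a prompt for further investigation, not proof of cause.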

 Use the five whys for root cause analysis


Recently, you’ve been learning why business solutions almost always require some data detective work.
This is one way critical thinking helps data professionals determine the right questions to ask in order to
arrive at those solutions. One very common question is, “What is the root cause of the problem?” A root
cause is the reason why a problem occurs. So, by identifying and eliminating the root cause, data
professionals can help stop that problem from occurring again.

The five whys is a simple but effective technique for identifying a root cause. It involves asking "Why?"
repeatedly until the answer reveals itself. This often happens at the fifth “why,” but sometimes it takes
more questions and sometimes fewer.
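
The five whys can be pictured as following a chain of causes until no further answer is available. The sketch below is one possible way to model that chain; the subscription scenario, the `causes` map, and the `ask_whys` helper are all hypothetical.

```python
# Hypothetical cause map: each problem points to the answer given when asked "Why?".
causes = {
    "Customers are cancelling subscriptions": "Deliveries arrive late",
    "Deliveries arrive late": "Orders leave the warehouse late",
    "Orders leave the warehouse late": "Picking lists are printed once a day",
    "Picking lists are printed once a day": "The order system only exports data nightly",
}


def ask_whys(problem, causes, max_whys=5):
    """Follow the chain of answers until no deeper cause is known or max_whys is reached."""
    chain = [problem]
    while chain[-1] in causes and len(chain) - 1 < max_whys:
        chain.append(causes[chain[-1]])
    return chain


chain = ask_whys("Customers are cancelling subscriptions", causes)
root_cause = chain[-1]

for depth, statement in enumerate(chain):
    print("  " * depth + statement)
```

In this made-up case the chain ends after four whys rather than five, which matches the point that the number of questions needed varies from problem to problem.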

 Data-driven decision-making: Using facts to drive business decisions

1. First, think about curiosity and context. The more you learn about the power of data, the more
curious you're likely to become. You'll start to see patterns and relationships in everyday life.
Analysts take their thinking a step further by using context to make predictions, research answers, and
eventually draw conclusions about what they've discovered.

2. Having a technical mindset comes next. Everyone has instincts, or as in the case of our human
resources director example, gut feelings. Data analysts are no different. They have gut feelings too.
But they've trained themselves to build on those feelings and use a more technical approach to
explore them. They do this by always seeking out the facts, putting them to work through analysis,
and using the insights they gain to make informed decisions.

3. Next, we come to data design, which has a strong connection to data-driven decision-making. To
put it simply, designing your data so that it is organized in a logical way makes it easy for data
analysts to access, understand, and make the most of available information. And it's important to
keep in mind that data design doesn't just apply to databases. This kind of thinking can work with all
sorts of real-life situations too. Decisions that are informed by well-organized data are more likely
to be effective.

4. The final ability is data strategy, which incorporates the people, processes, and tools used to solve a
problem. Data strategy gives you a high-level view of the path you need to take to achieve your
goals. Also, data-driven decision-making isn't a one-person job. It's much more likely to be
successful if everyone is on board and on the same page, so it's important to make sure specific
procedures are in place and that the technology being used is aligned with your data-driven
strategy.

ASK STEP
Solve problems with data

Data analysts work with a variety of problems


 Making predictions: This problem type involves using data to make an informed decision about
how things may be in the future. For example, a hospital system might use remote patient
monitoring to predict health events for chronically ill patients. The patients would take their health
vitals at home every day, and that information combined with data about their age, risk factors, and
other important details could enable the hospital's algorithm to predict future health problems and
even reduce future hospitalizations.
 Categorizing things: This means assigning information to different groups or clusters based on
common features. An example of this problem type is a manufacturer that reviews data on shop floor
employee performance. An analyst may create a group for employees who are most and least
effective at engineering, a group for employees who are most and least effective at repair and
maintenance, a group for those most and least effective at assembly, and many more groups or clusters.
 Spotting something unusual: In this problem type, data analysts identify data that is different from
the norm. An instance of spotting something unusual in the real world is a school system that has a
sudden increase in the number of students registered, maybe as big as a 30 percent jump in the
number of students. A data analyst might look into this upswing and discover that several new
apartment complexes had been built in the school district earlier that year. They could use this
analysis to make sure the school has enough resources to handle the additional students.
 Identifying themes: Identifying themes takes categorization a step further by grouping
information into broader concepts. Let's go back to our manufacturer that has just reviewed data on the
shop floor employees. First, these people are grouped by types and tasks. But now a data analyst
could take those categories and group them into the broader concepts of low productivity and high
productivity. This would make it possible for the business to see who is most and least productive, in
order to reward top performers and provide additional support to those workers who need more
training.
 Discovering connections: This problem type enables data analysts to find similar challenges faced by
different entities, and then combine data and insights to address them. For example, say a scooter company is
experiencing an issue with the wheels it gets from its wheel supplier. That company would have to
stop production until it could get safe, quality wheels back in stock. Meanwhile, the wheel
company is encountering a problem with the rubber it uses to make wheels, and it turns out its rubber
supplier could not find the right materials either. If all of these entities could talk about the problems
they're facing and share data openly, they would find a lot of similar challenges and, better yet, be
able to collaborate to find a solution.
 Finding patterns: Data analysts use historical data to understand what
happened in the past and is therefore likely to happen again. Ecommerce companies use data to find
patterns all the time. Data analysts look at transaction data to understand customer buying habits at
certain points in time throughout the year.
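
The "spotting something unusual" problem type above can be sketched as a simple check for large period-over-period changes. The enrollment numbers, the 25 percent threshold, and the `flag_jumps` helper are assumptions chosen to mirror the school example, not a standard anomaly-detection method.

```python
def flag_jumps(values, threshold=0.25):
    """Return (index, relative change) for each value that moved by at least `threshold`."""
    flagged = []
    for i in range(1, len(values)):
        change = (values[i] - values[i - 1]) / values[i - 1]
        if abs(change) >= threshold:
            flagged.append((i, round(change, 3)))
    return flagged


# Hypothetical yearly enrollment for a school district; the fourth year jumps by 30 percent.
enrollment = [1000, 1010, 1020, 1326, 1340]

unusual_years = flag_jumps(enrollment)
print(unusual_years)
```

Flagging the jump is only the start: as in the school example, the analyst still has to investigate the cause (such as new apartment complexes) before recommending action.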

SMART QUESTIONS

It's important that we ask the right questions. Effective questions follow the SMART methodology.

 Specific questions are simple, significant, and focused on a single topic or a few closely related ideas.
This helps us collect information that's relevant to what we're investigating. If a question is too general,
try to narrow it down by focusing on just one element. For example, instead of asking a closed-ended
question like “Are kids getting enough physical activity these days?”, ask “What percentage of kids
achieve the recommended 60 minutes of physical activity at least five days a week?” That question is
much more specific and can give you more useful information.
 Measurable questions can be quantified and assessed. An example of an unmeasurable question would
be “Why did a recent video go viral?” Instead, you could ask “How many times was our video shared on
social channels the first week it was posted?” That question is measurable because it lets us count the
shares and arrive at a concrete number.
 Action-oriented questions encourage change. You might remember that problem solving is about seeing
the current state and figuring out how to transform it into the ideal future state. Action-oriented
questions help you get there. So rather than asking “How can we get customers to recycle our product
packaging?”, you could ask “What design features will make our packaging easier to recycle?” This brings
you answers you can act on.
 Relevant questions matter, are important, and have significance to the problem you're trying to solve.
Let's say you're working on a problem related to a threatened species of frog, and you ask “Why does it
matter that Pine Barrens tree frogs started disappearing?” This is an irrelevant question because the
answer won't help us find a way to prevent these frogs from going extinct. A more relevant question
would be “What environmental factors changed in Durham, North Carolina between 1983 and 2004 that
could cause Pine Barrens tree frogs to disappear from the Sandhills Regions?” This question would give
us answers we can use to help solve our problem. It is also a great example for our final point,
time-bound questions.
 Time-bound questions specify the time to be studied. In the frog example, the time period we want to
study is 1983 to 2004. This limits the range of possibilities and enables the data analyst to focus on
relevant data.
Fairness: Fairness means ensuring that your questions don't create or reinforce bias. Take the sandwich
example. There we had an unfair question because it was phrased to lead you toward a certain answer.
This made it difficult to answer honestly if you disagreed about the sandwich quality. Another common
example of an unfair question is one that makes assumptions. For instance, let's say a satisfaction survey
is given to people who visit a science museum. If the survey asks “What do you love most about our
exhibits?”, this assumes that the customer loves the exhibits, which may or may not be true. Fairness also
means crafting questions that make sense to everyone. It's important for questions to be clear and have
straightforward wording that anyone can easily understand. Unfair questions can also make your job as a
data analyst more difficult. They lead to unreliable feedback and missed opportunities to gain some truly
valuable insights.
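
The measurable question above, about how many times a video was shared in the first week after it was posted, translates directly into a count. The sketch below assumes a hypothetical list of share dates and treats "first week" as the seven days starting on the posting date; both are illustrative choices.

```python
from datetime import date, timedelta

# Hypothetical posting date and the dates on which the video was shared.
posted_on = date(2024, 3, 1)
share_dates = [
    date(2024, 3, 1),
    date(2024, 3, 2),
    date(2024, 3, 2),
    date(2024, 3, 7),
    date(2024, 3, 8),
    date(2024, 3, 15),
]


def shares_in_first_week(posted_on, share_dates):
    """Count shares from the posting date up to, but not including, seven days later."""
    week_end = posted_on + timedelta(days=7)
    return sum(1 for shared in share_dates if posted_on <= shared < week_end)


first_week_shares = shares_in_first_week(posted_on, share_dates)
print(first_week_shares)
```

Because the question names both a quantity (shares) and a time window (the first week), it is measurable and time-bound at once, which is what makes it answerable with a single number.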

Things to avoid when asking questions


Leading questions: questions that steer respondents toward a particular response

 Example: This product is too expensive, isn’t it?


This is a leading question because it suggests an answer as part of the question. A better question might be,
“What is your opinion of this product?” There are tons of answers to that question, and they could include
information about usability, features, accessories, color, reliability, and popularity, on top of price. Now, if
your problem is actually focused on pricing, you could ask a question like “What price (or price range)
would make you consider purchasing this product?” This question would provide a lot of different
measurable responses.

Closed-ended questions: questions that ask for a one-word or brief response only

 Example: Were you satisfied with the customer trial?


This is a closed-ended question because it doesn’t encourage people to expand on their answer. It is really
easy for them to give one-word responses that aren’t very informative. A better question might be, “What
did you learn about customer experience from the trial?” This encourages people to provide more detail
than “It went well.”

Vague questions: questions that aren’t specific or don’t provide context

 Example: Does the tool work for you?


This question is too vague because there is no context. Is it about comparing the new tool to the one it
replaces? You just don’t know. A better inquiry might be, “When it comes to data entry, is the new tool
faster, slower, or about the same as the old tool? If faster, how much time is saved? If slower, how much
time is lost?” These questions give context (data entry) and help frame responses that are measurable (time).
