
What Is Data Analytics?

Data can help businesses better understand their customers, improve their advertising campaigns, personalize their content and improve their bottom lines. The advantages of data are many, but you can’t access these benefits without the proper data analytics tools and processes. While raw data has a lot of potential, you need data analytics to unlock the power to grow your business. Here is what we will be going over.

What Is Data Analytics?


The term data analytics refers to the process of examining datasets to
draw conclusions about the information they contain. Data analytic
techniques enable you to take raw data and uncover patterns to
extract valuable insights from it.

Today, many data analytics techniques use specialized systems and software that integrate machine learning algorithms, automation and other capabilities.

Data scientists and analysts use data analytics techniques in their research, and businesses also use them to inform their decisions. Data analysis can help companies better understand their customers, evaluate their ad campaigns, personalize content, create content strategies and develop products. Ultimately, businesses can use data analytics to boost business performance and improve their bottom line.

For businesses, the data they use may include historical data or new
information they collect for a particular initiative. They may also
collect it first-hand from their customers and site visitors or purchase it
from other organizations. Data a company collects about its own
customers is called first-party data, data a company obtains from a
known organization that collected it is called second-party data, and
aggregated data a company buys from a marketplace is called third-
party data. The data a company uses may include information about
an audience’s demographics, their interests, behaviors and more.

4 Ways to Use Data Analytics

Data has the potential to provide a lot of value to businesses, but to unlock that value, you need the analytics component. Analysis techniques give businesses access to insights that can help them improve their performance. Analytics can help you improve your knowledge of your customers, ad campaigns, budget and more.

As the importance of data analytics in the business world increases, it becomes more critical that your company understands how to implement it. Some benefits of data analytics include:
1. Improved Decision Making
Companies can use the insights they gain from data analytics to
inform their decisions, leading to better outcomes.

Data analytics eliminates much of the guesswork from planning marketing campaigns, choosing what content to create, developing products and more. It gives you a 360-degree view of your customers, which means you understand them more fully, enabling you to better meet their needs. Plus, with modern data analytics technology, you can continuously collect and analyze new data to update your understanding as conditions change.

2. More Effective Marketing


When you understand your audience better, you can market to them
more effectively. Data analytics also gives you useful insights into how
your campaigns are performing so that you can fine-tune them for
optimal outcomes.

Using the Lotame Campaign Analytics tool, you can gain insights into which audience segments are most likely to interact with a campaign and convert. You can use this information to adjust your targeting criteria either manually or through automation, or use it to develop different messaging and creative for different segments. Improving your targeting results in more conversions and less ad waste.

3. Better Customer Service


Data analytics provides you with more insights into your customers, allowing you to tailor customer service to their needs, provide more personalization and build stronger relationships with them.
Your data can reveal information about your customers’
communications preferences, their interests, their concerns and more.
Having a central location for this data also ensures that your whole
customer service team, as well as your sales and marketing teams,
are on the same page.

4. More Efficient Operations


Data analytics can help you streamline your processes, save money
and boost your bottom line. When you have an improved
understanding of what your audience wants, you waste less time on
creating ads and content that don’t match your audience’s interests.

This means less money wasted as well as improved results from your
campaigns and content strategies. In addition to reducing your costs,
analytics can also boost your revenue through increased conversions,
ad revenue or subscriptions.
What Insights Can You Gain From Data Analytics?

By collecting various kinds of data from numerous sources, you can gain insights into your audiences and campaigns that help you improve your targeting and better predict future customer behavior.

One valuable type of data is information about customer behaviors. This refers to data about specific actions that a user takes. They might, for instance, click on an ad, make a purchase, comment on a news article or like a social media post.

This and other types of data can reveal information about customer
affinities — expressed or suggested interest in activities, products,
brands and topics. A customer may express interest in your brand by
signing up for your email list. They may also indirectly express interest
in a topic by reading about it on your website. They may express
interest in a product by clicking on one of your ads for it. Some other
potential sources of customer affinity data include survey responses,
social media likes and video views.

By combining this data with information about your current customers’ demographics, you can gain insights into the customer segments that are most likely to be interested in your brand, content or products. Demographic information includes information about customers’ ages, genders, income, marital status and various other characteristics. For example, you might find, through data analytics, that people between the ages of 18 and 35 are the most likely to purchase your product. You might also find that people who are married make up most of your website’s audience. By targeting multiple characteristics, you can create more specific audiences who are highly likely to convert.

You can then use this information to predict the behaviors of various
types of users and target your ads and content more effectively. 
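As a minimal sketch of this kind of segmentation, here is how combining demographic characteristics might look in Python with pandas. The column names and values are hypothetical, invented purely for illustration:

```python
import pandas as pd

# Hypothetical audience data; column names and values are illustrative only.
audience = pd.DataFrame({
    "age": [22, 34, 41, 28, 55, 19],
    "married": [False, True, True, False, True, False],
    "converted": [True, True, False, True, False, False],
})

# Combine characteristics to form a specific segment, e.g. married
# people between 18 and 35, as in the examples above.
segment = audience[audience["age"].between(18, 35) & audience["married"]]

# Compare conversion rates inside the segment and overall.
print("segment conversion rate:", segment["converted"].mean())
print("overall conversion rate:", audience["converted"].mean())
```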
Data Analytics Technology
Data analytics is nothing new. Today, though, the growing volume of
data and the advanced analytics technologies available mean you can
get much deeper data insights more quickly. The insights that big data
and modern technologies make possible are more accurate and more
detailed. In addition to using data to inform future decisions, you can
also use current data to make immediate decisions.

Some of the technologies that make modern data analytics so powerful are:

 Machine learning: Artificial intelligence (AI) is the field of developing and using computer systems that can simulate human intelligence to complete tasks. Machine learning (ML) is a subset of AI that is significant for data analytics and involves algorithms that can learn on their own. ML enables applications to take in data and analyze it to predict outcomes without someone explicitly programming the system to reach that conclusion. You can train a machine learning algorithm on a small sample of data, and the system will continue to learn as it gathers more data, becoming more accurate as time goes on.
 Data management: Before you can analyze data, you need to have
procedures in place for managing the flow of data in and out of your systems
and keeping your data organized. You also need to ensure that your data is
high-quality and that you collect it in a central data management platform
(DMP) where it’s available for use when needed. Establishing a data
management program can help ensure that your organization is on the same
page regarding how to organize and handle data.

 Data mining: The term data mining refers to the process of sorting through large amounts of data to identify patterns and discover relationships between data points. It enables you to sift through large datasets and figure out what’s relevant. You can then use this information to conduct analyses and inform your decisions. Today’s data mining technologies allow you to complete these tasks exceptionally quickly.

 Predictive analytics: Predictive analytics technology helps you analyze historical data to predict future outcomes and the likelihood of various outcomes occurring. These technologies typically use statistical algorithms and machine learning. More accurate predictions mean businesses can make better decisions moving forward and position themselves to succeed. Predictive analytics allows them to anticipate their customers’ needs and concerns, predict future trends and stay ahead of the competition (see the sketch after this list).
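As a rough illustration of the machine learning and predictive analytics points above, the sketch below trains a model on a small sample of historical data and uses it to score a new case. It uses scikit-learn; the feature names and all numbers are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical historical data: per-user ad clicks and past purchases,
# plus whether the user converted. All values are synthetic.
rng = np.random.default_rng(0)
X = rng.integers(0, 20, size=(200, 2))           # [ad_clicks, past_purchases]
y = (X[:, 0] + 2 * X[:, 1] + rng.normal(0, 3, 200) > 20).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)   # learn from historical data
print("held-out accuracy:", model.score(X_test, y_test))
print("p(convert | 5 clicks, 8 purchases):", model.predict_proba([[5, 8]])[0, 1])
```

The same pattern scales up: retrain as new data arrives, and the predictions feed the decision-making described above.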

Data Analytics Examples


Let’s look at a few quick examples of how you might collect data and
analyze it to help improve outcomes for your business.
Let’s say you’re a marketer who’s running an online ad campaign to
promote a new smartphone. You might start by targeting the ad to
people who bought the previous version of the phone in question. As
your campaign runs, you use data analytics techniques to sift through
the data generated when people clicked on the ad. By examining data
about these users’ interests, perhaps you discover many of them are
interested in photography. Perhaps this is because your new phone
has a better camera than the previous model. Using this information,
you could fine-tune your ad to focus on users who bought the previous
phone and like photography. You could also find new audiences of
people who didn’t buy the older phone but are interested in
photography.

As another example, let’s say you publish a site that features videos
about sports. As people visit your site, you could collect data about
which videos different visitors watch as well as how highly they rate
the videos, which ones they comment on and more. You could also
gather information about the demographics of each user. You can use
data analytics tools to determine which audience segments are most
likely to watch certain videos. You can then suggest videos to people
based on the segments they fit into best. For example, you might find
that older men are most likely to be interested in golf, while younger
men are most likely to be interested in basketball.

For some real-life examples of how Lotame’s data analytics tools have helped businesses drive improved results, check out our case studies.

Challenges of Data Analytics


While data analytics can provide many benefits to the companies that
use it, it’s not without its challenges. Working with the right partners
and using the right tools can help businesses to overcome these
difficulties.

One of the biggest challenges related to data analytics is collecting the data. There’s a lot of data that businesses could potentially collect, and they need to determine what to prioritize. Collecting data requires tools that can gather data from website visits, ad clicks and other interactions and deliver it in a usable format.
Once you collect your data, you need somewhere to store it. Your data can take up a considerable amount of space and contain many different types of information. You have to integrate both structured and unstructured data from online and offline sources and from internal and external sources.

You also need to ensure data quality so your results are accurate. In
addition, your data needs to be accessible and not siloed so everyone
throughout your organization has the same repository.
4 Types of Data in Statistics
Introduction
 
Data types are important concepts in statistics: they enable us to apply statistical measurements correctly to data and help us draw correct conclusions about it.
 
A good understanding of the various data types is essential for doing exploratory data analysis (EDA), since you can use certain statistical measurements only for particular data types.
 
Similarly, you need to know which type of data you are working with in order to select the correct visualization technique. You can think of data types as a way to categorize different kinds of variables.
 
At the top level, there are only two classes of data in statistics: qualitative and quantitative data. Each of these then subdivides, giving 4 types of data in total. Data types are like a guide for doing the whole study of statistics correctly!
 
This blog gives you an overview of the different data types you need to know for performing proper exploratory data analysis.
 
 
Qualitative and Quantitative Data
 
Qualitative data is information that cannot be measured in the form of numbers. It is also known as categorical data. It normally comprises words and narratives, and we label it with names.
 
It delivers information about the qualities of things in the data. The outcome of qualitative data analysis can take the form of highlighted keywords, extracted information and elaborated ideas.
 
For example:
 
 Hair colour- black, brown, red
 Opinion- agree, disagree, neutral
 
On the other side, quantitative data is information that can be counted or measured and lends itself to statistical data analysis. Numerical data is another name for quantitative data. Simply put, it gives information about quantities of items in the data; the items can be measured, and we can express them in terms of numbers.
 
For example:
 
 We can measure height (1.70 meters) and distance (1.35 miles) with the help of a ruler or tape.
 We can measure water (1.5 litres) with a jug.
 
Under this subdivision, nominal data and ordinal data come under qualitative data, while interval data and ratio data come under quantitative data. Here we will read in detail about all these data types.
Different Types of Data

1. Nominal Data
 
Nominal data are used to label variables that have no quantitative value and no order. So, if you change the order of the values, the meaning remains the same.
 
Thus, nominal data are observed but not measured, are unordered but
non-equidistant, and have no meaningful zero.
 
The only numerical activity you can perform on nominal data is to state that one observation is (or isn't) equal to another (equality or inequality), and you can use this to group them.
 
You can't organize nominal data, so you can't sort them.
 
Nor can you do arithmetic on them, as that is reserved for numerical data. With nominal data, you can calculate frequencies, proportions, percentages, and a central point (the mode).
 
Examples of Nominal data:
 
 What languages do you speak?
 

 English
 German
 French
 Punjabi

 
 What’s your nationality?
 

 American
 Indian
 Japanese
 German

 
You can clearly see that in these examples of nominal data the
categories have no order.
 
2. Ordinal Data
 
Ordinal data is almost the same as nominal data, except that its categories can be ordered, like 1st, 2nd, etc. However, the relative distances between adjacent categories are not consistent.
 
Ordinal Data is observed but not measured, is ordered but non-
equidistant, and has no meaningful zero. Ordinal scales are always used
for measuring happiness, satisfaction, etc.
 
With ordinal data, as with nominal data, you can group the information by evaluating whether observations are equal or different.
 
As ordinal data are ordered, they can be arranged by making basic
comparisons between the categories, for example, greater or less than,
higher or lower, and so on.
 
You still can't do arithmetic with ordinal data, however, as the category labels are not truly numerical.
 
With ordinal data, you can calculate the same things as with nominal data, such as frequencies, proportions, percentages and a central point, with one addition: because the categories are ordered, you can also compute order-based summary statistics (such as the median) and similarly apply Bayesian statistics.
 
Examples of Ordinal data:
 
 Opinion
o Agree
o Mostly agree
o Neutral
o Mostly disagree
o Disagree

 
 Time of day
o Morning
o Noon
o Night

 
In these examples, there is an obvious order to the categories.
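To make the nominal/ordinal distinction concrete, here is a small, hypothetical sketch in Python using pandas categoricals, where declaring an explicit category order turns sorting and comparisons into meaningful operations:

```python
import pandas as pd

# Nominal: categories have no inherent order; only equality checks,
# grouping, and frequency counts are meaningful.
nationality = pd.Series(["American", "Indian", "Japanese", "American"],
                        dtype="category")
print(nationality.value_counts())   # frequencies
print(nationality.mode())           # the only valid "central point"

# Ordinal: the same machinery, but with an explicit category order,
# so sorting and comparisons like > become meaningful.
opinion = pd.Series(["agree", "neutral", "disagree", "agree"])
opinion = opinion.astype(pd.CategoricalDtype(
    categories=["disagree", "neutral", "agree"], ordered=True))
print(opinion.sort_values())        # ordered smallest to largest
print(opinion > "neutral")          # element-wise order comparison
```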
 
3. Interval Data
 
Interval data are measured and ordered with equidistant items but have no meaningful zero.
 
The central point of an interval scale is that the word 'interval' signifies 'space in between', which is the significant thing to recall: interval scales tell us not only about the order of items but also about the value between every item.
 
Interval data can be negative, though ratio data can't.
 
Even though interval data can look fundamentally the same as ratio data, the difference lies in their defined zero points. If the zero point of the scale has been picked arbitrarily, then the data can't be ratio data and must be interval data.
 
Hence, with interval data you can easily compare the degrees of the data, and you can also add or subtract the values.
 
The descriptive statistics you can calculate for interval data are the central point (mean, median, mode), range (minimum, maximum), and spread (percentiles, interquartile range, and standard deviation).
 
In addition, other statistical data analysis techniques can be applied for further analysis.
 
Examples of Interval data:
 
 Temperature (°C or F, but not Kelvin)
 Dates (1066, 1492, 1776, etc.)
 Time interval on a 12-hour clock (6 am, 6 pm)
 
4. Ratio Data
 
Ratio data are measured and ordered with equidistant items and a meaningful zero, and unlike interval data they can never be negative.
 
An outstanding example of ratio data is the measurement of heights. It
could be measured in centimetres, inches, meters, or feet and it is not
practicable to have a negative height. 
 
Ratio data tells us about the order of variables and the differences among them, and it has an absolute zero. This permits a wide range of calculations and inferences to be performed and drawn.
 
Ratio data is fundamentally the same as interval data, except that zero means 'none'.
 
The descriptive statistics you can calculate for ratio data are the same as for interval data: the central point (mean, median, mode), range (minimum, maximum), and spread (percentiles, interquartile range, and standard deviation).
 
Example of Ratio data:
 
 Age (from 0 years to 100+)
 Temperature (in Kelvin, but not °C or F)
 Distance (measured with a ruler or any other assessing device)
 Time interval (measured with a stop-watch or similar)
 
Therefore, for these examples of ratio data there is an actual, meaningful zero point: the age of a person, absolute zero temperature, distance measured from a specified point, and elapsed time all have real zeros.
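As a minimal illustration of the descriptive statistics listed above, here is how they could be computed for a small, hypothetical sample of ratio data (ages in years) with NumPy:

```python
import numpy as np

# Hypothetical ratio data: ages in years (meaningful zero, never negative).
ages = np.array([23, 35, 41, 28, 35, 52, 19, 35])

print("mean:", ages.mean())
print("median:", np.median(ages))
print("mode-ish (most frequent):", np.bincount(ages).argmax())
print("range:", ages.min(), "to", ages.max())
print("IQR:", np.percentile(ages, 75) - np.percentile(ages, 25))
print("standard deviation:", ages.std(ddof=1))
```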
 
Key Takeaways
 
We hope you now understand the 4 types of data in statistics and their importance. With this, you can learn how to handle data correctly, which statistical hypothesis tests you can use, and what you can calculate with them.
Moreover,
 Nominal data and ordinal data are the types of qualitative data or
categorical data.
 Interval data and ratio data are the types of quantitative data which are
also known as numerical data.
 Nominal Data are not measured but observed and they are unordered,
non-equidistant, and also have no meaningful zero.
 Ordinal Data is also not measured but observed and they are ordered
however non-equidistant and have no meaningful zero.
 Interval Data are measured and ordered with equidistant items yet have
no meaningful zero.
 Ratio Data are also measured and ordered with equidistant items and a
meaningful zero.

What is data visualization? Presenting data for
decision-making
Data visualization definition
Data visualization is the presentation of data in a graphical format. It reduces the “noise”
of data by presenting it visually, making it easier for decision makers to see and
understand trends, outliers, and patterns in data.

Maps and charts were among the earliest forms of data visualization. One of the most
well-known early examples of data visualization was a flow map created by French civil
engineer Charles Joseph Minard in 1869 to help understand what Napoleon’s troops
suffered in the disastrous Russian campaign of 1812. The map used two dimensions to
depict the number of troops, distance, temperature, latitude and longitude, direction of
travel, and location relative to specific dates.

The business value of data visualization
Data visualization helps people analyze data quickly and efficiently. By providing easy-to-understand visual representations of data, it helps employees make more informed decisions based on that data. Presenting data in visual form can make it easier to comprehend, enabling people to obtain insights more quickly. Visualizations can also make it easier to communicate those insights. Visual representations of data can also make it easier to see how independent variables relate to one another. This can help you see trends, understand the frequency of events, and track connections between operations and performance, for example.

Types of data visualization


There are myriad ways of visualizing data, but data design agency The Datalabs
Agency says there are two basic categories of data visualization:

 Exploration: Exploration visualizations help you understand what the data is telling you.
 Explanation: Explanation visualizations tell a story to an audience using data.

It is essential to understand which of those two ends a given visualization is intended to achieve.

Some of the most common specific types of visualizations include:

2D area
These are typically geospatial visualizations. For example, cartograms use distortions of
maps to convey information such as population or travel time. Choropleths use shades
or patterns on a map to represent a statistical variable, such as population density by
state.

Temporal
These are one-dimensional linear visualizations that have a start and finish time.
Examples include a time series, which presents data like website visits by day or month,
and Gantt charts, which illustrate project schedules.

Multidimensional
These common visualizations present data with two or more dimensions. Examples
include pie charts, histograms, and scatter plots.

Hierarchical
These visualizations show how groups relate to each other. Tree diagrams are an
example of a hierarchical visualization that shows how larger groups encompass sets of
smaller groups.
Network
Network visualizations show how data sets are related to each other in a network. An
example is a node-link diagram, also known as a network graph, which uses nodes and
link lines to show how things are interconnected.

Data visualization examples


Tableau has collected what it considers to be 10 of the best data visualization
examples. Number one on Tableau’s list is Minard’s map of Napoleon’s march to
Moscow, mentioned above. Other prominent examples include:

 A dot map created by English physician John Snow in 1854 to understand the cholera
outbreak in London that year. The map used bar graphs on city blocks to indicate cholera
deaths at each household in a London neighborhood. The map showed that the worst-affected
households were all drawing water from the same well, which eventually led to the insight that
wells contaminated by sewage had caused the outbreak.
 An animated age and gender demographic breakdown pyramid created by Pew
Research Center as part of its The Next America project, published in 2014. The project is filled
with innovative data visualizations. This one shows how population demographics have shifted since the 1950s, from a pyramid with many young people at the bottom and very few older people at the top to a projected rectangular shape in 2060.
 A collection of four visualizations by Hanah Anderson and Matt Daniels of The Pudding
that illustrate gender disparity in pop culture by breaking down the scripts of 2,000 movies and
tallying spoken lines of dialogue for male and female characters. The visualizations include a
breakdown of Disney movies, the overview of 2,000 scripts, a gradient bar with which users can
search for specific movies, and a representation of age biases shown toward male and female
roles.

Data visualization tools


There are many applications, tools, and scripts available for data visualization. Some of
the most popular include the following:

Domo
Domo is a cloud software company that specializes in business intelligence tools and
data visualization. It focuses on business-user deployed dashboards and ease of use.

Dundas BI
Dundas BI is a BI platform for visualizing data, building and sharing dashboards and
reports, and embedding analytics.

Infogram
Infogram is a drag-and-drop visualization tool for creating visualizations for marketing
reports, infographics, social media posts, dashboards, and more.

Microsoft Power BI
Microsoft Power BI is a business intelligence platform integrated with Microsoft Office. It
has an easy-to-use interface for making dashboards and reports.
Qlik
Qlik’s Qlik Sense features an “associative” data engine for investigating data and AI-
powered recommendations for visualizations. It is continuing to build out its open
architecture and multicloud capabilities.

Sisense
Sisense is an end-to-end analytics platform best known for embedded analytics. Many
customers use it in an OEM form.

Tableau
One of the most popular data visualization platforms on the market, Tableau is a
platform that supports accessing, preparing, analyzing, and presenting data.

4 Ways Data Visualization Can Improve Your Decision Making
1 |  Increase Speed 

Information travels at the speed of light in our always-connected world. Due to the fast pace of business, managers are often expected to make critical decisions in short periods. If you are not timely, an opportunity may pass by, or a small problem may grow exponentially worse overnight.

Data visualization can help you draw actionable insights from massive
amounts of data in a short amount of time. Even a simple visualization, like
a bar graph, can present valuable insights in seconds. Take a look at the
example below:
Data collected from a corporate technology assessment is organized into this
colorful bar graph. By glancing at the chart, an IT manager could
immediately recognize which skills need improvement. From there, the
manager could decide to allocate resources towards training and recruiting in
these skill areas. All within a few minutes.  
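As a rough sketch of how such a bar graph could be built in a few lines of Python with matplotlib; the skill names, scores, and target level below are hypothetical:

```python
import matplotlib.pyplot as plt

# Hypothetical results of a corporate technology assessment;
# skill names and scores are illustrative only.
skills = ["Python", "SQL", "Cloud", "Security", "Networking"]
avg_score = [72, 65, 48, 55, 80]   # average proficiency, 0-100
target = 70                        # desired proficiency level

plt.bar(skills, avg_score, color="steelblue")
plt.axhline(target, color="red", linestyle="--", label="target proficiency")
plt.ylabel("Average score")
plt.title("Workforce skill assessment")
plt.legend()
plt.show()
```

A glance at the bars relative to the dashed target line makes the gaps obvious, which is exactly the speed benefit described above.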

2 | Improve Accuracy 

Business leaders must make timely and informed decisions. However, collecting and efficiently reviewing all of the numbers is not always a reasonable option. What happens when you don't have all of the facts before making a decision? You fill in the blanks with assumptions and biases.

Psychologists labeled this quick fix as heuristics. Heuristics are mental shortcuts that allow people to solve problems and make judgments quickly and efficiently. However, these shortcuts can lead to cognitive biases that can taint our decision-making process. For some decisions, relying on heuristics is perfectly acceptable, but for high-impact decisions, missing information can lead to serious missteps.

While big data provides decision-makers with all the information, it's not always presented in a consumable form. Imagine if, for every high-impact decision, you had to scroll through rows of data compiled in a spreadsheet just to digest all the facts. It's unreasonable, time-consuming, and confusing.

As a manager, you need to spend your time driving action, not analyzing
numbers. When it's hard to consume data, it's easy to ignore the facts
and lean on our biases. Instead of wasting valuable time analyzing rows of
data or falling back on your assumptions, use visualizations to identify
relevant information quickly.  

Data visualization simplifies the information, reducing the need to fill the
gaps with your personal biases. In the bar chart above, you can easily see
a comparison of all the skills across the workforce. When you need to decide
where to allocate resources, your decision is based on factual data, not
assumptions. 

3 | Simplify Communication

A decision is just words until it is carried out through people's actions. After
you make a decision, you must effectively communicate your thoughts
with the people who will carry out the subsequent steps. In the same way
that data visualization simplifies data analysis, it can also streamline and
objectify communication. 

For example, a manager at an engineering firm determines his lead mechanical engineer needs to improve his solar thermal skills (the red column) and assigns him to training. The manager could just say this to his engineer. However, now the engineer is left with several unanswered questions. Why does he need training? Where should his thermal skills be right now? Does his manager dislike him?

While the decision may be apparent to the manager, it isn't communicated clearly to the engineer. How can you expect people to take effective action when the message isn't clear in the first place?

Alternatively, the manager could use the graph below to clearly communicate why he is making this decision to the engineer:
The chart clearly shows which skills do not meet the ideal proficiency levels
and by how much those skills need to improve. By presenting his message in
visual form, the manager can ensure the engineer understands why he needs
training and how he can gauge his progress. The visualization shifted the
manager's message from unclear and subjective to concise and
objective.  

4 | Empower Collaboration

Leaders should always seek the perspectives of others to enhance their decision-making process. Research shows that “being able to view the decision environment from multiple perspectives enhances the decision-maker's ability to make better-informed choices.”



Data visualization turns raw data into a universally consumable form. By providing access to valuable information, you give people the tools to develop more informed opinions and empower them to contribute their perspectives to the decision-making process. Provoking diverse thought is especially important when leaders are faced with high-impact decisions that affect their organizations.

Great decision-making has always been a crucial skill for business leaders.
Big data can put you ahead of the competition if you can use it to produce
timely, informed decisions that deliver successful outcomes for your company.
Incorporating data visualization in the decision-making process can improve
speed, reduce inaccuracies, and enhance communication and collaboration.
How will you leverage data visualization to start making better data-driven
decisions today?
What is Data Science?
Data science defined
Data science combines multiple fields, including statistics, scientific methods, artificial
intelligence (AI), and data analysis, to extract value from data. Those who practice data science
are called data scientists, and they combine a range of skills to analyze data collected from the
web, smartphones, customers, sensors, and other sources to derive actionable insights.

Data science encompasses preparing data for analysis, including cleansing, aggregating, and
manipulating the data to perform advanced data analysis. Analytic applications and data
scientists can then review the results to uncover patterns and enable business leaders to draw
informed insights.

Data Science
Data science combines the scientific method, math and statistics, specialized programming,
advanced analytics, AI, and even storytelling to uncover and explain the business insights
buried in data.

What is data science?

Data science is a multidisciplinary approach to extracting actionable insights from the large and ever-increasing volumes of data collected and created by today’s organizations. Data science encompasses preparing data for analysis and processing, performing advanced data analysis, and presenting the results to reveal patterns and enable stakeholders to draw informed conclusions.

Data preparation can involve cleansing, aggregating, and manipulating the data to be ready for specific types of processing. Analysis requires the development and use of algorithms, analytics, and AI models. It’s driven by software that combs through data to find patterns and transform those patterns into predictions that support business decision-making. The accuracy of these predictions must be validated through scientifically designed tests and experiments. And the results should be shared through the skillful use of data visualization tools that make it possible for anyone to see the patterns and understand trends.

As a result, data scientists (as data science practitioners are called) require computer
science and pure science skills beyond those of a typical data analyst. A data scientist
must be able to do the following:

 Apply mathematics, statistics, and the scientific method
 Use a wide range of tools and techniques for evaluating and preparing data—everything from SQL to data mining to data integration methods
 Extract insights from data using predictive analytics and artificial intelligence (AI), including machine learning and deep learning models
 Write applications that automate data processing and calculations
 Tell—and illustrate—stories that clearly convey the meaning of results to decision-makers and stakeholders at every level of technical knowledge and understanding
 Explain how these results can be used to solve business problems

This combination of skills is rare, and it’s no surprise that data scientists are currently in
high demand. According to an IBM survey (PDF, 3.9 MB), the number of job openings in
the field continues to grow at over 5% per year, with over 60,000 forecast for 2020.

Data science: An untapped resource for machine learning


Data science is one of the most exciting fields out there today. But why is it so important?

Because companies are sitting on a treasure trove of data. As modern technology has enabled the
creation and storage of increasing amounts of information, data volumes have exploded. It’s
estimated that 90 percent of the data in the world was created in the last two years. For example,
Facebook users upload 10 million photos every hour.

But this data is often just sitting in databases and data lakes, mostly untouched.

The wealth of data being collected and stored by these technologies can bring transformative
benefits to organizations and societies around the world—but only if we can interpret it. That’s
where data science comes in.

Data science reveals trends and produces insights that businesses can use to make better
decisions and create more innovative products and services. Perhaps most importantly, it enables
machine learning (ML) models to learn from the vast amounts of data being fed to them, rather
than mainly relying upon business analysts to see what they can discover from the data.

Data is the bedrock of innovation, but its value comes from the information data scientists can
glean from it, and then act upon.
What’s the difference between data science, artificial intelligence, and machine learning?
To better understand data science—and how you can harness it—it’s equally important to know
other terms related to the field, such as artificial intelligence (AI) and machine learning. Often,
you’ll find that these terms are used interchangeably, but there are nuances.

Here’s a simple breakdown:

 AI means getting a computer to mimic human behavior in some way.


 Data science is a subset of AI, and it refers more to the overlapping areas of statistics, scientific methods, and data analysis—all of which are used to extract meaning and insights from data.
 Machine learning is another subset of AI, and it consists of the techniques that enable
computers to figure things out from the data and deliver AI applications.
And for good measure, we’ll throw in another definition:
 Deep learning is a subset of machine learning that enables computers to solve more complex problems.
How data science is transforming business
Organizations are using data science to turn data into a competitive advantage by refining
products and services. Data science and machine learning use cases include:

 Determine customer churn by analyzing data collected from call centers, so marketing can
take action to retain them
 Improve efficiency by analyzing traffic patterns, weather conditions, and other factors so
logistics companies can improve delivery speeds and reduce costs
 Improve patient diagnoses by analyzing medical test data and reported symptoms so doctors
can diagnose diseases earlier and treat them more effectively
 Optimize the supply chain by predicting when equipment will break down
 Detect fraud in financial services by recognizing suspicious behaviors and anomalous actions
 Improve sales by creating recommendations for customers based upon previous purchases
Many companies have made data science a priority and are investing in it heavily. In Gartner’s
recent survey of more than 3,000 CIOs, respondents ranked analytics and business intelligence as
the top differentiating technology for their organizations. The CIOs surveyed see these
technologies as the most strategic for their companies, and are investing accordingly.
How data science is conducted
The process of analyzing and acting upon data is iterative rather than linear, but this is how the
data science lifecycle typically flows for a data modeling project:

Planning:  Define a project and its potential outputs.


Building a data model:  Data scientists often use a variety of open source libraries or in-
database tools to build machine learning models. Often, users will want APIs to help with data
ingestion, data profiling and visualization, or feature engineering. They will need the right tools
as well as access to the right data and other resources, such as compute power.
Evaluating a model:  Data scientists must achieve a high percentage of accuracy for their models before they can feel confident deploying them. Model evaluation will typically generate a comprehensive suite of evaluation metrics and visualizations to measure model performance against new data, and also rank models over time to enable optimal behavior in production. Model evaluation goes beyond raw performance to take into account expected baseline behavior.
Explaining models:  Being able to explain the internal mechanics of the results of machine
learning models in human terms has not always been possible—but it is becoming increasingly
important. Data scientists want automated explanations of the relative weighting and importance
of factors that go into generating a prediction, and model-specific explanatory details on model
predictions.
Deploying a model:  Taking a trained, machine learning model and getting it into the right
systems is often a difficult and laborious process. This can be made easier by operationalizing
models as scalable and secure APIs, or by using in-database machine learning models.
Monitoring models:  Unfortunately, deploying a model isn’t the end of it. Models must always
be monitored after deployment to ensure that they are working properly. The data the model was
trained on may no longer be relevant for future predictions after a period of time. For example, in
fraud detection, criminals are always coming up with new ways to hack accounts.
Tools for data science
Building, evaluating, deploying, and monitoring machine learning models can be a complex
process. That’s why there’s been an increase in the number of data science tools. Data scientists
use many types of tools, but one of the most common is open source notebooks, which are web
applications for writing and running code, visualizing data, and seeing the results—all in the
same environment.

Some of the most popular notebooks are Jupyter, RStudio, and Zeppelin. Notebooks are very
useful for conducting analysis, but have their limitations when data scientists need to work as a
team. Data science platforms were built to solve this problem.

To determine which data science tool is right for you, it’s important to ask the following
questions: What kind of languages do your data scientists use? What kind of working methods
do they prefer? What kind of data sources are they using?

For example, some users prefer to have a datasource-agnostic service that uses open source
libraries. Others prefer the speed of in-database, machine learning algorithms.

The data science lifecycle

The data science lifecycle—also called the data science pipeline—includes anywhere
from five to sixteen (depending on whom you ask) overlapping, continuing processes.
The processes common to just about everyone’s definition of the lifecycle include the
following:

 Capture: This is the gathering of raw structured and unstructured data from all
relevant sources via just about any method—from manual entry and web scraping to
capturing data from systems and devices in real time.
 Prepare and maintain: This involves putting the raw data into a consistent format for
analytics or machine learning or deep learning models. This can include everything
from cleansing, deduplicating, and reformatting the data, to using ETL (extract,
transform, load) or other data integration technologies to combine the data into a data
warehouse, data lake, or other unified store for analysis.
 Preprocess or process: Here, data scientists examine biases, patterns, ranges, and
distributions of values within the data to determine the data’s suitability for use with
predictive analytics, machine learning, and/or deep learning algorithms (or other
analytical methods).
 Analyze: This is where the discovery happens—where data scientists perform
statistical analysis, predictive analytics, regression, machine learning and deep
learning algorithms, and more to extract insights from the prepared data.
 Communicate: Finally, the insights are presented as reports, charts, and other data
visualizations that make the insights—and their impact on the business—easier for
decision-makers to understand. A data science programming language such as R or
Python (see below) includes components for generating visualizations; alternatively,
data scientists can use dedicated visualization tools.
Data science tools

Data scientists must be able to build and run code in order to create models. The most
popular programming languages among data scientists are open source tools that
include or support pre-built statistical, machine learning and graphics capabilities. These
languages include:

 R: An open source programming language and environment for developing statistical
computing and graphics, R is the most popular programming language among data
scientists. R provides a broad variety of libraries and tools for cleansing and prepping
data, creating visualizations, and training and evaluating machine learning and deep
learning algorithms. It’s also widely used among data science scholars and
researchers.
 Python: Python is a general-purpose, object-oriented, high-level programming language that emphasizes code readability through its distinctive generous use of white space. Several Python libraries support data science tasks, including NumPy for handling large, multi-dimensional arrays, Pandas for data manipulation and analysis, and Matplotlib for building data visualizations.
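As a tiny, hypothetical sketch of those three libraries working together (NumPy to generate data, Pandas to manipulate it, Matplotlib to visualize it); the dataset is synthetic:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic daily website-visit counts for one month.
visits = pd.DataFrame({
    "day": pd.date_range("2023-01-01", periods=30),
    "visitors": np.random.default_rng(1).poisson(200, 30),
})

# Pandas analysis step: smooth the series with a 7-day rolling average.
visits["rolling_avg"] = visits["visitors"].rolling(7).mean()

# Matplotlib (via the pandas plotting API) for the visualization.
visits.plot(x="day", y=["visitors", "rolling_avg"])
plt.title("Daily website visits")
plt.show()
```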

Machine Learning
https://www.youtube.com/watch?v=ukzFI9rgwfU
Machine Learning
https://www.youtube.com/watch?v=6M5VXKLf4D4
Deep learning
https://www.youtube.com/watch?v=bfmFfD2RIcg
Artificial Neural Network

This introduction to machine learning provides an overview of its history, important definitions,
applications and concerns within businesses today.

What is machine learning?


Machine learning is a branch of artificial intelligence (AI) and computer science which
focuses on the use of data and algorithms to imitate the way that humans learn,
gradually improving its accuracy.
IBM has a rich history with machine learning. One of its own, Arthur Samuel, is credited
for coining the term, “machine learning” with his research (PDF, 481 KB) (link resides
outside IBM) around the game of checkers. Robert Nealey, the self-proclaimed
checkers master, played the game on an IBM 7094 computer in 1962, and he lost to the
computer. Compared to what can be done today, this feat almost seems trivial, but it’s
considered a major milestone within the field of artificial intelligence. Over the next couple of decades, technological developments around storage and processing power would enable some of the innovative products that we know and love today, such as Netflix’s recommendation engine or self-driving cars.

Machine learning is an important component of the growing field of data science. Through the use of statistical methods, algorithms are trained to make classifications or predictions, uncovering key insights within data mining projects. These insights subsequently drive decision making within applications and businesses, ideally impacting key growth metrics. As big data continues to expand and grow, the market demand for data scientists will increase, requiring them to assist in the identification of the most relevant business questions and subsequently the data to answer them.
Machine Learning vs. Deep Learning vs. Neural Networks

Since deep learning and machine learning tend to be used interchangeably, it’s worth noting the nuances between the two. Machine learning, deep learning, and neural networks are all sub-fields of artificial intelligence. However, neural networks are actually a sub-field of machine learning, and deep learning is a sub-field of neural networks.

The way in which deep learning and machine learning differ is in how each algorithm
learns. Deep learning automates much of the feature extraction piece of the process,
eliminating some of the manual human intervention required and enabling the use of
larger data sets. You can think of deep learning as "scalable machine learning" as Lex
Fridman notes in this MIT lecture (00:30) (link resides outside IBM). Classical, or "non-
deep", machine learning is more dependent on human intervention to learn. Human
experts determine the set of features to understand the differences between data
inputs, usually requiring more structured data to learn.

"Deep" machine learning can leverage labeled datasets, also known as supervised
learning, to inform its algorithm, but it doesn’t necessarily require a labeled dataset. It
can ingest unstructured data in its raw form (e.g. text, images), and it can automatically
determine the set of features which distinguish different categories of data from one
another. Unlike machine learning, it doesn't require human intervention to process data,
allowing us to scale machine learning in more interesting ways. Deep learning and
neural networks are primarily credited with accelerating progress in areas, such as
computer vision, natural language processing, and speech recognition.

Neural networks, or artificial neural networks (ANNs), are composed of node layers, containing an input layer, one or more hidden layers, and an output layer. Each node, or
artificial neuron, connects to another and has an associated weight and threshold. If the
output of any individual node is above the specified threshold value, that node is
activated, sending data to the next layer of the network. Otherwise, no data is passed
along to the next layer of the network. The “deep” in deep learning is just referring to the
depth of layers in a neural network. A neural network that consists of more than three
layers—which would be inclusive of the inputs and the output—can be considered a
deep learning algorithm or a deep neural network. A neural network that only has two or
three layers is just a basic neural network.
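The per-node computation described above (weighted sum, bias, threshold, activation) can be sketched in a few lines of Python; the inputs, weights, bias, and threshold below are arbitrary illustrative values:

```python
import numpy as np

def node_output(inputs, weights, bias, threshold=0.0):
    """One artificial neuron: a weighted sum of the inputs plus a bias.
    The node 'activates' only if that sum exceeds the threshold;
    otherwise nothing is passed to the next layer."""
    activation = np.dot(inputs, weights) + bias
    return activation if activation > threshold else 0.0

# Illustrative numbers only: three inputs feeding a single node.
x = np.array([0.5, 0.1, 0.9])
w = np.array([0.8, -0.4, 0.3])
print(node_output(x, w, bias=0.1))  # value sent onward, or 0.0 if inactive
```

A full network just stacks many such nodes into layers and learns the weights from data.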
10 INDUSTRIES REDEFINED BY BIG DATA
ANALYTICS
It has been a widely acknowledged fact that big data has become a big
game changer in most of the modern industries over the last few years. As
big data continues to permeate our day-to-day lives the number of different
industries that are adopting big data continues to increase. It is well said
that when new technologies become cheaper and easier to use, they have
the potential to transform industries. That is exactly what is happening
with big data right now. Here are 10 industries redefined the most by big data analytics:

Sports
Most elite sports have now embraced data analytics. In Premier League
football games, cameras installed around the stadiums track the movement
of every player with the help of pattern recognition software generating over
25 data points per player every second. What’s more, NFL players have
installed sensors on their shoulder pads to gather intelligent insights on
their performance using data mining. It was analytics which helped British
rowers row their way to the Olympic gold.

Hospitality
Hotels and the luxury industry have turned to advanced analytics solutions to understand the drivers of customer satisfaction. Yield management is one common use of analytics in the hotel industry: it is an important means of tackling the recurring peaks in demand throughout the year, taking into account other factors, such as weather and local events, that can influence the number and nationalities of guests checking in.

 
Government and Public Sector Services
Analytics, data science, and big data have helped a number of cities to pilot
the smart cities initiative where data collection, analytics and the IoT
combine to create joined-up public services and utilities spanning the entire
city. For example, a sensor network has been rolled out across all 80 of the
council’s neighborhood recycling centres to help streamline collection
services, so wagons can prioritize the fullest recycling centres and skip
those with almost nothing in them.

Energy
The costs of extracting oil and gas are rising, and the turbulent state of international politics adds to the difficulties of exploration and drilling for new reserves. In the energy industry, Royal Dutch Shell, for example, has been developing the “data-driven oilfield” in an attempt to bring down the cost of drilling for oil.

And on a smaller but no less important scale, data and the Internet of Things (IoT) are disrupting the way we use energy in our homes. The rise of “smart homes” includes technology like Google’s Nest thermostat, which helps make homes more comfortable and cuts down on energy wastage.

Agriculture and Farming


The power of AI has reached even traditional industries like agriculture and farming. Big data practices have been adopted by the US agricultural manufacturer John Deere, which has launched several data-enabled services that let farmers benefit from real-time monitoring of data collected from its thousands of users worldwide.

Education
Education sector generates massive data through courseware and learning
methodologies. Important insights can identify better teaching strategies,
highlight areas where students may not be learning efficiently, and
transform how the education is delivered. Increasingly educational
establishments have been putting data into use for everything from
planning school bus routes to improving classroom cleanliness.

Banking and Securities


The Securities and Exchange Commission (SEC) has deployed big data to track and monitor movements in the financial market. Big data and analytics, combined with network analytics and natural language processing, are used by stock exchanges to catch illegal trade practices in the stock market.
Retail traders, big banks, hedge funds and other so-called ‘big boys’ in the financial markets use big data for trade analytics in high-frequency trading, pre-trade decision-support analytics, sentiment measurement, predictive analytics, and more.

This industry also heavily relies on big data for risk analytics, including anti-money laundering, demand enterprise risk management, “Know Your Customer” checks, and fraud mitigation.

Entertainment, Communications and the Media


The on-demand music service Spotify uses Hadoop-based big data analytics to collate data from its millions of users across the world. This data is then analyzed to give customized music recommendations to individual users. Over-the-top media services have relied heavily on big data to offer customized content to their users, an important move in the increasingly competitive market.
 
Retail and Wholesale Trade
Big data has had a major impact on everyone from traditional brick-and-mortar retailers and wholesalers to current-day e-commerce traders. The retail and wholesale industry has gathered a lot of data over time, derived from POS scanners, RFID, customer loyalty cards, store inventory, local demographics, etc. Big data is applied in the retail and wholesale industry to mitigate fraud and offer customized products to the end user, thereby improving the user experience.

Transportation
Big data analytics finds huge application in the transportation industry. Governments of different countries use big data to control traffic, optimize route planning, and support intelligent transport systems and congestion management.

Moreover, the private sector uses big data in revenue management, technological enhancements and logistics, and to gain a competitive advantage.

Big data is improving user experiences, and the massive adoption change
has just begun.
https://www.youtube.com/watch?v=_XfWkCsvbEU

Excel Data Analysis Tutorial


https://www.youtube.com/watch?v=9NUjHBNWe9M

Introduction to Pivot Tables, Charts, and Dashboards in Excel (Part 1)

Data Analysis is a process of inspecting, cleaning, transforming and modeling data with the goal of discovering useful information, suggesting conclusions and supporting decision-making.

Types of Data Analysis


Several data analysis techniques exist encompassing various domains such as
business, science, social science, etc. with a variety of names. The major data analysis
approaches are −

 Data Mining
 Business Intelligence
 Statistical Analysis
 Predictive Analytics
 Text Analytics
Data Mining
Data Mining is the analysis of large quantities of data to extract previously unknown, interesting patterns, unusual data, and dependencies. Note that the goal is the extraction of patterns and knowledge from large amounts of data and not the extraction of the data itself.
Data mining analysis involves computer science methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.
The patterns obtained from data mining can be considered as a summary of the input
data that can be used in further analysis or to obtain more accurate prediction results
by a decision support system.
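
To make this concrete, here is a minimal data-mining sketch in Python. The library
(scikit-learn), the KMeans algorithm and the toy customer data are illustrative
assumptions; the text above does not prescribe any particular tool.

# Cluster raw customer records to surface previously unknown groupings.
import numpy as np
from sklearn.cluster import KMeans

# Toy dataset: each row is (age, annual income in thousands).
X = np.array([
    [23, 28], [25, 32], [24, 30],   # younger, lower-income customers
    [41, 85], [44, 90], [39, 78],   # older, higher-income customers
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)           # cluster assignment for each customer
print(model.cluster_centers_)  # a compact "pattern summary" of each group

The cluster centers act as the summary of the input data described above, and could
feed further analysis or a decision support system.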
Business Intelligence
Business Intelligence techniques and tools are used for the acquisition and
transformation of large amounts of unstructured business data, to help identify,
develop and create new strategic business opportunities.
The goal of business intelligence is to allow easy interpretation of large volumes of
data to identify new opportunities. It helps in implementing an effective strategy based
on insights that can provide businesses with a competitive market advantage and
long-term stability.
Statistical Analysis
Statistics is the study of the collection, analysis, interpretation, presentation, and
organization of data.
In data analysis, two main statistical methodologies are used, as illustrated in the
sketch after this list −
- Descriptive statistics − In descriptive statistics, data from the entire population
  or a sample is summarized with numerical descriptors such as −
  - Mean and Standard Deviation for continuous data
  - Frequency and Percentage for categorical data
- Inferential statistics − Inferential statistics uses patterns in the sample data to
  draw inferences about the population represented, accounting for randomness.
  These inferences can take the form of −
  - answering yes/no questions about the data (hypothesis testing)
  - estimating numerical characteristics of the data (estimation)
  - describing associations within the data (correlation)
  - modeling relationships within the data (e.g., regression analysis)
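The short Python sketch below applies both methodologies to one small sample; the
sample values, and the use of pandas and SciPy, are illustrative assumptions.

import pandas as pd
from scipy import stats

sample = pd.Series([48, 52, 51, 47, 53, 50, 49, 54])

# Descriptive statistics: summarize the sample with numerical descriptors.
print(sample.mean(), sample.std())

# Inferential statistics: a yes/no question about the wider population
# (hypothesis testing) - is the population mean plausibly 50?
t_stat, p_value = stats.ttest_1samp(sample, popmean=50)
print(t_stat, p_value)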

Predictive Analytics
Predictive Analytics uses statistical models to analyze current and historical data and
make forecasts (predictions) about future or otherwise unknown events. In business,
predictive analytics is used to identify risks and opportunities that aid decision-making.
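
As a minimal sketch, the Python snippet below fits a statistical model to historical
data and forecasts the next, unseen period. The monthly sales figures and the choice
of scikit-learn's LinearRegression are illustrative assumptions, not a prescribed
method.

import numpy as np
from sklearn.linear_model import LinearRegression

months = np.array([[1], [2], [3], [4], [5], [6]])  # historical periods
sales = np.array([100, 110, 118, 131, 140, 152])   # historical outcomes

model = LinearRegression().fit(months, sales)
print(model.predict([[7]]))  # forecast for month 7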
Text Analytics
Text Analytics, also referred to as Text Mining or Text Data Mining, is the process of
deriving high-quality information from text. Text mining usually involves structuring the
input text, deriving patterns within the structured data using means such as statistical
pattern learning, and finally evaluating and interpreting the output.
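A minimal sketch of the structuring step, assuming scikit-learn and three toy
documents (neither is prescribed by the text):

from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "shipping was slow but support was helpful",
    "great product, fast shipping",
    "support never replied, slow refund",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(docs)    # rows: documents, columns: terms
print(vectorizer.get_feature_names_out())  # the structured vocabulary
print(matrix.shape)

Once the free text is structured as a numeric matrix, the statistical pattern-learning
techniques described above can be applied to it.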
Data Analysis Process
The statistician John Tukey defined data analysis in 1961 as “procedures for
analyzing data, techniques for interpreting the results of such procedures, ways of
planning the gathering of data to make its analysis easier, more precise or more
accurate, and all the machinery and results of (mathematical) statistics which apply to
analyzing data.”
Thus, data analysis is a process for obtaining large, unstructured data from various
sources and converting it into information that is useful for −

- Answering questions
- Testing hypotheses
- Supporting decision-making
- Disproving theories

Data Analysis with Excel


Microsoft Excel provides several ways to analyze and interpret data. The data can
come from various sources, and can be converted and formatted in several ways. It
can be analyzed with the relevant Excel commands, functions and tools -
encompassing Conditional Formatting, Ranges, Tables, Text functions, Date functions,
Time functions, Financial functions, Subtotals, Quick Analysis, Formula Auditing, the
Inquire Tool, What-if Analysis, Solver, the Data Model, Power Pivot, Power View,
Power Map, etc.
You will learn these data analysis techniques with Excel in two parts −

- Data Analysis with Excel and
- Advanced Data Analysis with Excel
The Data Analysis Process consists of the following phases, which are iterative in
nature −

- Data Requirements Specification
- Data Collection
- Data Processing
- Data Cleaning
- Data Analysis
- Communication
Data Requirements Specification
The data required for analysis is based on a question or an experiment. Based on the
requirements of those directing the analysis, the data necessary as inputs to the
analysis is identified (e.g., a population of people). Specific variables regarding that
population (e.g., age and income) may be specified and obtained. Data may be
numerical or categorical.

Data Collection
Data Collection is the process of gathering information on the targeted variables
identified as data requirements. The emphasis is on ensuring accurate and honest
collection, so that decisions based on the data are valid. Data Collection provides
both a baseline to measure against and a target to improve upon.
Data is collected from sources ranging from organizational databases to information
in web pages. The data thus obtained may not be structured and may contain
irrelevant information. Hence, the collected data needs to be subjected to Data
Processing and Data Cleaning.

Data Processing
The data that is collected must be processed or organized for analysis. This includes
structuring the data as required for the relevant Analysis Tools. For example, the data
might have to be placed into rows and columns in a table within a Spreadsheet or
Statistical Application. A Data Model might have to be created.

Data Cleaning
The processed and organized data may be incomplete, contain duplicates, or contain
errors. Data Cleaning is the process of preventing and correcting these errors. There
are several types of Data Cleaning, depending on the type of data. For example, while
cleaning financial data, certain totals might be compared against reliable published
numbers or defined thresholds. Likewise, quantitative methods can be used to detect
outliers, which would subsequently be excluded from the analysis.
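
A minimal cleaning sketch in pandas; the column names, the duplicate row and the
plausibility threshold of 1000 are illustrative assumptions.

import pandas as pd

df = pd.DataFrame({
    "invoice": [1, 1, 2, 3, 4],
    "total":   [120.0, 120.0, 95.0, 101.5, 9999.0],
})

df = df.drop_duplicates()                    # remove repeated records
df_clean = df[df["total"].between(0, 1000)]  # compare against a defined threshold
print(df_clean)                              # the outlier invoice is excluded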

Data Analysis
Data that has been processed, organized and cleaned is ready for analysis. Various
data analysis techniques are available to understand, interpret, and derive
conclusions based on the requirements. Data Visualization may also be used to
examine the data in graphical format and obtain additional insight into the messages
within the data.
Statistical data models such as correlation and regression analysis can be used to
identify relationships among the data variables. These descriptive models help
simplify the analysis and communicate its results.
The process might require additional Data Cleaning or additional Data Collection;
hence these activities are iterative in nature.

Communication
The results of the data analysis are reported in the format required by the users to
support their decisions and further action. The feedback from the users might result
in additional analysis.
Data analysts can choose data visualization techniques, such as tables and charts,
which help communicate the message clearly and efficiently to the users. Analysis
tools provide facilities to highlight the required information with color codes and
formatting in tables and charts.

Ranges and Tables


The data that you have can be in a range or in a table. Certain operations on data can
be performed whether the data is in a range or in a table. However, certain operations
are more effective when data is in tables rather than in ranges, and some operations
are exclusively for tables.
You will learn ways of analyzing data in both ranges and tables. You will also learn
how to name ranges, use the names, and manage the names. The same applies to
names in tables.
Data Cleaning – Text Functions, Dates and Times
You need to clean the data obtained from various sources and structure it before
proceeding to data analysis. You will learn how to clean the data (a sketch follows
this list) −

- With Text Functions
- Containing Date Values
- Containing Time Values
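
A minimal sketch of the same cleaning steps in pandas, as a loose analogue of
Excel's text, date and time functions; the column names and values are illustrative
assumptions.

import pandas as pd

df = pd.DataFrame({
    "name":    ["  ALICE ", "bob", " Carol "],
    "ordered": ["2020-01-05 09:30", "2020-01-06 14:00", "2020-01-07 08:15"],
})

df["name"] = df["name"].str.strip().str.title()  # text cleanup
df["ordered"] = pd.to_datetime(df["ordered"])    # parse the date values
df["hour"] = df["ordered"].dt.hour               # extract the time component
print(df)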

Conditional Formatting
Excel provides conditional formatting commands that allow you to color cells or fonts,
or place symbols next to values in cells, based on predefined criteria. This helps in
visualizing the prominent values. You will learn the various commands for
conditionally formatting the cells.
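
As a loose analogue outside Excel, pandas offers a Styler that applies formatting
based on criteria when a table is rendered (in a notebook or as HTML). The data and
the highlight-the-maximum rule are illustrative assumptions.

import pandas as pd

df = pd.DataFrame({"north": [120, 95, 130], "south": [88, 140, 101]})
styled = df.style.highlight_max(axis=0, color="yellow")  # flag prominent values
styled.to_html("report.html")  # or display `styled` directly in a notebook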

Sorting and Filtering


During the preparation of data analysis and/or to display certain important data, you
might have to sort and/or filter your data. You can do so with the easy-to-use sorting
and filtering options in Excel.
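
For comparison, the same two operations in pandas (an illustrative analogue; the
column names are assumptions):

import pandas as pd

df = pd.DataFrame({
    "region": ["North", "South", "North", "East"],
    "sales":  [120, 140, 95, 110],
})

filtered = df[df["region"] == "North"]                 # filter rows
print(filtered.sort_values("sales", ascending=False))  # then sort them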

Subtotals with Ranges


As you are aware, PivotTables are normally used to summarize data. However,
Subtotals with Ranges is another feature provided by Excel that allows you to group /
ungroup data and summarize the data present in ranges in easy steps.
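
A minimal pandas analogue of grouping and subtotaling (illustrative; the data is an
assumption):

import pandas as pd

df = pd.DataFrame({
    "category": ["Fruit", "Fruit", "Veg", "Veg"],
    "amount":   [30, 45, 20, 25],
})

print(df.groupby("category")["amount"].sum())  # one subtotal per group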

Quick Analysis
With the Quick Analysis tool in Excel, you can quickly perform various data analysis
tasks and create quick visualizations of the results.

Understanding Lookup Functions


Excel Lookup Functions enable you to find data values that match defined criteria
within a huge amount of data.
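
A minimal lookup sketch in pandas, loosely analogous to Excel's VLOOKUP: pull the
matching value from a reference table. The table contents are illustrative
assumptions.

import pandas as pd

orders = pd.DataFrame({"sku": ["A1", "B2"], "qty": [3, 5]})
prices = pd.DataFrame({"sku": ["A1", "B2"], "price": [9.99, 4.50]})

merged = orders.merge(prices, on="sku", how="left")  # the "lookup" step
merged["total"] = merged["qty"] * merged["price"]
print(merged)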

PivotTables
With PivotTables, you can summarize data and prepare reports dynamically by
changing the contents of the PivotTable.
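
A minimal pivot-table sketch in pandas, loosely analogous to an Excel PivotTable;
the sales data is an illustrative assumption.

import pandas as pd

df = pd.DataFrame({
    "region":  ["North", "North", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "sales":   [120, 130, 140, 155],
})

print(pd.pivot_table(df, values="sales", index="region",
                     columns="quarter", aggfunc="sum"))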

Data Visualization
You will learn several data visualization techniques using Excel charts. You will also
learn how to create Band Charts, Thermometer Charts, Gantt Charts, Waterfall Charts,
Sparklines and PivotCharts.
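
A minimal charting sketch with matplotlib, as an illustrative stand-in for the Excel
charts named above; the monthly figures are assumptions.

import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
sales = [100, 115, 108, 130]

plt.plot(months, sales, marker="o")  # a simple line chart
plt.title("Monthly Sales")
plt.ylabel("Units")
plt.savefig("sales.png")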
Data Validation
It might be required that only valid values be entered into certain cells; otherwise,
invalid entries may lead to incorrect calculations. With data validation commands, you
can easily set up validation criteria for a cell, show an input message telling the user
what is expected in the cell, validate the values entered against the defined criteria,
and display an error message in case of incorrect entries.
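
A minimal validation sketch in pandas, as a loose analogue of Excel's data
validation: check entries against defined criteria and report the offending rows. The
0-120 age range is an illustrative assumption.

import pandas as pd

df = pd.DataFrame({"age": [34, -2, 28, 250]})

valid = df["age"].between(0, 120)  # the defined criteria
print(df[~valid])                  # the incorrect entries to flag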

Financial Analysis
Excel provides several financial functions. However, for commonly occurring
problems that require financial analysis, you can learn how to use a combination of
these functions.
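
As an illustration outside Excel, the numpy-financial package mirrors financial
functions such as PMT; the loan figures below are assumptions.

import numpy_financial as npf

# Monthly payment on a 200,000 loan at 5% annual interest over 30 years.
payment = npf.pmt(rate=0.05 / 12, nper=12 * 30, pv=200_000)
print(round(payment, 2))  # negative, since it is a cash outflow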

Working with Multiple Worksheets


You might have to perform several identical calculations in more than one worksheet.
Instead of repeating these calculations in each worksheet, you can perform them in
one worksheet and have them appear in the other selected worksheets as well. You
can also summarize the data from various worksheets into a report worksheet.
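
A minimal multi-worksheet sketch in pandas: read every sheet of a workbook and
summarize them in one report. The file name and sheet layout are illustrative
assumptions (reading .xlsx files also requires the openpyxl package).

import pandas as pd

sheets = pd.read_excel("regional_sales.xlsx", sheet_name=None)  # one frame per sheet
combined = pd.concat(sheets, names=["worksheet"])
print(combined.groupby(level="worksheet").sum(numeric_only=True))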

Formula Auditing
When you use formulas, you might want to check whether they are working as
expected. In Excel, Formula Auditing commands help you trace precedent and
dependent values and perform error checking.

Inquire
Excel also provides the Inquire add-in, which enables you to compare two workbooks
to identify changes, create interactive reports, and view the relationships among
workbooks, worksheets, and cells. You can also clean excessive formatting in a
worksheet, which can make Excel slow or make the file size huge.
Gartner Top 10 Trends in Data and Analytics for 2020

Trend 1: Smarter, faster, more responsible AI

By the end of 2024, 75% of enterprises will shift from piloting to
operationalizing AI, driving a 5X increase in streaming data and analytics
infrastructures.

Within the current pandemic context, AI techniques such as machine learning (ML),
optimization and natural language processing (NLP) are providing vital insights and
predictions about the spread of the virus and the effectiveness and impact of
countermeasures. AI and machine learning are critical in realigning supply and the
supply chain to new demand patterns.

“Pre-COVID models based on historical data may no longer be valid”
AI techniques such as reinforcement learning and distributed learning are creating
more adaptable and flexible systems to handle complex business situations; for
example, agent-based systems can model and simulate complex systems -
particularly now, when pre-COVID models based on historical data may no longer
be valid.

Significant investments made in new chip architectures such as neuromorphic
hardware that can be deployed on edge devices are accelerating AI and ML
computations and workloads and reducing reliance on centralized systems that
require high bandwidths. Eventually, this could lead to more scalable AI solutions
that have higher business impact.

Responsible AI that enables model transparency is essential to protect against poor
decisions. It results in better human-machine collaboration and trust for greater
adoption and alignment of decisions throughout the organization.

Trend 2: Decline of the dashboard


Dynamic data stories with more automated and consumerized experiences
will replace visual, point-and-click authoring and exploration. As a result, the
amount of time users spend using predefined dashboards will decline. The
shift to in-context data stories means that the most relevant insights will
stream to each user based on their context, role or use. These dynamic
insights leverage technologies such as augmented analytics, NLP, streaming
anomaly detection and collaboration.

Data and analytics leaders need to regularly evaluate their existing analytics
and business intelligence (BI) tools and innovative startups offering new
augmented and NLP-driven user experiences beyond the predefined
dashboard.

Trend 3: Decision intelligence


By 2023, more than 33% of large organizations will have analysts practicing
decision intelligence, including decision modeling.

Decision intelligence brings together a number of disciplines, including decision
management and decision support. It encompasses applications in the field of
complex adaptive systems that bring together multiple traditional and advanced
disciplines.

It provides a framework to help data and analytics leaders design, compose, model,
align, execute, monitor and tune decision models and processes in the context of
business outcomes and behavior.

Explore using decision management and modeling technology when decisions need
multiple logical and mathematical techniques, must be automated or semi-automated,
or must be documented and audited.
Trend 4: X analytics
Gartner coined “X analytics” as an umbrella term, where X is the data variable
standing for a range of different structured and unstructured content, as in text
analytics, video analytics, audio analytics, etc.

Data and analytics leaders use X analytics to solve society’s toughest challenges,
including climate change, disease prevention and wildlife protection.

During the pandemic, AI has been critical in combing through thousands of research
papers, news sources, social media posts and clinical trials data to help medical and
public health experts predict disease spread, plan capacity, find new treatments and
identify vulnerable populations. X analytics combined with AI and other techniques
such as graph analytics (another top trend) will play a key role in identifying,
predicting and planning for natural disasters and other business crises and
opportunities in the future.

Data and analytics leaders should explore X analytics capabilities available from
their existing vendors, such as cloud vendors for image, video and voice analytics,
but recognize that innovation will likely come from small disruptive startups and
cloud providers.
Trend 5: Augmented data management
Augmented data management uses ML and AI techniques to optimize and improve
operations. It also converts metadata from being used in auditing, lineage and
reporting to powering dynamic systems.

Augmented data management products can examine large samples of operational
data, including actual queries, performance data and schemas. Using the existing
usage and workload data, an augmented engine can tune operations and optimize
configuration, security and performance.

Data and analytics leaders should look for augmented data management
enabling active metadata to simplify and consolidate their architectures, and
also increase automation in their redundant data management tasks.

Trend 6: Cloud is a given


By 2022, public cloud services will be essential for 90% of data and analytics
innovation.

As data and analytics moves to the cloud, data and analytics leaders still struggle to
align the right services to the right use cases, which leads to unnecessarily increased
governance and integration overhead.

The question for data and analytics is moving from how much a given service
costs to how it can meet the workload’s performance requirements beyond the
list price.
Data and analytics leaders need to prioritize workloads that can exploit cloud
capabilities and focus on cost optimization and other benefits such as change
and innovation acceleration when moving to cloud.

Trend 7: Data and analytics worlds collide


Data and analytics have traditionally been considered distinct capabilities and
managed accordingly. Vendors offering end-to-end workflows enabled by augmented
analytics blur the distinction between the once separate markets.

The collision of data and analytics will increase interaction and collaboration
between historically separate data and analytics roles. This impacts not only
the technologies and capabilities provided, but also the people and processes
that support and use them. The spectrum of roles will extend from traditional data
and analytics roles in IT to information explorer, consumer and citizen developer,
for example.

To turn the collision into a constructive convergence, incorporate both data and
analytics tools and capabilities into the analytics stack. Beyond tools, focus on
people and processes to foster communication and collaboration. Leverage data and
analytics ecosystems enabled by an augmented approach that have the potential to
deliver coherent stacks.
Trend 8: Data marketplaces and exchanges
By 2022, 35% of large organizations will be either sellers or buyers of data via
formal online data marketplaces, up from 25% in 2020.

Data marketplaces and exchanges provide single platforms to consolidate third-party
data offerings. These marketplaces and exchanges provide centralized availability
and access (to X analytics and other unique data sets, for example) that create
economies of scale to reduce costs for third-party data.

To monetize data assets through data marketplaces, data and analytics leaders
should establish a fair and transparent methodology by defining a data governance
principle that ecosystem partners can rely on.

Trend 9: Blockchain in data and analytics


Blockchain technologies address two challenges in data and analytics.
First, blockchain provides the full lineage of assets and transactions. Second,
blockchain provides transparency for complex networks of participants.

Outside of limited bitcoin and smart contract use cases, ledger database
management systems (DBMSs) will provide a more attractive option for
single-enterprise auditing of data sources. By 2021, Gartner estimates that
most permissioned blockchain uses will be replaced by ledger DBMS
products.
Data and analytics leaders should position blockchain technologies as
supplementary to their existing data management infrastructure by highlighting the
capabilities mismatch between data management infrastructure and blockchain
technologies.

Trend 10: Relationships form the foundation of data and analytics value
By 2023, graph technologies will facilitate rapid contextualization for decision
making in 30% of organizations worldwide. Graph analytics is a set of analytic
techniques that allows for the exploration of relationships between entities of
interest such as organizations, people and transactions.

It helps data and analytics leaders find unknown relationships in data and
review data not easily analyzed with traditional analytics.

For example, as the world scrambles to respond to current and future pandemics,
graph technologies can relate entities across everything from geospatial data on
people’s phones to facial-recognition systems that can analyze photos to determine
who might have come into contact with individuals who later tested positive for the
coronavirus.
“Consider investigating how graph algorithms and technologies can improve your AI
and ML initiatives”
When combined with ML algorithms, these technologies can be used to comb
through thousands of data sources and documents that could help medical
and public health experts rapidly discover new possible treatments or factors
that contribute to more negative outcomes for some patients.

Data and analytics leaders need to evaluate opportunities to incorporate graph
analytics into their analytics portfolios and applications to uncover hidden patterns
and relationships. In addition, consider investigating how graph algorithms and
technologies can improve your AI and ML initiatives.
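
A minimal graph-analytics sketch with the networkx library (an illustrative
assumption; Gartner does not prescribe a tool), modeling contacts between entities
of interest:

import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Alice", "Bob"), ("Bob", "Clinic"), ("Carol", "Clinic"), ("Carol", "Dave"),
])

# Entities within two hops of a person who tested positive (assumed: Alice).
print(nx.single_source_shortest_path_length(G, "Alice", cutoff=2))
# Which entities are most central to the network?
print(nx.degree_centrality(G))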

https://www.gartner.com/smarterwithgartner/gartner-top-10-trends-in-data-and-analytics-for-2020
