What Is Data Analytics?
For businesses, the data they use may include historical data or new
information they collect for a particular initiative. They may also
collect it first-hand from their customers and site visitors or purchase it
from other organizations. Data a company collects about its own
customers is called first-party data, data a company obtains from a
known organization that collected it is called second-party data, and
aggregated data a company buys from a marketplace is called third-
party data. The data a company uses may include information about
an audience’s demographics, their interests, behaviors and more.
Using this data effectively means less money wasted as well as improved results from your
campaigns and content strategies. In addition to reducing your costs, analytics can also
boost your revenue through increased conversions, ad revenue, or subscriptions.
What Insights Can You Gain From Data Analytics?
This and other types of data can reveal information about customer
affinities — expressed or suggested interest in activities, products,
brands and topics. A customer may express interest in your brand by
signing up for your email list. They may also indirectly express interest
in a topic by reading about it on your website. They may express
interest in a product by clicking on one of your ads for it. Some other
potential sources of customer affinity data include survey responses,
social media likes and video views.
You can then use this information to predict the behaviors of various
types of users and target your ads and content more effectively.
Data Analytics Technology
Data analytics is nothing new. Today, though, the growing volume of
data and the advanced analytics technologies available mean you can
get much deeper data insights more quickly. The insights that big data
and modern technologies make possible are more accurate and more
detailed. In addition to using data to inform future decisions, you can
also use current data to make immediate decisions.
For example, let’s say you publish a site that features videos
about sports. As people visit your site, you could collect data about
which videos different visitors watch as well as how highly they rate
the videos, which ones they comment on and more. You could also
gather information about the demographics of each user. You can use
data analytics tools to determine which audience segments are most
likely to watch certain videos. You can then suggest videos to people
based on the segments they fit into best. For example, you might find
that older men are most likely to be interested in golf, while younger
men are most likely to be interested in basketball.
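As a rough illustration of that kind of segment analysis, the hypothetical sketch below tallies views by video category and age group with pandas; the column names and data are invented for demonstration only.

```python
# Hypothetical sketch: tally views by video category and audience age group.
import pandas as pd

# Assumed viewing log: one row per video view, with simple viewer demographics.
views = pd.DataFrame({
    "age_group": ["18-34", "18-34", "35-54", "55+", "55+", "35-54"],
    "category":  ["basketball", "basketball", "golf", "golf", "golf", "tennis"],
})

# View counts per category and segment; the largest cell in each row suggests
# which segment to target when recommending that category of video.
counts = views.groupby(["category", "age_group"]).size().unstack(fill_value=0)
print(counts)
```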
For some real-life examples of how Lotame’s data analytics tools have
helped businesses drive improved results, check out our case studies.
You also need to ensure data quality so your results are accurate. In
addition, your data needs to be accessible and not siloed, so everyone
throughout your organization works from the same repository.
4 Types of Data in Statistics
Introduction
Data types are an important concept in statistics: they enable us to apply
statistical measurements correctly to data and help us draw valid conclusions
about it.
A solid understanding of the various data types is essential for doing
Exploratory Data Analysis (EDA), since you can use certain statistical
measurements only for particular data types.
Similarly, you need to know which type of data you are working with to select
the correct visualization technique. You can think of data types as a way to
categorize different kinds of variables.
Broadly speaking, there are only two classes of data in statistics: Qualitative
and Quantitative data. After that, there is a subdivision that breaks them into
4 types of data. Data types are like a guide for doing the whole study of
statistics correctly!
This blog gives you an overview of the different types of data you need to know for
performing proper exploratory data analysis.
Qualitative and Quantitative Data
Qualitative data is information that cannot be measured in the form of numbers.
It is also known as categorical data. It normally consists of words and
narratives, which we label with names.
It conveys information about the qualities of things in the data. The outcome of
qualitative data analysis can take the form of highlighted key words,
extracted information, and elaborated ideas.
For example:
Hair colour- black, brown, red
Opinion- agree, disagree, neutral
On the other side, quantitative data is information gathered from a
group of individuals that lends itself to statistical data analysis. Numerical data is
another name for quantitative data. Simply put, it gives information about the
quantities of items in the data, and those items can be measured and
expressed in terms of numbers.
For example:
We can measure the height (1.70 meters), distance (1.35 miles) with
the help of a ruler or tape.
We can measure water (1.5 litres) with a jug.
Under this subdivision, nominal data and ordinal data fall under qualitative
data, while interval data and ratio data fall under quantitative data. Below, we
look at each of these data types in detail.
Different Types of Data
1. Nominal Data
Nominal data are used to label variables that have no quantitative
value and no order. So, if you change the order of the values, the
meaning remains the same.
Thus, nominal data are observed but not measured, are unordered and
non-equidistant, and have no meaningful zero.
The only comparison you can make with nominal data is to state
that one observation is (or is not) equal to another (equality or inequality), and
you can use this to group them.
You cannot order nominal data, so you cannot sort them.
Nor can you perform any arithmetic on them, as that is reserved
for numerical data. With nominal data, you can calculate frequencies,
proportions, percentages, and a central point (the mode).
Examples of Nominal data:
What languages do you speak?
English
German
French
Punjabi
What’s your nationality?
American
Indian
Japanese
German
You can clearly see that in these examples of nominal data the
categories have no order.
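As a minimal sketch (with made-up responses), the only summaries that make sense for nominal data like these are counts, proportions, and the mode:

```python
# Hypothetical sketch: valid summaries for nominal data are counts,
# proportions, and the mode -- order and arithmetic are undefined.
import pandas as pd

languages = pd.Series(["English", "German", "French", "English", "Punjabi", "English"])

print(languages.value_counts())                 # frequencies
print(languages.value_counts(normalize=True))   # proportions
print(languages.mode()[0])                      # central point (the mode)
```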
2. Ordinal Data
Ordinal data are almost the same as nominal data, except that their categories
can be ordered, like 1st, 2nd, etc. However, the relative distances between
adjacent categories are not uniform.
Ordinal data are observed but not measured, are ordered but non-equidistant,
and have no meaningful zero. Ordinal scales are commonly used
for measuring happiness, satisfaction, and the like.
With ordinal data, as with nominal data, you can group the
information by evaluating whether observations are equal or different.
Because ordinal data are ordered, they can also be arranged by making basic
comparisons between the categories, for example, greater or less than,
higher or lower, and so on.
You still cannot do arithmetic with ordinal data, however, because the
categories are not truly numerical.
With ordinal data, you can calculate the same things as with nominal data, like
frequencies, proportions, percentages, and a central point, and in addition
some summary statistics and, similarly, Bayesian statistics apply.
Examples of Ordinal data:
Opinion
o Agree
o Disagree
o Mostly agree
o Neutral
o Mostly disagree
Time of day
o Morning
o Noon
o Night
In these examples, there is an obvious order to the categories.
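A hedged sketch of working with ordinal data in pandas follows; the opinion responses are invented, and encoding them as an ordered categorical allows comparison and sorting but still no arithmetic on the gaps between categories:

```python
# Hypothetical sketch: an ordered categorical supports comparison and sorting,
# but distances between categories remain undefined.
import pandas as pd

opinions = pd.Categorical(
    ["Agree", "Neutral", "Mostly agree", "Disagree", "Agree"],
    categories=["Disagree", "Mostly disagree", "Neutral", "Mostly agree", "Agree"],
    ordered=True,
)
s = pd.Series(opinions)

print(s.value_counts(sort=False))   # frequencies, listed in category order
print(s.min(), s.max())             # ordering comparisons are allowed
```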
3. Interval Data
Interval data are measured and ordered with equidistant items but have
no meaningful zero.
The key point about an interval scale is that the word 'interval' means
'space in between': interval scales tell us not only about the order of
items but also about the values between them.
Interval data can be negative, though ratio data can't.
Even though interval data can appear very similar to ratio
data, the difference lies in their defined zero points. If the
zero point of the scale has been chosen arbitrarily, then the
data cannot be ratio data and must be interval data.
Hence, with interval data you can readily compare degrees of the data,
and you can also add or subtract values.
The descriptive statistics you can calculate for interval data are the
central point (mean, median, mode), the range (minimum, maximum), and the
spread (percentiles, interquartile range, and standard deviation).
In addition, other similar statistical data analysis techniques can
be used for further analysis.
Examples of Interval data:
Temperature (°C or F, but not Kelvin)
Dates (1066, 1492, 1776, etc.)
Time interval on a 12-hour clock (6 am, 6 pm)
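For instance, a small sketch with assumed Celsius readings shows the statistics that are valid for interval data, and why ratios are not:

```python
# Hypothetical sketch: interval data (Celsius temperatures) support means,
# medians, and standard deviations, but ratios are meaningless because 0 C
# is an arbitrary zero point, not "no temperature".
import statistics

temps_c = [-5.0, 0.0, 12.5, 21.0, 30.5]

print(statistics.mean(temps_c))      # central point
print(statistics.median(temps_c))
print(statistics.stdev(temps_c))     # spread
print(max(temps_c) - min(temps_c))   # range; differences are meaningful
# Note: claiming 30.5 C is "twice as hot" as 15.25 C would be invalid.
```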
4. Ratio Data
Ratio data are measured and ordered with equidistant items and a
meaningful zero, and, unlike interval data, they can never be negative.
An outstanding example of ratio data is the measurement of height. It
could be measured in centimetres, inches, meters, or feet, and it is not
possible to have a negative height.
Ratio data tell us about the order of variables and the differences
between them, and they have an absolute zero. This permits a wide range of
calculations and inferences to be performed and drawn.
Ratio data are fundamentally the same as interval data, except that zero
means none.
The descriptive statistics you can calculate for ratio data are the
same as for interval data: the central point (mean, median, mode),
range (minimum, maximum), and spread (percentiles, interquartile range,
and standard deviation).
Examples of Ratio data:
Age (from 0 years to 100+)
Temperature (in Kelvin, but not °C or F)
Distance (measured with a ruler or any other assessing device)
Time interval (measured with a stop-watch or similar)
Therefore, for these examples of ratio data, there is an actual, meaningful
zero point: a person's age, absolute temperature, distance calculated
from a specified point, and elapsed time all have real zeros.
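A parallel sketch with assumed heights shows that ratio data support everything interval data do, plus meaningful ratios:

```python
# Hypothetical sketch: ratio data (heights in metres) have a true zero, so in
# addition to the interval-data statistics, ratios between values are meaningful.
import statistics

heights_m = [1.52, 1.60, 1.70, 1.75, 1.88]

print(statistics.mean(heights_m), statistics.median(heights_m))
print(statistics.stdev(heights_m))
print(heights_m[-1] / heights_m[0])  # valid: the tallest is about 1.24x the shortest
```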
Key Takeaways
We hope you now understand the 4 types of data in statistics and their
importance. With this, you can work out how to handle data correctly, which statistical
hypothesis tests you can use, and what you can calculate with them.
Moreover,
Nominal data and ordinal data are the types of qualitative data or
categorical data.
Interval data and ratio data are the types of quantitative data which are
also known as numerical data.
Nominal Data are not measured but observed and they are unordered,
non-equidistant, and also have no meaningful zero.
Ordinal Data is also not measured but observed and they are ordered
however non-equidistant and have no meaningful zero.
Interval Data are measured and ordered with equidistant items yet have
no meaningful zero.
Ratio Data are also measured and ordered with equidistant items and a
meaningful zero.
What is data visualization? Presenting data for decision-making
Data visualization definition
Data visualization is the presentation of data in a graphical format. It reduces the “noise”
of data by presenting it visually, making it easier for decision makers to see and
understand trends, outliers, and patterns in data.
Maps and charts were among the earliest forms of data visualization. One of the most
well-known early examples of data visualization was a flow map created by French civil
engineer Charles Joseph Minard in 1869 to help understand what Napoleon’s troops
suffered in the disastrous Russian campaign of 1812. The map used two dimensions to
depict the number of troops, distance, temperature, latitude and longitude, direction of
travel, and location relative to specific dates.
The business value of data visualization
Data visualization helps people analyze data quickly and efficiently. By providing easy-
to-understand visual representations of data, it helps employees make more informed
decisions based on that data. Presenting data in visual form can make it easier to
comprehend, enabling people to obtain insights more quickly. Visualizations can also
make it easier to communicate those insights. Visual representations of data can also
make it easier to see how independent variables relate to one another. This can help
you see trends, understand the frequency of events, and track connections between
operations and performance, for example.
Exploration: Exploration visualizations help you understand what the data is telling you.
Explanation: Explanation visualizations tell a story to an audience using data.
2D area
These are typically geospatial visualizations. For example, cartograms use distortions of
maps to convey information such as population or travel time. Choropleths use shades
or patterns on a map to represent a statistical variable, such as population density by
state.
Temporal
These are one-dimensional linear visualizations that have a start and finish time.
Examples include a time series, which presents data like website visits by day or month,
and Gantt charts, which illustrate project schedules.
Multidimensional
These common visualizations present data with two or more dimensions. Examples
include pie charts, histograms, and scatter plots.
Hierarchical
These visualizations show how groups relate to each other. Tree diagrams are an
example of a hierarchical visualization that shows how larger groups encompass sets of
smaller groups.
Network
Network visualizations show how data sets are related to each other in a network. An
example is a node-link diagram, also known as a network graph, which uses nodes and
link lines to show how things are interconnected.
A dot map created by English physician John Snow in 1854 to understand the cholera
outbreak in London that year. The map used bar graphs on city blocks to indicate cholera
deaths at each household in a London neighborhood. The map showed that the worst-affected
households were all drawing water from the same well, which eventually led to the insight that
wells contaminated by sewage had caused the outbreak.
An animated age and gender demographic breakdown pyramid created by Pew
Research Center as part of its The Next America project, published in 2014. The project is filled
with innovative data visualizations. This one shows how population demographics have shifted
since the 1950s, with a pyramid of many young people at the bottom and very few older people
at the top in the 1950s to a rectangular shape in 2060.
A collection of four visualizations by Hanah Anderson and Matt Daniels of The Pudding
that illustrate gender disparity in pop culture by breaking down the scripts of 2,000 movies and
tallying spoken lines of dialogue for male and female characters. The visualizations include a
breakdown of Disney movies, the overview of 2,000 scripts, a gradient bar with which users can
search for specific movies, and a representation of age biases shown toward male and female
roles.
Domo
Domo is a cloud software company that specializes in business intelligence tools and
data visualization. It focuses on business-user deployed dashboards and ease of use.
Dundas BI
Dundas BI is a BI platform for visualizing data, building and sharing dashboards and
reports, and embedding analytics.
Infogram
Infogram is a drag-and-drop visualization tool for creating visualizations for marketing
reports, infographics, social media posts, dashboards, and more.
Microsoft Power BI
Microsoft Power BI is a business intelligence platform integrated with Microsoft Office. It
has an easy-to-use interface for making dashboards and reports.
Qlik
Qlik’s Qlik Sense features an “associative” data engine for investigating data and AI-
powered recommendations for visualizations. It is continuing to build out its open
architecture and multicloud capabilities.
Sisense
Sisense is an end-to-end analytics platform best known for embedded analytics. Many
customers use it in an OEM form.
Tableau
One of the most popular data visualization platforms on the market, Tableau is a
platform that supports accessing, preparing, analyzing, and presenting data.
Data visualization can help you draw actionable insights from massive
amounts of data in a short amount of time. Even a simple visualization, like
a bar graph, can present valuable insights in seconds. Take a look at the
example below:
Data collected from a corporate technology assessment is organized into this
colorful bar graph. By glancing at the chart, an IT manager could
immediately recognize which skills need improvement. From there, the
manager could decide to allocate resources towards training and recruiting in
these skill areas. All within a few minutes.
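The chart itself is not reproduced here, but a hypothetical sketch of that kind of bar graph, using invented skill scores and an assumed target proficiency, could look like this:

```python
# Hypothetical sketch of a skills-assessment bar graph; the skill names,
# scores, and target level are made up for illustration.
import matplotlib.pyplot as plt

skills = ["Python", "SQL", "Cloud", "Security", "DevOps"]
avg_score = [78, 85, 52, 61, 70]          # assumed average proficiency (%)
target = 75                               # assumed ideal proficiency level

plt.bar(skills, avg_score, color="steelblue")
plt.axhline(target, color="red", linestyle="--", label="Ideal proficiency")
plt.ylabel("Average proficiency (%)")
plt.title("Corporate technology skills assessment")
plt.legend()
plt.show()
```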
2 | Improve Accuracy
While big data provides decision-makers with all the information, it's not
always presented in a consumable form. Imagine for every high-impact
decision you must scroll through rows of data compiled in a spreadsheet, just
to digest all the facts. It's unreasonable, time-consuming, and confusing.
As a manager, you need to spend your time driving action, not analyzing
numbers. When it's hard to consume data, it's easy to ignore the facts
and lean on our biases. Instead of wasting valuable time analyzing rows of
data or falling back on your assumptions, use visualizations to identify
relevant information quickly.
Data visualization simplifies the information, reducing the need to fill the
gaps with your personal biases. In the bar chart above, you can easily see
a comparison of all the skills across the workforce. When you need to decide
where to allocate resources, your decision is based on factual data, not
assumptions.
3 | Simplify Communication
A decision is just words until it is carried out through people's actions. After
you make a decision, you must effectively communicate your thoughts
with the people who will carry out the subsequent steps. In the same way
that data visualization simplifies data analysis, it can also streamline and
objectify communication.
Alternatively, the manager could use the graph below to clearly communicate
why he is making this decision to the developer:
The chart clearly shows which skills do not meet the ideal proficiency levels
and by how much those skills need to improve. By presenting his message in
visual form, the manager can ensure the engineer understands why he needs
training and how he can gauge his progress. The visualization shifted the
manager's message from unclear and subjective to concise and
objective.
4 | Empower Collaboration
Great decision-making has always been a crucial skill for business leaders.
Big data can put you ahead of the competition if you can use it to produce
timely, informed decisions that deliver successful outcomes for your company.
Incorporating data visualization in the decision-making process can improve
speed, reduce inaccuracies, and enhance communication and collaboration.
How will you leverage data visualization to start making better data-driven
decisions today?
What is Data Science?
Data science defined
Data science combines multiple fields, including statistics, scientific methods, artificial
intelligence (AI), and data analysis, to extract value from data. Those who practice data science
are called data scientists, and they combine a range of skills to analyze data collected from the
web, smartphones, customers, sensors, and other sources to derive actionable insights.
Data science encompasses preparing data for analysis, including cleansing, aggregating, and
manipulating the data to perform advanced data analysis. Analytic applications and data
scientists can then review the results to uncover patterns and enable business leaders to draw
informed insights.
Data Science
Data science combines the scientific method, math and statistics, specialized programming,
advanced analytics, AI, and even storytelling to uncover and explain the business insights
buried in data.
Data preparation can involve cleansing, aggregating, and manipulating the data to be ready for
specific types of processing. Analysis requires the development and use of algorithms,
analytics, and AI models. It’s driven by software that combs through data to find patterns
within it and transform those patterns into predictions that support business decision-
making. The accuracy of these predictions must be validated through scientifically
designed tests and experiments. And the results should be shared through the skillful
use of data visualization tools that make it possible for anyone to see the patterns and
understand trends.
As a result, data scientists (as data science practitioners are called) require computer
science and pure science skills beyond those of a typical data analyst. A data scientist
must be able to do the following:
This combination of skills is rare, and it’s no surprise that data scientists are currently in
high demand. According to an IBM survey (PDF, 3.9 MB), the number of job openings in
the field continues to grow at over 5% per year, with over 60,000 forecast for 2020.
Why is data science important?
Because companies are sitting on a treasure trove of data. As modern technology has enabled the
creation and storage of increasing amounts of information, data volumes have exploded. It’s
estimated that 90 percent of the data in the world was created in the last two years. For example,
Facebook users upload 10 million photos every hour.
But this data is often just sitting in databases and data lakes, mostly untouched.
The wealth of data being collected and stored by these technologies can bring transformative
benefits to organizations and societies around the world—but only if we can interpret it. That’s
where data science comes in.
Data science reveals trends and produces insights that businesses can use to make better
decisions and create more innovative products and services. Perhaps most importantly, it enables
machine learning (ML) models to learn from the vast amounts of data being fed to them, rather
than mainly relying upon business analysts to see what they can discover from the data.
Data is the bedrock of innovation, but its value comes from the information data scientists can
glean from it, and then act upon.
What’s the difference between data science, artificial intelligence, and machine learning?
To better understand data science—and how you can harness it—it’s equally important to know
other terms related to the field, such as artificial intelligence (AI) and machine learning. Often,
you’ll find that these terms are used interchangeably, but there are nuances.
Determine customer churn by analyzing data collected from call centers, so marketing can
take action to retain them (a minimal sketch of this use case appears after this list)
Improve efficiency by analyzing traffic patterns, weather conditions, and other factors so
logistics companies can improve delivery speeds and reduce costs
Improve patient diagnoses by analyzing medical test data and reported symptoms so doctors
can diagnose diseases earlier and treat them more effectively
Optimize the supply chain by predicting when equipment will break down
Detect fraud in financial services by recognizing suspicious behaviors and anomalous actions
Improve sales by creating recommendations for customers based upon previous purchases
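To make the first of these use cases concrete, here is a minimal, hypothetical sketch of a churn model; the features, data, and model choice are invented for illustration, not a prescribed approach:

```python
# Hypothetical sketch: predicting churn from a few assumed call-center features
# with a simple logistic regression; the data below are invented.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "calls_last_month": [1, 6, 0, 9, 2, 7, 3, 8],
    "avg_wait_minutes": [2, 15, 1, 22, 4, 18, 3, 25],
    "churned":          [0, 1, 0, 1, 0, 1, 0, 1],   # 1 = customer left
})

X, y = df[["calls_last_month", "avg_wait_minutes"]], df["churned"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("Hold-out accuracy:", model.score(X_test, y_test))
```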
Many companies have made data science a priority and are investing in it heavily. In Gartner’s
recent survey of more than 3,000 CIOs, respondents ranked analytics and business intelligence as
the top differentiating technology for their organizations. The CIOs surveyed see these
technologies as the most strategic for their companies, and are investing accordingly.
How data science is conducted
The process of analyzing and acting upon data is iterative rather than linear, but this is how the
data science lifecycle typically flows for a data modeling project:
Some of the most popular notebooks are Jupyter, RStudio, and Zeppelin. Notebooks are very
useful for conducting analysis, but have their limitations when data scientists need to work as a
team. Data science platforms were built to solve this problem.
To determine which data science tool is right for you, it’s important to ask the following
questions: What kind of languages do your data scientists use? What kind of working methods
do they prefer? What kind of data sources are they using?
For example, some users prefer to have a datasource-agnostic service that uses open source
libraries. Others prefer the speed of in-database, machine learning algorithms.
The data science lifecycle—also called the data science pipeline—includes anywhere
from five to sixteen (depending on whom you ask) overlapping, continuing processes.
The processes common to just about everyone’s definition of the lifecycle include the
following:
Capture: This is the gathering of raw structured and unstructured data from all
relevant sources via just about any method—from manual entry and web scraping to
capturing data from systems and devices in real time.
Prepare and maintain: This involves putting the raw data into a consistent format for
analytics or machine learning or deep learning models. This can include everything
from cleansing, deduplicating, and reformatting the data, to using ETL (extract,
transform, load) or other data integration technologies to combine the data into a data
warehouse, data lake, or other unified store for analysis.
Preprocess or process: Here, data scientists examine biases, patterns, ranges, and
distributions of values within the data to determine the data’s suitability for use with
predictive analytics, machine learning, and/or deep learning algorithms (or other
analytical methods).
Analyze: This is where the discovery happens—where data scientists perform
statistical analysis, predictive analytics, regression, machine learning and deep
learning algorithms, and more to extract insights from the prepared data.
Communicate: Finally, the insights are presented as reports, charts, and other data
visualizations that make the insights—and their impact on the business—easier for
decision-makers to understand. A data science programming language such as R or
Python (see below) includes components for generating visualizations; alternatively,
data scientists can use dedicated visualization tools.
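As a toy illustration of those lifecycle stages end to end, the sketch below captures invented records, prepares them, analyzes them, and communicates the result with a simple chart; the data and column names are hypothetical:

```python
# Toy illustration of the lifecycle stages above on invented data:
# capture -> prepare -> analyze -> communicate.
import pandas as pd
import matplotlib.pyplot as plt

# Capture: raw records (normally gathered from systems, files, or APIs).
raw = pd.DataFrame({
    "day":    ["Mon", "Mon", "Tue", "Wed", "Wed", None],
    "visits": [120, 120, 95, None, 140, 80],
})

# Prepare and maintain: deduplicate, drop incomplete rows, fix types.
clean = raw.drop_duplicates().dropna()
clean["visits"] = clean["visits"].astype(int)

# Analyze: summarize visits per day.
by_day = clean.groupby("day")["visits"].sum()
print(by_day)

# Communicate: present the result as a simple chart.
by_day.plot(kind="bar", title="Daily website visits")
plt.show()
```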
Data science tools
Data scientists must be able to build and run code in order to create models. The most
popular programming languages among data scientists are open source tools that
include or support pre-built statistical, machine learning and graphics capabilities. These
languages include:
R: An open source programming language and environment for developing statistical
computing and graphics, R is the most popular programming language among data
scientists. R provides a broad variety of libraries and tools for cleansing and prepping
data, creating visualizations, and training and evaluating machine learning and deep
learning algorithms. It’s also widely used among data science scholars and
researchers.
Python: Python is a general-purpose, object-oriented, high-level programming
language that emphasizes code readability through its distinctive generous use of
white space. Several Python libraries support data science tasks, including NumPy for
handling large, multi-dimensional arrays, Pandas for data manipulation and analysis, and
Matplotlib for building data visualizations.
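A brief sketch of those three libraries working together on synthetic data might look like this:

```python
# Minimal sketch of NumPy, Pandas, and Matplotlib used together on synthetic data.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

values = np.random.default_rng(0).normal(loc=50, scale=10, size=500)  # NumPy array
df = pd.DataFrame({"score": values})                                  # Pandas frame

print(df["score"].describe())        # quick summary statistics
df["score"].hist(bins=20)            # Matplotlib histogram via pandas
plt.title("Distribution of scores")
plt.show()
```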
Machine Learning
https://www.youtube.com/watch?v=ukzFI9rgwfU
Machine Learning
https://www.youtube.com/watch?v=6M5VXKLf4D4
Deep learning
https://www.youtube.com/watch?v=bfmFfD2RIcg
Artificial Neural Network
This introduction to machine learning provides an overview of its history, important definitions,
applications and concerns within businesses today.
Since deep learning and machine learning tend to be used interchangeably, it’s worth
noting the nuances between the two. Machine learning, deep learning, and neural
networks are all sub-fields of artificial intelligence. However, deep learning is actually a
sub-field of machine learning, and neural networks is a sub-field of deep learning.
The way in which deep learning and machine learning differ is in how each algorithm
learns. Deep learning automates much of the feature extraction piece of the process,
eliminating some of the manual human intervention required and enabling the use of
larger data sets. You can think of deep learning as "scalable machine learning" as Lex
Fridman notes in this MIT lecture (00:30) (link resides outside IBM). Classical, or "non-
deep", machine learning is more dependent on human intervention to learn. Human
experts determine the set of features to understand the differences between data
inputs, usually requiring more structured data to learn.
"Deep" machine learning can leverage labeled datasets, also known as supervised
learning, to inform its algorithm, but it doesn’t necessarily require a labeled dataset. It
can ingest unstructured data in its raw form (e.g. text, images), and it can automatically
determine the set of features which distinguish different categories of data from one
another. Unlike machine learning, it doesn't require human intervention to process data,
allowing us to scale machine learning in more interesting ways. Deep learning and
neural networks are primarily credited with accelerating progress in areas such as
computer vision, natural language processing, and speech recognition.
Neural networks, or artificial neural networks (ANNs), are made up of node layers,
containing an input layer, one or more hidden layers, and an output layer. Each node, or
artificial neuron, connects to another and has an associated weight and threshold. If the
output of any individual node is above the specified threshold value, that node is
activated, sending data to the next layer of the network. Otherwise, no data is passed
along to the next layer of the network. The “deep” in deep learning is just referring to the
depth of layers in a neural network. A neural network that consists of more than three
layers—which would be inclusive of the inputs and the output—can be considered a
deep learning algorithm or a deep neural network. A neural network that only has two or
three layers is just a basic neural network.
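To ground the description above, here is a toy forward pass in NumPy using the threshold-style activation just described; the weights and thresholds are arbitrary illustrative values, not a production design:

```python
# Toy sketch of the forward pass described above: weighted inputs are summed at
# each node and compared with a threshold to decide whether the node "fires".
# Weights and thresholds here are arbitrary illustrative numbers.
import numpy as np

def layer(inputs, weights, thresholds):
    # Each column of `weights` feeds one node in the next layer.
    activations = inputs @ weights
    return (activations > thresholds).astype(float)   # 1.0 if the node activates

x = np.array([0.8, 0.2, 0.5])                                        # input layer
hidden = layer(x, np.random.default_rng(1).normal(size=(3, 4)), thresholds=0.0)
output = layer(hidden, np.random.default_rng(2).normal(size=(4, 1)), thresholds=0.5)
print(hidden, output)
```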
10 INDUSTRIES REDEFINED BY BIG DATA ANALYTICS
It is widely acknowledged that big data has become a big game changer in most
modern industries over the last few years. As big data continues to permeate
our day-to-day lives, the number of industries adopting it continues to
increase. It is well said that when new technologies become cheaper and easier
to use, they have the potential to transform industries. That is exactly what
is happening with big data right now. Here are 10 industries redefined the most
by big data analytics:
Sports
Most elite sports have now embraced data analytics. In Premier League
football games, cameras installed around the stadiums track the movement
of every player with the help of pattern recognition software generating over
25 data points per player every second. What’s more, NFL players have
installed sensors on their shoulder pads to gather intelligent insights on
their performance using data mining. It was analytics that helped British
rowers row their way to the Olympic gold.
Hospitality
Hotels and the luxury industry have turned to advanced analytics solutions
to understand what drives customer satisfaction. Yield management is one
common use of analytics in the hotel industry: it is an important means of
handling the recurring peaks in demand throughout the year, taking into
account other factors, such as weather and local events, that can influence
the number and nationalities of guests checking in.
Government and Public Sector Services
Analytics, data science, and big data have helped a number of cities to pilot
the smart cities initiative where data collection, analytics and the IoT
combine to create joined-up public services and utilities spanning the entire
city. For example, one council has rolled out a sensor network across all 80 of
its neighborhood recycling centres to help streamline collection
services, so wagons can prioritize the fullest recycling centres and skip
those with almost nothing in them.
Energy
The costs of extracting oil and gas are rising, and the turbulent state of
international politics adds to the difficulties of exploration and drilling for
new reserves. Energy giant Royal Dutch Shell, for example, has
been developing the “data-driven oilfield” in an attempt to bring down the
cost of drilling for oil.
And on a smaller but no less important scale, data and the Internet of
Things (IoT) are disrupting the way we use energy in our homes. The rise of
“smart homes” includes technology like Google’s Nest thermostat, which
helps make homes more comfortable and cut down on energy wastage.
Education
The education sector generates massive amounts of data through courseware and
learning methodologies. Important insights can identify better teaching
strategies, highlight areas where students may not be learning efficiently, and
transform how education is delivered. Increasingly, educational
establishments have been putting data to use for everything from
planning school bus routes to improving classroom cleanliness.
Banking and Financial Services
This industry also heavily relies on big data for risk analytics, including anti-
money laundering, demand enterprise risk management, “Know Your
Customer”, and fraud mitigation.
Transportation
Big data analytics finds huge application in the transportation industry.
Governments of different countries use big data to control traffic,
optimize route planning, build intelligent transport systems, and manage
congestion.
Big data is improving user experiences, and the massive adoption change
has just begun.
https://www.youtube.com/watch?v=_XfWkCsvbEU
Data Analysis is a process of inspecting, cleaning, transforming and modeling data with
the goal of discovering useful information, suggesting conclusions and supporting
decision-making. The major types of data analysis techniques include:
Data Mining
Business Intelligence
Statistical Analysis
Predictive Analytics
Text Analytics
Data Mining
Data Mining is the analysis of large quantities of data to extract previously unknown,
interesting patterns, unusual records, and dependencies in the data. Note that the goal is
the extraction of patterns and knowledge from large amounts of data and not the
extraction of the data itself.
Data mining analysis involves computer science methods at the intersection of
artificial intelligence, machine learning, statistics, and database systems.
The patterns obtained from data mining can be considered as a summary of the input
data that can be used in further analysis or to obtain more accurate prediction results
by a decision support system.
Business Intelligence
Business Intelligence techniques and tools are for acquisition and transformation of
large amounts of unstructured business data to help identify, develop and create new
strategic business opportunities.
The goal of business intelligence is to allow easy interpretation of large volumes of
data to identify new opportunities. It helps in implementing an effective strategy based
on insights that can provide businesses with a competitive market-advantage and long-
term stability.
Statistical Analysis
Statistics is the study of collection, analysis, interpretation, presentation, and
organization of data.
In data analysis, two main statistical methodologies are used −
Descriptive statistics − In descriptive statistics, data from the entire population
or a sample is summarized with numerical descriptors such as −
o Mean, Standard Deviation for Continuous Data
o Frequency, Percentage for Categorical Data
Inferential statistics − It uses patterns in the sample data to draw inferences
about the population represented, accounting for randomness. These
inferences can be −
o answering yes/no questions about the data (hypothesis testing)
o estimating numerical characteristics of the data (estimation)
o describing associations within the data (correlation)
o modeling relationships within the data (E.g. regression analysis)
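The short sketch below applies both methodologies to made-up measurements: descriptive summaries for each group, then a two-sample t-test as an example of answering a yes/no question about the data:

```python
# Sketch of both methodologies on invented data: descriptive summaries first,
# then an inferential test (a two-sample t-test for a yes/no question).
import numpy as np
from scipy import stats

group_a = np.array([52.1, 48.3, 50.7, 53.9, 49.5, 51.2])
group_b = np.array([55.4, 57.1, 54.8, 56.0, 58.2, 55.9])

print("Mean A:", group_a.mean(), "SD A:", group_a.std(ddof=1))   # descriptive
print("Mean B:", group_b.mean(), "SD B:", group_b.std(ddof=1))

t_stat, p_value = stats.ttest_ind(group_a, group_b)              # inferential
print("t =", t_stat, "p =", p_value)   # a small p suggests the means differ
```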
Predictive Analytics
Predictive Analytics use statistical models to analyze current and historical data for
forecasting (predictions) about future or otherwise unknown events. In business,
predictive analytics is used to identify risks and opportunities that aid in decision-
making.
Text Analytics
Text Analytics, also referred to as Text Mining or as Text Data Mining is the process of
deriving high-quality information from text. Text mining usually involves the process of
structuring the input text, deriving patterns within the structured data using means such
as statistical pattern learning, and finally evaluation and interpretation of the output.
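As a minimal, hypothetical example of structuring text and deriving a simple pattern from it, the sketch below counts word frequencies across a few invented documents:

```python
# Minimal sketch of structuring raw text and deriving a simple pattern
# (word frequencies) from it; the sample documents are made up.
import re
from collections import Counter

docs = [
    "Shipping was fast and the product quality is great",
    "Great quality but shipping was slow",
    "Poor quality, will not buy again",
]

tokens = []
for doc in docs:
    tokens.extend(re.findall(r"[a-z]+", doc.lower()))   # structure the input text

print(Counter(tokens).most_common(5))   # most frequent terms across documents
```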
Data Analysis Process
Data Analysis was defined by the statistician John Tukey in 1961 as "Procedures for
analyzing data, techniques for interpreting the results of such procedures, ways of
planning the gathering of data to make its analysis easier, more precise or more
accurate, and all the machinery and results of (mathematical) statistics which apply to
analyzing data.”
Thus, data analysis is a process for obtaining large, unstructured data from various
sources and converting it into information that is useful for −
Answering questions
Testing hypotheses
Making decisions
Disproving theories
Data Collection
Data Collection is the process of gathering information on targeted variables identified
as data requirements. The emphasis is on ensuring accurate and honest collection of
data. Data Collection ensures that data gathered is accurate such that the related
decisions are valid. Data Collection provides both a baseline to measure and a target
to improve.
Data is collected from various sources ranging from organizational databases to the
information in web pages. The data thus obtained, may not be structured and may
contain irrelevant information. Hence, the collected data is required to be subjected to
Data Processing and Data Cleaning.
Data Processing
The data that is collected must be processed or organized for analysis. This includes
structuring the data as required for the relevant Analysis Tools. For example, the data
might have to be placed into rows and columns in a table within a Spreadsheet or
Statistical Application. A Data Model might have to be created.
Data Cleaning
The processed and organized data may be incomplete, contain duplicates, or contain
errors. Data Cleaning is the process of preventing and correcting these errors. There
are several types of Data Cleaning that depend on the type of data. For example, while
cleaning the financial data, certain totals might be compared against reliable published
numbers or defined thresholds. Likewise, quantitative data methods can be used for
outlier detection that would be subsequently excluded in analysis.
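A small, hypothetical sketch of these cleaning steps in pandas, removing duplicates and flagging outliers with a simple interquartile-range rule (the amounts are invented):

```python
# Hypothetical sketch of the cleaning steps described above: removing
# duplicates and flagging outliers with an interquartile-range rule.
import pandas as pd

df = pd.DataFrame({"amount": [120.0, 125.0, 125.0, 118.0, 5000.0, 130.0]})
df = df.drop_duplicates()                       # remove duplicate rows

q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
mask = df["amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
print(df[~mask])      # rows flagged as outliers (here, the 5000.0 entry)
df = df[mask]         # keep only the in-range rows for analysis
```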
Data Analysis
Data that is processed, organized and cleaned would be ready for the analysis.
Various data analysis techniques are available to understand, interpret, and derive
conclusions based on the requirements. Data Visualization may also be used to
examine the data in graphical format, to obtain additional insight regarding the
messages within the data.
Statistical data models such as correlation and regression analysis can be used to
identify the relations among the data variables. These descriptive models of
the data help simplify the analysis and communicate the results.
The process might require additional Data Cleaning or additional Data Collection, and
hence these activities are iterative in nature.
Communication
The results of the data analysis are to be reported in a format as required by the users
to support their decisions and further action. The feedback from the users might result
in additional analysis.
The data analysts can choose data visualization techniques, such as tables and charts,
which help in communicating the message clearly and efficiently to the users. The
analysis tools provide facility to highlight the required information with color codes and
formatting in tables and charts.
Conditional Formatting
Excel provides conditional formatting commands that allow you to color cells or
fonts and display symbols next to values in cells, based on predefined criteria. This helps
in visualizing the prominent values. You will understand the various commands for
conditionally formatting the cells.
Quick Analysis
With Quick Analysis tool in Excel, you can quickly perform various data analysis tasks
and make quick visualizations of the results.
PivotTables
With PivotTables, you can summarize data and prepare reports dynamically by
changing the contents of the PivotTable.
Data Visualization
You will learn several Data Visualization techniques using Excel Charts. You will also
learn how to create Band Chart, Thermometer Chart, Gantt chart, Waterfall Chart,
Sparklines and PivotCharts.
Data Validation
It might be required that only valid values be entered into certain cells. Otherwise, they
may lead to incorrect calculations. With data validation commands, you can easily set
up data validation values for a cell, an input message prompting the user on what is
expected to be entered in the cell, validate the values entered with the defined criteria
and display an error message in case of incorrect entries.
Financial Analysis
Excel provides several financial functions. However, for commonly occurring
problems that require financial analysis, you can learn how to use a combination of
these functions.
Formula Auditing
When you use formulas, you might want to check whether the formulas are working as
expected. In Excel, Formula Auditing commands help you in tracing the precedent and
dependent values and error checking.
Inquire
Excel also provides an Inquire add-in that enables you to compare two workbooks to identify
changes, create interactive reports, and view the relationships among workbooks,
worksheets, and cells. You can also clean the excessive formatting in a worksheet that
makes Excel slow or makes the file size huge.
Gartner Top 10 Trends in Data and Analytics for 2020
Data and analytics leaders need to regularly evaluate their existing analytics
and business intelligence (BI) tools and innovative startups offering new
augmented and NLP-driven user experiences beyond the predefined
dashboard.
Data and analytics leaders should look for augmented data management
enabling active metadata to simplify and consolidate their architectures, and
also increase automation in their redundant data management tasks.
As data and analytics moves to the cloud, data and analytics leaders still
struggle to align the right services to the right use cases, which leads to
unnecessary increased governance and integration overhead.
The question for data and analytics is moving from how much a given service
costs to how it can meet the workload’s performance requirements beyond the
list price.
Data and analytics leaders need to prioritize workloads that can exploit cloud
capabilities and focus on cost optimization and other benefits such as change
and innovation acceleration when moving to cloud.
The collision of data and analytics will increase interaction and collaboration
between historically separate data and analytics roles. This impacts not only
the technologies and capabilities provided, but also the people and processes
that support and use them. The spectrum of roles will extend from traditional
data and analytics roles in IT to information explorer, consumer and citizen
developer as an example.
Outside of limited bitcoin and smart contract use cases, ledger database
management systems (DBMSs) will provide a more attractive option for
single-enterprise auditing of data sources. By 2021, Gartner estimates that
most permissioned blockchain uses will be replaced by ledger DBMS
products.
Data and analytics leaders should position blockchain technologies as supplementary
to their existing data management infrastructure by highlighting the
capabilities mismatch between data management infrastructure and
blockchain technologies.
It helps data and analytics leaders find unknown relationships in data and
review data not easily analyzed with traditional analytics.
https://www.gartner.com/smarterwithgartner/gartner-top-10-trends-in-data-and-analytics-for-2020