Data Analysis - Wikipedia
Data requirements
Data collection
Data processing
Data cleaning
Data product
Communication
Quantitative messages
Given some
concrete - What Kellogg's cereals have
5 Sort
General description: Given a set of data cases, rank them according to some ordinal metric.
Pro forma abstract: What is the sorted order of a set S of data cases according to their value of attribute A?
Examples:
- Order the cars by weight.
- Rank the cereals by calories.
7 Characterize Distribution
General description: Given a set of data cases and a quantitative attribute of interest, characterize the distribution of that attribute's values over the set.
Pro forma abstract: What is the distribution of values of attribute A in a set S of data cases?
Examples:
- What is the distribution of carbohydrates in cereals?
- What is the age distribution of shoppers?
8 Find Anomalies
General description: Identify any anomalies within a given set of data cases with respect to a given relationship or expectation, e.g. statistical outliers.
Pro forma abstract: Which data cases in a set S of data cases have unexpected/exceptional values?
Examples:
- Are there exceptions to the relationship between horsepower and acceleration?
- Are there any outliers in protein?
9 Cluster
General description: Given a set of data cases, find clusters of similar attribute values.
Pro forma abstract: Which data cases in a set S of data cases are similar in value for attributes {X, Y, Z, ...}?
Examples:
- Are there groups of cereals with similar fat/calories/sugar?
- Is there a cluster of typical film lengths?
10 Correlate
General description: Given a set of data cases and two attributes, determine useful relationships between the values of those attributes.
Pro forma abstract: What is the correlation between attributes X and Y over a given set S of data cases?
Examples:
- Is there a correlation between carbohydrates and fat?
- Is there a correlation between country of origin and MPG?
- Do different genders have a preferred payment method?
- Is there a trend of increasing film length over the years?
11 Contextualization[18]
General description: Given a set of data cases, find contextual relevancy of the data to the users.
Pro forma abstract: Which data cases in a set S of data cases are relevant to the current users' context?
Examples:
- Are there groups of restaurants that have foods based on my current caloric intake?
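Several of the tasks listed above (sort, characterize distribution, find anomalies, correlate) can be sketched in a few lines of Python. This is only an illustrative sketch: the cereal records are invented, the helper names (rank_by, describe, outliers, pearson) are hypothetical, and the two-standard-deviation outlier rule and Pearson coefficient are one possible choice of technique, not something the task descriptions prescribe.

```python
from math import sqrt
from statistics import mean, median, pstdev

# Hypothetical records, invented for illustration:
# (name, calories, carbohydrates g, fat g, protein g)
CEREALS = [
    ("Alpha", 110, 22, 1, 2),
    ("Bravo", 90, 18, 0, 3),
    ("Charlie", 130, 26, 2, 3),
    ("Delta", 100, 20, 1, 2),
    ("Echo", 120, 24, 2, 12),
    ("Foxtrot", 150, 30, 3, 3),
]


def column(index):
    """Return the values of one attribute across all data cases."""
    return [row[index] for row in CEREALS]


def rank_by(index):
    """Sort: names of the data cases ordered by one attribute, ascending."""
    return [row[0] for row in sorted(CEREALS, key=lambda row: row[index])]


def describe(values):
    """Characterize distribution: a few summary numbers for one attribute."""
    return {
        "min": min(values),
        "median": median(values),
        "max": max(values),
        "mean": mean(values),
    }


def outliers(index, threshold=2.0):
    """Find anomalies: names whose value lies more than `threshold`
    population standard deviations from the mean (an assumed rule)."""
    values = column(index)
    centre, spread = mean(values), pstdev(values)
    return [row[0] for row in CEREALS
            if abs(row[index] - centre) > threshold * spread]


def pearson(xs, ys):
    """Correlate: Pearson correlation coefficient of two attributes."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return covariance / sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )


print(rank_by(1))                      # rank the cereals by calories
print(describe(column(2)))             # distribution of carbohydrates
print(outliers(4))                     # outliers in protein
print(pearson(column(2), column(3)))   # carbohydrates vs. fat
```

With real data, the threshold and the choice of correlation measure would need to be justified for the attribute in question; a rank-based measure, for example, may suit ordinal attributes better.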
Cognitive biases
Innumeracy
Effective analysts are generally adept with
a variety of numerical techniques.
However, audiences may not have such
literacy with numbers, or numeracy; they
are said to be innumerate. Persons
communicating the data may also be
attempting to mislead or misinform,
deliberately using bad numerical
techniques.[21]
Other topics
Smart buildings
A data analytics approach can be used to
predict energy consumption in
buildings.[22] The different steps of the
data analysis process are carried out to
realise smart buildings, where the
building management and control
operations, including heating, ventilation,
air conditioning, lighting and security, are
realised automatically by mimicking the
needs of the building users and optimising
resources like energy and time.
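As a rough illustration of what predicting energy consumption can look like, the sketch below fits a straight line to past observations and uses it to estimate consumption at a new outdoor temperature. The article does not say which model is used; ordinary least squares on a single variable is an assumption made here for brevity, and the readings and the names fit_line and predict are hypothetical.

```python
from statistics import mean

# Hypothetical daily readings, invented for illustration:
# outdoor temperature (deg C) and energy used for heating (kWh).
TEMPERATURE = [0, 5, 10, 15, 20]
ENERGY = [200, 170, 140, 110, 80]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def predict(model, x):
    """Estimate energy use at outdoor temperature x from a fitted line."""
    intercept, slope = model
    return intercept + slope * x


model = fit_line(TEMPERATURE, ENERGY)
print(predict(model, 12))  # estimated kWh on a 12 deg C day
```

A practical system would likely draw on more inputs (occupancy, time of day, building characteristics) and validate the model on held-out data before using it for control decisions.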
Education
Practitioner notes
Quality of data
The quality of the data should be checked
as early as possible. Data quality can be
assessed in several ways, using different
types of analysis: frequency counts,
descriptive statistics (mean, standard
deviation, median) and normality
(skewness, kurtosis, frequency
histograms). Coding schemes can also be
checked: variables are compared with coding
schemes of variables external to the data
set, and possibly corrected if coding
schemes are not comparable.
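The checks named above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the function names are hypothetical, the population (not sample) standard deviation is used, and skewness and kurtosis are computed as the third and fourth standardized moments, which is one common convention among several.

```python
from collections import Counter
from statistics import mean, median, pstdev


def frequency_counts(values):
    """Frequency counts: how often each distinct value occurs."""
    return Counter(values)


def descriptive(values):
    """Descriptive statistics: mean, standard deviation, median."""
    return {"mean": mean(values), "sd": pstdev(values), "median": median(values)}


def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 4 for v in values) / len(values) - 3.0


def unexpected_codes(values, allowed):
    """Coding-scheme check: codes present in the data but absent
    from an external scheme (`allowed` is a hypothetical code list)."""
    return sorted(set(values) - set(allowed))


# Hypothetical survey column, invented for illustration.
answers = ["yes", "no", "yes", "n/a", "yes", "Y"]
print(frequency_counts(answers))
print(unexpected_codes(answers, allowed=["yes", "no", "n/a"]))

scores = [1, 1, 1, 2, 10]
print(descriptive(scores))
print(skewness(scores), excess_kurtosis(scores))
```

Both moment functions divide by the standard deviation, so they raise ZeroDivisionError for a constant column; a constant column is itself worth flagging during a quality check.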
Quality of measurements
Analysis
Several analyses can be used during the
initial data analysis phase:[34]
Stability of results
See also
Actuarial science
Analytics
Big data
Business intelligence
Censoring (statistics)
Computational physics
Data acquisition
Data blending
Data governance
Data mining
Data Presentation Architecture
Data science
Digital signal processing
Dimension reduction
Early case assessment
Exploratory data analysis
Fourier analysis
Machine learning
Multilinear PCA
Multilinear subspace learning
Multiway data analysis
Nearest neighbor search
Nonlinear system identification
Predictive analytics
Principal component analysis
Qualitative research
Scientific computing
Structured data analysis (statistics)
System identification
Test method
Text analytics
Unstructured data
Wavelet
References
Citations
Further reading
Retrieved from "https://en.wikipedia.org/w/index.php?title=Data_analysis&oldid=920438508"