Correlation Visualization of High Dimensional Data
Ignacio Díaz Blanco, Abel A. Cuadrado Vega, and Alberto B. Diez González
Abstract. Correlation analysis has always been a key technique for understanding data. However, traditional methods apply only to the data set as a whole, providing purely global information on correlations. Correlations usually have a local nature: two variables can be directly correlated at some points of a data set and inversely correlated at others. This situation arises typically in nonlinear processes. In this paper we propose a method to visualize the distribution of local correlations across the whole data set using dimension reduction mappings. The ideas are illustrated through an artificial data example.
1 Introduction
Visualization and dimension reduction techniques have received considerable attention in recent years for the analysis of large sets of multidimensional data [1–3], and particularly for the supervision and condition monitoring of complex industrial processes [4–6]. These techniques make it possible to discover unknown features and relationships of high dimensional data in a visual manner by means of a mapping from a data space D (also called the input space) onto a low dimensional visualization space V, where complex relationships among input variables can be easily represented and visualized while preserving the information significant to a given problem.
Another very useful technique when dealing with high dimensional data is correlation analysis. Correlation analysis is concerned with finding how the components x_1, ..., x_p of the sample data vectors {x_i}, i = 1, ..., n, are mutually related. The standard way to cope with this problem is through the analysis of second order statistics such as the correlation matrix R, whose coefficients r_ij ∈ [−1, 1] describe how variables x_i and x_j are related. These coefficients are the result of a normalized inner product (the cosine) between the vectors formed by the values of x_i and x_j over the whole data set and, in consequence, they provide correlation information of a global nature. However, in many cases data variables can be correlated in different ways in different regions of the data space. This is the case, for instance, of multimodal or nonlinear processes, which behave locally in different ways depending on the working point. Thus, a local description of correlation is needed.
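The global coefficient r_ij described above is just the cosine of the angle between the centered value vectors of two variables. A minimal sketch of this computation (the variable and function names are ours, not the paper's):

```python
import numpy as np

def correlation_matrix(X):
    """Global correlation matrix: normalized inner product (cosine)
    between the centered columns of X, an (n, p) array of n samples
    of p variables."""
    Xc = X - X.mean(axis=0)               # center each variable
    norms = np.linalg.norm(Xc, axis=0)    # length of each centered column
    return (Xc.T @ Xc) / np.outer(norms, norms)

# Two perfectly inversely related variables give r_xy = -1 globally,
# even when a local analysis would tell a richer story.
X = np.array([[0.0, 0.0], [1.0, -1.0], [2.0, -2.0], [3.0, -3.0]])
R = correlation_matrix(X)
```

Note that a single global coefficient like this is exactly what the local analysis of the next section refines.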
In this paper, we suggest a method to combine correlation analysis with the power of dimension reduction visualization methods, such as the Self-Organizing Map (SOM) [7] or the Generative Topographic Map (GTM) [8], making it possible to visualize the local correlations of each pair of variables x_i, x_j through so-called correlation maps defined in the visualization space. The paper is organized as follows. In section 2 the ideas of local covariance and local correlation are introduced, and a method to display the information provided by local second order statistics in the visualization space is proposed. In section 3 the proposed ideas are illustrated through a simple example. Finally, in section 4 some concluding remarks and future research lines are outlined.
2 Correlation Maps
Let {x_i} be the data vectors and let w_i(y) be weights expressing the degree to which data vector x_i belongs to the neighborhood associated with a point y of the visualization space; the width factor σ of the weighting kernel controls the degree of locality, as discussed below. The local mean and local covariance at y are defined as

\[
\mathbf{m}(y) = \frac{\sum_i \mathbf{x}_i \, w_i(y)}{\sum_i w_i(y)} \tag{1}
\]

\[
\mathbf{C}(y) = (c_{ij}) = \frac{\sum_i [\mathbf{x}_i - \mathbf{m}(y)][\mathbf{x}_i - \mathbf{m}(y)]^T \, w_i(y)}{\sum_i w_i(y)} \tag{2}
\]
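Expressions (1) and (2) can be sketched numerically as follows; the Gaussian form of the weights w_i(y) and the use of projected coordinates Y are our assumptions, since the text only names the width factor σ:

```python
import numpy as np

def local_moments(X, Y, y, sigma=1.0):
    """Local mean m(y), eq. (1), and local covariance C(y), eq. (2).

    X : (n, p) data vectors in the data space D.
    Y : (n, q) their coordinates in the visualization space V
        (e.g. SOM best-matching-unit positions) -- an assumption.
    y : query point in V; sigma is the width factor controlling locality.
    """
    # Gaussian weights in V (assumed kernel; the paper only names sigma).
    w = np.exp(-np.sum((Y - y) ** 2, axis=1) / (2.0 * sigma**2))
    m = (w[:, None] * X).sum(axis=0) / w.sum()                # eq. (1)
    D = X - m                                                 # centered data
    C = (w[:, None, None] * D[:, :, None] * D[:, None, :]).sum(axis=0) / w.sum()  # eq. (2)
    return m, C

# With equal weights the local moments reduce to the ordinary sample moments.
X = np.array([[0.0, 0.0], [2.0, 2.0]])
Y = np.zeros((2, 1))
m, C = local_moments(X, Y, np.zeros(1))
```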
Taken independently, the p × p components c_ij(y) of the covariance matrix C(y) can be regarded as local covariance values which describe the local dependency between variables x_i and x_j. Expressions (1) and (2) represent local versions of the sample first and second order moments of the input data distribution around the image of point y in the visualization space, i.e., ψ(y), where the width factor σ is a design parameter related to the degree of locality to be taken into account, allowing a tradeoff to be established between global and local correlations.
The local covariance C(y) described in (2) defines in V a field of covariance matrices from D, each of which provides a local description of the second order statistical features of the data in D lying in the vicinity of ψ(y).
The local correlation matrix R(y) has p × p components

\[
r_{ij}(y) = \frac{c_{ij}(y)}{\sqrt{c_{ii}(y)\,c_{jj}(y)}} \tag{3}
\]

which represent the local correlation coefficient between variable x_i and variable x_j and always lie in the interval [−1, +1], where +1 denotes full direct correlation, 0 denotes absence of correlation, and −1 denotes full inverse correlation.
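The step from the local covariance C(y) to the local correlation R(y) is presumably the usual covariance-to-correlation normalization; a sketch under that assumption:

```python
import numpy as np

def local_correlation(C):
    """Turn a local covariance matrix C(y) into a correlation matrix R(y):
    r_ij = c_ij / sqrt(c_ii * c_jj), so every entry lies in [-1, +1] and
    the diagonal is exactly 1 (the usual normalization; assumed here)."""
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)

# A toy local covariance with a perfect inverse linear dependency:
C = np.array([[4.0, -2.0], [-2.0, 1.0]])
R = local_correlation(C)
```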
Both the covariance matrix C(y) and the correlation matrix R(y) are defined for each point y of V. In addition, all the powerful geometrical and statistical interpretations underlying both matrices can be represented in V using scalar quantities. Thus, for instance, each component c_ij(y) or r_ij(y) defines a scalar quantity that can be represented in the same way as SOM planes, using a color code for each pixel y. In the same way, the principal values λ_i(y) of the covariance matrix or the components of the principal vectors u_i(y) can be represented as SOM planes.
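The principal values and principal vectors mentioned above are the pointwise eigen-decomposition of C(y); a minimal sketch for a single point y:

```python
import numpy as np

def principal_fields(C):
    """Principal values lambda_i(y) and principal vectors u_i(y) of a
    local covariance matrix C(y); each lambda_i or each vector component
    can then be color-coded over V like an ordinary SOM plane."""
    lam, U = np.linalg.eigh(C)      # symmetric eigendecomposition, ascending
    return lam[::-1], U[:, ::-1]    # reorder: largest principal value first

# A local covariance whose main local direction is the first axis:
C = np.array([[2.0, 0.0], [0.0, 0.5]])
lam, U = principal_fields(C)
```

Evaluating this at every pixel y of V yields one scalar map per eigenvalue or per vector component, exactly as with the c_ij(y) and r_ij(y) planes.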
This representation provides a unified visualization of the underlying correlations and, in general, of the second order statistical properties. Moreover, it is coherent with other SOM representations such as SOM planes or the u-matrix, providing insight into the pattern of correlation dependencies among variables and revealing the most important features describing the behavior of the underlying process in each data region.
All these ideas are illustrated in figures 1, 2 and 3. A simple 2D data set was used to train both a 1D-SOM and a 2D-SOM. Local covariances were obtained for the 2D-SOM using (1) and (2) and then plotted in both the data space D and the visualization space V. Local correlations were also obtained using (3) to build the correlation maps of r_xx, r_xy, r_yx, r_yy shown in figure 2. A set of points with negative local correlations (corresponding to the right part of the "arc" in the data) can be discovered by looking at the upper left corner of the r_xy plane. Similarly, moderately high correlations appear in the upper right corner of the map, revealing the positive local correlations existing in the left part of the "arc" in the data space. It can also be observed that the graphical information provided by the correlation maps in figure 2 is consistent with that shown in the SOM planes in figure 3, because both are descriptions in the same visualization space V. Finally, as expected, the planes r_xx and r_yy are identically 1, and r_xy = r_yx due to the symmetry of correlation matrices.
[Fig. 1: "1D-SOM in Data Space" (left) and "2D-SOM in Data Space" (right); plot content omitted.]
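The sign change along the "arc" can be reproduced with a self-contained sketch in which the SOM is replaced by a trivial 1-D projection (the first coordinate); the data set, kernel width and query points below are our choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)
data = np.column_stack([x, 1.0 - x**2])   # an "arc"-shaped 2D data set

def local_rxy(yq, sigma=0.1):
    """Local correlation r_xy at visualization coordinate yq, using the
    first data coordinate itself as the (trivial) projection into V."""
    w = np.exp(-((x - yq) ** 2) / (2.0 * sigma**2))
    m = (w[:, None] * data).sum(axis=0) / w.sum()
    d = data - m
    cxy = (w * d[:, 0] * d[:, 1]).sum() / w.sum()
    cxx = (w * d[:, 0] ** 2).sum() / w.sum()
    cyy = (w * d[:, 1] ** 2).sum() / w.sum()
    return cxy / np.sqrt(cxx * cyy)

# Opposite local correlations on the two halves of the arc: left > 0, right < 0,
# while the global correlation of the full arc blurs the two regimes together.
left, right = local_rxy(-0.5), local_rxy(0.5)
```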
4 Concluding Remarks
We have proposed here a method for the visualization of local second order statistical properties using dimension reduction mappings such as (but not restricted to) the SOM. The proposed idea has strong connections with local model approaches such as [9], where local linear PCA projections are proposed to capture the nonlinear structure of data.
We showed here through an artificial data example how local second order statistical properties can be revealed by means of correlation maps.
[Figure 2: the four correlation map panels r_xx, r_xy, r_yx, r_yy over V, color scale from −1 to 1; plot content omitted.]
Fig. 2. Correlation maps for the 2D-SOM show a region in V (upper left) related to highly negative local correlations and another region (upper right) revealing positive local correlations.
[Figure 3: SOM planes for the 2D-SOM; plot content omitted.]
The ideas proposed in this paper are currently being tested in the steel industry to investigate the effects of several dozen process variables on several quality factors of the processed coils in a tandem mill, with encouraging results.
References
1. Joshua B. Tenenbaum, Vin de Silva, and John C. Langford. A global geometric
framework for nonlinear dimensionality reduction. Science, 290:2319–2323, December 22, 2000.
2. Sam T. Roweis and Lawrence K. Saul. Nonlinear dimensionality reduction by
locally linear embedding. Science, 290:2323–2326, December 22, 2000.
3. Jianchang Mao and Anil K. Jain. Artificial neural networks for feature extrac-
tion and multivariate data projection. IEEE Transactions on Neural Networks,
6(2):296–316, March 1995.
4. David J. H. Wilson and George W. Irwin. RBF principal manifolds for process
monitoring. IEEE Transactions on Neural Networks, 10(6):1424–1434, November
1999.
5. Teuvo Kohonen, Erkki Oja, Olli Simula, Ari Visa, and Jari Kangas. Engineering
applications of the self-organizing map. Proceedings of the IEEE, 84(10):1358–1384,
October 1996.
6. Esa Alhoniemi, Jaakko Hollmén, Olli Simula, and Juha Vesanto. Process mon-
itoring and modeling using the self-organizing map. Integrated Computer Aided
Engineering, 6(1):3–14, 1999.
7. Teuvo Kohonen. Self-Organizing Maps. Springer-Verlag, 1995.
8. Christopher M. Bishop, Markus Svensén, and Christopher K. I. Williams. GTM:
The generative topographic mapping. Neural Computation, 10(1):215–234, 1998.
9. M. Tipping and C. Bishop. Mixtures of probabilistic principal component analyz-
ers. Neural Computation, 11(2):443–482, 1999.
10. Juha Vesanto. SOM-based data visualization methods. Intelligent Data Analysis,
3(2):111–126, 1999.
11. Juha Vesanto and Esa Alhoniemi. Clustering of the self-organizing map. IEEE
Transactions on Neural Networks, 11(3):586–600, May 2000.