Numerical Coding of Nominal Data: January 2015


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/278403191

Numerical Coding of Nominal Data

Article · January 2015

DOI: 10.26348/znwwsi.12.53


All content following this page was uploaded by Zenon Gniazdowski on 28 August 2016.


Zeszyty Naukowe WWSI, No 12, Vol. 9, 2015, pp. 53-61

Numerical Coding of Nominal Data

Zenon Gniazdowski*¹ and Michał Grabowski¹

¹ Warsaw School of Computer Science

Abstract
In this paper, a novel approach for coding nominal data is proposed. For the given
nominal data, a rank in the form of a complex number is assigned. The proposed
method does not lose any information about the attribute and brings other, previously
unknown properties. An approach based on these new properties can be used for
classification. The analyzed example shows that classification using coded nominal
data, or both numerical and coded nominal data, is more effective than classification
using only numerical data.

Keywords — nominal data, numerical data, classification

1 Introduction
Different types of data are used in data analysis. Generally, they are either numerical or
nominal. Numerical data are linearly ordered: any two elements are either equal, or one
precedes the other. Nominal data cannot be naturally ordered. On a set of nominal data, at
most the identity equivalence relation can be defined, which means that two elements can
only be equal or different.
Specific methods of analysis have been developed for both types of data. Particular
difficulties arise when continuous and nominal data are analyzed simultaneously. Usually,
continuous data are discretized and then treated as nominal data. In this way, the
opportunity to order the data is lost. Alternatively, the procedure can be reversed: nominal
data are coded with numbers [1]. Unfortunately, numerically coded nominal data cannot be
naturally ordered.
In this paper, a novel approach for coding nominal data with the use of complex numbers
is presented [2]. Each nominal value is assigned a rank in the form of a complex number.
The proposed approach can be employed for classification and clustering.
* E-mail: [email protected]

Manuscript received April 11, 2015


Table 1: Ranks of numerical data

No.   Sorted data   Assigned rank
1     21            1
2     28            2
3     33            3
4     44            4
5     45            5
6     54            6
7     55            7
8     60            8
9     63            9
10    76            10

2 Preliminaries – ranks of numerical data

In statistics, the numerical results of observations are sometimes replaced by their ranks.
In the first stage of the ranking procedure, the data set is sorted in ascending order. Next,
each element is assigned a rank equal to its position in the sorted set [3]. An example of
this procedure is shown in Table 1.
The sorted set may, however, contain several elements with the same value. In this case,
the ranks assigned to identical values should be the same. Such elements receive a rank
equal to their average position in the sorted set [3]. These are so-called tied ranks.
Table 2 shows an example of tied ranks.
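The tied-rank procedure can be illustrated with a short sketch. The function name `tied_ranks` is hypothetical and not taken from the paper; the input is the sorted data of Table 2.

```python
from collections import defaultdict

def tied_ranks(values):
    """Rank each value by its position in the sorted set; equal values
    share the average of their positions (tied ranks)."""
    positions = defaultdict(list)
    for pos, v in enumerate(sorted(values), start=1):  # 1-based positions
        positions[v].append(pos)
    average = {v: sum(p) / len(p) for v, p in positions.items()}
    return [average[v] for v in values]

data = [21, 28, 44, 44, 44, 54, 55, 55, 55, 55]  # sorted data of Table 2
ranks = tied_ranks(data)
print(ranks)  # [1.0, 2.0, 4.0, 4.0, 4.0, 6.0, 8.5, 8.5, 8.5, 8.5]
```

The three values 44 occupy positions 3, 4 and 5 and therefore share rank 4, while the four values 55 occupy positions 7 to 10 and share rank 8.5.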

3 Coding of nominal data

As noted above, the rank attached to a numerical value is a function of both the value
itself and the number of its occurrences. Nominal values cannot be ordered, so no method of
ranking nominal data can make use of the values themselves. On the other hand, intuition
suggests that in a random population, for both numerical and nominal data, more numerous
elements are more important than less numerous ones. Therefore, it is proposed to rank
nominal data using only the cardinality of identical elements.

3.1 Different frequencies of different nominal values

Nominal data cannot be sorted according to their values, but they can be grouped according
to identical values. In an n-element subset consisting of identical elements, the elements
may be numbered from 1 to n. Each of them can be assigned a rank that is equal to the average


Table 2: Tied ranks of numerical data

No.   Sorted data   Assigned rank
1     21            1
2     28            2
3     44            4
4     44            4
5     44            4
6     54            6
7     55            8.5
8     55            8.5
9     55            8.5
10    55            8.5

Table 3: Rank of nominal data – differences in the cardinality of elements

First value               Second value              Third value
Value  Position  Rank     Value  Position  Rank     Value  Position  Rank
a      1         3.5      b      1         3        c      1         2.5
a      2         3.5      b      2         3        c      2         2.5
a      3         3.5      b      3         3        c      3         2.5
a      4         3.5      b      4         3        c      4         2.5
a      5         3.5      b      5         3
a      6         3.5

value of these numbers:

\[ R = \frac{n+1}{2} \tag{1} \]
More numerous elements receive a higher rank than less numerous ones. As an example,
consider a set of 15 elements with nominal values {a, a, a, a, a, a, b, b, b, b, b, c, c, c, c}.
The set can be divided into three subsets. Each subset is an equivalence class containing
identical elements. Each nominal value was assigned its rank according to formula (1).
The result of the ranking is shown in Table 3.
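Formula (1) applied to this 15-element set can be sketched as follows. The helper name `frequency_ranks` is an illustrative assumption, not a name used in the paper.

```python
from collections import Counter

def frequency_ranks(values):
    """Map each nominal value to R = (n + 1) / 2, where n is the number
    of occurrences of that value (formula (1))."""
    return {v: (n + 1) / 2 for v, n in Counter(values).items()}

sample = list("aaaaaabbbbbcccc")  # six a, five b, four c
freq_rank = frequency_ranks(sample)
print(freq_rank)  # {'a': 3.5, 'b': 3.0, 'c': 2.5}
```

The result agrees with the ranks listed in Table 3.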

3.2 The same frequencies of different nominal values

For equinumerous subsets, formula (1) gives the same rank. As a result, values belonging
to equinumerous subsets would be indistinguishable. Therefore, formula (1) should be
modified. If the variable has k equinumerous subsets, the nominal elements belonging to


Table 4: Ranks of nominal data – possible equinumerosity of elements in subsets

No.   Nominal    Modulus   Phase      Exponential     Algebraic
      variable   |R|       φ [rad]    form Re^{iφ}    form a + bi
1     a          2         0          2               2
2     a          2         0          2               2
3     a          2         0          2               2
4     b          2         2π/3       2e^{i2π/3}      −1 + 1.73i
5     b          2         2π/3       2e^{i2π/3}      −1 + 1.73i
6     b          2         2π/3       2e^{i2π/3}      −1 + 1.73i
7     c          2         4π/3       2e^{i4π/3}      −1 − 1.73i
8     c          2         4π/3       2e^{i4π/3}      −1 − 1.73i
9     c          2         4π/3       2e^{i4π/3}      −1 − 1.73i
10    d          3.5       0          3.5             3.5
11    d          3.5       0          3.5             3.5
12    d          3.5       0          3.5             3.5
13    d          3.5       0          3.5             3.5
14    d          3.5       0          3.5             3.5
15    d          3.5       0          3.5             3.5

the j-th subset (j = 0, 1, ..., k − 1) can be coded with the use of k successive roots of unity:

\[ R_j = R \cdot \sqrt[k]{1} = R \cdot e^{i\varphi} = R \cdot (\cos\varphi + i \sin\varphi) \tag{2} \]

In the above expression, $i = \sqrt{-1}$, $\varphi = 2\pi j/k$ (j = 0, 1, ..., k − 1), and R is the
rank calculated by formula (1). The value of $\varphi$ is the phase assigned to the successive
(j-th) nominal value. In the presented concept, R is the modulus of the complex rank and
depends on the cardinality of the subset that contains the given nominal value. This approach
gives the same modulus R for equinumerous subsets of identical nominal elements, and
distinguishes the ranks of different nominal values by their phases.
Table 4 shows an example of the ranking for the case in which the values a, b and c each
occur three times and the value d occurs six times. The phases assigned to the nominal
values a, b and c are 0, 2π/3 and 4π/3, respectively. Hence, the rank assigned to a is real,
and the ranks assigned to b and c are complex. A real rank is also assigned to the nominal
value d.
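A minimal sketch of coding by formula (2) is given below. The function name `complex_ranks` is hypothetical. The paper does not state how the phase index j is ordered within a group of equinumerous values, so the sketch assumes the order of first appearance in the column, which reproduces Table 4.

```python
import cmath
import math
from collections import Counter, defaultdict

def complex_ranks(values):
    """Code nominal values as complex ranks: modulus (n + 1) / 2 from
    formula (1), phase 2*pi*j/k from formula (2).
    Assumption: j follows the order of first appearance."""
    counts = Counter(values)           # keeps first-appearance order
    groups = defaultdict(list)         # cardinality -> values with that cardinality
    for value, n in counts.items():
        groups[n].append(value)
    codes = {}
    for n, group in groups.items():
        modulus = (n + 1) / 2
        k = len(group)                 # number of equinumerous subsets
        for j, value in enumerate(group):
            codes[value] = cmath.rect(modulus, 2 * math.pi * j / k)
    return codes

column = list("aaabbbcccdddddd")       # the nominal variable of Table 4
codes = complex_ranks(column)
for value, z in codes.items():
    print(value, round(z.real, 2), round(z.imag, 2))
```

For the values b and c the real part is −1 and the imaginary parts are approximately ±1.73, while a and d receive the real ranks 2 and 3.5.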

4 Properties of complex coding

In a nominal data set, an equivalence relation can be defined on the column corresponding
to a given attribute. This relation divides the column into equivalence classes, each of
which contains identical elements. The cardinality of each class is the only information
about the attribute that is important from the point of view of this analysis. Coding with
the use of complex numbers is unambiguous, i.e. different elements remain distinguishable
after coding. In addition, the corresponding equivalence relation can still be defined on
the coded data; it divides the set into equivalence classes with the same cardinalities as
before coding.
Coding does not lose any information about the attribute. The coded data gain additional
properties that enrich them. Before coding, the cardinality of a given value was an external
feature. Now, through the modulus, the cardinality is an inherent property of the coded
value of the attribute. The modulus carries information about the statistical strength of a
given subset of elements. The phase carries information about the number of equinumerous
classes. Additionally, coding with the use of complex numbers brings other, previously
unknown properties. Above all, all arithmetic operations can be performed on complex
numbers. Objects in the data space can be viewed as vectors in a complex space. In this
space, a scalar product, a norm, and a metric can be defined [4]. The scalar product of two
complex vectors x and y is defined as follows:
\[ (x, y) = \sum_{i=1}^{n} x_i \overline{y_i} \tag{3} \]

This way the norm, which is generated by the above scalar product, can also be defined:
\[ \|x\| = \sqrt{(x, x)} \tag{4} \]

Using this norm, a metric can also be defined:

\[ \rho(x, y) = \|y - x\| \tag{5} \]

The proposed coding methodology can be employed for the analysis of nominal data. In
particular, thanks to the metric defined above, it can be used in a natural way for
clustering and classification.
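Definitions (3) to (5) translate directly into code. The sketch below uses plain Python complex numbers; the function names `inner`, `norm` and `distance` and the two example vectors are illustrative assumptions.

```python
import math

def inner(x, y):
    """Scalar product (3): sum of x_i times the complex conjugate of y_i."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm (4): square root of (x, x), which is real and non-negative."""
    return math.sqrt(inner(x, x).real)

def distance(x, y):
    """Metric (5): norm of the difference y - x."""
    return norm([b - a for a, b in zip(x, y)])

# Two hypothetical objects, each described by two coded nominal attributes.
u = [2 + 0j, 3.5 + 0j]
v = [complex(-1, math.sqrt(3)), 2.5 + 0j]
d = distance(u, v)
print(round(d, 4))  # 3.6056
```

The squared distance here is |−3 + 1.73i|² + |−1|² = 12 + 1 = 13, so two values with equal modulus but different phases are separated by a non-zero distance.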

5 Possible application of proposed approach – an example

The proposed approach can be used in the k-means method, since the distance between
objects in the data space can be calculated from the complex ranks of the data. Data from a
company that sells cars serve as an example [5]. The data set consists of ten objects
(Table 5). Each object is described by seven attributes. The first two attributes (number of
doors, engine power) are numerical. The other five attributes (color, fuel, interior, wheels
and brand) take nominal values. The first four of them were coded using the proposed
approach (Table 6).
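The coding of the four nominal attributes can be reproduced from the data of Table 5 with the following sketch. The helper `complex_ranks` is hypothetical, and the assignment of phases in order of first appearance is an assumption; with it, Blue and Black (three cars each) receive the codes 2 and −2.

```python
import cmath
import math
from collections import Counter, defaultdict

def complex_ranks(values):
    """Complex rank per nominal value: modulus (n + 1) / 2, phase 2*pi*j/k.
    Assumption: j follows the order of first appearance in the column."""
    groups = defaultdict(list)
    for value, n in Counter(values).items():
        groups[n].append(value)
    codes = {}
    for n, group in groups.items():
        for j, value in enumerate(group):
            codes[value] = cmath.rect((n + 1) / 2, 2 * math.pi * j / len(group))
    return codes

# Nominal conditional attributes, copied from Table 5 (rows 1 to 10).
cars = {
    "Color": ["Blue", "Black", "Black", "Red", "Red",
              "Red", "Red", "Black", "Blue", "Blue"],
    "Fuel": ["Petrol", "Diesel", "Petrol", "Petrol", "Petrol",
             "Diesel", "LPG", "Petrol", "LPG", "Diesel"],
    "Interior": ["Fabric", "Fabric", "Leather", "Leather", "Fabric",
                 "Leather", "Fabric", "Leather", "Fabric", "Fabric"],
    "Wheel": ["Steel", "Steel", "Alloy", "Alloy", "Steel",
              "Steel", "Steel", "Alloy", "Steel", "Alloy"],
}

coded = {}
for name, column in cars.items():
    codes = complex_ranks(column)
    coded[name] = [codes[v] for v in column]

for name, column in coded.items():
    # All codes in this data set are real up to rounding error.
    print(name, [round(z.real, 2) for z in column])
```

In this data set at most two values of an attribute are equinumerous, so the only phases that occur are 0 and π and every code is a real number.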
The set of attributes is divided into two subsets. The first six attributes are conditional
attributes. The last one is the decision attribute. Based on the conditional attributes, the
set of objects is clustered into three subsets. Afterwards, it is checked to what extent
these subsets are consistent with the decision attribute. In other words, it is checked
whether the car's brand can be recognized from the car's description. In order to verify the


Table 5: Description of cars

No.   Door   Power   Color   Fuel     Interior   Wheel   Brand
1     2      60      Blue    Petrol   Fabric     Steel   Opel
2     2      100     Black   Diesel   Fabric     Steel   Nissan
3     2      200     Black   Petrol   Leather    Alloy   Ferrari
4     2      200     Red     Petrol   Leather    Alloy   Ferrari
5     2      200     Red     Petrol   Fabric     Steel   Opel
6     3      100     Red     Diesel   Leather    Steel   Opel
7     3      100     Red     LPG      Fabric     Steel   Opel
8     3      200     Black   Petrol   Leather    Alloy   Ferrari
9     4      100     Blue    LPG      Fabric     Steel   Nissan
10    4      100     Blue    Diesel   Fabric     Alloy   Nissan

Table 6: Coded description of cars

No.   Door   Power   Color   Fuel   Interior   Wheel   Brand
1     2      60      2       3      3.5        3.5     Opel
2     2      100     -2      2      3.5        3.5     Nissan
3     2      200     -2      3      2.5        2.5     Ferrari
4     2      200     2.5     3      2.5        2.5     Ferrari
5     2      200     2.5     3      3.5        3.5     Opel
6     3      100     2.5     2      2.5        3.5     Opel
7     3      100     2.5     1.5    3.5        3.5     Opel
8     3      200     -2      3      2.5        2.5     Ferrari
9     4      100     2       1.5    3.5        3.5     Nissan
10    4      100     2       2      3.5        2.5     Nissan


Table 7: The results of classification

                                      Accuracy of classification
Considered data type                  90%   80%   70%   60%   50%
Ad hoc                                –     –     –     –     20
Only numerical data                   –     –     12    1     7
Only coded symbolic data              1     8     3     8     –
Numerical and coded symbolic data     3     4     10    2     1

usefulness of the proposed complex coding, four classifications were made based on different
conditional attributes:

• Numbers and ad hoc coded nominal data,
• Only numbers (number of doors and engine power),
• Only nominal data (color, fuel, interior, wheel) coded using formula (2),
• Numbers as well as nominal data coded using formula (2).

The k-means algorithm was used for classification. For this purpose, the data were
standardized. The Euclidean norm was used to measure distances. The required number of
starting points for the k-means algorithm was chosen randomly. All these tests were
repeated twenty times. In none of these twenty cases was the sequence of randomly selected
points repeated.
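One run of such an experiment might be sketched as below, for the case in which both the numbers and the coded nominal data of Table 6 are used. This is a plain Lloyd-style k-means written for illustration, not the authors' implementation. The standardization convention (population standard deviation of the centred values), the seed, and the function names are assumptions, and the step that compares clusters with brands is omitted because the paper does not describe it.

```python
import math
import random

def standardize(column):
    """Centre a column (real or complex) and scale it to unit standard deviation."""
    mean = sum(column) / len(column)
    centred = [z - mean for z in column]
    std = math.sqrt(sum(abs(z) ** 2 for z in centred) / len(centred))
    return [z / std for z in centred] if std > 0 else centred

def distance(x, y):
    """Euclidean norm of the difference of two vectors, as in metric (5)."""
    return math.sqrt(sum(abs(b - a) ** 2 for a, b in zip(x, y)))

def k_means(points, k, rng, max_iter=100):
    """Lloyd's iteration started from k randomly chosen objects."""
    centres = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: distance(p, centres[c]))
                      for p in points]
        if new_labels == labels:       # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                # an empty cluster keeps its old centre
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Conditional attributes of Table 6: Door, Power, Color, Fuel, Interior, Wheel.
columns = [
    [2, 2, 2, 2, 2, 3, 3, 3, 4, 4],
    [60, 100, 200, 200, 200, 100, 100, 200, 100, 100],
    [2, -2, -2, 2.5, 2.5, 2.5, 2.5, -2, 2, 2],
    [3, 2, 3, 3, 3, 2, 1.5, 3, 1.5, 2],
    [3.5, 3.5, 2.5, 2.5, 3.5, 2.5, 3.5, 2.5, 3.5, 3.5],
    [3.5, 3.5, 2.5, 2.5, 3.5, 3.5, 3.5, 2.5, 3.5, 2.5],
]
points = [list(row) for row in zip(*[standardize(col) for col in columns])]
labels = k_means(points, k=3, rng=random.Random(0))  # seed chosen arbitrarily
print(labels)  # cluster index (0, 1 or 2) for each of the ten cars
```

Because `distance` uses the absolute value of each difference, the same code would also accept complex codes with non-zero imaginary parts. Repeating the run with different seeds corresponds to the repeated random starting points described above.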
After completion of the experiments, the results obtained for different conditional
attributes were compared. Table 7 compares the classification results for the different
types of data used; each entry is the number of runs, out of twenty, that reached the given
accuracy. It can be seen that classification using coded nominal data, or both numerical
and coded nominal data, is more effective than classification using ad hoc coding or only
numerical data. Based on the obtained results, it must be concluded that the information
contained in the coded nominal data is important for classification.

6 Conclusions
In this paper, a novel approach for coding nominal data was proposed. Each nominal value
can be assigned a rank in the form of a complex number. The modulus of this rank carries
information about the statistical strength of a given subset of elements. The phase carries
information about the number of equinumerous values of the attribute.
The proposed methodology is unambiguous. After coding, different values of the attribute
are still distinguishable. The method does not lose any information about the attribute.
Additionally, the coded data gain previously unknown properties that enrich them. Above
all, all arithmetic operations can be performed on complex numbers. In a complex space, a
scalar product, a norm, and a metric can be defined. This means that coded data may be used
for clustering and classification.


The well-known, folklore-type idea represents an m-valued nominal domain as equidistant
points in the Euclidean space $\mathbb{R}^m$. Thus the classical Euclidean approach represents
vectors of nominal values as points in the space $\mathbb{R}^q$, where q is equal to the sum of
the cardinalities of the nominal domains under consideration. The proposed coding represents
vectors of nominal values as points in the space $\mathbb{C}^s$, where s is equal to the number
of nominal domains under consideration. We think that our coding can be considered as one of
the alternatives when the analyzed continuous–nominal data are sensitive to the curse of
dimensionality (see [6], [7]), owing to the low dimension of the final space of codes compared
to Euclidean coding.
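For the car data of Table 5, the two dimensions can be counted directly. The dictionary below is an illustrative transcription of the four nominal conditional attributes, and the variable names are assumptions.

```python
# Nominal domains of the four coded attributes in Table 5.
nominal_domains = {
    "Color": {"Blue", "Black", "Red"},
    "Fuel": {"Petrol", "Diesel", "LPG"},
    "Interior": {"Fabric", "Leather"},
    "Wheel": {"Steel", "Alloy"},
}

# Classical Euclidean coding: one real axis per nominal value.
q = sum(len(values) for values in nominal_domains.values())
# Proposed coding: one complex axis per nominal attribute.
s = len(nominal_domains)
print(q, s)  # 10 4
```

In this example the classical representation needs ten real coordinates, whereas the proposed coding needs four complex ones.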
It is plain enough that the proposed coding injects additional information. For instance,
the symmetries of the codes of nominal data with equal frequencies are specific to the
proposed coding scheme, and it may happen that the symmetries of the original data differ
from the symmetries of their codes. Nevertheless, other coding schemes that involve
information about nominal data frequencies, for instance Bayesian coding of nominal values
by scoring (see [8]), are subject to this weakness as well. Moreover, Bayesian coding by
scoring is applicable only to training data sets with two decision categories. Our proposal
is fully general and can be applied to data sets with no decision category information at all.
The data set analyzed in this article shows that classification using coded nominal data,
or both numerical and coded nominal data, is more effective than classification using
ad hoc coding or only numerical data. Hence, it must be concluded that the information
contained in the coded nominal data is important for classification.
The presented proposal is preliminary. Although the method looks interesting, further
investigation of this approach is necessary, for instance a serious experimental study.
Such investigations could confirm the usefulness of the method. They could also reveal
other possible applications of the proposed method.

References
[1] M. Grabowski and M. Korpusik. Metrics and similarities in modeling dependencies between
continuous and nominal data. Zeszyty Naukowe WWSI, 7(10):25–37, 2013.
[2] Z. Gniazdowski. Numerical coding of nominal data. Seminar, Warsaw School of Computer
Science, May 15, 2014.
[3] F. Wilcoxon. Individual comparisons by ranking methods. Biometrics Bulletin, 1(6):80–83,
1945.
[4] S. G. Krejn. Analiza funkcjonalna. PWN, Warszawa, 1967.
[5] L. Rutkowski. Metody i techniki sztucznej inteligencji. Wydawnictwo Naukowe PWN,
Warszawa, 2012.
[6] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer,
New York, 2001.
[7] R. Bellman. Adaptive Control Processes. A Guided Tour. Princeton University Press,
Princeton, 1961.
[8] J. Koronacki and J. Ćwik. Statystyczne systemy uczące się. Akademicka Oficyna
Wydawnicza EXIT, 2008.
