Ijtech Template
K-MEANS
Citra Dewi Megawati1st, Eko Mulyanto Yuniarno2nd, Supeno Mardi Susiki Nugroho3rd
1
Institut Teknologi Sepuluh November, [email protected], Surabaya Indonesia
2
Institut Teknologi Sepuluh November, [email protected], Surabaya Indonesia
3
Institut Teknologi Sepuluh November, [email protected], Surabaya Indonesia
ABSTRACT
A game needs avatars to look attractive, and the avatar is often the first thing players
remember about a game. However, illustrators find it difficult to determine which avatar game
enthusiasts want. Avatars can also represent a person's appearance or personality in a game or
in cyberspace. An avatar can be a humanized animal, a humanized plant, or a female or male
human. In this study we chose female human avatars because they were considered more complete
in terms of facial features, and because the study focuses on clustering the female avatar
faces that are chosen most often. The face data consist of avatar features such as face shape,
eyebrow shape, eyes, nose, lips, and ears.
This research clusters data on female avatar faces using the K-means method. K-means is an
unsupervised clustering algorithm that partitions data into classes of similar items. Here the
data are partitioned into 3 classes based on the similarity of shapes. The results indicate
that 39% of respondents chose a triangle face shape, a circular left eyebrow, an upward right
eyebrow, slanted eyes, a sharp nose, wide lips, and long-lobed ears. A further 34% chose
avatars with oval faces, triangle left eyebrows, short right eyebrows, asymmetrical noses, wide
eyes, thick upper lips, and thick-lobed ears. The remaining 27% chose avatars with square
faces, thin eyebrows, round eyes, rounded noses, thick lips, and small-lobed ears. These
results suggest that the K-means method is effective and fast for grouping the female avatar
faces that are chosen most often.
Keywords: Clustering; characters; female character; face shape; K-means
1. INTRODUCTION
A game is an interactive activity in which one or more players follow rules that limit
their behavior. Games are popular with consumers because, besides playing, players also
unconsciously learn and become immersed in the game's story. A game has several components,
and one of the most important is the character, or avatar. A game looks more interesting when
it has a unique avatar.
Avatars are characters that exist in, and are directly tied to, the game world. The avatar
greatly influences a player's preference for a game. Avatar design is a form of illustration
built on the concept of a male or female "human", a "humanized animal", or a "humanized plant",
with all its attributes (character, physique, profession, residence, and even destiny). One of
the most important elements in avatar design is the shape of the avatar's face. However,
illustrators find it difficult to determine the human avatar face that players want. In this
study we chose human avatars because they were considered more complete in terms of facial
features, and because the study focuses on grouping the face data of the female avatars that
are chosen most often. The face data consist of features of the avatar's face such as face
shape, eyebrow shape, eyes, nose, lips, and ears.
Users take care when creating unique avatars, even though ready-to-use face options are
available (Cheng, Farnham, & Stone, 2002; Taylor, 2002). One problem found is that the
customization process requires a lot of time (Cheng et al., 2002). Most research on avatars
focuses on how an avatar's visual appearance and behavior influence viewers' perceptions
(e.g., Garau et al., 2003; Nowak & Rauh, 2005; Nowak & Rauh, 2008). The appearance of an
avatar can also influence the perception and behavior of its owner. For example, some users
regard their avatars as similar to their own appearance (Vasalou, Joinson, & Pitt, 2007), and
the qualities displayed by the avatar's visuals can be reflected in the owner's behavior,
because the avatar represents the person at the interface (Rauh, Polonsky, & Buck, 2004).
Related work includes research on the classification of facial shapes using 3 methods
(N. K. Bansode and P. K. Sinha; Pomthep Sarakon and Theekapun Charoenpong), research on the
classification of facial shapes using SVM (S. C. Zhang, B. Fang, Y. Z. Liang, J. Wen, and
L. Wu), and research on grouping facial data. However, these studies discuss only face shape
and do not fully cover other facial features such as eyes, nose, and eyebrows.
This research clusters the face data of female avatars using the K-means method. K-means
is an unsupervised clustering algorithm that partitions data into classes of similar items.
The purpose of this study is to find the female avatar faces that game consumers choose most
often. The game consumers here are undergraduate students from various majors.
2. METHODS
The research followed the K-means clustering method. K-means is an unsupervised clustering
algorithm that partitions data based on similarity, here the similarity of shapes. Figure 1 is
a block diagram of the steps used to process the face data.
[Nose shape designs: round nose, long nose, asymmetrical nose, pug nose, sharp nose, eagle nose]
Table 5. Design of lip shapes
[Ear shape designs: little ear lobe, long ear lobe, thick ear lobe, two-lobe ear]
3.2 Data shape vectorization
The next step is to vectorize the feature data of the female avatar faces before they are
built into the face survey game. Vector data are needed to facilitate processing with the
K-means algorithm. The facial feature data are arranged according to their similarities.
Table 7 gives the data structure for face shape, Table 8 for ear shape, Table 9 for lip
shape, Table 10 for eyebrow shape, Table 11 for eye shape, and Table 12 for nose shape. The
aim is to encode the shapes by their similarities so that they are easier to cluster using
K-means.
Table 7. Data structure of face shape
Face shape  Elbow side  Curved side  Blunt side  Pointed side  Lengthwise  Round
Triangle    0           0            0           1             0           0
Diamond     0           0            0           1             1           0
Heart       0           0            1           1             0           0
Round       0           1            1           0             0           1
Oval        0           1            1           0             1           1
Square      1           0            0           1             1           0
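To illustrate how the binary encoding turns shape similarity into a measurable quantity, the following minimal sketch computes Euclidean distances between the rows of Table 7. The names `FACE_SHAPES`, `euclidean`, and `ranked_neighbours` are illustrative assumptions, not part of the original study.

```python
import math

# Rows of Table 7. Column order: elbow side, curved side, blunt side,
# pointed side, lengthwise, round.
FACE_SHAPES = {
    "Triangle": [0, 0, 0, 1, 0, 0],
    "Diamond":  [0, 0, 0, 1, 1, 0],
    "Heart":    [0, 0, 1, 1, 0, 0],
    "Round":    [0, 1, 1, 0, 0, 1],
    "Oval":     [0, 1, 1, 0, 1, 1],
    "Square":   [1, 0, 0, 1, 1, 0],
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def ranked_neighbours(name):
    """Other face shapes sorted from most to least similar to `name`."""
    base = FACE_SHAPES[name]
    others = [(euclidean(base, vec), other)
              for other, vec in FACE_SHAPES.items() if other != name]
    return [other for _, other in sorted(others)]


if __name__ == "__main__":
    for shape in FACE_SHAPES:
        print(shape, "->", ranked_neighbours(shape))
```

Shapes that differ in a single attribute, such as Round and Oval, end up at distance 1, while shapes that share few attributes are farther apart.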
Table 8. Data structure of ear shape
Ear shape     Big  Wide  Thick  Long  Thin  Short  Narrow  Small
Little ear    0    0     0      0     1     1      1       1
Long ear      1    0     1      1     0     0      1       0
Thick ear     1    1     1      0     0     1      0       0
Two lobe ear  1    1     1      1     0     1      1       1
Table 9. Data structure of lip shape
Lip type         Thick up  Thick down  Big  Wide  Thin up  Thin down  Little
Thin lip         0         0           0    1     1        1          0
Wide lip         0         0           0    1     1        1          0
Thin upper lip   0         1           1    1     1        0          0
Thick upper lip  1         0           1    1     0        1          0
Little lip       1         1           0    0     0        0          1
Heart lip        1         1           0    0     0        0          1
Thick lip        1         1           1    1     0        0          0
Table 10. Data structure of eyebrow shape
Eyebrow shape  Thick  Long  Straight  Curved  Thin  Short  Up  Down  Even
Down           0      1     0         1       1     0      0   1     0
Top            0      1     0         1       1     0      1   0     0
Thin           0      1     0         1       1     0      1   0     0
Short          1      0     1         0       0     1      1   0     0
Thick          1      1     0         1       0     0      0   0     1
Triangle       1      1     0         1       0     0      0   0     1
Circular       1      1     0         1       0     0      0   0     1
Curved         1      1     0         1       0     0      1   0     0
Straight       1      1     1         0       0     0      0   0     1
Table 11. Data structure of eye shape
Eye shape     Wide  Big  Long  Short  Narrow  Little
Slanted eyes  0     0    1     0      0       0
Long eyes     0     0    1     0      1       0
Little eyes   1     0    0     1      0       1
Eyes into     1     0    1     0      0       1
Round eyes    1     1    0     1      0       0
Almond eyes   1     1    1     0      0       0
Wide eyes     1     1    1     0      0       0
Table 12. Data structure of nose shape
Nose shape    Big  Long  Wide  Sharp  Round  Short  Narrow  Little  Pug
Round         0    0     1     0      1      1      0       0       0
Narrow long   0    1     0     1      0      0      1       0       0
Asymmetrical  1    1     0     0      0      0      0       0       1
Pug           1    1     0     0      0      0      1       1       1
Sharp         1    1     0     1      0      0      1       0       0
Eagle         1    1     1     1      0      0      0       0       0
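One way a respondent's selections could be turned into a single input vector for K-means is to concatenate the rows looked up in each feature table. The sketch below is an assumption about that step, since the paper does not specify it. It uses only the eye and nose tables for brevity, and the respondent shown is hypothetical.

```python
# Rows of Table 11. Column order: wide, big, long, short, narrow, little.
EYE_SHAPES = {
    "Slanted eyes": [0, 0, 1, 0, 0, 0],
    "Long eyes":    [0, 0, 1, 0, 1, 0],
    "Little eyes":  [1, 0, 0, 1, 0, 1],
    "Eyes into":    [1, 0, 1, 0, 0, 1],
    "Round eyes":   [1, 1, 0, 1, 0, 0],
    "Almond eyes":  [1, 1, 1, 0, 0, 0],
    "Wide eyes":    [1, 1, 1, 0, 0, 0],
}

# Rows of Table 12. Column order: big, long, wide, sharp, round, short,
# narrow, little, pug.
NOSE_SHAPES = {
    "Round":        [0, 0, 1, 0, 1, 1, 0, 0, 0],
    "Narrow long":  [0, 1, 0, 1, 0, 0, 1, 0, 0],
    "Asymmetrical": [1, 1, 0, 0, 0, 0, 0, 0, 1],
    "Pug":          [1, 1, 0, 0, 0, 0, 1, 1, 1],
    "Sharp":        [1, 1, 0, 1, 0, 0, 1, 0, 0],
    "Eagle":        [1, 1, 1, 1, 0, 0, 0, 0, 0],
}

# Fixed feature order so every respondent vector has the same layout.
FEATURE_TABLES = [("eye", EYE_SHAPES), ("nose", NOSE_SHAPES)]


def encode_choice(choice):
    """Concatenate the attribute rows for each selected shape."""
    vector = []
    for feature, table in FEATURE_TABLES:
        vector.extend(table[choice[feature]])
    return vector


if __name__ == "__main__":
    # Hypothetical respondent, not taken from the survey data.
    respondent = {"eye": "Slanted eyes", "nose": "Sharp"}
    print(encode_choice(respondent))
```

Extending `FEATURE_TABLES` with the face, ear, lip, and eyebrow tables would give the full per-respondent vector under the same assumption.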
3.3 Clustering of face data using K-means
After the data are collected, they are clustered using K-means. The face data are divided
into several clusters based on the similarity of shapes, with the initial cluster centers
chosen randomly.
3.3.1 K-Means Clustering
K-means is a widely used unsupervised clustering algorithm based on partitioning. It
partitions the data into groups so that data with the same characteristics are placed in the
same group and data with different characteristics are placed in other groups.
In this research 3 clusters were used, following the similarity of shapes and starting
from face shape. Face shapes fall into 3 clusters because triangular faces are similar to
diamond and heart faces but different from oval faces, while oval faces are similar to round
faces and different from square faces. Cluster 1 therefore contains triangle, diamond, and
heart faces, cluster 2 contains oval and round faces, and cluster 3 contains only the square
face. The centroids must be placed carefully because different starting locations lead to
different results. A good choice is to place them as far from each other as possible. In this
study, the K = 3 initial centroids were chosen randomly from the 100 samples. The objective
function has been calculated as follows:
(1)
where xi is the survey data (samples 1 to 100), separated by feature, yi is the K centroid
points, also separated by feature, and i indexes the members, or sample data.
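The assign-and-update loop of K-means can be sketched as follows. This is a minimal illustration, not the authors' implementation. It uses the six face-shape rows of Table 7 as stand-in data rather than the 100 survey responses, and it starts from hand-picked centroids for reproducibility. The helper `random_init` shows the random selection from the samples that the text describes. All function names are assumptions.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def random_init(points, k, seed=None):
    """Pick k distinct samples as initial centroids."""
    return random.Random(seed).sample(points, k)


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids, sum of squared errors)."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean(members)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, sse


# Stand-in data: Table 7 rows in the order triangle, diamond, heart,
# round, oval, square.
FACES = [
    [0, 0, 0, 1, 0, 0], [0, 0, 0, 1, 1, 0], [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 1], [0, 1, 1, 0, 1, 1], [1, 0, 0, 1, 1, 0],
]

if __name__ == "__main__":
    start = [FACES[0], FACES[4], FACES[5]]  # triangle, oval, square (assumed)
    labels, centroids, sse = kmeans(FACES, start)
    print(labels, round(sse, 4))
```

With these starting centroids the stand-in data settle into the three face-shape groups described above. A different start, such as one drawn with `random_init`, may converge to a different partition, which is why centroid placement matters.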
3.4 Testing of K-means results with the silhouette coefficient
Testing the K-means grouping with the silhouette coefficient shows whether the clustering
results are good. The silhouette coefficient ranges from -1 to 1. A positive value is a good
result and means the point is in the right cluster, whereas a negative value indicates
overlap, with the point lying between two clusters.
3.4.1 Silhouette Coefficient
The silhouette coefficient tests the quality of the resulting clusters and validates a
clustering by combining the cohesion and separation measures. To calculate the silhouette
coefficient, the distance between objects is computed using the Euclidean distance. The
silhouette coefficient can be determined using the following formula:
$$ s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (2) $$
where s_i is the silhouette coefficient of point i, a_i is the average distance from the
point to all other data in its own cluster, and b_i is the minimum average distance from the
point to the data in any other cluster.
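As a hedged illustration of how the coefficient behaves, the sketch below computes per-point silhouette values using the common convention in which a is the mean distance to the other members of the point's own cluster and b is the smallest mean distance to the members of any other cluster. The function names and the toy data are assumptions. Points in singleton clusters are given a value of 0, which is one common convention.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def silhouette_scores(points, labels):
    """Per-point silhouette coefficient, each in the range [-1, 1]."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        # Cohesion: mean distance to the rest of the point's own cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # Separation: smallest mean distance to any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


if __name__ == "__main__":
    toy = [[0], [1], [10], [11]]                  # hypothetical 1-D data
    print(silhouette_scores(toy, [0, 0, 1, 1]))   # well-separated labelling
    print(silhouette_scores(toy, [0, 1, 1, 1]))   # point [1] is mislabelled
```

In the second call the point `[1]` is assigned to the far cluster, so its coefficient is negative, matching the interpretation that negative values flag points placed in the wrong cluster.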
3. RESULTS AND DISCUSSION
The face data were obtained from a survey of 100 students: 5 psychology students, 4
electrical engineering education students, 5 Indonesian literature students, 11 fine arts
students, 9 English literature students, 15 dance students, 29 visual communication design
students, 16 mechanical engineering students, and 6 electrical engineering students. Figure 2
maps the results of the face data survey for the 100 student samples.