Journal of Physics: Conference Series

PAPER • OPEN ACCESS

Data Science
To cite this article: Mahyuddin K M Nasution et al 2020 J. Phys.: Conf. Ser. 1566 012034



ICCAI 2019 IOP Publishing
Journal of Physics: Conference Series 1566 (2020) 012034 doi:10.1088/1742-6596/1566/1/012034

Data Science
Mahyuddin K M Nasution∗, Opim Salim Sitompul, Erna Budhiarti Nababan
Fakultas Ilmu Komputer dan Teknologi Informasi, Universitas Sumatera Utara, Padang Bulan
20155 USU, Medan, Indonesia
E-mail: ∗ [email protected]

Abstract. A new science does not simply appear. Every science starts from interest and
discussion and then looks for a basic foundation; in general, the main foundation of science is
mathematics. Data science comprises structured and systematic knowledge about data. However,
many other sciences, ranging from statistics to computer science, also have a relationship with
the data in question. This paper aims to reveal the obstacles and limitations that keep those
other sciences from fully becoming a data science. On that basis, the definition of data science
needs to be elaborated, and data science then confirmed as a new science that does not depend
directly on several other sciences.

1. Introduction
Data science consists of two words that form a term referring to scientific activities around or
relating to what is recognized as data [1], i.e., starting from collection and processing, then
presenting the result as information that is useful for decision making or beneficial to the
stakeholders concerned with the data [2, 3]. As a science, its boundaries need to be stated, but
the term alone is not enough to state the purpose and objective of its existence as a science,
which has caused various definitions of it to appear [4, 5]. A science has an ontological basis
and spreads taxonomically in various directions of development, while remaining within an
interrelated scope [6, 7].
Data science involves methods in all its activities and scholarship [8], but it is logically mapped
into a scientific integration [9, 10]. On the other hand, data in particular is an object of study
that has long been a part of other fields of scholarship, each in a different way [11], whereas a
science is born theoretically to deliver technology and other applications that pioneer
improvements in the quality of human life [12]. Therefore, data science becomes a paradigm
system involving the empirical, the theoretical, the computational, and big data.

2. Reviews on the track


Data science [13] as a scientific system is an open system [14], which consists of interacting units.
Thus, internal units interact with external units. Internal units focus on data, while external
units complement them from the outside [15]. If data science is a science about data (or knowledge
related to data as a whole), then data science as a system requires that all of its units be
organized in a scientific structure and systematics [16]; see Figure 1.
Data is a phenomenon at the moment and requires a dedicated container for its study. Although data
has long been a major part of statistics, which is based on mathematics, statistics do not have more

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd

Figure 1. Data and anything for data science.

capabilities without involving the core concept, namely probability [17, 18], even though it is
already well established in data analysis [19, 20]. Statistics, in both theory and application,
combines a sum (Σ) with an average (μ), that is, the centrality of the data, together with some
extensions of them [21]. Statistics cannot predict, test, or assess anything before the formula
has been shown to be fair according to the contour of probability [22, 23]. Many scientists claim
that data science, according to the first term, is none other than statistics itself [24].
However, data as a phenomenon cannot be completely revealed by statistics. Statistics in
general targets quantity or quality with parametric and non-parametric containers, whereas
data has its own systematic structure beyond the distribution concepts [25, 26].
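The relationship between the sum (Σ) and the average (μ) mentioned above, and the remark that data carry structure beyond distribution concepts, can be illustrated with a minimal Python sketch. The two toy sequences and the function names `summarize` and `is_sorted` are assumptions made here for illustration; they do not come from the cited works.

```python
def summarize(values):
    """Return the sum (Σ), the count n, and the average μ = Σ / n."""
    total = sum(values)
    n = len(values)
    return total, n, total / n

def is_sorted(values):
    """A simple structural property that the sum and the mean cannot see."""
    return all(a <= b for a, b in zip(values, values[1:]))

# Two hypothetical sequences: the same elements in a different order.
ordered = [1, 2, 3, 4, 5]
shuffled = [3, 1, 5, 2, 4]

print(summarize(ordered))    # Σ, n and μ of the ordered sequence
print(summarize(shuffled))   # identical summary for the shuffled sequence
print(is_sorted(ordered), is_sorted(shuffled))  # the ordering differs
```

Both sequences share the same summary, yet only one of them is ordered, which is one small sense in which data have structure that centrality measures do not capture.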
Information, which is a bequest or derivation of data, sometimes a twin of data, and at particular
times also a source of data, is the most important part of all human activities today [27, 28].
In this case, the method has a role: to solve problems that are linked with, come from, or lead
to the data. Computationally, statistics grows in theory in order to present methods. However,
the methods amount only to the implementation of statistical formulas in computation, using
computers and other tools [29, 30]. Statistics in theory has no way to use these facilities in an
effective and efficient manner [31]. In fact, computing resources such as memory and processor
speed require a balance of performance between them.
The main requirement of a statistical contour, namely probability, is randomness [32]. To
bypass testing, a minimal number of data taken as a sample serves as the statistical reason and
initial assumption for data processing and benchmarking [33], and likewise for the dataset [34].
Instead of looking for ways to test the validity of samples from generally accepted populations,
some issue the concept that statistics = data science [35]. This amounts to assuming data ↔
formula ↔ computation [36, 37], an abstraction without mathematical conclusions that has not
been validly proven. It places data mining, as a shift in the understanding of data, in the
position of unconsciously overturning the data processing rules, so it is not uncommon for the
same data to lead to different conclusions [38].
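As a hedged illustration of how the same data can lead to different conclusions when the processing rule changes, the following Python sketch applies two summary rules to one toy dataset. The data, the threshold, and the function name `conclude` are hypothetical choices made here; the paper does not prescribe them.

```python
from statistics import mean, median

def conclude(values, rule, threshold):
    """Apply a summary rule and report whether the result exceeds a threshold."""
    return rule(values) > threshold

# Hypothetical skewed data: one large value dominates the mean.
data = [1, 2, 2, 3, 42]

by_mean = conclude(data, mean, threshold=5)      # rule 1: arithmetic mean
by_median = conclude(data, median, threshold=5)  # rule 2: median
print(by_mean, by_median)
```

The two rules disagree on the same five values, so the conclusion depends on the rule as much as on the data.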
Is that just recognition, or a process that precedes the data [39]? Is it a presentation, or just
an external dish? Clearly, there is a series of scientific activities before and after the data
take form. The background of the data, for example who collected them or where they originated,
can make the information that results from processing them invalid as a source of knowledge
[40, 41]. Do we need to do forensics [42, 43]? The answer lies in placing statistics in its
position in the scientific sequence [44]. Not a few fields of science struggle with data, and
statistics is one of them. However, the basis of all of them is mathematics, which was originally
divided into four main areas: arithmetic, algebra, trigonometry, and geometry. On that basis,
however much statistics tries to break free from the arithmetic trap, the pitfalls of computing
keep statistics growing around arithmetic, which involves numbers and operators [45]. Along
with that, demands for the meaning of data require the presence of other fields such as
optimization, matrices, distance and similarity, operations research, and others [46]. Are not
fuzzy theory and rough sets also the results of the demands or the interests of the data [47]?

3. An approach
The birth of a science begins with the emergence of its term among certain academics [48]. The
existence of the science is awakened as the term is elaborated, giving rise to many published
documents.
Related terms appear to offset the terms that might become the name of the science. Data
science is a term, and definitions of it come with the documents that describe it. Currently,
the description comes from the information space, as a result of discussion and exchange of
opinions. Two different spaces become the focus of attention. The first is the information
space of the search engine, where information about the term is recorded; it shows the interest
of various stakeholders in it [49]. The term data science is recorded and revealed by year of
discussion, to show the ups and downs of activity around data science. Of course, semantically,
this information involves all connected internet users, who record their interests through their
responses about data science [50].
The second, as a counterweight to that interest, is information about documents related to the
term data science, indexed by search engines in the form of counts and years [51]. These
documents are taken as reliable information because they come from scientists in various
fields of science who are stakeholders in realizing the new science. A graph then illustrates
the trajectory of the scientific journey.

4. A discussion for establishment


To establish a new science, the related term must continue to flow through every related
scientific activity; see Figure 2. Definitions fill the discussion space and then define the
science, even though a definition only holds as long as there are no objections.

4.1. Some related terms


The term science implies knowledge that is organized systematically and structurally [52].
Systematics refers to an effort or study to build and organize knowledge in the form of
explanations, so as to produce boundaries in the form of definitions, followed by a theory in
the form of theorems and proofs [16]. All of that is arranged logically and by reasoning, and
the parts become the structure of the science [53]. On that basis, data science originally looked
to computer science as its foundation [54], but the unresolved complexity trap in algorithms
[55, 56], which has been the focus of computer science, has caused computer science to undergo
scientific diffraction, so it cannot be a strong foundation for a new science.

Figure 2. Hit count of term ”Data Science” in 1960-2019 based on Google.

Conversely, computer science is not meant to replace data science. When the affirmation of data
science failed, the idea arose to use the term datalogy [57, 58]. Based on the nature and
characteristics of data in its dimensions, data becomes part of all existing knowledge
[59, 60]. Just as the suffix -logy emphasizes the word method in methodology, the term datalogy
tends to make data a part of every existing science, in which data supports any study within
it [61]. Therefore, datalogy does not reinforce the existence of the science that is intended as
data science.
Data science as a term expresses data and what surrounds it, starting from its existence
and its meaning. However, the systematic study of data in statistics (its organization,
properties, and analysis, or its role in inference) constrains both fields if the term data
science were to replace statistics, or vice versa. The fact that talking about data today means
dealing with large amounts of data, or recognizing big data [62], causes some statistical
concepts to change [63]. Organizing data is not limited to numbers. Data characteristics abound
as long as they relate to the meaning of life, and data attributes are not like the properties
available in statistics [64]. Data analysis in statistics is constrained by the sample, and it
meets obstacles when dealing with the ambiguity that big data exhibits (whether it is a sample
or a population) [65, 66]. In other words, statistics deals with convergence in computing [67],
and with that the term data science comfortably describes a discipline that typically involves
some mixture of statistics and large-scale computing [68]. Therefore, the term data science as a
phrase is in this case an affirmation of new tasks related to data [69].
In addition to the terms computer science and statistics, not a few other terms have been put
forward as trial names for this science, such as data mining, which obscure the importance of
data science [38]. In the view of data mining, drilling rigs mine big data as if drilling for oil
in a single pool. Although big data is compartmentalized in existing systems, it actually
resides in an information space that has no such structure. Thus, mining data according to the
method can only reveal the whole of big data part by part [70, 71, 72]. The term data science
does not mean data mining in its overall sense, or vice versa.


Figure 3. Terms about ”Data Science” and their definition.

4.2. Toward definition


As a new science, data science is strengthened by the dissemination carried out by
scientists or organizations through lectures or scientific meetings [73]. Discussion of data
science has become a major issue in the scientific world with the presence of journals that serve
as a means of publishing research and review articles on this science, covering the scope
of studies that may arise [74]. Along with that, new definitions of data science and an
additional scope of study are presented to explain it; see Figure 3.
Once again, the basic concepts of statistics, which never escape the arithmetic trap of the
interval [0, 1] from theory to computation [75], are followed by computer science, which must
be content with the study and application of "Program = Data structure + Algorithm" through to
"Genetic algorithms + Data structure = Evolution programs", with the pitfalls of its complexity


Figure 4. Hit count of term ”Data Science” in 1960-2019 based on Google Scholar.

[76]. Statistics does not need to change into data science, while computer science must still
be able to assert itself as a science. Statisticians should unravel the constraints of convergence
to be able to handle data transformation, from classical to fuzzy or rough (rough sets). The
birth of other fields that study data and computers from other angles has sorted out
several derivative fields of science. The focus of study of computer science differs from that
of systems/information sciences, information technology, or computer systems, for example.
Although there are scientists who state that "Data science is the child of statistics and computer
science" [77], in theory data science is the science of data. Differences occur only as a result
of organizing the scientific units needed in a scientific system to deal with the dimensions of
the data. This system is based on the interaction and continuity of the scientific units. Even
though all of them are based on discrete mathematics as a driving force of scientific energy,
the implications differ when interpretation is based on scientific mission and vision
and on external targets. Thus, any science that is born will rely on mathematics as its scientific
foundation.
Data science consists of scientific units that are openly organized but have their own borders.
A border is intended to limit the study in accordance with the output and achievement
targets, while remaining open to recognizing the changes needed. There is an unequivocal
goal: scientists want to produce something that people can judge [78]. The substance of
data science comes from a variety of sciences, or involves multidisciplinary investigations, and
is supported by the application of technology. The data model is the foundation of the
investigation, but building the model requires recognition of the data as a whole. Data models
propose a choice of method for accessing data, so that data analysis can be relied on to
produce information. Collaborating statistics, optimization, and mining methods, including
probabilistic inference, are an attractive choice for bringing knowledge forth [44]. Given the
various constraints of these methods, artificial intelligence is brought in in an integrated way.
Thus, data science is related not only to units of data recognition or data models, statistics,
optimization, data mining, and artificial intelligence, but also involves the support of
technology available in the form of computing with all its devices (hardware systems, software
systems, and algorithms) [79, 80], without making them the substance of the study [81]. As data
with all its characteristics,


Figure 5. Comparison of scientific information and documents about data science.

data science is very closely related to all other important concepts about the data itself, such
as big data and decision making. This assertion also implies a commitment to the data.
Data recording should involve good validation, and forensics about the origin of the data becomes
part of the smart collection of data, because the principle of the use of technology still
applies, namely garbage in, garbage out (GIGO). Behavioral and economic data, for instance, have
different properties from other data, and lead one to suspect the existence of a subjective
system [82].
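The point about validation and the origin of data can be sketched in Python as a check that runs before any analysis. This is only a minimal sketch under stated assumptions: the `Record` fields `collector` and `source`, the admissibility rule, and the sample records are all hypothetical and are not taken from the paper or its references.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    value: Optional[float]
    collector: Optional[str]  # who recorded the value (hypothetical field)
    source: Optional[str]     # where the value originated (hypothetical field)

def is_admissible(record: Record) -> bool:
    """Accept a record only if it has a value and a traceable origin."""
    return (
        record.value is not None
        and bool(record.collector)
        and bool(record.source)
    )

def collect(records):
    """Split records into accepted and rejected before any analysis runs."""
    accepted = [r for r in records if is_admissible(r)]
    rejected = [r for r in records if not is_admissible(r)]
    return accepted, rejected

# Made-up records for illustration only.
records = [
    Record(3.2, "survey team A", "field form 17"),
    Record(None, "survey team A", "field form 18"),  # missing value
    Record(4.1, None, "unknown upload"),             # no known collector
]
accepted, rejected = collect(records)
print(len(accepted), len(rejected))
```

Keeping the rejected records, rather than silently dropping them, leaves a trail that a later forensic review of the data's origin could use.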

4.3. Track record


The track record of the development of data science as a new science can be seen from the growth
of information in the information space. Information related to data science was retrieved from
the Google search engine starting from the first year the term appeared in the literature
(Figure 2 and Figure 3), and was confirmed through documentary evidence on Google
Scholar; see Figure 4. Use of the term data science as the name of this new science has reached
its culmination point, and scientists are now examining its completeness under different
headings (Figure 5). This is shown by the decreasing percentage of documents related to data
science compared with information about research group sites or other information related to
data science [83].
The debate about what data science is has ended, and data science is accepted as a new
science that is entirely concerned with data, in contrast to statistics, computer science, data
mining, and so on.
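The comparison described in this section, the percentage of documents relative to general information per year, can be sketched in Python as follows. The function names and the counts are placeholders assumed here for illustration; they are not the hit counts behind Figures 2, 4, and 5, and the paper does not state how its percentages were computed.

```python
def document_share(info_hits, doc_hits):
    """Percentage of document hits relative to information hits, per year.

    Both arguments map a year to a hit count. Years missing from either
    mapping, or with zero information hits, are skipped.
    """
    shares = {}
    for year in sorted(info_hits):
        if year in doc_hits and info_hits[year] > 0:
            shares[year] = 100.0 * doc_hits[year] / info_hits[year]
    return shares

def is_decreasing(shares):
    """True if the share falls from each recorded year to the next."""
    values = [shares[year] for year in sorted(shares)]
    return all(a > b for a, b in zip(values, values[1:]))

# Placeholder counts, NOT the values behind the paper's figures.
info_hits = {2017: 1000, 2018: 2000, 2019: 4000}
doc_hits = {2017: 100, 2018: 150, 2019: 200}

shares = document_share(info_hits, doc_hits)
print(shares)
print(is_decreasing(shares))
```

With these placeholder counts both series grow, yet the document share falls, which is the kind of pattern the text reads as the term having reached its culmination point.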

4.4. Definition
In dealing with data as a whole, as in Figure 1, expressing data science requires
considering a series of relationships among data (δ), information (ι), and knowledge (κ), as
stated in the following [84]:
Data Science is ”the extraction of knowledge from high-volume data, using skills in computing
science, statistics and the specialist domain knowledge of experts.”
OR


Data Science is ”concerned with the extraction of useful knowledge from large, complex data
sets.” 1
However, when something is extracted from its source, where, for example, Ω represents
the source and γ represents an extraction function that involves artificial intelligence [85, 86,
87, 88, 89, 90, 91, 92, 93, 94, 95], data science is

DS(δ, ι, κ) = γ(Ω) + μ(Σ)

where μ is a function involving the tools available through other knowledge Σ.

5. Conclusion
In particular, data science has been stimulated by various relevant experts. Various terms have
been raised as suggestions and invitations, and various responses from the public and other
scientists are reflected in the study documents presented at various scientific activities.
As the search results show, data science is a new science, even though data has long been
recognized in all scientific activities. Furthermore, given the importance of establishing data
science, it is necessary to study the current status of data science chronologically.

Acknowledgment: This paper is the result of a study visit to Europe by Universitas Sumatera
Utara Team of the Erasmus+ DS&AI project.

References
[1] L Manovich 2015 Data science and digital art history International Journal for Digital Art History 1.
[2] E K Nwabueze, P Ranch 2005 Methods for dynamically accessing, processing, and presenting data acquired
from disparate data sources United States Patent No:USOO6959306B2.
[3] P Obrador 2006 Presenting a collection of media objects United States Patent No:US007149755B2.
[4] M K M Nasution 2007 SumutSiana Renungan, IPR:EC00201944654. DOI:10.13140/RG.2.2.10127.59047.
[5] M K M Nasution 2018 SumutSiana IOP Conference Series: Materials Science and Engineering 309(1).
DOI:10.1088/1757-899X/309/1/012131.
[6] M K M Nasution 2017 Ontologi Ontologi dan Taksonomi Informasi 1, IPR:EC00201945521.
DOI:10.13140/RG.2.2.22463.92323.
[7] M K M Nasution 2018 Ontology Journal of Physics: Conference Series 1116(2).
DOI:10.1088/1742-6596/1116/2/022030.
[8] D Donoho 2017 50 Years of data science Journal of Computational and Graphical Statistics 26(4).
[9] S Iwata 2008 Editor’s Note: Scientific ”Agenda” of data science Data Science Journal 7.
[10] C A Mattmann 2013 A vision for data science Nature 493.
[11] F Xia, W Wang, T M Bekele, H Liu 2017 Big scholarly data: A survey IEEE Transactions on Big Data
3(1).
[12] M Bunge 1975 What is a quality of life indicator? Social Indicator Research 2.
[13] M K M Nasution 2019 Sains Data Sains Data 1(1). DOI:10.13140/RG.2.2.21816.49924.
[14] K S Baker, G C Bowker 2007 Information ecology: open system environment for data, memories, and knowing
Journal of Intelligent Information System 29.
[15] L Berchicci 2013 Towards an open R&D system: Internal R&D investment, external knowledge acquisition
and innovative performance Research Policy 42(1). DOI:10.1016/j.respol.2012.04.017.
[16] M K M Nasution, I Aulia, M Elveny 2019 Data Journal of Physics: Conference Series 1235(1).
DOI:10.1088/1742-6596/1235/1/012110.
[17] G Shafer 1990 The unity and diversity of probability Statistical Science 5(4).
[18] M Borovcnik 2011 Strengthening the role of probability within statistics curricula Teaching Statistics in
School Mathematics-Challenges for Teaching and Teacher Education, NISS 14.
[19] J O Ramsay, B W Silverman 2005 Functional Data Analysis Springer.
[20] P Mihas 2019 Qualitative data analysis Oxford Research Encyclopedias.
DOI:10.1093/acrefore/9780190264093.013.1195.
1 https://www.universiteitleiden.nl/en/research/research-projects/science/eu-erasmus-curriculum-development-in-data-science-and-artificial-intelligence


[21] D G Altman, J M Bland 1995 Statistics notes: The normal distribution BMJ doi:10.1136/bmj.310.6975.298.
[22] T Asparouhov, B Muthen 2006 Robust chi square difference testing with mean and variance adjusted test
statistics Mplus Web Notes 10.
[23] K-H Yuan, P M Bentler 2011 Normal theory based test statistics in structural equation modelling British
Journal of Mathematical and Statistical Psychology 51(2). DOI:10.1111/j.2044-8317.1998.tb00682.x.
[24] W J Youden 1950 Index for rating diagnostic tests Cancer 3(1). DOI:10.1002/1097-0142(1950)3.
[25] J Elith, M A Burgman, H M Regan 2002 Mapping epistemic uncertainties and vague concepts in predictions
of species distribution Ecological Modelling 157(2-3).
[26] W Chanhom, C Anutariya 2019 TOMS: A linked open data system for collaboration and distribution of
cultural heritage artifact collections of National Museums in Thailand New Generation Computing 37(4).
DOI:10.1007/s00354-019-00063-1.
[27] S Rapps, E J Weyuker 1985 Selecting software test data using data flow information IEEE Transactions on
Software Engineering SE-11(4).
[28] M Chen, D Ebert, H Hagen, R S Laramee, R van Liere, K-L Ma, W Ribarsky, G Scheuermann, D Silver 2009
Data, Information, and Knowledge in Visualization IEEE Computer Graphics and Applications 29(1).
DOI:10.1109/MCG.2009.6.
[29] K H Yuan 2005 Fit indices versus test statistics Multivariate Behavioral Research 40(1).
DOI:10.1207/s15327906mbr4001_5.
[30] N Siripon 2013 A novel design of distributed oscillator based on the balanced oscillator technique 2013
10th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and
Information Technology, ECTI-CON 2013. DOI:10.1109/ECTICon.2013.6559633.
[31] J L Bentley, M I Shamos 1978 A problem in multivariate statistics: algorithm, data structure and applications
CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE.
[32] P M Bentler, K-H Yuan 1999 Structural equation modeling with small samples: Test statistics Multivariate
Behavioral Research 34(2).
[33] P J Fleming, J J Wallace 1986 How not to lie with statistics: The correct way to summarize benchmark
results Communication of the ACM 29(3).
[34] A Moore, M S Lee 1998 Cached sufficient statistics for efficient machine learning with large datasets Journal
of Artificial Intelligence Research 8.
[35] N B-n Vilches, L Napalkova 2016 Application of data science techniques to the field of air traffic control
Universitat Autonoma de Barcelona.
[36] O Rosenbach 1953 A contribution to the computation of the ”second derivative” from gravity data Geophysics
18(4). DOI:10.1190/1.1437943.
[37] K A Berteussen, B Ursin 1983 Approximate computation of the acoustic impedance from seismic data
Geophysics 48(10). DOI:10.1190/1.1441415.
[38] W van der Aalst 2016 Data science in action Process mining.
[39] I Fischler, P A Bloom 1979 Automatic and attentional processes in the effects of sentence contexts on word
recognition Journal of Verbal Learning and Verbal Behavior 18(1).
[40] M K M Nasution, M Hardi, R Sitepu 2016 Using social networks to assess forensic of negative issues
Proceedings of 2016 4th International Conference on Cyber and IT Service Management, CITSM 2016.
DOI:10.1109/CITSM.2016.7577513.
[41] M K M Nasution, M Hardi, R Sitepu, E Sinulingga 2017 A Method to Extract the Forensic about Negative
Issues from Web IOP Conference Series: Materials Science and Engineering 180(1).
DOI:10.1088/1757-899X/180/1/012241.
[42] M K M Nasution, D Sitompul, M Harahap 2018 Modeling reliability measurement of interface on information
system: Towards the forensic of rules IOP Conference Series: Materials Science and Engineering 308(1).
DOI:10.1088/1757-899X/308/1/012042.
[43] M K M Nasution 2019 Forensic in information technology: A redefinition Journal of Physics: Conference
Series 1235(1). DOI:10.1088/1742-6596/1235/1/012106.
[44] W S Cleveland 2001 Data science: An action plan for expanding the technical areas of the field of statistics
International Statistical Review 69.
[45] J B Kruskal 1977 Three-way arrays: rank and uniqueness of trilinear decompositions, with application to
arithmetic complexity and statistics Linear Algebra and its Applications 18(2).
[46] W Noor, M N Dailey, P Haddawy 2014 Learning predictive choice models for decision optimization IEEE
Transactions on Knowledge and Data Engineering 26(8). DOI:10.1109/TKDE.2013.173.
[47] M K M Nasution 2018 The uncertainty: A history in Mathematics Journal of Physics: Conference Series
1116(2). DOI:10.1088/1742-6596/1116/2/022031.
[48] M K M Nasution 2020 The birth of a science IOP Conference Series: Earth and Environmental Science.
[49] M K M Nasution 2018 Singleton: A role of the search engine to reveal the existence of something in
information space IOP Conference Series: Materials Science and Engineering 420(1).
DOI:10.1088/1757-899X/420/1/012137.
[50] M K M Nasution 2018 Semantic interpretation of search engine resultant IOP Conference Series: Materials
Science and Engineering 300(1). DOI:10.1088/1757-899X/300/1/012053.
[51] M K M Nasution 2018 Doubleton: A role of the search engine to reveal the existence of relation in
information space IOP Conference Series: Materials Science and Engineering 420(1).
DOI:10.1088/1757-899X/420/1/012138.
[52] J L Heilbron 2003 The Oxford Companion to the History of Modern Science Oxford University Press.
[53] P M Dung, G Sartor 2011 The modular logic of private international law Artificial Intelligence and Law
19(2-3). DOI:10.1007/s10506-011-9112-5.
[54] H A Gohel 2015 Data science - data, tools & technologies CSI Communication 8.
[55] V Cutello, G Nicosia, M Pavone 2004 Exploring the Capability of Immune Algorithms: A characterization
of hypermutation operators International Conference on Artificial Immune Systems LNCS 3239.
DOI:10.1007/978-3-540-30220-9_22.
[56] D I De Silva, N Kodagoda, S R Kodituwakku, A J Pinidiyaarachchi 2017 Analysis and enhancements of a
cognitive based complexity measure IEEE International Symposium on Information Theory - Proceedings.
DOI:10.1109/ISIT.2017.8006526.
[57] P Naur 1966 The science of datalogy Communication of the ACM.
[58] E Sveinsdottir, E Frokjaer 1988 Datalogy - The copenhagen tradition of computer science BIT Numerical
Mathematics 28(3).
[59] C A Chinn, W F Brewer 1993 The role of anomalous data in knowledge acquisition: A theoretical framework
and implication for science instruction Center for the Study of Reading Technical Report No. 583.
[60] L M Ghiringhelli, J Vybiral, S V Levchenko, C Draxl, M Scheffer 2015 Big data of materials science: Critical
role of the descriptor Physical Review Letters 114.
[61] M K M Nasution 2019 Research methodology IOP Conference Series: Materials Science and Engineering
[62] L Manovich 2015 Data science and digital art history DAH-Journal 1.
[63] F Provost, T Fawcett 2013 Data science and its relationship to big data and data-driven decision making
Big data 1(1).
[64] W R van Hage, M van Erp, V Malaisé 2012 Linked Open Piracy: A Story about e-Science, Linked Data,
and Statistics Journal on Data Semantics 1(3).
[65] A Gandomi, M Haidar 2015 Beyond the hype: Big data concepts, methods, and analytics International
Journal of Information Management 32(2). DOI:10.1016/j.ijinfomgt.2014.10.007.
[66] G Grolemund, H Wickham 2018 R for data science Journal of Statistical Software 77(1).
[67] M K M Nasution 2017 Modelling and simulation of search engine Journal of Physics: Conference Series
801(1). DOI:10.1088/1742-6596/801/1/012078.
[68] J B Greenhouse 2013 Statistical Thinking: The bedrock of data science The Huffington Post.
[69] J Hardin, R Hoerl, N J Horton, D Nolan, B Baumer, O Hall-Holt, P Murrell, R Peng, P Roback, D T
Lang, M D Ward 2015 Data science in statistics curricula: Preparing students to ”think with data” The
American Statistician 69(4).
[70] M K M Nasution 2016 Social network mining (SNM): A definition of relation between the resources
and SNA International Journal on Advanced Science, Engineering and Information Technology 6(6).
DOI:10.18517/ijaseit.6.6.1390.
[71] M K M Nasution, M Hardi, R Syah 2017 Mining of the social network extraction Journal of Physics:
Conference Series 801(1). DOI:10.1088/1742-6596/801/1/012020.
[72] M K M Nasution 2019 Social Network Mining: A discussion Journal of Physics: Conference Series 1235(1).
DOI:10.1088/1742-6596/1235/1/012111.
[73] M K M Nasution 2018 Indonesia knowledge dissemination: A snapshot Journal of Physics: Conference Series
978(1). DOI:10.1088/1742-6596/978/1/012012.
[74] P F Uhlir, P Schroder 2007 Open data for global science Data Science Journal 6.
[75] S R Kalidindi, M De Graef 2015 Materials data science: Current status and future outlook Annu. Rev.
Mater. Res. 45.
[76] X Liu, D Li, S Wang, Z Tao 2007 Effective algorithm for detecting community structure in complex networks
based on GA and clustering International Conference on Computational Science, Computational Science
ICCS 2007. DOI:10.1007/978-3-540-72586-2_95.
[77] D M Blei, P Smyth 2017 Science and data science PNAS 114(33).
[78] D J Patil 2011 Building data science teams O’Reilly.
[79] T F Abidin, R Ferdhiana 2017 Algorithm for updating n-grams word dictionary for web classification 2016
International Conference on Informatics and Computing, ICIC 2016. DOI:10.1109/IAC.2016.7905758.
[80] S Xiang, H Zhu, X Wu, L Xiao, M Bonsangue, W Xie, L Zhang 2020 Modeling and verifying the topology
discovery mechanism of OpenFlow controllers in software-defined networks using process algebra Science
of Computer Programming 187. DOI:10.1016/j.scico.2019.102343.
[81] M Molina-Solana, M Ros, M D Ruiz, J Gomez-Romero, M J Martin-Bautista 2017 Data science for building
energy management: A review Renewable and Sustainable Energy Reviews 70.
[82] S Chawla, J Hartline, D Nekipelov 2014 Mechanism design for data science arXiv:1404.5971v2 [cs.GT].
[83] M K M Nasution, R Syah, M Elveny 2017 Studies on behaviour of information to extract the meaning behind
the behaviour Journal of Physics: Conference Series 801(1). DOI:10.1088/1742-6596/801/1/012022.
[84] V Dhar 2012 Data science and prediction Commun ACM 56.
[85] M K M Nasution, S A Noah 2010 Superficial method for extracting social network for academics using web
snippets Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics) 6401 LNAI. DOI:10.1007/978-3-642-16248-0_68.
[86] M K M Nasution, S A Noah 2011 Extraction of academic social network from online database
2011 International Conference on Semantic Technology and Information Retrieval STAIR 2011.
DOI:10.1109/STAIR.2011.5995766.
[87] M K M Nasution, S A Noah 2012 Information retrieval model: A social network extraction perspective
Proceedings - 2012 International Conference on Information Retrieval and Knowledge Management,
CAMP’12. DOI:10.1109/InfRKM.2012.6204999.
[88] M K M Nasution 2014 New method for extracting keyword for the social actor Lecture Notes in Computer
Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
8397 LNAI(PART 1). DOI:10.1007/978-3-319-05476-6_9.
[89] M K M Nasution, O S Sitompul 2017 Enhancing extraction method for aggregating strength relation between
social actors Advances in Intelligent Systems and Computing 573. DOI:10.1007/978-3-319-57261-1_31.
[90] M K M Nasution, S A Noah 2017 Social Network Extraction Based on Web. A Comparison of Superficial
Methods Procedia Computer Science 124. DOI:10.1016/j.procs.2017.12.133.
[91] C Anutariya, R Dangol 2018 VizLOD: Schema extraction and visualization of linked open data Proceeding of
2018 15th International Joint Conference on Computer Science and Software Engineering, JCSSE 2018.
DOI:10.1109/JCSSE.2018.8457325.
[92] M Elfida, M K M Nasution, O S Sitompul 2018 Enhancing to method for extracting Social network by the
relation existence IOP Conference Series: Materials Science and Engineering 300(1).
DOI:10.1088/1757-899X/300/1/012057.
[93] M K M Nasution 2018 Social network extraction based on Web: 1. Related superficial methods IOP
Conference Series: Materials Science and Engineering 300(1). DOI:10.1088/1757-899X/300/1/012056.
[94] M K M Nasution 2018 Social network extraction based on Web: 2. Strategies in superficial methods Journal
of Physics: Conference Series. DOI:10.1088/1742-6596/1116/2/022029.
[95] M K M Nasution, O S Sitompul, S A Noah 2018 Social network extraction based on Web: 3. The integrated
superficial method Journal of Physics: Conference Series 978(1). DOI:10.1088/1742-6596/978/1/012033.
