Building a Data Warehouse to Support Active Student Management: Analysis and Design

Indrajani Sutedja
Information Systems Department,
School of Information Systems
Bina Nusantara University
Jakarta, Indonesia 11480
[email protected]

Pandi Yudha
Information Systems Department,
School of Information Systems
Bina Nusantara University
Jakarta, Indonesia 11480
[email protected]

Nurul Khotimah
Information Systems Department,
School of Information Systems
Bina Nusantara University
Jakarta, Indonesia 11480
[email protected]

Claresta Vasthi
Information Systems Department,
School of Information Systems
Bina Nusantara University
Jakarta, Indonesia 11480
[email protected]

Abstract: Analyzing the number of active students is very important for a university. Universities are required to comply with Regulation Number 32 of 2016 of the Ministry of Research, Technology, and Higher Education of the Republic of Indonesia on the accreditation of study programs and universities. However, there are many difficulties in analyzing active student reports and in producing additional and ad hoc reports, and there is a need to develop business intelligence and data mining. The purpose of this research is to analyze and design a data warehouse that integrates the various operational databases needed to provide information about active students at XYZ University. The analysis method consists of an analysis of the running system and an analysis of its weaknesses. The data warehouse design method uses the four stages (Four-Step Methodology) used by Ralph Kimball in designing a data warehouse: selecting the business process, declaring the grain, identifying the dimensions, and identifying the facts. The results are the design of a data warehouse and a dashboard that provide relevant, integrated information about active students that can be viewed from different angles. The designed data warehouse is needed to help the organization analyze information as needed and to help management make strategic decisions. The conclusion is that building a data warehouse to support active student management can help the university analyze active students and make decisions in the student area.

Keywords: data warehouse, dashboard, four-step methodology, strategic

978-1-5386-5821-5/18/$31.00 ©2018 IEEE 3-5 September 2018, Bina Nusantara University, Jakarta, Indonesia
2018 International Conference on Information Management and Technology (ICIMTech)

I. INTRODUCTION

Information is now urgently needed in every aspect of life and has become an important requirement for future development. However, the increased need for information is not matched by timely and accurate presentation of information; often the information still has to be traced in depth from a large amount of data. Information technology plays an important role in aligning the strategies, processes, and technologies that can enhance an organization's competitiveness.

XYZ University is a private university spread across several locations, one of them in DKI Jakarta. Each year XYZ University produces graduates and increases its number of students. As a result, the amount of data it holds keeps growing, which affects the analysis of important information. Until now, to analyze student data, especially the total number of active students, top management has spent a lot of time obtaining structured, accurate, and complete information. This is because the data used to generate the information comes from several different databases. Active student data is needed to support the strategic decision-making process.

To overcome these problems, it is necessary to invest in technology that is capable of managing large amounts of data and performing effective analysis, so that data in the organization can be processed into valuable information for competitive advantage and can support long-term information needs [1]. This technology is called a data warehouse. A data warehouse is an analytical database that can support the decision-making process. Various databases within the organization can be integrated into a data warehouse, which makes it convenient for users to perform data analysis. Data that has been integrated in the data warehouse can be used to present information that can be reviewed from various dimensions and at an adjustable level of detail [2]. In addition, a data warehouse is a source for business intelligence and data mining.

In previous research, the student retention rate is one of the biggest concerns of universities. To address this problem, some universities implement data mining to determine the variables correlated with student retention. These universities build a data warehouse to aggregate the data needed to overcome the challenge [3][4][5][6].

To enhance decision-making capabilities, business organizations have implemented data warehouses. As a prerequisite, a data warehouse analysis and design framework is needed, specifically for data mining and business reporting purposes, for a better and more thorough data analysis of the university in the future [7][8][9].

The research questions are how to analyze and design metadata from three operational databases and how to build a data warehouse to support active student management.

The scope of this research is building a data warehouse to support active student management. The purpose of this research is to build a data warehouse for analyzing active students. The benefits of this research include producing analytic data and a dashboard for decision making.

This paper consists of four sections: introduction, literature study, result and discussion, and conclusion. The first section covers the background, previous research, research questions, scope, objectives, and benefits. The second and third sections cover the theories that support the design of a data warehouse for active students.

II. LITERATURE STUDY

A. Data Warehousing

A data warehouse is used to collect and integrate databases so that they can serve as a decision support system. It focuses on providing information to support a company's decision making. A data warehouse usually provides high-performance storage, performs calculation and data aggregation operations, and provides an interface that allows users to issue commands on the data [11][12][13]. Compared with an operational database, which handles Create, Read, Update, and Delete (CRUD) processes, a data warehouse mostly performs Read operations to analyze data based on dimensions that have been normalized and illustrated in the star schema [14].

The star schema is designed to produce relevant information. The relationships between the tables in the star schema are managed using surrogate keys and table indexes to facilitate the query process [10].

B. Extract, Transform, and Loading

ETL (Extract, Transform, and Load) covers everything between the operational source systems and the data warehouse presentation area, or business intelligence [15]. ETL is the main activity in the data warehouse. Extraction refers to reading and understanding the data sources and copying the data needed from the heterogeneous sources into the ETL system for further manipulation. After the data has been extracted into the ETL system, the next process is transformation, in which data is cleaned and data from multiple sources is combined. Finally, the transformed data is loaded into the data warehouse. When an organization uses a data warehouse built from the database, the amount of ETL processing is reduced significantly [16].

ETL models are divided into three: (1) conceptual modeling, which aims to create conceptual models for ETL processes that describe the mapping of attributes from a data source to data warehouse attributes; (2) logical modeling, which focuses on the flow of data from the data source to the data warehouse, ending at the data store; and (3) physical modeling, which maps the transformation result into tables of data [17].

III. RESULT AND DISCUSSION

To obtain an overview of the current conditions and the problems facing the organization, appropriate data and information are needed for the analysis and design of the data warehouse [18]. Therefore, several activities were conducted to obtain information related to active students at XYZ University: a literature study related to data warehouses, interviews with specific users, observation and analysis of reports generated from the operational databases, analysis of the currently running systems, analysis of the weaknesses of the running systems, and problem-solving analysis.

The data warehouse design for XYZ University is based on the Kimball methodology. This method consists of four stages [19]: select the business process, declare the grain, identify the dimensions, and identify the facts.

Select the Business Process. At this stage the subject is determined from the problem faced. Based on the analysis, several important processes relate to operational student activities at XYZ University: Student Registration, Student Assessment, Student Leave Request, Student Withdrawal, and Graduation.

Declare Grain. This stage determines the balance between business needs and available data, and specifies what kind of data can be shown in the fact tables. The grain in this data warehouse design covers the following. First, for the Student Registration process: the quantity of student registrations, the quantity of active students, and the quantity of students taking a thesis. Second, for the Student Assessment process: the quantity of students based on Student Activity Transcript (SAT) points and social work hours, and the quantity of students whose achievement index is not eligible for graduation. Third, for the Student Leave Request process: the quantity of students who apply for leave and the total reasons for frequent leave. Fourth, for the Student Withdrawal process: the quantity of students who resigned, the total reasons for frequently submitted resignations, and the quantity of students who moved to a different major. Last, for the Graduation process: the quantity of students who are eligible for and attend the graduation ceremony, and the average achievement index of graduating students.

Identify the Dimensions. At this stage the dimensions related to each fact table are identified. Here is one of the examples:

TABLE I. IDENTIFY DIMENSION FOR STUDENT LEAVE TRANSACTION FACT

| Dimension \ Grain | Quantity of students who apply for leave | Total reasons for frequent leave |
|---|---|---|
| Student | X | |
| Program | X | X |
| Lecture Period | X | X |
| Time | X | X |

Identify Facts. At this stage the facts to be used in the data warehouse are identified, based on the subjects that have been determined. The facts contained in the data warehouse are Registration Fact, Assessment Fact, Leave Request Fact, Withdrawal Fact, and Graduation Fact.

Here is an example of a designed star schema from the active student data warehouse. Factcuti is related to dimWaktu, dimMasaPerkuliahan, dimProgramAkademik, and dimMahasiswa.

Fig. 1. Leave Request Star Schema

In the ETL process, data is extracted from the three operational databases mainly used at XYZ University and is then transformed into the format that corresponds to the data structure of the data warehouse. These are the steps used in the ETL process with Pentaho Data Integration.

TABLE II. TRANSFORMATION PROCESS STEPS ON PENTAHO DATA INTEGRATION

| Step Name | Category | Description |
|---|---|---|
| Add sequence | Transform | Generates consecutive values |
| Block this step until steps finish | Flow | Blocks the stage until the process at the specified stage is complete |
| Calculator | Transform | Creates new attributes by performing simple calculations |
| Execute SQL script | Scripting | Runs a Structured Query Language (SQL) script |
| Data Grid | Input | Inserts static data rows |
| If field value is null | Utility | Specifies a certain value if the data is null |
| Insert / Update | Output | Updates or inserts data in the database |
| Select values | Transform | Adjusts specific attributes or values and sets attribute metadata |
| Database lookup | Lookup | Looks for a specific value based on the attribute values contained in the database |
| Stream lookup | Lookup | Looks for a particular value that comes from another source in the transformation process |
| Table input | Input | Reads data from a database table |
| Table output | Output | Inserts data into a database table |

Pentaho Data Integration (PDI) is the open source software that we used for integrating the databases. The advantages of PDI are that it (1) has a large collection of transformation steps, (2) has modules that are easy to use in data warehouse design, (3) has good performance and scalability, and (4) can be extended with various additional plugins [20].

Here is an example of the ETL stages used to extract, transform, and load data from the operational databases into the data warehouse.

Fig. 2. Extract, Transform, and Loading Process of Student Dimension

The Table input step reads the mahasiswa master table. The process starts by checking whether each field value is null. Next come the Select values step and the Block this step until steps finish step, and finally the Table output step writes the result.
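The student dimension flow described above can be approximated outside Pentaho Data Integration. The following is a minimal Python sketch using the standard sqlite3 module, not the authors' PDI transformation. The `emplid` attribute and the `skMahasiswa` surrogate key come from the paper; the `nama` and `program` columns, the `UNKNOWN` placeholder, and the function names are assumptions made only for illustration.

```python
import sqlite3

DEFAULT_TEXT = "UNKNOWN"  # assumed placeholder used when a source field is null


def load_dim_mahasiswa(conn):
    """Copy rows from a source 'mahasiswa' table into 'dimMahasiswa'."""
    cur = conn.cursor()
    # Add sequence: the surrogate key is generated by the target table.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS dimMahasiswa ("
        "skMahasiswa INTEGER PRIMARY KEY AUTOINCREMENT, "
        "emplid TEXT NOT NULL, nama TEXT NOT NULL, program TEXT NOT NULL)"
    )
    # Table input: read the master table. fetchall() reads every row before
    # anything is written, which loosely mirrors "Block this step until
    # steps finish".
    rows = cur.execute(
        "SELECT emplid, nama, program FROM mahasiswa ORDER BY emplid"
    ).fetchall()
    # If field value is null + Select values: replace nulls and keep only
    # the attributes the dimension needs.
    cleaned = [
        (
            emplid,
            nama if nama is not None else DEFAULT_TEXT,
            program if program is not None else DEFAULT_TEXT,
        )
        for emplid, nama, program in rows
    ]
    # Table output: write the cleaned rows to the dimension table.
    cur.executemany(
        "INSERT INTO dimMahasiswa (emplid, nama, program) VALUES (?, ?, ?)",
        cleaned,
    )
    conn.commit()
    return len(cleaned)


def demo_student_dimension():
    """Build a tiny in-memory source table and run the load once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mahasiswa (emplid TEXT, nama TEXT, program TEXT)")
    conn.executemany(
        "INSERT INTO mahasiswa VALUES (?, ?, ?)",
        [
            ("1001", "Ani", "Information Systems"),
            ("1002", None, "Accounting"),
            ("1003", "Budi", None),
        ],
    )
    load_dim_mahasiswa(conn)
    return conn.execute(
        "SELECT skMahasiswa, emplid, nama, program "
        "FROM dimMahasiswa ORDER BY skMahasiswa"
    ).fetchall()
```

Calling `demo_student_dimension()` returns the loaded dimension rows, each with a generated surrogate key and with nulls replaced by the placeholder. A real load would also need to handle changed and repeated source rows, which this sketch leaves out.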

Fig. 3. Extract, Transform, and Loading Process of Leave Request Facts

The Table input step reads the leave_request_transaction table. The process starts by looking up the dimensions dimMahasiswa, dimProgramAkademik, dimMasaPerkuliahan, and dimWaktu. Next come the Select values step and the Block this step until steps finish step, then the check for null field values, and finally the Table output step writes the result.

TABLE III. SOURCE TABLE METADATA OF LEAVE REQUEST FACTS

| Source Database | Source Table | Source Attribute |
|---|---|---|
| Student_Active_Olap | dimwaktu | skwaktu |
| Student_Active_Olap | dimmahasiswa | skmahasiswa |
| Student_Active_Olap | dimprogramakademik | skprogramakademik |
| Student_Active_Olap | dimmasaperkuliahan | skmasaperkuliahan |
| Bcs | ps_prog_rsn_tbl | leave_type |
| Bcs, Legacy | ps_prog_rsn_tbl, transaksi_cuti_resmi | descry, alasan |
| Bcs, Legacy | ps_bn_ofcleav_dtl, transaksi_cuti_resmi | emplid |

The metadata of Leave_Request_Fact consists of source tables and target tables. The source tables come from the student_active_olap, bcs, and legacy databases.

TABLE IV. TARGET LEAVE REQUEST FACTS TABLE METADATA

| Target Attribute | Data Type | Length | Description | Transform |
|---|---|---|---|---|
| skWaktu | int | 4 | Surrogate key | Copy |
| skMahasiswa | int | 4 | Surrogate key | Copy |
| skProgramAkademik | int | 4 | Surrogate key | Copy |
| skMasaPerkuliahan | int | 4 | Surrogate key | Copy |
| jenisCuti | varchar | 50 | Leave type (Jenis Cuti) | Copy |
| alasanCuti | varchar | 200 | Leave reason (Alasan Cuti) | Copy |
| totalMahasiswaCuti | int | 4 | Total students on leave (Total Mahasiswa Cuti) | Transform Count (emplid) |

Dashboard design. The dashboard design aims to represent and visualize the data contained in the data warehouse so that it is easy to understand and helps management explore the data.

Fig. 4. Dashboard Design of Student Withdrawal

Fig. 5. Dashboard Visualization of Student Withdrawal

Figure 5 shows the dashboard visualization of student withdrawal. For the withdrawal process, the analysis covers the following. First, the total quantity of students who resign, by student, academic program, lecture period, and time. Second, the total reasons for frequent resignations, by student, academic program, lecture period, withdrawal type, and time. Last, the total quantity of students who moved to a different major, by student, academic program, lecture period, withdrawal type, and time.
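Dashboard figures of this kind are aggregates of a fact table sliced by its dimensions. The sketch below is a minimal illustration in Python with sqlite3, not the authors' dashboard query. It uses the leave request fact because its columns are listed in Table IV, and the withdrawal fact would presumably follow the same pattern. The table name `factCuti`, the descriptive columns `namaProgram` and `namaPeriode`, and the sample rows are assumptions.

```python
import sqlite3

# Fact columns follow Table IV; the dimension name columns are hypothetical.
SCHEMA = """
CREATE TABLE dimProgramAkademik (
    skProgramAkademik INTEGER PRIMARY KEY,
    namaProgram TEXT
);
CREATE TABLE dimMasaPerkuliahan (
    skMasaPerkuliahan INTEGER PRIMARY KEY,
    namaPeriode TEXT
);
CREATE TABLE factCuti (
    skWaktu INTEGER,
    skMahasiswa INTEGER,
    skProgramAkademik INTEGER REFERENCES dimProgramAkademik,
    skMasaPerkuliahan INTEGER REFERENCES dimMasaPerkuliahan,
    jenisCuti TEXT,
    alasanCuti TEXT,
    totalMahasiswaCuti INTEGER
);
"""


def leave_totals(conn):
    """Total students on leave per academic program and lecture period."""
    query = """
        SELECT p.namaProgram, m.namaPeriode, SUM(f.totalMahasiswaCuti)
        FROM factCuti AS f
        JOIN dimProgramAkademik AS p
          ON p.skProgramAkademik = f.skProgramAkademik
        JOIN dimMasaPerkuliahan AS m
          ON m.skMasaPerkuliahan = f.skMasaPerkuliahan
        GROUP BY p.namaProgram, m.namaPeriode
        ORDER BY p.namaProgram, m.namaPeriode
    """
    return conn.execute(query).fetchall()


def demo_leave_totals():
    """Load a few invented rows and run the aggregate query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dimProgramAkademik VALUES (?, ?)",
        [(1, "Accounting"), (2, "Information Systems")],
    )
    conn.executemany(
        "INSERT INTO dimMasaPerkuliahan VALUES (?, ?)",
        [(1, "2017 Odd"), (2, "2017 Even")],
    )
    conn.executemany(
        "INSERT INTO factCuti VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 10, 1, 1, "Official", "Work", 1),
            (1, 11, 1, 1, "Official", "Health", 1),
            (2, 12, 2, 1, "Official", "Work", 1),
            (3, 13, 2, 2, "Official", "Family", 1),
        ],
    )
    return leave_totals(conn)
```

Changing the `GROUP BY` columns, for example to `alasanCuti` or to an attribute of the time dimension, gives the other views the text describes without changing the fact table.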

Fig. 6. Active Student Data Warehouse Architecture

Figure 6 shows the architecture used in the data warehouse design for active students at XYZ University. The architecture consists of three main sections [5]: data source, data warehouse system, and end user access tools.

IV. CONCLUSION

This paper has analyzed the running business processes and systems at XYZ University. The analysis found several problems, namely the absence of a data warehouse for active students, which makes it difficult for management to obtain information quickly, accurately, and thoroughly. The following conclusions can be drawn.

The designed data warehouse has 9 dimensions and 5 facts. The dimensions are the student dimension, academic program dimension, streaming dimension, campus location dimension, lecture period dimension, subject dimension, withdrawal reason dimension, graduation batch dimension, and time dimension. The facts are the registration facts, assessment facts, leave request facts, withdrawal facts, and graduation facts. With the data warehouse structure, data access becomes easier because the data needed can be viewed from various angles.

A data warehouse that periodically absorbs the data in the operational databases stores data consistently and accurately and holds aggregated data, which helps top management analyze data and make decisions.

Based on the design of the active student data warehouse at XYZ University, some suggestions can be considered for future implementation: expand the implementation scope of the data warehouse to other departments, and develop decision support systems related to the data warehouse, such as data mining, so that the analysis can be more in-depth and can also help in decision making.

REFERENCES

[1] L. W. Santoso and others, “Analysis of The Impact of Information Technology Investments-A Survey of Indonesian Universities,” Petra Christian University, 2014.
[2] L. W. Santoso and others, “Data Warehouse with Big Data Technology for Higher Education,” Procedia Comput. Sci., vol. 124, pp. 93–99, 2017.
[3] E. Bates, “UVM Big Data? Aggregating Campus Databases and Creating a Data Warehouse to Improve Student Retention Rates at the University of Vermont,” 2015.
[4] A. S. Kumar, “Edifice an educational framework using educational data mining and visual analytics,” IJ Educ. Manag. Eng., vol. 2, pp. 24–30, 2016.
[5] N. Pasyeka and M. Pasyeka, “Construction of multidimensional data warehouse for processing students’ knowledge evaluation in universities,” in Modern Problems of Radio Engineering. Telecommunications and Computer Science (TCSET), 2016 13th International Conference on, 2016, pp. 822–824.
[6] Indrajani and Y. Lisanti, “Business intelligence design on the company,” in Proceeding of the International Conference on e-Education Entertainment and e-Management, ICEEE 2011, 2011.
[7] R. P. Singh and K. Singh, “Design and Research of Data Analysis System for Student Education Improvement (Case Study: Student Progression System in University),” in Micro-Electronics and Telecommunication Engineering (ICMETE), 2016 International Conference on, 2016, pp. 508–512.
[8] H. Dandan, Z. Yajuan, L. Junfeng, L. Chen, X. Mo, and S. Zhihai, “Research on Centralized Data-Sharing Model Based on Master Data Management,” in MATEC Web of Conferences, 2017, vol. 139, p. 195.
[9] Z. Alharbi, J. Cornford, L. Dolder, and B. De La Iglesia, “Using data mining techniques to predict students at risk of poor performance,” in SAI Computing Conference (SAI), 2016, 2016, pp. 523–531.
[10] N. E. Cagiltay, G. Tokdemir, O. Kilic, and D. Topalli, “Performing and analyzing non-formal inspections of entity relationship diagram (ERD),” J. Syst. Softw., vol. 86, no. 8, pp. 2184–2195, 2013.
[11] C. Ulmer, G. Bayer, Y. R. Choe, and D. Roe, “Exploring data warehouse appliances for mesh analysis applications,” in System Sciences (HICSS), 2010 43rd Hawaii International Conference on, 2010, pp. 1–10.
[12] K. C. Davis, D. Aggarwal, and S. Baskin, “Scaling Data Warehousing Course Projects,” in Computational Science and Computational Intelligence (CSCI), 2016 International Conference on, 2016, pp. 241–245.
[13] M. A. Mohammed and M. M. Anad, “Data warehouse for human resource by Ministry of Higher Education and Scientific Research,” in Computer, Communications, and Control Technology (I4CT), 2014 International Conference on, 2014, pp. 176–181.
[14] A. Khan, N. Ehsan, E. Mirza, and S. Z. Sarwar, “Integration between customer relationship management (CRM) and data warehousing,” Procedia Technol., vol. 1, pp. 239–249, 2012.
[15] T. Nobre, A. Trigo, and P. Sanches, “SBIAES - Business intelligence system for analysis of access to higher education: The case of the Polytechnic Institute of Coimbra,” in Information Systems and Technologies (CISTI), 2014 9th Iberian Conference on, 2014, pp. 1–6.
[16] C. S. Saunders, G. Liu, Y. Yu, and W. Zhu, “Data-driven distributed analytics and control platform for smart grid situational awareness,” CSEE J. Power Energy Syst., vol. 2, no. 3, pp. 51–58, 2016.
[17] W. Astriani and R. Trisminingsih, “Extraction, Transformation, and Loading (ETL) module for hotspot spatial data warehouse using geokettle,” Procedia Environ. Sci., vol. 33, pp. 626–634, 2016.
[18] A. Gosain and others, “Literature review of data model quality metrics of data warehouse,” Procedia Comput. Sci., vol. 48, pp. 236–243, 2015.

[19] T. M. Connolly and C. E. Begg, “Database systems. A practical approach to design implementation and management. global ed,” Harlow, Pearson Educ., 2015.
[20] R. J. Salaki, J. Waworuntu, and I. Tangkawarow, “Extract transformation loading from OLTP to OLAP data using pentaho data integration,” in IOP Conference Series: Materials Science and Engineering, 2016, vol. 128, no. 1, p. 12020.
