Bioinformatics Database Query Performance and Optimization
All content following this page was uploaded by Edy Budiman on 15 April 2021.
Published By:
Retrieval Number: 100.1/ijrte.C4666099320 Blue Eyes Intelligence Engineering
DOI:10.35940/ijrte.C4666.099320 581 and Sciences Publication
Luna in Government web portals performance evaluation using data envelopment analysis [28], where according to these studies the quality assurance of a website depends on automated testing tools, which lower costs and increase efficiency. The performance of a website can be an important factor in its success, and it depends chiefly on speed: the faster the website loads, the better its performance.

B. Metric for Performance
Performance can be evaluated using tools that break down the resources and components of the bioinformatics web site. A wide variety of automated site testing tools is available.

Google PageSpeed Insights (PSI), refers to [29]
PageSpeed Insights (PSI) reports on the performance of a page on both mobile and desktop devices and provides suggestions on how that page may be improved. PSI provides a score which summarizes the page's performance. This score is determined by running Lighthouse to collect and analyze lab data about the page. A score of 90 or above is considered good, a score from 50 to below 90 needs improvement, and a score below 50 is considered poor. PSI also classifies field data into 3 buckets, describing experiences deemed good, needs improvement, or poor (see Table I), for First Contentful Paint (FCP), First Input Delay (FID), Largest Contentful Paint (LCP), and Cumulative Layout Shift (CLS).

Table I: Classifying Good, Needs Improvement, Poor [29]
Metric   Good           Needs Improvement    Poor
FCP      [0, 1000ms]    (1000ms, 3000ms]     over 3000ms
FID      [0, 100ms]     (100ms, 300ms]       over 300ms
LCP      [0, 2500ms]    (2500ms, 4000ms]     over 4000ms
CLS      [0, 0.1]       (0.1, 0.25]          over 0.25

YSlow from Yahoo, refers to [30]
YSlow grades a web page based on one of three predefined rulesets or a user-defined ruleset. It offers suggestions for improving the page's performance, summarizes the page's components, displays statistics about the page, and provides tools for performance analysis [31]. YSlow's web page analysis is based on the 23 testable rules out of 34 performance rules [30] (see Table II).

Table II: Rule weights of YSlow V2 Ruleset [32]
Rule               | Compress component with GZip          | Avoid CSS expressions                        | Minify JScript and CSS       | Remove duplicate JScript and CSS
(A) 90 <= S <= 100 | 0, file size < 500b                   | 0 to 5 expressions on CSS or inline Style    | 0 or 1 unminified component  | 0 to 2 duplicated JS or CSS
(B) 80 <= S < 90   | 1, file size < 500b, of any type      | 6 to 10 expressions on CSS or inline Style   | 2 unminified components      | 3 or 4 duplicated JS or CSS
(C) 70 <= S < 80   | 2, file size < 500b                   | 11 to 15 expressions on CSS or inline Style  | 3 unminified components      | 5 or 6 duplicated JS or CSS
(D) 60 <= S < 70   | 3, file size < 500b, of any type      | 16 to 20 expressions on CSS or inline Style  | 4 unminified components      | 7 or 8 duplicated JS or CSS
(E) 50 <= S < 60   | 4, uncompressed or file size < 500b   | 21 to 25 expressions on CSS or inline Style  | 5 unminified components      | 9 or 10 duplicated JS or CSS
(F) 0 <= S < 50    | >= 5, file size < 500b                | >= 26 expressions on CSS or inline Style     | >= 6 unminified components   | >= 11 duplicated JS or CSS

Matrix table-keys according to [32]:
Rule: The YSlow performance rule.
Weight: How this performance rule is weighted in the overall page analysis grade.
Points: Number of points deducted per offender (performance infraction occurrence), from a total of 100 per rule.
Score Computation: The formula used to compute the final score per rule.
Grades from A to F: How many components/offenders are necessary to reach grades from A to F.

C. Borneo Bioinformatics Portal
One of the efforts to manage biodiversity is through data and information support. Data and information on biodiversity need continued additions in species diversity, habitat, population, and distribution. The 47,910 recorded species of Indonesian biodiversity [33] are estimated to be still far fewer than the potential that actually exists. It is necessary to intensify the identification and inventory of biodiversity in the field and, on the other hand, a database system is needed that can collect data and information spread across various circles.

The Borneo bioinformatics portal serves as a data and information management system for endemic plants of the island of Borneo (Kalimantan), Indonesia. Borneo Bioinformatics is an example of an information system that presents taxonomic data in the form of an ontology model, mapping data into information about data descriptions and the relationships between taxa according to taxonomic level, based on data stored in the database [34].

The online portal is accessed at the URL https://www.borneodiversity.org/index. The BBIS interface is shown in Figure 1.

Fig. 1. Screenshot of the Borneo bioinformatics portal

In the first stage of data collection, the system has so far recorded 233 medicinal records, 1482 tree species, 86 types of wood, and 80 types of bamboo. Data collection is still ongoing.
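The PSI buckets of Table I and the score bands of Table II are simple threshold rules, so they can be sketched in a few lines of Python. This is a minimal illustration only: the names `PSI_THRESHOLDS`, `classify_field_metric`, and `yslow_grade` are hypothetical and are not part of PageSpeed Insights or YSlow.

```python
# Inclusive upper bounds of the "good" and "needs improvement" buckets (Table I).
# Names below are hypothetical helpers, not a PSI or YSlow API.
PSI_THRESHOLDS = {
    "FCP": (1000, 3000),   # milliseconds
    "FID": (100, 300),     # milliseconds
    "LCP": (2500, 4000),   # milliseconds
    "CLS": (0.1, 0.25),    # unitless layout-shift score
}


def classify_field_metric(metric, value):
    """Return the Table I bucket for a field-data value."""
    good_max, needs_max = PSI_THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= needs_max:
        return "needs improvement"
    return "poor"


def yslow_grade(score):
    """Map a 0-100 rule score S to the letter bands of Table II."""
    for floor, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D"), (50, "E")):
        if score >= floor:
            return letter
    return "F"


print(classify_field_metric("LCP", 3200))  # falls in (2500ms, 4000ms]
print(yslow_grade(73))                     # falls in 70 <= S < 80
```

Under these bands a score of 73 maps to grade C and a score of 80 to grade B, which matches how scores and letters are paired in the reported results.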
International Journal of Recent Technology and Engineering (IJRTE)
ISSN: 2277-3878, Volume-9 Issue-3, September 2020
Fig. 2. Performance test design

The test measures the query execution (response) time for each relation formed. Two types of testing are performed, offline and online, for each relationship, i.e., one to one, one to many, many to many, and has many through, using the Laravel Debugbar package. An overview of the performance test design is shown in Figure 2.

To optimize the performance of the bioinformatics portal, performance testing and optimization follow pretest and posttest scenarios. The pretest is the performance of the portal before optimization; the posttest is the performance of the portal after optimization. The performance measurement parameters are scripts (JavaScript and CSS), images, and page load timings, which are the files that affect the performance of a bioinformatics portal. Table III lists the parameters used as performance test metrics.

Table III: Performance optimization metrics
Parameter                     Metric
Image                         Serve scaled images
                              Optimize images
                              Image dimensions
Script (CSS and JavaScript)   Minify CSS and JavaScript
                              Defer parsing JavaScript
                              Inline small CSS and JavaScript
                              Combine images using CSS sprites
                              Avoid CSS @import
                              Duplicate JavaScript and CSS
                              Avoid CSS expressions
Page Load                     Redirect duration

[Figure 3: entity diagram of the tree, medicinal, wood, and taxonomy tables. The taxonomy table holds the ranks kingdom, division, class, ordo, family, genius, species, sub species, and varietas. Other fields shown include id, tree, latin, synonym, local, image, ecology, endemic, leaf_flower, rod_root, fruit_seed, chemical, efficacy, wood, botany, habitus, picture, medicinal_id, stem_color, sap_color, information_research, and descriptions.]
Fig. 3. Test tables for query performance (data relations)

Figure 3 shows the tables used to test each relation scheme: the One To One relation (hasOne) between the medicinal table and the wood table, the One To Many relation (hasMany) between the tree table and the wood table, the Many To Many relation (belongsToMany) between the tree table and the wood table, and the Has Many Through relation from the medicinal table to the tree table and the wood table.
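The test design (ten timed runs per relation, then an average) can be illustrated with a small timing harness. The sketch below is not the Laravel Debugbar setup used in the tests; it is a Python and sqlite3 stand-in under stated assumptions: the `tree_id` foreign key, the column names, and the row counts are hypothetical, and the "one query per parent row" function is only one access pattern that an ORM may produce, not a description of Eloquent's actual behavior.

```python
import sqlite3
import time


def build_db():
    """Create a tiny in-memory schema: one tree row has many wood rows (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tree (id INTEGER PRIMARY KEY, latin TEXT);
        CREATE TABLE wood (
            id INTEGER PRIMARY KEY,
            tree_id INTEGER REFERENCES tree(id),
            wood TEXT
        );
        """
    )
    for t in range(1, 51):
        conn.execute("INSERT INTO tree (id, latin) VALUES (?, ?)", (t, f"species {t}"))
        for w in range(3):
            conn.execute(
                "INSERT INTO wood (tree_id, wood) VALUES (?, ?)", (t, f"wood {t}-{w}")
            )
    conn.commit()
    return conn


def per_parent_queries(conn):
    """Fetch wood rows with one query per tree row (a lazy-loading style pattern)."""
    rows = []
    for (tree_id,) in conn.execute("SELECT id FROM tree").fetchall():
        rows += conn.execute(
            "SELECT tree_id, wood FROM wood WHERE tree_id = ?", (tree_id,)
        ).fetchall()
    return rows


def single_join(conn):
    """Fetch the same rows with one hand-written JOIN."""
    return conn.execute(
        "SELECT tree.id, wood.wood FROM tree JOIN wood ON wood.tree_id = tree.id"
    ).fetchall()


def time_runs(fn, conn, runs=10):
    """Return the execution time of each run in milliseconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(conn)
        times.append((time.perf_counter() - start) * 1000.0)
    return times


conn = build_db()
lazy_ms = time_runs(per_parent_queries, conn)
join_ms = time_runs(single_join, conn)
print(f"per-parent average: {sum(lazy_ms) / len(lazy_ms):.3f} ms")
print(f"single JOIN average: {sum(join_ms) / len(join_ms):.3f} ms")
```

Both functions return the same rows, so any timing difference comes from how the relation is traversed rather than from the data. Absolute timings depend on the machine and are not comparable to the figures reported for the portal.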
IV. RESULTS AND DISCUSSION

A. Result: Query Performance
The query test measures the execution time needed to display data for each relation schema in the scenario. Tests are carried out on each relationship, i.e., one to one, one to many, many to many, and has many through, both offline and online.
The query performance test results are as follows:

Result: One-To-One relation
This test measures the query execution time for the One To One relation when data are displayed using Eloquent ORM and Query Builder. The One-to-One relationship is between the wood table and the medicinal table. The results of the One To One relation query test are shown in Table V:

Table V: Query performance for One To One relation (execution time in ms)
        Online               Offline
Test    ORM      Non ORM     ORM       Non ORM
1       9710     58.63       143.31    7.25
2       6780     250.75      137.02    6.26
3       5680     324.65      148.93    6.77
4       7870     40.65       148.17    7.74
5       3550     124.81      136.12    7.27
6       4920     119.59      135.75    7.06
7       5470     140.65      137.82    7.69
8       2190     447         144.76    7.64
9       2500     574.75      137.11    7.17
10      2750     49.19       137.67    6.2
Avg.    5142     213.067     140.666   7.105

relationship between tree table and wood table. The results of the One To Many relation query test are shown in Table VI and Figure 5.

Table VI: Query performance for One To Many relation (execution time in ms)
        Online               Offline
Test    ORM      Non ORM     ORM       Non ORM
1       22930    115.6       372.59    68.02
2       20160    260.96      383.09    15.83
3       22400    328.45      383.28    66.14
4       22400    268.56      423.1     14.51
5       21420    252.57      380.3     14.84
6       33460    161.03      381.34    5.94
7       42700    367.03      387.52    6.1
8       7700     231.46      379.34    6.76
9       29420    411.13      402.37    6.58
10      25960    456.15      401.4     6.3
Avg.    24855    285.294     389.433   21.102
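The Avg. rows in Tables V and VI are arithmetic means over the ten runs of each column. As a quick check, the following sketch recomputes the means for Table V from the per-run values; the dictionary keys and the helper name `column_means` are illustrative only.

```python
# Per-run execution times (ms) copied from Table V, tests 1 to 10.
table_v = {
    "online_orm": [9710, 6780, 5680, 7870, 3550, 4920, 5470, 2190, 2500, 2750],
    "online_non_orm": [58.63, 250.75, 324.65, 40.65, 124.81,
                       119.59, 140.65, 447, 574.75, 49.19],
    "offline_orm": [143.31, 137.02, 148.93, 148.17, 136.12,
                    135.75, 137.82, 144.76, 137.11, 137.67],
    "offline_non_orm": [7.25, 6.26, 6.77, 7.74, 7.27, 7.06, 7.69, 7.64, 7.17, 6.2],
}


def column_means(table):
    """Arithmetic mean of each column's runs."""
    return {name: sum(runs) / len(runs) for name, runs in table.items()}


means = column_means(table_v)
for name, value in means.items():
    print(f"{name}: {value:.3f} ms")
```

The recomputed means agree with the Avg. row of Table V to three decimals (5142, 213.067, 140.666, and 7.105 ms).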
8       429.79    407.85     12.21     5.16
9       146.54    232.95     11.28     6.74
10      432.81    122.81     11.91     4.97
Avg.    428.552   256.549    12.602    5.667

Fig. 6. Query performance for the Many To Many relationship

From the test in Figure 6 and Table VII, the Many To Many relation without ORM has an average offline response time of 5.667 ms and an average online response time of 256.549 ms to execute the field data.

Result: Has Many Through relation
This test measures the query execution time for the Has Many Through relation when data are displayed using Eloquent ORM and Query Builder. The Has Many Through relationship is between the medicinal table, the tree table, and the wood table. The results of the Has Many Through relation query test are shown in Table VIII and Figure 7.

average 3189.65 ms and with an average time for Non ORM of 58.088 ms to execute the field data.

To compare offline and online testing, a summary of the average response time of each relation is given in Table IX.

Table IX: Summary of the offline and online query tests per relation (execution time in ms)
                    Online                 Offline
Relationship        ORM        Non ORM     ORM        Non ORM
One to One          5142       213.067     140.666    7.105
One to Many         24855      285.294     389.433    21.102
Many To Many        428.552    256.549     12.602     5.666
Has Many Through    3189.65    58.088      84.948     5.731
Average             8403.801   203.2495    156.9123   9.901

In testing, an impedance mismatch occurs when there is a mapping problem in the database relation: displaying the details of plant data pulls data from columns that share the same name even though the columns belong to different tables.

B. Results: Borneo Bioinformatics Optimization
PreTest
The results of the Preliminary Test (PreTest) on the Borneo Bioinformatics portal are presented in Figure 8.
Avoid CSS @import    C (73)    CSS    The external stylesheets were included using @import

The pretest recommendations from Yahoo's YSlow are presented in Figure 10.

Fig. 10. Screenshot of pretest YSlow performance

Figure 10 (pretest) presents a list of recommendations from YSlow with the performance scores obtained in the PreTest. There are four (4) recommended items that score very low, i.e., "Avoid CSS expressions, Minify JavaScript and

Fig. 11. Screenshot of posttest PageSpeed performance

After the PostTest, the main portal page achieves Grade C (77%) for the PageSpeed recommendations and Grade B (80%) for YSlow. For page details, the Full Load Time is 3.5 seconds and the Total Page Size is 9.00 MB over 317 requests. The PostTest measurements in Figures 11 and 12 show that the score of each optimized recommendation item has reached a good value: "Serve scaled images" with Grade C (75), "Defer parsing of JavaScript" with Grade A (99), "Minify JavaScript" with Grade A (99), "Optimize images" with Grade A (97), "Inline small CSS and JavaScript" with Grade A (100), "Minify CSS" with Grade B (83), and "Avoid CSS @import" with Grade A (100).
6. J. Chomicki and D. Toman, “Temporal Databases,” in Foundations of Artificial Intelligence, Volume 1, L. V. M. Fisher, D. Gabbay, Ed. Elsevier B.V., 2005, pp. 429–467.
7. P. Cybula, H. Kozankiewicz, K. Stencel, and K. Subieta, “Optimization of distributed queries in grid via caching,” 2005, doi: 10.1007/11575863_58.
8. E. Budiman, N. Puspitasari, S. N. Alam, T. M. A. Akbar, Haeruddin, and D. Indra, “Performance analysis of the resource loading time for borneo biodiversity information system,” 2018, doi: 10.1109/IAC.2018.8780515.
9. S. Wu, F. Li, S. Mehrotra, and B. C. Ooi, “Query optimization for massively parallel data processing,” 2011, doi: 10.1145/2038916.2038928.
10. E. Budiman, N. Puspitasari, M. Wati, J. A. Widians, and Haviluddin, “Web Performance Optimization Techniques for Biodiversity Resource Portal,” Journal of Physics: Conference Series, vol. 1230, no. 1, 2019, doi: 10.1088/1742-6596/1230/1/012011.
11. L. Zamboulis, N. Martin, and A. Poulovassilis, “Query performance evaluation of an architecture for fine-grained integration of heterogeneous grid data sources,” Future Generation Computer Systems, 2010, doi: 10.1016/j.future.2010.05.008.
12. E. Budiman and S. N. Alam, “Database: Taxonomy of plants Nomenclature for borneo biodiversity information system,” 2018, doi: 10.1109/IAC.2017.8280642.
13. L. Caroprese, E. Zumpano, and E. Vocaturo, “No SQL Database Management Systems for Big Data,” International Journal of Engineering and Advanced Technology, vol. 9, no. 5, pp. 21–26, 2020, doi: 10.35940/ijeat.D9145.069520.
14. N. Puspitasari and E. Budiman, “Evaluation of Borneo’s Biodiversity Information System,” 2018 Electrical Power, Electronics, Communications, Controls and Informatics Seminar, EECCIS 2018, pp. 434–439, 2019, doi: 10.1109/EECCIS.2018.8692955.
15. N. K. Gundla and Z. Chen, “Creating NoSQL Biological Databases with Ontologies for Query Relaxation,” 2016, doi: 10.1016/j.procs.2016.07.120.
16. N. Dengen, E. Budiman, J. A. Widians, M. Wati, U. Hairah, and M. Ugiarto, “Biodiversity information system: Tropical rainforest borneo and traditional knowledge ethnic of dayak,” Journal of Telecommunication, Electronic and Computer Engineering, vol. 10, no. 1–9, 2018.
17. E. Budiman, M. Jamil, U. Hairah, H. Jati, and Rosmasari, “Eloquent object relational mapping models for biodiversity information system,” in 2017 4th International Conference on Computer Applications and Information Processing Technology (CAIPT), Aug. 2017, vol. 2018-Janua, pp. 1–5, doi: 10.1109/CAIPT.2017.8320662.
18. U. Hairah, A. Tejawati, E. Budiman, and F. Agus, “Borneo biodiversity: Exploring endemic tree species and wood characteristics,” in Proceeding - 2017 3rd International Conference on Science in Information Technology: Theory and Application of IT for Education, Industry and Society in Big Data Era, ICSITech 2017, 2017, vol. 2018-Janua, pp. 435–440, doi: 10.1109/ICSITech.2017.8257152.
19. L. A. Bultet et al., “The SIB Swiss Institute of bioinformatics’ resources: Focus on curated databases,” Nucleic Acids Research, vol. 44, no. D1, pp. D27–D37, 2016, doi: 10.1093/nar/gkv1310.
20. M. Cannataro and P. Veltri, “Bioinformatics web portals,” in Selected Readings on Database Technologies and Applications, 2008.
21. P. Artimo et al., “ExPASy: SIB bioinformatics resource portal,” Nucleic Acids Research, 2012, doi: 10.1093/nar/gks400.
22. Haeruddin, H. Johan, U. Hairah, and E. Budiman, “Ethnobotany database: Exploring diversity medicinal plants of Dayak Tribe Borneo,” in International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), 2017, vol. 2017-Decem, doi: 10.1109/EECSI.2017.8239094.
23. J.-S. Varré, B. Schmidt, S. Janot, and M. Giraud, “Manycore High-Performance Computing in Bioinformatics,” 2011.
24. P. D. Karp et al., “A comparison of microbial genome web portals,” Frontiers in Microbiology, 2019, doi: 10.3389/fmicb.2019.00208.
25. W. W. Li et al., “Building cyberinfrastructure for bioinformatics using service oriented architecture,” 2006, doi: 10.1109/ccgrid.2006.1630932.
26. E. Budiman and S. N. Alam, “User perceptions of mobile internet services performance in borneo,” in 2017 Second International Conference on Informatics and Computing (ICIC), Nov. 2017, vol. 2018-Janua, pp. 1–6, doi: 10.1109/IAC.2017.8280643.
27. S. Kaur, K. Kaur, and P. Kaur, “An Empirical Performance Evaluation of Universities Website,” International Journal of Computer Applications, 2016, doi: 10.5120/ijca2016910922.
28. D. E. Luna, L. F. Luna-Reyes, J. R. Gil-Garcia, and R. Sandoval-Almazán, “Government web portals performance evaluation using data envelopment analysis,” 2011, doi: 10.1145/2037556.2037617.
29. D. google, “About PageSpeed Insights,” developers.google.com. https://developers.google.com/speed/docs/insights/v5/.
30. Marcelduran, “Web Performance Best Practices and Rules,” yslow.org. http://yslow.org/.
31. Carbon60, “Recommendations,” gtmetrix.com. https://gtmetrix.com/recommendations.html (accessed Jun. 06, 2020).
32. Marcelduran, “YSlow Ruleset Matrix,” yslow.org. http://yslow.org/ruleset-matrix/.
33. Widjaja, Kekinian Keanekaragaman Hayati Indonesia 2014. 2014.
34. E. Budiman, N. Puspitasari, M. Wati, Haviluddin, and R. Rahim, “Model Framework for Development of Biodiversity Information Systems,” Journal of Physics: Conference Series, vol. 1230, no. 1, 2019, doi: 10.1088/1742-6596/1230/1/012012.
35. D. F. R. A. Cleary and L. DeVantier, “Indonesia: Threats to the Country’s Biodiversity,” Encyclopedia of Environmental Health, no. November 2017, pp. 622–632, 2011, doi: 10.1016/B978-0-444-52272-6.00504-3.

AUTHORS PROFILE
Edy Budiman is a member of the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), APTIKOM (Asosiasi Pendidikan Tinggi Informatika dan Komputer), and The Institution of Engineers Indonesia (PII). He is currently active in teaching and research. As an author of several journal and conference papers, he focuses his research on mobile network issues, performance, and mobile-based apps.