ICCED WEB 30-Rev
Abstract: In the Society 5.0 era that we are experiencing today, information has become a commodity for everyone. This situation requires receiving, disseminating, and processing data quickly and in real time. As technology becomes more sophisticated, people increasingly use technology (such as the internet, cell phones, and computers) rather than conventional sources (such as newspapers or magazines) to search for information. As a result, the web has become one of the most widely used portals for finding information. Since it was first proposed in 1989 by Tim Berners-Lee, the web has undergone many evolutions and is now moving toward WEB 3.0. WEB 3.0 is a concept of the Intelligent Web. It is also called the Semantic Web, which helps users obtain more meaningful information. The result of this study is a Smart Search Engine on Web Browser that generates a summary of data processed automatically from the top 5 web pages returned for the keywords entered by the user. Data testing conducted with three different methods resulted in an average accuracy rate of 82%.

Keywords: Smart Search Engine, WEB 3.0, Search Engine Pagerank, Automatic Text Summarization.

I. INTRODUCTION
In the Society 5.0 era that we are experiencing today, information has become a commodity for everyone. This situation requires receiving, disseminating, and processing data quickly and in real time. People get information from various sources such as social media, forums, news, newspapers, etc. As technology becomes more sophisticated, people increasingly use technology (such as the internet, cell phones, and computers) rather than conventional sources (such as newspapers or magazines) to search for information. As a result, the web has become one of the most widely used portals for finding information.
Since it was first proposed in 1989 by Tim Berners-Lee, the web has undergone many evolutions [1]. In the WEB 1.0 era, web pages were created to display information statically; users could only view and read the information presented, without being able to interact with the web page. WEB 2.0 is the era in which users can interact with web pages: they can not only read and view pages but also write to them, making communication two-way [2]. WEB 3.0 is a concept of the Intelligent Web. It is also called the Semantic Web, which helps users obtain more meaningful information. The idea of WEB 3.0 is that artificial intelligence assists in gathering information from various relevant sources on the internet and presenting it in a shorter but still meaningful form.
Search engines available in current web browsers (such as Google, Bing, MSN, etc.) can accommodate user searches and provide the best results based on the source (web page), relevance, and the amount of traffic from keywords to the web page [3]. Unfortunately, current search engines can only display the web pages relevant to the keywords in sequence; they cannot yet present the essence of the search results, processed into concise and complete information that the user can readily absorb.
This research focuses on improving the performance of existing search engines in browsers to achieve the intended WEB 3.0 concept, namely by combining current search engine algorithms with a text summarization method applied to the top 5 web pages from the search results. Text summarization here is a part of the field of text mining that automatically produces a summary containing meaningful sentences and all relevant, necessary information from the original document [4]. The results of this study are in the form of
a Smart Search Engine on Web Browser to generate a summary of information that has been processed automatically from the top 5 web pages returned for the keywords entered by the user.

II. RELATED WORK
The related studies that we take as references for this research cover search engine algorithms, search engine page rank, text summarization, and the semantic web. The search engine algorithm used as the reference in this study is that of the Google search engine. Google is a powerful search engine that almost everyone uses as the favourite search engine in their web browser [5]. The way search engines work in web browsers is classified into three stages, namely:
Crawling: The first stage involves Google's bots (the infamous “spiders”) crawling the web and looking for new or updated web pages. In general, the more links a page has to it, the easier it is for Google to locate it. Pages need to be crawled and indexed to rank.
Indexing: Google's next step is to analyze these URLs and figure out what each page is about. It does this by looking closely at the content, images, and other media files on the page and then stores this information in a massive database known as the Google index.
Serving: The final step is to determine which pages are the most relevant and helpful for a particular search query. This is known as the ranking stage, and this is where the Google search algorithm comes in [6].
Search engines present search results to users according to predetermined criteria. The result is called PageRank. Research by Ao-Jan Su et al. describes 17 ranking features that affect the ranking of a web page. These features are described in Table 1.
automatic and learns to improve from experience without being explicitly programmed [8].
The latest research using the machine learning approach is [9], which hybridizes the Maximal Marginal Importance (MMI) method and PSO with another technique, namely the fuzzy logic method. The input of that research is a single document, and the resulting summaries are extractive. MMI is used to produce summaries that excel in terms of diversity by determining the most important sentences. The most important sentences are identified, and diverse sentences are then chosen by extracting them from the original text [10]. PSO is used to select the most important and least important features, and fuzzy logic is used to help PSO produce risk and uncertainty values, with a tolerance value that can be changed flexibly.
Meanwhile, [11], in research entitled Smart Search Engine, focuses on improving search engine performance in understanding the intent of the keywords entered by the user, using the Design and Test of Intelligent Search of News with Classification. That research is also similar to research [12] on search engines for the semantic web. However, this study uses a new approach that combines the search engine page rank method with automatic text summarization to produce a summary of information from the top 5 web pages in the search results.

III. RESEARCH METHODOLOGY
This study uses a new approach that combines the search engine page rank method with automatic text summarization to produce a summary of information from the top 5 web pages in the search results. The research flow can be seen in Figure 1.
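
The combination of ranked search results with automatic text summarization can be sketched roughly as follows. This is a minimal illustration, not the method used in this study: the frequency-based sentence scoring, the function names (`split_sentences`, `tokenize`, `summarize_top_pages`), the stop-word list, and the input format (a list of `(url, text)` pairs already ordered by the search engine) are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in",
              "for", "on", "that", "this", "it", "with", "as", "by"}


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]


def summarize_top_pages(ranked_pages, top_k=5, max_sentences=3):
    """Build an extractive summary from the first top_k ranked pages.

    ranked_pages: list of (url, text) pairs, assumed to be already ordered
    by the search engine's ranking (best result first).
    """
    pages = ranked_pages[:top_k]
    sentences = [s for _, text in pages for s in split_sentences(text)]

    # Word frequencies across all sentences of the selected pages.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average frequency of the sentence's words; 0.0 for empty sentences.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore original order for readability.
    by_score = sorted(range(len(sentences)),
                      key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(by_score[:max_sentences])
    return " ".join(sentences[i] for i in chosen)
```

A caller would pass the page texts in ranked order, for example `summarize_top_pages(pages, top_k=5, max_sentences=3)`. Pages beyond the fifth are ignored, and the selected sentences are returned in their original order. Averaging word frequencies keeps long sentences from winning on length alone, which is one simple design choice among many.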