User Web Usage Mining For Navigation Improvisation Using Semantic Related Frequent Patterns
Abstract - Web sites accumulate abundant web usage logs, which provide useful information that can be used to improve user navigation. Traditional web sites do not use this rich web usage data for any investigation. It can be used to generate efficient frequent patterns that support improved user navigation. It can also help in re-organizing a web site for efficient navigation. In this paper we propose a frequent pattern generation approach that combines semantic relations with user web usage data. The quality of the generated web usage patterns is measured with standard evaluation methods. Experimental results show that more precise presentation using user pattern generation can improve user navigation measures.

Keywords: Web mining, Usage Mining, Navigation, Semantic, Frequent Pattern.

1. INTRODUCTION

The growth in the size of the World Wide Web (WWW) has made it a place of tremendous interest for e-commerce, Web services and Web information systems. Extensive research is being done to maximize the advantage of using web sites for such web-based applications [13][14]. The ability of a site to keep visitors engaged at a deeper level and to guide them successfully to useful information is seen as a key factor in the final success of the site. However, owing to the size, structure and complexity of the Web, accessing the relevant information efficiently is a challenging task. Web Usage Mining (WUM) is the approach of extracting knowledge from the analysis of web usage data about a particular website [5][7]. This usage data can be obtained from server logs and analyzed to find the behavioral patterns and profiles of the users who interact with the web site. The analyzed data is beneficial and can be used for different needs such as web personalization, recommender systems, presentation of promotional content, etc.

Web mining is a process that allows searching and predicting users' interests and helps in personalizing the web. Web mining deals with the analysis of standard web user navigation, the structure of a website, and the content of the web pages. Many research efforts have been directed at constantly improving the process of customizing the web, most of them simply using the navigation patterns of users [16][18]. However, as the number of web pages grows, personalization based on web usage mining alone has the defect of not taking the context of the website into account. Thus the semantic web, which elaborates the context of a web page, is equally important to consider. Although some research has explored this area, there is still room for improvement.

The Semantic Web is focused on making website content comprehensible not only to humans but also to computers. To accomplish this, it helps software agents look for expected content. Hence, increased efforts are observed in the annotation of Web pages and objects with semantic information using ontologies (such as product catalogs or hierarchies of concepts). Ontological instances can be built by using Web site specific domain knowledge [19][20]. Therefore, in this paper, the semantic information of a website is combined with patterns generated by conventional web usage mining to generate frequent navigation patterns enriched with the semantic information of web pages.

2. RELATED WORKS

Various data mining techniques [20][21] can be used to model and understand Web user activity [4][8]. The WUM process can be divided into three inter-dependent stages: data collection and preprocessing, pattern discovery, and pattern analysis. The preprocessing stage consists of cleaning the click-stream data obtained from server logs and partitioning it into a set of user transactions that represent individual users' activity [3].

Different statistical and database operations are performed in the pattern discovery stage to obtain patterns reflecting the behavior of users. Some of the techniques for discovery and analysis of common patterns are session and visitor analysis, cluster
analysis, correlation analysis, association and sequential pattern analysis, and navigation analysis [1][2]. Jespersen et al. [9] proposed a hybrid approach to analyze visitor click-stream sequences. A combination of the hypertext probabilistic grammar approach and a click fact table is used to mine Web logs, and it can also be used for general sequence mining tasks. Mobasher et al. [6] presented a web personalization system that consists of offline tasks related to the mining of usage data and an online process of automatic Web page customization based on the discovered knowledge. In LumberJack by Chi et al. [22], user profiles are constructed by combining clustering of user sessions with traditional statistical traffic analysis, using the k-means algorithm.

R. Cooley et al. [12] propose that the web mining process can be divided into two main parts. The first part comprises the domain-dependent processes of transforming the Web data into a suitable transaction form; this includes preprocessing, transaction identification, and data integration components. The second part applies data mining and pattern matching techniques such as association rules and sequential patterns.

With conventional techniques, the quality of the mined web usage patterns, and hence the accuracy of next-page prediction in recommendation, is limited. Therefore, the objective is to design a framework that addresses this problem by combining the conventional scheme with semantic information, and to evaluate it with a clearly defined metric of implementation quality.

We have developed a framework, shown in Figure-1, that integrates semantic information into the process of building a Web navigation model, so that frequent navigation patterns are formed from ontology instances instead of Web page addresses. The quality of the generated patterns is evaluated by a mechanism involving Website recommendation.

Figure-1 Framework of Frequent Pattern Extraction

3.1 Pre-processing

Preprocessing involves the removal of noisy and irrelevant data; in addition, the semantic information of web pages is integrated with the log data. In this step, server log files are pruned, transactions are extracted, and the class of the corresponding ontology individual is assigned to each web page address.

A. Pruning Process

In this step, unanswered requests and requests made by software agents (e.g., web crawlers) are discarded by
using the error codes and access information in the log records.

B. Extraction of Navigation History

The navigation history is the set of Web objects requested by the user in his or her active session. This step is analogous to sessionization. The implemented method extracts the log data over the Internet and stores it in the database. This data is used for pattern generation and result evaluation.

4. EXPERIMENT EVALUATION

information structure, user profiles, website content, etc. Web usage log data is collected from the reachouthyderabad.com website, and we modify the user IP addresses as required for the evaluation. A web log is a transcript of transactions made between a group of users and a group of servers. Figure-2 shows several lines of a typical log. For historical reasons, many web logs use the same format. The fields are separated by white space, typically a single space, although some fields are additionally quoted.
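To make the log layout and the preprocessing steps of Section 3.1 concrete, the following Python sketch parses whitespace-separated, partly quoted log lines, prunes failed and crawler requests, and groups the remaining requests into sessions. It is an illustration under stated assumptions, not the implementation used in this paper: the paper does not name its log format, so the NCSA Combined Log Format is assumed; the crawler markers in BOT_HINTS, the 30-minute inactivity timeout, and the sample lines (addresses, pages, agents) are all hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed layout: NCSA Combined Log Format (the paper does not name its format).
# The referrer and user-agent fields are optional so plain Common Log Format also parses.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)
BOT_HINTS = ("bot", "crawler", "spider")  # hypothetical crawler markers
SESSION_GAP = timedelta(minutes=30)       # assumed inactivity timeout


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    # Month abbreviations are parsed with the current locale (English assumed).
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    return rec


def keep(rec):
    """Pruning: drop failed requests (status >= 400) and crawler traffic."""
    agent = (rec["agent"] or "").lower()
    return rec["status"] < 400 and not any(h in agent for h in BOT_HINTS)


def sessions(records):
    """Group each IP's requests into time-ordered sessions of URLs."""
    by_ip = defaultdict(list)
    for rec in records:
        by_ip[rec["ip"]].append(rec)
    result = []
    for recs in by_ip.values():
        recs.sort(key=lambda r: r["time"])
        current, last = [], None
        for rec in recs:
            # A long pause between two requests starts a new session.
            if last is not None and rec["time"] - last > SESSION_GAP:
                result.append(current)
                current = []
            current.append(rec["url"])
            last = rec["time"]
        result.append(current)
    return result


# Invented sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [01/Jan/2013:10:00:00 +0530] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2013:10:05:00 +0530] "GET /hotels.html HTTP/1.1" 200 734 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2013:10:06:00 +0530] "GET /missing.html HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Jan/2013:10:07:00 +0530] "GET /index.html HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '10.0.0.1 - - [01/Jan/2013:11:30:00 +0530] "GET /events.html HTTP/1.1" 200 220 "-" "Mozilla/5.0"',
]

parsed = [r for r in map(parse_line, SAMPLE) if r is not None]
pruned = [r for r in parsed if keep(r)]
print(sessions(pruned))
# prints [['/index.html', '/hotels.html'], ['/events.html']]
```

In the sample, the 404 request and the crawler request are pruned, and the 85-minute pause before the last request opens a second session. Identifying users by IP address alone is a simplification; the resulting URL sequences are the kind of input on which ontology classes could then be assigned to page addresses.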