Compusoft, 3 (10), 1108-115 PDF
ISSN:2320-0790
Research Scholar, Dr. SNS.Rajalakshmi College of Arts and Science, Coimbatore, India
HOD, Department of Computer Applications, Dr. SNS.Rajalakshmi College of Arts and Science, Coimbatore, India
Abstract: Relational query languages provide a high-level declarative interface to access data stored in relational
databases. Two key components of the query evaluation component of a SQL database system are the query
optimizer and the query execution engine. This paper starts from the System R optimization framework, since it was
a remarkably elegant approach that helped fuel much of the subsequent work in optimization. Relational database
systems allow transparent and efficient evaluation of preferential queries. An extensive experimental evaluation
on two real-world data sets illustrates the feasibility and advantages of the framework. Combining the prefer
operator with the rank and rank-join operators enables early pruning of results based on score or confidence during
query processing. During preference evaluation, both the conditional and the scoring part of a preference
are used. The conditional part acts as a soft constraint that determines which records are scored, without
disqualifying any of them from the query result. This paper introduces a preference-mapping relational data model
that extends the database with profile preferences for query optimization, and an extended algebra that captures the
essence of processing queries with a ranking method. Based on a set of algebraic properties and a proposed cost
model, it provides several query optimization strategies for extended query plans. It also describes a query
execution algorithm that blends preference evaluation with query execution while making effective use of the native
query engine.
Keywords: query optimization; relational databases; query plan; preferential databases; query evaluation; query
parser; dynamic query optimization algorithm.
I. INTRODUCTION
Data mining has attracted a great deal of attention in the
information industry and in society as a whole in recent
years, due to the wide availability of huge amounts of data
and the imminent need for turning such data into useful
information and knowledge. Data mining can be viewed as
a result of the natural evolution of information technology.
Data mining involves an integration of techniques from
multiple disciplines such as database and data warehouse
technology, statistics, machine learning, high performance
computing, pattern recognition, neural networks, data
visualization, information retrieval, image and signal
processing, and spatial or temporal data analysis. For an
algorithm to be scalable, its running time should grow
approximately linearly in proportion to the size of the data,
given the available system resources such as main memory
and disk space. By performing data mining, interesting
knowledge, regularities, or high-level information can be
extracted from databases and viewed or browsed from
different angles. The discovered knowledge can be applied
to decision making, process control, information
management, and query processing. Therefore, data mining
is considered one of the most important frontiers in
database and information systems and one of the most
promising interdisciplinary developments in
information technology.
COMPUSOFT, An international journal of advanced computer technology, 3 (10), October-2014 (Volume-III, Issue-X)
A. Qualitative Approach
In the qualitative approach, preferences are specified using
binary predicates called preference relations. In quantitative
approaches, preferences are expressed as scores assigned to
tuples or query conditions. Existing works have studied
various types of preferences including likes and dislikes,
multi-granular preferences that involve many attributes and
context-dependent preferences. In the latter case, the
context can be dictated by the data or it can be external to
the database.
B. Quantitative Approach
A quantitative approach covers prior works with respect to
different preference types. In this model, preference scores
are assigned to tuples in a context-dependent way. Context
is related to the data, as in prior work, but it is defined in a
quantitative way. In addition, each preference score carries a
confidence value that captures how certain a preference is.
Using scores, context, and confidences not only allows
several types of preferences to be expressed but also enables
the formulation of different types of queries with
preferences, where the expected answer may be specified
based on any combination of scores, confidences and context.
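The quantitative model can be illustrated with a minimal sketch. It assumes a preference is a triple of a conditional part, a scoring part, and a confidence value. The names `Preference` and `prefer`, the `score` and `conf` columns, and the movie data are hypothetical and are not taken from the paper. The sketch shows the conditional part acting as a soft constraint: every input record is kept, and only the matching ones receive a score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Preference:
    """Hypothetical preference: conditional part, scoring part, confidence."""
    condition: Callable[[Row], bool]   # soft constraint: selects rows to score
    score: Callable[[Row], float]      # scoring part, assumed to lie in [0, 1]
    confidence: float                  # how certain the preference is


def prefer(rows: List[Row], pref: Preference) -> List[Row]:
    """Annotate rows with score and confidence without dropping any row."""
    annotated = []
    for row in rows:
        out = dict(row)
        if pref.condition(row):
            out["score"] = pref.score(row)
            out["conf"] = pref.confidence
        else:
            # Rows outside the condition stay in the result, unscored.
            out["score"] = None
            out["conf"] = None
        annotated.append(out)
    return annotated


# Illustrative data only.
movies = [
    {"title": "A", "genre": "comedy", "year": 2010},
    {"title": "B", "genre": "drama", "year": 2005},
    {"title": "C", "genre": "comedy", "year": 1995},
]

likes_recent_comedies = Preference(
    condition=lambda r: r["genre"] == "comedy",
    score=lambda r: 1.0 if r["year"] >= 2000 else 0.5,
    confidence=0.8,
)

scored = prefer(movies, likes_recent_comedies)

# A query with preferences may then filter on any combination of
# score and confidence.
top = [
    r["title"]
    for r in scored
    if r["score"] is not None and r["score"] >= 0.9 and r["conf"] >= 0.5
]
```

Keeping scoring separate from filtering is what lets the final query decide whether to select by score, by confidence, or by both.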
E. Query Evaluation
In the proposed system, multi-block queries are considered
for optimization: each is converted to a single-block query
and then passed on for further processing. If tuple iteration
semantics are used to answer the query, then the inner
query is evaluated once for each tuple of the Dept relation.
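The difference between tuple iteration and a flattened single-block evaluation can be sketched as follows. Only the Dept relation is named in the text. The Emp relation, the column layout, and the correlated condition (an employee in the department earning more than a per-department threshold) are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical relations: Dept(dept_id, name, threshold), Emp(name, dept_id, salary).
dept = [(1, "Sales", 50000), (2, "R&D", 70000), (3, "HR", 40000)]
emp = [("Ann", 1, 60000), ("Bob", 2, 65000), ("Cy", 1, 30000), ("Di", 3, 45000)]


def tuple_iteration(dept, emp):
    """Nested (multi-block) evaluation: rerun the inner query per Dept tuple."""
    result = []
    inner_evaluations = 0
    for d_id, d_name, threshold in dept:
        inner_evaluations += 1
        # Inner block: a full scan of Emp for this one Dept tuple.
        inner = [e for e in emp if e[1] == d_id and e[2] > threshold]
        if inner:
            result.append(d_name)
    return result, inner_evaluations


def single_block(dept, emp):
    """Flattened evaluation: index Emp once, then probe it per Dept tuple."""
    salaries_by_dept = defaultdict(list)
    for _name, d_id, salary in emp:
        salaries_by_dept[d_id].append(salary)
    result = []
    for d_id, d_name, threshold in dept:
        if any(s > threshold for s in salaries_by_dept.get(d_id, [])):
            result.append(d_name)
    return result


nested_result, evaluations = tuple_iteration(dept, emp)
flat_result = single_block(dept, emp)
```

Both functions return the same departments. The nested form scans Emp once per Dept tuple, while the flattened form scans Emp a single time, which is the motivation for converting a multi-block query into a single block before optimization.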
D. Query Plan
Every node represents either an atom in the conjunctive
query (i.e., a service invocation), or a join, or a selection
operation. Every arc indicates data flow and parameter
passing from the outputs of one service to the inputs of
another service. Atoms are partitioned into exact and search
services. Exact services are further classified as
proliferative or selective and may be chunked, while
search services are always proliferative and chunked.
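The plan structure described above can be sketched as a small directed graph. This is a minimal illustration rather than the paper's implementation: the class names `PlanNode` and `QueryPlan`, the node names, and the use of a topological order as the execution order are all assumptions. The only rule taken from the text is that search services are always proliferative and chunked.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class PlanNode:
    """Hypothetical plan node: a service invocation, a join, or a selection."""
    name: str
    kind: str                    # "exact", "search", "join", or "selection"
    proliferative: bool = False  # False means selective (service nodes only)
    chunked: bool = False

    def __post_init__(self):
        if self.kind not in ("exact", "search", "join", "selection"):
            raise ValueError(f"unknown node kind: {self.kind}")
        # Rule from the text: search services are proliferative and chunked.
        if self.kind == "search" and not (self.proliferative and self.chunked):
            raise ValueError("search services must be proliferative and chunked")


class QueryPlan:
    """Nodes plus arcs; an arc (src, dst) passes src outputs to dst inputs."""

    def __init__(self):
        self.nodes = {}
        self.predecessors = {}

    def add_node(self, node: PlanNode) -> None:
        self.nodes[node.name] = node
        self.predecessors.setdefault(node.name, set())

    def add_arc(self, src: str, dst: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the arc")
        self.predecessors[dst].add(src)

    def execution_order(self):
        # Assumption: any topological order respects the data flow.
        return list(TopologicalSorter(self.predecessors).static_order())


# Illustrative plan with hypothetical service names.
plan = QueryPlan()
plan.add_node(PlanNode("city_lookup", "exact", proliferative=False))
plan.add_node(PlanNode("hotel_search", "search", proliferative=True, chunked=True))
plan.add_node(PlanNode("price_filter", "selection"))
plan.add_arc("city_lookup", "hotel_search")
plan.add_arc("hotel_search", "price_filter")
order = plan.execution_order()
```

Validating the service classification when a node is built keeps malformed plans from reaching the optimizer.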
F. Query Model
The query model is the path that a query traverses through a
database system until its answer is generated. The system
modules through which it moves have the following functionality:
The Query Parser checks the validity of the query and then
translates it into an internal form, usually a relational
calculus expression or something equivalent.
A. Query Generator
In order to perform experiments involving IDP1ccp and the
ML algorithms, it is necessary to generate a set of queries
in join graph form. However, queries cannot be produced
without relations to refer to, so a system catalogue has to
be generated before generating queries.
Consider the experimental case of a distributed
setting consisting of the execution sites, where the
maximum number of relations involved in a query is n. The
minimum number of relation entries required in the
catalogue is therefore n. The next step is to
populate each relation entry with fields. Each relation entry
is assigned between 5 and 10 fields according to a uniform
probability distribution. Each field is then assigned a
domain according to a given probability for each domain.
Each domain also has a size and a size in bytes. The final
addition to each relation entry is the resident sites of the
relation. First, the number of sites at which the relation
is available is chosen randomly as 1+U, where U is a discrete
random variable taking values from 0 to 1. Then proceed
Source
Number of Tuples    Size
5,000               0.7 MB
10,000              1.5 MB
20,000              1.8 MB
50,000              5.9 MB
100,000             9.3 MB
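The catalogue-generation steps described under Query Generator can be sketched as follows. This is a minimal sketch under stated assumptions: the domain names, their probabilities, and their byte sizes in `DOMAINS` are hypothetical placeholders, because the paper's own domain values are not available here. The text also does not say how the resident sites themselves are picked, so the sketch assumes a uniform random choice without repetition. The range of U follows the text (0 to 1) and is exposed as the parameter `max_extra_sites`.

```python
import random

# Hypothetical domains: (name, probability, size in bytes).
DOMAINS = [("int", 0.5, 4), ("float", 0.3, 8), ("string", 0.2, 32)]


def generate_catalogue(num_relations, num_sites, max_extra_sites=1, seed=0):
    """Build a random system catalogue with num_relations relation entries."""
    rng = random.Random(seed)
    names = [d[0] for d in DOMAINS]
    weights = [d[1] for d in DOMAINS]
    sizes = {d[0]: d[2] for d in DOMAINS}

    catalogue = []
    for i in range(num_relations):
        # Between 5 and 10 fields, chosen uniformly (both ends inclusive).
        num_fields = rng.randint(5, 10)
        fields = []
        for j in range(num_fields):
            domain = rng.choices(names, weights=weights, k=1)[0]
            fields.append({"name": f"f{j}", "domain": domain, "bytes": sizes[domain]})

        # Number of resident sites is 1 + U, capped at the number of sites.
        u = rng.randint(0, max_extra_sites)
        site_count = min(1 + u, num_sites)
        # Assumption: the sites are drawn uniformly without repetition.
        sites = sorted(rng.sample(range(num_sites), site_count))

        catalogue.append({"name": f"R{i}", "fields": fields, "sites": sites})
    return catalogue


# A catalogue with n = 6 relation entries over 4 execution sites.
catalogue = generate_catalogue(num_relations=6, num_sites=4)
```

Fixing the seed makes a generated catalogue reproducible, so the same set of relations can be reused across optimizer runs.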
Predicate    Selectivity    Handling cost
A,B          2 × 10^-3
A,C          5 × 10^-3      5 ms
B,D          10^-4
D,E          10^-5
Plan             Traditional Estimation    Rate-Based Estimation
Left Deep        10^4                      1.3
Fast Leaves      2 × 10^3                  9.7
Evenly Spread    5 × 10^3                  8.8