Advanced Database
Introduction and overview
Database technology has evolved from primitive file processing to database management systems with query and transaction processing. Relational database systems are widely used in business applications. As database technology has advanced, various kinds of advanced database systems have emerged, and are still being developed, to address the requirements of new database applications.

The new database applications include handling:
* Spatial data (such as maps)
* Engineering design data (such as the design of buildings, system components, or integrated circuits)
* Hypertext and multimedia (including text, images, video, and audio data)
* Time-related data (such as historical records or stock exchange data)
* The World Wide Web (a huge, widely distributed information repository made available by the Internet)

While such databases or information repositories require sophisticated facilities to efficiently store, retrieve, and update large amounts of complex data, they also provide fertile ground and raise many challenging research and implementation issues for data mining.

Further progress has led to an increasing demand for efficient and effective data analysis and data understanding tools. This need is a result of the explosive growth in data collected from applications including business and management, government administration, science and engineering, and environmental control.

These applications require efficient data structures and scalable methods for handling complex object structures, variable-length records, semistructured and unstructured data, text and multimedia data, and database schemas with complex structures and dynamic changes.

In response to these needs, advanced database systems and application-oriented database systems have been developed. These include:
* Object-oriented and object-relational database systems
* Text and multimedia database systems
* Heterogeneous and legacy database systems
* Web-based global information systems

Object-Oriented Databases
Object-oriented databases are based on the object-oriented programming paradigm, where, in general terms, each entity is considered an object. Data and code relating to an object are encapsulated into a single unit.

Each object has associated with it the following:
* A set of variables that describe the object (these correspond to attributes in the entity-relationship and relational models).
* A set of messages that the object can use to communicate with other objects, or with the rest of the database system.
* A set of methods, where each method holds the code to implement a message. Upon receiving a message, the method returns a value in response. For instance, the method for the message get_photo(employee) will retrieve and return a photo of the given employee object.

Objects that share a common set of properties can be grouped into an object class. Each object is an instance of its class. Object classes can be organized into class/subclass hierarchies so that each class represents properties that are common to objects in that class.

Object-Relational Databases
* Are constructed based on an object-relational data model.
* This model extends the relational model by providing a rich data type for handling complex objects and object orientation.
* In addition, special constructs for relational query languages are included to manage the added data types.
* Are becoming increasingly popular in industry and applications.

Spatial Databases
Spatial databases contain spatial-related information. Such databases include geographic (map) databases, VLSI chip design databases, and medical and satellite image databases.

Spatial data may be represented in raster format, consisting of n-dimensional bit maps or pixel maps. For example, a 2D satellite image may be represented as raster data, where each pixel registers the rainfall in a given area.
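
As a minimal sketch of the raster idea, the Python fragment below stores a tiny 2D rainfall grid as a list of rows, then reads one pixel and averages a rectangular window. The grid values, the millimetre unit, and the helper name region_mean are made up for illustration. A real spatial database would use specialized storage and indexing rather than nested lists.

```python
# Hypothetical 3x4 raster: each cell holds the rainfall (assumed to be in mm)
# registered for one area. The numbers are illustrative only.
rainfall = [
    [0.0, 1.5, 2.0, 0.5],
    [0.0, 3.0, 4.5, 1.0],
    [0.5, 2.5, 6.0, 0.0],
]


def region_mean(raster, row_start, row_end, col_start, col_end):
    """Average the pixels in rows [row_start, row_end) and
    columns [col_start, col_end) of a 2D raster."""
    cells = [
        raster[r][c]
        for r in range(row_start, row_end)
        for c in range(col_start, col_end)
    ]
    return sum(cells) / len(cells)


# Rainfall registered by the pixel at row 1, column 2.
pixel = rainfall[1][2]

# Mean rainfall over the 2x2 window covering rows 1-2 and columns 1-2.
window_mean = region_mean(rainfall, 1, 3, 1, 3)
```

Here a pixel lookup is a plain index operation, and a region query is a scan over a window of cells. That is the trade-off of the raster format: simple access, but storage grows with the resolution of the grid.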
Maps can be represented in vector format, where roads, bridges, buildings, and lakes are represented as unions of basic geometric constructs such as points, lines, and polygons, and the partitions and networks formed by these shapes.

Geographic database applications:
* Forestry and ecology planning
* Location of telephone and electric cables, pipes, and sewage systems
* Vehicle navigation and dispatching systems
* Urban planning

Temporal Databases and Time-Series Databases
Both store time-related data.

A time-series database stores sequences of values that change with time, such as data collected regarding the stock exchange. Data mining techniques can be used to find the characteristics of object evolution, or the trend of changes for objects in the database. Such information can be useful in decision making and strategic planning.

Examples:
* Bacteria growth data can inform expiration dates.
* Mining banking data may aid in scheduling bank tellers according to the volume of customer traffic.
* Stock exchange data can inform investment strategies.

Time may be decomposed according to fiscal years, academic years, or calendar years. Years may be further decomposed into quarters or months.

Text Databases and Multimedia Databases
Text databases are databases that contain word descriptions for objects. These word descriptions are usually not simple keywords but rather long sentences or paragraphs, such as documents.
* Text databases may be highly unstructured (such as home pages on the WWW).
* Some text databases may be semistructured (such as e-mail messages and many HTML/XML web pages).
* Others are relatively well structured (such as library databases).

Mining text data can uncover:
* General descriptions of object classes
* Keywords
* Content associations

Multimedia databases store image, audio, and video data. They are used in applications such as picture content-based retrieval, voice mail systems, video-on-demand systems, the WWW, and speech-based user interfaces that recognize spoken commands.
* Multimedia databases must support large objects, since data objects such as video can require gigabytes of storage.
* Specialized storage and search techniques are also required.
* Real-time retrieval is required (for example, to keep audio and video synchronized).

Heterogeneous Databases and Legacy Databases
Objects in one component database may differ greatly from objects in other component databases, making it difficult to assimilate their semantics into the overall heterogeneous database.

A legacy database is a group of heterogeneous databases that combines different kinds of data systems, such as relational or object-oriented databases, hierarchical databases, network databases, spreadsheets, multimedia databases, or file systems. The heterogeneous databases in a legacy database may be connected by intra- or inter-computer networks.

Information exchange across such databases is difficult, since one needs to work out precise transformation rules from one representation to another, considering diverse semantics.

Example: comparing student academic performance among different schools, which may differ in:
* Grading period (per quarter, semester, or year)
* Grading scale (A to F, 1-10, or 1-100)
* Number of courses in Database (2, 3, or 4)
* Use of labels such as fair, good, excellent, poor (a more generalized, conceptual level)

The World Wide Web
The WWW and its associated distributed information services, such as America Online, Yahoo, and AltaVista, provide rich, worldwide, online information services, where data objects are linked together to facilitate interactive access.
Users seeking information of interest traverse from one object to another via links.
* Web services that provide keyword-based search without understanding the context behind particular web pages can offer only limited help to users.
* Understanding user access patterns supports better marketing decisions (for example, advertisement placement).

Data Warehouses
A data warehouse is a database that is maintained separately from an organization's operational databases. Data warehouse systems allow for the integration of a variety of application systems. They support information processing by providing a solid platform of consolidated historical data for analysis.
* Nonvolatile
* Time variant (5-10 years of historical data)
* Integrated from multiple heterogeneous sources

Data Mining
Data mining refers to extracting or "mining" knowledge from large amounts of data.
* The term is a misnomer (compare gold mining or mineral mining, which are named after what is extracted, not after the raw material).
* Related terms: knowledge mining from databases, knowledge extraction, data analysis, data archaeology.

Data mining can be viewed as simply an essential step in the process of knowledge discovery in databases. The steps in knowledge discovery are:
* Data cleaning: remove noise and inconsistent data.
* Data integration: combine multiple data sources.
* Data selection: select the data relevant to the analysis.
* Data transformation: transform the data into a form appropriate for mining (for example, by aggregation).
* Data mining: extract knowledge (patterns).
* Pattern evaluation: identify the truly interesting patterns using interestingness measures.
* Knowledge presentation: present the mined knowledge to users, for example through visualization.

Query Languages
A DBMS is a software system that enables users to define, create, maintain, and control access to the database. Typically the DBMS provides the following facilities:

It allows users to define the database, usually through a Data Definition Language (DDL).
* The DDL allows users to specify the data types and structures, and the constraints on the data to be stored in the database.

It allows users to insert, update, delete, and retrieve data from the database, usually through a Data Manipulation Language (DML). Having a central repository for all data and data descriptions allows the DML to provide a general inquiry facility to this data, called a query language.

The most common query language is the Structured Query Language (SQL).
* Pronounced "S-Q-L" or "see-quel"
* The standard language of relational DBMSs

It provides controlled access to the database, for example through a Data Control Language (DCL). It may provide:
* A security system, which prevents unauthorized users from accessing the database
* An integrity system, which maintains the consistency of stored data
* A concurrency control system, which allows shared access to the database
* A recovery control system, which restores the database to a previous consistent state
* A user-accessible catalog, which contains descriptions of the data in the database
* DDL and DML compilers

The 1992 SQL standard lacked computational completeness: it contained no flow-of-control commands such as IF...THEN...ELSE, GOTO, or DO...WHILE. To overcome this and to provide more flexibility, SQL allows statements to be embedded in a high-level procedural language, as well as entered interactively at a terminal. In the embedded approach, flow of control can be obtained from the structures provided by the programming language.

Two types of programmatic SQL:

Embedded SQL statements. SQL statements are embedded directly into the program source code and mixed with the host language statements. This approach allows users to write programs that access the database directly.
* A special precompiler modifies the source code to replace SQL statements with calls to DBMS routines.
* The source code can then be compiled and linked in the normal way.

Application Programming Interface (API). An alternative technique is to provide the programmer with a standard set of functions that can be invoked from the software.
An API can provide the same functionality as embedded statements and removes the need for any precompilation. It may be argued that this approach provides a cleaner interface and generates more manageable code.
* The best-known API is the Open Database Connectivity (ODBC) standard.
* Java programs communicate with databases and manipulate their data using the JDBC API.
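
The API approach can be sketched in Python. The example uses the standard sqlite3 module as a stand-in so that it is self-contained; it is not ODBC or JDBC, but it plays a similar role: SQL statements are passed as strings to library functions, with no precompilation. The employee table, its columns and rows, and the function raise_low_salaries are hypothetical. The sketch shows a DDL statement, DML statements, and flow of control (a loop and an IF) supplied by the host language rather than by SQL.

```python
import sqlite3


def raise_low_salaries(conn, threshold, bonus):
    """Add `bonus` to every salary below `threshold`.

    The loop and the IF come from the host language (Python);
    SQL is only used to read and update rows.
    """
    updated = 0
    # Fetch all rows first so the UPDATEs do not interfere with the SELECT.
    rows = conn.execute("SELECT id, salary FROM employee").fetchall()
    for emp_id, salary in rows:
        if salary < threshold:
            conn.execute(
                "UPDATE employee SET salary = ? WHERE id = ?",
                (salary + bonus, emp_id),
            )
            updated += 1
    conn.commit()
    return updated


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# DDL: define the structure, data types, and constraints.
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER NOT NULL)"
)

# DML: insert illustrative rows.
conn.executemany(
    "INSERT INTO employee (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Abebe", 900), (2, "Sara", 1500), (3, "Lily", 700)],
)

changed = raise_low_salaries(conn, threshold=1000, bonus=100)

# DML: query the result.
salaries = conn.execute(
    "SELECT id, salary FROM employee ORDER BY id"
).fetchall()
```

The same update could be written as a single SQL statement with a WHERE clause. The row-by-row version is used here only to make the host-language control flow visible.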