Advance Dbms 2
• A multimedia database is a collection of interrelated multimedia data, including text, graphics (sketches, drawings), images,
animations, video and audio, typically in vast amounts and drawn from multiple sources. The framework that manages the different types
of multimedia data, which can be stored, delivered and utilized in different ways, is known as a multimedia database management
system. Multimedia data falls into three classes: static media, dynamic media and dimensional media.
• Contents of a multimedia database management system:
• Media data – The actual data representing an object.
• Media format data – Information such as sampling rate, resolution and encoding scheme about the format of the media data
after it goes through the acquisition, processing and encoding phases.
• Media keyword data – Keyword descriptions relating to the generation of the data, also known as content descriptive data.
Example: date, time and place of recording.
• Media feature data – Content dependent data such as the distribution of colors, kinds of texture and different shapes present in
data.
• Types of multimedia applications, based on their data management characteristics:
• Repository applications – A large amount of multimedia data, as well as metadata (media format data, media keyword data,
media feature data), is stored for retrieval purposes, e.g., repositories of satellite images, engineering drawings and radiology
scanned pictures.
• Presentation applications – They involve delivery of multimedia data subject to temporal constraints. Optimal viewing or
listening requires the DBMS to deliver data at a certain rate, offering quality of service above a certain threshold. Here data is
processed as it is delivered. Example: annotation of video and audio data, real-time editing analysis.
• Collaborative work using multimedia information – It involves executing a complex task by merging drawings and changing
notifications. Example: intelligent healthcare network.
MULTIMEDIA DATABASE
• Modelling – Work in this area can draw on both database and information retrieval techniques. Documents constitute a specialized area and
deserve special consideration.
• Design – The conceptual, logical and physical design of multimedia databases has not yet been fully addressed, as performance and tuning issues at each
level are far more complex. The data comes in a variety of formats such as JPEG, GIF, PNG and MPEG, which are not easy to convert from one form to another.
• Storage – Storing a multimedia database on any standard disk presents problems of representation, compression, mapping to device hierarchies,
archiving and buffering during input-output operations. In a DBMS, a "BLOB" (Binary Large Object) facility allows untyped bitmaps to be stored and
retrieved.
• Performance – For an application involving video playback or audio-video synchronization, physical limitations dominate. The use of parallel processing
may alleviate some problems, but such techniques are not yet fully developed. Apart from this, multimedia databases consume a lot of processing time as
well as bandwidth.
• Queries and retrieval – For multimedia data such as images, video and audio, accessing data through queries opens up many issues, such as efficient query formulation,
query execution and optimization, which still need to be worked on.
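• The BLOB facility mentioned under Storage can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name `media`, its columns, the helper `store_and_fetch` and the sample bytes are all assumptions made for illustration; a real multimedia DBMS would keep format, keyword and feature data alongside the raw bytes.

```python
import sqlite3


def store_and_fetch(name: str, payload: bytes) -> bytes:
    """Store raw media bytes as a BLOB and read them back unchanged."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical schema: one row per media object.
        conn.execute(
            "CREATE TABLE media (name TEXT PRIMARY KEY, fmt TEXT, data BLOB)"
        )
        conn.execute(
            "INSERT INTO media (name, fmt, data) VALUES (?, ?, ?)",
            (name, "png", payload),
        )
        row = conn.execute(
            "SELECT data FROM media WHERE name = ?", (name,)
        ).fetchone()
        return bytes(row[0])
    finally:
        conn.close()


# Made-up bytes standing in for an image file; not a valid image.
sample = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
restored = store_and_fetch("logo", sample)
```

The database treats the payload as an opaque byte string, which is why content-based queries need separately extracted feature data.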
• Areas where multimedia databases are applied:
• Documents and record management: Industries and businesses that keep detailed records and a variety of documents. Example: insurance claim records.
• Knowledge dissemination : Multimedia database is a very effective tool for knowledge dissemination in terms of providing several resources. Example:
Electronic books.
• Education and training : Computer-aided learning materials can be designed using multimedia sources which are nowadays very popular sources of
learning. Example: Digital libraries.
• Marketing, advertising, retailing, entertainment and travel. Example: a virtual tour of cities.
• Real-time control and monitoring: Coupled with active database technology, multimedia presentation of information can be a very effective means for
monitoring and controlling complex tasks. Example: manufacturing operation control.
MULTIMEDIA DATABASE MODELLING
• Multimedia data modelling is radically different from traditional data modelling due to the
special requirements imposed by multimedia data. One fundamental issue of concern is the
notorious “semantic gap” between what is stored in the database and what the end-users
understand and expect. Due to the ambiguous, subjective, and transient (AST) semantic
problems of multimedia data, the necessary context information needs to be made available
in order to conduct meaningful and effective query processing under different paradigms
and at various levels of abstraction. Unfortunately, conventional data models
were not designed with such objectives in mind. More specifically, when people realized that
relational databases fall short in supporting advanced applications, including multimedia data
management, due to the limited modelling power of the relational data model, researchers
went ahead with devising semantic and object-oriented (OO) data models in the 1980s (until the early
1990s). While the later commercial development of database systems has led to the so-called
object-relational (OR) databases since the late 1990s, such a marriage of the two does not actually
solve the problems encountered by multimedia data management. In particular, it does not
solve the basic issue of bridging the semantic gap by addressing the AST problems.
MULTIMEDIA DATABASE STORAGE
• We need a multimedia database to store the pictures, videos, and sounds that people use for learning, entertainment, health, and advertising. A multimedia database holds all
kinds of information, such as text, pictures, sounds, and videos, keeps track of it, and makes it easy to find. These databases enable the storage and
retrieval of multimedia data components. All media files in these databases are stored as binary strings and encoded according to their file types.
MULTIMEDIA DATABASE STORAGE: EXPLANATION
• Multimedia data requires significantly more storage space than text data. The storage technologies used for multimedia data include hard disks,
optical disks, and flash drives. Hard disks offer a large storage capacity. Optical disks such as CDs and DVDs also offer high
storage capacity, but they are slower than hard disks. Flash drives offer high-speed data transfer and a small form factor, which makes them convenient for
portable storage. However, they typically offer less storage capacity than hard disks.
• Dynamic grids
• While 10km cells are overlarge to use in urban areas, there are many less dense areas outside of cities where two locations a couple of kilometres apart would have an equivalent list
of 20 nearby Xone businesses. Ideally, we’d be able to use large, imprecise cells in sparse areas and smaller, more granular cells in dense areas.
• Often a quadtree is represented as a grid, with each square representing a node within the tree. The subsequent image visualizes the method of subdividing nodes in a quadtree,
starting with a single root node and finishing with a tree of depth
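• The idea of large cells in sparse areas and small cells in dense areas can be sketched with a point quadtree. This is a minimal illustration, not a production spatial index: the class name `QuadNode`, the leaf capacity of 4 and the sample coordinates are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class QuadNode:
    x: float            # left edge of the square cell
    y: float            # bottom edge of the square cell
    size: float         # side length of the cell
    capacity: int = 4   # assumed maximum points per leaf before it splits
    points: List[Point] = field(default_factory=list)
    children: List["QuadNode"] = field(default_factory=list)

    def insert(self, p: Point) -> None:
        if self.children:                      # internal node: delegate
            self._child_for(p).insert(p)
            return
        self.points.append(p)
        if len(self.points) > self.capacity and self.size > 1e-6:
            self._split()

    def _split(self) -> None:
        half = self.size / 2
        # Children are ordered row by row: index = row * 2 + col.
        self.children = [
            QuadNode(self.x + dx * half, self.y + dy * half, half, self.capacity)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pts, self.points = self.points, []
        for p in pts:
            self._child_for(p).insert(p)

    def _child_for(self, p: Point) -> "QuadNode":
        half = self.size / 2
        col = 1 if p[0] >= self.x + half else 0
        row = 1 if p[1] >= self.y + half else 0
        return self.children[row * 2 + col]

    def depth(self) -> int:
        if not self.children:
            return 1
        return 1 + max(c.depth() for c in self.children)

    def count(self) -> int:
        return len(self.points) + sum(c.count() for c in self.children)


root = QuadNode(0.0, 0.0, 100.0)
dense = [(1.0 + i, 1.0 + i) for i in range(6)]   # cluster in one corner
sparse = [(80.0, 20.0), (20.0, 80.0)]            # isolated points
for p in dense + sparse:
    root.insert(p)
```

After these inserts the corner holding the cluster has been subdivided several times, while the quadrants holding the isolated points remain single large cells.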
CONTENT BASED RETRIEVAL
• Inexpensive image-capture and storage technologies have allowed massive collections of digital images to be created. However, as a database grows, the difficulty of
finding relevant images increases. Two general approaches to this problem have been developed, both of which use metadata for image retrieval:
• Using information manually entered or included in the table design, such as titles, descriptive keywords from a limited vocabulary, and predetermined classification
schemes
• Using automated image feature extraction and object recognition to classify image content -- that is, using capabilities unique to content-based retrieval
• With Visual Information Retrieval, you can combine both approaches in designing a table to accommodate images: use traditional text columns to describe the semantic
significance of the image (for example, that the pictured automobile won a particular award, or that its engine has six or eight cylinders), and use the Visual Information
Retrieval type for the image, to permit content-based queries based on intrinsic attributes of the image (for example, how closely its color and shape match a picture of
a specific automobile).
• As an alternative to defining image-related attributes in columns separate from the image, a database designer could create a specialized composite data type that
combines Visual Information Retrieval and the appropriate text, numeric, and date attributes.
• The primary benefit of using content-based retrieval is reduced time and effort required to obtain image-based information. With frequent adding and updating of
images in massive databases, it is often not practical to require manual entry of all attributes that might be needed for queries, and content-based retrieval provides
increased flexibility and practical value. It is also useful in providing the ability to query on attributes such as texture or structure that are difficult to represent using
keywords.
• Examples of database applications where content-based retrieval is useful -- where the query is semantically of the form, "find objects that look like this one" -- include:
• Trademarks and copyrights
• Art galleries and museums
• Retailing
• Fashion and fabric design
• Interior design or decorating
HOW CONTENT BASED RETRIEVAL WORKS
• A content-based retrieval system processes the information contained in image data and creates an abstraction of its content in terms of
visual attributes. Any query operations deal solely with this abstraction rather than with the image itself. Thus, every image inserted into
the database is analyzed, and a compact representation of its content is stored in a feature vector, or signature.
• The signature contains information about the following visual attributes:
• Global color represents the distribution of colors within the entire image. This distribution includes the amounts of each color, but not the
locations of colors.
• Local color represents color distributions and where they occur in an image, such as the fact that a red-green-blue (RGB) vector for sky
blue occurs in the upper half of an image.
• Texture represents the low-level patterns and textures within the image, such as graininess or smoothness. Unlike structure, texture is
very sensitive to features that appear with great frequency in the image.
• Structure represents the shapes that appear in the image, as determined by shape-characterization techniques such as edge detection.
• Facial represents unique characteristics of human faces. For example, characteristics include the size and shape of the nose, the distance
between the eyes, and various other attributes that cannot be easily disguised. Facial signatures are generated using separately purchasable
software from Viisage Technology, Inc.
• Feature data for all these visual attributes is stored in the signature, whose size typically ranges from 1000 to 2000 bytes. For better
performance with large image databases, you can create an index based on the signatures of your images. See Section 2.4 for more
information on indexing.
• Images in the database can be retrieved by matching them with a comparison image. The comparison image can be: any image inside or
outside the current database, a sketch, an algorithmically generated image, and so forth.
COLOUR HISTOGRAM
• Browsing, searching, and retrieving images has never been easy. Traditionally,
many technologies relied on manually appending metadata to images and
searching via this metadata. This approach works for datasets with high-quality
annotation, but most datasets are too large for manual annotation.
• That means any large image dataset must rely on Content-Based Image Retrieval
(CBIR). Search with CBIR focuses on comparing the content of an image rather
than its metadata. Content can be color, shapes, textures – or with some of the
latest advances in ML — the “semantic meaning” behind an image.
• Color histograms represent one of the first CBIR techniques, allowing us to search
through images based on their color profiles rather than metadata.
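• A colour histogram search can be sketched in a few lines. The example below is an illustration under stated assumptions: images are plain lists of RGB tuples, each channel is quantized into 4 levels (64 bins), and similarity is measured by histogram intersection, one common choice among several. The function names and sample pixel values are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def color_histogram(pixels: Sequence[Pixel], levels: int = 4) -> List[float]:
    """Normalized RGB histogram with `levels` bins per channel."""
    step = 256 / levels
    hist = [0.0] * (levels ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..levels-1.
        ri = min(int(r / step), levels - 1)
        gi = min(int(g / step), levels - 1)
        bi = min(int(b / step), levels - 1)
        hist[ri * levels * levels + gi * levels + bi] += 1
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Histogram intersection: 1.0 for identical profiles, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Tiny made-up "images": only colour amounts matter, not pixel positions.
sunset = [(250, 120, 30)] * 6 + [(20, 20, 60)] * 4
beach = [(240, 110, 20)] * 5 + [(30, 160, 220)] * 5
forest = [(20, 120, 30)] * 10

sunset_h = color_histogram(sunset)
beach_h = color_histogram(beach)
forest_h = color_histogram(forest)
```

Here the sunset and beach images share an orange bin and so score higher than sunset and forest, which share none. Because positions are ignored, this corresponds to a global colour profile.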
CONTENT BASED RETRIEVAL: TEXTURES
• With the rapid development of digital image processing technology, helping users find the multimedia
information they need quickly and effectively has become an active research topic. Image retrieval is
a major component of multimedia information retrieval and one of the foundations of video
information retrieval, so it plays a significant role in the field. Given a user's query, image retrieval
extracts an image or set of images related to the query image from the image dataset. Three
categories of methods are generally used: text-based, content-based and semantic-based.
Content-based image retrieval (CBIR) was proposed in the early 1990s. This approach
retrieves images using low-level features such as color, texture and shape that can represent an image.
Texture is one of the most important characteristics of an image, and texture features are widely
used in CBIR systems. Various algorithms have been designed for texture analysis, such as gray level
co-occurrence matrices, the Tamura texture features, the Markov random field model, Gabor filtering,
and local binary patterns. Tamura et al., drawing on research in human visual psychology, proposed
describing texture with several terms: coarseness, contrast, directionality, line-likeness, regularity
and roughness. This paper uses the color and edge orientation features, which describe the texture
information correctly.
ANOTHER METHOD TO IMPLY
TEXTURE
• HSV color space and quantization: Color is a low-level, intuitive physical
characteristic. Because color is robust to noise and to changes in the size and
orientation of an image, color features are the most commonly used in content-based image
retrieval. Color quantization is closely tied to the color space. Many color
spaces have been proposed and used for image retrieval, but each suits different
applications, so it is usually hard to decide which one is most
suitable for a given retrieval algorithm. The HSV color space mimics human color
perception well.
• Color non-equal interval quantization in HSV color space: To cut down
computational complexity and extract color features efficiently, we
use the HSV color space and quantize it into 72 non-equal-interval bins, which gives the
color index image C(x, y). Quantizing the H, S and V channels
uniformly is not well suited to human visual perception and recognition. We give our
non-equal interval quantization scheme as follows. Fig.1 shows that our quantization
scheme is better than the equal interval scheme.
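• The slide does not reproduce the actual quantization scheme, so the sketch below uses assumed bin boundaries to show how a 72-bin non-equal interval quantizer could look: 8 hue bins of unequal width, 3 saturation bins and 3 value bins (8 × 3 × 3 = 72). The hue edges in `H_EDGES`, the 0.2 and 0.7 thresholds and the index formula are illustrative assumptions, not the scheme from the source.

```python
import colorsys

# Assumed hue boundaries in degrees. Bin 0 wraps around red and covers
# [316, 360) as well as [0, 20); the other bins have unequal widths.
H_EDGES = [20, 40, 75, 155, 190, 270, 295, 316]


def quantize_hsv(r: int, g: int, b: int) -> int:
    """Map an 8-bit RGB pixel to a color index in 0..71."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue = h * 360.0
    h_bin = 0  # default covers hue >= 316, which wraps back to red
    for i, edge in enumerate(H_EDGES):
        if hue < edge:
            h_bin = i
            break
    # Assumed non-equal thresholds for saturation and value.
    s_bin = 0 if s < 0.2 else (1 if s < 0.7 else 2)
    v_bin = 0 if v < 0.2 else (1 if v < 0.7 else 2)
    return 9 * h_bin + 3 * s_bin + v_bin


def color_index_image(pixels):
    """Build C(x, y) from a 2-D list of RGB tuples."""
    return [[quantize_hsv(*px) for px in row] for row in pixels]


tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
C = color_index_image(tiny)
```

Narrow bins around hues where the eye discriminates finely and wide bins elsewhere are the point of a non-equal interval scheme; the exact edges would need to be taken from the original paper.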
IMAGE FEATURES
• Recent image retrieval techniques focus on combining multiple image features for efficient retrieval, since images must be fetched
from a variety of semantic groups and datasets. To cater for such versatile datasets, it is vital to retrieve images based on their primitive features: shape, texture, color and spatial
information. State-of-the-art detectors and descriptors can find interest points according to their specialty. To
combine the strengths of these image features for information fusion, this contribution presents a novel technique that fuses spatial color information
with shape-extracted features and object recognition. For the RGB channels, L2 spatial color arrangements are applied and features are extracted, then fused with
intensity-ranged shapes formed by connecting the edges and corners discovered in the grey-level image. Perifoveal receptive field estimation with 128-bit cascade
matching and symmetric sampling on the detected interest points discovers potential information for complex, overlay, foreground and background
objects. The process first reduces the massive feature vectors by selecting high-variance coefficients, and then performs indexing and
retrieval using a Bag-of-Words approach. Extensive experiments are conducted on ten highly recognized image dataset benchmarks specialized for
texture, shapes, colors and objects: ImageNet, Caltech-256, Caltech-101, 102-Flower, Corel-10,000, 17-Flower, Corel-1000, COIL, ALOT and FTVL tropical
fruits. To check its effectiveness and robustness, the proposed method is compared with the state-of-the-art detectors and descriptors SIFT, SURF, HOG, LBP, DoG,
MSER and RGBLBP. The reported results are encouraging: the proposed method performs remarkably well in most image categories of the versatile
datasets and can achieve better precision than the state-of-the-art detectors and descriptors.
MULTIMEDIA DATA FORMATS
• Audio and Video: Digital audio and video allow sound and motion pictures to be recorded, stored, and played back under
computer control. Furthermore, because the media are sampled digitally, the computer has very fine control over how
the sample is played back and when.
• One of the chief advantages that this technology has over computer-controlled tape or laser disc is that careful
programming can eliminate the seek times, or pauses, when jumping from one point in the sample to another. Another
advantage is that the samples can be digitally reprocessed and filtered, both to remove noise and perform other
interesting effects.
• Digitizing video/audio also permits intermixing of other digital media, e.g., putting a graphic or some text on a video
image. Basic digital audio and video editors use a splice operation to assemble sequences, much like their tape or film
counterparts. Filtering and other signal processing effects are becoming more popular, and start to take advantage of the
flexibility of the digital medium. The most interesting audio/video applications allow an instructor to create programmed
or interactive sequences.
• Multimedia Documents: Once an author has access to several different media, he will want to
compose a multimedia document, that is, a document which uses more than one medium (e.g. an article with figures).
• The ability to compose media varies from system to system: some systems allow only an aside, where the non-text media
can be requested to appear in another window; others allow in-line media, but only in alternation with the text; the
richest multimedia editors allow any media to contain any other media ad infinitum, so that a text can contain a table
which contains more formatted text within which lies a raster. The Andrew Toolkit [Palay88] is an example of such a
system, and it was used to produce this paper.
MULTIMEDIA DATA FORMATS
• Formatted Text: Modern text editors let a user add typographical information (often called "styles") to their text and show the effect
of these styles on-screen as the user works with the text. For instance, emboldened words look heavier, titles appear in larger type,
and subscripts are smaller and sit below the baseline. Some editors, recognizing that most styles are used to convey the structure of
the document or additional semantics of the text, try to capture this information directly: an author builds a hierarchy of
paragraphs, subsections, sections, and chapters; or marks passages as quotations.
• The text editor then induces the appropriate stylistic layout, using author-adjustable parameters.
• Rasters and Drawings: Like a tile mosaic, raster images allow a user to manipulate the image down to the picture element (i.e. pixel), or more broadly, as with a
brushstroke.
• Rasters are a flexible medium, as indicated by the plurality of methods for working with them: paint programs allow artists access
to a digital canvas via simulated brushes and pencils; scanners and frame grabbers allow photographs and video stills to be
captured; image editors allow pictures to be cropped, cut, pasted, rotated, and filtered.
• A more structured approach to imaging can be found in drawing editors. Instead of manipulating the pixels, a user defines the
image in terms of its component geometric shapes, such as lines, polygons, and circles. More sophisticated drawing editors work in
three dimensions and provide the ability to specify surfaces and solids, even texture, material, lighting sources and manufacturing
tolerances. Almost all allow the user to package a set of specifications together for reuse as a component, creating a personal or
organizational library of designs.
• The process of turning a drawing into an image is called rendering. Every drawing program is capable of at least enough rendering
to allow the user to interactively manipulate the drawing. But there exist accurate and computationally expensive rendering
operations such as ray-tracing and radiosity which can also be employed to generate photo-realistic images. Rendering a drawing
into a raster freezes a perspective on the drawing, and opens up the possibility of an artist altering the image.
VIDEO DATA MODEL
• Content-based video indexing and retrieval (CBVIR) extends the image retrieval problem to video, that is,
the problem of searching for digital videos in large databases. “Content-based” means that the search will
analyze the actual content of the video.
• The term ‘content’ in this context might refer to colours, shapes and textures. Without the ability to examine video
content, searches must rely on images provided by the user [10]. Although the term "search engine" is often
used indiscriminately to describe crawler-based search engines, human-powered directories, and everything
in between, they are not all the same. Each type of "search engine" gathers and ranks listings in radically
different ways. Crawler-based search engines such as Google, compile their listings automatically. They
"crawl" or "spider" the web, and people search through their listings.
• These listings are what make up the search engine's index or catalog. One can think of the index as a massive
electronic filing cabinet containing a copy of every web page the spider finds. Because spiders scour the web
on a regular basis, any changes made to a web site may affect search engine ranking. It is also important to
remember that it may take a while for a spidered page to be added to the index. Until that happens, it is not
available to those searching with the search engine[8]. Directories such as Open Directory depend on human
editors to compile their listings. Webmasters submit an address, title, and a brief description of their site,
and then editors review the submission. Hybrid search engines, however, will typically favor one type of listing over
the other.
GEOGRAPHIC INFORMATION SYSTEM
• A geographical information system (GIS) is a systematic integration of hardware and software for capturing, storing, displaying, updating, manipulating
and analyzing spatial data. GIS can also be viewed as an interdisciplinary area that incorporates many distinct fields of study, such as:
• 1. Geodesy, that is, projection, surveying, cartography and so on.
• 2. Remote Sensing
• 3. Photogrammetry
• 4. Environmental Science
• 5. City Planning
• 6. Cognitive Science
• As a result, GIS relies on progress made in fields such as computer science, databases, statistics, and artificial intelligence. All the different problems and questions that
arise from the integration of multiple disciplines make GIS more than a simple tool.
• Requirements for GIS –
Geographic information requires a means of integrating different sources of data at different levels of accuracy. The system deals with aspects of
daily life, so it must be updated daily to keep it current and reliable. Much of the information stored in a GIS is for practical use and requires special means of retrieval
and manipulation.
• GIS systems and applications deal with information, which can be viewed as data with specific meaning and context, rather than simple data.
• Components of a GIS system –
A GIS system can be viewed as an integration of three components: hardware and software, data, and people.
• Hardware and software –
Hardware relates to the devices used by end users, such as graphic devices, plotters and scanners. Data storage and manipulation is done using a range of
processors. With the development of the Internet and Web-based applications, Web servers have become part of many systems' architectures, hence most GISs
follow a 3-tier architecture.
• Software relates to the processes used to define, store and manipulate the data, and hence it is akin to a DBMS. Different models are used to provide efficient
means of storage, retrieval and manipulation of data.
• Data –
Geographic data are divided into two main groups: vector and raster.
• Vector data/layers in GIS refer to discrete objects represented by points, lines and polygons. Lines are formed by connecting two or more points, and polygons are
closed sets of lines. Layers represent geometries that share a common set of attributes. Objects within a layer have mutual topology. Vector sources include
digitized maps, features extracted from image surveys and many more.
• Raster data is a continuous grid of cells in two dimensions, or the equivalent of cubic cells in three dimensions. Raster data are divided conceptually into categorical
and continuous. In a categorical raster, every cell value is linked to a category in a separate table. Examples: soil type, vegetation type, land suitability, and so on.
Continuous raster images usually describe continuous phenomena in space, such as a Digital Elevation Model, where each pixel is an elevation value. Unlike a
categorical raster, a continuous raster doesn't have an attribute/category table attached. Typical raster sources are aerial images, satellite images and scanned
map images.
• People –
People are involved in all phases of development of a GIS system and in collecting data. They include cartographers and surveyors who create the maps and
survey the land and the geographical features. They also include system users who collect the data, upload the data to system, manipulate the system and
analyze the results.
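• The vector and raster data groups described above can be contrasted with a small sketch. Everything here is illustrative: the parcel coordinates, the soil category table and the elevation values are made up, and the ray-casting test is one standard way to ask whether a point lies inside a vector polygon.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray to the right of p crosses."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]     # closing edge wraps to the first vertex
        if (y1 > y) != (y2 > y):          # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Vector layer: a discrete object stored as an ordered list of vertices.
parcel: List[Point] = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

# Categorical raster: each cell holds a code that is looked up in a separate table.
soil_grid: List[List[int]] = [[1, 1, 2],
                              [1, 3, 2],
                              [3, 3, 2]]
soil_table: Dict[int, str] = {1: "clay", 2: "sand", 3: "loam"}

# Continuous raster: each cell is itself the measurement (elevation in metres).
elevation: List[List[float]] = [[120.5, 121.0, 123.2],
                                [119.8, 120.4, 122.0],
                                [118.9, 119.5, 121.1]]

centre_soil = soil_table[soil_grid[1][1]]
centre_height = elevation[1][1]
```

The categorical grid needs the lookup table to be meaningful, while the elevation grid is read directly, which mirrors the distinction drawn in the text.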
WEB DATABASE ACCESSING
DATABASE
• While many DBMS vendors are working to provide proprietary database connectivity solutions for the Web, the majority of organizations need a more general solution to avoid
being tied to a single technology. Listed here are some of the most significant requirements for integrating database applications with the Web. These requirements are ideals and not fully
attainable at present. They are not ranked in any order:
• The ability to access valuable corporate data in a fully secured manner.
• Data- and vendor-independent connectivity that allows freedom of choice in selecting the DBMS for present and future use.
• The capability to interface to the database, independent of any proprietary Web browser and/or Web server.
• A connectivity solution that takes advantage of all the features of an organization's DBMS.
• An open-architectural structure that allows interoperability with a variety of systems and technologies; such as:
• Different types of Web servers
• Microsoft's Distributed Component Object Model (DCOM) / Component Object Model (COM)
• CORBA / IIOP
• Java / RMI which is Remote Method Invocation
• XML (Extensible Markup Language)
• Various Web services (SOAP, UDDI, etc.)
• A cost-effective approach that allows for scalability, growth, and changes in strategic direction, and helps reduce the costs of developing and maintaining those applications.
• Provides support for transactions that span multiple HTTP requests.
• Gives minimal administration overhead.
WEB SERVERS
• A web server is software and hardware that uses HTTP (Hypertext Transfer Protocol) and other protocols to respond to client requests made
over the World Wide Web. The main job of a web server is to display website content through storing, processing and delivering webpages to
users. Besides HTTP, web servers also support SMTP (Simple Mail Transfer Protocol) and FTP (File Transfer Protocol), used for email, file
transfer and storage.
• Web server hardware is connected to the internet and allows data to be exchanged with other connected devices, while web server software
controls how a user accesses hosted files. The web server process is an example of the client/server model. All computers that host websites
must have web server software.
• Web servers are used in web hosting, or the hosting of data for websites and web-based applications -- or web applications.
• How do web servers work?
• Web server software is accessed through the domain names of websites and ensures the delivery of the site's content to the requesting user.
The software side is also comprised of several components, with at least an HTTP server. The HTTP server is able to understand HTTP and
URLs. As hardware, a web server is a computer that stores web server software and other files related to a website, such as HTML documents,
images and JavaScript files.
• When a web browser, like Google Chrome or Firefox, needs a file that's hosted on a web server, the browser will request the file by HTTP.
When the request is received by the web server, the HTTP server will accept the request, find the content and send it back to the browser
through HTTP.
• More specifically, when a browser requests a page from a web server, the process will follow a series of steps. First, a person will specify a URL
in a web browser's address bar. The web browser will then obtain the IP address of the domain name -- either translating the URL through DNS
(Domain Name System) or by searching in its cache. This will bring the browser to a web server. The browser will then request the specific file
from the web server by an HTTP request. The web server will respond, sending the browser the requested page, again, through HTTP. If the
requested page does not exist or if something goes wrong, the web server will respond with an error message. The browser will then be able to
display the webpage.
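• The request/response steps above can be tried locally with Python's standard library. This is a minimal sketch under stated assumptions: the server runs on 127.0.0.1 with a port chosen by the operating system, so the DNS lookup step is skipped, and the `PAGES` dictionary stands in for files hosted on the server.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content, keyed by request path.
PAGES = {"/index.html": b"<html><body>Hello</body></html>"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404, "Not Found")   # page does not exist
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


# Opener with no proxies so the request always goes straight to localhost.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url: str):
    """Play the browser's role: send a GET and return (status, body)."""
    try:
        with _opener.open(url, timeout=5) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""


server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

ok_status, ok_body = fetch(base + "/index.html")
missing_status, _ = fetch(base + "/nope.html")

server.shutdown()
server.server_close()
```

The first request returns the stored page with status 200, and the second shows the error path with status 404.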
XML DATABASES
• An XML database is used to store large amounts of information in XML format. As the use of XML increases in every field, a secure place is needed
to store XML documents. The data stored in the database can be queried using XQuery, serialized, and exported into a desired format.
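• Python's standard library has no XQuery engine, so the sketch below uses the limited XPath subset in `xml.etree.ElementTree` as a stand-in to show the query, serialize and export cycle on an XML document. The `CATALOG` document and its element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document describing stored media clips.
CATALOG = """
<catalog>
  <clip id="c1" format="mp4"><title>Intro</title><seconds>42</seconds></clip>
  <clip id="c2" format="wav"><title>Theme</title><seconds>180</seconds></clip>
  <clip id="c3" format="mp4"><title>Outro</title><seconds>15</seconds></clip>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Query by attribute using an XPath predicate.
mp4_titles = [t.text for t in root.findall("./clip[@format='mp4']/title")]

# Query by element content; numeric comparison is done in Python
# because ElementTree's XPath subset has no comparison operators.
long_ids = [c.get("id") for c in root.findall("clip")
            if int(c.findtext("seconds")) > 30]

# Serialize one matching element back to text for export.
exported = ET.tostring(root.find("./clip[@id='c2']"), encoding="unicode")
```

A native XML database would evaluate such queries inside the store with XQuery; the in-memory version here only illustrates the shape of the operations.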
• Blockchain Tables
• Blockchain as a technology has promised much in terms of solving many of the problems associated with the verification of transactions. While considerable
progress has been made in bringing this technology to the enterprise, a number of problems remain, arguably the largest being the complexity of building
applications that can support a distributed ledger. Oracle Database 21c addresses this problem with the introduction of Blockchain Tables. These tables operate
like any normal heap table, but with a number of important differences, the most notable being that rows are cryptographically hashed as they are
inserted into the table, ensuring that a row can no longer be changed at a later date.
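• The effect of hashing rows on insert can be illustrated with a toy append-only ledger. This is not Oracle's implementation: the class `AppendOnlyLedger`, the use of SHA-256 over a JSON encoding, and the choice to include the previous row's hash in each new hash are assumptions made to show why a later change becomes detectable.

```python
import hashlib
import json
from typing import Dict, List


def row_hash(prev_hash: str, data: Dict) -> str:
    """Hash the row content together with the previous row's hash."""
    payload = prev_hash + json.dumps(data, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AppendOnlyLedger:
    GENESIS = "0" * 64  # assumed starting value for the first row

    def __init__(self) -> None:
        self.rows: List[Dict] = []

    def insert(self, data: Dict) -> None:
        prev = self.rows[-1]["hash"] if self.rows else self.GENESIS
        self.rows.append(
            {"data": dict(data), "prev": prev, "hash": row_hash(prev, data)}
        )

    def verify(self) -> bool:
        """Recompute every hash in order; any edited row breaks the chain."""
        prev = self.GENESIS
        for row in self.rows:
            if row["prev"] != prev or row["hash"] != row_hash(prev, row["data"]):
                return False
            prev = row["hash"]
        return True


ledger = AppendOnlyLedger()
ledger.insert({"account": "A", "amount": 100})
ledger.insert({"account": "B", "amount": 250})
intact = ledger.verify()

ledger.rows[0]["data"]["amount"] = 999   # simulate tampering with an old row
tampered_ok = ledger.verify()
```

In this toy version the change is only detected, not prevented; a database enforcing the rule would reject the update outright.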
COMMERCIAL SYSTEMS:DB2
• Optimization in commercial systems: IBM DB2, Informix, Microsoft SQL Server,
Oracle 8, and Sybase ASE all search for left-deep trees using dynamic programming,
as described here, with several variations. For example, Oracle always considers
interchanging the two relations in a hash join, which could lead to right-deep trees
or hybrids. DB2 generates some bushy trees as well. Systems often use a variety of
strategies for generating plans, going beyond the systematic bottom-up enumeration
that we described, in conjunction with a dynamic programming strategy for costing
plans and remembering interesting plans (in order to avoid repeated analysis of the
same plan). Systems also vary in the degree of control they give to users. Sybase ASE
and Oracle 8 allow users to force the choice of join orders and indexes—Sybase ASE
even allows users to explicitly edit the execution plan—whereas IBM DB2 does not
allow users to direct the optimizer other than by setting an ‘optimization level,’
which influences how many alternative plans the optimizer considers.
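• The left-deep, dynamic programming search mentioned above can be sketched as follows. This is a simplified illustration, not any vendor's optimizer: the cost model (sum of estimated intermediate result sizes), the cardinalities and the selectivities are assumptions, and interesting orders, access paths and join methods are not modelled.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Tuple

Plan = Tuple[float, Tuple[str, ...], float]  # (cost, join order, estimated rows)


def best_left_deep_plan(card: Dict[str, int],
                        sel: Dict[FrozenSet[str], float]) -> Plan:
    """Bottom-up enumeration: best plan for each subset, built one relation at a time."""
    rels = sorted(card)
    best: Dict[FrozenSet[str], Plan] = {
        frozenset([r]): (0.0, (r,), float(card[r])) for r in rels
    }
    for size in range(2, len(rels) + 1):
        for combo in combinations(rels, size):
            subset = frozenset(combo)
            for inner in combo:              # relation joined last (right input)
                rest = subset - {inner}
                cost, order, rows = best[rest]   # remembered best plan for the rest
                factor = 1.0
                for r in rest:
                    # Missing selectivity means no join predicate (cross product).
                    factor *= sel.get(frozenset([r, inner]), 1.0)
                out_rows = rows * card[inner] * factor
                new_cost = cost + out_rows
                if subset not in best or new_cost < best[subset][0]:
                    best[subset] = (new_cost, order + (inner,), out_rows)
    return best[frozenset(rels)]


# Made-up statistics for three relations.
card = {"A": 1000, "B": 100, "C": 10}
sel = {frozenset(["A", "B"]): 0.01, frozenset(["B", "C"]): 0.1}

cost, order, rows = best_left_deep_plan(card, sel)
```

With these numbers the search joins the two small relations first and brings in the largest one last, because that keeps the intermediate result small. Storing the best plan per subset is what avoids re-analysing the same sub-plan.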