MODULE 4 OOR Dbms (Merrin)
Object-Oriented Database
An object-oriented database management system (OODBMS, also written ODBMS) uses a data
model in which data is stored in the form of objects, which are instances of classes. These
classes and objects together make up the object-oriented data model.
A. Object Structure:
The structure of an object refers to the properties that an object is made up of. These properties
are referred to as attributes. Thus, an object is a real-world entity with certain attributes that
make up the object structure. An object also encapsulates data and code into a single unit, which
in turn provides data abstraction by hiding the implementation details from the user.
The object structure is further composed of three types of components: Messages, Methods, and
Variables. These are explained below.
1. Messages –
A message provides an interface or acts as a communication medium between an object
and the outside world. A message can be of two types:
○ Read-only message: If the invoked method does not change the value of a
variable, then the invoking message is said to be a read-only message.
○ Update message: If the invoked method changes the value of a variable,
then the invoking message is said to be an update message.
2. Methods –
When a message is passed then the body of code that is executed is known as a method.
Whenever a method is executed, it returns a value as output. A method can be of two
types:
○ Read-only method: When the value of a variable is not affected by a
method, then it is known as the read-only method.
○ Update method: When the value of a variable is changed by a method, then
it is known as an update method.
3. Variables –
Variables store the data of an object. The data stored in the variables makes objects
distinguishable from one another.
B. Object Classes:
An object, which is a real-world entity, is an instance of a class. Hence we first define a class,
and then create objects that differ in the values they store but share the same class definition.
Each object in turn holds its own variables and responds to the messages defined by its class.
Example –
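#include <string>
using std::string;

class CLERK
{
    // variables (private by default, so the data stays encapsulated)
    string name;      // a name is text, so string rather than a single char
    string address;
    int id;
    int salary;

public:
    // methods (declarations only; each one is invoked by a matching message)
    string get_name();
    string get_address();
    int annual_salary();
};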
In the above example, we can see, CLERK is a class that holds the object variables and
messages.
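The read-only and update messages described earlier map naturally onto C++ const and non-const member functions. The following is a minimal sketch of that idea, not a definitive implementation. The constructor, `get_id`, and `set_salary` are hypothetical additions that do not appear in the CLERK example, and the code assumes the stored salary is a monthly figure.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Sketch of the CLERK class with read-only and update methods.
class Clerk {
public:
    Clerk(std::string name, std::string address, int id, int salary)
        : name_(std::move(name)), address_(std::move(address)),
          id_(id), salary_(salary) {
        assert(salary >= 0);  // a negative salary would be invalid data
    }

    // Read-only methods: marked const, so they cannot change any variable.
    std::string get_name() const { return name_; }
    std::string get_address() const { return address_; }
    int get_id() const { return id_; }
    int annual_salary() const { return salary_ * 12; }  // assumes a monthly salary

    // Update method (hypothetical): changes the value of a variable.
    void set_salary(int salary) { salary_ = salary; }

private:
    // Variables: hidden from the outside world (encapsulation).
    std::string name_;
    std::string address_;
    int id_;
    int salary_;
};
```

Calling `annual_salary()` corresponds to sending a read-only message, while calling `set_salary()` corresponds to sending an update message. The compiler rejects any attempt to modify a variable inside a const method, which enforces the distinction.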
An OODBMS also supports inheritance extensively, because a database may contain many classes
with similar methods, variables, and messages. A class hierarchy is therefore maintained to
capture the similarities among the various classes.
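A class hierarchy of this kind can be sketched in C++ as follows. This is an illustrative example only: the `Employee` and `Manager` classes, the annual bonus, and the assumption that the salary is monthly are all hypothetical and are not part of the CLERK example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Base class: variables and methods shared by every kind of employee.
class Employee {
public:
    Employee(std::string name, int monthly_salary)
        : name_(std::move(name)), monthly_salary_(monthly_salary) {
        assert(monthly_salary >= 0);
    }
    virtual ~Employee() = default;

    std::string get_name() const { return name_; }
    virtual int annual_salary() const { return monthly_salary_ * 12; }

private:
    std::string name_;
    int monthly_salary_;
};

// Clerk inherits all variables and methods from Employee unchanged.
class Clerk : public Employee {
public:
    using Employee::Employee;  // reuse the base-class constructor
};

// Manager inherits the same members and overrides one method.
class Manager : public Employee {
public:
    Manager(std::string name, int monthly_salary, int annual_bonus)
        : Employee(std::move(name), monthly_salary), annual_bonus_(annual_bonus) {}

    int annual_salary() const override {
        return Employee::annual_salary() + annual_bonus_;
    }

private:
    int annual_bonus_;  // new attribute added by the subclass
};
```

The shared members are written once in `Employee`, and each subclass states only what differs, which is the similarity the class hierarchy is meant to capture.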
Encapsulation, that is, data or information hiding, is also supported by the object-oriented
data model. This data model also provides abstract data types (ADTs) in addition to built-in
data types such as char, int, and float. ADTs are user-defined data types that hold values
and can also have methods attached to them.
Thus, an OODBMS provides numerous facilities to its users, both built-in and user-defined. It
combines the properties of an object-oriented data model with a database management system,
and supports programming concepts such as classes and objects, along with encapsulation,
inheritance, and user-defined ADTs (abstract data types).
ODBMS stands for Object Database Management System. In ODBMS data is encapsulated
and represented in the form of objects. It relates the concept of object-oriented programming
with database systems. ODBMS grew out of research during the early 1970s as database support
for graph-structured objects. In comparison with RDBMS, where data is stored in tables with
rows and columns, ODBMS stores information as objects.
Characteristics
● Easy to link with programming language: The programming language and the
database schema use the same type definitions, so developers may not need to learn a
new database query language.
● No need for user defined keys: Object Database Management Systems have an
automatically generated OID associated with each of the objects.
● Easy modeling: ODBMS can easily model real-world objects, hence, are suitable for
applications with complex data.
● Can store non-textual data: ODBMS can also store audio, video and image data.
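The idea of an automatically generated OID can be illustrated with a toy in-memory object store. This is only a sketch under stated assumptions: the `ObjectStore` class, the `Oid` type, and the simple counter used to generate identifiers are hypothetical, and a real ODBMS would also persist the objects to disk.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

using Oid = std::uint64_t;  // hypothetical object identifier type

struct Clerk {
    std::string name;
    int salary;
};

// A toy in-memory object store keyed by system-generated OIDs.
class ObjectStore {
public:
    // Stores the object and returns a freshly generated OID,
    // so the caller never has to define a key.
    Oid insert(Clerk obj) {
        Oid oid = next_oid_++;
        objects_.emplace(oid, std::move(obj));
        return oid;
    }

    // Direct lookup by OID; returns nullptr if no such object exists.
    const Clerk* find(Oid oid) const {
        auto it = objects_.find(oid);
        return it == objects_.end() ? nullptr : &it->second;
    }

private:
    Oid next_oid_ = 1;
    std::unordered_map<Oid, Clerk> objects_;
};
```

Two clerks with identical names and salaries would still receive different OIDs, so object identity does not depend on the stored values.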
Advantages
● Speed: Access to data can be faster because an object can be retrieved directly
without a search, by following pointers.
● Improved performance: These systems are most suitable for applications that use
object-oriented programming.
● Extensibility: Unlike a traditional RDBMS, where the basic data types are hardcoded,
an ODBMS lets the user encode any kind of data structure to hold the data.
● Data consistency: When ODBMS is integrated with an object-based application,
there is much greater consistency between the database and the programming
language since both use the same model of representation for the data. This helps
avoid the impedance mismatch.
● Capability of handling a variety of data: Unlike many other database management systems,
ODBMS can also store non-textual data such as images, video, and audio.
Disadvantages:
● No universal standards: There are no universally agreed standards for operating an
ODBMS. This is the most significant drawback, as users are free to manipulate data
models as they want, which can be an issue when handling enormous amounts of data.
● No security features: Since use of ODBMS is very limited, there are not adequate
security features to store production-grade data.
● Rapid increase in complexity: ODBMS become very complex very fast. When
there is a lot of data and a lot of relations between data, managing and optimizing
an ODBMS becomes difficult.
● Scalability: Unable to support large systems.
● Query optimization is challenging: Optimizing ODBMS queries requires complete
information about the data, such as its type and size. This compromises the
data encapsulation that ODBMS has to offer.
Object Relational Model
An Object relational model is a combination of an Object oriented database model and a
Relational database model. So, it supports objects, classes, inheritance etc. just like Object
Oriented models and has support for data types, tabular structures etc. like Relational data
models.
One of the major goals of the Object relational data model is to close the gap between relational
databases and the object-oriented practices frequently used in many programming languages such
as C++, C#, and Java.
Advantages of Object Relational model:
The advantages of the Object Relational model are −
● Inheritance
The Object Relational data model allows its users to inherit objects, tables etc. so that they can
extend their functionality. Inherited objects contain new attributes as well as the attributes that
were inherited.
● Complex Data Types
Complex data types can be formed using existing data types. This is useful in Object relational
data models as complex data types allow better manipulation of the data.
● Extensibility
The functionality of the system can be extended in the Object relational data model. This can be
achieved using complex data types as well as advanced concepts of object oriented models such
as inheritance.
Disadvantages of Object Relational model:
The object relational data model can get quite complicated and difficult to handle at times as it is
a combination of the Object oriented data model and Relational data model and utilizes the
functionalities of both of them.
Web databases
A web database is essentially a database that can be accessed from a local network or the internet
instead of one that has its data stored on a desktop or its attached storage. Used for both
professional and personal purposes, web databases are hosted on websites and are software as a
service (SaaS) products, which means that access is provided via a web browser.
One of the types of web databases that you may be more familiar with is a relational database.
Relational databases store data in groups (known as tables) and can link records together. They
use indexes and keys, which are added to the data, to locate the fields stored in the database,
enabling you to retrieve information quickly.
To paint a picture, just think about when you shop online and want to have a look at a specific
product. Typing in keywords such as “black dress” enables all the black dresses stored on the
website to appear right on the very browser you are looking on, because the information “black”
and “dress” are stored in their database entries.
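The "black dress" lookup can be sketched with a small keyword index that maps each word to the products whose description contains it. This is a simplified illustration, not how any particular web database is implemented: the `KeywordIndex` class and the product ids are hypothetical, and the matching is case-sensitive and splits on whitespace only.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Maps each keyword to the ids of products whose description contains it.
class KeywordIndex {
public:
    void add_product(int product_id, const std::string& description) {
        std::istringstream words(description);
        std::string word;
        while (words >> word) {
            index_[word].insert(product_id);
        }
    }

    // Returns the ids of products that contain every keyword in the query.
    std::set<int> search(const std::string& query) const {
        std::istringstream words(query);
        std::string word;
        std::set<int> result;
        bool first = true;
        while (words >> word) {
            auto it = index_.find(word);
            if (it == index_.end()) return {};  // unknown keyword: no match
            if (first) {
                result = it->second;
                first = false;
            } else {
                // Keep only the ids that also match this keyword.
                std::set<int> kept;
                for (int id : result) {
                    if (it->second.count(id)) kept.insert(id);
                }
                result = std::move(kept);
            }
        }
        return result;
    }

private:
    std::map<std::string, std::set<int>> index_;
};
```

Because the index is keyed by keyword, a search touches only the entries for "black" and "dress" rather than scanning every product record.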
Some advantages of using a web database include:
1. Web database applications can be free or require payment, usually through monthly
subscriptions. Because of this, you pay for the amount you use. So whether your business
shrinks or expands, your needs can be accommodated by the amount of server space. You
also don’t have to fork out for the cost of installing an entire software program.
2. The information is accessible from almost any device. Having things stored in a cloud
means that it is not stuck to one computer. As long as you are granted access, you can
technically get a hold of the data from just about any compatible device.
3. Web database programs usually come with their own technical support team so your IT
department folks can focus on other pressing company matters.
4. It’s convenient: web databases allow users to update information so all you have to do is
to create simple web forms.
Data Organization
Web databases enable collected data to be organized and cataloged thoroughly within hundreds
of parameters. The Web database does not require advanced computer skills, and many database
software programs provide an easy "click-and-create" style with no complicated coding. Fill in
the fields and save each record. Organize the data however you choose, such as chronologically,
alphabetically or by a specific set of parameters.
Web Database Software
Web database software programs are found within office productivity suites, such as
Microsoft Access (part of Microsoft Office) and OpenOffice Base. Other programs include the
Webex WebOffice database and FormLogix Web database. The most advanced software applications
can set up data collection forms, polls, and feedback forms, and present data analysis in real time.
Applicable Uses
Businesses both large and small can use Web databases to create website polls, feedback forms,
client or customer and inventory lists. Personal Web database use can range from storing
personal email accounts to a home inventory to personal website analytics. The Web database is
entirely customizable to an individual's or business's needs.
Securing your website-based database is also of great importance, especially since hackers
access billions of organizational records every year. Protecting your systems isn’t a matter that’s
up for discussion; it’s a must.
Luckily, database management systems (DBMS) offer robust data encryption mechanisms. Top
of that list is the use of complex algorithms for encrypting files. This approach makes
information unreadable to unauthorized users. When you need access, it will decrypt the records
to make them readable.
Passwords and private keys are further means of securing your web database. These limit the
people who can access the system and make it harder for attackers to penetrate the website
database.
A web application firewall (WAF) is another excellent option. It adds an extra layer of protection
to your systems. The set-up works effectively in filtering bots, spam, and DDoS attacks. The best
part – it’s available at an affordable cost from CDN providers.
Data Warehouse
A Data Warehouse is a relational database management system (RDBMS) constructed to meet the
requirements of decision support rather than those of transaction processing systems. It can be
loosely described as any centralized data repository that can be queried for business benefit. It
is a database that stores information oriented to satisfy decision-making requests. Data
warehousing is a group of decision support technologies aimed at enabling the knowledge worker
(executive, manager, or analyst) to make better and faster decisions. Data warehousing thus
supports architectures and tools that let business executives systematically organize, understand,
and use their information to make strategic decisions.
The Data Warehouse environment contains an extraction, transformation, and loading (ETL)
solution, an online analytical processing (OLAP) engine, customer analysis tools, and other
applications that handle the process of gathering information and delivering it to business users.
A Data Warehouse (DW) is a relational database that is designed for query and analysis rather
than transaction processing. It includes historical data derived from transaction data from single
and multiple sources. A Data Warehouse provides integrated, enterprise-wide, historical data and
focuses on providing support for decision-makers for data modeling and analysis. A Data
Warehouse is a group of data specific to the entire organization, not only to a particular group of
users. It is not used for daily operations and transaction processing but used for making
decisions.
A Data Warehouse can be viewed as a data system with the following attributes:
● It is a database designed for investigative tasks, using data from various applications.
● It supports a relatively small number of clients with relatively long interactions.
● It includes current and historical data to provide a historical perspective of information.
● Its usage is read-intensive.
● It contains a few large tables.
Subject-Oriented
A data warehouse focuses on the modeling and analysis of data for decision-makers. Therefore,
data warehouses typically provide a concise and straightforward view around a particular
subject, such as customer, product, or sales, instead of the organization's ongoing operations
as a whole. This is done by excluding data that is not useful for the subject and
including all data needed by the users to understand the subject.
Integrated
A data warehouse integrates various heterogeneous data sources such as RDBMSs, flat files, and
online transaction records. Data cleaning and integration must be performed during data
warehousing to ensure consistency in naming conventions, attribute types, etc., among the
different data sources.
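One part of integration, making attribute names consistent across sources, can be sketched as a simple renaming step. The following is a minimal illustration under stated assumptions: the source attribute names (`cust_name`, `CustomerName`, `tel`, `phone_no`) and the canonical warehouse names are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

using Record = std::map<std::string, std::string>;

// Hypothetical mapping from source-specific attribute names
// to the single naming convention used in the warehouse.
const std::map<std::string, std::string> kCanonicalNames = {
    {"cust_name", "customer_name"},
    {"CustomerName", "customer_name"},
    {"tel", "phone"},
    {"phone_no", "phone"},
};

// Returns a copy of the record with every known attribute renamed;
// attributes without a mapping are kept under their original name.
Record integrate(const Record& source_record) {
    Record unified;
    for (const auto& field : source_record) {
        auto it = kCanonicalNames.find(field.first);
        const std::string& name =
            (it == kCanonicalNames.end()) ? field.first : it->second;
        unified[name] = field.second;
    }
    return unified;
}
```

After this step, records describing the same customer in two different source systems carry the same attribute names and can be compared or merged directly. Real integration also has to reconcile attribute types, units, and encodings, which this sketch leaves out.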
Time-Variant
Historical information is kept in a data warehouse. For example, one can retrieve data from 3
months, 6 months, 12 months, or even further back from a data warehouse. This contrasts
with a transaction system, where often only the most current data is kept.
Non-Volatile
The data warehouse is a physically separate data store, which is transformed from the source
operational RDBMS. Operational updates of data do not occur in the data warehouse, i.e.,
update, insert, and delete operations are not performed. It usually requires only two procedures
for data access: initial loading of data and access to data. Therefore, the DW does not require
transaction processing, recovery, and concurrency capabilities, which allows for a substantial
speedup of data retrieval. Non-volatile means that once data has entered the warehouse, it
should not change.
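The time-variant and non-volatile properties can be sketched together as a store that offers only two operations, loading and read-only querying. This is an illustrative example, not a real warehouse design: the `Warehouse` class, the `Snapshot` record, and the integer month key are assumptions made for the sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Snapshot {
    int month;             // hypothetical time key, e.g. month number
    std::string customer;
    double balance;
};

class Warehouse {
public:
    // Loading appends a snapshot; existing snapshots are never
    // updated or deleted (non-volatile).
    void load(Snapshot s) { history_.push_back(std::move(s)); }

    // Read-only access: all snapshots of one customer within
    // the inclusive range [from_month, to_month] (time-variant).
    std::vector<Snapshot> query(const std::string& customer,
                                int from_month, int to_month) const {
        std::vector<Snapshot> out;
        for (const Snapshot& s : history_) {
            if (s.customer == customer &&
                s.month >= from_month && s.month <= to_month) {
                out.push_back(s);
            }
        }
        return out;
    }

private:
    std::vector<Snapshot> history_;
};
```

An operational system would typically overwrite the balance each month, whereas this store keeps every monthly snapshot so that earlier states remain queryable.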
Need for a Data Warehouse
1) Business users: Business users require a data warehouse to view summarized data from the
past. Since these people are non-technical, the data may be presented to them in an elementary
form.
2) Store historical data: A data warehouse is required to store time-variant data from the
past. This data can then be used for various purposes.
3) Make strategic decisions: Some strategies may depend upon the data in the data
warehouse. So, data warehouses contribute to making strategic decisions.
4) For data consistency and quality: By bringing the data from different sources to a
common place, the user can effectively bring uniformity and consistency to the data.
5) Quick response time: Data warehouses have to be ready for somewhat unexpected loads and
types of queries, which demands a significant degree of flexibility and quick response time.
Architecture is the proper arrangement of the elements. We build a data warehouse with software
and hardware components. To suit the requirements of our organization, we arrange these
building blocks in a particular way. We may want to strengthen one part or another with extra
tools and services. All of these choices depend on our circumstances.
The figure shows the essential elements of a typical warehouse. We see the Source Data
component shown on the left. The Data Staging element serves as the next building block. In the
middle, we see the Data Storage component that handles the data warehouse's data. This element
not only stores and manages the data; it also keeps track of data using the metadata repository.
The Information Delivery component on the right consists of all the different ways of making the
information from the data warehouse available to the users.
Source Data Component
Source data coming into the data warehouses may be grouped into four broad categories:
Production Data: This type of data comes from the different operational systems of the
enterprise. Based on the data requirements in the data warehouse, we choose segments of the
data from the various operational systems.
Internal Data: In each organization, users keep their "private" spreadsheets, reports,
customer profiles, and sometimes even departmental databases. This is the internal data, part of
which could be useful in a data warehouse.
Archived Data: Operational systems are mainly intended to run the current business. In every
operational system, we periodically take the old data and store it in archived files.
External Data: Most executives depend on information from external sources for a large
percentage of the information they use. They use statistics relating to their industry
produced by external agencies.
Data Mining
Data mining is one of the most useful techniques that help entrepreneurs, researchers, and
individuals to extract valuable information from huge sets of data. Data mining is also called
Knowledge Discovery in Databases (KDD). The knowledge discovery process includes Data
cleaning, Data integration, Data selection, Data transformation, Data mining, Pattern evaluation,
and Knowledge presentation.
The process of extracting information from huge sets of data to identify patterns, trends, and
useful data that allow a business to take data-driven decisions is called Data Mining.
In other words, Data Mining is the process of investigating hidden patterns of information from
various perspectives and categorizing them into useful data. This data is collected and
assembled in particular areas such as data warehouses, where it supports efficient analysis,
data mining algorithms, and decision making, with the eventual aim of cutting costs and
generating revenue.
Data mining is the act of automatically searching large stores of information to find trends
and patterns that go beyond simple analysis procedures. Data mining utilizes complex
mathematical algorithms to segment the data and evaluate the probability of future events. Data
Mining is also called Knowledge Discovery in Data (KDD).
Data Mining is a process used by organizations to extract specific data from huge databases to
solve business problems. It primarily turns raw data into useful information.
Data Mining is similar to Data Science carried out by a person, in a specific situation, on a
particular data set, with an objective. This process includes various types of services such as text
mining, web mining, audio and video mining, pictorial data mining, and social media mining. It
is done through software that is simple or highly specific. By outsourcing data mining, all the
work can be done faster with low operation costs. Specialized firms can also use new
technologies to collect data that is impossible to locate manually. There is tons of information
available on various platforms, but very little knowledge is accessible. The biggest challenge is
to analyze the data to extract important information that can be used to solve a problem or for
company development. There are many powerful instruments and techniques available to mine
data and find better insight from it.
Relational Database:
A relational database is a collection of multiple data sets formally organized into tables,
records, and columns, from which data can be accessed in various ways without having to
reorganize the database tables. Tables convey and share information, which facilitates data
searchability, reporting, and organization.
Data warehouses:
A Data Warehouse is the technology that collects the data from various sources within the
organization to provide meaningful business insights. The huge amount of data comes from
multiple places such as Marketing and Finance. The extracted data is utilized for analytical
purposes and helps in decision-making for a business organization. The data warehouse is
designed for the analysis of data rather than transaction processing.
Data Repositories:
The term Data Repository generally refers to a destination for data storage. However, many IT
professionals use the term more specifically to refer to a particular kind of setup within an IT
structure, for example, a group of databases in which an organization has kept various kinds of
information.
Object-Relational Database:
Transactional Database:
A transactional database refers to a database management system (DBMS) that is able to undo
(roll back) a database transaction if it is not performed correctly. Although this was a unique
capability long ago, today most relational database systems support transactional database
activities.
● There is a possibility that organizations may sell useful customer data to other
organizations for money. According to one report, American Express has sold data on its
customers' credit card purchases to other organizations.
● Much data mining analytics software is difficult to operate and needs advanced training to
work with.
● Different data mining instruments operate in distinct ways due to the different algorithms
used in their design. Therefore, the selection of the right data mining tools is a very
challenging task.
● Data mining techniques are not precise, so they may lead to severe consequences in
certain conditions.
Data Mining is primarily used by organizations with intense consumer demands, such as retail,
communication, financial, and marketing companies, to determine prices, consumer preferences,
product positioning, and the impact on sales, customer satisfaction, and corporate profits. Data
mining enables a retailer to use point-of-sale records of customer purchases to develop products
and promotions that help the organization attract customers.
Challenges of Implementation in Data mining
Data mining is the process of extracting useful data from large volumes of data. Real-world
data is heterogeneous, incomplete, and noisy. Data in huge quantities will usually be
inaccurate or unreliable. These problems may occur because of faulty measuring instruments or
because of human error. Suppose a retail chain collects phone numbers of customers who spend
more than $500, and the accounting employees put the information into their system. A person
may mistype a digit when entering the phone number, which results in incorrect data. Some
customers may not be willing to disclose their phone numbers, which results in incomplete
data. The data could also get changed due to human or system error. All these problems (noisy
and incomplete data) make data mining challenging.
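A cleaning step for the phone number example can be sketched as a small check that separates missing values from noisy ones. This is only a sketch under a stated assumption: a valid number is taken to contain exactly 10 digits, with spaces and hyphens ignored, and the `Quality` enum and `check_phone` function are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class Quality { Valid, Noisy, Missing };

// Classifies one phone number field.
// Assumption: a valid number has exactly 10 digits; spaces and
// hyphens are allowed as separators and are ignored.
Quality check_phone(const std::string& phone) {
    if (phone.empty()) return Quality::Missing;  // incomplete data
    int digits = 0;
    for (unsigned char ch : phone) {
        if (std::isdigit(ch)) {
            ++digits;
        } else if (ch != ' ' && ch != '-') {
            return Quality::Noisy;               // unexpected character
        }
    }
    // A wrong digit count means a digit was dropped or added.
    return digits == 10 ? Quality::Valid : Quality::Noisy;
}
```

A check like this catches dropped digits and stray characters, but it cannot detect a digit that was replaced by another digit, which is one reason noisy data remains hard to eliminate.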
Complex Data:
Real-world data is heterogeneous, and it could be multimedia data, including audio and video,
images, complex data, spatial data, time series, and so on. Managing these various types of data
and extracting useful information is a tough task. Most of the time, new technologies, new tools,
and methodologies would have to be refined to obtain specific information.
Performance:
The data mining system's performance relies primarily on the efficiency of algorithms and
techniques used. If the designed algorithm and techniques are not up to the mark, then the
efficiency of the data mining process will be affected adversely.
Data Privacy and Security:
Data mining can lead to serious issues in terms of data security, governance, and privacy.
For example, if a retailer analyzes the details of the purchased items, then it reveals data about
the buying habits and preferences of the customers without their permission.
Data Visualization:
In data mining, data visualization is a very important process because it is the primary method
of showing the output to the user in a presentable way. The extracted data should convey the
exact meaning of what it intends to express. But many times, representing the information to the
end-user in a precise and easy way is difficult. Because both the input data and the output
information are complicated, efficient and effective data visualization processes need to be
implemented for the results to be useful.