DB Unit-3

The document discusses the classification of data into structured, semi-structured, and unstructured types, detailing their characteristics and storage methods. It also covers XML data models, including DTD and XML Schema, along with querying languages like XPath and XQuery. Additionally, it explains the differences between XML-enabled and native XML databases, and the benefits of using XQuery for data retrieval.

Uploaded by

Manoj D


UNIT-3

Structured Data Vs Unstructured Data Vs Semi-Structured Data

We can classify data as structured data, semi-structured
data, or unstructured data.
Structured data resides in predefined formats and
models.
Unstructured data is stored in its natural format until it’s
extracted for analysis.
Semi-structured data is a mix of structured and
unstructured data.
What Is Data?
Data is a set of facts such as descriptions, observations,
and numbers used in decision making.
We can classify data as structured, unstructured, or
semi-structured data.
1) Structured Data
➢ Structured data is generally tabular data that is
represented by columns and rows in a database.
➢ Databases that hold tables in this form are called
relational databases.
➢ In structured data, every row in a table has the same set of
columns.
➢ SQL (Structured Query Language) is the language used to
query structured data.
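The points above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table name, column names, and the second row are hypothetical and are not taken from these notes.

```python
import sqlite3

# In-memory relational database; the "student" table and its columns
# are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, firstname TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student (firstname, city) VALUES (?, ?)",
    [("Tamanna", "Ghaziabad"), ("Vimal", "Noida")],
)

# Every row returned by the query has the same set of columns.
rows = conn.execute(
    "SELECT id, firstname, city FROM student ORDER BY id"
).fetchall()
for row in rows:
    print(row)

conn.close()
```

Each tuple printed has exactly three fields, which is the "same set of columns" property described above.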
2) Semi-structured Data
➢ Semi-structured data is information that does not reside in
a relational database but still has some structure to it.
XML documents are a common example.
3) Unstructured Data
➢ Unstructured data is information that either is not
organized in a pre-defined manner or does not have a
pre-defined data model.
➢ Videos, audio, and binary data files might not have a
specific structure.
Characteristics Of Structured (Relational) and Unstructured
(Non-Relational) Data
Relational Data
➢ Relational databases provide undoubtedly the most
well-understood model for holding data.
➢ We can communicate with relational databases using
Structured Query Language (SQL).
➢ SQL allows the joining of tables
➢ Examples of relational databases: MySQL, PostgreSQL.
Non-Relational Data
➢ Non-relational databases permit us to store data in a
format that more closely meets the original structure.
➢ A non-relational database is a database that does not
use the tabular schema of columns and rows
➢ In a non-relational database the data may be stored as
JSON documents, as simple key/value pairs, or as a
graph consisting of edges and vertices.
➢ Examples of non-relational databases: Redis, JanusGraph,
MongoDB.
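The three storage shapes mentioned above (JSON document, key/value pair, graph of edges and vertices) can be sketched with plain Python structures. This is an illustration only: it does not use any real database client, and the keys and sample values are assumptions.

```python
import json

# Document shape: one self-contained JSON object with named fields.
document = json.loads('{"name": "Manoj", "company": "SKP", "phones": ["123"]}')

# Key/value shape: a dict behaves like a hash table from key to value.
kv_store = {}
kv_store["session:42"] = "Manoj"  # hypothetical key

# Graph shape: vertices and edges, kept here as an adjacency list.
edges = [("Manoj", "SKP"), ("Vimal", "SSSIT.org")]
graph = {}
for src, dst in edges:
    graph.setdefault(src, []).append(dst)
    graph.setdefault(dst, [])  # make sure every vertex has an entry

print(document["phones"])
print(kv_store["session:42"])
print(graph)
```

None of these shapes forces every record into the same set of columns, which is the main contrast with the tabular schema.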
Document Data Stores
A document data store manages a set of named string fields
and object data values in an entity referred to as a
document.
Columnar Data Stores
A columnar or column-family data store organizes data
into rows and columns. The columns are divided into groups
known as column families.
Key/Value Data Stores
A key/value store is essentially a large hash table.
Graph Data Stores
A graph data store manages two types of information:
nodes and edges.
Structured                    Unstructured
Data in rows and columns      Not in rows and columns
Numbers, dates, strings       Images, audio, video
Less storage                  More storage
Easy to manage and protect    Difficult to manage and protect
XML Hierarchical (Tree) Data Model
An XML document has a self-descriptive structure. It
forms a tree structure, which is referred to as an XML tree.
A tree structure contains a root element (as parent), child
elements, and so on.
Starting from the root, it is easy to traverse all succeeding
branches, sub-branches, and leaf nodes.
Example of an XML document
<?xml version="1.0"?>
<college>
<student>
<firstname>Tamanna</firstname>
<lastname>Bhatia</lastname>
<contact>09990449935</contact>
<email>[email protected]</email>
<address>
<city>Ghaziabad</city>
<state>Uttar Pradesh</state>
<pin>201007</pin>
</address>
</student>
</college>
XML Tree Rules
These rules are used to figure out the relationship of the
elements.
Descendants: If element A is contained by element B, then A
is known as a descendant of B. In the above example, "college"
is the root element and all the other elements are
descendants of "college".
Ancestors: An element that contains other elements is called
an "ancestor" of those elements. In the above example, the
root element (college) is the ancestor of all other elements.
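The descendant and ancestor rules can be checked with a short sketch using Python's xml.etree.ElementTree. It reuses a shortened copy of the college document (the contact and email fields are left out), and the helper name `ancestors` is hypothetical.

```python
import xml.etree.ElementTree as ET

xml_text = """<college>
  <student>
    <firstname>Tamanna</firstname>
    <lastname>Bhatia</lastname>
    <address>
      <city>Ghaziabad</city>
      <state>Uttar Pradesh</state>
      <pin>201007</pin>
    </address>
  </student>
</college>"""

root = ET.fromstring(xml_text)

# Descendants of the root: every element below it, in document order.
descendants = [el.tag for el in root.iter() if el is not root]

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(element):
    """Return the tags of all ancestors, nearest first."""
    chain = []
    while element in parent_of:
        element = parent_of[element]
        chain.append(element.tag)
    return chain

city = root.find("./student/address/city")
print(descendants)
print(ancestors(city))
```

The root element appears last in every ancestor chain, matching the rule that the root is an ancestor of all other elements.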

Elements in XML Tree Model:
Complex Element: It is constructed from other elements
hierarchically.
Simple Element: It contains data values.
Three main types of XML documents can be characterized:
Data-centric XML documents: These documents have
many small data items that follow a specific structure.
Document-centric XML documents: These are
documents with large amounts of text, such as news articles
or books.
Hybrid XML documents: These documents may have
parts that contain structured data and other parts that are
unstructured.

XML DTD:
The XML Document Type Definition, commonly known
as DTD, is a way to describe the structure of an XML
document. DTDs are used to check the vocabulary and
validity of XML documents.
An XML DTD can either be specified inside the
document, or be kept in a separate file and referenced
from the document.
Syntax
Basic syntax of a DTD is as follows −
<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
• The DTD starts with <!DOCTYPE delimiter.
• An element tells the parser to parse the document from
the specified root element.
• DTD identifier is an identifier for the document type
definition, which may be the path to a file on the system
or URL.
• The square brackets [ ] enclose an optional list of entity
declarations called Internal Subset.

Internal DTD

A DTD is referred to as an internal DTD if its elements are
declared within the XML file itself.
For an internal DTD, the standalone attribute in the XML
declaration is set to yes, indicating that the document does
not depend on external declarations.

Syntax

<!DOCTYPE root-element [element-declarations]>

Example:
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>

<address>
<name>Manoj</name>
<company>SKP</company>
<phone>123</phone>
</address>

Start Declaration:
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
DTD:
<!DOCTYPE address [
DTD Body:
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
End Declaration:
]>
Rules
• The document type declaration must appear at the start
of the document.
• Similar to the DOCTYPE declaration, the element
declarations must start with an exclamation mark.
• The Name in the document type declaration must match
the element type of the root element.
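The last rule can be checked with Python's xml.dom.minidom, as in the minimal sketch below. The standard-library parser reads the DOCTYPE but does not validate the document against the DTD, so this only compares the two names; the function name `doctype_matches_root` is hypothetical.

```python
from xml.dom import minidom

xml_text = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Manoj</name>
<company>SKP</company>
<phone>123</phone>
</address>"""

def doctype_matches_root(text):
    """True when the DOCTYPE name equals the root element's tag name."""
    doc = minidom.parseString(text)
    if doc.doctype is None:
        return False  # no document type declaration at all
    return doc.doctype.name == doc.documentElement.tagName

print(doctype_matches_root(xml_text))

# A document whose DOCTYPE names a different element breaks the rule.
mismatched = xml_text.replace("<!DOCTYPE address", "<!DOCTYPE contact")
print(doctype_matches_root(mismatched))
```

Full DTD validation (content models, required children) needs a validating parser, which is outside the standard library.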

External DTD

In an external DTD, elements are declared outside the XML
file. They are accessed through a system identifier, which
may be either the path to a .dtd file or a valid URL.
For an external DTD, the standalone attribute in the
XML declaration is set to no, indicating that the document
depends on declarations from an external source.

Syntax
<!DOCTYPE root-element SYSTEM "file-name">
Example
<?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>Manoj</name>
<company>SKP</company>
<phone>123</phone>
</address>
The content of the DTD file address.dtd is as shown −
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Types
You can refer to an external DTD by using either system
identifiers or public identifiers.
System Identifiers
A system identifier enables you to specify the location of
an external file containing DTD declarations.
Syntax
<!DOCTYPE name SYSTEM "address.dtd" [...]>
Public Identifiers
<!DOCTYPE name PUBLIC "-//Beginning XML//DTD Address Example//EN">
XML SCHEMA:
XML Schema is commonly known as XML Schema
Definition (XSD). It is used to describe and validate the
structure and the content of XML data.
The schema element supports namespaces.
Checking Validation
An XML document is called "well-formed" if it follows
the correct XML syntax. A well-formed XML document that has
also been validated against a schema is called "valid".

XML Schema Example [employee.xsd]

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.javatpoint.com"
xmlns="http://www.javatpoint.com"
elementFormDefault="qualified">
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

Let's see the xml file using XML schema or XSD file.
employee.xml
<?xml version="1.0"?>
<employee
xmlns="http://www.javatpoint.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.javatpoint.com employee.xsd">
<firstname>vimal</firstname>
<lastname>jaiswal</lastname>
<email>[email protected]</email>
</employee>
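Python's standard library has no XSD validator, so the sketch below is not real schema validation. It only reads the schema as ordinary XML with xml.etree.ElementTree, collects the element names declared in the sequence, and compares them with the children of the instance document. The email value is a placeholder, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"   # XML Schema namespace
TARGET_NS = "http://www.javatpoint.com"     # targetNamespace of the schema

xsd_text = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.javatpoint.com"
           xmlns="http://www.javatpoint.com"
           elementFormDefault="qualified">
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

xml_text = """<?xml version="1.0"?>
<employee xmlns="http://www.javatpoint.com">
  <firstname>vimal</firstname>
  <lastname>jaiswal</lastname>
  <email>vimal@example.com</email>
</employee>"""

# Element names the schema declares, in sequence order.
schema = ET.fromstring(xsd_text)
employee_decl = schema.find(f"{XS}element[@name='employee']")
expected = [
    el.get("name")
    for el in employee_decl.findall(f"{XS}complexType/{XS}sequence/{XS}element")
]

# Child element names found in the instance, with the namespace stripped.
doc = ET.fromstring(xml_text)
actual = [child.tag.replace("{" + TARGET_NS + "}", "") for child in doc]

print(expected)
print(actual)
print(expected == actual)
```

A dedicated validator would additionally check data types, occurrence counts, and namespaces on every element.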
Description of XML Schema
<xs:element name="employee"> : It defines the element
name employee.
<xs:complexType> : It defines that the element 'employee' is
complex type.
<xs:sequence> : It defines that the complex type is a
sequence of elements.
<xs:element name="firstname" type="xs:string"/> : It
defines that the element 'firstname' is of string/text type.
<xs:element name="lastname" type="xs:string"/> : It
defines that the element 'lastname' is of string/text type.
<xs:element name="email" type="xs:string"/> : It defines
that the element 'email' is of string/text type.
XML Schema Data types
There are two types of data types in XML schema.
1. simpleType
2. complexType
simple type
The simple type applies to elements that contain only text.
A simple-type element cannot have attributes or child
elements.
Complex type
The complex type allows an element to hold attributes
and child elements. It can contain additional sub-elements and
can be left empty.
XML DATABASE
An XML database is a data persistence software system
used for storing large amounts of information in XML
format.
You can query the stored data by using XQuery.
Types of XML databases
There are two types of XML databases.
1. XML-enabled database
2. Native XML database (NXD)

XML-enabled Database
An XML-enabled database works just like a relational
database. In this database, data is stored in tables, in the form
of rows and columns.
Native XML Database
A native XML database is used to store large amounts of
data. Instead of a table format, a native XML database is based
on a container format.
You can query the data by using XPath expressions.
Example of XML database:
<?xml version="1.0"?>
<contact-info>
<contact1>
<name>Vimal Jaiswal</name>
<company>SSSIT.org</company>
<phone>(0120) 4256464</phone>
</contact1>
<contact2>
<name>Mahesh Sharma </name>
<company>SSSIT.org</company>
<phone>09990449935</phone>
</contact2>
</contact-info>
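Querying this document with path expressions can be sketched using Python's xml.etree.ElementTree, which supports a limited subset of XPath. This stands in for a native XML database only for illustration; no database product is involved.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<contact-info>
<contact1>
<name>Vimal Jaiswal</name>
<company>SSSIT.org</company>
<phone>(0120) 4256464</phone>
</contact1>
<contact2>
<name>Mahesh Sharma</name>
<company>SSSIT.org</company>
<phone>09990449935</phone>
</contact2>
</contact-info>"""

root = ET.fromstring(xml_text)

# All <name> elements, wherever they sit below the root.
names = [el.text for el in root.findall(".//name")]

# One specific value, reached step by step from the root.
phone = root.findtext("./contact2/phone")

print(names)
print(phone)
```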

X-PATH:
XPath defines a pattern or path expression to select
nodes or node sets in an XML document. These patterns are
also used by XSLT to perform transformations.
XPath specifies seven types of nodes that can be returned
by evaluating an XPath expression.
o Root
o Element
o Text
o Attribute
o Comment
o Processing Instruction
o Namespace
Syntax:
o //tagname[@attribute = 'value']

XPath Expressions:

Symbol      Description
//          Selects nodes in the document from the current node that
            match the selection, no matter where they are
/           Selects from the root node
tagname     Tag name of the current node
@           Selects the attribute
attribute   Attribute name of the node
value       Value of the attribute

Example:
//input[@id = 'fakebox-input']
In this example, we are locating the 'input' element whose
'id' is equal to 'fakebox-input'.
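The same attribute predicate can be tried with Python's xml.etree.ElementTree, which accepts it in the form `.//input[@id='fakebox-input']` (the leading dot makes the search relative to the parsed root). The page markup below is a hypothetical, well-formed stand-in for a real web page.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; real HTML is often not well-formed XML.
page = """<html>
  <body>
    <form>
      <input id="fakebox-input" type="text"/>
      <input id="other" type="submit"/>
    </form>
  </body>
</html>"""

root = ET.fromstring(page)

# Select every <input> whose id attribute equals 'fakebox-input'.
matches = root.findall(".//input[@id='fakebox-input']")

print(len(matches))
print(matches[0].get("type"))
```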
Types of XPath:
1. Absolute XPath
2. Relative XPath
Absolute XPath:
Absolute XPath starts from the root element of the HTML/XML
document and lists every element needed
to reach the desired element. It starts with a single forward
slash '/'.
Relative XPath:
Relative XPath begins with the double forward
slash '//', which means it can search for the element anywhere in
the document.
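The difference between the two styles can be sketched with Python's xml.etree.ElementTree. One assumption to note: ElementTree evaluates paths relative to the element they are called on, so the absolute path `/html/body/form/input` is written here as `./body/form/input` from the `html` root, and `//input` as `.//input`. The page markup is hypothetical.

```python
import xml.etree.ElementTree as ET

page = """<html>
  <body>
    <form>
      <input id="fakebox-input" type="text"/>
    </form>
  </body>
</html>"""

root = ET.fromstring(page)  # root is the <html> element

# Absolute style: every step from the root is spelled out.
absolute = root.find("./body/form/input")

# Relative style: search for the element anywhere below the root.
relative = root.find(".//input")

print(absolute is relative)
```

The absolute form breaks as soon as an intermediate element is added or renamed, while the relative form keeps working as long as the target element exists.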

XQuery
XQuery is a functional language that is used to retrieve
information stored in XML format.
XQuery can be used on XML documents and on relational
databases containing data in XML format.
Characteristics
• Functional Language − XQuery is a language for
retrieving and querying XML-based data.
• Analogous to SQL − XQuery is to XML what SQL is to
relational databases.
• XPath based − XQuery uses XPath expressions to
navigate through XML documents.
• Universally accepted − XQuery is supported by many major
databases.
• W3C Standard − XQuery is a W3C standard.
Benefits of XQuery
• Using XQuery, both hierarchical and tabular data can be
retrieved.
• XQuery can be directly used to build webpages.
• XQuery can be used to transform xml documents.
