Dbms Unit-6

The document discusses different methods of file organization including sequential, heap, hashing, B+ tree, and clustered file organization. Sequential file organization stores records sequentially. Heap file organization stores records without order. Hashing maps keys to addresses via a hash function. B+ trees index records similarly to binary search trees. Clustering stores related records together.

Uploaded by

Parth Jadhav

UNIT-6

File Organization

• File – A file is a named collection of related information recorded on secondary storage such as magnetic disks, magnetic tapes and optical disks.
• File Organization refers to the logical relationships among the records that constitute a file, particularly with respect to how a specific record is identified and accessed. In simple terms, storing the records of a file in a certain order is called file organization.
• Types of File Organizations –
• Sequential File Organization
• Heap File Organization
• Hash File Organization
• B+ Tree File Organization
• Clustered File Organization
Sequential File Organization –
• The simplest method of file organization is the sequential method. Records are stored one after another in sequence. There are two ways to implement it:
• Pile File Method – Records are stored one after another in the order in which they are inserted into the table.
Insertion of new record – a new record is appended at the end of the file.
• Sorted File Method – As the name suggests, a new record is always inserted so that the file stays sorted (ascending or descending). Sorting may be based on the primary key or any other key.
Insertion of new record – a new record is placed at the position its key dictates, so the file remains sorted.
Pros –
•Fast and efficient for handling large amounts of data.
•Simple design.
•Files can easily be stored on magnetic tape, i.e. a cheaper storage mechanism.

Cons –
•Time is wasted because we cannot jump directly to a required record; we must move through the file sequentially.
•The sorted file method is inefficient because sorting the records takes extra time and space.
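The two insertion methods above can be sketched as follows. This is a minimal in-memory illustration, not a storage engine: records are represented only by their integer keys, and the function names are made up for this example.

```python
import bisect

def pile_insert(file_records, record):
    """Pile file: append the record at the end, in arrival order."""
    file_records.append(record)

def sorted_insert(file_records, record):
    """Sorted file: place the record so the file stays in ascending key order."""
    bisect.insort(file_records, record)

def sequential_search(file_records, key):
    """Scan from the start; return (position, records_examined)."""
    for position, record in enumerate(file_records):
        if record == key:
            return position, position + 1
    return None, len(file_records)

pile = []
ordered = []
for r in [1, 5, 6, 4, 3]:
    pile_insert(pile, r)
    sorted_insert(ordered, r)

pile_insert(pile, 2)        # the new record goes to the end
sorted_insert(ordered, 2)   # the new record goes to its sorted position

print(pile)      # [1, 5, 6, 4, 3, 2]
print(ordered)   # [1, 2, 3, 4, 5, 6]
```

In the pile file, finding record 2 means examining all six records, which is the "time wastage" listed under the cons.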
Heap File Organization –

• Heap File Organization works with data blocks. In this method records are inserted at the end of the file, into the data blocks. No sorting or ordering is required. If a data block is full, the new record is stored in some other block. The other data block need not be the very next data block; it can be any block in memory. It is the responsibility of the DBMS to store and manage the new records.
Insertion of new record –
Suppose the heap holds five records R1, R5, R6, R4 and R3, and a new record R2 has to be inserted. Since the last data block, data block 3, is full, R2 is inserted into any data block selected by the DBMS, say data block 1.

If we want to search, delete or update data in a heap file organization, we traverse the data from the beginning of the file until we reach the requested record. Thus, if the database is very large, searching, deleting or updating a record takes a lot of time.
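A minimal sketch of heap-file insertion and search, under stated assumptions: each block holds at most two records, the block layout is invented to match the R2 example, and "first block with free space" stands in for whatever placement policy a real DBMS uses.

```python
BLOCK_CAPACITY = 2  # assumed number of records per data block

def heap_insert(blocks, record):
    """Put the record in any block with free space; add a block if all are full."""
    for block_no, block in enumerate(blocks):
        if len(block) < BLOCK_CAPACITY:
            block.append(record)
            return block_no
    blocks.append([record])
    return len(blocks) - 1

def heap_search(blocks, record):
    """Scan blocks from the beginning; return (block_no, blocks_read)."""
    for block_no, block in enumerate(blocks):
        if record in block:
            return block_no, block_no + 1
    return None, len(blocks)

# Data block 1 has a free slot; data blocks 2 and 3 are full
# (list indexes 0, 1 and 2).
blocks = [["R1"], ["R5", "R6"], ["R4", "R3"]]
chosen = heap_insert(blocks, "R2")

print(chosen)                     # 0, i.e. data block 1
print(heap_search(blocks, "R3"))  # (2, 3): all three blocks were read
```

The search cost grows with the number of blocks, which is why the method is described as inefficient for large databases.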
Pros –
•Fetching and retrieving records is faster than in a sequential file, but only for small databases.
•When a large amount of data needs to be loaded into the database at one time, this method of file organization is best suited.
Cons –
•Problem of unused memory blocks.
•Inefficient for larger databases.
Hash File Organization –
Hashing is an efficient technique for directly locating the desired data on disk without using an index structure. Data is stored in the data blocks whose addresses are generated by a hash function. The memory location where these records are stored is called a data block or data bucket.
Static Hashing:
In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if we generate an address for STUDENT_ID = 104 using a mod 5 hash function, it always results in the same bucket address, 4. The bucket address never changes. Hence the number of data buckets in memory remains constant throughout.
Operations:
•Insertion – When a new record is inserted into the table, the hash function h generates a bucket address for the new record based on its hash key K. Bucket address = h(K)
•Searching – When a record needs to be searched, the same hash function is used to retrieve the bucket address for the record. For example, to retrieve the whole record for ID 104 when the hash function is mod 5 on that ID, the bucket address generated is 4. We then go directly to address 4 and retrieve the whole record for ID 104. Here ID acts as the hash key.
•Deletion – To delete a record, we first use the hash function to fetch the record that is to be deleted, and then remove the record at that address in memory.
•Updation – The data record that needs to be updated is first searched using the hash function, and then the data record is updated.
Suppose we want to insert a new record into the file, but the data bucket address generated by the hash function is not empty, i.e. data already exists at that address. This situation in static hashing is called bucket overflow. How do we insert data in this case? Several methods are available to handle it. Some commonly used methods are discussed below:
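The four operations can be sketched with a fixed array of buckets and the mod 5 hash function from the example. This is an illustrative sketch only: each bucket is modelled as an unbounded Python dict, so bucket overflow is not represented here, and the record contents are hypothetical.

```python
NUM_BUCKETS = 5  # fixed for the life of the file in static hashing

def h(key):
    """Hash function from the text: key mod 5."""
    return key % NUM_BUCKETS

buckets = [dict() for _ in range(NUM_BUCKETS)]

def insert(key, record):
    buckets[h(key)][key] = record

def search(key):
    return buckets[h(key)].get(key)

def update(key, record):
    if key not in buckets[h(key)]:
        raise KeyError(key)
    buckets[h(key)][key] = record

def delete(key):
    return buckets[h(key)].pop(key, None)

insert(104, {"name": "Asha"})         # hypothetical student record
print(h(104))                          # 4
print(search(104))                     # {'name': 'Asha'}
update(104, {"name": "Asha R."})
delete(104)
print(search(104))                     # None
```

Every operation starts with the same call to `h`, which is why the bucket address for a given key never changes.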

1.Open Hashing – In open hashing, the next available data bucket is used for the new record instead of overwriting the older one. This method is also called linear probing. For example, D3 is a new record to be inserted and the hash function generates the address 105, but that bucket is already full. The system therefore searches for the next available data bucket, 123, and assigns D3 to it.
2.Closed Hashing – In closed hashing, a new data bucket is allocated with the same address and linked after the full data bucket. This method is also known as overflow chaining. For example, we have to insert a new record D3 into the table. The static hash function generates the data bucket address 105, but this bucket is too full to store the new data. In this case a new data bucket is added at the end of data bucket 105 and linked to it. The new record D3 is then inserted into the new bucket.

•Quadratic probing: Quadratic probing is very similar to open hashing or linear probing. The only difference is that, whereas in linear probing the gap between the old and the new bucket grows linearly, here a quadratic function is used to determine the new bucket address.
•Double Hashing: Double hashing is another method similar to linear probing. The difference between successive buckets is fixed, as in linear probing, but this fixed difference is calculated by a second hash function. Hence the name double hashing.
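A small sketch contrasting the first two overflow methods, using the terminology of the text (open hashing = linear probing, closed hashing = overflow chaining). The assumptions are loud ones: five buckets, a mod 5 hash function, one record per bucket, and keys chosen so that all three collide.

```python
NUM_BUCKETS = 5

def h(key):
    return key % NUM_BUCKETS

def insert_linear_probing(slots, key):
    """Try the home bucket, then the following buckets in order."""
    home = h(key)
    for step in range(NUM_BUCKETS):
        slot = (home + step) % NUM_BUCKETS
        if slots[slot] is None:
            slots[slot] = key
            return slot
    raise RuntimeError("every bucket is full")

def insert_overflow_chaining(chains, key):
    """Link the record behind its home bucket instead of moving it."""
    chains[h(key)].append(key)
    return h(key)

slots = [None] * NUM_BUCKETS
placed = [insert_linear_probing(slots, k) for k in (104, 109, 114)]

chains = [[] for _ in range(NUM_BUCKETS)]
for k in (104, 109, 114):
    insert_overflow_chaining(chains, k)

print(placed)     # [4, 0, 1]: the colliding keys spill into other buckets
print(chains[4])  # [104, 109, 114]: all stay chained at bucket 4
```

With probing, a later search for 114 has to follow the same probe path; with chaining, it only has to walk the chain at bucket 4.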
Dynamic Hashing –
The drawback of static hashing is that it does not expand or shrink as the database grows or shrinks. In dynamic hashing, data buckets are added or removed dynamically as the number of records increases or decreases. Dynamic hashing is also known as extendible hashing. In dynamic hashing, the hash function is made to produce a large number of values, of which only a part is used at any time. For example, suppose there are three data records D1, D2 and D3, and the hash function generates the addresses 1001, 0101 and 1010 respectively. This method considers only part of each address, initially just the first bit, so it tries to load the three records at addresses 0 and 1.

The problem is that no bucket address remains for D3: D2 occupies address 0 and D1 occupies address 1. The buckets have to grow dynamically to accommodate D3, so the address is changed to use 2 bits rather than 1 bit, the existing records are updated to 2-bit addresses, and the method then tries to accommodate D3.
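The prefix idea can be sketched as below, using the three addresses from the example. The helper names and the bucket capacity parameter are assumptions for illustration; a real extendible-hashing scheme also maintains a directory and splits buckets individually, which this sketch leaves out.

```python
from collections import defaultdict

def directory(addresses, bits):
    """Group record names by the first `bits` bits of their hash address."""
    groups = defaultdict(list)
    for name, address in addresses.items():
        groups[address[:bits]].append(name)
    return dict(groups)

def bits_needed(addresses, capacity):
    """Smallest prefix length at which no bucket exceeds `capacity` records."""
    width = len(next(iter(addresses.values())))
    for bits in range(1, width + 1):
        groups = directory(addresses, bits)
        if all(len(names) <= capacity for names in groups.values()):
            return bits
    return None

addresses = {"D1": "1001", "D2": "0101", "D3": "1010"}

print(directory(addresses, 1))   # {'1': ['D1', 'D3'], '0': ['D2']}
print(directory(addresses, 2))   # {'10': ['D1', 'D3'], '01': ['D2']}
print(bits_needed(addresses, 1)) # 3
```

With these particular addresses, D1 and D3 still share the two-bit prefix 10, so if each bucket holds a single record a third bit is needed before they separate.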
B+ Tree File Organization –
B+ Tree, as the name suggests, uses a tree-like structure to store records in a file. It uses the concept of key indexing, where the primary key is used to sort the records. For each primary key, an index value is generated and mapped to the record. The index of a record is the address of the record in the file.
A B+ tree is similar to a binary search tree, except that a node can have more than two children. All the information is stored in the leaf nodes, and the intermediate nodes act as pointers to the leaf nodes. The leaf nodes always form a sorted sequential linked list.

In the above diagram 56 is the root node, which is also called the main node of the tree.
The intermediate nodes hold only the addresses of leaf nodes; they do not contain any actual record. The leaf nodes hold the actual records. The tree is balanced: all leaf nodes are at the same level.
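The division of labour between intermediate and leaf nodes can be sketched with a fixed two-level tree. This is a read-only illustration with made-up keys (56 is borrowed from the text); it does not implement node splitting or merging, which a real B+ tree needs for inserts and deletes.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted record keys held in this leaf
        self.next = None   # link to the next leaf in key order

def build(leaf_keys):
    """Build one root of separator keys over a linked list of leaves."""
    leaves = [Leaf(keys) for keys in leaf_keys]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # Each separator is the smallest key of the leaf to its right.
    separators = [leaf.keys[0] for leaf in leaves[1:]]
    return separators, leaves

def search(separators, leaves, key):
    """Route through the root, then look inside a single leaf."""
    leaf = leaves[bisect.bisect_right(separators, key)]
    return key in leaf.keys

def range_scan(separators, leaves, low, high):
    """Find the leaf for `low`, then follow leaf links until `high` is passed."""
    leaf = leaves[bisect.bisect_right(separators, low)]
    found = []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return found
            if key >= low:
                found.append(key)
        leaf = leaf.next
    return found

separators, leaves = build([[12, 30], [56, 60], [75, 90]])
print(separators)                                # [56, 75]
print(search(separators, leaves, 60))            # True
print(range_scan(separators, leaves, 30, 75))    # [30, 56, 60, 75]
```

The range scan never returns to the root: once the first leaf is found, the linked list of leaves supplies the remaining keys in order.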
Pros –
•Tree traversal is easier and faster.
•Searching becomes easy because all records are stored only in the leaf nodes, which form a sorted sequential linked list.
•There is no restriction on B+ tree size. It may grow or shrink as the size of the data increases or decreases.
Cons –
•Inefficient for static tables.
Clustered File Organization –
In clustered file organization, two or more related tables/records are stored within the same file, known as a cluster. These files have two or more tables in the same data block, and the key attributes used to map the tables together are stored only once.
This lowers the cost of searching and retrieving related records that would otherwise sit in different files, because they are now combined and kept in a single cluster.
For example, suppose we have two tables (relations), Employee and Department, that are related to each other. These tables can be combined using a join operation and stored together in a cluster file.

If we have to insert, update or delete any record, we can do so directly. Data is sorted based on the primary key or the key with which searching is done. The cluster key is the key with which the tables are joined.
Types of Cluster File Organization – There are two ways to implement this method:
1.Indexed Clusters –
In indexed clustering, the records are grouped based on the cluster key and stored together. The Employee and Department example above is an indexed cluster in which the records are grouped by Department ID.
2.Hash Clusters –
This is very similar to an indexed cluster, with the only difference that instead of storing the records based on the cluster key, we generate a hash key value and store together the records with the same hash key value.
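A minimal sketch of the Employee/Department example, with Department ID as the cluster key. All rows, the bucket count and the mod-based hash are hypothetical values chosen for illustration.

```python
from collections import defaultdict

departments = [(10, "Sales"), (20, "Research")]
employees = [(1, "Ravi", 20), (2, "Meena", 10), (3, "Kiran", 20)]

def build_clusters(departments, employees):
    """Store each department next to its employees, keeping the key once."""
    clusters = defaultdict(lambda: {"department": None, "employees": []})
    for dept_id, dept_name in departments:
        clusters[dept_id]["department"] = dept_name
    for emp_id, emp_name, dept_id in employees:
        clusters[dept_id]["employees"].append((emp_id, emp_name))
    return dict(clusters)

clusters = build_clusters(departments, employees)

# Indexed cluster: clusters are reached through an ordered list of keys.
index = sorted(clusters)

# Hash cluster: the cluster's location is computed from the key instead.
NUM_BUCKETS = 4
hash_buckets = defaultdict(list)
for dept_id in clusters:
    hash_buckets[dept_id % NUM_BUCKETS].append(dept_id)

print(clusters[20])
# {'department': 'Research', 'employees': [(1, 'Ravi'), (3, 'Kiran')]}
```

Reading cluster 20 returns the department and its employees together, which is the join result the text describes, without touching a second file.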
Hash function:-

hash = hashfunc(key)
index = hash % array_size

1. Linear Probing

h(k, i) = (h'(k) + i) mod m

for i = 0, 1, 2, . . ., m-1, where h' is the auxiliary hash function.

2. Quadratic Probing

h(k, i) = (h'(k) + c1*i + c2*i^2) mod m

where h' is the auxiliary hash function, c1 and c2 are positive auxiliary constants, and i = 0, 1, 2, . . ., m-1.

3. Double Hashing

h(k, i) = (h1(k) + i*h2(k)) mod m

where h1 and h2 are the auxiliary hash functions, for example:
h1(k) = k mod m
h2(k) = 1 + (k mod m')
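The three probe formulas can be compared side by side. In this sketch the auxiliary function h'(k) = k mod m, the constants c1 = 1 and c2 = 3, the table size m = 11, the key k = 25 and m' = m - 1 are all assumed values picked for illustration.

```python
def linear_probe(k, i, m):
    """h(k, i) = (h'(k) + i) mod m, with h'(k) = k mod m."""
    return (k % m + i) % m

def quadratic_probe(k, i, m, c1=1, c2=3):
    """h(k, i) = (h'(k) + c1*i + c2*i^2) mod m."""
    return (k % m + c1 * i + c2 * i * i) % m

def double_hash_probe(k, i, m, m_prime):
    """h(k, i) = (h1(k) + i*h2(k)) mod m."""
    h1 = k % m
    h2 = 1 + (k % m_prime)
    return (h1 + i * h2) % m

m, k = 11, 25
linear = [linear_probe(k, i, m) for i in range(4)]
quadratic = [quadratic_probe(k, i, m) for i in range(4)]
double = [double_hash_probe(k, i, m, m - 1) for i in range(4)]

print(linear)     # [3, 4, 5, 6]
print(quadratic)  # [3, 7, 6, 0]
print(double)     # [3, 9, 4, 10]
```

All three sequences start at the same home slot, 3; they differ only in how far each later probe moves from it.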
Q-1: Consider double hashing of the form
h(k, i) = (h1(k) + i*h2(k)) mod m
where h1(k) = k mod m
h2(k) = 1 + (k mod n)
with n = m - 1 and m = 701.
For k = 123456, what is the difference between the first and second probes in terms of slots?
(A) 255
(B) 256
(C) 257
(D) 258
Answer: (C)

Explanation: Given that

h(k, i) = (h1(k) + i*h2(k)) mod m
where h1(k) = k mod m,
h2(k) = 1 + (k mod n),
n = m - 1,
m = 701,
k = 123456.
Now,
h1(k) = 123456 mod 701 = 80
h2(k) = 1 + (123456 mod 700) = 1 + 256 = 257
Taking i = 1:
h(k, 1) = (h1(k) + h2(k)) mod m = (80 + 257) mod 701 = 337
Taking i = 2:
h(k, 2) = (h1(k) + 2*h2(k)) mod m = (80 + 514) mod 701 = 594
So, the difference between the two probes = 594 – 337 = 257
The same gap, h2(k) = 257, also separates the initial slot (i = 0, slot 80) from the next probe (slot 337).
=> Option C is the answer.
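The arithmetic in this answer can be checked with a few lines of Python. Only the values given in the question are used; the variable names are chosen for this sketch.

```python
m = 701
n = m - 1
k = 123456

h1 = k % m          # 80
h2 = 1 + (k % n)    # 257

def probe(i):
    """Slot examined on probe number i."""
    return (h1 + i * h2) % m

gap = (probe(2) - probe(1)) % m

print(h1, h2)                       # 80 257
print(probe(0), probe(1), probe(2)) # 80 337 594
print(gap)                          # 257
```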
What is Big Data?
Big Data is a collection of data that is huge in volume and keeps growing exponentially with time. Its size and complexity are so great that traditional data management tools cannot store or process it efficiently. In short, Big Data is still data, but of huge size.
What is an Example of Big Data?
The New York Stock Exchange is an example of Big Data: it generates about one terabyte of new trade data per day.
Social Media
Statistics show that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated through photo and video uploads, message exchanges, comments, etc.

A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, data generation reaches many petabytes.

Types Of Big Data

Following are the types of Big Data:
1.Structured
2.Unstructured
3.Semi-structured
Structured
Any data that can be stored, accessed and processed in a fixed format is termed ‘structured’ data. Over time, computer science has become increasingly successful at developing techniques for working with this kind of data (where the format is well known in advance) and deriving value from it. However, issues now arise when the size of such data grows to a huge extent, with typical sizes in the range of multiple zettabytes.
Do you know? 10^21 bytes equal 1 zettabyte; in other words, one billion terabytes form a zettabyte.
Looking at these figures, one can easily understand why the name Big Data is given and imagine the challenges involved in its storage and processing.
Do you know? Data stored in a relational database management system is one example of ‘structured’ data.
Unstructured
Any data whose form or structure is unknown is classified as unstructured data. In addition to being huge in size, unstructured data poses multiple challenges when it is processed to derive value from it. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc. Nowadays organizations have a wealth of data available to them, but unfortunately they don’t know how to derive value from it, since this data is in its raw form or unstructured format.
Examples Of Unstructured Data
The output returned by ‘Google Search’
Semi-structured
Semi-structured data can contain both forms of data. It appears structured in form, but it is not actually defined by, e.g., a table definition in a relational DBMS. An example of semi-structured data is data represented in an XML file.
Examples Of Semi-structured Data
Personal data stored in an XML file
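As an illustration of personal data in XML, the sketch below parses a small, entirely hypothetical file with Python's standard `xml.etree.ElementTree` module. The tag names and values are invented; the point is that tags describe each value, but nothing forces every record to carry the same fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: the second has no <sex>, the third adds <city>.
xml_text = """
<people>
  <rec><name>Asha Rao</name><sex>Female</sex><age>35</age></rec>
  <rec><name>Vikram Shah</name><age>41</age></rec>
  <rec><name>Leela Nair</name><sex>Female</sex><city>Pune</city></rec>
</people>
""".strip()

root = ET.fromstring(xml_text)
records = [{child.tag: child.text for child in rec} for rec in root.findall("rec")]
fields_per_record = [sorted(r) for r in records]

for fields in fields_per_record:
    print(fields)
# ['age', 'name', 'sex']
# ['age', 'name']
# ['city', 'name', 'sex']
```

A relational table would need a fixed set of columns for these three records; here each record simply carries the fields it has.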

Data Growth over the years

Characteristics Of Big Data
Big data can be described by the following characteristics:
•Volume
•Variety
•Velocity
•Variability
(i) Volume – The name Big Data itself is related to a size which is enormous. Size of data plays a very crucial role in
determining value out of data. Also, whether a particular data can actually be considered as a Big Data or not, is dependent
upon the volume of data. Hence, ‘Volume’ is one characteristic which needs to be considered while dealing with Big Data
solutions.
(ii) Variety – The next aspect of Big Data is its variety.
Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. During earlier days,
spreadsheets and databases were the only sources of data considered by most of the applications. Nowadays, data in the
form of emails, photos, videos, monitoring devices, PDFs, audio, etc. are also being considered in the analysis applications.
This variety of unstructured data poses certain issues for storage, mining and analyzing data.
(iii) Velocity – The term ‘velocity’ refers to the speed of generation of data. How fast the data is generated and processed
to meet the demands, determines real potential in the data.
Big Data Velocity deals with the speed at which data flows in from sources like business processes, application logs,
networks, and social media sites, sensors, Mobile devices, etc. The flow of data is massive and continuous.
(iv) Variability – This refers to the inconsistency which can be shown by the data at times, thus hampering the process of
being able to handle and manage the data effectively.
Advantages Of Big Data Processing
Ability to process Big Data in DBMS brings in multiple benefits, such as-
•Businesses can utilize outside intelligence while taking decisions
Access to social data from search engines and sites like Facebook and Twitter enables organizations to fine-tune their business strategies.
•Improved customer service
Traditional customer feedback systems are getting replaced by new systems designed with Big Data technologies. In these
new systems, Big Data and natural language processing technologies are being used to read and evaluate consumer
responses.
•Early identification of risk to the product/services, if any
•Better operational efficiency
Big Data technologies can be used for creating a staging area or landing zone for new data before identifying what data
should be moved to the data warehouse. In addition, such integration of Big Data technologies and data warehouse helps
an organization to offload infrequently accessed data.
Summary
•Big Data definition: Big Data is a term used to describe a collection of data that is huge in size and yet growing exponentially with time.
•Big Data analytics examples include stock exchanges, social media sites, jet engines, etc.
•Big Data can be 1) Structured, 2) Unstructured, 3) Semi-structured
•Volume, Variety, Velocity, and Variability are a few Big Data characteristics
•Improved customer service, better operational efficiency and better decision making are a few advantages of Big Data
What is NoSQL?
A NoSQL database is a non-relational data management system that does not require a fixed schema. It avoids joins and is easy to scale. NoSQL databases are mainly used for distributed data stores with very large storage needs, such as Big Data and real-time web apps. For example, companies like Twitter, Facebook and Google collect terabytes of user data every single day.
NoSQL stands for “Not Only SQL” or “Not SQL.” Though a better term would be “NoREL”, NoSQL caught on. Carlo Strozzi introduced the NoSQL concept in 1998.
A traditional RDBMS uses SQL syntax to store and retrieve data. A NoSQL database system, by contrast, encompasses a wide range of database technologies that can store structured, semi-structured, unstructured and polymorphic data.
Why NoSQL?
The concept of NoSQL databases became popular with Internet giants like Google, Facebook and Amazon, who deal with huge volumes of data. System response time becomes slow when an RDBMS is used for massive volumes of data.
To resolve this problem, we could “scale up” our systems by upgrading the existing hardware, but this is expensive.
The alternative is to distribute the database load across multiple hosts whenever the load increases. This method is known as “scaling out.”

NoSQL databases are non-relational and designed with web applications in mind, so they scale out better than relational databases.
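One simple way to picture scaling out is to assign each key to a host by hashing it. The sketch below is an assumption-laden illustration, not how any particular NoSQL product distributes data: the host names are hypothetical and the modulo scheme is the simplest possible choice.

```python
import zlib

HOSTS = ["host-a", "host-b", "host-c"]  # hypothetical host names

def host_for(key, hosts=HOSTS):
    """Map a key to a host with a hash that is stable across runs."""
    return hosts[zlib.crc32(key.encode("utf-8")) % len(hosts)]

placement = {}
for user_id in range(1000):
    key = f"user:{user_id}"
    placement.setdefault(host_for(key), []).append(key)

for host in HOSTS:
    print(host, len(placement.get(host, [])))
```

Adding a host changes `len(hosts)`, so most keys map to a different host under this scheme; techniques such as consistent hashing exist to reduce that movement.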
Features of NoSQL
Non-relational
•NoSQL databases never follow the relational model
•Never provide tables with flat fixed-column records
•Work with self-contained aggregates or BLOBs
•Do not require object-relational mapping and data normalization
•No complex features like query languages, query planners, referential integrity, joins, ACID
Schema-free
•NoSQL databases are either schema-free or have relaxed schemas
•Do not require any sort of definition of the schema of the data
•Offer heterogeneous structures of data in the same domain

Simple API
•Offer easy-to-use interfaces for storing and querying data
•APIs allow low-level data manipulation & selection methods
•Text-based protocols, mostly HTTP REST with JSON
•Mostly no standards-based NoSQL query language is used
•Web-enabled databases running as internet-facing services
Distributed
•Multiple NoSQL databases can be executed in a distributed fashion
•Offer auto-scaling and fail-over capabilities
•The ACID concept is often sacrificed for scalability and throughput
•Mostly no synchronous replication between distributed nodes; instead asynchronous multi-master replication, peer-to-peer, HDFS replication
•Often provide only eventual consistency
•Shared Nothing Architecture. This enables less coordination and higher distribution.

NoSQL is Shared Nothing.

Types of NoSQL Databases
NoSQL databases are mainly categorized into four types: Key-value pair, Column-oriented, Graph-based and Document-oriented. Every category has its unique attributes and limitations. No single type is best for solving all problems. Users should select the database based on their product needs.
Types of NoSQL Databases:
•Key-value Pair Based
•Column-oriented
•Graph-based
•Document-oriented
Key Value Pair Based
Data is stored in key/value pairs. This type is designed to handle large amounts of data and heavy load.
Key-value stores keep data as a hash table where each key is unique and the value can be JSON, a BLOB (Binary Large Object), a string, etc.
For example, a key-value pair may contain a key like “Website” associated with a value like “youtube”.

It is one of the most basic kinds of NoSQL database. It is used for collections, dictionaries, associative arrays, etc. Key-value stores help the developer store schema-less data. They work well for shopping cart contents.
Redis, Dynamo and Riak are some examples of key-value stores. Riak, in particular, is modelled on Amazon’s Dynamo paper.
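The key-value model can be pictured with an ordinary Python dictionary. This is only an in-memory analogy with made-up keys; it does not use the API of Redis, Dynamo or Riak.

```python
import json

store = {}

def put(key, value):
    """Store an opaque value under a unique key."""
    store[key] = value

def get(key, default=None):
    """Fetch the value for a key; the store does not look inside the value."""
    return store.get(key, default)

put("Website", "youtube")
# A shopping cart serialized as a JSON string (hypothetical key and contents).
put("cart:42", json.dumps({"items": ["pen", "notebook"], "total": 3}))

print(get("Website"))              # youtube
cart = json.loads(get("cart:42"))
print(cart["items"])               # ['pen', 'notebook']
```

The store can only be asked for a whole value by key; interpreting the cart's contents is left to the application.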
Column-based
Column-oriented databases work on columns and are based on the BigTable paper by Google. Every column is treated separately. The values of a single column are stored contiguously.

Column based NoSQL database

They deliver high performance on aggregation queries like SUM, COUNT, AVG, MIN etc. as the data is readily available in a column.
Column-based NoSQL databases are widely used to manage data warehouses, business intelligence, CRM and library card catalogs.
HBase, Cassandra and Hypertable are examples of column-based databases.
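The benefit for aggregation can be seen by laying the same hypothetical data out both ways. This is an in-memory analogy, not the storage format of HBase, Cassandra or Hypertable.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Pune", "amount": 120},
    {"id": 2, "city": "Delhi", "amount": 80},
    {"id": 3, "city": "Pune", "amount": 200},
]

# Column layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Aggregations touch only the one column they need.
total = sum(columns["amount"])
count = len(columns["amount"])
average = total / count
smallest = min(columns["amount"])

print(columns["amount"])       # [120, 80, 200]
print(total, count, smallest)  # 400 3 80
```

In the row layout the same SUM would have to read every field of every record just to reach `amount`.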
Document-Oriented:
A document-oriented NoSQL DB stores and retrieves data as a key-value pair, but the value part is stored as a document. The document is stored in JSON or XML format. The value is understood by the DB and can be queried.

Relational Vs. Document

In the diagram, the left side shows rows and columns, and the right side shows a document database with a structure similar to JSON. For the relational database, you have to know in advance what columns you have. For a document database, data is stored like a JSON object, and you do not need to define the structure beforehand, which makes it flexible.
The document type is mostly used for CMS systems, blogging platforms, real-time analytics & e-commerce applications. It should not be used for complex transactions that require multiple operations or queries against varying aggregate structures.
Amazon SimpleDB, CouchDB, MongoDB, Riak and Lotus Notes are popular document-oriented DBMS systems.
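What it means for the value to be "understood by the DB" can be sketched with a list of JSON-like documents and a query on a nested field. The documents and the `find` helper are hypothetical; this is not the query API of MongoDB or CouchDB.

```python
collection = [
    {"_id": 1, "title": "Intro to NoSQL", "tags": ["db", "nosql"],
     "author": {"name": "Asha"}},
    {"_id": 2, "title": "Indexing basics", "tags": ["db"]},
    {"_id": 3, "title": "Graph stores", "author": {"name": "Vikram"},
     "views": 10},
]

def find(collection, path, value):
    """Return documents whose field at dotted `path` equals `value`
    or, for list fields, contains it."""
    matches = []
    for doc in collection:
        node = doc
        for part in path.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if node == value or (isinstance(node, list) and value in node):
            matches.append(doc)
    return matches

print([d["_id"] for d in find(collection, "author.name", "Asha")])  # [1]
print([d["_id"] for d in find(collection, "tags", "db")])           # [1, 2]
```

The three documents have different fields, yet all sit in one collection and can be queried by their contents, unlike the opaque values of a key-value store.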
Graph-Based
A graph database stores entities as well as the relations among those entities. An entity is stored as a node and a relationship as an edge. An edge gives a relationship between nodes. Every node and edge has a unique identifier.
Compared to a relational database, where tables are loosely connected, a graph database is multi-relational in nature. Traversing relationships is fast because they are already captured in the DB and there is no need to calculate them.
Graph databases are mostly used for social networks, logistics and spatial data.
Neo4J, Infinite Graph, OrientDB and FlockDB are some popular graph-based databases.
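Nodes, edges and traversal can be sketched with plain Python structures. The people, edge identifiers and the FOLLOWS label are invented for this example, and the code is not the API of Neo4J or any other product named above.

```python
from collections import deque

nodes = {1: "Asha", 2: "Vikram", 3: "Leela", 4: "Kiran"}
edges = [  # (edge_id, from_node, relationship, to_node)
    ("e1", 1, "FOLLOWS", 2),
    ("e2", 2, "FOLLOWS", 3),
    ("e3", 1, "FOLLOWS", 4),
]

# Relationships are stored with the node, so traversal needs no join.
adjacency = {node_id: [] for node_id in nodes}
for edge_id, src, label, dst in edges:
    adjacency[src].append((label, dst))

def reachable(start, label):
    """Breadth-first traversal along edges with the given label."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        current = queue.popleft()
        for edge_label, neighbour in adjacency[current]:
            if edge_label == label and neighbour not in seen:
                seen.add(neighbour)
                order.append(nodes[neighbour])
                queue.append(neighbour)
    return order

print(reachable(1, "FOLLOWS"))  # ['Vikram', 'Kiran', 'Leela']
```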
Eventual Consistency
Eventual consistency arises when copies of data are kept on multiple machines to achieve high availability and scalability. Changes made to a data item on one machine have to be propagated to the other replicas.
Data replication may not be instantaneous: some copies are updated immediately, while others are updated in due course. The copies may be mutually inconsistent for a while, but in due course they become consistent. Hence the name eventual consistency.
BASE: Basically Available, Soft state, Eventual consistency
•Basically available means the DB is available all the time, as per the CAP theorem
•Soft state means the system state may change even without input
•Eventual consistency means that the system will become consistent over time
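A toy model of eventual consistency, assuming three replicas and a queue of undelivered updates. Real systems propagate updates over a network and must handle conflicts and failures, none of which is modelled here.

```python
class Replica:
    def __init__(self):
        self.data = {}

replicas = [Replica(), Replica(), Replica()]
pending = []  # (replica_index, key, value) updates not yet delivered

def write(key, value):
    """Apply the write on replica 0 at once; queue it for the others."""
    replicas[0].data[key] = value
    for index in range(1, len(replicas)):
        pending.append((index, key, value))

def propagate():
    """Deliver every queued update, in order."""
    while pending:
        index, key, value = pending.pop(0)
        replicas[index].data[key] = value

write("likes", 10)
before = [r.data.get("likes") for r in replicas]
propagate()
after = [r.data.get("likes") for r in replicas]

print(before)  # [10, None, None]: replicas are mutually inconsistent
print(after)   # [10, 10, 10]: replicas have converged
```

A read served by replica 1 or 2 before `propagate()` runs returns stale data, which is the price paid for not making the write wait.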
Advantages of NoSQL
•Can be used as Primary or Analytic Data Source
•Big Data Capability
•No Single Point of Failure
•Easy Replication
•No Need for Separate Caching Layer
•It provides fast performance and horizontal scalability.
•Can handle structured, semi-structured, and unstructured data with equal effect
•Object-oriented programming which is easy to use and flexible
•NoSQL databases don’t need a dedicated high-performance server
•Support Key Developer Languages and Platforms
•Simpler to implement than an RDBMS
•It can serve as the primary data source for online applications.
•Handles big data which manages data velocity, variety, volume, and complexity
•Excels at distributed database and multi-data center operations
•Eliminates the need for a specific caching layer to store data
•Offers a flexible schema design which can easily be altered without downtime or service disruption
Indexing in Databases
Indexing improves database performance by minimizing the number of disk accesses required to fulfil a query. It is a data structure technique used to locate and quickly access data in databases. Indexes are built on one or more database fields. The first column of an index is the search key, which holds a copy of the primary key or a candidate key of the table. To speed up data retrieval, the search-key values in the index are kept in sorted order; the data file itself need not be sorted.
Basic diagram of Indexing in DBMS
Sequential File Organization or Ordered Index File
• Dense Index
• For every search key value in the data file, there is an index record.
• This record contains the search key and also a reference to the first data record with that search key
value.

•Sparse Index
• The index record appears only for a few items in the data file. Each item points to a block as shown.
• To locate a record, we find the index record with the largest search key value less than or equal to the
search key value we are looking for.
• We start at that record pointed to by the index record, and proceed along with the pointers in the file (that
is, sequentially) until we find the desired record.
• Number of Accesses required=log₂(n)+1, (here n=number of blocks acquired by index file)
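Dense and sparse lookups can be sketched over a small sorted data file. The keys, the block size of three and the choice of one sparse entry per block are assumptions made for this illustration.

```python
import bisect

# Data file: sorted records grouped into blocks.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry per search-key value -> (block number, offset).
dense = {key: (b, o)
         for b, block in enumerate(blocks)
         for o, key in enumerate(block)}

# Sparse index: one entry per block, holding the block's first key.
sparse_keys = [block[0] for block in blocks]

def sparse_lookup(key):
    """Find the largest index key <= key, then scan that block in order."""
    position = bisect.bisect_right(sparse_keys, key) - 1
    if position < 0:
        return None
    for offset, candidate in enumerate(blocks[position]):
        if candidate == key:
            return position, offset
    return None

print(len(dense), len(sparse_keys))  # 9 3
print(dense[50])                      # (1, 1)
print(sparse_lookup(50))              # (1, 1)
```

The sparse index is a third of the size of the dense one here, at the cost of a short sequential scan inside the block. The binary search over the index entries corresponds to the log₂(n) term in the access count above, and the extra 1 is the read of the data block.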
Hash File Organization
Indices are based on the values being distributed uniformly across a range of buckets. The bucket to which a value is assigned is determined by a function called a hash function.
There are primarily three methods of indexing: clustered indexing, non-clustered (secondary) indexing and multilevel indexing.
Clustered Indexing
When two or more related records are stored in the same file, the arrangement is known as cluster indexing. Cluster indexing reduces the cost of searching because multiple records related to the same thing are stored in one place, and it supports frequent joins of two or more tables (records).
Primary Indexing
This is a type of clustered indexing in which the data is sorted according to the search key and the primary key of the database table is used to create the index. It is the default form of indexing and induces sequential file organization. Because primary keys are unique and stored in sorted order, searching is quite efficient.
Non-clustered or Secondary Indexing
A non-clustered index just tells us where the data lies, i.e. it gives us a list of virtual pointers or references to the location where the data is actually stored. Data is not physically stored in the order of the index. Instead, data is present in leaf nodes. An analogy is the contents page of a book: each entry gives the page number or location of the information. The actual data (the information on each page of the book) is not organized by the index, but we have an ordered reference (the contents page) to where the data actually lies. Only dense ordering is possible in a non-clustered index; sparse ordering is not possible because the data is not physically organized in index order.
Multilevel Indexing
As a database grows, its indices grow too. An index is ideally kept in main memory, but a single-level index may become too large to fit and then costs multiple disk accesses to search. Multilevel indexing splits the index into smaller blocks so that each piece can be stored in a single block. The outer blocks are divided into inner blocks, which in turn point to the data blocks. The outer level is small enough to be kept in main memory with little overhead.
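A two-level index can be sketched as an outer index over inner index blocks, which in turn point to data blocks. The keys and block sizes below are hypothetical and kept tiny so the routing is easy to follow.

```python
import bisect

blocks = [[10, 20], [30, 40], [50, 60], [70, 80]]   # data blocks
inner = [[10, 30], [50, 70]]       # inner index: first key of each data block
inner_targets = [[0, 1], [2, 3]]   # data block number for each inner entry
outer = [10, 50]                   # outer index: first key of each inner block

def lookup(key):
    """Outer index -> inner index block -> data block."""
    o = bisect.bisect_right(outer, key) - 1
    if o < 0:
        return None                # key is smaller than every indexed key
    i = bisect.bisect_right(inner[o], key) - 1
    block_no = inner_targets[o][i]
    return block_no if key in blocks[block_no] else None

print(lookup(60))  # 2
print(lookup(30))  # 1
print(lookup(45))  # None
```

Only the small outer index needs to stay in memory; each lookup then reads one inner index block and one data block.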
Advantages of Indexing
•Improved Query Performance: Indexing enables faster data retrieval from the database. The database may
rapidly discover rows that match a specific value or collection of values by generating an index on a column,
minimising the amount of time it takes to perform a query.
•Efficient Data Access: Indexing can enhance data access efficiency by lowering the amount of disk I/O required
to retrieve data. The database can maintain the data pages for frequently visited columns in memory by
generating an index on those columns, decreasing the requirement to read from disk.
•Optimized Data Sorting: Indexing can also improve the performance of sorting operations. By creating an index
on the columns used for sorting, the database can avoid sorting the entire table and instead sort only the relevant
rows.
•Consistent Data Performance: Indexing helps ensure that the database performs consistently even as the amount of data grows. Without indexing, queries may take longer to run as the number of rows in the table grows, while indexing maintains roughly consistent speed.
•Data Integrity: By ensuring that only unique values are inserted into columns that have been indexed as unique, indexing can also be used to protect the integrity of data. This avoids storing duplicate data in the database, which might lead to issues when performing queries or reports.
Disadvantages of Indexing
•Additional storage: Indexing requires more storage space to hold the index data structure, which increases the total size of the database.
•Increased database maintenance overhead: Indexes must be maintained as data is added, deleted, or modified in the table, which raises database maintenance overhead.
