CH5-Written Report

Chapter 5 discusses the data dictionary, which serves as an index and description of all data stored in a database, enhancing documentation and standardizing programming methods. It outlines the types of data, categories of data dictionaries (active and passive), and major database structures (hierarchical, network, and relational). Additionally, it highlights the importance of data dictionaries in data-driven organizations and provides steps for creating one.

Uploaded by

milescy09

CHAPTER 5: DATA DICTIONARY

Data Dictionary
- Contains an index and descriptions of all data stored in a database.
- Its directory describes the locations of the data and the access methods.

Database Object
- A structure that stores or references data; the most common example is a table.

Metadata
- Another term for “data about data”; a common example is the description you read on Google before clicking a link.

Data Dictionary Benefits


●​ Enhancing Documentation
● Facilitating programming by reducing the need for data definition
●​ Providing common validation criteria
●​ Standardizing programming methods

Data Dictionary Data Types


One of the most common fields in a data dictionary is the “data type.”
In data analysis and statistics, a data point is a piece of information that describes one unit of
observation, at one point in time, at the data collection level. It most commonly appears as one
cell in a data table.

Each programming language (Java, SQL, etc.) has its own data types, but we almost always use SQL
(Structured Query Language) data types, as this is the dominant database language.

For example, suppose you are studying butterflies. Each butterfly is one unit of observation. You
may collect information such as the continent where the butterfly is found, the color of its wings,
its weight, and its speed. Each of these pieces of information is called a dimension, and each entry
in a cell is a data point. Each data point describes the unit of observation (that is, each butterfly).

Data points are either words, numbers, or other symbols. These are the types of data points we
create in, and query from, data tables.
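
As a rough illustration, the butterfly table can be sketched in Python as a list of records. The column names and values below are made up for the example and are not taken from any real dataset.

```python
# Hypothetical observations: each dict is one unit of observation (one butterfly).
butterflies = [
    {"continent": "Asia", "wing_color": "blue", "weight_g": 0.5, "speed_kmh": 12},
    {"continent": "Europe", "wing_color": "orange", "weight_g": 0.4, "speed_kmh": 9},
]

# The dimensions are the column names shared by every record.
dimensions = list(butterflies[0].keys())

# A data point is a single cell: one dimension of one unit of observation.
data_point = butterflies[1]["wing_color"]

print(dimensions)
print(data_point)
```

Here the second butterfly's wing color is one data point, and the four keys are the dimensions.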

In most software, the common five types are:


1. Integer – any number that doesn’t have a decimal point
2. Date – a calendar date (year, month, and day)
3. Time – the time of day
4. Text – often referred to as “string,” means any sequence of characters that is treated as
text rather than as a number
5. Boolean – TRUE or FALSE data, often represented as YES or NO text, or as the numbers 1 and 0.
It is, in simple terms, binary data.
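
To make the five types concrete, here is a minimal Python sketch that labels a value with one of them. The function name `common_type` is hypothetical, and the mapping from Python types to these labels is an assumption made for illustration; real database software applies its own rules.

```python
from datetime import date, time

def common_type(value):
    """Return which of the five common types a Python value maps to."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, date):
        return "Date"
    if isinstance(value, time):
        return "Time"
    if isinstance(value, str):
        return "Text"
    return "Other"

samples = [42, date(2024, 5, 1), time(14, 30), "monarch", True]
labels = [common_type(v) for v in samples]
print(labels)
```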

Six Categories of Data Types


1.​ Numeric Data Item Types
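● Integer – any whole number, with no fractional part. Examples include -11, 34, 0, 100.
● Tinyint – an integer, but only numbers from 0 to 255
● Bigint – an integer type with a much larger range than Integer; in SQL Server it is stored in
8 bytes and ranges from -2^63 to 2^63 - 1
● Float – an approximate floating-point number, stored in a form similar to scientific notation so
that very large or very small values can be represented
● Real – a floating-point number with lower precision than Float (single precision in SQL Server)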
2.​ Date and Time Data Item Types
● Date – the date, written in different formats, including “mm/dd/yyyy” (US),
“dd/mm/yyyy” (Europe), “mmmm dd, yyyy”, and “mm-dd-yy” among many
more.
● Time – the time of day, broken down as far as milliseconds
● Datetime – the combined date and time value of an event
● Timestamp – stores the number of seconds that have passed since ‘1970-01-01 00:00:00’ UTC
● Year – stores years ranging from 1901 to 2155 in two-digit or four-digit format
3.​ Character and String Data Item Types
●​ Char – fixed length of characters, with a maximum of 8,000
●​ Varchar – max of 8,000 characters like char, but each entry can differ in length
(variable)
●​ Text – similar to varchar, but the maximum is 2GB instead of a specific length
4. Unicode Character and String Item Types – Unicode assigns every character a code point written
in the form U+0000, where each 0 stands for a hexadecimal digit
● nchar – fixed length with a maximum length of 4,000 characters
● nvarchar – variable length with a maximum of 4,000 characters
● ntext – variable-length storage, only now the maximum is 1GB rather than a
specific length
5.​ Binary Data Item Types – a combination of 0s and 1s
●​ binary – fixed length with maximum of 8,000 bytes
●​ varbinary – variable length storage with maximum bytes, topped at 8,000
6.​ Miscellaneous Data Item Types
● clob – also known as Character Large Object, stores large blocks of character text,
up to 2GB
● blob – also known as Binary Large Object, stores large binary objects such as images or files
● xml – a specific data type that stores XML data. XML stands for Extensible
Markup Language and is common in databases
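
Two of the definitions above are easy to check in a few lines of Python: the Tinyint range and the Timestamp count of seconds since 1970-01-01 00:00:00 UTC. The helper names `fits_tinyint` and `to_timestamp` are hypothetical and exist only for this sketch.

```python
from datetime import datetime, timezone

TINYINT_MIN, TINYINT_MAX = 0, 255  # range given in the list above

def fits_tinyint(n):
    """True if n is an integer that a Tinyint column could hold."""
    return isinstance(n, int) and TINYINT_MIN <= n <= TINYINT_MAX

def to_timestamp(dt):
    """Whole seconds elapsed since 1970-01-01 00:00:00 UTC."""
    return int(dt.timestamp())

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
y2k = datetime(2000, 1, 1, tzinfo=timezone.utc)

print(fits_tinyint(200), fits_tinyint(256))
print(to_timestamp(epoch), to_timestamp(y2k))
```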

Main Categories of Data Dictionary

A data dictionary is an essential component of a database that stores metadata—data about data.
It describes the structure, format, and definitions of data elements in a database, making it easier
for database administrators and users to understand, manage, and maintain the system.

Data dictionaries can be classified into two main categories:

1. Active Data Dictionary

An active data dictionary is directly linked to the database management system (DBMS). It
automatically updates whenever changes are made to the database structure. This means that any
modifications to tables, columns, or relationships are immediately reflected in the data dictionary
without requiring manual updates.

Characteristics of an Active Data Dictionary:


​ •​ Automatically updates when changes are made in the database.
​ •​ Integrated into the DBMS.
​ •​ Commonly found in enterprise-level database management systems.

Example:
A large corporation using an active data dictionary in its DBMS would automatically see updates
in the metadata when a new table or field is added.
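
As a small-scale sketch of the same behavior, SQLite (used here through Python's built-in `sqlite3` module) keeps its own catalog of tables and columns. The table and column names are invented for the example. Note that no separate step updates the catalog after the structure changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

def column_names(connection, table):
    """Read the column names of one table from SQLite's built-in catalog."""
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]

before = column_names(conn, "customers")

# Change the structure; nothing else is done to refresh the metadata.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
after = column_names(conn, "customers")

print(before)
print(after)
```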

2. Passive Data Dictionary


A passive data dictionary, in contrast, is not automatically updated when changes occur in the
database. It requires manual modifications to keep the documentation up to date. This type of
data dictionary is often maintained separately in documents, spreadsheets, or external software.
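
The risk with a passive dictionary is that it drifts out of date. The sketch below, with invented table and field names, keeps a hand-written dictionary as a Python list (standing in for a spreadsheet) and compares it against an SQLite database that was changed afterwards.

```python
import sqlite3

# Hand-maintained (passive) dictionary: a list of rows, as in a spreadsheet.
passive_dictionary = [
    {"table": "customers", "field": "id", "type": "INTEGER"},
    {"table": "customers", "field": "name", "type": "TEXT"},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
# The database changes, but nobody edits the spreadsheet.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")

documented = {row["field"] for row in passive_dictionary if row["table"] == "customers"}
actual = {row[1] for row in conn.execute("PRAGMA table_info(customers)")}

# Fields that exist in the database but are missing from the documentation.
undocumented = sorted(actual - documented)
print(undocumented)
```

A check like this is one way to notice when a manual update is overdue.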

Characteristics of a Passive Data Dictionary:


​ •​ Requires manual updates when the database changes.
​ •​ Not integrated into the DBMS.
​ •​ Used for documentation, reporting, and reference purposes.

Examples:
​ •​ A spreadsheet (e.g., an Excel file) listing all the tables and fields in a database.
​ •​ A text-based document explaining database structures for non-technical users.

Data Structure and Its Major Types

A data structure is a system for organizing and storing data efficiently in a computer. It affects
the performance of a program by determining how data is accessed, manipulated, and stored.

There are three major types of database structures:

1. Hierarchical Database Model

In the hierarchical database model, data is structured in a tree-like format, where each record has
a parent-child relationship. This means that each child record has only one parent, but a parent
can have multiple child records.

Characteristics:
​ •​ Uses a 1:N (one-to-many) relationship between records.
​ •​ Data is stored in a top-down format, similar to an organizational chart.
​ •​ Efficient for applications that require quick data retrieval, such as banking
systems.

Example:
An employee database where a manager (parent) supervises multiple employees (children).
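
A minimal Python sketch of this example follows. It is not how a hierarchical DBMS stores records; it only shows the rule that each child has exactly one parent while a parent may have many children. The names are made up.

```python
# Each child record stores exactly one parent; the root has none.
parent_of = {
    "Ana (manager)": None,
    "Ben": "Ana (manager)",
    "Cara": "Ana (manager)",
    "Dan": "Ben",
}

def children_of(name):
    """A parent may have many children (1:N)."""
    return sorted(child for child, parent in parent_of.items() if parent == name)

def path_to_root(name):
    """Walk upward through the single parent link of each record."""
    path = [name]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

print(children_of("Ana (manager)"))
print(path_to_root("Dan"))
```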

2. Network Database Model

The network database model allows more flexible relationships between records by enabling
many-to-many (M:N) relationships. Instead of a strict hierarchy, data is organized using a
network-like structure.

Characteristics:
​ •​ Uses sets to define relationships between records.
​ •​ A record can have multiple parent and child connections.
​ •​ Provides more flexibility than the hierarchical model but is more complex.

Example:
A university database where students are linked to multiple courses, and each course can have
multiple students.
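
The many-to-many idea can be sketched in Python with sets, using invented student and course names. This is only an illustration of the relationship, not of how a network DBMS stores its sets internally.

```python
# Hypothetical enrollment data: each student links to one or more courses.
enrollments = {
    "Lia": {"Databases", "Networks"},
    "Marco": {"Databases"},
    "Nina": {"Networks", "Statistics"},
}

def students_in(course):
    """Follow the links in the other direction: course -> students."""
    return sorted(s for s, courses in enrollments.items() if course in courses)

print(sorted(enrollments["Lia"]))
print(students_in("Databases"))
```

One student reaches several courses, and one course reaches several students, which a strict parent-child tree cannot express.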

3. Relational Database Model

The relational database model organizes data into tables (also called relations) with rows (tuples)
and columns (attributes). It is the most widely used database model today.
Characteristics:
​ •​ Uses tables to store data.
​ •​ Relationships between tables are defined using primary and foreign keys.
​ •​ Supports high-level query operations using SQL (Structured Query Language).

Example:
A sales database where one table stores customer details, another table stores order details, and
both tables are linked using customer IDs.
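
The sales example can be sketched with Python's built-in `sqlite3` module. The table names, column names, and rows are assumptions made for the illustration; `customer_id` is the primary key in one table and a foreign key in the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        item        TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'mouse'), (12, 2, 'monitor');
""")

# Link the two tables through the shared customer ID.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```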

OSI ARCHITECTURE

The OSI Model was developed by the International Organization for Standardization (ISO) in 1984,
and it is now regarded as an architectural model for communication between computers.
The OSI Model is a reference model that describes how information from a software application in
one computer moves through a physical medium to the software application in another computer.

The OSI (Open Systems Interconnection) model is a conceptual model composed of seven
layers, each specifying particular specialized tasks or functions.

The OSI security architecture:
- defines a systematic approach to providing security at each layer.
- defines security services and security mechanisms that can be used at each of the 7 layers of the
OSI model to provide security for data transmitted over a network.

The OSI Model was defined in ISO/IEC 7498, which has the following parts:
o ISO/IEC 7498-1 The Basic Model
o ISO/IEC 7498-2 Security Architecture
o ISO/IEC 7498-3 Naming and addressing
o ISO/IEC 7498-4 Management framework

OSI LAYERS

1.​ Physical Layer – The physical layer provides the hardware that transmits and receives
the bit stream as electrical, optical, or radio signals over an appropriate medium or
carrier.
- lowest layer, concerned with electrically/optically transmitting raw unstructured data
bits from the Physical Layer of the sending device to the Physical Layer of the receiving
device.
2. Data-Link Layer – The data link layer is used for the encoding, decoding, and logical
organization of data bits. Data is framed and addressed by this layer, which has
two sublayers: Logical Link Control (LLC) and Media Access Control (MAC).
- Performs node-to-node data transfer between directly connected nodes, with data
packaged into frames.
- Responsible for error detection and, in some link technologies, encryption of frames.
3. Network Layer – This layer is responsible for logical addressing (for example, IP addresses),
as well as routing and forwarding. This layer prepares the packets for the data link layer.
- Responsible for taking data received through the data link layer and delivering packets to their
intended destinations based on the addresses contained in each packet.
4. Transport Layer – The transport layer provides reliable and transparent transfer of data
between end points, with end-to-end error recovery and flow control.
- Regulates the size, sequencing, and ultimately the transfer of data between systems and
hosts.
5. Session Layer – The session layer controls the dialogs (sessions) between computers. It
establishes, manages, and terminates the connections between the local and remote
application layers.
- A session/connection between machines is set up, managed, and terminated.
- Manages dialog and synchronizes data transfer with checkpoints.
6.​ Presentation Layer – The presentation layer converts the outgoing data into a format
acceptable by the network standard and then passes the data to the session layer. (It is
responsible for translation, compression, and encryption)
- Formats/translates data for the application layer based on the syntax/semantics that the
app accepts.
7. Application Layer – Provides a standard interface for applications that must
communicate with devices on the network (e.g., print files on a network-connected
printer, send an email, or store data on a file server).
- Provides network services to end-user applications such as a web browser or Office
365.
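
The way data passes down the seven layers on the sender and back up on the receiver can be sketched as nested wrapping. This is a deliberately simplified toy in Python: the bracketed text markers and the function names `encapsulate` and `decapsulate` are invented for the illustration, and real protocols use binary headers that do not map one-to-one onto the seven layers.

```python
OSI_LAYERS = [
    "Physical", "Data-Link", "Network", "Transport",
    "Session", "Presentation", "Application",
]

def encapsulate(payload):
    """Sender side: wrap the payload once per layer, from layer 7 down to layer 1."""
    for number in range(7, 0, -1):
        payload = f"[L{number} {OSI_LAYERS[number - 1]}|{payload}]"
    return payload

def decapsulate(frame):
    """Receiver side: strip the wrappers in the opposite order, layer 1 up to layer 7."""
    for number in range(1, 8):
        prefix = f"[L{number} {OSI_LAYERS[number - 1]}|"
        if not (frame.startswith(prefix) and frame.endswith("]")):
            raise ValueError(f"missing layer {number} wrapper")
        frame = frame[len(prefix):-1]
    return frame

wire = encapsulate("hello")
print(wire)
print(decapsulate(wire))
```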

Importance of a Data Dictionary


Data dictionaries are very important for teams that need to share large amounts of data on a
regular basis. This is the case for most organizations today, since decisions are
increasingly data-driven.
The exception to this is in organizations where only one team needs a working knowledge of a
database. Otherwise, data dictionaries are a must.

How to make a Data Dictionary? (3 easy steps)


Making a data dictionary is not as complicated as it might seem, but the process depends on which
tool you use. In Excel, you will need to do much more manual work than if you were building it with
automatic, active database management software.

The steps to make a data dictionary are as follows:


1. Take each field (column header) in the data table and list it as a row in the data
dictionary.
2. Decide how you want to define each field in the data dictionary.
- For example, you may want to say what the data type is.
3. Either use database management software to compile the source data into the data
dictionary, or build out logic in spreadsheet software like Excel.
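
The steps above can be sketched in Python as follows. The source table, the function names `infer_type` and `build_dictionary`, and the choice to describe each field only by its data type are all assumptions made for this example. The result is written as CSV text, which spreadsheet software such as Excel can open.

```python
import csv
import io

# Hypothetical source table: a list of rows keyed by column header.
source_rows = [
    {"customer_id": 1, "name": "Ana", "is_active": True},
    {"customer_id": 2, "name": "Ben", "is_active": False},
]

def infer_type(values):
    """Define a field, here by a simple data type label."""
    # bool is checked first because bool values also count as int in Python.
    if all(isinstance(v, bool) for v in values):
        return "Boolean"
    if all(isinstance(v, int) for v in values):
        return "Integer"
    return "Text"

def build_dictionary(rows):
    """List each field as one dictionary row, compiled from the source data."""
    fields = list(rows[0].keys())
    return [
        {"field": f, "data_type": infer_type([r[f] for r in rows])}
        for f in fields
    ]

data_dictionary = build_dictionary(source_rows)

# Write the result as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["field", "data_type"])
writer.writeheader()
writer.writerows(data_dictionary)
print(buffer.getvalue())
```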

Data Dictionary Diagram?


A data dictionary diagram does not exist. What most people confuse with a data dictionary
diagram is an entity relationship diagram. It’s easy to confuse the ideas conceptually, but
be careful not to confuse them in practice!

Data Dictionary Tools


One of the biggest challenges with data dictionaries is finding the right tool! In short, the best
data dictionary tool depends on your needs.

Five Most Common Data Object Types


1. Table - A series of rows and columns containing information. The first column typically
contains the reference data (or unique ID), while the other columns provide information
on these IDs.
2.​ Views - Data dictionaries can be used to grant special access to a user. A database
manager may want to limit the visibility on secure information for certain users. In other
words, s/he may want to change the user’s view. The word also refers to displayed data
that the user can easily see but not edit. A query to the database for a view will display
data quickly, which is useful for decision making. You can think of them as a window.
3.​ Clusters - A cluster is simply a table built by connecting two other tables around a
common column.
4.​ Sequences - A set of data columns or tables that describe a specific real-world event.
Clusters can be sequences.
5.​ Index - A copy of key columns that can be easily accessed.
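
Three of these object types (table, view, and index) can be created and listed with Python's built-in `sqlite3` module, as in the sketch below. The object names are invented for the example. SQLite is used only because it ships with Python; it has no cluster or sequence objects, so those two are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    -- A view that hides the salary column from its readers.
    CREATE VIEW employee_names AS SELECT id, name FROM employees;
    CREATE INDEX idx_employees_name ON employees (name);
""")

# SQLite records every object it holds in its own catalog table.
objects = conn.execute("SELECT type, name FROM sqlite_master").fetchall()
print(objects)

conn.execute("INSERT INTO employees VALUES (1, 'Ana', 5000)")
visible = conn.execute("SELECT * FROM employee_names").fetchall()
print(visible)
```

The view returns the ID and name but not the salary, which matches the idea of limiting what a user can see.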
