Database Development
D. Entities
Entity Types:
Strong Entity: Represents a real-world object that can
exist independently. For example, a Customer entity in
an e-commerce database is a strong entity.
Weak Entity: Depends on another entity for its
identification. It cannot be uniquely identified by its own
attributes alone. For example, an OrderItem might be
a weak entity dependent on an Order.
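The distinction can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the column names (Name, LineNo, Product) and the sample rows are hypothetical and not part of the notes. OrderItem has no key of its own; it is identified by the OrderID of its owning Order plus a line number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Customer (              -- strong entity: identified by its own key
    CustomerID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL
);
CREATE TABLE "Order" (               -- strong entity (quoted: ORDER is an SQL keyword)
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID)
);
CREATE TABLE OrderItem (             -- weak entity: identified only together with its Order
    OrderID INTEGER NOT NULL REFERENCES "Order"(OrderID) ON DELETE CASCADE,
    LineNo  INTEGER NOT NULL,
    Product TEXT NOT NULL,
    PRIMARY KEY (OrderID, LineNo)
);
""")

conn.execute("INSERT INTO Customer VALUES (1, 'Ada')")
conn.executemany('INSERT INTO "Order" VALUES (?, ?)', [(100, 1), (101, 1)])
# LineNo 1 appears twice: it is unique only within one order.
conn.executemany(
    "INSERT INTO OrderItem VALUES (?, ?, ?)",
    [(100, 1, "Pen"), (100, 2, "Ink"), (101, 1, "Paper")],
)

# Deleting the owning order removes its dependent items as well.
conn.execute('DELETE FROM "Order" WHERE OrderID = 100')
remaining_items = conn.execute("SELECT OrderID, LineNo FROM OrderItem").fetchall()
```

After the delete, only the item belonging to order 101 is left, which reflects that a weak entity cannot outlive the entity it depends on.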
Entity Set:
An entity set is a collection of entities of the same type.
For example, the set of all Customers in a database
forms the Customer entity set.
E. Attributes/Field
An attribute is a field or column in a database table that
stores a specific piece of data about an entity. For
example, in a database for employees, attributes of the
Employee entity might include EmployeeID,
FirstName, LastName, and HireDate.
Attributes are used to store individual data items related
to an entity. Each attribute holds a particular type of data,
such as text, numbers, or dates.
Attribute Types:
Simple Attribute: An attribute that cannot be divided
further. For example, FirstName or DateOfBirth.
Composite Attribute: An attribute that can be divided
into smaller sub-parts, each representing a more basic
attribute. For example, a FullName attribute might be
divided into FirstName and LastName.
Derived Attribute: An attribute whose value is derived
from other attributes. For example, an Age attribute can
be derived from the DateOfBirth attribute.
Multi-Valued Attribute: An attribute that can hold
multiple values. For example, PhoneNumbers for a
contact might include multiple phone numbers.
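The four attribute types can be sketched in one small Python class. The class name Contact and the method age_on are hypothetical names chosen for illustration; the field names follow the examples above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Contact:
    first_name: str        # simple attribute
    last_name: str         # simple attribute
    date_of_birth: date    # simple attribute
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued attribute

    @property
    def full_name(self) -> str:
        # Composite attribute: built from the FirstName and LastName sub-parts.
        return f"{self.first_name} {self.last_name}"

    def age_on(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth instead of being stored.
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


contact = Contact("John", "Doe", date(2004, 5, 15), ["555-0100", "555-0101"])
```

In a relational table, a multi-valued attribute such as phone_numbers would usually be moved into its own table rather than stored as a list.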
F. Records
A record is a collection of related data fields that together
represent a specific instance of an entity. In a table, each
record is typically stored in a separate row.
Example:
Consider a Students table in a school database. A record in
this table might look like this:
| StudentID | FirstName | LastName | DateOfBirth |
|-----------|-----------|----------|-------------|
| 001       | John      | Doe      | 2004-05-15  |
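As a minimal sketch, the same record can be stored and read back with Python's sqlite3 module. The column types are assumptions made for this example; in particular, StudentID is declared as TEXT so that the leading zeros in 001 are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read by column name

conn.execute(
    """
    CREATE TABLE Students (
        StudentID   TEXT PRIMARY KEY,  -- TEXT keeps the leading zeros of '001'
        FirstName   TEXT,
        LastName    TEXT,
        DateOfBirth TEXT               -- ISO date string, e.g. '2004-05-15'
    )
    """
)

# One INSERT adds one record, i.e. one row of the table.
conn.execute(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    ("001", "John", "Doe", "2004-05-15"),
)

record = conn.execute(
    "SELECT * FROM Students WHERE StudentID = ?", ("001",)
).fetchone()
record_as_dict = dict(record)  # field name -> value for this one record
```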
G. Table
Foreign Key:
Tables can have foreign keys, which are columns that
establish relationships with primary keys in other tables.
For example, an Order table might have a
CustomerID foreign key that links to the
CustomerID primary key in the Customer table.
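The sketch below shows what a foreign key buys in practice, using Python's sqlite3 module. The Name column and the sample values are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection; other database systems typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite

conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    'CREATE TABLE "Order" ('
    "OrderID INTEGER PRIMARY KEY, "
    "CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID))"
)

conn.execute("INSERT INTO Customer VALUES (1, 'Ada')")
conn.execute('INSERT INTO "Order" VALUES (100, 1)')  # customer 1 exists: accepted

try:
    # Customer 999 does not exist, so this order would point at nothing.
    conn.execute('INSERT INTO "Order" VALUES (101, 999)')
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute('SELECT COUNT(*) FROM "Order"').fetchone()[0]
```

The second insert is refused, so every stored order is guaranteed to reference an existing customer.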
Types of Schemas:
Logical Schema: Focuses on the logical structure of the
database, including tables, columns, and relationships,
without concern for how data is physically stored.
Physical Schema: Details the physical storage of data,
including how data is stored on disk, indexing methods,
and performance optimizations.
Conceptual Schema: Provides a high-level view of the
database, describing the overall structure and
relationships between entities without getting into
implementation details.
I. DBMS
A Database Management System (DBMS) is software
designed to manage, organize, and control access to
databases. It provides an interface for users and
applications to interact with data, ensuring efficient
storage, retrieval, and manipulation.
Applications of database
One to one
Example
Consider a database with two tables: Student and Course.
Students can enroll in multiple courses, and each course can
have multiple students enrolled. This is a many-to-many
relationship.
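The Student and Course example describes a many-to-many relationship, which relational databases usually implement with a third, linking table. The sketch below assumes a linking table named Enrollment and hypothetical sample data; neither appears in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Course  (CourseID  INTEGER PRIMARY KEY, Title TEXT);

-- Linking table: one row per (student, course) pair.
CREATE TABLE Enrollment (
    StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID  INTEGER NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);

INSERT INTO Student VALUES (1, 'John'), (2, 'Mary');
INSERT INTO Course  VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO Enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_of_john = [
    row[0]
    for row in conn.execute(
        "SELECT c.Title FROM Course c "
        "JOIN Enrollment e ON e.CourseID = c.CourseID "
        "WHERE e.StudentID = 1 ORDER BY c.Title"
    )
]

# One course, many students.
students_in_databases = [
    row[0]
    for row in conn.execute(
        "SELECT s.Name FROM Student s "
        "JOIN Enrollment e ON e.StudentID = s.StudentID "
        "WHERE e.CourseID = 10 ORDER BY s.Name"
    )
]
```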
Determination of data types
Determining data types is a crucial part of database design, as
it ensures that the data is stored efficiently, accurately, and
can be manipulated effectively. Here’s a detailed explanation
of how to determine the most common data types in
databases: Character, Number, and Date.
1. Character Data Type
Character data types are used to store text-based data, such as
names, addresses, or descriptions. The length of the text can
vary, and databases provide different character data types
based on the length and nature of the text.
Common Character Data Types:
CHAR(n): Fixed-length character type. Always stores a
specified number of characters (n). If the input is shorter
than n, it is padded with spaces.
o Example: CHAR(10) stores exactly 10 characters.
o Use case: Storing fixed-length codes like postal
codes.
VARCHAR(n): Variable-length character type. Only
stores the characters provided up to a maximum limit of
n.
o Example: VARCHAR(50) stores up to 50 characters.
o Use case: Storing variable-length text such as names
or email addresses.
TEXT: Stores large text data with no predefined length
limit. Ideal for storing paragraphs or lengthy
descriptions.
o Example: TEXT is used for storing articles or blog
posts.
How to Determine Character Data Type:
Use CHAR if the length of the text is fixed.
Use VARCHAR if the text length varies but has a
known maximum.
Use TEXT for large, unrestricted text fields.
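The difference between CHAR and VARCHAR can be imitated in a few lines of Python. This is a simplified model, not the behavior of any particular DBMS: the helper names store_char and store_varchar are hypothetical, and rejecting over-long input with an error is an assumption (some systems truncate instead).

```python
def store_char(value: str, n: int) -> str:
    """Model CHAR(n): the stored value always occupies exactly n characters."""
    if len(value) > n:
        raise ValueError(f"value longer than CHAR({n})")
    return value.ljust(n)  # pad with trailing spaces up to n


def store_varchar(value: str, n: int) -> str:
    """Model VARCHAR(n): store the value as given, up to n characters."""
    if len(value) > n:
        raise ValueError(f"value longer than VARCHAR({n})")
    return value  # no padding


padded = store_char("AB", 5)       # 'AB' followed by three spaces
unpadded = store_varchar("AB", 5)  # 'AB'
```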
2. Number Data Type
INTEGER: Stores whole numbers.
o Use case: Counting items or storing user IDs.
DECIMAL(p, s) or NUMERIC(p, s): Stores fixed-point
numbers with precision p (total digits) and scale s (digits
after the decimal point).
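o Example: DECIMAL(10, 2) stores numbers with up
to 10 digits in total, 2 of them after the decimal point.
3. Date Data Type
DATE: Stores a calendar date only.
TIME: Stores a time of day only.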
DATETIME or TIMESTAMP: Stores both date and
time.
o Example: DATETIME for tracking events with a
precise timestamp.
INTERVAL: Represents a span of time, such as "3
days" or "2 hours".
How to Determine Date Data Type:
Use DATE if only the date is needed.
Use TIME if only the time component is required.
Use DATETIME or TIMESTAMP if both date and
time are needed.
Use INTERVAL for storing durations or time
differences.
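Python's datetime module has close counterparts to these four kinds of value, which makes the distinction easy to try out. The mapping is an analogy for illustration only; the variable names are hypothetical and the exact SQL types differ between database systems.

```python
from datetime import date, datetime, time, timedelta

contract_date = date(2024, 11, 11)    # like DATE: calendar date only
appointment = time(14, 30, 0)         # like TIME: time of day only
logged_at = datetime.combine(contract_date, appointment)  # like DATETIME/TIMESTAMP
duration = timedelta(days=3)          # like INTERVAL: a span of time

# Adding an interval to a timestamp gives another timestamp.
due_at = logged_at + duration

# Subtracting two timestamps gives an interval.
elapsed = due_at - logged_at

logged_text = logged_at.isoformat(sep=" ")  # '2024-11-11 14:30:00'
```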
Summary Table
| Data Type | Example Use Case               | Example Value         |
|-----------|--------------------------------|-----------------------|
| CHAR      | Postal code, country code      | '12345'               |
| VARCHAR   | Names, email addresses         | '[email protected]'   |
| TEXT      | Article content, blog posts    | 'Lorem ipsum...'      |
| INTEGER   | Counting items, user IDs       | 12345                 |
| DECIMAL   | Currency, financial amounts    | 999.99                |
| FLOAT     | Scientific measurements        | 3.14159               |
| DATE      | Birthdate, contract date       | '2024-11-11'          |
| TIME      | Appointment time               | '14:30:00'            |
| DATETIME  | Timestamp for logs             | '2024-11-11 14:30:00' |
| INTERVAL  | Time duration                  | '3 days'              |
Understanding and choosing the correct data type ensures
efficient storage and accurate data manipulation in your
database.
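One reason the summary table pairs DECIMAL with currency and FLOAT with measurements is that binary floating point cannot represent most decimal fractions exactly. The sketch below uses Python's decimal module to show the effect and to imitate DECIMAL(10, 2). The helper to_decimal_10_2 is a hypothetical name, and rounding half up is an assumption; real systems differ in how they round or reject values.

```python
from decimal import ROUND_HALF_UP, Decimal

# Floating point accumulates small binary rounding errors.
float_total = 0.1 + 0.2                              # 0.30000000000000004
decimal_total = Decimal("0.10") + Decimal("0.20")    # exactly 0.30


def to_decimal_10_2(value: str) -> Decimal:
    """Imitate DECIMAL(10, 2): at most 10 digits in total, 2 after the point."""
    amount = Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    if len(amount.as_tuple().digits) > 10:
        raise ValueError("value does not fit DECIMAL(10, 2)")
    return amount


price = to_decimal_10_2("999.994")  # stored as 999.99
```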
Data Dictionary: Definition and Elements
1. Definition of a Data Dictionary
A data dictionary is a centralized repository that contains
detailed information (metadata) about the data within a
database or information system. It describes the structure,
types, constraints, relationships, and meanings of the data
elements stored in the system. Essentially, it's like a reference
manual that documents the data assets of an organization.
A data dictionary provides:
A consistent understanding of data elements across the
organization.
Improved data management by providing clear documentation
of what each data element represents.
Easier data governance, making it simpler to maintain and
manage data standards.
A single source of truth for developers, analysts, and other
stakeholders to ensure consistent usage of data fields.
B. Data Type
Specifies the type of data stored (e.g., INTEGER, VARCHAR,
DATE).
Example: customer_name is VARCHAR(50), order_amount is
DECIMAL(10, 2).
C. Data Length
Defines the maximum size of the data element.
Example: phone_number has a length of 10 characters.
D. Description
A brief explanation of the data element's purpose or meaning.
Example: order_status indicates whether an order is
"Pending", "Shipped", or "Cancelled".
E. Default Value
The value automatically assigned to the field if no value is
provided.
Example: status defaults to "Active".
H. Relationships
Defines associations between data elements or tables (one-to-
one, one-to-many, many-to-many).
Example: customer_id in the Orders table is linked to the
Customers table.
I. Source
Indicates the origin of the data element (e.g., user input,
automated system, external source).
Example: sales_data imported from an external system.
J. Owner or Steward
Identifies the person or department responsible for
maintaining the data element.
Example: The finance team manages the invoice_amount
field.
K. Last Updated
Timestamp indicating when the data element definition was
last modified.
Example: last_updated = '2024-11-10'.
Example of a Data Dictionary Entry
| Attribute      | Description                         |
|----------------|-------------------------------------|
| Name           | employee_id                         |
| Data Type      | INTEGER                             |
| Length         | 10                                  |
| Description    | Unique identifier for each employee |
| Default Value  | None                                |
| Constraints    | Primary Key, Not Null               |
| Allowed Values | Positive integers only              |
| Relationships  | Linked to employee_department table |
| Source         | Internal HR system                  |
| Owner          | HR Department                       |
| Last Updated   | 2024-11-11                          |
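Part of a data dictionary entry can be read directly from the database itself. The sketch below uses SQLite's PRAGMA table_info through Python's sqlite3 module; the employee table and its status column are hypothetical examples. Elements such as description, source, and owner are not stored in the schema and still have to be documented by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY NOT NULL,
        status      TEXT NOT NULL DEFAULT 'Active'
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
data_dictionary = [
    {
        "name": name,
        "data_type": col_type,
        "not_null": bool(notnull),
        "default": default,        # SQL text of the default, or None
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, default, pk in conn.execute(
        "PRAGMA table_info(employee)"
    )
]
```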
Data Manipulation:
o Users should be able to add, update, and delete records
in the database (e.g., updating a customer's address).
o The system should support searching and filtering data
(e.g., finding all orders placed in the last month).
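These data manipulation requirements map onto the four basic SQL statements. The following is a minimal sketch with Python's sqlite3 module; the table layouts, sample rows, and the fixed cutoff date standing in for "the last month" are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, placed_on TEXT)"
)

# Add records.
conn.execute("INSERT INTO customer VALUES (1, 'Ada', '1 Old Road')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, "2024-10-05"), (2, 1, "2024-11-02"), (3, 1, "2024-11-09")],
)

# Update a record: change the customer's address.
conn.execute(
    "UPDATE customer SET address = ? WHERE customer_id = ?", ("2 New Street", 1)
)

# Delete a record.
conn.execute("DELETE FROM orders WHERE order_id = ?", (1,))

# Search and filter: orders placed on or after a cutoff date.
recent_orders = conn.execute(
    "SELECT order_id FROM orders WHERE placed_on >= ? ORDER BY order_id",
    ("2024-11-01",),
).fetchall()

new_address = conn.execute(
    "SELECT address FROM customer WHERE customer_id = 1"
).fetchone()[0]
```

Passing values as parameters (the ? placeholders) instead of building SQL strings by hand also guards against SQL injection.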
2. Non-Functional Requirements
Non-functional requirements specify how the database system
should operate rather than what it should do. These
requirements focus on the performance, usability, reliability,
and other quality attributes of the database.
Types of Non-Functional Requirements:
| Non-Functional Requirement | Description |
|----------------------------|-------------|
| Performance                | The database should handle up to 10,000 transactions per second with a response time of less than 1 second. |
| Scalability                | The system must be able to scale to accommodate 1 million records without degradation in performance. |
| Availability               | The database should have 99.9% uptime, ensuring minimal downtime. |
| Security                   | Implement encryption for data at rest and in transit to protect sensitive information. |
| Usability                  | The database interface should be intuitive, allowing users to easily perform CRUD (Create, Read, Update, Delete) operations. |
| Maintainability            | The database structure should be easy to modify to support future changes in business requirements. |
| Data Consistency           | The database must enforce data consistency, especially in a distributed environment. |
| Compliance                 | The system must comply with industry regulations such as GDPR for data privacy. |
| Backup and Recovery        | Ensure automated backups occur every 24 hours, and recovery should take less than 30 minutes. |
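Targets such as 99.9% uptime are easier to judge once converted into an actual downtime allowance. The short calculation below is a sketch under simplifying assumptions (a 365-day year and a 30-day month); the function and constant names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24          # 8760, ignoring leap years
HOURS_PER_30_DAY_MONTH = 30 * 24   # 720


def allowed_downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Hours of downtime permitted in a period at the given availability."""
    return (1 - availability_percent / 100) * period_hours


# 99.9% uptime leaves about 8.76 hours of downtime per year...
yearly_budget_hours = allowed_downtime_hours(99.9, HOURS_PER_YEAR)

# ...or about 43.2 minutes per 30-day month.
monthly_budget_minutes = allowed_downtime_hours(99.9, HOURS_PER_30_DAY_MONTH) * 60

# With a 30-minute recovery time, only one full recovery fits into a month.
full_recoveries_per_month = int(monthly_budget_minutes // 30)
```

Read together, the availability and recovery targets in the table leave room for roughly one full outage and recovery per month.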
2. Scalability:
o The database should scale horizontally to handle
increased loads as the business grows.
3. Reliability:
o The system should be resilient to failures, with automatic
failover to a backup database.
4. Data Security:
o Implement two-factor authentication for database
administrators.
o All personal data should be encrypted using AES-256
encryption.
5. Compliance:
o The database must be GDPR compliant, ensuring users
can request data deletion.
6. Usability:
o The user interface should allow non-technical users to
easily generate standard reports without writing SQL
queries.
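Summary Table
| Requirement Type | Example                                           | Focus Area                |
|------------------|---------------------------------------------------|---------------------------|
| Functional       | Storing customer information, generating reports  | What the database does    |
| Non-Functional   | Ensuring high performance, security, scalability  | How the database operates |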
1. Interview
An interview is a direct, face-to-face or virtual conversation
between an interviewer and an interviewee to gather in-depth
information. It is one of the most effective methods to collect
qualitative data.
How It Works:
Advantages:
Disadvantages:
Use Cases:
2. Documentation
Documentation refers to reviewing existing documents,
records, and files related to the system or business processes.
It is a passive method of data collection that involves
analyzing written or digital resources.
How It Works:
Disadvantages:
Use Cases:
3. Questionnaire
A questionnaire is a set of pre-defined questions distributed
to a large group of people to collect standardized data. It is
usually used for gathering quantitative data but can also
include open-ended questions for qualitative insights.
How It Works:
Advantages:
Disadvantages:
Use Cases:
4. Observation
Observation involves watching users interact with systems or
perform tasks in their natural environment to understand their
behavior and identify potential data requirements. It is
particularly useful for understanding workflows that users
may not articulate clearly.
How It Works:
Observers watch users as they perform tasks, noting their
actions, difficulties, and how they use existing systems.
Observations can be structured (predefined aspects to observe)
or unstructured (general observation).
Advantages:
Provides real-world insights into how users interact with
systems.
Can identify inefficiencies or pain points that users may not be
aware of.
Useful for validating or supplementing data collected from
interviews or questionnaires.
Disadvantages:
Observations can be time-consuming and resource-intensive.
Users may behave differently when they know they are being
observed (Hawthorne Effect).
Interpretation of observations can be subjective and requires
experience.
Use Cases:
Observing how employees enter data into existing systems to
identify inefficiencies.
Understanding real-time challenges in a workflow for better
database design.
Validating data collected through interviews or surveys.
Comparison Table
| Method        | Type of Data Collected   | Advantages                            | Disadvantages                   | Best Used For                      |
|---------------|--------------------------|---------------------------------------|---------------------------------|------------------------------------|
| Interview     | Qualitative              | In-depth, flexible, interactive       | Time-consuming, potential bias  | Understanding detailed user needs  |
| Documentation | Existing records         | Historical view, non-intrusive        | May be outdated, lacks context  | Analyzing existing systems         |
| Questionnaire | Quantitative/Qualitative | Scalable, quick, cost-effective       | Low response rates, superficial | Gathering input from large groups  |
| Observation   | Qualitative              | Real-world insights, behavior-focused | Time-consuming, observer bias   | Understanding user interaction     |
Conclusion
Each data collection method has its strengths and weaknesses,
and the choice of method often depends on the project’s
objectives, available resources, and the nature of the data
being sought. In practice, a combination of methods is often
used to get a comprehensive view and validate findings,
ensuring a well-rounded understanding of database
requirements.