CHAPTER 8:
Creating a Database
Introduction
In the ever-evolving world of data management, creating a well-structured database is the
foundation upon which a successful data-driven operation stands. Whether it's an inventory system,
customer relationship management, or enterprise resource planning, databases are the silent
engines that power these systems. This chapter walks through the essential steps involved in
creating a database, from the initial gathering of requirements to the ongoing maintenance that
ensures the database continues to meet the needs of its users.
Determine the Data Needs
The first step in database creation is understanding the requirements of the system for which the
database is being designed. This phase involves conversations with stakeholders, system architects,
and potential users to determine what the database needs to achieve. The primary objectives during
this stage are:
• Defining the Purpose: What specific functions will the database support? For
example, will it manage customer information, track sales, or organize inventory?
• Understanding Data Types: What types of data will the database store? This
includes everything from numerical data (e.g., sales figures, employee salaries) to
textual data (e.g., customer names, addresses), as well as more complex data types
like images or documents.
• Identifying User Access: Who will interact with the database? Identifying user roles
(administrators, analysts, etc.) helps determine security needs and access levels.
• Estimating Data Volume and Performance Requirements: Understanding the
amount of data the system is expected to handle, as well as performance expectations,
is crucial for making informed decisions about hardware, indexing, and storage
strategies.
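The volume estimate described above can be approximated with simple arithmetic. The sketch below multiplies an assumed number of rows per day by an assumed average row size and pads the result for indexes and overhead. Every figure in it (rows per day, bytes per row, overhead factor) is a hypothetical placeholder, not a measurement.

```python
# Back-of-envelope storage estimate. Every number below is a hypothetical
# placeholder; replace it with figures gathered from stakeholders.

def estimate_storage_bytes(rows_per_day: int, bytes_per_row: int,
                           days: int, overhead_factor: float = 1.5) -> int:
    """Estimate table size, padded by a factor for indexes and overhead."""
    raw = rows_per_day * bytes_per_row * days
    return int(raw * overhead_factor)

# Example: 10,000 orders per day, about 200 bytes each, kept for one year.
one_year = estimate_storage_bytes(rows_per_day=10_000, bytes_per_row=200, days=365)
print(f"Estimated size after one year: {one_year / 1_000_000:.0f} MB")
```

An estimate like this is only a starting point for discussions about hardware and storage; actual sizes depend on the database system and the final table design.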
Data Modeling
Data modeling refers to the process of creating a visual representation of the data structures and
the relationships between different data elements. It acts as a guide for developers, database
administrators, and analysts to understand how the system's data will be organized and how various
elements interact.
The primary goals of data modeling include:
• Improving data consistency: Ensuring data is structured in a way that reduces
redundancy and prevents inconsistencies.
• Facilitating data retrieval: Designing databases that allow for quick, efficient
querying and reporting.
• Ensuring scalability: Building systems that can handle increasing volumes of data
over time.
• Supporting business requirements: Aligning data structures with the needs of the
business, ensuring data can support operations, analytics, and decision-making.
Types of Data
Data is the foundation upon which insights, decisions, and operations are built in virtually every
industry. Understanding the different types of data is crucial for selecting the right approach to
collection, storage, and analysis.
Quantitative Data
Quantitative data, often referred to as numerical data, consists of information that can
be measured and expressed in numbers. This data type is fundamental to business operations
because it allows for objective analysis, statistical modeling, and performance benchmarking.
Quantitative data is ideal for generating financial reports, tracking performance metrics, and
identifying trends over time.
Characteristics:
• Measurable: Expressed in numerical terms, making it easy to compare and
analyze.
• Objective: Unlike qualitative data, quantitative data is largely independent of
personal opinions, although the way it is collected can still introduce bias.
• Scalable: It can be aggregated, averaged, and analyzed across large datasets to
detect patterns or irregularities.
Applications: Quantitative data is essential for areas such as:
• Financial tracking (e.g., revenue, expenses)
• Inventory management (e.g., stock levels)
• Performance evaluation (e.g., sales targets, employee productivity)
Examples:
• Sales figures and revenue reports
• Customer transactions (e.g., purchase volume)
• Employee performance metrics (e.g., hours worked or units produced)
Qualitative Data
Qualitative data contrasts with quantitative data by being non-numeric and
descriptive. It provides insights into the "why" and "how" behind the numbers. Often
subjective, qualitative data plays a key role in capturing human experiences, emotions, and
complex ideas that cannot be easily expressed through numbers alone.
Characteristics:
• Descriptive: Offers rich, detailed information that adds context and nuance to
quantitative results.
• Subjective: Based on individual opinions, experiences, and perceptions.
• Contextual: Helps explain the underlying reasons behind patterns or trends
observed in quantitative data.
Applications: Qualitative data is used for:
• Market research to understand consumer preferences
• Employee performance assessments, capturing aspects such as creativity or
teamwork
• Customer feedback to improve service quality or product offerings
Examples:
• Open-ended survey responses from customers
• Employee appraisals based on subjective evaluation criteria
• Feedback from customer service calls or social media interactions
Internal Data
Internal data is generated within the organization through its various operational
processes. This data is often the most directly relevant to day-to-day decision-making,
providing insights into internal performance, resource management, and workflow efficiency.
Internal data is typically stored in relational databases, enterprise resource planning (ERP)
systems, or customer relationship management (CRM) software.
Characteristics:
• Organization-Specific: Comes directly from the organization's activities and
operations.
• Operational: Crucial for managing day-to-day business operations and ensuring
that processes are running smoothly.
• Reliable: Internal data is often consistent and subject to regular updates as part
of routine business functions.
Applications: Internal data is integral to:
• Financial accounting and reporting
• Employee and resource management
• Product and service delivery tracking
Examples:
• Sales transaction data
• Employee payroll records and attendance
• Inventory levels and production outputs
External Data
Unlike internal data, external data comes from sources outside the organization. This
type of data is essential for understanding broader market conditions, assessing competitive
performance, and making strategic decisions in response to external factors. External data is
commonly sourced from third-party reports, market research firms, and government
publications.
Characteristics:
• Externally Sourced: Provides insights from outside the organization, offering a
broader perspective on industry trends and market dynamics.
• Contextual: Helps businesses assess their position in the market relative to
competitors and adapt to changes in external conditions.
• Diverse: Includes a wide variety of data types, from economic indicators to
customer sentiment and regulatory changes.
Applications: External data is vital for:
• Market research and competitive analysis
• Industry benchmarking and trend forecasting
• Strategic planning and risk management
Examples:
• Market trend reports from research agencies
• Economic indicators (e.g., inflation rates, GDP)
• Competitor performance and pricing strategies
Structured Data
Structured data refers to information that is organized in a predefined, tabular format,
typically in rows and columns. It is the most commonly used type of data in Management
Information Systems (MIS) because it can be easily stored, searched, and analyzed using
traditional database systems and query languages such as SQL.
Characteristics:
• Organized: Data is arranged in a clear, consistent format, which makes it
straightforward to query and analyze.
• Standardized: Follows a specific structure, such as a relational database model.
• Efficient: Structured data can be quickly retrieved and processed using common
tools and database management systems.
Applications: Structured data is foundational in:
• Transactional systems
• Financial analysis and reporting
• Customer relationship management (CRM) and inventory management systems
Examples:
• Customer names, contact details, and transaction history stored in a database
• Financial statements (e.g., balance sheets, income statements)
• Employee records (e.g., job titles, salary, work history)
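To make the idea of rows, columns, and SQL queries concrete, the sketch below stores a few employee records in an in-memory SQLite database using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A predefined structure: every row has the same typed columns.
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        job_title   TEXT NOT NULL,
        salary      REAL NOT NULL
    )
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO employees (name, job_title, salary) VALUES (?, ?, ?)",
    [("Asha", "Analyst", 52000.0),
     ("Ben", "Manager", 68000.0),
     ("Chen", "Analyst", 55000.0)],
)

# Because the structure is known in advance, a query can filter and sort directly.
analysts = conn.execute(
    "SELECT name, salary FROM employees WHERE job_title = ? ORDER BY salary DESC",
    ("Analyst",),
).fetchall()
print(analysts)
```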
Unstructured Data
Unstructured data lacks a predefined format and is often more complex to manage
and analyze than structured data. It includes a wide variety of information types, such as
text, images, audio, and video. Although unstructured data poses challenges in terms of
storage and analysis, it can provide invaluable insights when processed using advanced
technologies like natural language processing (NLP) and machine learning.
Characteristics:
• Flexible: Can include various types of content, such as emails, social media posts,
and multimedia.
• Complex: More difficult to organize, search, and analyze compared to structured
data.
• Rich in Information: Despite its complexity, unstructured data often contains
detailed insights that are not captured in structured formats.
Applications: Unstructured data is critical for:
• Social media monitoring and sentiment analysis
• Customer support and service
• Media management (e.g., videos, photos, audio recordings)
Examples:
• Social media posts and comments
• Customer service call recordings and transcripts
• Email correspondence and attachments
Time-Series Data
Time-series data is collected and ordered by time: each data point in a time-series
dataset is associated with a specific timestamp. It is essential for identifying trends
and forecasting future outcomes based on historical patterns.
Characteristics:
• Chronological: Data is ordered and indexed by time, allowing for trend analysis
and forecasting.
• Pattern-Detecting: Used to identify seasonal variations, cyclical patterns, or
long-term trends.
• Predictive: Time-series data is often used to make future predictions and
forecasts.
Applications: Time-series data is used for:
• Financial market analysis (e.g., stock prices, market trends)
• Sales forecasting
• Resource management and capacity planning
Examples:
• Daily stock prices or trading volumes
• Monthly sales figures
• Temperature or environmental data collected over time
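One simple way to expose a trend in time-series data is a moving average, which smooths each point by averaging it with the points just before it. The sketch below applies a three-month moving average to monthly sales. The sales figures and the function name moving_average are hypothetical, used only to illustrate the idea.

```python
from datetime import date

# Hypothetical monthly sales, each value tied to a timestamp.
monthly_sales = [
    (date(2024, 1, 1), 120.0),
    (date(2024, 2, 1), 135.0),
    (date(2024, 3, 1), 150.0),
    (date(2024, 4, 1), 165.0),
    (date(2024, 5, 1), 180.0),
]

def moving_average(series, window):
    """Return (timestamp, average of the last `window` values) pairs."""
    ordered = sorted(series)  # chronological order by timestamp
    result = []
    for i in range(window - 1, len(ordered)):
        values = [value for _, value in ordered[i - window + 1 : i + 1]]
        result.append((ordered[i][0], sum(values) / window))
    return result

smoothed = moving_average(monthly_sales, window=3)
for timestamp, average in smoothed:
    print(timestamp.isoformat(), average)
```

The first two months produce no output because a three-month window needs three data points before it can be averaged.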
Geographic Data
Geographic data, also known as geospatial data, relates to physical locations on the
Earth's surface. It is used to map, analyze, and interpret data in relation to geographic
positions, often using coordinates such as latitude and longitude. Geographic data is essential
for understanding spatial relationships and optimizing business processes like logistics and
location-based services.
Characteristics:
• Location-Based: Data is tied to specific geographical locations.
• Visualizable: Can be mapped and visualized using Geographic Information
Systems (GIS) tools.
• Contextual: Provides insight into how geography impacts business operations,
marketing strategies, and customer behaviors.
Applications: Geographic data is crucial for:
• Logistics and supply chain management
• Market segmentation and targeting
• Site selection for new business locations
Examples:
• Customer addresses or geographical coordinates
• Delivery route optimization
• Demographic data tied to specific regions
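A common operation on latitude and longitude data is computing the distance between two points. The sketch below uses the haversine formula, which treats the Earth as a sphere, to find the customer nearest to a warehouse. The coordinates and names are hypothetical, and a production system would more likely rely on a GIS tool or spatial database.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical warehouse and customer coordinates (latitude, longitude).
warehouse = (12.90, 77.60)
customers = {"A": (12.97, 77.59), "B": (13.08, 80.27)}

# Pick the customer with the smallest distance to the warehouse.
nearest = min(customers, key=lambda name: haversine_km(*warehouse, *customers[name]))
print("Nearest customer:", nearest)
```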
Big Data
Big data refers to massive datasets that are too large, fast, or complex to be processed
using traditional data management tools. The defining characteristics of big data are its
volume, velocity, and variety—often referred to as the "three Vs." Big data requires
specialized tools and technologies, such as cloud computing, Hadoop, and machine learning,
to store, manage, and extract valuable insights.
Characteristics:
• Massive Volume: Involves enormous amounts of data that exceed the
capabilities of conventional databases.
• High Velocity: Data is generated at very high speeds and often has to be
processed in real time or near real time.
• Variety: Big data comes from a wide range of sources, including structured,
unstructured, and semi-structured data.
Applications: Big data is used for:
• Predictive analytics and customer behavior modeling
• Real-time decision-making
• Internet of Things (IoT) applications and sensor data analysis
Examples:
• Real-time streaming data from social media
• E-commerce transaction data
• Data generated by sensors and connected devices (IoT)
Data Modeling Techniques
Data modeling techniques help organize, structure, and optimize data for efficient use in
Management Information Systems (MIS). Each technique serves different purposes depending on
the system's requirements.
1. Entity-Relationship (ER) Model
The ER model represents entities (objects) and their relationships. Entities are linked via
relationships, with attributes describing each entity.
• Key Components: Entities, Attributes, Relationships, Primary Keys
• Advantages: Simple, visual, and helps in designing relational databases.
• Use: Database design, identifying relationships between business objects.
Example: A Customer places an Order; both have defined attributes like Name and
Order Date.
2. Relational Model
Data is organized into tables (relations), with rows (tuples) and columns (attributes).
Relationships are established via primary and foreign keys.
• Key Components: Tables, Primary Keys, Foreign Keys
• Advantages: Simplifies querying, enforces data integrity.
• Use: Relational databases (e.g., MySQL, Oracle).
Example: A Customer table with Customer_ID and Name, linked to an Orders table via
Customer_ID.
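The Customer and Orders example can be sketched in SQLite through Python's built-in sqlite3 module. The table and key names follow the example above; the Order_ID and Order_Date columns and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Customer (
        Customer_ID INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL
    );
    CREATE TABLE Orders (
        Order_ID    INTEGER PRIMARY KEY,
        Order_Date  TEXT NOT NULL,
        Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID)
    );
    INSERT INTO Customer VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO Orders VALUES
        (10, '2024-01-05', 1),
        (11, '2024-01-09', 1),
        (12, '2024-02-01', 2);
""")

# The foreign key lets a join count the orders placed by each customer.
rows = conn.execute("""
    SELECT c.Name, COUNT(o.Order_ID)
    FROM Customer AS c
    JOIN Orders AS o ON o.Customer_ID = c.Customer_ID
    GROUP BY c.Customer_ID
    ORDER BY c.Customer_ID
""").fetchall()
print(rows)

# Data integrity: an order for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (13, '2024-02-02', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print("Invalid order rejected:", fk_rejected)
```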
3. Dimensional Model
Used for data warehousing, the model organizes data into fact tables (quantitative data)
and dimension tables (descriptive context), often using star or snowflake schemas.
• Key Components: Fact Tables, Dimension Tables, Star/Snowflake Schemas
• Advantages: Optimized for querying and reporting.
• Use: Business Intelligence, reporting.
Example: A Sales Fact Table linked to Product, Customer, and Date dimension tables.
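A minimal star schema for this example might look like the sketch below, again using SQLite from Python. The table names (fact_sales, dim_product, dim_customer, dim_date), their columns, and the sample rows are hypothetical; real warehouses usually carry many more attributes in each dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold descriptive context.
    CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE dim_date     (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- The fact table holds the measures plus one key per dimension.
    CREATE TABLE fact_sales (
        product_id  INTEGER REFERENCES dim_product(product_id),
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        date_id     INTEGER REFERENCES dim_date(date_id),
        quantity    INTEGER,
        amount      REAL
    );

    INSERT INTO dim_product  VALUES (1, 'Keyboard'), (2, 'Monitor');
    INSERT INTO dim_customer VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO dim_date     VALUES (202401, 2024, 1), (202402, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 1, 202401, 2, 50.0),
        (2, 1, 202401, 1, 200.0),
        (1, 2, 202402, 3, 75.0);
""")

# A typical reporting query: total revenue per product.
revenue_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(revenue_by_product)
```

The same fact table can be grouped by customer or by month simply by joining a different dimension, which is what makes the layout convenient for reporting.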
4. Object-Oriented Data Model
This model integrates object-oriented principles, storing data as objects (instances of
classes), allowing inheritance and encapsulation.
• Key Components: Objects, Classes, Inheritance
• Advantages: Suitable for complex systems, supports reusability.
• Use: Object-oriented databases and systems.
Example: A Book class with attributes like Title and methods like CheckOut().
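The Book example can be sketched as a Python class. The base class LibraryItem, the author attribute, and the is_available property are hypothetical additions that show inheritance and encapsulation; the method is named check_out to follow Python naming conventions.

```python
class LibraryItem:
    """Hypothetical base class shared by anything a library can lend."""

    def __init__(self, title):
        self.title = title
        self._checked_out = False  # internal state, changed only via methods

    @property
    def is_available(self):
        return not self._checked_out

    def check_out(self):
        if self._checked_out:
            raise ValueError(f"'{self.title}' is already checked out")
        self._checked_out = True


class Book(LibraryItem):
    """A Book inherits title handling and check-out behavior from LibraryItem."""

    def __init__(self, title, author):
        super().__init__(title)
        self.author = author


book = Book("Database Design", "A. Author")
book.check_out()
print(book.title, "available:", book.is_available)
```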
5. Hierarchical Data Model
Data is organized in a tree-like structure with parent-child relationships, where each child has
only one parent.
• Key Components: Parent-Child Relationships, Nodes
• Advantages: Simple and efficient for hierarchical data.
• Use: File systems, XML data storage.
Example: An Organizational Chart, where the CEO is the root, and departments branch
out.
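The single-parent rule can be illustrated with a small dictionary that maps each node to its one parent. The department names and helper functions below are hypothetical.

```python
# Each node has exactly one parent; the root (CEO) has none.
parent_of = {
    "CEO": None,
    "Sales": "CEO",
    "Engineering": "CEO",
    "Field Sales": "Sales",
    "Backend": "Engineering",
}

def path_to_root(node):
    """Follow parent links upward until the root is reached."""
    path = [node]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

def children_of(node):
    """List the direct children of a node in alphabetical order."""
    return sorted(child for child, parent in parent_of.items() if parent == node)

print(path_to_root("Backend"))
print(children_of("CEO"))
```

Because every child has only one parent, there is exactly one path from any node to the root.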
6. Network Data Model
An extension of the hierarchical model in which a record can have more than one parent,
so that records and their relationships form a network.
• Key Components: Records, Sets, Pointers
• Advantages: More flexible than hierarchical; supports many-to-many relationships.
• Use: Complex networks like telecommunications systems.
Example: A Telecom Network where a Switch is connected to multiple Lines, and each
Line connects to multiple Phones.
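The difference from the hierarchical model shows up once a record has more than one parent. The sketch below assumes, purely for illustration, that a line may be served by more than one switch; the switch and line names are hypothetical.

```python
from collections import defaultdict

# Each pair links one switch to one line. Line-2 is linked to two switches,
# which a strict hierarchy (one parent per child) could not represent.
links = [
    ("Switch-A", "Line-1"),
    ("Switch-A", "Line-2"),
    ("Switch-B", "Line-2"),
]

lines_of = defaultdict(set)     # switch -> lines it serves
switches_of = defaultdict(set)  # line -> switches serving it

for switch, line in links:
    lines_of[switch].add(line)
    switches_of[line].add(switch)

print(sorted(lines_of["Switch-A"]))
print(sorted(switches_of["Line-2"]))
```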
Entity-Relationship Diagrams (ERD)
An Entity-Relationship Diagram (ERD) is a visual representation of the data and its relationships
in a system. It’s used in database design to illustrate how data entities relate to one another,
providing a clear structure for how information is stored, accessed, and manipulated in a database
system.
Key Parts of an ERD:
1. Entities: These are the main objects in your system. For example, in a store, entities could
be Customer, Order, and Product.
2. Attributes: These are details about an entity. For example, a Customer entity might have
attributes like Name, Email, and Phone Number.
3. Primary Key: A unique identifier for an entity. For example, each Customer has a unique
Customer_ID.
4. Relationships: These show how entities are related. Relationships can be:
o One-to-One: One instance of an entity is related to one instance of another entity.
o One-to-Many: One instance of an entity is related to many instances of another
entity.
o Many-to-Many: Many instances of one entity are related to many instances of
another entity.
5. Foreign Key: A key from one entity that links it to another entity. For example, in an Order
entity, the Customer_ID is a foreign key linking the Order to a specific Customer.
Example of an ERD:
Let’s say we have a simple store database with these entities:
1. Customer
o Attributes: Customer_ID, Name, Email
o Relationship: A Customer places many Orders.
2. Order
o Attributes: Order_ID, Order_Date, Customer_ID
o Relationship: An Order can contain many Products.
3. Product
o Attributes: Product_ID, Product_Name, Price
o Relationship: A Product can appear in many Orders.
How They’re Connected:
• Customer “places” Order: One customer can place many orders (one-to-many
relationship).
• Order “contains” Product: An order can contain many products, and a product can
appear in many orders (many-to-many relationship).
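In a relational database, a many-to-many relationship such as Order "contains" Product is usually implemented with an additional linking table. The sketch below assumes a linking table named Order_Item with a Quantity column; that table, the plural table name Orders (used because ORDER is a reserved word in SQL), and the sample rows are not part of the ERD above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Customer_ID INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
    CREATE TABLE Orders (
        Order_ID    INTEGER PRIMARY KEY,
        Order_Date  TEXT,
        Customer_ID INTEGER REFERENCES Customer(Customer_ID)  -- one-to-many
    );
    CREATE TABLE Product (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT, Price REAL);

    -- Linking table: one row per (order, product) pair resolves many-to-many.
    CREATE TABLE Order_Item (
        Order_ID   INTEGER REFERENCES Orders(Order_ID),
        Product_ID INTEGER REFERENCES Product(Product_ID),
        Quantity   INTEGER NOT NULL,
        PRIMARY KEY (Order_ID, Product_ID)
    );

    INSERT INTO Customer VALUES (1, 'Asha', 'asha@example.com');
    INSERT INTO Orders VALUES (10, '2024-01-05', 1), (11, '2024-01-09', 1);
    INSERT INTO Product VALUES (1, 'Keyboard', 25.0), (2, 'Monitor', 200.0);
    INSERT INTO Order_Item VALUES (10, 1, 2), (10, 2, 1), (11, 1, 1);
""")

# Products contained in order 10.
products_in_order = [row[0] for row in conn.execute("""
    SELECT p.Product_Name
    FROM Order_Item AS oi
    JOIN Product AS p ON p.Product_ID = oi.Product_ID
    WHERE oi.Order_ID = 10
    ORDER BY p.Product_Name
""")]

# Orders in which product 1 appears.
orders_with_product = [row[0] for row in conn.execute(
    "SELECT Order_ID FROM Order_Item WHERE Product_ID = 1 ORDER BY Order_ID"
)]

print(products_in_order, orders_with_product)
```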
Why Use ERDs?
• Easy to Understand: They show how data is connected in a simple, clear way.
• Helps Design Databases: Before creating a database, an ERD helps you plan how data
will be stored and related.
• Prevents Errors: ERDs help make sure relationships between data are correct.
Class Diagrams
A Class Diagram is a core component of Object-Oriented Design (OOD), showing the structure
of a system by representing classes, their attributes (data), methods (behavior), and the
relationships between them. Class diagrams are one of the diagram types defined by UML
(Unified Modeling Language) and are widely used to plan, communicate, and document
object-oriented systems.
Key Components:
1. Classes: Represent entities in the system, shown as rectangles with three parts:
• Top: Class name (e.g., Customer).
• Middle: Attributes (e.g., name, email).
• Bottom: Methods (e.g., placeOrder()).
2. Attributes: Properties or data stored by a class (e.g., name: String).
3. Methods: Functions or behaviors of a class (e.g., placeOrder(): void).
4. Relationships:
• Association: A general link between two classes (e.g., Customer places
Order).
• Inheritance: One class inherits from another (e.g., Employee inherits
from Person).
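• Aggregation: A whole-part relationship in which the parts can exist
independently of the whole (e.g., Library has Books).
• Composition: A strong whole-part relationship in which the parts do not
exist without the whole (e.g., House has Rooms).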
Example of a Class Diagram:
In an E-commerce System, we might have the following classes:
1. Customer
o Attributes: customerID, name, email
o Methods: placeOrder(), updateProfile()
2. Order
o Attributes: orderID, orderDate, totalAmount
o Methods: calculateTotal(), addProduct()
3. Product
o Attributes: productID, name, price
o Methods: getPrice(), applyDiscount()
Relationships:
• Customer "places" Order (One-to-Many): A customer can place many orders, but
each order is placed by only one customer.
• Order "contains" Product (Many-to-Many): An order can have many products,
and a product can be part of many orders.
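The classes and relationships listed above translate fairly directly into code. The sketch below uses Python dataclasses with snake_case names in place of the camelCase names in the diagram. Method bodies are assumptions, since the diagram specifies only their names, and the total is computed on demand rather than stored as a totalAmount attribute.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    product_id: int
    name: str
    price: float

    def get_price(self) -> float:
        return self.price

    def apply_discount(self, percent: float) -> None:
        # Assumed behavior: reduce the price by the given percentage.
        self.price = round(self.price * (1 - percent / 100), 2)

@dataclass
class Order:
    order_id: int
    order_date: str
    products: list = field(default_factory=list)  # Order "contains" Products

    def add_product(self, product: Product) -> None:
        self.products.append(product)

    def calculate_total(self) -> float:
        return sum(p.get_price() for p in self.products)

@dataclass
class Customer:
    customer_id: int
    name: str
    email: str
    orders: list = field(default_factory=list)  # Customer "places" Orders

    def place_order(self, order: Order) -> None:
        self.orders.append(order)

    def update_profile(self, name=None, email=None) -> None:
        if name is not None:
            self.name = name
        if email is not None:
            self.email = email

# Hypothetical usage.
keyboard = Product(1, "Keyboard", 25.0)
monitor = Product(2, "Monitor", 200.0)
order = Order(10, "2024-01-05")
order.add_product(keyboard)
order.add_product(monitor)
customer = Customer(1, "Asha", "asha@example.com")
customer.place_order(order)
print(order.calculate_total())
```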
Why Use Class Diagrams?
• Provides a clear structure for the system.
• Helps organize the code and design.
• Improves communication between team members.
Reports and Forms
Reports and Forms are essential components in a Management Information System (MIS).
They help collect, display, and analyze data for decision-making, operational control, and
communication.
Reports
A Report is a structured presentation of data used for analysis, decision-making, and summarization.
Reports can be static (pre-generated) or dynamic (real-time).
Types of Reports:
1. Standard Reports: Predefined, often used for regular updates (e.g., financial
statements, inventory reports). They follow a fixed format and are automatically
generated at set intervals.
2. Ad-hoc Reports: Created based on user requests for specific data, such as detailed
product sales for a given month. These reports are customizable.
3. Dashboards: Visual reports that display key metrics in real-time, such as sales
numbers, traffic analytics, or operational performance. Dashboards often use graphs,
charts, and tables for easy interpretation.
4. Exception Reports: Highlight irregularities or issues, like failed transactions, missed
deadlines, or inventory shortages, to flag problems for immediate attention.
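Two of the report types above can be sketched in a few lines: a summary report that totals sales by region, and an exception report that lists only the items needing attention. The field names, the reorder_level threshold, and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical source data.
sales = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 60.0},
]
inventory = [
    {"product": "Keyboard", "stock": 40, "reorder_level": 10},
    {"product": "Monitor", "stock": 3, "reorder_level": 5},
]

def summary_report(rows):
    """Standard report: total sales amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(sorted(totals.items()))

def exception_report(items):
    """Exception report: only products whose stock is below the reorder level."""
    return [item["product"] for item in items if item["stock"] < item["reorder_level"]]

print(summary_report(sales))
print(exception_report(inventory))
```

The summary report covers every record, while the exception report deliberately leaves out everything that is within normal limits.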
Characteristics of Reports:
• Data Organization: Information is presented in tables, charts, or graphs for
clarity.
• Summarization and Aggregation: Complex data is simplified by summarizing
it (e.g., total sales, average scores).
• Customizable Filters: Users can adjust the report's scope, such as filtering data
by time, region, or category.
• Analysis Tools: Dashboards often include tools to drill down into the data,
providing deeper insights into trends or performance metrics.
Common Tools for Report Generation:
a. Microsoft Excel
Excel is a simple and popular tool for creating reports. It allows users to organize data,
create charts, and perform basic calculations.
• Create charts and tables.
• Use formulas and pivot tables for data analysis.
• Save reports in different formats like PDF or CSV.
b. Google Data Studio (now Looker Studio)
Google Data Studio, renamed Looker Studio in 2022, is a free tool that helps create
interactive reports. It’s easy to use and connects with other Google tools like Google
Sheets and Google Analytics.
• Create interactive dashboards and reports.
• Share and collaborate with others in real time.
• Integrates with Google data sources.
c. Tableau
Tableau is a tool for creating beautiful, interactive reports with data visualizations. It’s
good for analyzing complex data.
• Drag-and-drop interface to create reports.
• Create interactive charts and dashboards.
• Connect to different data sources.
d. Power BI
Power BI is a tool from Microsoft used for making reports and dashboards. It’s good for
both simple and complex data analysis.
• Create interactive reports.
• Easily connect to data from many sources.
• Share reports with others.
e. Crystal Reports
Crystal Reports is a tool used to create detailed and professional reports, especially in
businesses.
• Customize reports in many ways.
• Connect to different data sources for more detailed reports.
• Export reports in multiple formats.
f. SAS Visual Analytics
SAS Visual Analytics is a powerful tool for generating reports and analyzing large datasets.
It’s often used for complex data analysis.
• Advanced data analytics and visualizations.
• Real-time interactive reports.
• Scalable for large datasets.
Forms
A Form is an interactive tool that collects user input and submits it to a system. It allows users to
input, request, or modify data, ensuring smooth communication between the system and the user.
Types of Forms:
1. Data Entry Forms: Used to input new data, such as registration forms, order forms, or
user profiles. They help gather essential information.
2. Search Forms: Allow users to find specific data within the system, such as searching for
products, clients, or reports.
3. Feedback Forms: Collect responses, feedback, or suggestions from users, often used for
surveys, customer reviews, or employee evaluations.
4. Survey Forms: Designed for more detailed data collection, like market research, customer
satisfaction surveys, or research questionnaires. These forms are often more complex and
may include various question types (e.g., multiple choice, open-ended).
Characteristics of Forms:
• Input Fields: These can include text boxes, radio buttons, checkboxes, and
dropdown menus to capture user input.
• Validation: Ensures accurate and complete data entry by checking formats (e.g.,
validating email addresses or required fields).
• Submit Button: Sends the form data to the system for processing or storage.
• Conditional Logic: Forms can change based on user inputs (e.g., showing or
hiding fields based on previous selections).
• Security Features: This can include CAPTCHA, encryption, and other methods to
ensure data security.
Common Types of Form Fields:
a. Text Field: A Text Field is used for short, single-line input such as names,
addresses, or other brief text responses.
b. Textarea: A Textarea is used for longer, multi-line input, such as messages,
descriptions, or feedback.
c. Radio Buttons: Radio Buttons allow users to select one option from a group of
predefined choices. Only one option can be selected at a time.
d. Checkboxes: Checkboxes allow users to select multiple options from a list.
Multiple choices can be selected independently.
e. Dropdown Menu (Select Box): A Dropdown Menu displays a list of options in a
dropdown format, allowing users to select one option from the list.
f. File Upload: A File Upload field allows users to attach and upload files to the form,
such as images, documents, or other files.
g. Email Field: An Email Field is specifically for entering email addresses and typically
includes built-in validation to ensure the correct email format.
h. URL Field: A URL Field is for entering website addresses (URLs). The input is
validated to ensure the correct format.
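The validation step described above can be sketched as a function that checks submitted form data and returns a message for each field that fails. The field names (name, email, plan), the allowed plan values, and the deliberately simple email pattern are hypothetical; real systems often apply stricter checks.

```python
import re

# Simple format check: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ALLOWED_PLANS = {"basic", "premium"}  # hypothetical radio-button options

def validate_form(data: dict) -> dict:
    """Return a dict of field name -> error message; empty means valid."""
    errors = {}

    # Required text field.
    if not data.get("name", "").strip():
        errors["name"] = "Name is required."

    # Email field with format validation.
    if not EMAIL_PATTERN.match(data.get("email", "").strip()):
        errors["email"] = "Enter a valid email address."

    # Radio buttons: exactly one of the predefined choices.
    if data.get("plan") not in ALLOWED_PLANS:
        errors["plan"] = "Choose one of the listed plans."

    return errors

print(validate_form({"name": "Asha", "email": "asha@example.com", "plan": "basic"}))
print(validate_form({"name": " ", "email": "not-an-email", "plan": "gold"}))
```

Returning all errors at once, rather than stopping at the first one, lets the form show every problem to the user in a single pass.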
CONCLUSION
Creating a database involves careful planning and design to ensure efficient data storage, retrieval,
and management. It begins with understanding the data needs, defining the types of data, and
choosing the appropriate data modeling techniques, such as ERD or Class Diagrams. By
determining the relationships between entities and structuring data logically, we can create a system
that is both scalable and easy to manage.
A well-designed database should:
• Ensure Data Integrity: Maintain accurate and consistent data through proper
normalization, validation, and constraints.
• Support Efficient Querying: Be designed to allow fast and flexible data retrieval, making
it easy to access and update information.
• Be Scalable: Adapt to growing data and evolving system needs without major redesigns.
• Be Secure: Protect sensitive data through access controls and encryption.
Ultimately, a well-structured database provides a solid foundation for business operations, enabling
organizations to make informed decisions based on reliable data.