
Data modeling (1)

Data modeling is the process of creating a visual representation of data and its relationships to organize, store, and analyze it effectively within a database. It involves defining entities, attributes, and relationships, and is documented using entity-relationship diagrams (ERDs) to facilitate collaboration among stakeholders and improve data quality. Data models can be conceptual, logical, or physical, each serving different purposes in the design and implementation of data systems.

Uploaded by

rathorea356

To derive business insights from data through business analytics and data science, the data must be
in a usable form. The process of converting data into a usable form is known as data modeling.
Thus, data modeling is the process of creating a visual representation (blueprint) of data and its
relationships to show how data is organized, stored, and analyzed in a database or information
system.
Data models relate data to the real world. A data model defines a coherent (logical and
consistent) structure for data. It helps different stakeholders, such as data analysts,
scientists, and engineers, to create a unified view of the data. The model outlines what data the
business collects, the relationships between different datasets, and the methods that will be used to
store and analyze the data. Well-structured data models also set the organization's data
standards.
Whereas we traditionally think of data modeling as a problem for database administrators (DBAs)
and ETL developers, data modeling can happen almost anywhere in an organization. Firmware
engineers develop the data format of a record for an IoT device, or web application developers
design the JSON response to an API call or a MySQL table schema—these are all instances of data
modeling and design.
A data model is documented using an entity-relationship diagram (ER diagram), which represents
the table structures in a company’s database. It is a powerful expression of the company’s
business requirements. Data models are used for many purposes, ranging from high-level
conceptual models through logical models to physical data models, and are typically represented
by an entity-relationship diagram. The diagram serves as a guide for database analysts and
software developers in the design and implementation of a system and the underlying database.
An entity-relationship diagram (ERD) is a pictorial representation of the information that can be
captured by a database. Such a “picture” serves two purposes: it allows database professionals to
describe an overall design, and it can be easily transformed into a relational schema.
There are three components in an ERD: entities, attributes, and relationships.
1. Data Entities: These are the core components of a data model and represent real-world objects.
They are the main things that you're tracking, such as Customers, Products, Payments,
Categories, or Orders.
[Implementation-wise, the entities determine the tables you need for your database. A specific
example of an entity is called an instance. Each instance becomes a record, or row, in a
table.]

2. Attributes: These are the specific characteristics, properties, or facts about each entity.
For example, Customer's Name, Customer's ID, Address, and Email are attributes of the entity
named Customer.
[These are the columns of the table.]

A primary key is an attribute or a set of attributes that uniquely identifies an instance (record)
of the entity [it is a column or set of columns that uniquely identifies each row in a table].
For example, for a Customer entity, Customer's ID is the primary key, since no two
Customers have the same Customer's ID. A table can have only one primary key, which
uniquely identifies every row and cannot be null.

A foreign key is used to link two tables together. It is a column or set of columns that
references the primary key of another table, and it establishes and enforces a link between
the data in the two tables. Typically, you take the primary key field from one table and insert
it into the other table, where it becomes a foreign key (it remains a primary key in the
original table). A table can have more than one foreign key.

Primary keys and foreign keys are columns in a database table that identify rows and
establish relationships between tables. They are essential for a database's structure and
functionality.

Example: In a library database, the BookID column in the Books table is the primary key,
uniquely identifying each book.
The BookID column in the Loans table is the foreign key, linking the Loans table to the
Books table.
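
The library example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a complete library schema: the Books and Loans tables and the BookID column come from the example, while Title, LoanID, LoanDate, and the sample rows are assumptions added so the tables have something to hold.

```python
import sqlite3

# In-memory database; Books, Loans, and BookID follow the library example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE Books (
        BookID INTEGER PRIMARY KEY,   -- primary key: uniquely identifies each book
        Title  TEXT NOT NULL          -- hypothetical attribute
    )
""")
conn.execute("""
    CREATE TABLE Loans (
        LoanID   INTEGER PRIMARY KEY,                         -- hypothetical attribute
        BookID   INTEGER NOT NULL REFERENCES Books(BookID),   -- foreign key to Books
        LoanDate TEXT                                         -- hypothetical attribute
    )
""")

conn.execute("INSERT INTO Books (BookID, Title) VALUES (1, 'Sample Book')")
conn.execute("INSERT INTO Loans (LoanID, BookID, LoanDate) VALUES (10, 1, '2024-01-15')")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second book with the same BookID violates the primary key.
duplicate_book_ok = try_insert("INSERT INTO Books (BookID, Title) VALUES (1, 'Another Book')")
# A loan for a BookID that does not exist in Books violates the foreign key.
orphan_loan_ok = try_insert("INSERT INTO Loans (LoanID, BookID) VALUES (11, 999)")

print(duplicate_book_ok, orphan_loan_ok)
```

Both inserts at the end are rejected, which is the "uniquely identifies" and "enforces a link" behaviour described above.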

3. Relationships: These describe how different entities are connected to each other. For
example, a customer might place many orders, or a product might be included in many orders.
[How tables are linked together]
Types of relationships (cardinality): Cardinality defines the nature of a relationship. It
indicates the possible number of occurrences in one entity that are associated with the number
of occurrences in another. For example, ONE team has MANY players. In an ERD, the entities
Team and Player are connected by a one-to-many relationship.
In an ER diagram drawn in Crow’s Foot notation, cardinality is represented by symbols such as
a crow’s foot at the connector’s ends. The three common cardinal relationships are one-to-one,
one-to-many, and many-to-many. Here is an example of relationship cardinality in an ERD:

ERD Example – Customer Appointment


Suppose we have the following business scenario:
a) One Customer may make one or more Appointments
b) One Appointment must be made by one and only one Customer
c) The cardinality from Customer to Appointment is zero to many

In the ERD for this scenario, drawn in Crow’s Foot notation:

Entities are shown in a box with attributes listed below the entity name.
Relationships are shown as solid lines between two entities.
The minimum and maximum cardinalities of the relationship between Customer and
Appointment are shown with either a straight line and hash marks or a crow’s foot.
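
One way the Customer and Appointment scenario could translate into tables is sketched below with Python's sqlite3 module. The entity names come from the scenario; the column names (CustomerID, Name, AppointmentID, StartsAt) and the sample rows are assumptions made for illustration. A NOT NULL foreign key on Appointment expresses "one and only one Customer", and nothing forces a Customer to have appointments, which gives the zero-to-many side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement in SQLite

conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE Appointment (
        AppointmentID INTEGER PRIMARY KEY,
        -- NOT NULL + REFERENCES: every appointment belongs to exactly one customer
        CustomerID    INTEGER NOT NULL REFERENCES Customer(CustomerID),
        StartsAt      TEXT NOT NULL
    );
""")

# Hypothetical sample data: one customer with two appointments, one with none.
conn.executemany("INSERT INTO Customer VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany(
    "INSERT INTO Appointment VALUES (?, ?, ?)",
    [(100, 1, "2024-03-01 09:00"), (101, 1, "2024-03-08 09:00")],
)

# Count appointments per customer; LEFT JOIN keeps customers with zero appointments.
counts = dict(conn.execute("""
    SELECT c.Name, COUNT(a.AppointmentID)
    FROM Customer AS c
    LEFT JOIN Appointment AS a ON a.CustomerID = c.CustomerID
    GROUP BY c.CustomerID
"""))

# An appointment with no customer is rejected by the NOT NULL constraint.
try:
    conn.execute("INSERT INTO Appointment VALUES (102, NULL, '2024-03-09 10:00')")
    appointment_without_customer = True
except sqlite3.IntegrityError:
    appointment_without_customer = False

print(counts, appointment_without_customer)
```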

Why is data modeling important?


• Facilitates Collaboration: Data models provide a common framework for data engineers,
business analysts, and other stakeholders to understand and discuss data flows and
requirements.
• Improved Data Quality: A strong model improves the quality of the data by defining clear
relationships, constraints, and validation rules, ensuring accuracy and consistency across
systems.
• Data Integrity: A well-designed data model ensures that data is accurate, consistent, and
stored in a way that prevents errors or anomalies.
• Scalability and Performance: A properly designed data model can optimize data storage
and processing, improving system performance and scalability. For example, organizing data
efficiently can speed up queries and data processing. A robust data model can also handle
large volumes of data and allow for future growth without performance degradation.
• Enhanced Data Analysis: A clear data model facilitates efficient data retrieval and analysis,
enabling data-driven decision-making.
• Reduced Development Costs: A well-defined data model can streamline the development
process, reducing errors and rework.
• Ease of Maintenance: Proper data modeling ensures that the data structure can be updated
or modified with minimal disruption to applications or processes.

Types/levels of data models:


1. Conceptual – the “what” model
2. Logical – the “how” of the details
3. Physical – the “how” of the implementation

1. Conceptual data model: The conceptual data model serves as the foundational layer in the
process of database design. This model covers only the fundamentals and is represented through
visual tools like entity-relationship diagrams (ERDs) or other schemas.

• It is a high-level view, or general idea, of the data.
• It focuses on the business requirements. It answers what data (information) needs to be
stored and why it is required.
• It offers an overview of the main entities required and their relationships.
• It captures the business logic and rules (which are defined by stakeholders).
• It identifies data security requirements (what security must be established, and how, for
accessing data securely).
• It allows all team members to understand the project objectives without needing any
technical understanding.
• The conceptual data model is often created by architects in conjunction with business
stakeholders and domain experts.

Example of Conceptual Data model

Here’s an example from an e-commerce business. In a conceptual model, the primary entities
could include “Customer,” “Product,” “Order,” and “Payment.” Every entity signifies a core
data domain, linked by general relationships such as “Customer places Order” or “Order
contains Product” (one customer may place many orders, one product may appear in many
orders, and so on).
However, this model will not feature data types, primary keys, or other technical details.
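
To make the level of detail concrete, the conceptual example above can be written down as nothing more than entity names and named relationships. The sketch below is only one possible notation, chosen for illustration; the variable names are assumptions, and the entities and relationships are the ones listed in the example.

```python
# Conceptual level: entity names and named relationships only.
# No attributes, data types, or keys appear at this stage.
entities = {"Customer", "Product", "Order", "Payment"}

# (subject, verb, object) triples taken from the e-commerce example.
relationships = [
    ("Customer", "places", "Order"),
    ("Order", "contains", "Product"),
]

# Sanity check: every relationship should refer to entities that were declared.
unknown = {name for subj, _, obj in relationships for name in (subj, obj)} - entities

for subj, verb, obj in relationships:
    print(f"{subj} {verb} {obj}")
```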

2. Logical data model: It describes how the conceptual model will be implemented. This model
offers a clear framework that guides database architects in structuring data effectively. The
logical model is especially helpful for larger, more complex projects.

• This model expands on the basic framework of the conceptual model by adding technical
details.
• It determines all of the entities required, their attributes, and the arrangement and data
types of those attributes.
• It clarifies the specific attributes and relationships (for example, whether a relationship
is one-to-one, one-to-many, or many-to-many).
• It describes the data structure (such as tables, columns, and rows) and the data types
associated with the attributes. These types remain broad (for instance, “number” or “string”)
instead of being specific to any particular database management system (DBMS).
• It identifies primary and foreign keys.
• It establishes data constraints (for example, whether one customer can have more than one
email ID or contact number).
• Normalization is applied to minimize data redundancy.
• Although a logical data model is still independent of the actual database system in which the
database will be created, you can still take that system into account if it affects the design.
• The logical model is typically created by architects and business analysts.
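
The normalization step mentioned in the list above can be illustrated with a small sketch. The data and field names here are invented for the illustration: a flat list of orders repeats each customer's email, and splitting it into a customers structure and an orders structure stores each email only once.

```python
# Hypothetical un-normalized data: the customer's email repeats on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 7, "customer_email": "a@example.com"},
    {"order_id": 2, "customer_id": 7, "customer_email": "a@example.com"},
    {"order_id": 3, "customer_id": 9, "customer_email": "b@example.com"},
]

# Customer facts move to their own structure, keyed by the primary key.
customers = {
    row["customer_id"]: {"customer_email": row["customer_email"]}
    for row in flat_orders
}

# Orders keep only their own facts plus a foreign key back to the customer.
orders = [
    {"order_id": row["order_id"], "customer_id": row["customer_id"]}
    for row in flat_orders
]

print(customers)
print(orders)
```

After the split, changing a customer's email means updating one entry instead of every order row for that customer.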

Example of Logical model

In the e-commerce example, we would outline the attributes of each entity. For example, a
“Customer” could include details such as “Customer ID,” “First Name,” and “Email.” The
data types of attributes like “Customer ID” and “First Name” are integer and string,
respectively. A specific database may not allow a space in the attribute name “Customer ID”;
that detail is resolved in the physical model. The logical model also specifies primary and
foreign keys, making “Customer ID” the primary key of the Customer table and “Customer ID”
a foreign key in the Order table.

The logical model maintains data quality by identifying data attributes and relationships, and
it organizes data to minimize redundancy and enhance reliability.

It also makes it easy to translate the model into the technical requirements of the physical
model.
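
A logical model like the one in this example can be captured in a DBMS-independent form. The sketch below uses Python dataclasses as one possible notation; the Attribute and Entity classes and the helper functions are hypothetical, "Order ID" and the string type for "Email" are assumptions, and the remaining names and types come from the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    name: str
    data_type: str                     # broad type such as "integer" or "string"
    primary_key: bool = False
    references: Optional[str] = None   # name of the entity a foreign key points to


@dataclass
class Entity:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


customer = Entity("Customer", [
    Attribute("Customer ID", "integer", primary_key=True),
    Attribute("First Name", "string"),
    Attribute("Email", "string"),  # type assumed for illustration
])

order = Entity("Order", [
    Attribute("Order ID", "integer", primary_key=True),  # hypothetical attribute
    Attribute("Customer ID", "integer", references="Customer"),
])


def primary_key(entity):
    """Return the names of the attributes that form the primary key."""
    return [a.name for a in entity.attributes if a.primary_key]


def foreign_keys(entity):
    """Map each foreign key attribute to the entity it references."""
    return {a.name: a.references for a in entity.attributes if a.references}


print(primary_key(customer), foreign_keys(order))
```

Note that the attribute names still contain spaces and the types are still generic; nothing here depends on a particular DBMS.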

3. Physical data model: It describes how the logical model will be implemented. The physical
data model is a detailed representation of how data will be stored, organized, and
accessed at the physical level (within a database system).
• It translates a logical data model into a technical implementation ready for deployment.
• It implements the framework of the logical model by adding details specific to the database
in which the data will be stored. Thus, it is database-specific.
• It specifies the exact tables (names), columns (names), data types, constraints, and indexes
needed to implement the database structure on a specific database management system
(DBMS).
• It establishes primary and foreign keys, views, indexes, access profiles, authorizations,
etc.
• It is the most granular level of data modeling, focusing on technical details like storage
mechanisms and indexing strategies to optimize performance.
• A crucial aspect of physical data modeling is optimizing database performance by
strategically designing indexes, partitioning tables, and considering data access patterns.
• Denormalization may be applied to optimize query performance.

Example of Physical model


In the e-commerce example, consider the table for the entity “Customer” with details such as
“Customer ID,” “First Name,” and “Email.” The data types of attributes like “Customer ID”
and “First Name” are integer and string, respectively. A specific database may not allow a
space between Customer and ID in the attribute name “Customer ID,” so the attribute is
represented as “Customer_ID” in that database. Likewise, the data type “string” is
represented by “Varchar” in that database. The physical model may also add other attributes
to an entity by means of foreign keys.

The physical model is developed for a specific version of a DBMS, location, data storage, or
technology to be used in the project.

The physical data model is typically created by DBAs and/or developers.
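
As a rough illustration of the physical level, the Customer table from the example can be declared for one specific DBMS. The sketch below uses SQLite through Python's sqlite3 module purely because it is easy to run; the column lengths, the UNIQUE constraint on Email, the Customer_Order table, and the index name are all assumptions. The order table is named Customer_Order here because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Physical level: exact table names, column names, DBMS data types,
# constraints, and indexes.
conn.executescript("""
    CREATE TABLE Customer (
        Customer_ID INTEGER PRIMARY KEY,
        First_Name  VARCHAR(50) NOT NULL,    -- length is an assumption
        Email       VARCHAR(255) UNIQUE      -- constraint is an assumption
    );
    CREATE TABLE Customer_Order (
        Order_ID    INTEGER PRIMARY KEY,
        Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID)
    );
    -- Index chosen for an assumed access pattern: look up orders by customer.
    CREATE INDEX idx_order_customer ON Customer_Order (Customer_ID);
""")

# Inspect what the database actually recorded for the Customer table.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Customer)")]
indexes = [row[1] for row in conn.execute("PRAGMA index_list(Customer_Order)")]

print(columns)
print(indexes)
```

The same logical model could map to different names, types, and indexes on another DBMS, which is why the physical model is described as database-specific.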

Tools for data modeling:


Diagramming tools: Tools like Microsoft Visio, Lucidchart, and Draw.io are commonly
used to create visual representations of data models.
Database modeling tools: Specialized tools like ER/Studio and PowerDesigner offer
features for creating and managing complex data models.

Overall, data modeling is a crucial step in any data-related project, from designing a new database
to analyzing existing data. By creating a clear and well-defined data model, organizations can
help ensure that their data is accurate, reliable, and valuable.
A query allows you to retrieve and act on data

Go to the following link for detailed example of different level of Data modeling:

https://vertabelo.com/blog/er-diagram-for-online-shop/

Data models for data warehousing and business intelligence


1. Dimensional data models:
