Data modeling (1)
in a usable form. The process for converting data into a usable form is known as data modeling.
Thus, data modeling is the process of creating a visual representation (blueprint) of data and its
relationships to show how data is organized, stored and analyzed in a database or information
system.
Data models relate data to the real world. A data model defines a coherent (logical and
consistent) structure for the data. It helps different stakeholders, such as data analysts,
scientists, and engineers, to create a unified view of the data. The model outlines what data the
business collects, the relationships between different datasets, and the methods that will be used to
store and analyze the data. An organization’s data standards are set by structuring efficient data
models.
Whereas we traditionally think of data modeling as a problem for database administrators (DBAs)
and ETL developers, data modeling can happen almost anywhere in an organization. Firmware
engineers develop the data format of a record for an IoT device, or web application developers
design the JSON response to an API call or a MySQL table schema—these are all instances of data
modeling and design.
A data model is documented using an entity-relationship diagram (ER diagram), which represents the
data structures of a company’s database in table form. It is a very powerful
expression of the company’s business requirements. Data models serve many purposes and range from
high-level conceptual models through logical models to physical data models; they are typically
represented by an entity-relationship diagram. The model serves as a guide for database analysts and
software developers in the design and implementation of a system and the underlying database.
An Entity Relationship Diagram (ERD) is a pictorial representation of the information that can be
captured by a database. Such a “picture” serves two purposes: it allows database professionals to
describe an overall design, and it can be easily transformed into a relational schema.
There are three components in an ERD: Entities, Attributes, and Relationships.
1. Data Entities: These are the core components of a data model and represent real-world objects.
They are the main things that you're tracking, like Customers, Products, Payments,
Categories or Orders.
[Implementation-wise, the entities determine the tables you need in your database. A specific
example of an entity is called an instance. Each instance becomes a record (row) in a
table.]
A primary key is an attribute or a set of attributes that uniquely identifies an instance (record)
of the entity [it is a column or set of columns that uniquely identifies each row in a table].
For example, for a Customer entity, the Customer ID is the primary key, since no two
customers have the same Customer ID. A table can have only one primary key. It
uniquely identifies every row and cannot be null.
A foreign key is a key used to link two tables together. It is a column or set of columns that
references the primary key of another table, and it establishes and enforces a link between the
data in the two tables. Typically you take the primary key field from one table and insert it into
the other table, where it becomes a foreign key (it remains a primary key in the original table).
A table can have more than one foreign key.
Primary keys and foreign keys are columns in a database table that identify rows and
establish relationships between tables. They are essential for a database's structure and
functionality.
Example: In a library database, the BookID column in the Books table is the primary key,
uniquely identifying each book.
The BookID column in the Loans table is the foreign key, linking the Loans table to the
Books table.
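The library example can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in database; the Title and LoanID columns and the sample rows are hypothetical additions, since the text only names BookID. The idea is that the database itself should reject a duplicate primary key and a foreign key that points at no existing book.

```python
import sqlite3

# In-memory database; Books, Loans and BookID follow the library example above.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE Books (
        BookID INTEGER PRIMARY KEY,   -- primary key: unique per book, never null
        Title  TEXT NOT NULL          -- hypothetical attribute
    )
""")
conn.execute("""
    CREATE TABLE Loans (
        LoanID INTEGER PRIMARY KEY,   -- hypothetical primary key for loans
        BookID INTEGER NOT NULL REFERENCES Books(BookID)  -- foreign key
    )
""")

# Hypothetical sample rows: one book and one loan that points at it.
conn.execute("INSERT INTO Books (BookID, Title) VALUES (1, 'Dune')")
conn.execute("INSERT INTO Loans (LoanID, BookID) VALUES (10, 1)")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Second book with the same BookID: violates the primary key.
duplicate_book_ok = try_insert("INSERT INTO Books (BookID, Title) VALUES (1, 'Emma')")
# Loan for BookID 99, which does not exist: violates the foreign key.
orphan_loan_ok = try_insert("INSERT INTO Loans (LoanID, BookID) VALUES (11, 99)")
print(duplicate_book_ok, orphan_loan_ok)
```

Both attempted inserts should be refused, so only the one valid loan remains in the Loans table.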
3. Relationships: How different entities are connected to each other. For example, a
customer might place many orders, or a product might be included in many orders.
[How tables are linked together]
Types of relationships (cardinality): Cardinality defines the nature of a relationship. It indicates
the possible number of occurrences in one entity that are associated with the number of
occurrences in another. For example, ONE team has MANY players. In an
ERD, the entities Team and Player are connected by a one-to-many relationship.
In an ER diagram, cardinality is represented as a crow’s foot at the connector’s ends. The
three common cardinal relationships are one-to-one, one-to-many, and many-to-many. Here
are some examples of relationship cardinality in an ERD:
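The same cardinalities can also be sketched as tables rather than as a diagram. The following is only an illustration, assuming Python's built-in sqlite3 module; the column names, the OrderItem junction table and the sample rows are hypothetical. A one-to-many relationship puts the foreign key on the "many" side, while a many-to-many relationship is usually resolved through a separate junction table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.executescript("""
    -- One-to-many: ONE team has MANY players (foreign key on the "many" side).
    CREATE TABLE Team   (TeamID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Player (
        PlayerID INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        TeamID   INTEGER NOT NULL REFERENCES Team(TeamID)
    );

    -- Many-to-many: an order can hold many products and a product can appear
    -- in many orders, so a junction table pairs them up.
    CREATE TABLE Orders  (OrderID INTEGER PRIMARY KEY);
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE OrderItem (
        OrderID   INTEGER NOT NULL REFERENCES Orders(OrderID),
        ProductID INTEGER NOT NULL REFERENCES Product(ProductID),
        PRIMARY KEY (OrderID, ProductID)
    );

    -- Hypothetical sample data.
    INSERT INTO Team VALUES (1, 'Falcons');
    INSERT INTO Player VALUES (1, 'Ana', 1), (2, 'Ben', 1);
    INSERT INTO Orders VALUES (100), (101);
    INSERT INTO Product VALUES (7, 'Lamp');
    INSERT INTO OrderItem VALUES (100, 7), (101, 7);
""")

# Count the "many" side of each relationship.
players_on_team = conn.execute(
    "SELECT COUNT(*) FROM Player WHERE TeamID = 1").fetchone()[0]
orders_with_lamp = conn.execute(
    "SELECT COUNT(*) FROM OrderItem WHERE ProductID = 7").fetchone()[0]
print(players_on_team, orders_with_lamp)
```

A one-to-one relationship could be sketched the same way by adding a UNIQUE constraint to the foreign key column, so that each referenced row can be linked at most once.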
1. Conceptual data model: The conceptual data model serves as the foundational layer in the
process of database design. This model covers only the fundamentals and is represented through
visual tools like entity-relationship diagrams (ERDs) or other schemas.
Here’s an example from an e-commerce business. In a conceptual model, the primary entities
could include “Customer,” “Product,” “Order,” and “Payment.” Every entity signifies a core
data domain, linked by general relationships such as “Customer places Order” or “Order
contains Product” (one customer may place many orders, one product may be in many
orders, and so on).
However, this model will not feature data types, primary keys, or other technical details.
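A conceptual model is normally drawn rather than coded, but its content can be written down as plain data to show how little it contains. The sketch below is an assumption-laden illustration: the Relationship class is hypothetical, and treating “Order contains Product” as many-to-many assumes that an order may hold several products, which the text only implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A named link between two entities (hypothetical helper class)."""
    source: str
    verb: str
    target: str
    cardinality: str  # described in words only at this level


# Entities named in the e-commerce example; no attributes, types or keys yet.
entities = {"Customer", "Product", "Order", "Payment"}

relationships = [
    Relationship("Customer", "places", "Order", "one-to-many"),
    # Assumption: an order may contain several products.
    Relationship("Order", "contains", "Product", "many-to-many"),
]


def describe(rel):
    """Render a relationship as a short sentence."""
    return f"{rel.source} {rel.verb} {rel.target} ({rel.cardinality})"


# Every relationship must connect two known entities.
assert all(r.source in entities and r.target in entities for r in relationships)

summary = [describe(r) for r in relationships]
for line in summary:
    print(line)
```

Nothing here mentions columns, data types or keys; those only appear at the logical and physical levels.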
2. Logical data model: It describes how the conceptual model will be implemented. This model
offers a clear framework that guides database architects in structuring data effectively. The
logical model is especially helpful for larger, more complex projects.
This model expands on the basic framework of the conceptual model by incorporating
additional technical details.
It determines all of the entities required, their attributes, and their respective arrangements and
data types.
It clarifies the specific attributes and relationships (is a relationship one-to-one,
one-to-many or many-to-many?).
It describes the data structure (such as tables, columns and rows) and the data types associated
with the attributes. These types remain broad (for instance, “number” or “string”) instead of being
specific to any particular database management system (DBMS).
It identifies primary and foreign keys.
It establishes data constraints (for example, whether one customer can have more than one email ID
or contact number).
Normalization is applied to minimize data redundancy.
Although a logical data model is still independent of the actual database system in which the
database will be created, you can still take that system into account if it affects the design.
The logical model is typically created by architects and business analysts.
In the e-commerce example, we would outline the attributes of each entity. For example, a
“Customer” could include details such as “Customer ID,” “First Name,” and “Email.” The
data types of attributes like “Customer ID” and “First Name” are integer and string,
respectively. A specific database might not allow the space between Customer
and ID in the attribute name “Customer ID”. The logical model also specifies primary and foreign
keys, making “Customer ID” the primary key of the Customer table and “Customer ID” a foreign key
in the Order table.
The logical model maintains data quality by identifying data attributes and relationships, and it
helps organize data to minimize redundancy and enhance reliability.
It also makes it easy to translate the model into the technical requirements of the physical model.
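The logical model for the Customer and Order entities can be sketched in a DBMS-independent way. This is only one possible representation, assuming Python dataclasses; the Attribute and Entity classes and the “Order ID” attribute are hypothetical, and the data types are kept broad (“number”, “string”) as described above.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    data_type: str            # broad type such as "number" or "string"
    primary_key: bool = False
    references: str = ""      # "Entity.Attribute" when this is a foreign key


@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)

    def primary_key(self):
        """Names of the attributes that make up the primary key."""
        return [a.name for a in self.attributes if a.primary_key]

    def foreign_keys(self):
        """Map each foreign key attribute to the attribute it references."""
        return {a.name: a.references for a in self.attributes if a.references}


customer = Entity("Customer", [
    Attribute("Customer ID", "number", primary_key=True),
    Attribute("First Name", "string"),
    Attribute("Email", "string"),
])

# "Order ID" is a hypothetical attribute added so that Order has its own key.
order = Entity("Order", [
    Attribute("Order ID", "number", primary_key=True),
    Attribute("Customer ID", "number", references="Customer.Customer ID"),
])

print(customer.primary_key())
print(order.foreign_keys())
```

Attribute names still contain spaces and types are generic, because nothing at this level is tied to a particular database system.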
3. Physical data model: It describes how the logical model will be implemented. A physical data
model is a detailed representation of how data will be stored, organized and
accessed at the physical level (within a database system).
It translates a logical data model into a technical implementation ready for deployment.
This model implements the framework of the logical model by adding further
details specific to the database in which the data will be stored. Thus, it is database
specific.
It specifies the exact tables (names), columns (names), data types, constraints, and indexes
needed to implement the database structure on a specific database management system
(DBMS).
It establishes primary and foreign keys, views, indexes, access profiles, authorizations,
etc.
It is the most granular level of data modeling, focusing on technical details like storage
mechanisms and indexing strategies to optimize performance.
A crucial aspect of physical data modeling is optimizing database performance by
strategically designing indexes, partitioning tables, and considering data access patterns.
Denormalization may be applied to optimize query performance.
It is developed for a specific version of a DBMS, location, data storage or technology to be used
in the project.
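As a rough illustration of the physical level, the Customer and Order entities can be turned into concrete tables. The sketch assumes SQLite (through Python's built-in sqlite3 module) as the target DBMS; every table, column and index name is hypothetical, as is the access pattern the index is built for. Another DBMS would use its own type names and options.

```python
import sqlite3

# SQLite stands in for the target DBMS; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys

conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- "Customer ID" without the space
        first_name  TEXT NOT NULL,
        email       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT NOT NULL          -- ISO-8601 date stored as text
    );
    -- Index chosen for an assumed access pattern: look up orders by customer.
    CREATE INDEX idx_orders_customer_id ON orders(customer_id);
""")

# List the indexes that now exist on the orders table.
index_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'orders'")]
print(index_names)

# Ask the query planner how it would run the lookup the index was built for.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (1,)).fetchall()
print(plan)
```

Compared with the logical level, names are now legal identifiers, types are the ones the chosen DBMS offers, and the index reflects a decision about how the data will be queried.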
See the following link for a detailed example of the different levels of data modeling:
https://vertabelo.com/blog/er-diagram-for-online-shop/