Understanding Normalization
Voluntary Additional Learning Material
Flat Files
All information in one table
In a flat file, all information is stored in one table. There are no structures for indexing or
for recognizing relationships between records. Organizing data this way leads to a lot
of redundancy, because the same values are repeated in many rows.
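The redundancy can be made visible with a small sketch. The rows below are invented for illustration and are not the course data; the field names follow the examples used in this handout.

```python
from collections import Counter

# Hypothetical flat-file rows: every order repeats the full client details.
flat_rows = [
    {"Company": "Alpha GmbH", "City": "Hamburg", "Sales Region": "North", "Product": "Widget", "Revenue": 120.0},
    {"Company": "Alpha GmbH", "City": "Hamburg", "Sales Region": "North", "Product": "Gadget", "Revenue": 80.0},
    {"Company": "Beta AG", "City": "Munich", "Sales Region": "South", "Product": "Widget", "Revenue": 200.0},
    {"Company": "Alpha GmbH", "City": "Hamburg", "Sales Region": "North", "Product": "Widget", "Revenue": 60.0},
]

# Count how often each (Company, City, Sales Region) combination is stored.
client_copies = Counter(
    (row["Company"], row["City"], row["Sales Region"]) for row in flat_rows
)

for client, copies in client_copies.items():
    print(client, "is stored", copies, "time(s)")
```

Every extra copy of the same client details is a place where a typo or an outdated value can creep in.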
Datasheet View and Design View
In Datasheet View, you can see that each row contains one record. Each column
contains information of a specific data category, such as Company, City, or Sales
Region.
Example of Datasheet View
In Design View, the data itself is not visible. Only the data fields are visible. The data
fields correspond to the column headings.
© Sonic Performance Support, Hamburg, Germany, 2019 Page 1 of 5
Field properties are assigned to each data field. The field properties define the allowed
type of content, for example that the “City” field must always contain text and the
“Revenue” field must always contain numbers in a specific currency format.
Data fields in Design View
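The idea behind field properties can be sketched in a few lines of Python. This is only an illustration of type checking per field, not how any particular database tool implements it; `FIELD_TYPES` and `validate_record` are hypothetical names.

```python
# Hypothetical field definitions: field name -> required Python type.
FIELD_TYPES = {"Company": str, "City": str, "Revenue": float}


def validate_record(record):
    """Return a list of problems found in one record (empty if it is valid)."""
    problems = []
    for field, expected_type in FIELD_TYPES.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif not isinstance(record[field], expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    return problems


good = {"Company": "Alpha GmbH", "City": "Hamburg", "Revenue": 120.0}
bad = {"Company": "Alpha GmbH", "City": 12345, "Revenue": "a lot"}

print(validate_record(good))
print(validate_record(bad))
```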
A common misconception
It is a common misconception that flat files are easy to handle just because it is not
necessary to establish relationships between different tables.
In fact, flat files cause problems when data need to be changed, for example a company
name, a product name, a sales representative, or the layout of the sales regions.
Solving this task with “Find and Replace” is not a suitable workaround, because it misses
every entry that is spelled differently. Instead, take advantage of the benefits of
normalization. In this way, you can reduce typos, avoid data duplication, and often
speed up processing.
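The difference shows up as soon as a company is renamed. The following sketch uses invented data and plain Python structures to compare both approaches; it assumes that one flat-file row contains a typo, which is a typical situation in practice.

```python
# Flat file: the company name is repeated in every row (hypothetical data).
flat_rows = [
    {"Company": "Alpha GmbH", "Product": "Widget"},
    {"Company": "Alpha GmbH", "Product": "Gadget"},
    {"Company": "Alpha Gmbh", "Product": "Widget"},  # typo: lowercase "h"
]

# "Find and Replace" only catches exact matches, so the misspelled row is missed.
for row in flat_rows:
    if row["Company"] == "Alpha GmbH":
        row["Company"] = "Alpha Holding GmbH"

flat_names = {row["Company"] for row in flat_rows}

# Normalized: the name is stored once and orders refer to it by Client ID.
clients = {1: "Alpha GmbH"}
orders = [
    {"Client ID": 1, "Product": "Widget"},
    {"Client ID": 1, "Product": "Gadget"},
    {"Client ID": 1, "Product": "Widget"},
]

clients[1] = "Alpha Holding GmbH"  # one change covers every order
normalized_names = {clients[order["Client ID"]] for order in orders}

print(sorted(flat_names))
print(sorted(normalized_names))
```

After the rename, the flat file contains two different spellings of the same company, while the normalized data contains exactly one.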
Normalization
Master data and Transactions
Normalization requires several tables. In each table, each value is stored only once.
After normalization, there are usually two types of tables: master data and
transactions. Master data include products, clients, sales regions, and sales
representatives.
In the Revenue table, the individual transactions are logged, i.e. which client ordered
which product on which date. Note that the Revenue table does not store the clients’
names, only the Client ID, and thus refers to a specific record in the Clients table.
The same structure applies to products.
You may ask yourself what the purpose of the Order ID in the Revenue table is. The
Order ID provides a unique ID for each entry in the Revenue table. In this way, another
requirement for normalized tables is met, namely that each record has a unique
identifier.
Data from the normalized table “Clients”
Data from the normalized table “Product”
Data from the normalized table “Revenue”
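The three tables described above can be sketched with Python's built-in `sqlite3` module. The column names and sample rows are assumptions made for this illustration; only the table names and the three ID fields come from this handout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Master data (Clients, Product) and transactions (Revenue), each with a unique ID.
conn.executescript("""
CREATE TABLE Clients (ClientID INTEGER PRIMARY KEY, Company TEXT NOT NULL, City TEXT);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT NOT NULL);
CREATE TABLE Revenue (
    OrderID   INTEGER PRIMARY KEY,
    ClientID  INTEGER NOT NULL,
    ProductID INTEGER NOT NULL,
    OrderDate TEXT NOT NULL
);
""")

# Hypothetical sample data: names are stored once, orders only store IDs.
conn.executemany("INSERT INTO Clients VALUES (?, ?, ?)",
                 [(1, "Alpha GmbH", "Hamburg"), (2, "Beta AG", "Munich")])
conn.executemany("INSERT INTO Product VALUES (?, ?)",
                 [(10, "Widget"), (11, "Gadget")])
conn.executemany("INSERT INTO Revenue VALUES (?, ?, ?, ?)",
                 [(100, 1, 10, "2019-01-15"),
                  (101, 2, 10, "2019-01-16"),
                  (102, 1, 11, "2019-02-01")])

# The primary key guarantees that an Order ID cannot be used twice.
try:
    conn.execute("INSERT INTO Revenue VALUES (100, 2, 11, '2019-03-01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Revenue").fetchone()[0]
print("Duplicate Order ID rejected:", duplicate_rejected)
print("Orders stored:", order_count)
```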
Note that every table has a unique identifier: Product ID, Client ID, and Order ID. With
these preparations done, you can establish relationships between the tables. Each
relationship links an ID field that appears in both tables. The Revenue table uses the
Client ID to point to a specific record in the Clients table.
Product IDs work the same way.
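Such a relationship can be expressed as a foreign key. The sketch below again uses `sqlite3` with assumed column names and invented rows. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other database systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.executescript("""
CREATE TABLE Clients (ClientID INTEGER PRIMARY KEY, Company TEXT NOT NULL);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT NOT NULL);
CREATE TABLE Revenue (
    OrderID   INTEGER PRIMARY KEY,
    ClientID  INTEGER NOT NULL REFERENCES Clients(ClientID),
    ProductID INTEGER NOT NULL REFERENCES Product(ProductID)
);
INSERT INTO Clients VALUES (1, 'Alpha GmbH');
INSERT INTO Product VALUES (10, 'Widget');
""")

# Valid: Client ID 1 and Product ID 10 both exist in the master data.
conn.execute("INSERT INTO Revenue VALUES (100, 1, 10)")

# Invalid: Client ID 99 does not exist, so the relationship rejects the row.
try:
    conn.execute("INSERT INTO Revenue VALUES (101, 99, 10)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

print("Order for unknown client rejected:", dangling_rejected)
```

The relationship therefore does more than connect the tables: it also prevents orders that point to a client or product that does not exist.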
For the data used throughout this course, a normalized table structure looks like this:
Queries
Queries let you create specific views of data from the tables that are related
to each other. You can now ask questions of your database.
Query example: Which client has bought which product?
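Outside of Power BI, the same question can be asked with an SQL join. This is a minimal sketch using `sqlite3`; the column names and sample rows are assumptions and not the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Clients (ClientID INTEGER PRIMARY KEY, Company TEXT NOT NULL);
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT NOT NULL);
CREATE TABLE Revenue (
    OrderID   INTEGER PRIMARY KEY,
    ClientID  INTEGER NOT NULL,
    ProductID INTEGER NOT NULL
);
INSERT INTO Clients VALUES (1, 'Alpha GmbH'), (2, 'Beta AG');
INSERT INTO Product VALUES (10, 'Widget'), (11, 'Gadget');
INSERT INTO Revenue VALUES (100, 1, 10), (101, 2, 10), (102, 1, 11);
""")

# Follow the IDs in Revenue to the matching names in Clients and Product.
rows = conn.execute("""
SELECT c.Company, p.ProductName
FROM Revenue AS r
JOIN Clients AS c ON c.ClientID = r.ClientID
JOIN Product AS p ON p.ProductID = r.ProductID
ORDER BY r.OrderID
""").fetchall()

for company, product in rows:
    print(company, "bought", product)
```

The query result shows names again, although the Revenue table itself stores only IDs.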
In Power BI, you can create various visualizations using the relationships between the
tables. You can create lists like the one above, as well as charts, tables, cards, maps,
sparklines, and much more. You can also run calculations on fields that contain values.
Links
If you want to learn more, this Wikipedia article is a good start. Try to understand the
first, second, and third normal forms (1NF, 2NF, and 3NF).
[Link]