0% found this document useful (0 votes)
11 views

CS103_ Databases_ Normalizing a Database Cheatsheet _ Codecademy

Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views

CS103_ Databases_ Normalizing a Database Cheatsheet _ Codecademy

Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

Cheatsheets / CS103: Databases

Normalizing a Database

Database Normalization

Database normalization is a process by which database


and table structures are created or modified in order to
address inefficiencies/complexities related to the
following:
data storage
data modification
querying database tables

Repeating Column Groups

Repeating groups of columns in a database table can


create inefficiencies and errors related to data storage,
querying, and modification. For example, consider a
songs table with the following columns:
id
title
artist1_id
artist1_name
artist2_id
artist2_name
The repeating artist-related columns likely contain
duplicated data. It would also be difficult to sort this
table by artist.

https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 1/7
3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

Independent Columns

In a relational database, columns that are not


dependent on the primary key of a table can create
inefficiencies related to data storage and modification,
while also increasing the potential for future data
errors. This is often because columns that are not
dependent on the primary key contain duplicated
information. For example, in a books table with
columns isbn , title , length , author_id , and
author_name , the author-related columns will
contain duplicated data if the same author has written
multiple books; moving author-related information to a
separate table would solve this problem.

Updating Duplicated Data

If the same information is stored in multiple locations in |order_id |price |cust_id |cust_email
a database table, a database manager needs to be
|
careful when updating the table. For example, in the
database table shown here, each customer’s email | ----- | -----| -------| -------------
address is stored in multiple rows. Therefore, in order - |
to update a customer email, multiple fields will need to
| 1 |20.43 | 1
be changed. Normalizing the table gets rid of
duplicated data and therefore makes data errors less |[email protected] |
likely. | 2 |51.33 | 1
|[email protected] |
| 3 |80.01 | 2
|[email protected] |
| 4 |33.27 | 2
|[email protected] |

https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 2/7
3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

Inserting Data Without a Primary Key

Problems can occur when updating a database table if


new information needs to be inserted before the
associated primary key is known. This can happen if
columns are not dependent on the primary key. For
example, consider a songs table with the following
columns:
song_id (primary key)
title
length
artist_id
artist_name
It would be impossible to add artist information to this
table without also adding a value for song_id , which
could be problematic.

Database Efficiency and Use

The efficiency of any database schema is dependent on


how the database is going to be used. While
normalization solves many problems related to data
storage, modification, and querying, it can also make
some things more difficult. For example, tables in a
normalized database will need to be joined back
together if a query relies on information in multiple
tables. It is therefore not always beneficial to normalize
every database table. Decisions about schema design
should be made with future use in mind!

1NF Databases

A 1NF database is an atomic database. In this case, // This would be a non-atomic table
atomic means that each cell contains one value and
| Title | Length | Type |
each row is unique. In the given example, we can see
that the non-atomic table has cells with more than one |-----------------|------------|------|
value and non-unique rows. | Example A, B, C | 125 inches | B, C |
| Example A, B, C | 125 inches | B, C |
// This would be an atomic table
| ID | Title | Length | Type |
|----|-----------|------------|------|
| 1 | Example A | 125 inches | B |
| 2 | Example A | 125 inches | C |
https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 3/7
3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

A 2NF Database

When a database is said to be 2NF, that means the // A 1NF Database with partial
database is both 1NF and contains no partial
dependencies
dependencies. A partial dependency is when an
attribute depends partly on the table’s primary key. | ID | Name | Address ID | Address
|
|----|---------|------------|------------
-----------|
| 1 | Logan | 1 | 7777 Willow
Drive |
| 2 | Charlie | 2 | 8888 Blue
Bonnet Road |
| 3 | Johanna | 1 | 7777 Willow
Drive |
// A 2NF database
*Name.db
| ID | Name | Address ID |
|----|---------|------------|
| 1 | Logan | 1 |
| 2 | Charlie | 2 |
| 3 | Johanna | 1 |
*Address.db
| Address ID | Address |
|------------|-----------------------|
| 1 | 7777 Willow Drive |
| 2 | 8888 Blue Bonnet Road |

https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 4/7
3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

A 3NF database

When making a 3NF database, two goals need to be // A 2NF database with transitive
accomplished. The first being that the database is
functional dependency.
already 2NF, and the second being that the database
contains no transitive functional dependencies. A | ID | Name | Address |
transitive functional dependency is when a non-prime |----|---------|-----------------------|
attribute is dependent on another non-prime attribute.
| 1 | Logan | 7777 Willow Drive |
| 2 | Charlie | 4444 Blue Bonnet Road |
| 4 | Hannah | 4444 Blue Bonnet Road |
// A 3NF database with no transitive
functional dependencies.
*name.db
| ID | Name | Address ID |
|----|---------|------------|
| 1 | Logan | 1 |
| 2 | Charlie | 2 |
| 4 | Hannah | 2 |
*Address.db
| Address ID | Address |
|------------|-----------------------|
| 1 | 7777 Willow Drive |
| 2 | 4444 Blue Bonnet Road |

https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 5/7
3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

Update Anomalies

When updating data in a non-normalized database, | ID | Name | City | Total


sometimes not all of the data can get updated due to
Sales |
the lack of normalization. Another possible problem
can be updating the wrong data. This is called an |----|--------|---------------|----------
update anomaly and can be fixed by making sure the ---|
database has a higher level of normalization.
| 1 | Arthur | New York City | $26,000
|
| 2 | Sarah | Cincinnati | NULL
|
| 3 | Josh | Cincinnati | $20,000
|
// In the table above, if we updated the
sales of an employee in cincinnati
without specifics, we would end up
updating either both of the sales for
Sarah or Josh, or by possibly updating
the wrong employee.

Insertion Anomalies

Sometimes, when working with a non-normalized | ID | Name | City | Total


database, incomplete data being added to the database
Sales |
can lead to NULL values existing within the database.
This is called an insertion anomaly and can be |----|--------|---------------|----------
prevented by making sure the database has a higher ---|
level of normalization.
| 1 | Arthur | New York City | $26,000
|
| 2 | Sarah | Cincinnati | NULL
|
| 3 | Josh | Los Angeles | $20,000
|
// In the table above, new employees
won't have any sales yet meaning that
there will be a number of NULL values in
the database until a set number exists.

https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 6/7
3/11/24, 11:35 AM CS103: Databases: Normalizing a Database Cheatsheet | Codecademy

Deletion Anomalies

When working with a non-normalized database, a


deletion anomaly can occur. A deletion anomaly is
when a query ends up deleting more data from the
database than was intended due to a lack of
normalization.

Print Share

https://www.codecademy.com/learn/cspath-databases/modules/cspath-normalizing-a-database/cheatsheet 7/7

You might also like