CS doc - Google Docs
Knowledge:
CS Option A: HL
A.1.4 Describe the use of transactions, states and updates to maintain data
consistency and integrity
States: The stages that a transaction goes through while its changes to the
database (additions, updates, and deletions) are tracked. The current state shows
whether the transaction is still in progress, was committed, or was aborted, which
maintains consistency across the database and ensures data integrity.
Transaction: A transaction ensures that these operations are completed as a single
logical, atomic unit: either every operation takes effect or none does.
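The committed and aborted states described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the syllabus: the accounts table, the transfer function and the CHECK rule are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical example data: two accounts that may never go below zero.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between two accounts as one atomic transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return "committed"
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was undone as well.
        return "aborted"


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


first_state = transfer(conn, "alice", "bob", 30)
after_first = balances(conn)
second_state = transfer(conn, "alice", "bob", 500)  # alice cannot afford this
after_second = balances(conn)
```

In the second transfer the credit to bob is executed first, but because the debit from alice breaks the rule, the whole transaction is rolled back and both balances stay as they were.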
Concurrency Control:
Concurrency control is crucial for designing high-performance database systems
that can handle multiple users simultaneously. It is essential to manage
concurrency effectively to maintain data consistency, integrity, and reliability.
● Locking: This prevents other transactions from accessing data that's being
modified.
● Timestamp ordering: Transactions are ordered based on their timestamp.
● Multiversion concurrency control (MVCC): Keeps multiple versions of data
to allow read operations without blocking writes.
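The idea behind locking can be sketched with threads in plain Python. This is only an analogy, assuming an in-memory balance instead of a real database: a DBMS uses its own lock manager, and the names account, account_lock and deposit are hypothetical.

```python
import threading

# Hypothetical shared record that several "users" update at the same time.
account = {"balance": 0}
account_lock = threading.Lock()


def deposit(times):
    """Add 1 to the balance `times` times, holding the lock for each update."""
    for _ in range(times):
        with account_lock:  # other threads wait here while the data is modified
            account["balance"] += 1


threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final_balance = account["balance"]
```

Because every read-modify-write happens while the lock is held, no update can be lost and the final balance is always 4 x 10,000.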
Indexing strategies:
1. B-tree indexes: Balanced tree structure, good for a wide range of queries
2. Hash indexes: Excellent for equality comparisons, but not range queries
3. Bitmap indexes: Efficient for low-cardinality columns (columns with few
distinct values)
NOTE: While indexes speed up reads, they slow down writes, because every insert,
update or delete must also update the index. It's all about finding the right
balance.
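The effect of an index on a read can be observed in SQLite, whose ordinary indexes are B-trees. The sketch below is an illustration under assumptions: the users table reuses the field names from these notes, while the index name idx_users_age and the helper query_plan are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (UserID INTEGER PRIMARY KEY, UserName TEXT, UserAge INTEGER)"
)
conn.executemany(
    "INSERT INTO users (UserName, UserAge) VALUES (?, ?)",
    [(f"user{i}", 18 + i % 50) for i in range(1000)],
)


def query_plan(conn, sql):
    """Return SQLite's own description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


query = "SELECT UserName FROM users WHERE UserAge = 30"
plan_before = query_plan(conn, query)  # no index yet: every row is scanned
conn.execute("CREATE INDEX idx_users_age ON users (UserAge)")
plan_after = query_plan(conn, query)   # now the index is searched instead
```

Printing plan_before and plan_after shows the change from a full table scan to a search that uses the index. The exact wording of the plan text varies between SQLite versions.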
A.2.1 Define the terms Database Management System (DBMS) and Relational
Database Management System (RDBMS)
Database Management System: System software used to create and manage a
database. It provides the user with a systematic way to create, retrieve, update and
manage data.
Relational Database Management System: A DBMS based on the relational
model. The data in an RDBMS is stored in tables, each consisting of fields (e.g.
UserID, UserAge, UserName) and records, where each record is an individual entry
that fills in the fields.
Database Privacy
Privacy focuses on protecting sensitive personal information. This includes
implementing:
1. Data anonymization: Removing personally identifiable information
2. Authentication: Verifying the identity of users
3. Access controls: Limiting who can see sensitive data
4. Compliance with regulations.
A.2.4 Define the term schema
Schema: The organisation of the data, described as a blueprint of how the
database is constructed.
NOTE: Don't confuse schema with the actual data. The schema is the structure,
while the data is what fills that structure.
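The difference between schema and data can be seen in SQLite, which keeps each table's definition in its built-in sqlite_master catalogue. This is a small sketch assuming a hypothetical users table; the helper schema_of is not a library function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (UserID INTEGER PRIMARY KEY, UserName TEXT NOT NULL)")
conn.execute("INSERT INTO users (UserName) VALUES ('Ada')")


def schema_of(conn, table):
    """Return the stored CREATE TABLE statement, i.e. the structure only."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row[0]


schema_before = schema_of(conn, "users")
conn.execute("DELETE FROM users")  # remove all data
schema_after = schema_of(conn, "users")
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

After the delete the table holds no records, yet the schema is unchanged: the structure exists independently of the data that fills it.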
A.2.5 Identify the characteristics of the three levels of the schema: conceptual,
logical, physical
Physical:
- Physical storage structure of the database
- Low-level data structures described in detail
Logical:
Logical schema: Higher-level abstraction, defining data relationships and
constraints.
Outlines:
● Entities, attributes, and relationships.
● Constraints ensuring data integrity and accuracy (e.g., referential integrity
constraints).
Acts as an interface between business requirements and physical implementation.
Maintains consistent data structure even as physical implementation evolves,
aiding in data access and query performance.
Ensures data consistency and supports decision-making by providing a coherent
view of data for business and technical stakeholders.
Conceptual:
High-level representation of database structure and organization.
Provides a unified view of organizational data, abstracting from physical storage
and processing.
Defines:
● Entities and their attributes.
● Relationships between entities (semantic model of data).
Serves as a bridge between business requirements and the physical database
implementation.
Foundation for logical schema (detailed data relationships and constraints) and
physical schema (data storage).
Enables consistent, organized data views that adapt as the database evolves. It
describes the part of the database that the user is interested in and hides the
remaining database from the user.
Data Mining
Data mining is the process of discovering patterns and knowledge from large
amounts of data.
Common data mining techniques:
1. Classification: Predicting which category something belongs to
a. Ex: Classify customers into segments based on purchasing behaviour
2. Clustering: Grouping similar items together
a. Ex: Cluster products that are often bought together
3. Association rule mining: Finding relationships between variables
4. Anomaly detection: Identifying unusual patterns
a. Ex: Detect fraudulent transactions
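As an illustration of anomaly detection, the sketch below flags transaction amounts that lie far from the average. It is one simple approach (a z-score test) under assumptions: the sample amounts, the threshold of 2 standard deviations and the function name find_anomalies are all invented for the example, and real fraud detection uses far richer models.

```python
from statistics import mean, pstdev


def find_anomalies(amounts, threshold=2.0):
    """Return the amounts more than `threshold` standard deviations from the mean."""
    mu = mean(amounts)
    sigma = pstdev(amounts)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]


# Hypothetical card payments: seven everyday purchases and one outlier.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]
suspicious = find_anomalies(amounts)
```

Here only the payment of 500 is flagged, because it is the only value more than two standard deviations away from the mean.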
A.2.6 Outline the nature of a Data Dictionary
Explanation:
A data dictionary is an inventory that holds all information about the fields of
an RDBMS, such as the type of data that should be entered, a limit on the
number of characters or digits, a description, and any other information that could
be of use.
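Most DBMSs can report this kind of information themselves. As a rough sketch, SQLite's PRAGMA table_info statement lists each field's name, declared type and constraints; the users table and the helper data_dictionary below are hypothetical, and a full data dictionary would also hold descriptions written by the designer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        UserID   INTEGER PRIMARY KEY,
        UserName VARCHAR(50) NOT NULL,
        UserAge  INTEGER
    )
    """
)


def data_dictionary(conn, table):
    """Build a simple data dictionary for one table.

    PRAGMA table_info returns one row per field:
    (cid, name, type, notnull, dflt_value, pk).
    The table name is inserted directly, so it must come from trusted code.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "field": name,
            "type": col_type,  # declared type; SQLite does not enforce the length
            "required": bool(notnull),
            "primary_key": bool(pk),
        }
        for _cid, name, col_type, notnull, _default, pk in rows
    ]


entries = data_dictionary(conn, "users")
```

Each entry describes one field, for example that UserName is declared as VARCHAR(50) and must not be left empty.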
A.2.9 Define the following database terms: table, record, field, primary key,
secondary key, foreign key, candidate key, composite primary key, join
Table: A relation or a file.
Record: A tuple or row.
Field: An attribute or column.
Primary key: A primary key is a field whose value is unique to each record and is
used to identify that record from the rest. Typically, the primary key is a single
column, such as an ID.
Secondary key / Candidate key: A candidate key is a column that could uniquely
identify each record and so is eligible to be the primary key. A candidate key that
has not been chosen as the primary key is often called a secondary key.
Foreign key: A foreign key is a column in a relational database that provides a
link between two tables by referencing the primary key of another table.
Composite primary key: A combination of fields that, when used together,
identifies each record uniquely. Commonly used when no single column can
identify each record uniquely, but a combination of columns can.
Join: An SQL operation used to create a link between two tables based on
matching columns. It typically matches a foreign key to a primary key.
Inner join: A relational database operation that merges rows from two tables based
on a shared attribute, usually a primary key and foreign key relationship. It
retrieves only those tuples (rows) that have matching values in both tables, so
non-matching records are left out of the result. This makes it a core operation for
retrieving and analysing related data in SQL databases.
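The key and join definitions above can be tried out with a short sqlite3 sketch. The customers and orders tables and their contents are hypothetical examples, not taken from the syllabus.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE customers (
        CustomerID INTEGER PRIMARY KEY,   -- primary key
        Name       TEXT NOT NULL
    );
    CREATE TABLE orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER REFERENCES customers (CustomerID),  -- foreign key
        Item       TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, 'Keyboard'), (11, 1, 'Mouse'), (12, 2, 'Monitor');
    """
)

# Inner join: keep only the rows whose CustomerID appears in both tables.
joined = conn.execute(
    """
    SELECT customers.Name, orders.Item
    FROM customers
    INNER JOIN orders ON orders.CustomerID = customers.CustomerID
    ORDER BY orders.OrderID
    """
).fetchall()
```

Linus has no orders, so he does not appear in the joined result, while Ada appears once for each of her two orders.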
A.2.10 Identify the different types of relationships within a database:
one-to-one, one-to-many, many-to-many
One-to-One: Each record in a table is associated with one and only one record in
another table.
One-to-Many: Each record in one table can be associated with many records in
another table.
Many-to-Many: Multiple records in a table are associated with multiple records in
another table. In this case, a join table is used, which holds a foreign key to each
of the two tables.
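A many-to-many relationship with a join table might look like the sketch below. The students, courses and enrolments tables are hypothetical; the join table uses a composite primary key made of its two foreign keys, which is a common design choice but not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE students (StudentID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE courses  (CourseID  INTEGER PRIMARY KEY, Title TEXT NOT NULL);
    -- Join table: one row per (student, course) pair.
    CREATE TABLE enrolments (
        StudentID INTEGER NOT NULL REFERENCES students (StudentID),
        CourseID  INTEGER NOT NULL REFERENCES courses (CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO courses VALUES (100, 'Databases'), (200, 'Networks');
    INSERT INTO enrolments VALUES (1, 100), (1, 200), (2, 100);
    """
)


def courses_for(conn, student_name):
    """List the course titles one student is enrolled in."""
    rows = conn.execute(
        """
        SELECT courses.Title
        FROM students
        INNER JOIN enrolments ON enrolments.StudentID = students.StudentID
        INNER JOIN courses ON courses.CourseID = enrolments.CourseID
        WHERE students.Name = ?
        ORDER BY courses.CourseID
        """,
        (student_name,),
    ).fetchall()
    return [title for (title,) in rows]


def students_in(conn, course_title):
    """List the names of the students enrolled in one course."""
    rows = conn.execute(
        """
        SELECT students.Name
        FROM courses
        INNER JOIN enrolments ON enrolments.CourseID = courses.CourseID
        INNER JOIN students ON students.StudentID = enrolments.StudentID
        WHERE courses.Title = ?
        ORDER BY students.StudentID
        """,
        (course_title,),
    ).fetchall()
    return [name for (name,) in rows]


ada_courses = courses_for(conn, "Ada")
database_students = students_in(conn, "Databases")
```

One student can take many courses and one course can have many students, yet each individual link is stored as a single row in the join table.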
Short Summary:
• 1NF has no repeating groups: each field holds a single (atomic) value.
• 2NF is based on full functional dependency: every non-key field depends on the whole primary key.
• 3NF involves the removal of transitive dependencies.