Database Midterm Notes

Uploaded by

Aisha Emad

SQL Server Shortcuts:


- Ctrl+0: Enter NULL in a column.
- Shift+F1: Context-sensitive help.
- F5: Execute SQL code (in the query editor) or refresh the object list (in
Object Explorer).
- F8: Open Object Explorer.
- Ctrl+L: View the estimated query plan (Query Analyzer in older versions).
- Store query results as tab-delimited or comma-separated.
- Use Verdana or Georgia 11-12 font for query result readability on LCD screens.
Database Concepts and Terms:
Data: Raw information, such as facts, images, or sounds, that may or may not be
relevant or useful for a specific task.
Information: Processed and coded data with appropriate form and content for a
particular use.
Knowledge: Concepts residing in the human brain, encompassing instinct, ideas,
rules, and procedures that guide actions and decisions.
Database (DB): Integrated set of records containing relevant data for a specific
subject and purpose, organized as a self-describing collection of integrated
records. A database's purpose is to help track transactions and represent data as
desired.
Database Management System (DBMS): Computer software facilitating access to
databases, performing management, security, control, and providing data
processing, storage, and reporting.
Database System (DBS): System comprising database files, DBMS software, and
personnel organizing them.
Table: Structures in relational database systems where data sets with the same
features are stored. Tables consist of columns and rows.
Record: Each row in a table, formed by the combination of fields, also known as a
row or tuple.
Column: Data field or attribute defined in the database for each record within
tables. Attributes can have different types based on their function.
Primary Key (PK): Field or group of fields used to uniquely define each record in
the table.
Foreign Key (FK): Field referring to the key field of a related table.
Default (DF): Database objects that supply a specified value for a column when no
value is entered, instead of leaving it NULL.
Index (IND): Database objects providing faster access to records in a table based
on a specific field's data.
View: Customized representations of data from one or more tables or other views.
Views retrieve data from underlying tables without storing it themselves.
Stored Procedure (SP): Database objects storing SQL code for performing
operations within the database. They streamline repetitive tasks by allowing the
code to be written once and reused.
Trigger: Database objects containing code that runs automatically in response to
specific actions in the database (such as INSERT, UPDATE, or DELETE). Triggers
are defined on tables and executed before or after those actions on the data.
Structured Query Language (SQL): Standard query language enabling operations
such as storing, processing, changing, and querying data in relational
databases.
Entity-Relation Diagram (ERD): Diagrams modeling real-world entities, their
attributes, and the relationships between them.
Dataset: Collection of records, including entire tables, subsets, or selected fields
from multiple tables.
Metadata: Information about the database and its objects, such as tables, views,
indexes, triggers, attributes, and defaults.
Normalization: Process of organizing data into a relational database model,
reaching the third normal form (3NF) as the minimum standard.
Transactions: Records of the events and movements (such as sales or payments)
that the database tracks.
Relationship: Definition of connections between tables, fields, and data in
relational databases, including one-to-one (1:1), one-to-many (1:N), and many-to-
many (N:M) relationships.
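
Several of the terms above (table, PK, FK, default, index, view) can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database rather than SQL Server, and every table, column, and index name in it is hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database; all object names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,      -- primary key (PK)
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    city     TEXT DEFAULT 'Unknown',    -- default value used when none is given
    dept_id  INTEGER REFERENCES department(dept_id)  -- foreign key (FK)
);
CREATE INDEX idx_employee_name ON employee(emp_name);  -- index on one field
CREATE VIEW v_employee_dept AS                          -- view: stores no data itself
    SELECT e.emp_name, d.dept_name
    FROM employee e JOIN department d ON d.dept_id = e.dept_id;
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee (emp_id, emp_name, dept_id) VALUES (10, 'Ada', 1)")

# The omitted city column is filled by the default.
default_city = conn.execute(
    "SELECT city FROM employee WHERE emp_id = 10").fetchone()[0]

# The view reads from the underlying tables at query time.
view_rows = conn.execute("SELECT * FROM v_employee_dept").fetchall()

# A row that points at a missing department violates the FK.
try:
    conn.execute(
        "INSERT INTO employee (emp_id, emp_name, dept_id) VALUES (11, 'Bob', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Stored procedures are left out because SQLite does not support them; in SQL Server they would be created with `CREATE PROCEDURE`.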
Data, Information, and Knowledge:
Data and information can be encoded into digital formats for processing with IT
tools, whereas knowledge exists solely within the human brain and cannot be
written down.
- Knowledge represents the meanings individuals create in their minds using
data and information. It helps them interpret the world, shape it, and reduce
uncertainty.
- The value of information falls when it is outdated or insufficient, because
decisions based on such data can cause damage. Collecting data, generating
information, and creating knowledge all incur costs.
- Staff who are reluctant to share knowledge hold back collaboration and keep
valuable insights to themselves.
- Digitizing data with IT tools facilitates communication among humans, among
machines, and between the two. This exchange of information creates
significant social, financial, and intellectual value.
- Advancements in technologies like expert systems allow implicit knowledge to
be captured from human experts and converted into usable information.
Eventually, this process could lead to replicating human brain functions,
known as "Knowledge Technology."
- Unlike tangible goods such as food or a lit candle, data doesn't lose value
with use and can be reused endlessly. Economists classify data as a
"non-rival" good.

Knowledge held by employees is highly valuable for several reasons:
- Knowledge is scarce compared to abundant data and information.
- It's not easily imitated or copied; its use depends on permission from the
individual.
- Knowledge is irreplaceable and holds value specific to particular events and
situations.

- Digitization involves recording data using binary digits (0s and 1s), while
datalization refers to digitally recording data for processing into information
and knowledge, typically in text files.
- Datalization allows data to be tabulated, analyzed, and reused in various
contexts. This process differs from digitization, which converts analog
recorded data (e.g., pictures or text on paper) into binary code for
computer use.
What Is a Database?:
- A database comprises interconnected records centered around a specific
subject.
- The primary function of databases is to facilitate tracking, sorting, selecting,
and updating events, such as sales, student notes, or financial transactions.
- Databases enhance decision-making accuracy by analyzing data
fluctuations, trajectories, and trends, thereby reducing uncertainties.
- Databases contribute to the production of information and knowledge.
- A key benefit of databases is presenting recorded data to users in
desired formats and sequences.
Reasons for the Emergence of Databases:
- The quantity of recorded data is said to double annually, while information
doubles every 10 years (some sources claim even faster rates).
- Databases and database management systems (DBMS) have
enabled the accumulation and processing of this data and
information.
- The rapid growth of data has necessitated the development of
databases for efficient storage and analysis.
Database Processing Functions:
- Saving data
- Transferring data
- Storing data
- Recalling data
- Processing data
- Displaying data
Causes of Database Corruption:
- Power outage in institutions
- Improper shutdown of the operating system
- Errors in computer networks
- Viruses
- Messages triggered during file storage (e.g., screen savers, network messages)
- Corruption in disk drive areas or file assignment tables
- Piracy or intentional tampering
- Faulty codes (bugs) in DBMS and operating system software

Database Management System (DBMS):


- A Database Management System (DBMS) is software that manages databases,
providing access, security, and control.
- DBMS programs store and process data in computer memory, organizing them
based on different features.
- DBMS processes, reshapes, and queries stored data, especially in relational
databases, using SQL commands according to user requests.
Database Management System Architecture Types:
- Local (Single): Examples include Paradox, dBase, Access.
- Client/Server: Oracle, Sybase, MS SQL Server, DB2, MySQL.
- Distributed: DB2, SQL Server, Oracle.
- Mainframe: DB2.
Database Types:
- File processing databases
- Hierarchical model
- Network model
- Relational databases
- Object-oriented databases
- Web-based (XML) databases
- Multidimensional databases (data warehouses)
- Big data databases
- NoSQL databases
Advantages and Disadvantages of the Traditional DB Method:
Advantages:
- Other programs can continue running despite issues in a specific program.
- The data structure is easily understandable.
- Data can be easily classified.
Disadvantages:
- Risk of data duplication and inconsistency.
- Difficulty in sharing data easily.
- Dependency on experts for meeting new application requirements.
- Challenges in accessing and retrieving desired data.
Advantages and Disadvantages of the DB Approach:

Advantages:
- Efficiently retrieves and combines data.
- Facilitates adding new data without altering existing data.
- Saves hard drive space by reducing data duplication.
- Provides centralized control and consistency over data.
Disadvantages:
- Data can be easily changed.
- Corruption in the database can render all programs inoperable.
- Requires a robust security system.
- Installation and maintenance of the database system are expensive.
XML Databases:
- Traditionally, database operations depend on document operations to display
data, and document operations depend on the database for their content.
- XML technologies eliminate this dependency.
- XML defines clear distinctions between data, document structure, and
representation.
- XML standards, including DTD and XSL, are established by W3C and have become
global standards.

Features of XML Databases:


- Standardized display of data.
- Clear separation between data structure, content, and presentation.
- Control over the coding structure through programs.
- Establishment of a universally recognized standard XML file structure.
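
The separation between data and presentation described above can be sketched with Python's standard `xml.etree.ElementTree` module. The XML document, its element names, and the two output formats are hypothetical examples, not part of any standard: the same data is parsed once and then rendered in two different ways.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: it carries only data and structure, no layout.
xml_data = """
<students>
  <student id="1"><name>Lina</name><grade>90</grade></student>
  <student id="2"><name>Omar</name><grade>75</grade></student>
</students>
""".strip()

root = ET.fromstring(xml_data)

# Extract the data once...
records = [
    (s.get("id"), s.findtext("name"), int(s.findtext("grade")))
    for s in root.findall("student")
]

# ...and present it in two independent ways.
as_csv = "\n".join(f"{sid},{name},{grade}" for sid, name, grade in records)
as_report = [f"{name}: {grade}" for _, name, grade in records]
```

Changing either presentation does not require touching the XML data, which is the point the notes make about XML.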
Data Warehouses:
- The primary goal of Data Warehouses (DW) is to integrate distributed data,
regardless of its source format. This integration allows access to data stored in
various formats across different systems through a single system (DW), minimizing
errors caused by data duplication and differing definitions.
- Data warehouses facilitate decision-making by providing decision-makers with an
analytical view of the data. Decision-makers can analyze data stored in the DW
without affecting operational systems and while preserving their structures.
Four Basic Elements of Data Warehouses:
- Calculation (Measures/Facts)
- Dimensions
- Attributes
- Hierarchies
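
The four elements can be sketched as a tiny star schema, again using Python's `sqlite3` as a stand-in for a real data warehouse. The table names, columns, and figures are invented for illustration: `fact_sales.amount` is the measure, `dim_date` and `dim_product` are dimensions, their columns are attributes, and year > month forms a simple hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: descriptive attributes; year > month is a hierarchy.
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- Fact table: one row per sale, with the measure 'amount'.
CREATE TABLE fact_sales  (date_id INTEGER, product_id INTEGER, amount REAL);

INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2), (3, 2024, 1);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
INSERT INTO fact_sales VALUES (1, 1, 10.0), (2, 1, 5.0), (2, 2, 40.0), (3, 2, 20.0);
""")

# Aggregate the measure along the date hierarchy (year level).
sales_by_year = conn.execute("""
    SELECT d.year, SUM(f.amount)
    FROM fact_sales f JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY d.year ORDER BY d.year
""").fetchall()

# Aggregate the same measure by a product attribute.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.category ORDER BY p.category
""").fetchall()
```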
Data Mining:
- Data Mining (DM) involves analyzing large databases with DM software to
uncover hidden patterns and relationships. It aids in understanding the
past, guiding current decisions, and shaping the future.
- DM is a central step of Knowledge Discovery from Data (KDD), which
employs methods such as statistics, machine learning, artificial
intelligence, and decision trees.
- DM is essentially about uncovering useful insights from the data an
institution has accumulated.
Data Mining Transaction Categories:
- Discovery: Processing and formatting data stored in the database
to uncover patterns within a model framework.
- Prediction Model: Using the discovered models on processed data
to predict future outcomes.
- Research Analysis: Applying organizational plans and models
to aid decision-making, determining which model to use
for specific cases.
Big Data:
- Big Data (BD) comprises very large datasets, often raw, unstructured, or semi-
structured.
- Traditional systems (RDB) struggle to store and analyze Big Data effectively and
efficiently.
- Big Data analysis emphasizes 'what' is happening (patterns in events) rather
than 'why' it happens (the causes of those events).
- This focus on describing events rather than explaining their causes is a main
source of the benefits of Big Data analysis.
Big Data Features:
- Volume: Big Data is characterized by the very large amount of data involved.
- Velocity: Data within Big Data is produced at high speeds.
- Variety: Big Data exhibits a wide diversity of data types, including unstructured
formats like text, audio, video, and device recordings.
- Veracity: Big Data may have lower accuracy levels.
- Value: Effective analysis of Big Data aims to derive valuable insights, ensuring
useful outcomes.
NoSQL Databases:
- NoSQL encompasses non-relational databases that use alternative storage
structures, such as key-value, document, wide-column, graph, or object-oriented
models.
- NoSQL databases, particularly in Big Data environments today, are optimized for
managing large volumes of data.
- NoSQL databases typically don't require normalization or complex joins, and are
designed for fast reads, writes, and analysis at scale.
- MySQL and SQL Server are beginning to incorporate NoSQL features.
- Popular NoSQL databases include HBase, Apache Cassandra, and MongoDB, tailored
for Big Data environments. HBase runs on top of Hadoop, which is a distributed
storage and processing framework rather than a database itself.
Relational Databases:
- A Relational Database (RDB) is a structured, integrated collection of data created
for an application and stored on disk media.
- RDBs organize data into entities (tables) and establish relationships between
them based on the Entity-Relationship (ER) model.
- Ensuring the accuracy and consistency of the data in an RDB is referred to as
data integrity.
Key Features of Relational Databases According to Codd:
- Data is stored in two-dimensional tables (relations) comprising rows and
columns.
- Each record is unique, with intersections of rows and columns containing a single
value.
- The order of columns and records (rows) is irrelevant.
- Columns are assigned logical names corresponding to the data fields they
represent (e.g., Emp_FName, Emp_LName, Emp_BDate, Emp_Cell, etc.).
Types of Relationships in Relational DB:
One-to-One (1:1) Relation:
- Each record in one entity corresponds to at most one record in the other entity.
- This relationship is mutual, with only one record allowed on each side.
One-to-Many (1:N) Relation:
- For each record in one entity, there can be multiple records in another entity.
- This relationship is denoted as 1:N, where N stands for "many": any number of
related records in the second entity, not a fixed maximum.
Many-to-Many (M:N) Relation:
- Each record in either entity can be associated with multiple records in the
other.
- In a relational database, this relationship is usually implemented with a third
(junction) table that holds the keys of both entities.
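
A minimal sketch of the 1:N and M:N cases, using Python's `sqlite3` with hypothetical tables: customer/orders for 1:N (the foreign key sits on the "many" side) and student/course for M:N (linked through an `enrollment` table holding both keys).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1:N: one customer, many orders (the FK lives on the "many" side).
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id)
);

-- M:N: students and courses, linked through a third table.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)   -- each pairing appears only once
);

INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO orders VALUES (100, 1), (101, 1);
INSERT INTO student VALUES (1, 'Lina'), (2, 'Omar');
INSERT INTO course VALUES (1, 'Databases'), (2, 'Networks');
INSERT INTO enrollment VALUES (1, 1), (1, 2), (2, 1);
""")

# 1:N: one customer row is matched by several order rows.
orders_for_customer = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]

# M:N, read in both directions through the linking table.
courses_of_lina = [r[0] for r in conn.execute(
    "SELECT c.title FROM course c JOIN enrollment e ON e.course_id = c.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]
students_in_databases = [r[0] for r in conn.execute(
    "SELECT s.name FROM student s JOIN enrollment e ON e.student_id = s.student_id "
    "WHERE e.course_id = 1 ORDER BY s.name")]
```

A 1:1 relationship could be sketched the same way by adding a `UNIQUE` constraint to the foreign key column.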
Normalization:
- Detection and correction of anomalies (insert, delete, and update anomalies) in
the database by examining the designed tables individually.
- Normalization operations are applied specifically to tables, aiming to minimize
these anomalies in database designs.
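
An update anomaly, one of the abnormal situations named above, can be reproduced in a few lines. This is a sketch with made-up data in Python's `sqlite3`: in the flat table the instructor is repeated on every enrollment row, so a partial update leaves the data inconsistent, while the normalized design stores that fact once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unnormalized: the instructor is repeated for every enrolled student.
CREATE TABLE enrollment_flat (student TEXT, course TEXT, instructor TEXT);
INSERT INTO enrollment_flat VALUES
    ('Lina', 'Databases', 'Dr. Kaya'),
    ('Omar', 'Databases', 'Dr. Kaya');

-- Normalized: the instructor is stored once, in the course table.
CREATE TABLE course (title TEXT PRIMARY KEY, instructor TEXT);
CREATE TABLE enrollment (student TEXT, course TEXT REFERENCES course(title));
INSERT INTO course VALUES ('Databases', 'Dr. Kaya');
INSERT INTO enrollment VALUES ('Lina', 'Databases'), ('Omar', 'Databases');
""")

# Update anomaly: only one of the repeated rows is changed.
conn.execute(
    "UPDATE enrollment_flat SET instructor = 'Dr. Demir' WHERE student = 'Lina'")
instructors_flat = conn.execute(
    "SELECT COUNT(DISTINCT instructor) FROM enrollment_flat "
    "WHERE course = 'Databases'").fetchone()[0]

# In the normalized design there is a single row to update.
conn.execute("UPDATE course SET instructor = 'Dr. Demir' WHERE title = 'Databases'")
instructors_normalized = conn.execute(
    "SELECT COUNT(DISTINCT c.instructor) FROM enrollment e "
    "JOIN course c ON c.title = e.course").fetchone()[0]
```

After the partial update, the flat table reports two different instructors for the same course; the normalized tables cannot get into that state.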
Denormalization:
- Involves reintroducing recurring (redundant) data in a controlled and
systematic manner, for example through precomputed query results or extra
columns in tables, to improve performance.
- The basic rule in denormalization is to normalize first, then denormalize only
if performance issues arise.
Relational DB Design Process:
1. Conceptual Model: Developed during the analysis phase using the Entity
Relationship Diagram (ERD) to align with the requirements.
2. Logical Model: Created during the design phase using the relational DB model.
3. Physical Model: Developed during the design and implementation phase using
SQL Data Definition Language (DDL) code.
Relational DB Design Quality Elements:
- Number of tables
- Number of columns
- Number of relationships
- Depth of hierarchies
Considerations in Database Design:
- Start and end the design process with the end user in mind.
- Gain user support for innovations to avoid wasting time.
- Data modeling involves determining structures to store real-world data.
- Database design requires a blend of artistic and engineering skills.
- Consider employee-used forms, reports, and data files in model creation.
Database Design Types:
1. Integration of existing spreadsheets, text files, or DB reports to create a
database. Digitizing paper records can aid in this process.
2. Creation of databases for newly designed Information System projects.
3. Redesigning old databases to align with new requirements.
Implementation of the Database Design Process:
- Database Design involves Conceptual, Logical, and Physical design stages.
- The process begins with the creation of a conceptual model, followed by
transitioning to the Relational DB model (Logical).
- During the logical design stage, tables are normalized to minimize data
duplication, and denormalization may be considered for performance
improvement.
- Implementation is the final stage where the Relational Database is physically
created, tested, and populated with data using a specific DBMS.
Requirement Analysis:
- Requirement analysis is a phase of the Information System (IS) application
software development process rather than database (DB) design.
- It involves determining the functionality of the system to be developed, the data
to be stored, their relationships, and operational details.
- These requirements shape both the functionality of the IS and the structure of
the supporting database.
- Considerations for the database include data types, rules, table relationships,
and data management actions.
Considerations in Conceptual Design:
- Determine the tables.
- Define attributes for each table.
- Define key fields for tables.
- Define relationships between tables.
- Define the number of elements and participation constraints for each
relationship.
- Assign unique and meaningful names to each relation.
- Avoid unnecessary relationships in designs.
- Carefully plan data access paths to prevent data redundancy.
- Develop multiple alternatives in designs and select the most suitable one.
Physical Design:
- Select optimal data types for performance.
- Determine initial data size and growth trend for file organization.
- Analyze usage frequency to optimize data caching.
- Identify areas needing performance enhancements and create appropriate
indexes.
- Determine table partitioning needs based on data structure.
- Implement storage improvements, like compression.
- Establish security settings, including encryption.
- Define data recovery requirements and implement undo settings.
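
The indexing point above can be checked empirically. The sketch below uses Python's `sqlite3` and SQLite's `EXPLAIN QUERY PLAN` (SQL Server has its own execution-plan tools); the `orders` table and the index name are hypothetical. It compares the plan for the same query before and after an index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, float(i)) for i in range(1000)])

def plan(sql):
    """Return SQLite's query plan for sql as one string (detail is column 3)."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM orders WHERE customer_id = 7"

plan_before = plan(query)   # no usable index: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
plan_after = plan(query)    # the optimizer can now search via the index
```

Printing `plan_before` and `plan_after` shows the switch from a table scan to an index search; the exact wording of the plan text varies between SQLite versions.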
Relational DB Design Steps:
- Create the database and files (CREATE DATABASE).
- Define tables (CREATE TABLE).
- Specify attributes (fields) for each table (columns of the data group).
- Determine Primary Key (PK) fields for each table (columns uniquely identifying
rows).
- Establish relationships between tables by creating foreign keys (FK).
- Define domains (data groups that can be written to a field).
- Set constraints or limitations on data entry.
- Define business logic and rules.
- Create forms and reports.
- Establish security and management rules.
- Test the designed system.
- Enter the data and put the system into use.
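
Several of these steps (defining a table, its PK, the allowed domain of a field, and data-entry constraints) can be sketched together. The example uses Python's `sqlite3`; the `student` table and its rules are hypothetical. Each insert is attempted and the result records whether the database accepted it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,                    -- PK: uniquely identifies a row
    email      TEXT NOT NULL UNIQUE,                   -- constraint: required, no duplicates
    grade      INTEGER CHECK (grade BETWEEN 0 AND 100) -- domain: allowed value range
)""")

def try_insert(row):
    """Insert one row; return True if accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, "a@example.com", 90)),   # valid row
    try_insert((2, "a@example.com", 80)),   # duplicate email violates UNIQUE
    try_insert((3, "b@example.com", 120)),  # grade outside the domain violates CHECK
    try_insert((4, None, 50)),              # missing email violates NOT NULL
]
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Putting the rules in the table definition means every application that writes to the table is held to them, which is one way to read the "business logic and rules" step.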
Normalization and Denormalization:
- Normalization ensures databases are efficiently designed, minimizing redundant
data and accurately defining reference fields.
- Denormalization is applied after normalization to address performance issues,
by storing recurring data in a controlled, systematic way or by adding columns
to tables.
- Data Warehouses and OLAP systems frequently use denormalization for efficient
reporting, often utilizing structures like cubes.
- During the conversion from an Entity-Relationship Diagram (ERD) to a relational
DB, critical aspects such as table structures, columns, allowance for NULL values,
primary and foreign keys, and relationship types are determined.
Normal Forms:
- First Normal Form (1NF): Ensures each column contains only atomic data (no
sets) and eliminates record duplication.
- Second Normal Form (2NF): Requires all non-PK fields to be fully functionally
dependent on the entire PK, in addition to 1NF.
- Third Normal Form (3NF): Extends 2NF by ensuring non-PK fields are directly and
functionally dependent on the PK, without dependencies between non-PK fields.
- Domain Key Normal Form (DK/NF): Every constraint on the table must follow
logically from its key constraints and its domain constraints (the sets of
values each field may take).
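
As an illustration of the step from 2NF to 3NF, the sketch below (Python's `sqlite3`, hypothetical employee data) starts from a table in which `dept_name` depends on `dept_id`, a non-PK field, and splits it into two tables. Joining the two tables back together reproduces the original rows, so no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Not in 3NF: dept_name depends on dept_id, which is not the PK.
CREATE TABLE employee_flat (
    emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER, dept_name TEXT);
INSERT INTO employee_flat VALUES
    (1, 'Ada', 10, 'Sales'), (2, 'Bob', 10, 'Sales'), (3, 'Cem', 20, 'IT');

-- 3NF: every non-PK field depends only on the PK of its own table.
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY, emp_name TEXT,
    dept_id INTEGER REFERENCES department(dept_id));

INSERT INTO department SELECT DISTINCT dept_id, dept_name FROM employee_flat;
INSERT INTO employee SELECT emp_id, emp_name, dept_id FROM employee_flat;
""")

# Each department name is now stored once.
dept_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

# Joining the decomposed tables gives back the original rows.
original = conn.execute(
    "SELECT * FROM employee_flat ORDER BY emp_id").fetchall()
rejoined = conn.execute(
    "SELECT e.emp_id, e.emp_name, e.dept_id, d.dept_name "
    "FROM employee e JOIN department d ON d.dept_id = e.dept_id "
    "ORDER BY e.emp_id").fetchall()
```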
