CHPT 1
Intro to Databases
• Examples?
• Traditional Applications
• Numeric
• Textual
Recent Applications of Databases
• Multimedia Databases
• Data Warehouses
• OLAP
• Real-time/Active Databases
What is Data?
• Facts that can be recorded and that have implicit meaning (data with meaning is often called information)
• Example:
• the names, telephone numbers, and addresses of the people you know form a
collection of related data with an implicit meaning
What is a Database?
• Data has a Source: Usually represents some aspect of the real world.
• Data can Change: Updates to the data are usually reflected in the
database.
Database Management System (DBMS)
• A DBMS facilitates the processes of:
• Defining the database – specifying the data types, structures, and constraints of the
data to be stored in the database;
• Constructing the database – building and storing the data on some storage medium;
• Manipulating the database – functions such as querying the database to retrieve
specific data, updating the database to reflect changes, and generating reports; and
• Sharing the database – allowing multiple users and programs to access the database
simultaneously.
3. SECTION file stores data on each section of a course, such as the year and the lecturer
4. GRADE_REPORT table stores the grades that students receive in the various sections they
have completed
• A STUDENT record includes data attributes such as Name, Student_number, Class, and
Major:
• Name is a string of alphabetic characters
• Student_number is an integer
• Class is an integer
• Data records in the various tables are almost always related, and there can be many
relationships among the records.
• For example,
• the record for Smith in the STUDENT file is related to two records in the GRADE_REPORT file
that specify Smith’s grades in two sections.
• each record in the PREREQUISITE file relates two course records: one representing the course
and the other representing the prerequisite.
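The STUDENT and GRADE_REPORT descriptions above can be sketched with Python's standard sqlite3 module. This is only an illustration, not the schema used in the course: the column types, the foreign-key link, the Section_identifier values, and the sample grades are assumptions made up for the example.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Defining the database: data types and structure for two of the files.
conn.executescript("""
CREATE TABLE STUDENT (
    Name           TEXT,
    Student_number INTEGER PRIMARY KEY,
    Class          INTEGER,
    Major          TEXT
);
CREATE TABLE GRADE_REPORT (
    Student_number     INTEGER REFERENCES STUDENT(Student_number),
    Section_identifier INTEGER,
    Grade              TEXT
);
""")

# Constructing the database: one student and two grade records (made-up values).
conn.execute("INSERT INTO STUDENT VALUES ('Smith', 17, 1, 'CS')")
conn.executemany(
    "INSERT INTO GRADE_REPORT VALUES (?, ?, ?)",
    [(17, 112, "B"), (17, 119, "C")],
)

# The relationship: Smith's STUDENT record is related to two GRADE_REPORT records.
smith_grades = conn.execute(
    """SELECT g.Section_identifier, g.Grade
       FROM STUDENT s JOIN GRADE_REPORT g ON g.Student_number = s.Student_number
       WHERE s.Name = 'Smith'
       ORDER BY g.Section_identifier"""
).fetchall()
print(smith_grades)  # [(112, 'B'), (119, 'C')]
```

The two files are linked only through the shared Student_number value, which is how the Smith record "finds" its two grade records.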
Example of a Student Database
• Manipulating the database involves querying and updating the data records in
one or more tables.
• An example of a query:
• List the names of students who took the section of the ‘Database’ course offered in fall
2008, together with their grades in that section.
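The query above could look roughly like this in SQL, run here through Python's sqlite3 module. The table layouts, the course number 'CS3380', and all sample rows are assumptions invented for the sketch; only the table names and the question itself come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE COURSE (Course_number TEXT PRIMARY KEY, Course_name TEXT);
CREATE TABLE SECTION (Section_identifier INTEGER PRIMARY KEY,
                      Course_number TEXT, Semester TEXT, Year INTEGER);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Section_identifier INTEGER, Grade TEXT);

-- Made-up sample data: Brown took the fall 2008 section, Smith a later one.
INSERT INTO STUDENT VALUES (17, 'Smith'), (8, 'Brown');
INSERT INTO COURSE VALUES ('CS3380', 'Database');
INSERT INTO SECTION VALUES (135, 'CS3380', 'Fall', 2008), (140, 'CS3380', 'Fall', 2009);
INSERT INTO GRADE_REPORT VALUES (8, 135, 'A'), (17, 140, 'B');
""")

# Names and grades of students in the fall 2008 section of the 'Database' course.
rows = conn.execute("""
    SELECT s.Name, g.Grade
    FROM COURSE c
    JOIN SECTION sec ON sec.Course_number = c.Course_number
    JOIN GRADE_REPORT g ON g.Section_identifier = sec.Section_identifier
    JOIN STUDENT s ON s.Student_number = g.Student_number
    WHERE c.Course_name = 'Database' AND sec.Semester = 'Fall' AND sec.Year = 2008
    ORDER BY s.Name
""").fetchall()
print(rows)  # [('Brown', 'A')]
```

Answering the question needs data from four tables, which is why the slide speaks of records "in one or more tables".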
Example of a Student Database
• Examples of updates are as follows:
• Change the class of ‘Smith’ to second year
• Create a new section for the ‘Database’ course for this semester
• Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last semester
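The three updates listed above might be written as follows, again as a sketch using Python's sqlite3 module. The table layouts, identifiers (17, 135, 145, 'CS3380'), and semester values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER);
CREATE TABLE SECTION (Section_identifier INTEGER PRIMARY KEY,
                      Course_number TEXT, Semester TEXT, Year INTEGER);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Section_identifier INTEGER, Grade TEXT);
INSERT INTO STUDENT VALUES (17, 'Smith', 1);
INSERT INTO SECTION VALUES (135, 'CS3380', 'Fall', 2008);
""")

with conn:  # commits the three updates together when the block ends
    # Change the class of 'Smith' to second year.
    conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = 'Smith'")
    # Create a new section of the 'Database' course for the current semester.
    conn.execute("INSERT INTO SECTION VALUES (145, 'CS3380', 'Spring', 2009)")
    # Enter a grade of 'A' for 'Smith' in last semester's 'Database' section.
    conn.execute("INSERT INTO GRADE_REPORT VALUES (17, 135, 'A')")

smith_class = conn.execute("SELECT Class FROM STUDENT WHERE Name = 'Smith'").fetchone()[0]
section_count = conn.execute("SELECT COUNT(*) FROM SECTION").fetchone()[0]
smith_grade = conn.execute(
    "SELECT Grade FROM GRADE_REPORT WHERE Student_number = 17"
).fetchone()[0]
print(smith_class, section_count, smith_grade)  # 2 2 A
```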
Characteristics of the Database Approach
• Database approach vs older approach of writing customized programs to access
data stored in files (i.e. traditional file processing)
(1) Self-Describing Nature of a Database System
• The catalog, which stores the definition of the database structure (meta-data), is used by
the DBMS software and also by database users who need information about the database
structure
• DBMS software can access diverse databases, for example a university database, a
banking database, or a company database, by extracting the database definitions from
the catalog
• In the traditional file processing approach, programs are constrained to work with only one
specific database, whose structure is declared in the application programs
Example of Meta Data
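As one concrete illustration of meta-data, SQLite keeps its catalog in a table named sqlite_master and exposes column definitions through PRAGMA table_info. The sketch below reads the structure of a STUDENT table back from the DBMS instead of hard-coding it in the program. The column types are assumptions, and other DBMSs organize their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Name TEXT, Student_number INTEGER, Class INTEGER, Major TEXT)"
)

# sqlite_master is SQLite's catalog table: one row per table, index, or view.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(STUDENT)")]

print(tables)   # ['STUDENT']
print(columns)  # [('Name', 'TEXT'), ('Student_number', 'INTEGER'), ('Class', 'INTEGER'), ('Major', 'TEXT')]
```

The same two lookups would work unchanged against a banking or company database, which is the point of a self-describing system.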
(2) Insulation between Programs and Data, & Data Abstraction
• In the database approach, the structure of data files is stored in the DBMS catalog
separately from the access programs.
• We call this property program-data independence.
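Program-data independence can be sketched as follows: an access program that names only the column it needs keeps working after the stored structure changes. The function student_names and the added Birth_date column are hypothetical, introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT, Student_number INTEGER, Class INTEGER)")
conn.execute("INSERT INTO STUDENT VALUES ('Smith', 17, 1)")

def student_names(connection):
    """Access program: depends on the Name column only, not on the record layout."""
    return [row[0] for row in connection.execute("SELECT Name FROM STUDENT ORDER BY Name")]

before = student_names(conn)

# Change the stored structure: the catalog is updated, the program is not.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Birth_date TEXT")

after = student_names(conn)
print(before == after)  # True
```

In a traditional file-processing program that reads fixed-layout records, the same structural change would usually mean editing and recompiling every program that touches the file.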
(3) Support of Multiple Views of the Data
• In the database approach, a database typically has many types of users, each of
whom may require a different perspective or view of the database.
• A view may be a subset of the database or it may contain virtual data that is
derived from the database files but is not explicitly stored.
• For example,
• one user of the database in our student database example may be interested only in accessing and
printing the transcript of each student (a)
• a second user may be interested only in checking that students have taken all the prerequisites
of each course for which they register (b)
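A transcript view like the one in the first example can be sketched with CREATE VIEW. The view name TRANSCRIPT, the table layouts, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Section_identifier INTEGER, Grade TEXT);
INSERT INTO STUDENT VALUES (17, 'Smith');
INSERT INTO GRADE_REPORT VALUES (17, 112, 'B'), (17, 119, 'C');

-- A view stores only its defining query; its rows are derived when it is read.
CREATE VIEW TRANSCRIPT AS
    SELECT s.Name, g.Section_identifier, g.Grade
    FROM STUDENT s JOIN GRADE_REPORT g ON g.Student_number = s.Student_number;
""")

transcript = conn.execute(
    "SELECT * FROM TRANSCRIPT ORDER BY Section_identifier"
).fetchall()
print(transcript)  # [('Smith', 112, 'B'), ('Smith', 119, 'C')]

# New base data shows up in the view without touching the view itself.
conn.execute("INSERT INTO GRADE_REPORT VALUES (17, 135, 'A')")
updated_count = conn.execute("SELECT COUNT(*) FROM TRANSCRIPT").fetchone()[0]
print(updated_count)  # 3
```

The transcript user sees one combined table, while the stored data stays in STUDENT and GRADE_REPORT.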
(4) Sharing of Data and Multiuser Transaction Processing
• In the database approach, a DBMS allows multiple users to access the database
at the same time.
• The DBMS includes concurrency control software to ensure that several users
trying to update the same data do so in a controlled manner, so that the result of the
updates is correct
1. Database Administrators (DBA)
• responsible for authorizing access to the database, coordinating and monitoring its use, and
acquiring software and hardware resources as needed.
• accountable for problems such as security breaches and poor system response time
2. Database Designers
• communicate with all prospective database users to understand their requirements, and
then create a design that meets those requirements
• responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store this data
Database Actors/Users
3. End Users
• The database is built primarily for their use
• This is data redundancy, i.e., storing the same data multiple times
• Data inconsistency – copies of the same data stored in multiple locations become inconsistent
if they are not all updated together
• In the database approach, the database design stores each logical data
item in only one place in the database – data normalization
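The difference between a redundant and a normalized design can be sketched as follows. The table GRADE_REPORT_REDUNDANT, which repeats each student's Major in every grade record, is a hypothetical layout invented to show the problem, and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Redundant design: the student's major is repeated in every grade record.
CREATE TABLE GRADE_REPORT_REDUNDANT (
    Student_number INTEGER, Major TEXT, Section_identifier INTEGER, Grade TEXT);
INSERT INTO GRADE_REPORT_REDUNDANT VALUES (17, 'CS', 112, 'B'), (17, 'CS', 119, 'C');

-- Normalized design: the major is stored once, in STUDENT.
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Major TEXT);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Section_identifier INTEGER, Grade TEXT);
INSERT INTO STUDENT VALUES (17, 'CS');
INSERT INTO GRADE_REPORT VALUES (17, 112, 'B'), (17, 119, 'C');
""")

# An update that reaches only one of the redundant copies leaves them inconsistent.
conn.execute(
    "UPDATE GRADE_REPORT_REDUNDANT SET Major = 'MATH' "
    "WHERE Student_number = 17 AND Section_identifier = 112"
)
redundant_majors = {row[0] for row in conn.execute(
    "SELECT Major FROM GRADE_REPORT_REDUNDANT WHERE Student_number = 17"
)}

# The same change in the normalized design touches exactly one row.
conn.execute("UPDATE STUDENT SET Major = 'MATH' WHERE Student_number = 17")
normalized_majors = {row[0] for row in conn.execute(
    "SELECT Major FROM STUDENT WHERE Student_number = 17"
)}

print(sorted(redundant_majors))   # ['CS', 'MATH']
print(sorted(normalized_majors))  # ['MATH']
```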
Advantages of Using the DBMS Approach
• A DBMS provides a security and authorization subsystem, which the DBA uses to create
accounts and to specify account restrictions.
• For example,
• In a university database, Academic Office staff may be allowed to enroll students but not to
add course grades.
• Lecturers and instructors, in contrast, may view students and add course grades but may not
enroll students.
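The account restrictions in the university example can be pictured as a table of roles and permitted actions. The sketch below is plain Python, not a DBMS feature: the role names, action names, and the is_allowed helper are all hypothetical. A real SQL DBMS would typically express such rules with GRANT and REVOKE statements.

```python
# Hypothetical roles and actions taken from the university example above.
PERMISSIONS = {
    "academic_office": {"enroll_student"},
    "lecturer": {"view_students", "add_grade"},
}

def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role has been granted the given action."""
    return action in PERMISSIONS.get(role, set())

print(is_allowed("academic_office", "enroll_student"))  # True
print(is_allowed("academic_office", "add_grade"))       # False
print(is_allowed("lecturer", "add_grade"))              # True
print(is_allowed("lecturer", "enroll_student"))         # False
```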
Advantages of Using the DBMS Approach
(4) Providing Storage Structures and Search Techniques for Efficient Query
Processing
• The DBMS provides specialized data structures called indexes to execute
searches, queries, and updates efficiently
• In order to process the database records needed by a particular query, those records must
be copied from disk to main memory
• Therefore, the DBMS often has a buffering or caching module that maintains parts of the
database in main memory buffers
• The query processing and optimization module of the DBMS is responsible for choosing
an efficient query execution plan for each query based on the existing storage structures
• The choice of which indexes to create and maintain is part of physical database design and
tuning, which is one of the responsibilities of the DBA staff
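The effect of an index on the chosen execution plan can be observed in SQLite with EXPLAIN QUERY PLAN. In this sketch the index name idx_student_name, the helper function plan, and the generated sample rows are assumptions. The exact wording of the plan text varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT (Name, Class) VALUES (?, ?)",
    [(f"Student{i}", 1 + i % 4) for i in range(1000)],
)

def plan(sql):
    """Return the textual steps of the execution plan SQLite chooses for sql."""
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM STUDENT WHERE Name = 'Student42'"

# Without an index on Name, the optimizer has to scan the whole table.
plan_without_index = plan(query)

# Physical design decision: create an index on the searched column.
conn.execute("CREATE INDEX idx_student_name ON STUDENT (Name)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both runs. Only the available storage structures changed, and the optimizer picked a different plan on its own.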
Advantages of Using the DBMS Approach
• For example, if the computer system fails in the middle of an update transaction,
the DBMS makes sure that the database is restored to the state it was in before the
transaction started executing.
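The restore-to-previous-state behaviour can be sketched with a transaction that fails halfway. Note the assumption: the RuntimeError below simulates an application failure inside the transaction, not a real system crash, which a DBMS handles with its recovery log. The table layout and values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class INTEGER)")
conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith', 1)")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = 'Smith'")
        raise RuntimeError("simulated failure in the middle of the transaction")
except RuntimeError:
    pass  # the partial update has already been rolled back

class_after_failure = conn.execute(
    "SELECT Class FROM STUDENT WHERE Name = 'Smith'"
).fetchone()[0]
print(class_after_failure)  # 1
```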
• Many types of users with varying levels of technical knowledge use a database,
• A database may include numerous varieties of data that are interrelated in many
ways
• A DBMS has
• the capability to represent a variety of complex relationships among the data,
2. Referential Integrity Constraint - involves specifying that a record in one file must be related to
records in other files
3. Key/Uniqueness Constraint - uniqueness on data item values, e.g. every course record must have a
unique Course_code
• Constraints are derived from the meaning or semantics of the data and of what it represents
• Other constraints may have to be checked by update programs or at the time of data entry
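The referential integrity and key/uniqueness constraints above can be declared in the schema so that the DBMS itself rejects violating updates. This is a sketch under assumptions: the SECTION layout, the helper function violates, and the sample course codes are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE COURSE (
    Course_code TEXT UNIQUE NOT NULL,          -- key/uniqueness constraint
    Course_name TEXT
);
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_code TEXT NOT NULL REFERENCES COURSE(Course_code)  -- referential integrity
);
INSERT INTO COURSE VALUES ('CS3380', 'Database');
""")

def violates(sql):
    """Run one statement and report whether the DBMS rejected it."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A second course with the same Course_code breaks the uniqueness constraint.
duplicate_code = violates("INSERT INTO COURSE VALUES ('CS3380', 'Databases II')")
# A section for a course that does not exist breaks referential integrity.
dangling_section = violates("INSERT INTO SECTION VALUES (1, 'MATH999')")
# A section for an existing course is accepted.
valid_section = violates("INSERT INTO SECTION VALUES (2, 'CS3380')")

print(duplicate_code, dangling_section, valid_section)  # True True False
```

Rules that cannot be declared this way are the "other constraints" that update programs or data-entry checks have to enforce.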
Advantages of Using the DBMS Approach
• Some database systems provide capabilities for defining deduction rules for
inferring new information from the stored database facts – called deductive
database systems
• Rules, when compiled and maintained by the DBMS, allow it to make deductions.
• If the rules change, it is more convenient to change the declared deduction rules than to
recode the application program(s)
Web and E-Commerce (1990s onwards): The rise of the web and e-commerce
drove the need for dynamic data exchange on web pages, leading to the
development of standards like XML for data interchange between databases and
web applications.
Big Data and NoSQL (21st Century): The explosion of data from social media,
e-commerce, and cloud storage required new database systems (NoSQL) that could
handle large-scale, nontraditional data types and provide fast, reliable storage and
retrieval, supplementing traditional SQL systems.