Introduction To IT Lecture Notes Topic 14: Database Concepts
This document provides an introduction to basic database concepts. It defines what a database and DBMS are, describes the relational database model and how data is organized in tables with records and fields. It discusses the history of databases from early file systems to today's dominant relational model. Examples of database tables are provided to illustrate concepts like records, fields, and relationships between tables.
Introduction to IT Lecture Notes
Topic 14: Database Concepts
In our information society, record keeping and data processing using databases have become an important aspect of every organization, and much of the world's computing power is dedicated to them. You probably use databases all the time, often without knowing it, for example when withdrawing money from the Agriculture Bank's ATM or selecting elective courses in UIC's online system. This chapter provides an introduction to basic database concepts.

1 What is a Database?

Data are unprocessed raw facts, which include text, numbers, images, audio, and video. Information is processed data that is organized, meaningful, and useful. In simple terms, a database is a systematic store of information, usually in the form of records, from which data can be retrieved (relatively) easily and in which data can be modified or added. Formally, a database is a collection of organized data together with a means of manipulating it in a useful way. From the early flat-file systems to relational and object-relational systems, database technology has gone through several generations in its 40-year history.

Example: Examples of databases include a telephone book, the library catalogue, and the TV guide.

2 What is a DBMS?

A database usually comes with management software called a Database Management System (DBMS), which gives users convenient ways to manipulate (add, update, delete), retrieve, and present the data.

Example: On PCs, Microsoft Access is a popular example of a single-user or small-group DBMS. Microsoft's SQL Server is an example of a DBMS that serves database requests from multiple (client) users. Other popular DBMSs are IBM's DB2, Oracle Corporation's Oracle, and Sybase Corporation's Sybase.

3 A Short History of Database

The origins of databases go back to library, governmental, business, and medical records. There is a very long history of information storage, indexing, and retrieval.
Good design principles go way back, and a lot is now known about how to make good designs that lead to better reliability and performance.

In the 1960s, computers became cost effective for private companies, and their storage capability increased. Two main data models were developed: the network model (CODASYL) and the hierarchical model (IMS). Access to the database was through low-level pointer operations linking records. Storage details depended on the type of data to be stored, so adding an extra field to a database required rewriting the underlying access/modification scheme. The emphasis was on the records to be processed, not on the overall structure of the system. A user needed to know the physical structure of the database in order to query for information.

In the 1970s, E.F. Codd proposed a relational model for databases in a landmark paper on how to think about databases. He disconnected the schema (logical organization) of a database from the physical storage methods. This approach has been standard ever since.

In the 1980s, commercialization of relational systems began as a boom in computer purchasing fueled the DB market for business. SQL (Structured Query Language) became the "intergalactic standard". DB2 became IBM's flagship product. Network and hierarchical models faded into the background; there is essentially no development of these systems today, but some legacy systems are still in use.

In the 1990s, an industry shakeout began, with fewer surviving companies (IBM, Oracle, Sybase, Microsoft, etc.) offering increasingly complex products at higher prices. The client-server model of computing became the norm for future business decisions.

In the mid-1990s, the usable Internet/WWW appeared. A mad scramble ensued to allow remote access to computer systems with legacy data. The client-server frenzy reached the desktops of average users with little patience for complexity, while Web/DB use grew exponentially.
In the late 1990s, the large investment in Internet companies fueled a tools-market boom for Web/Internet/DB connectors. Open source solutions came online with widespread use of gcc, cgi, Apache, MySQL, etc. Online transaction processing (OLTP) and online analytic processing (OLAP) came of age, with many merchants using point-of-sale (POS) technology on a daily basis.

The early 21st century saw the decline of the Internet industry as a whole, but solid growth of DB applications continued. More interactive applications appeared with the use of PDAs, POS transactions, consolidation of vendors, etc. Three main (western) companies predominate in the large DB market: IBM (which bought Informix), Microsoft, and Oracle.

4 What is a Relational Database?

Every database or DBMS is based on a specific data model, which consists of rules and standards that define how the database organizes data and how users view the organization of the data. The data model called the Relational Database Model is widely adopted. A relational database stores all its data in tables, and nothing more. All operations on data are done on the tables themselves or produce another table as the result. You never see anything except tables.

5 What is a Table?

A table is a set of rows and columns. Each row (record) is a set of columns (fields) with only one value for each. All rows of the same table have the same set of columns, although some columns may have NULL values, i.e. the value for that row was not initialized.

ID   Name   Address              Phone
004  Alice  111 Washington Str.  6123432
005  Ben    222 Anderson Ave.    6123433
006  Candy  333 Washington Str.  6123434
007  Doris  444 Jingfeng Rd.     6123435
007  Eva    444 Jingfeng Rd.     6123435

Table 1: A telephone book table with 4 fields (columns) and 5 records (rows).

Example: Table 1 contains telephone number information. It has 4 columns and 5 rows. Each row represents one person. Each column describes one attribute of the person.
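The idea that every row shares the same columns, with NULL standing in for a value that was never set, can be tried out in a small sketch. It assumes Python's built-in sqlite3 module as the DBMS and a table name, PhoneBook, chosen here for illustration. The first three rows come from Table 1; the last row is a made-up record, not part of Table 1, added only to show a NULL.

```python
import sqlite3

# In-memory database; the column names follow Table 1.
# The table name PhoneBook is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PhoneBook (ID TEXT, Name TEXT, Address TEXT, Phone TEXT)")

# A few rows taken from Table 1.
rows = [
    ("004", "Alice", "111 Washington Str.", "6123432"),
    ("005", "Ben", "222 Anderson Ave.", "6123433"),
    ("006", "Candy", "333 Washington Str.", "6123434"),
]
conn.executemany("INSERT INTO PhoneBook VALUES (?, ?, ?, ?)", rows)

# Hypothetical record (not in Table 1) whose phone number is unknown.
# The Phone column still exists for this row, but it holds NULL.
conn.execute(
    "INSERT INTO PhoneBook (ID, Name, Address) VALUES ('900', 'Zed', '1 Example Rd.')"
)

cursor = conn.execute("SELECT * FROM PhoneBook ORDER BY ID")
column_names = [d[0] for d in cursor.description]
records = cursor.fetchall()

print(column_names)
for record in records:
    # sqlite3 reports NULL as Python's None.
    print(record)
```

Every record printed has four values, one per column, and the made-up record shows None in the Phone position.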
6 The Hierarchy of Data

To make data manageable, a database is organized in a hierarchy of several levels: character, field, record, and file. Each higher level of data consists of one or more items from the lower level.

6.1 Character

A bit is the smallest data unit in a computer. Eight bits grouped together as a unit make up a byte. Each byte represents a single character, which can be a number, letter, punctuation mark, or other symbol.

Example: Each phone number in Table 1 consists of 7 characters. Each character is represented by 8 bits in ASCII code. So each phone number is a string of 7 x 8 = 56 bits.

6.2 Field

A field is a combination of one or more related characters. A field is a column in a table. It is the smallest unit of data a user accesses. There are three parameters associated with each field:

1. A field name uniquely identifies each field.
2. A field size defines the maximum number of characters a field can contain.
3. A data type specifies the kind of data a field can contain. The common data types are:
   o Text: Letters, numbers, or special characters. Example: "Alice"
   o Numeric: Numbers only. Example: 123456
   o AutoNumber: A unique number automatically assigned by the DBMS to each added record.
   o Currency: Dollar and cent amounts.
   o Date: Month, day, year, and time information.
   o Memo: Lengthy text entries.
   o Boolean: 0 or 1 (two states).
   o Hyperlink: A Web address that links to a document or a Web page.
   o Object: A photograph, audio, video, or a document created by another program.

Example: Table 1 has 4 fields: ID, Name, Address, and Phone.

MovieID  Title               Genre             Rating
001      Back to the Future  comedy adventure  PG
002      X-Men               action sci-fi     12
003      Aliens              sci-fi horror     18
004      Independence Day    action sci-fi     15
005      Forrest Gump        comedy            PG

Table 2: Movies.

6.3 Record

A record is a group of related fields. Each row in a table is called a record. A table is a collection of records. Each record represents or describes a unique item in the database.
For example, each record in a telephone book represents a person; each record in a library catalogue represents a book; and each record in the TV guide represents a TV program. Each record must contain a special field called a primary key that uniquely identifies it. Primary keys are often used to link tables.

Example: Table 1 has 5 records representing five persons: Alice, Ben, Candy, Doris, and Eva. We can use their ID as the primary key if we require that each person's ID be unique.

6.4 File

A data file (table) is a collection of related records stored on a disk. A database includes a group of related data files or tables.

7 Relationships

Consider what happens at a movie rental store as an example. There must be a table, say Movie (Table 2), containing information about movies. The Movie table in Table 2 has 5 movie records. Each movie record has 4 fields or attributes: MovieID, Title, Genre, and Rating. The key field is MovieID, which must be unique.

When a person comes to rent a movie, a clerk enters the customer's information in another table, say Customer (Table 3). The Customer table has 5 fields: CustomerID, FirstName, Surname, Address, and CreditCardNumber. As with the Movie table, the Customer table uses an ID field, CustomerID, as the key field.
CustomerID  FirstName  Surname   Address               CreditCardNumber
101         Dennis     Cook      123 Main Street       2723 4657 8765 0834
102         Mike       Scofield  456 Second Ave.       3472 3098 4678 2764
103         Doug       Nickle    789 Elm Street        4253 3471 5082 5494
104         Amy        Stevens   321 Yellow Street     8932 4657 8957 0834
105         Susan      Person    654 Broadway Street   7890 4767 6786 4268
106         Andy       May       789 Lois Lane         0281 4657 8765 7896
107         David      Wang      3214 Anderson Street  1370 8676 28657 6867

Table 3: Customers.

The Movie table and the Customer table show how data can be organized as records within isolated tables. The power of a relational DBMS, though, lies in the ability to create tables that conceptually link other tables together. When a customer rents a movie, there is a "rents" relationship between the person and the movie. We can use a table, Rents (Table 4), to hold information about this relationship. The Rents table has four fields: CustomerID, MovieID, DateRented, and DateDue.

CustomerID  MovieID  DateRented  DateDue
103         001      3-12-2006   3-14-2006
103         002      3-12-2006   3-14-2006
105         003      3-12-2006   3-14-2006

Table 4: Rents.

Note that the Rents table does not contain all of the data about a customer or a movie. There is no need to store this information in the Rents table, because it is already stored in the Customer table and the Movie table. When we need data about the customer, we use the CustomerID stored in the Rents table to look up the customer's details in the Customer table. Likewise, when we need data about the movie, we use the MovieID stored in the Rents table to look up the movie's details in the Movie table. This is the basis of a relational database.

A relational database is based on the idea that the objects (tables) of a database are connected or related so that they can exchange information. This exchange of information is made possible by creating relationships among the objects (tables) of a database. The "rents" relationship is called a one-to-many relationship.
That is, one customer is allowed to rent many movies, but a movie can only be rented by a single customer at any given time. There are three general cardinality constraints:
o one-to-one
o one-to-many
o many-to-many
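The lookups described above, where the CustomerID and MovieID stored in Rents lead to the details held in Customer and Movie, can be sketched as a join. This sketch assumes Python's built-in sqlite3 module as the DBMS and keeps only some of the columns from Tables 2, 3, and 4. Making MovieID the key of Rents is one possible way, assumed here, to express that a movie is rented by at most one customer at a time; the notes do not prescribe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, Surname TEXT);
CREATE TABLE Movie (MovieID TEXT PRIMARY KEY, Title TEXT);
CREATE TABLE Rents (
    CustomerID INTEGER REFERENCES Customer(CustomerID),
    MovieID TEXT PRIMARY KEY REFERENCES Movie(MovieID),
    DateRented TEXT,
    DateDue TEXT
);
""")

# Data copied from Tables 2, 3, and 4 (a subset of rows and columns).
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)",
                 [(103, "Doug", "Nickle"), (105, "Susan", "Person")])
conn.executemany("INSERT INTO Movie VALUES (?, ?)",
                 [("001", "Back to the Future"), ("002", "X-Men"), ("003", "Aliens")])
conn.executemany("INSERT INTO Rents VALUES (?, ?, ?, ?)",
                 [(103, "001", "3-12-2006", "3-14-2006"),
                  (103, "002", "3-12-2006", "3-14-2006"),
                  (105, "003", "3-12-2006", "3-14-2006")])

# Follow CustomerID and MovieID from Rents into the other two tables.
rentals = conn.execute("""
    SELECT c.FirstName, c.Surname, m.Title, r.DateDue
    FROM Rents r
    JOIN Customer c ON c.CustomerID = r.CustomerID
    JOIN Movie m ON m.MovieID = r.MovieID
    ORDER BY r.CustomerID, r.MovieID
""").fetchall()

# One-to-many: a customer may appear in several Rents records.
per_customer = conn.execute(
    "SELECT CustomerID, COUNT(*) FROM Rents GROUP BY CustomerID ORDER BY CustomerID"
).fetchall()

# A second rental of movie 001 violates the key on Rents.MovieID.
try:
    conn.execute("INSERT INTO Rents VALUES (105, '001', '3-13-2006', '3-15-2006')")
    second_rental_allowed = True
except sqlite3.IntegrityError:
    second_rental_allowed = False

for row in rentals:
    print(row)
print(per_customer, second_rental_allowed)
```

Customer 103 appears twice in the join result, once per rented movie, while the attempt to rent movie 001 to a second customer is rejected by the DBMS.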
These cardinality constraints help the database designer convey the details of a relationship. Note that the CustomerID value 103 appears in two records in Table 4. That indicates that the same customer rented two different movies.

Data is modified in, added to, and deleted from our various database tables as needed. When movies are added to or removed from the available stock, we update the records of the Movie table. As people become customers of our store, we add them to the Customer table. On an ongoing basis we add and remove records from the Rents table as customers rent and return movies.

8 Query and Operations on Table

8.1 Queries

A query is a question or a request for specific data in a database. A DBMS provides a tool called a query language, which consists of simple, English-like statements that allow users to manage, update, and retrieve the data in a database. A query language provides the following common functions:
o Select records: Users can retrieve all or some of the records from the database.
o Insert records: Users add records to the database when they obtain new data.
o Update records: Users can update records when the underlying data changes.
o Delete records: When a record is no longer needed, a user can delete it from a file.

8.2 Validating Data

Validation is the process of comparing data with a set of rules or values to find out whether the data is correct. A DBMS performs validity checks to help ensure the integrity of entered data, where data integrity is measured by correctness and completeness. Common validity checks are:
o Alphabetic/Numeric Check: Ensure that users enter only alphabetic/numeric data into a field.
o Range Check: Determine whether the input number is within a specific range.
o Consistency Check: Test the data in two or more associated fields to ensure that the relationship between them is logical.
o Completeness Check: Verify that a required field contains data.
o Check Digit: A check digit is a number(s) or character(s) that is appended to or inserted into a specific data value.
  It is used to check the accuracy of the value.

9 SQL

The Structured Query Language (SQL) is a comprehensive database language for managing relational databases. It includes statements to specify database schemas as well as statements that add, modify, and delete database contents. It also includes, as its name implies, the ability to query the database to retrieve specific data. The original version of SQL was Sequel, developed by IBM in the early 1970s. In 1986, the American National Standards Institute (ANSI) published the SQL standard, the basis for commercial database languages for accessing relational databases.

9.1 Queries

The select statement is the primary tool for querying. The basic select statement includes a select clause, a from clause, and a where clause:

    SELECT attribute-list
    FROM table-list
    WHERE condition

The select clause determines which attributes are returned. The from clause determines which tables are used in the query. The where clause restricts the data that is returned. For example:

    SELECT Title FROM Movie WHERE Rating = 'PG'

The result of this query is a list of all titles from the Movie table that have a rating of PG.

9.2 Modifying Database Content

The insert, update, and delete statements in SQL allow the data in a table to be changed. The insert statement adds a new record to a table. Each insert statement specifies the values of the attributes for the new record. For example:

    INSERT INTO Customer VALUES (9876, 'John', 'Smith', '602 Green Street', '2120 9873 0976 2445')

This statement inserts a new record into the Customer table with the specified attributes.

The update statement changes the values in one or more records of a table. For example:

    UPDATE Movie SET Genre = 'thriller drama' WHERE Title = 'X-Men'

This statement changes the Genre of the movie X-Men to 'thriller drama'.

The delete statement removes all records from a table matching the specified condition.
For example, if we want to remove all R-rated movies from the table Movie, we could use the following delete statement:

    DELETE FROM Movie WHERE Rating = 'R'

10 File Processing versus Database

There are usually two approaches to storing and managing data: file processing or a database. In a typical file processing system, each department or area within an organization has its own set of files. These files are often designed specifically for their particular applications. The records in one file may not relate to the records in any other file. With the database approach, all data is centralized in one place so that many programs and users share the data in the database. The database approach provides the following advantages:

1. Reduce data redundancy: Since all data is centralized in one place, all the programs and users share the same data in the database, which reduces data replication.
2. Improve data integrity: The DBMS performs validity checks to help ensure that the entered data is correct. The database approach also reduces the possibility of introducing inconsistencies.
3. Share data: The data in a database environment belongs to and is shared, usually over a network, by the entire organization.
4. Easier access: The database approach allows non-technical users to access and maintain data.
5. Reduce development time: It is often easier and faster to develop programs that use the database approach.
6. More secure: When companies use databases, they typically have security settings that define who can access, add, change, and delete the data in a database.
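The validity checks behind advantage 2 (and the completeness and range checks of section 8.2) can be sketched with table constraints. This sketch assumes Python's built-in sqlite3 module as the DBMS. The Rental table, its DaysRented field, and the 1 to 14 day limit are hypothetical, introduced only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table for this sketch:
#   NOT NULL acts as a completeness check (the field must contain data),
#   CHECK ... BETWEEN acts as a range check on the entered number.
conn.execute("""
    CREATE TABLE Rental (
        CustomerID INTEGER NOT NULL,
        MovieID TEXT NOT NULL,
        DaysRented INTEGER CHECK (DaysRented BETWEEN 1 AND 14)
    )
""")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a check rejects it."""
    try:
        conn.execute("INSERT INTO Rental VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid": try_insert((103, "001", 2)),
    "missing movie": try_insert((103, None, 2)),    # fails completeness check
    "out of range": try_insert((103, "002", 30)),   # fails range check
}
stored = conn.execute("SELECT COUNT(*) FROM Rental").fetchone()[0]

print(results)
print(stored)
```

Because the rules live in the table definition, every program that shares the database is held to the same checks, and only the valid record ends up stored.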