The document provides an overview of Database Management Systems (DBMS), including definitions, models, design processes, security measures, SQL, data warehousing, data mining, and the role of Database Administrators (DBA). It explains various database models such as hierarchical, network, entity-relationship, relational, and object-oriented models, as well as the importance of database design and security. Additionally, it discusses the applications of data warehousing and data mining across different industries.
Unit 4 Database Management System
4.1 Introduction to DBMS

A database is an organized collection of logically related data that contains information relevant to an enterprise. The database is also called a repository or container for a collection of data files. E.g. a college database consists of data related to students such as name, course, faculty, grade, fee etc.

A Database Management System (DBMS) is a set of programs used to store, retrieve and manipulate data in a convenient and efficient way. A database system consists of the database, the database management system and application programs. A DBMS that maintains relationships between multiple data files is called a Relational Database Management System (RDBMS). E.g. Oracle, Sybase, Microsoft SQL Server, Postgres, MS-Access etc.

4.2 Database Models

A database model, or simply a data model, is an abstract model that describes how data is represented and used. A database model provides the necessary means to achieve data abstraction. Data abstraction means that the database system shows users only the essential features and hides the details of storage and data organization. The different database models are as follows:

1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object Oriented Model

1. Hierarchical Model: In this model, records are inter-related through a hierarchical or tree-like structure. The root may have any number of descendants, and each of these descendants may have any number of lower descendants.

2. Network Model: This model represents complex data relationships more effectively than the hierarchical model, with the aim of improving database performance and standards. Its entities are organized in a graph structure, and some entities can be accessed through several paths. A user perceives the network model as a collection of records in 1:M relationships.

3. Entity-Relationship Model: In an E-R model a database is modeled as a collection of entities and the relationships among them. The overall logical structure of the database is expressed graphically by an E-R diagram. The basic components of this diagram are:
a. Rectangles (represent entity sets)
b. Ellipses (represent attributes)
c. Diamonds (represent relationship sets among entity sets)
d. Lines (link attributes to entity sets and entity sets to relationship sets)

Fig: E-R Diagram

4. Relational Model: It represents the database as a collection of relations. All data is maintained in tables consisting of rows and columns. Each row represents an entity and each column represents an attribute of the entity. A relationship between two tables is implemented through a common attribute, which acts as the primary key in one table and as a foreign key in the other.
1. Primary Key: A primary key is an attribute or set of attributes used to identify records uniquely. A primary key must satisfy the following two characteristics:
a. It cannot be null.
b. It cannot be duplicated.
2. Foreign Key: A foreign key is an attribute or combination of attributes used to establish and enforce a relationship between two relations (tables). A set of attributes that references the primary key of another table is called a foreign key.

5. Object Oriented Model: It is based on the object-oriented programming paradigm and models real-world situations. These situations are represented as objects with different attributes.
a. Objects: Real-world entities and situations are represented as objects in the object oriented database model.
b. Attributes and Methods: Every object has certain characteristics, which are represented using attributes. The behavior of the object is represented using methods.
c. Class: Objects with similar attributes and methods are grouped together using a class.
d. Inheritance: A new class can be derived from an original class. The derived class contains the attributes and methods of the original class as well as its own.

Fig. Object oriented data model
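The primary key and foreign key rules of the relational model can be tried out in a small sketch. It uses Python's built-in sqlite3 module with an in-memory database. The Course table and the Cid and Cname columns are hypothetical names introduced only for this illustration, and the behavior shown is SQLite's; other RDBMS products may report the violations differently.

```python
import sqlite3


def build_college_db():
    """Create two related tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical parent table: Cid is its primary key.
    conn.execute(
        "CREATE TABLE Course ("
        " Cid INTEGER PRIMARY KEY,"
        " Cname TEXT NOT NULL)"
    )
    # Student.Cid is a foreign key that references Course.Cid.
    conn.execute(
        "CREATE TABLE Student ("
        " Sid INTEGER PRIMARY KEY,"
        " Sname TEXT NOT NULL,"
        " Cid INTEGER NOT NULL REFERENCES Course(Cid))"
    )
    conn.execute("INSERT INTO Course VALUES (10, 'BCA')")
    conn.execute("INSERT INTO Student VALUES (1, 'Ram', 10)")
    return conn


def try_insert(conn, sql):
    """Run one INSERT; return True if accepted, False if a key rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_college_db()
# Sid 1 already exists, so the primary key rule rejects this row.
duplicate_key = try_insert(conn, "INSERT INTO Student VALUES (1, 'Shyam', 10)")
# Course 99 does not exist, so the foreign key rule rejects this row.
dangling_ref = try_insert(conn, "INSERT INTO Student VALUES (2, 'Rita', 99)")
# A new Sid that points at an existing course is accepted.
valid_row = try_insert(conn, "INSERT INTO Student VALUES (3, 'Gopal', 10)")
print(duplicate_key, dangling_ref, valid_row)
```

Only the third insert succeeds: the first repeats an existing primary key value and the second references a course that is not present in the parent table.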
4.3 Database Design

Database design is a collection of processes that facilitates the designing, development, implementation and maintenance of enterprise data management systems. A properly designed database is easy to maintain, improves data consistency and is cost effective in terms of storage. A typical database design process includes:

1. Identify Entities: Anything about which data is kept in the database is called an entity. E.g. Customer, Product, Shop etc.
2. Identify Relationships: Determine the relationships between entities and the cardinality of each relationship. E.g. the relationship between customers and orders.
3. Identify Attributes: Determine the data elements for each entity. E.g. a product can have attributes such as product name, manufacturer, price, manufacture date, expiry date etc.
4. Draw Entity Relationship Diagram (ERD): The ERD gives a graphical overview of the database.
5. Assign Keys: A primary key (PK) is one or more attributes that uniquely identify an entity. A foreign key (FK) is an attribute of one entity that references the primary key of another entity. A key that consists of two or more attributes is called a composite key.
6. Define the Attributes' Data Types: An attribute's data type can be CHAR, INTEGER etc.
7. Normalization: It makes data more flexible and reliable, reduces data redundancy and removes data inconsistency. The normal forms are First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF) and Fifth Normal Form (5NF).

4.4 Database Security

Database security refers to the collective measures used to protect and secure a database or database management software from illegitimate use and malicious threats and attacks. It covers and enforces security on all aspects and components, including the data stored in the database, the database server and the database management system (DBMS). Some of the ways database security is analyzed and implemented include:

1. Restricting unauthorized access by implementing strong, multifactor access and data management controls.
2. Load / stress testing and capacity testing of a database to ensure it does not crash under user overload or a DDoS attack.
3. Physical security of the database server against theft and natural disasters.
4. Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map or plan to mitigate them.

4.5 SQL

SQL stands for Structured Query Language. It is a database language designed for the retrieval and management of data stored in a relational database management system (RDBMS). It is a non-procedural query language: SQL was originally designed as a declarative query and data manipulation language. The SQL language has several parts:

a. Data Definition Language: Data Definition Language (DDL) is used to define and alter the structure of database items. It is also referred to as data description language since it describes the structure of the columns and records in a database table.
b. Data Manipulation Language: Data Manipulation Language (DML) is used for adding (inserting), deleting and modifying (updating) data in a database.
c. Embedded SQL: Embedded SQL defines how SQL statements can be embedded within a general purpose programming language such as C, C++, PHP etc.
d. Dynamic SQL: Dynamic SQL allows queries to be constructed at run time.
e. Transaction Control: SQL commands for specifying the beginning and ending of a transaction.
f. Client-server Execution and Remote Database Access: These commands control how a client application program can connect to a SQL database server or access data from a database over a network.

Basic structure of SQL Query

The SQL SELECT statement queries data from tables in a database. The statement begins with the SELECT keyword. The basic SELECT statement has three clauses: SELECT, FROM and WHERE. The SELECT clause specifies the table columns that are retrieved, the FROM clause specifies the table that is accessed, and the WHERE clause specifies which table rows are retrieved.

Sid  Sname  Level          Age  Sex
1    Ram    Undergraduate  22   Male
2    Shyam  Graduate       25   Male
3    Rita   Undergraduate  21   Female
4    Gopal  Graduate       26   Male
Table: Student

E.g. 1. SELECT Sname FROM Student WHERE Sex = 'Female';

Sname
Rita
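The first example can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch, assuming SQLite as the engine and the column types chosen below (the text does not state them); the rows are copied from the Student table. In SQLite the text literal is compared case-sensitively by default, so it has to match the stored value 'Female' exactly.

```python
import sqlite3

# Rows copied from the Student table in the text.
STUDENTS = [
    (1, "Ram", "Undergraduate", 22, "Male"),
    (2, "Shyam", "Graduate", 25, "Male"),
    (3, "Rita", "Undergraduate", 21, "Female"),
    (4, "Gopal", "Graduate", 26, "Male"),
]


def make_student_db():
    """Build the Student table in an in-memory database (column types are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Student ("
        " Sid INTEGER PRIMARY KEY, Sname TEXT, Level TEXT,"
        " Age INTEGER, Sex TEXT)"
    )
    conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", STUDENTS)
    return conn


conn = make_student_db()

# SELECT picks the column, FROM names the table, WHERE filters the rows.
female_names = conn.execute(
    "SELECT Sname FROM Student WHERE Sex = 'Female'"
).fetchall()

# With SQLite's default comparison, a literal in a different case matches no row.
upper_case = conn.execute(
    "SELECT Sname FROM Student WHERE Sex = 'FEMALE'"
).fetchall()

print(female_names)  # [('Rita',)]
print(upper_case)    # []
```

Some database systems compare strings case-insensitively by default, so the second query may return Rita elsewhere; writing the literal exactly as stored avoids the difference.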
E.g. 2. SELECT Sid, Sname, Sex FROM Student;

Sid  Sname  Sex
1    Ram    Male
2    Shyam  Male
3    Rita   Female
4    Gopal  Male

E.g. 3. SELECT * FROM Student;

Sid  Sname  Level          Age  Sex
1    Ram    Undergraduate  22   Male
2    Shyam  Graduate       25   Male
3    Rita   Undergraduate  21   Female
4    Gopal  Graduate       26   Male
E.g. 4. SELECT Sid, Sname, Level, Age FROM Student WHERE Age > 21;

Sid  Sname  Level          Age
1    Ram    Undergraduate  22
2    Shyam  Graduate       25
4    Gopal  Graduate       26

E.g. 5. ALTER TABLE Student ADD (Address VARCHAR(15));

Sid  Sname  Level          Age  Sex     Address
1    Ram    Undergraduate  22   Male
2    Shyam  Graduate       25   Male
3    Rita   Undergraduate  21   Female
4    Gopal  Graduate       26   Male
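The difference between querying rows (E.g. 4) and changing the table structure with DDL (E.g. 5) can be sketched with Python's sqlite3 module. This assumes SQLite, which spells the statement ADD COLUMN without parentheses; the parenthesized form in the text is accepted by some other systems such as Oracle. The column types are also assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    " Sid INTEGER PRIMARY KEY, Sname TEXT, Level TEXT,"
    " Age INTEGER, Sex TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ram", "Undergraduate", 22, "Male"),
        (2, "Shyam", "Graduate", 25, "Male"),
        (3, "Rita", "Undergraduate", 21, "Female"),
        (4, "Gopal", "Graduate", 26, "Male"),
    ],
)

# E.g. 4: the WHERE clause keeps only the rows with Age greater than 21.
older = conn.execute(
    "SELECT Sid, Sname, Level, Age FROM Student WHERE Age > 21 ORDER BY Sid"
).fetchall()

# E.g. 5 in SQLite syntax: DDL changes the structure, not the stored rows.
conn.execute("ALTER TABLE Student ADD COLUMN Address VARCHAR(15)")

# PRAGMA table_info lists one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

# Existing rows get NULL (None in Python) in the new column.
addresses = [row[0] for row in conn.execute("SELECT Address FROM Student")]

print(older)
print(columns)
print(addresses)
```

The new Address column is empty for every existing row, which matches the blank column shown in the result table for E.g. 5.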
E.g. 6. DROP TABLE Student;
Removes the Student table from the database.

4.6 Data Warehouse

A data warehouse is a central repository of information that can be analyzed to make more informed decisions. Data warehousing involves data cleansing, data integration and data consolidation. The main characteristics of a data warehouse are:

a. Integrated: The data is collected from various sources, such as transactional systems, and then cleaned, transformed and consolidated into a single, unified view. This allows easy access to and analysis of the data.
b. Subject-Oriented: Data is organized around specific subjects, such as customers, products or sales. This allows easy access to the data relevant to a specific subject.
c. Non-Volatile: The data in the warehouse is never updated or deleted, only added to.
d. Time-Variant: Data is stored with a time dimension. This allows easy access to data for specific time periods, such as last quarter or last year, and makes it possible to track trends and patterns over time.

Application of Data Warehouse

a. Banking Industry: Banks use data warehouses to analyze consumer data and market trends and to support financial decision making.
b. Consumer Goods Industry: Companies use data warehouses for prediction of customer trends, inventory management, and market and advertisement research.
c. Government: Governments use data warehouses to maintain and analyze tax records, health policy records, criminal records etc.
d. Education: Universities use data warehouses to understand student demographics, manage human resources and prepare proposals for research grants.
e. Healthcare: Healthcare organizations feed financial, clinical and employee records into the warehouse to plan strategy and predict outcomes, track and analyze service feedback, generate patient reports, and share data with tie-in insurance companies, medical aid services etc.
f. Hospitality Industry: Businesses use warehouse services to design and evaluate advertising and promotion campaigns that target customers based on their feedback and travel patterns.
g. Insurance: Insurers use data warehouses to analyze data patterns and customer trends.
h. Manufacturing, Distribution and Retailers: They use data warehouses to track items, analyze sales, predict market demand and current business trends, and ultimately make better decisions.
i. Telecom Industry: Telecom companies use data warehouses to analyze fixed assets, analyze customer calling patterns so that sales representatives can push advertisement campaigns, track customer queries etc.

4.7 Data Mining

Data mining is defined as extracting information from huge sets of data. The information or knowledge extracted can be used for a number of applications such as market analysis, customer retention, production control etc.

Application of Data Mining

a. Financial Data Analysis: Banks and financial institutions use data mining for loan payment prediction and customer credit policy analysis, classification and clustering of customers for targeted marketing, and detection of money laundering and other financial crimes.
b. Retail Industry: Data mining in the retail industry helps identify customer buying patterns and trends, which leads to improved quality of customer service and better customer retention and satisfaction.
c. Telecom Industry: Data mining in the telecommunication industry helps identify telecommunication patterns, catch fraudulent activities, make better use of resources and improve quality of service.
d. Biological Data Analysis: Biological data mining is a very important part of bioinformatics. It can be used for indexing, similarity search and comparative analysis of multiple nucleotide sequences, discovery of structural patterns, analysis of genetic networks and protein pathways etc.
e. Other Scientific Applications: Large data sets are being generated by fast numerical simulations in various fields such as climate and ecosystem modeling, chemical engineering, fluid dynamics etc.
f. Intrusion Detection: Intrusion refers to any kind of action that threatens the integrity, confidentiality or availability of network resources. Data mining techniques can be applied to detect such actions.

4.8 Database Administrator

A Database Administrator (DBA) is a person who has central control over both the data and the application programs. The job of a DBA varies depending on the job description and on corporate and organizational policies. Some of the responsibilities of a DBA are as follows:

a. Schema Definition and Modification: It is the responsibility of the DBA to create the database schemas by executing a set of data definition statements in DDL.
b. Security Enforcement and Administration: The DBA is responsible for establishing and monitoring the security of the database system.
c. Performance Tuning: The DBA is responsible for analyzing the data stored in the database and studying its performance and efficiency.
d. Database Design: The DBA works with the development team during the database design stage, so that many problems can be avoided in later phases.
e. Physical Organization Modification: The DBA is responsible for carrying out modifications in the physical organization of the database for better performance.
f. Data Backup and Maintenance: The DBA is responsible for taking database backups periodically in order to recover from any hardware or software failure. Other routine maintenance checks carried out by the DBA include checking data storage, ensuring the availability of free disk space for normal operation, and upgrading disk space as and when required.
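The backup responsibility can be illustrated with a small sketch. It assumes SQLite and uses the backup method of Python's sqlite3 connections (available since Python 3.7); the helper name backup_database and the two in-memory databases are hypothetical, and a real backup would normally target a file on separate storage and run on a schedule.

```python
import sqlite3


def backup_database(source, target):
    """Copy the whole `source` database into `target` using SQLite's backup API."""
    source.backup(target)


# A small "live" database standing in for the production system.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE Student (Sid INTEGER PRIMARY KEY, Sname TEXT)")
live.execute("INSERT INTO Student VALUES (1, 'Ram')")
live.commit()

# Take the backup while the live database is still healthy.
backup = sqlite3.connect(":memory:")
backup_database(live, backup)

# Simulate a failure in which the live table is lost.
live.execute("DROP TABLE Student")
live.commit()

# The backup copy still holds the data and can be used for recovery.
restored = backup.execute("SELECT Sid, Sname FROM Student").fetchall()
print(restored)  # [(1, 'Ram')]
```

Server-based systems such as Oracle or Microsoft SQL Server provide their own backup tools, so this only shows the idea of keeping a separate copy to recover from.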