MySQL Conceptual Architecture
Ryan Bannon, Alvin Chin, Faryaaz Kassam, Andrew Roszko
Table of Contents
1.0 Abstract
2.0 Introduction and Overview
3.0 General RDBMS Architecture
    3.1 Application Layer
    3.2 Logical Layer
    3.3 Physical Layer
4.0 MySQL Architecture
    4.1 Application Layer
    4.2 Logical Layer
        4.2.1 Query Processor
        4.2.2 Transaction Management
        4.2.3 Recovery Management
        4.2.4 Storage Management
        4.2.5 Evolvability/Scalability
5.0 Use Case Scenarios
    5.1 Crash Scenario
    5.2 Update Table Under Transaction Scenario
6.0 Glossary
7.0 References
1.0 Abstract
This paper presents a proposed conceptual architecture for the MySQL Relational Database Management System (RDBMS). It begins with a brief examination of RDBMSs in general, which serves as a "roadmap" for further architectural discovery, on the assumption that the MySQL architecture resembles that of a general RDBMS. MySQL documentation and MySQL-specific literature were then examined to help define the actual conceptual architecture of MySQL. The MySQL architecture was first broken down into three layers: an application layer, a logical layer, and a physical layer. The logical layer, which is examined in detail, was broken down into four components: a query processor, a transaction management system, a recovery management system, and a storage management system. Recovering a conceptual architecture leaves considerable room for inaccuracy: putting the pieces of the puzzle together depends not only on facts but also on intuition and logic, and the researchers' understanding may differ from the designers'. There is therefore no guarantee that a conceptual architecture is completely correct; someone who had no part in development should treat it as a general interpretation of the true architecture of MySQL.
[Figure: layered architecture, from top to bottom: Application Layer, Logical Layer, Physical Layer]
[Figure: modules of the logical layer, from top to bottom: Query Processing, Transaction Management, Recovery Management, Storage Management]
The RDBMS contains four main modules; their implementations and interactions have been summarized from general documentation drawn from a variety of sources. Because the major control flow is downwards, the logical layer itself can be considered a layered architecture in the sense of Garlan and Shaw. The specific implementations of the subsystems, and the interactions between them, vary greatly from vendor to vendor; the following section analyzes the MySQL versions of these components in detail.
[Figure 3: MySQL architecture diagram; arrows denote control flow. Labelled components: Query Interface, Query Parser, Query Preprocessor, Query Optimizer, Execution Engine, Transaction Management, Recovery Management, Concurrency-Control Manager, Transaction Manager, Log Manager, Recovery Manager, Storage Manager, and Physical Disk/Secondary Storage (log files, databases and relative statistical, index, data, and meta-data files)]
The architecture depicted in Figure 3 is a view of the control flow of the MySQL system. It expands the simple architecture described in Figure 2. The architecture is a layered architecture as described by Garlan and Shaw. Within the Query Processing layer, the components between the Embedded DML Precompiler and the Execution Engine form a pipeline architecture, also described by Garlan and Shaw. For simplicity, flow back up the architecture has been left out of the diagram and should be taken as implicit. For example, a simple SQL command such as SELECT * requires information to be brought back up the system. Each layer in Figure 3 is described below. Because the core of the functionality is found in the Logical Layer, that layer is explained in the most detail.
4.2.1.2 DDL Compiler
Requests to access the MySQL databases that come from an administrator are processed by the DDL (Data Definition Language) compiler. The DDL compiler compiles the commands (which are SQL statements) so that they can interact directly with the database. The administrator and administrative utilities do not expose an interface, and hence execute directly against the MySQL server. Their requests are therefore not processed by the embedded DML precompiler, which explains the need for a DDL compiler.

4.2.1.3 Query Parser
After the relevant SQL query statements have been obtained from the client request or the administrative request, the next step is to parse the MySQL query. The objective of the query parser is to create a parse tree from the query so that the query can be easily understood by the components later in the pipeline.

4.2.1.4 Query Preprocessor
The query preprocessor uses the parse tree obtained from the query parser to check the SQL syntax and the semantics of the MySQL query and so determine whether the query is valid. If the query is valid, it progresses down the pipeline. If not, the query does not proceed and the client is notified of the query processing error.

4.2.1.5 Security/Integration Manager
Once the MySQL query is deemed valid, the MySQL server needs to check the access control list for the client. This is the role of the security/integration manager, which checks whether the client has permission to connect to that particular MySQL database and whether the client has table and record privileges. This prevents malicious users from accessing particular tables and records in the database and causing havoc in the process.

4.2.1.6 Query Optimizer
After determining that the client has the proper permissions to access the specific table in the database, the query is subjected to optimization. MySQL uses the query optimizer to execute SQL queries as quickly as possible; this is one reason MySQL performs quickly in comparison with other RDBMSs. The task of the MySQL query optimizer is to analyze the processed query to see whether it can take advantage of any optimizations that will allow the query to be processed more quickly. The optimizer uses indexes whenever possible, and it uses the most restrictive index first in order to eliminate as many rows as possible as soon as possible. Queries can be processed more quickly if the most restrictive test is done first.

4.2.1.7 Execution Engine
Once the MySQL query optimizer has optimized the MySQL query, the query can be executed against the database. This is performed by the query execution engine, which executes the SQL statements and accesses the physical layer of the MySQL database shown in Figure 3. The execution engine also runs commands that the database administrator issues for specific tasks such as repair, recovery, copying, and backup; it receives these from the DDL compiler.
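The "most restrictive test first" idea from the Query Optimizer description can be illustrated with a small sketch. This is a toy model under stated assumptions, not MySQL's actual optimizer: the `Predicate` type, the `order_predicates` and `run_query` functions, and the `estimated_rows` figures are all hypothetical names and made-up numbers introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Predicate:
    """One WHERE-clause test plus a hypothetical row-count estimate."""
    column: str
    test: Callable[[object], bool]
    estimated_rows: int  # assumed to come from index statistics


def order_predicates(predicates: List[Predicate]) -> List[Predicate]:
    # Most restrictive first: the test expected to keep the fewest rows.
    return sorted(predicates, key=lambda p: p.estimated_rows)


def run_query(rows: List[Row], predicates: List[Predicate]) -> List[Row]:
    """Apply the tests in optimized order, shrinking the candidate set each step."""
    candidates = rows
    for pred in order_predicates(predicates):
        candidates = [r for r in candidates if pred.test(r[pred.column])]
    return candidates


rows = [
    {"id": 1, "country": "CA", "status": "active"},
    {"id": 2, "country": "CA", "status": "closed"},
    {"id": 3, "country": "US", "status": "active"},
    {"id": 4, "country": "CA", "status": "active"},
]
predicates = [
    Predicate("status", lambda v: v == "active", estimated_rows=3),
    Predicate("id", lambda v: v == 4, estimated_rows=1),
]
plan = [p.column for p in order_predicates(predicates)]
result = run_query(rows, predicates)
```

In this sketch the `id` test runs before the `status` test because it is estimated to keep only one row, so the second test has far fewer candidates to examine.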
4.2.1.8 Scalability/Evolvability
The layered architecture of the logical layer of the MySQL RDBMS supports the evolvability of the system. If the underlying pipeline of the query processor changes, the other layers in the RDBMS are not affected, because the query processor has minimal sub-component interactions with the layers above and below it, as can be seen from the architecture diagram. The only sub-components in the query processor that interact with other layers are the embedded DML precompiler, the DDL compiler, and the query parser (at the beginning of the pipeline) and the execution engine (at the end of the pipeline). Hence, if the query preprocessor, the security/integration manager, and/or the query optimizer is replaced, the outcome of the query processor is not affected.
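The claim that an interior pipeline stage can be replaced without changing the outcome can be sketched as a chain of functions. This is a minimal illustration under stated assumptions rather than MySQL's implementation: every name below (`parse`, `preprocess`, `make_security_check`, `no_op_optimizer`, `index_optimizer`, `make_executor`, `run_pipeline`) is hypothetical, and the "parsing" is a plain whitespace split.

```python
from typing import Callable, Dict, List, Set

Query = Dict[str, object]
Stage = Callable[[Query], Query]


def parse(query: Query) -> Query:
    # Toy "parse tree": split the statement text into tokens.
    return {**query, "tree": str(query["sql"]).split()}


def preprocess(query: Query) -> Query:
    # Toy validity check: only SELECT statements are accepted here.
    tree = query["tree"]
    if not tree or tree[0].upper() != "SELECT":
        raise ValueError("invalid query")
    return query


def make_security_check(allowed_tables: Set[str]) -> Stage:
    def check(query: Query) -> Query:
        table = query["tree"][-1]  # assumes the table name is the last token
        if table not in allowed_tables:
            raise PermissionError(f"no access to {table}")
        return query
    return check


def no_op_optimizer(query: Query) -> Query:
    return {**query, "plan": "full scan"}


def index_optimizer(query: Query) -> Query:
    return {**query, "plan": "index lookup"}


def make_executor(tables: Dict[str, list]) -> Stage:
    def execute(query: Query) -> Query:
        return {**query, "rows": tables[query["tree"][-1]]}
    return execute


def run_pipeline(sql: str, stages: List[Stage]) -> Query:
    query: Query = {"sql": sql}
    for stage in stages:  # each stage only sees the previous stage's output
        query = stage(query)
    return query


tables = {"users": [("alice",), ("bob",)]}
front = [parse, preprocess, make_security_check({"users"})]
back = [make_executor(tables)]

first = run_pipeline("SELECT * FROM users", front + [no_op_optimizer] + back)
second = run_pipeline("SELECT * FROM users", front + [index_optimizer] + back)
```

Swapping `no_op_optimizer` for `index_optimizer` changes the recorded plan but not the rows returned, which mirrors the argument that interior stages can evolve independently of the pipeline's endpoints.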
4.2.4.1 Storage Manager
At the lowest level exists the Storage Manager. The role of the Storage Manager is to mediate requests between the Buffer Manager and secondary storage. The Storage Manager makes requests through the underlying disk controller (and sometimes the operating system) to retrieve data from the physical disk and reports them back to the Buffer Manager.

4.2.4.2 Buffer Manager
The role of the Buffer Manager is to allocate memory resources for the use of viewing and manipulating data. The Buffer Manager takes in formatted requests and decides how much memory to allocate per buffer and how many buffers to allocate per request. All requests are made from the Resource Manager.

4.2.4.3 Resource Manager
The purpose of the Resource Manager is to accept requests from the execution engine, put them into table requests, and request the tables from the Buffer Manager. The Resource Manager receives references to data within memory from the Buffer Manager and returns this data to the upper layers.
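The division of labour among the three storage components can be sketched as three small classes, each talking only to the one below it. This is an illustrative model under stated assumptions, not MySQL code: the class and method names are hypothetical, a dictionary stands in for the physical disk, whole tables are buffered rather than pages, and least-recently-used eviction is an assumption, since the text does not say how buffers are reclaimed.

```python
from collections import OrderedDict
from typing import Dict, List


class StorageManager:
    """Stands in for secondary storage; a dict plays the role of the disk."""

    def __init__(self, disk: Dict[str, List[tuple]]) -> None:
        self._disk = disk
        self.reads = 0  # counts simulated disk accesses

    def read_table(self, name: str) -> List[tuple]:
        self.reads += 1
        return list(self._disk[name])


class BufferManager:
    """Keeps recently used tables in memory (LRU eviction is an assumption)."""

    def __init__(self, storage: StorageManager, capacity: int) -> None:
        self._storage = storage
        self._capacity = capacity
        self._buffers: "OrderedDict[str, List[tuple]]" = OrderedDict()

    def get(self, name: str) -> List[tuple]:
        if name in self._buffers:
            self._buffers.move_to_end(name)  # mark as most recently used
            return self._buffers[name]
        data = self._storage.read_table(name)
        self._buffers[name] = data
        if len(self._buffers) > self._capacity:
            self._buffers.popitem(last=False)  # evict least recently used
        return data


class ResourceManager:
    """Turns execution-engine requests into table requests."""

    def __init__(self, buffers: BufferManager) -> None:
        self._buffers = buffers

    def fetch(self, table: str) -> List[tuple]:
        # Returns a reference to the in-memory data held by the buffer manager.
        return self._buffers.get(table)


storage = StorageManager({"users": [(1, "alice")], "orders": [(10, 1)]})
resources = ResourceManager(BufferManager(storage, capacity=1))

first = resources.fetch("users")   # miss: goes to storage
second = resources.fetch("users")  # hit: served from the buffer
resources.fetch("orders")          # miss: evicts "users"
resources.fetch("users")           # miss again after eviction
```

The repeated request for `users` is answered from memory without touching the simulated disk, and the execution engine (the caller of `fetch`) never interacts with the Storage Manager directly.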
4.2.5 Evolvability/Scalability
The Transaction Management and Recovery Management subsystems appear to address non-functional requirements such as evolvability and scalability. For example, the different managers provide the necessary abstractions so that the implementation can change while the interface stays the same, thereby ensuring that the system can evolve to contain better data structures or algorithms. Furthermore, these subsystems provide scalability by handling several transactions from several different users concurrently, and by recovering from crashes in several different databases without much effort.
6.0 Glossary
API (Application Programming Interface): A set of functions in a particular programming language, issued by a client, that interfaces to a software system.
Atomically: The "all or nothing" execution of transactions.
Disk Controller: A circuit or chip that translates commands into a form that can control a hard disk drive.
DML (Data Manipulation Language): A generic language in which the actual SQL statements are embedded in the client or translated from the client code.
DDL (Data Definition Language): The language that the RDBMS understands. In MySQL, and all SQL databases, the DDL is SQL itself.
Deadlock: A failure or inability to proceed because each of two transactions holds some data that the other needs.
Index: A means to speed up access to the contents of database tables. The MySQL query optimizer takes advantage of indexes in order to speed up the processing of queries.
Main Memory: The storage device used by a computer to hold the currently executing program and its working data for fast access.
Meta-data: Data whose purpose is to represent the structure and meaning of the actual data.
Query Optimizer: Component in the Query Processor whose primary purpose is to optimize queries so that they can be processed faster.
Query Precompiler: Processes the client code from the application layer to extract the SQL statements or translate the code into SQL statements.
Query Preprocessor: Checks the syntax and semantics of the query before the query is processed, optimized, and executed.
RDBMS (Relational Database Management System): A system that provides the management, storage, and handling of requests to a relational database.
Transaction: One or more commands grouped together into a single unit of work.
Virtual Memory: Memory, often simulated on a hard disk, that emulates RAM, allowing an application to operate as though the computer has more memory than it actually does.
7.0 References
1. Atkinson, Leon. Core MySQL: The Serious Developer's Guide. New Jersey: Prentice Hall Publishing, 2002.
2. Date, C.J. An Introduction to Database Systems. Menlo Park: Addison-Wesley Publishing Company, Inc, 1986.
3. Dubois, Paul. MySQL. New York: New Riders Publishing, 2000.
4. Frost, R.A. Database Management Systems. New York: Granada Technical Books, 1984.
5. Garcia-Molina, Hector. Database System Implementation. New Jersey: Prentice Hall, 2000.
6. Garlan, David and Shaw, Mary. "An Introduction to Software Architecture". Pittsburgh, PA, USA: School of Computer Science, Carnegie Mellon University, 1994.
7. Kruchten, Philippe. "Architectural Blueprints: The 4+1 View Model of Software Architecture". Rational Software Corp, 1995.
8. Silberschatz, Abraham et al. Database System Concepts. New York: McGraw-Hill, 1997.
9. U.S. Department of Commerce. "An Architecture for Database Management Standards". Washington: U.S. Government Printing Office, 1982.
10. "http://www.mysql.com/". MySQL AB, 2002.
11. Yarger, Randy Jay. MySQL & mSQL. Sebastopol: O'Reilly & Associates, 1999.