Book 3: SQL Architecture Web Sample
Rick A. Morelan, All Rights Reserved 2010
ISBN: 1451579462  EAN-13: 978-145-157-9468
Rick A. Morelan, [email protected]
Table of Contents
About the Author ..... 9
Acknowledgements ..... 9
Introduction ..... 10
Skills Needed for this Book ..... 10
About this Book ..... 11
How to Use the Downloadable Companion Files ..... 12
What This Book is Not ..... 13
Lab 2.2 Database Snapshots ..... 73
Database Snapshots - Points to Ponder ..... 74
Setting Database Properties ..... 75
Altering Databases Using Management Studio ..... 77
Altering Databases Using T-SQL Code ..... 78
Database Code Generators ..... 80
Lab 2.3: Setting Database Properties ..... 85
Setting Database Properties - Points to Ponder ..... 86
Chapter Glossary ..... 87
Chapter Two - Review Quiz ..... 88
Answer Key ..... 88
Bug Catcher Game ..... 89
Using the Sparse Data Option ..... 133
Lab 4.1: Sparse Data Option ..... 137
Sparse Data Option - Points to Ponder ..... 139
Custom Data Types ..... 140
Creating Custom Types ..... 143
Using Custom Types ..... 145
Dropping Custom Types ..... 147
Lab 4.2: Custom Data Types ..... 149
Custom Data Types - Points to Ponder ..... 150
System and Time Data Types ..... 151
Recap of DateTime Functions ..... 151
Standard Date and Time Data Types ..... 153
Date and Time Zone Types ..... 156
Lab 4.3: System and Time Data Types ..... 161
System and Time Data Types - Points to Ponder ..... 162
Chapter Glossary ..... 163
Chapter Four - Review Quiz ..... 164
Answer Key ..... 166
Bug Catcher Game ..... 166
Creating Separate Tables ..... 206
Overview of Partitioned Tables ..... 207
Partitioning ..... 209
Partition Functions ..... 210
Partition Schemes ..... 211
Data Storage Areas ..... 212
Creating Partitioned Tables ..... 213
Creating Filegroups ..... 213
Creating Datafiles for the Filegroups ..... 216
Creating Partitioned Functions ..... 218
Creating Partitioned Schemes ..... 219
Viewing Partitioned Table Information ..... 221
Lab 6.1: Creating Partitioned Tables ..... 224
Creating Partitioned Tables - Points to Ponder ..... 227
Chapter Glossary ..... 227
Chapter Six - Review Quiz ..... 228
Answer Key ..... 229
Bug Catcher Game ..... 229
Identity Fields ..... 265
Lab 8.1: Clustered Indexes ..... 267
Clustered Indexes - Points to Ponder ..... 268
Nonclustered Indexes ..... 270
Index Scan ..... 270
Index Seek ..... 270
Clustered Indexes Recap ..... 271
Creating Nonclustered Indexes ..... 276
Unique Nonclustered Indexes ..... 277
Lab 8.2: Nonclustered Indexes ..... 281
Nonclustered Indexes - Points to Ponder ..... 282
Chapter Glossary ..... 283
Chapter Eight - Review Quiz ..... 284
Answer Key ..... 284
Bug Catcher Game ..... 285
Chapter 10.
Query Execution Plans ..... 318
Covering Indexes ..... 318
How Nonclustered Indexes Work ..... 321
When a Query Needs a Covering Index ..... 322
Selectivity ..... 323
Selective Predicate Operations ..... 325
Optimization Hints ..... 327
Index Hints ..... 327
Query Hints ..... 328
Lab 10.1: Query Execution Plans ..... 335
Query Execution Plans - Points to Ponder ..... 337
Analyzing Indexes ..... 338
Seek vs. Scan Recap ..... 338
Historical Index Metadata ..... 339
Lab 10.2: Analyzing Indexes ..... 342
Analyzing Indexes - Points to Ponder ..... 343
Database Tuning ..... 344
Workload Files ..... 345
Database Engine Tuning Advisor (DTA) ..... 346
Analyzing Session Definitions ..... 355
Analyzing and Saving Session Results ..... 357
Lab 10.3: Data Tuning Advisor ..... 360
Database Tuning - Points to Ponder ..... 363
Chapter Glossary ..... 364
Chapter Ten - Review Quiz ..... 365
Answer Key ..... 367
Bug Catcher Game ..... 368
Chapter 11.
Fragmentation Basics ..... 370
Detecting Fragmentation ..... 371
Detecting Fragmentation with Management Studio ..... 372
Detecting Fragmentation with Dynamic Management Views ..... 373
Lab 11.1: Detecting Fragmentation ..... 375
Detecting Fragmentation - Points to Ponder ..... 376
Fragmentation Reports ..... 377
Fragmentation Recap ..... 377
Combining Fragmentation Metadata ..... 377
Index Defragmentation ..... 380
Rebuilding Indexes ..... 380
Reorganizing Indexes ..... 381
Lab 11.2: Index Defragmentation ..... 383
Index Defragmentation - Points to Ponder ..... 384
Index Metadata ..... 385
Actual Execution Plans ..... 385
Missing Indexes ..... 387
Object Property Metadata ..... 390
Index Functions ..... 391
Lab 11.3: Index Metadata ..... 395
Index Metadata - Points to Ponder ..... 397
Chapter Glossary ..... 398
Chapter Eleven - Review Quiz ..... 399
Answer Key ..... 400
Bug Catcher Game ..... 401
Chapter 12.
SQL Statistics ..... 403
Statistics Metadata ..... 405
Histogram ..... 406
Creating Statistics ..... 406
Updating Statistics ..... 408
Lab 12.1: Index Statistics ..... 409
Index Statistics - Points to Ponder ..... 410
Partitioned Table Indexing ..... 412
Creating Partitioned Tables Recap ..... 412
Partitioned Tables as Heaps ..... 416
Creating Indexes on Partitioned Tables ..... 417
Query Partitioned Tables ..... 418
Lab 12.2: Partitioned Table Indexing ..... 419
Partition Table Indexing - Points to Ponder ..... 420
Chapter Glossary ..... 420
Chapter Twelve - Review Quiz ..... 421
Answer Key ..... 422
Bug Catcher Game ..... 422
Chapter 13.
Full Text Indexing ..... 424
Full Text Catalogs ..... 425
Full Text Indexes ..... 426
Lab 13.1: Full Text Indexing ..... 429
Full Text Indexing - Points to Ponder ..... 429
Stop Words ..... 430
Dropping Stop Words ..... 434
Adding Stop Words ..... 436
Lab 13.2: Stop Words ..... 438
Stop Words - Points to Ponder ..... 439
Chapter Glossary ..... 439
Chapter Thirteen - Review Quiz ..... 440
Answer Key ..... 441
Bug Catcher Game ..... 441
Index ..... 442
Acknowledgements
As a book with a supporting web site, illustrations, media content, and software scripts, it takes more than the usual author, illustrator, and editor to put everything together into a great learning experience. Since my publisher has the more traditional contributor list available, I'd like to recognize the core team members:
Editor: Jessica Brown, Joel Heidal
Cover Illustration: Jungim Jang
Technical Review: Tom Ekberg, Joel Heidal
Software Design Testing: Irina Berger
Index: Denise Driscoll
User Acceptance Testing: Michael McLean
Website & Digital Marketing: Gaurav Singhal
Thank you to all the teachers at Catapult Software Training Institute in the mid-1990s. What a great start to open my eyes; it landed me my first job at Microsoft by August of that year. A giant second wind came from Koenig-Solutions, which gives twice the training and attention for half the price of most other schools. Mr. Rohit Aggarwal is the visionary founder of this company based in New Delhi, India. Rohit's business model sits students down one-on-one with experts, and each expert dedicates weeks to help each new IT student succeed. The numerous twelve-hour flights I took to India to attend those classes were pivotal to my success. Whenever a new generation of software was released, I got years ahead of the learning curve by spending one or two months at Koenig. Dr. James D. McCaffrey at Volt Technical Resources in Bellevue, Wash., taught me how to improve my own learning by teaching others. You'll frequently see me in his classroom because he makes learning fun. McCaffrey's unique style boosts the self-confidence of his students, and his tutelage has been essential to my own professional development. His philosophy inspires the Joes 2 Pros curriculum.
Introduction
Sure, I wrote great queries and was always able to deliver precisely the data that everyone wanted. But a wake-up call on December 26, 2006 forced me to get better acquainted with the internals of SQL Server. That challenge was a turning point for me and my knowledge of SQL Server performance and architecture. Almost everyone from the workplace was out of town or taking time off for the holidays, and I was in the office for what I expected would be a very quiet day. I received a call that a query which had been working fine for years was suddenly crippling the system. The request was not for me to write one of my brilliant queries but to troubleshoot the existing query and get it back to running well. Until that moment, I hadn't spent any time tuning SQL Server; that was always done by another team member. To me, tuning seemed like a black-box feat accomplished by unknown formulas that I would never understand. On my own on that Boxing Day, I learned the mysteries behind tuning SQL Server. In truth, it's really just like driving a car: by simply turning the key and becoming acquainted with a few levers and buttons, you can drive anywhere you want. With SQL Server, there are just a few key rules and tools which will put you in control of how things run on your system.
How to Use the Downloadable Companion Files
free downloading. Videos show labs, demonstrate concepts, and review Points to Ponder along with tips from the appendix. Ranging from 3-15 minutes in length, they use special effects to highlight key points. There is even a Setup video that shows you how to download and use all other files. You can go at your own pace and pause or replay within lessons as needed.
Answer Keys: The downloadable files also include an Answer Key for you to verify your completed work. These coding answers are also available for independent students to peek at if they get really stuck.
Resource files: Located in the resources sub-folder from the download site are your practice lab resource files. These files hold the few non-SQL script files needed for some labs. You will be prompted by the text each time you need to download and utilize a resource file.
Lab setup files: SQL Server is a database engine and we need to practice on a database. The Joes 2 Pros Practice Company database is a fictitious travel booking company whose name is shortened to the database name of JProCo. The scripts to set up the JProCo database can be found here.
Chapter review files: Ready to take your new skills out for a test drive? We have
Joes 2 Pros book which requires that you have AdventureWorks installed in order to do some of the lessons. Be aware there are many versions of AdventureWorks, and the file you need to locate and download is AdventureworksDBCI.msi. Once you download and run this file, you may find that it is installed but not visible in Management Studio. You first need to attach the database. In Management Studio's Object Explorer, simply right-click the Databases folder, click Attach, then click Add, and then navigate to locate the filepath for this mdf file (listed at the top of page 13). You will find it in your Program Files folder:
\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AdventureWorks_Data.mdf. Once you attach the file, you may need to right-click and refresh the Databases folder in order to see the AdventureWorks database in Object Explorer.
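If you prefer to script the attach rather than click through the UI, a minimal T-SQL sketch would look like the following (this statement is not from the book; the path assumes the default location shown above, so adjust it to wherever the .mdf actually sits on your machine):

-- Attach the AdventureWorks data file; SQL Server will look for its log file alongside it.
CREATE DATABASE AdventureWorks
ON (FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\AdventureWorks_Data.mdf')
FOR ATTACH
GO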
Chapter 1. Database File Structures
In Beginning SQL Joes 2 Pros, we compared data to cargo and database objects to containers for our mission-critical data. The curriculum for that book included the core keywords for each of the four components of the SQL language: DML, DDL, DCL, and TCL. To put it plainly, this book is about what is under the hood of SQL Server: the engine and parts which process all your data. This book will utilize a great deal of DDL (Data Definition Language), since we will be addressing the design and programming of database objects, those vital containers of our business data. However, we will see a fair amount of DML working hand-in-glove with DDL in this book. In the first two books, we built stored procedures using DDL statements which encapsulated the DML statements needed to handle our data tasks. When we build objects like views and functions in this book, we similarly will see statements which define the objects executed simultaneously with the data-centric statements needed to carry out the objects' purpose. This is the principal reason it is better for students to gain extensive exposure to queries before launching into the Architecture class.
Figure 1.1 This book will emphasize DDL statements, as well as DDL combined with DML.
Each chapter will include instructions for the setup scripts you need to run in order to follow along with the chapter examples. The setup scripts give you the freedom to practice any code you like, including changing or deleting data or even dropping entire tables or databases. Afterwards you can rerun the setup script (or run the setup script when you reach the next section) and all needed objects and data will be restored. This process is also good practice; these are typical tasks done frequently when working with SQL Server, particularly in a software development and testing environment.
READER NOTE: In order to follow along with the examples in the first section of Chapter 1, please install SQL Server and run the first setup script SQLArchChapter1.0Setup.sql. The setup scripts for this book are posted at Joes2Pros.com.
Figure 1.2 Upper figure: The JProCo database before I remove it from SQL Server, shown alongside the code I will run to drop it. Lower figure: I have removed JProCo and double-checked the Object Explorer to confirm that it is gone.
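The drop statement pictured in Figure 1.2 is not reproduced in this extract, but it is most likely just the standard form:

USE master
GO
DROP DATABASE JProCo
GO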
Now that JProCo has been removed, let's look at my hard drive and its current capacity. To find capacity details on your own machine, open Windows Explorer (Start+E) > right-click the C drive > Properties. It appears that I have 499 MB free on my hard drive without the JProCo database (see Figure 1.3). I'll now rerun the first setup script for this chapter in order to bring back the JProCo database. The setup script (SQLArchChapter1.0Setup.sql) will load the JProCo database onto my system fully populated (Figure 1.4, Panel A). Once the script has executed successfully, I'll go to SQL Server's Object Explorer > right-click the Databases folder > and choose Refresh (Figure 1.4, Panel B). After the refresh, the JProCo folder becomes visible in my Object Explorer (Figure 1.4, Panel C).
Figure 1.3 I have 499 MB free without the JProCo db.
Figure 1.4 Panel A: the setup script has run successfully. Panel B: the Databases folder is refreshed. Panel C: the JProCo database becomes visible in Object Explorer.
Be aware that the Object Explorer tree does not dynamically update a database object after it is added or dropped. Expect to run this same refresh process each time you check Object Explorer and wish to see your updated changes in the list.
Now I'll return to my hard drive. Recall it showed 499 MB of free space prior to reloading the JProCo database onto my system. With JProCo now reloaded on my system, my hard drive shows just 305 MB of free space (see Figure 1.5). So it appears JProCo is occupying roughly 200 MB someplace on my local drive. SQL Server chooses a default location to store databases whenever an alternate location is not specified. The filepath shown here (and in Figure 1.6) is the default location for SQL Server 2008.
This screen capture of my MSSQL\DATA folder (Figure 1.6) shows the two files which comprise the JProCo database. When I ran the setup script, SQL Server loaded these two files onto my system. Notice that one file is roughly 152 MB in size (JProCo.mdf). The other file is roughly 45 MB (JProCo_log.ldf). Together they total roughly 200 MB, the same amount which we estimated JProCo occupies on my hard drive. These two files stored on the hard drive contain all of the data and all of the logging activity for the JProCo database.
Figure 1.6 My MSSQL\DATA folder shows two JProCo files.
These are pretty small files, just 1 KB each. In the lesson video for this section (Lab1.1_DatabaseFileStructures.wmv), you will see me make some significant edits to Document A, which I won't make to Document B. Not only will the documents differ visually, but when we see that the changes cause Document A's size to expand to 6 KB, it's clear that Document A and Document B are no longer identical files. Where my classes tend to find the ah-ha moment is when we actually see the changes in Document A being removed one by one as I use Edit > Undo to backtrack and see the edits disappear. Similarly, if I delete a few words one by one, the Undo operation will backtrack and make each word reappear one by one. What the application is doing is traversing through the log of changes and accessing memory to find all of those changes. If the changes weren't logged, they wouldn't be available.
Figure 1.8 Documents A & B begin as identical files.
Figure 1.9 Our demonstration with two WordPad files helps conceptualize logfile activity.
At the end of the video demonstration, Document A has been returned to its beginning state: it contains the identical information as Document B, and I've saved the file in order to clear the logged changes from memory. Thus, Document A and B each are 1 KB in size at the end. But just prior to saving Document A, we make another interesting ah-ha observation. On the surface, both documents appear identical (as shown in Figure 1.10). However, when we compare the size of the two files, Document A is many times larger than Document B. In my classroom demos, the file grows to 250 KB with relatively few clicks. The log tracks changes made to the document from the last time it was saved up until the current moment.
Figure 1.10 The log tracks all the changes from the last document backup (save) until now.
At this point, an expert database admin (DBA) would understandably be bored silly with this demo. However, over the years Ive found this the fastest way to ramp up students new to the abstract concept of the work done by logfiles. When the document save operation clears out the log, we also get a nice reference point to regular server backups, which truncate (empty) the logfile. Document As condition at the beginning and end of the demo (i.e., 1 KB and reflecting the data This File is Small.) serves as a comparison to the datafile. Because the file was saved at the beginning of the demo and then again at the end, the document showed just the current state of the data nothing to do with tracking data which was added or deleted along the way. The datafiles purpose is to reflect the current state of your database. For student readers still trying to get their heads around this idea of datafiles and logfiles, have no fear the video demonstrations contain many examples, as well as a practice or challenge at the end of the video to give you plenty of practice with datafiles and logfiles. And the next four pages include a step by step tutorial following data through the datafile and logfile as it enters a new database.
Step 1. Pretend you have a brand new database with one table (Employee) which contains zero records. There are no records in your JProCo database, so there are no records in your datafile. And since you haven't made any changes to the database, there are zero records in your logfile.
Step 2. Now data starts coming into the JProCo database. You add one new record for Alex Adams to the Employee table. So now you have one record in your datafile and one record in your logfile.
Step 3. You then add another record (Barry Brown). Two records are now in JProCo, so two records are in the datafile and two records in the logfile. So you have two pieces of data and two entries in the logfile reflecting those changes.
Step 4. Your next step updates an existing record. Employee 2 is coming back from leave, so youre going to change his status from On Leave to Active. There will still be two records in the database, so the datafile will contain two records. But there will be three records in your logfile, since you made three changes since the last time you backed up your database.
Figure 1.14 An existing record is updated; 2 records in the database but 3 records in the LDF.
Step 5. The database backup runs at 12:05 AM. You still have two records in JProCo, so two records are in your datafile. All your changes were sent to the backup, so zero records are in your logfile; your logfile has been truncated during the backup process.
Figure 1.15 The database backup runs. The logfile is truncated, so it contains no records.
Step 6. On Day 2, you insert Lee Osako's record (the third record added to Employee). At this point you have three records in your datafile. The logfile has been empty since the backup, and this change now adds one record to the logfile.
Figure 1.16 The third record is added to the db. Three records in MDF, one record in LDF.
Step 7. On the same day (Day 2), you delete Barry Brown from the table. Removing one record leaves two records in the datafile. The logfile now contains two records, one for the INSERT (Lee) and one for the DELETE (Barry).
Figure 1.17 On Day 2 one record is deleted. Two records remain in MDF, two in LDF.
Recall Figure 1.6, where we saw the default data and log files which SQL Server generated when we created the JProCo database. It named the datafile JProCo.MDF and the logfile JProCo_log.LDF.
Figure 1.18 It is highly recommended that you follow the naming convention shown here.
This convention for naming files and their extensions reflects best practice recommendations (see Figure 1.18). SQL Server does not enforce the .mdf/.ldf extensions, but following this standard is highly recommended.
Creating Databases
If you were to execute this statement, you would create a database called TSQLTestDB, and all the defaults for name, size, and location of the datafile and logfile would be chosen for you. Up until now, we have accepted SQL Server's defaults for these items each time we have created a database.
Figure 1.19 This code would create a db using defaults for name-size-location of MDF & LDF.
But you can actually choose your own options. For our examples in this chapter, we won't store any of our files in the default location (the MSSQL10.MSSQLSERVER\DATA folder). Create a folder titled SQL on your hard drive (C:\SQL). The MDF and LDF files for our new test database will be stored there. In addition to specifying the location, we will also choose the name and size for TSQLTestDB's datafile and logfile. As a rule of thumb, it's generally a good idea to make the size of your LDF 25% of the MDF. Now run all of this code together.
Figure 1.20 You can choose the name, size, and location for your datafiles.
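The statements captured in Figures 1.19 and 1.20 appear as screenshots in the book and are not reproduced in this extract. A sketch of what they likely resemble, based on the description above (the logical file names are assumptions; the book's actual names may differ):

-- Figure 1.19 style: every property of the MDF and LDF is left to SQL Server's defaults.
CREATE DATABASE TSQLTestDB
GO

-- Figure 1.20 style: choose the name, size, and location yourself.
-- Run one form or the other, not both.
CREATE DATABASE TSQLTestDB
ON PRIMARY
 (NAME = TSQLTestDB_Data,               -- logical name (assumed)
  FILENAME = 'C:\SQL\TSQLTestDB.mdf',
  SIZE = 20MB)
LOG ON
 (NAME = TSQLTestDB_Log,                -- logical name (assumed)
  FILENAME = 'C:\SQL\TSQLTestDB.ldf',
  SIZE = 5MB)
GO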
Let's check the C:\SQL folder and confirm we can see the newly created files TSQLTestDB.mdf (20 MB) and TSQLTestDB.ldf (5 MB).
Figure 1.21 We specified the name, size, and location for TSQLTestDB, its MDF, and its LDF.
Now let's see how to locate metadata for this test database using SQL Server's Object Explorer. Remember to first refresh the Databases folder in the Object Explorer, since we know that SQL Server does not change its contents automatically (Object Explorer > right-click Databases > Refresh, as shown earlier in Figure 1.4). Navigate to TSQLTestDB (Object Explorer > Databases > TSQLTestDB). Then open the Database Properties dialog by right-clicking the TSQLTestDB folder > Properties. In the left-hand "Select a page" nav menu, click the Files page. This will show you the name for the database, the MDF, and the LDF. You can see the custom specifications we included in our CREATE DATABASE statement (see prior page, Figure 1.19) for the logical and physical name, size, and location (C:\SQL) for each file.
Figure 1.23 The Database Properties dialog shows metadata for the database TSQLTestDB.
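If you prefer a query to the Properties dialog, the same file metadata is exposed by the sys.database_files catalog view. A minimal sketch, not from the book, to be run while connected to the database you want to inspect:

USE TSQLTestDB
GO
SELECT name,                       -- logical file name
       physical_name,              -- full path on disk
       type_desc,                  -- ROWS (data) or LOG
       size * 8 / 1024 AS size_mb  -- size is stored in 8 KB pages
FROM sys.database_files
GO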
Figure 1.24 In Skill Check 1, you will create the RatisCo database with the specified properties.
Answer Code: The T-SQL code to this lab can be found in the downloadable files in a file named Lab1.1_DatabaseFileStructures.sql.
Make the datafile large enough to hold all of your expected data. It's a good idea to make sure the MDF is adequately large, even when your database starts out empty. If you estimate your data will likely grow to 5 GB, then create your MDF from the outset to be at least 5 GB. Your datafile size is limited only by the size of your hard drive. The minimum size for a datafile is 3 MB, and 1 MB for a logfile; these are also the default file sizes when you create a database without specifying the size. The upper limit for your datafile's size is essentially the amount of available space on your available hard drives.

If your datafile fills up, then SQL Server can't process additional data. One solution is to adjust the size property of your datafile. When you increase the size property, the datafile will procure more space on the drive. You can also adjust the filegrowth property to allow the datafile to grow: you specify the incremental rate at which you want your datafile to grow once the current size specification has been reached. This rate is known as filegrowth and is controlled by the MDF's filegrowth property. For example, assume your hard drive has 100 MB available and your MDF is approaching 30 MB, which is the maximum size you specified when you created your database. You can set the MDF to grow and occupy additional hard drive space (up to roughly 70 MB more). However, if your hard drive runs out of space, then you either need a larger drive or a separate drive. That's where Alternate Datafile Placement can be the solution. You have the option of placing your datafile on a separate drive, or even on its own dedicated drive (e.g., if you have the D drive available, you could dedicate it just to your data: D:\SQL\JProCo.mdf).

Your data activity may overwhelm the throughput of one drive. In your SQL Server work, you will occasionally encounter scenarios where your business moves so quickly that the throughput of your data traffic exceeds the throughput capacity of the drive. For example, since the C drive can become very busy with operating system (OS) activity and several services running on it, you might want to reserve the throughput of the C drive for day-to-day operations. Or perhaps the main use of the database is one giant data table which receives billions of hits per hour; the activity of that table could overwhelm the throughput of any single drive. You can use Multiple Datafiles Placement to spread the entire database workload evenly across several datafiles (see Figure 1.25). In addition to your MDF (main datafile), you can create secondary datafiles and split where the storage takes place. (Note: SQL Server allows just one file per database to use the .MDF extension. Any additional datafile(s) must use the .NDF extension, for secoNdary DataFile, e.g., E:\SQL\JProCo.ndf.) A table (or any single database object) may be assigned to just one filegroup. However, you can associate a filegroup with as many datafiles as you need. Thus, when you wish to spread a giant table's workload across many drives, you will use multiple datafiles but place them all in the same filegroup.

SOLUTION #1: Adjust the Size of Your Datafile
You can adjust the size property of your datafile. When you increase the size property, the datafile will procure more space on the drive. You can also adjust the filegrowth property to allow the datafile to grow. If your datafile fills up, then SQL Server can't process additional data. You can specify the incremental rate at which you want your datafile to grow once the current size specification has been reached. This rate is known as filegrowth and is controlled by the MDF's filegrowth property.

SOLUTION #2: Alternate Datafile Placement
You may choose to place your datafile on its own dedicated drive. Since the C drive can become very busy with operating system (OS) activity and several services running on it, you might want to reserve the throughput of the C drive for day-to-day operations. You have the option of placing your datafile on a separate drive, or even on its own dedicated drive (e.g., if you have the D drive available, you could dedicate it just to your data: D:\SQL\JProCo.mdf).
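A sketch of what Solution #1 might look like in T-SQL. This is not from the book; the logical file name JProCo is an assumption (check sys.database_files for the real name on your system), and each property is modified in its own statement because the documentation recommends changing one file property at a time:

-- Grow the data file to 5 GB right away.
ALTER DATABASE JProCo
MODIFY FILE (NAME = JProCo, SIZE = 5GB)
GO
-- Let it expand automatically in 500 MB increments...
ALTER DATABASE JProCo
MODIFY FILE (NAME = JProCo, FILEGROWTH = 500MB)
GO
-- ...up to a 10 GB ceiling.
ALTER DATABASE JProCo
MODIFY FILE (NAME = JProCo, MAXSIZE = 10GB)
GO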
SOLUTION #3: Multiple Datafiles for Data Placement
Suppose you have already placed your datafile on a dedicated drive (e.g., the D drive, D:\SQL\JProCo.mdf), but you soon see that this drive will not be big enough for your growing database. Or perhaps it's not fast enough. In addition to your MDF (main datafile), you can create a secondary datafile and split where the storage takes place. (Note: SQL Server allows just one file per database to use the .MDF extension. Any additional datafile(s) must use the .NDF extension, for secoNdary DataFile, e.g., E:\SQL\JProCo.ndf.) Suppose your database contains multiple tables and you figure out that your production tables comprise about half of the workload of your entire SQL Server database infrastructure. The other tables combined use the rest of the available workload. You can elect to have your secondary datafile dedicated to your production tables, and the remaining tables (e.g., reporting tables, lookup tables) can reside in your MDF on another drive.

SOLUTION #4: Multiple Datafiles for Automatic Load Balancing
Suppose the main use of your database is one giant data table which receives billions of hits per hour. The activity of that table could overwhelm the throughput of any single drive. You can spread the entire database workload evenly across several datafiles (see Figure 1.25). Note that a table (or any single database object) may be assigned to just one filegroup. However, you can associate a filegroup with as many datafiles as you need. Thus, when you wish to spread a giant table's workload across many drives, you will use multiple datafiles but place them all in the same filegroup.
[Figure 1.25 illustration: Multiple Datafiles for Automatic Load Balancing. If the main use of your database is one giant data table that gets millions of hits per hour, the data may be far too large and fast to be handled by one drive; you can spread the entire database load evenly across many locations. The operating system lives on C:, while the JProCo files are spread across D:\SQL\JProCo.MDF, E:\SQL\JProCo.NDF, F:\SQL\JProCo.NDF, and G:\SQL\JProCo.NDF.]
Figure 1.25 A giant tables workload spread across many drives and datafiles, all in one filegroup.
Using Filegroups
There are many advantages to using filegroups to manage the database workload. A filegroup may contain many datafiles, and the properties of all the datafiles can be managed simultaneously with a filegroup (see Figure 1.26).
[Figure 1.26 illustration: Filegroups, managing many files at once. If the first four datafiles were for production and you wanted to change all of their properties at once, you would have to set each one individually; by putting several files into one filegroup (Filegroup1, Filegroup2) you can manage all of those files as a single item.]
Figure 1.26 By putting several files into one filegroup, you can manage them as a single item.
the last database backup. Whereas datafiles have the file extension .mdf or .ndf, log datafiles always have the .ldf extension.
Figure 1.27 The Log Data File (LDF) must exist but cant be part of a filegroup.
When we completed Lab 1.1, we created one datafile called RatisCo_Data.mdf in the SQL folder of the C drive (C:\SQL\RatisCo_Data.mdf). Since we didn't specify any filegroups, SQL Server automatically placed it in the primary filegroup for us. We also created a log file in the same location (C:\SQL\RatisCo_Log.ldf). That file was not placed inside a filegroup, since log files are never part of a filegroup.
[Figure 1.28 illustration: configuration of the previous lab (Lab 1.1), one filegroup and one datafile. The PRIMARY filegroup contains a single datafile, C:\SQL\RatisCo_Data.MDF, which holds all tables.]
Figure 1.28 Our previous lab created one filegroup with one datafile.
Our next lab will create one datafile in the primary filegroup and two datafiles in the secondary filegroup (also known as the user-defined filegroup).
Figure 1.29 Our next lab will create one MDF in the primary filegroup, 2 NDFs in the secondary.
You will accomplish Part 1 of Lab 1.2 by following along hands-on with the demonstration. Later you will accomplish Part 2 independently. As shown in Figure 1.29 (previous page), the goal of Part 1 will be the following: 1) Create one MDF (main datafile) in the primary filegroup; 2) Create two NDFs (secondary datafiles) in a user-defined filegroup called Order_Hist located on a separate drive; 3) Create the LDF (log datafile) on a separate drive.
Note: For at least 80% of readers, it's unlikely that the machine you are practicing on has several drives you can write to. It's important to practice sending datafiles to different drives upon database creation and writing our code as such. Thus, we will improvise by creating separate folders, which we will treat as drives in order to simulate workplace conditions. Let's create the following three folders on our hard drive to stand in for the C, D, and E drives, respectively:
C:\C_SQL
C:\D_SQL
C:\E_SQL
Let's also make sure the RatisCo database has been dropped from our server:

USE master
GO
DROP DATABASE RatisCo
GO

The CREATE DATABASE statement we will write for RatisCo will contain all the specifications for the datafiles and filegroups. The ON clause specifies which datafile(s) to store the database on and also accepts arguments to specify which filegroup the datafile(s) should belong to. The usual practice is to place the .MDF file in the PRIMARY filegroup. All other filegroups must be explicitly defined within the FILEGROUP argument.
Figure 1.30 Our next lab will create one MDF in the primary filegroup, 2 NDFs in the secondary.
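The CREATE DATABASE statement shown in Figures 1.30 and 1.31 is a screenshot and is not reproduced in this extract. A sketch matching the description above, using the makeshift folders from the previous page (the logical file names here are assumptions; the book's actual names may differ, and sizes are left to the defaults as the text notes):

CREATE DATABASE RatisCo
ON PRIMARY
 (NAME = RatisCo_Data,      FILENAME = 'C:\C_SQL\RatisCo_Data.mdf'),
FILEGROUP Order_Hist
 (NAME = RatisCo_OrderHist1, FILENAME = 'C:\D_SQL\RatisCo_OrderHist1.ndf'),
 (NAME = RatisCo_OrderHist2, FILENAME = 'C:\D_SQL\RatisCo_OrderHist2.ndf')
LOG ON
 (NAME = RatisCo_Log,       FILENAME = 'C:\E_SQL\RatisCo_Log.ldf')
GO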
Figure 1.31 Our code successfully executes and creates RatisCo per our specifications.
Below is the Files page from the Database Properties dialog for RatisCo (Figure 1.32, lower frame). We see RatisCo and all its files were created as we specified. The Filegroups page from the Database Properties dialog (Figure 1.32, upper frame) shows one primary filegroup containing one file and an Order_Hist filegroup containing two files.
Notice the default size and growth properties (Initial Size (MB) and Autogrowth) showing for RatisCo (Figure 1.32, previous page). SQL Server assigned the default settings for these properties, since we did not specify otherwise. Here we see each RatisCo file as it appears in Windows Explorer (Start+E). RatisCo and all its files appear as we created them in our code (see Figure 1.31 on the previous page for the syntax of the CREATE DATABASE statement).
Figure 1.33 We find the MDF, two NDFs, and the LDF have all been created as we specified.
In the second part of the lab, you will drop RatisCo and then create it with two files (the MDF and an NDF) in the primary filegroup, three NDF files in the Order_Hist filegroup, and one log datafile (LDF). In the final section of this chapter, we will make these types of file structure changes (add datafiles, add filegroups) to existing databases without having to first drop and re-create the database. The model for Lab 1.2 Part 2 is shown on the next page (Figure 1.34). The primary filegroup contains the current tables, and the historical tables are housed in the Order_Hist filegroup.
Figure 1.34 Next we will create two datafiles in the primary filegroup, 3 NDFs in the secondary.
Please note, in this chapter we frequently capitalize the extensions of datafiles in filenames (as .MDF, .NDF, .LDF) for illustrative purposes. As you can see, SQL Server gives you the freedom to capitalize filenames as you wish, including your file extensions. Where our code capitalized the datafile extensions, SQL Server created the file precisely as we specified. We saw the uppercase lettering reflected in both the Object Explorer and the Windows Explorer views of the files (Figures 1.31 through 1.33). Similarly, where we coded lowercase lettering for the file extensions, we saw that SQL Server again created the file precisely as we specified (Figures 1.20 through 1.23).
**NOTE: This book includes references and code samples for capabilities such as Autogrowth, Filegrowth, and Auto Shrink, which you need to be aware of as a SQL Pro. However, the rule of thumb for homes and buildings, that it's more expensive to remodel than to build it right the first time, also applies to constructing databases. Ideally, you want each database file to be physically located in one place and not fragmented in various locations on a drive. Each time you upsize your database in small increments, SQL Server grabs any available hard drive space. The additional space procured most likely won't be adjacent to the location on the hard drive where your datafile currently sits. More Input/Output (I/O) resources are needed when SQL Server must search for information which is scattered across a drive. Also, most SQL Pros would recommend you avoid the use of automatic settings (e.g., Autogrowth, Filegrowth, Auto Shrink, Auto Close). Not only does the frequent monitoring utilize additional CPU and system resources, but shrinking a datafile invariably results in file and index fragmentation. Instead, SQL Pros prefer to have SQL Server alert them when the database is nearing its maximum size. In cases where you do need to upsize or downsize a database, the best practice is generally to perform these tasks manually.
Figure 1.35 Filegroups & Files pages from the Database Properties dialog for RatisCo.
Answer Code: The T-SQL code to this lab can be found in the downloadable files in a file named Lab1.2_UsingFileGroups.sql.
Skill Check 2: Create the database GrowthCo using the name, size, filegrowth, and location specifications below:
1) One MDF (main datafile) in the primary filegroup and in the C:\SQL directory.
2) Initial db size is 50MB, db can grow up to 80MB max, increase by 15MB increments.
3) One LDF (log datafile) in the C:\SQL directory.
4) Initial log size is 20MB, log can grow up to 100MB max, increase by 25% increments.
5) Use the recommended naming convention.
After creating this database, confirm your result matches the Database Properties dialog shown below (see Figure 1.36).
Object Explorer > Databases > GrowthCo > right-click Properties > Files (page)
Figure 1.36 The Database Properties dialog for GrowthCo confirms the properties you created.
Skill Check 3: There will be times when you don't want your database to grow automatically. You want SQL Server to notify you when the MDF or log file is full, so you can manage the upsizing process yourself. Create the database No_Growth using the name, size, filegrowth, and location specifications below:
1) One MDF (main datafile) in the primary filegroup and in the C:\SQL directory.
2) Initial db size is 40MB, db cannot grow.
3) One LDF (log datafile) in the C:\SQL directory.
4) Initial log size is 10MB, log cannot grow.
Use the recommended naming convention. After creating the No_Growth database, confirm your result matches the Database Properties dialog shown below (see Figure 1.37).
Object Explorer > Databases > No_Growth > right-click Properties > Files (page)
Figure 1.37 The Database Properties dialog for No_Growth confirms the properties you created.
A filegroup is a collection of datafiles which are managed as a single unit.
SQL Server databases have a primary filegroup and may also have user-defined secondary filegroups (like OrderHist).
If no filegroups are defined when the database is created, then the only filegroup that exists will be the PRIMARY filegroup, which is the default.
The primary filegroup contains the MDF file and as many (to a maximum of 32,767) NDF files as specified.
A secondary datafile is optional and will carry an .NDF extension. A database can contain a maximum of 32,767 secondary datafiles.
By putting several files into one filegroup, all these files can be managed as a single item.
A good use of files and filegroups is to separate files which are heavily queried (OLAP) from files which are heavily changed (OLTP). Example:
a. Products, Customers, and Sales are in one file.
b. Order History tables could be in another.
The two main reasons for using filegroups are to improve performance and to control the physical placement of the data.
Log files have a structure different from datafiles and cannot be placed into filegroups.
When a database is created, at least one datafile (.MDF) must be created.
It's a good idea to make the datafile large enough to hold all of the expected data.
The datafile size is only limited by the size of the hard drive.
A datafile can be placed on its own dedicated drive. Production tables might be so critical and have such a large workload that they need their own storage location.
The FILEGROWTH property can be used to expand a datafile when it grows beyond its initial size.
Setting a database datafile or logfile to no-growth is considered best for performance, but that increases the risk of a growing database running out of available SQL datafile space.
If filegrowth is set to zero, then the datafile will never grow automatically, but the file can manually be resized at any time.
If a filesize, filegrowth, or any other database property is not specified, then the database properties will be modeled after the model database.
Suppose we had intended to create an NDF file (secondary datafile) on the D drive but forgot this step when we created the database. We can add this file without having to rebuild the database. We will do this by using an ALTER DATABASE statement. Find RatisCo in the Object Explorer, right-click it, and select Properties to open the Database Properties dialog. Then click Files in the left nav menu (Select a page) and notice the files we have created are present. The MDF has been created on our virtual C drive, and the LDF (log) is on our virtual E drive.
Figure 1.39 The Files page of the Database Properties dialog shows the files we just created.
Now switch to the Filegroups page (left nav menu) and see we have just one filegroup, the Primary. Inside this filegroup we see one file, which is the MDF. We want to add a user-defined filegroup called RatisCo_OrderHist. We will then put two NDF files inside of that group.
Figure 1.40 The Filegroups page of the Database Properties dialog shows just the default filegroup.
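The statement that adds the filegroup is shown as a screenshot in the book and is not reproduced in this extract; it is likely close to this sketch:

-- Add a user-defined filegroup to the existing RatisCo database.
ALTER DATABASE RatisCo
ADD FILEGROUP RatisCo_OrderHist
GO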
Now let's re-check the Filegroups page of the Database Properties dialog to see the result of our code. If your dialog is still open (i.e., from Figure 1.40), then you must first close and re-open the dialog. We now see the new filegroup, RatisCo_OrderHist, which contains no files.
Figure 1.42 The secondary filegroup, RatisCo_OrderHist, is present and contains no files.
Now let's create an NDF file, RatisCo_Hist1, and add it to the new filegroup.
Figure 1.43 Create the secondary datafile, RatisCo_Hist1, and add it to RatisCo_OrderHist.
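The statement in Figure 1.43 is a screenshot and is not reproduced in this extract; a sketch of what it likely resembles, assuming the makeshift D-drive folder from earlier and default size settings:

-- Create the secondary datafile and place it in the new filegroup.
-- The brackets around the filegroup name are optional here, but they are required
-- for filegroup names that are also keywords, such as [PRIMARY].
ALTER DATABASE RatisCo
ADD FILE
 (NAME = RatisCo_Hist1,
  FILENAME = 'C:\D_SQL\RatisCo_Hist1.ndf')
TO FILEGROUP [RatisCo_OrderHist]
GO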
Notice in Figure 1.43 we put the filegroup name in square brackets even though it was not required. When you add to a filegroup whose name, like PRIMARY, is also a keyword, you may need to delimit it in square brackets. Now let's re-open the Database Properties dialog and check the Filegroups page to see the changes. One file should show in RatisCo_OrderHist.
Figure 1.44 See one file now showing in the new secondary filegroup (RatisCo_OrderHist).
Now switch to the Files page and look at all the files. We can see the newly created NDF file showing on the D drive.
Figure 1.45 The new NDF file now shows in our secondary filegroup, RatisCo_OrderHist.
Finally, we will add one more NDF file to the new filegroup (RatisCo_OrderHist). We will name this file RatisCo_Hist2. However, we will use SQL Server Management Studio's user interface (UI) to add this file, instead of running the code which appears in Figure 1.46. First, let's look at the code statement which would accomplish this task. Do not run this code.
Figure 1.46 We will add the file RatisCo_Hist2 using the SSMS UI, not running code.
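Figure 1.46 is a screenshot in the book; the statement it shows likely resembles the sketch below (again, do not run it -- we will add this file through the UI instead). The path and SIZE are assumptions consistent with the other files in this filegroup.

    -- For reference only; this is the code path we are NOT taking this time.
    ALTER DATABASE RatisCo
    ADD FILE
        (NAME = RatisCo_Hist2,
         FILENAME = 'C:\D_SQL\RatisCo_Hist2.ndf',
         SIZE = 1MB)
    TO FILEGROUP [RatisCo_OrderHist]
    GO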
Step 1. From the Files page of the Database Properties dialog for RatisCo, click the Add button in the lower right corner of the dialog.
Figure 1.47 Step 1. Add the new NDF file by clicking the Add button on the Files page.
Step 2. Notice that SQL Server has automatically populated five fields in the new row. These fields (File Type, Filegroup, Initial Size (MB), Autogrowth, and Path) contain SQL Server's default values but can be customized to our specifications. The filepath shown on the next page (Figure 1.48, upper frame) is SQL Server's default location for storing data (as discussed in Figure 1.6). The default filegroup is PRIMARY, the initial size defaults to 3 MB, and the file is allowed to grow by 1 MB. Rows Data is the default value for File Type, but we can choose Log if we are adding a log file. The two fields which do not automatically populate are Logical Name and File Name. Now change the filegroup to the secondary filegroup, RatisCo_OrderHist.
Figure 1.48 Step 2. Notice the 4 default field values. Change the filegroup to RatisCo_OrderHist.
Step 3. Change the filepath to C:\D_SQL by clicking the white ellipsis button to launch the Locate Folder dialog, which opens to the MSSQL\DATA folder.
Figure 1.49 Step 3. Change the filepath to C:\D_SQL, which is serving as our makeshift D drive.
Step 4. Manually enter the two file names in the two blank fields (Logical Name, File Name). The logical name for this secondary datafile (NDF) is RatisCo_Hist2. The full file name is RatisCo_Hist2.NDF.
Figure 1.50 Step 4. Manually enter the Logical Name and the full File Name.
Step 5. Change the size to 1 MB (Initial Size (MB) property). Then click OK to save all the changes we've made to the new NDF file, RatisCo_Hist2. Be aware that, whenever we click OK in the Database Properties dialog, it snaps shut.
Figure 1.51 Step 5. Change the file size to 1 MB instead of SQL Server's default 3 MB. Click OK.
Recall the beginning of this section and our premise: we wanted to add a secondary filegroup and two NDF files to the existing RatisCo database. We altered the database with code to create the filegroup RatisCo_OrderHist (result shown in Figure 1.42), added the first NDF file with code (Figure 1.43), and finally added the second NDF file using the UI in SQL Server Management Studio. We see both NDF files appearing in our makeshift D drive (Figure 1.52).
Figure 1.52 Confirm that both NDF files appear in our makeshift D drive.
Now re-open the Database Properties dialog to see the change reflected in the Filegroups page.
Figure 1.53 Reopen the Database Properties dialog to see all our changes reflected.
In the first two Joes 2 Pros books, we emphasized robust and reusable code as the preferred method of writing queries and creating database objects. That approach is still the best practice and the one we hope to use most frequently in our SQL Server career. When we click through an interface to add or modify database objects, we aren't creating an automatic trail we can check for errors, and we aren't able to easily repeat or rerun those steps the way we can with a script. However, there are helpful tools in SQL Server Management Studio which we should be able to navigate as part of our journey to becoming a SQL Pro. For example, we can generate the T-SQL needed to recreate the RatisCo database by navigating through Databases > RatisCo > Script Database as > CREATE To > New Query Editor Window. This script can then be saved and used to reconstruct the database in case it is dropped. We covered a number of these tools in the first two books, and we will continue to demonstrate them in this book as well.
Figure 1.54 Your Filegroups page will show three files in RatisCo_OrderHist after Skill Check 1.
Figure 1.55 You will add the file RatisCo_Hist3.NDF to the secondary filegroup.
Skill Check 2: After Skill Check 1, the RatisCo database contains one MDF (main datafile), three NDFs (secondary datafiles), and one LDF (log datafile). Add another file to the RatisCo database and place this file in the primary filegroup. Note: since PRIMARY is a keyword, use square brackets in your code ([PRIMARY]) when specifying the filegroup. The screenshot below is taken from the Files page of the Database Properties dialog for RatisCo. Your file's logical name should be RatisCo_Data2, and the full file name will be C:\C_SQL\RatisCo_Data2.ndf (as shown in Figure 1.56). It is recommended that you write the code for this Skill Check, even if you elect to also try adding the file using the UI. When complete, your Files page should match the one shown below (Figure 1.56); one possible code sketch follows the figure.
Figure 1.56 You will add the file RatisCo_Data2.ndf to the primary filegroup.
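One possible answer sketch for this Skill Check follows; the SIZE value is not specified in the exercise and is an assumption here. Try writing your own statement before reading it.

    ALTER DATABASE RatisCo
    ADD FILE
        (NAME = RatisCo_Data2,
         FILENAME = 'C:\C_SQL\RatisCo_Data2.ndf',
         SIZE = 1MB)
    TO FILEGROUP [PRIMARY]
    GO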
Chapter Glossary
Alternate Datafile Placement: Placing your datafile on a separate drive that you select.
Autogrowth: The property in SQL Server that can be set to determine the rate of file growth.
Auto Shrink: A database setting that automatically checks a database's size and returns unused space to the hard drive.
CREATE DATABASE: A SQL statement used to create new databases; it can also specify the datafiles and filegroups within them.
DCL: Data Control Language. A statement that affects the permissions a principal has to a securable.
DDL: Data Definition Language. A statement that creates, drops, or alters databases or database objects.
DML: Data Manipulation Language. DML statements (such as INSERT, UPDATE, and DELETE) manipulate the data held in database objects.
Datafile: A file that contains your current data and whose purpose is to reflect the current state of your database.
Filegroup: A collection of datafiles which are managed as a single unit.
Filegrowth property: A property that can be used to expand a datafile when it grows beyond its initial size.
Graphical UI: A graphical user interface, such as the dialogs and designers in SQL Server Management Studio.
Logfile: Logfiles keep track of your database transactions and help ensure data and system integrity.
Metadata: Data about data.
Multiple Datafiles Placement: A process that is used to spread the entire database workload evenly across several datafiles.
Primary filegroup: Contains the primary datafile (MDF) and possibly secondary datafiles (NDF). All system tables are allocated to the primary filegroup.
Secondary Datafile: A secondary datafile is optional and carries an .NDF extension.
Secondary filegroup: Also called a user-defined filegroup; these contain secondary datafiles (NDF) and database objects.
T-SQL: Transact Structured Query Language, the computer programming language based on the SQL standard and used by Microsoft SQL Server to create databases, populate tables, and retrieve data.
TCL: Transaction Control Language. TCL provides options to control transactions.
4.) What type of action will cause the logfile and the datafile to increase their size?
O a. SELECT
O b. INSERT
O c. UPDATE
O d. DELETE
5.) How many primary filegroups can you have?
O a. None
O b. One
O c. Two
O d. Eight
O e. As many as you want

6.) How many user-defined filegroups can you have?
O a. None
O b. One
O c. Two
O d. Eight
O e. As many as you want
7.) What are two main reasons for using multiple filegroups? (Choose two)
a. User-defined filegroups use compression to save space.
b. To improve performance.
c. To control the physical placement of the data.
d. To open up more keyword usage.

8.) How many files can go into the primary filegroup?
O a. None
O b. One
O c. Two
O d. Eight
O e. As many as you want (up to 32,767).

9.) If you have two files in the same filegroup, can they be on different drives?
O a. Yes
O b. No

10.) Can you place an .NDF file into the primary filegroup?
O a. Yes
O b. No

11.) Can you place an .LDF file (log file) into the primary filegroup?
O a. Yes
O b. No
Answer Key
1.) a 2.) a 3.) c 4.) b 5.) b 6.) e 7.) b, c 8.) e 9.) a 10.) a 11.) b
Chapter 2. Database Schemas, Snapshots & Properties
Back in 2005, I heard there was some fancy invention called schemas coming later that year in the new SQL Server 2005 release. My initial reaction was, "Yikes, I'm already too busy, and schemas must mean some fancy code design that will take many hours of serious late-night studying to get a handle on." As it turned out, schemas weren't difficult at all! In fact, SQL Server's use of schemas resembles a simple categorization and naming convention used since prehistoric times, when humans began organizing themselves into clans. Centuries later, names progressed to the system we have now, where we each have an individual name (like Bob or Paul) along with a family name (like Paul Johnson). If we were SQL Server objects, we might say our two-part naming convention was FamilyName.FirstName. A two-part name allows easier identification and classification: Paul Johnson and Krista Johnson are more likely to be in the same immediate family than Paul Johnson and Paul Kriegel. In .NET programming languages, the names Johnson and Kriegel would be called namespaces. In SQL Server we might create two tables called Employee and PayRates which belong to the HR department. We can refer to them as HR.Employee and HR.PayRates if we first set up the HR schema. All HR tables would then be created in the HR schema. You could say that the schema is SQL Server's answer to the namespace used in .NET languages.
READER NOTE: In order to follow along with the examples in the first section of Chapter 2, please run the setup script SQLArchChapter2.0Setup.sql. The setup scripts for this book are posted at Joes2Pros.com.
Schemas
We know that data contained and tracked in SQL Server most likely relates to an organization. The database we frequently use in the Joes 2 Pros series is JProCo, which contains all the data for this small fictitious company. No matter how small an organization, certain tables will be more important to one group than to others. For example, even in JProCo, employee and pay data would be controlled by a Human Resources group. Customer data is usually managed by the Sales and Marketing team, whether that team consists of five or 15,000 people. Shown here is the list of JProCo's tables. Notice that all tables show the default dbo (database owner) prefix. There isn't any category to help indicate whether a table is managed by the Sales team or by HR. Prior to SQL Server 2005, the syntax for fully qualified names in SQL Server was [server].[databasename].[owner].[object name], and the idea of dbo pertained to [owner], or which database user created and/or was allowed to access the object. Since SQL Server 2005, the owner identity has been replaced by the schema name. This eliminates the need for a namespace to be tied to the user who created the object and gives you the freedom to pick meaningful category names. Let's look at a sample from the AdventureWorks database to see how schemas can help organize objects within databases.
In the two screenshots here, most of the AdventureWorks schemas are visible: dbo, HumanResources, Person, Production, Purchasing, and Sales. The figure to the right shows the 71 tables contained in AdventureWorks sorted by table name (Figure 2.2). Below, the Object Explorer Details window shows the same 71 tables sorted by schema (Figure 2.3). Instead of creating all tables in the general dbo schema, AdventureWorks has defined a separate category for the tables used by each of its key departments (e.g., Sales, HR).
Figure 2.2 You can sort by table name.
Figure 2.3 AdventureWorks contains several schemas (e.g., HumanResources, Production, Sales).
We can easily see that the Department, Employee, EmployeeAddress, EmployeeDepartmentHistory, EmployeePayHistory, JobCandidate, and Shift tables all belong to the HumanResources department. Tables such as CreditCard, Customer, and CurrencyRate all belong to the Sales group. The introduction of schemas (SchemaName.ObjectName) has given DBAs more freedom to use meaningful names to categorize tables and other database objects. Schemas and principals can also have a one-to-many relationship, meaning that a principal may own many differently named schemas. Prior to SQL Server's 2005 release, it was not uncommon for DBAs to house each department's tables in a separate database. Now DBAs have more choices and can simply manage access to schemas, rather than managing separate databases just to distinguish tables belonging to separate categories. Below we can see the five custom schemas the AdventureWorks DBA defined to manage each of the five departments' tables (Figure 2.4). Notice the many other system-defined schemas which SQL Server has created for its own management and tracking of the database (db_accessadmin, db_backupoperator, etc.).
Figure 2.4 The AdventureWorks DBA created five schemas, one for each department's tables.
To display the Object Explorer Details window, press the F7 key or follow this path:
Management Studio > View > Object Explorer Details
Figure 2.5 RatisCo contains no user-defined schemas. These 13 schemas are all system-defined.
There should be just the 13 system-defined schemas; no user-defined schemas are there yet, such as the Sales, Purchasing, or HumanResources schemas we're now going to create. First let's make sure we are in the RatisCo database context by doing either of these: 1) toggling the dropdown list (as shown in Figure 2.6), OR 2) beginning our code with a USE RatisCo statement. (Recall that GO delimits the end of a batch and must appear on the line below the last DDL statement.)
Now let's run this code to create the Sales schema and the Sales.Customer table.
Figure 2.6 When you are in the RatisCo db context, you will create the Sales schema and one table.
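Figure 2.6 is a screenshot; the code it shows likely resembles the sketch below. The column list of Sales.Customer is an assumption for illustration -- only the CustomerName field is confirmed later in this chapter.

    USE RatisCo
    GO
    -- CREATE SCHEMA must be the only statement in its batch, hence the GO separators.
    CREATE SCHEMA Sales
    GO
    CREATE TABLE Sales.Customer (
        CustomerID INT,
        CustomerName VARCHAR(50))
    GO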
Suppose we ran the code in Figure 2.6 but forgot to first check the Object Explorer and see the 13 system-defined schemas by themselves. We would want to go back and remove the Sales schema, so we could see just the system-defined schemas. To do this, we would first need to remove Sales.Customer (DROP TABLE Sales.Customer) before SQL Server would allow us to remove the Sales schema (DROP SCHEMA Sales). SQL Server will not allow a schema to be removed if database objects which depend on that schema still exist. Now let's run the code below to create two additional schemas (Figure 2.7).
Figure 2.7 These two statements create two new schemas in the RatisCo database.
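Based on the schema names listed in the next paragraph (Sales, HumanResources, and Purchasing), the two statements in Figure 2.7 are most likely:

    CREATE SCHEMA HumanResources
    GO
    CREATE SCHEMA Purchasing
    GO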
Now let's check the Schemas folder to see that the newly created schemas are present (Sales, HumanResources, and Purchasing). In Figure 2.5, we saw 13 system-defined schemas. We should now see a total of 16 schemas:
Management Studio > View > Object Explorer > Databases > RatisCo > Security > Schemas
Figure 2.8 shows the creation steps for the first schema, People. After creating the People schema, repeat the process and create the Production schema.
Figure 2.8 Create two new schemas (People and Production) in the RatisCo database.
Figure 2.9 We just added five user-defined schemas to the RatisCo database.
We'll stay with the Management Studio interface and next add some records to the Sales.Customer table we created in the last section (see Figure 2.6). We begin by navigating to this table in Object Explorer:
Management Studio > View > Object Explorer > Databases > RatisCo > Tables > Sales.Customer
As shown in Figure 2.10, we will right-click the Sales.Customer table, which expands a full menu of options. Choosing Edit Top 200 Rows launches the record editing interface in a new tab. Now we enter the two rows of data as shown below (Figure 2.11). Just by clicking inside the first cell containing NULL and beginning to type, a new blank row appears for us to populate. This interface is similar to Microsoft Access, in that there isn't an explicit OK or Enter button; once the user clicks away from a row, the data is entered into the database.
Figure 2.10 Add records using the SSMS UI.
Notice the full four-part name in the tab (Server.Database.Schema.ObjectName). My server name is RENO, so the fully qualified name of this table object is RENO.RatisCo.Sales.Customer.
Figure 2.11 Manually enter two records into the Sales.Customer table.
We must enter the two records precisely as we see them here. We will continue working with these records later in this chapter.
Figure 2.12 This query locates the intended table even though the context points to a different db.
When we specify SchemaName.ObjectName in our query (SELECT * FROM SchemaName.ObjectName), this is known as a qualified query because we are using the qualified name. Qualified queries look in the exact schema for the table. Unqualified queries (SELECT * FROM ObjectName) fall back to the default schema, which is dbo unless another default has been assigned, so they check only within that schema. In other words, SQL Server interprets the query [SELECT * FROM ObjectName] as [SELECT * FROM dbo.ObjectName]. The four-part name (Server.Database.SchemaName.ObjectName) is also referred to as the fully qualified name. Later in this book, we will see that a fully qualified name is required with linked servers.
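A minimal sketch of the difference, using the table names from this chapter:

    -- Qualified query: looks in the Sales schema for the table.
    SELECT * FROM Sales.Customer

    -- Unqualified query: interpreted as dbo.Customer, so it checks only the dbo schema
    -- and will not find a table that lives in the Sales schema.
    SELECT * FROM Customer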
Figure 2.13 Skill Check 1 uses T-SQL code to create five new schemas in JProCo.
Answer Code: The T-SQL code to this lab can be found in the downloadable files in a file named Lab2.1_Schemas.sql
14. A query with a simple name might cause some confusion if multiple schemas have an object with the same name (SELECT * FROM Order). SQL Server attempts to resolve simple object names in the following order:
a. It first looks for the object in the user's default schema.
b. If no default exists (or if the default schema does not contain the requested object), it then looks in dbo.
15. A default schema can be assigned to a user in two ways (see the code sketch after this list):
a. Using the UI in the properties of a database user.
b. Specifying the schema name in the DEFAULT_SCHEMA clause of the CREATE USER or ALTER USER statement.
16. A default schema can be assigned to each database user.
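A hedged sketch of point 15b; the user and login names below are made up for illustration and are not from the book.

    -- Assign a default schema when creating a database user:
    CREATE USER SalesUser FOR LOGIN SalesLogin WITH DEFAULT_SCHEMA = Sales
    GO
    -- Or change the default schema of an existing user:
    ALTER USER SalesUser WITH DEFAULT_SCHEMA = Sales
    GO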
Database Snapshots
Most database backups sit in a file store in case you ever need to restore a database to a point in the past. We can't see the data inside the backup unless we restore the database. This makes comparing the real-time database with the backup file limited or even impossible. A capability introduced in SQL Server 2005 is the database snapshot. A snapshot is an online, point-in-time reference to a database on our system. Think of it as taking a static picture of our database with references to its data. We can use SQL Server to query our database and look for all changes made since the last snapshot was taken. (We will see this done toward the end of this section in Figure 2.18.)
Snapshot Basics
Database snapshots are created from our source database (e.g., the JProCo database, the RatisCo database). Database snapshots are not really backups at all. A snapshot can help our database revert back to a point in time, but the snapshot relies upon the source database and must be able to communicate with it in order to function. In fact, if the source database becomes corrupt or unavailable, the snapshot(s) of it will be useless. The snapshot begins as an empty shell of our database file(s) and stores only changed data pages. If our database never changes after we create the snapshot, then the snapshot will always remain just an empty shell. When we make (or write) a change to the source database, the database gets updated and the old record is copied into the snapshot. This is known as copy on write (COW). If we query the snapshot for changed data, the snapshot shows us those records from its own datafiles. However, if we query a snapshot database for data that has never changed, SQL Server fetches the data from the source database. In this way a snapshot maintains a view of the database exactly as it existed when the snapshot was created.
Before we can create a snapshot, we need to know the name of the database and where all of its data is stored. In other words, we need to know the name and storage location for the MDF, as well as any NDF files. We want to create a snapshot of RatisCo, so let's look for the file details contained in the Files page of the Database Properties dialog for RatisCo (Object Explorer > Databases > right-click RatisCo > Properties > Files page (from left nav menu)).
In Figure 2.14, we see the MDF file for the RatisCo database is named RatisCo.
Figure 2.14 Prepare to create a snapshot of RatisCo by locating the MDF and NDF files (if any).
Before we create the snapshot database, it's a good idea to change your current query window context to the master database. The code used to create a snapshot of a database is similar to the code we use to create a database (see Figure 2.15).
Figure 2.15 The skeleton code used to create a snapshot of the RatisCo database.
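Figure 2.15 is a screenshot in the book; its skeleton likely resembles the sketch below. The snapshot file path is an assumption. Each datafile of the source database must be listed with a NAME and a FILENAME for its sparse snapshot file.

    USE master
    GO
    -- Create a point-in-time snapshot of RatisCo.
    CREATE DATABASE RatisCo_Monday_noon ON
        (NAME = RatisCo,
         FILENAME = 'C:\C_SQL\RatisCo_Monday_noon.ss')
    AS SNAPSHOT OF RatisCo
    GO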
Check Object Explorer to confirm we can see the newly created snapshot.
Object Explorer > Database Snapshots > RatisCo_Monday_noon
Figure 2.18 The queries of the snapshot and Sales.Customer produce an identical result.
Note that the snapshot is a static, read-only reference to the database. No matter what data or objects are added to the RatisCo database after 12:01pm Monday, the snapshot we created (RatisCo_Monday_noon) will not contain those changes. The snapshot's job is to keep track of any changes impacting the data it cares about (i.e., RatisCo and its data as of noon on Monday, when the snapshot was taken). An EXCEPT query is a quick way to check for changes between the snapshot and the source database. Since we just created this snapshot of RatisCo (RatisCo_Monday_noon), we would expect there to be no difference between the snapshot and the source (see Figure 2.19).
Figure 2.19 An EXCEPT query will reveal any changes between the snapshot and the source db.
Recall that with multiple query operators, the top table is dominant. So in this query, if changes had been made to data which the snapshot is tracking, the result would show the old record. To see the new record, just reverse the order of the tables so that the table from the source database is dominant; the result of that query will be the new record(s).
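A sketch of the two EXCEPT queries described here, using the table from earlier in the chapter:

    -- Snapshot on top: any rows returned are the OLD versions of changed records.
    SELECT * FROM RatisCo_Monday_noon.Sales.Customer
    EXCEPT
    SELECT * FROM RatisCo.Sales.Customer

    -- Source on top: any rows returned are the NEW versions of changed records.
    SELECT * FROM RatisCo.Sales.Customer
    EXCEPT
    SELECT * FROM RatisCo_Monday_noon.Sales.Customer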
Let's make a change to a record that was included in the Monday noon snapshot.
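The update itself appears in the book as a figure that is not reproduced here; it likely resembles this sketch. The column name CustomerName and the old and new values come from the discussion after Figure 2.21.

    USE RatisCo
    GO
    UPDATE Sales.Customer
    SET CustomerName = 'Cush-A-Rama'
    WHERE CustomerName = 'Kush-a-rama'
    GO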
Now let's reattempt our EXCEPT query with the snapshot as the dominant table.
Figure 2.21 Our EXCEPT query reveals one record changed after the Monday noon snapshot.
This result provides us two pieces of information. It tells us that at least one record has changed in RatisCo since we took the Monday noon snapshot, and it provides us the old record. Let's think about the activity of the snapshot and the COW (copy on write) behavior. Prior to the moment we updated one of the records this snapshot (RatisCo_Monday_noon) is tracking, the snapshot was an empty shell. When the one record was updated in RatisCo (Figure 2.21, the CustomerName 'Kush-a-rama' was changed to 'Cush-A-Rama'), the source database copied the old record into the snapshot file. In other words, when RatisCo wrote the update to the record, the old record was copied into the snapshot. Afterwards, the snapshot was a shell containing exactly one record.
Skill Check 2: Change the last name of Employee 106 from Davies to Santwon. Run a query to show all the changes from the source database Employee table and the snapshot database Employee table. Use the snapshot you created in the first Skill Check (dbBasics_Monday_Noon).
Figure 2.23a The data for Employee 106's old record contained in the snapshot.
Figure 2.23b The data for Employee 106's new record contained in the source database.
Answer Code: The T-SQL code to this lab can be found in the downloadable files in a file named Lab2.2_DatabaseSnapshots.sql
Figure 2.24 The Options page in the Database Properties dialog displays properties and settings.
Open this page in your own instance of SQL Server and notice the several categories of settings and properties found on the Options page (Automatic, Cursor, Miscellaneous, Recovery, Service Broker, and State). Most of these settings can be altered from this page. Later we will change properties using T-SQL code, and we will also use the scripting tool to generate T-SQL code from the Options page. Another interface where we can quickly view metadata for a database is the Object Explorer Details window. To get this window to appear, press the F7 key. There are two panes, and most of the data we want to see is in the lower pane. The bottom
area contains the Details Pane, which displays certain details of the selected object. To see more of this lower pane, drag the divider up from the bottom of the window (see Figure 2.25). We cannot use this interface to change settings, as might be guessed from the grey shading. However, it's a handy resource for viewing high-level metadata (creation date, size (MB), space available (KB)), the physical location and name of the default filegroup, and some commonly referenced properties, such as the ANSI NULL Default setting and the Read Only property.
Details Pane
Figure 2.25 The Object Explorer Details window displays 21 points of metadata for a database.
Read Only and ANSI NULL Default are two of the properties we will utilize in this chapter. Figure 2.24 shows RatisCo's Read Only property set to False, meaning that it is not in read-only mode. Another indication that RatisCo is not in read-only mode is that its icon displays normally in Object Explorer. A read-only database's icon will be greyed out in the Object Explorer display, and it will have a "Read-Only" label alongside its name (an example is shown in Figure 2.27).
Once we've confirmed the Database Read-Only property of RatisCo is set to True, we will recheck the Object Explorer to note the icon's changed appearance (as shown on the next page in Figure 2.27). We may need to right-click the
Databases folder and choose the refresh option in order for the icon to be refreshed and display its newly modified status.
When you create a table you can specify each field as NULL or NOT NULL. If you don't specify, the column's nullability is governed by the ANSI NULL Default setting (along with any session-level overrides). When ANSI NULL Default is True, an unspecified column allows NULLs, which matches the ANSI standard behavior; when it is False, the column does not allow NULLs. Use a process similar to the one above to change the ANSI NULL Default property to True.
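If you prefer code to the Options page dropdown, the equivalent statement likely looks like this sketch:

    -- Make unspecified columns default to allowing NULLs (ANSI behavior).
    ALTER DATABASE RatisCo SET ANSI_NULL_DEFAULT ON
    GO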
Figure 2.28 We can query objects within the RatisCo database while it is in Read-Only mode.
Suppose we wanted to update one of these records. We cannot run an UPDATE statement while RatisCo is in Read-Only mode (see Figure 2.29).
Figure 2.29 We cannot run an UPDATE statement while RatisCo is in Read-Only mode.
Let's now take the RatisCo database out of Read-Only mode (in other words, put it back into Read-Write mode) using the T-SQL code below.
Figure 2.30 We can use T-SQL code to take the RatisCo database out of Read-Only mode.
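The statement in Figure 2.30 is not reproduced in this sample; it most likely resembles the sketch below. Figure 2.32 would be the same statement with READ_ONLY in place of READ_WRITE.

    -- Put RatisCo back into Read-Write mode.
    ALTER DATABASE RatisCo SET READ_WRITE
    GO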
Figure 2.31 With RatisCo no longer in Read-Only mode, we can now UPDATE Sales.Customer.
Figure 2.32 We can use T-SQL code to put RatisCo back into Read-Only mode.
Check the Management Studio interface and see that running the code has the same effect as toggling the Database Read-Only property dropdown to True.
Figure 2.33 Using T-SQL code has the same effect as toggling the property dropdown.
Figure 2.34 We are temporarily turning on the Auto Shrink option for RatisCo.
Now let's use T-SQL code to turn off Auto Shrink, which has the same effect as changing the dropdown to False on the Options page.
Figure 2.35 This T-SQL code will turn off the Auto Shrink option.
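The statements behind Figures 2.34 and 2.35 are not reproduced in this sample; they likely resemble the following sketch:

    -- Turn Auto Shrink on (Figure 2.34):
    ALTER DATABASE RatisCo SET AUTO_SHRINK ON
    GO
    -- Turn Auto Shrink back off (Figure 2.35):
    ALTER DATABASE RatisCo SET AUTO_SHRINK OFF
    GO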
Suppose we need to change some properties for our database, but we aren't sure of the T-SQL syntax. Each of the nine pages within the Database Properties dialog contains a Script menu (see Figure 2.36). For most actions we can take
using these pages, we can have Management Studio generate the T-SQL code which would accomplish that action.
Figure 2.36 Management Studio's scripting capability provides T-SQL code upon request.
Every operation performed in SQL Server is accomplished by the execution of T-SQL code. That is to say, SQL Server requires code in order to accomplish any task, whether we accomplish the task by running code in a query window or by clicking through the SQL Server Management Studio interface. When we write and run our own code in the query window, we are obviously providing the code. However, when we use the Management Studio interface to perform tasks (e.g., create or drop database objects, change database properties), behind the scenes SQL Server automatically generates and runs the code needed to execute our tasks. Let's have the scripting tool generate the code for the two settings we've been practicing, Auto Shrink and Database Read-Only. Open the Options page in the Database Properties dialog for RatisCo. Change the Auto Shrink and Read-Only properties (toggling the dropdowns to True and False, respectively) BUT don't yet click OK (see Figure 2.37 on the next page). Once we have toggled the dropdowns for the two settings, we will hit the Script button and then the Cancel button to close the Options page.
Figure 2.37 Management Studio's scripting capability provides T-SQL code upon request.
The Script button registers the changes we've toggled in the interface and generates a script showing us the code which will effect the change(s) we are considering. Let's run the code which Management Studio has generated for us. Make sure any query windows pointing to RatisCo are closed before running this code.
Figure 2.38 We are running the T-SQL code which Management Studio generated for us.
Notice that SQL Server automatically uses square brackets to delimit every database object name. These and other signs, such as an N prefix appearing before Unicode string literals, are often hints that a code sample was generated by a machine (i.e., SQL Server) and not written by a person.
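For example, a script generated by Management Studio for the Auto Shrink change might look something like the sketch below; the bracketed names and WITH NO_WAIT clause are typical of generated scripts, though the exact output can vary by version.

    USE [master]
    GO
    ALTER DATABASE [RatisCo] SET AUTO_SHRINK ON WITH NO_WAIT
    GO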
After running the code, refresh the Databases folder in Object Explorer and see the RatisCo icon appearing normally (it's no longer greyed out and the Read-Only notation is gone). Return to the Options page and see that Auto Shrink is now set to True and Database Read-Only is set to False. Since most SQL Pros do not use Auto Shrink on their databases (preferring to manually control any database resizing operations), let's be sure to reopen the Options page and generate the code to turn Auto Shrink off for RatisCo.
Figure 2.39 Management Studio scripts the action(s) registered when you toggle dropdown items.
Before leaving this section, run the code to turn off Auto Shrink for RatisCo.
Figure 2.40 Execute the script which Management Studio generated for us.
3. SQL Server Management Studio's code generating capabilities can show the T-SQL syntax for any database action available in the Database Properties dialog.
4. Every operation performed in SQL Server is accomplished by the execution of T-SQL code. That is to say, SQL Server requires code in order to accomplish any task, whether we accomplish the task by running code in a query window or by clicking through the SQL Server Management Studio interface. When we write and run our own code in the query window, we are obviously providing the code. However, when we use the Management Studio interface to perform tasks (e.g., create or drop database objects, change database properties), behind the scenes SQL Server automatically generates and runs the code needed to execute our tasks.
Chapter Glossary
(COW) Copy on Write: When we make (or write) a change to the source database, the database gets updated and the old record is copied into the snapshot. This is known as copy on write (COW).
Database properties: A SQL Server dialog that allows you to view and set useful information for each database and its datafiles.
Database Snapshots: Database snapshots are read-only (point-in-time) and must exist on the same server as the source data. A snapshot begins as an empty shell of the database file(s) and stores only changed data pages.
Fully qualified names: The syntax for fully qualified names in SQL Server is [server].[databasename].[owner].[object name]. In a fully qualified name the object must be explicitly identified.
Naming convention: The convention used for naming databases and files within a system.
Objects: Tables, stored procedures, views, etc. are SQL Server database objects.
Partially qualified names: Partially qualified names omit some of the qualifiers by leaving them out or by replacing them with another period.
Qualified query: Qualified queries look in the exact schema for the table.
Schemas: A namespace for database objects.
Script: SQL code saved as a file. Also a control in SQL Server Management Studio which allows you to generate the underlying code for a database or object.
Source database: The database from which the snapshot is created.
System-defined schema: The schemas created by SQL Server.
User-defined schema: Schemas created by the user.
2.) What is the name of the source database?
O a. dbPublish
O b. dbPublish_Snap
O c. dbPublish_Data
O d. dbPublish_dbPubMon
3.) You have the following code:
CREATE DATABASE dbPublish_Snap ON
(NAME = dbPublish_Data, FILENAME = 'C:\SQL\dbPubMon.ss')
AS SNAPSHOT OF dbPublish
What is the name of the source data file?
O a. dbPublish
O b. dbPublish_Snap
O c. dbPublish_Data
O d. dbPublish_dbPubMon
4.) Database snapshots are:
O a. Read Only
O b. Read/Write

5.) Which statement is not true about database snapshots?
O a. They are static, point-in-time versions of a database.
O b. They must be located on the same server as the source database.
O c. You can create a snapshot of any database, including master.
Answer Key
1.) a, d 2.) a 3.) c 4.) a 5.) c
Chapter 3.
In the last chapter we discussed the importance of creating your database adequately large, in order to provide the space your data will need. While processing power, memory, and disk storage have all become cheaper and more plentiful in the last decade, the tasks of capacity planning and estimating infrastructure requirements are still important to the IT world. It is also incumbent upon you as a SQL Pro to become intimately familiar with the data types available to SQL Server and their impacts on performance and storage consumption. The next three chapters will cover data type options and usage. In your database career you will use this knowledge in designing and implementing your own database systems, as well as in troubleshooting and diagnosing performance issues with existing databases. The factors affecting the space your data consumes are its data types, including fixed versus variable length, and whether or not the data type supports Unicode. The building blocks of database objects are fields, so our storage calculations will be based on the fields contained within a row.
READER NOTE: In order to follow along with the examples in the first section of Chapter 3, please run the setup script SQLArchChapter3.0Setup.sql. The setup scripts for this book are posted at Joes2Pros.com.