09 - Relational Partitions
Copyright 2006 by Microsoft Corporation. All rights reserved. By using or providing feedback on these materials, you agree to the attached license agreement. Please provide feedback at BI Feedback Alias.
Microsoft Corporation Technical Documentation License Agreement (Standard) READ THIS! THIS IS A LEGAL AGREEMENT BETWEEN MICROSOFT CORPORATION ("MICROSOFT") AND THE RECIPIENT OF THESE MATERIALS, WHETHER AN INDIVIDUAL OR AN ENTITY ("YOU"). IF YOU HAVE ACCESSED THIS AGREEMENT IN THE PROCESS OF DOWNLOADING MATERIALS ("MATERIALS") FROM A MICROSOFT WEB SITE, BY CLICKING "I ACCEPT", DOWNLOADING, USING OR PROVIDING FEEDBACK ON THE MATERIALS, YOU AGREE TO THESE TERMS. IF THIS AGREEMENT IS ATTACHED TO MATERIALS, BY ACCESSING, USING OR PROVIDING FEEDBACK ON THE ATTACHED MATERIALS, YOU AGREE TO THESE TERMS. 1. For good and valuable consideration, the receipt and sufficiency of which are acknowledged, You and Microsoft agree as follows: (a) If You are an authorized representative of the corporation or other entity designated below ("Company"), and such Company has executed a Microsoft Corporation Non-Disclosure Agreement that is not limited to a specific subject matter or event ("Microsoft NDA"), You represent that You have authority to act on behalf of Company and agree that the Confidential Information, as defined in the Microsoft NDA, is subject to the terms and conditions of the Microsoft NDA and that Company will treat the Confidential Information accordingly; (b) If You are an individual, and have executed a Microsoft NDA, You agree that the Confidential Information, as defined in the Microsoft NDA, is subject to the terms and conditions of the Microsoft NDA and that You will treat the Confidential Information accordingly; or (c)If a Microsoft NDA has not been executed, You (if You are an individual), or Company (if You are an authorized representative of Company), as applicable, agrees: (a) to refrain from disclosing or distributing the Confidential Information to any third party for five (5) years from the date of disclosure of the Confidential Information by Microsoft to Company/You; (b) to refrain from reproducing or summarizing the Confidential Information; and (c) 
to take reasonable security precautions, at least as great as the precautions it takes to protect its own confidential information, but no less than reasonable care, to keep confidential the Confidential Information. You/Company, however, may disclose Confidential Information in accordance with a judicial or other governmental order, provided You/Company either (i) gives Microsoft reasonable notice prior to such disclosure and to allow Microsoft a reasonable opportunity to seek a protective order or equivalent, or (ii) obtains written assurance from the applicable judicial or governmental entity that it will afford the Confidential Information the highest level of protection afforded under applicable law or regulation. Confidential Information shall not include any information, however designated, that: (i) is or subsequently becomes publicly available without Your/Companys breach of any obligation owed to Microsoft; (ii) became known to You/Company prior to Microsofts disclosure of such information to You/Company pursuant to the terms of this Agreement; (iii) became known to You/Company from a source other than Microsoft other than by the breach of an obligation of confidentiality owed to Microsoft; or (iv) is independently developed by You/Company. For purposes of this paragraph, "Confidential Information" means nonpublic information that Microsoft designates as being confidential or which, under the circumstances surrounding disclosure ought to be treated as confidential by Recipient. "Confidential Information" includes, without limitation, information in tangible or intangible form relating to and/or including released or unreleased Microsoft software or hardware products, the marketing or promotion of any Microsoft product, Microsoft's business policies or practices, and information received from others that Microsoft is obligated to treat as confidential. 2. 
You may review these Materials only (a) as a reference to assist You in planning and designing Your product, service or technology ("Product") to interface with a Microsoft Product as described in these Materials; and (b) to provide feedback on these Materials to Microsoft. All other rights are retained by Microsoft; this agreement does not give You rights under any Microsoft patents. You may not (i) duplicate any part of these Materials, (ii) remove this agreement or any notices from these Materials, or (iii) give any part of these Materials, or assign or otherwise provide Your rights under this agreement, to anyone else. 3. These Materials may contain preliminary information or inaccuracies, and may not correctly represent any associated Microsoft Product as commercially released. All Materials are provided entirely "AS IS." To the extent permitted by law, MICROSOFT MAKES NO WARRANTY OF ANY KIND, DISCLAIMS ALL EXPRESS, IMPLIED AND STATUTORY WARRANTIES, AND ASSUMES NO LIABILITY TO YOU FOR ANY DAMAGES OF ANY TYPE IN CONNECTION WITH THESE MATERIALS OR ANY INTELLECTUAL PROPERTY IN THEM. 4. If You are an entity and (a) merge into another entity or (b) a controlling ownership interest in You changes, Your right to use these Materials automatically terminates and You must destroy them. 5. You have no obligation to give Microsoft any suggestions, comments or other feedback ("Feedback")
relating to these Materials. However, any Feedback you voluntarily provide may be used in Microsoft Products and related specifications or other documentation (collectively, "Microsoft Offerings") which in turn may be relied upon by other third parties to develop their own Products. Accordingly, if You do give Microsoft Feedback on any version of these Materials or the Microsoft Offerings to which they apply, You agree: (a) Microsoft may freely use, reproduce, license, distribute, and otherwise commercialize Your Feedback in any Microsoft Offering; (b) You also grant third parties, without charge, only those patent rights necessary to enable other Products to use or interface with any specific parts of a Microsoft Product that incorporate Your Feedback; and (c) You will not give Microsoft any Feedback (i) that You have reason to believe is subject to any patent, copyright or other intellectual property claim or right of any third party; or (ii) subject to license terms which seek to require any Microsoft Offering incorporating or derived from such Feedback, or other Microsoft intellectual property, to be licensed to or otherwise shared with any third party. 6. Microsoft has no obligation to maintain confidentiality of any Microsoft Offering, but otherwise the confidentiality of Your Feedback, including Your identity as the source of such Feedback, is governed by Your NDA. 7. This agreement is governed by the laws of the State of Washington. Any dispute involving it must be brought in the federal or state superior courts located in King County, Washington, and You waive any defenses allowing the dispute to be litigated elsewhere. If there is litigation, the losing party must pay the other partys reasonable attorneys fees, costs and other expenses. If any part of this agreement is unenforceable, it will be considered modified to the extent necessary to make it enforceable, and the remainder shall continue in effect. 
This agreement is the entire agreement between You and Microsoft concerning these Materials; it may be changed only by a written document signed by both You and Microsoft.
In a relational database, a partition divides logical database elements into separate physical parts. You typically partition a database to improve manageability, performance or availability. You can choose to partition your tables using either vertical or horizontal partitioning depending on the nature of the data and the type of performance improvement you need.
Need for Partitions
The primary advantages of using partitions are as follows:
Loading large data volumes: Traditional methods of loading a warehouse struggle when the volume of extracted data is very large. With partitions, the data in the warehouse is organized by a partitioning column, such as transaction date. Loading one day of transactions into a single partition minimizes the total load time, and adopting partition switching reduces the load time further.
Archiving: Archiving is often used with time-based entries on fact tables. Archiving keeps the size of a table manageable by retaining detailed data only for the necessary time period. For example, if you need daily data for only two months, you could remove rows for previous months from the daily detail table. Deleting old data from a fact table by using traditional methods is slow, because the deletions are logged. By using partitions, you can simply remove old partitions from the logical table, possibly switching them to an archive table. Removing a partition from one table, or adding the partition to another table, is a very fast operation.
Data Access Queries: Fast performance for data access queries is critical to a data warehouse system. Achieving fast query performance is challenging when the data warehouse is large. Partitioning facilitates fast queries in two ways: first, partitions that are unrelated to the query can be ignored; second, multiple partitions can be queried in parallel.
Seamless access to distributed sources: When the data is distributed across different servers, partitioned views allow you to present it as if it were a single table. As with local partitions, the query optimizer is able to avoid querying source partitions that are not used in a query, but this optimization is even more critical with distributed data sources.
Partition Types
Most partitioning can be described as either horizontal or vertical.
Horizontal Partitions: Horizontal partitioning divides a logical table into multiple physical tables called partition members. Each table maintains an identical schema, but contains fewer rows. For example, a table containing one year of data can be partitioned into 12 partition member tables, each with 1 month of data. As another example, a table containing employee records can be split into two partition member tables, one with US employees and one with non-US employees. In a horizontally partitioned table, no two partition member tables contain the same data: there is no overlap between the partitions. A logical union of the partition member tables recreates the original logical table.
The partition commands in SQL Server all facilitate various types of horizontal partitioning, and can apply to various objects. You can partition a table or an index, or you can create a partitioned view that combines data from multiple sources into one logical table.
Vertical Partitions: Vertical partitioning divides a logical table into multiple physical tables that contain fewer columns. The two types of vertical partitioning are normalization and row splitting. Normalization is the standard database process of removing redundant columns from a table and putting them in secondary tables, which are linked to the primary table by primary key and foreign key relationships, to reduce update anomalies and redundancy. Row splitting is a technique for grouping columns together based on their business meaning. With row splitting, you divide the original table into tables with the same number of rows, but fewer columns. Each logical row in a split table matches the same logical row in the others. Like horizontal partitioning, vertical partitioning lets queries scan less data. This can improve query performance, provided that the query does not require columns from all the vertical partitions. As a simplistic example, a table that contains four attributes for the key, of which only the first two are typically used in queries, may benefit from splitting the infrequently used attribute columns into a separate table. This is, naturally, much more important when there are hundreds of attributes for a key. If most queries use columns from most or all of the partitions, then the overhead of adding a join more than offsets any benefit from splitting the rows.
Designing Partitions
This section details the implementation guidelines for horizontal partitioning. Vertical partitions are simply separate tables that contain the same rows but different columns, related by a common key and combined using joins, so no separate guidelines are required for implementing them. There are three approaches for implementing horizontal partitions:
o Partitioned Views
o Partitioned Tables
o Partitioned Indexes
Partitioned Views
A Partitioned View is a view that logically unions identically structured tables. When its member tables reside on different servers, it is also known as a Distributed Partitioned View; when it accepts data modifications, it is known as an Updateable Partitioned View. You typically use a partitioned view in a distributed environment where multiple servers are connected using linked servers. Rather than create a Partitioned View for a local database, you should create a Partitioned Table.
Once you create the view, users can access the underlying data transparently, without knowing its physical location. You can make a Partitioned View writeable (updatable), but there are restrictions. For example, a single update command must affect only a single source table.
Partitioned Table
The concept of a partitioned table involves splitting a logical table into several physical units called partitions or partition members. The query engine can access each partition member independently of the others, limiting the scope of I/O intensive activities such as queries, data loads, and maintenance tasks, particularly those that require table level locks. The most common method of splitting data is horizontal partitioning, in which rows of a table matching mutually exclusive criteria (such as a range of dates or letters of the alphabet, for date time and character data, respectively) are placed in designated partitions.
Steps to create Partitioned tables
The process of partitioning involves four main steps.
o The first, preliminary step consists of creating file groups and their corresponding files.
o During the second step, you define a partition function using the CREATE PARTITION FUNCTION T-SQL statement, with ranges of values (set by the FOR VALUES clause) that determine the beginning and end of each partition (and subsequently, their number).
CREATE PARTITION FUNCTION PartFunction (int) AS RANGE LEFT FOR VALUES (1, 100, 1000);
o As a third step, create a partition scheme using the CREATE PARTITION SCHEME T-SQL statement, which associates the partition function with file groups, determining the physical layout of the data. The association between partitions and file groups is typically one to one, although other arrangements are also possible (including specifying extra unassigned file groups for future growth).
CREATE PARTITION SCHEME PartSchema AS PARTITION PartFunction TO (FileGroup1, FileGroup2, FileGroup3, FileGroup4);
o As a fourth step, create a partitioned table or index using CREATE TABLE or CREATE INDEX statements. The ON clause within these statements identifies the target partition scheme and partitioning column, whose values are compared against the definition of the partitioning function.
CREATE TABLE PartTable (F1 INT, F2 INT) ON PartSchema (F2)
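The first step (creating file groups) has no statement of its own in the walkthrough, so the following is a minimal sketch of it, together with a way to check how the partition function maps values. The database name SalesDW, the logical file name, and the file path are hypothetical; PartFunction and PartTable are the objects defined in the steps.

```sql
-- Step 1 sketch: create a file group and add a data file to it.
-- SalesDW, SalesDW_FG1 and the path are hypothetical placeholders;
-- repeat for FileGroup2 through FileGroup4 before creating the partition scheme.
ALTER DATABASE SalesDW ADD FILEGROUP FileGroup1;
ALTER DATABASE SalesDW
    ADD FILE (NAME = N'SalesDW_FG1',
              FILENAME = N'D:\Data\SalesDW_FG1.ndf',
              SIZE = 100MB)
    TO FILEGROUP FileGroup1;
GO

-- After steps 2 to 4: ask which partition a given F2 value falls into.
-- With RANGE LEFT and boundaries (1, 100, 1000), each boundary value belongs
-- to the partition on its left, so 100 maps to partition 2 and 101 to partition 3.
SELECT $PARTITION.PartFunction(100) AS PartitionFor100,
       $PARTITION.PartFunction(101) AS PartitionFor101;

-- Row counts per partition of the partitioned table.
SELECT $PARTITION.PartFunction(F2) AS PartitionNumber,
       COUNT(*)                    AS RowsInPartition
FROM PartTable
GROUP BY $PARTITION.PartFunction(F2)
ORDER BY PartitionNumber;
```

Because three boundary values produce four partitions, the partition scheme needs four file groups, which is why PartSchema lists FileGroup1 through FileGroup4.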
Partitioned Indexes
Indexes on each partition table will naturally be smaller than a corresponding index on the non-partitioned table, which means fewer levels to traverse and fewer pages to scan, thereby boosting performance. In addition, you can explicitly partition an index, even on an unpartitioned table. Secondary indexes can be set up completely separately from primary indexes. The syntax for creation is the same. When the indexes and partitions are within the same file group, the indexes are aligned. Alignment provides several advantages; most importantly, it provides a means for simplifying data backup. Query performance is better in aligned index systems, because the I/O aspects of query processing are improved. The following example creates a partitioned index on column F1 of table PartTable, using the partition scheme partitioned on F2. Creating the partition function and partition scheme is explained in the example given for creating a partitioned table.
CREATE NONCLUSTERED INDEX IX_PartTable1_ReferenceF2 ON PartTable (F1) ON PartSchema (F2);
Traditional Approach with no Partitions
Depending on their level of expertise, developers design tables without partitions in various ways:
o The crudest design places the table on one file group, with one file, on only one disk.
o A more elaborate design uses a single table with multiple file groups and multiple files on multiple disks.
In the second design, to check for a record the query optimizer scans all the file groups, and therefore all the disks where the files are located, because it does not know up front where the required data is stored.
Consider creating vertical partitions for a table with a large number of columns
Vertical partitioning is particularly useful if the number of columns of a table exceeds the limit that SQL Server supports. This often happens with employee or customer demographic information. When you partition because of a large number of columns, you may need to maintain separate queries for the different partitions, because of the limit on the number of columns in a view or query. Vertically partitioned tables are of limited help in a warehouse structure, because a warehouse structure is denormalized by nature.
Consider creating vertical partitions when several of the columns are rarely used in queries
When a specific subset of the columns in a table is rarely used, consider moving the columns to a different physical table. Since there will be fewer columns left in the main table, more records can fit into a data page. As a result, queries that use only the frequently-used columns can be much faster. A reporting query that needs the infrequently used columns can still access them by joining the two tables together.
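As an illustration of moving rarely used columns into their own table, here is a minimal sketch. The tables dbo.CustomerCore and dbo.CustomerExtended and all of their columns are hypothetical names chosen for this example, not objects from the guidance itself.

```sql
-- Frequently used columns stay in the narrow main table, so more rows fit per page.
CREATE TABLE dbo.CustomerCore (
    CustomerID   INT           NOT NULL PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    City         NVARCHAR(50)  NULL
);

-- Rarely used columns move to a second table that shares the same key.
CREATE TABLE dbo.CustomerExtended (
    CustomerID      INT      NOT NULL PRIMARY KEY
        REFERENCES dbo.CustomerCore (CustomerID),
    MaritalStatus   NCHAR(1) NULL,
    HouseholdIncome MONEY    NULL
);
GO

-- Typical queries touch only the narrow table.
SELECT CustomerID, CustomerName
FROM dbo.CustomerCore
WHERE City = N'Seattle';

-- A reporting query that needs the infrequently used columns joins the two tables.
SELECT c.CustomerID, c.CustomerName, e.HouseholdIncome
FROM dbo.CustomerCore AS c
JOIN dbo.CustomerExtended AS e
    ON e.CustomerID = c.CustomerID;
```

The split pays off only while most queries resemble the first SELECT; if most queries look like the second, the join cost outweighs the benefit.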
Consider creating vertical partitions when some of the columns are very large
Tables that contain very large text fields or binary data such as images are often very large. You may want to put very large columns into a separate vertical partition. This enhances performance for queries that don't require the large data column, and also allows the database to manage the table space more effectively than storing very large columns together with small columns.
data in another, and Asia in a fourth, you could put each partition on a server within the region. If most queries are local, most queries will run exclusively off the local server. Enterprise-wide queries can still access all the partitions as needed.
Filter on the partitioning column of a partitioned view so the optimizer can choose the appropriate partition
When SELECT statements referencing the partitioned view specify a search condition on the partition column, the query optimizer uses the CHECK constraint definitions to determine which member table contains the rows, so fetching is faster. If queries have a filtering key that corresponds to the partitioning key, response time will be faster. This is because partitioning encourages the use of parallel operations, and a partitioning key in the query predicate makes pruning data easier. If you are trying to improve performance and manageability for large subsets of data and there are defined access patterns, use the range partitioning mechanism. If your data has logical groupings and you often work with only a few of these defined subsets at a time, define your ranges so that queries are isolated to work with only the appropriate data.
Partitioned view member tables should have the same schema, the same partition column, non-overlapping partition ranges and check constraints
All member tables in the partitioned view should have the same schema (columns, data type, size, collation), and a primary key defined on the same set of columns. Only the check constraint should be different. The ranges defined in the check constraints on the member tables should not have gaps or overlaps. If ranges overlap, the query optimizer cannot determine where the data of the overlapped region resides, which defeats the purpose of the partitioned view.
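The requirements above can be sketched with a small local example. The tables dbo.Sales2005 and dbo.Sales2006, the view dbo.SalesAll, and their columns are hypothetical; in a distributed setup the member tables would be referenced through linked-server four-part names instead.

```sql
-- Member tables share an identical schema and primary key;
-- only the CHECK constraint on the partitioning column differs.
CREATE TABLE dbo.Sales2005 (
    SaleID   INT   NOT NULL,
    SaleYear INT   NOT NULL CHECK (SaleYear = 2005),
    Amount   MONEY NOT NULL,
    CONSTRAINT PK_Sales2005 PRIMARY KEY (SaleID, SaleYear)
);

CREATE TABLE dbo.Sales2006 (
    SaleID   INT   NOT NULL,
    SaleYear INT   NOT NULL CHECK (SaleYear = 2006),
    Amount   MONEY NOT NULL,
    CONSTRAINT PK_Sales2006 PRIMARY KEY (SaleID, SaleYear)
);
GO

-- The partitioned view is a UNION ALL over the member tables.
CREATE VIEW dbo.SalesAll
AS
SELECT SaleID, SaleYear, Amount FROM dbo.Sales2005
UNION ALL
SELECT SaleID, SaleYear, Amount FROM dbo.Sales2006;
GO

-- A predicate on the partitioning column lets the optimizer use the
-- CHECK constraints to skip member tables that cannot hold matching rows.
SELECT SaleID, Amount
FROM dbo.SalesAll
WHERE SaleYear = 2006;
```

The ranges here neither overlap nor leave a gap between 2005 and 2006, and the partitioning column is part of each primary key, which is what allows the view to be updatable.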
table's primary key or the view doesn't include all the base tables' columns, you can create INSTEAD OF triggers to modify the view. However, the query optimizer's plans for views with INSTEAD OF triggers might be less efficient than its plans for updateable views, because some of the new optimizing techniques depend on the rules that make the views updateable. To perform updates on a partitioned view, the partitioning column must be a part of the primary key of the base table. If a view is not updatable, you can create an INSTEAD OF trigger on the view that allows updates. Check constraints help the query optimizer look for the specific member table instead of all the member tables defined in the partitioned view. Therefore, create check constraints on the member tables participating in the partitioned view. The partitioning column cannot be auto-generated or computed, such as an identity or timestamp column, and it should be part of the primary key. Limit each partitioned view to one constraint on the partitioning column. More than one constraint on the partitioning column will eventually make the query optimizer unable to recognize the view as a partitioned view.
Use member tables for bulk load instead of using the corresponding partitioned views
Bulk load data directly into the member tables rather than through the partitioned view. When loading into a partitioned view or a partitioned table, load the data into separate staging tables, one for each partition. Add the loaded staging data to a new table that has a check constraint, and modify the partitioned view to include the new table (make sure the partitioned view requirements are followed, such as no overlapping keys and the schema requirements).
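A minimal sketch of this load pattern follows. It assumes a hypothetical partitioned view dbo.SalesAll that already unions a member table dbo.Sales2006; the new member table dbo.Sales2007, the data file path, and the terminators are likewise illustrative assumptions.

```sql
-- New member table with the same schema as existing members
-- and its own non-overlapping CHECK constraint.
CREATE TABLE dbo.Sales2007 (
    SaleID   INT   NOT NULL,
    SaleYear INT   NOT NULL CHECK (SaleYear = 2007),
    Amount   MONEY NOT NULL,
    CONSTRAINT PK_Sales2007 PRIMARY KEY (SaleID, SaleYear)
);
GO

-- Load the member table directly rather than through the view.
-- CHECK_CONSTRAINTS keeps the constraint enforced (and trusted) during the load;
-- without it, BULK INSERT skips CHECK constraints by default.
BULK INSERT dbo.Sales2007
FROM 'D:\Loads\sales2007.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, CHECK_CONSTRAINTS);
GO

-- Redefine the view so it includes the newly loaded member table.
ALTER VIEW dbo.SalesAll
AS
SELECT SaleID, SaleYear, Amount FROM dbo.Sales2006
UNION ALL
SELECT SaleID, SaleYear, Amount FROM dbo.Sales2007;
GO
```

Users of the view see the new data only after the ALTER VIEW statement runs, so the load itself does not interfere with queries against the existing member tables.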
If you are using linked server for a member table in a partitioned view use sp_serveroption to defer schema validation
Consider setting the lazy schema validation option for each linked server definition by using the sp_serveroption stored procedure. This optimizes performance by making sure the query processor does not request metadata for any one of the linked tables until data is actually needed from that remote member table.
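For example, assuming a linked server registered under the hypothetical name RemoteServer1, the option can be set as follows:

```sql
-- RemoteServer1 is a placeholder for the name of an existing linked server.
EXEC sp_serveroption
    @server   = N'RemoteServer1',
    @optname  = 'lazy schema validation',
    @optvalue = 'true';
```

Run the statement once for each linked server that hosts a member table of the partitioned view.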
Ensure the partitioned indexes are dropped or disabled prior to loading the data
When data is loaded into a partitioned table, it is better to drop the partitioned index before loading, to gain performance and eliminate index rebuilds during the load. Create the partitioned index on the partitioned table once the load is finished. You do not need to drop and recreate the index for transactions involving small numbers of records. This technique is used particularly for bulk loading. Loading data while indexes are in place hurts performance, because the records must be sorted and the index rebuilt.
o Add data to the partition and then create the index.
o Ensure the partitioned indexes are dropped or disabled prior to loading the data.
o After loading the data, rebuild the indexes.
o Rebuild and reorganize indexes on individual partitions, thereby facilitating better management of partitioned indexes.
o It is a good practice to partition the table first and then create the index over it, so that the index is aligned with the data.
o Alignment of a partitioned index with its partitioned table is also important when SQL Server sorts on the partitioned index column. If the index is aligned, the sort takes less memory because it is done in chunks (partitions); otherwise the entire index is loaded and sorted at once. This can be slow on memory-constrained machines, and happens only in the non-aligned case.
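The disable, load, and rebuild sequence can be sketched as follows, using the PartTable table and the IX_PartTable1_ReferenceF2 nonclustered index from the earlier examples. The staging table dbo.PartTable_Staging is a hypothetical source for the load.

```sql
-- Disable the nonclustered index before the load.
-- (Disabling a clustered index would make the table inaccessible,
-- so this pattern is meant for nonclustered indexes.)
ALTER INDEX IX_PartTable1_ReferenceF2 ON PartTable DISABLE;

-- Load the data; dbo.PartTable_Staging is a hypothetical staging table.
INSERT INTO PartTable (F1, F2)
SELECT F1, F2
FROM dbo.PartTable_Staging;

-- Rebuilding the whole index re-enables it.
ALTER INDEX IX_PartTable1_ReferenceF2 ON PartTable REBUILD;

-- For routine maintenance, a single partition of an enabled index
-- can be rebuilt on its own (partition 2 here).
ALTER INDEX IX_PartTable1_ReferenceF2 ON PartTable REBUILD PARTITION = 2;
```

A disabled index cannot be rebuilt one partition at a time; the full REBUILD is what brings it back online, after which per-partition rebuilds become available.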
Use the same partition function for the table as well as for the index
Partition both the table and its indexes using the same partition function. When both the indexes and the table use the same partitioning function and the same partitioning columns (in the same order), the table and index are said to be aligned.
To achieve alignment, two partitioned tables or indexes must have some correspondence between their respective partitions. They must use "equivalent" partition functions and have some relation with respect to the partitioning columns. Two partition functions can be used to align data when:
o Both partition functions use the same number of arguments and partitions
o The partitioning key used in each function is of equal type (includes length, precision and scale if applicable, and collation if applicable)
o The boundary values of partition ranges are equivalent
NOTE: Even when two partition functions are designed to align data, you may end up with an unaligned index if it is not partitioned on the same column as the partitioned table.
When the tables and indexes are defined with not only the same partition function but also the same partition scheme, they are said to be storage aligned. In general, having the tables and indexes on the same file or file group is often beneficial where multiple CPUs can parallelize an operation across partitions, because one processor can find the index data and the table data in the same file group. With storage alignment and multiple CPUs, SQL Server can have each processor work directly on a specific file or file group and know that there are no conflicts in data access, as the data typically resides on the same disk. This allows more processes to run in parallel without interruption, and so gives better performance.
Indexes created on a partitioned table will, by default, use the same partitioning scheme and partitioning column. When this is true, the index is said to be aligned with the table. If you create an index on a partitioned table and do not specify a file group on which to place the index, the index is partitioned in the same manner as the underlying table. This is because indexes, by default, are placed on the same file groups as their underlying tables, and for a partitioned table, in the same partition scheme that uses the same partitioning columns.
If you create a non-unique index, SQL Server adds the partitioning column to the index keys so that it can go directly to the corresponding partition. When partitioning a non-unique, clustered index, the Database Engine by default adds any partitioning columns to the list of clustered index keys, if not already specified.
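One way to check where each index of a table is stored is to query the catalog views. This is a minimal sketch against the PartTable example; an index is storage aligned with the table when it reports the same partition scheme as the table's heap or clustered index row.

```sql
-- Lists every index (and the heap, if any) of PartTable together with
-- the data space it is stored on. DataSpaceType is PARTITION_SCHEME for
-- partitioned objects and ROWS_FILEGROUP for unpartitioned ones.
SELECT i.name       AS IndexName,
       i.type_desc  AS IndexType,
       ds.name      AS DataSpaceName,
       ds.type_desc AS DataSpaceType
FROM sys.indexes AS i
JOIN sys.data_spaces AS ds
    ON ds.data_space_id = i.data_space_id
WHERE i.object_id = OBJECT_ID(N'PartTable');
```

Sharing a partition scheme is not sufficient on its own: as the note above says, the index must also be partitioned on the same column as the table.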
o If queries typically extract from a predictable subset of the table, it may be a good candidate for partitioning.
o Before designing partitions, make sure that the table and index are properly designed and optimized.
o If possible, move old data to an archive to reduce the size of the table. This may eliminate the need to create partitions.
Data Loading:
o Determine which tables are slow to load. These will typically be very large tables that have very large incremental loads.
o If the warehouse system cannot allow for any downtime or performance degradation while loading, partitions may be a good option.
o As a general rule, don't consider tables smaller than two gigabytes for partitioning.
o The most common tables for partitioning are fact tables. However, if a dimension table has more than 10 million members, you may benefit from partitioning it.
Data Archiving/Aging:
o Tables containing historical data, in which new data is added into the newest partition, are good candidates for partitioning. A typical example is a historical table where only the current month's data is updatable and previous months are read-only.
o Tables that contain a rolling time period, where an old period is dropped each time a new period is added, are good candidates for partitioning. In this case, executing a delete on a large table is prohibitively expensive, but dropping a partition is very efficient, particularly when the table is very large.
o Tables that are too large to back up effectively are good candidates for partitioning.
Data Indexing:
o When dropping and rebuilding an index over a whole table is too slow, you may want to add partitions. Dropping and rebuilding indexes can be effective when working with bulk load processes.
o When query performance degrades because the index is too large, you may want to add partitions.
Data Volume:
o If the existing table already holds a very large amount of data and cannot be moved to a different table because of its size, consider partitioning its indexes instead of the table.
o Put the most frequently used partitions on the best performing disk drives, so that frequently run queries are faster than seldom run queries.
o Keeping partitions on different file groups also facilitates easy backups.
o The number of file groups may be limited by hardware resources, so consider plans for scaling up or scaling out in the future, so that each file group can be placed on a different disk to gain the advantage of parallel disk I/O.
o To simplify alignment between data files and indexes, keep the data file and index for each partition on the same file group.
o Partition scheme and partition function should have the same number of partitions.
participate in locking instead of the entire table, which reduces downtime. This helps maintain high availability of existing data while new data is loaded faster. Some guidelines for loading scenarios on partitioned tables are as follows:
o The following approaches result in lower availability: blocking all other users from accessing the table (partition member) during data loading, or having a planned maintenance window for incrementally loading data.
o Use partition switching if the data being loaded is aligned with the partition scheme. For example, if a table is partitioned on date and data is loaded once per day, the partition switching approach is of great benefit, particularly if the data volume is huge.
Confine each partition to a separate file group to simplify back-up and restore
SQL Server performs piecemeal (i.e., staged) restores at the file group level. Placing each partition on its own file group therefore gives you the advantage of piecemeal backup and restore operations, with the flexibility to back up only the required data instead of the complete table. A file group can also be made read-only, which is particularly important for larger loads performed by bulk load. Mark the file group as read-only for piecemeal backup to work under the Simple Recovery model. Under the Bulk-Logged Recovery model or the Full Recovery model, it is necessary to back up the transaction log(s). Doing so is essential to restore the file group successfully.
scenarios. If new data you are adding has to be loaded, scrubbed, or transformed, it can be treated as a separate entity before adding it as a partition. Regardless of how large or small the collection, the transfer is quick and efficient because, unlike an INSERT INTO SELECT FROM statement, the data is not physically moved. Only the metadata about where it is stored changes from one partition to another.
o The sliding window (or rolling window) scenario works best when data movement across partitions is minimized. When rows do have to move between partitions (for example, when a non-empty partition is split or merged), they are deleted at the original position and inserted at the new position, and the partitions involved are inaccessible during this period.
o The sliding window scenario works only when the target table or partition is empty.
o The switch requires a schema lock on the partitioned table. Partitions can be switched in and out only when no other process has acquired a table-level lock on the partitioned table. The switch waits until the lock has been released by the other process.
o Consider using check constraints on the archive table to ensure it holds only the specified data range.
o To use partition switching, you have to know in which file group the partition resides and the ID of that partition. For the switch to work, the two support tables must be created in the same file group as the partition of the main table that is being changed.
o Any indexes that are not aligned with their respective tables must be dropped or disabled prior to the switch.
The basic steps in a sliding window scenario are:
o Create an empty partition on the receiving partitioned table.
o Isolate the data to one partition of the source partitioned table by splitting or merging partitions accordingly.
o Move the data from the source table to the destination using a partition switch statement (which modifies only the metadata and does not move the content itself).
o Remove the source partition that was moved to the destination.
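The sliding window steps described above can be illustrated with a small in-memory model. This is a Python sketch of the idea only, not SQL Server code: the class PartitionedTable and the functions switch_partition and slide_window are hypothetical names, a "partition" is just a list of rows, and the "switch" hands over the list object itself rather than copying rows, mimicking a metadata-only change. The sketch also includes switching a newly loaded staging partition in, which is an assumption about how the new period arrives.

```python
class PartitionedTable:
    """Toy model: maps a partition key (e.g. '2006-01') to its list of rows."""

    def __init__(self, name, partition_keys):
        self.name = name
        self.partitions = {key: [] for key in partition_keys}


def switch_partition(source, target, key):
    """Move one partition from source to target without copying any rows."""
    if key not in target.partitions:
        raise KeyError(f"{target.name} has no partition {key!r} to receive data")
    if target.partitions[key]:
        raise ValueError(f"target partition {key!r} in {target.name} must be empty")
    # Only the reference changes hands, which models a metadata-only switch.
    target.partitions[key] = source.partitions[key]
    source.partitions[key] = []


def slide_window(fact, archive, staged_rows, new_key):
    """Archive the oldest partition of fact and bring in a newly loaded one."""
    oldest = min(fact.partitions)

    # Step 1: create an empty partition on the receiving (archive) table.
    archive.partitions.setdefault(oldest, [])
    # Steps 2 and 3: the oldest period is already isolated in one partition,
    # so it can be switched out to the archive.
    switch_partition(fact, archive, oldest)
    # Step 4: remove the now-empty source partition.
    del fact.partitions[oldest]

    # Add an empty partition for the new period and switch the staged data in.
    fact.partitions[new_key] = []
    staging = PartitionedTable("staging", [new_key])
    staging.partitions[new_key] = staged_rows
    switch_partition(staging, fact, new_key)


if __name__ == "__main__":
    fact = PartitionedTable("fact_sales", ["2006-01", "2006-02", "2006-03"])
    fact.partitions["2006-01"] = [("order-1", 100)]
    archive = PartitionedTable("fact_sales_archive", [])
    slide_window(fact, archive, [("order-9", 250)], "2006-04")
    print(sorted(fact.partitions), sorted(archive.partitions))
```

The empty-target check in switch_partition corresponds to the guideline that the target table or partition must be empty before a switch.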
each of which represents a determined partition range. These loaded tables are then added to the partitioned view or switched in to the partitioned table as a new partition.
Move seldom used rows to other partitions
The general motive behind horizontal partitioning is to move seldom-used data into another partition.
Consider moving old or infrequently used data into a different partition so that the current partition holds less data for the queries the optimizer has to plan. This gives a performance advantage because queries touch less data than they would on an unpartitioned table. Another motive is scaling over time.
Consider maintaining Current and Archive set for the warehouse data
Keep archiving the current data and store it in a different set of tables. In a data warehouse, the archived data is aggregated by month or by week and stored in a separate set of tables, leaving the current tables intact. This helps queries run faster, because most analysis queries run against the current data. The duration of the current data window is decided by business requirements. For example, if business trends are closely observed for the last 2 weeks, then the last 2 weeks must be kept at a more granular level and act as the current set of tables. When new data enters a new week, the previous week's data is aggregated and kept in a different data store. The history data is also queried to meet business requirements, but drill-down to the more granular level is not available. This is acceptable, as it is not practically feasible to maintain granular data for long periods of time. The current set of tables and the history tables are both partitioned, but using different partition schemes.
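The current/archive split described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: rows are (date, amount) pairs, the current window is 14 days, and rows that fall out of the window are aggregated by ISO year and week. The function name roll_into_archive and the row layout are hypothetical.

```python
from datetime import date, timedelta


def roll_into_archive(current_rows, archive_totals, today, window_days=14):
    """Aggregate rows older than the current window into weekly totals.

    current_rows:   list of (date, amount) pairs at full granularity
    archive_totals: dict mapping (iso_year, iso_week) to the summed amount
    Returns the rows that remain in the current (granular) set.
    """
    cutoff = today - timedelta(days=window_days)
    kept = []
    for day, amount in current_rows:
        if day < cutoff:
            iso = day.isocalendar()
            key = (iso[0], iso[1])  # (ISO year, ISO week number)
            archive_totals[key] = archive_totals.get(key, 0) + amount
        else:
            kept.append((day, amount))
    return kept


if __name__ == "__main__":
    rows = [
        (date(2006, 3, 1), 10),
        (date(2006, 3, 2), 5),
        (date(2006, 3, 15), 7),
    ]
    archive = {}
    rows = roll_into_archive(rows, archive, today=date(2006, 3, 20))
    print(rows)     # only the row inside the 14-day window remains
    print(archive)  # {(2006, 9): 15}
```

After the roll, detail for the archived week is no longer available, which matches the trade-off described in the text: history can still be queried, but only at the aggregated level.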
Summary
Partitions are unavoidable when the data volume is huge, as is typical in warehouse systems. This chapter details guidelines for implementing the various types of partitions in different scenarios to reduce loading time and make data access queries run faster. The sliding window scenario is particularly useful for loading huge volumes of data and for archiving fact tables.