The basic concepts of dimensional modeling are facts, dimensions, and measures. Facts represent business transactions and contain measures and context data. Dimensions describe one business dimension and determine the context for facts. Measures are numeric attributes of facts representing business performance relative to dimensions.
Data Warehouse Design
The basic concepts of dimensional modeling are facts, dimensions, and measures.
A fact is a collection of related data items, consisting of measures and context data. Each fact typically represents a business item or a business transaction.

A dimension is a collection of data that describes one business dimension. Dimensions determine the contextual background for the facts.

A measure is a numeric attribute of a fact, representing the performance or behavior of the business relative to the dimensions.

1. Fact Tables
Fact tables are the large tables in the warehouse schema that store business measurements. They typically contain facts and foreign keys to the dimension tables. A fact represents data, usually numeric and additive, that can be analyzed and examined. Examples include sales, cost, and profit.

2. Dimension Tables
Dimension tables, also known as lookup or reference tables, contain the relatively static data in the warehouse. They store the information normally used to constrain queries. Dimension data is usually textual and descriptive, and it can serve as the row headers of the result set. Examples are customers, location, time, suppliers, or products.

A fact table has two types of columns:
1. measurements: columns that contain numeric facts
2. columns that are foreign keys to dimension tables

A fact table contains either detail-level facts or facts that have been aggregated. Fact tables that contain aggregated facts are called summary tables. A fact table usually contains facts with the same level of aggregation.

A fact table is the primary table in a dimensional model where the numerical performance measurements of the business are stored. We can imagine standing in the marketplace watching products being sold and writing down the quantity sold and dollar sales amount each day for each product in each store. A measurement is taken at the intersection of all the dimensions (day, product, and store). This list of dimensions defines the grain of the fact table and tells us what the scope of the measurement is. Each row in a fact table corresponds to one measurement, and each measurement is one row in the fact table.
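The marketplace example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the table names, column names, and sample rows are all assumptions made for the sketch. The composite primary key on (date_key, product_key, store_key) expresses the grain, and the two measures are additive, so they can be summed across any dimension.

```python
import sqlite3

# In-memory database; all table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT);

-- Grain: one row per day, per product, per store.
CREATE TABLE fact_sales (
    date_key      INTEGER REFERENCES dim_date(date_key),
    product_key   INTEGER REFERENCES dim_product(product_key),
    store_key     INTEGER REFERENCES dim_store(store_key),
    quantity_sold INTEGER,   -- additive measure
    dollar_sales  REAL,      -- additive measure
    PRIMARY KEY (date_key, product_key, store_key)
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                 [(1, "2024-01-01"), (2, "2024-01-02")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Tea"), (2, "Coffee")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                 [(1, "Downtown"), (2, "Airport")])

# Each tuple is one measurement taken at the intersection of day, product, store.
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 3, 15.0),
    (1, 2, 1, 2, 20.0),
    (2, 1, 1, 1, 5.0),
    (2, 1, 2, 4, 20.0),
])

# Additive facts can be rolled up along any dimension, here by store.
sales_by_store = conn.execute("""
    SELECT s.store_name, SUM(f.quantity_sold), SUM(f.dollar_sales)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_key = f.store_key
    GROUP BY s.store_name
    ORDER BY s.store_name
""").fetchall()

print(sales_by_store)  # [('Airport', 4, 20.0), ('Downtown', 6, 40.0)]
```

Because the primary key covers the full grain, a second row for the same day, product, and store is rejected, which keeps every measurement at the same level of detail.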
All the measurements in a fact table must be at the same grain. The most useful facts are numeric and additive.

Fact tables have two or more foreign keys, as designated by the FK notation. When all the keys in the fact table match their respective primary keys correctly in the corresponding dimension tables, we say that the tables satisfy referential integrity.

The fact table itself generally has its own primary key made up of a subset of the foreign keys. This key is often called a composite or concatenated key. Every fact table in a dimensional model has a composite key, and conversely, within a dimensional model every table that has a composite key is a fact table. Fact tables express the many-to-many relationships between dimensions in dimensional models.

A dimension is a structure, often composed of one or more hierarchies, that categorizes data. Dimensional attributes help to describe the dimensional value. They are normally descriptive, textual values. Dimension data is typically collected at the lowest level of detail and then aggregated into higher-level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.

Dimension tables have many columns or attributes. These attributes describe the rows in the dimension table. Each dimension is defined by its single primary key, designated by the PK notation.

Dimension attributes serve as the primary source of query constraints, groupings, and report labels. In a query or report request, attributes are identified as the "by" words. For example, when a user states that they want to see dollar sales by week by brand, week and brand must be available as dimension attributes.

A dimension table may be used in multiple places if the data warehouse contains multiple fact tables or contributes data to data marts. For example, a product dimension may be used with a sales fact table and an inventory fact table in the data warehouse, and also in one or more departmental data marts.
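As a rough sketch of two ideas from this part, the snippet below checks referential integrity and then answers the "dollar sales by week by brand" request. It uses Python's sqlite3 module; the schema, the helper name orphan_product_keys, and the sample data are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical schema: week_number and brand are dimension attributes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, calendar_date TEXT, week_number INTEGER);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, brand TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER, product_key INTEGER, dollar_sales REAL,
    PRIMARY KEY (date_key, product_key));   -- composite key of foreign keys
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (1, "2024-01-01", 1), (2, "2024-01-02", 1), (3, "2024-01-08", 2)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Green Tea", "Acme"), (2, "Black Tea", "Acme"), (3, "Espresso", "Bolt")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    (1, 1, 10.0), (1, 3, 30.0), (2, 2, 5.0), (3, 1, 7.0), (3, 3, 8.0)])


def orphan_product_keys(connection):
    """Return fact-table product keys that match no row in dim_product.

    An empty list means the product foreign key satisfies referential integrity.
    """
    rows = connection.execute("""
        SELECT DISTINCT f.product_key
        FROM fact_sales AS f
        LEFT JOIN dim_product AS p ON p.product_key = f.product_key
        WHERE p.product_key IS NULL
        ORDER BY f.product_key
    """).fetchall()
    return [row[0] for row in rows]


# The "by" words (week, brand) become the GROUP BY columns.
sales_by_week_by_brand = conn.execute("""
    SELECT d.week_number, p.brand, SUM(f.dollar_sales)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY d.week_number, p.brand
    ORDER BY d.week_number, p.brand
""").fetchall()

print(orphan_product_keys(conn))  # []
print(sales_by_week_by_brand)
# [(1, 'Acme', 15.0), (1, 'Bolt', 30.0), (2, 'Acme', 7.0), (2, 'Bolt', 8.0)]
```

The integrity check matters for the report as well: a fact row whose product key has no matching dimension row would silently drop out of the inner joins above.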
A dimension such as customer, time, or product that is used in multiple schemas is called a conformed dimension.

The records in a dimension table establish one-to-many relationships with the fact table. Examples are a number of sales to a single customer, or a number of sales of a single product.

A schema is a collection of database objects, including tables, views, indexes, and synonyms. Most data warehouses use a dimensional model schema. The principal characteristic of a dimensional model is a set of detailed business facts surrounded by multiple dimensions that describe those facts. When realized in a database, the schema for a dimensional model contains a central fact table and multiple dimension tables.

A schema is called a star schema if all dimension tables can be joined directly to the fact table. In the star schema design, a single object (the fact table) sits in the middle and is radially connected to other surrounding objects (dimension lookup tables) like a star. A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.

A schema is called a snowflake schema if one or more dimension tables do not join directly to the fact table but must join through other dimension tables. For example, a dimension that describes products may be separated into three tables (snowflaked).

In a star schema, every dimension has a primary key, and a dimension table does not have any parent table. In a snowflake schema, a dimension table has one or more parent tables. In a star schema, the hierarchies for the dimensions are stored in the dimension table itself. In a snowflake schema, the hierarchies are broken into separate tables.
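The difference between the two schema styles can be sketched by modeling the same product hierarchy both ways. In this illustration (Python sqlite3), the choice of product, brand, and category as the three snowflaked tables is an assumption, as are all table names and rows. The star version keeps the hierarchy inside one dimension table, while the snowflake version reaches the category only by joining through parent tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the product hierarchy (product -> brand -> category) lives in one table.
CREATE TABLE star_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, brand TEXT, category TEXT);

-- Snowflake: the same hierarchy split into three tables with parent links.
CREATE TABLE snow_dim_category (
    category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE snow_dim_brand (
    brand_key INTEGER PRIMARY KEY, brand TEXT,
    category_key INTEGER REFERENCES snow_dim_category(category_key));
CREATE TABLE snow_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT,
    brand_key INTEGER REFERENCES snow_dim_brand(brand_key));

-- One fact table, shared by both variants for comparison.
CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, product_key INTEGER, dollar_sales REAL);
""")
conn.executemany("INSERT INTO star_dim_product VALUES (?, ?, ?, ?)", [
    (1, "Green Tea", "Acme", "Beverages"),
    (2, "Black Tea", "Acme", "Beverages"),
    (3, "Chips", "Crisp", "Snacks")])
conn.executemany("INSERT INTO snow_dim_category VALUES (?, ?)",
                 [(1, "Beverages"), (2, "Snacks")])
conn.executemany("INSERT INTO snow_dim_brand VALUES (?, ?, ?)",
                 [(1, "Acme", 1), (2, "Crisp", 2)])
conn.executemany("INSERT INTO snow_dim_product VALUES (?, ?, ?)",
                 [(1, "Green Tea", 1), (2, "Black Tea", 1), (3, "Chips", 2)])
conn.executemany("INSERT INTO fact_sales (product_key, dollar_sales) VALUES (?, ?)",
                 [(1, 10.0), (2, 5.0), (3, 8.0), (1, 2.0)])

# Star: the dimension joins directly to the fact table (one join).
star_result = conn.execute("""
    SELECT p.category, SUM(f.dollar_sales)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category ORDER BY p.category
""").fetchall()

# Snowflake: category is reached through product and brand (three joins).
snowflake_result = conn.execute("""
    SELECT c.category, SUM(f.dollar_sales)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_key = f.product_key
    JOIN snow_dim_brand AS b ON b.brand_key = p.brand_key
    JOIN snow_dim_category AS c ON c.category_key = b.category_key
    GROUP BY c.category ORDER BY c.category
""").fetchall()

print(star_result)       # [('Beverages', 17.0), ('Snacks', 8.0)]
print(snowflake_result)  # [('Beverages', 17.0), ('Snacks', 8.0)]
```

Both queries return the same totals. The trade-off in this sketch is that the star form repeats brand and category text on every product row, while the snowflake form removes that repetition at the cost of extra joins.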