Project Architecture
Version 10-Jun15
Contents
Column Types
Table Keys
Lookup Tables
Relationship Tables
Fact Tables
Schema Types
Query Performance
Data Volume
Database Maintenance
Creating a Project
Partitioning
ARCHITECTING MICROSTRATEGY SCHEMA
Before discussing how to architect a MicroStrategy schema, it is important to review some basic concepts.
• A MicroStrategy project is an integrated object in MicroStrategy. The project connects to source databases
and contains all objects related to an Analytics solution – schema objects, reports, dashboards, cubes, filters,
prompts, custom groups, metrics, etc.
• The MicroStrategy Schema is the backbone of the MicroStrategy Analytics Platform. It is the primary contact
layer between Analytics and the backend data sources, such as a data warehouse.
• A central part of this architecture is the metadata database. The metadata is the repository that stores all
project information. MicroStrategy applications use metadata to translate user requests into SQL queries
and then translate the results of those queries back into MicroStrategy objects like reports and documents.
• Project definitions and parameters, as well as Schema objects, are part of the metadata. MicroStrategy
metadata contains all the information needed for a MicroStrategy project to function, including data
warehouse connection information, project settings, and MicroStrategy object definitions. However, the
most important objects stored in the metadata are schema objects. Schema objects are logical objects that
relate application objects to data warehouse content. They are the bridge between the reporting
environment and the data warehouse. As such, the basic schema objects must be created before any other
tasks (such as creating templates, filters, reports, or documents) are completed.
• Creating schema objects is the primary task performed in MicroStrategy Architect. The following basic
schema objects form the foundation of a MicroStrategy project:
• Tables—Logical objects that correspond to physical tables stored in the data warehouse that are used in a
MicroStrategy project.
• Facts—Logical objects that relate aggregatable data stored in the data warehouse to the MicroStrategy
reporting environment. They are usually numeric, and must be aggregated to different levels, depending
on reporting needs.
• Attributes—Logical objects that relate descriptive (non-fact) data stored in the data warehouse to the
MicroStrategy reporting environment. They provide context for reporting on facts and define the level of
detail at which users want to analyze facts.
• Hierarchies—Logical objects that enable grouping of attributes to reflect their relationships or provide
convenient browsing and drilling paths in the MicroStrategy reporting environment.
Assume that business requirements have already been defined. Now, review the rest of the steps in the project life
cycle.
Once the business requirements have been documented, the next step is to organize requirements into a Fact
Qualifier Matrix. This type of matrix helps summarize what measures are needed for building Analytics, and what
factors (dimensions) they are viewed with.
• Based on the Fact Qualifier Matrix, we can identify the attributes, the measures we need (facts), and the
hierarchies. For example, the Citywide, Region, Sector, Zone, and Grid attributes form the geographic hierarchy.
• Next, we can start grouping attributes and facts into entities that will end up on the logical data model.
1. Facts are measures used to analyze a business. Fact data is typically numeric, and it is generally aggregatable.
Revenue, unit sales, inventory, and account balance are just a few examples of facts that may be used in
business.
2. Attributes are descriptive data that provide context for analyzing facts. They enable users to answer questions
about facts and report on various aspects. Without this context, facts are meaningless.
Attribute forms are a MicroStrategy-specific concept. Attribute forms enable different types of descriptive
information about an attribute to be displayed. Use attribute forms to display the ID for an attribute along with any
number of description fields. It is the attribute forms that actually map directly to the columns in the data
warehouse.
All attributes have an ID form. Most also have at least one primary description form. Some attributes have many
description forms, so they can display various aspects of that attribute on reports. For example, the Customer
attribute can have the following columns in a table as attribute forms.
• ID
• Name
• Home phone
• Home address
• Birth date
• Gender
• Income
Attribute forms must have a one-to-one relationship with other forms of the same attribute.
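As a sketch of how attribute forms might map to physical columns, the hypothetical table below (built with SQLite purely for illustration; the table name, column names, and sample data are invented, not from a real project) stores one column per form of the Customer attribute:

```python
import sqlite3

# In-memory database for illustration; every non-ID column below is a
# candidate attribute form of the Customer attribute.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE LU_CUSTOMER (
        Customer_ID   INTEGER PRIMARY KEY,  -- ID form (required)
        Customer_Name TEXT,                 -- primary description form
        Home_Phone    TEXT,
        Home_Address  TEXT,
        Birth_Date    TEXT,
        Gender        TEXT,
        Income        REAL
    )
""")
conn.execute("""INSERT INTO LU_CUSTOMER VALUES
    (1, 'Ann Lee', '555-0100', '12 Oak St', '1980-03-14', 'F', 72000.0)""")

# Each form has a one-to-one relationship with the ID: one row per customer.
row = conn.execute(
    "SELECT Customer_Name, Birth_Date FROM LU_CUSTOMER WHERE Customer_ID = 1"
).fetchone()
print(row)  # ('Ann Lee', '1980-03-14')
```

Because every form shares the table's one-row-per-customer grain, each form maps directly to a column, as described above.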
3. Hierarchies are groupings of directly related attributes ordered to reflect their relationships. In a logical data
model, hierarchies are also sometimes referred to as dimensions. Attributes are generally organized into
hierarchies based on logical business areas.
Before beginning to organize business data into a logical data model, consider the following factors that influence
the design of the logical data model:
The logical data model should take into account the reporting requirements of end users. It should include:
• The granularity of each piece of information (for example, Sales per month or week).
• Consideration for the diversity of users. For example, executive users are interested in overall business
trends for the company; while a department manager may need to track sales for his/her area of
responsibility.
• Evaluation of whether these reporting requirements can be readily supported given the existing source
data and other environmental characteristics.
Ensuring that the logical data model adequately addresses user reporting requirements is often an iterative
process. Over time, as requirements change, the logical data model may change as well. Users may need to add
new facts, attributes, or hierarchies or remove some components that are no longer needed or cannot be
practically supported. Additionally, they may often create multiple “draft” versions of a logical data model before
completing the final design.
A logical data model should take into account what source data is available for use. Users need to ensure that there
is sufficient source data to support user reporting requirements and address any gaps that exist.
An initial review of source systems may reveal that the necessary source data to support all the user reporting
requirements is available. However, sometimes the original source data is not sufficient to answer specific
reporting requirements. In such cases, it may be possible to derive additional data using the original source data to
satisfy those requirements.
For example, a company may record sales transactions by customer and city, but users also want to analyze sales by
state or region. State and region data are not part of the source data, but this information can be extracted based
on the cities in which sales occur. Although this data does not exist in the source system, it can readily be inferred,
so the Region and State attributes can be included in the logical data model.
At other times, the absence of data in source systems may make it impossible to support certain user reporting
requirements. This happens, for example, when the required level of detail is simply not captured; supporting such
requirements would mean changing the method for capturing the data.
• Sometimes, there are application-level workarounds within MicroStrategy for missing source data that
cannot be extrapolated at the database level. For more information, refer to the Fact Level Extensions section in
this course.
Finally, remember that only information needed for reports should be part of the logical data model. Therefore,
only the information needed to populate the data model should be extracted from the source systems.
A number of technical and performance factors can affect how the logical data model is designed, particularly with
regard to its size and complexity. The more information included in the logical data model, and the more complex
the analysis it must support, the greater the demand on the Analytics system.
Technical issues such as the robustness of the database server and software can affect the volume or types of
queries that can be supported in the reporting environment. Other factors such as network bandwidth or the
volume of concurrent users can influence speed and system demand.
Evaluate what level of performance can be delivered with the technical resources at your disposal. Complex user
reporting requirements or source data structures pose greater challenges to delivering optimal performance.
Large projects may contain thousands of attributes, each with the potential to have hundreds, thousands, or even
millions of elements. It is important to consider any technical or performance limitations posed by the hardware
and database software. These limitations may require scaling back the size or complexity of the logical data model
or sacrificing some user reporting requirements.
Your instructor will provide guidelines for completing this business scenario in groups. After you work on the
business scenario with your group to create a logical data model, the class will review the scenario together and
discuss possible solutions for how to structure the logical data model.
• Creating a logical data model is an exercise that is open to interpretation and variability. As long as the
decisions you make in how you structure the logical data model are in keeping with user requirements and
the nature of the underlying source data, you can come up with different “right” answers for this scenario.
The structure of some logical data models may be preferred over others based on implementation
requirements, but it is often possible to create several serviceable data models for the same scenario.
Overview
Your company captures information about various aspects of its business and stores it in a dedicated source
system. Now, the company wants to use this existing data to create a well-organized data warehouse that enables
users to better analyze business data.
You are new to the company, but you have been assigned the task of creating the logical data model that will
eventually be used to design the data warehouse. In creating the logical data model, you need to consider the
available source data, which is shown in the illustration:
Existing Source Data
• This illustration shows a very simple source system structure. In most business environments, source
systems are much more complex, and you often have to integrate information from multiple source
systems.
You need to create the logical data model based on the following requirements:
• The logical data model must include everything from the source data.
As you work, follow these five steps to create a logical data model:
1. List all the information from the source data you need to include in the logical data model based on the user
reporting requirements.
A physical schema is a detailed, graphical representation of the physical structure of a database. When the
database in question is a data warehouse, the physical schema shows how business data is stored. The physical
schema design is based on the organization of the logical data model. Various physical schema designs can be
created from a single logical data model, depending on the desired way to store the data representing logical objects
in the data warehouse.
While the logical data model shows the facts and attributes, the physical schema shows how the underlying data
for these objects is stored in the data warehouse.
This physical schema shows tables and columns that store product, geography, time, and sales data.
The schema objects created in MicroStrategy Architect serve as the link between the logical structure of the data
model and the physical structure of the data warehouse content. When facts and attributes are created in
Architect, these logical objects are mapped to specific columns and tables in the data warehouse. Therefore,
understanding the physical schema of the data warehouse is essential to creating the correct mappings between
schema objects and the actual data.
• Columns
• Tables
A physical schema consists of a set of tables, and those tables contain columns that store the actual data. The
following topics describe each of these components in more detail.
Column Types
In a data warehouse, the columns in tables store fact or attribute data. The following are the three types of
columns:
ID columns—these columns store the IDs for attributes. IDs are often numeric and generally serve as the unique
key that identifies each element of an attribute. All attributes have an ID column.
Description columns—these columns store the text descriptions of attributes. Description columns are optional.
However, most attributes have an ID column and at least one description column. If an attribute has additional
description columns, attribute forms for each column can be created.
When an attribute does not have a separate description column, the ID column serves as both the ID and
description.
Fact columns—these columns store the measures of business performance. They are usually numeric.
In the LU_EMPLOYEE table, the Employee_ID column stores the unique IDs for each employee, while the other
three columns store description information about each employee, including their names, emails, and addresses.
In the FACT_SALES table, the Dollar_Sales and Unit_Sales columns store values for these two facts. The remaining
columns are attribute ID columns.
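The three column types can be sketched together in one minimal SQLite example. The table and column names follow the course's LU_EMPLOYEE and FACT_SALES examples; the comments mark which column type each column represents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Lookup table: one ID column plus description columns.
conn.execute("""
    CREATE TABLE LU_EMPLOYEE (
        Employee_ID   INTEGER PRIMARY KEY,  -- ID column
        Employee_Name TEXT,                 -- description columns
        Email         TEXT,
        Address       TEXT
    )
""")
# Fact table: attribute ID columns plus numeric fact columns.
conn.execute("""
    CREATE TABLE FACT_SALES (
        Item_ID      INTEGER,   -- attribute ID columns
        Employee_ID  INTEGER,
        Date_ID      INTEGER,
        Dollar_Sales REAL,      -- fact columns
        Unit_Sales   INTEGER
    )
""")
cols = [c[1] for c in conn.execute("PRAGMA table_info(FACT_SALES)")]
print(cols)  # ['Item_ID', 'Employee_ID', 'Date_ID', 'Dollar_Sales', 'Unit_Sales']
```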
Table Keys
Before learning about the different types of tables in a physical schema, it is important to understand some basic
concepts about table structure.
Every table has a primary key that consists of a unique value that identifies each distinct record (or row) in the table.
There are two types of primary keys:
Simple key—this type of key requires only one column to uniquely identify each record within a table.
Compound key—this type of key requires two or more columns to uniquely identify each record within a table.
The type of key used for a table depends on the nature of the data itself. The specific requirements of a business
may also influence what type of key is used.
In the first example, each call center in the LU_CALL_CTR table can be uniquely identified using only the
Call_Ctr_ID column. Since each call center has a unique ID, the table has a simple key.
In the second example, each call center in the LU_CALL_CTR table can be uniquely identified using both the
Call_Ctr_ID and Region_ID columns. The table requires a compound key because call centers from different regions
can have the same ID. For example, Boston and Baltimore have the same IDs, but they belong to different regions.
One cannot distinguish between these two call centers unless one also knows the IDs for their respective regions.
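The second example can be sketched as follows. The IDs and descriptions are invented sample data; the point is that the compound key (Call_Ctr_ID, Region_ID) is unique even though Call_Ctr_ID alone is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Call center IDs repeat across regions, so no single column is unique;
# the primary key must combine Call_Ctr_ID and Region_ID (compound key).
conn.execute("""
    CREATE TABLE LU_CALL_CTR (
        Call_Ctr_ID   INTEGER,
        Region_ID     INTEGER,
        Call_Ctr_Desc TEXT,
        PRIMARY KEY (Call_Ctr_ID, Region_ID)
    )
""")
rows = [(1, 1, "Boston"), (1, 2, "Baltimore"), (2, 1, "Providence")]
conn.executemany("INSERT INTO LU_CALL_CTR VALUES (?, ?, ?)", rows)

# Call_Ctr_ID alone is ambiguous: ID 1 matches two call centers...
same_id = conn.execute("""
    SELECT Call_Ctr_Desc FROM LU_CALL_CTR
    WHERE Call_Ctr_ID = 1 ORDER BY Region_ID
""").fetchall()
print(same_id)  # [('Boston',), ('Baltimore',)]

# ...but the full compound key identifies exactly one record.
one = conn.execute("""
    SELECT Call_Ctr_Desc FROM LU_CALL_CTR
    WHERE Call_Ctr_ID = 1 AND Region_ID = 2
""").fetchone()
print(one)  # ('Baltimore',)
```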
Simple keys are generally easier to work with in a data warehouse than compound keys because they require less
storage space and they allow for simpler SQL. Compound keys tend to increase SQL complexity and query time and
require more storage space. However, compound keys are often present in source system data. In such cases,
retaining compound keys provides for a less complex and more efficient ETL process.
There are advantages and disadvantages to both types of table keys, so users must consider which key structure
best meets their needs when designing the data warehouse schema.
Lookup Tables
Lookup tables store information about attributes, including their IDs and any descriptions. They enable users to
easily browse attribute data.
Depending on how the physical schema is designed, a lookup table can store information for either a single
attribute or multiple related attributes.
The LU_CATEGORY, LU_SUBCATEGORY, and LU_ITEM lookup tables each store information only for a single
attribute—Category, Subcategory, or Item, respectively. The LU_PRODUCT table stores information for all three of
these attributes.
Relationship Tables
Relationship tables store information about the relationship between two or more attributes. They enable users to
join data for related attributes. To map the relationship between two or more attributes, their respective ID
columns must exist together in a relationship table.
Lookup tables can serve a dual function as both lookup and relationship tables in some circumstances.
For attributes that have a one-to-one relationship, we do not need a separate relationship table. Typically, we do
not even have a separate lookup table for the parent attribute. Instead, the parent is placed directly in the lookup
table of the child attribute to map the relationship between the two attributes.
The following illustration shows an example of using a single lookup and relationship table for two attributes:
Email and Employee have a one-to-one relationship. Therefore, the LU_EMPLOYEE table can serve as both the
lookup and relationship table for both attributes.
• In a MicroStrategy project, a parent attribute in a one-to-one relationship is often treated as a form of the
attribute it describes rather than as a separate attribute. To treat the parent as a separate attribute, a separate
lookup table may be created for it.
For attributes that have a one-to-many relationship, a separate relationship table is not needed. Instead, the
parent-child relationship can be defined by including the ID column for the parent attribute in the lookup table of
the child attribute. Placing the parent ID in the child table creates a foreign key on the parent ID that enables
mapping the relationship between the two attributes using their respective lookup tables.
The following illustration shows an example of a lookup table that also functions as a relationship table:
Category and Subcategory have a one-to-many relationship. Therefore, the LU_SUBCATEGORY table can serve as
both the lookup table for the Subcategory attribute and the relationship table for the Category and Subcategory
attributes. The Category_ID column is a foreign key that maps back to the LU_CATEGORY table.
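A minimal sketch of this dual-purpose table, using the course's LU_CATEGORY and LU_SUBCATEGORY tables with invented IDs and descriptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LU_CATEGORY (Category_ID INTEGER PRIMARY KEY, Category_Desc TEXT)")
# The child lookup table carries the parent ID as a foreign key, so it
# doubles as the relationship table for Category and Subcategory.
conn.execute("""
    CREATE TABLE LU_SUBCATEGORY (
        Subcat_ID   INTEGER PRIMARY KEY,
        Subcat_Desc TEXT,
        Category_ID INTEGER REFERENCES LU_CATEGORY (Category_ID)
    )
""")
conn.executemany("INSERT INTO LU_CATEGORY VALUES (?, ?)",
                 [(1, "Books"), (2, "Music")])
conn.executemany("INSERT INTO LU_SUBCATEGORY VALUES (?, ?, ?)",
                 [(10, "Fiction", 1), (11, "History", 1), (20, "Jazz", 2)])

# Joining the two lookup tables resolves the one-to-many relationship.
pairs = conn.execute("""
    SELECT c.Category_Desc, s.Subcat_Desc
    FROM LU_SUBCATEGORY s JOIN LU_CATEGORY c USING (Category_ID)
    ORDER BY s.Subcat_ID
""").fetchall()
print(pairs)  # [('Books', 'Fiction'), ('Books', 'History'), ('Music', 'Jazz')]
```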
For attributes that have a many-to-many relationship, a separate relationship table must be created to map the
parent-child relationship.
Supplier and Item have a many-to-many relationship. Therefore, the ID column for the Supplier attribute cannot be
placed in the LU_ITEM table and still retain only the Item_ID column as the primary key. The REL_SUPPLIER_ITEM
table includes the ID columns for both the Supplier and Item attributes to map the relationship between them.
These two ID columns form a compound primary key for the relationship table.
• Use separate relationship tables in a data warehouse only for attributes that have many-to-many
relationships.
• Analyzing facts using attributes that have a many-to-many relationship requires a distinct relationship
table, and the relationship must also be included in any corresponding fact tables.
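The REL_SUPPLIER_ITEM structure can be sketched as follows (invented sample suppliers and items; the compound primary key is the pair of attribute ID columns, as described above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LU_SUPPLIER (Supplier_ID INTEGER PRIMARY KEY, Supplier_Desc TEXT)")
conn.execute("CREATE TABLE LU_ITEM (Item_ID INTEGER PRIMARY KEY, Item_Desc TEXT)")
# The relationship table's compound primary key is the two attribute IDs.
conn.execute("""
    CREATE TABLE REL_SUPPLIER_ITEM (
        Supplier_ID INTEGER,
        Item_ID     INTEGER,
        PRIMARY KEY (Supplier_ID, Item_ID)
    )
""")
conn.executemany("INSERT INTO LU_SUPPLIER VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO LU_ITEM VALUES (?, ?)", [(100, "Pen"), (101, "Pad")])
# Item 100 has two suppliers, and supplier 1 supplies two items:
# a many-to-many relationship neither lookup table alone could hold.
conn.executemany("INSERT INTO REL_SUPPLIER_ITEM VALUES (?, ?)",
                 [(1, 100), (1, 101), (2, 100)])

suppliers_of_pen = conn.execute("""
    SELECT s.Supplier_Desc
    FROM REL_SUPPLIER_ITEM r JOIN LU_SUPPLIER s USING (Supplier_ID)
    WHERE r.Item_ID = 100 ORDER BY s.Supplier_ID
""").fetchall()
print(suppliers_of_pen)  # [('Acme',), ('Globex',)]
```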
Fact Tables
Fact tables store fact data and attribute ID columns that describe the level at which the fact values are recorded.
They enable the analysis of the fact data with regard to the business dimensions or hierarchies those attributes
represent.
The FACT_SALES table stores dollar and unit sales data by item, employee, and date. These three attributes
comprise the fact table level. Using this table, users can view dollar or unit sales data at any of these levels, or can
aggregate the fact data from these levels to higher attribute levels within the same hierarchies. For example, they
can aggregate the date-level dollar sales to the month level or the item-level unit sales to the subcategory level.
There are typically two types of fact tables in a data warehouse. Base fact tables are tables that store a fact or set of
facts at the lowest possible level of detail. Aggregate fact tables are tables that store a fact or set of facts at a higher,
or summarized, level of detail.
The FACT_SALES table stores dollar and unit sales data at the lowest possible level of detail—by item, employee,
and date. Therefore, it is the base fact table for these two facts. The FACT_SALES_AGG table stores dollar and unit
sales data at a higher level of detail—by category, region, and month. Therefore, it is an aggregate fact table for
these two facts.
Because they store data at a higher level, aggregate fact tables reduce query time. For example, a report that shows
unit sales by region can obtain the result set more quickly using the FACT_SALES_AGG table than the FACT_SALES
table.
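A minimal sketch of the base/aggregate relationship, using SQLite with invented sample rows (here the aggregate rolls the base fact up to the Region level; in practice this roll-up would be done during ETL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FACT_SALES (Item_ID INT, Region_ID INT, Date_ID INT, Unit_Sales INT)")
conn.executemany("INSERT INTO FACT_SALES VALUES (?, ?, ?, ?)",
                 [(100, 1, 20240101, 5), (100, 1, 20240102, 3),
                  (101, 2, 20240101, 7), (100, 2, 20240102, 2)])

# The aggregate fact table is populated by rolling the base table
# up to a higher level, here Region.
conn.execute("""
    CREATE TABLE FACT_SALES_AGG AS
    SELECT Region_ID, SUM(Unit_Sales) AS Unit_Sales
    FROM FACT_SALES GROUP BY Region_ID
""")

# A region-level report can now read the small, pre-summarized table
# instead of re-aggregating the detail rows on every query.
agg = conn.execute(
    "SELECT Region_ID, Unit_Sales FROM FACT_SALES_AGG ORDER BY Region_ID").fetchall()
print(agg)  # [(1, 8), (2, 9)]
```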
In a data warehouse, there are frequently multiple aggregate fact tables for the same fact or set of facts, enabling
faster analysis of the fact data at various levels of detail.
Schema Types
For any given logical data model, a variety of schemas can be designed to store the data using different physical
structures. The type of schema design selected for the data warehouse depends on the nature of the data, how
users want to query the data, and other factors unique to a project and database environments.
Normalization occurs whenever the schema design does not store data redundantly. Denormalization occurs
whenever the schema design stores at least some data multiple times, or redundantly. Generally, tables are
denormalized for performance purposes to reduce the number of joins necessary between tables for queries.
The normalized LU_ITEM table contains the minimal amount of information necessary to store items and map their
relationships to subcategories. The denormalized LU_ITEM table stores all this information, but it also contains the
descriptions for each subcategory and the IDs and descriptions for the corresponding categories. This data is
stored multiple times for each item to which it corresponds. There may also be separate lookup tables that store
only the subcategory and category data, so the same information may be stored redundantly across multiple
tables.
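The two LU_ITEM variants described above can be sketched side by side (column names follow the course's product hierarchy; the sample rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Normalized: only the item's own data plus the parent ID.
conn.execute("""
    CREATE TABLE LU_ITEM_NORM (
        Item_ID   INTEGER PRIMARY KEY,
        Item_Desc TEXT,
        Subcat_ID INTEGER
    )
""")
# Denormalized: subcategory and category data repeated on every item row.
conn.execute("""
    CREATE TABLE LU_ITEM_DENORM (
        Item_ID       INTEGER PRIMARY KEY,
        Item_Desc     TEXT,
        Subcat_ID     INTEGER,
        Subcat_Desc   TEXT,
        Category_ID   INTEGER,
        Category_Desc TEXT
    )
""")
conn.executemany("INSERT INTO LU_ITEM_DENORM VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, "Pen", 10, "Stationery", 100, "Office"),
                  (2, "Pad", 10, "Stationery", 100, "Office")])
# 'Stationery' and 'Office' are now stored once per item rather than once overall.
dupes = conn.execute(
    "SELECT COUNT(*) FROM LU_ITEM_DENORM WHERE Subcat_Desc = 'Stationery'").fetchone()[0]
print(dupes)  # 2
```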
Schema designs largely vary in terms of the degree of denormalization they employ. The following are the
commonly used schema designs:
A completely normalized schema does not store any data redundantly. The following illustration shows an example
of a completely normalized schema:
Completely Normalized Schema
• For simplicity, the image above shows only one fact table as part of the schema. However, schemas
generally contain multiple fact tables at different levels.
The lookup tables in this example contain only the IDs and descriptions for their respective attributes as well as the
IDs of the immediate parent attributes. For example, the LU_EMPLOYEE table has only three
columns—Employee_ID, Employee_Name, and Call_Ctr_ID.
This schema design stores the minimal amount of information, but it can require more joins between tables,
depending on the level at which users query data.
A moderately denormalized schema stores some data redundantly. The following illustration shows an example of
a moderately denormalized schema:
Moderately Denormalized Schema
• For simplicity, the image above shows only one fact table as part of the schema. However, schemas generally
contain multiple fact tables at different levels.
The lookup tables in this example contain the IDs and descriptions for their respective attributes as well as the IDs
of all related higher-level attributes within the hierarchy. For example, the LU_EMPLOYEE table has four
columns—Employee_ID, Employee_Name, Call_Ctr_ID, and Region_ID.
This schema design stores the IDs of related higher-level attributes redundantly. However, because it stores this
information redundantly, this design can reduce the number of joins required between tables.
A completely denormalized schema stores the maximum amount of data redundantly. The following illustration
shows an example of a completely denormalized schema:
Completely Denormalized Schema
• For simplicity, the image above shows only one fact table as part of the schema. However, schemas generally
contain multiple fact tables at different levels.
The lookup tables in this example contain the IDs and descriptions for their respective attributes as well as the IDs
and descriptions of all related higher-level attributes within the hierarchy. For example, the LU_EMPLOYEE table
has six columns—Employee_ID, Employee_Name, Call_Ctr_ID, Call_Ctr_Desc, Region_ID, and Region_Desc.
This schema design stores the ID and descriptions of related higher-level attributes redundantly. However, because
it stores so much information redundantly, this design can further reduce the number of joins required between
tables.
In a completely denormalized schema, the lowest-level lookup tables for each hierarchy actually contain all the
information found in the higher-level lookup tables:
Lowest-Level Lookup Tables
The LU_ITEM, LU_EMPLOYEE, and LU_DATE tables contain all the possible information for their respective
hierarchies. The other, higher-level lookup tables do not provide any information that cannot be obtained from
these three tables.
Users can eliminate all the higher-level lookup tables to create a schema that has only one table for each hierarchy,
which is often referred to as a star schema:
Star Schema
A completely denormalized schema can be simplified to a single table per hierarchy. However, there are
advantages to retaining the higher-level lookup tables.
First, depending on how table keys are created in a star schema, higher-level attributes can be browsed more
efficiently using the higher-level lookup tables rather than the consolidated lookup tables.
For example, to create a report that shows a list of categories, the MicroStrategy Engine has to perform a SELECT
DISTINCT query on the LU_PRODUCT table to obtain this result set since this table does not have a distinct list of
categories. The Engine can obtain the same result by performing a simple SELECT on the LU_CATEGORY table.
SELECT DISTINCT queries are much more resource intensive and can negate any performance gains that come
from joining fewer tables.
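The contrast between the two queries can be sketched as follows (invented sample rows; both queries return the same distinct category list, but the first must de-duplicate the item-level table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consolidated star-schema lookup table: one row per item.
conn.execute("CREATE TABLE LU_PRODUCT (Item_ID INT, Subcat_ID INT, Category_ID INT, Category_Desc TEXT)")
conn.executemany("INSERT INTO LU_PRODUCT VALUES (?, ?, ?, ?)",
                 [(1, 10, 100, "Office"), (2, 10, 100, "Office"), (3, 20, 200, "Food")])
# Higher-level lookup table: one row per category.
conn.execute("CREATE TABLE LU_CATEGORY (Category_ID INT PRIMARY KEY, Category_Desc TEXT)")
conn.executemany("INSERT INTO LU_CATEGORY VALUES (?, ?)", [(100, "Office"), (200, "Food")])

# Without LU_CATEGORY, the Engine must de-duplicate the item-level table...
via_distinct = conn.execute(
    "SELECT DISTINCT Category_Desc FROM LU_PRODUCT ORDER BY Category_Desc").fetchall()
# ...whereas the higher-level table already holds a distinct list.
via_lookup = conn.execute(
    "SELECT Category_Desc FROM LU_CATEGORY ORDER BY Category_Desc").fetchall()
print(via_distinct == via_lookup)  # True
```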
Second, the higher-level lookup tables are needed to take advantage of aggregate fact tables in the schema.
For example, consider the aggregate fact table in the following schema:
Using Aggregate Fact Tables
The FACT_SALES table stores dollar and unit sales at the item, employee, and date level. However,
FACT_SALES_REGION is an aggregate fact table that stores the same data at the item, region, and date level.
To report sales by region, the data can be obtained from either the FACT_SALES or FACT_SALES_REGION table. Querying the
FACT_SALES_REGION table is more efficient since the data is already aggregated to the region level. However, after
the MicroStrategy Engine obtains the sales data from the FACT_SALES_REGION table, it has to join to a lookup table
to retrieve the corresponding region descriptions.
If the LU_EMPLOYEE table is used for this join, it does not contain a distinct list of regions; instead, each region is
repeated for every employee in that region. As a result, the Engine would join to each repeated region row
in the LU_EMPLOYEE table, counting the sales multiple times. The result set would show inflated
sales values.
If the LU_REGION table is used for this join, it contains a distinct list of regions, which enables the Engine to perform the
join correctly and produce the correct sales values. Therefore, this higher-level lookup table is necessary to use the
aggregate fact table.
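The fan-out problem described above can be demonstrated concretely (invented sample rows; region 1 has two employees, so joining the aggregate fact to LU_EMPLOYEE double-counts its sales):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Region-level aggregate fact: one row per region.
conn.execute("CREATE TABLE FACT_SALES_REGION (Region_ID INT, Dollar_Sales REAL)")
conn.executemany("INSERT INTO FACT_SALES_REGION VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
# Employee lookup repeats each region once per employee.
conn.execute("CREATE TABLE LU_EMPLOYEE (Employee_ID INT PRIMARY KEY, Region_ID INT, Region_Desc TEXT)")
conn.executemany("INSERT INTO LU_EMPLOYEE VALUES (?, ?, ?)",
                 [(1, 1, "North"), (2, 1, "North"), (3, 2, "South")])
# Region lookup holds a distinct list of regions.
conn.execute("CREATE TABLE LU_REGION (Region_ID INT PRIMARY KEY, Region_Desc TEXT)")
conn.executemany("INSERT INTO LU_REGION VALUES (?, ?)", [(1, "North"), (2, "South")])

# Joining to the non-distinct table double-counts region 1 (two employees):
bad = conn.execute("""
    SELECT e.Region_Desc, SUM(f.Dollar_Sales)
    FROM FACT_SALES_REGION f JOIN LU_EMPLOYEE e USING (Region_ID)
    GROUP BY e.Region_Desc ORDER BY e.Region_Desc
""").fetchall()
print(bad)   # [('North', 200.0), ('South', 50.0)] -- inflated

# Joining to the distinct lookup table returns the true totals:
good = conn.execute("""
    SELECT r.Region_Desc, SUM(f.Dollar_Sales)
    FROM FACT_SALES_REGION f JOIN LU_REGION r USING (Region_ID)
    GROUP BY r.Region_Desc ORDER BY r.Region_Desc
""").fetchall()
print(good)  # [('North', 100.0), ('South', 50.0)]
```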
The following table provides a comparison of the characteristics of each schema type:
Schema Type Comparison
The best structure for any data warehouse is often a combination of schema types. For example, we may choose to
normalize one hierarchy, while completely denormalizing another hierarchy. We may even need to normalize and
denormalize various tables within the same hierarchy to best meet our needs.
When a data warehouse schema is created, it is imperative to consider the following factors that influence the
design of the schema:
• Query performance
• Data volume
• Database maintenance
Compromises and trade-offs that balance each of these factors are integral to the process of designing the schema.
The following topics describe each of these factors in detail.
The structure of the data warehouse schema should take into account the reporting requirements of end users.
Understanding what information users need on reports helps determine the query profile—the types of queries
users will execute and how often they will execute them.
For example, normalizing a particular part of the data warehouse schema may require more joins between tables
to access higher-level attributes. However, if users query that data very infrequently, it may be worthwhile to
normalize the tables. On the other hand, data that users frequently query may be good candidates for
denormalization.
Query Performance
The structure of the data warehouse schema should take into account the type of query performance that needs to
be delivered to users. More granular, detail-level queries involve accessing a greater volume of data, which can
degrade performance. Denormalizing tables can often help speed query performance by reducing the number of
joins between tables.
Data Volume
The structure of the data warehouse schema should take into account the volume of data stored for each attribute
in each hierarchy. Attributes and hierarchies with greater data volumes are often better candidates for
denormalization. On the other hand, those with lower data volumes can often use a more normalized schema
design. In such cases, joining more tables that contain less data may be preferable to significantly increasing the
size of smaller tables.
Database Maintenance
The structure of the data warehouse schema should take into account the effort involved in maintaining whatever
schema design is selected. Often, designs that provide greater flexibility in terms of query performance and detail
also require more work to maintain.
In general, the more denormalized a table is, the more work is required to maintain the table. Because
denormalized schemas store data redundantly, when changes occur, there are simply more tables and columns to
update. One must balance the benefits such denormalization provides against the maintenance burdens it creates.
For example, users may have some tables in a hierarchy that they want to denormalize for performance reasons.
However, they also know that the data in these tables is very volatile and subject to lots of changes. In considering
the maintenance overhead this will create, users may opt to partially denormalize the tables.
Another thing to consider with regard to database maintenance is the complexity of the ETL process for converting
data from source system structures to the structure planned for the data warehouse. The more tables that have to be
rekeyed, combining data from multiple tables or separating data into multiple tables, the more complex the
conversion becomes. At times, the benefits users may realize from these schema changes must be weighed against
the demands of the ETL work required to create them.
Your instructor will provide guidelines for completing this business scenario in groups. After you work on the
business scenario in groups to create a data warehouse schema, the class will review the scenario together and
discuss possible solutions for how to structure the physical schema.
• Creating a data warehouse schema is an exercise that is open to interpretation and variability. As long as the
decisions you make in how you structure the physical schema are in keeping with the characteristics of the
project and database environments described in the overview, you can come up with different “right”
answers for this scenario. The structure of some physical schema designs may be preferred over others
based on implementation requirements, but it is often possible to create several serviceable schema
designs for the same scenario.
Overview
You work for a bank that is building a data warehouse to better analyze its business. The following illustration
shows the logical data model that has been created:
Logical Data Model
Now, the bank wants to use this logical data model to design the data warehouse schema. You have been assigned
the task of creating the schema for the Geography and Account hierarchies. In creating the physical schema, you
need to consider the following factors to select an optimal design for each hierarchy:
• The bank evaluates and makes organizational changes to its regions and districts at least three times each
year, which results in those relationships changing frequently.
• The bank currently has 7 regions, 40 districts, and 1,500 branches. The logical data model must include
everything from the source data.
• The source system database that the bank uses to capture transactions uses a normalized schema.
• Users tend to execute a lot of queries to view data at the branch level. They also execute selected queries to
view data at the region and district levels.
• Users execute a variety of queries at both the division and account levels.
• Detailed, account-level queries often degrade performance in the current reporting system. As part of
designing the data warehouse schema, the bank hopes to better address the performance issues posed by
such queries.
The source system contains the following minimum information for each attribute in the Geography and Account
hierarchies:
For the Employee attribute, the source system also stores each employee’s Social Security Number, home address,
and home phone number.
As you work, remember the four factors to consider in designing a data warehouse schema:
• User reporting requirements
• Query performance
• Data volume
• Database maintenance
Creating a Project
Use the MicroStrategy Connectivity Wizard, MicroStrategy Configuration Wizard, Project Creation Assistant, and
Architect graphical interface to create a new project source called My Tutorial Project Source and a project object
called My Demo Project.
In this set of exercises, configure the metadata, create the project source and database instance in MicroStrategy
Developer, and create the project object. Follow these steps:
Overview
You will use an existing, empty database in MySQL called empty_shell for creating your metadata.
Detailed Instructions
Using the MicroStrategy Connectivity Wizard, create a DSN named My_Tutorial_Metadata to connect to the
empty_shell database in MySQL.
Detailed Instructions
1. To open the MicroStrategy Connectivity Wizard, on the Start menu, point to All Programs, point to
MicroStrategy Tools, and select Connectivity Wizard.
2. To create the DSN for the metadata database, in the MicroStrategy Connectivity Wizard, click Next.
4. Click Next.
6. Click Finish.
• In the TCP/IP Server box, type the IP of Intelligence Server listed in Your Secure Cloud Credentials email.
8. Click Test.
9. Click OK.
Overview
Use the MicroStrategy Connectivity Wizard to create a DSN named TUTORIAL_WH_DSN to connect to the
tutorial_wh database in MySQL.
Detailed Instructions
1. To open the MicroStrategy Connectivity Wizard, on the Start menu, point to All Programs, point to
MicroStrategy Tools, and select Connectivity Wizard.
2. To create the DSN for the warehouse database, in the MicroStrategy Connectivity Wizard, click Next.
4. Click Next.
5. In the second Driver Selection window, select MySQL ODBC Unicode Driver. The ANSI version will function but
may have issues displaying data that has been saved as Unicode characters.
6. Click Finish.
7. A setup window will appear. In the MySQL Setup window, in the Data Source Name box, type
TUTORIAL_WH_DSN.
8. In the TCP/IP Server box, type the IP address of the server provided for this course.
Overview
In this exercise, use the MicroStrategy Configuration Wizard to create the metadata shell in the
My_Tutorial_Metadata database.
Detailed Instructions
1. To open the MicroStrategy Configuration Wizard, on the Start menu, point to All Programs, point to
MicroStrategy Tools, and select Configuration Wizard.
2. On the Welcome Page of the Configuration Wizard, select the first option, "Create Metadata, History List and
Statistics Repository Tables".
3. Click Next.
5. Click Next.
6. In the Repository Configuration: Metadata Tables window, in the DSN drop-down list, select
My_Tutorial_Metadata.
Overview
Connect to the metadata database in MicroStrategy Developer by creating a project source. A project source is a
pointer to a metadata database. It either connects directly to a data source name (DSN) that points to the
appropriate database location (two-tier project source), or it connects to an instance of MicroStrategy Intelligence
Server, which points to the metadata database (three-tier project source).
Use the Project Source Manager in MicroStrategy Developer to create a default two-tier project source named My
Tutorial Project Source that connects directly to the My_Tutorial_Metadata database.
Detailed Instructions
3. Click Add.
5. Click OK.
6. Click OK.
Overview
The following are the naming requirements for the data warehouse connection for a TUTORIAL project:
Use the Database Instances manager in MicroStrategy Developer to create a database instance named TUTORIAL
WH DB. As part of creating the database instance, create a database connection named TUTORIAL WH DB
Connection and a database login named TUTORIAL WH DB Login.
Detailed Instructions
1. To open the Database Instances manager, in Developer, log in to the My Tutorial Project Source.
• Log in using the default credentials for a new project: user name is administrator and there is no password.
• No projects have been created in the project source; a message window should display stating that no
projects were returned for the project source. Click OK.
4. Under Configuration Managers, right-click the Database Instances manager, point to New, and select Database
Instance.
5. To create the database instance, in the Database Instances window, on the General tab, in the Database
instance name box, type TUTORIAL WH DB Instance.
7. To create the database connection, under Database connection (default), click New.
8. In the Database Connections window, on the General tab, in the Database connection name box, type
TUTORIAL WH DB Connection.
11. In the Database Logins window, in the Database login box, type TUTORIAL WH DB Login.
16. In the Database Instances window, click OK. Keep Developer open for the next exercise.
Overview
Create a new project object named My Demo Project in the My Tutorial Project Source.
Detailed Instructions
1. To open the Project Creation Assistant, in Developer, in the My Tutorial Project Source, on the Schema menu,
select Create New Project.
2. To create the project object, in the Project Creation Assistant, click Create project.
3. In the New Project window, type My Demo Project as the project name.
4. Click OK.
• MicroStrategy Architect populates the metadata tables with initial project data. This process takes a few
minutes.
5. When project initialization is complete, you return to the Project Creation Assistant. The Create project step
should have a green check mark next to it.
Architect is a graphical interface that provides a visual, freeform approach to project design. It provides a
consolidated interface for completing various project design tasks, giving you access to many different functions in
one place. Users can work with the following schema objects:
• Tables
• Facts
• Attributes
• User Hierarchies
Create, modify, and remove schema objects or configure various settings for these objects using this interface. You
can perform most project design tasks in the Architect graphical interface. However, there are a few functions that
are not supported by Architect. These include creating project objects, mapping attributes or facts to partitioned
tables, and creating fact extensions.
The Architect graphical interface provides freeform drag and drop and a single dedicated interface for working with
specific types of schema objects. Users can access all the functions available for facts, attributes, or user hierarchies
in their respective object editors. Using this interface to work with schema objects can be useful for learning about
the various components and properties that relate to each type of object.
Architect graphical interface provides an integrated, interactive experience for working on projects. For example,
users can add new tables to a project, add new facts and modify existing facts, create a new attribute, and remove
an existing user hierarchy all within this one interface.
• The Project Creation Assistant is an alternative tool for bulk project creation. It is a collection of wizards that
leads users through the process of creating a project in a very linear, step-by-step manner.
• To access Architect, in Developer, log in to the project source that contains the project you want to modify.
The following image shows the option for accessing Architect on the Schema menu:
Accessing Architect from the Schema Menu
• Users can also access Architect from the Project Creation Assistant when they first create a project.
MicroStrategy Architect permits multiple users to access schema objects simultaneously by providing two modes,
Read Only mode and Edit mode.
The Read Only mode allows users to view schema object definitions without locking the schema for other users.
Users in Read Only mode cannot make changes to schema objects. Additionally, users in Read Only mode cannot
access schema editors that require the ability to make updates to the project.
The Edit mode provides all capabilities for modifying schema objects and locks the schema from being modified by
all other users. Therefore, only one user at a time can open a project in Edit mode, which prevents schema
inconsistencies.
For detailed information on the Architect tool, refer to the Project Design Guide product manual.
Walk through the process of creating a MicroStrategy schema using the Architect tool.
Schema Creation Workflow
Once the project object has been created and a database instance has been linked to it, a MicroStrategy architect
can open the Architect tool to create schema objects. There are four steps to creating the schema:
• Add tables
• Create facts
• Create attributes and relationships
• Create user hierarchies
The first task in the project creation workflow is to add tables to the MicroStrategy project. A data warehouse can
contain any number of tables, but only the tables actually selected for use in a project, called project tables, are
used by MicroStrategy in the reporting environment. It effectively ignores any other tables that are present.
Users can choose to include all the tables in the data warehouse in a project. Alternatively, they can select only a
subset of tables. The tables selected for the project are the tables users can access when executing reports.
When you select a physical table from the data warehouse to include in a project, MicroStrategy Architect
automatically creates a corresponding logical table in the metadata. For example, if a user chooses to include the
LU_CUSTOMER table in the project, Architect creates a corresponding LU_CUSTOMER logical table in the metadata.
Physical tables store the actual data, and the MicroStrategy Engine executes report SQL against these tables to
obtain result sets. Logical tables store information about their corresponding physical tables, including table
names, column names and data types, and schema objects associated with the table columns. The Engine uses the
logical tables to generate the appropriate report SQL.
The following illustration shows the physical and logical tables for the LU_CUST_STATE table:
Physical and Logical Tables—LU_CUST_STATE
The LU_CUST_STATE physical table contains the CUST_STATE_ID, CUST_STATE_NAME, and CUST_REGION_ID
columns. The corresponding LU_CUST_STATE logical table has both a physical and a logical view. The physical view
displays these same columns and their data types. The logical view shows the attributes and attribute forms in the
project that are mapped to the table columns.
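A minimal sketch of this distinction can be made with SQLite: the physical table holds the data, while the information a logical table records about it (table name, column names, and data types) can be read from the database catalog. The attribute-form mappings in the comments are hypothetical, since logical-view mappings live in the MicroStrategy metadata, not in the database itself:

```python
import sqlite3

# Physical table from the example. The LU_CUST_STATE logical table in the
# metadata would record this table's name, columns, and data types.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE LU_CUST_STATE (
    CUST_STATE_ID   INTEGER PRIMARY KEY,  -- hypothetically mapped to the Customer State ID form
    CUST_STATE_NAME TEXT,                 -- hypothetically mapped to the Customer State DESC form
    CUST_REGION_ID  INTEGER               -- hypothetically mapped to the Customer Region ID form
)""")

# The "physical view" information a logical table captures: column names and types.
physical_view = [(row[1], row[2]) for row in
                 conn.execute("PRAGMA table_info(LU_CUST_STATE)")]
print(physical_view)
# [('CUST_STATE_ID', 'INTEGER'), ('CUST_STATE_NAME', 'TEXT'), ('CUST_REGION_ID', 'INTEGER')]
```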
Layers enable you to organize project tables into groupings based on the logical data model. This lets users focus
on just the tables they need to use for a particular task. For example, they can create a layer that contains only the
fact tables needed to create a particular set of facts, or they can create a layer that contains only the lookup and
relationship tables needed to create the attributes that belong to a particular hierarchy or dimension.
Based on the project creation workflow, users create individual layers when they are ready to create the associated
facts or attributes. However, since the layers functionality is based on tables, users will learn about this concept
while working with tables and use it later in this course as they create facts and attributes.
All project tables have properties that can be viewed in the Properties pane. You can also modify many of these
properties. The following table lists these properties and their descriptions:
Properties for Project Tables
The Mapped Attributes and Mapped Facts categories display only if attributes or facts are mapped to a table.
• On the Project Tables View tab, select the table for which you want to modify properties, or
• In the Properties pane, click the Tables tab. On the Tables tab, in the drop-down list, select the table for
which you want to modify properties.
In this set of exercises, add fact and lookup tables to the My Demo Project that you created in the previous
exercises.
3. Click Options.
5. Click Settings.
7. In the upper text dialog, set the last value as TABLE_SCHEMA = ‘tutorial_wh’.
8. Click OK.
9. Click OK.
Overview
In this exercise, use the Architect graphical interface to add fact tables to the My Demo Project. Select Tutorial Data
as the database instance for the project. Before performing any task in the project, disable automatic column
recognition and automatic relationship recognition, and enable automatic metric creation. Next, create a Fact
Tables layer. Finally, add the following tables to the Fact Tables layer:
After adding these fact tables, continue to the next exercise to add lookup tables to the project.
Detailed Instructions
1. To launch the MicroStrategy Architect for the My Demo Project, in the My Tutorial Project Source, open the My
Demo Project.
3. If you see the Read Only window, ensure that Edit: This will lock all schema objects in this project from other
users is selected and click OK.
4. To select Tutorial Data as the database instance, in the Warehouse Database Instance window, in the
drop-down list, select Tutorial Data.
5. Click OK.
6. To disable automatic column recognition and relationship recognition, in the Architect graphical interface, click
the Design tab.
7. On the Design tab, in the Auto Recognize section, click the arrow button to launch the Automatic Schema
Recognition window.
8. In the Automatic Schema Recognition window, under Automatic column recognition, click Do not auto
recognize.
9. Under Recognize Relationships, ensure that Do not automatically create relations is selected.
11. To enable automatic metric creation, in the Architect graphical interface, click the Architect button.
13. In the MicroStrategy Architect Settings window, click the Metric Creation tab.
14. On the Metric Creation tab, check the Sum check box.
16. To create a Fact Tables layer, in the Architect graphical interface, click the Project Tables View tab.
18. On the Home tab, in the Layer section, click Create New Layer.
19. In the MicroStrategy Architect window, type Fact Tables as the layer name.
21. To add fact tables to the project, in the Warehouse Tables pane, expand the Tutorial Data database instance to
view all the tables.
22. Drag the following tables to the Fact Tables layer on the Project Tables View tab:
23. On the Home tab, in the Auto Arrange Table Layout section, click Regular to arrange the tables.
25. If a Change Comments window displays, click Do not show this screen in the future. Then click OK.
Overview
In this exercise, use the Architect graphical interface to add lookup tables to the My Demo Project. First, create the
Geography and Time layers. Next, add the appropriate tables to the respective layers.
After these lookup tables have been added, save and update the project schema.
Detailed Instructions
1. On the Home tab, in the Layers section, click Create New Layer.
3. You need to ensure that you do not have any tables in the current layer selected before clicking the Create New
Layer button. If you have tables selected, they are automatically included in the new layer.
5. To add lookup tables to the project, in the layers drop-down list, select the Geography layer.
6. Drag the following tables in the Warehouse Tables pane to the Geography layer on the Project Tables View tab:
7. On the Home tab, in the Auto Arrange Table Layout section, click Regular to arrange the tables.
9. Drag the following tables in the Warehouse Tables pane to the Time layer on the Project Tables View tab:
10. On the Home tab, in the Auto Arrange Table Layout section, click Regular to arrange the tables.
11. Next, to save and update the project schema, on the Home tab, in the Save section, click Save and Update
Schema.
12. The Change Comments window may appear when you update schema. To prevent this window from
appearing, select the Do not show this screen in the future check box and click OK.
13. In the Update Schema window, ensure the following check boxes are selected:
The second task in the project creation workflow is to create the facts for the MicroStrategy project. Create facts
based on the logical data model and map them to columns in the data warehouse schema. Then use facts to define
metrics. As such, the fact schema object serves as a bridge between fact values stored in the data warehouse and
the metrics users want to see on MicroStrategy reports. Facts point to the appropriate physical columns in the data
warehouse, while metrics perform aggregations on those columns.
The existence of facts at the application level enables you to create a layer of abstraction between the underlying
structure of your data warehouse and the metrics users require on reports. Consider the following example:
Sales Fact Layer of Abstraction
In the data warehouse, the Sales fact exists in both the FACT_SALES and FACT_SALES_AGG tables. However, it is
stored in these tables using different column names—Dollar_Sales and Revenue. Depending on which table you
need to query for a report, you either need to aggregate the Dollar_Sales column or the Revenue column. The
optimal table to use for a query varies based on the attributes used in the report.
Define the Sales fact to map to both columns in both tables. That way, the MicroStrategy Engine can use a single
fact to access either table. Use this Sales fact to define the Sales metric. Because of the existence of the Sales fact,
users can access any table that contains sales data using a single metric. Without the Sales fact, users would need to
create two metrics—one defined as SUM(Dollar_Sales) and one defined as SUM(Revenue). Then, users would have
to know which metric to use on a report to access the appropriate table.
The Sales fact creates a layer of abstraction that makes the discrepancies in how the columns are named in the data
warehouse transparent to users. For users, sales are sales, regardless of the column name used to identify them.
Report developers and end users do not need to understand the structure of the data warehouse.
The data warehouse can contain fact columns with different names that store the same data.
• You do not have to resolve discrepancies in the data warehouse to make reporting on such data seamless
for users.
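The abstraction described above can be sketched as a mapping from one logical fact to differently named columns. The following is a simplified illustration, not the metadata structure MicroStrategy actually uses, built on the FACT_SALES and FACT_SALES_AGG tables from the example:

```python
import sqlite3

# Minimal sketch of the abstraction a fact provides: one logical "Sales" fact
# mapped to differently named columns in the two tables from the example.
SALES_FACT = {"FACT_SALES": "Dollar_Sales", "FACT_SALES_AGG": "Revenue"}

def sales_metric_sql(table):
    """Generate SUM() SQL for the Sales metric against whichever table is chosen."""
    return f"SELECT SUM({SALES_FACT[table]}) FROM {table}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FACT_SALES (Dollar_Sales REAL)")
conn.execute("CREATE TABLE FACT_SALES_AGG (Revenue REAL)")
conn.executemany("INSERT INTO FACT_SALES VALUES (?)", [(100.0,), (250.0,)])
conn.execute("INSERT INTO FACT_SALES_AGG VALUES (350.0)")

# One metric definition, either table: users never see the column-name discrepancy.
for table in SALES_FACT:
    (total,) = conn.execute(sales_metric_sql(table)).fetchone()
    assert total == 350.0
```

Without the shared fact definition, the two SUM expressions would have to be maintained as two separate metrics, which is exactly the situation the Sales fact avoids.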
Types of Facts
Before you learn how to create facts in MicroStrategy Architect, you need to understand the different types of facts
and fact expressions that exist. There are two primary types of facts:
• Homogeneous
• Heterogeneous
The following topics describe both of these types of facts in more detail.
Homogeneous Facts
Facts are associated with columns in data warehouse tables, and you can map the same fact to any number of
tables. A homogeneous fact is one that points to the same column or set of columns in every table to which it maps.
The Sales fact maps to three different tables—ITEM_SALES, CATEGORY_SALES, and CUSTOMER_SALES. However, it
is a homogeneous fact because it maps to the same Dollar_Sales column in each table.
Heterogeneous Facts
Whereas a homogeneous fact always maps to the same column or set of columns, a heterogeneous fact is one that
points to two or more different columns or sets of columns in the tables to which it maps.
The Sales fact maps to three different tables—ITEM_SALES, CATEGORY_SALES, and CUSTOMER_SALES. However, in
the first two tables, it maps to the Dollar_Sales column, but in the third table, it maps to the Revenue column.
Therefore, it is a heterogeneous fact because it maps to two different columns.
A fact expression consists of a column or set of columns to which the fact maps. All facts have at least one
expression. However, facts can have any number of expressions. There are two primary types of fact expressions:
• Simple
• Derived
The following topics describe both of these types of fact expressions in more detail.
A simple fact expression is one that maps directly to a single fact column. It can map to that same column for any
number of tables.
The Sales fact maps directly to the Dollar_Sales column in the ITEM_SALES table, creating a simple fact expression.
A derived fact expression is one that maps to an expression from which the fact values are obtained. A derived fact
expression can contain multiple fact columns from the same table, mathematical operators, numeric constants,
and various functions.
If you want to combine fact columns from different database tables in a derived fact expression, you have to create
a logical view.
MicroStrategy provides a variety of out-of-the-box functions that you can use in defining fact expressions,
including pass-through functions that enable you to pass SQL statements directly to the data warehouse.
The Sales fact maps to an expression that combines the Unit_Price and Quantity_Sold columns in the ITEM_SALES
table, creating a derived fact expression.
MicroStrategy Architect enables you to create derived fact expressions at the application level. However, you can
also store derived fact columns at the database level. The advantage of using derived fact columns in the data
warehouse is that the calculation is performed ahead of time during the ETL process and the result is stored as a
single column. This method translates into simpler report SQL and better performance. If you implement a derived
fact expression at the application level, the calculation has to be performed each time you process a query that
uses that fact.
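The difference can be sketched with the Unit_Price and Quantity_Sold example. In this hypothetical SQLite version, the application-level derived fact expression multiplies the columns at query time, while the database-level approach stores the product as a Dollar_Sales column during ETL, so the report SQL is a plain SUM over one column:

```python
import sqlite3

# Sketch contrasting a derived fact expression (computed at query time) with a
# derived column materialized during ETL. Sample values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ITEM_SALES
                (Unit_Price REAL, Quantity_Sold INTEGER, Dollar_Sales REAL)""")

rows = [(5.0, 3), (2.5, 4)]
# ETL step: precompute Unit_Price * Quantity_Sold and store it as Dollar_Sales.
conn.executemany("INSERT INTO ITEM_SALES VALUES (?, ?, ?)",
                 [(p, q, p * q) for p, q in rows])

# Application-level derived fact expression: the multiplication runs per query.
(derived,) = conn.execute(
    "SELECT SUM(Unit_Price * Quantity_Sold) FROM ITEM_SALES").fetchone()

# Database-level derived column: simpler SQL, work already done during ETL.
(stored,) = conn.execute("SELECT SUM(Dollar_Sales) FROM ITEM_SALES").fetchone()

assert derived == stored == 25.0
```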
Creating Facts
The Architect graphical interface enables you to create facts using different methods. The optimal method varies,
depending on your project environment and the characteristics of individual facts. The following topics describe
various aspects of fact creation.
Before creating facts, users need to organize the fact tables in a project using layers. In some instances, they may
have a small number of fact tables that they can include in a single layer. In other cases, users may want to create
layers for different types of facts.
Use any method for determining which layers to create and what fact tables to include in each layer. However, the
layers should align with how users plan to create facts. That way, you can focus only on pertinent tables as you
create facts or sets of facts.
There are two primary means of creating facts in the Architect graphical interface.
Manual fact creation means that users create facts on their own, deciding which columns to use and defining the
appropriate facts. This method gives users precise control over which facts are created and how they are defined.
However, it does require that they create each fact one by one.
With automatic fact creation, users allow Architect to identify and create the appropriate facts. Architect uses
automatic column recognition, based on a set of heuristics, to determine which columns are facts. After the facts
are created, users can modify them as necessary. This method provides a quick and easy way to create facts that
minimizes the amount of work required. However, if this method is used, verify that Architect identifies the
right columns as facts and creates all of the desired facts correctly. Otherwise, users may end up with fewer facts
than required, extra facts they do not need, or incorrectly defined facts.
The method chosen depends on the physical structure of the tables and columns in the data warehouse and the
requirements that users have for facts. If the project environment lends itself to creating facts automatically, it can
be a tremendous timesaver. However, users must carefully consider the characteristics of their environment. The
more exceptions or anomalies there are for a particular fact, the more likely that it is a better candidate for manual
creation.
The remainder of this lesson describes how to create facts manually. Creating facts automatically requires users to
understand how automatic column recognition works, including the heuristics it uses to determine which columns
are facts. You will learn about these concepts later in this course.
Use the Architect graphical interface to create simple or derived fact expressions. Users can also create
heterogeneous facts that have multiple expressions.
After you map a column in a table to a fact, that column no longer displays in the table as an available column.
However, you may need to reuse that same column as an expression for another fact. For example, consider the
following scenario:
Reusing Fact Columns
In this example, there are two facts that both use the TOT_DOLLAR_SALES column in their definitions. If you first
create the Profit fact, you use both the TOT_DOLLAR_SALES and GROSS_DOLLAR_SALES columns in its expression.
As a result, neither of these columns would display in the table as available columns.
Overview
In this exercise, use the Architect Graphical interface to manually create facts in the My Demo Project. Create the
following facts:
After creating these facts, save and update the project schema.
Before creating facts, make sure to set the metric creation setting in the Architect Settings.
This will enable you to automatically create simple metrics for any facts that you create in the Architect graphical
interface. These settings enable you to select the aggregation functions you want to use in creating these metrics.
Typically, Sum is selected as an aggregation function.
Detailed Instructions
1. To open the My Demo Project in Architect, in the My Tutorial Project Source, open the My Demo Project in the
Architect graphical interface.
2. In the Architect graphical interface, click the Project Tables View tab.
3. On the Home tab, in the Layer section, in the layers drop-down list, select the Fact Tables layer.
4. On the Project Tables View tab, right-click the header of any table that contains the TOT_COST column and
select Create Fact.
5. In the MicroStrategy Architect window, in the box, type Cost as the fact name.
6. Click OK.
7. In the Create New Fact Expression window, define the fact expression as TOT_COST.
• Alternatively, you can create these facts by right-clicking the appropriate column in the table and selecting
Create Facts. Because this method defaults the fact name to the mapped column name, make sure you
rename the facts when appropriate. To rename a fact, right-click it and select Rename. In the MicroStrategy
Architect window, in the box, type the fact name, and click OK.
12. On the Home tab, in the Save section, click Save and Update Schema.
13. In the Update Schema window, ensure the following check boxes are selected:
The third task in the project creation workflow is to create the attributes for the MicroStrategy project. Create
attributes based on your logical data model and map them to columns in the data warehouse physical schema.
Define any direct relationships between attributes. Then, use attributes as components in reports.
On templates, attributes enable you to describe metrics at various levels. For example, the following illustration
shows two reports with the same metric aggregated to different levels, based on the attributes in their respective
templates:
Attributes and Metric Aggregation
The total for the Profit metric on each report is the same. However, the first report displays profit by year and
category since those are the attributes on the template and they are not directly related. The second report
displays profit by call center only since the Region and Call Center attributes on the template are directly related
and call centers represent the lower level.
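The behavior described above can be sketched with plain SQL against a hypothetical profit fact table (the table name and values are illustrative): the attributes on the template become the GROUP BY columns, so the level of detail changes while the grand total stays the same:

```python
import sqlite3

# Sketch of how template attributes determine the level of metric aggregation.
# The fact table and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FACT_PROFIT (Year INTEGER, Category TEXT, Profit REAL)")
conn.executemany("INSERT INTO FACT_PROFIT VALUES (?, ?, ?)", [
    (2011, 'Books', 100.0), (2011, 'Music', 50.0),
    (2012, 'Books', 200.0), (2012, 'Music', 150.0),
])

# Report 1: Profit by Year and Category (both attributes on the template).
by_year_cat = conn.execute(
    "SELECT Year, Category, SUM(Profit) FROM FACT_PROFIT "
    "GROUP BY Year, Category").fetchall()

# Report 2: Profit by Year only; same metric, higher level of aggregation.
by_year = conn.execute(
    "SELECT Year, SUM(Profit) FROM FACT_PROFIT GROUP BY Year").fetchall()

# The grand total is identical on both reports; only the detail level differs.
total1 = sum(p for *_, p in by_year_cat)
total2 = sum(p for _, p in by_year)
assert total1 == total2 == 500.0
```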
• You do not use attributes to directly define the level of aggregation for a specific metric; the level is
determined by the attributes placed on the report template.
• In filters, attributes enable you to qualify the result set of a report based on attribute data. For example, the
following illustration shows two reports with the same template that return different result sets based on
the attributes in their respective filters:
Attributes and Qualification
The attributes on each report are the same. However, the first report displays profit for all years only for the Books
and Music categories since those Category attribute elements are in the filter. The second report displays profit for
all categories only for 2012 since that Year attribute element is in the filter.
As part of creating attributes in a MicroStrategy project, you create forms for each attribute. Attribute forms enable
you to display different types of descriptive information about an attribute. You can use attribute forms to display
the ID for an attribute along with any number of description fields. It is the attribute forms that actually map
directly to the columns in the data warehouse.
All attributes have an ID form, and most also have a primary description form. Some attributes have multiple
description forms, so you can display various aspects of that attribute on reports. For example, you may store the
following information about customers in the data warehouse:
Customer Information
The LU_CUSTOMER table contains a variety of information about each customer that may be useful to view or
analyze in the reporting environment, including the following:
• ID
• Name
• Home phone
• Home address
• Birth date
• Gender
• Income
In deciding what forms to create for an attribute, consider what users intend to do with this customer information
and whether the unique characteristics of attribute forms will meet those needs.
First, attribute forms must have a one-to-one relationship with other forms of the same attribute. For example, if a
customer has more than one email address, then the relationship between Email and Customer is not one to one.
In this case, if you want to display all the email addresses for a single customer, you have to create Email as a
separate attribute rather than an attribute form of Customer.
If you create attribute forms that have a one-to-many relationship with the attribute they describe, remember that
only the first element displays on reports.
Second, you can use attribute forms for the following functions:
• Display—Display different forms of an attribute on reports or in the Data Explorer when you browse the attribute.
• Sorting—Sort report data using attribute forms. You can sort using any attribute form, not just those displayed on a report.
• Qualification—Qualify on the elements of any attribute form, not just those displayed on a report.
If all users need to do is display, sort, or qualify on descriptive data, creating an attribute form for that data is
sufficient. However, if users need to be able to aggregate metrics based on descriptive data, you need to create it as
a separate attribute rather than an attribute form.
Because ID forms are the unique identifiers for attributes, they are included in the GROUP BY clause when
aggregating metrics on reports.
For example, you probably do not need to aggregate revenue data for customers based on their name, home
address, home phone, email, or birth date. However, you may want to aggregate revenue data based on their
gender or income. Therefore, you would probably want to create Gender and Income as separate attributes rather
than attribute forms.
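A hedged sketch (with invented lookup and fact tables) shows why Gender must be its own attribute to support aggregation. Because Gender is modeled as a separate attribute, its ID column can participate in the GROUP BY clause, while its description form is joined in for display only:

```python
import sqlite3

# Hypothetical tables and values for illustration only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE LU_CUSTOMER (Customer_ID INT, Customer_Name TEXT, Gender_ID INT);
    CREATE TABLE LU_GENDER   (Gender_ID INT, Gender_Desc TEXT);
    CREATE TABLE FACT_ORDERS (Customer_ID INT, Revenue REAL);
    INSERT INTO LU_CUSTOMER VALUES (1, 'Ada', 1), (2, 'Bob', 2), (3, 'Cam', 2);
    INSERT INTO LU_GENDER   VALUES (1, 'Female'), (2, 'Male');
    INSERT INTO FACT_ORDERS VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")

# The ID form (Gender_ID) drives the GROUP BY; the description form
# (Gender_Desc) is retrieved alongside it for display.
rows = con.execute("""
    SELECT g.Gender_Desc, SUM(f.Revenue)
    FROM FACT_ORDERS f
    JOIN LU_CUSTOMER c ON f.Customer_ID = c.Customer_ID
    JOIN LU_GENDER   g ON c.Gender_ID   = g.Gender_ID
    GROUP BY g.Gender_ID, g.Gender_Desc
    ORDER BY g.Gender_ID
""").fetchall()
print(rows)  # [('Female', 10.0), ('Male', 50.0)]
```

If Gender were only a form of Customer, grouping would still occur on Customer_ID, and revenue could not be rolled up by gender.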
The ability to create multiple forms for an attribute provides flexibility in reporting. Users can choose to display
different forms for an attribute on different reports. They can view as many or as few forms as they want.
The following image shows a report with various attribute forms displayed for the Customer attribute:
Customer Attribute Forms
This report displays each customer’s ID, last name, first name, address, and email all under the Customer attribute
header.
Types of Attributes
Before learning how to create attributes in MicroStrategy Architect, you need to understand the different types of
attributes and attribute form expressions. There are three primary types of attributes:
• Compound
• Homogeneous
• Heterogeneous
The following topics describe each of these types of attributes in more detail.
Compound Attributes
A compound attribute is one that uses two or more columns as its ID. Therefore, its ID form maps to a combination
of columns. These attributes require a compound primary key in the data warehouse to uniquely identify their
elements.
The Item attribute has two different columns that comprise its primary key in the LU_ITEM table—Item_ID and
Color_ID. Therefore, it is a compound attribute since you have to map its ID form to the combination of these two
columns.
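A compound primary key like the one described above can be sketched in SQL. The Item_ID and Color_ID column names come from the text; the Item_Desc column and the sample values are invented for illustration:

```python
import sqlite3

# Sketch of LU_ITEM with a compound primary key.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE LU_ITEM (
        Item_ID   INTEGER,
        Color_ID  INTEGER,
        Item_Desc TEXT,
        PRIMARY KEY (Item_ID, Color_ID)
    )
""")
con.execute("INSERT INTO LU_ITEM VALUES (1, 10, 'Shirt - Red')")
con.execute("INSERT INTO LU_ITEM VALUES (1, 20, 'Shirt - Blue')")  # same Item_ID is fine

# Neither column alone identifies an element, but re-inserting the same
# (Item_ID, Color_ID) combination violates the compound key.
duplicate_rejected = False
try:
    con.execute("INSERT INTO LU_ITEM VALUES (1, 10, 'Duplicate')")
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = con.execute("SELECT COUNT(*) FROM LU_ITEM").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

This is why the attribute's ID form must map to the combination of both columns: only together do they uniquely identify an element.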
Homogeneous Attributes
Associate the forms of an attribute to columns in data warehouse tables. The same attribute form can be mapped
to any number of tables. A homogeneous attribute is one where each attribute form points to the same column or
set of columns in every table to which it maps.
The illustration above shows an example where the ID for an attribute occurs in multiple tables. Other attribute
forms can also exist in multiple tables, although this scenario most frequently occurs with ID forms.
The ID form for the Region attribute maps to three different tables—LU_REGION, LU_CALL_CTR, and
REGION_SALES. However, Region is a homogeneous attribute because its ID form maps to the same Region_ID
column in each table.
Heterogeneous Attributes
Whereas each form for a homogeneous attribute always maps to the same column or set of columns, a
heterogeneous attribute is one where at least one attribute form points to two or more different columns or sets of
columns in the tables to which it maps.
Heterogeneous Attribute
The illustration above shows an example where the ID for an attribute occurs in multiple tables. Other attribute
forms can also exist in multiple tables, although this scenario most frequently occurs with ID forms.
The ID form for the Region attribute maps to three different tables—LU_REGION, LU_CALL_CTR, and
REGION_SALES. In the first two tables, it maps to the Region_ID column, but in the third table, it maps to the
Reg_ID column. Therefore, Region is a heterogeneous attribute because its ID form maps to two different columns.
All attributes have one or more attribute forms. The attribute forms directly map to columns in the data warehouse.
An attribute form expression consists of a column or set of columns to which an attribute form maps. All attribute
forms have at least one expression. However, attribute forms can have any number of expressions. There are two
primary types of attribute form expressions:
• Simple
• Derived
The following topics describe both of these types of attribute form expressions in more detail.
A simple attribute form expression is one that maps directly to a single attribute column. It can map to that same
column for any number of tables.
The ID form for the Days to Ship attribute maps directly to the Days_to_Ship column in the LU_SHIPMENT table,
creating a simple attribute form expression.
A derived attribute form expression is one that maps to an expression from which the attribute form values are
obtained. A derived attribute form expression can contain multiple attribute columns from the same table,
mathematical operators, constants, and various functions.
If you want to combine attribute columns from different database tables in a derived attribute form expression, you
have to create a logical view. For information on logical views, see the MicroStrategy Advanced Data Warehousing
course.
MicroStrategy provides a variety of out-of-the-box functions that you can use in defining attribute form
expressions, including pass-through functions that enable you to pass SQL statements directly to the data
warehouse.
The ID form for the Days to Ship attribute maps to an expression that combines the Ship_Date and Order_Date
columns in the LU_SHIPMENT table, creating a derived attribute form expression.
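A hedged sketch of such a derived expression follows. The LU_SHIPMENT table and its Ship_Date and Order_Date columns come from the text, but the sample rows are invented, and the exact date-arithmetic syntax depends on your database (SQLite, used here, requires the julianday() function):

```python
import sqlite3

# Hypothetical LU_SHIPMENT rows for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE LU_SHIPMENT (Order_Date TEXT, Ship_Date TEXT)")
con.executemany("INSERT INTO LU_SHIPMENT VALUES (?, ?)",
                [("2012-03-01", "2012-03-04"),
                 ("2012-03-02", "2012-03-02")])

# The derived form expression (Ship_Date - Order_Date) is evaluated in SQL
# each time a report uses the form.
days = [row[0] for row in con.execute("""
    SELECT CAST(julianday(Ship_Date) - julianday(Order_Date) AS INT) AS Days_to_Ship
    FROM LU_SHIPMENT
""")]
print(days)  # [3, 0]
```

Precomputing this value into a Days_to_Ship column during ETL, as the note below discusses, replaces the expression with a simple column reference in report SQL.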
• MicroStrategy Architect enables you to create derived attribute form expressions at the application level.
However, you can also store derived attribute columns at the database level. The advantage of using
derived attribute columns in the data warehouse is that the calculation is performed ahead of time during
the ETL process and the result is stored as a single column. This method translates into simpler report SQL
and better performance. If you implement a derived attribute form expression at the application level, the
calculation has to be performed each time you process a query that uses that attribute form.
Creating Attributes
The Architect graphical interface enables you to create attributes using different methods. The optimal method
varies, depending on your project environment and the characteristics of individual attributes. The following topics
describe various aspects of attribute creation.
Before creating attributes, you need to create layers to organize the lookup and relationship tables in your
project. The best method for creating these layers is to group together tables from the same hierarchy or
dimension. This strategy aligns the layers with how you create attributes. That way, you can focus only on pertinent
tables as you create each group of related attributes.
As you create each hierarchy or dimension layer, you can simultaneously create the corresponding attributes and
their relationships.
As with facts, there are two primary means of creating attributes in the Architect graphical interface:
• Manual attribute creation means that you create attributes on your own, deciding which columns to use
and defining the appropriate attributes and their attribute forms.
• With automatic attribute creation, you allow Architect to identify and create the appropriate attributes and
their attribute forms.
• You can also use the Attribute Creation Wizard to create multiple attributes.
Creating attributes automatically requires understanding of how automatic schema recognition works, including
the rules it uses to determine which columns are attributes or forms.
There are two ways to create attributes manually: map an attribute to the same column in all project tables, or map
an attribute to a column only for selected project tables. Create attributes by first mapping the ID form, and then
add other forms for the attribute as needed.
1. On the Project Tables View tab, find the project table you want to use as the primary lookup table for the
attribute.
Architect automatically designates the table you use to create the attribute as the primary lookup table. You
can change the primary lookup table at a later point.
2. Right-click the header of the project table and select Create Attribute.
3. In the MicroStrategy Architect window, in the box, type a name for the attribute.
4. Click OK.
5. In the Create New Form Expression window, define the ID form expression.
6. Click OK.
• To ensure the available columns in the table are visible, right-click the table, point to Properties, point to
Logical View, and select Display Available Columns.
The following image shows the Region ID form being created in the Create New Form Expression window using the
automatic mapping method:
Create New Form Expression Window—Region Attribute
The following image shows the Region attribute defined for the REGION_ID column in the LU_REGION and
LU_CALL_CTR tables:
Project Tables View Tab—Region Attribute
The following image shows the Region attribute displayed in the Properties pane:
The ID form for the Region attribute maps to the REGION_ID column for both the LU_REGION and LU_CALL_CTR
lookup tables.
To create an attribute with a simple form expression and map it to a column only for selected project tables:
1. On the Project Tables View tab, right-click the header of the project table to be used as the primary lookup table
for the attribute and select Create Attribute.
Architect automatically designates the table you use to create the attribute as the primary lookup table. The
primary lookup table can be changed at a later point.
2. In the MicroStrategy Architect window, in the box, type a name for the attribute.
3. Click OK.
4. In the Create New Form Expression window, define the ID form expression.
5. Select Manual as the mapping method.
6. Click OK.
This action creates an attribute with an ID form that maps to that column only for the selected project table. The
new attribute automatically displays in the Properties pane.
7. To map the ID form for the attribute to additional tables, in the Properties pane, under the Form 1: ID category,
select the ID property.
8. Click the Browse button.
9. In the Modify Attribute Form window, on the Definition tab, in the Source tables list, select any other tables to
which you want to map the ID form.
The following image shows the ID form for the Call Center attribute being created in the Create New Form
Expression window using the manual mapping method:
Create New Form Expression Window—Call Center Attribute
The following image shows the Call Center attribute displayed in the Properties pane:
Observe that the ID form for the Call Center attribute maps to the CALL_CTR_ID column only for the LU_CALL_CTR
lookup table. Other tables that include this column are not part of the attribute form definition.
1. On the Project Tables View tab, right-click the header of the project table to be used as the primary lookup table
for the attribute and select Create Attribute.
Architect automatically designates the table you use to create the attribute as the primary lookup table. The
primary lookup table can be changed at a later point.
2. In the MicroStrategy Architect window, in the box, type a name for the attribute.
3. Click OK.
4. In the Create New Form Expression window, define the ID form expression as desired.
Use the toolbar above the box to insert parentheses, mathematical operators, or other functions. Information can
also be typed directly in the box.
5. Click OK.
The following image shows the Create New Form Expression window with a derived expression for the Days to Ship
attribute:
Create New Form Expression Window—Days to Ship Attribute with Derived Form Expression
1. On the Project Tables View tab, right-click the header of the project table you want to use as the primary lookup
table for the attribute and select Create Attribute.
2. In the MicroStrategy Architect window, in the box, type a name for the attribute.
3. In the Create New Form Expression window, define the ID form expression as desired.
4. Click OK.
5. On the Project Tables View tab, right-click the attribute form you just created and select Edit.
6. In the Modify Form Expression window, click OK to return to the Modify Attribute Form window.
7. In the Modify Attribute Form window, on the Definition tab, click New.
8. In the Create New Form Expression window, in the Source table list, select the source table for the new
expression.
The following image shows the Modify Attribute Form window with multiple expressions for the ID form of the
heterogeneous Day attribute:
Modify Attribute Form—Heterogeneous Attribute
In the Architect graphical interface, you create a compound attribute by assigning two or more attribute forms to
the ID form category. When more than one attribute form is assigned to the same form category, a form group is
created.
1. On the Project Tables View tab, right-click the header of the project table you want to use as the primary lookup
table for the attribute. Then, select Create Attribute.
2. In the MicroStrategy Architect window, in the box, type a name for the attribute.
3. In the Create New Form Expression window, define the ID form expression as desired.
4. Click OK.
5. On the Project Tables View tab, select the column you want to add to the attribute you just created.
If the column is already used in another attribute, you must right-click the attribute and add a new attribute form.
In the Properties pane, for the new form, change the Category to ID. In the message window that prompts you to
create a form group, click Yes.
1. In the Properties pane, modify the Category property of the new description form to ID.
2. In the message window that prompts you to create a form group, click Yes.
3. In the Properties pane, modify the Name property of the new form to ID.
There is no need to use “ID” as the form group name. However, this naming convention is the easiest for users to
understand.
The following image shows the message window that prompts you to create a form group:
The following image shows a table with Distribution Center as a compound attribute:
Project Table—Distribution Center Compound Attribute
The following image shows the Properties pane for the Distribution Center attribute with form group:
Properties Pane of a Compound Attribute—Distribution Center
After mapping an ID column in a table to an attribute, that column no longer displays in the table as an available
column. However, you may need to reuse the same column as an expression for another attribute. For example,
consider the following scenario:
Reusing Attribute Columns
In this example, two attributes use the STATE_ID column in their definitions. If you first create the Store State
attribute, you use the STATE_ID column in its ID attribute form expression. As a result, this column would not
display in the table as an available column.
Now, if you want to create the Customer State attribute, which also requires the STATE_ID column, you can no
longer access the column directly from the table. You cannot right-click a used column and create an attribute in
the Architect graphical interface.
Reusing an attribute column is only an issue for ID columns since you cannot right-click columns to create other
types of attribute forms.
1. On the Project Tables View tab, find the project table that contains the column you want to reuse for the
attribute.
2. Right-click the header of the project table and select Create Attribute.
3. In the MicroStrategy Architect window, in the box, type a name for the attribute.
4. Click OK.
5. In the Create New Form Expression window, define the attribute form expression as desired.
Any column can be accessed in a table from this window, even those that have already been used for other
attributes.
6. Click OK.
Existing attributes can be modified in the Architect graphical interface. Any of the following parts of an attribute
can be changed:
• Attribute forms
• Source tables
• Mapping methods
• Column aliases
Modify attributes from the Project Tables View tab or the Properties pane. They both provide access to all parts of
an attribute.
To access the full range of attribute functions from the Project Tables View tab, change the display properties to
show attribute forms within project tables.
Creating attributes in the Architect graphical interface involves creating attribute forms and form expressions.
Every attribute must have at least one attribute form, and each form can have any number of attribute form
expressions.
When first creating an attribute, create the ID form as well. Most attributes will have other forms as part of their
definition. Create additional attribute forms by modifying attributes. You can also modify existing attribute forms.
Create attributes with forms that have simple or derived attribute form expressions. Add as many attribute forms as
needed to describe an attribute. Add attribute forms using the Project Tables View tab.
1. On the Project Tables View tab, find the project table that contains the attribute form you want to add.
Attribute forms should be part of the primary lookup table for the attribute.
2. In the project table, right-click the attribute to which you want to add a form, and select New Attribute Form.
3. In the Create New Form Expression window, define the new attribute form expression.
Define simple or derived form expressions and select the mapping method.
4. Click OK.
If you select Manual as the mapping method and need to map additional tables to the form, you must modify the
attribute form.
The following image shows the option for adding attribute forms:
Option for Adding Attribute Forms
The following image shows the Create New Form Expression with a simple form expression for the Subcategory
attribute:
Create New Form Expression
The following image shows the description (DESC) attribute form for the Employee attribute displayed in the
Architect graphical interface:
Project Table—DESC Form for Employee Attribute
The following image shows the DESC form for the Subcategory attribute displayed in the Properties pane:
Modify existing attribute forms, including changing their attribute form expressions, mapping methods, primary
lookup tables, source tables, and column aliases. Attribute forms can also be modified using either the Project
Tables View tab or the Properties pane.
1. On the Attributes tab, in the drop-down list, select the attribute you want to modify.
2. Under the appropriate form category, select the corresponding form property.
For example, if you are modifying an ID form, the default property name is ID. If you are modifying a description
form, the default property name is DESC.
3. Click Browse.
4. In the Modify Attribute Form window, modify parts of the attribute form as desired.
Change the source tables for attribute form expressions or the column alias for an attribute form. Modify or delete
existing expressions for the attribute form or create new ones. Remember that the primary lookup table for the
attribute can also be changed.
The following image shows attribute form categories in the Properties pane:
Attribute Form Categories
The Properties pane also contains properties that enable you to modify individual attribute form expressions for
specific tables and modify column aliases.
Modify column aliases for existing attributes. Column aliases for attributes work in a manner similar to column
aliases for facts.
1. On the Attributes tab, in the drop-down list, select the attribute whose column alias you want to view or modify.
2. Under the appropriate form category, select the Column Alias property to see the following Browse button:
3. In the Column Editor - Column Selection window, modify the existing column alias or create a new one.
All attributes have properties in the Properties pane. Many of these properties can be modified. The following table
lists these properties and their descriptions:
Properties for Attributes
An attribute has a set of properties for each form that are part of its definition. Those properties are organized
under a category name. The category name for the first form is labeled Form 1: <Form Name>, the second is Form
2: <Form Name>, and so forth.
• On the Project Tables View tab, find a project table that contains the attribute for which you want to view or
modify properties.
OR
• On the Attributes tab, in the drop-down list, select the attribute for which you want to view or modify
properties.
Some properties have text boxes or drop-down lists that enable you to modify their values. For other properties,
selecting the property or its current value displays the following Browse button:
Clicking the Browse button opens a window that enables you to modify the related property.
As you learned earlier, attribute forms enable you to display different types of information about an attribute. If
you create multiple forms for an attribute, you need to define which forms are part of its default report and
browse displays.
Report display forms are the attribute forms that display when an attribute is present on a report. Browse forms are
the attribute forms that display when you browse an attribute in a hierarchy in the Data Explorer.
Attribute ID forms are automatically added to the default report display and browse forms only if other forms do
not exist. Other types of attribute forms are always added to the default report display and browse forms. If you
create a new form and you do not want to use it for report display or browsing, you need to remove it from both
lists before saving the attribute.
The default report display you define at the attribute level determines which forms users initially see when they
view an attribute on a report. However, you can change the default form display for a particular report. At the
attribute level, you should include the forms you need to display on most reports. You can then use the report-level
display to address any exceptions.
To define report display and browse forms for an attribute using the Properties pane:
• On the Project Tables View tab, find a project table that contains the attribute for which you want to view or
modify properties.
OR
• On the Attributes tab, in the drop-down list, select the attribute for which you want to view or modify
properties.
2. For the appropriate form, beside the Use as Browse Form property, select the check box to toggle between True
and False.
3. Beside the Use as Report Form property, select the check box to toggle between True and False.
When first creating an attribute, the ID form defaults to True for this property. If a subsequent form (like a DESC
form) is added, the value of the new form defaults to True, and the value of the ID form automatically changes to
False.
4. Repeat steps 2 to 3 to modify the report display and browse form properties for other attribute forms as
appropriate.
The following image shows the Properties pane with an attribute form display highlighted:
As you create each hierarchy or dimension of attributes, remember that the last step in building the attributes is to
create the relationships that exist between attributes, as identified in the logical data model. For example, a
customer hierarchy could look like the following:
Customer Hierarchy
Create attribute relationships using the Hierarchy View tab. As the attribute relationships for each hierarchy or
dimension are created, build the system hierarchy for the project.
Attribute Relationship Creation Methods
There are two primary means of creating attribute relationships in the Architect graphical interface.
Manual attribute relationship creation means that you create attribute relationships on your own, deciding which
attributes to relate and the type of relationship between them based on the data warehouse model. This method
gives you control over what relationships are created. However, it does require you to create each attribute
relationship that you need.
Automatic attribute relationship creation means that you allow Architect to identify and create the appropriate
attribute relationships. Architect uses relationship recognition, based on a set of rules you select, to determine
whether attributes are related and the type of relationship that exists between them. After the attribute
relationships are created, you can modify them however you want. This method provides a quick, easy way to
create attribute relationships and minimizes the amount of work you have to do. However, if you use this method,
you should verify that Architect correctly identifies related attributes and creates all of the desired relationships.
Otherwise, you can end up with incorrect, missing, or unwanted attribute relationships.
The method you choose depends on the physical structure of the tables and columns in your data warehouse, the
extent to which you want to analyze various relationships, and your own knowledge of the data warehouse
model. If you are uncertain about the relationships to create and your project environment lends itself to
creating attribute relationships automatically, automatic creation can be a tremendous timesaver. However, if you
are familiar with the data warehouse model and know which attribute relationships you want to use in the project,
creating the relationships manually can be a better way of organizing the system hierarchy.
If relationship recognition is disabled when the attributes for a hierarchy or dimension are created, they display on
the Hierarchy View tab without any relationships defined. The following image shows the attributes from the Time
hierarchy on the Hierarchy View tab:
The image above displays the system hierarchy in its current state. Before relationships are created, all attributes
display as entry points. Once you have created relationships, only top-level attributes that do not have parents are
kept as entry points.
All the attributes from the hierarchy you created are present, but because no relationships exist yet, they are not
connected.
The Hierarchy View tab provides a click-and-drag interface that enables you to easily draw relationships between
attributes. When you select an attribute, the attributes that are child candidates display using regular attribute
icons. The attributes that are not child candidates display using ghosted attribute icons.
For example, the following image shows the Month of Year attribute with its child candidates identified:
The Month, Year, and Quarter attributes are the potential children for the Month of Year attribute, so their icons
display normally. The Day attribute is not a potential child for the Month of Year attribute, so its icon is ghosted.
If the ID forms for two attributes are in the same project table, Architect recognizes them as potentially being
related.
1. On the Hierarchy View tab, click the parent attribute and drag the mouse pointer to the child attribute.
When you click and drag the pointer, a line is dynamically drawn that links the two attributes.
2. If you need to change the relationship type, right-click the line that shows the relationship and select the
appropriate relationship type.
3. To change the relationship table, right-click the line that shows the relationship, point to Relationship Table,
and select the appropriate table.
The following image shows the relationship between the Month of Year and Month attributes:
Relationship between the Month of Year and Month Attributes
The relationship line indicates the type of relationship—one to one, one to many, or many to many. If the pointer is
moved over the relationship line, the name of the relationship table displays.
The following image shows the options for selecting attribute relationship types and relationship tables:
Options for Selecting Attribute Relationship Types and Relationship Tables
On the Hierarchy View tab, you can also modify attribute relationships in the Children Relations window. To
modify existing relationships, right-click the attribute and select Edit Children Relations.
The following image shows the option for modifying parent-child relationships:
Option for Modifying Child Attributes
Overview
In this exercise, use the Architect graphical interface to create attributes manually in the My Demo Project. Create
the following attributes in their corresponding layers:
Geography Attributes
Time Attributes
After creating these attributes, create the following description forms for selected attributes:
As you create the form expressions for each attribute, use the Properties pane to verify that the expressions are
correct and map to the appropriate tables. Save your project work after creating each attribute.
After creating these attributes and their respective forms, manually create the following child relationships for each
attribute:
All relationships should be defined on the Hierarchy View tab in the Architect graphical interface. All relationships
are one to many unless another type of relationship is indicated in the table. One-to-one relationships are
indicated as 1:1.
After creating the child relationships for these attributes, save and update the project schema.
• The detailed instructions are shown only for the Call Center attribute and the Distribution Center attribute.
Use the same procedure to create the remaining attributes using their respective tables and column names.
Detailed Instructions
1. In the My Tutorial Project Source, open the My Demo Project in the Architect graphical interface.
2. In the Architect graphical interface, click the Project Tables View tab.
3. To create the Call Center attribute, on the Home tab, in the Layer section, in the layers drop-down list, select the
Geography layer.
4. On the Project Tables View tab, right-click the header of the LU_CALL_CTR table and select Create Attribute.
5. In the MicroStrategy Architect window, in the box, type Call Center.
6. Click OK.
7. In the Create New Form Expression window, in the Form expression box, create the following expression:
CALL_CTR_ID
8. Select Automatic as the mapping method.
9. Click OK.
• This action creates a Call Center attribute with an ID form that maps to the column CALL_CTR_ID for every
project table in which the column occurs. The new attribute automatically displays in the Properties pane.
10. Repeat steps 4 to 9 to create the remaining attributes except for the Distribution Center attribute in the
Geography and Time layers:
Geography Attributes
Time Attributes
• Alternatively, these attributes can be created by right-clicking the appropriate column in the table and
selecting Create Attributes. Because this method defaults the attribute name to the mapped column name,
make sure the attributes are renamed if necessary. To rename an attribute, right-click it and select Rename.
In the MicroStrategy Architect window, in the box, type the attribute name, and click OK.
11. To create a description form for the Call Center attribute, on the Home tab, in the Layer section, in the layers
drop-down list, select the Geography layer.
12. On the Project Tables View tab, in the LU_CALL_CTR table, right-click the new Call Center attribute and select
New Attribute Form.
13. In the Create New Form Expression window, in the Available columns list, double-click CENTER_NAME to define
it as a form expression.
16. To ensure the attribute forms in the table are visible, right-click the table, point to Properties, point to Logical Tables, and select Display Attribute Forms.
17. To create description forms for the remaining attributes, repeat steps 11 to 15 to create the description forms
for the following attributes in their respective lookup tables:
• Alternatively, these attribute forms can be created by dragging the appropriate column to the attribute in
the lookup table.
18. To create a parent-child relationship for the Call Center attribute, click the Hierarchy View tab.
19. On the Hierarchy View tab, select the Call Center attribute and drag the mouse pointer towards the Employee
attribute.
21. Right-click the Call Center attribute and select Edit Children Relations.
22. In the Children Relations window, in the Relationship type drop-down list, make sure the One to Many
relationship type is selected.
24. On the Home tab, in the Auto Arrange Hierarchy Layout section, click Regular to rearrange the attributes.
25. To create parent-child relationships for the remaining attributes, repeat steps 18 to 24 to create the following
relationships on the Hierarchy View tab:
26. To create the Distribution Center compound attribute, click the Project Tables View tab.
27. On the Home tab, in the Layer section, in the layers drop-down list, select the Geography layer.
28. On the Project Tables View tab, right-click the header of the LU_DIST_CTR table and select Create Attribute.
29. In the MicroStrategy Architect window, in the box, type Distribution Center.
31. In the Create New Form Expression window, in the Form expression box, create the following expression:
DIST_CTR_ID.
34. Right-click the header of the Distribution Center attribute and select New Attribute Form.
35. In the Create New Form Expression window, in the Form expression box, create the following expression:
COUNTRY_ID.
38. In the Properties pane, modify the Category property of the new description form to ID.
39. In the MicroStrategy Architect window that prompts you to create a form group, click Yes.
42. On the Hierarchy View tab, select the Distribution Center attribute and drag the mouse pointer towards the Call
Center attribute.
43. Right-click the Distribution Center attribute and select Edit Children Relations.
44. In the Children Relations window, in the Relationship type drop-down list, select the One to One relationship
type.
46. Select the Country attribute and drag the mouse pointer towards the Distribution Center attribute.
48. To verify the attribute relationships, on the Home tab, in the Auto Arrange Hierarchy Layout section, click
Regular.
49. The Geography system hierarchy should look like the following:
50. The Time system hierarchy should look like the following:
MicroStrategy projects contain two types of hierarchies:
• System hierarchy
• User hierarchy
System Hierarchy
When attributes and their corresponding relationships are created, the system hierarchy is automatically created. The system hierarchy includes all the attributes in a project and their respective relationships. It derives its structure from the direct relationships you define between attributes.
All highest-level attributes (attributes without parents) are entry points in the hierarchy. From these entry points,
users can then browse to other directly related attributes. The system hierarchy is automatically updated any time
that attributes or attribute relationships are added, modified, or removed.
The system hierarchy is useful for showing all the attributes in a project and their relationships. However, its
structure is generally not the most conducive to browsing attribute data. While the system hierarchy serves as the
default drill path used in reports to drill up and down to directly related attributes, creating a user hierarchy
configured for drilling is an easier way to manage drilling in reports.
• You can create user hierarchies that enable users to efficiently browse attribute data in the project.
All attributes and attribute relationships on the Hierarchy View tab comprise the system hierarchy:
System Hierarchy
User Hierarchies
The final task in the project creation workflow is to create the user hierarchies for your MicroStrategy project. User
hierarchies provide convenient paths for browsing attribute data. They may or may not reflect the structure of the
hierarchies in your logical data model. Instead, you define the structure of user hierarchies based on how users
need to browse data.
For example, if users typically browse geography data at the employee level, you can create a user hierarchy that enables users to navigate from the Region attribute, a higher-level attribute in the hierarchy, directly to the Employee attribute, the lowest-level attribute in the hierarchy. The following illustration shows how the hierarchy from the logical data model and the user hierarchy might differ in structure:
Logical Data Model versus User Hierarchy
The logical data model does not have a direct path between the Region and Employee attributes. You have to go through the Call Center attribute. However, if users need to browse directly from Region to Employee, you can include this path in the user hierarchy, eliminating the need to browse call center data first.
User hierarchies are also not limited to directly related attributes. Attributes from different hierarchies in the logical data model can be included in the same user hierarchy.
For example, if users often browse data by Region and Month, you could create a user hierarchy that includes these
attributes, even though they are not directly related. The following illustration shows how a user hierarchy can
combine attributes from different hierarchies in the logical data model:
User Hierarchy with Unrelated Attributes
Even though Region and Month are from different hierarchies in the logical data model, combining them in a user hierarchy provides quick access to browse between these attributes.
Although user hierarchies are not required in a MicroStrategy project, they are certainly helpful for users.
Ultimately, consider how users navigate attribute data to determine the best structure for any user hierarchies that
are created.
When you create user hierarchies for browsing, they are available to users throughout the project. For example,
users can access them in the Data Explorer browser, the Object Browser within object editors, and prompts. The
following illustration shows a few examples of how user hierarchies enable users to accomplish routine tasks within
a project:
Browsing User Hierarchies
In addition to providing users with a convenient method of browsing attribute data throughout the project, user
hierarchies can also be configured as drill paths for analyzing report data. The following illustration shows how
users can drill on user hierarchies within reports:
Drilling on User Hierarchies
Use the Hierarchy View tab to create attribute relationships and define the system hierarchy. User hierarchies can
be created in the Architect graphical interface from the Hierarchy View tab.
To create a user hierarchy in the Architect graphical interface, you create the hierarchy, add attributes to it, and then define the following characteristics:
• Browse attributes
• Entry points
• Element display
• Attribute filters
* You can also customize the sort order for browsing and drilling user hierarchies.
After you create and define user hierarchies, move all of them to the Schema Objects\Hierarchies\Data Explorer folder to make them available for browsing.
The following topics describe each of the steps involved in creating and defining user hierarchies in more detail.
The Architect toolbar displays the hierarchies drop-down list on the Hierarchy View tab. By default, the only item in this list is the System Hierarchy View, which displays the system hierarchy for the project. To view a user hierarchy, you must first create it.
The following image shows the Geography user hierarchy displayed on the Hierarchy View tab:
Geography User Hierarchy
When a user hierarchy is initially created, it does not contain any attributes. The next step in creating the user
hierarchy is to add the attributes that need to be included in the hierarchy.
The following image shows the option for adding attributes to a user hierarchy:
Option for Adding Attributes to a User Hierarchy
The following image shows the Geography user hierarchy with all the attributes added to it:
Geography User Hierarchy
After you add the attributes you want to include in a user hierarchy, the next step is to define the various elements
of the user hierarchy. These characteristics determine the structure of the user hierarchy and how it functions.
For all user hierarchies, you must define browse attributes and entry points. You can also define the element display for attributes, define filters on attributes, and configure user hierarchies for drilling.
Define the browse attributes for each attribute in a user hierarchy. Browse attributes are the attributes to which you
can directly browse from any given attribute.
By default, when a user hierarchy is created, only the immediate children of a parent attribute are defined as
browse attributes. However, if it is necessary to provide direct access to more attribute levels, select additional
browse attributes.
Browse attributes are indicated by a line that connects the two attributes.
Remember to define the browse attributes for each attribute in the user hierarchy.
The following image shows the option for defining browse attributes:
Option for Defining Browse Attributes
The following image shows the Geography User Hierarchy with the browse paths defined:
Geography User Hierarchy—Browse Paths
You can also set the entry points for a user hierarchy. Entry points are the attributes that display when you first open a user hierarchy. Users can begin browsing a user hierarchy from any attribute that is an entry point.
To provide direct access to specific attributes in the hierarchy, set those attributes as entry points.
Attributes that are entry points for a user hierarchy are indicated by a green checkmark beside the attribute icon.
The following image shows the option for setting entry points:
Option for Setting Entry Points
The following image shows the Geography user hierarchy with all the attributes set as entry points:
Geography User Hierarchy—Entry Points
Set the element display for attributes in a user hierarchy. The element display setting determines the extent to which you can browse attribute elements. You have the following options for configuring an attribute’s element display:
• Unlocked—browse all the elements for an attribute. This is the default setting.
• Locked—prevent the elements of an attribute from displaying when browsing.
• Limit—browse a specified number of elements for an attribute. You can then either retrieve the next set of elements or return to browsing the previous set of elements.
By default, the element display for attributes is unlocked when added to a user hierarchy. However, users can
choose to lock or limit the element display of any attribute. Locks and limits are used on attributes either to enforce
security requirements on specific data or conserve system resources by restricting the volume of data retrieved for
attributes that have a large number of elements.
* Locking or limiting the element display for an attribute only affects how you can browse it in that particular user hierarchy. You can use different element display settings for the same attribute in other user hierarchies.
Attributes that are locked for a user hierarchy are indicated by a padlock beside the attribute icon.
For each attribute in the user hierarchy, users have the option to change how attribute elements display or whether
they display at all.
The following image shows the option for setting the element display:
Option for Setting Element Display
User hierarchies are generally used for browsing attribute data, but they can also be configured for drilling. When a
user hierarchy is made available for drilling, it displays as a drill path for attributes in reports.
* If you configure a user hierarchy for drilling, it is displayed on the Other directions drill menu.
The following image shows the option for configuring a user hierarchy for drilling:
Option for Configuring a User Hierarchy for Drilling
Attribute elements are also visible on the Hierarchy View tab. In the following image, see the Show Sample Data
option for the Quarter attribute. After choosing to show sample data, the elements of the Quarter attribute display
in the Browse for attribute elements window:
Option for Showing Sample Data
Attribute filters can also be defined in a user hierarchy. Attribute filters on a user hierarchy control the data
displayed for the attributes to which they are applied. A filter on an attribute in a hierarchy works just like a filter on
a report. Only attribute elements that match the filter conditions display when you browse the attribute.
* Any type of filter can be applied to user hierarchies. For example, you can apply filters that contain metric qualifications, multiple conditions, or reports in their definition.
For example, if you browse the Year attribute with a filter for 2010, only the data for 2010 displays. If users perform most of their analysis on 2010 data, this filter enables them to quickly access the necessary timeframe without having to retrieve and browse through data for other years in the data warehouse.
* Attribute filters on a user hierarchy are not a security measure to prevent users from viewing attribute data.
Rather, you use them to restrict the results of a browse request to the specific information users need to analyze
without retrieving unnecessary data.
By default, a user hierarchy does not have any filters applied to it. You define filters on a user hierarchy at the attribute level. Therefore, if a hierarchy has multiple entry points, define the same filter on each one to ensure that the filter conditions are applied regardless of where a user begins browsing the hierarchy.
Attributes that have a filter applied to them for a user hierarchy are indicated by a filter icon beside the attribute
icon.
For each attribute in the user hierarchy, you have the option to define filters that apply to its attribute element display.
* If a filter is defined on an attribute, a filter icon displays beside the attribute on the Hierarchy View tab.
The following image shows the option for defining attribute filters:
Option for Defining Attribute Filters
The following image shows the elements of the Year attribute when you browse without a filter:
Browsing Year Attribute without Filter
The following image shows the Time user hierarchy with the filter defined for the Year attribute:
Year Attribute with Filter
The following image shows the elements of the Year attribute with the 2011 and 2012 filter defined:
Browsing Year Attribute with Filter
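Conceptually, an attribute filter behaves like a WHERE clause appended to the element-browse query. The following sketch illustrates this with Python and sqlite3; the table name and year values are hypothetical stand-ins for the Time hierarchy's lookup data, not MicroStrategy's actual generated SQL:

```python
import sqlite3

# Hypothetical lookup table standing in for the Time hierarchy's Year lookup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LU_YEAR (Year_ID INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO LU_YEAR VALUES (?)",
                 [(y,) for y in (2010, 2011, 2012, 2013)])

# Unfiltered element browse: every element of the Year attribute is returned.
all_years = [r[0] for r in conn.execute(
    "SELECT Year_ID FROM LU_YEAR ORDER BY Year_ID")]

# With an attribute filter for 2011 and 2012, the browse request behaves like
# a filtered report: only matching elements are returned.
filtered = [r[0] for r in conn.execute(
    "SELECT Year_ID FROM LU_YEAR WHERE Year_ID IN (2011, 2012) ORDER BY Year_ID")]

print(all_years)  # [2010, 2011, 2012, 2013]
print(filtered)   # [2011, 2012]
```

The filter does not remove data from the warehouse; it only narrows the result of the browse request, which is why it is not a security measure.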
In this exercise, use the Architect graphical interface to create user hierarchies in the My Demo Project. Create the
following user hierarchies:
• Geography
• Time
Because of the number of tasks involved in creating user hierarchies, there is a separate overview and set of
instructions for each user hierarchy that needs to be created.
In this exercise, use the Architect graphical interface to create a Geography user hierarchy for the geography-related attributes in the My Demo Project. The following table shows which attributes to include in the user hierarchy and how to define them:
After this user hierarchy has been created, save and update the project schema.
1. To create the Geography user hierarchy, in the My Tutorial Project Source, open the My Demo Project in the
Architect graphical interface.
2. In the Architect graphical interface, click the Hierarchy View tab, if necessary.
4. In the MicroStrategy Architect window, in the box, type Geography as the user hierarchy name.
5. Click OK.
6. To add attributes to the Geography user hierarchy, on the Hierarchy View tab, right-click an empty area inside
the user hierarchy and select Add/Remove attributes in Hierarchy.
7. In the Select Objects window, in the Available objects list, hold CTRL and select the following attributes:
8. Click the > button to add the attributes to the Selected objects list.
9. Click OK.
10. To define the browse attributes for the Geography user hierarchy, on the Hierarchy View tab, in the Geography
user hierarchy, right-click the Call Center attribute and select Define Browse Attributes.
11. In the Select Objects window, in the Available objects list, select Employee.
12. Click the > button to add the Employee attribute to the Selected objects list.
15. To set the entry points for the Geography user hierarchy, on the Home tab, in the Auto Arrange Hierarchy
Layout section, click Regular to rearrange the attributes.
16. You may have to switch to the System Hierarchy View, and then back to the Geography hierarchy, for this setting to take effect.
17. In the Geography user hierarchy, right-click the Region attribute and select Set As Entry Point.
18. Repeat step 17 for the Call Center, Employee, Country, and Distribution Center attributes.
• Since the element display for all the attributes in the Geography hierarchy is unlocked, there is no need to define the element display for the attributes.
19. To configure the Geography user hierarchy for drilling, in the Geography user hierarchy, click anywhere in the
empty area to deselect any attributes that may have been selected.
21. On the Home tab, click Save and Update Schema to update the project schema.
In this exercise, use the Architect graphical interface to create a Time user hierarchy for the time-related attributes
in the My Demo Project. The following table shows which attributes to include in the user hierarchy and how to
define them:
• After creating this user hierarchy, save and close the Architect graphical interface and update the project
schema.
• Finally, in Developer, move all the user hierarchies you created from the Schema Objects\Hierarchies folder
to the Schema Objects\Hierarchies\Data Explorer folder.
The following instructions are written at a higher level to better test your understanding.
2. To add attributes to the Time user hierarchy, on the Hierarchy View tab, in the Time user hierarchy, right-click an
empty area and select Add/Remove attributes in Hierarchy.
3. In the Select Objects window, select the following attributes to include in the Time user hierarchy:
4. Click OK.
5. To define the browse attributes for the Time user hierarchy, on the Hierarchy View tab, in the Time user hierarchy, right-click the Month attribute and select Define Browse Attributes.
6. In the Select Objects window, in the Available objects list, select Day.
7. Click the > button to add the Day attribute to the Selected objects list.
8. Click OK.
10. To set the entry points for the Time user hierarchy, on the Hierarchy View tab, in the Time user hierarchy, set the
Day, Month, Quarter, and Year attributes as entry points.
11. To set the element display for the Time user hierarchy, in the Time user hierarchy, right-click the Day attribute,
point to Element Display, and select Limit.
14. To configure the Time user hierarchy for drilling, in the Time user hierarchy, click anywhere in the empty area to
deselect any attributes you may have selected.
16. Click Save and Close to update the project schema and to exit the Architect graphical interface.
17. To save the Geography and Time hierarchies in the Data Explorer folder, in Developer, browse to the Schema
Objects\Hierarchies folder and drag the Geography and Time hierarchies to the Data Explorer folder to make
them available for browsing.
• The hierarchies do not display in the Schema Objects\Hierarchies folder unless you save and update the schema. It may be necessary to refresh the Developer display to view them. To refresh the Developer display, browse to the Hierarchies folder and press F5. To view the hierarchies in the Data Explorer for the project, in the Folder List, right-click Data Explorer - My Demo Project and select Refresh.
Ragged Hierarchies
A ragged hierarchy is one in which the organizational structure varies such that the depth of the hierarchy is not
uniform. In other words, for every child attribute element, there does not always exist a corresponding parent
attribute element. Instead, the child attribute element may have a direct relationship only with a grandparent
attribute element.
For example, an advertising company has its sales organization represented as follows:
Logical Model for Sales Hierarchy
In this model, the company is divided into regions, which are then split into markets by advertising segments. Each
market has dedicated account executives who are responsible for specific clients. However, that general structure
may not hold true for all clients. For example, you could have some clients that do not directly correspond to a
market. Therefore, account executives for these clients report directly to the region level. A look at some of the
actual data reveals a ragged structure to the hierarchy:
Ragged Structure of Sales Data
There are four levels of data in the warehouse for the Sales hierarchy. They map to the logical data model as follows:
In the second level of data, notice that there are only three markets—Fashion, Food, and Transportation. One of the
elements that ties directly into the East region is an account executive, Sara Kaplan. The physical data at the second
level is ragged. The Market attribute is not represented between the Region and Account Executive levels for all of
the data in the warehouse. In the case of Sara Kaplan, she is responsible for a single client that is not associated
with a particular market, so she falls directly under the East region in the sales structure rather than reporting
through a market. For this particular attribute element, the Market attribute carries no meaning, making the
hierarchy ragged.
Attributes from ragged hierarchies are problematic when you place them in reports. For example, if you want to
create a report that displays all of the attributes in a ragged hierarchy, the missing values can cause issues since
data does not exist uniformly at every level to populate each cell in the report. Ragged hierarchies also pose a
challenge when drilling. If a report contains an attribute from a ragged hierarchy and you need to drill to other
levels in the hierarchy, values may not exist for every row of data in the original report.
* In a normalized schema, ragged hierarchies can also cause problems when you aggregate data that is stored at a
lower level in the hierarchy to a higher-level attribute that exists at or above the ragged level in the hierarchy. In
such cases, joining from the fact table to the higher-level lookup table requires using the lookup table in which the
gaps exist. As a result, this join excludes some fact table records from being aggregated. Denormalizing the schema
can prevent aggregation issues and resolve specific drill paths. However, even in a denormalized schema, drills or
other queries that join through the lookup table in which the gaps exist will continue to pose a challenge. This is
because some data will be “left out” of result sets because of the gaps.
You can resolve these issues with ragged hierarchies in one of three ways:
• Use a left outer join so that data can be aggregated from skipped levels
• Revise the data model so that gaps do not exist
• Populate the gaps for an attribute with values from its parent or child attribute or with system-generated values
One method for resolving ragged hierarchies is to use a left outer join of the lookup table to the fact table. For example, in the illustration below, neither Sara Kaplan nor Dave Williams is assigned to a market.
Table Data
If you run a report that shows market and account executive information along with the revenue generated by
each account executive, the default result set looks like the following:
Result Set for Market and Account Executive with a Metric
Because Sara Kaplan and Dave Williams are not assigned to markets, the null values that exist in the database table
cause them to be excluded from the result set.
from (((`FACT_CLIENT_SALES` a11
    join `LU_CLIENT` a12
      on (a11.Client_ID = a12.Client_ID))
    join `LU_ACCT_EXEC` a13
      on (a12.Acct_Exec_ID = a13.Acct_Exec_ID))
    join `LU_MARKET` a14
      on (a13.Market_ID = a14.Market_ID))
group by a13.Market_ID,
    a12.Acct_Exec_ID
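The exclusion comes from the inner join in the FROM clause: a row whose Market_ID is NULL can never match a row in LU_MARKET, so it silently drops out of the result set. A minimal sketch in Python with sqlite3 demonstrates the behavior; the table contents below are illustrative stand-ins, not the actual project data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LU_MARKET    (Market_ID INTEGER, Market_Name TEXT);
CREATE TABLE LU_ACCT_EXEC (Acct_Exec_ID INTEGER, Acct_Exec_Name TEXT,
                           Market_ID INTEGER);
INSERT INTO LU_MARKET VALUES (1, 'Fashion'), (2, 'Food');
INSERT INTO LU_ACCT_EXEC VALUES
  (10, 'Joe Anderson', 1),
  (11, 'Sara Kaplan', NULL);  -- reports directly to the region: no market
""")

# Inner join: Sara Kaplan's NULL Market_ID matches no LU_MARKET row,
# so she is excluded from the result set.
inner = conn.execute("""
    SELECT a.Acct_Exec_Name, m.Market_Name
    FROM LU_ACCT_EXEC a
    JOIN LU_MARKET m ON a.Market_ID = m.Market_ID
""").fetchall()

print(inner)  # [('Joe Anderson', 'Fashion')] -- Sara Kaplan is missing
```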
To address this issue, you first change the join VLDB property of the Market attribute and all of its parent attributes so that data can be aggregated from skipped levels. Second, for the affected report, modify the join VLDB property at the report level. Once you have made both of these changes, you can roll up data from the lowest level to higher levels in the hierarchy.
1. Edit the Market attribute and on the Tools menu, select VLDB Properties.
2. Expand the Joins folder and select Preserve all final pass result elements.
3. Clear the Use default inherited value - (Default Settings) check box.
4. Click Preserve all elements of final pass result table with respect to lookup table but not relationship table.
7. Edit the report and on the Data menu, select VLDB Properties.
8. Expand the Joins folder and select Preserve all final pass result elements.
9. Clear the Use default inherited value - (Default Settings) check box.
10. Click Do not listen to per report level setting, preserve elements of the final pass according to the setting at the
attribute level. If this choice is selected at attribute level, it will be treated as preserve common elements (i.e.
choice 1).
* Controlling the join setting at the report level ensures that changing the attribute does not affect the entire
project. This adds flexibility to your reporting environment because you can choose to have some reports that left
join on the attribute while others do not.
Now if you run the report, the result set looks like the following:
Result Obtained by Left Outer Join
Although modifying the VLDB property resolves the issue of displaying data at the Market level, it does not resolve the issue of displaying data at the parent level. For example, on a report showing Region, Market, Account Executive, and Revenue, account executives who are not assigned to markets still do not display. Therefore, this option is not a complete solution to resolving issues with ragged hierarchies.
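The VLDB setting described above effectively changes the inner join to a left outer join, which preserves lookup rows whose Market_ID is NULL. A sketch in Python with sqlite3, using hypothetical table contents for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LU_MARKET    (Market_ID INTEGER, Market_Name TEXT);
CREATE TABLE LU_ACCT_EXEC (Acct_Exec_ID INTEGER, Acct_Exec_Name TEXT,
                           Market_ID INTEGER);
INSERT INTO LU_MARKET VALUES (1, 'Fashion');
INSERT INTO LU_ACCT_EXEC VALUES (10, 'Joe Anderson', 1),
                                (11, 'Sara Kaplan', NULL);
""")

# Left outer join: the account executive with a NULL Market_ID is preserved,
# with NULL returned for the market columns.
left = conn.execute("""
    SELECT a.Acct_Exec_Name, m.Market_Name
    FROM LU_ACCT_EXEC a
    LEFT OUTER JOIN LU_MARKET m ON a.Market_ID = m.Market_ID
    ORDER BY a.Acct_Exec_ID
""").fetchall()

print(left)  # [('Joe Anderson', 'Fashion'), ('Sara Kaplan', None)]
```

As the text notes, this preserves rows only at the level where the outer join is applied; a parent attribute such as Region that joins through the same gapped lookup table still loses those rows.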
Another method for resolving ragged hierarchies is to revise the data model so that gaps do not exist. For example,
you could revise the Sales hierarchy to model the attribute relationships as follows:
Revised Logical Model for Sales Hierarchy
Changing the data model to directly relate the Region and Account Executive attributes also means modifying the underlying structure of the LU_ACCT_EXEC table. Add the ID column for Region to the LU_ACCT_EXEC table to map the relationship between the two attributes:
Modified Lookup Table for Account Executive
* In the illustration above, both Sara Kaplan and Dave Williams are not assigned to markets.
By adding the Region_ID column to the LU_ACCT_EXEC table and making Account Executive a child of Region, you
establish a means of relating account executives directly to their corresponding regions. You can now drill directly
from Region to Account Executive without having to join through the lookup table for Market.
Although changing the data model provides a drill path to avoid the gaps in the ragged structure of the hierarchy,
it does not resolve issues with displaying data from a ragged hierarchy on a report. If a report contains all levels of
the hierarchy, the SQL joins include the lookup table for Market in which the gaps exist. As a result, account
executives who are not assigned to markets still do not display on the report. Therefore, this option is not a
complete solution to resolving issues with ragged hierarchies.
* Revising the data model resolves the problems posed by ragged hierarchies only for very specific drill paths.
Resolving ragged hierarchies using this method can lead to even bigger issues when it comes to rolling up data
from the lowest level to higher levels in a hierarchy. In this case, the roll-up logic for the Account Executive attribute
differs depending on whether the data is aggregated by joining through the LU_MARKET table or joining directly
to the LU_REGION table. Aggregation that occurs through the LU_MARKET table does not include revenue for Sara
Kaplan or Dave Williams, while aggregation that occurs through the LU_REGION table does include their revenue.
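The roll-up discrepancy can be made concrete with a small sketch in Python and sqlite3. The table names echo the text, but the column layout and values are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LU_MARKET    (Market_ID INTEGER, Region_ID INTEGER);
CREATE TABLE LU_ACCT_EXEC (Acct_Exec_ID INTEGER, Market_ID INTEGER,
                           Region_ID INTEGER);
CREATE TABLE FACT_SALES   (Acct_Exec_ID INTEGER, Revenue REAL);
INSERT INTO LU_MARKET VALUES (1, 100);           -- Fashion market, East region
INSERT INTO LU_ACCT_EXEC VALUES (10, 1, 100),    -- exec in Fashion, East
                                (11, NULL, 100); -- exec with no market, East
INSERT INTO FACT_SALES VALUES (10, 500.0), (11, 300.0);
""")

# Roll-up through LU_MARKET: the NULL Market_ID row is excluded,
# so the region total is missing that executive's revenue.
via_market = conn.execute("""
    SELECT SUM(f.Revenue) FROM FACT_SALES f
    JOIN LU_ACCT_EXEC a ON f.Acct_Exec_ID = a.Acct_Exec_ID
    JOIN LU_MARKET m ON a.Market_ID = m.Market_ID
    WHERE m.Region_ID = 100
""").fetchone()[0]

# Roll-up directly through the new Region_ID column: all revenue is included.
via_region = conn.execute("""
    SELECT SUM(f.Revenue) FROM FACT_SALES f
    JOIN LU_ACCT_EXEC a ON f.Acct_Exec_ID = a.Acct_Exec_ID
    WHERE a.Region_ID = 100
""").fetchone()[0]

print(via_market, via_region)  # 500.0 800.0
```

The same region produces two different totals depending on the join path, which is exactly why revising the data model alone is not a safe fix.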
A better method for resolving ragged hierarchies is to populate the null values with attribute elements of either the
child or parent attribute or with system-generated values. Inserting values effectively eliminates gaps in a ragged
hierarchy.
For example, the following illustration shows the original data in the LU_ACCT_EXEC table:
Lookup Table for Account Executive with Null Values
Sara Kaplan and Dave Williams are not assigned to markets, so the market ID for both of them is null. You could run
a report with the following template in which all three attribute levels are present:
Template with Attributes from Sales Hierarchy
However, if you run this report, Sara Kaplan and Dave Williams are not included in the result set since their
respective market IDs are null:
Report Result with Null Values
To ensure that all account executives are included in the report display, you can populate the empty values in the
LU_ACCT_EXEC table by inserting the values of the parent (Region) or child (Account Executive) attributes into the
Market_ID column of the LU_ACCT_EXEC table or by generating your own values to replace the nulls.
If you populate the Market_ID column with the parent attribute values, the LU_ACCT_EXEC table looks like the
following:
Lookup Table for Account Executive with Parent Values
Now, if you run the same report, the result set looks like the following:
Report Result with Parent Values
Alternatively, you could populate the empty cells for the Market_ID column with the values for the Account Executive attribute. Then, the LU_ACCT_EXEC table looks like the following:
Lookup Table for Account Executive with Child Values
Now, if you run the same report, the result set looks like the following:
Report Result with Child Values
Whether you choose to populate empty cells with the parent or child attribute values completely depends on
which action provides the most business value to users as they view reports.
If inserting parent or child attribute values does not make sense in your business environment, you can also
populate the empty cells with system-generated IDs that map to descriptions that indicate that a value does not
exist.
For example, you could generate market IDs for account executives who are not assigned to a market. In the lookup
table for the Market attribute, these IDs map to description columns that indicate that no market is assigned. Then,
the LU_ACCT_EXEC table looks like the following:
Lookup Table for Account Executive with Generated Values
Now, if you run the same report, the result set looks like the following:
Report Result with Generated Values
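The generated-values approach can be sketched in Python with sqlite3. The placeholder ID of -1 and the description "No Market Assigned" are assumptions for illustration; any generated ID that maps to a meaningful description works the same way:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LU_MARKET    (Market_ID INTEGER, Market_Name TEXT);
CREATE TABLE LU_ACCT_EXEC (Acct_Exec_ID INTEGER, Acct_Exec_Name TEXT,
                           Market_ID INTEGER);
INSERT INTO LU_MARKET VALUES (1, 'Fashion');
INSERT INTO LU_ACCT_EXEC VALUES (10, 'Joe Anderson', 1),
                                (11, 'Sara Kaplan', NULL);
""")

# Generate a placeholder market and assign it wherever Market_ID is NULL,
# eliminating the gap in the hierarchy.
conn.execute("INSERT INTO LU_MARKET VALUES (-1, 'No Market Assigned')")
conn.execute("UPDATE LU_ACCT_EXEC SET Market_ID = -1 WHERE Market_ID IS NULL")

# The ordinary inner join now returns every account executive.
rows = conn.execute("""
    SELECT a.Acct_Exec_Name, m.Market_Name
    FROM LU_ACCT_EXEC a
    JOIN LU_MARKET m ON a.Market_ID = m.Market_ID
    ORDER BY a.Acct_Exec_ID
""").fetchall()

print(rows)  # [('Joe Anderson', 'Fashion'), ('Sara Kaplan', 'No Market Assigned')]
```

Because the gaps no longer exist, no join or VLDB changes are needed: both display and aggregation work with the default inner joins.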
During the exercises, you will first create and run reports in the project without having any solution in place to resolve gaps in a ragged hierarchy. Then, you will see how changing the VLDB property at the attribute and report levels addresses the issue of aggregating data for skipped levels. Next, you will see how revising the data model does not completely solve all of the issues caused by the ragged hierarchy. Finally, you will resolve the ragged hierarchy by populating the gaps in the hierarchy with system-generated values.
The Ragged Hierarchies Project is based on the following logical data model:
Ragged Hierarchies Logical Data Model
In this data model, gaps exist in the data at the market level. Most account executives are assigned to a market,
which then rolls up into a region. However, two of the account executives have large enough client accounts that
they form their own “markets.” Therefore, they roll up directly into a region as follows:
Gaps in Sales Organization Data
The schema for this project consists of the following five tables:
Ragged Hierarchies Schema
Create a report
1. In MicroStrategy Developer, in the Advanced Data Warehousing project source, open the Ragged Hierarchies
Project.
2. In the Public Objects folder, in the Reports folder, create the following report:
* You can access the Region attribute from the Sales hierarchy. The Revenue metric is located in the Metrics folder.
3. Run the report. The result set should look like the following:
4. Drill down on the entire report from Region to Market and choose to keep the parent attribute when drilling.
The result set should look like the following:
* You can use the Drill option on the Data menu to perform the drill while keeping the parent attribute.
* Notice that the markets related to each region (Food, Fashion, Transportation, and Retail) are included in the
result set.
5. On the drill report, drill down on the entire report from Market to Account Executive, again choosing to keep
the parent attribute when drilling. The result set should look like the following:
* Notice that Sara Kaplan (the account executive for the East region who is not assigned to a market) and Wes
Vinson (the account executive for the Mountain region who is not assigned to a market) do not display on this
report.
Why do they not display in the result set? The answer is in the SQL.
6. Switch to the SQL View for this report. You should see the following SQL:
To access account executive information, the join goes through the LU_MARKET table to relate Region to Account
Executive. Since these two account executives are not assigned to markets, no relationship exists. Therefore, the
result set excludes them.
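A minimal sqlite3 sketch of why the inner join drops these rows; the table and column names follow the text, while the data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE LU_MARKET (Market_ID INTEGER, Market_Name TEXT);
CREATE TABLE LU_ACCT_EXEC (Acct_Exec_ID INTEGER, Acct_Exec_Name TEXT, Market_ID INTEGER);
INSERT INTO LU_MARKET VALUES (1, 'Food');
-- Sara has no market, so her Market_ID is NULL
INSERT INTO LU_ACCT_EXEC VALUES (9, 'Sara Kaplan', NULL), (10, 'Joe Brown', 1);
""")
rows = cur.execute("""
    SELECT ae.Acct_Exec_Name
    FROM LU_ACCT_EXEC ae JOIN LU_MARKET m ON ae.Market_ID = m.Market_ID
""").fetchall()
print(rows)  # only Joe Brown; a NULL Market_ID can never satisfy the join
```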
7. Close both drill reports, but leave the original report you created open. You do not need to save the drill
reports.
8. On the original report, place a Total subtotal. The total should display as follows:
* Notice that according to this report, the total revenue across all regions is $3,774,215.
9. Save this report in the Reports folder as Revenue by Region and close the report.
* You can access the Client attribute from the Sales hierarchy.
11. Run the report. The result set should look like the following:
* Notice that American Express, Citibank, and VISA (the clients that Sara Kaplan and Wes Vinson represent) are all
included in the result set.
12. Place a Total subtotal on the report. The total should display as follows:
* Notice that the true total for all of the revenue is $19,752,626. That is a very different figure from the one you saw
on the Revenue by Region report. The FACT_CLIENT_SALES table is keyed on the client ID. Therefore, on this report,
all clients are accounted for in the revenue figures. However, the Revenue by Region report joins to the
FACT_CLIENT_SALES table through the LU_MARKET table. Because of the gaps that exist at the market level, not all
clients are accounted for in the aggregation.
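The difference between the two totals can be reproduced with a minimal sqlite3 sketch. The schemas here are simplified guesses (the real FACT_CLIENT_SALES and lookup structures are not shown in full) and the revenue figures are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE LU_ACCT_EXEC (Acct_Exec_ID INTEGER, Market_ID INTEGER);
CREATE TABLE LU_MARKET (Market_ID INTEGER, Region_ID INTEGER);
CREATE TABLE FACT_CLIENT_SALES (Client_ID INTEGER, Acct_Exec_ID INTEGER, Revenue REAL);
INSERT INTO LU_MARKET VALUES (1, 1);
INSERT INTO LU_ACCT_EXEC VALUES (9, NULL), (10, 1);   -- exec 9 has no market
INSERT INTO FACT_CLIENT_SALES VALUES (100, 9, 500.0), (101, 10, 300.0);
""")
# Aggregating straight off the fact table counts every client...
total_by_client = cur.execute("SELECT SUM(Revenue) FROM FACT_CLIENT_SALES").fetchone()[0]
# ...but rolling up through LU_MARKET silently drops exec 9's sales.
total_by_region = cur.execute("""
    SELECT SUM(f.Revenue)
    FROM FACT_CLIENT_SALES f
    JOIN LU_ACCT_EXEC ae ON f.Acct_Exec_ID = ae.Acct_Exec_ID
    JOIN LU_MARKET m ON ae.Market_ID = m.Market_ID
""").fetchone()[0]
print(total_by_client, total_by_region)  # 800.0 vs 300.0
```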
13. Save this report in the Reports folder as Revenue by Client and close the report.
One method for resolving some of the issues caused by the gaps in this ragged hierarchy is to change the join VLDB
Property for the Market attribute so that data can be aggregated from the skipped level.
1. In the Schema Objects folder, in the Attributes folder, open the Market attribute.
3. Expand the Joins folder and select Preserve all final pass result elements.
4. Clear the Use default inherited value - (Default Settings) check box, if it is not already clear.
5. Click Preserve all elements of final pass result table with respect to lookup table but not relationship table, if it is
not already selected.
8. In the Public Objects folder, in the Reports folder, create the following report:
* You can access the Market attribute from the Sales hierarchy. The Revenue metric is located in the Metrics folder.
10. Expand the Joins folder and select Preserve all final pass result elements.
11. Clear the Use default inherited value - (Default Settings) check box.
12. Click Preserve all elements of final pass result table with respect to lookup table but not relationship table. If
this choice is selected at the attribute level, it is treated as preserving common elements (that is, choice 1).
15. Save the report as Revenue by Market and close the report.
16. Run the Revenue by Market report. Place a Total subtotal on the report. The result set should look like the
following:
* Notice that the report shows the correct revenue total of $19,752,626.
Why is the total correct on this report? The answer is in the SQL.
17. Switch to the SQL View for this report. You should see the following SQL:
Because the lookup tables can now left outer join to the fact table, the report shows the correct revenue total. At first
glance, this solution seems to have resolved the issues with ragged hierarchies. However, it has really only resolved
the issue for a very specific reporting scenario—any time you want to aggregate the data at the Market level. If you
run any query that involves parents of the Market attribute, the aggregation will not yield the correct total. To see
the issue that is still present, perform the following steps:
19. Drill up on the entire report from Market to Region and choose to keep the parent attribute when drilling. Move
Region to the left of Market. The result set should look like the following:
The report does not include Sara Kaplan's and Wes Vinson's revenue because this query uses the relationship between
Region and Market to retrieve the result set. Since Sara and Wes do not relate to any market, there is no way to link
them to the region that they belong to in the LU_REGION table. Because of this, their data is excluded in this report.
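Both behaviors can be reproduced with a minimal sqlite3 sketch. The schema is deliberately simplified (the fact table here carries Market_ID directly) and the data values are invented, but it shows why the outer join produced by the Preserve all elements setting fixes the Market-level total, while any query that requires a matching market row still loses the unassigned revenue:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE LU_MARKET (Market_ID INTEGER, Market_Name TEXT, Region_ID INTEGER);
CREATE TABLE FACT_SALES (Market_ID INTEGER, Revenue REAL);
INSERT INTO LU_MARKET VALUES (1, 'Food', 1);
-- One fact row carries a NULL Market_ID (an exec with no market)
INSERT INTO FACT_SALES VALUES (1, 300.0), (NULL, 500.0);
""")
# With the 'preserve all' setting, the engine emits an outer join, so
# the unmatched fact row (and its revenue) survives at the Market level.
outer_total = cur.execute("""
    SELECT SUM(f.Revenue) FROM FACT_SALES f
    LEFT JOIN LU_MARKET m ON f.Market_ID = m.Market_ID
""").fetchone()[0]
# Rolling up to Region still needs a market row, so an inner join
# through the Market-Region relationship drops that revenue again.
region_total = cur.execute("""
    SELECT SUM(f.Revenue) FROM FACT_SALES f
    JOIN LU_MARKET m ON f.Market_ID = m.Market_ID
""").fetchone()[0]
print(outer_total, region_total)  # 800.0 vs 300.0
```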
Now that you have seen the problem that is still present when you change the VLDB Properties, let us see how
revising the data model addresses this issue.
Another method for resolving some of the issues caused by the gaps in this ragged hierarchy is to revise the data
model so that Region is directly related to both Market and Account Executive as follows:
Because revising the data model entails changing the logical relationship of Region and Account Executive such
that Account Executive is a child of Region, you must also relate the two attributes in the database itself. To do so,
you must include the ID for Region in the LU_ACCT_EXEC table.
1. Open the MicroStrategy DB Query Tool by choosing it in the Start Menu → MicroStrategy Tools menu.
3. Click Connect.
6. Populate the new Region_ID column by typing the following SQL commands and executing them:
update LU_ACCT_EXEC
set Region_ID = 2
where Acct_Exec_ID > 8 and Acct_Exec_ID < 14;
update LU_ACCT_EXEC
set Region_ID = 3
where Acct_Exec_ID > 13 and Acct_Exec_ID < 20;
update LU_ACCT_EXEC
set Region_ID = 4
10. You now need to make MicroStrategy Architect recognize the changes you made to the LU_ACCT_EXEC table
and define the logical relationship between the Region and Account Executive attributes.
11. In MicroStrategy Developer, in the Ragged Hierarchies Project, open the Warehouse Catalog.
* You access the option to update table structure by right-clicking the table name in the Warehouse Catalog.
14. In the Schema Objects folder, in the Attributes folder, open the Region attribute.
18. Run the Revenue by Region report again. The result set should look like the following:
* Notice that the report now shows the correct revenue total of $19,752,626.
Why is the total now correct on this report? The answer is in the SQL.
19. Switch to the SQL View for this report. You should see the following SQL:
Because Region is directly related to the Account Executive attribute, the FACT_CLIENT_SALES table can join to the
LU_REGION table using the LU_ACCT_EXEC table. This join yields the correct revenue total.
21. Drill down on the entire report from Region to Account Executive and choose to keep the parent attribute
when drilling. The result set should look like the following:
* Notice that both Sara Kaplan and Wes Vinson are part of this result set. Because Region and Account Executive are
directly related, their respective lookup tables can be joined without having to go through the LU_MARKET table.
At first glance, this solution seems to have resolved the issues with ragged hierarchies. However, it has really only
resolved the issues for a very specific join path—any time you join directly from the LU_REGION table to the
LU_ACCT_EXEC table. If you run any query that involves the LU_MARKET table, this “solution” does not remove the
problems associated with joining through the LU_MARKET table. To see some of the issues that are still present,
perform the following steps:
22. On the drill report, select the row for Sara Kaplan and drill up to Market. This drill action should return the
following result:
No data is returned because this query uses the relationship between Account Executive and Market to retrieve the
result set. Since Sara Kaplan does not relate to any market, there is no data to display on the report. You can see
how this join occurs by looking at the SQL.
23. Switch to the SQL View for this report. You should see the following SQL:
* Notice the WHERE clause has a condition that filters on Sara Kaplan’s ID since you drilled only on Sara Kaplan in
the report. Since the ID for Sara Kaplan does not relate to any market ID, the SQL cannot return any data for the
report.
24. Close both drill reports. You do not need to save them.
* You can access the Market attribute from the Sales hierarchy.
27. Run the report again. The result set should look like the following:
* With the Market attribute on the report, notice that the report now shows the incorrect revenue total of
$3,774,215.
Why is the total once again incorrect on this report? The answer is in the SQL.
28. Switch to the SQL View for this report. You should see the following SQL:
* Notice that the SQL now includes the LU_MARKET table. Since the join to the fact table occurs through this table,
the sales for clients with account executives who do not roll up into a market are left out of the aggregation.
30. Drill down on the entire report from Market to Account Executive and choose to keep the parent attribute
when drilling. The first part of the result set should look like the following:
Again, notice that Sara Kaplan does not display in the result set for the East region, and Wes Vinson does not
display in the result set for the Mountain region. Because the drill joins through the Market attribute, and therefore
the LU_MARKET table, these account executives are left out.
31. Close the drill report. You do not need to save it.
32. Close the Revenue by Region report. You do not need to save any changes to the report.
Now that you have seen the issues that are still present when you revise the data model, you are going to return the
tables and attribute relationships to their original definitions and resolve the ragged hierarchy by populating the
gaps with system-generated values.
1. Open the DB Query Tool by choosing it in the Start Menu → MicroStrategy Tools menu.
4. Verify that the column has been dropped by running the following SQL:
show columns from LU_ACCT_EXEC;
update LU_ACCT_EXEC
set Market_ID = 20
where Acct_Exec_ID = 19;
8. Verify the changes have been made using the following SQL:
select * from LU_ACCT_EXEC;
9. Close DB Query Tool.
10. In MicroStrategy Developer, in the Ragged Hierarchies project, in the Schema Objects/Attributes folder, open
the Region attribute.
18. Run the Revenue by Region report again. The result set should look like the following:
* Notice that the total on the report is aggregating all of the revenue in the fact table.
19. Switch to the SQL View for the report. The SQL should look like the following:
* Notice that the SQL contains the LU_MARKET table. This query joins through the LU_MARKET table, but because
of the values you entered, all of the account executives are now related to a market and are included in the
aggregation.
21. Drill down on the entire report from Region to Market and choose to keep the parent attribute when drilling.
The result set should look like the following:
* Notice that the market values you entered display in the result set.
22. On the drill report, drill down on the entire report from Market to Account Executive and choose to keep the
parent attribute when drilling. The first part of the result set should look like the following:
Notice that both Sara Kaplan and Wes Vinson are part of the result set.
23. Close all of the reports. You do not need to save the drill reports.
Split Hierarchies
A split hierarchy is one in which there is a split in the primary hierarchy such that more than one child attribute
exists at some level in the hierarchy. Most hierarchies follow a linear progression from higher-level to lower-level
attributes. While characteristic attributes may branch off the primary hierarchy at various points, the primary
hierarchy itself generally follows a single path to the lowest-level attribute and any related fact tables. With split
hierarchies, somewhere along the primary hierarchy, a split occurs. For example, a pharmaceutical company has its
Prescriber hierarchy organized as follows:
Logical Data Model for Prescriber Hierarchy
The prescriber relates at the lowest level to both the drug that is being prescribed and the patient to whom the
prescription belongs. In this example, the complexity is compounded by the many-to-many relationships between
the parent and child attributes. A prescriber can prescribe multiple drugs, and multiple prescribers can prescribe
the same drug. A prescriber has multiple patients, and a patient can go to multiple prescribers (doctors) for
different ailments.
* Split hierarchies can be present without many-to-many relationships between the parent and child attributes.
The problem with a split hierarchy is that it provides two paths that you can use to join to fact tables. The lookup
tables (and relationship tables in the case of many-to-many relationships) for each parent-child attribute form
separate, distinct join paths. You can use either path to join to fact tables for metrics that are contained in a report.
However, the SQL Engine optimizes the join path to the fact tables, so it must choose one of them.
* A split hierarchy may not pose join issues if each child attribute in the split joins to a different set of fact tables and
you never use the attributes to join to the same fact table.
For split hierarchies in which there are one-to-one or one-to-many relationships between the parent and children,
the split results in the SQL Engine consistently choosing one join path over the other, even though that path may
not be the most efficient way of joining from the fact tables to the parent attribute for all queries. This same
problem also arises when you have split hierarchies in which there are many-to-many relationships. However, the
many-to-many relationship further compounds the issue. Because the SQL Engine chooses one join path over the
other, the join may not occur through the proper relationship table, which can lead to an inaccurate result set.
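A minimal sqlite3 sketch of the wrong-path problem. The table names follow the text; the schemas are simplified and the data values invented. Joining the fact table to the prescriber through the Drug relationship pairs a patient with every prescriber of the drug, not just the prescriber who actually treated them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE FACT_PRESCRIPTIONS (Drug_ID INTEGER, Patient_ID INTEGER, Presc_Amt REAL);
CREATE TABLE REL_DRUG_PRESCRIBER (Drug_ID INTEGER, Prescriber_ID INTEGER);
CREATE TABLE REL_PATIENT_PRESCRIBER (Patient_ID INTEGER, Prescriber_ID INTEGER);
-- Drug 1 is prescribed by prescribers 1 and 2, but patient 7 only ever saw prescriber 1
INSERT INTO REL_DRUG_PRESCRIBER VALUES (1, 1), (1, 2);
INSERT INTO REL_PATIENT_PRESCRIBER VALUES (7, 1);
INSERT INTO FACT_PRESCRIPTIONS VALUES (1, 7, 20.0);
""")
# Joining through the Drug relationship pairs patient 7 with BOTH prescribers
via_drug = cur.execute("""
    SELECT r.Prescriber_ID, f.Patient_ID FROM FACT_PRESCRIPTIONS f
    JOIN REL_DRUG_PRESCRIBER r ON f.Drug_ID = r.Drug_ID
    ORDER BY r.Prescriber_ID
""").fetchall()
# Joining through the Patient relationship returns only the true pairing
via_patient = cur.execute("""
    SELECT r.Prescriber_ID, f.Patient_ID FROM FACT_PRESCRIPTIONS f
    JOIN REL_PATIENT_PRESCRIBER r ON f.Patient_ID = r.Patient_ID
""").fetchall()
print(via_drug)     # [(1, 7), (2, 7)] -- prescriber 2 never treated patient 7
print(via_patient)  # [(1, 7)]
```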
There are lookup tables for the Prescriber, Drug, and Patient attributes as well as two separate relationship
tables—one to map the relationship between Prescriber and Drug and one to map the relationship between
Prescriber and Patient. These tables contain the following data:
Table Data
If you run a report to view the drugs that prescribers have prescribed, the result set looks like the following:
Result Set for Prescriber-Drug Information
The result set correctly displays each prescriber along with the drugs they have prescribed. The SQL for this report
looks like the following:
select a12.Prescriber_ID AS Prescriber_ID,
a13.Prescriber_Name AS Prescriber_Name,
a11.Drug_ID AS Drug_ID,
a11.Drug_Name AS Drug_Name
from `LU_DRUG` a11
join `REL_DRUG_PRESCRIBER` a12
on a11.Drug_ID = a12.Drug_ID
join `LU_PRESCRIBER` a13
on a12.Prescriber_ID = a13.Prescriber_ID
The result set is correct because it is obtained by querying the REL_DRUG_PRESCRIBER table, which maps the
relationships between prescribers and drugs.
You could also run a report to view the patients for each prescriber. The result set looks like the following:
Result Set for Prescriber-Patient Information
The result set correctly displays each prescriber along with the patients for whom they have prescribed drugs.
You could also run a report that shows prescriber and patient information along with the amount of prescriptions
for each patient. The result set looks like the following:
Result Set for Prescriber-Patient Information with a Metric
With the Prescription Amount metric as part of the report, this result set does not correctly display the prescriber
and patient relationships. Instead of relating patients to prescribers who have prescribed drugs for them, the result
set relates patients to any prescriber that prescribes drugs they have taken, regardless of whether they actually
obtained their prescription from that particular prescriber. The SQL for this report looks like the following:
select a12.Prescriber_ID AS Prescriber_ID,
max(a14.Prescriber_Name) AS Prescriber_Name,
a11.Patient_ID AS Patient_ID,
max(a13.Patient_Name) AS Patient_Name,
sum(a11.Presc_Amt) AS WJZBFS1
from `FACT_PRESCRIPTIONS` a11
join `REL_DRUG_PRESCRIBER` a12
on a11.Drug_ID = a12.Drug_ID
join `LU_PATIENT` a13
on a11.Patient_ID = a13.Patient_ID
join `LU_PRESCRIBER` a14
on a12.Prescriber_ID = a14.Prescriber_ID
group by a12.Prescriber_ID,
a11.Patient_ID
In this case, the result set is incorrect because the SQL Engine chooses to join from the FACT_PRESCRIPTIONS table
to the LU_PRESCRIBER table through the Drug attribute, rather than the Patient attribute. Therefore, it uses the
relationship table between Drug and Prescriber to obtain the result set. As a result, the query finds the drugs that
each prescriber prescribed and then just joins to each patient who took those drugs, regardless of whether or not
they have a relationship with a particular prescriber. To determine relationships between patients and prescribers
(the information that you really want to analyze in this report), the query must access the relationship table
between Prescriber and Patient. Because the SQL Engine chooses the join path provided by the Drug attribute, the
REL_PATIENT_PRESCRIBER table is not included in the query.
Before generating SQL, the SQL Engine must determine the most efficient join path from the fact tables to the
lookup tables. To do so, the SQL Engine checks the following:
MicroStrategy Architect automatically calculates the logical table size and assigns a numeric value to each table
relative to the attributes that are contained in the table and their position in their respective hierarchies. Usually, a
smaller logical table size equates to a smaller physical table size. In cases like this one where the SQL Engine finds
two join paths to the LU_PRESCRIBER table, it checks the logical size of both the LU_DRUG and LU_PATIENT tables.
In this example, the Drug and Patient attributes have the same weight. Therefore, the tables have the same logical
size. The SQL Engine cannot differentiate between the two paths based on logical table size.
At this point, if the logical table size is equal, the SQL Engine cannot distinguish which path is the most efficient.
Therefore, it simply picks the lookup table based on the order of the attributes (Drug and Patient) in the system
hierarchy. Because Drug is first in the system hierarchy (it was created before Patient), the SQL Engine chooses to
join through the LU_DRUG table.
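The tie-breaking logic described above can be sketched conceptually. This is not the actual SQL Engine code or API; the function, its inputs, and the numeric sizes are all invented to illustrate the two-step comparison:

```python
# Conceptual sketch only -- not the actual SQL Engine algorithm.
# Candidate join paths are compared first by logical table size; when the
# sizes tie, the path whose attribute comes first in the system hierarchy
# (i.e., was created earlier) wins.
def choose_join_path(candidates):
    """candidates: list of (table_name, logical_size, creation_order)."""
    return min(candidates, key=lambda c: (c[1], c[2]))[0]

# LU_DRUG and LU_PATIENT have equal logical size, so creation order decides:
path = choose_join_path([("LU_DRUG", 10, 1), ("LU_PATIENT", 10, 2)])
print(path)  # LU_DRUG
```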
Essentially, when you have a split hierarchy such as this one, the SQL Engine has no way of differentiating between
the join paths because the split creates a situation in which both choices seem equally efficient. In actuality,
depending on the attributes on the report, sometimes you need to join through the LU_DRUG table and
sometimes through the LU_PATIENT table.
The best way to ensure that the SQL Engine always selects the most efficient join path and uses tables that provide
the desired result set is to remove the split from the hierarchy. You can resolve split hierarchies by creating joint
child relationships.
The joint child relates each of the original parent and child attributes that are involved in the split. It also provides a
single path for joins. In this example, you need to relate Prescriber, Drug, and Patient. To set up the joint child, you
need to do the following:
1. Create a relationship table that includes the parent attribute and both child attributes.
2. Create a joint child relationship between the parent and the two child attributes using this relationship table.
First, you need to create a relationship table that maps the relationship between the parent attribute and both
child attributes. The relationship table looks like the following:
Relationship Table with All Three Attributes
The REL_PRESCRIBER_DRUG_PATIENT table provides a means of relating all three attributes to one another. In this
way, you can view prescribers in relationship to both the drugs they prescribe and the patients to whom they
prescribe them at the same time. The following image shows the modified relationship between the three
attributes:
Modified Tables in the Prescriber Hierarchy
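A minimal sqlite3 sketch of how the three-way relationship table removes the ambiguity. Schemas are simplified and the data values invented; the point is that joining the fact table to REL_PRESCRIBER_DRUG_PATIENT on both Drug_ID and Patient_ID pairs each patient only with the prescriber who actually wrote the prescription:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE REL_PRESCRIBER_DRUG_PATIENT (
    Prescriber_ID INTEGER, Drug_ID INTEGER, Patient_ID INTEGER);
CREATE TABLE FACT_PRESCRIPTIONS (Drug_ID INTEGER, Patient_ID INTEGER, Presc_Amt REAL);
-- Only prescriber 1 wrote drug 1 for patient 7, even though
-- prescriber 2 also prescribes drug 1 (to another patient)
INSERT INTO REL_PRESCRIBER_DRUG_PATIENT VALUES (1, 1, 7), (2, 1, 8);
INSERT INTO FACT_PRESCRIPTIONS VALUES (1, 7, 20.0);
""")
rows = cur.execute("""
    SELECT r.Prescriber_ID, f.Patient_ID, SUM(f.Presc_Amt)
    FROM FACT_PRESCRIPTIONS f
    JOIN REL_PRESCRIBER_DRUG_PATIENT r
      ON f.Drug_ID = r.Drug_ID AND f.Patient_ID = r.Patient_ID
    GROUP BY r.Prescriber_ID, f.Patient_ID
""").fetchall()
print(rows)  # [(1, 7, 20.0)] -- prescriber 2 is no longer falsely paired with patient 7
```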
During the exercise, you will first create and run reports in the project without having any solution in place so that
you can see the effect of a split hierarchy on the result set. Then, you will resolve the split hierarchy by creating and
implementing a joint child relationship.
The Split Hierarchies Project is based on the following logical data model:
In the original data model, the hierarchy is split at its base by the Drug and Patient attributes. Because these
exercises focus on these attributes and their parent attribute, the hierarchy exists in the project only up to the
Prescriber attribute. The Prescriber hierarchy looks like the following:
Prescriber Hierarchy
The schema for this project consists of the following six tables:
Split Hierarchies Schema
1. In MicroStrategy Developer, in the Advanced Data Warehousing project source, open the Split Hierarchies
Project.
2. In the Public Objects folder, in the Reports folder, create the following report:
* You can access the Prescriber and Drug attributes from the Prescriber hierarchy.
3. Run the report. The result set should look like the following:
* This report displays each of the drugs that various prescribers have prescribed.
4. Switch to the SQL View for the report. The SQL should look like the following:
* Notice in the FROM clause that the REL_PRESCRIBER_DRUG table joins the Prescriber and Drug attributes as you
would expect since this table maps the relationship between them.
6. Save this report in the Reports folder as Prescriber-Drug Report and close the report.
* You can access the Patient attribute from the Prescriber hierarchy.
8. Run the report. The result set should look like the following:
* This report displays each of the patients for whom the various prescribers have prescribed drugs.
9. Switch to the SQL View for the report. The SQL should look like the following:
* Notice in the FROM clause that the REL_PRESCRIBER_PATIENT table joins the Prescriber and Patient attributes as
you would expect since this table maps the relationship between them.
11. Save this report in the Reports folder as Prescriber-Patient Report and close the report.
13. Run the report. The result set should look like the following:
Compare the patients related to each prescriber in this result set with the patients related to each prescriber in the
result set for the Prescriber-Patient Report.
Why do you see more patients associated with prescribers in this report than in the Prescriber-Patient Report? The
answer is in the SQL.
14. Switch to the SQL View for the report. The SQL should look like the following:
* Notice in the FROM and WHERE clauses that the query joins the fact table to the LU_PATIENT table using the join
path made available by the Drug attribute. As a result, this query uses the REL_PRESCRIBER_DRUG table to join to
the LU_PRESCRIBER table, not the REL_PRESCRIBER_PATIENT table. Therefore, in the result set, you do not see
patients related only to the prescribers who have prescribed medication to them. Instead, you see patients related
to any prescriber who has ever prescribed a drug they have taken, regardless of whether they were ever a patient of
that prescriber.
16. Save this report in the Reports folder as Prescriber-Patient-Metric Report and close the report.
1. Open the MicroStrategy DB Query Tool by choosing it in the Start Menu → MicroStrategy Tools menu.
3. Click Connect.
(5,5,13),
(6,5,10),
(7,6,9),
(5,7,7),
(5,7,6),
(5,7,13),
(1,8,11),
(7,8,8),
(2,4,10);
At this point, you would enter the data; however, for this exercise, it has already been entered for you, as shown in
the table below:
10. In MicroStrategy Developer, in the Split Hierarchies project, open the Warehouse Catalog.
13. In the Schema Objects folder, in the Attributes folder, open the Prescriber attribute.
15. Modify the ID form to remove REL_PRESCRIBER_DRUG and REL_PRESCRIBER_PATIENT as source tables.
19. In the Schema Objects folder, in the Attributes folder, open the Drug attribute.
24. In the Schema Objects folder, in the Attributes folder, open the Patient attribute.
31. In the Add Children Attributes window, under Selected Children, select the Create as Joint Child check box.
In creating the joint child relationship, you are essentially changing the data model to look like the following:
Modified Logical Data Model for Prescriber Hierarchy
35. Run the Prescriber-Patient-Metric Report again. The result set should look like the following:
* The result set correctly relates the prescribers and their patients. You can compare this result set to the previous
one for this same report earlier in the exercises.
36. Switch to the SQL View for the report. The SQL should look like the following:
* Because of the joint child, the report now contains the correct result set. The information from the
FACT_PRESCRIPTIONS table is joined to the prescriber information through the REL_PRESCRIBER_DRUG_PATIENT
table.
Recursive Hierarchies
A recursive hierarchy is one in which elements of an attribute have a parent-child relationship with other elements
of the same attribute. All of the attributes in a hierarchy may be recursive, or a hierarchy may have only a single
attribute that is recursive. For example, a company’s organizational structure looks like the following:
At first, this hierarchy seems very simple and straightforward. However, within the Employee attribute, there are
two levels of management along with the lowest-level employees who do not manage anyone. An organization
chart for the company looks like the following:
In the database, the employee data is stored in a single table in a recursive fashion. This table looks like the
following:
The LU_EMPLOYEE table stores not only the ID and name of each employee, but it also has a Manager_ID column,
which references the employee ID of each employee’s manager. If you just want to view a list of all employees, you
can create an Employee attribute and map its ID and DESC forms to the Employee_ID and Employee_Name
columns in the LU_EMPLOYEE table.
Generally, when you have a recursive attribute like Employee, you want to be able to run reports that show
managers and their corresponding employees. In the database, the data for all three levels of employees comes
from the same columns in the same table. However, on a report, they are logically different attributes. For example,
you could run a report that looks like the following:
The Level 1 Manager and Level 2 Manager attributes represent the two levels of management, and the Employee
attribute represents the lowest-level employees who do not manage anyone. Because all three attributes map to
the same columns in the same lookup table, you need to be able to alias the table three times in the SQL to retrieve
the employee name for each of the three attributes on the template. By default, the SQL Engine aliases a table only
once.
To resolve this issue with recursive hierarchies, you need to flatten the recursive attribute, creating separate lookup
tables or views for each level of recursion.
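A minimal sqlite3 sketch of the flattening, using views. The LU_EMPLOYEE structure follows the text; the employee names and the view definitions are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE LU_EMPLOYEE (Employee_ID INTEGER, Employee_Name TEXT, Manager_ID INTEGER);
INSERT INTO LU_EMPLOYEE VALUES
    (1, 'Ann Lee', NULL),      -- level 1 manager (no manager of her own)
    (2, 'Matt Wilson', 1),     -- level 2 manager, reports to Ann
    (3, 'Joseph Duke', 2);     -- employee, reports to Matt
-- One view per level of recursion
CREATE VIEW LU_LEVEL1MANAGER AS
    SELECT Employee_ID AS Level1_Mgr_ID, Employee_Name AS Level1_Mgr_Name
    FROM LU_EMPLOYEE WHERE Manager_ID IS NULL;
CREATE VIEW LU_LEVEL2MANAGER AS
    SELECT e.Employee_ID AS Level2_Mgr_ID, e.Employee_Name AS Level2_Mgr_Name,
           e.Manager_ID AS Level1_Mgr_ID
    FROM LU_EMPLOYEE e JOIN LU_LEVEL1MANAGER m ON e.Manager_ID = m.Level1_Mgr_ID;
""")
# Each attribute now maps to its own view, so the query can alias all three.
rows = cur.execute("""
    SELECT m1.Level1_Mgr_Name, m2.Level2_Mgr_Name, e.Employee_Name
    FROM LU_EMPLOYEE e
    JOIN LU_LEVEL2MANAGER m2 ON e.Manager_ID = m2.Level2_Mgr_ID
    JOIN LU_LEVEL1MANAGER m1 ON m2.Level1_Mgr_ID = m1.Level1_Mgr_ID
""").fetchall()
print(rows)  # [('Ann Lee', 'Matt Wilson', 'Joseph Duke')]
```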
* You cannot use explicit table aliasing to support recursive hierarchies. Although you could create two logical
table aliases for the LU_EMPLOYEE table and map each of the manager levels to one of the table aliases, this
solution does not work because all of the employee records are contained in the lookup table that is aliased. When
you run a report with any one of the three attributes (Level 1 Manager, Level 2 Manager, or Employee), it displays
every employee in the table for each attribute.
* Sometimes, tables with a recursive structure also contain a level column, which indicates the level of an element
in the recursive hierarchy. For example, a Level 1 Manager would be “1,” a Level 2 Manager would be “2,” and an
Employee would be “3.” Using a level column like this inside a table does not resolve issues with recursive
hierarchies because the MicroStrategy SQL Engine does not look at the specific data elements contained in the
rows of a table.
To flatten the recursive LU_EMPLOYEE table, you need to create three separate lookup tables or views. The three
tables or views look like the following:
After flattening the LU_EMPLOYEE table, you map the ID and DESC forms of the three attributes as follows:
You can then change the data model to reflect the relationships between the three attributes as follows:
Revised Employee Logical Data Model
Now, if you run the report, the result set looks like the following:
Report Result with Recursive Relationships
Since the Level 1 Manager, Level 2 Manager, and Employee attributes map to different tables or views, they each
are aliased in the SQL, which enables the query to display the employees with respect to the managerial
relationships that exist. The SQL for this report looks like the following:
select a12.Level1_Mgr_ID AS Level1_Mgr_ID,
a13.Level1_Mgr_Name AS Level1_Mgr_Name,
a11.Level2_Mgr_ID AS Level2_Mgr_ID,
a12.Level2_Mgr_Name AS Level2_Mgr_Name,
a11.Employee_ID AS Employee_ID,
a11.Employee_Name AS Employee_Name
from `LU_EMPLOYEE` a11
join `LU_LEVEL2MANAGER` a12
on a11.Level2_Mgr_ID = a12.Level2_Mgr_ID
join `LU_LEVEL1MANAGER` a13
on a12.Level1_Mgr_ID = a13.Level1_Mgr_ID
In the FROM clause, the SQL Engine uses the LU_LEVEL1MANAGER, LU_LEVEL2MANAGER, and LU_EMPLOYEE
tables that comprise the flattened schema to retrieve the data for the result set.
In the above example, if the fact tables in your data warehouse store data at the Employee level, you could also
resolve this issue using a completely denormalized lookup table, as shown below:
Completely Denormalized Employee Lookup Table
In the previous example, flattening the recursive table is a good solution because the number of levels is relatively
small and fixed. Furthermore, it assumes that the fact tables store data at the employee level only, which allows you
to drill up and down the hierarchy.
However, there could be situations where the fact tables contain data from higher levels. For instance, let us assume the above example represents a service organization. Both the level 1 and level 2 managers, along with the employees who report to them, perform billable work. The billable hours are recorded in one fact table. So if Joseph Duke performed 20 hours of billable work for a customer, then to create an accurate monthly billing report, he would have to appear in the employee table as an employee reporting to Matt Wilson, who would be listed as a level 2 manager. Similarly, if Matt Wilson also performed billable work, he would have to appear both as a level 2 manager and as an employee, in the latter case reporting to himself.
Recursive hierarchies pose additional challenges: they can be ragged, and some have no predefined limit to the number of levels.
Different solutions are possible depending on whether there is a need to see separate attributes. In the example discussed so far, it made sense to model three separate attributes so that you could look at business facts from an organizational hierarchy perspective. If such a requirement does not exist, the entire hierarchy can be modeled through Employee and Manager attributes. Basically, you create a separate relationship table that links any employee to her direct and indirect managers. In addition, the relationship table includes another attribute that represents the distance between an employee and each of her managers in the hierarchy.
* These relationship tables are also referred to as bridge, helper and explosion tables.
Essentially, the relationship table captures information in such a way that it effectively represents the parent-child
relation in the hierarchy. Here is a sample organization structure showing the employees and their relationship to
each other:
In our example, for Paul Smith the relationship table would contain multiple entries, capturing all of Paul’s
managers.
So for Paul, there would be three records in the relationship table, each indicating the overall relationship in the
hierarchy. It may also be useful to add another row indicating a relationship from Paul to himself with a distance of
zero. This will prove useful when the fact tables include data for both managers and employees. The following
image shows the relationship table structure and the number of records for Paul:
Hierarchy Relationship Table
* The relationship table only contains the IDs necessary to represent relationships between the attributes.
* Caution: if you are working with large, deep hierarchies, the relationship table may become very large. Note that it essentially captures every path from any employee to the top-level manager.
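The bridge table can be derived from the parent-child data with a recursive query. The sketch below uses SQLite; the employee IDs for Joseph Duke (2) and Paul Smith (12) follow the report SQL shown later in this section, but the intermediate rows of the chain are hypothetical:

```python
import sqlite3

# Parent-child source data (hypothetical chain from Paul Smith up to
# the top manager) and the empty REL_HIERARCHY bridge table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LU_EMPLOYEE (
    Employee_ID INTEGER PRIMARY KEY,
    Employee_Name TEXT,
    Manager_ID INTEGER       -- direct manager; NULL for the top manager
);
INSERT INTO LU_EMPLOYEE VALUES
    (1,  'Top Manager', NULL),
    (2,  'Joseph Duke', 1),
    (7,  'Mid Manager', 2),
    (12, 'Paul Smith',  7);

CREATE TABLE REL_HIERARCHY (
    Manager_ID  INTEGER,
    Employee_ID INTEGER,
    Distance    INTEGER
);
""")

# Every employee relates to herself at distance 0, to her direct
# manager at distance 1, and to each indirect manager one level up.
rows = conn.execute("""
    WITH RECURSIVE chain(Manager_ID, Employee_ID, Distance) AS (
        SELECT Employee_ID, Employee_ID, 0 FROM LU_EMPLOYEE
        UNION ALL
        SELECT e.Manager_ID, c.Employee_ID, c.Distance + 1
        FROM chain c
        JOIN LU_EMPLOYEE e ON e.Employee_ID = c.Manager_ID
        WHERE e.Manager_ID IS NOT NULL
    )
    SELECT * FROM chain
""").fetchall()
conn.executemany("INSERT INTO REL_HIERARCHY VALUES (?, ?, ?)", rows)

# Paul Smith gets one row per manager plus the distance-zero self row.
paul = conn.execute(
    "SELECT Manager_ID, Distance FROM REL_HIERARCHY "
    "WHERE Employee_ID = 12 ORDER BY Distance").fetchall()
print(paul)
```

In a real warehouse this population step would typically run during the ETL process rather than at query time.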
After creating the REL_HIERARCHY table above, you need to perform the following steps:
2. Create the Employee attribute by mapping it to Employee_ID column in the LU_EMPLOYEE and
REL_HIERARCHY tables.
3. Create the description form for the Employee attribute using Employee_Name column in the LU_EMPLOYEE
table.
* You could also use the other options discussed in the Attribute Roles lesson.
5. Create the Manager attribute by mapping it to the Manager_ID column in the REL_HIERARCHY table.
6. Use heterogeneous mapping to map the Employee_ID in the LU_EMPLOYEE_ALIAS table to the Manager
attribute.
7. Create the description form for the Manager attribute using the Employee_Name column in the
LU_EMPLOYEE_ALIAS table.
8. Create the Distance attribute by mapping it to the Distance column in the REL_HIERARCHY table.
9. Make Manager the parent and Employee the child of the Distance attribute using the REL_HIERARCHY table.
The following image shows the revised data model for the Geography hierarchy:
Revised Geography Logical Data Model
Now, if you wanted to see all the employees who report to Joseph Duke, you can create the following report:
Report for all Employees of a Specific Manager
When you run the report the following results are displayed:
Report Result of all Employees Reporting to a Specific Manager
* It is easy to exclude Joseph Duke from the result set by creating a report filter on the Distance attribute.
The SQL for the above report looks like the following:
select a12.[Manager_ID] AS Manager_ID,
    a13.[Employee_Name] AS Employee_Name,
    a11.[Employee_ID] AS Employee_ID,
    a11.[Employee_Name] AS Employee_Name0
from [LU_EMPLOYEE] a11
    join [REL_HIERARCHY] a12
    on a11.[Employee_ID] = a12.[Employee_ID]
    join [LU_EMPLOYEE] a13
    on a12.[Manager_ID] = a13.[Employee_ID]
where a12.[Distance] > 0
    and a12.[Manager_ID] in (2)
Notice that the relationship table is used to retrieve all employees and managers who have a relationship with a
distance greater than zero.
Sometimes you may not want to query a specific branch of the hierarchy, but rather for any given employee you
want to find the chain of managers. The relationship table makes this possible and if you include the Distance
attribute on the report template, you can then sort the result set to see an employee’s entire reporting structure.
For example, the following image shows the report template to determine all the managers for Paul Smith:
Report for all Managers for a Specific Employee
When you run the report the following results are displayed:
Report Result Showing Reporting Structure for a Specific Employee
* By sorting using the Distance attribute, it is easy to see who Paul’s immediate manager is and all of his indirect
managers up to the highest level.
The SQL for the above report looks like the following:
select distinct a12.[Manager_ID] AS Manager_ID,
a13.[Employee_Name] AS Employee_Name,
a12.[Distance] AS Distance
from [REL_HIERARCHY] a12,
[LU_EMPLOYEE] a13
where a12.[Manager_ID] = a13.[Employee_ID]
and (a12.[Employee_ID] in (12)
and a12.[Distance] > 0)
Notice that the relationship table is used to retrieve all the managers and their distances for Paul Smith.
Finally, consider how to include business fact data in your report, such as the hours billed by each employee. The following image shows the fact table:
Fact Table Structure
3. Edit the Employee attribute and make sure that the Employee_ID attribute form is mapped to the FACT_EMPLOYEE_BILLING_HOURS table.
Then, you could run a report that shows the total number of hours billed by all employees who are part of Joseph
Duke’s reporting chain. The result set looks like the following:
Result Set for Employee Information with a Metric
Notice that the report displays correctly all the employees who are under Joseph Duke’s chain of command. The
report also shows the hours billed by Joseph himself. To exclude him from the report result, an additional report
filter using the Distance attribute (Distance > 0) can be applied to this report. The SQL for this report looks like the
following:
select a11.[Employee_ID] AS Employee_ID,
    max(a13.[Employee_Name]) AS Employee_Name,
    sum(a11.[Billed_Hours]) AS WJXBFS1
from [FACT_EMPLOYEE_BILLING_HOURS] a11
    join [REL_HIERARCHY] a12
    on a11.[Employee_ID] = a12.[Employee_ID]
    join [LU_EMPLOYEE] a13
    on a11.[Employee_ID] = a13.[Employee_ID]
where a12.[Manager_ID] in (2)
group by a11.[Employee_ID]
Notice how the relationship table joins to the fact table to calculate the employees' billed hours. The relationship table is also used to filter the report to a single manager, in this case Joseph Duke.
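The same rollup can be sketched with SQLite. The rows below are hypothetical; only the table structure and the Joseph Duke manager ID (2) follow the text:

```python
import sqlite3

# Minimal tables for the billed-hours report: the bridge table joins
# the fact table to roll up hours for everyone in one manager's chain.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LU_EMPLOYEE (Employee_ID INTEGER, Employee_Name TEXT);
INSERT INTO LU_EMPLOYEE VALUES
    (2, 'Joseph Duke'), (7, 'Mid Manager'), (12, 'Paul Smith');

CREATE TABLE REL_HIERARCHY (Manager_ID INTEGER, Employee_ID INTEGER,
                            Distance INTEGER);
INSERT INTO REL_HIERARCHY VALUES
    (2, 2, 0), (2, 7, 1), (2, 12, 2);   -- Joseph's chain, incl. himself

CREATE TABLE FACT_EMPLOYEE_BILLING_HOURS (Employee_ID INTEGER,
                                          Billed_Hours REAL);
INSERT INTO FACT_EMPLOYEE_BILLING_HOURS VALUES
    (2, 10), (7, 15), (12, 20), (12, 5);
""")

rows = conn.execute("""
    SELECT a11.Employee_ID,
           MAX(a13.Employee_Name) AS Employee_Name,
           SUM(a11.Billed_Hours)  AS Billed_Hours
    FROM FACT_EMPLOYEE_BILLING_HOURS a11
    JOIN REL_HIERARCHY a12 ON a11.Employee_ID = a12.Employee_ID
    JOIN LU_EMPLOYEE   a13 ON a11.Employee_ID = a13.Employee_ID
    WHERE a12.Manager_ID IN (2)
    GROUP BY a11.Employee_ID
    ORDER BY a11.Employee_ID
""").fetchall()
print(rows)   # Paul's two fact rows roll up to 25 hours
```

Adding `AND a12.Distance > 0` to the WHERE clause would exclude Joseph Duke himself, as described above.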
First, you will view table data that contains recursive relationships. Next, you will flatten the recursive table into three lookup tables. Finally, you will create and run a report using the flattened lookup tables.
You will use the Recursive Hierarchies Project to build many of the project objects as part of the exercises. The
image below shows the logical data model with the recursive attribute.
In this data model, the Employee attribute is the recursive attribute. Because these exercises focus on recursion,
you will build attributes only for the employee data. You will modify the logical data model to separate the
Employee attribute into several logical attributes that reflect the recursive relationships.
1. Open the MicroStrategy DB Query Tool by selecting it in the Start menu, under the MicroStrategy Tools menu.
3. Click Connect.
* The Manager_ID column in the table denotes recursive relationships in the employee data. There are two levels of
managers. The lowest-level employees do not manage anyone. With this table in its current form, you cannot run a
report that displays the relationships between employees. You need to flatten the LU_EMPLOYEE_RECURSION
table to make such a report possible.
* The image below also shows the appropriate data type to use for each column.
9. In Microsoft Access, in the ADVDW_WH: Database window, click the Create menu.
* The table below also shows the appropriate data type to use for each column.
At this point, you would follow the steps above to create the lookup table for the lowest-level employees. However,
this table has already been created for you.
Add the flattened lookup tables to the project and create the project attributes
1. In MicroStrategy Developer, in the Advanced Data Warehousing project source, open and select the Recursive
Hierarchies Project.
2. On the Schema menu, select Architect to open the Architect Graphical Interface.
3. In the Architect graphical interface, on the Project Tables View tab, add the LU_LEVEL1_MANAGER,
LU_LEVEL2_MANAGER, and LU_EMPLOYEE_FLATTENED tables to the project from the Advanced Data
Warehousing Warehouse database instance.
• In the Result Preview window, for each table, leave all the attributes selected and click OK.
4. Rename the Level1 Mgr attribute as Level 1 Manager and the Level2 Mgr attribute as Level 2 Manager.
You need to create the Level 1 Manager, Level 2 Manager, and Employee attributes to map to the three levels of
data in the recursive relationships. Essentially, you are modifying the data model to look like the following:
Modified Logical Data Model for Recursive Hierarchies Project
* In this exercise, you will create only the last three attributes in the hierarchy.
5. On the Hierarchy View tab, configure the parent-child relationships between the three attributes based on the
logical data model above.
* You need to click the parent attribute and drag the mouse pointer to the child attribute.
To make it easier to browse the project attributes, you need to create a user hierarchy.
7. In the Hierarchy View tab, create a user hierarchy named Geography that includes all three attributes. Make all
three attributes entry points. Your user hierarchy should look like the following:
10. In the Schema objects folder, in the Hierarchies folder, move the Geography hierarchy to the Data Explorer
folder.
* This is required to be able to browse the Geography hierarchy in the Data Explorer.
Create a report
1. In the Public Objects folder, in the Reports folder, create the following report:
* You should find the Level 1 Manager, Level 2 Manager and Employee attributes in the Geography hierarchy.
2. Run the report. The result set should look like the following:
The report displays the three levels of employees in relationship to one another.
3. Switch to the SQL View for the report. The SQL should look like the following:
* The query retrieves the data from the three lookup tables that you created.
5. Save the report in the Reports folder as Employee List and close the report.
Creating Metrics
When creating the project schema, every fact requires a corresponding simple metric so that the fact can be used on a report or to build more complex metrics. The easiest way to do this is through the Metric Creation Settings in the Architect Settings. When this option is set, for every fact created in Architect, a metric with the same name is created in the Public Objects\Metrics folder. Alternatively, a metric for each fact can be created manually using the Metric Editor. Below is an example exercise.
1. In MicroStrategy Developer, log into the MicroStrategy Analytics Modules 3-tier project source.
6. Within Project Architecting folder, on the File menu, point to New and select Metric.
7. Click OK.
8. On the Object Browser panel on the left, navigate to Schema Objects – Facts – Cost.
9. Drag Cost fact to the Definition box on the right. By default, the Sum function is applied to Cost.
13. Type Avg Cost as the new metric’s name and click Save.
At the end of this process, you have created a simple, basic metric.
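Conceptually, a simple metric is just an aggregation function applied to a fact column. The sketch below uses a hypothetical ORDER_FACT table with an ORDER_COST column to contrast the default Sum metric with the Avg Cost metric built in the exercise:

```python
import sqlite3

# Hypothetical fact table; column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ORDER_FACT (ORDER_ID INTEGER, ORDER_COST REAL);
INSERT INTO ORDER_FACT VALUES (1, 100.0), (2, 50.0), (3, 30.0);
""")

# A Cost metric defaults to Sum; the exercise changes the function to Avg.
sum_cost, avg_cost = conn.execute(
    "SELECT SUM(ORDER_COST), AVG(ORDER_COST) FROM ORDER_FACT").fetchone()
print(sum_cost, avg_cost)  # 180.0 60.0
```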
Create Transformations
Transformations are schema objects most often used to compare values at different times—for example, this year
versus last year and today versus month to date. Transformations are useful for discovering and analyzing
time-based trends in your data.
Transformations are not placed on the template of a report or a document. Instead, report developers use
transformations to define transformation metrics—simple metrics that assume the properties of the
transformation applied to them.
For example, a report developer could create a metric to calculate revenue. If the report developer adds a Last Year's transformation to this metric, effectively creating a Last Year's Revenue metric, the new metric calculates last year's revenue as shown in the following image:
Current Versus Last Year’s Revenue
Any transformation can be included as part of the definition of a metric, and multiple transformations can be
applied to the same metric.
Types of Transformations
Table-Based Transformations
Table-based transformations use transformation tables in the warehouse to define the transformation from one
time period to another. These tables are often created during the ETL process. Some transformation data can be
easily incorporated into the lookup tables for attributes—such as Day—by adding the transformation columns. For
example, the LU_DAY table in the data warehouse for the MicroStrategy Tutorial project has the following
structure:
LU_DAY Lookup Table
* This table also has columns for the parent IDs at all levels. These columns are not displayed in the image above.
Each date in the LU_DAY table has a transformed value for the previous day, last month’s date, last quarter’s date,
and last year’s date. For example, for January 5, 2013, the previous day was January 4, 2013, while the last quarter’s
date was October 5, 2012.
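Populating such transformation columns is straightforward month arithmetic. The sketch below is an assumption about how the columns could be computed during ETL (the column names are illustrative); the day of month is clamped when the target month is shorter:

```python
from datetime import date, timedelta
import calendar

def shift_months(d, months):
    """Shift a date by a number of months, clamping the day if needed."""
    y, m = divmod(d.year * 12 + (d.month - 1) + months, 12)
    last_day = calendar.monthrange(y, m + 1)[1]
    return date(y, m + 1, min(d.day, last_day))

def lu_day_row(d):
    # One row of a day lookup table with its transformed dates.
    return {
        "DAY_DATE":    d,
        "PREV_DAY":    d - timedelta(days=1),
        "LM_DAY_DATE": shift_months(d, -1),   # last month's date
        "LQ_DAY_DATE": shift_months(d, -3),   # last quarter's date
        "LY_DAY_DATE": shift_months(d, -12),  # last year's date
    }

row = lu_day_row(date(2013, 1, 5))
print(row["PREV_DAY"], row["LQ_DAY_DATE"])  # 2013-01-04 2012-10-05
```

The January 5, 2013 example from the text reproduces exactly: the previous day is January 4, 2013 and the last quarter's date is October 5, 2012.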
Other types of transformations may require separate transformation tables. For example, the following image
shows the table that stores the values for the month to date transformation, the MTD_DAY table:
Month to Date Transformation Table
In the MTD_DAY transformation table, each date has one or more records for all dates within its month before and
including that date. For example, there is only one record for the January 1, 2013 date, but there are three records
for the January 3, 2013 date.
* This type of transformation data cannot be stored in the lookup table for the Day attribute because lookup tables
store each unique date only once.
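Generating the many-to-many rows of an MTD_DAY-style table can be sketched in a few lines (the column pairing is an assumption about the table's layout):

```python
from datetime import date

def mtd_rows(dates):
    # Pair each date with every date in its month up to and including
    # itself: the many-to-many month-to-date relationship.
    rows = []
    for d in dates:
        for prior in dates:
            if (prior.year, prior.month) == (d.year, d.month) and prior <= d:
                rows.append((d, prior))   # (date, month-to-date member)
    return rows

january = [date(2013, 1, n) for n in (1, 2, 3)]
rows = mtd_rows(january)
print(len(rows))  # 1 + 2 + 3 = 6 rows
```

As in the text, January 1, 2013 gets one row while January 3, 2013 gets three.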
After a transformation object has been associated with a metric, the Engine uses the transformation to generate
SQL for that metric. The following illustration shows how transformation tables act as intermediaries in the metric
join path when you use transformation metrics on a report:
Transformation Tables in Metric Join Path
Depending on the database used for the data warehouse, a table-based transformation may be required when performing a many-to-many transformation, such as a year-to-date calculation. Table-based transformations are also required any time the business rules for a transformation cannot be captured in an expression.
Expression-Based Transformations
Expression-based transformations define the transformation through a mathematical expression on the attribute ID. For example, a Last Quarter or Last Month transformation can be created using QUARTER_ID–1 or MONTH_ID–1,
respectively. Users can also create expression-based transformations using pass-through functions such as
ApplySimple. These types of expressions enable users to take advantage of database-specific functions that they
can use to calculate certain types of transformations.
Expression-based transformations work only if the attribute IDs store data in a format conducive to calculation. For example, if month IDs are stored in the format YYYYMM, the MONTH_ID–1 expression does not always work. If that expression is applied to the month ID 201101 (January 2011) to obtain information about 201012 (December 2010), you do not get any data because there is no 201100 month ID.
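The YYYYMM pitfall can be sketched in a few lines. A working Last Month expression must handle the January boundary, which is the kind of business rule that often ends up in a transformation table or a database-specific pass-through expression instead:

```python
def last_month_naive(month_id):
    # The simple expression from the text applied to a YYYYMM ID.
    return month_id - 1         # 201101 becomes 201100, which never exists

def last_month(month_id):
    # Boundary-aware version: January rolls back to last year's December.
    year, month = divmod(month_id, 100)
    if month == 1:
        return (year - 1) * 100 + 12
    return month_id - 1

print(last_month_naive(201101))  # 201100 (no such month ID)
print(last_month(201101))        # 201012
```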
Transformation Components
• Member attributes
• Member expressions
• Member tables
• Mapping type
Member Attributes
The member attributes are attributes to which the transformation applies. In other words, it is the lowest-level
attribute on the report to which the transformation can be applied. For example, if the report is analyzed at the
quarter level, a Quarter member attribute should be added to the transformation. On the other hand, if the
transformation is month to date, the member attribute would be Day.
Member Expressions
Each member attribute has a corresponding expression that needs to be resolved in the report SQL to obtain valid
data. For expression-based transformations, the member expression is a mathematical expression. For table-based
transformations, it is a column from a transformation table in the warehouse that points to the valid data.
* A single transformation can use a combination of table-based and expression-based transformations. For
example, you could create a Last Year transformation based on the Year, Month, and Day attributes. Year could use
an expression such as YEAR_ID–1. However, Month and Day could map to columns in a transformation table
because their IDs are not conducive to expression-based transformation.
Member Tables
The member tables store the data for the member attributes. For expression-based transformations, the member
tables are generally lookup tables that correspond to the attribute being transformed, such as LU_DAY,
LU_QUARTER, and so forth. For table-based transformations, it is the transformation table that stores the
relationship.
Mapping Type
The mapping type determines the way the transformation is created based on the nature of the data. The mapping
type can be one of the following:
• One-to-one—A typical one-to-one relationship is last year versus this year. A year maps to exactly one other
year.
• Many-to-many—A typical many-to-many relationship is year to date. A date maps to itself and any other
dates that come before it in the given year.
Overview
In this exercise, first create a Last Year’s transformation that has a Year member attribute with a [YEAR_ID] - 1
expression defined on the LU_YEAR table.
After updating the schema, create the Last Year's Forecast Revenue transformation metric.
Run the report. Add the Quarter attribute to the template and run the report again. After reviewing the result set,
save the report as Transformation Example in the Public Objects\Reports folder.
Next, modify the Last Year’s transformation by adding three more member expressions. The transformation should
be defined as follows:
Finally, after updating the schema, run the Transformation Example report and drill from 2011 Q1 to Day to test the new member expressions.
Detailed Instructions
1. To create the Last Year’s transformation, in the MicroStrategy Analytics Modules project source, open the
Forecasting Project.
6. Click Open.
7. In the Define a new member attribute expression window, in the Table drop-down list, select the LU_YEAR
table.
9. Click OK.
11. Save the transformation in the Schema Objects\Transformations folder as Last Year’s.
13. To create a transformation metric, in the Public Objects folder, in the Metrics folder, create a new metric.
16. In the Metric: New Metric is defined as pane, select Transformation = (nothing).
18. Save the metric in the Public Objects\Metrics folder as Last Year’s Forecast Revenue and close it.
20. To create a report with a transformation metric, in the Public Objects\My Reports folder, create the following
report:
• Access the Year attribute from the Time hierarchy. Find the Forecast Revenue metric in the Metrics folder.
21. Run the report. The report result set should look like the following:
Only data for the years 2012 and 2013 is displayed, even though Forecast Revenue data for 2011 exists in the data warehouse. Notice that the two metrics return different values in each row, but the Forecast Revenue for 2012 is identical to the Last Year's Forecast Revenue for 2013, as expected.
23. Add the Quarter attribute to the report template to the right of Year.
25. Run the report. The report result set should look like the following:
Since the Last Year’s transformation is not defined for the Quarter attribute, the transformation metric is not
evaluated correctly. The metric values are identical for both metrics in each row. This data cannot be correct, since
you already know from the previous result set that forecast revenue for each year is different.
26. Save the report in the Public Objects\Reports folder as Transformation Example and close it.
27. Add the Quarter member attribute to the Last Year's transformation.
28. In the Schema Objects folder, in the Transformations folder, double-click the Last Year’s transformation.
32. In the Define a new member attribute expression window, in the Table drop-down list, select the LU_QUARTER
table.
33. Double-click the LY_QUARTER_ID column to move it to the Member attribute expression pane.
35. To add the Month member attribute to the Last Year’s transformation, in the Transformation Editor, click Add.
38. In the Define a new member attribute expression window, in the Table drop-down list, select the LU_MONTH
table.
39. Double-click the LY_MONTH_ID column to move it to the Member attribute expression pane.
41. To add the Day member attribute to the Last Year’s transformation, in the Transformation Editor, click Add.
44. In the Define a new member attribute expression window, in the Table drop-down list, select the LU_DAY table.
45. Double-click the LY_DAY_DATE column to move it to the Member attribute expression pane.
49. Run the Transformation Example report. The report result set should look like the following:
This result set is correct with the quarter-level transformation. The Forecast Revenue total for 2012 is identical to
the Last Year’s Forecast Revenue total for 2013, as expected.
50. Right-click the 2012 Q1 quarter element, point to Drill, point to Down, and select Day. The drill down report
result set should resemble the following:
* The image above only displays the first few rows of the result set for the report.
2
MANAGING MICROSTRATEGY
SCHEMA
Modifying Facts
You can modify existing facts in the Architect graphical interface. You can change any of the following parts of a fact:
• Fact expressions
• Source tables
• Mapping methods
• Column alias
You can modify facts either from the Project Tables View tab or from the Properties pane. While the Project Tables View tab provides access to all parts of a fact, the Properties pane provides access only to the column alias and individual fact expressions.
It is best to use the Project Tables View tab to modify facts when you need to perform any of the following tasks:
• Modify a single fact expression for all the tables to which it maps
Column aliases can also be modified for existing facts. Every fact has a default column alias, regardless of the type
or number of expressions defined for it. The column alias specifies the data type that the MicroStrategy Engine uses
for a fact when it generates SQL for temporary tables.
By default, a fact inherits its data type from the column on which it is defined. Therefore, if you map a fact to a
column called QTY_SOLD that uses an Integer data type, MicroStrategy Architect automatically creates a
corresponding column alias called QTY_SOLD with an Integer data type.
If a fact maps only to a derived expression, MicroStrategy Architect creates a custom column alias. Custom column
aliases use the naming convention CustCol_<n> where “Cust” stands for custom, “Col” stands for column, and n is a
number. The first custom column alias MicroStrategy Architect creates in a project is CustCol_1, the next CustCol_2,
and so forth.
• Since the column alias name is used in any SQL that the MicroStrategy Engine generates, you can change
the name of custom column aliases to make them more meaningful. Having column alias names that
correlate to underlying fact names can be useful when troubleshooting the SQL for complex reports.
If you define a fact using multiple expressions, the column alias uses the column name and data type of the first
expression you create.
If the first expression you create is a derived expression, MicroStrategy Architect creates a custom column alias as
described above.
Most of the time, you do not need to modify the column alias. However, there are specific scenarios in which you
may need to change the default data type. For example, you could create a fact defined as the difference between
two dates, such as a start date and expire date. This fact has the following expression:
[EXPIRE_DATE] - [START_DATE]
The column alias for this fact automatically uses a TimeStamp data type because the EXPIRE_DATE and
START_DATE columns use the TimeStamp data type. However, the result of the expression (the difference between
the two dates) produces an integer.
The difference in data types can cause problems when the MicroStrategy Engine needs to insert values for this fact
into a temporary table. The Engine uses a TimeStamp data type to define this fact column in the temporary table
and then tries to insert integer numbers into the column. While this may not be a problem for some database
platforms, it can cause an error. To eliminate the conflicting data types, you can modify the column alias for the fact
to change the data type from TimeStamp to Integer.
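The mismatch is easy to see outside of SQL as well: subtracting two date values yields an elapsed amount, not another date. A minimal sketch, with hypothetical sample dates:

```python
from datetime import date

# Subtracting two dates produces a day count (an integer),
# not another date or timestamp value.
expire_date = date(2013, 6, 30)
start_date = date(2013, 1, 1)

elapsed_days = (expire_date - start_date).days  # plain integer
print(type(elapsed_days).__name__, elapsed_days)  # int 180
```

This is why the column alias for such a fact should use an Integer data type rather than TimeStamp.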
All facts have properties that you can view in the Properties pane. You can also modify many of these properties.
The following table lists these properties and their descriptions:
Properties for Facts
Using the Properties pane is the quickest method to modify a column alias.
2. On the Facts tab, in the drop-down list, select the fact for which you want to view or modify properties.
3. In the Properties pane, click the properties you want to view or modify. View or modify the fact properties as
desired. Some properties have text boxes or drop-down lists that enable you to modify their values. For other
properties, selecting the property or its current value displays the following Browse button:
Clicking this Browse button opens a window that enables you to modify the related property.
Overview
In this exercise, use the Architect graphical interface to modify the Cost, Revenue, and Units Sold facts you created
in the My Demo Project. Create the following expressions for these facts:
• After all these facts are modified, save and update the project schema.
• There are separate sets of instructions for the Cost, Revenue, and Units Sold facts.
1. To create the second expression for the Cost fact, on the Project Tables View tab, in any project table that
contains the Cost fact, right-click Cost and select Edit.
3. In the Create New Fact Expression window, in the Source table drop-down list, select ORDER_DETAIL.
4. In the Fact Expression box, create the following expression: QTY_SOLD * UNIT_COST.
6. Click OK.
7. To create the third expression for the Cost fact, in the Fact Editor, on the Definition tab, click New.
8. In the Create New Fact Expression window, in the Source table drop-down list, select ORDER_FACT.
13. Save the changes you have made to the project, but keep the project open in the Architect graphical interface
so you can create the next fact.
1. To create the second expression for the Revenue fact, on the Project Tables View tab, in any project table that
contains the Revenue fact, right-click Revenue and select Edit.
3. In the Create New Fact Expression window, in the Source table drop-down list, select ORDER_DETAIL.
4. In the Fact Expression box, create the following expression: QTY_SOLD * (UNIT_PRICE - DISCOUNT).
6. To create the third expression for the Revenue fact, in the Fact Editor, modify the Revenue fact to create a third
expression: ORDER_AMT.
8. Click OK.
10. Save the changes you have made to the project, but keep the project open in the Architect graphical interface
so you can modify the next fact.
1. To create the second expression for the Units Sold fact, on the Project Tables View tab, in any project table that
contains the Units Sold fact, right-click Units Sold and select Edit.
3. In the Create New Fact Expression window, in the Source table drop-down list, select ORDER_DETAIL.
6. Click OK.
9. In the Update Schema window, ensure the following check boxes are selected:
Overview
In this exercise, use the Architect graphical interface to create the Profit fact in the My Demo Project. Create the
following expressions for this fact:
After this fact is created, save and update the project schema.
Detailed Instructions
Since you have practiced modifying several different facts using step-by-step instructions, these instructions are
written at a higher level to help better test your understanding.
1. On the Project Tables View tab, in any project table (CITY_SUBCATEG_SLS, CUSTOMER_SLS, or DAY_CTR_SLS)
that contains the TOT_DOLLAR_SALES column, right-click the table header and select Create Fact.
2. The TOT_DOLLAR_SALES column is already used in the Revenue fact, so you have to reuse this column to create
the Profit fact.
3. In the MicroStrategy Architect window, in the box, type Profit as the fact name.
4. Click OK.
5. In the Create New Fact Expression box, create the following expression: TOT_DOLLAR_SALES - TOT_COST.
7. Click OK.
8. To create the second expression for the Profit fact, edit the Profit fact.
9. In the Fact Editor, create a second expression: QTY_SOLD * (UNIT_PRICE - DISCOUNT - UNIT_COST).
11. To create the third expression for the Profit fact, in the Fact Editor, create a third expression: ORDER_AMT -
ORDER_COST.
14. On the Home tab, in the Save section, click Save and Close.
15. In the Update Schema window, ensure the following check boxes are selected:
In this exercise, modify the Day and Employee attributes created in the My Demo Project. Create or modify the
following forms and expressions for these attributes:
• Define the Use as Browse Form and Use as Report Form properties as indicated for each attribute
• After both attributes have been modified, save and update the project schema.
1. In the Architect graphical interface, click the Project Tables View tab.
2. On the Project Tables View tab, in the Layer section, in the layers drop-down list, select the Time layer.
3. On the Project Tables View tab, in the LU_DAY table, select the Day attribute.
4. To modify the ID form for the Day attribute, in the Properties pane, under Form 1:ID, select ID to see the Browse
button.
5. Click Browse.
6. In the Modify Attribute Form window, on the Definition tab, click New.
7. In the Create New Form Expression window, in the Source table drop-down list, select ORDER_DETAIL.
• There is no need to configure the properties for the Use as Browse and Use as Report Forms for the Day
attribute, because they are automatically set to True.
1. To modify the Employee description (DESC) form, on the Project Tables View tab, in the Layer section, in the
layers drop-down list, select the Geography layer.
2. On the Project Tables View tab, in the LU_EMPLOYEE table, right-click the Employee attribute and select New
Attribute Form.
3. In the Create New Form Expression window, define the expression as follows: EMP_LAST_NAME + ", " +
EMP_FIRST_NAME.
5. Click OK.
6. To create the First Name form for the Employee attribute, create a new attribute form for the Employee attribute
with the following expression: EMP_FIRST_NAME.
8. In the Properties pane, modify the Name property of the newly created form (Form 3: None) to First Name.
9. To create the Last Name form for the Employee attribute, create a new attribute form for the Employee attribute
with the following expression: EMP_LAST_NAME.
11. In the Properties pane, modify the Name property of the newly created form (Form 4: None) to Last Name.
12. To create the SSN form for the Employee attribute, create a new attribute form for the Employee attribute with
the following expression: EMP_SSN.
14. In the Properties pane, modify the Name property of the newly created form (Form 5: None) to SSN.
15. To define the report display and browse forms for the Employee attribute, in the LU_EMPLOYEE table, select the
Employee attribute, if not already selected.
16. In the Properties pane, under Form 2: DESC, ensure that the Use as Browse Form and Use as Report Form
properties are set to True.
17. Under Form 3: First Name, modify the Use as Browse Form and Use as Report Form properties to False.
18. Under Form 4: Last Name, modify the Use as Browse Form and Use as Report Form properties to False.
19. Under Form 5: SSN, modify the Use as Report Form property to False.
21. Save and close the Architect graphical interface, updating the project schema.
Aggregate Tables
Base fact tables are tables that store a fact or set of facts at the lowest possible level of detail. Aggregate fact tables
are tables that store a fact or set of facts at a higher, or summarized, level of detail.
Because they store data at a higher level, aggregate fact tables reduce query time. For example, to view a report
that shows unit sales by region, users can obtain the result set more quickly using the FACT_SALES_AGG table than
the FACT_SALES table.
At this point, you have seen how MicroStrategy Architect becomes aware of aggregate fact tables. However, the question remains how MicroStrategy Architect knows to use the aggregate table rather than the base fact table in cases where either table can provide the answer.
MicroStrategy Architect assigns a size to every table when you initially add it to a project. These size
assignments are stored in the metadata. MicroStrategy Architect assigns sizes based on the columns in the tables
and the attributes to which those columns correspond. Because MicroStrategy Architect uses the logical attribute
definitions to assign a size to each table in the project, this measurement is referred to as logical table size.
The following illustration is a visual representation of the algorithm used by MicroStrategy Architect in assigning
logical table sizes:
Calculating the Logical Table Size
Logical table size is the sum of the weight for each attribute contained in the table. Attribute weight is defined as
the position of an attribute in its hierarchy divided by the number of attributes in the hierarchy, multiplied by a
factor of 10. Using this formula, MicroStrategy Architect calculates the respective weight of each attribute as shown
in the illustration above. The logical table size of each fact table is simply the sum of its respective attribute
weights.
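Because the illustration itself is not reproduced here, the weighting algorithm can be sketched in Python. The hierarchies and table contents below are invented examples for illustration, not the values from the illustration:

```python
# Sketch of the logical-table-size heuristic described above.
# Attribute weight = (position in hierarchy / attributes in hierarchy) * 10,
# where position 1 is the highest (most aggregated) level.
# The hierarchy contents below are hypothetical examples.

hierarchies = {
    "Geography": ["Region", "Store"],  # Region = position 1, Store = position 2
    "Product":   ["Class", "Item"],    # Class = position 1, Item = position 2
}

def attribute_weight(attr):
    for levels in hierarchies.values():
        if attr in levels:
            return (levels.index(attr) + 1) / len(levels) * 10
    raise KeyError(attr)

def logical_table_size(attrs):
    # Logical table size = sum of the weights of the table's attributes.
    return sum(attribute_weight(a) for a in attrs)

base_fact = ["Store", "Item"]   # lowest level: 10 + 10
agg_fact  = ["Region", "Item"]  # aggregated:    5 + 10

print(logical_table_size(base_fact))  # 20.0
print(logical_table_size(agg_fact))   # 15.0
```

With these sizes, the SQL Engine would prefer the aggregate table (15) over the base table (20) whenever either can answer the query.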
You can view the logical table size for each table in the Logical Table Editor. When the SQL Engine can obtain data
from two or more tables in the warehouse, it looks at the logical table size and generates SQL against the table with
the smallest logical table size. This process helps the SQL Engine select the optimal table for a query.
At times, you may need to reassign the logical table size for a table. For example, in the previous illustration for
logical table sizes, there are two aggregate fact tables that both have the same logical table size of 15. However,
one of these tables contains item and region information, and the other one has class and store information.
Clearly, based on the attributes they contain, the table with item and region information is larger. There are many
more items than classes to which items belong. In this example, where the logical table size is the same but the
physical size is actually very different, you can change the logical table size automatically assigned by
MicroStrategy Architect.
Generally, smaller logical size does equate to smaller physical size. Tables with higher-level attributes usually have a
smaller logical table size than tables with lower-level attributes. However, there are times when this is not the case
due to the particular combination of attributes in a table. In such cases, you have to change the logical table size to
force the SQL Engine to use the table that you know has a smaller physical size.
Data Mart
A data mart is a relational table that contains the results of a report. You create the data mart report in Developer and save the data mart table in a warehouse of your choice. After you create a data mart table, you can add it to a project and use it as a source table.
Data marts are useful in scenarios such as:
• Creating tables for very large result sets and then using other applications such as Microsoft Excel or Microsoft Access to access the data
In this lesson, you will use data marts to create aggregate fact tables.
• You can use data marts in other usage scenarios. Combining data marts with MicroStrategy data mining features or with Freeform SQL reports are two such scenarios.
In this example, forecasting data is stored at the employee and date level in the FORECAST_SALES base fact table. However, you want to report on Forecast Units Sold at the Region level. This requires three joins from the fact table to the LU_REGION lookup table. In addition, the FORECAST_SALES table may have millions of rows, so this query may be very costly, especially if users request it often.
What if you could create an aggregate table that limits the number of joins and the number of rows in the fact table? You can achieve this by creating a data mart table. You can bring this table into a project, map the Forecast Units Sold fact and metric to it, and have region-level reports automatically use it, as shown below:
Aggregate Fact Table Created as Data Mart
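As a toy illustration of the idea (not of the data mart feature itself, which generates the table from a report), the following SQLite sketch pre-aggregates an invented FORECAST_SALES table to the region level; all table contents are made up:

```python
import sqlite3

# Toy illustration of pre-aggregating a base fact table to the region level.
# Table and column names follow the text; the data itself is invented.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE LU_REGION (REGION_ID INTEGER, REGION_DESC TEXT);
CREATE TABLE LU_EMPLOYEE (EMP_ID INTEGER, REGION_ID INTEGER);
CREATE TABLE FORECAST_SALES (EMP_ID INTEGER, DAY_DATE TEXT, FORECAST_UNITS INTEGER);

INSERT INTO LU_REGION VALUES (1, 'North'), (2, 'South');
INSERT INTO LU_EMPLOYEE VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO FORECAST_SALES VALUES
  (10, '2015-01-01', 5), (11, '2015-01-01', 3), (12, '2015-01-02', 7);
""")

# The aggregate table stores the fact at the region level, so region-level
# reports no longer need to join through LU_EMPLOYEE or scan every base row.
con.execute("""
CREATE TABLE REGION_FORECAST_SALES AS
SELECT e.REGION_ID, SUM(f.FORECAST_UNITS) AS FORECAST_UNITS
FROM FORECAST_SALES f JOIN LU_EMPLOYEE e ON f.EMP_ID = e.EMP_ID
GROUP BY e.REGION_ID
""")

rows = con.execute(
    "SELECT REGION_ID, FORECAST_UNITS FROM REGION_FORECAST_SALES ORDER BY REGION_ID"
).fetchall()
print(rows)  # [(1, 8), (2, 7)]
```

The aggregate table has one row per region instead of one row per employee per day, which is why queries against it return faster.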
• Data mart report—a metadata object that can be created in the Report Editor. When executed, the data
mart report creates the data mart table in the warehouse of your choice. The data mart report contains
attributes, metrics, and other application objects that translate into columns in the data mart table.
• Data mart table—the relational table created after the execution of a data mart report.
When you create a data mart report, you must specify a database instance in which to create the data mart table. There are three options:
• Option 1—Use the primary project database instance.
• Option 2—Use a secondary project database instance that exists in the same warehouse as the primary project database instance.
• Option 3—Use a different database instance than the project, and one that is in a different warehouse than
the primary project database instance.
The following figure illustrates each of these data mart database instance options:
Data Mart Database Instance Options
If the primary project database instance is used, there is no need to take any additional steps to create a data mart.
Simply select the primary data mart database instance as a target when you create the data mart report.
If you plan to use a secondary project database instance, you must create that database instance before creating the data mart. You can then associate the instance with the project in the Project Configuration Editor.
Complete the following exercises using the Forecasting Project, which is found in the MicroStrategy Analytics
Modules project source. The logical data model and schema for this project are included with the exercises. Please
review this information before beginning the exercises.
• You will also use the Forecasting Project to complete other exercises throughout this course.
The Forecasting Project consists of several attributes in the Time and Geography hierarchies. The attributes in both
hierarchies are indirectly related to each other through the fact tables.
The schema for this project consists of the following lookup tables:
Lookup Tables
The schema for this project consists of the following fact tables:
Fact Tables
The data warehouse for the Forecasting Project contains additional tables. You will bring some of those tables into the MicroStrategy Tutorial project later in this course.
In this exercise, you will first create a REGION_FORECAST_SALES aggregate table using the data mart feature in the Forecasting Project. Then you will bring this table into the project and test how the Engine selects the fact table based on logical table size.
Before creating a data mart, first modify the Forecast Revenue metric to use a custom Forecast_Revenue column
alias. Then, create a data mart report with Region ID and Quarter ID attribute forms and the Forecast Revenue and
Forecast Units Sold metrics on the template. Name the data mart table REGION_FORECAST_SALES. Save the data
mart report as Regional Forecast Revenue in the Public Objects\Reports folder.
Next, add the table to the Forecasting Project. Then, create a new fact expression for the Forecast Revenue fact that
uses the Forecast_Revenue column in the REGION_FORECAST_SALES table. Also, create and run a Data Mart Test
report with the Region attribute and the Forecast Revenue metric on the template to confirm that the Forecast
Revenue fact uses the REGION_FORECAST_SALES table.
Finally, change the logical table size for the REGION_FORECAST_SALES table to 30 and run the Data Mart Test
report to view the impact of your change on the report SQL.
Detailed Instructions
1. Open the DB Query Tool by selecting it from the Start Menu > MicroStrategy Tools.
5. To modify the metric alias, in Developer, log in to the MicroStrategy Analytics Modules project source as Administrator and leave the password blank.
7. In the Public Objects folder, in the Metrics folder, edit the Forecast Revenue metric.
8. In the Metric Editor, on the Tools menu, point to Advanced Settings and select Metric Column Options.
9. In the Metric Column Alias Options window, in the Column Name used in table SQL creation box, type
Forecast_Revenue.
12. If the Forecast Units Sold metric was not edited during the lesson with your instructor, repeat steps 3 to 7 for the Forecast Units Sold metric so that the Column Name used in table SQL creation box contains Total_Unit_Sales.
13. To create the Data Mart report, in the Public Objects folder, in the Reports folder, create the following report:
14. Access the Region attribute from the Geography hierarchy. Access the Quarter attribute from the Time
hierarchy. The Forecast Revenue and Forecast Units Sold metrics are located in the Metrics folder.
15. To configure attribute display options, in the Report Editor, on the Data menu, select Attribute Display.
16. In the Attribute Display window, in the Attribute drop-down list, ensure that Region is selected.
17. Under Select one of the display options below, click Use the following attribute forms.
19. Click the upper > button to move the ID form to the Displayed forms list.
21. Click the upper < button to remove the DESC form from the Displayed forms list.
22. In the Report objects forms list, select the DESC form.
23. Click the lower < button to remove the DESC form from the Report objects forms list.
24. In the Attribute Display window, in the Attribute drop-down list, select Quarter.
25. Under Select one of the display options below, click Use the following attribute forms.
27. Click the upper > button to move the ID form to the Displayed forms list.
29. Click the upper < button to remove the DESC form from the Displayed forms list.
30. In the Report objects forms list, select the DESC form.
31. Click the lower < button to remove the DESC form from the Report objects forms list.
33. To configure the data mart, in the Report Editor, on the Data menu, select Configure Data Mart.
34. In the Report Data Mart Setup window, on the General tab, in the Data mart database instance drop-down list,
ensure the Forecast Data database instance is selected.
39. Save the report in the Public Objects\Reports folder as Regional Forecast Revenue.
41. After the report executes, you see a message that the result data has been stored in the
REGION_FORECAST_SALES table, as shown below:
43. To incorporate the data mart table into the project, in Developer, on the Schema menu, select Architect.
44. In the Read Only window, select Edit. This locks all schema objects in this project so other users cannot modify them.
45. In the Architect graphical interface, click the Project Tables View tab.
47. You can change automatic metric creation by clicking the Architect button, selecting Settings, clicking the Metric Creation tab, and clearing the Sum check box.
48. In the Warehouse Tables pane, in the Forecast Data database instance, right-click the
REGION_FORECAST_SALES table and select Add Table to Project.
49. In the Results Preview window, in the Fact tab, clear the Forecast Revenue and Total Unit Sales check boxes.
51. To update the Forecast Revenue fact, on the Project Tables View tab, find the FORECAST_SALES table and select the Forecast Revenue fact.
53. In the Fact Editor, create a new fact expression that uses the Forecast_Revenue column in the
REGION_FORECAST_SALES table as a source table.
57. To update the Forecast Units Sold fact, in the Project Tables View tab, find the FORECAST_SALES table and
select the Forecast Units Sold fact.
58. Right-click the Forecast Units Sold fact and select Edit.
59. In the Fact Editor, create a new fact expression that uses the Total_Unit_Sales column in the
REGION_FORECAST_SALES table as a source table.
63. To verify the attribute source tables, in the REGION_FORECAST_SALES table, right-click the Quarter attribute ID
form and select Edit.
65. In the Modify Attribute Form window, ensure the REGION_FORECAST_SALES table is selected as a source table.
69. Ensure that the Recalculate table logical sizes check box is selected in the Schema Update window.
70. To test the new fact mapping, in Developer, in the Public Objects\Reports folder, create the following report:
71. Access the Region attribute from the Geography hierarchy. The Forecast Revenue metric is located in the
Metrics folder.
72. Run the report. The result set should resemble the following:
73. View the report in SQL View. The SQL should look like the following:
In the FROM clause, notice that the data is retrieved from the new aggregate table REGION_FORECAST_SALES.
74. Save the report in the Public Objects\Reports folder as Data Mart Test and close the report.
75. To change the logical table size for the aggregate table, return to the Architect graphical interface.
76. In the Architect graphical interface, on the Design tab, click the Edit logical size of tables button as shown below:
Notice that the REGION_FORECAST_SALES table’s logical size is 10. The logical size for the FORECAST_SALES table is
20. Therefore, when you execute any report that aggregates the forecast revenue data to the region or quarter
level, the Engine chooses the REGION_FORECAST_SALES table due to its lower logical table size.
77. In the Logical Size Editor, for the REGION_FORECAST_SALES table, in the Size value box, type 30.
78. For the REGION_FORECAST_SALES table, select the Size locked check box.
When you select this check box, the logical size for the table will not be recalculated when you update the
schema.
82. Ensure that the Recalculate table logical sizes check box is selected in the Schema Update window.
Test the change of the logical table size on the report SQL
83. In Developer, in the Reports folder, right-click the Data Mart Test report and select View SQL. The SQL should
look like the following:
Notice that this time, the Engine picked the base FORECAST_SALES table over the aggregate REGION_FORECAST_SALES table because the base table now has the smaller logical table size (20 versus 30).
3
MICROSTRATEGY MULTISOURCE OPTION
By default, the objects in a standard report come from a single data source. However, you can use the MultiSource
Option—an add-on component to Intelligence Server—to overcome this limitation. MultiSource Option enables
you to define a single project schema that uses multiple data sources. As a result, you can create a standard report
that executes SQL against multiple data sources.
For example, consider the following scenario in which actual revenue data is stored in one data warehouse, while
forecast revenue data is stored in a second data warehouse:
Report with Objects from Multiple Data Sources
If you want to create a report that includes the revenue and forecast revenue data for each region, you must
execute SQL against both data warehouses to retrieve the result set. You obtain the data for each of the metrics
from their respective data warehouses. And you can also obtain the region data from either data warehouse, since
it exists in both databases. MultiSource Option enables you to create this type of report.
Without MultiSource Option, you can use secondary database instances for standard reports only if you
implemented a database gateway. MultiSource Option is the default method for accessing multiple data sources.
However, you can configure the Multiple data source support VLDB property if you want to use a gateway rather
than MultiSource Option.
With MultiSource Option, you have the ability to define primary and secondary database instances at the table
level and connect to them directly within the MicroStrategy platform. This capability enables you to define a
project schema across multiple relational data sources.
• You can add tables to the project from different database instances, not just the primary database instance
for the project.
• Any SQL database instance that exists within the project source is available to the project as a secondary database instance from which you can select tables.
• You can associate a single project table with multiple database instances, which essentially creates
duplicate tables.
However, keep in mind that MultiSource Option has the following limitations:
• You can use MultiSource Option to connect to any data source that you access using an ODBC driver,
including Microsoft Excel® files and text files. You cannot use MultiSource Option to connect to MDX or
other non-relational data sources.
• MultiSource Option does not support fact tables partitioned across multiple data sources.
You can have lookup, relationship, and fact tables duplicated across multiple data sources. However, because of
how the Engine selects the data source for fact tables, there is benefit to using duplicate tables only for lookup and
relationship tables.
© 2015 MicroStrategy Inc. Primary/Secondary Database Instances at the Table Level 242
When you bring duplicate tables to the project, you must consider the following guidelines required by
MultiSource Option:
• Corresponding columns in duplicate tables must either have the same data type or compatible data types.
• To maintain data consistency, the Engine applies data type compatibility rules when it joins columns in tables from different database instances.
• The number of columns in the table associated with the primary database instance has to be less than or
equal to the number of columns in the table associated with the secondary database instance. Any extra
columns in the secondary table are not imported into the project.
• Ideally, duplicate tables have the same number of columns. However, if there are extra columns, they can
exist only in the table associated with the secondary database instance. Otherwise, you must treat the
tables as two different tables.
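These guidelines can be sketched as a simple validation check. The compatibility pairs below are illustrative assumptions, not the Engine's actual compatibility matrix:

```python
# Sketch of the duplicate-table guidelines above: the primary table's columns
# must be a subset of (or equal to) the secondary table's, and shared columns
# must have compatible data types. The COMPATIBLE pairs are hypothetical.

COMPATIBLE = {("INTEGER", "INTEGER"), ("FLOAT", "FLOAT"),
              ("INTEGER", "FLOAT"), ("FLOAT", "INTEGER"),
              ("CHAR", "CHAR"), ("VARCHAR", "VARCHAR"),
              ("CHAR", "VARCHAR"), ("VARCHAR", "CHAR")}

def can_be_duplicates(primary, secondary):
    """primary/secondary: dicts mapping column name -> data type name."""
    # Extra columns may exist only in the secondary table.
    if not set(primary) <= set(secondary):
        return False
    # Shared columns must have compatible types.
    return all((primary[c], secondary[c]) in COMPATIBLE for c in primary)

lu_region_a = {"REGION_ID": "INTEGER", "REGION_DESC": "VARCHAR"}
lu_region_b = {"REGION_ID": "INTEGER", "REGION_DESC": "CHAR", "REGION_MGR": "VARCHAR"}

print(can_be_duplicates(lu_region_a, lu_region_b))  # True
print(can_be_duplicates(lu_region_b, lu_region_a))  # False: extra column in primary
```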
With reports that use multiple data sources, much of the work of the SQL Engine remains the same. However, the Engine performs two additional tasks:
• It determines the optimal database instance for each pass of SQL.
• It moves data between data sources when a single data source cannot provide the entire result set.
Every project table has a primary database instance. The primary database instance is the first one to which the table is mapped. If you have duplicate tables, the same table can have both primary and secondary database instances.
The primary table is the one that exists in the primary database instance. The secondary table is the one that exists
in the secondary database instance.
You can change which database instance is the primary one for a table.
You can have multiple secondary tables if the table is mapped to more than one secondary database instance.
In this example, the two fact tables each map to only one database. The primary database instance for the
REGION_SALES table is the Sales Data Warehouse. The primary database instance for the FORECAST_SALES table is
the Forecast Data Warehouse.
However, the LU_REGION table exists in both data warehouses, so you can map it to both database instances as
duplicate lookup tables. You can assign either data warehouse as the primary or secondary database instance for
this table.
If you designate the Sales Data Warehouse as the primary database instance for this table, the LU_REGION table
from that database is the primary table. The LU_REGION table from the Forecast Data Warehouse is the secondary
table.
If a table is available in a single data source, that source is the only one the Engine can use in the report SQL to
obtain the necessary data. However, if a table is available in multiple data sources, the Engine uses specific logic to
select the optimal data source.
SQL generation for reports is focused on metric data. When the Engine needs to calculate a metric, it first has to
determine the best source for the underlying fact. After taking into account attributes in the template, the metric's
dimensionality, and report or metric filters, the Engine uses the following logic to select the optimal data source for
a fact:
• If the fact comes from a fact table that is available in the primary database instance for the project, the
Engine calculates the metric using the primary database instance for the project.
• If the fact comes from a fact table that is not available in the primary database instance for the project, the
Engine calculates the metric using a secondary database instance. If the fact table is available in more than
one secondary database instance, the Engine selects the database instance with the smallest GUID
(alphabetically).
In essence, the Engine considers only the primary and secondary database instance designation at the project level
when selecting the data source for a fact. When you have a fact table available in multiple sources, it does not
matter which sources have primary versus secondary designation for the table.
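The fact-source selection rule described above can be sketched as follows. The instance names and GUID strings are invented; the rule itself follows the text (prefer the project's primary database instance, otherwise the secondary instance with the alphabetically smallest GUID):

```python
# Sketch of how the Engine picks the data source for a fact, per the text.
# Instance names and GUIDs below are hypothetical.

def select_fact_source(fact_table_instances, project_primary, instance_guids):
    """fact_table_instances: database instances where the fact table exists."""
    if project_primary in fact_table_instances:
        return project_primary
    secondaries = [i for i in fact_table_instances if i != project_primary]
    # Fall back to the secondary instance with the smallest GUID alphabetically.
    return min(secondaries, key=lambda i: instance_guids[i])

guids = {"Sales DW": "B3F1", "Forecast DW": "A70C", "Inventory DW": "C912"}

# FORECAST_SALES exists only in two secondary warehouses:
print(select_fact_source(["Forecast DW", "Inventory DW"], "Sales DW", guids))
# Forecast DW (smallest GUID)

# REGION_SALES also exists in the project's primary warehouse:
print(select_fact_source(["Sales DW", "Forecast DW"], "Sales DW", guids))
# Sales DW
```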
After selecting the optimal data source for a fact, the Engine also has to select the best source for any
corresponding attributes. The Engine uses the following logic to select the optimal data source for an attribute:
• If the attribute comes from a lookup table that exists in the same data source as the one selected for the
fact, the Engine obtains the attribute data from this same database instance.
• If the attribute comes from a lookup table that does not exist in the same data source as the one selected for
the fact, the Engine obtains the attribute data from the primary database instance for the lookup table and
moves it to the database instance used as the fact source.
In essence, when you have a lookup table available in multiple sources, it can matter which sources have primary
versus secondary designation for the table.
This same logic also applies if the Engine has to retrieve attribute information from a relationship table.
However, if a user is just browsing attribute elements, the Engine treats lookup tables for attributes like fact tables.
If the lookup table exists in the primary database instance for a project, the Engine queries that database instance.
Otherwise, it uses the secondary database instance with the smallest GUID (alphabetically).
When the Engine needs to join data from different data sources, it retrieves data from the first data source into the memory of the Intelligence Server. Then, it creates a temporary table in the second data source and inserts the data into this table to continue processing the result set.
You may have a data source that either does not support the creation of temporary tables or in which you do not want to create temporary tables. If so, you can configure the CREATE and INSERT support VLDB property for the corresponding database instance to not support CREATE and INSERT statements. This action makes the database instance read only and forces the Engine to always move data out of this source.
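The temporary-table data movement described above can be mimicked with two in-memory SQLite databases standing in for the two data sources. All table names and data here are invented:

```python
import sqlite3

# Toy illustration of the cross-source join described above: read rows from
# the first source into memory, insert them into a temporary table in the
# second source, then finish the join there. All data is invented.

sales = sqlite3.connect(":memory:")     # stands in for the Sales warehouse
forecast = sqlite3.connect(":memory:")  # stands in for the Forecast warehouse

sales.execute("CREATE TABLE REGION_SALES (REGION_ID INTEGER, REVENUE INTEGER)")
sales.executemany("INSERT INTO REGION_SALES VALUES (?, ?)", [(1, 100), (2, 250)])

forecast.execute("CREATE TABLE REGION_FORECAST (REGION_ID INTEGER, FORECAST INTEGER)")
forecast.executemany("INSERT INTO REGION_FORECAST VALUES (?, ?)", [(1, 120), (2, 240)])

# Step 1: pull the first source's rows into "Intelligence Server memory".
rows = sales.execute("SELECT REGION_ID, REVENUE FROM REGION_SALES").fetchall()

# Step 2: create a temporary table in the second source and insert the rows.
forecast.execute("CREATE TEMP TABLE ZZT_SALES (REGION_ID INTEGER, REVENUE INTEGER)")
forecast.executemany("INSERT INTO ZZT_SALES VALUES (?, ?)", rows)

# Step 3: complete the result set with a join inside the second source.
result = forecast.execute("""
    SELECT f.REGION_ID, t.REVENUE, f.FORECAST
    FROM REGION_FORECAST f JOIN ZZT_SALES t ON f.REGION_ID = t.REGION_ID
    ORDER BY f.REGION_ID
""").fetchall()
print(result)  # [(1, 100, 120), (2, 250, 240)]
```

If the second source were marked read only (no CREATE/INSERT support), step 2 would be impossible, and the data would have to move in the other direction instead.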
To work with tables from different data sources, the Engine joins table columns based on specific data type
compatibility rules. The following table lists these rules:
If two columns do not have compatible data types, the Engine cannot join data using those columns.
The data type of a column is based on the Engine data type definition, not the data type definition in the physical
database.
These two data type definitions may not always be the same.
Often, different data sources are optimal for different passes in the SQL for a report. For such queries, the Engine
follows the data type compatibility rules and moves data between the different data sources until it finishes
processing the result set.
Note: these examples are not intended to be a comprehensive list of all supported scenarios.
It is common to store different facts in separate data sources. Therefore, you may have a report that contains
metrics that use facts from different data sources.
Users can obtain the region data from either data warehouse. However, revenue data is available only in the Sales
Data Warehouse, and forecast revenue data is available only in the Forecast Data Warehouse.
Users can store lookup and relationship tables in a single data source, but it is also common for lookup and
relationship tables to be split across data sources. Therefore, you may have a report that requires you to join data in
one data source using relationships stored in another data source.
The lookup tables that relate the Category, Subcategory, and Item attributes are stored in the Sales Data Warehouse. Forecast revenue is stored at the item level in the Forecast Data Warehouse. This data must be aggregated to the category level to calculate the forecast revenue for the report. Remember that the Forecast Data Warehouse stores only the relationship between Item and Subcategory. Therefore, this aggregation requires joining the item-level forecast revenue data to the category data using the lookup tables in the Sales Data Warehouse.
Filtering Qualifications
You may also have reports where you need to filter data from one data source by qualifying on data that comes
from another data source.
The report contains a filter based on the Forecast Revenue metric. This fact data is stored in the Forecast Data Warehouse. However, you use this filter to qualify on the revenue for each category, and the category data is stored in the Sales Data Warehouse.
You can have this same scenario with other types of filter qualifications. The following image shows an example
that involves an attribute qualification:
Attribute Qualification
The report contains a filter based on the Category attribute. This attribute is stored in the Sales Data Warehouse.
However, you use this filter to determine which elements are included in the result set for the Product attribute,
which is stored in the Forecast Data Warehouse.
These cases are just a few examples of how you can use MultiSource Option to combine data from multiple sources
in a single report. The MultiSource Option supports a variety of reporting needs, including the following:
• Using separate data sources for simple versus more complex queries
Now that you have learned how MultiSource Option works, you are ready to learn how to configure a project for
heterogeneous data access.
• Although you can use MultiSource Option with standard reports, you cannot use it in conjunction with
Freeform SQL or Query Builder reports. The Engine has to use multipass SQL to access multiple data sources
and move data between them. Since Freeform SQL and Query Builder reports allow only a single pass of
SQL, they cannot take advantage of this feature.
Provisional Exercises
Overview
In this exercise, use the Warehouse Tables pane in Architect graphical interface to add the following tables from the
Forecast Data database instance to the MicroStrategy Tutorial project:
• Make Forecast Data the primary database instance for the FORECAST_SALES table
• Add LU_REGION and LU_COUNTRY as duplicate tables with Forecast Data as the primary database instance
Detailed Instructions
1. In the MicroStrategy Analytics Modules project source, open the MicroStrategy Tutorial project.
• You can change automatic metric creation by clicking the Architect button, selecting Settings, clicking the Metric Creation tab, and clearing the Sum check box.
5. In order to add the LU_REGION and LU_COUNTRY tables to the project and change the Database Instance, in
the Warehouse Tables pane, in the list of tables available in the Forecast Data database instance, right-click the
LU_REGION table and select Add Table to Project.
6. In the Options window, under Available options, keep the Indicate that LU_REGION is also available from the
current DB Instance option selected.
7. Click OK.
8. In the Architect graphical interface, in the Properties pane, click the Tables tab.
11. Beside the current primary database instance, click the Browse button:
12. In the Available Database Instances window, in the Primary Database Instance drop-down list, select Forecast
Data. Ensure the check box for Tutorial Data is selected.
14. The color coding of the table indicates that the table is mapped to a secondary database instance.
16. To add the FORECAST_SALES table to the project, in the Warehouse Tables pane, expand the Forecast Data
database instance.
17. In the list of tables available in the database instance, right-click the FORECAST_SALES table and select Add
Table to Project.
18. In the Results Preview window, in the Fact tab, keep Forecast Unit Price only and clear all other facts. On the
Attribute tab, keep the default selection.
20. Right-click the ID column for each attribute mapped in the FORECAST_SALES table, select Edit to verify that all available source tables are mapped, and click OK.
21. To create the Forecast Revenue fact, right-click the FORECAST_SALES table and select Create Fact.
22. In the Create New Fact Expression window, in the Source table list, select FORECAST_SALES.
26. Right-click the fact column you just created and select Rename.
30. To create the Forecast Revenue metric, in Developer, in the MicroStrategy Tutorial project, in the Public
Objects\Metrics\Sales Metrics folder, in the File menu, select New, and select Metric.
31. Use the Forecast Revenue fact to create the following metric:
Adding Tables from a Secondary Database Instance and Creating an Attribute and Metric for a
Multisource Report
Overview
In this exercise, use the Warehouse Tables pane in Architect graphical interface to add the following tables from the
Forecast Data database instance to the MicroStrategy Tutorial project:
Also, add the LU_GROUP table from the Inventory Data database instance. While adding these tables, configure
them as follows:
• Make Forecast Data the primary database instance for the LU_PRODUCT table
• Add LU_SUBCATEG as a duplicate table with Forecast Data as the secondary database instance
Save the Product attribute in the Schema Objects\Attributes\Products folder. As part of creating the Product
attribute, complete the following tasks:
• Make sure the ID and DESC attribute forms for the Product attribute map the following as source tables:
• The primary lookup table for the Product attribute displays in bold text. Use Automatic mapping for both
attribute form expressions.
• Relate the Product attribute to the Category attribute as shown in the following table:
• After completing these tasks, save and update the project schema, and close Architect.
Detailed Instructions
1. To add the LU_PRODUCT table to the project, in the MicroStrategy Analytics Modules project source, open the
MicroStrategy Tutorial project.
4. In the list of tables available in the Forecast Data database instance, right-click the LU_PRODUCT table and
select Add Table to Project.
9. In the Modify Attribute Form window, ensure that the LU_PRODUCT source table is selected and click OK.
10. To add the LU_GROUP table to the project, in the list of tables available in the Inventory Data database instance,
right-click the LU_GROUP table and select Add Table to Project.
12. The Product attribute has been created with ID:PRODUCT_ID. The Category attribute has been created with ID:CATEGORY_ID and DESC:CATEGORY_DESC.
13. In the LU_GROUP table, right-click the ID:PRODUCT_ID column and select Edit.
15. In the Modify Attribute Form window, ensure that the LU_PRODUCT and LU_GROUP source tables are selected
and click OK.
17. To add the LU_SUBCATEG table to the project, in the Warehouse Tables pane, in the list of tables available in the
Forecast Data database instance, right-click the LU_SUBCATEG table and select Add Table to Project.
18. In the Options window, under Available options, keep the Indicate that LU_SUBCATEG is also available from the
current DB Instance option selected.
20. To create the Product and Category relationship and update the Products user hierarchy, on the Hierarchy View tab, select the Product attribute and drag it to the Category attribute. A one-to-many relationship is created with Product as the parent attribute.
21. On the Properties pane, in the Location property, click Browse and save the Product attribute in the Schema Objects\Attributes\Products folder.
22. On the Home tab, in the Hierarchy section, in the drop-down list, select Products.
23. In the Hierarchy View tab, right-click an empty area and select Add/Remove attributes in Hierarchy.
24. In the Select Objects window under Available objects, select the Product attribute and click the > button to
move Product to the Selected objects list.
26. Right-click the Product attribute and select Define Browse Attributes. In the Select Objects window, add the
Category attribute to the Selected Objects list.
Overview
In this exercise, create a multisource report. The report should contain the following attributes and metrics:
Product, Category, Revenue, and Forecast Revenue. The report should also contain an attribute element filter with
the following condition: Product In list (Entertainment).
Save the report in the Public Objects\..\My Reports folder as Revenue and Forecast Revenue for Entertainment
Categories.
Run the report. The result set should look like the following:
Detailed Instructions
1. To create the multisource report template, in the MicroStrategy Tutorial project, in the Public Objects\Reports
folder, create the following report:
• Access the Product and Category attributes from the Products hierarchy.
• Find the Revenue metric in the Public Objects\Metrics\Sales Metrics folder. The Forecast Revenue metric
was created during the lesson demonstrations.
2. To create the multisource report filter, create an attribute element filter with the following condition: Product In
list (Entertainment).
• Create a local filter on the report. There is no need to create the filter separately in the Filter Editor.
3. Save the report in the Public Objects\..\My Reports folder as Revenue and Forecast Revenue for Entertainment
Categories.
5. Compare your results to the expected report in the Overview section at the beginning of this exercise.
4
ADVANCED DESIGN FEATURES
• It is possible for the same fact to be stored at different attribute levels within a hierarchy. For example, you
could have another fact table that stores unit sales by item and date, rather than item and week. This fact
table would store unit sales for items at a lower level within the Time hierarchy.
Sometimes, facts may not be available in the data warehouse at the levels you want to analyze in reports. You may
not be able to change the levels at which they are stored in data warehouse tables, but you can change the levels
of facts in MicroStrategy Architect by using fact level extensions. Fact level extensions enable facts stored in the
data warehouse at one level to be reported on at a different level.
MicroStrategy Architect provides the following three types of fact level extensions:
• Degradation—Enables you to lower the level of a fact within the same hierarchy.
• Extension—Enables you to extend the level of a fact to include a level from a different hierarchy not currently related to the fact.
• Disallow—Enables you to disallow a specific fact level to prevent unnecessary cross joins to lookup tables when reporting on a fact.
Fact Degradation
A fact degradation enables you to lower the level of a fact within a hierarchy to which it is already related.
The following illustration shows a fact table with facts stored at the month level and a report that requires one of
these facts to be displayed at the day level:
Fact Degradation Scenario
In this example, the report contains the Units Received metric that uses the Units Received fact in its definition.
The Units Received fact is stored at the item and month levels. However, you need the Units Received fact to be
available at the day level. If users try to run this report, they will receive the following error message:
Report Error without the Fact Degradation
The error message notifies users that the Units Received fact is not available at the day and item levels. For this
report to work, lower the level of the Units Received fact from month to day using a fact degradation.
* As an alternative to creating a fact degradation, you could change the level of the fact table in the data warehouse itself. However, changing the fact table is not always an option. The fact data may not be captured at that level in the source system, or there may be other organizational or environmental restrictions on changing the table structure.
Creating a fact degradation consists of the following steps:
1. Select the attribute level to which you want to lower the fact.
2. Select the attribute the SQL Engine can use to join the fact to the attribute to which you want to lower the fact.
3. Determine the join direction between the join attribute and the fact.
When a fact degradation is created, the fact is already stored in a fact table at a higher level within the hierarchy
than the level at which you want to analyze the fact. Therefore, choose the lower-level attribute within the same
hierarchy to which the fact is to be degraded.
For the Units Received fact degradation, lower the level of the Units Received fact from month to day. To do this,
select the Day attribute.
When you create a fact degradation, the fact is not stored at the desired attribute level within the hierarchy, so you must select an attribute that the SQL Engine can use to join the fact to the desired attribute level. Because you are lowering a fact within a hierarchy to which it is already related, the join attribute is always a higher-level attribute from that hierarchy.
For the Units Received fact degradation, the Units Received fact is stored at the item and month level, so it is related to both the Item and Month attributes. Since the fact must be lowered to the day level, select Month as the join attribute because it is directly related to the Day attribute. The Item attribute is not directly related to the Day attribute, so it would not be used as the join attribute.
After you select the join attribute that relates the desired attribute and fact, determine how you want the SQL Engine to perform the join between the join attribute and the fact. There are two possible join directions. You can either join to the fact using only the join attribute itself, or you can allow the fact to also join to children of the join attribute.
For the Units Received fact degradation, the Units Received fact is stored only at the item and month level. It is not stored at any other level of time, so you choose to join only against the attribute itself (Month). However, if the Units Received fact were stored at another level of time between month and day, such as week, you could join to the fact at either the month or week level. In that case, you could allow the SQL Engine to join against the attribute and its children (Month and Week). If you do not want to allow the join at both levels, you could still choose to perform the join only against the attribute itself.
• If you allow the SQL Engine to join against the join attribute and any of its children, you need to ensure that
the allocation expression you use for the fact degradation returns values that are valid at any of those
attribute levels.
Some facts are static and do not change value from one level to another. Such facts do not require an expression to
allocate the fact at the lower level. Other facts do change from one level to another, and users need to define an
expression that correctly allocates the fact data at the lower level. An allocation expression can include attributes,
facts, constants, and any standard expression syntax, including mathematical operators, pass-through functions,
and so forth.
Although the Units Received fact is stored at the month level, its value may be different depending on the level of
time at which you are reporting on the units received. The units received on a particular day are different from units
received during a month. Therefore, an allocation expression is needed to translate units received at the month
level into day-level values. For the Units Received degradation, create an allocation expression to divide the
monthly units received by the duration of the month. This yields a rough approximation of units received at the day level: ([Units Received] / [Month Duration]).
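The arithmetic behind this allocation expression can be sketched as follows (a minimal illustration; the item names, months, and values are hypothetical, not tutorial data):

```python
# Hypothetical month-level fact rows: (item, month, units received, days in month)
month_facts = [
    ("Item A", "2024-01", 620, 31),
    ("Item A", "2024-02", 290, 29),
]

def degrade_to_day(units_received, month_duration):
    """Allocation expression: [Units Received] / [Month Duration]."""
    return units_received / month_duration

# Rough day-level approximation for each month
day_estimates = {
    (item, month): degrade_to_day(units, days)
    for item, month, units, days in month_facts
}
print(day_estimates[("Item A", "2024-01")])  # 20.0 units per day
```

Note that every day within a month receives the same estimated value; the degradation spreads the monthly total evenly rather than recovering true daily figures.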
However, if you create a degradation for a fact like Unit Price, its value might be the same regardless of whether you report on it at the month level or the day level, since the unit price does not change during a month. Therefore, you would not need an allocation expression for this fact degradation.
Overview
Run the report. Add the Month attribute to the template and run the report again. After reviewing the error
message, save the report as Degradation Example in the Public Objects\..\My Reports folder.
Next, create a degradation for the Forecast Cost fact to enable reporting on that fact at the Month level. In the data
warehouse, this fact exists only at the Quarter level. Use the following allocation expression for the fact
degradation: [Forecast Cost] / 3. After creating the fact degradation, update the project schema.
Run the report. The result set should look like the following:
The image above only displays the first few rows of the result set for the report.
Detailed Instructions
1. In Developer, in the Forecasting Project, in the Public Objects\..\My Reports folder, create the following report:
• You can access the Quarter attribute from the Time hierarchy. The Forecast Cost metric is in the Metrics folder.
2. Run the report. The result set should look like the following:
3. To modify the report to include the Month attribute, switch to Design View.
4. Add the Month attribute to the report template to the right of Quarter.
The error message states that the Forecast Cost fact does not exist at the month level in the data warehouse. To report on that fact at a level lower than quarter, you must create a fact degradation.
7. Save the report in the Public Objects\..\My Reports folder as Degradation Example and close the report.
8. To create a fact degradation at the month level for the Forecast Cost fact, in the Schema Objects, in the Facts
folder, open the Forecast Cost fact in the Fact Editor.
11. In the Level Extension Wizard, in the Introduction window, click Next.
12. In the General Information window, in the Name box, type Degradation to Month as the name for the
extension.
13. Under What would you like to do?, click Lower the fact entry level.
15. In the Extended Attributes window, select the Show all attributes check box.
16. In the Available attributes list, select the Month attribute and click the > button to add it to the Selected
attributes list.
• You can add multiple attributes for a single fact degradation. However, you must ensure that the allocation expression returns correct results for all attribute levels. For example, if you add a Day attribute to the degradation definition, you must create a single allocation expression that returns the correct results at both the month and day levels.
18. In the Join Type window, select the Quarter check box.
20. In the Join Attributes Direction window, in the Join attributes list, in the Join against column, keep the default
setting.
22. In the Allocation window, select the Specify an allocation expression check box.
• This allocation expression returns correct values only at the month level.
26. In the Finish window, review the information and click Finish.
• Click Back to go back through the Level Extension Wizard and make changes.
29. Run the Degradation Example report. The result set should look like the following:
The image above only displays the first few rows of the result set for the report.
You can now report on the Forecast Cost fact at the month level. The monthly values are only estimates based on the allocation expression provided in the definition of the fact degradation. Notice that for each month within a quarter, the forecast cost value is the same.
Fact Extension
While a fact degradation enables you to lower the level of a fact within a hierarchy to which it is already related, a fact extension enables you to extend the level of a fact to a level in a different hierarchy, to which that fact is currently unrelated.
Consider the following simplified data model for the MicroStrategy Tutorial project:
Fact Extension Data Model
In this data model, the Freight fact is stored at the employee, order, and day level. However, freight data is not stored at the item level. The Freight fact is not related to any attribute in the Product hierarchy.
If a report is run that contains the Item attribute and a Freight metric that uses the Freight fact, the result set looks
like the following:
Item and Freight—Report Result Set
• The image above displays only the first few rows of the result set for the report.
The report returns the same freight value for each item. This result set is meaningless because of how the query
joins the lookup table for the Item attribute and the fact table for Freight. The following image shows the SQL for
this report:
Because there is no relationship between Item and Freight, the SQL Engine performs a cross join between the fact
and lookup tables to retrieve the data for the report.
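The effect of this cross join can be reproduced with a pair of miniature tables (a SQLite sketch; the table names follow the tutorial, but the columns and values are invented for illustration):

```python
import sqlite3

# Hypothetical, radically simplified versions of the tutorial tables
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE LU_ITEM (ITEM_ID INTEGER, ITEM_NAME TEXT)")
cur.execute("CREATE TABLE ORDER_FACT (ORDER_ID INTEGER, FREIGHT REAL)")
cur.executemany("INSERT INTO LU_ITEM VALUES (?, ?)",
                [(1, "Item A"), (2, "Item B")])
cur.executemany("INSERT INTO ORDER_FACT VALUES (?, ?)",
                [(101, 10.0), (102, 5.0)])

# With no shared column, the only way to combine the tables is a cross join,
# so every item is paired with every freight row and gets the same total.
rows = cur.execute("""
    SELECT i.ITEM_NAME, SUM(f.FREIGHT) AS FREIGHT
    FROM LU_ITEM i CROSS JOIN ORDER_FACT f
    GROUP BY i.ITEM_NAME
""").fetchall()
print(dict(rows))  # {'Item A': 15.0, 'Item B': 15.0}
```

Both items report the identical, meaningless freight total, which is exactly the symptom described above.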
Therefore, if you want to view freight information by item, you have to extend the Freight fact to the item level. You
cannot use a fact degradation because Item is an attribute from a different hierarchy than the attributes that are
already related to the Freight fact.
* As an alternative to creating a fact extension, you could add a new level to the fact table in the data warehouse
itself. However, changing the fact table is not always an option. The fact data may not be captured at that level in
the source system, or there may be other organizational or environmental restrictions on changing the table
structure.
MicroStrategy Architect provides three methods for creating fact extensions. They enable different options for
joining the desired attribute and fact. You can create fact extensions using the following three methods:
• Table relation—Select a particular table to use for the join. You should select this option if you always want
the SQL Engine to use the same table to join the desired attribute and fact.
• Fact relation—Instead of selecting a single table, you select a fact to use for the join. This option enables the
SQL Engine to use any table containing that fact to join the desired attribute and fact. You should select this
option if you want to allow the SQL Engine to choose the optimal table for a particular query.
• Cross product—Choose to have the SQL Engine perform a cross join between the lookup table of the
desired attribute and the fact table.
• Use the cross product option only as a last resort. If there are no tables in the data warehouse that you can use to join the desired attribute and fact, then this is the only option you have for creating a fact extension. However, keep in mind that a cross join requires a great deal of processing overhead, and the resulting data may not be meaningful.
* To learn more about the fact relation and cross product methods, refer to the Project Design Guide product
manual.
Using the table relation method for a fact extension forces the SQL Engine to always join the desired attribute and fact using a particular table. Creating a fact extension using a table relation consists of the following steps:
1. Select the attribute to which you want to extend the fact.
2. Select the table you want the SQL Engine to use to join the fact to the attribute to which you want to extend the
fact.
3. Select the attribute or set of attributes the SQL Engine can use to join the fact to the attribute to which you
want to extend the fact.
4. Determine the join direction between the join attributes and the fact.
When a fact extension is created, the fact is completely unrelated to any attributes in the given hierarchy. The
attribute to which you want to extend the fact depends on how you want to analyze the fact. If you want to report
on the fact at any level in the hierarchy, you should select the lowest-level attribute in that hierarchy.
For the Freight fact extension, if you want to report on the Freight fact at any attribute level in the Product
hierarchy, select the Item attribute, which is the lowest-level attribute in the hierarchy. Selecting a higher-level
attribute from the Product hierarchy, such as Subcategory, only extends the fact to that attribute level or any
attribute above it in the hierarchy. Extending the Freight fact to the item level enables users to create reports that
analyze freight data using any attribute from the Product hierarchy.
The SQL Engine needs to join the table that contains the fact being extended and the lookup table that stores the attribute to which the fact is being extended. Because these two tables are not related, you need to select another data warehouse table to serve as a relationship table between the fact and lookup tables.
After you select the attribute to which the fact is to be extended, MicroStrategy Architect searches the project warehouse catalog and returns a list of all tables that contain the ID column of that attribute. From this list of candidate tables, you can then select the optimal table for the join. In selecting a table, you should consider several factors, including the number of possible join paths, the optimal join path for a given allocation expression, and any other characteristics specific to your data warehouse environment. For example, you may want to use a table for the join that you know has better indexes or is updated more frequently.
In the previous example, the Freight fact is stored in the ORDER_FACT table. The following image shows the logical
view for the ORDER_FACT table:
ORDER_FACT Table—Logical View
Notice that several attributes from multiple hierarchies map to the ORDER_FACT table. Therefore, all these
attributes relate to the Freight fact and could possibly be used to relate the Freight fact to the Item attribute.
* The ORDER_FACT table is the only table in the data warehouse that contains the Freight fact. Therefore, you have
to join the ORDER_FACT table to the LU_ITEM table to relate the Freight fact to the Item attribute.
When you extend the Freight fact to Item using a table relation, MicroStrategy Architect returns the following list of candidate tables:
List of Candidate Tables
The list of candidate tables includes the LU_ITEM and REL_CAT_ITEM tables. These lookup and relationship tables contain only product-related information, so you can eliminate them as possible join tables since no attributes from the Product hierarchy are related to the Freight fact. In general, you can usually eliminate tables from the hierarchy to which the attribute belongs, since these tables typically do not provide any way to join to other hierarchies.
The remaining tables are all fact tables that contain the Item attribute. However, most of them have only one or two attributes in common with the ORDER_FACT table. By contrast, the ORDER_DETAIL table contains many of the same attributes as the ORDER_FACT table, including Employee, Day, Customer, and Order. The following image shows the logical view for the ORDER_DETAIL table:
ORDER_DETAIL Table—Logical View
You can use the ORDER_DETAIL table to join the LU_ITEM and ORDER_FACT tables using any of the common attribute columns. Because the ORDER_DETAIL table provides multiple join paths, it is the best table to use to join the Freight fact to the Item attribute.
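A simplified sketch of the resulting join path (the column sets and values here are hypothetical; the real tables contain many more columns):

```python
import sqlite3

# Hypothetical, simplified tables: ORDER_DETAIL bridges items and orders,
# while ORDER_FACT stores the Freight fact by order.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE LU_ITEM (ITEM_ID INTEGER, ITEM_NAME TEXT)")
cur.execute("CREATE TABLE ORDER_DETAIL (ORDER_ID INTEGER, ITEM_ID INTEGER)")
cur.execute("CREATE TABLE ORDER_FACT (ORDER_ID INTEGER, FREIGHT REAL)")
cur.executemany("INSERT INTO LU_ITEM VALUES (?, ?)",
                [(1, "Item A"), (2, "Item B")])
cur.executemany("INSERT INTO ORDER_DETAIL VALUES (?, ?)",
                [(101, 1), (102, 1), (102, 2)])  # order 102 contains both items
cur.executemany("INSERT INTO ORDER_FACT VALUES (?, ?)",
                [(101, 10.0), (102, 5.0)])

# ORDER_DETAIL relates LU_ITEM and ORDER_FACT through the Order attribute.
rows = cur.execute("""
    SELECT i.ITEM_NAME, SUM(f.FREIGHT)
    FROM LU_ITEM i
    JOIN ORDER_DETAIL d ON d.ITEM_ID = i.ITEM_ID
    JOIN ORDER_FACT f ON f.ORDER_ID = d.ORDER_ID
    GROUP BY i.ITEM_NAME
""").fetchall()
print(dict(rows))  # {'Item A': 15.0, 'Item B': 5.0}
```

Notice that order 102's freight is counted toward both items it contains, which is one reason a fact extension typically also needs an allocation expression.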
* The ORDER_DETAIL table is the optimal join table provided you can use it in conjunction with the allocation expression for the fact extension. You should also consider any characteristics of the table that might render it less optimal because of factors unrelated to its structure. For example, if this table is updated only monthly and you want reports that provide the most current data, it would not be the best table to use for the join.
After selecting the table for the join, select an attribute or set of attributes from that table on which the SQL Engine should join. MicroStrategy Architect lists any attributes whose ID columns are present in the join table. You can either manually select the join attributes or allow the SQL Engine to select the join attributes dynamically on a query-by-query basis.
For the Freight fact extension, select Order as the join attribute. You could use other attributes in the ORDER_FACT table as the join attribute; these attributes are listed in the Level Extension Wizard, as shown below:
Possible Join Attributes
However, the allocation expression for this fact extension uses facts that are related to individual orders, so Order is
the optimal join attribute.
If you allow the SQL Engine to dynamically select the join attributes, you do not perform this step. However, if you manually select the join attributes, you must determine how the SQL Engine should perform the join between the join attributes and the fact. Just as with fact degradations, there are two possible join directions. You can either join to the fact using only the join attributes themselves, or you can allow the fact to also join to the children of the join attributes.
If you allow the SQL Engine to join against the join attributes and any of their children, ensure that the allocation expression used for the fact extension returns values that are valid at any of those attribute levels.
For the Freight fact extension, allow the SQL Engine to join only against the Order attribute itself. Since the Order
attribute is already the lowest-level attribute in the Customers hierarchy, it does not have any child attributes that
can be used to join to the fact.
Just as with fact degradations, when a fact extension is created, users may need to define an expression to allocate
the fact data at the extended attribute level. Some facts are static and do not change value from one attribute level
to another, while other facts have values that do change, depending on the attribute level.
As with fact degradations, a valid allocation expression can include attributes, facts, constants, and any standard
expression syntax, including mathematical operators, pass-through functions, and so forth.
For the Freight fact extension, a user could create the following allocation expression:
Fact Disallow
A fact disallow functions very differently from the other types of fact level extensions. Fact degradations and extensions relate the fact to additional attributes, whereas a fact disallow prevents unnecessary cross joins between fact and lookup tables that would otherwise occur by default.
For example, a project may contain the following hierarchies and fact table:
This project contains Geography, Product, and Time hierarchies. The INVENTORY_ORDERS fact table stores data
only at the level of item and month, so it is only related to the Product and Time hierarchies.
However, what if a user wants to run a report that displays a Units Received metric (built from the Units Received fact) as well as the Item and Call Center attributes? The Item attribute is related to the Units Received fact since it is part of the Product hierarchy, but the Call Center attribute is not related to this fact since it is part of the Geography hierarchy. By default, the result set for this report looks like the following:
Report Result Set—Unrelated Attribute and Fact
• The image above only displays the first few rows of the result set for the report.
Because there is no way to relate call center data to units received data, this report result displays the units received
for every item paired with every call center.
In an attempt to return data for the report, the SQL Engine generates SQL that results in a cross join between the
lookup table for the Call Center attribute and the fact table that contains the units received data. The SQL for this
report looks like the following:
Report SQL—Unrelated Attribute and Fact
The Units Received fact is not related to any attribute in the Geography hierarchy. However, by default, the SQL Engine performs a cross join to the Call Center lookup table.
The report SQL includes the LU_CALL_CTR table in the FROM clause, but it cannot join this table to the
INVENTORY_ORDERS fact table in the WHERE clause. The cross join makes it possible for the report to return data,
but it can only produce a cross product, which does not yield a meaningful result set.
If you have no need for this cross join and the result set it produces, you can disallow the Call Center attribute for the Units Received fact. This fact disallow prevents the SQL Engine from executing the cross join.
• You cannot use fact disallows to prevent normal joins from occurring, only cross joins. For example, for the fact table referenced in this topic, disallowing an attribute from the Time or Product hierarchies for the Units Received fact would not prevent users from running a report with attributes from either hierarchy. The Units Received fact is related to attributes from each of these hierarchies, so a cross join to lookup tables would not occur.
Partitioning
Partitioning Fact Tables in MicroStrategy Schema
Partitioning Concepts
Partitioning is the division of a larger table into smaller tables. You often implement partitioning in a data
warehouse to improve query performance by reducing the number of records that queries must scan to retrieve a
result set. You can also use partitioning to decrease the amount of time necessary to load data into data warehouse
tables and perform batch processing.
Databases differ dramatically in the size of the data files and physical tables they can manage effectively.
Partitioning support varies by database. Most database vendors today provide some support for partitioning at the
database level. Regardless, the use of some partitioning strategy is essential in designing a manageable data
warehouse. Like all data warehouse tuning techniques, you should periodically re-evaluate your partitioning
strategy.
You can implement partitioning at one of the following two levels:
• Server level
• Application level
The following illustration shows the difference between these two types of partitioning:
Server-level partitioning involves dividing one physical table into logical partitions in the database environment.
The database software handles this type of partitioning completely, so these partitions are effectively transparent
to MicroStrategy software. Since only one physical table exists, the SQL Engine only has to write SQL against a
single table, and the database manages which logical partitions are used to resolve the query.
Application-level partitioning involves dividing one large table into several separate, smaller physical tables called
partition base tables. You split the table into smaller tables in the database itself, and then, the application that is
running queries against the database (in this case, MicroStrategy) manages which partitions are used for any given
query. Since multiple physical tables exist, the SQL Engine has to write SQL against different tables, depending on
which tables are needed to retrieve the result set for a query.
MicroStrategy supports application-level partitioning for fact tables through one of two methods:
• Warehouse partition mapping—This method is based on a physical warehouse partition mapping table
(PMT) that resides in the data warehouse and describes the partitioned base tables that are part of a
conceptual whole.
• Metadata partition mapping—This method is based on rules logically defined in MicroStrategy Architect
and stored in the metadata.
With warehouse partition mapping, you do not include the original fact table or the partition base tables in the
project. Rather, you create and maintain a partition mapping table, which MicroStrategy Architect uses to identify
the partitioned base tables as part of a logical whole. You only bring the partition mapping table to the project.
To understand how you can use warehouse partition mapping to manage partitioned base tables, consider the
following example. To ease maintenance and improve query performance, you have divided a multibillion-row fact
table by month into a few one-billion-row fact tables. The following illustration shows a partition mapping table
that describes how the partitioned tables fit together as a whole:
Warehouse Partition Mapping
The partition mapping table has several features that must be included for it to work appropriately in
MicroStrategy Architect:
• You can use any name for the partition mapping table.
• It has a column for each attribute ID by which you partitioned the table. These attribute IDs represent the
partitioning attributes.
• It must have an additional column whose name must be PBTName and whose contents refer to the names
of the various partition base tables (PBTs). Partition base tables are the actual physical partition tables that
make up the complete, logical fact table.
* The PBTName column indicates to MicroStrategy Architect that the table is a partition mapping table.
• There is a row for each of the partition base tables in the logical whole.
When you follow these conventions, MicroStrategy Architect automatically recognizes the partition mapping table,
and you can easily add it to the project using the Architect graphical interface.
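A partition mapping table that follows these conventions might look like the following (a SQLite sketch; the table name, partition base table names, and quarter IDs are hypothetical, but the PBTNAME column name is exactly what Architect requires):

```python
import sqlite3

# Hypothetical partition mapping table, partitioned by the Quarter attribute.
# One row per partition base table; PBTNAME is the mandatory column name.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE FACT_PMT (QUARTER_ID INTEGER, PBTNAME TEXT)")
cur.executemany("INSERT INTO FACT_PMT VALUES (?, ?)", [
    (20231, "FACT_Q1_2023"),  # points to the physical table for Q1 2023
    (20232, "FACT_Q2_2023"),
    (20233, "FACT_Q3_2023"),
])
rows = cur.execute("SELECT * FROM FACT_PMT ORDER BY QUARTER_ID").fetchall()
print(rows)
```

Each row maps one value of the partitioning attribute's ID column to the physical partition base table that holds its data.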
When you use warehouse partition mapping, you add the partition mapping table to the project in the Architect graphical interface. You do not add the partition base tables to the project. When you add the partition mapping table, Architect automatically associates it with the underlying partition base tables. After adding the partition mapping table to the project, you also need to identify the attribute used to partition the tables by creating a partition mapping.
2. In the Architect graphical interface, in the Warehouse Tables pane, in the list of tables in the database instance,
select the partition mapping table and right-click to select Add Table to Project.
* When you add a partition mapping table to a project, you see a message window that reminds you to set the
partitioning attribute level in the partition mapping. Click OK.
4. In Developer, in the Schema Objects folder, in the Partition Mappings folder, double-click the partition
mapping object with the same name as the partition mapping table you just added to the project.
5. In the Warehouse Partition Mapping Editor, on the Logical View tab, click Add.
6. In the Partition Level Attributes Selection window, in the Available attributes list, select the attribute by which
the tables are partitioned and click the > button to add it to the Selected attributes list.
7. Click OK.
The Warehouse Partition Mapping Editor has four tabs, each displaying different information about the partition
mapping.
The following image shows the Logical View tab of the Warehouse Partition Mapping Editor:
Logical View of the Partition Mapping
The Logical View tab shows the attribute by which the base tables are partitioned. In the image above, the base
tables are partitioned by the Quarter attribute.
The following image shows the Physical View tab of the Warehouse Partition Mapping Editor:
Physical View of the Partition Mapping
The Physical View tab shows the actual columns in the partition mapping table. In this example, the partition
mapping table contains two columns, PBTNAME and QUARTER_ID.
The following image shows the logical view of the associated partition base tables on the Base Table(s) Logical
View tab of the Warehouse Partition Mapping Editor:
Logical View of the Partition Base Tables
The Base Table(s) Logical View tab shows the attributes mapped to the base partition tables. In this example, the
Item and Month attributes are mapped to the base partition tables.
The following image shows the physical view of the base partition tables on the Base Table(s) Physical View tab of
the Warehouse Partition Mapping Editor:
Physical View of the Partition Base Tables
The Base Table(s) Physical View tab shows the actual columns in the partition base tables. In this example, the
Month attribute is mapped to the MONTH_ID column, and the Item attribute is mapped to the ITEM_ID column in
the base partition tables. In addition, you can map the Beginning on Hand and End on Hand facts to the BOH_QTY
and EOH_QTY columns, respectively.
When you run a report that requires information from one of the partition base tables, the Query Engine first runs a
prequery to the partition mapping table to determine which partition to access to obtain the data for the report.
The prequery requests the partition base table names associated with the attribute IDs from the filtering criteria.
Next, the SQL Engine generates SQL against the appropriate partition base tables to retrieve the report results.
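As a sketch of this two-step flow, the following hypothetical example builds a small partition mapping table in SQLite and runs a prequery against it. The table and column names (PMT_SALES, SALES_Q1_2015, and so on) are invented for illustration and are not the engine's actual generated SQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical partition mapping table: one row per partition base table.
cur.execute("CREATE TABLE PMT_SALES (PBTNAME TEXT, QUARTER_ID INTEGER)")
cur.executemany(
    "INSERT INTO PMT_SALES VALUES (?, ?)",
    [("SALES_Q1_2015", 20151),
     ("SALES_Q2_2015", 20152),
     ("SALES_Q3_2015", 20153)],
)

# Prequery: find the partition base tables that hold data for the quarters
# in the report's filtering criteria.  The report SQL is then generated
# against only these tables.
cur.execute(
    "SELECT PBTNAME FROM PMT_SALES "
    "WHERE QUARTER_ID IN (20151, 20152) ORDER BY PBTNAME"
)
tables = [row[0] for row in cur.fetchall()]
print(tables)  # ['SALES_Q1_2015', 'SALES_Q2_2015']
```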
* The partition base tables may contain either the same column as the partitioning attribute or a column
corresponding to any child of the partitioning attribute.
* MicroStrategy Architect does not support application-level partitioning for lookup tables. The size of lookup
tables usually makes this kind of partitioning unnecessary. If you want, you can use server-level partitioning on
lookup tables.
In metadata partition mapping, the application running against the database still manages the physically
partitioned fact tables, but the execution is different. Metadata partition mapping does not require a partition
mapping table in the data warehouse. Instead, you define the data contained in each partition base table in a
partition mapping object in MicroStrategy Architect. This object is stored in the metadata. You update the partition
mapping as new partition base tables are created.
When you execute a report that runs against the partitioned tables, the Query Engine sends the necessary
prequeries to the metadata to determine which partition base tables need to be included in the report SQL. The
SQL Engine then generates SQL against the appropriate partition base tables to retrieve the report results.
When you use metadata partition mapping, you add the partition base tables to the project in the Architect
graphical interface. After adding the partition base tables to the project, you create the partition mapping and
define the data slices for each partition base table.
With metadata partition mapping, you can also add partition base tables from multiple data sources to a single partition definition. This feature can significantly improve performance and reduce cost, especially when you are dealing with large amounts of data in a project. For example, all 2012 partition base tables are in
Tutorial Data, but the 2011 partition base tables are in Archived Tutorial Data. You can add the 2011 and 2012
partition base tables from both data sources to a project and define the data slice for each partition base table. The
following illustration shows a metadata partition mapping table from multiple data sources:
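Conceptually, a metadata partition mapping is a set of partition base tables, each registered with its data source and a data slice. The following Python sketch (with invented table and source names based on the example above) shows how a filter resolves to the tables, and the sources, that must be queried; the real mapping lives in the metadata, not in code.

```python
# Hypothetical metadata partition mapping: each partition base table is
# registered with its data source and the data slice it contains.
partition_mapping = [
    {"table": "SALES_2011", "source": "Archived Tutorial Data",
     "slice": lambda year: year == 2011},
    {"table": "SALES_2012", "source": "Tutorial Data",
     "slice": lambda year: year == 2012},
]

def tables_for(years):
    """Return the (source, table) pairs whose data slice matches the filter."""
    return [(p["source"], p["table"])
            for p in partition_mapping
            if any(p["slice"](y) for y in years)]

print(tables_for([2012]))        # only the table in Tutorial Data
print(tables_for([2011, 2012]))  # tables from both data sources
```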
2. In the Architect graphical interface, in the Warehouse Tables pane, in the list of tables available in the database
instance, select all the partition base tables. Right-click your selection and select Add Table to Project.
4. In Developer, in the Schema Objects folder, select the Partition Mappings folder.
6. In the Partition Tables Selection window, in the Available tables list, select the partition base tables and click the
> button to add them to the Selected tables list.
7. Click OK.
8. In the Metadata Partition Mapping Editor, in the Tables defining the partition mapping list, select a partition
base table.
9. Click Define.
10. In the Data Slice Editor, in the Data Slice Definition pane, define the data contained in the partition base table.
12. Repeat steps 8 to 11 for each partition base table in the partition mapping.
13. If you want to define a logical size for the partition mapping, in the Metadata Partition Mapping Editor, in the
Partition mapping logical size box, type a value.
* Remember to lock the logical size if you want to preserve it when updating the project schema.
One of the key capabilities of In-Memory Analytics is its ability to partition and process data in parallel. As with
other distributed processing systems, picking the correct partition attribute is the key to a successful
implementation.
In-Memory Analytics
MicroStrategy In-Memory Analytics combines a massively parallel in-memory data store with a state-of-the-art visualization and interaction engine. The optimization between the data store and the end-user interaction layer enables MicroStrategy In-Memory Analytics to achieve orders-of-magnitude higher performance than its competitors. MicroStrategy In-Memory Analytics is not a database and does not eliminate the need for a database or data warehouse platform; rather, it is an in-memory engine that coexists with databases such as Oracle, HP Vertica, and Teradata, among others.
MicroStrategy In-Memory Analytics takes the MicroStrategy OLAP Services offering previously available in 9.4.1 to a new level in several key dimensions:
• MicroStrategy In-Memory datasets can be divided into partitions. Each partition can have up to 2 billion
rows.
• MicroStrategy In-Memory Analytics extends the in-memory architecture to multiple fact tables; fact tables
with varying grains; Many-to-Many relationship tables; and Entity-Relation Model Semantics.
• The MicroStrategy In-Memory Analytics storage model is closer to raw data staged in-memory compared to
OLAP cubes.
• Current in-memory cubes do not fully support Filter and Metric functionality of the MicroStrategy Analytics
platform.
• MicroStrategy In-Memory Analytics includes the ability to have multiple levels of aggregation and filtering.
Partitioning is the act of distributing data across the available CPU cores, whether on a single machine or across the environment. Each data partition follows a “shared nothing” architecture and works only with its own corresponding CPU core or cores.
• Partitioning does limit the types of aggregations that can be performed on the raw data. Supported functions include distributive functions, such as SUM, MIN, MAX, COUNT, and PRODUCT, and semi-distributive functions, such as STDDEV and VARIANCE, that can be rewritten as distributive functions.
• Scalar functions, such as Add, Greatest, date/time functions, and string manipulation functions, are also supported.
• Derived metrics using any of MicroStrategy's more than 250 functions are supported.
• MicroStrategy In-Memory Analytics currently supports only one partitioning key/attribute for the entire dataset. All tables that have the partition attribute have their data distributed along the elements of that attribute.
• MicroStrategy In-Memory Analytics currently supports all flavors of INT data types, STRING/TEXT data types, and DATE data types for partitioning. INT data is distributed using either MOD or HASH schemes, and TEXT/DATE data is distributed using HASH schemes.
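As a rough illustration of these two schemes, the sketch below assigns integer keys to partitions with MOD and text keys with a hash. The md5-based hash is purely illustrative; the engine's actual hash function is internal and not documented here.

```python
import hashlib

N_PARTITIONS = 4  # assumed core count, for illustration only

def mod_partition(int_key: int) -> int:
    # MOD scheme for INT keys: the partition index is the key
    # modulo the number of partitions.
    return int_key % N_PARTITIONS

def hash_partition(text_key: str) -> int:
    # HASH scheme for TEXT/DATE keys.  md5 is a stand-in here;
    # the engine's real hash function is internal.
    digest = hashlib.md5(text_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % N_PARTITIONS

print(mod_partition(20151))          # 20151 % 4 == 3
print(hash_partition("2015-06-10"))  # some bucket in 0..3
```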
• The partition attribute is typically dictated by the specific application needs. Below are some general guidelines for identifying a good partition attribute:
• Some of the largest fact tables in the application are typically good candidates for partitioning and thus influence the choice of the partition attribute. They need to be partitioned to accommodate large data sizes and to take advantage of In-Memory Analytics' parallel processing architecture.
• Data should be partitioned in such a way that the greatest number of partitions is involved in any question asked of the application. Attributes that are frequently used for filtering or selections do not make good partition attributes, as they tend to push the analysis toward specific sets of partitions, minimizing the benefits of parallel processing.
• The partition attribute should also allow for near-uniform distribution of data across the partitions, so that the workload on each partition is more evenly distributed.
• Columns on which some of the larger tables in the application are joined also make good partition attributes.
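One way to apply the uniformity guideline is to profile the row counts per element of each candidate attribute in the fact data. The following sketch, with invented fact rows, computes a simple skew ratio; a value near 1.0 indicates a near-uniform distribution and therefore a better partition attribute.

```python
from collections import Counter

# Invented fact rows: (item_id, call_center_id, month_id).
fact_rows = [
    (1, 10, 201501), (2, 10, 201501), (3, 11, 201501),
    (1, 11, 201502), (2, 10, 201502), (3, 12, 201502),
]

def skew(rows, key_index):
    """Largest element row count divided by smallest; 1.0 is perfectly uniform."""
    counts = Counter(row[key_index] for row in rows)
    return max(counts.values()) / min(counts.values())

# Item (index 0) is perfectly uniform in this sample; call center
# (index 1) is skewed, so item is the better partition attribute here.
print(skew(fact_rows, 0))  # 1.0
print(skew(fact_rows, 1))  # 3.0
```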
The workshop in the “Pick Tables (Database) to Build MTDI Cubes” section includes a partitioning exercise.
Data Import
The following connectors and methods to access data are available in MicroStrategy 10:
• Clipboard: copy and paste data directly into the MicroStrategy interface
• Pick Tables (Database): import multiple relational tables into a single Intelligence Cube
• Search Engine Indices (Apache Solr Search): import data based on search terms
• Hadoop: import from Hadoop leveraging the MicroStrategy Big Query Engine; users can still also import
directly from Hadoop
• BI Tools (SAP, SAP BO, Cognos): import data via web services URLs
• OLAP Sources/MDX
• File from Disk and File from URL: users can now import delimited text files with extensions other than .csv.
Also, users can now import multiple spreadsheets, workbooks, or text files into a single Intelligence Cube.
Search Engine Indices and BI Tools cannot be loaded into In-Memory Analytics Intelligence Cubes. Both the
aforementioned connectors load data directly from the source to the report, document, or dashboard in MicroStrategy.
For the purpose of this course, we will discuss two types of data sources: Pick Tables (Database) and OLAP/MDX.
Users can simultaneously import two or more tables, Excel sheets, Excel workbooks, or CSV files from any supported data source. Multi-Table Data Import (MTDI) offers users the opportunity to load tables into MicroStrategy without first building a SQL query, as part of the new In-Memory Analytics Intelligence Cube design. Thanks to the new In-Memory Analytics (formerly known as Parallel Relational In-Memory Engine, or PRIME) cube architecture, when importing multiple tables from a database, users have the option to specify a partitioning attribute (distribution key) and the number of distributions. Because MicroStrategy currently supports a single-node architecture, the number of distributions needs to match the number of CPU cores of the MicroStrategy Intelligence Server machine.
In this exercise:
• Create MTDI cube with the new In-Memory Analytics Intelligence Cube structure.
4. Select the “Tutorial Data” Database Instance and add the following tables. Click “Edit Data.”
• LU_CUST_CITY
• LU_CALL_CTR
• LU_ITEM
• LU_MONTH
• CITY_MNTH_SLS
5. The Multi-Table Data Import (MTDI) window will open. This window shows all tables selected in the previous
screen, and the respective columns mapped as attributes and metrics. Note that Data Import automatically
maps the table columns as attributes and metrics. Any errors in the attribute or the metric mappings will be
corrected later in this section of the workshop.
6. Users may want to add additional tables after the initial set of tables is brought into Data Import. Add another
table to the preview screen by clicking on the “Add a new Table” button. Follow the Data Import workflow and
select the ITEM_CCTR_MONTH_SLS (sales data at the item, call center, and month levels) fact table. The new
table should appear in the MTDI preview window after scrolling to the right.
7. In the MTDI preview window, notice that each column in a given table will be imported as a separate attribute.
For example, Call Ctr Id and Center Name are each listed as distinct attributes in the LU_CALL_CTR table.
Data Import offers the opportunity to designate two or more columns as attribute forms within a single
attribute. To create a unified “CALL CENTER” attribute, highlight both Call Ctr Id and Center Name by holding the Shift key while making the selections, right-click (RMC), and select “Create Multiform Attribute.” Change the attribute name to “CALL CENTER.” Double-check that the forms (e.g., Call Ctr Id) and form categories (e.g., ID) are correctly matched. Uncheck the display form for Call Ctr Id to ensure that only the Call Center DESC appears by default on reports and documents. Click Submit to create the new multiform attribute.
8. The LU_CALL_CTR table should now look like the following in the MTDI preview window:
The name of the attribute will change for all other tables containing at least one attribute form. For example, the Call Ctr
Id column in all other tables will also change to “CALL CENTER.”
9. Repeat the step for the following columns; all multiform attributes should be created on the lookup table:
10. Click on the “All Objects View” to see all attributes and metrics in a list form.
• Unit Price
• Unit Cost
• Month of Year
• Tot Cost
11. Choose “Do Not Import” for all other metrics. Additionally, do not import any extraneous attribute columns that
were not used in the multiform attributes created in steps 6 and 7.
Users can choose “Do Not Import” for multiple columns at once by highlighting a group of columns using the Shift or Ctrl keys before right-clicking and selecting “Do Not Import.” Users can also right-click an individual column and choose “Do Not Import” in the MTDI preview window.
12. In the “All Objects View” window, convert “Month of Year” from a metric to an attribute. Right-mouse click on
the “Month of Year” metric and select “Convert to Attribute.” Note: users can also convert a metric to an
attribute (and vice versa) in the MTDI preview window.
13. The metric (and attribute) names are automatically populated according to the respective database column names. However, these names may not accurately reflect the metric's data to the end user. While still in the “All Objects View” window, RMC on the following metrics and select “Rename.”
14. Next, establish the Distribution Key, or Partition Attribute, from within the “All Objects View” window.
Ideally, we would like to partition against the largest fact tables. Because an in-memory cube can only be partitioned on a single attribute, it makes sense to choose an attribute that exists in both fact tables. In this case, the “Item” attribute meets those criteria.
We can also validate our choice of the Item attribute by looking at how the elements of item are distributed across
the row count of each fact table. When looking at the table ITEM_CCTR_MONTH_SLS, we see the following:
From the screenshot above, we can see that the elements of “Item” are fairly evenly distributed throughout the fact
table (looking through the rest of the returned data confirms this).
Choose 2 for the number of partitions, as these VMs typically come with 2 CPU cores. This number can be adjusted
depending on number of cores in the Intelligence Server machine. Matching the number of cores to the number of
partitions will help ensure that you are taking advantage of in-memory analytics parallel processing functionality.
In the “All Objects View” window, select “ITEM” as the Distribution Key and “2” as the Number of Distributions.
Click “Save” and return to the MTDI preview window.
15. Click “Continue” in the MTDI screen. Save the cube as “MTDI Workshop Cube.”
16. This MTDI cube with the new In-Memory Analytics Intelligence Cube structure is now ready to use in a
dashboard or document.
OLAP Sources/MDX
Also new in MicroStrategy 10 is support in MicroStrategy Data Import for retrieving data from OLAP/MDX sources. As with ‘Search as Source’, data can only be retrieved via Direct Data Access, so the data cannot be brought into memory. The following sources are supported:
These sources can be configured by clicking on the ‘Add External Data’ and ‘OLAP’ icons in the ‘Connect to Your
Data’ window.
Choose one of the MDX sources. Next, configure the selected MDX source. Below are some examples of MDX data source selections and the configuration parameters required.
Once connected to the data source, users can select desired objects for reporting.
Logical Views
Logical views are an alternative to using database views. Most functions performed using database views can also be accomplished using logical views. However, the two differ in that users create and maintain logical views at the application level rather than the database level.
Use the Logical Table Editor to create a logical table called a logical view. Rather than mapping to a physical table in
the data warehouse, a logical view is a SQL query that is created and then executed against tables in the data
warehouse. For example, a user stores order and shipping information in different tables in the data warehouse,
LU_ORDER and LU_SHIPMENT. The user must calculate the processing time between the order dates in the
LU_ORDER table and the ship dates in the LU_SHIPMENT table to analyze the processing time for each order. The
user can create a logical view that performs these functions:
Logical Views
In this example, the LVW_PROCESSING logical table is a logical view that is created by running a SQL query against
the LU_ORDER and LU_SHIPMENT tables in the data warehouse. It does not map directly to any physical table.
When defining the logical view, write a SQL statement that selects the pertinent information from the LU_ORDER
and LU_SHIPMENT tables and then calculates the difference between the order and ship dates to determine the
processing time. The following image shows an example of what this logical view might look like in the Logical
Table Editor:
Notice that the logical view contains a SQL statement and definitions for each column that is created as part of the
logical view. Map the attributes and facts to these columns.
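To make the order-processing example concrete, the following sketch runs the kind of SQL such a logical view might contain against in-memory SQLite tables. The LU_ORDER and LU_SHIPMENT names come from the example above, but the column names and data are invented; the julianday() date arithmetic is SQLite-specific, and a real warehouse would use its own date functions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE LU_ORDER (ORDER_ID INTEGER, ORDER_DATE TEXT)")
cur.execute("CREATE TABLE LU_SHIPMENT (ORDER_ID INTEGER, SHIP_DATE TEXT)")
cur.execute("INSERT INTO LU_ORDER VALUES (1, '2015-06-01')")
cur.execute("INSERT INTO LU_SHIPMENT VALUES (1, '2015-06-04')")

# The logical view's SQL: join orders to shipments and compute the
# processing time in days.  julianday() is SQLite's date arithmetic.
cur.execute("""
    SELECT o.ORDER_ID,
           CAST(julianday(s.SHIP_DATE) - julianday(o.ORDER_DATE) AS INTEGER)
               AS PROCESSING_DAYS
    FROM LU_ORDER o
    JOIN LU_SHIPMENT s ON o.ORDER_ID = s.ORDER_ID
""")
rows = cur.fetchall()
print(rows)  # [(1, 3)] -- three days between order and shipment
```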
During this workshop, create and run a report in the project that uses a month-to-date transformation sales metric.
The transformation object is table-based. Next, create a logical view, and use this to modify the existing
month-to-date transformation. When running the existing report again, observe that the metric yields the same
result and that the difference is in the SQL.
Complete the following exercise using the Logical Views Project in the Advanced Data Warehousing project source.
The schema for the project objects used in this exercise consists of the following six tables:
Schema
The MTD_DAY table is a transformation table. The MTD_DAY_DATE column is used to create a month-to-date
transformation object.
1. In the Public Objects folder, in the Reports folder, create the following report:
• Access Product Type, Day and Month attributes from the Product and Time hierarchies respectively. Access
the Sales and MTD Sales metrics from the Metrics folder.
2. Run the report. The first part of the result set should look like the following:
Notice that the MTD Sales for a given day is a sum of the Sales for that day plus the sales for the previous days of the
month.
3. Switch to the SQL view for the report. The SQL should look like the following:
5. Save the report in the Reports folder as MTD Sales and close the report.
6. To open the Logical Table Editor, in the Schema Objects folder, in the Tables folder, right-click a blank area of the
Object Viewer, point to New, and select Logical Table(W).
7. To define the logical view SQL, in the Logical View Exercises SQL.doc file, copy the SQL labeled as
LVW_MTD_DAY and paste it in the SQL Statement pane of the Logical Table Editor. The SQL looks like the
following:
select a.DAY_ID as DAY_ID,
       b.DAY_ID as MTD_DAY_DATE
from   LU_DAY a
join   LU_DAY b
on     a.MONTH_ID = b.MONTH_ID
where  a.DAY_ID >= b.DAY_ID
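The behavior of this self-join can be verified against a tiny LU_DAY table in SQLite: each day is paired with every earlier-or-equal day of the same month, which is exactly the expansion that lets the transformation sum sales month-to-date. The three-day sample data is invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE LU_DAY (DAY_ID TEXT, MONTH_ID INTEGER)")
cur.executemany("INSERT INTO LU_DAY VALUES (?, ?)", [
    ("2015-06-01", 201506), ("2015-06-02", 201506), ("2015-06-03", 201506),
])

# The logical view's self-join: each day is paired with every day of the
# same month up to and including itself.
cur.execute("""
    SELECT a.DAY_ID, b.DAY_ID AS MTD_DAY_DATE
    FROM LU_DAY a JOIN LU_DAY b ON a.MONTH_ID = b.MONTH_ID
    WHERE a.DAY_ID >= b.DAY_ID
    ORDER BY a.DAY_ID, b.DAY_ID
""")
pairs = cur.fetchall()
# June 1 yields one pair, June 2 two, June 3 three: 1 + 2 + 3 = 6 rows.
print(len(pairs))  # 6
```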
8. To define the logical view columns, in the Mapping pane, click Add.
9. In the Column Object box, type DAY_ID as the name of the column.
11. In the Column Object box, type MTD_DAY_DATE as the name of the column.
12. The column name must match the name used in the SQL statement.
13. In the Data Type drop-down list, select TimeStamp as the data type for both the columns.
14. In the Precision/Length field, enter the number 0 for both columns.
15. Save the logical table in the Tables folder as LVW_MTD_DAY and close the Logical Table Editor.
16. To map the existing attribute to the logical view column, in the Attributes folder, open the Day attribute.
17. Modify the ID form and select DAY_ID in the Form definition Expressions window.
20. In the Source table drop down list, select LVW_MTD_DAY table.
• Users may receive two messages warning that the display type for the form is different from the database
data type. The Day column is stored as Datetime in the database, but it is using the Date format to display it
on reports. The attribute form works correctly, so click Yes to both messages.
25. To map the existing transformation to the logical view column, in the Transformations folder, open the Month-to-Date transformation.
26. Click the Modify button to change the existing Day transformation.
27. In the Member attribute expression window, click Clear to remove the MTD_DAY_DATE expression.
33. Run the MTD Sales report again. The first part of the result set should look like the following:
• Notice that as expected, there is no change in the results. However, there is difference in the SQL.
34. Switch to the SQL View of the report. Users should see the following SQL for Pass3:
Notice that the FROM clause contains a derived table expression where users would normally see a physical table name. This expression is the SQL statement for the logical view. It selects the necessary information from the LU_DAY table to create the LVW_MTD_DAY transformation table. The WHERE clause then joins the information from the logical view to the FACT_DAILY_PRODUCT_SALES table using the ORDER_DATE and MTD_DAY_DATE columns.
Attribute Roles
A role attribute occurs when a column in a single lookup table is used to define more than one attribute. In such cases, one set of data plays multiple roles in the reporting environment. For example, a data warehouse may contain one lookup table for cities. However, on reports, users may choose to view the city in which a store is located (Store City), the city in which a customer is located (Customer City), or the city from which an order is shipped (Ship City). All of these attributes are roles that reference the same underlying table.
A single attribute in a dimension or hierarchy may function as a role attribute, or all of the attributes in a dimension
may be role attributes. If an entire dimension consists of role attributes, it is referred to as a role-playing dimension.
For example, a user could have Ship Time and Order Time dimensions where all levels of time function as role
attributes and reference the same set of lookup tables.
Complete the following exercises using the Attribute Roles Project found in the Advanced Data Warehousing
project source. The hierarchies and schema for this project are included with the exercises. Review this information
before beginning the exercises.
While completing the exercises, create and run reports in the project without having any solution in place to
enable attribute roles. Next, configure the project for attribute roles using automatic attribute role recognition.
Finally, disable automatic attribute role recognition for the project and use explicit table aliasing to support
attribute roles.
The Attribute Roles Project is based on the following logical data model:
Attribute Roles Logical Data Model
In this data model, Customer City and Store City are role attributes that both point to the same lookup table.
Because you are focusing on these attributes, only the bottom two levels of each hierarchy in the data model exist
in the project. The Customer hierarchy looks like the following:
Customer Hierarchy
The schema for this project consists of the following two tables:
Attribute Roles Schema
Both the Customer City and Store City attributes map to the ID and description columns in the LU_CITY table. The
Customer City attribute also maps to the Cust_City_ID column in the FACT_CUST_SALES table, and the Store City
attribute maps to the St_City_ID column in this table. For these exercises, since you do not need to query store and
customer information, separate lookup tables for the Store and Customer attributes do not exist. These attributes
are mapped only to the Store_ID and Cust_ID columns in the FACT_CUST_SALES table.
Create a report
1. In MicroStrategy Developer, in the Advanced Data Warehousing project source, open the Attribute Roles
Project.
2. In the Public Objects folder, in the Reports folder, create the following report:
* You can access the Store City attribute from the Store hierarchy and the Customer City attribute from the
Customer hierarchy. The Revenue metric is located in the Public Objects/Metrics folder.
3. Run the report. The result set should look like the following:
* Notice that since both the Customer City and Store City attributes are on the report, it does not return an accurate
result set. The report displays only records where the store city and customer city happen to be the same. The
query excludes all other records.
4. Switch to the SQL View for the report. You should see the following SQL:
* Notice that the report requires the ID and description for both attributes. To correctly obtain this information, it
needs to access the LU_CITY table twice. However, because these two attributes are not configured as role
attributes, the table is aliased only once in the FROM clause.
6. Save the report in the Reports folder as Revenue by Store City and Customer City and close the report.
4. In the Object Viewer, right-click the Advanced Data Warehousing Warehouse database instance and select
VLDB Properties.
5. In the VLDB Properties window, on the Tools menu, select Show Advanced Settings.
8. Clear the Use default inherited value - (Default Settings) check box.
* This message window reminds you to disconnect and reconnect to the project source for changes in VLDB
settings to take effect.
13. Reconnect to the Advanced Data Warehousing project source using the administrator login.
* You must disconnect and reconnect to the project source for the change in the Engine attribute role VLDB
property to take effect.
15. In the Public Objects folder, in the Reports folder, run the Revenue by Store City and Customer City report again.
The result set should look like the following:
* Notice that the result set now correctly returns all of the records in the fact table, not just those where the store
city and customer city are the same.
16. Switch to the SQL View for the report. You should see the following SQL:
* Notice that the LU_CITY table is aliased twice in the FROM clause. The query can retrieve all of the records from
the fact table, not just those where the store city and customer city are the same.
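The effect of aliasing the lookup table twice can be demonstrated with a small SQLite sketch. The data and the simplified column names are invented; with a single alias the join forces store city and customer city to match, while two aliases return every fact row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE LU_CITY (CITY_ID INTEGER, CITY_DESC TEXT)")
cur.executemany("INSERT INTO LU_CITY VALUES (?, ?)",
                [(1, "Boston"), (2, "Miami")])
cur.execute("""CREATE TABLE FACT_CUST_SALES
               (ST_CITY_ID INTEGER, CUST_CITY_ID INTEGER, REVENUE REAL)""")
cur.execute("INSERT INTO FACT_CUST_SALES VALUES (1, 2, 500.0)")

# With a single alias, both ID columns must match the same city row,
# so the Boston -> Miami sale disappears from the result set:
cur.execute("""
    SELECT c.CITY_DESC, f.REVENUE
    FROM FACT_CUST_SALES f
    JOIN LU_CITY c ON f.ST_CITY_ID = c.CITY_ID
                  AND f.CUST_CITY_ID = c.CITY_ID
""")
print(cur.fetchall())  # []

# Aliasing LU_CITY twice (a1 for store city, a2 for customer city)
# returns every fact row:
cur.execute("""
    SELECT a1.CITY_DESC, a2.CITY_DESC, f.REVENUE
    FROM FACT_CUST_SALES f
    JOIN LU_CITY a1 ON f.ST_CITY_ID = a1.CITY_ID
    JOIN LU_CITY a2 ON f.CUST_CITY_ID = a2.CITY_ID
""")
rows = cur.fetchall()
print(rows)  # [('Boston', 'Miami', 500.0)]
```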
Now that you have seen how to support role attributes using automatic attribute role recognition, you are going to
disable automatic attribute role recognition and create an explicit table alias to support the role attributes.
4. In the Object Viewer, right-click the Advanced Data Warehousing Warehouse database instance and select
VLDB Properties.
5. In the VLDB Properties window, in the VLDB Settings list, expand the Query Optimizations folder.
* This message window reminds you to disconnect and reconnect to the project source for changes in VLDB
settings to take effect.
11. Reconnect to the Advanced Data Warehousing project source using the administrator login.
* You must disconnect and reconnect to the project source for the change in the Engine attribute role VLDB
property to take effect.
13. In the Schema Objects folder, in the Tables folder, right-click the LU_CITY table and select Create Table Alias.
14. Right-click the LU_CITY (1) table alias and select Rename.
16. In the Schema Objects folder, in the Attributes folder, open the Customer City attribute.
17. Modify the ID form to map the City_ID form expression to the LU_CITY_ALIAS_CUSTCITY table.
* When you remove LU_CITY as a source table and close the Modify Attribute Form window, you see a message
window warning you that this change will make the DESC form invalid. Click OK when you see this warning as you
will remap the DESC form.
21. Modify the DESC form to map the City_Desc form expression to the LU_CITY_ALIAS_CUSTCITY table.
27. For the ID form, ensure that the LU_CITY_ALIAS_CUSTCITY table is not selected as a source table for the City_ID
form expression.
30. Run the Revenue by Store City and Customer City report again. The result set should look like the following:
* Notice that the result set again correctly returns all of the records in the fact table, not just those where the store
city and customer city are the same.
31. Switch to the SQL View for the report. You should see the following SQL:
* Notice that the LU_CITY table is aliased twice in the FROM clause, so the query can retrieve the information for
both Store City and Customer City.
5
APPENDIX: ALTERNATE TO ARCHITECT
You first create a project object in the Project Creation Assistant. Then, you can use a combination of wizards and editors as an alternative way to create schema objects.
Creating schema objects using Project Creation Assistant involves the following steps:
1. Select the project tables using the Warehouse Catalog: In this interface, you first select the database instance
associated with the project, and then select the tables.
2. Create facts: The Fact Creation Wizard helps you create facts. Before you create a fact, you must define the fact creation rules. For more information, see the MicroStrategy Project Design Help.
3. Create attributes: The Attribute Creation Wizard enables you to create multiple attributes quickly. However, you are limited to creating basic attributes using this tool. In this wizard, you define the creation rules, create ID and description forms, select lookup tables, create attribute relationships, and create compound attributes. You can use the Attribute Editor to create more complex attributes. For more information, see the MicroStrategy Project Design Help.
© 2015 MicroStrategy Inc. Other Methods for Creating Schema Objects 323
Project Architecting Appendix: Alternate to Architect 5
4. Create user hierarchies: You can create and modify user hierarchies using the Hierarchy Editor. The Hierarchy
Editor is one of the schema object editors available in MicroStrategy Architect. It enables you to configure a
variety of hierarchy-related settings.
In MicroStrategy, the following Project Creation Assistant wizards and editors can be used as alternative ways to create a project:
• Warehouse Catalog
• Fact Editor
• Attribute Editor
• Hierarchy Editor
The following image shows an example of Warehouse Catalog with select lookup and fact tables added to a
project:
Warehouse Catalog—Lookup and Fact Tables Added to Project
The following image shows an example of Fact Creation Wizard with selected facts created in a project:
Fact Creation Wizard—Facts Created in Project
The following image shows an example of Fact Editor with a simple expression for the Gross Revenue fact:
Fact Editor—Simple Expression
The following image shows an example of Attribute Creation Wizard with selected attributes created in a project:
Attribute Creation Wizard—Attributes Created in Project
The following image shows an example of Attribute Creation Wizard with the correct description forms selected for
each attribute:
Attribute Creation Wizard—Description Forms Selected
The following image shows an example of Attribute Creation Wizard with the correct lookup tables selected for
each attribute:
Attribute Creation Wizard—Lookup Tables Selected
The following image shows an example of Attribute Creation Wizard with the relationships created for an attribute:
Attribute Creation Wizard—Relationships Created
The following image shows an example of Attribute Creation Wizard with a compound attribute created:
Attribute Creation Wizard—Compound Attribute Created
The following image shows an example of the Display tab in the Attribute Editor, which you use to define the report display and browse forms of an attribute:
Attribute Editor—Display Tab
The following image shows an example of Hierarchy Editor with Time user hierarchy, which includes the Year,
Quarter, Month, and Day attributes:
Time User Hierarchy
When you create user hierarchies, you can define the sort order used to display their attributes. You cannot define the sort order in the Architect graphical interface. Instead, you customize the sort order for user hierarchies by using a combination of settings in the Project Configuration Editor and the Hierarchy Editor. These options enable you to define sort orders for both browsing and drilling.
By default, when you create a user hierarchy for browsing, its attributes display in alphabetical order. However, it
may make more sense to users for the attributes to display in a logical order based on their relationships. You can
customize the order in which the attributes display.
The sort order you define for browsing also applies to hierarchy prompts since these objects are based on browsing
user hierarchies.
1. In Developer, right-click the project that contains the user hierarchy for which you want to define the sort order
and select Project Configuration.
2. In the Project Configuration Editor, under the Project definition category, select Advanced.
3. Under Advanced Prompt Properties, clear the Display Attribute alphabetically in hierarchy prompt check box.
4. Click OK.
5. In the Schema Objects folder, in the Hierarchies folder, in the Data Explorer folder, right-click the user hierarchy for which you want to define the sort order and select Edit.
6. In the Select Objects window, using the arrow buttons to the right of the Selected objects list, change the order of the attributes in the user hierarchy to the desired display order.
• After you clear the project-level setting, the order of the attributes in the Selected objects list determines the order in which they display in the user hierarchy.
7. Click OK.
After closing the Hierarchy Editor, you can update the project schema if desired.
The following image shows the sort order setting for browsing user hierarchies in the Project Configuration Editor:
Project Configuration Editor—Sort Order Setting for Browsing User Hierarchies
The following image shows the Select Objects window with a customized sort order for the attributes in the Time
user hierarchy of My Tutorial project:
Select Objects Window—Customized Sort Order for Attributes
The attributes in the Time user hierarchy are ordered from highest to lowest rather than alphabetically.
When you view the user hierarchy in the Data Explorer, the attributes display using this same sort order, as shown
in the following image:
Data Explorer—Customized Sort Order for Attributes
• In the image above, the Day attribute does not display because it is not an entry point for the user hierarchy.
• You may need to refresh the user hierarchy in the Data Explorer browser to see the customized sort order.
By default, when you create a user hierarchy for drilling, its attributes display according to their order in the
Hierarchy Editor. If you do not customize the sort order of the attributes in the Hierarchy Editor, they display in
alphabetical order. However, you can change the sort order to arrange the attributes in whatever order is most
convenient for users when drilling on reports.
• If you configure a user hierarchy for both browsing and drilling, you can only define one sort order. If you
want the user hierarchy to display attributes using different sort orders depending on whether you use it for
browsing or drilling, you must create separate user hierarchies for browsing and drilling.
1. In Developer, right-click the project that contains the user hierarchy for which you want to define the sort order
and select Project Configuration.
2. In the Project Configuration Editor, under the Project definition category, select Drilling.
3. Under Advanced, clear the Sort drilling options in ascending alphabetical order check box.
• By default, this check box is not selected, so you may not need to change this setting.
4. Click OK.
5. In the Schema Objects folder, in the Hierarchies folder, right-click the hierarchy for which you want to define the sort order and select Edit.
6. In the Select Objects window, using the arrow buttons to the right of the Selected objects list, change the order of the attributes in the user hierarchy to the desired display order.
• After you clear the project-level setting, the order of the attributes in the Selected objects list determines the order in which they display in the user hierarchy.
7. Click OK.
• After closing the Hierarchy Editor, you can update the project schema if desired.
The following image shows the sort order setting for drilling user hierarchies in the Project Configuration Editor:
Project Configuration Editor—Sort Order Setting for Drilling User Hierarchies
The following image shows the Select Objects window with a customized sort order for the attributes in the Time
user hierarchy:
Select Objects Window—Customized Sort Order for Attributes
The attributes in the Time user hierarchy are ordered from the most to least frequently used attribute for drilling.
When you view the user hierarchy while drilling on a report, the attributes display using this same sort order, as
shown in the following image:
Drilling on a Report—Customized Sort Order for Attributes
Warehouse Catalog
Maintaining Individual Tables
The Warehouse Catalog enables you to keep the logical tables consistent with the data warehouse structure by providing a variety of options that apply to project tables on an individual basis. You access these options in the Warehouse Catalog by right-clicking any table you have added to a project. The following image displays these individual table options:
The Warehouse Catalog provides the following options for individual tables:
• Table Structure—Selecting this option opens the Table Structure window and enables you to view the
columns and data types stored in a logical table. If the table structure has changed since you added the
table to the project, you can click Update Structure to force MicroStrategy Architect to recognize the
changes.
• Update Structure—This option is a shortcut to the Update Structure function available in the Table
Structure window.
• Show Sample Data—This option enables you to view the first 100 rows of data in a table.
• Calculate Table Row Count—Selecting this option enables you to calculate and display the row count for a
table.
* You can also choose to automatically calculate the row counts for all project tables using a project-wide
option in the Warehouse Catalog.
• Table Prefix—Selecting this option opens the Table Prefix window, which enables you to add and remove prefixes and assign a prefix to a table. Assigning a prefix to a table means that any time the SQL Engine uses that table, it includes the prefix when referencing the table.
• Table Database Instances—Selecting this option opens the Available Database Instances window, which
enables you to select the primary database instance for a table and assign secondary database instances for
a table. This option enables you to choose additional database instances in which you want to search for the
table if you cannot obtain it from the primary database instance.
* The primary database instance is the database instance that you should use for element browsing against the selected table and for queries that do not require joins to other tables. A secondary database instance is any other database where the table also exists. You use secondary database instances to support database gateways and when you create duplicate tables with the MultiSource Option.
• Import Prefix—Selecting this option enables you to retrieve a prefix for a table.
* There are no table prefixes for the MicroStrategy Tutorial data warehouse.
* You can also choose to automatically display the prefixes for all project tables in the warehouse catalog
using a project-wide option in the Warehouse Catalog.
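Several of the table-level options above correspond to straightforward catalog queries that any SQL client can issue. The following sketch, using Python's sqlite3 module with an illustrative table and a hypothetical `qualify` helper, shows the ideas behind Show Sample Data (a query limited to 100 rows), Calculate Table Row Count (a COUNT(*) query), and Table Prefix (qualifying the table name in generated SQL); it is a conceptual illustration, not MicroStrategy's implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Illustrative lookup table with 250 rows.
cur.execute("CREATE TABLE LU_REGION (REGION_ID INTEGER, REGION_DESC TEXT)")
cur.executemany("INSERT INTO LU_REGION VALUES (?, ?)",
                [(i, f"Region {i}") for i in range(250)])

# Show Sample Data: view only the first 100 rows of the table.
sample = cur.execute("SELECT * FROM LU_REGION LIMIT 100").fetchall()

# Calculate Table Row Count: a COUNT(*) against the same table.
(row_count,) = cur.execute("SELECT COUNT(*) FROM LU_REGION").fetchone()

# Table Prefix: the SQL Engine includes the assigned prefix whenever it
# references the table. This hypothetical helper mimics that qualification.
def qualify(table: str, prefix: str = "") -> str:
    return f"{prefix}{table}" if prefix else table

print(len(sample), row_count, qualify("LU_REGION", "dbo."))
# 100 250 dbo.LU_REGION
```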
In addition to the table-level options, the Warehouse Catalog also contains a variety of options that apply across a
project. You access these options in the Warehouse Catalog by clicking the Options button on the toolbar. The
following image displays the Warehouse Catalog Options window, which contains various categories of
project-wide options:
Warehouse Catalog Options
These project-wide options are organized into the following expandable categories:
• Catalog—The options in this category enable you to quickly edit the primary project database instance, specify a custom login, customize the SQL statement used to query the database catalog tables, define different column reading settings, and specify the default read mode.
* The Warehouse Connection subcategory of the Catalog category is displayed in the image above.
• View—The options in this category enable you to specify table viewing options, such as displaying table
prefixes, row counts, and name spaces by default.
The following image shows the Table Prefixes subcategory in the View category:
View Category in the Warehouse Catalog
• Schema—The options in this category enable you to configure the automatic mapping of schema objects
to the tables brought into the Warehouse Catalog and the calculation of logical table sizes.
The following image shows the Automatic Mapping subcategory in the Schema category:
Schema Category in the Warehouse Catalog
As you can see, the Warehouse Catalog has a host of options that provide flexibility in maintaining your project
over time.
* For detailed information on the Warehouse Catalog options, refer to the Project Design Guide product manual.
You can add tables associated with secondary database instances using the Warehouse Catalog, as the following steps illustrate.
To add tables from a secondary database instance to a project in the Warehouse Catalog:
1. In Developer, open the project to which you want to add the tables.
2. In the Warehouse Catalog, in the Select current database instance drop-down list, select the database instance that contains the tables you want to add.
* This drop-down list defaults to the primary database instance for the project.
3. In the Tables available in the database instance list, select the tables you want to add.
* The tables display in the Tables being used in the project list, along with the primary database instance for
each table.
The following image shows the Warehouse Catalog with the FORECAST_SALES table added to the project:
FORECAST_SALES Table Added to Project
The following image shows the Warehouse Catalog with the REGION_FORECAST_SALES table added to the project:
REGION_FORECAST_SALES Table Added to Project
As you can see, the primary database instance for both of these tables is Forecast Data, not Tutorial Data.
1. In Developer, open the project to which you want to add the tables.
2. In the Warehouse Catalog, in the Select current database instance drop-down list, select the database instance that contains the tables you want to add.
3. In the Tables available in the database instance list, select the tables you want to add.
* When you add a table that is already mapped to another database instance, a warning displays asking if you want to create a duplicate table and associate it with the selected database instance.
4. In the warning window, under Available options, keep the Indicate that <Table Name> is also available from the current DB Instance option selected.
* If you click the Make no changes to <Table Name> option, the table is not mapped to the selected database
instance, and no duplicate table is created.
If you want to view and respond to the warnings for each duplicate table individually, click OK.
OR
If you want to respond to the warnings for all duplicate tables at the same time, click OK for All.
* The tables display in the Tables being used in the project list along with the primary database instance for
each table. The icons beside the tables indicate that they are mapped to multiple data sources.
* You can verify the database instances to which a table is mapped. In the Tables being used in the project list,
right-click the table you want to check and select Table Database Instances. The Available Database Instances
window displays the primary and secondary database instances for the table.
The following image shows the warning you see when you add duplicate tables:
Duplicate Tables Warning
The following image shows the Warehouse Catalog with the LU_COUNTRY table mapped to multiple data sources:
LU_COUNTRY Mapped to Multiple Data Sources
The following image shows the Warehouse Catalog with the LU_REGION table mapped to multiple data sources:
LU_REGION Mapped to Multiple Data Sources
The primary database instance for both of these tables is Tutorial Data. However, the icons beside the tables
indicate that they are also mapped to another database instance.
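The primary/secondary mapping described above can be modeled as a simple lookup: each table has exactly one primary database instance and any number of secondaries, and a resolver falls back to a secondary only when the primary is unavailable. The following is a hypothetical sketch of that concept using the instance names from this section; it is not MicroStrategy's actual resolution logic:

```python
# Hypothetical model of table-to-database-instance mappings.
# Instance names follow the examples in this section; the structure and
# fallback behavior are illustrative assumptions.
TABLE_INSTANCES = {
    "LU_COUNTRY": {"primary": "Tutorial Data", "secondary": ["Forecast Data"]},
    "FORECAST_SALES": {"primary": "Forecast Data", "secondary": []},
}

def resolve_instance(table: str, available: set) -> str:
    """Return the primary instance if it is available; otherwise fall back
    to the first available secondary instance for the table."""
    mapping = TABLE_INSTANCES[table]
    if mapping["primary"] in available:
        return mapping["primary"]
    for inst in mapping["secondary"]:
        if inst in available:
            return inst
    raise LookupError(f"{table} is not reachable from any configured instance")

print(resolve_instance("LU_COUNTRY", {"Tutorial Data", "Forecast Data"}))
# Tutorial Data
print(resolve_instance("LU_COUNTRY", {"Forecast Data"}))
# Forecast Data
```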
To change the primary database instance for a table in the Warehouse Catalog:
1. In the Warehouse Catalog, in the Tables being used in the project list, right-click the table for which you want to change the primary database instance and select Table Database Instances.
2. In the Available Database Instances window, in the Primary Database Instance drop-down list, select the database instance you want to use as the primary database instance.
* The current primary database instance is moved to the Secondary Database Instances list.
3. To keep the former primary database instance as a secondary database instance, ensure it is selected in the Secondary Database Instances list.
4. Click OK.
The following image shows the option for accessing the database instances for a table:
Option for Accessing the Database Instances for a Table
The following image shows the Available Database Instances window with the default database instance
configuration for the LU_COUNTRY table:
LU_COUNTRY with Default Database Instance Configuration
Tutorial Data is the primary database instance for the LU_COUNTRY table, while Forecast Data is a secondary
database instance for the table.
The following image shows the Available Database Instances window with the modified database instance
configuration for the LU_COUNTRY table. Forecast Data is now the primary database instance for the LU_COUNTRY
table, and Tutorial Data is a secondary database instance for the table.
LU_COUNTRY with Modified Database Instance Configuration
Trademark Information
All other company and product names may be trademarks of the respective companies with which they are
associated. Specifications subject to change without notice. MicroStrategy is not responsible for errors or
omissions. MicroStrategy makes no warranties or commitments concerning the availability of future products or
versions that may be planned or under development.
Patent Information
This product is patented. One or more of the following patents may apply to the product sold herein: U.S. Patent
Nos. 6,154,766, 6,173,310, 6,260,050, 6,263,051, 6,269,393, 6,279,033, 6,567,796, 6,587,547, 6,606,596, 6,658,093,
6,658,432, 6,662,195, 6,671,715, 6,691,100, 6,694,316, 6,697,808, 6,704,723, 6,741,980, 6,765,997, 6,768,788,
6,772,137, 6,788,768, 6,798,867, 6,801,910, 6,820,073, 6,829,334, 6,836,537, 6,850,603, 6,859,798, 6,873,693,
6,885,734, 6,940,953, 6,964,012, 6,977,992, 6,996,568, 6,996,569, 7,003,512, 7,010,518, 7,016,480, 7,020,251,
7,039,165, 7,082,422, 7,113,993, 7,127,403, 7,174,349, 7,181,417, 7,194,457, 7,197,461, 7,228,303, 7,260,577,
7,266,181, 7,272,212, 7,302,639, 7,324,942, 7,330,847, 7,340,040, 7,356,758, 7,356,840, 7,415,438, 7,428,302,
7,430,562, 7,440,898, 7,486,780, 7,509,671, 7,516,181, 7,559,048, 7,574,376, 7,617,201, 7,725,811, 7,801,967,
7,836,178, 7,861,161, 7,861,253, 7,881,443, 7,925,616, 7,945,584, 7,970,782, 8,005,870, 8,051,168, 8,051,369,
8,094,788 and 8,130,918. Other patent applications are pending.
© 2000–2015 MicroStrategy, Incorporated. All rights reserved.
This Course (course and course materials) and any Software are provided “as is” and without express or limited
warranty of any kind by either MicroStrategy Incorporated (“MicroStrategy”) or anyone who has been involved in
the creation, production, or distribution of the Course or Software, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance
of the Course and Software is with you. Should the Course or Software prove defective, you (and not MicroStrategy
or anyone else who has been involved with the creation, production, or distribution of the Course or Software)
assume the entire cost of all necessary servicing, repair, or correction.
In no event will MicroStrategy or any other person involved with the creation, production, or distribution of the
Course or Software be liable to you on account of any claim for damage, including any lost profits, lost savings, or
other special, incidental, consequential, or exemplary damages, including but not limited to any damages assessed
against or paid by you to any third party, arising from the use, inability to use, quality, or performance of such
Course and Software, even if MicroStrategy or any such other person or entity has been advised of the possibility of
such damages, or for the claim by any other party. In addition, MicroStrategy or any other person involved in the
creation, production, or distribution of the Course and Software shall not be liable for any claim by you or any other
party for damages arising from the use, inability to use, quality, or performance of such Course and Software, based
upon principles of contract warranty, negligence, strict liability for the negligence of indemnity or contribution, the
failure of any remedy to achieve its essential purpose, or otherwise.
The Course and the Software are copyrighted and all rights are reserved by MicroStrategy. MicroStrategy reserves
the right to make periodic modifications to the Course or the Software without obligation to notify any person or
entity of such revision. Copying, duplicating, selling, or otherwise distributing any part of the Course or Software
without prior written consent of an authorized representative of MicroStrategy are prohibited.
U.S. Government Restricted Rights. It is acknowledged that the Course and Software were developed at private
expense, that no part is public domain, and that the Course and Software are Commercial Computer Software
and/or Commercial Computer Software Documentation provided with RESTRICTED RIGHTS under Federal
Acquisition Regulations and agency supplements to them. Use, duplication, or disclosure by the U.S. Government
is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer
Software clause at DFAR 252.227-7013 et. seq. or subparagraphs (c)(1) and (2) of the Commercial Computer
Software—Restricted Rights at FAR 52.227-19, as applicable. The Contractor is MicroStrategy, 1850 Towers Crescent
Plaza, Tysons Corner, Virginia 22182. Rights are reserved under copyright laws of the United States with respect to
unpublished portions of the Software.