OLAP
● Dimensions:
○ Time (Year, Quarter, Month)
○ Geography (Region, Store)
○ Product (Category, Brand)
● Measures:
○ Total Sales
○ Units Sold
○ Profit
This structure allows users to "slice" and "dice" the data to answer complex
questions.
Example (slice): With sales data for several years, if you want to examine only
the data for 2023, you slice the cube along the "Time" dimension to extract the
data for that year.
Example (dice): If you want to examine sales for product category "A" in the
"North" region for 2023, you dice the cube by selecting specific values on the
"Product", "Geography", and "Time" dimensions at once.
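The two examples above can be sketched in a few lines of Python. This is only an illustration: the fact rows, the sales figures, and the helper names `slice_cube` and `dice_cube` are made up for this sketch and do not come from any OLAP product.

```python
# Hypothetical fact rows: (year, region, category, sales)
FACTS = [
    (2022, "North", "A", 120),
    (2022, "South", "B", 80),
    (2023, "North", "A", 150),
    (2023, "North", "B", 90),
    (2023, "South", "A", 60),
]

def slice_cube(facts, year):
    """Slice: fix one dimension (Time) to a single value."""
    return [row for row in facts if row[0] == year]

def dice_cube(facts, years, regions, categories):
    """Dice: restrict several dimensions at once to chosen sets of values."""
    return [
        row for row in facts
        if row[0] in years and row[1] in regions and row[2] in categories
    ]

sales_2023 = slice_cube(FACTS, 2023)                        # all rows for 2023
north_a_2023 = dice_cube(FACTS, {2023}, {"North"}, {"A"})   # one small sub-cube
```

A slice keeps every value of the other dimensions, while a dice narrows several dimensions together, so the dice result here is a subset of the slice.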
● Drill-Down allows you to navigate from summary data to more detailed data.
You start with a high-level view and progressively drill down to a more
granular level of data.
Example: You might start with total sales for a region and drill down into
specific months or even days to see detailed sales performance.
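● Roll-Up (Drill-Up) is the reverse operation: it aggregates detailed data into
a higher-level summary.
Example: You might start with sales data for individual products and drill up
to see total sales for the entire product category.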
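As a rough sketch of moving between levels of detail, the snippet below aggregates the same rows at a coarse level (region) and a finer level (region and month), then rolls the finer result back up. The rows and the helpers `aggregate` and `roll_up` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (region, month, day, sales)
ROWS = [
    ("North", "2023-01", 1, 100),
    ("North", "2023-01", 2, 50),
    ("North", "2023-02", 1, 70),
    ("South", "2023-01", 1, 40),
]

def aggregate(rows, key_indexes):
    """Sum sales (last column) grouped by the columns at key_indexes."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[-1]
    return dict(totals)

def roll_up(totals, keep):
    """Re-aggregate a finer result by keeping only the first `keep` key parts."""
    coarser = defaultdict(int)
    for key, value in totals.items():
        coarser[key[:keep]] += value
    return dict(coarser)

by_region = aggregate(ROWS, [0])            # summary view
by_region_month = aggregate(ROWS, [0, 1])   # drill down: add the Month level
rolled_up = roll_up(by_region_month, 1)     # drill up: back to Region totals
```

Rolling up the monthly totals gives the same numbers as aggregating by region directly, which is why the two operations are described as inverses.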
● The Pivot (Rotate) operation reorients the dimensions of the cube to view the
data from different perspectives. It is a way to "rotate" the data to see
different combinations of dimensions.
Example: You might rotate the cube to swap the positions of "Time" and
"Geography", so that you compare regions within each time period rather than
time periods within each region.
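For a two-dimensional view, a pivot amounts to swapping rows and columns. A minimal sketch with made-up numbers:

```python
# Hypothetical 2-D view of the cube: one row per region, one column per year.
regions = ["North", "South"]
years = [2022, 2023]
by_region = [
    [120, 150],  # North: sales in 2022, 2023
    [80, 60],    # South: sales in 2022, 2023
]

# Pivot: swap the row and column dimensions (a transpose).
by_year = [list(column) for column in zip(*by_region)]

# by_year now has one row per year and one column per region.
sales_2023_by_region = dict(zip(regions, by_year[years.index(2023)]))
```

The values are unchanged; only the orientation of the view differs.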
OLAP Types
● MOLAP (Multidimensional OLAP)
○ stores data in a multidimensional database (often a specialized
structure) that allows fast access to pre-aggregated data.
○ is optimized for fast retrieval of summary data and is very efficient for
read-heavy analytical workloads.
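The idea of pre-aggregation can be illustrated with a plain dictionary. This is a toy sketch, not how a real MOLAP engine stores data (those use specialized array structures); the facts, the `ALL` marker, and `build_cube` are assumptions made for the example.

```python
from collections import defaultdict
from itertools import product

# Hypothetical facts: (year, region, sales)
FACTS = [
    (2022, "North", 120),
    (2023, "North", 150),
    (2023, "South", 60),
]

ALL = "*"  # marker meaning "all values of this dimension"

def build_cube(facts):
    """Precompute totals for every (year, region) cell, including ALL roll-ups."""
    cube = defaultdict(int)
    for year, region, sales in facts:
        # Each fact contributes to its own cell and to every roll-up above it.
        for y, r in product((year, ALL), (region, ALL)):
            cube[(y, r)] += sales
    return dict(cube)

cube = build_cube(FACTS)

total_2023 = cube[(2023, ALL)]      # 210, read directly, no scan at query time
total_north = cube[(ALL, "North")]  # 270
grand_total = cube[(ALL, ALL)]      # 330
```

The trade-off is the usual one: the aggregation work is paid once at load time so that summary queries become simple lookups.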
● Marketing and Customer Analytics: Marketers often use OLAP for customer
segmentation and campaign analysis. They can analyze data to understand:
○ "What demographics (age, gender, location) respond best to our
marketing campaigns?"
○ "How do sales differ between customer segments and advertising
channels?"
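Questions like these boil down to grouping a measure by one dimension at a time. The sketch below computes a response rate per age group and per channel; the contact records and the `response_rate` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical campaign contacts: (age_group, channel, responded)
CONTACTS = [
    ("18-25", "email", True),
    ("18-25", "email", False),
    ("18-25", "social", True),
    ("26-40", "email", False),
    ("26-40", "social", True),
    ("26-40", "social", False),
]

def response_rate(contacts, key_index):
    """Share of contacts that responded, grouped by the column at key_index."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for row in contacts:
        key = row[key_index]
        sent[key] += 1
        responded[key] += row[2]  # True counts as 1, False as 0
    return {key: responded[key] / sent[key] for key in sent}

by_age_group = response_rate(CONTACTS, 0)
by_channel = response_rate(CONTACTS, 1)
```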
Analysis examples:
● If you query "total sales for the North region in 2020", the value is read
directly from the cell at the intersection of "North" (Geography) and "2020"
(Time) in the cube.
● If you "drill down" to see monthly sales for 2020 in the North region, the OLAP
system retrieves the cells one level below that intersection (e.g., total sales
for January, February, etc.).
Consider a retailer (e.g. Flipkart) with sales data. A simple sales cube could have:
● Dimensions:
○ Time (Year, Quarter, Month)
○ Geography (Country, Region, Store)
○ Product (Category, Brand, Product)
● Measures:
○ Total Sales (Revenue)
○ Units Sold
○ Profit Margin
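One way to write this cube definition down in code is as a small schema of dimensions (each with an ordered hierarchy of levels) and measures. The class names `Dimension` and `Cube` and the helper `finer_level` are hypothetical, chosen only for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dimension:
    name: str
    levels: tuple  # ordered from coarsest to finest

@dataclass(frozen=True)
class Cube:
    dimensions: tuple
    measures: tuple

sales_cube = Cube(
    dimensions=(
        Dimension("Time", ("Year", "Quarter", "Month")),
        Dimension("Geography", ("Country", "Region", "Store")),
        Dimension("Product", ("Category", "Brand", "Product")),
    ),
    measures=("Total Sales", "Units Sold", "Profit Margin"),
)

def finer_level(dimension, level):
    """Return the next level down the hierarchy, or None at the finest level."""
    position = dimension.levels.index(level)
    if position + 1 < len(dimension.levels):
        return dimension.levels[position + 1]
    return None

time_dimension = sales_cube.dimensions[0]
next_after_year = finer_level(time_dimension, "Year")  # "Quarter"
```

Keeping the levels in order is what makes drill-down and roll-up well defined: each step moves one position along a dimension's hierarchy.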
Key Points
● The cube is a conceptual model to represent multidimensional data, and it’s
often used in OLAP systems to allow users to explore data across multiple
dimensions.
● Physically, data is still stored in tables (in either relational or
multidimensional databases), but these tables are organized to support fast
querying and aggregation along the various dimensions.
● OLAP cubes enable users to quickly view and analyze aggregated data by
slicing, dicing, and drilling into the data.
● The cube structure helps make sense of complex data by organizing it into
intuitive, multidimensional frameworks, even though the underlying data is
stored in tables.