DPT 4 - Ba Question & Answer
ANSWERS:
OLAP AND BUSINESS INTELLIGENCE
Business intelligence (BI) is an umbrella term for data analysis techniques, applications and
practices used to support the decision-making processes in business. The term was proposed by
Howard Dresner in 1989 and became widespread in the late 1990s.
Business intelligence assists business owners in making important decisions based on their
business data. Rather than directly telling business owners what to do, business intelligence
allows them to analyze the data they have to understand trends and get insights, thus scaffolding
the decision-making process.
BI includes a wide variety of techniques and tools for data analytics, including tools for ad hoc
analytics and reporting, OLAP tools, real-time business intelligence, SaaS BI, etc. Another
important area of BI is data visualization software, dashboards, and scorecards.
THE ROLE OF OLAP IN BUSINESS INTELLIGENCE
OLAP (online analytical processing) is sometimes used as a synonym for business
intelligence. However, this is not correct: OLAP is better described as a function of BI software that
enables a user to extract and view data from different viewpoints.
There are several reasons why OLAP is popular in BI:
It represents data in a multidimensional form, which makes it convenient for analysts and other
business users to analyze numeric values from different perspectives.
OLAP is good for storing, extracting and analyzing large amounts of data. Business intelligence
specialists are able to analyze data accumulated over a long period of time, which enables more
precise results and better forecasting. The architecture of OLAP systems allows fast access to the
data, as they typically pre-aggregate data.
OLAP provides wide opportunities for data slicing and dicing and drill down/up/through, which help
analysts narrow down the data used for BI analysis and reporting.
OLAP systems usually have an intuitive and easy-to-use interface, which allows nontechnical
users to analyze data and generate reports without involving the IT department.
We can say that OLAP occupies a place between a data warehouse and end-user tools in BI, thus
allowing users to get the data they need in a fast and efficient way.
What is OLAP?
Most business data have multiple dimensions—multiple categories into which the data are
broken down for presentation, tracking, or analysis. For example, sales figures might have
several dimensions related to location (region, country, state/province, store), time (year, month,
week, day), product (clothing, men/women/children, brand, type), and more.
But in a data warehouse, data sets are stored in tables, each of which can organize data into just
two of these dimensions at a time. OLAP extracts data from multiple relational data sets and
reorganizes it into a multidimensional format that enables very fast processing and very insightful
analysis.
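As a rough illustration of this reorganization, the sketch below takes flat sales rows and regroups them by several dimensions at once. The sample rows, the field names, and the `build_cube` helper are assumptions made up for this example; they do not come from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical flat sales records: three dimensions and one measure per row.
SALES = [
    {"region": "North", "month": "2021-01", "product": "clothing", "amount": 120.0},
    {"region": "North", "month": "2021-02", "product": "clothing", "amount": 80.0},
    {"region": "South", "month": "2021-01", "product": "clothing", "amount": 50.0},
    {"region": "South", "month": "2021-01", "product": "shoes", "amount": 70.0},
]


def build_cube(rows, dimensions, measure):
    """Sum the measure for every combination of the given dimension values."""
    cube = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


# Full three-dimensional view, and a coarser view over one dimension only.
cube = build_cube(SALES, ("region", "month", "product"), "amount")
by_region = build_cube(SALES, ("region",), "amount")
print(by_region)
```

The same rows answer different questions depending on which dimensions are passed in, which is the core idea behind the multidimensional format.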
What is an OLAP cube?
This is where the OLAP cube comes in. The OLAP cube extends the single table with additional
layers, each adding additional dimensions, usually the next level in the "concept hierarchy" of
the dimension. For example, the top layer of the cube might organize sales by region; additional
layers could be country, state/province, city and even specific store.
In theory, a cube can contain an infinite number of layers. (An OLAP cube representing more
than three dimensions is sometimes called a hypercube.) And smaller cubes can exist within
layers—for example, each store layer could contain cubes arranging sales by salesperson and
product. In practice, data analysts will create OLAP cubes containing just the layers they need,
for optimal analysis and performance.
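The layering along a concept hierarchy can be sketched as follows. This is a minimal example, assuming a hypothetical location hierarchy (region, country, city, store) and made-up sales rows; `build_layers` is an illustrative helper, not a real library function.

```python
from collections import defaultdict

# Hypothetical location hierarchy, ordered from the top layer down.
LEVELS = ("region", "country", "city", "store")

# Hypothetical sales rows.
SALES = [
    {"region": "Americas", "country": "Canada", "city": "Winnipeg", "store": "W1", "amount": 10},
    {"region": "Americas", "country": "Canada", "city": "Winnipeg", "store": "W2", "amount": 5},
    {"region": "Americas", "country": "Canada", "city": "Toronto", "store": "T1", "amount": 20},
    {"region": "Americas", "country": "US", "city": "Boston", "store": "B1", "amount": 7},
]


def build_layers(rows, levels, measure):
    """Return one aggregated layer per depth of the hierarchy."""
    layers = {}
    for depth in range(1, len(levels) + 1):
        layer = defaultdict(int)
        for row in rows:
            key = tuple(row[level] for level in levels[:depth])
            layer[key] += row[measure]
        layers[levels[depth - 1]] = dict(layer)
    return layers


layers = build_layers(SALES, LEVELS, "amount")
print(layers["country"])
```

Passing a shorter tuple such as `LEVELS[:2]` builds only the layers that are needed, which mirrors the practice of keeping cubes small.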
paste,
edit/update, and
delete.
Such transactions have a short response time, measured in seconds, as they are natural to OLTP.
But when it comes to more complex queries that involve aggregating data from
multiple tables, a transactional database will run into trouble. The more data is queried, the more
problematic and resource-intensive it is for OLTP.
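To make the contrast concrete, here is a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their contents are hypothetical; the point is only the difference in shape between short row-level writes and a query that joins and aggregates across tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")

# Typical OLTP work: small writes that each touch one or a few rows.
with conn:
    conn.execute("INSERT INTO customers (id, city) VALUES (1, 'Winnipeg')")
    conn.execute("INSERT INTO customers (id, city) VALUES (2, 'Toronto')")
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 100.0)")
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 50.0)")
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (2, 30.0)")
    conn.execute("UPDATE orders SET total = 60.0 WHERE id = 2")
    conn.execute("DELETE FROM orders WHERE id = 3")

# Analytical work: join and aggregate across tables, reading many rows.
totals_by_city = conn.execute("""
    SELECT c.city, SUM(o.total)
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()
print(totals_by_city)
```

On a toy table both kinds of statement are instant; the cost of the second kind grows with the number of rows and joins involved.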
Data operations in OLAP
OLAP, or Online Analytical Processing, aggregates transactional data from a storage to transform it
into a form suitable for analysis. As a source of data, OLAP can use some type of unified storage
like a data warehouse, data lake, or data mart, or simply any place you store the historical data.
But to run complex custom queries, we must structure data properly. That's why in most
cases, there is a need for a separate OLAP database or warehouse that will model data for
multidimensional analysis.
A query in OLAP may look something like this:
"show 64GB smartphone sales in Winnipeg for the last 6 months,"
"compare 64GB model sales across Canada to 256GB in the 4th quarter,"
"group all 64GB model returns for 2021 sold by salesperson John Doe," and
"show average margin for a given smartphone vendor for this year."
Such analytical queries require the database to gather information from multiple tables that
categorize data by "dimensions." Examples of dimensions are time, product,
location, customer, etc.
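One of the example questions, comparing 64GB and 256GB model sales across Canada in the 4th quarter, could be expressed roughly as below. This is a sketch over a single hypothetical `sales` table in `sqlite3`; the column names, the sample figures, and the `units_by_model` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (model TEXT, city TEXT, country TEXT, quarter TEXT, units INTEGER)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("64GB", "Winnipeg", "Canada", "Q4", 12),
        ("64GB", "Toronto", "Canada", "Q4", 20),
        ("256GB", "Toronto", "Canada", "Q4", 9),
        ("64GB", "Winnipeg", "Canada", "Q3", 15),
        ("256GB", "Boston", "US", "Q4", 30),
    ],
)


def units_by_model(country, quarter):
    """Filter on two dimensions (location, time) and group by a third (product)."""
    rows = conn.execute(
        "SELECT model, SUM(units) FROM sales "
        "WHERE country = ? AND quarter = ? GROUP BY model ORDER BY model",
        (country, quarter),
    ).fetchall()
    return dict(rows)


q4_canada = units_by_model("Canada", "Q4")
print(q4_canada)
```

Each `WHERE` condition corresponds to one dimension, and the `GROUP BY` column is the dimension being compared.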
OLAP models a database in such a way that it becomes possible to quickly gather the data and
present it to analysts in a multidimensional mode rather than a flat table. That's why OLTP and
OLAP databases will differ in numerous ways.
Now, we have to answer two simple questions. How is OLAP data modeling different from
transactional databases? And why can't we run such complex queries in OLTP?
Data modeling in OLTP
The relational model is the most standard way we store data and make modifications to
transactional information. Such an approach works great for simple queries to modify
transactional data. But if we need to query something like "compare sales of a given item
in the 3rd quarter for the last three years in the US," the relational database will require
enormous resources because it may have to join and scan several large tables to find all the
related values. Moreover, the query can return disparate data items with a lot of unnecessary
information, as a normalized transactional schema is not organized for filtering by multiple
dimensions at once (product type, time period, location).
Data modeling in OLAP
In contrast, OLAP uses star and snowflake schemas for data modeling.
In a star schema, we structure data around facts, providing the keys to every dimension
for measurement. A fact, in this case, is a record of a measured business event, e.g., a
sale with its sales amount or revenue. Items such as product, customer, time, and location
are separate dimensions that include subcategories. So we can divide, for example, time by
year, quarter, month, week, and day.
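A star schema can be sketched in `sqlite3` as one fact table surrounded by dimension tables. All table names, column names, and figures here are hypothetical, chosen only to show how the fact table carries a measure plus one key per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, country TEXT);
    -- The fact table holds the measure plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        time_id INTEGER REFERENCES dim_time(time_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        amount REAL
    );
    INSERT INTO dim_time VALUES (1, 2020, 'Q3'), (2, 2021, 'Q3'), (3, 2021, 'Q4');
    INSERT INTO dim_product VALUES (1, 'phone'), (2, 'tablet');
    INSERT INTO dim_location VALUES (1, 'US'), (2, 'Canada');
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 100.0), (2, 1, 1, 150.0), (2, 1, 2, 40.0),
        (3, 1, 1, 70.0), (2, 2, 1, 25.0);
""")

# Compare 3rd-quarter sales of one item across years in the US.
q3_phone_us = conn.execute("""
    SELECT t.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_id = f.time_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_location AS l ON l.location_id = f.location_id
    WHERE t.quarter = 'Q3' AND p.name = 'phone' AND l.country = 'US'
    GROUP BY t.year
    ORDER BY t.year
""").fetchall()
print(q3_phone_us)
```

Every dimension is exactly one join away from the fact table, which is what gives the schema its star shape.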
A snowflake schema is an extension of a star schema: basically, it splits the existing
dimensions into further, more detailed dimension tables. But data is still organized around
fact tables. So if we add more dimension
tables, the database starts to take a "snowflake shape."
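One common way a schema becomes a snowflake is that a dimension table gets its own related table. The sketch below assumes a hypothetical product dimension whose category is moved out into `dim_category`; names and numbers are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sub-dimension split out of the product dimension (a "snowflake" arm).
    CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name TEXT,
        category_id INTEGER REFERENCES dim_category(category_id)
    );
    -- Data is still organized around the fact table.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        amount REAL
    );
    INSERT INTO dim_category VALUES (1, 'clothing'), (2, 'footwear');
    INSERT INTO dim_product VALUES (1, 'jacket', 1), (2, 'shirt', 1), (3, 'boots', 2);
    INSERT INTO fact_sales VALUES (1, 90.0), (2, 30.0), (3, 60.0), (1, 10.0);
""")

# Reaching the category now takes two joins: fact -> product -> category.
sales_by_category = dict(conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.name
""").fetchall())
print(sales_by_category)
```

The extra join is the trade-off: less repeated data in the dimension tables, at the price of longer join paths in queries.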
A multidimensional model of data is what makes it possible for OLAP systems to extract
the required information, perform complex filtering, and allow for analysis of this
data.
Online Analytical Processing Architecture
When we talk about analytics, we generally mean business intelligence systems, where a data
warehouse is a central point uniting all the corporate data. A data warehouse can be
used differently depending on the goals of the organization.
A corporate DW can be used as a unified storage for historical data, optimized for
transactional or simple analytical tasks.
Or it can be optimized for OLAP queries by applying star or snowflake schemas.
Depending on how we plan to design the business intelligence system, OLAP may or may
not require a separate database to run queries. In a BI system with a standalone OLAP
repository, data flows from the original sources through a data warehouse into the OLAP
database and on to the analytical interfaces.
So let's analyze the OLAP workflow in such an architecture.
Data extraction. First, the data is extracted from its original sources and uploaded
to a unified data storage. In the case of BI, a data warehouse will be the place we upload
data to.
Data preparation. Once we've got the data, it requires optimization and modeling for
multidimensional analysis. In some cases, a corporate DW can be optimized to run OLAP queries,
but a more typical case is to use a separate OLAP database. Here are a few reasons
why.
Running analytical and transactional queries on separate databases reduces the risk of
overloads and database downtime, while helping to maintain decent performance of
the two.
Applying data models is easier when we use a storage for a single purpose.
Data transformation and integration is usually done via ETL/ELT tools, which help developers
to automate data extraction, transformation, and uploading.
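The extract, transform, and load steps can be sketched with the standard library alone. This is a toy pipeline under stated assumptions: the CSV text stands in for a source system, an in-memory `sqlite3` database stands in for the warehouse, and the cleaning rules are invented for the example. Real ETL/ELT tools do far more.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system.
RAW_CSV = """order_id,city,amount
1, winnipeg ,100.50
2,Toronto,20
3,WINNIPEG,not_a_number
"""


def extract(text):
    """Read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Clean values and drop rows whose amount cannot be parsed."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        cleaned.append((int(row["order_id"]), row["city"].strip().title(), amount))
    return cleaned


def load(conn, rows):
    """Write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(RAW_CSV)))
loaded = warehouse.execute("SELECT city, amount FROM sales ORDER BY order_id").fetchall()
print(loaded)
```

Keeping the three steps as separate functions makes it easy to swap the source or the target without touching the cleaning logic.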
Building a cube. Once the data is prepared, a group of responsible data engineers
will model cubes and deploy them on the dedicated server. Creating a cube is a custom
process each time, because data can't be updated once it has been modeled in a cube. So,
for each specific query, a new cube will be created.
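One simplified way to picture cube building is as pre-aggregation: totals are computed once for every combination of dimensions, and later queries are answered by lookup. The rows, dimension names, and the `precompute_cube` and `lookup` helpers below are hypothetical; production OLAP servers use far more elaborate storage.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical prepared rows.
ROWS = [
    {"city": "Winnipeg", "model": "64GB", "year": 2021, "units": 3},
    {"city": "Winnipeg", "model": "256GB", "year": 2021, "units": 1},
    {"city": "Toronto", "model": "64GB", "year": 2021, "units": 4},
    {"city": "Toronto", "model": "64GB", "year": 2020, "units": 2},
]
DIMENSIONS = ("city", "model", "year")


def precompute_cube(rows, dimensions, measure):
    """Pre-aggregate the measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in dims)] += row[measure]
            cube[dims] = dict(totals)
    return cube


CUBE = precompute_cube(ROWS, DIMENSIONS, "units")


def lookup(dims, values):
    """Answer a query from the precomputed totals without rescanning ROWS."""
    return CUBE[dims].get(values, 0)


print(lookup(("city",), ("Toronto",)))
```

Because the totals are computed up front, new source rows are not reflected until `precompute_cube` is run again, which illustrates why cubes need rebuilding when data changes.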
Accessing data. As an endpoint in the system, OLAP cubes will be accessed through
analytical interfaces. Here, analysts can type in commands and perform cube-specific
operations to analyze data.
Now, let's look at the cubes themselves and define the capabilities they give to the analyst.
How OLAP cubes work
The cube may consist of several dimensions that can be used to filter the required
information and form reports. OLAP systems use a specific query language called
MDX, or Multidimensional Expressions, whose syntax resembles SQL. Standard SQL queries
are also supported by most databases to perform OLAP analysis.
There are five basic types of OLAP multidimensional analysis operations.
Drill down allows a user to move from high-level data (e.g., annual sales) to a lower level (e.g.,
monthly sales). Here we use the concept of hierarchy that applies to every single dimension. So,
in the "time" dimension, we can move down from yearly figures to weekly or even
daily records. This depends on how you store your data and model the actual cube.
Roll up is the opposite of drill down, as it basically lifts the data up the hierarchy levels.
Both operations make the data either more or less detailed, or add/remove dimensions for the analysis.
Slice operations fix one dimension at a single value and present the result as a separate
table (a view with one dimension fewer). "Slice" can detach, say, a single city from the rest of
the cube, which will create a separate spreadsheet. This way we can analyze low-level
information in an isolated environment.
Dice provides the same separation functionality, but allows you to choose more than one
dimension, producing a separate cube.
Pivot is similar to creating pivot tables in Excel. This function allows us to rotate a
cube to get a different representation of data across the dimensions.
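The five operations can be sketched on a tiny in-memory data set. This is an illustrative approximation in plain Python, not how an OLAP server implements them; the rows, the two-level time hierarchy (year, month), and all function names are assumptions.

```python
from collections import defaultdict

# Hypothetical fact rows; "year" and "month" form a two-level time hierarchy.
ROWS = [
    {"year": 2021, "month": 1, "city": "Winnipeg", "model": "64GB", "units": 3},
    {"year": 2021, "month": 2, "city": "Winnipeg", "model": "64GB", "units": 2},
    {"year": 2021, "month": 1, "city": "Toronto", "model": "64GB", "units": 4},
    {"year": 2021, "month": 1, "city": "Toronto", "model": "256GB", "units": 1},
]


def aggregate(rows, dims):
    """Sum units for each combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["units"]
    return dict(totals)


def roll_up(rows):
    """Less detail: yearly totals."""
    return aggregate(rows, ("year",))


def drill_down(rows):
    """More detail: move from yearly to monthly totals."""
    return aggregate(rows, ("year", "month"))


def slice_(rows, dim, value):
    """Fix one dimension at a single value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, **allowed):
    """Keep rows whose values fall in the allowed sets for several dimensions."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def pivot(rows, row_dim, col_dim):
    """Rotate into a nested table: row_dim down the side, col_dim across."""
    table = defaultdict(dict)
    for (r, c), units in aggregate(rows, (row_dim, col_dim)).items():
        table[r][c] = units
    return dict(table)


print(pivot(dice(ROWS, city={"Toronto", "Winnipeg"}, model={"64GB"}), "city", "month"))
```

The last line chains dice and pivot, showing how the operations combine on the same underlying rows.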
All in all, the functions can be used in conjunction, which gives huge flexibility to use a single
cube for multiple purposes. But, as we mentioned before, each time there is a modification to
data, the cube will require reuploading the information or remodeling the existing OLAP
DB.
OLAP PROVIDERS
OLAP is a vital part of any BI system. Despite its resource-intensive nature, OLAP remains a
standard solution for complex analytics that can't be done in the usual databases. As
the technology appeared in the early '90s, the market of solutions is quite large. And
the main offering comes from data warehouse/business intelligence providers.
Nearly any provider these days supports all of the basic functions of OLAP and allows
the creation of multidimensional cube systems as a part of their BI platform. Now, let's
look at some popular products that can be used as a separate OLAP tool.