BusinessIntelligence 2 ModelingBusinessIntelligence

BI2


Chapter 2: Modeling in Business Intelligence
Contents

1 Models and Modeling in BI

2 Logical and Algebraic Structures

3 Graph Structures

4 Analytical Structures

5 Models and Data

6 Summary and Outlook


W.Grossmann, S. Rinderle-Ma, University of Vienna – Chapter 2: Modeling in Business Intelligence 2
1 Models and Modeling in Business Intelligence

There are many different models used in BI


− Examples you know:


− Model Definition: Models represent some part of the business
process and allow precise formulation of interesting questions
(Analytical Goals)
− How can we realize the representation?
(representation function)
− How should we formulate the representation?
(“model language”)


Representation function of models


− Models of phenomena
− Phenomena: Features of the business process interesting
from an analytical point of view
− Models define a picture of the phenomena (caricatures)
• Idealized models, e.g., control flow of the business
process, a treatment process, a course design
• Analogical models: Adopt ideas from other sciences,
e.g., a gravity model for relations between persons as a
function of distance
• Phenomenological models: Statistics, e.g., regression


Representation function of models (ctd.)


− Models of data
• We have no precise idea about the models, but only a
number of candidate models for the empirical data
• The task is to learn the most appropriate model (Machine
Learning, Data Mining)
• Simple example: Churn management:
▪ Which variables influence the churn behavior of a customer, e.g.,
age, sex, marital status, income?
▪ How should we define the relation between churn behavior and
these variables?
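
One candidate answer to the second question is a logistic model, which maps a weighted sum of customer variables to a churn probability. The sketch below is only an illustration under stated assumptions: the function name, the variable encoding, and all coefficient values are hypothetical and not estimated from any data; in practice the coefficients would be learned.

```python
import math

# Hypothetical coefficients for illustration only; in practice they
# would be learned from customer data.
COEFFICIENTS = {"intercept": -1.0, "age": -0.02, "income": -0.00001, "married": -0.5}


def churn_probability(age: float, income: float, married: bool) -> float:
    """Logistic model: P(churn) = 1 / (1 + exp(-z)), with z a weighted sum."""
    z = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["age"] * age
        + COEFFICIENTS["income"] * income
        + COEFFICIENTS["married"] * (1.0 if married else 0.0)
    )
    return 1.0 / (1.0 + math.exp(-z))


p_single = churn_probability(age=30, income=40000, married=False)
p_married = churn_probability(age=30, income=40000, married=True)
```

Learning "the most appropriate model" then means comparing such candidates (different variables, different functional forms) against the empirical data.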


Representation function of models (ctd.)


− Models of theories
• Each application domain of BI has specific domain
knowledge, usually defined by concepts and
relations (logical relations) between the concepts
• Concepts and logical relations define a formal
system (ontology)
• If this formal system is understood as a theory, data
instances are models of this theory
→ Database models

Languages for Models


− Corresponding to the multitude of models,
different formulations (languages) are
used, for example:
• UML or ER modeling for data
• BPMN for modeling the control flow
• Statistics for modeling customer behavior
• Connectedness (reachability) in a graph

Formulation of models
− Each language has its own semantics, allowing the
definition of certain model elements and the formulation
of generic questions
• Queries in a database
• Simultaneous occurrence of two events in a
business process
• Strength of association between two variables
• Graph models for social networks


Formulation of models (ctd.)


− Generic questions can be formulated in different
languages
• Example: Relations between attributes
▪ Formulate a query in a data model and represent the
result as a table
▪ Define a regression model and formulate the relation as
an equation
▪ Use a graphical language and visualize the relation in a
scatterplot
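
The three formulations above can be sketched on the same toy data. This is a minimal illustration, assuming made-up observations of two attributes; the names `data`, `table`, `equation`, and `scatter` are hypothetical, and the text "scatterplot" is only a crude stand-in for a real chart.

```python
# Hypothetical toy observations of two attributes (x, y).
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]

# 1) Query language: select the rows with x >= 2 and show them as a table.
table = [row for row in data if row[0] >= 2.0]

# 2) Regression language: least-squares line y = a + b * x.
n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in data) / sum(
    (x - mean_x) ** 2 for x, _ in data
)
a = mean_y - b * mean_x
equation = f"y = {a:.2f} + {b:.2f} * x"

# 3) Graphical language: a crude text scatterplot, one row per observation,
#    with the marker shifted right in proportion to y.
scatter = [" " * int(round(y)) + "*" for _, y in data]
```

All three outputs answer the same generic question (how are the attributes related?), each in the semantics of its own language.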


Model Structures
− Putting all these things together leads to the concept of a model
structure composed of:
• Model language:
▪ Syntax defines the basic elements and the rules for composing
model elements
▪ Semantics defines the meaning of the elements in the language,
independent of any domain
▪ Notation for communicating the expressions in the language
• Model elements: Certain expressions in the model language, useful for
describing facts about the business process
• Generic questions: Questions formulated in the semantics of the model
language about properties of model elements
▪ Generic questions can be answered by specific analysis techniques

Modeling
− A mapping of some part of the domain semantics of a business process
into a certain model structure (“Conceptual Modeling”)
• Examples for domain concepts and relations:
▪ Health Care Use Case:
▪ Higher Education Use Case:
▪ CRM Use Case:
− Definition of a model configuration: an admissible expression in a model
structure which allows formulation of the analytical goal as questions
about the model configuration
− Connection of the model configuration with observations: data about the
instances of the business process have to fit the model configuration,
i.e., views and perspectives
− Definition of model variability: Usually data are blurred due to noise or
statistical variability

Model Assessment and Quality


− Quality criteria for business process models
• Correctness: the model is syntactically correct and the mapping between domain
semantics and model semantics is appropriate
• Relevance: the model complies with its intended function, i.e., it explains past
observations and predicts future observations
• Economic efficiency: trade-off between complexity and costs (Occam's
razor)
• Clarity: model can be understood by users
• Comparability: model fits in the overall analysis framework of the
business process


Model Assessment and Quality (ctd.)


− Quality criteria for empirical models
• Objectivity: results are independent of the person using
the model
• Reliability: results of the model can be reproduced
• Validity: model is useful from a practical point of view
▪ Content validity: model represents phenomenon under
consideration
▪ Criterion validity: high correlation between model results and other
external properties
▪ Construct validity: new results can be derived from model


Models and Patterns


− Patterns describe local behavior whereas models
describe global behavior
− Examples:
• Medical treatment process: a pattern of co-occurrence
of certain medications
• Customer relationship: a pattern of occurrence of
certain combinations of variable values, such as outliers

2 Logical and Algebraic Structures

Language: Propositional logic and predicate logic


− Individual constants (names), e.g., “John Dee”
− Variables: placeholders for constants, e.g., “Student”, “Course”
− Functions: operate on constants or variables, e.g., “grade(Student)
= passed”
− Predicates: define properties of the individual constants, e.g.,
“AttendsBI”
− Quantifiers (“for all (∀)”, “exists (∃)”)
− Terms are defined by individual constants, individual variables, and
functions
− Atomic formulas are generated by a predicate symbol followed by the
terms to which the predicate is applied, e.g., “AttendsBI[John
Dee]”
− Well-formed formulas are built using propositional calculus and quantifiers,
e.g., ∃Student (∀Course (grade(Student, Course) = passed))
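
Over a finite domain, such a formula can be evaluated mechanically. The following is a minimal sketch, assuming a made-up domain of two students and two courses; the grade table and the second student's name are hypothetical. `any` plays the role of ∃ and `all` the role of ∀.

```python
# Hypothetical finite domain: students, courses, and a grade function.
students = ["John Dee", "Jane Roe"]
courses = ["BI", "Databases"]
grade = {
    ("John Dee", "BI"): "passed",
    ("John Dee", "Databases"): "passed",
    ("Jane Roe", "BI"): "passed",
    ("Jane Roe", "Databases"): "failed",
}


def attends_bi(student: str) -> bool:
    """Predicate AttendsBI: true if the student has a grade entry for BI."""
    return (student, "BI") in grade


# Exists Student (ForAll Course (grade(Student, Course) = passed))
exists_student_passing_all = any(
    all(grade[(s, c)] == "passed" for c in courses) for s in students
)

# ForAll Student (ForAll Course (grade(Student, Course) = passed))
all_students_passing_all = all(
    all(grade[(s, c)] == "passed" for c in courses) for s in students
)
```

With this interpretation the existential formula is true, while the universally quantified variant is false.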

Model elements and generic questions


− Building expressions according to predicate logic
− Assign truth values to the expressions (interpretation)
− If the interpretation results in the truth value TRUE for all
possible assignments of the free variables, we call the
interpretation a model
− The generic question is whether a well-formed formula is
true
− Modeling using logical structures tries to capture domain
knowledge in a logical form
− The simplest forms are terminology systems such as taxonomies

Ontologies: “A specification of a conceptualization”


− OWL:
• T-Box: Vocabulary of a domain as a logical theory
• A-Box: Assertions about the domain, which have to
be checked
• Uses the open world assumption, i.e., anything
can be entered in the A-Box unless it violates
constraints of the T-Box


© 2015 Springer-Verlag Berlin Heidelberg


Frames
− Representation in an object-oriented style
− For each object a number of slots are defined for
attributes of the objects
− Frames use the closed world assumption, i.e., a
statement is true if its negation cannot be proven
within the system
− Example:
• “All birds can fly” (closed world)
• “There exist non-flying birds” (open world)
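
The difference between the two assumptions can be sketched on the bird example. This is an illustration only, assuming a small hypothetical knowledge base in which the set of birds is fixed and only the flying status of a bird may be unrecorded (`None`).

```python
from typing import Optional

# Hypothetical knowledge base about birds; None means "nothing recorded".
can_fly = {"sparrow": True, "eagle": True, "kiwi": None}


def all_birds_fly_closed_world() -> bool:
    """Closed world: true as long as no counterexample is recorded."""
    return not any(value is False for value in can_fly.values())


def all_birds_fly_open_world() -> Optional[bool]:
    """Open world: unknown (None) while some bird's status is unrecorded."""
    values = list(can_fly.values())
    if any(value is False for value in values):
        return False
    if all(value is True for value in values):
        return True
    return None
```

Under the closed world reading the statement "all birds can fly" is accepted, whereas the open world reading leaves room for a non-flying bird and returns "unknown".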


3 Graph Structures

Model Structure - Language:


− Syntactic elements:
• Nodes (vertices)
• Edges (directed, undirected)
• Labels for edges (e.g., “distance”) or nodes (e.g.,
“degree”)
− Notation:
• Numeric representation (adjacency matrix)
• Visual representation
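
The numeric representation can be sketched as follows. The graph, its node names, and its distances are hypothetical; the node label "degree" is interpreted here as the number of outgoing edges, which is one of several possible conventions.

```python
# Hypothetical directed graph with labeled edges (label: "distance").
nodes = ["A", "B", "C"]
edges = [("A", "B", 4.0), ("B", "C", 1.5), ("A", "C", 7.0)]

index = {name: i for i, name in enumerate(nodes)}

# Adjacency matrix: entry [i][j] is 1 if there is an edge from node i to node j.
adjacency = [[0] * len(nodes) for _ in nodes]
for source, target, _distance in edges:
    adjacency[index[source]][index[target]] = 1

# Node label "degree": here the number of outgoing edges of each node.
out_degree = {name: sum(adjacency[index[name]]) for name in nodes}
```

For an undirected graph the matrix would be symmetric, since every edge is entered in both directions.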


Model Structure
− Model elements
• Special kinds of graphs, e.g., trees, series-parallel networks, bipartite
graphs
• Connected graphs (path)
− Generic questions
• Generic questions refer to properties of the graph and can be
answered by well-known algorithms, e.g., for spanning trees, shortest paths,
or best matching of nodes
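
As an example of answering a generic question, the sketch below computes shortest path lengths with Dijkstra's algorithm, one standard choice when edge labels are non-negative distances. The graph and the function name `shortest_distances` are hypothetical.

```python
import heapq

# Hypothetical undirected graph; edge labels are distances.
graph = {
    "A": {"B": 4.0, "C": 7.0},
    "B": {"A": 4.0, "C": 1.5},
    "C": {"A": 7.0, "B": 1.5},
}


def shortest_distances(start: str) -> dict:
    """Dijkstra's algorithm for non-negative edge labels."""
    distances = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances
```

Here the direct edge from A to C (7.0) is longer than the path via B (4.0 + 1.5), so the algorithm reports 5.5 for C.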


Modeling using graph structures, e.g.,
− Business Process Model and Notation (BPMN)



− Modeling using graph structures, e.g., Petri Nets

4 Analytical structures

Calculus (ctd.)
− Model elements (ctd.)
− Inner product, given a vector $w$ of coefficients:
$f(x) = w^T x = w_1 x_1 + \cdots + w_p x_p$
− Linear functions in more than one variable (matrices): $f(\mathbf{x}) = B\mathbf{x}$,
where $B$ is a $k \times p$ matrix
− Projections: The orthogonal projection of a vector $x$ onto another
vector $w$ is defined by $p_w(x) = x^T \frac{w}{\lVert w \rVert}$
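
These three model elements can be sketched in a few lines of plain Python. The function names are hypothetical, vectors are plain lists, and the projection is computed as the signed length of $x$ along $w$, which assumes that the denominator in the formula is the norm of $w$.

```python
import math


def inner_product(w: list, x: list) -> float:
    """f(x) = w^T x = w_1 x_1 + ... + w_p x_p."""
    return sum(wi * xi for wi, xi in zip(w, x))


def linear_map(B: list, x: list) -> list:
    """f(x) = B x for a k x p matrix B given as a list of k rows."""
    return [inner_product(row, x) for row in B]


def projection_length(x: list, w: list) -> float:
    """p_w(x) = x^T w / ||w||, the signed length of x along w."""
    return inner_product(x, w) / math.sqrt(inner_product(w, w))
```

For example, `linear_map([[1, 0], [0, 2], [1, 1]], [3, 4])` applies a 3 × 2 matrix to a vector with two components and returns a vector with three components.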


Probability – Language
− Events, calculus of events: $E$
− Probability of events $P(E)$, $\mathrm{odds}(E) = \frac{P(E)}{1 - P(E)}$
− Random variables as model for measurement: $X$
− Probability distribution:
• Distribution function: $F(x) = P(X \le x)$
• Density function and probability function: $p(x)$
• We interpret the density as likelihood of an observation


Probability – Language (ctd.)


− Conditional probability and independence:
$p(x \mid y) = p(x, y) / p(y)$
− Two variables are independent if
$p(x, y) = p(x) \, p(y)$
− Bayes' theorem: $p(x \mid y) = p(y \mid x) \, p(x) / p(y)$
− Interpretation of Bayes' theorem in the discrete
case: Compute column percentages from row
percentages
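
The discrete interpretation can be sketched with a small two-way table. All probabilities and category names below are hypothetical; rows are indexed by $x$ and columns by $y$, so $p(y \mid x)$ holds the row percentages and $p(x \mid y)$ the column percentages.

```python
# Hypothetical discrete example: x = customer segment, y = churn status.
p_x = {"young": 0.4, "old": 0.6}  # marginal p(x)
p_y_given_x = {  # row percentages p(y | x)
    "young": {"churn": 0.5, "stay": 0.5},
    "old": {"churn": 0.25, "stay": 0.75},
}

# Marginal p(y) = sum over x of p(y | x) * p(x)
p_y = {
    y: sum(p_y_given_x[x][y] * p_x[x] for x in p_x)
    for y in ("churn", "stay")
}

# Bayes' theorem: p(x | y) = p(y | x) * p(x) / p(y)  (column percentages)
p_x_given_y = {
    y: {x: p_y_given_x[x][y] * p_x[x] / p_y[y] for x in p_x}
    for y in p_y
}
```

Each column of `p_x_given_y` sums to one, which is a quick sanity check that the factor $p(x)$ and the normalization by $p(y)$ have both been applied.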


Probability – Example


Statistics – Language
− Statistical units (observation units)
− Population
− Observable variable
− Transfer the concepts of probability to observations, e.g., “distribution” to
“sample distribution” (“empirical distribution”)
− Model elements and generic questions:
• Descriptive methods
• Inferential methods
▪ Estimation
▪ Testing
▪ Confidence regions
− Modeling methods
• Regression

5 Models and data

Data Generation
− In BI we usually have secondary data, i.e., data which have been
collected for other purposes, e.g.,
• Transactional data
• Administrative data
• Web data
− An important question for interpretation of results is defining the
population which is represented by the data (e.g., tweets or
evaluations on portals)
− Measurement of the data


Elements of the knowledge-based temporal abstraction method


− Time stamps $T_i$ are the basic primitives with a predefined granularity and
a well-defined zero.
− Time intervals $I = [T_{start}, T_{end}]$ are defined as pairs of time stamps for
start and end. Time points are zero-length intervals.
− An interpretation context $\xi$ is a proposition that can change the
interpretation of parameters within the scope of a time interval.
Interpretation contexts can be nested.
− A context interval $\langle \xi, I \rangle$ defines time intervals for which the interpretation
context holds.
− An event proposition $e$ represents the occurrence of an external
volitional action or process and has to be distinguished from a
measurable datum.
− An event interval $\langle e, I \rangle$ represents the temporal duration of an event $e$.


Elements of the knowledge-based temporal abstraction method (ctd.)


− A parameter schema $\pi$ is a measurable aspect of the state of the world (states
of a process) with values in some domain, $v \in V_\pi$. Parameter schemas may be of
different types: primitive parameters (measurable data), abstract parameters
(concepts), constant parameters (instant specific or instant independent).
− A parameter proposition $\langle \pi, v, \xi \rangle$ defines the values of parameters in a context.
− An abstraction function $\theta \in \Theta$ maps parameters into abstract parameters.
− A parameter interval $\langle \pi, v, \xi, I \rangle$ denotes the value $v$ of the parameter $\pi$ in the
context $\xi$ during time interval $I$.
− An abstraction is a parameter or a parameter interval.
− An abstraction goal $\psi \in \Psi$ represents a specific intention or goal.
− An abstraction goal interval $\langle \psi, I \rangle$ represents the idea that abstraction goal $\psi$
holds in interval $I$.
− Induction of context intervals allows the induction of events, parameters, or
abstraction goal propositions for some context interval.
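
A few of these elements can be sketched as data structures. This is an illustration under stated assumptions, not the method's reference implementation: the class names, the integer time stamps, the example parameter "temperature", the context "post-surgery", and the threshold of 38.0 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Time interval [start, end]; a time point has start == end."""
    start: int  # time stamps as integers in a fixed granularity
    end: int


@dataclass(frozen=True)
class ParameterInterval:
    """Parameter interval: parameter, value, context, time interval."""
    parameter: str
    value: object
    context: str
    interval: Interval


def abstract_temperature(p: ParameterInterval) -> ParameterInterval:
    """Hypothetical abstraction function: numeric temperature -> state."""
    state = "fever" if p.value >= 38.0 else "normal"
    return ParameterInterval("temperature_state", state, p.context, p.interval)


measured = ParameterInterval("temperature", 38.6, "post-surgery", Interval(10, 10))
abstracted = abstract_temperature(measured)
```

The abstraction function keeps the context and the time interval and replaces the primitive parameter (a measurement) by an abstract parameter (a concept).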


Quality Dimensions for Data


− Relevance measures how far the data are useful in the intended context.
− Accuracy is the degree of conformity of a measure to a standard or a true value.
− Completeness is a characteristic measuring the degree to which all required
data are known, with respect to depth, breadth, and scope.
− Timeliness: Data arrive early or at the right time and are appropriate or adapted to the
times or the occasion.
− Consistency is expressed as the degree to which a set of data is equivalent in
redundant or distributed databases.
− Coherence refers to the adequacy of the data to be reliably combined in different
ways and for various uses.
− Reliability is a characteristic of an information infrastructure to store and retrieve
information in an accessible, secure, maintainable, and fast manner.
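
Two of these dimensions lend themselves to simple numeric indicators. The sketch below is one possible operationalization, not a standard definition: the records, the replica, and the choice of required fields are hypothetical, completeness is measured as the share of required values present, and consistency as the share of records that agree across two copies.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "income": 42000},
    {"id": 2, "age": None, "income": 38000},
    {"id": 3, "age": 51, "income": None},
]
# Hypothetical redundant copy of the same records in a second database.
replica = [
    {"id": 1, "age": 34, "income": 42000},
    {"id": 2, "age": None, "income": 39000},
    {"id": 3, "age": 51, "income": None},
]
required = ("age", "income")

# Completeness: share of required values that are present.
present = sum(r[f] is not None for r in records for f in required)
completeness = present / (len(records) * len(required))

# Consistency: share of records that are identical in both copies.
consistency = sum(a == b for a, b in zip(records, replica)) / len(records)
```

Both indicators lie between 0 and 1, which makes them easy to track over time or to compare across data sources.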

6 Summary and outlook

− Modeling is a rather intricate activity in BI
− Different approaches for model representation and model
presentation
− Key tasks:
• Definition of a model configuration
• Connection of the model configuration with the
observations from business process instances
• Formulation of a model component for capturing the
variability of the different process instances
