
Smart Home Technologies

Decision Making
Motivation
• Intelligent environments are aimed at improving the inhabitants' experience and task performance
  - Provide appropriate information
  - Automate functions in the home
• Prediction techniques can only determine what would happen next, not what should happen next
  - Automated functions can be different from inhabitant actions
  - The computer has to determine actions that would optimize the inhabitant experience
Decision Making
• Decision making attempts to determine the actions the system should take in the current situation
  - Should a function be automated?
  - What should be done next?
• Decisions should be based on the current context and the requirements of the inhabitants
  - Programmed timers alone are not sufficient for automation
  - The decision maker has to take into account the stream of data
Decision Making in Intelligent Environments
• Example decision-making tasks in intelligent environments:
  - Automation of physical devices
    - Turn on lights
    - Regulate heating and air conditioning
    - Control media devices
    - Automate lawn sprinklers
    - Automate robotic components (vacuum cleaner, etc.)
  - Control of information devices
    - Provide recipe services in the kitchen
    - Construct shopping lists
    - Decide which types of alarms to display (and where)
Decision Making in Intelligent Environments
• Objectives of decision making:
  - Optimize inhabitant productivity
  - Minimize operating costs
  - Maximize inhabitant comfort
• The decision-making process has to be safe
  - Decisions made can never endanger inhabitants or cause damage
  - Decisions should be within the range accepted by the inhabitants
Example Task
• Should a light be turned on?
• Decision factors:
  - Inhabitant's location (current and future)
  - Inhabitant's task
  - Inhabitant's preferences
  - Time of day
  - Other inhabitants
  - Energy efficiency
  - Security
• Possible decisions:
  - Turn on
  - Do not automate
Decision Making Approaches
• Pre-programmed decisions
  - Timer-based automation
• Reactive decision-making systems
  - Decisions are based on condition-action rules
  - Decisions are driven by the available facts
• Goal-based decision-making systems
  - Decisions are made in order to achieve a particular outcome
• Utility-based decision-making systems
  - Decisions are made in order to maximize a given performance measure
Reactive Decision Making
Goal-Based Decision Making
Utility-Based Decision Making
Qualities of Decision Making
• Ideal
  - Complete: always makes a decision
  - Correct: the decision is always right
  - Natural: knowledge is easily expressed
  - Efficient
• Rational
  - Decisions are made to maximize performance
Decision-Making Techniques
• Reactive decision making
  - Rule-based expert systems
• Goal-based decision making
  - Planning
• Decision-theoretic decision making
  - Belief networks
  - Markov decision processes
• Learning techniques
  - Neural networks
  - Reinforcement learning
Rule-Based Decision Making
• Decisions are made based on rules and facts
  - Facts represent the state of the environment
    - Represented in first-order predicate logic
  - Condition-action rules represent heuristic knowledge about what to do
    - Rules are implications that derive actions from logical sentences about the facts
• Inference mechanism
  - Deduction (modus ponens): {A, A → B} ⊢ B
    - The left-hand sides of rules are matched against the set of facts
    - Rules whose left-hand side matches are active
Rule-Based Inference
• Rules define what actions should be executed for a given set of conditions (facts)
  - Actions can either be external actions ("automation") or internal updates of the set of facts ("state update")
  - Rules are often heuristics provided by an expert
• Multiple rules can be active at any given time
  - Conflict resolution decides which rule to fire
  - Scheduling of active rules to perform a sequence of actions
Example
• Facts:
  - CurrentTime = 6:30
  - Location(CurrentTime, bedroom)
  - CurrentDay = Monday
• Rules:
  - Internal actions:
    (CurrentDay = Monday) ^ (CurrentTime > 6:00) ^ (CurrentTime < 7:00) ^ (Location(CurrentTime, bedroom))
      -> Set(Location(NextTime, bathroom))
  - External actions:
    (Location(NextTime, X)) -> Action(TurnOnLight, X)
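To make the firing of these rules concrete, here is a minimal forward-chaining sketch of the example (hypothetical Python; the fact encoding and rule functions are assumptions, not the deck's implementation). Internal rules return new facts for the working memory, external rules return automation commands:

    # Minimal forward-chaining sketch of the example above (illustrative only;
    # the names and rule encoding are assumptions, not the slides' implementation).

    facts = {("CurrentDay", "Monday"),
             ("CurrentTime", 6.5),          # 6:30 encoded as 6.5 for comparisons
             ("Location", "CurrentTime", "bedroom")}

    def internal_rule(facts):
        """If it is Monday between 6:00 and 7:00 and the inhabitant is in the
        bedroom, predict the next location to be the bathroom (state update)."""
        time = dict(f for f in facts if len(f) == 2).get("CurrentTime", 0)
        if (("CurrentDay", "Monday") in facts and 6.0 < time < 7.0
                and ("Location", "CurrentTime", "bedroom") in facts):
            return {("Location", "NextTime", "bathroom")}
        return set()

    def external_rule(facts):
        """Turn on the light wherever the inhabitant is predicted to go next."""
        return [("TurnOnLight", f[2]) for f in facts
                if f[0] == "Location" and f[1] == "NextTime"]

    facts |= internal_rule(facts)      # internal action: update the fact base
    actions = external_rule(facts)     # external action: automation command
    print(actions)                     # [('TurnOnLight', 'bathroom')]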
Rule-Based Expert Systems
• Intended to simulate (and automate) the human reasoning process
• The domain is modeled in first-order logic
  - The state is represented by a set of facts
  - Internal rules model the behavior of the environment
• Experts provide sets of heuristic condition-action rules
  - Rules with internal actions can model the reasoning process
  - Rules with external actions indicate decisions the expert would make
• The system can optionally be provided with queries by including them in the set of facts
Internal Rules
• Internal rules have to model the behavior of the system
  - Persistence over time, e.g.:
    (Location(CurrentTime, X)) ^ (NoMove(CurrentTime)) -> Set(Location(NextTime, X))
  - Dynamic behavior of devices, e.g.:
    (Temperature(CurrentTime, X)) ^ (HeatingOn) -> Set(Temperature(NextTime, X+2))
  - Behavior of the inhabitants, e.g.:
    (Location(CurrentTime, bedroom)) ^ (CurrentTime > 23:00) ^ (LightOn(CurrentTime, bedroom)) -> Action(TurnOffLight, bedroom)
Rule-Based Expert Systems
(Figure: rule-based expert system architecture. The working memory (facts) and the rule base feed the inference engine's pattern matcher; matching rules are placed on the agenda and fired by the execution engine.)


Logic Inference Systems and Expert System Shells
• Logic programming systems provide inference capabilities
  - Examples: Prolog, OTTER
• Expert system shells provide the infrastructure to build complete expert systems
  - Examples: CLIPS (for C), JESS (for Java)
Example System: IRoom [Kul02]
• Initial versions of the MIT IRoom project used JESS as an inference engine to make decisions about activating devices
  - For example: if a person enters the room and the room is empty, then turn on the light
• Rules are programmed by the system designer before the room is used and then refined based on experience

[Kul02] Ajay Kulkarni. Design Principles of a Reactive Behavioral System for the Intelligent Room. 2002.
Rule-Based Decision Making
• Characteristics
  - Complete and correct (given complete rules)
  - Natural (given expert-specified rules)
• Advantages
  - Permits the system to be programmed relatively efficiently by an expert
  - Can address relatively complex systems
• Problems
  - The quality of the rules is essential
  - The behavior of the environment mimics the expert
  - Anticipating all possible contexts is difficult
Planning Decisions
• A planning system searches for a sequence of actions that can achieve a defined goal
  - States can be represented as logic sentences
  - Actions are defined as operators (symbolic representations of the effects and conditions of actions), which contain:
    - Preconditions of actions
    - Effects of actions
  - A goal is a set of states
• The planning system uses constraints to efficiently search for a sequence of operators that leads from the start state to a goal state
Example
• Initial state: (Location(bedroom)) ^ (Light(bathroom, off))
• Goal: Happy(Inhabitant)
• Action 1: MakeHappy
  - Precondition: (Location(X)) ^ (Light(X, on))
  - Effect: Add: Happy(Inhabitant)
• Action 2: TurnOnLight(X)
  - Precondition: Light(X, off)
  - Effect: Delete: Light(X, off); Add: Light(X, on)
• Action 3: Move(X, Y)
  - Precondition: (Location(X)) ^ (Light(Y, on))
  - Effect: Delete: Location(X); Add: Location(Y)
• Plan: Action 2, Action 3, Action 1
Example
(Plan graph: Start [Location(bedroom), Light(bathroom, off)] -> TurnOnLight(bathroom) [adds Light(bathroom, on)] -> Move(bedroom, bathroom) [adds Location(bathroom)] -> MakeHappy [adds Happy(Inhabitant)] -> Finish.)
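The plan above can be checked mechanically. A small STRIPS-style sketch (illustrative Python; the encoding of operators as precondition, delete, and add sets is an assumption) applies each action of the plan in turn and verifies that the goal holds at the end:

    # STRIPS-style check of the plan from the example (illustrative sketch only).
    state = {("Location", "bedroom"), ("Light", "bathroom", "off")}
    goal = {("Happy", "Inhabitant")}

    plan = [
        # Action 2: TurnOnLight(bathroom)
        {"pre": {("Light", "bathroom", "off")},
         "del": {("Light", "bathroom", "off")},
         "add": {("Light", "bathroom", "on")}},
        # Action 3: Move(bedroom, bathroom)
        {"pre": {("Location", "bedroom"), ("Light", "bathroom", "on")},
         "del": {("Location", "bedroom")},
         "add": {("Location", "bathroom")}},
        # Action 1: MakeHappy
        {"pre": {("Location", "bathroom"), ("Light", "bathroom", "on")},
         "del": set(),
         "add": {("Happy", "Inhabitant")}},
    ]

    for op in plan:
        assert op["pre"] <= state, "precondition not satisfied"
        state = (state - op["del"]) | op["add"]

    print(goal <= state)   # True: the plan achieves the goal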
Example Planning Systems
• Partial order planners
  - Derive plans without requiring actions to be found in sequence
    - SNLP (Univ. of Washington)
    - GraphPlan (CMU): builds and prunes a graph of possible plans
• Conditional planners
  - Derive plans under uncertainty by constructing plans that work under given conditions
    - UCPOP (Univ. of Washington): a partial order planner with universal quantification and conditional effects
Planning Decisions
• Characteristics
  - Complete and correct (given complete rules)
  - Relatively natural formulation
• Advantages
  - Permits a sequence of actions to be found that performs a given task
  - Goals can be defined easily
• Problems
  - Requires a complete description of the system
  - Uncertainty is difficult to handle
  - Planning is generally very complex
Decision Theory
• Decision theory addresses rational decision making under uncertainty
  - Uncertainty is represented using probabilities
    - Uncertainty due to incomplete observability
    - Uncertainty due to nondeterministic action outcomes
    - Uncertainty due to nondeterministic system behavior
  - Utility theory is used to achieve rational decisions
    - Utility is a measure of the expected "value" of a given situation or decision
    - Rational decisions are the ones that yield the highest expected utility in the current situation
Modeling Uncertainty
• The current situation can be represented as a belief state, i.e. as a probability distribution over the states indicating the likelihood that any given state xi is the current state:
  {(x1, P(x1)), (x2, P(x2)), …, (xn, P(xn))}
  - The probability of a state can be expressed as the joint probability of all state attributes: P(x) = P(a1, a2, …, an)
• Uncertainties from incomplete observability and nondeterminism can be modeled as conditional probabilities
  - State transition model: P(xi^t | xj^(t-1), d)
  - Observation model: P(o | x)
Bayes Rule
• Bayes rule permits inverting cause and effect when calculating probabilities:
  P(c | e) = P(e | c) P(c) / P(e)
• It is often easier to estimate P(e | c)
  - The probability of a state given a set of sensor readings, P(x | o), can be calculated knowing the observation probabilities P(o | x)
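As a sketch of how this is used (hypothetical Python with invented numbers): given a prior P(x) over states and the observation model P(o | x) for a fired motion sensor, the posterior P(x | o) follows directly from Bayes rule, with P(o) obtained by normalization:

    # Bayes rule sketch: P(x | o) = P(o | x) P(x) / P(o). Numbers are made up.
    prior = {"bedroom": 0.6, "bathroom": 0.3, "kitchen": 0.1}              # P(x)
    p_obs_given_state = {"bedroom": 0.1, "bathroom": 0.8, "kitchen": 0.2}  # P(o | x)

    unnormalized = {x: p_obs_given_state[x] * prior[x] for x in prior}
    p_obs = sum(unnormalized.values())                                     # P(o)
    posterior = {x: v / p_obs for x, v in unnormalized.items()}            # P(x | o)

    print(posterior)   # roughly {'bedroom': 0.19, 'bathroom': 0.75, 'kitchen': 0.06}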
Utility Theory
• Utilities U(A) represent the "value" of a given situation or decision A and model preferences
• The utility function for a particular system is not unique
  - Only relative differences between utility values are important
    - U(A) > U(B): A is preferred to B
    - U(A) = U(B): the agent is indifferent between A and B
• Utilities for uncertain situations can be calculated as the expected value of the utility of all possibilities:
  U({(x1, P(x1)), …, (xn, P(xn))}) = Σi P(xi) U(xi)
Rational Decisions
• The rational decision is the one that leads to the highest expected utility:
  d* = argmax_d Σi P(xi | o(0), …, o(t), d) U(xi)
• Rational decisions in decision theory require:
  - A complete causal model of the environment: P(xi | xj, d)
  - Complete knowledge of the observation (sensor) model: P(o | xi)
  - Knowledge of the utility function for all states: U(xi)
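A minimal sketch of this decision rule (hypothetical Python; the states, decisions, and numbers are invented for illustration): compute the expected utility of each candidate decision from its predicted state distribution and pick the maximizer:

    # Expected-utility decision sketch following the formula above:
    # d* = argmax_d sum_i P(x_i | observations, d) * U(x_i).

    # Predicted state distribution for each candidate decision, P(x | o(0..t), d)
    p_state_given_decision = {
        "turn_on_bathroom_light": {"lit_bathroom_occupied": 0.7, "lit_bathroom_empty": 0.3},
        "do_nothing":             {"dark_bathroom_occupied": 0.7, "dark_bathroom_empty": 0.3},
    }

    # Utility of each resulting state, U(x)
    utility = {"lit_bathroom_occupied": 1.0, "lit_bathroom_empty": -0.2,
               "dark_bathroom_occupied": -0.5, "dark_bathroom_empty": 0.0}

    def expected_utility(d):
        return sum(p * utility[x] for x, p in p_state_given_decision[d].items())

    best = max(p_state_given_decision, key=expected_utility)
    print(best, expected_utility(best))   # turn_on_bathroom_light 0.64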
Markov Decision Processes
• Markov decision processes (MDPs) form a probabilistic model of all possible system behavior
• MDPs can be described by a tuple <S, A, T, R> representing states, actions, transition probabilities, and reinforcements
• The system has to obey the Markov assumption:
  P(x(t+1) | x(t), d(t), x(t-1), d(t-1), …, x(0)) = P(x(t+1) | x(t), d(t))
• Reinforcement represents the instantaneous change in utility obtained in a given state
  - Models costs and payoffs
  - Reinforcements are generally sparse and delayed
Utility Function for MDPs
• In an MDP, the utility of a state under a given policy π can be defined as the expected sum of discounted reinforcements:
  U^π(x) = E[ Σt γ^t R(x(t)) ]
• The optimal utility function U* can be computed using value iteration:
  U_t+1(xi) = R(xi) + γ max_d Σ_xj P(xj | xi, d) U_t(xj)
• The optimal policy (decision strategy) can be extracted from the utility function:
  π*(xi) = argmax_d ( R(xi) + γ Σ_xj P(xj | xi, d) U*(xj) )
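A compact value-iteration sketch (illustrative Python on a tiny invented two-state MDP rather than the grid-world example that follows): the update above is applied repeatedly, and the greedy policy is read off the converged utilities:

    # Value iteration sketch on a tiny invented MDP (two states, two decisions).
    # T[x][d] is a list of (next_state, probability) pairs; R is the reward per state.
    T = {
        "dark": {"turn_on": [("lit", 0.9), ("dark", 0.1)], "wait": [("dark", 1.0)]},
        "lit":  {"turn_on": [("lit", 1.0)],                "wait": [("dark", 0.2), ("lit", 0.8)]},
    }
    R = {"dark": -0.1, "lit": 1.0}
    gamma = 0.9

    U = {x: 0.0 for x in T}
    for _ in range(100):   # U_t+1(x) = R(x) + gamma * max_d sum_xj P(xj | x, d) U_t(xj)
        U = {x: R[x] + gamma * max(sum(p * U[j] for j, p in T[x][d]) for d in T[x])
             for x in T}

    policy = {x: max(T[x], key=lambda d: sum(p * U[j] for j, p in T[x][d])) for x in T}
    print(U)        # converged utilities
    print(policy)   # greedy policy, here 'turn_on' in both states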
MDP Example
• S = {(1,1), (1,2), …, (4,3)}
• A = {up, down, left, right}
• T: P(intended direction) = 0.8, P(each direction at a right angle to the intended one) = 0.1
• R: +1 at the goal, -1 at the trap, -0.04 in all other states
• γ = 1
MDP Example
(Figure: the resulting optimal utilities and optimal policy for each state of the grid world.)
Markov Decision Processes
• Characteristics
  - Complete and correct
• Advantages
  - Takes into account transition uncertainty
  - Makes optimal decisions
  - Automatically calculates the utility function
• Problems
  - Requires a complete probabilistic description of the system
  - Requires complete observability of the state
Partially Observable MDPs
• Partially observable Markov decision processes (POMDPs) extend MDPs by permitting states to be only partially observable
• Systems can be represented by a tuple <S, A, T, R, O, V>, where <S, A, T, R> is an MDP and O, V map observations about the state to probabilities of a given state
  - O = {oi} is the set of observations
  - V: V(x, o) = P(o | x)
• To determine an optimal policy, an optimal utility function for the belief states has to be computed
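Because the true state is hidden, a POMDP controller maintains such a belief state and updates it after every decision and observation. A minimal belief-update sketch (illustrative Python; the two-state model and all numbers are assumptions):

    # Belief update sketch: b'(xj) ∝ P(o | xj) * sum_i P(xj | xi, d) * b(xi).
    belief = {"occupied": 0.5, "empty": 0.5}              # current belief b(x)
    T = {                                                  # P(x' | x, d) for the chosen decision d
        "occupied": {"occupied": 0.9, "empty": 0.1},
        "empty":    {"occupied": 0.2, "empty": 0.8},
    }
    V = {"occupied": 0.7, "empty": 0.05}                   # P(o = motion detected | x')

    def update_belief(belief, T, V):
        predicted = {xj: sum(T[xi][xj] * belief[xi] for xi in belief) for xj in belief}
        unnormalized = {xj: V[xj] * predicted[xj] for xj in belief}
        norm = sum(unnormalized.values())
        return {xj: v / norm for xj, v in unnormalized.items()}

    print(update_belief(belief, T, V))   # belief shifts strongly toward "occupied"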
POMDPs
• Characteristics
  - Complete and correct
• Advantages
  - Takes into account all uncertainty
  - Makes optimal decisions
• Problems
  - Requires a complete probabilistic description of the system
  - The optimal solution is so far intractable (dynamic decision networks and approximation techniques exist and work for small state spaces)
Learning Decisions
• Learning techniques permit decisions to be learned from past experience and feedback from the inhabitants or the environment
  - Supervised learning
    - Requires the desired decision to be specified during training
  - Reinforcement learning
    - Learns by experimentation from scalar reward feedback
      - Inhabitant feedback (e.g. device interactions)
      - Explicit environment feedback (e.g. energy consumption)
      - Implicit feedback (e.g. prediction of inhabitant comfort)
Feedforward Neural Networks
• Neural networks are a supervised learning mechanism that can be trained to make decisions based on a set of training examples
  - Training for reactive decisions involves the presentation of a set of examples (xi, d(xi)), where d(xi) is the desired decision to be made in configuration xi
  - Training for goal-based or utility-based decisions involves learning a model that maps the input (xi, d(xi)) to the outcome of the action, f(xi, d(xi)), and then selecting the decision with the best outcome
Example System: Regulation in the Adaptive House [DLRM94]
• A neural network learns to regulate the lights in the house to maintain a given light intensity
1. Learns a network that predicts the light intensity if a given set of lights is turned on
  - Input:
    - The current light device levels (7 inputs)
    - The current light sensor levels (4 inputs)
    - The new light device levels (7 inputs)
  - Output:
    - The new light sensor levels (4 outputs)

[DLRM94] Dodier, R. H., Lukianow, D., Ries, J., & Mozer, M. C. (1994). A comparison of neural net and conventional techniques for lighting control. Applied Mathematics and Computer Science, 4, 447-462.
Example System: Regulation in the Adaptive House (continued)
2. Decisions are made by comparing the output of the network for all possible decisions (i.e. combinations of lights to be turned on) with the desired light intensity (set point p) and taking the decision that most closely matches it:
  d(xi) = argmin_d | p - f(xi, d) |
(Figure: the network takes the state xi and a candidate decision d as inputs and outputs the prediction f(xi, d), which is compared against the set point p.)
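A sketch of this decision step (illustrative Python; predict_sensor_levels is a toy stand-in for the trained network, not the Adaptive House model): enumerate the candidate light settings, run each through the model, and keep the one whose prediction is closest to the set point:

    # Decision-by-prediction sketch: d(x) = argmin_d | p - f(x, d) |.
    from itertools import product

    def predict_sensor_levels(state, decision):
        """Toy stand-in for the learned model f(x, d): predicted intensity is
        assumed to grow with the number of lights that are turned on."""
        return 0.3 * sum(decision)

    set_point = 0.9                                  # desired light intensity p
    state = {"device_levels": (0, 0, 1), "sensor_levels": (0.2,)}

    candidates = list(product([0, 1], repeat=3))     # all on/off combinations of 3 lights
    best = min(candidates, key=lambda d: abs(set_point - predict_sensor_levels(state, d)))
    print(best)   # (1, 1, 1): predicted 0.9, matching the set point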
Neural Networks
• Characteristics
  - Efficient
• Advantages
  - Can learn arbitrary decision functions from training data
  - Generalizes to new situations
  - Fast decision making
• Problems
  - Requires training data that contains the desired decision or a goal/objective
  - Requires the design of a sufficient input representation
Reinforcement Learning
• Reinforcement learning learns an optimal decision strategy from trial and error and sparse reward feedback
  - An on-line method to solve Markov decision processes (or, with extensions, POMDPs)
• The reward R is a signal encoding the instantaneous feedback to the system
• The system learns a mapping from states to decisions, π*(xi), which optimizes the expected utility
Q-Learning
• Q-learning is the most popular reinforcement learning technique for MDPs
• Learns a utility function for state-action pairs, Q(x, d)
  - Utility: U(x) = max_d Q(x, d)
• Learns by experimentation
  - Update Q(xi, d) after each observed transition from state xi by comparing the expected utility of (xi, d) with the expectation computed after observing the actual outcome xj:
    Q(xi, d) <- Q(xi, d) + α (R(xi) + γ max_d' Q(xj, d') - Q(xi, d))
• Decisions are made to optimize Q-values
  - π(x) = argmax_d Q(x, d)
Example System: Regulation in the Adaptive House [Moz98]
• Neural network regulators can control lighting and heating to achieve a given set point
• The set point is learned using reinforcement
  - Energy usage
  - Inhabitant interactions with light switches or thermostats

[Moz98] Mozer, M. C. The neural network house: An environment that adapts to its inhabitants. In Proc. AAAI, 1998.
Example System: MavHome
• Uses Q-learning on a state space including device status and the Active LeZi prediction
  - State st at time t: st = (xt, pt)
• Reinforcement includes multiple metrics
  - Energy usage
  - Number of inhabitant-device interactions
• Decisions are device interactions plus an action representing the decision not to perform an action
• The system operates event-driven, making a decision every time an event happens
• The learner is pre-trained using the Active LeZi predictor
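In this spirit, a minimal tabular Q-learning sketch (illustrative Python; the toy two-state environment, learning rate, and discount are assumptions, not MavHome's actual state space or reward) shows the update rule from the Q-Learning slide and the greedy policy it produces:

    # Tabular Q-learning sketch: Q(x,d) += alpha * (R(x) + gamma * max_d' Q(x',d') - Q(x,d)).
    import random
    from collections import defaultdict

    states, decisions = ["dark", "lit"], ["turn_on", "wait"]
    R = {"dark": -0.1, "lit": 1.0}

    def step(state, decision):
        """Toy transition model (assumed): turning the light on usually works."""
        if decision == "turn_on":
            return "lit" if random.random() < 0.9 else "dark"
        return "dark" if random.random() < 0.2 else state

    Q = defaultdict(float)
    alpha, gamma = 0.1, 0.9

    state = "dark"
    for _ in range(5000):
        d = random.choice(decisions)                       # explore with a random policy
        next_state = step(state, d)
        target = R[state] + gamma * max(Q[(next_state, d2)] for d2 in decisions)
        Q[(state, d)] += alpha * (target - Q[(state, d)])  # Q-learning update
        state = next_state

    policy = {x: max(decisions, key=lambda d: Q[(x, d)]) for x in states}
    print(policy)   # typically {'dark': 'turn_on', 'lit': 'turn_on'}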
Example System: MavHome
• Example task: getting up in the morning and taking a shower
Example System: MavHome
• The home learns to automate light activations so as to minimize energy usage without increasing the number of inhabitant interactions
Reinforcement Learning
• Characteristics
  - Optimal policies (given enough training)
• Advantages
  - Can learn optimal decision strategies without explicit training
  - Can deal with multiple objectives
• Problems
  - Trial-and-error learning can lead to spurious actions, raising potential safety issues
  - Requires complete state space representations
  - Can be very complex
Conclusions
• Decision making is an integral component of intelligent environments
  - Automates devices
  - Determines what information to present to inhabitants
• Different decision-making approaches apply to different conditions based on the available information
  - Reactive / goal-based / utility-based
  - Programmed / learning
• Decision-making approaches can be "mixed"
• Many open issues remain:
  - How to deal with the complexity of intelligent environments? (Hierarchical systems, multi-agent systems, etc.)
  - How to assure the safety and acceptability of learning decision makers?
