
BAYESIAN NETWORKS

Module 4.2
WHAT IS A BAYESIAN NETWORK?
• A Bayesian network is a compact, flexible and interpretable representation of a joint probability
distribution.
• It is also a useful tool in knowledge discovery, since directed acyclic graphs can represent causal relations between variables.

• A Bayesian network can be learned from data.


• Bayesian inference has found application in a wide range of
activities, including science, engineering, philosophy, medicine,
sport, and law.
• A Bayesian network (BN) is a probabilistic graphical model for representing knowledge about an uncertain domain, where each node corresponds to a random variable and each edge represents a conditional dependency between the corresponding random variables.

BAYESIAN BELIEF NETWORK
• A Bayesian belief network is a powerful tool for modelling cause and effect in a variety of domains.
• It is a compact network of probabilities that captures the probabilistic relationships between variables, as well as historical information about their relationships.
• Belief networks give very convincing results when the historical information in the conditional probability tables is accurate.
• They provide consistent semantics for representing uncertainty, so they are a very effective method of modelling uncertain situations that depend on cause and effect.
• The Bayesian network has mainly two components:
⮚ Causal Component
⮚ Actual numbers

• A Bayesian network graph is made up of nodes and arcs (directed links), where:
• Each node corresponds to a random variable, and a variable can be continuous or discrete.
• Arcs, or directed arrows, represent the causal relationships or conditional probabilities between random variables. These directed links connect pairs of nodes in the graph.

• These links indicate that one node directly influences the other; if there is no directed link between two nodes, they are independent of each other.
• In the diagram, A, B, C, and D are random variables represented by the nodes of the network graph.
• If we consider node B, which is connected to node A by a directed arrow, then node A is called the parent of node B.
• Node C is independent of node A.
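To make this concrete, here is a minimal sketch in Python of how a Bayesian network composes a joint distribution from local conditional probability tables. The two-node structure (Rain → WetGrass) and all numbers are hypothetical, chosen purely for illustration:

```python
# Minimal sketch: a two-node Bayesian network Rain -> WetGrass.
# The structure and CPT numbers are hypothetical illustration values.

# P(Rain)
p_rain = {True: 0.2, False: 0.8}

# P(WetGrass | Rain), one row per parent value
p_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}

def joint(rain, wet):
    """P(Rain=rain, WetGrass=wet) = P(Rain) * P(WetGrass | Rain)."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# Marginal P(WetGrass=True), summing out Rain
p_wet = sum(joint(r, True) for r in (True, False))
print(p_wet)  # 0.2*0.9 + 0.8*0.1 = 0.26
```

The same pattern extends to larger graphs such as the A, B, C, D network above: the joint distribution is the product of each node's probability given its parents.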

BAYESIAN BELIEF NETWORK EXAMPLE 1
[The worked example was presented as a figure in the original slides.]
ADVANTAGES OF BAYESIAN BELIEF NETWORK
• Informative, graphical and efficient
• Accounts for sources of uncertainty
• Allows for information updating
• Models multiple interdependencies
• Includes utility and decision nodes

DISADVANTAGES OF BAYESIAN BELIEF NETWORK
• Not ideally suited for computing small probabilities
• Computationally demanding for systems with a large number of random variables
• Exponential growth of computational effort with increased number of states

PROBABILISTIC REASONING
[The slides in this section were presented as figures in the original deck.]
HIDDEN MARKOV MODELS
• Hidden Markov models (HMMs) are a type of statistical model that has been used for several years.
• They have been applied in different fields such as medicine, computer science, and data
science.
• The Hidden Markov model (HMM) is the foundation of many algorithms.
• It has been used in data science to make efficient use of observations for successful
predictions or decision-making processes.

WHAT ARE MARKOV MODELS?
• Markov models are named after Andrey Markov, who first developed them in the early 1900s.
• Markov models are a type of probabilistic model that is used to predict the future state of a
system, based on its current state.
• In other words, Markov models are used to predict the future state based on the current hidden
or observed states.
• A Markov model is a finite-state machine where each state has an associated probability of transitioning to any other state after one step.
• They can be used to model real-world problems where hidden and observable states are
involved.
• Markov models can be classified as hidden or observable, based on the type of information available for making predictions or decisions.
• Hidden Markov models deal with hidden variables that cannot be directly observed but only inferred from other observations, whereas an observable model, also termed a Markov chain, involves no hidden variables.

• In Markov models, the future state of a system is determined by its current state, which summarizes the relevant past history.
• In the case of the bag of marbles, the current state is determined by the number of each
color of marble in the bag.
• The past history is represented by the contents of the bag, which determine the probabilities
of selecting each color of marble.
• Markov models have many applications in the real world, including predicting the weather,
stock market prices, and the spread of disease.
• Markov models are also used in natural language processing applications such as speech
recognition and machine translation.
• In speech recognition, Markov models are used to identify the correct word or phrase based
on the context of the sentence.
• In machine translation, Markov models are used to select the best translation for a sentence
based on the translation choices made for previous sentences in the text.
WHAT IS A MARKOV CHAIN?
• Markov chains can be thought of as a machine or a system that hops from one state to another, typically forming a chain.
• Markov chains have the Markov property, which states that the probability of
moving to any state next depends only on the current state and not on the previous
states.
• A Markov chain consists of three important components:
⮚ Initial probability distribution: an initial probability distribution over states, where πi is the probability that the Markov chain will start in a certain state i. Some states j may have πj = 0, meaning that they cannot be initial states.
⮚ One or more states
⮚ Transition probability distribution: a transition probability matrix A, in which each entry represents the probability of moving from state i to state j (see the sketch below).
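The three components above map directly onto arrays. A minimal sketch, using a hypothetical three-state weather chain (all numbers assumed for illustration):

```python
import numpy as np

# Hypothetical three-state weather chain, for illustration only.
states = ["sunny", "cloudy", "rainy"]   # one or more states

pi = np.array([0.5, 0.3, 0.2])          # initial distribution pi_i

A = np.array([                          # transition matrix A[i, j]
    [0.6, 0.3, 0.1],                    # from sunny
    [0.3, 0.4, 0.3],                    # from cloudy
    [0.2, 0.4, 0.4],                    # from rainy
])

# Each row of A must sum to 1, as must the initial distribution.
assert np.allclose(A.sum(axis=1), 1.0) and np.isclose(pi.sum(), 1.0)

# Distribution over tomorrow's weather, given today's distribution.
tomorrow = pi @ A
print(dict(zip(states, tomorrow.round(3))))
```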

• The diagram below represents a Markov chain where there are three states representing the
weather of the day (cloudy, rainy, and sunny).
• And there are transition probabilities representing the weather of the next day given the
weather of the current day.

Using this Markov chain, what is the probability that Wednesday will be cloudy if today (Monday) is sunny?

Summing over the three possible states of Tuesday's weather, each path contributes P(sunny → X) · P(X → cloudy), so the total probability of a cloudy Wednesday = 0.2 + 0.03 + 0.04 = 0.27.
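The same two-step calculation can be expressed as matrix multiplication: the (sunny, cloudy) entry of A² sums P(sunny → X) · P(X → cloudy) over every intermediate state X. A minimal sketch, reusing the hypothetical matrix from the earlier sketch (the 0.27 above comes from the slide's own diagram, not from these assumed numbers):

```python
import numpy as np

# Two-step transition probability via matrix squaring.
# Hypothetical transition matrix (same assumed numbers as above);
# the slide's own diagram gives 0.2 + 0.03 + 0.04 = 0.27.
states = ["sunny", "cloudy", "rainy"]
A = np.array([
    [0.6, 0.3, 0.1],   # from sunny
    [0.3, 0.4, 0.3],   # from cloudy
    [0.2, 0.4, 0.4],   # from rainy
])

A2 = A @ A  # A2[i, j] = sum over k of A[i, k] * A[k, j]
print(A2[states.index("sunny"), states.index("cloudy")])  # 0.34 here
```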

WHAT ARE HIDDEN MARKOV MODELS?
• The hidden Markov model (HMM) is another type of Markov model, in which some states are hidden.
• This is where HMM differs from a Markov chain.
• HMM is a statistical model in which the system being modeled is assumed to be a Markov process with unobserved or hidden states.
• It is a hidden-variable model: the hidden states cannot be seen directly, but each one generates an observation, and the Markov assumption lets us infer the hidden states from those observations.
• The hidden state is the term given to a variable which cannot be directly observed but can be inferred by observing one or more emitted observations, according to the Markov assumption.
• The Markov assumption is that a hidden variable depends only on the previous hidden state. This is also termed the limited horizon assumption.
• Another Markov assumption states that the conditional distribution over the next state, given the current state, doesn't change over time. This is also termed the stationary process assumption.

• A Markov model is made up of two components: the state transitions and the hidden random variables that are conditioned on each other.
• A hidden Markov model, however, consists of five important components:
⮚ Initial probability distribution: an initial probability distribution over states, where πi is the probability that the Markov chain will start in state i. Some states j may have πj = 0, meaning that they cannot be initial states. The initialization distribution defines each hidden variable in its initial condition at time t=0 (the initial hidden state).
⮚ One or more hidden states
⮚ Transition probability distribution: A transition probability matrix represents the probability of
moving from state i to state j. The transition matrix is used to show the hidden state to hidden state
transition probabilities.
⮚ A sequence of observations
⮚ Emission probabilities: a sequence of observation likelihoods, also called emission probabilities, each expressing the probability of an observation being generated from a state i. The emission probabilities define, for each hidden state, the conditional distribution over the observable output (see the sketch below).
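The five components map naturally onto arrays. A minimal sketch, using the rainy/sunny example introduced next; all numbers are assumed illustrative values, not taken from the original slides:

```python
import numpy as np

# The five HMM components as arrays. All values are assumed,
# illustrative numbers, not taken from the original slides.
hidden_states = ["rainy", "sunny"]         # one or more hidden states
obs_symbols   = ["walk", "shop", "clean"]  # observable outputs

pi = np.array([0.6, 0.4])                  # initial probability distribution

A = np.array([[0.7, 0.3],                  # transition probabilities:
              [0.4, 0.6]])                 # A[i, j] = P(state j | state i)

B = np.array([[0.1, 0.4, 0.5],             # emission probabilities:
              [0.6, 0.3, 0.1]])            # B[i, k] = P(obs k | state i)

obs_sequence = ["walk", "shop", "clean"]   # a sequence of observations
```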

The hidden Markov model in the diagram (shown as a figure in the original slides) represents the process of predicting whether someone will be found to be walking, shopping, or cleaning on a particular day, depending upon whether the day is rainy or sunny.

The five components of the hidden Markov model for this example were shown as a figure in the original slides: the initial distribution, the hidden states {rainy, sunny}, the transition matrix, the observation sequence, and the emission probabilities over {walk, shop, clean}.
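Given components like those sketched above, the probability of an observation sequence can be computed with the forward algorithm. A minimal sketch, reusing the same assumed numbers (not the slide's own figures):

```python
import numpy as np

# Forward algorithm: P(observation sequence) under the HMM sketched
# above. alpha[i] holds P(o_1..o_t, state_t = i) at each step t.
def forward(pi, A, B, obs_idx):
    alpha = pi * B[:, obs_idx[0]]      # initialize at t = 0
    for o in obs_idx[1:]:              # recurse over time
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()                 # sum out the final hidden state

pi = np.array([0.6, 0.4])              # same assumed values as above
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])

# walk = 0, shop = 1, clean = 2
print(forward(pi, A, B, [0, 1, 2]))    # P(walk, shop, clean)
```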

• The Hidden Markov model is a special type of Bayesian network that has hidden variables which are
discrete random variables.
• In a first-order hidden Markov model, the current hidden state depends only on the single previous hidden state, while in a second-order hidden Markov model it depends on the previous two hidden states.
• The hidden Markov model represents two different states of variables: Hidden state and observable
state.
• A hidden state is one that cannot be directly observed or seen.
• An observable state is one that can be observed or seen.
• One hidden state can be associated with many observable states, and one observable state may be associated with more than one hidden state.
• The hidden Markov model uses the concept of probability to describe both transitions from one hidden state to another and emissions from hidden states to observable states.

REAL WORLD EXAMPLES OF HMM
• Retail scenario: If you go to the grocery store once per week, it is relatively easy for a computer program to predict when your shopping trip will take more time. A hidden Markov model can estimate which day's visit takes longer compared with other days, and then use that information to determine why some visits take longer than others. Another example from e-commerce where hidden Markov models are used is the recommendation engine, where hidden Markov models try to predict the next item that you are likely to buy.
• Travel scenario: By using hidden Markov models, airlines can predict how long it will take a person to finish
checking out from an airport. This allows them to know when they should start boarding passengers!
• Marketing scenario: As marketers utilize a hidden Markov model, they can understand at what stage of their
marketing funnel users are dropping off and how to improve user conversion rates.

• Medical scenario: Hidden Markov models are used in various medical applications, where they try to infer the hidden states of a human body system or organ. For example, cancer detection can be supported by analyzing certain sequences and determining how dangerous they might be for the patient. Another example where hidden Markov models are used is in evaluating biological data such as RNA-Seq, ChIP-Seq, etc., which help researchers understand gene regulation. Using hidden Markov models, doctors can also predict the life expectancy of people based on their age, weight, height, and body type.

KALMAN FILTER
• The Kalman Filter is an efficient optimal estimator (a set of mathematical equations) that provides
a recursive computational methodology for estimating the state of a discrete-data controlled process
from measurements that are typically noisy, while providing an estimate of the uncertainty of the
estimates.
• Kalman filters are discrete: they rely on measurement samples taken at repeated, constant time intervals.
• Kalman filters are recursive: the prediction of the future relies on the state of the present (position, velocity, acceleration, etc.), together with a guess about external factors that may affect the situation.
• Kalman filters predict the future: measurements (such as those from sensors) are combined with the prediction to derive an adjusted estimate of the state.
• The basic Kalman filter is meant for linear systems, but many challenging scientific problems, for example in satellite navigation, are nonlinear; these required a special version of the Kalman filter called the extended Kalman filter (EKF).
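To illustrate the predict-measure-update cycle described above, here is a minimal sketch of a one-dimensional Kalman filter that estimates a constant value from noisy readings. The noise parameters and measurements are hypothetical illustration values:

```python
# Minimal sketch: a 1D (scalar) Kalman filter, assuming a constant
# hidden state observed through noisy measurements. q, r, and the
# measurement list are hypothetical illustration values.

def kalman_1d(measurements, q=1e-4, r=0.1, x0=0.0, p0=1.0):
    x, p = x0, p0                 # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: constant model, so only the uncertainty grows.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p / (p + r)           # Kalman gain
        x = x + k * (z - x)       # adjusted state estimate
        p = (1 - k) * p           # reduced uncertainty
        estimates.append(x)
    return estimates

noisy = [0.9, 1.1, 1.3, 0.8, 1.0, 1.2, 0.95]  # noisy readings of ~1.0
print(kalman_1d(noisy))
```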

Some common applications of Kalman filter in technology are:
• Guidance and navigation of vehicles, particularly aircraft and spacecraft.
• Robotic motion planning and trajectory adjustment.
• Position awareness radar sensors for advanced driver assistance systems (ADAS) in autonomous
vehicles.
• Many computer vision applications such as stabilizing depth measurements, object tracking (e.g., faces,
heads, hands), fusing data from laser scanners, stereo cameras for depth and velocity measurements, and
3D mapping through Kinect or range cameras.

DETAILED KALMAN FILTERS
[Parts 1-5 presented the Kalman filter equations as figures in the original slides; the images are not reproduced here.]
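For reference, the standard discrete Kalman filter predict and update equations, in common textbook notation (state transition F, control input B, measurement matrix H, process noise covariance Q, measurement noise covariance R); the original slides' notation may differ:

```latex
\text{Predict:}\quad
\hat{x}_k^- = F\,\hat{x}_{k-1} + B\,u_k, \qquad
P_k^- = F\,P_{k-1}F^\top + Q

\text{Update:}\quad
K_k = P_k^- H^\top \left(H P_k^- H^\top + R\right)^{-1}, \qquad
\hat{x}_k = \hat{x}_k^- + K_k\left(z_k - H\hat{x}_k^-\right), \qquad
P_k = \left(I - K_k H\right) P_k^-
```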
FLOWCHART FOR KALMAN
[The recursive predict-update cycle was shown as a flowchart figure in the original slides.]
DYNAMIC BAYESIAN NETWORKS
• Dynamic Bayesian networks extend standard Bayesian networks with the concept of time.
• This allows us to model time series or sequences.
• In fact, they can model complex multivariate time series, which means we can model the
relationships between multiple time series in the same model, and also different regimes of
behavior, since time series often behave differently in different contexts.
• In general time-series modelling, the terms auto-, cross-, and partial correlation are often used:
• Autocorrelation - correlation between a variable at different times (lags)
• Cross-correlation - correlation between different time series
• Partial correlation - correlation between the same or a different time series with the effect of lower-order correlations removed (see the sketch below)
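A minimal sketch of the first two of these quantities, with hypothetical data (partial correlation additionally removes the effect of lower-order lags and is omitted for brevity):

```python
import numpy as np

def autocorr(x, k):
    """Correlation between x_t and x_{t+k} (lag-k autocorrelation)."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-k], x[k:])[0, 1]

def crosscorr(x, y):
    """Correlation between two different series at lag 0."""
    return np.corrcoef(x, y)[0, 1]

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=200))   # a random-walk-like series
y = x + rng.normal(size=200)          # a related series

print(autocorr(x, 1))                 # autocorrelation at lag 1
print(crosscorr(x, y))                # cross-correlation
```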

• Dynamic Bayesian networks can capture all types of correlation and model even more complex
relationships, as these correlations can be conditional on other variables (temporal or non-
temporal) and on latent variables (described later) and models can include both discrete and
continuous variables.
• Just because data is recorded sequentially in time does not always mean you have to use a time
series model (DBN).
• For some tasks, a standard Bayesian network may perform very well, and the added complexity
of a temporal model may not be justified.
• In these cases, it is good practice to create a non-temporal model first for comparison, before
developing a Dynamic Bayesian network.
• You must be careful when creating test and validation datasets from time series data, as the data
is usually not IID (Independent and identically distributed).
• Dynamic Bayesian networks can contain both nodes which are time based (temporal), and those
found in a standard Bayesian network. They also support both continuous and discrete variables.
• Multiple variables representing different but related time series can exist in the same model.

• Their dependencies can be modelled leading to models that can make multivariate time series
predictions.
• This means that instead of using only a single time series to make a prediction, we can use many time
series and their interrelations to make better predictions.
• Note that a Bayesian network in Bayes Server becomes a Dynamic Bayesian network as soon as
you add a temporal node.
• There are a few special kinds of nodes.
• Initial nodes only ever connect to temporal nodes at t=0.
• Terminal nodes only ever connect from temporal nodes at the final time (which can vary from case
to case).
• As with standard Bayesian networks, Dynamic Bayesian networks in Bayes Server support multiple
variables in a node (multi-variable nodes).
• This is especially useful for temporal models, as their structure is vastly simplified, inference is often faster, and it can be easier to interpret the parameters.
• A useful way to understand a dynamic Bayesian network is to unroll it. Unrolling means converting a dynamic Bayesian network into its equivalent standard Bayesian network.
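A minimal sketch of unrolling: a two-slice DBN template (edges within a slice, plus edges from slice t to slice t+1) is expanded into an explicit edge list over T time steps. The node names and edges are hypothetical illustration values:

```python
# Minimal sketch of unrolling a DBN template into a static BN.
intra_edges = [("X", "Y")]        # edges within a single time slice
inter_edges = [("X", "X")]        # edges from slice t to slice t+1

def unroll(intra, inter, T):
    """Return the edge list of the equivalent (static) Bayesian network."""
    edges = []
    for t in range(T):
        edges += [(f"{a}_{t}", f"{b}_{t}") for a, b in intra]
        if t + 1 < T:
            edges += [(f"{a}_{t}", f"{b}_{t+1}") for a, b in inter]
    return edges

print(unroll(intra_edges, inter_edges, T=3))
# [('X_0', 'Y_0'), ('X_0', 'X_1'), ('X_1', 'Y_1'), ('X_1', 'X_2'), ('X_2', 'Y_2')]
```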
KEEPING TRACK OF MANY OBJECTS
• This concept considers the situation when two or more objects generate observations.
• What makes this case different from plain old state estimation is that there is now the possibility of
uncertainty about which object generated which observation.
• Object tracking means estimating the state of the target object present in the scene from previous
information.
• At a high level of abstraction, there are two main categories of object tracking:
⮚ Single Object Tracking (SOT)
⮚ Multiple Object Tracking (MOT)
• Object tracking is not limited to 2D sequence data and can be applied to 3D domains as well.
• The strength of Deep Neural Networks (DNN) resides in their ability to learn rich representations and
to extract complex and abstract features from their input.
• In recent years, due to the exponential rise in the research of deep learning methods, there have been
tremendous gains in accuracy and performance of the detection and tracking approaches.

• Multiple Object Tracking (MOT), also called Multi-Target Tracking (MTT), is a computer vision task
that aims to analyse videos to identify and track objects belonging to one or more categories, such as
pedestrians, cars, animals and inanimate objects, without any prior knowledge about the appearance and
number of targets.
• While in Single Object Tracking (SOT) the appearance of the target is known a priori, in MOT a
detection step is necessary to identify the targets that can leave or enter the scene.
• The main difficulty in tracking multiple targets simultaneously stems from the various occlusions and
interactions between objects that can sometimes also have a similar appearance.
• Thus, merely applying SOT models directly to solve MOT leads to poor results, often incurring target drift and numerous ID-switch errors, as such models usually struggle to distinguish between similar-looking intra-class objects.
• Most state-of-the-art tracking approaches follow the ‘Tracking by Detection’ scheme: they first find objects in the scene and then find the corresponding tracklets (the objects' positions in subsequent frames).
• Today's detectors perform exceptionally well and can scale to different scenes; consequently, detections have become the standard input to tracking algorithms.

• Data association is an essential foundation for keeping track of a complex world, because without it there
is no way to combine multiple observations of any given object.
• When objects in the world interact with each other in complex activities, understanding the world requires
combining data association with the relational and open-universe probability models. This is currently an
active area of research.
• Multi-Object Tracking is a task in computer vision that involves detecting and tracking multiple objects
within a video sequence.
• The goal is to identify and locate objects of interest in each frame and then associate them across frames
to keep track of their movements over time.
• This task is challenging due to factors such as occlusion, motion blur, and changes in object appearance,
and is typically solved using algorithms that integrate object detection and data association techniques.
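As a concrete illustration of the data-association step, here is a minimal sketch that matches existing tracks to new detections by maximizing total IoU (intersection-over-union) with the Hungarian algorithm via scipy's linear_sum_assignment; the boxes are hypothetical illustration values:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

tracks     = [[0, 0, 10, 10], [20, 20, 30, 30]]   # boxes from frame t
detections = [[21, 19, 31, 29], [1, 1, 11, 11]]   # boxes from frame t+1

# Cost = 1 - IoU, so minimizing total cost maximizes total overlap.
cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
rows, cols = linear_sum_assignment(cost)           # Hungarian algorithm
for r, c in zip(rows, cols):
    print(f"track {r} -> detection {c} (IoU={1 - cost[r, c]:.2f})")
```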

CHALLENGES
• While solving the object tracking problem, a number of issues arise which can lead to a poor outcome.
• Algorithms over the years have tried to tackle these issues, but so far no foolproof solution has emerged, keeping this an open-ended area of research.
⮚ Variations due to geometric changes, e.g., pose, articulation, the scale of objects
⮚ Differences due to photometric factors, e.g., illumination, appearance
⮚ Non-linear motion
⮚ Limited resolution, e.g., video captured on a low-end phone
⮚ Similar objects in the scene, e.g., the same colour of clothes, accessories, etc.
⮚ Highly crowded scenarios like streets, concerts, stadiums, markets.
⮚ Track initiation & termination. Before starting any tracking algorithm, you need prior information about the object you want to track; initializing the algorithm with a target object may not always be possible.
⮚ Tracks can get merged or switched due to a sudden change in movement, a sharp change in camera quality, etc.
⮚ IDs of target objects get switched due to similar characteristics such as similar clothes, face structure, glasses, skin complexion, height, etc.
⮚ Drifting due to a wrong update of the target model: one wrong update might lead to continued updates in the wrong direction, losing the correct target for the rest of the video.
THANK YOU
