Group Activity 3

Bayesian Belief Networks

Probabilistic models define relationships between variables and can be used to calculate probabilities.

Fully conditional models, however, may require an enormous amount of data to cover all possible cases, and their probabilities may be intractable to calculate in practice. Simplifying assumptions, such as the conditional independence of all random variables, can be effective.

An alternative is to develop a model that preserves known conditional dependencies between random variables and assumes conditional independence in all other cases. Bayesian Belief Networks are probabilistic graphical models that explicitly capture the known conditional dependencies with directed edges in a graph; all missing connections define the conditional independencies in the model.

As such, Bayesian Belief Networks provide a useful tool to visualize the probabilistic model for a domain, review all
of the relationships between the random variables, and reason about causal probabilities for scenarios given available
evidence.

Challenge of Probabilistic Modeling


Probabilistic models can be challenging to design and use. Most often, the problem is the lack of information about
the domain required to fully specify the conditional dependence between random variables. If available, calculating
the full conditional probability for an event can be impractical. A common approach to addressing this challenge is to
add some simplifying assumptions, such as assuming that all random variables in the model are conditionally
independent.

An alternative approach is to develop a probabilistic model of a problem with some conditional independence
assumptions. This provides an intermediate approach between a fully conditional model and a fully conditionally
independent model. Bayesian Belief networks are one example of a probabilistic model where some variables are
conditionally independent.
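To see why such simplifying assumptions help, it is worth counting the parameters needed to specify a joint distribution over n binary variables. The following is a minimal sketch of that counting argument; the function names and the values of n are illustrative, not from the text:

```python
# Sketch: parameters needed to specify a distribution over n binary
# variables under different independence assumptions (illustrative only).

def full_joint_params(n: int) -> int:
    # A fully conditional model needs one probability per joint outcome,
    # minus one because the probabilities must sum to 1.
    return 2 ** n - 1

def fully_independent_params(n: int) -> int:
    # If every variable is assumed independent, one parameter per
    # variable suffices.
    return n

for n in (3, 10, 20):
    print(n, full_joint_params(n), fully_independent_params(n))
```

A Bayesian Network sits between these extremes: each node needs one conditional table per combination of its parents' values, so the cost is governed by the graph structure rather than by the total number of variables.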

Bayesian Belief Network as a Probabilistic Model

A Bayesian Belief network is a type of probabilistic graphical model.

Probabilistic Graphical Models

A probabilistic graphical model (PGM), or simply "graphical model" for short, is a way of representing a probabilistic model with a graph structure. A graph comprises nodes (also called vertices) connected by links (also known as edges or arcs). In a probabilistic graphical model, each node represents a random variable (or group of random variables), and the links express probabilistic relationships between these variables.

Bayesian Belief Networks


A Bayesian Belief Network, or simply "Bayesian Network," provides a simple way of applying Bayes' Theorem to complex problems. The networks are not strictly Bayesian by definition; however, given that both the probability distributions for the random variables (nodes) and the relationships between the random variables (edges) are specified subjectively, the model can be thought of as capturing the "belief" about a complex domain.
Bayesian probability is the study of subjective probabilities or belief in an outcome, compared to the frequentist
approach where probabilities are based purely on the past occurrence of the event.

A Bayesian Belief Network captures the joint probabilities of the events represented by the model.

Central to the Bayesian network is the notion of conditional independence. Independence refers to a random variable
that is unaffected by all other variables. A dependent variable is a random variable whose probability is conditional on
one or more other random variables.
Conditional independence describes the relationship among multiple random variables, where a given variable may be conditionally independent of one or more other random variables. This does not mean that the variable is independent per se; rather, it means that the variable is independent of specific other random variables once their values are known.

A probabilistic graphical model, such as a Bayesian Network, provides a way of defining a probabilistic model for a
complex problem by stating all of the conditional independence assumptions for the known variables, whilst allowing
the presence of unknown (latent) variables.

As such, both the presence and the absence of edges in the graphical model are important in the interpretation of the
model.

A graphical model (GM) is a way to represent a joint distribution by making Conditional Independence (CI) assumptions. In particular, the nodes in the graph represent random variables, and the (lack of) edges represent CI assumptions. (A better name for these models would in fact be "independence diagrams".)

Bayesian networks provide useful benefits as a probabilistic model. For example:

• Visualization. Provides a direct way to visualize the structure of the model and to motivate the design of new models.
• Relationships. Provides insight into the presence and absence of relationships between random variables.
• Computations. Provides a way to structure complex probability calculations.

How to Develop and Use a Bayesian Network


Designing a Bayesian Network requires defining at least three things:

• Random Variables. What are the random variables in the problem?


• Conditional Relationships. What are the conditional relationships between the variables?
• Probability Distributions. What are the probability distributions for each variable?
It may be possible for an expert in the problem domain to specify some or all of these aspects in the design of the
model.

In many cases, the architecture or topology of the graphical model can be specified by an expert, but the probability
distributions must be estimated from data from the domain.

Both the probability distributions and the graph structure itself can be estimated from data, although this can be a
challenging process. As such, it is common to use learning algorithms for this purpose; for example, assuming a
Gaussian distribution for continuous random variables and using gradient ascent to estimate the distribution parameters.
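As a minimal sketch of estimating a distribution's parameters from data (assuming a Gaussian variable, as above): for brevity this uses the closed-form maximum-likelihood estimates (the sample mean and variance) rather than gradient ascent, and the true parameters are contrived:

```python
import math
import random

# Hypothetical example: recover the parameters of a Gaussian random
# variable from sampled data via maximum likelihood. The true
# parameters (mu=2.0, sigma=0.5) are invented for illustration.
random.seed(0)
data = [random.gauss(2.0, 0.5) for _ in range(10_000)]

# Closed-form MLE: sample mean and (biased) sample variance.
mu_hat = sum(data) / len(data)
var_hat = sum((x - mu_hat) ** 2 for x in data) / len(data)
sigma_hat = math.sqrt(var_hat)

print(round(mu_hat, 2), round(sigma_hat, 2))  # estimates close to 2.0 and 0.5
```

With more data the estimates converge to the true parameters; gradient ascent on the log-likelihood reaches the same answer iteratively and generalizes to distributions without closed-form estimates.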

Once a Bayesian Network has been prepared for a domain, it can be used for reasoning, e.g. making decisions.

Reasoning is achieved via inference with the model for a given situation. For example, the outcomes of some events may
be known and plugged into the corresponding random variables. The model can then be used to estimate the probability
of the causes of those events or of possible further outcomes.

Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and subsequently
computing probabilities of interest, conditioned on this evidence.

Practical applications of Bayesian Networks include medicine (symptoms and diseases), bioinformatics (traits and
genes), and speech recognition (utterances and time).

Example of a Bayesian Network
Consider a problem with three random variables: A, B, and C. A is dependent upon B, and C is dependent upon B. We
can state the conditional dependencies as follows:

• A is conditionally dependent upon B, e.g. P(A|B)


• C is conditionally dependent upon B, e.g. P(C|B)
We know that C and A have no effect on each other. We can also state the conditional independencies as follows:

• A is conditionally independent of C given B: P(A|B, C) = P(A|B)


• C is conditionally independent of A given B: P(C|B, A) = P(C|B)
Notice that the conditional dependence is stated alongside the conditional independence. That is, A is conditionally
dependent upon B but conditionally independent of C. Because A is unaffected by C once B is known, the probability
of A can be calculated from B alone:

• P(A|C, B) = P(A|B)
We can also see that B is unaffected by A and C and has no parents, so its probability can be stated simply as the
marginal P(B).

We can also write the joint probability of A and C given B or conditioned on B as the product of two conditional
probabilities; for example:

• P(A, C | B) = P(A|B) * P(C|B)


The model summarizes the joint probability of P(A, B, C), calculated as:

• P(A, B, C) = P(A|B) * P(C|B) * P(B)
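The factorization above can be checked numerically. The probability tables below are contrived for illustration; each conditional table gives P(variable = True) for each value of its parent B:

```python
# Contrived probability tables for the network with parent B and
# children A and C (all numbers invented for illustration).
P_B = {True: 0.6, False: 0.4}          # marginal P(B)
P_A_given_B = {True: 0.9, False: 0.2}  # P(A=True | B)
P_C_given_B = {True: 0.7, False: 0.1}  # P(C=True | B)

def joint(a: bool, b: bool, c: bool) -> float:
    # P(A, B, C) = P(A|B) * P(C|B) * P(B)
    pa = P_A_given_B[b] if a else 1 - P_A_given_B[b]
    pc = P_C_given_B[b] if c else 1 - P_C_given_B[b]
    return pa * pc * P_B[b]

print(round(joint(True, True, True), 3))  # 0.9 * 0.7 * 0.6 = 0.378

# The eight joint probabilities sum to 1, as any joint distribution must.
total = sum(joint(a, b, c) for a in (True, False)
            for b in (True, False) for c in (True, False))
print(round(total, 10))  # 1.0
```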


We can draw the graph with B as the parent node and a directed edge to each of its children: B → A and B → C.

Notice that each random variable is assigned a node, and the conditional probabilities are stated as directed
connections between the nodes. Also notice that it is not possible to navigate the graph in a cycle; no loops are
possible when following the directed edges from node to node.

Also notice that the graph is useful even at this point, where we do not yet know the probability distributions for the variables.

You might want to extend this example by using contrived probabilities for discrete events for each random variable
and practice some simple inference for different scenarios.
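As a sketch of such an exercise (the probability tables and the helper posterior_A_given_C below are invented, not from the text), inference by enumeration answers a query such as P(A = True | C) by summing the joint over the hidden variable B and normalizing:

```python
# Contrived tables for the network B -> A, B -> C (numbers invented).
P_B = {True: 0.5, False: 0.5}
P_A_given_B = {True: 0.8, False: 0.3}  # P(A=True | B)
P_C_given_B = {True: 0.6, False: 0.2}  # P(C=True | B)

def joint(a: bool, b: bool, c: bool) -> float:
    # P(A, B, C) = P(A|B) * P(C|B) * P(B)
    pa = P_A_given_B[b] if a else 1 - P_A_given_B[b]
    pc = P_C_given_B[b] if c else 1 - P_C_given_B[b]
    return pa * pc * P_B[b]

def posterior_A_given_C(c: bool) -> float:
    # P(A=True | C=c): sum out the hidden variable B, then normalize.
    num = sum(joint(True, b, c) for b in (True, False))
    den = sum(joint(a, b, c) for a in (True, False) for b in (True, False))
    return num / den

prior_A = sum(joint(True, b, c) for b in (True, False) for c in (True, False))
print(round(prior_A, 3))                     # P(A=True) = 0.55
print(round(posterior_A_given_C(True), 3))   # 0.675
print(round(posterior_A_given_C(False), 3))  # 0.467
```

Observing C = True raises the belief in A (from 0.55 to 0.675) even though A and C are conditionally independent given B, because the evidence flows through their shared parent B.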

Implementation of a Bayesian Network
Part 1:
1.1. Start with the Google Colab website. Google Colaboratory, or "Colab" for short, is a product from Google Research.
Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to
machine learning, data analysis, and education. More technically, Colab is a hosted Jupyter notebook service that
requires no setup to use, while providing free access to computing resources, including GPUs.

1.2. In the upper right of the new page, sign in to the Google Colab with your Google Account.

1.3. Type the following address in your browser:

https://colab.research.google.com/drive/1NQNMdKaE9RncuWgO_vM2k3qywV76Byfh#scrollTo=5wlbUXQ5mxts

1.4. From the RunTime tab, select Change RunTime type and then select T4 GPU.

1.5. Check the top right of your Google Colab page. It should show T4 with a green checkmark.

(Note! The T4 GPU is free only for about 10 hours. (This assignment should not take 10 hours!) After that,
you need to use a groupmate's Google Account and continue all the steps from 1.2.)

(Note! If you don't work with your Google Colab page for about 10 minutes, the GPU and CPU get
disconnected, and you need to restart from 1.4.)

1.6. In line 6 of the first piece of code, enter your group number instead of 42 (this way, the results for different groups
will be different).

1.7. Read the instructions and try to run each piece of code using the run button at the left side of each piece of code.

1.8. You will probably see a warning; choose Run anyway and proceed.

1.9. Keep running all pieces of code (note that when you run the Markov chain Monte Carlo sampler, the network is
being trained, so it may take time). After running each piece of code, save the figures for your report.

1.10. See the model prediction (you don't need to train the Deep Networks, so there is no need to work on Exercises 1
and 2), and save the curve for your report. (Note that the model prediction is accurate when the green line (Predictive
Mean) exactly follows the blue line (True Function).)

Part 2:

2.1. In the first piece of code, try to explain what the following lines of code are doing:

Line 6, Line 10, Line 11, Line 13, Line 14

2.2. Change the amplitude of the noise from 0.02 to 0.2 (Line 10), and repeat steps 1.9 and 1.10 of Part 1 (i.e., run all
pieces of code again). Save both figures.

2.3. Reset the amplitude of the noise to 0.02 (Line 10). Change the amplitude of the signal from 0.3 to 0.6 and repeat
steps 1.9 and 1.10 of Part 1. Then change both scales from 0.3 to 0.03 in Lines 11 and 14 and repeat steps 1.9 and 1.10
of Part 1 (i.e., run all pieces of code again). Save both sets of figures.

2.4. Compare 2.2 and 2.3. Which case decreases the model prediction accuracy more: increasing the noise amplitude in
2.2, or decreasing the signal amplitude in 2.3? Why?
