Bayesian Network – Brief Introduction, Characteristics & Examples
The objective of this tutorial is to provide you with a detailed description of the Bayesian Network. Moreover, we will also cover a Bayesian Network example and various characteristics of Bayesian Networks in R.
A Bayesian network in R is a complete model of the variables and their relationships. It is used to answer probabilistic queries about them.
So, let’s start the Bayesian Network Tutorial.
2. What is Bayesian Statistics?
Bayes’ theorem is the basis of Bayesian statistics. It enables the user to update the probabilities of unobserved events. Consider that you have a prior probability for the unobserved event. A related event has occurred. Now you can update the prior probability to get the posterior probability of the event.
Bayesian inference is based on the data that actually occurred, not on data that could have occurred but did not. Bayes’ theorem gives the posterior density of the parameters given the data. It combines information about the parameters from the prior density with the observed data.
Bayes’ Theorem – A theorem of probability theory first stated by the Reverend Thomas Bayes. It describes how a new piece of evidence changes the probability that a theory is true. It is used in a wide variety of contexts, from marine biology to the development of “Bayesian” spam blockers for email systems.
When applied, the probabilities involved in Bayes’ theorem may have different probability interpretations. The theorem is also central to a particular approach to statistical inference, namely Bayesian inference.
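The prior-to-posterior update described in this section can be sketched in a few lines of plain Python. The function name `posterior` and the spam-filter numbers are hypothetical and chosen only for illustration; they do not come from any real mail corpus.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' theorem."""
    # Total probability of the evidence, summed over H and not-H.
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Assumed numbers: 20% of mail is spam, and the word "offer" appears in
# 60% of spam messages but only 5% of legitimate ones.
p_spam_given_word = posterior(prior=0.20, likelihood=0.60, likelihood_given_not=0.05)
print(round(p_spam_given_word, 3))  # 0.75
```

Seeing the word moves the probability of spam from the prior of 0.20 to a posterior of 0.75 under these assumed numbers.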
3. What is Bayesian Network?
A Bayesian Network (BN) is a directed acyclic graph (DAG) whose nodes are annotated with probabilities. It represents a joint probability distribution (JPD) over a set of random variables V.
A Bayesian network uses a directed graphical model to describe random variables and their conditional dependencies. For example, you can build a BN for a patient suffering from a particular disease: by examining various other health factors, the BN lets you calculate the probability of other related diseases. Other applications include gene regulatory networks, protein structure, and gene expression analysis. Formally, a BN is defined as B = (G, Θ), where:
- B is a BN
- G is a Directed Acyclic Graph (DAG). Its nodes $X_1, X_2, \dots, X_n$ represent random variables. Its edges represent direct dependencies between these variables.
- G encodes independence assumptions by which each variable $X_i$ is independent of its non-descendants given its parents in G.
- Θ represents the set of BN parameters. This set contains the parameter $\theta_{x_i \mid \pi_i} = P_B(x_i \mid \pi_i)$ for each realization $x_i$ of $X_i$ conditioned on $\pi_i$, the set of parents of $X_i$ in G.
- B defines a unique joint probability distribution over V, as follows:

$$P_B(X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} P_B(X_i \mid \pi_i) = \prod_{i=1}^{n} \theta_{X_i \mid \pi_i}$$

In the above equation, a random variable $X_i$ that has parents has a conditional local probability distribution, while a variable without parents has an unconditional one.

Evidence nodes are the nodes that represent observable variables. Hidden (unseen) nodes are the nodes that represent unobservable variables.

For a BN, you must specify both the structure and the parameters of the graph model. A conditional probability distribution (CPD) defines the parameters at each node. For discrete variables, the CPD is given as a conditional probability table (CPT), which lists the probability of each value of a child node for every combination of values of its parents.
A Bayesian network is a complete model of the variables and their relationships. We use it to answer probabilistic queries about them.
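To make the idea of a structure plus per-node CPTs concrete, here is a minimal sketch in plain Python (not an R package). The three-node network A → C ← B, the dictionary layout, and all probability values are assumptions made up for this illustration. The joint probability of a full assignment is the product of one CPT entry per node.

```python
from itertools import product

# Hypothetical structure: binary nodes A and B are both parents of C.
parents = {"A": (), "B": (), "C": ("A", "B")}

# cpt[node][parent_values] = P(node = 1 | parents = parent_values)
cpt = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9},
}


def joint(assignment):
    """P(assignment) as the product of one local CPT entry per node."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[q] for q in pa)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p


# Summing the joint over all 2^3 assignments should give 1.
total = sum(
    joint(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)
print(joint({"A": 1, "B": 0, "C": 1}))  # product 0.3 * 0.4 * 0.7
print(total)
```

Note that the network needs only 1 + 1 + 4 = 6 parameters here, whereas a full joint table over three binary variables would need 7.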
4. Bayesian Network in R Examples
Suppose you want to determine the probability that the grass is wet because of rain.
The weather has three states: sunny, cloudy, and rainy. There are two possibilities for the grass: wet or dry.
The sprinkler can be on or off. If it is rainy, the grass gets wet. If it is sunny, you can still make the grass wet by turning on the sprinkler.
Once the probabilities of the Bayesian network reflect the weather, the sprinkler, and the lawn, the BN can answer questions such as: What is the probability that rain or the sprinkler made the lawn wet? If the probability of rain increases, how will that affect your use of the sprinkler to water the lawn?
Suppose that the grass is wet. This has one of the following causes:
- It is raining.
- The sprinkler is on.
To decide between them, you can use Bayes’ rule to compute the posterior probability of each cause. In this example, it turns out to be more likely that the grass is wet because it is raining.
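The comparison between the two causes can be sketched as inference by enumeration in plain Python. This sketch makes two assumptions that are not in the text above: the weather is reduced to a single binary variable C (cloudy or not), and the CPT values are illustrative numbers of the kind commonly used with this textbook example. The helper names `bern`, `joint`, and `query` are hypothetical.

```python
from itertools import product

# Assumed CPTs (1 = true, 0 = false). C = cloudy, S = sprinkler on,
# R = raining, W = grass wet. Structure: C -> S, C -> R, (S, R) -> W.
P_C = 0.5                      # P(C = 1)
P_S = {0: 0.5, 1: 0.1}         # P(S = 1 | C)
P_R = {0: 0.2, 1: 0.8}         # P(R = 1 | C)
P_W = {(0, 0): 0.0, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.99}  # P(W = 1 | S, R)


def bern(p, x):
    """Probability that a binary variable with P(1) = p takes value x."""
    return p if x == 1 else 1.0 - p


def joint(c, s, r, w):
    """Joint probability as the product of the four local distributions."""
    return bern(P_C, c) * bern(P_S[c], s) * bern(P_R[c], r) * bern(P_W[(s, r)], w)


def query(target, evidence):
    """P(target | evidence), summing the joint over all 16 assignments."""
    num = den = 0.0
    for values in product((0, 1), repeat=4):
        world = dict(zip(("c", "s", "r", "w"), values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(**world)
        den += p
        if all(world[k] == v for k, v in target.items()):
            num += p
    return num / den


p_rain = query({"r": 1}, {"w": 1})
p_sprinkler = query({"s": 1}, {"w": 1})
print(round(p_rain, 3), round(p_sprinkler, 3))  # 0.708 0.43
```

Under these assumed numbers, rain is the more probable explanation for wet grass. Enumeration is exponential in the number of variables, so it is only suitable for tiny networks like this one.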
5. Characteristics of Bayesian Networks In R
It is important to understand some characteristics associated with Bayesian Networks. They include the following:
- Explaining away
- Top-down and bottom-up reasoning
- Conditional independence in BNs
- BNs with discrete and continuous nodes
- Temporal models
- Hidden Markov Models (HMMs)
- Linear Dynamic Systems (LDSs) and Kalman Filters
i. Explaining Away
In the water sprinkler example, the two causes, sprinkler (S) and rain (R), compete to explain the observed data. Once their common child W (wet grass) is observed, S and R become conditionally dependent, even though they are otherwise independent. This phenomenon is called explaining away because either cause alone is sufficient to explain the evidence on W. For example, suppose the grass is wet and you also learn that it is raining. This reduces the posterior probability that the sprinkler is on, because the rain already explains the wet grass.
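A small numeric sketch in plain Python shows the effect. To keep S and R independent before any evidence arrives, it assumes a reduced network S → W ← R with no weather node. The prior probabilities and the CPT for W are made-up illustrative values.

```python
# Assumed priors for two independent causes and a CPT for their common child.
P_S1 = 0.3   # P(S = 1): sprinkler on
P_R1 = 0.2   # P(R = 1): raining
P_W1 = {(0, 0): 0.0, (1, 0): 0.9, (0, 1): 0.9, (1, 1): 0.99}  # P(W = 1 | S, R)


def joint(s, r, w):
    """P(S = s, R = r, W = w) for the network S -> W <- R."""
    ps = P_S1 if s else 1.0 - P_S1
    pr = P_R1 if r else 1.0 - P_R1
    pw = P_W1[(s, r)] if w else 1.0 - P_W1[(s, r)]
    return ps * pr * pw


# Belief in the sprinkler after seeing wet grass only.
p_s_given_w = sum(joint(1, r, 1) for r in (0, 1)) / sum(
    joint(s, r, 1) for s in (0, 1) for r in (0, 1)
)

# Belief in the sprinkler after seeing wet grass AND learning that it rains.
p_s_given_w_and_r = joint(1, 1, 1) / sum(joint(s, 1, 1) for s in (0, 1))

print(P_S1, p_s_given_w, p_s_given_w_and_r)
```

With these numbers, the probability of the sprinkler rises from 0.3 to about 0.69 when the grass is seen to be wet, then falls back to about 0.32 once rain is known: the rain has explained away the wet grass.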
ii. Bottom-Up and Top-Down Reasoning
BNs are often called generative models because they specify how causes generate effects. In the water sprinkler example, you had evidence of an effect (wet grass) and inferred the most likely cause. This approach is called bottom-up or diagnostic reasoning. It goes from effects to causes and amounts to a kind of inverse probability. In a BN graph, for a node X, bottom-up reasoning uses the evidence nodes that are connected to X through its descendant nodes.
Top-down or causal reasoning goes the other way, from causes to effects. For example, it calculates the probability that the grass will be wet given that it is cloudy. In a BN graph, for a node X, top-down reasoning uses the evidence nodes that are connected to X through its ancestor nodes.
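Because a BN is a generative model, one simple way to illustrate top-down reasoning is ancestral (forward) sampling: draw each node after its parents and count outcomes. The sketch below, in plain Python, estimates the probability of wet grass given that it is cloudy. The binary Cloudy node and every probability value are illustrative assumptions, and `sample_world` is a hypothetical helper name.

```python
import random

# Assumed P(W = 1 | S, R), keyed by (sprinkler on, raining).
P_WET = {(False, False): 0.0, (True, False): 0.9, (False, True): 0.9, (True, True): 0.99}


def sample_world(rng, cloudy=None):
    """Sample (C, S, R, W) parents-first; optionally clamp the root node C."""
    c = (rng.random() < 0.5) if cloudy is None else cloudy
    s = rng.random() < (0.1 if c else 0.5)   # sprinkler is rarer when cloudy
    r = rng.random() < (0.8 if c else 0.2)   # rain is likelier when cloudy
    w = rng.random() < P_WET[(s, r)]
    return c, s, r, w


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
wet_count = sum(sample_world(rng, cloudy=True)[3] for _ in range(n))
p_wet_given_cloudy = wet_count / n
print(p_wet_given_cloudy)
```

Clamping works here only because C is a root node, so the evidence sits upstream of everything sampled. Evidence on a descendant (the bottom-up case) would need rejection sampling, likelihood weighting, or exact inference instead.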
Causality discussions explain the cause and effect of a phenomenon. For example: what phenomenon causes what, and what are the main factors behind an effect? Previously, you had to run experiments to deduce causality in a phenomenon. BNs give a solid mathematical foundation for causality discussions without an experiment.
iv. Conditional Independence in BNs
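The Bayes Ball algorithm lets you read off the conditional independence relationships encoded by a BN. It states that two sets of nodes A and B are conditionally independent given a set C if there is no way for a ball to move from A to B in the graph, where the moves the ball is allowed to make depend on the direction of the edges and on which nodes are in C.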
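The same independence question can be checked programmatically. The sketch below is not the Bayes Ball procedure itself. It uses an equivalent, well-known test instead: restrict the graph to the ancestors of the nodes involved, moralize it (connect co-parents and drop edge directions), delete the conditioning nodes, and check whether any path remains. The function name `d_separated` and the sprinkler graph with a binary Cloudy node are assumptions for illustration.

```python
def d_separated(parents, a, b, given):
    """True if node sets a and b are separated by `given` in the DAG."""
    a, b, given = set(a), set(b), set(given)

    # 1. Keep only the nodes of interest and all of their ancestors.
    keep, stack = set(), list(a | b | given)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])

    # 2. Moralize: undirected child-parent links plus links between co-parents.
    nbrs = {n: set() for n in keep}
    for child in keep:
        pa = list(parents[child])
        for p in pa:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for i, p in enumerate(pa):
            for q in pa[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)

    # 3. Remove the conditioning nodes and search for a path from a to b.
    seen, stack = set(), [n for n in a if n not in given]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(nbrs[node] - given)
    return not (seen & b)


# Sprinkler network: C -> S, C -> R, S -> W, R -> W.
graph = {"C": [], "S": ["C"], "R": ["C"], "W": ["S", "R"]}
print(d_separated(graph, {"S"}, {"R"}, {"C"}))        # True
print(d_separated(graph, {"S"}, {"R"}, {"C", "W"}))   # False
```

The second result is the graphical counterpart of explaining away: observing the common child W connects its two parents.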
v. BNs with Discrete and Continuous Nodes
The water sprinkler example used nodes with categorical values and multinomial distributions. You can also create BNs with continuous-valued nodes. The most common distribution for these variables is the Gaussian. The logistic or softmax distribution is used for discrete nodes with continuous parents. You can build complex models by combining multinomial, conditional Gaussian, and logistic distributions.
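As a rough illustration of a discrete node with a continuous parent, the following plain-Python sketch defines a logistic CPD for a binary child and a softmax CPD for a child with several values. The function names, weights, and biases are hypothetical; in practice these parameters would be learned from data.

```python
import math


def logistic_cpd(x, weight, bias):
    """P(D = 1 | X = x) for a binary node D with one continuous parent X."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


def softmax_cpd(x, weights, biases):
    """P(D = k | X = x) for a K-valued node D; returns K probabilities."""
    scores = [w * x + b for w, b in zip(weights, biases)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


# Assumed parameters: the child is more likely to be "on" as x grows.
print(logistic_cpd(0.0, weight=2.0, bias=0.0))   # 0.5
print(softmax_cpd(1.5, weights=[-1.0, 0.0, 1.0], biases=[0.0, 0.0, 0.0]))
```

With two classes, the softmax CPD reduces to the logistic one, which is why the two are usually mentioned together.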
vi. Temporal Models
Temporal models are directed graphical models of stochastic processes, that is, processes driven by random inputs or variables. They are also known as Dynamic BNs (DBNs). The simplest DBNs are Hidden Markov Models (HMMs) and Linear Dynamic Systems (LDSs). DBNs generalize HMMs and LDSs by representing the hidden and observed state in terms of state variables.
vii. Hidden Markov Model (HMM)
An HMM has one discrete hidden node and one discrete or continuous observed node per slice. The figure below shows an HMM.
A circle represents a variable with continuous values, and a square represents a variable with discrete values. Here the HMM is unrolled for four time slices. Unrolling means duplicating the body of a loop several times, with the copies replacing the original body; the number of copies is called the unrolling factor. In an HMM, the repeated body is a single time slice.
To describe a DBN, you have to specify the topology within each slice and between consecutive slices. You also need to define the parameters for the first two slices. Such a two-slice temporal Bayes net is also known as a 2TBN.
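One standard computation on an HMM is filtering with the forward algorithm: carry a belief over the hidden state from slice to slice, folding in one observation at a time. The plain-Python sketch below unrolls a two-state HMM over four slices. The weather and umbrella interpretation and all parameter values are illustrative assumptions, and `forward_filter` is a hypothetical name.

```python
def forward_filter(initial, transition, emission, observations):
    """Return P(hidden state at the last slice | all observations)."""
    n = len(initial)

    def normalize(values):
        z = sum(values)
        return [v / z for v in values]

    # First slice: prior over the hidden state times the observation likelihood.
    belief = normalize([initial[i] * emission[i][observations[0]] for i in range(n)])

    # Later slices: predict through the transition model, then weight by the
    # likelihood of the new observation.
    for obs in observations[1:]:
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        belief = normalize([predicted[j] * emission[j][obs] for j in range(n)])
    return belief


# Assumed parameters. Hidden states: 0 = dry, 1 = rainy.
# Observations: 0 = no umbrella seen, 1 = umbrella seen.
initial = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[i][j] = P(next = j | current = i)
emission = [[0.8, 0.2], [0.1, 0.9]]     # emission[i][o] = P(obs = o | state = i)

belief = forward_filter(initial, transition, emission, [1, 1, 0, 1])
print(belief)
```

Only the initial distribution plus one transition table and one emission table are needed, however many slices the model is unrolled for, which matches the two-slice description above.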
viii. Linear Dynamic Systems (LDSs) and Kalman Filters
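An LDS has the same topology as an HMM, but all of its nodes are assumed to have linear-Gaussian distributions. In its standard form, a linear-Gaussian model is written as follows, where $x_t$ is the hidden state and $y_t$ is the observation at time t:

$$x_{t+1} = A x_t + w_t, \quad w_t \sim \mathcal{N}(0, Q), \quad x_0 \sim \mathcal{N}(\mu_0, V_0)$$

$$y_t = C x_t + v_t, \quad v_t \sim \mathcal{N}(0, R)$$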
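For intuition, consider the scalar case of such a model, with hidden state x_t = a·x_{t-1} + w (noise variance q) and observation y_t = c·x_t + v (noise variance r). The Kalman filter tracks a Gaussian belief over the hidden state. The plain-Python sketch below shows one predict/update cycle. The function name `kalman_step` and the default parameter values are assumptions for illustration.

```python
def kalman_step(mean, var, y, a=1.0, c=1.0, q=0.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    Model: x_t = a * x_{t-1} + w, w ~ N(0, q);  y_t = c * x_t + v, v ~ N(0, r).
    """
    # Predict: push the current Gaussian belief through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * a * var + q

    # Update: blend the prediction with the new observation y.
    gain = var_pred * c / (c * c * var_pred + r)
    new_mean = mean_pred + gain * (y - c * mean_pred)
    new_var = (1.0 - gain * c) * var_pred
    return new_mean, new_var


# Start from belief N(0, 1) and observe y = 2 with unit observation noise.
mean, var = kalman_step(mean=0.0, var=1.0, y=2.0)
print(mean, var)  # 1.0 0.5
```

The prior and the observation are equally uncertain here, so the updated mean lands halfway between them and the variance is halved.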
So, that was all for the Bayesian Network in R tutorial. We hope you liked our explanation.
6. Conclusion – Bayesian Network in R
Hence, in this Bayesian Network tutorial, we discussed Bayesian Statistics and Bayesian Networks. Moreover, we saw Bayesian Network examples and characteristics of Bayesian Network. Still, if you have any doubt, ask in the comment tab.