Probability Modulation and Non-linearity in Bayesian Networks

By Clive Spenser & Charles Langley

Introduction

This document is the second in a series on artificial inference. The first document in the series looked at the differences between production rules, fuzzy logic, Bayesian networks and certainty theory. Here we concentrate on Bayesian networks, and in particular on the non-linearity of their inference. We start by looking at how probabilities are represented in Bayesian network tools, contrasting conditional probability tables with affirms/denies weights. Following this we examine seven different levels of non-linearity using both three-dimensional graphs and two-dimensional slices of these graphs.

Conditional Probabilities, Affirms/Denies Weights and Conditional Probability Tables

To understand how to increase or decrease the linearity of a Bayesian network we need first to look at the five factors involved in a simple binary network like the one below:

The five factors are:

P(H)        The prior probability of the hypothesis H
P(E1|H)     The probability of evidence E1 when H is true
P(E2|H)     The probability of evidence E2 when H is true
P(E1|~H)    The probability of evidence E1 when H is false
P(E2|~H)    The probability of evidence E2 when H is false

To make matters simpler, however, we can look at symmetrical networks where

P(E1|H)  = P(E2|H) and
P(E1|~H) = P(E2|~H)
thus reducing the relevant factors to three:
P(H), P(E|H) and P(E|~H).
To implement a Bayesian knowledge base using LPA software, all we need to do is to convert these conditional probabilities to affirms and denies weights according to the equations below.

The affirms weight is calculated as:

        A = P(E | H)
            -------------
            P(E | ~H)

and the denies weight is calculated as:

        D = 1 - P(E | H)
            -------------
            1 - P(E | ~H)

These weights can then be used in rules such as:

uncertainty_rule r1
if e1 is high ( affirms 3.20; denies 0.895 ) and e2 is high ( affirms 9.00; denies 0.895 ) then h is high .
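
The two weight formulas above translate directly into code. The following is a minimal Python sketch, not LPA's implementation: the function names `affirms_weight`, `denies_weight` and `update_odds` are hypothetical, and the update step assumes the standard odds form of Bayes' theorem, in which the prior odds of H are multiplied by the affirms weight when the evidence is observed and by the denies weight when it is observed to be absent. The input values 0.4 and 0.25 are those used in Example Two.

```python
def affirms_weight(p_e_given_h, p_e_given_not_h):
    """A = P(E|H) / P(E|~H)."""
    return p_e_given_h / p_e_given_not_h


def denies_weight(p_e_given_h, p_e_given_not_h):
    """D = (1 - P(E|H)) / (1 - P(E|~H))."""
    return (1.0 - p_e_given_h) / (1.0 - p_e_given_not_h)


def update_odds(prior, weight):
    """Multiply the prior odds of H by a weight, then convert back to a probability.

    Assumption: the odds form of Bayes' theorem; prior must be strictly below 1.
    """
    odds = prior / (1.0 - prior) * weight
    return odds / (1.0 + odds)


a = affirms_weight(0.4, 0.25)
d = denies_weight(0.4, 0.25)
print("affirms:", round(a, 3), "denies:", round(d, 3))
print("P(H | E present):", round(update_odds(0.5, a), 3))
print("P(H | E absent): ", round(update_odds(0.5, d), 3))
```

A weight of 1 leaves the prior unchanged, which is the flat-plane case of Example One.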

Some Bayesian tools produced by other companies, such as Hugin and Netica, express conditional probabilities by means of what are called Conditional Probability Tables (CPTs). Here is a typical CPT.

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    0.72   0.56   0.56   0.39
H = no     0.28   0.44   0.44   0.61

CPTs such as this are designed to express P(H|E1) etc. rather than P(E1|H). Logic Programming Associates uses the latter rather than the former, but one can be calculated from the other by means of Bayes’ theorem. The advantage of LPA’s approach is that it enables these values to be obtained directly from databases of previous cases.
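
The conversion by Bayes' theorem can be sketched as follows. This is an illustrative Python example rather than the procedure used by any particular tool: the helper names `posterior` and `cpt` are hypothetical, and the calculation assumes that E1 and E2 are conditionally independent given H. Fed with the figures of Example Four (P(H) = 0.5, P(E|H) = 0.6, P(E|~H) = 0.25), it should give the CPT shown for that example.

```python
from itertools import product


def posterior(p_h, likelihoods, observed):
    """Return P(H | evidence) by Bayes' theorem.

    likelihoods: list of (P(Ei|H), P(Ei|~H)) pairs, one per evidence node.
    observed:    list of booleans, True if Ei is 'yes', False if Ei is 'no'.
    Assumption: the Ei are conditionally independent given H.
    """
    joint_h, joint_not_h = p_h, 1.0 - p_h
    for (p_e_h, p_e_not_h), seen in zip(likelihoods, observed):
        joint_h *= p_e_h if seen else 1.0 - p_e_h
        joint_not_h *= p_e_not_h if seen else 1.0 - p_e_not_h
    return joint_h / (joint_h + joint_not_h)


def cpt(p_h, likelihoods):
    """Map every yes/no combination of the evidence to P(H = yes | evidence)."""
    combos = product([True, False], repeat=len(likelihoods))
    return {combo: posterior(p_h, likelihoods, combo) for combo in combos}


table = cpt(0.5, [(0.6, 0.25), (0.6, 0.25)])
for combo, p_yes in table.items():
    labels = ["yes" if seen else "no" for seen in combo]
    print(labels, "H=yes:", round(p_yes, 2), "H=no:", round(1.0 - p_yes, 2))
```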

Example One

Let us start with the simplest case of all, where the ratio between P(E|H) and P(E|~H) is equal to 1. This results in a flat plane with a vertical value equal to P(H):

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    0.5    0.5    0.5    0.5
H = no     0.5    0.5    0.5    0.5

P(H)          0.5
P(E1 | H)     0.25
P(E1 | ~H)    0.25
P(E2 | H)     0.25
P(E2 | ~H)    0.25

Example Two

Now we will increase the ratio of P(E|H) to P(E|~H) to 1.6:

The corresponding graph is what we might call a flying carpet:

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    0.72   0.56   0.56   0.39
H = no     0.28   0.44   0.44   0.61

P(H)          0.5
P(E1 | H)     0.4
P(E1 | ~H)    0.25
P(E2 | H)     0.4
P(E2 | ~H)    0.25

We will examine this shape more closely by looking at some of the two-dimensional slices of this three-dimensional graph.


These are curves with a fairly low non-linearity compared with what we shall see below.

Example Three

Note that it is the ratio of P(E|H) to P(E|~H) that is relevant here as can be seen by reducing both by a factor of ten:

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    0.72   0.61   0.61   0.49
H = no     0.28   0.39   0.39   0.51

P(H)          0.5
P(E1 | H)     0.04
P(E1 | ~H)    0.025
P(E2 | H)     0.04
P(E2 | ~H)    0.025
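
Why the ratio matters can be made explicit with the odds form of Bayes' theorem. The following worked step is a sketch that assumes E1 and E2 are conditionally independent given H, as in the symmetrical network described earlier; A is the affirms weight defined above.

```latex
\[
\frac{P(H \mid E_1, E_2)}{P({\sim}H \mid E_1, E_2)}
  = \frac{P(H)}{P({\sim}H)} \cdot A^2,
\qquad
A = \frac{P(E \mid H)}{P(E \mid {\sim}H)}
\]
\[
\text{Examples Two and Three: }
A = \frac{0.4}{0.25} = \frac{0.04}{0.025} = 1.6,
\qquad
P(H \mid E_1, E_2) = \frac{1.6^2}{1 + 1.6^2} = \frac{2.56}{3.56} \approx 0.72
\]
```

When both pieces of evidence are absent, the prior odds are multiplied by the square of the denies weight D instead, and D is not fixed by the ratio alone: it is 0.8 in Example Two but about 0.985 in Example Three, which is why the yes/yes columns of the two tables agree while the no/no columns differ.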

Example Four

If we want to decrease the linearity in the model, all we need to do is to increase the ratio between P(E|H) and P(E|~H). Firstly a small increase:

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    0.85   0.56   0.56   0.22
H = no     0.15   0.44   0.44   0.78

P(H)          0.5
P(E1 | H)     0.6
P(E1 | ~H)    0.25
P(E2 | H)     0.6
P(E2 | ~H)    0.25

Here are the two-dimensional slices:


In the graph above, which represents E1 held at 33%, we now see the classic ‘S’ curve of Bayesian non-linearity.

Example Five

Another increase:

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    0.96   0.69   0.69   0.17
H = no     0.04   0.31   0.31   0.83

P(H)          0.5
P(E1 | H)     0.6
P(E1 | ~H)    0.12
P(E2 | H)     0.6
P(E2 | ~H)    0.12


Example Six

Next a more significant increase to the ratio of P(E | H) to P(E | ~H):

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    1.00   0.90   0.90   0.14
H = no     0.00   0.10   0.10   0.86

P(H)          0.5
P(E1 | H)     0.6
P(E1 | ~H)    0.025
P(E2 | H)     0.6
P(E2 | ~H)    0.025


Example Seven

Finally a very significant increase:

E1         yes    yes    no     no
E2         yes    no     yes    no
H = yes    1.00   0.99   0.99   0.14
H = no     0.00   0.01   0.01   0.86

P(H)          0.5
P(E1 | H)     0.6
P(E1 | ~H)    0.0025
P(E2 | H)     0.6
P(E2 | ~H)    0.0025


Conclusions

We have seen here that the most significant factor in determining the linearity of a Bayesian network is the ratio between P(E|H) and P(E|~H). It is this ratio which defines the Affirms weight (and indirectly the Denies weight) used to represent Bayesian rules using LPA software.
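
To see how the weights move across the seven examples, the short Python sketch below tabulates them from the P(E|H) and P(E|~H) values listed in each example. It is an illustration only: the `weights` helper and the `EXAMPLES` dictionary are hypothetical names, and the formulas are the affirms and denies definitions given earlier in this article.

```python
# (P(E|H), P(E|~H)) for each example, copied from the tables in this article.
EXAMPLES = {
    "One": (0.25, 0.25),
    "Two": (0.4, 0.25),
    "Three": (0.04, 0.025),
    "Four": (0.6, 0.25),
    "Five": (0.6, 0.12),
    "Six": (0.6, 0.025),
    "Seven": (0.6, 0.0025),
}


def weights(p_e_h, p_e_not_h):
    """Return (affirms, denies) for one evidence node."""
    affirms = p_e_h / p_e_not_h
    denies = (1.0 - p_e_h) / (1.0 - p_e_not_h)
    return affirms, denies


results = {name: weights(*probs) for name, probs in EXAMPLES.items()}
for name, (a, d) in results.items():
    print(f"Example {name:<5}  affirms {a:8.2f}  denies {d:5.3f}")
```

The affirms weight rises from 1 in Example One to 240 in Example Seven, in step with the increasingly sharp curves described above.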

Submitted: 05/12/2003

Article content copyright © Clive Spenser & Charles Langley, 2003.
All content copyright © 1998-2007, Generation5 unless otherwise noted.