# Bayesian Networks: An Introduction

ISBN: 978-0-470-68402-3 | September 2009 | 366 Pages

## Description

*Bayesian Networks: An Introduction* provides a self-contained introduction to the theory and applications of Bayesian networks, a topic of interest and importance for statisticians, computer scientists and those involved in modelling complex data sets. The material has been extensively tested in classroom teaching and assumes a basic knowledge of probability, statistics and mathematics. All notions are carefully explained, with exercises throughout.

Features include:

- An introduction to the Dirichlet distribution, exponential families and their applications.
- A detailed description of learning algorithms and conditional Gaussian distributions using junction tree methods.
- A discussion of Pearl's intervention calculus, with an introduction to the notion of ‘see’ and ‘do’ conditioning.
- All concepts are clearly defined and illustrated with examples and exercises. Solutions are provided online.

This book will prove a valuable resource for postgraduate students of statistics, computer engineering, mathematics, data mining, artificial intelligence, and biology.

Researchers and users of comparable modelling or statistical techniques such as neural networks will also find this book of interest.

## Table of contents

**Preface.**

**1 Graphical models and probabilistic reasoning**.

1.1 Introduction.

1.2 Axioms of probability and basic notations.

1.3 The Bayes update of probability.

1.4 Inductive learning.

1.5 Interpretations of probability and Bayesian networks.

1.6 Learning as inference about parameters.

1.7 Bayesian statistical inference.

1.8 Tossing a thumb-tack.

1.9 Multinomial sampling and the Dirichlet integral.

Notes.

Exercises: Probabilistic theories of causality, Bayes’ rule, multinomial sampling and the Dirichlet density.

**2 Conditional independence, graphs and *d*-separation**.

2.1 Joint probabilities.

2.2 Conditional independence.

2.3 Directed acyclic graphs and *d*-separation.

2.4 The Bayes ball.

2.5 Potentials.

2.6 Bayesian networks.

2.7 Object oriented Bayesian networks.

2.8 *d*-Separation and conditional independence.

2.9 Markov models and Bayesian networks.

2.10 *I*-maps and Markov equivalence.

Notes.

Exercises: Conditional independence and *d*-separation.

**3 Evidence, sufficiency and Monte Carlo methods**.

3.1 Hard evidence.

3.2 Soft evidence and virtual evidence.

3.3 Queries in probabilistic inference.

3.4 Bucket elimination.

3.5 Bayesian sufficient statistics and prediction sufficiency.

3.6 Time variables.

3.7 A brief introduction to Markov chain Monte Carlo methods.

3.8 The one-dimensional discrete Metropolis algorithm.

Notes.

Exercises: Evidence, sufficiency and Monte Carlo methods.

**4 Decomposable graphs and chain graphs**.

4.1 Definitions and notations.

4.2 Decomposable graphs and triangulation of graphs.

4.3 Junction trees.

4.4 Markov equivalence.

4.5 Markov equivalence, the essential graph and chain graphs.

Notes.

Exercises: Decomposable graphs and chain graphs.

**5 Learning the conditional probability potentials**.

5.1 Initial illustration: maximum likelihood estimate for a fork connection.

5.2 The maximum likelihood estimator for multinomial sampling.

5.3 MLE for the parameters in a DAG: the general setting.

5.4 Updating, missing data, fractional updating.

Notes.

Exercises: Learning the conditional probability potentials.

**6 Learning the graph structure**.

6.1 Assigning a probability distribution to the graph structure.

6.2 Markov equivalence and consistency.

6.3 Reducing the size of the search.

6.4 Monte Carlo methods for locating the graph structure.

6.5 Women in mathematics.

Notes.

Exercises: Learning the graph structure.

**7 Parameters and sensitivity**.

7.1 Changing parameters in a network.

7.2 Measures of divergence between probability distributions.

7.3 The Chan-Darwiche distance measure.

7.4 Parameter changes to satisfy query constraints.

7.5 The sensitivity of queries to parameter changes.

Notes.

Exercises: Parameters and sensitivity.

**8 Graphical models and exponential families**.

8.1 Introduction to exponential families.

8.2 Standard examples of exponential families.

8.3 Graphical models and exponential families.

8.4 Noisy ‘or’ as an exponential family.

8.5 Properties of the log partition function.

8.6 Fenchel Legendre conjugate.

8.7 Kullback-Leibler divergence.

8.8 Mean field theory.

8.9 Conditional Gaussian distributions.

Notes.

Exercises: Graphical models and exponential families.

**9 Causality and intervention calculus**.

9.1 Introduction.

9.2 Conditioning by observation and by intervention.

9.3 The intervention calculus for a Bayesian network.

9.4 Properties of intervention calculus.

9.5 Transformations of probability.

9.6 A note on the order of ‘see’ and ‘do’ conditioning.

9.7 The ‘Sure Thing’ principle.

9.8 Back door criterion, confounding and identifiability.

Notes.

Exercises: Causality and intervention calculus.

**10 The junction tree and probability updating**.

10.1 Probability updating using a junction tree.

10.2 Potentials and the distributive law.

10.3 Elimination and domain graphs.

10.4 Factorization along an undirected graph.

10.5 Factorizing along a junction tree.

10.6 Local computation on junction trees.

10.7 Schedules.

10.8 Local and global consistency.

10.9 Message passing for conditional Gaussian distributions.

10.10 Using a junction tree with virtual evidence and soft evidence.

Notes.

Exercises: The junction tree and probability updating.

**11 Factor graphs and the sum product algorithm**.

11.1 Factorization and local potentials.

11.2 The sum product algorithm.

11.3 Detailed illustration of the algorithm.

Notes.

Exercise: Factor graphs and the sum product algorithm.

**References**.

Index.

## Reviews

"Extensively tested in classroom teaching … The authors clearly define all concepts and provide numerous examples and exercises." (*Book News*, December 2009)