useR

Tutorial: Learning Bayesian Networks in R: an Example in Systems Biology


Marco Scutari, Genetics Institute, University College London (UCL), United Kingdom, m.scutari@ucl.ac.uk

Overview

The purpose of this tutorial is to provide an overview of the facilities that different R packages offer for learning Bayesian networks, and to show how these packages can be interfaced with one another [1-3]. As a motivating example, we will reproduce the analysis performed by Sachs et al. [4] to learn a causal protein-signalling network.

Goals

The tutorial aims to introduce the basics of learning and inference for Bayesian networks, using real-world data to explore the issues commonly encountered in graphical modelling.

Outline

The tutorial will cover the following topics, with particular attention to R coding practices.

  1. Basic concepts and uses of Bayesian networks and their Markov properties. Discrete and Gaussian parametric assumptions. Workflow of model estimation and inference: structure learning, parameter learning, exact and approximate inference. Causal and non-causal Bayesian network interpretations.
  2. Structure learning: different classes of algorithms. Conditional independence tests and network scores in common use. Examples focusing on the bnlearn and Rgraphviz packages, mentioning deal, pcalg and catnet.
  3. Parameter learning: Bayesian and maximum likelihood estimators. Examples focusing on the bnlearn and lattice packages.
  4. Model averaging and identification of significant edges. Examples focusing on bnlearn, mentioning pcalg and catnet.
  5. Approximate inference with package bnlearn, exact inference with package gRain.
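
The workflow in the outline above (structure learning, parameter learning, model averaging, inference) can be sketched as follows. This is only a minimal illustration, assuming the bnlearn package is installed; it uses the small discrete data set learning.test shipped with bnlearn as a stand-in, not the Sachs et al. data, and the choices of score, test, imaginary sample size, number of bootstrap replicates and threshold are arbitrary assumptions rather than the settings used in the tutorial. The Rgraphviz and gRain steps are run only if those packages happen to be available.

```r
library(bnlearn)

# Stand-in data: a small discrete data set shipped with bnlearn
# (assumption for illustration; this is NOT the Sachs et al. data).
data(learning.test)

# 1. Structure learning.
# Score-based: hill climbing with the BIC score.
dag.hc <- hc(learning.test, score = "bic")
# Constraint-based: Grow-Shrink with mutual information tests.
cpdag.gs <- gs(learning.test, test = "mi")

# Optional plot of the learned structure, only if Rgraphviz is installed.
if (requireNamespace("Rgraphviz", quietly = TRUE))
  graphviz.plot(dag.hc)

# 2. Parameter learning on the hill-climbing structure.
# Maximum likelihood estimates of the conditional probability tables.
fit.mle <- bn.fit(dag.hc, learning.test, method = "mle")
# Bayesian (posterior) estimates; iss = 10 is an arbitrary
# imaginary sample size chosen for this sketch.
fit.bayes <- bn.fit(dag.hc, learning.test, method = "bayes", iss = 10)

# 3. Model averaging: bootstrap the structure learning step and keep
# the arcs whose estimated strength exceeds a threshold (0.85 is arbitrary).
set.seed(42)
strength <- boot.strength(learning.test, R = 200, algorithm = "hc",
                          algorithm.args = list(score = "bic"))
avg.net <- averaged.network(strength, threshold = 0.85)

# 4a. Approximate inference: P(B == "b" | A == "a") by logic sampling.
p.approx <- cpquery(fit.mle, event = (B == "b"), evidence = (A == "a"),
                    n = 10^5)

# 4b. Exact inference with gRain, only if the package is installed.
if (requireNamespace("gRain", quietly = TRUE)) {
  junction <- as.grain(fit.mle)
  with.evidence <- gRain::setEvidence(junction, nodes = "A", states = "a")
  p.exact <- gRain::querygrain(with.evidence, nodes = "B")$B
}
```

Because cpquery() relies on random sampling, p.approx varies slightly from run to run; when gRain is available, p.exact["b"] gives the corresponding exact value for comparison.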

Prerequisites

Background knowledge required for this tutorial includes basic probability theory (multinomial and normal distributions in particular) and basic R commands.

Intended Audience

The target audience includes researchers and analysts working on data that can be intuitively modelled as networks. Practitioners working in the life sciences will relate best to the motivating example, but the techniques covered in the tutorial can easily be applied to other fields such as the social sciences [5].

Workshop Materials

Slides and other materials can be downloaded here.

Related Links

bnlearn homepage: http://www.bnlearn.com

References

[1] Scutari M, Strimmer K (2011). Introduction to Graphical Modelling. In Handbook of Statistical Systems Biology, D. J. Balding, M. Stumpf and M. Girolami (editors), Wiley.

[2] Nagarajan R, Scutari M, Lèbre S (2013). Bayesian Networks in R with Applications in Systems Biology. In print, due April 2013. Use R!, Springer (US).

[3] Denis J-B, Scutari M (2013). Réseaux Bayésiens avec R: Élaboration, Manipulation et Utilisation en Modélisation Appliquée. In preparation. Pratique R, Springer (France).

[4] Sachs K et al. (2005). Causal Protein-Signaling Networks Derived from Multiparameter Single-Cell Data. Science, 308(5721), pages 523-529.

[5] Kenett RS, Perruca G, Salini S (2012). Modern Analysis of Customer Surveys: with Applications Using R. Wiley.