# How to Design, Analyse and Report Cluster Randomised Trials in Medicine and Health Related Research

ISBN: 978-1-118-76345-2 | April 2014 | 272 pages

## Description

**A complete guide to understanding cluster randomised trials**

Written by two researchers with extensive experience in the field, this book presents a complete guide to the design, analysis and reporting of cluster randomised trials. It spans a wide range of applications: trials in developing countries, in primary care and in the health services. A key feature is the use of R code and code from other popular packages to plan and analyse cluster trials, using data from actual trials. The book contains clear technical descriptions of the models used, and considers in detail the ethics involved in such trials and the problems in planning them. For readers and students who do not intend to run a trial but wish to read the literature critically, there are sections on the CONSORT statement, and exercises in reading published trials.

- Written in a clear, accessible style
- Features real examples taken from the authors’ extensive practitioner experience of designing and analysing clinical trials
- Demonstrates the use of R, Stata and SPSS for statistical analysis
- Includes computer code so the reader can replicate all the analyses
- Discusses neglected areas such as ethics and practical issues in running cluster randomised trials

*How to Design, Analyse and Report Cluster Randomised Trials in Medicine and Health Related Research* provides an excellent reference tool and can be read with profit by statisticians, health services researchers, systematic reviewers and critical readers of cluster randomised trials.

## Table of Contents

Preface

Acronyms and abbreviations

**1 Introduction**

1.1 Randomised controlled trials

1.1.1 A-Allocation at random

1.1.2 B-Blindness

1.1.3 C-Control

1.2 Complex interventions

1.3 History of cluster randomised trials

1.4 Cohort and field trials

1.5 The field/community trial

1.5.1 The REACT trial

1.5.2 The Informed Choice leaflets trial

1.5.3 The Mwanza trial

1.5.4 The paramedics practitioner trial

1.6 The cohort trial

1.6.1 The PoNDER trial

1.6.2 The DESMOND trial

1.6.3 The Diabetes Care from Diagnosis trial

1.6.4 The REPOSE trial

1.6.5 Other examples of cohort cluster trials

1.7 Field versus cohort designs

1.8 Reasons for cluster trials

1.9 Between- and within-cluster variation

1.10 Random-effects models for continuous outcomes

1.10.1 The model

1.10.2 The intracluster correlation coefficient

1.10.3 Estimating the intracluster correlation (ICC) coefficient

1.10.4 Link between the Pearson correlation coefficient and the intraclass correlation coefficient

1.11 Random-effects models for binary outcomes

1.11.1 The model

1.11.2 The ICC for binary data

1.11.3 The coefficient of variation

1.11.4 Relationship between cvc and ρ for binary data

1.12 The design effect

1.13 Commonly asked questions

1.14 Websources

Exercise

Appendix 1.A

**2 Design issues**

2.1 Introduction

2.2 Issues for a simple intervention

2.2.1 Phases of a trial

2.2.2 ‘Pragmatic’ and ‘explanatory’ trials

2.2.3 Intention-to-treat and per-protocol analyses

2.2.4 Non-inferiority and equivalence trials

2.3 Complex interventions

2.3.1 Design of complex interventions

2.3.2 Phase I modelling/qualitative designs

2.3.3 Pilot or feasibility studies

2.3.4 Example of pilot/feasibility studies in cluster trials

2.4 Recruitment bias

2.5 Matched-pair trials

2.5.1 Design of matched-pair studies

2.5.2 Limitations of matched-pairs designs

2.5.3 Example of matched-pair design: The Family Heart Study

2.6 Other types of designs

2.6.1 Cluster factorial designs

2.6.2 Example cluster factorial trial

2.6.3 Cluster crossover trials

2.6.4 Example of a cluster crossover trial

2.6.5 Stepped wedge

2.6.6 Pseudorandomised trials

2.7 Other design issues

2.8 Strategies for improving precision

2.9 Randomisation

2.9.1 Reasons for randomisation

2.9.2 Simple randomisation

2.9.3 Stratified randomisation

2.9.4 Restricted randomisation

2.9.5 Minimisation

Exercise

Appendix 2.A

**3 Sample size: How many subjects/clusters do I need for my cluster randomised controlled trial?**

3.1 Introduction

3.1.1 Justification of the requirement for a sample size

3.1.2 Significance tests, P-values and power

3.1.3 Sample size and cluster trials

3.2 Sample size for continuous data – comparing two means

3.2.1 Basic formulae

3.2.2 The design effect (DE) in cluster RCTs

3.2.3 Example from general practice

3.3 Sample size for binary data – comparing two proportions

3.3.1 Sample size formula

3.3.2 Example calculations

3.3.3 Example: The Informed Choice leaflets study

3.4 Sample size for ordered categorical (ordinal) data

3.4.1 Sample size formula

3.4.2 Example calculations

3.5 Sample size for rates

3.5.1 Formulae

3.5.2 Example comparing rates

3.6 Sample size for survival

3.6.1 Formulae

3.6.2 Example of sample size for survival

3.7 Equivalence/non-inferiority studies

3.7.1 Equivalence/non-inferiority versus superiority

3.7.2 Continuous data – comparing the equivalence of two means

3.7.3 Example calculations for continuous data

3.7.4 Binary data – comparing the equivalence of two proportions

3.8 Unknown standard deviation and effect size

3.9 Practical problems

3.9.1 Tips on getting the SD

3.9.2 Non-response

3.9.3 Unequal groups

3.10 Number of clusters fixed

3.10.1 Number of clusters and number of subjects per cluster

3.10.2 Example with number of clusters fixed

3.10.3 Increasing the number of clusters or number of patients per cluster?

3.11 Values of the ICC

3.12 Allowing for imprecision in the ICC

3.13 Allowing for varying cluster sizes

3.13.1 Formulae

3.13.2 Example of effect of variable cluster size

3.14 Sample size re-estimation

3.14.1 Adjusting for covariates

3.15 Matched-pair studies

3.15.1 Sample sizes for matched designs

3.15.2 Example of a sample size calculation for a matched study

3.16 Multiple outcomes/endpoints

3.17 Three or more groups

3.18 Crossover trials

3.18.1 Formulae

3.18.2 Example of a sample size formula in a crossover trial

3.19 Post hoc sample size calculations

3.20 Conclusion: Usefulness of sample size calculations

3.21 Commonly asked questions

Exercise

Appendix 3.A

**4 Simple analysis of cRCT outcomes using aggregate cluster-level summaries**

4.1 Introduction

4.1.1 Methods of analysing cluster randomised trials

4.1.2 Choosing the statistical method

4.2 Aggregate cluster-level analysis – carried out at the cluster level, using aggregate summary data

4.3 Statistical methods for continuous outcomes

4.3.1 Two independent-samples t-test

4.3.2 Example

4.4 Mann–Whitney U test

4.5 Statistical methods for binary outcomes

4.6 Analysis of a matched design

4.7 Discussion

4.8 Commonly asked question

Exercise

Appendix 4.A

**5 Regression methods of analysis for continuous outcomes using individual person-level data**

5.1 Introduction

5.2 Incorrect models

5.2.1 The simple (independence) model

5.2.2 Fixed effects

5.3 Linear regression with robust standard errors

5.3.1 Robust standard errors

5.3.2 Example of use of robust standard errors

5.3.3 Cluster-specific versus population-averaged models

5.4 Random-effects general linear models in a cohort study

5.4.1 General models

5.4.2 Fitting a random-effects model

5.4.3 Example of a random-effects model from the PoNDER study

5.4.4 Checking the assumptions

5.5 Marginal general linear model with coefficients estimated by generalised estimating equations (GEE)

5.5.1 Generalised estimating equations

5.5.2 Example of a marginal model from the PoNDER study

5.6 Summary of methods

5.7 Adjusting for individual-level covariates in cohort studies

5.8 Adjusting for cluster-level covariates in cohort studies

5.9 Models for cross-sectional designs

5.10 Discussion of model fitting

Exercise

Appendix 5.A

**6 Regression methods of analysis for binary, count and time-to-event outcomes for a cluster randomised controlled trial**

6.1 Introduction

6.2 Difference between a cluster-specific model and a population-averaged or marginal model for binary data

6.3 Analysis of binary data using logistic regression

6.4 Review of past simulations to determine efficiency of different methods for binary data

6.5 Analysis using summary measures

6.6 Analysis using logistic regression (ignoring clustering)

6.7 Random-effects logistic regression

6.8 Marginal models using generalised estimating equations

6.9 Analysis of count data

6.10 Survival analysis with cluster trials

6.11 Missing data

6.12 Discussion

Exercise

Appendix 6.A

**7 The protocol**

7.1 Introduction

7.2 Abstract

7.3 Protocol background

7.4 Research objectives

7.5 Outcome measures

7.6 Design

7.7 Intervention details

7.8 Eligibility

7.9 Randomisation

7.10 Assessment and data collection

7.11 Statistical considerations

7.11.1 Sample size

7.11.2 Statistical analysis

7.11.3 Interim analyses

7.12 Ethics

7.12.1 Declaration of Helsinki

7.12.2 Informed consent

7.13 Organisation

7.13.1 The team

7.13.2 Trial forms

7.13.3 Data management

7.13.4 Protocol amendments

7.14 Further reading

Exercise

**8 Reporting of cRCTs**

8.1 Introduction: Extended CONSORT guidelines for reporting and presenting the results from cRCTs

8.2 Patient flow diagram

8.3 Comparison of entry characteristics

8.4 Incomplete data

8.5 Reporting the main outcome

8.6 Subgroup analysis and analysis of secondary outcomes/endpoints

8.7 Estimates of between-cluster variability

8.7.1 Example of reporting the ICC: The PoNDER cRCT

8.8 Further reading

Exercise

**9 Practical issues**

9.1 Preventing bias in cluster randomised controlled trials

9.1.1 Problems with identifying and recruiting patients to cluster trials

9.1.2 Preventing biased recruitment

9.2 Developing complex interventions

9.3 Choice of method of analysis

9.4 Missing data

9.5 Example sensitivity analysis: Imputation of missing 6-month EPDS data for at-risk women from the PoNDER cRCT

9.6 Multiplicity of outcomes

9.6.1 Limiting the number of confirmatory tests

9.6.2 Summary measures and statistics

9.6.3 Global tests and multiple comparison procedures

9.6.4 Which multiple comparison procedure to use?

**10 Computing software**

10.1 R

10.1.1 History

10.1.2 Installing R

10.1.3 Simple use of R

10.1.4 An example of an R program

10.2 Stata (version 12)

10.2.1 Introduction to Stata

10.2.2 Aggregate cluster-level analysis – carried out at the cluster level, using aggregate summary data

10.2.3 Random-effects models – continuous outcomes

10.2.4 Random-effects models – binary outcomes

10.2.5 Random-effects models – count outcomes

10.2.6 Marginal models – continuous outcomes

10.2.7 Marginal models – binary outcomes

10.2.8 Marginal models – count outcomes

10.3 SPSS (version 19)

10.3.1 Introduction to SPSS

10.3.2 Comparing cluster means using aggregate cluster-level analysis – carried out at the cluster level, using aggregate summary data

10.3.3 Marginal models

10.3.4 Random-effects models

10.4 Conclusion and further reading

References

Index

“Overall, the reviewers are enthusiastic about the book. The authors have covered all important areas of cRCTs, using a practical and pragmatic approach to the topic. The code is helpful for the practical implementation of the examples. The material is simple to understand, which will appeal to applied researchers, not only to biostatisticians. As such, we clearly recommend this book to all researchers interested in cRCTs. For biostatisticians involved in cRCTs and investigators of cRCTs, it is a must-have on the bookshelf.” (*Biometrical Journal*, 1 May 2015)