Preface xiii

Statistical packages xix

About the website xxi

**PART 1 PRESENTING DATA 1**

**1 Data types 3**

1.1 Does it really matter? 3

1.2 Interval scale data 4

1.3 Ordinal scale data 4

1.4 Nominal scale data 5

1.5 Structure of this book 6

1.6 Chapter summary 6

**2 Data presentation 7**

2.1 Numerical tables 8

2.2 Bar charts and histograms 9

2.3 Pie charts 14

2.4 Scatter plots 16

2.5 Pictorial symbols 21

2.6 Chapter summary 22

**PART 2 INTERVAL-SCALE DATA 23**

**3 Descriptive statistics for interval scale data 25**

3.1 Summarising data sets 25

3.2 Indicators of central tendency: Mean, median and mode 26

3.3 Describing variability – standard deviation and coefficient of variation 33

3.4 Quartiles – Another way to describe data 36

3.5 Describing ordinal data 40

3.6 Using computer packages to generate descriptive statistics 43

3.7 Chapter summary 45

**4 The normal distribution 47**

4.1 What is a normal distribution? 47

4.2 Identifying data that are not normally distributed 48

4.3 Proportions of individuals within 1SD or 2SD of the mean 52

4.4 Skewness and kurtosis 54

4.5 Chapter summary 57

4.6 Appendix: Power, sample size and the problem of attempting to test for a normal distribution 58

**5 Sampling from populations: The standard error of the mean 63**

5.1 Samples and populations 63

5.2 From sample to population 65

5.3 Types of sampling error 65

5.4 What factors control the extent of random sampling error when estimating a population mean? 68

5.5 Estimating likely sampling error – The SEM 70

5.6 Offsetting sample size against SD 74

5.7 Chapter summary 75

**6 95% Confidence interval for the mean and data transformation 77**

6.1 What is a confidence interval? 78

6.2 How wide should the interval be? 78

6.3 What do we mean by ‘95%’ confidence? 79

6.4 Calculating the interval width 80

6.5 A long series of samples and 95% C.I.s 81

6.6 How sensitive is the width of the C.I. to changes in the SD, the sample size or the required level of confidence? 82

6.7 Two statements 85

6.8 One-sided 95% C.I.s 85

6.9 The 95% C.I. for the difference between two treatments 88

6.10 The need for data to follow a normal distribution and data transformation 90

6.11 Chapter summary 94

**7 The two-sample t-test (1): Introducing hypothesis tests 95**

7.1 The two-sample t-test – an example of an hypothesis test 96

7.2 Significance 103

7.3 The risk of a false positive finding 104

7.4 What aspects of the data will influence whether or not we obtain a significant outcome? 106

7.5 Requirements for applying a two-sample t-test 108

7.6 Performing and reporting the test 109

7.7 Chapter summary 110

**8 The two-sample t-test (2): The dreaded P value 111**

8.1 Measuring how significant a result is 111

8.2 P values 112

8.3 Two ways to define significance? 113

8.4 Obtaining the P value 113

8.5 P values or 95% confidence intervals? 114

8.6 Chapter summary 115

**9 The two-sample t-test (3): False negatives, power and necessary sample sizes 117**

9.1 What else could possibly go wrong? 118

9.2 Power 119

9.3 Calculating necessary sample size 122

9.4 Chapter summary 130

**10 The two-sample t-test (4): Statistical significance, practical significance and equivalence 131**

10.1 Practical significance – Is the difference big enough to matter? 131

10.2 Equivalence testing 135

10.3 Non-inferiority testing 139

10.4 P values are less informative and can be positively misleading 141

10.5 Setting equivalence limits prior to experimentation 143

10.6 Chapter summary 144

**11 The two-sample t-test (5): One-sided testing 145**

11.1 Looking for a change in a specified direction 146

11.2 Protection against false positives 148

11.3 Temptation! 149

11.4 Using a computer package to carry out a one-sided test 153

11.5 Chapter summary 153

**12 What does a statistically significant result really tell us? 155**

12.1 Interpreting statistical significance 155

12.2 Starting from extreme scepticism 159

12.3 Bayesian statistics 160

12.4 Chapter summary 161

**13 The paired t-test: Comparing two related sets of measurements 163**

13.1 Paired data 163

13.2 We could analyse the data by a two-sample t-test 165

13.3 Using a paired t-test instead 165

13.4 Performing a paired t-test 166

13.5 What determines whether a paired t-test will be significant? 169

13.6 Greater power of the paired t-test 170

13.7 Applicability of the test 170

13.8 Choice of experimental design 171

13.9 Requirement for applying a paired t-test 172

13.10 Sample sizes, practical significance and one-sided tests 173

13.11 Summarising the differences between paired and two-sample t-tests 175

13.12 Chapter summary 175

**14 Analyses of variance: Going beyond t-tests 177**

14.1 Extending the complexity of experimental designs 177

14.2 One-way analysis of variance 178

14.3 Two-way analysis of variance 188

14.4 Fixed and random factors 198

14.5 Multi-factorial experiments 204

14.6 Chapter summary 204

**15 Correlation and regression – Relationships between measured values 207**

15.1 Correlation analysis 208

15.2 Regression analysis 218

15.3 Multiple regression 225

15.4 Chapter summary 235

**16 Analysis of covariance 237**

16.1 A clinical trial where ANCOVA would be appropriate 238

16.2 General interpretation of ANCOVA results 239

16.3 Analysis of the COPD trial results 241

16.4 Advantages of ANCOVA over a simple two-sample t-test 244

16.5 Chapter summary 249

**PART 3 NOMINAL-SCALE DATA 251**

**17 Describing categorised data and the goodness of fit chi-square test 253**

17.1 Descriptive statistics 254

17.2 Testing whether the population proportion might credibly be some pre-determined figure 258

17.3 Chapter summary 264

**18 Contingency chi-square, Fisher’s and McNemar’s tests 265**

18.1 Using the contingency chi-square test to compare observed proportions 266

18.2 Extent of change in proportion with an expulsion – Clinically significant? 270

18.3 Larger tables – Attendance at diabetic clinics 270

18.4 Planning experimental size 273

18.5 Fisher’s exact test 275

18.6 McNemar’s test 277

18.7 Chapter summary 279

18.8 Appendix 280

**19 Relative risk, odds ratio and number needed to treat 283**

19.1 Measures of treatment effect – relative risk, odds ratio and number needed to treat 283

19.2 Similarity between relative risk and odds ratio 287

19.3 Interpreting the various measures 288

19.4 95% confidence intervals for measures of effect size 289

19.5 Chapter summary 293

**20 Logistic regression 295**

20.1 Modelling a binary outcome 295

20.2 Additional predictors and the problem of confounding 304

20.3 Analysis by computer package 307

20.4 Extending logistic regression beyond dichotomous outcomes 308

20.5 Chapter summary 309

20.6 Appendix 309

**PART 4 ORDINAL-SCALE DATA 311**

**21 Ordinal and non-normally distributed data: Transformations and non-parametric tests 313**

21.1 Transforming data to a normal distribution 314

21.2 The Mann–Whitney test – a non-parametric method 318

21.3 Dealing with ordinal data 323

21.4 Other non-parametric methods 325

21.5 Chapter summary 333

21.6 Appendix 334

**PART 5 OTHER TOPICS 337**

**22 Measures of agreement 339**

22.1 Answers to several questions 340

22.2 Several answers to one question – do they agree? 344

22.3 Chapter summary 358

**23 Survival analysis 361**

23.1 What special problems arise with survival data? 362

23.2 Kaplan–Meier survival estimation 363

23.3 Declining sample sizes in survival studies 369

23.4 Precision of sampling estimates of survival 369

23.5 Indicators of survival 371

23.6 Testing for differences in survival 374

23.7 Chapter summary 383

**24 Multiple testing 385**

24.1 What is it and why is it a problem? 385

24.2 Where does multiple testing arise? 386

24.3 Methods to avoid false positives 388

24.4 The role of scientific journals 392

24.5 Chapter summary 393

**25 Questionnaires 395**

25.1 Types of questions 396

25.2 Sample sizes and low return rates 398

25.3 Analysing the results 399

25.4 Problem number two: Confounded questionnaire data 401

25.5 Problem number three: Multiple testing with questionnaire data 401

25.6 Chapter summary 403

Index 405