# Statistics: An Introduction Using R, 2e

ISBN: 978-1-118-94109-6 | November 2014 | 354 pages

Paperback

## Description

A revised and updated edition of this bestselling introduction to statistical analysis using the leading free software package R.

In recent years R has become one of the most popular, powerful and flexible statistical software packages available. It enables users to apply a wide variety of statistical methods, ranging from simple regression to generalized linear modelling, and has been widely adopted by life scientists and social scientists. This new edition offers a concise introduction to a broad array of statistical methods, at a level elementary enough to suit a wide range of disciplines. Step-by-step instructions help the non-statistician understand the methodology fully.

The book covers the statistical techniques most likely to be needed to analyse the data from research projects: elementary material such as t tests and chi-squared tests, intermediate methods such as regression and analysis of variance, and more advanced techniques such as generalized linear modelling. Each chapter includes numerous worked examples and exercises.

• Comprehensively revised to include more detailed introductory material on working with R
• Updated to be compatible with R version 3 (current at the time of publication)
• Complete coverage of all the essential statistical methods
• Focus on linear models (regression, analysis of variance and analysis of covariance) and generalized linear models (for count data, proportion data and age-at-death data)
• Now includes more detail on experimental design
• Accompanied by a website featuring worked examples, data sets, exercises and solutions: www.imperial.ac.uk/bio/research/crawley/statistics

Statistics: An Introduction Using R is aimed primarily at undergraduate students in medicine, engineering, economics and biology, but will also appeal to postgraduates in these areas who wish to switch to using R.

## Table of Contents

Preface

Chapter 1: Fundamentals

• Everything Varies
• Significance
• Null Hypotheses
• p Values
• Interpretation
• Model Choice
• Statistical Modelling
• Maximum Likelihood
• Experimental Design
• The Principle of Parsimony (Occam’s Razor)
• Observation, Theory and Experiment
• Controls
• Replication: It’s the ns that Justify the Means
• How Many Replicates?
• Power
• Randomization
• Strong Inference
• Weak Inference
• How Long to Go On?
• Pseudoreplication
• Initial Conditions
• Orthogonal Designs and Non-Orthogonal Observational Data
• Aliasing
• Multiple Comparisons
• Summary of Statistical Models in R
• Housekeeping within R
• References

Chapter 2: Dataframes

• Selecting Parts of a Dataframe: Subscripts
• Sorting
• Summarizing the Content of Dataframes
• Summarizing by Explanatory Variables
• First Things First: Get to Know Your Data
• Relationships
• Looking for Interactions between Continuous Variables
• Graphics to Help with Multiple Regression
• Interactions Involving Categorical Variables

Chapter 3: Central Tendency

Chapter 4: Variance

• Degrees of Freedom
• Variance
• Variance: A Worked Example
• Variance and Sample Size
• Using Variance
• A Measure of Unreliability
• Confidence Intervals
• Bootstrap
• Non-constant Variance: Heteroscedasticity

Chapter 5: Single Samples

• Data Summary in the One-Sample Case
• The Normal Distribution
• Calculations Using z of the Normal Distribution
• Plots for Testing Normality of Single Samples
• Inference in the One-Sample Case
• Bootstrap in Hypothesis Testing with Single Samples
• Student’s t Distribution
• Higher-Order Moments of a Distribution
• Skew
• Kurtosis
• Reference

Chapter 6: Two Samples

• Comparing Two Variances
• Comparing Two Means
• Student’s t Test
• Wilcoxon Rank-Sum Test
• Tests on Paired Samples
• The Binomial Test
• Binomial Tests to Compare Two Proportions
• Chi-Squared Contingency Tables
• Fisher’s Exact Test
• Correlation and Covariance
• Correlation and the Variance of Differences between Variables
• Scale-Dependent Correlations
• Reference

Chapter 7: Regression

• Linear Regression
• Linear Regression in R
• Calculations Involved in Linear Regression
• Partitioning Sums of Squares in Regression: SSY = SSR + SSE
• Measuring the Degree of Fit, r²
• Model Checking
• Transformation
• Polynomial Regression
• Non-Linear Regression
• Influence

Chapter 8: Analysis of Variance

• One-Way ANOVA
• Shortcut Formulas
• Effect Sizes
• Plots for Interpreting One-Way ANOVA
• Factorial Experiments
• Pseudoreplication: Nested Designs and Split Plots
• Split-Plot Experiments
• Random Effects and Nested Designs
• Fixed or Random Effects?
• Removing the Pseudoreplication
• Analysis of Longitudinal Data
• Derived Variable Analysis
• Dealing with Pseudoreplication
• Variance Components Analysis (VCA)
• References

Chapter 9: Analysis of Covariance

Chapter 10: Multiple Regression

• The Steps Involved in Model Simplification
• Caveats
• Order of Deletion
• Carrying Out a Multiple Regression
• A Trickier Example

Chapter 11: Contrasts

• Contrast Coefficients
• An Example of Contrasts in R
• A Priori Contrasts
• Treatment Contrasts
• Model Simplification by Stepwise Deletion
• Contrast Sums of Squares by Hand
• The Three Kinds of Contrasts Compared
• Reference

Chapter 12: Other Response Variables

• Introduction to Generalized Linear Models
• The Error Structure
• The Linear Predictor
• Fitted Values
• A General Measure of Variability
• Akaike’s Information Criterion (AIC) as a Measure of the Fit of a Model

Chapter 13: Count Data

• A Regression with Poisson Errors
• Analysis of Deviance with Count Data
• The Danger of Contingency Tables
• Analysis of Covariance with Count Data
• Frequency Distributions

Chapter 14: Proportion Data

• Analyses of Data on One and Two Proportions
• Averages of Proportions
• Count Data on Proportions
• Odds
• Overdispersion and Hypothesis Testing
• Applications
• Logistic Regression with Binomial Errors
• Proportion Data with Categorical Explanatory Variables
• Analysis of Covariance with Binomial Data

Chapter 15: Binary Response Variable

• Incidence Functions
• ANCOVA with a Binary Response Variable

Chapter 16: Death and Failure Data

• Survival Analysis with Censoring

Appendix: Essentials of the R Language

• R as a Calculator
• Built-in Functions
• Numbers with Exponents
• Modulo and Integer Quotients
• Assignment
• Rounding
• Infinity and Things that Are Not a Number (NaN)
• Missing Values (NA)
• Operators
• Creating a Vector
• Named Elements within Vectors
• Vector Functions
• Summary Information from Vectors by Groups
• Subscripts and Indices
• Working with Vectors and Logical Subscripts
• Trimming Vectors Using Negative Subscripts
• Logical Arithmetic
• Repeats
• Generate Factor Levels
• Generating Regular Sequences of Numbers
• Matrices
• Character Strings
• Writing Functions in R
• Arithmetic Mean of a Single Sample
• Median of a Single Sample
• Loops and Repeats
• The ifelse Function
• Evaluating Functions with apply
• Testing for Equality
• Testing and Coercing in R
• Dates and Times in R
• Calculations with Dates and Times
• Understanding the Structure of an R Object Using str
• Reference