Ebook
Mathematical Statistics with Resampling and R
ISBN: 9781119013570
432 pages
October 2015

Description
Resampling helps students understand the meaning of sampling distributions, sampling variability, P-values, hypothesis tests, and confidence intervals. This groundbreaking book shows how to apply modern resampling techniques to mathematical statistics. Extensively class-tested to ensure an accessible presentation, Mathematical Statistics with Resampling and R utilizes the powerful and flexible computer language R to underscore the significance and benefits of modern resampling techniques.
The book begins by introducing permutation tests and bootstrap methods, motivating classical inference methods. Striking a balance between theory, computing, and applications, the authors explore additional topics such as:
 Exploratory data analysis
 Calculation of sampling distributions
 The Central Limit Theorem
 Monte Carlo sampling
 Maximum likelihood estimation and properties of estimators
 Confidence intervals and hypothesis tests
 Regression
 Bayesian methods
Throughout the book, case studies on diverse subjects such as flight delays, birth weights of babies, and telephone company repair times illustrate the relevance of the real-world applications of the discussed material. Key definitions and theorems of important probability distributions are collected at the end of the book, and a related website is also available, featuring additional material including data sets, R scripts, and helpful teaching hints.
Mathematical Statistics with Resampling and R is an excellent book for courses on mathematical statistics at the upper-undergraduate and graduate levels. It also serves as a valuable reference for applied statisticians working in the areas of business, economics, biostatistics, and public health who utilize resampling methods in their everyday work.
Table of Contents
Preface xiii
1 Data and Case Studies 1
1.1 Case Study: Flight Delays 1
1.2 Case Study: Birth Weights of Babies 2
1.3 Case Study: Verizon Repair Times 3
1.4 Sampling 3
1.5 Parameters and Statistics 5
1.6 Case Study: General Social Survey 5
1.7 Sample Surveys 6
1.8 Case Study: Beer and Hot Wings 8
1.9 Case Study: Black Spruce Seedlings 8
1.10 Studies 8
1.11 Exercises 10
2 Exploratory Data Analysis 13
2.1 Basic Plots 13
2.2 Numeric Summaries 16
2.2.1 Center 17
2.2.2 Spread 18
2.2.3 Shape 19
2.3 Boxplots 19
2.4 Quantiles and Normal Quantile Plots 20
2.5 Empirical Cumulative Distribution Functions 24
2.6 Scatter Plots 26
2.7 Skewness and Kurtosis 28
2.8 Exercises 30
3 Hypothesis Testing 35
3.1 Introduction to Hypothesis Testing 35
3.2 Hypotheses 36
3.3 Permutation Tests 38
3.3.1 Implementation Issues 42
3.3.2 One-Sided and Two-Sided Tests 47
3.3.3 Other Statistics 48
3.3.4 Assumptions 51
3.4 Contingency Tables 52
3.4.1 Permutation Test for Independence 54
3.4.2 Chi-Square Reference Distribution 57
3.5 Chi-Square Test of Independence 58
3.6 Test of Homogeneity 61
3.7 Goodness-of-Fit: All Parameters Known 63
3.8 Goodness-of-Fit: Some Parameters Estimated 66
3.9 Exercises 68
4 Sampling Distributions 77
4.1 Sampling Distributions 77
4.2 Calculating Sampling Distributions 82
4.3 The Central Limit Theorem 84
4.3.1 CLT for Binomial Data 87
4.3.2 Continuity Correction for Discrete Random Variables 89
4.3.3 Accuracy of the Central Limit Theorem 90
4.3.4 CLT for Sampling Without Replacement 91
4.4 Exercises 92
5 The Bootstrap 99
5.1 Introduction to the Bootstrap 99
5.2 The Plug-In Principle 106
5.2.1 Estimating the Population Distribution 107
5.2.2 How Useful Is the Bootstrap Distribution? 109
5.3 Bootstrap Percentile Intervals 113
5.4 Two Sample Bootstrap 114
5.4.1 The Two Independent Populations Assumption 119
5.5 Other Statistics 120
5.6 Bias 122
5.7 Monte Carlo Sampling: The “Second Bootstrap Principle” 125
5.8 Accuracy of Bootstrap Distributions 125
5.8.1 Sample Mean: Large Sample Size 126
5.8.2 Sample Mean: Small Sample Size 127
5.8.3 Sample Median 127
5.9 How Many Bootstrap Samples are Needed? 129
5.10 Exercises 129
6 Estimation 135
6.1 Maximum Likelihood Estimation 135
6.1.1 Maximum Likelihood for Discrete Distributions 136
6.1.2 Maximum Likelihood for Continuous Distributions 139
6.1.3 Maximum Likelihood for Multiple Parameters 143
6.2 Method of Moments 146
6.3 Properties of Estimators 148
6.3.1 Unbiasedness 148
6.3.2 Efficiency 151
6.3.3 Mean Square Error 155
6.3.4 Consistency 157
6.3.5 Transformation Invariance 160
6.4 Exercises 161
7 Classical Inference: Confidence Intervals 167
7.1 Confidence Intervals for Means 167
7.1.1 Confidence Intervals for a Mean, σ Known 167
7.1.2 Confidence Intervals for a Mean, σ Unknown 172
7.1.3 Confidence Intervals for a Difference in Means 178
7.2 Confidence Intervals in General 183
7.2.1 Location and Scale Parameters 186
7.3 One-Sided Confidence Intervals 189
7.4 Confidence Intervals for Proportions 191
7.4.1 The Agresti–Coull Interval for a Proportion 193
7.4.2 Confidence Interval for the Difference of Proportions 194
7.5 Bootstrap t Confidence Intervals 195
7.5.1 Comparing Bootstrap t and Formula t Confidence Intervals 200
7.6 Exercises 200
8 Classical Inference: Hypothesis Testing 211
8.1 Hypothesis Tests for Means and Proportions 211
8.1.1 One Population 211
8.1.2 Comparing Two Populations 215
8.2 Type I and Type II Errors 221
8.2.1 Type I Errors 221
8.2.2 Type II Errors and Power 226
8.3 More on Testing 231
8.3.1 On Significance 231
8.3.2 Adjustments for Multiple Testing 232
8.3.3 P-values Versus Critical Regions 233
8.4 Likelihood Ratio Tests 234
8.4.1 Simple Hypotheses and the Neyman–Pearson Lemma 234
8.4.2 Generalized Likelihood Ratio Tests 237
8.5 Exercises 239
9 Regression 247
9.1 Covariance 247
9.2 Correlation 251
9.3 Least-Squares Regression 254
9.3.1 Regression Toward the Mean 258
9.3.2 Variation 259
9.3.3 Diagnostics 261
9.3.4 Multiple Regression 265
9.4 The Simple Linear Model 266
9.4.1 Inference for α and β 270
9.4.2 Inference for the Response 273
9.4.3 Comments About Assumptions for the Linear Model 277
9.5 Resampling Correlation and Regression 279
9.5.1 Permutation Tests 282
9.5.2 Bootstrap Case Study: Bushmeat 283
9.6 Logistic Regression 286
9.6.1 Inference for Logistic Regression 291
9.7 Exercises 294
10 Bayesian Methods 301
10.1 Bayes’ Theorem 302
10.2 Binomial Data, Discrete Prior Distributions 302
10.3 Binomial Data, Continuous Prior Distributions 309
10.4 Continuous Data 316
10.5 Sequential Data 319
10.6 Exercises 322
11 Additional Topics 327
11.1 Smoothed Bootstrap 327
11.1.1 Kernel Density Estimate 328
11.2 Parametric Bootstrap 331
11.3 The Delta Method 335
11.4 Stratified Sampling 339
11.5 Computational Issues in Bayesian Analysis 340
11.6 Monte Carlo Integration 341
11.7 Importance Sampling 346
11.7.1 Ratio Estimate for Importance Sampling 352
11.7.2 Importance Sampling in Bayesian Applications 355
11.8 Exercises 359
Appendix A Review of Probability 363
A.1 Basic Probability 363
A.2 Mean and Variance 364
A.3 The Mean of a Sample of Random Variables 366
A.4 The Law of Averages 367
A.5 The Normal Distribution 368
A.6 Sums of Normal Random Variables 369
A.7 Higher Moments and the Moment Generating Function 370
Appendix B Probability Distributions 373
B.1 The Bernoulli and Binomial Distributions 373
B.2 The Multinomial Distribution 374
B.3 The Geometric Distribution 376
B.4 The Negative Binomial Distribution 377
B.5 The Hypergeometric Distribution 378
B.6 The Poisson Distribution 379
B.7 The Uniform Distribution 381
B.8 The Exponential Distribution 381
B.9 The Gamma Distribution 382
B.10 The Chi-Square Distribution 385
B.11 The Student’s t Distribution 388
B.12 The Beta Distribution 390
B.13 The F Distribution 391
B.14 Exercises 393
Appendix C Distributions Quick Reference 395
Solutions to Odd-Numbered Exercises 399
Bibliography 407
Index 413
Author Information
TIM HESTERBERG, PhD, is Senior Ads Quality Statistician at Google. He was a senior research scientist for Insightful Corporation and led the development of S+Resample and other S+ and R software. Dr. Hesterberg has published numerous articles in the areas of bootstrap and related resampling techniques, Monte Carlo simulation methodology, modern regression, tectonic deformation estimation, and electric demand forecasting.
Reviews
"It is highly recommended to someone with a good background in mathematics, probability, and basic statistics who wants to learn about the theory and about resampling and how it relates to traditional methods, and how to implement resamplinjg in R. The book is also a wonderful source of simulations to support the teaching of statistics." (Journal of Biopharmaceutical Statistics, 2011)
"It is less demanding mathematically, more applied in its emphasis, and more modern in content than the usual book, which makes it a good choice if you want a modern applied book at the level of Larsen and Marx (1986)." George W. Cobb, Mount Holyoke College Department of Mathematics and Statsitics (Chilean Journal of Statistics, 1 April 2011)