E-book

# Statistical Tools for the Comprehensive Practice of Industrial Hygiene and Environmental Health Sciences

ISBN: 978-1-119-35137-5
392 pages
December 2016

## Description

Reviews and reinforces concepts and techniques typical of a first statistics course with additional techniques useful to the IH/EHS practitioner.

• Includes both parametric and non-parametric techniques described and illustrated in a worker health and environmental protection practice context
• Illustrated through numerous examples presented in the context of IH/EHS field practice and research, using the statistical analysis tools available in Excel® wherever possible
• Emphasizes the application of statistical tools to IH/EHS-type data in order to answer IH/EHS-relevant questions
• Includes an instructor’s manual that parallels the textbook, with PowerPoint slides to help prepare lectures and answers for the Exercises section of each chapter

Preface xv

Acknowledgments xvii

1 Some Basic Concepts 1

1.1 Introduction 1

1.2 Physical versus Statistical Sampling 2

1.3 Representative Measures 3

1.4 Strategies for Representative Sampling 3

1.5 Measurement Precision 4

1.6 Probability Concepts 6

1.6.1 The Relative Frequency Approach 7

1.6.2 The Classical Approach – Probability Based on Deductive Reasoning 7

1.6.3 Subjective Probability 7

1.6.4 Complement of a Probability 7

1.6.5 Mutually Exclusive Events 8

1.6.6 Independent Events 8

1.6.7 Events that Are Not Mutually Exclusive 9

1.6.8 Marginal and Conditional Probabilities 9

1.6.9 Testing for Independence 11

1.7 Permutations and Combinations 12

1.7.1 Permutations for Sampling without Replacement 12

1.7.2 Permutations for Sampling with Replacement 13

1.7.3 Combinations 13

1.8 Introduction to Frequency Distributions 14

1.8.1 The Binomial Distribution 14

1.8.2 The Normal Distribution 16

1.8.3 The Chi-Square Distribution 20

1.9 Confidence Intervals and Hypothesis Testing 22

1.10 Summary 23

1.11 Addendum: Glossary of Some Useful Excel Functions 23

References 28

2 Descriptive Statistics and Methods of Presenting Data 29

2.1 Introduction 29

2.2 Quantitative Descriptors of Data and Data Distributions 29

2.3 Displaying Data with Frequency Tables 33

2.4 Displaying Data with Histograms and Frequency Polygons 34

2.5 Displaying Data Frequency Distributions with Cumulative Probability Plots 35

2.6 Displaying Data with NED and Q–Q Plots 38

2.7 Displaying Data with Box-and-Whisker Plots 41

2.8 Data Transformations to Achieve Normality 42

2.9 Identifying Outliers 43

2.10 What to Do with Censored Values? 45

2.11 Summary 45

References 48

3 Analysis of Frequency Data 49

3.1 Introduction 49

3.2 Tests for Association and Goodness-of-Fit 50

3.2.1 r × c Contingency Tables and the Chi-Square Test 50

3.2.2 Fisher’s Exact Test 54

3.3 Binomial Proportions 55

3.4 Rare Events and the Poisson Distribution 57

3.4.1 Poisson Probabilities 57

3.4.2 Confidence Interval on a Poisson Count 60

3.4.3 Testing for Fit with the Poisson Distribution 61

3.4.4 Comparing Two Poisson Rates 62

3.4.5 Type I Error, Type II Error, and Power 64

3.4.6 Power and Sample Size in Comparing Two Poisson Rates 64

3.5 Summary 65

References 69

4 Comparing Two Conditions 71

4.1 Introduction 71

4.2 Standard Error of the Mean 71

4.3 Confidence Interval on a Mean 72

4.4 The t-Distribution 73

4.5 Parametric One-Sample Test – Student’s t-Test 74

4.6 Two-Tailed versus One-Tailed Hypothesis Tests 76

4.7 Confidence Interval on a Variance 77

4.8 Other Applications of the Confidence Interval Concept in IH/EHS Work 79

4.8.1 OSHA Compliance Determinations 79

4.8.2 Laboratory Analyses – LOB, LOD, and LOQ 80

4.9 Precision, Power, and Sample Size for One Mean 81

4.9.1 Sample Size Required to Estimate a Mean with a Stated Precision 81

4.9.2 Sample Size Required to Detect a Specified Difference in Student’s t-Test 81

4.10 Iterative Solutions Using the Excel Goal Seek Utility 82

4.11 Parametric Two-Sample Tests 83

4.11.1 Confidence Interval for a Difference in Means: The Two-Sample t-Test 83

4.11.2 Two-Sample t-Test When Variances Are Equal 84

4.11.3 Verifying the Assumptions of the Two-Sample t-Test 85

4.11.3.1 Lilliefors Test for Normality 86

4.11.3.2 Shapiro–Wilk W-Test for Normality 87

4.11.3.3 Testing for Homogeneity of Variance 91

4.11.3.4 Transformations to Stabilize Variance 93

4.11.4 Two-Sample t-Test with Unequal Variances – Welch’s Test 93

4.11.5 Paired Sample t-Test 95

4.11.6 Precision, Power, and Sample Size for Comparing Two Means 96

4.12 Testing for Difference in Two Binomial Proportions 99

4.12.1 Testing a Binomial Proportion for Difference from a Known Value 100

4.12.2 Testing Two Binomial Proportions for Difference 100

4.13 Nonparametric Two-Sample Tests 102

4.13.1 Mann–Whitney U Test 102

4.13.2 Wilcoxon Matched Pairs Test 104

4.13.3 McNemar and Binomial Tests for Paired Nominal Data 105

4.14 Summary 107

References 111

5 Characterizing the Upper Tail of the Exposure Distribution 113

5.1 Introduction 113

5.2 Upper Tolerance Limits 113

5.3 Exceedance Fractions 115

5.4 Distribution Free Tolerance Limits 117

5.5 Summary 119

References 121

6 One-Way Analysis of Variance 123

6.1 Introduction 123

6.2 Parametric One-Way ANOVA 123

6.2.1 How the Parametric ANOVA Works – Sums of Squares and the F-Test 124

6.2.2 Post hoc Multiple Pairwise Comparisons in Parametric ANOVA 127

6.2.2.1 Tukey’s Test 127

6.2.2.2 Tukey–Kramer Test 128

6.2.2.3 Dunnett’s Test for Comparing Means to a Control Mean 130

6.2.2.4 Planned Contrasts Using the Scheffé S Test 132

6.2.3 Checking the ANOVA Model Assumptions – NED Plots and Variance Tests 134

6.2.3.1 Levene’s Test 134

6.2.3.2 Bartlett’s Test 135

6.3 Nonparametric Analysis of Variance 136

6.3.1 Kruskal–Wallis Nonparametric One-Way ANOVA 137

6.3.2 Post hoc Multiple Pairwise Comparisons in Nonparametric ANOVA 139

6.3.2.1 Nemenyi’s Test 139

6.3.2.2 Bonferroni–Dunn Test 140

6.4 ANOVA Disconnects 142

6.5 Summary 144

References 149

7 Two-Way Analysis of Variance 151

7.1 Introduction 151

7.2 Parametric Two-Way ANOVA 151

7.2.1 Two-Way ANOVA without Interaction 154

7.2.2 Checking for Homogeneity of Variance 154

7.2.3 Multiple Pairwise Comparisons When There Is No Interaction Term 154

7.2.4 Two-Way ANOVA with Interaction 156

7.2.5 Multiple Pairwise Comparisons with Interaction 158

7.2.6 Two-Way ANOVA without Replication 160

7.2.7 Repeated-Measures ANOVA 160

7.2.8 Two-Way ANOVA with Unequal Sample Sizes 162

7.3 Nonparametric Two-Way ANOVA 162

7.3.1 Rank Tests 162

7.3.1.1 The Rank Test 162

7.3.1.2 The Rank Transform Test 166

7.3.1.3 Other Options – Aligned Rank Tests 166

7.3.2 Repeated-Measures Nonparametric ANOVA – Friedman’s Test 166

7.3.2.1 Friedman’s Test without Replication 167

7.3.2.2 Multiple Comparisons for Friedman’s Test without Replication 169

7.3.2.3 Friedman’s Test with Replication 170

7.3.2.4 Multiple Comparisons for Friedman’s Test with Replication 172

7.4 More Powerful Non-ANOVA Approaches: Linear Modeling 172

7.5 Summary 172

References 178

8 Correlation Analysis 181

8.1 Introduction 181

8.2 Simple Parametric Correlation Analysis 181

8.2.1 Testing the Correlation Coefficient for Significance 184

8.2.1.1 t-Test for Significance 185

8.2.1.2 F-Test for Significance 186

8.2.2 Confidence Limits on the Correlation Coefficient 186

8.2.3 Power in Simple Correlation Analysis 187

8.2.4 Comparing Two Correlation Coefficients for Difference 188

8.2.5 Comparing More Than Two Correlation Coefficients for Difference 189

8.2.6 Multiple Pairwise Comparisons of Correlation Coefficients 190

8.3 Simple Nonparametric Correlation Analysis 190

8.3.1 Spearman Rank Correlation Coefficient 190

8.3.2 Testing Spearman’s Rank Correlation Coefficient for Statistical Significance 191

8.3.3 Correction to Spearman’s Rank Correlation Coefficient When There Are Tied Ranks 193

8.4 Multiple Correlation Analysis 195

8.4.1 Parametric Multiple Correlation 195

8.4.2 Nonparametric Multiple Correlation: Kendall’s Coefficient of Concordance 195

8.5 Determining Causation 198

8.6 Summary 198

References 204

9 Regression Analysis 205

9.1 Introduction 205

9.2 Linear Regression 205

9.2.1 Simple Linear Regression 207

9.2.2 Nonconstant Variance – Transformations and Weighted Least Squares Regression 209

9.2.3 Multiple Linear Regression 213

9.2.3.1 Multiple Regression in Excel 215

9.2.3.2 Multiple Regression Using the Excel Solver Utility 218

9.2.3.3 Multiple Regression Using Advanced Software Packages 221

9.2.4 Using Regression for Factorial ANOVA with Unequal Sample Sizes 222

9.2.5 Multiple Correlation Analysis Using Multiple Regression 227

9.2.5.1 Assumptions of Parametric Multiple Correlation 233

9.2.5.2 Options When Collinearity Is a Problem 233

9.2.6 Polynomial Regression 234

9.2.7 Interpreting Linear Regression Results 234

9.2.8 Linear Regression versus ANOVA 235

9.3 Logistic Regression 235

9.3.1 Odds and Odds Ratios 236

9.3.2 The Logit Transformation 238

9.3.3 The Likelihood Function 240

9.3.4 Logistic Regression in Excel 240

9.3.5 Likelihood Ratio Test for Significance of MLE Coefficients 241

9.3.6 Odds Ratio Confidence Limits in Multivariate Models 243

9.4 Poisson Regression 243

9.4.1 Poisson Regression Model 243

9.4.2 Poisson Regression in Excel 244

9.5 Regression with Excel Add-ons 245

9.6 Summary 246

References 252

10 Analysis of Covariance 253

10.1 Introduction 253

10.2 The Simple ANCOVA Model and Its Assumptions 253

10.2.1 Required Regressions 255

10.2.2 Checking the ANCOVA Assumptions 258

10.2.2.1 Linearity, Independence, and Normality 258

10.2.2.2 Similar Variances 258

10.2.2.3 Equal Regression Slopes 258

10.2.3 Testing and Estimating the Treatment Effects 259

10.3 The Two-Factor Covariance Model 261

Summary 261

References 263

11 Experimental Design 265

11.1 Introduction 265

11.2 Randomization 266

11.3 Simple Randomized Experiments 266

11.4 Experimental Designs Blocking on Categorical Factors 267

11.5 Randomized Full Factorial Experimental Design 270

11.6 Randomized Full Factorial Design with Blocking 271

11.7 Split Plot Experimental Designs 272

11.8 Balanced Experimental Designs – Latin Square 273

11.9 Two-Level Factorial Experimental Designs with Quantitative Factors 274

11.9.1 Two-Level Factorial Designs for Exploratory Studies 274

11.9.2 The Standard Order 275

11.9.3 Calculating Main Effects 276

11.9.4 Calculating Interactions 278

11.9.5 Estimating Standard Errors 278

11.9.6 Estimating Effects with REGRESSION in Excel 279

11.9.7 Interpretation 280

11.9.8 Cube, Surface, and NED Plots as an Aid to Interpretation 280

11.9.9 Fractional Factorial Two-Level Experiments 282

11.10 Summary 282

References 284

12 Uncertainty and Sensitivity Analysis 285

12.1 Introduction 285

12.2 Simulation Modeling 285

12.2.1 Propagation of Errors 286

12.2.2 Simple Bounding 287

12.2.2.1 Sums and Differences 287

12.2.2.2 Products and Ratios 287

12.2.2.3 Powers 289

12.2.3.1 Sums and Differences 289

12.2.3.2 Products and Ratios 290

12.2.3.3 Powers 292

12.2.4 LOD and LOQ Revisited – Dust Sample Gravimetric Analysis 292

12.3 Uncertainty Analysis 295

12.4 Sensitivity Analysis 296

12.4.1 One-at-a-Time (OAT) Analysis 296

12.4.2 Variance-Based Analysis 297

12.5 Further Reading on Uncertainty and Sensitivity Analysis 297

12.6 Monte Carlo Simulation 297

12.7 Monte Carlo Simulation in Excel 298

12.7.1 Generating Random Numbers in Excel 298

12.7.2 The Populated Spreadsheet Approach 299

12.7.3 Monte Carlo Simulation Using VBA Macros 299

12.8 Summary 303

References 307

13 Bayes’ Theorem and Bayesian Decision Analysis 309

13.1 Introduction 309

13.2 Bayes’ Theorem 310

13.3 Sensitivity, Specificity, and Positive and Negative Predictive Value in Screening Tests 310

13.4 Bayesian Decision Analysis in Exposure Control Banding 312

13.4.1 Introduction to BDA 312

13.4.2 The Prior Distribution and the Parameter Space 314

13.4.3 The Posterior Distribution and Likelihood Function 314

13.4.4 Relative Influences of the Prior and the Data 315

13.4.5 Frequentist versus Bayesian Perspectives 316

References 318

A z-Tables of the Standard Normal Distribution 321

B Critical Values of the Chi-Square Distribution 327

C Critical Values for the t-Distribution 329

D Critical Values for Lilliefors Test 331

Reference 332

E Shapiro–Wilk W Test 𝜶 Coefficients and Critical Values 333

References 336

F Critical Values of the F Distribution for 𝜶 = 0.05 337

G Critical U Values for the Mann–Whitney U Test 341

Reference 342

H Critical Wilcoxon Matched Pairs Test t Values 343

Reference 344

I K Values for Upper Tolerance Limits 345

Reference 346

J Exceedance Fraction 95% Lower Confidence Limit versus Z 347

References 347

K q Values for Tukey’s, Tukey–Kramer, and Nemenyi’s MSD Tests 349

L q′ Values for Dunnett’s Test 351

References 353

M Q Values for the Bonferroni–Dunn MSD Test 355

N Critical Spearman Rank Correlation Test Values 357

O Critical Values of Kendall’s W 359

Reference 361

Index 363
