An Introduction to Categorical Data Analysis, 2nd Edition

Alan Agresti

ISBN: 978-0-470-11474-2

May 2007

400 pages



The first edition of this text has sold over 19,600 copies. Since its publication, the use of statistical methods for categorical data has increased dramatically, particularly for applications in the biomedical and social sciences, and this second edition of the introductory version of the book reflects that growth. In 2003, Wiley also published a second edition of Categorical Data Analysis, a more advanced and more technical text.


Preface to the Second Edition

1. Introduction

1.1 Categorical Response Data

1.2 Probability Distributions for Categorical Data

1.3 Statistical Inference for a Proportion

1.4 More on Statistical Inference for Discrete Data

Problems

2. Contingency Tables

2.1 Probability Structure for Contingency Tables

2.2 Comparing Proportions in Two-by-Two Tables

2.3 The Odds Ratio

2.4 Chi-Squared Tests of Independence

2.5 Testing Independence for Ordinal Data

2.6 Exact Inference for Small Samples

2.7 Association in Three-Way Tables

Problems

3. Generalized Linear Models

3.1 Components of a Generalized Linear Model

3.2 Generalized Linear Models for Binary Data

3.3 Generalized Linear Models for Count Data

3.4 Statistical Inference and Model Checking

3.5 Fitting Generalized Linear Models

Problems

4. Logistic Regression

4.1 Interpreting the Logistic Regression Model

4.2 Inference for Logistic Regression

4.3 Logistic Regression with Categorical Predictors

4.4 Multiple Logistic Regression

4.5 Summarizing Effects in Logistic Regression

Problems

5. Building and Applying Logistic Regression Models

5.1 Strategies in Model Selection

5.2 Model Checking

5.3 Effects of Sparse Data

5.4 Conditional Logistic Regression and Exact Inference

5.5 Sample Size and Power for Logistic Regression

Problems

6. Multicategory Logit Models

6.1 Logit Models for Nominal Responses

6.2 Cumulative Logit Models for Ordinal Responses

6.3 Paired-Category Ordinal Logits

6.4 Tests of Conditional Independence

Problems

7. Loglinear Models for Contingency Tables

7.1 Loglinear Models for Two-Way and Three-Way Tables

7.2 Inference for Loglinear Models

7.3 The Loglinear–Logistic Connection

7.4 Independence Graphs and Collapsibility

7.5 Modeling Ordinal Associations

Problems

8. Models for Matched Pairs

8.1 Comparing Dependent Proportions

8.2 Logistic Regression for Matched Pairs

8.3 Comparing Margins of Square Contingency Tables

8.4 Symmetry and Quasi-Symmetry Models for Square Tables

8.5 Analyzing Rater Agreement

8.6 Bradley–Terry Model for Paired Preferences

Problems

9. Modeling Correlated Clustered Responses

9.1 Marginal Models Versus Conditional Models

9.2 Marginal Modeling: The GEE Approach

9.3 Extending GEE: Multinomial Responses

9.4 Transitional Modeling Given the Past

Problems

10. Random Effects: Generalized Linear Mixed Models

10.1 Random Effects Modeling of Clustered Categorical Data

10.2 Examples of Random Effects Models for Binary Data

10.3 Extensions to Multinomial Responses or Multiple Random Effect Terms

10.4 Multilevel (Hierarchical) Models

10.5 Model Fitting and Inference for GLMMs

Problems

11. A Historical Tour of Categorical Data Analysis

11.1 The Pearson–Yule Association Controversy

11.2 R. A. Fisher’s Contributions

11.3 Logistic Regression

11.4 Multiway Contingency Tables and Loglinear Models

11.5 Final Comments

Appendix A: Software for Categorical Data Analysis

Appendix B: Chi-Squared Distribution Values

Bibliography

Index of Examples

Subject Index

Brief Solutions to Some Odd-Numbered Problems

  • Second edition of one of the best-selling books on categorical data analysis, from one of the most authoritative authors in the field.
  • Features new chapters on marginal models, including the generalized estimating equations (GEE) approach, and on random effects models.
  • Existing material, including SAS and SPSS data sets, is updated to reflect technical advances since the publication of the first edition.
  • Introductory material on generalized linear models now includes information on negative binomial regression.
  • Written at a relatively low technical level; does not require familiarity with advanced mathematics such as calculus or matrix algebra.

"Yes, I fully recommend the text as a basis for introductory course, for students, as well as non-specialists in statistics. The wealth of examples provided in the text is, from my point of view, a rich source of motivating one's own studies and work." (Biometrical Journal, December 2008)

"This text does a good job of achieving its stated goal, and we enthusiastically recommend it." (Journal of the American Statistical Association, September 2008)

"This book is very well-written and it is obvious that the author knows the subject inside out." (Journal of Applied Statistics, April 2008)

"Provides an applied introduction to the most important methods for analyzing categorical data, such as chi-squared tests and logistic regression." (Statistica 2008)

"This is an introductory book and as such it is marvelous...essential for a novice..." (MAA Reviews, June 26, 2007)