Wiley.com

Classic Topics on the History of Modern Mathematical Statistics: From Laplace to More Recent Times

ISBN: 978-1-119-12792-5
754 pages
April 2016

Description

"There is nothing like it on the market...no others are as encyclopedic...the writing is exemplary: simple, direct, and competent."
—George W. Cobb, Professor Emeritus of Mathematics and Statistics, Mount Holyoke College

Written in a direct and clear manner, Classic Topics on the History of Modern Mathematical Statistics: From Laplace to More Recent Times presents a comprehensive guide to the history of mathematical statistics and details the major results and crucial developments over a 200-year period. Presented in chronological order, the book features an account of the classical and modern works that are essential to understanding the applications of mathematical statistics.

Divided into three parts, the book begins with extensive coverage of the probabilistic works of Laplace, who laid many of the foundations for later developments in statistical theory. The second part introduces 20th-century statistical developments, including work from Karl Pearson, Student, Fisher, and Neyman. Lastly, the author addresses post-Fisherian developments. Classic Topics on the History of Modern Mathematical Statistics: From Laplace to More Recent Times also features:

  • A detailed account of Galton's discovery of regression and correlation as well as the subsequent development of Karl Pearson's χ² and Student's t
  • A comprehensive treatment of the permeating influence of Fisher in all aspects of modern statistics beginning with his work in 1912
  • Significant coverage of Neyman–Pearson theory, which includes a discussion of how it differs from Fisher’s work
  • Discussions on key historical developments as well as the various disagreements, contrasting information, and alternative theories in the history of modern mathematical statistics in an effort to provide a thorough historical treatment

Classic Topics on the History of Modern Mathematical Statistics: From Laplace to More Recent Times is an excellent reference for academicians with a mathematical background who are teaching or studying the history or philosophical controversies of mathematics and statistics. The book is also a useful guide for readers with a general interest in statistical inference.

Table of Contents

Preface xvi

Acknowledgments xix

Introduction: LANDMARKS IN PRE-LAPLACEAN STATISTICS xx

PART ONE: LAPLACE 1

1 The Laplacean Revolution 3

1.1 Pierre-Simon de Laplace (1749–1827) 3

1.2 Laplace’s Work in Probability and Statistics 7

1.2.1 “Mémoire sur les suites récurro-récurrentes” (1774): Definition of Probability 7

1.2.2 “Mémoire sur la probabilité des causes par les événements” (1774) 9

1.2.2.1 Bayes’ Theorem 9

1.2.2.2 Rule of Succession 13

1.2.2.3 Proof of Inverse Bernoulli Law. Method of Asymptotic Approximation. Central Limit Theorem for Posterior Distribution. Indirect Evaluation of $\int_0^\infty e^{-t^2}\,dt$ 14

1.2.2.4 Problem of Points 18

1.2.2.5 First Law of Error 19

1.2.2.6 Principle of Insufficient Reason (Indifference) 24

1.2.2.7 Conclusion 25

1.2.3 “Recherches sur l’intégration des équations différentielles aux différences finies” (1776) 25

1.2.3.1 Integration of Difference Equations. Problem of Points 25

1.2.3.2 Moral Expectation. On d’Alembert 26

1.2.4 “Mémoire sur l’inclinaison moyenne des orbites” (1776): Distribution of Finite Sums, Test of Significance 28

1.2.5 “Recherches sur le milieu qu’il faut choisir entre les résultats de plusieurs observations” (1777): Derivation of Double Logarithmic Law of Error 35

1.2.6 “Mémoire sur les probabilités” (1781) 42

1.2.6.1 Introduction 42

1.2.6.2 Double Logarithmic Law of Error 44

1.2.6.3 Definition of Conditional Probability. Proof of Bayes’ Theorem 46

1.2.6.4 Proof of Inverse Bernoulli Law Refined 50

1.2.6.5 Method of Asymptotic Approximation Refined 53

1.2.6.6 Stirling’s Formula 58

1.2.6.7 Direct Evaluation of $\int_0^\infty e^{-t^2}\,dt$ 59

1.2.6.8 Theory of Errors 60

1.2.7 “Mémoire sur les suites” (1782) 62

1.2.7.1 De Moivre and Generating Functions 62

1.2.7.2 Lagrange’s Calculus of Operations as an Impetus for Laplace’s Generating Functions 65

1.2.8 “Mémoire sur les approximations des formules qui sont fonctions de très grands nombres” (1785) 70

1.2.8.1 Method of Asymptotic Approximation Revisited 70

1.2.8.2 Stirling’s Formula Revisited 73

1.2.8.3 Genesis of Characteristic Functions 74

1.2.9 “Mémoire sur les approximations des formules qui sont fonctions de très grands nombres (suite)” (1786): Philosophy of Probability and Universal Determinism, Recognition of Need for Normal Probability Tables 78

1.2.10 “Sur les naissances” (1786): Solution of the Problem of Births by Using Inverse Probability 79

1.2.11 “Mémoire sur les approximations des formules qui sont fonctions de très grands nombres et sur leur application aux probabilités” (1810): Second Phase of Laplace’s Statistical Career, Laplace’s First Proof of the Central Limit Theorem 83

1.2.12 “Supplément au Mémoire sur les approximations des formules qui sont fonctions de très grands nombres et sur leur application aux probabilités” (1810): Justification of Least Squares Based on Inverse Probability, The Gauss–Laplace Synthesis 90

1.2.13 “Mémoire sur les intégrales définies et leur application aux probabilités, et spécialement à la recherche du milieu qu’il faut choisir entre les résultats des observations” (1811): Laplace’s Justification of Least Squares Based on Direct Probability 90

1.2.14 Théorie Analytique des Probabilités (1812): The de Moivre–Laplace Theorem 90

1.2.15 Laplace’s Probability Books 92

1.2.15.1 Théorie Analytique des Probabilités (1812) 92

1.2.15.2 Essai Philosophique sur les Probabilités (1814) 95

1.3 The Principle of Indifference 98

1.3.1 Introduction 98

1.3.2 Bayes’ Postulate 99

1.3.3 Laplace’s Rule of Succession. Hume’s Problem of Induction 102

1.3.4 Bertrand’s and Other Paradoxes 106

1.3.5 Invariance 108

1.4 Fourier Transforms, Characteristic Functions, and Central Limit Theorems 113

1.4.1 The Fourier Transform: From Taylor to Fourier 114

1.4.2 Laplace’s Fourier Transforms of 1809 120

1.4.3 Laplace’s Use of the Fourier Transform to Solve a Differential Equation (1810) 122

1.4.4 Lagrange’s 1776 Paper: A Precursor to the Characteristic Function 123

1.4.5 The Concept of Characteristic Function Introduced: Laplace in 1785 127

1.4.6 Laplace’s Use of the Characteristic Function in his First Proof of the Central Limit Theorem (1810) 128

1.4.7 Characteristic Function of the Cauchy Distribution: Laplace in 1811 128

1.4.8 Characteristic Function of the Cauchy Distribution: Poisson in 1811 131

1.4.9 Poisson’s Use of the Characteristic Function in his First Proof of the Central Limit Theorem (1824) 134

1.4.10 Poisson’s Identification of the Cauchy Distribution (1824) 138

1.4.11 First Modern Rigorous Proof of the Central Limit Theorem: Lyapunov in 1901 139

1.4.12 Further Extensions: Lindeberg (1922), Lévy (1925), and Feller (1935) 148

1.5 Least Squares and the Normal Distribution 149

1.5.1 First Publication of the Method of Least Squares: Legendre in 1805 149

1.5.2 Adrain’s Research Concerning the Probabilities of Errors (1808): Two Proofs of the Normal Law 152

1.5.3 Gauss’ First Justification of the Principle of Least Squares (1809) 159

1.5.3.1 Gauss’ Life 159

1.5.3.2 Derivation of the Normal Law. Postulate of the Arithmetic Mean 159

1.5.3.3 Priority Dispute with Legendre 163

1.5.4 Laplace in 1810: Justification of Least Squares Based on Inverse Probability, the Gauss–Laplace Synthesis 166

1.5.5 Laplace’s Justification of Least Squares Based on Direct Probability (1811) 169

1.5.6 Gauss’ Second Justification of the Principle of Least Squares in 1823: The Gauss–Markov Theorem 177

1.5.7 Hagen’s Hypothesis of Elementary Errors (1837) 182

PART TWO: FROM GALTON TO FISHER 185

2 Galton, Regression, and Correlation 187

2.1 Francis Galton (1822–1911) 187

2.2 Genesis of Regression and Correlation 190

2.2.1 Galton’s 1877 Paper, “Typical Laws of Heredity”: Reversion 190

2.2.2 Galton’s Quincunx (1873) 195

2.2.3 Galton’s 1885 Presidential Lecture and Subsequent Related Papers: Regression, Discovery of the Bivariate Normal Surface 197

2.2.4 First Appearance of Correlation (1888) 206

*2.2.5 Some Results on Regression Based on the Bivariate Normal Distribution: Regression to the Mean Mathematically Explained 209

2.2.5.1 Basic Results Based on the Bivariate Normal Distribution 209

2.2.5.2 Regression to the Mean Mathematically Explained 211

2.3 Further Developments after Galton 211

2.3.1 Weldon (1890; 1892; 1893) 211

2.3.2 Edgeworth in 1892: First Systematic Study of the Multivariate Normal Distribution 213

2.3.3 Origin of Pearson’s r (Pearson et al. 1896) 220

2.3.4 Standard Error of r (Pearson et al. 1896; Pearson and Filon 1898; Student 1908; Soper 1913) 224

2.3.5 Development of Multiple Regression, Galton’s Law of Ancestral Heredity, First Explicit Derivation of the Multivariate Normal Distribution (Pearson et al. 1896) 230

2.3.5.1 Development of Multiple Regression. Galton’s Law of Ancestral Heredity 230

2.3.5.2 First Explicit Derivation of the Multivariate Normal Distribution 233

2.3.6 Marriage of Regression with Least Squares (Yule 1897) 237

2.3.7 Correlation Coefficient for a 2 × 2 Table (Yule 1900). Feud Between Pearson and Yule 244

2.3.8 Intraclass Correlation (Pearson 1901; Harris 1913; Fisher 1921; 1925) 253

2.3.9 First Derivation of the Exact Distribution of r (Fisher 1915) 258

2.3.10 Controversy between Pearson and Fisher on the Latter’s Alleged Use of Inverse Probability (Soper et al. 1917; Fisher 1921) 264

2.3.11 The Logarithmic (or Z-) Transformation (Fisher 1915; 1921) 267

*2.3.12 Derivation of the Logarithmic Transformation 270

2.4 Work on Correlation and the Bivariate (and Multivariate) Normal Distribution Before Galton 270

2.4.1 Lagrange’s Derivation of the Multivariate Normal Distribution from the Multinomial Distribution (1776) 271

2.4.2 Adrain’s Use of the Multivariate Normal Distribution (1808) 275

2.4.3 Gauss’ Use of the Multivariate Normal Distribution in the Theoria Motus (1809) 275

2.4.4 Laplace’s Derivation of the Joint Distribution of Linear Combinations of Two Errors (1811) 276

2.4.5 Plana on the Joint Distribution of Two Linear Combinations of Random Variables (1813) 276

2.4.6 Bravais’ Determination of Errors in Coordinates (1846) 281

2.4.7 Bullet Shots on a Target: Bertrand’s Derivation of the Bivariate Normal Distribution (1888) 288

3 Karl Pearson’s Chi-Squared Goodness-of-Fit Test 293

3.1 Karl Pearson (1857–1936) 293

3.2 Origin of Pearson’s Chi-Squared 297

3.2.1 Pearson’s Work on Goodness of Fit Before 1900 297

3.2.2 Pearson’s 1900 Paper 299

3.3 Pearson’s Error and Clash with Fisher 306

3.3.1 Error by Pearson on the Chi-Squared When Parameters Are Estimated (1900) 306

3.3.2 Greenwood and Yule’s Observation (1915) 308

3.3.3 Fisher’s 1922 Proof of the Chi-Squared Distribution: Origin of Degrees of Freedom 311

*3.3.4 Further Details on Degrees of Freedom 313

3.3.5 Reaction to Fisher’s 1922 Paper: Yule (1922), Bowley and Connor (1923), Brownlee (1924), and Pearson (1922) 314

3.3.6 Fisher’s 1924 Argument: “Coup de Grâce” in 1926 315

3.3.6.1 The 1924 Argument 315

3.3.6.2 ‘Coup de Grâce’ in 1926 317

3.4 The Chi-Squared Distribution Before Pearson 318

3.4.1 Bienaymé’s Derivation of Simultaneous Confidence Regions (1852) 318

3.4.2 Abbe on the Distribution of Errors in a Series of Observations (1863) 331

3.4.3 Helmert on the Distribution of the Sum of Squares of Residuals (1876): The Helmert Transformations 336

*3.4.4 Derivation of the Transformations Used by Helmert 344

4 Student’s t 348

4.1 William Sealy Gosset (1876–1937) 348

4.2 Origin of Student’s Test: The 1908 Paper 351

4.3 Further Developments 358

4.3.1 Fisher’s Geometrical Derivation of 1923 358

4.3.2 From Student’s z to Student’s t 360

4.4 Student Anticipated 363

4.4.1 Helmert on the Independence of the Sample Mean and Sample Variance in a Normal Distribution (1876) 363

4.4.2 Lüroth and the First Derivation of the t-Distribution (1876) 363

4.4.3 Edgeworth’s Derivation of the t-Distribution Based on Inverse Probability (1883) 369

5 The Fisherian Legacy 371

5.1 Ronald Aylmer Fisher (1890–1962) 371

5.2 Fisher and the Foundation of Estimation Theory 374

5.2.1 Fisher’s 1922 Paper: Consistency, Efficiency, and Sufficiency 374

5.2.1.1 Introduction 374

5.2.1.2 The Criterion of Consistency 375

5.2.1.3 The Criterion of Efficiency 377

5.2.1.4 The Criterion of Sufficiency 377

5.2.2 Genesis of Sufficiency in 1920 378

5.2.3 First Appearance of “Maximum Likelihood” in the 1922 Paper 385

5.2.4 The Method of Moments and its Criticism by Fisher (Pearson 1894; Fisher 1912; 1922) 390

5.2.5 Further Refinement of the 1922 Paper in 1925: Efficiency and Information 396

5.2.6 First Appearance of “Ancillary” Statistics in the 1925 Paper: Relevant Subsets, Conditional Inference, and the Likelihood Principle 403

5.2.6.1 First Appearance of “Ancillary” Statistics 403

5.2.6.2 Relevant Subsets. Conditional Inference 412

5.2.6.3 Likelihood Inference 417

5.2.7 Further Extensions: Inconsistency of MLEs (Neyman and Scott 1948), Inadmissibility of MLEs (Stein 1956), Nonuniqueness of MLEs (Moore 1971) 419

5.2.8 Further Extensions: Nonuniqueness of Ancillaries and of Relevant Subsets (Basu 1964) 421

5.3 Fisher and Significance Testing 423

5.3.1 Significance Testing for the Correlation Coefficient (Student 1908; Soper 1913; Fisher 1915; 1921) 423

5.3.2 Significance Testing for a Regression Coefficient (Fisher 1922) 424

5.3.3 Significance Testing Using the Two-Sample t-test Assuming a Common Population Variance (Fisher 1922) 427

5.3.4 Significance Testing for Two Population Variances (Fisher 1924) 428

5.3.5 Statistical Methods for Research Workers (Fisher 1925) 429

5.4 ANOVA and the Design of Experiments 431

5.4.1 Birth and Development of ANOVA (Fisher and Mackenzie 1923; Fisher 1925) 431

5.4.2 Randomization, Replication, and Blocking (Fisher 1925; 1926), Latin Square (Fisher 1925), Analysis of Covariance (Fisher 1932) 441

5.4.2.1 Randomization 441

5.4.2.2 Replication 442

5.4.2.3 Blocking 442

5.4.2.4 Latin Square 444

5.4.2.5 Analysis of Covariance 445

5.4.3 Controversy with Student on Randomization (1936–1937) 448

5.4.4 Design of Experiments (Fisher 1935) 456

5.5 Fisher and Probability 458

5.5.1 Formation of Probability Ideas: Likelihood, Hypothetical Infinite Populations, Rejection of Inverse Probability 458

5.5.2 Fiducial Probability and the Behrens-Fisher Problem 462

5.5.2.1 The Fiducial Argument (1930) 462

5.5.2.2 Neyman’s Confidence Intervals (1934) 467

5.5.2.3 The Behrens-Fisher Problem (1935) 470

5.5.2.4 Controversy with Bartlett (1936–1939) 473

5.5.2.5 Welch’s Approximations (1938, 1947) 476

5.5.2.6 Criticism of Welch’s Solution (1956) 483

5.5.3 Clash with Jeffreys on the Nature of Probability (1932–1934) 487

5.6 Fisher Versus Neyman–Pearson: Clash of the Titans 502

5.6.1 The Neyman-Pearson Collaboration 502

5.6.1.1 The Creation of a New Paradigm for Hypothesis Testing in 1926 502

5.6.1.2 The ‘Big Paper’ of 1933 514

5.6.2 Warm Relationships in 1926–1934 520

5.6.3 1935: The Latin Square and the Start of an Ongoing Dispute 522

5.6.4 Fisher’s Criticisms (1955, 1956, 1960) 528

5.6.4.1 Introduction 528

5.6.4.2 Repeated Sampling 528

5.6.4.3 Type II Errors 532

5.6.4.4 Inductive Behavior 534

5.6.4.5 Conclusion 536

5.7 Maximum Likelihood before Fisher 536

5.7.1 Lambert and the Multinomial Distribution (1760) 536

5.7.2 Lagrange on the Average of Several Measurements (1776) 541

5.7.3 Daniel Bernoulli on the Choice of the Average Among Several Observations (1778) 544

5.7.4 Adrain’s Two Derivations of the Normal Law (1808) 550

5.7.5 Edgeworth and the Genuine Inverse Method (1908, 1909) 550

5.8 Significance Testing before Fisher 555

5.8.1 Arbuthnot on Divine Providence: The First Published Test of a Statistical Hypothesis (1710) 555

5.8.2 ’s Gravesande on the Arbuthnot Problem (1712) 562

5.8.3 Nicholas Bernoulli on the Arbuthnot Problem: Disagreement with ’s Gravesande and Improvement of James Bernoulli’s Theorem (1712) 565

5.8.4 Daniel Bernoulli on the Inclination of the Planes of the Planetary Orbits (1735). Criticism by d’Alembert (1767) 571

5.8.5 Michell on the Random Distribution of Stars (1767): Clash Between Herschel and Forbes (1849) 578

5.8.5.1 Michell on the Random Distribution of Stars (1767) 578

5.8.5.2 Clash Between Herschel and Forbes (1849) 582

5.8.6 Laplace on the Mean Inclination of the Orbit of Comets (1776) 588

5.8.7 Edgeworth’s “Methods of Statistics” (1884) 588

5.8.8 Karl Pearson’s Chi-Squared Goodness-of-Fit Test (1900) 590

5.8.9 Student’s Small-Sample Statistics (1908) 590

PART THREE: FROM DARMOIS TO ROBBINS 591

6 Beyond Fisher and Neyman–Pearson 593

6.1 Extensions to the Theory of Estimation 593

6.1.1 Distributions Admitting a Sufficient Statistic 594

6.1.1.1 Fisher (1934) 594

6.1.1.2 Darmois (1935) 595

6.1.1.3 Koopman (1936) 597

6.1.1.4 Pitman (1936) 599

6.1.2 The Cramér–Rao Inequality 602

6.1.2.1 Introduction 602

6.1.2.2 Aitken & Silverstone (1942) 603

6.1.2.3 Fréchet (1943) 607

6.1.2.4 Rao (1945) 611

6.1.2.5 Cramér (1946) 614

6.1.3 The Rao–Blackwell Theorem 618

6.1.3.1 Rao (1945) 618

6.1.3.2 Blackwell (1947) 620

6.1.4 The Lehmann–Scheffé Theorem 624

6.1.4.1 Introduction 624

6.1.4.2 The Lehmann-Scheffé Theorem. Completeness (1950) 626

6.1.4.3 Minimal Sufficiency and Bounded Complete Sufficiency (1950) 629

6.1.5 The Ancillarity–Completeness–Sufficiency Connection: Basu’s Theorem (1955) 630

6.1.6 Further Extensions: Sharpening of the CR Inequality (Bhattacharyya 1946), Variance Inequality without Regularity Assumptions (Chapman and Robbins 1951) 632

6.2 Estimation and Hypothesis Testing Under a Single Framework: Wald’s Statistical Decision Theory (1950) 634

6.2.1 Wald’s Life 634

6.2.2 Statistical Decision Theory: Nonrandomized and Randomized Decision Functions, Risk Functions, Admissibility, Bayes, and Minimax Decision Functions 636

6.2.3 Hypothesis Testing as a Statistical Decision Problem 641

6.2.4 Estimation as a Statistical Decision Problem 642

6.2.5 Statistical Decision as a Two-Person Zero-Sum Game 643

6.3 The Bayesian Revival 645

6.3.1 Ramsey (1926): Degree of Belief, Ethically Neutral Propositions, Ramsey’s Representation Theorem, Calibrating the Utility Scale, Measuring Degree of Belief, and The Dutch Book 646

6.3.2 De Finetti (1937): The Subjective Theory, Exchangeability, De Finetti’s Representation Theorem, Solution to the Problem of Induction, and Prevision 656

6.3.3 Savage (1954): The Seven Postulates, Qualitative Probability, Quantitative Personal Probability, Savage’s Representation Theorem, and Expected Utility 667

6.3.4 A Breakthrough in “Bayesian” Methods: Robbins’ Empirical Bayes (1956) 674

References 681

Index 714

Author Information

Prakash Gorroochurn, PhD, is Associate Professor in the Mailman School of Public Health's Department of Biostatistics at Columbia University, where he is also a statistical consultant in the School of Social Work. Dr. Gorroochurn has published in the fields of history of probability and statistics, mathematical population genetics, and genetic epidemiology. He is the author of Classic Problems of Probability, which is the winner of the 2012 PROSE Award for Mathematics from The American Publishers Awards for Professional and Scholarly Excellence.
