
Quantitative Methods In Linguistics

ISBN: 978-1-4051-4424-7
296 pages
March 2008, Wiley-Blackwell
Quantitative Methods in Linguistics introduces the general strategies and methods of quantitative analysis. The book dedicates individual chapters to phonetics, psycholinguistics, sociolinguistics, historical linguistics, and syntax, as well as two introductory chapters on probability distribution and quantitative methods.

Each chapter uses actual data sets, contributed by researchers working in the field, to illustrate key principles. The book also provides detailed instruction in the practical aspects of handling quantitative linguistic data, using the statistical software package R to discover patterns in quantitative data and to test linguistic hypotheses. End-of-chapter assignments and a balanced presentation make this an ideal text for students.

Further information and resources are available from the accompanying website at www.blackwellpublishing.com/quantmethods.
Acknowledgments.

Design of the Book.

1. Fundamentals of Quantitative Analysis.

1.1 What We Accomplish in Quantitative Analysis.

1.2 How to Describe an Observation.

1.3 Frequency Distributions: A Fundamental Building Block of Quantitative Analysis.

1.4 Types of Distributions.

1.5 Is Normal Data, Well, Normal?

1.6 Measures of Central Tendency.

1.7 Measures of Dispersion.

1.8 Standard Deviation of the Normal Distribution.

Exercises.

2. Patterns and Tests.

2.1 Sampling.

2.2 Data.

2.3 Hypothesis Testing.

2.3.1 The Central Limit Theorem.

2.3.2 Score Keeping.

2.3.3 H0: µ = 100.

2.3.4 Type I and Type II Error.

2.4 Correlation.

2.4.1 Covariance and Correlation.

2.4.2 The Regression Line.

2.4.3 Amount of Variance Accounted For.

Exercises.

3. Phonetics.

3.1 Comparing Mean Values.

3.1.1 Cherokee Voice Onset Time: µ_1971 = µ_2001.

3.1.2 Samples Have Equal Variance.

3.1.3 If the Samples Do Not Have Equal Variance.

3.1.4 Paired t Test: Are Men Different from Women?

3.1.5 The Sign Test.

3.2 Predicting the Back of the Tongue from the Front: Multiple Regression.

3.2.1 The Covariance Matrix.

3.2.2 More than One Slope: The b_i.

3.2.3 Selecting a Model.

3.3 Tongue Shape Factors: Principal Components Analysis.

Exercises.

4. Psycholinguistics.

4.1 Analysis of Variance: One Factor, More than Two Levels.

4.2 Two Factors: Interaction.

4.3 Repeated Measures.

4.3.1 An Example of Repeated Measures ANOVA.

4.3.2 Repeated Measures ANOVA with a Between-Subjects Factor.

4.4 The “Language as Fixed Effect” Fallacy.

4.5 Exercises.

5. Sociolinguistics.

5.1 When the Data are Counts - Contingency Tables.

5.1.1 Frequency in a Contingency Table.

5.2 Working with Probabilities: The Binomial Distribution.

5.2.1 Bush or Kerry?

5.3 An Aside about Maximum Likelihood Estimation.

5.4 Logistic Regression.

5.5 An Example from the [ʃ]treets of Columbus.

5.5.1 On the Relationship between χ² and G².

5.5.2 More than One Predictor.

5.6 Logistic Regression as Regression: An Ordinal Effect - Age.

5.7 Varbrul/R Comparison.

Exercises.

6. Historical Linguistics.

6.1 Cladistics: Where Linguistics and Evolutionary Biology Meet.

6.2 Clustering on the Basis of Shared Vocabulary.

6.3 Cladistic Analysis: Combining Character-Based Subtrees.

6.4 Clustering on the Basis of Spelling Similarity.

6.5 Multidimensional Scaling: A Language Similarity Space.

Exercises.

7. Syntax.

7.1 Measuring Sentence Acceptability.

7.2 A Psychogrammatical Law?

7.3 Linear Mixed Effects in the Syntactic Expression of Agents in English.

7.3.1 Linear Regression: Overall, and Separately by Verbs.

7.3.2 Fitting a Linear Mixed-Effects Model: Fixed and Random Effects.

7.3.3 Fitting Five More Mixed-Effects Models: Finding the Best Model.

7.4 Predicting the Dative Alternation: Logistic Modeling of Syntactic Corpora Data.

7.4.1 Logistic Model of Dative Alternation.

7.4.2 Evaluating the Fit of the Model.

7.4.3 Adding a Random Factor: Mixed Effects Logistic Regression.

Exercises.

Appendix 7A.

References.

Index

Keith Johnson is Professor of Linguistics at the University of California at Berkeley. He is the author of Acoustic and Auditory Phonetics, Second Edition (Blackwell, 2002), as well as numerous articles on phonetics and speech perception.

  • Introduces the general strategies and methods of quantitative analysis for use in linguistic research
  • Provides balanced treatment of the practical aspects of handling quantitative linguistic data
  • Includes sample datasets contributed by researchers working in a variety of sub-disciplines of linguistics
  • Uses R, the statistical software package most commonly used by linguists, to discover patterns in quantitative data and to test linguistic hypotheses
  • Features student-friendly end-of-chapter assignments and is accompanied by online resources.
"As research in the language sciences becomes more interdisciplinary, students must become proficient in a wider range of data analysis methods. Johnson’s text is a comprehensive and detailed introduction to some of the most widely used statistical methods in language research. The book teaches by example, walking the reader through the analysis of data sets using the software package R, which provides concrete understanding of how to apply the methods, not just understand them conceptually. This is a good practical text, one that can serve as a handbook, and is appropriate for graduate students and advanced undergraduates who are doing research in the broad field of language." Mark A Pitt, Ohio State University

"Johnson's book is a catalyst for change in linguistics. Increasingly, the subjective, impressionistic data collection method is being replaced by objective, quantitative measurements. This book serves an important function in this process leading students step-by-step toward using statistical methods to analyze complex data." Chilin Shih, University of Illinois at Urbana-Champaign

"This rich and rewarding textbook is a must-read for all students and researchers who wish to follow the new wave of sophisticated empirical models and methods now sweeping the field of linguistics from phonetics to syntax and semantics." Joan Bresnan, Stanford University

Data Sets and Scripts
2. Patterns and Tests (.zip, 4.27 KB; .rar, 4.00 KB)
Files included in archive:
  • Script: Figure 2.1
  • Script: The central limit function from a uniform distribution (central.limit.unif)
  • Script: The central limit function from a skewed distribution (central.limit)
  • Script: The central limit function from a normal distribution (central.limit.norm)
  • Script: Figure 2.5
  • Script: Figure 2.6 (shade.tails)
  • Data: Male and female F1 frequency data (F1_data.txt)
  • Script: Explore the chi-square distribution (chisq)
3. Phonetics (.zip, 9.15 KB; .rar, 8.98 KB)
Files included in archive:
  • Data: Cherokee voice onset times (cherokeeVOT.txt)
  • Data: The tongue shape data (chaindata.txt)
  • Script: Commands to calculate and plot the first principal component of tongue shape (principal_components)
  • Script: Explore the F distribution (shade.tails.df)
  • Data: Made-up regression example (regression.txt)
4. Psycholinguistics (.zip, 16.10 KB; .rar, 9.26 KB)
Files included in archive:
  • Data: One observation of phonological priming per listener from Pitt & Shoaf's (2002) study
  • Data: One observation per listener from two groups (overlap versus no overlap) from Pitt & Shoaf's study
  • Data: Hypothetical data to illustrate repeated measures of analysis
  • Data: The full Pitt & Shoaf data set
  • Data: Reaction time data on perception of flap, /d/, and eth by Spanish-speaking and English-speaking listeners
  • Data: Luka & Barsalou (2005) "by subjects" data
  • Data: Luka & Barsalou (2005) "by items" data
  • Data: Boomershine's dialect identification data for exercise 5 (listed for the .zip archive only)
5. Sociolinguistics (.zip, 9.97 KB; .rar, 4.95 KB)
Files included in archive:
  • Data: Robin Dodsworth's preliminary data on /l/ vocalization in Worthington, Ohio
  • Data: Data from David Durian's rapid anonymous survey on /str/ in Columbus, Ohio
  • Data: Hope Dawson's Sanskrit data
6. Historical Linguistics (.zip, 139.78 KB; .rar, 132.64 KB)
Files included in archive:
  • Script: A script that draws Figure 6.1
  • Data: Dyen et al.'s (1984) distance matrix for 84 Indo-European languages based on the percentage of cognate words between languages
  • Data: A (rather arbitrary) subset of the Dyen et al. (1984) data coded as input to the Phylip program "pars"
  • Data: IE-lists.txt: A version of the Dyen et al. word lists that is readable in the scripts below
  • Script: make_dist: This Perl script tabulates all of the letters used in the Dyen et al. word lists
  • Script: get_IE_distance: This Perl script implements the "spelling distance" metric that was used to calculate distances between words in the Dyen et al. list
  • Script: make_matrix: Another Perl script. This one takes the output of get_IE_distance and writes it back out as a matrix that R can easily read
  • Data: A distance matrix produced from the spellings of words in the Dyen et al. (1984) dataset
  • Data: Distance matrix for eight Bantu languages from the Tanzanian Language Survey
  • Data: A phonetic distance matrix of Bantu languages from Ladefoged, Glick & Criper (1971)
  • Data: The TLS Bantu data arranged as input for phylogenetic parsimony analysis using the Phylip program "pars"
7. Syntax (.zip, 265.36 KB; .rar, 218.03 KB)
Files included in archive:
  • Data: Results from a magnitude estimation study
  • Data: Verb argument data from CoNLL-2005
  • Script: Cross-validation of linear mixed effects models
  • Data: Bresnan et al.'s dative alternation data