
Mathematical Models for Speech Technology

Stephen Levinson

ISBN: 978-0-470-02090-6

May 2005

282 pages



Mathematical Models for Speech Technology presents the motivations for, intuitions behind, and basic mathematical models of natural spoken language communication. It gives a comprehensive overview of all aspects of the problem, from the physics of speech production through the hierarchy of linguistic structure, ending with some observations on language and mind.

The author comprehensively explores the argument that modern speech technologies are actually the most extensive compilations of linguistic knowledge available. Throughout the book, the emphasis is on placing all the material in a mathematically coherent and computationally tractable framework that captures linguistic structure.

It presents material that appears nowhere else and unifies the formalisms and perspectives used by linguists and engineers. Its unique features include: a coherent nomenclature that emphasizes the deep connections among the diverse mathematical models and explores the methods by which they capture linguistic structure, in contrast to the superficial similarities described in the existing literature; the historical background and origins of the theories and models; the connections to related disciplines, e.g. artificial intelligence, automata theory and information theory; an elucidation of the current debates and their intellectual origins; many important little-known results and some original proofs of fundamental results, e.g. a geometric interpretation of parameter estimation techniques for stochastic models; and finally the author's own unique perspectives on the future of this discipline.

There is a vast literature on speech recognition and synthesis; however, this book is unlike any other in the field. Although speech technology appears to be a rapidly advancing field, its fundamentals have not changed in decades. Most results are presented in journals, from which it is difficult to integrate and evaluate all of the recent ideas. Some of the fundamentals have been collected into textbooks, which give detailed descriptions of the techniques but no motivation or perspective. The linguistic texts are mostly descriptive and pictorial, lacking the mathematical and computational aspects. This book strikes a useful balance by covering a wide range of ideas in a common framework. It provides all the basic algorithms and computational techniques, together with an analysis and perspective that allow one to read the latest literature intelligently and understand state-of-the-art techniques as they evolve.

Author's preface.



2.1  The physics of speech production

2.2  The source-filter model

2.3  Information-bearing features of the speech signal

2.4  Time-frequency representations

2.5  Classifications of acoustic patterns in speech

2.6  Temporal invariance and stationarity

2.7  Taxonomy of linguistic structure

3  Mathematical models of linguistic structure

3.1  Probabilistic functions of a discrete Markov process

3.2  Formal grammars and abstract automata

4  Syntactic analysis

4.1  Deterministic parsing algorithms

4.2  Probabilistic parsing algorithms

4.3  Parsing natural language

5  Grammatical inference

5.1  Exact inference and Gold's theorem

5.2  Baum's algorithm for regular grammars

5.3  Event counting in parse trees

5.4  Baker's algorithm for context-free grammars

6  Information-theoretic analysis of speech communication

6.1  The Miller et al. experiments

6.2  Entropy of an information source

6.3  Recognition error rates and entropy

7  Automatic speech recognition and constructive theories of language

7.1  Integrated architectures

7.2  Modular architectures

7.3  Parameter estimation from fluent speech

7.4  System performance

7.5  Other speech technologies

8  Automatic speech understanding and semantics

8.1  Transcription and comprehension

8.2  Limited domain semantics

8.3  The semantics of natural language

8.4  System architectures

8.5  Human and machine performance

9  Theories of mind and language

9.1  The challenge of automatic natural language understanding

9.2  Metaphors for mind

9.3  The artificial intelligence program

10  A speculation on the prospects for a science of the mind

10.1  The parable of the thermos bottle: measurements and symbols

10.2  The four questions of science

10.3  A constructive theory of the mind

10.4  The problem of consciousness

10.5  The role of sensorimotor function, associative memory and reinforcement learning in automatic acquisition of spoken language by an autonomous robot

10.6  Final thoughts: predicting the course of discovery

"...a succinct presentation of the most important mathematical technology of speech technology and the author's ideas for overcoming the limitations of these techniques…" (Mathematical Reviews, 2005j)