Wiley.com

Exploration and Analysis of DNA Microarray and Other High-Dimensional Data, 2nd Edition

ISBN: 978-1-118-35633-3
344 pages
March 2014

Description

Praise for the First Edition

“…extremely well written…a comprehensive and up-to-date overview of this important field.” – Journal of Environmental Quality

 

Exploration and Analysis of DNA Microarray and Other High-Dimensional Data, Second Edition provides comprehensive coverage of recent advancements in microarray data analysis. A cutting-edge guide, the Second Edition demonstrates various methodologies for analyzing data in biomedical research and offers an overview of the modern techniques used in microarray technology to study patterns of gene activity.

 

The new edition answers the need for an efficient outline of all phases of this revolutionary analytical technique, from preprocessing to the analysis stage. Drawing on the research and experience of highly qualified authors in the field of data analysis, Exploration and Analysis of DNA Microarray and Other High-Dimensional Data, Second Edition features:

  • A new chapter on the interpretation of findings that includes a discussion of signatures and material on gene set analysis, including network analysis
  • New topics, including ABC clustering, biclustering, partial least squares, penalized methods, ensemble methods, and enriched ensemble methods
  • Updated exercises to deepen knowledge of the presented material and provide readers with resources for further study

 

The book is an ideal reference for scientists in biomedical and genomics research fields who analyze DNA microarrays and protein array data, as well as statisticians and bioinformatics practitioners. Exploration and Analysis of DNA Microarray and Other High-Dimensional Data, Second Edition is also a useful text for graduate-level courses on statistics, computational biology, and bioinformatics.

Table of Contents

Preface xv

Acknowledgments xvii

1 A brief introduction 1

1.1 A note on exploratory data analysis 3

1.2 Computing considerations and software 4

1.3 A brief outline of the book 5

1.4 Datasets and case studies 7

2 Genomics basics 11

2.1 Genes 11

2.2 DNA 12

2.3 Gene expression 13

2.4 Hybridization assays and other laboratory techniques 15

2.5 The human genome 16

2.6 Genome variations and their consequences 18

2.7 Genomics 19

2.8 The role of genomics in pharmaceutical research and clinical practice 20

2.9 Proteins 23

2.10 Bioinformatics 23

3 Microarrays 27

3.1 Types of microarray experiments 28

3.2 A very simple hypothetical microarray experiment 32

3.3 A typical microarray experiment 34

3.4 Multichannel cDNA microarrays 38

3.5 Oligonucleotide microarrays 38

3.6 Bead-based arrays 40

3.7 Confirmation of microarray results 40

4 Processing the scanned image 43

4.1 Converting the scanned image to the spotted image 44

4.2 Quality assessment 47

4.3 Adjusting for background 53

4.4 Expression level calculation for two-channel cDNA microarrays 56

4.5 Expression level calculation for oligonucleotide microarrays 58

5 Preprocessing microarray data 65

5.1 Logarithmic transformation 66

5.2 Variance stabilizing transformations 66

5.3 Sources of bias 68

5.4 Normalization 69

5.5 Intensity-dependent normalization 70

5.6 Judging the success of a normalization 81

5.7 Outlier identification 83

5.8 Nonresistant rules for outlier identification 83

5.9 Resistant rules for outlier identification 83

5.10 Assessing replicate array quality 84

6 Summarization 95

6.1 Replication 95

6.2 Technical replicates 96

6.3 Biological replicates 100

6.4 Biological replicates 100

6.5 Multiple oligonucleotide arrays 102

6.6 Estimating fold change in two-channel experiments 104

6.7 Bayes estimation of fold change 105

6.8 Estimating fold change in Affymetrix data 106

6.9 RMA Summarization of multiple oligonucleotide arrays revisited 107

6.10 FARMS summarization 108

7 Two group comparative experiments 119

7.1 Basics of statistical hypothesis testing 120

7.2 Fold changes 123

7.3 The two sample t test 123

7.4 Diagnostic checks 127

7.5 Robust t tests 129

7.6 The Mann-Whitney-Wilcoxon rank sum test 130

7.7 Multiplicity 132

7.8 The false discovery rate 135

7.9 Resampling-based Multiple Testing Procedures 138

7.10 Small variance adjusted t tests and SAM 140

7.11 Conditional t 146

7.12 Borrowing strength across genes 149

7.13 Two-channel experiments 151

7.14 Filtering 153

8 Model-based inference and experimental design considerations 177

8.1 The F test 178

8.2 The basic linear model 179

8.3 Fitting the model in two stages 181

8.4 Multichannel experiments 182

8.5 Experimental design considerations 183

8.6 Miscellaneous issues 187

8.7 Model-based analysis of Affymetrix arrays 188

9 Analysis of gene sets 211

9.1 Methods for identifying enriched gene sets 213

9.2 ORA and Fisher’s exact test 217

9.3 Interpretation of results 217

9.4 Example 217

10 Pattern discovery 221

10.1 Initial considerations 222

10.2 Cluster analysis 223

10.3 Seeking patterns visually 241

10.4 Biclustering 254

11 Class prediction 263

11.1 Initial considerations 264

11.2 Linear Discriminant Analysis 269

11.3 Extensions of Fisher’s LDA 275

11.4 Penalized methods 278

11.5 Nearest neighbors 279

11.6 Recursive partitioning 280

11.7 Ensemble methods 285

11.8 Enriched ensemble classifiers 288

11.9 Neural networks 288

11.10 Support Vector Machines 289

11.11 Generalized enriched methods 291

11.12 Integration of genome information 301

12 Protein arrays 307

12.1 Introduction 307

12.2 Protein array experiments 308

12.3 Special issues with protein arrays 310

12.4 Analysis 310

12.5 Using antibody-antigen arrays to measure protein concentrations 311

References 317

Index 337

Author Information

DHAMMIKA AMARATUNGA, PhD, is Senior Director and Janssen Fellow in the Nonclinical Statistics and Computing Department at Janssen R&D, a Johnson & Johnson pharmaceutical company. His research interests include analysis of large multivariate data sets generated by functional genomics research, robust and resistant statistical methods, linear and nonlinear modeling, and biostatistics.

JAVIER CABRERA, PhD, is Full Professor in the Department of Statistics at Rutgers University. He has published over 100 articles in his areas of research interest, which include DNA microarray, data mining of biopharmaceutical databases, computer vision, statistical computing and graphics, robustness, and biostatistics. He has also lectured at Cold Spring Harbor Laboratory, The Hong Kong University of Science and Technology, and National University of Singapore.

ZIV SHKEDY, PhD, is Associate Professor and Statistical Consultant in the Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Center for Statistics at Hasselt University, Belgium. He has published numerous journal articles on the topics of clinical and non-clinical trials, modeling infectious disease data, dose-response analysis, Bayesian modeling, bioinformatics, and analysis of gene expression data.

Reviews

“Featuring new information on interpretation of findings, class prediction, ABC clustering, limma for mixed models, biclustering, mass spectrometry, tracking Spearman correlations, and more, this ‘extremely well written’ (Journal of Environmental Quality) book is a choice reference for scientists, teachers, and students interested in DNA data analysis.”  (Zentralblatt MATH, 1 October 2014)

“In summary this is an excellent text for both life scientist and computer/mathematicians.  Highly recommended.”  (Scientific Computing, 1 August 2014)

 
