Wiley

Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications

ISBN: 978-0-470-50519-9
1080 pages
February 2011
This book represents the most comprehensive and up-to-date collection of information on the topic of computational molecular biology. Bringing the most recent research into the forefront of discussion, Algorithms in Computational Molecular Biology studies the most important and useful algorithms currently being used in the field, and provides related problems. It also succeeds where other titles have failed, in offering a wide range of information from the introductory fundamentals right up to the latest, most advanced levels of study.
PREFACE.

CONTRIBUTORS.

I STRINGS PROCESSING AND APPLICATION TO BIOLOGICAL SEQUENCES.

1 STRING DATA STRUCTURES FOR COMPUTATIONAL MOLECULAR BIOLOGY (Christos Makris and Evangelos Theodoridis).

1.1 Introduction.

1.2 Main String Indexing Data Structures.

1.3 Index Structures for Weighted Strings.

1.4 Index Structures for Indeterminate Strings.

1.5 String Data Structures in Memory Hierarchies.

1.6 Conclusions.

2 EFFICIENT RESTRICTED-CASE ALGORITHMS FOR PROBLEMS IN COMPUTATIONAL BIOLOGY (Patricia A. Evans and H. Todd Wareham).

2.1 The Need for Special Cases.

2.2 Assessing Efficient Solvability Options for General Problems and Special Cases.

2.3 String and Sequence Problems.

2.4 Shortest Common Superstring.

2.5 Longest Common Subsequence.

2.6 Common Approximate Substring.

2.7 Conclusion.

3 FINITE AUTOMATA IN PATTERN MATCHING (Jan Holub).

3.1 Introduction.

3.2 Direct Use of DFA in Stringology.

3.3 NFA Simulation.

3.4 Finite Automaton as Model of Computation.

3.5 Finite Automata Composition.

3.6 Summary.

4 NEW DEVELOPMENTS IN PROCESSING OF DEGENERATE SEQUENCES (Pavlos Antoniou and Costas S. Iliopoulos).

4.1 Introduction.

4.2 Background.

4.3 Basic Definitions.

4.4 Repetitive Structures in Degenerate Strings.

4.5 Conservative String Covering in Degenerate Strings.

4.6 Conclusion.

5 EXACT SEARCH ALGORITHMS FOR BIOLOGICAL SEQUENCES (Eric Rivals, Leena Salmela, and Jorma Tarhio).

5.1 Introduction.

5.2 Single Pattern Matching Algorithms.

5.3 Algorithms for Multiple Patterns.

5.4 Application of Exact Set Pattern Matching for Read Mapping.

5.5 Conclusions.

6 ALGORITHMIC ASPECTS OF ARC-ANNOTATED SEQUENCES (Guillaume Blin, Maxime Crochemore, and Stephane Vialette).

6.1 Introduction.

6.2 Preliminaries.

6.3 Longest Arc-Preserving Common Subsequence.

6.4 Arc-Preserving Subsequence.

6.5 Maximum Arc-Preserving Common Subsequence.

6.6 Edit Distance.

7 ALGORITHMIC ISSUES IN DNA BARCODING PROBLEMS (Bhaskar DasGupta, Ming-Yang Kao, and Ion Măndoiu).

7.1 Introduction.

7.2 Test Set Problems: A General Framework for Several Barcoding Problems.

7.3 A Synopsis of Biological Applications of Barcoding.

7.4 Survey of Algorithmic Techniques on Barcoding.

7.5 Information Content Approach.

7.6 Set-Covering Approach.

7.7 Experimental Results and Software Availability.

7.8 Concluding Remarks.

8 RECENT ADVANCES IN WEIGHTED DNA SEQUENCES (Manolis Christodoulakis and Costas S. Iliopoulos).

8.1 Introduction.

8.2 Preliminaries.

8.3 Indexing.

8.4 Pattern Matching.

8.5 Approximate Pattern Matching.

8.6 Repetitions, Covers, and Tandem Repeats.

8.7 Motif Discovery.

8.8 Conclusions.

9 DNA COMPUTING FOR SUBGRAPH ISOMORPHISM PROBLEM AND RELATED PROBLEMS (Sun-Yuan Hsieh, Chao-Wen Huang, and Hsin-Hung Chou).

9.1 Introduction.

9.2 Definitions of Subgraph Isomorphism Problem and Related Problems.

9.3 DNA Computing Models.

9.4 The Sticker-based Solution Space.

9.5 Algorithms for Solving Problems.

9.6 Experimental Data.

9.7 Conclusion.

II ANALYSIS OF BIOLOGICAL SEQUENCES.

10 GRAPHS IN BIOINFORMATICS (Elsa Chacko and Shoba Ranganathan).

10.1 Graph theory—Origin.

10.2 Graphs and the Biological World.

10.3 Conclusion.

11 A FLEXIBLE DATA STORE FOR MANAGING BIOINFORMATICS DATA (Bassam A. Alqaralleh, Chen Wang, Bing Bing Zhou, and Albert Y. Zomaya).

11.1 Introduction.

11.2 Data Model and System Overview.

11.3 Replication and Load Balancing.

11.4 Evaluation.

11.5 Related Work.

11.6 Summary.

12 ALGORITHMS FOR THE ALIGNMENT OF BIOLOGICAL SEQUENCES (Ahmed Mokaddem and Mourad Elloumi).

12.1 Introduction.

12.2 Alignment Algorithms.

12.3 Score Functions.

12.4 Benchmarks.

12.5 Conclusion.

13 ALGORITHMS FOR LOCAL STRUCTURAL ALIGNMENT AND STRUCTURAL MOTIF IDENTIFICATION (Sanguthevar Rajasekaran, Vamsi Kundeti, and Martin Schiller).

13.1 Introduction.

13.2 Problem Definition of Local Structural Alignment.

13.3 Variable-Length Alignment Fragment Pair (VLAFP) Algorithm.

13.4 Structural Alignment based on Center of Gravity: SACG.

13.5 Searching Structural Motifs.

13.6 Using SACG Algorithm for Classification of New Protein Structures.

13.7 Experimental Results.

13.8 Accuracy Results.

13.9 Conclusion.

14 EVOLUTION OF THE CLUSTAL FAMILY OF MULTIPLE SEQUENCE ALIGNMENT PROGRAMS (Mohamed Radhouene Aniba and Julie Thompson).

14.1 Introduction.

14.2 Clustal-ClustalV.

14.3 ClustalW.

14.4 ClustalX.

14.5 ClustalW and ClustalX 2.0.

14.6 DbClustal.

14.7 Perspectives.

15 FILTERS AND SEEDS APPROACHES FOR FAST HOMOLOGY SEARCHES IN LARGE DATASETS (Nadia Pisanti, Mathieu Giraud, and Pierre Peterlongo).

15.1 Introduction.

15.2 Methods Framework.

15.3 Lossless filters.

15.4 Lossy Seed-Based Filters.

15.5 Conclusion.

15.6 Acknowledgments.

16 NOVEL COMBINATORIAL AND INFORMATION-THEORETIC ALIGNMENT-FREE DISTANCES FOR BIOLOGICAL DATA MINING (Chiara Epifanio, Alessandra Gabriele, Raffaele Giancarlo, and Marinella Sciortino).

16.1 Introduction.

16.2 Information-Theoretic Alignment-Free Methods.

16.3 Combinatorial Alignment-Free Methods.

16.4 Alignment-Free Compositional Methods.

16.5 Alignment-Free Exact Word Matches Methods.

16.6 Domains of Biological Application.

16.7 Datasets and Software for Experimental Algorithmics.

16.8 Conclusions.

17 IN SILICO METHODS FOR THE ANALYSIS OF METABOLITES AND DRUG MOLECULES (Varun Khanna and Shoba Ranganathan).

17.1 Introduction.

17.2 Molecular Descriptors.

17.3 Databases.

17.4 Methods and Data Analysis Algorithms.

17.5 Conclusions.

III MOTIF FINDING AND STRUCTURE PREDICTION.

18 MOTIF FINDING ALGORITHMS IN BIOLOGICAL SEQUENCES (Tarek El Falah, Mourad Elloumi, and Thierry Lecroq).

18.1 Introduction.

18.2 Preliminaries.

18.3 The Planted (l, d)-Motif Problem.

18.4 The Extended (l, d)-Motif Problem.

18.5 The Edited Motif Problem.

18.6 The Simple Motif Problem.

18.7 Conclusion.

19 COMPUTATIONAL CHARACTERIZATION OF REGULATORY REGIONS (Enrique Blanco).

19.1 The Genome Regulatory Landscape.

19.2 Qualitative Models of Regulatory Signals.

19.3 Quantitative Models of Regulatory Signals.

19.4 Detection of Dependencies in Sequences.

19.5 Repositories of Regulatory Information.

19.6 Using Predictive Models to Annotate Sequences.

19.7 Comparative Genomics Characterization.

19.8 Sequence Comparisons.

19.9 Combining Motifs and Alignments.

19.10 Experimental Validation.

19.11 Summary.

20 ALGORITHMIC ISSUES IN THE ANALYSIS OF CHIP-SEQ DATA (Federico Zambelli and Giulio Pavesi).

20.1 Introduction.

20.2 Mapping Sequences on the Genome.

20.3 Identifying Significantly Enriched Regions.

20.4 Deriving Actual Transcription Factor Binding Sites.

20.5 Conclusions.

21 APPROACHES AND METHODS FOR OPERON PREDICTION BASED ON MACHINE LEARNING TECHNIQUES (Yan Wang, You Zhou, Chunguang Zhou, Shuqin Wang, Wei Du, Chen Zhang, and Yanchun Liang).

21.1 Introduction.

21.2 Datasets, Features, and Preprocesses for Operon Prediction.

21.3 Machine Learning Prediction Methods for Operon Prediction.

21.4 Conclusions.

21.5 Acknowledgments.

22 PROTEIN FUNCTION PREDICTION WITH DATA-MINING TECHNIQUES (Xing-Ming Zhao and Luonan Chen).

22.1 Introduction.

22.2 Protein Annotation Based on Sequence.

22.3 Protein Annotation Based on Protein Structure.

22.4 Protein Function Prediction Based on Gene-Expression Data.

22.5 Protein Function Prediction Based on Protein Interactome Map.

22.6 Protein Function Prediction Based on Data Integration.

22.7 Conclusions and Perspectives.

23 PROTEIN DOMAIN BOUNDARY PREDICTION (Paul D. Yoo, Bing Bing Zhou, and Albert Y. Zomaya).

23.1 Introduction.

23.2 Profiling Technique.

23.3 Results.

23.4 Discussion.

23.5 Conclusions.

24 AN INTRODUCTION TO RNA STRUCTURE AND PSEUDOKNOT PREDICTION (Jana Sperschneider and Amitava Datta).

24.1 Introduction.

24.2 RNA Secondary Structure Prediction.

24.3 RNA Pseudoknots.

24.4 Conclusions.

IV PHYLOGENY RECONSTRUCTION.

25 PHYLOGENETIC SEARCH ALGORITHMS FOR MAXIMUM LIKELIHOOD (Alexandros Stamatakis).

25.1 Introduction.

25.2 Computing the Likelihood.

25.3 Accelerating the PLF by Algorithmic Means.

25.4 Alignment Shapes.

25.5 General Search Heuristics.

25.6 Computing the Robinson Foulds Distance.

25.7 Convergence Criteria.

26 HEURISTIC METHODS FOR PHYLOGENETIC RECONSTRUCTION WITH MAXIMUM PARSIMONY (Adrien Goeffon, Jean-Michel Richer, and Jin-Kao Hao).

26.1 Introduction.

26.2 Definitions and Formal Background.

26.3 Methods.

26.4 Conclusion.

27 MAXIMUM ENTROPY METHOD FOR COMPOSITION VECTOR METHOD (Raymond H.-F. Chan, Roger W. Wang, and Jeff C.-F. Wong).

27.1 Introduction.

27.2 Models and Entropy Optimization.

27.3 Application and Discussion.

27.4 Concluding Remarks.

V MICROARRAY DATA ANALYSIS.

28 MICROARRAY GENE EXPRESSION DATA ANALYSIS (Alan Wee-Chung Liew and Xiangchao Gan).

28.1 Introduction.

28.2 DNA Microarray Technology and Experiment.

28.3 Image Analysis and Expression Data Extraction.

28.4 Data Processing.

28.5 Missing Value Imputation.

28.6 Temporal Gene Expression Profile Analysis.

28.7 Cyclic Gene Expression Profiles Detection.

28.8 Summary.

29 BICLUSTERING OF MICROARRAY DATA (Wassim Ayadi and Mourad Elloumi).

29.1 Introduction.

29.2 Types of Biclusters.

29.3 Groups of Biclusters.

29.4 Evaluation Functions.

29.5 Systematic and Stochastic Biclustering Algorithms.

29.6 Biological Validation.

29.7 Conclusion.

30 COMPUTATIONAL MODELS FOR CONDITION-SPECIFIC GENE AND PATHWAY INFERENCE (Yu-Qing Qiu, Shihua Zhang, Xiang-Sun Zhang, and Luonan Chen).

30.1 Introduction.

30.2 Condition-Specific Pathway Identification.

30.3 Disease Gene Prioritization and Genetic Pathway Detection.

30.4 Module Networks.

30.5 Summary.

31 HETEROGENEITY OF DIFFERENTIAL EXPRESSION IN CANCER STUDIES: ALGORITHMS AND METHODS (Radha Krishna Murthy Karuturi).

31.1 Introduction.

31.2 Notations.

31.3 Differential Mean of Expression.

31.4 Differential Variability of Expression.

31.5 Differential Expression in Compendium of Tumors.

31.6 Differential Expression by Chromosomal Aberrations: The Local Properties.

31.7 Differential Expression in Gene Interactome.

31.8 Differential Coexpression: Global MultiDimensional Interactome.

VI ANALYSIS OF GENOMES.

32 COMPARATIVE GENOMICS: ALGORITHMS AND APPLICATIONS (Xiao Yang and Srinivas Aluru).

32.1 Introduction.

32.2 Notations.

32.3 Ortholog Assignment.

32.4 Gene Cluster and Synteny Detection.

32.5 Conclusions.

33 ADVANCES IN GENOME REARRANGEMENT ALGORITHMS (Masud Hasan and M. Sohel Rahman).

33.1 Introduction.

33.2 Preliminaries.

33.3 Sorting by Reversals.

33.4 Sorting by Transpositions.

33.5 Other Operations.

33.6 Sorting by More Than One Operation.

33.7 Future Research Directions.

33.8 Notes on Software.

34 COMPUTING GENOMIC DISTANCES: AN ALGORITHMIC VIEWPOINT (Guillaume Fertin and Irena Rusu).

34.1 Introduction.

34.2 Interval-Based Criteria.

34.3 Character-Based Criteria.

34.4 Conclusion.

35 WAVELET ALGORITHMS FOR DNA ANALYSIS (Carlo Cattani).

35.1 Introduction.

35.2 DNA Representation.

35.3 Statistical Correlations in DNA.

35.4 Wavelet Analysis.

35.5 Haar Wavelet Coefficients and Statistical Parameters.

35.6 Algorithm of the Short Haar Discrete Wavelet Transform.

35.7 Clusters of Wavelet Coefficients.

35.8 Conclusion.

36 HAPLOTYPE INFERENCE MODELS AND ALGORITHMS (Ling-Yun Wu).

36.1 Introduction.

36.2 Problem Statement and Notations.

36.3 Combinatorial Methods.

36.4 Statistical Methods.

36.5 Pedigree Methods.

36.6 Evaluation.

36.7 Discussion.

VII ANALYSIS OF BIOLOGICAL NETWORKS.

37 UNTANGLING BIOLOGICAL NETWORKS USING BIOINFORMATICS (Gaurav Kumar, Adrian P. Cootes, and Shoba Ranganathan).

37.1 Introduction.

37.2 Types of Biological Networks.

37.3 Network Dynamic, Evolution and Disease.

37.4 Future Challenges and Scope.

38 PROBABILISTIC APPROACHES FOR INVESTIGATING BIOLOGICAL NETWORKS (Jeremie Bourdon and Damien Eveillard).

38.1 Probabilistic Models for Biological Networks.

38.2 Interpretation and Quantitative Analysis of Probabilistic Models.

38.3 Conclusion.

39 MODELING AND ANALYSIS OF BIOLOGICAL NETWORKS WITH MODEL CHECKING (Dragan Bosnacki, Peter A.J. Hilbers, Ronny S. Mans, and Erik P. de Vink).

39.1 Introduction.

39.2 Preliminaries.

39.3 Analyzing Genetic Networks with Model Checking.

39.4 Probabilistic Model Checking for Biological Systems.

40 REVERSE ENGINEERING OF MOLECULAR NETWORKS FROM A COMMON COMBINATORIAL APPROACH (Bhaskar DasGupta, Paola Vera-Licona, and Eduardo Sontag).

40.1 Introduction.

40.2 Reverse-Engineering of Biological Networks.

40.3 Classical Combinatorial Algorithms: A Case Study.

40.4 Concluding Remarks.

41 UNSUPERVISED LEARNING FOR GENE REGULATION NETWORK INFERENCE FROM EXPRESSION DATA: A REVIEW (Mohamed Elati and Céline Rouveirol).

41.1 Introduction.

41.2 Gene Networks: Definition and Properties.

41.3 Gene Expression: Data and Analysis.

41.4 Network Inference as an Unsupervised Learning Problem.

41.5 Correlation-Based Methods.

41.6 Probabilistic Graphical Models.

41.7 Constraint-Based Data Mining.

41.8 Validation.

41.9 Conclusion and Perspectives.

42 APPROACHES TO CONSTRUCTION AND ANALYSIS OF MICRORNA-MEDIATED NETWORKS (Ilana Lichtenstein, Albert Zomaya, Jennifer Gamble, and Mathew Vadas).

42.1 Introduction.

42.2 Fundamental Component Interaction Research: Predicting miRNA Genes, Regulators, and Targets.

42.3 Identifying miRNA-mediated Networks.

42.4 Global and Local Architecture Analysis in miRNA-Containing Networks.

42.5 Conclusion.

References.

INDEX.


Mourad Elloumi, PhD, is Associate Professor in Computer Science, Faculty of Economic Sciences and Management of Tunis (Tunisia), and member of the Unit of Technologies of Information and Communication (UTIC). He is the author/coauthor of more than forty publications in international journals and conferences. Professor Elloumi was the guest editor of a special issue on biological knowledge discovery and data mining in Knowledge-Based Systems and the coeditor of the proceedings of two international conferences.

Albert Y. Zomaya, PhD, is the Chair Professor of High Performance Computing and Networking in the School of Information Technologies at The University of Sydney (Australia). He is the author/coauthor of eight books and more than 400 publications in technical journals and conferences, and the editor of eight books and eight conference volumes. Professor Zomaya is currently an associate editor for twenty journals, the Founding Editor of the Wiley Series on Parallel and Distributed Computing, and a Founding Coeditor of the Wiley Series in Bioinformatics.
