
Bioinformatics For Dummies, 2nd Edition

ISBN: 978-0-470-08985-9
456 pages
December 2006


Were you always curious about biology but were afraid to sit through long hours of dense reading? Did you like the subject when you were in high school but had other plans after you graduated? Now you can explore the human genome and analyze DNA without ever leaving your desktop!

Bioinformatics For Dummies is packed with valuable information that introduces you to this exciting new discipline. This easy-to-follow guide leads you step by step through every bioinformatics task that can be done over the Internet. Forget long equations, computer-geek gibberish, and installing bulky programs that slow down your computer. You’ll be amazed at all the things you can accomplish just by logging on and following these trusty directions. You get the tools you need to:

  • Analyze all types of sequences
  • Use all types of databases
  • Work with DNA and protein sequences
  • Conduct similarity searches
  • Build a multiple sequence alignment
  • Edit and publish alignments
  • Visualize protein 3-D structures
  • Construct phylogenetic trees

This up-to-date second edition includes newly created and popular databases and Internet programs as well as multiple new genomes. It provides tips for using servers and places to seek resources to find out about what’s going on in the bioinformatics world. Bioinformatics For Dummies will show you how to get the most out of your PC and the right Web tools so you’ll be searching databases and analyzing sequences like a pro!


Table of Contents


Part I: Getting Started in Bioinformatics.

Chapter 1: Finding Out What Bioinformatics Can Do for You.

Chapter 2: How Most People Use Bioinformatics.

Part II: A Survival Guide to Bioinformatics.

Chapter 3: Using Nucleotide Sequence Databases.

Chapter 4: Using Protein and Specialized Sequence Databases.

Chapter 5: Working with a Single DNA Sequence.

Chapter 6: Working with a Single Protein Sequence.

Part III: Becoming a Pro in Sequence Analysis.

Chapter 7: Similarity Searches on Sequence Databases.

Chapter 8: Comparing Two Sequences.

Chapter 9: Building a Multiple Sequence Alignment.

Chapter 10: Editing and Publishing Alignments.

Part IV: Becoming a Specialist: Advanced Bioinformatics Techniques.

Chapter 11: Working with Protein 3-D Structures.

Chapter 12: Working with RNA.

Chapter 13: Building Phylogenetic Trees.

Part V: The Part of Tens.

Chapter 14: The Ten (Okay, Twelve) Commandments for Using Servers.

Chapter 15: Some Useful Bioinformatics Resources.



Author Information

Jean-Michel Claverie is Professor of Medical Bioinformatics at the School of Medicine of the Université de la Méditerranée, and a consultant in genomics and bioinformatics. He is the founder and current head of the Structural & Genomic Information Laboratory, located in Marseilles, a sunny city on the Mediterranean coast of France. Using science as a pretext to travel, Jean-Michel has held positions in Paris (France), Sherbrooke (PQ, Canada), the Salk Institute (La Jolla, CA), the Pasteur Institute (Paris), Incyte pharmaceutical (Palo Alto, CA), and the National Center for Biotechnology Information (Bethesda, MD). He has used computers in biology since the early days: his Ph.D. work involved modeling biochemical reactions by programming an 8K Honeywell 516 computer right from the console switches! Although he has no clear recollection of it, he has been credited with introducing the French word “bioinformatique” in the late eighties, before involuntarily coining the catchy “bioinformatics” by mistranslating it while giving a talk in English!
Jean-Michel’s current research interests are in microbial and structural genomics, and in the development of bioinformatic methods for the prediction of gene function. He is the author or coauthor of more than 150 scientific publications, and a member of numerous international review panels and scientific councils. In his spare time, he enjoys the relaxed pace of life in Marseilles, with his wife Chantal and their two sons, Nicholas and Raphael.

Cedric Notredame is a researcher at the French National Centre for Scientific Research. Cedric has used and abused the facilities offered by science to wander around Europe. After a Ph.D. at EMBL (Heidelberg, Germany) and at the European Bioinformatics Institute (Cambridge, UK) under the supervision of Des Higgins (yes, the ClustalW guy), Cedric did a post-doc at the National Institute of Medical Research (London, UK), in the lab of Willie Taylor and under the supervision of Jaap Heringa. He then did a post-doc in Lausanne (Switzerland) with Phillip Bucher, and remained involved with the Swiss Institute of Bioinformatics for several years. Having had his share of rain, snow, and wind, Cedric has finally settled in Marseilles, where the sun and the sea are simply warmer than any other place he has lived in.
Cedric dedicates most of his research to the multiple sequence alignment problem and its many applications in biology. His friends claim that his entire life (past, present, future) is somehow stuffed into the T-Coffee multiple-sequence alignment package. When he is not busy dismantling T-Coffee and brewing new sequences, Cedric enjoys life in the company of his wife, Marita.



Downloads

Title Size
Test Questions (Microsoft Word format) 63.50 KB
Chapter 1 PowerPoint files 819.50 KB
Chapter 2 PowerPoint files 818.00 KB
Chapter 3 PowerPoint files 506.00 KB
Chapter 4 PowerPoint files 820.50 KB
Chapter 5 PowerPoint files 642.00 KB
Chapter 6 PowerPoint files 915.50 KB
Chapter 7 PowerPoint files 2.51 MB
Chapter 8 PowerPoint files 1.05 MB
Chapter 9 PowerPoint files 875.00 KB
Chapter 10 PowerPoint files 964.00 KB
Chapter 11 PowerPoint files 676.00 KB
Chapter 12 PowerPoint files 282.00 KB
Chapter 13 PowerPoint files 267.50 KB
Chapter 14 PowerPoint files 372.50 KB
Chapter 1 images 485.51 KB
Chapter 2 images 1.89 MB
Chapter 3 images 2.29 MB
Chapter 4 images 568.95 KB
Chapter 5 images 591.21 KB
Chapter 6 images 1.95 MB
Chapter 7 images 1.71 MB
Chapter 8 images 911.11 KB
Chapter 9 images 1.69 MB
Chapter 10 images 1.84 MB
Chapter 11 images 690.67 KB
Chapter 12 images 754.93 KB
Chapter 13 images 1.10 MB

The presentations are in Microsoft PowerPoint format. If you are unable to view PowerPoint files, you can download OpenOffice for free.


Bonus Material

For your convenience, we have listed the resources chapter by chapter, in the order in which they appear in the book. The authors have also provided the images and diagrams used in each chapter; go to the corresponding chapter to download them.
(All images are kept in .zip archives and are available on the download tab. You may download WinZip, a utility for opening the archives.)


Chapter 1 Finding Out What Bioinformatics Can Do for You
Chapter 2 How Most People Use Bioinformatics
Chapter 3 Using Nucleotide Sequence Databases
Chapter 4 Using Protein and Specialized Sequence Databases
Chapter 5 Working with a Single DNA Sequence
Chapter 6 Working with a Single Protein Sequence
Chapter 7 Similarity Searches on Sequence Databases
Chapter 8 Comparing Two Sequences
Chapter 9 Building a Multiple Sequence Alignment
Chapter 10 Editing and Publishing Alignments
Chapter 11 Working with Protein 3-D Structures
Chapter 12 Working with RNA
Chapter 13 Building Phylogenetic Trees
Chapter 15 Some Useful Bioinformatics Resources


Chapter 1: Finding Out What Bioinformatics Can Do for You

Beyond the book: Finding out about DNA chips and micro-arrays

Address Description
cmgm.stanford.edu/pbrown/ A leading laboratory offering a complete "do-it-yourself" tutorial on micro-arrays
research.nhgri.nih.gov/microarray/main.html A great resource from the U.S. National Institutes of Health
www.ebi.ac.uk The public repository for micro-array data from the European Bioinformatics Institute
www.affymetrix.com The leading company in DNA chips
www.axon.com Nice pictures and animations from a leading provider of micro-array readers



Chapter 2: How Most People Use Bioinformatics

The sites everybody should know about

Address Description
www.ncbi.nlm.nih.gov/entrez/ The top site for bibliographic information in biomedical sciences
www.expasy.org/sprot/ The best starting point for finding out about proteins and their genes
www.ncbi.nlm.nih.gov The US site of the joint international DNA sequence repository (GenBank)
www.ebi.ac.uk/embl/ Its counterpart in Europe (EMBL)
www.ddbj.nig.ac.jp Its counterpart in Japan (DDBJ)
www.ncbi.nlm.nih.gov/BLAST/ The main site to compare your sequence with all others
pir.georgetown.edu A user-friendly site for analyzing your protein sequence and trying your first multiple sequence alignment with CLUSTALW



Chapter 3: Using Nucleotide Sequence Databases

A few places for finding genomic information

Address Description
www.ncbi.nlm.nih.gov The US site of the joint international DNA sequence repository (GenBank)
www.tigr.org/tdb/ The Institute for Genomic Research (TIGR): microbial genomics
www.ensembl.org The place to find out about the human genome
genome.cse.ucsc.edu Another user-friendly human genome browser



Chapter 4: Using Protein and Specialized Sequence Databases

The two main information resources about protein sequences

Address Description
www.expasy.org/sprot/ The Expasy/SWISS-PROT server
pir.georgetown.edu The Protein Information Resource server

Some good places for refreshing your biochemistry

Address Description
www.glycosuite.com The glycan structure database
lipid.bio.m.u-tokyo.ac.jp The ultimate lipid database
chem.sis.nlm.nih.gov/chemidplus/ ChemIDplus: Identifying molecules by drawing them up!

The main resources for biochemical pathways and enzymes

Address Description
www.expasy.ch/cgi-bin/search-biochem-index Find which metabolic pathway a molecule belongs to.
www.genome.ad.jp/kegg/ The famous Kyoto Encyclopedia of Genes and Genomes (KEGG). EC (Enzyme Commission) numbers or gene names are the best starting points for this resource.

The comprehensive enzyme information system BRENDA.

www.chem.qmul.ac.uk/iubmb The official site for enzyme nomenclature of the International Union of Biochemistry and Molecular Biology (IUBMB).
www.ecocyc.org The Encyclopedia of E. coli Genes and Metabolism. It is progressively extending to other bacteria.

Some great 3-D structure information resources

Address Description
www.rcsb.org/pdb PDB, the official repository database for protein 3-D structures.
www.ncbi.nlm.nih.gov/Structure MMDB, NCBI's database of macromolecular 3-D structures with visualization tools.

SCOP, a Structural Classification Of Proteins.

www.biochem.ucl.ac.uk/bsm/cath_new CATH (Class, Architecture, Topology, Homologous superfamily), a hierarchical classification of protein structures.
www.expasy.ch/swissmod/SWISS-MODEL.html Swiss-Model, a fully automated protein structure homology-modeling server.

Some specialized protein databases

Address Description
imgt.cines.fr IMGT, the International Immunogenetics database, specializes in proteins involved in the immune response.
rebase.neb.com Rebase, the reference restriction-modification enzyme database.

CAZy, an information resource on enzymes that degrade, modify, or create glycosidic bonds.

www.merops.co.uk MEROPS, a database specializing in proteases.
pkr.sdsc.edu/html/index.shtml PKR, the Protein Kinase Resource, focuses on the protein kinase family of enzymes.
nrr.georgetown.edu NRR, the Nuclear Receptor Resource, is a collection of individual databases on the steroid and thyroid hormone receptors.
senselab.med.yale.edu/senselab The Human Brain Database provides information on the proteins involved in neural processes, such as ion channels, membrane receptors of neurotransmitters and neuromodulators, as well as olfactory receptors (ORDB).
www.ncbi.nlm.nih.gov/COG The COG (Clusters of Orthologous Groups) database groups proteins shared by at least three major phylogenetic lineages (ancient conserved domains).



Chapter 5: Working with a Single DNA Sequence

Some sites for performing DNA analysis

Address Description
VecScreen_docs Screen your sequence for vector contamination
repeatmasker.genome.washington.edu Tools for detecting and masking repeats
Sites to compute restriction maps for your sequence
biotools.umassmed.edu Designing PCR primers
bioweb.pasteur.fr Tools for various DNA composition analyses
Two sites for interactive dot-plot analysis
www.ncbi.nlm.nih.gov/gorf/gorf.html A basic ORF finder
Gene prediction in prokaryotes using GeneMark
Various sites for predicting protein-coding genes in eukaryote DNA sequences
genes.mit.edu/genomescan For predicting complete gene structures from vertebrate DNA sequences
bio.ifom-firc.it/ASSEMBLY/assemble.html A straightforward Web-service for small-scale gene assembly
Popular software for assembling and managing DNA sequences (you need to install them on your computer)
Main commercial sequence assembly software

Web Sites for searching motifs in DNA sequences

Address Description
transfac.gbf.de/TRANSFAC/ Search for potential transcriptional elements using the TRANSFAC database
bimas.dcrt.nih.gov/molbio/matrixs/ Search for transcriptional elements using the IMD database
bimas.dcrt.nih.gov/molbio/proscan/ Predict putative eukaryotic promoter regions
www.gsf.de/biodv/genomeinspector.html Detect distance correlations between sequence elements
www.dna.affrc.go.jp/htdocs/PLACE/signalscan.html Detect regulatory signals in plant sequences
meme.sdsc.edu/meme/website/ Discover motifs in groups of related DNA or protein sequences
rsat.ulb.ac.be/rsat/ Tools to analyze regulatory sequences



Chapter 6: Working with a Single Protein Sequence

The Main Domain Collections

Name Address Number of Domains Generation
PROSITE http://www.expasy.org/prosite 616 Manual
Pfam http://www.sanger.ac.uk/Software/Pfam 7973 Manual
PRINTS http://www.bioinf.man.ac.uk/dbbrosers/PRINTS 1900 Manual
ProDom http://protein.toulouse.inra.fr/prodom/current/html/home.php 736000 Manual
SMART http://smart.embl-heidelberg.de 685 Manual
COG http://www.ncbi.nlm.nih.gov/COG/new/ 4852 Manual
TIGRFAMs http://www.tigr.org/TIGRFAMs 2453 Manual
BLOCKS http://blocks.fhcrc.org 12542 Manual

Protein sequence analysis over the Internet

Name Site Description
ExPASy http://www.expasy.org/tools Proteins
Pbil http://npsa-pbil.ibcp.fr Proteins
PIR http://pir.georgetown.edu Proteins
CBS http://www.cbs.dtu.dk/services Proteins
Hits http://hits.isb-sib.ch/ Proteins
InterPro http://www.ebi.ac.uk/interpro/scan.html Domains
CD search http://www.ebi.ac.uk/InterProScan/ Domains



Chapter 7: Similarity Searches on Sequence Databases

A few BLAST and PSI-BLAST servers around the world

Country or Continent Program URL
USA BLAST/PSI-BLAST www.ncbi.nlm.nih.gov/BLAST
Europe BLAST www.expasy.ch/tools/blast/
Europe BLAST www.ch.embnet.org/software/bBLAST.html
Europe BLAST www.ebi.ac.uk/blast
Japan BLAST/PSI-BLAST www.ddbj.nig.ac.jp/search/blast-e.html


Address Description
http://blast.wustl.edu/ The Home of WU-BLAST (no online server)
http://tigrblast.tigr.org/tgi// Program
http://www.genome.wustl.edu/tools/blast/ Program
http://www.ebi.ac.uk/blast/ Program
http://brassica.bbsrc.ac.uk/BrassicaDB/blast_form.html Program

Alternative Methods for Homology Searches

Country/Continent Program Address
USA FASTA http://fasta.bioch.Virginia.edu/fasta
EUROPE FASTA http://www.ebi.ac.uk/fasta33
EUROPE SSEARCH http://www.ch.embnet.org/software/GMFDF_form.html
JAPAN SSEARCH/FASTA http://www.ddbj.nig.ac.jp/search/ssearch-e.html
USA BLAT http://genome.ucsc.edu



Chapter 8: Comparing Two Sequences

Various flavors of dot-plot programs

Name Used For Range URL Platforms
Dotlet Proteins, DNA 10,000 www.ch.embnet.org All (Java)
Dnadot Proteins, DNA 100,000 arbl.cvmbs.colostate.edu/molkit/dnadot/ All (Java)
Dotter Proteins, DNA 100,000 www.cgr.ki.se/cgr/groups/sonnhammer/Dotter.html Unix, Linux, Windows
Dottup Complete genomes, DNA >100,000 www.emboss.org Unix, Linux

Online pairwise alignment programs

Name Address Alignment Type
lalign www.ch.embnet.org/software/LALIGN_form.html Global/Local
lalign http://fasta.bioch.virginia.edu/fasta_www/plalign.htm Global/Local
USC www-hto.usc.edu/software/seqaln/seqaln-query.html Global/Local
alion fold.stanford.edu/alion/ Global/Local
align genome.cs.mtu.edu/align.html Global/Local
align www.ebi.ac.uk/emboss/align/ Global/Local
xenAliTwo www.soe.ucsc.edu/~kent/xenoAli/xenAliTwo.html Local for DNA
Blast2seqs www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html Local BLAST
Protal2dna bioweb.pasteur.fr/seqanal/interfaces/protal2dna.html Protein against DNA
Pal2nal coot.embl.de/pal2nal Protein against DNA

Online pairwise alignment analyses

Name Address Function
lalnview www.expasy.ch/tools/sim-prot.html Visualization
prss www.ch.embnet.org/software/PRSS_form.html Evaluation
prss fasta.bioch.virginia.edu/fasta/prss.htm Evaluation
graph-align darwin.nmsu.edu/cgi-bin/graph_align.cgi Evaluation



Chapter 9: Building a Multiple Sequence Alignment

Application Procedure
Extrapolation A good multiple alignment can help convince you that an uncharacterized sequence is really a member of a protein family. Alignments that include SWISS-PROT sequences are the most informative. Use the ExPASyBLAST server (at www.expasy.ch/tools/blast/) to gather and align them.
Phylogenetic analysis If you carefully choose the sequences you include in your multiple alignment, you can reconstruct the history of these proteins. Use the Pasteur Phylip server at bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html.
Pattern identification By discovering very conserved positions, you can identify a region that is characteristic of a function (in proteins or in nucleic-acid sequences). Use the logo server for that purpose: www-lmmb.ncifcrf.gov/~toms/sequencelogo.html.
Domain identification It is possible to turn a multiple sequence alignment into a profile that describes a protein family or a protein domain (PSSM). You can use this profile to scan databases for new members of the family. Use NCBI-BLAST to produce and analyze PSSMs: www.ncbi.nlm.nih.gov/blast/blastcgihelp.shtml#pssm.
DNA regulatory elements You can turn a DNA multiple alignment of a binding site into a weight matrix and scan other DNA sequences for potentially similar binding sites. Use the Gibbs sampler to identify these sites: bayesweb.wadsworth.org/gibbs/gibbs.html
Structure prediction A good multiple alignment can give you an almost perfect prediction of secondary structure, for both proteins and RNA. Sometimes it can also help in the building of a 3-D model.
nsSNP analysis Various gene alleles often have different amino-acid sequences. Multiple alignments can help you predict whether a Non-Synonymous Single-Nucleotide Polymorphism is likely to be harmful. See the SIFT site for more details: blocks.fhcrc.org/sift/SIFT.html.
PCR Analysis A good multiple alignment can help you identify the less-degenerated portions of a protein family, in order to fish out new members by PCR (polymerase chain reaction). If this is what you want to do, you can use the following site: blocks.fhcrc.org/codehop.html.
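
The "DNA regulatory elements" entry above describes turning an alignment of binding sites into a weight matrix and scanning other sequences with it. The following is a minimal Python sketch of that idea, not the method used by the Gibbs sampler or any other server listed here. It assumes ungapped sites of equal length, a uniform background frequency of 0.25 per base, and a pseudocount of 1; the function names and the example sites are hypothetical and made up for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Turn equal-length, ungapped aligned DNA sites into a log-odds matrix.

    Returns one dict per alignment column: base -> log2(frequency / background).
    The pseudocount and uniform background are assumptions of this sketch.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for col in range(length):
        counts = Counter(site[col].upper() for site in sites)
        matrix.append({
            base: math.log2((counts[base] + pseudocount) / total / background)
            for base in BASES
        })
    return matrix

def scan(sequence, matrix):
    """Score every window of the sequence; return (start, score), best first."""
    width = len(matrix)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical aligned binding sites (made-up data, for illustration only)
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATAAT"]
matrix = build_weight_matrix(sites)
best_start, best_score = scan("GGCCTATAATGCGC", matrix)[0]
print(best_start, round(best_score, 2))
```

Windows that score well above zero resemble the aligned sites more than the background does; where to set the cutoff depends on your data, and the dedicated servers listed in this chapter handle this far more carefully.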

BLAST servers integrating multiple alignment methods

Address What You Can Do There
www.expasy.ch/tools/blast/ Extract entire sequences;
Export sequences in FASTA;
Submit sequences to ClustalW, Tcoffee, or MAFFT;
Turn the list of hits into a non-redundant collection of sequences
npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_blast.html Extract entire sequences;
Extract sequence fragments;
Export sequences in FASTA;
Submit sequences to ClustalW
srs.ebi.ac.uk Submit sequences to ClustalW

A List of ClustalW servers

Name Location URL
EBI Europe www.ebi.ac.uk/clustalw
EMBnet Europe www.ch.embnet.org/software/ClustalW.html
PIR USA pir.georgetown.edu/pirwww/search/multaln.html
BCM USA searchlauncher.bcm.tmc.edu/multi-align/multi-align.html
GenomeNet Japan align.genome.jp/
DDBJ Japan www.ddbj.nig.ac.jp/search/clustalw-e.html
Strasbourg Europe ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalW/

Multiple Sequence Alignment Resources Over the Internet

Method Description Address
Tcoffee Accurate combination of sequences and structures www.tcoffee.org
Probcons A Bayesian version of Tcoffee probcons.stanford.edu/
MUSCLE A fast and accurate sequence cruncher www.drive5.com/muscle/
Kalign A fast sequence aligner msa.cgb.ki.se
MAFFT A fast and accurate sequence cruncher using Fast Fourier Transforms timpani.genome.ad.jp/~mafft/server/
Dialign Ideal for Sequences With Local Homology bibiserv.techfak.uni-bielefeld.de/dialign/

Motif-finding methods available online

Method Address
Gibbs Sampler bioweb.pasteur.fr/seqanal/interfaces/gibbs-simple.html
Pratt www.ebi.ac.uk/pratt/index.html
eMotif dna.stanford.edu/emotif/
MEME meme.sdsc.edu/meme/
TEIRESIAS cbcsrv.watson.ibm.com/Tspd.html
Bioprospector ai.stanford.edu/~xsliu/BioProspector/
Improbizer www.soe.ucsc.edu/~kent/improbizer/improbizer.html
BLOCK-Maker blocks.fhcrc.org/blocks/blockmkr/make_blocks.html



Chapter 10: Editing and Publishing Alignments

Packages for Editing Multiple Sequence Alignments

Name Address Description
Jalview www.jalview.org www.es.embnet.org/Services/MolBio/jalview/ Java package, available online
Kalignview msa.cgb.ki.se Nice online alignment viewer
CINEMA www.bioinf.man.ac.uk/dbbrowser/CINEMA2.1/ A very complete Java package
Seaview pbil.univ-lyon1.fr/software/seaview.html A beautiful editor, very easy to install
Belvu www.cgr.ki.se/cgr/groups/sonnhammer/Belvu.html Useful for removing redundancy
Bioedit www.mbio.ncsu.edu/BioEdit/bioedit.html Adapted for RNA
RALEE www.sanger.ac.uk/Users/sgj/ralee/ An RNA viewer
Review bioweb.pasteur.fr/cgi-bin/seqanal/review-edital.pl A very complete list of viewers

Extracting information from a multiple sequence alignment

Name URL Description
Logo weblogo.berkeley.edu , www-lmmb.ncifcrf.gov/~toms/sequencelogo.html, www.cbs.dtu.dk/~gorodkin/appl/plogo.html Logos
Blocks blocks.fhcrc.org/blocks/process_blocks.html Identifies blocks
Blockgap www.bork.embl-heidelberg.de/Alignment/blockgap.html
Lama blocks.fhcrc.org/blocks-bin/LAMA_search.sh Compares your multiple alignment with the BLOCKs database
Amas www.compbio.dundee.ac.uk/servers/amas_server.html Identifies important features in the multiple alignment

Multiple alignment beautifying tools

Name URL Description
ESPript espript.ibcp.fr A very powerful shading-and-coloring tool
Boxshade www.ch.embnet.org/software/BOX_form.html Shading in black and white
Mview bioweb.pasteur.fr/seqanal/interfaces/mview_blast-simple.html Can process BLAST alignments



Chapter 11: Working with Protein 3-D Structures

Predicting secondary structures

URL Description
bioinf.cs.ucl.ac.uk/psipred/ PsiPred for predicting protein secondary structures
PredictProtein for predicting protein secondary structures
www.rcsb.org/pdb/ The Protein Data Bank (PDB), containing every publicly available protein structure
www.ncbi.nlm.nih.gov/Structure The NCBI section dedicated to structure analysis
Two very popular PDB viewers (you must install them on your machine)
Popular structure classification collections
Homology modeling
Threading sequences onto PDB structures
folding.stanford.edu ab-initio folding
MolMovDB Molecular dynamics
www.bio.vu.nl/nvtb/Docking.html Protein Interaction



Chapter 12: Working with RNA

Hunting Micro RNAs (miRNAs) over the Web

Address Description
sirna.cgb.ki.se/ An extensive collection of resources on silencing RNAs
itb.biologie.hu-berlin.de/~nebulus/sirna/v2/ A database of all known human silencing RNAs
microrna.sanger.ac.uk/sequences The home of miRNAs at the Sanger Center in the UK. Probably one of the most extensive resources on micro-RNAs.
cbit.snu.ac.kr/~ProMiR2/ A resource for predicting miRNAs using probabilistic methods.
pictar.bio.nyu.edu/ Prediction of the potential target of your miRNA on complete genomes.
bibiserv.techfak.uni-bielefeld.de/rnahybrid/ A resource for predicting the potential target of your miRNA on a user-provided genomic sequence.
mirna.imbb.forth.gr/microinspector/ Runs your genomic sequence against an exhaustive database of miRNAs

Ribosomal RNA resources on the Internet

URL Description
www.psb.ugent.be/rRNA/lsu/ A European database on the larger of the two ribosomal subunits. It contains predicted structures. It is possible to query the database online. Features lots of online software.
www.psb.ugent.be/rRNA/ssu/ The "other" European database, this time dedicated to the small ribosomal subunit.

Some non-coding RNA resources

URL Description
condor.bcm.tmc.edu/smallRNA/smallrna.html Dedicated to small non-coding RNAs.
rna.wustl.edu/tRNAdb/ Dedicated to tRNAs.
bighost.area.ba.cnr.it/BIG/UTRHome/ Dedicated to the untranslated regions of genes.
www.indiana.edu/~tmrna/ Dedicated to the recently discovered tmRNAs, which are both transfer and messenger RNAs. (If you don't yet know what this is, you MUST take a look at this fascinating Web site!)

A list of generic RNA resources

URL Description
bioinfo.lifl.fr/rna/ A site dedicated to the detection of non-coding RNAs.
www.imb-jena.de/RNA.html/ RNA World, one of the most complete sites currently available.
www.rnabase.org/links/ Another very complete list of sites.



Chapter 13: Building Phylogenetic Trees

Online sites for making phylogenetic trees

Address Description
www.ebi.ac.uk/clustalw/ You can use ClustalW to build multiple alignments and compute NJ trees. Remember: You cannot do both at the same time!
www.genebee.msu.ru/clustal/basic.html The Genebee server can produce genuine phylogenetic trees in one step.
www.tcoffee.org Tcoffee computes a genuine NJ-phylogenetic tree in one step
www.jalview.org You can use Jalview to produce NJ trees. It's a very powerful tool that combines alignment editing with tree computation.
atgc.lirmm.fr/phyml/ A powerful method to compute maximum likelihood trees from Gascuel and his team.
bioweb.pasteur.fr/seqanal/interfaces/bionj-simple.html An interface to BioNJ, a novel NJ method.
www.up.univ-mrs.fr/evol/figenix/ A powerful Java tool to gather members of a protein family and build the associated tree.
bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html A Web interface for Phylip.
www.genebee.msu.ru/services/phtree_reduced.html Very powerful interface for a new tree reconstruction method.

Generic phylogenetic resources on the Internet

Address Description
evolution.genetics.washington.edu/phylip/software.html Joe Felsenstein's pages, where Phylip lives; it's also one of the most extensive collections of resources available. Truly a legendary site!
www.ucmp.berkeley.edu/subway/phylo/phylosoft.html A very complete list of phylogeny resources.
paup.csit.fsu.edu/index.html The home of PAUP, the legendary phylogeny package using parsimony. Although PAUP is a commercial package, it's reasonably priced and worth every penny, according to specialists.
www.ncbi.nlm.nih.gov/About/primer/phylo.html The NCBI primer on phylogeny.
www.techfak.uni-bielefeld.de/bcd/Curric/MathAn/mathan.html A high-quality course on tree reconstruction methods.

Collections of Orthologous Sequences

Address Description
www.ncbi.nlm.nih.gov/COG/ Clusters of orthologous sequences maintained by the NCBI. Each cluster contains proteins from bacterial genomes.
pbil.univ-lyon1.fr/databases/hovergen.html A collection of orthologous vertebrate genes.
pbil.univ-lyon1.fr/databases/hobacgen.html A collection of orthologous bacterial genes.
systers.molgen.mpg.de Another collection of homologous sequences.
Three extensive collections of ribosomal RNA sequences, which are very useful for classifying new organisms, and come with appropriate phylogenetic tools.



Chapter 15: Some Useful Bioinformatics Resources

Ten important bioinformatics databases

Name URL Description
GenBank/DDBJ/EMBL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide Nucleotide sequences
Ensembl www.ensembl.org Human/mouse genome
PubMed http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?DB=pubmed Literature references
NR http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein Non redundant Protein sequences
SWISS-PROT www.expasy.ch Protein sequences
InterPro www.ebi.ac.uk Protein domains
OMIM www.ncbi.nlm.nih.gov Genetic diseases
Enzymes www.chem.qmul.ac.uk Enzymes
PDB www.rcsb.org/pdb/ Protein structures
KEGG www.genome.ad.jp Metabolic pathways

Twelve important software programs in bioinformatics

Category Name URL Description
Database Search SRS srs.ebi.ac.uk Database search
  Entrez www.ncbi.nih.gov/Entrez Database search (Chapter 3)
  BLAST www.ncbi.nlm.nih.gov/blast Homology search (Chapter 7)
  DALI www.ebi.ac.uk/dali Structure database search (Chapter 11)
Multiple alignment ClustalW www.ebi.ac.uk Multiple sequence alignment (Chapter 9)
  MUSCLE phylogenomics.berkeley.edu/muscle/ Multiple sequence alignment (Chapter 9)
  Tcoffee www.tcoffee.org Multiple Sequence Alignment (Chapter 9)
Prediction GenScan genes.mit.edu Gene prediction (Chapter 5)
  PsiPred bioinf.cs.ucl.ac.uk/psipred/ Protein structure prediction (Chapter 11)
  Mfold www.bioinfo.rpi.edu/applications/mfold/ RNA structure prediction (Chapter 12)
Phylogenetics Phylip bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html Tree reconstruction (Chapter 13)
  PhyML atgc.lirmm.fr/phyml/ Tree reconstruction (Chapter 13)
Edition/Visualization Jalview www.jalview.org Alignment editor (Chapter 10)
  Logos weblogo.berkeley.edu An MSA visualization tool (Chapter 10)
  Trees iubio.bio.indiana.edu/treeapp/treeprint-form.html Tree visualization (Chapter 13)
  Rasmol www.umass.edu/microbio/rasmol/ Structure visualization (Chapter 11)

Ten bioinformatics resource locators

Name Address Description
ExPASy www.expasy.ch Dedicated to proteins
ArrayExpress www.ebi.ac.uk/microarray/ DNA chips
Swbic www.swbic.org Miscellaneous links
Pasteur bioweb.pasteur.fr/intro-uk.html Miscellaneous links; many online tools
RNA World www.imb-jena.de/RNA.html RNA-related links
miRNAs microrna.sanger.ac.uk/sequences/index.shtml Extensive Resources on miRNA
Phylip evolution.genetics.washington.edu/phylip/software.html Everything on phylogeny
NCBI primers www.ncbi.nlm.nih.gov/education Very good primers on many subjects
Bielefeld bibiserv.techfak.uni-bielefeld.de/intro/dist.html Awesome online course
Bio-informer www.ebi.ac.uk/Information/News/ The EBI online news
Coffee Corner www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowSection&rid=coffeebrk NCBI Online News.

Ten Places to Go Farther

Name Address Description
Nucleic Acids Research nar.oxfordjournals.org/ Once a year, NAR publishes both a database issue and a Web-server issue. These are available for free and contain the state of the art in bioinformatics.
Bioinformatics bioinformatics.oxfordjournals.org/ Bioinformatics contains articles describing the most recent methods in bioinformatics.
ISCB www.iscb.org/events/event_board.php An exhaustive list of major conferences in the field of bioinformatics, provided by the International Society for Computational Biology.


