Essential Biochemistry

Proteomics

 
 

INTRODUCTION

For many years, sequencing the human genome was seen as the Holy Grail of biological science and medicine. But the simultaneous completion of a draft sequence by two groups, one public and one private, in February of 2001 ushered in a new “post-genomic” era as scientists began to contemplate the next step in understanding genomic function.

Sequencing all the nucleotides in the human genome does not mean that everything is known about our DNA. Although approximately 30,000-40,000 genes have been identified within our 23 pairs of chromosomes, the functions of only about half of them have been identified. Perhaps more surprisingly, all those genes account for only about 2% of the DNA in our cells. Some of the other 98% is involved in regulating the expression of genes, that is, turning them on and off as necessary. However, most of the genome consists of stretches of DNA that do not appear to be needed for anything. These sequences may be the inactive remnants of genes that have accumulated over evolutionary timescales, that is, genes discarded by the primitive creatures from which we evolved. Or these sequences may be involved in some very intricate mechanism of genomic function and gene regulation that we have yet to discern. Clearly there is much we do not understand about this so-called junk DNA; this and other mysteries of our genome will keep scientists busy for decades to come.

Nevertheless, while research into gene regulation and chromosomal structure continues, other avenues of research have branched off to investigate the end products of gene expression: the proteins of the cell. DNA is the blueprint of life, no question, but the proteins it encodes are the real cellular workhorses. Structural proteins, such as the collagen found in tendons and ligaments and the keratin found in hair and nails, provide our bodies with strength and shape. Enzymes are catalytic proteins that facilitate the biochemical reactions of our metabolism. The antibodies produced by the immune system are proteins, as are many hormones, muscle fibers, and the hemoglobin that carries oxygen from the lungs to tissues throughout the body. The various types of proteins in our cells are essential for life.

Human cells have 100,000 or more individual proteins, and therefore scientists were surprised to find that the human genome appears to house only about 40,000 genes. How can this be, when every protein sequence is necessarily encoded by our genes? A new science of proteins, called proteomics, has recently blossomed to answer this and other questions about how all the proteins in our cells function to sustain life. The study of individual proteins has been the bread-and-butter of biochemistry research for many years, but the focus of proteomics is a bit different. Proteomics is aimed at examining the whole set of proteins contained by a cell at any one time. This set of proteins is known as the proteome.

As daunting as it has been (and continues to be) to study the genome of a cell, it will be that much more difficult to study the cell's proteome. Whereas DNA is assembled from four nucleotide building blocks, commonly called G, A, T, and C, there are 20 amino acids that are used to construct a protein. The shapes of proteins are also more complicated than the shape of DNA. DNA twists into a spiral-shaped “double helix” independent of its sequence, but the shape of each folded protein is unique. Since a protein's shape is critical to its function, understanding the three-dimensional structures of proteins is important to proteomics scientists.

Another level of complexity in studying proteomes has to do with how much the proteome itself can vary, even from cell to cell within an organism. Although each cell has an identical copy of the genome, the differences between cells reflect which genes are actually expressed. In other words, each type of cell has a different proteome, or set of proteins, that gives that cell its unique characteristics. This is further complicated by the fact that even within a single cell, a proteome can change over time as the cell develops and matures or becomes diseased (which is of particular interest to medical researchers and pharmaceutical companies).

DEFINING THE PROTEOME: PROTEIN PROFILING

To get a better grasp on the functions of cellular proteins, proteomics scientists are focusing their efforts in three major areas: identifying proteins, predicting their structures, and understanding how proteins interact. The task of determining which proteins make up a given proteome is often referred to as “protein profiling.” This is not as simple as just looking at the organism’s genomic sequences and identifying the open reading frames. Even today’s powerful genome-scanning computer algorithms are not perfect at detecting genes for very small (but biologically important) proteins. Furthermore, not all proteins are synthesized in every cell, and some proteins are produced in great amounts while others are rare.
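The limitation described above can be illustrated with a deliberately naive open reading frame (ORF) scan. This is only a minimal sketch, not a real gene-finding algorithm: it looks at the forward strand only, treats any ATG-to-stop stretch as an ORF, and uses a made-up toy sequence. The function name `find_orfs` and the `min_codons` cutoff are assumptions introduced for illustration. It shows why a length cutoff, which gene finders commonly use to suppress chance ORFs, also hides genes for very small proteins.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons):
    """Naive forward-strand ORF scan (illustrative only).

    Returns a list of (start, end, n_codons) tuples, where start is the
    index of the ATG, end is one past the stop codon, and n_codons counts
    the coding codons (start codon included, stop codon excluded).
    ORFs shorter than min_codons are discarded.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # open a new ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None                    # close the ORF
    return orfs

# Hypothetical toy sequence: one 2-codon ORF and one 6-codon ORF.
toy = "CC" + "ATGAAATAA" + "GG" + "ATG" + "GCT" * 5 + "TGA"

print(find_orfs(toy, min_codons=1))   # both ORFs are reported
print(find_orfs(toy, min_codons=5))   # the short ORF is filtered out
```

With the stricter cutoff the two-codon ORF disappears from the result, which is the toy analogue of a small but biologically important protein being missed.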

To complicate matters, a gene often serves as a blueprint for more than one protein. Sometimes several different protein-encoding mRNAs are created from one gene through alternative splicing mechanisms. It is also possible for newly made polypeptide strands to be cleaved and rejoined in different ways as they fold into three-dimensional proteins. And once made, many proteins undergo further modifications in the cell; chemical groups (such as phosphate or methyl groups) and biological molecules (such as fats or sugars) can be covalently attached to the protein. Because of such alterations, some scientists believe that the human genome can potentially express close to one million different proteins.
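To see how quickly one gene can yield several mRNAs, the sketch below enumerates splice variants under a simplified "cassette exon" model, in which each optional exon is either included or skipped and exon order is preserved. The model, the function name `splice_variants`, and the exon names are assumptions made for illustration; real alternative splicing involves other mechanisms as well.

```python
from itertools import product

def splice_variants(exons):
    """Enumerate mRNA variants for a gene (simplified cassette-exon model).

    exons: list of (name, is_optional) pairs in genomic order.
    Constitutive exons are always kept; each optional exon is either
    included or skipped. Returns a list of tuples of exon names.
    """
    # For each exon, the possible "keep" decisions.
    choices = [(True, False) if optional else (True,) for _, optional in exons]
    variants = []
    for keep in product(*choices):
        variants.append(tuple(name for (name, _), k in zip(exons, keep) if k))
    return variants

# Hypothetical gene with two constitutive and two optional exons.
gene = [("E1", False), ("E2", True), ("E3", True), ("E4", False)]
for v in splice_variants(gene):
    print("-".join(v))
```

With n optional exons this model gives 2^n variants, so even a handful of optional exons, before any post-translational modification is counted, multiplies the number of distinct proteins a single gene can encode.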

Examining proteins in a cell has traditionally meant laborious experiments using one- or two-dimensional gel electrophoresis to isolate individual proteins, followed by chemical sequencing. But such processes are time-consuming and expensive, and they are likely to miss very small proteins as well as proteins present in tiny amounts. Because effective proteomics research means identifying thousands of proteins quickly and accurately, scientists are beginning to look at new techniques that will provide them with the data they need. One trend, still in its infancy, is to move to mass spectrometry (in which proteins are broken into ionized fragments that are identified by their mass-to-charge ratios). Peptide sequences that took hours to determine chemically can now be read in seconds. In addition to saving time, such systems are extremely sensitive to rare proteins and can be more readily automated.
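The idea of identifying a fragment by its mass can be sketched as follows. The table holds standard monoisotopic residue masses (in daltons) for the 20 amino acids; a peptide's neutral mass is the sum of its residue masses plus one water. The function names, the candidate list, and the tolerance are assumptions for illustration, and real instruments measure mass-to-charge ratios of ions rather than neutral masses.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini

def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

def match_by_mass(observed_mass, candidates, tolerance=0.01):
    """Return the candidate peptides whose mass is within tolerance."""
    return [p for p in candidates
            if abs(peptide_mass(p) - observed_mass) <= tolerance]

# Hypothetical lookup: which candidates fit an observed mass of 146.0691 Da?
print(match_by_mass(146.0691, ["GA", "AG", "GG", "SV"]))
```

Both "GA" and "AG" match, because peptides with the same composition have the same mass. Mass alone therefore does not fix the order of residues, which is why the masses of many overlapping fragments are needed to read a sequence.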

Scientists hope that comparing the proteomes of different types of cells may provide insight into how the genome is utilized by specific tissues. In addition, differences between the proteomes of healthy and abnormal cells can help pinpoint the causative factor in disease, leading to diagnostic and, it is hoped, therapeutic advances in the form of rationally designed drugs.

APPLYING THE LESSONS OF THE PROTEOME: RATIONAL DRUG DESIGN

Drug targeting: Rational drug design of a chemotherapeutic agent

The best drugs are those that can perform the desired function at the lowest dose possible, with the fewest side effects. Rational drug design depends on identifying a biomolecule (such as a protein) that causes disease and then tailoring a drug to alter or inhibit the function of that protein (see figure at right).

After a particular protein has been implicated in causing a disease, it is studied in detail. Information about the three-dimensional shape of the protein is used to design drugs that can specifically inhibit the function of the target protein and thus halt progress of the disease.

The figure at right describes one scenario for the rational design of a chemotherapeutic agent. First, tissue from a healthy brain and from a cancerous brain tumor is collected, and the proteins from each sample are extracted. The proteomes of both samples are then analyzed via two-dimensional gel electrophoresis, which separates the proteins by electrical charge in one dimension and by size in the other. Comparing the resulting patterns of protein spots identifies a protein that is present only, or at a much higher level, in the cancerous tissue. This protein is carefully collected and purified, and its three-dimensional structure is then determined via X-ray crystallography. The structural information is used to design compounds that will bind to the protein’s active site. In the final steps of the design, the most promising compounds are synthesized and tested to ensure that they have the desired effect of halting the growth of the cancerous tissue without unacceptable side effects.
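The comparison step in this scenario can be sketched as a simple calculation over spot intensities. Everything here is hypothetical: the spot labels, the intensity values, the function name `enriched_in_tumor`, and the five-fold cutoff are assumptions for illustration, and real gel analysis also requires spot alignment, normalization, and replicates.

```python
def enriched_in_tumor(healthy, tumor, min_fold=5.0):
    """Find spots present only, or at a much higher level, in tumor tissue.

    healthy, tumor: dicts mapping spot label -> measured intensity.
    Returns a dict of spot label -> fold change (tumor / healthy);
    spots absent from the healthy sample get float('inf').
    """
    hits = {}
    for spot, tumor_level in tumor.items():
        if tumor_level <= 0:
            continue                          # nothing detected in the tumor
        healthy_level = healthy.get(spot, 0.0)
        if healthy_level == 0:
            hits[spot] = float("inf")         # solely present in the tumor
        elif tumor_level / healthy_level >= min_fold:
            hits[spot] = tumor_level / healthy_level
    return hits

# Hypothetical spot intensities (arbitrary units).
healthy = {"spot_A": 100.0, "spot_B": 10.0, "spot_C": 50.0}
tumor = {"spot_A": 110.0, "spot_B": 80.0, "spot_C": 45.0, "spot_D": 30.0}

print(enriched_in_tumor(healthy, tumor))
```

In this toy data set, spot_B (eight-fold higher) and spot_D (tumor only) would be the candidates carried forward for purification and structure determination.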

PREDICTING PROTEIN STRUCTURE

A second focus of proteomics research is protein structure determination. Knowing the folded conformation of a protein is important because a protein’s function depends largely on its shape. Therefore, in order to understand how all the proteins in a given proteome are able to do their jobs in the cell, scientists need fast, reliable ways to determine protein shape. X-ray crystallography has long been the standard method for doing this, but it continues to be a difficult and time-consuming process, and not all proteins crystallize well. Since many thousands of protein structures are already known, however, it is becoming possible to use this information to develop methods for predicting the three-dimensional shapes of proteins from their amino acid sequences.

Three-dimensional structure of a protein whose activity is implicated in certain cancers.
At right is a close-up view of the structure with a bound inhibitory drug (a potential chemotherapeutic).

Although this branch of proteomics holds great promise, the computer algorithms that have been developed are not yet sophisticated enough to determine the shape of a protein with great accuracy. Still, pharmaceutical companies continue to be keenly interested in the advances in computer modeling of proteins, since rational drug design requires knowledge of the three-dimensional shape of the protein of interest.

NO PROTEIN IS AN ISLAND: PROTEIN NETWORKS

Isolation of protein complexes

A third area of proteomics research is focused on determining how proteins work in networks. Scientists have known for years that some proteins interact with others in the cell, joining forces to get a particular job done. However, the extent of this kind of protein cooperation wasn’t well understood until two studies published early in 2002 gave scientists a better feel for the importance of cellular protein interactions. In separate studies using the yeast Saccharomyces cerevisiae, researchers used some of the yeast proteins as “bait” to see what other proteins they could fish out (see figure at right).

In this approach, genetic engineering techniques are used to place a molecular “tag” on a “bait” protein synthesized by growing yeast cells. The yeast cells are then harvested and gently broken open, and the cellular material is poured over a special column designed to catch the tagged bait protein. Any proteins associated with the tagged protein are also retained in clusters, while unassociated proteins are rinsed away. The protein cluster is then collected, broken apart into individual proteins on a denaturing gel, and each protein strand is sequenced by mass spectrometry. Proteins in the cluster can then be identified by matching their protein sequence to protein sequences in public databases (a process that relies on the science of bioinformatics).

Such “protein fishing” experiments were repeated many times, with hundreds of different yeast proteins used as “bait.” When the proteins caught by the column were examined, it was found that they were usually attached to one or more other proteins. Scientists now believe that at least 80% of proteins interact with other proteins to form complexes. Even more intriguing, many proteins are found in more than one protein complex. This suggests that the regulation of cellular activities depends not just on signal cascades (with one protein activating the next, and so on), but on an intricate network of protein interactions that works as a system of checks and balances to keep the cell running smoothly. Compiling the data from many “protein fishing” experiments will allow scientists to construct detailed maps of all the interactions in a proteome.
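Compiling many such experiments into an interaction map can be sketched as follows. This is a minimal illustration under stated assumptions: the protein names and pulldown results are made up, the function names are hypothetical, and each bait is simply linked to every protein it retrieved (a "spoke" model), which says nothing about whether the retrieved proteins touch each other directly.

```python
from collections import defaultdict

def build_network(pulldowns):
    """Build an undirected bait-prey interaction map.

    pulldowns: dict mapping bait protein -> list of proteins caught with it.
    Returns a dict mapping each protein -> set of interaction partners.
    """
    network = defaultdict(set)
    for bait, preys in pulldowns.items():
        for prey in preys:
            if prey != bait:
                network[bait].add(prey)
                network[prey].add(bait)
    return dict(network)

def proteins_in_multiple_clusters(pulldowns):
    """List proteins that appear in more than one pulldown cluster."""
    counts = defaultdict(int)
    for bait, preys in pulldowns.items():
        for protein in set(preys) | {bait}:   # the bait belongs to its cluster
            counts[protein] += 1
    return sorted(p for p, n in counts.items() if n > 1)

# Hypothetical results from two "protein fishing" experiments.
pulldowns = {"P1": ["P2", "P3"], "P4": ["P3", "P5"]}

print(build_network(pulldowns))
print(proteins_in_multiple_clusters(pulldowns))
```

Here P3 is caught by two different baits, so it links the two clusters. Proteins of this kind are what turn a set of separate complexes into a connected network.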

Approaches such as these are not perfect: some well-known protein interactions are frequently not detected, and false-positive interactions are common. However, these forays into examining the cellular machinery as a whole (rather than studying isolated proteins) are furthering our understanding of life as a complex system. The data generated by these types of experiments will allow scientists to clarify the roles of proteins involved in metabolic pathways, regulatory cascades, and other functions critical for cell survival.

THE FUTURE OF PROTEOMICS

In this post-genomic era, the new field of proteomics shows great promise in revolutionizing our understanding of biological processes. But it also faces daunting technical challenges in fulfilling that promise. New technologies to rapidly sequence proteins, to determine the cellular locations of proteins, and to analyze genomic and protein sequencing data are still being developed and improved. Multidisciplinary collaborations will also need to be forged among computer scientists, geneticists, and protein chemists, since proteomics draws on expertise in multiple fields. But if preliminary results are any indication, all this effort will yield rich rewards.

For example, in a small but highly promising study from the National Cancer Institute published early in 2002, analysis of the blood proteins of women with and without ovarian cancer allowed researchers to correctly identify each woman who had the disease. This news was especially welcome because ovarian cancer is often not detected until the late stages of the disease, when chances for a cure are small. The power of proteomics may help medical researchers develop simple blood tests for other difficult-to-detect cancers, giving doctors and patients the one thing they need most: time for early detection and an early cure.

Thus, proteomics is likely to be a part of standard medical diagnostics in the future. And perhaps many diseases will soon be treated with efficacious drugs developed through rational design with the aid of proteomics.

 

© 2004 John Wiley & Sons, Inc. All Rights Reserved.