Systems biology: A functional blueprint of E. coli
The integration of proteomics data with genomic-context analysis has permitted the development of a protein-function prediction tool to annotate functional orphans in Escherichia coli.

In this age of 'omics' technologies, it is perhaps a bit surprising that one-third of the 4,225 proteins found in E. coli, the most studied bacterium, are functionally uncharacterized. Andrew Emili of the University of Toronto, Gabriel Moreno-Hagelsieb of Wilfrid Laurier University and their colleagues wondered whether this was because these proteins had unusual properties or low expression that made them difficult to study, or whether these proteins, which are generally not major network nodes, were just not deemed 'popular' enough to focus research efforts on understanding what they do in the cell.

One way to infer molecular function is to use sequence alignments. But sequence alignments, Emili notes, "don't give you a lot of biological context, for instance, the process or pathway the protein works in." However, looking at 'whom' a protein 'hangs out' with in its natural environment can lead to clues about its function, which is why Emili and his colleagues decided to begin the process of annotating E. coli proteins with a proteome-wide analysis of protein-protein interactions. "Knowledge of the components in a complex is a stage towards understanding how that protein is positioned in terms of a biological pathway," he explains.

The researchers used a tandem affinity purification method to systematically discover protein-protein interactions in E. coli, a project on which Emili has been working with Jack Greenblatt, also at the University of Toronto, for many years. With this method, a purification tag is expressed as a fusion to a bait protein, which naturally associates with its interaction partners, the 'prey', in the cell. The tagged bait and any prey proteins are then isolated via a gentle, two-step purification, which maintains the endogenous interactions. The proteins in the complex can then be identified by mass spectrometry.

Emili and his colleagues performed such purifications with more than a thousand tagged bait proteins. In the end they identified nearly 6,000 pairwise interactions, about half of which were novel findings, including findings for 451 functional orphans. A clustering algorithm assigned many of the orphans to multiprotein complexes.

However, they were unable to detect 469 of the orphan proteins, which were likely membrane-associated or present at very low abundance. Thus they used a complementary approach to look at the natural chromosomal clustering of E. coli genes, consisting of four different computational genomic-context profiling methods. These included looking at gene fusions, intergenic distances, the similarity between phylogenetic profiles and the evolutionary conservation of gene order. With this approach, they predicted pairwise interactions for most of the orphans.

Integrating the results from the proteomics experiments and genomic-context analysis, the researchers generated a dataset of high-confidence pairwise interactions for 99% of the annotated and 96% of the unannotated genome.

"Is protein-complex information or genomic context enough to tell you about a protein's role in the cell? No, but it certainly gives you hints," says Emili. They created a function prediction tool called StepPLR to assign putative functions to the orphans, using the information about 'whom' the orphans interacted with, directly and indirectly. Notably, they predicted that many of the orphans are actually involved in core cellular processes.

Emili hopes that E. coli researchers with proteins or genes or pathways of interest will follow up on their functional predictions. The researchers have set up a public resource called eNet to host their data, and they plan to keep adding to it and refining it. "It's not a complete story; we'd like to fill in the gaps. Certainly what's missing [from the proteomics data] is the membrane proteins," says Emili.

Although similar resources exist for other model species, such as yeast, worm and fly, Emili acknowledges that bacteria have been largely understudied by genomics researchers. He hopes that this resource will help "bring in the bacterial community, making them aware of the things we can do with omics approaches."

Allison Doerr