
SGD Help: Domains/Motifs and Signal Peptides






Description

The Domains/Motifs and Signal Peptides page displays sequence-based predictive information for protein-coding ORFs in S. cerevisiae. The page contains sections for InterPro-derived shared and unique domains/motifs, TMHMM-derived transmembrane domains, and SignalP-derived signal peptides. Up to three Proteome Browser thumbnails may be present to aid in the visualization of sequence-based predictions, including domains/motifs, transmembrane domains and signal peptides. InterProScan-derived shared and unique domains/motifs are also presented in tabular form, as are the coordinates for predicted transmembrane domains and signal peptides. Finally, an External Links section is provided so that external databases can be searched directly for protein-specific domain/motif information.

Proteome Browser Thumbnails

To aid in the visualization of primary sequence-based protein information, an interactive Proteome Browser has been developed. As many as three separate thumbnail images may be present on the domains/motifs page. The first thumbnail, located at the top of the page (see figure below), is similar to the image displayed on the Protein Information page. Clicking on the thumbnail provides access to the interactive browser. This browser is a customized version of GBrowse, a genome browser developed by the Generic Model Organism Database (GMOD) project. The Proteome Browser consolidates the display of domains/motifs (predicted by software and datasets assembled by the InterPro database, using InterProScan), transmembrane domains (predicted using TMHMM), signal peptides (identified using SignalP), profile hits (using BlastProDom and ProfileScan, methods based on the generation of profiles from a family of related sequences derived through multiple sequence alignments), and Kyte-Doolittle hydropathy plots. Additional information on the InterProScan tool is located in the Shared Domains/motifs section.
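
As an illustration of what a Kyte-Doolittle hydropathy plot represents, the following Python sketch averages the published Kyte-Doolittle hydropathy values over a sliding window along a protein sequence. The function name hydropathy_profile and the window length of 9 are assumptions made for this example; they are not taken from the Proteome Browser, which may use different settings.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return (center_position, mean_hydropathy) for every full window.

    Positions are 1-based amino acid coordinates. The default window
    length is an assumption chosen for illustration only.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        return []
    # Raises KeyError if the sequence contains a non-standard residue.
    values = [KD_SCALE[aa] for aa in seq]
    profile = []
    for start in range(len(seq) - window + 1):
        mean = sum(values[start:start + window]) / window
        center = start + window // 2 + 1
        profile.append((center, round(mean, 3)))
    return profile
```

Stretches with a high positive average are hydrophobic and often coincide with predicted transmembrane domains, which is why the plot is shown alongside those tracks.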

In both this thumbnail and the interactive Proteome Browser, HMM domains have been color coded based on the source of the prediction, with PIR SUPERFAMILY domains in red, PFAM domains in orange and yellow, GENE3D domains in purple, PANTHER domains in green, TIGRFAM domains in blue and SMART domains in brown. In the Proteome Browser, a mouseover feature has been added to provide additional detailed information regarding the feature of interest. For example, mousing over a domain will provide details concerning the database origin of the domain match, the name and description of the domain, as well as the E-value of the match.

A second Proteome Browser thumbnail may be present in the transmembrane domain section if such domains have been identified by TMHMM. This thumbnail displays the relative location of the predicted transmembrane domains with specific amino acid coordinates listed in the associated table. Finally, a third thumbnail will be present if signal peptides have been predicted for the protein of interest. This thumbnail displays the relative location of signal peptide(s) with specific amino acid coordinates listed in the associated table.

To view a different protein, first click on a thumbnail image to open the Proteome Browser, then enter the protein name in the landmark or region text box. The scroll/zoom feature can be used to modify the region of the protein shown in the default view. The default setting displays the predicted full-length protein, and the zoom option can be used to look at a particular region in more detail (zooming in). Note that one cannot zoom out. Tracks shown in the default view can be modified by selecting/deselecting the tracks of interest and then updating the image. User-defined tracks of information can also be displayed by uploading the file of interest. Additional information concerning the functionality of the Proteome Browser can be found in the general GBrowse help document, since the underlying code and functionality of the two viewers are the same.

Shared Domains/motifs

The table in the shared domains/motifs section provides information about other Saccharomyces cerevisiae proteins that also contain the domains/motifs identified in the original query protein sequence. The following image of the shared domains/motifs table is an excerpt of the results table produced using Hxt1p as the query sequence. The first column of this table provides the names of other S. cerevisiae proteins, with links to the respective Protein Information pages. The middle column lists all domains/motifs that are found both in the original query protein sequence (e.g. Hxt1p) and in another S. cerevisiae protein sequence (e.g. YBR241Cp, Mal31p or Git1p). The third column lists domains/motifs found in the InterPro database for these other S. cerevisiae proteins (e.g. Git1p) but not found in the original query protein sequence. Clicking on an accession number takes you to a page describing the specific domain/motif at the database of origin.
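
The logic behind the three columns can be sketched with set operations in Python. This is a minimal illustration, not SGD's implementation: the function name shared_domain_rows, the protein names and the accessions are all hypothetical placeholders.

```python
def shared_domain_rows(query, domains_by_protein):
    """Build rows of (protein, shared domains, domains not in the query).

    domains_by_protein maps a protein name to the set of domain/motif
    accessions predicted for it. Proteins that share nothing with the
    query are left out of the table.
    """
    query_domains = domains_by_protein[query]
    rows = []
    for protein, domains in sorted(domains_by_protein.items()):
        if protein == query:
            continue
        shared = query_domains & domains      # middle column
        if not shared:
            continue
        not_in_query = domains - query_domains  # third column
        rows.append((protein, sorted(shared), sorted(not_in_query)))
    return rows


# Hypothetical placeholder data, not real annotations.
example = {
    "QueryP": {"DOM_A", "DOM_B"},
    "ProteinA": {"DOM_A", "DOM_C"},
    "ProteinB": {"DOM_B"},
    "ProteinC": {"DOM_D"},
}
rows = shared_domain_rows("QueryP", example)
```

In this example ProteinC does not appear in the result because it shares no domain with the query.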

The results displayed in this section of the page were derived by comparing yeast protein sequences using the InterProScan program (Quevillon E et al. (2005)). Briefly, InterProScan is a tool that combines different protein signature recognition methods into one resource. The InterPro database integrates motif and domain information from the following member databases: PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs, Gene3D, PANTHER and PIR SUPERFAMILY. InterProScan uses the scanning methods and cut-offs recommended by the member databases. A list of all the hits retrieved by InterProScan is available at SGD's FTP site.

Unique Domains/motifs

The table displayed in this section of the page contains a list of domains/motifs that are unique to the query protein (i.e. not shared by other yeast proteins). The first column of this table contains the database source of the domain, the middle column contains the accession number of the unique domain, and the third column provides a description of the domain. Clicking on the accession number takes you to a page describing the specific domain/motif at the database of origin.
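
A domain is unique in this sense when no other protein in the set carries the same accession. The Python sketch below shows one way such a table could be derived. The function name unique_domain_rows, the data layout, and all names and accessions are hypothetical and chosen only for illustration.

```python
def unique_domain_rows(query, annotations):
    """Return (source, accession, description) rows unique to the query.

    annotations maps protein name -> {accession: (source, description)}.
    """
    seen_elsewhere = set()
    for protein, domains in annotations.items():
        if protein != query:
            seen_elsewhere.update(domains)  # collects the accessions
    return [
        (source, accession, description)
        for accession, (source, description) in sorted(annotations[query].items())
        if accession not in seen_elsewhere
    ]


# Hypothetical placeholder data, not real annotations.
annotations = {
    "QueryP": {
        "ACC_1": ("SourceX", "placeholder domain 1"),
        "ACC_2": ("SourceY", "placeholder domain 2"),
    },
    "ProteinA": {"ACC_1": ("SourceX", "placeholder domain 1")},
}
unique_rows = unique_domain_rows("QueryP", annotations)
```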

Transmembrane Domains

Transmembrane domains were predicted using version 2.0 of TMHMM, an application available at the Center for Biological Sequence Analysis at the Technical University of Denmark (DTU). If transmembrane domains have been predicted for the query protein, a table will be present containing both a Proteome Browser thumbnail, which displays the relative location of the predicted domains, and the specific amino acid coordinates. If no transmembrane domains are predicted, a message will be displayed in place of the table.
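
For readers who run TMHMM themselves, the amino acid coordinates can be recovered from its output. The Python sketch below assumes a topology string of the form "o4-26i47-69o", in which "i" and "o" mark inside and outside loops and each "start-end" pair is one predicted transmembrane helix. That format, and the function name parse_topology, are assumptions for this example; check them against the output of the TMHMM version in use.

```python
import re


def parse_topology(topology):
    """Extract (start, end) coordinates of predicted helices.

    Assumes a string such as 'o4-26i47-69o', where each 'start-end'
    pair is one transmembrane helix in 1-based amino acid coordinates.
    """
    return [(int(start), int(end))
            for start, end in re.findall(r"(\d+)-(\d+)", topology)]


# Hypothetical topology string, not a real prediction.
helices = parse_topology("o4-26i47-69o")
for number, (start, end) in enumerate(helices, 1):
    print(f"Helix {number}: residues {start}-{end}")
```

A string with no coordinate pairs, such as "o", yields an empty list, which corresponds to the case where no transmembrane domains are predicted.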

Signal Peptides

Signal sequences serve to direct proteins from the cytosol to their destination (ER, mitochondria, etc.). There are two types of such sequences: sorting signal peptides/sequences and signal patches. Signal patch sequences are very difficult to predict and are not displayed on the protein pages at SGD, while the amino acids predicted to form the sorting signal peptide are indicated. Cleavage typically occurs on the carboxyl side of the predicted site. Signal peptides were predicted using version 3.0 of SignalP, an application available at the Center for Biological Sequence Analysis at the Technical University of Denmark (DTU). If signal peptides have been predicted for the query protein, a table will be present containing both a Proteome Browser thumbnail, which displays the relative location of the predicted signal peptide, and the specific amino acid coordinates. If no signal peptides are predicted, a message will be displayed in place of the table.
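
Given the coordinates of a predicted signal peptide, the sequence can be split into the signal peptide and the remaining mature protein, with cleavage on the carboxyl side of the last signal peptide residue. The Python sketch below illustrates this. The function name split_at_cleavage, the sequence and the coordinate are hypothetical and are not SignalP output.

```python
def split_at_cleavage(sequence, signal_end):
    """Split a protein at the carboxyl side of residue signal_end.

    signal_end is the 1-based coordinate of the last residue of the
    predicted signal peptide. Returns (signal_peptide, mature_protein).
    """
    if not 0 < signal_end < len(sequence):
        raise ValueError("signal_end must lie inside the sequence")
    return sequence[:signal_end], sequence[signal_end:]


# Hypothetical sequence and coordinate, for illustration only.
signal, mature = split_at_cleavage("MKLLAAGGTT", 4)
print(f"Signal peptide: residues 1-{len(signal)}")
```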

External Links

For your convenience, we have also provided links to some of the databases included in the InterProScan dataset (SMART, PROSITE), as well as to some that are not included in the InterPro dataset (NCBI-DART, eMOTIF and SCOP). Depending on your query of interest, one or another of these resources may produce the best results for you. For specific help with any of these resources, you will be best served at the original source.

Other Relevant Help Pages

