How do I add organism-specific GOA databases to Scaffold?

Scaffold lets users view the GO annotations associated with proteins. These annotations come from a variety of sources, including NCBI and the UniProt GOA knowledge base. Scaffold comes preconfigured with the ability to add the UniProt All Proteomes and Human Only GOA databases. While the All Proteomes database contains a wealth of information on numerous species, it is very large and takes a long time to download and index. Users often want to quickly search a smaller database that is relevant to the species they are working with. Scaffold allows users to download and search organism-specific GOA databases, which are much smaller and therefore download and index more quickly.

For general information about GOA databases and downloads, here are a couple of resources:

https://www.ebi.ac.uk/GOA/downloads

http://geneontology.org/page/download-annotations

To download specific databases, follow the instructions below. There are two ways to add GOA annotations in Scaffold:

Option 1 (preferred method):

  1. Access the UniProt FTP server available from the download link above or by following this link: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/
  2. From here, select the organism you are working with and open the corresponding folder (CHICKEN or DOG, for example). Note: if you do not see your organism listed on the main page, click on the Proteomes folder to bring up a vast collection of organism-specific proteomes.
  3. Select a file with the extension GAF.GZ; Scaffold cannot read the data in other file types. Note: choose the standard file or one of the files *_complex.gaf.gz, *_isoform.gaf.gz, or *_rna.gaf.gz based on your experimental needs. Consult the UniProt website for more information.
  4. Right-click the file you are interested in adding and select Copy Link Location.
  5. Follow this pathway: Edit > Edit GO Term Options > GO Annotation Databases. Here you will see any databases that have been added (NCBI is added automatically)
  6. Select New Database. Here you have the option to add the All Proteomes and Human Only UniProt databases.
  7. Select other web site from the dropdown menu and paste the copied link into the box.
  8. Give your database a name and choose a folder to save your GOA database to. This should be a local folder that you have permissions to read/write to (where you store your data or FASTA files, for example). Click Add
  9. Your database should appear under Database Name. Choose it and click Select. Note: the currently selected database is highlighted in green.
  10. Once your database is selected, GO terms should appear automatically. You can add GO terms to an experiment from a selected database by clicking Experiment > Add GO Annotations. This option adds GO annotations based on the database currently selected.
  11. To change your selected database, return to Edit > Edit GO Term Options > GO Annotation Databases and select a different Database Name.
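
The link copied in step 4 tends to follow a predictable pattern, so it can be handy to build it by hand when you already know the organism folder. The Python sketch below is illustrative only: the helper name `goa_file_url` is hypothetical, and the `goa_<organism>[_variant].gaf.gz` naming is an assumption based on the file names described in step 3. Always confirm the result against the actual server listing before pasting it into Scaffold.

```python
# Base folder of the GOA FTP server referenced in step 1.
BASE_URL = "ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/"

# File-name suffixes for the variants mentioned in step 3.
# The "standard" file is assumed to carry no suffix.
VARIANTS = {
    "standard": "",
    "complex": "_complex",
    "isoform": "_isoform",
    "rna": "_rna",
}


def goa_file_url(organism: str, variant: str = "standard") -> str:
    """Build the presumed link to an organism-specific GAF.GZ file.

    Assumes the folder is the upper-case organism name (e.g. CHICKEN)
    and the file is named goa_<organism><suffix>.gaf.gz.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    folder = organism.upper()
    filename = f"goa_{organism.lower()}{VARIANTS[variant]}.gaf.gz"
    return f"{BASE_URL}{folder}/{filename}"


if __name__ == "__main__":
    # Prints ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/CHICKEN/goa_chicken_isoform.gaf.gz
    print(goa_file_url("chicken", "isoform"))
```

Organisms that only appear under the Proteomes folder use a different layout, so this helper does not cover them.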

Option 2 (alternative method):

  1. Follow steps 1 through 3 above to locate the organism-specific GOA database required.
  2. Download the file by clicking it and selecting Save File.
  3. Follow this pathway: Edit > Edit GO Term Options > GO Annotation Databases.
  4. Select New Database.
  5. To add an additional database, select Other File. This will bring up a dialog box that allows you to add the GAF.GZ file you recently downloaded.
  6. Give your database a name and choose a folder to save your GOA database to. This should be a local folder that you have permissions to read/write to (where you store your data or FASTA files for example). Click Add
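
Before adding a manually downloaded file, it can help to confirm that the download is an intact gzip archive in GAF format. The sketch below is a minimal, unofficial check and is not part of Scaffold; the function name `summarize_gaf` is hypothetical. It relies on the published GAF layout: comment lines start with `!`, data lines are tab-delimited with at least 15 columns, and column 13 holds the taxon identifier.

```python
import gzip
from collections import Counter


def summarize_gaf(path):
    """Read a .gaf.gz file and return basic facts about its contents.

    Raises an error if the file is not valid gzip or if a data line
    has fewer columns than any GAF version allows.
    """
    version = None
    annotations = 0
    taxa = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue
            if line.startswith("!"):
                # Header/comment line; pick up the format version if present.
                if line.startswith("!gaf-version:"):
                    version = line.split(":", 1)[1].strip()
                continue
            fields = line.split("\t")
            if len(fields) < 15:
                raise ValueError(
                    f"line {line_number}: expected at least 15 columns, "
                    f"found {len(fields)}"
                )
            annotations += 1
            taxa[fields[12]] += 1  # column 13: taxon identifier
    return {"version": version, "annotations": annotations, "taxa": dict(taxa)}


if __name__ == "__main__":
    # Hypothetical file name; replace with the file you downloaded.
    print(summarize_gaf("goa_chicken.gaf.gz"))
```

If the annotation count is zero or the taxon identifiers are not the ones you expect, download the file again or check that you picked the right organism folder.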

Note: Option 1 requires an Internet connection to download the database file the first time. If you are installing on a computer without Internet access, use Option 2: download the file on another computer and transfer it via a thumb drive.

 
