Swissprot Database

From bwHPC Wiki
Description          Content
module load          dbdata/swissprot/current
Availability         bwUniCluster
License              Public Domain | Free for academic users
Citation             Publications on Uniprot/Swissprot
Links                Nucleic Acids Research | N.A.R.-Oxford Journals
Graphical Interface  No
Update               Daily at midnight

1 Description

SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and a high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDSs) in the EMBL Nucleotide Sequence Database, except the CDSs already included in SWISS-PROT. We also describe the Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT.

More detailed information about the SWISS-PROT Database

2 Versions and Availability

A list of versions currently available on all bwHPC-C5 clusters can be obtained from the Cluster Information System (CIS).

Show a list of available versions using 'module avail dbdata/swissprot' on the bwUniCluster.

$ module avail dbdata/swissprot
------------------------ /opt/bwhpc/common/modulefiles -------------------------
dbdata/swissprot/20150920 dbdata/swissprot/current
$ 
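If a script needs the newest dated version rather than 'current', the listing above can be filtered. The following is a minimal sketch: the function name newest_dated_version is made up for this example, and the 2>&1 redirection assumes that the module command on the cluster prints its listing to stderr, as many Environment Modules installations do.

```shell
# Read `module avail` output on stdin and print the newest dated
# version (names ending in YYYYMMDD). The function name is hypothetical.
newest_dated_version() {
    # Keep only names with an 8-digit date, then take the highest one;
    # with a fixed YYYYMMDD layout, text order equals date order.
    grep -o 'dbdata/swissprot/[0-9]\{8\}' | sort -u | tail -n 1
}

# Usage on the cluster (assumption: the listing goes to stderr):
#   module avail dbdata/swissprot 2>&1 | newest_dated_version
```

The 'current' entry is ignored on purpose, because it carries no date in its name.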


3 License

Public Domain

4 Updates

On the bwUniCluster, SWISS-PROT is updated daily at midnight.

4.1 Update-Date

You can check the date of the latest update by using
'module whatis dbdata/swissprot/current' or 'module help dbdata/swissprot/current'.

  • $ module whatis dbdata/swissprot/current

dbdata/swissprot/current: This packages contains the Nucleic Acids Protein Sequence
  Database Swiss-Prot. Last update: 24.11.2015

  • $ module help dbdata/swissprot/current

Module Specific Help for 'dbdata/swissprot/current'
DESCRIPTION
SWISS-PROT is a curated protein sequence database which strives
[...]
Last update: 24.11.2015
[...]
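For scripts that need the update date in a sortable form, the "Last update" line shown above can be parsed. This is only a sketch: the function name extract_last_update is invented here, the DD.MM.YYYY layout is taken from the example output above, and the 2>&1 redirection assumes the module command writes to stderr.

```shell
# Read `module whatis` or `module help` output on stdin and print the
# "Last update" date as YYYY-MM-DD. The function name is hypothetical.
extract_last_update() {
    # Match "Last update: DD.MM.YYYY" and reorder the three fields.
    sed -n 's/.*Last update: \([0-9]\{2\}\)\.\([0-9]\{2\}\)\.\([0-9]\{4\}\).*/\3-\2-\1/p' | head -n 1
}

# Usage on the cluster (assumption: the text goes to stderr):
#   module whatis dbdata/swissprot/current 2>&1 | extract_last_update
```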

4.2 Exceptions

Updates are skipped if:

  • the current database version is in use by a module, e.g. another module has called 'module load dbdata/swissprot/current' and its job has not yet finished,
  • no newer version is available at the source system.

4.3 Update-Source and Logs

Source: ftp.ncbi.nih.gov/blast/db/swissprot.tar.gz
Logs: several log files are located in $SWISSPROT_HOME.

$ ls -x $SWISSPROT_HOME 
bwhpc-examples cron.out          lastupdate.txt   ...

lastupdate.txt : Information about the most recent update attempts, successful or not, including date and time.
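To look at the most recent entries of lastupdate.txt, a small helper such as the following may be enough. It is a sketch only: the function name show_update_log is hypothetical, and since the internal format of the file is not documented here, the helper just prints the last lines unchanged.

```shell
# Print the last N lines (default 5) of lastupdate.txt.
# $1: directory holding the log files (defaults to $SWISSPROT_HOME)
# $2: number of lines to show
# The function name is hypothetical.
show_update_log() {
    dir=${1:-${SWISSPROT_HOME:-}}
    lines=${2:-5}
    log="$dir/lastupdate.txt"
    # Fail with a message instead of a bare tail error if the file is absent.
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
    tail -n "$lines" "$log"
}

# Usage on the cluster:
#   show_update_log              # uses $SWISSPROT_HOME
#   show_update_log "$SWISSPROT_HOME" 20
```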

5 Usage

5.1 Loading the module

You can load the default version of the SWISS-PROT Database with the command
'module load dbdata/swissprot'.

$ module load dbdata/swissprot
module dbdata/swissprot loaded:

** IMPORTANT **

Copy the database files to your local workspace.
Use only a local copy of the files for your calculations.
[...]

$ module list
Currently Loaded Modulefiles:
  1) dbdata/swissprot/current
$ 

The module does not load any other modules, such as compilers or other software packages.
It provides an interface to a database that you can use with other software modules such as Structure, Bowtie, Fasta, Ultrascan ...
If loading the module fails, check whether you have already loaded one of those modules in a version other than the one required by the SWISS-PROT Database.

If you wish to load a specific (older) version, you can do so with e.g. 'module load dbdata/swissprot/20150920'. At the time this document was created, two versions were available.

$ module load dbdata/swissprot/20150920
  • dbdata/swissprot/current : up-to-date database
  • dbdata/swissprot/20150920 : older version, no longer updated (the YYYYMMDD date in the name can differ)


5.2 Program Binaries

No binaries are supplied with the database.
The module only provides links to the database files through environment variables, which you can use in your submit scripts for other modules.
After loading the SWISS-PROT module (module load dbdata/swissprot/current), the database path is also set in your local $PATH and other environment variables.
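The module's load message asks you to work on a local copy of the database files. A copy step for a submit script could look roughly like the sketch below. The function name copy_swissprot and the destination directory are placeholders, and the sketch assumes that $DBDATA, as set by the module, names the directory containing the swissprot.* files.

```shell
# Copy all swissprot.* files from the module's database directory
# into a private directory. The function name is hypothetical.
# $1: source directory (on the cluster: "$DBDATA", set by the module)
# $2: destination directory (placeholder, e.g. a workspace you own)
copy_swissprot() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "source directory not found: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # -p keeps timestamps, so the copy can later be compared with the source.
    cp -p "$src"/swissprot.* "$dest"/
}

# Usage in a submit script, after 'module load dbdata/swissprot/current':
#   copy_swissprot "$DBDATA" "$HOME/swissprot_copy"   # destination is a placeholder
```

Pointing the downstream tool at the copy rather than at $DBDATA keeps a running job independent of the shared directory, which the page describes as being updated daily.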

5.3 Swissprot-Specific Environments

All SWISS-PROT environment variables are set and listed by the 'module load' command.
You can also list them with 'module display dbdata/swissprot/current' (the module does not need to be loaded first) or with 'module load dbdata/swissprot/current && env | grep DBDATA'.

Example

$ module load dbdata/swissprot/current

module dbdata/swissprot loaded:

** IMPORTANT **

Copy the database files to your local workspace.
Use only a local copy of the files for your calculations.

All available environment variables:

DBDATA=/opt/bwhpc/common/dbdata/swissprot/current
DBDATA_PIN=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.pin
DBDATA_PNI=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.pni
DBDATA_PND=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.pnd
DBDATA_PSI=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.psi
DBDATA_PSD=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.psd
DBDATA_PPI=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.ppi
DBDATA_PPD=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.ppd
DBDATA_POG=/opt/bwhpc/common/dbdata/swissprot/current/swissprot.00.pog
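Before starting a long job it can be useful to confirm that every file named by a DBDATA_* variable is actually present. The following sketch does that for whatever DBDATA_* variables exist in the current environment; the function name check_dbdata_files is hypothetical.

```shell
# Print one line per DBDATA_* environment variable, stating whether
# the file it names exists. The function name is hypothetical.
check_dbdata_files() {
    # Split each "NAME=value" line at the first '='.
    env | grep '^DBDATA_' | while IFS='=' read -r name path; do
        if [ -e "$path" ]; then
            echo "ok      $name"
        else
            echo "missing $name ($path)"
        fi
    done
}

# Usage, after 'module load dbdata/swissprot/current':
#   check_dbdata_files
```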


5.4 Extensions

Extension  Content             Format

Protein database formatted without "-o T"
phr        deflines            binary
pin        indices             binary
psq        sequence data       binary

Protein database formatted with "-o T" add these ISAM files:
pnd        GI data             binary
pni        GI indices          binary
psd        non-GI data         binary
psi        non-GI indices      binary
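The table can also be expressed as a small lookup, for example to annotate a directory listing. This is a sketch with a hypothetical function name, describe_extension. It covers only the extensions listed in the table, and anything else is reported as not listed.

```shell
# Print the table's description for a database file name, based on
# its extension. The function name is hypothetical.
describe_extension() {
    # ${1##*.} strips everything up to the last dot, leaving the extension.
    case "${1##*.}" in
        phr) echo "deflines (binary)" ;;
        pin) echo "indices (binary)" ;;
        psq) echo "sequence data (binary)" ;;
        pnd) echo "GI data (binary)" ;;
        pni) echo "GI indices (binary)" ;;
        psd) echo "non-GI data (binary)" ;;
        psi) echo "non-GI indices (binary)" ;;
        *)   echo "not listed in the table" ;;
    esac
}

# Usage, after 'module load dbdata/swissprot/current':
#   for f in "$DBDATA"/swissprot.*; do
#       printf '%s: %s\n' "$f" "$(describe_extension "$f")"
#   done
```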

Please report missing entries and errors to Rainer Rutka.

6 Version-Specific Information

For more detailed information on a specific SWISS-PROT version, see the help provided by the module system with the command 'module help dbdata/swissprot/current'.
For a short summary of what SWISS-PROT is, use the command 'module whatis dbdata/swissprot/current'.
Example

$ module list 
Currently Loaded Modulefiles:
  1) dbdata/swissprot/current

$ module whatis dbdata/swissprot/current
dbdata/swissprot/current: This packages contains the Nucleic Acids Protein Sequence
    Database Swiss-Prot. Last update: 24.11.2015

$ module help dbdata/swissprot/current
----------- Module Specific Help for 'dbdata/swissprot/current' ---------------------------
[...]
DOCUMENTATION
   * Get started
     https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102476/
   * List of all the documents that are currently available
     http://www.expasy/sprot/sp_docu.html
   * Swiss-Prot is available at
     http://www.expasy.ch/sprot/ 
     and
     http://www.ebi.ac.uk/swissprot/
   * Sequence Library Downloads (excerpt from the Fasta36 page)
     http://fasta.bioch.virginia.edu/fasta_www2/fasta_db_down.shtml
[...]