Fasta - bwHPC Wiki

Fasta

Description: Content
module load: bio/fasta/36.3.8
License: Apache License
Citing: Code in the smith_waterman_sse2.c and smith_waterman_sse2.h files is copyright (c) 2006 by Michael Farrar. Code in the global_sse2.c, global_sse2.h, glocal_sse2.c, and glocal_sse2.h files is copyright (c) 2010 by Michael Farrar.
Links: William R. Pearson's Fasta Page | Fasta Sequence Comparison
Graphical Interface: No
Included in module: compiler/intel/14.0 | mpi/openmpi/1.10-intel-14.0

1 Description

FASTA compares a protein sequence to another protein sequence or to a protein database, or a DNA sequence to another DNA sequence or a DNA library.
The FASTA programs find regions of local or global similarity between protein or DNA sequences, either by searching protein or DNA databases or by identifying local duplications within a sequence. Other programs provide information on the statistical significance of an alignment. Like BLAST, FASTA can be used to infer functional and evolutionary relationships between sequences and to help identify members of gene families.

More information about FASTA

2 Versions and Availability

A list of versions currently available on all bwHPC-C5 clusters can be obtained from the Cluster Information System (CIS).

On the bwUniCluster, the command 'module avail bio/fasta' shows the available versions.

$ module avail bio/fasta
--------------------- /opt/bwhpc/common/modulefiles ----------------------
bio/fasta/36.3.8
$ 


3 License

Apache License

4 Usage

4.1 Loading the module

You can load the default version of Fasta with the command

$ module load bio/fasta
$ module list
Currently Loaded Modulefiles:
  1) compiler/intel/14.0(default)   3) bio/fasta/36.3.8
  2) mpi/openmpi/1.10-intel-14.0
$ 

The module will try to load the modules it needs to function (e.g. compiler/intel). If loading the module fails, check whether you have already loaded one of those modules in a version other than the one Fasta requires. To load a specific (e.g. older) version, name it explicitly, e.g. 'module load bio/fasta/36.3.8'. At the time this document was created, only one version was available.

$ module load bio/fasta/36.3.8


4.2 Program Binaries

The installed binaries are located in the bin folder of the software; the listing below shows its contents.
After loading the FASTA module ('module load bio/fasta/36.3.8'), this path is added to $PATH and stored in the $FASTA_BIN_DIR environment variable.

$ ls -x $FASTA_BIN_DIR
fasta36     fasta36_mpi     fastf36     fastf36_mpi     fastm36   fastm36_mpi
fasts36     fasts36_mpi     fastx36     fastx36_mpi     fasty36   fasty36_mpi
ggsearch36  ggsearch36_mpi  glsearch36  glsearch36_mpi  README    reln.sh
ssearch36   ssearch36_mpi   tfastf36    tfastf36_mpi    tfasts36  tfasts36_mpi
tfastx36    tfastx36_mpi    tfasty36    tfasty36_mpi
$ 

*_mpi = MPI-capable versions of FASTA.
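
Because the programs in the listing above come in a serial and an _mpi variant, a job script can choose between them at run time. The following is only a minimal sketch: the function name pick_fasta_binary is hypothetical and not provided by the module, and it assumes nothing beyond $FASTA_BIN_DIR pointing to the directory listed above.

```shell
# pick_fasta_binary NAME
# Print the full path of NAME_mpi if it is executable in $FASTA_BIN_DIR,
# otherwise the path of the serial NAME. Return 1 if neither is found.
# (Hypothetical helper for illustration; not provided by the module.)
pick_fasta_binary() {
    if [ -z "${FASTA_BIN_DIR:-}" ]; then
        echo "FASTA_BIN_DIR is not set; load the module first" >&2
        return 1
    fi
    if [ -x "$FASTA_BIN_DIR/${1}_mpi" ]; then
        echo "$FASTA_BIN_DIR/${1}_mpi"
    elif [ -x "$FASTA_BIN_DIR/$1" ]; then
        echo "$FASTA_BIN_DIR/$1"
    else
        echo "no such FASTA program: $1" >&2
        return 1
    fi
}
```

With the module loaded, 'pick_fasta_binary ssearch36' would print the path of ssearch36_mpi, since that variant appears in the listing.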

4.3 Disk Usage / Workspaces

Change to your local $TMP directory or to your workspace before starting your calculations.
'fasta_repo' is an example name of a workspace created with the command 'ws_allocate'.

$ cd $(ws_find fasta_repo)
$ pwd
/workspace/scratch/...-fasta_repo-0
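
The choice between a workspace and $TMP can also be scripted. The sketch below is an illustration only: fasta_workdir is a hypothetical helper name, and it assumes that ws_find prints the workspace path as in the example above and prints nothing when no workspace with that name exists. The fallback to /tmp when $TMP is unset is likewise an assumption.

```shell
# fasta_workdir NAME
# Print the workspace path reported by ws_find for NAME, or fall back to
# $TMP when ws_find is unavailable or reports no existing directory.
# (Hypothetical helper; the /tmp fallback for an unset $TMP is an assumption.)
fasta_workdir() {
    ws_path=""
    if command -v ws_find >/dev/null 2>&1; then
        ws_path=$(ws_find "$1" 2>/dev/null) || ws_path=""
    fi
    if [ -n "$ws_path" ] && [ -d "$ws_path" ]; then
        echo "$ws_path"
    else
        echo "${TMP:-/tmp}"
    fi
}
```

A job script could then start with 'cd "$(fasta_workdir fasta_repo)"'.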



5 Moab Submit Examples

You can copy a simple example to your workspace and submit it with 'msub'.
Always use the $FASTA_EXA_DIR environment variable to locate the examples!

$ cd $(ws_find fasta_repo)
$ pwd
/workspace/scratch/...-fasta_repo-0
$ ls -l $FASTA_EXA_DIR
-rw-r--r--. 1 ... ... 6566 28. Sep 15:39 bwhpc-fasta-example.moab
-rw-r--r--. 1 ... ... 14 23. Nov 14:35 README.bwhpc
$ cp $FASTA_EXA_DIR/*.moab .
$ ls -l
-rw-r--r--. 1 ... 6566 23. Nov 14:06 bwhpc-fasta-example.moab
$ cp bwhpc-fasta-example.moab myfastajob.moab
$ vi myfastajob.moab              # make your modifications here
$ msub myfastajob.moab            # start job submission


How FASTA is started in bwhpc-fasta-example.moab

[...]
#MSUB -l nodes=4:ppn=4
[...]
echo " "
echo "### Run Fasta36 in MPI-mode..."
echo " "
echo "starting a protein TEST calculation..."
echo "using default of 4 nodes"
# # Have it all one's own way here:
mpiexec -v -x PATH -x LD_LIBRARY_PATH fasta36_mpi -q -m 9c -Z 100000 -d 10 seq/prot_test.lseg q > results/test_plib.ok2_mpi
echo "done"
[...]

The example script requests 4 nodes by default.
See the documentation if you would like to change this (not recommended).
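
To see how many cores a job script requests in total, the resource line can be read back from the file. The helper below is a rough sketch with a hypothetical name, msub_total_cores. It assumes the request is written exactly in the form '#MSUB -l nodes=N:ppn=M', as in the excerpt above, and it only evaluates the first such line.

```shell
# msub_total_cores JOBSCRIPT
# Print nodes * ppn from the first "#MSUB -l nodes=N:ppn=M" line.
# Return 1 if the script contains no such line.
# (Hypothetical helper; other forms of resource requests are not handled.)
msub_total_cores() {
    request=$(sed -n 's/^#MSUB[[:space:]]*-l[[:space:]]*nodes=\([0-9][0-9]*\):ppn=\([0-9][0-9]*\).*/\1 \2/p' "$1" | head -n 1)
    if [ -z "$request" ]; then
        echo "no nodes=N:ppn=M request found in $1" >&2
        return 1
    fi
    nodes=${request% *}   # text before the space: N
    ppn=${request#* }     # text after the space: M
    echo $((nodes * ppn))
}
```

For a script containing '#MSUB -l nodes=4:ppn=4', 'msub_total_cores myfastajob.moab' prints 16.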

6 Fasta-Specific Environments

To see a list of all FASTA environment variables set by the 'module load' command, use 'env | grep FASTA' or the command 'module display bio/fasta/36.3.8'.

$ module list
Currently Loaded Modulefiles:
  1) compiler/intel/14.0(default)   3) bio/fasta/36.3.8
  2) mpi/openmpi/1.10-intel-14.0
$ env | grep FASTA
FASTA_BIN_DIR=/opt/bwhpc/common/bio/fasta/36.3.8/bin
FASTA_VERSION=36.3.8
FASTA_EXA_DIR=/opt/bwhpc/common/bio/fasta/36.3.8/bwhpc-examples
FASTA_HOME=/opt/bwhpc/common/bio/fasta/36.3.8
$ 
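
A job script can verify these variables before starting a calculation. The following is a minimal sketch with a hypothetical function name, check_fasta_env. It relies only on the four variable names shown in the listing above.

```shell
# check_fasta_env
# Report every FASTA_* variable from the listing above that is unset or
# empty. Return 0 if all four are set, 1 otherwise.
# (Hypothetical helper for illustration; not provided by the module.)
check_fasta_env() {
    missing=0
    for name in FASTA_HOME FASTA_BIN_DIR FASTA_EXA_DIR FASTA_VERSION; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling 'check_fasta_env || exit 1' near the top of a job script stops the job early if the module was not loaded.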


7 Version-Specific Information

For more detailed information about a specific FASTA version, see the help text available via the module system with the command

$ module help bio/fasta/36.3.8


For a short summary of what FASTA does, use the command

$ module whatis bio/fasta/36.3.8


EXAMPLES using the FASTA-Module

$ module avail bio/fasta
--------------------------------- /opt/bwhpc/common/modulefiles ---------------------------------
bio/fasta/36.3.8
$ module whatis bio/fasta
bio/fasta            : Fasta36.3.8 Scan a protein or DNA sequence library for similar 
   sequences

$ module help bio/fasta
----------- Module Specific Help for 'bio/fasta/36.3.8' -----------

This is the MPI version of Fasta!

 [...]

  FASTA AT A GLANCE
  /opt/bwhpc/common/bio/fasta/36.3.8/bin

  * fasta36(_mpi) - scan a protein or DNA sequence library for similar sequences

  * fastx36(_mpi) - compare a DNA sequence to a protein sequence database, 
    comparing the translated DNA sequence in forward and reverse frames.

  * tfastx36(_mpi) - compare a protein sequence to a DNA sequence database, 
    calculating similarities with frameshifts to the forward and reverse 
    orientations.

  * fasty36(_mpi) - compare a DNA sequence to a protein sequence database,
    comparing the translated DNA sequence in forward and reverse frames.

  * tfasty36(_mpi) - compare a protein sequence to a DNA sequence database,
    calculating similarities with frameshifts to the forward and reverse
    orientations.

  * fasts36(_mpi) - compare unordered peptides to a protein sequence database.

  * fastm36(_mpi) - compare ordered peptides (or short DNA sequences) to a
    protein (DNA) sequence database.

  * tfasts36(_mpi) - compare unordered peptides to a translated DNA sequence
    database.

  * fastf36(_mpi) - compare mixed peptides to a protein sequence database.

  * tfastf36(_mpi) - compare mixed peptides to a translated DNA sequence 
    database.

  * ssearch36(_mpi) - compare a protein or DNA sequence to a sequence database
    using the Smith-Waterman algorithm.

  * ggsearch36(_mpi) - compare a protein or DNA sequence to a sequence database
    using a global alignment (Needleman-Wunsch).

  * glsearch36(_mpi) - compare a protein or DNA sequence to a sequence database
    with alignments that are global in the query and local in the database
    sequence (global-local).

  * lalign36(_mpi) - produce multiple non-overlapping alignments for protein and
    DNA sequences using the Huang and Miller sim algorithm for the Water-
    man-Eggert algorithm.

  * prss36, prfx36 - discontinued; all the FASTA programs will estimate
    statistical significance using 500 shuffled sequence scores if two
    sequences are compared.

  This version is compiled with openmpi support synced Intel- and System-
  Math-Libraries.

DOCUMENTATION

*  Get started:
   http://fasta.bioch.virginia.edu/fasta_www2/fasta_intro.shtml
   German - https://de.wikipedia.org/wiki/FASTA-Format
   English - https://en.wikipedia.org/wiki/FASTA_format

*  Fasta Sequence Comparison at the U. of Virginia:
   http://fasta.bioch.virginia.edu/fasta_www2/fasta_list2.shtml

*  Full manual, command-line optionen and more:
   /opt/bwhpc/common/bio/fasta/36.3.8/doc

   http://fasta.bioch.virginia.edu/fasta_www2/fasta_guide.pdf
   http://www.genome.jp/tools-bin/show_man?fasta
   http://home.cc.umanitoba.ca/~psgendb/birchhomedir/doc/fasta/fasta36.1.html

*  Fasta tests can be found here:
   /opt/bwhpc/common/bio/fasta/36.3.8/examples
   /opt/bwhpc/common/bio/fasta/36.3.8/test

*  bwHPC examples and a moab example script can be found here:
   /opt/bwhpc/common/bio/fasta/36.3.8/bwhpc-examples

[...]
$