Integrative-Genomics-Viewer - bwHPC Wiki

Integrative-Genomics-Viewer

Description: Content
module load: bio/igv
Availability: bwUniCluster
License: MIT License
Citing: James T. Robinson, Helga Thorvaldsdóttir, Wendy Winckler, Mitchell Guttman, Eric S. Lander, Gad Getz, Jill P. Mesirov. Integrative Genomics Viewer. Nature Biotechnology 29, 24–26 (2011).
    Helga Thorvaldsdóttir, James T. Robinson, Jill P. Mesirov. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics 14, 178-192 (2013).
Links: IGV Homepage | IGV User-Guide
Graphical Interface: Yes
Requirements: IGV 2.3.x requires Java 7 (Java Runtime Environment, version >= 1.7)
Plugins: IGV-Tools


1 Description/What is Integrative-Genomics-Viewer (IGV)

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
For more information on features, please visit the IGV Homepage.

2 Versions and Availability

A list of versions currently available on all bwHPC-C5-Clusters can be obtained from the Cluster Information System (CIS).

On the command line, you can list the available versions with the command 'module avail bio/igv'.

$ module avail bio/igv
------------------ /opt/bwhpc/common/modulefiles --------------------
bio/igv/2.3
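
When a script needs the version numbers rather than the raw listing, the output can be filtered. The following is a minimal sketch, not part of the module: the helper name list_igv_versions is made up for illustration, and the sample input is copied from the listing above. On many Environment Modules installations 'module avail' writes to stderr, which is why the usage comment redirects it.

```shell
# Print the IGV versions found in `module avail` output read from stdin.
# list_igv_versions is a hypothetical helper, not provided by the module.
list_igv_versions() {
    grep -o 'bio/igv/[^[:space:](]*' | sed 's|^bio/igv/||' | sort -u
}

# Sample input copied from the listing above. On the cluster one would
# instead run:  module avail bio/igv 2>&1 | list_igv_versions
printf '%s\n' \
    '------------------ /opt/bwhpc/common/modulefiles --------------------' \
    'bio/igv/2.3' | list_igv_versions
```

With the sample input this prints 2.3.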


3 License

Permission to use this work is granted under the MIT License.

4 Usage

4.1 Loading the module

4.1.1 Default

You can load the default version of IGV with the command 'module load bio/igv'.

$ module avail bio/igv
------------------- /opt/bwhpc/common/modulefiles --------------------
bio/igv/2.3
$ module load bio/igv
$ module list
Currently Loaded Modulefiles:
  1) bio/igv/2.3

The module will try to load modules it needs to function. If loading the module fails, check if you have already loaded one of those modules, but not in the version needed for IGV.

4.1.2 Special Version

If you wish to load a specific version of IGV, use 'module load bio/igv/version', replacing 'version' with the version you want.
Example:

$ module load bio/igv/2.3
$ module list
Currently Loaded Modulefiles:
  1) bio/igv/2.3
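
In a script it can be useful to confirm that the intended version really is the one loaded. The sketch below assumes that the module sets the IGV_VERSION variable (as shown by 'module display bio/igv' on this page); the function name require_igv_version is hypothetical.

```shell
# Succeed only if the loaded IGV module matches the wanted version.
# Relies on IGV_VERSION, which the bio/igv module is assumed to set.
require_igv_version() {
    want="$1"
    have="${IGV_VERSION:-}"
    if [ "$have" != "$want" ]; then
        echo "ERROR: wanted IGV $want, but IGV_VERSION is '${have:-unset}'" >&2
        return 1
    fi
    echo "IGV $have is loaded"
}

# Usage after 'module load bio/igv/2.3':
#   require_igv_version 2.3 || exit 1
```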

4.2 Program Binaries

IGV is a Java application. To run, it needs a suitable Java Runtime Environment (JRE) on your system, so check that one is available before you try to start IGV.
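
One possible way to check the JRE is sketched below. It only compares against the documented minimum (Java 7, i.e. version 1.7); the helper name java_major is made up for this example. Note that 'java -version' prints to stderr.

```shell
# Reduce a Java version string to its major number:
# "1.7.0_80" -> 7 (old 1.x scheme), "11.0.2" -> 11 (newer scheme).
# java_major is a hypothetical helper for this sketch.
java_major() {
    case "$1" in
        1.*) rest="${1#1.}"; echo "${rest%%.*}" ;;
        *)   echo "${1%%[._-]*}" ;;
    esac
}

# 'java -version' writes to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if [ -n "$ver" ] && [ "$(java_major "$ver")" -ge 7 ]; then
        echo "Java $ver meets the documented minimum (>= 1.7)"
    else
        echo "Java 7 or newer is required, found: ${ver:-unknown}" >&2
    fi
else
    echo "No java executable found in PATH" >&2
fi
```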

Before using IGV for the first time, go to your $HOME folder, load this module and start 'igv'. Some initialisation will be done. Be patient! This happens only once.

You can find the binaries in the main folder of the IGV installation. Loading the IGV module ('module load bio/igv') adds this folder to $PATH and stores it in the $IGV_HOME environment variable.
You can run IGV as an interactive graphical application.

$ ls -xRF $IGV_HOME
/opt/bwhpc/common/bio/igv/2.3: 
# $IGV_HOME
batik-codec__V1.7.jar  bwhpc-examples/  goby-io-igv__V1.0.jar  igv@
igv.jar   igv.sh@    IGVTools/   modulefiles/
readme.txt
# $IGV_EXA_DIR
/opt/bwhpc/common/bio/igv/2.3/bwhpc-examples:
bwhpc-igv-example.moab  example.bam  example.bam.bai  example.bam.fai
example.bam.tdf         igv.sh*      README.bwhpc-examples
# Command-Line IGV_TOOLS   $IGV_TOOLS_BIN_DIR
/opt/bwhpc/common/bio/igv/2.3/IGVTools:
genomes/  igvtools*  igvtools_gui*  igvtools.jar  igvtools_readme.txt
# Genomes-Sizes Data (inside IGV-Tools)
/opt/bwhpc/common/bio/igv/2.3/IGVTools/genomes:
1kg_ref.chrom.sizes                  1kg_v37_alias.tab
[...]
# Modulefile
/opt/bwhpc/common/bio/igv/2.3/modulefiles:
bio-igv-2.3

'*' indicates the file is executable, '/' is a folder and '@' is a symbolic link.

4.2.1 IGV GUI

IGV is a Java program with a graphical user interface (GUI).
To run IGV, use the 'igv' wrapper script to launch the program. Do not use the original 'igv.sh' supplied by the vendor.

  • -Xmx4000m sets the maximum Java heap size to 4000 MB; adjust the number up or down as needed
  • Add the flag -Ddevelopment=true to use features still in development
#!/bin/bash
# Adapted by rainer.rutka@uni-konstanz.de for bwUniCluster
[ -z ${IGV_HOME} ] && { module load bio/igv; sleep 5; } \
|| echo "IGV_HOME: $IGV_HOME"
exec java -Xmx4000m \
   -Dapple.laf.useScreenMenuBar=true \
   -Djava.net.preferIPv4Stack=true \
   -jar ${IGV_HOME}/igv.jar "$@"
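
The heap size in the wrapper is fixed at 4000 MB. If you keep a private copy of the script, one way to make the value adjustable is sketched below. This is an illustration only: IGV_MEM_MB is a hypothetical variable that the installed wrapper does not read, and the function merely prints the resulting command line (without the Apple-specific flag) instead of starting IGV.

```shell
# Print the java command line for IGV with a configurable heap size.
# IGV_MEM_MB is a hypothetical variable, not read by the installed wrapper.
igv_java_cmd() {
    mem_mb="${IGV_MEM_MB:-4000}"    # default matches the wrapper above
    case "$mem_mb" in
        ''|*[!0-9]*)
            echo "IGV_MEM_MB must be a whole number of MB" >&2
            return 1 ;;
    esac
    echo "java -Xmx${mem_mb}m -Djava.net.preferIPv4Stack=true -jar ${IGV_HOME:-.}/igv.jar"
}

igv_java_cmd
```

Printing the command first makes it easy to inspect the memory setting before replacing the echo with an actual 'exec java ...' call.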

  • You must have a running X-Window server on your local system (X-forwarding).
  • Start the ssh-session with the option "-X" (ssh -X 'your-id'@'your-cluster-DN').
$ cd $IGV_HOME
$ ls -l igv*
[...] igv -> bwhpc-examples/igv.sh
[...] igv.jar
[...] igv.sh -> bwhpc-examples/igv.sh
$ igv &  # or igv.sh
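
If the IGV window does not appear, a missing X display is a likely cause. With working X forwarding, ssh sets the DISPLAY variable in the remote session, so a quick check could look like the sketch below (have_display is a hypothetical helper name).

```shell
# Return success if an X display is available to this shell.
# have_display is a hypothetical helper for this sketch.
have_display() {
    [ -n "${DISPLAY:-}" ]
}

if have_display; then
    echo "DISPLAY=$DISPLAY: X forwarding looks active"
else
    echo "DISPLAY is not set: reconnect with 'ssh -X' before starting igv" >&2
fi
```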


IGV GUI-Version

4.2.2 IGVTools

The IGVTools are command line utilities for preprocessing data files.

# IGV_TOOLS   $IGV_TOOLS_BIN_DIR
/opt/bwhpc/common/bio/igv/2.3/IGVTools:
genomes/  igvtools*  igvtools_gui*  igvtools.jar  igvtools_readme.txt


The igvtools utility provides a set of tools for pre-processing data files. File names must contain an accepted file extension, e.g. test-xyz.bam. Tools include:

  • toTDF
    Converts a sorted data input file to a binary tiled data (.tdf) file.
    Used to preprocess large datasets for improved IGV performance.
    Supported input file formats: .cn, .gct, .igv, .res, .snp, and .wig
    Note: This tool was previously known as "Tile"
  • count
    Computes average alignment or feature density over a specified window size across the genome and outputs a binary tiled data .tdf file, a text .wig file, or both, depending on inputs.
    Used to create a track that can be displayed in IGV, for example as a bar chart.
    Supported input file formats: .aligned, .bam, .bed, .psl, .pslx, and .sam
  • index
    Creates an index file for an ASCII alignment or feature file.
    Index files are required for loading alignment files into IGV, and can significantly improve performance for large feature files.
    Supported input file formats: .aligned, .bed, .psl, .sam, and .vcf (v3.2)
    To sort and index .bam files, refer to Samtools or the Picard.SortSam module on GenePattern.
  • sort
    Sorts the input file by start position.
    Used to prepare data files for tools that require sorted input files.
    Supported input file formats: .aligned, .bed, .cn, .igv, .psl, .sam, and .vcf
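
Because file names must carry an accepted extension, a script can check the extension before calling igvtools. The following sketch simply encodes the extension lists given above; the function igvtools_accepts is hypothetical and not part of igvtools.

```shell
# Check a file name against the input formats listed above for each tool.
# igvtools_accepts is a hypothetical helper, not part of igvtools.
# Returns 0 if accepted, 1 if not, 2 for an unknown tool name.
igvtools_accepts() {
    tool="$1"
    ext="${2##*.}"                  # text after the last dot
    case "$tool" in
        toTDF) list="cn gct igv res snp wig" ;;
        count) list="aligned bam bed psl pslx sam" ;;
        index) list="aligned bed psl sam vcf" ;;
        sort)  list="aligned bed cn igv psl sam vcf" ;;
        *)     return 2 ;;
    esac
    for e in $list; do
        [ "$e" = "$ext" ] && return 0
    done
    return 1
}

if igvtools_accepts count example.bam; then
    echo "count accepts example.bam"
fi
if ! igvtools_accepts index example.bam; then
    echo "index does not accept example.bam"
fi
```

On the cluster, a successful check would be followed by the real call, for example 'igvtools index' on a sorted .bed file.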


From IGV: igvtools is accessed by selecting Tools>Run igvtools.
Command line: The igvtools commands can also be run from the command line. To install, download the igvtools zip file from the Downloads page. On Windows, enter the commands at an MS-DOS prompt (select Start>Run and type: cmd). On Mac, enter the commands in a terminal window (select Applications>Utilities>Terminal).



5 bwHPC Examples for IGV

  • As of March 2016, IGV does not implement MPI.
  • IGV runs multithreaded (Java).


In the folder $IGV_EXA_DIR you'll find an example of how to use IGV.

$ echo $IGV_EXA_DIR
/opt/bwhpc/common/bio/igv/2.3/bwhpc-examples
$ ls -lF $IGV_EXA_DIR
bwhpc-igv-example.moab   # Moab submit script, for IGVTools ONLY
example.bam              # example genome file in BAM format
example.bam.bai          # index of the BAM file
igv.sh*                  # adapted IGV start script; $IGV_HOME/igv and $IGV_HOME/igv.sh link to it
README.bwhpc-examples


5.1 bwhpc-example file

  • bwhpc-igv-example.moab

Use this Moab start script to start your own IGVTools session in interactive mode. Look for the relevant section inside the file and make your modifications.

5.1.1 How to use the IGVTools Test-Script

  • Create your own work-space
# WS-Name Days alive (max. 60)
ws_allocate igv_ws 30
  • Change dir to your workspace
cd $(ws_find igv_ws)
  • Copy the moab-example file you'll find in this folder and make your modifications
cp $IGV_EXA_DIR/bwhpc-igv-example.moab .
  • Submit your job
msub bwhpc-igv-example.moab
  • Wait a while...

... until you see some output files and a tarball. The *.tgz file contains your data.

Use 'tar xvzf *.tgz' to extract the file contents.
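
It can be safer to list an archive before unpacking it, and to unpack into a separate folder. A minimal sketch follows; the function name extract_results and all file names in the demo are made up, and the demo builds its own throwaway archive so that it runs anywhere.

```shell
# List an archive first, then unpack it into its own directory.
# extract_results is a hypothetical helper for this sketch.
extract_results() {
    archive="$1"
    dest="${2:-results}"
    tar tzf "$archive" || return 1      # fails early on a missing or damaged archive
    mkdir -p "$dest" && tar xzf "$archive" -C "$dest"
}

# Self-contained demo with a throwaway archive (file names are made up).
demo_dir=$(mktemp -d)
echo "demo output" > "$demo_dir/job.out"
tar czf "$demo_dir/job.tgz" -C "$demo_dir" job.out
extract_results "$demo_dir/job.tgz" "$demo_dir/unpacked"
```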


5.1.2 Excerpt from bwhpc-igv-example.moab

These parameters apply to the use of IGVTools on the bwUniCluster.

#!/bin/bash
#
#MSUB -N igv_job
#MSUB -j oe
#MSUB -o $(JOBNAME).$(JOBID)
#MSUB -m ae
# -M 'your e-mail-address@DN'
#MSUB -q singlenode
#MSUB -l walltime=00:10:00
[...]
echo " "
echo "### Loading IGV module:"
echo " "
module load bio/igv/2.3
[ -z "$IGV_HOME" ] && { echo 'ERROR: Failed to load module bio/igv/2.3.'; exit 1; }
echo "IGV_HOME = ${IGV_HOME}"
module list

echo " "
echo "### Copying input test files for job (if required):"
echo " "
cp -v $IGV_EXA_DIR/{example.bam,example.bam.bai} .

echo " "
echo "### Running IGVTools example in single-node-mode..."
echo " "
# This example 'count' command computes average feature density over
# a specified window size across the genome. Common usages
# include computing coverage for alignment files and counting
# hits in ChIP-seq experiments. By default, the resulting
# file will be displayed as a bar chart when loaded into IGV.
# Supported input file formats are: .sam, .bam, .aligned, .psl, .pslx, and .bed
# See: https://www.broadinstitute.org/software/igv/igvtools_commandline
igvtools count -z 5 -w 25 -e 250 example.bam example_alignments.tdf hg18
# Save the exit status at once; the test below would otherwise overwrite $?.
rc=$?
[ "$rc" -ne 0 ] && { echo "igvtools returned with an error: $rc"; exit 1; }
echo "done"
# The input file must be sorted by start position. So here you'll need the *.bai
# file, too.
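
The exit-status check used in the excerpt can be written as a small reusable helper. This is a sketch under the assumption that you adapt the job script yourself; run_checked is a hypothetical name. The important detail is that $? must be saved immediately, because every following command, including a '[' test, overwrites it.

```shell
# Run a command and report its real exit status on failure.
# run_checked is a hypothetical helper for this sketch.
run_checked() {
    "$@"
    rc=$?                       # save at once; the next command resets $?
    if [ "$rc" -ne 0 ]; then
        echo "'$1' returned with an error: $rc" >&2
        return "$rc"
    fi
}

run_checked true && echo "done"
# In the job script this could read:
#   run_checked igvtools count -z 5 -w 25 -e 250 example.bam example_alignments.tdf hg18 || exit 1
```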


6 IGV-Specific Environments

To see a list of all IGV environment variables set by the 'module load' command, use 'env | grep IGV'. Alternatively, use the command 'module display bio/igv'.

$ module display bio/igv
-------------------------------------------------------------------
/opt/bwhpc/common/modulefiles/bio/igv/2.3:
module-whatis	 The Integrative Genomics Viewer 2.3 (IGV) is a high-performance visualization
    tool for interactive exploration of large, integrated genomic datasets. (command 'igv') 
setenv		 IGV_VERSION 2.3 
setenv		 IGV_HOME /opt/bwhpc/common/bio/igv/2.3 
setenv		 IGV_EXA_DIR /opt/bwhpc/common/bio/igv/2.3/bwhpc-examples 
setenv		 IGV_BIN_DIR /opt/bwhpc/common/bio/igv/2.3 
setenv		 IGV_TOOLS_BIN_DIR /opt/bwhpc/common/bio/igv/2.3/IGVTools 
prepend-path	 PATH /opt/bwhpc/common/bio/igv/2.3 
prepend-path	 PATH /opt/bwhpc/common/bio/igv/2.3/IGVTools 
conflict	 bio/igv 

The module display command will not load the module!
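
A job script can verify that these variables are actually set before relying on them. The sketch below checks the five variable names shown in the 'module display' output above; the function name check_igv_env is hypothetical.

```shell
# Report which of the IGV variables from 'module display' are set.
# check_igv_env is a hypothetical helper; it returns 1 if any is missing.
check_igv_env() {
    missing=0
    for name in IGV_VERSION IGV_HOME IGV_EXA_DIR IGV_BIN_DIR IGV_TOOLS_BIN_DIR; do
        eval "value=\${$name:-}"        # indirect lookup of the variable named in $name
        if [ -z "$value" ]; then
            echo "unset: $name" >&2
            missing=1
        else
            echo "$name=$value"
        fi
    done
    return "$missing"
}

check_igv_env || echo "Load the module first: module load bio/igv" >&2
```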

7 Version-Specific Information

For more detailed information about a specific IGV version, see the information available via the module system with the command 'module help bio/igv'.
For a short summary of what IGV is about, use the command 'module whatis bio/igv'.
Example:

$ module whatis bio/igv
bio/igv              : The Integrative Genomics Viewer 2.3 (IGV) is a high-performance 
     visualization tool for interactive exploration of large, integrated genomic datasets.
     (command 'igv')

$ module help bio/igv
----------- Module Specific Help for 'bio/igv/2.3' ----------------
DESCRIPTION

  The Integrative Genomics Viewer (IGV) is a high-performance
[...]
DOCUMENTATION

*  Get started and FAQs
    Read the 
    /opt/bwhpc/common/bio/igv/2.3/readme.txt 
    and 
    /opt/bwhpc/common/bio/igv/2.3/IGVTools/igvtools_readme.txt 
    files at first.

*  IGV User Guide and Documents
    https://www.broadinstitute.org/software/igv/UserGuide

*  IGV Downloads
    https://www.broadinstitute.org/software/igv/download 
  
*  bwHPC examples and a moab example script can be found here:
    /opt/bwhpc/common/bio/igv/2.3/bwhpc-examples
    Please read the 'README.bwhpc-examples' file.
[...]
 IMPORTANT

   Before using IGV the first time, go to your $HOME folder,
   load this module and start 'igv' the first time.
   Some initialisations will be done. Be patient! This happens
   only once.
[...]


8 Useful Links