FastQ screen
Description | Content |
---|---|
module load | bio/fastq_screen |
Availability | bwUniCluster |
License | GPLv3 |
Citing | n/a |
Links | Babraham Bioinformatics |
Graphical Interface | no |
Requirements | Bowtie2 (bio/bowtie2/2.2.3, loaded automatically with the module); a suitable Perl runtime environment with the GD::Graph plugin (optional) |
1 Description/What is FastQ Screen
FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
When running a sequencing pipeline it is useful to know that your sequencing runs contain the
types of sequence they're supposed to.
FastQ Screen allows you to set up a standard set of libraries against which all of your sequences
can be searched. Your search libraries might contain the genomes of all of the organisms you
work on, along with PhiX, Vectors or other contaminants commonly seen in sequencing experiments.
The program produces both text-based and graphical output which summarises the mapping of
your sequences against each of your libraries, so that when you screen, for example, your mouse
sequences you can see whether they map to the libraries you expect.
For more information on features, please visit the FastQ Screen information page.
2 Versions and Availability
A list of versions currently available on all bwHPC-C5-Clusters can be obtained from the
Cluster Information System CIS
On the command line interface you'll get a list of available versions by using the command module avail bio/fastq_screen.
$ module avail bio/fastq_screen
----------------- /opt/bwhpc/common/modulefiles --------------------
bio/fastq_screen/0.5.2
3 License
FastQ Screen is free software, released under the GPLv3 (see the table at the top of this page).
4 Usage
4.1 Loading the module
4.1.1 Default
You can load the default version of FastQ Screen with the command module load bio/fastq_screen.
$ module avail bio/fastq_screen
------------- /opt/bwhpc/common/modulefiles ------------------
bio/fastq_screen/0.5.2
$ module load bio/fastq_screen
$ module list
Currently Loaded Modulefiles:
  1) bio/bowtie2/2.2.3   2) bio/fastq_screen/0.5.2
The module will try to load the modules it needs to function (here bio/bowtie2/2.2.3). If loading the module fails, check whether you have already loaded one of those modules in a version other than the one FastQ Screen needs.
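The version check described above can be sketched as a small shell helper. This is only an illustration: find_bowtie2_conflict is a hypothetical name, and the helper simply scans the text of module list for a bio/bowtie2 entry that differs from the wanted version. It assumes that module list writes to stderr, as classic Environment Modules installations do, which is why the usage line redirects it.

```shell
#!/bin/bash
# Sketch: report a loaded bio/bowtie2 version that differs from the wanted one.
# find_bowtie2_conflict is a hypothetical helper, not part of the module system.
# It reads the output of 'module list' on stdin and prints conflicting entries.
# Exit status: 0 if a conflicting version was found, non-zero otherwise.
find_bowtie2_conflict() {
    local wanted="$1"
    # Put every whitespace-separated token on its own line,
    # keep the bowtie2 entries, then drop the wanted version.
    tr -s '[:space:]' '\n' \
        | grep -E '^bio/bowtie2/' \
        | grep -v -x -F "bio/bowtie2/${wanted}"
}

# Usage on the cluster (assumes 'module list' prints to stderr):
#   module list 2>&1 | find_bowtie2_conflict 2.2.3 && module unload bio/bowtie2
```

If the helper prints a module name, unloading that module before running module load bio/fastq_screen should let the automatic dependency load proceed.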
4.1.2 Special Version
If you wish to load a specific version of FastQ Screen, use module load bio/fastq_screen/'version' with the version you want.
Example:
$ module avail bio/fastq_screen
----------------- /opt/bwhpc/common/modulefiles -------------------
bio/fastq_screen/0.5.2
$ module load bio/fastq_screen/0.5.2
$ module list
Currently Loaded Modulefiles:
  1) bio/bowtie2/2.2.3   2) bio/fastq_screen/0.5.2
4.2 Program Binaries
- This version of FastQ Screen is for the command-line only.
- FastQ Screen is intended to be used as part of a QC pipeline.
It allows you to take a sequence dataset and search it against a set of Bowtie databases. It will then generate both a text and a graphical summary of the results to see if the sequence dataset contains the kind of sequences you expect or not.
$ ls -xF $FASTQ_SCREEN_HOME
aln-pe.sam            bwhpc-examples/            database/     fastq_screen*
fastq_screen.conf     fastq_screen.conf.example  license.txt   modulefiles/
OpenSans-Regular.ttf  README-PERL.bwhpc          README.txt    RELEASE_NOTES.txt
'*' indicates the file is executable.
'/' indicates it's a folder.
5 Perl Plugin-Installation
Every new user who wants to use the FastQ Screen module on the bwUniCluster has to install the GD::Graph plugin.
# INSTALL THE PLUGIN
$ perl -MCPAN -e shell
> install Bundle::CPAN
: answer all(!) questions with: _yes_
> install GD::Graph
: answer all(!) questions with: _yes_
> quit
This has to be done only once.
You'll find a $HOME/.cpan folder where your plugins will be located.
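Whether the plugin is already visible to Perl can be checked before starting the CPAN session. The sketch below relies only on the standard perl -M<Module> -e 1 idiom, which fails if the module cannot be loaded. The function name have_perl_module is a hypothetical helper, and the check assumes that the Perl found in your PATH is the one FastQ Screen will use.

```shell
#!/bin/bash
# Sketch: test whether a Perl module can be loaded by the perl in PATH.
# have_perl_module is a hypothetical helper name used for illustration.
# Returns 0 if the module loads, non-zero otherwise (also if perl is missing).
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Usage:
#   have_perl_module GD::Graph || echo "GD::Graph not found, run the CPAN steps above"
```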
6 Main Configuration File
The most important configuration file is:
/opt/bwhpc/common/bio/fastq_screen/0.5.2/fastq_screen.conf(.example)
Edit the example file and rename it to fastq_screen.conf.
At least check these key/value pairs (EXAMPLES ONLY):
- THREADS 8
FastQ Screen runs in multi-threaded mode and uses 8 cores by default.
This can be changed by editing the 'fastq_screen.conf' file.
- DATABASE Human /opt/bwhpc/common/bio/fastq_screen/0.5.2/database/grch38_genome
If the bowtie AND bowtie2 indices of a given genome reside in the SAME FOLDER, a SINGLE path may be provided to BOTH sets of indices. [...]
Beware!
In the main location of FastQ Screen ($FASTQ_SCREEN_HOME) you will find a folder named 'database'.
It is supplied by us as an example only.
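As an illustration of the two key/value pairs above, the following sketch writes a minimal personal configuration file. It rests on several assumptions: write_screen_conf and my_fastq_screen.conf are hypothetical names, keys and values are assumed to be whitespace-separated (tabs are used here), and the --conf option mentioned in the usage comment is taken from the FastQ Screen README, so please confirm it with fastq_screen --help for the installed version.

```shell
#!/bin/bash
# Sketch: print a minimal FastQ Screen configuration to stdout.
# write_screen_conf is a hypothetical helper, not shipped with the module.
# Arguments: number of threads, database label, path to the index basename.
write_screen_conf() {
    local threads="$1" name="$2" index="$3"
    # Key and value(s) separated by tabs (assumption: any whitespace is accepted).
    printf 'THREADS\t%s\n' "$threads"
    printf 'DATABASE\t%s\t%s\n' "$name" "$index"
}

# Usage (file name and --conf option are assumptions, see the note above):
#   write_screen_conf 6 Human "$PWD/grch38_genome" > my_fastq_screen.conf
#   fastq_screen --conf my_fastq_screen.conf Sample_ABC_L005_R1.fastq
```

Keeping the configuration in your own workspace avoids having to modify files below $FASTQ_SCREEN_HOME.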
7 bwHPC Examples for FastQ Screen
- MPI is not implemented in this version of FastQ Screen.
In the folder $FASTQ_SCREEN_EXA_DIR you'll find an example of how to use FastQ Screen.
$ ls -l $FASTQ_SCREEN_EXA_DIR
[...] build_bowtie_index.sh                      # starts 'bowtie2-build ' to make an index of the Sample DB
[...] BWA_aligning_indexing_example.sh           # Example Burrows Wheeler Aligner indexing
[...] bwhpc-fastq_screen-example.moab            # Moab submit script. Creates final screen-file (+alignments).
[...] fastq_screen_aligner.sh                    # FastQ Screen aligner example
[...] fastq_screen_job.msub_out                  # msub STDOUT example of a finished job
[...] Homo_sapiens.GRCh38.dna.chromosome.10.fa   # DB example for tests
[...] README-PERL.bwhpc                          # Include GD::Graph plugin for perl
[dir] result                                     # example screen-result file
[...] Sample_ABC_L005_R1.fastq                   # given indexed example FastQ file
7.1 bwHPC example workflow
- bwhpc-fastq_screen-example.moab
Use this Moab start script to start your own FastQ Screen session in interactive mode. Look for this section inside the file and make your modifications.
7.1.1 How to use the bwhpc-FastQ Screen Test-Script
- Create your own work-space
#           WS-Name           Days alive (max. 60)
ws_allocate fastq_screen_repo 30
- Change dir to your workspace
cd $(ws_find fastq_screen_repo)
- Copy the Moab example file from $FASTQ_SCREEN_EXA_DIR and make your modifications
cp $FASTQ_SCREEN_EXA_DIR/bwhpc-fastq_screen-example.moab .
- Submit your job
msub bwhpc-fastq_screen-example.moab
- Wait a while...
... until you see some more files created. The *.tgz-file contains your data.
tar xvzf *.tgz # to extract the file-contents
7.1.2 Excerpt from bwhpc-fastq_screen-example.moab
These parameters apply to the use of FastQ Screen on the bwUniCluster.
#!/bin/bash
#
#MSUB -N fastq_screen_job
#MSUB -j oe
#MSUB -o $(JOBNAME).$(JOBID)
#MSUB -m ae
#MSUB -M 'your e-mail@DN'
#MSUB -q singlenode
#MSUB -l walltime=00:10:00
#
[...]
echo " "
echo "### Loading Bowtie2, FastQ Screen modules:"
echo " "
module load bio/bowtie2/2.2.3
[ -z "$BOWTIE2_HOME" ] && { echo 'ERROR: Failed to load module bio/bowtie2/2.2.3'; exit 1; }
module load bio/fastq_screen/0.5.2
[ -z "$FASTQ_SCREEN_HOME" ] && { echo 'ERROR: Failed to load module bio/fastq_screen/0.5.2'; exit 1; }
module list
[...]
echo " "
echo "### Copying input test files for job (if required):"
echo " "
cp -v ${FASTQ_SCREEN_EXA_DIR}/{Sample_ABC_L005_R1.fastq,Homo*} .
[...]
echo " "
echo "### Run FastQ Screen in 'threads-mode'..."
echo " "
bowtie2-build Homo_sapiens.GRCh38.dna.chromosome.10.fa grch38_genome
fastq_screen --aligner bowtie2 --subset 1000 --threads 6 Sample_ABC_L005_R1.fastq
fastq_screen --threads 6 Sample_ABC_L005_R1.fastq
rc=$?; [ "$rc" -ne 0 ] && { echo "fastq_screen returned with an error: $rc"; exit 1; }
echo "done"
[...]
Piping the command to 'parallel' will not work!
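The excerpt above tests the exit status only after the last fastq_screen call, so a failure of bowtie2-build or of the first screening run would go unnoticed. One possible way to check every step is a small wrapper, sketched here under the hypothetical name run_step. It is not part of the supplied example script.

```shell
#!/bin/bash
# Sketch: run a command and abort the job script if it fails.
# run_step is a hypothetical helper, not part of the bwHPC example script.
run_step() {
    "$@"
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "ERROR: '$*' returned with exit status $rc" >&2
        exit 1
    fi
}

# Usage inside the job script:
#   run_step bowtie2-build Homo_sapiens.GRCh38.dna.chromosome.10.fa grch38_genome
#   run_step fastq_screen --aligner bowtie2 --subset 1000 --threads 6 Sample_ABC_L005_R1.fastq
```

Because the wrapper calls exit, it is meant for use inside a job script rather than in an interactive shell.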
8 FastQ Screen-Specific Environments
To see a list of all FastQ Screen environment variables set by the module load command, use env | grep FASTQ_SCREEN, or use the command module display bio/fastq_screen.
$ module display bio/fastq_screen/0.5.2
-------------------------------------------------------------------
/opt/bwhpc/common/modulefiles/bio/fastq_screen/0.5.2:
module-whatis    FastQ screen 0.5.2 Contmaination Screening for large data sets
setenv           FASTQ_SCREEN_VERSION 0.5.2
setenv           FASTQ_SCREEN_HOME /opt/bwhpc/common/bio/fastq_screen/0.5.2
setenv           FASTQ_SCREEN_EXA_DIR /opt/bwhpc/common/bio/fastq_screen/0.5.2/bwhpc-examples
setenv           FASTQ_SCREEN_BIN_DIR /opt/bwhpc/common/bio/fastq_screen/0.5.2
prepend-path     PATH /opt/bwhpc/common/bio/fastq_screen/0.5.2
conflict         bio/fastq_screen
The module display command will not load the module!
You may check the Bowtie2 environment variables, too.
$ module display bio/bowtie2/2.2.3
-------------------------------------------------------------------
/opt/bwhpc/common/modulefiles/bio/bowtie2/2.2.3:
module-whatis    2.2.3 Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.
[...]
setenv           BOWTIE2_VERSION 2.2.3
setenv           BOWTIE2_HOME /opt/bwhpc/common/bio/bowtie2/2.2.3
setenv           BOWTIE2_BIN_DIR /opt/bwhpc/common/bio/bowtie2/2.2.3/bin
setenv           BOWTIE2_DOC_DIR /opt/bwhpc/common/bio/bowtie2/2.2.3/doc
setenv           BOWTIE2_EXA_DIR /opt/bwhpc/common/bio/bowtie2/2.2.3/examples
prepend-path     PATH /opt/bwhpc/common/bio/bowtie2/2.2.3/bin:/opt/bwhpc/common/bio/bowtie2/2.2.3/
conflict         bio/bowtie2
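A script can verify that these variables are actually set before it relies on them. The following is a minimal sketch: check_screen_env is a hypothetical helper, and the list of variable names is taken from the module display output shown in this section. It uses printenv, so it only sees exported variables, which is the case for variables set via setenv in a module file.

```shell
#!/bin/bash
# Sketch: verify that the variables set by the modules are present.
# check_screen_env is a hypothetical helper name used for illustration.
# Returns 0 if all variables are set and non-empty, 1 otherwise.
check_screen_env() {
    local var missing=0
    for var in FASTQ_SCREEN_HOME FASTQ_SCREEN_EXA_DIR BOWTIE2_HOME; do
        # printenv prints the value of an exported variable, or nothing.
        if [ -z "$(printenv "$var")" ]; then
            echo "not set: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   module load bio/fastq_screen
#   check_screen_env || echo "module environment is incomplete"
```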
9 Version-Specific Information
For more detailed information on a specific FastQ Screen version, see the information available via the module system with the command module help bio/fastq_screen.
For a short summary of what FastQ Screen is about, use the command module whatis bio/fastq_screen.
Example:
$ module whatis bio/fastq_screen
bio/fastq_screen : FastQ screen 0.5.2 Contmaination Screening for large data sets
$ module help bio/fastq_screen
----------- Module Specific Help for 'bio/fastq_screen/0.5.2' ---------------------------
DESCRIPTION
FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
[...]
DOCUMENTATION
* Get started: http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/README.txt
* Release notes: http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/RELEASE_NOTES.txt
[...]
MAIN CONFIGURATION FILE
The most important configuration file is:
/opt/bwhpc/common/bio/fastq_screen/0.5.2/fastq_screen.conf(.example)
Edit this one and rename it to 'fastq_screen.conf'.
FastQ Screen runs in multi-threaded mode and uses 8 cores by default.
This can be changed by editing the 'fastq_screen.conf' file.
[...]