The HPC cluster provides a collection of software, mainly for bioinformatics, along with a general collection of computation-oriented libraries.
Falkor HPC cluster software list
Name | Category | Homepage | Description | Version | Modulefile |
---|---|---|---|---|---|
Abyss | assembler | http://www.bcgsc.ca/platform/bioinfo/software/abyss | ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes. | 2.0.2 2.1.4 | module load abyss/2.0-openmpi module load abyss/2.1.4-openmpi |
AlignGraph | assembler | https://github.com/baoe/AlignGraph | Algorithm for secondary de novo genome assembly guided by closely related references | | module load aligngraph/latest |
bamtools | formats toolkit | https://github.com/pezmaster31/bamtools | BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files. | 2.5.1 | module load bamtools/2.5.1 |
bbmap/bbtools | tool suite | https://jgi.doe.gov/data-and-tools/bbtools/ | BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving. | 38.08 | module load bbmap/38.08 |
bcftools | data toolkit | https://samtools.github.io/bcftools/bcftools.html | BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. | 1.7.1 | module load bcftools/1.7.1 |
bedtools | analysis toolkit | http://bedtools.readthedocs.io/en/latest/ | Collectively, the bedtools utilities are a swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line. | 2.27.1 | module load bedtools/2.27.1 |
bloomtree | sequence alignment | http://www.cs.cmu.edu/~ckingsf/software/bloomtree/ | | 0.3.5 | |
bowtie1 | sequence alignment | http://bowtie-bio.sourceforge.net/index.shtml | Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end). | 1.2.2 | module load bowtie/1.2.2 (loads correct PATH) |
bowtie2 | sequence alignment | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml | Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes. | 2.3.4.1 | module load bowtie/2.3.4.1 (loads correct PATH) |
bwa | sequence alignment | http://bio-bwa.sourceforge.net | BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads. | 0.7.10 0.7.15 | module load bwa/0.7.17 |
canu | assembler | https://github.com/marbl/canu | Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION). | 1.7.1 | module load canu/1.7.1 |
CDHIT | sequence analysis | http://weizhongli-lab.org/cd-hit/ | CD-HIT is a very widely used program for clustering and comparing protein or nucleotide sequences. | 4.6.8 | module load cdhit/4.6.8 |
diamond | sequence alignment | https://github.com/bbuchfink/diamond | DIAMOND is a sequence aligner for protein and translated DNA searches and functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup of BLAST ranging up to x20,000. | 0.9.22 | module load diamond/0.9.22 |
FastQC | raw data analysis | https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ | A quality control tool for high throughput sequence data. | 0.11.5 | os package |
FreeBayes | variant calling | https://github.com/ekg/freebayes | FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment. | 1.1.0 git branch | module load freebayes/1.1.0 |
gapfiller | assembler | https://sourceforge.net/projects/gapfiller/ | GapFiller is a seed-and-extend local assembler to fill the gap within paired reads. It can be used for both DNA and RNA and it has been tested on Illumina data. GapFiller can be used whenever a sequence is to be assembled starting from reads lying on its ends, provided a loose estimate of sequence length. | 2.1.1 | module load gapfiller/2.1.1 |
gmap | alignment and mapping tool | http://research-pub.gene.com/gmap/ | Gmap is a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. | latest 01.2018 | module load gmap/latest |
gromacs | molecular dynamics | http://www.gromacs.org/ | GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. | 5.1 | module load gromacs/5.1 |
HMMER | sequence alignment | http://hmmer.org/ | HMMER is used for searching sequence databases for sequence homologs and for making sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). | 3.1b2 | module load hmmer/3.1.2 |
Hisat | sequence alignment/mapping | https://ccb.jhu.edu/software/hisat2/index.shtml | HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome). | 2.1.0 | module load hisat/2.1.0 |
Interproscan | sequence analysis | https://www.ebi.ac.uk/interpro/download.html | InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium. | 5.29 | module load interproscan/5.29 |
mafft | sequence alignment | https://mafft.cbrc.jp/alignment/software/source.html | Multiple alignment program for amino acid or nucleotide sequences | 7.397 | module load mafft/7.397 |
matam | assembler | https://github.com/bonsai-team/matam | MATAM is software dedicated to the fast and accurate targeted assembly of short reads sequenced from a genomic marker of interest. The method implements a stepwise process based on construction and analysis of a read overlap graph. | | module load matam/latest |
mira | assembler | http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html | MIRA is a multi-pass DNA sequence data assembler/mapper for whole genome and EST/RNASeq projects. Supports Sanger, Illumina, Ion Torrent, 454. | 4.0.4 | module load mira/4.0.4 |
MPICH2 (hydra process manager) | mpi library | https://www.mpich.org | MPICH is a high performance and widely portable implementation of the Message Passing Interface (MPI) standard. | 3.2 | module load mpi/mpich2 |
MPICH2 (--with-pm=none --with-pmi=slurm) | mpi library | https://www.mpich.org | MPICH is a high performance and widely portable implementation of the Message Passing Interface (MPI) standard. | 3.2 | module load mpi/mpich2-slurm |
MrBayes | sequence analysis | http://mrbayes.sourceforge.net | MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters. | 3.2.6 | module load mrbayes/3.2.7 module load mrbayes/3.2.7-openmpi |
mummer | sequence alignment | http://mummer.sourceforge.net/ | Ultra-fast alignment of large-scale DNA and protein sequences | 4.0 | module load mummer/4.0 |
muscle | sequence alignment | https://www.drive5.com/muscle/ | MUSCLE is a multiple sequence alignment program and one of the most widely used methods in biology. On average, MUSCLE is cited by ten new papers every day. | 3.8.31 | module load muscle/3.8.31 |
NCBI Blast | sequence alignment | https://blast.ncbi.nlm.nih.gov/Blast.cgi | Blast is an acronym for Basic Local Alignment Search Tool. BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance. | 2.7.1 | |
OpenMPI | mpi library | https://www.open-mpi.org | The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. | 2.1 | module load mpi/openmpi |
Pairagon | sequence alignment | | A cDNA-to-genome alignment program based on a pair hidden Markov model, intended for accurate alignment of sequences with both high and low identity levels. | 1.1 | module load pairagon/1.1 |
Qiime2 | sequence analysis | https://qiime2.org/ | QIIME 2™ is a next-generation microbiome bioinformatics platform that is extensible, free, open source, and community developed. | 2.0 | module load qiime/2 |
R | programming language | https://www.r-project.org/ | R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. | 3.5.1 | module load R/3.5.1 |
salmon | rna-seq | https://salmon.readthedocs.io/en/latest/salmon.html | Salmon is a tool for wicked-fast transcript quantification from RNA-seq data. | 0.11.3 | module load salmon/0.11.3 |
snowball | assembler | https://github.com/algbioi/snowball/wiki | | 1.2 | module load snowball/1.2 |
Spades | sequence assembly toolkit | http://cab.spbu.ru/software/spades/ | SPAdes – St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines. The current version of SPAdes works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. You can also provide additional contigs that will be used as long reads. | 3.12 | module load spades/3.12 |
STAR | sequence aligner | https://github.com/alexdobin/STAR | ultrafast universal RNA-seq aligner | 2.6.0c | module load STAR/2.6.0c |
TensorFlow | machine learning framework | https://www.tensorflow.org/ | Machine learning framework | 1.8 | |
TopHat | sequence aligner | https://ccb.jhu.edu/software/tophat/index.shtml | TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. | 2.1.1 | module load tophat/2.1.1 |
Trimmomatic | sequence toolkit | http://www.usadellab.org/cms/?page=trimmomatic | A flexible read trimming tool for Illumina NGS data | 0.38 | module load trimmomatic/0.38 |
velvet | assembler | https://www.ebi.ac.uk/~zerbino/velvet/ | Sequence assembler for very short reads | 1.2.10 | module load velvet/1.2.10 |
vica | sequence analysis | https://github.com/USDA-ARS-GBRU/vica | Software to identify highly divergent DNA and RNA viruses and phages in microbiomes | | module load vica/latest |
virsorter | sequence analysis | https://github.com/simroux/VirSorter | VirSorter: mining viral signal from microbial genomic data | git master branch at nov. 2018 | module load virsorter/latest |
Vsearch | sequence alignment | https://github.com/torognes/vsearch | VSEARCH stands for vectorized search, as the tool takes advantage of parallelism in the form of SIMD vectorization as well as multiple threads to perform accurate alignments at high speed. VSEARCH uses an optimal global aligner (full dynamic programming Needleman-Wunsch), in contrast to USEARCH which by default uses a heuristic seed and extend aligner. This usually results in more accurate alignments and overall improved sensitivity (recall) with VSEARCH, especially for alignments with gaps. | 2.6.2 | module load vsearch/2.6.2 |
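
The Modulefile column above gives the command that makes each package available in a shell session. The following is a minimal sketch of how that might be used, assuming the cluster provides the Environment Modules `module` command in interactive shells and batch jobs. The `load_tool` helper is hypothetical and not part of the cluster setup; `bedtools/2.27.1` is taken from the table, and the executable name `bedtools` is assumed to match the package name.

```shell
#!/bin/bash
# Sketch only: load a modulefile from the table and confirm the tool is on PATH.

# load_tool MODULEFILE COMMAND
# Hypothetical helper (not provided by the cluster). Returns non-zero if the
# `module` command is missing or COMMAND is still not found after loading.
load_tool() {
    local modulefile="$1" tool="$2"

    # `module` is normally a shell function defined by Environment Modules.
    if ! type module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi

    module load "$modulefile" || return 1

    # Do not rely only on the exit status of `module load`;
    # check that the executable can actually be found.
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found on PATH after loading $modulefile" >&2
        return 1
    fi
}

# Example: load bedtools as listed in the table, then print its version.
if load_tool bedtools/2.27.1 bedtools; then
    bedtools --version
fi
```

Running `module avail` lists the modulefiles actually installed, which is worth checking where the Version and Modulefile columns above disagree.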