The HPC cluster provides a collection of software, mainly for bioinformatics, along with a general collection of computation-oriented libraries.

Falkor HPC cluster software list

Name | Category | Homepage | Description | Version | Prefix Path | Modulefile | Notes | Programming Language
Abyss | assembler | http://www.bcgsc.ca/platform/bioinfo/software/abyss | ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes. | 2.0.2; 2.1.4 | /opt/abyss; /opt/abyss-2.1.4 | module load abyss/2.0-openmpi; module load abyss/2.1.4-openmpi | - | -
AdMixture | - | https://dalexander.github.io/admixture/ | ADMIXTURE is a software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm. | 1.3.0 | - | module load admixture/1.3.0 | - | -
AlignGraph | assembler | https://github.com/baoe/AlignGraph | Algorithm for secondary de novo genome assembly guided by closely related references | - | /opt/aligngraph | module load aligngraph/latest | This environment also adds the nucmer and pblat aligners to the PATH. | -
ANGSD | - | http://www.popgen.dk/angsd/index.php/ANGSD | ANGSD is software for analyzing next generation sequencing data. The software can handle a number of different input types from mapped reads to imputed genotype probabilities. Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes. This is especially useful for low and medium depth data. The software is written in C++ and has been used on large sample sizes. | 0.933-18 | /opt/angsd | module load angsd/latest | - | -
AntiSmash | - | https://antismash.secondarymetabolites.org/#!/download | - | 4.2.0; 5 | - | module load antismash/4.2.0; module load antismash/5 | - | -
Augustus | sequence analysis | https://bioinf.uni-greifswald.de/augustus/ | AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences. It can be run on this web server, on a new web server for larger input files or be downloaded and run locally. It is open source so you can compile it for your computing platform. | 3.3.2 | - | module load augustus/3.3.2 | - | -
bam-readcount | - | https://github.com/genome/bam-readcount | - | 0.8.0 | - | module load bam-readcount/0.8.0 | - | -
bamtools | formats toolkit | https://github.com/pezmaster31/bamtools | BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files. | 2.5.1 | /opt/bamtools/bin | module load bamtools/2.5.1 | - | -
bayescan | - | http://cmpg.unibe.ch/software/BayeScan/ | - | 2.1 | - | module load bayescan/2.1 | - | -
bbmap/bbtools | tool suite | https://jgi.doe.gov/data-and-tools/bbtools/ | BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving. | 38.08 | /opt/bbmap | module load bbmap/38.08 | - | Java
bcftools | data toolkit | https://samtools.github.io/bcftools/bcftools.html | BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. | 1.7.1; 1.9 | /opt/bcftools/bin; /opt/bcftools-1.9 | module load bcftools/1.7.1; module load bcftools/1.9 | - | -
bcl2fastq | sequence toolkit | https://emea.support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html | - | 2.20.0 | - | module load bcl2fastq/2.20.0 | - | -
beast | - | https://beast.community | BEAST is a cross-platform program for Bayesian analysis of molecular sequences using MCMC. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. | 2.6.2 | - | module load beast/2.6.2 | - | -
bedtools | analysis toolkit | http://bedtools.readthedocs.io/en/latest/ | Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks. The most widely-used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line. | 2.27.1 | /opt/bedtools-2.2.27 | module load bedtools/2.27.1 | - | -
BLAST | sequence aligner | https://blast.ncbi.nlm.nih.gov/Blast.cgi | - | 2.7.1; 2.10.1 | - | module load blast/2.7.1; module load blast/2.10.1 | - | -
bloomtree | sequence alignment | http://www.cs.cmu.edu/~ckingsf/software/bloomtree/ | - | 0.3.5 | /op/bin/ | - | - | -
bowtie1 | sequence alignment | http://bowtie-bio.sourceforge.net/index.shtml | Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end). | 1.2.2 | /opt/bowtie-1.2.2/ | module load bowtie/1.2.2 | loads correct PATH | -
bowtie2 | sequence alignment | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml | Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes. | 2.2.3; 2.3.4.1 | /opt/bowtie2-2.3.4.1/bin | module load bowtie/2.2.3; module load bowtie/2.3.4.1 | loads correct PATH | -
burst | short reads aligner | https://github.com/knights-lab/BURST | - | 0.99 | - | module load burst/0.99 | - | -
BUSCO | - | https://busco.ezlab.org | - | 3 | - | module load busco/3 | - | -
bwa | sequence alignment | http://bio-bwa.sourceforge.net | BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the rest two for longer sequences ranged from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads. | 0.7.10; 0.7.15 | /opt/bwa-0.7.17 | module load bwa/0.7.17 | - | -
canu | assembler | https://github.com/marbl/canu | Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION). | 1.7.1 | - | module load canu/1.7.1 | - | -
CDHIT | sequence analysis | http://weizhongli-lab.org/cd-hit/ | CD-HIT is a very widely used program for clustering and comparing protein or nucleotide sequences. | 4.6.8 | /opt/cdhit | module load cdhit/4.6.8 | - | -
checkV | - | https://bitbucket.org/berkeleylab/checkv/src/master/ | CheckV is a fully automated command-line pipeline for assessing the quality of single-contig viral genomes, including identification of host contamination for integrated proviruses, estimating completeness for genome fragments, and identification of closed genomes. | 0.6.0 | - | module load checkV/0.6.0 | - | -
cutadapt | sequence toolkit | https://cutadapt.readthedocs.io/en/stable/ | - | 3.4 | - | module load cutadapt/3.4 | - | -
deeptools | sequence toolkit | https://github.com/deeptools/deepTools | User-friendly tools for exploring deep-sequencing data | 3.1.3 | - | module load deeptools/3.1.3 | - | -
deepvirfinder | sequence prediction | https://github.com/jessieren/DeepVirFinder | DeepVirFinder predicts viral sequences using a deep learning method. The method has good prediction accuracy for short viral sequences, so it can be used to predict sequences from metagenomic data. | latest | - | module load deepvirfinder/latest | - | -
diamond | sequence alignment | https://github.com/bbuchfink/diamond | DIAMOND is a sequence aligner for protein and translated DNA searches and functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup of BLAST ranging up to x20,000. | 0.9.22 | /opt/diamond/ | module load diamond/0.9.22 | - | -
DRAM | annotation tool | https://github.com/WrightonLabCSU/DRAM | DRAM (Distilled and Refined Annotation of Metabolism) is a tool for annotating metagenomic assembled genomes and VirSorter identified viral contigs. | latest | - | module load DRAM/latest | - | -
dsuite | - | https://github.com/millanek/Dsuite | - | 0.3; 0.4 | - | module load dsuite/0.3; module load dsuite/0.4 | - | -
enrichm | comparative genomics toolkit | https://github.com/geronimp/enrichM | EnrichM is a set of comparative genomics tools for large sets of metagenome assembled genomes (MAGs). | 0.5.0 | - | module load enrichm/0.5.0 | - | -
exabayes | phylogenetic toolkit | https://cme.h-its.org/exelixis/web/software/exabayes/manual/manual.html#sec-2 | ExaBayes is a tool for Bayesian phylogenetic analyses. | 1.5; 1.5-mpi | - | module load exabayes/1.5; module load exabayes/1.5-mpi | - | -
express | rna-seq | https://bioinformaticshome.com/tools/rna-seq/descriptions/eXpress.html | eXpress is a tool to quantify RNA-seq data, but it is also applicable to ChIP-seq, metagenomics, and large-scale sequencing data in general. | 1.5.1 | - | module load express/1.5.1 | - | -
FastTree | phylogenetic toolkit | http://www.microbesonline.org/fasttree/ | FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. | 2.1.11 | - | module load fasttree/2.1.11 | - | -
FastX | sequence toolkit | http://hannonlab.cshl.edu/fastx_toolkit/ | The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing. | 0.0.13 | - | module load fastx/0.0.13 | - | -
fiona | sequence toolkit | https://academic.oup.com/bioinformatics/article/30/17/i356/199558 | - | 0.2.10 | - | module load fiona/0.2.10 | - | -
flash | sequence toolkit | https://ccb.jhu.edu/software/FLASH/ | FLASH (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments. | 2.2 | - | module load flash/2.2 | - | -
FastQC | raw data analysis | https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ | A quality control tool for high throughput sequence data. | 0.11.5 | os package | os package | - | -
FreeBayes | alignment tool | https://github.com/ekg/freebayes | FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment. | 1.2.0 git branch | /opt/freebayes/bin | module load freebayes/1.2.0 | Use --recursive when cloning from the git repository. The Makefile was modified manually to change the prefix path. | -
gapfiller | assembler | https://sourceforge.net/projects/gapfiller/ | GapFiller is a seed-and-extend local assembler to fill the gap within paired reads. It can be used for both DNA and RNA and it has been tested on Illumina data. GapFiller can be used whenever a sequence is to be assembled starting from reads lying on its ends, provided a loose estimate of sequence length. | 2.1.1 | /opt/gapfiller | module load gapfiller/2.1.1 | - | -
gapseq | - | https://github.com/jotech/gapseq | Informed prediction and analysis of bacterial metabolic pathways and genome-scale networks | 1.1 | - | module load gapseq/latest | - | -
GATK | - | https://gatk.broadinstitute.org/hc/en-us | Variant Discovery in High-Throughput Sequencing Data | 3.8; 4.1.3.0 | - | module load gatk/3.8; module load gatk/4.1.3.0 | - | -
GenomeThreader | gene prediction | https://genomethreader.org | GenomeThreader is a software tool to compute gene structure predictions. | 1.6.6; 1.7.0 | - | module load genomethreader/1.6.6; module load genomethreader/1.7.0 | - | -
Genrich | - | https://github.com/jsh58/Genrich | Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq). It analyzes alignment files generated following the assay and produces a file detailing peaks of significant enrichment. | - | - | module load genrich/latest | - | -
gffread | - | https://github.com/gpertea/gffread | GFF/GTF utility providing format conversions, filtering, FASTA sequence extraction and more. | 0.11.4 | - | module load gffread/0.11.4 | - | -
gmap | alignment and mapping tool | http://research-pub.gene.com/gmap/ | Gmap is a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. | latest 01.2018 | /opt/gmap/bin | module load gmap/latest | - | -
Go | programming language | - | The Go programming language | 1.12.1 | - | module load go/1.12.1 | - | -
grinder | - | https://github.com/zyxue/biogrinder | Grinder is a versatile program to create random shotgun and amplicon sequence libraries based on DNA, RNA or proteic reference sequences provided in a FASTA file. | 0.5.4 | - | module load grinder/0.5.4 | - | -
gromacs | molecular dynamics | http://www.gromacs.org/ | GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. | 5.1 | /opt/gromacs | module load gromacs/5.1 | Compiled with MPI support | -
hh-suite | - | https://github.com/soedinglab/hh-suite | The HH-suite is an open-source software package for sensitive protein sequence searching based on the pairwise alignment of hidden Markov models (HMMs). | 3.1.0 | - | module load hh-suite/latest | - | -
hic-pro | - | https://github.com/nservant/HiC-Pro | HiC-Pro was designed to process Hi-C data, from raw fastq files (paired-end Illumina data) to normalized contact maps. It supports the main Hi-C protocols, including digestion protocols as well as protocols that do not require restriction enzymes such as DNase Hi-C. | 2.11.1 | - | module load hic-pro/2.11.1 | - | -
hicup | - | https://www.bioinformatics.babraham.ac.uk/projects/hicup/ | A tool for mapping and performing quality control on Hi-C data | 0.7.2 | - | module load hicup/0.7.2 | - | -
Hisat | sequence alignment/mapping | https://ccb.jhu.edu/software/hisat2/index.shtml | HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome). | 2.1.0 | - | module load hisat/2.1.0 | - | -
HMMER | sequence alignment | http://hmmer.org/ | HMMER is used for searching sequence databases for sequence homologs, and for making sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). | 3.1b2 | /opt/hmmer-3.1.2/ | module load hmmer/3.1.2 | - | -
Humann | - | - | HUMAnN is a method for efficiently and accurately profiling the abundance of microbial metabolic pathways and other molecular functions from metagenomic or metatranscriptomic sequencing data. | 2; 3 | - | module load humann/2; module load humann/3 | - | -
hyde | - | https://hybridization-detection.readthedocs.io | HyDe is a software package that detects hybridization in phylogenomic data sets using phylogenetic invariants. | 0.4.3 | - | module load hyde/latest | - | -
ima2p | - | https://github.com/arunsethuraman/ima2p | IMa2p is a parallel implementation of IMa2, written in C++ with OpenMPI. IMa2 is a Bayesian MCMC-based method for inferring population demography under the IM (Isolation with Migration) model. Refer to Sethuraman and Hey (2015) for implementation details. | - | - | module load ima2p/latest | - | -
ima3 | - | https://github.com/jodyhey/IMa3 | - | - | - | - | - | -
Infernal | - | http://eddylab.org/infernal/ | Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. | 1.1.4 | - | module load infernal/1.1.4-openmpislurm; module load infernal/1.1.4 | - | -
Interproscan | sequence analysis | https://www.ebi.ac.uk/interpro/download.html | InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium. | 5.29; 5.33 | /opt/interproscan-5.29-68.0/ | module load interproscan/5.29; module load interproscan/5.33 | Requires a Java environment | -
iq-tree | - | http://www.iqtree.org | A fast and effective stochastic algorithm to infer phylogenetic trees by maximum likelihood. IQ-TREE compares favorably to RAxML and PhyML in terms of likelihoods with similar computing time | 1.6.11; 2.1.3 | - | module load iq-tree/1.6.11; module load iq-tree/2.1.3 | - | -
Jellyfish | sequence toolkit | https://github.com/gmarcais/Jellyfish | Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. | 2.2.10 | - | module load jellyfish/2.2.10 | - | -
MACS2 | - | https://hbctraining.github.io/Intro-to-ChIPseq/lessons/05_peak_calling_macs.html | A commonly used tool for identifying transcription factor binding sites is named Model-based Analysis of ChIP-seq (MACS). The MACS algorithm captures the influence of genome complexity to evaluate the significance of enriched ChIP regions. Although it was developed for the detection of transcription factor binding sites it is also suited for larger regions. | 2.1.2 | - | module load MACS2/2.1.2 | - | -
mafft | sequence alignment | https://mafft.cbrc.jp/alignment/software/source.html | Multiple alignment program for amino acid or nucleotide sequences | 7.397 | /opt/mafft/bin | module load mafft/7.397 | - | -
malt | - | https://uni-tuebingen.de/it/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/algorithms-in-bioinformatics/software/malt/ | MALT performs alignment of metagenomic reads against a database of reference sequences (such as NR, GenBank or Silva) and produces a MEGAN RMA file as output. | 0.4.1; 0.5 | - | module load malt/0.4.1; module load malt/0.5 | - | -
matam | assembler | https://github.com/bonsai-team/matam | MATAM is software dedicated to the fast and accurate targeted assembly of short reads sequenced from a genomic marker of interest. The method implements a stepwise process based on construction and analysis of a read overlap graph. | - | /opt/matam/bin | module load matam/latest | Installed under a dedicated conda environment | -
maxquant | - | https://www.maxquant.org | MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data. | 1.6.3; 1.6.5; 1.6.10; 1.6.17 | - | module load maxquant/1.6.3; module load maxquant/1.6.5; module load maxquant/1.6.10; module load maxquant/1.6.17 | - | -
MEGA | - | https://www.megasoftware.net/docs | MEGA analysis suite | 10.0.5 | - | module load mega/10.0.5 | - | -
meme | - | https://meme-suite.org/meme/ | The MEME Suite allows the biologist to discover novel motifs in collections of unaligned nucleotide or protein sequences, and to perform a wide variety of other motif-based analyses. | 5.0.5; 5.3.3 | - | module load meme/5.0.5; module load meme/5.3.3 | - | -
metawrap | - | https://github.com/bxlab/metaWRAP | MetaWRAP aims to be an easy-to-use metagenomic wrapper suite that accomplishes the core tasks of metagenomic analysis from start to finish: read quality control, assembly, visualization, taxonomic profiling, extracting draft genomes (binning), and functional annotation. | 1.1.1 | - | module load metawrap/1.1.1 | - | -
mga | sequence aligner | https://bio.tools/mga | Multiple Genome Aligner computes multiple genome alignments of large, closely related DNA sequences. MGA is a software tool for efficiently aligning two or more sufficiently similar genomic sized sequences [HKO02]. It belongs to the category of anchor-based multiple alignment methods. mga uses multiMEMs (or MEMs, MUMs as special cases) to anchor the alignment. | latest | - | module load mga/latest | - | -
mira | assembler | http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html | MIRA is a multi-pass DNA sequence data assembler/mapper for whole genome and EST/RNASeq projects. Supports Sanger, Illumina, Ion Torrent, 454. | 4.0.4 | /opt/mira | module load mira/4.0.4 | statically compiled binaries installed | C/C++
mitoz | assembler | https://academic.oup.com/nar/article/47/11/e63/5377471 | MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization | 2.3 | - | module load mitoz/2.3 | - | -
mmseqs | sequence toolkit | https://github.com/soedinglab/MMseqs2 | MMseqs2: ultra fast and sensitive sequence search and clustering suite | - | - | module load mmseqs2/latest | - | -
mothur | - | https://mothur.org | This project seeks to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. | 1.4.3 | - | module load mothur/1.4.3 | - | -
MPICH2 (hydra process manager) | mpi library | https://www.mpich.org | MPICH is a high performance and widely portable implementation of the Message Passing Interface (MPI) standard. | 3.2 | /opt/mpi/mpich2/bin/ | module load mpi/mpich2 | This version can be used natively with the SLURM workload manager; see https://slurm.schedmd.com/mpi_guide.html#mpich2 (section "MPICH with MPIEXEC"). For information on the hydra process manager, refer to https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager | -
MPICH2 (--with-pm=none --with-pmi=slurm) | mpi library | https://www.mpich.org | MPICH is a high performance and widely portable implementation of the Message Passing Interface (MPI) standard. | 3.2 | /opt/mpi/mpich2-slurm/bin/ | module load mpi/mpich2-slurm | This version uses SLURM explicitly as the process manager and links against libpmi. For more information, see the MPICH FAQ: https://wiki.mpich.org/mpich/index.php/Frequently_Asked_Questions#Note_that_the_default_build_of_MPICH_will_work_fine_in_SLURM_environments._No_extra_steps_are_needed. | -
MrBayes | sequence analysis | http://mrbayes.sourceforge.net | MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters. | 3.2.6 | /opt/bin/ | module load mrbayes/3.2.7; module load mrbayes/3.2.7-openmpi | An mpich2-slurm enabled version is available in /opt/bin as mb_mpich. DEPRECATED: an MPI version is also available in /opt/bin as mb_mpi; to execute it, use: mpirun -np <number_of_processes> /opt/bin/mb_mpi | -
mummer | sequence alignment | http://mummer.sourceforge.net/ | Ultra-fast alignment of large-scale DNA and protein sequences | 4.0 | /opt/mummer | module load mummer/4.0 | - | -
muscle | sequence alignment | https://www.drive5.com/muscle/ | MUSCLE is one of the most widely-used methods in biology. On average, MUSCLE is cited by ten new papers every day. | 3.8.31 | - | module load muscle/3.8.31 | - | -
NCBI Blast | sequence alignment | https://blast.ncbi.nlm.nih.gov/Blast.cgi | Blast is an acronym for Basic Local Alignment Search Tool. BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance. | 2.7.1 | /opt/blast-2.7.1/bin/ | - | - | -
OpenMPI | mpi library | https://www.open-mpi.org | The Open MPI Project is an open source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from all across the High Performance Computing community in order to build the best MPI library available. | 2.1 | /opt/mpi/openmpi | module load mpi/openmpi | - | -
Pairagon | sequence alignment | - | A cDNA-to-genome alignment program based on a pair hidden Markov model, described as the most accurate aligner for sequences with high- and low-identity levels. | 1.1 | /opt/pairagon | module load pairagon/1.1 | - | -
Qiime2 | sequence analysis | https://qiime2.org/ | QIIME 2™ is a next-generation microbiome bioinformatics platform that is extensible, free, open source, and community developed. | 2.0 | installed via conda | module load qiime/2 | - | -
R | programming language | https://www.r-project.org/ | R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. | 3.5.1 | /opt/R-3.5.1 | module load R/3.5.1 | - | -
salmon | rna-seq | https://salmon.readthedocs.io/en/latest/salmon.html | Salmon is a tool for wicked-fast transcript quantification from RNA-seq data. | 0.11.3 | /opt/salmon-0.11.3 | module load salmon/0.11.3 | - | -
snowball | assembler | https://github.com/algbioi/snowball/wiki | - | 1.2 | - | module load snowball/1.2 | - | -
Spades | sequence assembly toolkit | http://cab.spbu.ru/software/spades/ | SPAdes – St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines. The current version of SPAdes works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. You can also provide additional contigs that will be used as long reads. | 3.12 | /opt/spades-3.12/bin/ | module load spades/3.12 | - | -
STAR | sequence aligner | https://github.com/alexdobin/STAR | ultrafast universal RNA-seq aligner | 2.6.0c | /opt/STAR/ | module load STAR/2.6.0c | - | -
TensorFlow | machine learning framework | https://www.tensorflow.org/ | Machine learning framework | 1.8 | - | - | You can load the working environment with: source /opt/tensorflow/virtpy/bin/activate or by loading the vica environment: module load vica/latest | Python
TopHat | sequence aligner | https://ccb.jhu.edu/software/tophat/index.shtml | TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. | 2.1.1 | - | module load tophat/2.1.1 | - | -
trf | - | https://bioinformaticshome.com/tools/DNA-sequence-analysis/descriptions/TRF.html#gsc.tab=0 | Tandem Repeats Finder is a tool to find tandem repeats in DNA sequences. The Tandem Repeats Finder algorithm uses k-tuples for matching to speed up the computation and computes consensus sequences. | 4.0.9 | - | module load trf/4.0.9 | - | -
trimal | alignment tool | https://bioweb.pasteur.fr/packages/pack@trimal@1.4.1 | A tool for automated alignment trimming in large-scale phylogenetic analyses | 1.4 | - | module load trimal/1.4 | - | -
Trimmomatic | sequence toolkit | http://www.usadellab.org/cms/?page=trimmomatic | A flexible read trimming tool for Illumina NGS data | 0.38 | - | module load trimmomatic/0.38 | - | -
Trinity | assembler | https://github.com/trinityrnaseq/trinityrnaseq/wiki | Trinity assembles transcript sequences from Illumina RNA-Seq data. | 2.8.4; 2.11; 2.15 | - | module load trinity/2.8.4; module load trinity/2.11; module load trinity/2.15 | - | -
trinotate | annotation tool | https://github.com/Trinotate/Trinotate | Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms. | 3.2.0 | - | module load trinotate/3.2.0 | - | -
trnascan | sequence analysis | https://users.soe.ucsc.edu/~lowe/thesis/node20.html | - | 1.4 | - | module load trnascan/1.4 | - | -
umap | - | - | - | 1.1.1 | - | module load umap/1.1.1 | - | -
vcftools | data toolkit | https://vcftools.sourceforge.net | VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files. | 0.1.17 | - | module load vcftools/latest | - | -
vcontact | - | https://bitbucket.org/MAVERICLab/vcontact2 | vConTACT2 is a tool to perform guilt-by-contig-association classification of viral genomic sequence data. It's designed to cluster and provide taxonomic context of viral metagenomic sequencing data. | 0.9.11 | - | module load vcontact/0.9.11 | - | -
velvet | assembler | https://www.ebi.ac.uk/~zerbino/velvet/ | Sequence assembler for very short reads | 1.2.10 | /opt/velvet | module load velvet/1.2.10 | - | -
vibrant | sequence annotation | https://github.com/AnantharamanLab/VIBRANT | Virus Identification By iteRative ANnoTation | 1.0.1 | - | module load vibrant/1.0.1 | - | -
vica | sequence analysis | https://github.com/USDA-ARS-GBRU/vica | Software to identify highly divergent DNA and RNA viruses and phages in microbiomes | - | /opt/tensorflow/virtpy/bin | module load vica/latest | - | Python
virsorter | sequence analysis | https://github.com/simroux/VirSorter | VirSorter: mining viral signal from microbial genomic data | git master branch as of Nov. 2018 | - | module load virsorter/latest | - | -
Vsearch | sequence alignment | https://github.com/torognes/vsearch | VSEARCH stands for vectorized search, as the tool takes advantage of parallelism in the form of SIMD vectorization as well as multiple threads to perform accurate alignments at high speed. VSEARCH uses an optimal global aligner (full dynamic programming Needleman-Wunsch), in contrast to USEARCH which by default uses a heuristic seed and extend aligner. This usually results in more accurate alignments and overall improved sensitivity (recall) with VSEARCH, especially for alignments with gaps. | 2.6.2; 2.21.1 | - | module load vsearch/2.6.2; module load vsearch/2.21.1 | - | -
whokaryote | sequence analysis | https://github.com/LottePronk/whokaryote | Whokaryote uses a random forest classifier that uses gene-structure based features and optionally Tiara (https://github.com/ibe-uw/tiara) predictions to predict whether a contig is from a eukaryote or from a prokaryote. You can use Whokaryote to determine which contigs need eukaryotic gene prediction and which need prokaryotic gene prediction. | git master branch | - | module load whokaryote/latest | - | -
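
The Modulefile column gives the `module load` command for each package. As a minimal sketch, a job script could wrap that command so that it fails cleanly when the `module` command is not defined in the current shell. The helper names `module_spec` and `load_tool` are hypothetical and are not provided by the cluster; the bcftools entry is used only as an example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not provided by the cluster.

# Build the "name/version" string that follows "module load"
# in the Modulefile column.
module_spec() {
    printf '%s/%s\n' "$1" "$2"
}

# Load a module, but only if the `module` command is defined in this shell.
load_tool() {
    local spec
    spec="$(module_spec "$1" "$2")"
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; cannot load ${spec}" >&2
        return 1
    fi
    module load "${spec}"
}

# Example with an entry from the list: bcftools 1.9.
if load_tool bcftools 1.9; then
    bcftools --version
fi
```

Outside the cluster, where `module` is usually not defined, the sketch only prints a message to standard error and skips the tool invocation. The name and version arguments should be taken exactly as they appear in the Modulefile column, since not every package has a `latest` module.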