snakemake base

Lately, Smakemake become a favorite workflow management system for most in the bioinformatics community. This repo can serve as a base to start adding rules/modules as per diffrent workflow requirements.

Basic read QC step is added which is start point for most of NGS analysis. It accepts both paired-end and single-end reads in fastq(.fq) format as mentioned in units.tsv file.

Use of conda and snakemake wrappers/APIs making it really easy to configure tool requirements, so no need to setup individual tools.

Working Methodology

Setup an enviroment

Insatall Snakemake>=5.7.0 in a global enviroment using pip3

pip3 install snakemake

or make an isolated enviroment using conda and activate it.

conda create -c bioconda -c conda-forge -n snakemake snakemake=5.7
conda activate snakemake

Clone this repo

git clone https://github.com/codingene/snakemake-base.git

Dry Run (for testing)

cd snakemake-base
snakemake -n 

Run the workflow

snakemake --use-conda --cores 10

Results

It should produce an qc/multiqc.html report on current direcotry.

If you working on a server open the html with following

From current directory run

python -m http.server 8000

browse html file with

http://0.0.0.0:8000/qc/multiqc.html

Adding Rules/Modules

A specific workflow can be created added by adding rules/modules (.smk files). For example see here for alignment.

Take advantage of followings to write snakemake files.

Use Snakemake Wrappers

Some commonly used tool can be called directly without writing the full syntax. Also the advantage, it will automatically download the corresponding tool wrapper with --use-conda flag.

In details - The Snakemake Wrappers repository doc

Use Snakemake APIs

Snakemake give some API functionality to make life easy to deal with common workflow problems.

In details - Snakemake-API reference doc

Use Snakemeke Utils

Some addtional utils from snakemake.

In details - Snakemake Utils doc

Acknowledgement

The best practices in writing snakemake workflows are taken from snakemake-workflows.