Snakemake has recently become a favorite workflow management system for many in the bioinformatics community. This repo can serve as a base from which to add rules/modules according to different workflow requirements.
A basic read QC step is included, since it is the starting point for most NGS analyses. It accepts both paired-end and single-end reads in FASTQ (.fq) format, as listed in the units.tsv file.
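As a rough illustration of how a units file can distinguish the two read layouts, here is a minimal sketch using only the Python standard library. The column names (sample, unit, fq1, fq2) and the file paths are assumptions for illustration and may not match the units.tsv in this repo; the idea is simply that an empty second FASTQ column marks a single-end unit.

```python
import csv
import io

# Hypothetical units.tsv content. The column names (sample, unit, fq1, fq2)
# and the paths are assumptions, not taken from this repo.
UNITS_TSV = (
    "sample\tunit\tfq1\tfq2\n"
    "A\t1\treads/A_R1.fq\treads/A_R2.fq\n"
    "B\t1\treads/B.fq\t\n"
)


def read_units(handle):
    """Return one dict per unit with its FASTQ files and read layout."""
    units = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # A missing or empty fq2 field means the unit is single-end.
        fq2 = (row.get("fq2") or "").strip()
        fastqs = [row["fq1"].strip()] + ([fq2] if fq2 else [])
        units.append(
            {
                "sample": row["sample"],
                "unit": row["unit"],
                "fastqs": fastqs,
                "layout": "paired-end" if fq2 else "single-end",
            }
        )
    return units


units = read_units(io.StringIO(UNITS_TSV))
for u in units:
    print(u["sample"], u["unit"], u["layout"], u["fastqs"])
```

To try it on a real file, pass an open file handle (for example `open("units.tsv", newline="")`) instead of the in-memory string, after adjusting the column names to the ones actually used.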
Conda and the Snakemake wrappers/APIs make it easy to configure tool requirements, so there is no need to set up individual tools.
Set up an environment
Install Snakemake >=5.7.0 in a global environment using pip3
pip3 install snakemake
or create an isolated environment using conda and activate it.
conda create -c bioconda -c conda-forge -n snakemake snakemake=5.7
conda activate snakemake
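Whichever installation route is used, it can be worth confirming that the installed version satisfies the >=5.7.0 requirement. The following is a minimal sketch, not part of this repo: the helper names are hypothetical, and the version parsing is deliberately simple (leading digits of each dot-separated part).

```python
import re
from importlib import metadata

MIN_VERSION = (5, 7, 0)  # the minimum version stated in this README


def version_tuple(version):
    """Turn a string such as '5.7.0' into a tuple of ints, e.g. (5, 7, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=MIN_VERSION):
    """Return True if `version` is at least `minimum`."""
    got = version_tuple(version)
    # Pad short versions such as '5.7' with zeros before comparing.
    got += (0,) * (len(minimum) - len(got))
    return got >= minimum


def installed_snakemake_version():
    """Return the installed Snakemake version string, or None if absent."""
    try:
        return metadata.version("snakemake")
    except metadata.PackageNotFoundError:
        return None


installed = installed_snakemake_version()
if installed is None:
    print("snakemake is not installed in this environment")
else:
    print(installed, "OK" if meets_minimum(installed) else "too old")
```

Inside a Snakefile, `snakemake.utils.min_version("5.7.0")` performs a comparable check and stops the workflow if the requirement is not met.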
Clone this repo
git clone https://github.com/codingene/snakemake-base.git
Dry Run (for testing)
cd snakemake-base
snakemake -n
Run the workflow
snakemake --use-conda --cores 10
It should produce a qc/multiqc.html report in the current directory.
If you are working on a server, open the HTML report as follows.
From the current directory, run
python -m http.server 8000
Then browse to the HTML file through port 8000 on the server.
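To make clearer what the http.server step does, here is a self-contained sketch that serves a directory from Python and fetches a report from it. It is only an illustration: the temporary directory and placeholder report stand in for the real workflow output, the `serve_directory` helper is hypothetical, and the sketch binds to 127.0.0.1 on a free port, whereas `python -m http.server 8000` listens on port 8000 on all interfaces by default.

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request


def serve_directory(directory, port=8000):
    """Serve `directory` over HTTP in a background thread; return the server."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder standing in for the report produced by the workflow.
    report = pathlib.Path(tmp) / "qc" / "multiqc.html"
    report.parent.mkdir()
    report.write_text("<html><body>placeholder report</body></html>")

    server = serve_directory(tmp, port=0)  # port 0 lets the OS pick a free port
    port = server.server_address[1]
    url = f"http://127.0.0.1:{port}/qc/multiqc.html"

    # Bypass any proxy settings so the local request goes straight to the server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        status = response.status
        body = response.read().decode()

    server.shutdown()
    server.server_close()
    print(url, status)
```

The path after the port mirrors the file's location relative to the served directory, which is why the command is run from the directory that contains qc/.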
Take advantage of the following when writing Snakemake files.
Use Snakemake Wrappers
Some commonly used tools can be called directly without writing out the full command syntax.
A further advantage is that Snakemake automatically downloads the corresponding tool wrapper.
For details, see the Snakemake Wrappers repository documentation.
Use Snakemake APIs
Snakemake provides API functionality that makes it easier to deal with common workflow problems.
For details, see the Snakemake API reference documentation.
Use Snakemake Utils
Some additional utilities from Snakemake.
For details, see the Snakemake Utils documentation.
The best practices in writing snakemake workflows are taken from snakemake-workflows.