Snakemake
From charlesreid1
Snakemake is a pythonic way of writing makefiles.
Snakemake works with distributions like Anaconda to package and install dependencies. I recommend using pyenv to keep Python installations organized and separate.
Installing
Pyenv
Regardless of what version or flavor of Python you wish to use, you can manage it using pyenv.
Quick start:
pyenv install anaconda3-5.0.1
pyenv global anaconda3-5.0.1
eval "$(pyenv init -)"
Check you have conda shimmed to your path:
which conda
Now install snakemake:
conda install -c bioconda snakemake
Conda
Snakemake is intended to be used in conjunction with Conda, a package and environment manager that can install software both inside and outside of the Python ecosystem.
conda install -c bioconda snakemake
This installs snakemake from the bioconda channel.
For a more detailed comparison of Miniconda and Anaconda, see: https://conda.io/docs/user-guide/install/download.html
Regular Python
If you don't need to install dependencies via conda, install snakemake via pip:
pip install snakemake
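Whichever install route you take, it can help to confirm that the executables actually ended up on your path. The following is a minimal sketch using only the Python standard library; the helper name find_tool is made up for this page and is not part of Snakemake or Conda.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is absent."""
    return shutil.which(name)


# Report where (or whether) each tool was found.
# A pip-only install is expected to report conda as missing.
for tool in ("conda", "snakemake"):
    path = find_tool(tool)
    print(f"{tool}: {path if path else 'not found on PATH'}")
```

This does the same job as running which conda and which snakemake in the shell.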
Using Snakemake
How It Works
The central idea behind Snakemake is to combine the flexibility and readability of Python's syntax with the concept of rules from Makefiles.
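A rule in Snakemake describes how to create one or more output files from one or more input files. In the rule below, {dataset} is a wildcard, so the same rule applies to any dataset name: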
rule plot:
    input:
        "raw/{dataset}.csv"
    output:
        "plots/{dataset}.pdf"
    shell:
        "somecommand {input} {output}"
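To make the mapping concrete, here is a rough sketch in plain Python of what happens when a target such as plots/experiment1.pdf is requested: the output pattern is matched against the target to infer the wildcard value, and that value is then substituted into the input pattern. This only illustrates the idea and is not Snakemake's actual implementation; the function name resolve_input and the file name experiment1 are made up for this example.

```python
import re


def resolve_input(output_pattern, input_pattern, target):
    """Infer wildcard values from `target` and fill them into `input_pattern`.

    Illustration only: mimics the idea of wildcard matching, not Snakemake's code.
    Returns None when the output pattern cannot produce the target.
    """
    # Splitting on {name} yields literal text at even indices
    # and wildcard names at odd indices.
    parts = re.split(r"\{(\w+)\}", output_pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)      # literal path fragment
        else:
            regex += f"(?P<{part}>.+)"    # wildcard becomes a named group
    match = re.fullmatch(regex, target)
    if match is None:
        return None
    return input_pattern.format(**match.groupdict())


print(resolve_input("plots/{dataset}.pdf", "raw/{dataset}.csv", "plots/experiment1.pdf"))
```

With the plot rule, asking for plots/experiment1.pdf therefore implies that raw/experiment1.csv is needed as input, which is how a single rule can cover every dataset.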
Walkthrough Example
Walkthrough example with a non-trivial bioinformatics workflow called dahak-flot: Snakemake/DahakFlot
Strategies and Patterns
Strategies for designing Snakemake files and patterns for complex workflows: Snakemake/Patterns
Links
Projects
Snakemake documentation: https://snakemake.readthedocs.io/en/stable/
Snakemake for Bioconda documentation: https://bioconda.github.io/recipes/snakemake/README.html
Snakemake Biocontainer: https://quay.io/repository/biocontainers/snakemake
Snakemake wrappers for common bioinformatics components: https://snakemake-wrappers.readthedocs.io/en/stable/
Documentation
Rules: https://snakemake.readthedocs.io/en/latest/snakefiles/rules.html#targets
Setup/tutorial: https://snakemake.readthedocs.io/en/stable/tutorial/setup.html#step-1-installing-miniconda-3
Examples
A SingleCell RNASeq pre-processing pipeline built on snakemake: https://github.com/Hoohm/dropSeqPipe/blob/master/Snakefile
Amplicon trimming workflow using a short snakemake file: https://github.com/snakemake-workflows/accel-amplicon-trimming/blob/master/Snakefile
Both are listed on the snakemake workflows repo page: https://github.com/snakemake-workflows/docs