MPUSP/snakemake-ont-basecalling

A Snakemake workflow for basecalling and demultiplexing of Oxford Nanopore data using Dorado.

Overview

Last update: 2026-01-08

Latest release: v1.5.2

Topics: basecalling cluster dorado nanopore-sequencing oxford-nanopore parallel-computing slurm snakemake snakemake-workflow

Authors: @m-jahn @rabioinf

Configuration

The following configuration details are extracted from the config's README file.


Running the workflow

Input data

This workflow requires pod5 input data. The input files are supplied through a mandatory runs table that is linked in the config.yml file (default: .test/config/runs.csv). Each row in the runs table corresponds to a single run, and its data_folder column points to the folder that contains all pod5 files of that run. Multiple runs can be defined in the table. The runs table has the following layout:

| run_id | data_folder | basecalling_model | barcode_kit |
| --- | --- | --- | --- |
| MK1C_run_01 | ".test/data" | dna_r10.4.1_e8.2_400bps_sup@v5.0.0 | SQK-PCB114-24 |
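Before starting a long basecalling run, it can help to check that the runs table is complete. The following is a minimal sketch, not part of the workflow itself. It assumes the table is a comma-separated file with the four columns shown above, and it treats `run_id` as unique, which is an assumption rather than a documented rule. The function name `read_runs_table` is hypothetical.

```python
import csv
import io

# Column names taken from the runs table layout shown above.
REQUIRED_COLUMNS = ("run_id", "data_folder", "basecalling_model", "barcode_kit")


def read_runs_table(handle):
    """Parse a runs table and return one dict per run, keyed by column name."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"runs table is missing columns: {missing}")

    runs = []
    seen_ids = set()
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, row in enumerate(reader, start=2):
        empty = [col for col in REQUIRED_COLUMNS if not (row[col] or "").strip()]
        if empty:
            raise ValueError(f"line {line_no}: empty value for {empty}")
        # Assumption: each run_id should appear only once.
        if row["run_id"] in seen_ids:
            raise ValueError(f"line {line_no}: duplicate run_id {row['run_id']!r}")
        seen_ids.add(row["run_id"])
        runs.append({col: row[col].strip() for col in REQUIRED_COLUMNS})
    return runs


# In-memory example mirroring the row shown above.
example = io.StringIO(
    "run_id,data_folder,basecalling_model,barcode_kit\n"
    'MK1C_run_01,".test/data",dna_r10.4.1_e8.2_400bps_sup@v5.0.0,SQK-PCB114-24\n'
)
runs = read_runs_table(example)
print(runs[0]["data_folder"])  # prints .test/data (the CSV quotes are stripped)
```

To check a real file, pass an open file handle instead of the in-memory example, for instance `read_runs_table(open(".test/config/runs.csv", newline=""))`.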

Execution

The workflow uses configuration profiles to define rule-specific resources such as GPU usage. See the snakemake docs on profiles for more information. A default profile for local testing and a slurm-specific cluster profile are provided with this workflow.

To run the workflow from the command line, change to the working directory and activate the conda environment.

cd snakemake-ont-basecalling
conda activate snakemake-ont-basecalling

Adjust options in the default config file config/config.yml. Before running the entire workflow, perform a dry run using:

snakemake --cores 3 --sdm conda --directory .test --dry-run

To run the workflow with test files using conda:

snakemake --cores 3 --sdm conda --directory .test

To run the workflow with test files using conda and apptainer, set the dorado path to /share/resources/dorado-<version>-linux-x64/bin/dorado and make it available for apptainer using bind:

snakemake --cores 3 --sdm conda apptainer --directory .test --apptainer-args "--bind ../resources:/share/resources"

To run the workflow with test files on a slurm cluster, adjust the slurm-specific profile file workflow/profiles/slurm/config.yaml and run:

snakemake --cores 3 --sdm conda --workflow-profile workflow/profiles/slurm/ --directory .test

Note: It is recommended to start the snakemake pipeline on the cluster using a session multiplexer like screen or tmux.
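The command variants in this section differ only in a few options, so they can be assembled from one helper when the workflow is launched from a script. This is a rough sketch that uses only the flags shown above. The function `build_snakemake_command` and its parameter names are hypothetical, and the sketch only builds and prints the command without executing it.

```python
import shlex


def build_snakemake_command(cores=3, directory=".test", dry_run=False,
                            use_apptainer=False, bind=None,
                            workflow_profile=None):
    """Assemble a snakemake argument list from the options used in this section."""
    cmd = ["snakemake", "--cores", str(cores), "--sdm", "conda"]
    if use_apptainer:
        # "apptainer" is a second value of --sdm, as in the command shown above.
        cmd.append("apptainer")
        if bind:
            cmd += ["--apptainer-args", f"--bind {bind}"]
    if workflow_profile:
        cmd += ["--workflow-profile", workflow_profile]
    cmd += ["--directory", directory]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


# Dry run with the test files.
print(shlex.join(build_snakemake_command(dry_run=True)))
# prints: snakemake --cores 3 --sdm conda --directory .test --dry-run

# Run on a slurm cluster with the provided profile.
print(shlex.join(build_snakemake_command(workflow_profile="workflow/profiles/slurm/")))
```

The returned list can be passed to `subprocess.run` unchanged, which avoids quoting problems with the `--apptainer-args` value.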

Parameters

This table lists all parameters that can be used to run the workflow.

| Parameter | Type | Details | Default |
| --- | --- | --- | --- |
| input | | | |
| runs | string | table with sequencing runs | config/runs.csv |
| file_extension | string | extension for input files | pod5 |
| file_regex | string | pattern to match input files | [A-Z]{3}[0-9]{5}... |
| barcodes | string | used barcodes for demultiplexing | 1-24 |
| dorado | | | |
| path | string | path to the Dorado executable | |
| simplex / cuda | string | CUDA device: auto, cuda:0, cuda:all | cuda:all |
| simplex / trim | string | all or none | none |
| simplex / extra | string | params passed to dorado basecaller | "" |
| demultiplexing | bool | whether to perform demultiplexing | True |
| report | | | |
| tools | array | list of tools to include in the report | ["pycoQC", "NanoPlot"] |
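The value constraints in this table can be checked before a run starts. The sketch below is illustrative only. The nesting of the dictionary is an assumption about how config.yml is laid out, `check_config` is a hypothetical helper, and accepting any `cuda:<index>` device is an assumption that generalizes the `cuda:0` example. The `file_regex` and `path` entries are left out because their values are not fully given above.

```python
import re

# Defaults copied from the parameter table. The key nesting is an assumption,
# not a copy of the real config.yml.
config = {
    "input": {"runs": "config/runs.csv", "file_extension": "pod5", "barcodes": "1-24"},
    "dorado": {
        "simplex": {"cuda": "cuda:all", "trim": "none", "extra": ""},
        "demultiplexing": True,
    },
    "report": {"tools": ["pycoQC", "NanoPlot"]},
}


def check_config(cfg):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    simplex = cfg["dorado"]["simplex"]
    # Accept auto, cuda:all, or a single device index such as cuda:0.
    if not re.fullmatch(r"auto|cuda:(all|\d+)", simplex["cuda"]):
        problems.append(f"unexpected cuda device: {simplex['cuda']!r}")
    if simplex["trim"] not in ("all", "none"):
        problems.append(f"trim must be 'all' or 'none', got {simplex['trim']!r}")
    if not isinstance(cfg["dorado"]["demultiplexing"], bool):
        problems.append("demultiplexing must be a boolean")
    if not isinstance(cfg["report"]["tools"], list):
        problems.append("report tools must be a list")
    return problems


print(check_config(config))  # prints [] for the defaults above
```

Returning a list of messages instead of raising on the first error makes it possible to report every misconfigured parameter at once.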