TY - JOUR
T1 - Ten simple rules and a template for creating workflows-as-applications
AU - Roach, Michael J.
AU - Pierce-Ward, N. Tessa
AU - Suchecki, Radoslaw
AU - Mallawaarachchi, Vijini
AU - Papudeshi, Bhavya
AU - Handley, Scott A.
AU - Brown, C. Titus
AU - Watson-Haigh, Nathan S.
AU - Edwards, Robert A.
PY - 2022/12/15
Y1 - 2022/12/15
N2 - As bioinformatics analyses increase in size and complexity, workflow managers are becoming more popular for building pipelines. Workflow managers, such as Snakemake, Nextflow, and Cromwell with WDL or CWL, empower researchers to build robust pipelines that call a series of tools and scripts to perform a bespoke analysis. Workflow managers enable non-bioinformaticians to run published pipelines with confidence, and workflow managers with graphical user interfaces such as Galaxy and BioWorkflow have helped non-bioinformaticians create their own simple pipelines. Earlier tools for workflow management have been around for a while, including GNU Make, ruffus, doit, rake for ruby, and Makeflow. However, the integration of cluster and cloud computing support in Snakemake, Nextflow, and Cromwell helped drive their current popularity. The use of workflow managers facilitates following the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles for open scientific research. Interestingly, many existing bioinformatics command line tools are wrappers for a series of other software, but since that is the goal of workflow managers, they can be used instead. Examples of command line tools built on a workflow manager include Hecatomb, ATLAS, VirSorter2, spacegraphcats, BlobToolKit, and PGAP. These tools all consist of two key components: a convenience launcher, which provides the command line interface for the tool and compiles the configuration from user command line arguments, and the workflow pipeline and associated files, which performs the actual analysis...
AB - As bioinformatics analyses increase in size and complexity, workflow managers are becoming more popular for building pipelines. Workflow managers, such as Snakemake, Nextflow, and Cromwell with WDL or CWL, empower researchers to build robust pipelines that call a series of tools and scripts to perform a bespoke analysis. Workflow managers enable non-bioinformaticians to run published pipelines with confidence, and workflow managers with graphical user interfaces such as Galaxy and BioWorkflow have helped non-bioinformaticians create their own simple pipelines. Earlier tools for workflow management have been around for a while, including GNU Make, ruffus, doit, rake for ruby, and Makeflow. However, the integration of cluster and cloud computing support in Snakemake, Nextflow, and Cromwell helped drive their current popularity. The use of workflow managers facilitates following the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles for open scientific research. Interestingly, many existing bioinformatics command line tools are wrappers for a series of other software, but since that is the goal of workflow managers, they can be used instead. Examples of command line tools built on a workflow manager include Hecatomb, ATLAS, VirSorter2, spacegraphcats, BlobToolKit, and PGAP. These tools all consist of two key components: a convenience launcher, which provides the command line interface for the tool and compiles the configuration from user command line arguments, and the workflow pipeline and associated files, which performs the actual analysis...
KW - Bioinformatics
KW - Workflow
KW - Workflow management
UR - http://www.scopus.com/inward/record.url?scp=85144172718&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/ARC/DP220102915
U2 - 10.1371/journal.pcbi.1010705
DO - 10.1371/journal.pcbi.1010705
M3 - Article
C2 - 36520686
AN - SCOPUS:85144172718
SN - 1553-734X
VL - 18
JO - PLOS Computational Biology
JF - PLOS Computational Biology
IS - 12
M1 - e1010705
ER -