Overview
This section provides a high-level overview of the Nextflow source code for users who want to understand or contribute to it. Rather than a comprehensive API documentation, these docs simply provide a conceptual map to help you understand the key concepts of the Nextflow implementation, and to quickly find code sections of interest for further investigation.
Before you dive into code, be sure to check out the CONTRIBUTING.md for Nextflow to learn about the many ways to contribute to the project.
IntelliJ IDEA
The suggested development environment is IntelliJ IDEA. Nextflow development with IntelliJ IDEA requires a recent version of the IDE (2019.1.2 or later).
After installing IntelliJ IDEA, use the following steps to use it with Nextflow:
Clone the Nextflow repository to a directory in your computer.
Open IntelliJ IDEA and go to File > New > Project from Existing Sources….
Select the Nextflow project root directory in your computer and click OK.
Select Import project from external model > Gradle and click Finish.
After the import process completes, select File > Project Structure….
Select Project, and make sure that the SDK field contains Java 11 (or later).
Go to File > Settings > Editor > Code Style > Groovy > Imports and apply the following settings:
Use single class import
Class count to use import with ‘*’:
99
Names count to use static import with ‘*’:
99
Imports layout:
import java.*
import javax.*
blank line
all other imports
all other static imports
New files must include the appropriate license header boilerplate and the author name(s) and contact email(s) (see for example).
Groovy
Nextflow is written in Groovy, which is itself a programming language based on Java. Groovy is designed to be highly interoperable with Java – Groovy programs compile to Java bytecode, and nearly any Java program is also a valid Groovy program. However, Groovy adds several language features (e.g. closures, list and map literals, optional typing, optional semicolons, meta-programming) and standard libraries (e.g. JSON and XML parsing) that greatly improve the overall experience of developing for the Java virtual machine.
Recommended resources for Groovy, from most reference-complete to most user-friendly, are listed below:
Software Dependencies
Nextflow depends on a variety of libraries and frameworks, the most prominent of which are listed below:
AWS SDK for Java 1.x: AWS integration
Azure SDK for Java: Azure integration
Google Cloud Client Libraries for Java: Google Cloud integration
GPars: dataflow concurrency
Gradle: build automation
JCommander: command line interface
JGit: Git integration
Kryo: serialization
LevelDB: key-value store for the cache database
Logback: application logging
PF4J: plugin extensions
Spock: unit testing framework
Any other integrations are likely implemented using a CLI (e.g. Conda, Docker, HPC schedulers) or REST API (e.g. Kubernetes).
Class Diagrams
Each package has a class diagram, abridged and annotated for relevance and ease of use.
Each node is a class. Fields are selectively documented in order to show only the core data structures and the classes that “own” them. Methods are not explicitly documented, but they are mentioned in certain links where appropriate. Links are selectively documented in order to show only the most important classes and relationships.
Links between classes denote one of the following relationships:
Inheritance (
A <|-- B
):B
is a subclass ofA
Composition (
A --* B
):A
containsB
Instantiation (
A --> B : f
):A
creates instance(s) ofB
at runtime viaA::f()
See Packages for the list of Nextflow packages.
Warning
Class diagrams are manually curated, so they might not always reflect the latest version of the source code.
Building from source
If you are interested in modifying the source code, you only need Java 11 or later to build Nextflow from source. Nextflow uses the Gradle build automation system, but you do not need to install Gradle to build Nextflow. In other words, if you can run Nextflow, then you can probably build it too!
To build locally from a branch (useful for testing PRs):
git clone -b <branch> git@github.com:nextflow-io/nextflow.git
cd nextflow
make compile
The build system will automatically download all of the necessary dependencies on the first run, which may take several minutes.
Once complete, you can run your local build of Nextflow using the launch.sh
script in place of the nextflow
command:
./launch.sh run <script> ...
A self-contained executable Nextflow package can be created with the following command:
make pack
Again, use launch.sh
in place of the nextflow
command to use your local build.
Testing
To run the unit tests:
# run all tests
make test
# run individual test
make test module=<nextflow|plugins:nf-amazon|...> class=<package>.<class>.<method>
# refer to the Makefile for all build rules
When a test fails, it will give you a report that you can open in your browser to view the reason for each failed test. The Standard output tab is particularly useful as it shows the console output of each test.
Refer to the build.yml configuration to see how to run integration tests locally, if you are interested.
Installing from source
The nextflow
command is just a Bash script that downloads and executes the Nextflow JAR. When you install Nextflow using get.nextflow.io
, it only downloads this launcher script, while the Nextflow JAR is downloaded on the first Nextflow run.
You can run make install
to install a local build of the Nextflow JAR to $NXF_HOME
. Note that this command will overwrite any existing Nextflow packages with the same version. This approach is useful for testing non-core plugins with a local build of Nextflow.
If you need to test changes to the nextflow
launcher script, you can run it directly as ./nextflow
, or you can install it using cp nextflow $(which nextflow)
and then run it as nextflow
.
Debugging
Groovy REPL
The groovysh
command provides a command-line REPL that you can use to play around with Groovy code independently of Nextflow. The groovyConsole
command provides a graphical REPL similar to nextflow console
. These commands require a standalone Groovy distribution, which can be installed as described for Java in Getting started.
IntelliJ IDEA
New in version 23.09.0-edge.
You can perform limited breakpoint debugging on a Nextflow script using IntelliJ IDEA.
Set a breakpoint in your Nextflow script by clicking on a line number.
Run
nextflow -remote-debug run <script>
Select the Run / Debug Configurations dropdown, select Edit Configurations…, and create a new configuration of type Remote JVM Debug. Set the port that appeared in the terminal when you launched your Nextflow script. Click OK.
Select the green bug icon to begin the remote debug session. The Debug window will appear and allow you to step through and inspect your script as it runs.
Note that this approach can only be used to debug the script execution, which does not include the pipeline execution.