.. _defining-custom-workloads:

Defining Custom Workloads
================================================

This page documents the ``JSON`` input format that FireSim uses to describe
the software workloads that run on the target design.

Most of the time, you should not be writing these files from scratch.
Instead, use :ref:`firemarshal` to build a workload (including Linux kernel
images and root filesystems) and use ``firemarshal``'s ``install`` command to
generate an initial ``.json`` file for FireSim. Once you generate a base
``.json`` with FireMarshal, you can add some of the options noted on this
page to control additional files used as inputs/outputs to/from simulations.

**Workloads** in FireSim consist of a series of **Jobs** that are assigned to
be run on individual simulations. Currently, a Workload must define one of
the following:

- A single type of job that is run on as many simulations as specified by the
  user. These workloads are usually suffixed with ``-uniform``, which
  indicates that all nodes in the workload run the same job. An example of
  such a workload is :gh-file-ref:`deploy/workloads/linux-uniform.json`.

- Several different jobs, in which case there must be exactly as many jobs as
  there are running simulated nodes. An example of such a workload is
  :gh-file-ref:`deploy/workloads/ping-latency.json`.

FireSim uses these workload definitions to help the manager deploy your
simulations. Historically, there was also a script to build workloads using
these JSON files, but it has been replaced with a more powerful tool,
:ref:`firemarshal`. New workloads should always be built with
:ref:`firemarshal`.

In the following subsections, we will go through the two aforementioned
example workload configurations, describing inline how the manager uses each
part of the JSON file. The following examples use the default buildroot-based
Linux distribution (``br-base``). To customize Fedora instead, refer to the
:ref:`booting-fedora` page.

Uniform Workload JSON
----------------------------

:gh-file-ref:`deploy/workloads/linux-uniform.json` is an example of a
"uniform" style workload, where each simulated node runs the same software
configuration. Let's take a look at this file:

.. include:: /../deploy/workloads/linux-uniform.json
   :code: json

There is also a corresponding directory named after this workload/file:

.. code-block:: bash

    centos@ip-192-168-2-7.ec2.internal:~/firesim/deploy/workloads/linux-uniform$ ls -la
    total 4
    drwxrwxr-x  2 centos centos   69 Feb  8 00:07 .
    drwxrwxr-x 19 centos centos 4096 Feb  8 00:39 ..
    lrwxrwxrwx  1 centos centos   47 Feb  7 00:38 br-base-bin -> ../../../sw/firesim-software/images/firechip/br-base/br-base-bin
    lrwxrwxrwx  1 centos centos   53 Feb  8 00:07 br-base-bin-dwarf -> ../../../sw/firesim-software/images/firechip/br-base/br-base-bin-dwarf
    lrwxrwxrwx  1 centos centos   47 Feb  7 00:38 br-base.img -> ../../../sw/firesim-software/images/firechip/br-base/br-base.img

We will elaborate on this later.

Looking at the JSON file, you'll notice that this is a relatively simple
workload definition. In this "uniform" case, the manager will name simulations
after the ``benchmark_name`` field, appending a number for each simulation
using the workload (e.g. ``linux-uniform0``, ``linux-uniform1``, and so on).
It is standard practice to keep ``benchmark_name``, the JSON filename, and the
above directory name the same. In this case, we have set all of them to
``linux-uniform``.
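If you are writing or editing one of these files by hand, a uniform-style
definition boils down to just a handful of fields. The snippet below is an
illustrative sketch only (the authoritative copy is the included
``linux-uniform.json`` above); each field is described in the paragraphs that
follow.

.. code-block:: json

    {
        "benchmark_name": "linux-uniform",
        "common_bootbinary": "br-base-bin",
        "common_rootfs": "br-base.img",
        "common_outputs": ["/etc/os-release"],
        "common_simulation_outputs": ["uartlog", "memory_stats.csv"]
    }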
Next, the ``common_bootbinary`` field represents the binary that the
simulations in this workload are expected to boot from. The manager will copy
this binary for each of the nodes in the simulation (each gets its own copy).
The ``common_bootbinary`` path is relative to the workload's directory, in
this case :gh-file-ref:`deploy/workloads/linux-uniform`. You'll notice in the
above output from ``ls -la`` that this is actually just a symlink to
``br-base-bin``, which is built by the :ref:`FireMarshal <firemarshal>` tool.

Similarly, the ``common_rootfs`` field represents the disk image that the
simulations in this workload are expected to boot from. The manager will copy
this root filesystem image for each of the nodes in the simulation (each gets
its own copy). The ``common_rootfs`` path is relative to the workload's
directory, in this case :gh-file-ref:`deploy/workloads/linux-uniform`. You'll
notice in the above output from ``ls -la`` that this is actually just a
symlink to ``br-base.img``, which is built by the
:ref:`FireMarshal <firemarshal>` tool.

The ``common_outputs`` field is a list of outputs that the manager will copy
out of the root filesystem image AFTER a simulation completes. In this simple
example, when a workload running on a simulated cluster with
``firesim runworkload`` completes, ``/etc/os-release`` will be copied out of
each rootfs and placed in the job's output directory within the workload's
output directory (see the :ref:`firesim-runworkload` section). You can add
multiple paths here. Additionally, you can use bash globbing for file names
(e.g. ``file*name``).

The ``common_simulation_outputs`` field is a list of outputs that the manager
will copy off of the simulation host machine AFTER a simulation completes. In
this example, when a workload running on a simulated cluster with
``firesim runworkload`` completes, the ``uartlog`` (an automatically generated
file that contains the full console output of the simulated system) and
``memory_stats.csv`` files will be copied out of the simulation's base
directory on the host instance and placed in the job's output directory within
the workload's output directory (see the :ref:`firesim-runworkload` section).
You can add multiple paths here. Additionally, you can use bash globbing for
file names (e.g. ``file*name``).

.. TODO: this is no longer relevant with firemarshal

**ERRATA**: "Uniform" style workloads currently do not support being
automatically built -- you can currently hack around this by building the
rootfs as a single-node non-uniform workload, then deleting the ``workloads``
field of the JSON to make the manager treat it as a uniform workload. This
will be fixed in a future release.

Non-uniform Workload JSON (explicit job per simulated node)
---------------------------------------------------------------

Now, we'll look at the ``ping-latency`` workload, which explicitly defines a
job per simulated node.

.. include:: /../deploy/workloads/ping-latency-firemarshal.json
   :code: json

Additionally, let's take a look at the state of the ``ping-latency`` directory
AFTER the workload is built (assume that a tool like :ref:`firemarshal`
already created the rootfses and Linux images):

.. code-block:: bash

    centos@ip-172-30-2-111.us-west-2.compute.internal:~/firesim-new/deploy/workloads/ping-latency$ ls -la
    total 15203216
    drwxrwxr-x  3 centos centos       4096 May 18 07:45 .
    drwxrwxr-x 13 centos centos       4096 May 18 17:14 ..
    lrwxrwxrwx  1 centos centos         41 May 17 21:58 bbl-vmlinux -> ../linux-uniform/br-base-bin
    -rw-rw-r--  1 centos centos          7 May 17 21:58 .gitignore
    -rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-1.ext2
    -rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-2.ext2
    -rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-3.ext2
    -rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-4.ext2
    -rw-r--r--  1 centos centos 1946009600 May 18 07:45 idler-5.ext2
    -rw-r--r--  1 centos centos 1946009600 May 18 07:46 idler-6.ext2
    drwxrwxr-x  3 centos centos         16 May 17 21:58 overlay
    -rw-r--r--  1 centos centos 1946009600 May 18 07:44 pingee.ext2
    -rw-r--r--  1 centos centos 1946009600 May 18 07:44 pinger.ext2
    -rw-rw-r--  1 centos centos       2236 May 17 21:58 ping-latency-graph.py

First, let's identify some of these files:

- ``bbl-vmlinux``: This workload just uses the default Linux binary generated
  for the ``linux-uniform`` workload.
- ``.gitignore``: This just ignores the generated rootfses, which you probably
  don't want to commit to the repo.
- ``idler-[1-6].ext2``, ``pingee.ext2``, ``pinger.ext2``: These are rootfses
  that we want to run on different nodes in our simulation. They can be
  generated with a tool like :ref:`firemarshal`.

Next, let's review some of the new fields present in this JSON file:

- ``common_simulation_inputs``: This is an array of extra files that you would
  like to supply to the simulator as *input*. One example is supplying files
  containing DWARF debugging info for TracerV + Stack Unwinding. See the
  :ref:`tracerv-flamegraph-workload-description` section of the
  :ref:`tracerv-with-flamegraphs` page for an example.

- ``no_post_run_hook``: This is a placeholder for running a script on your
  manager automatically once your workload completes. To use this option,
  rename it to ``post_run_hook`` and supply a command to be run. The manager
  will automatically suffix the command with the path of the workload's
  results directory.

- ``workloads``: This time, you'll notice that we have this array, which is
  populated by objects that represent individual jobs. (Note the naming
  discrepancy here: from here on out, we will refer to the contents of this
  array as **jobs** rather than **workloads**.) Each job has some additional
  fields:

  - ``name``: In this case, jobs are each assigned a name manually. These
    names MUST BE UNIQUE within a particular workload.
  - ``simulation_inputs``: Just like ``common_simulation_inputs``, but
    specific to this job.
  - ``simulation_outputs``: Just like ``common_simulation_outputs``, but
    specific to this job.
  - ``outputs``: Just like ``common_outputs``, but specific to this job.

Because these jobs do not supply a ``rootfs`` field, the manager assumes that
the rootfs for each job is named ``name``.ext2. To explicitly supply a rootfs
name that is distinct from the job name, add the ``rootfs`` field to a job and
supply a path relative to the workload's directory.

Once you specify the ``.json`` for this workload (and assuming you have built
the corresponding rootfses with :ref:`firemarshal`), you can run it with the
manager by setting ``workload_name: ping-latency.json`` in
``config_runtime.yaml``. The manager will automatically look for the generated
rootfses (based on workload and job names that it reads from the JSON) and
distribute work appropriately.
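For reference, the change in ``config_runtime.yaml`` is small. The snippet
below is a sketch only: it assumes the ``workload_name`` key lives in the
``workload`` section of your ``config_runtime.yaml``, and that every other
setting in that file is left at its existing value.

.. code-block:: yaml

    workload:
        # Point the manager at the workload definition in deploy/workloads/.
        workload_name: ping-latency.json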
Just like in the uniform case, the manager will copy back the results that we
specify in the JSON file. We'll end up with a directory in
``firesim/deploy/results-workload/`` named after the workload name, with a
subdirectory named after each job in the workload, each containing the output
files we asked for.
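As an illustration only (the exact top-level directory name also typically
includes the time at which the run was launched, and the files present depend
on the ``outputs``/``simulation_outputs`` you requested), a completed run of
``ping-latency`` would produce a layout roughly like:

.. code-block:: text

    firesim/deploy/results-workload/
        <launch-time>-ping-latency/
            pinger/
                uartlog
                ...
            pingee/
                uartlog
                ...
            idler-1/ through idler-6/
                uartlog
                ...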