Loading and saving workflows

This tutorial uses a workflow with a single node that passes an object through unchanged:

from ewokscore.task import Task

class PassThroughTask(Task, input_names=["object"], output_names=["object"]):
    def run(self):
        self.outputs.object = self.inputs.object


workflow = {
    "graph": {"id": "testworkflow", "schema_version": "1.1"},
    "nodes": [
        {
            "id": "task1",
            "task_type": "class",
            "task_identifier": "__main__.PassThroughTask",
        },
    ],
    "links": [],
}
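Before saving a workflow, it can help to sanity-check the dictionary. The helper below is a hypothetical illustration and not part of ewokscore. It assumes only the graph/nodes/links layout shown above and that each link refers to nodes through "source" and "target" keys.

```python
def check_workflow(workflow: dict) -> list:
    """Return a list of problems found in a workflow dictionary (empty if none).

    Hypothetical helper for illustration; not an ewokscore function.
    """
    problems = []
    if "id" not in workflow.get("graph", {}):
        problems.append("graph has no 'id'")
    node_ids = [node.get("id") for node in workflow.get("nodes", [])]
    if len(node_ids) != len(set(node_ids)):
        problems.append("node ids are not unique")
    # Assumption: links name their end points with "source" and "target".
    for link in workflow.get("links", []):
        for end in ("source", "target"):
            if link.get(end) not in node_ids:
                problems.append(f"link {end} {link.get(end)!r} is not a node id")
    return problems


workflow = {
    "graph": {"id": "testworkflow", "schema_version": "1.1"},
    "nodes": [
        {
            "id": "task1",
            "task_type": "class",
            "task_identifier": "__main__.PassThroughTask",
        },
    ],
    "links": [],
}

problems = check_workflow(workflow)
print(problems)
```

An empty list means none of the checked problems were found; it does not guarantee that the workflow can be executed.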

Workflows can be saved in JSON or YAML format:

from ewokscore import convert_graph

convert_graph(
    workflow,
    "myresults/workflows.json",
    inputs=[{"id": "task1", "name": "object", "value": 42}],
)
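To make the role of the inputs argument more concrete, the sketch below mimics the idea with the standard library only. It assumes that each input ends up stored on its node under a "default_inputs" key; the exact layout that ewokscore writes may differ, and attach_inputs is a hypothetical helper, not a library function.

```python
import json
import os
import tempfile
from copy import deepcopy


def attach_inputs(workflow: dict, inputs: list) -> dict:
    """Hypothetical helper: copy the workflow and store each input on its node."""
    result = deepcopy(workflow)
    nodes = {node["id"]: node for node in result["nodes"]}
    for item in inputs:
        node = nodes[item["id"]]
        # Assumption: inputs are kept per node under "default_inputs".
        node.setdefault("default_inputs", []).append(
            {"name": item["name"], "value": item["value"]}
        )
    return result


workflow = {
    "graph": {"id": "testworkflow", "schema_version": "1.1"},
    "nodes": [
        {
            "id": "task1",
            "task_type": "class",
            "task_identifier": "__main__.PassThroughTask",
        },
    ],
    "links": [],
}

saved = attach_inputs(workflow, [{"id": "task1", "name": "object", "value": 42}])

# Write to a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmpdir:
    filename = os.path.join(tmpdir, "workflows.json")
    with open(filename, "w") as f:
        json.dump(saved, f, indent=2)
    with open(filename) as f:
        reloaded = json.load(f)
```

Because 42 is a native JSON type, the value survives the round trip without a serializer.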

Loading workflows from JSON or YAML can be done with the same function:

workflow = convert_graph(
    "myresults/workflows.json",
    None,
)

Conversion between file formats can be done like this:

workflow = convert_graph(
    "myresults/workflows.json",
    "myresults/workflows.yaml",
)

Workflow representations

In short, convert_graph converts between workflow representations, whether stored in a file or held in memory as a Python object (dictionary, string, TaskGraph from ewokscore, Graph from networkx). When automatic inference of the representation fails, it can be specified explicitly:

workflow = convert_graph(
    "myresults/workflows.json",
    "myresults/workflows.yaml",
    load_options={"representation": "json"},
    save_options={"representation": "yaml"},
)
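How the representation might be inferred can be sketched as follows. This is a guess at the general idea, not ewokscore's actual logic: guess_representation is a hypothetical function that maps a source to one of the representation names used in this tutorial, based on its Python type or file extension.

```python
import os


def guess_representation(source):
    """Hypothetical inference: return a representation name, or None if unknown."""
    if isinstance(source, dict):
        return "json_dict"
    if isinstance(source, str):
        # A string that starts with "{" is treated as JSON text, not a filename.
        if source.lstrip().startswith("{"):
            return "json_string"
        extension = os.path.splitext(source)[1].lower()
        if extension == ".json":
            return "json"
        if extension in (".yaml", ".yml"):
            return "yaml"
    return None


for source in ["myresults/workflows.json", "myresults/workflows.yaml", {}, "workflow"]:
    print(repr(source), "->", guess_representation(source))
```

A source for which such a guess returns None, for example a file without a recognizable extension, is the kind of case where passing the representation explicitly is needed.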

ewokscore supports the following representations:

Representation   Data types        Notes
"json"           JSON types only   JSON file
"json_dict"      Python types      Python dictionary
"json_string"    JSON types only   JSON string
"json_module"    JSON types only   JSON file distributed in a Python package
"yaml"           YAML types only   YAML file
"test_core"      Python types      Demo workflows provided by ewokscore

Serialization

If the representation does not support the data type of an input value, a serializer can be used. The serializer converts the value before the workflow is written as JSON or YAML and restores it after the workflow is read back from JSON or YAML.

For example, json_pickle allows storing any picklable object in JSON or YAML, in this case a numpy array:

import numpy
from ewokscore import convert_graph

convert_graph(
    workflow,
    "myresults/workflows.json",
    save_options={"serializer": "json_pickle"},
    inputs=[{"id": "task1", "name": "object", "value": numpy.arange(5)}],
)

ewokscore supports the following serializers:

Serializer       Notes
"json"           Serialize like Python's json module
"json_pickle"    Preserve native JSON types and pickle all other types
"hdf5_pickle"    Preserve native JSON and numpy types and pickle all other types
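The idea of "preserve native JSON types and pickle all other types" can be illustrated with the standard library. This is a minimal sketch under stated assumptions, not the format ewokscore writes: the "__pickled__" marker and the encode and decode functions are hypothetical.

```python
import base64
import json
import pickle

PICKLE_KEY = "__pickled__"  # hypothetical marker, not the format ewokscore uses


def encode(value):
    """Keep JSON-native values as they are; wrap anything else as pickled text."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, list):
        return [encode(item) for item in value]
    if isinstance(value, dict) and all(isinstance(key, str) for key in value):
        return {key: encode(item) for key, item in value.items()}
    # Everything else is pickled and stored as base64 text so it fits in JSON.
    payload = base64.b64encode(pickle.dumps(value)).decode("ascii")
    return {PICKLE_KEY: payload}


def decode(value):
    """Reverse encode(): unpickle wrapped values, leave the rest untouched."""
    if isinstance(value, list):
        return [decode(item) for item in value]
    if isinstance(value, dict):
        if set(value) == {PICKLE_KEY}:
            return pickle.loads(base64.b64decode(value[PICKLE_KEY]))
        return {key: decode(item) for key, item in value.items()}
    return value


# A set is not a JSON type, so it takes the pickle path.
inputs = [{"id": "task1", "name": "object", "value": {1, 2, 3}}]
text = json.dumps(encode(inputs))
restored = decode(json.loads(text))
```

Unpickling can execute arbitrary code, so files written with a pickle-based serializer should only be loaded when they come from a trusted source.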