Day 11: Planning to fail

I spent most of today in research, which I found quite relaxing after the go, go, go! of previous implementation sessions. One strategy that helped (which I can thank my lovely partner for recommending) is using a visual timer while I worked. This makes it easy to see how far I am into my planned session length. I thought this might increase the pressure, but I found it freeing: I only have this amount of time, so I have to be happy with what I have done at the end. It also made the passage of time feel much more concrete, rather than the tunnel of hyperfocus that is my usual working state. Sipping a nice cup of green tea while I worked and listening to pareidolia probably helped too.

We have some jobs to do

As mentioned previously, I want Séance to be robust in the face of failure, able to pick back up where it left off at a moment's notice despite network conditions, even if it was forcefully quit during its previous run. To do this, I landed on a design quite similar to a build system, where we build out a directed acyclic graph of tasks and dependencies and then execute it. Reflecting on the inspirations for this idea, I recalled a blog post I had previously read, The plan-execute pattern, which generalizes the concept.

I encourage you to read it for yourself, but I will summarize what I took from it here. Splitting a complex task into multiple individually comprehensible pieces is generally a good idea because it lets you test and reason about smaller units. One such way to break up a complex task is to separate planning from execution. When first starting, the program constructs a data structure that boils down its intended strategy for solving the problem. This logic can often be complex on its own; for a build system it means at minimum a graph search. We therefore want to test the “planner” component in isolation, feeding it an input problem and checking that it produces a reasonable-looking plan. This has the extra benefit of making a “dry-run” mode trivial.
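As a concrete sketch of that split (entirely hypothetical names here, not from Séance or the article): the planner is a pure function from a problem description to a list of steps, and a dry-run mode falls out by printing the plan instead of executing it.

```rust
// Hypothetical sketch of the plan-execute split; none of these names
// come from Séance itself.

#[derive(Debug, PartialEq)]
enum Step {
    Fetch(String),            // download a URL
    Decode { input: String }, // process a previously fetched file
}

/// The "planner": a pure function from a problem description to a plan.
/// Because it performs no I/O, it is trivial to unit-test.
fn plan(urls: &[&str]) -> Vec<Step> {
    let mut steps = Vec::new();
    for url in urls {
        steps.push(Step::Fetch(url.to_string()));
        steps.push(Step::Decode { input: url.to_string() });
    }
    steps
}

/// A "dry-run" mode comes for free: print the plan, execute nothing.
fn dry_run(steps: &[Step]) {
    for step in steps {
        println!("would run: {step:?}");
    }
}

fn main() {
    let steps = plan(&["https://example.com/a"]);
    assert_eq!(steps.len(), 2);
    assert_eq!(steps[0], Step::Fetch("https://example.com/a".into()));
    dry_run(&steps);
}
```

A test of the planner never opens a socket: it hands in a problem and inspects the returned plan as plain data.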

Execution can be similarly complex, needing to adapt and respond to failure through retries, and to parallelize work for performance. If the input to the executor is just the plan, which we can make a pure data structure, then testing it is similarly easy.

I particularly like this formulation of the pattern because it also provides a clean, functional interface between the two most complex aspects of the program. Even better, we can represent the executor as a state machine, making it possible to exercise it without actually performing I/O, an echo of the sans-io pattern.
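To illustrate (again a hypothetical sketch, not Séance's actual design): a sans-io executor never touches the network itself. It only tells its caller which task to run next, and the caller feeds the outcome back in, so tests can drive it with canned outcomes.

```rust
// Hypothetical sans-io executor sketch: the executor owns no sockets or
// files; the caller performs each effect and reports the result back.

/// What the caller should do next.
#[derive(Debug, PartialEq)]
enum Next {
    Run(usize), // index of the task to execute
    Done,
}

/// Outcome the caller feeds back in after attempting a task.
enum Outcome {
    Success,
    Failure,
}

struct Executor {
    total: usize,
    cursor: usize,
    retries_left: u32,
}

impl Executor {
    fn new(total: usize) -> Self {
        Executor { total, cursor: 0, retries_left: 3 }
    }

    /// What should run right now, given the current state.
    fn current(&self) -> Next {
        if self.cursor >= self.total {
            Next::Done
        } else {
            Next::Run(self.cursor)
        }
    }

    /// Advance the state machine. No I/O happens here, so tests can
    /// drive it with canned outcomes.
    fn step(&mut self, outcome: Outcome) -> Next {
        match outcome {
            Outcome::Success => {
                self.cursor += 1;
                self.retries_left = 3;
            }
            Outcome::Failure => {
                // Retry the same task, burning down the retry budget.
                self.retries_left = self.retries_left.saturating_sub(1);
            }
        }
        self.current()
    }
}

fn main() {
    let mut exec = Executor::new(2);
    assert_eq!(exec.current(), Next::Run(0));
    assert_eq!(exec.step(Outcome::Success), Next::Run(1)); // task 0 done
    assert_eq!(exec.step(Outcome::Failure), Next::Run(1)); // retry task 1
    assert_eq!(exec.step(Outcome::Success), Next::Done);
}
```

Giving up once the retry budget hits zero is omitted from the sketch; the point is that retry policy becomes a pure state transition a unit test can poke at directly.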

A slight change of plans

The example given in the article is a build system, where the definition of the build graph (inputs, outputs, and rules mapping between them) has to be fully dynamic. This is reflected in the type modeling, which represents nodes quite generically:

struct Node {
    id: NodeId,
    inputs: Vec<ArtifactId>,
    outputs: Vec<ArtifactId>,
    transform: Command,
}

In the case of Séance, though, we control the definition of the build graph, and so can represent it in a more structured way. In particular, the above definition does not enforce that we have the correct number of inputs or outputs for the given transform. Downloading a file, for example, always produces a single output, but we are not encoding that knowledge in the type system, and therefore not making illegal states unrepresentable.

My current design of a stricter build graph looks like this:

// Clone lets tasks hand out owned copies of their artifact ids.
#[derive(Clone)]
pub struct ArtifactId(PathBuf);

pub(crate) trait Task {
    fn inputs(&self) -> Box<dyn Iterator<Item = ArtifactId>>;
    fn outputs(&self) -> Box<dyn Iterator<Item = ArtifactId>>;
    // Takes `self: Box<Self>` so the trait stays object-safe while
    // still consuming the task on execution.
    fn execute(self: Box<Self>) -> Result<()>;
}

pub(crate) struct Plan {
    tasks: Vec<Box<dyn Task>>,
}

struct Download {
    url: Url,
    output: ArtifactId,
}

impl Task for Download {
    fn outputs(&self) -> Box<dyn Iterator<Item = ArtifactId>> {
        Box::new(std::iter::once(self.output.clone()))
    }

    // ...
}

This is, of course, heavily inspired by the structure used in the blog post, but tweaked so that each Node (renamed Task) owns its own set of inputs and outputs. I expect that this will make executing them simpler, and hope that the dynamic iterator method is sufficient for any necessary graph algorithms. I am using iterators instead of returning a Vec for two reasons: it decouples the data structure that the planner/executor need from a specific storage class (e.g. they could collect the iterator into a HashSet if desired), and it avoids allocating a temporary Vec. I have a bit of a paranoia about lurking performance issues with dynamic memory allocation, and may later rework the planner to use a bump allocator like bumpalo to avoid the issue entirely. Alternatively, it might turn out that all my tasks only need a single output, which would be even simpler.
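For a sense of why declared inputs and outputs might be enough for the graph work, here is a hypothetical sketch (plain structs standing in for the trait objects above, strings standing in for artifact paths) of deriving a topological execution order by mapping each artifact to the task that produces it:

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical sketch: a topological order over tasks, derived only
// from their declared inputs and outputs.
struct Task {
    name: &'static str,
    inputs: Vec<&'static str>,
    outputs: Vec<&'static str>,
}

fn topo_order(tasks: &[Task]) -> Vec<&'static str> {
    // Map each artifact to the index of the task that produces it.
    let mut producer: HashMap<&str, usize> = HashMap::new();
    for (i, t) in tasks.iter().enumerate() {
        for &out in &t.outputs {
            producer.insert(out, i);
        }
    }
    // Count unmet dependencies per task (Kahn's algorithm).
    let mut indegree = vec![0usize; tasks.len()];
    let mut dependents: Vec<Vec<usize>> = vec![Vec::new(); tasks.len()];
    for (i, t) in tasks.iter().enumerate() {
        for &input in &t.inputs {
            // Inputs with no producer are assumed to already exist.
            if let Some(&p) = producer.get(input) {
                indegree[i] += 1;
                dependents[p].push(i);
            }
        }
    }
    // Repeatedly run whatever has no remaining dependencies. A cycle
    // would show up as a shorter-than-expected result.
    let mut ready: VecDeque<usize> =
        (0..tasks.len()).filter(|&i| indegree[i] == 0).collect();
    let mut order = Vec::new();
    while let Some(i) = ready.pop_front() {
        order.push(tasks[i].name);
        for &d in &dependents[i] {
            indegree[d] -= 1;
            if indegree[d] == 0 {
                ready.push_back(d);
            }
        }
    }
    order
}

fn main() {
    let tasks = [
        Task { name: "unpack", inputs: vec!["archive"], outputs: vec!["files"] },
        Task { name: "download", inputs: vec![], outputs: vec!["archive"] },
    ];
    assert_eq!(topo_order(&tasks), vec!["download", "unpack"]);
}
```

Nothing here needs to know how a task stores its artifacts, only what it consumes and produces, which is exactly what the iterator methods expose.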