Day 7: Feeding the config

Now that I am successfully reading and parsing the config file, the natural next step is to modify it. To review, I decided that I want to store feed URLs in a plain-text config file so that people can sync them along with their dotfiles. That said, reading the docs to learn where a config file is stored and what its options are can be a pain. I often explore newly installed command line programs using only the --help options. Therefore, I think it would be convenient to be able to edit the config file in basic ways without having to open it directly. Today I implemented the first version of that.

For arguments’ sake

The MVP API that I think Séance needs is as follows:

$ seance add "https://example.com"

$ seance sync

Given how simple this is, I could probably implement it fairly easily by hand. However, that would also mean implementing the --help flag (for each subcommand) and the overall usage messages. That’s a chore, so instead I pulled in clap, which will do all of that for me.
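For a sense of what the by-hand route would involve, here is a minimal sketch using only the standard library. The Cli enum, the parse_args function, and the USAGE text are hypothetical names made up for illustration, not Séance's actual code, and the sketch still lacks per-subcommand --help output, which is exactly the chore clap takes over.

```rust
/// The two subcommands from the MVP, plus an explicit help request.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Cli {
    Add { url: String },
    Sync,
    Help,
}

/// Hypothetical usage text; a real tool would need one per subcommand.
const USAGE: &str = "usage: seance <add URL | sync>";

/// Parse the arguments that follow the program name.
fn parse_args<I>(args: I) -> Result<Cli, String>
where
    I: IntoIterator<Item = String>,
{
    let mut args = args.into_iter();
    match args.next().as_deref() {
        Some("add") => match args.next() {
            Some(url) => Ok(Cli::Add { url }),
            None => Err(format!("add: missing URL\n{USAGE}")),
        },
        Some("sync") => Ok(Cli::Sync),
        Some("--help") | Some("-h") => Ok(Cli::Help),
        Some(other) => Err(format!("unknown subcommand '{other}'\n{USAGE}")),
        None => Err(USAGE.to_string()),
    }
}
```

In a real main function this would be called as parse_args(std::env::args().skip(1)), with the error string printed to stderr before exiting.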

I think most people who use clap use the derive-based API, which lets you write your command line arguments as annotations on the same struct into which they’ll be parsed. This is convenient, but (as is becoming a recurring topic) I prefer to use the builder API to avoid the procedural macro that backs custom derives. For clap specifically, this is a habit I picked up while trying to write a web service whose code would fit in a Raspberry Pi Zero 2 W’s L1 instruction cache (as you do). The builder API isn’t that much less convenient, so I haven’t felt the need to swap back.

use clap::{Arg, Command};

fn main() -> miette::Result<()> {
    let matches = Command::new("seance")
        .version("0.1")
        .about("A terminal-based podcatcher")
        .subcommand(
            Command::new("sync")
                .about("downloads the latest versions of all podcasts")
        )
        .subcommand(
            Command::new("add")
                .about("add a podcast feed by its URL")
                .arg(Arg::new("url"))
                .arg_required_else_help(true)
        )
        .subcommand_required(true)
        .get_matches();

    match matches.subcommand() {
        Some(("sync", _submatches)) => sync(),
        Some(("add", submatches)) => {
            let url: &String = submatches.get_one("url").expect("url required");
            add(url)
        },
        Some((other, _)) => unreachable!("unknown subcommand {}", other),
        None => unreachable!("no subcommand matched")
    }
}

Let’s split up, gang

This change pushed me over the complexity edge where I decided to split up the project into multiple files. Up to now, I’ve been writing everything in one long main.rs. This may seem a bit messy, but I’ve learned from past projects that if I try to preemptively build out the project’s module architecture, I will usually get it wrong, and spend an annoying amount of time shuffling files around afterwards. For now, I’ve added the following files:

main.rs: Top-level command line parsing and subcommand dispatch
config.rs: Config file loading and parsing logic
dirs.rs: Exposes project_dirs() to other modules
add.rs: Implementation of the add subcommand
sync.rs: Implementation of the sync subcommand (currently stubbed)

Eventually I may pull out the subcommand implementations into a “cmd” or “commands” submodule, which I believe is a popular approach.

Being lazy is OK

Previously I wrote the Config struct like this:

struct Config {
    feeds: Vec<Feed>
}

This may have been my preference for functional programming showing: as a rule, I want to have as much of the code working directly on pure and transparent data as possible. It’s calming, in a way — I don’t have to worry about whether some arbitrary later operation on the structure could fail, and I can focus my efforts on the happy path.

Unfortunately, a Config definition like the one above won’t work for the purposes of automatically editing the config file. Because I expect people to edit the config file by hand as well, it is important that the contents of the file are round-trippable. That is, other than any changes that I make, the layout of the file should look exactly the same before and after modification. Importantly, this means preserving semantically unimportant details like whitespace. If I were to write the code as I had imagined:

let config = evil_load_config_using_io()?;
let config2 = happy_apply_infallible_in_memory_modification(config);
evil_write_config_using_io(config2)?;

then I would lose all the whitespace etc. during the parse. This would be annoying to a lot of people, like having a bad autoformatter run on your carefully maintained text files.
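To make the loss concrete, here is a minimal sketch using a toy line-based format as a stand-in for KDL. The "feed <url>" syntax, the "//" comment, and the parse_feeds, write_feeds, and lossy_add functions are all assumptions made up for this illustration; none of them are Séance's real format or code.

```rust
/// Plain data, as in the original Config design.
#[derive(Debug, Clone, PartialEq)]
struct Feed {
    url: String,
}

/// Eagerly parse into plain data, discarding every line that is not a feed.
/// (Toy format: a feed is a line of the form `feed <url>`.)
fn parse_feeds(text: &str) -> Vec<Feed> {
    text.lines()
        .filter_map(|line| line.trim().strip_prefix("feed "))
        .map(|url| Feed { url: url.trim().to_string() })
        .collect()
}

/// Serialize the plain data back out. Comments and spacing are already gone.
fn write_feeds(feeds: &[Feed]) -> String {
    feeds.iter().map(|f| format!("feed {}\n", f.url)).collect()
}

/// Load, modify, and save using only the plain-data representation.
fn lossy_add(text: &str, url: &str) -> String {
    let mut feeds = parse_feeds(text);
    feeds.push(Feed { url: url.to_string() });
    write_feeds(&feeds)
}
```

Feeding this a file that starts with a comment and a blank line returns only the feed lines, with their spacing normalized. Nothing in Vec<Feed> remembers that the rest ever existed.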

So instead of parsing directly into simple structs, I’m taking a more object-oriented approach. The new API for Config looks like this:

struct Config {
    doc: KdlDocument
}

impl Config {
    pub(crate) fn load_or_build_default(path: &Path) -> miette::Result<Config> { /* ... */ }
    pub(crate) fn parse(file_content: &str) -> miette::Result<Config> { /* ... */ }
    pub(crate) fn save(&self, path: &Path) -> io::Result<()> { /* ... */ }

    pub(crate) fn feeds(&self) -> impl Iterator<Item=miette::Result<Feed>> { /* ... */ }
    pub(crate) fn add_feed(&mut self, feed: Feed) { /* ... */ }
}

Instead of storing the simplified representation of the data in the struct, I store the full KDL document. Importantly, this means that feeds are validated lazily. If a feed entry is invalid, you don’t see that until and unless you iterate over the output of the feeds() method.
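The shape of this design can be sketched without the kdl crate by storing raw lines in place of a KdlDocument. Everything here is a simplified stand-in under stated assumptions: the toy "feed <url>" line format, the http/https prefix check used as "validation", the String error type, and the render method are hypothetical and only meant to show where validation happens.

```rust
#[derive(Debug, PartialEq)]
struct Feed {
    url: String,
}

/// Keeps the raw lines of the file instead of a parsed list of feeds.
/// (Stand-in for a struct that wraps a full KDL document.)
struct Config {
    lines: Vec<String>,
}

impl Config {
    /// Parsing only splits the text into lines; no feed is validated here.
    fn parse(text: &str) -> Config {
        Config { lines: text.lines().map(str::to_string).collect() }
    }

    /// Validate feed entries lazily, one item at a time, as they are iterated.
    fn feeds(&self) -> impl Iterator<Item = Result<Feed, String>> + '_ {
        self.lines
            .iter()
            .filter_map(|line| line.trim().strip_prefix("feed "))
            .map(|raw| {
                let url = raw.trim();
                // Toy validation: a real implementation would parse the URL.
                if url.starts_with("http://") || url.starts_with("https://") {
                    Ok(Feed { url: url.to_string() })
                } else {
                    Err(format!("invalid feed URL: {url}"))
                }
            })
    }

    /// Append a feed without touching any existing line.
    fn add_feed(&mut self, feed: Feed) {
        self.lines.push(format!("feed {}", feed.url));
    }

    /// Write every stored line back out unchanged, one per line.
    fn render(&self) -> String {
        let mut out = self.lines.join("\n");
        out.push('\n');
        out
    }
}
```

With this layout, a file containing a comment, an invalid feed, and a line the program does not understand still parses. The invalid feed only surfaces as an Err item from feeds(), and every line is still present after add_feed and render.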

I’m conflicted on this approach. On one hand, it does make the API more complex, since now all the client code that interacts with feeds has to worry about whether a feed is invalid. On the other hand, I think that for most use cases I actually don’t need to do more than a single iteration over the set of feeds. This structure will also make it easier to provide some degree of “forwards-compatibility” for the config. If an older version of the software reads a file created by a newer version (which is likely if people are using git to sync them across computers), then any entries it doesn’t recognize can be easily ignored, and wouldn’t accidentally disappear during the round-trip.
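On the client side, an iterator of results can be consumed in at least two ways, sketched below with standard-library features only. The helper names all_or_error and split_valid are hypothetical, and both are written generically so that they do not assume anything about the real Feed or error types.

```rust
/// Stop at the first invalid entry, which recovers the old eager behaviour.
/// Collecting an iterator of `Result`s into `Result<Vec<_>, _>` short-circuits
/// on the first `Err`.
fn all_or_error<T, E>(items: impl Iterator<Item = Result<T, E>>) -> Result<Vec<T>, E> {
    items.collect()
}

/// Keep the valid entries and gather the invalid ones for separate reporting.
fn split_valid<T, E>(items: impl Iterator<Item = Result<T, E>>) -> (Vec<T>, Vec<E>) {
    let mut valid = Vec::new();
    let mut errors = Vec::new();
    for item in items {
        match item {
            Ok(value) => valid.push(value),
            Err(error) => errors.push(error),
        }
    }
    (valid, errors)
}
```

The first form suits a command that should refuse to run on a broken config. The second suits something like sync, which could plausibly fetch the valid feeds and warn about the rest.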

The changes required for this refactor are fairly straightforward. The biggest one was turning the loop over feeds into a map and moving it into its own method.

One detail I went back and forth on was how much Config should know, specifically whether it should have methods for performing I/O, and if it should know where it is stored by default. I eventually landed on a few simple methods that perform I/O (load_or_build_default and save), with both containing as little logic as possible, and keeping the knowledge about default storage locations isolated to the subcommands. Keeping the I/O methods simple makes testing easier, with the tests I wrote yesterday just being updated to use parse and iterate through feeds to find errors.

I considered keeping I/O isolated to a set of totally-separate functions so that the division between I/O and pure operations is as obvious as possible, but for now that seemed like too much ceremony for little benefit. I may still walk this back later, though, especially if I want to add more tests for the I/O logic itself.

Tying it all together

Given the refactors above, the actual definition of the add subcommand is quite simple (aside from the setup ceremony):

pub(crate) fn add(url: &str) -> miette::Result<()> {
    let url = Url::parse(url).into_diagnostic()?;
    let project_dirs = dirs::project_dirs()
            .map_err(|e| miette!("failed to load config: {e}"))?;
    let config_dir = project_dirs.config_dir();
    std::fs::create_dir_all(config_dir).into_diagnostic()?;
    let config_path = config_dir.join("config.kdl");
    let mut config = Config::load_or_build_default(&config_path)?;

    config.add_feed(Feed { url });
    config.save(&config_path).into_diagnostic()?;

    Ok(())
}

Manually testing this from the command line with cargo run add "https://example.com", I see that it has correctly created the config file and written the feed into it.