Day 7: Feeding the config
Now that I am successfully reading and parsing the config file, the natural next step is to modify it.
To review, I decided that I want to store feed URLs in a plain-text config file so that people can sync
them along with their dotfiles. That said, going through the effort of reading the docs to learn
where a config file is stored and what its options are can be a pain. I often explore newly installed
command line programs using only the --help option. Therefore, I think it would be convenient to be able
to edit the config file in basic ways without having to open the config directly. Today I implemented the
first version of that.
For arguments’ sake
The MVP API that I think Séance needs is as follows:
$ seance add "https://example.com"
$ seance sync
Given how simple this is, I could probably implement it fairly easily by hand. However, that would also
mean implementing the --help flag (for each subcommand) and the overall usage messages. That’s a chore,
so instead I pulled in clap, which will do all of that for me.
I think most people who use clap use the derive-based API, which lets you write your command line arguments as
annotations on the same struct into which they’ll be parsed. This is convenient, but (as is becoming a recurring topic)
I prefer to use the builder API to avoid the procedural macro that backs custom derives. For clap
specifically, this is a habit I picked up while trying to write a web service whose code would fit in a
Raspberry Pi Zero 2 W’s L1 instruction cache (as you do). The builder API isn’t that much less convenient, so I haven’t felt the need to swap back.
use clap::{Arg, Command};

fn main() -> miette::Result<()> {
    // Describe the CLI: a top-level command with two subcommands.
    let matches = Command::new("seance")
        .version("0.1")
        .about("A terminal-based podcatcher")
        .subcommand(
            Command::new("sync")
                .about("downloads the latest versions of all podcasts")
        )
        .subcommand(
            Command::new("add")
                .about("add a podcast feed by its URL")
                .arg(Arg::new("url"))
                // Running `seance add` with no arguments prints help and exits.
                .arg_required_else_help(true)
        )
        .subcommand_required(true)
        .get_matches();

    // Dispatch to the implementation of whichever subcommand matched.
    match matches.subcommand() {
        Some(("sync", _submatches)) => sync(),
        Some(("add", submatches)) => {
            let url: &String = submatches.get_one("url").expect("url required");
            add(url)
        },
        Some((other, _)) => unreachable!("unknown subcommand {}", other),
        None => unreachable!("no subcommand matched")
    }
}
Let’s split up, gang
This change pushed me over the complexity edge where I decided to split up the project into multiple
files. Up to now, I’ve been writing everything in one long main.rs. This may seem a bit messy, but I’ve
learned from past projects that if I try to preemptively build out the project’s module architecture,
I will usually get it wrong, and spend an annoying amount of time shuffling files around afterwards. For now,
I’ve added the following files:
- main.rs: Top-level command line parsing and subcommand dispatch
- config.rs: Config file loading and parsing logic
- dirs.rs: Exposes project_dirs() to other modules
- add.rs: Implementation of the add subcommand
- sync.rs: Implementation of the sync subcommand (currently stubbed)
Eventually I may pull out the subcommand implementations into a “cmd” or “commands” submodule, which I believe is a popular approach.
Being lazy is OK
Previously I wrote the Config struct like this:
struct Config {
    feeds: Vec<Feed>
}
This may have been my preference for functional programming showing: as a rule, I want to have as much of the code working directly on pure and transparent data as possible. It’s calming, in a way — I don’t have to worry about whether some arbitrary later operation on the structure could fail, and I can focus my efforts on the happy path.
Unfortunately, a Config definition like the one above won’t work for the purposes of automatically
editing the config file. Because I expect people to edit the config file by hand as well, it is important
that the contents of the file are round-trippable. That is, other than any changes that I make, the layout
of the file should look exactly the same before and after modification. Importantly, this means preserving
semantically unimportant details like whitespace. If I were to write the code as I had imagined:
let config = evil_load_config_using_io()?;
let config2 = happy_apply_infallible_in_memory_modification(config);
evil_write_config_using_io(config2)?;
then I would lose all the whitespace etc. during the parse. This would be annoying to a lot of people, like having a bad autoformatter run on your carefully maintained text files.
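To make the round-trip concern concrete, here is a minimal sketch using a made-up line-based format as a stand-in for KDL. The names `parse_feeds`, `lossy_add`, and `preserving_add` are hypothetical and are not part of Séance. The lossy version parses into a plain list and regenerates the file, which drops comments and blank lines. The preserving version keeps the original text and only appends the new entry.

```rust
/// Parse `feed "<url>"` lines into a plain Vec, ignoring everything else.
/// (A toy stand-in for parsing a real KDL document into simple structs.)
fn parse_feeds(text: &str) -> Vec<String> {
    text.lines()
        .filter_map(|line| {
            let rest = line.trim().strip_prefix("feed ")?;
            Some(rest.trim().trim_matches('"').to_string())
        })
        .collect()
}

/// Lossy approach: parse, modify in memory, then regenerate the whole file.
/// Anything the parser ignored (comments, blank lines) is gone afterwards.
fn lossy_add(text: &str, url: &str) -> String {
    let mut feeds = parse_feeds(text);
    feeds.push(url.to_string());
    feeds.iter().map(|f| format!("feed \"{f}\"\n")).collect()
}

/// Preserving approach: keep the original text and only append the new entry.
fn preserving_add(text: &str, url: &str) -> String {
    let mut out = text.to_string();
    // Make sure the new entry starts on its own line.
    if !out.is_empty() && !out.ends_with('\n') {
        out.push('\n');
    }
    out.push_str(&format!("feed \"{url}\"\n"));
    out
}
```

Given a file that starts with a comment and a blank line, `lossy_add` returns only the `feed` lines, while the output of `preserving_add` still begins with the original text byte for byte.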
So instead of parsing directly into simple structs, I’m taking a more object-oriented approach. The new API for
Config looks like this:
struct Config {
    doc: KdlDocument
}

impl Config {
    pub(crate) fn load_or_build_default(path: &Path) -> miette::Result<Config> { /* ... */ }
    pub(crate) fn parse(file_content: &str) -> miette::Result<Config> { /* ... */ }
    pub(crate) fn save(&self, path: &Path) -> io::Result<()> { /* ... */ }
    pub(crate) fn feeds(&self) -> impl Iterator<Item=miette::Result<Feed>> { /* ... */ }
    pub(crate) fn add_feed(&mut self, feed: Feed) { /* ... */ }
}
Instead of storing the simplified representation of the data in the struct, I store the full KDL
document. Importantly, this means that feeds are validated lazily.
If a feed entry is invalid, you don’t see that until and unless you iterate over the output
of the feeds() method.
I’m conflicted on this approach. On one hand, it does make the API more complex, since now all the client code that interacts with feeds has to worry about whether a feed is invalid. On the other hand, I think that for most use cases I won’t need more than a single iteration over the set of feeds. This structure will also make it easier to provide some degree of “forwards-compatibility” for the config. If an older version of the software reads a file created by a newer version (which is likely if people are using git to sync their config across computers), then any entries it doesn’t recognize can simply be ignored, and they won’t accidentally disappear during the round-trip.
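As a rough illustration of what lazy validation means for calling code, here is a stdlib-only sketch. It is not the real implementation: `raw_entries` stands in for the stored KDL document, the prefix check stands in for real URL parsing, plain `String` errors stand in for miette diagnostics, and `split_valid` is a hypothetical helper.

```rust
#[derive(Debug, PartialEq)]
struct Feed {
    url: String,
}

/// Holds the raw entries; nothing is validated at load time.
struct Config {
    raw_entries: Vec<String>,
}

impl Config {
    /// Validate each entry only when the caller actually iterates.
    fn feeds<'a>(&'a self) -> impl Iterator<Item = Result<Feed, String>> + 'a {
        self.raw_entries.iter().map(|raw| {
            // Placeholder check; a real version would parse the URL properly.
            if raw.starts_with("http://") || raw.starts_with("https://") {
                Ok(Feed { url: raw.clone() })
            } else {
                Err(format!("invalid feed URL: {raw}"))
            }
        })
    }
}

/// One way a caller might consume the iterator in a single pass:
/// keep the valid feeds and collect the errors for reporting.
fn split_valid(config: &Config) -> (Vec<Feed>, Vec<String>) {
    let mut ok = Vec::new();
    let mut errs = Vec::new();
    for item in config.feeds() {
        match item {
            Ok(feed) => ok.push(feed),
            Err(e) => errs.push(e),
        }
    }
    (ok, errs)
}
```

Constructing the `Config` never fails here, so an invalid entry only surfaces once something like `split_valid` walks the iterator.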
The refactor itself was fairly straightforward. The biggest change was to turn
the loop over feeds into a map and move it into its own method.
One detail I went back and forth on was
how much Config should know, specifically whether it should have methods for performing I/O, and if
it should know where it is stored by default. I eventually landed on a few simple methods that perform I/O
(load_or_build_default and save), with both containing as little logic as possible, and keeping the
knowledge about default storage locations isolated to the subcommands. Keeping the I/O methods simple
makes testing easier, with the tests I wrote yesterday just being updated to use parse
and iterate through feeds to find errors.
I considered keeping I/O isolated to a set of totally-separate functions so that the division between I/O and pure operations is as obvious as possible, but for now that seemed like too much ceremony for little benefit. I may still walk this back later, though, especially if I want to add more tests for the I/O logic itself.
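The split between the pure and I/O-performing methods could look roughly like the sketch below. It is simplified to the standard library: the `text` field stands in for the KDL document, `parse` is infallible here, and errors are plain `io::Result` rather than miette results. It also assumes that a missing file should produce an empty default config, which is a guess at what load_or_build_default does rather than something stated above.

```rust
use std::fs;
use std::io;
use std::path::Path;

struct Config {
    text: String,
}

impl Config {
    /// Pure: no I/O, so tests can call it with string literals.
    fn parse(file_content: &str) -> Config {
        Config { text: file_content.to_string() }
    }

    /// Thin I/O wrapper: read the file if it exists, otherwise start empty.
    fn load_or_build_default(path: &Path) -> io::Result<Config> {
        match fs::read_to_string(path) {
            Ok(content) => Ok(Config::parse(&content)),
            // Assumption: a missing file means "use the default config".
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Config::parse("")),
            Err(e) => Err(e),
        }
    }

    /// Thin I/O wrapper: write the document text back out unchanged.
    fn save(&self, path: &Path) -> io::Result<()> {
        fs::write(path, &self.text)
    }
}
```

Because all of the interesting logic lives in `parse`, most tests never need to touch the filesystem, and the two wrappers are small enough to check by eye.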
Tying it all together
Given the refactors above, the actual definition of the add subcommand is quite simple (aside from the setup ceremony):
pub(crate) fn add(url: &str) -> miette::Result<()> {
    let url = Url::parse(url).into_diagnostic()?;
    let project_dirs = dirs::project_dirs()
        .map_err(|e| miette!("failed to load config: {e}"))?;
    let config_dir = project_dirs.config_dir();
    std::fs::create_dir_all(config_dir).into_diagnostic()?;
    let config_path = config_dir.join("config.kdl");
    let mut config = Config::load_or_build_default(&config_path)?;
    config.add_feed(Feed { url });
    config.save(&config_path).into_diagnostic()?;
    Ok(())
}
Manually testing this from the command line with cargo run add "https://example.com", I see that it has correctly
created the config file and written the feed into it.