Meaning-preserving modularity

Sharing code is like sharing knowledge—the knowledge of how to write that code. I think we can simplify the meaning of program imports by making them queries to an oracle of all possible shared code.

This frees us from some complexity in maintaining the modules installed in a system: The modules’ effects on what the language can accept is declarative—as in they’re commutative, associative, and idempotent, like a mathematical set—and their effect on what the language’s programs mean is zilch. If the module oracle fails to find a match in its search, that just means there isn’t enough knowledge installed, not that the application itself is defective or should behave in a different, unintended way (like choking at run time).

When I talk about searching “all possible shared code,” I do mean searching every revision of every library, including every possible future revision, where the names may have been replaced with every possible combination of other names. This means we can’t identify modules by name, the way most module systems do.

Sharing proofs and other unique values

The easiest kind of code to share this way is mathematical fact. If we ask the oracle to verify a theorem for us, it can start searching, discover a proof in one module or another, and tell us “Just like you said, pal.” If it doesn’t find any proofs, it rejects the program, and we have to feed it a better selection of modules or revise our own code.

If the logical framework we’re using supports classical logic, we might ask for a classical theorem and get back either “Just like you said, pal” or “No way pal, you’ve got it all wrong,” depending on whether the oracle finds a proof or disproof first. In general, every question we ask of the oracle must have a unique answer, so that we don’t try to solve problems in our program by disingenuously omitting certain modules from our “all possible modules” directory.

Formalism schmormalism

Designing an expressive logical framework with this kind of unique lookup seems pretty hard, but I’m not even striving to accomplish it myself. With some of the code we share, we don’t care about its mathematical properties so much as we care about its empirical, aesthetic, or social properties. We need to be able to query knowledge on this basis too.

Unique oracle responses are essential to the style of module system I’m after… but when things can’t be unique with mathematical certainty, they can at least be unique in practice.

Sharing cryptographically unique values

Let’s actually go back and incorporate a global naming scheme into this module system after all, with a catch. Every globally named export should be timestamped and cryptographically signed by the author or organization responsible for keeping that export’s meaning stable over time.

Queries of these global variables take the form of a ( name, acceptable timestamp range, author/organization’s public key ) triple, and the development environment or language runtime may help fill out the latter two fields. As is usual for making something’s meaning context-independent, we’re just taking some context-specific things the program depends on and making them explicit.

Instead of using names here, it might help to use a content-addressing idiom like what David Barbour has been investigating. (David describes more recent developments on that train of thought here, but I find that post less approachable.) This would help us to actually query on empirical, aesthetic, or social properties themselves, rather than always querying by attribution alone.

I would still use attribution sometimes, in order to identify specific experts I sided with on these matters—myself, for instance! Of all authors whose code I could use, I find myself extremely relevant to my interests and almost obsessively responsive to my feedback.

What we’ve wrought

The point of this system is to simplify the maintenance of installed modules in a system. The system doesn’t become outright buggy if modules are added or removed, but it may stop having all the knowledge it needs to be semantically complete.

Modifying a module is also protected. For mathematical knowledge, modifying a module won’t break the system. For attributed, cryptographically signed values, modifying a module without changing the attribution will be difficult for anyone but the author, and that author is expected to modify the value as they see fit anyway. Authors who do modify their modules are equipped with a timestamp mechanism to help preserve the meaning of existing downstream code.

All modules, no monolithic applications

I’d like to touch on a final thought here. We may still be figuring out better ways to share general-purpose code among programmers, but it’s important—maybe even moreso—to share general-purpose code with computing amateurs. Applications are shared code too; they just have the widest audience, with potentially the widest variety of security concerns and accessibility needs. It would be best if we could make all these concerns composable in fine-grained yet intuitive ways, rather than bloating our application codebases with hardcoded choices of our own.


3 thoughts on “Meaning-preserving modularity

  1. PS: Talking to David Barbour has done a lot for the way I think about modules. I can’t give enough credit for that, but I can at least draw the line at the provable/cryptographically-ensurable uniqueness of imports, which I seem to be particularly paranoid about. :-p

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s