This is another journal entry of my progress toward an extensible quasiquotation syntax. It wanders a bit, but I think it has a happy ending. :)
My last post was about “higher quasiquotation.” Since then, I’ve taken to calling that subject hypersnippets, since the characteristic feature is that it’s a repeated iteration of the concept of “the snippet of code between this boundary and this boundary.” Degree-N hypersnippets are made up of all the code in between a degree-(N-1) hypersnippet shape and zero or more nonoverlapping degree-(N-1) hypersnippet shapes appearing inside it. A degree-1 hypersnippet is like a text selection, and degree-0 hypersnippet is a text stream. Quotation is a certain kind of DSL where the syntax is hypersnippet-shaped, but there are potentially other uses for these shapes.
(Spoilers: Yesterday I finally convinced myself hypersnippet shapes were precisely the opetopes, and hypersnippet-shaped data is data that’s composable using the operations of an opetopic ω-category. So hypersnippets in my original sense are an ω-category generated by some free 1-cells corresponding to characters that can appear in a text stream. (Update 6-3-2018: Michael Arntzenius points out that these generators on their own would just generate strings. I was also sloppy about specifying the generator cells’ sources and targets here. Looks like I need one generator of each opetopic shape to be the holes, with each one’s sources and target being lower-dimensional holes; as well as one generator corresponding to each text character, each of which is a 2-cell with no sources, targeting the unique 1-cell hole.) Nevertheless, I’m still going to refer to these as “hypersnippets” in this post, and I think it’s valuable to refer to them by their intended usage domain in case they morph into a slightly different concept, even if the concept now seems to have stabilized into something that corresponds with opetopes.)
I actually made a lot of headway on the practical problem of manipulating hypersnippet-shaped data. What I finally did was represent hypersnippet shapes using a plain old list of closing brackets. (Adding opening brackets to this would give a nested structure to the interior of a hypersnippet, but all I’m modeling here is the exterior shape.) Each element of the list is a number, namely the degree/dimension of the hole that its closing bracket opens up. After one closing bracket opens a hole, the hole must be closed again by closing brackets later in the list, with one exception: A degree-0 hypersnippet never closes — there are no hole degrees less than 0 which could close it — and so neither does a degree-0 hole, and thus every nonzero-degree hypersnippet’s list of closing brackets eventually gets to one bracket labeled 0, with nothing beyond it.
Passing around strings of brackets is not exactly the most error-safe way to manipulate structured data, or so you would think, but somehow it’s the most intuitive and successful approach I’ve been able to find. I do at least do sanity checks by running over those brackets with a certain kind of “hyperstack” to ensure there are no bracket mismatch errors, and I’ve found a lot of operations I can implement using this hyperstack technique but expose as a more algebraically elegant interface. So the details of this list-based representation format have been easy to encapsulate in library code.
In particular, I wrote a bunch of operations this way for what I’m calling “hypertees,” namely hypersnippets with no interesting data in its interior, only data annotations on each hole. At degree 0, a hypertee is a trivially uninformative value, since it has no holes. At degree 1, a hypertee has precisely one hole (the hole of degree 0), so it’s just like passing around that hole’s annotation value on its own. At degree 2, a hypertee is like an improper list: It has a series of values annotating its degree-1 holes, and then it finally has one more value annotating its degree-0 hole at the end. At degree 3, the structure of the holes resembles an s-expression, where each pair of parentheses that we might write in an s-expression can instead be represented using a degree-2 hole with a single degree-1 hole in it.
Concatenation is a very interesting operation for hypertees. To concatenate hypertees, we annotate every hole of one “root” hypertee with a “leaf” hypertee of a shape that corresponds with the shape of the hole it’s in at low dimensions. A leaf hypertee’s low-dimensional holes must be annotated with trivial values, since those annotations will be lost in the trivial interior of the resulting hypertee.
This corresponds to monoidal concatenation at degree-0 holes, and it corresponds to monadic composition at degree-1 holes. In particular, degree-2 hypertees are like improper lists, and composing them along degree-0 holes works just like Haskell’s list monoid, while composing them along degree-1 holes works just like the list monad.
At holes of higher degrees, it’s been more difficult to find familiar things to compare this operation to, since the operation demands compatible hypertee shapes. There’s only one hypertee shape of degree 0 and only one of degree 1, so the shape compatibility requirement disappears entirely at those dimensions.
I started to call the interface for this kind of concatenation “hypermonad,” since the operations so strongly resemble the Haskell monad typeclass, but I’m still not quite sure in what way it corresponds with monads (even now that I know I’m dealing with opetopes).
Having chosen that term “hypermonad,” I found it more urgent to see whether what I had really counted as a monad. I kept poring over higher category theory pages on nLab to try to figure it out. A lot of other higher category theory concepts seemed pretty related, but nothing seemed to line up exactly.
Detour into hypersnippet-shaped side effects
If there was anything else I knew a monad was good for, it was side effects, so I started to explore whether my hypermonads made sense as a side effect model.
Around this time, for mostly unrelated reasons, I was starting a library called Effection where I intended to recreate some of Cene’s monotonic side effect system in Racket, so that I could talk about them with people without getting derailed in talking about Cene’s other quirky features.
So, Effection was already a library I was already planning to write in Racket to manage side effects. I started to put exploratory code in there for hypersnippet-shaped side effects. While I was at it, I was keeping an eye open for connections between these and Cene’s monotonic side effects, hoping to unify the two motivations of the Effection library.
The idea behind hypersnippet-shaped side effects was this: Just as hypersnippet brackets upgrade a simple text stream into a kind of data with n dimensions of structure, something like hypersnippet brackets could be arranged on a wall-clock timeline to structure the execution of a program into distinct regions of time, which could be structured into distinct regions between regions of time, etc.
Brainstorming for examples of degree-1-or-higher side effects, I found what I was looking for in effect handlers and
parameterize: The act of setting up an effect handler or
parameterize is in some sense a side effect — because it interacts with nonlocal code by way of that code’s side effects — but it’s a side effect that’s only “visible” in the dynamic extent of a certain expression, and the code outside that expression might consider the expression to be pure if all its side effects are handled. Likewise, resource acquisition is often a side effect that’s structured in a degree-1 way, which you can see in utilities that open a file, run some code, and close the file again.
But I’m not sure I found any satisfying degree-2-or-higher side effects. There can be degree-2-or-higher effects that set up effect handlers with the ability to open holes in their regions (escaping to the previously bound effect handlers for a while), but this is not very different from what effect handlers can already do. I found no really satisfying operational account of what side effects “degree-0-and-degree-1-pure code” could have in Racket, because all the data Racket’s evaluation strategy manipulates has a shape of degree 1 or less; degree-0-pure code doesn’t depend on Racket’s left-to-right evaluation order policy, and degree-0-and-degree-1-pure code doesn’t depend on Racket’s evaluation strategy at all.
There was a slight connection to monotonic side effects, in that it seems that there’s a sensible notion of degree-1-monotonic effects, which only bind effect handlers or
parameterize variables in ways that are consistent with the existing dynamic environment. Nevertheless, this kind of monotonicity seems too permissive to me, because it lets a program temporarily bind something to one value and then (after that’s done) temporarily bind it to a different one. The program can call the same procedure in each case and get different results, and then it can execute some behavior that that depends on both of those results at once, so this programming style is not nearly as “pure” as the semi-deterministic style facilitated by LVars-style (degree-0-)monotonic state. I’m not sure what use I would ever have for it other than a thought experiment.
So, I think of this foray into hypersnippet-shaped side effects as an instructive experience which turned up mostly negative results. It gave me practice implementing hyperstacks and reasoning intuitively about hypersnippet shapes, and that was basically my primary goal, but it hasn’t brought me much closer to identifying a clear relationship between hypermonads and traditional monads.
Detour into some cleanup
At this point, my code was a mess, full of dead-end approaches to modeling hypersnippet data, and full of dead-end approaches to hypersnippet-shaped side effects too.
Trying to find some place to work on the question “What are monads, really?”, I started another Lathe repository focused on more category-theoretical utilities in Racket, called “Lathe Morphisms for Racket.” I’m trying to abide by the way nLab and other category theory literature refers to monads (and other things, including higher categories), building the pieces up one at a time to get there.
Lathe Morphisms is still a work in progress, not nearly as tidy as Lathe Comforts. That’s partly because I’m still learning a lot about what ground category theory covers, and partly because I know there are already various known ways to translate category theory concepts into a programming language library (varying depending on things like how they treat excluded middle and the axiom of choice). I’m trying to find a naming convention that’ll leave room for adding more of these variations to the library as the demand arises.
I didn’t know the words for this when I began, but I think the approach I’m taking resembles essentially algebraic theories (EATs): I’m documenting what a “category” or “functor” is as an EAT, and then I’m making a different module for each way of interpreting that EAT as a library API signature. Whatever it is I’m doing in Lathe Morphisms, it’s not really stable yet, but it’s been a great way to work my way through some of the minutia of what different category theory concepts mean, building my familiarity so I’m more capable of reading the literature.
Detour into ordinals
Since my hypersnippet-shaped side effects didn’t seem very meaningful in specific finite dimensions, I had kept wondering what it’d be like to use infinite-dimensional hypersnippet-shaped side effects instead.
As I tried to approach higher category theory concepts in Lathe Morphisms, I kept thinking about whether I’d need to have some data structure that contained an infinite regress of witnesses of algebraic laws.
And since hypersnippets and higher categories are intuitively based on iterating some inductive notion of what “the next higher” is, and infinite-dimensional categories are sometimes called ω-categories, I figured I’d need to express infinite-dimensional hypersnippet shapes using ordinal numerals in place of integers, and I’d need to use ordinal-sized lists to hold all the witnesses for an infinite-dimensional algebra.
And indidentally, I was learning about inductive-inductive type definitions made me think about how they might be considered to be a single type with its constructors indexed by an ordinal, so one constructor’s type could refer to any constructor at a lesser ordinal.
Whatever it was, ordinals were on my mind, so I wrote an implementation of Cantor normal form ordinals and lists indexed by them.
That implementation has gone really well, but the interface isn’t really stable yet. I kind of want any code that uses ordinals to be able to swap in a different implementation suitable to its needs (such as arithmetic up to a larger ordinal, or an algebra that looks like ordinal arithmetic but has some free variables in the mix), but I’m not sure how to reconcile that kind of abstraction with some error-reporting and optimization concerns I’ve folded into this API’s design (like having a decidable comparison operation, and making lists have predictable garbage collection reachability properties).
Detour back to hypersnippet-shaped effects
Equipped with these ordinal utilities, I went back to Effection to make a simpler system of hypersnippet-shaped effects, focusing on extrapolating upon the behavior of
parameterize itself, since at least that seemed like a distinct target compared to all the other things that didn’t quite seem useful in the end. That worked, but it made me realize
parameterize can already do this all conveniently enough by using first-class parameterization values (
call-with-parameterization), so it never really needs effects beyond degree 1.
I had thought that if I didn’t care about side effects of particular finite dimensions, at least I could figure out how the ordinal-generalized hypersnippets could be generalized further to an ordinal-like arithmetic where there was an impredicative infinity. But I think, for Racket side effects, that impredicative infinity is already reached at dimension 1, and the implementation of the impredicative hole-punching would simply look almost exactly like
call-with-parameterization and be a little less expressive (so as not to let holes be opened up to a parameterization that isn’t currently in progress).
Back to hypersnippets, quasiquotation, and higher category theory
Now I’m sort of back where I started: Trying to implement a library for hypersnippet-shaped data and an extensible quasiquotation syntax which uses that data.
But I’ve finally learned enough about category theory that I’m very confident I know how hypersnippets fit into things. They’re the cells of an opetopic ω-category. What I’ve called “hypersnippet shapes” are called opetopes. Their “degree-n closing brackets” are the degree-n input cells of the opetope. My “hypertees” (which label all these input cells with arbitrary values) are a kind of opetopic pasting diagram. What I’ve been calling “hypermonadic bind” is composition of opetopes.
I think I’ll need to prove that my bracket-based representation of hypersnippet shapes is equivalent to other encodings of opetopes and write at least one paper about it. The encoding of opetopes in  actually very closely resembles a previous approach I took that I abandoned because it was getting so complicated that I couldn’t be confident it was correct.
For now, I’m mainly doing code cleanup and aiming for that extensible quasiquotation syntax.
I took a look at the old attempts I was making at this, and my latest notes laid out a design with a tower of “reader macros” (analogous to Lisp or Racket reader macros, but in progressively higher dimensions) and a tower of “backend macros” (analogous to Lisp or Racket macros, but expecting progressively higher dimensions of input, and either successfully expanding to a Racket expression or punting to higher-dimensional syntaxes by expanding to a reader macro call).
Looking over that plan now, I’m not sure that it makes much sense to jump directly to a multidimensional tower of macro systems here. Quasiquotation is a specific operation for use in degree-1 code, and it takes not just a degree-2 hypersnippet of literal code, but also several degree-1 subexpressions to fill in the gaps. If there’s a tower of macro systems, then surely the only reason quasiquotation would be in any layer of the tower higher than degree 1 would be due to its expectation of degree-2 input data.
We could just as well implement that in the quasiquotion macro’s implementation by explicitly converting its degree-1 input data into the appropriate higher-dimensional input data using a procedure for this purpose. This procedure would effectively be a macroexpansion phase, potentially invoking custom syntaxes for names like
unquote, but this is the only new macroexpansion phase we should need. There’s no reason it must step up just one dimension to degree-2 data; it can generate any nonzero degree of data, which are all in some sense the same kind of data since they’re composition-compatible with each other (being collectively the “stable opetopes” in ). So we don’t need a tower at all, just this one new macroexpander.
So the data structure of opetopic shapes may be complicated to work with, but it seems like using it for syntax and macros will be the easy part.
If it’s as easy as I think, well, I’ll probably make another post about it once the implementation’s all there. :)
 “Polynomial functors and opetopes” Joachim Kock, André Joyal, Michael Batanin, Jean-François Mascari