Lately I’ve been trying to iron out the details of a generalization of quasiquotation which I call “higher quasiquotation.” The basic idea is that just as a quasiquotation is a region in one parenthesis-delimited region (marked by
quasiquote) and a set of other parenthesis-delimited regions (marked by
unquote), we can go on to talk about regions between quasiquoted regions, regions between those regions, and so on.
If you think of values with holes as being functions, then the notion that this is a “higher-order” quasiquotation is clear: Each quasiquotation determines a value of type
(c SExpr -> SExpr), the next higher degree of quasiquotation determines a value of type
(c (c SExpr -> SExpr) -> (c SExpr -> SExpr)), and so on, where
c is some collection like
c a = Map String a. But these functions aren’t the whole story; the quasiquotations should be able to be pulled apart like other data structures, not just filled in to create s-expressions.
I haven’t managed to write a full macroexpander for higher quasiquotation yet. I’ve written this post to share my status as it is.
Cene is a language I’ve built over the last couple of years. I’ve talked about Staccato and Tenerezza here, and that code has turned into Cene.
What sets Cene apart: Extensibility support
Cene’s design revolves around the primary idea that future generations will have better ideas for programming languages than we do, so most of what sets it apart is its support for custom languages, which mainly has to do with the design of its macroexpander.
Cene’s macroexpansion phase incrementally writes definitions (of macros, functions, etc.) to monotonic state resources using deterministic concurrency. These state resources are expressive enough that user-defined macros can use them to achieve combinations of open-world and closed-world extensibility, which is what I consider to be Cene’s primary feature.
It’s been a couple of years since I posted here! Somehow 2014 turned into an opportunity for me to practice forms of expression other than programming, but that’s not what I’m here to talk about. I still spent lots of time on language design throughout 2014, and I’ve been zooming along in the new year.
Ethics for language design
I still dream big enough in my language design that ethics is an important consideration, but I’ve mellowed down when it comes to long-term ethics, and I’ve riled up a little about short-term ethics. :) Whatever happens, I think we’ll eventually build systems for better communication throughput between people, reducing the kind of violent pressure releases I was afraid of. After all, whatever organizations have better communication methods will probably become more intelligent as organizations, and they’ll outcompete the others. We just have to worry about how brutal that competition will be. So I figure we should foster egalitarianism in ways that cultivate competitive markets. Does that make me a left-libertarian? I don’t even know.
Pragmatics for language design
At this point I think of a programming language as something that has niche value. “Programming” is a rather nebulous term itself, and “language” describes the skill you use rather than the reward you get. As user interfaces go, programming languages are optimized for tasks where the user will have a) a long time to prepare their input, b) a higher tolerance for complexity than for redundancy, and c) a rather strong commitment to their chosen code once it’s deployed out of their reach. Well-designed UIs avoid such high complexity and commitment burdens, so my expectation as a programming language designer is to put myself out of a hobby.
Between complexity and commitment, commitment is the more essential problem. Much of the complexity in programming can be traced back to commitment thresholds: Either the bit is set, or it’s clear, never in between. Furthermore I think it’s plausible to trace this back to the mind-body threshold: Human minds are so disconnected from each other that we insist on personal identity, and this insistence lends a sense of absolute discreteness to so many of the concepts we form. (Although I’m attributing this to humans, this might be more specifically a Western trend. My perspective is too myopic to tell.)
On the other hand, not all complex artifacts are deployed with a high commitment cost. Sometimes people deploy complex artifacts because they can’t help it, leaving behind fingerprints and memories. We might want to take advantage of this, focusing on programming languages that help us interpret found artifacts in a useful way.
If I’m on the right track here, then programs would do best to be shaped like some kind of fingerprint, and encapsulation boundaries in programming would do best to behave like the encapsulation boundaries of people. That way we’re not introducing unnecessary concepts. Well, people are vaguely like modules: Not only is a program module encapsulated in a way vaguely similar to a person, but orderless sets of interacting modules are vaguely similar to orderless sets of interacting people. When modules interact, their interaction membrane, if we took a fingerprint of it, would be their import/export type signature. Maybe a type signature is a fingerprint for person-to-person interaction too.
Based on this train of thought, we might like to find a language with only type signatures. If we took a typed functional language with implicits or type classes, removed everything but the type signatures, and tried to program with it anyway, we would accomplish some form of logic programming. Maybe logic programming languages are on to something.
Below the cut, I’ll list some of the language projects I’ve worked on over the past year or two. The above philosophical premises will be relevant for a few of them, but I won’t refrain from discussing tangential features and challenges I’m excited about. If you’d like to avoid most of the technical meat-fluff and get back to the philosophical fluff-meat, I recommend skipping to the section titled “Era Tenerezza” and reading from there. :)
I’m gradually figuring out a foundation for a general-purpose programming language, and I think I just laid a great cornerstone, which seems to solve the expression problem for dependent type theory. My pseudocode is in this long GitHub Gist, which comes with a long revision history showing my progress over the last 12 days.
I haven’t yet looked for a proof of strong normalization, consistency, and whatnot (and I don’t even intend for my theory to be expressive enough to support induction!), but the final insight has turned out to be very straightforward: If we can extend an extensible type and reimplement its interface (and re-prove its invariant) so that the new implementation/proof is observationally equal to the original as far as the original cases are concerned, then our extension may as well have been part of the type all along.
I’m using the observational equality infrastructure described by Altenkirch and McBride in “Observational Equality, Now!” and the way I think of the expression problem pretty much lines up with the requirements listed in Zenger and Odersky’s “Independently Extensible Solutions to the Expression Problem.”
On December 20, just in time for the Mayan apocalypse, I thought of an approach to computer programming that unites my meaning-preserving modularity, some of David Barbour‘s RDP vision, and my own philosophical worldview.
I’m calling it Reactive Knowledge Networking. It takes the philosophical idea that a person does nothing with the world except observation and action, and it uses that idea to facilitate people’s communication with each other, with minimal (if any) computer configuration bureaucracy along the way. Its network structure is very similar to RDP, and it uses meaning-preserving modularity to encode the partial knowledge a person has observed.
I call it Cairntaker. It’s for programmers who want well-tended heaps.
Recently I find it compelling to view programs in terms of their dataflow, since sometimes the whole dataflow graph can be generalized to some other purpose. I think it could be interesting to have a lisp variant whose surface syntax was not a tree but instead a dataflow graph. Last post, I described this train of thought in more detail. Now I’ll go into detail about my own partial ideas for generalized dataflow syntax.
The thing about syntax is, all the familiar syntaxes we use are flat enough to fit as text, and we group them by locality into hierarchies. That’s not something I’m trying to revolutionize right now; instead I’m looking for a way to construct these dataflow graphs using a surface syntax similar to what we already use.
Reactive programming and dependent typing have opened my eyes to an interesting way to view syntax. Now if you’ll bear with me, I won’t get to syntax until the end of the post.
In programming languages we often build “data” out of sums (either this kind of thing or that kind of thing) and products (this kind of thing and that kind of thing together), and every serializable notion of data, like binary, Unicode text, JSON, etc. falls into that pattern pretty easily. But right down at the lowest level, when we encode a binary bit (or boolean value) as “either 0 or 1″… when the “kind of thing” is a 0, what does it take to store it? Nothing; if we know we’re trying to read something of kind 0, we already know the value without even looking at the data. The same goes for 1. It turns out a bit is just “either this or that” when we only care about the branch itself, not what this or that actually represent. Our algorithms could actually be operating on secretly richer data structures, and they just happen to ignore that substructure.
In most languages, we can hold state containers and external references in our data structures, even though they don’t have the same serializable nature as sums or products. Many utilities that operate on data structures (say, a utility that reverses a list) will simply leave these exotic values alone. The secretly rich substructure actually exists, and it’s crucial to the other parts of the program.
A more common word to use to discuss this secrecy is polymorphism. If a utility really doesn’t care about what type it’s dealing with, it’s polymorphic in that type. Often things are polymorphic only up to some condition on the types involved, and conditions on individual types tend to be represented as types of their own. Implications between conditions are subtyping relations: A is a subtype of B iff A’s condition is sufficient to ensure B’s condition.
Reactive programming turns programming on its head in the following way: Suppose sums represent decisions that change over time, rather than eternally fixed, serializable bits. It turns out most of our programming patterns are actually polymorphic enough to continue making sense under this interpretation. Naively, we can just run our programs over and over every time a decision changes.
Dependent typing turns programming on its head in a similar way: Suppose every value is associated with some other value (its type) that represents what we know about that value at compile time. Unlike in many kinds of static typing, dependent typing breaks down boundaries that typically distinguish types from values, and a polymorphic utility can potentially work on types as easily as it works on values, so that some of our existing programming patterns can essentially be used for extending the compiler.
That’s one ideal of dependent typing anyway; in dependent type systems, there’s a tradeoff between ease-of-use and the mathematical rigor of “what we know about that value.” Mathematical rigor isn’t really something we can have halfway (though there are multiple ways we can appraise it, e.g. multiple foundational theories our own theory can be relatively consistent to or expressive of), so lots of research effort goes into pushing the boundaries of usability while preserving desirable proof-theoretical properties. For instance, if we naively allow arbitrary (potentially nonterminating) computation or allow full polymorphism on some type of all types (including this one), we open the door to mathematical inconsistency, and our compile-time knowledge is no longer trustworthy.
In effect, dependent typing introduces some new exotic values to our data structures (the exotic values being types), and certain kinds of dependent typing classify nonterminating functions as being too exotic to fit.
Back to syntax
I opened with a mention of syntax, and then I’ve blitzed through data, polymorphism, and these alternate kinds of dataflow on my way to the point.
Lisp gets pretty good mileage out of its syntax. When it handles program fragments as hierarchical lists of symbols, they’re about as intuitive to manipulate as any other list-structured data.
The thing is, encoding a program as a hierarchical structure in the first place is finicky business, with variable scoping issues to worry about. Now that I’ve seen that there are meaningful ways to reinterpret whole programs, I wonder how much more mileage a lisp variant would get if its surface syntax took care of scoping issues itself and left us with just a dataflow graph to manipulate.
Unfortunately, I think this surface syntax would inevitably be more verbose. If we assume little about the programming model at hand, we don’t get to fine-tune the syntax to work in the best way for that model. Additional syntactic abstraction before this step, in the form of a hierarchical macro system or a context-free parser, may make it more palatable.
During this time, the potential value of static type systems has become quite clear to me. For the most part I still really don’t care about protecting programmers from their own mistakes. Instead what I value is the ability to make promises about program behavior that untrusting parties can verify on their own (for non-hackish metaprogramming and secure code distribution), the ability to write expressive abstractions that don’t sacrifice any run-time performance, and the ability to infer a program’s implementation from its interface. I still like the dynamic programming ideal of putting lots of power in the programmer’s hands… but whence else comes this power? Rather than stubbornly reinventing the existing static type research under some other name, I embrace it.
I haven’t felt the pain of the asynchronous callback nesting problem firsthand, but I know a horror close to that one: Monadic programming in Groovy. Observe the indentation.
In fact, that indentation-heavy part of the Blade code is what originally got me fed up with using Groovy and interested in making Penknife as an intermediate project on the way to Blade. Well, my work on Penknife has been so focused on modularity and hygiene that it’s slack on what really matters: Syntax. Oh, syntax, how you really really matter so! :-p