It’s been a couple of years since I posted here! Somehow 2014 turned into an opportunity for me to practice forms of expression other than programming, but that’s not what I’m here to talk about. I still spent lots of time on language design throughout 2014, and I’ve been zooming along in the new year.
Ethics for language design
I still dream big enough in my language design that ethics is an important consideration, but I’ve mellowed out when it comes to long-term ethics, and I’ve gotten a little riled up about short-term ethics. :) Whatever happens, I think we’ll eventually build systems for better communication throughput between people, reducing the kind of violent pressure releases I was afraid of. After all, whatever organizations have better communication methods will probably become more intelligent as organizations, and they’ll outcompete the others. We just have to worry about how brutal that competition will be. So I figure we should foster egalitarianism in ways that cultivate competitive markets. Does that make me a left-libertarian? I don’t even know.
Pragmatics for language design
At this point I think of a programming language as something that has niche value. “Programming” is a rather nebulous term itself, and “language” describes the skill you use rather than the reward you get. As user interfaces go, programming languages are optimized for tasks where the user will have a) a long time to prepare their input, b) a higher tolerance for complexity than for redundancy, and c) a rather strong commitment to their chosen code once it’s deployed out of their reach. Well-designed UIs avoid such high complexity and commitment burdens, so my expectation as a programming language designer is to put myself out of a hobby.
Between complexity and commitment, commitment is the more essential problem. Much of the complexity in programming can be traced back to commitment thresholds: Either the bit is set, or it’s clear, never in between. Furthermore I think it’s plausible to trace this back to the mind-body threshold: Human minds are so disconnected from each other that we insist on personal identity, and this insistence lends a sense of absolute discreteness to so many of the concepts we form. (Although I’m attributing this to humans, this might be more specifically a Western trend. My perspective is too myopic to tell.)
On the other hand, not all complex artifacts are deployed with a high commitment cost. Sometimes people deploy complex artifacts because they can’t help it, leaving behind fingerprints and memories. We might want to take advantage of this, focusing on programming languages that help us interpret found artifacts in a useful way.
If I’m on the right track here, then programs would do best to be shaped like some kind of fingerprint, and encapsulation boundaries in programming would do best to behave like the encapsulation boundaries of people. That way we’re not introducing unnecessary concepts. Well, people are vaguely like modules: Not only is a program module encapsulated in a way vaguely similar to a person, but orderless sets of interacting modules are vaguely similar to orderless sets of interacting people. When modules interact, their interaction membrane, if we took a fingerprint of it, would be their import/export type signature. Maybe a type signature is a fingerprint for person-to-person interaction too.
Based on this train of thought, we might like to find a language with only type signatures. If we took a typed functional language with implicits or type classes, removed everything but the type signatures, and tried to program with it anyway, we would accomplish some form of logic programming. Maybe logic programming languages are on to something.
Below the cut, I’ll list some of the language projects I’ve worked on over the past year or two. The above philosophical premises will be relevant for a few of them, but I won’t refrain from discussing tangential features and challenges I’m excited about. If you’d like to avoid most of the technical meat-fluff and get back to the philosophical fluff-meat, I recommend skipping to the section titled “Era Tenerezza” and reading from there. :)
Penknife

Penknife (demo) has been a project of mine for a while. It’s a Scheme-like language. Its main distinguishing feature, at this point, is that if a value is used more than once in a scope, it’s explicitly branched. The programmer can define custom branching behavior, which for one thing means they can track the flow of a variable through the program. I intended this feature as a way to build pipeline data structures that could be executed in a second step, but at this point it’s mostly experimental.
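To make the idea concrete, here’s a minimal sketch in Python of what explicit branching might look like. This is not Penknife’s actual API; the names `Branchable`, `use`, `branch`, and `on_branch` are all hypothetical. The point is that using a value consumes it, so using it twice requires an explicit `branch()` call, and the programmer can hook custom behavior (like flow tracking) into that call:

```python
class Branchable:
    """A value handle that must be explicitly branched to be used twice."""

    def __init__(self, value, on_branch=None):
        self._value = value
        self._used = False
        self._on_branch = on_branch  # custom branching behavior, e.g. flow tracking

    def use(self):
        if self._used:
            raise RuntimeError("value already consumed; branch() it explicitly first")
        self._used = True
        return self._value

    def branch(self):
        self.use()  # consume this handle and hand out two fresh ones
        if self._on_branch is not None:
            self._on_branch(self._value)  # e.g. record this point in the flow
        return (Branchable(self._value, self._on_branch),
                Branchable(self._value, self._on_branch))

log = []
x = Branchable(42, on_branch=lambda v: log.append(("branched", v)))
a, b = x.branch()  # explicit: we intend to use the value in two places
assert a.use() + b.use() == 84
assert log == [("branched", 42)]
```

The custom `on_branch` hook is where a pipeline-building or flow-tracking interpretation of the program would plug in.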
This isn’t the only thing the programmer can customize in Penknife. Penknife’s macro system works like a Kernel-style fexpr language, passing around a value to represent the lexical environment. Penknife procedures pass around an implicit state value (a “yoke,” as I’m calling it) that accomplishes the convenience of features like dynamic scope and ambient authority. Lexical environments and yokes usually have values of some built-in type tag, but programmers can replace them with values of their own type tags, as long as those values implement all the methods Penknife expects.
By controlling the yoke, a Penknife programmer can enforce purity of Penknife subcomputations. Nevertheless, even a pure Penknife computation has access to convenient impure idioms, because it can establish a local dynamic scope with access to mutable boxes that freeze when the scope is finished. This is inspired by Haskell’s ST monad, but it’s accomplished without a type system.
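As a rough illustration of the freezing-boxes idiom (again with assumed names, not Penknife’s API): boxes are mutable while their scope runs, and the scope freezes them all on the way out, so a computation can use impure idioms internally while remaining pure from the outside, ST-monad style but without a type system:

```python
class Box:
    """A mutable cell that can be frozen once its creating scope finishes."""

    def __init__(self, value):
        self._value = value
        self._frozen = False

    def get(self):
        return self._value

    def set(self, value):
        if self._frozen:
            raise RuntimeError("box escaped its scope and is now frozen")
        self._value = value

def with_local_boxes(body):
    """Run body with a box allocator, then freeze every allocated box."""
    boxes = []
    def new_box(value):
        box = Box(value)
        boxes.append(box)
        return box
    result = body(new_box)
    for box in boxes:  # freeze everything on the way out of the scope
        box._frozen = True
    return result

def outwardly_pure(new_box):
    counter = new_box(0)
    for _ in range(3):
        counter.set(counter.get() + 1)  # impure idiom, but only locally
    return counter

box = with_local_boxes(outwardly_pure)
assert box.get() == 3
# box.set(4) would now raise: the box froze when the scope finished
```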
Midway through 2014, Penknife started to languish. It wasn’t especially performant — it still takes me about 8 seconds to load the uncompiled demo, and about 1 second for the compiled demo — and its design details were too hesitant in too many places. I didn’t want to use it until it was more optimized and polished, and I didn’t have enthusiastic ideas about how to polish it up.
At one point I went back to Penknife and added coroutines. I did that because I was interested in what would happen if I let one Penknife program be a stateful resource in another Penknife program; maybe I could write things like debuggers. At the very least, I could reimplement mutation in terms of coroutines, and then mutation wouldn’t feel like such a tacked-on feature. I expected to encounter some complexity here, but I didn’t realize how much.
The worst source of complexity is the fact that a coroutine yield may never return: The current coroutine may be forgotten on the other side and garbage-collected. It was possible for computations to never return before, but only because of dynamic errors or nontermination, and programmers would presumably avoid those. There was now no source of friction to stop programmers from writing code that relied on yielding and never returning. This makes it more challenging to support auto-closing of resources, including the auto-freezing of my own locally scoped mutable boxes!
To tame this complexity, I’d like to force programmers to continue running the coroutine sooner or later. Typed languages could do this with linear types, and I can potentially do this in Penknife by giving the coroutine a custom branching behavior that causes an error if the coroutine is used more or less than once. However, so far custom-branching values have been more of a convenience tool than an enforcement tool, and they have inelegant back doors and poor performance. So I have some serious refactoring to do if I even want to continue with this design at all.
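Here’s a small sketch of what that enforcement could look like, using hypothetical names rather than anything in Penknife: the coroutine handle gets a behavior that errors if it’s used more than once, plus a check that it was used at least once, approximating a linear “use exactly once” discipline without a type system:

```python
class UseExactlyOnce:
    """Wrap a resume function so the handle must be used exactly once."""

    def __init__(self, resume):
        self._resume = resume
        self._uses = 0

    def use(self, arg):
        self._uses += 1
        if self._uses > 1:
            raise RuntimeError("coroutine handle used more than once")
        return self._resume(arg)

    def check_consumed(self):
        # The "at least once" half of linearity, checked at scope exit.
        if self._uses != 1:
            raise RuntimeError("coroutine handle dropped without being resumed")

handle = UseExactlyOnce(lambda arg: arg * 2)
assert handle.use(21) == 42
handle.check_consumed()  # ok: used exactly once
try:
    handle.use(0)  # a second use is a dynamic error
except RuntimeError as e:
    assert "more than once" in str(e)
```

The awkward part, as noted above, is that dynamic checks like `check_consumed` are an enforcement tool bolted onto a mechanism designed for convenience, which is why this calls for serious refactoring rather than a quick patch.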
It took me a while to circle back around to working on this topic, but the language with a well-defined stack now exists, and I’ve named it Staccato. I’ll get to it further down.
Miscellaneous concurrent coroutine designs
After my fallout with Penknife coroutines and my decision to use a sort of linear typing pattern, I went on to sketch out some coroutine designs for a hypothetical language that has proper linear typing.
Linear typing already has a notion of concurrency, because each resource is like a concurrent computation. It can express most kinds of coroutine in a strongly typed way, by way of linear function types. But that doesn’t help Penknife much.
My goal was to augment that strongly typed notion of concurrency with a system of weakly typed coroutine resources that provided “yield” over weakly typed input and output values. Ideally the coroutine resources would be sufficient for implementing the weakly typed equivalents of a wide range of other effect systems that can be encoded as linear types.
In the system I sketched out, yield is a handshake where the client decides how many branches the coroutine should create. Although this seems asymmetrical, the coroutine may treat any improper requests the same way it would treat weak typing errors.
So the lifetime of one of these coroutines is shaped like the causality tree of the effects it invokes. This seems like a sufficient basis for encoding lots of different kinds of effects.
The Era module system
Having gotten back to more strongly typed ways of designing languages, I revisited the designs I had made for the Era meaning-preserving module system. I drafted a new design (see long comments toward the bottom), re-expressing all of the module system features in a way that used types, but no terms. I was hoping to follow through on the philosophical thoughts I collected above: That an ideal language would be a mere module boundary between people, and would therefore be nothing but type signatures.
Although I expressed all the module system features I wanted, I had trouble expressing actual computation! I wasn’t even seeing an opportunity for logic programming. So I set aside this idea for a while.
Since then, I’ve realized that I was accidentally inventing an explicit rewrite calculus, just by explicitly encoding (let ...) constructs as types. I’ve heard these calculi are well known to have trouble expressing computation, but that there are several proposed solutions. I might want to read up on those.
As I was reformulating this module system, I added a feature I’m pretty happy with: Like you’d expect of any module system, it enforces information hiding, but it also lets that information be accessed by anyone who demonstrates they have a way to know the information anyway. This way, if you can derive knowledge by studying the implementation of a module, then you’re actually allowed to record that derived knowledge directly in the module system at the same level of representation, which makes it easier to believe that these modules are actually representations of knowledge. And practically speaking, it means people can dive into unforeseen use-cases without fear that they’ll need to break through abstractions using hackish whole-program preprocessors or unsafe language runtime extensions.
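One way to picture this (purely illustrative, not Era’s actual mechanism, and with invented names throughout): a module hides a definition behind a hash commitment, and access is granted not by permission bits but by reproducing the hidden content, i.e. demonstrating you already know it:

```python
import hashlib

class HiddenDefinition:
    """A module definition hidden behind a commitment. Anyone who can
    reproduce the hidden source -- demonstrating they already know it --
    may record derived facts at the same level of representation."""

    def __init__(self, source):
        self._source = source
        self.commitment = hashlib.sha256(source.encode()).hexdigest()
        self.derived_facts = []

    def record_fact(self, claimed_source, fact):
        # Access by demonstrated knowledge, not by a permission flag.
        claimed = hashlib.sha256(claimed_source.encode()).hexdigest()
        if claimed != self.commitment:
            raise PermissionError("you haven't shown you know the implementation")
        self.derived_facts.append(fact)

mod = HiddenDefinition("(define (double x) (* 2 x))")
# Someone who studied the implementation can record derived knowledge:
mod.record_fact("(define (double x) (* 2 x))", "double is linear in x")
assert mod.derived_facts == ["double is linear in x"]
```

A real module system would need much more than a hash check, but the sketch captures the shape of the idea: the barrier keeps out people who don’t know the information, not people who do.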
The Era reader

I refactored my hand-rolled s-expression reader and tweaked a few things about it. Just when I thought I had nothing more to add, @arntzenius tweeted something that made me realize exactly what hole I felt in my heart. So I redesigned the reader completely, and I reimplemented it (demo). Now when a program uses nested quasiquotation of interpolated strings, it can use labeled unquoting. Labels! The resulting design reminds me of hybrid logic’s world variables, but since this reader just implements a shallow surface syntax with no other kinds of variables, I think it’ll remain pretty simple.
Oh, if you missed that: I’m doing quasiquotation of strings. The entire language syntax is designed so that almost any code can be quoted as a string with zero modification. String literals within string literals behave like nested quasiquotation, and I love it. The wild world of building code by string concatenation may be frown-inspiring, but I have never been more prepared for it.
Besides string quasiquotation, the syntax has the usual s-expressions and line comments. It has an infix shorthand, as seen in Arc, which is good for taming left-leaning nested code such as ((a b) c), the shape of namespace lookups. For right-leaning code like (a (b c)), the syntax uses / as a special one-sided delimiter: (a /b c). This actually makes continuation-passing style look very nice, even when some of the continuations are strings.
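A toy expansion pass, not the real Era reader, shows how the one-sided / delimiter works: inside a list, / opens a sublist that implicitly closes where the enclosing list closes, so (a /b c) reads as (a (b c)):

```python
def expand_slash(sexp):
    """Expand "/" markers: everything after a "/" in a list becomes
    one nested sublist, closed by the enclosing list's close paren."""
    if not isinstance(sexp, list):
        return sexp
    out = []
    for i, item in enumerate(sexp):
        if item == "/":
            # The rest of this list becomes a single nested sublist.
            out.append(expand_slash(sexp[i + 1:]))
            return out
        out.append(expand_slash(item))
    return out

assert expand_slash(["a", "/", "b", "c"]) == ["a", ["b", "c"]]
# Chained slashes give right-leaning nesting, CPS-style:
assert expand_slash(["a", "/", "b", "/", "c", "d"]) == ["a", ["b", ["c", "d"]]]
```

This is why it flatters continuation-passing style: each / peels off one more “rest of the computation” without a trailing pile of close parens.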
I’ve also prepared this syntax for use as a markup language. Most strings read by the Era reader will have their whitespace collapsed, but this can also be suppressed on a per-string basis, and there are escape sequences for micromanaging the whitespace if the need arises. Between this feature and string interpolation, it’s going to be very comfy to write informative error messages.
Staccato

I may have gotten used to debugging continuation-passing style code, but one thing I’m not used to is rewriting all my code when I think of a better trampoline idiom. Above I described the case in point: I ran across a stumbling block when I wanted to iterate over and replace the stack frames of a Penknife coroutine. I would sooner write a new language than try to track continuation-passing-style stack frames manually, and Staccato is that language.
Besides giving me a more direct coding style for the code I would have written anyway, Staccato potentially gives me several special abilities: Maybe I could take well-defined stack and heap snapshots of my programs as they ran, and then use those snapshots for unit test suites and bug reports. Maybe I could implement custom garbage collection that walked the stack. Maybe I could hand-tune the compilation policy for each little constant-time segment of my program to maximize performance. Maybe I could write a JIT to do that for me on the fly. So Staccato is a good compilation target not only because of its simplicity, but also because of its potential for high performance.
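The underlying idea can be sketched as a trampoline over explicit steps, where every constant-time segment returns control to a driver loop, so the driver has a well-defined point at which to snapshot, instrument, or swap out the computation. This is only an approximation of Staccato in Python, with invented names:

```python
def countdown(n):
    # Each step returns either ("done", value) or ("frame", thunk),
    # where the thunk is one constant-time segment of the program.
    if n == 0:
        return ("done", "liftoff")
    return ("frame", lambda: countdown(n - 1))

def run(step, snapshot_at=None):
    """Trampoline driver: runs one segment at a time, so it can observe
    the computation at any well-defined step boundary."""
    trace = []
    ticks = 0
    state = step
    while True:
        tag, payload = state
        if tag == "done":
            return payload, trace
        if snapshot_at is not None and ticks == snapshot_at:
            trace.append(("snapshot after ticks", ticks))
        ticks += 1
        state = payload()  # run exactly one constant-time segment

result, trace = run(countdown(3), snapshot_at=2)
assert result == "liftoff"
assert trace == [("snapshot after ticks", 2)]
```

Because the driver owns the loop, the same hook point that records a snapshot here could instead drive a debugger, a custom GC walk, or a per-segment compilation policy.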
Meanwhile, maybe the lessons from implementing Staccato would give me techniques for visualizing the call tree of reactive programs. In particular, I’d be thrilled if these lessons can help me add general recursion to Underreact or a similar design.
There are so many potential paths to take with Staccato, it’s almost boring. But I am taking one path with it already.
Era Tenerezza

I’ve actually already started sketching a successor to Staccato called Tenerezza. In Staccato it turns out stack frames were my one and only data structure, doubling as both a product type (if you know what tag it has) and a first-class function (if you don’t know or don’t care). I’m realizing that these semi-encapsulated lambdas are a lot like the semi-encapsulated modules I’ve been exploring for the Era module system, and maybe if I just add some stricter access control, some collection-oriented operations, and a continuous reactive runtime, I’ll achieve something like the Reactive Knowledge Networking (RKN) idea I blogged about before.
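That dual role of a frame can be rendered in a few lines (a hypothetical illustration, not Staccato’s real representation): a tagged frame acts as a product type when you know its tag, and as an opaque first-class function when you don’t:

```python
class Frame:
    """A tagged frame: a product type to those who know the tag,
    an opaque first-class function to everyone else."""

    def __init__(self, tag, fields, call):
        self.tag = tag
        self._fields = fields
        self._call = call

    def destructure(self, expected_tag):
        # Product-type view, available only if you know the tag.
        if self.tag != expected_tag:
            raise TypeError("unknown tag; treat this frame as opaque")
        return self._fields

    def __call__(self, arg):
        # Function view, available even to callers who can't see inside.
        return self._call(self._fields, arg)

adder = Frame("adder", {"amount": 5},
              lambda fields, x: x + fields["amount"])
assert adder(10) == 15                              # opaque, function-like use
assert adder.destructure("adder") == {"amount": 5}  # transparent use
```

The “stricter access control” mentioned above would then amount to controlling who is able to name a tag well enough to call `destructure`.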
In Tenerezza, as in RKN, all first-class values are two-way communication channels that continuously carry sets of knowledge. Something interesting happens with my design philosophy here. If I’m representing sets of knowledge, I want to believe they’ll be representable as Era modules. But an Era module is a chunk of program code, and I want program code to be shaped like fingerprints/blueprints of person-to-person interaction. So I’m just ending up with communication channels that carry fingerprints/blueprints of communication channels! The language design is feeling marvelously compositional.
I can hardly believe how well Tenerezza is falling together, but it’s still in early design.
The Era project overall
My Era project is a mess of all of the above ideas, but I think the mess is organic and productive. If not for my spurious interest in coroutines, I wouldn’t have gotten stuck in Penknife and wanted a new trampoline, and I wouldn’t have made Staccato. If I didn’t make Staccato, I wouldn’t have a plan for managing Tenerezza’s resource usage and optimization. Tenerezza now seems like a serious contender for accomplishing lots of the same goals as the Era module system, which may center Tenerezza as the main feature of the Era project. Finally, whatever languages finally come out of this process, the Era reader is ready for them.
Those languages wouldn’t even be Era’s final form.
I dream of developing a visual OS/IDE in art nouveau style. As visual styles go, art nouveau has the nice property of providing nooks and crannies for cultural signaling, personal decorations, and subtle live ops dashboards. It has a built-in stringiness that may help to convey arbitrary mind-maps, flowcharts, line graphs, and containment hierarchies. It can work in black and white. It can work as an embossed texture. It can give Era a brand image.
The OS/IDE would let end users manipulate the underlying workings of any UI widget they interact through. This goal doesn’t come out of the blue. I’m not building Era technologies so that users interact with Era technologies, but to replace technologies that prevent users from interacting more directly with each other. Idiosyncratic user interfaces must ultimately be for the users to develop, simply as part of the idiosyncratic usage habits we know they’ll develop at the same time.
This isn’t feature creep. I was aware of those dreams when I decided to start the Era repo and name it “Era.” With goals like those, if I put myself out of a hobby as a language designer, I’ll still have hobbies to pursue as a UX designer.
Then again, if I live long enough to put myself out of a hobby as a UX designer, I bet our society will be so socially integrated and collaborative that we won’t need discrete project identities like “Era” anyway. ^_^ This could be the feature creep to end all feature creep.