"Stop Designing Languages. Write Libraries Instead."

Patrick S. Li - May 29, 2016

I had a friend tell me recently that all programming languages seem very similar to each other. They all have variables, and arrays, a few loop constructs, functions, and some arithmetic constructs. Sure, some languages have fancier features like first-class functions or coroutines, but he doesn't consider himself an expert programmer anyway and doesn't use those features.

What really makes a programming language productive for him, he says, are the libraries it comes with. For example, he got into programming by using the popular Ruby on Rails web framework. There is no way that he could have written a full database-driven web stack by himself, nor is he interested in doing so. But thanks to Ruby on Rails, he doesn't have to! So he said that he has no particular opinion about the Ruby programming language, but he absolutely loves Rails. The vast majority of programmers are non-experts, like himself, and the largest gains in productivity for non-experts come from having a wide spectrum of easy-to-use libraries. Subtle language features like first-class functions, and object systems, are lost on them because they don't really use them anyway. Computer scientists should really be spending their time developing new libraries rather than inventing new programming languages.

My friend's opinion about programming languages is a common one, and I have heard it repeatedly from experts and non-experts alike. Being a language designer myself, I, of course, don't share this opinion. Here is what I consider to be the purpose of a general-purpose programming language.

To start off, I would say that my friend's opinion is completely correct, just incomplete. The greatest productivity gains are indeed the result of having a wide spectrum of libraries. Ruby on Rails is a fantastic framework, and it has enabled thousands (if not millions) of non-experts to build sophisticated websites quickly. So the natural question then is, why isn't there now a Rails framework for every programming language?

Some languages that are semantically similar to Ruby do have their own web frameworks. Python, for example, has Django. But as of now, there is still no decent web framework for Java that is as easy to use as Ruby on Rails. Why is that? Are Java developers just not as competent as Ruby programmers? If David Heinemeier Hansson could design and develop Rails by himself, why can't a group of programmers just copy the design to Java? What makes this even more embarrassing is the fact that Java initially marketed itself as the web programming language, because of its applet technology. To emphasize this point, let me add that there is no good web framework for C either, and it is unlikely that there ever will be. Let me assure you that it's not because C programmers are worse than Ruby programmers.

Economics is not the reason either. The TIOBE index lists Java and C as the most widely used programming languages today, with Ruby coming in eighth place. There are many times more Java and C programmers than there are Ruby programmers. If someone would just write Java on Rails, the framework would have many times more users than Ruby on Rails, and it would instantly propel its author to internet fame and fortune.

So it's not because of incompetency. Nor is it because of economics. So why else wouldn't someone port Ruby on Rails to Java? Well, simply, because they can't.

If you're a knowledgeable Ruby programmer and you take a deep look through an introductory Rails tutorial, you'll notice that pretty much all of the Ruby language features come into play in some way. Rails' ActiveRecord library makes pervasive use of Ruby's meta-programming features. Rails' template system relies heavily upon Ruby's runtime evaluation features. To make your website respond to a user click, you subclass ApplicationController and reuse pre-coded functionality by importing various mixins. Events are often handled by attaching a callback in the form of a first-class function to some widget. Casual website designers can safely ignore the concepts of types and memory deallocation entirely because Ruby is dynamically typed and garbage-collected. These features are simply not available in every language. Java's meta-programming features, for example, are just not powerful enough to implement a system like ActiveRecord. Rails is only possible because of Ruby.
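
To make the meta-programming point concrete, here is a minimal sketch written in Python rather than Ruby, so it is an analogy and not how Rails is actually implemented. The Table class and its find_by_ methods are hypothetical names invented for this illustration. The finder methods are never written out by hand; they are synthesized at run time from the name the caller uses, which is the kind of trick a language without such hooks cannot package into a library.

```python
class Table:
    """Toy in-memory table (hypothetical; not the Rails API)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so this is where "methods that were never defined" get created.
        prefix = "find_by_"
        if name.startswith(prefix):
            column = name[len(prefix):]

            def finder(value):
                # Return every row whose `column` equals `value`.
                return [row for row in self.rows if row.get(column) == value]

            return finder
        raise AttributeError(name)


users = Table([
    {"name": "ann", "role": "admin"},
    {"name": "bob", "role": "guest"},
    {"name": "cy", "role": "admin"},
])

# Neither find_by_role nor find_by_name appears anywhere in Table.
admins = users.find_by_role("admin")
bob = users.find_by_name("bob")
```

The user of the library only sees that a sensible-looking method name works; the language feature that makes it work stays out of sight.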

So, completely unbeknownst to my friend, he is actually making heavy use of all those subtle language features that he claimed he never cared about. And this is intentional! Ruby on Rails was designed to make it possible to build websites without understanding type theory, or memory management, or object-oriented design patterns. Rails allows website designers to focus on designing websites, not managing their software infrastructure. My friend is enjoying all the benefits of Ruby without even knowing it, and that's the whole point.

Taking a step back, the concept of packaging code into easy-to-use libraries is not new. It's been around even in the days when programs were stored on punched paper tape. There are still vast libraries of assembly code containing useful subroutines. And every programming language ever designed provided some way for common functionality to be reused. To me, this is the primary purpose of a general-purpose programming language, to enable the creation of a wide spectrum of easy-to-use libraries.

The design of the programming language directly determines what sort of libraries you can write and how easy they are to use in the end. In the C language, the only major feature provided for enabling reuse is the ability to declare and call functions. So guess what? The majority of C libraries are basically large collections of functions. Ruby on Rails provides a concise way of expressing: do this when the button is clicked. The "do this" part is implemented in Ruby as a first-class function. How would it be implemented in a language like Java, which doesn't support first-class functions? Well, the behaviour of a first-class function can be mimicked by defining a new event handler class with a single perform_action method and then passing an instance of this class to the button object. So guess what? Using a Java library typically entails declaring a humongous number of handler classes. The programming language directly shapes the design of its libraries.
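
The two library styles can be put side by side in a small sketch. It is written in Python for brevity, and every name in it (ClickHandler, AppendSaved, ObjectButton, FunctionButton) is hypothetical; only the perform_action method name comes from the paragraph above. The first half mimics the handler-class style, the second half the first-class function style.

```python
class ClickHandler:
    """Handler-object style: every action needs its own class."""

    def perform_action(self):
        raise NotImplementedError


class AppendSaved(ClickHandler):
    def __init__(self, log):
        self.log = log

    def perform_action(self):
        self.log.append("saved")


class ObjectButton:
    """Button that only accepts handler objects."""

    def __init__(self):
        self.handlers = []

    def add_handler(self, handler):
        self.handlers.append(handler)

    def click(self):
        for handler in self.handlers:
            handler.perform_action()


class FunctionButton:
    """Button that accepts any callable."""

    def __init__(self):
        self.callbacks = []

    def on_click(self, callback):
        self.callbacks.append(callback)

    def click(self):
        for callback in self.callbacks:
            callback()


# Handler-class style: declare a class, instantiate it, register it.
object_log = []
button_a = ObjectButton()
button_a.add_handler(AppendSaved(object_log))
button_a.click()

# First-class function style: the "do this" part is a single expression.
function_log = []
button_b = FunctionButton()
button_b.on_click(lambda: function_log.append("saved"))
button_b.click()
```

Both buttons end up doing the same thing. The difference is how much ceremony the library's user has to write for each new action.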

In the early days of software, collections of functions were sufficient for coding reusable components. A lot of early software was numerical in nature, and there was a library function for every numerical algorithm you would want to run. Numbers go in. Numbers come out. Functions were perfectly adequate for this. Unix and C were also designed at a time when the majority of computing happened in batch mode. You prepare some input data, call a function or run a program, and you get some output data back. But computing has changed radically since the 1970s. Nowadays, most interesting programs are interactive. When a user clicks a button, it should do something. It was rare to want to extend the functionality of a library in the 1970s. The library provides a collection of useful functions. If one of them does what you want, then use it. If not, then write your own. But with the advent of interactive software, the need for extensible libraries became apparent. Programmers wanted GUI libraries that allowed them to say: when a user clicks a button, please run my code. Java (and C++) provides a limited method for extending an existing library's functionality through its subclassing mechanism. So using a Java library often consists of subclassing a number of magical classes and then overriding a number of magical methods. This style of library became so pervasive at one point that we even gave it a new name. Such libraries are called frameworks.

I surmise that probably many general purpose programming languages were originally designed because of the author's inability to write a good library for the language that he was using at the time. The initial impetus that got me thinking about designing Stanza, for example, came out of my frustrations with trying to write an easy-to-use game programming library in Java. To handle concurrency, traditional game programming frameworks required sprite behaviours to be programmed using a state machine model. But that's not how we intuitively think about sprites in our heads. Intuitively, we think about a character's behaviour as consisting of a sequence of steps. For example, first the character jumps, and then after he lands he looks to his left and then his right for the nearest enemy. If he sees one then he goes to attack it, otherwise he jumps again. He does this three times, and if he doesn't see an enemy after three jumps, then he takes a short nap. Transforming this sequence of steps into a state machine is an incredibly tedious and error-prone process, and most importantly, feels repetitive. It felt like I was doing the same thing again and again. So the natural question is, can I just make this state machine transformation a library and re-use it? It turns out I couldn't, not in Java at least. The language feature that I needed was some sort of coroutine or continuation mechanism. After some research I found that the Scheme language supports continuations, so the Scheme version of my game programming library was much easier to use than the Java version.
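
The sprite example can be sketched both ways. The code below uses Python generators as a stand-in for the coroutine mechanism; it is not Scheme continuations and not the original game library, and the names behaviour, SpriteStateMachine, and enemy_visible are hypothetical. Each call advances the sprite by one game tick and produces the action for that tick.

```python
def behaviour(enemy_visible):
    """Coroutine style: the code reads like the sequence of steps."""
    for _ in range(3):
        yield "jump"
        yield "look left"
        yield "look right"
        if enemy_visible():
            yield "attack"
            return
    yield "nap"


class SpriteStateMachine:
    """State-machine style: the same behaviour, transformed by hand."""

    def __init__(self, enemy_visible):
        self.enemy_visible = enemy_visible
        self.state = "jump"
        self.jumps = 0

    def tick(self):
        """Return this tick's action, or None once the behaviour is over."""
        if self.state == "decide":
            if self.enemy_visible():
                self.state = "done"
                return "attack"
            # No enemy: jump again, or nap after the third jump.
            self.state = "jump" if self.jumps < 3 else "nap"
        if self.state == "jump":
            self.jumps += 1
            self.state = "look_left"
            return "jump"
        if self.state == "look_left":
            self.state = "look_right"
            return "look left"
        if self.state == "look_right":
            self.state = "decide"
            return "look right"
        if self.state == "nap":
            self.state = "done"
            return "nap"
        return None


def never():
    return False


# Run each version to completion with no enemy in sight.
from_coroutine = list(behaviour(never))
from_machine = list(iter(SpriteStateMachine(never).tick, None))
```

The two versions produce the same sequence of actions, but in the state machine the loop counter, the branch, and the position in the sequence all have to be stored and dispatched on explicitly. That bookkeeping is the repetitive transformation described above, and the generator mechanism performs it on the programmer's behalf.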

Because of its support for continuations, the Scheme version of my game library did not require users to write their sprite behaviour as state machines. But it wasn't better than the Java version in every way. Most importantly, the Java version was statically typed, so the compiler automatically caught many of your mistakes for you. The Scheme version didn't have this ability, and thus debugging my games took a bit longer. At this point, the right question to ask would be: well, can you write a static-typing library for Scheme that then automatically checks your code for type errors? And the answer, for now and for the foreseeable future, is no. No mainstream language today allows you to write a library to extend its type system. Stanza doesn't either. It just attempts to provide one that is useful for a wider audience.

Since the purpose of a general-purpose programming language is to enable the creation of powerful libraries, different languages can also be characterized by which features they provide that cannot be written as libraries. Stanza provides an optional type system, garbage collection, and a multimethod-based object system. But if you don't like Stanza's object system, there is no way to write your own. This is one of the main directions of programming language research. Can we design a language so expressive that library writers can easily write the most appropriate object system, or most appropriate type system, to fit their application? Perhaps one day we'll have such a language. Racket and Shen provide mechanisms for extending their type systems, and research on meta-object protocols was an attempt at designing extensible object systems. So languages are differentiated by what types of libraries you can write in them and what types of libraries you can't.

In summary, the purpose of a general-purpose programming language is to enable the creation of powerful and easy-to-use libraries. The more powerful the language, the easier the libraries are to use. Code that makes use of a perfectly tuned library should read almost like a set of instructions for a coworker. So the next time you come across a particularly elegant library, know that many decades of language research have gone into making that possible. If you're curious about specifically which language features a library makes use of, then you can dig deeper, explore, and appreciate the thought that went into its implementation. If you're not curious about all this subtle language stuff, you can safely ignore it all and get on with your work. That's the whole point.