The Design of Stanza's Optional Type System

Patrick S. Li - August 16, 2016

Stanza is one of a new group of recently released languages that are neither dynamically- nor statically-typed, and are instead designed around an optional type system. Stanza's type system subsumes the semantics of both fully dynamically-typed and fully statically-typed languages and offers the flexibility and productivity of the former with the error detection capabilities of the latter. The system is designed to be non-invasive, to let users seamlessly mix untyped with typed code and easily transition between the two paradigms, and to prioritize simplicity and ease of use. Our teaching experience shows that students, even those with little experience in typed languages, are comfortable with the language after less than ten hours of instruction. We have, thus far, written roughly 50000 lines of code in Stanza and found the optionally-typed paradigm to be more productive than both fully dynamically-typed and fully statically-typed languages. This article details the advantages of optional typing in general, the motivations behind the design of Stanza's type system, the technical hurdles we encountered, and our experiences with using and teaching Stanza.

The Role of a Type System

Not all syntactically-correct programs do something meaningful. In fact, the majority of them simply crash. For instance, subtracting the number
Java, on the other hand, won't even allow you to compile the program. It prints out:
A programming language's type system is responsible for detecting meaningless operations and notifying the user. Some languages, such as Python, detect errors at runtime when the operation is attempted, and other languages do it at compile-time, as is the case for Java. The majority of programming languages fall into one of two categories according to whether they are dynamically- or statically-typed. The above operation of subtracting

Unlike most current languages, Stanza is neither dynamically- nor statically-typed, and has instead what we call an optional type system. This allows users to begin programming without specifying any types -- in which case Stanza acts as a dynamically-typed language -- and then gradually add types throughout development to take advantage of the error-checking capabilities of a statically-typed language.

The type system is a fundamental part of the design of a programming language, and significantly influences the design of the rest of the language -- so much so that designers often recommend designing the type system before the other parts of the language. In this article, we'll explain, in depth, the motivations underlying the design of Stanza's type system, the difficulties that we encountered, and our experiences with using and teaching Stanza so far.

Dynamic versus Static Typing

The great majority of current programming languages fall into one of two categories according to whether they are dynamically-typed or statically-typed. The TIOBE index lists Python, PHP, and JavaScript as the most popular dynamically-typed languages in use today, and Java, C, and C++ as the most popular statically-typed languages. Stanza is one of the few languages, along with Dart and TypeScript, that does not fall under one of these two categories. Both paradigms have their own strengths and weaknesses.
Dynamically-typed languages do not require any type annotations, which saves the user from some verbosity while writing and reading code. They are also less restrictive than statically-typed languages and allow users to perform operations that would conservatively be deemed illegal in a statically-typed language. Proponents of static-typing may argue that the use of these operations is indicative of bad programming style and should not be used in polished software. But nonetheless, they are often very convenient -- especially during the early stages of development before the requirements and structure of the program are fully understood. Because of this, dynamically-typed languages are sometimes called prototyping languages due to the ease of turning an idea into a mostly-working program. The disadvantages of dynamically-typed languages appear when programs grow in size. The lack of an error detection pass means that type errors are not detected until the pertaining line of code is executed under the proper conditions -- the result of which is that, when working in a dynamically-typed language, a significant amount of implementation effort goes into avoiding minor typos and trivial mistakes. This includes both the time required to find and fix bugs caused by trivial mistakes, and also the time wasted on writing and updating tedious test code to maintain a reasonable level of correctness. An automated typechecker significantly relieves the programmer of both burdens. Statically-typed languages, on the other hand, come equipped with an automated typechecker that will preemptively detect ill-typed operations before the code is executed. However, most statically-typed languages require type annotations (OCaml and Haskell are notable exceptions) which contribute towards the verbosity of the program. And, depending on the design of the type system, they impose a restrictive structure on the program that periodically feels awkward and counter-intuitive. 
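The point about errors surfacing only when the offending line actually runs can be sketched in Python. The functions below are hypothetical illustrations, not taken from the Stanza article; the second branch of `describe` contains a trivial mistake that goes unnoticed until a negative input reaches it.

```python
def describe(n):
    """Return a short label for a number."""
    if n >= 0:
        return "value: " + str(n)
    # Trivial mistake: `n` is an int here, and str + int is not defined.
    # Nothing flags this until a negative number is actually passed in.
    return "negative: " + n


def try_describe(n):
    """Call describe and report a type error instead of crashing."""
    try:
        return describe(n)
    except TypeError as err:
        return "runtime type error: " + type(err).__name__
```

A test suite that only exercises non-negative inputs would never see the problem, which is the kind of gap a typechecker closes without extra test code.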
A common sentiment among programmers is the burden of having to "satisfy the typechecker", which alludes to the need to structure their programs in a form that is accepted by the type system. In a polished program, the added structure imposed by the typechecker may be beneficial for ensuring good software architecture, but it is often cumbersome during the prototyping stage of development. The programming community now recognizes that both dynamically- and statically-typed languages have their own pros and cons, and developers are advised to choose the most appropriate paradigm for their application. Applications that require frequent changes and where the final structure of the program is not well understood in advance benefit from the flexibility of a dynamically-typed language. Computer games are the best example of this. The ultimate objective of a computer game is simply to be fun, and thus developers are continuously revising the program to test different designs. On the other hand, applications whose requirements are well understood and for whom correctness is important, benefit more from the structure and the error detection capabilities offered by a statically-typed language. A hospital patient management system is a good example. An additional issue that complicates the choice of programming paradigm is the observation that the most appropriate paradigm in the beginning of the project may not remain the most appropriate paradigm towards the end. Considering a computer game as an example, in the beginning, we would like to quickly implement and evaluate different designs. The goal during this stage is not to perfect any one design, but to build as many as possible that function well-enough to judge whether it is indeed what is wanted. Thus the flexibility of a dynamically-typed programming language is well suited for this stage. However, once the game design has stabilized, the developers must eventually ship a working and well-performing product. 
At this stage, the lack of an automated typechecker is a serious hindrance, and thus a statically-typed programming language is better suited for the later stages of the project. To deal with the above problem, for a large enough project, developers are sometimes advised to adopt a split-language management solution. The project is first prototyped using a dynamically-typed language, taking advantage of its flexibility to fully explore the design space and discover detailed project requirements. Once the design has stabilized, and requirements are well-understood, then the code base can be ported to a statically-typed language where development then continues with the aid of a stricter language and an automated typechecker. This split-language approach to software engineering is a common one, and one that we are used to, but it has many shortcomings that we find unsatisfying. The most obvious one is the requirement to know two languages, a prototyping and a production language, and the high cost of porting the software from one to the other. Another observation is that, even with the split-language approach, developers are still faced with a tradeoff between their choice of paradigms. Minor typos and silly mistakes arise even during the prototyping stage, so why can't the dynamically-typed language catch them? Currently, the user is faced with this false tradeoff between sacrificing early detection of errors in return for productivity and flexibility. We felt strongly that this tradeoff is not fundamental, and that we could design a type system that provides productivity and flexibility on par with dynamically-typed languages while still being able to provide an automated typechecker. Finally, the last shortcoming comes from the observation that most software is comprised of multiple components, and each component may be at a different stage of development. 
Even if the main application is stable and is programmed in a statically-typed language, new features are always under development and need to be prototyped, and developers should be able to use a suitable language for the task. But the difficulty of managing an application with components written in different languages is overwhelming. Consequently, this is rarely done. So, for a given project, the act of porting from the prototyping to the production language is a one-time decision. Once you've made the switch to the production language, you're stuck there.

The driving motivation behind the design of Stanza's optional type system stems from our frustrations with the divide between the existing dynamic and static type systems. We set out to design a system that would offer the best of both paradigms and be more productive than either.

Optional Typing

Our optional type system was designed from the perspective that a single language can consistently contain the semantics of both a dynamically-typed and a statically-typed language. A variable or argument in Stanza can optionally be declared with a type. If it is, then the typechecker will detect ill-typed usages of the variable/argument and notify the user, as is done in a statically-typed language. But if it is not declared with a type, then we will delay the detection of any resulting type errors until runtime, as is done in dynamically-typed languages.

So it is fairer to say that Stanza is both dynamically- and statically-typed rather than neither dynamically- nor statically-typed. Dynamic-typing and static-typing are now just two different (and opposite) programming styles within Stanza. If no variable/argument has a declared type, then Stanza effectively behaves as a dynamically-typed programming language. And if every variable/argument has a declared type, then Stanza behaves as a statically-typed language. Stanza thus offers a continuous bridge between the dynamically- and statically-typed paradigms.
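Python's optional annotations give a rough feel for this style of programming, with the caveat that Python itself does not enforce them; a separate static checker such as mypy does that part. The sketch below is only an analogy, and the function names are hypothetical.

```python
from typing import Any


def area_untyped(width, height):
    # No declared types: nothing is checked before the program runs.
    return width * height


def area_typed(width: float, height: float) -> float:
    # Declared types: a static checker can reject area_typed("3", 4)
    # ahead of time. Runtime behaviour is the same as the untyped version.
    return width * height


def area_mixed(width: Any, height: float) -> float:
    # Mixed style: `width` is left unchecked, `height` is declared.
    return width * height
```

All three definitions compute the same result for valid inputs; the only difference is how much a checker is able to verify before execution.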
During the prototyping stage of development, developers may leave out type annotations to take advantage of the flexibility of dynamic-typing. Then gradually, as the software stabilizes and matures, more and more type annotations can be added to the code until the software reaches a level of reliability that the developers are comfortable with. With this method of development, users enjoy both the flexibility of a dynamically-typed language and the error detection capabilities of a statically-typed language without having to learn two different languages or bear the cost of an expensive port from one language to another.

Note that there is also no need for developers to restrict themselves to the two extreme styles: either declaring the types of all variables/arguments or no variables/arguments. Developers are free to declare the types of as many or as few of the variables/arguments as desired. The majority of Stanza code contains a mix of untyped and typed code, and in fact, almost no Stanza program is entirely untyped. Keep in mind that even if the user declares no types for any variable or argument, Stanza's core library is still stable and entirely typed. The effect is that the user's own code will run and behave like Python, but incorrect usages of the core library functions will still be detected in advance by the typechecker. Thus even during the prototyping stage, many common mistakes are detected by the typechecker. Our experience shows that this property allows programs to be prototyped even faster in Stanza than in Python [2]. The flexibility of dynamic-typing is provided where it is desired (the developer's own code) but incorrect usages of trusted code (the core libraries) are still statically detected.

Another use case for mixing typed and untyped code arises during the development of new components for mature applications. The main application may be mature and the majority of the program will be typed.
But new components under development may be left untyped during prototyping. Once a component has stabilized, type annotations may be added to bring it to the same level of reliability as the main application. Unlike the split-language management solution, switching from an untyped to typed paradigm in Stanza is not an all-or-nothing decision.

Technical Obstacles

Given our goals for Stanza's type system, there were three key technical obstacles we knew of that had to be overcome.

The first obstacle is that the type annotations have to be non-invasive. One problem that plagued earlier attempts at hybrid dynamic-static type systems is the inability to smoothly transition from untyped to typed code. A user may begin programming by leaving out type annotations entirely, but then as soon as a single type annotation is introduced the program will continuously fail to typecheck until a significant portion of the code has been annotated. Knowing exactly which annotations are necessary and which aren't is not easy for someone who is unfamiliar with the typechecking algorithm. In practice, users coped with the system by treating it as essentially a two-mode language. Code is prototyped by leaving out type declarations entirely; then when finished, type declarations are added everywhere. We were adamant that Stanza not have to be used in this fashion. The type system should allow for a gradual and easy transition from untyped to typed code.

The second obstacle results from our desire to allow users to smoothly convert untyped code to typed code without requiring any structural changes to the code. As a driving example, consider that porting a program from Python to Java involves more than just changing the syntax and adding type annotations. This is because many common idioms in Python are disallowed by Java's type system. For example, in Python, variables and arrays are allowed to hold values of different types, whereas this is disallowed by Java.
And in Python, the "duck typing" idiom allows users to require a function argument to support multiple interfaces, whereas Java requires the user to preemptively declare a class that implements the desired interfaces. As mentioned earlier, statically-typed languages impose more restrictions on the code structure than dynamically-typed languages. Thus to enable a smooth transition from untyped to typed code, Stanza's type system must be able to accommodate the common coding styles used with dynamically-typed languages.

The last obstacle is support for user-defined parametric types.

The Subtyping Paradigm

Stanza's type system is built on top of the subtyping relation. The foundations are more similar to the statically-typed object-oriented languages C++ and Java, and less so to the Hindley-Milner style functional languages, OCaml and Haskell. The subtyping paradigm involves first defining, for a program, a set of types and an acyclic (usually tree-structured) binary relation between them, called the subtype relation. For example, our set of types can consist of:
for representing a partial categorization of the animal kingdom, and the subtyping relation between them can be defined as follows:
where we use the notation roughly means "a
is allowed to be called with

We chose to build the Stanza type system on the subtyping relation for two reasons. First, it is a familiar paradigm for users accustomed to the type systems of C++ and Java. And second, even though Python, Ruby, and JavaScript have no static type system, we found that the typical coding style and architecture used in these languages is closer to the style of a Java programmer than an OCaml programmer. This second point is important for us as it means that typical untyped code is more easily converted to typed code using a subtyping-based type system than to typed code using a Hindley-Milner style type system.

Extending Subtyping with the ? Type

Stanza's most important extension to the subtyping paradigm is the addition of the ? type. The

The following example demonstrates a function called
Just as in Python,
issues the runtime error
when attempting the If we, however, explicitly declare
then Stanza behaves like a statically-typed language, and the call
will be detected by the typechecker as an illegal operation. The specific error issued is
indicating that there are multiple overloaded definitions of the For convenience when prototyping, we actually allow users to completely elide the type annotation for arguments of named functions. The default type of an argument, if one is not given, is the
Note that our extension for handling untyped code relies upon the introduction of a new type to the subtyping paradigm, but does not change any of the theory beyond that. This means that for standard types we may continue to use their traditional subtyping rules as found in the literature. Here is an example of how the
The first version of Finally, modeling dynamic-typing using the As mentioned in the previous section, the fact that the core library is completely typed means that users get to take advantage of Stanza's typechecker even before writing any type annotations. The following example function computes the total number of digits in an array of integers. Notice that there is not a single type annotation in the program.
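For readers more familiar with Python, a function of this kind might look as follows. This is an illustrative analogue with hypothetical names, not Stanza code. In Python nothing here is checked ahead of time, whereas the article's point is that Stanza's typed core library lets the typechecker catch misuse even in unannotated code.

```python
def num_digits(n):
    # Count the decimal digits of an integer, ignoring the sign.
    return len(str(abs(n)))


def total_digits(xs):
    # Sum the digit counts of every integer in the sequence.
    total = 0
    for x in xs:
        total += num_digits(x)
    return total
```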
Note that
Stanza is still able to detect that the expression
You cannot add a string to an integer. This shows that even during prototyping, Stanza's error detection capabilities save the user the burden of having to track down bugs due to trivial mistakes.

Extending Subtyping with the Union Type

As mentioned previously, one of our foreseen hurdles was to design a type system that allowed users to convert untyped to typed code without changing the structure of the code. This means that the final type system must be expressive enough to reflect the structure of typical code written in a dynamically-typed language. One common idiom that we noticed in dynamically-typed code is the use of a single variable or container for holding values of different types. The following code initializes
Hindley-Milner style type systems simply do not permit such code, and porting dynamically-typed code to such a language will necessitate a structural change. Java's type system does allow the code but only if What is needed is a type for indicating that Similarly to the
As an interesting aside, proper support for untagged union types also allows Stanza to omit the concept of a null value. A common idiom that is used to indicate that a variable may be "uninitialized" is to assign it the value
indicating that

Extending Subtyping with the Intersection Type

Another dynamic-language idiom that is commonly seen is the presumption that a function argument will support the methods of multiple interfaces. As an example, let us first suppose that the following functions are available and defined in an existing library:
There are two relevant types to consider in this example,
To annotate the above code, a type is needed for indicating that person should be both an
Parametric Types and Stanza's Captured Type System

The interaction between subtyping and parametric types is notoriously complicated to grasp and unintuitive for programmers not versed in type theory. The fundamental issue lies in the concept of variance. Consider the following function:
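As a hypothetical Python analogue of the variance question discussed in this section (the class and function names are illustrative, not taken from Stanza): a function that only reads from a list of animals is safe to call with a list of dogs, whereas a function that writes other animals into the list is not.

```python
class Animal:
    def name(self):
        return "animal"


class Dog(Animal):
    def bark(self):
        return "woof"


class Cat(Animal):
    pass


def count_animals(animals):
    # Only reads from the list, so a list of dogs is a fine argument.
    return len(animals)


def fill_with_cats(animals, n):
    # Writes other animals into the list. Harmless for a list of
    # animals, but it breaks the promise made by a list of dogs.
    for _ in range(n):
        animals.append(Cat())


def all_bark(dogs):
    # Code that trusts every element to be a Dog.
    try:
        return [d.bark() for d in dogs]
    except AttributeError:
        return None  # a non-Dog element was found
```

Treating a list of dogs as a list of animals is therefore sound only for as long as nothing is stored into it, which is the tension that variance rules exist to manage.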
The critical question is: should the user be allowed to call To illustrate the problem with the above logic, let us rename
where the name implies that the function will populate the given array with a number of random

Other statically-typed languages that feature both subtyping and parametric types, such as Java and Scala, employ complicated variance systems for users to specify whether a value is used in a covariant or contravariant setting. But for ease of use, Stanza makes the simplifying assumption that all parametric types are covariant, as that is the most common use case. The

Because of Stanza's covariant assumption, much of the traditional theory on parametric types, which relies upon bounded quantifiers, no longer holds. Thus Stanza employs a new type system mechanic called captured types for use in the definition of polymorphic functions, i.e. functions that take type parameters. The following example function,
The
the type parameter
the type parameter

Despite its simplicity, we have found that the captured type system is sufficient to type the vast majority of polymorphic functions used in practice and that it is quickly understood even by programmers with no type system experience.

Deep Casts and Types as Contracts

Besides the covariance assumption, there is one last key difference between Stanza's parametric type system and those of Java and Scala. Consider the following function, for computing the sum of all the integers in an array.
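A rough Python sketch of treating an element type as a contract on a container: the hypothetical helper `checked_ints` verifies each element at the point where it is retrieved, so a bad element is reported at the boundary rather than as a confusing failure later on. The timing of the check is an assumption made for this sketch, and it says nothing about how Stanza implements its casts.

```python
def checked_ints(xs):
    # Yield the elements of xs, checking each one as it is retrieved.
    for i, x in enumerate(xs):
        if not isinstance(x, int):
            raise TypeError(f"element {i} is not an int: {x!r}")
        yield x


def sum_ints(xs):
    # Sum a sequence whose elements are promised to be integers.
    total = 0
    for x in checked_ints(xs):
        total += x
    return total
```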
The argument In contrast, Stanza interprets the annotation on
Even though
results in this error:
which indicates that Stanza cannot prove that
The cast forces

Support for deep casts is important, as they are a common idiom in untyped code and are unsupported by Java and Scala. The interpretation of parametric types as contracts on their behaviour allows Stanza to retain its flexibility while detecting as many static errors as possible and also detecting runtime errors as early as possible.

Implications for Safety

Stanza is a strongly-typed language and its semantics are fully defined. All programs are guaranteed to terminate with either a result or a fatal error which indicates that an illegal operation was attempted, along with the line number and calling context under which it occurred.

It is difficult to characterize precisely whether Stanza is more or less safe compared to other statically-typed languages. Because of the presence of the

For instance, the following Stanza code
is correctly rejected by the typechecker, as it detects that the expression
Note that xs must be declared with type Compared to OCaml's Hindley-Milner type system, Stanza's support for subtyping also allows for more accurate type annotations in some contexts. Consider the following definition of the
There are a number of subtypes under
In contrast, consider the following tagged union declaration in OCaml, and
OCaml does not allow you to refer to specific tags in the union type, thus
hence cannot be detected by the OCaml typechecker.

Comparisons to Other Languages

To compare Stanza's type system to that of other languages, we can roughly group other type systems into five categories:
Stanza's goal is to offer the advantages of both dynamic and static typing in a single language with a shallow learning curve. The language should be familiar enough to be easily learned by programmers experienced with any one of the popular dynamically-typed scripting languages (Python, Ruby, JavaScript) or one of the popular object-oriented production languages (C++, Java). For this reason, we chose to build Stanza's type system on top of the subtyping paradigm (as made popular by C++ and Java), using the theory of nominal subtyping.

The nominal subtyping-based type systems of Java, Scala, and C++ are closest in nature to Stanza's type system. The key differences include Stanza's support for the

Compared to the Hindley-Milner (HM) style type systems of OCaml and Haskell, Stanza's type system is less restrictive and more expressive, but also offers fewer guarantees. The HM type systems do not support subtyping, union or intersection types, or untyped code. Languages with HM type systems also presume a very different coding style than is common with dynamically-typed languages. Porting code from Python to OCaml, for example, usually requires significant structural changes. We have yet to measure whether the smaller number of guarantees offered by Stanza's type system actually translates to more errors in production code, but our experience is that, after type annotations have been added, the remaining undetected errors are almost always fundamental logic errors that would not have been caught by any type checker.

Completely dynamic type systems as used by Python, Ruby, Lua, and JavaScript can be modeled completely within Stanza's type system by declaring every variable/argument with the

Julia's type system is similar to that of Python, being a completely dynamic system, but with the difference that objects are tagged with their parametric properties.
In the implementation of traditional dynamically-typed languages, every object carries a type tag that indicates its runtime type. Thus, integer objects carry a tag indicating they're integers, and array objects carry a tag indicating they're arrays. But an array containing integers would not carry a tag indicating that it is an array of integers. Its tag would indicate only that it's an array. In contrast, Julia's tags are far richer and can indicate, for example, an array of integers or an array of arrays of integers. This allows the Julia system to perform some significant optimizations and also dispatch to different code depending on the stated contents of an array. Nonetheless, Julia is still a completely dynamically-typed language, as opposed to Stanza, and does not offer any capabilities for detecting type errors before execution.

Dart and TypeScript are the other two programming languages, besides Stanza, that boast an optional type system. While the goals of the type systems of all three languages are similar -- to elegantly mix untyped and typed code -- their underlying theories are not. The most crucial distinction is caused by the fundamental difference between Stanza's multimethod-based object system and the object systems of Dart and TypeScript. Stanza's type system was designed specially to support its multimethod system -- which will be covered in depth in a later article -- and the type systems of Dart and TypeScript were designed to support their own respective object systems. Dart also does not support polymorphic functions, and TypeScript's type system is based on the theories of structural subtyping instead of nominal subtyping. Compared to Dart and TypeScript, we placed greater emphasis on enabling users to transform untyped to typed code without making structural changes, and our users have reported that Stanza's types feel less invasive.
Our Experiences Using Stanza

We have used Stanza ourselves on four major projects thus far, and a handful of smaller ones. The largest project is the Stanza compiler itself, which consists of about 20KLoC, and was written entirely by a single graduate student. This includes the macro system, type system, optimizer, register allocator, code generator, garbage collector, and standard library. The compiler itself has been rewritten many times now as the language design evolved, with the last full rewrite taking about four months.

The early Stanza compilers were originally written in Gambit Scheme because we were researching Stanza's coroutine semantics at the time and needed Scheme's continuation mechanism to evaluate our design. I distinctly remember the frustration of having to find and fix all of the runtime type errors caused by developing a large program in a dynamically-typed language. We treated each runtime error as a personal assurance that we were developing something useful. The Stanza compiler was not yet mature enough to consider writing Stanza in itself, and I remember that after tracking down each bug I would record whether it was an error that would have been caught by the Stanza compiler, or whether it was my own logic error. The final tally showed that the great majority were type errors caused by silly mistakes, and I remember pining for the day when I would be able to implement the compiler in Stanza itself.

Ironically for a prototyping language, the lack of a type checker caused the most anguish not during development but during experimentation when attempting to make adjustments to the design. During development, we broke down the compiler into subcomponents and tested each component extensively before developing and attaching the next component. By being reasonably disciplined, we could ensure that the code base was always at a reasonable level of correctness.
If the system failed after attaching a new component, then the error was likely in the implementation of the new component and not in the existing code base. But adjustments in the design require us to make changes deep in some component in the middle of the compiler chain. A change in an interface would require us to consistently update all the passes downstream. Now the cause of a runtime error could be anywhere. It could be that the new algorithm itself was wrong. Or it could be that we forgot to properly update a line in any one of the later passes. In Scheme, the act of changing an intermediate pass of the compiler was an intimidating affair. Those fears entirely went away when we switched to implementing the compiler in Stanza. There was a short phase when I was dismayed at the slow progress of developing Stanza in Scheme and considered using Java instead. I hoped that the addition of a typechecker would reduce the time I spent hunting down mistakes and hasten progress. But after three weeks of porting the Scheme code to Java, I realized quickly that the productivity loss of coding in a non-functional language like Java far outweighed the benefit of a typechecker. There were single lines in Scheme that blew up to multiple levels of nested loops in Java, and my own coding style made heavy use of nested functions which were also unsupported by Java. The straw that broke the camel's back however was when I was considering how to port a piece of code that made heavy use of a sophisticated Scheme macro. The only two sensible options were to directly write out the expanded code by hand, or to write a Java preprocessor that generated the desired code. Faced with that decision, I decided it was a senseless endeavor and opted to just stick it out with Scheme until Stanza was bootstrapped. I reckon that many people share similar experiences with using dynamically-typed languages. 
The lack of a typechecker is incredibly frustrating but we're willing to put up with it because of the productivity gain from the rest of the language. The second major project with Stanza was the FIRRTL hardware design language and compiler. We trained a junior graduate student in Stanza to develop FIRRTL. The student had an electrical engineering background and little programming experience, but within three weeks was fluent in Stanza. In fact, he was comfortable with the core language after only a week and a half, but required the remaining time to learn enough of the core library to be productive. Stanza, by this time, was fairly mature, but documentation was still lacking. He commented that the language itself was easy to learn and his major complaint was having to read through the source code to learn the core library. Nonetheless, after three months, under our supervision, he had completed the datastructures for the intermediate representation, the lowering passes of the compiler, and the bitwidth constraint solver. The three other major projects written in Stanza are the Feeny teaching language, a declarative printed circuit board (PCB) design system, and a physical imaging application. Feeny was a minimal programming system consisting of an interpreter, bytecode compiler, virtual machine, and just-in-time compiler, written in about 7KLoC. It was used to teach a graduate course at Berkeley on virtual machines and managed runtimes. The PCB design system is roughly 10KLoC and takes as input a declarative listing of desired peripherals which is used to automatically compute and generate the circuit board layout, the wire routing, and startup and networking code. The physical imaging application is 10KLoC and applies a collection of digital signal filters to datasets of up to 10GB. The interface and project management code is written in Stanza and the computational kernels for the filters are implemented in C for efficiency and called from LoStanza. 
All three projects were implemented by a team of fewer than two programmers in less than one year. This was made possible by using a language with productivity and flexibility on par with a dynamically-typed scripting language but with the additional aid of an automated typechecker.

The following summarizes our impressions of day-to-day programming with Stanza. The type system is intuitive and expressive enough that it can easily type the typical coding styles of programmers with Java, Scheme, and Python experience. For the rare instances where it is not obvious how to type a segment of code, it is trivial to leave it untyped. Type errors are well-localized, easily understood, and easily fixed. In our experience writing the Stanza compiler, FIRRTL, and PCB system, we have consistently felt that Stanza's type system guided us towards writing well-documented and well-architected software, and we have never gone out of our way to satisfy the type system.

Our Experiences Teaching Stanza

We have held two Stanza bootcamps at Berkeley thus far, each lasting for six sessions of one and a half hours. Each bootcamp consisted of a series of presentations intermixed with hands-on exercises. Students were expected to follow along and do the exercises on their own laptops. We had two main goals in mind when organizing the teaching material. The first goal was to quickly teach the students enough of the core language and libraries for them to be as productive with Stanza as with existing languages. The second goal was to expose them briefly to features of Stanza that did not exist in mainstream languages. We were too pressed for time to explain Stanza's deeper features in detail, but by the end of the bootcamp, the students knew the rough purpose of each feature and knew where to look to find more information on their usage.

Our experiences showed that students can comfortably learn the core language and libraries after roughly ten hours of instruction.
After that time, they are able to code easily in Stanza, but still in the same style as the language they are most familiar with. The majority of students were most fluent in an imperative programming language (such as Java or idiomatic Python) and this was reflected in their coding style. We were, however, happy that we had no difficulty teaching the type system, even to students who had never programmed in a statically-typed language before. This includes the concepts of parametric types and polymorphic functions. When tasked with defining their own polymorphic functions, the students reported that the captured type system is quite intuitive to use. In contrast to OCaml and Haskell, because of Stanza's dataflow-based type inference engine, the reported type errors are also easy for students to understand and fix. After ten hours, all students were able to code Stanza in Java/Python style, and able to easily port their existing Java/Python code to Stanza.

Three concepts that were harder for students to grasp were first-class functions, higher-order functions, and immutable datastructures. Students were able to easily understand the semantics of first-class and higher-order functions but had trouble recognizing when to use them. Even after teaching them to use and write their own versions of the common map function, most students still preferred instead to write explicit loops. Similarly, Stanza's

The remaining concept which proved difficult to learn for students with both imperative and functional programming backgrounds was Stanza's multimethod-based object system. Students have no difficulty understanding the basic mechanism and usage of Stanza's

Steps From Here

All narratives that advocate the productivity of a new programming language are, of course, anecdotal by nature. We have been very pleased with the productivity we've gained from the migration of our software stack to Stanza.
More rigorous experiments are planned for characterizing the productivity curve of Stanza versus competing languages.

A small number of type system extensions are currently under development, including support for bounded type parameters, constrained subtyping, type aliases, unsigned types, and an improved type inference engine. As has been the case so far, we place a strong emphasis on ease-of-use over static safety, and will strictly moderate the addition of features which complicate the language.

This article did not cover any of the performance implications of the type system, but we are confident that the added type information provides many optimization opportunities for the compiler. A separate line of work consists of writing aggressive optimization passes that take advantage of types for better inlining and transformations.

Ultimately, in the area of application programming, in the absence of hard real-time constraints, we don't foresee any fundamental limitations that would prevent Stanza from becoming the dominant language in this space. In the upcoming years we will be dedicated to making that happen.

Footnotes

[1] Without considering ambiguity errors resulting from function overloading.

[2] For projects where both languages have equal support for any necessary libraries.

Acknowledgements

Thanks to Jonathan Bachrach for proof-reading and editing this article. Thanks to George Necula for helping to iron out many subtle technical issues with the type system.
Site design by Luca Li. Copyright 2015.