samskivert: Modules as Objects in Newspeak – Bracha, et al.

27 December 2010

This paper describes Newspeak’s approach to modularity (along with a grab-bag of other things). Newspeak uses top-level classes as module definitions and declares all dependencies as “type” members of said classes. Newspeak is dynamically typed and names are late bound (hence the scare-quotes in the previous sentence). This late binding is useful when taking this approach toward dependency resolution.

Using top-level classes as modules is nice in a number of ways. It reuses an existing linguistic concept, rather than adding another structure to the language (in Newspeak’s case, classes, though the same could be accomplished with top-level objects in a prototype-based language like Self or JavaScript). It allows one to abstract over modules (assuming one can abstract over classes). If one’s language allows for mutually recursive class definitions (which Newspeak does), then one gets mutually recursive module definitions for free.

I think Newspeak is a very interesting language, and I applaud the “extreme” opinions of its designer and the influence they have on its design. However, I have a couple of issues with modularity in Newspeak.

The first is that they are missing the party, as far as I’m concerned. Scala is another language that is taking this approach toward modularity, and Scala is in active use by hundreds (possibly thousands these days) of professional developers. Systems come in uncountably many shapes and sizes, and to really understand an approach to modularity, you need to try it on a lot of different systems. That’s not to say that Newspeak’s authors are wrong not to focus on adoption first, but they’re going to have a hard time evaluating their modularity ideas until the language is more widely used.

Furthermore, Scala tackles the interesting (in my opinion) challenge of providing a type system that checks that your module assembly is well-formed. Just as one wants to reuse a linguistic structure to support modularity, one also wants to reuse type-system mechanisms to check those structures. Scala accomplishes this with self types and (mixin composed) traits.

My second concern is that Newspeak takes the extreme position that every dependency must be explicitly enumerated and brought into the namespace. This extends to basic platform classes like, for example, List. In Newspeak one would write code like so:

class NewspeakExample usingLib: platform = (
  | List = platform collections List. |
)

The platform library is declared as a dependency, and the platform collections List class is brought into the local namespace as List. This amounts to the same level of syntactic drudgery as a Java import statement (and its addition can probably be automated to the same degree by an IDE, but readers of the code still have to read it).
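For readers who don't know Newspeak, here is a rough Python analog of the same idea. It is only a sketch: the `platform` object, its `collections.List` member, and the `squares` method are hypothetical names invented for illustration, and Python's own `list` stands in for the platform's List class.

```python
from types import SimpleNamespace


class NewspeakExample:
    """Rough analog of a Newspeak top-level class used as a module."""

    def __init__(self, platform):
        # The only route to List is the platform object handed in by the
        # caller; nothing is pulled from a global namespace.
        self.List = platform.collections.List

    def squares(self, n):
        # Every use of List goes through the member bound above.
        result = self.List()
        for i in range(n):
            result.append(i * i)
        return result


# A hypothetical "platform" assembled by whoever instantiates the module.
platform = SimpleNamespace(collections=SimpleNamespace(List=list))
module = NewspeakExample(platform)
```

The binding line in `__init__` plays the role of the Newspeak slot declaration, and it has to be written (and read) for every platform class the module uses.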

Put aside, for the moment, the potential benefits (and dangers) of Newspeak’s extreme approach to late binding: the name List is just a member of the NewspeakExample class and can be overridden by subclasses or changed at runtime, thereby changing the class instantiated at every place List is used inside NewspeakExample. Even so, Newspeak’s designers are missing the strong message being sent by the Python, Ruby, Perl, etc. communities that they want their batteries included. They want the standard set of data structures and utilities available without having to specifically request them.

Assume that the vast majority of programs will use the standard libraries, and use them in the “standard” way. When would the ability to replace the standard libraries with specialized versions actually be useful? (Let’s also ignore the massive potential for confusion if you redefined the standard libraries to behave differently.) Perhaps when profiling or debugging one could replace the standard libraries with instrumented versions. Explicit enumeration of the standard libraries as a dependency seems like a steep price to pay for this marginal use case. Such a use case would probably be better accomplished via some extra-linguistic mechanism.
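To make the instrumented-replacement scenario concrete, here is a minimal Python sketch of it. The class names (`CountingList`, `Report`, `ProfiledReport`) are hypothetical, and a Python class attribute looked up through `self` stands in for Newspeak's late-bound List member.

```python
class CountingList(list):
    """Hypothetical instrumented list that counts calls to append."""

    appends = 0

    def append(self, item):
        CountingList.appends += 1
        super().append(item)


class Report:
    # List is an ordinary class member, resolved through self at each use.
    List = list

    def lines(self, items):
        out = self.List()
        for item in items:
            out.append(str(item))
        return out


class ProfiledReport(Report):
    # Overriding the member swaps the class used at every self.List site.
    List = CountingList


plain = Report().lines([1, 2, 3])
profiled = ProfiledReport().lines([1, 2, 3])
```

The subclass changes which list class `lines` instantiates without touching the method itself, which is the capability being paid for by declaring List explicitly.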

This explicit approach to modularity is great for structuring the major components of a system. There the benefits of explicit dependency declaration outweigh the costs. You frequently want to test such components in isolation, so supplying mock dependencies is useful. You may in some cases want to supply meaningfully different “production” implementations of some dependencies. It is also beneficial for readers of the code to see the dependencies made explicit because they may not be intimately familiar with the system’s structure.
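As a sketch of what that looks like for a major component, consider the following Python example. All of the names (`OrderService`, `FakePayments`, `charge`, the `clock` callable) are made up for illustration and do not come from the paper.

```python
class OrderService:
    """Hypothetical component that declares its dependencies explicitly."""

    def __init__(self, payments, clock):
        self.payments = payments
        self.clock = clock

    def place(self, amount):
        receipt = self.payments.charge(amount)
        return {"receipt": receipt, "at": self.clock()}


class FakePayments:
    """Test double standing in for a real payment gateway."""

    def __init__(self):
        self.charged = []

    def charge(self, amount):
        # Record the call instead of talking to a real gateway.
        self.charged.append(amount)
        return f"fake-{len(self.charged)}"


# In a test, the component is assembled with mock dependencies.
fake = FakePayments()
service = OrderService(payments=fake, clock=lambda: 0)
order = service.place(25)
```

A production assembly would pass a real gateway and a real clock to the same constructor, and a reader can see from the signature alone what the component depends on.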

However, this approach is cumbersome when used to provide access to the vast vocabulary of standard libraries and de facto standard third-party utility libraries. In those cases, you are not replacing the implementations for testing. You are generally not providing alternative implementations. These libraries are widely understood by any reader of the code, and to explicitly enumerate every use is a distraction rather than a helpful guide. To make an analogy with natural language, one often clarifies a use of jargon (like, say, “applicative functor”) with a definition or reference to some external definition. That is far more likely to help a reader than distract them. But if one has to explicitly prepare the reader for terms like “the” (or “define,” or “terms”), communication is almost certainly going to be impaired.


©1999–2015 Michael Bayne