samskivert: Static Extraction and Conformance Analysis of Hierarchical Runtime Architectural Structure using Annotations

Summary
Describes a system, Scholia, for specifying object ownership in Java programs via an annotation-based type system, extracting hierarchical object-ownership graphs from said code, abstracting away architecturally insignificant details while preserving architecturally significant communications via edge lifting and summarizing, visualizing the extracted architecture, comparing the extracted architecture to a target architecture, and tracing differences of commission back to offending source lines. Details of the type system, details of the extraction and abstraction processes, case studies, and conformance measurement are also provided.
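To make the "edge lifting" idea concrete, here is a minimal sketch of how I understand it, not Scholia's actual algorithm. It assumes a hypothetical ownership tree stored as a child-to-owner map and a set of objects that stay visible in the abstracted view. Each edge between hidden objects is moved up to the nearest visible owners, and edges that collapse onto a single object are dropped. The class name, method names and object names are all invented for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EdgeLifting {
    // Walks up the (hypothetical) ownership tree until a visible object is found.
    // Returns null when neither the node nor any of its owners is visible.
    public static String visibleAncestor(Map<String, String> owner,
                                         Set<String> visible, String node) {
        String current = node;
        while (current != null && !visible.contains(current)) {
            current = owner.get(current); // null once we run out of owners
        }
        return current;
    }

    // Lifts each low-level edge to the nearest visible owners of its endpoints.
    // Edges that end up inside a single visible object are summarized away.
    public static Set<List<String>> lift(Map<String, String> owner,
                                         Set<String> visible, List<String[]> edges) {
        Set<List<String>> lifted = new LinkedHashSet<>();
        for (String[] edge : edges) {
            String from = visibleAncestor(owner, visible, edge[0]);
            String to = visibleAncestor(owner, visible, edge[1]);
            if (from != null && to != null && !from.equals(to)) {
                lifted.add(List.of(from, to));
            }
        }
        return lifted;
    }

    public static void main(String[] args) {
        // Assumed example: "cache" is owned by "Model", "listeners" by "View".
        Map<String, String> owner = Map.of("cache", "Model", "listeners", "View");
        Set<String> visible = Set.of("Model", "View");
        List<String[]> edges = List.of(
            new String[] {"listeners", "cache"},  // lifted to View -> Model
            new String[] {"cache", "Model"});     // internal to Model, dropped
        System.out.println(lift(owner, visible, edges));
    }
}
```

The point of the sketch is only that hiding an object's internals need not hide the fact that two top-level components talk to each other.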

Comments
This is a bit of a tour de force paper, being a summary of the author’s PhD work. The general idea is one in which I have keen interest. The basic problem is that many systems start with a rough architectural sketch (or less), and then proceed full speed into implementation with no particular facility for ensuring conformance with the original architecture and no easy way to see how that architecture evolves as the rubber hits the road.

Various approaches have been developed involving architecture description languages and even language extensions that enforce architectural constraints. These are all noble ideas, yet they have achieved surprisingly little currency in industry. I can’t claim to know definitively why this is, but I suspect it involves a number of factors.

The desired architecture of a system is hard to determine up front and must usually evolve with the system. Yet once the system begins to evolve, the focus shifts to the actual code and the features of the product, and architecture takes a back seat. It only comes back up when some feature is likely to have major architectural impact. This is probably natural and efficient, but it means that the likelihood that the architectural modeling tool will be pulled out and used to update the designed architecture to account for system evolution is low.

Language support for layering architectural information on top of actual code (i.e., annotations) is relatively new (and not yet supported by many popular languages). As a long history of out-of-date documentation can tell us, if something is not specified right next to the code, or by the code itself, it’s likely to be neglected. Thus maintaining a separate architectural description is probably doomed to failure. If the compiler were enforcing the architectural restrictions, the architecture would likely remain more up to date, but the compiler (or a compiler plugin) would likely only be doing that job if the architectural requirements were specified as annotations on the code. Otherwise it would be handled by some separate model checker, which requires yet more effort to integrate into a project’s build system and maintain.

This work seems to me to be close to tipping the scales in favor of a first-class representation of architectural concerns directly in the code base, enforced by the compiler and amenable to extraction by tools to generate architectural visualizations of a system as it is actually implemented. Whether or not a target architecture is maintained, having the ability to see the architecture as it evolves allows the natural spaghetti tendencies of any evolving system to be observed and kept in check.

Without having used Scholia myself on a system of reasonable size, I am probably underestimating the pain in adding and maintaining object-ownership annotations. The authors even indicate in the paper that it’s somewhat cumbersome. It remains to be seen whether some combination of ownership inference and a sensible choice of ownership defaults could be used to reduce the annotation burden to levels tolerable by real developers. Ideally one could make some use of partial annotations, but if a partially annotated system presented only a partial object-ownership graph, architecture-violating interactions might be hidden.

Though I’m not giving up on the idea of a total architectural description, I wonder if value could not be obtained through descriptions of separate aspects of a system’s architecture. Certainly there are many ways of looking at a system, and focusing on one aspect may allow useful and sound information to be obtained without huge effort by the programmer.

Take one architectural concern: separation of client and server. Often one wants to separate a code base into client code, server code and shared code. The client may make totally different assumptions about the runtime environment, use different dependency management techniques, and have different performance considerations. As such, one may wish to restrict the client to use only client and shared code and the server to use only server and shared code, and possibly place stronger restrictions on the shared code, like no implicit dependencies. This is only one architectural decomposition of a system, and you can imagine others that might slice the system by feature set or some other architectural aspect of interest.
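A rule like this is simple enough to check mechanically. The following is a rough sketch under my own assumptions, not anything from the paper: the `com.example.client.` and `com.example.server.` package prefixes are hypothetical, anything outside those two prefixes is treated as shared for simplicity, and the dependency map is assumed to come from some other tool that lists which classes each class references.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LayerRules {
    public enum Layer { CLIENT, SERVER, SHARED }

    // Hypothetical convention: the layer is decided by package prefix.
    // Anything that is neither client nor server counts as shared here.
    public static Layer layerOf(String className) {
        if (className.startsWith("com.example.client.")) return Layer.CLIENT;
        if (className.startsWith("com.example.server.")) return Layer.SERVER;
        return Layer.SHARED;
    }

    // Client may use client and shared; server may use server and shared;
    // shared may use only shared.
    public static boolean allowed(Layer from, Layer to) {
        return to == Layer.SHARED || from == to;
    }

    // Returns one "source -> target" string per forbidden dependency.
    public static List<String> violations(Map<String, List<String>> dependencies) {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            for (String target : entry.getValue()) {
                if (!allowed(layerOf(entry.getKey()), layerOf(target))) {
                    found.add(entry.getKey() + " -> " + target);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed input: class name -> names of the classes it references.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("com.example.client.Ui",
                 List.of("com.example.shared.Message", "com.example.server.Db"));
        deps.put("com.example.shared.Message", List.of("com.example.client.Ui"));
        violations(deps).forEach(System.out::println);
    }
}
```

This checks static references between classes only, which is a much weaker notion than the runtime object structure the paper is concerned with, but it shows how a single architectural aspect can be stated and verified with little effort.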

I also think that more clearly defined modularity concepts in the source language could help to bridge the gap between a giant pile of interacting objects and a high-level view of inter-module communication. There are clearly many ways to skin this cat.

Source: PDF ACM

samskivert: Static Extraction and Conformance Analysis of Hierarchical Runtime Architectural Structure using Annotations – Abi-Antoun, et al.

20 November 2009