samskivert: On Understanding Data Abstraction, Revisited: Cook

02 November 2009

Summary
It is widely assumed that Objects == ADTs (or even Objects <= ADTs). They are both forms of data abstraction, but fundamentally different. OO programming languages conflate the two and obscure the difference. Conflation useful, obscurity unfortunate. ADTs: public name, hidden representation, operations to create, combine and observe. Provide representation independence, optimizability, formal reasonability. Require unique implementation (module jockeying helps but does not allow interoperation). Objects: a record of higher-order functions (an interface). Classes are functions that create instances. Objects are autognostic: know only their own representation. Adds flexibility, reduces optimizability, drastically reduces formal reasonability, inclines programmer to expose internals to regain performance. Relationships: ADTs require static types, objects don't. Both are equivalent for non-complex operations. Extensibility problem. Imperative state and polymorphism actually orthogonal to this duality. Reality: OOP in Java, type classes in Haskell, Smalltalk. Understanding these distinctions informs the choice of which to use when.

Comments
Wow, this is a great paper! Maybe everyone in academia already understands this like the back of their hand, but as someone who was introduced to these concepts through real-world programming languages, this really clarifies a lot of fuzzy notions I’ve had about what fundamental abstractions I use when I’m programming. I now have a much better idea of the extent to which Java was influenced by Smalltalk and why Scala’s reintroduction of ADT constructs as elegant language features is so compelling (even if they’re implemented under the hood using OO mechanisms). One wants both approaches and one wants to avoid the more-typing/loss-of-clarity burden of emulating one approach in the other. There are numerous other great insights in here: OOP being fundamentally higher-order programming because objects are actually records of HOFs, OO being hard to analyze as a result, OO being less popular in academia (than ADTs) as a result of that result. I am now convinced that one of the first things that should be done in CS education (maybe not in 101) is to implement an object system in an ADT-based language (like C) and implement a proper ADT in an OO language (like Java or Smalltalk). I had done the former and it was a mind-expanding experience, I had not (explicitly) done the latter.

Source: PDF

©1999–2022 Michael Bayne