I can't count the number of times I've created overloaded methods to destructure a function
argument and call a version that takes arguments individually. This is most prevalent in code that
takes a Point or Vector, like so:
case class Vector (x :Float, y :Float)
class Transform {
  def translate (tx :Float, ty :Float) = ...
  def translate (t :Vector) = translate(t.x, t.y)
  // ...
}
but there are plenty of other situations where I end up doing the same thing. Here are some examples from Scaled:
case class Region (start :Loc, end :Loc)
class BufferV {
  // ...
  def region (start :Loc, until :Loc) :Seq[Line] = ...
  def region (r :Region) :Seq[Line] = region(r.start, r.end)
  // ...
}
abstract class Buffer extends BufferV {
  // ...
  def delete (start :Loc, until :Loc) :Seq[Line]
  def delete (r :Region) :Seq[Line] = delete(r.start, r.end)
  def replace (start :Loc, until :Loc, lines :Ordered[LineV]) :Loc
  def replace (r :Region, lines :Ordered[LineV]) :Loc = replace(r.start, r.end, lines)
  def transform (start :Loc, until :Loc, fn :Char => Char) :Loc
  def transform (r :Region, fn :Char => Char) :Loc = transform(r.start, r.end, fn)
  // ...
}
Some from a game I'm working on:
public Facility facility (Coord coord) { return _facs[index(coord)]; }
public Facility facility (int x, int y) { return _facs[index(x, y)]; }
public Stack stack (Coord coord) { return _stacks[index(coord)]; }
public Stack stack (int x, int y) { return _stacks[index(x, y)]; }
public boolean scouted (Coord coord) { return _scouted[index(coord)]; }
public boolean scouted (int x, int y) { return _scouted[index(x, y)]; }
You get the idea.
If we have ADTs (Scala calls them case classes), or value classes/structs, it seems perfectly
reasonable to automatically destructure one passed to a function which takes arguments of the
same type and in the same order.
val tx = new Transform(...)
val dt = Vector(...)
tx.translate(dt) // desugars into tx.translate(dt.x, dt.y)
This should work even if there are additional arguments to the function:
val buffer :Buffer = ...
val r = Region(start, end)
buffer.replace(r, Seq(Line("Yes"), Line("we"), Line("can")))
// desugars into buffer.replace(r.start, r.end, Seq(...))
I'm inclined to match purely on declaration order and type, because name matching seems likely to
fall prey to minor quibbles like translate taking tx and ty whereas Vector declares x and y.
However, this does restrict us to only desugaring "struct-like" types, which are unambiguously
meant to be an ordered bundle of values.
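To make the positional rule concrete, here is a standalone sketch (Vector and Transform are
simplified here so the snippet compiles on its own, and the hypothetical call is shown in a
comment); tx and ty don't match x and y by name, only by type and position:
case class Vector (x :Float, y :Float)
class Transform {
  def translate (tx :Float, ty :Float) :String = s"translate($tx, $ty)"
}
val t = new Transform
val v = Vector(1f, 2f)
// t.translate(v)        // hypothetical: would desugar to the call below, because
//                       // Vector's (Float, Float) lines up with (tx, ty) positionally
t.translate(v.x, v.y)    // the explicit form one writes today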
I would like to allow arbitrary types to opt into this behavior, not least because my Region
example up there was cheating. The actual definition of Region is:
trait Region {
  def start :Loc
  def end :Loc
  def contains (loc :Loc) :Boolean = (start <= loc) && (loc < end)
  def isEmpty = start >= end
}
For a variety of reasons, I didn't want it to be a case class, but I sure wish I didn't have to
create dozens of overloaded methods to manually destructure a Region.
I use another pattern in my Java geometry library which would fail in these circumstances as well:
interface XY {
  float x ();
  float y ();
}
class Point implements XY {
  public float x, y;
  public float x () { return x; }
  public float y () { return y; }
  // ...
  public Point set (float x, float y) { ... }
  public Point set (XY p) { return set(p.x(), p.y()); }
}
class Vector implements XY {
  public float x, y;
  public float x () { return x; }
  public float y () { return y; }
  // ...
  public Vector set (float x, float y) { ... }
  public Vector set (XY p) { return set(p.x(), p.y()); }
}
The XY interface makes it easy to express that you don't care whether your x,y pair is actually a
Point or a Vector (or any other damned thing that has an x and y), which is very useful. But how
do I know that it's OK to destructure an object that happens to implement XY?
Probably some kind of annotation like @Value which, when placed on a class or interface, means
that all public methods represent destructurable values:
@Value
interface XY {
  float x ();
  float y ();
}
and, when placed on fields/methods, means that those fields/methods comprise the value part of the class:
trait Region {
  @value def start :Loc
  @value def end :Loc
  def contains (loc :Loc) :Boolean = (start <= loc) && (loc < end)
  def isEmpty = start >= end
}
I can already imagine situations where this would cause trouble. Imagine I have two traits, both
annotated with @Value, which are equivalent from the perspective of the types of their fields and
their order:
@value trait XY {
  def x :Float
  def y :Float
}
@value trait PolarVec {
  def angle :Float
  def length :Float
}
case class Impulse (x :Float, y :Float, angle :Float, length :Float)
  extends XY with PolarVec
If I pass an Impulse to def translate (tx :Float, ty :Float), how do I know which value to use
when destructuring (and for that matter, does a value trait override the value nature of the case
class itself)?
Maybe the answer is just to fail, report the ambiguity, and be happy that the solution helps the
programmer 90% of the time. One can always be explicit and write translate(imp.x, imp.y).
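To spell out the failure mode, here is a small sketch that reuses the Impulse, XY and PolarVec
definitions above (minus the hypothetical @value annotations), with a free-standing translate for
illustration; the ambiguous call is shown in a comment:
// Both XY and PolarVec contribute a (Float, Float) shape, so an auto-destructuring
// compiler would have no principled way to pick between them.
def translate (tx :Float, ty :Float) :String = s"translate($tx, $ty)"
val imp = Impulse(1f, 2f, 0.3f, 2.2f)
// translate(imp)         // ambiguous: x/y or angle/length? reject and report it
translate(imp.x, imp.y)   // the explicit escape hatch always works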
Update: 7/27
Having slept on this little exposition, it occurred to me this morning that I should at least mention Scala's extractors and how they provide a general mechanism which could also be used to accomplish this destructuring.
In Scala, a type can have a companion singleton which defines an extractor for the type in the
form of an unapply method. For example, the Region trait above can define:
object Region {
  // ..
  def unapply (r :Region) = Some((r.start, r.end))
  // ..
}
This allows one to write code like so:
def foo (r :Region) = r match {
  case Region(Loc(0, 0), end) => // starts at zero, end is bound to r.end
  case Region(start, end) => // start is bound to r.start, end to r.end
}
def bar (r :Region) {
  val Region(start, end) = r
  // start is bound to r.start and end to r.end
}
This abstraction is great and unifies pattern matching, destructuring assignment and my proposed feature of automatic destructuring of function arguments, but in Scala it comes at substantial performance cost.
Scala is syntactically concise, which makes it very easy to violate programmer expectations. That
unapply method is actually doing:
def unapply (r :Region) = new Some(new Tuple2[Loc,Loc](r.start, r.end))
and the destructuring assignment, under the hood, looks like:
val v :Option[Tuple2[Loc,Loc]] = Region.unapply(r)
if (v.isEmpty) throw new MatchError(r)
val t :Tuple2[Loc,Loc] = v.get
val start = t._1
val end = t._2
We're wrapping the data of interest into two temporary objects and then immediately unwrapping
them. Yes, between the Scala optimizer and the JVM optimizer this profligacy may eventually be
optimized away, but it also may not, and I should not have to worry that using a handy feature like
passing a Point to a method which takes two floats might incidentally create two objects on the
heap and throw them away. Stick that into an inner loop that's doing math calculations and you've
just "abstracted away" 50% of your performance.
So the abstraction is good, the implementation is bad. I would definitely use one abstraction for pattern matching, destructuring assignment and this auto-destructuring of arguments, but I would do it in such a way that it was closer to zero cost. Scala already cheats and automatically optimizes this abstraction for its own ADT types (case classes) to dramatically reduce the cost, but I'd prefer to design the mechanism to be cheap for everyone rather than make user code a performance second class citizen.
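For what it's worth, Scala 2.11's name-based extractors already let user code shed the Option and
Tuple2 wrappers by hand. Here is a sketch for the Region trait above (assuming Scaled's Loc type),
using a value class so the wrapper itself compiles away in the typical case:
object Region {
  // The extractor result only needs isEmpty/get (plus _1/_2 for two bindings); making it
  // a value class over Region means no wrapper object is allocated at a typical use site.
  final class RegionMatch (val r :Region) extends AnyVal {
    def isEmpty = false
    def get = this
    def _1 :Loc = r.start
    def _2 :Loc = r.end
  }
  def unapply (r :Region) :RegionMatch = new RegionMatch(r)
}
But that is per-type boilerplate that every library author has to know about and remember to write,
so you'd still want the language to make cheapness the default.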
The @Value annotation described above is one ham-fisted way of accomplishing that. One could
also imagine a metaprogramming facility where you provide the moral equivalent of an unapply
also imagine a metaprogramming facility where you provide the moral equivalent of an unapply
method, but it's written in code that communicates to the compiler the number, type and order of
your type's value components so that when it sees:
val Region(start, end) = r
it can generate:
val start = r.start
val end = r.end
(Note: Scala's unapply mechanism does a bunch more stuff than destructuring of a known type into
its constituent parts, but this metaprogramming approach applies equally well to that functionality
and with equivalent improvement over the current wildly inefficient approach of wrapping everything
into Option and Seq so that the compiler knows what to do with it.)