Software | Secret Software | Writing
Other People's Arguments
Let me let you into a secret about writing technical articles. The trick that I often use to plan an article is to think of a particular technique I want to illustrate, then find a practical use of it, and then finally find a problem that the practical application solves. And here comes you trick: you then present it all back-to-front. That way you've got an article that looks like it's showing you how to solve a particular problem, and the technique you want to talk about pops up magically as the answer to all your problems in the end.
So in this article,
for instance,
the technique that I want to talk about is the little known @DB::args magic variable; the application is my recent rubyisms Perl module; the problem,
if you want to call it that,
is my recent dabbling with Ruby.
As you can probably tell from last month's column, I've grown fond of some of the features I've been using in Ruby. This is a common problem - or so I'm told - with love affairs: if and when you come back to your first love, you can't help wanting some of the things you've left behind. Thankfully, however, programming languages are a good deal easier to change than people. So the more frustrated I got with the things from Ruby that I thought Perl lacked, there more I wanted to sit down and fix them up.
The first thing I found myself missing was the super keyword.
It came up in Perl as I was specialising a Class::DBI-based module.
I had a class representing a database table,
which I could search by its two columns,
real_name and displayed_name.
But I also wanted a "magical" search term name which searched through both columns.
So,
in my sub-class,
I would say:
sub search {
my ($self, $terms) = @_;
if (exists $terms->{name}) {
# Do our special search
} else {
# Call the superclass's search.
}
}
Now, calling the superclass method is easy. We all know how to do that. Here's how you'd naturally write it in Perl:
$self->SUPER::search($terms);
However, this is what it looks like in Ruby:
super
You can probably understand why I felt spoilt by Ruby. No problem, I thought; I can find a way to do this in Perl. So I thought about what the super subroutine would have to do:
- It would have a prototype of
()so it could be called like a built-in. - It would use
callerto get at the method which called it. - It would have to somehow get at the object and the arguments to the method, so it could work out the superclass method and call it with the same arguments.
Well, the first two were pretty easy:
sub super () {
my $caller= (caller(1))[3];
$caller =~ s/.*:://;
}
But the third had me completely confused. How on earth could I retrieve my caller's subroutine arguments? Well, the obvious place to start looking was the documentation for caller, and there I found something I had never noticed before:
Furthermore, when called from within the DB package, caller
returns more detailed information: it sets the list variable
`@DB::args' to be the arguments with which the subroutine was
invoked.
Wow, just perfect. So I wrote up a little subroutine to call caller from package DB:
package DB;
sub uplevel_args { my @x = caller(2); return @DB::args }
This looks two frames up the stack (DB::uplevel_args is the zeroth frame, SUPER::super is the first, and the method that called super is the second) and returns the arguments from the method. The array is needed to stop Perl optimizing the call to caller.
So now we know how the method was called, which tells us the object.
sub super () {
my $caller= (caller(1))[3];
$caller =~ s/.*:://;
@_ = DB::uplevel_args();
my $self = $_[0];
}
Unfortunately, here it gets tricky again: We'd like to say $self->SUPER::$caller, but that gives us a "Bad name after ::" error. And we want to avoid using eval if possible. What we need is somehow to get hold of a reference to the appropriate superclass method. Well, if we know the class and the method name, we're half way there. Let's assume a putative UNIVERSAL::super subroutine which works like UNIVERSAL::can and returns a reference to the method if one is available. Then we can say:
sub super () {
my $caller= (caller(1))[3];
$caller =~ s/.*:://;
my @their_args = DB::uplevel_args();
my $self = $their_args[0];
$self->UNIVERSAL::super($caller)->(@their_args);
}
Now this is pretty clever, but it has a slight untidyness problem. Suppose we have a class Wibble::Simple which inherits from class Wibble; with our current use of super we'd see a call stack like this:
Wibble::Simple::do_it
SUPER::super
Wibble::do_it
Whereas we'd prefer to see this:
Wibble::Simple::do_it
Wibble::do_it
Now there is a way to make SUPER::super morph itself into the appropriate method: the goto &subroutine technique. Since UNIVERSAL::super - when we've written it - will return a subroutine reference, we merely need to set our @_ to be what we'd like the superclass to see, and then goto the reference:
sub super () {
my $caller= (caller(1))[3];
$caller =~ s/.*:://;
@_ = DB::uplevel_args();
my $self = $_[0];
my $supermethod = $self->UNIVERSAL::super($caller);
goto &$supermethod;
}
Right, we're done! Well, apart from the little matter of that UNIVERSAL::super method. But this isn't too much of a problem; all we need to do to work out an object's super-method is to think what Perl would do. And what Perl does is ask each member of that object's class's @ISA array if it can perform the method. We can do this with the can method, which returns a code reference if the class can perform the given method - precisely what we want!
package UNIVERSAL;
sub super {
my ($class, $method) = @_;
if (ref $class) { $class = ref $class; }
my $x;
for (@{$class."::ISA"}, "UNIVERSAL") {
return $x if $x = $_->can($method);
}
}
And with this - and a little testing and documentation - the SUPER module was born and released onto CPAN. Now I could write things like
sub search {
my ($self, $terms) = @_;
if (exists $terms->{name}) {
# Do our special search
} else {
super;
}
}
This made me happy.
But then a few more lines of code later, there was another problem. One of the nice things about Ruby's OO model as opposed to Python's and Perl's is that the recipient of a method is implicit; you don't have to say
my $self = shift;
to get it from the argument list. It's just there, and you can get at it with the self keyword.
Furthermore, you can call one method from another just by naming it, and the self is again passed around implicitly.
Here's a bit of Ruby to demonstrate this:
class Thing
def look
_print
end
def _print
puts self
end
end
foo = Thing.new
foo.look
We create a new object, and call its look method. This then calls another method, _print, implicitly passing the object around. _print receives the object, once again implicitly, before finally referencing it as self.
Now of course, we can't do this in Perl - we can't change the fact that method calls do pass around the receiver and that we need to pass the receiver to a sub-method. And we can't rewrite @_ once we know the receiver, to pretend it was never there in the first place.
But we can fake it.
Using the same @DB::args trick, we can create a subroutine which returns the first argument of its caller:
sub self () {
return (DB::uplevel_args())[0];
}
This means we can say things like:
sub look { self->_print }
sub _print { print self, "\n" }
Not a bad start. But we'd really like to be able to say:
sub look { _print }
sub _print { print self, "\n" }
That's right: even though we call print with no arguments, it should still know what the current receiver is.
Once again, being able to mess about with other subroutines' arguments comes to our aid. The key to this is realising that we don't have to look just two levels up the stack - we can look more if we want to; and as we look up the call stack, we'll eventually come to a frame that was called "properly", with the appropriate type of object as its first argument.
So, we first modify DB::uplevel_args to allow us to look up a variable number of frames:
sub uplevel_args { my @x = caller($_[0]+1); return @DB::args };
We now look up the stack until we find a subroutine whose first parameter is-a whatever class we were called by:
sub self () {
my $call_pack = (caller())[0];
my $level =1;
while (caller($level)) {
my @their_args = DB::uplevel_args($level);
if (ref $their_args[0]
and eval { $their_args[0]->isa($call_pack) }) {
return $their_args[0];
}
$level++;
}
return $call_pack;
}
We're only interested in objects which are inherited from our caller because if we have
package Thing;
sub look_to_file { my $output = new IO::File (...);
_print($output)
}
we want the recipient of _print to be the Thing, not the IO::File. So in this case, we want to ignore the argument to _print but look back at the arguments of look_to_file.
Notice also that if we don't find any object of the relevant class at any time in the recent past, we assume that we're dealing with a class method, and the self is the name of the calling package; this is a reasonable approach, and is pretty much what Ruby does:
% ruby -e 'print self;'
main
% perl -Mrubyisms -e 'print self;'
main
So now we can use an implicit self, and pass it around between methods of the same class. Very neat, no?
But we've glossed over another little detail of our Ruby example: our class Thing had a constructor, but we didn't define a new method. This is because all classes in Ruby inherit from class Class which provides a generally-good-enough default constructor and then calls the initialize method to allow us to specify the object. This is a neat idea, so I wanted to steal that too. First, we need to make everything that imports the rubyism method inherit from class Class:
sub import {
no strict 'refs';
push @{(caller())[0]."::ISA"}, "Class";
rubyisms->export_to_level(1, @_);
}
We find the calling package's package name, and slap Class onto the end of its @ISA array. And then a little Exporter trick which deserves to be better known; we want to call Exporter's import to make super and self available to calling packages as well as doing our own importish things. Now you might think that after all we've seen in this article so far we could just say:
sub import {
no strict 'refs';
push @{(caller())[0]."::ISA"}, "Class";
super;
}
to jump to our superclass. Unfortunately, that doesn't quite work; this is because Exporter's import method moves symbols from Exporter to the class calling the method - in this case rubyisms. This isn't what we want - we want to move symbols from rubyisms to whatever used it. So we call the import_to_level method, which moves symbols around at a different calling level. This does the right thing.
Now we can populate the Class class with the methods we want. A generally-good-enough constructor in Perl blesses an empty hash and calls a specializer before returning the new object:
package Class;
sub new {
my $class = shift;
my $self = bless {}, $class;
$self->initialize(@_);
return $self;
}
(Note that we can't use self here to get the recipient because in the constructor, there isn't a recipient yet!)
And we provide a dummy specializer for completeness:
sub initialize {}
So now we can rewrite our Ruby example in Rubyish Perl:
package Thing;
use rubyisms;
sub _print { print self }
sub look { _print }
my $foo = Thing->new;
$foo->look;
And I was happy again. Until I found another feature from Ruby I wanted to steal...
Now we have this technique of inspecting the caller's arguments, it's very simple to write our own keywords like super and self which depend on the properties of a subroutine. Another subroutine-specific keyword in Ruby is yield.
As we saw last month, every method in Ruby can take an optional block, and the yield keyword calls back that block. Now in Perl we don't have the same optional block syntax, but we do have something a little similar: if we give our subroutines a prototype starting with &, they behave a little like map or grep.
For instance, here's a truly simple array iterator. You might want to call it a "Visitor design pattern" if you're a Gand-of-Four devotee; if you're a Perl devotee, you might want to call it a highly redundant for loop. It simply visits each element of the array, calling a codeblock on the element:
sub each_arr (&@) {
my ($code, @array) = @_;
for (@array) {
$code->($_);
}
}
With the syntax-modifying & prototype, we can call this as follows:
each_arr { print $_[0], "\n" } (10, 20, 30);
But we'd prefer to write this in a more Ruby-ish way, like so:
sub each_arr {
for (@_) { yield }
}
The Perl way is slightly different - instead of a block at the end, we expect a block at the beginning. So, once again, we look at our caller's arguments and ensure that the first argument is a code reference. If it isn't, we give a nice Ruby-friendly error:
sub yield {
my @their_args = DB::uplevel_args(1);
if ((!@their_args) or ref $their_args[0] ne "CODE") {
croak "no block given (LocalJumpError)";
}
And now we have the code reference, we can call it on $_:
$their_args[0]->($_);
}
This is pretty good, but Ruby's yield doesn't just yield the default value. In fact, Ruby doesn't really have a "default value" equivalent to $_. yield can take arguments, and those arguments should be passed to the code reference. But this being Perl, we want to support both styles: implicit $_ and explicit arguments. So our yield subroutine ends up looking like this:
sub yield (@) {
my @their_args = DB::uplevel_args(1);
if ((!@their_args) or ref $their_args[0] ne "CODE") {
croak "no block given (LocalJumpError)";
}
my @stuff = (@_||$_);
$their_args[0]->(@stuff);
}
But there's a slight problem. If we try out our shiny new yield with the each_args example, we might see something like this:
CODE(0x10774)
10
20
30
Ee need to stress again that we're only faking it. We can't rewrite each_arr's @_ array so that the codeblock is squirrelled away for yield and doesn't appear when we call for. The code reference is going to stay part of @_ whether we like it or not. So yield needs to be a bit tricky.
The obvious way to solve this problem in the majority of cases is to simple refuse to call the code reference on itself:
$their_args[0]->(@stuff)
unless $stuff[0] == $their_args[0];
And that's By And Large Good Enough. Now our iterators work the way we expect them to.
So that was all I've wanted from Ruby so far, and the whole module, rubyisms.pm, is available from the CPAN. I'm sure that there'll be more and more features added as I keep dragging things across from Ruby.
But we've seen that with the knowledge gained from a relatively simple but relatively obscure technique - the interaction between caller and @DB::args - we can bend Perl in some interesting and extraordinary directions without mucking about with XS, the Perl internals or any other difficulty.
Sometimes, it seems, getting involved in other people's arguments isn't such a bad thing after all.