Programming

Software | Secret Software | Writing

Introduction

    map BLOCK LIST
    map EXPR, LIST

Last time, we had a look at the grep built-in function; this month, we'll have a look at its very close cousin, map.

Functional Programming and LISP

map has its roots in the functional programming language LISP. Most programming languages in common use, such as C, Java, Perl and Python, are "procedural" languages - you tell the computer what procedure to carry out. But there's another style of programming which encourages programming in terms of combining functions. While it's possible to program in this style in procedural languages, other languages like LISP, Haskell, and ML make it particularly easy.

LISP stands for LISt Processing; as its name implies, it puts a lot of emphasis on the manipulation of lists. To show the difference between functional languages and procedural languages, here's one way of adding up a list of numbers in LISP:

    (defun sum (list)
      "Add a list of numbers"
        (if (null list) 
            0 
            (+ (car list) (sum (cdr list)))
        )
    )

    (sum '(15 26 34)) ;;; Returns 75

It might help you read that if we translate it to Perl:

    sub sum {
        my @list = @_;
        # Add a list of numbers
        if (! @list) { 
            return 0;
        } else {
            return $list[0] + sum(@list[1..$#list]);
        }
   }

   sum(15, 26, 34); # Returns 75

(Can you see how that works? If not, try adding some print statements to see what it's calculating.)

Compare this with the more Perl-ish way of adding a list:

    sub sum {
        my @list  = @_;
        my $total = 0;
        # Add a list of numbers
        $total += $_ for @list;
        return $total;
    }

Here we're using a temporary variable to hold the total, and adding each element of the list to it in turn using a very compact for loop. The more functional solution didn't use a temporary variable: instead, it preferred to call another function - in this case, the function was sum itself.

Now, what has all this to do with map? Well, LISP has a function called mapcar which performs an operation on every element of a list and builds up a list of the results. Suppose we have a function double which returns twice its input. In Perl:

    sub double { my $var = $_[0]; return $var * 2 }

And in LISP:

    (defun double (var) (* var 2))

Using mapcar, we can run double on a list of numbers:

    (mapcar 
        'double 
        '(1 2 3 4)
    ) ;; Returns (2 4 6 8)

In fact, it's traditional not to use a separate function like double, but instead to use what Perl would call an anonymous subroutine:

    (mapcar 
        (lambda (var) (* var 2)) 
        '(1 2 3 4)
    ) ;; Returns (2 4 6 8)

You'll notice that we've almost got the definition of double in there, but the magic word lambda transforms that definition into something that can act like a function.

mapcar in Perl?

How would we write mapcar in Perl? It would be a subroutine which takes a list and another subroutine. This is what functional programming is all about: manipulating not just lists and scalars but also subroutines and functions.

Well, we can do this: we have to use what's called a subroutine reference as a way of passing subroutines around. To take the subroutine reference from the subroutine double we defined earlier, we say \&double. This gives us a value which "represents" what that subroutine does.

If we put that subroutine reference into a scalar, say $subref, we can call it by saying &$subref(...). For instance, to double the number 5 this way, we can say:

    sub double { my $var = $_[0]; return $var * 2 }

    $subref = \&double;

    print &$subref(5);

Since $subref is an ordinary scalar, we can pass it to a subroutine. Now we can start to write mapcar:

    sub mapcar {
        my ($subref, @list) = @_;
        my @output;
        for (@list) {
            my $value = &$subref($_);
            push @output, $value;
        }
        return @output;
    }

We take a subroutine reference and a list. We then look at each element of the list in turn, run the subroutine on it, and then put the return value from the subroutine onto a list. So:

    sub double { my $var = $_[0]; return $var * 2 }
    my $subref = \&double;
    @out = mapcar ($subref, 1, 2, 3, 4);
    print "@out\n"; # Prints "2 4 6 8"

Now, we could slightly tidy this up a bit using a compact for loop and by getting rid of some of the temporary variables:

    sub mapcar {
        my $subref = shift;
        my @output;
        push @output, &$subref($_) for @_;
        return @output;
    }

    sub double { return $_[0] * 2 }

    @out = mapcar (\&double, 1, 2, 3, 4);
    print "@out\n"; # Prints "2 4 6 8"

We can even do the lambda trick to get rid of the double routine; except in Perl, lambda is spelt sub:

    @out = mapcar ( sub { return $_[0] *2 }, 1, 2, 3, 4);

Now for the twist: we've more or less reinvented Perl's map function. map takes a block which manipulates $_, applies the block to each element of the list:

    @out = map { $_ * 2 } (1, 2, 3, 4); # No comma between block and list!

Mappings between lists

Fine. But why? What's the point of it? Well, map transforms one set of values into another; it creates a mapping between one list and another. For instance, suppose you have a hash %address which contains a database of people's addresses. You can then get a list of addresses given a list of people like this:

    @addresses = map { $address{$_} } @people;

How about turning a list of filenames into a list of URLs by combining them with your web site address?

    @urls = map { "http://www.mysite.com/~me/" . $_ } @files;

Because you can return more than one value, you can convert an array into a hash. Let's say you're writing a program for a hotel, and you have an array of rooms; you can find the occupant by room number:

    $room[123] = "Simon Cozens";
    $room[200] = "Kevin Lenzo";

    print "Who's staying in room 200? ", $room[200];

Now, of course, you get the obvious question: How do you find out which room someone is staying in if you know their name? You could go through the array, testing to see if each element contains the name you're looking for. Or you could turn the array into a hash, like this:

    my $start = 0;
    %where = map { $_ => $start++ } @room;

As we saw when dealing with grep, when you need the iterator of an array, it's best to construct a list of the array indices, like this:

    %where = map { $room[$_] => $_ } 0..$#room;

Now you can say

    print "Where's Kevin? ", $where{"Kevin Lenzo"};

(Although if you need to do this, it might be that you should have used a hash in the first place, and use reverse to turn the lookup around.)

A short foreach

Since map runs over the entire array, it's possible to use it as a short version of foreach by ignoring the return value. This is called "using map in a void context", and some people love it and others hate it. For instance, we earlier summed a list like this:

        my $total = 0;
        $total += $_ for @list;

We could also have done it like this:

        my $total = 0;
        map {$total += $_} @list;

The block {$total += $_} just happens to return the running total after adding each element in turn, but we don't bother about that: we're only doing it for the side effect of setting $total.

However, as you can see from the above, it's usually shorter, and sometimes clearer, to use for instead.

map vs. grep

You should by now have realised that map isn't all that different from grep, the function we looked at last time. Both look over each element of a list in turn, set $_ to the element and perform a block on them. The difference comes in what happens next: with grep, the element is added to the list of return values if the block tests true; with map, the return value of the block itself goes on the list.

In other words:

    grep {x} @list

is equivalent to

    map { x ? $_ : () } @list

That is, if x is true, return the element; else, return nothing.

As you know, grep supports the syntax grep EXPR, LIST; it shouldn't really be a surprise to you to learn that map supports this too. (Especially since you should have seen it at the top of this article...)

So when should you use grep and when map? Naturally, it depends on whether you want the value of the block or the value of the element, but in general, it's customary to use grep when you're looking for something, and map when you're transforming a list.

Schwartzian Transforms

The most famous (and startling) use of map is the Schwartzian Transform, a technique invented by Randal Schwartz for speeding up sorting, and we'll take a look at that next month when we investigate the sort function.

Latest articles

Development activity

This page was last checked for correctness on 2000-09-26. Contact Simon.