30 12 / 2009
Polyglot Folding (Ruby, Clojure, Scala)
fold in a nutshell
It’s common to use the fold family of higher order functions (inject, reduce, and friends, depending on the language) to “reduce” some sequence to a single value through sequential application of a function taking two arguments. It’s often used to do things like sum an array of integers. Because this usage is so common, fold is often thought of as a “destructive” or “information-losing” operation.
It doesn’t need to be. Below you’ll find examples offered up in Ruby, Scala, and Clojure that demonstrate use of an explicity supplied accumulator value. This technique allows for more powerful folds and serves as a way to more concisely write what in imperative languages would involve a loop and a temporary variable.
This post is probably more for me than for you, but I hope you enjoy it nonetheless, and I welcome comments and suggestions.
intro to folding
in Ruby
In Ruby, to find the sum of an array of numbers, one might write something like this:
[1,2,3].inject{|l,r| l+r}
In this case [1], the Proc {|l,r| l+r} is passed to the inject [2] method of the Array class. inject calls the function with arguments l = 1 and r = 2, the left-most elements in the array. The result of this application is the number 3, which is stored.
As long as elements remain in the sequence, the result of the previous application is stored and used as the “left” argument to the function, and the next element of the sequence is used as the “right” argument. This continual application of the function using the current result and next element continues left to right, until no elements remain.
So, after the result 3 stored, {|l,r| l+r} is called again with l = 3 and r = 3. The result of this call is 6, which is then returned, since no elements remain. [3]
in Scala
Scala, like Ruby, supports the notion of first-class functions and as such provides also a “reduce” method on collections like arrays.
The most Ruby-like example might be:
Array(1,2,3).reduceLeft((l,r) => l+r))
As in the Ruby example, a method taking two arguments is passed to reduceLeft and elements in the array are sequentially added together left to right, and a sum is returned when no elements remain. [4]
in Clojure
While Clojure’s syntax is much different than Ruby’s or Scala’s, the basic ingredients of folding can be recognized in this snippet:
(reduce (fn [l r] (+ l r)) [1 2 3])
Here, the function reduce is called with two arguments: a function taking two arguments, and a sequence (in this case a vector) of values on which to operate.
The idiomatic way to write this, which doesn’t require us to wrap + with an anonymous function, looks like this [5]:
(reduce + [1 2 3])
nondestructive folding
So far, the examples have assumed that the “accumulator” value - the value that is continually passed as the left argument to the function as fold moves left to right - is of the same type as the elements of the sequence. In the examples, this has been an integer. Because the accumulator is also the value that is returned by fold, all of the examples above return not an array but an integer value: the sum.
However, many implementations of “fold left” allow specifying an initial value manually. This value can be any type, including an array or sequence, as long as the function provided recognizes the left argument as such.
This allows for “nondestructive folding,” since in some sense the accumulator value now has the same ability to store information as the sequence on which fold is operating.
the running sums problem
Suppose you have a sequence of numbers like [1,2,3], and that you’d like to create an array representing the “running sum” of these numbers. That is, you’d like to create a new array, [1,3,6], that stored at each old element’s position the sum of the old element and all of the old elements to its left.
I can’t find the logs, but this was a question posed in #scala IRC a few months ago. Paul Phillips offered a solution similar to this one:
List(1,2,3).foldLeft(List(0))((xs,y) => List(xs.head + y) ++ xs).reverse.tail
wtf
in Scala
Instead of Array.reduceLeft, we’re using List.foldLeft, which takes two arguments:
- an initial value for the accumulator: List(0)
- a function taking the initial value as its first argument, and an element from the original List as its second argument: ((xs,y) => List(xs.head + y) ++ xs)
So, when foldLeft first calls ((xs,y)…, xs = List(0) and y = 1. In the function body, the statement:
xs.head + y
..adds the “head” (or first element) of xs, which is 0, to the first element of the original list, which is 1. This result is then used to create another list containing a single value, the result of this addition:
List(xs.head + y)
Then, we use List’s concatenation method, ++, to prepend the contents of this list to xs:
List(xs.head + y) ++ xs
This value, the product of appending our new list with the old one, is returned and stored by foldLeft as the new value of the accumulator. It’s subsequently passed again to ((xs,y) =>… as xs, with y = 2, the next value in the original list.
This passing, evaluation, and array-munging continues until every original element has been iterated over. The final result of foldLeft by itself:
List(1,2,3).foldLeft(List(0))((xs,y) => List(xs.head + y) ++ xs)
… is actually List(6, 3, 1, 0), because we “built” the accumulator backwards and kept its first element, 0. To get the result we want, we must reverse the list (List.reverse) and drop the 0 (List.tail). We then end up with paulp’s solution. [6]
in Ruby
Ruby’s Array class doesn’t provide a foldLeft method, but it does offer a two argument form of inject that behaves exactly the same way. It takes an initial value, and a function to execute for each element, which is expected to take the accumulator value as the first argument.
It looks like this:
[1,2,3].inject([0]){|xs,y| xs.unshift(xs.first + y) }.reverse.slice(1..-1)
Ruby’s semantics for prepending an element to an array (unshift) and for dropping the first element of an array (slice(1..-1)) differ from Scala’s, but the spirit of the code is basically the same. Like the Scala implementation, variations are possible that might be more efficient. [6]
in Clojure
Clojure, owing to Lisp syntax, is sort of the inside-out version of the previous two implementations:
(rest (reverse (reduce (fn [xs y] (cons (+ (first xs) y) xs)) '(0) [1 2 3])))
Clojure’s reduce, like Ruby’s inject, provides a form capable of taking an initial value instead of a separate function like Scala’s foldLeft.
I’m pretty new to Clojure, but I think a more idiomatic approach might be to use Clojure’s short hand function definition:
(rest (reverse (reduce #(cons (+ (first %1) %2) %1) '(0) [1 2 3])))
Like Ruby and Scala, many variations are possible, and they all depend on your particular problem and the affordances of available data structures.
For instance, in this Clojure version, we might use a vector instead of a list and shorten it up a bit:
(rest (reduce #(conj %1 (+ (last %1) %2)) [0] [1 2 3]))
I haven’t tested whether this is faster or better in any way, but it seems to work.
“summing” up
There’s a lot to “functional programming style,” and I don’t claim to be an expert. paulp’s running sums solution was the first example I’d seen of using a collection as an accumulator, but I’ve since had a lot of fun with folding in both Ruby and Clojure. It seems to be a pretty neat way to tighten up what in other languages would be explicit loops.
While fold is a functional standby and will probably be available in languages for a very long time, it’s not without criticism. In a recent presentation, Guy Steele considered foldl and foldr “slightly harmful” and emphasized the importance of a “divide and conquer” approach instead of an “accumulator” approach. The PDF of his presentation elaborates on this.
notes
- A shorter, cleverer, and somewhat mystical way to sum an array in Ruby is
[1,2,3].inject(:+)
Reg Braithwaite put together a comprehensive blog post covering the nuances of Ruby’s inject, if you’d like to learn more. - Ruby’s inject is comparable to the foldl or “fold left” of other languages, but is named in the Smalltalk tradition.
- Many languages offer also a foldr, or “fold right” function which behaves like inject except operates from right to left in the sequence.
- Like Ruby, Scala provides a shortcut:
Array(1,2,3).reduceLeft(_+_)
Because this use of reduceLeft is so common, Scala 2.8 provides a “sum” method on numeric collections:Array(1,2,3).sum
- In the two argument form, Clojure’s reduce is equivalent to apply:
(apply + [1 2 3])
- It’s possible implement this in such a way as to append, rather than prepend, results. This approach means that reverse isn’t necessary, but might be slower and/or more complicated depending on the implementation of the collection.
Permalink 2 notes