May 01 2015

How is lazy evaluation implemented in functional programming languages?

Category: UncategorizedFractalizeR @ 4:28 pm

Answer by Costya Perepelitsa:

With thunks. It's pretty straight-forward, really:

The definition of the lazy value is wrapped in a tiny wrapper function which, when called, produces the desired value. This function is not called immediately, but instead stored, to be called only when/if needed. This stored function is the "thunk".

When the value is needed, a simple check is performed:
If the thunk has never been evaluated, the thunk is forced to run, the result is stored for future reference, and then the result is returned.
But if the thunk has already been evaluated, the result has already been stored, and so it's just returned directly without evaluating the thunk again.

I'll give an example implementation in C first, so you can see how lazy values can be implemented in a language that does not support first-class functions. If you're not following the C, there's a Python implementation at the end of this answer that you'll probably find much more readable, but it's less explicit about the creation of the thunk.
(I use a somewhat uncommon feature of C here called a function pointer. If you're unfamiliar with this concept, just know that

thunk

must be a function that takes a single argument of type

void
*

and returns

void
*

. This way, an arbitrary number of arguments of arbitrary types can be passed to the thunk, and the return can be anything the client wants.)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include <stdlib.h>

typedef struct {
    void *val;                  // result, when evaluated
    int evaluated;              // true if thunk has ever been evaluated
    void *(*thunk)(void*);      // deferred computation
    void *args;                 // args to pass to deferred computation
} lazy;

// given a thunk and args to pass to it, creates the lazy type above
lazy *make_lazy(void* (*thunk)(void*), void *args) {
    lazy *l = malloc(sizeof(lazy));
    l->val = NULL;
    l->evaluated = 0;
    l->thunk = thunk;
    l->args = args;
    return l;
}

// force evaluation of the thunk
void *force(lazy *l) {
    if (!l->evaluated) {
        // the thunk will only ever be evaluated once,
        // and only when the evaluation is forced via this function
        l->val = l->thunk(l->args);
        l->evaluated = 1;
    }
    return l->val;
}

And here's an example of its use:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <stdlib.h>
#include <stdio.h>
#include "lazy.h"   // or whatever you named it

// the function to make lazy
int id(int x) {
    // the following puts() gives an obvious indication
    // that this function is being evaluated
    puts("Evaluating id!");
    return x;
}

// create the thunk; when forced, it evaluates id()
void* id_thunk(void *x_vp) {
    int *x = malloc(sizeof(int));
    *x = id(*(int*)x_vp);
    return (void*)x;
}

int main(void) {
    // create a lazily-evaluated call to "id(10)"
    int *ten_arg = malloc(sizeof(int));
    *ten_arg = 10;
    lazy *ten = make_lazy(id_thunk, (void*)ten_arg);

    puts("Evaluating lazy ten:");
    // the first time the lazy value is evaluated,
    // the thunk will be evaluated, which should
    // produce the side-effect of outputting
    // "Evaluating id!" to the console
    int *ten_evaluated = (int*)force(ten);
    printf("Result: %d\n", *ten_evaluated);

    puts("\nEvaluating lazy ten again:");
    // the second evaluation of the lazy value
    // should not cause id() to be run again,
    // so we're expecting to NOT see the
    // "Evaluating id!" message
    ten_evaluated = (int*)force(ten);
    printf("Result: %d\n", *ten_evaluated);

    return 0;
}

And when you run this program, you get the following output:

1
2
3
4
5
6
Evaluating lazy ten:
Evaluating id!
Result: 10

Evaluating lazy ten again:
Result: 10

You can see from this output that the

id()

function is not evaluated before the call to

force

; otherwise, the message "Evaluating id!" would have been above the first line that said "Evaluating lazy ten:".
You can further see that the

id()

function is only evaluated once, because the "Evaluating id!" message does not appear again when the lazy value is evaluated a second time.

If C makes you queasy, here's a re-implementation in Python:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
class Lazy:
    def __init__(self, f, args):
        self.evaluated = False
        self.value = None
        self.f = f
        self.args = args

    def force(self):
        if not self.evaluated:
            self.value = self.f(self.args)
            self.evaluated = True
        return self.value

And example use in Python:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def identity(x):
    print "Evaluating identity()!"
    return x

if __name__ == '__main__':
    lazy_ten = Lazy(identity, 10)

    print "Evaluating lazy ten:"
    result = lazy_ten.force()
    print "Result: " + str(result)
    print
    print "Evaluating lazy ten again:"
    result = lazy_ten.force()
    print "Result: " + str(result)

The output is the same as the C example:

1
2
3
4
5
6
Evaluating lazy ten:
Evaluating identity()!
Result: 10

Evaluating lazy ten again:
Result: 10

How is lazy evaluation implemented in functional programming languages?

Leave a Reply

You must be logged in to post a comment. Login now.