[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Local definitions inside list comprehensions
Original-Via: uk.ac.ox.prg; Thu, 24 Jan 91 09:42:42 GMT
Date: Thu, 24 Jan 91 09:49:37 GMT
From: Mark.Jones@prg.oxford.ac.uk
To: haskell@cs.glasgow.ac.uk
Subject: Local definitions inside list comprehensions
Sender: haskell-request@cs.glasgow.ac.uk
I'd like to propose a small (but useful) extension to the syntax of
Haskell list comprehensions. I don't know if this is a new idea, but
I haven't seen anything along these lines before. My apologies if this
article has turned out to be rather long.
A NEW FORM OF LIST COMPREHENSION QUALIFIER:
I would like to suggest that the syntax for Haskell list comprehensions be
extended to allow qualifiers of the form pat = exp introducing new bindings
for the variables in pat from the value exp.
To properly define this new form of qualifier we simply augment the rules
on p.14 of the Haskell report for transforming list comprehensions:
[ e | p <- l ] = map (\p -> e) l
[ e | b ] = if b then [e] else []
[ e | q1, q2 ] = concat [ [ e | q2 ] | q1 ]
etc...
with:
[ e | p = v ] = [ e where { p = v } ]
(Incidentally, I wonder if the report should also say that [ e | ] = [e]?)
In other words, it shouldn't be too difficult for Haskell implementors to
include this extension.
MOTIVATION:
One of the weaknesses in the otherwise powerful notation of list
comprehensions is the inability to make local definitions depending
on values introduced by the generators in the comprehension.
As an example, consider an expression of the form:
[ f (g x) y | x<-xs, p (g x), y<-h x ]
The fact that the expression (g x) appears twice in this comprehension
causes two problems:
- Clarity: if (g x) is a (textually) large expression, then it may not
be obvious to a reader of the program that the two expressions are
in fact the same.
- Efficiency: Since the value of (g x) will be the same however many
times we calculate it, we would prefer to restrict ourselves to a
single evaluation.
Of course, there are already several ways of rewriting this expression
to avoid repeating the expression (g x). Here are a few:
(1) [ f gx y | x <- xs, gx <- [g x], p gx, y <- h x ]
(2) [ f gx y | (x,gx) <- [ (x, g x) | x <- xs ], p gx, y <- h x ]
(3) [ res | x <- xs, res <- [ f gx y | p gx, y <- h x ] where gx = g x ]
(4) concat [ [ f gx y | p gx, y <- h x] where gx = g x | x <- xs ]
In my opinion, none of these are really satisfactory: Although they avoid
repeated evaluation of (g x), only (1) has anything approaching the clarity
that we would like. My reasons for disliking (1) are:
- Consistency: Although not significant in this case, the binding
gx <- [g x] is explicitly non-recursive (i.e. a let rather than a
letrec), whereas all other binding mechanisms in Haskell are
(implicitly) recursive.
- Philosophical: A list structure is used to represent collections
of objects of varying sizes. Why use a list [g x] when we know
that it will always contain exactly one element?
- Efficiency: Construction of the list [g x] followed by its immediate
destruction is likely to incur a (perhaps small) performance cost.
Using the pat=exp qualifier, the original list comprehension can be written:
[ f gx y | x<-xs, gx=g x, p gx, y<-h x ]
which seems to solve both the clarity and efficiency issues raised above.
This new form of qualifier is also a valuable aid to clarity, even when
repeated expressions are not an issue. As an example, here is a recoding
(using the `list of successes' technique) of a function defined on pp.179-180
of `The Implementation of Functional Programming languages' as part of a
polymorphic type checker. In my opinion, the resulting program is much
easier to follow than the original:
tclet gamma ns xs es e = [ (phi' @@ phi, t)
| (phi,ts) <- tcl gamma ns0 es,
gamma' = sub_te phi gamma,
gamma'' = add_decls gamma' ns1 xs ts,
(phi',t) <- tc gamma'' ns2 e ]
where (ns0:ns1:ns2:_) = manySplit ns
Any comments?
Mark