Rust Closures: Returning `impl Fn` for `move` closures
Problem setup
Recently I've had to write some `nom` code (it is a parser combinator library for Rust). To my surprise, I discovered that there is no combinator for creating a parser that always succeeds and returns a given value, at least not without using macros, which are discouraged in nom v5. That combinator would be something like `pure` in Haskell's Parsec. It's not very useful on its own, but it can be used as part of other combinators, say to provide a default alternative for `alt`.
So I decided to add `success` to nom. After looking at the library code, I realised that it uses closures quite heavily, and I hadn't used them much in Rust, so I had some questions. Here is my version of `success`, basically copy-pasted from a similar combinator, `value`:
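A sketch of what such a combinator looks like is given here. So that it compiles without the crate, `ParseError` and `IResult` are simplified stand-ins assumed for illustration; they are not nom's real definitions.

```rust
// Stand-ins for nom's types (assumptions, much simpler than the real ones).
pub trait ParseError<I> {}
impl<I> ParseError<I> for () {}
pub type IResult<I, O, E> = Result<(I, O), E>;

// A parser that consumes nothing and always succeeds with a copy of `val`.
pub fn success<I, O: Clone, E: ParseError<I>>(val: O) -> impl Fn(I) -> IResult<I, O, E> {
    move |input: I| Ok((input, val.clone()))
}
```

Calling `success::<&str, i32, ()>(10)` gives a parser that returns the input untouched together with `10`, however many times it is called.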
That type signature looks a little scary, eh? However, since we are not going to focus on the `I` or `E` arguments here (the input and error types), we can just rewrite it like this, omitting irrelevant details:
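One way to write that simplified form, assuming the input is fixed to `&str` and the error type to `()` (both choices are mine, made only to keep the sketch short):

```rust
// Simplified sketch: `&str` input and `()` error are illustrative assumptions.
fn success<O: Clone>(val: O) -> impl Fn(&str) -> Result<(&str, O), ()> {
    // `val` is moved into the closure; each call hands out a fresh clone.
    move |input| Ok((input, val.clone()))
}
```

The three ingredients discussed below are all visible here: the `Clone` bound, the `move` keyword and the `impl Fn` return type.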
Questions
I had three questions here:
1. Why do we need to clone `val`? After all, it looks like I have a value and just want to pass the ownership to the parser, so there is no need to clone anything.
2. Why do we have a `move` closure, but the return type of the function is `impl Fn(something)` and not `impl FnOnce(something)`? I thought that when we use `move`, we move the captured environment into the closure, and the `FnOnce` trait matches that behaviour.
3. Can we omit `move`, or change the type to `FnOnce`, or remove `Clone`, i.e. remove any of those things which I didn't understand and still make it work? Are they actually necessary?
TL;DR: `move` determines how captured variables are moved into the returned closure. The returned `impl Fn`/`FnMut`/`FnOnce` then puts restrictions on how they are used inside that closure (which in turn determines whether the closure can be called once or many times). We can `move` values into a closure but still only use the captured values by reference, and return `impl Fn` to allow multiple calls of the returned closure. And yes, everything in the code above was necessary :)
More detailed answers
I assume here that you know the basics about closures. If not, you can read the corresponding chapter in the Rust book. On top of that, I would recommend reading Steven Donovan's post "Why Rust Closures are (Somewhat) Hard".
That post (and the Rust reference) tells you that a closure basically corresponds to an anonymous structure of some concrete but unknown type, with fields corresponding to the captured variables. The capture mode of those fields (i.e. whether they are `&T`, `&mut T` or `T`) is determined by how the captured variables are used inside the closure. Or it can be forced to be `T`, i.e. passing the ownership to the closure, by using the `move` keyword.
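The three capture modes can be seen side by side in a small sketch. The function and variable names are hypothetical, chosen only for this illustration.

```rust
fn capture_modes() -> (usize, String, usize) {
    let shared = String::from("abc");
    let mut log = String::new();
    let owned = String::from("wxyz");

    // The body only reads `shared`, so it is captured as `&String`.
    let by_ref = || shared.len();
    // The body mutates `log`, so it is captured as `&mut String`.
    let mut by_mut = || log.push('!');
    // The body only reads `owned`, but `move` forces capture by value.
    let by_move = move || owned.len();

    let a = by_ref();
    by_mut();
    by_mut();
    let c = by_move();

    // `log` is usable again once `by_mut` is no longer used;
    // `owned` now lives inside `by_move` and cannot be used here.
    (a, log, c)
}
```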
I'll repeat the code above for your convenience:
So, in our case we implicitly have the following structure for our `move` closure; it captures only a single variable, `val`, of generic type `O`:
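Written out by hand, such a structure might look like this. The name `SuccessClosure` is hypothetical, since the type the compiler generates is anonymous.

```rust
// Hypothetical name for the anonymous closure type.
#[allow(dead_code)]
struct SuccessClosure<O> {
    // An owned `O` rather than `&O`, because the closure is a `move` closure.
    val: O,
}
```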
Then we know that this closure should implement the `Fn` trait, since this is what is returned from the `success` function. As described in the documentation, it will look like this (note the helpful comment):
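Implementing `Fn` by hand needs unstable compiler features, so the sketch below is only an analogue on stable Rust: an ordinary `call` method with the same `&self` receiver that `Fn::call` uses. `SuccessClosure`, the `&str` input and the `()` error type are all assumptions made for illustration.

```rust
// Hypothetical stand-in for the anonymous closure type.
struct SuccessClosure<O> {
    val: O,
}

impl<O: Clone> SuccessClosure<O> {
    // Same receiver as `Fn::call`: a shared reference to the closure state.
    fn call<'a>(&self, input: &'a str) -> Result<(&'a str, O), ()> {
        // `self.val` sits behind `&self`, so it cannot be moved out, only cloned.
        Ok((input, self.val.clone()))
    }
}
```

Because `call` only borrows `self`, the same value can be called any number of times, which is exactly what `Fn` promises.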
Now we can answer question 1: why do we have to clone? If we didn't clone, we would be moving out of `self.val`, which is behind a reference. So we can't do that for `Fn`.
Another way of resolving this issue (i.e. if we don't want to clone) would be to use `FnOnce` as the result type, which would in effect give us `self` instead of `&self` in the call method, so we could pass the ownership further with `Ok((input, self.val))`. However, using `FnOnce` means that we can call the resulting closure just once, and perhaps that's not what we want in a parser. Why? I don't know for sure, but I suspect that we may want to do certain lookaheads while parsing, which means invoking the closure and then backtracking if it doesn't match. But this means we would have consumed the parser closure and couldn't use it again for another run. So let's assume that our parsers should be `Fn`s, which is indeed nom's API.
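A sketch of that `FnOnce` alternative, again assuming `&str` input and a `()` error type. The name `success_once` is hypothetical and is not part of nom.

```rust
// No `Clone` bound needed: the closure gives away `val` on its single call.
fn success_once<O>(val: O) -> impl FnOnce(&str) -> Result<(&str, O), ()> {
    move |input| Ok((input, val))
}
```

The returned closure is consumed by its first call, so a second call on the same value is rejected at compile time.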
Question 2, how `move` can coexist with returning `Fn`, is also clear now. `move` determines how values are captured into that closure structure, i.e. which type they have there (refs, mutable refs or owned values), while the `Fn`/`FnOnce`/`FnMut` trait is determined by the way they are used in that closure (in our `call` method above). For example, we can `move` values into a closure but still only use the captured values by reference, and return `impl Fn` to allow multiple calls of the returned closure.
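To make the distinction concrete, here are two `move` closures that capture the same kind of value but use it differently. Both function names are hypothetical examples, unrelated to nom.

```rust
// `name` is moved in but only read, so the closure implements `Fn`.
fn make_greeter(name: String) -> impl Fn(&str) -> String {
    move |greeting| format!("{}, {}!", greeting, name)
}

// `name` is moved in and then moved out by the body, so only `FnOnce` fits.
fn into_name(name: String) -> impl FnOnce() -> String {
    move || name
}
```

Both closures own their `String`. Only the way the body uses it decides which trait the closure gets.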
The remaining part of question 3 is why we want to use a `move` closure here if we only access the variable by reference anyway. The explanation is that we are returning this closure, but `val` will be dropped immediately on return and the closure can't outlive it. In other words, if we didn't use `move`, we would have `val: &'a O` in that closure structure field. But that reference would become invalid as soon as we return our closure, because `val` is dropped at that point, so no references to it are allowed, including inside our returned closure. So that won't work and we would get a "val does not live long enough" error message.
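The same situation in miniature, using a hypothetical `make_adder` function. The variant without `move` is left as a comment because the compiler rejects it.

```rust
// Rejected by the compiler: the closure would only hold `&n`,
// and `n` is gone as soon as the function returns.
//
// fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
//     |x| x + n
// }

// Accepted: `move` stores `n` itself inside the returned closure.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}
```

Even for a `Copy` type like `i32` the `move` is required, since without it the closure still captures a reference to a local variable.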
Therefore, it looks like all the bells and whistles in that `success` combinator were actually needed.
Summary
I think the main conclusion of this investigation is that explicitly writing out implicit structures and Fn/FnMut/FnOnce implementations is very useful for making sense of compiler errors and understanding what's going on under the hood in the world of closures. At least for the first time.
Further reading
Rust reference: "Closure expressions", "Closure types" (it even has a special note about the combination of `move` closures and `Fn` here).
"Why Rust Closures are (Somewhat) Hard" blog post
Update: as /u/CUViper has suggested on Reddit, there is also a very nice blog post "Closures: Magic Functions" which meticulously demonstrates the desugaring of various types of closures.