Published on to joshleeb's blog
Procedural macros are a really powerful language feature in Rust and something I haven’t seen in many other languages.
There are a heap of tutorials out there for procedural macros, including in The Rust Reference, and the first edition of the Rust Book. One of the more entertaining (and useful) posts is by Zach Mitchell where you get to “learn Rust procedural macros with Nic Cage”.
I won’t go into depth about what procedural macros are and why they’re so powerful. But basically they allow you to tell the compiler to take in some code, analyse it, and generate some more code. To me, that sounds pretty powerful already.
VariantEq
I recently put up a crate called VariantEq, which exposes a custom derive procedural macro that is also called VariantEq.
Custom derive macros are used with derive, like #[derive(Debug, ...)], above your struct or enum. The job of such a macro is usually to implement a trait for you! In this case it’s the Debug trait.
Examples are better than explanations, so this is what deriving VariantEq on your enum will allow you to do.
#[macro_use]
extern crate varianteq;

#[derive(Debug, VariantEq)]
enum E {
    A(i32),
    B(i32),
    C(u32, bool),
}

fn main() {
    assert_eq!(E::A(1), E::A(1));
    // Same variant, different field values: still equal.
    assert_eq!(E::A(1), E::A(2));
    // Different variants are never equal, whatever the fields hold.
    assert_ne!(E::A(1), E::B(1));
    assert_ne!(E::A(1), E::C(1, false));
}
In short, it implements the PartialEq and Eq traits so that only the variant is compared and the variant’s fields are ignored.
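To make that concrete, here is a hand-written sketch of roughly what such an implementation could look like for the enum above. It is only an illustration using the standard library, not the crate's actual generated output.

```rust
// Hand-written stand-in for what #[derive(VariantEq)] provides.
// This is a sketch, not the code the crate actually generates.
#[derive(Debug)]
enum E {
    A(i32),
    B(i32),
    C(u32, bool),
}

impl PartialEq for E {
    fn eq(&self, other: &Self) -> bool {
        match (self, other) {
            // Same variant on both sides: equal, whatever the fields hold.
            (E::A(_), E::A(_)) => true,
            (E::B(_), E::B(_)) => true,
            (E::C(_, _), E::C(_, _)) => true,
            // Any other pairing mixes two different variants.
            _ => false,
        }
    }
}

// Eq has no methods; it only marks the equality as a full equivalence relation.
impl Eq for E {}
```

Writing this by hand gets tedious as the enum grows, which is the kind of boilerplate a derive macro is meant to remove.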
With all the tutorials and examples around on the web I thought I would have no trouble implementing this. But it turns out the recommended way of implementing these macros had changed recently. The docs on the web hadn’t been fully updated (or maybe I just couldn’t find up-to-date examples), so it became a bit harder than I thought.
In any case this can be yet another example of using procedural macros in Rust.
Exposing the macro
A good place to start is by exposing the macro you are creating. This will tell the compiler what to run when you use #[derive(...)].
To find a good example of how to do this with proc_macro2, I ended up looking through the diesel_derives source, which sets out the code to do this pretty nicely.
First we define the entry point varianteq_derive. This function carries an attribute of its own, #[proc_macro_derive(VariantEq)], which marks it as the function to call whenever we #[derive(VariantEq)].
#[proc_macro_derive(VariantEq)]
pub fn varianteq_derive(tokens: TokenStream) -> TokenStream {
    expand_derive(tokens, varianteq::derive)
}
The expand_derive function is fairly straightforward. It takes the TokenStream from proc_macro, converts it into a proc_macro2::TokenStream, parses it, and then calls our derive function, in this case varianteq::derive.
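// DeriveFn and handle_derive_err are defined elsewhere (not shown here).
fn expand_derive(tokens: TokenStream, derive: DeriveFn) -> TokenStream {
    // Convert into a proc_macro2::TokenStream and parse it; panics on invalid input.
    let item = parse2(tokens.into()).unwrap();
    match derive(item) {
        // Convert the generated tokens back into a proc_macro::TokenStream.
        Ok(tokens) => tokens.into(),
        Err(err) => handle_derive_err(err),
    }
}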
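If the control flow is easier to follow without compiler types in the way, here is a toy model of it that uses plain strings in place of token streams. Everything in it is a stand-in I made up for illustration: Item, parse, derive_variant_eq, and in particular the behaviour of handle_derive_err, which the snippet above doesn't show.

```rust
// Toy model of the expand_derive flow, with plain strings standing in for
// token streams. Every name and behaviour here is a hypothetical stand-in.
struct Item {
    name: String,
    is_enum: bool,
}

// Plays the role of DeriveFn: turns a parsed item into generated code.
type DeriveFn = fn(Item) -> Result<String, String>;

// Stand-in for parse2: accepts input of the form "enum Name" or "struct Name".
fn parse(input: &str) -> Result<Item, String> {
    let mut words = input.split_whitespace();
    match (words.next(), words.next()) {
        (Some(kind), Some(name)) if kind == "enum" || kind == "struct" => Ok(Item {
            name: name.to_string(),
            is_enum: kind == "enum",
        }),
        _ => Err(format!("cannot parse `{}`", input)),
    }
}

// Assumed behaviour: report the error as a compile_error! invocation.
fn handle_derive_err(err: String) -> String {
    format!("compile_error!({:?});", err)
}

// Same shape as the real function: parse, call the derive function, then
// either return its output or hand the error to handle_derive_err.
fn expand_derive(input: &str, derive: DeriveFn) -> String {
    let item = parse(input).unwrap();
    match derive(item) {
        Ok(code) => code,
        Err(err) => handle_derive_err(err),
    }
}

// A derive function with the signature that DeriveFn expects.
fn derive_variant_eq(item: Item) -> Result<String, String> {
    if item.is_enum {
        Ok(format!("impl Eq for {} {{}}", item.name))
    } else {
        Err("VariantEq only supports enums".to_string())
    }
}
```

Passing the derive function in as a parameter keeps expand_derive reusable, so a crate with several derives can share the parsing and error handling.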
Proc Macro 2
From alexcrichton/proc-macro2:
proc_macro2 is a small shim over the proc_macro crate in the compiler intended to multiplex the current stable interface and the upcoming richer interface.
Deriving the Macro
Within src/varianteq.rs we have the derive function that was being called earlier. In the real world this could just be a function that generates an implementation of PartialEq using the mem::discriminant function. But that wouldn’t make for a very interesting procedural macro, so instead we’ll assume that mem::discriminant doesn’t exist in the stdlib. So then, the logic can be broken up into three stages.
First, we gather information from the DeriveInput, which was parsed out of the TokenStream earlier on. For VariantEq specifically we just need the enum identifier and the variants of the enum.
Next, we construct the list of variants. This is essentially a mapping of each enum variant into our EnumVariant type, which can be used to generate tokens. More on that soon.
Finally, we generate our tokens with the quote! macro. This macro takes Rust code and turns it into the Tokens that we need to give back to the compiler. This saves us from manually specifying each individual token of the code to generate.
This last part is fairly straightforward, but it has one line which is a bit mystifying. The #(#enum_variants => true,)* is special syntax used by the quote! macro: #enum_variants interpolates a variable from the surrounding scope into the generated code, and the #( ... )* wrapper repeats its body.
For a specific explanation of what this line, and similar syntax, does, here is what the quote docs say on interpolation:
This iterates through the elements of any variable interpolated within the repetition and inserts a copy of the repetition body for each one.
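The quote docs describe this repetition as similar to macro_rules!, so a declarative macro makes a handy analogy that needs nothing beyond the standard library. The sketch below is not how VariantEq works. The macro name impl_variant_eq and the enum Shape are made up for illustration. Each pattern passed in gets its own copy of the repetition body, much like each element of enum_variants does.

```rust
// Analogy only: macro_rules! repetition standing in for quote!'s #( ... )*.
macro_rules! impl_variant_eq {
    ($ty:ty { $($variant:pat),* }) => {
        impl PartialEq for $ty {
            fn eq(&self, other: &Self) -> bool {
                match (self, other) {
                    // One copy of this arm is emitted per pattern supplied.
                    $( ($variant, $variant) => true, )*
                    _ => false,
                }
            }
        }

        impl Eq for $ty {}
    };
}

#[derive(Debug)]
enum Shape {
    Point,
    Circle(u32),
    Rect { w: u32, h: u32 },
}

// The caller lists one field-ignoring pattern per variant.
impl_variant_eq!(Shape { Shape::Point, Shape::Circle(..), Shape::Rect { .. } });
```

The difference is that here the caller has to list the patterns by hand, whereas a procedural macro can inspect the enum and work them out for itself.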
Our Own EnumVariant
Now back to the EnumVariant type I mentioned earlier, set out in src/token.rs. This struct is just an abstraction to make it easier to use that special interpolation syntax in the quote! macro.
The important bit is that EnumVariant implements the ToTokens trait, which defines how it gets generated into tokens.
Let’s say we have this enum:
enum E {
    A,           // Unit variant.
    B(i32, i32), // Unnamed variant with 2 fields.
    C { x: i32 }, // Named variant.
}
The ToTokens implementation for EnumVariant will spit out the tokens for this Rust code, generating a different line for each variant based on the variant type:
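match (self, other) {
    (E::A, E::A) => true,
    (E::B(_, _), E::B(_, _)) => true,
    (E::C{..}, E::C{..}) => true,
    // Catch-all for mismatched variants. Assumed here: the match needs it to
    // compile, and it would come from the surrounding quote! block.
    _ => false,
}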
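To show how one pattern per variant kind can be produced, here is a small sketch of the same idea that renders strings instead of tokens. It is not the code from src/token.rs. The Fields enum, the struct fields, and the method names are all hypothetical, and the real implementation writes tokens through ToTokens rather than building a String.

```rust
// Sketch of the idea behind EnumVariant, rendering strings instead of tokens.
// The type layout and method names are hypothetical, not the crate's own.
enum Fields {
    Unit,
    Unnamed(usize), // Number of positional fields.
    Named,
}

struct EnumVariant {
    enum_name: String,
    variant_name: String,
    fields: Fields,
}

impl EnumVariant {
    // A pattern that matches this variant while ignoring its fields.
    fn pattern(&self) -> String {
        let path = format!("{}::{}", self.enum_name, self.variant_name);
        match self.fields {
            Fields::Unit => path,
            // One `_` placeholder per positional field.
            Fields::Unnamed(n) => format!("{}({})", path, vec!["_"; n].join(", ")),
            Fields::Named => format!("{}{{..}}", path),
        }
    }

    // The left-hand side of a match arm: the same pattern for self and other.
    fn pattern_pair(&self) -> String {
        format!("({}, {})", self.pattern(), self.pattern())
    }
}
```

Keeping the per-variant logic in its own type means the template that assembles the match only has to repeat one value per variant.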
Wrapping Up
Running back up the function calls, this output from EnumVariant::to_tokens is plugged back into the quote! block defined in varianteq::derive.
Now we have PartialEq and Eq implemented for the enum that derived VariantEq. So we turn that Rust code into Tokens and send it back to the compiler.
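For comparison, the mem::discriminant shortcut I set aside earlier can be sketched by hand in a few lines. This is an illustration of the standard library function, not what the crate generates.

```rust
use std::mem;

#[derive(Debug)]
enum E {
    A(i32),
    B(i32),
    C(u32, bool),
}

impl PartialEq for E {
    fn eq(&self, other: &Self) -> bool {
        // discriminant identifies which variant a value is and ignores its fields.
        mem::discriminant(self) == mem::discriminant(other)
    }
}

impl Eq for E {}
```

This version has no per-variant match arms at all, so there would be nothing for the macro to generate beyond the impl itself.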