src/macro-expansion.md - rust-lang/rustc-dev-guide - Git at Google

 # Macro expansion

 <!-- toc -->

 > N.B. [`rustc_ast`], [`rustc_expand`], and [`rustc_builtin_macros`] are all
 > undergoing refactoring, so some of the links in this chapter may be broken.

 Rust has a very powerful macro system. In the previous chapter, we saw how
 the parser sets aside macros to be expanded (using temporary [placeholders]).
 This chapter is about the process of expanding those macros iteratively until
 we have a complete [*Abstract Syntax Tree* (AST)][ast] for our crate with no
 unexpanded macros (or a compile error).

 [ast]: ./ast-validation.md
 [`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
 [`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
 [`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
 [placeholders]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/placeholders/index.html

 First, we discuss the algorithm that expands and integrates macro output into
 ASTs. Next, we take a look at how hygiene data is collected. Finally, we look
 at the specifics of expanding different types of macros.

 Many of the algorithms and data structures described below are in [`rustc_expand`],
 with fundamental data structures in [`rustc_expand::base`][base].

 Also of note, `cfg` and `cfg_attr` are treated specially from other macros, and are
 handled in [`rustc_expand::config`][cfg].

 [`rustc_expand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/index.html
 [base]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/index.html
 [cfg]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/config/index.html

 ## Expansion and AST Integration

 Firstly, expansion happens at the crate level. Given a raw source code for
 a crate, the compiler will produce a massive AST with all macros expanded, all
 modules inlined, etc. The primary entry point for this process is the
 [`MacroExpander::fully_expand_fragment`][fef] method. With few exceptions, we
 use this method on the whole crate (see ["Eager Expansion"](#eager-expansion)
 below for more detailed discussion of edge case expansion issues).

 [`rustc_builtin_macros`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_builtin_macros/index.html
 [reb]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/build/index.html

 At a high level, [`fully_expand_fragment`][fef] works in iterations. We keep a
 queue of unresolved macro invocations (i.e. macros we haven't found the
 definition of yet). We repeatedly try to pick a macro from the queue, resolve
 it, expand it, and integrate it back. If we can't make progress in an
 iteration, this represents a compile error.  Here is the [algorithm][original]:

 [fef]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.MacroExpander.html#method.fully_expand_fragment
 [original]: https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049

 1. Initialize a `queue` of unresolved macros.
 2. Repeat until `queue` is empty (or we make no progress, which is an error):
    1. [Resolve](./name-resolution.md) imports in our partially built crate as
       much as possible.
    2. Collect as many macro [`Invocation`s][inv] as possible from our
       partially built crate (`fn`-like, attributes, derives) and add them to the
       queue.
    3. Dequeue the first element and attempt to resolve it.
    4. If it's resolved:
       1. Run the macro's expander function that consumes a [`TokenStream`] or
          AST and produces a [`TokenStream`] or [`AstFragment`] (depending on
          the macro kind). (A [`TokenStream`] is a collection of [`TokenTree`s][tt],
          each of which are a token (punctuation, identifier, or literal) or a
          delimited group (anything inside `()`/`[]`/`{}`)).
          - At this point, we know everything about the macro itself and can
            call [`set_expn_data`] to fill in its properties in the global
            data; that is the [hygiene] data associated with [`ExpnId`] (see
            [Hygiene][hybelow] below).
       2. Integrate that piece of AST into the currently-existing though
          partially-built AST. This is essentially where the "token-like mass"
          becomes a proper set-in-stone AST with side-tables. It happens as
          follows:
          - If the macro produces tokens (e.g. a proc macro), we parse into
            an AST, which may produce parse errors.
          - During expansion, we create [`SyntaxContext`]s (hierarchy 2) (see
            [Hygiene][hybelow] below).
          - These three passes happen one after another on every AST fragment
            freshly expanded from a macro:
            - [`NodeId`]s are assigned by [`InvocationCollector`]. This
              also collects new macro calls from this new AST piece and
              adds them to the queue.
            - ["Def paths"][defpath] are created and [`DefId`]s are
              assigned to them by [`DefCollector`].
            - Names are put into modules (from the resolver's point of
              view) by [`BuildReducedGraphVisitor`].
       3. After expanding a single macro and integrating its output, continue
          to the next iteration of [`fully_expand_fragment`][fef].
    5. If it's not resolved:
       1. Put the macro back in the queue.
       2. Continue to next iteration...

 [`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
 [`BuildReducedGraphVisitor`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/build_reduced_graph/struct.BuildReducedGraphVisitor.html
 [`DefCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/def_collector/struct.DefCollector.html
 [`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
 [`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
 [`InvocationCollector`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.InvocationCollector.html
 [`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
 [`set_expn_data`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.LocalExpnId.html#method.set_expn_data
 [`SyntaxContext`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
 [`TokenStream`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
 [defpath]: hir.md#identifiers-in-the-hir
 [hybelow]: #hygiene-and-hierarchies
 [hygiene]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
 [inv]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/struct.Invocation.html
 [tt]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/enum.TokenTree.html

 ### Error Recovery

 If we make no progress in an iteration we have reached a compilation error
 (e.g. an undefined macro). We attempt to recover from failures (i.e.
 unresolved macros or imports) with the intent of generating diagnostics.
 Failure recovery happens by expanding unresolved macros into
 [`ExprKind::Err`][err] and allows compilation to continue past the first error
 so that `rustc` can report more errors than just the original failure.

 [err]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/enum.ExprKind.html#variant.Err

 ### Name Resolution

 Notice that name resolution is involved here: we need to resolve imports and
 macro names in the above algorithm. This is done in
 [`rustc_resolve::macros`][mresolve], which resolves macro paths, validates
 those resolutions, and reports various errors (e.g. "not found", "found, but
 it's unstable", "expected x, found y"). However, we don't try to resolve
 other names yet. This happens later, as we will see in the chapter: [Name
 Resolution](./name-resolution.md).

 [mresolve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/macros/index.html

 ### Eager Expansion

 _Eager expansion_ means we expand the arguments of a macro invocation before
 the macro invocation itself. This is implemented only for a few special
 built-in macros that expect literals; expanding arguments first for some of
 these macro results in a smoother user experience.  As an example, consider
 the following:

 ```rust,ignore
 macro bar($i: ident) { $i }
 macro foo($i: ident) { $i }

 foo!(bar!(baz));
 ```

 A lazy-expansion would expand `foo!` first. An eager-expansion would expand
 `bar!` first.

 Eager-expansion is not a generally available feature of Rust.  Implementing
 eager-expansion more generally would be challenging, so we implement it for a
 few special built-in macros for the sake of user-experience.  The built-in
 macros are implemented in [`rustc_builtin_macros`], along with some other
 early code generation facilities like injection of standard library imports or
 generation of test harness. There are some additional helpers for building
 AST fragments in [`rustc_expand::build`][reb]. Eager-expansion generally
 performs a subset of the things that lazy (normal) expansion does. It is done
 by invoking [`fully_expand_fragment`][fef] on only part of a crate (as opposed
 to the whole crate, like we normally do).

 ### Other Data Structures

 Here are some other notable data structures involved in expansion and
 integration:
 - [`ResolverExpand`] - a `trait` used to break crate dependencies. This allows the
   resolver services to be used in [`rustc_ast`], despite [`rustc_resolve`] and
   pretty much everything else depending on [`rustc_ast`].
 - [`ExtCtxt`]/[`ExpansionData`] - holds various intermediate expansion
   infrastructure data.
 - [`Annotatable`] - a piece of AST that can be an attribute target, almost the same
   thing as [`AstFragment`] except for types and patterns that can be produced by
   macros but cannot be annotated with attributes.
 - [`MacResult`] - a "polymorphic" AST fragment, something that can turn into
   a different [`AstFragment`] depending on its [`AstFragmentKind`] (i.e. an item,
   expression, pattern, etc).

 [`AstFragment`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragment.html
 [`rustc_ast`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/index.html
 [`rustc_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/index.html
 [`ResolverExpand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.ResolverExpand.html
 [`ExtCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExtCtxt.html
 [`ExpansionData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html
 [`Annotatable`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.Annotatable.html
 [`MacResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MacResult.html
 [`AstFragmentKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/expand/enum.AstFragmentKind.html

 ## Hygiene and Hierarchies

 If you have ever used the C/C++ preprocessor macros, you know that there are some
 annoying and hard-to-debug gotchas! For example, consider the following C code:

 ```c
 #define DEFINE_FOO struct Bar {int x;}; struct Foo {Bar bar;};

 // Then, somewhere else
 struct Bar {
     ...
 };

 DEFINE_FOO
 ```

 Most people avoid writing C like this – and for good reason: it doesn't
 compile. The `struct Bar` defined by the macro clashes names with the `struct
 Bar` defined in the code. Consider also the following example:

 ```c
 #define DO_FOO(x) {\
     int y = 0;\
     foo(x, y);\
     }

 // Then elsewhere
 int y = 22;
 DO_FOO(y);
 ```

 Do you see the problem? We wanted to generate a call `foo(22, 0)`, but instead
 we got `foo(0, 0)` because the macro defined its own `y`!

 These are both examples of _macro hygiene_ issues. _Hygiene_ relates to how to
 handle names defined _within a macro_. In particular, a hygienic macro system
 prevents errors due to names introduced within a macro. Rust macros are hygienic
 in that they do not allow one to write the sorts of bugs above.

 At a high level, hygiene within the Rust compiler is accomplished by keeping
 track of the context where a name is introduced and used. We can then
 disambiguate names based on that context. Future iterations of the macro system
 will allow greater control to the macro author to use that context. For example,
 a macro author may want to introduce a new name to the context where the macro
 was called. Alternately, the macro author may be defining a variable for use
 only within the macro (i.e. it should not be visible outside the macro).

 [code_dir]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_expand/src/mbe
 [code_mp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser
 [code_mr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_rules
 [code_parse_int]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/struct.TtParser.html#method.parse_tt
 [parsing]: ./the-parser.html

 The context is attached to AST nodes. All AST nodes generated by macros have
 context attached. Additionally, there may be other nodes that have context
 attached, such as some desugared syntax (non-macro-expanded nodes are
 considered to just have the "root" context, as described below).
 Throughout the compiler, we use [`rustc_span::Span`s][span] to refer to code locations.
 This struct also has hygiene information attached to it, as we will see later.

 [span]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/struct.Span.html

 Because macros invocations and definitions can be nested, the syntax context of
 a node must be a hierarchy. For example, if we expand a macro and there is
 another macro invocation or definition in the generated output, then the syntax
 context should reflect the nesting.

 However, it turns out that there are actually a few types of context we may
 want to track for different purposes. Thus, there are not just one but _three_
 expansion hierarchies that together comprise the hygiene information for a
 crate.

 All of these hierarchies need some sort of "macro ID" to identify individual
 elements in the chain of expansions. This ID is [`ExpnId`].  All macros receive
 an integer ID, assigned continuously starting from 0 as we discover new macro
 calls.  All hierarchies start at [`ExpnId::root`][rootid], which is its own
 parent.

 The [`rustc_span::hygiene`][hy] crate contains all of the hygiene-related algorithms
 (with the exception of some hacks in [`Resolver::resolve_crate_root`][hacks])
 and structures related to hygiene and expansion that are kept in global data.

 The actual hierarchies are stored in [`HygieneData`][hd]. This is a global
 piece of data containing hygiene and expansion info that can be accessed from
 any [`Ident`] without any context.


 [`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
 [rootid]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html#method.root
 [hd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.HygieneData.html
 [hy]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/index.html
 [hacks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_resolve/struct.Resolver.html#method.resolve_crate_root
 [`Ident`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Ident.html

 ### The Expansion Order Hierarchy

 The first hierarchy tracks the order of expansions, i.e., when a macro
 invocation is in the output of another macro.

 Here, the children in the hierarchy will be the "innermost" tokens.  The
 [`ExpnData`] struct itself contains a subset of properties from both macro
 definition and macro call available through global data.
 [`ExpnData::parent`][edp] tracks the child-to-parent link in this hierarchy.

 [`ExpnData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html
 [edp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.parent

 For example:

 ```rust,ignore
 macro_rules! foo { () => { println!(); } }

 fn main() { foo!(); }
 ```

 In this code, the AST nodes that are finally generated would have hierarchy
 `root -> id(foo) -> id(println)`.

 ### The Macro Definition Hierarchy

 The second hierarchy tracks the order of macro definitions, i.e., when we are
 expanding one macro another macro definition is revealed in its output.  This
 one is a bit tricky and more complex than the other two hierarchies.

 [`SyntaxContext`][sc] represents a whole chain in this hierarchy via an ID.
 [`SyntaxContextData`][scd] contains data associated with the given
 [`SyntaxContext`][sc]; mostly it is a cache for results of filtering that chain in
 different ways.  [`SyntaxContextData::parent`][scdp] is the child-to-parent
 link here, and [`SyntaxContextData::outer_expns`][scdoe] are individual
 elements in the chain.  The "chaining-operator" is
 [`SyntaxContext::apply_mark`][am] in compiler code.

 A [`Span`][span], mentioned above, is actually just a compact representation of
 a code location and [`SyntaxContext`][sc]. Likewise, an [`Ident`] is just an interned
 [`Symbol`] + `Span` (i.e. an interned string + hygiene data).

 [`Symbol`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/symbol/struct.Symbol.html
 [scd]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html
 [scdp]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.parent
 [sc]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html
 [scdoe]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContextData.html#structfield.outer_expn
 [am]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark

 For built-in macros, we use the context:
 [`SyntaxContext::empty().apply_mark(expn_id)`], and such macros are
 considered to be defined at the hierarchy root. We do the same for `proc
 macro`s because we haven't implemented cross-crate hygiene yet.

 [`SyntaxContext::empty().apply_mark(expn_id)`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.apply_mark

 If the token had context `X` before being produced by a macro then after being
 produced by the macro it has context `X -> macro_id`. Here are some examples:

 Example 0:

 ```rust,ignore
 macro m() { ident }

 m!();
 ```

 Here `ident` which initially has context [`SyntaxContext::root`][scr] has
 context `ROOT -> id(m)` after it's produced by `m`.

 [scr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.SyntaxContext.html#method.root

 Example 1:

 ```rust,ignore
 macro m() { macro n() { ident } }

 m!();
 n!();
 ```

 In this example the `ident` has context `ROOT` initially, then `ROOT -> id(m)`
 after the first expansion, then `ROOT -> id(m) -> id(n)`.

 Example 2:

 Note that these chains are not entirely determined by their last element, in
 other words [`ExpnId`] is not isomorphic to [`SyntaxContext`][sc].

 ```rust,ignore
 macro m($i: ident) { macro n() { ($i, bar) } }

 m!(foo);
 ```

 After all expansions, `foo` has context `ROOT -> id(n)` and `bar` has context
 `ROOT -> id(m) -> id(n)`.

 Currently this hierarchy for tracking macro definitions is subject to the
 so-called ["context transplantation hack"][hack]. Modern (i.e. experimental)
 macros have stronger hygiene than the legacy "Macros By Example" (MBE)
 system which can result in weird interactions between the two. The hack is
 intended to make things "just work" for now.

 [`ExpnId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnId.html
 [hack]: https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732

 ### The Call-site Hierarchy

 The third and final hierarchy tracks the location of macro invocations.

 In this hierarchy [`ExpnData::call_site`][callsite] is the `child -> parent`
 link.

 [callsite]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/hygiene/struct.ExpnData.html#structfield.call_site

 Here is an example:

 ```rust,ignore
 macro bar($i: ident) { $i }
 macro foo($i: ident) { $i }

 foo!(bar!(baz));
 ```

 For the `baz` AST node in the final output, the expansion-order hierarchy is
 `ROOT -> id(foo) -> id(bar) -> baz`, while the call-site hierarchy is `ROOT ->
 baz`.

 ### Macro Backtraces

 Macro backtraces are implemented in [`rustc_span`] using the hygiene machinery
 in [`rustc_span::hygiene`][hy].

 [`rustc_span`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/index.html

 ## Producing Macro Output

 Above, we saw how the output of a macro is integrated into the AST for a crate,
 and we also saw how the hygiene data for a crate is generated. But how do we
 actually produce the output of a macro? It depends on the type of macro.

 There are two types of macros in Rust:
   1. `macro_rules!` macros (a.k.a. "Macros By Example" (MBE)), and,
   2. procedural macros (proc macros); including custom derives.

 During the parsing phase, the normal Rust parser will set aside the contents of
 macros and their invocations. Later, macros are expanded using these
 portions of the code.

 Some important data structures/interfaces here:
 - [`SyntaxExtension`] - a lowered macro representation, contains its expander
   function, which transforms a [`TokenStream`] or AST into another
   [`TokenStream`] or AST + some additional data like stability, or a list of
   unstable features allowed inside the macro.
 - [`SyntaxExtensionKind`] - expander functions may have several different
   signatures (take one token stream, or two, or a piece of AST, etc). This is
   an `enum` that lists them.
 - [`BangProcMacro`]/[`TTMacroExpander`]/[`AttrProcMacro`]/[`MultiItemModifier`] -
   `trait`s representing the expander function signatures.

 [`SyntaxExtension`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.SyntaxExtension.html
 [`SyntaxExtensionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/enum.SyntaxExtensionKind.html
 [`BangProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.BangProcMacro.html
 [`TTMacroExpander`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.TTMacroExpander.html
 [`AttrProcMacro`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.AttrProcMacro.html
 [`MultiItemModifier`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/trait.MultiItemModifier.html

 ## Macros By Example

 MBEs have their own parser distinct from the Rust parser. When macros are
 expanded, we may invoke the MBE parser to parse and expand a macro.  The
 MBE parser, in turn, may call the Rust parser when it needs to bind a
 metavariable (e.g. `$my_expr`) while parsing the contents of a macro
 invocation. The code for macro expansion is in
 [`compiler/rustc_expand/src/mbe/`][code_dir].

 ### Example

 ```rust,ignore
 macro_rules! printer {
     (print $mvar:ident) => {
         println!("{}", $mvar);
     };
     (print twice $mvar:ident) => {
         println!("{}", $mvar);
         println!("{}", $mvar);
     };
 }
 ```

 Here `$mvar` is called a _metavariable_. Unlike normal variables, rather than
 binding to a value _at runtime_, a metavariable binds _at compile time_ to a
 tree of _tokens_.  A _token_ is a single "unit" of the grammar, such as an
 identifier (e.g. `foo`) or punctuation (e.g. `=>`). There are also other
 special tokens, such as `EOF`, which its self indicates that there are no more
 tokens. There are token trees resulting from the paired parentheses-like
 characters (`(`...`)`, `[`...`]`, and `{`...`}`) – they include the open and
 close and all the tokens in between (Rust requires that parentheses-like
 characters be balanced). Having macro expansion operate on token streams
 rather than the raw bytes of a source-file abstracts away a lot of complexity.
 The macro expander (and much of the rest of the compiler) doesn't consider
 the exact line and column of some syntactic construct in the code; it considers
 which constructs are used in the code. Using tokens allows us to care about
 _what_ without worrying about _where_. For more information about tokens, see
 the [Parsing][parsing] chapter of this book.

 ```rust,ignore
 printer!(print foo); // `foo` is a variable
 ```

 The process of expanding the macro invocation into the syntax tree
 `println!("{}", foo)` and then expanding the syntax tree into a call to
 `Display::fmt` is one common example of _macro expansion_.

 ### The MBE parser

 There are two parts to MBE expansion done by the macro parser:
   1. parsing the definition, and,
   2. parsing the invocations.

 We think of the MBE parser as a nondeterministic finite automaton (NFA) based
 regex parser since it uses an algorithm similar in spirit to the [Earley
 parsing algorithm](https://en.wikipedia.org/wiki/Earley_parser). The macro
 parser is defined in
 [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].

 The interface of the macro parser is as follows (this is slightly simplified):

 ```rust,ignore
 fn parse_tt(
     &mut self,
     parser: &mut Cow<'_, Parser<'_>>,
     matcher: &[MatcherLoc]
 ) -> ParseResult
 ```

 We use these items in macro parser:

 - a `parser` variable is a reference to the state of a normal Rust parser,
   including the token stream and parsing session. The token stream is what we
   are about to ask the MBE parser to parse. We will consume the raw stream of
   tokens and output a binding of metavariables to corresponding token trees.
   The parsing session can be used to report parser errors.
 - a `matcher` variable is a sequence of [`MatcherLoc`]s that we want to match
   the token stream against. They're converted from token trees before matching.

 [`MatcherLoc`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/enum.MatcherLoc.html

 In the analogy of a regex parser, the token stream is the input and we are
 matching it against the pattern defined by matcher. Using our examples, the
 token stream could be the stream of tokens containing the inside of the example
 invocation `print foo`, while matcher might be the sequence of token (trees)
 `print $mvar:ident`.

 The output of the parser is a [`ParseResult`], which indicates which of
 three cases has occurred:

 - **Success**: the token stream matches the given matcher and we have produced a
   binding from metavariables to the corresponding token trees.
 - **Failure**: the token stream does not match matcher and results in an error
   message such as "No rule expected token ...".
 - **Error**: some fatal error has occurred _in the parser_. For example, this
   happens if there is more than one pattern match, since that indicates the
   macro is ambiguous.

 The full interface is defined [here][code_parse_int].

 The macro parser does pretty much exactly the same as a normal regex parser
 with one exception: in order to parse different types of metavariables, such as
 `ident`, `block`, `expr`, etc., the macro parser must call back to the normal
 Rust parser. Both the definition and invocation of macros are parsed using
 the parser in a process which is non-intuitively self-referential.

 The code to parse macro _definitions_ is in
 [`compiler/rustc_expand/src/mbe/macro_rules.rs`][code_mr]. It defines the
 pattern for matching a macro definition as `$( $lhs:tt => $rhs:tt );+`. In
 other words, a `macro_rules` definition should have in its body at least one
 occurrence of a token tree followed by `=>` followed by another token tree.
 When the compiler comes to a `macro_rules` definition, it uses this pattern to
 match the two token trees per the rules of the definition of the macro, _thereby
 utilizing the macro parser itself_. In our example definition, the
 metavariable `$lhs` would match the patterns of both arms: `(print
 $mvar:ident)` and `(print twice $mvar:ident)`. And `$rhs` would match the
 bodies of both arms: `{ println!("{}", $mvar); }` and `{ println!("{}", $mvar);
 println!("{}", $mvar); }`. The parser keeps this knowledge around for when it
 needs to expand a macro invocation.

 When the compiler comes to a macro invocation, it parses that invocation using
 a NFA-based macro parser described above. However, the matcher variable
 used is the first token tree (`$lhs`) extracted from the arms of the macro
 _definition_. Using our example, we would try to match the token stream `print
 foo` from the invocation against the matchers `print $mvar:ident` and `print
 twice $mvar:ident` that we previously extracted from the definition. The
 algorithm is exactly the same, but when the macro parser comes to a place in the
 current matcher where it needs to match a _non-terminal_ (e.g. `$mvar:ident`),
 it calls back to the normal Rust parser to get the contents of that
 non-terminal. In this case, the Rust parser would look for an `ident` token,
 which it finds (`foo`) and returns to the macro parser. Then, the macro parser
 proceeds in parsing as normal. Also, note that exactly one of the matchers from
 the various arms should match the invocation; if there is more than one match,
 the parse is ambiguous, while if there are no matches at all, there is a syntax
 error.

 For more information about the macro parser's implementation, see the comments
 in [`compiler/rustc_expand/src/mbe/macro_parser.rs`][code_mp].

 ## Procedural Macros

 Procedural macros are also expanded during parsing. However, rather than
 having a parser in the compiler, proc macros are implemented as custom,
 third-party crates. The compiler will compile the proc macro crate and
 specially annotated functions in them (i.e. the proc macro itself), passing
 them a stream of tokens. A proc macro can then transform the token stream and
 output a new token stream, which is synthesized into the AST.

 The token stream type used by proc macros is _stable_, so `rustc` does not
 use it internally. The compiler's (unstable) token stream is defined in
 [`rustc_ast::tokenstream::TokenStream`][rustcts]. This is converted into the
 stable [`proc_macro::TokenStream`][stablets] and back in
 [`rustc_expand::proc_macro`][pm] and [`rustc_expand::proc_macro_server`][pms].
 Since the Rust ABI is currently unstable, we use the C ABI for this conversion.

 [tsmod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/index.html
 [rustcts]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/tokenstream/struct.TokenStream.html
 [stablets]: https://doc.rust-lang.org/proc_macro/struct.TokenStream.html
 [pm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/proc_macro/index.html
 [pms]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/proc_macro_server/index.html
 [`ParseResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/mbe/macro_parser/enum.ParseResult.html

 <!-- TODO(rylev): more here. [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/1160) -->

 ### Custom Derive

 Custom derives are a special type of proc macro.

 ### Macros By Example and Macros 2.0

 There is an legacy and mostly undocumented effort to improve the MBE system
 by giving it more hygiene-related features, better scoping and visibility
 rules, etc. Internally this uses the same machinery as today's MBEs with some
 additional syntactic sugar and are allowed to be in namespaces.

 <!-- TODO(rylev): more? [#1160](https://github.com/rust-lang/rustc-dev-guide/issues/1160) -->