| # Queries: demand-driven compilation |
| |
| As described in [the high-level overview of the compiler][hl], the Rust compiler |
| is still (as of <!-- date: 2021-07 --> July 2021) transitioning from a |
| traditional "pass-based" setup to a "demand-driven" system. **The Compiler Query |
| System is the key to our new demand-driven organization.** The idea is pretty |
| simple. You have various queries that compute things about the input – for |
| example, there is a query called `type_of(def_id)` that, given the [def-id] of |
| some item, will compute the type of that item and return it to you. |
| |
| [def-id]: appendix/glossary.md#def-id |
| [hl]: ./compiler-src.md |
| |
| Query execution is **memoized** – so the first time you invoke a |
| query, it will go do the computation, but the next time, the result is |
| returned from a hashtable. Moreover, query execution fits nicely into |
| **incremental computation**; the idea is roughly that, when you do a |
| query, the result **may** be returned to you by loading stored data |
| from disk (but that's a separate topic we won't discuss further here). |
| |
| The overall vision is that, eventually, the entire compiler |
| control-flow will be query driven. There will effectively be one |
| top-level query ("compile") that will run compilation on a crate; this |
| will in turn demand information about that crate, starting from the |
| *end*. For example: |
| |
| - This "compile" query might demand to get a list of codegen-units |
| (i.e. modules that need to be compiled by LLVM). |
| - But computing the list of codegen-units would invoke some subquery |
| that returns the list of all modules defined in the Rust source. |
| - That query in turn would invoke something asking for the HIR. |
| - This keeps going further and further back until we wind up doing the |
| actual parsing. |
| |
| However, that vision is not fully realized. Still, big chunks of the |
| compiler (for example, generating MIR) work exactly like this. |
| |
| ### Incremental Compilation in Detail |
| |
| The [Incremental Compilation in Detail][query-model] chapter gives a more |
| in-depth description of what queries are and how they work. |
| If you intend to write a query of your own, this is a good read. |
| |
| [query-model]: queries/incremental-compilation-in-detail.md |
| |
| ### Invoking queries |
| |
| To invoke a query is simple. The tcx ("type context") offers a method |
| for each defined query. So, for example, to invoke the `type_of` |
| query, you would just do this: |
| |
| ```rust,ignore |
| let ty = tcx.type_of(some_def_id); |
| ``` |
| |
| ### How the compiler executes a query |
| |
| So you may be wondering what happens when you invoke a query |
| method. The answer is that, for each query, the compiler maintains a |
| cache – if your query has already been executed, then, the answer is |
| simple: we clone the return value out of the cache and return it |
| (therefore, you should try to ensure that the return types of queries |
| are cheaply cloneable; insert an `Rc` if necessary). |
| |
| #### Providers |
| |
| If, however, the query is *not* in the cache, then the compiler will |
| try to find a suitable **provider**. A provider is a function that has |
| been defined and linked into the compiler somewhere that contains the |
| code to compute the result of the query. |
| |
| **Providers are defined per-crate.** The compiler maintains, |
| internally, a table of providers for every crate, at least |
| conceptually. Right now, there are really two sets: the providers for |
| queries about the **local crate** (that is, the one being compiled) |
| and providers for queries about **external crates** (that is, |
| dependencies of the local crate). Note that what determines the crate |
| that a query is targeting is not the *kind* of query, but the *key*. |
| For example, when you invoke `tcx.type_of(def_id)`, that could be a |
| local query or an external query, depending on what crate the `def_id` |
| is referring to (see the [`self::keys::Key`][Key] trait for more |
| information on how that works). |
| |
| Providers always have the same signature: |
| |
| ```rust,ignore |
| fn provider<'tcx>( |
| tcx: TyCtxt<'tcx>, |
| key: QUERY_KEY, |
| ) -> QUERY_RESULT { |
| ... |
| } |
| ``` |
| |
| Providers take two arguments: the `tcx` and the query key. |
| They return the result of the query. |
| |
| #### How providers are setup |
| |
| When the tcx is created, it is given the providers by its creator using |
| the [`Providers`][providers_struct] struct. This struct is generated by |
| the macros here, but it is basically a big list of function pointers: |
| |
| [providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/query/struct.Providers.html |
| |
| ```rust,ignore |
| struct Providers { |
| type_of: for<'tcx> fn(TyCtxt<'tcx>, DefId) -> Ty<'tcx>, |
| ... |
| } |
| ``` |
| |
| At present, we have one copy of the struct for local crates, and one |
| for external crates, though the plan is that we may eventually have |
| one per crate. |
| |
| These `Providers` structs are ultimately created and populated by |
| `rustc_driver`, but it does this by distributing the work |
| throughout the other `rustc_*` crates. This is done by invoking |
| various [`provide`][provide_fn] functions. These functions tend to look |
| something like this: |
| |
| [provide_fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/fn.provide.html |
| |
| ```rust,ignore |
| pub fn provide(providers: &mut Providers) { |
| *providers = Providers { |
| type_of, |
| ..*providers |
| }; |
| } |
| ``` |
| |
| That is, they take an `&mut Providers` and mutate it in place. Usually |
| we use the formulation above just because it looks nice, but you could |
| as well do `providers.type_of = type_of`, which would be equivalent. |
| (Here, `type_of` would be a top-level function, defined as we saw |
| before.) So, if we want to add a provider for some other query, |
| let's call it `fubar`, into the crate above, we might modify the `provide()` |
| function like so: |
| |
| ```rust,ignore |
| pub fn provide(providers: &mut Providers) { |
| *providers = Providers { |
| type_of, |
| fubar, |
| ..*providers |
| }; |
| } |
| |
| fn fubar<'tcx>(tcx: TyCtxt<'tcx>, key: DefId) -> Fubar<'tcx> { ... } |
| ``` |
| |
| N.B. Most of the `rustc_*` crates only provide **local |
| providers**. Almost all **extern providers** wind up going through the |
| [`rustc_metadata` crate][rustc_metadata], which loads the information |
| from the crate metadata. But in some cases there are crates that |
| provide queries for *both* local and external crates, in which case |
| they define both a `provide` and a `provide_extern` function, through |
| [`wasm_import_module_map`][wasm_import_module_map], that `rustc_driver` can invoke. |
| |
| [rustc_metadata]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/index.html |
| [wasm_import_module_map]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/back/symbol_export/fn.wasm_import_module_map.html |
| |
| ### Adding a new kind of query |
| |
| So suppose you want to add a new kind of query, how do you do so? |
| Well, defining a query takes place in two steps: |
| |
| 1. first, you have to specify the query name and arguments; and then, |
| 2. you have to supply query providers where needed. |
| |
| To specify the query name and arguments, you simply add an entry to |
| the big macro invocation in |
| [`compiler/rustc_middle/src/query/mod.rs`][query-mod], which looks something like: |
| |
| [query-mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/index.html |
| |
| ```rust,ignore |
| rustc_queries! { |
| Other { |
| /// Records the type of every item. |
| query type_of(key: DefId) -> Ty<'tcx> { |
| cache { key.is_local() } |
| } |
| } |
| |
| ... |
| } |
| ``` |
| |
| Queries are grouped into categories (`Other`, `Codegen`, `TypeChecking`, etc.). |
| Each group contains one or more queries. Each query definition is broken up like |
| this: |
| |
| ```rust,ignore |
| query type_of(key: DefId) -> Ty<'tcx> { ... } |
| ^^ ^^^^^^^ ^^^^^ ^^^^^^^^ ^^^ |
| | | | | | |
| | | | | query modifiers |
| | | | result type of query |
| | | query key type |
| | name of query |
| query keyword |
| ``` |
| |
| Let's go over them one by one: |
| |
| - **Query keyword:** indicates a start of a query definition. |
| - **Name of query:** the name of the query method |
| (`tcx.type_of(..)`). Also used as the name of a struct |
| (`ty::queries::type_of`) that will be generated to represent |
| this query. |
| - **Query key type:** the type of the argument to this query. |
| This type must implement the [`ty::query::keys::Key`][Key] trait, which |
| defines (for example) how to map it to a crate, and so forth. |
| - **Result type of query:** the type produced by this query. This type |
| should (a) not use `RefCell` or other interior mutability and (b) be |
| cheaply cloneable. Interning or using `Rc` or `Arc` is recommended for |
| non-trivial data types. |
| - The one exception to those rules is the `ty::steal::Steal` type, |
| which is used to cheaply modify MIR in place. See the definition |
| of `Steal` for more details. New uses of `Steal` should **not** be |
| added without alerting `@rust-lang/compiler`. |
| - **Query modifiers:** various flags and options that customize how the |
| query is processed (mostly with respect to [incremental compilation][incrcomp]). |
| |
| [Key]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_impl/keys/trait.Key.html |
| [incrcomp]: queries/incremental-compilation-in-detail.html#query-modifiers |
| |
| So, to add a query: |
| |
| - Add an entry to `rustc_queries!` using the format above. |
| - Link the provider by modifying the appropriate `provide` method; |
| or add a new one if needed and ensure that `rustc_driver` is invoking it. |
| |
| #### Query structs and descriptions |
| |
| For each kind, the `rustc_queries` macro will generate a "query struct" |
| named after the query. This struct is a kind of a place-holder |
| describing the query. Each such struct implements the |
| [`self::config::QueryConfig`][QueryConfig] trait, which has associated types for the |
| key/value of that particular query. Basically the code generated looks something |
| like this: |
| |
| ```rust,ignore |
| // Dummy struct representing a particular kind of query: |
| pub struct type_of<'tcx> { data: PhantomData<&'tcx ()> } |
| |
| impl<'tcx> QueryConfig for type_of<'tcx> { |
| type Key = DefId; |
| type Value = Ty<'tcx>; |
| |
| const NAME: QueryName = QueryName::type_of; |
| const CATEGORY: ProfileCategory = ProfileCategory::Other; |
| } |
| ``` |
| |
| There is an additional trait that you may wish to implement called |
| [`self::config::QueryDescription`][QueryDescription]. This trait is |
| used during cycle errors to give a "human readable" name for the query, |
| so that we can summarize what was happening when the cycle occurred. |
| Implementing this trait is optional if the query key is `DefId`, but |
| if you *don't* implement it, you get a pretty generic error ("processing `foo`..."). |
| You can put new impls into the `config` module. They look something like this: |
| |
| [QueryConfig]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_system/query/config/trait.QueryConfig.html |
| [QueryDescription]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_query_system/query/config/trait.QueryDescription.html |
| |
| ```rust,ignore |
| impl<'tcx> QueryDescription for queries::type_of<'tcx> { |
| fn describe(tcx: TyCtxt, key: DefId) -> String { |
| format!("computing the type of `{}`", tcx.def_path_str(key)) |
| } |
| } |
| ``` |
| |
| Another option is to add `desc` modifier: |
| |
| ```rust,ignore |
| rustc_queries! { |
| Other { |
| /// Records the type of every item. |
| query type_of(key: DefId) -> Ty<'tcx> { |
| desc { |tcx| "computing the type of `{}`", tcx.def_path_str(key) } |
| } |
| } |
| } |
| ``` |
| |
| `rustc_queries` macro will generate an appropriate `impl` automatically. |
| |
| ## External Links |
| |
| Related design ideas, and tracking issues: |
| |
| - Design document: [On-demand Rustc incremental design doc] |
| - Tracking Issue: ["Red/Green" dependency tracking in compiler] |
| |
| More discussion and issues: |
| |
| - [GitHub issue #42633] |
| - [Incremental Compilation Beta] |
| - [Incremental Compilation Announcement] |
| |
| [On-demand Rustc incremental design doc]: https://github.com/nikomatsakis/rustc-on-demand-incremental-design-doc/blob/master/0000-rustc-on-demand-and-incremental.md |
| ["Red/Green" dependency tracking in compiler]: https://github.com/rust-lang/rust/issues/42293 |
| [GitHub issue #42633]: https://github.com/rust-lang/rust/issues/42633 |
| [Incremental Compilation Beta]: https://internals.rust-lang.org/t/incremental-compilation-beta/4721 |
| [Incremental Compilation Announcement]: https://blog.rust-lang.org/2016/09/08/incremental.html |