| # Queries: demand-driven compilation |
| |
| <!-- toc --> |
| |
| As described in [Overview of the compiler], the Rust compiler |
| is still (as of <!-- date-check --> July 2021) transitioning from a |
| traditional "pass-based" setup to a "demand-driven" system. The compiler query |
| system is the key to rustc's demand-driven organization. |
| The idea is pretty simple. Instead of entirely independent passes |
| (parsing, type-checking, etc.), a set of function-like *queries* |
| compute information about the input source. For example, |
| there is a query called `type_of` that, given the [`DefId`] of |
| some item, will compute the type of that item and return it to you. |
| |
| [`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/def_id/struct.DefId.html |
| [Overview of the compiler]: overview.md#queries |
| |
| Query execution is *memoized*. The first time you invoke a |
| query, it will go do the computation, but the next time, the result is |
| returned from a hashtable. Moreover, query execution fits nicely into |
| *incremental computation*; the idea is roughly that, when you invoke a |
| query, the result *may* be returned to you by loading stored data |
| from disk.[^incr-comp-detail] |
| |
| Eventually, we want the entire compiler |
| control-flow to be query driven. There will effectively be one |
| top-level query (`compile`) that will run compilation on a crate; this |
| will in turn demand information about that crate, starting from the |
| *end*. For example: |
| |
| - The `compile` query might demand to get a list of codegen-units |
| (i.e. modules that need to be compiled by LLVM). |
| - But computing the list of codegen-units would invoke some subquery |
| that returns the list of all modules defined in the Rust source. |
| - That query in turn would invoke something asking for the HIR. |
| - This keeps going further and further back until we wind up doing the |
| actual parsing. |
| |
| Although this vision is not fully realized, large sections of the |
| compiler (for example, generating [MIR]) currently work exactly like this. |
| |
| [^incr-comp-detail]: The [Incremental compilation in detail] chapter gives a more |
| in-depth description of what queries are and how they work. |
| If you intend to write a query of your own, this is a good read. |
| |
| [Incremental compilation in detail]: queries/incremental-compilation-in-detail.md |
| [MIR]: mir/index.md |
| |
| ## Invoking queries |
| |
| Invoking a query is simple. The [`TyCtxt`] ("type context") struct offers a method |
| for each defined query. For example, to invoke the `type_of` |
| query, you would just do this: |
| |
| ```rust,ignore |
| let ty = tcx.type_of(some_def_id); |
| ``` |
| |
| [`TyCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html |
| |
| ## How the compiler executes a query |
| |
| So you may be wondering what happens when you invoke a query |
| method. The answer is that, for each query, the compiler maintains a |
| cache – if your query has already been executed, then, the answer is |
| simple: we clone the return value out of the cache and return it |
| (therefore, you should try to ensure that the return types of queries |
| are cheaply cloneable; insert an `Rc` if necessary). |
| |
| ### Providers |
| |
| If, however, the query is *not* in the cache, then the compiler will |
| call the corresponding **provider** function. A provider is a function |
| implemented in a specific module and **manually registered** into the |
| [`Providers`][providers_struct] struct during compiler initialization. |
| The macro system generates the [`Providers`][providers_struct] struct, |
| which acts as a function table for all query implementations, where each |
| field is a function pointer to the actual provider. |
| |
| **Note:** The `Providers` struct is generated by macros and acts as a function table for all query implementations. |
| It is **not** a Rust trait, but a plain struct with function pointer fields. |
| |
| **Providers are defined per-crate.** The compiler maintains, |
| internally, a table of providers for every crate, at least |
| conceptually. Right now, there are really two sets: the providers for |
| queries about the **local crate** (that is, the one being compiled) |
| and providers for queries about **external crates** (that is, |
| dependencies of the local crate). Note that what determines the crate |
| that a query is targeting is not the *kind* of query, but the *key*. |
| For example, when you invoke `tcx.type_of(def_id)`, that could be a |
| local query or an external query, depending on what crate the `def_id` |
| is referring to (see the [`self::keys::Key`][Key] trait for more |
| information on how that works). |
| |
| Providers always have the same signature: |
| |
| ```rust,ignore |
| fn provider<'tcx>( |
| tcx: TyCtxt<'tcx>, |
| key: QUERY_KEY, |
| ) -> QUERY_RESULT { |
| ... |
| } |
| ``` |
| |
| Providers take two arguments: the `tcx` and the query key. |
| They return the result of the query. |
| |
| N.B. Most of the `rustc_*` crates only provide **local |
| providers**. Almost all **extern providers** wind up going through the |
| [`rustc_metadata` crate][rustc_metadata], which loads the information |
| from the crate metadata. But in some cases there are crates that |
| provide queries for *both* local and external crates, in which case |
| they define both a `provide` and a `provide_extern` function, through |
| [`wasm_import_module_map`][wasm_import_module_map], that `rustc_driver` can invoke. |
| |
| [rustc_metadata]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/index.html |
| [wasm_import_module_map]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/back/symbol_export/fn.wasm_import_module_map.html |
| |
| ### How providers are set up |
| |
| When the tcx is created, it is given the providers by its creator using |
| the [`Providers`][providers_struct] struct. This struct is generated by |
| the macros here, but it is basically a big list of function pointers: |
| |
| [providers_struct]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/struct.Providers.html |
| |
| ```rust,ignore |
| struct Providers { |
| type_of: for<'tcx> fn(TyCtxt<'tcx>, DefId) -> Ty<'tcx>, |
| // ... one field for each query |
| } |
| ``` |
| |
| #### How are providers registered? |
| |
| The `Providers` struct is filled in during compiler initialization, mainly by the `rustc_driver` crate. |
| But the actual provider functions are implemented in various `rustc_*` crates (like `rustc_middle`, `rustc_hir_analysis`, etc). |
| |
| To register providers, each crate exposes a [`provide`][provide_fn] function that looks like this: |
| |
| [provide_fn]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/fn.provide.html |
| |
| ```rust,ignore |
| pub fn provide(providers: &mut Providers) { |
| *providers = Providers { |
| type_of, |
| // ... add more providers here |
| ..*providers |
| }; |
| } |
| ``` |
| |
| - This function takes a mutable reference to the `Providers` struct and sets the fields to point to the correct provider functions. |
| - You can also assign fields individually, e.g. `providers.type_of = type_of;`. |
| |
| #### Adding a new provider |
| |
| Suppose you want to add a new query called `fubar`. You would: |
| |
| 1. Implement the provider function: |
| ```rust,ignore |
| fn fubar<'tcx>(tcx: TyCtxt<'tcx>, key: DefId) -> Fubar<'tcx> { ... } |
| ``` |
| 2. Register it in the `provide` function: |
| ```rust,ignore |
| pub fn provide(providers: &mut Providers) { |
| *providers = Providers { |
| fubar, |
| ..*providers |
| }; |
| } |
| ``` |
| |
| --- |
| |
| ## Adding a new query |
| |
| How do you add a new query? |
| Defining a query takes place in two steps: |
| |
| 1. Declare the query name, its arguments and description. |
| 2. Supply query providers where needed. |
| |
| To declare the query name and arguments, you simply add an entry to |
| the big macro invocation in [`compiler/rustc_middle/src/query/mod.rs`][query-mod]. |
| Then you need to add a documentation comment to it with some _internal_ description. |
| Then, provide the `desc` attribute which contains a _user-facing_ description of the query. |
| The `desc` attribute is shown to the user in query cycles. |
| |
| This looks something like: |
| |
| [query-mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/index.html |
| |
| ```rust,ignore |
| rustc_queries! { |
| /// Records the type of every item. |
| query type_of(key: DefId) -> Ty<'tcx> { |
| cache_on_disk_if { key.is_local() } |
| desc { |tcx| "computing the type of `{}`", tcx.def_path_str(key) } |
| } |
| ... |
| } |
| ``` |
| |
| A query definition has the following form: |
| |
| ```rust,ignore |
| query type_of(key: DefId) -> Ty<'tcx> { ... } |
| ^^^^^ ^^^^^^^ ^^^^^ ^^^^^^^^ ^^^ |
| | | | | | |
| | | | | query modifiers |
| | | | result type |
| | | query key type |
| | name of query |
| query keyword |
| ``` |
| |
| Let's go over these elements one by one: |
| |
| - **Query keyword:** indicates a start of a query definition. |
| - **Name of query:** the name of the query method |
| (`tcx.type_of(..)`). Also used as the name of a struct |
| (`ty::queries::type_of`) that will be generated to represent |
| this query. |
| - **Query key type:** the type of the argument to this query. |
| This type must implement the [`ty::query::keys::Key`][Key] trait, which |
| defines (for example) how to map it to a crate, and so forth. |
| - **Result type of query:** the type produced by this query. This type |
| should (a) not use `RefCell` or other interior mutability and (b) be |
| cheaply cloneable. Interning or using `Rc` or `Arc` is recommended for |
| non-trivial data types.[^steal] |
| - **Query modifiers:** various flags and options that customize how the |
| query is processed (mostly with respect to [incremental compilation][incrcomp]). |
| |
| [Key]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/query/keys/trait.Key.html |
| [incrcomp]: queries/incremental-compilation-in-detail.html#query-modifiers |
| |
| So, to add a query: |
| |
| - Add an entry to `rustc_queries!` using the format above. |
| - Link the provider by modifying the appropriate `provide` method; |
| or add a new one if needed and ensure that `rustc_driver` is invoking it. |
| |
| [^steal]: The one exception to those rules is the `ty::steal::Steal` type, |
| which is used to cheaply modify MIR in place. See the definition |
| of `Steal` for more details. New uses of `Steal` should **not** be |
| added without alerting `@rust-lang/compiler`. |
| |
| ## External links |
| |
| Related design ideas, and tracking issues: |
| |
| - Design document: [On-demand Rustc incremental design doc] |
| - Tracking Issue: ["Red/Green" dependency tracking in compiler] |
| |
| More discussion and issues: |
| |
| - [GitHub issue #42633] |
| - [Incremental Compilation Beta] |
| - [Incremental Compilation Announcement] |
| |
| [On-demand Rustc incremental design doc]: https://github.com/nikomatsakis/rustc-on-demand-incremental-design-doc/blob/master/0000-rustc-on-demand-and-incremental.md |
| ["Red/Green" dependency tracking in compiler]: https://github.com/rust-lang/rust/issues/42293 |
| [GitHub issue #42633]: https://github.com/rust-lang/rust/issues/42633 |
| [Incremental Compilation Beta]: https://internals.rust-lang.org/t/incremental-compilation-beta/4721 |
| [Incremental Compilation Announcement]: https://blog.rust-lang.org/2016/09/08/incremental.html |
| |