| # Libraries and metadata |
| |
| When the compiler sees a reference to an external crate, it needs to load some |
| information about that crate. |
| This chapter gives an overview of that process, |
| and the supported file formats for crate libraries. |
| |
| ## Libraries |
| |
| A crate dependency can be loaded from an `rlib`, `dylib`, or `rmeta` file. |
| A key point of these file formats is that they contain `rustc`-specific |
| [*metadata*](#metadata). |
| This metadata allows the compiler to discover enough |
| information about the external crate to understand the items it contains, |
| which macros it exports, and *much* more. |
| |
| ### rlib |
| |
| An `rlib` is an [archive file], which is similar to a tar file. |
| This file format is specific to `rustc`, and may change over time. |
| This file contains: |
| |
| * Object code, which is the result of code generation. |
| This is used during regular linking. |
| There is a separate `.o` file for each [codegen unit]. |
| The codegen step can be skipped with the [`-C linker-plugin-lto`][linker-plugin-lto] CLI option, |
| which means each `.o` file will only contain LLVM bitcode. |
| * [LLVM bitcode], which is a binary representation of LLVM's intermediate |
| representation, which is embedded as a section in the `.o` files. |
| This can be used for [Link Time Optimization] (LTO). |
| This can be removed with the |
| [`-C embed-bitcode=no`][embed-bitcode] CLI option to improve compile times |
| and reduce disk space if LTO is not needed. |
| * `rustc` [metadata], in a file named `lib.rmeta`. |
| * A symbol table, which is essentially a list of symbols with offsets to the |
| object files that contain that symbol. |
| This is pretty standard for archive files. |
| |
| [archive file]: https://en.wikipedia.org/wiki/Ar_(Unix) |
| [LLVM bitcode]: https://llvm.org/docs/BitCodeFormat.html |
| [Link Time Optimization]: https://llvm.org/docs/LinkTimeOptimization.html |
| [codegen unit]: ../backend/codegen.md |
| [embed-bitcode]: https://doc.rust-lang.org/rustc/codegen-options/index.html#embed-bitcode |
| [linker-plugin-lto]: https://doc.rust-lang.org/rustc/codegen-options/index.html#linker-plugin-lto |
| |
| ### dylib |
| |
| A `dylib` is a platform-specific shared library. |
| It includes the `rustc` [metadata] in a special link section called `.rustc`. |
| |
| ### rmeta |
| |
| An `rmeta` file is a custom binary format that contains the [metadata] for the crate. |
| This file can be used for fast "checks" of a project by skipping all code |
| generation (as is done with `cargo check`), collecting enough information for |
| documentation (as is done with `cargo doc`), or for [pipelining](#pipelining). |
| This file is created if the [`--emit=metadata`][emit] CLI option is used. |
| |
| `rmeta` files do not support linking, since they do not contain compiled object files. |
| |
| [emit]: https://doc.rust-lang.org/rustc/command-line-arguments.html#option-emit |
| |
| ## Metadata |
| |
| The metadata contains a wide swath of different elements. |
| This guide will not go into detail about every field it contains. |
| You are encouraged to browse the |
| [`CrateRoot`] definition to get a sense of the different elements it contains. |
| Everything about metadata encoding and decoding is in the [`rustc_metadata`] package. |
| |
| Here are a few highlights of things it contains: |
| |
| * The version of the `rustc` compiler. |
| The compiler will refuse to load files from any other version. |
| * The [Strict Version Hash](#strict-version-hash) (SVH). |
| This helps ensure the correct dependency is loaded. |
| * The [Stable Crate Id](#stable-crate-id). |
| This is a hash used to identify crates. |
| * Information about all the source files in the library. |
| This can be used for a variety of things, such as diagnostics pointing to sources in a |
| dependency. |
| * Information about exported macros, traits, types, and items. |
| Generally, |
| anything that's needed to be known when a path references something inside a crate dependency. |
| * Encoded [MIR]. |
| This is optional, and only encoded if needed for code generation. |
| `cargo check` skips this for performance reasons. |
| |
| [`CrateRoot`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/rmeta/struct.CrateRoot.html |
| [`rustc_metadata`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/index.html |
| [MIR]: ../mir/index.md |
| |
| ### Strict Version Hash |
| |
| The Strict Version Hash ([SVH], also known as the "crate hash") is a 64-bit |
| hash that is used to ensure that the correct crate dependencies are loaded. |
| It is possible for a directory to contain multiple copies of the same dependency |
| built with different settings, or built from different sources. |
| The crate loader will skip any crates that have the wrong SVH. |
| |
| The SVH is also used for the [incremental compilation] session filename, |
| though that usage is mostly historic. |
| |
| The hash includes a variety of elements: |
| |
| * Hashes of the HIR nodes. |
| * All of the upstream crate hashes. |
| * All of the source filenames. |
| * Hashes of certain command-line flags (like `-C metadata` via the [Stable |
| Crate Id](#stable-crate-id), and all CLI options marked with `[TRACKED]`). |
| |
| See [`compute_hir_hash`] for where the hash is actually computed. |
| |
| [SVH]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/svh/struct.Svh.html |
| [incremental compilation]: ../queries/incremental-compilation.md |
| [`compute_hir_hash`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast_lowering/fn.compute_hir_hash.html |
| |
| ### Stable Crate Id |
| |
| The [`StableCrateId`] is a 64-bit hash used to identify different crates with |
| potentially the same name. |
| It is a hash of the crate name and all the |
| [`-C metadata`] CLI options computed in [`StableCrateId::new`]. |
| It is used in a variety of places, such as symbol name mangling, crate loading, and |
| much more. |
| |
| By default, all Rust symbols are mangled and incorporate the stable crate id. |
| This allows multiple versions of the same crate to be included together. |
| Cargo automatically generates `-C metadata` hashes based on a variety of factors, like |
| the package version, source, and target kind (a lib and test can have the same |
| crate name, so they need to be disambiguated). |
| |
| [`StableCrateId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/def_id/struct.StableCrateId.html |
| [`StableCrateId::new`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_span/def_id/struct.StableCrateId.html#method.new |
| [`-C metadata`]: https://doc.rust-lang.org/rustc/codegen-options/index.html#metadata |
| |
| ## Crate loading |
| |
| Crate loading can have quite a few subtle complexities. |
| During [name resolution], when an external crate is referenced (via an `extern crate` or |
| path), the resolver uses the [`CStore`] which is responsible for finding |
| the crate libraries and loading the [metadata] for them. |
| After the dependency is loaded, the `CStore` will provide the information the resolver needs |
| to perform its job (such as expanding macros, resolving paths, etc.). |
| |
| To load each external crate, the `CStore` uses a [`CrateLocator`] to |
| actually find the correct files for one specific crate. |
| There is some great documentation in the [`locator`] module that goes into detail on how loading |
| works, and I strongly suggest reading it to get the full picture. |
| |
| The location of a dependency can come from several different places. |
| Direct dependencies are usually passed with `--extern` flags, and the loader can look |
| at those directly. |
| Direct dependencies often have references to their own dependencies, which need to be loaded, too. |
| These are usually found by |
| scanning the directories passed with the `-L` flag for any file whose metadata |
| contains a matching crate name and [SVH](#strict-version-hash). |
| The loader will also look at the [sysroot] to find dependencies. |
| |
| As crates are loaded, they are kept in the [`CStore`] with the crate metadata |
| wrapped in the [`CrateMetadata`] struct. |
| After resolution and expansion, the |
| `CStore` will make its way into the [`GlobalCtxt`] for the rest of the compilation. |
| |
| [name resolution]: ../name-resolution.md |
| [`CrateLocator`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/locator/struct.CrateLocator.html |
| [`locator`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/locator/index.html |
| [`CStore`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/creader/struct.CStore.html |
| [`CrateMetadata`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_metadata/rmeta/decoder/struct.CrateMetadata.html |
| [`GlobalCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.GlobalCtxt.html |
| [sysroot]: ../building/bootstrapping/what-bootstrapping-does.md#what-is-a-sysroot |
| |
| ## Pipelining |
| |
| One trick to improve compile times is to start building a crate as soon as the |
| metadata for its dependencies is available. |
| For a library, there is no need to wait for the code generation of dependencies to finish. |
| Cargo implements this technique by telling `rustc` to emit an [`rmeta`](#rmeta) file for each |
| dependency as well as an [`rlib`](#rlib). |
| As early as it can, `rustc` will |
| save the `rmeta` file to disk before it continues to the code generation phase. |
| The compiler sends a JSON message to let the build tool know that it |
| can start building the next crate if possible. |
| |
| The [crate loading](#crate-loading) system is smart enough to know when it |
| sees an `rmeta` file to use that if the `rlib` is not there (or has only been partially written). |
| |
| This pipelining isn't possible for binaries, because the linking phase will |
| require the code generation of all its dependencies. |
| In the future, it may be |
| possible to further improve this scenario by splitting linking into a separate |
| command (see [#64191]). |
| |
| [#64191]: https://github.com/rust-lang/rust/issues/64191 |
| |
| [metadata]: #metadata |