Bootstrapping is the process of using a compiler to compile itself. More accurately, it means using an older compiler to compile a newer version of the same compiler.
This raises a chicken-and-egg paradox: where did the first compiler come from? It must have been written in a different language. In Rust's case it was written in OCaml. However, that compiler was abandoned long ago, and the only way to build a modern version of rustc is with a slightly less modern version.
This is exactly how ./x.py works: it downloads the current beta release of rustc, then uses it to compile the new compiler.
Note that this documentation mostly covers user-facing information. See bootstrap/README.md to read about bootstrap internals.
Compiling rustc is done in stages. Here's a diagram, adapted from Jynn Nelson's talk on bootstrapping at RustConf 2022, with detailed explanations below.
The A, B, C, and D show the ordering of the stages of bootstrapping. Blue nodes are downloaded, yellow nodes are built with the stage0 compiler, and green nodes are built with the stage1 compiler.
graph TD
    s0c["stage0 compiler (1.63)"]:::downloaded -->|A| s0l("stage0 std (1.64)"):::with-s0c;
    s0c & s0l --- stepb[ ]:::empty;
    stepb -->|B| s0ca["stage0 compiler artifacts (1.64)"]:::with-s0c;
    s0ca -->|copy| s1c["stage1 compiler (1.64)"]:::with-s0c;
    s1c -->|C| s1l("stage1 std (1.64)"):::with-s1c;
    s1c & s1l --- stepd[ ]:::empty;
    stepd -->|D| s1ca["stage1 compiler artifacts (1.64)"]:::with-s1c;
    s1ca -->|copy| s2c["stage2 compiler"]:::with-s1c;
    classDef empty width:0px,height:0px;
    classDef downloaded fill: lightblue;
    classDef with-s0c fill: yellow;
    classDef with-s1c fill: lightgreen;
The stage0 compiler is usually the current beta rustc compiler and its associated dynamic libraries, which ./x.py will download for you. (You can also configure ./x.py to use something else.)
The stage0 compiler is then used only to compile src/bootstrap, library/std, and compiler/rustc. When assembling the libraries and binaries that will become the stage1 rustc compiler, the freshly compiled std and rustc are used. There are two concepts at play here: a compiler (with its set of dependencies) and its ‘target’ or ‘object’ libraries (std and rustc). Both are staged, but in a staggered manner.
The rustc source code is then compiled with the stage0 compiler to produce the stage1 compiler.
We then rebuild our stage1 compiler with itself to produce the stage2 compiler.
In theory, the stage1 compiler is functionally identical to the stage2 compiler, but in practice there are subtle differences. In particular, the stage1 compiler itself was built by stage0 and hence not by the source in your working directory. This means that the ABI generated by the stage0 compiler may not match the ABI that would have been made by the stage1 compiler, which can cause problems for dynamic libraries, tests, and tools using rustc_private.
Note that the proc_macro crate avoids this issue with a C FFI layer called proc_macro::bridge, allowing it to be used with stage1.
The stage2 compiler is the one distributed with rustup and all other install methods. However, it takes a very long time to build because one must first build the new compiler with an older compiler and then use that to build the new compiler with itself. For development, you usually only want the stage1 compiler, which you can build with ./x build library. See Building the compiler.
Stage 3 is optional. To sanity check our new compiler we can build the libraries with the stage2 compiler. The result ought to be identical to before, unless something has broken.
The script ./x tries to be helpful and pick the stage you most likely meant for each subcommand. These defaults are as follows:
- check: --stage 0
- doc: --stage 0
- build: --stage 1
- test: --stage 1
- dist: --stage 2
- install: --stage 2
- bench: --stage 2
You can always override the stage by passing --stage N explicitly.
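The defaults listed above can be written down as a small lookup. This is only an illustrative sketch: the function names default_stage and effective_stage are hypothetical and are not taken from bootstrap's source, and the values are simply the ones in the list above.

```rust
/// Default `--stage` for an `./x` subcommand, copied from the list above.
/// Returns `None` for subcommands the list does not mention.
/// (Hypothetical helper, not bootstrap's real implementation.)
fn default_stage(subcommand: &str) -> Option<u32> {
    match subcommand {
        "check" | "doc" => Some(0),
        "build" | "test" => Some(1),
        "dist" | "install" | "bench" => Some(2),
        _ => None,
    }
}

/// An explicit `--stage N` always takes precedence over the default.
fn effective_stage(subcommand: &str, explicit: Option<u32>) -> Option<u32> {
    explicit.or_else(|| default_stage(subcommand))
}

fn main() {
    for cmd in ["check", "doc", "build", "test", "dist", "install", "bench"] {
        println!("{cmd}: --stage {:?}", default_stage(cmd));
    }
    // Models `./x test --stage 2`, which overrides the default stage for `test`.
    println!("test with --stage 2: {:?}", effective_stage("test", Some(2)));
}
```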
For more information about stages, see below.
Since the build system uses the current beta compiler to build a stage1 bootstrapping compiler, the compiler source code can't use some features until they reach beta (because otherwise the beta compiler doesn't support them). On the other hand, for compiler intrinsics and internal features, the features have to be used. Additionally, the compiler makes heavy use of nightly features (#![feature(...)]). How can we resolve this problem?
There are two methods used:
- The build system sets --cfg bootstrap when building with stage0, so we can use cfg(not(bootstrap)) to only use features when built with stage1. Setting --cfg bootstrap in this way is used for features that were just stabilized, which require #![feature(...)] when built with stage0, but not for stage1.
- The build system sets RUSTC_BOOTSTRAP=1. This special variable means to break the stability guarantees of Rust: allowing use of #![feature(...)] with a compiler that's not nightly. Setting RUSTC_BOOTSTRAP=1 should never be used except when bootstrapping the compiler.
This is a detailed look into the separate bootstrap stages.
The convention ./x uses is that:
- A --stage N flag means to run the stage N compiler (stageN/rustc).
Anything you can build with ./x is a build artifact. Build artifacts include, but are not limited to:
- stage0-rustc/rustc-main
- stage0-sysroot/rustlib/libstd-6fae108520cf72fe.so
- stage0-sysroot/rustlib/libstd-6fae108520cf72fe.rlib
- doc/std
- ./x test tests/ui means to build the stage1 compiler and run compiletest on it. If you're working on the compiler, this is normally the test command you want.
- ./x test --stage 0 library/std means to run tests on the standard library without building rustc from source (‘build with stage0, then test the artifacts’). If you're working on the standard library, this is normally the test command you want.
- ./x build --stage 0 means to build with the beta rustc.
- ./x doc --stage 0 means to document using the beta rustdoc.
- ./x test --stage 0 tests/ui is not useful: it runs tests on the beta compiler and doesn't build rustc from source. Use test tests/ui instead, which builds stage1 from source.
- ./x test --stage 0 compiler/rustc builds the compiler but runs no tests: it's running cargo test -p rustc, but cargo doesn't understand Rust's tests. You shouldn't need to use this; use test instead (without arguments).
- ./x build --stage 0 compiler/rustc builds the compiler, but does not build libstd or even libcore. Most of the time, you'll want ./x build library instead, which allows compiling programs without needing to define lang items.
Note that build --stage N compiler/rustc does not build the stage N compiler: instead it builds the stage N+1 compiler using the stage N compiler.
In short, stage 0 uses the stage0 compiler to create stage0 artifacts which will later be uplifted to be the stage1 compiler.
In each stage, two major steps are performed:
1. std is compiled by the stage N compiler.
2. That std is linked to programs built by the stage N compiler, including the stage N artifacts (stage N+1 compiler).
This is somewhat intuitive if one thinks of the stage N artifacts as “just” another program we are building with the stage N compiler: build --stage N compiler/rustc is linking the stage N artifacts to the std built by the stage N compiler.
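The off-by-one between "stage N artifacts" and "stage N+1 compiler" can be modelled in a few lines. The types and function names below (Compiler, Artifacts, build_compiler_artifacts, uplift) are made up for illustration and do not correspond to bootstrap's own data structures; the sketch only encodes the numbering rule described above.

```rust
/// A compiler, identified only by its stage number (hypothetical model type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Compiler {
    stage: u32,
}

/// Compiler artifacts, tagged with the stage of the compiler that built them.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Artifacts {
    built_by_stage: u32,
}

/// Models `build --stage N compiler/rustc`: the stage N compiler produces
/// stage N artifacts.
fn build_compiler_artifacts(with: Compiler) -> Artifacts {
    Artifacts { built_by_stage: with.stage }
}

/// Models uplifting: stage N artifacts become the stage N+1 compiler.
fn uplift(artifacts: Artifacts) -> Compiler {
    Compiler { stage: artifacts.built_by_stage + 1 }
}

fn main() {
    let stage0 = Compiler { stage: 0 };
    let stage1 = uplift(build_compiler_artifacts(stage0));
    let stage2 = uplift(build_compiler_artifacts(stage1));
    println!("{stage0:?} -> {stage1:?} -> {stage2:?}");
}
```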
std
Note that there are two std libraries in play here:
1. The library linked to stageN/rustc, which was built by stage N-1 (stage N-1 std)
2. The library used to compile programs with stageN/rustc, which was built by stage N (stage N std).
Stage N std is pretty much necessary for any useful work with the stage N compiler. Without it, you can only compile programs with #![no_core] -- not terribly useful!
The reason these need to be different is that they aren't necessarily ABI-compatible: there could be new layout optimizations, changes to MIR, or other changes to Rust metadata on nightly that aren't present in beta.
This is also where --keep-stage 1 library/std comes into play. Since most changes to the compiler don't actually change the ABI, once you've produced a std in stage1, you can probably just reuse it with a different compiler. If the ABI hasn't changed, you're good to go, with no need to spend time recompiling that std. The flag --keep-stage simply instructs the build script to assume the previous compile is fine and to copy those artifacts into the appropriate place, skipping the cargo invocation.
Cross-compiling is the process of compiling code that will run on another architecture. For instance, you might want to build an ARM version of rustc using an x86 machine. Building stage2 std is different when you are cross-compiling.
This is because ./x uses the following logic: if HOST and TARGET are the same, it will reuse stage1 std for stage2! This is sound because stage1 std was compiled with the stage1 compiler, i.e. a compiler using the source code you currently have checked out. So it should be identical (and therefore ABI-compatible) to the std that stage2/rustc would compile.
However, when cross-compiling, stage1 std will only run on the host. So the stage2 compiler has to recompile std for the target.
(See in the table how stage2 only builds non-host std targets).
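The HOST/TARGET rule above boils down to a single comparison. The sketch below only restates that rule; the enum Stage2Std and the function stage2_std_plan are hypothetical names, and the real decision inside bootstrap involves more state than two triples.

```rust
/// What to do about `std` for the stage2 compiler (hypothetical model type).
#[derive(Debug, PartialEq)]
enum Stage2Std {
    /// HOST == TARGET: the stage1 `std` is reused as-is.
    ReuseStage1,
    /// Cross-compiling: `std` has to be compiled again for the target.
    Rebuild,
}

/// Decide how stage2 `std` is obtained for a given host/target pair.
fn stage2_std_plan(host: &str, target: &str) -> Stage2Std {
    if host == target {
        Stage2Std::ReuseStage1
    } else {
        Stage2Std::Rebuild
    }
}

fn main() {
    let host = "x86_64-unknown-linux-gnu";
    for target in [host, "aarch64-unknown-linux-gnu"] {
        println!("{host} -> {target}: {:?}", stage2_std_plan(host, target));
    }
}
```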
cfg(bootstrap)?
For docs on cfg(bootstrap) itself, see Complications of Bootstrapping.
The rustc generated by the stage0 compiler is linked to the freshly-built std, which means that for the most part only std needs to be cfg-gated, so that rustc can use features added to std immediately after their addition, without need for them to get into the downloaded beta compiler.
Note this is different from any other Rust program: stage1 rustc is built by the beta compiler, but using the master version of libstd!
The only time rustc uses cfg(bootstrap) is when it adds internal lints that use diagnostic items, or when it uses unstable library features that were recently changed.
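As a rough illustration of what cfg-gating on bootstrap looks like in source code, the sketch below defines the same function twice and lets the cfg pick one. The function built_by and its strings are invented for this example. When compiled by a plain rustc without --cfg bootstrap, the not(bootstrap) branch is selected; a recent rustc may also warn that bootstrap is an unexpected cfg name, since only the build system is expected to set it.

```rust
/// Selected only when the crate is compiled with `--cfg bootstrap`,
/// which the build system passes when building with the stage0 compiler.
#[cfg(bootstrap)]
fn built_by() -> &'static str {
    "stage0 compiler"
}

/// Selected in every other case, i.e. when built with stage1 or later
/// (or by an ordinary `rustc` invocation, as in this standalone sketch).
#[cfg(not(bootstrap))]
fn built_by() -> &'static str {
    "stage1 or later compiler"
}

fn main() {
    println!("this code was compiled by the {}", built_by());
}
```

Compiling the same file with rustc --cfg bootstrap would select the first definition instead.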
When you build a project with cargo, the build artifacts for dependencies are normally stored in target/debug/deps. This only contains dependencies cargo knows about; in particular, it doesn't have the standard library. Where do std or proc_macro come from? They come from the sysroot, the root of a number of directories where the compiler loads build artifacts at runtime. The sysroot doesn't just store the standard library, though - it includes anything that needs to be loaded at runtime. That includes (but is not limited to):
- libstd/libtest/libproc_macro.
- The compiler crates themselves, when using rustc_private. In-tree these are always present; out of tree, you need to install rustc-dev with rustup.
- libLLVM.so for the LLVM project. In-tree this is either built from source or downloaded from CI; out of tree, you need to install llvm-tools-preview with rustup.
All the artifacts listed so far are compiler runtime dependencies. You can see them with rustc --print sysroot:
$ ls $(rustc --print sysroot)/lib
libchalk_derive-0685d79833dc9b2b.so    libstd-25c6acf8063a3802.so
libLLVM-11-rust-1.50.0-nightly.so      libtest-57470d2aa8f7aa83.so
librustc_driver-4f0cc9f50e53f0ba.so    libtracing_attributes-e4be92c35ab2a33b.so
librustc_macros-5f0ec4a119c6ac86.so    rustlib
There are also runtime dependencies for the standard library! These are in lib/rustlib/, not lib/ directly.
$ ls $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib | head -n 5
libaddr2line-6c8e02b8fedc1e5f.rlib
libadler-9ef2480568df55af.rlib
liballoc-9c4002b5f79ba0e1.rlib
libcfg_if-512eb53291f6de7e.rlib
libcompiler_builtins-ef2408da76957905.rlib
The directory lib/rustlib/ includes libraries like hashbrown and cfg_if, which are not part of the public API of the standard library, but are used to implement it. Also, lib/rustlib/ is part of the search path for linkers, but lib will never be part of the search path.
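The two directory layouts shown above can also be located programmatically. The following is a minimal sketch that shells out to rustc --print sysroot (the same command used above) and joins the paths by hand; the helper names are hypothetical, the target triple is just the one from the listing, and the lib/ layout is the Linux one shown in this section.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Ask `rustc` where its sysroot is. Returns `None` if `rustc` cannot be
/// run or prints something that is not valid UTF-8.
fn sysroot() -> Option<PathBuf> {
    let out = Command::new("rustc").args(["--print", "sysroot"]).output().ok()?;
    if !out.status.success() {
        return None;
    }
    let text = String::from_utf8(out.stdout).ok()?;
    Some(PathBuf::from(text.trim()))
}

/// Compiler runtime dependencies live directly in `lib/`.
fn compiler_lib_dir(sysroot: &Path) -> PathBuf {
    sysroot.join("lib")
}

/// Standard library artifacts for a target live in `lib/rustlib/<target>/lib`.
fn target_lib_dir(sysroot: &Path, target: &str) -> PathBuf {
    sysroot.join("lib").join("rustlib").join(target).join("lib")
}

fn main() {
    match sysroot() {
        Some(root) => {
            println!("compiler libs: {}", compiler_lib_dir(&root).display());
            let target = "x86_64-unknown-linux-gnu"; // example triple from the listing above
            println!("target libs:   {}", target_lib_dir(&root, target).display());
        }
        None => eprintln!("could not run `rustc --print sysroot`"),
    }
}
```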
-Z force-unstable-if-unmarked
Since lib/rustlib/ is part of the search path we have to be careful about which crates are included in it. In particular, all crates except for the standard library are built with the flag -Z force-unstable-if-unmarked, which means that you have to use #![feature(rustc_private)] in order to load them (as opposed to the standard library, which is always available).
The -Z force-unstable-if-unmarked flag has a variety of purposes to help enforce that the correct crates are marked as unstable. It was introduced primarily to allow rustc and the standard library to link to arbitrary crates on crates.io which do not themselves use staged_api. rustc also relies on this flag to mark all of its crates as unstable with the rustc_private feature so that each crate does not need to be carefully marked with unstable.
This flag is automatically applied to all of rustc and the standard library by the bootstrap scripts. This is needed because the compiler and all of its dependencies are shipped in sysroot to all users.
This flag has the following effects:
- Marks the crate as “unstable” with the rustc_private feature if it is not itself marked as stable or unstable.
- Allows these crates to access other forced-unstable crates without any need for attributes. Normally a crate would need a #![feature(rustc_private)] attribute to use other unstable crates. However, that would make it impossible for a crate from crates.io to access its own dependencies since that crate won't have a feature(rustc_private) attribute, but everything is compiled with -Z force-unstable-if-unmarked.
Code which does not use -Z force-unstable-if-unmarked should include the #![feature(rustc_private)] crate attribute to access these forced-unstable crates. This is needed for things which link rustc itself, such as MIRI or clippy.
You can find more discussion about sysroots in:
extern crate
for dependencies loaded from sysroot
bootstrap
Conveniently, ./x allows you to pass stage-specific flags to rustc and cargo when bootstrapping. The RUSTFLAGS_BOOTSTRAP environment variable is passed as RUSTFLAGS to the bootstrap stage (stage0), and RUSTFLAGS_NOT_BOOTSTRAP is passed when building artifacts for later stages. RUSTFLAGS will work, but also affects the build of bootstrap itself, so it will be rare to want to use it. Finally, MAGIC_EXTRA_RUSTFLAGS bypasses the cargo cache to pass flags to rustc without recompiling all dependencies.
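The stage-dependent choice between these variables can be pictured as follows. This is a hedged sketch only: rustflags_for_stage is a hypothetical function, it reads from a plain map rather than the real process environment, and the order in which flags are combined is an assumption made for the example, not something stated above.

```rust
use std::collections::HashMap;

/// Collect the rustc flags that would apply to a given stage, based on the
/// variables described above. `env` stands in for the process environment.
/// Assumption: `RUSTFLAGS` applies to every stage and is placed first.
fn rustflags_for_stage(env: &HashMap<&str, &str>, stage: u32) -> Vec<String> {
    // stage0 is the bootstrap stage; everything later is "not bootstrap".
    let staged = if stage == 0 {
        "RUSTFLAGS_BOOTSTRAP"
    } else {
        "RUSTFLAGS_NOT_BOOTSTRAP"
    };
    let mut flags = Vec::new();
    for name in ["RUSTFLAGS", staged] {
        if let Some(value) = env.get(name) {
            flags.extend(value.split_whitespace().map(str::to_owned));
        }
    }
    flags
}

fn main() {
    let env = HashMap::from([
        ("RUSTFLAGS_BOOTSTRAP", "-C debuginfo=1"),
        ("RUSTFLAGS_NOT_BOOTSTRAP", "-C opt-level=3"),
    ]);
    println!("stage0: {:?}", rustflags_for_stage(&env, 0));
    println!("stage1: {:?}", rustflags_for_stage(&env, 1));
}
```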
- RUSTDOCFLAGS, RUSTDOCFLAGS_BOOTSTRAP and RUSTDOCFLAGS_NOT_BOOTSTRAP are analogous to RUSTFLAGS, but for rustdoc.
- CARGOFLAGS will pass arguments to cargo itself (e.g. --timings). CARGOFLAGS_BOOTSTRAP and CARGOFLAGS_NOT_BOOTSTRAP work analogously to RUSTFLAGS_BOOTSTRAP.
- --test-args will pass arguments through to the test runner. For tests/ui, this is compiletest. For unit tests and doc tests this is the libtest runner.
Most test runners accept --help, which you can use to find out the options accepted by the runner.
During bootstrapping, there are a bunch of compiler-internal environment variables that are used. If you are trying to run an intermediate version of rustc, sometimes you may need to set some of these environment variables manually. Otherwise, you get an error like the following:
thread 'main' panicked at 'RUSTC_STAGE was not set: NotPresent', library/core/src/result.rs:1165:5
If ./stageN/bin/rustc gives an error about environment variables, that usually means something is quite wrong, such as trying to compile rustc or std or something else which depends on environment variables. In the unlikely case that you actually need to invoke rustc in such a situation, you can tell the bootstrap shim to print all env variables by adding -vvv to your x command.
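The shape of the panic message shown above ("RUSTC_STAGE was not set: NotPresent") is what Rust produces when a failed environment lookup is reported together with the Debug form of its error. The sketch below reproduces that message format without panicking; parse_stage is a hypothetical helper, and how the real shim reads RUSTC_STAGE is an assumption here.

```rust
use std::env::{self, VarError};

/// Turn the result of looking up `RUSTC_STAGE` into a stage number.
/// On failure, the message mirrors the "<msg>: <error:?>" shape seen in the
/// panic above. (Hypothetical helper, not the shim's real code.)
fn parse_stage(lookup: Result<String, VarError>) -> Result<u32, String> {
    let raw = lookup.map_err(|err| format!("RUSTC_STAGE was not set: {err:?}"))?;
    raw.parse::<u32>()
        .map_err(|err| format!("RUSTC_STAGE is not a number: {err}"))
}

fn main() {
    // Reads the real environment; outside of bootstrap the variable is
    // normally unset, so the error branch is the expected one.
    match parse_stage(env::var("RUSTC_STAGE")) {
        Ok(stage) => println!("running as stage {stage}"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```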
Finally, bootstrap makes use of the cc-rs crate which has its own method of configuring C compilers and C flags via environment variables.
stdout
In this part, we will look at the build command's stdout in action (similar to the topic above, but more detailed and complete). When you execute the x build --dry-run command, the build output will be something like the following:
Building stage0 library artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
Copying stage0 library from stage0 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu)
Building stage0 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
Copying stage0 rustc from stage0 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu)
Assembling stage1 compiler (x86_64-unknown-linux-gnu)
Building stage1 library artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
Copying stage1 library from stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu)
Building stage1 tool rust-analyzer-proc-macro-srv (x86_64-unknown-linux-gnu)
Building rustdoc for stage1 (x86_64-unknown-linux-gnu)
The “Building stage0 ... artifacts” steps use the provided (usually downloaded) compiler to compile the local Rust source into libraries we can use.
The “Copying stage0 ...” steps copy the library and compiler artifacts from cargo into stage0-sysroot/lib/rustlib/{target-triple}/lib.
The “Assembling stage1 compiler” step copies the libraries we built in “building stage0 ... artifacts” into the stage1 compiler's lib/ directory. These are the host libraries that the compiler itself uses to run. They aren't actually used by artifacts the new compiler generates. This step also copies the rustc and rustdoc binaries we generated into build/$HOST/stage/bin.
The stage1/bin/rustc is a fully functional compiler, but it doesn't yet have any libraries to link built binaries or libraries to. The next 3 steps will provide those libraries for it; they are mostly equivalent to constructing the stage1/bin compiler so we don't go through them individually here.
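When skimming output like the lines shown above, it can help to pull out the stage each line refers to. The helper below is a small, hypothetical text-processing sketch (stage_in_line is not part of any Rust tooling): it finds the first "stageN" token in a line and returns N. It assumes the line format shown above, which is not a stable interface.

```rust
/// Return the number following the first occurrence of "stage" in a line of
/// build output, e.g. 0 for "Building stage0 library artifacts (...)".
/// Returns `None` if the line has no "stage<digits>" token at that position.
fn stage_in_line(line: &str) -> Option<u32> {
    let start = line.find("stage")? + "stage".len();
    let digits: String = line[start..]
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

fn main() {
    let output = [
        "Building stage0 library artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)",
        "Assembling stage1 compiler (x86_64-unknown-linux-gnu)",
        "Building rustdoc for stage1 (x86_64-unknown-linux-gnu)",
    ];
    for line in output {
        println!("{:?}: {line}", stage_in_line(line));
    }
}
```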