blob: 831acbd2f8f8eb7e49b0c60aa61eba2adb7cf7ab [file] [log] [blame] [view] [edit]
# Debugger Visualizers
These are typically the last step before the debugger displays the information, but the results may
be piped through a debug adapter such as an IDE's debugger API.
The term "Visualizer" is a bit of a misnomer. The real goal isn't just to prettify the output, but
to provide an interface for the user to interact with that is as useful as possible. In many cases
this means reconstructing the original type as closely as possible to its Rust representation, but
not always.
The visualizer interface allows generating "synthetic children" - fields that don't exist in the
debug info, but can be derived from invariants about the language and the type itself. A simple
example is allowing one to interact with the elements of a `Vec<T>` instead of just it's `*mut u8`
heap pointer, length, and capacity.
## `rust-lldb`, `rust-gdb`, and `rust-windbg.cmd`
These support scripts are distributed with Rust toolchains. They locate the appropriate debugger and
the toolchain's visualizer scripts, then launch the debugger with the appropriate arguments to load
the visualizer scripts before a debugee is launched/attached to.
## `#![debugger_visualizer]`
[This attribute][dbg_vis_attr] allows Rust library authors to include pretty printers for their
types within the library itself. These pretty printers are of the same format as typical
visualizers, but are embedded directly into the compiled binary. These scripts are loaded
automatically by the debugger, allowing a seamless experience for users. This attribute currently
works for GDB and natvis scripts.
[dbg_vis_attr]: https://doc.rust-lang.org/reference/attributes/debugger.html#the-debugger_visualizer-attribute
GDB python scripts are embedded in the `.debug_gdb_scripts` section of the binary. More information
can be found [here](https://sourceware.org/gdb/current/onlinedocs/gdb.html/dotdebug_005fgdb_005fscripts-section.html). Rustc accomplishes this in [`rustc_codegen_llvm/src/debuginfo/gdb.rs`][gdb_rs]
[gdb_rs]: https://github.com/rust-lang/rust/blob/main/compiler/rustc_codegen_llvm/src/debuginfo/gdb.rs
Natvis files can be embedded in the PDB debug info using the [`/NATVIS` linker option][linker_opt],
and have the [highest priority][priority] when a type is resolving which visualizer to use. The
files specified by the attribute are collected into
[`CrateInfo::natvis_debugger_visualizers`][natvis] which are then added as linker arguments in
[`rustc_codegen_ssa/src/back/linker.rs`][linker_rs]
[linker_opt]: https://learn.microsoft.com/en-us/cpp/build/reference/natvis-add-natvis-to-pdb?view=msvc-170
[priority]: https://learn.microsoft.com/en-us/visualstudio/debugger/create-custom-views-of-native-objects?view=visualstudio#BKMK_natvis_location
[natvis]: https://github.com/rust-lang/rust/blob/e0e204f3e97ad5f79524b9c259dc38df606ed82c/compiler/rustc_codegen_ssa/src/lib.rs#L212
[linker_rs]: https://github.com/rust-lang/rust/blob/main/compiler/rustc_codegen_ssa/src/back/linker.rs#L1106
LLDB is not currently supported, but there are a few methods that could potentially allow support in
the future. Officially, the intended method is via a [formatter bytecode][bytecode]. This was
created to offer a comparable experience to GDB's, but without the safety concerns associated with
embedding an entire python script. The opcodes are limited, but it works with `SBValue` and `SBType`
in roughly the same way as python visualizer scripts. Implementing this would require writing some
sort of DSL/mini compiler.
[bytecode]: https://lldb.llvm.org/resources/formatterbytecode.html
Alternatively, it might be possible to copy GDB's strategy entirely: create a bespoke section in the
binary and embed a python script in it. LLDB will not load it automatically, but the python API does
allow one to access the [raw sections of the debug info][SBSection]. With this, it may be possible
to extract the python script from our bespoke section and then load it in during the startup of
Rust's visualizer scripts.
[SBSection]: https://lldb.llvm.org/python_api/lldb.SBSection.html#sbsection
## Performance
Before tackling the visualizers themselves, it's important to note that these are part of a
performance-sensitive system. Please excuse the break in formality, but: if I have to spend
significant time debugging, I'm annoyed. If I have to *wait on my debugger*, I'm pissed.
Every millisecond spent in these visualizers is a millisecond longer for the user to see output.
This can be especially painful for large stackframes that contain many/large container types.
Debugger GUI's such as VSCode will request the whole stack frame at once, and this can result in
delays of tens of seconds (or even minutes) before being able to interact with any variables in the
frame.
There is a tendancy to balk at the idea of optimizing Python code, but it really can have a
substantial impact. Remember, there is no compiler to help keep the code fast. Even simple
transformations are not done for you. It can be difficult to find Python performance tips through
all the noise of people suggesting you don't bother optimizing Python, so here are some things to
keep in mind that are relevant to these scripts:
* Everything allocates, even `int`
* Use tuples when possible. `list` is effectively `Vec<Box<[Any]>>`, whereas tuples are equivalent
to `Box<[Any]>`. They have one less layer of indirection, don't carry extra capacity and can't
grow/shrink which can be advantageous in many cases. An additional benefit is that Python caches and
recycles the underlying allocations of all tuples up to size 20.
* Regexes are slow and should be avoided when simple string manipulation will do
* Strings are immutable, thus many string operations implictly copy the contents.
* When concatenating large lists of strings, `"".join(iterable_of_strings)` is typically the fastest
way to do it.
* f-strings are generally the fastest way to do small, simple string transformations such as
surrounding a string with parentheses.
* The act of calling a function is somewhat slow (even if the function is completely empty). If the
code section is very hot, consider inlining the function manually.
* Local variable access is significantly faster than global and built-in function access
* Member/method access via the `.` operator is also slow, consider reassigning deeply nested values
to local variables to avoid this cost (e.g. `h = a.b.c.d.e.f.g.h`).
* Accessing inherited methods and fields is about 2x slower than base-class methods and fields.
Avoid inheritance whenever possible.
* Use [`__slots__`](https://wiki.python.org/moin/UsingSlots) wherever possible. `__slots__` is a way
to indicate to Python that your class's fields won't change and speeds up field access by a
noticable amount. This does require you to name your fields in advance and initialize them in
`__init__`, but it's a small price to pay for the benefits.
* Match statements/if..elif..else are not optimized in any way. The conditions are checked in order,
1 by 1. If possible, use an alternative such as dictionary dispatch or a table of values
* Compute lazily when possible
* List comprehensions are typically faster than loops, generator comprehensions are a bit slower
than list comprehensions, but use less memory. You can think of comprehensions as equivalent to
Rust's `iter.map()`. List comprehensions effectively call `collect::<Vec<_>>` at the end, whereas
generator comprehensions do not.