| r[asm] |
| # Inline assembly |
| |
| r[asm.intro] |
| Support for inline assembly is provided via the [`asm!`], [`naked_asm!`], and [`global_asm!`] macros. |
| It can be used to embed handwritten assembly in the assembly output generated by the compiler. |
| |
| [`asm!`]: core::arch::asm |
| [`naked_asm!`]: core::arch::naked_asm |
| [`global_asm!`]: core::arch::global_asm |
| |
| r[asm.stable-targets] |
| Support for inline assembly is stable on the following architectures: |
| - x86 and x86-64 |
| - ARM |
| - AArch64 and Arm64EC |
| - RISC-V |
| - LoongArch |
| - s390x |
| |
| The compiler will emit an error if an assembly macro is used on an unsupported target. |
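
Because the macros are rejected on unsupported targets, portable code usually gates each assembly block on `target_arch` and supplies a plain-Rust fallback. The following is a minimal sketch of that pattern; the helper name `add_one` is hypothetical and only serves as an illustration.

```rust
/// Hypothetical helper: adds one to `x` using inline assembly on x86-64.
#[cfg(target_arch = "x86_64")]
fn add_one(x: u64) -> u64 {
    use std::arch::asm;

    let mut v = x;
    // SAFETY: the template only modifies the register bound to `v`.
    unsafe {
        asm!("add {v}, 1", v = inout(reg) v);
    }
    v
}

/// Fallback for every other architecture, written in plain Rust.
/// `wrapping_add` mirrors the wrap-around behavior of the `add` instruction.
#[cfg(not(target_arch = "x86_64"))]
fn add_one(x: u64) -> u64 {
    x.wrapping_add(1)
}
```

Callers use `add_one` the same way on every target; only one of the two definitions is compiled.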
| |
| r[asm.example] |
| ## Example |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| use std::arch::asm; |
| |
| // Multiply x by 6 using shifts and adds |
| let mut x: u64 = 4; |
| unsafe { |
| asm!( |
| "mov {tmp}, {x}", |
| "shl {tmp}, 1", |
| "shl {x}, 2", |
| "add {x}, {tmp}", |
| x = inout(reg) x, |
| tmp = out(reg) _, |
| ); |
| } |
| assert_eq!(x, 4 * 6); |
| # } |
| ``` |
| |
| r[asm.syntax] |
| ## Syntax |
| |
| The following grammar specifies the arguments that can be passed to the `asm!`, `global_asm!` and `naked_asm!` macros. |
| |
| ```grammar,assembly |
| @root AsmArgs -> FormatString (`,` FormatString)* (`,` AsmOperand)* `,`? |
| |
| FormatString -> STRING_LITERAL | RAW_STRING_LITERAL | MacroInvocation |
| |
| AsmOperand -> |
| ClobberAbi |
| | AsmOptions |
| | RegOperand |
| |
| ClobberAbi -> `clobber_abi` `(` Abi (`,` Abi)* `,`? `)` |
| |
| AsmOptions -> |
| `options` `(` ( AsmOption (`,` AsmOption)* `,`? )? `)` |
| |
| AsmOption -> |
| `pure` |
| | `nomem` |
| | `readonly` |
| | `preserves_flags` |
| | `noreturn` |
| | `nostack` |
| | `att_syntax` |
| | `raw` |
| |
| RegOperand -> (ParamName `=`)? |
| ( |
| DirSpec `(` RegSpec `)` Expression |
| | DualDirSpec `(` RegSpec `)` DualDirSpecExpression |
| | `sym` PathExpression |
| | `const` Expression |
| | `label` `{` Statements? `}` |
| ) |
| |
| ParamName -> IDENTIFIER_OR_KEYWORD | RAW_IDENTIFIER |
| |
| DualDirSpecExpression -> |
| Expression |
| | Expression `=>` Expression |
| |
| RegSpec -> RegisterClass | ExplicitRegister |
| |
| RegisterClass -> IDENTIFIER_OR_KEYWORD |
| |
| ExplicitRegister -> STRING_LITERAL |
| |
| DirSpec -> |
| `in` |
| | `out` |
| | `lateout` |
| |
| DualDirSpec -> |
| `inout` |
| | `inlateout` |
| ``` |
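
To relate the grammar to a concrete invocation, the sketch below annotates each argument of an `asm!` call with the production it matches. The function name `add_wrapping` is hypothetical, and the chosen options (`pure`, `nomem`, `nostack`) are an assumption that fits a register-only addition; the grammar above only lists their names.

```rust
/// Hypothetical helper: wrapping addition through inline assembly on x86-64.
#[cfg(target_arch = "x86_64")]
fn add_wrapping(mut acc: u64, delta: u64) -> u64 {
    use std::arch::asm;

    // SAFETY: the template only touches the two allocated registers.
    unsafe {
        asm!(
            // FormatString
            "add {acc}, {delta}",
            // RegOperand: ParamName `=` DualDirSpec `(` RegisterClass `)` Expression
            acc = inout(reg) acc,
            // RegOperand: ParamName `=` DirSpec `(` RegisterClass `)` Expression
            delta = in(reg) delta,
            // AsmOptions containing three AsmOption entries, then the optional trailing comma
            options(pure, nomem, nostack),
        );
    }
    acc
}

/// Plain-Rust fallback so the sketch compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add_wrapping(acc: u64, delta: u64) -> u64 {
    acc.wrapping_add(delta)
}
```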
| |
| r[asm.scope] |
| ## Scope |
| |
| r[asm.scope.intro] |
| Inline assembly can be used in one of three ways. |
| |
| r[asm.scope.asm] |
| With the `asm!` macro, the assembly code is emitted in a function scope and integrated into the compiler-generated assembly code of a function. |
| This assembly code must obey [strict rules](#rules-for-inline-assembly) to avoid undefined behavior. |
| Note that in some cases the compiler may choose to emit the assembly code as a separate function and generate a call to it. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| unsafe { core::arch::asm!("/* {} */", in(reg) 0); } |
| # } |
| ``` |
| |
| r[asm.scope.naked_asm] |
| With the `naked_asm!` macro, the assembly code is emitted in a function scope and constitutes the full assembly code of a function. The `naked_asm!` macro is only allowed in [naked functions](attributes/codegen.md#the-naked-attribute). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| # #[unsafe(naked)] |
| # extern "C" fn wrapper() { |
| core::arch::naked_asm!("/* {} */", const 0); |
| # } |
| # } |
| ``` |
| |
| r[asm.scope.global_asm] |
| With the `global_asm!` macro, the assembly code is emitted in a global scope, outside a function. |
| This can be used to hand-write entire functions using assembly code, and generally provides much more freedom to use arbitrary registers and assembler directives. |
| |
| ```rust |
| # fn main() {} |
| # #[cfg(target_arch = "x86_64")] |
| core::arch::global_asm!("/* {} */", const 0); |
| ``` |
| |
| r[asm.ts-args] |
| ## Template string arguments |
| |
| r[asm.ts-args.syntax] |
| The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). |
| |
| r[asm.ts-args.order] |
| The corresponding arguments are accessed in order, by index, or by name. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64; |
| let y: i64; |
| let z: i64; |
| // This |
| unsafe { core::arch::asm!("mov {}, {}", out(reg) x, in(reg) 5); } |
| // ... this |
| unsafe { core::arch::asm!("mov {0}, {1}", out(reg) y, in(reg) 5); } |
| // ... and this |
| unsafe { core::arch::asm!("mov {out}, {in}", out = out(reg) z, in = in(reg) 5); } |
| // all have the same behavior |
| assert_eq!(x, y); |
| assert_eq!(y, z); |
| # } |
| ``` |
| |
| r[asm.ts-args.no-implicit] |
| However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| let x = 5; |
| // We can't refer to `x` from the scope directly, we need an operand like `in(reg) x` |
| unsafe { core::arch::asm!("/* {x} */"); } // ERROR: no argument named x |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.ts-args.one-or-more] |
| An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them. |
| The expected usage is for each template string argument to correspond to a line of assembly code. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64; |
| let y: i64; |
| // We can separate multiple strings as if they were written together |
| unsafe { core::arch::asm!("mov eax, 5", "mov ecx, eax", out("rax") x, out("rcx") y); } |
| assert_eq!(x, y); |
| # } |
| ``` |
| |
| r[asm.ts-args.before-other-args] |
| All template string arguments must appear before any other arguments. |
| |
| ```rust,compile_fail |
| let x = 5; |
| # #[cfg(target_arch = "x86_64")] { |
| // The template strings need to appear first in the asm invocation |
| unsafe { core::arch::asm!("/* {x} */", x = const 5, "ud2"); } // ERROR: unexpected token |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.ts-args.positional-first] |
| As with format strings, positional arguments must appear before named arguments and explicit [register operands](#register-operands). |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // Named operands need to come after positional ones |
| unsafe { core::arch::asm!("/* {x} {} */", x = const 5, in(reg) 5); } |
| // ERROR: positional arguments cannot follow named arguments or explicit register arguments |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // We also can't put explicit registers before positional operands |
| unsafe { core::arch::asm!("/* {} */", in("eax") 0, in(reg) 5); } |
| // ERROR: positional arguments cannot follow named arguments or explicit register arguments |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.ts-args.register-operands] |
| Explicit register operands cannot be used by placeholders in the template string. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // Explicit register operands don't get substituted, use `eax` explicitly in the string |
| unsafe { core::arch::asm!("/* {} */", in("eax") 5); } |
| // ERROR: invalid reference to argument at index 0 |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.ts-args.at-least-once] |
| All other named and positional operands must appear at least once in the template string, otherwise a compiler error is generated. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // We have to name all of the operands in the format string |
| unsafe { core::arch::asm!("", in(reg) 5, x = const 5); } |
| // ERROR: multiple unused asm arguments |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.ts-args.opaque] |
| The exact assembly code syntax is target-specific and opaque to the compiler except for the way operands are substituted into the template string to form the code passed to the assembler. |
| |
| r[asm.ts-args.llvm-syntax] |
| Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler, which usually corresponds to that of the GNU assembler (GAS). |
| On x86, the `.intel_syntax noprefix` mode of GAS is used by default. |
| On ARM, the `.syntax unified` mode is used. |
| These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string. |
| Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior. |
| Further constraints on the directives used by inline assembly are indicated by [Directives Support](#directives-support). |
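
As an illustration of the default Intel syntax on x86, the sketch below performs the same register copy twice: once in the default mode and once with the `att_syntax` option listed in the grammar, which switches the template to AT&T operand order (source first). The function name `copy_both_syntaxes` is hypothetical.

```rust
/// Hypothetical helper: copies `src` once per syntax mode and returns both results.
#[cfg(target_arch = "x86_64")]
fn copy_both_syntaxes(src: u64) -> (u64, u64) {
    use std::arch::asm;

    let intel: u64;
    let att: u64;
    // SAFETY: both templates only write the register allocated to their output.
    unsafe {
        // Default Intel syntax: destination first, then source.
        asm!("mov {dst}, {src}", dst = out(reg) intel, src = in(reg) src);
        // AT&T syntax: source first, then destination.
        asm!(
            "movq {src}, {dst}",
            dst = out(reg) att,
            src = in(reg) src,
            options(att_syntax),
        );
    }
    (intel, att)
}

/// Plain-Rust fallback so the sketch compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn copy_both_syntaxes(src: u64) -> (u64, u64) {
    (src, src)
}
```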
| |
| [format-syntax]: std::fmt#syntax |
| [rfc-2795]: https://github.com/rust-lang/rfcs/pull/2795 |
| |
| r[asm.operand-type] |
| ## Operand type |
| |
| r[asm.operand-type.supported-operands] |
| Several types of operands are supported: |
| |
| r[asm.operand-type.supported-operands.in] |
| * `in(<reg>) <expr>` |
| - `<reg>` can refer to a register class or an explicit register. |
| The allocated register name is substituted into the asm template string. |
| - The allocated register will contain the value of `<expr>` at the start of the assembly code. |
| - The allocated register must contain the same value at the end of the assembly code (except if a `lateout` is allocated to the same register). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| // `in` can be used to pass values into inline assembly... |
| unsafe { core::arch::asm!("/* {} */", in(reg) 5); } |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.out] |
| * `out(<reg>) <expr>` |
| - `<reg>` can refer to a register class or an explicit register. |
| The allocated register name is substituted into the asm template string. |
| - The allocated register will contain an undefined value at the start of the assembly code. |
| - `<expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the assembly code. |
| - An underscore (`_`) may be specified instead of an expression, which will cause the contents of the register to be discarded at the end of the assembly code (effectively acting as a clobber). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64; |
| // and `out` can be used to pass values back to Rust. |
| unsafe { core::arch::asm!("/* {} */", out(reg) x); } |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.lateout] |
| * `lateout(<reg>) <expr>` |
| - Identical to `out` except that the register allocator can reuse a register allocated to an `in`. |
| - You should only write to the register after all inputs are read, otherwise you may clobber an input. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64; |
| // `lateout` is the same as `out` |
| // but the compiler knows we don't care about the value of any inputs by the |
| // time we overwrite it. |
| unsafe { core::arch::asm!("mov {}, 5", lateout(reg) x); } |
| assert_eq!(x, 5); |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.inout] |
| * `inout(<reg>) <expr>` |
| - `<reg>` can refer to a register class or an explicit register. |
| The allocated register name is substituted into the asm template string. |
| - The allocated register will contain the value of `<expr>` at the start of the assembly code. |
| - `<expr>` must be a mutable initialized place expression, to which the contents of the allocated register are written at the end of the assembly code. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let mut x: i64 = 4; |
| // `inout` can be used to modify values in-register |
| unsafe { core::arch::asm!("inc {}", inout(reg) x); } |
| assert_eq!(x, 5); |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.inout-arrow] |
| * `inout(<reg>) <in expr> => <out expr>` |
| - Same as `inout` except that the initial value of the register is taken from the value of `<in expr>`. |
| - `<out expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the assembly code. |
| - An underscore (`_`) may be specified instead of an expression for `<out expr>`, which will cause the contents of the register to be discarded at the end of the assembly code (effectively acting as a clobber). |
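| - `<in expr>` and `<out expr>` may have different types, within the limits that [register operands](#register-operands) place on separate input and output expressions. |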
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64; |
| // `inout` can also move values to different places |
| unsafe { core::arch::asm!("inc {}", inout(reg) 4u64=>x); } |
| assert_eq!(x, 5); |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.inlateout] |
| * `inlateout(<reg>) <expr>` / `inlateout(<reg>) <in expr> => <out expr>` |
| - Identical to `inout` except that the register allocator can reuse a register allocated to an `in` (this can happen if the compiler knows the `in` has the same initial value as the `inlateout`). |
| - You should only write to the register after all inputs are read, otherwise you may clobber an input. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let mut x: i64 = 4; |
| // `inlateout` is `inout` using `lateout` |
| unsafe { core::arch::asm!("inc {}", inlateout(reg) x); } |
| assert_eq!(x, 5); |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.sym] |
| * `sym <path>` |
| - `<path>` must refer to a `fn` or `static`. |
| - A mangled symbol name referring to the item is substituted into the asm template string. |
| - The substituted string does not include any modifiers (e.g. GOT, PLT, relocations, etc). |
| - `<path>` is allowed to point to a `#[thread_local]` static, in which case the assembly code can combine the symbol with relocations (e.g. `@plt`, `@TPOFF`) to read from thread-local data. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| extern "C" fn foo() { |
| println!("Hello from inline assembly") |
| } |
| // `sym` can be used to refer to a function (even if it doesn't have an |
| // external name we can directly write) |
| unsafe { core::arch::asm!("call {}", sym foo, clobber_abi("C")); } |
| # } |
| ``` |
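
Since `<path>` may also refer to a `static`, the following sketch reads a static through its symbol using RIP-relative addressing on x86-64. The names `read_answer` and `ANSWER` are hypothetical, and the addressing form is an assumption that suits position-independent x86-64 code rather than a requirement of `sym`.

```rust
/// Hypothetical helper: loads a static's value by symbol name on x86-64.
#[cfg(target_arch = "x86_64")]
fn read_answer() -> u64 {
    use std::arch::asm;

    static ANSWER: u64 = 42;
    let value: u64;
    // SAFETY: the template only reads the static and writes the output register.
    unsafe {
        asm!(
            // The mangled symbol of `ANSWER` is substituted for `{item}`.
            "mov {out}, qword ptr [rip + {item}]",
            out = out(reg) value,
            item = sym ANSWER,
            options(readonly, nostack),
        );
    }
    value
}

/// Plain-Rust fallback so the sketch compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn read_answer() -> u64 {
    42
}
```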
| |
| r[asm.operand-type.supported-operands.const] |
| * `const <expr>` |
| - `<expr>` must be an integer constant expression. This expression follows the same rules as inline `const` blocks. |
| - The type of the expression may be any integer type, but defaults to `i32` just like integer literals. |
| - The value of the expression is formatted as a string and substituted directly into the asm template string. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| // swizzle [0, 1, 2, 3] => [3, 2, 0, 1] |
| const SHUFFLE: u8 = 0b01_00_10_11; |
| let x: core::arch::x86_64::__m128 = unsafe { core::mem::transmute([0u32, 1u32, 2u32, 3u32]) }; |
| let y: core::arch::x86_64::__m128; |
| // Pass a constant value into an instruction that expects an immediate like `pshufd` |
| unsafe { |
| core::arch::asm!("pshufd {xmm}, {xmm}, {shuffle}", |
| xmm = inlateout(xmm_reg) x=>y, |
| shuffle = const SHUFFLE |
| ); |
| } |
| let y: [u32; 4] = unsafe { core::mem::transmute(y) }; |
| assert_eq!(y, [3, 2, 0, 1]); |
| # } |
| ``` |
| |
| r[asm.operand-type.supported-operands.label] |
| * `label <block>` |
| - The address of the block is substituted into the asm template string. The assembly code may jump to the substituted address. |
| - For targets that distinguish between direct jumps and indirect jumps (e.g. x86-64 with `cf-protection` enabled), the assembly code must not jump to the substituted address indirectly. |
| - After execution of the block, the `asm!` expression returns. |
| - The type of the block must be unit or `!` (never). |
| - The block starts a new safety context; unsafe operations within the `label` block must be wrapped in an inner `unsafe` block, even though the entire `asm!` expression is already wrapped in `unsafe`. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] |
| unsafe { |
| core::arch::asm!("jmp {}", label { |
| println!("Hello from inline assembly label"); |
| }); |
| } |
| ``` |
| |
| r[asm.operand-type.left-to-right] |
| Operand expressions are evaluated from left to right, just like function call arguments. |
| After the `asm!` has executed, outputs are written in left-to-right order. |
| This is significant if two outputs point to the same place: that place will contain the value of the rightmost output. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let mut y: i64; |
| // y gets its value from the second output, rather than the first |
| unsafe { core::arch::asm!("mov {}, 0", "mov {}, 1", out(reg) y, out(reg) y); } |
| assert_eq!(y, 1); |
| # } |
| ``` |
| |
| r[asm.operand-type.naked_asm-restriction] |
| Because `naked_asm!` defines a whole function body and the compiler cannot emit any additional code to handle operands, it can only use `sym` and `const` operands. |
| |
| r[asm.operand-type.global_asm-restriction] |
| Because `global_asm!` exists outside a function, it can only use `sym` and `const` operands. |
| |
| ```rust,compile_fail |
| # fn main() {} |
| // register operands aren't allowed, since we aren't in a function |
| # #[cfg(target_arch = "x86_64")] |
| core::arch::global_asm!("", in(reg) 5); |
| // ERROR: the `in` operand cannot be used with `global_asm!` |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| ```rust |
| # fn main() {} |
| fn foo() {} |
| |
| # #[cfg(target_arch = "x86_64")] |
| // `const` and `sym` are both allowed, however |
| core::arch::global_asm!("/* {} {} */", const 0, sym foo); |
| ``` |
| |
| r[asm.register-operands] |
| ## Register operands |
| |
| r[asm.register-operands.register-or-class] |
| Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. |
| Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let mut y: i64; |
| // We can name either a register class such as `reg` or an explicit register |
| // such as `eax` to get an integer register |
| unsafe { core::arch::asm!("mov eax, {:e}", in(reg) 5, lateout("eax") y); } |
| assert_eq!(y, 5); |
| # } |
| ``` |
| |
| r[asm.register-operands.equivalence-to-base-register] |
| Note that explicit registers treat register aliases (e.g. `r14` vs `lr` on ARM) and smaller views of a register (e.g. `eax` vs `rax`) as equivalent to the base register. |
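
A small sketch of this equivalence on x86-64: the operand names the register as `"eax"`, yet the full 64-bit register is read back because the output has a 64-bit type. The function name `all_ones` is hypothetical.

```rust
/// Hypothetical helper: returns the value left in `rax` by the template.
#[cfg(target_arch = "x86_64")]
fn all_ones() -> u64 {
    use std::arch::asm;

    let value: u64;
    // SAFETY: the template only writes `rax`, which is declared as an output.
    unsafe {
        // "eax" is treated as the same register as "rax"; the type of `value`
        // determines how much of the register is copied out.
        asm!("mov rax, -1", out("eax") value);
    }
    value
}

/// Plain-Rust fallback so the sketch compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn all_ones() -> u64 {
    u64::MAX
}
```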
| |
| r[asm.register-operands.error-two-operands] |
| It is a compile-time error to use the same explicit register for two input operands or two output operands. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // We can't name eax twice |
| unsafe { core::arch::asm!("", in("eax") 5, in("eax") 4); } |
| // ERROR: register `eax` conflicts with register `eax` |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // ... even using different aliases |
| unsafe { core::arch::asm!("", in("ax") 5, in("rax") 4); } |
| // ERROR: register `rax` conflicts with register `ax` |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.register-operands.error-overlapping] |
| It is also a compile-time error to use overlapping registers (e.g. ARM VFP) in input operands or in output operands. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // al overlaps with ax, so we can't name both of them. |
| unsafe { core::arch::asm!("", in("ax") 5, in("al") 4i8); } |
| // ERROR: register `al` conflicts with register `ax` |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.register-operands.allowed-types] |
| Only the following types are allowed as operands for inline assembly: |
| - Integers (signed and unsigned) |
| - Floating-point numbers |
| - Pointers (thin only) |
| - Function pointers |
| - SIMD vectors (structs defined with `#[repr(simd)]` and which implement `Copy`). |
| This includes architecture-specific vector types defined in `std::arch` such as `__m128` (x86) or `int8x16_t` (ARM). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| extern "C" fn foo() {} |
| |
| // Integers are allowed... |
| let y: i64 = 5; |
| unsafe { core::arch::asm!("/* {} */", in(reg) y); } |
| |
| // and pointers... |
| let py = &raw const y; |
| unsafe { core::arch::asm!("/* {} */", in(reg) py); } |
| |
| // floats as well... |
| let f = 1.0f32; |
| unsafe { core::arch::asm!("/* {} */", in(xmm_reg) f); } |
| |
| // even function pointers and simd vectors. |
| let func: extern "C" fn() = foo; |
| unsafe { core::arch::asm!("/* {} */", in(reg) func); } |
| |
| let z = unsafe { core::arch::x86_64::_mm_set_epi64x(1, 0) }; |
| unsafe { core::arch::asm!("/* {} */", in(xmm_reg) z); } |
| # } |
| ``` |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| struct Foo; |
| let x: Foo = Foo; |
| // Complex types like structs are not allowed |
| unsafe { core::arch::asm!("/* {} */", in(reg) x); } |
| // ERROR: cannot use value of type `Foo` for inline assembly |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.register-operands.supported-register-classes] |
| Here is the list of currently supported register classes: |
| |
| | Architecture | Register class | Registers | LLVM constraint code | |
| | ------------ | -------------- | --------- | -------------------- | |
| | x86 | `reg` | `ax`, `bx`, `cx`, `dx`, `si`, `di`, `bp`, `r[8-15]` (x86-64 only) | `r` | |
| | x86 | `reg_abcd` | `ax`, `bx`, `cx`, `dx` | `Q` | |
| | x86-32 | `reg_byte` | `al`, `bl`, `cl`, `dl`, `ah`, `bh`, `ch`, `dh` | `q` | |
| | x86-64 | `reg_byte`\* | `al`, `bl`, `cl`, `dl`, `sil`, `dil`, `bpl`, `r[8-15]b` | `q` | |
| | x86 | `xmm_reg` | `xmm[0-7]` (x86) `xmm[0-15]` (x86-64) | `x` | |
| | x86 | `ymm_reg` | `ymm[0-7]` (x86) `ymm[0-15]` (x86-64) | `x` | |
| | x86 | `zmm_reg` | `zmm[0-7]` (x86) `zmm[0-31]` (x86-64) | `v` | |
| | x86 | `kreg` | `k[1-7]` | `Yk` | |
| | x86 | `kreg0` | `k0` | Only clobbers | |
| | x86 | `x87_reg` | `st([0-7])` | Only clobbers | |
| | x86 | `mmx_reg` | `mm[0-7]` | Only clobbers | |
| | x86-64 | `tmm_reg` | `tmm[0-7]` | Only clobbers | |
| | AArch64 | `reg` | `x[0-30]` | `r` | |
| | AArch64 | `vreg` | `v[0-31]` | `w` | |
| | AArch64 | `vreg_low16` | `v[0-15]` | `x` | |
| | AArch64 | `preg` | `p[0-15]`, `ffr` | Only clobbers | |
| | Arm64EC | `reg` | `x[0-12]`, `x[15-22]`, `x[25-27]`, `x30` | `r` | |
| | Arm64EC | `vreg` | `v[0-15]` | `w` | |
| | Arm64EC | `vreg_low16` | `v[0-15]` | `x` | |
| | ARM (ARM/Thumb2) | `reg` | `r[0-12]`, `r14` | `r` | |
| | ARM (Thumb1) | `reg` | `r[0-7]` | `r` | |
| | ARM | `sreg` | `s[0-31]` | `t` | |
| | ARM | `sreg_low16` | `s[0-15]` | `x` | |
| | ARM | `dreg` | `d[0-31]` | `w` | |
| | ARM | `dreg_low16` | `d[0-15]` | `t` | |
| | ARM | `dreg_low8` | `d[0-7]` | `x` | |
| | ARM | `qreg` | `q[0-15]` | `w` | |
| | ARM | `qreg_low8` | `q[0-7]` | `t` | |
| | ARM | `qreg_low4` | `q[0-3]` | `x` | |
| | RISC-V | `reg` | `x1`, `x[5-7]`, `x[9-15]`, `x[16-31]` (non-RV32E) | `r` | |
| | RISC-V | `freg` | `f[0-31]` | `f` | |
| | RISC-V | `vreg` | `v[0-31]` | Only clobbers | |
| | LoongArch | `reg` | `$r1`, `$r[4-20]`, `$r[23-30]` | `r` | |
| | LoongArch | `freg` | `$f[0-31]` | `f` | |
| | s390x | `reg` | `r[0-10]`, `r[12-14]` | `r` | |
| | s390x | `reg_addr` | `r[1-10]`, `r[12-14]` | `a` | |
| | s390x | `freg` | `f[0-15]` | `f` | |
| | s390x | `vreg` | `v[0-31]` | Only clobbers | |
| | s390x | `areg` | `a[2-15]` | Only clobbers | |
| |
| > [!NOTE] |
| > - On x86 we treat `reg_byte` differently from `reg` because the compiler can allocate `al` and `ah` separately whereas `reg` reserves the whole register. |
| > - On x86-64 the high byte registers (e.g. `ah`) are not available in the `reg_byte` register class. |
| > - Some register classes are marked as "Only clobbers" which means that registers in these classes cannot be used for inputs or outputs, only clobbers of the form `out(<explicit register>) _` or `lateout(<explicit register>) _`. |
| |
| r[asm.register-operands.value-type-constraints] |
| Each register class has constraints on which value types it can be used with. |
| This is necessary because the way a value is loaded into a register depends on its type. |
| For example, on big-endian systems, loading an `i32x4` and an `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical. |
| The availability of supported types for a particular register class may depend on what target features are currently enabled. |
| |
| | Architecture | Register class | Target feature | Allowed types | |
| | ------------ | -------------- | -------------- | ------------- | |
| | x86-32 | `reg` | None | `i16`, `i32`, `f32` | |
| | x86-64 | `reg` | None | `i16`, `i32`, `f32`, `i64`, `f64` | |
| | x86 | `reg_byte` | None | `i8` | |
| | x86 | `xmm_reg` | `sse` | `i32`, `f32`, `i64`, `f64`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` | |
| | x86 | `ymm_reg` | `avx` | `i32`, `f32`, `i64`, `f64`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` <br> `i8x32`, `i16x16`, `i32x8`, `i64x4`, `f32x8`, `f64x4` | |
| | x86 | `zmm_reg` | `avx512f` | `i32`, `f32`, `i64`, `f64`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` <br> `i8x32`, `i16x16`, `i32x8`, `i64x4`, `f32x8`, `f64x4` <br> `i8x64`, `i16x32`, `i32x16`, `i64x8`, `f32x16`, `f64x8` | |
| | x86 | `kreg` | `avx512f` | `i8`, `i16` | |
| | x86 | `kreg` | `avx512bw` | `i32`, `i64` | |
| | x86 | `mmx_reg` | N/A | Only clobbers | |
| | x86 | `x87_reg` | N/A | Only clobbers | |
| | x86 | `tmm_reg` | N/A | Only clobbers | |
| | AArch64 | `reg` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` | |
| | AArch64 | `vreg` | `neon` | `i8`, `i16`, `i32`, `f32`, `i64`, `f64`, <br> `i8x8`, `i16x4`, `i32x2`, `i64x1`, `f32x2`, `f64x1`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` | |
| | AArch64 | `preg` | N/A | Only clobbers | |
| | Arm64EC | `reg` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` | |
| | Arm64EC | `vreg` | `neon` | `i8`, `i16`, `i32`, `f32`, `i64`, `f64`, <br> `i8x8`, `i16x4`, `i32x2`, `i64x1`, `f32x2`, `f64x1`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` | |
| | ARM | `reg` | None | `i8`, `i16`, `i32`, `f32` | |
| | ARM | `sreg` | `vfp2` | `i32`, `f32` | |
| | ARM | `dreg` | `vfp2` | `i64`, `f64`, `i8x8`, `i16x4`, `i32x2`, `i64x1`, `f32x2` | |
| | ARM | `qreg` | `neon` | `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4` | |
| | RISC-V32 | `reg` | None | `i8`, `i16`, `i32`, `f32` | |
| | RISC-V64 | `reg` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` | |
| | RISC-V | `freg` | `f` | `f32` | |
| | RISC-V | `freg` | `d` | `f64` | |
| | RISC-V | `vreg` | N/A | Only clobbers | |
| | LoongArch64 | `reg` | None | `i8`, `i16`, `i32`, `i64`, `f32`, `f64` | |
| | LoongArch64 | `freg` | `f` | `f32` | |
| | LoongArch64 | `freg` | `d` | `f64` | |
| | s390x | `reg`, `reg_addr` | None | `i8`, `i16`, `i32`, `i64` | |
| | s390x | `freg` | None | `f32`, `f64` | |
| | s390x | `vreg` | N/A | Only clobbers | |
| | s390x | `areg` | N/A | Only clobbers | |
| |
| > [!NOTE] |
| > For the purposes of the above table, pointers, function pointers, and `isize`/`usize` are treated as the equivalent integer type (`i16`/`i32`/`i64` depending on the target). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x = 5i32; |
| let y = -1i8; |
| let z = unsafe { core::arch::x86_64::_mm_set_epi64x(1, 0) }; |
| |
| // `reg` is valid for `i32`, `reg_byte` is valid for `i8`, and `xmm_reg` is valid for `__m128i` |
| // We can't use `tmm0` as an input or output, but we can clobber it. |
| unsafe { core::arch::asm!("/* {} {} {} */", in(reg) x, in(reg_byte) y, in(xmm_reg) z, out("tmm0") _); } |
| # } |
| ``` |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| let z = unsafe { core::arch::x86_64::_mm_set_epi64x(1, 0) }; |
| // We can't pass an `__m128i` to a `reg` input |
| unsafe { core::arch::asm!("/* {} */", in(reg) z); } |
| // ERROR: type `__m128i` cannot be used with this register class |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.register-operands.smaller-value] |
| If a value is smaller than the register it is allocated in, then the upper bits of that register will have an undefined value for inputs and will be ignored for outputs. |
| The only exception is the `freg` register class on RISC-V, where `f32` values are NaN-boxed in an `f64` as required by the RISC-V architecture. |
| |
| <!--no_run, this test has a non-deterministic runtime behavior--> |
| ```rust,no_run |
| # #[cfg(target_arch = "x86_64")] { |
| let mut x: i64; |
| // Moving a 32-bit value into a 64-bit value, oops. |
| #[allow(asm_sub_register)] // rustc warns about this behavior |
| unsafe { core::arch::asm!("mov {}, {}", lateout(reg) x, in(reg) 4i32); } |
| // the top 32 bits of `x` are indeterminate |
| assert_eq!(x, 4); // This assertion is not guaranteed to succeed |
| assert_eq!(x & 0xFFFFFFFF, 4); // However, this one will succeed |
| # } |
| ``` |
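
One way to avoid depending on those undefined upper bits on x86-64 is to operate on the 32-bit view of both registers, using the `e` template modifier that also appears in the register-operands example. This is a sketch under the assumption that the target is x86-64, where writing a 32-bit register clears bits 32 to 63 of the full register; the function name `widen` is hypothetical.

```rust
/// Hypothetical helper: zero-extends a `u32` to a `u64` through inline assembly.
#[cfg(target_arch = "x86_64")]
fn widen(x: u32) -> u64 {
    use std::arch::asm;

    let wide: u64;
    // SAFETY: the template only writes the register allocated to `wide`.
    unsafe {
        // `{src:e}` reads only the defined low 32 bits of the input register.
        // `{dst:e}` writes the 32-bit view, which zeroes the upper half.
        asm!("mov {dst:e}, {src:e}", dst = out(reg) wide, src = in(reg) x);
    }
    wide
}

/// Plain-Rust fallback so the sketch compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn widen(x: u32) -> u64 {
    u64::from(x)
}
```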
| |
| r[asm.register-operands.separate-input-output] |
| When separate input and output expressions are specified for an `inout` operand, both expressions must have the same type. |
| The only exception is if both operands are pointers or integers, in which case they are only required to have the same size. |
| This restriction exists because the register allocators in LLVM and GCC sometimes cannot handle tied operands with different types. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| // Pointers and integers can mix (as long as they are the same size) |
| let x: isize = 0; |
| let y: *mut (); |
| // Transmute an `isize` to a `*mut ()`, using inline assembly magic |
| unsafe { core::arch::asm!("/*{}*/", inout(reg) x=>y); } |
| assert!(y.is_null()); // Extremely roundabout way to make a null pointer |
| # } |
| ``` |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i32 = 0; |
| let y: f32; |
| // But we can't reinterpret an `i32` to an `f32` like this |
| unsafe { core::arch::asm!("/* {} */", inout(reg) x=>y); } |
| // ERROR: incompatible types for asm inout argument |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.register-names] |
| ## Register names |
| |
| r[asm.register-names.supported-register-aliases] |
| Some registers have multiple names. |
| These are all treated by the compiler as identical to the base register name. |
| Here is the list of all supported register aliases: |
| |
| | Architecture | Base register | Aliases | |
| | ------------ | ------------- | ------- | |
| | x86 | `ax` | `eax`, `rax` | |
| | x86 | `bx` | `ebx`, `rbx` | |
| | x86 | `cx` | `ecx`, `rcx` | |
| | x86 | `dx` | `edx`, `rdx` | |
| | x86 | `si` | `esi`, `rsi` | |
| | x86 | `di` | `edi`, `rdi` | |
| | x86 | `bp` | `bpl`, `ebp`, `rbp` | |
| | x86 | `sp` | `spl`, `esp`, `rsp` | |
| | x86 | `ip` | `eip`, `rip` | |
| | x86 | `st(0)` | `st` | |
| | x86 | `r[8-15]` | `r[8-15]b`, `r[8-15]w`, `r[8-15]d` | |
| | x86 | `xmm[0-31]` | `ymm[0-31]`, `zmm[0-31]` | |
| | AArch64 | `x[0-30]` | `w[0-30]` | |
| | AArch64 | `x29` | `fp` | |
| | AArch64 | `x30` | `lr` | |
| | AArch64 | `sp` | `wsp` | |
| | AArch64 | `xzr` | `wzr` | |
| | AArch64 | `v[0-31]` | `b[0-31]`, `h[0-31]`, `s[0-31]`, `d[0-31]`, `q[0-31]` | |
| | Arm64EC | `x[0-30]` | `w[0-30]` | |
| | Arm64EC | `x29` | `fp` | |
| | Arm64EC | `x30` | `lr` | |
| | Arm64EC | `sp` | `wsp` | |
| | Arm64EC | `xzr` | `wzr` | |
| | Arm64EC | `v[0-15]` | `b[0-15]`, `h[0-15]`, `s[0-15]`, `d[0-15]`, `q[0-15]` | |
| | ARM | `r[0-3]` | `a[1-4]` | |
| | ARM | `r[4-9]` | `v[1-6]` | |
| | ARM | `r9` | `rfp` | |
| | ARM | `r10` | `sl` | |
| | ARM | `r11` | `fp` | |
| | ARM | `r12` | `ip` | |
| | ARM | `r13` | `sp` | |
| | ARM | `r14` | `lr` | |
| | ARM | `r15` | `pc` | |
| | RISC-V | `x0` | `zero` | |
| | RISC-V | `x1` | `ra` | |
| | RISC-V | `x2` | `sp` | |
| | RISC-V | `x3` | `gp` | |
| | RISC-V | `x4` | `tp` | |
| | RISC-V | `x[5-7]` | `t[0-2]` | |
| | RISC-V | `x8` | `fp`, `s0` | |
| | RISC-V | `x9` | `s1` | |
| | RISC-V | `x[10-17]` | `a[0-7]` | |
| | RISC-V | `x[18-27]` | `s[2-11]` | |
| | RISC-V | `x[28-31]` | `t[3-6]` | |
| | RISC-V | `f[0-7]` | `ft[0-7]` | |
| | RISC-V | `f[8-9]` | `fs[0-1]` | |
| | RISC-V | `f[10-17]` | `fa[0-7]` | |
| | RISC-V | `f[18-27]` | `fs[2-11]` | |
| | RISC-V | `f[28-31]` | `ft[8-11]` | |
| | LoongArch | `$r0` | `$zero` | |
| | LoongArch | `$r1` | `$ra` | |
| | LoongArch | `$r2` | `$tp` | |
| | LoongArch | `$r3` | `$sp` | |
| | LoongArch | `$r[4-11]` | `$a[0-7]` | |
| | LoongArch | `$r[12-20]` | `$t[0-8]` | |
| | LoongArch | `$r21` | | |
| | LoongArch | `$r22` | `$fp`, `$s9` | |
| | LoongArch | `$r[23-31]` | `$s[0-8]` | |
| | LoongArch | `$f[0-7]` | `$fa[0-7]` | |
| | LoongArch | `$f[8-23]` | `$ft[0-15]` | |
| | LoongArch | `$f[24-31]` | `$fs[0-7]` | |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let z = 0i64; |
| // rax is an alias for eax and ax |
| unsafe { core::arch::asm!("", in("rax") z); } |
| # } |
| ``` |
| |
| r[asm.register-names.not-for-io] |
| Some registers cannot be used for input or output operands: |
| |
| | Architecture | Unsupported register | Reason | |
| | ------------ | -------------------- | ------ | |
| | All | `sp`, `r15` (s390x) | The stack pointer must be restored to its original value at the end of the assembly code or before jumping to a `label` block. | |
| | All | `bp` (x86), `x29` (AArch64 and Arm64EC), `x8` (RISC-V), `$fp` (LoongArch), `r11` (s390x) | The frame pointer cannot be used as an input or output. | |
| | ARM | `r7` or `r11` | On ARM the frame pointer can be either `r7` or `r11` depending on the target. The frame pointer cannot be used as an input or output. | |
| | All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64 and Arm64EC), `x9` (RISC-V), `$s8` (LoongArch) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. | |
| | x86 | `ip` | This is the program counter, not a real register. | |
| | AArch64 | `xzr` | This is a constant zero register which can't be modified. | |
| | AArch64 | `x18` | This is an OS-reserved register on some AArch64 targets. | |
| | Arm64EC | `xzr` | This is a constant zero register which can't be modified. | |
| | Arm64EC | `x18` | This is an OS-reserved register. | |
| | Arm64EC | `x13`, `x14`, `x23`, `x24`, `x28`, `v[16-31]`, `p[0-15]`, `ffr` | These are AArch64 registers that are not supported for Arm64EC. | |
| | ARM | `pc` | This is the program counter, not a real register. | |
| | ARM | `r9` | This is an OS-reserved register on some ARM targets. | |
| | RISC-V | `x0` | This is a constant zero register which can't be modified. | |
| | RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. | |
| | LoongArch | `$r0` or `$zero` | This is a constant zero register which can't be modified. | |
| | LoongArch | `$r2` or `$tp` | This is reserved for TLS. | |
| | LoongArch | `$r21` | This is reserved by the ABI. | |
| | s390x | `c[0-15]` | Reserved by the kernel. | |
| | s390x | `a[0-1]` | Reserved for system use. | |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // bp is reserved |
| unsafe { core::arch::asm!("", in("bp") 5i32); } |
| // ERROR: invalid register `bp`: the frame pointer cannot be used as an operand for inline asm |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.register-names.fp-bp-reserved] |
| The frame pointer and base pointer registers are reserved for internal use by LLVM. While `asm!` statements cannot explicitly specify the use of reserved registers, in some cases LLVM will allocate one of these reserved registers for `reg` operands. Assembly code making use of reserved registers should be careful since `reg` operands may use the same registers. |
| |
| r[asm.template-modifiers] |
| ## Template modifiers |
| |
| r[asm.template-modifiers.intro] |
| The placeholders can be augmented by modifiers which are specified after the `:` in the curly braces. |
| These modifiers do not affect register allocation, but change the way operands are formatted when inserted into the template string. |
| |
| r[asm.template-modifiers.only-one] |
| Only one modifier is allowed per template placeholder. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // We can't specify both `r` and `e` at the same time. |
| unsafe { core::arch::asm!("/* {:er} */", in(reg) 5i32); }
| // ERROR: asm template modifier must be a single character |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.template-modifiers.supported-modifiers] |
| The supported modifiers are a subset of LLVM's (and GCC's) [asm template argument modifiers][llvm-argmod], but do not use the same letter codes. |
| |
| | Architecture | Register class | Modifier | Example output | LLVM modifier | |
| | ------------ | -------------- | -------- | -------------- | ------------- | |
| | x86-32 | `reg` | None | `eax` | `k` | |
| | x86-64 | `reg` | None | `rax` | `q` | |
| | x86-32 | `reg_abcd` | `l` | `al` | `b` | |
| | x86-64 | `reg` | `l` | `al` | `b` | |
| | x86 | `reg_abcd` | `h` | `ah` | `h` | |
| | x86 | `reg` | `x` | `ax` | `w` | |
| | x86 | `reg` | `e` | `eax` | `k` | |
| | x86-64 | `reg` | `r` | `rax` | `q` | |
| | x86 | `reg_byte` | None | `al` / `ah` | None | |
| | x86 | `xmm_reg` | None | `xmm0` | `x` | |
| | x86 | `ymm_reg` | None | `ymm0` | `t` | |
| | x86 | `zmm_reg` | None | `zmm0` | `g` | |
| | x86 | `*mm_reg` | `x` | `xmm0` | `x` | |
| | x86 | `*mm_reg` | `y` | `ymm0` | `t` | |
| | x86 | `*mm_reg` | `z` | `zmm0` | `g` | |
| | x86 | `kreg` | None | `k1` | None | |
| | AArch64/Arm64EC | `reg` | None | `x0` | `x` | |
| | AArch64/Arm64EC | `reg` | `w` | `w0` | `w` | |
| | AArch64/Arm64EC | `reg` | `x` | `x0` | `x` | |
| | AArch64/Arm64EC | `vreg` | None | `v0` | None | |
| | AArch64/Arm64EC | `vreg` | `v` | `v0` | None | |
| | AArch64/Arm64EC | `vreg` | `b` | `b0` | `b` | |
| | AArch64/Arm64EC | `vreg` | `h` | `h0` | `h` | |
| | AArch64/Arm64EC | `vreg` | `s` | `s0` | `s` | |
| | AArch64/Arm64EC | `vreg` | `d` | `d0` | `d` | |
| | AArch64/Arm64EC | `vreg` | `q` | `q0` | `q` | |
| | ARM | `reg` | None | `r0` | None | |
| | ARM | `sreg` | None | `s0` | None | |
| | ARM | `dreg` | None | `d0` | `P` | |
| | ARM | `qreg` | None | `q0` | `q` | |
| | ARM | `qreg` | `e` / `f` | `d0` / `d1` | `e` / `f` | |
| | RISC-V | `reg` | None | `x1` | None | |
| | RISC-V | `freg` | None | `f0` | None | |
| | LoongArch | `reg` | None | `$r1` | None | |
| | LoongArch | `freg` | None | `$f0` | None | |
| | s390x | `reg` | None | `%r0` | None | |
| | s390x | `reg_addr` | None | `%r1` | None | |
| | s390x | `freg` | None | `%f0` | None | |
| |
| > [!NOTE] |
| > - on ARM `e` / `f`: this prints the low or high doubleword register name of a NEON quad (128-bit) register. |
| > - on x86: our behavior for `reg` with no modifiers differs from what GCC does. |
| > GCC will infer the modifier based on the operand value type, while we default to the full register size. |
| > - on x86 `xmm_reg`: the `x`, `t` and `g` LLVM modifiers are not yet implemented in LLVM (they are supported by GCC only), but this should be a simple change. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let mut x = 0x10u16; |
| |
| // u16::swap_bytes using `xchg` |
| // low half of `{x}` is referred to by `{x:l}`, and the high half by `{x:h}` |
| unsafe { core::arch::asm!("xchg {x:l}, {x:h}", x = inout(reg_abcd) x); } |
| assert_eq!(x, 0x1000u16); |
| # } |
| ``` |
| |
| r[asm.template-modifiers.smaller-value] |
| As stated in the previous section, passing an input value smaller than the register width will result in the upper bits of the register containing undefined values. |
| This is not a problem if the inline asm only accesses the lower bits of the register, which can be done by using a template modifier to use a subregister name in the assembly code (e.g. `ax` instead of `rax`). |
| Since this is an easy pitfall, the compiler will suggest a template modifier to use where appropriate given the input type.
| If all references to an operand already have modifiers then the warning is suppressed for that operand. |
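| 
| As a rough illustration of the point above, the sketch below passes a `u32` in a `reg` operand on x86-64 and refers to it only through the `e` modifier, so the undefined upper half of the 64-bit register is never read. The helper name `add_one` and the portable fallback are assumptions made for this example only.
| 
| ```rust
| // Hypothetical helper: adds one to a `u32` using the 32-bit view of the register.
| #[cfg(target_arch = "x86_64")]
| fn add_one(v: u32) -> u32 {
|     let mut x = v;
|     // `{x:e}` formats the allocated register with its 32-bit name (e.g. `eax`),
|     // so only the bits that were actually initialized from `x` are used.
|     unsafe {
|         core::arch::asm!(
|             "add {x:e}, 1",
|             x = inout(reg) x,
|             options(pure, nomem, nostack)
|         );
|     }
|     x
| }
| 
| // Portable fallback (an assumption of this sketch) so it builds elsewhere.
| #[cfg(not(target_arch = "x86_64"))]
| fn add_one(v: u32) -> u32 {
|     v.wrapping_add(1)
| }
| ```
| 
| Because every reference to `x` in the template carries a modifier, no modifier suggestion is expected for this operand.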
| |
| [llvm-argmod]: http://llvm.org/docs/LangRef.html#asm-template-argument-modifiers |
| |
| r[asm.abi-clobbers] |
| ## ABI clobbers |
| |
| r[asm.abi-clobbers.intro] |
| The `clobber_abi` keyword can be used to apply a default set of clobbers to the assembly code. |
| This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then `lateout("...") _` is implicitly added to the operands list (where the `...` is replaced by the register's name). |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| extern "C" fn foo() -> i32 { 0 } |
| |
| let z: i32; |
| // To call a function, we have to inform the compiler that the call clobbers
| // the caller-saved registers of its calling convention
| unsafe { core::arch::asm!("call {}", sym foo, out("rax") z, clobber_abi("C")); } |
| assert_eq!(z, 0); |
| # } |
| ``` |
| |
| r[asm.abi-clobbers.many] |
| `clobber_abi` may be specified any number of times. It will insert a clobber for all unique registers in the union of all specified calling conventions. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| extern "sysv64" fn foo() -> i32 { 0 } |
| extern "win64" fn bar(x: i32) -> i32 { x + 1 }
| |
| let z: i32; |
| // We can even call multiple functions with different conventions and |
| // different saved registers |
| unsafe { |
| core::arch::asm!( |
| "call {}", |
| "mov ecx, eax", |
| "call {}", |
| sym foo, |
| sym bar, |
| out("rax") z, |
| clobber_abi("sysv64", "win64")
| ); |
| } |
| assert_eq!(z, 1); |
| # } |
| ``` |
| |
| r[asm.abi-clobbers.must-specify] |
| Generic register class outputs are disallowed by the compiler when `clobber_abi` is used: all outputs must specify an explicit register. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| extern "C" fn foo(x: i32) -> i32 { 0 } |
| |
| let z: i32; |
| // explicit registers must be used to not accidentally overlap. |
| unsafe { |
| core::arch::asm!( |
| "mov eax, {:e}", |
| "call {}", |
| out(reg) z, |
| sym foo, |
| clobber_abi("C") |
| ); |
| // ERROR: asm with `clobber_abi` must specify explicit registers for outputs |
| } |
| assert_eq!(z, 0); |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.abi-clobbers.explicit-have-precedence] |
| Explicit register outputs have precedence over the implicit clobbers inserted by `clobber_abi`: a clobber will only be inserted for a register if that register is not used as an output. |
| |
| r[asm.abi-clobbers.supported-abis] |
| The following ABIs can be used with `clobber_abi`: |
| |
| | Architecture | ABI name | Clobbered registers | |
| | ------------ | -------- | ------------------- | |
| | x86-32 | `"C"`, `"system"`, `"efiapi"`, `"cdecl"`, `"stdcall"`, `"fastcall"` | `ax`, `cx`, `dx`, `xmm[0-7]`, `mm[0-7]`, `k[0-7]`, `st([0-7])` | |
| | x86-64 | `"C"`, `"system"` (on Windows), `"efiapi"`, `"win64"` | `ax`, `cx`, `dx`, `r[8-11]`, `xmm[0-31]`, `mm[0-7]`, `k[0-7]`, `st([0-7])`, `tmm[0-7]` | |
| | x86-64 | `"C"`, `"system"` (on non-Windows), `"sysv64"` | `ax`, `cx`, `dx`, `si`, `di`, `r[8-11]`, `xmm[0-31]`, `mm[0-7]`, `k[0-7]`, `st([0-7])`, `tmm[0-7]` | |
| | AArch64 | `"C"`, `"system"`, `"efiapi"` | `x[0-17]`, `x18`\*, `x30`, `v[0-31]`, `p[0-15]`, `ffr` | |
| | Arm64EC | `"C"`, `"system"` | `x[0-12]`, `x[15-17]`, `x30`, `v[0-15]` | |
| | ARM | `"C"`, `"system"`, `"efiapi"`, `"aapcs"` | `r[0-3]`, `r12`, `r14`, `s[0-15]`, `d[0-7]`, `d[16-31]` | |
| | RISC-V | `"C"`, `"system"`, `"efiapi"` | `x1`, `x[5-7]`, `x[10-17]`\*, `x[28-31]`\*, `f[0-7]`, `f[10-17]`, `f[28-31]`, `v[0-31]` | |
| | LoongArch | `"C"`, `"system"` | `$r1`, `$r[4-20]`, `$f[0-23]` | |
| | s390x | `"C"`, `"system"` | `r[0-5]`, `r14`, `f[0-7]`, `v[0-31]`, `a[2-15]` | |
| |
| > [!NOTE] |
| > - On AArch64, `x18` is only included in the clobber list if it is not considered a reserved register on the target.
| > - On RISC-V, `x[16-17]` and `x[28-31]` are only included in the clobber list if they are not considered reserved registers on the target.
| |
| The list of clobbered registers for each ABI is updated in rustc as architectures gain new registers: this ensures that `asm!` clobbers will continue to be correct when LLVM starts using these new registers in its generated code. |
| |
| r[asm.options] |
| ## Options |
| |
| r[asm.options.supported-options] |
| Flags are used to further influence the behavior of the inline assembly code. |
| Currently the following options are defined: |
| |
| r[asm.options.supported-options.pure] |
| - `pure`: The assembly code has no side effects, must eventually return, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` option is also set).
| This allows the compiler to execute the assembly code fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used. |
| The `pure` option must be combined with either the `nomem` or `readonly` options, otherwise a compile-time error is emitted. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i32 = 0; |
| let z: i32; |
| // pure can be used to optimize by assuming the assembly has no side effects |
| unsafe { core::arch::asm!("inc {}", inout(reg) x => z, options(pure, nomem)); } |
| assert_eq!(z, 1); |
| # } |
| ``` |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i32 = 0; |
| let z: i32; |
| // Either nomem or readonly must be specified, to indicate whether or not
| // memory is allowed to be read
| unsafe { core::arch::asm!("inc {}", inout(reg) x => z, options(pure)); } |
| // ERROR: the `pure` option must be combined with either `nomem` or `readonly` |
| assert_eq!(z, 0); |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
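| 
| For comparison, `pure` can also be paired with `readonly` when the outputs depend on memory the assembly reads. The following is a minimal sketch for x86-64; the function name `load_u64` and the non-x86 fallback are assumptions introduced for illustration.
| 
| ```rust
| // Hypothetical helper: loads a `u64` through a pointer using inline assembly.
| #[cfg(target_arch = "x86_64")]
| fn load_u64(src: &u64) -> u64 {
|     let value: u64;
|     unsafe {
|         core::arch::asm!(
|             // Reads memory but writes none, so `readonly` (not `nomem`) applies.
|             "mov {value}, qword ptr [{ptr}]",
|             ptr = in(reg) src as *const u64,
|             value = lateout(reg) value,
|             options(pure, readonly, nostack, preserves_flags)
|         );
|     }
|     value
| }
| 
| // Portable fallback (an assumption of this sketch) for other architectures.
| #[cfg(not(target_arch = "x86_64"))]
| fn load_u64(src: &u64) -> u64 {
|     *src
| }
| ```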
| |
| r[asm.options.supported-options.nomem] |
| - `nomem`: The assembly code does not read from or write to any memory accessible outside of the assembly code. |
| This allows the compiler to cache the values of modified global variables in registers across execution of the assembly code since it knows that they are not read from or written to by it. |
| The compiler also assumes that the assembly code does not perform any kind of synchronization with other threads, e.g. via fences. |
| |
| <!-- no_run: This test has unpredictable or undefined behavior at runtime --> |
| ```rust,no_run |
| # #[cfg(target_arch = "x86_64")] { |
| let mut x = 0i32; |
| let z: i32; |
| // Accessing outside memory from assembly when `nomem` is |
| // specified is disallowed |
| unsafe { |
| core::arch::asm!("mov {val:e}, dword ptr [{ptr}]", |
| ptr = in(reg) &mut x, |
| val = lateout(reg) z, |
| options(nomem) |
| ) |
| } |
| |
| // Writing to outside memory from assembly when `nomem` is |
| // specified is also undefined behavior
| unsafe { |
| core::arch::asm!("mov dword ptr [{ptr}], {val:e}", |
| ptr = in(reg) &mut x, |
| val = in(reg) z, |
| options(nomem) |
| ) |
| } |
| # } |
| ``` |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i32 = 0; |
| let z: i32; |
| // However, if we allocate our own memory, such as via `push`,
| // we can still use it
| unsafe { |
| core::arch::asm!("push {x}", "add qword ptr [rsp], 1", "pop {x}", |
| x = inout(reg) x => z, |
| options(nomem) |
| ); |
| } |
| assert_eq!(z, 1); |
| # } |
| ``` |
| |
| r[asm.options.supported-options.readonly] |
| - `readonly`: The assembly code does not write to any memory accessible outside of the assembly code. |
| This allows the compiler to cache the values of unmodified global variables in registers across execution of the assembly code since it knows that they are not written to by it. |
| The compiler also assumes that this assembly code does not perform any kind of synchronization with other threads, e.g. via fences. |
| |
| <!-- no_run: This test has undefined behavior at runtime -->
| ```rust,no_run |
| # #[cfg(target_arch = "x86_64")] { |
| let mut x = 0; |
| // We cannot modify outside memory when `readonly` is specified |
| unsafe { |
| core::arch::asm!("mov dword ptr[{}], 1", in(reg) &mut x, options(readonly)) |
| } |
| # } |
| ``` |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64 = 0; |
| let z: i64; |
| // We can still read from it, though |
| unsafe { |
| core::arch::asm!("mov {x}, qword ptr [{x}]", |
| x = inout(reg) &x => z, |
| options(readonly) |
| ); |
| } |
| assert_eq!(z, 0); |
| # } |
| ``` |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i64 = 0; |
| let z: i64; |
| // Same exception applies as with nomem. |
| unsafe { |
| core::arch::asm!("push {x}", "add qword ptr [rsp], 1", "pop {x}", |
| x = inout(reg) x => z, |
| options(readonly) |
| ); |
| } |
| assert_eq!(z, 1); |
| # } |
| ``` |
| |
| r[asm.options.supported-options.preserves_flags] |
| - `preserves_flags`: The assembly code does not modify the flags register (defined in the rules below). |
| This allows the compiler to avoid recomputing the condition flags after execution of the assembly code. |
| |
| r[asm.options.supported-options.noreturn] |
| - `noreturn`: The assembly code does not fall through; behavior is undefined if it does. It may still jump to `label` blocks. If any `label` blocks return unit, the `asm!` block will return unit. Otherwise it will return `!` (never). As with a call to a function that does not return, local variables in scope are not dropped before execution of the assembly code. |
| |
| <!-- no_run: This test aborts at runtime --> |
| ```rust,no_run |
| fn main() -> ! { |
| # #[cfg(target_arch = "x86_64")] { |
| // We can use an instruction to trap execution inside of a noreturn block |
| unsafe { core::arch::asm!("ud2", options(noreturn)); } |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] panic!("no return"); |
| } |
| ``` |
| |
| <!-- no_run: Test has undefined behavior at runtime --> |
| ```rust,no_run |
| # #[cfg(target_arch = "x86_64")] { |
| // You are responsible for not falling past the end of a noreturn asm block |
| unsafe { core::arch::asm!("", options(noreturn)); } |
| # } |
| ``` |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] |
| let _: () = unsafe { |
| // You may still jump to a `label` block |
| core::arch::asm!("jmp {}", label { |
| println!(); |
| }, options(noreturn)); |
| }; |
| ``` |
| |
| r[asm.options.supported-options.nostack] |
| - `nostack`: The assembly code does not push data to the stack, or write to the stack red-zone (if supported by the target). |
| If this option is *not* used then the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call. |
| |
| <!-- no_run: Test has undefined behavior at runtime --> |
| ```rust,no_run |
| # #[cfg(target_arch = "x86_64")] { |
| // `push` and `pop` are UB when used with nostack |
| unsafe { core::arch::asm!("push rax", "pop rax", options(nostack)); } |
| # } |
| ``` |
| |
| r[asm.options.supported-options.att_syntax] |
| - `att_syntax`: This option is only valid on x86, and causes the assembler to use the `.att_syntax prefix` mode of the GNU assembler. |
| Register operands are substituted in with a leading `%`. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let x: i32; |
| let y = 1i32; |
| // With AT&T syntax the operand order is `src, dest`
| unsafe { |
| core::arch::asm!("mov {y:e}, {x:e}", |
| x = lateout(reg) x, |
| y = in(reg) y, |
| options(att_syntax) |
| ); |
| } |
| assert_eq!(x, y); |
| # } |
| ``` |
| |
| r[asm.options.supported-options.raw] |
| - `raw`: This causes the template string to be parsed as a raw assembly string, with no special handling for `{` and `}`. |
| This is primarily useful when including raw assembly code from an external file using `include_str!`. |
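| 
| A small sketch of the effect of `raw` on x86-64 follows: the braces inside the assembly comment would normally be parsed as a placeholder, but with `raw` they are passed through untouched. The function name `seven` and the fallback are hypothetical, and the example assumes that explicit-register operands remain usable alongside `raw` because they are never referenced from the template.
| 
| ```rust
| // Hypothetical helper: returns 7 via a raw (unparsed) template string.
| #[cfg(target_arch = "x86_64")]
| fn seven() -> u32 {
|     let v: u32;
|     unsafe {
|         core::arch::asm!(
|             // With `raw`, `{` and `}` are ordinary characters, not placeholders.
|             "mov eax, 7 /* {these braces} are plain text */",
|             out("eax") v,
|             options(raw, nomem, nostack, preserves_flags)
|         );
|     }
|     v
| }
| 
| // Portable fallback (an assumption of this sketch) for other architectures.
| #[cfg(not(target_arch = "x86_64"))]
| fn seven() -> u32 {
|     7
| }
| ```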
| |
| r[asm.options.checks] |
| The compiler performs some additional checks on options: |
| |
| r[asm.options.checks.mutually-exclusive] |
| - The `nomem` and `readonly` options are mutually exclusive: it is a compile-time error to specify both. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // nomem is strictly stronger than readonly, so they can't be specified together
| unsafe { core::arch::asm!("", options(nomem, readonly)); } |
| // ERROR: the `nomem` and `readonly` options are mutually exclusive |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.options.checks.pure] |
| - It is a compile-time error to specify `pure` on an asm block with no outputs or only discarded outputs (`_`). |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| // pure blocks need at least one output |
| unsafe { core::arch::asm!("", options(pure, nomem)); }
| // ERROR: asm with the `pure` option must have at least one output |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.options.checks.noreturn] |
| - It is a compile-time error to specify `noreturn` on an asm block with outputs and without labels. |
| |
| ```rust,compile_fail |
| # #[cfg(target_arch = "x86_64")] { |
| let z: i32; |
| // noreturn can't have outputs |
| unsafe { core::arch::asm!("mov {:e}, 1", out(reg) z, options(noreturn)); } |
| // ERROR: asm outputs are not allowed with the `noreturn` option |
| # } |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.options.checks.label-with-outputs] |
| - It is a compile-time error to have any `label` blocks in an asm block with outputs. |
| |
| r[asm.options.naked_asm-restriction] |
| `naked_asm!` only supports the `att_syntax` and `raw` options. The remaining options are not meaningful because the inline assembly defines the whole function body. |
| |
| r[asm.options.global_asm-restriction] |
| `global_asm!` only supports the `att_syntax` and `raw` options. The remaining options are not meaningful for global-scope inline assembly. |
| |
| ```rust,compile_fail |
| # fn main() {} |
| # #[cfg(target_arch = "x86_64")] |
| // nomem is useless on global_asm! |
| core::arch::global_asm!("", options(nomem)); |
| # #[cfg(not(target_arch = "x86_64"))] core::compile_error!("Test not supported on this arch"); |
| ``` |
| |
| r[asm.rules] |
| ## Rules for inline assembly |
| |
| r[asm.rules.intro] |
| To avoid undefined behavior, these rules must be followed when using function-scope inline assembly (`asm!`): |
| |
| r[asm.rules.reg-not-input] |
| - Any registers not specified as inputs will contain an undefined value on entry to the assembly code. |
| - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture. |
| Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code). |
| |
| r[asm.rules.reg-not-output] |
| - Any registers not specified as outputs must have the same value upon exiting the assembly code as they had on entry, otherwise behavior is undefined. |
| - This only applies to registers which can be specified as an input or output. |
| Other registers follow target-specific rules. |
| - Note that a `lateout` may be allocated to the same register as an `in`, in which case this rule does not apply. |
| Code should not rely on this however since it depends on the results of register allocation. |
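| 
| One way to satisfy this rule when the assembly needs a scratch register is to declare that register as a discarded output. The sketch below does this for `rcx` on x86-64; the name `times_three` and the fallback implementation are assumptions made for illustration.
| 
| ```rust
| // Hypothetical helper: multiplies by three, using `rcx` as a scratch register.
| #[cfg(target_arch = "x86_64")]
| fn times_three(v: u64) -> u64 {
|     let out: u64;
|     unsafe {
|         core::arch::asm!(
|             "mov rcx, {v}",
|             "lea {out}, [rcx + rcx*2]",
|             v = in(reg) v,
|             out = lateout(reg) out,
|             // `rcx` is modified, so it is declared as a clobbered output;
|             // this also keeps the register allocator from using it for `v`.
|             out("rcx") _,
|             options(pure, nomem, nostack, preserves_flags)
|         );
|     }
|     out
| }
| 
| // Portable fallback (an assumption of this sketch) for other architectures.
| #[cfg(not(target_arch = "x86_64"))]
| fn times_three(v: u64) -> u64 {
|     v.wrapping_mul(3)
| }
| ```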
| |
| r[asm.rules.unwind] |
| - Behavior is undefined if execution unwinds out of the assembly code. |
| - This also applies if the assembly code calls a function which then unwinds. |
| |
| r[asm.rules.mem-same-as-ffi] |
| - The set of memory locations that assembly code is allowed to read and write is the same as that allowed for an FFI function.
| - If the `readonly` option is set, then only memory reads are allowed. |
| - If the `nomem` option is set then no reads or writes to memory are allowed. |
| - These rules do not apply to memory which is private to the assembly code, such as stack space allocated within it. |
| |
| r[asm.rules.black-box] |
| - The compiler cannot assume that the instructions in the assembly code are the ones that will actually end up executed. |
| - This effectively means that the compiler must treat the assembly code as a black box and only take the interface specification into account, not the instructions themselves. |
| - Runtime code patching is allowed, via target-specific mechanisms. |
| - However there is no guarantee that each block of assembly code in the source directly corresponds to a single instance of instructions in the object file; the compiler is free to duplicate or deduplicate the assembly code in `asm!` blocks. |
| |
| r[asm.rules.stack-below-sp] |
| - Unless the `nostack` option is set, assembly code is allowed to use stack space below the stack pointer. |
| - On entry to the assembly code the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call. |
| - You are responsible for making sure you don't overflow the stack (e.g. use stack probing to ensure you hit a guard page). |
| - You should adjust the stack pointer when allocating stack memory as required by the target ABI. |
| - The stack pointer must be restored to its original value before leaving the assembly code. |
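| 
| The stack rules above can be sketched on x86-64 as follows: the assembly reserves scratch space by moving the stack pointer, uses it, and restores the stack pointer before leaving. The helper name `roundtrip` and the fallback are hypothetical; `nostack` is deliberately not specified because the code uses the stack.
| 
| ```rust
| // Hypothetical helper: stores a value in private stack space and reads it back.
| #[cfg(target_arch = "x86_64")]
| fn roundtrip(v: u64) -> u64 {
|     let out: u64;
|     unsafe {
|         core::arch::asm!(
|             "sub rsp, 16",                 // allocate 16 bytes of scratch space
|             "mov qword ptr [rsp], {v}",    // spill the input
|             "mov {out}, qword ptr [rsp]",  // reload it into the output
|             "add rsp, 16",                 // restore the original stack pointer
|             v = in(reg) v,
|             out = lateout(reg) out,
|         );
|     }
|     out
| }
| 
| // Portable fallback (an assumption of this sketch) for other architectures.
| #[cfg(not(target_arch = "x86_64"))]
| fn roundtrip(v: u64) -> u64 {
|     v
| }
| ```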
| |
| r[asm.rules.noreturn] |
| - If the `noreturn` option is set then behavior is undefined if execution falls through the end of the assembly code. |
| |
| r[asm.rules.pure] |
| - If the `pure` option is set then behavior is undefined if the `asm!` has side-effects other than its direct outputs. |
| Behavior is also undefined if two executions of the `asm!` code with the same inputs result in different outputs. |
| - When used with the `nomem` option, "inputs" are just the direct inputs of the `asm!`. |
| - When used with the `readonly` option, "inputs" comprise the direct inputs of the assembly code and any memory that it is allowed to read. |
| |
| r[asm.rules.preserved-registers] |
| - These flags registers must be restored upon exiting the assembly code if the `preserves_flags` option is set: |
| - x86 |
| - Status flags in `EFLAGS` (CF, PF, AF, ZF, SF, OF). |
| - Floating-point status word (all). |
| - Floating-point exception flags in `MXCSR` (PE, UE, OE, ZE, DE, IE). |
| - ARM |
| - Condition flags in `CPSR` (N, Z, C, V).
| - Saturation flag in `CPSR` (Q).
| - Greater than or equal flags in `CPSR` (GE).
| - Condition flags in `FPSCR` (N, Z, C, V).
| - Saturation flag in `FPSCR` (QC).
| - Floating-point exception flags in `FPSCR` (IDC, IXC, UFC, OFC, DZC, IOC). |
| - AArch64 and Arm64EC |
| - Condition flags (`NZCV` register). |
| - Floating-point status (`FPSR` register). |
| - RISC-V |
| - Floating-point exception flags in `fcsr` (`fflags`). |
| - Vector extension state (`vtype`, `vl`, `vcsr`). |
| - LoongArch |
| - Floating-point condition flags in `$fcc[0-7]`. |
| - s390x |
| - The condition code register `cc`. |
| |
| r[asm.rules.x86-df] |
| - On x86, the direction flag (DF in `EFLAGS`) is clear on entry to the assembly code and must be clear on exit. |
| - Behavior is undefined if the direction flag is set on exiting the assembly code. |
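| 
| As an illustration of the direction-flag rule, the following x86-64 sketch sets DF to copy bytes from the end of a buffer towards its start and clears DF again before the assembly exits. The function name `copy_backward` and the fallback are assumptions made for this example.
| 
| ```rust
| // Hypothetical helper: copies `src` into `dst`, walking from the last byte down.
| #[cfg(target_arch = "x86_64")]
| fn copy_backward(src: &[u8], dst: &mut [u8]) {
|     assert_eq!(src.len(), dst.len());
|     let n = src.len();
|     if n == 0 {
|         return;
|     }
|     unsafe {
|         core::arch::asm!(
|             "std",        // set DF: string instructions now decrement rsi/rdi
|             "rep movsb",  // copy rcx bytes, starting at the last byte
|             "cld",        // clear DF again, as required on exit
|             inout("rsi") src.as_ptr().add(n - 1) => _,
|             inout("rdi") dst.as_mut_ptr().add(n - 1) => _,
|             inout("rcx") n => _,
|             options(nostack)
|         );
|     }
| }
| 
| // Portable fallback (an assumption of this sketch) for other architectures.
| #[cfg(not(target_arch = "x86_64"))]
| fn copy_backward(src: &[u8], dst: &mut [u8]) {
|     dst.copy_from_slice(src);
| }
| ```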
| |
| r[asm.rules.x86-x87] |
| - On x86, the x87 floating-point register stack must remain unchanged unless all of the `st([0-7])` registers have been marked as clobbered with `out("st(0)") _, out("st(1)") _, ...`. |
| - If all x87 registers are clobbered then the x87 register stack is guaranteed to be empty upon entering the assembly code. Assembly code must ensure that the x87 register stack is also empty when exiting the assembly code.
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] |
| pub fn fadd(x: f64, y: f64) -> f64 { |
| let mut out = 0f64; |
| let mut top = 0u16; |
| // we can do complex stuff with x87 if we clobber the entire x87 stack |
| unsafe { core::arch::asm!(
| "fld qword ptr [{x}]",
| "fld qword ptr [{y}]",
| "faddp",
| "fstp qword ptr [{out}]",
| "xor eax, eax",
| "fstsw ax",
| "shr eax, 11",
| x = in(reg) &x, |
| y = in(reg) &y, |
| out = in(reg) &mut out, |
| out("st(0)") _, out("st(1)") _, out("st(2)") _, out("st(3)") _, |
| out("st(4)") _, out("st(5)") _, out("st(6)") _, out("st(7)") _, |
| out("eax") top |
| );} |
| |
| assert_eq!(top & 0x7, 0); |
| out |
| } |
| |
| pub fn main() { |
| # #[cfg(target_arch = "x86_64")]{ |
| assert_eq!(fadd(1.0, 1.0), 2.0); |
| # } |
| } |
| ``` |
| |
| r[asm.rules.arm64ec] |
| - On Arm64EC, [call checkers with appropriate thunks](https://learn.microsoft.com/en-us/windows/arm/arm64ec-abi#authoring-arm64ec-in-assembly) are mandatory when calling functions.
| |
| r[asm.rules.only-on-exit] |
| - The requirement of restoring the stack pointer and non-output registers to their original value only applies when exiting the assembly code. |
| - This means that assembly code that does not fall through and does not jump to any `label` blocks, even if not marked `noreturn`, doesn't need to preserve these registers. |
| - When returning to the assembly code of a different `asm!` block than you entered (e.g. for context switching), these registers must contain the value they had upon entering the `asm!` block that you are *exiting*. |
| - You cannot exit the assembly code of an `asm!` block that has not been entered. |
| Neither can you exit the assembly code of an `asm!` block whose assembly code has already been exited (without first entering it again). |
| - You are responsible for switching any target-specific state (e.g. thread-local storage, stack bounds). |
| - You cannot jump from an address in one `asm!` block to an address in another, even within the same function or block, without treating their contexts as potentially different and requiring context switching. You cannot assume that any particular value in those contexts (e.g. current stack pointer or temporary values below the stack pointer) will remain unchanged between the two `asm!` blocks. |
| - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited. |
| |
| r[asm.rules.not-successive] |
| - You cannot assume that two `asm!` blocks adjacent in source code, even without any other code between them, will end up in successive addresses in the binary without any other instructions between them. |
| |
| r[asm.rules.not-exactly-once] |
| - You cannot assume that an `asm!` block will appear exactly once in the output binary. |
| The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places. |
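One practical consequence can be sketched as follows. This is a minimal illustration that assumes an x86-64 target; the helper name `count_iterations` is hypothetical. Because the function is marked `#[inline(always)]`, each call site may receive its own copy of the `asm!` block, so the block uses only numeric local labels (`2:` and `3:`), which may be defined more than once without clashing. A plain Rust fallback is included only so that the snippet builds on other architectures.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Counts down from `n` to zero in assembly and returns the number of
/// loop iterations that were executed.
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn count_iterations(n: u64) -> u64 {
    let mut iterations: u64 = 0;
    unsafe {
        asm!(
            // Skip the loop entirely when the counter starts at zero.
            "test {rem}, {rem}",
            "jz 3f",
            // Numeric local labels stay unique per copy of the block.
            "2:",
            "inc {count}",
            "dec {rem}",
            "jnz 2b",
            "3:",
            rem = inout(reg) n => _,
            count = inout(reg) iterations,
            options(nomem, nostack),
        );
    }
    iterations
}

/// Fallback (an assumption for portability, not part of the technique).
#[cfg(not(target_arch = "x86_64"))]
fn count_iterations(n: u64) -> u64 {
    n
}
```

Calling `count_iterations(5)` and `count_iterations(0)` from the same function returns 5 and 0. A named label such as `loop_start:` in the template would risk a duplicate symbol if both copies were emitted.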
| |
| r[asm.rules.x86-prefix-restriction] |
| - On x86, inline assembly must not end with an instruction prefix (such as `LOCK`) that would apply to instructions generated by the compiler. |
| - The compiler is currently unable to detect this due to the way inline assembly is compiled, but may catch and reject this in the future. |
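For contrast, the sketch below keeps a prefix and the instruction it modifies in the same template string, so the prefix cannot apply to compiler-generated code that follows the block. It assumes an x86-64 target; `fetch_add` is a hypothetical helper name, and the fallback exists only to keep the snippet portable.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Adds `delta` to `*counter` and returns the previous value.
#[cfg(target_arch = "x86_64")]
fn fetch_add(counter: &mut u64, delta: u64) -> u64 {
    let ptr: *mut u64 = counter;
    let mut previous = delta;
    unsafe {
        asm!(
            // `lock` is immediately followed by the instruction it applies
            // to, so the template does not end with a dangling prefix.
            "lock xadd qword ptr [{ptr}], {val}",
            ptr = in(reg) ptr,
            // `xadd` leaves the old memory value in the register operand.
            val = inout(reg) previous,
            options(nostack),
        );
    }
    previous
}

/// Fallback (an assumption for portability, not part of the technique).
#[cfg(not(target_arch = "x86_64"))]
fn fetch_add(counter: &mut u64, delta: u64) -> u64 {
    let previous = *counter;
    *counter = previous.wrapping_add(delta);
    previous
}
```

A template consisting of just `"lock"` would be the problematic shape described above, because the prefix would then bind to whatever instruction the compiler places next.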
| |
| r[asm.rules.preserves_flags] |
| > [!NOTE] |
| > As a general rule, the flags covered by `preserves_flags` are those which are *not* preserved when performing a function call. |
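As a small illustration of when `preserves_flags` may be declared, the sketch below assumes an x86-64 target and uses `lea`, which computes a sum without writing the arithmetic status flags. The helper name `add_without_flags` is hypothetical. Replacing `lea` with `add` would modify the flags, and the option would then no longer be valid.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Returns `a + b` (wrapping) without modifying the flags register.
#[cfg(target_arch = "x86_64")]
fn add_without_flags(a: u64, b: u64) -> u64 {
    let sum: u64;
    unsafe {
        asm!(
            // `lea` performs address arithmetic and leaves the flags alone.
            "lea {sum}, [{a} + {b}]",
            a = in(reg) a,
            b = in(reg) b,
            sum = out(reg) sum,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    sum
}

/// Fallback (an assumption for portability, not part of the technique).
#[cfg(not(target_arch = "x86_64"))]
fn add_without_flags(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```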
| |
| r[asm.naked-rules] |
| ## Rules for naked inline assembly |
| |
| r[asm.naked-rules.intro] |
| To avoid undefined behavior, these rules must be followed when using function-scope inline assembly in naked functions (`naked_asm!`): |
| |
| r[asm.naked-rules.reg-not-input] |
| - Any registers not used for function inputs according to the calling convention and function signature will contain an undefined value on entry to the `naked_asm!` block. |
| - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture. Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code). |
| |
| r[asm.naked-rules.callee-saved-registers] |
| - All callee-saved registers must have the same value upon return as they had on entry. |
| |
| r[asm.naked-rules.caller-saved-registers] |
| - Caller-saved registers may be used freely. |
| |
| r[asm.naked-rules.noreturn] |
| - Behavior is undefined if execution falls through past the end of the assembly code. |
| - Every path through the assembly code is expected to terminate with a return instruction or to diverge. |
| |
| r[asm.naked-rules.mem-same-as-ffi] |
| - The set of memory locations that assembly code is allowed to read and write is the same as that allowed for an FFI function. |
| |
| r[asm.naked-rules.black-box] |
| - The compiler cannot assume that the instructions in the `naked_asm!` block are the ones that will actually be executed. |
| - This effectively means that the compiler must treat the `naked_asm!` as a black box and only take the interface specification into account, not the instructions themselves. |
| - Runtime code patching is allowed, via target-specific mechanisms. |
| |
| r[asm.naked-rules.unwind] |
| - Unwinding out of a `naked_asm!` block is allowed. |
| - For correct behavior, the appropriate assembler directives that emit unwinding metadata must be used. |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| #[unsafe(naked)] |
| extern "sysv64-unwind" fn unwinding_naked() { |
| core::arch::naked_asm!( |
| // "CFI" here stands for "call frame information". |
| ".cfi_startproc", |
| // The CFA (canonical frame address) is the value of `rsp` |
| // before the `call`, i.e. before the return address, `rip`, |
| // was pushed to `rsp`, so it's eight bytes higher in memory |
| // than `rsp` upon function entry (after `rip` has been |
| // pushed). |
| // |
| // This is the default, so we don't have to write it. |
| //".cfi_def_cfa rsp, 8", |
| // |
| // The traditional thing to do is to preserve the base |
| // pointer, so we'll do that. |
| "push rbp", |
| // Since we've now extended the stack downward by 8 bytes in |
| // memory, we need to adjust the offset to the CFA from `rsp` |
| // by another 8 bytes. |
| ".cfi_adjust_cfa_offset 8", |
| // We also then annotate where we've stored the caller's value |
| // of `rbp`, relative to the CFA, so that when unwinding into |
| // the caller we can find it, in case we need it to calculate |
| // the caller's CFA relative to it. |
| // |
| // Here, we've stored the caller's `rbp` starting 16 bytes |
| // below the CFA. I.e., starting from the CFA, there's first |
| // the `rip` (which starts 8 bytes below the CFA and continues |
| // up to it), then there's the caller's `rbp` that we just |
| // pushed. |
| ".cfi_offset rbp, -16", |
| // As is traditional, we set the base pointer to the value of |
| // the stack pointer. This way, the base pointer stays the |
| // same throughout the function body. |
| "mov rbp, rsp", |
| // We can now track the offset to the CFA from the base |
| // pointer. This means we don't need to make any further |
| // adjustments until the end, as we don't change `rbp`. |
| ".cfi_def_cfa_register rbp", |
| // We can now call a function that may panic. |
| "call {f}", |
| // Upon return, we restore `rbp` in preparation for returning |
| // ourselves. |
| "pop rbp", |
| // Now that we've restored `rbp`, we must specify the offset |
| // to the CFA again in terms of `rsp`. |
| ".cfi_def_cfa rsp, 8", |
| // Now we can return. |
| "ret", |
| ".cfi_endproc", |
| f = sym may_panic, |
| ) |
| } |
| |
| extern "sysv64-unwind" fn may_panic() { |
| panic!("unwind"); |
| } |
| # } |
| ``` |
| |
| > [!NOTE] |
| > |
| > For more information on the `cfi` assembler directives above, see these resources: |
| > |
| > - [Using `as` - CFI directives](https://sourceware.org/binutils/docs/as/CFI-directives.html) |
| > - [DWARF Debugging Information Format Version 5](https://dwarfstd.org/doc/DWARF5.pdf) |
| > - [ImperialViolet - CFI directives in assembly files](https://www.imperialviolet.org/2017/01/18/cfi.html) |
| |
| r[asm.validity] |
| ### Correctness and Validity |
| |
| r[asm.validity.necessary-but-not-sufficient] |
| In addition to all of the previous rules, the string argument to `asm!` must ultimately become--- |
| after all other arguments are evaluated, formatting is performed, and operands are translated--- |
| assembly that is both syntactically correct and semantically valid for the target architecture. |
| The formatting rules allow the compiler to generate assembly with correct syntax. |
| Rules concerning operands permit valid translation of Rust operands into and out of the assembly code. |
| Adherence to these rules is necessary, but not sufficient, for the final expanded assembly to be |
| both correct and valid. For instance: |
| |
| - arguments may be placed in positions which are syntactically incorrect after formatting |
| - an instruction may be correctly written, but given architecturally invalid operands |
| - an architecturally unspecified instruction may be assembled into unspecified code |
| - a set of instructions, each correct and valid, may cause undefined behavior if placed in immediate succession |
| |
| r[asm.validity.non-exhaustive] |
| As a result, these rules are _non-exhaustive_. The compiler is not required to check the |
| correctness and validity of the initial string nor the final assembly that is generated. |
| The assembler may check for correctness and validity but is not required to do so. |
| When using `asm!`, a typographical error may be sufficient to make a program unsound, |
| and the rules for assembly may include thousands of pages of architectural reference manuals. |
| Programmers should exercise appropriate care, as invoking this `unsafe` capability comes with |
| assuming the responsibility of not violating the rules of either the compiler or the architecture. |
| |
| r[asm.directives] |
| ### Directives Support |
| |
| r[asm.directives.subset-supported] |
| Inline assembly supports a subset of the directives supported by both GNU AS and LLVM's internal assembler, given as follows. |
| The result of using other directives is assembler-specific (and may cause an error, or may be accepted as-is). |
| |
| r[asm.directives.stateful] |
| If inline assembly includes any "stateful" directive that modifies how subsequent assembly is processed, the assembly code must undo the effects of any such directives before the inline assembly ends. |
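A sketch of undoing a stateful directive follows. It assumes an x86-64 Linux (ELF) target with a `.rodata` section; the helper name `embedded_message` is hypothetical. The `.pushsection` directive changes the current section, and the matching `.popsection` restores it before any instructions are emitted and before the block ends. The fallback is only for portability.

```rust
/// Returns a string whose bytes were emitted into `.rodata` by the
/// assembler rather than by the Rust compiler.
#[cfg(all(target_arch = "x86_64", target_os = "linux"))]
fn embedded_message() -> &'static str {
    let ptr: *const u8;
    unsafe {
        std::arch::asm!(
            // Switch sections to emit data (a stateful change)...
            ".pushsection .rodata",
            "2: .ascii \"hi\"",
            // ...and undo it so later code lands in the original section.
            ".popsection",
            // Load the address of the data relative to `rip`.
            "lea {ptr}, [2b+rip]",
            ptr = out(reg) ptr,
            options(pure, nomem, nostack, preserves_flags),
        );
        // The two bytes live in read-only data for the whole program run.
        let bytes: &'static [u8] = std::slice::from_raw_parts(ptr, 2);
        std::str::from_utf8(bytes).expect("embedded bytes are ASCII")
    }
}

/// Fallback (an assumption for portability, not part of the technique).
#[cfg(not(all(target_arch = "x86_64", target_os = "linux")))]
fn embedded_message() -> &'static str {
    "hi"
}
```

Omitting `.popsection` would leave the assembler in `.rodata`, so instructions generated after the block could end up in the wrong section.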
| |
| r[asm.directives.supported-directives] |
| The following directives are guaranteed to be supported by the assembler: |
| |
| - `.2byte` |
| - `.4byte` |
| - `.8byte` |
| - `.align` |
| - `.alt_entry` |
| - `.ascii` |
| - `.asciz` |
| - `.balign` |
| - `.balignl` |
| - `.balignw` |
| - `.bss` |
| - `.byte` |
| - `.comm` |
| - `.data` |
| - `.def` |
| - `.double` |
| - `.endef` |
| - `.equ` |
| - `.equiv` |
| - `.eqv` |
| - `.fill` |
| - `.float` |
| - `.global` |
| - `.globl` |
| - `.inst` |
| - `.insn` |
| - `.lcomm` |
| - `.long` |
| - `.octa` |
| - `.option` |
| - `.p2align` |
| - `.popsection` |
| - `.private_extern` |
| - `.pushsection` |
| - `.quad` |
| - `.scl` |
| - `.section` |
| - `.set` |
| - `.short` |
| - `.size` |
| - `.skip` |
| - `.sleb128` |
| - `.space` |
| - `.string` |
| - `.text` |
| - `.type` |
| - `.uleb128` |
| - `.word` |
| |
| ```rust |
| # #[cfg(target_arch = "x86_64")] { |
| let bytes: *const u8; |
| let len: usize; |
| unsafe { |
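| core::arch::asm!( |
| // Jump over the embedded string data so that it is never executed. |
| "jmp 3f", |
| "2: .ascii \"Hello World!\"", |
| // Load the address of the string (local label `2`) relative to `rip`. |
| "3: lea {bytes}, [2b+rip]", |
| // The string is 12 bytes long. |
| "mov {len}, 12", |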
| bytes = out(reg) bytes, |
| len = out(reg) len |
| ); |
| } |
| |
| let s = unsafe { core::str::from_utf8_unchecked(core::slice::from_raw_parts(bytes, len)) }; |
| |
| assert_eq!(s, "Hello World!"); |
| # } |
| ``` |
| |
| r[asm.target-specific-directives] |
| #### Target Specific Directive Support |
| |
| r[asm.target-specific-directives.dwarf-unwinding] |
| ##### DWARF Unwinding |
| |
| The following directives are supported on ELF targets that support DWARF unwind info: |
| |
| - `.cfi_adjust_cfa_offset` |
| - `.cfi_def_cfa` |
| - `.cfi_def_cfa_offset` |
| - `.cfi_def_cfa_register` |
| - `.cfi_endproc` |
| - `.cfi_escape` |
| - `.cfi_lsda` |
| - `.cfi_offset` |
| - `.cfi_personality` |
| - `.cfi_register` |
| - `.cfi_rel_offset` |
| - `.cfi_remember_state` |
| - `.cfi_restore` |
| - `.cfi_restore_state` |
| - `.cfi_return_column` |
| - `.cfi_same_value` |
| - `.cfi_sections` |
| - `.cfi_signal_frame` |
| - `.cfi_startproc` |
| - `.cfi_undefined` |
| - `.cfi_window_save` |
| |
| r[asm.target-specific-directives.structured-exception-handling] |
| ##### Structured Exception Handling |
| |
| On targets with structured exception handling, the following additional directives are guaranteed to be supported: |
| |
| - `.seh_endproc` |
| - `.seh_endprologue` |
| - `.seh_proc` |
| - `.seh_pushreg` |
| - `.seh_savereg` |
| - `.seh_setframe` |
| - `.seh_stackalloc` |
| |
| r[asm.target-specific-directives.x86] |
| ##### x86 (32-bit and 64-bit) |
| |
| On x86 targets, both 32-bit and 64-bit, the following additional directives are guaranteed to be supported: |
| |
| - `.nops` |
| - `.code16` |
| - `.code32` |
| - `.code64` |
| |
| Use of the `.code16`, `.code32`, and `.code64` directives is only supported if the state is reset to the default before exiting the assembly code. |
| 32-bit x86 uses `.code32` by default, and x86_64 uses `.code64` by default. |
| |
| r[asm.target-specific-directives.arm-32-bit] |
| ##### ARM (32-bit) |
| |
| On ARM, the following additional directives are guaranteed to be supported: |
| |
| - `.even` |
| - `.fnstart` |
| - `.fnend` |
| - `.save` |
| - `.movsp` |
| - `.code` |
| - `.thumb` |
| - `.thumb_func` |