Add hard cut operator (`^`) to grammar

The hard cut operator (`^`) is a backtracking fence. Once the expressions to its left in a sequence match, the rest of the sequence must match or parsing fails unconditionally -- no enclosing expression can backtrack past the cut point.

This operator is necessary because some Rust tokens begin with a prefix that is itself a valid token. For example, `c"` begins a C string literal, but `c` alone is a valid identifier. If `c"\0"` fails to lex as a C string literal (because null bytes are not allowed in C strings), a PEG parser would normally backtrack and try other alternatives, potentially lexing it as the identifier `c` followed by the string `"\0"`. The hard cut after `c"` prevents this: once the opening delimiter matches, failure is unconditional.

We add `^` to the grammar notation and use it in the productions for C string literals, byte literals, byte string literals, and the raw string variants -- each of which has a prefix that could otherwise be consumed as a separate token.

In the notation chapter, we add a dedicated section explaining ordered alternation and backtracking, distinguishing a hard cut (which prevents all backtracking past the cut point) from a soft cut (which prevents backtracking only within the immediately enclosing choice), and citing Mizushima et al. for introducing cut operators to PEG.

In the grammar tooling, we add a `Cut` variant to the expression AST, parse `^` at the sequence level, and render it in both the Markdown and railroad diagram outputs. In the railroad diagrams, the hard cut is rendered as a "no backtracking" box around the expressions after the cut point. The idea is that once you enter the box the only way out is forward.
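
As a rough illustration of the `c"\0"` case described in this change, the following hand-written Rust sketch models an ordered choice between a C string literal and an identifier. It is not the grammar tooling itself: the names `Token`, `Fail`, `c_string`, `ident`, and `lex` are hypothetical, the `use_cut` flag exists only to contrast the two behaviors, and escape handling is reduced to rejecting the two-character sequence `\0`.

```rust
// Hypothetical token and failure types, for illustration only.
#[derive(Debug, PartialEq)]
enum Token {
    CStr(String),
    Ident(String),
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Fail {
    Soft, // the enclosing choice may try its next alternative
    Hard, // a cut was passed: no alternative may be tried
}

// Sketch of a production shaped like: `c"` ^ body `"`
// Simplification (an assumption): the body ends at the first `"`, and the
// only validity rule is that it must not contain the escape `\0`.
fn c_string(input: &str, use_cut: bool) -> Result<(Token, &str), Fail> {
    // Before the cut, a mismatch is an ordinary (soft) failure.
    let rest = input.strip_prefix("c\"").ok_or(Fail::Soft)?;
    // The cut point: with the cut enabled, later failures become hard.
    let after_cut = if use_cut { Fail::Hard } else { Fail::Soft };
    let end = match rest.find('"') {
        Some(end) => end,
        None => return Err(after_cut),
    };
    let body = &rest[..end];
    if body.contains("\\0") {
        return Err(after_cut);
    }
    Ok((Token::CStr(body.to_string()), &rest[end + 1..]))
}

// Simplified ASCII-only identifier.
fn ident(input: &str) -> Result<(Token, &str), Fail> {
    let is_start = |c: char| c.is_ascii_alphabetic() || c == '_';
    let is_continue = |c: char| c.is_ascii_alphanumeric() || c == '_';
    if !input.starts_with(is_start) {
        return Err(Fail::Soft);
    }
    let end = input.find(|c: char| !is_continue(c)).unwrap_or(input.len());
    Ok((Token::Ident(input[..end].to_string()), &input[end..]))
}

// Ordered choice: c_string / ident.
// Only a soft failure falls through to the next alternative.
fn lex(input: &str, use_cut: bool) -> Result<(Token, &str), Fail> {
    match c_string(input, use_cut) {
        Err(Fail::Soft) => ident(input),
        other => other,
    }
}
```

Under these assumptions, calling `lex` on `c"\0"` with `use_cut` set to `false` falls back to the identifier `c` and leaves `"\0"` unconsumed, while setting it to `true` reports a hard failure instead. Inputs that never reach the cut point, such as a plain `c`, still lex as identifiers either way.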
This document is the primary reference for the Rust programming language.
See the Reference Developer Guide for information on contributing to the Reference.