blob: de262423c9f8482b4d871faaef9c78adbd242e08 [file] [log] [blame] [view]
# Notation
## Grammar
The following notations are used by the *Lexer* and *Syntax* grammar snippets:
| Notation | Examples | Meaning |
|-------------------|-------------------------------|-------------------------------------------|
| CAPITAL | KW_IF, INTEGER_LITERAL | A token produced by the lexer |
| _ItalicCamelCase_ | _LetStatement_, _Item_ | A syntactical production |
| `string` | `x`, `while`, `*` | The exact character(s) |
| x<sup>?</sup> | `pub`<sup>?</sup> | An optional item |
| x<sup>\*</sup> | _OuterAttribute_<sup>\*</sup> | 0 or more of x |
| x<sup>+</sup> | _MacroMatch_<sup>+</sup> | 1 or more of x |
| x<sup>a..b</sup> | HEX_DIGIT<sup>1..6</sup> | a to b repetitions of x |
| Rule1 Rule2 | `fn` _Name_ _Parameters_ | Sequence of rules in order |
| \| | `u8` \| `u16`, Block \| Item | Either one or another |
| \[ ] | \[`b` `B`] | Any of the characters listed |
| \[ - ] | \[`a`-`z`] | Any of the characters in the range |
| ~\[ ] | ~\[`b` `B`] | Any characters, except those listed |
| ~`string` | ~`\n`, ~`*/` | Any characters, except this sequence |
| ( ) | (`,` _Parameter_)<sup>?</sup> | Groups items |
| U+xxxx | U+0060 | A single unicode character |
| \<text\> | \<any ASCII char except CR\> | An English description of what should be matched |
| Rule <sub>suffix</sub> | IDENTIFIER_OR_KEYWORD <sub>_except `crate`_</sub> | A modification to the previous rule |
Sequences have a higher precedence than `|` alternation.
## String table productions
Some rules in the grammar &mdash; notably [unary operators], [binary
operators], and [keywords] &mdash; are given in a simplified form: as a listing
of printable strings. These cases form a subset of the rules regarding the
[token][tokens] rule, and are assumed to be the result of a lexical-analysis
phase feeding the parser, driven by a <abbr title="Deterministic Finite
Automaton">DFA</abbr>, operating over the disjunction of all such string table
entries.
When such a string in `monospace` font occurs inside the grammar,
it is an implicit reference to a single member of such a string table
production. See [tokens] for more information.
## Grammar visualizations
Below each grammar block is a button to toggle the display of a [syntax diagram]. A square element is a non-terminal rule, and a rounded rectangle is a terminal.
[syntax diagram]: https://en.wikipedia.org/wiki/Syntax_diagram
## Common productions
The following are common definitions used in the grammar.
r[input.syntax]
```grammar,lexer
@root CHAR -> <a Unicode scalar value>
NUL -> U+0000
TAB -> U+0009
LF -> U+000A
CR -> U+000D
```
[binary operators]: expressions/operator-expr.md#arithmetic-and-logical-binary-operators
[keywords]: keywords.md
[tokens]: tokens.md
[unary operators]: expressions/operator-expr.md#borrow-operators