r[ident]
# Identifiers
r[ident.syntax]
```grammar,lexer
IDENTIFIER_OR_KEYWORD ->
XID_Start XID_Continue*
| `_` XID_Continue+
XID_Start -> <`XID_Start` defined by Unicode>
XID_Continue -> <`XID_Continue` defined by Unicode>
RAW_IDENTIFIER -> `r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self`_
NON_KEYWORD_IDENTIFIER -> IDENTIFIER_OR_KEYWORD _except a [strict][lex.keywords.strict] or [reserved][lex.keywords.reserved] keyword_
IDENTIFIER -> NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER
RESERVED_RAW_IDENTIFIER -> `r#_`
```
<!-- When updating the version, update the UAX links, too. -->
r[ident.unicode]
Identifiers follow the specification in [Unicode Standard Annex #31][UAX31] for Unicode version 16.0, with the additions described below. Some examples of identifiers:
* `foo`
* `_identifier`
* `r#true`
* `Москва`
* `東京`
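
As a rough illustration, the identifiers listed above can all be used as local variable names. The function name `identifier_examples` is hypothetical and chosen only for this sketch; the `allow` attribute is there because `Москва` starts with an uppercase letter, which would otherwise trigger a style lint for variable names.

```rust
// Hypothetical function, used only to show the example identifiers in context.
#[allow(non_snake_case)]
fn identifier_examples() -> i32 {
    let foo = 1;
    let _identifier = 2;
    // `true` is a keyword, so it needs the `r#` prefix to be used as a name.
    let r#true = 3;
    // Non-ASCII identifiers are accepted as long as they follow the profile.
    let Москва = 4;
    let 東京 = 5;
    foo + _identifier + r#true + Москва + 東京
}
```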
r[ident.profile]
The profile used from UAX #31 is:
* Start := [`XID_Start`], plus the underscore character (U+005F)
* Continue := [`XID_Continue`]
* Medial := empty
with the additional constraint that a single underscore character is not an identifier.
> [!NOTE]
> Identifiers starting with an underscore are typically used to indicate an identifier that is intentionally unused, and will silence the unused warning in `rustc`.
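
A minimal sketch of the convention described in the note, using a hypothetical function `answer`. Neither the parameter nor the local variable is read, and the leading underscore marks that as intentional.

```rust
// Hypothetical example: `_context` and `_scratch` are never read.
// Without the leading underscore, `rustc` would normally warn that
// they are unused.
fn answer(_context: &str) -> i32 {
    let _scratch = 0;
    42
}
```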
r[ident.keyword]
Identifiers may not be a [strict] or [reserved] keyword without the `r#` prefix described below in [raw identifiers](#raw-identifiers).
r[ident.zero-width-chars]
Zero width non-joiner (ZWNJ U+200C) and zero width joiner (ZWJ U+200D) characters are not allowed in identifiers.
r[ident.ascii-limitations]
Identifiers are restricted to the ASCII subset of [`XID_Start`] and [`XID_Continue`] in the following situations:
* [`extern crate`] declarations (except the [AsClause] identifier)
* External crate names referenced in a [path]
* [Module] names loaded from the filesystem without a [`path` attribute]
* [`no_mangle`] attributed items
* Item names in [external blocks]
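
The ASCII subset is small enough to check with the standard library alone. The following is a sketch rather than the compiler's implementation: the helper names `is_ascii_identifier_or_keyword` and `is_ascii_xid_continue` are hypothetical, and the check follows the `IDENTIFIER_OR_KEYWORD` grammar rule restricted to ASCII, so it does not reject keywords.

```rust
/// Hypothetical helper: the ASCII subset of `XID_Continue` is the
/// ASCII letters, the ASCII digits, and the underscore.
fn is_ascii_xid_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Hypothetical helper: returns `true` if `s` matches
/// `IDENTIFIER_OR_KEYWORD` using only ASCII characters.
/// Keywords are not filtered out here.
fn is_ascii_identifier_or_keyword(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        // The ASCII subset of `XID_Start` is the letters a-z and A-Z.
        Some(c) if c.is_ascii_alphabetic() => chars.all(is_ascii_xid_continue),
        // A leading underscore needs at least one more character,
        // because a single underscore is not an identifier.
        Some('_') => {
            let rest = chars.as_str();
            !rest.is_empty() && rest.chars().all(is_ascii_xid_continue)
        }
        // Empty input, a leading digit, or any non-ASCII start.
        _ => false,
    }
}
```

With this sketch, `foo` and `_identifier` are accepted, while `_`, `1abc`, and `東京` are rejected.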
r[ident.normalization]
## Normalization
Identifiers are normalized using Normalization Form C (NFC) as defined in [Unicode Standard Annex #15][UAX15]. Two identifiers are equal if their NFC forms are equal.
[Procedural][proc-macro] and [declarative][mbe] macros receive normalized identifiers in their input.
r[ident.raw]
## Raw identifiers
r[ident.raw.intro]
A raw identifier is like a normal identifier, but prefixed by `r#`. (Note that
the `r#` prefix is not included as part of the actual identifier.)
r[ident.raw.allowed]
Unlike a normal identifier, a raw identifier may be any strict or reserved
keyword except the ones listed above for `RAW_IDENTIFIER`.
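
A short sketch of both points, with hypothetical names (`Token`, `raw_prefix_is_not_part_of_name`): keywords such as `type` and `match` become usable as names once prefixed, and `r#foo` refers to the same identifier as `foo`.

```rust
// A struct field named after the strict keyword `type`.
struct Token {
    r#type: &'static str,
}

// A function named after the strict keyword `match`.
fn r#match(token: &Token) -> bool {
    token.r#type == "ident"
}

// The `r#` prefix is not part of the name, so `r#foo` and `foo`
// refer to the same binding.
fn raw_prefix_is_not_part_of_name() -> i32 {
    let foo = 1;
    r#foo + 1
}
```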
r[ident.raw.reserved]
To avoid confusion with the [WildcardPattern], it is an error to use the [RESERVED_RAW_IDENTIFIER] token `r#_`.
[`extern crate`]: items/extern-crates.md
[`no_mangle`]: abi.md#the-no_mangle-attribute
[`path` attribute]: items/modules.md#the-path-attribute
[`XID_Continue`]: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3AXID_Continue%3A%5D&abb=on&g=&i=
[`XID_Start`]: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3AXID_Start%3A%5D&abb=on&g=&i=
[external blocks]: items/external-blocks.md
[mbe]: macros-by-example.md
[module]: items/modules.md
[path]: paths.md
[proc-macro]: procedural-macros.md
[reserved]: keywords.md#reserved-keywords
[strict]: keywords.md#strict-keywords
[UAX15]: https://www.unicode.org/reports/tr15/tr15-56.html
[UAX31]: https://www.unicode.org/reports/tr31/tr31-41.html