 # Translation

 <div class="warning">
 rustc's current diagnostics translation infrastructure (as of
 <!-- date-check --> October 2024
 ) unfortunately causes some friction for compiler contributors, and the current
 infrastructure is mostly pending a redesign that better addresses needs of both
 compiler contributors and translation teams. Note that there are currently no
 active redesign proposals (as of
 <!-- date-check --> October 2024
 )!

 Please see the tracking issue <https://github.com/rust-lang/rust/issues/132181>
 for status updates.

 We have downgraded the internal lints `untranslatable_diagnostic` and
 `diagnostic_outside_of_impl`. Those internal lints previously required new code
 to use the current translation infrastructure. However, because the translation
 infra is waiting for a yet-to-be-proposed redesign and thus rework, we are not
 mandating usage of current translation infra. Use the infra if you *want to* or
 if it otherwise makes the code cleaner, but sidestep the translation infra if
 you need more flexibility.
 </div>

 rustc's diagnostic infrastructure supports translatable diagnostics using
 [Fluent].

 ## Writing translatable diagnostics

 There are two ways of writing translatable diagnostics:

 1. For simple diagnostics, using a diagnostic (or subdiagnostic) derive.
    ("Simple" diagnostics being those that don't require a lot of logic in
    deciding to emit subdiagnostics and can therefore be represented as
    diagnostic structs). See [the diagnostic and subdiagnostic structs
    documentation](./diagnostic-structs.md).
 2. Using typed identifiers with `Diag` APIs (in
    `Diagnostic` or `Subdiagnostic` or `LintDiagnostic` implementations).

 When adding or changing a translatable diagnostic,
 you don't need to worry about the translations.
 Only updating the original English message is required.
 Currently,
 each crate which defines translatable diagnostics has its own Fluent resource,
 which is a file named `messages.ftl`,
 located in the root of the crate
 (such as `compiler/rustc_expand/messages.ftl`).

 ## Fluent

 Fluent is built around the idea of "asymmetric localization", which aims to
 decouple the expressiveness of translations from the grammar of the source
 language (English in rustc's case). Prior to translation, rustc's diagnostics
 relied heavily on interpolation to build the messages shown to the users.
 Interpolated strings are hard to translate because writing a natural-sounding
 translation might require more, less, or just different interpolation than the
 English string, all of which would require changes to the compiler's source
 code to support.

 Diagnostic messages are defined in Fluent resources. A combined set of Fluent
 resources for a given locale (e.g. `en-US`) is known as a Fluent bundle.

 ```fluent
 typeck_address_of_temporary_taken = cannot take address of a temporary
 ```

 In the above example, `typeck_address_of_temporary_taken` is the identifier for
 a Fluent message and corresponds to the diagnostic message in English. Other
 Fluent resources can be written which would correspond to a message in another
 language. Each diagnostic therefore has at least one Fluent message.

 ```fluent
 typeck_address_of_temporary_taken = cannot take address of a temporary
     .label = temporary value
 ```

 By convention, diagnostic messages for subdiagnostics are specified as
 "attributes" on Fluent messages (additional related messages, denoted by the
 `.<attribute-name>` syntax). In the above example, `label` is an attribute of
 `typeck_address_of_temporary_taken` which corresponds to the message for the
 label added to this diagnostic.

 Diagnostic messages often interpolate additional context into the message shown
 to the user, such as the name of a type or of a variable. Additional context to
 Fluent messages is provided as an "argument" to the diagnostic.

 ```fluent
 typeck_struct_expr_non_exhaustive =
     cannot create non-exhaustive {$what} using struct expression
 ```

 In the above example, the Fluent message refers to an argument named `what`
 which is expected to exist (how arguments are provided to diagnostics is
 discussed in detail later).
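To make the role of an argument concrete, here is a minimal sketch of how a `{$what}` placeable could be filled in. It is plain standard-library Rust and is not how Fluent or rustc implement interpolation; the `interpolate` function is a hypothetical helper written only for this illustration, and it ignores the rest of Fluent's syntax (selectors, terms, functions).

```rust
use std::collections::HashMap;

/// Replaces each `{$name}` placeable in `pattern` with the matching value from `args`.
/// Placeables whose argument is missing are copied to the output unchanged.
/// (Hypothetical helper: real Fluent resolution is considerably richer.)
fn interpolate(pattern: &str, args: &HashMap<&str, String>) -> String {
    let mut out = String::new();
    let mut rest = pattern;
    while let Some(start) = rest.find("{$") {
        // Copy the literal text that precedes the placeable.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = after[..end].trim();
                match args.get(name) {
                    Some(value) => out.push_str(value),
                    // Unknown argument: keep the original `{$name}` text.
                    None => out.push_str(&rest[start..start + 2 + end + 1]),
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated placeable: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Calling `interpolate` on the message above with `what` set to `"struct"` substitutes the argument in place of `{$what}`; a translation is free to put the same placeable elsewhere in the sentence without any change to the compiler.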

 You can consult the [Fluent] documentation for other usage examples of Fluent
 and its syntax.

 ### Guideline for message naming

 Usually, Fluent uses `-` for separating words inside a message name. However,
 `_` is accepted by Fluent as well. Because identifiers on the Rust side use
 `_`, it fits rustc's use cases better: inside rustc, `-` is not allowed for
 separating words, and `_` is used instead. The only exception is a leading
 `-`, for message names like `-passes_see_issue`.
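The convention can be expressed as a small predicate. This is only a sketch of the rule as stated above, not the check that rustc performs; `follows_rustc_naming` is a hypothetical name, and the assumption that a name must start with an ASCII letter comes from Fluent's identifier syntax rather than from this page.

```rust
/// Returns `true` if `name` follows the convention described above:
/// words separated by `_`, with `-` permitted only as a single leading character.
/// (Hypothetical helper, not rustc's actual validation.)
fn follows_rustc_naming(name: &str) -> bool {
    // Allow, then ignore, one leading `-`.
    let body = name.strip_prefix('-').unwrap_or(name);
    let mut chars = body.chars();
    // Assumption: identifiers begin with an ASCII letter, as in Fluent's grammar.
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() => {}
        _ => return false,
    }
    // Every remaining character must be a letter, digit, or `_`.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}
```

Under this sketch, `typeck_address_of_temporary_taken` and `-passes_see_issue` are accepted, while a name such as `typeck-address-of-temporary` is rejected.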

 ### Guidelines for writing translatable messages

 For a message to be translatable into different languages, all of the
 information required by any language must be provided to the diagnostic as an
 argument (not just the information required in the English message).

 As the compiler team gains more experience writing diagnostics that have all of
 the information necessary to be translated into different languages, this page
 will be updated with more guidance. For now, the [Fluent] documentation has
 excellent examples of translating messages into different locales and the
 information that needs to be provided by the code to do so.

 ### Compile-time validation and typed identifiers

 rustc's `fluent_messages` macro performs compile-time validation of Fluent
 resources and generates code to make it easier to refer to Fluent messages in
 diagnostics.

 Compile-time validation of Fluent resources will emit any parsing errors
 from Fluent resources while building the compiler, preventing invalid Fluent
 resources from causing panics in the compiler. Compile-time validation also
 emits an error if multiple Fluent messages have the same identifier.
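As an illustration of the duplicate-identifier check, the sketch below scans Fluent source text and reports identifiers that are defined more than once. It is not the `fluent_messages` macro: `duplicate_message_ids` is a hypothetical function, and it uses a deliberately crude notion of "message definition" (an unindented, non-comment line containing `=`) instead of a real Fluent parser.

```rust
use std::collections::HashSet;

/// Collects message identifiers that are defined more than once in `source`.
/// Simplification: a message definition is any line that starts in column zero,
/// is not a comment, and contains `=`. Attribute lines are indented, so they
/// are skipped.
fn duplicate_message_ids(source: &str) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut duplicates: Vec<String> = Vec::new();
    for line in source.lines() {
        let indented = line.starts_with(char::is_whitespace);
        if line.is_empty() || indented || line.starts_with('#') {
            continue;
        }
        if let Some((id, _value)) = line.split_once('=') {
            let id = id.trim();
            // `insert` returns false when the identifier was already present.
            if !seen.insert(id) && !duplicates.iter().any(|d| d.as_str() == id) {
                duplicates.push(id.to_string());
            }
        }
    }
    duplicates
}
```

Running a check like this while building the compiler, rather than when a diagnostic is emitted, is what turns a potential runtime panic into a build error.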

 ## Internals

 Various parts of rustc's diagnostic internals are modified in order to support
 translation.

 ### Messages

 All of rustc's traditional diagnostic APIs (e.g. `struct_span_err` or `note`)
 take any message that can be converted into a `DiagMessage` (or
 `SubdiagMessage`).

 [`rustc_error_messages::DiagMessage`] can represent legacy non-translatable
 diagnostic messages and translatable messages. Non-translatable messages are
 just `String`s. Translatable messages are just a `&'static str` with the
 identifier of the Fluent message (sometimes with an additional `&'static str`
 with an attribute).

 `DiagMessage` never needs to be interacted with directly:
 either `DiagMessage` constants are created for each diagnostic message in a
 Fluent resource (described in more detail below), or `DiagMessage`s are
 created in the macro-generated code of a diagnostic derive.

 `rustc_error_messages::SubdiagMessage` is similar: it can correspond to a
 legacy non-translatable diagnostic message or the name of an attribute to a
 Fluent message. Translatable `SubdiagMessage`s must be combined with a
 `DiagMessage` (using `DiagMessage::with_subdiagnostic_message`) to
 be emitted (an attribute name on its own is meaningless without a corresponding
 message identifier, which is what `DiagMessage` provides).

 Both `DiagMessage` and `SubdiagMessage` implement `Into` for any
 type that can be converted into a string, and convert these into
 non-translatable diagnostics. This keeps all existing diagnostic calls
 working.
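The relationship between the two message types can be modelled in a few lines. The sketch below is a simplified stand-in, not the definitions in `rustc_error_messages`: the `Message` and `SubMessage` enums, their variants, and `with_sub` are hypothetical names chosen for this example, and the behaviour when a plain-string parent is combined with an attribute (keep the parent's text) is an assumption of the sketch.

```rust
/// Simplified stand-in for `DiagMessage`.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    /// Legacy, non-translatable text.
    Str(String),
    /// Fluent message identifier, optionally paired with an attribute name.
    FluentIdentifier(&'static str, Option<&'static str>),
}

/// Simplified stand-in for `SubdiagMessage`.
#[derive(Debug, Clone, PartialEq)]
enum SubMessage {
    /// Legacy, non-translatable text.
    Str(String),
    /// Name of an attribute; meaningless without a parent message identifier.
    FluentAttr(&'static str),
}

// Plain strings convert into non-translatable messages, so older call sites
// that pass string literals or `format!` output keep compiling.
impl From<&str> for Message {
    fn from(s: &str) -> Self {
        Message::Str(s.to_string())
    }
}

impl From<String> for Message {
    fn from(s: String) -> Self {
        Message::Str(s)
    }
}

impl Message {
    /// Combines a parent message with a subdiagnostic message.
    fn with_sub(&self, sub: SubMessage) -> Message {
        match (self, sub) {
            // A plain-string subdiagnostic needs no parent identifier.
            (_, SubMessage::Str(text)) => Message::Str(text),
            // An attribute name is attached to the parent's identifier.
            (Message::FluentIdentifier(id, _), SubMessage::FluentAttr(attr)) => {
                Message::FluentIdentifier(*id, Some(attr))
            }
            // Assumption of this sketch: a plain-string parent keeps its text.
            (Message::Str(text), SubMessage::FluentAttr(_)) => Message::Str(text.clone()),
        }
    }
}
```

In this model, combining the identifier `typeck_address_of_temporary_taken` with the attribute `label` yields a message that names both, which is the pair a Fluent lookup needs.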

 ### Arguments

 Additional context that is interpolated into the contents of a Fluent message
 needs to be provided to translatable diagnostics.

 Diagnostics have a `set_arg` function that can be used to provide this
 additional context to a diagnostic.

 Arguments have both a name (e.g. "what" in the earlier example) and a value.
 Argument values are represented using the `DiagArgValue` type, which is
 just a string or a number. rustc types can implement `IntoDiagArg` with
 conversion into a string or a number, and common types like `Ty<'tcx>` already
 have such implementations.

 `set_arg` calls are handled transparently by diagnostic derives but need to be
 added manually when using diagnostic builder APIs.
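The following sketch shows the shape of this mechanism with stand-in types. `ArgValue`, `IntoArg`, and `SketchDiag` are hypothetical names that mirror the roles of `DiagArgValue`, `IntoDiagArg`, and a diagnostic with a `set_arg` function; the real types in rustc have different definitions and more variants.

```rust
use std::collections::BTreeMap;

/// Stand-in for `DiagArgValue`: a string or a number.
#[derive(Debug, Clone, PartialEq)]
enum ArgValue {
    Str(String),
    Number(i64),
}

/// Stand-in for `IntoDiagArg`: types that know how to become an argument value.
trait IntoArg {
    fn into_arg(self) -> ArgValue;
}

impl IntoArg for &str {
    fn into_arg(self) -> ArgValue {
        ArgValue::Str(self.to_string())
    }
}

impl IntoArg for String {
    fn into_arg(self) -> ArgValue {
        ArgValue::Str(self)
    }
}

impl IntoArg for i64 {
    fn into_arg(self) -> ArgValue {
        ArgValue::Number(self)
    }
}

/// Stand-in for a diagnostic that collects named arguments.
#[derive(Debug, Default)]
struct SketchDiag {
    args: BTreeMap<&'static str, ArgValue>,
}

impl SketchDiag {
    /// Records `value` under `name`; returns `&mut Self` so calls can be chained.
    fn set_arg(&mut self, name: &'static str, value: impl IntoArg) -> &mut Self {
        self.args.insert(name, value.into_arg());
        self
    }
}
```

With these definitions, `diag.set_arg("what", "struct")` stores the value that a message containing `{$what}` would later interpolate. A diagnostic derive generates the equivalent calls from the fields of the diagnostic struct.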

 ### Loading

 rustc makes a distinction between the "fallback bundle" for `en-US`, which is
 used by default and when another locale is missing a message, and the primary
 Fluent bundle, which is requested by the user.

 Diagnostic emitters implement the `Emitter` trait which has two functions for
 accessing the fallback and primary fluent bundles (`fallback_fluent_bundle` and
 `fluent_bundle` respectively).

 `Emitter` also has member functions with default implementations for performing
 translation of a `DiagMessage` using the results of
 `fallback_fluent_bundle` and `fluent_bundle`.

 All of the emitters in rustc load the fallback Fluent bundle lazily, only
 reading Fluent resources and parsing them when an error message is first being
 translated (for performance reasons - it doesn't make sense to do this if no
 error is being emitted). `rustc_error_messages::fallback_fluent_bundle` returns
 a `std::lazy::Lazy<FluentBundle>` which is provided to emitters and evaluated
 in the first call to `Emitter::fallback_fluent_bundle`.

 The primary Fluent bundle (for the user's desired locale) is expected to be
 returned by `Emitter::fluent_bundle`. This bundle is used preferentially when
 translating messages; the fallback bundle is only used if the primary bundle is
 missing a message or not provided.
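The lookup order and the lazy loading of the fallback bundle can be sketched with the standard library alone. Everything named here is hypothetical: `Bundle` is a plain map standing in for a Fluent bundle, `LazyFallback` stands in for the lazily evaluated fallback bundle, and `translate` stands in for the default translation logic on `Emitter`. The sketch uses `std::sync::OnceLock` for the lazy cell, which is an implementation choice of the example and not a claim about rustc.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Stand-in for a Fluent bundle: message identifier to message text.
type Bundle = HashMap<&'static str, &'static str>;

/// Fallback bundle that is only built the first time it is needed.
struct LazyFallback {
    cell: OnceLock<Bundle>,
    load: fn() -> Bundle,
}

impl LazyFallback {
    fn new(load: fn() -> Bundle) -> Self {
        Self { cell: OnceLock::new(), load }
    }

    /// Builds the bundle on the first call and reuses it afterwards.
    fn get(&self) -> &Bundle {
        self.cell.get_or_init(self.load)
    }

    /// Reports whether the bundle has been built yet.
    fn is_loaded(&self) -> bool {
        self.cell.get().is_some()
    }
}

/// Looks `id` up in the primary bundle first and consults the fallback only
/// when the primary bundle is absent or lacks the message.
fn translate<'a>(
    primary: Option<&'a Bundle>,
    fallback: &'a LazyFallback,
    id: &str,
) -> Option<&'a str> {
    if let Some(message) = primary.and_then(|bundle| bundle.get(id)) {
        return Some(*message);
    }
    fallback.get().get(id).copied()
}

/// Example loader for the English fallback bundle.
fn english_bundle() -> Bundle {
    HashMap::from([(
        "typeck_address_of_temporary_taken",
        "cannot take address of a temporary",
    )])
}
```

In this sketch the fallback bundle stays unbuilt for as long as every lookup is satisfied by the primary bundle, and it is built exactly once when a lookup first falls through to it.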

 There are no locale bundles distributed with the compiler,
 but mechanisms are implemented for loading them.

 - `-Ztranslate-additional-ftl` can be used to load a specific resource as the
   primary bundle for testing purposes.
 - `-Ztranslate-lang` can be provided a language identifier (something like
   `en-US`) and will load any Fluent resources found in the
   `$sysroot/share/locale/$locale/` directory (both the user-provided
   sysroot and any sysroot candidates).

 Primary bundles are not currently loaded lazily and if requested will be loaded
 at the start of compilation regardless of whether an error occurs. Lazily
 loading primary bundles is possible if it can be assumed that loading a bundle
 won't fail. Bundle loading can fail if a requested locale is missing, Fluent
 files are malformed, or a message is duplicated in multiple resources.

 [Fluent]: https://projectfluent.org
 [`compiler/rustc_borrowck/messages.ftl`]: https://github.com/rust-lang/rust/blob/HEAD/compiler/rustc_borrowck/messages.ftl
 [`compiler/rustc_parse/messages.ftl`]: https://github.com/rust-lang/rust/blob/HEAD/compiler/rustc_parse/messages.ftl
 [`rustc_error_messages::DiagMessage`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_error_messages/enum.DiagMessage.html