|  | ====================================== | 
|  | Syntax of AMDGPU Instruction Modifiers | 
|  | ====================================== | 
|  |  | 
|  | .. contents:: | 
|  |    :local: | 
|  |  | 
|  | Conventions | 
|  | =========== | 
|  |  | 
|  | The following notation is used throughout this document: | 
|  |  | 
|  | =================== ============================================================= | 
|  | Notation            Description | 
|  | =================== ============================================================= | 
|  | {0..N}              Any integer value in the range from 0 to N (inclusive). | 
|  | <x>                 Syntax and meaning of *x* is explained elsewhere. | 
|  | =================== ============================================================= | 
|  |  | 
|  | .. _amdgpu_syn_modifiers: | 
|  |  | 
|  | Modifiers | 
|  | ========= | 
|  |  | 
|  | DS Modifiers | 
|  | ------------ | 
|  |  | 
|  | .. _amdgpu_synid_ds_offset8: | 
|  |  | 
|  | offset8 | 
|  | ~~~~~~~ | 
|  |  | 
|  | Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | Used with DS instructions which have 2 addresses. | 
|  |  | 
|  | =================== ==================================================================== | 
|  | Syntax              Description | 
|  | =================== ==================================================================== | 
|  | offset:{0..0xFF}    Specifies an unsigned 8-bit offset as a non-negative | 
|  |                     :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                     or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | =================== ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:0xff | 
|  |    offset:2-x | 
|  |    offset:-x-y | 
|  |  | 
|  | .. _amdgpu_synid_ds_offset16: | 
|  |  | 
|  | offset16 | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | Used with DS instructions which have 1 address. | 
|  |  | 
|  | ==================== ==================================================================== | 
|  | Syntax               Description | 
|  | ==================== ==================================================================== | 
|  | offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a non-negative | 
|  |                      :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | ==================== ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:65535 | 
|  |    offset:0xffff | 
|  |    offset:-x-y | 
|  |  | 
|  | .. _amdgpu_synid_sw_offset16: | 
|  |  | 
|  | swizzle pattern | 
|  | ~~~~~~~~~~~~~~~ | 
|  |  | 
|  | This is a special modifier which may be used with the *ds_swizzle_b32* instruction only. | 
|  | It specifies a swizzle pattern in numeric or symbolic form. The default value is 0. | 
|  |  | 
|  | See AMD documentation for more information. | 
|  |  | 
|  | ======================================================= =========================================================== | 
|  | Syntax                                                  Description | 
|  | ======================================================= =========================================================== | 
|  | offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern. | 
|  | offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern. | 
|  |  | 
|  |                                                         Each number is a lane *id*. | 
|  | offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern. | 
|  |  | 
|  |                                                         The pattern converts a 5-bit lane *id* to another | 
|  |                                                         lane *id* with which the lane interacts. | 
|  |  | 
|  |                                                         *mask* is a 5-character sequence which | 
|  |                                                         specifies how to transform the bits of the | 
|  |                                                         lane *id*. | 
|  |  | 
|  |                                                         The following characters are allowed: | 
|  |  | 
|  |                                                         * "0" - set bit to 0. | 
|  |  | 
|  |                                                         * "1" - set bit to 1. | 
|  |  | 
|  |                                                         * "p" - preserve bit. | 
|  |  | 
|  |                                                         * "i" - invert bit. | 
|  |  | 
|  | offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode. | 
|  |  | 
|  |                                                         Broadcasts the value of any particular lane to | 
|  |                                                         all lanes in its group. | 
|  |  | 
|  |                                                         The first numeric parameter is a group | 
|  |                                                         size and must be equal to 2, 4, 8, 16 or 32. | 
|  |  | 
|  |                                                         The second numeric parameter is an index of the | 
|  |                                                         lane being broadcast. | 
|  |  | 
|  |                                                         The index must be less than the group size. | 
|  | offset:swizzle(SWAP,{1..16})                            Specifies a swap mode. | 
|  |  | 
|  |                                                         Swaps the neighboring groups of | 
|  |                                                         1, 2, 4, 8 or 16 lanes. | 
|  | offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode. | 
|  |  | 
|  |                                                         Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes. | 
|  | ======================================================= =========================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:255 | 
|  |    offset:0xffff | 
|  |    offset:swizzle(QUAD_PERM, 0, 1, 2, 3) | 
|  |    offset:swizzle(BITMASK_PERM, "01pi0") | 
|  |    offset:swizzle(BROADCAST, 2, 0) | 
|  |    offset:swizzle(SWAP, 8) | 
|  |    offset:swizzle(REVERSE, 30 + 2) | 
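
The symbolic swizzle modes described above can be read as functions from a lane *id* to the *id* of the lane it interacts with. The following Python sketch models that mapping for illustration only. The function names are hypothetical, and two points are assumptions rather than statements from this document: the leftmost character of *mask* is taken to apply to the most significant bit of the 5-bit lane *id*, and QUAD_PERM is taken to select a source lane within each group of four lanes. The sketch does not compute the instruction encoding.

```python
def _check_lane(lane):
    # The patterns operate on a 5-bit lane id.
    if not 0 <= lane <= 31:
        raise ValueError("lane id must be in 0..31")


def quad_perm(lane, p0, p1, p2, p3):
    """QUAD_PERM: pick a source lane inside the lane's own group of four."""
    _check_lane(lane)
    perm = (p0, p1, p2, p3)
    if any(not 0 <= p <= 3 for p in perm):
        raise ValueError("each QUAD_PERM lane id must be in 0..3")
    return (lane & ~3) | perm[lane & 3]


def bitmask_perm(lane, mask):
    """BITMASK_PERM: transform each bit of the lane id as the mask says."""
    _check_lane(lane)
    if len(mask) != 5 or any(ch not in "01pi" for ch in mask):
        raise ValueError("mask must be 5 characters from '0', '1', 'p', 'i'")
    result = 0
    for pos, ch in enumerate(mask):
        bit = 4 - pos  # assumption: leftmost character controls bit 4
        old = (lane >> bit) & 1
        if ch == "0":
            new = 0          # set bit to 0
        elif ch == "1":
            new = 1          # set bit to 1
        elif ch == "p":
            new = old        # preserve bit
        else:
            new = old ^ 1    # invert bit
        result |= new << bit
    return result


def broadcast(lane, group_size, index):
    """BROADCAST: every lane in a group reads lane `index` of that group."""
    _check_lane(lane)
    if group_size not in (2, 4, 8, 16, 32):
        raise ValueError("group size must be 2, 4, 8, 16 or 32")
    if not 0 <= index < group_size:
        raise ValueError("index must be less than the group size")
    return (lane // group_size) * group_size + index


def swap(lane, group_size):
    """SWAP: exchange neighboring groups of `group_size` lanes."""
    _check_lane(lane)
    if group_size not in (1, 2, 4, 8, 16):
        raise ValueError("group size must be 1, 2, 4, 8 or 16")
    return lane ^ group_size


def reverse(lane, group_size):
    """REVERSE: reverse the lane order inside each group."""
    _check_lane(lane)
    if group_size not in (2, 4, 8, 16, 32):
        raise ValueError("group size must be 2, 4, 8, 16 or 32")
    return lane ^ (group_size - 1)
```

Under these assumptions, ``bitmask_perm(0b10101, "01pi0")`` clears bits 4 and 0, sets bit 3, keeps bit 2 and inverts bit 1.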
|  |  | 
|  | .. _amdgpu_synid_gds: | 
|  |  | 
|  | gds | 
|  | ~~~ | 
|  |  | 
|  | Specifies whether to use GDS or LDS memory (LDS is the default). | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | gds                                      Use GDS memory. | 
|  | ======================================== ================================================ | 
|  |  | 
|  |  | 
|  | EXP Modifiers | 
|  | ------------- | 
|  |  | 
|  | .. _amdgpu_synid_done: | 
|  |  | 
|  | done | 
|  | ~~~~ | 
|  |  | 
|  | Specifies if this is the last export from the shader to the target. By default, | 
|  | the *exp* instruction does not finish an export sequence. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | done                                     Indicates the last export operation. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_compr: | 
|  |  | 
|  | compr | 
|  | ~~~~~ | 
|  |  | 
|  | Indicates if the data are compressed (data are not compressed by default). | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | compr                                    Data are compressed. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_vm: | 
|  |  | 
|  | vm | 
|  | ~~ | 
|  |  | 
|  | Specifies valid mask flag state (off by default). | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | vm                                       Set valid mask flag. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | FLAT Modifiers | 
|  | -------------- | 
|  |  | 
|  | .. _amdgpu_synid_flat_offset12: | 
|  |  | 
|  | offset12 | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | Cannot be used with *global/scratch* opcodes. GFX9 only. | 
|  |  | 
|  | ================= ==================================================================== | 
|  | Syntax            Description | 
|  | ================= ==================================================================== | 
|  | offset:{0..4095}  Specifies a 12-bit unsigned offset as a non-negative | 
|  |                   :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                   or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | ================= ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:4095 | 
|  |    offset:x-0xff | 
|  |  | 
|  | .. _amdgpu_synid_flat_offset13s: | 
|  |  | 
|  | offset13s | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | Specifies an immediate signed 13-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | Can be used with *global/scratch* opcodes only. GFX9 only. | 
|  |  | 
|  | ===================== ==================================================================== | 
|  | Syntax                Description | 
|  | ===================== ==================================================================== | 
|  | offset:{-4096..4095}  Specifies a 13-bit signed offset as an | 
|  |                       :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | ===================== ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:-4000 | 
|  |    offset:0x10 | 
|  |    offset:-x | 
|  |  | 
|  | .. _amdgpu_synid_flat_offset12s: | 
|  |  | 
|  | offset12s | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | Specifies an immediate signed 12-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | Can be used with *global/scratch* opcodes only. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | ===================== ==================================================================== | 
|  | Syntax                Description | 
|  | ===================== ==================================================================== | 
|  | offset:{-2048..2047}  Specifies a 12-bit signed offset as an | 
|  |                       :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | ===================== ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:-2000 | 
|  |    offset:0x10 | 
|  |    offset:-x+y | 
|  |  | 
|  | .. _amdgpu_synid_flat_offset11: | 
|  |  | 
|  | offset11 | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | Cannot be used with *global/scratch* opcodes. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | ================= ==================================================================== | 
|  | Syntax            Description | 
|  | ================= ==================================================================== | 
|  | offset:{0..2047}  Specifies an 11-bit unsigned offset as a non-negative | 
|  |                   :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                   or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | ================= ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:2047 | 
|  |    offset:x+0xff | 
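
The four FLAT offset variants above differ only in target (GFX9 or GFX10), in whether the opcode is a *global/scratch* opcode, and in the permitted range. As a minimal sketch, the Python table below restates those rules and checks a value against them. The names ``FLAT_OFFSET_RANGES`` and ``check_flat_offset`` are hypothetical and are not part of the assembler.

```python
# (target, is_global_or_scratch) -> (modifier name, minimum, maximum)
FLAT_OFFSET_RANGES = {
    ("GFX9", False): ("offset12", 0, 4095),
    ("GFX9", True): ("offset13s", -4096, 4095),
    ("GFX10", True): ("offset12s", -2048, 2047),
    ("GFX10", False): ("offset11", 0, 2047),
}


def check_flat_offset(target, is_global_or_scratch, value=0):
    """Return the applicable modifier name, or raise if value is out of range."""
    name, lo, hi = FLAT_OFFSET_RANGES[(target, is_global_or_scratch)]
    if not lo <= value <= hi:
        raise ValueError(f"{name} must be in {lo}..{hi}, got {value}")
    return name
```

For example, ``check_flat_offset("GFX10", True, -2000)`` selects ``offset12s``, while the same value is rejected for a GFX10 opcode that is not *global/scratch*.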
|  |  | 
|  | dlc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only. | 
|  |  | 
|  | glc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_glc>`. | 
|  |  | 
|  | lds | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only. | 
|  |  | 
|  | slc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_slc>`. | 
|  |  | 
|  | tfe | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_tfe>`. | 
|  |  | 
|  | nv | 
|  | ~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_nv>`. | 
|  |  | 
|  | MIMG Modifiers | 
|  | -------------- | 
|  |  | 
|  | .. _amdgpu_synid_dmask: | 
|  |  | 
|  | dmask | 
|  | ~~~~~ | 
|  |  | 
|  | Specifies which channels (image components) are used by the operation. By default, no channels | 
|  | are used. | 
|  |  | 
|  | =============== ==================================================================== | 
|  | Syntax          Description | 
|  | =============== ==================================================================== | 
|  | dmask:{0..15}   Specifies image channels as a non-negative | 
|  |                 :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                 or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  |                 Each bit corresponds to one of 4 image components (RGBA). | 
|  |  | 
|  |                 If the specified bit value is 0, the component is not used, | 
|  |                 value 1 means that the component is used. | 
|  | =============== ==================================================================== | 
|  |  | 
|  | This modifier has some limitations depending on instruction kind: | 
|  |  | 
|  | =================================================== ======================== | 
|  | Instruction Kind                                    Valid dmask Values | 
|  | =================================================== ======================== | 
|  | 32-bit atomic *cmpswap*                             0x3 | 
|  | 32-bit atomic instructions except for *cmpswap*     0x1 | 
|  | 64-bit atomic *cmpswap*                             0xF | 
|  | 64-bit atomic instructions except for *cmpswap*     0x3 | 
|  | *gather4*                                           0x1, 0x2, 0x4, 0x8 | 
|  | Other instructions                                  any value | 
|  | =================================================== ======================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    dmask:0xf | 
|  |    dmask:0b1111 | 
|  |    dmask:x|y|z | 
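
To illustrate how a *dmask* value selects channels and how the per-instruction limitations apply, here is a short Python sketch. It assumes that bit 0 corresponds to R and bit 3 to A, which this document does not state explicitly, and the instruction kind keys are hypothetical labels for the rows of the table above.

```python
CHANNELS = "RGBA"  # assumption: bit 0 is R, bit 1 is G, bit 2 is B, bit 3 is A

# Valid dmask values per instruction kind, restating the table above.
VALID_DMASK = {
    "atomic32_cmpswap": {0x3},
    "atomic32": {0x1},
    "atomic64_cmpswap": {0xF},
    "atomic64": {0x3},
    "gather4": {0x1, 0x2, 0x4, 0x8},
}


def dmask_channels(dmask):
    """List the image components enabled by a dmask value."""
    if not 0 <= dmask <= 15:
        raise ValueError("dmask must be in 0..15")
    return [name for bit, name in enumerate(CHANNELS) if (dmask >> bit) & 1]


def is_valid_dmask(kind, dmask):
    """Check a dmask value against the limitations for an instruction kind."""
    allowed = VALID_DMASK.get(kind)
    if allowed is None:  # other instructions accept any value in range
        return 0 <= dmask <= 15
    return dmask in allowed
```

With these definitions, ``dmask_channels(0b0101)`` yields the R and B components, and ``is_valid_dmask("gather4", 0x3)`` is false because *gather4* accepts exactly one channel.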
|  |  | 
|  | .. _amdgpu_synid_unorm: | 
|  |  | 
|  | unorm | 
|  | ~~~~~ | 
|  |  | 
|  | Specifies whether the address is normalized or not (the address is normalized by default). | 
|  |  | 
|  | ======================== ======================================== | 
|  | Syntax                   Description | 
|  | ======================== ======================================== | 
|  | unorm                    Force the address to be unnormalized. | 
|  | ======================== ======================================== | 
|  |  | 
|  | glc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_glc>`. | 
|  |  | 
|  | slc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_slc>`. | 
|  |  | 
|  | .. _amdgpu_synid_r128: | 
|  |  | 
|  | r128 | 
|  | ~~~~ | 
|  |  | 
|  | Specifies texture resource size. The default size is 256 bits. | 
|  |  | 
|  | GFX7, GFX8 and GFX10 only. | 
|  |  | 
|  | =================== ================================================ | 
|  | Syntax              Description | 
|  | =================== ================================================ | 
|  | r128                Specifies a 128-bit texture resource size. | 
|  | =================== ================================================ | 
|  |  | 
|  | .. WARNING:: Using this modifier should decrease the *rsrc* operand size from 8 to 4 dwords, but the assembler does not currently support this feature. | 
|  |  | 
|  | tfe | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_tfe>`. | 
|  |  | 
|  | .. _amdgpu_synid_lwe: | 
|  |  | 
|  | lwe | 
|  | ~~~ | 
|  |  | 
|  | Specifies LOD warning status (LOD warning is disabled by default). | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | lwe                                      Enables LOD warning. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_da: | 
|  |  | 
|  | da | 
|  | ~~ | 
|  |  | 
|  | Specifies if an array index must be sent to TA. By default, the array index is not sent. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | da                                       Send an array-index to TA. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_d16: | 
|  |  | 
|  | d16 | 
|  | ~~~ | 
|  |  | 
|  | Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | d16                                      Enables 16-bit data mode. | 
|  |  | 
|  |                                          On loads, convert data in memory to 16-bit | 
|  |                                          format before storing it in VGPRs. | 
|  |  | 
|  |                                          For stores, convert 16-bit data in VGPRs to | 
|  |                                          32 bits before going to memory. | 
|  |  | 
|  |                                          Note that GFX8.0 does not support data packing. | 
|  |                                          Each 16-bit data element occupies 1 VGPR. | 
|  |  | 
|  |                                          GFX8.1, GFX9 and GFX10 support data packing. | 
|  |                                          Each pair of 16-bit data elements | 
|  |                                          occupies 1 VGPR. | 
|  | ======================================== ================================================ | 
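
The packing rule above determines how many VGPRs the data operand needs in *d16* mode. A minimal Python sketch, assuming a hypothetical helper ``d16_data_vgprs`` that takes the number of 16-bit data elements and whether the target supports packing:

```python
def d16_data_vgprs(num_elements, packed):
    """Number of VGPRs occupied by `num_elements` 16-bit data elements.

    packed=False models GFX8.0 (one element per VGPR);
    packed=True models GFX8.1, GFX9 and GFX10 (two elements per VGPR).
    """
    if num_elements < 0:
        raise ValueError("num_elements must be non-negative")
    if packed:
        return (num_elements + 1) // 2  # an odd last element still needs a VGPR
    return num_elements
```

For four elements this gives 4 VGPRs without packing and 2 VGPRs with packing.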
|  |  | 
|  | .. _amdgpu_synid_a16: | 
|  |  | 
|  | a16 | 
|  | ~~~ | 
|  |  | 
|  | Specifies size of image address components: 16 or 32 bits (32 bits by default). | 
|  | GFX9 and GFX10 only. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | a16                                      Enables 16-bit image address components. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_dim: | 
|  |  | 
|  | dim | 
|  | ~~~ | 
|  |  | 
|  | Specifies surface dimension. This is a mandatory modifier. There is no default value. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | =============================== ========================================================= | 
|  | Syntax                          Description | 
|  | =============================== ========================================================= | 
|  | dim:1D                          One-dimensional image. | 
|  | dim:2D                          Two-dimensional image. | 
|  | dim:3D                          Three-dimensional image. | 
|  | dim:CUBE                        Cubemap array. | 
|  | dim:1D_ARRAY                    One-dimensional image array. | 
|  | dim:2D_ARRAY                    Two-dimensional image array. | 
|  | dim:2D_MSAA                     Two-dimensional multi-sample anti-aliasing image. | 
|  | dim:2D_MSAA_ARRAY               Two-dimensional multi-sample anti-aliasing image array. | 
|  | =============================== ========================================================= | 
|  |  | 
|  | The following table defines an alternative syntax which is supported | 
|  | for compatibility with SP3 assembler: | 
|  |  | 
|  | =============================== ========================================================= | 
|  | Syntax                          Description | 
|  | =============================== ========================================================= | 
|  | dim:SQ_RSRC_IMG_1D              One-dimensional image. | 
|  | dim:SQ_RSRC_IMG_2D              Two-dimensional image. | 
|  | dim:SQ_RSRC_IMG_3D              Three-dimensional image. | 
|  | dim:SQ_RSRC_IMG_CUBE            Cubemap array. | 
|  | dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array. | 
|  | dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array. | 
|  | dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample anti-aliasing image. | 
|  | dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample anti-aliasing image array. | 
|  | =============================== ========================================================= | 
|  |  | 
|  | dlc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only. | 
|  |  | 
|  | Miscellaneous Modifiers | 
|  | ----------------------- | 
|  |  | 
|  | .. _amdgpu_synid_dlc: | 
|  |  | 
|  | dlc | 
|  | ~~~ | 
|  |  | 
|  | Controls device level cache policy for memory operations. Used for synchronization. | 
|  | When specified, forces operation to bypass device level cache making the operation device | 
|  | level coherent. By default, instructions use device level cache. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | dlc                                      Bypass device level cache. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_glc: | 
|  |  | 
|  | glc | 
|  | ~~~ | 
|  |  | 
|  | This modifier has a different meaning for loads, stores, and atomic operations. | 
|  | The default value is off (0). | 
|  |  | 
|  | See AMD documentation for details. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | glc                                      Set glc bit to 1. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_lds: | 
|  |  | 
|  | lds | 
|  | ~~~ | 
|  |  | 
|  | Specifies where to store the result: VGPRs or LDS (VGPRs by default). | 
|  |  | 
|  | ======================================== =========================== | 
|  | Syntax                                   Description | 
|  | ======================================== =========================== | 
|  | lds                                      Store result in LDS. | 
|  | ======================================== =========================== | 
|  |  | 
|  | .. _amdgpu_synid_nv: | 
|  |  | 
|  | nv | 
|  | ~~ | 
|  |  | 
|  | Specifies if instruction is operating on non-volatile memory. By default, memory is volatile. | 
|  |  | 
|  | GFX9 only. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | nv                                       Indicates that instruction operates on | 
|  |                                          non-volatile memory. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_slc: | 
|  |  | 
|  | slc | 
|  | ~~~ | 
|  |  | 
|  | Specifies cache policy. The default value is off (0). | 
|  |  | 
|  | See AMD documentation for details. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | slc                                      Set slc bit to 1. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_tfe: | 
|  |  | 
|  | tfe | 
|  | ~~~ | 
|  |  | 
|  | Controls access to partially resident textures. The default value is off (0). | 
|  |  | 
|  | See AMD documentation for details. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | tfe                                      Set tfe bit to 1. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | MUBUF/MTBUF Modifiers | 
|  | --------------------- | 
|  |  | 
|  | .. _amdgpu_synid_idxen: | 
|  |  | 
|  | idxen | 
|  | ~~~~~ | 
|  |  | 
|  | Specifies whether address components include an index. By default, no components are used. | 
|  |  | 
|  | Can be used together with :ref:`offen<amdgpu_synid_offen>`. | 
|  |  | 
|  | Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | idxen                                    Address components include an index. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_offen: | 
|  |  | 
|  | offen | 
|  | ~~~~~ | 
|  |  | 
|  | Specifies whether address components include an offset. By default, no components are used. | 
|  |  | 
|  | Can be used together with :ref:`idxen<amdgpu_synid_idxen>`. | 
|  |  | 
|  | Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | offen                                    Address components include an offset. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_addr64: | 
|  |  | 
|  | addr64 | 
|  | ~~~~~~ | 
|  |  | 
|  | Specifies whether a 64-bit address is used. By default, no address is used. | 
|  |  | 
|  | GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and | 
|  | :ref:`idxen<amdgpu_synid_idxen>` modifiers. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | addr64                                   A 64-bit address is used. | 
|  | ======================================== ================================================ | 
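
The compatibility rules stated for *idxen*, *offen* and *addr64* can be summarized in a few lines. The Python sketch below is illustrative only; ``check_mubuf_addressing`` is a hypothetical helper, not an assembler API.

```python
def check_mubuf_addressing(target, idxen=False, offen=False, addr64=False):
    """Return the enabled addressing modifiers, or raise if they conflict."""
    if addr64 and target != "GFX7":
        raise ValueError("addr64 is GFX7 only")
    if addr64 and (idxen or offen):
        raise ValueError("addr64 cannot be used with idxen or offen")
    flags = (("idxen", idxen), ("offen", offen), ("addr64", addr64))
    return [name for name, enabled in flags if enabled]
```

Combining *idxen* with *offen* is accepted on any target, while *addr64* is accepted only on GFX7 and only on its own.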
|  |  | 
|  | .. _amdgpu_synid_buf_offset12: | 
|  |  | 
|  | offset12 | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0. | 
|  |  | 
|  | ================== ==================================================================== | 
|  | Syntax             Description | 
|  | ================== ==================================================================== | 
|  | offset:{0..0xFFF}  Specifies a 12-bit unsigned offset as a non-negative | 
|  |                    :ref:`integer number <amdgpu_synid_integer_number>` | 
|  |                    or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | ================== ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  |    offset:x+y | 
|  |    offset:0x10 | 
|  |  | 
|  | glc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_glc>`. | 
|  |  | 
|  | slc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_slc>`. | 
|  |  | 
|  | lds | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_lds>`. | 
|  |  | 
|  | dlc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only. | 
|  |  | 
|  | tfe | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_tfe>`. | 
|  |  | 
|  | .. _amdgpu_synid_fmt: | 
|  |  | 
|  | fmt | 
|  | ~~~ | 
|  |  | 
|  | Specifies data and numeric formats used by the operation. | 
|  | The default numeric format is BUF_NUM_FORMAT_UNORM. | 
|  | The default data format is BUF_DATA_FORMAT_8. | 
|  |  | 
|  | ========================================= =============================================================== | 
|  | Syntax                                    Description | 
|  | ========================================= =============================================================== | 
|  | format:{0..127}                           Use format specified as either an | 
|  |                                           :ref:`integer number<amdgpu_synid_integer_number>` or an | 
|  |                                           :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | format:[<data format>]                    Use the specified data format and | 
|  |                                           default numeric format. | 
|  | format:[<numeric format>]                 Use the specified numeric format and | 
|  |                                           default data format. | 
|  | format:[<data format>, <numeric format>]  Use the specified data and numeric formats. | 
|  | format:[<numeric format>, <data format>]  Use the specified data and numeric formats. | 
|  | ========================================= =============================================================== | 
|  |  | 
|  | .. _amdgpu_synid_format_data: | 
|  |  | 
|  | Supported data formats are defined in the following table: | 
|  |  | 
|  | ========================================= =============================== | 
|  | Syntax                                    Note | 
|  | ========================================= =============================== | 
|  | BUF_DATA_FORMAT_INVALID | 
|  | BUF_DATA_FORMAT_8                         Default value. | 
|  | BUF_DATA_FORMAT_16 | 
|  | BUF_DATA_FORMAT_8_8 | 
|  | BUF_DATA_FORMAT_32 | 
|  | BUF_DATA_FORMAT_16_16 | 
|  | BUF_DATA_FORMAT_10_11_11 | 
|  | BUF_DATA_FORMAT_11_11_10 | 
|  | BUF_DATA_FORMAT_10_10_10_2 | 
|  | BUF_DATA_FORMAT_2_10_10_10 | 
|  | BUF_DATA_FORMAT_8_8_8_8 | 
|  | BUF_DATA_FORMAT_32_32 | 
|  | BUF_DATA_FORMAT_16_16_16_16 | 
|  | BUF_DATA_FORMAT_32_32_32 | 
|  | BUF_DATA_FORMAT_32_32_32_32 | 
|  | BUF_DATA_FORMAT_RESERVED_15 | 
|  | ========================================= =============================== | 
|  |  | 
|  | .. _amdgpu_synid_format_num: | 
|  |  | 
|  | Supported numeric formats are defined below: | 
|  |  | 
|  | ========================================= =============================== | 
|  | Syntax                                    Note | 
|  | ========================================= =============================== | 
|  | BUF_NUM_FORMAT_UNORM                      Default value. | 
|  | BUF_NUM_FORMAT_SNORM | 
|  | BUF_NUM_FORMAT_USCALED | 
|  | BUF_NUM_FORMAT_SSCALED | 
|  | BUF_NUM_FORMAT_UINT | 
|  | BUF_NUM_FORMAT_SINT | 
|  | BUF_NUM_FORMAT_SNORM_OGL                  GFX7 only. | 
|  | BUF_NUM_FORMAT_RESERVED_6                 GFX8 and GFX9 only. | 
|  | BUF_NUM_FORMAT_FLOAT | 
|  | ========================================= =============================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | format:0 | 
|  | format:127 | 
|  | format:[BUF_DATA_FORMAT_16] | 
|  | format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED] | 
|  | format:[BUF_NUM_FORMAT_FLOAT] | 
|  |  | 
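The relationship between the symbolic and numeric forms of *format* can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes data formats are numbered 0..15 in the order of the table above, numeric formats are numbered 0..7 (with the two GFX-specific names sharing value 6), and the combined number holds the data format in its low 4 bits and the numeric format in the next 3 bits. None of these values are stated in this section, and ``encode_format`` is a hypothetical helper, not part of any assembler API.

```python
# Assumed numbering: data formats in table order, 0..15.
DATA_FORMATS = [
    "INVALID", "8", "16", "8_8", "32", "16_16", "10_11_11", "11_11_10",
    "10_10_10_2", "2_10_10_10", "8_8_8_8", "32_32", "16_16_16_16",
    "32_32_32", "32_32_32_32", "RESERVED_15",
]

# Assumed numbering: SNORM_OGL and RESERVED_6 share the same slot.
NUM_FORMATS = {
    "UNORM": 0, "SNORM": 1, "USCALED": 2, "SSCALED": 3,
    "UINT": 4, "SINT": 5, "SNORM_OGL": 6, "RESERVED_6": 6, "FLOAT": 7,
}

DATA_PREFIX = "BUF_DATA_FORMAT_"
NUM_PREFIX = "BUF_NUM_FORMAT_"


def encode_format(data="BUF_DATA_FORMAT_8", num="BUF_NUM_FORMAT_UNORM"):
    """Pack symbolic formats into one number (hypothetical helper).

    The defaults mirror the 'Default value' notes in the tables.
    """
    if not data.startswith(DATA_PREFIX) or not num.startswith(NUM_PREFIX):
        raise ValueError("unexpected format name")
    dfmt = DATA_FORMATS.index(data[len(DATA_PREFIX):])
    nfmt = NUM_FORMATS[num[len(NUM_PREFIX):]]
    # Assumed layout: data format in bits 3:0, numeric format in bits 6:4.
    return dfmt | (nfmt << 4)


print(encode_format("BUF_DATA_FORMAT_16", "BUF_NUM_FORMAT_SSCALED"))
```

Under these assumptions the largest encodable value is 127, which matches the upper bound of the numeric syntax.
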
|  | .. _amdgpu_synid_ufmt: | 
|  |  | 
|  | ufmt | 
|  | ~~~~ | 
|  |  | 
|  | Specifies a unified format used by the operation. | 
|  | The default format is BUF_FMT_8_UNORM. | 
|  | GFX10 only. | 
|  |  | 
|  | ========================================= =============================================================== | 
|  | Syntax                                    Description | 
|  | ========================================= =============================================================== | 
|  | format:{0..127}                           Use unified format specified as either an | 
|  | :ref:`integer number<amdgpu_synid_integer_number>` or an | 
|  | :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  | Note that unified format numbers are not compatible with | 
|  | format numbers used for pre-GFX10 ISA. | 
|  | format:[<unified format>]                 Use the specified unified format. | 
|  | ========================================= =============================================================== | 
|  |  | 
|  | Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>` | 
|  | and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA, | 
|  | :ref:`syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted | 
|  | provided that the combination of formats can be mapped to a unified format. | 
|  |  | 
|  | Supported unified formats and equivalent combinations of data and numeric formats | 
|  | are defined below: | 
|  |  | 
|  | ============================== ============================== ============================= | 
|  | Syntax                         Equivalent Data Format         Equivalent Numeric Format | 
|  | ============================== ============================== ============================= | 
|  | BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM | 
|  |  | 
|  | BUF_FMT_8_UNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_8_SNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_8_USCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_8_SSCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_8_UINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_8_SINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SINT | 
|  |  | 
|  | BUF_FMT_16_UNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_16_SNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_16_USCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_16_SSCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_16_UINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_16_SINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_16_FLOAT               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_8_8_UNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_8_8_SNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_8_8_USCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_8_8_SSCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_8_8_UINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_8_8_SINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SINT | 
|  |  | 
|  | BUF_FMT_32_UINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_32_SINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_32_FLOAT               BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_16_16_UNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_16_16_SNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_16_16_USCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_16_16_SSCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_16_16_UINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_16_16_SINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_16_16_FLOAT            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_10_11_11_UNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_10_11_11_SNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_10_11_11_USCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_10_11_11_SSCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_10_11_11_UINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_10_11_11_SINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_10_11_11_FLOAT         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_11_11_10_UNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_11_11_10_SNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_11_11_10_USCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_11_11_10_SSCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_11_11_10_UINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_11_11_10_SINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_11_11_10_FLOAT         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_10_10_10_2_UNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_10_10_10_2_SNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_10_10_10_2_USCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_10_10_10_2_SSCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_10_10_10_2_UINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_10_10_10_2_SINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SINT | 
|  |  | 
|  | BUF_FMT_2_10_10_10_UNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_2_10_10_10_SNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_2_10_10_10_USCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_2_10_10_10_SSCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_2_10_10_10_UINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_2_10_10_10_SINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SINT | 
|  |  | 
|  | BUF_FMT_8_8_8_8_UNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_8_8_8_8_SNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_8_8_8_8_USCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_8_8_8_8_SSCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_8_8_8_8_UINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_8_8_8_8_SINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SINT | 
|  |  | 
|  | BUF_FMT_32_32_UINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_32_32_SINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_32_32_FLOAT            BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_16_16_16_16_UNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UNORM | 
|  | BUF_FMT_16_16_16_16_SNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SNORM | 
|  | BUF_FMT_16_16_16_16_USCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_USCALED | 
|  | BUF_FMT_16_16_16_16_SSCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SSCALED | 
|  | BUF_FMT_16_16_16_16_UINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_16_16_16_16_SINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_16_16_16_16_FLOAT      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_32_32_32_UINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_32_32_32_SINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_32_32_32_FLOAT         BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_FLOAT | 
|  |  | 
|  | BUF_FMT_32_32_32_32_UINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_UINT | 
|  | BUF_FMT_32_32_32_32_SINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_SINT | 
|  | BUF_FMT_32_32_32_32_FLOAT      BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_FLOAT | 
|  | ============================== ============================== ============================= | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | format:0 | 
|  | format:[BUF_FMT_32_UINT] | 
|  |  | 
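The naming pattern in the table above is regular: apart from ``BUF_FMT_INVALID``, each unified format name is the data format suffix followed by the numeric format suffix. A minimal Python sketch of that mapping follows. ``split_unified_format`` is a hypothetical helper; it only splits names and does not check that the resulting combination actually appears in the table.

```python
NUM_SUFFIXES = ("UNORM", "SNORM", "USCALED", "SSCALED", "UINT", "SINT", "FLOAT")


def split_unified_format(name):
    """Return (data format, numeric format) names for a unified format name.

    Hypothetical helper; it does not validate that the pair is listed
    in the table of supported unified formats.
    """
    prefix = "BUF_FMT_"
    if not name.startswith(prefix):
        raise ValueError("not a unified format name: " + name)
    body = name[len(prefix):]
    if body == "INVALID":
        # Special case taken from the first row of the table.
        return ("BUF_DATA_FORMAT_INVALID", "BUF_NUM_FORMAT_UNORM")
    data, sep, num = body.rpartition("_")
    if not sep or num not in NUM_SUFFIXES:
        raise ValueError("unrecognized numeric suffix in: " + name)
    return ("BUF_DATA_FORMAT_" + data, "BUF_NUM_FORMAT_" + num)


print(split_unified_format("BUF_FMT_32_UINT"))
```
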
|  | SMRD/SMEM Modifiers | 
|  | ------------------- | 
|  |  | 
|  | glc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_glc>`. | 
|  |  | 
|  | nv | 
|  | ~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only. | 
|  |  | 
|  | dlc | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only. | 
|  |  | 
|  | VINTRP Modifiers | 
|  | ---------------- | 
|  |  | 
|  | .. _amdgpu_synid_high: | 
|  |  | 
|  | high | 
|  | ~~~~ | 
|  |  | 
|  | Specifies which half of the LDS word to use. The low half of the LDS word is used by default. | 
|  | GFX9 and GFX10 only. | 
|  |  | 
|  | ======================================== ================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================ | 
|  | high                                     Use high half of LDS word. | 
|  | ======================================== ================================ | 
|  |  | 
|  | DPP8 Modifiers | 
|  | -------------- | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | .. _amdgpu_synid_dpp8_sel: | 
|  |  | 
|  | dpp8_sel | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier. | 
|  | There is no default value. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | The *dpp8_sel* modifier must specify exactly 8 values. | 
|  | The first value selects the lane to read from to supply data to lane 0, | 
|  | the second value controls lane 1, and so on. | 
|  |  | 
|  | Each value may be specified as either | 
|  | an :ref:`integer number<amdgpu_synid_integer_number>` or | 
|  | an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | =============================================================== =========================== | 
|  | Syntax                                                          Description | 
|  | =============================================================== =========================== | 
|  | dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from. | 
|  | =============================================================== =========================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | dpp8:[7,6,5,4,3,2,1,0] | 
|  | dpp8:[0,1,0,1,0,1,0,1] | 
|  |  | 
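The lane selection described above can be modelled with a short Python sketch. It treats a wavefront as a plain list of per-lane values and ignores the exec mask and the *fi* modifier; ``apply_dpp8`` is a hypothetical name used only for illustration.

```python
def apply_dpp8(values, sel):
    """Model dpp8 lane selection within each group of 8 lanes.

    values: per-lane input values; the length must be a multiple of 8.
    sel:    the 8 selectors from dpp8:[...], each in the range 0..7.
    Inactive lanes and the fi modifier are not modelled here.
    """
    if len(sel) != 8 or any(not 0 <= s <= 7 for s in sel):
        raise ValueError("dpp8 needs exactly 8 selectors in 0..7")
    if len(values) % 8 != 0:
        raise ValueError("lane count must be a multiple of 8")
    out = []
    for lane in range(len(values)):
        base = lane - lane % 8          # first lane of this group of 8
        out.append(values[base + sel[lane % 8]])
    return out


# dpp8:[7,6,5,4,3,2,1,0] reverses every group of 8 lanes.
print(apply_dpp8(list(range(16)), [7, 6, 5, 4, 3, 2, 1, 0]))
```
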
|  | .. _amdgpu_synid_fi8: | 
|  |  | 
|  | fi | 
|  | ~~ | 
|  |  | 
|  | Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero. | 
|  |  | 
|  | Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | ==================================== ===================================================== | 
|  | Syntax                               Description | 
|  | ==================================== ===================================================== | 
|  | fi:0                                 Fetch zero when accessing data from inactive lanes. | 
|  | fi:1                                 Fetch pre-existing values from inactive lanes. | 
|  | ==================================== ===================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | DPP Modifiers | 
|  | ------------- | 
|  |  | 
|  | GFX8, GFX9 and GFX10 only. | 
|  |  | 
|  | .. _amdgpu_synid_dpp_ctrl: | 
|  |  | 
|  | dpp_ctrl | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Specifies how data are shared between threads. This is a mandatory modifier. | 
|  | There is no default value. | 
|  |  | 
|  | GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10. | 
|  |  | 
|  | Note: the lanes of a wavefront are organized in four *rows* and four *banks*. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads. | 
|  | row_mirror                               Mirror threads within row. | 
|  | row_half_mirror                          Mirror threads within 1/2 row (8 threads). | 
|  | row_bcast:15                             Broadcast 15th thread of each row to next row. | 
|  | row_bcast:31                             Broadcast thread 31 to rows 2 and 3. | 
|  | wave_shl:1                               Wavefront left shift by 1 thread. | 
|  | wave_rol:1                               Wavefront left rotate by 1 thread. | 
|  | wave_shr:1                               Wavefront right shift by 1 thread. | 
|  | wave_ror:1                               Wavefront right rotate by 1 thread. | 
|  | row_shl:{1..15}                          Row shift left by 1-15 threads. | 
|  | row_shr:{1..15}                          Row shift right by 1-15 threads. | 
|  | row_ror:{1..15}                          Row rotate right by 1-15 threads. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | quad_perm:[0, 1, 2, 3] | 
|  | row_shl:3 | 
|  |  | 
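To make the permutation-style controls more concrete, the following Python sketch models *quad_perm*, *row_mirror* and *row_half_mirror* on a list of per-lane values. It assumes a row is 16 consecutive lanes (consistent with the 8-thread half row in the table) and ignores masks and bound control. The function names are hypothetical.

```python
def quad_perm(values, sel):
    """Lane k of every group of 4 reads from lane sel[k] of that group."""
    if len(sel) != 4 or any(not 0 <= s <= 3 for s in sel):
        raise ValueError("quad_perm needs exactly 4 selectors in 0..3")
    return [values[(i - i % 4) + sel[i % 4]] for i in range(len(values))]


def row_mirror(values, width=16):
    """Reverse lanes within each block of `width` lanes.

    width=16 models row_mirror, width=8 models row_half_mirror
    (16 lanes per row is an assumption of this sketch).
    """
    return [
        values[(i - i % width) + (width - 1 - i % width)]
        for i in range(len(values))
    ]


lanes = list(range(16))
print(quad_perm(lanes, [0, 1, 2, 3]))   # identity permutation
print(row_mirror(lanes))                # full row reversed
print(row_mirror(lanes, width=8))       # each half row reversed
```
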
|  | .. _amdgpu_synid_dpp16_ctrl: | 
|  |  | 
|  | dpp16_ctrl | 
|  | ~~~~~~~~~~ | 
|  |  | 
|  | Specifies how data are shared between threads. This is a mandatory modifier. | 
|  | There is no default value. | 
|  |  | 
|  | GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9. | 
|  |  | 
|  | Note: the lanes of a wavefront are organized in four *rows* and four *banks*. | 
|  | (There are only two rows in *wave32* mode.) | 
|  |  | 
|  | ======================================== ==================================================== | 
|  | Syntax                                   Description | 
|  | ======================================== ==================================================== | 
|  | quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads. | 
|  | row_mirror                               Mirror threads within row. | 
|  | row_half_mirror                          Mirror threads within 1/2 row (8 threads). | 
|  | row_share:{0..15}                        Share the value from the specified lane with other | 
|  | lanes in the row. | 
|  | row_xmask:{0..15}                        Fetch from XOR(current lane id, specified lane id). | 
|  | row_shl:{1..15}                          Row shift left by 1-15 threads. | 
|  | row_shr:{1..15}                          Row shift right by 1-15 threads. | 
|  | row_ror:{1..15}                          Row rotate right by 1-15 threads. | 
|  | ======================================== ==================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | quad_perm:[0, 1, 2, 3] | 
|  | row_shl:3 | 
|  |  | 
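The two controls that are specific to this modifier, *row_share* and *row_xmask*, can be sketched in Python as follows. The sketch assumes rows of 16 consecutive lanes and a lane count that is a multiple of 16, and it ignores masks, bound control and inactive lanes. The function names are hypothetical.

```python
ROW = 16  # assumed number of lanes per row


def row_share(values, lane_in_row):
    """Every lane reads the value of `lane_in_row` within its own row."""
    if not 0 <= lane_in_row < ROW:
        raise ValueError("row_share expects a lane in 0..15")
    return [values[(i - i % ROW) + lane_in_row] for i in range(len(values))]


def row_xmask(values, mask):
    """Lane i reads from lane (i XOR mask) within its own row."""
    if not 0 <= mask < ROW:
        raise ValueError("row_xmask expects a mask in 0..15")
    return [
        values[(i - i % ROW) + ((i % ROW) ^ mask)]
        for i in range(len(values))
    ]


lanes = list(range(32))
print(row_share(lanes, 3)[:4], row_share(lanes, 3)[16:20])
print(row_xmask(lanes, 1)[:8])
```

With a mask of 1, *row_xmask* swaps neighbouring lanes; with a mask of 0 it leaves the data unchanged.
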
|  | .. _amdgpu_synid_dpp32_ctrl: | 
|  |  | 
|  | dpp32_ctrl | 
|  | ~~~~~~~~~~ | 
|  |  | 
|  | Specifies how data are shared between threads. This is a mandatory modifier. | 
|  | There is no default value. | 
|  |  | 
|  | May be used only with GFX90A 32-bit instructions. | 
|  |  | 
|  | Note: the lanes of a wavefront are organized in four *rows* and four *banks*. | 
|  |  | 
|  | ======================================== ================================================== | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================== | 
|  | quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads. | 
|  | row_mirror                               Mirror threads within row. | 
|  | row_half_mirror                          Mirror threads within 1/2 row (8 threads). | 
|  | row_bcast:15                             Broadcast 15th thread of each row to next row. | 
|  | row_bcast:31                             Broadcast thread 31 to rows 2 and 3. | 
|  | wave_shl:1                               Wavefront left shift by 1 thread. | 
|  | wave_rol:1                               Wavefront left rotate by 1 thread. | 
|  | wave_shr:1                               Wavefront right shift by 1 thread. | 
|  | wave_ror:1                               Wavefront right rotate by 1 thread. | 
|  | row_shl:{1..15}                          Row shift left by 1-15 threads. | 
|  | row_shr:{1..15}                          Row shift right by 1-15 threads. | 
|  | row_ror:{1..15}                          Row rotate right by 1-15 threads. | 
|  | row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row. | 
|  | ======================================== ================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | quad_perm:[0, 1, 2, 3] | 
|  | row_shl:3 | 
|  |  | 
|  |  | 
|  | .. _amdgpu_synid_dpp64_ctrl: | 
|  |  | 
|  | dpp64_ctrl | 
|  | ~~~~~~~~~~ | 
|  |  | 
|  | Specifies how data are shared between threads. This is a mandatory modifier. | 
|  | There is no default value. | 
|  |  | 
|  | May be used only with GFX90A 64-bit instructions. | 
|  |  | 
|  | Note: the lanes of a wavefront are organized in four *rows* and four *banks*. | 
|  |  | 
|  | ======================================== ================================================== | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================== | 
|  | row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row. | 
|  | ======================================== ================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | row_newbcast:3 | 
|  |  | 
|  |  | 
|  | .. _amdgpu_synid_row_mask: | 
|  |  | 
|  | row_mask | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Controls which rows are enabled for data sharing. By default, all rows are enabled. | 
|  |  | 
|  | Note: the lanes of a wavefront are organized in four *rows* and four *banks*. | 
|  | (There are only two rows in *wave32* mode.) | 
|  |  | 
|  | ================= ==================================================================== | 
|  | Syntax            Description | 
|  | ================= ==================================================================== | 
|  | row_mask:{0..15}  Specifies a *row mask* as a positive | 
|  | :ref:`integer number <amdgpu_synid_integer_number>` | 
|  | or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Each of 4 bits in the mask controls one row | 
|  | (0 - disabled, 1 - enabled). | 
|  |  | 
|  | In *wave32* mode the values should be limited to 0..7. | 
|  | ================= ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | row_mask:0xf | 
|  | row_mask:0b1010 | 
|  | row_mask:x|y | 
|  |  | 
|  | .. _amdgpu_synid_bank_mask: | 
|  |  | 
|  | bank_mask | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | Controls which banks are enabled for data sharing. By default, all banks are enabled. | 
|  |  | 
|  | Note: the lanes of a wavefront are organized in four *rows* and four *banks*. | 
|  | (There are only two rows in *wave32* mode.) | 
|  |  | 
|  | ================== ==================================================================== | 
|  | Syntax             Description | 
|  | ================== ==================================================================== | 
|  | bank_mask:{0..15}  Specifies a *bank mask* as a positive | 
|  | :ref:`integer number <amdgpu_synid_integer_number>` | 
|  | or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Each of 4 bits in the mask controls one bank | 
|  | (0 - disabled, 1 - enabled). | 
|  | ================== ==================================================================== | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | bank_mask:0x3 | 
|  | bank_mask:0b0011 | 
|  | bank_mask:x&y | 
|  |  | 
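The combined effect of *row_mask* and *bank_mask* can be illustrated with a small Python sketch that lists the lanes left enabled. It rests on assumptions not spelled out in this section: a row is 16 consecutive lanes, a bank is 4 consecutive lanes within a row, and bit *i* of each mask corresponds to row *i* or bank *i*. ``enabled_lanes`` is a hypothetical helper.

```python
def enabled_lanes(row_mask=0xF, bank_mask=0xF, wave_size=64):
    """Return the lanes enabled by the given masks.

    Assumed layout: 16 lanes per row, 4 lanes per bank within a row,
    mask bit i controls row i (or bank i). Defaults enable everything.
    """
    lanes = []
    for lane in range(wave_size):
        row = lane // 16
        bank = (lane % 16) // 4
        if (row_mask >> row) & 1 and (bank_mask >> bank) & 1:
            lanes.append(lane)
    return lanes


# Rows 1 and 3, bank 0 only.
print(enabled_lanes(row_mask=0b1010, bank_mask=0x1))
```
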
|  | .. _amdgpu_synid_bound_ctrl: | 
|  |  | 
|  | bound_ctrl | 
|  | ~~~~~~~~~~ | 
|  |  | 
|  | Controls data sharing when accessing an invalid lane. By default, data sharing with | 
|  | invalid lanes is disabled. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | bound_ctrl:1                             Enables data sharing with invalid lanes. | 
|  |  | 
|  | Accessing data from an invalid lane will | 
|  | return zero. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_fi16: | 
|  |  | 
|  | fi | 
|  | ~~ | 
|  |  | 
|  | Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero. | 
|  |  | 
|  | Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero. | 
|  |  | 
|  | GFX10 only. | 
|  |  | 
|  | ======================================== ================================================== | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================== | 
|  | fi:0                                     Interaction with inactive lanes is controlled by | 
|  | :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`. | 
|  |  | 
|  | fi:1                                     Fetch pre-existing values from inactive lanes. | 
|  | ======================================== ================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | SDWA Modifiers | 
|  | -------------- | 
|  |  | 
|  | GFX8, GFX9 and GFX10 only. | 
|  |  | 
|  | clamp | 
|  | ~~~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_clamp>`. | 
|  |  | 
|  | omod | 
|  | ~~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_omod>`. | 
|  |  | 
|  | GFX9 and GFX10 only. | 
|  |  | 
|  | .. _amdgpu_synid_dst_sel: | 
|  |  | 
|  | dst_sel | 
|  | ~~~~~~~ | 
|  |  | 
|  | Selects which bits in the destination are affected. By default, all bits are affected. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | dst_sel:DWORD                            Use bits 31:0. | 
|  | dst_sel:BYTE_0                           Use bits 7:0. | 
|  | dst_sel:BYTE_1                           Use bits 15:8. | 
|  | dst_sel:BYTE_2                           Use bits 23:16. | 
|  | dst_sel:BYTE_3                           Use bits 31:24. | 
|  | dst_sel:WORD_0                           Use bits 15:0. | 
|  | dst_sel:WORD_1                           Use bits 31:16. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_dst_unused: | 
|  |  | 
|  | dst_unused | 
|  | ~~~~~~~~~~ | 
|  |  | 
|  | Controls what to do with the bits in the destination which are not selected | 
|  | by :ref:`dst_sel<amdgpu_synid_dst_sel>`. | 
|  | By default, unused bits are preserved. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | dst_unused:UNUSED_PAD                    Pad with zeros. | 
|  | dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits. | 
|  | dst_unused:UNUSED_PRESERVE               Preserve bits. | 
|  | ======================================== ================================================ | 
|  |  | 
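How *dst_sel* and *dst_unused* interact can be sketched in Python. The sketch assumes the instruction result arrives in the low bits of ``result`` and is placed into the selected field; that placement, and the helper name ``write_dst``, are assumptions made for illustration.

```python
# (shift, width) of each selectable field, taken from the dst_sel table.
SEL = {
    "DWORD": (0, 32),
    "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
    "WORD_0": (0, 16), "WORD_1": (16, 16),
}


def write_dst(old, result, dst_sel="DWORD", dst_unused="UNUSED_PRESERVE"):
    """Model a 32-bit destination write with dst_sel and dst_unused."""
    shift, width = SEL[dst_sel]
    field_mask = ((1 << width) - 1) << shift
    field = (result << shift) & field_mask
    if dst_unused == "UNUSED_PRESERVE":
        rest = old & ~field_mask                    # keep the other bits
    elif dst_unused == "UNUSED_PAD":
        rest = 0                                    # zero the other bits
    elif dst_unused == "UNUSED_SEXT":
        sign = (field >> (shift + width - 1)) & 1
        upper = 0xFFFFFFFF & ~((1 << (shift + width)) - 1)
        rest = upper if sign else 0                 # lower bits stay zero
    else:
        raise ValueError("unknown dst_unused value: " + dst_unused)
    return (field | rest) & 0xFFFFFFFF


print(hex(write_dst(0xAABBCCDD, 0x12, "BYTE_1", "UNUSED_PRESERVE")))
```
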
|  | .. _amdgpu_synid_src0_sel: | 
|  |  | 
|  | src0_sel | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Controls which bits of src0 are used. By default, all bits are used. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | src0_sel:DWORD                           Use bits 31:0. | 
|  | src0_sel:BYTE_0                          Use bits 7:0. | 
|  | src0_sel:BYTE_1                          Use bits 15:8. | 
|  | src0_sel:BYTE_2                          Use bits 23:16. | 
|  | src0_sel:BYTE_3                          Use bits 31:24. | 
|  | src0_sel:WORD_0                          Use bits 15:0. | 
|  | src0_sel:WORD_1                          Use bits 31:16. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_src1_sel: | 
|  |  | 
|  | src1_sel | 
|  | ~~~~~~~~ | 
|  |  | 
|  | Controls which bits of src1 are used. By default, all bits are used. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | src1_sel:DWORD                           Use bits 31:0. | 
|  | src1_sel:BYTE_0                          Use bits 7:0. | 
|  | src1_sel:BYTE_1                          Use bits 15:8. | 
|  | src1_sel:BYTE_2                          Use bits 23:16. | 
|  | src1_sel:BYTE_3                          Use bits 31:24. | 
|  | src1_sel:WORD_0                          Use bits 15:0. | 
|  | src1_sel:WORD_1                          Use bits 31:16. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_sdwa_operand_modifiers: | 
|  |  | 
|  | SDWA Operand Modifiers | 
|  | ---------------------- | 
|  |  | 
|  | Operand modifiers are not used separately. They are applied to source operands. | 
|  |  | 
|  | GFX8, GFX9 and GFX10 only. | 
|  |  | 
|  | abs | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_abs>`. | 
|  |  | 
|  | neg | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_neg>`. | 
|  |  | 
|  | .. _amdgpu_synid_sext: | 
|  |  | 
|  | sext | 
|  | ~~~~ | 
|  |  | 
|  | Sign-extends the value of a sub-dword operand to fill all 32 bits. | 
|  | Has no effect for 32-bit operands. | 
|  |  | 
|  | Valid for integer operands only. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | sext(<operand>)                          Sign-extend operand value. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | sext(v4) | 
|  | sext(v255) | 
|  |  | 
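The source selectors (*src0_sel*, *src1_sel*) and the *sext* operand modifier can be modelled together in Python. The sketch assumes that a selected field is zero-extended unless *sext* is applied; that default, and the helper name ``read_src``, are assumptions of this illustration.

```python
# (shift, width) of each selectable field, taken from the src*_sel tables.
SEL = {
    "DWORD": (0, 32),
    "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
    "WORD_0": (0, 16), "WORD_1": (16, 16),
}


def read_src(value, sel="DWORD", sext=False):
    """Extract the selected bits of a 32-bit source, optionally sign-extended."""
    shift, width = SEL[sel]
    field = (value >> shift) & ((1 << width) - 1)
    if sext and field & (1 << (width - 1)):
        # Fill the bits above the field with ones; no-op when width is 32.
        field |= 0xFFFFFFFF & ~((1 << width) - 1)
    return field


print(hex(read_src(0x1234ABCD, "WORD_0")))
print(hex(read_src(0x1234ABCD, "WORD_0", sext=True)))
```
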
|  | VOP3 Modifiers | 
|  | -------------- | 
|  |  | 
|  | .. _amdgpu_synid_vop3_op_sel: | 
|  |  | 
|  | op_sel | 
|  | ~~~~~~ | 
|  |  | 
|  | Selects the low [15:0] or high [31:16] operand bits for source and destination operands. | 
|  | By default, low bits are used for all operands. | 
|  |  | 
|  | The number of values specified with the op_sel modifier must match the number of instruction | 
|  | operands (both source and destination). The first value controls src0, the second value controls src1, | 
|  | and so on, except that the last value controls the destination. | 
|  | The value 0 selects the low bits, while 1 selects the high bits. | 
|  |  | 
|  | Note: the op_sel modifier affects 16-bit operands only. For 32-bit operands, the value specified | 
|  | by op_sel must be 0. | 
|  |  | 
|  | GFX9 and GFX10 only. | 
|  |  | 
|  | ======================================== ============================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ============================================================ | 
|  | op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand. | 
|  | op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands. | 
|  | op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands. | 
|  | ======================================== ============================================================ | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | op_sel:[0,0] | 
|  | op_sel:[0,1] | 
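
As a rough illustration of the rules above, the following Python sketch models how a VOP3 *op_sel* list splits into source and destination selectors, and how a selector picks 16 bits out of a 32-bit value. The function names are hypothetical and exist only for this example; this is a model of the description, not the assembler or hardware implementation.

```python
def select_half(value32, sel):
    """Return bits [15:0] when sel is 0, or bits [31:16] when sel is 1."""
    if sel not in (0, 1):
        raise ValueError("op_sel values must be 0 or 1")
    return (value32 >> (16 * sel)) & 0xFFFF


def split_vop3_op_sel(op_sel, num_sources):
    """Split a VOP3 op_sel list into (source selectors, destination selector).

    The list holds one value per source operand followed by one value
    for the destination, as described in the text above.
    """
    if len(op_sel) != num_sources + 1:
        raise ValueError("expected one value per source plus one for the destination")
    return list(op_sel[:-1]), op_sel[-1]


# op_sel:[1,0,1] on a two-source instruction:
# src0 uses its high bits, src1 its low bits, the destination its high bits.
src_sels, dst_sel = split_vop3_op_sel([1, 0, 1], num_sources=2)
src0 = select_half(0xAAAA5555, src_sels[0])  # 0xAAAA
src1 = select_half(0x12345678, src_sels[1])  # 0x5678
```

The sketch deliberately leaves out how the unselected half of the destination is treated, since the text above does not specify it.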
|  |  | 
|  | .. _amdgpu_synid_clamp: | 
|  |  | 
|  | clamp | 
|  | ~~~~~ | 
|  |  | 
|  | The meaning of the clamp modifier depends on the instruction. | 
|  |  | 
|  | For *v_cmp* instructions, the clamp modifier indicates that the compare signals | 
|  | when a floating-point exception occurs. By default, signaling is disabled. | 
|  | Not supported by GFX7. | 
|  |  | 
|  | For integer operations, clamp modifier indicates that the result must be clamped | 
|  | to the largest and smallest representable value. By default, there is no clamping. | 
|  | Integer clamping is not supported by GFX7. | 
|  |  | 
|  | For floating point operations, clamp modifier indicates that the result must be clamped | 
|  | to the range [0.0, 1.0]. By default, there is no clamping. | 
|  |  | 
|  | Note: clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any). | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | clamp                                    Enables clamping (or signaling). | 
|  | ======================================== ================================================ | 
|  |  | 
|  | .. _amdgpu_synid_omod: | 
|  |  | 
|  | omod | 
|  | ~~~~ | 
|  |  | 
|  | Specifies whether an output modifier must be applied to the result. | 
|  | By default, no output modifiers are applied. | 
|  |  | 
|  | Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any). | 
|  |  | 
|  | Output modifiers are valid for f32 and f64 floating point results only. | 
|  | They must not be used with f16. | 
|  |  | 
|  | Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result | 
|  | but accepts output modifiers. | 
|  |  | 
|  | ======================================== ================================================ | 
|  | Syntax                                   Description | 
|  | ======================================== ================================================ | 
|  | mul:2                                    Multiply the result by 2. | 
|  | mul:4                                    Multiply the result by 4. | 
|  | div:2                                    Multiply the result by 0.5. | 
|  | ======================================== ================================================ | 
|  |  | 
|  | Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | mul:2 | 
|  | mul:x      // x must be equal to 2 or 4 | 
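
The ordering of output modifiers and clamping matters for the final value. The Python sketch below models it for a floating-point result, assuming the [0.0, 1.0] clamp range given for floating-point operations. The function name and its parameters are hypothetical, and special values such as NaN are not modelled.

```python
def apply_output_modifiers(result, omod=1.0, clamp=False):
    """Scale a floating-point result by omod, then optionally clamp it.

    omod is 2.0 for mul:2, 4.0 for mul:4, 0.5 for div:2 and 1.0 when
    no output modifier is given.
    """
    if omod not in (0.5, 1.0, 2.0, 4.0):
        raise ValueError("unsupported output modifier")
    scaled = result * omod              # the output modifier is applied first
    if clamp:
        scaled = min(max(scaled, 0.0), 1.0)   # clamping is applied last
    return scaled


# mul:4 with clamp: 0.4 * 4 = 1.6, which is then clamped to 1.0.
value = apply_output_modifiers(0.4, omod=4.0, clamp=True)
```

Clamping first and multiplying afterwards would give 1.6 for the same input, which is why the order is spelled out.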
|  |  | 
|  | .. _amdgpu_synid_vop3_operand_modifiers: | 
|  |  | 
|  | VOP3 Operand Modifiers | 
|  | ---------------------- | 
|  |  | 
|  | Operand modifiers are not used separately. They are applied to source operands. | 
|  |  | 
|  | .. _amdgpu_synid_abs: | 
|  |  | 
|  | abs | 
|  | ~~~ | 
|  |  | 
|  | Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>` | 
|  | (if any). Valid for floating point operands only. | 
|  |  | 
|  | ======================================== ==================================================== | 
|  | Syntax                                   Description | 
|  | ======================================== ==================================================== | 
|  | abs(<operand>)                           Get the absolute value of a floating-point operand. | 
|  | \|<operand>|                             The same as above (an SP3 syntax). | 
|  | ======================================== ==================================================== | 
|  |  | 
|  | Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|' | 
|  | may be misinterpreted. Such operands should be enclosed in additional parentheses as shown | 
|  | in the examples below. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | abs(v36) | 
|  | \|v36| | 
|  | abs(x|y)     // ok | 
|  | \|(x|y)|      // additional parentheses are required | 
|  |  | 
|  | .. _amdgpu_synid_neg: | 
|  |  | 
|  | neg | 
|  | ~~~ | 
|  |  | 
|  | Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>` | 
|  | (if any). Valid for floating point operands only. | 
|  |  | 
|  | ================== ==================================================== | 
|  | Syntax             Description | 
|  | ================== ==================================================== | 
|  | neg(<operand>)     Get the negative value of a floating-point operand. | 
|  | The operand may include an optional | 
|  | :ref:`abs<amdgpu_synid_abs>` modifier. | 
|  | -<operand>         The same as above (an SP3 syntax). | 
|  | ================== ==================================================== | 
|  |  | 
|  | Note: SP3 syntax is supported with limitations because of a potential ambiguity. | 
|  | Currently it is allowed in the following cases: | 
|  |  | 
|  | * Before a register. | 
|  | * Before an :ref:`abs<amdgpu_synid_abs>` modifier. | 
|  | * Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier. | 
|  |  | 
|  | In all other cases "-" is handled as a part of an expression that follows the sign. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | // Operands with negate modifiers | 
|  | neg(v[0]) | 
|  | neg(1.0) | 
|  | neg(abs(v0)) | 
|  | -v5 | 
|  | -abs(v5) | 
|  | -\|v5| | 
|  |  | 
|  | // Operands without negate modifiers | 
|  | -1 | 
|  | -x+y | 
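
The order in which *abs* and *neg* are applied can be modelled with a short Python sketch. The function and parameter names are hypothetical; the sketch works on Python floats and only illustrates that the absolute value is taken before the negation.

```python
def apply_source_modifiers(value, use_abs=False, use_neg=False):
    """Apply the abs and neg operand modifiers to a floating-point source."""
    if use_abs:
        value = abs(value)   # abs is applied first
    if use_neg:
        value = -value       # neg is applied to the result of abs
    return value


# neg(abs(v0)) yields a non-positive value whatever the sign of v0.
a = apply_source_modifiers(-3.5, use_abs=True, use_neg=True)
b = apply_source_modifiers(3.5, use_abs=True, use_neg=True)
```

Both calls return -3.5; applying the modifiers in the opposite order would return 3.5 instead.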
|  |  | 
|  | VOP3P Modifiers | 
|  | --------------- | 
|  |  | 
|  | This section describes modifiers of *regular* VOP3P instructions. | 
|  |  | 
|  | *v_mad_mix\** and *v_fma_mix\** | 
|  | instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`. | 
|  |  | 
|  | GFX9 and GFX10 only. | 
|  |  | 
|  | .. _amdgpu_synid_op_sel: | 
|  |  | 
|  | op_sel | 
|  | ~~~~~~ | 
|  |  | 
|  | Selects the low [15:0] or high [31:16] operand bits as input to the operation | 
|  | which results in the lower-half of the destination. | 
|  | By default, low bits are used for all operands. | 
|  |  | 
|  | The number of values specified by the *op_sel* modifier must match the number of source | 
|  | operands. First value controls src0, second value controls src1 and so on. | 
|  |  | 
|  | The value 0 selects the low bits, while 1 selects the high bits. | 
|  |  | 
|  | ================================= ============================================================= | 
|  | Syntax                            Description | 
|  | ================================= ============================================================= | 
|  | op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand. | 
|  | op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands. | 
|  | op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands. | 
|  | ================================= ============================================================= | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | op_sel:[0,0] | 
|  | op_sel:[0,1,0] | 
|  |  | 
|  | .. _amdgpu_synid_op_sel_hi: | 
|  |  | 
|  | op_sel_hi | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | Selects the low [15:0] or high [31:16] operand bits as input to the operation | 
|  | which results in the upper-half of the destination. | 
|  | By default, high bits are used for all operands. | 
|  |  | 
|  | The number of values specified by the *op_sel_hi* modifier must match the number of source | 
|  | operands. First value controls src0, second value controls src1 and so on. | 
|  |  | 
|  | The value 0 selects the low bits, while 1 selects the high bits. | 
|  |  | 
|  | =================================== ============================================================= | 
|  | Syntax                              Description | 
|  | =================================== ============================================================= | 
|  | op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand. | 
|  | op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands. | 
|  | op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands. | 
|  | =================================== ============================================================= | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | op_sel_hi:[0,0] | 
|  | op_sel_hi:[0,0,1] | 
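
The interplay of *op_sel* and *op_sel_hi* in a regular VOP3P instruction can be sketched in Python. The example assumes a two-source packed operation on 16-bit unsigned integers with wrap-around addition; that operation is chosen only for illustration, and the function names are hypothetical.

```python
def half(value32, sel):
    """Return bits [15:0] when sel is 0, or bits [31:16] when sel is 1."""
    return (value32 >> (16 * sel)) & 0xFFFF


def packed_binary_op(op, src0, src1, op_sel=(0, 0), op_sel_hi=(1, 1)):
    """Model a two-source packed operation on 32-bit registers.

    op_sel picks the inputs for the lower half of the destination,
    op_sel_hi picks the inputs for the upper half. The defaults match
    the defaults described in the text (low bits for op_sel, high bits
    for op_sel_hi).
    """
    lo = op(half(src0, op_sel[0]), half(src1, op_sel[1])) & 0xFFFF
    hi = op(half(src0, op_sel_hi[0]), half(src1, op_sel_hi[1])) & 0xFFFF
    return (hi << 16) | lo


def add_u16(x, y):
    """Illustrative 16-bit addition; the caller masks the result."""
    return x + y


default = packed_binary_op(add_u16, 0x00030001, 0x00400020)
swapped = packed_binary_op(add_u16, 0x00030001, 0x00400020,
                           op_sel=(1, 0), op_sel_hi=(0, 1))
```

With the defaults, the result is 0x00430021. With op_sel:[1,0] op_sel_hi:[0,1], the two halves of src0 trade places and the result is 0x00410023.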
|  |  | 
|  | .. _amdgpu_synid_neg_lo: | 
|  |  | 
|  | neg_lo | 
|  | ~~~~~~ | 
|  |  | 
|  | Specifies whether to change the sign of operand values selected by | 
|  | :ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used | 
|  | as input to the operation which results in the lower-half of the destination. | 
|  |  | 
|  | The number of values specified by this modifier must match the number of source | 
|  | operands. First value controls src0, second value controls src1 and so on. | 
|  |  | 
|  | The value 0 indicates that the corresponding operand value is used unmodified, | 
|  | while the value 1 indicates that the negated operand value is used. | 
|  |  | 
|  | By default, operand values are used unmodified. | 
|  |  | 
|  | This modifier is valid for floating point operands only. | 
|  |  | 
|  | ================================ ================================================================== | 
|  | Syntax                           Description | 
|  | ================================ ================================================================== | 
|  | neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand. | 
|  | neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands. | 
|  | neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands. | 
|  | ================================ ================================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | neg_lo:[0] | 
|  | neg_lo:[0,1] | 
|  |  | 
|  | .. _amdgpu_synid_neg_hi: | 
|  |  | 
|  | neg_hi | 
|  | ~~~~~~ | 
|  |  | 
|  | Specifies whether to change the sign of operand values selected by | 
|  | :ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used | 
|  | as input to the operation which results in the upper-half of the destination. | 
|  |  | 
|  | The number of values specified by this modifier must match the number of source | 
|  | operands. First value controls src0, second value controls src1 and so on. | 
|  |  | 
|  | The value 0 indicates that the corresponding operand value is used unmodified, | 
|  | while the value 1 indicates that the negated operand value is used. | 
|  |  | 
|  | By default, operand values are used unmodified. | 
|  |  | 
|  | This modifier is valid for floating point operands only. | 
|  |  | 
|  | =============================== ================================================================== | 
|  | Syntax                          Description | 
|  | =============================== ================================================================== | 
|  | neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand. | 
|  | neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands. | 
|  | neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands. | 
|  | =============================== ================================================================== | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | neg_hi:[1,0] | 
|  | neg_hi:[0,1,1] | 
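
A Python sketch of how *neg_lo* and *neg_hi* combine with *op_sel* and *op_sel_hi* for a single packed f16 source operand is shown below. It assumes IEEE half-precision lanes and models negation as flipping the sign bit of the selected 16-bit value; the helper names are hypothetical.

```python
import struct


def f16_bits(x):
    """Encode a Python float as IEEE half-precision bits."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bits_f16(bits):
    """Decode IEEE half-precision bits into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def pack2(lo, hi):
    """Pack two half-precision values into one 32-bit register image."""
    return (f16_bits(hi) << 16) | f16_bits(lo)


def lane_input(src, sel, neg):
    """Select 16 bits of src, then optionally flip the sign bit."""
    bits = (src >> (16 * sel)) & 0xFFFF
    if neg:
        bits ^= 0x8000
    return bits_f16(bits)


def packed_inputs(src, op_sel=0, op_sel_hi=1, neg_lo=0, neg_hi=0):
    """Return the (op_sel-selected, op_sel_hi-selected) inputs for one source.

    neg_lo acts on the value picked by op_sel, neg_hi on the value
    picked by op_sel_hi.
    """
    return (lane_input(src, op_sel, neg_lo),
            lane_input(src, op_sel_hi, neg_hi))


src = pack2(1.5, -2.0)
only_lo = packed_inputs(src, neg_lo=1)            # (-1.5, -2.0)
only_hi = packed_inputs(src, neg_hi=1)            # (1.5, 2.0)
both_hi = packed_inputs(src, op_sel=1, neg_lo=1)  # (2.0, -2.0)
```

The last call shows that negation follows the selection: with op_sel set to 1, *neg_lo* negates the value taken from the high bits of the source.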
|  |  | 
|  | clamp | 
|  | ~~~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_clamp>`. | 
|  |  | 
|  | .. _amdgpu_synid_mad_mix: | 
|  |  | 
|  | VOP3P MAD_MIX/FMA_MIX Modifiers | 
|  | ------------------------------- | 
|  |  | 
|  | *v_mad_mix\** and *v_fma_mix\** | 
|  | instructions use *op_sel* and *op_sel_hi* modifiers | 
|  | in a manner different from *regular* VOP3P instructions. | 
|  |  | 
|  | See a description below. | 
|  |  | 
|  | GFX9 and GFX10 only. | 
|  |  | 
|  | .. _amdgpu_synid_mad_mix_op_sel: | 
|  |  | 
|  | m_op_sel | 
|  | ~~~~~~~~ | 
|  |  | 
|  | This modifier has meaning only for 16-bit source operands, as indicated by | 
|  | :ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`. | 
|  | It selects either the low [15:0] or high [31:16] operand bits | 
|  | as input to the operation. | 
|  |  | 
|  | The number of values specified by the *op_sel* modifier must match the number of source | 
|  | operands. First value controls src0, second value controls src1 and so on. | 
|  |  | 
|  | The value 0 selects the low 16 bits, while 1 selects the high 16 bits. | 
|  |  | 
|  | By default, low bits are used for all operands. | 
|  |  | 
|  | =============================== ================================================ | 
|  | Syntax                          Description | 
|  | =============================== ================================================ | 
|  | op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand. | 
|  | =============================== ================================================ | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | op_sel:[0,0,1] | 
|  |  | 
|  | .. _amdgpu_synid_mad_mix_op_sel_hi: | 
|  |  | 
|  | m_op_sel_hi | 
|  | ~~~~~~~~~~~ | 
|  |  | 
|  | Selects the size of source operands: either 32 bits or 16 bits. | 
|  | By default, 32 bits are used for all source operands. | 
|  |  | 
|  | The number of values specified by the *op_sel_hi* modifier must match the number of source | 
|  | operands. First value controls src0, second value controls src1 and so on. | 
|  |  | 
|  | The value 0 indicates 32 bits, while 1 indicates 16 bits. | 
|  |  | 
|  | The location of 16 bits in the operand may be specified by | 
|  | :ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`. | 
|  |  | 
|  | ======================================== ==================================== | 
|  | Syntax                                   Description | 
|  | ======================================== ==================================== | 
|  | op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand. | 
|  | ======================================== ==================================== | 
|  |  | 
|  | Note: numeric values may be specified as either | 
|  | :ref:`integer numbers<amdgpu_synid_integer_number>` or | 
|  | :ref:`absolute expressions<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | Examples: | 
|  |  | 
|  | .. parsed-literal:: | 
|  |  | 
|  | op_sel_hi:[1,1,1] | 
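
The way *op_sel_hi* and *op_sel* work together for *v_mad_mix\** and *v_fma_mix\** source operands can be sketched in Python. The sketch assumes that 32-bit operands hold f32 values and 16-bit operands hold f16 values; the function name is hypothetical.

```python
import struct


def mix_source(value32, op_sel, op_sel_hi):
    """Decode one source operand of a mad_mix/fma_mix style instruction.

    op_sel_hi == 0: the operand is 32 bits wide and op_sel has no effect.
    op_sel_hi == 1: the operand is 16 bits wide; op_sel picks the low
    (0) or high (1) half of the register.
    """
    if op_sel_hi == 0:
        # Reinterpret all 32 bits as a single-precision float (assumption).
        return struct.unpack("<f", struct.pack("<I", value32))[0]
    bits = (value32 >> (16 * op_sel)) & 0xFFFF
    # Reinterpret the selected 16 bits as a half-precision float (assumption).
    return struct.unpack("<e", struct.pack("<H", bits))[0]


full = mix_source(0x3F800000, op_sel=0, op_sel_hi=0)  # f32 1.0
low = mix_source(0x40003C00, op_sel=0, op_sel_hi=1)   # f16 1.0 from bits [15:0]
high = mix_source(0x40003C00, op_sel=1, op_sel_hi=1)  # f16 2.0 from bits [31:16]
```

Each source operand carries its own pair of selector values, so a single instruction can mix 32-bit and 16-bit inputs.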
|  |  | 
|  | abs | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_abs>`. | 
|  |  | 
|  | neg | 
|  | ~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_neg>`. | 
|  |  | 
|  | clamp | 
|  | ~~~~~ | 
|  |  | 
|  | See a description :ref:`here<amdgpu_synid_clamp>`. | 
|  |  | 
|  | VOP3P MFMA Modifiers | 
|  | -------------------- | 
|  |  | 
|  | These modifiers may only be used with GFX908 and GFX90A. | 
|  |  | 
|  | .. _amdgpu_synid_cbsz: | 
|  |  | 
|  | cbsz | 
|  | ~~~~ | 
|  |  | 
|  | Specifies a broadcast mode. | 
|  |  | 
|  | =============================== ================================================================== | 
|  | Syntax                          Description | 
|  | =============================== ================================================================== | 
|  | cbsz:{0..7}                     A broadcast mode. | 
|  | =============================== ================================================================== | 
|  |  | 
|  | Note: numeric value may be specified as either | 
|  | an :ref:`integer number<amdgpu_synid_integer_number>` or | 
|  | an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | .. _amdgpu_synid_abid: | 
|  |  | 
|  | abid | 
|  | ~~~~ | 
|  |  | 
|  | Specifies matrix A group select. | 
|  |  | 
|  | =============================== ================================================================== | 
|  | Syntax                          Description | 
|  | =============================== ================================================================== | 
|  | abid:{0..15}                    Matrix A group select id. | 
|  | =============================== ================================================================== | 
|  |  | 
|  | Note: numeric value may be specified as either | 
|  | an :ref:`integer number<amdgpu_synid_integer_number>` or | 
|  | an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  | 
|  | .. _amdgpu_synid_blgp: | 
|  |  | 
|  | blgp | 
|  | ~~~~ | 
|  |  | 
|  | Specifies matrix B lane group pattern. | 
|  |  | 
|  | =============================== ================================================================== | 
|  | Syntax                          Description | 
|  | =============================== ================================================================== | 
|  | blgp:{0..7}                     Matrix B lane group pattern. | 
|  | =============================== ================================================================== | 
|  |  | 
|  | Note: numeric value may be specified as either | 
|  | an :ref:`integer number<amdgpu_synid_integer_number>` or | 
|  | an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. | 
|  |  |