Posted on November 5, 2024 (author: 雷凝旋)
Command Reference - Instructions
3DSTATE_POLY_STIPPLE_OFFSET
Source: RenderCS
Length Bias: 2
The 3DSTATE_POLY_STIPPLE_OFFSET command is used to specify the origin of the repeated screen-space
Polygon Stipple Pattern as an X,Y offset from the Color Buffer origin.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 06h 3DSTATE_POLY_STIPPLE_OFFSET
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 Dword Length
Default Value: 0h Excludes Dword (0,1)
Format: =n Total Length - 2
1 31:13 Reserved
Format: MBZ
12:8 Polygon Stipple X Offset
Format: U5
Specifies a 5 bit x address offset in the poly stipple pattern
Value
[0,31]
7:5 Reserved
Format: MBZ
4:0 Polygon Stipple Y Offset
Format: U5
Specifies a 5 bit y address offset in the poly stipple pattern
Value
[0,31]
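As an illustration of the bit layout above, the following is a minimal sketch that packs the two DWords of 3DSTATE_POLY_STIPPLE_OFFSET. The function names are hypothetical and not part of any driver API; the field positions and default values are taken from the table, and the sketch only builds the integers, it does not submit anything to hardware.

```python
def gfxpipe_3d_header(opcode: int, sub_opcode: int, dword_length: int) -> int:
    """Pack DWord 0: Command Type 3h, SubType 3h, then opcode, sub opcode, length."""
    if not (0 <= opcode <= 0x7 and 0 <= sub_opcode <= 0xFF and 0 <= dword_length <= 0xFF):
        raise ValueError("header field out of range")
    return (0x3 << 29) | (0x3 << 27) | (opcode << 24) | (sub_opcode << 16) | dword_length


def poly_stipple_offset(x_offset: int, y_offset: int) -> list[int]:
    """Return [DWord 0, DWord 1] for 3DSTATE_POLY_STIPPLE_OFFSET (hypothetical helper)."""
    if not (0 <= x_offset <= 31 and 0 <= y_offset <= 31):
        raise ValueError("X and Y offsets are U5 values in [0,31]")
    dw0 = gfxpipe_3d_header(opcode=0x1, sub_opcode=0x06, dword_length=0)
    # X offset sits in bits 12:8, Y offset in bits 4:0; reserved bits stay zero (MBZ).
    dw1 = (x_offset << 8) | y_offset
    return [dw0, dw1]


if __name__ == "__main__":
    for dword in poly_stipple_offset(5, 17):
        print(f"{dword:#010x}")
```

The DWord Length of 0 follows from the Length Bias of 2: the command is two DWords long in total.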
100 Doc Ref # IHD-OS-VLV-Vol2 pt2-04.14
3DSTATE_POLY_STIPPLE_PATTERN
Source: RenderCS
Length Bias: 2
The 3DSTATE_POLY_STIPPLE_PATTERN command is used to specify the 32x32 Polygon Stipple Pattern used in
the Polygon Stipple function of the WM unit.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 07h 3DSTATE_POLY_STIPPLE_PATTERN
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 Dword Length
Default Value: 1Fh Excludes Dword (0,1)
Format: =n Total Length - 2
1 31:0 Polygon Stipple Pattern Row 1 (top most)
Format: 32 bit mask Bit 31 = upper left corner, Bit 0 = upper right corner of first row.
Specifies a pattern used by Polygon Stipple to mask out specific pixels of every 32x32 area
rendered.
2..32 31:0 Polygon Stipple Pattern Rows 2-32 (bottom most)
Format: 32 bit mask Bit 31 = upper left corner, Bit 0 = upper right corner of first row.
Specifies a pattern used by Polygon Stipple to mask out specific pixels of every 32x32 area
rendered.
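The layout above can be sketched as follows: one header DWord followed by 32 row masks, for a total of 33 DWords (1Fh + 2). This is a minimal sketch with hypothetical helper names. It assumes that bit 31 is the leftmost pixel of every row (the text states this explicitly only for the first row), and it ignores the stipple offset command.

```python
def poly_stipple_pattern(rows: list[int]) -> list[int]:
    """Return the 33 DWords of 3DSTATE_POLY_STIPPLE_PATTERN (hypothetical helper)."""
    if len(rows) != 32:
        raise ValueError("the pattern has exactly 32 rows")
    if any(not 0 <= row <= 0xFFFFFFFF for row in rows):
        raise ValueError("each row is a 32 bit mask")
    # Header: type 3h, subtype 3h, opcode 1h, sub opcode 07h, DWord Length 1Fh.
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x07 << 16) | 0x1F
    return [dw0] + list(rows)


def pattern_bit(rows: list[int], x: int, y: int) -> int:
    """Return the pattern bit covering screen pixel (x, y), tiling every 32x32 area."""
    row = rows[y % 32]
    # Assumption: bit 31 is the left edge and bit 0 the right edge of each row.
    return (row >> (31 - (x % 32))) & 1


def checkerboard_rows() -> list[int]:
    """Example pattern: alternate bits, shifted by one pixel on odd rows."""
    return [0xAAAAAAAA if y % 2 == 0 else 0x55555555 for y in range(32)]


if __name__ == "__main__":
    command = poly_stipple_pattern(checkerboard_rows())
    print(len(command), f"{command[0]:#010x}")
```

The sketch deliberately returns the raw pattern bit rather than a "visible" flag, since the text above only says the pattern masks out specific pixels.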
3DSTATE_PS
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 20h 3DSTATE_PS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 06h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:6 Kernel Start Pointer[0]
Format: InstructionBaseOffset[31:6]Kernel
Specifies the 64-byte aligned address offset of the first instruction in the kernel[0]. This pointer is
relative to the Instruction Base Address.
5:0 Reserved
Format: MBZ
2 31 Single Program Flow (SPF)
Specifies the initial condition of the kernel program as either a single program flow (SIMDnxm
with m = 1) or as multiple program flows (SIMDnxm with m > 1). See CR0 description in ISA
Execution Environment.
Value Name Description
0h Multiple Multiple Program Flows
1h Single Single Program Flows
30 Vector Mask Enable (VME)
Format: U1 Enumerated Type
When SPF=0, VME specifies which mask to use to initialize the initial channel enables. When
SPF=1, VME specifies which mask to use to generate execution channel enables.
Value Name Description
0h Dmask Channels are enabled based on the dispatch mask
1h Vmask Channels are enabled based on the vector mask
29:27 Sampler Count
Format: U3
Specifies how many samplers (in multiples of 4) the pixel shader 0 kernel uses. Used only for
prefetching the associated sampler state entries.
Value Description
[0,4]
0h no samplers used
1h between 1 and 4 samplers used
2h between 5 and 8 samplers used
3h between 9 and 12 samplers used
4h between 13 and 16 samplers used
5h-7h Reserved
26 Denormal Mode
Specifies the denormal mode used by the dispatched thread.
Value Name Description
0h FTZ Denormals are flushed to zero
1h RET Denormals are retained
25:18 Binding Table Entry Count
Format: U8
Specifies how many binding table entries the kernel uses. Used only for prefetching of the
binding table entries and associated surface state. Note: For kernels using a large number of
binding table entries, it may be advantageous to set this field to zero to avoid prefetching too
many entries and thrashing the state cache.
This field is ignored if [PS Function Enable] is DISABLED.
Value
[0,255]
Programming Notes
When HW binding table bit is set, it is assumed that the Binding Table Entry Count field will be
generated at JIT time.
17 Reserved
Format: MBZ
16 Floating Point Mode
Specifies the floating point mode used by the dispatched thread.
Value Name Description
0h IEEE-754 Use IEEE-754 rules
1h Alt Use alternate rules
15:14 Rounding Mode
Specifies the rounding mode used by the dispatched thread.
Value Name Description
0h RTNE Round to Nearest Even
1h RU Round toward +infinity
2h RD Round toward -infinity
3h RTZ Round toward zero
13 Illegal Opcode Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA
Execution Environment.
12 Reserved
Format: MBZ
11 Mask Stack Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA
Execution Environment.
10:8 Reserved
Format: MBZ
7 Software Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[13] (note the bit # difference). See Exceptions and ISA
Execution Environment.
6:0 Reserved
Format: MBZ
3 31:10 Scratch Space Base Pointer
Format: GeneralStateOffset[31:10]ScratchSpace
Specifies the 1k-byte aligned address offset to scratch space for use by the kernel. This pointer is
relative to the General State Base Address.
9:4 Reserved
Format: MBZ
3:0 Per Thread Scratch Space
Format: U4
Specifies the amount of scratch space allowed to be used by each thread. The driver must
allocate enough contiguous scratch space, pointed to by the Scratch Space Pointer, to ensure
that the Maximum Number of Threads each get Per Thread Scratch Space size without exceeding
the driver-allocated scratch space.
Value Description
[0,11] indicating [1k bytes, 2M bytes] in powers of two
4 31:24 Maximum Number of Threads
Format: U8-1 representing thread count
Specifies the maximum number of simultaneous threads allowed to be active. Used to
avoid using up the scratch space, or to avoid potential deadlock.
Range:
• WIZ Hashing Disable in GT_MODE register enabled: Range = [7,171] --> [8,172]
threads. Only odd values are allowed (resulting in even max number of threads)
• WIZ Hashing Disable in GT_MODE register disabled: Range = [3,85] --> [4,86] threads.
Only odd values are allowed (resulting in even max number of threads)
Value Name Description
[3h,1fh] Range [4,32] threads
Programming Notes
If this field is changed between 3DPRIMITIVE commands, a PIPE_CONTROL command with Stall
at Pixel Scoreboard set is required to be issued. This field must have an odd value so that the
max number of PS threads is even.
23:12 Reserved
Format: MBZ
11 Push Constant Enable
Format: Enable
This field must be enabled if the sum of the PS Constant Buffer [3:0] Read Length fields in
3DSTATE_CONSTANT_PS is nonzero, and must be disabled if the sum is zero.
10 Attribute Enable
Format: Enable
This field must be enabled if the Number of SF Output Attributes field in 3DSTATE_SBE is
nonzero, and must be disabled if that field is zero.
9 oMask Present to RenderTarget
Format: Enable
This bit is inserted in the PS payload header and made available to the DataPort (either via the
message header or via header bypass) to indicate that oMask data (one or two phases) is
included in Render Target Write messages. If present, the oMask data is used to mask off
samples.
8 Render Target Fast Clear Enable
Format: Enable
This field is set to enable fast clear of the bound render targets. See "Render Target Fast Clear"
for restrictions on enabling this field.
7 Dual Source Blend Enable
Format: Enable
This field is set if dual source blend is enabled. If this bit is disabled, the data port dual source
message reverts to a single source message using source 0.
6 Render Target Resolve Enable
Format: Enable
This field is set to enable clear value resolve on non-multisampled render targets. See "Render
Target Resolve" for restrictions on enabling this field.
5 Reserved
Format: MBZ
4:3 Position XY Offset Select
Format: U2 Enumerated Type
This field specifies if/what Position XY Offset values are passed in the PS payload. Note that these
are per-slot (pixel|sample) offsets, and therefore separate from the subspan XY coordinates
passed in R1.
Value Name Description
0h POSOFFSET_NONE No Position XY Offsets are included in the PS payload.
1h Reserved
2h POSOFFSET_CENTROID Position XY Offsets will be passed in the PS payload,
and these will reflect the Centroid position(s).
3h POSOFFSET_SAMPLE Position XY Offsets will be passed in the PS payload,
and these will reflect the multisample position(s).
Programming Notes
SW Recommendation: If the PS kernel needs the Position Offsets to compute a Position XY
value, this field should match Position ZW Interpolation Mode to ensure a consistent
computation.
If the PS kernel does not need the Position XY Offsets to compute a Position Value, then this
field should be programmed to POSOFFSET_NONE, as the PS kernel should be using the
various barycentric inputs to evaluate other-than-position attributes.
MSDISPMODE_PERSAMPLE is required in order to select POSOFFSET_SAMPLE.
2 32 Pixel Dispatch Enable
Format: Enable
Enables the Windower to dispatch 8 subspans in one payload.
1 16 Pixel Dispatch Enable
Format: Enable
Enables the Windower to dispatch 4 subspans in one payload.
0 8 Pixel Dispatch Enable
Format: Enable
Enables the Windower to dispatch 2 subspans in one payload.
Notes (these apply to each of the three Pixel Dispatch Enable fields above):
Note: In the table below, the Valid column indicates which products that
combination is supported on. Combinations of dispatch enables not listed in the table
are not available on any product.
A: Valid on all products
B: Valid.
C: Not valid.
D: Valid on all products, except when in non-1x PERSAMPLE mode.
E: Valid on all products, except when in PERSAMPLE mode with number of
multisamples >= 8.
F: Valid on all products.
Each of the three KSP values is separately specified.
In addition, each kernel has a separately-specified GRF register count.
Variable Pixel Dispatch Section: Pixel Grouping (Dispatch size) control for valid pixel
dispatch combinations.
5 31:23 Reserved
Format: MBZ
22:16 Dispatch GRF Start Register for Constant/Setup Data [0]
Format: U7
Specifies the starting GRF register number for the Constant/Setup portion of the thread payload
for kernel[0].
Value
[0,127]
15 Reserved
Format: MBZ
14:8 Dispatch GRF Start Register for Constant/Setup Data [1]
Format: U7
Specifies the starting GRF register number for the Constant/Setup portion of the thread payload
for kernel[1].
Value
[0,127]
7 Reserved
Format: MBZ
6:0 Dispatch GRF Start Register for Constant/Setup Data [2]
Format: U7
Specifies the starting GRF register number for the Constant/Setup portion of the thread payload
for kernel[2].
Value
[0,127]
6 31:6 Kernel Start Pointer[1]
Format: InstructionBaseOffset[31:6]Kernel
Specifies the 64-byte aligned address offset of the first instruction in kernel[1]. This pointer is
relative to the Instruction Base Address.
5:0 Reserved
Format: MBZ
7 31:6 Kernel Start Pointer[2]
Format: InstructionBaseOffset[31:6]Kernel
Specifies the 64-byte aligned address offset of the first instruction in kernel[2]. This pointer is
relative to the Instruction Base Address.
5:0 Reserved
Format: MBZ
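Several 3DSTATE_PS fields use an encoded value rather than a raw count. The sketch below shows one way to derive the Sampler Count, Per Thread Scratch Space, Maximum Number of Threads, and Kernel Start Pointer field values from the encodings described above. All function names are hypothetical. The product-specific thread ranges tied to WIZ Hashing Disable are not enforced here; only the "odd field value" rule and the field widths are checked.

```python
def sampler_count_field(num_samplers: int) -> int:
    """Sampler Count (DWord 2, bits 29:27): samplers used, in multiples of 4."""
    if not 0 <= num_samplers <= 16:
        raise ValueError("encodings are defined only for 0..16 samplers")
    return (num_samplers + 3) // 4  # 0 -> 0h, 1..4 -> 1h, ..., 13..16 -> 4h


def per_thread_scratch_field(num_bytes: int) -> int:
    """Per Thread Scratch Space (DWord 3, bits 3:0): 0 = 1 KB up to 11 = 2 MB."""
    kib, remainder = divmod(num_bytes, 1024)
    if remainder or kib < 1 or kib > 2048 or kib & (kib - 1):
        raise ValueError("size must be a power of two between 1 KB and 2 MB")
    return kib.bit_length() - 1  # log2 of the size in KB


def max_threads_field(num_threads: int) -> int:
    """Maximum Number of Threads (DWord 4, bits 31:24): U8-1 encoding."""
    field = num_threads - 1
    if not 0 <= field <= 0xFF:
        raise ValueError("thread count does not fit the U8-1 field")
    if field % 2 == 0:
        raise ValueError("field must be odd so the thread count is even")
    return field


def kernel_start_pointer_dword(offset: int) -> int:
    """Kernel Start Pointer DWord: bits 31:6 hold the offset, bits 5:0 are MBZ."""
    if offset % 64 or not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must be 64-byte aligned and fit in 32 bits")
    return offset


def scratch_bytes_required(num_threads: int, per_thread_bytes: int) -> int:
    """Contiguous scratch space the driver would need for all threads."""
    return num_threads * per_thread_bytes
```

For example, 32 threads with 4 KB of scratch each would need 128 KB of contiguous scratch space under this reading of the Per Thread Scratch Space description.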
3DSTATE_PUSH_CONSTANT_ALLOC_DS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for DS Push Constant Buffer.
Programming Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_DS must be equal to or smaller than
the size of the allocated space in the URB including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_DS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_DS.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 14h 3DSTATE_PUSH_CONSTANT_ALLOC_DS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:20 Reserved
Format: MBZ
19:16 Constant Buffer Offset
Format: U4
Specifies the offset of the DS constant buffer into the URB.
Value Name
[0,15] (0KB - 15KB)
0h 0KB [Default]
15:5 Reserved
Format: MBZ
4:0 Constant Buffer Size
Format: U5
Specifies the size of the DS constant buffer. This value will determine the amount of data the
command stream can pre-fetch before the buffer is full. Value of zero is only valid when
constants are not enabled for DS.
Value Name
[0,15] (0KB - 15KB) Increments of 1KB
0h 0KB [Default]
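The DS allocation command can be sketched as two DWords, with the first programming restriction checked before packing. This is a minimal sketch with a hypothetical function name. It assumes 15 (KB) is "the maximum value of the Constant Buffer Size", taken from the [0,15] range in the table, and it treats both fields as counts of 1KB units.

```python
MAX_CONSTANT_BUFFER_KB = 15  # assumption: upper bound of the [0,15] value range


def push_constant_alloc_ds(offset_kb: int, size_kb: int) -> list[int]:
    """Return [DWord 0, DWord 1] for 3DSTATE_PUSH_CONSTANT_ALLOC_DS."""
    if not 0 <= offset_kb <= 15:
        raise ValueError("Constant Buffer Offset is a U4 value in [0,15]")
    if not 0 <= size_kb <= MAX_CONSTANT_BUFFER_KB:
        raise ValueError("Constant Buffer Size must be in [0,15]")
    if offset_kb + size_kb > MAX_CONSTANT_BUFFER_KB:
        raise ValueError("offset + size exceeds the maximum Constant Buffer Size")
    # Header: type 3h, subtype 3h, opcode 1h, sub opcode 14h, DWord Length 0.
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x14 << 16) | 0
    # Offset in bits 19:16, size in bits 4:0; reserved bits stay zero (MBZ).
    dw1 = (offset_kb << 16) | size_kb
    return [dw0, dw1]


if __name__ == "__main__":
    print([f"{dword:#010x}" for dword in push_constant_alloc_ds(4, 4)])
```

The second and third restrictions concern 3DSTATE_CONSTANT_DS and command ordering, so they fall outside what a single packing helper can check.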
3DSTATE_PUSH_CONSTANT_ALLOC_GS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for GS Push Constant Buffer.
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_GS must be equal to or smaller than
the size of the allocated space in the URB including the buffering for half cachelines.
• The 3DSTATE_CONSTANT_GS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_GS.
See Push Constant URB Allocation section for more details.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 15h 3DSTATE_PUSH_CONSTANT_ALLOC_GS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Format: =n Total Length - 2
Value Name Description
0h 3DSTATE_PUSH_CONSTANT_ALLOC_GS [Default] Excludes DWord (0,1)
1 31:20 Reserved
Format: MBZ
19:16 Constant Buffer Offset
Format: U4
Specifies the offset of the GS constant buffer into the URB.
Value Name
[0,15] (0KB - 15KB)
0h 0KB [Default]
15:5 Reserved
Format: MBZ
4:0 Constant Buffer Size
Format: U5
Specifies the size of the GS constant buffer. This value will determine the amount of data the
command stream can pre-fetch before the buffer is full. Value of zero is only valid when
constants are not enabled for GS.
Value Name
[0,15] (0KB - 15KB) Increments of 1KB
0h 0KB [Default]
3DSTATE_PUSH_CONSTANT_ALLOC_HS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for HS Push Constant Buffer.
Programming Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_HS must be equal to or smaller than
the size of the allocated space in the URB including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_HS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_HS.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 13h 3DSTATE_PUSH_CONSTANT_ALLOC_HS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:20 Reserved
Format: MBZ
19:16 Constant Buffer Offset
Format: U4
Specifies the offset of the HS constant buffer into the URB.
Value Name
[0,15] (0KB - 15KB)
0h 0KB [Default]
15:5 Reserved
Format: MBZ
4:0 Constant Buffer Size
Format: U5
Specifies the size of the HS constant buffer. This value will determine the amount of data the
command stream can pre-fetch before the buffer is full. Value of zero is only valid when
constants are not enabled for HS.
Value Name
[0,15] (0KB - 15KB) Increments of 1KB
0h 0KB [Default]
3DSTATE_PUSH_CONSTANT_ALLOC_PS
Source: RenderCS
Length Bias: 2
Description
This command sets up the URB configuration for PS Push Constant Buffer.
A PIPE_CONTROL command with the CS Stall bit set must be programmed in the ring after this
instruction.
Programming Notes
Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_PS must be equal to or smaller than
the size of the allocated space in the URB including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_PS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_PS.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 16h 3DSTATE_PUSH_CONSTANT_ALLOC_PS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 Dword Length
Default Value: 0h Excludes Dword (0,1)
Format: =n Total Length - 2
1 31:20 Reserved
Format: MBZ
19:16 Constant Buffer Offset
Format: U4
Specifies the offset of the PS constant buffer into the URB.
Value Name
[0,15] (0KB - 15KB)
0h 0KB [Default]
15:5 Reserved
Format: MBZ
4:0 Constant Buffer Size
Format: U5
Specifies the size of the PS constant buffer. This value will determine the amount of data the
command stream can pre-fetch before the buffer is full. Value of zero is only valid when
constants are not enabled for PS.
Value Name
[0,15] (0KB - 15KB) Increments of 1KB
0h 0KB [Default]
3DSTATE_PUSH_CONSTANT_ALLOC_VS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for VS Push Constant Buffer.
Programming Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_VS must be equal to or smaller than
the size of the allocated space in the URB including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_VS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_VS.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 12h 3DSTATE_PUSH_CONSTANT_ALLOC_VS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:20 Reserved
Format: MBZ
19:16 Constant Buffer Offset
Format: U4
Specifies the offset of the VS constant buffer into the URB.
Value Name
[0,15] (0KB - 15KB)
0h 0KB [Default]
15:5 Reserved
Format: MBZ
4:0 Constant Buffer Size
Format: U5
Specifies the size of the VS constant buffer. This value will determine the amount of data the
command stream can pre-fetch before the buffer is full. Value of zero is only valid when
constants are not enabled for VS.
Value Name
[0,15] (0KB - 15KB) Increments of 1KB
0h 0KB [Default]
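Since each of the five PUSH_CONSTANT_ALLOC commands takes its own offset and size, a driver has to choose a layout for all stages. The sketch below lays the regions out back to back and applies the offset-plus-size restriction to each one. This is only an illustration: the function name, the VS/HS/DS/GS/PS ordering, the back-to-back packing, and the use of 15 (KB) as the maximum Constant Buffer Size are all assumptions, not requirements stated in this document.

```python
STAGE_ORDER = ("VS", "HS", "DS", "GS", "PS")  # assumed ordering, not mandated above
MAX_CONSTANT_BUFFER_KB = 15  # assumption: upper bound of the [0,15] value range


def layout_push_constants(sizes_kb: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {stage: (offset_kb, size_kb)} with regions packed back to back."""
    unknown = set(sizes_kb) - set(STAGE_ORDER)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    layout = {}
    offset_kb = 0
    for stage in STAGE_ORDER:
        size_kb = sizes_kb.get(stage, 0)  # a size of zero means constants are unused
        if not 0 <= size_kb <= MAX_CONSTANT_BUFFER_KB:
            raise ValueError(f"{stage}: size must be in [0,15]")
        if offset_kb + size_kb > MAX_CONSTANT_BUFFER_KB:
            raise ValueError(f"{stage}: offset + size exceeds the maximum size")
        layout[stage] = (offset_kb, size_kb)
        offset_kb += size_kb
    return layout


if __name__ == "__main__":
    for stage, (offset_kb, size_kb) in layout_push_constants({"VS": 4, "PS": 8}).items():
        print(stage, offset_kb, size_kb)
```

Each resulting pair would feed the Constant Buffer Offset and Constant Buffer Size fields of the matching command, after which the corresponding 3DSTATE_CONSTANT_* command has to be reprogrammed as the restrictions require.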
3DSTATE_SAMPLE_MASK
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 18h 3DSTATE_SAMPLE_MASK
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 Dword Length
Default Value: 0h Excludes Dword (0,1)
Format: =n Total Length - 2
1 31:8 Reserved
Format: MBZ
7:0 Sample Mask
Format: 8 bit mask Right-justified bitmask (Bit 0 = Sample0). Number of bits that are used is
determined by Num Multisamples (3DSTATE_MULTISAMPLE)
A per-multisample-position mask state variable that is immediately and unconditionally ANDed
with the sample coverage mask as part of the rasterization process. This mask is applied prior to
centroid selection.
Programming Notes
• If Number of Multisamples is NUMSAMPLES_1, bits 7:1 of this field must be zero.
• If Number of Multisamples is NUMSAMPLES_4, bits 7:4 of this field must be zero.
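The two programming notes above amount to "bits at or above the sample count must be zero". A minimal sketch of that check and of the AND with the coverage mask follows. The function names are hypothetical, and accepting a sample count of 8 is an assumption based on the 8 bit width of the field; the notes themselves only name NUMSAMPLES_1 and NUMSAMPLES_4.

```python
def sample_mask_command(mask: int, num_samples: int) -> list[int]:
    """Return [DWord 0, DWord 1] for 3DSTATE_SAMPLE_MASK after validating the mask."""
    if num_samples not in (1, 4, 8):  # 8 is an assumption, see the lead-in
        raise ValueError("unsupported number of multisamples")
    if not 0 <= mask <= 0xFF:
        raise ValueError("Sample Mask is an 8 bit field")
    if mask >> num_samples:
        raise ValueError("mask bits at or above the sample count must be zero")
    # Header: type 3h, subtype 3h, opcode 0h, sub opcode 18h, DWord Length 0.
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x0 << 24) | (0x18 << 16) | 0
    return [dw0, mask]  # bits 31:8 of DWord 1 stay zero (MBZ)


def masked_coverage(coverage: int, sample_mask: int) -> int:
    """Model of the AND applied to the sample coverage mask during rasterization."""
    return coverage & sample_mask


if __name__ == "__main__":
    print([f"{dword:#010x}" for dword in sample_mask_command(0b0101, 4)])
    print(bin(masked_coverage(0b1111, 0b0101)))
```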
3DSTATE_SAMPLER_PALETTE_LOAD0
Source: RenderCS
Length Bias: 2
Description
The 3DSTATE_SAMPLER_PALETTE_LOAD0 instruction is used to load 32-bit values into the first
texture palette. The texture palette is used whenever a texture with a paletted format (containing
"Px [palette0]") is referenced by the sampler.
This instruction is used to load all or a subset of the 256 entries of the first palette. Partial loads
always start from the first (index 0) entry.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: Opcode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: Opcode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE
Format: Opcode
23:16 3D Command Sub Opcode
Default Value: 02h 3DSTATE_SAMPLER_PALETTE_LOAD0
Format: Opcode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h Excludes DWord (0,1)
Format: =n Total Length - 2
1..n 31:24 Palette Alpha[0:N-1]
Format: U8
Alpha channel loaded into the Nth entry of the texture color palette.
23:16 Palette Red[0:N-1]
Format: U8
Red channel loaded into the Nth entry of the texture color palette.
15:8 Palette Green[0:N-1]
Format: U8
Green channel loaded into the Nth entry of the texture color palette.
7:0 Palette Blue[0:N-1]
Format: U8
Blue channel loaded into the Nth entry of the texture color palette.
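A partial or full palette load can be sketched as a header DWord followed by one packed entry per palette index, starting at index 0. With a Length Bias of 2, loading N entries gives a DWord Length of N - 1. The helper names below are hypothetical; the channel positions assume the usual reading of the table, with alpha in bits 31:24, red in 23:16, green in 15:8, and blue in 7:0.

```python
def palette_entry(alpha: int, red: int, green: int, blue: int) -> int:
    """Pack one palette entry; every channel is a U8 value."""
    for channel in (alpha, red, green, blue):
        if not 0 <= channel <= 0xFF:
            raise ValueError("palette channels are U8 values")
    return (alpha << 24) | (red << 16) | (green << 8) | blue


def sampler_palette_load0(entries: list[tuple[int, int, int, int]]) -> list[int]:
    """Return the DWords of 3DSTATE_SAMPLER_PALETTE_LOAD0 for entries 0..N-1."""
    count = len(entries)
    if not 1 <= count <= 256:
        raise ValueError("between 1 and 256 palette entries can be loaded")
    # Total length is 1 header + N entries, so DWord Length = (1 + N) - 2 = N - 1.
    dword_length = count - 1
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x02 << 16) | dword_length
    return [dw0] + [palette_entry(*entry) for entry in entries]


if __name__ == "__main__":
    command = sampler_palette_load0([(255, 255, 0, 0), (128, 0, 255, 0)])
    print([f"{dword:#010x}" for dword in command])
```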
3DSTATE_SAMPLER_PALETTE_LOAD1
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_PALETTE_LOAD1 instruction is used to load 32-bit values into the second texture
palette. The second texture palette is used whenever a texture with a paletted format (containing
"Px...[palette1]") is referenced by the sampler. This instruction is used to load all or a subset of the 256 entries of
the second palette. Partial loads always start from the first (index 0) entry.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 0Ch 3DSTATE_SAMPLER_PALETTE_LOAD1
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h Excludes DWord (0,1)
Format: =n Total Length - 2
1..n 31:24 Palette Alpha[0:N-1]
Format: U8
Alpha channel loaded into the Nth entry of the texture color palette.
23:16 Palette Red[0:N-1]
Format: U8
Red channel loaded into the Nth entry of the texture color palette.
15:8 Palette Green[0:N-1]
Format: U8
Green channel loaded into the Nth entry of the texture color palette.
7:0 Palette Blue[0:N-1]
Format: U8
Blue channel loaded into the Nth entry of the texture color palette.
3DSTATE_SAMPLER_STATE_POINTERS_DS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_DS command is used to define the location of DS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 2Dh 3DSTATE_SAMPLER_STATE_POINTERS_DS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 Pointer to DS Sampler State
Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
Specifies the 32-byte aligned address offset of the DS function's SAMPLER_STATE table. This
offset is relative to the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_GS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_GS command is used to define the location of GS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 2Eh 3DSTATE_SAMPLER_STATE_POINTERS_GS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 Pointer to GS Sampler State
Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
Specifies the 32-byte aligned address offset of the GS function's SAMPLER_STATE table. This
offset is relative to the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_HS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_HS command is used to define the location of HS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 2Ch 3DSTATE_SAMPLER_STATE_POINTERS_HS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 Pointer to HS Sampler State
Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
Specifies the 32-byte aligned address offset of the HS function's SAMPLER_STATE table. This
offset is relative to the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_PS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_PS command is used to define the location of PS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 2Fh 3DSTATE_SAMPLER_STATE_POINTERS_PS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 Pointer to PS Sampler State
Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
Specifies the 32-byte aligned address offset of the PS function's SAMPLER_STATE table. This
offset is relative to the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_VS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_VS command is used to define the location of VS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 2Bh 3DSTATE_SAMPLER_STATE_POINTERS_VS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 Pointer to VS Sampler State
Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
Specifies the 32-byte aligned address offset of the VS function's SAMPLER_STATE table. This
offset is relative to the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
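The five SAMPLER_STATE_POINTERS commands differ only in their sub opcode, so one table-driven helper can cover them. This is a minimal sketch with hypothetical names; the sub opcodes come from the command tables above, and the only check made on the pointer is the 32-byte alignment implied by the reserved bits 4:0.

```python
SAMPLER_STATE_POINTERS_SUB_OPCODE = {
    "VS": 0x2B,
    "HS": 0x2C,
    "DS": 0x2D,
    "GS": 0x2E,
    "PS": 0x2F,
}


def sampler_state_pointers(stage: str, dynamic_state_offset: int) -> list[int]:
    """Return [DWord 0, DWord 1] for 3DSTATE_SAMPLER_STATE_POINTERS_<stage>."""
    if stage not in SAMPLER_STATE_POINTERS_SUB_OPCODE:
        raise ValueError(f"unknown stage: {stage}")
    if not 0 <= dynamic_state_offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 32 bits")
    if dynamic_state_offset % 32:
        raise ValueError("SAMPLER_STATE table offset must be 32-byte aligned")
    sub_opcode = SAMPLER_STATE_POINTERS_SUB_OPCODE[stage]
    # Header: type 3h, subtype 3h, opcode 0h (pipelined), sub opcode, DWord Length 0.
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x0 << 24) | (sub_opcode << 16) | 0
    # Bits 31:5 carry the offset relative to Dynamic State Base Address; 4:0 are MBZ.
    return [dw0, dynamic_state_offset]


if __name__ == "__main__":
    for stage in SAMPLER_STATE_POINTERS_SUB_OPCODE:
        print(stage, [f"{dword:#010x}" for dword in sampler_state_pointers(stage, 0x1000)])
```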
3DSTATE_SBE
Source: RenderCS
Length Bias: 2
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 1Fh 3DSTATE_SBE
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0Ch Excludes DWord (0,1)
              Format: =n Total Length - 2
1      31:29  Reserved
              Format: MBZ
       28     Attribute Swizzle Control Mode
              Format: U1 enumerated type
              When Attribute Swizzle Enable is ENABLED, this bit controls whether attributes 0-15 or
              16-31 are subject to the following swizzle controls:
              • Attribute n Component Override X/Y/Z/W
              • Attribute n Constant Source
              • Attribute n Swizzle Select
              • Attribute n Source Attribute
              • Attribute n Wrap Shortest Enables
              Note that the Number of SF Output Attributes field specifies how many attributes are
              output.
              Note: This field does not impact any functions which provide separate states for all 32
              attributes (e.g., Point sprite, Constant interpolation).
              Value  Name        Description
              0h     SWIZ_0_15   Attributes 0-15 are subject to swizzling, and attributes 16-31 are
                                 not.
              1h     SWIZ_16_31  Attributes 16-31 are subject to swizzling, and attributes 0-15 are
                                 not. Only valid when 16 or more attributes are output.
       27:22  Number of SF Output Attributes
              Format: U6 count of attributes
              Specifies the number of vertex attributes passed from the SF stage to the WM stage (does not
              include Position).
              Value: [0,32]
       21     Attribute Swizzle Enable
              Format: Enable
              Enables the SF to perform swizzling on (up to the first 16) vertex attributes. If DISABLED, all
              vertex attributes are passed through.
       20     Point Sprite Texture Coordinate Origin
              Format: U1 enumerated type
              This state controls how Point Sprite Texture Coordinates are generated (when enabled on a
              per-attribute basis by Point Sprite Texture Coordinate Enable).
              Value  Name       Description
              0h     UPPERLEFT  Top Left = (0,0,0,1), Bottom Left = (0,1,0,1), Bottom Right = (1,1,0,1)
              1h     LOWERLEFT  Top Left = (0,1,0,1), Bottom Left = (0,0,0,1), Bottom Right = (1,0,0,1)
       19:16  Reserved
              Format: MBZ
       15:11  Vertex URB Entry Read Length
              Format: U5
              Specifies the amount of URB data read for each Vertex URB entry, in 256-bit
              register increments.
              Value: [1,16]
              Programming Notes
              It is UNDEFINED to set this field to 0 indicating no Vertex URB data to be read. This field should
              be set to the minimum length required to read the maximum source attribute. The maximum
              source attribute is indicated by the maximum value of the enabled Attribute # Source Attribute
              if Attribute Swizzle Enable is set, Number of Output Attributes-1 if enable is not set.
              read_length = ceiling((max_source_attr+1)/2)
       10     Reserved
       9:4    Vertex URB Entry Read Offset
              Specifies the offset (in 256-bit units) at which Vertex URB data is to be read from the URB.
       3:0    Reserved
              Format: MBZ
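As a worked illustration of the read_length rule in the Vertex URB Entry Read Length programming note, the following Python sketch computes the field value. The function and parameter names are hypothetical; the rounding reflects two 128-bit attributes per 256-bit URB unit, as implied by the formula.

```python
def max_source_attribute(swizzle_enable, source_attrs, num_output_attrs):
    """Largest source attribute index that the SF must be able to read.

    source_attrs: the enabled "Attribute n Source Attribute" values
    (only consulted when Attribute Swizzle Enable is set).
    """
    if swizzle_enable:
        return max(source_attrs)
    return num_output_attrs - 1


def vertex_urb_entry_read_length(max_source_attr):
    """read_length = ceiling((max_source_attr + 1) / 2), in 256-bit units."""
    read_length = (max_source_attr + 2) // 2  # integer form of the ceiling
    if not 1 <= read_length <= 16:
        raise ValueError("Vertex URB Entry Read Length must be in [1,16]")
    return read_length
```

For example, with swizzling enabled and source attributes 0, 4 and 2 in use, the maximum source attribute is 4 and three 256-bit units must be read.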
2..9   31     Attribute [2n+1] Component Override W
              Format: Enable
              If set, the W component of output Attribute 1 is overridden by the W component of the constant
              vector specified by ConstantSource[1].
       30     Attribute [2n+1] Component Override Z
              Format: Enable
              If set, the Z component of output Attribute 1 is overridden by the Z component of the constant
              vector specified by ConstantSource[1].
       29     Attribute [2n+1] Component Override Y
              Format: Enable
              If set, the Y component of output Attribute 1 is overridden by the Y component of the constant
              vector specified by ConstantSource[1].
       28     Attribute [2n+1] Component Override X
              Format: Enable
              If set, the X component of output Attribute 1 is overridden by the X component of the constant
              vector specified by ConstantSource[1].
       27     Reserved
              Format: MBZ
       26:25  Attribute [2n+1] Constant Source
              Format: U2 enumerated type
              This state selects a constant vector which can be used to override individual components of
              Attribute 1.
              Value  Name              Description
              0h     CONST_0000        = 0.0,0.0,0.0,0.0
              1h     CONST_0001_FLOAT  = 0.0,0.0,0.0,1.0
              2h     CONST_1111_FLOAT  = 1.0,1.0,1.0,1.0
              3h     PRIM_ID           = PrimID (replicated)
       24     Reserved
              Format: MBZ
       23:22  Attribute [2n+1] Swizzle Select
              Format: U2 enumerated type
              This state, along with Attribute 1 Source Attribute, specifies the source for output Attribute 1.
              Value  Name                Description
              0h     INPUTATTR           This attribute is sourced from AttrInputReg[SourceAttribute].
              1h     INPUTATTR_FACING    If the object is front-facing, this attribute is sourced
                                         from AttrInputReg[SourceAttribute]. If the object is
                                         back-facing, this attribute is sourced from
                                         AttrInputReg[SourceAttribute+1].
              2h     INPUTATTR_W         This attribute is sourced from
                                         AttrInputReg[SourceAttribute]. The W component is
                                         copied to the X component.
              3h     INPUTATTR_FACING_W  If the object is front-facing, this attribute is sourced
                                         from AttrInputReg[SourceAttribute]. If the object is
                                         back-facing, this attribute is sourced from
                                         AttrInputReg[SourceAttribute+1]. The W component is
                                         copied to the X component.
       21     Reserved
              Format: MBZ
       20:16  Attribute [2n+1] Source Attribute
              Format: U5
              This field selects the source attribute for Attribute 1. Source attribute 0 corresponds to the first
              128 bits of data indicated by Vertex URB Entry Read Offset.
       15     Attribute [2n] Component Override W
              Format: Enable
              If set, the W component of output Attribute 0 is overridden by the W component of the constant
              vector specified by ConstantSource[1].
       14     Attribute [2n] Component Override Z
              Format: Enable
              If set, the Z component of output Attribute 0 is overridden by the Z component of the constant
              vector specified by ConstantSource[1].
       13     Attribute [2n] Component Override Y
              Format: Enable
              If set, the Y component of output Attribute 0 is overridden by the Y component of the constant
              vector specified by ConstantSource[1].
       12     Attribute [2n] Component Override X
              Format: Enable
              If set, the X component of output Attribute 0 is overridden by the X component of the constant
              vector specified by ConstantSource[1].
       11     Reserved
              Format: MBZ
       10:9   Attribute [2n] Constant Source
              Format: U2 enumerated type
              This state selects a constant vector which can be used to override individual components of
              Attribute 0.
              Value  Name              Description
              0h     CONST_0000        = 0.0,0.0,0.0,0.0
              1h     CONST_0001_FLOAT  = 0.0,0.0,0.0,1.0
              2h     CONST_1111_FLOAT  = 1.0,1.0,1.0,1.0
              3h     PRIM_ID           = PrimID (replicated)
       8      Reserved
              Format: MBZ
       7:6    Attribute [2n] Swizzle Select
              Format: U2 enumerated type
              This state, along with Attribute 0 Source Attribute, specifies the source for output Attribute 0.
              Value  Name                Description
              0h     INPUTATTR           This attribute is sourced from AttrInputReg[SourceAttribute].
              1h     INPUTATTR_FACING    If the object is front-facing, this attribute is sourced
                                         from AttrInputReg[SourceAttribute]. If the object is
                                         back-facing, this attribute is sourced from
                                         AttrInputReg[SourceAttribute+1].
              2h     INPUTATTR_W         This attribute is sourced from
                                         AttrInputReg[SourceAttribute]. The W component is
                                         copied to the X component.
              3h     INPUTATTR_FACING_W  If the object is front-facing, this attribute is sourced
                                         from AttrInputReg[SourceAttribute]. If the object is
                                         back-facing, this attribute is sourced from
                                         AttrInputReg[SourceAttribute+1]. The W component is
                                         copied to the X component.
       5      Reserved
              Format: MBZ
       4:0    Attribute [2n] Source Attribute
              Format: U5
              This field selects the source attribute for Attribute 0. Source attribute 0 corresponds to the first
              128 bits of data indicated by Vertex URB Entry Read Offset.
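Each of DWords 2..9 therefore carries two 16-bit attribute controls, the even attribute [2n] in bits 15:0 and the odd attribute [2n+1] in bits 31:16. A minimal Python sketch of that packing follows; the helper names and the override mask convention (bit 0 = X through bit 3 = W) are assumptions made for the example.

```python
# Enumerations taken from the Constant Source and Swizzle Select tables.
CONST_0000, CONST_0001_FLOAT, CONST_1111_FLOAT, PRIM_ID = range(4)
INPUTATTR, INPUTATTR_FACING, INPUTATTR_W, INPUTATTR_FACING_W = range(4)


def pack_attribute(source_attr, swizzle_select=INPUTATTR,
                   const_source=CONST_0000, override_xyzw=0):
    """Pack one 16-bit attribute control.

    override_xyzw: 4-bit mask, bit 0 = X ... bit 3 = W (example convention).
    """
    assert 0 <= source_attr < 32 and 0 <= override_xyzw < 16
    assert 0 <= swizzle_select < 4 and 0 <= const_source < 4
    return ((override_xyzw << 12)   # Component Override X/Y/Z/W, bits 15:12
            | (const_source << 9)   # Constant Source, bits 10:9
            | (swizzle_select << 6) # Swizzle Select, bits 7:6
            | source_attr)          # Source Attribute, bits 4:0


def pack_attribute_pair(even_attr, odd_attr):
    """Combine attribute [2n] (low half) and [2n+1] (high half) into one DWord."""
    return (odd_attr << 16) | even_attr
```

A pass-through attribute that reads source attribute 5 packs to 0005h; reserved bits 11, 8 and 5 are left at zero by construction.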
10     31:0   Point Sprite Texture Coordinate Enable
              Format: 32-bit bitmask
              When processing point primitives, the attributes from the incoming point vertex are
              typically copied to the point object corner vertices. However, if a bit is set in this field,
              the corresponding Attribute is selected as a Point Sprite Texture Coordinate, in which
              case each corner vertex is assigned a pre-defined texture coordinate as defined by
              the Point Sprite Texture Coordinate Origin state bit. Bit 0 corresponds to output
              Attribute 0.
              This field must be programmed to 0 when non-point primitives are rendered.
11     31:0   Constant Interpolation Enable[31:0]
              This field is a bitmask containing a Constant Interpolation Enable bit for each corresponding
              attribute. If a bit is set, that attribute will undergo constant interpolation, and the corresponding
              WrapShortest Enable bits (if defined) will be ignored. If a bit is clear, components which are not
              enabled for WrapShortest interpolation (if defined) will be linearly interpolated.
12     31:28  Attribute 7 WrapShortest Enables
              Format: Enable[4]
              This state selects which components (if any) of Attribute 7 are to be interpolated in a "wrap
              shortest" fashion. Operation is UNDEFINED if any of these bits are set and the Constant
              Interpolation Enable bit associated with this attribute is set. Note that wrap-shortest interpolation
              is only supported for Attributes 0-15.
              Bit 0: WrapShortest X Component
              Bit 1: WrapShortest Y Component
              Bit 2: WrapShortest Z Component
              Bit 3: WrapShortest W Component
       27:24  Attribute 6 WrapShortest Enables
              (See above).
       23:20  Attribute 5 WrapShortest Enables
              (See above).
       19:16  Attribute 4 WrapShortest Enables
              (See above).
       15:12  Attribute 3 WrapShortest Enables
              (See above).
       11:8   Attribute 2 WrapShortest Enables
              (See above).
       7:4    Attribute 1 WrapShortest Enables
              (See above).
       3:0    Attribute 0 WrapShortest Enables
              (See above).
13     31:28  Attribute 15 WrapShortest Enables
              Format: Enable[4]
              This state selects which components (if any) of Attribute 15 are to be interpolated in a "wrap
              shortest" fashion. Operation is UNDEFINED if any of these bits are set and the Constant
              Interpolation Enable bit associated with this attribute is set.
              Bit 0: WrapShortest X Component
              Bit 1: WrapShortest Y Component
              Bit 2: WrapShortest Z Component
              Bit 3: WrapShortest W Component
       27:24  Attribute 14 WrapShortest Enables
              (See above).
       23:20  Attribute 13 WrapShortest Enables
              (See above).
       19:16  Attribute 12 WrapShortest Enables
              (See above).
       15:12  Attribute 11 WrapShortest Enables
              (See above).
       11:8   Attribute 10 WrapShortest Enables
              (See above).
       7:4    Attribute 9 WrapShortest Enables
              (See above).
       3:0    Attribute 8 WrapShortest Enables
              (See above).
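The WrapShortest Enables occupy one 4-bit group per attribute, eight attributes per DWord. The sketch below packs them into DWords 12 and 13 and rejects the combination the text calls UNDEFINED (a WrapShortest bit together with Constant Interpolation Enable for the same attribute). The function name and the dictionary input are illustrative assumptions.

```python
def pack_wrap_shortest(enables, constant_interp_mask=0):
    """Return (dword12, dword13) for the WrapShortest Enables.

    enables: dict mapping attribute index (0..15) to a 4-bit mask,
             bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W.
    constant_interp_mask: the DWord 11 Constant Interpolation Enable bitmask.
    """
    dwords = [0, 0]
    for attr, mask in enables.items():
        if not 0 <= attr <= 15:
            raise ValueError("wrap-shortest is only supported for Attributes 0-15")
        if not 0 <= mask <= 0xF:
            raise ValueError("mask must be a 4-bit value")
        if mask and (constant_interp_mask >> attr) & 1:
            raise ValueError("UNDEFINED: WrapShortest set with Constant Interpolation")
        # Attributes 0-7 go to DWord 12, attributes 8-15 to DWord 13.
        dwords[attr // 8] |= mask << (4 * (attr % 8))
    return tuple(dwords)
```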
3DSTATE_SCISSOR_STATE_POINTERS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SCISSOR_STATE_POINTERS command is used to define the location of the indirect SCISSOR_RECT
state.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 0Fh 3DSTATE_SCISSOR_STATE_POINTERS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h DWORD_COUNT_n
              Format: =n
1      31:5   Scissor Rect Pointer
              Format: DynamicStateOffset[31:5]SCISSOR_RECT*16
              Specifies the 32-byte aligned address offset of the SCISSOR_RECT state. This offset is
              relative to the Dynamic State Base Address.
       4:0    Reserved
              Format: MBZ
3DSTATE_SF
Source: RenderCS
Length Bias: 2
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 13h 3DSTATE_SF
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 5h Excludes DWord (0,1)
              Format: =n Total Length - 2
1      31:15  Reserved
              Format: MBZ
       14:12  Depth Buffer Surface Format
              Format: U3 Enumerated Type
              Specifies the format of the depth buffer. This must exactly match the Surface Format
              programmed via 3DSTATE_DEPTH_BUFFER. The SF requires this information in order to compute
              Global Depth Bias.
              Value   Name
              0h      D32_FLOAT_S8X24_UINT
              1h      D32_FLOAT
              2h      D24_UNORM_S8_UINT
              3h      D24_UNORM_X8_UINT
              4h      Reserved
              5h      D16_UNORM
              6h-7h   Reserved
       11     Legacy Global Depth Bias Enable
              Format: Enable
              Enables the SF to use the Global Depth Offset Constant state unmodified. If this bit is not set, the
              SF will scale the Global Depth Offset Constant as described elsewhere in this document.
              Programming Notes
              This bit should be set whenever non zero depth bias (Slope, Bias) values are used. Setting this
              bit may have some degradation of performance for some workloads.
       10     Statistics Enable
              Format: Enable
              If ENABLED, this FF unit will increment CL_PRIMITIVES_COUNT on behalf of the CLIP stage. If
              DISABLED, CL_PRIMITIVES_COUNT will be left unchanged.
              Programming Notes
              This bit should be set whenever clipping is enabled and the Statistics Enable bit is set in
              CLIP_STATE. It should be cleared if clipping is disabled or Statistics Enable in CLIP_STATE is
              clear.
       9      Global Depth Offset Enable Solid
              Format: Enable
              Enables computation and application of Global Depth Offset for SOLID objects.
              Programming Notes
              This bit should be set whenever non zero depth bias (Slope, Bias) values are used.
              Setting this bit may have some degradation of performance for some workloads.
       8      Global Depth Offset Enable Wireframe
              Format: Enable
              Enables computation and application of Global Depth Offset when triangles are rendered in
              WIREFRAME mode.
              Programming Notes
              This bit should be set whenever non zero depth bias (Slope, Bias) values are used.
              Setting this bit may have some degradation of performance for some workloads.
       7      Global Depth Offset Enable Point
              Format: Enable
              Enables computation and application of Global Depth Offset when triangles are rendered in
              POINT mode.
              Programming Notes
              This bit should be set whenever non zero depth bias (Slope, Bias) values are used.
              Setting this bit may have some degradation of performance for some workloads.
       6:5    FrontFace Fill Mode
              Format: U2 enumerated type
              This state controls how front-facing triangle and rectangle objects are rendered.
              Value  Name       Description
              0h     SOLID      Any triangle or rectangle object found to be front-facing is
                                rendered as a solid object. This setting is required when
                                rendering rectangle (RECTLIST) objects.
              1h     WIREFRAME  Any triangle object found to be front-facing is rendered as a
                                series of lines along the triangle boundaries (as determined by
                                the topology type and controlled by the vertex EdgeFlags).
              2h     POINT      Any triangle object found to be front-facing is rendered as a set
                                of point primitives at the triangle vertices (as determined by the
                                topology type and controlled by the vertex EdgeFlags). NOTE: If
                                the triangle is clipped, points will not be rendered at clip-inserted
                                vertices. Points will only be rendered at original vertices (if visible).
              3h     Reserved
       4:3    BackFace Fill Mode
              Format: U2 enumerated type
              This state controls how back-facing triangle and rectangle objects are rendered.
              Value  Name       Description
              0h     SOLID      Any triangle or rectangle object found to be back-facing is
                                rendered as a solid object. This setting is required when
                                rendering rectangle (RECTLIST) objects.
              1h     WIREFRAME  Any triangle object found to be back-facing is rendered as a
                                series of lines along the triangle boundaries (as determined by
                                the topology type and controlled by the vertex EdgeFlags).
              2h     POINT      Any triangle object found to be back-facing is rendered as a set
                                of point primitives at the triangle vertices (as determined by the
                                topology type and controlled by the vertex EdgeFlags). NOTE: If
                                the triangle is clipped, points will not be rendered at clip-inserted
                                vertices. Points will only be rendered at original vertices (if visible).
              3h     Reserved
       2      Reserved
              Format: MBZ
       1      View Transform Enable
              Format: Enable
              This bit controls the Viewport Transform function.
       0      Front Winding
              Determines whether a triangle object is considered "front facing" if the screen space vertex
              positions, when traversed in the order, result in a clockwise (CW) or counter-clockwise (CCW)
              winding order. Does not apply to points or lines.
2      31     Anti-Aliasing Enable
              Format: Enable
              This field enables "alpha-based" line anti-aliasing.
              Programming Notes
              This field must be disabled if any of the render targets have integer (UINT or SINT) surface
              format.
       30:29  Cull Mode
              Format: 3D_CullMode
              Controls removal (culling) of triangle objects based on orientation. The cull mode only applies to
              triangle objects and does not apply to lines, points or rectangles.
              Value  Name            Description
              0h     CULLMODE_BOTH   All triangles are discarded (i.e., no triangle objects are
                                     drawn)
              1h     CULLMODE_NONE   No triangles are discarded due to orientation
              2h     CULLMODE_FRONT  Triangles with a front-facing orientation are discarded
              3h     CULLMODE_BACK   Triangles with a back-facing orientation are discarded
              Programming Notes
              Orientation determination is based on the setting of the Front Winding state.
       28     Reserved
       27:18  Line Width
              Format: U3.7
              Range: [0.0, 7.9921875]
              Controls width of line primitives. Setting a Line Width of 0.0 specifies the rasterization
              of the "thinnest" (one-pixel-wide), non-antialiased lines. Note that this effectively
              overrides the effect of AAEnable (though the AAEnable state variable is not modified).
              Programming Notes
              Software must not program a value of 0.0 when running in MSRASTMODE_ON_xxx
              modes - zero-width lines are not available when multisampling rasterization is
              enabled.
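The U3.7 format stores the line width as an unsigned fixed-point value with 3 integer bits and 7 fractional bits, so the raw field is the width multiplied by 128. A small Python sketch of the conversion (function names are hypothetical, and rounding to the nearest step is an assumption of this example):

```python
def line_width_u3_7(width_pixels):
    """Convert a line width in pixels to the 10-bit U3.7 field value."""
    raw = round(width_pixels * 128)  # 7 fractional bits -> steps of 1/128
    if not 0 <= raw <= 0x3FF:
        raise ValueError("Line Width must be in [0.0, 7.9921875]")
    return raw


def place_line_width(width_pixels):
    """Shift the U3.7 value into bits 27:18 of 3DSTATE_SF DWord 2."""
    return line_width_u3_7(width_pixels) << 18
```

The upper bound 7.9921875 is exactly 1023/128, the largest value a 10-bit field can hold.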
       17:16  Line End Cap Antialiasing Region Width
              Format: U2
              This field specifies the distances over which the coverage of anti-aliased line end caps are
              computed.
              Value  Name
              0h     0.5 pixels
              1h     1.0 pixels
              2h     2.0 pixels
              3h     4.0 pixels
       15     Reserved
              Format: MBZ
       14     Reserved
              Format: MBZ
       13     Reserved
       12     Reserved
       11     Scissor Rectangle Enable
              Format: Enable
              Enables operation of Scissor Rectangle.
       10     Reserved
              Format: MBZ
       9:8    Multisample Rasterization Mode
              Format: U2 enumerated type
              This state is duplicated in 3DSTATE_WM and both must be set to the same value. See the field in
              3DSTATE_WM for definition details.
       7:0    Reserved
              Format: MBZ
3      31     Last Pixel Enable
              Format: Enable
              If ENABLED, the last pixel of a diamond line will be lit. This state will only affect the rasterization
              of Diamond lines (will not affect wide lines or anti-aliased lines).
              Programming Notes
              Last pixel is applied to all lines of a LINELIST, and only the last line of a LINESTRIP.
       30:29  Triangle Strip/List Provoking Vertex Select
              Format: 0-based vertex index
              Selects which vertex of a triangle (in a triangle strip or list primitive) is considered the "provoking
              vertex". Used for flat shading of primitives. Does current implementation send provoking vertex
              first?
              Value  Name
              0h     Vertex 0
              1h     Vertex 1
              2h     Vertex 2
              3h     Reserved
       28:27  Line Strip/List Provoking Vertex Select
              Format: 0-based vertex index
              Selects which vertex of a line (in a line strip or list primitive) is considered the "provoking vertex".
              Value  Name
              0h     Vertex 0
              1h     Vertex 1
              2h     Reserved
              3h     Reserved
       26:25  Triangle Fan Provoking Vertex Select
              Format: 0-based vertex index
              Selects which vertex of a triangle (in a triangle fan primitive) is considered the "provoking vertex".
              Value  Name
              0h     Vertex 0
              1h     Vertex 1
              2h     Vertex 2
              3h     Reserved
       24:15  Reserved
              Format: MBZ
       14     AA Line Distance Mode
              Format: U1
              This bit controls the distance computation for antialiased lines.
              Value  Name                 Description
              0h     Reserved             Reserved
              1h     AALINEDISTANCE_TRUE  True distance computation. This is the normal setting
                                          which should yield WHQL compliance.
       13     Reserved
              Format: MBZ
       12     Vertex Sub Pixel Precision Select
              Format: U1
              Selects the number of fractional bits maintained in the vertex data.
              Value  Name     Description
              0h     Disable  8 sub pixel precision bits maintained
              1h     Enable   4 sub pixel precision bits maintained
       11     Use Point Width State
              Format: U1
              Controls whether the point width passed on the vertex or from state is used for rendering point
              primitives.
              Value  Description
              0h     Use Point Width on Vertex
              1h     Use Point Width from State
       10:0   Point Width
              Format: U8.3
              Range: [0.125, 255.875] pixels
              This field specifies the size (width) of point primitives in pixels. This field is overridden (though
              not overwritten) whenever point width information is passed in the FVF.
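Point Width uses the U8.3 format: 8 integer bits and 3 fractional bits, so the raw 11-bit value is the width in pixels multiplied by 8. The following Python sketch converts a width and combines it with the Use Point Width State bit; the helper names and the round-to-nearest choice are assumptions of the example.

```python
def point_width_u8_3(width_pixels):
    """Convert a point width in pixels to the 11-bit U8.3 field value."""
    raw = round(width_pixels * 8)  # 3 fractional bits -> steps of 1/8 pixel
    if not 1 <= raw <= 0x7FF:
        raise ValueError("Point Width must be in [0.125, 255.875] pixels")
    return raw


def pack_point_width_bits(width_pixels, use_state_width):
    """Pack bits 11:0 of 3DSTATE_SF DWord 3.

    use_state_width: True selects "Use Point Width from State" (bit 11 = 1).
    """
    return (int(bool(use_state_width)) << 11) | point_width_u8_3(width_pixels)
```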
4 31:0 Global Depth Offset Constant
Format: IEEE_FP
Specifies the constant term in the Global Depth Offset function.
5 31:0 Global Depth Offset Scale
Format: IEEE_FP
Specifies the scale term used in the Global Depth Offset function.
6 31:0 Global Depth Offset Clamp
Format: IEEE_FP
Specifies the clamp term used in the Global Depth Offset function.
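DWords 4 through 6 hold IEEE_FP values, so a command builder has to emit the raw 32-bit pattern of each float rather than an integer conversion. A minimal sketch using Python's standard struct module follows; the function names are hypothetical, and single precision is assumed because each field is 32 bits wide.

```python
import struct


def ieee_fp_dword(value):
    """Return the 32-bit IEEE 754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def global_depth_offset_dwords(constant, scale, clamp):
    """Return DWords 4, 5 and 6 of 3DSTATE_SF as unsigned integers."""
    return [ieee_fp_dword(constant), ieee_fp_dword(scale), ieee_fp_dword(clamp)]
```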
3DSTATE_SO_BUFFER
Source: RenderCS
Length Bias: 2
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 18h 3DSTATE_SO_BUFFER
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 2h Excludes DWord (0,1)
              Format: =n Total Length - 2
1      31     Reserved
              Format: MBZ
       30:29  SO Buffer Index
              Format: U2
              Specifies which of the four SO Buffers is being defined.
       28:25  SO Buffer Object Control State
              Format: MEMORY_OBJECT_CONTROL_STATE
              Specifies the memory object control state for the SO buffer.
       24:22  Reserved
              Format: MBZ
       21:12  Reserved
              Format: MBZ
       11:0   Surface Pitch
              Format: U12 Pitch in Bytes
              This field specifies the pitch of the SO buffer in #Bytes.
              Value: [0,2048]. Must be 0 or a multiple of 4 Bytes.
              Programming Notes
              A Surface Pitch of 0 indicates an un-bound buffer. No writes are performed. Surface Base
              Address is ignored.
2      31:2   Surface Base Address
              Format: GraphicsAddress[31:2]
              This field specifies the starting DWord address LSBs of the buffer in Graphics Memory.
       1:0    Reserved
              Format: MBZ
3      31:2   Surface End Address
              Format: GraphicsAddress[31:2]
              This field specifies the ending DWord address of the buffer in Graphics Memory.
       1:0    Reserved
              Format: MBZ
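Putting the four DWords together, a 3DSTATE_SO_BUFFER command could be assembled as in the Python sketch below. The function name and argument names are hypothetical; the checks mirror the constraints stated in the field descriptions (pitch of 0 or a multiple of 4 up to 2048, DWord-aligned addresses).

```python
def so_buffer_command(index, pitch, base_address, end_address, mocs=0):
    """Return the four DWords of 3DSTATE_SO_BUFFER (illustrative sketch)."""
    if not 0 <= index <= 3:
        raise ValueError("SO Buffer Index selects one of four buffers")
    if not 0 <= pitch <= 2048 or pitch % 4:
        raise ValueError("Surface Pitch must be 0 or a multiple of 4, at most 2048")
    if not 0 <= mocs <= 0xF:
        raise ValueError("SO Buffer Object Control State is a 4-bit field")
    for address in (base_address, end_address):
        if address & 0x3 or not 0 <= address <= 0xFFFFFFFF:
            raise ValueError("addresses must be DWord aligned 32-bit values")
    dword0 = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x18 << 16) | 0x2
    dword1 = (index << 29) | (mocs << 25) | pitch
    return [dword0, dword1, base_address, end_address]
```

A pitch of 0 is accepted on purpose: per the programming note it marks the buffer as un-bound, in which case the base address is ignored.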
3DSTATE_SO_DECL_LIST
Source: RenderCS
Length Bias: 2
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 17h 3DSTATE_SO_DECL_LIST
              Format: OpCode
       15:9   Reserved
              Format: MBZ
       8:0    DWord Length
              Format: =n Total Length - 2
              Default value = 2(N-1)+3 h
              Value  Description
              3h     Excludes DWord (0,1) [Default]
1      31:16  Reserved
              Format: MBZ
       15:12  Stream to Buffer Selects [3]
              Format: U4 bitmask
              Identifies to which SO Buffers stream 3 outputs. See Stream To Buffer Selects [0] field description.
       11:8   Stream to Buffer Selects [2]
              Format: U4 bitmask
              Identifies to which SO Buffers stream 2 outputs. See Stream To Buffer Selects [0] field description.
       7:4    Stream to Buffer Selects [1]
              Format: U4 bitmask
              Identifies to which SO Buffers stream 1 outputs. See Stream To Buffer Selects [0] field description.
       3:0    Stream to Buffer Selects [0]
              Format: U4 bitmask
              Identifies to which SO Buffers stream 0 outputs (irrespective of whether those buffers are
              enabled via 3DSTATE_STREAMOUT). Software is required to scan the SO_DECL list in order to
              provide this summary information.
              Note: For "inactive" streams, software must program this field to all zero (no buffers written to)
              and the corresponding Num Entries field to zero (no valid SO_DECLs).
              Value  Name
              1xxxb  SO Buffer 3
              x1xxb  SO Buffer 2
              xx1xb  SO Buffer 1
              xxx1b  SO Buffer 0
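The four Stream to Buffer Selects fields are a summary that software derives by scanning its SO_DECL lists. The sketch below assumes the scan has already produced, for each stream, the set of SO buffer indices that stream writes; how that set is extracted from an individual SO_DECL is not shown here, and the function name is hypothetical.

```python
def stream_to_buffer_selects(buffers_per_stream):
    """Build bits 15:0 of DWord 1 from per-stream buffer usage.

    buffers_per_stream: four iterables; entry s lists the SO buffer
    indices (0..3) written by stream s. Inactive streams pass an empty
    iterable, which yields an all-zero field as required.
    """
    if len(buffers_per_stream) != 4:
        raise ValueError("exactly four streams are described")
    dword = 0
    for stream, buffers in enumerate(buffers_per_stream):
        mask = 0
        for buffer_index in buffers:
            if not 0 <= buffer_index <= 3:
                raise ValueError("SO buffer index must be 0..3")
            mask |= 1 << buffer_index   # xxx1b = SO Buffer 0 ... 1xxxb = SO Buffer 3
        dword |= mask << (4 * stream)   # stream s occupies bits 4s+3:4s
    return dword
```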
2      31:24  Num Entries [3]
              Format: U8 #entries
              Specifies the number of valid SO_DECL entries for Stream 3. (See notes in Num Entries [0] field
              description).
              Value: [0,128] entries
       23:16  Num Entries [2]
              Format: U8 #entries
              Specifies the number of valid SO_DECL entries for Stream 2. (See notes in Num Entries [0] field
              description).
              Value: [0,128] entries
       15:8   Num Entries [1]
              Format: U8 #entries
              Specifies the number of valid SO_DECL entries for Stream 1. (See notes in Num Entries [0] field
              description).
              Value: [0,128] entries
       7:0    Num Entries [0]
              Format: U8 #entries
              Specifies the number of valid SO_DECL entries for Stream 0. Note that the SO_DECLs are
              programmed in groups of four (one SO_DECL for each of the four streams). Therefore the
              number of 2-DWord groups of SO_DECLs supplied in this command is derived from the stream(s)
              with the most valid SO_DECLs. The NumEntries value specific to each stream will indicate how
              many SO_DECLs are valid for that particular stream. Any trailing invalid SO_DECLs supplied for
              streams with fewer valid SO_DECLs will be ignored. It is legal to specify Num Entries = 0 for all
              four streams simultaneously. In this case there will be no SO_DECLs included in the command
              (only DW 0-2). Note that all Stream to Buffer Selects bits must be zero in this case (as no streams
              produce output).
              Value: [0,128] entries
3..n   63:48  SO_DECL[3,n]
              Format: SO_DECL
              This field contains Stream 3 SO_DECL [n]
       47:32  SO_DECL[2,n]
              Format: SO_DECL
              This field contains Stream 2 SO_DECL [n]
       31:16  SO_DECL[1,n]
              Format: SO_DECL
              This field contains Stream 1 SO_DECL [n]
       15:0   SO_DECL[0,n]
              Format: SO_DECL
              This field contains Stream 0 SO_DECL [n]
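The grouping rule in the Num Entries [0] description can be sketched as follows. The code takes four lists of already-encoded 16-bit SO_DECL values, pads shorter streams with zero, and derives DWord Length as Total Length - 2. Two points are assumptions of this example rather than statements from the text: the function name, and emitting bits 31:0 of each 64-bit group before bits 63:32.

```python
def so_decl_list_payload(decls_per_stream):
    """Return (dword_length, num_entries_dword, decl_dwords).

    decls_per_stream: four lists of 16-bit SO_DECL values, one per stream.
    """
    if len(decls_per_stream) != 4:
        raise ValueError("exactly four streams are described")
    counts = [len(decls) for decls in decls_per_stream]
    if max(counts) > 128:
        raise ValueError("Num Entries must be in [0,128]")
    groups = max(counts)  # number of 2-DWord groups in the command
    num_entries_dword = sum(count << (8 * s) for s, count in enumerate(counts))
    decl_dwords = []
    for n in range(groups):
        # Trailing invalid SO_DECLs of shorter streams are padded with 0.
        d = [decls[n] if n < len(decls) else 0 for decls in decls_per_stream]
        decl_dwords.append(d[0] | (d[1] << 16))  # bits 31:0  (assumed first)
        decl_dwords.append(d[2] | (d[3] << 16))  # bits 63:32 (assumed second)
    total_length = 3 + 2 * groups                # DW 0-2 plus the SO_DECL groups
    return total_length - 2, num_entries_dword, decl_dwords
```

With N taken as the largest per-stream entry count, the returned DWord Length equals 2(N-1)+3, and with no entries at all it is 1, matching a command that contains only DW 0-2.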
3DSTATE_STENCIL_BUFFER
Source: RenderCS
Length Bias: 2
This command sets the surface state of the separate stencil buffer, delivered as a pipelined state command.
However, the state change pipelining isn't completely transparent (see restriction below).
Programming Notes
Restriction: Prior to changing Depth/Stencil Buffer state (i.e., any combination of
3DSTATE_DEPTH_BUFFER, 3DSTATE_CLEAR_PARAMS, 3DSTATE_STENCIL_BUFFER,
3DSTATE_HIER_DEPTH_BUFFER), SW must first issue a pipelined depth stall (PIPE_CONTROL with Depth
Stall bit set), followed by a pipelined depth cache flush (PIPE_CONTROL with Depth Flush bit set),
followed by another pipelined depth stall (PIPE_CONTROL with Depth Stall bit set), unless SW can
otherwise guarantee that the pipeline from WM onwards is already flushed (e.g., via a preceding
MI_FLUSH).
3DSTATE_STENCIL_BUFFER must always be programmed along with the other Depth/Stencil
state (3DSTATE_DEPTH_BUFFER, 3DSTATE_CLEAR_PARAMS, or
3DSTATE_HIER_DEPTH_BUFFER).
The stencil buffer is always Tile-Y.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 06h 3DSTATE_STENCIL_BUFFER
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    Dword Length
              Format: =n Total Length - 2
              Value  Name
              1h     Excludes Dword (0,1) [Default]
1      31     Reserved
              Format: MBZ
       30:29  Reserved
              Format: MBZ
       28:25  Stencil Buffer Object Control State
              Format: MEMORY_OBJECT_CONTROL_STATE
              Specifies the memory object control state for the stencil buffer.
              Stencil Buffer Object Control State [3:0]
              This field is not context saved and restored by hardware. If this field is programmed to
              any value other than zero, it must be programmed after the following commands or
              events:
              • MI_SET_CONTEXT
              • MI_WAIT_FOR_EVENT (specifically, waits on vblank or display flip)
              • Render engine goes IDLE due to head pointer equal to tail pointer
       24:22  Reserved
              Format: MBZ
       21:17  Reserved
              Format: MBZ
       16:0   Surface Pitch
              Format: U17-1 Pitch in Bytes
              This field specifies the pitch of the stencil buffer in (#Bytes - 1).
              Value: [127, 3FFFFh], corresponding to [128B, 128KB]; also restricted to a multiple of 128B
              Programming Notes
              Since this surface is tiled, the pitch specified must be a multiple of the tile pitch, in the range
              [128B, 128KB].
              The pitch must be set to 2x the value computed based on width, as the stencil buffer is stored
              with two rows interleaved. For details on the separate stencil buffer storage format in memory,
              see GPU Overview (vol1a), Memory Data Formats, Surface Layout, 2D Surfaces, Stencil Buffer
              Layout (section 8.20.4.8).
2      31:0   Surface Base Address
              Format: GraphicsAddress[31:0]Stencil_Buffer
              This field specifies the starting Dword address of the buffer in mapped Graphics Memory.
              Programming Notes
              The Stencil Buffer can only be mapped to Main Memory (uncached).
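The Surface Pitch rules (twice the width-based pitch, a multiple of 128 bytes, programmed as bytes minus one) can be illustrated with a short Python sketch. It assumes one byte per stencil pixel and a width-based row pitch padded to 64 bytes so that the doubled value is a multiple of 128; both are assumptions of this example and should be checked against the Stencil Buffer Layout section. The function name is hypothetical.

```python
def stencil_surface_pitch_field(width_pixels):
    """Return the value to program into Surface Pitch (bits 16:0)."""
    if width_pixels <= 0:
        raise ValueError("width must be positive")
    # Assumption: 1 byte per pixel, row pitch padded to a 64-byte multiple.
    row_bytes = (width_pixels + 63) // 64 * 64
    pitch_bytes = 2 * row_bytes  # two rows are interleaved, so the pitch doubles
    if not 128 <= pitch_bytes <= 128 * 1024:
        raise ValueError("pitch must be in [128B, 128KB]")
    assert pitch_bytes % 128 == 0
    return pitch_bytes - 1       # the field holds (#Bytes - 1)
```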
3DSTATE_STREAMOUT
Source: RenderCS
Length Bias: 2
This command contains pipelined state required by the SOL unit.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 1Eh 3DSTATE_STREAMOUT
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 1h
              Format: =n Total Length - 2
1 31 SO Function Enable
Format: U1
If set, the SO function is enabled. Vertex data will be streamed out to memory (subject to
overflow detection) as controlled by the various SO-related state variables.
If clear, the SO function is disabled, and therefore no vertex data will be streamed out to
memory. However, the Rendering Disable and Render Stream Select fields will still be used to
determine which vertices (if any) are forwarded down the pipeline for (possible) rendering.
30 Rendering Disable
Format: U1
If set, the SO stage will not forward any topologies down the pipeline. If clear, the SO stage will
forward topologies associated with Render Stream Select down the pipeline. This bit is used even
if SO Function Enable is DISABLED.
29 Reserved
Format: MBZ
28:27 Render Stream Select
Doc Ref # IHD-OS-VLV-Vol2 pt2-04.14 151
Command Reference - Instructions
3DSTATE_STREAMOUT
Format:
Description
This field specifies which stream has been selected to be forwarded down the pipeline
for possible rendering. Topologies from other streams will not be passed down the
pipeline. If Rendering Disable is set, this field is ignored, as no topologies are sent
down the pipeline.
This bit is used even if SO Function Enable is DISABLED.
U2
26 Reorder Mode
This bit controls how vertices of triangle objects in TRISTRIP[_ADJ] and TRISTRIP_REV are
reordered for the purposes of stream-out only (does not impact rendering). See table in Input
Buffering.
Value Name Description
0h LEADING Reorder the vertices of alternating triangles of a TRISTRIP[_ADJ]
such that the leading (first) vertices are in consecutive order starting
at v0. A similar reordering is performed on alternating triangles in a
TRISTRIP_REV.
1h TRAILING Reorder the vertices of alternating triangles of a TRISTRIP[_ADJ]
such that the trailing (last) vertices are in consecutive order starting
at v2. A similar reordering is performed on alternating triangles in a
TRISTRIP_REV.
25 SO Statistics Enable
Format: Enable
This bit controls whether StreamOutput statistics register(s) can be incremented.
Value Name Description
0h Disable SO_NUM_PRIMS_WRITTEN[0..3] and SO_PRIM_STORAGE_NEEDED[0..3]
registers cannot increment.
1h Enable SO_NUM_PRIMS_WRITTEN[0..3] and SO_PRIM_STORAGE_NEEDED[0..3]
registers can increment.
24:23 Reserved
Format: MBZ
22:12 Reserved
Format: MBZ
11 SO Buffer Enable [3]
Format: U1
(See SO Buffer Enable [0] )
10 SO Buffer Enable [2]
Format: U1
(See SO Buffer Enable [0] )
9 SO Buffer Enable [1]
Format: U1
(See SO Buffer Enable [0] )
8 SO Buffer Enable [0]
Format: U1
If set, stream output to SO Buffer 0 is enabled. If clear, SO Buffer 0 is considered "not bound"
and effectively treated as a zero-length buffer for the purposes of SO output and overflow
detection. If an enabled stream's Stream to Buffer Selects includes this buffer it is by definition an
overflow condition. That stream will cause no writes to occur, and only
SO_PRIM_STORAGE_NEEDED[
is DISABLED.
7:0 Reserved
Format: MBZ
2 31:30 Reserved
Format: MBZ
29 Stream 3 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 3 vertex data.
(See Stream 0 Vertex Read Offset)
28:24 Stream 3 Vertex Read Length
Format: U5-1 count of 256-bit units
(See Stream 0 Vertex Read Length)
23:22 Reserved
Format: MBZ
21 Stream 2 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 2 vertex data. (See Stream 0
Vertex Read Offset)
20:16 Stream 2 Vertex Read Length
Format: U5-1 count of 256-bit units
15:14 Reserved
Format: MBZ
13 Stream 1 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 1 vertex data. (See Stream 0
Vertex Read Offset)
12:8 Stream 1 Vertex Read Length
Format: U5-1 count of 256-bit units
(See Stream 0 Vertex Read Length)
7:6 Reserved
Format: MBZ
5 Stream 0 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 0 vertex data. Must be zero if
the GS is enabled and the Output Vertex Size field in 3DSTATE_GS is programmed to 0 (i.e., one
16B unit).
4:0 Stream 0 Vertex Read Length
Format: U5-1 count of 256-bit units
Specifies amount of vertex data to read back for Stream 0 vertices, starting at the Stream 0
Vertex Read Offset location. Maximum readback is 17 256-bit units (34 128-bit vertex attributes).
Read data past the end of the valid vertex data has undefined contents, and therefore shouldn't
be used to source stream out data.
Must be zero (i.e., read length = 256b) if the GS is enabled and the Output Vertex Size field in
3DSTATE_GS is programmed to 0 (i.e., one 16B unit).
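As an illustration of the DWord 1 layout described above, the following sketch packs the 3DSTATE_STREAMOUT control bits into a 32-bit value. The function name and argument names are hypothetical and are not part of any driver API; only the bit positions are taken from the field list, and the Reorder Mode encoding (0 = LEADING, 1 = TRAILING) is assumed from the order of the value table.

```python
def pack_streamout_dw1(so_enable, rendering_disable, render_stream,
                       reorder_trailing, stats_enable, buffer_enables):
    """Pack DWord 1 of 3DSTATE_STREAMOUT (hypothetical helper).

    buffer_enables is a sequence of four flags for SO Buffer Enable [0..3].
    """
    if not 0 <= render_stream <= 3:
        raise ValueError("Render Stream Select is a 2-bit field")
    if len(buffer_enables) != 4:
        raise ValueError("expected four SO Buffer Enable flags")
    dw = 0
    dw |= int(bool(so_enable)) << 31          # bit 31: SO Function Enable
    dw |= int(bool(rendering_disable)) << 30  # bit 30: Rendering Disable
    dw |= render_stream << 27                 # bits 28:27: Render Stream Select
    dw |= int(bool(reorder_trailing)) << 26   # bit 26: Reorder Mode (assumed 1 = TRAILING)
    dw |= int(bool(stats_enable)) << 25       # bit 25: SO Statistics Enable
    for i, enabled in enumerate(buffer_enables):
        dw |= int(bool(enabled)) << (8 + i)   # bits 11:8: SO Buffer Enable [3..0]
    return dw


if __name__ == "__main__":
    value = pack_streamout_dw1(True, False, 0, False, True,
                               [True, False, False, False])
    print(hex(value))
```

Reserved bits (29, 24:12 and 7:0) are left at zero, as the MBZ format requires.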
3DSTATE_TE
Source: RenderCS
Length Bias: 2
Description
The state used by TE is defined with this inline state packet.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 1Ch 3DSTATE_TE
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 2h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:19 Reserved
Format: MBZ
18:16 Reserved
Format: MBZ
15:14 Reserved
Format: MBZ
13:12 Partitioning
Format: U2
This field specifies how edges are partitioned based on tessellation factor.
Value Name Description
0h INTEGER Outside/inside edges are divided into an integer number
of equal-sized segments.
1h ODD_FRACTIONAL Outside/inside edges are divided into an odd number of
possibly-unequal-sized segments.
2h EVEN_FRACTIONAL Outside/inside edges are divided into an even number of
possibly-unequal-sized segments.
11:10 Reserved
Format: MBZ
9:8 Output Topology
Format: U2
This field specifies which primitive types are to be output.
Value Name Description
0h POINT Points are output (as POINTLIST topologies)
1h LINE Lines are output (as LINESTRIP topologies). Only valid if ISOLINE
domain is selected.
2h TRI_CW Clockwise-ordered triangles are output (either as TRISTRIP,
TRISTRIP_REV or TRILIST topologies). Not valid if ISOLINE domain is
selected.
3h TRI_CCW Counter-clockwise-ordered triangles are output (either as TRISTRIP,
TRISTRIP_REV or TRILIST topologies). Not valid if ISOLINE domain is
selected.
7:6 Reserved
Format: MBZ
5:4 TE Domain
Format: U2
This field specifies which type of domain is to be tessellated.
Value Name Description
0h QUAD 2D (U,V) domain is tessellated
1h TRI Triangular (U,V,W) domain is tessellated
2h ISOLINE 2D (U,V) domain is tessellated.
3 Reserved
Format: MBZ
2:1 TE Mode
Format: U2
When TE Enable is ENABLED, this field specifies the overall operation of the TE stage. This field is
ignored if TE Enable is DISABLED.
Value Name Description
0h HW_TESS Normal HW Tessellation Mode. The TessFactors are read from the
patch URB entry, and are used to perform fixed-function hardware
tessellation of the specified domain.
1h SW_TESS Software Tessellation Mode. The TE unit will pass down
HS-thread-generated tessellated domain points instead of generating them
itself from TessFactors. The TE unit will read the Domain Point Count
and Domain Point Buffer Starting Address fields from the patch
header, and if the count is 0 it will consider the patch culled and
discard it. Otherwise the address is used to start fetching
DOMAIN_POINT structures from memory and passing them down
the pipeline to DS.
2h Reserved Reserved
3h Reserved Reserved
0 TE Enable
Format: Enable
If ENABLED, the TE stage will perform tessellation processing on incoming patch primitives. The
TE Mode field determines how this tessellation operation proceeds. If DISABLED, the TE goes into
pass-through mode. All other state fields are ignored.
Programming Notes
The tessellation stages (HS, TE and DS) must be enabled/disabled as a group. I.e., draw
commands can only be issued if all three stages are enabled or all three stages are disabled,
otherwise the behavior is UNDEFINED.
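The DWord 1 fields of 3DSTATE_TE and the topology/domain validity rules can be sketched as follows. This is only an illustration: the helper name is hypothetical, and the numeric encodings assume that each value table lists its names in ascending order (0h, 1h, 2h, 3h), which is how the flattened tables appear to read.

```python
# Assumed encodings, taken from the order of the value tables in this section.
PARTITIONING = {"INTEGER": 0, "ODD_FRACTIONAL": 1, "EVEN_FRACTIONAL": 2}
OUTPUT_TOPOLOGY = {"POINT": 0, "LINE": 1, "TRI_CW": 2, "TRI_CCW": 3}
TE_DOMAIN = {"QUAD": 0, "TRI": 1, "ISOLINE": 2}
TE_MODE = {"HW_TESS": 0, "SW_TESS": 1}


def pack_te_dw1(partitioning, topology, domain, mode="HW_TESS", enable=True):
    """Pack DWord 1 of 3DSTATE_TE (hypothetical helper)."""
    # LINE output is only valid with the ISOLINE domain.
    if topology == "LINE" and domain != "ISOLINE":
        raise ValueError("LINE topology requires the ISOLINE domain")
    # Triangle output is not valid with the ISOLINE domain.
    if topology in ("TRI_CW", "TRI_CCW") and domain == "ISOLINE":
        raise ValueError("triangle topologies are not valid for ISOLINE")
    return (PARTITIONING[partitioning] << 12    # bits 13:12
            | OUTPUT_TOPOLOGY[topology] << 8    # bits 9:8
            | TE_DOMAIN[domain] << 4            # bits 5:4
            | TE_MODE[mode] << 1                # bits 2:1
            | int(bool(enable)))                # bit 0


if __name__ == "__main__":
    print(hex(pack_te_dw1("ODD_FRACTIONAL", "TRI_CCW", "TRI")))
```

Rejecting invalid combinations at pack time is a design choice of this sketch; the hardware documentation only states which combinations are not valid.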
2 31:0 Maximum Tessellation Factor Odd
Format: IEEE_Float
This field specifies the maximum TessFactor for ODD_FRACTIONAL partitioning when in
HW_TESS mode.
Value Name Description
427c0000h 63 Per API Spec, for normal operation software should set this
value to 63.0
[40400000h,427c0000h] Reserved Reserved.
Programming Notes
Note that ISOLINE's LineDensity TF is always subjected to INTEGER partitioning regardless of
the Partitioning state.
3 31:0 Maximum Tessellation Factor Not Odd
Format: IEEE_Float
This field specifies the maximum TessFactor for EVEN_FRACTIONAL or INTEGER partitioning
when in HW_TESS mode.
Value Name Description
42800000h 64 Per API Spec, for normal operation software should set this
value to 64.0
[40000000h,42800000h] Reserved Reserved
Programming Notes
Note that ISOLINE's LineDensity TF is always subjected to INTEGER partitioning regardless of
the Partitioning state.
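The hexadecimal values quoted for the two Maximum Tessellation Factor fields are the IEEE-754 single-precision bit patterns of 63.0 and 64.0. A short sketch (the helper name is illustrative) shows how a driver might derive the DWord contents from a float rather than hard-coding the constants:

```python
import struct


def float_to_dword(value):
    """Return the 32-bit IEEE-754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


if __name__ == "__main__":
    print(hex(float_to_dword(63.0)))  # Maximum Tessellation Factor Odd
    print(hex(float_to_dword(64.0)))  # Maximum Tessellation Factor Not Odd
```

The lower bounds of the listed ranges, 40400000h and 40000000h, correspond to 3.0 and 2.0 in the same encoding.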
3DSTATE_URB_DS
Source: RenderCS
Length Bias: 2
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_VS, 3DSTATE_URB_HS, and 3DSTATE_URB_GS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 32h 3DSTATE_URB_DS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 DS URB Starting Address
Format: U5
Offset from the start of the URB memory where DS starts its allocation, specified in multiples of
8 KB.
Value
[0,11]
24:16 DS URB Entry Allocation Size
Format: U9-1 Count of 512-bit units
Specifies the length of each URB entry owned by DS. This field is always used (even if DS
Function Enable is DISABLED).
Value
[0,9]
15:0 DS Number of URB Entries
Specifies the number of URB entries that are used by DS. This field is always used
(even if DS Function Enable is DISABLED).
If Domain Shader Thread Dispatch is Enabled then the minimum number of handles
that must be allocated is 10 URB entries.
Value
[0,288]
Programming Notes
DS Number of URB Entries must be divisible by 8 if the DS URB Entry Allocation Size is
programmed to a value less than 9, which is 10 512-bit URB entries. "2:0" = reserved "000"
3DSTATE_URB_GS
Source: RenderCS
Length Bias: 2
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_VS, 3DSTATE_URB_HS, and 3DSTATE_URB_DS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 33h 3DSTATE_URB_GS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 GS URB Starting Address
Format: U5
Offset from the start of the URB memory where GS starts its allocation, specified in multiples of 8
KB.
Value
[0,11]
24:16 GS URB Entry Allocation Size
Format: U9-1 512-bit units
Specifies the length of each URB entry owned by GS. This field is always used (even if GS
Function Enable is DISABLED).
15:0 GS Number of URB Entries
Specifies the number of URB entries that are used by GS. This field is always used (even if GS
Function Enable is DISABLED).
Value
[0,192]
Programming Notes
Only if GS is disabled can this field be programmed to 0.
If GS is enabled this field shall be programmed to a value greater than 0. For GS Dispatch Mode
"Single", this field shall be programmed to a value greater than or equal to 1. For other GS
Dispatch Modes, refer to the definition of Dispatch Mode (3DSTATE_GS) for minimum values of
this field.
GS Number of URB Entries must be divisible by 8 if the GS URB Entry Allocation Size is less than
9 512-bit URB entries.
"2:0" = reserved "000"
3DSTATE_URB_HS
Source: RenderCS
Length Bias: 2
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_VS, 3DSTATE_URB_DS, and 3DSTATE_URB_GS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 31h 3DSTATE_URB_HS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 HS URB Starting Address
Format: U5
Offset from the start of the URB memory where HS starts its allocation, specified in multiples of
8 KB.
Value
[0,11]
24:16 HS URB Entry Allocation Size
Format: U9-1 Count of 512-bit units
Specifies the length of each URB entry owned by HS. This field is always used (even if HS
Function Enable is DISABLED).
15:0 HS Number of URB Entries
Specifies the number of URB entries that are used by HS. This field is always used (even
if HS Function Enable is DISABLED).
Programming Restriction: HS Number of URB Entries must be divisible by 8 if the HS
URB Entry Allocation Size is less than 9 512-bit URB entries. "2:0" = reserved "000"
Value
[0,32]
3DSTATE_URB_VS
Source: RenderCS
Length Bias: 2
Description
VS URB Entry Allocation Size equal to 4(5 512-bit URB rows) may cause performance to decrease due
to banking in the URB. Element sizes of 16 to 20 should be programmed with six 512-bit URB rows.
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_HS, 3DSTATE_URB_DS, and 3DSTATE_URB_GS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 30h 3DSTATE_URB_VS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 VS URB Starting Address
Format: U5
Offset from the start of the URB memory where VS starts its allocation, specified in multiples of
8 KB.
Value
[0,11]
24:16 VS URB Entry Allocation Size
Format: U9-1 count of 512-bit units
Specifies the length of each URB entry owned by VS. This field is always used (even if VS
Function Enable is DISABLED).
Programming Notes
Programming Restriction: As the VS URB entry serves as both the per-vertex input and output
of the VS shader, the VS URB Allocation Size must be sized to the maximum of the vertex input
and output structures.
15:0 VS Number of URB Entries
Format: U16
Specifies the number of URB entries that are used by VS. This field is always used (even if VS
Function Enable is DISABLED).
Value
[32,512]
Programming Notes
Programming Restriction: VS Number of URB Entries must be divisible by 8 if the VS URB Entry
Allocation Size is less than 9 512-bit URB entries. "2:0" = reserved "000b"
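The four 3DSTATE_URB_{VS,HS,DS,GS} commands share the same DWord 1 layout, so a single packing routine can serve all of them. The sketch below is an assumption-laden illustration, not driver code: the function name is hypothetical, the "U9-1" format is taken to mean that the field stores the count of 512-bit units minus one, and the divisible-by-8 rule is applied when the entry size is below nine 512-bit units (the DS wording differs slightly, so the threshold should be checked against the hardware documentation).

```python
def pack_urb_dw1(start_8kb, entry_size_units, num_entries):
    """Pack DWord 1 of a 3DSTATE_URB_* command (hypothetical helper).

    start_8kb        -- starting address in multiples of 8 KB (bits 29:25)
    entry_size_units -- entry size as a count of 512-bit units (bits 24:16, stored minus 1)
    num_entries      -- number of URB entries (bits 15:0)
    """
    if not 0 <= start_8kb <= 0x1F:
        raise ValueError("starting address does not fit in 5 bits")
    if not 1 <= entry_size_units <= 512:
        raise ValueError("entry size must be between 1 and 512 units")
    if not 0 <= num_entries <= 0xFFFF:
        raise ValueError("number of entries does not fit in 16 bits")
    # Assumed reading of the programming restriction: small entries must be
    # allocated in multiples of 8.
    if entry_size_units < 9 and num_entries % 8 != 0:
        raise ValueError("number of entries must be divisible by 8")
    return (start_8kb << 25) | ((entry_size_units - 1) << 16) | num_entries


if __name__ == "__main__":
    print(hex(pack_urb_dw1(start_8kb=2, entry_size_units=2, num_entries=64)))
```

Per-stage limits such as the [32,512] range for VS or [0,192] for GS are not enforced here; a caller would add them per command.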
3DSTATE_VERTEX_BUFFERS
Source: RenderCS
Length Bias: 2
Description
This command is used to specify VB state used by the VF function.
Can specify from 1 to 33 VBs.
The VertexBufferID field within a VERTEX_BUFFER_STATE structure indicates the specific VB. If
a VB definition is not included in this command, its associated state is left unchanged and is
available for use if previously defined.
Programming Notes
It is possible to have individual vertex elements sourced completely from generated ID values and
therefore not require any vertex buffer accesses for that vertex element. In this case, VF function will
simply ignore the VB state associated with that vertex element. If all enabled vertex elements have
this characteristic, no VBs are required to process 3DPRIMITIVE commands. For example, this might
arise when the user wants to perform all data lookups in the first shader, so only generated index
values need to be passed down to it. In this extreme case, SW would not need to program any VB
state, and therefore not need to issue any 3DSTATE_VERTEX_BUFFERS commands.
For any 3DSTATE_VERTEX_BUFFERS command, at least one VERTEX_BUFFER_STATE structure must be included.
VERTEX_BUFFER_STATE structures are 4 DWords for both VERTEXDATA buffers and INSTANCEDATA buffers.
Inclusion of partial VERTEX_BUFFER_STATE structures is UNDEFINED.
The order in which VBs are defined within this command can be arbitrary, though a vertex buffer must be
defined only once in any given command (otherwise operation is UNDEFINED).
DWord Bit Description
0 31:29 Command Type
Default Value: 03h GFXPIPE
Format: Opcode
28:27 Command SubType
Default Value: 3h 3D
Format: Opcode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_VERTEX_BUFFERS
Format: Opcode
23:16 3D Command Sub Opcode
Default Value: 08h 3DSTATE_VERTEX_BUFFERS
Format: Opcode
15:8 Reserved
7:0 DWord Count
Default Value: 3 DWORD_COUNT_n
Format: =n
n = 4b-1 (where b = # of buffer states included)
1..n 127:0 Vertex Buffer State [n]
Format: VERTEX_BUFFER_STATE
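The DWord Count formula n = 4b-1 follows from the Length Bias of 2: the command is one header DWord plus four DWords per VERTEX_BUFFER_STATE, and the field holds the total length minus two. A minimal sketch, with a hypothetical helper name and the 1 to 33 buffer limit taken from the description above:

```python
def vertex_buffers_dword_count(num_buffers):
    """Return the DWord Count field for 3DSTATE_VERTEX_BUFFERS."""
    if not 1 <= num_buffers <= 33:
        raise ValueError("between 1 and 33 vertex buffers may be specified")
    total_dwords = 1 + 4 * num_buffers  # header + 4 DWords per buffer state
    return total_dwords - 2             # Length Bias is 2


if __name__ == "__main__":
    for b in (1, 2, 33):
        print(b, vertex_buffers_dword_count(b))
```

For a single buffer this yields 3, which matches the default value listed for the field.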
3DSTATE_VERTEX_ELEMENTS
Source: RenderCS
Length Bias: 2
Description
This is a variable-length command used to specify the active vertex elements. Each
VERTEX_ELEMENT_STATE structure contains a Valid bit which determines which elements are
used.
Up to 34 elements.
Programming Notes
At least one VERTEX_ELEMENT_STATE structure must be included.
Inclusion of partial VERTEX_ELEMENT_STATE structures is UNDEFINED.
SW must ensure that at least one vertex element is defined prior to issuing a 3DPRIMITIVE
command, or operation is UNDEFINED.
There are no 'holes' allowed in the destination vertex: NOSTORE components must be
overwritten by subsequent components unless they are the trailing DWords of the vertex.
Software must explicitly choose some value (probably 0) to be written into DWords that would
otherwise be 'holes'.
Within a VERTEX_ELEMENT_STATE structure, if a Component Control field is set to something
other than VFCOMP_STORE_SRC, no higher-numbered Component Control fields may be set
to VFCOMP_STORE_SRC. In other words, only trailing components can be set to something
other than VFCOMP_STORE_SRC.
See additional restrictions listed in the command fields and VERTEX_ELEMENT_STATE
description.
Element[0] must be valid.
All elements must be valid from Element[0] to the last valid element. (I.e. if Element[2] is valid
then Element[1] and Element[0] must also be valid).
The pitch between elements packed in the URB will always be 128 bits.
DWord Bit Description
0 31:29 Command Type
Default Value: 03h GFXPIPE
Format: Opcode
28:27 Command SubType
Default Value: 3h 3D
Format: Opcode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_VERTEX_ELEMENTS
Format: Opcode
23:16 3D Command Sub Opcode
Default Value: 09h 3DSTATE_VERTEX_ELEMENTS
Format: Opcode
15:8 Reserved
7:0 DWord Count
Format: =n
Vertex Element Count = (DWord Count + 1) / 2
Value Name Description
1 DWORD_COUNT_n [Default] excludes DWords 0,1
[1,66] Range 1-34 Elements
1..n 63:0 Element [n]
Format: VERTEX_ELEMENT_STATE
3DSTATE_VF_STATISTICS
Source: RenderCS
Length Bias: 1
The VF stage tracks two pipeline statistics, the number of vertices fetched and the number of objects generated.
VF will increment the appropriate counter for each when statistics gathering is enabled by issuing the
3DSTATE_VF_STATISTICS command with the [Statistics Enable] bit set.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: Opcode
28:27 Command SubType
Format: Opcode
Value Name
1h Pipelined, Single DWord [Default]
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: Opcode
GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] (Pipelined, Single DWord)
23:16 3D Command Sub Opcode
Default Value: 0Bh 3DSTATE_VF_STATISTICS
Format: Opcode
GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] (Pipelined, Single DWord)
15:1 Reserved
Format: MBZ
0 Statistics Enable
Format: Enable
If ENABLED, VF will increment the pipeline statistics counters IA_VERTICES_COUNT and
IA_PRIMITIVES_COUNT for each vertex fetched and each object output, respectively, for
3DPRIMITIVE commands issued subsequently.
If DISABLED, these counters will not be incremented for subsequent 3DPRIMITIVE commands.
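Because 3DSTATE_VF_STATISTICS is a single-DWord command, the whole packet can be assembled from the opcode fields listed above plus the enable bit. The following sketch uses a hypothetical function name; the field values (3h, 1h, 0h, 0Bh) come from the table.

```python
def vf_statistics_dword(enable):
    """Build the single DWord of 3DSTATE_VF_STATISTICS."""
    return ((0x3 << 29)      # bits 31:29: Command Type = GFXPIPE
            | (0x1 << 27)    # bits 28:27: Command SubType = Pipelined, Single DWord
            | (0x0 << 24)    # bits 26:24: 3D Command Opcode
            | (0x0B << 16)   # bits 23:16: 3D Command Sub Opcode
            | int(bool(enable)))  # bit 0: Statistics Enable


if __name__ == "__main__":
    print(hex(vf_statistics_dword(True)))
    print(hex(vf_statistics_dword(False)))
```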
3DSTATE_VIEWPORT_STATE_POINTERS_CC
Source: RenderCS
Length Bias: 2
The 3DSTATE_VIEWPORT_STATE_POINTERS_CC command is used to define the location of fixed functions'
viewport state table.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 23h 3DSTATE_VIEWPORT_STATE_POINTERS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 CC Viewport Pointer
Format: DynamicStateOffset[31:5]CC_VIEWPORT*16
Specifies the 32-byte aligned address offset of the CC_VIEWPORT state. This offset is relative to
the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP
Source: RenderCS
Length Bias: 2
The 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP command is used to define the location of fixed functions'
viewport state table.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 21h 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:6 SF Clip Viewport Pointer
Format: DynamicStateOffset[31:6]SF_CLIP_VIEWPORT*16
Specifies the 64-byte aligned address offset of the SF_CLIP_VIEWPORT state. This offset is
relative to the Dynamic State Base Address.
5:0 Reserved
Format: MBZ
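Both viewport pointer commands store a byte offset from the Dynamic State Base Address in the upper bits of DWord 1, with the low bits reserved as MBZ. In practice this means the DWord equals the offset itself, provided the offset is suitably aligned (32 bytes for CC_VIEWPORT, 64 bytes for SF_CLIP_VIEWPORT). The helper below is a hypothetical sketch of that check:

```python
def encode_state_pointer(offset, alignment):
    """Validate a Dynamic State offset and return the DWord 1 value.

    alignment is 32 for the CC Viewport Pointer and 64 for the
    SF Clip Viewport Pointer.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset must fit in 32 bits")
    if offset % alignment:
        raise ValueError("offset is not aligned; low bits must be zero")
    return offset


if __name__ == "__main__":
    print(hex(encode_state_pointer(0x1020, 32)))  # acceptable for CC_VIEWPORT
    print(hex(encode_state_pointer(0x1040, 64)))  # acceptable for SF_CLIP_VIEWPORT
```

An offset such as 0x1020 is valid for the CC pointer but would be rejected for the SF_CLIP pointer, because bit 5 is reserved there.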
3DSTATE_VS
Source: RenderCS
Length Bias: 2
Description
The state used by VS is defined with this inline state packet.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 10h 3DSTATE_VS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 4h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:6 Kernel Start Pointer
Format: InstructionBaseOffset[31:6]Kernel
This field specifies the starting location (1st GEN4 core instruction) of the kernel program run by
threads spawned by this FF unit. It is specified as a 64-byte-granular offset from the Instruction
Base Address. This field is ignored if VS Function Enable is DISABLED.
5:0 Reserved
Format: MBZ
2 31 Single Vertex Dispatch
Format: U1 Enumerated type
This field can be used to force single vertex SIMD4x2 VS threads.
Value Name Description
0h Multiple Dual vertex SIMD4x2 thread dispatches are allowed.
1h Single Single vertex SIMD4x2 thread dispatches are forced.
30 Vector Mask Enable (VME)
When SPF=0, VME specifies which mask to use to initialize the initial channel enables. When
SPF=1, VME specifies which mask to use to generate execution channel enables.
Value Name Description
0h Dmask Channels are enabled based on the dispatch mask
1h Vmask Channels are enabled based on the vector mask
29:27 Sampler Count
Specifies how many samplers (in multiples of 4) the vertex shader 0 kernel uses. Used only for
prefetching the associated sampler state entries. This field is ignored if VS Function Enable is
DISABLED.
Value Name Description
0h No Samplers no samplers used
1h 1-4 Samplers between 1 and 4 samplers used
2h 5-8 Samplers between 5 and 8 samplers used
3h 9-12 Samplers between 9 and 12 samplers used
4h 13-16 Samplers between 13 and 16 samplers used
26 Reserved
Format: MBZ
25:18 Binding Table Entry Count
Format: U8
Specifies how many binding table entries the kernel uses. Used only for prefetching of the
binding table entries and associated surface state.
Note: For kernels using a large number of binding table entries, it may be wise to set this field to
zero to avoid prefetching too many entries and thrashing the state cache.
This field is ignored if VS Function Enable is DISABLED.
Value
[0,255]
17 Reserved
Format: MBZ
16 Floating Point Mode
Format: U1 enumerated type
Specifies the initial floating point mode used by the dispatched thread. This field is ignored if VS
Function Enable is DISABLED.
Value Name Description
0h IEEE-754 Use IEEE-754 Rules
1h Alternate Use alternate rules
15:14 Reserved
Format: MBZ
13 Illegal Opcode Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA
Execution. This field is ignored if VS Function Enable is DISABLED.
12 Reserved
Format: MBZ
11:8 Reserved
Format: MBZ
7 Software Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[13] (note the bit # difference). See Exceptions and ISA
Execution. This field is ignored if VS Function Enable is DISABLED.
6:0 Reserved
Format: MBZ
3 31:10 Scratch Space Base Offset
Format: GeneralStateOffset[31:10]ScratchSpace
Specifies the starting location of the scratch space area allocated to this FF unit as a 1K-byte
aligned offset from the General State Base Address. If required, each thread spawned by this FF
unit will be allocated some portion of this space, as specified by Per-Thread Scratch Space. The
computed offset of the thread-specific portion will be passed in the thread payload as Scratch
Space Offset. The thread is expected to utilize "stateless" DataPort read/write requests to access
scratch space, where the DataPort will cause the General State Base Address to be added to the
offset passed in the request header.
This field is ignored if VS Function Enable is DISABLED.
9:4 Reserved
Format: MBZ
3:0 Per-Thread Scratch Space
Format: U4 power of 2 Bytes over 1K Bytes
Specifies the amount of scratch space to be allocated to each thread spawned by this FF unit.
The driver must allocate enough contiguous scratch space, starting at the Scratch Space Base
Pointer, to ensure that the Maximum Number of Threads can each get Per-Thread Scratch Space
size without exceeding the driver-allocated scratch space. This field is ignored if VS Function
Enable is DISABLED.
Value Name Description
[0,11] indicating [1K Bytes, 2M Bytes]
Programming Notes
This amount is available to the kernel for information only. It will be passed verbatim (if not
altered by the kernel) to the Data Port in any scratch space access messages, but the Data Port
will ignore it.
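The Per-Thread Scratch Space encoding ("power of 2 Bytes over 1K Bytes") maps a field value v in [0,11] to 1 KB shifted left by v, which gives the stated range of 1K to 2M bytes. The sketch below shows that mapping and the resulting total allocation a driver would need; both helper names are hypothetical.

```python
def per_thread_scratch_bytes(field_value):
    """Decode the Per-Thread Scratch Space field into a byte count."""
    if not 0 <= field_value <= 11:
        raise ValueError("Per-Thread Scratch Space must be in [0,11]")
    return 1024 << field_value


def total_scratch_bytes(field_value, max_threads):
    """Contiguous scratch space needed so every thread gets its share."""
    if max_threads < 1:
        raise ValueError("at least one thread is required")
    return per_thread_scratch_bytes(field_value) * max_threads


if __name__ == "__main__":
    print(per_thread_scratch_bytes(0))    # smallest allocation
    print(per_thread_scratch_bytes(11))   # largest allocation
    print(total_scratch_bytes(2, 16))     # 4 KB per thread, 16 threads
```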
4 31:25 Reserved
Format: MBZ
24:20 Dispatch GRF Start Register for URB Data
Format: U5
Specifies the starting GRF register number for the URB portion (Constant + Vertices) of the
thread payload. This field is ignored if VS Function Enable is DISABLED.
Value Name Description
[0,31] indicating GRF [R0,R31]
19:17 Reserved
Format: MBZ
16:11 Vertex URB Entry Read Length
Format: U6
Specifies the number of pairs of 128-bit vertex elements to be passed into the payload
for each vertex. This field is ignored if VS Function Enable is DISABLED.
For SIMD4x2 dispatch, each vertex element requires one GRF of payload data, therefore
the number of GRFs with vertex data will be double the value programmed in this field.
Value
[1,63]
Programming Notes
It is UNDEFINED to set this field to 0 indicating no Vertex URB data to be read and passed to
the thread.
10 Reserved
Format: MBZ
9:4 Vertex URB Entry Read Offset
Format: U6
Specifies the offset (in 256-bit units) at which Vertex URB data is to be read from the URB before
being included in the thread payload. This offset applies to all Vertex URB entries passed to the
thread. This field is ignored if VS Function Enable is DISABLED.
Value
[0,63]
3:0 Reserved
Format: MBZ
5 31:25 Maximum Number of Threads
Format: U7-1 representing thread count
Specifies the maximum number of simultaneous threads allowed to be active. Used to avoid
using up the scratch space. Programming this value above the number of threads supported in
the execution units may improve performance, since the architecture allows threads to be
buffered between the check for max threads and the actual dispatch into the EU. Programming
it below the number of threads supported in the execution units may reduce performance.
This field is ignored if VS Function Enable is DISABLED.
Value Name Description
[0,15] indicating thread count of [1,16]
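The "U7-1" format means the field stores the thread count minus one, so the documented value range [0,15] corresponds to 1 through 16 threads. A small sketch of the encoding into bits 31:25 of DWord 5 (the function name is illustrative only):

```python
def encode_max_threads(thread_count):
    """Encode Maximum Number of Threads into bits 31:25 of DWord 5."""
    if not 1 <= thread_count <= 16:
        raise ValueError("thread count must be in [1,16]")
    return (thread_count - 1) << 25


if __name__ == "__main__":
    print(hex(encode_max_threads(1)))
    print(hex(encode_max_threads(16)))
```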
24:23 Reserved
Format: MBZ
22:11 Reserved
Format: MBZ
10 Statistics Enable
Format: Enable
If ENABLED, this FF unit will engage in statistics gathering. See the Statistics Gathering
section later in this chapter. If DISABLED, statistics information associated with this FF
stage will be left unchanged.
This field is used even if VS Function Enable is DISABLED.
9:3 Reserved
Format: MBZ
2 Reserved
Format: MBZ
1 Vertex Cache Disable
Format: Disable
This bit controls the operation of the Vertex Cache. This field is always used. If the Vertex Cache
is DISABLED and the VS Function is ENABLED, the Vertex Cache is not used and all incoming
vertices will be passed to VS threads.
If the Vertex Cache is ENABLED and the VS Function is ENABLED, incoming vertices that do not
hit in the Vertex Cache will be passed to VS threads.
If the Vertex Cache is ENABLED and the VS Function is DISABLED, input vertices that miss in the
Vertex Cache will be assembled and written to the URB, though they pass through the VS stage
unmodified (not shaded).
The Vertex Cache is invalidated whenever the Vertex Cache becomes DISABLED, whenever the
VS Function Enable toggles, between 3DPRIMITIVE commands and between instances within a
3DPRIMITIVE command.
0 VS Function Enable
Format: Enable
If ENABLED, VS threads may be spawned to process VF-generated vertices before the
resulting vertices are passed down the pipeline.
If DISABLED, VF-generated vertices will pass through the VS function and be sent down the
pipeline unmodified. The Vertex Cache is still available in this mode, if enabled.
If Statistics Enable is ENABLED, VS_INVOCATION_COUNT will increment by 1 for every
vertex that passes through the VS stage, even if VS Function Enable is DISABLED.
This field is always used.
3DSTATE_WM
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 14h 3DSTATE_WM
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 01h Excludes DWord (0,1)
Format: =n Total Length - 2
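As a minimal sketch of how the DWord 0 fields combine, the following Python packs the 3DSTATE_WM header from the default values in the table. The helper name pack_3dstate_wm_header is hypothetical and not part of any driver API; the default DWord Length of 01h is taken from the table.

```python
def pack_3dstate_wm_header(dword_length: int = 0x01) -> int:
    """Pack DWord 0 of 3DSTATE_WM from the field values listed in the table."""
    if not 0 <= dword_length <= 0xFF:
        raise ValueError("DWord Length is an 8-bit field")
    command_type = 0x3      # bits 31:29, GFXPIPE
    command_subtype = 0x3   # bits 28:27, GFXPIPE_3D
    opcode = 0x0            # bits 26:24, 3DSTATE_PIPELINED
    sub_opcode = 0x14       # bits 23:16, 3DSTATE_WM
    # Bits 15:8 are reserved and must be zero, so nothing is placed there.
    return (
        (command_type << 29)
        | (command_subtype << 27)
        | (opcode << 24)
        | (sub_opcode << 16)
        | dword_length
    )


header = pack_3dstate_wm_header()
print(hex(header))  # 0x78140001
```

The same shift-and-OR pattern applies to the other command headers in this chapter; only the opcode, sub-opcode and length values change.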
1 31 Statistics Enable
Format: Enable
If ENABLED, the Windower and pixel pipeline will engage in statistics gathering. If DISABLED,
statistics information associated with this FF stage will be left unchanged. See Statistics
Gathering.
30 Depth Buffer Clear
Format: Enable
When set, the depth buffer is initialized as a side-effect of rendering pixels.
Programming Notes
If this field is enabled:
• The Depth Test Enable field in DEPTH_STENCIL_STATE must be disabled.
• 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.
• 3DSTATE_DEPTH_BUFFER::Stencil Write Enable must be set if
3DSTATE_STENCIL_BUFFER::Stencil buffer enable is set. Additionally, the following must
be set to the stated values:
  • DEPTH_STENCIL_STATE::Stencil Write Mask must be 0xFF
  • DEPTH_STENCIL_STATE::Stencil Test Mask must be 0xFF
  • DEPTH_STENCIL_STATE::Back Face Stencil Write Mask must be 0xFF
  • DEPTH_STENCIL_STATE::Back Face Stencil Test Mask must be 0xFF
• Pixel Shader Kill Pixel must be disabled.
Refer to the "Depth Buffer Clear" section for additional restrictions when this field is enabled.
29 Thread Dispatch Enable
Format: Enable
This bit, if set, indicates that it is possible for a PS thread to modify a render target, i.e., at least
one render target is enabled (is not of type SURFTYPE_NULL and has at least one channel
enabled for writes) and the PS kernel contains a code path that may issue a write to that/those
enabled RTs.
Programming Notes
This bit is used for performance optimizations and does not directly control writing to render
targets. If this bit is DISABLED, no pixel shader threads will be dispatched. For correct behavior,
this bit must be set consistently with the behavior of the PS kernel, i.e. if this bit is DISABLED
the PS kernel must not write color or depth to any render targets. If this field is disabled, Pixel
Shader Kill Pixel must be disabled.
28 Depth Buffer Resolve Enable
Format: Enable
When set, the depth buffer is made to be consistent with the hierarchical depth buffer as a side-
effect of rendering pixels. This is intended to be used when the depth buffer is to be used as a
surface outside of the 3D rendering operation.
Programming Notes
If this field is enabled:
• The Depth Buffer Clear and Hierarchical Depth Buffer Resolve Enable fields must
both be disabled.
• 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.
Refer to section 11.5.4.2 "Depth Buffer Resolve" for additional restrictions when this field is
enabled. If Hierarchical Depth Buffer Enable is disabled, enabling this field will have no effect.
27 Hierarchical Depth Buffer Resolve Enable
Format: Enable
When set, the hierarchical depth buffer is made to be consistent with the depth buffer as a side-
effect of rendering pixels. This is intended to be used when the depth buffer has been modified
outside of the 3D rendering operation.
Programming Notes
If this field is enabled:
• The Depth Buffer Clear and Depth Buffer Resolve Enable fields must both be
disabled.
• 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.
Refer to section 11.5.4.3 "Hierarchical Depth Buffer Resolve" for additional restrictions
when this field is enabled.
If Hierarchical Depth Buffer Enable is disabled, enabling this field will have no effect.
Performance Note: expect the hierarchical depth buffer's impact on performance to
be reduced for some period of time after this operation is performed, as the
hierarchical depth buffer is initialized to a state that makes it ineffective. Further
rendering will tend to bring the hierarchical depth buffer back to a more effective
state.
Software needs to perform an ambiguate after allocating the surface for the first time if the
depth buffer width and height are NOT aligned to 8 and 4 respectively.
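The cross-field rules stated for bits 30 through 27 can be checked mechanically before a command is emitted. The sketch below is only an illustration: the WmFlags class and check_wm_flags function are hypothetical names, and it covers only the constraints between the 3DSTATE_WM bits themselves (not the DEPTH_STENCIL_STATE or 3DSTATE_DEPTH_BUFFER requirements).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WmFlags:
    """Hypothetical container for a few 3DSTATE_WM DWord 1 enables."""
    depth_buffer_clear: bool = False      # bit 30
    thread_dispatch: bool = True          # bit 29
    depth_buffer_resolve: bool = False    # bit 28
    hiz_resolve: bool = False             # bit 27
    ps_kill_pixel: bool = False           # bit 25


def check_wm_flags(flags: WmFlags) -> List[str]:
    """Return a list of violated programming notes (empty if none)."""
    errors = []
    special_ops = [flags.depth_buffer_clear,
                   flags.depth_buffer_resolve,
                   flags.hiz_resolve]
    # Each resolve requires the other two operations to be disabled,
    # so at most one of the three may be set at a time.
    if sum(special_ops) > 1:
        errors.append("clear / depth resolve / HiZ resolve are mutually exclusive")
    if flags.depth_buffer_clear and flags.ps_kill_pixel:
        errors.append("Pixel Shader Kill Pixel must be disabled during a depth clear")
    if not flags.thread_dispatch and flags.ps_kill_pixel:
        errors.append("Pixel Shader Kill Pixel requires Thread Dispatch Enable")
    return errors


print(check_wm_flags(WmFlags(depth_buffer_clear=True, hiz_resolve=True)))
```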
26 Legacy Diamond Line Rasterization
Format: Enable
This bit, if ENABLED, indicates that the Windower will rasterize zero width lines using the DX9
rasterization rules. If DISABLED, the Windower will rasterize zero width lines using the DX10
rasterization rules (see Strips Fans chapter).
25 Pixel Shader Kill Pixel
Format: Enable
This bit, if ENABLED, indicates that the PS kernel or color calculator has the ability to kill
(discard) pixels or samples, other than due to depth or stencil testing. This bit is required
to be ENABLED in the following situations:
• The API pixel shader program contains "killpix" or "discard" instructions, or other code in
the pixel shader kernel that can cause the final pixel mask to differ from the pixel mask
received on dispatch.
• A sampler with chroma key enabled with kill pixel mode is used by the pixel shader.
• Any render target has Alpha Test Enable or AlphaToCoverage Enable enabled.
• The pixel shader kernel generates and outputs oMask.
Note: As ClipDistance clipping is fully supported in hardware and therefore not via PS
instructions, there should be no need to ENABLE this bit due to ClipDistance clipping.
24:23 Pixel Shader Computed Depth Mode
Format: U2 Enumerated Type
This field specifies the computed depth mode for the pixel shader.
Value Name Description
0h PSCDEPTH_OFF Pixel shader does not compute depth
1h PSCDEPTH_ON Pixel shader computes depth with no guarantee as to its value
2h PSCDEPTH_ON_GE Pixel shader computes depth and guarantees that oDepth >= SourceDepth
3h PSCDEPTH_ON_LE Pixel shader computes depth and guarantees that oDepth <= SourceDepth
Programming Notes
When bit 5 is set in WM_STATE (RT independent rasterization is enabled), this field cannot
be programmed to values 2h or 3h.
22:21 Early Depth/Stencil Control
Format: U2 Enumerated Type
This field specifies the behavior of early depth/stencil test.
Value Name Description
0h EDSC_NORMAL Depth/Stencil Test/Write behaves as if it happens post-shader,
   however the pixel shader is not necessarily executed if the
   pixel fails depth or stencil test (this is the legacy behavior)
1h EDSC_PSEXEC Depth/Stencil Test/Write behaves as if it happens post-shader,
   and the pixel shader is executed if the pixel fails depth or
   stencil test (although pre-shader actions such as primitive
   inclusion, stipple, etc. will still cause the shader not to execute)
2h EDSC_PREPS Depth/Stencil Test/Write behaves as if it happens pre-shader.
   The pixel shader is not executed if the pixel fails depth or
   stencil test. Depth and stencil writes occur even if the pixel is
   killed by the shader or post-shader by alpha test, etc. Depth
   output by the pixel shader is ignored.
3h Reserved
Programming Notes
If EDSC_PSEXEC mode is selected, Thread Dispatch Enable must be set.
Restriction
Restriction: When value of "2h" is programmed, PS_INVOCATION_COUNT may not be
accurate.
20 Pixel Shader Uses Source Depth
Format: Enable
This bit, if ENABLED, indicates that the PS kernel requires the source depth value (vPos.z) to be
passed in the payload. The source depth value is interpolated according to the Position ZW
Interpolation Mode state.
19 Pixel Shader Uses Source W
Format: Enable
This bit, if ENABLED, indicates that the PS kernel requires the interpolated source W value
(vPos.w) to be passed in the payload. The W value is interpolated according to the Position ZW
Interpolation Mode state.
18:17 Position ZW Interpolation Mode
Format: U2 Enumerated Type
This field selects the "interpolation mode" associated with the Position Z (source depth) and W
coordinates passed in the PS payload when the PS requires Position as input. This field does not
determine whether these coordinates are actually included in the payload (see Pixel Shader
Uses Source Depth, Pixel Shader Uses Source W).
Value Name Description
0h INTERP_PIXEL Evaluate Z & W at the pixel center or UL corner (as
   specified by Pixel Location of 3DSTATE_MULTISAMPLE)
1h Reserved
2h INTERP_CENTROID
3h INTERP_SAMPLE
Programming Notes
When bit 5 is set in WM_STATE, value of 3h is not defined for this field.
When bit 5 in dword 1 (RT Independent Rasterization Enable) is set and bit
30 in dword 2 (PS UAV-only) is not set in WM_STATE, value of 3h is not defined for this field.
16:11 Barycentric Interpolation Mode
Format: Enable[6]
Controls which barycentric interpolation terms must be passed into the pixel shader kernel.
Bit 0: Perspective Pixel Location barycentric is required
Bit 1: Perspective Centroid barycentric is required
Bit 2: Perspective Sample barycentric is required
Bit 3: Non-perspective Pixel Location barycentric is required
Bit 4: Non-perspective Centroid barycentric is required
Bit 5: Non-perspective Sample barycentric is required
Programming Notes
If contiguous dispatch modes are enabled, only bit 3 (non-perspective pixel location) can be
set; all other bits in this field must be zero. Pixel Location refers to either the upper left
corner or the pixel center, depending on the Pixel Location state of 3DSTATE_MULTISAMPLE.
MSDISPMODE_PERSAMPLE is required in order to select Perspective Sample or
Non-perspective Sample barycentric coordinates.
Restriction: When Centroid Barycentric mode is required, HW may produce incorrect
interpolation results when a 2x2 pixel block has unlit pixels.
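The six enable bits of the Barycentric Interpolation Mode field map naturally onto a flag set. The following is a sketch under stated assumptions: BaryMode and encode_bary are hypothetical names, the field is placed at DWord 1 bits 16:11 as listed above, and only the per-sample dispatch requirement from the programming notes is checked.

```python
from enum import IntFlag


class BaryMode(IntFlag):
    """Bit positions within the 6-bit field, as listed in the description."""
    PERSPECTIVE_PIXEL = 1 << 0
    PERSPECTIVE_CENTROID = 1 << 1
    PERSPECTIVE_SAMPLE = 1 << 2
    NONPERSPECTIVE_PIXEL = 1 << 3
    NONPERSPECTIVE_CENTROID = 1 << 4
    NONPERSPECTIVE_SAMPLE = 1 << 5


BARY_SHIFT = 11  # the field occupies DWord 1 bits 16:11


def encode_bary(modes: BaryMode, per_sample_dispatch: bool) -> int:
    """Return the field value shifted into its DWord 1 position."""
    sample_modes = BaryMode.PERSPECTIVE_SAMPLE | BaryMode.NONPERSPECTIVE_SAMPLE
    # Sample barycentrics are only selectable with MSDISPMODE_PERSAMPLE.
    if (modes & sample_modes) and not per_sample_dispatch:
        raise ValueError("sample barycentrics require MSDISPMODE_PERSAMPLE")
    return int(modes) << BARY_SHIFT


bits = encode_bary(BaryMode.PERSPECTIVE_PIXEL | BaryMode.NONPERSPECTIVE_CENTROID,
                   per_sample_dispatch=False)
print(hex(bits))  # 0x8800
```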
10 Pixel Shader Uses Input Coverage Mask
Format: Enable
This bit, if ENABLED, indicates that the PS kernel requires the input coverage mask to be passed
in the payload.
9:8 Line End Cap Antialiasing Region Width
Format: U2
This field specifies the distance over which the coverage of anti-aliased line end caps is
computed.
Value Name
0h 0.5 pixels
1h 1.0 pixels
2h 2.0 pixels
3h 4.0 pixels
7:6 Line Antialiasing Region Width
Format: U2
This field specifies the distance over which the anti-aliased line coverage is computed.
Value Name
0h 0.5 pixels
1h 1.0 pixels
2h 2.0 pixels
3h 4.0 pixels
5 Reserved
Format: MBZ
4 Polygon Stipple Enable
Format: Enable
Enables the Polygon Stipple function.
3 Line Stipple Enable
Format: Enable
Enables the Line Stipple function.
2 Point Rasterization Rule
Format: 3D_RasterizationRule
This field specifies the rasterization rules to be applied whenever the edges of a point primitive
fall exactly on a pixel sampling point.
Value Name Description
0h RASTRULE_UPPER_LEFT To match "normal" upper left rules for surface
   primitives
1h RASTRULE_UPPER_RIGHT To match OpenGL point rasterization rules (round to
   +infinity, where this is the upper right direction wrt
   OpenGL screen origin of lower left).
1:0 Multisample Rasterization Mode
Format: U2 Enumerated Type
This field determines whether multisample rasterization is turned on/off, and how the pixel
sample point(s) are defined. Software sets this according to the API, the API's multisample enable
state setting (if any), and whether 1X or 4X MSRTs are bound. This state is duplicated in
3DSTATE_SF and both must be set to the same value. Refer to the "Multisampling" section for
details on the settings of this field.
Value Name
0h MSRASTMODE_OFF_PIXEL
1h MSRASTMODE_OFF_PATTERN
2h MSRASTMODE_ON_PIXEL
3h MSRASTMODE_ON_PATTERN
2 31 Multisample Dispatch Mode
Format: U1 Enumerated Type
This bit, along with Number of Multisamples, determines how PS threads are dispatched.
Software programs this bit depending on the per-pixel vs. per-sample PS execution requirement.
When RT Independent Rasterization Enable = 1, value of 0h for this field is not allowed.
Value Name Description
0h MSDISPMODE_PERSAMPLE This is the high-quality DX10.1 multisample mode
   where (over and above PERPIXEL mode) the PS is
   run for each covered sample. This mode is also
   used for "normal" non-multisample rendering (aka
   1X), given Number of Multisamples is
   programmed to NUMSAMPLES_1.
1h MSDISPMODE_PERPIXEL This is the classic multisample mode of operation,
   typically used for both antialiasing and
   transparency. Setup and rasterization operate in
   full multisample mode, testing coverage and
   depth/stencil test at the sample level but only
   running the PS once per pixel.
30:0 Reserved
Format: MBZ
add - Addition
Source: EuIsa
Length Bias: 4
The add instruction performs component-wise addition of src0 and src1 and stores the results in dst.
Addition of two floating-point numbers follows rules in add (IEEE mode) or add (ALT mode).
Format:
[(pred)] add[.cmod] (exec_size) dst src0 src1
Programming Notes
Use a source modifier with add to implement subtraction.
Syntax
[(pred)] add[.cmod] (exec_size) reg reg reg
[(pred)] add[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    dst.channel[n] = src0.channel[n] + src1.channel[n];
  }
}
Predication Conditional Modifier Saturation Source Modifier
Y Y Y Y
Src Types Dst Types
*B,*W,*D *B,*W,*D
*B,*W,*D F
F F
DF DF
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
addc - Addition with Carry
Source: EuIsa
Length Bias: 4
The addc instruction performs component-wise addition of src0 and src1 and stores the results in dst; it also
stores the carry into acc.
If the operation produces a carry out, 0x00000001 is stored in acc, else 0x00000000 is stored in acc.
Format:
[(pred)] addc[.cmod] (exec_size) dst src0 src1
Restriction
Restriction: AccWrEn is required. The accumulator is an implicit destination and thus cannot be an explicit
destination operand.
Syntax
[(pred)] addc[.cmod] (exec_size) reg reg reg
[(pred)] addc[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    dst.channel[n] = src0.channel[n] + src1.channel[n];
    acc.channel[n] = carry(src0.channel[n] + src1.channel[n]);
  }
}
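The carry behavior can be modeled in a few lines of Python. This is a sketch only: it assumes every channel is enabled (no predication or write-enable mask), and the function name addc and its list-based interface are illustrative rather than taken from any tool.

```python
MASK32 = 0xFFFFFFFF


def addc(src0, src1):
    """Return (dst, acc) lists for a component-wise 32-bit unsigned add."""
    dst, acc = [], []
    for a, b in zip(src0, src1):
        total = (a & MASK32) + (b & MASK32)
        dst.append(total & MASK32)  # low 32 bits of the sum
        acc.append(total >> 32)     # 1 if the add carried out, else 0
    return dst, acc


dst, acc = addc([0xFFFFFFFF, 1], [1, 2])
print(dst, acc)  # [0, 3] [1, 0]
```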
Predication Conditional Modifier Saturation Source Modifier
Y Y N N
Src Types Dst Types
UD UD
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
asr - Arithmetic Shift Right
Source: EuIsa
Length Bias: 4
Description
Perform component-wise arithmetic right shift of the bits in src0 by the shift count indicated in src1,
storing the results in dst. If src0 has a signed type, insert copies of src0's sign bit in the number of
MSBs indicated by the shift count. Otherwise insert 0 bits.
The shift count is taken from the low five bits of src1, regardless of the src1 type and treated as an
unsigned integer in the range 0 to 31.
Format:
[(pred)] asr[.cmod] (exec_size) dst src0 src1
Programming Notes
If src0 is -1, the result is -1 regardless of the shift count.
For unsigned src0 types, asr and shr produce the same result.
Syntax
[(pred)] asr[.cmod] (exec_size) reg reg reg
[(pred)] asr[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    shiftCnt = src1.channel[n] & 0x1F; // Always use low 5 bits for shift count.
    if ( src0.channel[n] >= 0 ) {
      dst.channel[n] = src0.channel[n] >> shiftCnt;
    } else {
      int maskLSB = pow(2, shiftCnt) - 1;
      if ( (maskLSB & src0.channel[n]) == 0 ) {
        dst.channel[n] = sign(src0.channel[n]) * (abs(src0.channel[n]) >> shiftCnt);
      } else {
        dst.channel[n] = sign(src0.channel[n]) * (abs(src0.channel[n]) >> shiftCnt) - 1;
      }
    }
  }
}
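The shift rules above can be illustrated with a short Python model. It is a sketch under stated assumptions: all channels are enabled, operands are 32-bit, and the names asr and to_signed32 are hypothetical. Python's >> on a negative integer already rounds toward negative infinity, which matches the sign-fill behavior described for signed src0.

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def asr(src0, src1, signed=True):
    """Component-wise arithmetic shift right using the low 5 bits of src1."""
    out = []
    for value, count in zip(src0, src1):
        shift = count & 0x1F  # shift count is always in the range 0..31
        if signed:
            out.append(to_signed32(value) >> shift)  # copies of the sign bit shift in
        else:
            out.append((value & 0xFFFFFFFF) >> shift)  # zeros shift in, same as shr
    return out


print(asr([-5, -4, -1, 16], [1, 1, 31, 36]))  # [-3, -2, -1, 1]
```

The last element shows the masking of the shift count: 36 & 0x1F is 4, so 16 is shifted right by four bits.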
Predication Conditional Modifier Saturation Source Modifier
Y Y Y Y
Src Types Dst Types
*B,*W,*D *B,*W,*D
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
avg - Average
Source: EuIsa
Length Bias: 4
The avg instruction performs component-wise integer average of src0 and src1 and stores the results in dst. An
integer average uses integer upward rounding. It is equivalent to adding one to the sum of src0 and
src1 and then applying an arithmetic right shift by one bit to this intermediate value.
Format:
[(pred)] avg[.cmod] (exec_size) dst src0 src1
Syntax
[(pred)] avg[.cmod] (exec_size) reg reg reg
[(pred)] avg[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    dst.channel[n] = (src0.channel[n] + src1.channel[n] + 1) >> 1; // Use arithmetic shift right.
  }
}
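A minimal Python model of the rounding behavior follows. It assumes all channels are enabled and uses Python's unbounded integers for the intermediate sum, so overflow and saturation are not modeled; the function name avg is illustrative.

```python
def avg(src0, src1):
    """Component-wise integer average with upward rounding."""
    # Adding 1 before the arithmetic shift rounds halves toward +infinity.
    return [(a + b + 1) >> 1 for a, b in zip(src0, src1)]


print(avg([1, 2, -3], [2, 2, -4]))  # [2, 2, -3]
```

Note that the average of -3 and -4 is -3.5, which rounds upward to -3.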
Predication Conditional Modifier Saturation Source Modifier
Y Y Y Y
Src Types Dst Types
*B,*W,*D *B,*W,*D
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfe - Bit Field Extract
Source: EuIsa
Length Bias: 4
Component-wise extract a bit field from src2 using the bit field width from src0 and the bit field offset from
src1. Store the extracted bit field value in the low bits of dst and sign extend (if D type) or zero extend (if UD
type).
The width and offset values are from the low five bits of src0 and src1 respectively, or src0 & 0x1f and src1 &
0x1f.
If width is zero, the result is zero.
If offset + width > 32 then the extracted bit field is bits offset to 31 of src2, extracting only 32 - offset bits, less
than width as the bit field cannot extend past the MSB of the source value. Otherwise extract width bits
extending from bit positions offset to offset + width - 1.
Format:
[(pred)] bfe (exec_size) dst src0 src1 src2
Restriction
Restriction: No accumulator access, implicit or explicit.
Restriction: All three-source instructions have certain restrictions, described in Instruction Machine
Formats.
Syntax
[(pred)] bfe (exec_size) reg reg reg reg
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    UD width = src0.channel[n][4:0];
    UD offset = src1.channel[n][4:0];
    if ( width == 0 ) {
      dst.channel[n] = 0x00000000;
    } else if ( (width + offset) < 32 ) {
      dst.channel[n] = src2.channel[n] << (32 - width - offset);
      if ( src2 is signed ) {
        dst.channel[n] = dst.channel[n] >> (32 - width); // pad sign bit
      } else {
        dst.channel[n] = dst.channel[n] >> (32 - width); // pad 0
      }
    } else {
      if ( src2 is signed ) {
        dst.channel[n] = src2.channel[n] >> offset; // pad sign bit
      } else {
        dst.channel[n] = src2.channel[n] >> offset; // pad 0
      }
    }
  }
}
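The extraction rules, including the clipping case where offset + width exceeds 32, can be sketched for a single channel in Python. The function name bfe and the signed keyword (standing in for the D versus UD source type) are assumptions made for illustration.

```python
def bfe(width_src, offset_src, value, signed=False):
    """Extract a bit field from a 32-bit value (single-channel model)."""
    width = width_src & 0x1F    # only the low five bits are used
    offset = offset_src & 0x1F
    value &= 0xFFFFFFFF
    if width == 0:
        return 0
    if offset + width > 32:
        width = 32 - offset     # the field cannot extend past bit 31
    field = (value >> offset) & ((1 << width) - 1)
    if signed and field & (1 << (width - 1)):
        field -= 1 << width     # sign-extend from the top bit of the field
    return field


print(bfe(4, 8, 0x00000F00))               # 15
print(bfe(4, 8, 0x00000F00, signed=True))  # -1
print(bfe(16, 24, 0xAB000000))             # 171
```

In the last call only 8 bits are available above offset 24, so the requested width of 16 is clipped and the result is 0xAB.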
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
D D
DWord Bit Description
0..3 127:126 Reserved
Format: MBZ
125:106 Source 2
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
105 Reserved
Format: MBZ
104:85 Source 1
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
84 Reserved
Format: MBZ
83:64 Source 0
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
63:56 Destination Register Number
Format: DstRegNum
55:53 Destination Subregister Number
Format: DstSubRegNum[2:0]
52:49 Destination Channel Enable
Format: ChanEn[4]
Four channel enables are defined for controlling which channels are written into the
destination region. These channel mask bits are applied in a modulo-four manner to all
ExecSize channels. There is a 1-bit Channel Enable for each channel within the group of 4. If the
bit is cleared, the write for the corresponding channel is disabled. If the bit is set, the write is
enabled. Mnemonics for the bit being set for the group of 4 are x, y, z, and w, respectively,
where x corresponds to Channel 0 in the group and w corresponds to channel 3 in the group.
48 Reserved
Format: MBZ
47 NibCtrl
Format: NibCtrl
46 Reserved
Format: MBZ
45:44 Destination Data Type
This field contains the data type for the destination
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
43:42 Source Data Type
This field contains the data type for all three sources
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
41:40 Source 2 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
39:38 Source 1 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
41:36 Reserved
Exists If: ([Property[Source Modification]=='false')
Format: MBZ
37:36 Source 0 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
35 Reserved
Format: MBZ
34 Flag Register Number
This field contains the flag register number for instructions with a non-zero Conditional
Modifier.
33 Flag Subregister Number
This field contains the flag subregister number for instructions with a non-zero Conditional
Modifier.
32 Reserved
Format: MBZ
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfi1 - Bit Field Insert 1
Source: EuIsa
Length Bias: 4
The bfi1 instruction is the first instruction in a two-instruction macro for bfi (Bit Field Insert).
The bfi1 instruction component-wise generates mask with control from src0 and src1 and stores the results in
dst. The mask is used in the bfi2 instruction to generate the final result of bfi.
Create a bit mask corresponding to the bit field width and offset in src0 and src1. Store the bit mask in dst. The
mask has all bits in the bit field set to 1 and all other bits as 0.
The width and offset values are from the low five bits of src0 and src1 respectively, or src0 & 0x1f and src1 &
0x1f.
If width is zero, the result is zero.
The bfi macro has four source operands: src0 - bit field width in low five bits, src1 - bit field offset/starting bit
position in low five bits, src2 - bit field value to insert, using only the number of least significant bits given by
width in src0, and src3 - overall value into which the bit field is inserted, providing all bits other than the
inserted bits for the result value.
bfi dst src0 src1 src2 src3
// Translates to these two instructions:
bfi1 dst src0 src1
bfi2 dst dst src2 src3
Format:
[(pred)] bfi1 (exec_size) dst src0 src1
Programming Notes
No accumulator access, implicit or explicit.
Syntax
[(pred)] bfi1 (exec_size) reg reg reg
[(pred)] bfi1 (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    UD width = src0.channel[n][4:0];
    UD offset = src1.channel[n][4:0];
    dst.channel[n] = ((1 << width) - 1) << offset;
  }
}
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
D D
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfi2 - Bit Field Insert 2
Source: EuIsa
Length Bias: 4
The bfi2 instruction is the second instruction in a two-instruction macro for bfi (Bit Field Insert).
The bfi2 instruction component-wise performs the bitfield insert operation on src1 and src2 based on the mask
in src0.
Use the mask in src0 to take a bit field value from the low bits of src1 and combine it with the value from src2
(so src2 provides all bits other than those masked out and replaced by the bit field value). Store the result in
dst.
The bfi macro has four source operands: src0 - bit field width in low five bits, src1 - bit field offset/starting bit
position in low five bits, src2 - bit field value to insert, using only the number of least significant bits given by
width in src0, and src3 - overall value into which the bit field is inserted, providing all bits other than the
inserted bits for the result value.
bfi dst src0 src1 src2 src3
// Translates to these two instructions:
bfi1 dst src0 src1
bfi2 dst dst src2 src3
Format:
[(pred)] bfi2 (exec_size) dst src0 src1 src2
Restriction
Restriction: No accumulator access, implicit or explicit.
Restriction: All three-source instructions have certain restrictions, described in Instruction Machine
Formats.
Syntax
[(pred)] bfi2 (exec_size) reg reg reg reg
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    UD offset = LZD(reverse(src0.channel[n]))-1; // offset is the number of LSB zero bits below the bit mask which has all 1s.
    // width (implied by the logic) is the number of 1 bits in the mask value, which should be all 1s.
    dst.channel[n] = ((src1.channel[n] << offset) & src0.channel[n]) | (src2.channel[n] & ~src0.channel[n]);
  }
}
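The two halves of the bfi macro can be modeled together for a single channel. This sketch makes a few assumptions: the names bfi1, bfi2 and bfi are illustrative Python functions, results are truncated to 32 bits, and the offset in bfi2 is computed directly as the number of zero bits below the mask (as the pseudocode comment describes) rather than through LZD and reverse.

```python
MASK32 = 0xFFFFFFFF


def bfi1(width_src, offset_src):
    """Build the insert mask: `width` one-bits starting at bit `offset`."""
    width = width_src & 0x1F
    offset = offset_src & 0x1F
    return (((1 << width) - 1) << offset) & MASK32


def bfi2(mask, insert, base):
    """Insert the low bits of `insert` into `base` where `mask` is set."""
    if mask == 0:
        return base & MASK32  # zero-width field: nothing is replaced
    offset = (mask & -mask).bit_length() - 1  # count of zero bits below the mask
    return (((insert << offset) & mask) | (base & ~mask)) & MASK32


def bfi(width, offset, insert, base):
    """The four-operand macro: bfi1 followed by bfi2."""
    return bfi2(bfi1(width, offset), insert, base)


print(hex(bfi1(4, 8)))                   # 0xf00
print(hex(bfi(4, 8, 0xA, 0xFFFFFFFF)))   # 0xfffffaff
```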
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
D D
DWord Bit Description
0..3 127:126 Reserved
Format: MBZ
125:106 Source 2
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
105 Reserved
Format: MBZ
104:85 Source 1
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
84 Reserved
Format: MBZ
83:64 Source 0
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
63:56 Destination Register Number
Format: DstRegNum
55:53 Destination Subregister Number
Format: DstSubRegNum[2:0]
52:49 Destination Channel Enable
Format: ChanEn[4]
Four channel enables are defined for controlling which channels are written into the
destination region. These channel mask bits are applied in a modulo-four manner to all
ExecSize channels. There is a 1-bit Channel Enable for each channel within the group of 4. If the
bit is cleared, the write for the corresponding channel is disabled. If the bit is set, the write is
enabled. Mnemonics for the bit being set for the group of 4 are x, y, z, and w, respectively,
where x corresponds to Channel 0 in the group and w corresponds to channel 3 in the group.
48 Reserved
Format: MBZ
47 NibCtrl
Format: NibCtrl
46 Reserved
Format: MBZ
45:44 Destination Data Type
This field contains the data type for the destination
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
43:42 Source Data Type
This field contains the data type for all three sources
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
41:40 Source 2 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
39:38 Source 1 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
41:36 Reserved
Exists If: ([Property[Source Modification]=='false')
Format: MBZ
37:36 Source 0 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
35 Reserved
Format: MBZ
34 Flag Register Number
This field contains the flag register number for instructions with a non-zero Conditional
Modifier.
33 Flag Subregister Number
This field contains the flag subregister number for instructions with a non-zero Conditional
Modifier.
32 Reserved
Format: MBZ
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfrev - Bit Field Reverse
Source: EuIsa
Length Bias: 4
The bfrev instruction component-wise reverses all the bits in src0 and stores the results in dst.
Format:
[(pred)] bfrev (exec_size) dst src0
Restriction
Restriction: No accumulator access, implicit or explicit.
Syntax
[(pred)] bfrev (exec_size) reg reg
[(pred)] bfrev (exec_size) reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn.channel[n] ) {
    for ( idx = 0; idx < 32; idx++ ) {
      dst.channel[n][idx] = src0.channel[n][31-idx];
    }
  }
}
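The bit-by-bit loop translates directly into Python. This single-channel sketch uses a hypothetical function name, bfrev, and treats the operand as an unsigned 32-bit value.

```python
def bfrev(value):
    """Reverse the order of the 32 bits in value."""
    value &= 0xFFFFFFFF
    result = 0
    for idx in range(32):
        # Destination bit idx takes source bit 31 - idx.
        if value & (1 << (31 - idx)):
            result |= 1 << idx
    return result


print(hex(bfrev(0x00000001)))  # 0x80000000
print(hex(bfrev(0xF0000000)))  # 0xf
```

Applying the function twice returns the original value, which is a quick sanity check for any implementation.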
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([Operand Controls][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_IMM32
127:64 RegSource
Exists If: ([Operand Controls][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
brc - Branch Converging
Source: EuIsa
Length Bias: 4
Description
The brc instruction redirects the execution forward or backward to the instruction pointed by (current
IP + offset). The jump will occur if all channels are branched away. UIP should reference the instruction
where all channels are expected to come together. JIP should reference the end of the innermost
conditional block.
In GEN binary, JIP and UIP are at location src1 when immediates and at location src0 when reg32,
where reg32 is accessed as a scalar DWord containing both JIP and UIP. The null register must be used
(for example, by the assembler) as dst. When offsets are immediate, src0 must be null.
Format:
[(pred)] brc (exec_size) JIP UIP
Restriction
Restriction: A brc instruction must use the Switch instruction option.
Syntax
[(pred)] brc (exec_size) imm16 imm16
[(pred)] brc (exec_size) reg32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < 32; n++ ) {
  if ( WrEn[n] ) {
    PcIP[n] = IP + UIP;
  } else {
    PcIP[n] = IP + 1;
  }
}
if ( all PcIP != IP + 1 ) { // for all channels
  Jump(IP + JIP);
}
Predication Conditional Modifier Saturation Source Modifier Source Types
Y N N N D
DWord Bit Description
0..3 127:112 UIP
Format: S15
The jump distance in number of eight-byte units if a jump is taken for the channel.
111:96 JIP
Format: S15
The jump distance in number of eight-byte units if a jump is taken for the instruction.
95:64 Reserved
Format: MBZ
63:32 Operand Control
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
brd - Branch Diverging
Source: EuIsa
Length Bias: 4
Description
The brd instruction redirects the execution forward or backward to the instruction pointed by (current
IP + offset). The jump will occur if any channels are branched away.
In GEN binary, JIP is at location src1 when immediate and at location src0 when reg32, where reg32 is
accessed as a scalar DWord. The null register must be used at dst locations.
Format:
[(pred)] brd (exec_size) JIP
Restriction
Restriction: A brd instruction must use the Switch instruction option.
Syntax
[(pred)] brd (exec_size) imm16
[(pred)] brd (exec_size) reg32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < 32; n++ ) {
  if ( WrEn[n] ) {
    PcIP[n] = IP + JIP;
  } else {
    PcIP[n] = IP + 1;
  }
}
if ( any PcIP == ExIP + JIP ) { // any channel
  Jump(ExIP + JIP);
}
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types
D
DWord Bit Description
0..3 127:112 Reserved
Format: MBZ
111:96 JIP
Format: S15
Jump Target Offset. The relative offset in 64-bit units if a jump is taken for the instruction.
95:91 Reserved
Format: MBZ
90 Flag Register Number
Added a second flag register
89 Flag Subregister Number
This field specifies the sub-register number for a flag register operand. There are two sub-
registers in the flag register. Each sub-register contains 16 flag bits.
The selected flag sub-register is the source for predication if predication is enabled for the
instruction. It is the destination to store conditional flag bits if conditional modifier is enabled
for the instruction. The same flag sub-register can be both the predication source and
3DSTATE_PS
Source: RenderCS
Length Bias: 2
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 20h 3DSTATE_PS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 06h Excludes DWord (0,1)
              Format: =n Total Length - 2
1 31:6 Kernel Start Pointer[0]
Format: InstructionBaseOffset[31:6]Kernel
Specifies the 64-byte aligned address offset of the first instruction in the kernel[0]. This pointer is
relative to the Instruction Base Address.
5:0 Reserved
Format: MBZ
2 31 Single Program Flow (SPF)
Specifies the initial condition of the kernel program as either a single program flow (SIMDnxm
with m = 1) or as multiple program flows (SIMDnxm with m > 1). See CR0 description in ISA
Execution Environment.
Value  Name      Description
0h     Multiple  Multiple Program Flows
1h     Single    Single Program Flows
30 Vector Mask Enable (VME)
Format: U1 Enumerated Type
When SPF=0, VME specifies which mask to use to initialize the initial channel enables. When
SPF=1, VME specifies which mask to use to generate execution channel enables.
Value  Name   Description
0h     Dmask  Channels are enabled based on the dispatch mask
1h     Vmask  Channels are enabled based on the vector mask
29:27 Sampler Count
Format: U3
Specifies how many samplers (in multiples of 4) the pixel shader 0 kernel uses. Used only for
prefetching the associated sampler state entries.
Value range: [0,4]
Value  Description
0h     no samplers used
1h     between 1 and 4 samplers used
2h     between 5 and 8 samplers used
3h     between 9 and 12 samplers used
4h     between 13 and 16 samplers used
5h-7h  Reserved
26 Denormal Mode
Specifies the denormal mode used by the dispatched thread.
Value  Name  Description
0h     FTZ   Denormals are flushed to zero
1h     RET   Denormals are retained
25:18 Binding Table Entry Count
Format: U8
Specifies how many binding table entries the kernel uses. Used only for prefetching of the
binding table entries and associated surface state. Note: For kernels using a large number of
binding table entries, it may be advantageous to set this field to zero to avoid prefetching too
many entries and thrashing the state cache.
This field is ignored if [PS Function Enable] is DISABLED.
Value: [0,255]
Programming Notes
When HW binding table bit is set, it is assumed that the Binding Table Entry Count field will be
generated at JIT time.
17 Reserved
Format: MBZ
16 Floating Point Mode
Specifies the floating point mode used by the dispatched thread.
Value  Name      Description
0h     IEEE-754  Use IEEE-754 rules
1h     Alt       Use alternate rules
15:14 Rounding Mode
Specifies the rounding mode used by the dispatched thread.
Value  Name  Description
0h     RTNE  Round to Nearest Even
1h     RU    Round toward +infinity
2h     RD    Round toward -infinity
3h     RTZ   Round toward zero
13 Illegal Opcode Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA
Execution Environment.
12 Reserved
Format: MBZ
11 Mask Stack Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA
Execution Environment.
10:8 Reserved
Format: MBZ
7 Software Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[13] (note the bit # difference). See Exceptions and ISA
Execution Environment.
6:0 Reserved
Format: MBZ
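As an illustration of how the DWord 2 fields above fit together, the following Python sketch packs them into one 32-bit value. The names PS_DWORD2_FIELDS, encode_sampler_count and pack_ps_dword2 are hypothetical helpers, not part of any driver API; the bit positions are taken from the field list above, and the sketch does not reject the reserved Sampler Count encodings 5h-7h.

```python
# Field name -> (lowest bit, width) within 3DSTATE_PS DWord 2, per the list above.
PS_DWORD2_FIELDS = {
    "single_program_flow": (31, 1),
    "vector_mask_enable": (30, 1),
    "sampler_count": (27, 3),
    "denormal_mode": (26, 1),
    "binding_table_entry_count": (18, 8),
    "floating_point_mode": (16, 1),
    "rounding_mode": (14, 2),
    "illegal_opcode_exception_enable": (13, 1),
    "mask_stack_exception_enable": (11, 1),
    "software_exception_enable": (7, 1),
}


def encode_sampler_count(num_samplers):
    """Map a sampler count of 0..16 to the field value (units of 4, rounded up)."""
    if not 0 <= num_samplers <= 16:
        raise ValueError("sampler count must be in [0, 16]")
    return (num_samplers + 3) // 4


def pack_ps_dword2(**values):
    """Pack the named fields into DWord 2; unnamed and reserved bits stay zero."""
    dword = 0
    for name, value in values.items():
        low_bit, width = PS_DWORD2_FIELDS[name]  # KeyError for unknown field names
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        dword |= value << low_bit
    return dword
```

For example, pack_ps_dword2(sampler_count=encode_sampler_count(13), rounding_mode=3) selects the "13 to 16 samplers" encoding together with RTZ rounding.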
3 31:10 Scratch Space Base Pointer
Format: GeneralStateOffset[31:10]ScratchSpace
Specifies the 1k-byte aligned address offset to scratch space for use by the kernel. This pointer is
relative to the General State Base Address.
9:4 Reserved
Format: MBZ
3:0 Per Thread Scratch Space
Format: U4
Specifies the amount of scratch space allowed to be used by each thread. The driver must
allocate enough contiguous scratch space, pointed to by the Scratch Space Pointer, to ensure
that the Maximum Number of Threads each get Per Thread Scratch Space size without exceeding
the driver-allocated scratch space.
Value   Description
[0,11]  indicating [1k bytes, 2M bytes] in powers of two
4 31:24 Maximum Number of Threads
Format: U8-1 representing thread count
Specifies the maximum number of simultaneous threads allowed to be active. Used to
avoid using up the scratch space, or to avoid potential deadlock.
Range:
WIZ Hashing Disable in GT_MODE register enabled: Range = [7,171] --> [8,172]
threads. Only odd values are allowed (resulting in even max number of threads)
WIZ Hashing Disable in GT_MODE register disabled: Range = [3,85] --> [4,86] threads.
Only odd values are allowed (resulting in even max number of threads)
Value     Name   Description
[3h,1fh]  Range  [4,32] threads
Programming Notes
If this field is changed between 3DPRIMITIVE commands, a PIPE_CONTROL command with Stall
at Pixel Scoreboard set is required to be issued. This field must have an odd value so that the
max number of PS threads is even.
23:12 Reserved
Format: MBZ
11 Push Constant Enable
Format: Enable
This field must be enabled if the sum of the PS Constant Buffer [3:0] Read Length fields in
3DSTATE_CONSTANT_PS is nonzero, and must be disabled if the sum is zero.
10 Attribute Enable
Format: Enable
This field must be enabled if the Number of SF Output Attributes field in 3DSTATE_SBE is
nonzero, and must be disabled if that field is zero.
9 oMask Present to RenderTarget
Format: Enable
This bit is inserted in the PS payload header and made available to the DataPort (either via the
message header or via header bypass) to indicate that oMask data (one or two phases) is
included in Render Target Write messages. If present, the oMask data is used to mask off
samples.
8 Render Target Fast Clear Enable
Format: Enable
This field is set to enable fast clear of the bound render targets. See "Render Target Fast Clear"
for restrictions on enabling this field.
7 Dual Source Blend Enable
Format: Enable
This field is set if dual source blend is enabled. If this bit is disabled, the data port dual source
message reverts to a single source message using source 0.
6 Render Target Resolve Enable
Format: Enable
This field is set to enable clear value resolve on non-multisampled render targets. See "Render
Target Resolve" for restrictions on enabling this field.
5 Reserved
Format: MBZ
4:3 Position XY Offset Select
Format: U2 Enumerated Type
This field specifies if/what Position XY Offset values are passed in the PS payload. Note that these
are per-slot (pixel|sample) offsets, and therefore separate from the subspan XY coordinates
passed in R1.
Value  Name                Description
0h     POSOFFSET_NONE      No Position XY Offsets are included in the PS payload.
1h     Reserved
2h     POSOFFSET_CENTROID  Position XY Offsets will be passed in the PS payload,
                           and these will reflect the Centroid position(s).
3h     POSOFFSET_SAMPLE    Position XY Offsets will be passed in the PS payload,
                           and these will reflect the multisample position(s).
Programming Notes
SW Recommendation: If the PS kernel needs the Position Offsets to compute a Position XY
value, this field should match Position ZW Interpolation Mode to ensure a consistent
computation.
If the PS kernel does not need the Position XY Offsets to compute a Position Value, then this
field should be programmed to POSOFFSET_NONE, as the PS kernel should be using the
various barycentric inputs to evaluate other-than-position attributes.
MSDISPMODE_PERSAMPLE is required in order to select POSOFFSET_SAMPLE.
2 32 Pixel Dispatch Enable
Format: Enable
Description
Enables the Windower to dispatch 8 subspans in one payload.
Note: In the table below, the Valid column indicates which products that
combination is supported on. Combinations of dispatch enables not listed in the table
are not available on any product.
A: Valid on all products
B: Valid.
C: Not valid.
D: Valid on all products, except when in non-1x PERSAMPLE mode.
E: Valid on all products, except when in PERSAMPLE mode with number of
multisamples >= 8.
F: Valid on all products.
Each of the three KSP values is separately specified.
In addition, each kernel has a separately-specified GRF register count.
Variable Pixel Dispatch Section: Pixel Grouping (Dispatch size) control for valid pixel
dispatch combinations.
1 16 Pixel Dispatch Enable
Format: Enable
Description
Enables the Windower to dispatch 4 subspans in one payload.
Note: In the table below, the Valid column indicates which products that
combination is supported on. Combinations of dispatch enables not listed in the table
are not available on any product.
A: Valid on all products
B: Valid.
C: Not valid.
D: Valid on all products, except when in non-1x PERSAMPLE mode.
E: Valid on all products, except when in PERSAMPLE mode with number of
multisamples >= 8.
F: Valid on all products.
Each of the three KSP values is separately specified.
In addition, each kernel has a separately-specified GRF register count.
Variable Pixel Dispatch Section: Pixel Grouping (Dispatch size) control for valid pixel
dispatch combinations.
0 8 Pixel Dispatch Enable
Format: Enable
Description
Enables the Windower to dispatch 2 subspans in one payload.
Note: In the table below, the Valid column indicates which products that
combination is supported on. Combinations of dispatch enables not listed in the table
are not available on any product.
A: Valid on all products
B: Valid.
C: Not valid.
D: Valid on all products, except when in non-1x PERSAMPLE mode.
E: Valid on all products, except when in PERSAMPLE mode with number of
multisamples >= 8.
F: Valid on all products.
Each of the three KSP values is separately specified.
In addition, each kernel has a separately-specified GRF register count.
Variable Pixel Dispatch Section: Pixel Grouping (Dispatch size) control for valid pixel
dispatch combinations.
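The Per Thread Scratch Space (DWord 3) and Maximum Number of Threads (DWord 4) fields together determine how much scratch memory a driver has to set aside. The sketch below shows one way to compute the encodings and the resulting allocation. The function names are hypothetical, and the sketch deliberately does not enforce a thread-count range, because the permitted range depends on the GT_MODE setting described above.

```python
def encode_per_thread_scratch(bytes_needed):
    """Smallest field value in [0, 11] whose size (1 KB << value) covers bytes_needed."""
    for value in range(12):
        if (1024 << value) >= bytes_needed:
            return value
    raise ValueError("more than 2M bytes of per-thread scratch requested")


def encode_max_threads(thread_count):
    """The field holds (thread count - 1); the count must be even so the field is odd."""
    if thread_count <= 0 or thread_count % 2 != 0:
        raise ValueError("thread count must be a positive even number")
    return thread_count - 1


def scratch_bytes_required(per_thread_value, max_threads_field):
    """Contiguous scratch space needed so every allowed thread gets its full share."""
    return (1024 << per_thread_value) * (max_threads_field + 1)
```

With 32 threads and 1 KB per thread, scratch_bytes_required(0, encode_max_threads(32)) gives the 32 KB the driver would need to allocate.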
5 31:23 Reserved
Format: MBZ
22:16 Dispatch GRF Start Register for Constant/Setup Data [0]
Format: U7
Specifies the starting GRF register number for the Constant/Setup portion of the thread payload
for kernel[0].
Value: [0,127]
15 Reserved
Format: MBZ
14:8 Dispatch GRF Start Register for Constant/Setup Data [1]
Format: U7
Specifies the starting GRF register number for the Constant/Setup portion of the thread payload
for kernel[1].
Value: [0,127]
7 Reserved
Format: MBZ
6:0 Dispatch GRF Start Register for Constant/Setup Data [2]
Format: U7
Specifies the starting GRF register number for the Constant/Setup portion of the thread payload
for kernel[2].
Value: [0,127]
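The three 7-bit start registers of DWord 5 sit at bits 22:16, 14:8 and 6:0, with the reserved bits between them left at zero. A minimal packing sketch follows; pack_ps_dword5 is a hypothetical helper name, not an existing API.

```python
def pack_ps_dword5(grf_start_0, grf_start_1=0, grf_start_2=0):
    """Pack the Dispatch GRF Start Registers for kernel[0..2] into DWord 5."""
    for index, reg in enumerate((grf_start_0, grf_start_1, grf_start_2)):
        if not 0 <= reg <= 127:
            raise ValueError(f"GRF start register [{index}] must be in [0, 127]")
    # Bits 31:23, 15 and 7 are reserved (MBZ) and are never set here.
    return (grf_start_0 << 16) | (grf_start_1 << 8) | grf_start_2
```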
6 31:6 Kernel Start Pointer[1]
Format: InstructionBaseOffset[31:6]Kernel
Specifies the 64-byte aligned address offset of the first instruction in kernel[1]. This pointer is
relative to the Instruction Base Address.
5:0 Reserved
Format: MBZ
7 31:6 Kernel Start Pointer[2]
Format: InstructionBaseOffset[31:6]Kernel
Specifies the 64-byte aligned address offset of the first instruction in kernel[2]. This pointer is
relative to the Instruction Base Address.
5:0 Reserved
Format: MBZ
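Because each Kernel Start Pointer occupies bits 31:6 and the low six bits must be zero, the DWord value is simply the 64-byte aligned offset from the Instruction Base Address. The sketch below makes that check explicit; encode_kernel_start_pointer is a hypothetical name.

```python
def encode_kernel_start_pointer(offset):
    """Return the DWord for a kernel located at the given byte offset.

    The offset is relative to the Instruction Base Address and must be
    64-byte aligned so that the reserved bits 5:0 stay zero.
    """
    if not 0 <= offset < (1 << 32):
        raise ValueError("offset must fit in 32 bits")
    if offset % 64 != 0:
        raise ValueError("kernel start offset must be 64-byte aligned")
    return offset
```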
3DSTATE_PUSH_CONSTANT_ALLOC_DS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for DS Push Constant Buffer.
Programming Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant lengths programmed in 3DSTATE_CONSTANT_DS must be less than or equal to
the size of the allocated space in the URB, including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_DS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_DS.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 14h 3DSTATE_PUSH_CONSTANT_ALLOC_DS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h Excludes DWord (0,1)
              Format: =n Total Length - 2
1      31:20  Reserved
              Format: MBZ
       19:16  Constant Buffer Offset
              Format: U4
              Specifies the offset of the DS constant buffer into the URB.
              Value   Name
              [0,15]  (0KB - 15KB)
              0h      0KB [Default]
       15:5   Reserved
              Format: MBZ
       4:0    Constant Buffer Size
              Format: U5
              Specifies the size of the DS constant buffer. This value will determine the amount of data the
              command stream can pre-fetch before the buffer is full. Value of zero is only valid when
              constants are not enabled for DS.
              Value   Name
              [0,15]  (0KB - 15KB) Increments of 1KB
              0h      0KB [Default]
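The following Python sketch assembles the two DWords of a push constant allocation command and checks the offset-plus-size restriction. It is an illustration only: push_constant_alloc is a hypothetical helper, the sub-opcodes are the ones listed for the VS, HS, DS, GS and PS variants of this command, and the default limit of 15 KB is an assumption taken from the documented [0,15] value range.

```python
# 3D Command Sub Opcode of each 3DSTATE_PUSH_CONSTANT_ALLOC_* variant.
PUSH_CONSTANT_ALLOC_SUB_OPCODE = {"VS": 0x12, "HS": 0x13, "DS": 0x14, "GS": 0x15, "PS": 0x16}


def push_constant_alloc(stage, offset_kb, size_kb, max_kb=15):
    """Return [DWord 0, DWord 1] for 3DSTATE_PUSH_CONSTANT_ALLOC_<stage>.

    max_kb is an assumed limit (largest documented field value), not a
    number confirmed for any particular product.
    """
    if not 0 <= offset_kb <= 15 or not 0 <= size_kb <= 15:
        raise ValueError("offset and size must each be in [0, 15] KB")
    if offset_kb + size_kb > max_kb:
        raise ValueError("offset + size exceeds the maximum constant buffer size")
    sub_opcode = PUSH_CONSTANT_ALLOC_SUB_OPCODE[stage]
    # GFXPIPE (3h), GFXPIPE_3D (3h), 3DSTATE_NONPIPELINED (1h), DWord Length 0.
    header = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (sub_opcode << 16) | 0
    dword1 = (offset_kb << 16) | size_kb
    return [header, dword1]
```

For instance, push_constant_alloc("DS", 4, 4) places a 4 KB DS buffer at a 4 KB offset.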
3DSTATE_PUSH_CONSTANT_ALLOC_GS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for GS Push Constant Buffer.
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant lengths programmed in 3DSTATE_CONSTANT_GS must be less than or equal to
the size of the allocated space in the URB, including the buffering for half cachelines.
• The 3DSTATE_CONSTANT_GS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_GS.
See Push Constant URB Allocation section for more details.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 15h 3DSTATE_PUSH_CONSTANT_ALLOC_GS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Format: =n Total Length - 2
              Value  Name                                        Description
              0h     3DSTATE_PUSH_CONSTANT_ALLOC_GS [Default]    Excludes DWord (0,1)
1      31:20  Reserved
              Format: MBZ
       19:16  Constant Buffer Offset
              Format: U4
              Specifies the offset of the GS constant buffer into the URB.
              Value   Name
              [0,15]  (0KB - 15KB)
              0h      0KB [Default]
       15:5   Reserved
              Format: MBZ
       4:0    Constant Buffer Size
              Format: U5
              Specifies the size of the GS constant buffer. This value will determine the amount of data the
              command stream can pre-fetch before the buffer is full. Value of zero is only valid when
              constants are not enabled for GS.
              Value   Name
              [0,15]  (0KB - 15KB) Increments of 1KB
              0h      0KB [Default]
3DSTATE_PUSH_CONSTANT_ALLOC_HS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for HS Push Constant Buffer.
Programming Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant lengths programmed in 3DSTATE_CONSTANT_HS must be less than or equal to
the size of the allocated space in the URB, including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_HS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_HS.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 13h 3DSTATE_PUSH_CONSTANT_ALLOC_HS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h Excludes DWord (0,1)
              Format: =n Total Length - 2
1      31:20  Reserved
              Format: MBZ
       19:16  Constant Buffer Offset
              Format: U4
              Specifies the offset of the HS constant buffer into the URB.
              Value   Name
              [0,15]  (0KB - 15KB)
              0h      0KB [Default]
       15:5   Reserved
              Format: MBZ
       4:0    Constant Buffer Size
              Format: U5
              Specifies the size of the HS constant buffer. This value will determine the amount of data the
              command stream can pre-fetch before the buffer is full. Value of zero is only valid when
              constants are not enabled for HS.
              Value   Name
              [0,15]  (0KB - 15KB) Increments of 1KB
              0h      0KB [Default]
3DSTATE_PUSH_CONSTANT_ALLOC_PS
Source: RenderCS
Length Bias: 2
Description
This command sets up the URB configuration for PS Push Constant Buffer.
A PIPE_CONTROL command with the CS Stall bit set must be programmed in the ring after this
instruction.
Programming Notes
Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant lengths programmed in 3DSTATE_CONSTANT_PS must be less than or equal to
the size of the allocated space in the URB, including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_PS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_PS.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 16h 3DSTATE_PUSH_CONSTANT_ALLOC_PS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    Dword Length
              Default Value: 0h Excludes Dword (0,1)
              Format: =n Total Length - 2
1      31:20  Reserved
              Format: MBZ
       19:16  Constant Buffer Offset
              Format: U4
              Specifies the offset of the PS constant buffer into the URB.
              Value   Name
              [0,15]  (0KB - 15KB)
              0h      0KB [Default]
       15:5   Reserved
              Format: MBZ
       4:0    Constant Buffer Size
              Format: U5
              Specifies the size of the PS constant buffer. This value will determine the amount of data the
              command stream can pre-fetch before the buffer is full. Value of zero is only valid when
              constants are not enabled for PS.
              Value   Name
              [0,15]  (0KB - 15KB) Increments of 1KB
              0h      0KB [Default]
3DSTATE_PUSH_CONSTANT_ALLOC_VS
Source: RenderCS
Length Bias: 2
Programming Notes
This command sets up the URB configuration for VS Push Constant Buffer.
Programming Restriction:
• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value
of the Constant Buffer Size.
• The sum of the constant lengths programmed in 3DSTATE_CONSTANT_VS must be less than or equal to
the size of the allocated space in the URB, including the buffering for half cachelines. See Push Constant
URB Allocation section for more details.
• The 3DSTATE_CONSTANT_VS must be reprogrammed prior to the next 3DPRIMITIVE command after
programming the 3DSTATE_PUSH_CONSTANT_ALLOC_VS.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE_NONPIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 12h 3DSTATE_PUSH_CONSTANT_ALLOC_VS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h Excludes DWord (0,1)
              Format: =n Total Length - 2
1      31:20  Reserved
              Format: MBZ
       19:16  Constant Buffer Offset
              Format: U4
              Specifies the offset of the VS constant buffer into the URB.
              Value   Name
              [0,15]  (0KB - 15KB)
              0h      0KB [Default]
       15:5   Reserved
              Format: MBZ
       4:0    Constant Buffer Size
              Format: U5
              Specifies the size of the VS constant buffer. This value will determine the amount of data the
              command stream can pre-fetch before the buffer is full. Value of zero is only valid when
              constants are not enabled for VS.
              Value   Name
              [0,15]  (0KB - 15KB) Increments of 1KB
              0h      0KB [Default]
3DSTATE_SAMPLE_MASK
Source: RenderCS
Length Bias: 2
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 18h 3DSTATE_SAMPLE_MASK
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    Dword Length
              Default Value: 0h Excludes Dword (0,1)
              Format: =n Total Length - 2
1      31:8   Reserved
              Format: MBZ
       7:0    Sample Mask
              Format: 8 bit mask Right-justified bitmask (Bit 0 = Sample0). Number of bits that are used is
              determined by Num Multisamples (3DSTATE_MULTISAMPLE)
              A per-multisample-position mask state variable that is immediately and unconditionally ANDed
              with the sample coverage mask as part of the rasterization process. This mask is applied prior to
              centroid selection.
              Programming Notes
              • If Number of Multisamples is NUMSAMPLES_1, bits 7:1 of this field must be zero.
              • If Number of Multisamples is NUMSAMPLES_4, bits 7:4 of this field must be zero.
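The Sample Mask rules can be expressed as a small check plus the AND with the coverage mask. The sketch below is illustrative: the function names are hypothetical, and accepting 8 samples (where all eight bits may be used) is an assumption based on the 8-bit field width rather than something stated in the programming notes.

```python
def sample_mask_is_valid(mask, num_samples):
    """True if no bit above the active sample positions is set in the mask."""
    if num_samples not in (1, 4, 8):  # 8 is assumed from the 8-bit field width
        raise ValueError("unsupported number of multisamples")
    if not 0 <= mask <= 0xFF:
        raise ValueError("Sample Mask is an 8-bit field")
    active_bits = (1 << num_samples) - 1
    return (mask & ~active_bits) == 0


def apply_sample_mask(coverage, mask):
    """The mask is unconditionally ANDed with the sample coverage mask."""
    return coverage & mask
```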
3DSTATE_SAMPLER_PALETTE_LOAD0
Source: RenderCS
Length Bias: 2
Description
The 3DSTATE_SAMPLER_PALETTE_LOAD0 instruction is used to load 32-bit values into the first
texture palette. The texture palette is used whenever a texture with a paletted format (containing
"Px [palette0]") is referenced by the sampler.
This instruction is used to load all or a subset of the 256 entries of the first palette. Partial loads
always start from the first (index 0) entry.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: Opcode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: Opcode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE
              Format: Opcode
       23:16  3D Command Sub Opcode
              Default Value: 02h 3DSTATE_SAMPLER_PALETTE_LOAD0
              Format: Opcode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h Excludes DWord (0,1)
              Format: =n Total Length - 2
1..n   31:24  Palette Alpha[0:N-1]
              Format: U8
              Alpha channel loaded into the Nth entry of the texture color palette.
       23:16  Palette Red[0:N-1]
              Format: U8
              Red channel loaded into the Nth entry of the texture color palette.
       15:8   Palette Green[0:N-1]
              Format: U8
              Green channel loaded into the Nth entry of the texture color palette.
       7:0    Palette Blue[0:N-1]
              Format: U8
              Blue channel loaded into the Nth entry of the texture color palette.
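A palette load is one header DWord followed by one DWord per entry, so the DWord Length field (total length minus 2) equals the entry count minus 1. The Python sketch below builds such a command; palette_load0 is a hypothetical helper, and passing entries as (alpha, red, green, blue) tuples is an illustrative choice.

```python
def palette_load0(entries):
    """Build the DWord list for 3DSTATE_SAMPLER_PALETTE_LOAD0.

    entries is a list of (alpha, red, green, blue) tuples, each channel in
    [0, 255]. Loading always starts at palette index 0.
    """
    count = len(entries)
    if not 1 <= count <= 256:
        raise ValueError("a palette load carries between 1 and 256 entries")
    # Total length is 1 + count DWords, so DWord Length = count - 1.
    header = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x02 << 16) | (count - 1)
    dwords = [header]
    for alpha, red, green, blue in entries:
        for channel in (alpha, red, green, blue):
            if not 0 <= channel <= 255:
                raise ValueError("palette channels are U8 values")
        dwords.append((alpha << 24) | (red << 16) | (green << 8) | blue)
    return dwords
```

The same layout applies to the second palette, with sub-opcode 0Ch in place of 02h.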
3DSTATE_SAMPLER_PALETTE_LOAD1
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_PALETTE_LOAD1 instruction is used to load 32-bit values into the second texture
palette. The second texture palette is used whenever a texture with a paletted format (containing
"Px...[palette1]") is referenced by the sampler. This instruction is used to load all or a subset of the 256 entries of
the second palette. Partial loads always start from the first (index 0) entry.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 1h 3DSTATE
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 0Ch 3DSTATE_SAMPLER_PALETTE_LOAD1
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h Excludes DWord (0,1)
              Format: =n Total Length - 2
1..n   31:24  Palette Alpha[0:N-1]
              Format: U8
              Alpha channel loaded into the Nth entry of the texture color palette.
       23:16  Palette Red[0:N-1]
              Format: U8
              Red channel loaded into the Nth entry of the texture color palette.
       15:8   Palette Green[0:N-1]
              Format: U8
              Green channel loaded into the Nth entry of the texture color palette.
       7:0    Palette Blue[0:N-1]
              Format: U8
              Blue channel loaded into the Nth entry of the texture color palette.
3DSTATE_SAMPLER_STATE_POINTERS_DS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_DS command is used to define the location of the DS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 2Dh 3DSTATE_SAMPLER_STATE_POINTERS_DS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h DWORD_COUNT_n
              Format: =n
1      31:5   Pointer to DS Sampler State
              Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
              Specifies the 32-byte aligned address offset of the DS function's SAMPLER_STATE table. This
              offset is relative to the Dynamic State Base Address.
       4:0    Reserved
              Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_GS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_GS command is used to define the location of the GS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 2Eh 3DSTATE_SAMPLER_STATE_POINTERS_GS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h DWORD_COUNT_n
              Format: =n
1      31:5   Pointer to GS Sampler State
              Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
              Specifies the 32-byte aligned address offset of the GS function's SAMPLER_STATE table. This
              offset is relative to the Dynamic State Base Address.
       4:0    Reserved
              Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_HS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_HS command is used to define the location of the HS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 2Ch 3DSTATE_SAMPLER_STATE_POINTERS_HS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h DWORD_COUNT_n
              Format: =n
1      31:5   Pointer to HS Sampler State
              Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
              Specifies the 32-byte aligned address offset of the HS function's SAMPLER_STATE table. This
              offset is relative to the Dynamic State Base Address.
       4:0    Reserved
              Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_PS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SAMPLER_STATE_POINTERS_PS command is used to define the location of the PS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord  Bit    Description
0      31:29  Command Type
              Default Value: 3h GFXPIPE
              Format: OpCode
       28:27  Command SubType
              Default Value: 3h GFXPIPE_3D
              Format: OpCode
       26:24  3D Command Opcode
              Default Value: 0h 3DSTATE_PIPELINED
              Format: OpCode
       23:16  3D Command Sub Opcode
              Default Value: 2Fh 3DSTATE_SAMPLER_STATE_POINTERS_PS
              Format: OpCode
       15:8   Reserved
              Format: MBZ
       7:0    DWord Length
              Default Value: 0h DWORD_COUNT_n
              Format: =n
1      31:5   Pointer to PS Sampler State
              Format: DynamicStateOffset[31:5]SAMPLER_STATE*16
              Specifies the 32-byte aligned address offset of the PS function's SAMPLER_STATE table. This
              offset is relative to the Dynamic State Base Address.
       4:0    Reserved
              Format: MBZ
3DSTATE_SAMPLER_STATE_POINTERS_VS
Source:
Length Bias:
RenderCS
2
The 3DSTATE_SAMPLER_STATE_POINTERS_VS command is used to define the location of VS SAMPLER_STATE
table. Only some of the fixed functions utilize sampler state tables.
DWord Bit
0 31:29 Command Type
Default Value:
Format:
Description
3h GFXPIPE
OpCode
3h GFXPIPE_3D
OpCode
0h 3DSTATE_PIPELINED
OpCode
2Bh 3DSTATE_SAMPLER_STATE_POINTERS_VS
OpCode
MBZ
0h DWORD_COUNT_n
=n
DynamicStateOffset[31:5]SAMPLER_STATE*16
28:27 Command SubType
Default Value:
Format:
26:24 3D Command Opcode
Default Value:
Format:
23:16 3D Command Sub Opcode
Default Value:
Format:
15:8 Reserved
Format:
7:0 DWord Length
Default Value:
Format:
1 31:5 Pointer to VS Sampler State
Format:
Specifies the 32-byte aligned address offset of the VS function's SAMPLER_STATE table. This
offset is relative to the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
3DSTATE_SBE
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 1Fh 3DSTATE_SBE
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0Ch Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:29 Reserved
Format: MBZ
28 Attribute Swizzle Control Mode
Format: U1 enumerated type
When Attribute Swizzle Enable is ENABLED, this bit controls whether attributes 0-15 or
16-31 are subject to the following swizzle controls:
• Attribute n Component Override X/Y/Z/W
• Attribute n Constant Source
• Attribute n Swizzle Select
• Attribute n Source Attribute
• Attribute n Wrap Shortest Enables
Note that the Number of SF Output Attributes field specifies how many attributes are
output.
Note: This field does not impact any functions which provide separate states for all 32
attributes (e.g., Point sprite, Constant interpolation).
Value Name Description
0h SWIZ_0_15 Attributes 0-15 are subject to swizzling, and attributes 16-31 are not.
1h SWIZ_16_31 Attributes 16-31 are subject to swizzling, and attributes 0-15 are not. Only valid when 16 or more attributes are output.
27:22 Number of SF Output Attributes
Format: U6 count of attributes
Specifies the number of vertex attributes passed from the SF stage to the WM stage (does not
include Position).
Value: [0,32]
21 Attribute Swizzle Enable
Format: Enable
Enables the SF to perform swizzling on (up to the first 16) vertex attributes. If DISABLED, all vertex
attributes are passed through.
20 Point Sprite Texture Coordinate Origin
Format: U1 enumerated type
This state controls how Point Sprite Texture Coordinates are generated (when enabled on a per-
attribute basis by Point Sprite Texture Coordinate Enable).
Value Name Description
0h UPPERLEFT Top Left = (0,0,0,1), Bottom Left = (0,1,0,1), Bottom Right = (1,1,0,1)
1h LOWERLEFT Top Left = (0,1,0,1), Bottom Left = (0,0,0,1), Bottom Right = (1,0,0,1)
19:16 Reserved
Format: MBZ
15:11 Vertex URB Entry Read Length
Format: U5
Specifies the amount of URB data read for each Vertex URB entry, in 256-bit register increments.
Value: [1,16]
Programming Notes
It is UNDEFINED to set this field to 0, which would indicate that no Vertex URB data is to be
read. This field should be set to the minimum length required to read the maximum source
attribute. The maximum source attribute is the maximum value of the enabled Attribute # Source
Attribute fields if Attribute Swizzle Enable is set, or Number of SF Output Attributes - 1 if it
is not set.
read_length = ceiling((max_source_attr+1)/2)
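As a worked illustration of the formula above: each source attribute occupies 128 bits and the read length counts 256-bit units, so two attributes fit in one unit. The sketch below is not taken from any driver; `max_source_attribute` and `vertex_urb_read_length` are hypothetical names, and the caller is assumed to supply the list of enabled Source Attribute values.

```python
def max_source_attribute(swizzle_enable, source_attributes, num_sf_output_attributes):
    """Pick the maximum source attribute per the programming note (hypothetical helper)."""
    if swizzle_enable:
        return max(source_attributes)
    return num_sf_output_attributes - 1


def vertex_urb_read_length(max_source_attr):
    """read_length = ceiling((max_source_attr + 1) / 2), in 256-bit units."""
    length = (max_source_attr + 2) // 2  # integer form of the ceiling above
    if not 1 <= length <= 16:
        raise ValueError("read length must be in [1,16]; 0 is UNDEFINED")
    return length
```

For example, with swizzling enabled and source attributes 0, 2 and 4 in use, the maximum source attribute is 4 and the read length is 3.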
10 Reserved
9:4 Vertex URB Entry Read Offset
Specifies the offset (in 256-bit units) at which Vertex URB data is to be read from the URB.
3:0 Reserved
Format: MBZ
2..9 31 Attribute [2n+1] Component Override W
Format: Enable
If set, the W component of output Attribute 1 is overridden by the W component of the constant
vector specified by ConstantSource[1].
30 Attribute [2n+1] Component Override Z
Format: Enable
If set, the Z component of output Attribute 1 is overridden by the Z component of the constant
vector specified by ConstantSource[1].
29 Attribute [2n+1] Component Override Y
Format: Enable
If set, the Y component of output Attribute 1 is overridden by the Y component of the constant
vector specified by ConstantSource[1].
28 Attribute [2n+1] Component Override X
Format: Enable
If set, the X component of output Attribute 1 is overridden by the X component of the constant
vector specified by ConstantSource[1].
27 Reserved
Format: MBZ
26:25 Attribute [2n+1] Constant Source
Format: U2 enumerated type
This state selects a constant vector which can be used to override individual components of
Attribute 1
Value Name Description
0h CONST_0000 = 0.0,0.0,0.0,0.0
1h CONST_0001_FLOAT = 0.0,0.0,0.0,1.0
2h CONST_1111_FLOAT = 1.0,1.0,1.0,1.0
3h PRIM_ID = PrimID (replicated)
24 Reserved
Format: MBZ
23:22 Attribute [2n+1] Swizzle Select
Format: U2 enumerated type
This state, along with Attribute 1 Source Attribute, specifies the source for output Attribute 1.
Value Name Description
0h INPUTATTR This attribute is sourced from AttrInputReg[SourceAttribute]
1h INPUTATTR_FACING If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1].
2h INPUTATTR_W This attribute is sourced from AttrInputReg[SourceAttribute]. The W component is copied to the X component.
3h INPUTATTR_FACING_W If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1]. The W component is copied to the X component.
21 Reserved
Format: MBZ
20:16 Attribute [2n+1] Source Attribute
Format: U5
This field selects the source attribute for Attribute 1. Source attribute 0 corresponds to the first
128 bits of data indicated by Vertex URB Entry Read Offset
15 Attribute [2n] Component Override W
Format: Enable
If set, the W component of output Attribute 0 is overridden by the W component of the constant
vector specified by ConstantSource[0].
14 Attribute [2n] Component Override Z
Format: Enable
If set, the Z component of output Attribute 0 is overridden by the Z component of the constant
vector specified by ConstantSource[0].
13 Attribute [2n] Component Override Y
Format: Enable
If set, the Y component of output Attribute 0 is overridden by the Y component of the constant
vector specified by ConstantSource[0].
12 Attribute [2n] Component Override X
Format: Enable
If set, the X component of output Attribute 0 is overridden by the X component of the constant
vector specified by ConstantSource[0].
11 Reserved
Format: MBZ
10:9 Attribute [2n] Constant Source
Format: U2 enumerated type
This state selects a constant vector which can be used to override individual components of
Attribute 0
Value Name Description
0h CONST_0000 = 0.0,0.0,0.0,0.0
1h CONST_0001_FLOAT = 0.0,0.0,0.0,1.0
2h CONST_1111_FLOAT = 1.0,1.0,1.0,1.0
3h PRIM_ID = PrimID (replicated)
8 Reserved
Format: MBZ
7:6 Attribute [2n] Swizzle Select
Format: U2 enumerated type
This state, along with Attribute 0 Source Attribute, specifies the source for output Attribute 0.
Value Name Description
0h INPUTATTR This attribute is sourced from AttrInputReg[SourceAttribute]
1h INPUTATTR_FACING If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1].
2h INPUTATTR_W This attribute is sourced from AttrInputReg[SourceAttribute]. The W component is copied to the X component.
3h INPUTATTR_FACING_W If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1]. The W component is copied to the X component.
5 Reserved
Format: MBZ
4:0 Attribute [2n] Source Attribute
Format: U5
This field selects the source attribute for Attribute 0. Source attribute 0 corresponds to the first
128 bits of data indicated by Vertex URB Entry Read Offset
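The per-attribute fields of DWords 2..9 form a 16-bit descriptor, with the even attribute [2n] in bits 15:0 and the odd attribute [2n+1] in bits 31:16. A minimal packing sketch, assuming that reading of the table (the helper names `pack_attribute` and `pack_sbe_swizzle_dwords` are illustrative only):

```python
def pack_attribute(source_attr, swizzle_select=0, const_source=0,
                   override_xyzw=(False, False, False, False)):
    """Pack one 16-bit attribute descriptor using the bit positions of Attribute [2n]."""
    if not 0 <= source_attr <= 31:
        raise ValueError("Source Attribute is a U5 field")
    if not 0 <= swizzle_select <= 3 or not 0 <= const_source <= 3:
        raise ValueError("Swizzle Select and Constant Source are U2 fields")
    x, y, z, w = (int(bool(flag)) for flag in override_xyzw)
    return ((w << 15) | (z << 14) | (y << 13) | (x << 12)
            | (const_source << 9) | (swizzle_select << 6) | source_attr)


def pack_sbe_swizzle_dwords(descriptors):
    """Combine 16 descriptors into DWords 2..9 (attribute 2n low, 2n+1 high)."""
    if len(descriptors) != 16:
        raise ValueError("expected 16 attribute descriptors")
    return [descriptors[2 * n] | (descriptors[2 * n + 1] << 16) for n in range(8)]
```

For instance, `pack_attribute(3, swizzle_select=1)` selects INPUTATTR_FACING with source attribute 3.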
10 31:0 Point Sprite Texture Coordinate Enable
Format: 32-bit bitmask
When processing point primitives, the attributes from the incoming point vertex are
typically copied to the point object corner vertices. However, if a bit is set in this field,
the corresponding Attribute is selected as a Point Sprite Texture Coordinate, in which
case each corner vertex is assigned a pre-defined texture coordinate as defined by
the Point Sprite Texture Coordinate Origin state bit. Bit 0 corresponds to output
Attribute 0.
This field must be programmed to 0 when non-point primitives are rendered.
11 31:0 Constant Interpolation Enable[31:0]
This field is a bitmask containing a Constant Interpolation Enable bit for each corresponding
attribute. If a bit is set, that attribute will undergo constant interpolation, and the corresponding
WrapShortest Enable bits (if defined) will be ignored. If a bit is clear, components which are not
enabled for WrapShortest interpolation (if defined) will be linearly interpolated.
12 31:28 Attribute 7 WrapShortest Enables
Format: Enable[4]
This state selects which components (if any) of Attribute 7 are to be interpolated in a "wrap
shortest" fashion. Operation is UNDEFINED if any of these bits are set and the Constant
Interpolation Enable bit associated with this attribute is set. Note that wrap-shortest interpolation
is only supported for Attributes 0-15.
Bit 0: WrapShortest X Component
Bit 1: WrapShortest Y Component
Bit 2: WrapShortest Z Component
Bit 3: WrapShortest W Component
27:24 Attribute 6 WrapShortest Enables
(See above).
23:20 Attribute 5 WrapShortest Enables
(See above).
19:16 Attribute 4 WrapShortest Enables
(See above).
15:12 Attribute 3 WrapShortest Enables
(See above).
11:8 Attribute 2 WrapShortest Enables
(See above).
7:4 Attribute 1 WrapShortest Enables
(See above).
3:0 Attribute 0 WrapShortest Enables
(See above).
13 31:28 Attribute 15 WrapShortest Enables
Format: Enable[4]
This state selects which components (if any) of Attribute 15 are to be interpolated in a "wrap
shortest" fashion. Operation is UNDEFINED if any of these bits are set and the Constant
Interpolation Enable bit associated with this attribute is set.
Bit 0: WrapShortest X Component
Bit 1: WrapShortest Y Component
Bit 2: WrapShortest Z Component
Bit 3: WrapShortest W Component
27:24 Attribute 14 WrapShortest Enables
(See above).
23:20 Attribute 13 WrapShortest Enables
(See above).
19:16 Attribute 12 WrapShortest Enables
(See above).
15:12 Attribute 11 WrapShortest Enables
(See above).
11:8 Attribute 10 WrapShortest Enables
(See above).
7:4 Attribute 9 WrapShortest Enables
(See above).
3:0 Attribute 8 WrapShortest Enables
(See above).
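The two WrapShortest DWords hold one 4-bit group per attribute, eight attributes per DWord. The sketch below packs them from a mapping of attribute index to component letters and rejects the combination the text marks as UNDEFINED. The function name `pack_wrap_shortest` and its input format are assumptions made for illustration.

```python
def pack_wrap_shortest(enables, const_interp_mask=0):
    """Return [DWord 12, DWord 13] from {attribute_index: "xyzw" subset}."""
    dwords = [0, 0]
    for attr, components in enables.items():
        if not 0 <= attr <= 15:
            raise ValueError("wrap-shortest is only supported for Attributes 0-15")
        # Bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W within each 4-bit group.
        nibble = sum(1 << "xyzw".index(c) for c in set(components))
        if nibble and (const_interp_mask >> attr) & 1:
            raise ValueError("UNDEFINED: WrapShortest set with Constant Interpolation Enable")
        dwords[attr // 8] |= nibble << (4 * (attr % 8))
    return dwords
```

Passing the Constant Interpolation Enable mask from DWord 11 as `const_interp_mask` lets the helper catch conflicting programming before the command is emitted.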
3DSTATE_SCISSOR_STATE_POINTERS
Source: RenderCS
Length Bias: 2
The 3DSTATE_SCISSOR_STATE_POINTERS command is used to define the location of the indirect SCISSOR_RECT
state.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 0Fh 3DSTATE_SCISSOR_STATE_POINTERS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 Scissor Rect Pointer
Format: DynamicStateOffset[31:5]SCISSOR_RECT*16
Specifies the 32-byte aligned address offset of the SCISSOR_RECT state. This offset is
relative to the Dynamic State Base Address
4:0 Reserved
Format: MBZ
3DSTATE_SF
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 13h 3DSTATE_SF
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 5h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:15 Reserved
Format: MBZ
14:12 Depth Buffer Surface Format
Format: U3 Enumerated Type
Specifies the format of the depth buffer. This must exactly match the Surface Format
programmed via 3DSTATE_DEPTH_BUFFER. The SF requires this information in order to compute
Global Depth Bias.
Value Name Description
0h D32_FLOAT_S8X24_UINT D32_FLOAT_S8X24_UINT
1h D32_FLOAT D32_FLOAT
2h D24_UNORM_S8_UINT D24_UNORM_S8_UINT
3h D24_UNORM_X8_UINT D24_UNORM_X8_UINT
4h Reserved Reserved
5h D16_UNORM D16_UNORM
6h-7h Reserved Reserved
11 Legacy Global Depth Bias Enable
Format: Enable
Enables the SF to use the Global Depth Offset Constant state unmodified. If this bit is not set, the
SF will scale the Global Depth Offset Constant as described elsewhere in this document.
Programming Notes
This bit should be set whenever non zero depth bias (Slope, Bias) values are used. Setting this
bit may have some degradation of performance for some workloads.
10 Statistics Enable
Format: Enable
If ENABLED, this FF unit will increment CL_PRIMITIVES_COUNT on behalf of the CLIP stage. If
DISABLED, CL_PRIMITIVES_COUNT will be left unchanged.
Programming Notes
This bit should be set whenever clipping is enabled and the Statistics Enable bit is set in
CLIP_STATE. It should be cleared if clipping is disabled or Statistics Enable in CLIP_STATE is
clear.
9 Global Depth Offset Enable Solid
Format: Enable
Programming Notes
This bit should be set whenever non zero depth bias (Slope, Bias) values are used.
Setting this bit may have some degradation of performance for some workloads.
Enables computation and application of Global Depth Offset for SOLID objects.
8 Global Depth Offset Enable Wireframe
Format: Enable
Enables computation and application of Global Depth Offset when triangles are rendered in
WIREFRAME mode.
Programming Notes
This bit should be set whenever non zero depth bias (Slope, Bias) values are used.
Setting this bit may have some degradation of performance for some workloads.
7 Global Depth Offset Enable Point
Format: Enable
Enables computation and application of Global Depth Offset when triangles are rendered in
POINT mode.
Programming Notes
This bit should be set whenever non zero depth bias (Slope, Bias) values are used.
Setting this bit may have some degradation of performance for some workloads.
6:5 FrontFace Fill Mode
Format: U2 enumerated type
This state controls how front-facing triangle and rectangle objects are rendered.
Value Name Description
0h SOLID Any triangle or rectangle object found to be front-facing is rendered as a solid object. This setting is required when rendering rectangle (RECTLIST) objects.
1h WIREFRAME Any triangle object found to be front-facing is rendered as a series of lines along the triangle boundaries (as determined by the topology type and controlled by the vertex EdgeFlags).
2h POINT Any triangle object found to be front-facing is rendered as a set of point primitives at the triangle vertices (as determined by the topology type and controlled by the vertex EdgeFlags). NOTE: If the triangle is clipped, points will not be rendered at clip-inserted vertices. Point will only be rendered at original vertices (if visible).
3h Reserved
4:3 BackFace Fill Mode
Format: U2 enumerated type
This state controls how back-facing triangle and rectangle objects are rendered.
Value Name Description
0h SOLID Any triangle or rectangle object found to be back-facing is rendered as a solid object. This setting is required when rendering rectangle (RECTLIST) objects.
1h WIREFRAME Any triangle object found to be back-facing is rendered as a series of lines along the triangle boundaries (as determined by the topology type and controlled by the vertex EdgeFlags).
2h POINT Any triangle object found to be back-facing is rendered as a set of point primitives at the triangle vertices (as determined by the topology type and controlled by the vertex EdgeFlags). NOTE: If the triangle is clipped, points will not be rendered at clip-inserted vertices. Point will only be rendered at original vertices (if visible).
3h Reserved
2 Reserved
Format: MBZ
1 View Transform Enable
Format: Enable
This bit controls the Viewport Transform function.
0 Front Winding
Determines whether a triangle object is considered "front facing" if the screen space vertex
positions, when traversed in the order, result in a clockwise (CW) or counter-clockwise (CCW)
winding order. Does not apply to points or lines.
2 31 Anti-Aliasing Enable
Format: Enable
This field enables "alpha-based" line anti-aliasing.
Programming Notes
This field must be disabled if any of the render targets have integer (UINT or SINT) surface
format.
30:29 Cull Mode
Format: 3D_CullMode
Controls removal (culling) of triangle objects based on orientation. The cull mode only applies to
triangle objects and does not apply to lines, points or rectangles.
Value Name Description
0h CULLMODE_BOTH All triangles are discarded (i.e., no triangle objects are drawn)
1h CULLMODE_NONE No triangles are discarded due to orientation
2h CULLMODE_FRONT Triangles with a front-facing orientation are discarded
3h CULLMODE_BACK Triangles with a back-facing orientation are discarded
Programming Notes
Orientation determination is based on the setting of the Front Winding state.
28 Reserved
27:18 Line Width
Format: U3.7
Range: [0.0, 7.9921875]
Controls width of line primitives. Setting a Line Width of 0.0 specifies the rasterization
of the "thinnest" (one-pixel-wide), non-antialiased lines. Note that this effectively
overrides the effect of AAEnable (though the AAEnable state variable is not modified).
Programming Notes
Software must not program a value of 0.0 when running in MSRASTMODE_ON_xxx
modes - zero-width lines are not available when multisampling rasterization is
enabled.
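To make the U3.7 format concrete: the field holds 3 integer bits and 7 fractional bits, so the raw value is the width multiplied by 128 and the maximum raw value 3FFh corresponds to 7.9921875. The following is a sketch only; `encode_u3_7` and `sf_line_width_bits` are hypothetical names, and rounding to the nearest step is a choice made here, not something the text specifies.

```python
def encode_u3_7(value):
    """Convert a width in pixels to an unsigned 3.7 fixed-point raw value."""
    raw = round(value * 128)  # 7 fractional bits
    if not 0 <= raw <= 0x3FF:
        raise ValueError("value outside [0.0, 7.9921875]")
    return raw


def sf_line_width_bits(width, multisample_rasterization_on=False):
    """Place Line Width into bits 27:18 of 3DSTATE_SF DWord 2."""
    raw = encode_u3_7(width)
    # Zero-width lines are not available in MSRASTMODE_ON_xxx modes.
    if raw == 0 and multisample_rasterization_on:
        raise ValueError("Line Width 0.0 is not allowed with multisample rasterization")
    return raw << 18
```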
17:16 Line End Cap Antialiasing Region Width
Format: U2
This field specifies the distances over which the coverage of anti-aliased line end caps are
computed.
Value Name
0h 0.5 pixels
1h 1.0 pixels
2h 2.0 pixels
3h 4.0 pixels
15 Reserved
Format: MBZ
14 Reserved
Format: MBZ
13 Reserved
12 Reserved
11 Scissor Rectangle Enable
Format: Enable
Enables operation of Scissor Rectangle.
10 Reserved
Format: MBZ
9:8 Multisample Rasterization Mode
Format: U2 enumerated type
This state is duplicated in 3DSTATE_WM and both must be set to the same value. See the field in
3DSTATE_WM for definition details.
7:0 Reserved
Format: MBZ
3 31 Last Pixel Enable
Format: Enable
If ENABLED, the last pixel of a diamond line will be lit. This state will only affect the rasterization
of Diamond lines (will not affect wide lines or anti-aliased lines).
Programming Notes
Last pixel is applied to all lines of a LINELIST, and only the last line of a LINESTRIP.
30:29 Triangle Strip/List Provoking Vertex Select
Format: 0-based vertex index
Selects which vertex of a triangle (in a triangle strip or list primitive) is considered the "provoking
vertex". Used for flat shading of primitives. Does current implementation send provoking vertex
first?
Value Name
0h Vertex 0
1h Vertex 1
2h Vertex 2
3h Reserved
28:27 Line Strip/List Provoking Vertex Select
Format: 0-based vertex index
Selects which vertex of a line (in a line strip or list primitive) is considered the "provoking vertex".
Value Name
0h Vertex 0
1h Vertex 1
2h Reserved
3h Reserved
26:25 Triangle Fan Provoking Vertex Select
Format: 0-based vertex index
Selects which vertex of a triangle (in a triangle fan primitive) is considered the "provoking vertex".
Value Name
0h Vertex 0
1h Vertex 1
2h Vertex 2
3h Reserved
24:15 Reserved
Format: MBZ
14 AA Line Distance Mode
Format: U1
This bit controls the distance computation for antialiased lines.
Value Name Description
0h Reserved Reserved
1h AALINEDISTANCE_TRUE True distance computation. This is the normal setting which should yield WHQL compliance.
13 Reserved
Format: MBZ
12 Vertex Sub Pixel Precision Select
Format: U1
Selects the number of fractional bits maintained in the vertex data
Value Name Description
0h Disable 8 sub pixel precision bits maintained
1h Enable 4 sub pixel precision bits maintained
11 Use Point Width State
Format: U1
Controls whether the point width passed on the vertex or from state is used for rendering point
primitives.
Value Name
0h Use Point Width on Vertex
1h Use Point Width from State
10:0 Point Width
Format: U8.3
Range: [0.125, 255.875] pixels
This field specifies the size (width) of point primitives in pixels. This field is overridden (though
not overwritten) whenever point width information is passed in the FVF
4 31:0 Global Depth Offset Constant
Format: IEEE_FP
Specifies the constant term in the Global Depth Offset function.
5 31:0 Global Depth Offset Scale
Format: IEEE_FP
Specifies the scale term used in the Global Depth Offset function.
6 31:0 Global Depth Offset Clamp
Format: IEEE_FP
Specifies the clamp term used in the Global Depth Offset function.
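DWords 4 to 6 carry raw IEEE single-precision values, so a driver has to reinterpret each float as a 32-bit integer before writing it to the command stream. A minimal sketch using Python's standard `struct` module follows; the helper names are hypothetical, and the Global Depth Offset function itself is defined elsewhere and is not reproduced here.

```python
import struct


def fp32_bits(value):
    """Return the IEEE-754 single-precision bit pattern of value as an integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def sf_depth_offset_dwords(constant, scale, clamp):
    """Return DWords 4, 5 and 6 of 3DSTATE_SF (Constant, Scale, Clamp)."""
    return [fp32_bits(constant), fp32_bits(scale), fp32_bits(clamp)]
```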
3DSTATE_SO_BUFFER
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 18h 3DSTATE_SO_BUFFER
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 2h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31 Reserved
Format: MBZ
30:29 SO Buffer Index
Format: U2
Specifies which of the four SO Buffers is being defined.
28:25 SO Buffer Object Control State
Format: MEMORY_OBJECT_CONTROL_STATE
Specifies the memory object control state for the SO buffer.
24:22 Reserved
Format: MBZ
21:12 Reserved
Format: MBZ
11:0 Surface Pitch
Format: U12 Pitch in Bytes
This field specifies the pitch of the SO buffer in #Bytes.
Value Name
[0,2048] Must be 0 or a multiple of 4 Bytes.
Programming Notes
A Surface Pitch of 0 indicates an un-bound buffer. No writes are performed. Surface Base
Address is ignored.
2 31:2 Surface Base Address
Format: GraphicsAddress[31:2]
This field specifies the starting DWord address LSBs of the buffer in Graphics Memory.
1:0 Reserved
Format: MBZ
3 31:2 Surface End Address
Format: GraphicsAddress[31:2]
This field specifies the ending DWord address of the buffer in Graphics Memory.
1:0 Reserved
Format: MBZ
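The four DWords of 3DSTATE_SO_BUFFER can be assembled as in the sketch below, which also enforces the pitch and alignment constraints listed above. The function name and argument names are hypothetical. The text does not say whether the end address is inclusive or exclusive, so the sketch passes it through unchanged.

```python
def so_buffer_command(index, pitch, base_address, end_address, object_control=0):
    """Return the four DWords of 3DSTATE_SO_BUFFER (illustrative sketch)."""
    if not 0 <= index <= 3:
        raise ValueError("SO Buffer Index selects one of four buffers")
    if not 0 <= pitch <= 2048 or pitch % 4 != 0:
        raise ValueError("Surface Pitch must be 0 or a multiple of 4 in [0,2048]")
    if base_address % 4 != 0 or end_address % 4 != 0:
        raise ValueError("addresses are DWord aligned (bits 1:0 MBZ)")
    if not 0 <= object_control <= 0xF:
        raise ValueError("SO Buffer Object Control State is a 4-bit field")
    # Command Type 3h, SubType 3h, opcode 1h (NONPIPELINED), sub-opcode 18h, length 2h.
    header = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x18 << 16) | 0x2
    dword1 = (index << 29) | (object_control << 25) | pitch
    return [header, dword1, base_address, end_address]
```

A pitch of 0 is accepted because the programming note defines it as an un-bound buffer.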
3DSTATE_SO_DECL_LIST
Source: RenderCS
Length Bias: 2
Description
Default Value:
Format:
DWord Bit
0 31:29 Command Type
3h GFXPIPE
OpCode
3h GFXPIPE_3D
OpCode
1h 3DSTATE_NONPIPELINED
OpCode
17h 3DSTATE_SO_DECL_LIST
OpCode
MBZ
=n Total Length - 2
Format: Q1
Name Description
Default value = 2(N-1)+3 h
MBZ
U4 bitmask
Index of SO Stream
28:27 Command SubType
Default Value:
Format:
26:24 3D Command Opcode
Default Value:
Format:
23:16 3D Command Sub Opcode
Default Value:
Format:
15:9 Reserved
Format:
8:0 DWord Length
Format:
Value
3h Excludes DWord (0,1) [Default]
1 31:16 Reserved
Format:
15:12 Stream to Buffer Selects [3]
Format:
Identifies to which SO Buffers stream 3 outputs. See Stream To Buffer Selects [0] field description.
11:8 Stream to Buffer Selects [2]
Format: U4 bitmask
Identifies to which SO Buffers stream 2 outputs. See Stream To Buffer Selects [0] field description.
7:4 Stream to Buffer Selects [1]
Format: U4 bitmask
Identifies to which SO Buffers stream 1 outputs. See Stream To Buffer Selects [0] field description.
3:0 Stream to Buffer Selects [0]
Format: U4 bitmask
Identifies to which SO Buffers stream 0 outputs (irrespective of whether those buffers are
enabled via 3DSTATE_STREAMOUT). Software is required to scan the SO_DECL list in order to
provide this summary information.
Note: For "inactive" streams, software must program this field to all zero (no buffers written to)
and the corresponding Num Entries field to zero (no valid SO_DECLs).
Value Name
1xxxb SO Buffer 3
x1xxb SO Buffer 2
xx1xb SO Buffer 1
xxx1b SO Buffer 0
2 31:24 Num Entries [3]
Format: U8 #entries
Specifies the number of valid SO_DECL entries for Stream 3. (See notes in Num Entries [0] field
description).
Value Name
[0,128] entries
23:16 Num Entries [2]
Format: U8 #entries
Specifies the number of valid SO_DECL entries for Stream 2. (See notes in Num Entries [0] field
description).
Value Name
[0,128] entries
15:8 Num Entries [1]
Format: U8 #entries
Specifies the number of valid SO_DECL entries for Stream 1. (See notes in Num Entries [0] field
description).
Value Name
[0,128] entries
7:0 Num Entries [0]
Format: U8 #entries
Specifies the number of valid SO_DECL entries for Stream 0. Note that the SO_DECLs are
programmed in groups of four (one SO_DECL for each of the four streams). Therefore the
number of 2-DWord groups of SO_DECLs supplied in this command is derived from the stream(s)
with the most valid SO_DECLs. The NumEntries value specific to each stream will indicate how
many SO_DECLS are valid for that particular stream. Any trailing invalid SO_DECLs supplied for
streams with fewer valid SO_DECLs will be ignored. It is legal to specify Num Entries = 0 for all
four streams simultaneously. In this case there will be no SO_DECLs included in the command
(only DW 0-2). Note that all Stream to Buffer Selects bits must be zero in this case (as no streams
produce output).
Value Name
[0,128] entries
3..n 63:48 SO_DECL[3,n]
Format: SO_DECL
This field contains Stream 3 SO_DECL [n]
47:32 SO_DECL[2,n]
Format: SO_DECL
This field contains Stream 2 SO_DECL [n]
31:16 SO_DECL[1,n]
Format: SO_DECL
This field contains Stream 1 SO_DECL [n]
15:0 SO_DECL[0,n]
Format: SO_DECL
This field contains Stream 0 SO_DECL [n]
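The variable-length layout can be illustrated with a short builder. With N groups of SO_DECLs the command has 3 + 2N DWords, so the DWord Length field is 2N + 1, which matches the 2(N-1)+3 default given for DWord 0. The sketch treats each SO_DECL as an opaque 16-bit value (its format is defined elsewhere), pads shorter streams with zero, and assumes the low 32 bits of each 64-bit group are emitted first. The name `so_decl_list_command` is hypothetical.

```python
def so_decl_list_command(decls_per_stream, buffer_selects):
    """Build 3DSTATE_SO_DECL_LIST from four SO_DECL lists and four 4-bit buffer masks."""
    if len(decls_per_stream) != 4 or len(buffer_selects) != 4:
        raise ValueError("exactly four streams are expected")
    groups = max(len(decls) for decls in decls_per_stream)
    if groups > 128:
        raise ValueError("Num Entries is limited to [0,128]")
    for decls, selects in zip(decls_per_stream, buffer_selects):
        if not 0 <= selects <= 0xF:
            raise ValueError("Stream to Buffer Selects is a U4 bitmask")
        if not decls and selects:
            raise ValueError("inactive streams must have all-zero buffer selects")
    # Command Type 3h, SubType 3h, opcode 1h, sub-opcode 17h, length = 2N + 1.
    header = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x17 << 16) | (2 * groups + 1)
    dword1 = sum(sel << (4 * stream) for stream, sel in enumerate(buffer_selects))
    dword2 = sum(len(decls) << (8 * stream) for stream, decls in enumerate(decls_per_stream))
    command = [header, dword1, dword2]
    for n in range(groups):
        # Trailing entries of shorter streams are ignored by hardware; pad with 0.
        entry = [decls[n] if n < len(decls) else 0 for decls in decls_per_stream]
        command.append(entry[0] | (entry[1] << 16))  # bits 31:0 of the group
        command.append(entry[2] | (entry[3] << 16))  # bits 63:32 of the group
    return command
```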
3DSTATE_STENCIL_BUFFER
Source: RenderCS
Length Bias: 2
This command sets the surface state of the separate stencil buffer, delivered as a pipelined state command.
However, the state change pipelining isn't completely transparent (see restriction below).
Programming Notes
Restriction: Prior to changing Depth/Stencil Buffer state (i.e., any combination of
3DSTATE_DEPTH_BUFFER, 3DSTATE_CLEAR_PARAMS, 3DSTATE_STENCIL_BUFFER,
3DSTATE_HIER_DEPTH_BUFFER), SW must first issue a pipelined depth stall (PIPE_CONTROL with Depth
Stall bit set), followed by a pipelined depth cache flush (PIPE_CONTROL with Depth Flush bit set),
followed by another pipelined depth stall (PIPE_CONTROL with Depth Stall bit set), unless SW can
otherwise guarantee that the pipeline from WM onwards is already flushed (e.g., via a preceding
MI_FLUSH).
3DSTATE_STENCIL_BUFFER must always be programmed along with the other Depth/Stencil
state commands (3DSTATE_DEPTH_BUFFER, 3DSTATE_CLEAR_PARAMS, or
3DSTATE_HIER_DEPTH_BUFFER).
The stencil buffer is always Tile-Y
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 06h 3DSTATE_STENCIL_BUFFER
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 Dword Length
Format: =n Total Length - 2
Value Name
1h Excludes Dword (0,1) [Default]
1 31 Reserved
Format: MBZ
30:29 Reserved
Format: MBZ
28:25 Stencil Buffer Object Control State
Format: MEMORY_OBJECT_CONTROL_STATE
Specifies the memory object control state for the stencil buffer.
Stencil Buffer Object Control State [3:0]
This field is not context save and restored by hardware. If this field is programmed to
any value other than zero, it must be programmed after the following commands or
events:
• MI_SET_CONTEXT
• MI_WAIT_FOR_EVENT (Specifically waits on vblank or display flip)
• Render engine goes IDLE due to head point equal to tail pointer
24:22 Reserved
Format: MBZ
21:17 Reserved
Format: MBZ
16:0 Surface Pitch
Format: U17-1 Pitch in Bytes
This field specifies the pitch of the stencil buffer in (#Bytes - 1).
Value Name Description
[127, 3FFFFh] corresponding to [128B, 128KB], also restricted to a multiple of 128B
Programming Notes
Since this surface is tiled, the pitch specified must be a multiple of the tile pitch, in the range
[128B, 128KB].
The pitch must be set to 2x the value computed based on width, as the stencil buffer is stored
with two rows interleaved. For details on the separate stencil buffer storage format in memory,
see GPU Overview (vol1a), Memory Data Formats, Surface Layout, 2D Surfaces, Stencil Buffer
Layout (section 8.20.4.8).
2 31:0 Surface Base Address
Format: GraphicsAddress[31:0]Stencil_Buffer
Programming Notes
The Stencil Buffer can only be mapped to Main Memory (uncached).
This field specifies the starting Dword address of the buffer in mapped Graphics Memory.
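The Surface Pitch rules for the stencil buffer (double the width-based pitch, multiple of 128B, programmed as bytes minus one) can be combined in a small helper. This is a sketch under the assumption that the caller has already computed the width-based pitch in bytes; the name `stencil_pitch_field` is hypothetical, and the limits used are the [128B, 128KB] byte range stated in the text.

```python
def stencil_pitch_field(width_based_pitch_bytes):
    """Return the value for bits 16:0 of 3DSTATE_STENCIL_BUFFER DWord 1."""
    # Two rows are interleaved, so the programmed pitch is twice the computed one.
    pitch = 2 * width_based_pitch_bytes
    if pitch % 128 != 0:
        raise ValueError("pitch must be a multiple of 128 bytes")
    if not 128 <= pitch <= 128 * 1024:
        raise ValueError("pitch must be in [128B, 128KB]")
    return pitch - 1  # the field holds (#Bytes - 1)
```

For example, a width-based pitch of 64 bytes yields a programmed pitch of 128 bytes and a field value of 127.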
3DSTATE_STREAMOUT
Source: RenderCS
Length Bias: 2
This command contains pipelined state required by the SOL unit.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 1Eh 3DSTATE_STREAMOUT
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 1h
Format: =n Total Length - 2
1 31 SO Function Enable
Format: U1
If set, the SO function is enabled. Vertex data will be streamed out to memory (subject to
overflow detection) as controlled by the various SO-related state variables.
If clear, the SO function is disabled, and therefore no vertex data will be streamed out to
memory. However, the Rendering Disable and Render Stream Select fields will still be used to
determine which vertices (if any) are forwarded down the pipeline for (possible) rendering.
30 Rendering Disable
Format: U1
If set, the SO stage will not forward any topologies down the pipeline. If clear, the SO stage will
forward topologies associated with Render Stream Select down the pipeline. This bit is used even
if SO Function Enable is DISABLED.
29 Reserved
Format: MBZ
28:27 Render Stream Select
Format: U2
This field specifies which stream has been selected to be forwarded down the pipeline
for possible rendering. Topologies from other streams will not be passed down the
pipeline. If Rendering Disable is set, this field is ignored, as no topologies are sent
down the pipeline.
This bit is used even if SO Function Enable is DISABLED.
26 Reorder Mode
This bit controls how vertices of triangle objects in TRISTRIP[_ADJ] and TRISTRIP_REV are
reordered for the purposes of stream-out only (does not impact rendering). See table in Input
Buffering.
Value Name Description
0h LEADING Reorder the vertices of alternating triangles of a TRISTRIP[_ADJ] such that the leading (first) vertices are in consecutive order starting at v0. A similar reordering is performed on alternating triangles in a TRISTRIP_REV.
1h TRAILING Reorder the vertices of alternating triangles of a TRISTRIP[_ADJ] such that the trailing (last) vertices are in consecutive order starting at v2. A similar reordering is performed on alternating triangles in a TRISTRIP_REV.
25 SO Statistics Enable
Format: Enable
This bit controls whether StreamOutput statistics register(s) can be incremented.
Value Name Description
0h Disable SO_NUM_PRIMS_WRITTEN[0..3] and SO_PRIM_STORAGE_NEEDED[0..3] registers cannot increment.
1h Enable SO_NUM_PRIMS_WRITTEN[0..3] and SO_PRIM_STORAGE_NEEDED[0..3] registers can increment.
24:23 Reserved
Format: MBZ
22:12 Reserved
Format: MBZ
11 SO Buffer Enable [3]
Format: U1
(See SO Buffer Enable [0] )
10 SO Buffer Enable [2]
Format: U1
(See SO Buffer Enable [0] )
9 SO Buffer Enable [1]
Format: U1
(See SO Buffer Enable [0] )
8 SO Buffer Enable [0]
Format: U1
If set, stream output to SO Buffer 0 is enabled. If clear, SO Buffer 0 is considered "not bound"
and effectively treated as a zero-length buffer for the purposes of SO output and overflow
detection. If an enabled stream's Stream to Buffer Selects includes this buffer it is by definition an
overflow condition. That stream will cause no writes to occur, and only
SO_PRIM_STORAGE_NEEDED[
is DISABLED.
7:0 Reserved
Format: MBZ
2 31:30 Reserved
Format: MBZ
29 Stream 3 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 3 vertex data.
(See Stream 0 Vertex Read Offset)
28:24 Stream 3 Vertex Read Length
Format: U5-1 count of 256-bit units
(See Stream 0 Vertex Read Length)
23:22 Reserved
Format: MBZ
21 Stream 2 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 2 vertex data. (See Stream 0
Vertex Read Offset)
20:16 Stream 2 Vertex Read Length
Format: U5-1 count of 256-bit units
15:14 Reserved
Format: MBZ
13 Stream 1 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 1 vertex data. (See Stream 0
Vertex Read Offset)
12:8 Stream 1 Vertex Read Length
Format: U5-1 count of 256-bit units
(See Stream 0 Vertex Read Length)
7:6 Reserved
Format: MBZ
5 Stream 0 Vertex Read Offset
Format: U1 count of 256-bit units
Specifies amount of data to skip over before reading back Stream 0 vertex data. Must be zero if
the GS is enabled and the Output Vertex Size field in 3DSTATE_GS is programmed to 0 (i.e., one
16B unit).
4:0 Stream 0 Vertex Read Length
Format: U5-1 count of 256-bit units
Specifies amount of vertex data to read back for Stream 0 vertices, starting at the Stream 0
Vertex Read Offset location. Maximum readback is 17 256-bit units (34 128-bit vertex attributes).
Read data past the end of the valid vertex data has undefined contents, and therefore shouldn't
be used to source stream out data.
Must be zero (i.e., read length = 256b) if the GS is enabled and the Output Vertex Size field in
3DSTATE_GS is programmed to 0 (i.e., one 16B unit).
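As a rough illustration of the DWord 1 and DWord 2 layout of 3DSTATE_STREAMOUT, the following C sketch packs the enable bits and the Stream 0 read fields. The function names are hypothetical and are not part of any driver interface; only the bit positions and the 17-unit readback limit come from the field descriptions in this command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: packs DWord 1 of 3DSTATE_STREAMOUT.
 * Bit positions follow the field table (31, 30, 28:27, 26, 25, 11:8). */
static inline uint32_t streamout_dw1(bool so_enable, bool rendering_disable,
                                     unsigned render_stream, bool reorder_trailing,
                                     bool stats_enable, unsigned buffer_enable_mask)
{
    assert(render_stream <= 3);        /* Render Stream Select is U2 */
    assert(buffer_enable_mask <= 0xF); /* SO Buffer Enable [3]..[0] */
    return ((uint32_t)so_enable << 31) |
           ((uint32_t)rendering_disable << 30) |
           ((uint32_t)render_stream << 27) |
           ((uint32_t)reorder_trailing << 26) |
           ((uint32_t)stats_enable << 25) |
           ((uint32_t)buffer_enable_mask << 8);
}

/* Hypothetical helper: packs the Stream 0 fields of DWord 2.
 * offset_256b goes in bit 5; the read length is stored as (units - 1)
 * in bits 4:0, with at most 17 256-bit units of readback. */
static inline uint32_t streamout_dw2_stream0(unsigned offset_256b, unsigned units_256b)
{
    assert(offset_256b <= 1);
    assert(units_256b >= 1 && units_256b <= 17);
    return ((uint32_t)offset_256b << 5) | (uint32_t)(units_256b - 1);
}
```

For example, enabling the SO function with only SO Buffer 0 bound would be `streamout_dw1(true, false, 0, false, false, 0x1)`.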
3DSTATE_TE
Source: RenderCS
Length Bias: 2
Description
The state used by TE is defined with this inline state packet.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 1Ch 3DSTATE_TE
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 2h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:19 Reserved
Format: MBZ
18:16 Reserved
Format: MBZ
15:14 Reserved
Format: MBZ
13:12 Partitioning
Format: U2
This field specifies how edges are partitioned based on tessellation factor.
Value Name Description
0h INTEGER Outside/inside edges are divided into an integer number
of equal-sized segments.
1h ODD_FRACTIONAL Outside/inside edges are divided into an odd number of
possibly-unequal-sized segments.
2h EVEN_FRACTIONAL Outside/inside edges are divided into an even number of
possibly-unequal-sized segments.
11:10 Reserved
Format: MBZ
9:8 Output Topology
Format: U2
This field specifies which primitive types are to be output.
Value Name Description
0h POINT Points are output (as POINTLIST topologies).
1h LINE Lines are output (as LINESTRIP topologies). Only valid if ISOLINE
domain is selected.
2h TRI_CW Clockwise-ordered triangles are output (either as TRISTRIP,
TRISTRIP_REV or TRILIST topologies). Not valid if ISOLINE domain is
selected.
3h TRI_CCW Counter-clockwise-ordered triangles are output (either as TRISTRIP,
TRISTRIP_REV or TRILIST topologies). Not valid if ISOLINE domain is
selected.
7:6 Reserved
Format: MBZ
5:4 TE Domain
Format: U2
This field specifies which type of domain is to be tessellated.
Value Name Description
0h QUAD 2D (U,V) domain is tessellated.
1h TRI Triangular (U,V,W) domain is tessellated.
2h ISOLINE 2D (U,V) domain is tessellated.
3 Reserved
Format: MBZ
2:1 TE Mode
Format: U2
When TE Enable is ENABLED, this field specifies the overall operation of the TE stage. This field is
ignored if TE Enable is DISABLED.
Value Name Description
0h HW_TESS Normal HW Tessellation Mode. The TessFactors are read from the
patch URB entry, and are used to perform fixed-function hardware
tessellation of the specified domain.
1h SW_TESS Software Tessellation Mode. The TE unit will pass down HS-thread-
generated tessellated domain points instead of generating them
itself from TessFactors. The TE unit will read the Domain Point Count
and Domain Point Buffer Starting Address fields from the patch
header, and if the count is 0 it will consider the patch culled and
discard it. Otherwise the address is used to start fetching
DOMAIN_POINT structures from memory and passing them down
the pipeline to DS.
2h Reserved Reserved
3h Reserved Reserved
0 TE Enable
Format: Enable
If ENABLED, the TE stage will perform tessellation processing on incoming patch primitives. The
TE Mode field determines how this tessellation operation proceeds. If DISABLED, the TE goes into
pass-through mode. All other state fields are ignored.
Programming Notes
The tessellation stages (HS, TE and DS) must be enabled/disabled as a group. I.e., draw
commands can only be issued if all three stages are enabled or all three stages are disabled,
otherwise the behavior is UNDEFINED.
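The DWord 0 and DWord 1 encodings of 3DSTATE_TE can be sketched in C as follows. This is a minimal illustration under stated assumptions: the enum and function names are hypothetical, the enum values are read off the field tables for this command, and the validity check covers only the ISOLINE restrictions stated for Output Topology.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values taken from the 3DSTATE_TE field tables. */
enum te_partitioning { TE_INTEGER = 0, TE_ODD_FRACTIONAL = 1, TE_EVEN_FRACTIONAL = 2 };
enum te_topology     { TE_POINT = 0, TE_LINE = 1, TE_TRI_CW = 2, TE_TRI_CCW = 3 };
enum te_domain       { TE_QUAD = 0, TE_TRI = 1, TE_ISOLINE = 2 };
enum te_mode         { TE_HW_TESS = 0, TE_SW_TESS = 1 };

/* LINE is only valid with the ISOLINE domain; triangle output is not valid with it. */
static inline bool te_topology_valid(enum te_domain d, enum te_topology t)
{
    if (t == TE_LINE)
        return d == TE_ISOLINE;
    if (t == TE_TRI_CW || t == TE_TRI_CCW)
        return d != TE_ISOLINE;
    return true; /* POINT: no restriction is stated */
}

/* DWord 0: type 3h, subtype 3h, opcode 0h, sub-opcode 1Ch, DWord Length 2h. */
static inline uint32_t te_dw0(void)
{
    return (3u << 29) | (3u << 27) | (0u << 24) | (0x1Cu << 16) | 2u;
}

/* DWord 1: Partitioning 13:12, Output Topology 9:8, TE Domain 5:4,
 * TE Mode 2:1, TE Enable 0. Reserved bits are left at zero (MBZ). */
static inline uint32_t te_dw1(enum te_partitioning p, enum te_topology t,
                              enum te_domain d, enum te_mode m, bool enable)
{
    assert(te_topology_valid(d, t));
    return ((uint32_t)p << 12) | ((uint32_t)t << 8) | ((uint32_t)d << 4) |
           ((uint32_t)m << 1) | (uint32_t)enable;
}
```

Checking the topology against the domain before packing keeps an invalid combination from reaching the command stream; what a driver should do on failure is left open here.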
2 31:0 Maximum Tessellation Factor Odd
Format: IEEE_Float
This field specifies the maximum TessFactor for ODD_FRACTIONAL partitioning when in
HW_TESS mode.
Value Name Description
427c0000h 63 Per API Spec, for normal operation software should set this
value to 63.0.
[40400000h,427c0000h] Reserved Reserved.
Programming Notes
Note that ISOLINE's LineDensity TF is always subjected to INTEGER partitioning regardless of
the Partitioning state.
3 31:0 Maximum Tessellation Factor Not Odd
Format: IEEE_Float
This field specifies the maximum TessFactor for EVEN_FRACTIONAL or INTEGER partitioning
when in HW_TESS mode.
Value Name Description
42800000h 64 Per API Spec, for normal operation software should set this
value to 64.0.
[40000000h,42800000h] Reserved Reserved.
Programming Notes
Note that ISOLINE's LineDensity TF is always subjected to INTEGER partitioning regardless of
the Partitioning state.
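The hexadecimal values listed for the two Maximum Tessellation Factor fields are the IEEE single-precision bit patterns of 63.0 and 64.0. A small C sketch (the helper name float_bits is hypothetical, and a 32-bit IEEE-754 float is assumed) shows one way to obtain the DWord value from a float without violating aliasing rules:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the raw 32-bit pattern of a float, assuming IEEE-754 single
 * precision. memcpy is used instead of a pointer cast to stay within
 * the C aliasing rules. */
static inline uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);
    return u;
}
```

With this helper, DWord 2 would be written as `float_bits(63.0f)` and DWord 3 as `float_bits(64.0f)`.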
3DSTATE_URB_DS
Source: RenderCS
Length Bias: 2
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_VS, 3DSTATE_URB_HS, and 3DSTATE_URB_GS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 32h 3DSTATE_URB_DS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 DS URB Starting Address
Format: U5
Offset from the start of the URB memory where DS starts its allocation, specified in multiples of
8 KB.
Value Name
[0,11]
24:16 DS URB Entry Allocation Size
Format: U9-1 Count of 512-bit units
Specifies the length of each URB entry owned by DS. This field is always used (even if DS
Function Enable is DISABLED).
Value Name Description
[0,9]
15:0 DS Number of URB Entries
Specifies the number of URB entries that are used by DS. This field is always used
(even if DS Function Enable is DISABLED).
If Domain Shader Thread Dispatch is Enabled then the minimum number of handles
that must be allocated is 10 URB entries.
Value Name
[0,288]
Programming Notes
DS Number of URB Entries must be divisible by 8 if the DS URB Entry Allocation Size is
programmed to a value less than 9, which is 10 512-bit URB entries. "2:0" = reserved "000"
3DSTATE_URB_GS
Source: RenderCS
Length Bias: 2
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_VS, 3DSTATE_URB_HS, and 3DSTATE_URB_DS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 33h 3DSTATE_URB_GS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 GS URB Starting Address
Format: U5
Offset from the start of the URB memory where GS starts its allocation, specified in multiples of 8
KB.
Value Name
[0,11]
24:16 GS URB Entry Allocation Size
Format: U9-1 512-bit units
Specifies the length of each URB entry owned by GS. This field is always used (even if GS
Function Enable is DISABLED).
15:0 GS Number of URB Entries
Specifies the number of URB entries that are used by GS. This field is always used (even if GS
Function Enable is DISABLED).
Value Name
[0,192]
Programming Notes
Only if GS is disabled can this field be programmed to 0.
If GS is enabled this field shall be programmed to a value greater than 0. For GS Dispatch Mode
"Single", this field shall be programmed to a value greater than or equal to 1. For other GS
Dispatch Modes, refer to the definition of Dispatch Mode (3DSTATE_GS) for minimum values of
this field.
GS Number of URB Entries must be divisible by 8 if the GS URB Entry Allocation Size is less than
9 512-bit URB entries.
"2:0" = reserved "000"
3DSTATE_URB_HS
Source: RenderCS
Length Bias: 2
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_VS, 3DSTATE_URB_DS, and 3DSTATE_URB_GS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 31h 3DSTATE_URB_HS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 HS URB Starting Address
Format: U5
Offset from the start of the URB memory where HS starts its allocation, specified in multiples of
8 KB.
Value Name
[0,11]
24:16 HS URB Entry Allocation Size
Format: U9-1 Count of 512-bit units
Specifies the length of each URB entry owned by HS. This field is always used (even if HS
Function Enable is DISABLED).
15:0 HS Number of URB Entries
Specifies the number of URB entries that are used by HS. This field is always used (even
if HS Function Enable is DISABLED).
Programming Restriction: HS Number of URB Entries must be divisible by 8 if the HS
URB Entry Allocation Size is less than 9 512-bit URB entries. "2:0" = reserved "000"
Value Name
[0,32]
3DSTATE_URB_VS
Source: RenderCS
Length Bias: 2
Description
VS URB Entry Allocation Size equal to 4 (5 512-bit URB rows) may cause performance to decrease due
to banking in the URB. Element sizes of 16 to 20 should be programmed with six 512-bit URB rows.
This command may not overlap with the push constants in the URB defined by the
3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,
3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.
Programming Notes
3DSTATE_URB_HS, 3DSTATE_URB_DS, and 3DSTATE_URB_GS must also be programmed in order for the
programming of this state to be valid.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 30h 3DSTATE_URB_VS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31 Reserved
Format: MBZ
30 Reserved
Format: MBZ
29:25 VS URB Starting Address
Format: U5
Offset from the start of the URB memory where VS starts its allocation, specified in multiples of
8 KB.
Value Name
[0,11]
24:16 VS URB Entry Allocation Size
Format: U9-1 count of 512-bit units
Specifies the length of each URB entry owned by VS. This field is always used (even if VS
Function Enable is DISABLED).
Programming Notes
Programming Restriction: As the VS URB entry serves as both the per-vertex input and output
of the VS shader, the VS URB Allocation Size must be sized to the maximum of the vertex input
and output structures.
15:0 VS Number of URB Entries
Format: U16
Specifies the number of URB entries that are used by VS. This field is always used (even if VS
Function Enable is DISABLED).
Value
[32,512]
Programming Notes
Programming Restriction: VS Number of URB Entries must be divisible by 8 if the VS URB Entry
Allocation Size is less than 9 512-bit URB entries."2:0" = reserved "000b"
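The DWord 1 packing and the divisibility restriction for 3DSTATE_URB_VS can be sketched in C as below. The function names are hypothetical. The sketch assumes that "less than 9 512-bit URB entries" refers to the entry size in 512-bit rows (the field itself stores rows minus one); the equivalent DS note is worded differently, so this reading should be checked before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validity check for 3DSTATE_URB_VS DWord 1 parameters.
 * start_8kb:   VS URB Starting Address, in 8 KB units, range [0,11]
 * entry_rows:  entry size in 512-bit rows (field is U9-1, so 1..512)
 * num_entries: VS Number of URB Entries, range [32,512] */
static inline bool urb_vs_params_valid(unsigned start_8kb, unsigned entry_rows,
                                       unsigned num_entries)
{
    if (start_8kb > 11)
        return false;
    if (entry_rows < 1 || entry_rows > 512)
        return false;
    if (num_entries < 32 || num_entries > 512)
        return false;
    /* Assumed reading: entries smaller than 9 rows need a multiple of 8. */
    if (entry_rows < 9 && num_entries % 8 != 0)
        return false;
    return true;
}

/* Packs DWord 1: Starting Address 29:25, Allocation Size 24:16, count 15:0. */
static inline uint32_t urb_vs_dw1(unsigned start_8kb, unsigned entry_rows,
                                  unsigned num_entries)
{
    assert(urb_vs_params_valid(start_8kb, entry_rows, num_entries));
    return ((uint32_t)start_8kb << 25) |
           ((uint32_t)(entry_rows - 1) << 16) |
           (uint32_t)num_entries;
}
```

The same layout applies to the DS, GS and HS variants, with their own entry-count ranges.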
3DSTATE_VERTEX_BUFFERS
Source: RenderCS
Length Bias: 2
Description
This command is used to specify VB state used by the VF function.
Can specify from 1 to 33 VBs.
The VertexBufferID field within a VERTEX_BUFFER_STATE structure indicates the specific VB. If
a VB definition is not included in this command, its associated state is left unchanged and is
available for use if previously defined.
Programming Notes
It is possible to have individual vertex elements sourced completely from generated ID values and
therefore not require any vertex buffer accesses for that vertex element. In this case, VF function will
simply ignore the VB state associated with that vertex element. If all enabled vertex elements have
this characteristic, no VBs are required to process 3DPRIMITIVE commands. For example, this might
arise when the user wants to perform all data lookups in the first shader, so only generated index
values need to be passed down to it. In this extreme case, SW would not need to program any VB
state, and therefore not need to issue any 3DSTATE_VERTEX_BUFFERS commands.
For any 3DSTATE_VERTEX_BUFFERS command, at least one VERTEX_BUFFER_STATE structure must be included.
VERTEX_BUFFER_STATE structures are 4 DWords for both VERTEXDATA buffers and INSTANCEDATA buffers.
Inclusion of partial VERTEX_BUFFER_STATE structures is UNDEFINED.
The order in which VBs are defined within this command can be arbitrary, though a vertex buffer must be
defined only once in any given command (otherwise operation is UNDEFINED).
DWord Bit Description
0 31:29 Command Type
Default Value: 03h GFXPIPE
Format: Opcode
28:27 Command SubType
Default Value: 3h 3D
Format: Opcode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_VERTEX_BUFFERS
Format: Opcode
23:16 3D Command Sub Opcode
Default Value: 08h 3DSTATE_VERTEX_BUFFERS
Format: Opcode
15:8 Reserved
7:0 DWord Count
Default Value: 3 DWORD_COUNT_n
Format: =n
n = 4b-1 (where b = # of buffer states included)
1..n 127:0 Vertex Buffer State [n]
Format: VERTEX_BUFFER_STATE
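The relation n = 4b-1 between the DWord Count field and the number of VERTEX_BUFFER_STATE structures can be sketched in C as follows. The function names are hypothetical; the 1 to 33 buffer limit and the 4-DWord structure size come from the description of this command.

```c
#include <assert.h>

/* Returns the DWord Count field value for b buffer states, or -1 if b is
 * outside the 1..33 range allowed by the command description. */
static inline int vb_dword_count(unsigned num_buffers)
{
    if (num_buffers < 1 || num_buffers > 33)
        return -1;
    return (int)(4 * num_buffers) - 1;
}

/* Total command size in DWords: one header DWord plus 4 per buffer state.
 * This equals DWord Count + 2, matching the Length Bias of 2. */
static inline unsigned vb_total_dwords(unsigned num_buffers)
{
    return 1 + 4 * num_buffers;
}
```

A command with a single buffer state therefore carries the default DWord Count of 3 and occupies 5 DWords.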
3DSTATE_VERTEX_ELEMENTS
Source: RenderCS
Length Bias: 2
Description
This is a variable-length command used to specify the active vertex elements. Each
VERTEX_ELEMENT_STATE structure contains a Valid bit which determines which elements are
used.
Up to 34 elements.
Programming Notes
At least one VERTEX_ELEMENT_STATE structure must be included.
Inclusion of partial VERTEX_ELEMENT_STATE structures is UNDEFINED.
SW must ensure that at least one vertex element is defined prior to issuing a 3DPRIMITIVE
command, or operation is UNDEFINED.
There are no 'holes' allowed in the destination vertex: NOSTORE components must be
overwritten by subsequent components unless they are the trailing DWords of the vertex.
Software must explicitly choose some value (probably 0) to be written into DWords that would
otherwise be 'holes'.
Within a VERTEX_ELEMENT_STATE structure, if a Component Control field is set to something
other than VFCOMP_STORE_SRC, no higher-numbered Component Control fields may be set
to VFCOMP_STORE_SRC. In other words, only trailing components can be set to something
other than VFCOMP_STORE_SRC.
See additional restrictions listed in the command fields and VERTEX_ELEMENT_STATE
description.
Element[0] must be valid.
All elements must be valid from Element[0] to the last valid element. (I.e. if Element[2] is valid
then Element[1] and Element[0] must also be valid).
The pitch between elements packed in the URB will always be 128 bits.
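The DWord Count field of this command is related to the element count by Vertex Element Count = (DWord Count + 1) / 2, since each VERTEX_ELEMENT_STATE occupies two DWords. The C sketch below illustrates that relation together with the fixed 128-bit element pitch in the URB. The function names are hypothetical, and the upper limit of 34 is taken from the "Up to 34 elements" statement.

```c
#include <assert.h>

/* DWord Count field value for a given element count, or -1 if the count is
 * outside the 1..34 range stated for this command. */
static inline int ve_dword_count(unsigned elements)
{
    if (elements < 1 || elements > 34)
        return -1;
    return (int)(2 * elements) - 1;
}

/* Inverse relation: Vertex Element Count = (DWord Count + 1) / 2. */
static inline unsigned ve_element_count(unsigned dword_count)
{
    return (dword_count + 1) / 2;
}

/* Size of the assembled vertex in the URB: elements are packed with a
 * fixed pitch of 128 bits (16 bytes). */
static inline unsigned ve_urb_vertex_bytes(unsigned elements)
{
    return elements * 16;
}
```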
DWord Bit Description
0 31:29 Command Type
Default Value: 03h GFXPIPE
Format: Opcode
28:27 Command SubType
Default Value: 3h 3D
Format: Opcode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_VERTEX_ELEMENTS
Format: Opcode
23:16 3D Command Sub Opcode
Default Value: 09h 3DSTATE_VERTEX_ELEMENTS
Format: Opcode
15:8 Reserved
7:0 DWord Count
Format: =n
Vertex Element Count = (DWord Count + 1) / 2
Value Name Description
1 DWORD_COUNT_n [Default] excludes DWords 0,1
[1,66] Range 1-34 Elements
1..n 63:0 Element [n]
Format: VERTEX_ELEMENT_STATE
3DSTATE_VF_STATISTICS
Source: RenderCS
Length Bias: 1
The VF stage tracks two pipeline statistics, the number of vertices fetched and the number of objects generated.
VF will increment the appropriate counter for each when statistics gathering is enabled by issuing the
3DSTATE_VF_STATISTICS command with the [Statistics Enable] bit set.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: Opcode
28:27 Command SubType
Format: Opcode
Value Name
1h Pipelined, Single DWord [Default]
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: Opcode
GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] (Pipelined, Single DWord)
23:16 3D Command Sub Opcode
Default Value: 0Bh 3DSTATE_VF_STATISTICS
Format: Opcode
GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] (Pipelined, Single DWord)
15:1 Reserved
Format: MBZ
0 Statistics Enable
Format: Enable
If ENABLED, VF will increment the pipeline statistics counters IA_VERTICES_COUNT and
IA_PRIMITIVES_COUNT for each vertex fetched and each object output, respectively, for
3DPRIMITIVE commands issued subsequently.
If DISABLED, these counters will not be incremented for subsequent 3DPRIMITIVE commands.
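Because 3DSTATE_VF_STATISTICS is a single-DWord command, the whole command can be built from the opcode fields and the enable bit. The C sketch below shows one way to do this; the function name is hypothetical and only the field values come from the table for this command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Builds the single DWord of 3DSTATE_VF_STATISTICS:
 * Command Type 3h (31:29), SubType 1h (28:27), Opcode 0h (26:24),
 * Sub Opcode 0Bh (23:16), Statistics Enable in bit 0. */
static inline uint32_t vf_statistics_cmd(bool enable)
{
    return (3u << 29) | (1u << 27) | (0u << 24) | (0x0Bu << 16) |
           (enable ? 1u : 0u);
}
```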
3DSTATE_VIEWPORT_STATE_POINTERS_CC
Source: RenderCS
Length Bias: 2
The 3DSTATE_VIEWPORT_STATE_POINTERS_CC command is used to define the location of fixed functions'
viewport state table.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 23h 3DSTATE_VIEWPORT_STATE_POINTERS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:5 CC Viewport Pointer
Format: DynamicStateOffset[31:5]CC_VIEWPORT*16
Specifies the 32-byte aligned address offset of the CC_VIEWPORT state. This offset is relative to
the Dynamic State Base Address.
4:0 Reserved
Format: MBZ
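Since bits 4:0 of DWord 1 must be zero, the CC_VIEWPORT offset has to be a multiple of 32 bytes. A minimal C sketch of the two DWords, with hypothetical function names and with the offset assumed to be already relative to the Dynamic State Base Address:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DWord 0: type 3h, subtype 3h, opcode 0h, sub-opcode 23h, DWord Length 0h. */
static inline uint32_t cc_viewport_dw0(void)
{
    return (3u << 29) | (3u << 27) | (0u << 24) | (0x23u << 16) | 0u;
}

/* DWord 1: the offset occupies bits 31:5 and bits 4:0 are MBZ, so a valid
 * offset is written unchanged. Returns false for a misaligned offset. */
static inline bool cc_viewport_dw1(uint32_t dynamic_state_offset, uint32_t *dw1)
{
    if (dynamic_state_offset & 0x1Fu)
        return false; /* not 32-byte aligned */
    *dw1 = dynamic_state_offset;
    return true;
}
```

The SF_CLIP variant would follow the same pattern with a 64-byte alignment mask.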
3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP
Source: RenderCS
Length Bias: 2
The 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP command is used to define the location of fixed functions'
viewport state table.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 21h 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 0h DWORD_COUNT_n
Format: =n
1 31:6 SF Clip Viewport Pointer
Format: DynamicStateOffset[31:6]SF_CLIP_VIEWPORT*16
Specifies the 64-byte aligned address offset of the SF_CLIP_VIEWPORT state. This offset is
relative to the Dynamic State Base Address.
5:0 Reserved
Format: MBZ
3DSTATE_VS
Source: RenderCS
Length Bias: 2
Description
The state used by VS is defined with this inline state packet.
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 10h 3DSTATE_VS
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 4h Excludes DWord (0,1)
Format: =n Total Length - 2
1 31:6 Kernel Start Pointer
Format: InstructionBaseOffset[31:6]Kernel
This field specifies the starting location (1st GEN4 core instruction) of the kernel program run by
threads spawned by this FF unit. It is specified as a 64-byte-granular offset from the Instruction
Base Address. This field is ignored if VS Function Enable is DISABLED.
5:0 Reserved
Format: MBZ
2 31 Single Vertex Dispatch
Format: U1 Enumerated type
This field can be used to force single vertex SIMD4x2 VS threads.
Value Name Description
0h Multiple Dual vertex SIMD4x2 thread dispatches are allowed.
1h Single Single vertex SIMD4x2 thread dispatches are forced.
30 Vector Mask Enable (VME)
When SPF=0, VME specifies which mask to use to initialize the initial channel enables. When
SPF=1, VME specifies which mask to use to generate execution channel enables.
Value Name Description
0h Dmask Channels are enabled based on the dispatch mask
1h Vmask Channels are enabled based on the vector mask
29:27 Sampler Count
Specifies how many samplers (in multiples of 4) the vertex shader 0 kernel uses. Used only for
prefetching the associated sampler state entries. This field is ignored if VS Function Enable is
DISABLED.
Value Name Description
0h No Samplers no samplers used
1h 1-4 Samplers between 1 and 4 samplers used
2h 5-8 Samplers between 5 and 8 samplers used
3h 9-12 Samplers between 9 and 12 samplers used
4h 13-16 Samplers between 13 and 16 samplers used
26 Reserved
Format: MBZ
25:18 Binding Table Entry Count
Format: U8
Specifies how many binding table entries the kernel uses. Used only for prefetching of the
binding table entries and associated surface state.
Note: For kernels using a large number of binding table entries, it may be wise to set this field to
zero to avoid prefetching too many entries and thrashing the state cache.
This field is ignored if VS Function Enable is DISABLED.
Value Name
[0,255]
17 Reserved
Format: MBZ
16 Floating Point Mode
Format: U1 enumerated type
Specifies the initial floating point mode used by the dispatched thread. This field is ignored if VS
Function Enable is DISABLED.
Value Name Description
0h IEEE-754 Use IEEE-754 Rules
1h Alternate Use alternate rules
15:14 Reserved
Format: MBZ
13 Illegal Opcode Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA
Execution. This field is ignored if VS Function Enable is DISABLED.
12 Reserved
Format: MBZ
11:8 Reserved
Format: MBZ
7 Software Exception Enable
Format: Enable
This bit gets loaded into EU CR0.1[13] (note the bit # difference). See Exceptions and ISA
Execution. This field is ignored if VS Function Enable is DISABLED.
6:0 Reserved
Format: MBZ
3 31:10 Scratch Space Base Offset
Format: GeneralStateOffset[31:10]ScratchSpace
Specifies the starting location of the scratch space area allocated to this FF unit as a 1K-byte
aligned offset from the General State Base Address. If required, each thread spawned by this FF
unit will be allocated some portion of this space, as specified by Per-Thread Scratch Space. The
computed offset of the thread-specific portion will be passed in the thread payload as Scratch
Space Offset. The thread is expected to utilize "stateless" DataPort read/write requests to access
scratch space, where the DataPort will cause the General State Base Address to be added to the
offset passed in the request header.
This field is ignored if VS Function Enable is DISABLED.
9:4 Reserved
Format: MBZ
3:0 Per-Thread Scratch Space
Format: U4 power of 2 Bytes over 1K Bytes
Specifies the amount of scratch space to be allocated to each thread spawned by this FF unit.
The driver must allocate enough contiguous scratch space, starting at the Scratch Space Base
Pointer, to ensure that the Maximum Number of Threads can each get Per-Thread Scratch Space
size without exceeding the driver-allocated scratch space. This field is ignored if VS Function
Enable is DISABLED.
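The Per-Thread Scratch Space field encodes a power of two above 1K bytes, so a field value v in [0,11] corresponds to 1024 << v bytes (1K to 2M bytes). The C sketch below, using hypothetical function names, picks the smallest field value that covers a requested size and computes the contiguous allocation the driver would need for a given thread count.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the smallest Per-Thread Scratch Space field value whose size
 * (1024 << value bytes) covers bytes_needed, or -1 if more than 2M bytes
 * per thread would be required. */
static inline int scratch_field_for_bytes(uint32_t bytes_needed)
{
    for (int v = 0; v <= 11; v++) {
        if ((UINT32_C(1024) << v) >= bytes_needed)
            return v;
    }
    return -1;
}

/* Total contiguous scratch space needed so that max_threads threads can
 * each receive the per-thread amount selected by field_value. */
static inline uint64_t scratch_total_bytes(int field_value, unsigned max_threads)
{
    assert(field_value >= 0 && field_value <= 11);
    return (UINT64_C(1024) << field_value) * max_threads;
}
```

For instance, a kernel needing 1025 bytes per thread would use field value 1 (2K bytes), and 16 threads would then need 32K bytes in total.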
Value Name Description
[0,11] indicating [1K Bytes, 2M Bytes]
Programming Notes
This amount is available to the kernel for information only. It will be passed verbatim (if not
altered by the kernel) to the Data Port in any scratch space access messages, but the Data Port
will ignore it.
4 31:25 Reserved
Format: MBZ
24:20 Dispatch GRF Start Register for URB Data
Format: U5
Specifies the starting GRF register number for the URB portion (Constant + Vertices) of the
thread payload. This field is ignored if VS Function Enable is DISABLED.
Value Name Description
[0,31] indicating GRF [R0,R31]
19:17 Reserved
Format: MBZ
16:11 Vertex URB Entry Read Length
Format: U6
Specifies the number of pairs of 128-bit vertex elements to be passed into the payload
for each vertex. This field is ignored if VS Function Enable is DISABLED.
For SIMD4x2 dispatch, each vertex element requires one GRF of payload data, therefore
the number of GRFs with vertex data will be double the value programmed in this field.
Value
[1,63]
Programming Notes
It is UNDEFINED to set this field to 0 indicating no Vertex URB data to be read and passed to
the thread.
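The read length is expressed in pairs of 128-bit vertex elements, and under SIMD4x2 dispatch the number of payload GRFs holding vertex data is twice the programmed value. The following C sketch, with hypothetical function names, derives the field from an element count and packs the three DWord 4 fields (GRF start 24:20, read length 16:11, read offset 9:4).

```c
#include <assert.h>
#include <stdint.h>

/* Read length (in pairs of 128-bit elements) needed to cover the given
 * number of vertex elements, or -1 if the result would fall outside the
 * [1,63] range. A value of 0 is UNDEFINED and is rejected. */
static inline int vs_urb_read_length(unsigned elements)
{
    if (elements == 0)
        return -1;
    unsigned pairs = (elements + 1) / 2; /* round up to whole pairs */
    return pairs > 63 ? -1 : (int)pairs;
}

/* For SIMD4x2 dispatch, GRFs with vertex data are double the field value. */
static inline unsigned vs_vertex_grfs(unsigned read_length)
{
    return 2 * read_length;
}

/* Packs the non-reserved fields of 3DSTATE_VS DWord 4. */
static inline uint32_t vs_dw4(unsigned grf_start, unsigned read_length,
                              unsigned read_offset)
{
    assert(grf_start <= 31);
    assert(read_length >= 1 && read_length <= 63);
    assert(read_offset <= 63);
    return ((uint32_t)grf_start << 20) | ((uint32_t)read_length << 11) |
           ((uint32_t)read_offset << 4);
}
```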
10 Reserved
Format: MBZ
9:4 Vertex URB Entry Read Offset
Format: U6
Specifies the offset (in 256-bit units) at which Vertex URB data is to be read from the URB before
being included in the thread payload. This offset applies to all Vertex URB entries passed to the
thread. This field is ignored if VS Function Enable is DISABLED.
Value Name
[0,63]
3:0 Reserved
Format: MBZ
5 31:25 Maximum Number of Threads
Format: U7-1 representing thread count
Specifies the maximum number of simultaneous threads allowed to be active. Used to avoid
using up the scratch space. Programming a maximum that exceeds the number of threads
supported in the execution units may improve performance, since the architecture allows
threads to be buffered between the check for max threads and the actual dispatch into the EU.
Programming a maximum that is less than the number of threads supported in the execution
units may reduce performance. This field is ignored if VS Function Enable is DISABLED.
Value
[0,15]
Name
indicating thread count of [1,16]
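The field uses a minus-one encoding: a thread count of 1 to 16 is stored as 0 to 15 in bits 31:25 of DWord 5. A minimal C sketch with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Encodes Maximum Number of Threads (U7-1) into bits 31:25 of DWord 5.
 * thread_count is the actual number of threads, 1..16 per the value table. */
static inline uint32_t vs_max_threads_bits(unsigned thread_count)
{
    assert(thread_count >= 1 && thread_count <= 16);
    return (uint32_t)(thread_count - 1) << 25;
}
```

The result would be ORed with the remaining DWord 5 fields (Statistics Enable, Vertex Cache Disable, VS Function Enable).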
24:23 Reserved
Format: MBZ
22:11 Reserved
Format: MBZ
10 Statistics Enable
Format: Enable
If ENABLED, this FF unit will engage in statistics gathering. See the Statistics Gathering
section later in this chapter. If DISABLED, statistics information associated with this FF
stage will be left unchanged.
This field is used even if VS Function Enable is DISABLED.
9:3 Reserved
Format: MBZ
2 Reserved
Format: MBZ
1 Vertex Cache Disable
Format: Disable
This bit controls the operation of the Vertex Cache. This field is always used. If the Vertex Cache
is DISABLED and the VS Function is ENABLED, the Vertex Cache is not used and all incoming
vertices will be passed to VS threads.
If the Vertex Cache is ENABLED and the VS Function is ENABLED, incoming vertices that do not
hit in the Vertex Cache will be passed to VS threads.
If the Vertex Cache is ENABLED and the VS Function is DISABLED, input vertices that miss in the
Vertex Cache will be assembled and written to the URB, though pass thru the VS stage
unmodified (not shaded).
The Vertex Cache is invalidated whenever the Vertex Cache becomes DISABLED, whenever the
VS Function Enable toggles, between 3DPRIMITIVE commands and between instances within a
3DPRIMITIVE command.
0 VS Function Enable
Format: Enable
Description
If ENABLED, VS threads may be spawned to process VF-generated vertices before the
resulting vertices are passed down the pipeline.
If DISABLED, VF-generated vertices will pass thru the VS function and be sent down the
pipeline unmodified. The Vertex Cache is still available in this mode, if enabled.
If Statistics Enable is ENABLED, VS_INVOCATION_COUNT will increment by 1 for every
vertex that passes through the VS stage, even if VS Function Enable is DISABLED.
This field is always used.
Doc Ref # IHD-OS-VLV-Vol2 pt2-04.14 177
Command Reference - Instructions
3DSTATE_WM
Source: RenderCS
Length Bias: 2
DWord Bit Description
0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode
28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode
26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode
23:16 3D Command Sub Opcode
Default Value: 14h 3DSTATE_WM
Format: OpCode
15:8 Reserved
Format: MBZ
7:0 DWord Length
Default Value: 01h Excludes DWord (0,1)
Format: =n Total Length - 2
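As a minimal sketch of how the DWord 0 fields above combine, the following C helper assembles the 3DSTATE_WM header from the listed default values. The function name wm_header_dw0 and its total_dwords parameter are illustrative assumptions, not part of this reference.

```c
#include <stdint.h>

/* Hypothetical helper: builds DWord 0 of 3DSTATE_WM from the field
 * positions and default values listed in the table above. */
static uint32_t wm_header_dw0(uint32_t total_dwords)
{
    const uint32_t command_type = 0x3;   /* bits 31:29, GFXPIPE */
    const uint32_t command_sub  = 0x3;   /* bits 28:27, GFXPIPE_3D */
    const uint32_t opcode       = 0x0;   /* bits 26:24, 3DSTATE_PIPELINED */
    const uint32_t sub_opcode   = 0x14;  /* bits 23:16, 3DSTATE_WM */
    /* bits 7:0 hold Total Length - 2 (the Length Bias is 2) */
    const uint32_t dword_length = (total_dwords - 2u) & 0xffu;

    return (command_type << 29) | (command_sub << 27) | (opcode << 24) |
           (sub_opcode << 16) | dword_length;
}
```

With the three-DWord command described here, wm_header_dw0(3) yields a DWord Length field of 01h, matching the default value in the table.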
1 31 Statistics Enable
Format: Enable
If ENABLED, the Windower and pixel pipeline will engage in statistics gathering. If DISABLED,
statistics information associated with this FF stage will be left unchanged. See Statistics
Gathering.
30 Depth Buffer Clear
Format: Enable
When set, the depth buffer is initialized as a side-effect of rendering pixels.
Programming Notes
If this field is enabled,
• the Depth Test Enable field in DEPTH_STENCIL_STATE must be disabled.
• 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.
• 3DSTATE_DEPTH_BUFFER::Stencil Write Enable must be set if
3DSTATE_STENCIL_BUFFER::Stencil buffer enable is set. Additionally the following must
be set to the correct values.
• DEPTH_STENCIL_STATE::Stencil Write Mask must be 0xFF
• DEPTH_STENCIL_STATE::Stencil Test Mask must be 0xFF
• DEPTH_STENCIL_STATE::Back Face Stencil Write Mask must be 0xFF
• DEPTH_STENCIL_STATE::Back Face Stencil Test Mask must be 0xFF
Refer to section 0 "Depth Buffer Clear" for additional restrictions when this field is enabled. If
this field is enabled, Pixel Shader Kill Pixel must be disabled.
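The Depth Buffer Clear programming notes can be read as a checklist. The sketch below expresses them as a C predicate; the struct and all of its member names are hypothetical, and it assumes the four 0xFF mask requirements apply only when the stencil buffer is enabled, which the text does not state explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the state referenced by the notes above. */
struct depth_clear_state {
    bool    depth_test_enable;      /* DEPTH_STENCIL_STATE */
    bool    depth_write_enable;     /* 3DSTATE_DEPTH_BUFFER */
    bool    stencil_buffer_enable;  /* 3DSTATE_STENCIL_BUFFER */
    bool    stencil_write_enable;   /* 3DSTATE_DEPTH_BUFFER */
    uint8_t stencil_write_mask;
    uint8_t stencil_test_mask;
    uint8_t bf_stencil_write_mask;
    uint8_t bf_stencil_test_mask;
    bool    ps_kill_pixel;          /* 3DSTATE_WM, Pixel Shader Kill Pixel */
};

/* Returns true if the state satisfies the notes for Depth Buffer Clear. */
static bool depth_clear_state_ok(const struct depth_clear_state *s)
{
    if (s->depth_test_enable || !s->depth_write_enable || s->ps_kill_pixel)
        return false;
    if (s->stencil_buffer_enable) {
        /* Assumption: mask checks are tied to the stencil buffer being on. */
        if (!s->stencil_write_enable)
            return false;
        if (s->stencil_write_mask != 0xFF || s->stencil_test_mask != 0xFF ||
            s->bf_stencil_write_mask != 0xFF || s->bf_stencil_test_mask != 0xFF)
            return false;
    }
    return true;
}
```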
29 Thread Dispatch Enable
Format: Enable
This bit, if set, indicates that it is possible for a PS thread to modify a render target, i.e., at least
one render target is enabled (is not of type SURFTYPE_NULL and has at least one channel
enabled for writes) and the PS kernel contains a code path that may issue a write to that/those
enabled RTs.
Programming Notes
This bit is used for performance optimizations and does not directly control writing to render
targets. If this bit is DISABLED, no pixel shader threads will be dispatched. For correct behavior,
this bit must be set consistently with the behavior of the PS kernel, i.e. if this bit is DISABLED
the PS kernel must not write color or depth to any render targets. If this field is disabled, Pixel
Shader Kill Pixel must be disabled.
28 Depth Buffer Resolve Enable
Format: Enable
When set, the depth buffer is made to be consistent with the hierarchical depth buffer as a side-
effect of rendering pixels. This is intended to be used when the depth buffer is to be used as a
surface outside of the 3D rendering operation.
Programming Notes
If this field is enabled,
• the Depth Buffer Clear and Hierarchical Depth Buffer Resolve Enable fields must
both be disabled.
• 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.
Refer to section 11.5.4.2 "Depth Buffer Resolve" for additional restrictions when this field is
enabled. If Hierarchical Depth Buffer Enable is disabled, enabling this field will have no effect.
27 Hierarchical Depth Buffer Resolve Enable
Format: Enable
When set, the hierarchical depth buffer is made to be consistent with the depth buffer as a side-
effect of rendering pixels. This is intended to be used when the depth buffer has been modified
outside of the 3D rendering operation.
Programming Notes
If this field is enabled,
• the Depth Buffer Clear and Depth Buffer Resolve Enable fields must both be
disabled.
• 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.
Refer to section 11.5.4.3 "Hierarchical Depth Buffer Resolve" for additional restrictions
when this field is enabled.
If Hierarchical Depth Buffer Enable is disabled, enabling this field will have no effect.
Performance Note: expect the hierarchical depth buffer's impact on performance to
be reduced for some period of time after this operation is performed, as the
hierarchical depth buffer is initialized to a state that makes it ineffective. Further
rendering will tend to bring the hierarchical depth buffer back to a more effective
state.
Software needs to do an ambiguate after allocating the surface for the first time if the
depth buffer width and height are NOT aligned to 8 and 4 respectively.
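Taken together, the notes for Depth Buffer Clear, Depth Buffer Resolve Enable and Hierarchical Depth Buffer Resolve Enable imply that at most one of the three may be enabled at a time. The following sketch checks that, and also the alignment condition for the ambiguate. Both function names are hypothetical, and the alignment test assumes an ambiguate is needed if either dimension is misaligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Each of the three operations requires the other two to be disabled. */
static bool depth_ops_exclusive(bool clear, bool resolve, bool hiz_resolve)
{
    return ((int)clear + (int)resolve + (int)hiz_resolve) <= 1;
}

/* Assumption: misalignment of width (to 8) or height (to 4) is enough
 * to require an ambiguate after the first allocation. */
static bool needs_ambiguate(uint32_t width, uint32_t height)
{
    return (width % 8u != 0u) || (height % 4u != 0u);
}
```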
26 Legacy Diamond Line Rasterization
Format: Enable
This bit, if ENABLED, indicates that the Windower will rasterize zero width lines using the DX9
rasterization rules. If DISABLED, the Windower will rasterize zero width lines using the DX10
rasterization rules (see Strips Fans chapter).
25 Pixel Shader Kill Pixel
Format: Enable
This bit, if ENABLED, indicates that the PS kernel or color calculator has the ability to kill
(discard) pixels or samples, other than due to depth or stencil testing. This bit is required
to be ENABLED in the following situations:
• The API pixel shader program contains "killpix" or "discard" instructions, or other code in
the pixel shader kernel that can cause the final pixel mask to differ from the pixel mask
received on dispatch.
• A sampler with chroma key enabled with kill pixel mode is used by the pixel shader.
• Any render target has Alpha Test Enable or AlphaToCoverage Enable enabled.
• The pixel shader kernel generates and outputs oMask.
Note: As ClipDistance clipping is fully supported in hardware and therefore not via PS
instructions, there should be no need to ENABLE this bit due to ClipDistance clipping.
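A driver could derive the Pixel Shader Kill Pixel bit from the four situations listed above. This is a sketch only: the struct and its member names are hypothetical stand-ins for whatever a driver tracks about the bound shader and render targets.

```c
#include <stdbool.h>

/* Hypothetical inputs, one per situation in the list above. */
struct ps_kill_inputs {
    bool shader_can_discard;        /* killpix/discard or other mask-changing code */
    bool sampler_chroma_key_kill;   /* chroma key with kill pixel mode */
    bool alpha_test_or_a2c;         /* Alpha Test Enable or AlphaToCoverage Enable */
    bool shader_writes_omask;       /* kernel generates and outputs oMask */
};

/* The bit must be ENABLED if any one of the situations applies. */
static bool wm_kill_pixel_required(const struct ps_kill_inputs *in)
{
    return in->shader_can_discard || in->sampler_chroma_key_kill ||
           in->alpha_test_or_a2c || in->shader_writes_omask;
}
```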
24:23 Pixel Shader Computed Depth Mode
Format: U2 Enumerated Type
This field specifies the computed depth mode for the pixel shader.
Value Name Description
0h PSCDEPTH_OFF Pixel shader does not compute depth
1h PSCDEPTH_ON Pixel shader computes depth with no guarantee as to its value
2h PSCDEPTH_ON_GE Pixel shader computes depth and guarantees that oDepth >= SourceDepth
3h PSCDEPTH_ON_LE Pixel shader computes depth and guarantees that oDepth <= SourceDepth
Programming Notes
When bit 5 is set in WM_STATE (RT independent rasterization is enabled), this field cannot
be programmed to values 2h or 3h.
22:21 Early Depth/Stencil Control
Format: U2 Enumerated Type
This field specifies the behavior of early depth/stencil test.
Value Name Description
0h EDSC_NORMAL Depth/Stencil Test/Write behaves as if it happens post-shader,
however the pixel shader is not necessarily executed if the
pixel fails depth or stencil test (this is the legacy behavior)
1h EDSC_PSEXEC Depth/Stencil Test/Write behaves as if it happens post-shader,
and the pixel shader is executed if the pixel fails depth or
stencil test (although pre-shader actions such as primitive
inclusion, stipple, etc. will still cause the shader not to execute)
2h EDSC_PREPS Depth/Stencil Test/Write behaves as if it happens pre-shader.
The pixel shader is not executed if the pixel fails depth or
stencil test. Depth and stencil writes occur even if the pixel is
killed by the shader or post-shader by alpha test, etc. Depth
output by the pixel shader is ignored.
3h Reserved
Programming Notes
If EDSC_PSEXEC mode is selected, Thread Dispatch Enable must be set.
Restriction
When value of "2h" is programmed, PS_INVOCATION_COUNT may not be accurate.
20 Pixel Shader Uses Source Depth
Format: Enable
This bit, if ENABLED, indicates that the PS kernel requires the source depth value (vPos.z) to be
passed in the payload. The source depth value is interpolated according to the Position ZW
Interpolation Mode state.
19 Pixel Shader Uses Source W
Format: Enable
This bit, if ENABLED, indicates that the PS kernel requires the interpolated source W value
(vPos.w) to be passed in the payload. The W value is interpolated according to the Position ZW
Interpolation Mode state.
18:17 Position ZW Interpolation Mode
Format: U2 Enumerated Type
This field selects the "interpolation mode" associated with the Position Z (source depth) and W
coordinates passed in the PS payload when the PS requires Position as input. This field does not
determine whether these coordinates are actually included in the payload (see Pixel Shader
Uses Source Depth, Pixel Shader Uses Source W).
Value Name Description
0h INTERP_PIXEL Evaluate Z & W at the pixel center or UL corner (as
specified by Pixel Location of 3DSTATE_MULTISAMPLE)
1h Reserved
2h INTERP_CENTROID
3h INTERP_SAMPLE
Programming Notes
When bit 5 is set in WM_STATE, value of 3h is not defined for this field.
Programming Note: When bit 5 in dword 1 (RT Independent Rasterization Enable) is set and bit
30 in dword 2 (PS UAV-only) is not set in WM_STATE, value of 3h is not defined for this field.
16:11 Barycentric Interpolation Mode
Format: Enable[6]
Controls which barycentric interpolation terms must be passed into the pixel shader kernel.
Bit 0: Perspective Pixel Location barycentric is required
Bit 1: Perspective Centroid barycentric is required
Bit 2: Perspective Sample barycentric is required
Bit 3: Non-perspective Pixel Location barycentric is required
Bit 4: Non-perspective Centroid barycentric is required
Bit 5: Non-perspective Sample barycentric is required
Programming Notes
If contiguous dispatch modes are enabled, only bit 3 (non-perspective pixel location) can be
set; all other bits in this field must be zero. Pixel Location refers to either the upper left
corner or pixel center depending on the Pixel Location state of 3DSTATE_MULTISAMPLE.
MSDISPMODE_PERSAMPLE is required in order to select Perspective Sample or Non-
perspective Sample barycentric coordinates.
Restriction: When Centroid Barycentric mode is required, HW may produce incorrect
interpolation results when a 2x2 pixel block has unlit pixels.
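The six enables of Barycentric Interpolation Mode occupy bits 16:11 of DWord 1, so bit 0 of the field is bit 11 of the DWord. The sketch below shows that placement and the contiguous-dispatch rule; the enum and function names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field-relative bit numbers from the list above (names are hypothetical). */
enum {
    BARY_PERSPECTIVE_PIXEL       = 1u << 0,
    BARY_PERSPECTIVE_CENTROID    = 1u << 1,
    BARY_PERSPECTIVE_SAMPLE      = 1u << 2,
    BARY_NONPERSPECTIVE_PIXEL    = 1u << 3,
    BARY_NONPERSPECTIVE_CENTROID = 1u << 4,
    BARY_NONPERSPECTIVE_SAMPLE   = 1u << 5
};

/* Places the 6-bit field at bits 16:11 of DWord 1. */
static uint32_t wm_dw1_barycentric(uint32_t modes)
{
    return (modes & 0x3fu) << 11;
}

/* With contiguous dispatch, only non-perspective pixel location may be set. */
static bool barycentric_ok_for_contiguous_dispatch(uint32_t modes)
{
    return (modes & 0x3fu & ~(uint32_t)BARY_NONPERSPECTIVE_PIXEL) == 0u;
}
```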
10 Pixel Shader Uses Input Coverage Mask
Format: Enable
This bit, if ENABLED, indicates that the PS kernel requires the input coverage mask to be passed
in the payload.
9:8 Line End Cap Antialiasing Region Width
Format: U2
This field specifies the distance over which the coverage of anti-aliased line end caps is
computed.
Value Name
0h 0.5 pixels
1h 1.0 pixels
2h 2.0 pixels
3h 4.0 pixels
7:6 Line Antialiasing Region Width
Format: U2
This field specifies the distance over which the anti-aliased line coverage is computed.
Value Name
0h 0.5 pixels
1h 1.0 pixels
2h 2.0 pixels
3h 4.0 pixels
5 Reserved
Format: MBZ
4 Polygon Stipple Enable
Format: Enable
Enables the Polygon Stipple function.
3 Line Stipple Enable
Format: Enable
Enables the Line Stipple function.
2 Point Rasterization Rule
Format: 3D_RasterizationRule
This field specifies the rasterization rules to be applied whenever the edges of a point primitive
fall exactly on a pixel sampling point.
Value Name Description
0h RASTRULE_UPPER_LEFT To match "normal" upper left rules for surface
primitives
1h RASTRULE_UPPER_RIGHT To match OpenGL point rasterization rules (round to
+infinity, where this is the upper right direction wrt
OpenGL screen origin of lower left).
1:0 Multisample Rasterization Mode
Format: U2 enumerated type
This field determines whether multisample rasterization is turned on/off, and how the pixel
sample point(s) are defined. Software sets this according to the API, the API's multisample enable
state setting (if any), and whether 1X or 4X MSRTs are bound. This state is duplicated in
3DSTATE_SF and both must be set to the same value. Refer to the "Multisampling" section for
details on the settings of this field.
Value Name
0h MSRASTMODE_OFF_PIXEL
1h MSRASTMODE_OFF_PATTERN
2h MSRASTMODE_ON_PIXEL
3h MSRASTMODE_ON_PATTERN
2 31 Multisample Dispatch Mode
Format: U1 Enumerated Type
This bit, along with Number of Multisamples, determines how PS threads are dispatched.
Software programs this bit depending on the per-pixel vs. per-sample PS execution requirement.
When RT Independent Rasterization Enable = 1, value of 0h for this field is not allowed.
Value Name Description
0h MSDISPMODE_PERSAMPLE This is the high-quality DX10.1 multisample mode
where (over and above PERPIXEL mode) the PS is
run for each covered sample. This mode is also
used for "normal" non-multisample rendering (aka
1X), given Number of Multisamples is
programmed to NUMSAMPLES_1.
1h MSDISPMODE_PERPIXEL This is the classic multisample mode of operation,
typically used for both antialiasing and
transparency. Setup and rasterization operate in
full multisample mode, testing coverage and
depth/stencil test at the sample level but only
running the PS once per pixel.
30:0 Reserved
Format: MBZ
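To tie the 3DSTATE_WM bit positions together, here is a sketch that packs a subset of the DWord 1 and DWord 2 fields into their documented positions. The struct wm_fields and both pack functions are hypothetical; only the bit positions come from the field descriptions of 3DSTATE_WM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software-side view of a subset of 3DSTATE_WM fields. */
struct wm_fields {
    bool     statistics_enable;       /* DWord 1, bit 31 */
    bool     thread_dispatch_enable;  /* DWord 1, bit 29 */
    bool     kill_pixel;              /* DWord 1, bit 25 */
    uint32_t computed_depth_mode;     /* DWord 1, bits 24:23 */
    uint32_t early_ds_control;        /* DWord 1, bits 22:21 */
    uint32_t barycentric_modes;       /* DWord 1, bits 16:11 */
    bool     polygon_stipple_enable;  /* DWord 1, bit 4 */
    bool     line_stipple_enable;     /* DWord 1, bit 3 */
    uint32_t ms_rast_mode;            /* DWord 1, bits 1:0 */
    uint32_t ms_dispatch_mode;        /* DWord 2, bit 31 */
};

static uint32_t wm_pack_dw1(const struct wm_fields *f)
{
    return ((uint32_t)f->statistics_enable      << 31) |
           ((uint32_t)f->thread_dispatch_enable << 29) |
           ((uint32_t)f->kill_pixel             << 25) |
           ((f->computed_depth_mode & 0x3u)     << 23) |
           ((f->early_ds_control    & 0x3u)     << 21) |
           ((f->barycentric_modes   & 0x3fu)    << 11) |
           ((uint32_t)f->polygon_stipple_enable << 4)  |
           ((uint32_t)f->line_stipple_enable    << 3)  |
           (f->ms_rast_mode & 0x3u);
}

/* Bits 30:0 of DWord 2 are reserved and stay zero. */
static uint32_t wm_pack_dw2(const struct wm_fields *f)
{
    return (f->ms_dispatch_mode & 0x1u) << 31;
}
```

Fields left out of the struct (for example the depth clear and resolve enables) would be added the same way at their listed bit positions.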
add - Addition
Source: EuIsa
Length Bias: 4
The add instruction performs component-wise addition of src0 and src1 and stores the results in dst.
Addition of two floating-point numbers follows rules in add (IEEE mode) or add (ALT mode).
Format:
[(pred)] add[.cmod] (exec_size) dst src0 src1
Programming Notes
Use a source modifier with add to implement subtraction.
Syntax
[(pred)] add[.cmod] (exec_size) reg reg reg
[(pred)] add[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    dst[n] = src0[n] + src1[n];
  }
}
Predication Conditional Modifier Saturation Source Modifier
Y Y Y Y
Src Types Dst Types
*B,*W,*D *B,*W,*D
*B,*W,*D F
F F
DF DF
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
addc - Addition with Carry
Source: EuIsa
Length Bias: 4
The addc instruction performs component-wise addition of src0 and src1 and stores the results in dst; it also
stores the carry into acc.
If the operation produces a carry out, 0x00000001 is stored in acc, else 0x00000000 is stored in acc.
Format:
[(pred)] addc[.cmod] (exec_size) dst src0 src1
Restriction
Restriction: AccWrEn is required. The accumulator is an implicit destination and thus cannot be an explicit
destination operand.
Syntax
[(pred)] addc[.cmod] (exec_size) reg reg reg
[(pred)] addc[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    dst[n] = src0[n] + src1[n];
    acc[n] = carry(src0[n] + src1[n]);
  }
}
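For a single channel, the addc behavior described above can be modeled in C as follows. This is a sketch for one UD channel only; the struct and function names are hypothetical, and write enables and predication are not modeled.

```c
#include <stdint.h>

/* Result of one addc channel: the 32-bit sum and the carry stored in acc. */
struct addc_result {
    uint32_t dst;
    uint32_t acc;  /* 0x00000001 on carry out, else 0x00000000 */
};

static struct addc_result addc_ud(uint32_t src0, uint32_t src1)
{
    struct addc_result r;
    r.dst = src0 + src1;               /* wraps modulo 2^32 */
    r.acc = (r.dst < src0) ? 1u : 0u;  /* wrap-around means a carry occurred */
    return r;
}
```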
Predication Conditional Modifier Saturation Source Modifier
Y Y N N
Src Types Dst Types
UD UD
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
asr - Arithmetic Shift Right
Source: EuIsa
Length Bias: 4
Description
Perform component-wise arithmetic right shift of the bits in src0 by the shift count indicated in src1,
storing the results in dst. If src0 has a signed type, insert copies of src0's sign bit in the number of
MSBs indicated by the shift count. Otherwise insert 0 bits.
The shift count is taken from the low five bits of src1, regardless of the src1 type and treated as an
unsigned integer in the range 0 to 31.
Format:
[(pred)] asr[.cmod] (exec_size) dst src0 src1
Programming Notes
If src0 is -1, the result is -1 regardless of the shift count.
For unsigned src0 types, asr and shr produce the same result.
Syntax
[(pred)] asr[.cmod] (exec_size) reg reg reg
[(pred)] asr[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    shiftCnt = src1[n] & 0x1F; // Always use low 5 bits for shift count.
    if ( src0[n] >= 0 ) {
      dst[n] = src0[n] >> shiftCnt;
    } else {
      int maskLSB = pow(2, shiftCnt) - 1;
      if ( (maskLSB & src0[n]) == 0 ) {
        dst[n] = sign(src0[n]) * (abs(src0[n]) >> shiftCnt);
      } else {
        dst[n] = sign(src0[n]) * (abs(src0[n]) >> shiftCnt) - 1;
      }
    }
  }
}
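A compact single-channel model of asr for the signed DWord case is sketched below. The function name asr_d is hypothetical. It avoids relying on the implementation-defined behavior of >> on negative values in C by shifting the bitwise complement instead, which gives the same round-toward-negative-infinity result as the pseudocode.

```c
#include <stdint.h>

/* One channel of asr with a signed 32-bit src0. */
static int32_t asr_d(int32_t src0, uint32_t src1)
{
    uint32_t shift_cnt = src1 & 0x1Fu;  /* only the low five bits are used */

    if (src0 >= 0)
        return src0 >> shift_cnt;
    /* ~src0 is non-negative here, so the shift is well defined;
     * complementing again restores the sign-extended result. */
    return ~((~src0) >> shift_cnt);
}
```

For example, asr_d(-5, 1) gives -3, and a src0 of -1 stays -1 for any shift count, as stated in the programming notes.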
Predication Conditional Modifier Saturation Source Modifier
Y Y Y Y
Src Types Dst Types
*B,*W,*D *B,*W,*D
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
avg - Average
Source: EuIsa
Length Bias: 4
The avg instruction performs component-wise integer average of src0 and src1 and stores the results in dst. An
integer average uses integer upward rounding. It is equivalent to adding one to the sum of src0 and
src1 and then applying an arithmetic right shift by one bit to this intermediate value.
Format:
[(pred)] avg[.cmod] (exec_size) dst src0 src1
Syntax
[(pred)] avg[.cmod] (exec_size) reg reg reg
[(pred)] avg[.cmod] (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    dst[n] = (src0[n] + src1[n] + 1) >> 1; // Use arithmetic shift right.
  }
}
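The rounding behavior of avg is easiest to see on concrete values. The sketch below models one signed DWord channel. The function name avg_d is hypothetical, and the use of a 64-bit intermediate so that the sum cannot overflow is an assumption about the intended precision, not something this reference states.

```c
#include <stdint.h>

/* One channel of avg for signed 32-bit sources. */
static int32_t avg_d(int32_t src0, int32_t src1)
{
    /* Assumed: the intermediate sum is wide enough not to overflow. */
    int64_t sum = (int64_t)src0 + (int64_t)src1 + 1;

    /* Arithmetic shift right by one, written portably for negative sums. */
    if (sum >= 0)
        return (int32_t)(sum >> 1);
    return (int32_t)~((~sum) >> 1);
}
```

With this model, avg_d(3, 4) is 4 and avg_d(-3, -4) is -3: both round upward, toward positive infinity.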
Predication Conditional Modifier Saturation Source Modifier
Y Y Y Y
Src Types Dst Types
*B,*W,*D *B,*W,*D
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfe - Bit Field Extract
Source: EuIsa
Length Bias: 4
Component-wise extract a bit field from src2 using the bit field width from src0 and the bit field offset from
src1. Store the extracted bit field value in the low bits of dst and sign extend (if D type) or zero extend (if UD
type).
The width and offset values are from the low five bits of src0 and src1 respectively, or src0 & 0x1f and src1 &
0x1f.
If width is zero, the result is zero.
If offset + width > 32 then the extracted bit field is bits offset to 31 of src2, extracting only 32 - offset bits, less
than width as the bit field cannot extend past the MSB of the source value. Otherwise extract width bits
extending from bit positions offset to offset + width - 1.
Format:
[(pred)] bfe (exec_size) dst src0 src1 src2
Restriction
Restriction: No accumulator access, implicit or explicit.
Restriction: All three-source instructions have certain restrictions, described in Instruction Machine
Formats.
Syntax
[(pred)] bfe (exec_size) reg reg reg reg
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    UD width = src0[n][4:0];
    UD offset = src1[n][4:0];
    if ( width == 0 ) {
      dst[n] = 0x00000000;
    } else if ( (width + offset) < 32 ) {
      dst[n] = src2[n] << (32 - width - offset);
      if ( src2 is signed ) {
        dst[n] = dst[n] >> (32 - width); // pad sign bit
      } else {
        dst[n] = dst[n] >> (32 - width); // pad 0
      }
    } else {
      if ( src2 is signed ) {
        dst[n] = src2[n] >> offset; // pad sign bit
      } else {
        dst[n] = src2[n] >> offset; // pad 0
      }
    }
  }
}
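The bfe rules above (zero width, field inside the word, field running past bit 31) can be exercised with a single-channel C model. This is a sketch: bfe_ud, bfe_d and the sar32 helper are hypothetical names, and the cast from uint32_t to int32_t assumes the usual two's complement conversion.

```c
#include <stdint.h>

/* Portable arithmetic shift right for a signed 32-bit value, 0 <= n <= 31. */
static int32_t sar32(int32_t x, uint32_t n)
{
    return (x >= 0) ? (x >> n) : ~((~x) >> n);
}

/* UD type: the extracted field is zero extended. */
static uint32_t bfe_ud(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint32_t width = src0 & 0x1fu, offset = src1 & 0x1fu;

    if (width == 0)
        return 0;
    if (width + offset < 32)
        return (src2 << (32 - width - offset)) >> (32 - width);
    return src2 >> offset;  /* field is cut off at bit 31 */
}

/* D type: the extracted field is sign extended. */
static int32_t bfe_d(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint32_t width = src0 & 0x1fu, offset = src1 & 0x1fu;

    if (width == 0)
        return 0;
    if (width + offset < 32)
        return sar32((int32_t)(src2 << (32 - width - offset)), 32 - width);
    return sar32((int32_t)src2, offset);
}
```

As a usage example, bfe_ud(4, 4, 0xABCD1234) extracts bits 7:4 and gives 0x3, while bfe_d(4, 4, 0x000000F0) sign extends the field 0xF to -1.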
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
D D
DWord Bit Description
0..3 127:126 Reserved
Format: MBZ
125:106 Source 2
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
105 Reserved
Format: MBZ
104:85 Source 1
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
84 Reserved
Format: MBZ
83:64 Source 0
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
63:56 Destination Register Number
Format: DstRegNum
55:53 Destination Subregister Number
Format: DstSubRegNum[2:0]
52:49 Destination Channel Enable
Format: ChanEn[4]
Four channel enables are defined for controlling which channels are written into the
destination region. These channel mask bits are applied in a modulo-four manner to all
ExecSize channels. There is 1-bit Channel Enable for each channel within the group of 4. If the
bit is cleared, the write for the corresponding channel is disabled. If the bit is set, the write is
enabled. Mnemonics for the bit being set for the group of 4 are x, y, z, and w, respectively,
where x corresponds to Channel 0 in the group and w corresponds to channel 3 in the group
48 Reserved
Format: MBZ
47 NibCtrl
Format: NibCtrl
46 Reserved
Format: MBZ
45:44 Destination Data Type
This field contains the data type for the destination
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
43:42 Source Data Type
This field contains the data type for all three sources
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
41:40 Source 2 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
39:38 Source 1 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
41:36 Reserved
Exists If: ([Property[Source Modification]=='false')
Format: MBZ
37:36 Source 0 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
35 Reserved
Format: MBZ
34 Flag Register Number
This field contains the flag register number for instructions with a non-zero Conditional
Modifier.
33 Flag Subregister Number
This field contains the flag subregister number for instructions with a non-zero Conditional
Modifier.
32 Reserved
Format: MBZ
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfi1 - Bit Field Insert 1
Source: EuIsa
Length Bias: 4
The bfi1 instruction is the first instruction in a two-instruction macro for bfi (Bit Field Insert).
The bfi1 instruction component-wise generates mask with control from src0 and src1 and stores the results in
dst. The mask is used in the bfi2 instruction to generate the final result of bfi.
Create a bit mask corresponding to the bit field width and offset in src0 and src1. Store the bit mask in dst. The
mask has all bits in the bit field set to 1 and all other bits as 0.
The width and offset values are from the low five bits of src0 and src1 respectively, or src0 & 0x1f and src1 &
0x1f.
If width is zero, the result is zero.
The bfi macro has four source operands: src0 - bit field width in low five bits, src1 - bit field offset/starting bit
position in low five bits, src2 - bit field value to insert, using only the number of least significant bits given by
width in src0, and src3 - overall value into which the bit field is inserted, providing all bits other than the
inserted bits for the result value.
bfi dst src0 src1 src2 src3
// Translates to these two instructions:
bfi1 dst src0 src1
bfi2 dst dst src2 src3
Format:
[(pred)] bfi1 (exec_size) dst src0 src1
Programming Notes
No accumulator access, implicit or explicit.
Syntax
[(pred)] bfi1 (exec_size) reg reg reg
[(pred)] bfi1 (exec_size) reg reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    UD width = src0[n][4:0];
    UD offset = src1[n][4:0];
    dst[n] = ((1 << width) - 1) << offset;
  }
}
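The mask generation performed by bfi1 is small enough to model directly. The following single-channel sketch uses a hypothetical function name, bfi1_ud, and treats the result as a 32-bit unsigned value, so mask bits shifted past bit 31 are dropped.

```c
#include <stdint.h>

/* One channel of bfi1: a mask of `width` ones starting at bit `offset`. */
static uint32_t bfi1_ud(uint32_t src0, uint32_t src1)
{
    uint32_t width  = src0 & 0x1fu;  /* low five bits of src0 */
    uint32_t offset = src1 & 0x1fu;  /* low five bits of src1 */

    return ((1u << width) - 1u) << offset;
}
```

For instance, a width of 4 and an offset of 8 give the mask 0x00000F00, and a width of zero gives zero.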
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
D D
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([ImmSource][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_IMM
127:64 RegSource
Exists If: ([RegSource][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfi2 - Bit Field Insert 2
Source: EuIsa
Length Bias: 4
The bfi2 instruction is the second instruction in a two-instruction macro for bfi (Bit Field Insert).
The bfi2 instruction component-wise performs the bitfield insert operation on src1 and src2 based on the mask
in src0.
Use the mask in src0 to take a bit field value from the low bits of src1 and combine it with the value from src2
(so src2 provides all bits other than those masked out and replaced by the bit field value). Store the result in
dst.
The bfi macro has four source operands: src0 - bit field width in low five bits, src1 - bit field offset/starting bit
position in low five bits, src2 - bit field value to insert, using only the number of least significant bits given by
width in src0, and src3 - overall value into which the bit field is inserted, providing all bits other than the
inserted bits for the result value.
bfi dst src0 src1 src2 src3
// Translates to these two instructions:
bfi1 dst src0 src1
bfi2 dst dst src2 src3
Format:
[(pred)] bfi2 (exec_size) dst src0 src1 src2
Restriction
Restriction: No accumulator access, implicit or explicit.
Restriction: All three-source instructions have certain restrictions, described in Instruction Machine
Formats.
Syntax
[(pred)] bfi2 (exec_size) reg reg reg reg
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    UD offset = LZD(reverse(src0[n]))-1;
    // offset is the number of LSB zero bits below the bit mask which has all 1s.
    // width (implied by the logic) is the number of 1 bits in the mask value,
    // which should be all 1s.
    dst[n] = ((src1[n] << offset) & src0[n]) | (src2[n] & ~src0[n]);
  }
}
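The insert step of bfi2 can be sketched for one channel as below. The function name bfi2_ud is hypothetical. The offset is computed here as the number of zero bits below the lowest set bit of the mask, following the comment in the pseudocode; that reading, and returning the base value unchanged for an empty mask, are assumptions of this sketch.

```c
#include <stdint.h>

/* One channel of bfi2: mask is src0 (from bfi1), field is src1, base is src2. */
static uint32_t bfi2_ud(uint32_t mask, uint32_t field, uint32_t base)
{
    uint32_t offset = 0;

    /* Count the zero bits below the lowest set bit of the mask. */
    while (offset < 32 && ((mask >> offset) & 1u) == 0u)
        offset++;
    if (offset == 32)
        return base;  /* assumed: an empty mask inserts nothing */

    /* Move the field into place, keep only masked bits, merge with base. */
    return ((field << offset) & mask) | (base & ~mask);
}
```

Using the mask 0x00000F00 that bfi1 produces for width 4 and offset 8, inserting the field value 0xA into 0x12345678 gives 0x12345A78.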
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
D D
DWord Bit Description
0..3 127:126 Reserved
Format: MBZ
125:106 Source 2
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
105 Reserved
Format: MBZ
104:85 Source 1
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
84 Reserved
Format: MBZ
83:64 Source 0
Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC
63:56 Destination Register Number
Format: DstRegNum
55:53 Destination Subregister Number
Format: DstSubRegNum[2:0]
52:49 Destination Channel Enable
Format: ChanEn[4]
Four channel enables are defined for controlling which channels are written into the
destination region. These channel mask bits are applied in a modulo-four manner to all
ExecSize channels. There is 1-bit Channel Enable for each channel within the group of 4. If the
bit is cleared, the write for the corresponding channel is disabled. If the bit is set, the write is
enabled. Mnemonics for the bit being set for the group of 4 are x, y, z, and w, respectively,
where x corresponds to Channel 0 in the group and w corresponds to channel 3 in the group
48 Reserved
Format: MBZ
47 NibCtrl
Format: NibCtrl
46 Reserved
Format: MBZ
45:44 Destination Data Type
This field contains the data type for the destination
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
43:42 Source Data Type
This field contains the data type for all three sources
Value Name
00b Single Precision Float
01b DWord
10b Unsigned DWord
11b Double Precision Float
41:40 Source 2 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
39:38 Source 1 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
41:36 Reserved
Exists If: ([Property[Source Modification]=='false')
Format: MBZ
37:36 Source 0 Modifier
Exists If: ([Property[Source Modification]=='true')
Format: SrcMod
35 Reserved
Format: MBZ
34 Flag Register Number
This field contains the flag register number for instructions with a non-zero Conditional
Modifier.
33 Flag Subregister Number
This field contains the flag subregister number for instructions with a non-zero Conditional
Modifier.
32 Reserved
Format: MBZ
31:0 Header
Format: EU_INSTRUCTION_HEADER
bfrev - Bit Field Reverse
Source: EuIsa
Length Bias: 4
The bfrev instruction component-wise reverses all the bits in src0 and stores the results in dst.
Format:
[(pred)] bfrev (exec_size) dst src0
Restriction
Restriction: No accumulator access, implicit or explicit.
Syntax
[(pred)] bfrev (exec_size) reg reg
[(pred)] bfrev (exec_size) reg imm32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
  if ( WrEn[n] ) {
    for ( idx = 0; idx < 32; idx++ ) {
      dst[n][idx] = src0[n][31-idx];
    }
  }
}
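A direct single-channel translation of the bfrev loop into C might look like the sketch below. The function name bfrev_ud is hypothetical.

```c
#include <stdint.h>

/* One channel of bfrev: bit idx of the result is bit (31 - idx) of src0. */
static uint32_t bfrev_ud(uint32_t src0)
{
    uint32_t dst = 0;

    for (uint32_t idx = 0; idx < 32; idx++) {
        uint32_t bit = (src0 >> (31u - idx)) & 1u;
        dst |= bit << idx;
    }
    return dst;
}
```

Reversing twice returns the original value, so bfrev_ud(bfrev_ud(x)) == x for any x.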
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types Dst Types
UD UD
DWord Bit Description
0..3 127:64 ImmSource
Exists If: ([Operand Controls][e]=='IMM')
Format: EU_INSTRUCTION_SOURCES_IMM32
127:64 RegSource
Exists If: ([Operand Controls][e]!='IMM')
Format: EU_INSTRUCTION_SOURCES_REG
63:32 Operand Controls
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
brc - Branch Converging
Source: EuIsa
Length Bias: 4
Description
The brc instruction redirects the execution forward or backward to the instruction pointed by (current
IP + offset). The jump will occur if all channels are branched away. UIP should reference the instruction
where all channels are expected to come together. JIP should reference the end of the innermost
conditional block.
In GEN binary, JIP and UIP are at location src1 when immediates and at location src0 when reg32,
where reg32 is accessed as a scalar DWord containing both JIP and UIP. The null register must be used
(for example, by the assembler) as dst. When offsets are immediate, src0 must be null.
Format:
[(pred)] brc (exec_size) JIP UIP
Restriction
Restriction: A brc instruction must use the Switch instruction option.
Syntax
[(pred)] brc (exec_size) imm16 imm16
[(pred)] brc (exec_size) reg32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < 32; n++ ) {
  if ( WrEn[n] ) {
    PcIP[n] = IP + UIP;
  } else {
    PcIP[n] = IP + 1;
  }
}
if ( all PcIP != IP + 1 ) { // for all channels
  Jump(IP + JIP);
}
Predication Conditional Modifier Saturation Source Modifier Source Types
Y N N N D
DWord Bit Description
0..3 127:112 UIP
Format: S15
The jump distance in number of eight-byte units if a jump is taken for the channel.
111:96 JIP
Format: S15
The jump distance in number of eight-byte units if a jump is taken for the instruction.
95:64 Reserved
Format: MBZ
63:32 Operand Control
Format: EU_INSTRUCTION_OPERAND_CONTROLS
31:0 Header
Format: EU_INSTRUCTION_HEADER
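The JIP and UIP fields of brc are signed values counted in eight-byte units. As a small illustration, the sketch below converts such a field to a byte distance; the function name is hypothetical, and it assumes the S15 format maps onto a 16-bit two's complement integer.

```c
#include <stdint.h>

/* Converts a signed JIP/UIP field (eight-byte units) to a byte distance. */
static int32_t jump_distance_bytes(int16_t offset_units)
{
    return (int32_t)offset_units * 8;
}
```

A field value of 4 therefore corresponds to a forward jump of 32 bytes, and -2 to a backward jump of 16 bytes.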
brd - Branch Diverging
Source: EuIsa
Length Bias: 4
Description
The brd instruction redirects the execution forward or backward to the instruction pointed by (current
IP + offset). The jump will occur if any channels are branched away.
In GEN binary, JIP is at location src1 when immediate and at location src0 when reg32, where reg32 is
accessed as a scalar DWord. The null register must be used at dst locations.
Format:
[(pred)] brd (exec_size) JIP
Restriction
Restriction: A brd instruction must use the Switch instruction option.
Syntax
[(pred)] brd (exec_size) imm16
[(pred)] brd (exec_size) reg32
Pseudocode
Evaluate(WrEn);
for ( n = 0; n < 32; n++ ) {
  if ( WrEn[n] ) {
    PcIP[n] = IP + JIP;
  } else {
    PcIP[n] = IP + 1;
  }
}
if ( any PcIP == ExIP + JIP ) { // any channel
  Jump(ExIP + JIP);
}
Predication Conditional Modifier Saturation Source Modifier
Y N N N
Src Types
D
DWord Bit Description
0..3 127:112 Reserved
Format: MBZ
111:96 JIP
Format: S15
Jump Target Offset. The relative offset in 64-bit units if a jump is taken for the instruction.
95:91 Reserved
Format: MBZ
90 Flag Register Number
Added a second flag register
89 Flag Subregister Number
This field specifies the sub-register number for a flag register operand. There are two sub-
registers in the flag register. Each sub-register contains 16 flag bits.
The selected flag sub-register is the source for predication if predication is enabled for the
instruction. It is the destination to store conditional flag bits if conditional modifier is enabled
for the instruction. The same flag sub-register can be both the predication source and