Intel Open Source HD Graphics Programmer's Reference Manual (PRM), Volume 2, Part 2: Commands

Published November 5, 2024 (author: 雷凝旋)

Command Reference - Instructions

3DSTATE_POLY_STIPPLE_OFFSET

Source: RenderCS
Length Bias: 2

The 3DSTATE_POLY_STIPPLE_OFFSET command is used to specify the origin of the repeated screen-space Polygon Stipple Pattern as an X,Y offset from the Color Buffer origin.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 06h 3DSTATE_POLY_STIPPLE_OFFSET. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    Dword Length. Default Value: 0h Excludes Dword (0,1). Format: =n Total Length - 2.
1      31:13  Reserved. Format: MBZ.
       12:8   Polygon Stipple X Offset. Value: [0,31].
              Specifies a 5-bit x address offset in the poly stipple pattern.
       7:5    Reserved. Format: MBZ.
       4:0    Polygon Stipple Y Offset. Value: [0,31].
              Specifies a 5-bit y address offset in the poly stipple pattern.
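As a rough illustration of the field layout above, the following Python sketch packs the two DWords of 3DSTATE_POLY_STIPPLE_OFFSET. The helper names `gfx3d_header` and `poly_stipple_offset` are hypothetical and not part of any driver; the bit positions and opcode values are taken from the table.

```python
def gfx3d_header(opcode, sub_opcode, dword_length):
    """Pack DWord 0 of a GFXPIPE_3D command (hypothetical helper).

    Command Type (31:29) and Command SubType (28:27) are both 3h for
    the commands described in this section.
    """
    if not (0 <= opcode <= 0x7 and 0 <= sub_opcode <= 0xFF and 0 <= dword_length <= 0xFF):
        raise ValueError("field out of range")
    return (0x3 << 29) | (0x3 << 27) | (opcode << 24) | (sub_opcode << 16) | dword_length


def poly_stipple_offset(x_offset, y_offset):
    """Return [DWord 0, DWord 1] of 3DSTATE_POLY_STIPPLE_OFFSET."""
    if not (0 <= x_offset <= 31 and 0 <= y_offset <= 31):
        raise ValueError("offsets are 5-bit values in [0,31]")
    dw0 = gfx3d_header(opcode=0x1, sub_opcode=0x06, dword_length=0)
    dw1 = (x_offset << 8) | y_offset  # X in bits 12:8, Y in bits 4:0
    return [dw0, dw1]


if __name__ == "__main__":
    print([hex(d) for d in poly_stipple_offset(3, 5)])
```

The Dword Length of 0 follows from the rule "Total Length - 2" applied to a two-DWord command.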

100 Doc Ref # IHD-OS-VLV-Vol2 pt2-04.14

3DSTATE_POLY_STIPPLE_PATTERN

Source: RenderCS
Length Bias: 2

The 3DSTATE_POLY_STIPPLE_PATTERN command is used to specify the 32x32 Polygon Stipple Pattern used in the Polygon Stipple function of the WM unit.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 07h 3DSTATE_POLY_STIPPLE_PATTERN. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    Dword Length. Default Value: 1Fh Excludes Dword (0,1). Format: =n Total Length - 2.
1      31:0   Polygon Stipple Pattern Row 1 (top most). Format: 32-bit mask; Bit 31 = upper left corner, Bit 0 = upper right corner of the first row.
              Specifies a pattern used by Polygon Stipple to mask out specific pixels of every 32x32 area rendered.
2..32  31:0   Polygon Stipple Pattern Rows 2-32 (bottom most). Format: 32-bit mask; Bit 31 = left edge, Bit 0 = right edge of each row.
              Specifies a pattern used by Polygon Stipple to mask out specific pixels of every 32x32 area rendered.
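The row and bit ordering described above can be sketched in Python as follows. This is only an illustration: `poly_stipple_pattern` is a hypothetical helper, and the input convention (a 32x32 grid of booleans, top row first, leftmost pixel first) is an assumption made for the example. The sketch does not take a position on whether a set bit keeps or discards a pixel.

```python
def poly_stipple_pattern(rows):
    """Return the 33 DWords of 3DSTATE_POLY_STIPPLE_PATTERN.

    rows: 32 sequences of 32 truthy/falsy values, top row first and
    leftmost pixel first (an assumed input convention).
    """
    if len(rows) != 32 or any(len(r) != 32 for r in rows):
        raise ValueError("pattern must be 32x32")
    # DWord 0: type 3h, subtype 3h, opcode 1h, sub opcode 07h, length 1Fh
    header = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x07 << 16) | 0x1F
    dwords = [header]
    for row in rows:
        mask = 0
        for x, pixel in enumerate(row):
            if pixel:
                mask |= 1 << (31 - x)  # Bit 31 is the leftmost pixel
        dwords.append(mask)
    return dwords


if __name__ == "__main__":
    checker = [[(x + y) % 2 == 0 for x in range(32)] for y in range(32)]
    cmd = poly_stipple_pattern(checker)
    print(len(cmd), hex(cmd[0]), hex(cmd[1]), hex(cmd[2]))
```

The Dword Length of 1Fh (31) matches a total length of 33 DWords minus 2.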

3DSTATE_PS

Source: RenderCS
Length Bias: 2

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 0h 3DSTATE_PIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 20h 3DSTATE_PS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 06h Excludes DWord (0,1). Format: Total Length - 2.
1      31:6   Kernel Start Pointer[0]. Format: InstructionBaseOffset[31:6]Kernel.
              Specifies the 64-byte aligned address offset of the first instruction in kernel[0]. This pointer is relative to the Instruction Base Address.
       5:0    Reserved. Format: MBZ.

2      31     Single Program Flow (SPF). Format: U1 Enumerated Type.
              Specifies the initial condition of the kernel program as either a single program flow (SIMDnxm with m = 1) or as multiple program flows (SIMDnxm with m > 1). See CR0 description in ISA Execution Environment.
                0h  Multiple  Multiple Program Flows
                1h  Single    Single Program Flow
       30     Vector Mask Enable (VME).
              When SPF=0, VME specifies which mask to use to initialize the initial channel enables. When SPF=1, VME specifies which mask to use to generate execution channel enables.
                0h  Dmask  Channels are enabled based on the dispatch mask
                1h  Vmask  Channels are enabled based on the vector mask
       29:27  Sampler Count. Value: [0,4]; 5h-7h Reserved.
              Specifies how many samplers (in multiples of 4) the pixel shader 0 kernel uses. Used only for prefetching the associated sampler state entries.
                0h     no samplers used
                1h     between 1 and 4 samplers used
                2h     between 5 and 8 samplers used
                3h     between 9 and 12 samplers used
                4h     between 13 and 16 samplers used
                5h-7h  Reserved
       26     Denormal Mode.
              Specifies the denormal mode used by the dispatched thread.
                0h  FTZ  Denormals are flushed to zero
                1h  RET  Denormals are retained
       25:18  Binding Table Entry Count. Value: [0,255].
              Specifies how many binding table entries the kernel uses. Used only for prefetching of the binding table entries and associated surface state. Note: For kernels using a large number of binding table entries, it may be advantageous to set this field to zero to avoid prefetching too many entries and thrashing the state cache.
              This field is ignored if [PS Function Enable] is DISABLED.
              Programming Notes: When the HW binding table bit is set, it is assumed that the Binding Table Entry Count field will be generated at JIT time.
       17     Reserved. Format: MBZ.
       16     Floating Point Mode.
              Specifies the floating point mode used by the dispatched thread.
                0h  IEEE-754  Use IEEE-754 rules
                1h  Alt       Use alternate rules
       15:14  Rounding Mode.
              Specifies the rounding mode used by the dispatched thread.
                0h  RTNE  Round to Nearest Even
                1h  RU    Round toward +infinity
                2h  RD    Round toward -infinity
                3h  RTZ   Round toward zero
       13     Illegal Opcode Exception Enable. Format: Enable.
              This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA Execution Environment.
       12     Reserved. Format: MBZ.
       11     Mask Stack Exception Enable. Format: Enable.
              This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA Execution Environment.
       10:8   Reserved. Format: MBZ.
       7      Software Exception Enable. Format: Enable.
              This bit gets loaded into EU CR0.1[13] (note the bit # difference). See Exceptions and ISA Execution Environment.
       6:0    Reserved. Format: MBZ.

3      31:10  Scratch Space Base Pointer. Format: GeneralStateOffset[31:10]ScratchSpace.
              Specifies the 1k-byte aligned address offset to scratch space for use by the kernel. This pointer is relative to the General State Base Address.
       9:4    Reserved. Format: MBZ.
       3:0    Per Thread Scratch Space. Value: [0,11], indicating [1k bytes, 2M bytes] in powers of two.
              Specifies the amount of scratch space allowed to be used by each thread. The driver must allocate enough contiguous scratch space, pointed to by the Scratch Space Pointer, to ensure that the Maximum Number of Threads each get Per Thread Scratch Space size without exceeding the driver-allocated scratch space.
4      31:24  Maximum Number of Threads. Format: U8-1 representing thread count.
              Range:
                WIZ Hashing Disable in GT_MODE register enabled: Range = [7,171] --> [8,172] threads. Only odd values are allowed (resulting in even max number of threads).
                WIZ Hashing Disable in GT_MODE register disabled: Range = [3,85] --> [4,86] threads. Only odd values are allowed (resulting in even max number of threads).
              Specifies the maximum number of simultaneous threads allowed to be active. Used to avoid using up the scratch space, or to avoid potential deadlock.
              Value: [3h,1fh]. Range: [4,32] threads.
              Programming Notes: If this field is changed between 3DPRIMITIVE commands, a PIPE_CONTROL command with Stall at Pixel Scoreboard set is required to be issued. This field must have an odd value so that the max number of PS threads is even.

       23:12  Reserved. Format: MBZ.
       11     Push Constant Enable. Format: Enable.
              This field must be enabled if the sum of the PS Constant Buffer [3:0] Read Length fields in 3DSTATE_CONSTANT_PS is nonzero, and must be disabled if the sum is zero.
       10     Attribute Enable. Format: Enable.
              This field must be enabled if the Number of SF Output Attributes field in 3DSTATE_SBE is nonzero, and must be disabled if that field is zero.
       9      oMask Present to RenderTarget. Format: Enable.
              This bit is inserted in the PS payload header and made available to the DataPort (either via the message header or via header bypass) to indicate that oMask data (one or two phases) is included in Render Target Write messages. If present, the oMask data is used to mask off samples.
       8      Render Target Fast Clear Enable. Format: Enable.
              This field is set to enable fast clear of the bound render targets. See "Render Target Fast Clear" for restrictions on enabling this field.
       7      Dual Source Blend Enable. Format: Enable.
              This field is set if dual source blend is enabled. If this bit is disabled, the data port dual source message reverts to a single source message using source 0.
       6      Render Target Resolve Enable. Format: Enable.
              This field is set to enable clear value resolve on non-multisampled render targets. See "Render Target Resolve" for restrictions on enabling this field.
       5      Reserved. Format: MBZ.
       4:3    Position XY Offset Select. Format: U2 Enumerated Type.
              This field specifies if/what Position XY Offset values are passed in the PS payload. Note that these are per-slot (pixel|sample) offsets, and therefore separate from the subspan XY coordinates passed in R1.
                0h  POSOFFSET_NONE      No Position XY Offsets are included in the PS payload.
                1h  Reserved
                2h  POSOFFSET_CENTROID  Position XY Offsets will be passed in the PS payload, and these will reflect the Centroid position(s).
                3h  POSOFFSET_SAMPLE    Position XY Offsets will be passed in the PS payload, and these will reflect the multisample position(s).
              Programming Notes:
              SW Recommendation: If the PS kernel needs the Position Offsets to compute a Position XY value, this field should match Position ZW Interpolation Mode to ensure a consistent computation.
              If the PS kernel does not need the Position XY Offsets to compute a Position Value, then this field should be programmed to POSOFFSET_NONE, as the PS kernel should be using the various barycentric inputs to evaluate other-than-position attributes.
              MSDISPMODE_PERSAMPLE is required in order to select POSOFFSET_SAMPLE.

       2      32 Pixel Dispatch Enable. Format: Enable.
              Enables the Windower to dispatch 8 subspans in one payload.
              Note: In the table below, the Valid column indicates the products on which that combination is supported. Combinations of dispatch enables not listed in the table are not available on any product.
                A: Valid on all products.
                B: Valid.
                C: Not valid.
                D: Valid on all products, except when in non-1x PERSAMPLE mode.
                E: Valid on all products, except when in PERSAMPLE mode with number of multisamples >= 8.
                F: Valid on all products.
              Each of the three KSP values is separately specified. In addition, each kernel has a separately-specified GRF register count.
              Variable Pixel Dispatch Section: Pixel Grouping (Dispatch size) control for valid pixel dispatch combinations.
       1      16 Pixel Dispatch Enable. Format: Enable.
              Enables the Windower to dispatch 4 subspans in one payload. The note, validity codes, and KSP/GRF remarks given for 32 Pixel Dispatch Enable apply unchanged.
       0      8 Pixel Dispatch Enable. Format: Enable.
              Enables the Windower to dispatch 2 subspans in one payload. The note, validity codes, and KSP/GRF remarks given for 32 Pixel Dispatch Enable apply unchanged.

5      31:23  Reserved. Format: MBZ.
       22:16  Dispatch GRF Start Register for Constant/Setup Data [0]. Value: [0,127].
              Specifies the starting GRF register number for the Constant/Setup portion of the thread payload for kernel[0].
       15     Reserved. Format: MBZ.
       14:8   Dispatch GRF Start Register for Constant/Setup Data [1]. Value: [0,127].
              Specifies the starting GRF register number for the Constant/Setup portion of the thread payload for kernel[1].
       7      Reserved. Format: MBZ.
       6:0    Dispatch GRF Start Register for Constant/Setup Data [2]. Value: [0,127].
              Specifies the starting GRF register number for the Constant/Setup portion of the thread payload for kernel[2].
6      31:6   Kernel Start Pointer[1]. Format: InstructionBaseOffset[31:6]Kernel.
              Specifies the 64-byte aligned address offset of the first instruction in kernel[1]. This pointer is relative to the Instruction Base Address.
       5:0    Reserved. Format: MBZ.
7      31:6   Kernel Start Pointer[2]. Format: InstructionBaseOffset[31:6]Kernel.
              Specifies the 64-byte aligned address offset of the first instruction in kernel[2]. This pointer is relative to the Instruction Base Address.
       5:0    Reserved. Format: MBZ.
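Several 3DSTATE_PS fields use encodings rather than raw counts: Sampler Count is in multiples of 4, Per Thread Scratch Space is a power-of-two exponent starting at 1 KB, and Maximum Number of Threads is stored as the thread count minus one. The Python sketch below shows one way a driver might derive these field values. All function names are hypothetical, and the caller supplies the permitted thread range because it depends on the GT_MODE configuration described above.

```python
def sampler_count_field(num_samplers):
    """Sampler Count (DWord 2, bits 29:27): samplers used, in multiples of 4."""
    if not 0 <= num_samplers <= 16:
        raise ValueError("the field covers 0 to 16 samplers")
    return (num_samplers + 3) // 4  # round up to the next group of 4


def per_thread_scratch_field(num_bytes):
    """Per Thread Scratch Space (DWord 3, bits 3:0): 0 -> 1 KB ... 11 -> 2 MB."""
    if not 1 <= num_bytes <= 2 * 1024 * 1024:
        raise ValueError("scratch size must be between 1 byte and 2 MB")
    field = 0
    while (1024 << field) < num_bytes:
        field += 1  # smallest power of two that covers the request
    return field


def max_threads_field(num_threads, lowest, highest):
    """Maximum Number of Threads (DWord 4, bits 31:24), U8-1 encoded.

    lowest/highest are the thread-count limits for the current
    configuration, passed in by the caller (an assumption of this sketch).
    """
    if num_threads % 2 or not lowest <= num_threads <= highest:
        raise ValueError("thread count must be even and within the limits")
    return num_threads - 1  # an even count always yields an odd field value


def scratch_bytes_needed(num_threads, scratch_field):
    """Contiguous scratch the driver would need for all threads."""
    return num_threads * (1024 << scratch_field)


if __name__ == "__main__":
    field = per_thread_scratch_field(3000)
    print(sampler_count_field(5), field, max_threads_field(32, 4, 32),
          scratch_bytes_needed(32, field))
```

Rounding the scratch request up rather than down is a design choice of the sketch: it guarantees each thread receives at least the requested amount.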

3DSTATE_PUSH_CONSTANT_ALLOC_DS

Source: RenderCS
Length Bias: 2

Programming Notes

This command sets up the URB configuration for DS Push Constant Buffer.

Programming Restriction:

• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_DS must be less than or equal to the size of the allocated space in the URB, including the buffering for half cachelines. See the Push Constant URB Allocation section for more details.
• 3DSTATE_CONSTANT_DS must be reprogrammed prior to the next 3DPRIMITIVE command after programming 3DSTATE_PUSH_CONSTANT_ALLOC_DS.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 14h 3DSTATE_PUSH_CONSTANT_ALLOC_DS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h Excludes DWord (0,1). Format: =n Total Length - 2.
1      31:20  Reserved. Format: MBZ.
       19:16  Constant Buffer Offset. Value: [0,15] (0KB - 15KB). Default: 0h 0KB.
              Specifies the offset of the DS constant buffer into the URB.
       15:5   Reserved. Format: MBZ.
       4:0    Constant Buffer Size. Value: [0,15] (0KB - 15KB), in increments of 1KB. Default: 0KB.
              Specifies the size of the DS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full. Value of zero is only valid when constants are not enabled for DS.

3DSTATE_PUSH_CONSTANT_ALLOC_GS

Source: RenderCS
Length Bias: 2

Programming Notes

This command sets up the URB configuration for GS Push Constant Buffer.

• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_GS must be less than or equal to the size of the allocated space in the URB, including the buffering for half cachelines.
• 3DSTATE_CONSTANT_GS must be reprogrammed prior to the next 3DPRIMITIVE command after programming 3DSTATE_PUSH_CONSTANT_ALLOC_GS.

See the Push Constant URB Allocation section for more details.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 15h 3DSTATE_PUSH_CONSTANT_ALLOC_GS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h Excludes DWord (0,1). Format: Total Length - 2.
1      31:20  Reserved. Format: MBZ.
       19:16  Constant Buffer Offset. Value: [0,15] (0KB - 15KB). Default: 0KB.
              Specifies the offset of the GS constant buffer into the URB.
       15:5   Reserved. Format: MBZ.
       4:0    Constant Buffer Size. Value: [0,15] (0KB - 15KB), in increments of 1KB. Default: 0KB.
              Specifies the size of the GS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full. Value of zero is only valid when constants are not enabled for GS.

3DSTATE_PUSH_CONSTANT_ALLOC_HS

Source: RenderCS
Length Bias: 2

Programming Notes

This command sets up the URB configuration for HS Push Constant Buffer.

Programming Restriction:

• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_HS must be less than or equal to the size of the allocated space in the URB, including the buffering for half cachelines. See the Push Constant URB Allocation section for more details.
• 3DSTATE_CONSTANT_HS must be reprogrammed prior to the next 3DPRIMITIVE command after programming 3DSTATE_PUSH_CONSTANT_ALLOC_HS.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 13h 3DSTATE_PUSH_CONSTANT_ALLOC_HS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h Excludes DWord (0,1). Format: =n Total Length - 2.
1      31:20  Reserved. Format: MBZ.
       19:16  Constant Buffer Offset. Value: [0,15] (0KB - 15KB). Default: 0h 0KB.
              Specifies the offset of the HS constant buffer into the URB.
       15:5   Reserved. Format: MBZ.
       4:0    Constant Buffer Size. Value: [0,15] (0KB - 15KB), in increments of 1KB. Default: 0KB.
              Specifies the size of the HS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full. Value of zero is only valid when constants are not enabled for HS.

3DSTATE_PUSH_CONSTANT_ALLOC_PS

Source: RenderCS
Length Bias: 2

Description

This command sets up the URB configuration for PS Push Constant Buffer.

A PIPE_CONTROL command with the CS Stall bit set must be programmed in the ring after this instruction.

Programming Notes

Restriction:

• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_PS must be less than or equal to the size of the allocated space in the URB, including the buffering for half cachelines. See the Push Constant URB Allocation section for more details.
• 3DSTATE_CONSTANT_PS must be reprogrammed prior to the next 3DPRIMITIVE command after programming 3DSTATE_PUSH_CONSTANT_ALLOC_PS.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 16h 3DSTATE_PUSH_CONSTANT_ALLOC_PS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    Dword Length. Default Value: 0h Excludes Dword (0,1). Format: =n Total Length - 2.
1      31:20  Reserved. Format: MBZ.
       19:16  Constant Buffer Offset. Value: [0,15] (0KB - 15KB). Default: 0KB.
              Specifies the offset of the PS constant buffer into the URB.
       15:5   Reserved. Format: MBZ.
       4:0    Constant Buffer Size. Value: [0,15] (0KB - 15KB), in increments of 1KB. Default: 0KB.
              Specifies the size of the PS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full. Value of zero is only valid when constants are not enabled for PS.

3DSTATE_PUSH_CONSTANT_ALLOC_VS

Source: RenderCS
Length Bias: 2

Programming Notes

This command sets up the URB configuration for VS Push Constant Buffer.

Programming Restriction:

• The sum of the Constant Buffer Offset and the Constant Buffer Size may not exceed the maximum value of the Constant Buffer Size.
• The sum of the constant length programmed in 3DSTATE_CONSTANT_VS must be less than or equal to the size of the allocated space in the URB, including the buffering for half cachelines. See the Push Constant URB Allocation section for more details.
• 3DSTATE_CONSTANT_VS must be reprogrammed prior to the next 3DPRIMITIVE command after programming 3DSTATE_PUSH_CONSTANT_ALLOC_VS.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE_NONPIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 12h 3DSTATE_PUSH_CONSTANT_ALLOC_VS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h Excludes DWord (0,1). Format: =n Total Length - 2.
1      31:20  Reserved. Format: MBZ.
       19:16  Constant Buffer Offset. Value: [0,15] (0KB - 15KB). Default: 0h 0KB.
              Specifies the offset of the VS constant buffer into the URB.
       15:5   Reserved. Format: MBZ.
       4:0    Constant Buffer Size. Value: [0,15] (0KB - 15KB), in increments of 1KB. Default: 0KB.
              Specifies the size of the VS constant buffer. This value will determine the amount of data the command stream can pre-fetch before the buffer is full. Value of zero is only valid when constants are not enabled for VS.
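The five 3DSTATE_PUSH_CONSTANT_ALLOC_* commands share one layout and differ only in the sub opcode. The Python sketch below packs any of them and checks the first programming restriction. The function name and the `max_size_kb` parameter are hypothetical; its default of 15 is taken from the [0,15] range listed for Constant Buffer Size and should be treated as an assumption.

```python
# Sub opcodes as listed in the command tables of this section.
ALLOC_SUB_OPCODE = {"VS": 0x12, "HS": 0x13, "DS": 0x14, "GS": 0x15, "PS": 0x16}


def push_constant_alloc(stage, offset_kb, size_kb, max_size_kb=15):
    """Return [DWord 0, DWord 1] of 3DSTATE_PUSH_CONSTANT_ALLOC_<stage>."""
    if stage not in ALLOC_SUB_OPCODE:
        raise ValueError("unknown stage: %r" % (stage,))
    if not 0 <= offset_kb <= 15:
        raise ValueError("offset must fit the 4-bit field in bits 19:16")
    if not 0 <= size_kb <= max_size_kb:
        raise ValueError("size out of range")
    if offset_kb + size_kb > max_size_kb:
        # Restriction: offset + size may not exceed the maximum size value.
        raise ValueError("offset + size exceeds the maximum buffer size")
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (ALLOC_SUB_OPCODE[stage] << 16)
    dw1 = (offset_kb << 16) | size_kb
    return [dw0, dw1]


if __name__ == "__main__":
    # Example split: VS gets the first 8 KB, PS the next 7 KB.
    for stage, off, size in (("VS", 0, 8), ("PS", 8, 7)):
        print(stage, [hex(d) for d in push_constant_alloc(stage, off, size)])
```

The sketch only builds the allocation commands. The notes above still require reprogramming the matching 3DSTATE_CONSTANT_* command before the next 3DPRIMITIVE.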

3DSTATE_SAMPLE_MASK

Source: RenderCS
Length Bias: 2

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 0h 3DSTATE_PIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 18h 3DSTATE_SAMPLE_MASK. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    Dword Length. Default Value: 0h Excludes Dword (0,1). Format: =n Total Length - 2.
1      31:8   Reserved. Format: MBZ.
       7:0    Sample Mask. Format: 8-bit mask. Right-justified bitmask (Bit 0 = Sample0). The number of bits used is determined by Num Multisamples (3DSTATE_MULTISAMPLE).
              A per-multisample-position mask state variable that is immediately and unconditionally ANDed with the sample coverage mask as part of the rasterization process. This mask is applied prior to centroid selection.
              Programming Notes:
              • If Number of Multisamples is NUMSAMPLES_1, bits 7:1 of this field must be zero.
              • If Number of Multisamples is NUMSAMPLES_4, bits 7:4 of this field must be zero.
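The two programming notes amount to requiring that no mask bit at or above the sample count is set. A minimal Python sketch of that check follows; `sample_mask_cmd` is a hypothetical helper, and accepting 8 samples is an assumption based only on the 8-bit width of the field.

```python
def sample_mask_cmd(mask, num_samples):
    """Return [DWord 0, DWord 1] of 3DSTATE_SAMPLE_MASK.

    num_samples: 1 or 4 as named in the programming notes; 8 is accepted
    here only because the field is 8 bits wide (an assumption).
    """
    if num_samples not in (1, 4, 8):
        raise ValueError("unsupported sample count")
    if mask < 0 or mask >> num_samples:
        # Bits at or above num_samples must be zero.
        raise ValueError("mask sets bits beyond the active samples")
    dw0 = (0x3 << 29) | (0x3 << 27) | (0x0 << 24) | (0x18 << 16)
    return [dw0, mask]


if __name__ == "__main__":
    print([hex(d) for d in sample_mask_cmd(0b1111, 4)])
```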

3DSTATE_SAMPLER_PALETTE_LOAD0

Source: RenderCS
Length Bias: 2

Description

The 3DSTATE_SAMPLER_PALETTE_LOAD0 instruction is used to load 32-bit values into the first texture palette. The texture palette is used whenever a texture with a paletted format (containing "Px [palette0]") is referenced by the sampler.

This instruction is used to load all or a subset of the 256 entries of the first palette. Partial loads always start from the first (index 0) entry.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: Opcode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: Opcode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE. Format: Opcode.
       23:16  3D Command Sub Opcode. Default Value: 02h 3DSTATE_SAMPLER_PALETTE_LOAD0. Format: Opcode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h Excludes DWord (0,1). Format: Total Length - 2.
1..n   31:24  Palette Alpha[0:N-1]. Format: U8.
              Alpha channel loaded into the Nth entry of the texture color palette.
       23:16  Palette Red[0:N-1]. Format: U8.
              Red channel loaded into the Nth entry of the texture color palette.
       15:8   Palette Green[0:N-1]. Format: U8.
              Green channel loaded into the Nth entry of the texture color palette.
       7:0    Palette Blue[0:N-1]. Format: U8.
              Blue channel loaded into the Nth entry of the texture color palette.
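Because the palette load is variable length, the DWord Length field depends on how many entries are supplied. The Python sketch below builds a full or partial load starting at index 0. The helper name and the (alpha, red, green, blue) tuple convention are assumptions made for the illustration.

```python
def sampler_palette_load0(entries):
    """Return the DWords of 3DSTATE_SAMPLER_PALETTE_LOAD0.

    entries: 1 to 256 tuples of (alpha, red, green, blue), each 0..255,
    loaded into palette indices 0..len(entries)-1.
    """
    n = len(entries)
    if not 1 <= n <= 256:
        raise ValueError("a load carries between 1 and 256 entries")
    # Total length is 1 + n DWords, so DWord Length = (1 + n) - 2 = n - 1.
    header = (0x3 << 29) | (0x3 << 27) | (0x1 << 24) | (0x02 << 16) | (n - 1)
    dwords = [header]
    for alpha, red, green, blue in entries:
        for channel in (alpha, red, green, blue):
            if not 0 <= channel <= 255:
                raise ValueError("channels are U8 values")
        dwords.append((alpha << 24) | (red << 16) | (green << 8) | blue)
    return dwords


if __name__ == "__main__":
    print([hex(d) for d in sampler_palette_load0([(255, 1, 2, 3), (0, 0, 0, 0)])])
```

Under the table above, the same packing would apply to 3DSTATE_SAMPLER_PALETTE_LOAD1 with sub opcode 0Ch.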

3DSTATE_SAMPLER_PALETTE_LOAD1

Source: RenderCS
Length Bias: 2

The 3DSTATE_SAMPLER_PALETTE_LOAD1 instruction is used to load 32-bit values into the second texture palette. The second texture palette is used whenever a texture with a paletted format (containing "Px...[palette1]") is referenced by the sampler. This instruction is used to load all or a subset of the 256 entries of the second palette. Partial loads always start from the first (index 0) entry.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 1h 3DSTATE. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 0Ch 3DSTATE_SAMPLER_PALETTE_LOAD1. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h Excludes DWord (0,1). Format: =n Total Length - 2.
1..n   31:24  Palette Alpha[0:N-1]. Format: U8.
              Alpha channel loaded into the Nth entry of the texture color palette.
       23:16  Palette Red[0:N-1]. Format: U8.
              Red channel loaded into the Nth entry of the texture color palette.
       15:8   Palette Green[0:N-1]. Format: U8.
              Green channel loaded into the Nth entry of the texture color palette.
       7:0    Palette Blue[0:N-1]. Format: U8.
              Blue channel loaded into the Nth entry of the texture color palette.

3DSTATE_SAMPLER_STATE_POINTERS_DS

Source: RenderCS
Length Bias: 2

The 3DSTATE_SAMPLER_STATE_POINTERS_DS command is used to define the location of the DS SAMPLER_STATE table. Only some of the fixed functions utilize sampler state tables.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 0h 3DSTATE_PIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 2Dh 3DSTATE_SAMPLER_STATE_POINTERS_DS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h DWORD_COUNT_n.
1      31:5   Pointer to DS Sampler State. Format: DynamicStateOffset[31:5]SAMPLER_STATE*16.
              Specifies the 32-byte aligned address offset of the DS function's SAMPLER_STATE table. This offset is relative to the Dynamic State Base Address.
       4:0    Reserved. Format: MBZ.

3DSTATE_SAMPLER_STATE_POINTERS_GS

Source: RenderCS
Length Bias: 2

The 3DSTATE_SAMPLER_STATE_POINTERS_GS command is used to define the location of the GS SAMPLER_STATE table. Only some of the fixed functions utilize sampler state tables.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 0h 3DSTATE_PIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 2Eh 3DSTATE_SAMPLER_STATE_POINTERS_GS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h DWORD_COUNT_n.
1      31:5   Pointer to GS Sampler State. Format: DynamicStateOffset[31:5]SAMPLER_STATE*16.
              Specifies the 32-byte aligned address offset of the GS function's SAMPLER_STATE table. This offset is relative to the Dynamic State Base Address.
       4:0    Reserved. Format: MBZ.

3DSTATE_SAMPLER_STATE_POINTERS_HS

Source: RenderCS
Length Bias: 2

The 3DSTATE_SAMPLER_STATE_POINTERS_HS command is used to define the location of the HS SAMPLER_STATE table. Only some of the fixed functions utilize sampler state tables.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 0h 3DSTATE_PIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 2Ch 3DSTATE_SAMPLER_STATE_POINTERS_HS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h DWORD_COUNT_n.
1      31:5   Pointer to HS Sampler State. Format: DynamicStateOffset[31:5]SAMPLER_STATE*16.
              Specifies the 32-byte aligned address offset of the HS function's SAMPLER_STATE table. This offset is relative to the Dynamic State Base Address.
       4:0    Reserved. Format: MBZ.

3DSTATE_SAMPLER_STATE_POINTERS_PS

Source: RenderCS
Length Bias: 2

The 3DSTATE_SAMPLER_STATE_POINTERS_PS command is used to define the location of the PS SAMPLER_STATE table. Only some of the fixed functions utilize sampler state tables.

DWord  Bit    Description
0      31:29  Command Type. Default Value: 3h GFXPIPE. Format: OpCode.
       28:27  Command SubType. Default Value: 3h GFXPIPE_3D. Format: OpCode.
       26:24  3D Command Opcode. Default Value: 0h 3DSTATE_PIPELINED. Format: OpCode.
       23:16  3D Command Sub Opcode. Default Value: 2Fh 3DSTATE_SAMPLER_STATE_POINTERS_PS. Format: OpCode.
       15:8   Reserved. Format: MBZ.
       7:0    DWord Length. Default Value: 0h DWORD_COUNT_n.
1      31:5   Pointer to PS Sampler State. Format: DynamicStateOffset[31:5]SAMPLER_STATE*16.
              Specifies the 32-byte aligned address offset of the PS function's SAMPLER_STATE table. This offset is relative to the Dynamic State Base Address.
       4:0    Reserved. Format: MBZ.


3DSTATE_SAMPLER_STATE_POINTERS_VS

Source: RenderCS

Length Bias:

The 3DSTATE_SAMPLER_STATE_POINTERS_VS command is used to define the location of VS SAMPLER_STATE table. Only some of the fixed functions utilize sampler state tables.

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 2Bh 3DSTATE_SAMPLER_STATE_POINTERS_VS
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Default Value: 0h DWORD_COUNT_n

1 31:5 Pointer to VS Sampler State
Format: DynamicStateOffset[31:5]SAMPLER_STATE*16

Specifies the 32-byte aligned address offset of the VS function's SAMPLER_STATE table. This

offset is relative to the Dynamic State Base Address.

4:0 Reserved

Format: MBZ
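The three SAMPLER_STATE_POINTERS commands share one two-DWord layout and differ only in the 3D Command Sub Opcode. The following is a minimal sketch of how a driver might assemble them; the function and enum names are hypothetical, and only the bit positions and opcode values come from the tables above.

```c
#include <stdint.h>

/* Sub-opcodes taken from the command tables; the enum names are illustrative. */
enum sampler_ptr_subop {
    SAMPLER_PTRS_VS = 0x2B,
    SAMPLER_PTRS_HS = 0x2C,
    SAMPLER_PTRS_PS = 0x2F
};

/* DWord 0: Command Type 3h (31:29), SubType 3h (28:27), Opcode 0h (26:24),
 * Sub Opcode (23:16), Reserved MBZ (15:8), DWord Length 0h (7:0). */
static uint32_t sampler_ptrs_dw0(enum sampler_ptr_subop subop)
{
    return (0x3u << 29) | (0x3u << 27) | (0x0u << 24) |
           ((uint32_t)subop << 16) | 0x0u;
}

/* DWord 1: bits 31:5 hold the offset from Dynamic State Base Address,
 * bits 4:0 must be zero, so the offset has to be 32-byte aligned.
 * Returns 0 on success, -1 if the offset is misaligned. */
static int sampler_ptrs_dw1(uint32_t dynamic_state_offset, uint32_t *dw1)
{
    if (dynamic_state_offset & 0x1Fu)
        return -1;
    *dw1 = dynamic_state_offset;
    return 0;
}
```

Rejecting a misaligned offset, rather than silently masking the low bits, keeps a programming error visible instead of pointing the hardware at the wrong table.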


3DSTATE_SBE

Source: RenderCS

Length Bias:

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 1Fh 3DSTATE_SBE
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Default Value: 0Ch Excludes DWord (0,1)
Format: Total Length - 2

1 31:29 Reserved
Format: MBZ

28 Attribute Swizzle Control Mode
Format: U1 enumerated type

When Attribute Swizzle Enable is ENABLED, this bit controls whether attributes 0-15 or

16-31 are subject to the following swizzle controls:

• Attribute n Component Override X/Y/Z/W

• Attribute n Constant Source

• Attribute n Swizzle Select

• Attribute n Source Attribute

• Attribute n Wrap Shortest Enables

Note that the Number of SF Output Attributes field specifies how many attributes are

output.

Note: This field does not impact any functions which provide separate states for all 32

attributes (e.g., Point sprite, Constant interpolation).

Name Description

SWIZ_0_15 Attributes 0-15 are subject to swizzling, and attributes 16-31 are not.

SWIZ_16_31 Attributes 16-31 are subject to swizzling, and attributes 0-15 are not. Only valid when 16 or more attributes are output.

27:22 Number of SF Output Attributes
Format: U6 count of attributes

Specifies the number of vertex attributes passed from the SF stage to the WM stage (does not include Position).

Value: [0,32]

21 Attribute Swizzle Enable
Format: Enable

Enables the SF to perform swizzling on (up to the first 16) vertex attributes. If DISABLED, all vertex attributes are passed through.

20 Point Sprite Texture Coordinate Origin

Format: U1 enumerated type

This state controls how Point Sprite Texture Coordinates are generated (when enabled on a per-

attribute basis by Point Sprite Texture Coordinate Enable).

Name Description

UPPERLEFT Top Left = (0,0,0,1), Bottom Left = (0,1,0,1), Bottom Right = (1,1,0,1)

LOWERLEFT Top Left = (0,1,0,1), Bottom Left = (0,0,0,1), Bottom Right = (1,0,0,1)

19:16 Reserved
Format: MBZ

15:11 Vertex URB Entry Read Length

Format: U5

Specifies the amount of URB data read for each Vertex URB entry, in 256-bit units.

Value: [1,16]

Programming Notes

It is UNDEFINED to set this field to 0, indicating no Vertex URB data to be read. This field should be set to the minimum length required to read the maximum source attribute. The maximum source attribute is the maximum value of the enabled Attribute n Source Attribute fields if Attribute Swizzle Enable is set, or Number of SF Output Attributes - 1 if it is not set.

read_length = ceiling((max_source_attr+1)/2)

10 Reserved

9:4 Vertex URB Entry Read Offset

Specifies the offset (in 256-bit units) at which Vertex URB data is to be read from the URB.
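As a rough illustration of the read-length rule and of how DWord 1 might be assembled, the sketch below computes read_length = ceiling((max_source_attr+1)/2) with integer arithmetic and packs the fields at the bit positions listed above. The function names are hypothetical, and Attribute Swizzle Control Mode and Point Sprite Texture Coordinate Origin are simply left at zero.

```c
#include <stdint.h>

/* ceiling((max_source_attr + 1) / 2): two 128-bit attributes per 256-bit unit. */
static uint32_t sbe_read_length(uint32_t max_source_attr)
{
    return (max_source_attr + 2u) / 2u;
}

/* Packs 3DSTATE_SBE DWord 1. Returns 0 on success, -1 if a field is out of range.
 * Bit 28 (swizzle control mode) and bit 20 (point sprite origin) stay 0 here. */
static int sbe_pack_dw1(uint32_t num_sf_output_attrs, int swizzle_enable,
                        uint32_t max_source_attr, uint32_t read_offset,
                        uint32_t *dw1)
{
    uint32_t read_length = sbe_read_length(max_source_attr);

    if (num_sf_output_attrs > 32u)              /* field range [0,32] */
        return -1;
    if (read_length < 1u || read_length > 16u)  /* field range [1,16], 0 is UNDEFINED */
        return -1;
    if (read_offset > 63u)                      /* 6-bit field, bits 9:4 */
        return -1;

    *dw1 = (num_sf_output_attrs << 22) |
           ((swizzle_enable ? 1u : 0u) << 21) |
           (read_length << 11) |
           (read_offset << 4);
    return 0;
}
```

For example, three output attributes with swizzling enabled, a maximum source attribute of 2, and a read offset of 1 give a read length of 2 and a DWord value of 0x00E01010.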

3:0 Reserved
Format: MBZ

2..9 31 Attribute [2n+1] Component Override W
Format: Enable

If set, the W component of output Attribute 1 is overridden by the W component of the constant

vector specified by ConstantSource[1].

30 Attribute [2n+1] Component Override Z

Format: Enable

If set, the Z component of output Attribute 1 is overridden by the Z component of the constant

vector specified by ConstantSource[1].

29 Attribute [2n+1] Component Override Y

Format: Enable

If set, the Y component of output Attribute 1 is overridden by the Y component of the constant

vector specified by ConstantSource[1].

28 Attribute [2n+1] Component Override X

Format: Enable

If set, the X component of output Attribute 1 is overridden by the X component of the constant

vector specified by ConstantSource[1].

27 Reserved

Format: MBZ

26:25 Attribute [2n+1] Constant Source
Format: U2 enumerated type

This state selects a constant vector which can be used to override individual components of Attribute 1.

Name Description

CONST_0000 = 0.0,0.0,0.0,0.0

CONST_0001_FLOAT = 0.0,0.0,0.0,1.0

CONST_1111_FLOAT = 1.0,1.0,1.0,1.0

PRIM_ID = PrimID (replicated)

24 Reserved
Format: MBZ

23:22 Attribute [2n+1] Swizzle Select
Format: U2 enumerated type

This state, along with Attribute 1 Source Attribute, specifies the source for output Attribute 1.

Value Name Description

0h INPUTATTR This attribute is sourced from AttrInputReg[SourceAttribute].

1h INPUTATTR_FACING If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1].

2h INPUTATTR_W This attribute is sourced from AttrInputReg[SourceAttribute]. The W component is copied to the X component.

3h INPUTATTR_FACING_W If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1]. The W component is copied to the X component.

21 Reserved
Format: MBZ

20:16 Attribute [2n+1] Source Attribute

Format:

This field selects the source attribute for Attribute 1. Source attribute 0 corresponds to the first

128 bits of data indicated by Vertex URB Entry Read Offset

15 Attribute [2n] Component Override W

Format: Enable

If set, the W component of output Attribute 0 is overridden by the W component of the constant vector specified by ConstantSource[0].

14 Attribute [2n] Component Override Z

Format: Enable

If set, the Z component of output Attribute 0 is overridden by the Z component of the constant vector specified by ConstantSource[0].

13 Attribute [2n] Component Override Y

Format: Enable

If set, the Y component of output Attribute 0 is overridden by the Y component of the constant vector specified by ConstantSource[0].

12 Attribute [2n] Component Override X

Format: Enable

If set, the X component of output Attribute 0 is overridden by the X component of the constant vector specified by ConstantSource[0].

11 Reserved

Format: MBZ

10:9 Attribute [2n] Constant Source
Format: U2 enumerated type

This state selects a constant vector which can be used to override individual components of Attribute 0.

Name Description

CONST_0000 = 0.0,0.0,0.0,0.0

CONST_0001_FLOAT = 0.0,0.0,0.0,1.0

CONST_1111_FLOAT = 1.0,1.0,1.0,1.0

PRIM_ID = PrimID (replicated)

8 Reserved
Format: MBZ

7:6 Attribute [2n] Swizzle Select
Format: U2 enumerated type

This state, along with Attribute 0 Source Attribute, specifies the source for output Attribute 0.

Value Name Description

0h INPUTATTR This attribute is sourced from AttrInputReg[SourceAttribute].

1h INPUTATTR_FACING If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1].

2h INPUTATTR_W This attribute is sourced from AttrInputReg[SourceAttribute]. The W component is copied to the X component.

3h INPUTATTR_FACING_W If the object is front-facing, this attribute is sourced from AttrInputReg[SourceAttribute]. If the object is back-facing, this attribute is sourced from AttrInputReg[SourceAttribute+1]. The W component is copied to the X component.

5 Reserved
Format: MBZ

4:0 Attribute [2n] Source Attribute

Format:

This field selects the source attribute for Attribute 0. Source attribute 0 corresponds to the first

128 bits of data indicated by Vertex URB Entry Read Offset
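Each of DWords 2..9 carries two 16-bit attribute controls, the even attribute [2n] in bits 15:0 and the odd attribute [2n+1] in bits 31:16, with identical layouts. A minimal sketch of that packing follows; the helper names and the override-mask constants are hypothetical, and the Swizzle Select and Constant Source arguments are passed as raw field encodings.

```c
#include <stdint.h>

/* Illustrative mask bits for the four Component Override enables. */
enum {
    SBE_OVERRIDE_X = 1u,
    SBE_OVERRIDE_Y = 2u,
    SBE_OVERRIDE_Z = 4u,
    SBE_OVERRIDE_W = 8u
};

/* One 16-bit attribute control:
 *   4:0   Source Attribute
 *   7:6   Swizzle Select
 *   10:9  Constant Source
 *   15:12 Component Override X/Y/Z/W
 * Bits 5, 8 and 11 are reserved (MBZ) and left clear. */
static uint16_t sbe_attr_bits(uint32_t source_attr, uint32_t swizzle_select,
                              uint32_t const_source, uint32_t override_mask)
{
    return (uint16_t)((source_attr & 0x1Fu) |
                      ((swizzle_select & 0x3u) << 6) |
                      ((const_source & 0x3u) << 9) |
                      ((override_mask & 0xFu) << 12));
}

/* Combines attribute 2n (low half) and 2n+1 (high half) into DWord 2+n. */
static uint32_t sbe_attr_pair_dw(uint16_t attr_2n, uint16_t attr_2n_plus_1)
{
    return (uint32_t)attr_2n | ((uint32_t)attr_2n_plus_1 << 16);
}
```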

10 31:0 Point Sprite Texture Coordinate Enable

Format: 32-bit bitmask

When processing point primitives, the attributes from the incoming point vertex are typically copied to the point object corner vertices. However, if a bit is set in this field, the corresponding Attribute is selected as a Point Sprite Texture Coordinate, in which case each corner vertex is assigned a pre-defined texture coordinate as defined by the Point Sprite Texture Coordinate Origin state bit. Bit 0 corresponds to output Attribute 0.

This field must be programmed to 0 when non-point primitives are rendered.

11 31:0 Constant Interpolation Enable[31:0]

This field is a bitmask containing a Constant Interpolation Enable bit for each corresponding

attribute. If a bit is set, that attribute will undergo constant interpolation, and the corresponding

WrapShortest Enable bits (if defined) will be ignored. If a bit is clear, components which are not

enabled for WrapShortest interpolation (if defined) will be linearly interpolated.

12 31:28 Attribute 7 WrapShortest Enables

Format: Enable[4]

This state selects which components (if any) of Attribute 7 are to be interpolated in a "wrap shortest" fashion. Operation is UNDEFINED if any of these bits are set and the Constant Interpolation Enable bit associated with this attribute is set. Note that wrap-shortest interpolation is only supported for Attributes 0-15.

Bit 0: WrapShortest X Component
Bit 1: WrapShortest Y Component
Bit 2: WrapShortest Z Component
Bit 3: WrapShortest W Component

27:24 Attribute 6 WrapShortest Enables

(See above).

23:20 Attribute 5 WrapShortest Enables

(See above).

19:16 Attribute 4 WrapShortest Enables

(See above).

15:12 Attribute 3 WrapShortest Enables

(See above).

11:8 Attribute 2 WrapShortest Enables

(See above).

7:4 Attribute 1 WrapShortest Enables

(See above).

3:0 Attribute 0 WrapShortest Enables

(See above).

13 31:28 Attribute 15 WrapShortest Enables

Format: Enable[4]

This state selects which components (if any) of Attribute 15 are to be interpolated in a "wrap shortest" fashion. Operation is UNDEFINED if any of these bits are set and the Constant Interpolation Enable bit associated with this attribute is set.

Bit 0: WrapShortest X Component
Bit 1: WrapShortest Y Component
Bit 2: WrapShortest Z Component
Bit 3: WrapShortest W Component

27:24 Attribute 14 WrapShortest Enables

(See above).

23:20 Attribute 13 WrapShortest Enables

(See above).

19:16 Attribute 12 WrapShortest Enables


(See above).

15:12 Attribute 11 WrapShortest Enables

(See above).

11:8 Attribute 10 WrapShortest Enables

(See above).

7:4 Attribute 9 WrapShortest Enables

(See above).

3:0 Attribute 8 WrapShortest Enables

(See above).
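The WrapShortest Enables occupy one 4-bit nibble per attribute, eight attributes per DWord, covering Attributes 0-15 only. The sketch below shows one way to set a nibble; the function name and the two-element array holding the two WrapShortest DWords are assumptions for illustration.

```c
#include <stdint.h>

/* wrap_dw[0] holds Attributes 0-7, wrap_dw[1] holds Attributes 8-15.
 * xyzw_mask: bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W.
 * Returns 0 on success, -1 for attributes above 15 (wrap-shortest is
 * not supported there) or for a mask wider than 4 bits. */
static int sbe_set_wrap_shortest(uint32_t wrap_dw[2], uint32_t attr,
                                 uint32_t xyzw_mask)
{
    if (attr > 15u || xyzw_mask > 0xFu)
        return -1;

    uint32_t shift = (attr % 8u) * 4u;   /* nibble position inside the DWord */
    uint32_t *dw = &wrap_dw[attr / 8u];

    *dw = (*dw & ~(0xFu << shift)) | (xyzw_mask << shift);
    return 0;
}
```

A caller would still need to make sure the matching Constant Interpolation Enable bit is clear, since setting both is UNDEFINED.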


3DSTATE_SCISSOR_STATE_POINTERS

Source: RenderCS

Length Bias:

The 3DSTATE_SCISSOR_STATE_POINTERS command is used to define the location of the indirect SCISSOR_RECT state.

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 0Fh 3DSTATE_SCISSOR_STATE_POINTERS
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Default Value: 0h DWORD_COUNT_n

1 31:5 Scissor Rect Pointer
Format: DynamicStateOffset[31:5]SCISSOR_RECT*16

Specifies the 32-byte aligned address offset of the SCISSOR_RECT state. This offset is

relative to the Dynamic State Base Address

4:0 Reserved

Format: MBZ


3DSTATE_SF

Source: RenderCS

Length Bias:

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 13h 3DSTATE_SF
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Default Value: 5h Excludes DWord (0,1)
Format: =n Total Length - 2

1 31:15 Reserved
Format: MBZ

14:12 Depth Buffer Surface Format
Format: U3 Enumerated Type

Specifies the format of the depth buffer. This must exactly match the Surface Format

programmed via 3DSTATE_DEPTH_BUFFER. The SF requires this information in order to compute

Global Depth Bias.

Value Name

0h D32_FLOAT_S8X24_UINT

1h D32_FLOAT

2h D24_UNORM_S8_UINT

3h D24_UNORM_X8_UINT

4h Reserved

5h D16_UNORM

6h-7h Reserved

11 Legacy Global Depth Bias Enable
Format: Enable

Enables the SF to use the Global Depth Offset Constant state unmodified. If this bit is not set, the SF will scale the Global Depth Offset Constant as described elsewhere in this document.

Programming Notes

This bit should be set whenever non zero depth bias (Slope, Bias) values are used. Setting this

bit may have some degradation of performance for some workloads.

10 Statistics Enable

Format: Enable

If ENABLED, this FF unit will increment CL_PRIMITIVES_COUNT on behalf of the CLIP stage. If

DISABLED, CL_PRIMITIVES_COUNT will be left unchanged.

Programming Notes

This bit should be set whenever clipping is enabled and the Statistics Enable bit is set in

CLIP_STATE. It should be cleared if clipping is disabled or Statistics Enable in CLIP_STATE is

clear.

9 Global Depth Offset Enable Solid

Format: Enable

Programming Notes

This bit should be set whenever non zero depth bias (Slope, Bias) values are used.

Setting this bit may have some degradation of performance for some workloads.

Enables computation and application of Global Depth Offset for SOLID objects.

8 Global Depth Offset Enable Wireframe

Format: Enable

Enables computation and application of Global Depth Offset when triangles are rendered in

WIREFRAME mode.

Programming Notes

This bit should be set whenever non zero depth bias (Slope, Bias) values are used.

Setting this bit may have some degradation of performance for some workloads.

7 Global Depth Offset Enable Point

Format: Enable

Enables computation and application of Global Depth Offset when triangles are rendered in

POINT mode.

Programming Notes

This bit should be set whenever non zero depth bias (Slope, Bias) values are used.

Setting this bit may have some degradation of performance for some workloads.

6:5 FrontFace Fill Mode
Format: U2 enumerated type

This state controls how front-facing triangle and rectangle objects are rendered.

Value Name Description

0h SOLID Any triangle or rectangle object found to be front-facing is rendered as a solid object. This setting is required when rendering rectangle (RECTLIST) objects.

1h WIREFRAME Any triangle object found to be front-facing is rendered as a series of lines along the triangle boundaries (as determined by the topology type and controlled by the vertex EdgeFlags).

2h POINT Any triangle object found to be front-facing is rendered as a set of point primitives at the triangle vertices (as determined by the topology type and controlled by the vertex EdgeFlags). NOTE: If the triangle is clipped, points will not be rendered at clip-inserted vertices. Points will only be rendered at original vertices (if visible).

3h Reserved

4:3 BackFace Fill Mode
Format: U2 enumerated type

This state controls how back-facing triangle and rectangle objects are rendered.

Value Name Description

0h SOLID Any triangle or rectangle object found to be back-facing is rendered as a solid object. This setting is required when rendering rectangle (RECTLIST) objects.

1h WIREFRAME Any triangle object found to be back-facing is rendered as a series of lines along the triangle boundaries (as determined by the topology type and controlled by the vertex EdgeFlags).

2h POINT Any triangle object found to be back-facing is rendered as a set of point primitives at the triangle vertices (as determined by the topology type and controlled by the vertex EdgeFlags). NOTE: If the triangle is clipped, points will not be rendered at clip-inserted vertices. Points will only be rendered at original vertices (if visible).

3h Reserved

2 Reserved
Format: MBZ

1 View Transform Enable
Format: Enable

This bit controls the Viewport Transform function.

0 Front Winding

Determines whether a triangle object is considered "front facing" if the screen space vertex positions, when traversed in order, result in a clockwise (CW) or counter-clockwise (CCW) winding order. Does not apply to points or lines.

2 31 Anti-Aliasing Enable
Format: Enable

This field enables "alpha-based" line anti-aliasing.

Programming Notes

This field must be disabled if any of the render targets have integer (UINT or SINT) surface format.

30:29 Cull Mode

Format: 3D_CullMode

Controls removal (culling) of triangle objects based on orientation. The cull mode only applies to triangle objects and does not apply to lines, points or rectangles.

Programming Notes

Orientation determination is based on the setting of the Front Winding state.

Name Description

CULLMODE_BOTH All triangles are discarded (i.e., no triangle objects are drawn)

CULLMODE_NONE No triangles are discarded due to orientation

CULLMODE_FRONT Triangles with a front-facing orientation are discarded

CULLMODE_BACK Triangles with a back-facing orientation are discarded

28 Reserved

27:18 Line Width

Format: U3.7

Range: [0.0, 7.9921875]

Controls width of line primitives. Setting a Line Width of 0.0 specifies the rasterization

of the "thinnest" (one-pixel-wide), non-antialiased lines. Note that this effectively

overrides the effect of AAEnable (though the AAEnable state variable is not modified).

Programming Notes

Software must not program a value of 0.0 when running in MSRASTMODE_ON_xxx

modes - zero-width lines are not available when multisampling rasterization is

enabled.
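Line Width is an unsigned fixed-point value (U3.7: 3 integer bits, 7 fractional bits) stored in bits 27:18 of DWord 2. The following sketch converts a floating-point width into that encoding with rounding and clamping; the helper names are hypothetical, and the rounding policy (round half up, saturate at the maximum code) is an assumption rather than something the field description mandates.

```c
#include <stdint.h>

/* Converts a non-negative float to an unsigned fixed-point code with
 * frac_bits fractional bits, saturating at the largest code that fits
 * in total_bits (total_bits is assumed to be below 32). */
static uint32_t to_ufixed(float value, unsigned frac_bits, unsigned total_bits)
{
    const uint32_t max_code = (1u << total_bits) - 1u;

    if (value <= 0.0f)
        return 0u;

    float scaled = value * (float)(1u << frac_bits) + 0.5f; /* round half up */
    if (scaled >= (float)max_code)
        return max_code;
    return (uint32_t)scaled;
}

/* Line Width: U3.7 (10 bits) placed at bits 27:18 of 3DSTATE_SF DWord 2. */
static uint32_t sf_line_width_bits(float width_pixels)
{
    return to_ufixed(width_pixels, 7u, 10u) << 18;
}
```

The same helper would cover other fixed-point fields of this command, for instance a U8.3 value via `to_ufixed(w, 3, 11)`.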

17:16 Line End Cap Antialiasing Region Width

Format: U2

This field specifies the distances over which the coverage of anti-aliased line end caps are

computed.

Name

0.5 pixels

1.0 pixels

2.0 pixels

4.0 pixels

15 Reserved
Format: MBZ

14 Reserved
Format: MBZ

13 Reserved

12 Reserved

11 Scissor Rectangle Enable

Format: Enable

Enables operation of Scissor Rectangle.

10 Reserved

Format: MBZ

9:8 Multisample Rasterization Mode

Format: U2 enumerated type

This state is duplicated in 3DSTATE_WM and both must be set to the same value. See the field in

3DSTATE_WM for definition details.

7:0 Reserved

Format: MBZ

3 31 Last Pixel Enable

Format: Enable

If ENABLED, the last pixel of a diamond line will be lit. This state will only affect the rasterization

of Diamond lines (will not affect wide lines or anti-aliased lines).

Programming Notes

Last pixel is applied to all lines of a LINELIST, and only the last line of a LINESTRIP.

30:29 Triangle Strip/List Provoking Vertex Select
Format: 0-based vertex index

Selects which vertex of a triangle (in a triangle strip or list primitive) is considered the "provoking vertex". Used for flat shading of primitives. Does current implementation send provoking vertex first?

Name

Vertex 0

Vertex 1

Vertex 2

Reserved

28:27 Line Strip/List Provoking Vertex Select
Format: 0-based vertex index

Selects which vertex of a line (in a line strip or list primitive) is considered the "provoking vertex".

Name

Vertex 0

Vertex 1

Reserved

26:25 Triangle Fan Provoking Vertex Select
Format: 0-based vertex index

Selects which vertex of a triangle (in a triangle fan primitive) is considered the "provoking vertex".

Name

Vertex 0

Vertex 1

Vertex 2

Reserved

24:15 Reserved
Format: MBZ

14 AA Line Distance Mode

This bit controls the distance computation for antialiased lines.

Value Name Description

1h AALINEDISTANCE_TRUE True distance computation. This is the normal setting which should yield WHQL compliance.

13 Reserved
Format: MBZ

12 Vertex Sub Pixel Precision Select

Selects the number of fractional bits maintained in the vertex data.

Name Description

Disable 8 sub pixel precision bits maintained

Enable 4 sub pixel precision bits maintained

11 Use Point Width State

Controls whether the point width passed on the vertex or from state is used for rendering point primitives.

Description

Use Point Width on Vertex

Use Point Width from State

10:0 Point Width
Format: U8.3

Range: [0.125, 255.875] pixels

This field specifies the size (width) of point primitives in pixels. This field is overridden (though not overwritten) whenever point width information is passed in the FVF.

4 31:0 Global Depth Offset Constant

Format: IEEE_FP

Specifies the constant term in the Global Depth Offset function.

5 31:0 Global Depth Offset Scale

Format: IEEE_FP

Specifies the scale term used in the Global Depth Offset function.


6 31:0 Global Depth Offset Clamp

Format: IEEE_FP

Specifies the clamp term used in the Global Depth Offset function.
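DWords 4 through 6 are raw IEEE floating-point values, so a driver has to copy the bit pattern of each float into the command stream rather than convert it numerically. A minimal sketch, assuming the host `float` is IEEE 754 binary32 and using hypothetical helper names:

```c
#include <stdint.h>
#include <string.h>

/* Returns the bit pattern of f; memcpy avoids undefined type punning. */
static uint32_t float_to_dword(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Fills DWords 4..6 of a 7-DWord 3DSTATE_SF command buffer. */
static void sf_pack_depth_offset(uint32_t sf_dw[7], float constant,
                                 float scale, float clamp)
{
    sf_dw[4] = float_to_dword(constant); /* Global Depth Offset Constant */
    sf_dw[5] = float_to_dword(scale);    /* Global Depth Offset Scale */
    sf_dw[6] = float_to_dword(clamp);    /* Global Depth Offset Clamp */
}
```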


3DSTATE_SO_BUFFER

Source: RenderCS

Length Bias:

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 18h 3DSTATE_SO_BUFFER
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Default Value: 2h Excludes DWord (0,1)
Format: Total Length - 2

1 31 Reserved

Format: MBZ

30:29 SO Buffer Index

Format:

Specifies which of the four SO Buffers is being defined.

28:25 SO Buffer Object Control State

Format: MEMORY_OBJECT_CONTROL_STATE

Specifies the memory object control state for the SO buffer.

24:22 Reserved

Format: MBZ

21:12 Reserved
Format: MBZ

11:0 Surface Pitch
Format: U12 Pitch in Bytes

This field specifies the pitch of the SO buffer in #Bytes.

Value: [0,2048]. Must be 0 or a multiple of 4 Bytes.

Programming Notes

A Surface Pitch of 0 indicates an un-bound buffer. No writes are performed. Surface Base Address is ignored.

2 31:2 Surface Base Address

Format: GraphicsAddress[31:2]

This field specifies the starting DWord address LSBs of the buffer in Graphics Memory.

1:0 Reserved

Format: MBZ

3 31:2 Surface End Address

Format: GraphicsAddress[31:2]

This field specifies the ending DWord address of the buffer in Graphics Memory.

1:0 Reserved

Format: MBZ
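Putting the four DWords of 3DSTATE_SO_BUFFER together, a driver might validate the documented constraints before emitting the command. The sketch below is illustrative only: the function name and argument list are hypothetical, and the checks mirror the field descriptions above (four buffers, pitch of 0 or a multiple of 4 up to 2048, DWord-aligned addresses).

```c
#include <stdint.h>

/* Packs a 4-DWord 3DSTATE_SO_BUFFER. Returns 0 on success, -1 on a
 * value that violates the documented field ranges or alignment. */
static int so_buffer_pack(uint32_t dw[4], uint32_t buffer_index,
                          uint32_t object_control_state, uint32_t pitch_bytes,
                          uint32_t base_address, uint32_t end_address)
{
    if (buffer_index > 3u || object_control_state > 0xFu)
        return -1;
    if (pitch_bytes > 2048u || (pitch_bytes & 3u))
        return -1;
    if ((base_address & 3u) || (end_address & 3u))
        return -1;

    /* Type 3h, SubType 3h, Opcode 1h, Sub Opcode 18h, DWord Length 2h. */
    dw[0] = (0x3u << 29) | (0x3u << 27) | (0x1u << 24) | (0x18u << 16) | 0x2u;
    dw[1] = (buffer_index << 29) | (object_control_state << 25) | pitch_bytes;
    dw[2] = base_address; /* bits 31:2, low two bits MBZ */
    dw[3] = end_address;  /* bits 31:2, low two bits MBZ */
    return 0;
}
```

Passing a pitch of 0 produces an un-bound buffer, in which case the base address is ignored by the hardware.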


3DSTATE_SO_DECL_LIST

Source: RenderCS

Length Bias:

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 1h 3DSTATE_NONPIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 17h 3DSTATE_SO_DECL_LIST
Format: OpCode

15:9 Reserved
Format: MBZ

8:0 DWord Length
Format: =n Total Length - 2
Format: Q1
Default value = 2(N-1)+3 h

Value Name
3h Excludes DWord (0,1) [Default]

1 31:16 Reserved
Format: MBZ

15:12 Stream to Buffer Selects [3]
Format: U4 bitmask
Index of SO Stream

Identifies to which SO Buffers stream 3 outputs. See Stream To Buffer Selects [0] field description.

11:8 Stream to Buffer Selects [2]

Format: U4 bitmask

Identifies to which SO Buffers stream 2 outputs. See Stream To Buffer Selects [0] field description.

7:4 Stream to Buffer Selects [1]

Format: U4 bitmask

Identifies to which SO Buffers stream 1 outputs. See Stream To Buffer Selects [0] field description.


3:0 Stream to Buffer Selects [0]

Format: U4 bitmask

Identifies to which SO Buffers stream 0 outputs (irrespective of whether those buffers are

enabled via 3DSTATE_STREAMOUT). Software is required to scan the SO_DECL list in order to

provide this summary information.

Note: For "inactive" streams, software must program this field to all zero (no buffers written to)

and the corresponding Num Entries field to zero (no valid SO_DECLs).

Value Name

1xxxb SO Buffer 3

x1xxb SO Buffer 2

xx1xb SO Buffer 1

xxx1b SO Buffer 0

2 31:24 Num Entries [3]
Format: U8 #entries

Specifies the number of valid SO_DECL entries for Stream 3. (See notes in Num Entries [0] field

description).

Value: [0,128] entries

23:16 Num Entries [2]
Format: U8 #entries

Specifies the number of valid SO_DECL entries for Stream 2. (See notes in Num Entries [0] field

description).

Value: [0,128] entries

15:8 Num Entries [1]
Format: U8 #entries

Specifies the number of valid SO_DECL entries for Stream 1. (See notes in Num Entries [0] field

description).

Value: [0,128] entries

7:0 Num Entries [0]
Format: U8 #entries

Specifies the number of valid SO_DECL entries for Stream 0. Note that the SO_DECLs are programmed in groups of four (one SO_DECL for each of the four streams). Therefore the number of 2-DWord groups of SO_DECLs supplied in this command is derived from the stream(s) with the most valid SO_DECLs. The NumEntries value specific to each stream will indicate how many SO_DECLs are valid for that particular stream. Any trailing invalid SO_DECLs supplied for streams with fewer valid SO_DECLs will be ignored. It is legal to specify Num Entries = 0 for all four streams simultaneously. In this case there will be no SO_DECLs included in the command (only DW 0-2). Note that all Stream to Buffer Selects bits must be zero in this case (as no streams produce output).

Value: [0,128] entries

3..n 63:48 SO_DECL[3,n]
Format: SO_DECL

This field contains Stream 3 SO_DECL [n]

47:32 SO_DECL[2,n]
Format: SO_DECL

This field contains Stream 2 SO_DECL [n]

31:16 SO_DECL[1,n]
Format: SO_DECL

This field contains Stream 1 SO_DECL [n]

15:0 SO_DECL[0,n]
Format: SO_DECL

This field contains Stream 0 SO_DECL [n]
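Because the SO_DECLs are supplied in 2-DWord groups of four, the command length depends on the stream with the most valid entries. The sketch below derives DWords 0-2 from per-stream entry counts and buffer-select masks; the function name and its array arguments are hypothetical, and the SO_DECL payload itself is left to the caller.

```c
#include <stdint.h>

/* Fills DWords 0..2 of 3DSTATE_SO_DECL_LIST and returns the total command
 * length in DWords (3 + 2 * max entries), or 0 if an input is invalid.
 * An inactive stream (0 entries) must have an all-zero buffer-select mask. */
static uint32_t so_decl_list_header(uint32_t dw[3],
                                    const uint8_t num_entries[4],
                                    const uint8_t buffer_selects[4])
{
    uint32_t max_entries = 0u;

    for (int s = 0; s < 4; s++) {
        if (num_entries[s] > 128u || buffer_selects[s] > 0xFu)
            return 0u;
        if (num_entries[s] == 0u && buffer_selects[s] != 0u)
            return 0u;
        if (num_entries[s] > max_entries)
            max_entries = num_entries[s];
    }

    uint32_t total_dwords = 3u + 2u * max_entries;

    /* Type 3h, SubType 3h, Opcode 1h, Sub Opcode 17h, DWord Length in 8:0. */
    dw[0] = (0x3u << 29) | (0x3u << 27) | (0x1u << 24) | (0x17u << 16) |
            (total_dwords - 2u);
    dw[1] = (uint32_t)buffer_selects[0] |
            ((uint32_t)buffer_selects[1] << 4) |
            ((uint32_t)buffer_selects[2] << 8) |
            ((uint32_t)buffer_selects[3] << 12);
    dw[2] = (uint32_t)num_entries[0] |
            ((uint32_t)num_entries[1] << 8) |
            ((uint32_t)num_entries[2] << 16) |
            ((uint32_t)num_entries[3] << 24);
    return total_dwords;
}
```

With one group of SO_DECLs the DWord Length field comes out as 3h, which matches the default listed for the field.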


3DSTATE_STENCIL_BUFFER

Source: RenderCS

Length Bias:

This command sets the surface state of the separate stencil buffer, delivered as a pipelined state command.

However, the state change pipelining isn't completely transparent (see restriction below).

Programming Notes

Restriction: Prior to changing Depth/Stencil Buffer state (i.e., any combination of 3DSTATE_DEPTH_BUFFER, 3DSTATE_CLEAR_PARAMS, 3DSTATE_STENCIL_BUFFER, 3DSTATE_HIER_DEPTH_BUFFER), SW must first issue a pipelined depth stall (PIPE_CONTROL with Depth Stall bit set), followed by a pipelined depth cache flush (PIPE_CONTROL with Depth Flush bit set), followed by another pipelined depth stall (PIPE_CONTROL with Depth Stall bit set), unless SW can otherwise guarantee that the pipeline from WM onwards is already flushed (e.g., via a preceding MI_FLUSH).

3DSTATE_STENCIL_BUFFER must always be programmed along with the other Depth/Stencil state (3DSTATE_DEPTH_BUFFER, 3DSTATE_CLEAR_PARAMS, or 3DSTATE_HIER_DEPTH_BUFFER).

The stencil buffer is always Tile-Y.

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 06h 3DSTATE_STENCIL_BUFFER
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 Dword Length
Format: =n Total Length - 2

Value Name
Excludes Dword (0,1) [Default]

1 31 Reserved
Format: MBZ

30:29 Reserved
Format: MBZ

28:25 Stencil Buffer Object Control State
Format: MEMORY_OBJECT_CONTROL_STATE

Specifies the memory object control state for the stencil buffer.

Stencil Buffer Object Control State [3:0]

This field is not context save and restored by hardware. If this field is programmed to

any value other than zero, it must be programmed after the following commands or

events:

• MI_SET_CONTEXT

• MI_WAIT_FOR_EVENT (Specifically waits on vblank or display flip)

• Render engine goes IDLE due to head pointer equal to tail pointer

24:22 Reserved

Format: MBZ

21:17 Reserved
Format: MBZ

16:0 Surface Pitch
Format: U17-1 Pitch in Bytes

This field specifies the pitch of the stencil buffer in (#Bytes - 1).

Value Description

[127, 1FFFFh] corresponding to [128B, 128KB]; also restricted to a multiple of 128B

Programming Notes

Since this surface is tiled, the pitch specified must be a multiple of the tile pitch, in the range [128B, 128KB].

The pitch must be set to 2x the value computed based on width, as the stencil buffer is stored with two rows interleaved. For details on the separate stencil buffer storage format in memory, see GPU Overview (vol1a), Memory Data Formats, Surface Layout, 2D Surfaces, Stencil Buffer Layout (section 8.20.4.8).

2 31:0 Surface Base Address

Format: GraphicsAddress[31:0]Stencil_Buffer

Programming Notes

The Stencil Buffer can only be mapped to Main Memory (uncached).

This field specifies the starting Dword address of the buffer in mapped Graphics Memory.
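The Surface Pitch rules combine three constraints: the programmed pitch is twice the width-derived row size, it must be a multiple of 128 bytes within [128B, 128KB], and the field stores the pitch minus one. The sketch below works through that arithmetic. The function name is hypothetical, and `row_bytes` (the row size computed from the surface width) is an input the caller must derive from the stencil layout rules referenced above.

```c
#include <stdint.h>

/* Returns the value to program into Surface Pitch (bits 16:0), i.e.
 * (pitch in bytes - 1), or -1 if the resulting pitch is out of range.
 * row_bytes is the width-derived row size before the 2x interleave factor. */
static int32_t stencil_pitch_field(uint32_t row_bytes)
{
    if (row_bytes == 0u || row_bytes > 65536u)
        return -1;

    uint32_t pitch = 2u * row_bytes;     /* two rows are stored interleaved */
    pitch = (pitch + 127u) & ~127u;      /* round up to a multiple of 128B */

    if (pitch < 128u || pitch > 131072u) /* [128B, 128KB] */
        return -1;
    return (int32_t)(pitch - 1u);
}
```

For instance, a 100-byte row doubles to 200 bytes, rounds up to 256, and is programmed as 255.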


3DSTATE_STREAMOUT

Source: RenderCS

Length Bias:

This command contains pipelined state required by the SOL unit.

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 1Eh 3DSTATE_STREAMOUT
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Format: Total Length - 2

1 31 SO Function Enable

Format: U1

If set, the SO function is enabled. Vertex data will be streamed out to memory (subject to

overflow detection) as controlled by the various SO-related state variables.

If clear, the SO function is disabled, and therefore no vertex data will be streamed out to

memory. However, the Rendering Disable and Render Stream Select fields will still be used to

determine which vertices (if any) are forwarded down the pipeline for (possible) rendering.

30 Rendering Disable

Format: U1

If set, the SO stage will not forward any topologies down the pipeline. If clear, the SO stage will

forward topologies associated with Render Stream Select down the pipeline. This bit is used even

if SO Function Enable is DISABLED.

29 Reserved

Format: MBZ

28:27 Render Stream Select

Format:

Description

This field specifies which stream has been selected to be forwarded down the pipeline

for possible rendering. Topologies from other streams will not be passed down the

pipeline. If Rendering Disable is set, this field is ignored, as no topologies are sent

down the pipeline.

This field is used even if SO Function Enable is DISABLED.

26 Reorder Mode

This bit controls how vertices of triangle objects in TRISTRIP[_ADJ] and TRISTRIP_REV are reordered for the purposes of stream-out only (does not impact rendering). See table in Input Buffering.

Value Name Description

LEADING Reorder the vertices of alternating triangles of a TRISTRIP[_ADJ] such that the leading (first) vertices are in consecutive order starting at v0. A similar reordering is performed on alternating triangles in a TRISTRIP_REV.

TRAILING Reorder the vertices of alternating triangles of a TRISTRIP[_ADJ] such that the trailing (last) vertices are in consecutive order starting at v2. A similar reordering is performed on alternating triangles in a TRISTRIP_REV.
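The two reorder modes can be illustrated for a plain TRISTRIP with the sketch below. It assumes that every second triangle is the one reordered and that the reordering preserves winding; the authoritative table is the one in Input Buffering, which is not reproduced here, and the function name is hypothetical.

```python
def reorder_tristrip(num_vertices: int, mode: str) -> list:
    """List the vertex indices of each triangle of a TRISTRIP for stream-out.

    Assumption: odd-numbered triangles are the "alternating" ones and are
    reordered so that winding matches the even-numbered triangles.
    """
    if mode not in ("LEADING", "TRAILING"):
        raise ValueError("mode must be LEADING or TRAILING")
    triangles = []
    for n in range(num_vertices - 2):
        if n % 2 == 0:
            tri = (n, n + 1, n + 2)      # even triangles need no reordering
        elif mode == "LEADING":
            tri = (n, n + 2, n + 1)      # first vertex stays v(n)
        else:
            tri = (n + 1, n, n + 2)      # last vertex stays v(n+2)
        triangles.append(tri)
    return triangles
```

With LEADING the first vertices run v0, v1, v2, ...; with TRAILING the last vertices run v2, v3, v4, ..., matching the two descriptions above.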

25 SO Statistics Enable

Format: Enable

This bit controls whether StreamOutput statistics register(s) can be incremented.

Value Name Description

Disable SO_NUM_PRIMS_WRITTEN[0..3] and SO_PRIM_STORAGE_NEEDED[0..3] registers cannot increment.

Enable SO_NUM_PRIMS_WRITTEN[0..3] and SO_PRIM_STORAGE_NEEDED[0..3] registers can increment.

24:23 Reserved

Format: MBZ

22:12 Reserved

Format:

11 SO Buffer Enable [3]

Format:

(See SO Buffer Enable [0] )

10 SO Buffer Enable [2]

Format:

(See SO Buffer Enable [0] )

9 SO Buffer Enable [1]

Format:

(See SO Buffer Enable [0] )

8 SO Buffer Enable [0]

Format: U1

If set, stream output to SO Buffer 0 is enabled. If clear, SO Buffer 0 is considered "not bound"

and effectively treated as a zero-length buffer for the purposes of SO output and overflow

detection. If an enabled stream's Stream to Buffer Selects includes this buffer it is by definition an

overflow condition. That stream will cause no writes to occur, and only

SO_PRIM_STORAGE_NEEDED[] will increment. This bit is ignored if SO Function Enable

is DISABLED.
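As a rough sketch, DWord 1 of 3DSTATE_STREAMOUT can be assembled from the bit positions listed above. The function and parameter names are hypothetical, and the Reorder Mode bit is taken as a raw 0/1 because this excerpt does not give the numeric encoding of LEADING and TRAILING.

```python
def pack_streamout_dw1(so_enable, rendering_disable, render_stream,
                       reorder_mode_bit, stats_enable, buffer_enables):
    """Pack DWord 1 of 3DSTATE_STREAMOUT; reserved bits are left zero."""
    if not 0 <= render_stream <= 3:
        raise ValueError("Render Stream Select is a 2-bit field")
    if reorder_mode_bit not in (0, 1):
        raise ValueError("Reorder Mode is a single bit")
    if len(buffer_enables) != 4:
        raise ValueError("expected enables for SO buffers 0..3")
    dw = int(bool(so_enable)) << 31            # bit 31: SO Function Enable
    dw |= int(bool(rendering_disable)) << 30   # bit 30: Rendering Disable
    dw |= render_stream << 27                  # bits 28:27: Render Stream Select
    dw |= reorder_mode_bit << 26               # bit 26: Reorder Mode
    dw |= int(bool(stats_enable)) << 25        # bit 25: SO Statistics Enable
    for index, enabled in enumerate(buffer_enables):
        dw |= int(bool(enabled)) << (8 + index)  # bits 11:8: SO Buffer Enable [3:0]
    return dw
```

For example, enabling stream-out with stream 1 rendered, statistics on, and buffers 0 and 3 bound gives 0x8A000900.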

7:0 Reserved

Format:

2 31:30 Reserved

Format: MBZ

29 Stream 3 Vertex Read Offset

Format: U1 count of 256-bit units

Specifies amount of data to skip over before reading back Stream 3 vertex data. (See Stream 0 Vertex Read Offset)

28:24 Stream 3 Vertex Read Length

Format: U5-1 count of 256-bit units

(See Stream 0 Vertex Read Length)

23:22 Reserved

Format: MBZ

21 Stream 2 Vertex Read Offset

Format: U1 count of 256-bit units

Specifies amount of data to skip over before reading back Stream 2 vertex data. (See Stream 0 Vertex Read Offset)

20:16 Stream 2 Vertex Read Length

Format: U5-1 count of 256-bit units

15:14 Reserved

Format: MBZ

13 Stream 1 Vertex Read Offset

Format: U1 count of 256-bit units

Specifies amount of data to skip over before reading back Stream 1 vertex data. (See Stream 0 Vertex Read Offset)

12:8 Stream 1 Vertex Read Length

Format: U5-1 count of 256-bit units

(See Stream 0 Vertex Read Length)

7:6 Reserved

Format: MBZ

5 Stream 0 Vertex Read Offset

Format: U1 count of 256-bit units

Specifies amount of data to skip over before reading back Stream 0 vertex data. Must be zero if the GS is enabled and the Output Vertex Size field in 3DSTATE_GS is programmed to 0 (i.e., one 16B unit).

4:0 Stream 0 Vertex Read Length

Format: U5-1 count of 256-bit units

Specifies amount of vertex data to read back for Stream 0 vertices, starting at the Stream 0

Vertex Read Offset location. Maximum readback is 17 256-bit units (34 128-bit vertex attributes).

Read data past the end of the valid vertex data has undefined contents, and therefore shouldn't

be used to source stream out data.

Must be zero (i.e., read length = 256b) if the GS is enabled and the Output Vertex Size field in

3DSTATE_GS is programmed to 0 (i.e., one 16B unit).
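The per-stream read offset and read length fields of DWord 2 follow a regular 8-bit stride, which the sketch below exploits. The function name and the input layout (a list of four pairs) are assumptions made for illustration.

```python
def pack_streamout_dw2(streams):
    """Pack DWord 2 of 3DSTATE_STREAMOUT.

    streams: four (read_offset, read_length) pairs for streams 0..3,
    both expressed in 256-bit units.
    """
    if len(streams) != 4:
        raise ValueError("expected one entry per stream 0..3")
    dw = 0
    for index, (offset, length) in enumerate(streams):
        if offset not in (0, 1):
            raise ValueError("read offset is a 1-bit field")
        if not 1 <= length <= 17:
            raise ValueError("read length must be 1..17 units of 256 bits")
        base = 8 * index                 # stream N occupies bits 8N+5 .. 8N
        dw |= offset << (base + 5)       # Vertex Read Offset (U1)
        dw |= (length - 1) << base       # Vertex Read Length (U5-1)
    return dw
```

The length field stores the count minus one, so the maximum readback of 17 units is programmed as 16.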

3DSTATE_TE

Source: RenderCS

Length Bias:

Description

The state used by TE is defined with this inline state packet.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 1Ch 3DSTATE_TE

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 2h Excludes DWord (0,1)

Format: =n Total Length - 2

1 31:19 Reserved

Format: MBZ

18:16 Reserved

Format:

15:14 Reserved

Format:

13:12 Partitioning

Format:

This field specifies how edges are partitioned based on tessellation factor.

Value Name Description

INTEGER Outside/inside edges are divided into an integer number of equal-sized segments.

ODD_FRACTIONAL Outside/inside edges are divided into an odd number of possibly-unequal-sized segments.

EVEN_FRACTIONAL Outside/inside edges are divided into an even number of possibly-unequal-sized segments.

11:10 Reserved

Format: MBZ

9:8 Output Topology

Format:

This field specifies which primitive types are to be output.

Value Name Description

POINT Points are output (as POINTLIST topologies)

LINE Lines are output (as LINESTRIP topologies). Only valid if ISOLINE domain is selected.

TRI_CW Clockwise-ordered triangles are output (either as TRISTRIP, TRISTRIP_REV or TRILIST topologies). Not valid if ISOLINE domain is selected.

TRI_CCW Counter-clockwise-ordered triangles are output (either as TRISTRIP, TRISTRIP_REV or TRILIST topologies). Not valid if ISOLINE domain is selected.

7:6 Reserved

Format: MBZ

5:4 TE Domain

Format:

This field specifies which type of domain is to be tessellated.

Value Name Description

QUAD 2D (U,V) domain is tessellated

TRI Triangular (U,V,W) domain is tessellated

ISOLINE 2D (U,V) domain is tessellated.

3 Reserved

Format: MBZ

2:1 TE Mode

Format:

When TE Enable is ENABLED, this field specifies the overall operation of the TE stage. This field is

ignored if TE Enable is DISABLED.

Value

Name Description

HW_TESS Normal HW Tessellation Mode. The TessFactors are read from the

patch URB entry, and are used to perform fixed-function hardware

tessellation of the specified domain.

SW_TESS Software Tessellation Mode. The TE unit will pass down HS-thread-

generated tessellated domain points instead of generating them

itself from TessFactors. The TE unit will read the Domain Point Count

and Domain Point Buffer Starting Address fields from the patch

header, and if the count is 0 it will consider the patch culled and

discard it. Otherwise the address is used to start fetching

DOMAIN_POINT structures from memory and passing them down

the pipeline to DS.

Reserved Reserved

0 TE Enable

Format: Enable

If ENABLED, the TE stage will perform tessellation processing on incoming patch primitives. The

TE Mode field determines how this tessellation operation proceeds. If DISABLED, the TE goes into

pass-through mode. All other state fields are ignored.

Programming Notes

The tessellation stages (HS, TE and DS) must be enabled/disabled as a group. I.e., draw

commands can only be issued if all three stages are enabled or all three stages are disabled,

otherwise the behavior is UNDEFINED.
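The restrictions stated for 3DSTATE_TE (topology versus domain, and the group enable of HS, TE and DS) lend themselves to a driver-side sanity check. The following is a minimal sketch; the function name and the use of field names as strings are assumptions for illustration, not part of the hardware interface.

```python
def check_te_state(domain, topology, hs_enabled, te_enabled, ds_enabled):
    """Return a list of rule violations for a proposed tessellation setup."""
    errors = []
    # HS, TE and DS must be enabled or disabled as a group.
    if len({bool(hs_enabled), bool(te_enabled), bool(ds_enabled)}) != 1:
        errors.append("HS, TE and DS must all be enabled or all be disabled")
    # LINE output is only valid with the ISOLINE domain.
    if topology == "LINE" and domain != "ISOLINE":
        errors.append("LINE topology requires the ISOLINE domain")
    # Triangle output is not valid with the ISOLINE domain.
    if topology in ("TRI_CW", "TRI_CCW") and domain == "ISOLINE":
        errors.append("triangle topologies are not valid with ISOLINE")
    return errors
```

An empty list means none of the rules quoted above is violated; it does not validate any other field.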

2 31:0 Maximum Tessellation Factor Odd

Format: IEEE_Float

This field specifies the maximum TessFactor for ODD_FRACTIONAL partitioning when in

HW_TESS mode.

Value Name Description

427c0000h Per API Spec, for normal operation software should set this value to 63.0

[40400000h,427c0000h] Reserved Reserved.

Programming Notes

Note that ISOLINE's LineDensity TF is always subjected to INTEGER partitioning regardless of

the Partitioning state.

3 31:0 Maximum Tessellation Factor Not Odd

Format: IEEE_Float

This field specifies the maximum TessFactor for EVEN_FRACTIONAL or INTEGER partitioning

when in HW_TESS mode.

Value Name Description

42800000h Per API Spec, for normal operation software should set this value to 64.0

[40000000h,42800000h] Reserved Reserved

Programming Notes

Note that ISOLINE's LineDensity TF is always subjected to INTEGER partitioning regardless of

the Partitioning state.
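The hexadecimal values quoted for the two maximum tessellation factors are the IEEE-754 single-precision bit patterns of 63.0 and 64.0. A short sketch using Python's standard struct module shows the conversion; the helper name is hypothetical.

```python
import struct

def float_to_dword(value: float) -> int:
    """Return the 32-bit IEEE-754 single-precision bit pattern of value."""
    # Pack as a little-endian float, then reinterpret the bytes as uint32.
    return struct.unpack("<I", struct.pack("<f", value))[0]

max_tf_odd = float_to_dword(63.0)      # 0x427C0000
max_tf_not_odd = float_to_dword(64.0)  # 0x42800000
```

The same helper gives 0x40400000 for 3.0 and 0x40000000 for 2.0, the lower bounds of the two ranges listed above.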

3DSTATE_URB_DS

Source: RenderCS

Length Bias:

This command may not overlap with the push constants in the URB defined by the

3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,

3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.

Programming Notes

3DSTATE_URB_VS, 3DSTATE_URB_HS, and 3DSTATE_URB_GS must also be programmed in order for the

programming of this state to be valid.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 32h 3DSTATE_URB_DS

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 0h DWORD_COUNT_n

Format:

1 31 Reserved

Format: MBZ

30 Reserved

Format:

29:25 DS URB Starting Address

Format:

Offset from the start of the URB memory where DS starts its allocation, specified in multiples of 8 KB.

Value Name

[0,11]

24:16 DS URB Entry Allocation Size

Format: U9-1 Count of 512-bit units

Specifies the length of each URB entry owned by DS. This field is always used (even if DS Function Enable is DISABLED).

Value Name Description

[0,9]

15:0 DS Number of URB Entries

Specifies the number of URB entries that are used by DS. This field is always used

(even if DS Function Enable is DISABLED).

If Domain Shader Thread Dispatch is Enabled then the minimum number of handles

that must be allocated is 10 URB entries.

Value Name

[0,288]

Programming Notes

DS Number of URB Entries must be divisible by 8 if the DS URB Entry Allocation Size is programmed to a value less than 9, which is 10 512-bit URB entries. "2:0" = reserved "000"
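DWord 1 of the 3DSTATE_URB_* commands shares one layout, so a single helper can sketch the packing and the divisible-by-8 rule. The function name is hypothetical, and only field widths are checked here; the per-stage value ranges quoted in each command are not enforced.

```python
def pack_urb_dw1(start_8kb: int, entry_size_units: int, num_entries: int) -> int:
    """Pack DWord 1 of a 3DSTATE_URB_{VS,HS,DS,GS} command.

    start_8kb: starting address in multiples of 8 KB (bits 29:25).
    entry_size_units: entry size in 512-bit units; stored minus one (bits 24:16).
    num_entries: number of URB entries (bits 15:0).
    """
    size_field = entry_size_units - 1
    if not 0 <= start_8kb < (1 << 5):
        raise ValueError("starting address does not fit in 5 bits")
    if not 0 <= size_field < (1 << 9):
        raise ValueError("entry allocation size does not fit in 9 bits")
    if not 0 <= num_entries < (1 << 16):
        raise ValueError("number of entries does not fit in 16 bits")
    # Entry count must be a multiple of 8 when the size field is below 9.
    if size_field < 9 and num_entries % 8 != 0:
        raise ValueError("number of entries must be divisible by 8")
    return (start_8kb << 25) | (size_field << 16) | num_entries
```

For example, 64 entries of two 512-bit units each, starting 16 KB into the URB, pack to 0x04010040.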

3DSTATE_URB_GS

Source: RenderCS

Length Bias:

This command may not overlap with the push constants in the URB defined by the

3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,

3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.

Programming Notes

3DSTATE_URB_VS, 3DSTATE_URB_HS, and 3DSTATE_URB_DS must also be programmed in order for the

programming of this state to be valid.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 33h 3DSTATE_URB_GS

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 0h DWORD_COUNT_n

Format:

1 31 Reserved

Format: MBZ

30 Reserved

Format:

29:25 GS URB Starting Address

Format:

Offset from the start of the URB memory where GS starts its allocation, specified in multiples of 8 KB.

Value Name

[0,11]

24:16 GS URB Entry Allocation Size

Format: U9-1 512-bit units

Specifies the length of each URB entry owned by GS. This field is always used (even if GS Function Enable is DISABLED).

15:0 GS Number of URB Entries

Specifies the number of URB entries that are used by GS. This field is always used (even if GS

Function Enable is DISABLED).

Value Name

[0,192]

Programming Notes

Only if GS is disabled can this field be programmed to 0.

If GS is enabled this field shall be programmed to a value greater than 0. For GS Dispatch Mode "Single", this field shall be programmed to a value greater than or equal to 1. For other GS Dispatch Modes, refer to the definition of Dispatch Mode (3DSTATE_GS) for minimum values of this field.

GS Number of URB Entries must be divisible by 8 if the GS URB Entry Allocation Size is less than 9 512-bit URB entries.

"2:0" = reserved "000"

3DSTATE_URB_HS

Source: RenderCS

Length Bias:

This command may not overlap with the push constants in the URB defined by the

3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,

3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.

Programming Notes

3DSTATE_URB_VS, 3DSTATE_URB_DS, and 3DSTATE_URB_GS must also be programmed in order for the

programming of this state to be valid.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 31h 3DSTATE_URB_HS

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 0h DWORD_COUNT_n

Format:

1 31 Reserved

Format: MBZ

30 Reserved

Format:

29:25 HS URB Starting Address

Format:

Offset from the start of the URB memory where HS starts its allocation, specified in multiples of 8 KB.

Value Name

[0,11]

24:16 HS URB Entry Allocation Size

Format: U9-1 Count of 512-bit units

Specifies the length of each URB entry owned by HS. This field is always used (even if HS Function Enable is DISABLED).

15:0 HS Number of URB Entries

Specifies the number of URB entries that are used by HS. This field is always used (even

if HS Function Enable is DISABLED).

Programming Restriction: HS Number of URB Entries must be divisible by 8 if the HS URB Entry Allocation Size is less than 9 512-bit URB entries. "2:0" = reserved "000"

Value Name

[0,32]

3DSTATE_URB_VS

Source: RenderCS

Length Bias:

Description

VS URB Entry Allocation Size equal to 4 (5 512-bit URB rows) may cause performance to decrease due to banking in the URB. Element sizes of 16 to 20 should be programmed with six 512-bit URB rows.

This command may not overlap with the push constants in the URB defined by the

3DSTATE_PUSH_CONSTANT_ALLOC_VS, 3DSTATE_PUSH_CONSTANT_ALLOC_DS,

3DSTATE_PUSH_CONSTANT_ALLOC_HS, and 3DSTATE_PUSH_CONSTANT_ALLOC_GS commands.

Programming Notes

3DSTATE_URB_HS, 3DSTATE_URB_DS, and 3DSTATE_URB_GS must also be programmed in order for the

programming of this state to be valid.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 30h 3DSTATE_URB_VS

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 0h DWORD_COUNT_n

Format:

1 31 Reserved

Format: MBZ

30 Reserved

Format:

29:25 VS URB Starting Address

Format:

Offset from the start of the URB memory where VS starts its allocation, specified in multiples of 8 KB.

Value Name

[0,11]

24:16 VS URB Entry Allocation Size

Format: U9-1 count of 512-bit units

Specifies the length of each URB entry owned by VS. This field is always used (even if VS Function Enable is DISABLED).

Programming Notes

Programming Restriction: As the VS URB entry serves as both the per-vertex input and output

of the VS shader, the VS URB Allocation Size must be sized to the maximum of the vertex input

and output structures.
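The sizing rule for the VS URB entry can be written out as follows. This is a sketch under the assumption that the caller knows the byte sizes of the vertex input and output structures; the function name is hypothetical.

```python
def vs_urb_entry_size_field(input_bytes: int, output_bytes: int) -> int:
    """Return the VS URB Entry Allocation Size field (512-bit units, minus one)."""
    largest = max(input_bytes, output_bytes)
    if largest <= 0:
        raise ValueError("vertex structures must be non-empty")
    # One unit is 512 bits = 64 bytes; round up to whole units.
    units = -(-largest // 64)
    # The field is U9-1: it stores the unit count minus one.
    return units - 1
```

A 100-byte input with a 200-byte output needs four 512-bit units, which is programmed as 3.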

15:0 VS Number of URB Entries

Format: U16

Specifies the number of URB entries that are used by VS. This field is always used (even if VS

Function Enable is DISABLED).

Value Name

[32,512]

Programming Notes

Programming Restriction: VS Number of URB Entries must be divisible by 8 if the VS URB Entry Allocation Size is less than 9 512-bit URB entries. "2:0" = reserved "000b"

3DSTATE_VERTEX_BUFFERS

Source: RenderCS

Length Bias:

Description

This command is used to specify VB state used by the VF function.

Can specify from 1 to 33 VBs.

The VertexBufferID field within a VERTEX_BUFFER_STATE structure indicates the specific VB. If

a VB definition is not included in this command, its associated state is left unchanged and is

available for use if previously defined.

Programming Notes

It is possible to have individual vertex elements sourced completely from generated ID values and

therefore not require any vertex buffer accesses for that vertex element. In this case, VF function will

simply ignore the VB state associated with that vertex element. If all enabled vertex elements have

this characteristic, no VBs are required to process 3DPRIMITIVE commands. For example, this might

arise when the user wants to perform all data lookups in the first shader, so only generated index

values need to be passed down to it. In this extreme case, SW would not need to program any VB

state, and therefore not need to issue any 3DSTATE_VERTEX_BUFFERS commands.

For any 3DSTATE_VERTEX_BUFFERS command, at least one VERTEX_BUFFER_STATE structure must be included.

VERTEX_BUFFER_STATE structures are 4 DWords for both VERTEXDATA buffers and INSTANCEDATA buffers.

Inclusion of partial VERTEX_BUFFER_STATE structures is UNDEFINED.

The order in which VBs are defined within this command can be arbitrary, though a vertex buffer must be

defined only once in any given command (otherwise operation is UNDEFINED).

DWord Bit Description

0 31:29 Command Type

Default Value: 03h GFXPIPE

Format: Opcode

28:27 Command SubType

Default Value: 3h 3D

Format: Opcode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_VERTEX_BUFFERS

Format: Opcode

23:16 3D Command Sub Opcode

Default Value: 08h 3DSTATE_VERTEX_BUFFERS

Format: Opcode

15:8 Reserved

7:0 DWord Count

Default Value: 3 DWORD_COUNT_n

Format: n = 4b-1 (where b = # of buffer states included)
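The DWord Count formula n = 4b - 1 can be combined with the opcode fields listed above to form DWord 0. The sketch below uses a hypothetical function name and takes the limit of 33 buffers from the command description.

```python
def vertex_buffers_header(num_buffers: int) -> int:
    """Build DWord 0 of 3DSTATE_VERTEX_BUFFERS for num_buffers buffer states."""
    if not 1 <= num_buffers <= 33:
        raise ValueError("the command can specify from 1 to 33 VBs")
    # Each VERTEX_BUFFER_STATE is 4 DWords, so n = 4b - 1.
    dword_count = 4 * num_buffers - 1
    header = (0x3 << 29)    # Command Type: GFXPIPE
    header |= (0x3 << 27)   # Command SubType: 3D
    header |= (0x0 << 24)   # 3D Command Opcode
    header |= (0x08 << 16)  # 3D Command Sub Opcode: 3DSTATE_VERTEX_BUFFERS
    return header | dword_count
```

A command carrying a single buffer state therefore has a DWord Count of 3, matching the default value in the table, and a total length of 5 DWords.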

1..n 127:0 Vertex Buffer State [n]

Format: VERTEX_BUFFER_STATE

3DSTATE_VERTEX_ELEMENTS

Source: RenderCS

Length Bias:

Description

This is a variable-length command used to specify the active vertex elements. Each

VERTEX_ELEMENT_STATE structure contains a Valid bit which determines which elements are

used.

Up to 34 elements.

Programming Notes

At least one VERTEX_ELEMENT_STATE structure must be included.

Inclusion of partial VERTEX_ELEMENT_STATE structures is UNDEFINED.

SW must ensure that at least one vertex element is defined prior to issuing a 3DPRIMITIVE command, or operation is UNDEFINED.

There are no 'holes' allowed in the destination vertex: NOSTORE components must be overwritten by subsequent components unless they are the trailing DWords of the vertex. Software must explicitly choose some value (probably 0) to be written into DWords that would otherwise be 'holes'.

Within a VERTEX_ELEMENT_STATE structure, if a Component Control field is set to something

other than VFCOMP_STORE_SRC, no higher-numbered Component Control fields may be set

to VFCOMP_STORE_SRC. In other words, only trailing components can be set to something

other than VFCOMP_STORE_SRC.

See additional restrictions listed in the command fields and VERTEX_ELEMENT_STATE

description.

Element[0] must be valid.

All elements must be valid from Element[0] to the last valid element. (I.e. if Element[2] is valid

then Element[1] and Element[0] must also be valid).

The pitch between elements packed in the URB will always be 128 bits.
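Several of the programming notes above are mechanical rules that software could check before emitting the command. The sketch below covers three of them; the dictionary keys "valid" and "components" are a hypothetical in-memory representation, not the VERTEX_ELEMENT_STATE layout.

```python
def check_vertex_elements(elements):
    """Return a list of violations of the element ordering rules.

    elements: list of dicts with a boolean "valid" and a list "components"
    holding the four Component Control names of that element.
    """
    errors = []
    if not elements or not elements[0]["valid"]:
        errors.append("Element[0] must be valid")
    seen_invalid = False
    for index, element in enumerate(elements):
        if not element["valid"]:
            seen_invalid = True
            continue
        # All elements up to the last valid one must be valid.
        if seen_invalid:
            errors.append(f"Element[{index}] is valid but a lower element is not")
        # Only trailing components may differ from VFCOMP_STORE_SRC.
        seen_other = False
        for control in element["components"]:
            if control != "VFCOMP_STORE_SRC":
                seen_other = True
            elif seen_other:
                errors.append(f"Element[{index}] has VFCOMP_STORE_SRC after another control")
                break
    return errors
```

The check does not cover the 'holes' rule or the restrictions in the VERTEX_ELEMENT_STATE description, which depend on details outside this section.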

DWord Bit Description

0 31:29 Command Type

Default Value: 03h GFXPIPE

Format: Opcode

28:27 Command SubType

Default Value: 3h 3D

Format: Opcode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_VERTEX_ELEMENTS

Format: Opcode

23:16 3D Command Sub Opcode

Default Value: 09h 3DSTATE_VERTEX_ELEMENTS

Format: Opcode

15:8 Reserved

7:0 DWord Count

Format:

Vertex Element Count = (DWord Count + 1) / 2

Value Name Description

[1,66] DWORD_COUNT_n [Default] excludes DWords 0,1

Range: 1-34 Elements

1..n 63:0 Element [n]

Format: VERTEX_ELEMENT_STATE

3DSTATE_VF_STATISTICS

Source: RenderCS

Length Bias:

The VF stage tracks two pipeline statistics, the number of vertices fetched and the number of objects generated.

VF will increment the appropriate counter for each when statistics gathering is enabled by issuing the

3DSTATE_VF_STATISTICS command with the [Statistics Enable] bit set.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: Opcode

28:27 Command SubType

Format:

Value Name

Pipelined, Single DWord [Default]

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: Opcode

GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] (Pipelined, Single DWord)

23:16 3D Command Sub Opcode

Default Value: 0Bh 3DSTATE_VF_STATISTICS

Format: Opcode

GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] (Pipelined, Single DWord)

15:1 Reserved

Format: MBZ

0 Statistics Enable

Format: Enable

If ENABLED, VF will increment the pipeline statistics counters IA_VERTICES_COUNT and

IA_PRIMITIVES_COUNT for each vertex fetched and each object output, respectively, for

3DPRIMITIVE commands issued subsequently.

If DISABLED, these counters will not be incremented for subsequent 3DPRIMITIVE commands.
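Because 3DSTATE_VF_STATISTICS is a single-DWord command, the whole packet can be sketched in one expression. The function name is hypothetical; the field values come from the GFXPIPE[28:27 = 1h, 26:24 = 0h, 23:16 = 0Bh] encoding quoted above.

```python
def vf_statistics_dword(enable: bool) -> int:
    """Build the single DWord of 3DSTATE_VF_STATISTICS."""
    dword = (0x3 << 29)    # Command Type: GFXPIPE
    dword |= (0x1 << 27)   # Command SubType: pipelined, single DWord
    dword |= (0x0 << 24)   # 3D Command Opcode
    dword |= (0x0B << 16)  # 3D Command Sub Opcode: 3DSTATE_VF_STATISTICS
    dword |= int(bool(enable))  # bit 0: Statistics Enable
    return dword
```

Enabling statistics yields 0x680B0001 and disabling them yields 0x680B0000.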

3DSTATE_VIEWPORT_STATE_POINTERS_CC

Source: RenderCS

Length Bias:

The 3DSTATE_VIEWPORT_STATE_POINTERS_CC command is used to define the location of fixed functions'

viewport state table.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 23h 3DSTATE_VIEWPORT_STATE_POINTERS

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 0h DWORD_COUNT_n

Format:

1 31:5 CC Viewport Pointer

Format: DynamicStateOffset[31:5]CC_VIEWPORT*16

Specifies the 32-byte aligned address offset of the CC_VIEWPORT state. This offset is relative to the Dynamic State Base Address.

4:0 Reserved

Format: MBZ
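Putting the two DWords of 3DSTATE_VIEWPORT_STATE_POINTERS_CC together gives a short example of how a state-pointer command is formed. The function name is hypothetical, and the offset is assumed to have been computed already relative to the Dynamic State Base Address.

```python
def viewport_state_pointers_cc(cc_viewport_offset: int) -> list:
    """Return the two DWords of 3DSTATE_VIEWPORT_STATE_POINTERS_CC."""
    if not 0 <= cc_viewport_offset < (1 << 32):
        raise ValueError("offset must fit in 32 bits")
    # Bits 4:0 are reserved (MBZ), so the offset must be 32-byte aligned.
    if cc_viewport_offset % 32 != 0:
        raise ValueError("CC_VIEWPORT offset must be 32-byte aligned")
    header = (0x3 << 29) | (0x3 << 27) | (0x0 << 24) | (0x23 << 16)
    header |= 0x0  # DWord Length: default 0h for this two-DWord command
    return [header, cc_viewport_offset]
```

For an offset of 0x1000 the command is the pair 0x78230000, 0x00001000.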

3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP

Source: RenderCS

Length Bias:

The 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP command is used to define the location of fixed functions' viewport state table.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 21h 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 0h DWORD_COUNT_n

Format:

1 31:6 SF Clip Viewport Pointer

Format: DynamicStateOffset[31:6]SF_CLIP_VIEWPORT*16

Specifies the 64-byte aligned address offset of the SF_CLIP_VIEWPORT state. This offset is relative to the Dynamic State Base Address.

5:0 Reserved

Format: MBZ

3DSTATE_VS

Source: RenderCS

Length Bias:

Description

The state used by VS is defined with this inline state packet.

DWord Bit Description

0 31:29 Command Type

Default Value: 3h GFXPIPE

Format: OpCode

28:27 Command SubType

Default Value: 3h GFXPIPE_3D

Format: OpCode

26:24 3D Command Opcode

Default Value: 0h 3DSTATE_PIPELINED

Format: OpCode

23:16 3D Command Sub Opcode

Default Value: 10h 3DSTATE_VS

Format: OpCode

15:8 Reserved

Format: MBZ

7:0 DWord Length

Default Value: 4h Excludes DWord (0,1)

Format: =n Total Length - 2

1 31:6 Kernel Start Pointer

Format: InstructionBaseOffset[31:6]Kernel

This field specifies the starting location (1st GEN4 core instruction) of the kernel program run by

threads spawned by this FF unit. It is specified as a 64-byte-granular offset from the Instruction

Base Address. This field is ignored if VS Function Enable is DISABLED.

5:0 Reserved

Format: MBZ

2 31 Single Vertex Dispatch

Format: U1 Enumerated type

This field can be used to force single vertex SIMD4x2 VS threads.

Value Name Description

Multiple Dual vertex SIMD4x2 thread dispatches are allowed.

Single Single vertex SIMD4x2 thread dispatches are forced.

30 Vector Mask Enable (VME)

When SPF=0, VME specifies which mask to use to initialize the initial channel enables. When

SPF=1, VME specifies which mask to use to generate execution channel enables.

Value Name Description

Dmask Channels are enabled based on the dispatch mask

Vmask Channels are enabled based on the vector mask

29:27 Sampler Count

Specifies how many samplers (in multiples of 4) the vertex shader 0 kernel uses. Used only for

prefetching the associated sampler state entries. This field is ignored if VS Function Enable is

DISABLED.

Value Name Description

No Samplers no samplers used

1-4 Samplers between 1 and 4 samplers used

5-8 Samplers between 5 and 8 samplers used

9-12 Samplers between 9 and 12 samplers used

13-16 Samplers between 13 and 16 samplers used

26 Reserved

Format: MBZ
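Since Sampler Count is expressed in multiples of 4, the field value can be derived from the number of samplers a kernel uses. This sketch assumes that the five rows of the table are encoded as 0 through 4 in the order listed; the numeric values are not visible in this excerpt, and the function name is hypothetical.

```python
def sampler_count_field(num_samplers: int) -> int:
    """Map a sampler count (0..16) to the Sampler Count field.

    Assumption: the table rows are encoded 0..4 in listed order, that is,
    the field is the number of samplers rounded up to a multiple of 4,
    divided by 4.
    """
    if not 0 <= num_samplers <= 16:
        raise ValueError("the table covers 0 to 16 samplers")
    return -(-num_samplers // 4)
```

The field only steers prefetching of sampler state, so rounding up is the natural direction.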

25:18 Binding Table Entry Count

Format:

Specifies how many binding table entries the kernel uses. Used only for prefetching of the

binding table entries and associated surface state.

Note: For kernels using a large number of binding table entries, it may be wise to set this field to

zero to avoid prefetching too many entries and thrashing the state cache.

This field is ignored if VS Function Enable is DISABLED.

Value Name

[0,255]

17 Reserved

Format: MBZ

16 Floating Point Mode

Format: U1 enumerated type

Specifies the initial floating point mode used by the dispatched thread. This field is ignored if VS

Function Enable is DISABLED.

Value Name Description

IEEE-754 Use IEEE-754 Rules

Alternate Use alternate rules

15:14 Reserved

Format: MBZ

13 Illegal Opcode Exception Enable

Format: Enable

This bit gets loaded into EU CR0.1[12] (note the bit # difference). See Exceptions and ISA Execution. This field is ignored if VS Function Enable is DISABLED.

12 Reserved

Format: MBZ

11:8 Reserved

Format: MBZ

7 Software Exception Enable

Format: Enable

This bit gets loaded into EU CR0.1[13] (note the bit # difference). See Exceptions and ISA Execution. This field is ignored if VS Function Enable is DISABLED.

6:0 Reserved

Format: MBZ

3 31:10 Scratch Space Base Offset

Format: GeneralStateOffset[31:10]ScratchSpace

Specifies the starting location of the scratch space area allocated to this FF unit as a 1K-byte

aligned offset from the General State Base Address. If required, each thread spawned by this FF

unit will be allocated some portion of this space, as specified by Per-Thread Scratch Space. The

computed offset of the thread-specific portion will be passed in the thread payload as Scratch

Space Offset. The thread is expected to utilize "stateless" DataPort read/write requests to access

scratch space, where the DataPort will cause the General State Base Address to be added to the

offset passed in the request header.

This field is ignored if VS Function Enable is DISABLED.

9:4 Reserved

Format: MBZ

3:0 Per-Thread Scratch Space

Format: U4 power of 2 Bytes over 1K Bytes

Specifies the amount of scratch space to be allocated to each thread spawned by this FF unit. The driver must allocate enough contiguous scratch space, starting at the Scratch Space Base Pointer, to ensure that the Maximum Number of Threads can each get Per-Thread Scratch Space size without exceeding the driver-allocated scratch space. This field is ignored if VS Function Enable is DISABLED.

Value Name Description

[0,11] indicating [1K Bytes, 2M Bytes]

Programming Notes

This amount is available to the kernel for information only. It will be passed verbatim (if not altered by the kernel) to the Data Port in any scratch space access messages, but the Data Port will ignore it.
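The Per-Thread Scratch Space encoding is a power of two over 1 KB, and the driver must size the whole scratch area for the maximum number of threads. A minimal sketch of both calculations follows; the function names are hypothetical.

```python
def per_thread_scratch_bytes(field_value: int) -> int:
    """Decode the Per-Thread Scratch Space field into bytes."""
    if not 0 <= field_value <= 11:
        raise ValueError("field value must be in [0, 11]")
    # 0 means 1 KB; each increment doubles the size, up to 2 MB at 11.
    return 1024 << field_value

def total_scratch_bytes(field_value: int, max_threads: int) -> int:
    """Contiguous scratch space the driver must allocate for this FF unit."""
    if max_threads < 1:
        raise ValueError("at least one thread is required")
    return per_thread_scratch_bytes(field_value) * max_threads
```

With a field value of 2 (4 KB per thread) and 16 threads, the driver would reserve 64 KB starting at the scratch space base.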

4 31:25 Reserved

Format: MBZ

24:20 Dispatch GRF Start Register for URB Data

Format:

Specifies the starting GRF register number for the URB portion (Constant + Vertices) of the thread payload. This field is ignored if VS Function Enable is DISABLED.

Value Name Description

[0,31] indicating GRF [R0,R31]

19:17 Reserved

Format: MBZ

16:11 Vertex URB Entry Read Length

Format:

Specifies the number of pairs of 128-bit vertex elements to be passed into the payload

for each vertex. This field is ignored if VS Function Enable is DISABLED.

For SIMD4x2 dispatch, each vertex element requires one GRF of payload data, therefore

the number of GRFs with vertex data will be double the value programmed in this field.

Value Name

[1,63]

Programming Notes

It is UNDEFINED to set this field to 0 indicating no Vertex URB data to be read and passed to the thread.

10 Reserved

Format: MBZ

9:4 Vertex URB Entry Read Offset

Format:

Specifies the offset (in 256-bit units) at which Vertex URB data is to be read from the URB before

being included in the thread payload. This offset applies to all Vertex URB entries passed to the

thread. This field is ignored if VS Function Enable is DISABLED.

Value Name

[0,63]

3:0 Reserved

Format: MBZ
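The relationship between the Vertex URB Entry Read Length and Read Offset fields and the resulting payload can be sketched numerically. The function names are hypothetical; the factor of two and the 256-bit unit come from the field descriptions above.

```python
def vs_vertex_payload_grfs(read_length: int) -> int:
    """GRFs of vertex data in a SIMD4x2 payload for a given read length."""
    if not 1 <= read_length <= 63:
        raise ValueError("read length must be in [1, 63]; 0 is UNDEFINED")
    # The field counts pairs of 128-bit elements, and each element needs
    # one GRF in SIMD4x2 dispatch, so the GRF count is double the field.
    return 2 * read_length

def vs_vertex_read_offset_bytes(read_offset: int) -> int:
    """Byte offset into the vertex URB entry for a given read offset field."""
    if not 0 <= read_offset <= 63:
        raise ValueError("read offset must be in [0, 63]")
    return read_offset * 32  # one unit is 256 bits = 32 bytes
```

For instance, a read length of 3 passes six 128-bit elements per vertex and occupies six GRFs of vertex data.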

5 31:25 Maximum Number of Threads

Format: U7-1 representing thread count

Specifies the maximum number of simultaneous threads allowed to be active. Used to avoid using up the scratch space. Programming this value higher than the number of threads supported by the execution units may improve performance, since the architecture allows threads to be buffered between the check for max threads and the actual dispatch into the EU. Programming this value lower than the number of threads supported by the execution units may reduce performance. This field is ignored if VS Function Enable is DISABLED.

Value Name

[0,15] indicating thread count of [1,16]

24:23 Reserved

Format: MBZ

MBZ

Enable

Description

22:11 Reserved

Format:

10 Statistics Enable

Format:

If ENABLED, this FF unit will engage in statistics gathering. See the Statistics Gathering

section later in this chapter. If DISABLED, statistics information associated with this FF

stage will be left unchanged.

This field is used even if VS Function Enable is DISABLED.

9:3 Reserved

Format: MBZ

MBZ

Disable

Reserved

Format:

Vertex Cache Disable

Format:

This bit controls the operation of the Vertex Cache. This field is always used. If the Vertex Cache

is DISABLED and the VS Function is ENABLED, the Vertex Cache is not used and all incoming

vertices will be passed to VS threads.

If the Vertex Cache is ENABLED and the VS Function is ENABLED, incoming vertices that do not

hit in the Vertex Cache will be passed to VS threads.

If the Vertex Cache is ENABLED and the VS Function is DISABLED, input vertices that miss in the Vertex Cache will be assembled and written to the URB, but will pass through the VS stage unmodified (not shaded).

The Vertex Cache is invalidated whenever the Vertex Cache becomes DISABLED, whenever the VS Function Enable toggles, between 3DPRIMITIVE commands and between instances within a 3DPRIMITIVE command.

0 VS Function Enable

Format:

Description

If ENABLED, VS threads may be spawned to process VF-generated vertices before the

resulting vertices are passed down the pipeline.

If DISABLED, VF-generated vertices will pass through the VS function and be sent down the pipeline unmodified. The Vertex Cache is still available in this mode, if enabled.

If Statistics Enable is ENABLED, VS_INVOCATION_COUNT will increment by 1 for every

vertex that passes through the VS stage, even if VS Function Enable is DISABLED.

This field is always used.

Enable

3DSTATE_WM

Source: RenderCS

Length Bias:

DWord Bit Description

0 31:29 Command Type
Default Value: 3h GFXPIPE
Format: OpCode

28:27 Command SubType
Default Value: 3h GFXPIPE_3D
Format: OpCode

26:24 3D Command Opcode
Default Value: 0h 3DSTATE_PIPELINED
Format: OpCode

23:16 3D Command Sub Opcode
Default Value: 14h 3DSTATE_WM
Format: OpCode

15:8 Reserved
Format: MBZ

7:0 DWord Length
Default Value: 01h Excludes DWord (0,1)
Format: Total Length - 2
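The header fields listed above combine into a single DWord 0 value. The sketch below assembles it from the documented bit positions; the function name and parameter names are hypothetical, and the Command Type is fixed at 3h (GFXPIPE) as listed for this command.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build DWord 0 of a GFXPIPE command.
 * Bits 31:29 Command Type (3h), 28:27 SubType, 26:24 Opcode,
 * 23:16 Sub Opcode, 7:0 DWord Length (= total length - 2). */
uint32_t gfxpipe_header(uint32_t subtype, uint32_t opcode,
                        uint32_t sub_opcode, uint32_t total_dwords)
{
    assert(subtype <= 3 && opcode <= 7 && sub_opcode <= 0xFF);
    assert(total_dwords >= 2 && total_dwords - 2 <= 0xFF);
    return (3u << 29) | (subtype << 27) | (opcode << 24) |
           (sub_opcode << 16) | (total_dwords - 2);
}
```

With the 3DSTATE_WM values above (SubType 3h, Opcode 0h, Sub Opcode 14h) and a three-DWord command, this yields 0x78140001.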

1 31 Statistics Enable

Format: Enable

If ENABLED, the Windower and pixel pipeline will engage in statistics gathering. If DISABLED,

statistics information associated with this FF stage will be left unchanged. See Statistics

Gathering.

30 Depth Buffer Clear

Format: Enable

When set, the depth buffer is initialized as a side-effect of rendering pixels.

Programming Notes

If this field is enabled,

2. the Depth Test Enable field in DEPTH_STENCIL_STATE must be disabled.

3. 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.

4. 3DSTATE_DEPTH_BUFFER::Stencil Write Enable must be set if 3DSTATE_STENCIL_BUFFER::Stencil buffer enable is set. Additionally the following must be set to the correct values.

   2. DEPTH_STENCIL_STATE::Stencil Write Mask must be 0xFF

   3. DEPTH_STENCIL_STATE::Stencil Test Mask must be 0xFF

   4. DEPTH_STENCIL_STATE::Back Face Stencil Write Mask must be 0xFF

   5. DEPTH_STENCIL_STATE::Back Face Stencil Test Mask must be 0xFF

Refer to section 0 "Depth Buffer Clear" for additional restrictions when this field is enabled. If this field is enabled, Pixel Shader Kill Pixel must be disabled.

29 Thread Dispatch Enable

Format: Enable

This bit, if set, indicates that it is possible for a PS thread to modify a render target, i.e., at least

one render target is enabled (is not of type SURFTYPE_NULL and has at least one channel

enabled for writes) and the PS kernel contains a code path that may issue a write to that/those

enabled RTs.

Programming Notes

This bit is used for performance optimizations and does not directly control writing to render

targets. If this bit is DISABLED, no pixel shader threads will be dispatched. For correct behavior,

this bit must be set consistently with the behavior of the PS kernel, i.e. if this bit is DISABLED

the PS kernel must not write color or depth to any render targets. If this field is disabled, Pixel

Shader Kill Pixel must be disabled.

28 Depth Buffer Resolve Enable

Format: Enable

When set, the depth buffer is made to be consistent with the hierarchical depth buffer as a side-

effect of rendering pixels. This is intended to be used when the depth buffer is to be used as a

surface outside of the 3D rendering operation.

Programming Notes

If this field is enabled,

the Depth Buffer Clear and Hierarchical Depth Buffer Resolve Enable fields must

both be disabled.

3. 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.

Refer to section 11.5.4.2 "Depth Buffer Resolve" for additional restrictions when this field is

enabled. If Hierarchical Depth Buffer Enable is disabled, enabling this field will have no effect.

27 Hierarchical Depth Buffer Resolve Enable

Format: Enable

When set, the hierarchical depth buffer is made to be consistent with the depth buffer as a side-

effect of rendering pixels. This is intended to be used when the depth buffer has been modified

outside of the 3D rendering operation.

Programming Notes

If this field is enabled,

the Depth Buffer Clear and Depth Buffer Resolve Enable fields must both be

disabled.

3. 3DSTATE_DEPTH_BUFFER::Depth Write Enable must be set.

Refer to section 11.5.4.3 "Hierarchical Depth Buffer Resolve" for additional restrictions

when this field is enabled.

If Hierarchical Depth Buffer Enable is disabled, enabling this field will have no effect.

Performance Note: expect the hierarchical depth buffer's impact on performance to

be reduced for some period of time after this operation is performed, as the

hierarchical depth buffer is initialized to a state that makes it ineffective. Further

rendering will tend to bring the hierarchical depth buffer back to a more effective

state.

Software needs to do an ambiguate after allocating the surface for the first time if the

depth buffer width and height are NOT aligned to 8 and 4 respectively.

26 Legacy Diamond Line Rasterization

Format: Enable

This bit, if ENABLED, indicates that the Windower will rasterize zero width lines using the DX9

rasterization rules. If DISABLED, the Windower will rasterize zero width lines using the DX10

rasterization rules (see Strips Fans chapter).

25 Pixel Shader Kill Pixel

Format: Enable

This bit, if ENABLED, indicates that the PS kernel or color calculator has the ability to kill

(discard) pixels or samples, other than due to depth or stencil testing. This bit is required

to be ENABLED in the following situations:

• The API pixel shader program contains "killpix" or "discard" instructions, or other code in

the pixel shader kernel that can cause the final pixel mask to differ from the pixel mask

received on dispatch.

• A sampler with chroma key enabled with kill pixel mode is used by the pixel shader.

• Any render target has Alpha Test Enable or AlphaToCoverage Enable enabled.

• The pixel shader kernel generates and outputs oMask.

Note: As ClipDistance clipping is fully supported in hardware and therefore not via PS

instructions, there should be no need to ENABLE this bit due to ClipDistance clipping.
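Several of the DWord 1 restrictions above are mutual-exclusion rules that a driver could check before emitting the command. The following is a minimal sketch of such a check; the macro and function names are hypothetical, and only the rules stated in the programming notes above are encoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as documented for 3DSTATE_WM DWord 1 (names hypothetical). */
#define WM_DEPTH_BUFFER_CLEAR    (1u << 30)
#define WM_THREAD_DISPATCH       (1u << 29)
#define WM_DEPTH_BUFFER_RESOLVE  (1u << 28)
#define WM_HIZ_RESOLVE           (1u << 27)
#define WM_PS_KILL_PIXEL         (1u << 25)

bool wm_dw1_flags_consistent(uint32_t dw1)
{
    bool clear    = (dw1 & WM_DEPTH_BUFFER_CLEAR) != 0;
    bool dispatch = (dw1 & WM_THREAD_DISPATCH) != 0;
    bool resolve  = (dw1 & WM_DEPTH_BUFFER_RESOLVE) != 0;
    bool hiz      = (dw1 & WM_HIZ_RESOLVE) != 0;
    bool kill     = (dw1 & WM_PS_KILL_PIXEL) != 0;

    /* Clear, depth resolve and HiZ resolve exclude one another. */
    if ((int)clear + (int)resolve + (int)hiz > 1)
        return false;
    /* Kill Pixel must be disabled during a depth clear and whenever
     * Thread Dispatch Enable is disabled. */
    if (kill && (clear || !dispatch))
        return false;
    return true;
}

/* Debug-build guard: returns dw1 unchanged if the flags are consistent. */
uint32_t wm_dw1_checked(uint32_t dw1)
{
    assert(wm_dw1_flags_consistent(dw1));
    return dw1;
}
```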

24:23 Pixel Shader Computed Depth Mode

Format: U2 Enumerated Type

This field specifies the computed depth mode for the pixel shader.

Name             Description

PSCDEPTH_OFF     Pixel shader does not compute depth

PSCDEPTH_ON      Pixel shader computes depth with no guarantee as to its value

PSCDEPTH_ON_GE   Pixel shader computes depth and guarantees that oDepth >= SourceDepth

PSCDEPTH_ON_LE   Pixel shader computes depth and guarantees that oDepth <= SourceDepth

Programming Notes

When bit 5 is set in WM_STATE (RT independent rasterization is enabled), this field cannot be programmed to values 2h or 3h.

22:21 Early Depth/Stencil Control

Format: U2 Enumerated Type

This field specifies the behavior of early depth/stencil test.

Value Name Description

EDSC_NORMAL      Depth/Stencil Test/Write behaves as if it happens post-shader; however, the pixel shader is not necessarily executed if the pixel fails depth or stencil test (this is the legacy behavior)

EDSC_PSEXEC      Depth/Stencil Test/Write behaves as if it happens post-shader, and the pixel shader is executed if the pixel fails depth or stencil test (although pre-shader actions such as primitive inclusion, stipple, etc. will still cause the shader not to execute)

2h EDSC_PREPS    Depth/Stencil Test/Write behaves as if it happens pre-shader. The pixel shader is not executed if the pixel fails depth or stencil test. Depth and stencil writes occur even if the pixel is killed by the shader or post-shader by alpha test, etc. Depth output by the pixel shader is ignored.

Reserved

Programming Notes

If EDSC_PSEXEC mode is selected, Thread Dispatch Enable must be set.

Restriction

When a value of "2h" is programmed, PS_INVOCATION_COUNT may not be accurate.

20 Pixel Shader Uses Source Depth

Format: Enable

This bit, if ENABLED, indicates that the PS kernel requires the source depth value (vPos.z) to be

passed in the payload. The source depth value is interpolated according to the Position ZW

Interpolation Mode state.

19 Pixel Shader Uses Source W

Format: Enable

This bit, if ENABLED, indicates that the PS kernel requires the interpolated source W value

(vPos.w) to be passed in the payload. The W value is interpolated according to the Position ZW

Interpolation Mode state.

18:17 Position ZW Interpolation Mode

Format: U2 Enumerated Type

This field selects the "interpolation mode" associated with the Position Z (source depth) and W coordinates passed in the PS payload when the PS requires Position as input. This field does not determine whether these coordinates are actually included in the payload (see Pixel Shader Uses Source Depth and Pixel Shader Uses Source W).

Value

Programming Notes

When bit 5 is set in WM_STATE, value of 3h is not defined for this field.

Programming Note: When bit 5 in dword 1 (RT Independent Rasterization Enable) is set and bit

30 in dword 2 (PS UAV-only) is not set in WM_STATE, value of 3h is not defined for this field.

Name

INTERP_PIXEL

Reserved

INTERP_SAMPLE

Description

Evaluate Z & W at the pixel center or UL corner (as

specified by Pixel Location of 3DSTATE_MULTISAMPLE)

INTERP_CENTROID

16:11 Barycentric Interpolation Mode

Format: Enable[6]

Controls which barycentric interpolation terms must be passed into the pixel shader kernel.

Bit 0: Perspective Pixel Location barycentric is required

Bit 1: Perspective Centroid barycentric is required

Bit 2: Perspective Sample barycentric is required

Bit 3: Non-perspective Pixel Location barycentric is required

Bit 4: Non-perspective Centroid barycentric is required

Bit 5: Non-perspective Sample barycentric is required

Programming Notes

If contiguous dispatch modes are enabled, only bit 3 (non-perspective pixel location) can be set; all other bits in this field must be zero. Pixel Location refers to either the upper left corner or the pixel center, depending on the Pixel Location state of 3DSTATE_MULTISAMPLE.

MSDISPMODE_PERSAMPLE is required in order to select Perspective Sample or Non-perspective Sample barycentric coordinates.

Restriction: When Centroid Barycentric mode is required, HW may produce incorrect interpolation results when a 2x2 pixel block contains unlit pixels.
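The six enable bits of Barycentric Interpolation Mode occupy bits 16:11 of DWord 1. The sketch below names each bit and packs the field while checking the per-sample restriction stated above. All identifiers are hypothetical; whether per-sample dispatch is active is passed in by the caller rather than read from hardware state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers within the 6-bit field, in the documented order. */
enum {
    BARY_PERSP_PIXEL       = 1u << 0,
    BARY_PERSP_CENTROID    = 1u << 1,
    BARY_PERSP_SAMPLE      = 1u << 2,
    BARY_NONPERSP_PIXEL    = 1u << 3,
    BARY_NONPERSP_CENTROID = 1u << 4,
    BARY_NONPERSP_SAMPLE   = 1u << 5
};

/* Hypothetical helper: place the mode bits at DWord 1 bits 16:11. */
uint32_t wm_pack_barycentric(uint32_t modes, bool per_sample_dispatch)
{
    assert((modes & ~0x3Fu) == 0);
    /* Sample barycentrics require MSDISPMODE_PERSAMPLE. */
    assert(per_sample_dispatch ||
           (modes & (BARY_PERSP_SAMPLE | BARY_NONPERSP_SAMPLE)) == 0);
    return modes << 11;
}
```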

10 Pixel Shader Uses Input Coverage Mask

Format: Enable

This bit, if ENABLED, indicates that the PS kernel requires the input coverage mask to be passed

in the payload.

9:8 Line End Cap Antialiasing Region Width

Format: U2

This field specifies the distance over which the coverage of anti-aliased line end caps is computed.

Value

Name

0.5 pixels

1.0 pixels

2.0 pixels

4.0 pixels

Description

7:6 Line Antialiasing Region Width

Format:

Value

Name

0.5 pixels

1.0 pixels

2.0 pixels

4.0 pixels

MBZ

Enable

Description

This field specifies the distance over which the anti-aliased line coverage is computed.

Reserved

Format:

Polygon Stipple Enable

Format:

Enables the Polygon Stipple function.

3 Line Stipple Enable

Format:

Enables the Line Stipple function.

Enable

2 Point Rasterization Rule

Format: 3D_RasterizationRule

This field specifies the rasterization rules to be applied whenever the edges of a point primitive

fall exactly on a pixel sampling point.

Value

Name

RASTRULE_UPPER_LEFT

Description

To match "normal" upper left rules for surface

primitives

RASTRULE_UPPER_RIGHT To match OpenGL point rasterization rules (round to

+ infinity, where this is the upper right direction wrt

OpenGL screen origin of lower left).

U2 enumerated type

1:0 Multisample Rasterization Mode

Format:

This field determines whether multisample rasterization is turned on/off, and how the pixel

sample point(s) are defined. Software sets this according to the API, the API's multisample enable

state setting (if any), and whether 1X or 4X MSRTs are bound. This state is duplicated in

3DSTATE_SF and both must be set to the same value. Refer to the "Multisampling" section for

details on the settings of this field.

Value

Name

MSRASTMODE_OFF_PIXEL

MSRASTMODE_OFF_PATTERN

MSRASTMODE_ON_PIXEL

MSRASTMODE_ON_PATTERN

U1 Enumerated Type

2 31 Multisample Dispatch Mode

Format:

This bit, along with Number of Multisamples, determines how PS threads are dispatched.

Software programs this bit depending on the per-pixel vs. per-sample PS execution requirement.

When RT Independent Rasterization Enable = 1, value of 0h for this field is not allowed.

Value

Name Description

MSDISPMODE_PERSAMPLE This is the high-quality DX10.1 multisample mode

where (over and above PERPIXEL mode) the PS is

run for each covered sample. This mode is also

used for "normal" non-multisample rendering (aka

1X), given Number of Multisamples is

programmed to NUMSAMPLES_1.

MSDISPMODE_PERPIXEL This is the classic multisample mode of operation,

typically used for both antialiasing and

transparency. Setup and rasterization operate in

full multisample mode, testing coverage and

depth/stencil test at the sample level but only

running the PS once per pixel.

MBZ

30:0 Reserved

Format:

add - Addition

Source:

Length Bias:

EuIsa

The add instruction performs component-wise addition of src0 and src1 and stores the results in dst.

Addition of two floating-point numbers follows rules in add (IEEE mode) or add (ALT mode).

Format:

[(pred)] add[.cmod] (exec_size) dst src0 src1

Programming Notes

Use a source modifier with add to implement subtraction.

Syntax

[(pred)] add[.cmod] (exec_size) reg reg reg

[(pred)] add[.cmod] (exec_size) reg reg imm32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        dst.chan[n] = src0.chan[n] + src1.chan[n];
    }
}

Predication Conditional Modifier Saturation Source Modifier

Y Y Y Y

Src Types Dst Types

*B,*W,*D *B,*W,*D

*B,*W,*D F

Bit

127:64 ImmSource

Exists If:

Format:

DWord

0..3

Description

([ImmSource][e]=='IMM')

EU_INSTRUCTION_SOURCES_REG_IMM

([RegSource][e]!='IMM')

EU_INSTRUCTION_SOURCES_REG_REG

EU_INSTRUCTION_OPERAND_CONTROLS

EU_INSTRUCTION_HEADER

127:64 RegSource

Exists If:

Format:

63:32

31:0

Operand Controls

Format:

Header

Format:

addc - Addition with Carry

Source:

Length Bias:

EuIsa

The addc instruction performs component-wise addition of src0 and src1 and stores the results in dst; it also

stores the carry into acc.

If the operation produces a carry out, 0x00000001 is stored in acc, else 0x00000000 is stored in acc.

Format:

[(pred)] addc[.cmod] (exec_size) dst src0 src1

Restriction

Restriction: AccWrEn is required. The accumulator is an implicit destination and thus cannot be an explicit

destination operand.

Syntax

[(pred)] addc[.cmod] (exec_size) reg reg reg

[(pred)] addc[.cmod] (exec_size) reg reg imm32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        dst.chan[n] = src0.chan[n] + src1.chan[n];
        acc.chan[n] = carry(src0.chan[n] + src1.chan[n]);
    }
}
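The carry behavior can be modeled in C for UD operands. This is a sketch under the assumption that a 32-bit write-enable mask stands in for WrEn and that plain arrays stand in for the register channels; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t sum;    /* low 32 bits of src0 + src1 */
    uint32_t carry;  /* 0x00000001 on carry out, else 0x00000000 */
} addc_result;

/* One channel: unsigned addition wraps modulo 2^32, so a carry
 * occurred exactly when the wrapped sum is smaller than an operand. */
addc_result addc_ud(uint32_t src0, uint32_t src1)
{
    addc_result r;
    r.sum = src0 + src1;
    r.carry = (r.sum < src0) ? 1u : 0u;
    return r;
}

/* All channels: only channels whose write-enable bit is set are updated. */
void addc_exec(uint32_t *dst, uint32_t *acc, const uint32_t *src0,
               const uint32_t *src1, unsigned exec_size, uint32_t wr_en)
{
    assert(exec_size <= 32);
    for (unsigned n = 0; n < exec_size; n++) {
        if (wr_en & (1u << n)) {
            addc_result r = addc_ud(src0[n], src1[n]);
            dst[n] = r.sum;
            acc[n] = r.carry;
        }
    }
}
```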

Predication Conditional Modifier Saturation Source Modifier

Y Y N N

Src Types Dst Types

UD UD

Bit

127:64 ImmSource

Exists If:

Format:

DWord

0..3

Description

([ImmSource][e]=='IMM')

EU_INSTRUCTION_SOURCES_REG_IMM

([RegSource][e]!='IMM')

EU_INSTRUCTION_SOURCES_REG_REG

EU_INSTRUCTION_OPERAND_CONTROLS

EU_INSTRUCTION_HEADER

127:64 RegSource

Exists If:

Format:

63:32

31:0

Operand Controls

Format:

Header

Format:

asr - Arithmetic Shift Right

Source:

Length Bias:

EuIsa

Description

Perform component-wise arithmetic right shift of the bits in src0 by the shift count indicated in src1,

storing the results in dst. If src0 has a signed type, insert copies of src0's sign bit in the number of

MSBs indicated by the shift count. Otherwise insert 0 bits.

The shift count is taken from the low five bits of src1, regardless of the src1 type and treated as an

unsigned integer in the range 0 to 31.

Format:

[(pred)] asr[.cmod] (exec_size) dst src0 src1

Programming Notes

If src0 is -1, the result is -1 regardless of the shift count.

For unsigned src0 types, asr and shr produce the same result.

Syntax

[(pred)] asr[.cmod] (exec_size) reg reg reg

[(pred)] asr[.cmod] (exec_size) reg reg imm32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        shiftCnt = src1.chan[n] & 0x1F; // Always use low 5 bits for shift count.
        if ( src0.chan[n] >= 0 ) {
            dst.chan[n] = src0.chan[n] >> shiftCnt;
        } else {
            int maskLSB = pow(2, shiftCnt) - 1;
            if ( (maskLSB & src0.chan[n]) == 0 ) {
                dst.chan[n] = sign(src0.chan[n]) * (abs(src0.chan[n]) >> shiftCnt);
            } else {
                dst.chan[n] = sign(src0.chan[n]) * (abs(src0.chan[n]) >> shiftCnt) - 1;
            }
        }
    }
}
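For a signed DWord source, the operation described above can be modeled portably in C. This is a sketch with a hypothetical function name; it works on the raw bits because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of asr for a signed 32-bit (D) src0. */
int32_t asr_d(int32_t src0, uint32_t src1)
{
    uint32_t shift = src1 & 0x1Fu;            /* low five bits only */
    uint32_t bits = (uint32_t)src0 >> shift;  /* logical shift of the raw bits */
    if (src0 >= 0)
        return (int32_t)bits;
    if (shift != 0)
        bits |= ~(0xFFFFFFFFu >> shift);      /* copy the sign bit into the vacated MSBs */
    return -(int32_t)(~bits) - 1;             /* decode two's complement without a lossy cast */
}

void asr_self_check(void)
{
    assert(asr_d(-5, 1) == -3);   /* rounds toward negative infinity */
    assert(asr_d(-1, 31) == -1);  /* -1 stays -1 for any shift count */
    assert(asr_d(8, 33) == 4);    /* 33 & 0x1F == 1 */
}
```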

Predication Conditional Modifier Saturation Source Modifier

Y Y Y Y

Src Types Dst Types

*B,*W,*D *B,*W,*D

DWord

0..3

Bit

127:64 ImmSource

Exists If:

Format:

Description

([ImmSource][e]=='IMM')

EU_INSTRUCTION_SOURCES_REG_IMM

([RegSource][e]!='IMM')

EU_INSTRUCTION_SOURCES_REG_REG

EU_INSTRUCTION_OPERAND_CONTROLS

EU_INSTRUCTION_HEADER

127:64 RegSource

Exists If:

Format:

63:32

31:0

Operand Controls

Format:

Header

Format:

avg - Average

Source:

Length Bias:

EuIsa

The avg instruction performs component-wise integer average of src0 and src1 and stores the results in dst. An integer average uses integer upward rounding. It is equivalent to adding one to the sum of src0 and src1 and then applying an arithmetic right shift by one to this intermediate value.

Format:

[(pred)] avg[.cmod] (exec_size) dst src0 src1

Syntax

[(pred)] avg[.cmod] (exec_size) reg reg reg

[(pred)] avg[.cmod] (exec_size) reg reg imm32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        dst.chan[n] = (src0.chan[n] + src1.chan[n] + 1) >> 1; // Use arithmetic shift right.
    }
}
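A single channel of the rounding average can be modeled as below for signed DWord operands. This is a sketch with a hypothetical function name; it assumes the intermediate sum is held in a wider type so that it cannot overflow, which the text above does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of avg for signed 32-bit operands. */
int32_t avg_d(int32_t src0, int32_t src1)
{
    int64_t sum = (int64_t)src0 + (int64_t)src1 + 1;
    /* An arithmetic shift right by one is a floor division by two;
     * written out explicitly so negative sums are handled portably. */
    int64_t half = (sum >= 0) ? sum / 2 : -((-sum + 1) / 2);
    return (int32_t)half;
}

void avg_self_check(void)
{
    assert(avg_d(3, 4) == 4);     /* 3.5 rounds up to 4 */
    assert(avg_d(-3, -4) == -3);  /* -3.5 rounds up to -3 */
}
```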

Predication Conditional Modifier Saturation Source Modifier

Y Y Y Y

Src Types Dst Types

*B,*W,*D *B,*W,*D

DWord

0..3

Bit

127:64 ImmSource

Exists If:

Format:

Description

([ImmSource][e]=='IMM')

EU_INSTRUCTION_SOURCES_REG_IMM

([RegSource][e]!='IMM')

EU_INSTRUCTION_SOURCES_REG_REG

EU_INSTRUCTION_OPERAND_CONTROLS

EU_INSTRUCTION_HEADER

127:64 RegSource

Exists If:

Format:

63:32

31:0

Operand Controls

Format:

Header

Format:

bfe - Bit Field Extract

Source:

Length Bias:

EuIsa

Component-wise extract a bit field from src2 using the bit field width from src0 and the bit field offset from

src1. Store the extracted bit field value in the low bits of dst and sign extend (if D type) or zero extend (if UD

type).

The width and offset values are from the low five bits of src0 and src1 respectively, or src0 & 0x1f and src1 &

0x1f.

If width is zero, the result is zero.

If offset + width > 32 then the extracted bit field is bits offset to 31 of src2, extracting only 32 - offset bits, less

than width as the bit field cannot extend past the MSB of the source value. Otherwise extract width bits

extending from bit positions offset to offset + width - 1.

Format:

[(pred)] bfe (exec_size) dst src0 src1 src2

Restriction

Restriction: No accumulator access, implicit or explicit.

Restriction: All three-source instructions have certain restrictions, described in Instruction Machine

Formats.

Syntax

[(pred)] bfe (exec_size) reg reg reg reg

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        UD width = src0.chan[n][4:0];
        UD offset = src1.chan[n][4:0];
        if ( width == 0 ) {
            dst.chan[n] = 0x00000000;
        } else if ( (width + offset) < 32 ) {
            dst.chan[n] = src2.chan[n] << (32 - width - offset);
            if ( src2 is signed ) {
                dst.chan[n] = dst.chan[n] >> (32 - width); // arithmetic shift: pad with the sign bit
            } else {
                dst.chan[n] = dst.chan[n] >> (32 - width); // pad 0
            }
        } else {
            if ( src2 is signed ) {
                dst.chan[n] = src2.chan[n] >> offset; // pad sign bit
            } else {
                dst.chan[n] = src2.chan[n] >> offset; // pad 0
            }
        }
    }
}
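The extraction rules above, including the case where the field would run past bit 31, can be modeled per channel as follows. This is a sketch with hypothetical function names: bfe_ud models the zero-extending (UD) form and bfe_d the sign-extending (D) form.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extending extract: src0 = width, src1 = offset, src2 = value. */
uint32_t bfe_ud(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint32_t width = src0 & 0x1Fu, offset = src1 & 0x1Fu;
    if (width == 0)
        return 0;
    uint32_t field = src2 >> offset;
    if (width + offset < 32)
        field &= (1u << width) - 1u;   /* keep only 'width' bits */
    return field;                      /* else: bits offset..31 */
}

/* Sign-extending extract for a signed source value. */
int32_t bfe_d(uint32_t src0, uint32_t src1, int32_t src2)
{
    uint32_t width = src0 & 0x1Fu, offset = src1 & 0x1Fu;
    if (width == 0)
        return 0;
    /* Number of bits actually extracted, always in [1,31]. */
    uint32_t avail = (width + offset < 32) ? width : 32u - offset;
    uint32_t field = ((uint32_t)src2 >> offset) & ((1u << avail) - 1u);
    uint32_t sign = 1u << (avail - 1);
    /* Standard sign extension from bit (avail - 1). */
    return (int32_t)(field ^ sign) - (int32_t)sign;
}

void bfe_self_check(void)
{
    assert(bfe_ud(4, 4, 0xABCD1234u) == 0x3u);
    assert(bfe_d(4, 4, 0xF0) == -1);
}
```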

Predication Conditional Modifier Saturation Source Modifier

Y N N N

Src Types Dst Types

Bit

Format:

DWord Description

MBZ

EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC

MBZ

0..3 127:126 Reserved

125:106 Source 2

Format:

105 Reserved

Format:

104:85 Source 1

Format: EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC

MBZ

EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC

DstRegNum

DstSubRegNum[2:0]

ChanEn[4]

84 Reserved

Format:

83:64 Source 0

Format:

63:56 Destination Register Number

Format:

55:53 Destination Subregister Number

Format:

52:49 Destination Channel Enable

Format:

Four channel enables are defined for controlling which channels are written into the

destination region. These channel mask bits are applied in a modulo-four manner to all

ExecSize channels. There is 1-bit Channel Enable for each channel within the group of 4. If the

bit is cleared, the write for the corresponding channel is disabled. If the bit is set, the write is

enabled. Mnemonics for the bit being set for the group of 4 are x, y, z, and w, respectively,

where x corresponds to Channel 0 in the group and w corresponds to channel 3 in the group

Reserved

Format: MBZ

NibCtrl

MBZ

NibCtrl

Format:

Reserved

Format:

45:44 Destination Data Type

This field contains the data type for the destination

Value

00b

01b

10b

11b

Name

Single Precision Float

DWord

Unsigned DWord

Double Precision Float

43:42 Source Data Type

This field contains the data type for all three sources

Value

00b

01b

10b

11b

Name

Single Precision Float

DWord

Unsigned DWord

Double Precision Float

41:40 Source 2 Modifier

Exists If:

Format:

([Property[Source Modification]=='true')

SrcMod

([Property[Source Modification]=='true')

SrcMod

([Property[Source Modification]=='false')

MBZ

([Property[Source Modification]=='true')

SrcMod

MBZ

39:38 Source 1 Modifier

Exists If:

Format:

41:36 Reserved

Exists If:

Format:

37:36 Source 0 Modifier

Exists If:

Format:

Reserved

Format:

Flag Register Number

This field contains the flag register number for instructions with a non-zero Conditional

Modifier.

Flag Subregister Number

This field contains the flag subregister number for instructions with a non-zero Conditional

Modifier.

Reserved

Format:

31:0

MBZ

EU_INSTRUCTION_HEADER

Header

Format:

bfi1 - Bit Field Insert 1

Source:

Length Bias:

EuIsa

The bfi1 instruction is the first instruction in a two-instruction macro for bfi (Bit Field Insert).

The bfi1 instruction component-wise generates mask with control from src0 and src1 and stores the results in

dst. The mask is used in the bfi2 instruction to generate the final result of bfi.

Create a bit mask corresponding to the bit field width and offset in src0 and src1. Store the bit mask in dst. The

mask has all bits in the bit field set to 1 and all other bits as 0.

The width and offset values are from the low five bits of src0 and src1 respectively, or src0 & 0x1f and src1 &

0x1f.

If width is zero, the result is zero.

The bfi macro has four source operands: src0 - bit field width in low five bits, src1 - bit field offset/starting bit

position in low five bits, src2 - bit field value to insert, using only the number of least significant bits given by

width in src0, and src3 - overall value into which the bit field is inserted, providing all bits other than the

inserted bits for the result value.

bfi dst src0 src1 src2 src3

// Translates to these two instructions:

bfi1 dst src0 src1

bfi2 dst dst src2 src3

Format:

[(pred)] bfi1 (exec_size) dst src0 src1

Programming Notes

No accumulator access, implicit or explicit.

Syntax

[(pred)] bfi1 (exec_size) reg reg reg

[(pred)] bfi1 (exec_size) reg reg imm32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        UD width = src0.chan[n][4:0];
        UD offset = src1.chan[n][4:0];
        dst.chan[n] = ((1 << width) - 1) << offset;
    }
}
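The mask generation can be expressed directly in C for one channel. This is a minimal sketch with a hypothetical function name; unsigned arithmetic is used so that bits shifted past bit 31 are simply discarded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of bfi1: src0 = width, src1 = offset. */
uint32_t bfi1_ud(uint32_t src0, uint32_t src1)
{
    uint32_t width = src0 & 0x1Fu;   /* low five bits */
    uint32_t offset = src1 & 0x1Fu;  /* low five bits */
    return ((1u << width) - 1u) << offset;
}

void bfi1_self_check(void)
{
    assert(bfi1_ud(4, 8) == 0x00000F00u);
    assert(bfi1_ud(0, 8) == 0u);     /* zero width gives a zero mask */
}
```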

Predication Conditional Modifier Saturation Source Modifier

Y N N N

Src Types Dst Types

Bit Description DWord

0..3 127:64 ImmSource

Exists If:

Format:

([ImmSource][e]=='IMM')

EU_INSTRUCTION_SOURCES_REG_IMM

([RegSource][e]!='IMM')

EU_INSTRUCTION_SOURCES_REG_REG

EU_INSTRUCTION_OPERAND_CONTROLS

EU_INSTRUCTION_HEADER

127:64 RegSource

Exists If:

Format:

63:32

31:0

Operand Controls

Format:

Header

Format:

bfi2 - Bit Field Insert 2

Source:

Length Bias:

EuIsa

The bfi2 instruction is the second instruction in a two-instruction macro for bfi (Bit Field Insert).

The bfi2 instruction component-wise performs the bitfield insert operation on src1 and src2 based on the mask

in src0.

Use the mask in src0 to take a bit field value from the low bits of src1 and combine it with the value from src2

(so src2 provides all bits other than those masked out and replaced by the bit field value). Store the result in

dst.

The bfi macro has four source operands: src0 - bit field width in low five bits, src1 - bit field offset/starting bit

position in low five bits, src2 - bit field value to insert, using only the number of least significant bits given by

width in src0, and src3 - overall value into which the bit field is inserted, providing all bits other than the

inserted bits for the result value.

bfi dst src0 src1 src2 src3

// Translates to these two instructions:

bfi1 dst src0 src1

bfi2 dst dst src2 src3

Format:

[(pred)] bfi2 (exec_size) dst src0 src1 src2

Restriction

Restriction: No accumulator access, implicit or explicit.

Restriction: All three-source instructions have certain restrictions, described in Instruction Machine

Formats.

Syntax

[(pred)] bfi2 (exec_size) reg reg reg reg

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        UD offset = LZD(reverse(src0.chan[n]))-1;
        // offset is the number of LSB zero bits below the bit mask which has all 1s.
        // width (implied by the logic) is the number of 1 bits in the mask value, which should be all 1s.
        dst.chan[n] = ((src1.chan[n] << offset) & src0.chan[n]) | (src2.chan[n] & ~src0.chan[n]);
    }
}
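The insert step can be modeled per channel as shown below. This is a sketch with a hypothetical function name. It derives the offset as the number of zero bits below the mask, following the description in the pseudocode comment, and uses a bitwise complement to select the bits that come from the base value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of bfi2: src0 = mask from bfi1,
 * src1 = value to insert, src2 = base value. */
uint32_t bfi2_ud(uint32_t mask, uint32_t insert, uint32_t base)
{
    unsigned offset = 0;
    if (mask != 0) {
        while (((mask >> offset) & 1u) == 0)
            offset++;                 /* count zero bits below the mask */
    }
    return ((insert << offset) & mask) | (base & ~mask);
}

void bfi2_self_check(void)
{
    assert(bfi2_ud(0xF00u, 0x3u, 0xFFFFFFFFu) == 0xFFFFF3FFu);
    assert(bfi2_ud(0u, 0x3u, 0x1234u) == 0x1234u);  /* empty mask keeps base */
}
```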

Predication Conditional Modifier Saturation Source Modifier

Y N N N

Src Types Dst Types

DWord Bit Description

0..3 127:126 Reserved

Format: MBZ

EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC

MBZ

EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC

MBZ

EU_INSTRUCTION_OPERAND_SRC_REG_THREE_SRC

DstRegNum

DstSubRegNum[2:0]

ChanEn[4]

125:106 Source 2

Format:

105 Reserved

Format:

104:85 Source 1

Format:

84 Reserved

Format:

83:64 Source 0

Format:

63:56 Destination Register Number

Format:

55:53 Destination Subregister Number

Format:

52:49 Destination Channel Enable

Format:

Four channel enables are defined for controlling which channels are written into the

destination region. These channel mask bits are applied in a modulo-four manner to all

ExecSize channels. There is 1-bit Channel Enable for each channel within the group of 4. If the

bit is cleared, the write for the corresponding channel is disabled. If the bit is set, the write is

enabled. Mnemonics for the bit being set for the group of 4 are x, y, z, and w, respectively,

where x corresponds to Channel 0 in the group and w corresponds to channel 3 in the group

Reserved

Format: MBZ

NibCtrl

MBZ

NibCtrl

Format:

Reserved

Format:

45:44 Destination Data Type

This field contains the data type for the destination

Value

00b

01b

10b

11b

Name

Single Precision Float

DWord

Unsigned DWord

Double Precision Float

43:42 Source Data Type

This field contains the data type for all three sources

Value

00b

01b

10b

11b

Name

Single Precision Float

DWord

Unsigned DWord

Double Precision Float

([Property[Source Modification]=='true')

SrcMod

([Property[Source Modification]=='true')

SrcMod

([Property[Source Modification]=='false')

MBZ

([Property[Source Modification]=='true')

SrcMod

MBZ

41:40 Source 2 Modifier

Exists If:

Format:

39:38 Source 1 Modifier

Exists If:

Format:

41:36 Reserved

Exists If:

Format:

37:36 Source 0 Modifier

Exists If:

Format:

Reserved

Format:

Flag Register Number

This field contains the flag register number for instructions with a non-zero Conditional

Modifier.

Flag Subregister Number

This field contains the flag subregister number for instructions with a non-zero Conditional

Modifier.

Reserved

Format:

31:0

MBZ

EU_INSTRUCTION_HEADER

Header

Format:

bfrev - Bit Field Reverse

Source:

Length Bias:

EuIsa

The bfrev instruction component-wise reverses all the bits in src0 and stores the results in dst.

Format:

[(pred)] bfrev (exec_size) dst src0

Restriction

Restriction: No accumulator access, implicit or explicit.

Syntax

[(pred)] bfrev (exec_size) reg reg

[(pred)] bfrev (exec_size) reg imm32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < exec_size; n++ ) {
    if ( WrEn.chan[n] ) {
        for ( idx = 0; idx < 32; idx++ ) {
            dst.chan[n][idx] = src0.chan[n][31-idx];
        }
    }
}
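The per-bit loop in the pseudocode maps onto C as follows for one UD channel. This is a minimal sketch with a hypothetical function name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of bfrev: bit idx of the result is bit (31 - idx)
 * of the source. */
uint32_t bfrev_ud(uint32_t src0)
{
    uint32_t dst = 0;
    for (unsigned idx = 0; idx < 32; idx++) {
        if (src0 & (1u << (31 - idx)))
            dst |= 1u << idx;
    }
    return dst;
}

void bfrev_self_check(void)
{
    assert(bfrev_ud(0x00000001u) == 0x80000000u);
    assert(bfrev_ud(0xF0000000u) == 0x0000000Fu);
}
```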

Predication Conditional Modifier Saturation Source Modifier

Y N N N

Src Types Dst Types

UD UD

DWord Bit Description

0..3 127:64 ImmSource

Exists If: ([Operand Controls][e]=='IMM')

Format: EU_INSTRUCTION_SOURCES_IMM32

127:64 RegSource

Exists If: ([Operand Controls][e]!='IMM')

Format: EU_INSTRUCTION_SOURCES_REG

63:32 Operand Controls

Format: EU_INSTRUCTION_OPERAND_CONTROLS

31:0 Header

Format: EU_INSTRUCTION_HEADER


brc - Branch Converging

Source: EuIsa

Length Bias:

Description

The brc instruction redirects execution forward or backward to the instruction pointed to by (current IP + offset). The jump occurs if all channels are branched away. UIP should reference the instruction where all channels are expected to come together. JIP should reference the end of the innermost conditional block.

In GEN binary, JIP and UIP are at location src1 when they are immediates and at location src0 when they are in reg32, where reg32 is accessed as a scalar DWord containing both JIP and UIP. The null register must be used (for example, by the assembler) as dst. When the offsets are immediate, src0 must be null.

Format:

[(pred)] brc (exec_size) JIP UIP

Restriction

Restriction: A brc instruction must use the Switch instruction option.

Syntax

[(pred)] brc (exec_size) imm16 imm16

[(pred)] brc (exec_size) reg32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < 32; n++ ) {
    if ( WrEn[n] ) {
        PcIP[n] = IP + UIP;
    } else {
        PcIP[n] = IP + 1;
    }
}
if ( all PcIP != IP + 1 ) { // for all channels
    Jump(IP + JIP);
}
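To make the "all channels" condition concrete, here is a small software model that follows the brc pseudocode literally. It is a sketch, not a description of the hardware: IP, JIP, and UIP are plain integers in the same units as the pseudocode, the write-enable mask is a list of 32 booleans, and falling through to IP + 1 when no jump is taken is an assumption, since the pseudocode only states the jump case. The function name `brc` is illustrative.

```python
def brc(ip: int, jip: int, uip: int, wr_en: list):
    """Return (per-channel PcIP list, next IP) for a brc instruction.

    Follows the pseudocode: enabled channels record IP + UIP, disabled
    channels record IP + 1, and the jump to IP + JIP happens only when
    every channel's PcIP differs from IP + 1.
    """
    pc_ip = [ip + uip if enabled else ip + 1 for enabled in wr_en]
    jump = all(p != ip + 1 for p in pc_ip)
    # Assumption: execution continues at IP + 1 when the jump is not taken.
    next_ip = ip + jip if jump else ip + 1
    return pc_ip, next_ip


if __name__ == "__main__":
    all_on = [True] * 32
    one_off = [True] * 31 + [False]
    print(brc(10, 4, 8, all_on)[1])   # every channel branches away
    print(brc(10, 4, 8, one_off)[1])  # one channel stays, so no jump
```

A single disabled channel is enough to suppress the jump, which is the behavior that distinguishes brc from brd.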

Predication Conditional Modifier Saturation Source Modifier Source Types

Y N N N D

DWord Bit Description

0..3 127:112 UIP

Format: S15

The jump distance in number of eight-byte units if a jump is taken for the channel.

111:96 JIP

Format: S15

The jump distance in number of eight-byte units if a jump is taken for the instruction.

95:64 Reserved

Format: MBZ

63:32 Operand Control

Format: EU_INSTRUCTION_OPERAND_CONTROLS

31:0 Header

Format: EU_INSTRUCTION_HEADER
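Since the brc UIP and JIP fields are given in eight-byte units, a decoder has to sign-extend each field and scale it to obtain a byte distance. The sketch below assumes that S15 denotes a 16-bit two's complement value (one sign bit plus 15 magnitude bits) and that the instruction is available as one 128-bit Python integer; both points, and the helper names, are assumptions made for illustration.

```python
def sign_extend16(raw: int) -> int:
    """Interpret the low 16 bits of raw as a two's complement value."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def jump_byte_offset(raw_field: int) -> int:
    """Convert a raw 16-bit jump field to a byte distance (eight-byte units)."""
    return sign_extend16(raw_field) * 8


def brc_offsets(instruction: int) -> dict:
    """Extract UIP (bits 127:112) and JIP (bits 111:96) as byte distances."""
    uip = (instruction >> 112) & 0xFFFF
    jip = (instruction >> 96) & 0xFFFF
    return {"UIP": jump_byte_offset(uip), "JIP": jump_byte_offset(jip)}


if __name__ == "__main__":
    # UIP = -2 units (backward), JIP = +3 units (forward).
    word = (0xFFFE << 112) | (0x0003 << 96)
    print(brc_offsets(word))
```

Negative values correspond to backward jumps, matching the statement that brc can redirect execution forward or backward.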

brd - Branch Diverging

Source: EuIsa

Length Bias:

Description

The brd instruction redirects execution forward or backward to the instruction pointed to by (current IP + offset). The jump occurs if any channel is branched away.

In GEN binary, JIP is at location src1 when it is an immediate and at location src0 when it is in reg32, where reg32 is accessed as a scalar DWord. The null register must be used at dst locations.

Format:

[(pred)] brd (exec_size) JIP

Restriction

Restriction: A brd instruction must use the Switch instruction option.

Syntax

[(pred)] brd (exec_size) imm16

[(pred)] brd (exec_size) reg32

Pseudocode

Evaluate(WrEn);
for ( n = 0; n < 32; n++ ) {
    if ( WrEn[n] ) {
        PcIP[n] = IP + JIP;
    } else {
        PcIP[n] = IP + 1;
    }
}
if ( any PcIP == ExIP + JIP ) { // any channel
    Jump(ExIP + JIP);
}
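The "any channel" condition of brd can be modeled the same way. This sketch follows the pseudocode literally and makes two assumptions that the text does not confirm: ExIP is treated as equal to the current IP, and execution continues at IP + 1 when no channel branches. The function name `brd` is illustrative.

```python
def brd(ip: int, jip: int, wr_en: list):
    """Return (per-channel PcIP list, next IP) for a brd instruction.

    Enabled channels record IP + JIP, disabled channels record IP + 1,
    and the jump is taken as soon as any channel's PcIP equals IP + JIP.
    """
    ex_ip = ip  # Assumption: ExIP is the IP of the executing instruction.
    pc_ip = [ip + jip if enabled else ip + 1 for enabled in wr_en]
    jump = any(p == ex_ip + jip for p in pc_ip)
    # Assumption: fall through to IP + 1 when the jump is not taken.
    next_ip = ex_ip + jip if jump else ip + 1
    return pc_ip, next_ip


if __name__ == "__main__":
    one_on = [False] * 31 + [True]
    none_on = [False] * 32
    print(brd(10, 4, one_on)[1])   # one enabled channel triggers the jump
    print(brd(10, 4, none_on)[1])  # no channel branches
```

Here one enabled channel is sufficient to take the jump, in contrast to a rule that requires every channel to branch.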

Predication Conditional Modifier Saturation Source Modifier

Y N N N

Src Types

DWord Bit Description

0..3 127:112 Reserved

Format: MBZ

111:96 JIP

Format: S15

Jump Target Offset. The relative offset in 64-bit units if a jump is taken for the instruction.

95:91 Reserved

Format: MBZ

Flag Register Number

Added a second flag register

Flag Subregister Number

This field specifies the sub-register number for a flag register operand. There are two sub-registers in the flag register. Each sub-register contains 16 flag bits.

The selected flag sub-register is the source for predication if predication is enabled for the instruction. It is the destination to store conditional flag bits if a conditional modifier is enabled for the instruction. The same flag sub-register can be both the predication source and
