author Ian Romanick <ian.d.romanick@intel.com> 2019-11-18 11:52:47 -0800
committer Dylan Baker <dylan@pnwbakers.com> 2019-11-26 16:43:04 -0800
commit 85b0bb51443ff147a8df752147dff89f55267aaa
tree b0e9eebf700e1098840465a512e59b158ba5be74
parent 9cd69861f82002f86258998310f00df7f2a4ad82
intel/fs: Disable conditional discard optimization on Gen4 and Gen5
The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of
valid data and 31 bits of junk. Results of comparisons that are used as
Boolean values need to have a fixup applied to generate the proper 0/~0
values.

Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup
code from being generated. This results in a sequence like:

    cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */
    ...
    cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */
    (+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD g8<8,8,1>UD

instead of

    cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */
    ...
    cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */
    or(16) g4<1>UD g4<8,8,1>UD g8<8,8,1>UD
    (+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD 1UD

I examined a couple of the shaders hurt by this change, and ALL of them
would have been affected by this bug. :(

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
Fixes: 0ba9497e66a ("intel/fs: Improve discard_if code generation")

Iron Lake
total instructions in shared programs: 8122757 -> 8122957 (<.01%)
instructions in affected programs: 8307 -> 8507 (2.41%)
helped: 0
HURT: 100
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.58% 3.03%
Instructions are HURT.

total cycles in shared programs: 188510100 -> 188510376 (<.01%)
cycles in affected programs: 76018 -> 76294 (0.36%)
helped: 0
HURT: 55
HURT stats (abs) min: 2 max: 12 x̄: 5.02 x̃: 4
HURT stats (rel) min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
95% mean confidence interval for cycles value: 4.33 5.71
95% mean confidence interval for cycles %-change: 0.60% 1.12%
Cycles are HURT.

GM45
total instructions in shared programs: 4994403 -> 4994503 (<.01%)
instructions in affected programs: 4212 -> 4312 (2.37%)
helped: 0
HURT: 50
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
95% mean confidence interval for instructions value: 2.00 2.00
95% mean confidence interval for instructions %-change: 2.45% 3.07%
Instructions are HURT.

total cycles in shared programs: 128928750 -> 128928982 (<.01%)
cycles in affected programs: 67442 -> 67674 (0.34%)
helped: 0
HURT: 47
HURT stats (abs) min: 2 max: 12 x̄: 4.94 x̃: 4
HURT stats (rel) min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
95% mean confidence interval for cycles value: 4.19 5.68
95% mean confidence interval for cycles %-change: 0.50% 1.00%
Cycles are HURT.

(cherry picked from commit e51eda99dfd6a66b066e371005e7a54ecc38fc11)
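
The failure mode described in the commit message can be modelled on plain integers. The following is a minimal sketch, not Mesa code: `raw_cmp`, `resolve_bool`, `or_is_zero_unfixed`, and `or_is_zero_masked` are hypothetical helpers introduced here for illustration, and the junk bits are arbitrary values chosen by the caller. It assumes only what the message states: a Gen4/Gen5 CMP result has one valid bit (the LSB), and a proper Boolean is 0 or ~0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a Gen4/Gen5 CMP destination: bit 0 carries the
// comparison result, the upper 31 bits are arbitrary junk.
static uint32_t raw_cmp(bool result, uint32_t junk)
{
   return (junk & ~1u) | (result ? 1u : 0u);
}

// One way to express the 0/~0 fixup: isolate the LSB and negate it.
// (Illustrative only; not necessarily the exact sequence the compiler emits.)
static uint32_t resolve_bool(uint32_t raw)
{
   return 0u - (raw & 1u);
}

// Models `or.z` applied directly to two unresolved CMP results:
// the flag is set only when the whole 32-bit OR is zero.
static bool or_is_zero_unfixed(uint32_t a, uint32_t b)
{
   return (a | b) == 0u;
}

// Models `or` followed by `and.z ... 1UD`: only the valid LSB is tested.
static bool or_is_zero_masked(uint32_t a, uint32_t b)
{
   return ((a | b) & 1u) == 0u;
}
```

With both comparisons false but non-zero junk in the upper bits, `or_is_zero_unfixed` reports "not zero" while `or_is_zero_masked` correctly reports "zero". This is the difference between the two instruction sequences quoted in the message.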
-rw-r--r-- src/intel/compiler/brw_fs_nir.cpp 9
1 files changed, 8 insertions, 1 deletions
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 527c021fe21..38d86c776d7 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3460,7 +3460,14 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld,
          if (alu != NULL &&
              alu->op != nir_op_bcsel &&
-             alu->op != nir_op_inot) {
+             alu->op != nir_op_inot &&
+             (devinfo->gen > 5 ||
+              (alu->instr.pass_flags & BRW_NIR_BOOLEAN_MASK) != BRW_NIR_BOOLEAN_NEEDS_RESOLVE ||
+              alu->op == nir_op_fne32 || alu->op == nir_op_feq32 ||
+              alu->op == nir_op_flt32 || alu->op == nir_op_fge32 ||
+              alu->op == nir_op_ine32 || alu->op == nir_op_ieq32 ||
+              alu->op == nir_op_ilt32 || alu->op == nir_op_ige32 ||
+              alu->op == nir_op_ult32 || alu->op == nir_op_uge32)) {
             /* Re-emit the instruction that generated the Boolean value, but
              * do not store it. Since this instruction will be conditional,
              * other instructions that want to use the real Boolean value may