path: root/src/compiler
Age | Commit message | Author | Files | Lines
2019-03-12 | nir: Add a pass for lowering IO back to vector when possible | Jason Ekstrand | 5 | -1/+392
This pass tries to turn scalar and array-of-scalar IO variables into vector IO variables whenever possible. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: "19.0" <mesa-stable@lists.freedesktop.org> (cherry picked from commit 5ef2b8f1f2ebcdb4ffe5c98b3f4f48e584cb4b22)
2019-03-07 | spirv: Pull offset/stride from the pointer for OpArrayLength | Jason Ekstrand | 1 | -2/+10
We can't pull it from the variable type because it might be an array of blocks and not just the one block. While we're here, throw in some error checking. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit f1dbc7e97d3dcb2104b9438d32cace9529575208)
2019-03-05 | intel,nir: Lower TXD with min_lod when the sampler index is not < 16 | Jason Ekstrand | 2 | -0/+27
When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: cb98e0755f8d "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit 5c96120b5ce158fea28d751d8a55b5e4d80df4f3)
2019-03-05 | spirv: OpImageQueryLod requires a sampler | Jason Ekstrand | 1 | -1/+1
No idea how this fell through the cracks besides the fact that the sampler bound at 0 almost always works and the CTS isn't amazing. In any case, this appears to have been broken for almost forever. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit ca295ddbfb414a526d3bab7daf93fffbbc417c6e)
2019-03-05 | glsl: fix recording of variables for XFB in TCS shaders | Ilia Mirkin | 3 | -5/+44
This is purely for conformance, since it's not actually possible to do XFB on TCS output varyings. However we do have to make sure we record the names correctly, and this removes an extra level of array-ness from the names in question. Fixes KHR-GL45.tessellation_shader.single.xfb_captures_data_from_correct_stage v2: Add comment to the new program_resource_visitor::process function. (Ilia Mirkin) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108457 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit 4eec3a2a3652317f8e0fa97e0730c297bde8241a)
2019-03-05 | glsl: TCS outputs can not be transform feedback candidates on GLES | Jose Maria Casanova Crespo | 1 | -1/+21
Avoids a regression on KHR-GLES*.core.tessellation_shader.single.xfb_captures_data_from_correct_stage that is uncovered by the following patch, "glsl: fix recording of variables for XFB in TCS shaders". v2: Rebased over "glsl: fix recording of variables for XFB in TCS shaders". v3: Move this patch before "glsl: fix recording of variables for XFB in TCS shaders" to avoid temporary regressions. (Ilia Mirkin) Cc: 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit bf1f49482d677e562993543cd9a9367597ce3ccc)
2019-03-05 | glsl: fix shader cache for packed param list | Timothy Arceri | 1 | -11/+4
Some types of params such as some builtins are always padded. We need to keep track of this so we can restore the list correctly. Here we also remove a couple of cache entries that are not actually required as they get rebuilt by the _mesa_add_parameter() calls. This patch fixes a bunch of arb_texture_multisample and arb_sample_shading piglit tests for the radeonsi NIR backend. Fixes: edded1237607 ("mesa: rework ParameterList to allow packing") Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit 7536af670b7501228628a8c90f9e8456b5aec9e1)
2019-02-26 | nir: initialize value in copy_prop_vars_block | Tapani Pälli | 1 | -1/+1
Fixes the following valgrind warning:
==27561== Conditional jump or move depends on uninitialised value(s)
==27561== at 0x667856B: value_set_ssa_components (nir_opt_copy_prop_vars.c:78)
==27561== by 0x667A1C4: copy_prop_vars_block (nir_opt_copy_prop_vars.c:797)
Fixes: 62332d139c8 "nir: Add a local variable-based copy propagation pass" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (cherry picked from commit 22267feff1a35c4b6f1f0cb9c8e371727f99b5d6)
2019-02-26 | spirv: Eliminate dead input/output variables after translation. | Kenneth Graunke | 1 | -5/+20
spirv_to_nir can generate input/output variables which are illegal for the current shader stage, which would cause nir_validate_shader to balk. After my recent commit to start decorating arrays as compact, dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started hitting validation errors due to outputs in a TCS (not intended for the TCS at all) not being per-vertex arrays. Thanks to Jason Ekstrand for suggesting this approach. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573 Fixes: ef99f4c8d17 compiler: Mark clip/cull distance arrays as compact before lowering. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> (cherry picked from commit 6775665e5eec7db3f291508a8b7ba2a792f62ec0)
2019-02-26 | compiler: Mark clip/cull distance arrays as compact before lowering. | Kenneth Graunke | 2 | -0/+14
nir_lower_clip_cull_distance_arrays() marks the combined clip/cull distance array as compact. However, when translating in from GLSL or SPIR-V, we were not marking the original float[] arrays as compact. We should do so. That way, we can detect these corner cases properly. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit ef99f4c8d176f4e854e12afa1545fa53f651d758) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 | nir/lower_clip_cull: Fix an incorrect assert | Jason Ekstrand | 1 | -1/+1
Copy+paste error. It was supposed to test cull and not clip. Fixes: 4e69fba534e "nir: Rewrite lower_clip_cull_distance_arrays..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit f98fd9d15a9a79ff1b41f1fce27bc285a20aa5bb)
2019-02-25 | nir/xfb: Handle compact arrays in gather_xfb_info | Jason Ekstrand | 1 | -11/+22
This makes us properly handle gl_ClipDistance and gl_CullDistance. Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit 1a93fc382b18ee6d1135952d23f0b6a8aa8cd31f)
2019-02-25 | nir/xfb: Work in terms of components rather than slots | Jason Ekstrand | 1 | -5/+5
We needed to better handle cases where a chunk of a variable starts at some non-zero location_frac and rolls over into the next slot but may not be more than 4 dwords. For example, if gl_CullDistance is an array of 3 things and has location_frac = 2, it will span across two vec4s but is not, itself, bigger than a vec4. If you ignore the clip/cull special case, it's not allowed to happen for anything else because the only things that can span more than one slot are dvec3 and dvec4 and they're both bigger than a vec4. The current code uses this attrib_slot thing where we count attribute slots and iterate over them. However, that doesn't work in the case above because gl_CullDistance will have an attrib_slot count of 1 even though it does span two slots. We could fix this by adjusting attrib_slot but we already have comp_mask and it's easier to just handle it that way. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit 558c3145045f1c6da8bddb31ed77a418ab27f2f9)
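The slot-spanning case in this message can be illustrated with a small Python sketch. This is not the code from the pass; the function name `component_masks` and its arguments are hypothetical, and the sketch assumes 32-bit components with four components per slot.

```python
def component_masks(location, location_frac, num_components):
    """Return {slot: 4-bit component mask} for a run of 32-bit components.

    Hypothetical helper: walks the components one by one and records which
    slot and which channel of that slot each component lands in.
    """
    masks = {}
    for i in range(num_components):
        comp = location_frac + i
        slot = location + comp // 4          # rolls over into the next slot
        masks[slot] = masks.get(slot, 0) | (1 << (comp % 4))
    return masks


# An array of 3 floats starting at location_frac = 2 touches two slots,
# although it is smaller than a vec4.
cull_masks = component_masks(0, 2, 3)
```

Counting whole slots would report a single slot for this variable, while the per-component walk shows channels z and w of the first slot plus channel x of the next one.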
2019-02-25 | nir: Rewrite lower_clip_cull_distance_arrays to do a lot less lowering | Jason Ekstrand | 1 | -117/+26
Instead of going to all the work of combining them into one array, just make two arrays and use location_frac to colocate them within CLIP0. Then the back-end can sort things out and stack them on top of each other. Thanks to ef99f4c8, we also don't need to set compact anymore. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit 4e69fba534e7377f3bc6c40c73e6bc5c23437d4e) Conflicts resolved by Dylan. Conflicts: src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
2019-02-25 | nir/xfb: Properly align 64-bit values | Jason Ekstrand | 1 | -0/+4
Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit 8f0fe71cc5658728adc273daa03400aab7ec6d93)
2019-02-25 | compiler/types: Add a contains_64bit helper | Jason Ekstrand | 4 | -0/+29
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit 30b548fc6258e9a72722f511e377cf4716fd443c)
2019-02-19 | nir: Don't reassociate add/mul chains containing only constants | Kenneth Graunke | 1 | -5/+5
The idea here is to reassociate a * (b * c) into (a * c) * b, when b is a non-constant value, but a and c are constants, allowing them to be combined. But nothing was enforcing that 'b' must be non-constant, which meant that running opt_algebraic in a loop would never terminate if the IR contained non-folded constant expressions like 256 * 0.5 * 2. Normally, we call constant folding in such a loop too, but IMO it's better for nir_opt_algebraic to be robust and not rely on that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581 Fixes: 32e266a9a58 i965: Compile fp64 funcs only if we do not have 64-bit hardware support Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit 535251487ba56c4fd98465c4682881c2b9734242)
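The non-termination described in this message can be reproduced with a toy rewriter. The sketch below is a simplified model in Python, not nir_opt_algebraic itself: expressions are tuples, constants are plain numbers, and the names `reassociate` and `count_rewrites` are hypothetical. It assumes the rule is matched commutatively and that no constant folding runs in between.

```python
def is_const(e):
    return isinstance(e, (int, float))


def reassociate(e, require_nonconst_b):
    """Apply a * (b * c) -> (a * c) * b at the root, or return None."""
    if not (isinstance(e, tuple) and e[0] == "mul"):
        return None
    # mul is commutative, so try both operand orders at each level.
    for a, inner in ((e[1], e[2]), (e[2], e[1])):
        if is_const(a) and isinstance(inner, tuple) and inner[0] == "mul":
            for b, c in ((inner[1], inner[2]), (inner[2], inner[1])):
                if is_const(c) and not (require_nonconst_b and is_const(b)):
                    return ("mul", ("mul", a, c), b)
    return None


def count_rewrites(e, require_nonconst_b, limit=100):
    """Rewrite until nothing matches; give up after `limit` rewrites."""
    steps = 0
    while steps < limit:
        rewritten = reassociate(e, require_nonconst_b)
        if rewritten is None:
            break
        e, steps = rewritten, steps + 1
    return steps


all_constants = ("mul", 256, ("mul", 0.5, 2))
with_variable = ("mul", 256, ("mul", "x", 2))
```

Without the non-constant check, the all-constant chain keeps matching its own output and only stops at the artificial limit; with the check it is left alone, while the chain containing a variable is still rewritten once.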
2019-02-14 | spirv: Add missing break | Ian Romanick | 1 | -0/+1
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: c6465fec0c5 ("spirv: add SpvCapabilityInt64Atomics") CID: 1442555 (cherry picked from commit 9a918050e0886d8c6d6adc0c687ffd30d8f70b40)
2019-02-13 | nir/opt_if: don't mark progress if nothing changes | Karol Herbst | 1 | -0/+7
If we have something like this:
loop {
   ...
   if x {
      break;
   } else {
      continue;
   }
}
opt_if_loop_last_continue returns true, marking progress although nothing changes. Fixes: 5921a19d4b0c6 "nir: add if opt opt_if_loop_last_continue()" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit 7e08f22a72cfc379902feeca3673db6aa344f782)
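Why a false "progress" report matters can be shown with a toy model. The Python sketch below is not the NIR pass: the dictionary-based loop body, the function name `move_tail_into_branch`, and the exact conditions are assumptions made for illustration. It only demonstrates the principle that a transform should report progress when, and only when, it changed something.

```python
def move_tail_into_branch(loop_body):
    """Toy transform: if an `if` in the loop body has one branch ending in
    "continue", move the statements that follow the `if` into the other
    branch. Returns True only when statements were actually moved."""
    for i, stmt in enumerate(loop_body):
        if not (isinstance(stmt, dict) and stmt.get("kind") == "if"):
            continue
        tail = loop_body[i + 1:]
        if stmt["else"][-1:] == ["continue"]:
            target = stmt["then"]
        elif stmt["then"][-1:] == ["continue"]:
            target = stmt["else"]
        else:
            continue
        # Nothing after the `if`, or the other branch already jumps away:
        # there is nothing to move, so no progress must be reported.
        if not tail or target[-1:] == ["break"]:
            return False
        target.extend(tail)
        del loop_body[i + 1:]
        return True
    return False


break_or_continue = [{"kind": "if", "then": ["break"], "else": ["continue"]}]
movable = ["a", {"kind": "if", "then": ["b"], "else": ["continue"]}, "c"]
```

A driver that reruns its passes while any of them reports progress would spin forever on the first body if the transform returned True there.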
2019-02-12 | Revert "nir/opt_peephole_select: Don't peephole_select expensive math instructions" | Dylan Baker | 2 | -32/+9
This reverts commit 378f9967710e9145f2a4f8eee89d87badbe0e6ea. This also removes the default true argument from the a2xx nir backend, which was introduced after this commit. There should be no change in functionality.
2019-02-11 | nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks | Jason Ekstrand | 1 | -3/+2
When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions, figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B, it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created, but the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: 7d1d1208c2b "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit 9e6a6ef0d45a5bb61a541c495fe12e54e646ecfe)
2019-02-11 | nir: Silence zillions of unused parameter warnings in release builds | Ian Romanick | 1 | -1/+1
Fixes: cd56d79b59f "nir: check NIR_SKIP to skip passes by name" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit 78169870e416fde51946f8295fa6e1c652305447)
2019-01-30 | android,autotools,i965: Fix location of float64_glsl.h | Dylan Baker | 1 | -1/+1
Android.mk and autotools disagree about where generated files should go, which wasn't a problem until we wanted to build a dist tarball. This corrects the problem by changing the output and include paths to be the same on android and autotools (meson already has the correct include path). Fixes: 7d7b30835cfb9eb89beca9fb8593d0954f79b84d ("automake: Fix path to generated source")
2019-01-29 | automake: Add float64.glsl to dist tarball | Dylan Baker | 1 | -0/+1
Fixes: b63a1f8e40b6705d6a1d806fbd38dcd197d4229b ("glsl: Create file to contain software fp64 functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 | automake: Fix path to generated source | Dylan Baker | 1 | -1/+1
Fixes: b63a1f8e40b6705d6a1d806fbd38dcd197d4229b ("glsl: Create file to contain software fp64 functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-01-29 | nir: Optimize double-precision lower_round_even() | Matt Turner | 1 | -44/+12
Use the trick of adding and then subtracting 2**52 (52 is the number of explicit mantissa bits a double-precision floating-point value has) to implement round-to-even. Cuts the number of instructions on SKL of the piglit test fs-roundEven-double.shader_test from 109 to 21. Reviewed-by: Roland Scheidegger <sroland@vmware.com>
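The add-then-subtract trick can be tried out in Python, whose floats are IEEE double precision. This is a sketch of the arithmetic idea only, not the NIR lowering: the function name `round_even` is hypothetical, and handling the sign with `copysign` and passing large or non-finite inputs through unchanged are assumptions of this sketch.

```python
import math

MAGIC = 2.0 ** 52  # at this magnitude, adjacent doubles are exactly 1.0 apart


def round_even(x):
    """Round a double to the nearest integer, ties to even."""
    if not math.isfinite(x) or abs(x) >= MAGIC:
        return x  # already an integer, or inf/NaN
    # Adding MAGIC forces the hardware to round away the fraction bits using
    # the default round-to-nearest-even mode; subtracting restores the scale.
    rounded = (abs(x) + MAGIC) - MAGIC
    return math.copysign(rounded, x)


samples = [round_even(v) for v in (0.5, 1.5, 2.5, -2.5, 2.6)]
```

The result depends on the floating-point environment using round-to-nearest-even, which is the default on common platforms.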
2019-01-29 | glsl: use remap location when serialising uniform program resource data | Timothy Arceri | 1 | -7/+26
This allows us to avoid expensive string compares since we already have a map to the pointers. These compares were taking ~30 seconds for a single shader compile in Godot due to it using 64,000+ uniforms. Fixes: c4cff5f40254 ("glsl: add basic support for resource list to shader cache") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109229
2019-01-28 | spirv: Don't use special semantics when counting vertex attribute size | Neil Roberts | 1 | -6/+4
Under Vulkan, the double vertex attributes take up the same size regardless of whether they are vertex inputs or any other stage interface. Under OpenGL (ARB_gl_spirv), from GLSL 4.60 spec, section 4.3.9 Interface Blocks: "It is a compile-time error to have an input block in a vertex shader or an output block in a fragment shader. These uses are reserved for future use." So we also don't need to check whether it is a vertex input or not, and can use false in any case. v2: (changes made by Alejandro Piñeiro) * Update required after "spirv: Handle location decorations on block interface members" own updates (original patch was sent several months ago) * After Neil suggested it, confirm that this change can also be done for OpenGL (ARB_gl_spirv). Expand commit message. v3: update after changing name of main method on a previous patch Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-01-28 | glsl_types: Rename parameter of glsl_count_attribute_slots | Neil Roberts | 4 | -10/+17
glsl_count_attribute_slots takes a parameter to specify whether the type is being used as a vertex input because on GL double attributes only take up one slot. Vulkan doesn’t make this distinction so this patch renames the argument to is_gl_vertex_input in order to make it more clear that it should always be false on Vulkan. v2: minor variable renaming (s/member/member_type) (Tapani) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
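The distinction this parameter encodes can be sketched for plain vector types. The Python function below is an illustration only, not the Mesa implementation: `vector_attribute_slots` and its arguments are hypothetical names, and the sketch ignores matrices, arrays and structs.

```python
def vector_attribute_slots(components, bit_size, is_gl_vertex_input):
    """Number of attribute slots taken by a scalar or vector type.

    A slot holds four 32-bit components, so a dvec3 or dvec4 normally
    needs two slots. For GL vertex inputs, double attributes take up
    only one slot.
    """
    if bit_size == 64 and components > 2 and not is_gl_vertex_input:
        return 2
    return 1


dvec4_varying = vector_attribute_slots(4, 64, is_gl_vertex_input=False)
dvec4_gl_input = vector_attribute_slots(4, 64, is_gl_vertex_input=True)
```

Since Vulkan makes no such distinction, a Vulkan caller would always pass False for the last argument.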
2019-01-28 | spirv/nir: handle location decorations on block interface members | Neil Roberts | 2 | -9/+66
Previously the code was taking any location decoration on the block and using that to calculate the member locations for all of the members. I think this was assuming that there would only be one location decoration for the entire block. According to the Vulkan spec it is possible to add location decorations to individual members: “If the structure type is a Block but without a Location, then each of its members must have a Location decoration. If it is a Block with a Location decoration, then its members are assigned consecutive locations in declaration order, starting from the first member which is initially the Block. Any member with its own Location decoration is assigned that location. Each remaining member is assigned the location after the immediately preceding member in declaration order.” This patch makes it instead keep track of which members have been assigned an explicit location. It also has a space to store the location for the struct as a whole. Once all the decorations have been processed it iterates over each member to fill in the missing locations using the rules described above. So, this commit is needed to make a case like this work, on both Vulkan and OpenGL using SPIR-V (ARB_gl_spirv):
out block {
   layout(location = 2) vec4 c;
   layout(location = 3) vec4 d;
   layout(location = 0) vec4 a;
   layout(location = 1) vec4 b;
} name;
v2: (changes made by Alejandro Piñeiro) * Update after introducing struct member splitting (See commit b0c643d) * Update after only exposing interface_type for blocks, not to any struct * Update after last changes done for xfb support v3: use "assign" instead of "add" on the new method added (Tapani) Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
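The location rules quoted from the Vulkan spec can be written out as a short Python sketch. It is not the spirv_to_nir code: the function name `assign_member_locations` and the `(explicit_location, slots)` member representation are hypothetical, and each member's slot count is taken as given.

```python
def assign_member_locations(block_location, members):
    """Fill in member locations for an interface block.

    block_location: the block's Location, or None if it has none.
    members: list of (explicit_location_or_None, slots_used) in
             declaration order.
    """
    locations = []
    next_location = block_location
    for explicit, slots in members:
        if explicit is not None:
            location = explicit            # a member's own Location wins
        elif next_location is not None:
            location = next_location       # follows the preceding member
        else:
            raise ValueError("member without Location in a block without Location")
        locations.append(location)
        next_location = location + slots
    return locations


# The block from the message: four vec4 members (one slot each), declared
# in the order c, d, a, b, all with explicit locations.
example = assign_member_locations(None, [(2, 1), (3, 1), (0, 1), (1, 1)])
```

With a block Location of 4 and only the third member decorated with Location 7, the same function yields 4, 5, 7, 8.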
2019-01-27 | glsl: fix block member alignment validation for vec3 | Niklas Haas | 1 | -4/+4
Section 7.6.2.2 (Standard Uniform Block Layout) of the GL spec says: The base offset of the first member of a structure is taken from the aligned offset of the structure itself. The base offset of all other structure members is derived by taking the offset of the last basic machine unit consumed by the previous member and adding one. The current code does not reflect this last sentence - it effectively instead aligns the next offset up to the alignment of the previous member. This causes an issue in exactly one case:
layout(std140) uniform block {
   layout(offset=0) vec3 var1;
   layout(offset=12) float var2;
};
As per section 7.6.2.1 (Uniform Buffer Object Storage) and elsewhere, a vec3 consumes 3 floats, i.e. 12 basic machine units. Therefore, `var1` in the example above consumes units 0-11, with 12 being the first available offset afterwards. However, before this commit, mesa incorrectly assumes `var2` must start at offset=16 when using explicit offsets, which results in a compile-time error. Without explicit offsets, the shaders actually work fine, indicating that mesa is already correctly aligning these fields internally. (Just not in the code that handles explicit buffer offset parsing) This patch should fix piglit tests: ssbo-explicit-offset-vec3.vert ubo-explicit-offset-vec3.vert Signed-off-by: Niklas Haas <git@haasn.xyz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
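The arithmetic behind the vec3 case can be checked with a few lines of Python. This is a simplified model of the explicit-offset check, not the GLSL compiler code: the function names and the `old_behaviour` switch are hypothetical, and the sizes and alignments (vec3: 12 bytes, 16-byte alignment; float: 4 bytes, 4-byte alignment) are the std140 values for these two types.

```python
def align_up(value, alignment):
    return (value + alignment - 1) // alignment * alignment


def min_next_offset(prev_offset, prev_size, prev_align, old_behaviour=False):
    """Smallest offset the next member may use.

    The spec rule is "last unit consumed by the previous member, plus one".
    The old behaviour additionally rounded that up to the previous member's
    alignment.
    """
    end = prev_offset + prev_size
    return align_up(end, prev_align) if old_behaviour else end


def explicit_offset_ok(offset, align, minimum):
    """An explicit offset must not overlap the previous member and must be a
    multiple of the member's own alignment."""
    return offset >= minimum and offset % align == 0


# var1: vec3 at offset 0; var2: float at explicit offset 12.
fixed_min = min_next_offset(0, 12, 16)
old_min = min_next_offset(0, 12, 16, old_behaviour=True)
```

Under the spec rule the float at offset 12 is accepted; under the old rounding the minimum becomes 16 and the same declaration is rejected.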
2019-01-26 | spirv: Add support for SPV_EXT_physical_storage_buffer | Jason Ekstrand | 5 | -3/+55
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 | spirv: Implement OpConvertPtrToU and OpConvertUToPtr | Jason Ekstrand | 2 | -2/+75
This only implements the actual opcodes and does not implement support for using them with specialization constants. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 | spirv: Handle OpTypeForwardPointer | Jason Ekstrand | 1 | -33/+66
We handle forward declarations by creating the pointer type with its storage type based on storage class and just waiting to fill out the actual deref type until we get the OpTypePointer. Because any composites using the forward declared type only care about the storage type (i.e. uint64_t, uvec2, etc.) when creating their glsl_type, this works fine and we can defer the actual deref_type as far as we need. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 | spirv: Drop a bogus assert | Jason Ekstrand | 1 | -1/+0
This was valid back when the only valid types of pointers were uint32 and uvec2. Now that we're allowing more variety, it could be just about anything so we'll just drop the assert. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 | nir: Allow SSBOs and global to alias | Jason Ekstrand | 1 | -1/+6
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 | nir/validate: Allow array derefs of vectors for nir_var_mem_global | Jason Ekstrand | 1 | -1/+2
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 | nir/lower_io: Add support for nir_var_mem_global | Jason Ekstrand | 1 | -0/+12
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-26 | nir/lower_io: Add a 32 and 64-bit global address formats | Jason Ekstrand | 2 | -30/+123
These are simple scalar addresses. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-26 | nir: Add load/store/atomic global intrinsics | Jason Ekstrand | 3 | -1/+39
These correspond roughly to reading/writing OpenCL global pointers. The idea is that they just take a bare address and load/store from it. Of course, exactly what this address means is driver-dependent. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-01-23 | nir: Length of boolean vtn_value now is 1 | Sergii Romantsov | 1 | -3/+6
During conversion type-length was lost due to math. v2 (Jason Ekstrand): - Use a size/offset of 4 bytes Fixes: 44227453ec03 (nir: Switch to using 1-bit Booleans for almost everything) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109353 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Tested-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 | anv: Add pipeline cache support for xfb_info | Jason Ekstrand | 1 | -1/+1
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 | nir/xfb: distinguish array of structs vs array of blocks | Alejandro Piñeiro | 1 | -7/+17
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 | nir/xfb: Properly handle arrays of blocks | Jason Ekstrand | 1 | -20/+41
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 | nir/xfb: don't assert when xfb_buffer/stride is present but not xfb_offset | Alejandro Piñeiro | 1 | -7/+6
This allows nir_gather_xfb_info to be used on OpenGL, specifically ARB_gl_spirv. From the OpenGL 4.6 spec, section 11.1.2.1, "Output Variables": "outputs specifying both an *XfbBuffer* and an *Offset* are captured, while outputs not specifying both of these are not captured. Values are captured each time the shader writes to such a decorated object." This implies that outputs are captured if both are present, and not if either one is missing. Technically, it doesn't explicitly say that having just one or the other is a mistake. In some cases, glslang adds an extra XfbBuffer without an XfbOffset, and mentions that technically that is not a bug (see issue#1526). For Vulkan, as the same glslang issue mentions, it is not clear whether that should be a mistake or not. But even if it is a mistake, it does not really need to be checked in the driver, and we can let the validation layers check that. v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 | nir/xfb: Fix offset accounting for dvec3/4 | Jason Ekstrand | 1 | -2/+2
Before, we were double-counting the component slots when we had a dvec3 or dvec4. Instead, just add them in once and manually offset the recorded output offset. Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 | nir: Preserve offsets in lower_io_to_scalar_early | Jason Ekstrand | 1 | -0/+8
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-01-22 | nir: fix lowering arrays to elements for XFB outputs | Samuel Pitoiset | 1 | -2/+11
If we have a transform feedback output like:
float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0)
which is lowered by nir_lower_io_arrays_to_elements to:
float x2_out (VARYING_SLOT_VAR1.x, 0, 0)
float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0)
we have to update the destination offset to avoid overwriting the same value. v2 (Jason Ekstrand): - Compute the correct offsets for arrays of vectors and/or doubles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
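The offset update can be illustrated with a small Python sketch. It is an assumption-laden model, not the code of the lowering pass: `element_xfb_offsets` is a hypothetical helper, offsets are in bytes, and elements are assumed to be tightly packed in the transform feedback buffer.

```python
def element_xfb_offsets(base_offset, array_length, components, is_double=False):
    """Byte offset in the xfb buffer for each element of a lowered array.

    Each element is a scalar or vector of `components` channels; a channel
    is 4 bytes, or 8 bytes for doubles.
    """
    element_size = components * (8 if is_double else 4)
    return [base_offset + i * element_size for i in range(array_length)]


# float[2] captured at offset 0: the second element must land at offset 4,
# not on top of the first one.
float2_offsets = element_xfb_offsets(0, 2, components=1)
```

If every lowered element kept the original offset, all of them would be written to the same place in the buffer, which is the overwrite the message describes.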
2019-01-22 | nir: do not remove varyings used for transform feedback | Samuel Pitoiset | 1 | -0/+3
When an xfb buffer is explicitly declared on a varying variable, we shouldn't remove it at link time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-22 | spirv: Only set interface_type on blocks | Jason Ekstrand | 1 | -9/+25
Instead of setting interface_type to whatever the per-vertex type is, we only set it on blocks. This allows later passes to tell the difference between variables that are in blocks and those that aren't. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>