Age | Commit message | Author | Files | Lines
2018-02-05 | Add barrier instructions validation pass | Andrey Tuganov | 12 | -43/+963
2018-02-05 | Avoid vector copies in range-for loops in opt/types.cpp | David Neto | 1 | -6/+6
Also be more explicit about iterated types in other range-for loops.
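The kind of copy this commit avoids can be illustrated with a minimal sketch. `StructLike` and both functions are hypothetical names made up for this example; they are not taken from opt/types.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a type that owns a vector of vectors.
struct StructLike {
  std::vector<std::vector<int>> members;
};

// `auto` deduces a value type here, so each inner vector is copied
// once per iteration.
std::size_t TotalByValue(const StructLike& s) {
  std::size_t total = 0;
  for (auto m : s.members) total += m.size();
  return total;
}

// Spelling out the iterated type as a const reference avoids the copy
// and makes the element type explicit to the reader.
std::size_t TotalByReference(const StructLike& s) {
  std::size_t total = 0;
  for (const std::vector<int>& m : s.members) total += m.size();
  return total;
}
```

Both functions return the same result; only the second avoids the per-iteration copy.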
2018-02-02 | Disambiguate between const and nonconst ForEachSuccessorLabel | David Neto | 10 | -28/+45
This helps Visual Studio 2013 compile the code. Contributes to #1262
2018-02-02 | Add general folding infrastructure. | Steven Perron | 8 | -469/+797
Create the folding engine that will:
1) attempt to fold an instruction;
2) iterate on the folding so small folding rules can be easily combined;
3) insert new instructions when needed.
I've added the minimum number of rules needed to test the features above.
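The idea of iterating small folding rules until none applies can be sketched as follows. This is a toy model under stated assumptions: `Op`, `Operand`, `Inst`, `FoldingRule`, `Fold`, and the three rules are all hypothetical and do not reflect the actual interface added by this commit. Inserting new instructions (point 3) is not modelled.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum class Op { kConst, kCopy, kAdd, kMul };

// Toy operand: either a constant or a made-up value id.
struct Operand {
  bool is_const;
  int value;  // constant value if is_const, otherwise an id
};

// Toy instruction; `b` is ignored for kConst and kCopy.
struct Inst {
  Op op;
  Operand a;
  Operand b;
};

// A rule rewrites the instruction in place and reports whether it fired.
using FoldingRule = std::function<bool(Inst*)>;

// x + 0  ->  copy x
bool AddZero(Inst* i) {
  if (i->op != Op::kAdd || !i->b.is_const || i->b.value != 0) return false;
  i->op = Op::kCopy;
  return true;
}

// x * 1  ->  copy x
bool MulOne(Inst* i) {
  if (i->op != Op::kMul || !i->b.is_const || i->b.value != 1) return false;
  i->op = Op::kCopy;
  return true;
}

// copy <constant>  ->  constant
bool CopyOfConst(Inst* i) {
  if (i->op != Op::kCopy || !i->a.is_const) return false;
  i->op = Op::kConst;
  return true;
}

// Applies the rules repeatedly until a full pass changes nothing.
// Returns how many rule applications happened.
int Fold(Inst* inst, const std::vector<FoldingRule>& rules) {
  int applied = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const FoldingRule& rule : rules) {
      if (rule(inst)) {
        ++applied;
        changed = true;
      }
    }
  }
  return applied;
}
```

Folding `2 + 0` with these rules first fires `AddZero` (giving a copy of the constant 2) and then `CopyOfConst` (giving the constant 2), which shows how two small rules combine without either one knowing about the other.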
2018-02-02 | Start v2018.1-dev | David Neto | 1 | -0/+3
2018-02-02 | Finalize v2018.0 | David Neto | 1 | -1/+1
2018-02-02 | Update CHANGES | David Neto | 1 | -1/+12
2018-02-01 | Reordering performance passes ordering to produce better opts | Alan Baker | 1 | -8/+8
* Moved initial insert/extract passes later to cover more opportunities
* Added an extra set of passes to clean up opportunities exposed later in the pipeline
2018-02-01 | Add LoopUtils class to gather some loop transformation support. | Victor Lomuller | 16 | -24/+1756
This patch adds the LoopUtils class to handle some loop-related transformations. For now it has two transformations that simplify other transformations such as loop unroll or unswitch:
- Dedicate exit blocks: this ensures that all exit basic blocks (out-of-loop basic blocks that have a predecessor in the loop) have all their predecessors in the loop;
- Loop Closed SSA (LCSSA): this ensures that all definitions in a loop are used inside the loop or in a phi instruction in an exit basic block.
It also adds the following capabilities:
- Loop::IsLCSSA to test if the loop is in LCSSA form;
- Loop::GetOrCreatePreHeaderBlock, which can build a loop preheader if required;
- New methods to allow on-the-fly updates of the loop descriptors;
- New methods to allow on-the-fly updates of the CFG analysis;
- Instruction::SetOperand to allow expression of the index relative to Instruction::NumOperands (to be compatible with the index returned by DefUseManager::ForEachUse).
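The "dedicated exit block" property described above can be checked with a small sketch. The CFG representation (a predecessor map keyed by integer block ids) and the function name `NonDedicatedExits` are assumptions made for illustration; they are not part of LoopUtils.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using BlockId = int;
// Hypothetical CFG view: block id -> ids of its predecessors.
using Preds = std::map<BlockId, std::vector<BlockId>>;

// Returns the exit blocks of `loop` that are NOT dedicated, i.e.
// out-of-loop blocks with at least one predecessor inside the loop
// and at least one predecessor outside it.
std::set<BlockId> NonDedicatedExits(const Preds& preds,
                                    const std::set<BlockId>& loop) {
  std::set<BlockId> result;
  for (const auto& entry : preds) {
    if (loop.count(entry.first) != 0) continue;  // in-loop block, not an exit
    bool has_pred_in_loop = false;
    bool has_pred_outside = false;
    for (BlockId p : entry.second) {
      if (loop.count(p) != 0) {
        has_pred_in_loop = true;
      } else {
        has_pred_outside = true;
      }
    }
    if (has_pred_in_loop && has_pred_outside) result.insert(entry.first);
  }
  return result;
}
```

A transformation that dedicates exit blocks would have to make this set empty, for example by routing the in-loop edges through a fresh block; that rewriting step is not shown here.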
2018-02-01 | Add pass to replace invalid opcodes | Steven Perron | 11 | -0/+889
Creates a pass that will remove instructions that are invalid for the current shader stage. For the instruction to be considered for replacement:
1) The opcode must be valid for a shader module.
2) The opcode must be invalid for the current shader stage.
3) All entry points to the module must be for the same shader stage.
4) The function containing the instruction must be reachable from an entry point.
Fixes #1247.
2018-02-01 | Added OpenCL ExtInst validation rules | Andrey Tuganov | 2 | -2/+4468
2018-02-01 | Add adjacency validation pass | Jeremy Hayes | 9 | -30/+409
Validate OpPhi predecessors. Validate OpLoopMerge successors. Validate OpSelectionMerge successors. Fix collateral damage to existing tests. Remove ValidateIdWithMessage.OpSampledImageUsedInOpPhiBad.
2018-01-31 | Fixed harmless uninit var warning | Andrey Tuganov | 1 | -1/+1
2018-01-31 | Use SPIR-V headers from "unified1" directory | David Neto | 5 | -5/+4
2018-01-31 | Remove constexpr from Analysis operators | Alan Baker | 5 | -47/+53
* Had to remove templating from InstructionBuilder as a result
* now preserved analyses are specified as a constructor argument
* updated tests and uses
* changed static_assert to a runtime assert
* this should probably get further changes in the future
2018-01-31 | Opt: Add ScalarReplacement to RegisterSizePasses | GregF | 1 | -0/+1
2018-01-30 | Add memory semantics checks to validate atomics | Andrey Tuganov | 2 | -101/+274
2018-01-30 | Update CHANGES | David Neto | 1 | -2/+15
2018-01-30 | Prevent unnecessary changes to the IR in dead branch elim | Alan Baker | 2 | -27/+74
* When handling unreachable merges and continues, do not optimize to the same IR
* pass did not check whether the unreachable blocks were in the optimized form before transforming them
* added a test to catch this issue
2018-01-30 | Improved error message in val capabilities | Andrey Tuganov | 2 | -30/+35
2018-01-30 | Enhancements to block merging | Alan Baker | 3 | -88/+316
* Should handle all possibilities
* Stricter checks for what is disallowed:
  * header and header
  * merge and merge
* Allow header and merge blocks to be merged
* Erases the structured control declaration if merging header and merge blocks together.
2018-01-30 | Fix dereference of possibly nullptr | Alan Baker | 2 | -1/+27
* If the dead branch elim is performed on a module without structured control flow, the OpSelectionMerge may not be present
* Add a check for pointer validity before dereferencing
* Added a test to catch the bug
2018-01-30 | InsertExtractElim: Split out DeadInsertElim as separate pass | GregF | 15 | -955/+1235
2018-01-29 | Fixes in CCP for #1228 | Alan Baker | 4 | -6/+71
* Forces traversal of phis if the def has changed to varying
* Mark a phi as varying if all incoming values are varying
* added a test to catch the bug
2018-01-25 | Add LoopDescriptor as an IRContext analysis. | Victor Lomuller | 6 | -25/+76
Move some function definitions from header to source to avoid circular definition.
2018-01-25 | DeadInsertElim: Detect and DCE dead Inserts | Greg Fischer | 3 | -7/+961
This adds Dead Insert Elimination to the end of the --eliminate-insert-extract pass. See the new tests for examples of code that will benefit. Essentially, this removes OpCompositeInsert instructions which are not used, either because there is no instruction which uses the value at the index where it is inserted, or because a subsequent insert intercepts any such use. This code has been seen to remove significant amounts of dead code from real-life HLSL shaders being ported to Vulkan. In fact, it is needed to remove dead texture samples which cause Vulkan validation layer errors (unbound textures and samplers) if not removed. Such DCE is thus required for fxc equivalence and legalization. This analysis operates across "chains" of Inserts which can also contain Phi instructions.
2018-01-25 | Initial implementation of if conversion | Alan Baker | 18 | -9/+948
* Handles simple cases only
* Identifies phis in blocks with two predecessors and attempts to convert the phi to a select
* does not perform code motion currently so the converted values must dominate the join point (e.g. can't be defined in the branches)
* limited for now to two predecessors, but can be extended to handle more cases
* Adding if conversion to -O and -Os
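The effect of if conversion can be pictured at source level with a minimal sketch. The two functions are hypothetical illustrations written for this example; the pass itself works on SPIR-V phi and select instructions, not on C++.

```cpp
#include <cassert>

// Before: a two-way branch whose join point merges two values, which is
// what a phi with two predecessors expresses. Both `a` and `b` are
// computed before the branch, so they dominate the join point.
int WithBranch(bool cond, int a, int b) {
  int result;
  if (cond) {
    result = a;
  } else {
    result = b;
  }
  return result;
}

// After: the merge is expressed as a single select on the condition.
int WithSelect(bool cond, int a, int b) {
  return cond ? a : b;
}
```

If `a` or `b` were computed inside one of the branches, the select form would need that computation moved above the branch, which is the code motion the commit says it does not perform yet.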
2018-01-24 | Validator: restricted some atomic ops for shaders | Andrey Tuganov | 2 | -5/+132
Ban the floating point case for OpAtomicLoad, OpAtomicExchange, and OpAtomicCompareExchange. In graphics (Shader) environments, these instructions only operate on scalar integers. OpenCL supports atomic_float.
2018-01-24 | Added Vulkan-specific checks to image validation | Andrey Tuganov | 4 | -34/+148
Implemented Vulkan-specific rules:
- OpTypeImage must declare a scalar 32-bit float or 32-bit integer type for the “Sampled Type”.
- OpSampledImage must only consume an “Image” operand whose type has its “Sampled” operand set to 1.
2018-01-22 | Use id_map in Fold*ToConstant | Steven Perron | 2 | -16/+137
The folding routines are supposed to use the id_map provided to map the ids in the instruction. The ones I just added are missing it.
2018-01-22 | Add generic folding function and use in CCP | Steven Perron | 5 | -3/+1300
The current folding routines have a very cumbersome interface, which makes them hard to use, and it is not obvious how to extend them. This change creates a new interface for the folding routines and shows how it can be used by calling it from CCP. This does not make a significant change to the behaviour of CCP. In general it should produce the same code as before; however, it is possible that an instruction that takes 32-bit integers as inputs and whose result is not a 32-bit integer or bool will not be folded as before. It seems like Android has a problem with INT32_MAX and the like. I'll explicitly define those if they are not already defined.
2018-01-19 | Fixes infinite loop in ADCE | Alan Baker | 3 | -16/+66
* Addresses how breaks are identified to prevent infinite loops when back-to-back loops share a merge and header
* Added test to catch the bug
2018-01-19 | Introduce an instruction builder helper class. | Victor Lomuller | 4 | -6/+422
The class factors out the instruction building process. The def-use manager analysis can be updated on the fly to maintain coherency. To be updated to take more analyses into account.
2018-01-18 | Simplifying code for adding instructions to worklist | Alan Baker | 3 | -14/+89
* AddToWorklist can now be called unconditionally
* It will only add instructions that have not already been marked as live
* Fixes a case where a merge was not added to the worklist because the branch was already marked as live
* Added two similar tests that fail without the fix
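The "call unconditionally" behaviour described above can be sketched with a small worklist that does its own liveness check. The class `LivenessWorklist`, its use of integer instruction ids, and the helper methods are assumptions for illustration, not the ADCE implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <unordered_set>

// Hypothetical worklist keyed by integer instruction ids.
class LivenessWorklist {
 public:
  // Safe to call unconditionally: an instruction is queued only the
  // first time it is marked live, so callers need no pre-check.
  void AddToWorklist(int inst_id) {
    if (live_.insert(inst_id).second) worklist_.push(inst_id);
  }

  bool IsLive(int inst_id) const { return live_.count(inst_id) != 0; }

  // Number of instructions still waiting to be processed.
  std::size_t Pending() const { return worklist_.size(); }

 private:
  std::unordered_set<int> live_;
  std::queue<int> worklist_;
};
```

Keeping the check inside `AddToWorklist` means every call site behaves the same way, which removes the class of bug where one caller guards on the wrong instruction's liveness.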
2018-01-18 | Create a pass to work around a driver bug related to OpUnreachable. | Steven Perron | 10 | -1/+558
We have come across a driver bug where an OpUnreachable inside a loop causes the shader to go into an infinite loop. This commit tries to avoid the bug by turning OpUnreachable instructions that are contained in a loop into branches to the loop merge block. This is not added to "-O" and "-Os" because it should only be used if the driver being targeted has this problem. Fixes #1209.
2018-01-18 | Adding testcase for #1210 | Alan Baker | 1 | -0/+43
2018-01-18 | CFG: force the creation of a predecessor entry for all basic blocks. | Victor Lomuller | 1 | -0/+3
This ensures that all basic blocks in a function have a valid entry in the CFG object. The entry block has no predecessors but remains a valid basic block for which we might want to query the number of predecessors. Some unreachable basic blocks may not have predecessors as well.
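Why forcing an entry matters can be shown with a toy predecessor map. `ToyCfg` and its integer block ids are hypothetical and only model the idea; they are not the CFG class touched by this commit.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical CFG built from a successor list per block.
class ToyCfg {
 public:
  explicit ToyCfg(const std::map<int, std::vector<int>>& successors) {
    for (const auto& entry : successors) {
      // Force an (initially empty) predecessor entry for every block,
      // so blocks with no predecessors can still be queried.
      preds_[entry.first];
      for (int succ : entry.second) preds_[succ].push_back(entry.first);
    }
  }

  // Throws std::out_of_range for an unknown block; never for a known
  // block, even if it has no predecessors.
  const std::vector<int>& preds(int block) const { return preds_.at(block); }

 private:
  std::map<int, std::vector<int>> preds_;
};
```

Without the forced entry, querying the entry block or an unreachable block would fail the lookup instead of returning an empty list.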
2018-01-18 | spirv-dis: Add --color option to force color disassembly | David Neto | 1 | -9/+23
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1170
2018-01-17 | Fixing missing early exit from break identification | Alan Baker | 1 | -0/+1
2018-01-17 | Adding support for switch removal in ADCE | Alan Baker | 4 | -42/+114
* Updated code to handle switches
* Enabled disabled test and added a couple new ones
2018-01-17 | Capturing value table by reference in local redundancy elim | Alan Baker | 1 | -1/+1
2018-01-16 | Update CHANGES | David Neto | 1 | -0/+2
2018-01-16 | Fixes missing increment in common uniform elim | Alan Baker | 2 | -0/+113
* Addresses #1203
* Increments inIdx in IsConstantIndexAccessChain
* added test to catch the bug
2018-01-15 | Skip SpecConstants in CCP. | Steven Perron | 3 | -2/+34
At the moment specialization constants look like constants to CCP. This causes a problem because they are handled differently by the constant manager. I chose to simply skip over them, and not try to add them to the value table. We can do specialization before CCP if we want to be able to propagate these values. Fixes #1199.
2018-01-12 | Update CHANGES | David Neto | 1 | -0/+2
2018-01-12 | Add MatrixConstant | Greg Fischer | 2 | -0/+40
2018-01-12 | Remove redundant passes from legalization passes | Steven Perron | 1 | -4/+1
With the work that Alan has done, some passes have become redundant. ADCE now removes unused variables. Dead branch elimination removes unreachable blocks. This means we can remove CFG Cleanup and dead variable elimination.
2018-01-12 | Adding early exit versions of several ForEach* methods | Alan Baker | 20 | -215/+393
* Looked through code for instances where code would benefit from early exit
* Added a corresponding WhileEach* method and updated the code
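The difference between a ForEach-style visitor and an early-exit WhileEach-style visitor can be sketched over a plain vector. `ForEachItem`, `WhileEachItem`, and `Contains` are hypothetical free functions written for this example, and the "return false to stop" convention is an assumption of the sketch.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Visits every item; the callback cannot stop the traversal.
void ForEachItem(const std::vector<int>& items,
                 const std::function<void(int)>& f) {
  for (int item : items) f(item);
}

// Early-exit variant: stops as soon as the callback returns false.
// Returns true only if every item was visited.
bool WhileEachItem(const std::vector<int>& items,
                   const std::function<bool(int)>& f) {
  for (int item : items) {
    if (!f(item)) return false;
  }
  return true;
}

// A search built on the early-exit variant. `visited` counts how many
// items the callback saw, to make the early exit observable.
bool Contains(const std::vector<int>& items, int wanted, int* visited) {
  *visited = 0;
  return !WhileEachItem(items, [&](int item) {
    ++*visited;
    return item != wanted;  // keep going while not found
  });
}
```

With a ForEach-style visitor the same search would have to visit every item and carry a "found" flag, which is the pattern this kind of refactoring removes.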
2018-01-12 | Move initialization of the const mgr to the constructor. | Steven Perron | 5 | -10/+44
The current code expects the users of the constant manager to initialize it with all of the constants in the module. The problem is that you do not want to redo the work multiple times. So I decided to move that code to the constructor of the constant manager. This way it will always be initialized on first use. I also removed an assert that expects all constant instructions to be successfully mapped. This is because not all OpConstant* instructions can map to a constant, and neither can the OpSpecConstant* instructions. The real problem is that an OpConstantComposite can contain a member that is OpUndef. I tried to treat OpUndef like OpConstantNull, but this failed because an OpSpecConstantComposite with an OpUndef cannot be changed to an OpConstantComposite. Since I feel this case will not be common, I decided to not complicate the code. Fixes #1193.
2018-01-12 | Start v2018.0-dev | David Neto | 1 | -0/+3