summaryrefslogtreecommitdiff
path: root/src/jit/inlinepolicy.h
AgeCommit message (Collapse)AuthorFilesLines
2016-08-11Reformat jit sources with clang-tidy and formatMichelle McDaniel1-53/+83
This change is the result of running clang-tidy and clang-format on jit sources.
2016-08-05Inliner: add a few more observationsAndy Ayers1-7/+15
Now that we can trust `optMethodFlags`, note if the root caller has newarr or newobj. We can't yet tell if these operations might feed a particular call's arguments but their presence in the root method alone might be enough to build correlations with performance changes. Count number of returns in the callee. Note if callee has same this as the (root) caller. Note when callee is a class constructor. Add code to dump out the new observations. No policies act on these yet.
2016-08-04Do not inline methods that never returnMike Danes1-0/+29
Methods that do not contain return blocks can't ever exit normally, they either throw or loop indefinitely. Inlining is not beneficial in such cases as it increases the code size without providing any speed benefits. In the particular case of throws the inlined code can easily be 10 times larger than the call site. The call to fgMoreThanOneReturnBlock has been replaced with code that does the same thing but also detects the existence of at least one return block. This avoids walking the basic block list twice. Note that BBJ_RETURN blocks are also generated for CEE_JMP. Methods exiting via CEE_JMP instead of CEE_RET will continue to be inlined (assuming they were inlined before this change).
2016-07-19Inliner: improve arg observationsAndy Ayers1-10/+13
Two related changes to better capture cases where args or constant args are seen at inline call sites. On the observation side, the inliner's stack modelling of the callee is crude and will often overestimate the evaluation stack depth. So when looking at an opcode that takes just one stack operand (eg BRFALSE) the inliner's check should be whether the stack has at least one element instead of checking whether the stack has exactly one element. For compatibility reasons, the check for exactly one element is still used when the inline is evaluated via the LegacyPolicy. On the policy side, instead of keeping a bool flag to note that there were interesting arg cases, count the number of observations. LegacyPolicy continues to act as before, checking if count is zero or nonzero instead of whether the flag was false or true. The count is available for use in other heuristics and is reported in inline data.
2016-07-08Inliner: updates to ModelPolicyAndy Ayers1-0/+3
Updates to bring CS and TP impact of the ModelPolicy into more acceptable ranges. For CS, reduce the call site weights to values that are more in keeping with the legacy policy weights. Local test runs show this does not drastically alter CQ and brings CS down below LegacyPolicy levels, on average. Implement an early out rejection based solely on ILSize. The threshold value is set by conservatively determining when ILSize alone indicates the method in question will never be inlined (note the policy itself does not have an explicit ILSize cutoff). See comments for `ModelPolicy::NoteInt` for details. Note if we adjust the model's size and profitability estimates, this threshould will also need updating. CQ (as measured by the CoreCLR perf tests) continues to show about a 2.5% geomean improvement over LegacyPolicy.
2016-06-30Inliner: make ModelPolicy available in release buildsAndy Ayers1-4/+17
Make the ModelPolicy available in release builds of the jit. Enable this policy by setting: COMPLUS_JitInlinePolicyModel=1 This works for either jitting or prejitting. Update the ModelPolicy so callee never inline decisions are always propagated back to the runtime.
2016-06-09Inliner: observe block weight, update schema and dataAndy Ayers1-0/+1
Add block weight as observation. At first blush I suspect this will be useful only when the jit has IBC data, and even then only in the root method blocks. We'll likely need to either build up better profile estimates for the inliner or else just try and cope without it somehow. Dump out block weight and the per-call profitability in the observation schema and data. Among other things this lets us cross-validate how well the ModelPolicy predictions match the abstract GLMNET models...
2016-06-09Inliner: implement profitabily side of the ModelPolicyAndy Ayers1-0/+2
First (and very preliminary) cut at a profitability model, based on inline data modelling. ModelPolicy heuristic is updated to use this information to decide when to inline. The policy has two parameters: one for the local call site weight and another for the overall size/speed tradeoff that we find acceptable. The first parameter is temporary as we should be able to model this weight via observations. Size/speed tradeoff is a policy decision and is a tuning parameter. I've set it for now to roughly match the LegacyPolicy behavior, in aggregate, across the jit perf benchmarks.
2016-06-07Inliner: allow replay log to override force inlinesAndy Ayers1-0/+4
ReplayPolicy only impacts discretionary inlines, so could not control force inline candidates. This has lead to some confusion during inline studies. Since most cases of force inline are performance related, allow replay to override and control these force inline candidates. This might be useful down the road to study how likely it is that a force inline directive is either unnecessary or perhaps even detremental to performance.
2016-05-25Inliner: use offset in xml replayAndy Ayers1-2/+14
Replay needs the ability to distinguish among multiple calls to the same callee. The IL offset of the call site can be used for this, so start checking for it during replay. Note there is subsequent follow-up work to ensure that we actually track offsets for calls. Refactor the way the ReplayPolicy gets notified of the current inline context (and the IL offset of the call). Instead of trying to pass this information as arguments to the factory method that creates policies, add special notifications for this information that only the ReplayPolicy will act upon.
2016-05-20Inliner: locks for xml read/write accessAndy Ayers1-3/+4
Move CritSecObject into util.h, and use it to lock around reading and writing inline Xml. Introduce CritSecHolder for RAII management of the locks. Add a simple file position cache for methods to speed up replay when the inline xml file is large.
2016-05-18Inliner: small changes to inline xmlAndy Ayers1-1/+1
Emit the hash for callees, and check it during replay. Capture the root method name, and do a simple Xml-safe conversion. Add a missing newline to the banner message announcing that the replay policy is in effect.
2016-05-12Inliner: introduce ReplayPolicyAndy Ayers1-0/+29
The ReplayPolicy reads an external script to determine which inlines to perform. The script is the same Xml syntax that's produced by the inliner when JitInlineDumpXml is enabled. This format can be edited by hand or tool to force particular inlining patterns to occur. Methods or calls sites not mentioned in the script are considered as noinline. There's a bunch of work still left to make this fully robust, but in testing it works well enough for my immediate use case that I'll hold off on further polish until it's needed. But, for future reference, here's a laundry list: * Need better ways to identify methods. Token and hash are not enough. * Need better ways to identify call sites. Callee token is not enough. * Consider preparsing or mapping the script into memory. * Consider caching node positions in the InlineContexts. * Make it robust for multithreading. * Handle the prejit root case somehow. * Possibly allow overriding of inline attributes.
2016-05-03Inliner: initial version of SizePolicyAndy Ayers1-1/+23
Add a new inline policy that tries to inline as much as possible without increasing method size. Fix one test that had an expected inline that doesn't happen under the SizePolicy, by marking a method with AggressiveInline.
2016-04-27Inliner: implement FullPolicyAndy Ayers1-0/+25
The FullPolicy will be used to grow maximal potential inline trees for methods. It inlines all legal candidates subject to depth and size limits. Add a config flag to select this as the policy to use. Also add a config flag so the inline depth limit becomes adjustable, and rework the code so that checking this limit is now a matter of policy. Add in an implementation max depth limit of 1000. Revise names of some statics to make their intent clearer, and add some missing header comments.
2016-04-20Inliner: estimate overall code size impactAndy Ayers1-1/+13
Accumulate the per-inline size impact estimates into an overall inline size impact estimate. No size budgeting (yet); this just provides data for further study.
2016-04-08Inliner: initial version of a model-based policyAndy Ayers1-5/+26
Add a model-based size predictor to the `DiscretionaryPolicy` and dump its data along with observations. This can be used to assess the performance of the model. Currently using a linear model that can explain about 55% of the variability in code size impact for mscorlib. Whether this holds up for other situations is still to be determined. Create a `ModelPolicy` that relies on this code size model to make inlining decisions. This is very much a work in progress as the code size models are still not well vetted and there is no real code quality model yet.
2016-04-07Inliner: update DiscretionaryPolicy to look at opcodes and arg infoAndy Ayers1-0/+32
Have the `DiscretionaryPolicy` do some simple opcode histogramming to try and provide better observations for code size estimation. Likewise, look at the types and sizes of arguments and report those.
2016-03-25Inliner: refactor policies to extract common legality portionAndy Ayers1-22/+41
Now that we have multiple policies, refactor their implementation to extract common elements. In particular all the existing policies have the same legality and ability constraints, so this logic is extracted to a partial `LegalPolicy` superclass. Adresses relevant feedback on #3697, where the number of policies increased from one to two. No changes in behavior.
2016-03-24Inliner: collect jit time dataAndy Ayers1-5/+11
This is an initial cut at collecting the time spent jitting a method as part of observing the incremental impact of inlines. It currently only measures the time after inlining has run, because there are substantial interactions with the EE before this point and the time spent in those calls is not always going to properly attribute to the current method. We've had some good internal discussion on other approaches to this and will likely revisit as time goes on. There was already some timing support in place for SQM so I've generalized that and shared it for this case too. This also fixes an off-by-one issue with the method version which came up when trying to back-correlate the inline data into the inline trees. The data dumps now include root method data, and so we capture the IL size and force inline state of the root method in the data dumps. This allows us to investiage if observed inliner data is corrlated with the current size of a method. Data output is now also supported in RET builds. A custom build that defines `INLINE_DATA` is required. Output is sent to stderr to work around issues with crossgen.exe reopening stdout in a non-ascii mode.
2016-03-21Inliner: initial data gathering for code size estimatesAndy Ayers1-0/+26
Some initial work to gather data to drive the modelling of inline code size impact. See #3775 for context. Update the `DiscretionaryPolicy` to capture more observations. Add a limit feature `JitInlineLimit` so the jit will stop inlining after a given number of successful inlines. Add a `DumpData` method to `InlinePolicy` which will display all the observations that were made to evaluate an inline. Implement this for the `DiscretionaryPolicy`. Add a matching `DumpSchema` that writes out column headers for each of the observations. Modify `InlineResult` to cache the last successful policy on the root compiler instance. Use that along with the `JitInlineLimit` and `DumpData` to display detailed information about the last sucessful inline along with the code size for the method. All this is displayed if `JitInlineDumpData ` is enabled. This allows for isolating code size measurements as follows. Compile or jit something with `JitInlineLimit = 0` and `JitInlineDumpData =1` (and for now, `JitInlinePolicyDiscretionary = 1`, since other policies do not implement any interesting data dumping). Record the this root set of method sizes and IDs. Repeat with `JitInlineLimit=1`. For each method where there was an inline, the size impact of that inline can be computed by comparing this version versus the version from the root set. Repeat with `JitInlineLimit=2`, comparing versus the `=1` versions. Currently the method token is used to identify the root method. This is not sufficiently unique but unfortunately there currently aren't substantially better alternatives. See #1957 for some discussion.
2016-03-17Inliner: make more observations unconditionalAndy Ayers1-0/+4
There are still some observations within `fgFindJumpTargets` that are only made when inlining. This change enables them to also be made for the prejit root case. This partially addresses the inconsistencies noted in #3482. Most notably, the simple stack model is now fully enabled for the prejit root case, and the policy is able to more broadly observe things like `CALLEE_LOOKS_LIKE_WRAPPER` and `CALLEE_ARG_FEEDS_CONSTANT_TEST`. As before, the `LegacyPolicy` ignores these observations for the prejit root. Forthcoming policies will likely behave differently. Two observations that were fatal -- `CALLEE_UNSUPPORTED_OPCODE` and `CALLEE_STORES_TO_ARGUMENT` -- are now conditionally fatal, since they do not cause failure for the prejit root case (though arguably they should). The `CALLEE_IS_MOSTLY_LOAD_STORE` observation has been removed, and the conditions under which it fired have been encapsulated within the `LegacyPolicy`. No change in behavior is expected. I verified crossgen of mscorlib still makes all the same inline decisions for the same reasons.
2016-03-16Inliner: create DiscretionaryPolicyAndy Ayers1-3/+26
The `DiscretionaryPolicy` is similar to the `LegacyPolicy`, but does not use size limits. So there is no "always" inline class and no "too big to inline" class. Also, discretionary failures do not trigger noinline stamping. It is installed if `JitInlinePolicyDiscretionary` is set nonzero. See #3775 for some background on where this new policy will be used. This is a first cut and further refinement is likely. Also removed the unused `NoteDouble` methods since the double-valued observations now are kept internally within `LegacyPolicy` and it's not very likely we'll introduce new cases where doubles are needed.
2016-03-16Use root compiler instance for inline result and policyAndy Ayers1-3/+3
When constructing an InlineResult (and subsequent InlinePolicy) always use the root compiler instance, so that all policy evaluations done in a method are consistent. Add an assert that the initial fact replay during `fgFindJumpTargets` (force inline state and code size) doesn't change the candidacy when inlining -- it should just be re-establishing what we knew in the initial candidate scan. Closes #3760.
2016-03-15Random inlinerAndy Ayers1-0/+45
This change creates an inline RandomPolicy which assigns a random benefit to each discretionary inline candidate. Force inlines are honored as they are with LegacyPolicy. The pool of discretionary candidates is widened to include the "always inline" cases from the LegacyPolicy. Discretionary inlines are accepted with a likelihood that decreases with IL size, in a manner that mirrors the acceptance probability of the LegacyPolicy. The random inliner is intended to be used mainly as a stress testing tool. The RandomPolicy is enabled in non-retail builds when `JitStress` is set to a nonzero value. By default the RandomPolicy will be used for 50% of the method being compiled, and the LegacyPolicy (in stress mode) for the other 50%. Which policy is used for a given method depends on the stress value. The stress value also forms part of the random seed, so different random patterns can be invoked by varying the `JitStress` value. To enable the RandomPolicy for all methods, set `JitStress` to something nonzero, then set `JitStressModeName=STRESS_RANDOM_INLINE`. To do this without also enabling other stress modes, set `JitStressModeNameOnly=1'.
2016-03-09Inline refactoring: move profitability assessment to LegacyPolicyAndy Ayers1-5/+4
LegacyPolicy now encapsulates all the computations needed to evaluate whether an inline is profitable or not. This completes the main objective of the refactoring effort, which was to preserve and encapsulate the current inliner's behavior.
2016-03-09Fix various naming convention issues in inliner codeAndy Ayers1-44/+44
Per the jit coding conventions, rename all class member functions to be CamelCase. Rename all class data members to begin with `m_`.
2016-03-09Inline refactoring: move native callsite estimatorAndy Ayers1-15/+17
Move the native call site estimation logic into `LegacyPolicy`. Rework the remaining per-callsite bonus multipliers so that they fit into the observation framework. Remove `hasNativeSizeEstimate` and instead check that the inline is a candidate and its observation is a discretionary inline. Fix header comment in `fgFindJumpTargets`.
2016-03-07Inline refactoring: move state machine into LegacyPolicyAndy Ayers1-12/+22
The state machine used to estimate callee native code size is now an internal part of the LegacyPolicy. During `fgFindJumpTargets` the policy determines if the state machine is need, and if so, sets one up. `fgFindJumpTargets` then notifies the policy of the various opcodes in the method, and the policy uses this to drive the state machine forward. When the IL scan is completed, the `fgFindJumpTargets` notifies the policy, then, for the discretionary inline case, grabs the native size estimate, and uses that to evaluate inlinability. The knowledge of when inlines are "always", "forced", or "discretionary" is now encapsulated by the policy instead of being distributed in various places in the code.
2016-03-07Inline refactoring: revise candidacy determinationAndy Ayers1-11/+6
Rework the code to determine the key aspects of candidacy up front rather than figuring it out eventually. In particular this ensures that the two stages of evaulation for inlines both start at the same point for candidates. This will help streamline subsequent work to move the state machine into the policy, since the state machine is only needed for the case where the candidate is a discretionary inline, and the policy now tracks this directly. For the `LegacyPolicy`, if an inline is both smaller than the always inline size and marked force inline, attribute its inlining to its small size. Remove bulk of the of the "evaluation in progress" candidate observations since they don't add much value. As a result `noteCandidate` is not needed as an external API, and can be repurposed internally as `setCandidate`. Change how the prejit root case is tracked to make it explicit from the context rather than a callback to the compiler.
2016-03-04Inline refactoring: convert hints into observationsAndy Ayers1-3/+22
This change updates the inlining code to use observations in place of the InlineHints and hint-like things (eg "HasSimd"). A number of new observations were added in support of this. The `compInlineeHints` member of the compiler object was removed as the same information is now tracked by the inline policy. The policy also now contains most of the weights and combining logic used to compute the profitabiliy boost for the candidate. There is one subtle and tricky aspect to the change. For the most part the hints were ignored during the prejit-root analysis, but the mostly load-store hint was propagated and influenced the inlines done when crossgenning mscorlib. See #3482 for more on this particular quirk. I've preserved this behavior by "passing" the prejit inline result into `fgFindJumpTargets`. In actuality the result is passed by temporarily setting the `compInlineResult` compiler member variable. Then, in `fgFindJumpTargets`, the load-store observation is the only one not guarded by `compIsForInlining`. In a subsequent change I'll redo the guards at other observation sites and put the policy aspects of these observations into the policy object. This addresses another piece of the work outlined in #3371.
2016-02-29Inliner refactoring: allow repeated bad observationsAndy Ayers1-54/+18
The LegacyPolicy now allows repeated observations leading to never or failing inlines, provided the policy is used as part of the prejit scan and the observations all have the same impact (that is, all nevers or all failures). Only the first bad observation is remembered and it is used as the reason for badness. This addresses another of the items in issue #3371. Also, clean up a few things: Fix a few copy and paste errors in comment headers. Fix broken "disallow copy-assignment" pattern for InlineResult and uncomment and fix same for InlinePolicy. Remove an accidentally duplicated comment block (original is still there in inline.h).
2016-02-25InlineRefactoring: capturing bits of legacy policyAndy Ayers1-1/+11
Rework the logic for force inline, basic block count, il size, and maxstack so that the policy decides when these values should inhibit inlining.
2016-02-25Inline Refactoring: set up policy for bad inlineesAndy Ayers1-0/+110
Move inline policies to their own header and cpp file. Add a method to the policy class to indicate if the policy wants newly discovered `Never` inline cases to change the callee method attributes to Noinline. This is an existing optimization that saves time when the jit sees calls to this callee elsewhere as a possible inline candidates. For example, in the trace below, at for the call at offset 31, the jit determines that `ToInt32` is too large to be inlined and so marks it as noinline. Then when it sees another call to `ToInt31` at offset 44 it immediately fails the inline attempt. ``` Inlines into RegistryTimeZoneInformation:.ctor(ref):this ... [IL=0031 TR=000040] [FAILED: too many il bytes] System.BitConverter:ToInt32(ref,int):int [IL=0044 TR=000049] [FAILED: noinline per IL/cached result] System.BitConverter:ToInt32(ref,int):int [IL=0057 TR=000058] [FAILED: noinline per IL/cached result] System.BitConverter:ToInt32(ref,int):int ``` Diagnostic and experimental policies may choose to disable this optimization to make it easier to locally reason about failed inlines. There were 5 calls to `setMethodAttribs` passing `CORINFO_FLG_BAD_INLINEE`. This change consolidates 4 of them into the inlining code. The remaining call is for a method with verification errors. I've left it as is.