Diffstat (limited to 'Documentation/project-docs/jit-testing.md')
-rw-r--r-- | Documentation/project-docs/jit-testing.md | 169
1 file changed, 169 insertions(+), 0 deletions(-)
diff --git a/Documentation/project-docs/jit-testing.md b/Documentation/project-docs/jit-testing.md
new file mode 100644
index 0000000000..63c6a63dd9
--- /dev/null
+++ b/Documentation/project-docs/jit-testing.md

# JIT Testing

We would like to ensure that CoreCLR contains sufficient test collateral
and tooling to enable high-quality contributions to RyuJIT or to LLILC's JIT.

JIT testing is somewhat specialized and can't rely solely on the general
framework tests or end-to-end application tests.

This document describes some of the work needed to bring existing JIT tests
and technology into CoreCLR, and touches on some areas that are open for
innovation.

We expect to evolve this document into a road map for the overall JIT testing
effort, and to spawn a set of issues in the CoreCLR and LLILC repos for
implementing the needed capabilities.

## Requirements and Assumptions

1. It must be easy to add new tests.
2. Tests must execute with high throughput. We anticipate needing to run
thousands of tests to provide baseline-level testing for JIT changes.
3. Tests should generally run on all supported/future chip architectures and
all OS platforms.
4. Tests must be partitionable so CI latency is tolerable (test latency goal
TBD).
5. Tests in CI can be run on private changes (currently tied to PRs; this may
be sufficient).
6. The test strategy must be harmonious with other .NET repo test strategies.
7. The test harness must behave reasonably on test failure, making it easy to
get at repro steps for subsequent debugging.
8. Tests must allow fine-grained inspection of JIT outputs, for instance
comparing the generated code against a baseline JIT.
9. Tests must support collection of various quantitative measurements, e.g.
time spent in the JIT, memory used by the JIT, etc.
10. For now, JIT test assets belong in the CoreCLR repo.
11. JIT tests use the same basic xunit test harness as existing CoreCLR tests.
12. JIT special-needs testing will rely on extensions/hooks. Examples below.

## Tasks

Below are some broad task areas that we should consider as part of this plan.
It seems sensible for Microsoft to focus on opening up the JIT self-host
(aka JITSH) tests first. A few other tasks are also Microsoft-specific and are
marked with (MS) below.

Beyond that, the priority, task list, and possibly assignments are open to
discussion.

### (MS) Bring up equivalent of the JITSH tests

JITSH is a set of roughly 8000 tests that have traditionally been used by
Microsoft JIT developers as the frontline JIT test suite.

We'll need to subset these tests for various reasons:

1. Some have shallow desktop CLR dependence (e.g. missing cases in string
formatting).
2. Some have deep desktop CLR dependence (testing a desktop CLR feature that
is not present in CoreCLR).
3. Some require tools not yet available in CoreCLR (ilasm in particular).
4. Some test Windows features and won't be relevant to other OS platforms.
5. Some tests may not be freely redistributable.

We have done an internal inventory and identified roughly 1000 tests that
should be straightforward to port to CoreCLR, and have already started
moving these.

### Test script capabilities

We need to ensure that the CoreCLR repo contains a suitably hookable test
script. Core testing is driven by xunit, but there's typically a wrapper
around this (runtest.cmd today) to facilitate test execution.

The proposal is to implement a platform-neutral variant of runtest.cmd that
contains all the existing functionality plus some additional capabilities for
JIT testing. Initially this will mean:

1. The ability to execute tests with a JIT specified by the user (either as an
alt JIT or as the only JIT).
2. The ability to pass options through to the JIT (e.g. for dumping assembly
or IR) or to CoreCLR (e.g. to disable use of ngen images).
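As a minimal sketch of how a platform-neutral wrapper might provide these two
capabilities, the snippet below maps them onto CoreCLR's existing `COMPlus_*`
configuration knobs (`COMPlus_AltJitName`/`COMPlus_AltJit` select an alternate
JIT, `COMPlus_JitDisasm` dumps generated assembly, `COMPlus_ZapDisable`
disables use of ngen images). The function names and the choice of Python are
hypothetical, not a decision this document makes:

```python
import os
import subprocess

def jit_test_env(alt_jit=None, jit_options=None, disable_ngen=True):
    """Build the child-process environment for one JIT test run.

    alt_jit:      filename of an alternate JIT library (e.g. a LLILC build),
                  mapped onto the COMPlus_AltJit* knobs.
    jit_options:  dict of extra COMPlus_* settings passed through verbatim,
                  e.g. {"JitDisasm": "Main"} to dump generated assembly.
    disable_ngen: force methods to be jitted rather than loaded from
                  ngen images.
    """
    env = dict(os.environ)
    if alt_jit:
        env["COMPlus_AltJitName"] = alt_jit
        env["COMPlus_AltJit"] = "*"  # use the alt JIT for all methods
    if disable_ngen:
        env["COMPlus_ZapDisable"] = "1"
    for knob, value in (jit_options or {}).items():
        env["COMPlus_" + knob] = value
    return env

def run_test(corerun, test_assembly, **kwargs):
    """Execute one xunit test wrapper under the configured JIT."""
    return subprocess.call([corerun, test_assembly],
                           env=jit_test_env(**kwargs))

# Example (hypothetical paths): run a test under an LLILC build,
# dumping the assembly generated for Main:
# run_test("corerun", "JIT/Methodical/test.exe",
#          alt_jit="libLLILCJit.so", jit_options={"JitDisasm": "Main"})
```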
### Cache prebuilt test assets

In general we want JIT tests to be built from sources, but given the volume
of tests it can take a significant amount of time to compile those sources
into assemblies. This in turn slows down the ability to test the JIT.

Given the volume of tests, we might reach a point where the default CoreCLR
build does not build all the tests.

So it would be good if there were a regularly scheduled build of CoreCLR that
would prebuild a matching set of tests and make them available.

### Round out the JITSH suite, filling in missing pieces

We need some way to run ilasm. Some suggestions are to port the existing
ilasm, or to find some equivalent we could run instead. We could also prebuild
IL-based tests and deploy them as a package. Around 2400 JITSH tests are
blocked by this.

There are also some VB tests, which presumably can be brought over now that VB
projects can build.

Native/interop tests may or may not require platform-specific adaptation.

### (MS) Port the devBVT tests

devBVT is a broader part of CLR SelfHost that is useful for second-tier
testing. It is not yet clear what porting it entails.

### Leverage peer repo test suites

We should be able to directly leverage tests provided in peer repo suites,
once they can run on top of CoreCLR. In particular, CoreFX and Roslyn test
cases could be good initial targets.

Note that LLILC is currently working through the remaining issues that prevent
it from being able to compile all of Roslyn. See the "needed for Roslyn" tags
on the open LLILC issues.

### Look for other CoreCLR-hosted projects

Similar to the above, as other projects become able to host on CoreCLR we can
potentially use their tests for JIT testing.

### Port existing suites/tools over to our repos

Tools developed to test JVM JITs might be interesting to port to .NET.
Suggestions for best practices or effective techniques are welcome.

### Bring up quantitative measurements
For JIT testing we'll need various quantitative assessments of JIT behavior:

1. Time spent jitting
2. Speed of jitted code
3. Size of jitted code
4. Memory utilization by the JIT (plus leak detection)
5. Debug info fidelity
6. Coverage?

There will likely be work going on elsewhere to address some of these same
measurement capabilities, so we should make sure to keep it all in sync.

### Bring up alternate codegen capabilities

For LLILC, implementing support for crossgen would provide the ability to
drive lots of IL through the JIT. There is enough similarity between the JIT
and crossgen paths that this would likely surface issues in both.

Alternatively, one can imagine simple test drivers that load assemblies and
use reflection to enumerate methods, asking for method bodies to force the JIT
to generate code for all the methods.

### Bring up stress testing

The value of existing test assets can be leveraged through various stress
testing modes. These modes use non-standard code generation or runtime
mechanisms to try and flush out bugs:

1. GC stress. Here the runtime GCs with much higher frequency in an attempt
to maximize the dependence on the GC info reported by the JIT.
2. Internal stress modes in the JIT, e.g. randomized inlining, register
allocation stress, volatile stress, randomized block layout, etc.

### Bring up custom testing frameworks and tools

We should invest in things like random program or IL generation tools.
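As one illustration of the kind of tool this might mean, here is a small
hypothetical sketch of a seed-driven random program generator (all names are
illustrative, not an existing tool). It emits tiny self-contained C# programs
whose printed results a harness could diff between a baseline JIT and the JIT
under test; the seed makes any failure reproducible:

```python
import random

def gen_expr(rng, depth=0):
    """Recursively build a random integer expression over locals a, b and c."""
    if depth > 3 or rng.random() < 0.4:
        return rng.choice(["a", "b", "c", str(rng.randint(0, 9))])
    op = rng.choice(["+", "-", "*"])
    return "(%s %s %s)" % (gen_expr(rng, depth + 1), op,
                           gen_expr(rng, depth + 1))

def gen_test_case(seed):
    """Emit a self-contained C# source file for the given seed.

    A harness would compile this once, run it under both the baseline JIT and
    the JIT under test, and diff the printed results.
    """
    rng = random.Random(seed)
    decls = "\n".join("        int %s = %d;" % (v, rng.randint(-50, 50))
                      for v in ("a", "b", "c"))
    return ("public static class RandomTest_%d\n{\n"
            "    public static int Main()\n    {\n"
            "%s\n"
            "        System.Console.WriteLine(unchecked(%s));\n"
            "        return 100;\n"  # CoreCLR tests signal success with 100
            "    }\n}\n") % (seed, decls, gen_expr(rng))

print(gen_test_case(42))
```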