Age | Commit message (Collapse) | Author | Files | Lines |
|
The encoder was using size_t, a 32-bit type on x86, to accumulate the opcode
and prefix bits to emit. AVX support uses 3-byte prefixes, which occupy
bits above what a 32-bit type can hold. So, change all code-byte-related types
from size_t to a new code_t, defined as "unsigned __int64" on RyuJIT x86
(there is precedent for this type on the ARM architectures).
Fixes #8331
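A minimal sketch of the overflow described above. The name `code_t` comes from the commit; the helpers `AddPrefix` and `FitsIn32Bits`, and the layout that places the prefix directly above a 4-byte opcode field, are assumptions made for illustration and are not the JIT's actual encoding.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the commit's code_t ("unsigned __int64" on RyuJIT x86).
typedef std::uint64_t code_t;

// Illustrative only: place a 3-byte prefix above a 4-byte opcode field.
// This layout is an assumption for the sketch, not the real emitter layout.
inline code_t AddPrefix(code_t opcode, code_t prefix3)
{
    assert(opcode <= 0xFFFFFFFFu); // opcode field is at most 4 bytes
    assert(prefix3 <= 0xFFFFFFu);  // prefix is at most 3 bytes
    return (prefix3 << 32) | opcode;
}

// True if the value would survive a round trip through a 32-bit accumulator,
// which is what size_t is on x86.
inline bool FitsIn32Bits(code_t code)
{
    return static_cast<code_t>(static_cast<std::uint32_t>(code)) == code;
}
```

With these definitions, a prefixed code such as `AddPrefix(0x58, 0xC4E1F9)` no longer fits in 32 bits, which is the truncation a 32-bit accumulator would suffer.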
|
|
|
|
The GC info encoding for this platform does not permit marking
arbitrary regions of generated code as non-interruptible.
|
|
|
|
Remove `ARM_HAZARD_AVOIDANCE` as it is a rare case.
For #7003
|
|
This change is the result of running clang-tidy and clang-format on jit
sources.
|
|
This change starts the process of updating the jit code to make it ready
for being formatted by clang-format. Changes mostly include reflowing
comments that go past our column limit and moving comments around ifdefs
so clang-format does not modify the indentation. Additionally, some
header files are manually reformatted for pointer alignment and marked
as clang-format off so that we do not lose the current formatting.
|
|
Added method IsMultiRegPassedType and updated IsMultiRegReturnType
Switched these methods to using getArgTypeForStruct and getReturnTypeForStruct
Removed IsRegisterPassable and used IsMultiRegReturned instead.
Converted lvIsMultiregStruct to use getArgTypeForStruct
Renamed varDsc->lvIsMultiregStruct() to compiler->lvaIsMultiregStruct(varDsc)
Skip calling getPrimitiveTypeForStruct when we have a struct larger than 8 bytes
Refactored ReturnTypeDesc::InitializeReturnType
Fixed missing SPK_ByReference case in InitializeReturnType
Fixes for RyuJIT x86 TYP_LONG return types and additional ARM64 work for full multireg support
For ARM64, guarded the uses of MAX_RET_MULTIREG_BYTES with FEATURE_MULTIREG_RET
Fixes for multireg returns in Arm64 Codegen
Added dumping of lvIsMultiRegArg and lvIsMultiRegRet in the assembly output
Added check and set of compFloatingPointUsed to InitializeStructReturnType
Fixes to handle JIT helper calls that say they return a TYP_STRUCT with no class handle available
Placed all of the second GC return reg under MULTIREG_HAS_SECOND_GC_RET ifdefs
Added the Arm64 VM changes from Rahul's PR 5175
Update getArgTypeForStruct for x86/arm32 so that it returns TYP_STRUCT for all pass by value cases
Fixes for the passing of 3,5,6 or 7 byte sized structs
Fix issue on ARM64 where we would back fill into x7 after passing a 16-byte struct on the stack
Implemented register shuffling for multi reg Call returns on Arm64
Fixed regression on Arm32 for struct args that are not multi regs
Updated Tests.Lst with 23 additional passing tests
Changes from codereview feedback
|
|
This fixes an ARM64 Linux compilation error:
```
/home/sjlee/git/coreclr/src/jit/emit.h:988:61: error: 109 enumeration
values not handled in switch: 'IF_NONE', 'IF_LABEL', 'IF_EN9'...
[-Werror,-Wswitch]
```
|
|
The recent commit (61fe464) breaks the ARM build.
It defines emitInsToJumpKind twice in src/jit/emit.h on ARM.
Signed-off-by: Dongyun Jin <dongyun.jin@samsung.com>
Signed-off-by: Jiyoung Yun <jy910.yun@samsung.com>
|
|
Fixes https://github.com/dotnet/coreclr/issues/3668
Currently ARM64 codegen can only reference targets within +/-1 MB because of an encoding
restriction in the `b<cond>/adr/ldr` instructions. This is normally okay
as long as each function is reasonably small, but it does not work for large methods, which
can also result from aggressive inlining, as in crossgen/corert scenarios.
In addition, long addresses are a prerequisite for hot/cold code separation,
since a reference can cross regions that are placed arbitrarily.
In fact, that needs additional relocations, which are not in this change yet.
In detail, this change emits the long-address form for conditional jump, address loading, and constant
loading operations by default; `emitJumpDistBind()` can shorten them later
if they fit into the smaller encoding. Logically
those operations can now reach within a +/-4GB address range.
Note that I haven't extended the unconditional jump in this change, for simplicity,
so it can reach within +/-128MB, the same as before.
`emitOutputLJ` is extended to finally encode these operations.
Three pseudo-instructions are introduced. Each can be expanded into either
a short or a long form.
1. Conditional jump. See `emitIns_J()`
a. Short form(`IF_BI_0B`): `b<cond> rel_addr`
b. Long form(`IF_LARGEJMP`):
```
b<rev cond> $LABEL
b rel_addr (unconditional jump)
$LABEL:
```
2. Load label(address computation). See `emitIns_R_L()`
a. Short form(`IF_DI_1E`): `adr x, [rel_addr]`
b. Long form(`IF_LARGEADR`):
```
adrp x, [rel_page_addr]
add x, x, page_offs
```
3. Load constant (from JIT data). See `emitIns_R_C()`
a. Short form(`IF_LS_1A`): `ldr x, [rel_addr]`
b. Long form(`IF_LARGELDC`):
```
adrp x, [rel_page_addr]
ldr x, [x, page_offs]
(fmov v, x in case loading vector constant)
```
In addition, JIT data is aligned on 8 bytes so that it is accessible from the
large load. Replaced JitLargeBranches with JitLongAddress to stress-test these
operations.
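The short/long decision described above can be sketched as follows. The function and type names (`FitsShortForm`, `SplitForAdrp`, `FitsLongForm`, `PageAddress`) are hypothetical and do not appear in the JIT; only the ranges come from the ARM64 encodings: +/-1 MB for `b<cond>`, `adr`, and literal `ldr`, and +/-4 GB in 4 KB pages for `adrp`.

```cpp
#include <cassert>
#include <cstdint>

// Short forms reach +/-1 MB from the instruction.
inline bool FitsShortForm(std::int64_t byteDistance)
{
    const std::int64_t limit = std::int64_t(1) << 20; // 1 MB
    return byteDistance >= -limit && byteDistance < limit;
}

// What the long form needs: a page delta for adrp and a low 12-bit
// offset for the following add or ldr.
struct PageAddress
{
    std::int64_t  pageDelta;  // in 4 KB pages, relative to the page of pc
    std::uint32_t pageOffset; // byte offset within the target page
};

inline PageAddress SplitForAdrp(std::uint64_t pc, std::uint64_t target)
{
    PageAddress result;
    result.pageDelta  = static_cast<std::int64_t>(target >> 12) - static_cast<std::int64_t>(pc >> 12);
    result.pageOffset = static_cast<std::uint32_t>(target & 0xFFF);
    return result;
}

// adrp carries a signed 21-bit page count, so it reaches +/-4 GB.
inline bool FitsLongForm(const PageAddress& address)
{
    assert(address.pageOffset < 4096);
    const std::int64_t limit = std::int64_t(1) << 20; // 2^20 pages of 4 KB
    return address.pageDelta >= -limit && address.pageDelta < limit;
}
```

A pass in the spirit of `emitJumpDistBind()` would keep the long form unless `FitsShortForm` holds for the final distance; how the real pass decides is not shown in this message.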
|
|
Fixes https://github.com/dotnet/coreclr/issues/4350
Fixes https://github.com/dotnet/coreclr/issues/4615
This is a fairly large change across VM/Zap/JIT to properly support the crossgen
scenario.
1. Fix incorrect `ldr` encoding with size.
2. Enforce JIT data following JIT code per method by allocating them together.
This guarantees correct PC-relative encoding for such constant data access
without fix-up.
3. For general fix-up data access, use `adrp/add` instruction pairs with fix-ups.
Two more relocation types are implemented on all sides.
4. The interface dispatch stub, which is needed for interface calls under
crossgen, is now implemented.
I've verified hello world runs with mscorlib.ni.dll.
|
|
This assert hit in the encoder when we were trying to generate an
INS_shl_N with a constant of 1, instead of using the special xarch
INS_shl_1 encoding, which saves a byte. It turns out, the assert was
and in fact amd64 does generate the suboptimal encoding currently.
The bad code occurs in the RMW case of genCodeForShift(). It turns out
that function is unnecessarily complex, unique (it doesn't use the
common RMW code paths), and has a number of other latent bugs.
To fix this, I split genCodeForShift() by leaving the non-RMW case
there, and adding a genCodeForShiftRMW() function just for the RMW case.
I rewrote the RMW case to use the existing emitInsRMW functions.
Other related cleanups along the way:
1. I changed emitHandleMemOp to consistently set the idInsFmt field,
and changed all callers to stop pre-setting or post-setting this field.
This makes the API much easier to understand. I added a big header
comment for the function. Now, it takes a "basic" insFmt (using ARD,
AWR, or ARW forms), which might be munged to a MRD/MWR/MRW form
if necessary.
2. I changed some places to always use the most derived GenTree type
for all uses. For instance, if the code has
"GenTreeIndir* mem = node->AsIndir()", then always use "mem" from then
on, and don't use "node". I changed some functions to take more derived
GenTree node types.
3. I rewrote the emitInsRMW() functions to be much simpler, and rewrote
their header comments.
4. I added GenTree predicates OperIsShift(), OperIsRotate(), and
OperIsShiftOrRotate().
5. I added function genMapShiftInsToShiftByConstantIns() to encapsulate
mapping from INS_shl to INS_shl_N or INS_shl_1 based on a constant.
This code was in 3 different places already.
6. The change in assertionprop.cpp is simply to make JitDumps readable.
In addition to fixing the bug for RyuJIT/x86, there are a small number
of x64 diffs where we now generate smaller encodings for shift by 1.
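Item 5 above can be sketched as follows. The enum is a hypothetical stand-in for the JIT's much larger instruction enum, the `shr` entries are assumed by analogy with the `shl` names in the message, and the signature is a guess rather than the real one for genMapShiftInsToShiftByConstantIns().

```cpp
#include <cassert>

// Hypothetical, reduced instruction enum for this sketch.
enum Instruction
{
    INS_shl, INS_shl_1, INS_shl_N,
    INS_shr, INS_shr_1, INS_shr_N
};

// Map a variable-count shift to its shift-by-constant form. A count of 1
// selects the dedicated "_1" encoding, which saves a byte; any other
// count selects the "_N" form that carries an immediate.
inline Instruction MapShiftInsToShiftByConstantIns(Instruction ins, int shiftByValue)
{
    assert(shiftByValue >= 0);
    const bool byOne = (shiftByValue == 1);
    switch (ins)
    {
        case INS_shl:
            return byOne ? INS_shl_1 : INS_shl_N;
        case INS_shr:
            return byOne ? INS_shr_1 : INS_shr_N;
        default:
            return ins; // already a constant form; left unchanged in this sketch
    }
}
```

Keeping the mapping in one function means a shift by 1 cannot reach the encoder as an `_N` form, which is the case that tripped the assert.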
|
|
This fixes https://github.com/dotnet/coreclr/issues/3738.
The fix I made in
https://github.com/dotnet/coreclr/commit/4dfd323dab88b902fc9479efa60cb5d6b7659e94
used the 3rd/4th operands to keep GC info, which actually conflicts with
the address field that is unioned with these operands.
So, I am going back to the original fix that I proposed, described below:
Indirect call (```br``` or ```blr```) target is encoded with a register
which the first operand internally represents.
Unfortunately, call sites use the first two operands to hold GC callee-save registers.
So, this GC register information was overwritten by the call target operand
in the indirect (virtual) call sites.
The fix is to split the branch instruction categories for these two instructions
while keeping ```ret``` the same as before. They internally use the third
operand to encode the target since the first two are used for GC info.
The reason I didn't change ```ret``` is that the return instruction
is created as a small instruction (not using operand 3 and above).
|
|
This fixes dotnet#3663.
Indirect call (```br``` or ```blr```) target is encoded with a register
which the first operand internally represents.
Unfortunately, call sites use the first two operands to hold GC
callee-save registers.
So, this GC register information was overwritten by the call target operand
in the indirect (virtual) call sites.
The fix is to use the 3rd/4th operands instead of the 1st/2nd operands to hold GC info.
Ideally we should use a different field name and also ensure constness when
we set up the operand so that it's never written more than once.
https://github.com/dotnet/coreclr/issues/3693 is filed.
|
|
structs.
This changeset adds support for emitting the GC-ness of the second return
register (RDX for System V Amd64) for multi-register return structs.
It closes a hole in the GC info and resolves multiple GC stress failures.
Fixes #2757.
|
|
|
|
http://blogs.msdn.com/b/dotnet/archive/2015/11/30/net-framework-4-6-1-is-now-available.aspx
.NET Framework list of changes in 4.6.1
https://github.com/Microsoft/dotnet/blob/master/releases/net461/dotnet461-changes.md
Additional changes including
- Working ARM64 JIT compiler
- Additional JIT Optimizations
o Tail call recursion optimization
o Array length tracking optimization
o CSE for widening casts
o Smaller encoding for RIP relative and absolute addresses in addressing modes
o Tracked Local Variable increased to 512
o Improved handling of Intrinsics System.GetType()
o Improved handling of Math intrinsics
- Work for the X86 Ryu-JIT compiler
[tfs-changeset: 1557101]
|
|
c_runtime/vprintf/test1 is disabled as casting NULL to va_list is
against the C specification.
Fix SetFilePointer tests on 32 bit platforms.
Define _FILE_OFFSET_BITS=64 so that we have long file support on 32 bit
platforms.
Implement context capture/restore for ARM.
Link libgcc_s before libunwind on ARM so C++ exceptions work.
Translate armasm to gas syntax.
Specify Thumb, VFPv3, ARMv7 for the ARM target.
Add ARM configuration to mscorlib build
Implement GetLogicalProcessorCacheSizeFromOS in PAL.
Set UNWIND_CONTEXT_IS_UCONTEXT_T from configure check.
|
|
[tfs-changeset: 1421297]
|
|
[tfs-changeset: 1407945]
|