Age | Commit message (Collapse) | Author | Files | Lines |
|
indirection
|
|
|
|
Simple HelloWorld app can run with NI of System.Private.CoreLib.dll, HelloWorld.dll and System.Console.dll
|
|
This patch enables more generic method inlining for methods which are not compiled by NI.
|
|
|
|
* Implement instantiating and unboxing through portable stublinker code
- Handle only the cases with register to register moves
- Shares abi processing logic with delegate shuffle thunk creation
- Architecture specific logic is relatively simple
- Do not permit use of HELPERREG in computed instantiating stubs
- Fix GetArgLoc such that it works on all architectures and OS combinations
Add a JIT stress test case for testing all of the various combinations
- Use the same calling convention test architecture that was used as part of tail call work
Rename secure delegates to wrapper delegates
- Secure delegates are no longer a feature of the runtime
- But the wrapper delegate lives on as a workaround for a weird detail of the ARM32 abi
|
|
* Fix GenerateShuffleArray to support cyclic shuffles
GenerateShuffleArray did not handle the case where the register / stack
slot shuffle contained a cycle, which resulted in an infinite loop
in this function. This issue is specific to the Unix Amd64 ABI.
To fix that, this change reworks the algorithm completely. Besides
fixing the issue, it also performs better in some cases.
To fix the cyclic shuffling, I needed an extra helper register. However,
no general purpose register was available, so I had to
use xmm8 for this purpose.
* Remove special handling of the hang from ABI stress
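The cyclic shuffle problem described above can be illustrated with a small sketch. This is not the runtime's GenerateShuffleArray; it is a simplified model, assuming each location is just a name and that one scratch location (called `xmm8` here only to mirror the message) is always free. The function and parameter names are hypothetical.

```python
def resolve_shuffle(moves, scratch="xmm8"):
    """Turn parallel moves {dst: src} into an ordered list of (src, dst) steps.

    A destination is written only when no pending move still reads from it.
    If every pending destination is also a pending source, the moves form a
    cycle, which is broken by parking one value in the scratch location.
    """
    pending = {dst: src for dst, src in moves.items() if dst != src}
    steps = []
    while pending:
        sources = set(pending.values())
        ready = [dst for dst in pending if dst not in sources]
        if ready:
            # Safe to overwrite: nobody still needs the old value of dst.
            for dst in ready:
                steps.append((pending.pop(dst), dst))
            continue
        # Cycle: save one destination's current value, then redirect the
        # moves that wanted to read it to the scratch location instead.
        victim = next(iter(pending))
        steps.append((victim, scratch))
        for dst, src in pending.items():
            if src == victim:
                pending[dst] = scratch
    return steps


def apply_steps(steps, values):
    """Execute the ordered steps on a copy of the location -> value map."""
    values = dict(values)
    for src, dst in steps:
        values[dst] = values[src]
    return values


# A three-way rotation plus one extra reader of rdi.
moves = {"rdi": "rsi", "rsi": "rdx", "rdx": "rdi", "rcx": "rdi"}
steps = resolve_shuffle(moves)
result = apply_steps(steps, {"rdi": 1, "rsi": 2, "rdx": 3, "rcx": 4})
```

A naive loop that only waits for a free destination never makes progress on the rotation, which matches the infinite loop the message describes; the scratch location is what lets the sketch terminate.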
|
|
Turn on ASan interceptors while marshaling managed buffers to native code.
We cannot properly annotate buffers that are already allocated on the heap, so
we have to disable pinning of such objects.
The current patch affects only pinning of native arrays.
|
|
* Fix PIE options
We were missing passing the -pie linker option. That means that while we
were compiling our code as position independent, the executables
(not shared libraries) were not marked as position independent and
ASLR was not applied to them. They were always loaded to fixed addresses.
This change adds the missing -pie option and also replaces all the individual
settings of -fPIE / -fPIC on the targets we build by a centralized setting
of CMAKE_POSITION_INDEPENDENT_CODE variable that causes cmake to add the
appropriate compiler options everywhere.
* Fix native parts of coreclr tests build
The native parts of the tests are not built using the root CMakeLists.txt
so I am moving enabling the position independent code to configurecompiler.cmake
Change-Id: Ieafff8984ec23e5fdb00fb0c2fb017e53afbce88
|
|
Many diagnostic tools are unaware of 32-bit applications which have
large address spaces (> 2GB). Such tools include the TraceEvent library
(required by PerfView and dotnet-trace), and Visual Studio. They assume
the address range 0x80000000 through 0xFFFFFFFF as the system space and
thus often fail to read symbols from event traces generated by CoreCLR.
This workaround supports such scenarios by simply discarding the MSBs
of 32-bit instruction pointer values in the trace output. Only the minimal
set of values required for symbol resolution is affected by this
change. Beware that you will have to manually restore the original
values when you inspect them in lldb or a similar tool.
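As a rough illustration of the masking described above, the sketch below assumes that "discarding MSBs" means clearing bit 31, so that an address in the 0x80000000 through 0xFFFFFFFF range lands below 0x80000000. The helper names are hypothetical and the actual runtime change may differ.

```python
HIGH_BIT = 0x80000000  # assumption: only bit 31 is discarded


def ip_for_trace(ip):
    """Return the 32-bit instruction pointer as it would appear in the trace."""
    return ip & ~HIGH_BIT & 0xFFFFFFFF


def restore_ip(traced_ip):
    """Manually restore a value that is known to have had bit 31 set.

    The trace value alone does not say whether the bit was originally set,
    so this is only valid when that is known from other context.
    """
    return traced_ip | HIGH_BIT


high = ip_for_trace(0xF7A01234)
low = ip_for_trace(0x00401000)
```

Addresses already below 0x80000000 pass through unchanged, which is why only code loaded in the upper half needs the manual restore step in a debugger.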
|
|
|
|
- After relocation, the relocation section in the zap image is no longer necessary.
- Mark the section as NotNeeded by giving advice (madvise with MADV_DONTNEED).
- This reduces PSS by 120~150KB in Tizen sample apps.
|
|
|
|
Change-Id: I48446ce7c8771a4c75149512bb7d8a8cb3fae8e5
Signed-off-by: Vyacheslav Cherkashin <v.cherkashin@samsung.com>
|
|
This commit implements wrappers that allow intercepting transitions
from managed to external unmanaged code (CIL -> native) and back
(native -> CIL). This makes it possible to enable/disable ASan during
transitions. As a result, we sanitize only external code, which allows
us to achieve acceptable performance.
Change-Id: I53ecdc14d28f7210cd9e7f5bd4db0c8ef5ed81fc
Signed-off-by: Vyacheslav Cherkashin <v.cherkashin@samsung.com>
|
|
|
|
This fix updates usages of SetupGcCoverage() under
FEATURE_PREJIT to align with the signature change in #25261.
|
|
pCurrentContextPointers in REGDISPLAY can contain NULLs, so we need to use
the ebp value from pCurrentContext. This patch contains the following changes:
- GetRegdisplayFP returns ebp from pCurrentContext
- GetRegdisplayFP is used instead of *GetEbpLocation()
- Set##reg##Location also updates register value in pCurrentContext
|
|
- Profile information is collected by the IBC logger.
However, it is neither used nor saved into a profile file.
- The patch disables the IBC logger, which is enabled by default.
- It disables the IBC logger by changing only the ibclogger.h file.
The IBCLOGGER_ENABLED definition is used only in the ibclogger files.
|
|
- Even if DACCESS_COMPILE or CROSSGEN_COMPILE is defined,
coreclr can be built without IBCLOGGER_ENABLED definition.
|
|
|
|
- In many cases cooperative GC mode is entered after acquiring the slot backpatching lock and the thread may block for debugger suspension while holding the lock. A FuncEval may time out on entering the lock if for example it calls a virtual or interface method for the first time. Failing the FuncEval when the lock is held enables the debugger to fall back to other options for expression evaluation.
- Also added polls for debugger suspension before acquiring the slot backpatching lock on background threads that often operate in preemptive GC mode. A common case is when the debugger breaks while the tiering delay timer is active, the timer ticks shortly afterwards (after debugger suspension completes) and if a thread pool thread is already available, the background thread would block while holding the lock. The poll checks for debugger suspension and pulses the GC mode to block before acquiring the lock.
Risks:
- The fix is only a heuristic and lessens the problem when it is detected that the lock is held by some thread. Since the lock is acquired in preemptive GC mode, it is still possible that after the check at the start of a FuncEval, another thread acquires the lock and the FuncEval may time out. The polling makes it less likely for the lock to be taken by background tiering work, for example if a FuncEval starts while rejitting a method.
- The expression evaluation experience may be worse when it is detected that the lock is held, and may still happen from unfortunate timing
- Low risk for the change itself
Port of https://github.com/dotnet/runtime/pull/2380
Fix for https://github.com/dotnet/runtime/issues/1537
|
|
Ports change #26873 to release 3.1 branch.
On OpenVZ virtualized Linux, GetCurrentProcessorNumber, which uses sched_getcpu(),
can return a value greater than the number of processors reported by
sched_getaffinity with CPU_COUNT or sysconf(_SC_NPROCESSORS_ONLN).
For example, taskset -c 2,3 ./MyApp will make CPU_COUNT be 2 but
sched_getcpu() can return 2 or 3, and the OpenVZ kernel can make
sysconf(_SC_NPROCESSORS_ONLN) return a limited CPU count while
sched_getcpu() still reports the real processor number.
Example of affinity vs current CPU id on OpenVZ:
nproc: 8
nprocOnline: 1
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 2
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 2
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 2
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 2
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 2
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 5
affinity: 1, 0, 0, 0, 0, 0, 0, 0, cpuid: 5
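The mismatch in the sample above can be checked with a few lines. This is only a sketch over the numbers printed in the message, with a hypothetical helper name; it does not call sched_getcpu() or sched_getaffinity itself.

```python
def out_of_range_cpu_ids(affinity, observed_cpu_ids):
    """Return (cpu_count, ids) where ids are the observed CPU ids that would
    overflow a table sized by the number of CPUs set in the affinity mask."""
    cpu_count = sum(affinity)  # plays the role of CPU_COUNT on the mask
    bad = sorted({cpu for cpu in observed_cpu_ids if cpu >= cpu_count})
    return cpu_count, bad


# Values taken from the OpenVZ sample output above.
affinity = [1, 0, 0, 0, 0, 0, 0, 0]
observed = [2, 2, 2, 2, 2, 5, 5]
cpu_count, bad_ids = out_of_range_cpu_ids(affinity, observed)
```

Any code that indexes a per-processor array of length `cpu_count` with the current CPU id would go out of bounds for every id in `bad_ids`.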
|
|
Ports https://github.com/dotnet/runtime/pull/206 to release/3.1.
The code in PAL_GetCurrentThreadAffinitySet relied on the fact that the
number of processors reported as configured in the system is always
larger than the maximum CPU index. However, it turns out that it is not
true on some devices / distros. The Jetson TX2 reports CPUs 0, 3, 4 and
5 in the affinity mask; CPUs 1 and 2 are never reported. GLIBC reports
6 as the number of configured CPUs, however MUSL reports just 4. The
PAL_GetCurrentThreadAffinitySet was using the number of CPUs reported as
configured as the upper bound for scanning affinity set, so on Jetson
TX2, the affinity mask returned had just two bits set while there were
4 CPUs. That triggered an assert in the GCToOSInterface::Initialize.
This change fixes that by reading the maximum CPU index from the
/proc/cpuinfo. It falls back to using the number of processors
configured when the /proc/cpuinfo is not available (on macOS, FreeBSD, ...)
Fixes https://github.com/dotnet/runtime/issues/170
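The idea of taking the scan bound from /proc/cpuinfo can be sketched as follows. This is not the PAL code; it is a minimal model, assuming the file contains lines of the form `processor : N`, with a hypothetical function name and the file content passed in as text so the fallback path is easy to see.

```python
import re

# Matches lines such as "processor\t: 5"; the match is case-sensitive.
_PROCESSOR_LINE = re.compile(r"^processor\s*:\s*(\d+)\s*$", re.MULTILINE)


def affinity_scan_limit(cpuinfo_text, configured_count):
    """Return how many CPU indices to scan in the affinity set.

    Uses the highest processor index found in the cpuinfo text plus one.
    Falls back to the configured processor count when the text is missing
    or contains no processor lines.
    """
    if cpuinfo_text:
        indices = [int(m.group(1)) for m in _PROCESSOR_LINE.finditer(cpuinfo_text)]
        if indices:
            return max(indices) + 1
    return configured_count


# Indices as described for the Jetson TX2: CPUs 0, 3, 4 and 5.
jetson_like = "processor\t: 0\nprocessor\t: 3\nprocessor\t: 4\nprocessor\t: 5\n"
limit = affinity_scan_limit(jetson_like, configured_count=4)
fallback = affinity_scan_limit(None, configured_count=4)
```

With a configured count of 4 as the bound, only indices 0 through 3 would be scanned and two of the four CPUs would be seen, which is the situation the message describes; a bound of 6 covers indices 4 and 5 as well.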
|
|
interface methods (#27756) (#27868)
* Test for non_virtual_calls_to_instance_methods
* Fix handling of callvirt to instance methods on interface types that are not virtual
- Use call type to indicate if it's non-virtual or not, instead of opcode
* Use ilproj to protect against future C# compiler changes
- This test needs to test the use of specific opcodes, and so an IL proj is required
|
|
* Comment rewordings required by policheck.
* Use AppContainer instead of Windows Store.
|
|
Fix a function that was ifdef'ed out but is needed for the metadata locator callbacks to work.
Fix some code in the type desc and server GC code that was not properly DAC'ized. It caused an exception during dump generation.
Issue: https://github.com/dotnet/coreclr/issues/26907
|
|
to 3.1 (#27512)
|
|
processing during shutdown (#27241)
* Protect against a rare invalid lock acquisition attempt during etw processing during abrupt shutdown
Targeted and partial fix for https://github.com/dotnet/coreclr/issues/27129
- This is not a generic fix for the issue above, it is only a very targeted fix for an issue seen (a new issue introduced in 3.x). For a generic fix and more details, see the fix in 5.0: https://github.com/dotnet/coreclr/pull/27238.
- This change avoids taking a lock during process detach - a point in time when all other threads have already been abruptly shut down by the OS and locks may have been orphaned.
- The issue leads to a hang during shutdown when ETW tracing is enabled and the .NET process being traced begins the shutdown sequence at an unfortunate time. This is probably a rare timing issue. The shutdown sequence would have to begin at just the point when a thread holds a particular lock and is terminated by the OS while holding the lock; the OS then sends the process detach event to the CLR, and the work done in response tries to acquire the lock and cannot because it is orphaned.
- The generic fix has broader consequences and is unlikely to be a reasonable change to make so late in the cycle, such a change needs some bake time and feedback. Hence this targeted fix for 3.x.
* Report tier as unknown when it cannot be determined
* Return unknown only on process detach
|
|
(#27294)
* Fix macro redefinition to use XplatEventLogger instead of simply writing FALSE
* Fix linker error
* undo newline changes
* Some changes to comment
* Move wrapper export from eventpipe.cpp to eventtrace.cpp
|
|
* only set THREAD_IS_SUSPENDED if we are truly doing an async stack walk (#26985)
* Use a new COR_PRF_SUSPEND_FOR_PROFILER in ICorProfilerCallback::RuntimeThreadSuspended() when requested by profiler (#27041)
Fixes https://github.com/dotnet/coreclr/issues/26576
|
|
* Move TypeSystemLog::OnKeywordsChanged from EtwCallback to EtwCallbackCommon to enable this same behavior in ETW and EventPipe. This unblocks parity for GCHeapDumps between ETW and EventPipe (#26270)
* Fix unique ETW events for GC Type logging, so they are also fired across EventPipe (#27250)
|
|
|
|
to 0 (#27137) (#27140)
* Do not create diagnostics server thread and pipe if EnableDiagnostics is set to 0
* Remove unnecessary check for config var in DiagnosticServer::Shutdown
|
|
systems only (#27014) (#27080)
Use maximum number of processors the process may run on to determine whether it is ok to use
single-proc allocation helpers. It is not sufficient to depend on current process affinity since
that can change during the process lifetime.
Also, the single-proc allocation helpers work well on x86/x64 systems only, because they depend
on an atomic non-interlocked increment instruction for good performance. Such an instruction is available
on x86/x64 only. Disable them everywhere else.
Fixes #26990
|
|
Make FindOrCreateAssociatedMethodDesc throw a type load exception instead of an AV, which can't be handled by EX_TRY/EX_CATCH on Unix systems.
|
|
* Fix read ordering bug between buckets pointer and counter
Use VolatileLoad to read counter
|
|
* Move JIT_WriteBarrier that is modified at runtime to a dynamically
allocated memory instead of making a page in libcoreclr.dylib RWX.
* Fix JIT_Stelem_Ref calls to JIT_WriteBarrier
* Update PAL to add MEM_JIT flag for allocations and reservations of
executable memory.
* Update native runtime in EH and stack unwinding areas so that it can
unwind from the write barrier copy. That code has no unwind info, so
without special handling, runtime would not be able to unwind from
it.
|
|
* Use the exact methodtable of the parameter to calculate the offset into a pinned array.
|
|
Fix watson bucketing/broken triage dumps
The DAC EnumMemoryRegions needs to include some missing code version
manager memory.
|
|
(#26423)
|
|
As adding EnC methods happens on the Debugger Thread, there's no managed Thread object from which to obtain the cached StackingAllocator. Instead, just use a function-local StackingAllocator.
|
|
* Account for quoted values in provider filter string
|
|
* Clean up diagnosticserver socket/pipe on shutdown
* Refactor dbg transport pipe cleanup to be registered as signal handler from the vm side
* cleanup dead code
* Remove more dead code and fix windows builds
* Moving some ifdefs around
|
|
need to read hardlimit configs from env vars as well as runtimeconfig.json
|
|
appropriate CRT functions for x64 Windows, as is already done for all other platforms/targets (#25763)
* Fixing Buffer::BlockCopy to just call the CRT memmove for x64 Windows, as is already done for all other platforms/targets
* Fixing up the x64 CrtHelpers.asm to just forward to the CRT implementations for JIT_MemSet and JIT_MemCpy
* Keep unix using memcpy and clarify that Windows uses memmove for full framework compat.
|
|
This typo was in #24989 so would be a new regression in 3.0.
In an x86 build, it causes us to not get the cache size correct,
leading us to use a smaller default cache size and do more GCs.
Tested with GCPerfSim and this PR reduces TotalNumberGCs by 33% using an x86 build.
|
|
|
|
* Arm64 ICorDebugRegisterSet float support
* Arm64 ICorDebugRegisterSet2 implementation
* Add arm64 VLT_REG_FP case
* Arm64 add funceval GetRegister SetRegister support
|
|
* Remove duplicate definition
* Fix conversion error
* 1ui64 doesn't exist on GCC
|