Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Parse ".dynamic" section (ELF dynamic array tags) of the module being
added, find ".rel(a).plt" section and search it for presence of
'__asan_init' symbol.
Change-Id: Ie7cc4c818b791b5f00713b42ba15131325b8152c
Signed-off-by: Andrey Drobyshev <a.drobyshev@samsung.com>
|
|
|
|
Removed FEATURE_DATATARGET4 for arm64
Added SP check to createdump's native unwind loop to make it more robust.
Issue: https://github.com/dotnet/coreclr/issues/15062
|
|
On WSL, the alternate stack check in sigsegv_handler doesn't work. The
uc_stack members are always zero no matter whether the handler is
executed on an alternate or default stack. So the check to detect
whether we are running on an alternate stack or not is always returning
false on WSL and it prevents NULL reference exceptions from being
handled.
The fix is to introduce an env variable COMPlus_EnableAlternateStackCheck
that can be used to enable the alternate stack check. By default, the
sigsegv_handler is considered to always run on an alternate stack.
|
|
|
|
|
|
|
|
|
|
Fix issues related to save/restore of FPCR/FPSR/V0/V31
There were several bugs in the assembly causing FPCR/FPSR
to overwrite V0 on RtlCaptureContext and then to be restored
from V0 on RtlRestoreContext.
|
|
On CentOS or OpenSUSE, dotnet-dump collect fails even though a
valid coredump is generated. The "prctl()" call that gives the
child createdump process permission to ptrace the runtime
process is failing. On CentOS/OpenSUSE the PR_SET_PTRACER
option isn't supported and isn't needed.
Issue: https://github.com/dotnet/diagnostics/issues/334
|
|
Fixes https://github.com/dotnet/coreclr/issues/25108
- Upon a `WaitAll` when all waits are already satisfied, the abandoned flag is overwritten with the abandoned state of the last wait object in the array
- So if the first wait object is an abandoned mutex and the second wait object is a signaled event, the `WaitAll` succeeds and does not report that anything was abandoned
- Fixed to accumulate into the flag instead of overwriting it
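The difference between overwriting and accumulating the flag can be illustrated with a minimal Python sketch. The `WaitObject` class and `wait_all_abandoned` function are hypothetical names used only for illustration; they are not the PAL's actual types or code.

```python
from dataclasses import dataclass


@dataclass
class WaitObject:
    """Hypothetical stand-in for an already-satisfied wait object."""
    name: str
    abandoned: bool = False


def wait_all_abandoned(objects, accumulate=True):
    """Return the abandoned flag reported after a WaitAll-style pass.

    accumulate=False models the old behavior (the last object wins).
    accumulate=True models the fixed behavior (any abandoned object counts).
    """
    abandoned = False
    for obj in objects:
        if accumulate:
            abandoned = abandoned or obj.abandoned
        else:
            abandoned = obj.abandoned
    return abandoned
```

With an abandoned mutex followed by a signaled event, the overwriting variant reports nothing abandoned, while the accumulating variant reports the abandonment.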
|
|
There was a bug reported on a very recent Mac with Intel i9 processor. A
crash in the RtlRestoreContext was happening at the fxrstor instruction
due to the fact that the floating point state data were garbage.
The investigation has shown that sometimes, the x86_FLOAT_STATE64
cannot be obtained using the thread_get_state API. And it was also found
that at the same time, the x86_AVX_STATE64 can be obtained. The state
extracted by the AVX variant contains all the registers that the FLOAT
variant would extract.
However, in some cases, even the x86_AVX_STATE64 cannot be obtained and
there is a third flavor that we can get - x86_AVX512_STATE64.
Unfortunately, there are cases where none of those can be obtained.
It is not clear what causes these cases; it seems only kernel debugging
can give us an answer to that.
This change modifies the way we extract the floating point state. We
first try to get the AVX state; if that fails, we try AVX512, and
finally we fall back to the FLOAT state. If we fail to get the floating
point state with any of these, we return context without the floating
point state flag set. Also, if only getting the FLOAT state succeeds,
we return context without the XSTATE flag set.
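The fallback order described above can be sketched in Python as follows. The `try_get_state` callback and the two flag values are assumptions made for illustration; the real code calls thread_get_state and uses the runtime's own context flags.

```python
# Hypothetical flag values, chosen only for this sketch.
CONTEXT_FLOATING_POINT = 0x1
CONTEXT_XSTATE = 0x2


def capture_fp_state(try_get_state):
    """Try AVX, then AVX512, then FLOAT, and report which flags to set.

    try_get_state(flavor) is a hypothetical callback returning the state
    bytes on success or None on failure.
    """
    for flavor in ("AVX", "AVX512", "FLOAT"):
        state = try_get_state(flavor)
        if state is None:
            continue
        flags = CONTEXT_FLOATING_POINT
        if flavor != "FLOAT":
            # Only the AVX flavors carry the extended state.
            flags |= CONTEXT_XSTATE
        return flavor, flags
    # Nothing could be obtained: no floating point flag is set.
    return None, 0
```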
|
|
* Add crossgen option to set up a default base address for native images
This is enabled only with -DFEATURE_ENABLE_NO_ADDRESS_SPACE_RANDOMIZATION.
* Mmap native images at the default base address if the env variable COMPlus_UseDefaultBaseAddr=0x1 is set.
This is enabled only with -DFEATURE_ENABLE_NO_ADDRESS_SPACE_RANDOMIZATION.
|
|
When a third party library that someone loads into a coreclr process registers
its own SIGSEGV handler and then chain-calls coreclr SIGSEGV handler on a
non-alternate stack, coreclr would currently crash.
This fix enables it to execute the SIGSEGV handler on the non-alternate stack
(original stack of the interrupted thread) in such a case.
The disadvantage is that a stack overflow would lead to a silent crash in such a
case, but we cannot do anything about it.
|
|
|
|
Shaves off two syscalls per managed assembly load.
|
|
Right now the new format is not on by default, but it can be enabled using COMPlus_EventPipeNetTraceFormat = 1 for testing purposes. The plan is to have a follow-up PR that will add shipping configuration mechanisms and change the default setting.
See the documentation in the PerfView repo for more details about the format. At a glance the goal is to create a format that is more efficient to produce, has a smaller on disk size, and offers enhanced functionality in a few areas:
a) 64 bit thread id support
b) Detection of dropped events via sequence numbers
c) Better support for extracting subsets of the file
Together with the change there was also some refactoring of the EventPipeBufferManager and EventPipeThread.
This change addresses (at least in part) the following issues:
#19688, #23414, #24188, #20751, #20555, #21827, #24852, #25046
|
|
csc.exe ourselves. (#24342)
* Use CMake's C# support to build DacTableGen instead of manually invoking csc.exe ourselves.
* Fix x86 failures.
* Disable DAC generation when building with NMake Makefiles and issue an error since the CMake C# support is VS-only. We don't actually support building with NMake (only configure) so this is ok.
* Clean up rest of the macro=1's
PR Feedback.
* Fix Visual Studio generator matching.
* Explicitly specify anycpu32bitpreferred for DacTableGen so the ARM64 build doesn't accidentally make it 64-bit
* Fix bad merge
|
|
The ONE_SHARED_MAPPING_PER_FILEREGION_PER_PROCESS check was using a temp
path that had some non-existent components. While this works fine on Linux,
it fails to create the temp file on OSX.
The fix is to use temp dir in the CMake's output dir.
|
|
* Fix PAL_GetLogicalProcessorCacheSizeFromOS on mac
In a previous PR
(https://github.com/dotnet/coreclr/commit/ed52a006c01a582d4d34add40c318d6f324b99ba#diff-8447e54277bb962d167a77bb260760d7R1879),
GetCacheSizePerLogicalCpu was changed to no longer rely on cpuid on
amd64 systems; instead it uses GetLogicalProcessorCacheSizeFromOS().
Unfortunately that function consisted of a number of `#if`s, none of
which were active on macs, and we just returned 0. This caused us to
default to a gen0size of only 0.25MB, causing many GCs.
Fixed by adding a new case that uses `sysctlbyname`.
Fix #24658
* Fixes from code review
* Check for function sysctlbyname instead of header
|
|
* Convert C++ standard settings and warning options from CMAKE_<LANG>_FLAGS to Modern CMake isms.
* More $<COMPILE_LANGUAGE> generator expressions instead of CMAKE_CXX_FLAGS.
* Use $<COMPILE_LANGUAGE:CXX> for all -fpermissive usage
* Fix generator expression that generates multiple flags
* Fix invalid use of CMAKE_CXX_FLAGS instead of CMAKE_C_FLAGS.
* Treat AppleClang as though it is Clang (match pre-3.0 behavior).
* Update our build system to understand that AppleClang is distinct from Clang and remove CMP0025 policy setting.
* PR Feedback.
|
|
* Improve fatal err msg
* Match SO format
* a
* Remove dead define
* More adjustments to ex msg
* And PAL
* Remove special case for SOE
* Remove excess Debug.Assert newline
* Remove excess newline
* typo
* New format
* Remove DebugProvider redundancy
* Adjustments
* Remove preceding newline
* Make other SOE and OOM consistent
* Tidy up assertion msg
* Fix missing newline after inner exception divider
* CR when no inner exception
* ToString never CR terminated
* disable corefx tests temporarily
|
|
test case run at (#24844)
least 4 to 5 times faster than before.
Fallback to old transport ReadMemory if /proc/<pid>/mem can't be opened. This happens
on attach because of permissions/access, but works fine on the launch (the most
important case).
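The fallback described above might look roughly like the following Python sketch. The `read_memory` function and the `fallback_read` callback are hypothetical names; the actual transport code is not shown in this log.

```python
def read_memory(pid, address, size, fallback_read):
    """Read target memory via /proc/<pid>/mem, falling back if it can't be opened.

    fallback_read(pid, address, size) is a hypothetical stand-in for the
    older transport ReadMemory path.
    """
    try:
        mem = open(f"/proc/{pid}/mem", "rb", buffering=0)
    except OSError:
        # Typically a permission or access failure, as on attach.
        return fallback_read(pid, address, size)
    with mem:
        mem.seek(address)
        return mem.read(size)
```

Only the open failure triggers the fallback in this sketch, which mirrors the wording of the commit message; handling of read errors is left out.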
|
|
* default value is 1, and when set to 0 will disable loading LTTng.
Co-Authored-By: Jan Kotas <jkotas@microsoft.com>
|
|
* Fix initial thread affinity on Linux
On Linux, a new thread inherits the affinity mask of the thread
that created the new thread. This is a problem for background GC
threads that are created by one of the server GC threads that are
affinitized to a single core.
This change resets the affinity of each new thread to match the
current process affinity.
In addition to that, I've also fixed the extraction of the CPU count
that was using PID 0. While the doc says that 0 represents current process,
it in fact means current thread.
And as a small bonus, I've added caching of the value returned by
the PAL_GetLogicalCpuCountFromOS, since it cannot change during runtime.
|
|
While looking at the strace of `corerun` built with logging and
debugging information, I was amazed at the number of gettid() calls it
was making. While system calls are cheap, they're still not free;
cache this number in the thread local storage area. Adds a branch, but
it's just a comparison with 0, so it's fine in comparison.
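The caching idea can be sketched in Python with thread-local storage. The `cached_thread_id` name is hypothetical, and `threading.get_native_id()` stands in for the gettid() system call here.

```python
import threading

# Per-thread storage, analogous to a thread-local variable in the PAL.
_tls = threading.local()


def cached_thread_id():
    """Return the native thread id, querying the OS only once per thread."""
    tid = getattr(_tls, "tid", 0)
    if tid == 0:
        # The extra branch: only the first call on each thread reaches the OS.
        tid = threading.get_native_id()
        _tls.tid = tid
    return tid
```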
|
|
Fixes #21009
|
|
|
|
Fix CPUSET_T definition for FreeBSD
|
|
unicodedata.cpp based on UnicodeData.txt v11.0.
|
|
|
|
|
|
* Generate eventpipe implementation as part of CMake configure.
* Generate Etw provider as part of CMake configure.
* First pass porting over lttng provider to cmake.
* Fix up CMake Lttng provider generation.
* Move Lttng provider into CMake tree.
* Move dummy event provider to CMake
* Move genEventing into the CMake tree.
* Remove extraneous logging and unused python locator.
* Clean up build.sh
* Clean up genEventingTests.py
* Add dependencies to enable more incremental builds (providers not fully incremental).
* Convert to custom command and targets instead of at configure time.
* Get each eventing target to incrementally build.
* Fix incremental builds
* Add missing dependencies on eventing headers.
* PR Feedback. Mark all generated files as generated
* Clean up eventprovider test CMakeLists
|
|
|
|
Add the DiagnosticProtocolHelper class to deserialize and dispatch
the new GenerateCoreDump command.
Refactor the PAL createdump launch on unhandled exception code to
be used by a new PAL_GenerateCoreDump method that doesn't depend on
the complus dump environment variables.
Changed the "full" createdump not to include the uncommitted pages and
removed the "add module metadata" workaround for the SOS clrstack !UNKNOWN
problem now that it is fixed in SOS (crashinfo.cpp).
|
|
* Fixing up time.cpp in the PAL
* Fixing GetTickCount64 in the PAL to continue using CLOCK_MONOTONIC_COARSE where available
* Reverting QueryPerformanceFrequency in the PAL to return tccSecondsToNanoSeconds for CLOCK_MONOTONIC
* Removing two unused variables from GetTickCount64 in the PAL
* Updating the PAL to error if neither mach_absolute_time nor clock_gettime(CLOCK_MONOTONIC) are supported.
* Fixing the PAL configure.cmake to link rt where applicable
|
|
returned by gettimeofday
This allows IBC profile data to record a meaningful time of when the training scenario was run.
Made EPOCH_DIFF a defined constant
Change calcTime to be an unsigned 64-bit integer
Change constants to units of 100NS instead of NS to avoid division and integer overflows.
Use the defined constants SECS_TO_100NS and USECS_TO_100NS when performing time calculations
Don't add a space after the Assembly arg when argc is zero
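The time calculation above can be illustrated with a small Python sketch. The constant names SECS_TO_100NS and USECS_TO_100NS come from the commit message, but their values here, the 1601-based epoch, and the `to_100ns_since_1601` function are assumptions for illustration.

```python
SECS_TO_100NS = 10_000_000        # 100 ns intervals per second
USECS_TO_100NS = 10               # 100 ns intervals per microsecond
# Assumed value: seconds between 1601-01-01 and 1970-01-01 (the Unix epoch).
EPOCH_DIFF_SECS = 11_644_473_600


def to_100ns_since_1601(tv_sec, tv_usec):
    """Convert a gettimeofday-style (seconds, microseconds) pair to 100 ns units.

    Working in 100 ns units needs only multiplications, so no division is
    required and the result stays well within an unsigned 64-bit integer.
    """
    return (tv_sec + EPOCH_DIFF_SECS) * SECS_TO_100NS + tv_usec * USECS_TO_100NS
```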
|
|
|
|
|
|
The PAL_SetCurrentThreadAffinity was incorrectly adding the specified processor
to the current thread affinity set instead of setting the affinity to only
the processor specified.
It was causing a significant performance hit in aspnet benchmarks on machines with
many cores.
This change crept in when I was refactoring the related code while removing
CPU groups emulation.
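The difference between the buggy and the intended behavior can be shown with plain Python sets. Both function names are hypothetical and model only the affinity set arithmetic, not the actual PAL call.

```python
def add_processor(current_affinity, processor):
    """Buggy behavior: the processor is added to the existing affinity set."""
    return set(current_affinity) | {processor}


def set_processor(current_affinity, processor):
    """Intended behavior: the affinity becomes only the specified processor."""
    return {processor}
```

On a machine with many cores, the first variant leaves the thread free to run on every core it could already use, so the thread is never actually pinned.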
|
|
|
|
The list size was set to g_SystemInfo.dwNumberOfProcessors, which is the
number of processors the current process is allowed to run on, not
the total number of processors in the system. Fixed to use
PAL_GetTotalCpuCount.
Also revert a change to the mbind node mask length computation that I
incorrectly made in my last commit and make it clear that the value is
the number of used bits in the node mask, which is the highest numa node
plus 1. And finally, re-reading the mbind doc, I've found that the
maxnode parameter is in fact "number of nodes" in the mask, so fixing
that too.
|
|
* Fix build on OSX and Linux machines without NUMA installed - there were
a couple of places where I was missing ifdefs
* Fix bug in nodeMaskLength computation
* Remove testing change in eeconfig.cpp that has leaked into the PR
* Fix GCToOSInterface::GetTotalProcessorCount for embedded GC to return
all processors on the system, not just the ones enabled for the current
process.
|
|
This change removes CPU groups emulation from Unix PAL and modifies the
GC and thread pool code accordingly.
|
|
In case you would have UINT32_MAX - 1 CPUs, you would round up to return UINT32_MAX CPUs.
|
|
* Round up the value of the CPU limit
In the case where `--cpus` is set to a value very close to the smaller
integer (ex: 1.499999999), it would previously be rounded down. This
would mean that the runtime would only try to take advantage of 1 CPU in
this example, leading to underutilization.
By rounding it up, we augment the pressure on the OS threads scheduler,
but even in the worst case scenario (`--cpus=1.000000001` previously
being rounded to 1, now rounded to 2), we do not observe any
overutilization of the CPU leading to performance degradation.
* Teach the ThreadPool of CPU limits
By making sure we do take the CPU limits into account when computing the
CPU busy time, we ensure we do not have the various heuristics of the
threadpool competing with each other: one trying to allocate more
threads to increase the CPU busy time, and the other one trying to
allocate fewer threads because adding more doesn't improve the
throughput.
Let's take the example of a system with 20 cores, and a docker container
with `--cpus=2`. It would mean the total CPU usage of the machine is
2000%, while the CPU limit is 200%. Because the OS scheduler would never
allocate more than 200% of its total CPU budget to the docker container,
the CPU busy time would never get over 200%. From `PAL_GetCpuBusyTime`,
this would indicate that the threadpool threads are mostly doing non-CPU
bound work, meaning we could launch more threads.
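Both points can be sketched in Python under some assumptions: the limit is taken to be expressed as a quota and a period in microseconds, and the busy time as summed per-core usage in percent. The function names and the normalization formula are illustrative guesses at the idea, not the runtime's actual code.

```python
def cpu_limit(quota_us, period_us, physical_cpus):
    """Round a fractional CPU limit up to a whole number of CPUs.

    A non-positive quota or period is treated as "no limit" in this sketch.
    """
    if quota_us <= 0 or period_us <= 0:
        return physical_cpus
    # Integer ceiling division avoids floating point rounding surprises.
    limit = (quota_us + period_us - 1) // period_us
    return max(1, min(limit, physical_cpus))


def busy_percent_within_limit(used_percent, limit_cpus):
    """Scale summed per-core usage (200 = two cores fully busy) to the limit."""
    return min(100.0, used_percent / limit_cpus)
```

With `--cpus=2` on a 20-core machine, a container using 200% is fully busy relative to its limit, so the heuristics should not conclude that more threads would help.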
|
|
* Fix invalid use of stack memory
|
|
This focuses on better supporting `--cpuset-cpus`, which limits the number of processors we have access to on the CPU; it also specifies which specific processors we have access to, but that’s irrelevant here.
The work has been done here for all runtime components except `Environment.ProcessorCount`. The work consists of fixing `PAL_GetLogicalCpuCountFromOS` to use `sched_getaffinity`.
Fixes https://github.com/dotnet/coreclr/issues/22302
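The effect of counting CPUs through the affinity mask can be sketched in Python, which exposes `os.sched_getaffinity` on Linux. The `logical_cpu_count` function is a hypothetical name, and the fallback branch is an assumption for platforms without that call.

```python
import os


def logical_cpu_count():
    """Count the CPUs the current process may run on, honoring cpuset limits."""
    if hasattr(os, "sched_getaffinity"):
        # The affinity set reflects restrictions such as --cpuset-cpus.
        return len(os.sched_getaffinity(0))
    # Fallback where sched_getaffinity is unavailable (for example macOS).
    return os.cpu_count() or 1
```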
|
|
Integer Conversion issues
|