* Fix PIE options
We were missing passing the -pie linker option. That means that while we
were compiling our code as position independent, the executables
(not shared libraries) were not marked as position independent and
ASLR was not applied to them. They were always loaded to fixed addresses.
This change adds the missing -pie option and also replaces all the individual
settings of -fPIE / -fPIC on the targets we build with a centralized setting
of the CMAKE_POSITION_INDEPENDENT_CODE variable, which causes CMake to add
the appropriate compiler options everywhere.
* Fix native parts of coreclr tests build
The native parts of the tests are not built using the root CMakeLists.txt,
so I am moving the enabling of position independent code to configurecompiler.cmake.
Change-Id: Ieafff8984ec23e5fdb00fb0c2fb017e53afbce88
|
|
Port of dotnet/runtime#1389.
|
|
Ports https://github.com/dotnet/runtime/pull/206 to release/3.1.
The code in PAL_GetCurrentThreadAffinitySet relied on the fact that the
number of processors reported as configured in the system is always
larger than the maximum CPU index. However, it turns out that this is
not true on some devices / distros. The Jetson TX2 reports CPUs 0, 3, 4
and 5 in the affinity mask; CPUs 1 and 2 are never reported. glibc
reports 6 as the number of configured CPUs, but musl reports just 4.
PAL_GetCurrentThreadAffinitySet was using the number of CPUs reported
as configured as the upper bound for scanning the affinity set, so on
the Jetson TX2 the returned affinity mask had just two bits set even
though there were 4 CPUs. That triggered an assert in
GCToOSInterface::Initialize.
This change fixes that by reading the maximum CPU index from
/proc/cpuinfo. It falls back to using the number of processors
configured when /proc/cpuinfo is not available (on macOS, FreeBSD, ...)
Fixes https://github.com/dotnet/runtime/issues/170
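A minimal sketch of the described approach, assuming the standard
`processor : N` rows of /proc/cpuinfo (the function name and fallback
parameter are illustrative, not the actual PAL code):

```cpp
#include <cstdio>

// Scan /proc/cpuinfo for "processor : N" rows and return the highest
// CPU index seen; fall back to the configured-CPU count when the file
// is unavailable (macOS, FreeBSD, ...).
int GetMaxCpuIndex(int configuredCpuCount)
{
    int maxIndex = configuredCpuCount - 1;   // fallback
    if (FILE* f = fopen("/proc/cpuinfo", "r"))
    {
        char line[256];
        int index;
        while (fgets(line, sizeof(line), f) != nullptr)
        {
            if (sscanf(line, "processor : %d", &index) == 1 && index > maxIndex)
                maxIndex = index;
        }
        fclose(f);
    }
    return maxIndex;
}
```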
|
|
* Fix available memory extraction on Linux
GlobalMemoryStatusEx in the PAL returns the number of free physical
pages in the ullAvailPhys member. But there are additional pages,
allocated as buffers and caches, that get released under memory
pressure and are thus effectively available too.
This change extracts the available memory on Linux from the MemAvailable
row of /proc/meminfo, which the kernel reports as the most precise
estimate of the amount of available memory.
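A minimal sketch of the extraction, assuming the usual
`MemAvailable: <n> kB` row format (names and error handling are
simplified; not the actual PAL implementation):

```cpp
#include <cstdio>
#include <cstdint>

// Return the kernel's estimate of available memory in bytes, or 0 when
// /proc/meminfo or the MemAvailable row is missing (older kernels), in
// which case the caller falls back to counting free physical pages.
uint64_t GetAvailableMemoryBytes()
{
    unsigned long long kb = 0;
    if (FILE* f = fopen("/proc/meminfo", "r"))
    {
        char line[256];
        while (fgets(line, sizeof(line), f) != nullptr)
        {
            if (sscanf(line, "MemAvailable: %llu kB", &kb) == 1)
                break;
        }
        fclose(f);
    }
    return (uint64_t)kb * 1024;
}
```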
|
|
+ When a hardlimit is specified, we should only retry when we didn't fail due to a commit failure; if commit failed, it means we simply didn't have as much memory as the hardlimit specified, and we should throw OOM in this case.
+ Added some diagnostic info around OOM history to help with future diagnostics.
(cherry picked from commit 7dca41fd36721068e610c537654765e8e42275d7)
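A hedged sketch of the retry policy described above (the enum and names
are illustrative, not the actual gc.cpp code):

```cpp
// Whether a failed allocation attempt should be retried. With a
// hardlimit configured, a commit failure means the limit itself is
// exhausted, so retrying cannot help and the right answer is OOM.
enum class AllocFailure { Transient, CommitFailed };

bool ShouldRetryAllocation(bool hardLimitConfigured, AllocFailure failure)
{
    if (hardLimitConfigured && failure == AllocFailure::CommitFailed)
        return false;   // no more memory than the hardlimit allows: OOM
    return true;        // transient failure: retry after a GC
}
```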
|
|
* Fix a potential division by 0 in post GC counter computation
* Remove useless code
|
|
* Fixes when accessing fgn_maxgen_percent
PR #25350 changed `fgn_maxgen_percent` to be a per-heap property when
`MULTIPLE_HEAPS` is set. A few uses need to be updated.
* In `full_gc_wait`, must re-read `fgn_maxgen_percent` before the
second test of `maxgen_percent == 0`.
(Otherwise the second test is statically unreachable.)
* In `RegisterForFullGCNotification`, must set `fgn_maxgen_percent` when
`MULTIPLE_HEAPS` is not set.
* In `CancelFullGCNotification`, must set `fgn_maxgen_percent` for each
heap separately when `MULTIPLE_HEAPS` is set.
Fix dotnet/corefx#39374
* Avoid duplicate code when getting fgn_maxgen_percent twice in full_gc_wait
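A hedged sketch of the re-read pattern (types simplified; not the actual
`full_gc_wait` code): with `MULTIPLE_HEAPS`, `fgn_maxgen_percent` is
per-heap state that another thread can reset via `CancelFullGCNotification`,
so it must be read again after waiting rather than testing a cached local
twice:

```cpp
#include <cstdint>

struct heap { volatile uint32_t fgn_maxgen_percent; };

int full_gc_wait_sketch(heap* hp)
{
    if (hp->fgn_maxgen_percent == 0)
        return -1;                        // notification not registered
    // ... block on the full GC notification event ...
    if (hp->fgn_maxgen_percent == 0)      // re-read, not the cached value
        return -2;                        // notification was cancelled
    return 0;
}
```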
|
|
* Add property HardLimitBytes to GCMemoryInfo
This adds a new property HardLimitBytes.
Unlike TotalAvailableMemoryBytes,
this will reflect an explicitly set COMPLUS_GCHeapHardLimit.
It will also reflect the fraction of a container's size that we use,
where TotalAvailableMemoryBytes is the total container size.
Normally, though, it is equal to TotalAvailableMemoryBytes.
Fix #38821
* Remove HardLimitBytes; have TotalAvailableMemoryBytes take on its behavior
* Fix typos
* Separate total_physical_mem and heap_hard_limit
so we can compute highMemoryLoadThresholdBytes and memoryLoadBytes
* Do more work in gc.cpp instead of Gc.cs
* Consistently end names in "Bytes"
|
|
Fix brick table logic to fix perf issue in several ASP.NET tests, remove #ifdef FFIND_OBJECT.
What I observed was that some GCs spent a lot of time in find_first_object called from find_object, which is called during stack scanning to find the containing object for interior pointers. A substantial fraction of generation 0 was being scanned, indicating that the brick table logic didn't work properly in these cases.
The root cause was the fact that the brick table entries were not being set in adjust_limit_clr if the allocation was satisfied from the free list in gen0 instead of newly allocated space. This is the case if there are pinned objects in gen0 as well.
The main fix is in adjust_limit_clr: if the allocation is satisfied from the free list, seg is nullptr; the change is to set the bricks in this case as well, provided we are allocating in gen0 and the allocated piece is above a reasonable size threshold.
The bricks are not always set during allocation - instead, when we detect an interior pointer during GC, we make the allocator set the bricks during the next GC cycles by setting gen0_must_clear_bricks. I changed the way this is handled for server GC (multiple heaps). We used to multiply the decay time by the number of heaps (gc_heap::n_heaps), but only applied it to the single heap where an interior pointer was found. I think it's better to instead set gen0_must_clear_bricks for all heaps, but leave the decay time unchanged compared to workstation GC.
Maoni suggested to remove the #ifdef FFIND_OBJECT - interior pointers are not going away, so the #ifdefs are unnecessary clutter.
Addressed code review feedback:
- add parentheses as per GC coding conventions
- use max instead of if-statement
- merge body of for-loop over all heaps into existing for-loop
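For readers unfamiliar with bricks, here is a minimal sketch of the idea
(sizes, names, and the flat table are illustrative, not the actual
gc.cpp layout):

```cpp
#include <cstdint>
#include <cstddef>

const size_t brick_size_log2 = 12;            // assume 4KB bricks
static uint8_t* brick_table[1 << 20];         // toy fixed-size table

size_t brick_of(uint8_t* addr)
{
    return ((uintptr_t)addr >> brick_size_log2) % (1 << 20);
}

// Allocation path: record the object start for every brick the newly
// allocated region touches. The fix above ensures this also happens for
// gen0 allocations satisfied from the free list.
void set_bricks(uint8_t* start, uint8_t* end)
{
    for (size_t b = brick_of(start); b <= brick_of(end); b++)
        brick_table[b] = start;
}

// GC path: resolve an interior pointer by walking forward from the
// recorded object start instead of scanning a large part of gen0.
// size_of is assumed to return the size of the object at the address.
uint8_t* find_object_sketch(uint8_t* interior, size_t (*size_of)(uint8_t*))
{
    uint8_t* o = brick_table[brick_of(interior)];
    while (o + size_of(o) <= interior)
        o += size_of(o);
    return o;
}
```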
|
|
Large pages will have segments aligned to 16MB (the default min seg size for hardlimit).
|
|
* ensure process-wide fences when updating GC write barrier on ARM64
|
|
It doesn't seem like something we would want to export outside the standalone build.
|
|
This was in CoreRT's copy of gcinterface.dac.h, but got lost in dotnet/corert#7517.
|
|
Otherwise, gen0_min_size is eventually capped by gen0_max_size, which makes it impossible to raise the gen0 size above the default max size for gen0.
This is required for some scenarios (CppCodeGen, WASM) in CoreRT.
|
|
Use CMake's C# support to build DacTableGen instead of manually invoking csc.exe ourselves. (#24342)
* Use CMake's C# support to build DacTableGen instead of manually invoking csc.exe ourselves.
* Fix x86 failures.
* Disable DAC generation when building with NMake Makefiles and issue an error since the CMake C# support is VS-only. We don't actually support building with NMake (only configure) so this is ok.
* Clean up rest of the macro=1's
PR Feedback.
* Fix Visual Studio generator matching.
* Explicitly specify anycpu32bitpreferred for DacTableGen so the ARM64 build doesn't accidentally make it 64-bit
* Fix bad merge
|
|
asserts. (#24992)
Fixes: #24879
|
|
* Just use `new T[]` when elements are not pointer-free
* Reduce zeroing out when not necessary.
* Use AllocateUninitializedArray in ArrayPool.
|
|
* Fix initial thread affinity on Linux
On Linux, a new thread inherits the affinity mask of the thread
that created it. This is a problem for background GC threads that
are created by one of the server GC threads, which are affinitized
to a single core.
This change resets each new thread's affinity to match the current
process affinity.
In addition to that, I've also fixed the extraction of the CPU count
that was using PID 0. While the doc says that 0 represents the current
process, it in fact means the current thread.
And as a small bonus, I've added caching of the value returned by
PAL_GetLogicalCpuCountFromOS, since it cannot change during runtime.
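A minimal sketch of the reset, assuming Linux and glibc (function names
are illustrative; note that a pid of 0 in sched_getaffinity /
sched_setaffinity refers to the current thread, which is exactly the
pitfall mentioned above):

```cpp
#include <sched.h>

// Captured once at startup, on the main thread, while it still carries
// the full process affinity mask.
static cpu_set_t g_processAffinity;

void InitProcessAffinity()
{
    sched_getaffinity(0, sizeof(g_processAffinity), &g_processAffinity);
}

// Called at the start of every new thread so it does not inherit the
// narrow mask of the affinitized server GC thread that created it.
void ResetThreadAffinity()
{
    sched_setaffinity(0, sizeof(g_processAffinity), &g_processAffinity);
}
```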
|
|
* Add Series/CounterType to CounterPayload and IncrementingCounterPayload
* merging with master
* Add Generation sizes counter
* Some cleanup
* Add allocation rate counter
* Fix build
* add Allocation Rate runtime counter
* Fix a potential div by zero exception
* Add back in code commented out
* Add LOH size counter
* Fix linux build
* GetTotalAllocated -> GetTotalAllocation
* PR feedback
* More cleanup + renaming per PR feedback
* undo comments
* more pr feedback
* Use existing GC.GetTotalAllocatedBytes API instead
* Remove duplicate GetTotalAllocation
* More PR feedback
* Fix x86 build
* Match type between C++/C#
* Remove unused variables
|
|
The code was using GCToOSInterface::SetThreadAffinity, which
effectively pinned the current thread to a specific processor. On
Windows, it calls SetThreadIdealProcessor, which is basically just a
scheduler hint; the thread can still run on other processors.
Since there is no way to set an ideal affinity on Unix, the fix is to
do nothing in GCToOSInterface::SetCurrentThreadIdealAffinity.
|
|
Fix CPUSET_T definition for FreeBSD
|
|
* Remove concept of AppDomains from the GC
- Leave constructs allowing for multiple handle tables, as scenarios for that have been proposed
- Remove FEATURE_APPDOMAIN_RESOURCE_MONITORING
|
|
* keep what's allocated so far on each heap
* Implement GC.GetTotalAllocatedBytes
It is based on https://github.com/dotnet/corefx/issues/34631 and https://github.com/dotnet/corefx/issues/30644
* Fixing races related to dead_threads_non_alloc_bytes
* Separated per-heap SOH and LOH counters. Different locks imply that we need different counters.
* Allow/ignore torn 64-bit reads on 32-bit in imprecise mode.
* PR feedback
* Simplified the test a little to avoid OOM on ARM
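A hedged sketch of the precise vs. imprecise distinction (names and
locking are illustrative, not the actual per-heap code):

```cpp
#include <cstdint>
#include <mutex>

struct Heap { std::mutex alloc_lock; uint64_t allocated_bytes = 0; };

uint64_t GetTotalAllocatedBytes(Heap* heaps, int n, bool precise)
{
    uint64_t total = 0;
    for (int i = 0; i < n; i++)
    {
        if (precise)
        {
            // Take the same lock the allocator holds when it updates the
            // counter, so the 64-bit read cannot be torn on 32-bit.
            std::lock_guard<std::mutex> hold(heaps[i].alloc_lock);
            total += heaps[i].allocated_bytes;
        }
        else
        {
            // Imprecise mode: racy read; on 32-bit platforms this may
            // observe a torn value, which the API explicitly tolerates.
            total += heaps[i].allocated_bytes;
        }
    }
    return total;
}
```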
|
|
* Generate eventpipe implementation as part of CMake configure.
* Generate Etw provider as part of CMake configure.
* First pass porting over lttng provider to cmake.
* Fix up CMake Lttng provider generation.
* Move Lttng provider into CMake tree.
* Move dummy event provider to CMake
* Move genEventing into the CMake tree.
* Remove extraneous logging and unused python locator.
* Clean up build.sh
* Clean up genEventingTests.py
* Add dependencies to enable more incremental builds (providers not fully incremental).
* Convert to custom commands and targets instead of generating at configure time.
* Get each eventing target to incrementally build.
* Fix incremental builds
* Add missing dependencies on eventing headers.
* PR Feedback. Mark all generated files as generated
* Clean up eventprovider test CMakeLists
|
|
Fix some small issues with stress logging.
|
|
* Do not expand to allocation_quantum in SOH when GC_ALLOC_ZEROING_OPTIONAL
* short-circuit short arrays to use `new T[size]`
* Clean syncblock of large-aligned objects on ARM32
* specialize single-dimensional path AllocateSzArray
* Unit tests
* Some PR feedback. Made AllocateUninitializedArray not be trimmed away.
* PR feedback on gchelpers
- replaced use of multiple bool parameters with flags enum
- merged some methods with nearly identical implementation
- switched callers to use AllocateSzArray vs. AllocateArrayEx where appropriate.
* PR feedback. Removed X86 specific array/string allocation helpers.
|
|
When large pages are enabled, we must commit everything we reserve.
Previously we reserved 2x the segment size for LOH, which is a problem
with large pages. Thanks to https://github.com/dotnet/coreclr/pull/24081
this does not cause a performance regression with large pages; but
without large pages we were seeing regressions when the loh_seg_size was
reduced. So this change only takes effect when large pages are enabled.
|
|
* thier -> their
* exeption -> exception
* Estbalisher -> Establisher
* neeed -> need
* neeed -> need
* neeeded -> needed
* neeeded -> needed
* facilitiate -> facilitate
* extremly -> extremely
* extry -> extra
|
|
* Improve LOH heap balancing
Previously in `balance_heaps_loh`, we would default to `org_hp` being
`acontext->get_alloc_heap()`.
Since `alloc_large_object` is an instance method, that ultimately came
from the heap instance this was called on. In `GCHeap::Alloc` that came
from `acontext->get_alloc_heap()` (this is a different acontext). That
variable is set when we allocate a small object. So the heap we were
allocating large objects on was affected by the heap we were allocating
small objects on. This isn't necessary, as the small object heap and
large object heap have separate areas. In scenarios with limited memory,
we can unnecessarily run out of memory by refusing to move away from that
heap. However, we do want to ensure that the large object heap accessed
is not on a different NUMA node than the small object heap.
I experimented with adding a `get_loh_alloc_heap()` to the acontext,
similar to the SOH alloc heap, but performance tests showed that it was
usually better to just start from the home heap. The chosen policy
(sketched after this list) was:
* Start searching from the home heap -- this is the one corresponding to
our processor.
* Have a low (but non-zero) preference for that heap (dd_min_size(dd) /
2), as long as we stay within the same NUMA node.
* Have a higher cost of switching to a different NUMA node. However,
this is still much less than before; it was dd_min_size(dd) * 4, now
dd_min_size(dd) * 3 / 2.
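A hedged sketch of the resulting cost model (names are illustrative;
`dd_min_size` stands in for the real per-generation bookkeeping):

```cpp
#include <cstddef>

// Penalty an allocating thread pays for picking a candidate heap other
// than its home heap; balance_heaps_loh-style code would weigh this
// against the candidate heap's remaining budget.
size_t switch_cost(int candidate_heap, int home_heap,
                   int candidate_node, int home_node, size_t dd_min_size)
{
    if (candidate_heap == home_heap)
        return 0;                       // home heap: no penalty
    if (candidate_node == home_node)
        return dd_min_size / 2;         // same NUMA node: low preference
    return dd_min_size * 3 / 2;         // cross-node: higher cost (was * 4)
}
```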
This showed big performance improvements (over 30% less time) in a
scenario with lots of LOH allocation where there were fewer allocating
threads than GC heaps. The changes were more pronounced the more we
allocated large objects vs small objects. There was usually slight
improvement (1-2%) when there were 48 constantly allocating threads and
48 heaps. The one place we did see a slight regression was in an 800MB
container with 4 allocating threads on a 48 processor machine; however,
similar tests with less memory or more threads were prone to running out
of memory or running very slowly on the master branch, so we've improved
stability. Previously the GC could get lucky by having the SOH choice
happen to be a good choice for LOH, but we shouldn't rely on that, as
it failed in some container scenarios.
One more change is in joined_generation_to_condemn: If there is a memory
limit and we are about to OOM, we should always do a compacting GC. This
helps avoid the OOM and feeds into the next change.
This PR also adds a *second* balance_heaps_loh function for when there
is a memory limit and we previously failed to allocate into the chosen
heap. `balance_heaps_loh` works based on allocation budgets, whereas
`balance_heaps_loh_hard_limit_retry` works on the actual space available
at the end of the segment. Thanks to the change to
joined_generation_to_condemn, the heaps should be compact, so we don't
need to look at free space here.
* Fix uninitialized variable
* In a container, use space available instead of budget
* Fix duplicate semicolon
|
|
The current implementation assumes that the NUMA nodes of the CPUs
used for GC threads form a zero-based contiguous range. However, that
doesn't have to be true when the user selects only a subset of the
available CPUs for the GC heap threads using
COMPlus_GCHeapAffinitizeMask or COMPlus_GCHeapAffinitizeRanges. The
selected CPUs may belong to a subset of NUMA nodes that doesn't
necessarily start at node 0 or form a contiguous range.
This change fixes the algorithm that initializes the
numa_node_to_heap_map lookup array so that it works correctly even in
such cases.
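A hedged sketch of the corrected initialization (container types and
names are illustrative, not the actual lookup-array code):

```cpp
#include <map>

// Map each NUMA node that actually hosts an affinitized GC CPU to the
// index of the first heap placed on that node. The node IDs may be
// sparse (e.g. {2, 5}) and need not start at 0.
std::map<int, int> BuildNumaNodeToHeapMap(const int* cpu_node,
                                          const bool* cpu_selected,
                                          int cpu_count)
{
    std::map<int, int> nodeToFirstHeap;
    int heap = 0;
    for (int cpu = 0; cpu < cpu_count; cpu++)
    {
        if (!cpu_selected[cpu])
            continue;                   // not selected for GC heap threads
        nodeToFirstHeap.emplace(cpu_node[cpu], heap);  // keeps first entry
        heap++;
    }
    return nodeToFirstHeap;
}
```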
|
|
Fix NUMA node for heap when NUMA is not available
|
|
The recent refactoring of GCToOSInterface::GetProcessorForHeap
accidentally changed the NUMA node returned when NUMA is disabled
(either via COMPlus_GCNumaAware or because there is just a single NUMA
node on the system) and CPU groups are disabled.
Before the refactoring, the code incorrectly returned 0 as the NUMA
node when CPU groups were disabled, no matter whether NUMA was enabled
or disabled. The refactoring fixed that by returning the current CPU
group number when NUMA was enabled; however, it still returned an
incorrect value, this time GroupProcNo::NoGroup, as the NUMA node
number when NUMA was disabled.
This change fixes it by returning the current group number in this
case too.
|
|
* Switch to workstation GC in case of constrained CPU resources
Right now, if the user sets the configuration so that the server GC is
used, the server GC will be loaded even in conditions where we know the
workstation GC would fare better. An example of such conditions is a
constrained environment with only a single CPU or with very low memory.
This can be harmful if users deploy the same projects on different kinds
of platforms: deploying to a 20+ core server and to Azure Functions
will require largely different configurations for the runtime.
There are already multiple ways for the user to specify to use the
server GC or not:
- setting `COMPlus_gcServer` as an environment variable
- setting `gcServer` in the configuration file
- setting `System.GC.Server` passed to `coreclr_initialize`
Fix https://github.com/dotnet/coreclr/issues/23949
* Address review
* Address review
Remove GCToOSInterface::GetCurrentProcessCpuLimit in favor of
GCToOSInterface::GetCurrentProcessCpuCount, because the CpuLimit is now
taken into account in the CpuCount.
* Address review
Do the work in src/vm/ceemain.cpp, otherwise there would be a disparity
between what the VM and the GC are running. Before, only the GC would be
aware of the switch from server to workstation GC, but not the VM.
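A hedged sketch of the fallback decision (thresholds are illustrative,
not the runtime's exact values):

```cpp
#include <cstdint>

// Even when configuration asks for server GC, fall back to workstation
// GC when the environment is too constrained for multiple GC heaps.
bool ShouldUseServerGC(bool serverGCConfigured, int cpuCount,
                       uint64_t physicalMemoryBytes)
{
    const uint64_t lowMemoryBytes = 512ULL * 1024 * 1024;  // assumed
    if (!serverGCConfigured)
        return false;
    if (cpuCount <= 1 || physicalMemoryBytes < lowMemoryBytes)
        return false;   // constrained: workstation GC fares better
    return true;
}
```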
|
|
The CPU limiting was accidentally removed during the refactoring of the
CPU groups support in the GC. This change puts it back.
|
|
Calling delete on types allocated with new[] leads to undefined
behaviour.
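A minimal illustration of the bug class being fixed:

```cpp
void Example()
{
    int* values = new int[16];
    // delete values;   // wrong: undefined behaviour for new[] allocations
    delete[] values;    // correct: matches the new[] allocation
}
```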
|
|
Adjust plug_size_to_fit to consider large alignment on ARM32
|