Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
The recent refactoring of GCToOSInterface::GetProcessorForHeap
accidentally changed the NUMA node returned when NUMA is disabled
(either via COMPlus_GCNumaAware or because the system has just a
single NUMA node) and CPU groups are disabled.
Before that refactoring, the code incorrectly returned 0 as the
NUMA node when CPU groups were disabled, regardless of whether NUMA was
enabled or disabled. The refactoring fixed that by returning the
current CPU group number when NUMA was enabled, but
it still returned an incorrect value, this time GroupProcNo::NoGroup, as
the NUMA node number when NUMA was disabled.
This change fixes it by returning the current group number in this case.
|
|
|
|
This change removes CPU groups emulation from Unix PAL and modifies the
GC and thread pool code accordingly.
|
|
|
|
Each entry has to be prefixed by the group number followed by a comma. There
is no global CPU index on Windows; all the APIs that support
more than 64 processors use a (group, in-group index) pair. So specifying
the affinity this way matches what users are used to.
|
|
GCHeapAffinitizeRanges is now used only when CPU groups are enabled,
and GCHeapAffinitizeMask only when CPU groups are disabled.
|
|
|
|
This change removes all explicit manipulation and handling of CPU groups
from gc.cpp and hides it behind the GCToOSInterface. This is a step
to prepare for removing the CPU groups emulation on Unix. In fact, I've
already updated the standalone Unix GC to be able to affinitize GC
threads to any subset of CPUs and added previously missing support for
the affinity setting itself. The NUMA support still remains missing there.
It also adds a new way to specify GC thread affinitization that is not
limited to 64 threads. Instead of an affinity mask stored in a 64-bit
integer, we now use a bitset that can contain as many processors as
the GC can support. And there is a new GC config to specify the affinity in
the form of a range list.
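As a rough illustration of the bitset-plus-range-list idea, here is a minimal sketch. It is not the GC's actual parser: the names `AffinitySet`, `ParseIndex`, `ParseAffinityRanges` and `kMaxProcessors`, and the `0-3,8,70-71` syntax without group prefixes, are assumptions made for this example only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical upper bound on the number of processors the sketch supports.
constexpr std::size_t kMaxProcessors = 1024;
using AffinitySet = std::bitset<kMaxProcessors>;

// Parses a non-negative decimal index below kMaxProcessors.
static bool ParseIndex(const std::string& s, std::size_t& value) {
    if (s.empty() || s.size() > 6) return false;  // digit cap avoids overflow
    value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + static_cast<std::size_t>(c - '0');
    }
    return value < kMaxProcessors;
}

// Parses a comma-separated list of indices or inclusive ranges, for example
// "0-3,8,70-71" (assumed syntax). Returns false on malformed input and
// leaves `out` untouched in that case.
bool ParseAffinityRanges(const std::string& text, AffinitySet& out) {
    AffinitySet result;
    std::size_t pos = 0;
    while (pos <= text.size()) {
        std::size_t comma = text.find(',', pos);
        if (comma == std::string::npos) comma = text.size();
        const std::string entry = text.substr(pos, comma - pos);

        std::size_t first = 0, last = 0;
        const std::size_t dash = entry.find('-');
        if (dash == std::string::npos) {
            if (!ParseIndex(entry, first)) return false;
            last = first;
        } else if (!ParseIndex(entry.substr(0, dash), first) ||
                   !ParseIndex(entry.substr(dash + 1), last) ||
                   first > last) {
            return false;
        }
        for (std::size_t i = first; i <= last; ++i) {
            assert(i < kMaxProcessors);
            result.set(i);
        }
        pos = comma + 1;
    }
    out = result;
    return true;
}
```

Parsing `"0-3,8,70-71"` with this sketch sets seven bits, including processors 70 and 71, which a 64-bit mask could not represent.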
|
|
also fixing the LocalGC standalone case on Linux
|
|
GCHeapHardLimit - specifies a hard limit for the GC heap
GCHeapHardLimitPercent - specifies a percentage of the physical memory this process is allowed to use
If both are specified, GCHeapHardLimit is checked first; GCHeapHardLimitPercent
is checked only when GCHeapHardLimit is not specified.
If neither is specified but the process is running inside a container with a memory
limit specified, we will take this as the hard limit:
max (20mb, 75% of the memory limit on the container)
If one of the HardLimit configs is specified, and the process is running inside a container
with a memory limit, the GC heap usage will not exceed the HardLimit but the total memory
is still the memory limit on the container so when we calculate the memory load it's based
off the container memory limit.
For example:
process is running inside a container with a 200mb limit
user also specified GCHeapHardLimit as 100mb.
if 50mb out of the 100mb is used for GC, and 100mb is used for other things, the memory load
is (50 + 100)/200 = 75%.
Some notes on these configs -
+ The limit is the commit size.
+ This is only supported on 64-bit.
+ For Server GC the minimum *reserved* segment size is 16mb per heap. This is to avoid the
scenario where the hard limit is small but the process can use many procs and we end up
with tiny segments, which doesn't make sense. We then keep track of the committed memory on the
segments so the total does not exceed the hard limit.
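The precedence rules and the memory load arithmetic above can be sketched as follows. This is only an illustration under stated assumptions, not the GC's code: the function names and parameters are hypothetical, a value of 0 stands for "not specified", and the percentage is assumed to apply to the container limit when one is set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint64_t MB = 1024ull * 1024ull;

// Hypothetical helper: picks the effective hard limit in bytes.
//   hardLimit        - GCHeapHardLimit in bytes, 0 if not specified
//   hardLimitPercent - GCHeapHardLimitPercent, 0 if not specified
//   totalMemory      - physical memory available to the process
//                      (assumed to be the container limit if one is set)
//   inContainer      - true if a container memory limit is in effect
// Returns 0 when no hard limit applies.
uint64_t EffectiveHardLimit(uint64_t hardLimit, uint32_t hardLimitPercent,
                            uint64_t totalMemory, bool inContainer) {
    if (hardLimit != 0) return hardLimit;             // checked first
    if (hardLimitPercent != 0)
        return totalMemory * hardLimitPercent / 100;  // checked second
    if (inContainer)
        return std::max(20 * MB, totalMemory * 75 / 100);
    return 0;
}

// Memory load is computed against the container limit, not the GC hard limit.
uint32_t MemoryLoadPercent(uint64_t gcUsed, uint64_t otherUsed,
                           uint64_t containerLimit) {
    assert(containerLimit > 0);
    return static_cast<uint32_t>((gcUsed + otherUsed) * 100 / containerLimit);
}
```

With the numbers from the example, `MemoryLoadPercent(50 * MB, 100 * MB, 200 * MB)` yields 75, and a 200mb container with neither config set gets a 150mb hard limit.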
|
|
We weren't setting memStatus->dwLength, so the call to GlobalMemoryStatusEx would fail and we would return 0. This caused the standalone GC to get into a state where we wouldn't allocate more memory for the GC heap even if there was plenty available on the machine.
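The failure mode can be shown with a portable mock. On Windows, GlobalMemoryStatusEx requires the caller to set dwLength to sizeof(MEMORYSTATUSEX) before the call. Everything in the sketch below (`MockMemoryStatus`, `MockQueryMemoryStatus`, the reported sizes) is a hypothetical stand-in so the example runs without the Windows headers; it is not the actual change.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kFakeTotalPhys = 8ull * 1024 * 1024 * 1024;  // made-up value

// Hypothetical stand-in for a size-checked OS struct.
struct MockMemoryStatus {
    uint32_t length;     // caller must set this to sizeof(MockMemoryStatus)
    uint64_t totalPhys;  // filled in by the query on success
};

// Mock of a size-checked query: fails unless the length field is correct.
bool MockQueryMemoryStatus(MockMemoryStatus* status) {
    assert(status != nullptr);
    if (status->length != sizeof(MockMemoryStatus)) return false;
    status->totalPhys = kFakeTotalPhys;
    return true;
}

// Buggy pattern: length is never set, so the query fails and 0 is returned.
uint64_t GetTotalPhysicalBuggy() {
    MockMemoryStatus status{};
    if (!MockQueryMemoryStatus(&status)) return 0;
    return status.totalPhys;
}

// Fixed pattern: set the length field before calling the query.
uint64_t GetTotalPhysicalFixed() {
    MockMemoryStatus status{};
    status.length = sizeof(status);
    if (!MockQueryMemoryStatus(&status)) return 0;
    return status.totalPhys;
}
```

A caller that treats the 0 from the buggy variant as "no memory available" would stop growing the heap, which matches the symptom described above.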
|
|
windows (#19879)
* Add GetCacheSizePerLogicalCpu and GetCurrentProcessCpuCount to gcenv for windows
* Remove dead code, switch to __cpuid instead of asm helper
* code review feedback
|
|
stubs from local gc (#19500)
* Switch NumaNodeInfo and CPUGroupInfo to the interface
* Remove AppDomain/SystemDomain stubs
* remove EEConfig methods
* Port numa code to the coreclr side
* add numa back to PAL and standalone builds
* enable numa for PAL/Standalone builds, and fix BOOL warnings
* remove unused defines, and fix linux build errors
* building on windows
* about to delete numa work from unix and want a backup
* add stubs for unix numa/cpugroup
* Code review feedback
* Code review feedback
|
|
|
|
GetLargestOnDieCacheSize into GetCacheSizePerLogicalCpu (#15975)
* refactor: combine GetLargestOnDieCacheSize and GetLogicalCpuCount in GetCacheSizePerLogicalCpu
* Perform PhysicalMemoryLimit check also for workstation GC
|
|
* Initial cut, ignoring thread affinity
* Integrate thread affinity
* Affinity for standalone Windows
* Add 'specialness' and the thread name as arguments to CreateThread
* First crack at unified implementation
* Set priority for server GC threads
* Remove unused parameter
* Address code review feedback and remove some dead code that broke the clang build
* Use char* on the interface instead of wchar_t (doesn't play well cross-platform)
* Rename IsGCSpecialThread -> CurrentThreadWasCreatedByGC
* Code review feedback and fix up the build
* rename CurrentThreadWasCreatedByGC -> WasCurrentThreadCreatedByGC
* Fix a contract violation when converting string encodings
* Thread::CreateUtilityThread returns a thread that is not suspended - restarting a non-suspended thread is incorrect
* CreateUnsuspendableThread -> CreateNonSuspendableThread
|
|
(#13656)
* Vendor the volatile.h header into src/gc/env for the standalone GC build
* Repair the GC sample
* Remove some unneeded defines (fixes the sample build)
|
|
* [Arm64/Unix] Support 64K pages
* GC move GCToOSInterface::Initialize() into InitializeGarbageCollector()
|
|
* [Local GC] Don't use g_SystemInfo to infer the number of processors
* [Local GC] Avoid using g_SystemInfo.dwAllocationGranularity from within
the GC.
In a few places, the GC aligns the size of a request memory reservation
to the allocation granularity before calling
GCToOSInterface::VirtualReserve. Upon inspection, this isn't necessary.
* [Local GC] Remove uses of g_SystemInfo from the handle table
* Add new API on GCToOSInterface for querying the total number of procs on the machine
|
|
* Remove destructor from GCEvent and instead rely on the OS to clean up
* Add a comment justifying the lack of destructor
* wording: many -> all
|
|
* [Local GC] Move operations on CLREventStatic to the EE and add their functionality to the interface
* Fix a missed case
* Split GetWaitForGCEvent into two smaller interface methods to avoid exposing the event itself on the interface
* Initial implementation for Unix
* Complete unix implementation
* Make it work on Windows
* Remove redundant methods from GCToEEInterface
* Fix the Linux build
* First part of code review feedback: make GCEvent dispatch statically (Windows)
* Second part of code review feedback: make GCEvent dispatch statically (Unix)
* Standardize implementation across Windows/Unix (apparently MSVC is more lenient about constructor names than clang)
* Address code review feedback: Add Create*Event methods back onto GCEvent and remove them from GCToOSInterface
* Address code review feedback: remove a dead define
* Remove a bad comment, remove an unnecessary friend class, fix some formatting issues
* Fix an issue when initializing a GCEvent on Linux (should not be allocating an Impl in the constructor)
* Fix the same issue on Windows (less bad, just leaks memory instead of asserting)
|
|
|
|
gcenv.windows.cpp matches gcenv.os.cpp
|
|
platforms (#8976)
* Add way to build with FEATURE_STANDALONE_GC from build.sh
* Make CMake changes to build the GC 'PAL' as its own build target (to avoid -nostdinc).
In addition, introduce a "GC PAL" that provides an implementation of
GCToOSInterface on Unix-like platforms, for use with
FEATURE_STANDALONE_GC.
|