author     Andy Hanson <anhans@microsoft.com>   2019-04-26 19:33:26 -0700
committer  GitHub <noreply@github.com>          2019-04-26 19:33:26 -0700
commit     141926d90c54bb358cfe8d9eb641c88e94639a8c (patch)
tree       3f52ea1959640fc74e427dfedad385dcf9e2fddb /src/gc/gcpriv.h
parent     4452efd309d40d4bc7fc1fa48bf1b6e615ee6755 (diff)
Improve LOH heap balancing (#24081)
* Improve LOH heap balancing
Previously in `balance_heaps_loh`, we would default to `org_hp` being
`acontext->get_alloc_heap()`.
Since `alloc_large_object` is an instance method, that ultimately came
from the heap instance this was called on. In `GCHeap::Alloc` that came
from `acontext->get_alloc_heap()` (this is a different acontext). That
variable is set when we allocate a small object. So the heap we were
allocating large objects on was affected by the heap we were allocating
small objects on. This isn't necessary, as the small object heap and
large object heaps occupy separate areas. In scenarios with limited
memory, we could unnecessarily run out of memory by refusing to move
away from that heap. However, we do still want to ensure that the large
object heap accessed is not on a different NUMA node than the small
object heap.
I experimented with adding a `get_loh_alloc_heap()` to acontext similar
to the SOH alloc heap, but performance tests showed that it was usually
better to just start from the home heap. The chosen policy was:
* Start searching from the home heap -- this is the one corresponding to
our processor.
* Have a low (but non-zero) preference for that heap (dd_min_size(dd) /
2), as long as we stay within the same NUMA node.
* Have a higher cost for switching to a different NUMA node. However,
this cost is still much lower than before: it was dd_min_size(dd) * 4,
and is now dd_min_size(dd) * 3 / 2.
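The policy above can be sketched as a simple scoring pass over the heaps. This is a minimal illustrative model, not the actual coreclr code: the names `HeapInfo` and `pick_loh_heap` are hypothetical, `min_size` stands in for `dd_min_size(dd)`, and the real `balance_heaps_loh` works on `gc_heap` instances and dynamic data rather than a plain vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-heap state for the sketch.
struct HeapInfo
{
    ptrdiff_t budget;    // remaining LOH allocation budget for this heap
    int numa_node;       // NUMA node of the processor this heap belongs to
};

// Returns the index of the heap to allocate on. 'home' is the heap
// corresponding to the allocating thread's processor.
size_t pick_loh_heap(const std::vector<HeapInfo>& heaps, size_t home,
                     ptrdiff_t min_size)
{
    size_t best = home;
    // Low (but non-zero) preference for the home heap: another heap must
    // beat it by min_size / 2 to be chosen.
    ptrdiff_t best_score = heaps[home].budget + min_size / 2;
    for (size_t i = 0; i < heaps.size(); ++i)
    {
        ptrdiff_t score = heaps[i].budget;
        if (heaps[i].numa_node != heaps[home].numa_node)
        {
            // Higher cost for leaving the home NUMA node
            // (was min_size * 4, now min_size * 3 / 2).
            score -= min_size * 3 / 2;
        }
        if (score > best_score)
        {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

For example, with `min_size = 100`, a same-node heap with budget 120 does not beat a home heap with budget 100 (it would need to exceed 150), while a cross-node heap needs a budget above 250 to be chosen.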
This showed big performance improvements (over 30% less time) in a
scenario with lots of LOH allocation where there were fewer allocating
threads than GC heaps. The changes were more pronounced the more we
allocated large objects vs small objects. There was usually slight
improvement (1-2%) when there were 48 constantly allocating threads and
48 heaps. The one place we did see a slight regression was in an 800MB
container and 4 allocating threads on a 48 processor machine; however,
similar tests with less memory or more threads were prone to running out
of memory or running very slow on the master branch, so we've improved
stability. Previously the GC could get lucky when the SOH choice
happened to also be a good choice for LOH, but we shouldn't rely on
that, as it failed in some container scenarios.
One more change is in joined_generation_to_condemn: If there is a memory
limit and we are about to OOM, we should always do a compacting GC. This
helps avoid the OOM and feeds into the next change.
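The joined_generation_to_condemn change amounts to one extra rule in the decision logic. The sketch below is illustrative only: `GcDecision` and `decide_compaction` are hypothetical names, standing in for the flags the real function manipulates.

```cpp
// Hypothetical, simplified model of the rule described above: under a hard
// memory limit, when we are about to OOM, force a full compacting GC so
// the subsequent allocation retry sees compacted heaps.
struct GcDecision
{
    bool compacting;             // sweep vs. compact
    int  condemned_generation;   // generation chosen to condemn
};

GcDecision decide_compaction(bool heap_hard_limit, bool about_to_oom,
                             GcDecision current)
{
    if (heap_hard_limit && about_to_oom)
    {
        // Sweeping would leave fragmentation behind, which could turn an
        // avoidable near-OOM into a real OOM; compacting maximizes the
        // contiguous space available for the retry.
        current.compacting = true;
        current.condemned_generation = 2;  // full GC
    }
    return current;
}
```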
This PR also adds a *second* balance_heaps_loh function for when there
is a memory limit and we previously failed to allocate into the chosen
heap. `balance_heaps_loh` works based on allocation budgets, whereas
`balance_heaps_loh_hard_limit_retry` works on the actual space available
at the end of the segment. Thanks to the change to
joined_generation_to_condemn, the heaps should already be compacted at
that point, so we do not need to look at free space here.
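The retry path can be sketched as a search over end-of-segment space rather than budgets. Again this is a hedged model: `retry_pick_heap`, `NO_HEAP`, and the plain vector of sizes are illustrative stand-ins for the real `balance_heaps_loh_hard_limit_retry`, which walks `gc_heap` segments and returns nullptr on failure.

```cpp
#include <cstddef>
#include <vector>

// Sentinel mirroring the nullptr return of the real function.
constexpr size_t NO_HEAP = static_cast<size_t>(-1);

// end_of_seg_space[i] models the actual space left at the end of heap i's
// (compacted) LOH segment. Pick the heap with the most such space that can
// still fit the allocation; report NO_HEAP if none can.
size_t retry_pick_heap(const std::vector<size_t>& end_of_seg_space, size_t size)
{
    size_t best = NO_HEAP;
    size_t best_space = 0;
    for (size_t i = 0; i < end_of_seg_space.size(); ++i)
    {
        if (end_of_seg_space[i] >= size && end_of_seg_space[i] > best_space)
        {
            best_space = end_of_seg_space[i];
            best = i;
        }
    }
    return best;
}
```

The design point is that after the forced compacting GC, end-of-segment space is an accurate measure of what each heap can really hold, whereas budgets only reflect pacing policy.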
* Fix uninitialized variable
* In a container, use space available instead of budget
* Fix duplicate semicolon
Diffstat (limited to 'src/gc/gcpriv.h')
-rw-r--r--  src/gc/gcpriv.h | 12
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/gc/gcpriv.h b/src/gc/gcpriv.h
index b13cd24fa4..34c820aa22 100644
--- a/src/gc/gcpriv.h
+++ b/src/gc/gcpriv.h
@@ -1222,9 +1222,15 @@ public:
                                  alloc_context* acontext);
 
 #ifdef MULTIPLE_HEAPS
-    static void balance_heaps (alloc_context* acontext);
+    static
+    void balance_heaps (alloc_context* acontext);
+    PER_HEAP
+    ptrdiff_t get_balance_heaps_loh_effective_budget ();
     static
     gc_heap* balance_heaps_loh (alloc_context* acontext, size_t size);
+    // Unlike balance_heaps_loh, this may return nullptr if we failed to change heaps.
+    static
+    gc_heap* balance_heaps_loh_hard_limit_retry (alloc_context* acontext, size_t size);
     static
     void gc_thread_stub (void* arg);
 #endif //MULTIPLE_HEAPS
@@ -1232,6 +1238,8 @@ public:
     // For LOH allocations we only update the alloc_bytes_loh in allocation
     // context - we don't actually use the ptr/limit from it so I am
     // making this explicit by not passing in the alloc_context.
+    // Note: This is an instance method, but the heap instance is only used for
+    // lowest_address and highest_address, which are currently the same accross all heaps.
     PER_HEAP
     CObjectHeader* allocate_large_object (size_t size, int64_t& alloc_bytes);
 
@@ -1446,7 +1454,7 @@ protected:
     PER_HEAP
     allocation_state try_allocate_more_space (alloc_context* acontext, size_t jsize,
                                               int alloc_generation_number);
-    PER_HEAP
+    PER_HEAP_ISOLATED
     BOOL allocate_more_space (alloc_context* acontext, size_t jsize,
                               int alloc_generation_number);