Hoard: a scalable memory allocator for multithreaded applications
Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, Paul R. Wilson
Abstract
Parallel, multithreaded C and C++ programs such as web servers,
database managers, news servers, and scientific applications are
becoming increasingly prevalent. For these applications, the memory
allocator is often a bottleneck that severely limits program
performance and scalability on multiprocessor systems. Previous
allocators suffer from problems that include poor performance and
scalability, and heap organizations that introduce false
sharing. Worse, many allocators exhibit a dramatic increase in
memory consumption when confronted with a producer-consumer pattern
of object allocation and freeing. This increase in memory
consumption can range from a factor of P (the number of processors)
to unbounded memory consumption.
This paper introduces Hoard, a fast, highly scalable allocator that largely
avoids false sharing and is memory efficient. Hoard is the first
allocator to simultaneously solve the above problems. Hoard
combines one global heap and per-processor heaps with a novel
discipline that provably bounds memory consumption and has very low
synchronization costs in the common case. Our results on eleven
programs demonstrate that Hoard yields low average fragmentation
and improves overall program performance over the standard Solaris
allocator by up to a factor of 60 on 14 processors, and up to a
factor of 18 over the next best allocator we tested.