Page Allocation
Consider a system with a 4kB page size and a physically-indexed 64kB
direct-mapped cache. You are asked to implement a conflict-aware page
allocator in the OS, and you must choose between page coloring or bin hopping.
Assuming you maintain one free list per color, how many do you need?
Astutely, you ask what the expected workload for the system is going to be. Indicate whether you would use bin hopping or page coloring, and why, for each of the following scenarios:
Context switches are frequent, and nearly all processes are
extremely small (using just 1 code page and 1 stack page for data)
Context switches are less frequent and nearly all processes are
very data-intensive, with working sets that are roughly equal to, or larger than, the cache size
Answer
Number of colors = cache size / page size = 64K / 4K = 16
Page coloring uses the virtual address to select a color when
choosing a physical page to allocate. The stack (and the code, but
assume we're only looking at the data cache) starts at the same
virtual address in every process, and so will prefer the same physical
page color. Each process will thus conflict with the previous one
when it accesses its stack after a context switch. Bin hopping
chooses a physical page of the next color from the previous
allocation, so as new processes are created, they will be given pages
for their stacks that are of different colors, reducing conflicts in
the cache after context switches. (Of course, exactly what degree of
conflict we get depends on which processes are scheduled after each
other, but at least it isn't guaranteed to be a problem, as with page
coloring).
In this case, we mostly care about cache conflicts within a
single process, since the limited cache capacity will lead to misses
across context switches anyway. Page coloring naturally exploits
spatial locality in the virtual address space, while bin hopping
naturally exploits temporal locality in the order of page accesses.
Without more information about the nature of these data intensive
processes, it is impossible to say which will be more
common/important. However, with page coloring, the application
programmer can more easily reason about cache conflicts and control
the layout of key data structures in the virtual address space to try
and reduce them. For that reason, I would prefer page coloring.