CSC 469F / CSC 2208F | Assignment 1 | Fall 2007

CSC 469 / CSC 2208. Advanced Operating Systems

Angela Demke Brown
Fall 2007

Assignment 1: Parallel Memory Allocator

[CSC469 Home Page]

Due: 4 p.m., October 29, 2007.

  1. Introduction
  2. Assignment Logistics
  3. Submission Instructions
Introduction

GOAL: To explore advanced synchronization and cache effects through the development of a parallel memory allocator.

Overview

The allocation and management of blocks of virtual memory is a key function required by nearly all modern programs. Multi-threaded parallel programs are no exception, however, the use of a memory allocator designed for sequential programs can become a serious bottleneck limiting the performance and scalability of the application. Fast operation and efficient use of memory are the typical criteria by which a sequential allocator is evaluated; parallel allocators must also take scalability into account.

The simplest way to adapt a sequential allocator for use by parallel programs -- placing a single lock around the bodies of the malloc and free functions -- serializes all memory allocation and deallocation requests, limiting scalability. Even if the allocator is designed to allow individual malloc and free requests to proceed (mostly) in parallel, scalability problems can remain due to cache effects.

In their paper on Hoard, Berger et al. list the following features that are required for scalable and memory-efficient allocators (paraphrased from the paper):

Your task in this assignment is to design and implement a parallel memory allocator that (attempts to) provide the four features listed above.

Assignment Logistics

1. Code Provided

We are providing a package with two sample serial allocators that have been made thread-safe by the use of a single lock that surrounds all malloc and free operations. The package also contains some sample benchmarks that you can use to evaluate your allocator, and some utility routines that you can and/or must use in your implementation.

The package can be found on cdf at /u/csc469h/fall/pub/A1/a1_distrib.tgz; the package contains the following directories:

2. Specifications

Your allocator consists of the following three functions, which are declared in include/malloc.h:

       int   mm_init(void);
       char *mm_malloc(size_t size);
       void  mm_free(void *ptr);

You will add a subdirectory to allocators for your implementation, and fill in these function bodies in a file named malloc.c. You will also need to modify the Makefile in the allocators subdirectory to build your new allocator. Feel free to add multiple new subdirectories as you experiment with different designs. The only thing you will hand in, however, is the malloc.c file for your final implementation. All your code must be contained in the file named malloc.c; you are not allowed to change any of the interfaces to the three functions above. Of course you can define as many helper routines as you like within this file.

Your allocator interacts with an arbitrary application program in the following way. As part of its initialization phase, the application calls your mm_init function to perform initialization of the heap. You must allocate the necessary initial heap area (starting with a call to mem_init to create the chunk of memory that you can use), and initialize all structures that you need. The application then makes a series of calls to mm_malloc and mm_free. You may assume that any programs we test with will always call mm_init before any calls to mm_malloc/mm_free and before creating any threads.

You are allowed to use any of the functions from memlib.c in your allocator:

The functions in memlib.c are not generally thread-safe. They are safe in the current allocators because they are only called from within routines that are themselves protected by a lock. If you use finer-grained locking, you must ensure that calls to mem_sbrk are serialized by grabbing a suitable lock before the call and releasing it after the return. You are not allowed to modify the functions in memlib.c themselves.

The functions you are to implement are the following:

The two allocators provided, kheap and cmu_malloc, both meet the specification provided above, but they are deficient as parallel allocators. Both will actively induce false sharing and neither is scalable. The baseline sequential performance and memory utilization of both could be improved.

3. Rules of Engagement

4. Grading Criteria

Your allocator will be graded on correctness (30%), performance (50%), and style (10%). You must also provide a writeup describing your allocator, other alternatives that you considered or tried, and an analysis of your allocator's performance. This writeup will be worth the final 10%.

The performance of your allocator will be evaluated with respect to the four features listed at the beginning of this document: speed on sequential programs, scalability, false sharing avoidance, and fragmentation. Scalability will be given slightly greater weight than the other criteria; the exact formula to be used will be added later.

Since obtaining good measurements of your allocator's performance on a shared SMP server will be difficult, if not impossible, we are setting up an 8-core timing server for this assignment. The physical hardware of this machine is identical to the CDF redwolf and greywolf servers, however access to it will be restricted to ensure that your tests run with a minimum of interference. The timing server will be available on Saturday. You will submit your malloc.c file to the server, the provided benchmarks will be run, and you will be emailed the results. Details on how to submit to the timing server will be provided once the server is operational.

What (and how) to hand in

What

How