CSC 469F | Sample Term Test 1 | Fall 2006

First Term Test

On-demand answers appearing here in red

  1. Definitions: [15 points; 3 each] Define the following terms in the context of this course.
    1. virtual machine monitor
    2. IPC
    3. contention
    4. convoying
    5. gang scheduling

  2. [9 points] Contrast the SPIN approach to system extensibility with that of loadable kernel modules. What is the primary benefit and drawback of each?

  3. [9 points] In the paper "Operating system support for virtual machines", the authors classify VMMs as belonging to one of two classes (Type I and Type II) in their terminology. What is the defining characteristic of each class? In which category does Xen belong? VMWare Workstation?

  4. Performance: [12 points; 4 each]
    1. You have a program that spends 60 percent of its time doing I/O and 40 percent of its time doing computation. If you double the performance of the I/O system, what is the resulting speedup? (Show your calculations)
       
      Speedup = T_original / T_improved
      If the original is 60s I/O and 40s computation, then after doubling I/O speed you have 30s I/0 and still 40s computation, for 70s total. So:
      Speedup = 100/70 = 1.43
    2. Which of the following can be explained using Amdahl's Law?
      1. Processor speed has been doubling every 18 months.
      2. The gap between processing and I/O is widening.
      3. Application performance has not been growing as rapidly as processor performance.
      Explain it.
       
      Amdahl's law explains the third item: application's do not spend all their time using the processor, and only the fraction of their time spent using the CPU is improved by making the CPU faster. Overall application speedup must therefore be less than CPU speedup, in general.
    3. You have just profiled your system extensively. Which scenario below gives you the greatest confidence you will be able to improve the performance of the system? Explain.
      1. Two routines together account for 80 percent of the total execution time.
      2. Each subroutine accounts for approximately 1 percent of the total execution time.
      3. One routine accounts for 10 percent of the total execution time.

       
      The first scenario is the one you hope for. By focusing your attention on just two routines, you can capture the code where the application spends most (80%) of its execution time. Of course, this presumes that the size and complexity of the routines is roughly similar, and not that someone has manually inlined 80% of the small routines into just two messy mega-routines.

  5. [12 points; 4 each] Event notification:
    1. What is the primary source of inefficiency in the traditional Unix poll() or select() mechanisms?
    2. Explain how the FreeBSD kqueue() mechanism addresses this problem.
    3. Why was Unix poll() an acceptable design in the 1970's? What changes to hardware and applications made a mechanism like kqueue feasible and necessary for today's systems?

  6. [12 points; 8 and 4]
    1. [8 points]Here is the definition of a push() operation for a lock-free stack:
      typedef struct node {
              long key;
              struct node *next;
              struct node *mr_next; /* for limbo list */
      } node_t;
      
      /* Visible to all threads; initially NULL. */
      node_t *stack_top;
      
      void push (long t)
      {
              node_t *n = new_node ();
              n->key = t;
              do {
                 node->next = stack_top;
              } while (!CAS(&stack_top, node, node->next));
      }
      
      
      CAS returns TRUE on success and FALSE on failure.
      Write the corresponding pop() operation. Assume you have access to a deferred deletion function free_node_later (node_t *), so you don't have to worry about the ABA problem.
       
      Note that this is very similar to the lock-free stack from Lecture 10, except in the notes, the push() was provided the node to push onto the stack, and in this question, it is given only the value, and must allocate the node using new_node() and initialize it to hold the key inside the push() function. The pop() function is similarly responsible for deleting the node and just returning the key value in this case.
      int pop(long *t)
      {
              node_t *first, *second;
      
              do {
                 first = stack_top;
                 if (first != NULL) {
                     second = first->next;
                 } else return 0; /* FALSE - stack is empty, can't pop */
              } while (!CAS(&stack_top, first, second));
      
              t = first->key;
              free_node_later(first);
              return 1;  /* TRUE - t contains value popped off stack */
      }
      
    2. [4 points] Explain how the ABA problem can occur in the lock-free stack pop() if you do not have a deferred deletion function.
       
      Suppose, as in the stack from the lecture notes, a thread, T1, is trying to do a pop, and has set its local variables first and second. Suppose also that instead of free_node_later() the pop() routine calls a free_node_now() function that puts the available node at the front of a free list, and that a call to new_node returns the node at the front of that free list. Other threads, T2 and T3, perform the following:
      • T2: pop(&val1) - node at top of stack (pointed to by T1's first) goes onto free list
      • T2: pop(&val2) - node at top of stack (pointed to by T1's second) goes onto free list
      • T3: begins push(x) - calls new_node and gets first free (the node still pointed to by T1's second variable)
      • T2: push(y) - gets next node from free list (the node pointed to by T1's first variable, sets its key to y, and pushes it onto the stack.
      • T1: continues its push - CAS(&stack_top, first, second) succeeds because the node pointed to by first is now on top of the stack again! (The stack_top went from a value "A" to a different value "B" and back to "A" again between the time that we recorded the value of the stack_top, and the time we test to see if it changed with the CAS.) The CAS operation succeeds, setting the stack_top to point to the node second, which is the same node that T3 is in the middle of pushing onto the stack!!
      • T3: finishes push(x) - sets its node's next pointer to point to the stack_top (which is the same node now) and updates the stack_top to point to the node (which it already did)
      At this point, our stack consists of a single node linked to itself, and anything else that was in the stack has been lost. The problem occurs because a node is freed and reallocated, and re-inserted into the stack while a thread holds a reference to its original "incarnation". CAS is not powerful enough to tell that anything has changed.
       
      NOTE This answer is more detailed that I would expect from you - I included extra info in case anyone was unclear on the stack example from the lecture. You could easily omit the detailed steps in this case, for example, as long as the description made it clear you understood the problem.

  7. [10 points] Question sketch. Given a description of a set of jobs (i.e., arrival time, number of threads, required CPU time) construct a gang schedule showing the jobs that are grouped together. Identify the total number of wasted CPU cycles (time slices on some CPU when nothing was scheduled) due to gang scheduling.

demke
Last modified: Thu Oct 26 01:03:47 EDT 2006