CSC 2227S | Summer 2021 | Readings

Papers

For each meeting, readings are assigned. Usually, the readings will consist of two computer systems papers. The papers selected for this course are either classic papers or papers from recent top conferences. You are expected to read these papers thoroughly and submit a review BEFORE arriving at class on Tuesdays.

Each paper will be briefly presented by a student in the class, who will also lead the discussion of that paper. For each class meeting, we identify the topic and papers below; for each, we also try to identify good sources for background reading and for further investigation.

To enter your paper reviews, go here.

Electronic versions are available from the course review site.

(NOTE: This schedule is not set in stone. Some changes may be made during the term.)

Week 0 - May 4: Welcome to CSC 2227

There will be no class meeting this week.

The following items are intended to provide an overview of the course (how it will operate and what is expected of you) and to help you refresh your memory of operating systems. These are to help you prepare for the course and assess your own knowledge of the pre-requisite material.

Week 1 - May 11: Historical Distributed Systems

presented by Angela Demke Brown

Read and review the following papers:

  1. Grapevine: an exercise in distributed computing
    Andrew D. Birrell, Roy Levin, Michael D. Schroeder, and Roger M. Needham. In Communications of the ACM, Vol. 25, No. 4, pp. 260-274, April 1982. (ACM SIGOPS HoF paper 2008)
    http://doi.acm.org/10.1145/358468.358487
  2. A Comparison of Two Distributed Systems: Amoeba and Sprite
    Fred Douglis, M. Frans Kaashoek, John K. Ousterhout, and Andrew S. Tanenbaum. In Computing Systems, Vol. 4, No. 3, pp. 353-384, December 1991.
    https://www.usenix.org/legacy/publications/compsystems/1991/fall_douglis.pdf

Additional suggested reading

Week 2 - May 18: Classic Distributed File Systems

Read and review the following papers:

  1. Scale and Performance in a Distributed File System
    John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West. In ACM Transactions on Computer Systems (TOCS), vol. 6, no. 1, pp. 51--81, February 1988. (ACM SIGOPS HoF paper 2008, originally published in Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (SOSP’87), November 1987.)
    https://doi.org/10.1145/35037.35059
  2. Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency
    Cary G. Gray and David R. Cheriton. In Proceedings of the Twelfth ACM Symposium on Operating Systems Principles (SOSP’89), pp. 202--210, December 1989. (ACM SIGOPS HoF paper 2009)
    https://doi.org/10.1145/74850.74870

Additional suggested reading

Week 3 - May 25: Placement and Lookup Services

Read and review the following papers:

  1. Chord: A scalable peer-to-peer lookup service for internet applications.
    Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. In Proceedings of the 2001 Conference on Applications, technologies, architectures, and protocols for computer communications (SIGCOMM '01), pp. 149-160, August 2001. (ACM SIGOPS HoF paper 2015)
    http://doi.acm.org/10.1145/383059.383071
  2. CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data
    Sage A. Weil, Scott A. Brandt, Ethan L. Miller and Carlos Maltzahn. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC'06), November 2006.
    https://doi.org/10.1109/SC.2006.19

Additional suggested reading

Several closely related papers were published in 2001, including Chord, Pastry, CAN, and the Tapestry technical report.

Week 4 - June 1: Coordination Services

Read and review the following papers:

  1. The Chubby Lock Service for Loosely-Coupled Distributed Systems
    Mike Burrows. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI'06), pp. 335--350, November 2006. (ACM SIGOPS HoF paper 2017)
    https://www.usenix.org/legacy/event/osdi06/tech/burrows.html
  2. NetChain: Scale-Free Sub-RTT Coordination
    Xin Jin, Xiaozhou Li, Haoyu Zhang, Nate Foster, Jeongkeun Lee, Robert Soulé, Changhoon Kim and Ion Stoica. In Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI’18), pp. 35--49, April 2018. (Best paper award)
    https://www.usenix.org/node/211262

Additional suggested reading

Some understanding of distributed consensus protocols is needed to follow these coordination services papers. The first entry below (Raft) presents a consensus algorithm that was designed to be easy (or easier, compared to Paxos) to understand.

Week 5 - June 8: Distributed Shared Logs

Read and review the following papers:

  1. Corfu: A Distributed Shared Log
    Mahesh Balakrishnan, Dahlia Malkhi, John D. Davis, Vijayan Prabhakaran, Michael Wei and Ted Wobber. In ACM Transactions on Computer Systems (TOCS), vol. 31, no. 4, pp. 10:1--10:24, December 2013.
    https://doi.org/10.1145/2535930

Additional suggested reading

Shared logs are a key component of many transactional systems. One early example is QuickSilver. Many others followed.

Week 6 - June 15: Key-Value Stores

Read and review the following papers:

  1. Bigtable: A Distributed Storage System for Structured Data
    Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes and Robert E. Gruber. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI'06), pp. 205--218, November 2006. (Best paper award)
    https://www.usenix.org/conference/osdi-06/bigtable-distributed-storage-system-structured-data
  2. FlashStore: High Throughput Persistent Key-Value Store
    Biplob Debnath, Sudipta Sengupta and Jin Li. In Proceedings of the VLDB Endowment, Volume 3, Number 1-2, pp. 1414–1425, September 2010.
    https://doi.org/10.14778/1920841.1921015

Additional suggested reading

Week 7 - June 22: Designing Key-Value Stores for SSDs

Read and review the following papers:

  1. WiscKey: Separating Keys from Values in SSD-conscious Storage
    Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. In Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST'16), pp. 133--148, Feb. 2016.
    https://www.usenix.org/node/194425
  2. KVell: the design and implementation of a fast persistent key-value store
    Baptiste Lepers, Oana Balmau, Karan Gupta and Willy Zwaenepoel. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP '19), pp. 447--461, Oct. 2019.
    https://doi.org/10.1145/3341301.3359628

Additional suggested reading

Week 8 - June 29: In-Memory Distributed Computing and Storage

Read and review the following papers:

  1. Fast Crash Recovery in RAMCloud
    Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP'11), pp. 29--41, October 2011.
    http://doi.acm.org/10.1145/2043556.2043560
  2. FaRM: Fast Remote Memory
    Aleksandar Dragojević, Dushyanth Narayanan, Orion Hodson, and Miguel Castro. In Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI'14), pp. 401--414, April 2014.
    https://www.usenix.org/conference/nsdi14/technical-sessions/dragojevic

Additional suggested reading

Week 9 - July 6: Experiences with real-world large scale systems

Read and review the following papers:

  1. A large scale analysis of hundreds of in-memory cache clusters at Twitter
    Juncheng Yang, Yao Yue and K. V. Rashmi. In Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pp. 191--208, Nov. 2020.
    https://www.usenix.org/conference/osdi20/presentation/yang
  2. Evolution of Development Priorities in Key-value Stores Serving Large-scale Applications: The RocksDB Experience
    Siying Dong, Andrew Kryczka, Yanqin Jin and Michael Stumm. In Proceedings of the 19th USENIX Conference on File and Storage Technologies (FAST 21), pp. 33--49, Feb. 2021.
    https://www.usenix.org/conference/fast21/presentation/dong

Additional suggested reading

Week 10 - July 13: Non-Volatile Main Memory

Read and review the following papers:

  1. NOVA-Fortis: A Fault-Tolerant Non-Volatile Main Memory File System
    Jian Xu, Lu Zhang, Amirsaman Memaripour, Akshatha Gangadharaiah, Amit Borase, Tamires Brito Da Silva, Steven Swanson, and Andy Rudoff. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP'17), pp. 478--496, October 2017.
    https://doi.org/10.1145/3132747.3132761
  2. Twizzler: a Data-Centric OS for Non-Volatile Memory
    Daniel Bittman, Peter Alvaro, Pankaj Mehra, Darrell D. E. Long and Ethan L. Miller. In Proceedings of the 2020 USENIX Annual Technical Conference (USENIX ATC 20), pp. 65--80, July 2020.
    https://www.usenix.org/conference/atc20/presentation/bittman

Additional suggested reading

Current systems research on non-volatile memory systems covers a wide range of topics, including designing persistent data structures and programming frameworks, persistent memory file systems, hybrid (or tiered) paging systems, and databases on persistent memory. The selections here are just a few interesting points in that space.

Week 11 - July 20:

Read and review the following papers:

  1. Arrakis: The Operating System is the Control Plane
    Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, Thomas Anderson and Timothy Roscoe. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI'14), pp. 1--16, October 2014. (Best Paper award)
    https://www.usenix.org/node/186141
  2. IX: A Protected Dataplane Operating System for High Throughput and Low Latency
    Adam Belay, George Prekas, Ana Klimovic, Samuel Grossman, Christos Kozyrakis and Edouard Bugnion. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI'14), pp. 49--65, October 2014. (Best Paper award)
    https://www.usenix.org/node/186147

Additional suggested reading

Week 12 - July 27:

Read and review the following papers:

  1. Rethinking the Library OS from the Top Down
    Donald E. Porter, Silas Boyd-Wickizer, Jon Howell, Reuben Olinsky and Galen C. Hunt. In Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'11), pp. 291--304, March 2011.
    https://doi.org/10.1145/1950365.1950399
  2. Unikernels: library operating systems for the cloud
    Anil Madhavapeddy, Richard Mortier, Charalampos Rotsos, David Scott, Balraj Singh, Thomas Gazagnaire, Steven Smith, Steven Hand and Jon Crowcroft. In Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'13), pp. 461--472, March 2013.
    https://doi.org/10.1145/2451116.2451167

Additional suggested reading