Advanced topics in database systems (CSC2531), Fall 2011
This course is a reading seminar
with a focus on the intersection of database engines and modern
hardware. Topics include non-traditional database engine designs,
cache- and multicore-aware algorithms, hardware acceleration for
database operations (e.g. FPGA and GPU), as well as flash and other
new storage technologies.
Instructor: Ryan Johnson
Lectures: Tue 9:00-11:00 in BA 2179
Office: BA 5226
TA: Mo Sadoghi (office hours TBD)
Office hours: TBA in BA 5226 (or by appointment)
Instructor telephone: 416 946-7069
Course email: email@example.com
Course web page: http://www.cs.utoronto.ca/~ryanjohn/teaching/csc2531-f11/
This course is a reading seminar, and most of the work consists of
reading, summarizing, and discussing research papers (2-3 per
week). Participation counts heavily; all students are expected to
come to class having read assigned papers for the week. To this end,
a concise write-up is due at the beginning of each class, with full
marks awarded to those which indicate the student is prepared to
contribute to the day's discussions (in other words, don't just hand
in an extended summary of the paper). Please be aware that this is
meant to be a real discussion. You are allowed, even encouraged, to
question assumptions and disagree with others (politely!). Being
"wrong" will only cost you participation marks if it is blatantly
clear that you didn't read the paper and are making things up as you go.
Each week a pair of students will present to the class a summary
and critique of the week's readings, with open discussions to
follow. Students presenting papers in the same week are encouraged
to select papers which complement or contrast each other. Time
permitting, we will dedicate a few minutes of each lecture to
feedback on the presentations themselves. In addition to receiving
marks for the presentation they deliver, students will receive marks
based on the quality of feedback they give to their peers. Positive
feedback (identifying things done well) is just as important as the
negative feedback we usually give and receive.
Clever algorithms are necessary but usually insufficient to produce
a successful systems design. Therefore, in addition to the readings,
each student (in groups of two) will complete a programming project
which aims to either reproduce or extend the state of the art in a
relevant area. The project forms an important, hands-on part of the
learning process, so original research, while always encouraged,
is not required. Students are welcome to incorporate their own
ongoing research as long as they make clear what work they will
accomplish during the course. The project mark will be based on a
design document, 1-2 milestones to encourage students to keep on
track, the presentation and final report, and the completeness and
quality of the deliverables.
There is no final exam; the final project (code and write-up) will
be due during finals week, with students giving a brief (~15 minute)
presentation of their work during the last lecture.
Tentative schedule
Over the twelve-week course we will cover
ten fixed topics (listed below). Mo and I will lead the first
discussion, and the day's write-up will be generated during class.
Several papers are available for each theme; the students presenting
at each lecture will choose 2-3 papers with 1-2 weeks' notice. During
the remaining week we will (re)visit topics of particular interest to
the students. Please indicate your preliminary preferences regarding
in-class presentations using this Doodle poll. I
had to make it a hidden poll for privacy reasons, but we'll discuss
the results in class and hammer out a final schedule based on your
Breakdown of marks
The course mark will be broken down into
the categories listed below, with points assigned as indicated:
|Weight|Item|Minimal mark|Moderate mark|High mark|
|---|---|---|---|---|
|30%|Participation|Present|Talkative|Insightful comments or questions|
|20%|Presentations|Factually correct|Designed and delivered well|Effectively conveys key points, implications, etc.|
|15%|Written critiques|Accurate summary of paper, or list of facts from it|Identifies important points|Thoughtful analysis/critique that informs the discussion|
|5%|Quality of feedback to peers|Focus on nitpicks and minutiae|Suggest incremental improvements|Identify structural strengths and flaws|
|30%|Final project|Unambitious and/or badly planned|Partially implemented and/or poorly presented|Implemented successfully with key learning points presented|
Many of the above criteria are cumulative. For example, there is
nothing wrong with pointing out minutiae in a peer's presentation, as
long as it does not overlook more important issues.
The project proposal (due 18 Oct) should contain the following information:
- Topic to be addressed and the nature of the problem
- State of the art (prior work, what remains unsolved, etc.)
- The proposed technique to be implemented/evaluated
- To what degree the project will repeat existing work
- Specific, measurable goals: deliverables, and dates you expect to produce them
I'd normally expect the above to occupy 2-4 pages, but word counts are not the priority here.
Announcements and clarifications
- We'll start talking about course projects in mid-October, at
which point I'll circulate a list of suggested topics and give
- Write-ups do not need to be long. The exact format is up
to you, but one approach that can be very effective is called
"3-2-1" -- list three key points that summarize the paper, two
comments about the implications or impact of the work, and one
question/issue you'd like to see discussed in class.
- As requested, I've posted the slides I used for MonetDB
today. It would be good if presenters could also send their slides
for me to make available.
- Presentations should be about 25 minutes long, leaving 15-20
minutes for discussion and 5-10 minutes for post-mortem analysis of
the slides. You are welcome to use the authors' original slides if
they are available and if you feel that they are appropriate for the
class (all too often they are not, unfortunately).
- You are most welcome to send paper critiques by email.
- PLEASE send email to the class list -- Mo can't answer questions
or mark your summaries if you send them directly to me.
- If you did not enter the poll you will not be assigned any
papers, which would not be particularly good for your mark.
- (update below)
If people auditing the class are willing
to present, everyone will present only once. For now, I've
tentatively filled all slots with the following folks presenting
twice: George, Ioan, Mike, and Nosayba. This is NOT a final
assignment, and hopefully we'll be able to avoid double-booking
anyone when the dust settles.
- Project proposal time! Feel free to come up with your own, or
check out a list of project ideas here. You should choose a
partner (systems projects tend to be big) and come up with at least
an informal project idea by 11 Oct, with a formal proposal due 18
Oct. Please meet with either the instructor or TA as a sanity check
before settling on a project!
- At this point all the double-bookings should have been cleared
excepting Ioan, who requested to keep his second presentation.
- Javed has a project idea involving data mining on flash storage,
and is looking for a partner. Contact him directly at his cs email
(jsiddique) if you're interested.
- Details on the format of the project proposal are now available above.
- Some of you have asked about progress (= marks) to date. I just sent emails to people who did not submit some paper critiques; if you did not receive one of those emails, you currently have full marks in that area. Class participation and presentation quality have been excellent, and I have not marked anyone down in those areas so far. Good job, everyone!
- Shimin Chen on the multi-million cycle FPB+tree probes: "In the experimental section, the figures report the aggregate time for 2000 operations (search/insert/delete). This is explained in the description of the figures in Sec 4.2. For example, in Figure 10, the 2000 searches cost about 1--3 million cycles. So a single search should be about 500-1500 cycles."
- So far I have received eight project proposals which cover the following people: Andy, Ben, Bilal, Fatemeh, George, Ioan, Javed, Jiang, Mike, Nataliya, Nosayba, Suprio. If you handed in a proposal and I didn't list you, please contact me ASAP!
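
As a quick check on the arithmetic in Shimin Chen's note on the FPB+tree figures above, the following minimal sketch divides the aggregate cycle counts by the number of operations. The inputs (2000 operations costing roughly 1-3 million cycles in total) come from the quoted note; the helper name `cycles_per_op` is made up for illustration and does not appear in the paper.

```python
def cycles_per_op(total_cycles: int, num_ops: int) -> float:
    """Average cost of one operation, given the aggregate cost of num_ops operations."""
    return total_cycles / num_ops


NUM_OPS = 2000  # operations per reported data point, per the quoted note

# Aggregate range read off the note's Figure 10 example: about 1-3 million cycles.
low = cycles_per_op(1_000_000, NUM_OPS)
high = cycles_per_op(3_000_000, NUM_OPS)

print(f"{low:.0f}-{high:.0f} cycles per search")  # prints "500-1500 cycles per search"
```

The result matches the 500-1500 cycles per search stated in the note, which is a far more plausible cost for a single index probe than the multi-million figure read directly off the plots.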