Important links:
Tuesday, November 14: Your final project reports are due Wednesday, December 6 at 11:59pm ET on MarkUs. You can use this LaTeX template for your submission. Guidelines are written in the template.
The goals of this seminar class are twofold: first, students will be acquainted with computational tools and techniques for processing and thinking about social data, and be exposed to papers spanning the full spectrum of computational social science methods from large-scale empirical data analysis to online experimentation. Second, students will develop research skills by reading, reviewing, presenting, and discussing recent academic papers.
Every week, we will cover a different computational social science topic by reading two papers and discussing them. Before class, everyone will write a review of the papers, identifying their research questions, strengths and weaknesses, and connection to other literature. Each paper will be assigned to 2-3 people who will lead a group discussion of it in class. Throughout the term, everyone will get the chance to present one paper.
The major coursework component of the course, besides the weekly reviews, will be a term project. Students will propose a topic, give a presentation on their work, and submit a final report. The project will give students a chance to identify an interesting computational social science problem and investigate it, and could potentially lead to publication in a workshop or conference.
Grading scheme:
We will read the book Bit By Bit: Social Research in the Digital Age by Matthew Salganik. It's available online for free, and in print form at a reasonable cost.
Week | Date | Topic | Reviews Due | Textbook Readings |
---|---|---|---|---|
1 | 9/7 | Introduction to computational social science [Slides] [Video] (2nd half) | | Ch. 1 |
2 | 9/14 | Introduction to computational social science cont'd [Slides] [Video] | | Ch. 1 |
3 | 9/21 | Observational studies 1 [Video] | 9/20 9:00pm | Ch. 2 |
4 | 9/28 | Observational studies 2 | 9/27 9:00pm | Ch. 2 |
5 | 10/5 | Experiments 1 | 10/4 9:00pm | Ch. 4 |
6 | 10/12 | Project proposals | ||
7 | 10/19 | Experiments 2 | 10/18 9:00pm | Ch. 4 |
8 | 10/26 | Asking questions | 10/25 9:00pm | Ch. 3 |
9 | 11/2 | Applying machine learning | 11/1 9:00pm | |
10 | 11/16 | Ethics in computational social science | 11/15 9:00pm | Ch. 6 |
11 | 11/23 | Project presentations (Part 1) | ||
12 | 11/30 | Project presentations (Part 2) | ||
Main papers:
Students will present their project proposals.
Students will present their final projects.
The main point of this class is to engage with important and cutting-edge research at the interface of computer science and the social sciences. This involves reading, reviewing, discussing, and presenting papers. Reviewing papers for CSC2552 will help you develop your reviewing and critical thinking skills, as well as prepare you for in-class discussions. In what follows, I've written some thoughts on how to write a good review. Every week that we have a discussion class, your paper reviews will be due at 9pm on the Wednesday before class.
Breaking papers down. Our structure for thinking about papers in this class can be roughly categorized into three components: the "front matter", the "meat", and the "back matter". The front matter consists of high-level motivation ("Why are we studying this domain and these questions? Why should we care about this? What will we gain if we successfully accomplish this paper's goals?"), the research question [RQ] ("What is the high-level research question that motivates this paper?"), and the concrete operationalization [CO] ("What is the answerable question that the researchers actually answer?"). Notice the difference between the RQ and the CO: the RQ is the high-level question, typically too broad and abstract to be completely answered in a single paper, whereas the CO is the question the researchers actually answer, and is usually the RQ where several high-level constructs are instantiated ("operationalized"). For example, in the structural virality paper I presented in the first lecture, the motivating research question is: "How does information spread in the world?" and the concrete operationalization is: "How structurally viral are diffusion cascades on Twitter?". In this CO, we've operationalized "information" as "URLs", "the world" as "Twitter", and "spread" as our new metric of "structural virality". Usually, the "front matter" of a paper is written in the front, in the Introduction (and potentially elsewhere, such as the Methods).
The "meat" of the paper is the analysis, where the actual work is reported (e.g. the observational analysis, the results of the experiment, or the survey findings). You'll typically find this in the middle of the paper, and it will normally make up the majority of the content. Common section headers are Results, Analysis, etc.
Finally, the back matter of the paper interprets the paper's findings and discusses their implications. The authors should explicitly discuss how their results answer the research question, and what their results mean for it. They will also usually discuss the paper's limitations and suggest directions for follow-up work.
Review structure. The structure of your review should mirror this 3-part paper structure. First, provide a concise (1-2 sentence) summary of the paper. What is the main point of the paper? This demonstrates that you understood the high-level point of the work, and it's often a useful exercise to circumscribe the domain the paper is exploring. It's important to ensure that your summary is brief — if you can't summarize the point concisely, your understanding of the work probably still isn't crisp enough. Then, explicitly state both the motivating research question that the authors are asking and the concrete operationalization that the authors use in the work. This will comprise the "front matter".
Next, describe the methodological approach that the authors take to answering their CO, and discuss the strengths and weaknesses of this approach. For example, how did the authors' choice of conducting a particular observational analysis play out? These should be major pros and cons, not little nitpicks. It's extremely rare that an approach doesn't have both several strengths and several weaknesses. Research is difficult, and there are almost always tradeoffs. What are the compromises the authors made? What is good about their approach, and what are the limitations? How might one address these limitations?
Finally, discuss the back matter of the work and your thoughts on the paper's contribution. What exactly have the authors shown? Do they answer their research question and address their motivation? Do you agree with their interpretation of the results? What is the difference between the world after this paper and the world without it? Do the strengths of their approach justify the weaknesses? Should the authors have done anything differently, in your opinion? What else should they have done? What are the implications of the results? How does this research inform or complement other work in computational social science, or society in general?
Throughout your review, discuss the computational social science methods used in the framework set out by Salganik in Bit By Bit.
Finally, reviews should be concise. Papers are "big" things; they represent an entire research project that a group of people have spent significant time pursuing. Resist the temptation to address every detail, or go off on a small tangent. Keep to the main and most important points. Reviews should not be longer than 500 words.
The grading rubric for the reviews is as follows:
One of the main goals of CSC2552 is to introduce you to research in computational social science. The best way to do that is to get your hands dirty and try to study something yourself, and the final project offers you an opportunity to do exactly this.
The project has two main deliverables: the project proposal and the final report. Students will present each to the class (proposals on October 12 and final reports on November 23 and November 30). Projects will be done in teams of one or two people.
Your first task is to pick a project topic. If you are looking for project ideas, please email me, and I'd be happy to brainstorm and suggest some project ideas. Also check this resources page.
Project proposal (2 pages). Your proposal should outline the motivation for your project, the realistic research question you wish to answer, and your chosen operationalization. Articulate these as crisply as you can. Next, survey a bit of related work (around 1 paragraph). The proposal should identify the CSS methodology you will use and lay out a plan for your project. How exactly do you plan to pursue your research questions? What data/methods will you use? You should provide a concrete proposal for a data analysis, experiment, survey, or other method that helps you answer your research question. The analysis plan will probably be the biggest section, at least 3/4 page. Finally, mention any of the steps you have taken so far and share your preliminary progress. Ideally, the first 2 pages of your final report will be quite similar to this proposal.
Proposal components:
Thanks to Sharad Goel, Jon Kleinberg, Jure Leskovec, Matt Salganik, Johan Ugander, and Bob West for course advice and inspiration.