CSC2547HS Winter 2023
Automated Reasoning with Machine Learning


Course Info

Instructor: Xujie Si
Teaching Assistants: Jonathan Lorraine and Zhaoyu Li
Lecture hours: 3 - 5 PM, Thursday
Classroom: WW 126
Office hours: 3 - 4 PM, Tuesday, BA 7202
Course Breadth: M1/RA12
Discussion: Ed
Urgent contact Email: email, with the subject prefix “CSC2547HS-W23:”

Announcements

  • Jan 7: Welcome!

Course Overview

Reasoning and learning are two fundamental pieces of intelligence, whether artificial or natural. This seminar course introduces cutting-edge research on combining logical reasoning with machine learning. We will first study the foundational ideas and engineering tricks behind modern reasoning engines, such as Boolean Satisfiability (SAT) solvers, Satisfiability Modulo Theories (SMT) solvers, and domain-specific solvers. We will then survey recent work that uses machine learning to improve reasoning systems, as well as the other way around. Specifically, this course will cover the following topics: Boolean satisfiability (SAT), Satisfiability Modulo Theories (SMT), program analysis and synthesis, (inductive) logic programming, and neuro-symbolic methods.
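For intuition about what a SAT solver does: modern conflict-driven (CDCL) solvers such as GRASP and Chaff descend from the classic DPLL procedure, which interleaves unit propagation with case splitting. The sketch below is a deliberately minimal toy DPLL in Python; the names and data representation are our own, not any real solver's, and production solvers add clause learning, watched literals, branching heuristics, and restarts:

```python
# Toy DPLL SAT procedure. A CNF formula is a set of clauses; each clause is a
# frozenset of nonzero ints (variable x is literal +x, "not x" is literal -x).
# Illustrative only: real CDCL solvers add learning, watched literals, restarts.

def unit_propagate(clauses, assignment):
    """Drop satisfied clauses and assign literals forced by unit clauses."""
    clauses = set(clauses)
    changed = True
    while changed:
        changed = False
        for clause in list(clauses):
            if any(lit in assignment for lit in clause):
                clauses.discard(clause)          # clause already satisfied
                continue
            open_lits = [l for l in clause if -l not in assignment]
            if not open_lits:
                return None, None                # every literal false: conflict
            if len(open_lits) == 1:
                assignment = assignment | {open_lits[0]}
                changed = True
    return clauses, assignment

def dpll(clauses, assignment=frozenset()):
    clauses, assignment = unit_propagate(clauses, assignment)
    if clauses is None:
        return None                              # conflict on this branch
    if not clauses:
        return assignment                        # all clauses satisfied
    # Branch on some still-unassigned literal, trying both polarities.
    lit = next(l for c in clauses for l in c
               if l not in assignment and -l not in assignment)
    for choice in (lit, -lit):
        model = dpll(clauses, assignment | {choice})
        if model is not None:
            return model
    return None                                  # both branches fail: UNSAT

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
cnf = {frozenset({1, 2}), frozenset({-1, 3}), frozenset({-2, -3})}
print(dpll(cnf) is not None)  # prints True: the formula is satisfiable
```

The "practically easy" half of the Week 1 theme lives entirely in what this sketch leaves out: the engineering in GRASP and Chaff is what lets the same basic search scale to millions of clauses.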

Learning objectives:
At the end of this course, you will:
  1. Understand how SAT/SMT solvers work and use them to solve interesting challenges.
  2. Articulate the promises and limitations of current machine learning and program reasoning research.
  3. Constructively critique research papers, and deliver a tutorial style presentation.
  4. Practice your research skills through a full cycle, i.e., propose ↠ implement & evaluate ↠ present & get feedback ↠ final report.
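Objective 1 in practice usually means reducing a concrete problem to a logical formula and handing it to a solver. As a self-contained illustration, here is graph k-coloring encoded as CNF in Python; the encoding is the standard one, but the brute-force "solver" is only a stand-in for a real SAT solver and obviously does not scale:

```python
# Encode graph k-coloring as CNF, then check satisfiability by brute force.
# Variable (v, c) is True iff node v gets color c; a clause is a list of
# (v, c, polarity) literals. A real SAT solver would replace brute_force_sat.
from itertools import product

def coloring_cnf(edges, nodes, k):
    cnf = []
    for v in nodes:
        cnf.append([(v, c, True) for c in range(k)])          # node has a color
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                cnf.append([(v, c1, False), (v, c2, False)])  # at most one color
    for u, v in edges:
        for c in range(k):
            cnf.append([(u, c, False), (v, c, False)])        # endpoints differ
    return cnf

def brute_force_sat(cnf, variables):
    """Stand-in for a SAT solver: try every assignment (exponential, of course)."""
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[(v, c)] == pos for (v, c, pos) in clause)
               for clause in cnf):
            return assign
    return None

nodes, k = [0, 1, 2], 2
variables = [(v, c) for v in nodes for c in range(k)]
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
print(brute_force_sat(coloring_cnf(triangle, nodes, k), variables) is None)      # True: a triangle has no 2-coloring
print(brute_force_sat(coloring_cnf(path, nodes, k), variables) is not None)      # True: a path is 2-colorable
```

The same shape of reduction — problem constraints in, clauses out — is what the assignment and several of the scheduled papers build on.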

Prerequisites:
Students are expected to have some basic knowledge of machine learning, algorithms and data structures, and programming skills (e.g., Python + C family languages). Ideally, students should have already taken the following courses (or equivalent ones):
  • CSC311H1 Introduction to Machine Learning
  • CSC324H1 Principles of Programming Languages
  • CSC373H1 Algorithm Design, Analysis & Complexity
Grading & Evaluation:
  • Pop-quizzes & Class Participation: 10%
  • One Assignment: 15%
  • In-Class Paper presentation: 15%
  • Project: 60% (proposal 15%, presentation 20%, report 25%)

Relevant Textbooks

All textbooks below are supplementary, since our main source of reading materials will be research papers.

Course Schedule

The schedule below is tentative and subject to change throughout the semester.
Each week below lists the dates, broad area, reading list, and notes (slides).
Week 1 (Jan 12): Course Overview & SAT is hard but (practically) easy
Core readings:
  • The Complexity of Theorem-Proving Procedures (1971@UofT)
  • GRASP—A New Search Algorithm for Satisfiability (1996)
  • Chaff: Engineering an Efficient SAT Solver (2001)
Recommended:
  • Two-hundred-terabyte maths proof is largest ever (Nature 2016)
  • Computer Search Settles 90-Year-Old Math Problem (QuantaMagazine 2020)
  • Why general artificial intelligence will not be realized (Nature 2020)
Notes: week1-slides

Week 2 (Jan 19): Intro to Satisfiability modulo Theories (SMT)
Core readings:
  • Satisfiability modulo theories: introduction and applications (CACM 2011)
Recommended:
  • Handbook of Satisfiability, Chapter 12
  • Z3: An Efficient SMT Solver (TACAS 2008)
  • CVC4 (CAV 2011)
Notes: week-2 slides

Week 3 (Jan 26): Intro to Program Analysis and Synthesis
Core readings:
  • Scalable Error Detection using Boolean Satisfiability (POPL 2005)
  • KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs (OSDI 2008)
  • A decade of software model checking with SLAM (CACM 2011)
  • The Sketching Approach to Program Synthesis (2009)
  • Stochastic program optimization (CACM 2016)
  • Search-based program synthesis (CACM 2018)
Recommended:
  • Principles of Program Analysis (1999)
  • SLAM and Static Driver Verifier: Technology Transfer of Formal Methods inside Microsoft (2004)
  • Syntax-Guided Synthesis (2013)
  • Go Huge or Go Home: POPL’19 Most Influential Paper Retrospective (2020)
Notes: week3-slides

Week 4 (Feb 02): Machine learning for SAT solving
Reading list:
  • Learning a SAT Solver from Single-Bit Supervision (ICLR 2019)
  • Guiding High-Performance SAT Solvers with Unsat-Core Predictions (SAT 2019)
  • Low-rank semidefinite programming for the MAX2SAT problem (AAAI 2019)
  • Online Bayesian Moment Matching based SAT Solver Heuristics (ICML 2020)
  • Transformer-based Machine Learning for Fast SAT Solvers and Logic Synthesis (arXiv 2021)
  • Machine Learning-Based Restart Policy for CDCL SAT Solvers (SAT 2018)
  • SATzilla: Portfolio-based Algorithm Selection for SAT (JAIR 2008)
Jan 31 / Feb 03: Homework release

Week 5 (Feb 09): Machine learning for SMT solving
Reading list:
  • Learning to Solve SMT Formulas Fast (NeurIPS 2018)
  • MachSMT: A Machine Learning-based Algorithm Selector for SMT Solvers (TACAS 2021)
  • BanditFuzz: Fuzzing SMT Solvers with Multi-agent Reinforcement Learning (FM 2021)
  • Towards Learning Quantifier Instantiation in SMT (SAT 2022)
  • Learning to Fuzz from Symbolic Execution with Application to Smart Contracts (CCS 2019)
  • Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning (ICLR 2020)
  • Learning Branching Heuristics for Propositional Model Counting (AAAI 2021)
Feb 10 / Feb 15: Course project proposal due

Week 6 (Feb 16): Formal methods for machine learning
Reading list:
  • Developing Bug-Free Machine Learning Systems With Formal Mathematics (ICML 2017)
  • Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks (CAV 2017)
  • Learning Optimal Decision Trees with SAT (IJCAI 2018)
  • Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections (NeurIPS 2018)
  • Efficient Exact Verification of Binarized Neural Networks (NeurIPS 2020)
  • Verifying Properties of Binarized Deep Neural Networks (AAAI 2018)
  • Compositional Learning and Verification of Neural Network Controllers (EMSOFT 2021)
  • An Inductive Synthesis Framework for Verifiable Reinforcement Learning (PLDI 2019)
  • Towards Verified Stochastic Variational Inference for Probabilistic Programs (POPL 2020)
  • On Correctness of Automatic Differentiation for Non-Differentiable Functions (NeurIPS 2020)
  • Model Checking Finite-Horizon Markov Chains with Probabilistic Inference (CAV 2021)
Feb 17 / Feb 20: Homework due
Feb 23: Reading week (no classes)

Week 7 (Mar 2): Machine learning for code
Reading list:
  • Programming by demonstration using version space algebra (Machine Learning 2003)
  • Bug isolation via remote program sampling (PLDI 2003)
  • Statistical Debugging of Sampled Programs (NeurIPS 2003)
  • Learning Programs: A Hierarchical Bayesian Approach (ICML 2010)
  • Learning Minimal Abstractions (POPL 2011)
  • Data-Driven Precondition Inference with Learned Features (PLDI 2016)
  • From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood (ACL 2017)
  • Synthesis and machine learning for heterogeneous extraction (PLDI 2019)
  • Data-driven inference of representation invariants (PLDI 2020)

Week 8 (Mar 9): Deep learning for code
Reading list:
  • code2vec: Learning Distributed Representations of Code (POPL 2019)
  • Flow2Vec: value-flow-based precise code embedding (OOPSLA 2020)
  • Learning semantic program embeddings with graph interval neural network (OOPSLA 2020)
  • Break-It-Fix-It: Unsupervised Learning for Program Repair (ICML 2021)
  • Latent Execution for Neural Program Synthesis (NeurIPS 2021)
  • SpreadsheetCoder: Formula Prediction from Semi-structured Context (ICML 2021)
  • Competition-Level Code Generation with AlphaCode (from DeepMind 2022)
  • Codex: Evaluating Large Language Models Trained on Code (from OpenAI 2021)
  • Synchromesh: Reliable code generation from pre-trained language models (ICLR 2022)

Week 9 (Mar 16): Deep Learning and Logic Programming
Reading list:
  • An algebraic Prolog for reasoning about possible worlds (AAAI 2011)
  • Harnessing Deep Neural Networks with Logic Rules (ACL 2016)
  • End-to-End Differentiable Proving (NeurIPS 2017)
  • DeepProbLog: Neural Probabilistic Logic Programming (NeurIPS 2018)
  • Neural Logic Machines (ICLR 2019)
  • Learning explanatory rules from noisy data (JAIR 2018)
  • Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning (NeurIPS 2021)
  • Learning Relational Representations with Auto-encoding Logic Programs (IJCAI 2020)
  • Neural Markov Logic Networks (ICML 2021)
  • DeepStochLog: Neural Stochastic Logic Programming (AAAI 2022)
  • A-NESI: A Scalable Approximate Method for Probabilistic Neurosymbolic Inference (arXiv 2022)

Week 10 (Mar 23): Neuro-symbolic systems
Reading list:
  • Learning to Infer Graphics Programs from Hand-Drawn Images (NeurIPS 2018)
  • SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver (ICML 2019)
  • Learning Symmetric Rules with SATNet (NeurIPS 2022)
  • Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning (ICML 2020)
  • The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision (ICLR 2019)
  • DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning (PLDI 2021)
  • Temporal and Object Quantification Networks (IJCAI 2021)
  • Hierarchical Motion Understanding via Motion Programs (CVPR 2021)
  • Unsupervised Learning of Shape Programs with Repeatable Implicit Parts (NeurIPS 2022)

Week 11 (Mar 30): Project presentation I

Week 12 (Apr 6): Project presentation II