I obtained my PhD from the University of Toronto in 2013, under the supervision of Toni Pitassi.

My thesis has been awarded the CMS Doctoral Prize for 2015.

During Fall 2013 I attended the special semester on real analysis in computer science at the Simons institute, Berkeley.

Winter 2014 to Summer 2015 I spent at the IAS in Princeton.

I have been at the Technion since Fall 2015, where I maintain a public homepage.

I can be contacted via yuvalfi﹫cs.technion.ac.il.

Research statement Curriculum vitæ DBLP

profile for Yuval Filmus on Stack Exchange, a network of free, community-driven Q&A sites

I'm not surprised that there were multiple questions with a single answer, provided by Yuval Filmus. Yuval is very active in answering questions. He also tends to respond very rapidly, and to post excellent, clear answers. It would not be surprising if most others see the question only after Yuval has already answered, they look at his answer, discover that it is excellent and think to themselves, "I can't improve on that, this one is already answered" and move on to some other question.

Some thoughts on studying math.


Bandwidth approximation of a restricted family of trees (Master's thesis)

We consider the NP-complete optimization problem Bandwith. Anupam Gupta gave an O(log2.5n) approximation algorithm for trees, and showed that his algorithm has an approximaion ratio of O(log n) on caterpillars, trees composed of a central path and paths emanating from it. We show that the same approximation ratio is obtained on trees composed of a central path and caterpillars emanating from it.

Our result relies on the following lemma. A sequence a1, …, an has thickness Θ if the sum of any d consecutive elements is at most dΘ, for 1≤dn. If a sequence has thickness Θ, then the sequence obtained by ordering the elements in non-decreasing order also has thickness Θ

Download Slides BibTeX

  author = {Yuval Filmus},
  title = {Bandwidth approximation of a restricted family of trees},
  school = {Weizmann institute of science},
  year = {2002}

Spectral methods in extremal combinatorics (PhD thesis)

Extremal combinatorics studies how large a collection of objects can be if it satisfies a given set of restrictions. Inspired by a classical theorem due to Erdős, Ko and Rado, Sominovits and Sós posed the following problem: determine how large a collection of graphs on the vertex set {1, …,n} can be, if the intersection of any two of them contains a triangle. They conjectured that the largest possible collection, containing 1/8 of all graphs, consists of all graphs containing a fixed triangle (a triangle-star). The first major contribution of the thesis is a confirmation of this conjecture. This result first appeared in our paper Triangle-intersecting families of graphs with David Ellis and Ehud Friedgut.

We prove the Simonovits–Sós conjecture in the following strong form: the only triangle-intersecting families of the maximal measure 1/8 are triangle-stars (uniqueness), and every triangle-intersecting family of measure 1/8−ε is O(ε)-close to a triangle-star (stability). Our proof uses spectral methods (Hoffman's bound).

In order to prove the stability part of our theorem, we utilize a structure theorem for Boolean functions on {0,1}m whose Fourier expansion is concentrated on the first t+1 levels, due to Kindler and Safra. The second major contribution of this thesis consists of two analogs of this theorem for Boolean functions on Sm whose Fourier expansion is concentrated on the first two levels. These results appear in our papers A quasi-stability result for dictatorships in Sn and A stability results for balanced dictatorships in Sn, both with David Ellis and Ehud Friedgut.

In the same way that the Kindler–Safra theorem is useful for studying triangle-intersecting families, our structure theorems are useful for studying intersecting families of permutations, which are families in which any two permutations agree on the image of at least one point. Using one of our theorems, we give a simple proof of the following result of Ellis, Friedgut and Pilpel: an intersecting family of permutations on Sm of size (1−ε)(m−1)! is O(ε)-close to a double coset, a family which consists of all permutations sending some point i to some point j.

The thesis includes a detailed exposition of Friedgut's paper On the measure of intersecting families, uniqueness and stability, a proof of the Ahlswede–Khachatrian theorem in the μp setting, and a gentle introduction to the representation theory of Sn from the point of view of class functions.

Download BibTeX

  author = {Yuval Filmus},
  title = {Spectral methods in extremal combinatorics},
  school = {University of Toronto},
  year = {2013}


Triangle-intersecting families of graphs (with David Ellis and Ehud Friedgut)

Journal of the European Mathematical Society, Volume 14, Issue 3, 2012, pp. 841–885

A family of graphs is said to be triangle-intersecting if the intersection of any two graphs in the family contains a triangle. A conjecture of Simonovits and Sós from 1976 states that the largest triangle-intersecting families of graphs on a fixed set of n vertices are those obtained by fixing a specific triangle and taking all graphs containing it, resulting in a family containing 1/8 of all graphs.. We prove this conjecture and some generalizations (for example, we prove that the same is true of odd-cycle-intersecting families, and we obtain best possible bounds on the size of the family under different, not necessarily uniform, measures). We also obtain stability results, showing that almost-largest triangle-intersecting families have approximately the same structure.

The ArXiv version corrects a mistake in the proof of the stability part of Corollary 2.3. The original proof assumed that the approximating family G is odd-cycle-agreeing, and deduced that G must be a triangle-junta from uniqueness. However, showing that G is odd-cycle-agreeing requires a separate argument, found in the new version.

An alternative presentation of the results in this paper can be found in my PhD thesis.

ArXiv Presentation 1 Presentation 2 BibTeX

  author = {David Ellis and Yuval Filmus and Ehud Friedgut},
  title = {Triangle-intersecting families of graphs},
  journal = {Journal of the European Mathematical Society},
  volume = {14},
  number = {3},
  year = {2012},
  pages = {841--885}

The weighted complete intersection theorem

Accepted to JCTA

We provide a complete proof of the Ahlswede–Khachatrian theorem in the μp setting: for all values of n, t and p, we determine the maximum μp-measure of a t-intersecting family on n points, and describe all optimal families (except for a few exceptional parameter settings). Our proof is based on several different articles of Ahlswede and Khachatrian.

The full version below includes more details, as well as some follow-up work (see next item).

Full version arXiv

Manuscript BibTeX

  author = {Yuval Filmus},
  title = {The weighted complete intersection theorem},
  journal = {Journal of Combinatorial Theory, Series A},
  year = {accepted},
  sortyear = {2017}

More complete intersection theorems


We extend the weighted Ahlswede–Khachatrian theorem (see previous item) to several new settings.

First, we extend the theorem to families on infinitely many points, showing that for most settings of the parameters, no new behavior is encountered.

Second, we extend the theorem to the Hamming scheme ℤmn, in which a family is t-intersecting if any two vectors have t coordinates on which the values differ by less than s (this corresponds to p=s/m).

Third, we extend the theorem to the continous analog of the Hamming scheme, a power of the unit circle, in which a family is t-intersecting if any two vectors have t coordinates on which the values differ by less than p.

This work is included in the full version of the preceding item.

Manuscript BibTeX

  author = {Yuval Filmus},
  title = {More complete intersection theorems},
  howpublished = {Manuscript},
  year = {2017}

Twenty (simple) questions (with Yuval Dagan, Ariel Gabizon, and Shay Moran)

STOC 2017

Given a distribution μ, the goal of the distributional 20 questions game is to construct a strategy that identifies an unknown element drawn from μ using as little yes/no queries on average. Huffman's algorithm constructs an optimal strategy, but the questions one has to ask can be arbitrary.

Given a parameter n, we ask how large a set of questions Q needs to be so that for each distribution supported on [n] there is a good strategy which uses only questions from Q.

Our first major result is that a linear number of questions (corresponding to binary comparison search trees) suffices to recover the H(μ)+1 performance of Huffman's algorithm. As a corollary, we deduce that the number of questions needed to guarantee a cost of at most H(μ)+r (for integer r) is asymptotic to rn1/r.

Our second major result is that (roughly) 1.25n questions are sufficient to match the performance of Huffman's algorithm exactly, and this is tight for infinitely many n.

We also determine the number of questions sufficient to match the performance of Huffman's algorithm up to r to be Θ(nΘ(1/r)).

Finally, we show that the set of questions used to obtain the bound H(μ)+1 performs better when the maximal probability of μ is small, bounding the performance between 0.5011 and 0.58607.

The full version contains all parts of the paper. We subsequently split the paper into two halves: the first containing most of the results, and the second (the next item) only the results concerning distributions whose maximal probability is small.


arXiv Full version

Extended Abstract BibTeX

  author = {Yuval Dagan and Yuval Filmus and Ariel Gabizon and Shay Moran},
  title = {Twenty (simple) questions},
  booktitle = {49th ACM Symposium on Theory of Computing (STOC 2017)},
  year = {2017}

Asymptotic redundancy and prolixity (with Yuval Dagan and Shay Moran)


Given a distribution μ and a set of queries Q (for example, all comparison queries), the cost of Q on μ is the average depth of a decision tree that locates an item in the support of μ using only queries from Q.

The cost of any set of queries on μ is always at least the entropy H(μ), and the redundancy of Q is the maximal gap between the cost of Q on a distribution and its entropy.

The prolixity of a set of queries is defined in the same way, with the entropy replaced by the cost of the optimal unrestricted decision tree.

The redundancy and prolixity of a set of queries is often achieved on degenerate distributions which have very low entropy, and this suggests studying their asymptotic variants, in which we consider distributions whose min-entropy tends to infinity.

Gallager showed that the asymptotic redundancy of unrestricted decision trees is 0.086.

We obtain bounds, and in some case determine, the non-asymptotic and asymptotic redundancy and prolixity of several sets of queries, including comparison queries, comparison and equality queries, and interval queries.

The results in this paper are also contained in the full version of the preceding item.

Preprint BibTeX

  author = {Yuval Dagan and Yuval Filmus and Shay Moran},
  title = {Asymptotic redundancy and prolixity},
  howpublished = {Manuscript},
  year = {2017}

Analysis of Boolean functions

A quasi-stability result for dictatorships in Sn (with David Ellis and Ehud Friedgut)

Combinatorica, Volume 35, Issue 5, 2015, pages 573–618

We prove that Boolean functions on Sn whose Fourier transform is highly concentrated on the first two irreducible representations of Sn are close to being unions of cosets of point-stabilizers. We use this to give a natural proof of a stability result on intersecting families of permutations, originally conjectured by Cameron and Ku, and first proved by David Ellis. We also use it to prove a ‘quasi-stability’ result for an edge-isoperimetric inequality in the transposition graph on Sn, namely that subsets of Sn with small edge-boundary in the transposition graph are close to being unions of cosets of point-stabilizers.

An alternative presentation of the main theorem (without the second application) can be found in my PhD thesis.

ArXiv BibTeX

  author = {David Ellis and Yuval Filmus and Ehud Friedgut},
  title = {A quasi-stability result for dictatorships in {$S_n$}},
  journal = {Combinatorica},
  volume = {35},
  number = {5},
  pages = {573--618},
  year = {2015}

A stability result for balanced dictatorships in Sn (with David Ellis and Ehud Friedgut)

Random Structures and Algorithms, Volume 46, Issue 3, 2015, pp. 494–530

We prove that a balanced Boolean function on Sn whose Fourier transform is highly concentrated on the first two irreducible representations of Sn is close in structure to a dictatorship, a function which is determined by the image or pre-image of a single element. As a corollary, we obtain a stability result concerning extremal isoperimetric sets in the Cayley graph on Sn generated by the transpositions.

Our proof works in the case where the expectation of the function is bounded away from 0 and 1. In contrast, the preceding paper deals with Boolean functions of expectation O(1/n) whose Fourier transform is highly concentrated on the first two irreducible representations of Sn. These need not be close to dictatorships; rather, they must be close to a union of a constant number of cosets of point-stabilizers.

An alternative presentation of the main theorem can be found in my PhD thesis.

ArXiv BibTeX Video

  author = {David Ellis and Yuval Filmus and Ehud Friedgut},
  title = {A stability result for balanced dictatorships in {$S_n$}},
  journal = {Random Structures and Algorithms},
  volume = {46},
  number = {3},
  year = {2015},
  pages = {494--530}

Low-degree Boolean functions on Sn, with an application to isoperimetry (with David Ellis and Ehud Friedgut)


We prove that Boolean functions on Sn whose Fourier transform is highly concentrated on irreducible representations indexed by partitions of n whose largest part has size at least nt are close to being unions of cosets of stabilizers of t-tuples. We also obtain an edge-isoperimetric inequality for the transposition graph on Sn which is asymptotically sharp for sets of measure 1/poly(n). We then combine both results to obtain a best-possible edge-isoperimetric inequality for sets of size (nt)! where n is large compared to t, confirming a conjecture of Ben-Efraim in these cases.

arXiv BibTeX

  author = {David Ellis and Yuval Filmus and Ehud Friedgut},
  title = {Low-degree {B}oolean functions on {$S_n$}, with an application to isoperimetry},
  journal = {Submitted}

On the sum of L1 influences of bounded functions (with Hamed Hatami, Nathan Keller and Noam Lifshitz)

Israel Journal of Mathematics, Volume 214, Issue 1, 2016, pp. 167–192

It is well-known that if f is a Boolean function of degree d then its total influence is bounded by d. There are several ways of extending the definition of influence for non-Boolean functions. The usual way is to define the influence of the ith variable as the L2 norm of the discrete derivative in direction i. Under this definition, the total influence of a bounded function (bounded by 1 in magnitude) is still upper-bounded by the degree. Aaronson and Ambainis asked whether total L1 influence can be bounded polynomially by the degree, and this was answered affirmatively by Bačkurs and Bavarian, who showed an upper bound of O(d3) for general functions, and O(d2) for homogeneous functions. We improve their results by giving an upper bound of d2 in the general case and O(d log d) in the homogenous case. Our proofs are also much simpler. We also give an almost optimal bound for monotone functions, d/2π + o(d).

Journal Preprint ArXiv BibTeX

  author = {Yuval Filmus and Hamed Hatami and Nathan Keller and Noam Lifshitz},
  title = {Bounds on the sum of {L1} influences},
  journal = {Israel Journal of Matematics},
  volume = {214},
  number = {1},
  year = {2016},
  pages = {167--192}

An Orthogonal basis for functions over a slice of the Boolean cube

Electronic Journal of Combinatorics

The Johnson and Kneser are graphs defined on the k-sets of [n], a vertex set known as a slice of the Boolean cube. Two sets are connected in the Johnson graph if they have Hamming distance two, and in the Kneser graph if they are disjoint. Both graphs belong to the Bose–Mesner algebra of the Johnson association scheme; this just means that whether an edge connects two sets S,T depends only on |ST|.

All graphs belonging to the Bose–Mesner algebra have the same eigenspaces, and these are well-known, arising from certain representations of the symmetric group. The multiplicity of the kth eigenspace is rather large, C(n,k)−C(n,k−1). As far as we can tell, prior to our work no explicit orthogonal basis for these eigenspace has been exhibited.

We present a simple orthogonal basis for the eigenspaces of the Bose–Mesner algebra of the Johnson association scheme, arising from Young's orthogonal basis for the symmetric group. Our presentation is completely elementary and makes no mention of the symmetric group.

As an application, we restate Wimmer's proof of Friedgut's theorem for the slice. The original proof makes heavy use of computations over the symmetric group. We are able to do these computations directly in the slice using our basis.

Update: Qing Xiang and his student Rafael Plaza pointed out that the same basis has been constructed by Murali K. Srinivasan in his paper Symmetric chains, Gelfand–Tsetlin chains, and the Terwilliger algebra of the binary Hamming scheme.

Journal Preprint ArXiv BibTeX

  author = {Yuval Filmus},
  title = {Orthogonal basis for functions over a slice of the
  {B}oolean hypercube},
  journal = {Electronic Journal of Combinatorics},
  volume = {23},
  number = {1},
  year = {2016},
  pages = {P1.23}

Friedgut–Kalai–Naor theorem for slices of the Boolean cube

Chicago Journal of Theoretical Computer Science

The Friedgut–Kalai–Naor theorem is a fundamental result in the analysis of Boolean function. It states that if a Boolean function f is close to an affine function, then f is close to an affine Boolean function, which must depend on at most one coordinate. We prove an analog of this theorem for slices of the Boolean cube (a slice consists of all vectors having a given Hamming weight). In the small error regime, our theorem shows that f is close to a function depending on at most one coordinate, and in general we show that f or its negation is close to a maximum of a small number of coordinates (this corresponds to a union of stars, families consisting of all elements containing some fixed element).

Preprint ArXiv Journal BibTeX

  author = {Yuval Filmus},
  title = {Friedgut--{K}alai--{N}aor theorem for slices of the {B}oolean cube},
  journal = {Chicago Journal of Theoretical Computer Science},
  year = {2016},
  pages = {14:1--14:17}

Invariance principle on the slice (with Guy Kindler, Elchanan Mossel and Karl Wimmer)

CCC 2016

The classical invariance principle of Mossel, O'Donnell and Oleszkiewicz states that the distribution of low-degree, low-influence multilinear polynomials under Bernoulli random variables is similar to their distribution under Gaussian random variables with the same expectation and variance.

We prove an invariance principle for functions on the slice (all vectors in the Boolean cube having a fixed Hamming weight). The main difficulty is that the variables are no longer independent.

As corollaries, we prove a version of majority is stablest, a Bourgain tail bound, and a weak version of the Kindler–Safra theorem. The Kindler–Safra theorem implies a stability result for t-intersecting families along the lines of Friedgut.

In follow-up work (see below), we improve the invariance principle by removing the condition of low influences (when appropriate).

Preprint ArXiv BibTeX

  author = {Yuval Filmus and Guy Kindler and Elchanan Mossel and Karl Wimmer},
  title = {Invariance principle on the slice},
  booktitle = {31st Computational Complexity Conference},
  year = {2016}

Harmonicity and invariance on slices of the Boolean cube (with Elchanan Mossel)

CCC 2016

The classical invariance principle of Mossel, O'Donnell and Oleszkiewicz states that the distribution of low-degree, low-influence multilinear polynomials under a product distrribution essentially depends only on the first two moments of this distribution.

In recent work with Kindler and Wimmer (see above), we extended this to harmonic multilinear polynomials on the slice. In that work we proved invariance with respect to the following three distributions: the uniform distribution on a slice, the matching skewed distribution on the Boolean cube, and the corresponding Gaussian slice (or Gaussian space; the distributions are the same). That invariance principle requires the function to have low degree and low influences.

While the condition of low influences is necessary when comparing discrete distributions to Gaussian space (consider the polynomial x1), such a condition is no longer necessary when comparing the uniform distribution on a slice to the matching skewed distribution on the Boolean cube. In this paper we prove an invariance principle for these two distributions without any condition on the influences. Using the classical invariance principle, we can easily derive the more general invariance principl that we had proved with Kindler and Wimmer. Our new proof is completely different, and uses a martingale approach.

Complementing the invariance principle, we reprove several properties of harmonic multilinear polynomials. While most of these properties have been proven earlier in my paper using an explicit orthogonal basis for the slice, the proofs appearing in this paper are much simpler and do not require the basis. We hope that the new proofs are easier to generalize to other settings.

Preprint ArXiv BibTeX

  author = {Yuval Filmus and Elchanan Mossel},
  title = {Harmonicity and invariance on slices of the {B}oolean cube},
  booktitle = {31st Computational Complexity Conference},
  year = {2016}

Computational complexity

The complexity of the comparator circuit value problem (with Stephen A. Cook and Dai Lê)


In 1990 Subramanian defined the complexity class CC as the set of problems log-space reducible to the comparator circuit value problem (CCV). He and Mayr showed that NL⊆CC⊆P, and proved that in addition to CCV several other problems are complete for CC, including the stable marriage problem, and finding the lexicographically first maximal matching in a bipartite graph. Although the class has not received much attention since then, we are interested in CC because we conjecture that it is incomparable with the parallel class NC which also satisfies NL⊆NC⊆P; this implies that CC-complete problems don't have an efficient polylog time parallel algorithm. We provide evidence for our conjecture by giving oracle settings in which relativized CC and relativized NC are incomparable.

We give several alternative definitions of CC, including (among others) the class of problems computed by uniform polynomial-size families of comparator circuits supplied with copies of the input and its negation, the class of problems AC0-reducible to CCV, and the class of problems computed by uniform AC0 circuits with CCV gates. We also give a machine model for CC which corresponds to its characterizations as log-space uniform polynomial-size families of comparator circuits. The various characterizations show that CC is a robust class. Our techniques also show that the corresponding function class FCC is closed under composition. The main technical tool we employ is universal comparator circuits.

Other results include a simpler proof of NL⊆CC, a more careful analysis showing that the lexicographically first maximal matching problem and its variants are CC-complete under AC0 many-one reductions, and an explanation of the relation between the Gale–Shapley algorithm and Subramanian's algorithm for stable marriage.

This paper continues previous work of Cook, Lê and Ye which focused on Cook–Nguyen style uniform proof complexity, answering several open questions raised in that paper.

The preprint contains more results than the ArXiv version, and the presentation is different.

Preprint ArXiv BibTeX

  author = {Stephen A. Cook and Yuval Filmus and Dai L\^e},
  title = {The complexity of the comparator circuit value problem},
  journal = {ACM Transactions on Computation Theory},
  volume = {6},
  number = {4},
  articleno = {15},
  pages = {15:1--15:44},
  year = {2014}

Average case lower bounds for monotone switching networks (with Toniann Pitassi, Robert Robere and Stephen A. Cook)

FOCS 2013

An approximate computation of a Boolean function f by a circuit or switching network M is a computation in which M computes f correctly on the majority of the inputs (rather than on all of them). Besides being interesting in their own right, lower bounds for approximate computation have proved useful in many subareas of complexity theory such as cryptography and derandomization. Lower bounds for approximate computation are also known as correlation bounds or average case hardness.

We obtain the first average case monotone depth lower bounds for a function in monotone P. We tolerate errors up to 1/2−1/n1/3−δ. Specifically, we prove average case exponential lower bounds on the size of monotone switching networks for the GEN function. As a corollary, we establish that for every i there are functions that can be computed with no error in monotone NCi+1 but that cannot be computed without large error by monotone circuits in NCi. We provide a similar separation between monotone NC and monotone P.

Our proof extends and simplifies the Fourier-analytic technique due to Potechin and further developed by Chan and Potechin.

As a corollary of our main lower bound, we prove that the communication complexity approach for monotone depth lower bounds does not naturally generalize to the average case setting.

Preprint ECCC FOCS BibTeX

  author = {Yuval Filmus and Toniann Pitassi and Robert Robere and
  Stephen A. Cook},
  title = {Average case lower bounds for monotone switching networks},
  booktitle = {The 54th Annual Symposium on Foundations of Computer
  Science ({FOCS} 2013)},
  pages = {598--607},
  year = {2013}

Fast matrix multiplication: limitations of the laser method (with Andris Ambainis and François Le Gall)

STOC 2015

Coppersmith and Winograd gave an O(n2.376) algorithm for matrix multiplication in 1990. Their algorithm relies on an identity known as the Coppersmith–Winograd identity. Analyzing the identity as-is using Strassen's laser method and an ingenious construction, Coppersmith and Winograd obtained an O(n2.388) algorithm. The tensor square of the basic identity leads to the improved algorithm.

Recently there has been a surge of activity in the area. Stothers, Vassilevska-Williams and Le Gall studied higher and higher tensor powers of the basic identity, culminating in Le Gall's O(n2.3728639) algorithm. How far can this approach go?

We describe a framework, laser method with merging, which encompasses all the algorithms just described, and is at once more general and amenable to analysis. We show that taking the Nth tensor power for an arbitrary N cannot obtain an algorithm with running time O(n2.3725) for the exact identity used in state-of-the-art algorithms.

Preprint ECCC ArXiv BibTeX

  author = {Ambainis, Andris and Filmus, Yuval and Le Gall, Fran\c{c}ois},
  title = {Fast matrix multiplication: limitations of the laser method},
  booktitle = {47th Annual Symposium on the Theory of Computing ({STOC} 2015)},
  year = {2015},
  pages = {585--593}

Communication complexity

Trading information complexity for error (with Yuval Dagan, Hamed Hatami, and Yaqiao Li)

CCC 2017

We discuss the following general question: How does the information complexity of a function change if we allow non-negligible error?

Our answers have implications to the communication complexity of set disjointness with non-negligible error.

ArXiv ECCC Preprint BibTeX

  author = {Yuval Dagan and Yuval Filmus and Hamed Hatami and Yaqiao Li},
  title = {Information complexity with non-negligible error},
  booktitle = {32nd Conference on Computational Complexity (CCC 2017)},
  year = {2017}

Information complexity of the AND function in the two-party and multiparty settings (with Hamed Hatami, Yaqiao Li, and Suzin You)


We find the optimal protocol (from the point of view of information complexity) for the AND function in a multiparty setting, extending the buzzer protocol described by Braverman et al.

We also fill in some gaps in the arguments of Braverman et al.

ArXiv BibTeX

  author = {Yuval Filmus and Hamed Hatami and Yaqiao Li and Suzin You},
  title = {Information complexity of the {AND} function in the two-party, and multiparty settings},
  howpublished = {Submitted},
  year = {2017}

Proof complexity

Exponential lower bounds for AC0-Frege imply superpolynomial Frege lower bounds (with Toniann Pitassi and Rahul Santhanam)

ICALP 2011, ToCT 2015

We give a general transformation which turns polynomial-size Frege proofs to subexponential-size AC0-Frege proofs. This indicates that proving exponential lower bounds for AC0-Frege is hard, since it is a longstanding open problem to probe super-polynomial lower bounds for Frege. Our construction is optimal for tree-like proofs.

As a consequence of our main result, we are able to shed some light on the question of weak automatizability for bounded-depth Frege systems. First, we present a simpler proof of the results of Bonet et al. showing that under cryptographic assumptions, bounded-depth Frege proofs are not weakly automatizable. Second, we show that because our proof is more general, under the right cryptographic assumptions it could resolve the weak automatizatbility question for lower depth Frege systems.

The proceedings version contains several small mistakes, corrected in the preprint version. These mistakes slightly affect the constants in the main theorem.

Preprint ICALP Slides BibTeX

  author = {Yuval Filmus and Toniann Pitassi and Rahul Santhanam},
  title = {Exponential lower bounds for {AC$^0$}-{F}rege imply
  superpolynomial {F}rege lower bounds},
  booktitle = {The 38th International Colloquium on Automata,
  Languages and Programming ({ICALP} 2011)},
  year = {2011},
  pages = {618--629}
  author = {Yuval Filmus and Toniann Pitassi and Rahul Santhanam},
  title = {Exponential lower bounds for {AC$^0$}-{F}rege imply
  superpolynomial {F}rege lower bounds},
  journal = {ACM Trans. Comput. Th.},
  volume = {7},
  number = {2},
  year = {2015},
  pages = {article 5}

Space complexity in polynomial calculus (with Massimo Lauria, Jakob Nordström, Neil Thapen and Noga Zewi)

CCC 2012, SICOMP 2015

During the last decade, an active line of research in proof complexity has been to study space complexity and time-space trade-offs for proofs. Besides being a natural complexity measure of intrinsic interest, space is also an important issue in SAT solving, and so research has mostly focused on weak systems that are used by SAT solvers.

There has been a relatively long sequence of papers on space in resolution, which is now reasonably well understood from this point of view. For other natural candidates to study, however, such as polynomial calculus or cutting planes, very little has been known. We are not aware of any nontrivial space lower bounds for cutting planes, and for polynomial calculus the only lower bound has been for CNF formulas of unbounded width in Alekhnovich et al., where the space lower bound is smaller than the initial width of the clauses in the formulas. Thus, in particular, it has been consistent with current knowledge that polynomial calculus could be able to refute any k-CNF formula in constant space.

We prove several new results on space in polynomial calculus (PC), and in the extended proof system polynomial calculus resolution (PCR) studied by Alekhnovich et al.:

  • We prove an Ω(n) space lower bound in PC for the canonical 3-CNF version of the pigeonhole principle formulas with m pigeons and n holes, and show that this is tight.
  • For PCR, we prove an Ω(n) space lower bound for a bitwise encoding of the functional pigeonhole principle. These formulas have width O(log n), and hence this is an exponential improvement over Alekhnovich et al. measured in the width of the formulas.
  • We then present another encoding of the pigeonhole principle that has constant width, and prove an Ω(n) space lower bound in PCR for these formulas as well.
  • Finally, we prove that any k-CNF formula can be refuted in PC in simultaneous exponential size and linear space (which holds for resolution and thus for PCR, but was not obviously the case for PC). We also characterize a natural class of CNF formulas for which the space complexity in resolution and PCR does not change when the formula is transformed into 3-CNF in the canonical way, something that we believe can be useful when proving PCR space lower bounds for other well-studied formula families in proof complexity.

ECCC Journal BibTeX

  author = {Yuval Filmus and Massimo Lauria and Jakob Nordstr\"om and
  Neil Thapen and Noga Zewi},
  title = {Space complexity in polynomial calculus},
  booktitle = {The 27th Annual Conference on Computational Complexity ({CCC} 2012)},
  year = {2012}
  author = {Yuval Filmus and Massimo Lauria and Jakob Nordstr\"om and
  Neil Thapen and Noga Zewi},
  title = {Space complexity in polynomial calculus},
  journal = {SIAM J. Comput.},
  volume = {44},
  number = {4},
  pages = {1119--1153},
  year = {2015}

Towards an understanding of Polynomial Calculus: new separations and lower bounds (with Massimo Lauria, Mladen Mikša, Jakob Nordström and Marc Vinyals)

ICALP 2013

During the last decade, an active line of research in proof complexity has been into the space complexity of proofs, and how space is related to other measures. By now, these aspects of resolution are fairly well understood, but many open problems remain for the related but stronger polynomial calculus (PC/PCR) proof system. For instance, the space complexity of many standard “benchmark formulas” is still open, as well as the relation of space to size and degree in PC/PCR.

We prove that if a formula requires large resolution width then making XOR substitution yields a formula requiring large PCR space, providing some circumstantial evidence that degree might be a lower bound for space. More importantly, this immediately yields formulas that are very hard for space but very easy for size, exhibiting a size-space separation similar to what is known for resolution. Using related ideas, we show that if a graph has good expansion and in addition its edge set can be partitioned into short cycles, then the Tseitin formula over this graph requires large PCR space. In particular, Tseitin formulas over random 4-regular graphs almost surely require space at least Ω(n1/2).

Our proofs use techniques recentled introduced by Bonacina and Galesi. Our final contribution is to show that these techniques provably cannot yield non-constant space lower bounds for the functional pigeonhole principle, delineating the limitations of this framework, and suggesting that we are still far from characterizing PC/PCR space.

Download BibTeX

  author = {Yuval Filmus and Massimo Lauria and Mladen Mik\v{s}a and
  Jakob Nordstr\"om and Marc Vinyals},
  title = {Towards an understanding of {P}olynomial {C}alculus: new
  separations and lower bounds},
  booktitle = {The 40th International Colloquium on Automata,
  Languages and Programming ({ICALP} 2013)},
  year = {2013},
  pages = {437--448}

From small space to small width in resolution (with Massimo Lauria, Mladen Mikša, Jakob Nordström and Marc Vinyals)

STACS 2014

In 2003, Atserias and Dalmau resolved a major open question about the resolution proof system by establishing that the space complexity of formulas is always an upper bound on the width needed to refute them. Their proof is beautiful but somewhat mysterious in that it relies heavily on tools from finite model theory. We give an alternative, completely elementary, proof that works by simple syntactic manipulations of resolution refutations. As a by-product, we develop a "black-box" technique for proving space lower bounds via a "static" complexity measure that works against any resolution refutation — previous techniques have been inherently adaptive. We conclude by showing that the related question for polynomial calculus (i.e., whether space is an upper bound on degree) seems unlikely to be resolvable by similar methods.

Download BibTeX

  author = {Yuval Filmus and Massimo Lauria and Mladen Mik\v{s}a and
  Jakob Nordstr\"om and Marc Vinyals},
  title = {From small space to small width in resolution},
  booktitle = {The 31st Symposium on Theoretical Aspects of Computer
  Science ({STACS} 2014)},
  year = {2014},
  pages = {300--311}

On the Alekhnovich–Razborov degree lower bound for the polynomial calculus


Alekhnovich and Razborov (Lower bounds for polynomial calculus: non-binomial case) gave a sophisticated degree lower bound for proofs in the polynomial calculus proof system. However, their proof is somewhat opaque. We present a different, more intuitive proof. We also adapt our proof to obtain related results of Galesi and Lauria.

Work done while visiting KTH in December 2012.

Download BibTeX

  author = {Yuval Filmus},
  title = {On the Alekhnovich--Razborov degree lower bound for the polynomial calculus},
  journal = {Preprint}

Semantic versus syntactic cutting planes (with Pavel Hrubeš and Massimo Lauria)

STACS 2016

Cutting planes is a proof system in which lines are linear inequalities. It has two main variants: syntactic cutting planes, in which specific derivation rules are given, and semantic cutting planes, in which any bounded fan-in derivation which is semantically correct (for all zero-one assignments to variables) is allowed. Only the syntactic version is a Cook–Reckhow proof system, since verifying a semantic cutting planes proof is coNP-complete.

Extending earlier work of Pudlák, we give an exponential lower bounds for semantic cutting planes.

We also show that semantic cutting planes is exponentially stronger than syntactic cutting planes, and exhibit two contradictory lines which take exponentially long to refute in syntactic cutting planes.

This work is a combination of two earlier preprints: a preprint of Pavel Hrubeš proving the exponential lower bound for semantic cutting planes, and a preprint of Massimo Lauria and myself proving the exponential separation between semantic and syntactic cutting planes.

Download BibTeX

  author = {Yuval Filmus and Pavel Hrube\v{s} and Massimo Lauria},
  title = {Semantic versus syntactic cutting planes},
  booktitle = {33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016)},
  year = {2016}

Approximation algorithms

Maximum coverage over a matroid (with Justin Ward)

STACS 2012

We present an optimal, combinatorial 1−1/e approximation algorithm for Maximum Coverage over a matroid constraint, using non-oblivious local search. Calinescu, Chekuri, Pál and Vondrák have given an optimal 1−1/e approximation algorithm for the more general problem of monotone submodular maximization over a matroid constraint. The advantage of our algorithm is that it is entirely combinatorial, and in many circumstances also faster, as well as conceptually simpler.

Following previous work on satisfiability problems by Alimonti and by Khanna, Motwani, Sudan and Vazirani, our local search algorithm is non-oblivious. That is, our algorithm uses an auxiliary linear objective function to evaluate solutions. This function gives more weight to elements covered multiple times. We show that the locality ratio of the resulting local search procedure is at least 1−1/e. Our local search procedure only considers improvements of size 1. In contrast, we show that oblivious local search, guided only by the problem's objective function, achieves an approximation ratio of only (n−1)/(2n−1−k) when improvements of size k are considered.

In general, our local search algorithm could take an exponential amount of time to converge to an exact local optimum. We address this situation by using a combination of approximate local search and the same partial enumeration techniques used by Calinescu et al., resulting in a clear (1−1/e)-approximation algorithm running in polynomial time.

We obtained our auxiliary linear objective function using linear programming. This is detailed in Ward's thesis.

Download Slides 1 Slides 2 BibTeX

  author = {Yuval Filmus and Justin Ward},
  title = {Maximum coverage over a matroid},
  booktitle = {29th Symposium on Theoretical Aspects of Computer
  Science ({STACS} 2012)},
  year = {2012},
  pages = {601--612}

A tight combinatorial algorithm for submodular maximization subject to a matroid constraint (with Justin Ward)


We present an optimal, combinatorial 1−1/e approximation algorithm for monotone submodular optimization over a matroid constraint. Compared to the continuous greedy algorithm due to Calinescu, Chekuri, Pál and Vondrák, our algorithm is extremely simple and requires no rounding. It consists of the greedy algorithm followed by local search. Both phases are run not on the actual objective function, but on a related auxiliary potential function, which is also monotone submodular.

In our previous work on maximum coverage (the preceding paper), the potential function gives more weight to elements covered multiple times. We generalize this approach from coverage functions to arbitrary monotone submodular functions. When the objective function is a coverage function, both definitions of the potential function coincide.

Our approach generalizes to the case where the monotone submodular function has restricted curvature. For any curvature c, we adapt our algorithm to produce a (1−ec)/c approximation. This matches results of Vondrák, who has shown that the continuous greedy algorithm produces a (1−ec)/c approximation when the objective function has curvature c with respect to the optimum, and proved that achieving any better approximation ratio is impossible in the value oracle model.

The paper exists in several different versions:

  • The FOCS version only contains the case c=1.
  • The ArXiv version contains the result for general c. A similar account can be found in Ward's thesis.
  • The journal version contains a significantly simplified proof of the result for general c.
  • The extended version includes slightly better approximation ratios for bounded matroid rank, and an improved version of the continuous greedy algorithm.
  • The exposition gives a simplified exposition of the main part of the analysis, following ideas of Moran Feldman.

The journal version supersedes the previous versions.

SICOMP Journal FOCS ArXiv Extended Exposition

Video (IAS) Video (HIM) Slides (HIM) BibTeX

  author = {Yuval Filmus and Justin Ward},
  title = {A tight combinatorial algorithm for submodular maximization
  subject to a matroid constraint},
  booktitle = {53rd Annual {IEEE} Symposium on Foundations of Computer Science ({FOCS} 2012)},
  year = {2012},
  pages = {659--668}

  author = {Yuval Filmus and Justin Ward},
  title = {A tight combinatorial algorithm for submodular maximization subject to a matroid constraint},
  journal = {SIAM Journal on Computing},
  volume = {43},
  issue = {2},
  year = {2014},
  pages = {514--542}

Social choice theory

Threshold models for competitive influence in social networks (with Allan Borodin and Joel Oren)

WINE 2010

The problem of influence maximization deals with choosing the optimal set of nodes in a social networks so as to maximize the resulting spread of a technology (opinion, product ownership and so on), given a model of diffusion of influence in a network. A natural extension is a competitive setting, in which the goal is to maximize the spread of our technology in the presence of one or more competitors.

We suggest several natural extensions to the well-studied linear threshold model, showing that the original greedy approach cannot be used.

Furthermore, we show that for a broad family of competitive influence models, it is NP-hard to achieve an approximation that is better than a square root of the optimal solution; the same proof can also be applied to give a negative result for a conjecture in Carnes et al. about a general cascade model for competitive diffusion.

Finally, we suggest a natural model that is amenable to the greedy approach.

Download BibTeX

  author = {Allan Borodin and Yuval Filmus and Joel Oren},
  title = {Threshold models for competitive influence in social networks},
  booktitle = {The 6th Workshop on Internet and Network Economics
  ({WINE} 2010)},
  year = {2010},
  pages = {539--550}

Efficient vote elicitation under candidate uncertainty (with Craig Boutilier and Joel Oren)

IJCAI 2013

Top-k voting is an especially natural form of partial vote elicitation in which only length-k prefixes of rankings are elicited. We analyze the ability of top-k vote elicitation to correctly determine true winners with high probability, given probabilistic models of voter preferences and candidate availability. We provide bounds on the minimal value of k required to determine the correct winner under the plurality and Borda voting rules, considering both worst-case preference profiles and profiles drawn from the impartial culture and Mallows probabilistic models. We also derive conditions under which the special case of zero elicitation (i.e., k=0) produces the correct winner. We provide empirical results that confirm the value of top-k voting.

The proof of Theorem 10 is incomplete, but the issue is fixed in future work.

Download Full version BibTeX

  author = {Craig Boutilier and Yuval Filmus and Joel Oren},
  title = {Efficient vote elicitation under candidate uncertainty},
  booktitle = {23rd International Joint Conference on Artificial Intelligence
  ({IJCAI} 2013)},
  year = {2013},
  pages = {309--316}

Efficient voting via the top-k elicitation scheme: a probabilistic approach (with Joel Oren)

EC 2014

Many voting rules require the voters to give a complete preference order over the candidates. This is cumbersome, leading to the notion of top-k voting, in which the voters only give the length-k prefixes of their rankings. The question that we ask in this paper is: given a voting rule, for what value of k is it possible to predict the overall winner given only the length-k prefixes, with high probability, given enough voters?

We first consider the case of an impartial culture, in which the voters choose their preference profiles uniformly at random over all permutations. For positional scoring rules (like Borda) we give a nearly-tight threshold theorem for k. We also prove a strong, though non-optimal, lower bound for Copeland.

When the preference profiles are drawn from a biased distribution, such as the Mallows distribution, we show that the candidate toward which the distribution is biased wins the elections, for both positional scoring rules and Copeland, with high probability.

Finally, we consider adversarially-chosen preference distributions. We show that for positional scoring rules with geometrically decaying scores, k=O(log n) suffices to predict the winner with high probability.

Download BibTeX

  author = {Yuval Filmus and Joel Oren},
  title = {Efficient voting via the top-$k$ elicitation scheme: a
  probabilistic approach},
  booktitle = {Electronic Commerce (EC)},
  year = {2014},
  pages = {295--312}

Power distribution in randomized weighted voting: the effects of the quota (with Joel Oren, Yair Zick and Yoram Bachrach)

IJCAI 2016 (first half), SAGT 2016 (second half)

We study the Shapley value in weighted voting games. The Shapley value has been used as an index for measuring the power of individual agents in decision-making bodies and political organizations, where decisions are made by a majority vote process. We characterize the impact of changing the quota (i.e., the minimum number of seats in the parliament that are required to form a coalition) on the Shapley values of the agents. Contrary to previous studies, which assumed that the agent weights (corresponding to the size of a caucus or a political party) are fixed, we analyze new domains in which the weights are stochastically generated, modeling, for example, elections processes.

We examine a natural weight generation process: the Balls and Bins model, with uniform as well as exponentially decaying probabilities. We also analyze weights that admit a super-increasing sequence, answering several open questions pertaining to the Shapley values in such games.

Our results for the balls and bins model with exponentially decaying probabilities rely on a formula for the Shapley values of super-increasing sequences. Curiously, this formula gives rise to a continuous function reminiscent of Minkowski's question mark function.

Download BibTeX

  author = {Joel Oren and Yuval Filmus and Yair Zick and Yoram Bachrach},
  title = {Power distribution in randomized weighted voting: the effects of the quota},
  note = {Submitted},
  year = {2016}

Shapley values in random weighted voting games (with Joel Oren and Kannan Soundararajan)

We study the distribution of Shapley values in weighted voting games. The Shapley values measure the voting power collective decision making systems. While easy to estimate empirically given the parameters of a weighted voting game, the Shapley values are hard to reason about analytically.

We propose a probabilistic approach, in which the agent weights are drawn i.i.d. from some known exponentially decaying distribution. We provide a general closed-form characterization of the highest and lowest expected Shapley values in such a game, as a function of the parameters of the underlying distribution. To do so, we give a novel reinterpretation of the stochastic process that generates the Shapley variables as a renewal process. We demonstrate the use of our results on the uniform and exponential distributions.

Download BibTeX

  author = {Yuval Filmus and Joel Oren and Kannan Soundararajan},
  title = {Shapley values in weighted voting games with random weights},
  note = {Manuscript},
  year = {2017}


Automatic web-scale information extraction (with Philip Bohannon, Nilesh Dalvi, Nori Jacoby, Sathiya Keerthi and Alok Kirpal)


In this demonstration, we showcase the technologies that we are building at Yahoo! for web-scale information extraction. Given any new website, containing semi-structured information about a pre-specified set of schemas, we show how to populate objects in the corresponding schema by automatically extracting information from the website.

Work done while I was a summer intern in Yahoo! Tel Aviv.

Link BibTeX

  author = {Philip Bohannon and Nilesh Dalvi and Yuval Filmus and Nori
  Jacoby and Sathiya Keerthi and Alok Kirpal},
  title = {Automatic web-scale information extraction},
  booktitle = {Proceedings of the 2012 {ACM} {SIGMOD} International Conference on Management of Data},
  year = {2012},
  pages = {609--612}

Lower bounds for context-free grammars

Information Processing Letters, Volume 111, Issue 18, 2011, pp. 895–898

Ellul, Krawetz, Shallit and Wang prove an exponential lower bound on the size of any context-free grammar generating the language of all permutations over some alphabet. We generalize their method and obtain exponential lower bounds for many other languages, among them the set of all squares of given length, and the set of all words containing each symbol at most twice.

The version below corrects two typos in the proof of Proposition 6: w1w2 should be x(w1)∼x(w1), and in the following sentence N−1(A) should be x(N−1(A)); and a typo in the statement of Theorem 9: the exponent should be t rather than n.

Download BibTeX

  author = {Yuval Filmus},
  title = {Lower bounds for context-free grammars},
  journal = {Information Processing Letters},
  volume = {111},
  issue = {18},
  year = {2011},
  pages = {895--898}

Inequalities on submodular functions via term rewriting

Information Processing Letters, Volume 113, Issue 13, 2013, pp. 457–464

We devise a method for proving inequalities on submodular functions, with a term rewriting flavour. Our method comprises of the following steps:

  • Start with a linear combination X of the values of the function.
  • Define a set of simplification rules.
  • Conclude that XY, where Y is a linear combination of a small number of terms which cannot be simplified further.
  • Calculate the coefficients of Y by evaluating X and Y on functions on which the inequality is tight.

The crucial third step is non-constructive, since it uses compactness of the dual cone of submodular functions. Its proof uses the classical uncrossing technique with a quadratic potential function.

We prove several inequalities using our method, and use them to tightly analyze the performance of two natural (but non-optimal) algorithms for submodular maximization, the random set algorithm and local search.

Download BibTeX

  author = {Yuval Filmus},
  title = {Inequalities on submodular functions via term rewriting},
  journal = {Information Processing Letters},
  volume = {113},
  issue = {13},
  year = {2013},
  pages = {457--464}

Universal codes of the natural numbers

Logical Methods in Computer Science, Volume 9, Issue 3, 2013, Paper 7.

A code of the natural numbers is a uniquely-decodable binary code of the natural numbers with non-decreasing codeword lengths, which satisfies Kraft's inequality tightly. We define a natural partial order on the set of codes, and show how to construct effectively a code better than a given sequence of codes, in a certain precise sense. As an application, we prove that the existence of a scale of codes (a well-ordered set of codes which contains a code better than any given code) is independent of ZFC.

Journal ArXiv Preprint BibTeX

  author = {Yuval Filmus},
  title = {Universal codes of the natural numbers},
  journal = {Logical Methods in Computer Science},
  volume = {9},
  issue = {3},
  year = {2013},
  pages = {paper 7}

On the spectra of direct sums and Kronecker products of side length 2 hypermatrices (with Edinah K. Gnang)


We study the spectral theory of hypermatrices, initiated by Gnang, Elgammal and Retakh.

Here are some of our results:

  • We show that Hadamard hypermatrices of side length 2 exist unless the order is an even number larger than 2.
  • We determine the characteristic polynomial and hyperdeterminant of matrices of side length 2.
  • We generalize the Rayleigh quotient to hypermatrices.

ArXiv BibTeX

  author = {Yuval Filmus and Edinah K. Gnang},
  title = {On the spectra of direct sums and Kronecker products of side length 2 hypermatrices},
  howpublished = {Manuscript},
  year = {2016}

Unpublished notes

Spectral methods for intersection problems

A survey of Friedgut's research program in extremal combinatorics. Friedgut uses spectral methods — Hoffman's eigenvalue bound — to obtain tight bounds on measures of intersecting families. His method has the advantage of implying stability: families of near-maximal measure are similar to families of maximal measure.

Results surveyed:

I prepared this survey for my depth oral.


Monotone feasible interpolation as games

Bonet, Pitassi and Raz and, independently, Krajíček came up with the idea to use feasible interpolation as a vehicle for lower bounds on proof systems. Bonet et al. presented their construction in a somewhat ad hoc manner, and Krajíček used a theorem of Razborov generalizing Karchmer–Wigderson games. We provide a different interpretation of their arguments in terms games which are more suitable than the ones considered by Razborov.

Our account includes three flavors of arguments: ones following Bonet et al., ones following Krajíček, and ones following a suggestion of Neil Thapen. The different arguments prove lower bounds for slightly different formulas.

Download Slides

The Carromboard Problem

Snepscheut came up with the following puzzle. There is a table with four coins at the center of the four sides. The goal is to get all of the coins in the same orientation, that is, all heads or all tails. You never get to see the coins (you are blindfolded), but you can turn one or two of them. Prior to each move, the table is rotated by an arbitrary, unknown multiple of 90 degrees. How many moves do you need to reach the goal?

Dijkstra generalized the solution to 2n sides. We consider an even more general problem, in which the table has n sides, and the coins have m “states” which are modified by addition modulo m (more generally, one could think of a finite Abelian group for the state of the coins). We show that:

  • The puzzle is solvable if and only if either n=1, m=1, or n and m are powers of the same prime.
  • When m is prime, we explicitly describe all minimum-length solutions.

For similar results and more, see Rotating-table games and derivatives of words by Bar Yehuda, Etzion and Moran.



Seven trees in one

Andreas Blass explained how the type-theoretic identity T=1+T2 for binary trees leads to a “finitistic” identity T7=T (the solution to the former equation is a primitive sixth root of unity). More generally, he proved that two polynomials in T are “equal” (under his finitistic interpretation) if and only if they agree on a primitive sixth root of unity and in terms of cardinalities. We give an exposition of this result (using work by Fiore and Leinster), using a more intuitive definition of “strong” equality, as equality given by an algorithm which also works for infinite binary trees.

This talk was given in the Toronto Student Seminar on 16/9/2009.


Modern integer factorization methods

We survey several modern integer factorization methods, including Pollard's ρ, Pollard's p−1, Williams' p+1, the elliptic curve method, Shanks' continued fractions algorithm, the quadratic sieve (including MPQS and Dixon's provable variant) and the number-field sieve (which is only sketched).

Originally given as a talk in the Toronto Student Seminar on 25/11/2009, this talk has proven popular and I gave it several more times.


Two proofs of the central limit theorem

We provide two proofs of the central limit theorem (up to Lévy's continuity theorem), one using cumulants and the other using moments. As a bonus, we also prove the asymptotic normality of the number of distinct prime factors of a ‘random’ integer. Our account follows the exposition in the book The semicircle law, free random variables and entropy.

This talk was given in the Toronto Student Seminar on 20/1/2010.


Hardness of approximating set cover

An exposition of Feige's celebrated result on the hardness of approximating set cover.

This talk was given as part of the PCP reading group on 23/1/2010.


Matrix multiplication

An exposition on algorithms for matrix multiplication, in two parts:

  • Part 1:
    • The arithmetic model.
    • Bilinear normal form for matrix multiplication.
    • Tensor notation and tensor rank.
    • Border rank.
    • Schönhage's τ theorem (the asymptotic sum inequality).
    • Coppersmith's Õ(n2) algorithm for multiplying rectangular matrices.
  • Part 2:
    • The laser method.
    • The Coppersmith-Winograd algorithms.
    • Capacity: the fundamental combinatorial underpinning of the Coppersmith-Winograd method.

These talks were given in the Toronto Student Seminar on 2/2/2012 and 9/2/2012, though the second one has been significantly updated since.

The talks were given again in the IAS theory seminar, on 25/2/2014 and 4/3/2014. The second part has been significantly updated: the combinatorial construction has been simplified following Davie and Stothers, and the general presentation follows Le Gall's recent paper.

Part 1 Part 2 Part 2 (updated)

Video of part 1 Video of part 2

Permanent is hard to compute even on a good day

Cai, Pavan and Sivakumar showed that it is hard to compute the permanent even with an inversely polynomial success probability, assuming the worst-case hardness of computing the permanent. Their proof combines the LFKN protocol with Sudan's list-decoding algorithm for Reed-Solomon codes. We given an exposition of their result, as well as several results leading to it.

This talk was given in the Toronto Student Seminar on 17/9/2012.


Submodular maximization

We survey several recent algorithmic results on submodular maximization:

We also briefly survey some lower bounds:

This talk was given in the Toronto Student Seminar on 31/1/2013.



Reckhow's theorem

We given an exposition of Reckhow's theorem from his thesis, which extends Spira's formula balancing to Frege proofs. This balancing theorem, which doesn't appear explicitly in the thesis, is essential for showing that all Frege proofs are p-equivalent, one of the main results of the thesis.

An alternative exposition can be found in Krajíček's monograph Bounded Arithmetic, Propositional Logic and Complexity Theory.


Smolensky's polynomial method

We give an exposition of Smolensky's fundamental paper on the polynomial method, a lower bound method in circuit complexity. Books usually contain a simpler argument which works only for parity, whereas Smolensky's argument also works for majority (directly). Our exposition omits the step of approximating a constant depth circuit by a low-degree polynomial.


Forcing with random variables and proof complexity

An exposition of parts of Jan Krajíček's book Forcing with random variables and proof complexity, concentrating on lower bounds for constant depth proof systems.

Parts of this exposition has been given as talks in a reading group on the book on 13/6/2013 and 20/6/2013.


Harper's isoperimetric inequality

An exposition of Harper's proof of his edge isoperimetric inequality for the hypercube, following his book Global methods for combinatorial isoperimetric problems.

The proof uses a generalization of shifting that Harper calls compression.

Compared to Harper's original proof (also reproduced in the book), the compression proof includes only one simple calculation.


Short notes

Antichains on the Boolean lattice of dimension 6

We provide a list of all inequivalent non-trival antichains on the Boolean lattice of dimension 6, excluding the empty antichain and the one containing the empty set. Alternatively, this is a list containing all inequivalent non-constant monotone Boolean functions on six inputs, given by their minterms.

Antichains depending of dimension n are given in terms of the points 1, …, n. The number of antichains of given dimension (including the two trivial cases) forms the sequence A003182.


NPN equivalence classes of Boolean functions

Two Boolean functions are NPN-equivalent if they can be reached from one another by permuting the inputs, negating some of the inputs, and possibly negating the output. The number of different equivalence classes for a given number of variables forms the sequence A000370, which starts 2, 4, 14, 222, 616126, for functions of 1 to 5 variables, respectively.

For n up to 5, we have compiled a list of all NPN-equivalence classes of Boolean functions on n variables. Each such function is given as a hexadecimal integer in which bit i is the value at the ith input.

n=1 n=2 n=3 n=4 n=5

Triangle-intersecting families of graphs on eight vertices

We given a Katona-like proof that a triangle-intersecting family of graphs contains at most 1/8 of the graphs. Unfortuantely, our proof works only on up to eight vertices. We discuss several other methods which also cannot give a general proof.

Parts of this note are summarized in my thesis.


Khintchine-Kahane using Fourier Analysis

Latała and Oleszkiewicz proved the special L1 case of the Khintchine-Kahane inequality. We reformulate their proof using Fourier analysis.


Regular languages closed under Kleene plus

Vincenzo Ciancia asked on cstheory.stackexchange about the class of regular languages satisfying the following property: whenever a word w belongs to the language, all of its positive powers wk also belong to the language. He termed these languages ‘circular languages’. Answering his question, we have shown the following:

  • Every circular language can be written as the union of expressions r+. We exhibit a language where this union cannot be disjoint.
  • Given a DFA for a regular language, it is PSPACE-complete to decide whether the language is circular.

The normal form appears in a paper by Calbrix and Nivat, and the PSPACE-completeness result follows quite easily from a paper by Kozen, as I detail in my answer. My proofs appear below.

Part 1 Part 2

Self-avoiding walks on the integers which move at most two integers at a time

Yaroslav Bulatov asked on mathoverflow what is the asymptotic number of self-avoiding integer walks of length n in which adjacent positions are either 1 or 2 apart. We obtain a formula for the exact number of such walks, and deduce that it is Õ(μn), where μ≈2.20556943040059.


On the sequence n mod x, 1≤x≤√n

Avinoam Braverman considered the sequence n mod x, where 1≤x≤√n, took its local minima, and plotted the results. When n is large, a curious pattern composed out of what seem to be triangles appears. We explain this phenomenon heuristically, given formulas for the envelope of the “triangles” (which turn out to be quadratic functions). We go on to describe the envelope of the plot when an arbitrary number of the local minima and local maxima operations are composed.


Largest adjacent segments on the unit circle

Suppose n points are thrown on the circumference of a unit-circumference circle, partitioning the circumference into n segments. What is the expected length of the kth smallest segment? There is a well-known formula for this expectation.

We consider the expected length of kth smallest two adjacent segments. We develop a method for computing them exactly, and compute the expectations for several small n,k. The results do not seem to fit into a nice pattern.


Permutations avoiding patterns of length 3

We provide a bijective proof for the well-known fact that the number of 123-avoiding permutations and the number of 132-avoiding permutations are both counted by Catalan numbers. Simion and Schmidt came up with a direct bijection between the two sets of permutations, and their proof is recommended over mine.


Until cannot be expressed using Next, Always, Eventually

It is well-known that the until operator in linear temporal logic (LTL) cannot be expressed using next, always and eventually. We provide a simple proof of this fact.


Equivalent definitions of the Sprague–Grundy function

We prove that several equivalent definitions of the Sprague–Grundy function coincide (an exercise given in a course on combinatrial games given by Aviezri Fraenkel).


Proof of the μp version of the Erdős–Ko–Rado theorem using Katona's method

Katona gave a simple proof of the Erdős–Ko–Rado theorem. We adapt his proof to the μp setting. We are also able to prove uniqueness, but not stability.


Examples of the GCD proof system

If (x,y)=1 then (x+y,xy)=1. The first part shows how to prove this using standard arguments and using the characterization of GCD as the minimal positive value obtained as an integer combination of the operands. The second part generalizes the argument to a lemma involving n variables.

Part 1 Part 2

A combinatorial interpretation for the product of two geometric series in independent variables

We give a combinatorial proof of the identity 1/(1−x)⋅1/(1−y) = 1/(1−xy+xy).


Thirteenth proof of a result about tiling a rectangle

Stan Wagon gave fourteen proofs of the following result about tiling a rectangle: if a rectangle can be tiled using rectangles with at least one integral side, then the tiled rectangle also has at least one integral side. We paraphrase his 13th proof.


A positive proof of Dehn's theorem

Dehn proved that if a rectangle can be tiled by rectangles whose sides are commensurable, then the tiled rectangle is also commensurable. His proof, as described in Proofs from the book, applies a homomorphism which results in possibly negative side lengths. We modify his proof so that all side lengths are positive.

The crucial ingredient is the following lemma: for each finite set of positive reals there is a basis (over the rationals) of positive reals such that every element in the set is a non-negative integral combination of base elements. We provide two proofs of this lemma, one due to us and one due to Avinoam Braverman.


On the number of NOT gates needed to invert n inputs

We solve the following puzzle: given an arbitrary supply of AND gates and OR gates, invert n inputs using as few NOT gates as possible.


Orthogonal matrices with optimal L2 norm

Question 8 in Chapter 2 of The Probabilistic Method asks us to show that for every n×n orthogonal matrix and 1≤kn, there is a column such that the squared L2 norm of its first k entries is at least k/n, or at most k/n. It also asks for an example in which this is tight. We exhibit such an example which works simultaneously for all k.


Riddle concerning ±1 vectors

A big sheet of paper contains 2n rows consisting of all possible vectors of length n whose entries are +1 or −1. Someone changes some of the entries to zero. Show that there must be a non-empty subset of the rows summing to zero.

Probably much harder than you think it is!

Download mathoverflow

Range of symmetric matrices mod 2

We show that the range of a symmetric matrix over GF(2) always contains its diagonal. We present both our algorithmic proof and a simple proof by Noga Alon. A simplification of our proof has been given by Soltys (Lemma 9).

Download math.stackexchange


Lagrange's proof of the four square theorem

Démonstration d'un Théorème d'Arithmétique, Nouveaux Mémoires de l'Académie royale des Science et Belles-Lettres de Berlin, année 1770.

French English

Lagrange's proof of Wilson's theorem

Démonstration d'un Théorème nouveau concernant les Nombres premiers, Nouveaux Mémoires de l'Académie royale des Science et Belles-Lettres de Berlin, année 1771.


Hurwitz's proof of the transcendence of e

Beweis der Transzendenz der Zahl e, Mathematische Annalen, Bd. 43, 1893, S. 220–221.

Translation from Herstein's Topics in Algebra, p. 176–178.

German English

Gordan's proof of the transcendence of e and π

Transcendenz von e und π, Mathematische Annalen, Bd. 43, 1893, S. 222–224.


Hilbert's proof of the transcendence of e and π

Über die Transzendenz der Zahlen e und π, Mathematische Annalen, Bd. 43, 1893, S. 216–219.


Hurwitz's proof that four-square-like identities only occur in dimensions 1,2,4,8

Über die Komposition der quadratischen Formen von beliebig vielen Variablen, Nachrichten von der k. Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-physikalische Klasse, 1898, S. 309–316.

German English

McKay's easy proof of Cauchy's theorem in group theory

Another proof of Cauchy's theorem, American Mathematical Monthly, Vol. 66 (February 1959), p. 119.


Wilkie's proof of the switching lemma

Excerpt from Modèles non-standard en arithmétique et théorie des ensembles, Jean-Pierre Ressayre & Alec J. Wilkie.



A collection of poems by Shalom Shabazi

Shalom Shabazi is the most important Jewish poet from Yemen. Following a facsimile of a diwan published by Seri and Tobi, we have copied a few of his Hebrew poems.


The Kuzari in Arabic

The Kuzari is an important Jewish theological work attributed to the medieval poet Yehuda Halevi. While originally written in Arabic, it is usually found in translation. Following an edition by Rabbi Qafih, we have copied the entire Arabic text.





Other stuff

Field ration (a poem by David Avidan)

My translation of Avidan's poem Menat Krav, in which Avidan, wary of the world, sleeps for 2000 years and awakens to a sci-fi future.


20 Patiencen

A patience collection by Ella von Haunstein from the early 20th century, regrettably without the diagrams.


An etiology of the major and minor scales

Our own contribution to the mythological origins of the diatonic scales.


On the background

The background image is the final result of a 2D cellular automaton. For rules and animations, follow the link.


Rhythms of resistance

Rhythms of resistance is a world-wide network of political samba bands. I have prepared Python code for generating rhythms, along with two examples. There is also a cheat sheet if you're playing.

Program Sample 1 Sample 2 Cheat sheet