Analysis of Boolean Functions (CMU 18-859S, Spring 2007)
Lecture 29: Open Problems
May 3, 2007
Lecturer: Ryan O’Donnell Scribe: Ryan O’Donnell

1 Miscellaneous problems

Small total influence implies a large coefficient: Prove or disprove: For every $f : \{-1,1\}^n \to \{-1,1\}$ there exists some $S$ such that $|\hat{f}(S)| \ge 2^{-O(I(f))}$. One might also try to add the condition that the $S$ satisfies $|S| \le O(I(f))$. A lower bound of $2^{-O(I(f)^2)}$ follows from Friedgut's Theorem, and with this one can also get that $|S| \le O(I(f))$. The result definitely holds for monotone functions: From the proof of Friedgut/KKL one can show that if $\mathrm{Inf}_i(f) \le \tau$ for all $i$, then $I(f) \ge \Omega(\mathrm{Var}[f] \log(1/\tau))$ (this is usually credited to Talagrand [Tal94]). Thus either $|\hat{f}(\emptyset)| \ge \Omega(1)$, or there exists $i$ such that $\mathrm{Inf}_i(f) \ge 2^{-O(I(f))}$. But for monotone functions, $\mathrm{Inf}_i(f) = \hat{f}(\{i\})$.
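For small $n$ the quantities in this problem can be computed by brute force. The following is a minimal sketch, not part of the original notes: the helper names (`fourier_coefficients`, `influence`, `maj3`) are illustrative, and the example assumes the convention that a monotone function is increasing in each $\pm 1$ coordinate. It checks the monotone identity $\mathrm{Inf}_i(f) = \hat{f}(\{i\})$ on 3-bit Majority and reports the largest coefficient next to $I(f)$.

```python
from itertools import product, combinations
from math import prod

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} for f on {-1,1}^n, with S a tuple of indices."""
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            # f_hat(S) = E_x[ f(x) * prod_{i in S} x_i ]
            coeffs[S] = sum(f(x) * prod(x[i] for i in S) for x in points) / len(points)
    return coeffs

def influence(f, n, i):
    """Inf_i(f) = Pr_x[ f(x) != f(x with coordinate i flipped) ]."""
    points = list(product([-1, 1], repeat=n))
    flips = sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:]) for x in points)
    return flips / len(points)

def maj3(x):
    # Illustrative example function: Majority on 3 bits (monotone increasing).
    return 1 if sum(x) > 0 else -1

n = 3
coeffs = fourier_coefficients(maj3, n)
infs = [influence(maj3, n, i) for i in range(n)]
total_influence = sum(infs)                       # I(f) = 1.5
largest = max(abs(c) for c in coeffs.values())    # 0.5

for i in range(n):
    # Monotone case: the influence equals the degree-1 coefficient.
    print(i, infs[i], coeffs[(i,)])
print("I(f) =", total_influence, " max |f_hat(S)| =", largest)
```

The enumeration costs on the order of $4^n$ evaluations, so it is only useful as a sanity check on very small examples.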

Bounding level $k$ weight by level 1 weight: Let $f : \{-1,1\}^n \to \{-1,1\}$, and let $II(f)$ denote $\sum_{i=1}^n \mathrm{Inf}_i(f)^2$. Recall also that we write $W^k(f)$ for $\sum_{|S|=k} \hat{f}(S)^2$; note that $II(f) = W^1(f)$ if $f$ is monotone. Talagrand [Tal96] showed that for any $f : \{-1,1\}^n \to \{-1,1\}$, $W^2(f) \le O(II(f) \log(1/II(f)))$. Benjamini, Kalai, and Schramm [BKS99] generalized this to show that for each $k \ge 2$, $W^k(f) \le C_k \cdot II(f) \log^{k-1}(1/II(f))$, for some constant $C_k$. Unpublished work of Kindler shows that in fact one can make the $C_k$'s smaller as $k$ increases, with a bound $C_k \le O(1/k)$. A conjecture is that one can get $C_k \le O(1/k!)$; if so, this would be tight by considering the Tribes function.
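The definitions of $II(f)$ and the level-$k$ weights can be checked numerically on small functions. This is a rough sketch under the same brute-force approach; the function names and the two example functions (5-bit Majority, 2-bit Parity) are choices made here for illustration, not taken from the lecture.

```python
from itertools import product, combinations
from math import prod

def level_weights(f, n):
    """Return [W^0(f), ..., W^n(f)], where W^k sums f_hat(S)^2 over |S| = k."""
    points = list(product([-1, 1], repeat=n))
    weights = [0.0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            c = sum(f(x) * prod(x[i] for i in S) for x in points) / len(points)
            weights[k] += c * c
    return weights

def sum_squared_influences(f, n):
    """II(f) = sum_i Inf_i(f)^2 for a boolean-valued f."""
    points = list(product([-1, 1], repeat=n))
    total = 0.0
    for i in range(n):
        inf_i = sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:]) for x in points) / len(points)
        total += inf_i ** 2
    return total

def maj5(x):
    return 1 if sum(x) > 0 else -1   # monotone

def parity2(x):
    return x[0] * x[1]               # not monotone

W_maj, II_maj = level_weights(maj5, 5), sum_squared_influences(maj5, 5)
W_par, II_par = level_weights(parity2, 2), sum_squared_influences(parity2, 2)

print("Maj5:    II =", II_maj, " W^1 =", W_maj[1])   # equal, since Maj5 is monotone
print("Parity2: II =", II_par, " W^1 =", W_par[1])   # 2.0 versus 0.0
```

Parity shows why monotonicity matters for the identity $II(f) = W^1(f)$: its influences are all 1 while its level-1 weight is 0.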

2 Decision trees

Decision trees and influences for real-valued functions [OSSS05]: Recall we proved that for $f : \{-1,1\}^n \to \{-1,1\}$, $\mathrm{Var}[f] \le \sum_{i=1}^n \delta_i(f)\,\mathrm{Inf}_i(f)$. The question is to what extent this is true for functions $f : \{-1,1\}^n \to \mathbb{R}$; in particular, is it true that $\mathrm{Var}[f] \le C \cdot \sum_{i=1}^n \delta_i(f)\,\mathrm{Inf}_i(f)$ for some universal constant $C$? By an explicit example it is known that $C$ can't be 1, but the best example only gives a lower bound like $C \ge 1.1$.
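To make the boolean-valued inequality concrete, here is a small sketch that evaluates both sides for one decision tree. Everything in it is an assumption made for illustration: the nested-tuple tree encoding, the helper names, and the choice of a tree for 3-bit Majority. It reads $\delta_i$ as the probability, over a uniform input, that the tree queries coordinate $i$.

```python
from itertools import product

# A tree is a leaf (an int, +1 or -1) or a tuple (i, subtree_if_x_i_is_-1, subtree_if_x_i_is_+1).
MAJ3_TREE = (0, (1, -1, (2, -1, 1)), (1, (2, -1, 1), 1))

def evaluate(tree, x):
    while not isinstance(tree, int):
        i, lo, hi = tree
        tree = hi if x[i] == 1 else lo
    return tree

def query_probabilities(tree, n):
    """delta_i = Pr_x[tree queries x_i]; assumes no variable is repeated on a path."""
    delta = [0.0] * n
    def walk(t, p):
        if isinstance(t, int):
            return
        i, lo, hi = t
        delta[i] += p
        walk(lo, p / 2)
        walk(hi, p / 2)
    walk(tree, 1.0)
    return delta

def variance(f, n):
    points = list(product([-1, 1], repeat=n))
    mean = sum(f(x) for x in points) / len(points)
    return sum(f(x) ** 2 for x in points) / len(points) - mean ** 2

def influence(f, n, i):
    points = list(product([-1, 1], repeat=n))
    return sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:]) for x in points) / len(points)

f = lambda x: evaluate(MAJ3_TREE, x)
delta = query_probabilities(MAJ3_TREE, 3)                       # [1.0, 1.0, 0.5]
var_f = variance(f, 3)                                          # 1.0
rhs = sum(delta[i] * influence(f, 3, i) for i in range(3))      # 1.25
print(var_f, "<=", rhs)
```

The sketch only covers boolean-valued $f$; the open question concerns real-valued $f$, where the definition of influence has to be adapted and is not attempted here.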

3 DNFs

Total influence of DNF: As came up on Problem 1 of Homework #3: If $f$ is computable by a DNF of width $w$, must it hold that $I(f) \le w$? This would be sharp, by Parity, and proving a $2w$ upper bound is easy.
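The sharpness claim can be seen on a small case. The sketch below is illustrative only: the DNF encoding (a list of terms, each a dict from variable index to required value, with output $+1$ meaning true) and all names are assumptions made here. It builds the width-$w$ DNF for Parity on $w$ bits, whose total influence is exactly $w$, and a two-term width-2 DNF for comparison.

```python
from itertools import product

def dnf(terms):
    """Build f from a list of terms; each term maps variable index -> required value."""
    def f(x):
        return 1 if any(all(x[i] == v for i, v in t.items()) for t in terms) else -1
    return f

def width(terms):
    return max(len(t) for t in terms)

def total_influence(f, n):
    points = list(product([-1, 1], repeat=n))
    flips = sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:]) for x in points for i in range(n))
    return flips / len(points)

def parity_dnf(w):
    # One full-width term for every input with an odd number of -1 entries.
    return [dict(enumerate(a)) for a in product([-1, 1], repeat=w) if a.count(-1) % 2 == 1]

w = 3
parity_terms = parity_dnf(w)
I_parity = total_influence(dnf(parity_terms), w)       # 3.0, equal to the width

two_terms = [{0: 1, 1: 1}, {2: 1, 3: 1}]               # (x0 AND x1) OR (x2 AND x3)
I_two = total_influence(dnf(two_terms), 4)             # 1.5, below the width 2

print(width(parity_terms), I_parity)
print(width(two_terms), I_two)
```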

Fourier concentration for DNF: Let $f$ be computable by a poly-sized DNF. Is $f$ $\epsilon$-concentrated on a set of Fourier coefficients of cardinality at most $n^{C(\epsilon)}$ (i.e., polynomial for constant $\epsilon$)? This question is not actually very interesting for learning theory, since the immediate learning consequence is already superseded by Jackson's algorithm. Also, it's not clear whether or not Tribes already rules out this conjecture.

4 LTFs

Noise sensitivity of intersections of halfspaces [KOS04]: Peres's Theorem is that if $f : \{-1,1\}^n \to \{-1,1\}$ is an LTF (halfspace), then $\mathrm{NS}_\epsilon(f) \le O(\sqrt{\epsilon})$. By the union bound, this implies that if $f$ is the intersection (AND) of $k$ LTFs, then $\mathrm{NS}_\epsilon(f) \le O(k\sqrt{\epsilon})$. It is conjectured that the following better upper bound holds: $\mathrm{NS}_\epsilon(f) \le O(\sqrt{\log k}\,\sqrt{\epsilon})$. This would be tight, by considering $k$ symmetric LTFs with bias $1 - 1/k$ on disjoint sets of variables. The bound is known to hold if the $k$ LTFs are on disjoint sets of variables.

Most noise sensitive LTF: Let $n$ be odd and fix $0 < \epsilon < 1/2$. Show that the LTF on $n$ bits with highest noise sensitivity at $\epsilon$ is Majority. (Peres's Theorem implies this is true up to a constant factor.)
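For tiny $n$ the noise sensitivity can be computed exactly by enumerating inputs and flip patterns. The sketch below does this for three LTFs on 3 bits; the helper names and the particular weights and thresholds are illustrative assumptions, and comparing three functions is of course no evidence for the general statement.

```python
from itertools import product

def noise_sensitivity(f, n, eps):
    """NS_eps(f) = Pr[f(x) != f(y)], x uniform, y = x with each bit flipped w.p. eps."""
    total = 0.0
    for x in product([-1, 1], repeat=n):
        for flips in product([0, 1], repeat=n):
            k = sum(flips)
            weight = eps ** k * (1 - eps) ** (n - k)   # probability of this flip pattern
            y = tuple(-xi if z else xi for xi, z in zip(x, flips))
            if f(x) != f(y):
                total += weight
    return total / 2 ** n

def ltf(weights, theta):
    """sign(w . x - theta), with thresholds chosen below so that ties never occur."""
    return lambda x: 1 if sum(w * xi for w, xi in zip(weights, x)) > theta else -1

majority = ltf((1, 1, 1), 0)
dictator = ltf((1, 0, 0), 0)
and3 = ltf((1, 1, 1), 2)

eps = 0.1
ns_majority = noise_sensitivity(majority, 3, eps)   # 0.136
ns_dictator = noise_sensitivity(dictator, 3, eps)   # 0.1
ns_and3 = noise_sensitivity(and3, 3, eps)           # 0.06775
print(ns_majority, ns_dictator, ns_and3)
```

The value for Majority agrees with the Fourier formula $\mathrm{NS}_\epsilon(f) = \frac12 - \frac12 \sum_S (1-2\epsilon)^{|S|} \hat{f}(S)^2$, which gives $\frac12 - \frac12(0.75 \cdot 0.8 + 0.25 \cdot 0.8^3) = 0.136$.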

Approximate Chow Parameters: The following problem is attributed to P. Goldberg [Gol06] (see also [Ser06]). Let $f : \{-1,1\}^n \to \{-1,1\}$ be an LTF. It is known [Cho61] and not too hard to show that $f$'s "Chow Parameters" $\hat{f}(\emptyset), \hat{f}(\{1\}), \ldots, \hat{f}(\{n\})$ uniquely determine $f$ among the class of all boolean-valued functions. Now suppose $g : \{-1,1\}^n \to \{-1,1\}$ is another LTF satisfying

$$\sum_{|S| \le 1} (\hat{f}(S) - \hat{g}(S))^2 \le \epsilon.$$

Must $g$ be $o_{\epsilon \to 0}(1)$-close to $f$?
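The two distances in this question are easy to compute for small LTFs. The following sketch uses illustrative names and two hand-picked 3-bit LTFs, none of which come from the notes. It evaluates the squared Chow distance $\sum_{|S| \le 1} (\hat{f}(S) - \hat{g}(S))^2$ alongside the fraction of inputs on which $f$ and $g$ differ.

```python
from itertools import product

def ltf(weights, theta=0.0):
    return lambda x: 1 if sum(w * xi for w, xi in zip(weights, x)) > theta else -1

def chow_parameters(f, n):
    """[f_hat(empty), f_hat({0}), ..., f_hat({n-1})]."""
    points = list(product([-1, 1], repeat=n))
    mean = sum(f(x) for x in points) / len(points)
    degree_one = [sum(f(x) * x[i] for x in points) / len(points) for i in range(n)]
    return [mean] + degree_one

def hamming_distance(f, g, n):
    points = list(product([-1, 1], repeat=n))
    return sum(f(x) != g(x) for x in points) / len(points)

n = 3
f = ltf((1, 1, 1))    # Majority
g = ltf((3, 1, 1))    # the first weight dominates, so g is the dictator x_0

chow_f = chow_parameters(f, n)    # [0.0, 0.5, 0.5, 0.5]
chow_g = chow_parameters(g, n)    # [0.0, 1.0, 0.0, 0.0]
chow_dist = sum((a - b) ** 2 for a, b in zip(chow_f, chow_g))   # 0.75
dist = hamming_distance(f, g, n)                                # 0.25
print(chow_dist, dist)
```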

5 Learning

Learning monotone DNF: Can poly-size monotone DNF be PAC-learned under the uniform distribution in polynomial time? (Feel free to assume that the accuracy parameter, $\epsilon$, is a constant.) This is not inherently a Fourier analysis problem, but it's such a big open problem in PAC-learning that it's worth mentioning; also, it's likely that Fourier analysis would play a big role in any solution.

Learning juntas: In addition to the problems for which Avrim Blum will give you prizes, one may ask: Can $k$-juntas over $\{1,2,3\}^n$ be learned in time $n^{(1-\Omega(1))k}$? How about juntas under the $p$-biased product distribution, $p \ne 1/2$?

9 Circuit complexity

Small total influence implies small approximating circuits for monotone functions [BKS99]: The result of Linial, Mansour, and Nisan [LMN93] implies that if $f : \{-1,1\}^n \to \{-1,1\}$ has a circuit of depth $d$ and size $s$, then $I(f) \le O(\log^d(s))$.

Boppana [Bop97] improved the exponent to $d - 1$, which is sharp (by considering Parity). It's possible that the following "reverse" result holds, approximately, for monotone functions: Let $f : \{-1,1\}^n \to \{-1,1\}$ be monotone, and let $\epsilon > 0$. Then there is a circuit $\varphi$ which computes $f$ correctly on a $1 - \epsilon$ fraction of inputs and has size $s$ and depth $d$ satisfying

$$I(f) \le O(\log^d(s)).$$

Note that it is impossible to improve the exponent here to $d - 1$, by a recent result [OW07].

10 Threshold phenomena, random graphs, percolation

Total influence lower bounds [Kal00]: Find "general conditions" on functions $f : \{-1,1\}^n \to \{-1,1\}$ that imply $I(f) \ge n^{\Omega(1)}$. The motivation here is showing that monotone functions have very sharp thresholds. Bourgain and Kalai [BK97] have results that can show $I(f) \ge \mathrm{polylog}(n)$ if $f$ has enough symmetries. The only other method I know is the inequality relating influences and decision tree complexity from Lecture 26.

Influence versus Fourier entropy [FK96]: This is a particular case of the above problem. It would also imply the first problem listed in the Miscellaneous section. Let $f : \{-1,1\}^n \to \{-1,1\}$. Show that

$$\sum_{S \subseteq [n]} \hat{f}(S)^2 \log(1/\hat{f}(S)^2) \le O(I(f)).$$

This seems very similar to the Log-Sobolev inequality proven in Homework #4, but it’s not clear if they are actually related (in particular, this conjecture clearly needs that f is boolean-valued).
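Both sides of the conjectured inequality are straightforward to evaluate on small examples. The sketch below is a numerical illustration only: the names are chosen here, the logarithm is taken base 2 (the notes do not fix a base), and terms with $\hat{f}(S) = 0$ are treated as contributing 0.

```python
from itertools import product, combinations
from math import prod, log2

def squared_coefficients(f, n):
    """Return {S: f_hat(S)^2}; these sum to 1 for boolean-valued f."""
    points = list(product([-1, 1], repeat=n))
    out = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            c = sum(f(x) * prod(x[i] for i in S) for x in points) / len(points)
            out[S] = c * c
    return out

def fourier_entropy(f, n):
    # sum_S f_hat(S)^2 * log2(1 / f_hat(S)^2), skipping zero coefficients
    return sum(p * log2(1 / p) for p in squared_coefficients(f, n).values() if p > 0)

def total_influence(f, n):
    # I(f) = sum_S |S| * f_hat(S)^2
    return sum(len(S) * p for S, p in squared_coefficients(f, n).items())

def maj3(x):
    return 1 if sum(x) > 0 else -1

def and2(x):
    return 1 if x[0] == 1 and x[1] == 1 else -1

H_maj, I_maj = fourier_entropy(maj3, 3), total_influence(maj3, 3)   # 2.0, 1.5
H_and, I_and = fourier_entropy(and2, 2), total_influence(and2, 2)   # 2.0, 1.0
print(H_maj, I_maj, H_and, I_and)
```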

Thresholds for subgraph containment: Let $H$ be any fixed graph on up to $n$ vertices, and let $f$ be the monotone graph property (in the $G(n, p)$ model) of containing a copy of $H$. For which graph $H$ is the threshold sharpest? If one could show $I^{(p_c)}(f) \le O(\sqrt{v})$ for any subgraph containment property $f$ (with $p_c$ the appropriate critical probability), then one could use the results of Lecture 26 to recover the result of [Grö92], showing $R(f) \ge \Omega(v^{3/2})$ for subgraph containment properties.

Distance variance of first passage percolation: Consider the graph on $\mathbb{Z}^2$ where each vertex is connected to its 4 neighbors at distance 1. Choose each edge to have "length" either 1 or 2, independently and with probability $1/2$ each. Now let $f$ denote shortest-path distance from $(0, 0)$ to $(v, v)$, where $v \in \mathbb{N}$ is thought of as large. Using the result of Talagrand [Tal94] (mentioned in the first problem of the Miscellaneous section), [IB03] have shown that $\mathrm{Var}[f] \le O(v/\log v)$. The goal is to prove that $\mathrm{Var}[f] = \Theta(v^{2/3})$, which the statistical physicists "know" to be the right answer.
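The random variable $f$ is easy to sample. The sketch below is a small Monte Carlo illustration with names and parameters chosen here, not taken from the notes. It runs Dijkstra's algorithm with lazily sampled edge lengths and restricts the search to the box $[-v, 2v]^2$; this loses nothing, because any path leaving that box uses more than $4v$ edges while the staircase path already has length at most $4v$.

```python
import heapq
import random

def passage_time(v, rng):
    """Shortest-path distance from (0,0) to (v,v) with i.i.d. edge lengths in {1, 2}."""
    lo, hi = -v, 2 * v          # an optimal path never leaves this box
    lengths = {}                # edge lengths, sampled the first time an edge is seen

    def length(a, b):
        key = (a, b) if a < b else (b, a)
        if key not in lengths:
            lengths[key] = rng.choice((1, 2))
        return lengths[key]

    dist = {(0, 0): 0}
    heap = [(0, (0, 0))]
    while heap:
        d, u = heapq.heappop(heap)
        if u == (v, v):
            return d
        if d > dist[u]:
            continue            # stale heap entry
        x, y = u
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if lo <= w[0] <= hi and lo <= w[1] <= hi:
                nd = d + length(u, w)
                if nd < dist.get(w, float("inf")):
                    dist[w] = nd
                    heapq.heappush(heap, (nd, w))

rng = random.Random(0)          # fixed seed, chosen arbitrarily
v = 15
samples = [passage_time(v, rng) for _ in range(100)]
mean = sum(samples) / len(samples)
sample_variance = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
print("mean:", mean, "sample variance:", sample_variance)
```

Every sample lies between $2v$ and $4v$, since a path from $(0,0)$ to $(v,v)$ has at least $2v$ edges. Values of $v$ this small say nothing about the conjectured $\Theta(v^{2/3})$ growth.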

Percolation on the grid: The scenario here is similar to the previous one. Consider an $(m + 1) \times m$ subgrid. Let each of the $n = 2m^2 - 1$ edges be present or absent with probability $1/2$, and let $f : \{-1,1\}^n \to \{-1,1\}$ be the indicator of a "crossing"; i.e., a path from the left side to the right side. (A cute exercise: show $E[f] = 0$.) One problem is to prove the following conjecture made by physicists: $I(f) = \Theta(n^{3/8})$. Another is to prove the following conjecture from [BKS99]: For every $\epsilon > 0$, for sufficiently large $m$, the following holds:

$$\Pr_{\text{horizontal edges}}\Bigl[\,\Bigl|\Pr_{\text{vertical edges}}[\text{crossing} \mid \text{horizontal edges}] - 1/2\Bigr| \ge \epsilon\Bigr]$$

if one chooses just the
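The "cute exercise" $E[f] = 0$ in the percolation problem can be confirmed by exhaustive enumeration for very small $m$. The sketch below assumes one reading of the setup: vertices $(x, y)$ with $0 \le x \le m$ and $0 \le y < m$, which gives $m^2$ horizontal and $m^2 - 1$ vertical edges and so matches $n = 2m^2 - 1$. The function names are illustrative.

```python
from itertools import product

def grid_edges(m):
    """Edges of the grid on vertices (x, y), 0 <= x <= m, 0 <= y < m."""
    horizontal = [((x, y), (x + 1, y)) for x in range(m) for y in range(m)]
    vertical = [((x, y), (x, y + 1)) for x in range(m + 1) for y in range(m - 1)]
    return horizontal + vertical

def has_crossing(m, open_edges):
    """True if the open edges connect the left column x = 0 to the right column x = m."""
    adjacent = {}
    for a, b in open_edges:
        adjacent.setdefault(a, []).append(b)
        adjacent.setdefault(b, []).append(a)
    stack = [(0, y) for y in range(m)]
    seen = set(stack)
    while stack:                       # depth-first search from the whole left side
        u = stack.pop()
        if u[0] == m:
            return True
        for w in adjacent.get(u, []):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def crossing_count(m):
    """Return (number of edges, number of edge configurations with a crossing)."""
    edges = grid_edges(m)
    count = 0
    for bits in product([0, 1], repeat=len(edges)):
        open_edges = [e for e, b in zip(edges, bits) if b]
        count += has_crossing(m, open_edges)
    return len(edges), count

for m in (1, 2):
    n_edges, crossings = crossing_count(m)
    print(m, n_edges, crossings, 2 ** n_edges)   # crossings is exactly half of 2^n
```

For $m = 2$ this enumerates $2^7 = 128$ configurations, of which 64 contain a crossing, so the $\pm 1$ indicator has mean 0. The enumeration grows as $2^{2m^2 - 1}$ and is impractical beyond $m = 3$ or so.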

11 Arithmetic Combinatorics

Triangle removal in $\mathbb{F}_2^n$ [Gre04b]: Suppose $f : \mathbb{F}_2^n \to \{0, 1\}$ is $\epsilon$-far from being triangle-free (meaning that there are no $x, y, z$ such that $x + y + z = 0$ and $f(x) = f(y) = f(z) = 1$). Prove or disprove: the no-triangles test (pick $x, y$ at random and check that $f(x), f(y), f(x + y)$ are not all 1) rejects with probability at least $\mathrm{poly}(\epsilon)$.
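The rejection probability of the no-triangles test can be computed exactly for small $n$. In this sketch, which uses names chosen here for illustration, elements of $\mathbb{F}_2^n$ are encoded as integers in $[0, 2^n)$ so that vector addition is bitwise XOR, and $f$ is represented by the set of points where it equals 1.

```python
def rejection_probability(support, n):
    """Pr over uniform x, y in F_2^n that f(x) = f(y) = f(x + y) = 1.

    `support` is the set of points (as ints) where f equals 1; x + y is x ^ y.
    """
    hits = sum(1 for x in support for y in support if (x ^ y) in support)
    return hits / 4 ** n

# f = indicator of the nonzero vectors of F_2^2: any two distinct ones sum to the third.
nonzero = {1, 2, 3}
p_nonzero = rejection_probability(nonzero, 2)        # 6 of 16 pairs: 0.375

# f = indicator of odd-weight vectors in F_2^3: a sum of two of them has even weight.
odd_weight = {x for x in range(2 ** 3) if bin(x).count("1") % 2 == 1}
p_odd = rejection_probability(odd_weight, 3)         # triangle-free: 0.0

print(p_nonzero, p_odd)
```

Measuring how far a given $f$ is from triangle-free is the harder part of the problem and is not attempted here.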

Cosets in sumsets (see [Gre04a]): Let $A \subseteq \mathbb{F}_2^n$ have density at least $1/4$. Green has shown that the set $A + A := \{a + b : a, b \in A\}$ must contain a coset of codimension at most $n - \Omega(n)$. On the other hand, Ruzsa has shown that there exists an $A$ of density at least $1/4$ (specifically, the set of vectors with at least $n/2 + \sqrt{n}/2$ 1's) such that $A + A$ doesn't contain any coset of codimension at most $\sqrt{n}$. Narrow this gap.

Polynomial Freiman-Ruzsa Conjecture: This important open problem in arithmetic combinatorics is attributed to Marton by Ruzsa (see, e.g., [Gre04a]). It has many equivalent formulations, including the following: Let $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ (not just $\to \mathbb{F}_2$) satisfy $\Pr_{x,y}[f(x) + f(y) = f(x + y)] \ge \epsilon$. Then there is some affine linear function $g : \mathbb{F}_2^n \to \mathbb{F}_2^m$ such that $\Pr[f(x) = g(x)] \ge \mathrm{poly}(\epsilon)$.

Singularity probability for random matrices: Let $M$ be a random $n \times n$ matrix, where each entry is an independent random $\pm 1$ bit. Let $P_n$ denote the probability that $M$ has determinant 0. Clearly this will happen if any two rows are the same (up to sign) or any two columns are the same (up to sign). This gives a lower bound:

$$P_n \ge (1 - o(1))\, 2n^2 \cdot 2^{-n}.$$

It is conjectured that this bound is correct up to a $1 + o(1)$ factor. A breakthrough result [JK95] gave the upper bound $P_n \le .999^n$, and the best current result [TV07] gets this down to $(3/4 + o(1))^n$,
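For very small $n$, $P_n$ can be computed exactly by enumerating all $2^{n^2}$ sign matrices. The sketch below is a brute-force illustration with names chosen here; it uses an exact integer determinant via the Leibniz formula. The bounds discussed in the text are asymptotic, so these small values are not meant to be compared against them.

```python
from itertools import product, permutations

def determinant(rows):
    """Exact integer determinant via the Leibniz formula (only sensible for tiny n)."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1      # sign of the permutation
        for i in range(n):
            term *= rows[i][perm[i]]
        total += term
    return total

def singular_count(n):
    """Return (number of singular n x n matrices with +-1 entries, total number)."""
    count = 0
    for entries in product([-1, 1], repeat=n * n):
        rows = [entries[i * n:(i + 1) * n] for i in range(n)]
        if determinant(rows) == 0:
            count += 1
    return count, 2 ** (n * n)

for n in (2, 3):
    singular, total = singular_count(n)
    print(n, singular, total, singular / total)   # P_2 = 8/16, P_3 = 320/512
```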

[LMN93] N. Linial, Y. Mansour, and N. Nisan. Constant depth circuits, Fourier transform and learnability. Journal of the ACM, 40(3):607–620, 1993.

[MO05] E. Mossel and R. O’Donnell. Coin flipping from a cosmic source: On error correction of truly random bits. RS&A, 26(4):418–436, 2005.

[MOO05] E. Mossel, R. O’Donnell, and K. Oleszkiewicz. Noise stability of functions with low influences: invariance and optimality. In FOCS, 2005.

[OSSS05] R. O’Donnell, M. Saks, O. Schramm, and R. Servedio. Every decision tree has an influential variable. In FOCS, 2005.

[OW07] R. O’Donnell and K. Wimmer. Approximation by DNF: examples and counterexamples. In ICALP, 2007.

[Ser06] R. Servedio. Every linear threshold function has a low-weight approximator. In CCC,

[Tal89] M. Talagrand. A conjecture on convolution operators, and operators from l^1 to a Banach space. Isr. J. Math., 68:82–88, 1989.

[Tal94] M. Talagrand. On Russo’s approximate zero-one law. Ann. Probab., 22(3):1576–1587, 1994.

[Tal96] M. Talagrand. How much are increasing sets positively correlated? Combinatorica, 16:243–258, 1996.

[TV07] T. Tao and V. Vu. On the singularity probability of random Bernoulli matrices. J. AMS, 20:603–628, 2007.