Randomized Algorithm

Communication Complexity of Equality

Any deterministic communication protocol computing EQ on two $n$-bit strings costs $n$ bits of communication in the worst case.

Encode the inputs as polynomials $f(x)=\sum_{i=0}^{n-1}a_ix^i$ and $g(x)=\sum_{i=0}^{n-1}b_ix^i$, defined over the finite field $\mathbb{Z}_p=\lbrace 0,1,\dots,p-1\rbrace $ for some suitable prime $p$.

The communication complexity is bounded by $O(\log p)$. Since $\text{Pr}[f(r)=g(r)]\leq\frac{n-1}{p}$ whenever $f\neq g$, choosing $p$ to be a prime in the interval $[n^2,2n^2]$ makes the error probability of a false positive at most $O(\frac{1}{n})$.
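As a sketch (the function names and the bit-list input convention below are my own), the whole protocol can be simulated in one place: Alice picks the prime $p$ and the random point $r$, evaluates $f(r)$, and Bob accepts iff $g(r)$ matches.

```python
import random

def is_prime(m):
    """Trial-division primality test; adequate for m up to a few million."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def eq_protocol(a_bits, b_bits):
    """Simulate one round of the randomized protocol for EQ.
    Alice views her string as f(x) = sum_i a_i x^i over Z_p, picks a prime
    p in [n^2, 2n^2] and a uniform point r in Z_p, and sends (p, r, f(r)),
    which costs O(log p) = O(log n) bits.  Bob accepts iff g(r) = f(r)."""
    n = len(a_bits)
    primes = [m for m in range(n * n, 2 * n * n + 1) if is_prime(m)]
    p = random.choice(primes)
    r = random.randrange(p)
    f_r = sum(a * pow(r, i, p) for i, a in enumerate(a_bits)) % p
    g_r = sum(b * pow(r, i, p) for i, b in enumerate(b_bits)) % p
    # If a != b, f - g is a nonzero polynomial of degree <= n-1,
    # so Pr[f(r) = g(r)] <= (n-1)/p = O(1/n).
    return f_r == g_r
```

Equal inputs are always accepted; unequal inputs are wrongly accepted with probability at most $\frac{n-1}{p}$ per run.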

Schwartz-Zippel Theorem

Let $f\in\mathbb{F}[x_1,x_2,\dots,x_n]$ be a multivariate polynomial of degree $d$ over a field $\mathbb{F}$. If $f \not\equiv0$, then for any finite set $S \subset\mathbb{F}$, and $r_1,r_2,\dots,r_n\in S$ chosen uniformly and independently at random, $\text{Pr}[f(r_1,r_2,\dots,r_n)=0]\leq\frac{d}{|S|}$.
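The bound is easy to check empirically. Below is a small sketch (the helper name `sz_empirical` and the example polynomial are my own choices): sample points uniformly from $S^n$ and compare the observed vanishing rate with $\frac{d}{|S|}$.

```python
import random

def sz_empirical(f, num_vars, degree, S, trials=10000):
    """Empirically check the Schwartz-Zippel bound for a nonzero polynomial f
    (given as a plain Python function): the fraction of uniform points from
    S^num_vars at which f vanishes should not exceed degree/|S| (up to
    sampling noise)."""
    zeros = sum(
        1 for _ in range(trials)
        if f(*[random.choice(S) for _ in range(num_vars)]) == 0
    )
    return zeros / trials

# f(x, y) = xy - 2y = y(x - 2) has total degree 2, so with S = {0, ..., 99}
# the vanishing probability is at most 2/100 = 0.02.
rate = sz_empirical(lambda x, y: x * y - 2 * y, num_vars=2, degree=2,
                    S=list(range(100)))
```

For this $f$ the true vanishing probability is $\frac{1}{100}+\frac{1}{100}-\frac{1}{100^2}\approx 0.02$, essentially meeting the bound.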

Fingerprinting

Fingerprint function $FING(\cdot)$

$FING(\cdot)$ is a function, meaning that if $X=Y$ then $FING(X)=FING(Y)$.

It is much easier to compute and compare the fingerprints.

Ideally, the domain of fingerprints is much smaller than the domain of original objects, so storing and comparing fingerprints are easy. This means the fingerprint function $FING(\cdot)$ cannot be an injection (one-to-one mapping), so it’s possible that different $X$ and $Y$ are mapped to the same fingerprint. We resolve this by making fingerprint function randomized, and for $X\neq Y$, we want the probability $\text{Pr}[FING(X)=FING(Y)]$ to be small.

Communication protocols for Equality by fingerprinting

Fingerprinting by randomized checksum

Now we consider a new fingerprint function: We treat each input string $x\in\lbrace 0,1\rbrace ^n$ as the binary representation of a number, and let $FING(x)=x\text{ mod }p$ for some random prime $p$ chosen from $[k]=\lbrace 0,1,\dots,k-1\rbrace $.

The number of bits to be communicated is obviously $O(\log k)$.

Let $z=x-y$. $\text{Pr}[z\text{ mod }p=0]\leq\frac{\text{the number of prime divisors of }z}{\text{the number of primes in }[k]}$.
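A minimal sketch of this fingerprint (helper names are my own; a tiny trial-division sieve stands in for a real prime sampler):

```python
import random

def random_prime_in(k):
    """Pick a uniformly random prime from [k] = {0, 1, ..., k-1}."""
    primes = [m for m in range(2, k)
              if all(m % d for d in range(2, int(m ** 0.5) + 1))]
    return random.choice(primes)

def fing(x, p):
    """Randomized checksum fingerprint: FING(x) = x mod p."""
    return x % p

k = 10000
p = random_prime_in(k)
x = 0b10110010            # the input string read as a number
assert fing(x, p) == fing(x, p)   # equal inputs always get equal fingerprints
```

For $x\neq y$, a collision happens only when the random prime $p$ divides $z=x-y$, and $z$ has at most $\log_2 z$ prime divisors, so collisions are rare once $k$ is large enough.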

Prime Number Theorem

Let $\pi(k)$ denote the number of primes less than $k$. Then $\pi(k) \sim\frac{k}{\ln k}$ as $k\to\infty$.

Balls into Bins

Birthday Problem

There are $m$ students in the class. Assume that each student's birthday is uniformly and independently distributed over the 365 days of a year. We ask: what is the probability that no two students share a birthday?
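The probability of no shared birthday is the product $\prod_{i=0}^{m-1}(1-\frac{i}{365})$, which is easy to compute directly (function name below is my own):

```python
def birthday_no_collision(m, days=365):
    """Probability that m uniform, independent birthdays are all distinct:
    prod_{i=0}^{m-1} (1 - i/days)."""
    prob = 1.0
    for i in range(m):
        prob *= (days - i) / days
    return prob

# the classic threshold: with 23 students, a shared birthday is more likely than not
assert birthday_no_collision(23) < 0.5 < birthday_no_collision(22)
```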

Coupon Collector

We keep throwing balls one-by-one into $n$ bins (coupons), such that each ball is thrown into a bin uniformly and independently at random. Each ball corresponds to a box of chocolate, and each bin corresponds to a type of coupon. Thus, the number of boxes bought to collect $n$ coupons is just the number of balls thrown until none of the $n$ bins is empty.

Theorem :

Let $X$ be the number of balls thrown uniformly and independently to $n$ bins until no bin is empty. Then $\mathbb{E}[X]=nH(n)$, where $H(n)$ is the $n$-th harmonic number.

Since $H(n)=\ln n+O(1)$, the expected number of boxes bought to collect all $n$ types of coupons is $n\ln n+O(n)$.
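A quick simulation confirms the $nH(n)$ expectation (function names below are my own):

```python
import random

def coupon_collector(n):
    """Throw balls one-by-one into n bins until no bin is empty;
    return the number of balls thrown."""
    seen, throws = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        throws += 1
    return throws

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

n, trials = 50, 2000
avg = sum(coupon_collector(n) for _ in range(trials)) / trials
# avg should be close to n * H(n), which is about 225 for n = 50
```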

Proof(★)

Let $X$ be the number of balls thrown uniformly and independently to $n$ bins until no bin is empty. Then $\text{Pr}[X\geq n\ln n+cn]\lt e^{-c}$ for any $c \gt 0$.

Occupancy Problem

An easy analysis shows that for every bin $i$, the expected load $\mathbb{E}[X_i]$ is equal to the average load $\frac{m}{n}$.

With High Probability (w.h.p.) : if $\text{Pr}[A]\gt 1-\frac{1}{n^c}$, where $n$ is the size of the problem instance and $c\gt 1$ is a constant, then we say $A$ happens with high probability.

The distribution of the maximum load.

Theorem : Suppose that $n$ balls are thrown independently and uniformly at random into $n$ bins. For $1\leq i \leq n$, let $X_i$ be the random variable denoting the number of balls in the $i$-th bin. Then $\text{Pr}[\max_{1\leq i\leq n}X_i \geq \frac{3\ln n}{\ln\ln n}] \lt \frac{1}{n}$.
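The theorem is also easy to observe experimentally (function name below is my own): for $n$ balls in $n$ bins, the maximum load essentially never reaches $\frac{3\ln n}{\ln\ln n}$.

```python
import math
import random
from collections import Counter

def max_load(n):
    """Throw n balls into n bins uniformly at random; return the maximum load."""
    loads = Counter(random.randrange(n) for _ in range(n))
    return max(loads.values())

n = 10000
bound = 3 * math.log(n) / math.log(math.log(n))   # 3 ln n / ln ln n, about 12.4 here
# by the theorem, a single trial exceeds the bound with probability < 1/n
```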

Proof(★)

Formally, it can be proved that for $m=\Omega(n\log n)$, with high probability, the maximum load is within $O(\frac{m}{n})$, which is asymptotically equal to the average load.

Chernoff Bound

Moment generating functions

The more we know about the moments of a random variable $X$, the more information we would have about $X$. There is a so-called moment generating function, which “packs” all the information about the moments of $X$ into one function.

Definition : The moment generating function of a random variable $X$ is defined as $\mathbb{E}[e^{\lambda X}]$, where $\lambda$ is the parameter of the function.

By Taylor’s expansion and the linearity of expectations, $\mathbb{E}[e^{\lambda X}]=\mathbb{E}[\sum_{k=0}^\infty \frac{\lambda^k}{k!}X^k]=\sum_{k=0}^\infty \frac{\lambda^k}{k!}\mathbb{E}[X^k]$.

Idea : apply Markov’s inequality to $e^{\lambda X}$; for the rest, we just estimate the moment generating function $\mathbb{E}[e^{\lambda X}]$. To make the bound as tight as possible, we minimize $\frac{e^{(e^{\lambda}-1)}}{e^{\lambda (1+\delta)}}$ by setting $\lambda = \ln (1 + \delta)$, which can be justified by taking the derivative of $\frac{e^{(e^{\lambda}-1)}}{e^{\lambda (1+\delta)}}$.
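In more detail, since $\exp$ is monotone, minimizing $\frac{e^{(e^{\lambda}-1)}}{e^{\lambda(1+\delta)}}$ over $\lambda\gt 0$ is the same as minimizing its logarithm $(e^{\lambda}-1)-\lambda(1+\delta)$:

$\frac{d}{d\lambda}\left[(e^{\lambda}-1)-\lambda(1+\delta)\right]=e^{\lambda}-(1+\delta)=0\quad\Longrightarrow\quad\lambda=\ln(1+\delta)$

Substituting $\lambda=\ln(1+\delta)$ back yields the standard form $\text{Pr}[X\geq(1+\delta)\mu]\leq\left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}$.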

Useful Forms :

for $0 \lt \delta \leq 1$,

$\text{Pr}[X \geq (1+\delta)\mu] \lt \exp(-\frac{\mu\delta^2}{3})$

$\text{Pr}[X \leq (1-\delta)\mu] \lt \exp(-\frac{\mu\delta^2}{2})$

for $t \geq 2e\mu$,

$\text{Pr}[X \geq t] \leq 2^{-t}$

Hoeffding

Let $X$ be a real-valued random variable with expected value $\mathbb{E}[X]=0$, and such that $a \leq X \leq b$ almost surely. Then for all $\lambda \in \mathbb{R}$, $\mathbb{E}[e^{\lambda X}]\leq \exp(\frac{\lambda^2(b-a)^2}{8})$.

For any set cover instance $S_1, S_2, \cdots, S_n \subseteq U$ with optimal set cover of size $\text{OPT}$, the GreedyCover returns a set cover of size $C \leq H_n \cdot \text{OPT}$, where $n=|U|$ is the size of the universe and $H_n \simeq\ln n$ represents the $n$-th Harmonic Number.
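A minimal sketch of GreedyCover (function and variable names are my own): at every step, pick the set covering the most still-uncovered elements.

```python
def greedy_cover(universe, sets):
    """GreedyCover: repeatedly pick the set that covers the most
    still-uncovered elements, until the whole universe is covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(uncovered & sets[i]))
        if not uncovered & sets[best]:
            raise ValueError("the sets do not cover the universe")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

U = set(range(1, 8))
S = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6, 7}, {1, 6, 7}]
cover = greedy_cover(U, S)
assert set().union(*(S[i] for i in cover)) == U
```

On this instance greedy first takes $S_1$ (4 new elements), then $S_3$ (the remaining 3), returning a cover of size 2.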

Primal-Dual Algorithm

Definition :

Theorem ( Weak Duality ) :

For every matching $M$ and every vertex cover $C$ for $S_1, S_2, \cdots, S_m \subseteq U$, it holds that $|M| \leq |C|$.

Proof

Corollary :

Let $S_1, S_2, \cdots, S_m \subseteq U$ be an instance for set cover, and $\text{OPT}$ the size of the optimal set cover.

Graham’s List algorithm

for $j=1, 2, \cdots, n$, assign job $j$ to the machine that currently has the smallest load.

It is well known that the List algorithm has approximation ratio $(2 - \frac{1}{m})$.
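The List algorithm is a few lines with a min-heap of machine loads (the function name and the tight-instance demo below are my own):

```python
import heapq

def list_schedule(p, m):
    """Graham's List algorithm: scan the jobs in the given order and assign
    each to the machine with the currently smallest load; return the makespan."""
    loads = [0] * m
    heapq.heapify(loads)
    for pj in p:
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + pj)
    return max(loads)

# a classic tight instance: m(m-1) unit jobs followed by one job of length m.
# List yields makespan 2m - 1 while OPT = m, matching the (2 - 1/m) ratio.
m = 4
jobs = [1] * (m * (m - 1)) + [m]
assert list_schedule(jobs, m) == 2 * m - 1
```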

Theorem :

For every instance of scheduling $n$ jobs with processing times $p_1, p_2, \cdots, p_n$ on $m$ parallel identical machines, the List algorithm finds a schedule with makespan $C_{\text{max}} \leq (2 - \frac{1}{m}) \cdot \text{OPT}$, where $\text{OPT}$ is the makespan of optimal schedules.

Proof(★)

Local Search

Another natural idea for solving optimization problems is the local search. Given an instance of optimization problem, the principle of local search is as follows:

We start with a feasible solution.

At each step, we try to improve the solution by modifying it locally (changing the assignment of a constant number of variables), until nothing can be improved further (thus a local optimum is reached).
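As an illustration of this principle (the choice of max-cut as the example and all names below are my own), here is local search for unweighted max-cut, where a "local move" flips a single vertex to the other side:

```python
def local_search_max_cut(adj):
    """Local search for (unweighted) max-cut: start from an arbitrary
    2-coloring and repeatedly flip any single vertex whose flip strictly
    increases the number of cut edges, stopping at a local optimum where
    no single flip helps."""
    n = len(adj)
    side = [0] * n                          # the initial feasible solution
    improved = True
    while improved:
        improved = False
        for v in range(n):
            cut_now = sum(1 for u in adj[v] if side[u] != side[v])
            cut_after = len(adj[v]) - cut_now   # v's cut edges after flipping v
            if cut_after > cut_now:
                side[v] ^= 1
                improved = True
    return side

# 4-cycle 0-1-2-3-0: the local optimum here cuts all 4 edges
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
side = local_search_max_cut(adj)
cut = sum(1 for v in range(len(adj)) for u in adj[v] if side[u] != side[v]) // 2
```

Every flip increases the cut value, which is bounded by the number of edges, so the loop terminates at a local optimum.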

Online Algorithms and Competitive Ratio

Assignments

Solution 1

The number of distinct min-cuts in a connected acyclic graph is $O(n)$.

The number of distinct min-cuts in a cycle graph consisting of $n$ vertices and $n$ edges is $\frac{n(n-1)}{2}=O(n^2)$.

These examples show that $O(\log n)$ cannot be an upper bound on the number of distinct min-cuts in an arbitrary multigraph $G$ with $n$ vertices.

We know that $p(n)=\text{Pr}[FastCut(G)\text{ returns a min-cut of } G]=\Omega(\frac{1}{\log n})$, and clearly, for any fixed min-cut $C$, $\text{Pr}[C\text{ is returned by }FastCut(G)]\leq \text{Pr}[FastCut(G)\text{ returns a min-cut of } G]$.

So all we know about $p_C$ is that $p_C \leq p_{\text{correct}}$, and for $p_{\text{correct}}$ the analysis gives only a lower bound $\Omega(\frac{1}{\log n})$. And since $p_{\text{correct}}$ is a well-defined probability, by the unitarity of probability it must hold that $p_{\text{correct}}\leq 1$.

Therefore, the only facts we know are

$1 \geq p_{\text{correct}}=\sum_{C\in\mathcal{C}}p_C$

$\forall C\in\mathcal{C}$, $p_C \leq p_{\text{correct}}$, where for $p_{\text{correct}}$ we only have the lower bound $\Omega(\frac{1}{\log n})$

Therefore we cannot derive an upper bound on the number of distinct min-cuts in an arbitrary multigraph $G$ with $n$ vertices from the probability analysis of $FastCut$: $\Omega(\frac{1}{\log n})$ is only a lower bound on the overall success probability, and bounding the number of distinct min-cuts from above would require a lower bound on each individual $p_C$, which this analysis does not provide.

Actually, we already know that the number of distinct min-cuts in any multigraph $G$ with $n$ vertices is at most $\frac{n(n-1)}{2}$, and the cycle graph above attains this bound, so no tighter bound is possible.

In conclusion, my answer to this problem is “no”.

Solution 2

We can use the Tree-Hash algorithm to decide whether two rooted trees are isomorphic.

Consider the polynomial $h(x)=\sum_{i=0}^{n-1}a_i x^i$ over $\mathbb{Z}_p=\lbrace 0, 1,\dots, p-1 \rbrace$, where $n$ is the size of the multiset $S$ and $a_i$ is the $i$-th smallest element of $S$.

The Tree-Hash algorithm is as follows :

For a rooted tree, we will calculate its hash value which can be used to judge whether two trees are isomorphic.

We will use depth-first search to traverse the tree and compute every node’s hash value bottom-up. Denote node $u$’s hash value by $V(u)$.

If the current node is a leaf, its hash value is a random number $r\in [M]$; note that the same value $r$ must be used for all leaf nodes.

If the current node is not a leaf, i.e., it has a nonempty set of children $S\neq\emptyset$, we first form the multiset $T$ of all its children’s hash values, i.e., $T=\cup_{s\in S}\lbrace V(s)\rbrace $.

Then the hash value of the current node $a$ is the result of applying $h$ to the multiset $T$, i.e., $V(a)=h(T)$.

After all nodes’ hash values have been computed, we take the hash value of the tree’s root $r$ as the hash value of the tree, i.e., $V(r)$.

Assume $r_1$ and $r_2$ are the roots of rooted trees $T_1$ and $T_2$. We then compare the two hash values $V(r_1)$ and $V(r_2)$.
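A sketch of the algorithm described above (helper names are my own, and I read the description as using the same random value $r$ both as the leaf hash and as the evaluation point of $h$):

```python
import random

def tree_hash(children, root, p, r):
    """Tree-Hash sketch.  Every leaf gets the shared random value r; an
    internal node whose children have the multiset of hash values
    T = {a_0 <= a_1 <= ...} gets h(T) = sum_i a_i * r^i mod p, i.e. the
    polynomial from the solution evaluated at r."""
    def h(multiset):
        return sum(a * pow(r, i, p) for i, a in enumerate(sorted(multiset))) % p

    def value(u):                       # bottom-up DFS
        if not children[u]:
            return r
        return h([value(s) for s in children[u]])

    return value(root)

p = 1000003                 # a prime; the analysis takes p in [n^2, 2n^2]
r = random.randrange(1, p)
# two isomorphic trees: a root with two leaves, children listed in different orders
t1 = {0: [1, 2], 1: [], 2: []}
t2 = {0: [2, 1], 1: [], 2: []}
assert tree_hash(t1, 0, p, r) == tree_hash(t2, 0, p, r)
```

Sorting $T$ inside $h$ makes the hash independent of the order in which children are listed, which is exactly what isomorphism requires.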

Because we must store every node’s hash value, the space cost of Tree-Hash is $O(n)$.

Time Cost :

If we need to sort the members of multiset $T$, the time cost is $O(n\log n)$.

If we have some methods to calculate $h(T)$ in linear time of $T$’s size, the time cost is $O(n)$.

Error Analysis :

Obviously, if the result is “no”, the two trees cannot be isomorphic. But it is possible that two trees are non-isomorphic yet Tree-Hash returns “yes”. In other words, Tree-Hash has one-sided error (false positives).

The degree of $h$ is at most $n-1$, and $r$ is chosen among $p$ distinct values, so $\text{Pr}[V(r_1)=V(r_2)]\leq \frac{n-1}{p}$.

By choosing $p$ as a prime in the interval $[n^2, 2n^2]$, the one-sided error of false positives is at most $O(\frac{1}{n})$. We can also use different polynomials to lower the error probability further.

Solution 3

Task 1

Denote by $F_{C'}$ the Bloom filter created for $A\cap B$.

We write $F_A(i)$ for the $i$-th bit of the Bloom filter $F_A$, and $A_j$ for the $j$-th element of the set $A$.