Advanced Analysis of Algorithms

theory overview

NP & P → decision problem on {0, 1}* (language)
⇒ search problem (harder, but may reduce to decision problem)
⇒ optimization

decision problem → board game

PSPACE: where space usage is polynomial

  • P time ⊆ PSPACE (whether they are equal is open)
  • PPAD

#P: count the # of solutions to an NP problem

  • each NP problem has a corresponding #P problem
  • #P-complete problems are in PSPACE

linear programming: proved to be in P

learning problem

  • binary classification
  • learnable: train on samples from an unknown distribution; prediction accuracy increases w/ more data

binary search (go to middle)
⇒ quick selection (broader notion of middle; randomized):

  • input: an unsorted list of numbers
  • want: the k-th smallest
  • algorithm:
    1. pick a pivot uniformly at random
    2. split the list by the pivot into smaller/larger parts, throw away the part that cannot contain the answer, and shrink

power of randomization: expected linear time, but only w.r.t. the algorithm's own randomness:
half of the time we pick an element ranked in the middle half, so we can throw away at least a quarter of the list
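
A minimal randomized quickselect in Python (illustrative code, not from the lecture; names are my own):

```python
import random

def quickselect(xs, k):
    """Return the k-th smallest element of xs (1-indexed), expected O(n) time."""
    assert 1 <= k <= len(xs)
    pivot = random.choice(xs)                 # broader notion of "middle": a random pivot
    lo = [x for x in xs if x < pivot]         # strictly smaller than the pivot
    eq = [x for x in xs if x == pivot]
    hi = [x for x in xs if x > pivot]
    if k <= len(lo):                          # answer lies among the smaller elements
        return quickselect(lo, k)
    if k <= len(lo) + len(eq):                # the pivot itself is the answer
        return pivot
    return quickselect(hi, k - len(lo) - len(eq))   # throw away the impossible part

# e.g. quickselect([5, 2, 9, 1], 2) == 2
```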

⇒ distance graph

  • binary search is a path graph: each number is a node w/ an edge to the next number

    • local info, e.g., which neighbor is closer to the target
  • generalized undirected graph “binary search” for a target vertex t

      • a queried vertex is “correct” if it lies on a shortest path to the target t
    • condition: each query response gives the set of vertices still consistent w/ lying on a shortest path to t
    • iteration: a query either finds t or returns a neighbor closer to t (a hint)
    • theorem: ∃ algorithm to find t in O(log n) queries
    • want: shrink the possible answer set S, the set of nodes still consistent w/ every hint so far
    • ⇒ “median”: query the vertex w/ the lowest potential function Φ(q) = Σ_{v∈S} d(q, v)
    • update S w/ hint u: keep only the vertices of S whose shortest path from the query goes through u
  • claim: each query at the median removes at least half of S

  • proof: if more than half of S were consistent w/ some hint u, then Φ(u) < Φ(median), contradicting the choice of the median

interactive learning

A General Framework for Robust Interactive Learning, Ehsan Emamjomeh-Zadeh, David Kempe

given a hypothesis space (e.g., a hypercube) as the search space, find the target hypothesis consistent w/ all feedback

  • VC dimension
    • n hyperplanes in d-dimensional space can split it into O(n^d) cells

greedy algorithm

  • can deal with NP-hard problem
  • optimality:
    • argument of staying ahead
    • exchange argument
  • proving optimality
    • optimum must exist for finite problem
    • compare to imaginary optimum
    • focus on simple local consistency that eliminate bad possibility

Huffman Codes

  • prefix code: no codeword is a prefix of another
  • ⇒ binary tree w/ codewords at leaves; a leaf's ancestors are not usable as codewords
    • optimum assignment is trivial when the tree is given
    • trees w/ the same number of leaves at each depth are equivalent
    • has to be proper: each node is either a leaf or a parent of 2
    • substructure optimality: the bottom 2 sibling leaves must map to the 2 least frequent letters
  • induction: merge the 2 least frequent letters, treat them as one letter, and find the mapping for the rest (see the sketch below)
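
A short Python sketch of this merge-two-least-frequent construction (illustrative; assumes a `{letter: frequency}` dict as input):

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build a prefix code by repeatedly merging the two least frequent 'letters'."""
    tiebreak = count()                       # avoids comparing tree nodes on frequency ties
    heap = [(f, next(tiebreak), letter) for letter, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                       # single-letter corner case
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse into both children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            code[node] = prefix              # leaf: assign the accumulated codeword
    walk(heap[0][2], "")
    return code

# e.g. huffman_code({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
```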

minimum spanning tree (MST)

  • cut: split the vertex set into S and V ∖ S

  • affinity: inverse distance, closeness

Kruskal’s algorithm

traditional Kruskal

  1. sort edges ascending by weight
  2. starting from no edges, add edges from the small end, skipping any edge that would create a cycle

reversed Kruskal

  1. sort edges ascending by weight
  2. starting from G, remove edges from the big end, skipping any removal that would disconnect the graph
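
A small Python sketch of traditional Kruskal with union-find (illustrative code):

```python
def kruskal_mst(n, edges):
    """Traditional Kruskal: edges = [(w, u, v)] on nodes 0..n-1.
    Sort ascending, add light edges that do not create a cycle (union-find)."""
    parent = list(range(n))
    def find(x):                              # path-compressed root lookup
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mst = []
    for w, u, v in sorted(edges):             # 1. sort ascending by weight
        ru, rv = find(u), find(v)
        if ru != rv:                          # 2. skip edges that would close a cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst
```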

Prim’s algorithm

  1. choose an arbitrary node as the “home town” s.
  2. repeat: find the closest node reachable w/ an edge from the current tree and add it.
  • why it works: cut property: the shortest edge crossing any cut is in the MST
    • reverse cut property: the longest edge in any cycle is not in the MST
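
A heap-based Prim sketch in Python (illustrative; `adj` is an adjacency list of `(weight, neighbor)` pairs):

```python
import heapq

def prim_mst(adj, home=0):
    """Prim from a "home town": adj[u] = [(w, v), ...] for an undirected graph
    on nodes 0..len(adj)-1. Repeatedly add the lightest edge leaving the tree."""
    visited = {home}
    frontier = [(w, v, home) for w, v in adj[home]]   # candidate edges out of the tree
    heapq.heapify(frontier)
    mst = []
    while frontier and len(visited) < len(adj):
        w, v, u = heapq.heappop(frontier)
        if v in visited:
            continue                                  # edge became internal; skip it
        visited.add(v)
        mst.append((u, v, w))                         # cut property: this edge is in the MST
        for w2, nxt in adj[v]:
            if nxt not in visited:
                heapq.heappush(frontier, (w2, nxt, v))
    return mst
```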

clustering

given points w/ a metric d and a number of clusters k, want a k-clustering s.t.

  1. maximize the shortest inter-cluster edge (the spacing)
  • idea: for a tree, just need to cut the k − 1 longest edges
  • ⇒ algorithm: for a graph, cut the k − 1 longest edges in the MST for the optimal clustering
  • proof: denote the greedy solution's clusters C_1, …, C_k and the optimum's clusters C*_1, …, C*_k
    1. if they differ, ∃ nodes u, v in the same greedy cluster but in different optimum clusters
    2. in the MST, on the path from u to v (inside one greedy cluster), ∃ an edge e that crosses an optimum cluster boundary
    3. by reverse Kruskal, the longest MST edges that do not disconnect the graph are gone:
      by the greedy algorithm, e is no heavier than the (k − 1)-th longest edge in the MST, the greedy spacing
    4. the optimum solution has a shortest inter-cluster edge ≤ w(e), the same or smaller than the greedy spacing ⇒ contradiction

approximation algorithm

set cover

  • given: input ground set U of n elements, subsets S_1, …, S_m ⊆ U w/ weights w_1, …, w_m

  • want: I ⊆ {1, …, m} s.t. ∪_{i∈I} S_i = U and Σ_{i∈I} w_i is minimized

  • idea: at each step, minimize the average cost per newly covered element

  • algorithm:

    1. while uncovered elements U_t remain, pick i_t minimizing w_i / |S_i ∩ U_t|, set U_{t+1} = U_t ∖ S_{i_t}
  • not optimal when many large also cover small

  • claim: greedy is within H(d) of optimum, where d = max_i |S_i|

    • doing consistently better than ln n times optimum is NP-hard

    • greedy cost

      c_G(i_t) = \frac{w_{i_t}}{|S_{i_t} \cap U_t|} \le H(|S_{j_t}|) \cdot w_{j_t}
      
      • H is the harmonic series: H(d) = 1 + 1/2 + ⋯ + 1/d ≈ ln d
      • when the ℓ-th (starting from 0) element in S_{j_t} is covered, at most ℓ elements of S_{j_t} are already covered, so the per-element cost is at most w_{j_t} / (|S_{j_t}| − ℓ); summing over ℓ gives the bound
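
A Python sketch of the weighted greedy rule above (illustrative code; `sets` is a list of Python sets):

```python
def greedy_set_cover(universe, sets, weights):
    """Weighted greedy set cover: repeatedly pick the set with the smallest
    cost per newly covered element (within H(max |S_i|) of optimum)."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick i minimizing w_i / |S_i ∩ U_t| among sets that cover something new
        best = min(
            (i for i in range(len(sets)) if sets[i] & uncovered),
            key=lambda i: weights[i] / len(sets[i] & uncovered),
        )
        chosen.append(best)
        uncovered -= sets[best]               # U_{t+1} = U_t \ S_{i_t}
    return chosen

# e.g. greedy_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {3, 4}, {4, 5}], [1.0, 1.0, 1.0])
```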

set function

  • salary for group in society
  • hypergraph: indicator set function determine if each subset is hyperedge

property:

  1. grounded: f(∅) = 0
  2. monotone: f(S) ≤ f(T) whenever S ⊆ T
  3. submodular

submodular function

  • diminishing returns: each additional chocolate makes you less happy than the previous one
  • discrete derivative: f_S(v) = f(S ∪ {v}) − f(S)
    • how much value v adds to S
  • complementarity:
  • submodular + submodular ⇒ submodular
  • “or” is fundamentally submodular

problem

  • input: submodular f, budget k

  • output: S w/ |S| ≤ k s.t. f(S) is maximized

  • oracle model: can get f(S) for any queried S

  • example: max cover: choose k subsets from a set system to maximize the number of covered elements

  • greedy algorithm (see the sketch below): at time t, add the element v that maximizes the marginal gain f(S_{t−1} ∪ {v}) − f(S_{t−1})

theorem: for monotone & submodular f, greedy achieves f(S_greedy) ≥ (1 − 1/e) · OPT

proof idea: by monotonicity and submodularity, each greedy step gains at least (OPT − f(S_t)) / k, so the remaining gap shrinks by a (1 − 1/k) factor per step, and (1 − 1/k)^k ≤ 1/e
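
A Python sketch of the oracle-model greedy, using max cover as the example oracle (illustrative code):

```python
def greedy_submodular_max(ground, f, k):
    """Greedy maximization of a monotone submodular f under |S| <= k,
    using only oracle calls f(S); achieves a (1 - 1/e) approximation."""
    S = set()
    for _ in range(k):
        # add the element with the largest marginal gain f(S ∪ {v}) - f(S)
        gains = {v: f(S | {v}) - f(S) for v in ground - S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:                  # nothing left to gain
            break
        S.add(best)
    return S

# max cover as the oracle: f(S) = size of the union of the chosen subsets
subsets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}}
cover = lambda S: len(set().union(*(subsets[i] for i in S))) if S else 0
# greedy_submodular_max(set(subsets), cover, k=2) -> {'a', 'c'}
```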

alternative definition for submodular function (equivalent): f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T) for all S, T

reachability

  • submodular: f(S) = number of nodes reachable from a seed set S is monotone & submodular

network influence

input

  • dynamic & complex compared to traditional static graph
  • network influence maximization: often submodular

independent cascade (IC)

each node has a probability of influencing each of its neighbors (independently per edge)

  • stochastic network influence; e.g., pandemic
  • influence spread: the expected number of nodes eventually influenced by a seed set
    • generally #P-complete
  • roughly stochastic “or” process; distribution of reachability model

threshold model

each node has a threshold; it becomes influenced once enough of its neighbors are influenced

  • deterministic / stochastic (w/ random thresholds), e.g., idea spreading

iterative algorithm

polynomial local search (PLS)

  • e.g., simplex algorithm, bubble sort
  • polynomial number of viable option at each step
  • always improve because know potential function
  • will stop because the improving moves define a directed acyclic graph (DAG)

Lloyd’s algorithm (K-means clustering)

input: points x_1, …, x_n, number of clusters k
want: cluster the points into k groups

  • representative (center) of each cluster
    • the mean of a cluster minimizes the variance (sum of squared distances) within it
  • algorithm (see the sketch below):
    1. randomly initialize the clustering (or the k centers)
    2. calculate each cluster's center as its mean
    3. regroup each point by its nearest center; repeat 2–3
  • Voronoi diagram: zip code
  • exponential time to converge, but polynomial time to get close
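
A compact Python sketch of Lloyd's iteration (illustrative; points are tuples, `rounds` caps the number of iterations):

```python
import random

def lloyd(points, k, rounds=100):
    """Lloyd's algorithm: alternate between assigning points to their nearest
    center and recomputing each center as the mean of its cluster."""
    dist2 = lambda p, c: sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    centers = random.sample(points, k)                     # 1. random initialization
    clusters = []
    for _ in range(rounds):
        clusters = [[] for _ in range(k)]
        for p in points:                                   # 3. regroup by nearest center
            clusters[min(range(k), key=lambda i: dist2(p, centers[i]))].append(p)
        new_centers = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)                # 2. centers = cluster means
        ]
        if new_centers == centers:                         # reached a local optimum
            break
        centers = new_centers
    return centers, clusters
```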

divide and conquer in geometric space

1-dimensional space

  • nice because: can sort, minimal neighborhood, point = hyperplane
  • mean: not statistically robust; median: robust

approximate median

ε-median: an element whose rank is within (1/2 ± ε) · n

algorithm, using a tournament tree of height h over triples (see the sketch after the proof):

  1. uniformly sample 3^h points
  2. find the median of every 3 points
  3. find the median of every 3 medians on the previous level, recursively, until one point remains

proof:

  • define p_h = probability that the median algorithm w/ height h yields a point below a fixed quantile p
  • p_0 = p, because only 1 sampled point is involved
  • at least 2 points among 3 need to be below the quantile for their median to be: p_{h+1} = 3p_h² − 2p_h³

  • by induction, if p < 1/2 then p_h decreases toward 0, so the output is an ε-median w/ high probability
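
A Python sketch of the triple-tournament sampler described above (illustrative code):

```python
import random

def approx_median(points, height):
    """Tournament of triples: sample 3**height points, then repeatedly
    replace every 3 values by their median until one value remains."""
    level = [random.choice(points) for _ in range(3 ** height)]   # 1. uniform sample
    while len(level) > 1:
        # 2-3. take the median of every consecutive triple, one level at a time
        level = [sorted(level[i:i + 3])[1] for i in range(0, len(level), 3)]
    return level[0]
```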

median in 2-dimensional space

  • β-centerpoint: for any projection (direction), it is at least a β-approximate median
  • theorem: in d dimensions, a 1/(d + 1)-centerpoint always exists
    • intuition: need d + 1 points to trap 1 point
  • VC dimension theory: a small random sample suffices to estimate it well
  • ε-good sample S: for every half space h, the fraction of S inside h is within ε of the true fraction

nearest neighbor

  • ball: the smallest ball around a point that contains its k nearest neighbors defines “nearest”
    • ball shape depends on the metric (not necessarily round)
    • locality: a point is not covered by many balls
  • k-NN graph (k-NNG): edge from p to q if q is in p's ball
  • point location, e.g., cell phone connect to tower
  • nearest-pair problem: find the nearest pair of points among a set of points
    • algorithm for 2D (see the sketch below):
      1. divide by the median on one axis into two halves, find the nearest pair in each half
      2. take δ = the minimum of the two distances
      3. find the nearest pair within the width-2δ strip around the boundary; only need to check a constant number of points per δ-hypercube
      4. take the minimum, recurse
    • Bentley: O(n log n) in d dimensions
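
A 2D Python sketch of steps 1-4 (illustrative; re-sorting the strip at each level gives O(n log² n) rather than the O(n log n) achievable with a merge step):

```python
def closest_pair(points):
    """Divide-and-conquer closest pair in 2D:
    split by the median x, solve both halves, then check the 2*delta strip."""
    pts = sorted(points)                                   # sort once by x
    def solve(p):
        if len(p) <= 3:                                    # brute force tiny cases
            return min((dist(a, b), a, b)
                       for i, a in enumerate(p) for b in p[i + 1:])
        mid = len(p) // 2
        x_mid = p[mid][0]
        best = min(solve(p[:mid]), solve(p[mid:]))         # best of the two halves
        strip = sorted((q for q in p if abs(q[0] - x_mid) < best[0]),
                       key=lambda q: q[1])                 # boundary strip, ordered by y
        for i, a in enumerate(strip):                      # O(1) candidates per point
            for b in strip[i + 1:i + 8]:
                if b[1] - a[1] >= best[0]:
                    break
                best = min(best, (dist(a, b), a, b))
        return best
    dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    return solve(pts)

# e.g. closest_pair([(0, 0), (5, 4), (1, 1), (9, 9)]) -> (1.414..., (0, 0), (1, 1))
```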

construct k-NNG: divide and conquer

  1. split into two halves by a median hyperplane (could also split arbitrarily)
  2. build the k-NNG for each half
    • point location problem: a ball is too big iff a point from the other half lies in the ball
    • need a binary search tree for the queries
    • max overlapping balls
  3. shrink every ball in one half that contains a point from the other half
  4. recurse

disk packing

non-overlapping 2D ball set

  • problem: given point, find ball containing it
  • planar graph: node for each ball, edge for intersection
  • Koebe embedding: the reverse is also true: every planar graph is the intersection graph of some disk packing
  • e.g., Prof. Teng saw ~100 lakes in Minnesota (which has ~10,000) when driving across

setup: if we can dig round (non-overlapping) lakes on the sphere, then charge $1 for each lake on a tour through a random great circle

  • maximum expected charge: O(√n) for n lakes
  • in d dimensions: O(n^{1 − 1/d})

proof:

  1. assume the globe has radius 1

  2. each lake of radius r_i defines a belt of width 2r_i (the poles of the great circles passing through it) ⇒ expectation of charge: Σ_i P(hit lake i) ≈ Σ_i r_i

  3. lake area cannot exceed the globe's area: Σ_i π r_i² ≤ 4π

  4. to maximize Σ_i r_i under this budget, we clearly want all r_i equal (by convexity), giving expected charge O(√n)

kissing number

max number of non-overlapping ball to touch one ball

  • 2 in 1D, 6 in 2D, 12 in 3D

3-dimensional binary search via disk

  • a great circle divides the disks into two sides (plus the disks it crosses)

  • can have a conformal map s.t. the median (centerpoint) of all disk centers is the center of the globe

    • dilate points by projecting the globe to a tangent plane, scaling up on the plane, then projecting back
  • can build binary search tree by successive random split through median

-dimensional convex geometry

Helly’s theorem (projection lemma)

in d dimensions, w/ a (possibly ∞) family of convex sets: if every d + 1 of the sets intersect, then all of the sets intersect

  • in 1D, 3 interval intersect pairwise ⇒ ∃1 point in all 3 interval
  • ⇒ a 1/(d + 1)-centerpoint exists

Radon theorem: the “median” in a convex-geometry definition

in d dimensions,
any d + 2 points x_1, …, x_{d+2}
can be divided into 2 sets s.t. the convex hulls of the 2 sets intersect

  • i.e., ∃ a partition (A, B) and a point z s.t. z ∈ conv(A) ∩ conv(B)

proof:

the system Σ_i λ_i x_i = 0, Σ_i λ_i = 0 has d + 1 linear equations in d + 2 unknowns ⇒ it has a non-trivial solution, i.e., some λ_i > 0 and some λ_i < 0

z = Σ_{λ_i>0} (λ_i / Λ) x_i = Σ_{λ_i<0} (−λ_i / Λ) x_i, where Λ = Σ_{λ_i>0} λ_i, is both in the convex hull of {x_i : λ_i > 0} and of {x_i : λ_i < 0}

  • intersection point as median
  • can get an approximate median (centerpoint) WHP in near-linear time via a tree of Radon points

Lipton-Tarjan separator theorem for planar graph

Alan George nested dissection

  • for solving linear systems, it beats plain Gaussian elimination
    • in Gaussian elimination, removing 1 node connects all its neighbors (fill-in)
  • remove separator nodes to separate the graph into 2 (or 3) pieces
    • eliminate the thin separator last, so eliminating each node incurs small fill-in cost

fast Fourier transform (FFT)

integer multiplication

  • classic algorithm: O(n²), not scalable

    • output size is only O(n) digits
  • can divide each number into a high and a low half: x = x_H · 2^{n/2} + x_L

  • can save one smaller multiplication (see the sketch below): x·y = x_H y_H · 2^n + ((x_H + x_L)(y_H + y_L) − x_H y_H − x_L y_L) · 2^{n/2} + x_L y_L ⇒ 3 recursive multiplications, O(n^{log₂ 3})

  • FFT makes multiplication O(n log n), nearly as easy as addition
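
A Python sketch of the three-multiplication (Karatsuba-style) split (illustrative code for non-negative integers):

```python
def karatsuba(x, y):
    """Split each number into high/low halves and save one of the four
    sub-multiplications, giving O(n^log2(3)) instead of O(n^2)."""
    if x < 10 or y < 10:                      # base case: single digits
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    xh, xl = divmod(x, base)                  # x = xh * base + xl
    yh, yl = divmod(y, base)
    hh = karatsuba(xh, yh)
    ll = karatsuba(xl, yl)
    mid = karatsuba(xh + xl, yh + yl) - hh - ll   # equals xh*yl + xl*yh
    return hh * base * base + mid * base + ll

# karatsuba(1234, 5678) == 1234 * 5678
```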

polynomial multiplication

  • numbers are a special case of polynomials (evaluated at the base)
  • polynomials are better behaved than numbers because they are continuous (no carrying)
  • convolution: c_k = Σ_{i+j=k} a_i b_j is the sequence of anti-diagonal sums in the product table
  • a degree-(n − 1) polynomial can be recovered from any n distinct data points
    • realizable data

    • recover the coefficients from the point values by solving a (Vandermonde) linear system

    • easy to multiply p and q in the evaluation representation: multiply the point values pointwise

    • need deg(p) + deg(q) + 1 data points to recover the product

    • active learning: can pick nice data point as wished

  • roots of unity in the complex plane: ω_n = e^{2πi/n}
    • divide the unit circle evenly ⇒ all roots are of the form ω_n^k

    • ⇒ sample p at ω_n^0, ω_n^1, …, ω_n^{n−1}

    • can save computation because ω_n^{k+n/2} = −ω_n^k, etc.

      • divide recursively: p(x) = p_even(x²) + x · p_odd(x²)

      • ⇒ calculate all n evaluations in O(n log n) (see the sketch below)
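
A recursive radix-2 FFT sketch in Python with pointwise-product polynomial multiplication on top (illustrative; assumes lengths padded to a power of 2 and integer coefficients):

```python
import cmath

def fft(coeffs):
    """Evaluate the polynomial at the n-th roots of unity in O(n log n)
    by splitting into even/odd parts (n must be a power of 2)."""
    n = len(coeffs)
    if n == 1:
        return list(coeffs)
    even = fft(coeffs[0::2])                  # p_even evaluated at squares of roots
    odd = fft(coeffs[1::2])                   # p_odd evaluated at squares of roots
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)  # omega^k
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]    # omega^(k + n/2) = -omega^k
    return out

def poly_multiply(p, q):
    """Multiply polynomials: evaluate at roots of unity, multiply pointwise, invert."""
    n = 1
    while n < len(p) + len(q) - 1:            # need deg(p) + deg(q) + 1 sample points
        n *= 2
    fp = fft(p + [0] * (n - len(p)))
    fq = fft(q + [0] * (n - len(q)))
    prod = [a * b for a, b in zip(fp, fq)]    # pointwise multiplication
    # inverse FFT via the conjugate trick: ifft(x) = conj(fft(conj(x))) / n
    inv = [v.conjugate() / n for v in fft([v.conjugate() for v in prod])]
    return [round(v.real) for v in inv[:len(p) + len(q) - 1]]

# poly_multiply([1, 2], [3, 4]) -> [3, 10, 8]   i.e. (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
```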

linear programming (LP)

  • maximization standard form: maximize cᵀx subject to Ax ≤ b, x ≥ 0 (≤ constraints, non-negative variables)
    • e.g., use limited material to make product for profit
    • minimization standard form: opposite
  • polyhedron
    • polytope (bounded)
    • vertex: ∄ a non-zero vector that keeps the point inside the polyhedron when both added and subtracted
    • face
  • optimal solution: convex (a face), tight (constraint reach equality)
    • feasible solution
    • DNE when unbounded/ infeasible
    • fundamental theorem of linear programming: the optimal solution set, if it exists, contains a vertex
      • ⇒ if an optimal solution exists, one exists w/ at most m non-zero variables (m = number of constraints)
    • complementary slackness: at optimum, only a few constraints are active (tight)
  • dual LP
    • variable become constraint, each row of constraint become 1 dual variable
    • persuade not to make product by offering to buy raw material at higher price
    • essentially finding upper bound for value
    • finding the force on a ball in force field in a cage when it is stable against corner, then try to find path to origin w/ least work
    • involution: the dual of the dual is the primal
    • lenient primal form ⇒ tight dual form, vice versa
  • weak duality theorem: primal value ≤ dual value (max primal vs. min dual)
    • looking for the optimum from opposite directions
    • if one is unbounded, the other is infeasible
    • both optimal if the two values are equal
  • strong duality theorem: if either feasible and bounded, then the other is feasible and bounded and optimal value equal
    • hold for LP, not for general convex optimization
  • method
    • simplex method (George Dantzig): polynomial in practice
    • duality (John von Neumann)
    • engineering method (polynomial): ellipsoid method, interior point method
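
A toy primal/dual pair solved with SciPy's `linprog` (illustrative numbers; assumes SciPy is available), showing the maximization standard form and strong duality:

```python
import numpy as np
from scipy.optimize import linprog

# Primal: maximize c^T x  s.t.  A x <= b, x >= 0
c = np.array([3.0, 5.0])          # profit per product (made-up numbers)
A = np.array([[1.0, 2.0],         # raw-material usage per product
              [3.0, 1.0]])
b = np.array([14.0, 18.0])        # material available

# linprog minimizes, so negate the objective for maximization
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

# Dual: minimize b^T y  s.t.  A^T y >= c, y >= 0
# (rewrite A^T y >= c as -A^T y <= -c for linprog's <= convention)
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

print("primal optimum:", -primal.fun)   # equals the dual optimum (strong duality)
print("dual optimum:  ", dual.fun)
```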

network flow

  • designed to attack USSR supply chain
  • Ford-Fulkerson max flow: send most from source to sink
    • simple greedy is not optimal: repeatedly find the path w/ the max bottleneck (min-edge) capacity
    • optimal: include the reverse of chosen flow in the residual graph and augment along residual paths
      • proof by duality
  • min cut: sum of weight of edge that separate source and sink
    • max flow ≤ min cut
    • at max flow, ∃ a cut whose forward edges are all saturated
    • and whose backward edges all carry zero flow
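
A Ford-Fulkerson sketch in Python using BFS augmenting paths in the residual graph (i.e., Edmonds-Karp; illustrative, capacities given as a dense matrix):

```python
from collections import deque

def max_flow(capacity, source, sink):
    """capacity[u][v] = edge capacity; the residual graph includes reverse edges."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        parent = [-1] * n                      # BFS for an augmenting path
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:                 # no augmenting path: flow is maximum
            break
        bottleneck, v = float("inf"), sink     # bottleneck capacity along the path
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:                     # push flow, updating reverse edges
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
    return flow
```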

randomization

  • Andrew Yao’s theorem: random data + best deterministic algorithm (min over algorithms) has the same value as worst-case data + best randomized algorithm (max over data)
    • min of max = max of min
    • converts worst-case analysis of randomized algorithms into average-case analysis of deterministic algorithms

Markov chain

  • Markov (transition) matrix: stochastic matrix, non-negative w/ each row summing to one
    • doubly-stochastic matrix: both rows & columns sum to one
    • spectral radius: the largest dilation of any vector (the largest |eigenvalue|)

PageRank

  • network centrality

approximation:

start from random nodes and walk a bounded number of rounds (see the sketch below)

  • significant PageRank problem: want all pages w/ PageRank above a given threshold
    • approximately find the pages w/ PageRank near or above the threshold
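
A power-iteration PageRank sketch in Python (illustrative; `out_links[u]` lists the pages u links to, `alpha` is the usual damping factor):

```python
def pagerank(out_links, alpha=0.85, iterations=50):
    """Each step: follow a random out-link w.p. alpha, teleport uniformly otherwise."""
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1.0 - alpha) / n] * n                  # teleportation mass
        for u, links in enumerate(out_links):
            if links:
                share = alpha * rank[u] / len(links)
                for v in links:
                    new[v] += share                    # pass rank along out-links
            else:                                      # dangling node: spread evenly
                for v in range(n):
                    new[v] += alpha * rank[u] / n
        rank = new
    return rank

# e.g. pagerank([[1, 2], [2], [0]]) -> ranks summing to 1
```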

spectral graph theory

  • simplest: undirected graph
  • spectral graph partitioning: heuristics

Laplacian matrix

  • D: degree matrix, diagonal, D_ii is the degree of node i
  • A: adjacency matrix
  • L = D − A; additive decomposition: can add edges one at a time and sum the single-edge Laplacians
    • L is positive semi-definite: xᵀL x = Σ_{(u,v)∈E} (x_u − x_v)² ≥ 0
    • eigenvalues 0 = λ_1 ≤ λ_2 ≤ ⋯ ≤ λ_n
    • Fiedler: λ_2 > 0 ⟺ the graph is connected

min cut

  • convention:
  • conductance: Φ(S) = |E(S, V ∖ S)| / min(vol(S), vol(V ∖ S))
    • want min conductance → NP-hard
  • algorithm
    1. find the Fiedler vector (eigenvector corresponding to λ_2)
    2. sort its entries ascending
      • only need to check the n − 1 cuts between consecutive prefixes and suffixes of the sorted order (see the sketch below)
      • dimensionality reduction
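
A NumPy sketch of the Fiedler-vector sweep cut (illustrative; assumes a symmetric adjacency matrix):

```python
import numpy as np

def fiedler_sweep_cut(adjacency):
    """Spectral partitioning heuristic: build L = D - A, take the Fiedler vector
    (eigenvector of the 2nd smallest eigenvalue of L), sort its entries, and
    return the best of the n - 1 sweep cuts measured by conductance."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A                       # graph Laplacian
    _, eigvecs = np.linalg.eigh(L)                 # eigenvalues come back ascending
    fiedler = eigvecs[:, 1]                        # eigenvector for lambda_2
    order = np.argsort(fiedler)                    # sort entries ascending
    total_vol = degrees.sum()
    best_cut, best_phi = None, np.inf
    for i in range(1, len(order)):                 # sweep over prefixes of the order
        S, rest = order[:i], order[i:]
        cut_weight = A[np.ix_(S, rest)].sum()      # edge weight crossing the cut
        vol_S = degrees[S].sum()
        phi = cut_weight / min(vol_S, total_vol - vol_S)   # conductance of this cut
        if phi < best_phi:
            best_cut, best_phi = set(S.tolist()), phi
    return best_cut, best_phi

# e.g. two triangles joined by one edge:
# A = [[0,1,1,0,0,0],[1,0,1,0,0,0],[1,1,0,1,0,0],
#      [0,0,1,0,1,1],[0,0,0,1,0,1],[0,0,0,1,1,0]]
```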