- Advanced Analysis of Algorithms
- theory overview
- binary search to graph search
- interactive learning
- greedy algorithm
- approximation algorithm
- iterative algorithm
- divide and conquer in geometric space
- fast Fourier transform (FFT)
- linear programming (LP)
- network flow
- randomization
- spectral graph theory
Advanced Analysis of Algorithms
theory overview
NP&P → decision problem on &[u1]
(language)
⇒ search problem (harder, but may reduce to decision problem)
⇒ optimization
desicion problem → board game
P space: where is polynomial
- P time P space (equal?)
- PPAD
#P: account #solution to NP problem
- each NP problem has #P problem
- #P-complete P space
linear programming: proved P
learning problem
- binary classification:
- learnable: train on unknown distribution , prediction correction increase
binary search to graph search
binary search (go to middle)
⇒ quick selection (broader notion of middle; randomize):
- input:
- want: -th smallest
- algorithm:
- pick with random
- split by to , , throw away impossible list and shrink
power of randomization: but wrt randomness:
of the time we pick element ranked , can throw away of the list
⇒ distance graph
binary search is graph w/ , each number node has edge w/ next number
- local info, e.g., is closer
generalized undirected graph “binary search” for target from
-
- on shortest path if:
- condition: will give set of vertex on shortest path
- iteration: or give closer to
- theorem: ∃ algorithm to find in question
- want: shrink possible answer set , set of node on shortest path from to
- ⇒ medium: lowest potential function
- update w/ hint :
-
claim:
proof:
interactive learning
A General Framework for Robust Interactive Learning, Ehsan Emamjomeh-Zadeh, David Kempe
given hypercube , search space , find s.t.
- VC dimention
- hyperplane in -dimensional space can split cell
greedy algorithm
- can deal with NP-hard problem
- optimality:
- argument of staying ahead
- exchange argument
- proving optimality
- optimum must exist for finite problem
- compare to imaginary optimum
- focus on simple local consistency that eliminate bad possibility
Huffman Codes
- prefix code: no code is prefix of another,
- ⇒ binary tree w/ leaf, each leaf’s parents not usable
- optimum is trivial when tree is given
- tree w/ same number of leaf of each length are equivalent
- has to be proper: each node is either leaf or parent of 2
- substructure optimality: bottom 2 children must map to 2 least frequency
- induction: merge 2 least frequency, treat it as one letter to find mapping for the rest
minimum spanning tree (MST)
cut: split graph to and
affinity: inverse distance, closeness
Kruskal’s algorithm
traditional Kruskal
- sort ascending
- from no node, add edge from small, w/o cycle
reversed Kruskal
- sort ascending
- from , remove edge from big, w/o disconnecting
Prim’s algorithm
- choose arbitrary node “home town” .
- repeat: find closest node w/ edge form .
- why: cut property: shortest cut edge is in MST
- reverse cut property: longest edge in cycle is not in MST
clustering
given w/ metric , , want s.t.
- maximize shortest inter-cluster edge
- idea: for tree , just need to cut longest edge
- ⇒ algorithm: for graph , cut longest edge in MST for optimal clustering
- proof: denote greedy solution cluster , optimum cluster
- if , ∃ node s.t.
- in MST on path from to , within s.t. cross boundary
- by reverse Kruskal, in MST, longest edge that do not disconnect graph are gone
by greedy algorithm, the th longest edge in MST - optimum solution has shortest inter-cluster edge , same or smaller than greedy solution ⇒ contradiction
approximation algorithm
set cover
given: input ground set , w/
want: s.t. and minimized
idea: minimize average cost
algorithm:
- while , pick , set
not optimal when many large also cover small
claim: greedy is within of optimum, where
doing consistently better than of optimum is NP-hard
greedy cost
- harmonic series
- when th (starting from 0) element in is covered, because at most element is covered in
set function
- salary for group in society
- hypergraph: indicator set function determine if each subset is hyperedge
property:
- grounded:
- monotone:
- submodular
submodular function
- diminishing return: less happier when having more and more chocolate
- discrete derivative:
- how much value can add to
- complementarity:
- submodular submodular
- “or” is fundamentally submodular
problem
input: submodular ,
output: , s.t. ,
oracle model: can get
example: max cover: for mapping choose element from to maximize
greedy algorithm : at time , add that maximize
theorem: monotone & submodular, ,
proof:
alternative definition for submodular function (equivalent):
reachability
submodular:
network influence
input
- dynamic & complex compared to traditional static graph
- network influence maximization: often submodular
independent cascade (IC)
each node has probability to influence
- stochastic network influence; e.g., pandemic
- influence spread
- generally #P-complete
- roughly stochastic “or” process; distribution of reachability model
threshold model
each node has threshold to be influenced by neighbor
- deterministic/ stochastic by , e.g., idea spreading
iterative algorithm
polynomial local search (PLS)
- e.g., simplex algorithm, bubble sort
- polynomial number of viable option at each step
- always improve because know potential function
- will stop because define direct acyclic graph (DAG)
Lloyd’s algorithm (K-means clustering)
input:
want: cluster into
- representation (center) of
- mean of cluster minimize variance of distance
- algorithm:
- randomly initialize
- calculate
- regroup by
- Voronoi diagram: zip code
- exponential time to converge, but polynomial time to get close
divide and conquer in geometric space
1-dimensional space
- nice because: can sort, minimal neighborhood, point = hyperplane
- mean: not statistically robust; median: robust
approximate median
-median
algorithm, use 2-way tree of height :
- uniformly sample point
- find median of every 3 point
- find median of every 3 median on the last level, recursively
proof:
- define probability that median algorithm w/ height yield result
- because only 1 point
- at least 2 point among 3 need to be to get median:
- by induction, if
median in 2-dimensional space
- -centerpoint: for any projection, at approximate median
- theorem: , -centerpoint
- intuition: need point to trap 1 point
- VC dimension theory: need sample to estimate well
- -good sample : half space ,
nearest neighbor
- ball: smallest ball that contain neighbor define “nearest”
- ball shape depend on metric (no necessary round)
- locality: a point is not covered by many ball
- k-NN graph (k-NNG): edge from and if is in
- point location, e.g., cell phone connect to tower
- nearest-pair problem: find nearest pair of point among set of point
- algorithm for 2D:
- divide by median on one axis to , find nearest pair in each half
- take minimum distance for
- find nearest pair within around the boundary () - only need to check a series of -hypercube
- take the minimum, recurs
- Bentley: in d-dimension
- algorithm for 2D:
construct k-NNG: divide and conquer ()
- split into by median from (could also split arbitrary)
- build k-NNG for each half ()
- point location problem: ball too big iff point from other half in ball
- need binary search tree for query
- max overlapping ball
- shrink all ball in w/ point in
- recurs
disk packing
non-overlapping 2D ball set
- problem: given point, find ball containing it
- planar graph: node for each ball, edge for intersection
- Koebe embedding: reverse is true
- e.g., prof Teng saw 100 lake in Minnesota (which has 10000) when driving across
condition: if can dig round lake on the spherical, then charge $1 for each lake on tour though great circle
- maximum expected charge:
- in -dimension
proof:
assume the globe has radius 1
each lake define a belt of width perpendicular to great circle passing through it ⇒ expectation of charge:
lake area cannot exceed globe area:
clearly want equal by convexity
kissing number
max number of non-overlapping ball to touch one ball
- in 1D, in 2D, in 3D
3-dimensional binary search via disk
a great circle divide disk into
can have conformal map s.t. median of all disk center is center of globe
- dilate point by projecting globe to plane via tangent, scaling up on plane, then projecting back
can build binary search tree by successive random split through median
-dimensional convex geometry
Helly’s theorem (projection lemma)
in -dimension, with ∞ convex point set, if convex set , all them intersect, then all ∞ of these set intersect
- in 1D, 3 interval intersect pairwise ⇒ ∃1 point in all 3 interval
- -centerpoint exist
Radon theorem: median in convex geometry definition
in -dimension
point ,
can be divided into 2 set s.t convex hull of and intersect
- i.e. s.t.
proof:
has linear equation ⇒ has non-trivial solution, i.e.,
is both in convex hull of and
- intersection point as median
- can get -median WHP, time via -tree
Lipton-Tarjan separator theorem for planar graph
Alan George nested dissection
- numerical system beat Gaussian elimination
- in Gaussian elimination, removing 1 node connect all its neighbor
- remove separator node to separate graph into 2 (or 3)
- eliminate think separator last, so eliminating each node incur small cost
fast Fourier transform (FFT)
integer multiplication
classic algorithm: —not scalable
- output size
can divide each number into high and low half
can save one smaller multiplication by:
FFT make multiplication , nearly as easy as addition
polynomial multiplication
- number are special case of polynomial
- better than number because continuous
- convolution: is sequence of diagonal sum in
- polynomial can be recovered by any distinct data point
realizable data
recover from by solving
easy to multiple and :
need data point to recover
active learning: can pick nice data point as wished
- unit root in complex space:
divide unit circle evenly ⇒ all in form
⇒ sample for
can save computation by , etc.
divide recursively:
⇒ calculate all in
linear programming (LP)
- maximization standard form: maximize, ≤ constraint, non-negative entry
- e.g., use limited material to make product for profit
- minimization standard form: opposite
- polyhedron
- polytope (bounded)
- vertex: ∃ vector, stay in if added, not in if subtracted
- face
- optimal solution: convex (a face), tight (constraint reach equality)
- feasible solution
- DNE when unbounded/ infeasible
- fundamental theorem of linear programming: optimal solution, if exist, contain vertex
- ⇒ if optimal solution exist, exist optimal solution w/ at most non-zero variable
- complementary slackness: , only a few constraint active
- dual LP
- variable become constraint, each row of constraint become 1 dual variable
- persuade not to make product by offering to buy raw material at higher price
- essentially finding upper bound for value
- finding the force on a ball in force field in a cage when it is stable against corner, then try to find path to origin w/ least work
- involution
- lenient primal form ⇒ tight dual form, vice versa
- weak duality theorem: primal value ≤ dual value
- looking for optimum from opposite direction
- one is unbounded the other is infeasible
- optimal if
- strong duality theorem: if either feasible and bounded, then the other is feasible and bounded and optimal value equal
- hold for LP, not for general convex optimization
- method
- simplex method (George Dantzig): polynomial in practice
- duality (John von Neumann)
- engineering method (polynomial): ellipsoid method, interior point method
network flow
- designed to attack USSR supply chain
- Ford-Fulkerson max flow: send most from source to sink
- simple greedy not optimal: find path w/ max min flow recursively
- optimal: include reverse flow from chosen flow in residual graph
- proof by duality
- min cut: sum of weight of edge that separate source and sink
- max flow ≤ min cut
- for each forward cut, ∃ flow that saturate it
- each backward cut is empty
randomization
- Andy Yao’s theorem: random data + deterministic algorithm (algorithm min) is the same as deterministic data + random algorithm (data min)
- min of max = max of min
- convert worst case analysis to average case analysis
Markov chain
- Markovian matrix : stochastic matrix
- doubly-stochastic matrix: both row & column sum to one
- spectral radius: largest dilation by vector
PageRank
- network centrality
approximation:
: start random and walk round
- significant PageRank problem: want all page w/ PageRank
- approximately find page w/ PageRank
spectral graph theory
- simplest: undirected graph
- spectral graph partitioning/ clustering: heuristics
- reduce search space from w/ Laplacian matrix
Laplacian matrix
- : degree matrix, diagonal, is degree of node
- : adjacency matrix
- additive decomposition: can add edge one at a time and sum the ,
-
- ⇒ is positive semi-definite
- eigenvalue
- Fiedler: graph disconnected
min cut
- convention:
- quality of cut: conductance
- want min conductance → NP-hard
- Cheeger’s inequality:
- algorithm
- find Fiedler vector (eigenvector corresponding to )
- sort entries ascending
-
- only need to check cut between and
- dimensionality reduction
- find Fiedler vector (eigenvector corresponding to )