Summary: In this thesis we study various aspects of linear and semidefinite
programs, including their limitations in approximating
combinatorial optimization problems as well as applications of these
paradigms to problems in machine learning.
In the first part we show inapproximability results
for polynomial-sized LPs and SDPs for several
problems. For the class of Constraint Satisfaction Problems (CSPs),
it is known that general LPs and SDPs are no more powerful
than the Sherali-Adams and Lasserre hierarchies, respectively.
We show lower bounds for general LPs and SDPs for several non-CSP
problems such as Matching, Matching on 3-regular graphs,
Independent Set, Vertex Cover, (non-uniform) Sparsest Cut,
and (non-uniform) Balanced Separator.
We also obtain the first general SDP inapproximability result
for Maximum Cut: no polynomial-sized SDP can achieve an approximation
ratio better than 15/16. We also show that, contrary to the situation with
CSPs, there are problems such as Independent Set where the Lasserre SDP
hierarchy is a suboptimal relaxation compared to even linear-sized LP
relaxations.
The second part of the thesis deals with applications of these
paradigms to problems in machine learning. In particular,
we study a recently proposed cost function
for hierarchical clustering and show how to write an
integer program describing the convex hull of this problem.
We also show that rounding "spreading metric"-type relaxations
leads to improved approximation guarantees for this problem.
We also study another classic problem in machine learning, namely
reinforcement learning in a robust setting, where the
transition matrix corresponding to each action is not
known explicitly but may lie in some (convex) uncertainty set
and may be chosen adversarially. We give approximation algorithms
that compute an (approximately) optimal policy in this setting.
The main ingredient of this work is the definition of "robust"
variants of the classical Q-iteration and TD-iteration, in which
each update involves an additional linear optimization
step. We prove convergence under appropriate choices of
step lengths and discount factor.
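As a rough illustration of the robust update, the sketch below runs robust Q-iteration on a hypothetical toy MDP in which each (state, action) pair has a finite set of candidate transition rows and the adversary picks the worst one each step. The instance, variable names, and the restriction to a finite (rather than general convex) uncertainty set are illustrative assumptions, not the thesis's actual construction; over a convex set the inner minimization would be a linear program.

```python
import numpy as np

gamma = 0.9                      # discount factor (assumed)
n_states, n_actions = 2, 2

# rewards r[s, a] for a toy instance
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# uncertainty[s][a]: finite set of candidate transition rows over next states
uncertainty = [[[np.array([0.8, 0.2]), np.array([0.6, 0.4])],
                [np.array([0.5, 0.5]), np.array([0.9, 0.1])]],
               [[np.array([0.3, 0.7]), np.array([0.4, 0.6])],
                [np.array([0.2, 0.8]), np.array([0.1, 0.9])]]]

Q = np.zeros((n_states, n_actions))
for _ in range(500):
    V = Q.max(axis=1)            # greedy value of each next state
    Q_new = np.empty_like(Q)
    for s in range(n_states):
        for a in range(n_actions):
            # adversarial inner step: worst-case expected next-state value
            # over the uncertainty set (an LP when the set is convex)
            worst = min(p @ V for p in uncertainty[s][a])
            Q_new[s, a] = r[s, a] + gamma * worst
    Q = Q_new

print(np.round(Q, 3))
```

With a finite candidate set the inner step is a plain minimum; the robust Bellman operator remains a gamma-contraction, so the iteration converges to a unique robust fixed point.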