LP duality #1

OK, so it’s been a painfully long time since I wrote, and I really don’t have any good excuse for that, sigh. I will try to change this. Anyways, in this post, I want to talk about LPs (linear programs, known in more modern terms as linear optimization problems). More specifically, I want to talk about duality. Before I start with the interesting stuff, however, I want to establish the setup of the problem, which I’ll try to do as concisely as possible, only introducing the concepts we will need.

Setup and terminology. In general, a linear optimization problem is concerned with optimizing a linear function f : \mathbb{R}^n \to \mathbb{R}, called the objective function, subject to some finite number (say m) of non-strict linear inequalities (or equations), called the constraints. To be more precise, we want to find a point x \in \mathbb{R}^n, satisfying all the constraints, for which the objective function attains its maximum (or minimum, depending on the problem) value. Such an x is known as an optimal solution. More generally, any x which satisfies the constraints is known as a feasible solution. Note that optimal solutions may not always exist (let alone be unique), and there might not even be any feasible solutions! If we have two vectors x, y, then we will write x \leq y to mean that x_i \leq y_i for all i; in other words, each component of x is at most the corresponding component of y. This clearly isn’t a total order, but that’s OK.

A concise way of expressing the problem. In general, a problem like I described above will look something like this:

Maximize x_1 + x_2 subject to the constraints x_2 + x_1 \leq 3, x_1 \geq 0.

Our hope, however, is to convert the problem to standard form, that is, write it as

Maximize c^T x subject to the constraints Ax \leq b, x \geq 0.

This might mean converting some \geq constraints into \leq constraints, by multiplying through by -1 or something. Also, notice how we’ve encoded the objective function x_1 + x_2 as the action of a covector c^T on the vector x. Similarly, we’ve conveniently “collapsed” all m of the linear inequalities into the single matrix inequality Ax \leq b, where A is an m \times n real matrix and b \in \mathbb{R}^m.
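To make this concrete, here is a minimal sketch (my addition, not from the original post) of feeding the toy problem above to SciPy, whose linprog routine wants exactly this Ax \leq b shape. Since linprog minimizes, we maximize c^T x by minimizing (-c)^T x:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 1.0])    # objective: maximize x_1 + x_2
A = np.array([[1.0, 1.0]])  # the "collapsed" constraint row: x_1 + x_2 <= 3
b = np.array([3.0])

# bounds encode x_1 >= 0 and leave x_2 free (the toy problem does not
# constrain x_2 from below, so it is not literally in standard form x >= 0)
res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None), (None, None)])
print(res.x, -res.fun)      # some optimal point, and the optimal value 3.0
```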

Geometric intuition. With a bit of thought, we notice that each constraint “splits” the whole space \mathbb{R}^n into two parts: the vectors which satisfy that constraint, and those that don’t. These “constraint regions”, as I will call them, turn out to actually be half-spaces. The “feasible region” (that is, the set of all feasible solutions) we speak of is merely the intersection of all the separate constraint regions. The problem then reduces to finding the optimal real number t such that the feasible region has nonempty intersection with the level set \mathcal{L}_t := \{ x \in \mathbb{R}^n : c^T x = t \}. Each level set \mathcal{L}_t will always be an affine subspace, obtained by translating \ker(c^T) = \mathcal{L}_0, which (by rank-nullity) is a hyperplane, ignoring the degenerate case c^T = 0. In other words, we’re “sliding” along an uncountable stack \{ \mathcal{L}_t \}_{t \in \mathbb{R}} of parallel hyperplanes sitting in \mathbb{R}^n, pushing t as far as we can until our hyperplane is just touching the edge, in some sense, of the feasible region.
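To illustrate the “sliding” picture numerically (again my own sketch, not part of the original post), we can ask, for each t, whether the level set \mathcal{L}_t still meets the feasible region of the toy problem, by solving a pure feasibility problem with a zero objective:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 1.0])
A, b = np.array([[1.0, 1.0]]), np.array([3.0])
bounds = [(0, None), (None, None)]  # x_1 >= 0, x_2 free

def level_set_meets_feasible_region(t):
    # zero objective: we only care whether any feasible point exists
    # on the hyperplane c^T x = t, encoded as an equality constraint
    res = linprog(np.zeros(2), A_ub=A, b_ub=b,
                  A_eq=c.reshape(1, -1), b_eq=[t], bounds=bounds)
    return res.status == 0  # status 0: a feasible point was found

for t in [0.0, 1.5, 3.0, 3.0001]:
    print(t, level_set_meets_feasible_region(t))
# prints True, True, True, False: the stack "detaches" just past t = 3
```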


7 Responses to LP duality #1

  1. Supplemental readings: http://www.math.uwaterloo.ca/~harvey/F10/

    It might be my geometric intuition that’s messing me up, but I assume that \{\mathcal{L}_t\}_{t\in\mathbb{R}} is somewhat analogous to a quotient space of \mathbb{R}^n, where we are just quotienting out some one-dimensional space to obtain a stack of hyperplanes. By intersecting these half-spaces (the constraints) with the hyperplanes, it seems that we now have a set of “half-hyperplanes” or some lower-dimensional variant.

    Is there anything we can use to determine whether our space of feasible solutions is one-, two-, or n-dimensional? Or whether there is a unique solution? Or whether the solution set is just a line segment?

  2. dx7hymcxnpq says:

    Well, if you identify points that share a hyperplane, \{ \mathcal{L}_t \}_{t \in \mathbb{R}} effectively becomes a one-dimensional space, i.e. the orthogonal complement (which in this case coincides with the quotient space) of some (n-1)-dimensional subspace, namely the kernel of that linear functional. I don’t think viewing it like that really helps at all though, since you care about whether or not each hyperplane intersects the feasible region.

    The feasible region is just the set of points in \mathbb{R}^n that satisfy all the constraints (i.e. lie in all of what I call the “constraint regions”, which are half-spaces). This region will be convex, since half-spaces are convex and the intersection of a family of convex sets is again convex. Also, it could be empty, or unbounded (in the sense of MATH 247: for every M \in \mathbb{R} there is some point x \in S so that \| x \| > M). Also, it’s pretty improbable that two (let alone more) half-spaces intersect in a subspace (in fact the two half-spaces would have to be closed, and perfectly complemented, from my own geometric intuition).
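    A tiny numerical sanity check of that convexity claim (an editor’s sketch, assuming NumPy; the constraint data is arbitrary): any convex combination of two feasible points should again be feasible.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(5, 3))    # 5 random half-space constraints in R^3
    b = rng.uniform(1, 2, size=5)  # b > 0, so x = 0 is strictly feasible

    def feasible(x):
        return np.all(A @ x <= b + 1e-12)

    # rejection-sample two feasible points, then walk the segment between them
    pts = []
    while len(pts) < 2:
        x = rng.normal(size=3)
        if feasible(x):
            pts.append(x)
    x, y = pts
    for lam in np.linspace(0, 1, 11):
        assert feasible(lam * x + (1 - lam) * y)
    print("every sampled convex combination is feasible")
    ```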

    • @ 2nd paragraph: Hmm, in regards to that unbounded definition, are you saying that we can have some intersection of half spaces that will give us a “space” that is the complement of a closed ball (with radius m \in \mathbb{R}) centered at the origin? In that case, it would really be a convex set… then again, since I haven’t taken 247 yet, I might be just misinterpreting the wording.

      @1st paragraph: What do you mean by “identify points that share a hyperplane”? Do you mean that if we find points that lie exclusively in one of the \mathcal{L}_t for some t \in \mathbb{R}, then for any s \neq t we have \mathcal{L}_s equal to some translated version of \mathcal{L}_t?

      Anyways, I can see the idea that \{ \mathcal{L}_t \}_{t \in \mathbb{R}} is just isomorphic to the orthogonal complement and can be viewed as a quotient space of sorts, but something I don’t get from the original post is what you mean by “just touching the edge”. Are you talking about choosing t such that the hyperplane \mathcal{L}_t is tangential to the feasible solution space?

    • dx7hymcxnpq says:

      First question: First off, the complement of a closed ball is not going to be convex. The possible “shapes” of the feasible region of an LP problem are heavily restricted by the fact that we only allow finitely many constraints: the feasible region is therefore an intersection of finitely many half-spaces. For example, I know of no way to generate a ball as a finite intersection of half-spaces, although with an infinite (probably uncountable) intersection I think it could be done (a quick numerical illustration follows at the end of this reply).

      Second question: When I say “identify” a collection of objects, I mean “consider everything in that collection to be the same”.

      Third question: It doesn’t really have much to do with tangency, but more to do with the fact that we want to choose an “optimal” t (the meaning of “optimal” depends on whether it is a maximization or minimization problem) such that the hyperplane \mathcal{L}_t has nonempty intersection with the feasible region. If we can do this, then we can just pluck some x that sits both on that “optimal hyperplane” and in the feasible region. Such an x will then be an optimal solution to the problem (since the objective function’s value is optimized there), if you think about it.
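      A quick sketch supporting the remark about the ball (an editor’s addition, assuming NumPy): cutting the plane with k half-planes tangent to the unit circle leaves a regular k-gon whose vertices overshoot the circle by a factor of 1/\cos(\pi/k), which tends to 1 only as k \to \infty, so no finite intersection gives the disk exactly.

      ```python
      import numpy as np

      # the half-planes {x : x . (cos t, sin t) <= 1} are tangent to the unit
      # circle; intersecting k of them (equally spaced t) gives a regular
      # k-gon with apothem 1 and circumradius 1 / cos(pi / k)
      for k in [4, 8, 64, 1024]:
          overshoot = 1 / np.cos(np.pi / k)
          print(k, overshoot)  # 1.414..., 1.082..., 1.0012..., 1.0000047...
      ```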

    • @First: Whoops, a typo, it really should be “wouldn’t”. To add to this, would it no longer be called linear programming if we had a sequence of constraints rather than some finite collection?

      @Third: Oh, okay. The choice of t would depend on the type of optimization problem that we’re inspecting.

  3. dx7hymcxnpq says:

    To come back to your first question above: it turns out that much of the rich theory (in particular, the extremely useful so-called simplex algorithm) of linear programming relies on the fact that feasible regions of linear programs are polyhedra in \mathbb{R}^n. That is, they are sets of the form P = \{ x \in \mathbb{R}^n : Ax \leq b \} where A is an m \times n matrix and b \in \mathbb{R}^m (a brute-force illustration of the vertex structure this buys us appears at the end of this comment).

    If you have infinitely many constraints, then you can’t write them as a single matrix inequality anymore. I guess you’d have to write something like L(x) \leq b for your constraint system. It’s interesting, but yeah, it basically ceases to be considered linear programming at that point.
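    To see the vertex picture behind the simplex algorithm in the crudest possible way (an editor’s sketch, not the simplex method itself): each vertex of P solves n of the m constraints as equalities, so for a small bounded example we can simply enumerate all such intersection points, keep the feasible ones, and take the best objective value.

    ```python
    import itertools
    import numpy as np

    # toy polyhedron: the unit square 0 <= x_1, x_2 <= 1, written as Ax <= b
    A = np.array([[1, 0], [0, 1], [-1, 0], [0, -1]], dtype=float)
    b = np.array([1, 1, 0, 0], dtype=float)
    c = np.array([1.0, 2.0])  # maximize c^T x

    best = None
    for rows in itertools.combinations(range(len(A)), 2):  # n = 2 tight rows
        sub = A[list(rows)]
        if abs(np.linalg.det(sub)) < 1e-12:
            continue                       # rows dependent: no vertex here
        v = np.linalg.solve(sub, b[list(rows)])
        if np.all(A @ v <= b + 1e-9):      # keep only feasible intersections
            if best is None or c @ v > c @ best:
                best = v
    print(best, c @ best)                  # [1. 1.] 3.0
    ```

    The simplex algorithm exploits the same fact, but walks from vertex to adjacent vertex along improving edges instead of enumerating everything.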
