A standard LP is a linear program of the form:
\begin{aligned}
\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\
\text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\
&\boldsymbol x\geqslant0.
\end{aligned}
A nonstandard LP can always be transformed into an equivalent standard LP by adding new variables. For example, x_1\leqslant1 is equivalent to x_1+x_2=1 with x_2\geqslant0.
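In general the transformation is routine (the following recipe is standard and stated here only for completeness): a free variable is split into two nonnegative variables, and an inequality is absorbed by a nonnegative slack variable:
\begin{aligned}x _ j\ \text{free}\quad&\Longrightarrow\quad x _ j=x _ j^+-x _ j^-,\quad x _ j^+,x _ j^-\geqslant0,\\\boldsymbol a^\mathsf T\boldsymbol x\leqslant b\quad&\Longrightarrow\quad\boldsymbol a^\mathsf T\boldsymbol x+s=b,\quad s\geqslant0.\end{aligned}
A maximization is handled via \max\boldsymbol c^\mathsf T\boldsymbol x=-\min(-\boldsymbol c^\mathsf T\boldsymbol x).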
Polyhedral set
A feasible region of an LP problem is a polyhedral set (polyhedron), that is, a set of the form \{\boldsymbol x\mid A\boldsymbol x\leqslant\boldsymbol b\}.
We can find the corner points of a polyhedral set in the following way. First impose all equality constraints to ensure feasibility, and then require that enough additional constraints be active, so that we get a total of n linearly independent active constraints and a unique \boldsymbol x is therefore determined. We call such an \boldsymbol x a basic solution. If, in addition, it violates no inactive constraint, it is called a basic feasible solution.
Definition: Consider a polyhedral set and a point \boldsymbol x. Then \boldsymbol x is a basic solution if: all equality constraints are active at \boldsymbol x, and among the constraints that are active at \boldsymbol x, there are n that are linearly independent.
For a polyhedral set, the following three notions coincide: vertex (the usual geometric definition, i.e., a point through which some hyperplane passes that meets the polyhedron only at that point, with the polyhedron on one side), extreme point (a point that cannot be represented as a convex combination of two other points in the polyhedron), and basic feasible solution (a point in the set at which the equality constraints are active and n of the active constraints are linearly independent).
Consider the polyhedral set S=\{\boldsymbol x\in\mathbb R^n\mid A\boldsymbol x=\boldsymbol b,\ \boldsymbol x\geqslant0\} (we assume that A has full row rank m). Denote A=(\boldsymbol A _ 1,\dots,\boldsymbol A _ n), where \boldsymbol A _ i is the i-th column of A.
Theorem 1 (characterization of basic solutions): \boldsymbol x is a basic solution iff A\boldsymbol x=\boldsymbol b, and there exist indices i _ 1,\dots,i _ m such that the columns \boldsymbol A _ {i _ 1},\dots,\boldsymbol A _ {i _ m} are linearly independent and x _ i=0 for all i\notin\{i _ 1,\dots,i _ m\}.
According to this theorem, we have:
Procedure for constructing basic solutions: choose m linearly independent columns of A, set x _ i=0 for every index i not selected, and then solve A\boldsymbol x=\boldsymbol b for the remaining variables.
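A minimal sketch of this procedure in Python with numpy (the constraint data is taken from the numerical example solved later in this section; everything else is illustrative):

```python
import itertools
import numpy as np

def basic_solutions(A, b):
    """Enumerate basic solutions of {x | Ax = b}: pick m linearly
    independent columns, set the other variables to 0, solve for the rest."""
    m, n = A.shape
    for idx in itertools.combinations(range(n), m):
        B = A[:, idx]
        if abs(np.linalg.det(B)) < 1e-12:      # columns not linearly independent
            continue
        x = np.zeros(n)
        x[list(idx)] = np.linalg.solve(B, b)   # solve B x_B = b
        yield idx, x

# Constraints of the numerical example below: -4x1+6x2+x3 = 9, x1+x2+x4 = 4.
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
for idx, x in basic_solutions(A, b):
    print(idx, x, "feasible" if np.all(x >= 0) else "infeasible")
```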
Theorem 1': \boldsymbol x\in S is a basic feasible solution iff \boldsymbol A _ {i _ 1},\dots,\boldsymbol A _ {i _ k} are linearly independent, where i _ 1,\dots,i _ k are the indices for which x _ {i _ j}>0.
Optimality condition
Theorem 2: For a standard LP, if S is bounded, then at least one of its vertices is an optimal point of the LP problem.
Theorem 2': Suppose S has at least one vertex and there exists an optimal solution. Then there exists an optimal solution which is a vertex of S.
Theorem 2'': Suppose S has at least one extreme point. Then, either the optimal cost is -\infty, or there exists a vertex which is optimal.
Corollary: Consider LP of minimizing over a nonempty polyhedral set. Then either the optimal cost is -\infty or there exists an optimal solution.
We call a nonzero vector \boldsymbol d a direction of a convex set X if \{\boldsymbol x+\lambda\boldsymbol d\mid\lambda\geqslant0\}\subseteq X for every \boldsymbol x\in X. If S=\{\boldsymbol x\in\mathbb R^n\mid A\boldsymbol x=\boldsymbol b,\ \boldsymbol x\geqslant0\}, then it is clear that \boldsymbol d is a direction of S iff
A\boldsymbol d=\boldsymbol 0,\, \boldsymbol d\geqslant0,\, \boldsymbol d\neq\boldsymbol 0.
The extreme points of \{\boldsymbol d\mid A\boldsymbol d=\boldsymbol 0,\ \boldsymbol d\geqslant0,\ \boldsymbol 1^\mathsf T\boldsymbol d=1\} are called the extreme directions of S. A polyhedral set S can only have finitely many extreme directions.
Theorem 3: Consider standard LP. Denote the extreme directions of S as \boldsymbol d _ 1,\dots,\boldsymbol d _ l. If \boldsymbol c^\mathsf T\boldsymbol d _ j\geqslant0\, (\forall j=1,\dots,l), then at least one extreme point of S is an optimal solution. Otherwise, the optimal objective value is -\infty.
Simplex method
Introduction to the simplex method
Based on Theorem 2'', the simplex method searches for an optimal solution by moving from one basic feasible solution to another, along the edges of the feasible set, always in a cost reducing direction. Eventually, a basic feasible solution is reached at which none of the available edges leads to a cost reduction; such a basic feasible solution is optimal and the algorithm terminates.
Why does this method work? Many optimization algorithms are structured as follows: given a feasible solution, we search its neighborhood to find a nearby feasible solution with lower cost. If no nearby feasible solution leads to a cost improvement, the algorithm terminates and we have a locally optimal solution. For general optimization problems, a locally optimal solution need not be (globally) optimal. Fortunately, in linear programming, local optimality implies global optimality; this is because we are minimizing a convex function over a convex set.
Equivalent LP
Consider standard LP:
\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\ \text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\&\boldsymbol x\geqslant0.\end{aligned}
Let B be a basis matrix corresponding to \{i _ 1,\dots,i _ m\}, i.e.,
B=(\boldsymbol A _ {i _ 1},\dots,\boldsymbol A _ {i _ m}), \quad\operatorname{rank}(B)=m,
and let N be the nonbasic matrix consisting of the remaining columns of A. The LP can be written as
\begin{aligned}
\min\quad &\boldsymbol c _ B^\mathsf T\boldsymbol x _ B+\boldsymbol c^\mathsf T _ N\boldsymbol x _ N\\
\text{s.t.}\quad&\boldsymbol x _ B+B^{-1}N\boldsymbol x _ N=B^{-1}\boldsymbol b,\\
&\boldsymbol x _ B\geqslant0,\, \boldsymbol x _ N\geqslant0.
\end{aligned}
Eliminating \boldsymbol x _ B=B^{-1}\boldsymbol b-B^{-1}N\boldsymbol x _ N, we obtain
\begin{aligned}
\min\quad &(\boldsymbol c _ N^\mathsf T-\boldsymbol c _ B^\mathsf TB^{-1}N)\boldsymbol x _ N+\boldsymbol c _ B^\mathsf TB^{-1}\boldsymbol b\\
\text{s.t.}\quad& B^{-1}N\boldsymbol x _ N\leqslant B^{-1}\boldsymbol b,\\
&\boldsymbol x _ N\geqslant0.
\end{aligned}
Hence:
- If B^{-1}\boldsymbol b\geqslant0, then \boldsymbol 0\in\mathbb R^{n-m} is a feasible solution of the above LP.
- If in addition \boldsymbol c _ N^\mathsf T-\boldsymbol c _ B^\mathsf TB^{-1}N\geqslant0, then \boldsymbol 0\in\mathbb R^{n-m} is an optimal solution of the above LP.
The feasibility and optimality conditions: if B^{-1}\boldsymbol b\geqslant0, then the corresponding basic solution is feasible; if in addition \boldsymbol c _ N^\mathsf T-\boldsymbol c _ B^\mathsf TB^{-1}N\geqslant0, then the corresponding basic solution is optimal.
If there is a nonbasic index j such that c _ j-\boldsymbol c _ B^\mathsf TB^{-1}\boldsymbol A _ j<0, then increasing the value of x _ j reduces the objective value.
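A small numpy sketch of this feasibility/optimality check for a given basis (the data is the standard-form LP of the numerical example below; the helper name is ours):

```python
import numpy as np

def check_basis(A, b, c, basis):
    """Check primal feasibility and optimality of a basis (list of m column indices)."""
    B = A[:, basis]
    x_B = np.linalg.solve(B, b)           # B^{-1} b
    p = np.linalg.solve(B.T, c[basis])    # p^T = c_B^T B^{-1}
    reduced = c - A.T @ p                 # c_j - c_B^T B^{-1} A_j for every j
    return np.all(x_B >= 0), np.all(reduced >= -1e-12)

A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])
print(check_basis(A, b, c, [1, 0]))   # (True, True): basis {x2, x1} is optimal
```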
Simplex tableau
Basis | \boldsymbol x _ B | \boldsymbol x _ N | Solution |
---|---|---|---|
z | 0 | \boldsymbol c _ B^\mathsf TB^{-1}N-\boldsymbol c _ N^\mathsf T | \boldsymbol c _ B^\mathsf TB^{-1}\boldsymbol b |
\boldsymbol x _ B | I _ m | B^{-1}N | B^{-1}\boldsymbol b |
This table is the simplex tableau of a standard LP, which encodes the following equations:
\begin{aligned}z+(\boldsymbol c_B^\mathsf TB^{-1}N-\boldsymbol c_N^\mathsf T)\boldsymbol x_N&=\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol b,\\\boldsymbol x_B+B^{-1}N\boldsymbol x_N&=B^{-1}\boldsymbol b.\end{aligned}
In practice we write out all the components, which is convenient for hand calculation and for implementing the algorithm.
Basis | \boldsymbol x _ B | \boldsymbol x _ N | Solution |
---|---|---|---|
z | 0 | \begin{matrix}r _ 1&\dots&r _ {n-m}\end{matrix} | z _ 0 |
\boldsymbol x _ B | I _ m | \begin{matrix}d _ {11}&\dots&d _ {1,n-m}\\\vdots&\ddots&\vdots\\d _ {m1}&\dots&d _ {m,n-m}\end{matrix} | \begin{matrix}f _ 1\\\vdots\\f _ m\end{matrix} |
Simplex method: algorithm for a standard LP
- If r _ 1\leqslant0,\, \dots,\, r _ {n-m}\leqslant0, then the tableau is optimal. The corresponding optimal solution is \boldsymbol x _ B=(f _ 1,\dots,f _ m), \boldsymbol x _ N=\boldsymbol 0 and the optimal value is z _ 0.
- Otherwise, if there is r _ k>0 whose column satisfies d _ {1k},\dots,d _ {mk}\leqslant0, then the optimal objective value is -\infty.
- Otherwise, choose some r _ k>0 and find the i that minimizes \dfrac{f _ i}{d _ {ik}} over those i with d _ {ik}>0.
- Apply the Gauss-Jordan exchange with d _ {ik} as the pivot element, that is:
Perform elementary row operations on the simplex tableau so that the (m+k)-th column becomes \boldsymbol e _ {i+1}\in\mathbb R^{m+1}.
Let me briefly explain the idea of this algorithm. The simplex method starts with an initial basic feasible solution. If all the r are nonpositive, then moving in any available direction cannot reduce the objective function, so the current solution is locally optimal, and hence also globally optimal. Step 2 means that we can move in the corresponding direction without ever stopping, so the objective function keeps decreasing and the minimum is -\infty. As for Step 3, we choose a k and increase the value of the corresponding nonbasic x, which lowers the objective value; at the same time the values of the basic x change so that the constraints continue to hold. We also find i in Step 3 because we eventually reach a point where one of the basic x becomes 0 and we cannot continue moving in this direction; otherwise that x would become negative, violating the nonnegativity constraints. The point we reach is in fact another basic feasible solution: one basic x has become 0 and one nonbasic x has become positive (from 0), so we can say we have changed the basis. Step 4 can be regarded as a kind of normalization that restores the tableau form for the new basis.
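The following is a compact sketch of these steps in Python with numpy, operating directly on the full tableau described above. It is a teaching sketch, not a robust implementation: the entering column is the smallest index with r _ k>0 (a Bland-style choice; see the pivoting rules below), ties in the ratio test are broken by the first row, and the initial tableau is assumed primal feasible.

```python
import numpy as np

def simplex(T, basis):
    """Simplex method on the full tableau T (z row first), in place.
    T has shape (m+1) x (n+1); basis lists the m basic variable indices."""
    m = T.shape[0] - 1
    while True:
        r = T[0, :-1]
        enter = next((j for j in range(len(r)) if r[j] > 1e-12), None)
        if enter is None:
            return T[0, -1], basis            # all r <= 0: tableau is optimal
        col = T[1:, enter]
        if np.all(col <= 1e-12):
            raise ValueError("optimal value is -inf")
        ratios = [T[1 + i, -1] / col[i] if col[i] > 1e-12 else np.inf
                  for i in range(m)]          # ratio test over d_ik > 0
        leave = int(np.argmin(ratios))
        basis[leave] = enter
        T[1 + leave] /= T[1 + leave, enter]   # Gauss-Jordan pivot: make the
        for row in range(m + 1):              # entering column equal e_{leave+1}
            if row != 1 + leave:
                T[row] -= T[row, enter] * T[1 + leave]

# Initial tableau of the numerical example below, with basis {x3, x4}:
T = np.array([[-1., 2., 0., 0., 0.],
              [-4., 6., 1., 0., 9.],
              [ 1., 1., 0., 1., 4.]])
print(simplex(T, [2, 3]))   # (-3.5, [1, 0]): minimum -7/2 with basis {x2, x1}
```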
Numerical example: Consider an LP
\begin{aligned}\min\quad&x_1-2x_2\\\text{s.t.}\quad& -4x_1+6x_2\leqslant 9,\\&x_1+x_2\leqslant 4,\\&x_1,x_2\geqslant0.\\\end{aligned}
The corresponding standard LP is
\begin{aligned}\min\quad&x_1-2x_2\\\text{s.t.}\quad& -4x_1+6x_2+x_3= 9,\\&x_1+x_2+x_4= 4,\\&x_1,\dots,x_4\geqslant0.\\\end{aligned}
The initial simplex tableau is
Basis | x_1 | x_2 | x_3 | x_4 | Solution |
---|---|---|---|---|---|
z | -1 | 2 | 0 | 0 | 0 |
x_3 | -4 | 6 | 1 | 0 | 9 |
x_4 | \phantom{-}1 | 1 | 0 | 1 | 4 |
We first find a positive r in the first row. The column of x_2 has a positive r, so we select x_2 to enter the basis. Now we compare f/d over the rows with positive d: since 9/6<4/1, we remove x_3 from the basis. We transform the simplex tableau into
Basis | x_1 | x_2 | x_3 | x_4 | Solution |
---|---|---|---|---|---|
z | \phantom{-}1/3 | \phantom{-}0 | -1/3 | 0 | -3 |
x_2 | -2/3 | \phantom{-}1 | \phantom{-}1/6 | 0 | 3/2 |
x_4 | \phantom{-}5/3 | \phantom{-}0 | -1/6 | 1 | 5/2 |
Similarly, now we remove x_4 from the basis and add x_1 to the basis.
Basis | x_1 | x_2 | x_3 | x_4 | Solution |
---|---|---|---|---|---|
z | 0 | 0 | -3/10 | -1/5 | -7/2 |
x_2 | 0 | 1 | \phantom{-}1/10 | \phantom{-}2/5 | \phantom{-}5/2 |
x_1 | 1 | 0 | -1/10 | \phantom{-}3/5 | \phantom{-}3/2 |
We find that the optimality condition is satisfied. Thus the optimal solution is (x_1,x_2)=(3/2,5/2) (with x_3=x_4=0) and the minimum of the original LP is -7/2.
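As a quick sanity check (assuming scipy is available), the same LP can be handed to scipy.optimize.linprog in its original inequality form:

```python
from scipy.optimize import linprog

# min x1 - 2*x2  s.t.  -4x1 + 6x2 <= 9,  x1 + x2 <= 4,  x1, x2 >= 0
res = linprog(c=[1, -2], A_ub=[[-4, 6], [1, 1]], b_ub=[9, 4],
              bounds=[(0, None), (0, None)])
print(res.x, res.fun)   # approximately [1.5, 2.5] and -3.5
```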
Theorem 4: Assume that the feasible set is nonempty and every basic feasible solution is nondegenerate. Then, the simplex method terminates after a finite number of iterations.
Degeneracy
Theorem 4 assumes that all basic feasible solutions are nondegenerate. Suppose now that the exact same algorithm is used in the presence of degeneracy. Then, the following new possibilities may be encountered in the course of the algorithm.
- If the current basic feasible solution \boldsymbol x is degenerate, the step size can be equal to zero, in which case the new basic feasible solution \boldsymbol y is the same point as \boldsymbol x. Nevertheless, we still change the basis and continue the algorithm.
- We move to a basic feasible solution that is degenerate, from a nondegenerate one.
On the one hand, basis changes while staying at the same basic feasible solution are not in vain. On the other hand, a sequence of basis changes might lead back to the initial basis, in which case the algorithm may loop indefinitely. This undesirable phenomenon is called cycling. Cycling can, however, be avoided by using a suitable pivoting rule.
Here we introduce two pivoting rules. Cycling can be avoided if we use the second one.
Dantzig’s pivoting rule (for standard LP): 1. Entering variable: among the columns with r _ k>0, choose the one with the largest r _ k to enter the basis; 2. Leaving variable: among the rows attaining the minimum ratio, choose the basic variable with the smallest index to leave the basis.
Bland’s pivoting rule (for standard LP): 1. Entering variable: among the columns with r _ k>0, choose the one with the smallest index to enter the basis; 2. Leaving variable: among the rows attaining the minimum ratio, choose the basic variable with the smallest index to leave the basis.
Bland's pivoting rule is also known as the smallest subscript pivoting rule.
Theorem 5: The simplex method with Bland’s pivoting rule always terminates in a finite number of steps.
Useful techniques for simplex method
We first multiply some of the constraints by -1 so that \boldsymbol b\geqslant0.
Big M technique
Suppose M is a sufficiently large positive constant. First solve a modified LP whose objective function is \boldsymbol c^\mathsf T\boldsymbol x+MR _ 1+\dots+MR _ t, where a new (artificial) variable R _ i is added to the left side of the i-th of the t equality constraints, and R _ 1,\dots,R _ t\geqslant0.
From the construction we can see that if (\boldsymbol x,\boldsymbol 0) is a basic feasible solution of the modified LP, then \boldsymbol x is a basic feasible solution of the original LP. Moreover, if the original LP has at least one feasible solution, then every optimal solution (\boldsymbol x^\ast,\boldsymbol R^\ast) of the modified LP must satisfy R _ 1^\ast=\dots=R _ t^\ast=0; otherwise the original problem is infeasible.
Two-phase technique
Phase I is to solve another LP whose constraints and variables are the same as those of the big M technique above, but whose objective function is R _ 1+\dots+R _ t.
If the optimal solution satisfies R _ 1=\dots=R _ t=0, then go to Phase II. Otherwise, the initial problem is clearly infeasible.
Phase II is to solve the original LP starting from the basic feasible solution \boldsymbol x^\ast obtained in Phase I.
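A tiny illustration, with constraints invented here purely to show the construction: for the system x _ 1+2x _ 2=3,\ 4x _ 1+x _ 2=5 with \boldsymbol x\geqslant0, Phase I is
\begin{aligned}\min\quad &R _ 1+R _ 2\\\text{s.t.}\quad&x _ 1+2x _ 2+R _ 1=3,\\&4x _ 1+x _ 2+R _ 2=5,\\&x _ 1,x _ 2,R _ 1,R _ 2\geqslant0,\end{aligned}
with the obvious initial basic feasible solution (R _ 1,R _ 2)=(3,5). Since \boldsymbol x=(1,1) solves the system, the Phase I optimal value is 0 and Phase II can start from the corresponding basis.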
The number of iterations is the same for the big M technique and the two-phase technique. It is said that the advantage of the two-phase technique rests in eliminating the manipulation of the large constant M.
Simplex method for maximization
The simplex method is easily adapted to maximization problems. The main difference is the sign of \boldsymbol r in all steps.
- If \boldsymbol r\geqslant0, then the tableau is optimal.
- Otherwise, if there is r _ k<0 and the column \boldsymbol d _ {\cdot k}\leqslant0, then the optimal value is +\infty.
- Otherwise, choose r _ k<0 and ...
Duality theory
Duality is a powerful theoretical tool that has numerous applications, provides new geometric insights, and leads to another algorithm for linear programming (the dual simplex method).
Motivation
Duality theory can be motivated as an outgrowth of the Lagrange multiplier method, often used in calculus to minimize a function subject to equality constraints. For example, to minimize x^2+y^2 subject to x+y=1, we introduce the Lagrangian L(x,y,p)=x^2+y^2-p(x+y-1). If (x _ 0,y _ 0) is optimal for the original constrained problem, then there is a p such that (x _ 0,y _ 0) is a stationary point of L(x,y,p). Setting the partial derivatives in x and y to zero gives x _ 0=y _ 0=p/2, and the original constraint x+y=1 gives the additional relation p=1, so the optimal solution to the original problem is x=y=1/2.
The main idea in the above example is the following. Instead of enforcing the hard constraint x + y = 1, we allow it to be violated and associate a Lagrange multiplier, or price, p with the amount 1-x-y by which it is violated. This leads to the unconstrained minimization. When the price is properly chosen (p = 1, in our example), the optimal solution to the constrained problem is also optimal for the unconstrained problem. In particular, under that specific value of p, the presence or absence of the hard constraint does not affect the optimal cost.
The situation in linear programming is similar.
Consider the standard form problem
\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\\text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\&\boldsymbol x\geqslant0,\end{aligned}
which we call the primal problem, and let \boldsymbol x^\ast be an optimal solution, assumed to exist. We introduce a relaxed problem in which the constraint A\boldsymbol x=\boldsymbol b is replaced by a penalty \boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x).
\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x+\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x)\\\text{s.t.}\quad&\boldsymbol x\geqslant0.\end{aligned}
Let g(\boldsymbol p) be the optimal cost for the relaxed problem, as a function of the price vector \boldsymbol p. The relaxed problem allows for more options than those present in the primal problem, and we expect g(\boldsymbol p) to be no larger than the optimal primal cost \boldsymbol c^\mathsf T\boldsymbol x^\ast.
g(\boldsymbol p)\leqslant\boldsymbol c^\mathsf T\boldsymbol x^\ast+\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x^\ast)=\boldsymbol c^\mathsf T\boldsymbol x^\ast.
Thus, each \boldsymbol p leads to a lower bound g(\boldsymbol p) for the optimal cost \boldsymbol c^\mathsf T\boldsymbol x^\ast. The problem [maximize g(\boldsymbol p), subject to no constraint] can then be interpreted as a search for the tightest possible lower bound of this type, and is known as the dual problem. The main result in duality theory asserts that the optimal cost in the dual problem is equal to the optimal cost \boldsymbol c^\mathsf T\boldsymbol x^\ast in the primal. In other words, when the prices are chosen according to an optimal solution for the dual problem, the option of violating the constraints A\boldsymbol x=\boldsymbol b is of no value.
Now we calculate g(\boldsymbol p). g(\boldsymbol p)=\min _ {\boldsymbol x\geqslant0}[\boldsymbol c^\mathsf T\boldsymbol x+\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x)]=\boldsymbol p^\mathsf T\boldsymbol b+\min _ {\boldsymbol x\geqslant0}(\boldsymbol c^\mathsf T-\boldsymbol p^\mathsf TA)\boldsymbol x. Therefore, it is not hard to see that g(\boldsymbol p)=\boldsymbol p^\mathsf T\boldsymbol b if \boldsymbol c^\mathsf T-\boldsymbol p^\mathsf TA\geqslant0, otherwise it is -\infty. In maximizing g(\boldsymbol p), we only need to consider those values of \boldsymbol p for which g(\boldsymbol p) is not equal to -\infty. The dual problem is the same as the LP
\begin{aligned}\max\quad &\boldsymbol p^\mathsf T\boldsymbol b\\\text{s.t.}\quad&\boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T.\end{aligned}
The dual problem
When the primal is not standard, the dual can be written through the following table:
Primal | minimize | maximize | Dual |
---|---|---|---|
constraints | \begin{matrix}\geqslant b _ i\\\leqslant b _ i\\=b _ i\end{matrix} | \begin{matrix}\geqslant 0\\\leqslant 0\\\text{free}\end{matrix} | variables |
variables | \begin{matrix}\geqslant 0\\\leqslant 0\\\text{free}\end{matrix} | \begin{matrix}\leqslant c _ j\\ \geqslant c _ j\\=c _ j\end{matrix} | constraints |
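For instance, applying this table to the nonstandard LP solved in the simplex section gives its dual:
\begin{aligned}\min\quad&x _ 1-2x _ 2\\\text{s.t.}\quad&-4x _ 1+6x _ 2\leqslant9,\\&x _ 1+x _ 2\leqslant4,\\&x _ 1,x _ 2\geqslant0,\end{aligned}\qquad\begin{aligned}\max\quad&9p _ 1+4p _ 2\\\text{s.t.}\quad&-4p _ 1+p _ 2\leqslant1,\\&6p _ 1+p _ 2\leqslant-2,\\&p _ 1,p _ 2\leqslant0.\end{aligned}
One may check that \boldsymbol p=(-3/10,-1/5) is dual feasible with objective value -7/2, matching the primal optimum found earlier; these values appear as the z-row entries under the slack columns x _ 3,x _ 4 in the final simplex tableau.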
Theorem 6: If we transform the dual into an equivalent minimization problem and then form its dual, we obtain a problem equivalent to the original problem.
A compact statement that is often used to describe Theorem 6 is that "the dual of the dual is the primal".
Theorem 7: Suppose that we have transformed a linear programming problem \Pi _ 1 to another linear programming problem \Pi _ 2, by a sequence of transformations of the following types:
- Replace a free variable with the difference of two nonnegative variables.
- Replace an inequality constraint by an equality constraint involving a nonnegative slack variable.
- If some row of the matrix A in a feasible standard form problem is a linear combination of the other rows, eliminate the corresponding equality constraint.
Then, the duals of \Pi _ 1 and \Pi _ 2 are equivalent, i.e., they are either both infeasible, or they have the same optimal cost.
The duality theorem
We generalize the result mentioned before into the following theorem.
Theorem 8 (Weak duality): If \boldsymbol x is a feasible solution to the primal problem and \boldsymbol p is a feasible solution to the dual problem, then
\boldsymbol p^\mathsf T\boldsymbol b\leqslant\boldsymbol c^\mathsf T\boldsymbol x.
The proof is simple if we note that \boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol p^\mathsf TA\boldsymbol x\leqslant\boldsymbol c^\mathsf T\boldsymbol x for the standard form problem. But for the purposes of later theorems, we prove it in another way. For any \boldsymbol p and \boldsymbol x, define u _ i=p _ i(\boldsymbol a _ i^\mathsf T\boldsymbol x-b _ i) and v _ j=(c _ j-\boldsymbol p^\mathsf T\boldsymbol A _ j)x _ j. By the definition of the dual, they are all nonnegative for feasible \boldsymbol x and \boldsymbol p, so
0\leqslant\sum\nolimits _ iu _ i+\sum\nolimits _ jv _ j=(\boldsymbol p^\mathsf TA\boldsymbol x-\boldsymbol p^\mathsf T\boldsymbol b)+(\boldsymbol c^\mathsf T\boldsymbol x-\boldsymbol p^\mathsf TA\boldsymbol x)=\boldsymbol c^\mathsf T\boldsymbol x-\boldsymbol p^\mathsf T\boldsymbol b.
The weak duality theorem is not a deep result, yet it does provide some useful information about the relation between the primal and the dual. We have, for example, the following corollary.
Corollary: If the optimal cost in the primal is -\infty, then the dual problem must be infeasible; and if the optimal cost in the dual is +\infty, then the primal problem must be infeasible.
Corollary: Let \boldsymbol x and \boldsymbol p be feasible solutions to the primal and the dual, respectively, and suppose that \boldsymbol p^\mathsf T\boldsymbol b = \boldsymbol c^\mathsf T\boldsymbol x. Then, \boldsymbol x and \boldsymbol p are optimal solutions to the primal and the dual, respectively.
The next theorem is the central result on linear programming duality.
Theorem 9 (Strong duality): If a linear programming problem has an optimal solution, so does its dual, and the respective optimal costs are equal.
Proof. Here we only consider the standard form problem; the rest is not hard to see, so it is omitted. Let us assume temporarily that the rows of A are linearly independent and that there exists an optimal solution. Let us apply the simplex method to this problem. As long as cycling is avoided, e.g., by using Bland’s pivoting rule, the simplex method terminates with an optimal solution \boldsymbol x and an optimal basis matrix B. Let \boldsymbol x _ B=B^{-1}\boldsymbol b be the corresponding vector of basic variables. When the simplex method terminates, we obtain \boldsymbol c^\mathsf T-\boldsymbol c^\mathsf T _ BB^{-1}A\geqslant0. Let us define a vector \boldsymbol p by letting \boldsymbol p^\mathsf T=\boldsymbol c^\mathsf T _ BB^{-1}; then \boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T, so \boldsymbol p is a feasible solution to the dual problem. In addition, \boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol c^\mathsf T _ BB^{-1}\boldsymbol b=\boldsymbol c^\mathsf T _ B\boldsymbol x _ B=\boldsymbol c^\mathsf T\boldsymbol x. It follows that \boldsymbol p is an optimal solution to the dual (cf. the Corollary above), and the optimal dual cost is equal to the optimal primal cost.
The following table shows the different possibilities for the primal (rows) and the dual (columns); the blank entries are impossible.
Primal / Dual | Finite optimum | Unbounded | Infeasible |
---|---|---|---|
Finite optimum | Possible | | |
Unbounded | | | Possible |
Infeasible | | Possible | Possible |
There is another interesting relation between the primal and the dual, known as Clark's theorem. It asserts that unless both problems are infeasible, at least one of them must have an unbounded feasible set.
Complementary slackness: An important relation between primal and dual optimal solutions is provided by the complementary slackness conditions, which we present next.
Theorem 10 (Complementary slackness): Let \boldsymbol x and \boldsymbol p be feasible solutions to the primal and the dual problem, respectively. The vectors \boldsymbol x and \boldsymbol p are optimal solutions for the two respective problems iff
\begin{aligned}p _ i(\boldsymbol a _ i^\mathsf T\boldsymbol x-b _ i)=0,\quad\forall i,\\(c _ j-\boldsymbol p^\mathsf T\boldsymbol A _ j)x _ j=0,\quad\forall j.\end{aligned}
Proof. In the proof of Theorem 8, we defined u _ i=p _ i(\boldsymbol a _ i^\mathsf T\boldsymbol x-b _ i) and v _ j=(c _ j-\boldsymbol p^\mathsf T\boldsymbol A _ j)x _ j, which are all nonnegative. In addition, we showed that \boldsymbol c^\mathsf T\boldsymbol x-\boldsymbol p^\mathsf T\boldsymbol b=\sum _ i u _ i+\sum _ j v _ j. By the strong duality theorem, if \boldsymbol x and \boldsymbol p are optimal, then \boldsymbol c^\mathsf T\boldsymbol x=\boldsymbol p^\mathsf T\boldsymbol b, which implies that u _ i=v _ j=0 for all i,j. Conversely, if all u _ i and v _ j are zero, then \boldsymbol c^\mathsf T\boldsymbol x=\boldsymbol p^\mathsf T\boldsymbol b, and by the corollary of Theorem 8, \boldsymbol x and \boldsymbol p are optimal.
An intuitive explanation of this theorem is that a constraint which is not active at an optimal solution can be removed from the problem without affecting the optimal cost, and there is no point in associating a nonzero price with such a constraint.
If the primal problem is in standard form (in which case the first condition is automatically satisfied) and a nondegenerate optimal basic feasible solution is known (in the degenerate case, these conditions may be of very little help in determining an optimal solution to the dual problem), the complementary slackness conditions determine a unique solution to the dual problem. Here is the reason: the second complementary slackness condition implies that \boldsymbol p^\mathsf T\boldsymbol A _ j=c _ j for every basic index j, and hence \boldsymbol p^\mathsf TB=\boldsymbol c _ B^\mathsf T. Since B is invertible, \boldsymbol p is uniquely determined.
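A short numpy illustration on the numerical example from the simplex section, recovering the unique dual solution from \boldsymbol p^\mathsf TB=\boldsymbol c _ B^\mathsf T:

```python
import numpy as np

A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
c = np.array([1.0, -2.0, 0.0, 0.0])
basis = [1, 0]                       # optimal basis {x2, x1} from the simplex run
B = A[:, basis]
p = np.linalg.solve(B.T, c[basis])   # solve B^T p = c_B
print(p)                             # [-0.3, -0.2]; dual objective 9p1+4p2 = -3.5
```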
We finally mention that if the primal constraints are of the form A\boldsymbol x\geqslant\boldsymbol b, \boldsymbol x\geqslant0, and the primal problem has an optimal solution, then there exist optimal solutions to the primal and the dual which satisfy strict complementary slackness; that is, a variable in one problem is nonzero iff the corresponding constraint in the other problem is active. This result has some interesting applications in discrete optimization, but these lie outside the scope of this appendix.
- Consider a standard LP and its dual. Assume that both problems have an optimal solution. Fix some j. Suppose that every optimal solution to the primal satisfies x _ j = 0. Then there exists an optimal solution \boldsymbol p to the dual such that \boldsymbol p^\mathsf T\boldsymbol A _ j<c _ j. (Hint: Let d be the optimal cost. Consider the problem of minimizing -x _ j subject to A\boldsymbol x=\boldsymbol b, \boldsymbol x\geqslant0, and -\boldsymbol c^\mathsf T\boldsymbol x\geqslant-d, and form its dual.)
- There exist optimal solutions \boldsymbol x and \boldsymbol p to the primal and to the dual, respectively, such that for every j we have either x _ j>0 or \boldsymbol p^\mathsf T\boldsymbol A _ j<c _ j. (Hint: Use part (1) for each j, and then take the average of the vectors obtained.)
- Consider now the following LP problem and its dual:
\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\\text{s.t.}\quad&A\boldsymbol x\geqslant\boldsymbol b,\\&\boldsymbol x\geqslant0,\end{aligned}\qquad\begin{aligned}\max\quad &\boldsymbol p^\mathsf T\boldsymbol b\\\text{s.t.}\quad&\boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T,\\&\boldsymbol p\geqslant0.\end{aligned}
Assume that both problems have an optimal solution. Then there exist optimal solutions to the primal and to the dual, respectively, that satisfy strict complementary slackness, that is:
(i) For every j we have either x_j>0 or \boldsymbol p^\mathsf T\boldsymbol A_j<c_j.
(ii) For every i, we have either \boldsymbol a _ i^\mathsf T\boldsymbol x>b _ i or p _ i>0.
(Hint: Convert the primal to standard form and apply part (2).)
Dual simplex method
Let us consider a problem in standard form, under the usual assumption that the rows of the matrix A are linearly independent, and consider the corresponding tableau. We do not require the last column to be nonnegative, which means that we have a basic, but not necessarily feasible, solution to the primal problem. However, we assume that the r are nonpositive; equivalently, the vector \boldsymbol p^\mathsf T=\boldsymbol c^\mathsf T _ BB^{-1} satisfies \boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T, and we have a feasible solution to the dual problem. If the inequality B^{-1}\boldsymbol b\geqslant0 happens to hold, we also have a primal feasible solution with the same cost, and optimal solutions to both problems have been found. If the inequality fails to hold, we perform a change of basis in a manner we describe next. After the change of basis, the r in the new tableau are still nonpositive, so dual feasibility is maintained; at the same time, the dual cost increases. We hope that B^{-1}\boldsymbol b\geqslant0 holds after finitely many iterations, at which point we obtain optimal solutions to both the dual and the primal problem.
An iteration for the dual simplex method:
- A typical iteration starts with the tableau associated with a basis matrix B and with all r nonpositive.
- If all the components f of the last column are nonnegative, we have an optimal basic feasible solution and the algorithm terminates; else, choose some i such that f _ i<0.
- Consider the i-th row; if d _ {ij}\geqslant0 for all j, then the optimal dual cost is +\infty and the algorithm terminates.
- Otherwise, among the indices j with d _ {ij}<0, choose the k that minimizes r _ j/d _ {ij}.
- Apply the Gauss-Jordan exchange with d _ {ik} as the pivot element.
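A sketch of one such iteration in Python with numpy, under the same tableau layout as before (an illustrative sketch, not a full implementation):

```python
import numpy as np

def dual_simplex_step(T, basis):
    """One dual simplex iteration on tableau T (z row first, all r <= 0 assumed).
    Returns False when the last column is already nonnegative (optimal)."""
    f = T[1:, -1]
    if np.all(f >= -1e-12):
        return False                      # primal feasible: optimal tableau
    i = int(np.argmin(f))                 # a row with f_i < 0
    row = T[1 + i, :-1]
    if np.all(row >= -1e-12):
        raise ValueError("optimal dual cost is +inf (primal infeasible)")
    candidates = [j for j in range(len(row)) if row[j] < -1e-12]
    k = min(candidates, key=lambda j: T[0, j] / row[j])   # minimize r_j / d_ij
    basis[i] = k
    T[1 + i] /= T[1 + i, k]               # Gauss-Jordan pivot on d_ik
    for r in range(T.shape[0]):
        if r != 1 + i:
            T[r] -= T[r, k] * T[1 + i]
    return True
```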
Let us now consider the possibility that some r is zero. In this case the algorithm may cycle. This can be avoided by employing a suitable anticycling rule.
Some extra notes
Farkas' theorem
Theorem 11 (Farkas' theorem): Let A\in\mathbb R^{m\times n} and \boldsymbol b\in\mathbb R^m. Then exactly one of the following two alternatives holds:
- There exists some \boldsymbol x\geqslant0 such that A\boldsymbol x=\boldsymbol b.
- There exists some \boldsymbol p such that \boldsymbol p^\mathsf TA\geqslant0 and \boldsymbol p^\mathsf T\boldsymbol b<0.
Proof. One direction is easy. If there exists some \boldsymbol x\geqslant0 satisfying A\boldsymbol x=\boldsymbol b, and if \boldsymbol p^\mathsf TA\geqslant0, then \boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol p^\mathsf TA\boldsymbol x\geqslant0, which shows that the second alternative cannot hold.
Conversely, suppose there is no \boldsymbol x\geqslant0 satisfying A\boldsymbol x=\boldsymbol b, and consider the pair of problems
\begin{aligned}\max\quad &\boldsymbol 0^\mathsf T\boldsymbol x\\\text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\&\boldsymbol x\geqslant0,\end{aligned}\qquad\begin{aligned}\min\quad &\boldsymbol p^\mathsf T\boldsymbol b\\\text{s.t.}\quad&\boldsymbol p^\mathsf TA\geqslant\boldsymbol 0.\\&\phantom{\boldsymbol p\geqslant0.}\end{aligned}
Note that the first is the dual of the second. The maximization problem is infeasible, which implies that the minimization problem is either unbounded or infeasible. Since \boldsymbol p=\boldsymbol0 is a feasible solution to the minimization problem, it follows that the minimization problem is unbounded. Therefore, there exists some \boldsymbol p which is feasible (\boldsymbol p^\mathsf TA\geqslant0) and \boldsymbol p^\mathsf T\boldsymbol b<0.
Compact simplex tableau
The simplex tableaux that we have been using so far are usually called full simplex tableaux. However, we can observe that the same amount of information will be retained if we delete the basic columns from the full simplex tableau to obtain a more compact simplex tableau.
Computational efficiency of the simplex method
This is determined by two factors. The first is the computational effort at each iteration, and the second is the number of iterations. Each iteration with the full simplex tableau needs O(mn) arithmetic operations. As for the number of iterations, there exists an LP (the Klee-Minty example) and a pivoting rule under which the simplex method requires 2^n-1 changes of basis before it terminates.
Other methods
There are some other algorithms not mentioned here, such as the ellipsoid method, and the interior point methods.