
Linear programming

A standard linear program refers to an LP in the form:
$$\begin{aligned} \min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\ \text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\ &\boldsymbol x\geqslant0. \end{aligned}$$

A nonstandard LP can always be transformed into an equivalent standard LP by adding new variables. For example, $x_1\leqslant1$ is equivalent to $x_1+x_2=1$ with $x_2\geqslant0$.
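As a quick illustration, here is a minimal sketch (in Python with NumPy; `to_standard_form` is a hypothetical helper, not from the text) of this transformation for $\leqslant$-constraints with nonnegative variables:

```python
import numpy as np

# Hypothetical helper: convert  min c^T x  s.t.  A_ub x <= b_ub, x >= 0
# into standard form  min c'^T x'  s.t.  A x' = b, x' >= 0,
# by appending one slack variable per inequality.
def to_standard_form(c, A_ub, b_ub):
    m = A_ub.shape[0]
    A = np.hstack([A_ub, np.eye(m)])          # slack columns: identity block
    c_std = np.concatenate([c, np.zeros(m)])  # slacks carry zero cost
    return c_std, A, b_ub

# The x_1 <= 1 example above becomes x_1 + x_2 = 1 with x_2 >= 0:
c_std, A, b = to_standard_form(np.array([1.0]), np.array([[1.0]]), np.array([1.0]))
print(A)   # [[1. 1.]]
```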

Polyhedral set

The feasible region of an LP problem is a polyhedral set (polyhedron), that is, a set of the form $\{\boldsymbol x\mid A\boldsymbol x\leqslant\boldsymbol b\}$.

We can find the corner points of a polyhedral set in the following way. First impose the equality constraints, and then require that enough additional constraints be active so that we get a total of $n$ linearly independent active constraints, which determine a unique $\boldsymbol x$. We call such an $\boldsymbol x$ a basic solution. If it also satisfies all the remaining constraints, it is called a basic feasible solution.

Definition: Consider a polyhedral set and a point $\boldsymbol x$. Then $\boldsymbol x$ is a basic solution if: all equality constraints are active, and out of the constraints that are active at $\boldsymbol x$, there are $n$ of them that are linearly independent.

For a polyhedral set, the concepts of vertex (the usual geometric definition: a point at which some hyperplane meets the polyhedron only at that point, with the polyhedron on one side), extreme point (a point that cannot be represented as a convex combination of two other points in the polyhedron), and basic feasible solution (a point in the set at which the equality constraints are active and $n$ of the active constraints are linearly independent) all coincide.

Consider the polyhedral set $S=\{\boldsymbol x\in\mathbb R^n\mid A\boldsymbol x=\boldsymbol b,\ \boldsymbol x\geqslant0\}$ (we assume that $A$ has full row rank $m$). Denote $A=(\boldsymbol A_1,\dots,\boldsymbol A_n)$, where $\boldsymbol A_i$ is the $i$-th column vector of $A$.

Theorem 1 (characterization of basic solutions): $\boldsymbol x$ is a basic solution iff $A\boldsymbol x=\boldsymbol b$ and there exist indices $i_1,\dots,i_m$ such that the columns $\boldsymbol A_{i_1},\dots,\boldsymbol A_{i_m}$ are linearly independent and $x_i=0$ for all $i\notin\{i_1,\dots,i_m\}$.

According to this theorem, we have:

Procedure for constructing basic solutions: choose $m$ linearly independent columns of $A$, set $x_i=0$ for the column indices $i$ not selected, and then solve $A\boldsymbol x=\boldsymbol b$.
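This procedure is easy to sketch in code. The following hypothetical `basic_solutions` helper (assuming NumPy; not from the text) enumerates all choices of $m$ columns for a small example:

```python
import itertools
import numpy as np

# Sketch of the procedure above: for each choice of m columns of A, check
# linear independence, set the unselected variables to zero, and solve Ax = b.
def basic_solutions(A, b):
    m, n = A.shape
    sols = []
    for idx in itertools.combinations(range(n), m):
        B = A[:, idx]
        if np.linalg.matrix_rank(B) < m:
            continue                      # columns not linearly independent
        x = np.zeros(n)
        x[list(idx)] = np.linalg.solve(B, b)
        sols.append(x)
    return sols

# The ones with x >= 0 are the basic feasible solutions.
A = np.array([[-4.0, 6.0, 1.0, 0.0],
              [ 1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
for x in basic_solutions(A, b):
    print(x, "feasible" if (x >= -1e-9).all() else "infeasible")
```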

Theorem 1': $\boldsymbol x\in S$ is a basic feasible solution iff $\boldsymbol A_{i_1},\dots,\boldsymbol A_{i_k}$ are linearly independent, where $i_1,\dots,i_k$ are the indices for which $x_{i_j}>0$.

Optimality conditions

Theorem 2: For a standard LP, if $S$ is nonempty and bounded, then at least one of its vertices is an optimal point of the LP problem.

Theorem 2': Suppose $S$ has at least one vertex and there exists an optimal solution. Then there exists an optimal solution which is a vertex of $S$.

Theorem 2'': Suppose $S$ has at least one extreme point. Then either the optimal cost is $-\infty$, or there exists a vertex which is optimal.

Corollary: Consider an LP of minimizing a linear cost over a nonempty polyhedral set. Then either the optimal cost is $-\infty$ or there exists an optimal solution.

We call a nonzero vector $\boldsymbol d$ a direction of a convex set $X$ if $\{\boldsymbol x+\lambda\boldsymbol d\mid\lambda\geqslant0\}\subseteq X$ for every $\boldsymbol x\in X$. If $S=\{\boldsymbol x\in\mathbb R^n\mid A\boldsymbol x=\boldsymbol b,\ \boldsymbol x\geqslant0\}$, then it is clear that $\boldsymbol d$ is a direction of $S$ iff
$$A\boldsymbol d=\boldsymbol 0,\quad \boldsymbol d\geqslant0,\quad \boldsymbol d\neq\boldsymbol 0.$$
The extreme points of $\{\boldsymbol d\mid A\boldsymbol d=\boldsymbol 0,\ \boldsymbol d\geqslant0,\ \boldsymbol 1^\mathsf T\boldsymbol d=1\}$ are called the extreme directions of $S$. The polyhedral set $S$ can only have finitely many extreme directions.

Theorem 3: Consider a standard LP. Denote the extreme directions of $S$ as $\boldsymbol d_1,\dots,\boldsymbol d_l$. If $\boldsymbol c^\mathsf T\boldsymbol d_j\geqslant0$ for all $j=1,\dots,l$, then at least one extreme point of $S$ is an optimal solution. Otherwise, the optimal objective value is $-\infty$.

Simplex method

Introduction to the simplex method

Based on Theorem 2'', the simplex method searches for an optimal solution by moving from one basic feasible solution to another, along the edges of the feasible set, always in a cost reducing direction. Eventually, a basic feasible solution is reached at which none of the available edges leads to a cost reduction; such a basic feasible solution is optimal and the algorithm terminates.

Why does this method work? Many optimization algorithms are structured as follows: given a feasible solution, we search its neighborhood to find a nearby feasible solution with lower cost. If no nearby feasible solution leads to a cost improvement, the algorithm terminates and we have a locally optimal solution. For general optimization problems, a locally optimal solution need not be (globally) optimal. Fortunately, in linear programming, local optimality implies global optimality; this is because we are minimizing a convex function over a convex set.

Equivalent LP

Consider the standard LP:
$$\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\ \text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\&\boldsymbol x\geqslant0.\end{aligned}$$

Let $B$ be a basis matrix corresponding to $\{i_1,\dots,i_m\}$, i.e.,
$$B=(\boldsymbol A_{i_1},\dots,\boldsymbol A_{i_m}),\quad\operatorname{rank}(B)=m,$$

and let $N$ be the nonbasic matrix (the remaining columns). The LP can be written as
$$\begin{aligned} \min\quad &\boldsymbol c_B^\mathsf T\boldsymbol x_B+\boldsymbol c_N^\mathsf T\boldsymbol x_N\\ \text{s.t.}\quad&\boldsymbol x_B+B^{-1}N\boldsymbol x_N=B^{-1}\boldsymbol b,\\ &\boldsymbol x_B\geqslant0,\ \boldsymbol x_N\geqslant0. \end{aligned}$$

Eliminating $\boldsymbol x_B$, we obtain
$$\begin{aligned} \min\quad &(\boldsymbol c_N^\mathsf T-\boldsymbol c_B^\mathsf TB^{-1}N)\boldsymbol x_N+\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol b\\ \text{s.t.}\quad& B^{-1}N\boldsymbol x_N\leqslant B^{-1}\boldsymbol b,\\ &\boldsymbol x_N\geqslant0. \end{aligned}$$

Hence:

  1. If $B^{-1}\boldsymbol b\geqslant0$, then $\boldsymbol 0\in\mathbb R^{n-m}$ is a feasible solution of the above LP.
  2. If in addition $\boldsymbol c_N^\mathsf T-\boldsymbol c_B^\mathsf TB^{-1}N\geqslant0$, then $\boldsymbol 0\in\mathbb R^{n-m}$ is an optimal solution of the above LP.

The feasibility and optimality conditions: if $B^{-1}\boldsymbol b\geqslant0$, then the corresponding basic solution is feasible. If in addition $\boldsymbol c_N^\mathsf T-\boldsymbol c_B^\mathsf TB^{-1}N\geqslant0$, then the corresponding basic solution is optimal.

If there is a nonbasic index $j$ such that $c_j-\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol A_j<0$, then increasing the value of $x_j$ reduces the objective value.
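The feasibility and optimality tests above can be sketched as follows; `reduced_costs` is a hypothetical helper (assuming NumPy, not from the text) that computes $B^{-1}\boldsymbol b$ and the reduced costs $c_j-\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol A_j$ for a given basis:

```python
import numpy as np

# Sketch: given a basis (list of m column indices), compute the basic solution
# and the reduced costs  c_j - c_B^T B^{-1} A_j  for the nonbasic columns.
def reduced_costs(A, b, c, basis):
    B = A[:, basis]
    x_B = np.linalg.solve(B, b)          # B^{-1} b
    y = np.linalg.solve(B.T, c[basis])   # y^T = c_B^T B^{-1}
    nonbasic = [j for j in range(A.shape[1]) if j not in basis]
    r = {j: c[j] - y @ A[:, j] for j in nonbasic}
    feasible = (x_B >= 0).all()
    optimal = feasible and all(v >= 0 for v in r.values())
    return x_B, r, feasible, optimal

# Small example (the slack basis of a standard-form LP):
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])
x_B, r, feasible, optimal = reduced_costs(A, b, c, [2, 3])
print(feasible, optimal, r)   # feasible but not optimal: r[1] = -2 < 0
```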

Simplex tableau

       $\boldsymbol x_B$    $\boldsymbol x_N$    Solution
$z$    $\boldsymbol 0$    $\boldsymbol c_B^\mathsf TB^{-1}N-\boldsymbol c_N^\mathsf T$    $\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol b$
$\boldsymbol x_B$    $I_m$    $B^{-1}N$    $B^{-1}\boldsymbol b$

This table is the simplex tableau of a standard LP, which encodes the following equations:
$$\begin{aligned}z+(\boldsymbol c_B^\mathsf TB^{-1}N-\boldsymbol c_N^\mathsf T)\boldsymbol x_N&=\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol b,\\\boldsymbol x_B+B^{-1}N\boldsymbol x_N&=B^{-1}\boldsymbol b.\end{aligned}$$
In practice we prefer to write out all components, for convenience of calculation and implementation of the algorithm.

       $\boldsymbol x_B$    $\boldsymbol x_N$    Solution
$z$    $\boldsymbol 0$    $r_1\ \dots\ r_{n-m}$    $z_0$
$\boldsymbol x_B$    $I_m$    $\begin{matrix}d_{11}&\dots&d_{1,n-m}\\\vdots&\ddots&\vdots\\d_{m1}&\dots&d_{m,n-m}\end{matrix}$    $\begin{matrix}f_1\\\vdots\\f_m\end{matrix}$

Simplex method: algorithm for a standard LP

  1. If $r_1\leqslant0,\dots,r_{n-m}\leqslant0$, then the tableau is optimal. The corresponding optimal solution is $\boldsymbol x_B=(f_1,\dots,f_m)$, $\boldsymbol x_N=\boldsymbol 0$, and the optimal value is $z_0$.
  2. Otherwise, if there is $r_k>0$ whose column satisfies $d_{1k},\dots,d_{mk}\leqslant0$, then the optimal value is $-\infty$.
  3. Otherwise, choose $r_k>0$ and find $i$ that minimizes $\dfrac{f_i}{d_{ik}}$ among the rows with $d_{ik}>0$.
  4. Apply a Gauss-Jordan exchange with $d_{ik}$ as the pivot element, that is:
    perform elementary row operations on the simplex tableau so that the $(m+k)$-th column becomes $\boldsymbol e_{i+1}\in\mathbb R^{m+1}$.

Let me briefly explain the idea of this algorithm. The simplex method starts with an initial basic feasible solution. If all the $r_k$ are nonpositive, then moving in any feasible direction cannot reduce the objective function, so this solution is locally optimal, and hence globally optimal. Step 2 means that we can move in the corresponding direction without ever stopping, so the objective keeps decreasing and the minimum is $-\infty$. In step 3, we choose a $k$ and increase the corresponding $x_k$, which lowers the objective value, while simultaneously adjusting the basic variables so that the constraints continue to hold. We also find $i$ in step 3 because eventually we reach a point where one of the basic variables becomes $0$, and we cannot continue in this direction, otherwise that variable would become negative and violate the nonnegativity constraints. This new point is again a basic feasible solution. Now one basic variable has become $0$ and one nonbasic variable has become positive (from $0$), so we have changed the basis. Step 4 can be regarded as a kind of normalization.
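The four steps can be sketched as a short program. This is only a didactic sketch (assuming NumPy and an initial basic feasible basis being supplied), not a robust implementation; as a simple anti-cycling choice it always picks the smallest improving index:

```python
import numpy as np

# Didactic sketch of the tableau algorithm for a standard-form minimization.
# We store r_j = c_B^T B^{-1} A_j - c_j, matching the tableau's z-row, so
# r_k > 0 marks a column whose increase reduces the objective.
def simplex(A, b, c, basis):
    m, n = A.shape
    basis = list(basis)
    while True:
        Binv = np.linalg.inv(A[:, basis])
        f = Binv @ b                      # current values of the basic variables
        r = c[basis] @ Binv @ A - c       # z-row entries
        r[basis] = 0.0                    # zero on the basic columns
        if (r <= 1e-9).all():             # step 1: optimal tableau
            x = np.zeros(n)
            x[basis] = f
            return x, c @ x
        k = int(np.argmax(r > 1e-9))      # smallest improving index
        d = Binv @ A[:, k]
        if (d <= 1e-9).all():             # step 2: unbounded, cost -> -inf
            return None, -np.inf
        ratios = [f[i] / d[i] if d[i] > 1e-9 else np.inf for i in range(m)]
        i = int(np.argmin(ratios))        # step 3: ratio test
        basis[i] = k                      # step 4: change of basis

# A small standard-form LP, starting from the slack basis:
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])
x, val = simplex(A, b, c, [2, 3])
print(x, val)   # optimal solution (3/2, 5/2, 0, 0) with value -7/2
```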


Numerical example: Consider the LP
$$\begin{aligned}\min\quad&x_1-2x_2\\\text{s.t.}\quad& -4x_1+6x_2\leqslant 9,\\&x_1+x_2\leqslant 4,\\&x_1,x_2\geqslant0.\end{aligned}$$
The corresponding standard LP is
$$\begin{aligned}\min\quad&x_1-2x_2\\\text{s.t.}\quad& -4x_1+6x_2+x_3= 9,\\&x_1+x_2+x_4= 4,\\&x_1,\dots,x_4\geqslant0.\end{aligned}$$
The initial simplex tableau is

       $x_1$    $x_2$    $x_3$    $x_4$    Solution
$z$    $-1$    $2$    $0$    $0$    $0$
$x_3$    $-4$    $6$    $1$    $0$    $9$
$x_4$    $1$    $1$    $0$    $1$    $4$

We first find a positive $r$ in the first row. The column of $x_2$ has a positive $r$, so we select $x_2$ to enter the basis. Now we compare $f/d$ over the rows with positive $d$: since $9/6<4/1$, we remove $x_3$ from the basis. We transform the simplex tableau into

       $x_1$    $x_2$    $x_3$    $x_4$    Solution
$z$    $1/3$    $0$    $-1/3$    $0$    $-3$
$x_2$    $-2/3$    $1$    $1/6$    $0$    $3/2$
$x_4$    $5/3$    $0$    $-1/6$    $1$    $5/2$

Similarly, now we remove x4x_4 from the basis and add x1x_1 to the basis.

       $x_1$    $x_2$    $x_3$    $x_4$    Solution
$z$    $0$    $0$    $-3/10$    $-1/5$    $-7/2$
$x_2$    $0$    $1$    $1/10$    $2/5$    $5/2$
$x_1$    $1$    $0$    $-1/10$    $3/5$    $3/2$

We find that the optimality condition is now satisfied. Thus the optimal solution is $(x_1,x_2)=(3/2,5/2)$ (with $x_3=x_4=0$) and the minimum of the original LP is $-7/2$.
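Assuming SciPy is available, this worked example can be cross-checked with an off-the-shelf solver; `linprog` takes the inequality form directly and uses nonnegative variables by default:

```python
from scipy.optimize import linprog

# The example LP: min x1 - 2*x2  s.t.  -4x1 + 6x2 <= 9,  x1 + x2 <= 4,  x >= 0.
res = linprog(c=[1, -2], A_ub=[[-4, 6], [1, 1]], b_ub=[9, 4], method="highs")
print(res.x, res.fun)   # optimal point (3/2, 5/2), optimal value -7/2
```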


Theorem 4: Assume that the feasible set is nonempty and every basic feasible solution is nondegenerate. Then, the simplex method terminates after a finite number of iterations.

Degeneracy

Theorem 4 is under the assumption that all basic feasible solutions are nondegenerate. Suppose now that the exact same algorithm is used in the presence of degeneracy. Then, the following new possibilities may be encountered in the course of the algorithm.

  1. If the current basic feasible solution $\boldsymbol x$ is degenerate, the step size of the move can be equal to zero, that is, we "move" to a new basic feasible solution $\boldsymbol y$ which is the same point as $\boldsymbol x$. Nevertheless, we still change the basis and continue the algorithm.
  2. We may move from a nondegenerate basic feasible solution to a degenerate one.

On the one hand, basis changes while staying at the same basic feasible solution are not in vain. On the other hand, a sequence of basis changes might lead back to the initial basis, in which case the algorithm may loop indefinitely. This undesirable phenomenon is called cycling. But we can avoid cycling by using a specific pivoting rule.

Here we introduce two pivoting rules. Cycling can be avoided if we use the second one.

Dantzig’s pivoting rule (for a standard LP): 1. Entering variable: choose the variable with the largest $r_k$ to enter the basis; 2. Leaving variable: choose the one with the smallest index to leave the basis.

Bland’s pivoting rule (for a standard LP): 1. Entering variable: among those with $r_k>0$, choose the one with the smallest index to enter the basis; 2. Leaving variable: choose the one with the smallest index to leave the basis.

Bland's pivoting rule is also known as the smallest subscript pivoting rule.

Theorem 5: The simplex method with Bland’s pivoting rule always terminates in a finite number of steps.

Useful techniques for the simplex method

We first multiply some of the constraints by $-1$ to ensure that $\boldsymbol b\geqslant0$.

Big $M$ technique

Suppose $M$ is a sufficiently large positive constant. We first solve a modified LP whose objective function is $\boldsymbol c^\mathsf T\boldsymbol x+MR_1+\dots+MR_t$, where the new artificial variables $R_1,\dots,R_t\geqslant0$ are inserted into the left sides of the $t$ equality constraints, one per equation.

From the construction we can see that if $(\boldsymbol x,\boldsymbol 0)$ is a basic feasible solution of the modified LP, then $\boldsymbol x$ is a basic feasible solution of the original LP. Moreover, if the original LP has at least one feasible solution, then an optimal solution $(\boldsymbol x^\ast,\boldsymbol R^\ast)$ of the modified LP must satisfy $R_1^\ast=\dots=R_t^\ast=0$. Otherwise, the original problem is infeasible.
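The construction can be sketched as follows (`big_m` is a hypothetical helper, not from the text; we assume $\boldsymbol b\geqslant0$ so that the artificial variables form a starting basic feasible solution):

```python
import numpy as np

# Sketch of the big-M construction: append one artificial variable per
# equality constraint and penalize them with a large constant M.
def big_m(A, b, c, M=1e6):
    m, n = A.shape
    A_mod = np.hstack([A, np.eye(m)])            # R_i enters the i-th equation
    c_mod = np.concatenate([c, M * np.ones(m)])  # artificials cost M each
    basis = list(range(n, n + m))                # artificials give a starting BFS
    return A_mod, c_mod, basis

A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])
A_mod, c_mod, basis = big_m(A, b, c)
print(A_mod.shape, basis)   # (2, 6) [4, 5]
```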

Two-phase technique

Phase I solves another LP whose constraints and variables are the same as those of the big $M$ technique above, but whose objective function is $R_1+\dots+R_t$.

If the optimal solution satisfies $R_1^\ast=\dots=R_t^\ast=0$, then go to Phase II. Otherwise, the initial problem is clearly infeasible.

Phase II solves the original LP starting from the $\boldsymbol x^\ast$ obtained in Phase I.
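Phase I can be sketched with an off-the-shelf solver (assuming SciPy), using the same augmented system as in the big $M$ technique but with cost only on the artificial variables:

```python
import numpy as np
from scipy.optimize import linprog

# Phase I: minimize R_1 + ... + R_t over the augmented system;
# a zero optimum certifies feasibility of the original constraints.
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
m, n = A.shape
A_aug = np.hstack([A, np.eye(m)])
c_phase1 = np.concatenate([np.zeros(n), np.ones(m)])  # cost only on the R_i
res = linprog(c=c_phase1, A_eq=A_aug, b_eq=b, method="highs")
print(res.fun)   # optimal cost 0 => the original LP is feasible
```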

The numbers of iterations in the big $M$ technique and the two-phase technique are the same. It is said that the advantage of the two-phase technique rests in eliminating manipulation of the large constant $M$.

Simplex method for maximization

The simplex method can be easily adapted to maximization problems. The main difference is the sign of $\boldsymbol r$ in all steps.

  1. If $\boldsymbol r\geqslant0$, then the tableau is optimal.
  2. Otherwise, if there is $r_k<0$ and the column $\boldsymbol d_{\cdot k}\leqslant0$, then the optimal value is $+\infty$.
  3. Otherwise, choose $r_k<0$ and ...

Duality theory

Duality is a powerful theoretical tool that has numerous applications, provides new geometric insights, and leads to another algorithm for linear programming (the dual simplex method).

Motivation

Duality theory can be motivated as an outgrowth of the Lagrange multiplier method, often used in calculus to minimize a function subject to equality constraints. For example, to minimize $x^2+y^2$ subject to $x+y=1$, we can use a Lagrange multiplier. Define $L(x,y,p)=x^2+y^2-p(x+y-1)$. There is a $p$ such that the minimizer $(x_0,y_0)$ of the original constrained problem is a stationary point of $L(x,y,p)$. Setting the partial derivatives to zero, we obtain $x_0=y_0=p/2$, and the original constraint $x+y=1$ gives us the additional relation $p=1$, so the optimal solution to the original problem is $x=y=1/2$.

The main idea in the above example is the following. Instead of enforcing the hard constraint $x+y=1$, we allow it to be violated and associate a Lagrange multiplier, or price, $p$ with the amount $1-x-y$ by which it is violated. This leads to an unconstrained minimization problem. When the price is properly chosen ($p=1$, in our example), the optimal solution to the constrained problem is also optimal for the unconstrained problem. In particular, under that specific value of $p$, the presence or absence of the hard constraint does not affect the optimal cost.

The situation in linear programming is similar.

Consider the standard form problem
$$\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\\text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\&\boldsymbol x\geqslant0,\end{aligned}$$

which we call the primal problem, and let $\boldsymbol x^\ast$ be an optimal solution, assumed to exist. We introduce a relaxed problem in which the constraint $A\boldsymbol x=\boldsymbol b$ is replaced by a penalty $\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x)$:
$$\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x+\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x)\\\text{s.t.}\quad&\boldsymbol x\geqslant0.\end{aligned}$$

Let $g(\boldsymbol p)$ be the optimal cost of the relaxed problem, as a function of the price vector $\boldsymbol p$. The relaxed problem allows for more options than those present in the primal problem, and we expect $g(\boldsymbol p)$ to be no larger than the optimal primal cost $\boldsymbol c^\mathsf T\boldsymbol x^\ast$:
$$g(\boldsymbol p)\leqslant\boldsymbol c^\mathsf T\boldsymbol x^\ast+\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x^\ast)=\boldsymbol c^\mathsf T\boldsymbol x^\ast.$$

Thus, each $\boldsymbol p$ leads to a lower bound $g(\boldsymbol p)$ for the optimal cost $\boldsymbol c^\mathsf T\boldsymbol x^\ast$. The problem of maximizing $g(\boldsymbol p)$ over all $\boldsymbol p$, with no constraints, can then be interpreted as a search for the tightest possible lower bound of this type, and is known as the dual problem. The main result in duality theory asserts that the optimal cost in the dual problem equals the optimal cost $\boldsymbol c^\mathsf T\boldsymbol x^\ast$ in the primal. In other words, when the prices are chosen according to an optimal solution for the dual problem, the option of violating the constraints $A\boldsymbol x=\boldsymbol b$ is of no value.

Now we calculate $g(\boldsymbol p)$: $g(\boldsymbol p)=\min_{\boldsymbol x\geqslant0}[\boldsymbol c^\mathsf T\boldsymbol x+\boldsymbol p^\mathsf T(\boldsymbol b-A\boldsymbol x)]=\boldsymbol p^\mathsf T\boldsymbol b+\min_{\boldsymbol x\geqslant0}(\boldsymbol c^\mathsf T-\boldsymbol p^\mathsf TA)\boldsymbol x$. It is not hard to see that $g(\boldsymbol p)=\boldsymbol p^\mathsf T\boldsymbol b$ if $\boldsymbol c^\mathsf T-\boldsymbol p^\mathsf TA\geqslant0$, and $g(\boldsymbol p)=-\infty$ otherwise. In maximizing $g(\boldsymbol p)$, we only need to consider those values of $\boldsymbol p$ for which $g(\boldsymbol p)$ is not equal to $-\infty$. The dual problem is therefore the LP
$$\begin{aligned}\max\quad &\boldsymbol p^\mathsf T\boldsymbol b\\\text{s.t.}\quad&\boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T.\end{aligned}$$
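Assuming SciPy, the primal-dual pair can be illustrated on the numerical example from the simplex section (the dual is solved by minimizing $-\boldsymbol b^\mathsf T\boldsymbol p$ with free $\boldsymbol p$):

```python
import numpy as np
from scipy.optimize import linprog

# Primal:  min c^T x  s.t.  Ax = b, x >= 0.
# Dual:    max p^T b  s.t.  p^T A <= c^T, p free.
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])

primal = linprog(c=c, A_eq=A, b_eq=b, method="highs")
# maximize p^T b == minimize -b^T p, with A^T p <= c and p unrestricted:
dual = linprog(c=-b, A_ub=A.T, b_ub=c, bounds=[(None, None)] * 2, method="highs")
print(primal.fun, -dual.fun)   # the two optimal costs coincide
```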

The dual problem

When the primal is not standard, the dual can be written through the following table:

Primal (minimize)    Dual (maximize)
constraint $\geqslant b_i$    variable $\geqslant 0$
constraint $\leqslant b_i$    variable $\leqslant 0$
constraint $=b_i$    variable free
variable $\geqslant 0$    constraint $\leqslant c_j$
variable $\leqslant 0$    constraint $\geqslant c_j$
variable free    constraint $=c_j$

Theorem 6: If we transform the dual into an equivalent minimization problem and then form its dual, we obtain a problem equivalent to the original problem.

A compact statement that is often used to describe Theorem 6 is that "the dual of the dual is the primal".

Theorem 7: Suppose that we have transformed a linear programming problem $\Pi_1$ into another linear programming problem $\Pi_2$ by a sequence of transformations of the following types:

  • Replace a free variable with the difference of two nonnegative variables.
  • Replace an inequality constraint by an equality constraint involving a nonnegative slack variable.
  • If some row of the matrix AA in a feasible standard form problem is a linear combination of the other rows, eliminate the corresponding equality constraint.

Then the duals of $\Pi_1$ and $\Pi_2$ are equivalent, i.e., they are either both infeasible, or they have the same optimal cost.

The duality theorem

We generalize the result mentioned before into the following theorem.

Theorem 8 (Weak duality): If $\boldsymbol x$ is a feasible solution to the primal problem and $\boldsymbol p$ is a feasible solution to the dual problem, then
$$\boldsymbol p^\mathsf T\boldsymbol b\leqslant\boldsymbol c^\mathsf T\boldsymbol x.$$
The proof would be simple if we just computed $\boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol p^\mathsf TA\boldsymbol x\leqslant\boldsymbol c^\mathsf T\boldsymbol x$. But for the purpose of some further theorems, we prove it in another way. For any $\boldsymbol p$ and $\boldsymbol x$, define $u_i=p_i(\boldsymbol a_i^\mathsf T\boldsymbol x-b_i)$ and $v_j=(c_j-\boldsymbol p^\mathsf T\boldsymbol A_j)x_j$. By the definition of the dual, they are all nonnegative, so
$$0\leqslant\sum\nolimits_i u_i+\sum\nolimits_j v_j=(\boldsymbol p^\mathsf TA\boldsymbol x-\boldsymbol p^\mathsf T\boldsymbol b)+(\boldsymbol c^\mathsf T\boldsymbol x-\boldsymbol p^\mathsf TA\boldsymbol x)=\boldsymbol c^\mathsf T\boldsymbol x-\boldsymbol p^\mathsf T\boldsymbol b.$$
The weak duality theorem is not a deep result, yet it does provide some useful information about the relation between the primal and the dual. We have, for example, the following corollary.

Corollary: If the optimal cost in the primal is $-\infty$, then the dual problem must be infeasible; and if the optimal cost in the dual is $+\infty$, then the primal problem must be infeasible.

Corollary: Let $\boldsymbol x$ and $\boldsymbol p$ be feasible solutions to the primal and the dual, respectively, and suppose that $\boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol c^\mathsf T\boldsymbol x$. Then $\boldsymbol x$ and $\boldsymbol p$ are optimal solutions to the primal and the dual, respectively.

The next theorem is the central result on linear programming duality.

Theorem 9 (Strong duality): If a linear programming problem has an optimal solution, so does its dual, and the respective optimal costs are equal.

Proof. Here we only consider the standard form problem; the rest is not hard to see, so it is omitted. Assume that the rows of $A$ are linearly independent and that there exists an optimal solution. Let us apply the simplex method to this problem. As long as cycling is avoided, e.g., by using Bland's pivoting rule, the simplex method terminates with an optimal solution $\boldsymbol x$ and an optimal basis matrix $B$. Let $\boldsymbol x_B=B^{-1}\boldsymbol b$ be the corresponding vector of basic variables. When the simplex method terminates, we have $\boldsymbol c^\mathsf T-\boldsymbol c_B^\mathsf TB^{-1}A\geqslant0$. Define a vector $\boldsymbol p$ by $\boldsymbol p^\mathsf T=\boldsymbol c_B^\mathsf TB^{-1}$; the inequality above says precisely that $\boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T$, so $\boldsymbol p$ is a feasible solution to the dual problem. In addition, $\boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol c_B^\mathsf TB^{-1}\boldsymbol b=\boldsymbol c_B^\mathsf T\boldsymbol x_B=\boldsymbol c^\mathsf T\boldsymbol x$. It follows that $\boldsymbol p$ is an optimal solution to the dual (cf. the Corollary above), and the optimal dual cost is equal to the optimal primal cost.
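The construction in the proof can be replayed numerically on the earlier example: take the optimal basis $\{x_2,x_1\}$ found by the simplex run and set $\boldsymbol p^\mathsf T=\boldsymbol c_B^\mathsf TB^{-1}$ (a sketch assuming NumPy):

```python
import numpy as np

# Worked example from the simplex section; basis columns are x_2 and x_1.
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])
basis = [1, 0]                        # optimal basis from the simplex run
B = A[:, basis]
p = np.linalg.solve(B.T, c[basis])    # p^T = c_B^T B^{-1}
x = np.zeros(4)
x[basis] = np.linalg.solve(B, b)      # x_B = B^{-1} b
print(p)                              # approximately (-3/10, -1/5)
print((p @ A <= c + 1e-9).all())      # dual feasibility: p^T A <= c^T
print(p @ b, c @ x)                   # equal costs, as the proof asserts
```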

                  Finite optimum    Unbounded    Infeasible
Finite optimum    Possible
Unbounded                                        Possible
Infeasible                          Possible     Possible

Here is a table about the different possibilities for the primal (rows) and the dual (columns). (The blank entries are impossible.)

There is another interesting relation between the primal and the dual, known as Clark's theorem. It asserts that unless both problems are infeasible, at least one of them must have an unbounded feasible set.

Complementary slackness: An important relation between primal and dual optimal solutions is provided by the complementary slackness conditions, which we present next.

Theorem 10 (Complementary slackness): Let $\boldsymbol x$ and $\boldsymbol p$ be feasible solutions to the primal and the dual problem, respectively. The vectors $\boldsymbol x$ and $\boldsymbol p$ are optimal solutions for the two respective problems iff
$$\begin{aligned}p_i(\boldsymbol a_i^\mathsf T\boldsymbol x-b_i)&=0,\quad\forall i,\\(c_j-\boldsymbol p^\mathsf T\boldsymbol A_j)x_j&=0,\quad\forall j.\end{aligned}$$
Proof. In the proof of Theorem 8, we defined $u_i=p_i(\boldsymbol a_i^\mathsf T\boldsymbol x-b_i)$ and $v_j=(c_j-\boldsymbol p^\mathsf T\boldsymbol A_j)x_j$, which are all nonnegative. In addition, we showed that $\boldsymbol c^\mathsf T\boldsymbol x-\boldsymbol p^\mathsf T\boldsymbol b=\sum_i u_i+\sum_j v_j$. By the strong duality theorem, if $\boldsymbol x$ and $\boldsymbol p$ are optimal, then $\boldsymbol c^\mathsf T\boldsymbol x=\boldsymbol p^\mathsf T\boldsymbol b$, which implies that $u_i=v_j=0$ for all $i,j$. Conversely, if all $u_i$ and $v_j$ are zero, then $\boldsymbol c^\mathsf T\boldsymbol x=\boldsymbol p^\mathsf T\boldsymbol b$, and by the corollary of Theorem 8, $\boldsymbol x$ and $\boldsymbol p$ are optimal.

An intuitive explanation of this theorem is that a constraint which is not active at an optimal solution can be removed from the problem without affecting the optimal cost, and there is no point in associating a nonzero price with such a constraint.

If the primal problem is in standard form (where the first condition is automatically satisfied) and a nondegenerate optimal basic feasible solution is known (otherwise, these conditions may be of very little help in determining an optimal solution to the dual problem), the complementary slackness conditions determine a unique solution to the dual problem. Here is the reason: the second complementary slackness condition implies that $\boldsymbol p^\mathsf T\boldsymbol A_j=c_j$ for every basic index $j$, and hence $\boldsymbol p^\mathsf TB=\boldsymbol c_B^\mathsf T$. Since $B$ is invertible, $\boldsymbol p$ is uniquely determined.
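For the earlier example, the complementary slackness conditions are easy to check numerically (a sketch assuming NumPy; in standard form $\boldsymbol a_i^\mathsf T\boldsymbol x=b_i$ holds automatically, so every $u_i=0$):

```python
import numpy as np

# Check v_j = (c_j - p^T A_j) x_j = 0 for the worked example's optimal pair.
A = np.array([[-4.0, 6.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0]])
b = np.array([9.0, 4.0])
c = np.array([1.0, -2.0, 0.0, 0.0])
x = np.array([1.5, 2.5, 0.0, 0.0])    # primal optimum
p = np.array([-0.3, -0.2])            # dual optimum
v = (c - p @ A) * x                   # elementwise: v_j = (c_j - p^T A_j) x_j
print(np.allclose(v, 0))              # every v_j vanishes
```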

We finally mention that if the primal constraints are of the form AxbA\boldsymbol x\geqslant\boldsymbol b, x0\boldsymbol x\geqslant0, and the primal problem has an optimal solution, then there exist optimal solutions to the primal and the dual which satisfy strict complementary slackness; that is, a variable in one problem is nonzero iff the corresponding constraint in the other problem is active. This result has some interesting applications in discrete optimization, but these lie outside the scope of this appendix.

  1. Consider a standard LP and its dual. Assume that both problems have an optimal solution. Fix some $j$. Suppose that every optimal solution to the primal satisfies $x_j=0$. Then there exists an optimal solution $\boldsymbol p$ to the dual such that $\boldsymbol p^\mathsf T\boldsymbol A_j<c_j$. (Hint: let $d$ be the optimal cost. Consider the problem of minimizing $-x_j$ subject to $A\boldsymbol x=\boldsymbol b$, $\boldsymbol x\geqslant0$, and $-\boldsymbol c^\mathsf T\boldsymbol x\geqslant-d$, and form its dual.)

  2. There exist optimal solutions $\boldsymbol x$ and $\boldsymbol p$ to the primal and to the dual, respectively, such that for every $j$ we have either $x_j>0$ or $\boldsymbol p^\mathsf T\boldsymbol A_j<c_j$. (Hint: use part (1) for each $j$, and then take the average of the vectors obtained.)

  3. Consider now the following LP problem and its dual:
$$\begin{aligned}\min\quad &\boldsymbol c^\mathsf T\boldsymbol x\\\text{s.t.}\quad&A\boldsymbol x\geqslant\boldsymbol b,\\&\boldsymbol x\geqslant0,\end{aligned}\qquad\begin{aligned}\max\quad &\boldsymbol p^\mathsf T\boldsymbol b\\\text{s.t.}\quad&\boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T,\\&\boldsymbol p\geqslant0.\end{aligned}$$

    Assume that both problems have an optimal solution. Then there exist optimal solutions to the primal and to the dual, respectively, that satisfy strict complementary slackness, that is:
    (i) For every $j$, we have either $x_j>0$ or $\boldsymbol p^\mathsf T\boldsymbol A_j<c_j$.
    (ii) For every $i$, we have either $\boldsymbol a_i^\mathsf T\boldsymbol x>b_i$ or $p_i>0$.
    (Hint: Convert the primal to standard form and apply part (2).)
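As a concrete sanity check of part (3), one can solve a small primal-dual pair numerically and verify both strict complementary slackness conditions. The sketch below uses `scipy.optimize.linprog` (assuming SciPy is available); the data $\boldsymbol c$, $A$, $\boldsymbol b$ are a made-up example whose primal and dual optima are unique and strictly complementary.

```python
import numpy as np
from scipy.optimize import linprog

# Primal: min c^T x  s.t.  A x >= b, x >= 0
# Dual:   max p^T b  s.t.  p^T A <= c^T, p >= 0
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

# linprog uses <= constraints, so negate A and b for the primal.
primal = linprog(c, A_ub=-A, b_ub=-b)      # unique optimum x = (1, 0)
# The dual max p^T b becomes min -b^T p with A^T p <= c, p >= 0.
dual = linprog(-b, A_ub=A.T, b_ub=c)       # unique optimum p = (1,)

x, p = primal.x, dual.x
# (i)  for every j: x_j > 0  or  p^T A_j < c_j
for j in range(len(c)):
    assert x[j] > 1e-9 or p @ A[:, j] < c[j] - 1e-9
# (ii) for every i: a_i^T x > b_i  or  p_i > 0
for i in range(len(b)):
    assert A[i] @ x > b[i] + 1e-9 or p[i] > 1e-9
```

Note that strict complementarity holds here for the particular solutions returned because both optima are unique; in general the theorem only guarantees that *some* optimal pair is strictly complementary.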

Dual simplex method

Let us consider a problem in standard form, under the usual assumption that the rows of the matrix $A$ are linearly independent, together with the corresponding tableau. We do not require the last column to be nonnegative, which means that we have a basic, but not necessarily feasible, solution to the primal problem. However, we assume that the reduced costs $r_j$ are nonpositive; equivalently, the vector $\boldsymbol p^\mathsf T=\boldsymbol c_B^\mathsf TB^{-1}$ satisfies $\boldsymbol p^\mathsf TA\leqslant\boldsymbol c^\mathsf T$, so that we have a feasible solution to the dual problem. If the inequality $B^{-1}\boldsymbol b\geqslant0$ happens to hold, we also have a primal feasible solution with the same cost, and optimal solutions to both problems have been found. If it fails to hold, we perform a change of basis in the manner described next. Afterwards, the reduced costs in the new tableau are still nonpositive, so dual feasibility is maintained; at the same time, the dual cost increases. Once $B^{-1}\boldsymbol b\geqslant0$ holds after some iteration, we have obtained optimal solutions to both the primal and the dual problem.

An iteration for the dual simplex method:

  1. A typical iteration starts with the tableau associated with a basis matrix $B$, with all reduced costs $r_j$ nonpositive.
  2. If all the components $f_i$ of the last column are nonnegative, we have an optimal basic feasible solution and the algorithm terminates; otherwise, choose some $i$ such that $f_i<0$.
  3. Consider the $i$-th row. If all of its entries satisfy $d_{ij}\geqslant0$, then the optimal dual cost is $+\infty$ and the algorithm terminates.
  4. Otherwise, among the indices $j$ with $d_{ij}<0$, find an index $k$ that minimizes the ratio $r_j/d_{ij}$.
  5. Apply a Gauss-Jordan exchange with $d_{ik}$ as the pivoting element.
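The iteration above can be sketched in a few lines of NumPy. This is a minimal, hypothetical implementation under the convention used here (reduced costs $r_j=\boldsymbol p^\mathsf T\boldsymbol A_j-c_j\leqslant0$); the function name, argument layout, and tolerances are my own choices, not a reference implementation.

```python
import numpy as np

def dual_simplex(D, f, r, basis, tol=1e-9):
    """Dual simplex on a tableau (D | f) with reduced-cost row r.

    Convention: r_j = p^T A_j - c_j <= 0 (dual feasible) throughout.
    `basis` lists the column index attached to each row of D.
    """
    D, f, r = (np.array(a, dtype=float) for a in (D, f, r))
    while np.any(f < -tol):                   # step 2: some f_i < 0
        i = int(np.argmin(f))
        if np.all(D[i] >= -tol):              # step 3: row nonnegative
            raise ValueError("primal infeasible; dual cost is +infinity")
        neg = np.flatnonzero(D[i] < -tol)     # step 4: ratio test
        k = neg[np.argmin(r[neg] / D[i, neg])]
        piv = D[i, k]                         # step 5: Gauss-Jordan pivot
        D[i] /= piv
        f[i] /= piv
        for t in range(D.shape[0]):
            if t != i:
                f[t] -= D[t, k] * f[i]
                D[t] -= D[t, k] * D[i]
        r = r - r[k] * D[i]                   # keeps r nonpositive
        basis[i] = k
    return D, f, r, basis

# Example: min 2*x1 + x2  s.t.  x1 + x2 >= 2, x >= 0.
# With surplus s: -x1 - x2 + s = -2; basis {s} is dual feasible, primal not.
D0 = [[-1.0, -1.0, 1.0]]
f0 = [-2.0]
r0 = [-2.0, -1.0, 0.0]          # r_j = -c_j for the starting basis
D, f, r, basis = dual_simplex(D0, f0, r0, basis=[2])
x = np.zeros(3)
x[basis] = f                    # optimal x = (0, 2, 0), cost 2
```

One pivot suffices in this example: the ratio test selects $x_2$, and the resulting basic solution is primal feasible, hence optimal.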

Let us now consider the possibility that some reduced cost $r_j$ is zero. In this case the algorithm may cycle; this can be avoided by employing a suitable anticycling rule.

Some extra notes

Farkas' theorem

Theorem 1 (Farkas' theorem): Let $A\in\mathbb R^{m\times n}$ and $\boldsymbol b\in\mathbb R^m$. Then exactly one of the following two alternatives holds:

  1. There exists some $\boldsymbol x\geqslant0$ such that $A\boldsymbol x=\boldsymbol b$.
  2. There exists some $\boldsymbol p$ such that $\boldsymbol p^\mathsf TA\geqslant0$ and $\boldsymbol p^\mathsf T\boldsymbol b<0$.

Proof. One direction is easy. If there exists some $\boldsymbol x\geqslant0$ satisfying $A\boldsymbol x=\boldsymbol b$, and if $\boldsymbol p^\mathsf TA\geqslant0$, then $\boldsymbol p^\mathsf T\boldsymbol b=\boldsymbol p^\mathsf TA\boldsymbol x\geqslant0$, which shows that the second alternative cannot hold.

Suppose now that there is no $\boldsymbol x\geqslant0$ satisfying $A\boldsymbol x=\boldsymbol b$. Consider the pair of problems
$$\begin{aligned}\max\quad &\boldsymbol 0^\mathsf T\boldsymbol x\\\text{s.t.}\quad&A\boldsymbol x=\boldsymbol b,\\&\boldsymbol x\geqslant0,\end{aligned}\qquad\begin{aligned}\min\quad &\boldsymbol p^\mathsf T\boldsymbol b\\\text{s.t.}\quad&\boldsymbol p^\mathsf TA\geqslant\boldsymbol 0^\mathsf T.\end{aligned}$$
Note that the first is the dual of the second. The maximization problem is infeasible, which implies that the minimization problem is either unbounded or infeasible. Since $\boldsymbol p=\boldsymbol 0$ is a feasible solution to the minimization problem, the minimization problem must be unbounded. Therefore, there exists some $\boldsymbol p$ which is feasible ($\boldsymbol p^\mathsf TA\geqslant0$) and satisfies $\boldsymbol p^\mathsf T\boldsymbol b<0$.
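The two alternatives can be checked numerically on a tiny instance. The example below (data of my own choosing) takes a system $A\boldsymbol x=\boldsymbol b$, $\boldsymbol x\geqslant0$ that is clearly infeasible and exhibits a Farkas certificate $\boldsymbol p$; `scipy.optimize.linprog` is assumed to be available for the feasibility check.

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0]])
b = np.array([-1.0])     # x1 + x2 = -1 has no solution with x >= 0

# Alternative 1: is there x >= 0 with A x = b?  (feasibility LP, zero cost)
res = linprog(np.zeros(2), A_eq=A, b_eq=b)
assert res.status == 2   # scipy status code 2: problem is infeasible

# Alternative 2: a certificate p with p^T A >= 0 and p^T b < 0.
p = np.array([1.0])
assert np.all(p @ A >= 0) and p @ b < 0
```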

Compact simplex tableau

The simplex tableaux that we have been using so far are usually called full simplex tableaux. However, we can observe that the same amount of information will be retained if we delete the basic columns from the full simplex tableau to obtain a more compact simplex tableau.
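This works because the basic columns of the full tableau always form the identity matrix, so dropping them loses no information. A small illustration with made-up data:

```python
import numpy as np

# Full tableau body B^{-1}A | B^{-1}b for a standard-form problem
# with basis {0, 1}; columns: x1 x2 x3 | rhs.
full = np.array([[1.0, 0.0, 2.0, 4.0],
                 [0.0, 1.0, 3.0, 6.0]])
basis = [0, 1]
nonbasic = [j for j in range(3) if j not in basis]

# The basic columns are just the identity...
assert np.allclose(full[:, basis], np.eye(2))
# ...so the compact tableau keeps only the nonbasic columns and the rhs.
compact = full[:, nonbasic + [3]]
```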

Computational efficiency of the simplex method

The computational efficiency of the simplex method is determined by two factors: the computational effort per iteration, and the number of iterations. Using the full simplex tableau, each iteration requires $O(mn)$ arithmetic operations. As for the number of iterations, there exists an LP problem (the Klee-Minty cube) and a pivoting rule under which the simplex method requires $2^n-1$ changes of basis before it terminates.

Other methods

There are other algorithms not covered here, such as the ellipsoid method and interior point methods.

