...

Semidefinite and Second Order Cone Programming Seminar, Fall 2003 Lecture 3 1

by user

on
Category: Documents
1

views

Report

Comments

Transcript

Semidefinite and Second Order Cone Programming Seminar, Fall 2003 Lecture 3 1
Semidefinite and Second Order Cone
Programming Seminar, Fall 2003
Lecture 3
Instructor: Farid Alizadeh
Scribe: Nilay Noyan
09/17/2003
1
Overview
We prove that Cn+1 = Mn+1 . Then by using this fact we prove that the
dual cone of nonnegative polynomials is equal to the moment cone. We discuss
Caratheodory‘s theorem and give several examples. Then we prove that every
non-negative polynomial is the sum of square polynomials. Finally we start to
study Duality Theory.
2
The Moment Cone and Some Related Propositions
Let FD be the set of functions F : < → < with the following properties:
1. F is right continuous,
2. non-decreasing (i.e. if x > y then F(x) ≥ F(y),) and
3. has bounded variation, that is F(x) → α > −∞ as x → −∞, and F(x) →
β < ∞ as x → ∞.
First observe that functions in F are almost like probability distribution functions, except that their range is the interval [α, β] rather than [0, 1]. Second,
the set F itself is a pointed convex cone in the space of continuous functions.
The moment cone is defined as:
Z
def
Mn+1 = (u0 , u1 , . . . , un )T ∃F ∈ FD : ui =
ti dF(t) (i = 0 . . . n)
D
where D is an interval.
def
Proposition 1 Mn+1 = Cn+1 = conv {cn+1 (t)t ∈ D}
1
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
def
Proof: Let us introduce the following notation: cn+1 (t) = (1, t, t2 , . . . , tn )T .
We first show that Cn+1 ⊆ Mn+1 . The ith moment of a random variable which
takes the value t with probability 1 is ti , therefore for all t ∈ D : cn+1 (t) =
(1, t, t2 , . . . , tn )T ∈ Mn+1 . Then we have Cn+1 = conv {cn+1 (t) : t ∈ D} ⊆
Mn+1 .
Finally we show that Mn+1 ⊆ Cn+1 . Suppose that Mn+1 * Cn+1 , then ∃x
suchRthat x ∈ Mn+1 and x ∈
/ Cn+1 . Since x ∈ Mn+1 , ∃F ∈ FD such that
x = D cn (t)dF(t). Since x ∈
/ Cn+1 and Cn is a closed cone, by the seperating
T
T
hyperplane theorem
R ∃pT such that p x < 0 and p y ≥ 0 (∀y ∈ Cn ).Then we
T
have 0 > p x = D p Cn (t)dF(t). Each Cn (t) belongs to the moment cone
which implies that pT Cn (t) ≥ 0. Therefore we have a contradiction. Thus
Mn+1 ⊆ Cn+1 .
Remark 1 We have shown that any vector in the moment cone can be written
T
as a nonnegative combination of some vectors cn+1 (ti ) = (1, ti , t2i , . . . , tn
i ) ,
where (ti ∈ D).
Now we examine M∗n+1 . Let us first consider the cone of nonnegative polynomials, defined as follows:
Pn+1 = {p = (p0 , . . . , pn ) | p0 + p1 t + p2 t2 + ... + pn tn = p(t) ≥ 0 ∀t ∈ D}
Remark 2 It is easy to verify that Pn+1 is a proper cone.
Proposition 2 M∗n+1 =Pn+1
P
T
Proof:
Let p ∈ Pn+1 and c ∈ Cn+1 . Then pT c = pT i λi (1, ti , t2i , . . . , tn
i ) =
P
∗
i λi P(ti ) ≥ 0. Thus Pn+1 ⊆ Cn+1 . Hence, since we proved that Mn+1 =
Cn+1 , we have Pn+1 ⊆ C∗n+1 .
We also need to show that M∗n+1 ⊆ Pn+1 . Suppose that p belongs to M∗n+1 ,
but not to Pn+1 . Since p ∈
/ Pn+1 , there exits t ∈ D such that P(t) < 0. Then
T
T
we have pT (1, ti , t2i , . . . , tn
)
< 0, which is impossible since (1, ti , t2i , . . . , tn
i
i ) ∈
∗
Mn+1 and p ∈ Mn+1 .
Remark 3 If D = (−∞, +∞) and n is even (n = 2l) then (c0 , c1 , . . . , c2n ) ∈
M2n+1 if and only if


c0 c1
c2 . . . cl
 c1 c2
cl+1 




..
.
..
 c2
<0
.




cl cl+1
. . . c2l
The above matrix is called a Hankel matrix. If D is a half line or a closed
interval, there are similar characterizations based on semi-definiteness of Hankel
matrices.
2
Scribe: Nilay Noyan
Lecture 3
3
Date: 09/17/2003
CARATHEODORY’S THEOREM
Definition 1 An extreme ray of a proper cone K is a half line Ra = {αa | α ≥ 0}
for a ∈ K such that for each x ∈ Ra if x = b + c for some b, c ∈ K, then
b, c ∈ Ra .
Theorem 1 Caratheodory’s Theorem
If K ⊆ <n is a proper cone then for all a ∈ K there exist
P some extreme rays
Ra1 , . . . , Rak (k ≤ n) such that a can be written as a =
λi ai , where λi ≥ 0
and a ∈ Ri .
Definition 2 The Caratheodory number κ(K) of a proper coneP
K is the smallest
κ
integer κ such that every point a ∈ K can be written as a = i=1 λi ai where
λi ≥ 0, ai ∈ Rai and Rai are extreme rays of K.
Example 1 (Non-negative Orthant)
The non-negative orthant of Rn is Ln = {(x1 , x2 , . . . , xn })T : xi ≥ 0 (i = 0 . . . n)}.
def
Let us introduce the following notations: EXT(K) = {v1 , . . . , vm } ⇐⇒ The extreme rays of K are {αvi | α ≥ 0}.
ei : the unit vector where the ith coordinate is 1 and the rest of the coordinates
are all zero.
Proposition 3 EXT(Ln ) = {e1 , e2 , . . . , en } and κ(Ln ) = n.
Example 2 (Extreme rays of the
second order cone) Let Q be the second order cone. EXT(Qn ) = { 1, x̄ : x ∈ Rn , kx̄k = 1} . Note that there are
infinitely many extreme rays.
Proposition 4 κ(Ln ) = 2
Proof: It is easy to see that (xo +kx̄k)
and (xo −kx̄k)
are nonnegative numbers
2
2
1
1
and the vectors
and −x̄ are normalized extreme rays of Q. We can
x̄
kx̄k
kx̄k
write any x ∈ Q as follows:
1
1
(xo + kx̄k)
(xo − kx̄k)
xo
xo
x=
=
+
=
.
x̄
−x̄
x̄
x̄
2
2
kx̄k
kx̄k
Therefore, any x ∈ Q can be written as nonnegative combination of two
vectors from extreme rays of Q.
Thus κ(Ln ) = 2.
3
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
Example 3 (Extreme rays of the semidefinite cone)
Let Pn×n = {A ∈ Rn×n : A = AT , A < 0} be the semidefinite cone.
Lemma 1 The extreme rays of Pn×n consist of the positive semi-definite matrices qqT of rank 1.
Proof:
Any positive semidefinite matrix X can be written in the form X =
P
T
λ
q
q
i
i
i . (This from can be obtained from spectral decomposition of X: Any
i
symmetric matrix can be diagonalized as X = QΛQT , where Q = [q1 , . . . , qn ], qi
are the orthogonal eigenvectors
of X with unit lenght and Λ = diag(λ1 , . . . , λn ).
P
Then X = QΛQT = i λi qi qTi ). This shows that all extreme rays must be
among matrices of the form qqT . Now we must show that each qqT is an
extreme ray. Let qqT = X + Y, where X, Y < 0. Suppose {q1 = q, q2 , . . . , qn }
is an orthogonal set of vectors in Rn . Then multiplying from left by qTi and
from right by qi we see that qTi Xqi + qTi Yqi = 0 for i = 2, . . . , n; but since
the summands are both non-negative and add up to zero, they are both zero.
Thus qTi Xqi = qTi Yqi = 0 for i = 2, . . . n. Therefore both X and Y are rank one
matrices (their null space has dimension n − 1) so we have qqT = xxT + yyT .
But the right hand side is a rank 2 matrix unless x and y are proportional,
which proves they are both proportional to q. Thus qqT are extreme rays for
each vector q ∈ Rn .
n(n+1)
Note that while Pn×n ⊆ R 2
, κ(Pn×n ) < n(n+1)
: since we showed that
2
T
EXP(Pn×n ) = λqi qi , where qi belongs to the orthogonal set of vectors:
{q, q2 , . . . , qn } ∈ Rn we have κ(Pn× n ) = n.
Example 4 (Moment Cone)
Cn+1 = Mn+1 = {Σαi cn+1 (ti ) | αi ≥ 0,ti ∈ D}
We claim that EXT(Mn+1 ) = {cn+1 (ti ),ti ∈ D}. (There are infinitely many
such vectors).
To prove the claim we have to show that it is impossiblePto write any vector
T
(1, t, t2 , . . . , tn )T as Σαi (1, ti , t2i , . . . , tn
αi 6= 0 , ti values
i ) where αi ≥ 0 ,
are distinct and ti 6= t. Consider the following matrix:


1
1
1 ... 1
 t
tn 
 2 t21

2 
 t
t
t
n 
1

 ..

 .

n
n
n
t
t1
. . . tn
The determinant
of the
Q
Q matrix above is a Vandermonde determinant which
is equal to i>j (ti − tj ) (t − tj ).
Q
Q
Since the values t,t1 ,. . . ,tn are distinct i>j (ti − tj ) (t − tj ) 6= 0, therefore
4
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
the columns of the matrix are linearly independent. Thus it is impossible to
T
write (1, t, t2 , . . . , tn )T as Σαi (1, ti , t2i , . . . , tn
i ) .
It can be shown that the Caratheodory number of the moment cone is
κ(Mn+1 ) = b n+1
2 c.
Example 5 (Cone of non-negative polynomials)
Pn+1 = {p = (p0 , . . . , pn ) | p0 + p1 t + p2 t2 + ... + pn tn = p(t) ≥ 0 ∀t ∈ D}
Theorem 2 If p ∈ Pn+1 then there exist polynomials q(t) and p(t) such that
p(t) = q2 (t) + r2 (t).
We will prove another version of this theorem.
Theorem 3 Every non-negative polynomial is the sum of square polynomials.
Proof: We only prove for the case D = (−∞, +∞).
We first claim that n must be even. Otherwise p(t) → −∞ as t → −∞ if
pn > 0 (similarly, p(t) → −∞ as t → −∞ if pn < 0). Since p(t) cannot be
non-negative, we have n = 2l.
Observations
1. All the real roots must have even multiplicity, because otherwise in the
neighborhood of the root with odd multiplicity there would be some t such
that p(t) < 0.
2. Complex roots appear in conjugate pairs. Therefore p(t) can be written
as
k
n
Y
Y
(t − αj − iβj )(t − αj + iβj )
p(t) = c
(t − γj )2
j=1
where i =
√
j=k+1
−1 and c ≥ 0;
3. For each pair of conjugate complex roots we have
(t − α − iβ)(t − α + iβ) = (t − α)2 + β2 .
Therefore the above expression for p(t) is product of sums of square polynomials, which also yields a sum of square polynomials.
This theorem implies that the set of extreme rays of the non-negative polynomials is a subset of the square polynomials. However, not all the square
polynomials define extreme rays. In particular, if a square polynomial has nonreal roots then it can be written as sum of two square polynomials:
[(t − α − iβ)(t − α + iβ)]2 = [(t − α)2 + β2 ]2
5
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
Therefore extreme rays must consist of square polynomials with only real
roots. We now argue that these polynomials indeed define extreme rays.
Theorem 4 Square polynomials of the form P2 (t) where P(t) have only real
roots define extreme rays of Pn+1 .
Proof:
Qk
Suppose p2 (t) =
(t − γj )2 is a polynomial with distinct real roots γj
which is not in an extreme ray. Then p2 (t) = q2 (t) + r2 (t) and since both q
and r are non-negative, we must have q(t) ≤ p(t). This means that degree of
q(t) is at most as large as the degree of p. Furthermore, from the picture it is
clear that each γj is also a root of q(t). But if for some γj the multiplicity in
p is 2k and the multiplicity in q is 2m where m < k then in some neigborhood
of γj q(t) > p(t) because (t − γj )2m > (t − γj )2k in some neighborhood of γj
when m < k; therefore k ≤ m holds for each root. Since the degree of p is
larger than or equal to the degree of q it follows that k = m for each root. Thus
q(t) = αp(t) for some constant α.
Remark 4 If we have some finite interval D = (a, b) the extreme rays consist
of square polynomials with all real roots in interval (a, b).
Remark 5 We considered only polynomials with one variable. For non-negative
polynomials of several variables the theorem does not hold, ie. it is not always
true that a non-negative multivariate polynomial is the sum of square polynomials.
A few more considerations:
1. Optimization over non-negative polynomials of one variable is a semidefinite programming problem. However, when we have non-negative
polynomials of several variables the problem becomes NP-hard.
6
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
P 2
2. The cone of polynomials P(t1 , . . . , tn ) : P =
qi (t1 , . . . , tn ) is a proper
subcone of nonnegative polynomials. However,optimization over this is
semi-definite programming problem.
4
DUALITY THEORY
The standard form (P) of the K-LP problem is formulated as:
min cT x
s.t. Ax = b
x ≥K 0
(1)
where c ∈ Rn and b ∈ Rm , A ∈ Rn×}m .
The dual (D) of the above problem is:
max bT y
s.t. AT y + s = c
s ≥K∗ 0
Lemma 2 Weak Duality Lemma If x is feasible for the primal problem (P)
and (y, s) is feasible for the dual problem (D) then cT x ≥ bT y holds.
Proof: cT x − bT y = xT c − xT AT y = xT (c − AT y) = xT s ≥ 0 since x ∈ K and
s ∈ K∗ .
Remark 6 The value cT x − bT y = xT s is called the duality gap.
Corollary 1
• P is unbounded, then D is infeasible.
• D is unbounded, then P is infeasible.
• If x is feasible for (P) and (y, s) is feasible for (D) and cT x−bT y = xT s = 0
then x is optimal for (P) and (y, s) is optimal for (D).
Lemma 3 Farkas Lemma
Let K be a closed cone, A ∈ Rn×m , b ∈ Rm . Let us further assume that A(K)
is also closed. Then
either ∃x ∈ Rn such that x ≥K 0
or ∃y ∈ Rm such that AT y ≥K∗ 0and bT y < 0.
7
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
Remark 7 For polyhedral cones (for example when K is the non-negative orthant) A(K) is always a closed cone. However, in the general case A(K) might
not be closed even when K is closed. Therefore the assumption in the lemma
was necessary to guarantee closedness.
Example 6 (A(K) is not closed for a closed cone K)



 xo

Let K =  x1  : x1 ≥ 0, x2 ≥ 0, x1 x2 ≥ x2o .


x2
It is easy to see that K is closed cone. For a fixed value of xo we get in the
x1 ,x2 plane the upper branch of a hyperbola. Let A be the projection to the
x0 ,x1 plane. It is easy to verify that A(K) = {(x0 , x1 ) : x0 ≥ 0, x1 > 0} ∪ {0}.
Therefore the projected cone A(K) is not closed.
Proof:(Farkas Lemma)
A(K) = {Ax : x <K 0} is a closed convex cone.
either : b ∈ A(K) =⇒ ∃x <K 0 such that Ax = b.
or : b ∈
/ A(K) and since A(K) is closed the conic version of separating hyperplane
applies:
∃y : bT y < 0 and zT y ≥ 0 ∀z ∈ A(K) ⇐⇒ (Ax)T y ≥ 0 ∀x ∈ K ⇐⇒ xT AT y ≥
0 ∀x ∈ K ⇐⇒ AT y ≥K∗ 0
Lemma 4 If K is closed and convex then (K∗ )∗ = K
Remark 8 This lemma is a restatement of Farkas lemma.
Example 7 (Counterexample to Farkas Lemma when A(K) is not closed)
Let Eij be a matrix, where the ijth element is 1 and all other elements are zero.
0 1
<0
1 u
is impossible because the determinant of the matrix is negative so at least one
of the eigenvalues has to be negative. Therefore we have
@x : (E11 x = 0,
E12 + E21
)x = 1 and x ≥ 0.
2
Since this case is impossible according to the Farkas lemma,
E12 + E21
y1
∃ y1 ,y2 : y1 E11 + y2
≥ 0 and (0 1)
<0
y2
2
=⇒ ∃(y1 , y2 ) :
y1
y2
y2
0
8
< 0 and y2 < 0.
Scribe: Nilay Noyan
Lecture 3
Date: 09/17/2003
This is also impossible because determinant is negative, therefore both the either
and or parts of the Farkas
Lemmaare impossible. The
lemma
is not
valid
S
x1
x1 x 2
x1
since A(K) =
:
< 0)} = {(0, 0)} {
: x > 0 is not
x2
x2 x 3
x2
a closed cone.
Theorem 5 Strong Duality Theorem Let us assume that (P) and (D) are
both feasible
andz1 and z2 are their optimal values, respectively.
A
‘
Let A =
. If A‘ (K) is closed then z1 = z2 .
cT
We will prove the strong duality theorem in the next lecture.
9
Fly UP