...

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 2

by user

on
2

views

Report

Comments

Transcript

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 2
Semidefinite and Second Order Cone
Programming Seminar
Fall 2001
Lecture 2
Instructor: Farid Alizadeh
Scribe: Xuan Li
9/17/2001
1
Overview
We survey the basic notions of cones and cone-LP and give several examples
mostly related to semidefinite programming.
2
Program Formulations
The linear and semidefinite programming problems are formulated as follows:
2.1
Standard Form Linear Programming
Let c ∈ <n and b ∈ <m ,A ∈ <n×m with rows ai ∈ <n , i = 1, ...m.
min: cT x
s.t.
ai x = b i ,
x≥0
2.2
i = 1, ..m
(1)
Semidefinite Programming
Here instead of vectors ai we use symmetric matrices Ai ∈ Sn×n (the set of
n × n symmetric matrices), i = 1, ...m, C ∈ Sn×n and X ∈ Sn×n instead of c
and x. The matrix X is positive semidefinite. The inner product is defined as
X
A•B =
Aij Bij
i,j
=
=
Trace(ABT )
Tr(AB) = Tr(BA).
1
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
The second equation is from definition of product, and the last one come
from the observation that even though matrix product is not commutative,
i.e. AB 6= BA in general, the diagonal entries of AB and BA are equal and thus
their traces are equal as well. The standard form of semidefinite programming
is :
min C • X
s.t. Ai • X = bi ,
X<0
3
i = 1, ..m
Some Notations and Definitions
• cone: A set K is called a cone if αx ∈ K for each x ∈ K and for each α ≥ 0.
• convex Cone: A convex cone K is a cone with the additional property
that x + y ∈ K for each x, y ∈ K.
• pointed cone A pointed cone K is a cone with the property that K ∩
(−K) = {0}.
• open Set A set S is open if for every point s ∈ S, B(a, ) = {x : ||x − s|| <
} ⊂ S for some positive number s .
• closed set A set S is a closed set if its compliment Sc is open.
• interior of set The interior of a set S is defined as
[
Int(S) :=
T
T ⊆S,T open
• closure of set The closure of a set S is defined as
\
cl(S) :=
T
T ⊇S,T closed
• boundary of set The boundary of a set S is defined as
Bd(S) := Cl(S) ∩ Int(S)c
Remark 1 There are some basic facts which can be easily seen from the definitions above:
2
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
1. An open set in <n is not open in <m for n < m ;
2. similarly, the boundary or the interior of a set isn’t the same in <n as in
<m ;
3. As a result one talks about an open set with respect to the topology
induced by the vector space spanned by a set S;
4. similarly we speak of relative interior and relative boundary of a set which
are understood to be with respected to topology of the space spanned by
the the set;
5. a closed set in <n is also closed in <m .
Consider the half closed interval [a, b) = {x : a ≤ a < b} in <1 . The interior of
[a, b) in <1 is the open interval (a, b) and the boundary of [a, b) is {a}∪{b} . But
(a, b) isn’t open in <2 since for any x ∈ (a, b), we can’t find some > 0 such
that B(x, ) ⊂ (a, b). The interior of [a, b) in <2 is empty and the boundary of
[a, b) in <2 is [a, b]. However the relative interior of [a, b) in <n is again (a, b)
and the relative boundary {a, b}.
Definition 1 (Proper Cone) A proper cone K ⊆ <n is a closed, pointed,
convex and full- dimensional cone (i.e dim(K) = n). A full-dimensional cone
is a cone which contains n linearly independent vectors.
Theorem 1 Every proper cone K induces a partial order which is defined as
follows:
K
∀x, y ∈ <n , x ≥ y ⇔ x − y ∈ K
K
X > y ⇔ x − y ∈ Int(K)
K
K
K
Proof: First note that x ≥ x since x − x = 0 ∈ K . Secondly, ifx ≥ y, y ≥ x ,
then x − y ∈ K, y − x ∈ K . Since K is a proper cone, thus a pointed cone, we
K
K
get x = y . Finally, if x ≥ y, y ≥ z then x − z = (x − y) + (y − z) ∈ K , i.e.,
K
x ≥ z.
4
The Standard cone linear programming (KLP)
min cT x
s.t. aTi x = bi ,
i = 1, ..m
K
x≥0
where c ∈ < and b ∈ < ,A ∈ <n×m with rows ai ∈ <n , i = 1, ...m. Observe
that every convex optimization problem: minx∈C f(x) where C is a convex set
n
m
3
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
and f(x) is convex over C, can be turned into a cone-LP. First turn the problem
to one with linear objective and then turn it into Cone LP:
min z
s.t. f(x) − z ≤ 0
x ∈ C.
Since the set C0 = {(z, x) | x ∈ C and f(x) − z ≤ 0} is convex our problem is
now equivalent to the cone LP where
min z
s.t. x0 = 1
K
x≥0
where K = {(x0 , z, x) | (z, x) ∈ x0 C and x0 ≥ 0}
The convex set
embeded in plane
and turned into a cone
Definition 2 (Dual Cone) The dual cone K∗ of a proper cone is the set
{z : zT x ≥ 0, ∀x ∈ K}.
It is easy to prove that if K is proper so is K∗ .
4
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
Example 1 (Half line) Let <+ = {x : x ≥ 0}. The dual cone <∗+ is exactly
<+ .
Example 2 (non-negative orthant) Let
<n
+ = {x | xk ≥ 0 for
k = 1, . . . , n},
the dual cone equals <n
+ , that is the non-negative orthant is self dual.
We recall that
Lemma 1 A matrix X is positive semidefinite if it satisfies any one of the
following equivalent conditions:
1.
(1) aT Xa ≥ 0, ∀a ∈ <n
2.
(2) ∃A ∈ <n×n such that AAT = X
3.
(3)
All eigenvalues of X are non-negative.
Example 3 (The semidefinite cone) Let Pn×n = {X ∈ <n×n : X is positive semidefinite}
Now we are interested in P∗n×n . On one side,
∀Z ∈ P∗n×n , Z • X ≥ 0 for allX 0,
i.e.,
Z • X = Tr(ZX) = Tr(ZAAT ) = Tr(AT ZA) ≥ 0 for all A ∈ <n×n .
Since X is symmetric, from the knowledge of linear algebra, X can be written
as X = QΛQT where QQT = I , that is Q is an orthogonal matrix, and Λ
is diagonal with the diagonal entries containing the eigenvalues of X. Write
Q = [q1 , ...qn ] and Λ = diag(λ1 , ...λn ). λi , i = 1..n , then qi is the eigenvector
corresponding to λi , i.e, qTi Xqi = λi
Let us choose Ai = pi ∈ Rn where pi is the eigen vector of Z corresponding to
γi and pTi pi = 1. Then,
0 ≤ Tr(ATi ZAi ) = pTi Zpi = γi
. So all the eigenvalues of Z are non-negative, i.e., Z ∈ Pn×n , P∗n×n ⊆ Pn×n.
On the other hand, ∀Y ∈ Pn×n , ∃B ∈ <n×n such that Y = BBT . ∀X ∈
Pn×n , X = AAT , we have
Y • X = Tr(YX) = Tr(BBT AAT ) = Tr(AT BBT A) = Tr[(BT A)T (BT A)] ≥ 0
i.e., Y ∈ P∗n×n , Pn×n ⊆ P∗n×n . In conclusion, P∗n×n = Pn×n
5
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
Example 4 (The second order cone) Let Q = {(x0 , x̄) | x0 ≥ ||x̄||}. Q is a
proper cone. What is Q∗ ?
On one side, if z = (z0 , z̄) ∈ Q, then for every (x0 , x̄) ∈ Q
x
(z0 , z̄T ) 0
= z0 x0 + z̄T x̄
x̄
≥ ||z̄|| · ||x̄|| + z̄T x̄
≥ −z̄T x̄ + z̄T x̄ = 0
i.e., Q ⊆ Q∗ . The inequalities come from the Cauchy-Schwartz inequality:
−zT x ≤ |xT z| ≤ ||z|| · ||x||
On the other side, we note that e = (1, 0) ∈ Q. For each element z = (z0 , z̄) ∈ Q∗
we must have zT e = z0 ≥ 0. We also note that each vector of the form
x = (kz̄k, −z̄) ∈ Q, for all z̄ ∈ <n . Thus, in particular for z = (z0 , z̄) ∈ Q∗ ,
zT x = z0 ||z̄|| − ||z̄||2 ≥ 0
Since ||z̄|| is always non-negative, we get z0 ≥ ||z̄||, i.e., Q∗ ⊆ Q.
Therefore, Q = Q∗
Definition 3 An extreme ray of proper cone K is a half line αx = {αx | α ≥ 0}
for x ∈ K such that for each a ∈ αx, if a = b + c, then b, c ∈ αx.
Example 5 (Extreme rays of the second
order cone) Let Q the second
order cone. The vectors x = kx̄k, x̄ define the extreme rays of Q. This is
fairly easy to prove.
Example 6 (Extreme rays of the semidefinite cone) Let Pn×n be the semidefinite cone. Positive semi-definite matrices qqT of rank 1 form the extreme rays
of Pn×n . Here is the
P proof. Any positive semidefinite matrix X can be written
in the form of X = i λi pi pTi (See previous lecture to see how to get this from
spectral decomposition of X). This shows that all extreme rays must be among
matrices of the form qqT . Now we must show that each qqT is an extreme ray.
Let qqT = X+Y, where X, Y < 0. Suppose {q1 = q, q2 , . . . , qn } is an orthogonal
set of vectors in <n . Then multiplying from left by qTi and from right by qi we
see that qTi Xqi + qTi Yqi = 0 for i = 2, . . . , n; but since the summands are both
non-negative and add up to zero, they are both zero. Thus qTi Xqi = qTi Yqi = 0
for i = 2, . . . n. Thus both X and Y are rank one matrices (their null space has
dimension n − 1) and we might as well write qqT = xxT + yyT . But the right
hand side is a rank 2 matrix unless x and y are proportional, which proves they
are proportional to q. Thus, qqT are extreme rays for each vector q ∈ <n .
6
scribe:Xuan Li
Lecture 2
4.1
Date: 01/17/2001
An Example of a cone which is not self dual
In the examples above, we note that they were all self-dual cones. But there
are cones that are not self-dual.
Let F be the set of functions F : < → < with the following properties:
1. F is right continuous,
2. non-decreasing (i.e. if x > y then F(x) ≥ F(y),) and
3. has bounded variation, that is F(x) → α > −∞ as x → −∞, and F(x) →
β < ∞ as x → ∞.
First observe that functions in F are almost like probability distribution functions, except that their range is the interval [α, β] rather than [0, 1]. Second the
set F itself is a convex cone and in fact pointed cone in the space of continuous
functions.
Now we define a particular kind of Moment cone. First, let us define
 
1
x
 2
x 

ux = 
 · .
 
 · 
xn
The moment cone is defined as:
Z
Mn+1 = c = ux dF(x) : F(x) ∈ F
that is Mn+1 consits of vectors c where for each j = 0, . . . , n, cj is the jth
moment of a distribution times a non-negative constant.
Lemma 2 Mn+1 is a proper cone.
Proof: Let’s examine the properties we need to prove:
• ∀c ∈ Mn+1 and α ≥ 0 αc
R ∈ Mn+1 . To see this observe that there
exists F ∈ F such that c = ux dF(x). Now if F is right-continuous, nondecreasing and with bounded variation, then all these properties
also hold
R
for αF for each α ≥ 0 and thus αF ∈ F. Therefore, αc = ux d(αF(x)) ∈
Mn+1 . Thus Mn+1 is a cone.
• If c and dR are in Mn+1 then c + d ∈ Mn+1 . ∀c =
Mn+1 , d = ux dF2 (x) ∈ Mn+1
Z
c + d = ux d[F1 (x) + F2 (x)] ∈ Mn+1
Thus Mn+1 is a convex cone.
7
R
ux dF1 (x) ∈
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
R
• If c and −c are in Mn+1
R then c = 0. Ifc = ux dF1 (x) ∈ Mn+1 and
c ∈ −Mn+1 , then −c = ux dF2 (x) ∈ Mn+1 .
Z
c + (−c) = 0 = ux d[F1 (x) + F2 (x)]
R
Especially, d[F1 (x)+F2 (x)] = 0. Since F1 (x)+F2 (x) ∈ F is non-decreasing
with F1 (x) + F2 (x) → 0 as x → −∞, we get F1 (x) + F2 (x) = 0 almost
everywhere,i.e., Fi (x) = 0, i = 1, 2 almost everywhere. It means c = 0,
i.e., Mn+1 ∩ −Mn+1 = 0. Thus Mn+1 is a pointed cone.
• Mn+1 is full-dimensional. Let
0, if x < a
Fa (x) =
1, if x ≥ a
Obviously, Fa (x) ∈ F and ua =
n + 1 distinct a1 , ...an+1 ,
R
ux dFa (x) ∈ Mn+1 for all a ∈ R. Choose
det[ua1 , · · · , uan+1 ] =
Y
(ai − aj ) 6= 0
i>j
Thus Mn+1 is full-dimension cone. (The determinant above is the wellknown Vander Monde determinant.)
In addition we need to show that Mn+1 is closed. This will be taken up in
future lectures.
Example 7 (Extreme rays of Mn+1 ) The extreme rays of Mn+1 are all αux
for x ∈ <. If c ∈ Mn+1 , c can be written as α1 ux1 + α2 ux2 + · · · +
αn+1 uxn+1 , αi ≥ 0 for i = 1, ..n + 1.
There is a one-to-one correspondence between c ∈ Mn+1 and
H = α1 ux1 uTx1 + α2 ux2 uTx2 + · · · + αn+1 uxn+1 uTxn+1 .
Such a matrix is called Hankel matrix. In general Hankel matrices are thos
matrices, H such that Hij = hi+j , that is entries are constant along all opposite
diagonals. A vector c ∈ <2n+1 is in the moment cone if and only if the Hankel
matrix Hij = ci+j is positive semidefinite. Again these assertions will be proved
in future lectures.
Now we examine M∗n+1 . Let’s first consider the cone defined as follows:
Pn+1 = {p = (p0 , . . . , pn ) | p0 + p1 x + p2 x2 + ... + pn xn = p(x) ≥ 0 for all x}
Lemma 3 Every non-negative polynomial is the sum of square polynomials.
8
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
Proof: First it is well known that p(x) can be written as
k
n
Y
Y
p(x) = c
(x − γj )
(x − αj − iβj )(x − αj + iβj )
j=1
j=k+1
√
where i = −1 and c ≥ 0. We first claim that n must be even. Otherwise,
p(x) → −∞ as x → −∞ p(x) and cannot be non-negative. The number of
real roots is even subsequently, say 2l.
since p(x) ≥ 0, all the real roots must have even multiplicity, because otherwise
in the neighborhood of the root with odd multiplicity there is some t such that
p(t) < 0. Thus, we can write
k
n
Y
Y
(x − αj − iβj )(x − αj + iβj )
p(x) = c
(x − γj )2
j=1
j=k+1
On the other hand for each pair of conjugate complex roots we have
(x − α − iβ)(x − α + iβ) = (x − α)2 + β2
Therefore the product expression for p(x) is product of square polynomials or
sums of square polynomials, which yields a sum of square polynomials.
This means that the set of extreme rays of the non-negative polynomials is
among polynomials that are square q2 (x). Thus, the coefficients of extreme
rays are of the form q ∗ q = q∗2 , where a ∗ b is the convolution of vectors a
and b, that is for a, b ∈ <n+1 , a ∗ b ∈ <2n+1 and is defined as:
a ∗ b = (a0 b0 , a0 b1 + a1 b0 , . . . , a0 bk + a1 bk−1 + · · · + ak b0 , . . . , an bn )T
and q∗2 = q ∗ q.
Now not all square polynomials are extreme rays. In particular, if a square
polynomial has non-real roots then it can be written as sum of two square
polynomials as shown above. Thus, extreme rays are among those square polynomials with only real roots. We now argue that these polynomials are indeed
extreme rays.
9
scribe:Xuan Li
Lecture 2
Date: 01/17/2001
Q
Suppose p(x) = (x − γj )2k is a polynomial with distinct roots γj which
is not an extreme ray. Then p(x) = q(x) + r(x) and since both q and r are
non-negative, we must have q(x) ≤ p(x). This means that degree of q(x) is
at most as large as degree of p. Furthermore, from the picture it is clear that
each γj is also a root of q(x). But if for some γj the multiplicity in p is 2k
and the multiplicity in q is 2m where m < k then in some neigborhood of γj
q(x) > p(x) because (x − γj )2m > (x − γj )2k in some neighborhood of γj when
m < k; therefore, k ≤ m for each root. Since degree of p is larger than or equal
to degree of q it follows that k = m for each root. Thus q(x) = αp(x) for some
constant α. We have proved:
2
Corollary 1 p is an extreme ray of Pn+1 if p = q∗ and q(x) has only real
roots.
Pn+1
P
2
We now show that P∗n+1 ⊇ Mn+1 . Note that ∀c = i=1 βi uxi ∈ Mn+1 , i p∗i ∈
Pn+1
X
i
since
2
n+1
X
p∗i )T (
X 2
βj uxj =
βj (p∗i )T (uxj ) ≥ 0.
j=1
i,j
2
2
βi ≥ 0, (p∗i )T (uxj ) = pi (x)
Later in the course we will prove that that P∗n+1 = Mn+1 .
10
Fly UP