intTypePromotion=1
zunia.vn Tuyển sinh 2024 dành cho Gen-Z zunia.vn zunia.vn
ADSENSE

Bài giảng Tối ưu hóa nâng cao: Chương 6 - Hoàng Nam Dũng

Chia sẻ: Minh Vũ | Ngày: | Loại File: PDF | Số trang:36

44
lượt xem
3
download
 
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

Bài giảng "Tối ưu hóa nâng cao - Chương 6: Subgradients" cung cấp cho người học các kiến thức: Last time - gradient descent, subgradients, examples of subgradients, monotonicity, examples of non-subdifferentiable functions,... Mời các bạn cùng tham khảo.

Chủ đề:
Lưu

Nội dung Text: Bài giảng Tối ưu hóa nâng cao: Chương 6 - Hoàng Nam Dũng

  1. Subgradients Hoàng Nam Dũng Khoa Toán - Cơ - Tin học, Đại học Khoa học Tự nhiên, Đại học Quốc gia Hà Nội
  2. Last time: gradient descent Consider the problem min f (x) x for f convex and differentiable, dom(f ) = Rn . Gradient descent: choose initial x (0) ∈ Rn , repeat x (k) = x (k−1) − tk · ∇f (x (k−1) ), k = 1, 2, 3, . . . Step sizes tk chosen to be fixed and small, or by backtracking line search If ∇f Lipschitz, gradient descent has convergence rate O(1/ε) Downsides: I Requires f differentiable I Can be slow to converge 1
  3. Outline Today: I Subgradients I Examples I Properties I Optimality characterizations 2
  4. Basic inequality Basic inequality recall the basic inequality for differentiable convex functions: Recall that for convex and differentiable f , T T f (y )f≥ (y)f (x) + ∇f ≥ f (x) (x)(x) + ∇f −−x), (y (y x) ∀x, ∀yy∈∈dom dom(f f ). (x, f (x))   ∇f (x) −1 I The first-order approximation of f at x is a global lower • the first-order approximation of f at x is a global lower bound bound. • ∇f (x) defines a non-vertical supporting hyperplane to epi f at (x, f (x)): I ∇f defines a non-vertical supporting hyperplane to epi(f ) at  T     (x, f (x)) ∇f (x) y − x ≤ 0 ∀(y, t) ∈ epi f  −1  y t! f (x)!! x ∇f −1 − ≤ 0, ∀(y , t) ∈ epi(f ). Subgradients t f (x) 4-2 3
  5. Subgradients A subgradient of a convex function f at x is any g ∈ Rn such that f (y ) ≥ f (x) + g T (y − x), ∀y ∈ dom(f ). I Always exists (on the relative interior of dom(f )) I If f differentiable at x, then g = ∇f (x) uniquely I Same definition works for nonconvex f (however, subgradients need not exist). 4
  6. Subgradients A subgradient of a convex function f at x is any g ∈ Rn such that f (y ) ≥ f (x) + g T (y − x), ∀y ∈ dom(f ). I Always exists (on the relative interior of dom(f )) Subgradient I If f differentiable at x, then g = ∇f (x) uniquely g is a subgradient of a convex function f at x ∈ dom f if I Same definition works for nonconvex f (however, subgradients need not exist).f (y) ≥ f (x) + gT (y − x) ∀y ∈ dom f f (y) f (x1) + g1T (y − x1) f (x1) + g2T (y − x1) f (x2) + g3T (y − x2) x1 x2 g1 and g2 are subgradients at x1 , g3 is subgradient at x2 . 4 g1, g2 are subgradients at x1; g3 is a subgradient at x2
  7. Examples of subgradients Examples of subgradients Considerff :: RR → Consider R, ff (x) = |x| → R, 2.0 1.5 1.0 f(x) 0.5 0.0 −0.5 −2 −1 0 1 2 x • For x 6= 0, unique subgradient g = sign(x) I For x 6= 0, unique subgradient g = sign(x) • For x = 0, subgradient g is any element of [−1, 1] I For x = 0, subgradient g is any element of [−1, 1]. 5 5
  8. Examples of subgradients Considerf f: :RRnn→ Consider R,ff (x) →R, (x) = kxk2 = kxk 2 f(x) x2 x1 x Forxx6=6=0,0,unique • IFor subgradientg g==x/kxk2 uniquesubgradient kxk2 • IFor Forxx==0,0,subgradient subgradientg gis isany elementofof{z{z: kzk anyelement : kzk ≤≤1}1}. 2 2 6
  9. Examples of subgradients Considerff :: RRnn → Consider R, ff (x) → R, = kxk (x) = kxk11 f(x) x2 x1 I For x 6= 0, unique ith component g = sign(x ) • For xi i 6= 0, unique ith component gi i = sign(xii) I For xi = 0, ith component gi is any element of [−1, 1]. • For xi = 0, ith component gi is any element of [−1, 1] 7
  10. Examples of subgradients for ff11,,ff2 2: R : Rn→→RRconvex, (x)==max{f Considerf f(x) max{f11(x), (x), ff22(x)}, for n Consider convex, differentiable differentiable 15 10 f(x) 5 0 −2 −1 0 1 2 x (x)>>ff22(x), Forf1f1(x) •IFor (x), unique unique subgradient subgradient gg = ∇f11(x) = ∇f (x)>>ff11(x), Forf2f2(x) •IFor (x), unique unique subgradient subgradient gg = ∇f22(x) = ∇f (x) •IFor (x)==ff22(x), Forf1f1(x) (x), subgradient subgradient gg isis any any point point on on line line segment segment ∇f11(x) between∇f between (x)and ∇f22(x) and∇f (x). 8
  11. Subdifferential Set of all subgradients of convex f is called the subdifferential: ∂f (x) = {g ∈ Rn : g is a subgradient of f at x}. 9
  12. Subdifferential Set of all subgradients of convex f is called the subdifferential: ∂f (x) = {g ∈ Rn : g is a subgradient of f at x}. Properties: I Nonempty for convex f at x ∈ int(domf ) I ∂f (x) is closed and convex (even for nonconvex f ) I If f is differentiable at x, then ∂f (x) = {∇f (x)} I If ∂f (x) = {g }, then f is differentiable at x and ∇f (x) = g . Proof: See http://www.seas.ucla.edu/~vandenbe/236C/ lectures/subgradients.pdf 9
  13. Monotonicity Theorem The subdifferential of a convex function f is a monotone operator (u − v )T (x − y ) ≥ 0, ∀u ∈ ∂f (x), v ∈ ∂f (y ). Chứng minh. By definition we have f (y ) ≥ f (x) + u T (y − x) and f (x) ≥ f (y ) + v T (x − y ). Combining them get shows monotonicity. 10
  14. Monotonicity Theorem The subdifferential of a convex function f is a monotone operator (u − v )T (x − y ) ≥ 0, ∀u ∈ ∂f (x), v ∈ ∂f (y ). Chứng minh. By definition we have f (y ) ≥ f (x) + u T (y − x) and f (x) ≥ f (y ) + v T (x − y ). Combining them get shows monotonicity. Question: Monotonicity for differentiable convex function? 10
  15. Monotonicity Theorem The subdifferential of a convex function f is a monotone operator (u − v )T (x − y ) ≥ 0, ∀u ∈ ∂f (x), v ∈ ∂f (y ). Chứng minh. By definition we have f (y ) ≥ f (x) + u T (y − x) and f (x) ≥ f (y ) + v T (x − y ). Combining them get shows monotonicity. Question: Monotonicity for differentiable convex function? (∇f (x) − ∇f (y ))T (x − y ) ≥ 0, which follows directly from the first order characterization of convex functions. 10
  16. Examples of non-subdifferentiable functions The following functions are not subdifferentiable at x = 0 I f : R → R, dom(f ) = R+ ( 1 if x = 0 f (x) = 0 if x > 0. I f : R → R, dom(f ) = R+ √ f (x) = − x. The only supporting hyperplane to epi(f ) at (0, f (0)) is vertical. 11
  17. Connection to convex geometry Convex set C ⊆ Rn , consider indicator function IC : Rn → R,  0 if x ∈ C IC (x) = I {x ∈ C } = ∞ if x 6∈ C . For x ∈ C , ∂IC (x) = NC (x), the normal cone of C at x is, recall NC = {g ∈ Rn : g T x ≥ g T y for any y ∈ C }. Why? By definition of subgradient g , IC (y ) ≥ IC (x) + g T (y − x) for all y . I For y 6∈ C , IC (y ) = ∞ I For y ∈ C , this means 0 ≥ g T (y − x). 12
  18. ● ● ● ● 13 11
  19. Subgradient calculus Basic rules for convex functions: I Scaling: ∂(af ) = a · ∂f provided a > 0. I Addition: ∂(f1 + f2 ) = ∂f1 + ∂f2 . I Affine composition: if g (x) = f (Ax + b), then ∂g (x) = AT ∂f (Ax + b). I Finite pointwise maximum: if f (x) = maxi=1,...m fi (x), then  [  ∂f (x) = conv ∂fi (x) i:fi (x)=f (x) convex hull of union of subdifferentials of active functions at x. 14
  20. Subgradient calculus I General pointwise maximum: if f (x) = maxs∈S fs (x), then     [  ∂f (x) ⊇ cl conv ∂fs (x) .   s:fs (x)=f (x) Under some regularity conditions (on S, fs ), we get equality. I Norms: important special case, f (x) = kxkp . Let q be such that 1/p + 1/q = 1, then kxkp = max z T x. kzkq ≤1 And ∂f (x) = argmaxkzkq ≤1 z T x. 15
ADSENSE

CÓ THỂ BẠN MUỐN DOWNLOAD

 

Đồng bộ tài khoản
2=>2