Lecture: Advanced Optimization, Chapter 8 - Hoàng Nam Dũng
Bài giảng "Tối ưu hóa nâng cao - Chương 8: Proximal gradient descent (and acceleration)" cung cấp cho người học các kiến thức: Subgradient method, decomposable functions, proximal mapping, proximal gradient descent,.... Mời các bạn cùng tham khảo.
- Proximal Gradient Descent (and Acceleration)
Hoàng Nam Dũng
Faculty of Mathematics, Mechanics and Informatics, VNU University of Science, Vietnam National University, Hanoi
- Last time: subgradient method
Consider the problem
$$\min_x f(x)$$
with $f$ convex and $\mathrm{dom}(f) = \mathbb{R}^n$.
Subgradient method: choose an initial $x^{(0)} \in \mathbb{R}^n$ and repeat
$$x^{(k)} = x^{(k-1)} - t_k \cdot g^{(k-1)}, \quad k = 1, 2, 3, \ldots$$
where $g^{(k-1)} \in \partial f(x^{(k-1)})$. We use pre-set rules for the step sizes (e.g., the diminishing step size rule).
If $f$ is Lipschitz, then the subgradient method has convergence rate $O(1/\varepsilon^2)$.
Upside: very generic. Downside: can be slow; we address this today.
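To make the recap concrete, here is a minimal Python sketch of the subgradient method with diminishing step sizes; the test function $f(x) = \|x\|_1$ and the rule $t_k = 1/k$ are assumptions chosen for this illustration, not part of the slides.

```python
import numpy as np

def subgradient_method(f, subgrad, x0, steps=1000):
    """Subgradient method with diminishing step sizes t_k = 1/k.

    Tracks the best iterate, since f(x^(k)) need not decrease monotonically.
    """
    x = best = x0.copy()
    for k in range(1, steps + 1):
        g = subgrad(x)           # any g in the subdifferential of f at x
        x = x - (1.0 / k) * g    # diminishing step size rule
        if f(x) < f(best):
            best = x
    return best

# Example: f(x) = ||x||_1, whose subdifferential contains sign(x)
f = lambda x: np.sum(np.abs(x))
x_best = subgradient_method(f, np.sign, np.array([3.0, -2.0, 1.5]))
print(x_best)  # approaches the minimizer x = 0
```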
- Outline
Today:
▶ Proximal gradient descent
▶ Convergence analysis
▶ ISTA, matrix completion
▶ Special cases
▶ Acceleration
- Decomposable functions
Suppose
$$f(x) = g(x) + h(x)$$
where
▶ $g$ is convex, differentiable, $\mathrm{dom}(g) = \mathbb{R}^n$,
▶ $h$ is convex, not necessarily differentiable.
If $f$ were differentiable, then the gradient descent update would be
$$x^+ = x - t \cdot \nabla f(x).$$
Recall the motivation: minimize the quadratic approximation to $f$ around $x$, replacing $\nabla^2 f(x)$ by $\frac{1}{t}I$:
$$x^+ = \operatorname{argmin}_z \underbrace{f(x) + \nabla f(x)^T (z - x) + \frac{1}{2t}\|z - x\|_2^2}_{\tilde f_t(z)}.$$
- Decomposable functions
In our case $f$ is not differentiable, but $f = g + h$ with $g$ differentiable. Why don't we make a quadratic approximation to $g$ and leave $h$ alone? I.e., update
$$\begin{aligned}
x^+ &= \operatorname{argmin}_z \; \tilde g_t(z) + h(z) \\
&= \operatorname{argmin}_z \; g(x) + \nabla g(x)^T (z - x) + \frac{1}{2t}\|z - x\|_2^2 + h(z) \\
&= \operatorname{argmin}_z \; \frac{1}{2t}\big\|z - \big(x - t\nabla g(x)\big)\big\|_2^2 + h(z).
\end{aligned}$$
The term $\frac{1}{2t}\|z - (x - t\nabla g(x))\|_2^2$ keeps $z$ close to the gradient update for $g$, while $h(z)$ also makes $h$ small.
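The last step follows by completing the square (a short derivation added for clarity; it is not spelled out on the slide):
$$g(x) + \nabla g(x)^T (z - x) + \frac{1}{2t}\|z - x\|_2^2 = \frac{1}{2t}\big\|z - \big(x - t\nabla g(x)\big)\big\|_2^2 + g(x) - \frac{t}{2}\|\nabla g(x)\|_2^2,$$
and the terms that do not involve $z$ do not change the argmin.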
- Proximal mapping
The proximal mapping (or prox-operator) of a convex function $h$ is defined as
$$\operatorname{prox}_h(x) = \operatorname{argmin}_z \; \frac{1}{2}\|x - z\|_2^2 + h(z).$$
Examples:
▶ $h(x) = 0$: $\operatorname{prox}_h(x) = x$.
▶ $h(x)$ is the indicator function of a closed convex set $C$: $\operatorname{prox}_h$ is the projection onto $C$,
$$\operatorname{prox}_h(x) = \operatorname{argmin}_{z \in C} \frac{1}{2}\|x - z\|_2^2 = P_C(x).$$
▶ $h(x) = \|x\|_1$: $\operatorname{prox}_h$ is the 'soft-threshold' (shrinkage) operation,
$$\operatorname{prox}_h(x)_i = \begin{cases} x_i - 1 & x_i \ge 1 \\ 0 & |x_i| \le 1 \\ x_i + 1 & x_i \le -1. \end{cases}$$
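A minimal Python sketch of these three prox-operators; the box $C = [lo, hi]^n$ used for the projection example is an illustrative assumption (any closed convex set with a computable projection would do):

```python
import numpy as np

def prox_zero(x):
    """prox of h = 0 is the identity."""
    return x

def prox_box(x, lo, hi):
    """prox of the indicator of the box C = [lo, hi]^n is the projection onto C."""
    return np.clip(x, lo, hi)

def prox_l1(x, lam=1.0):
    """prox of h(x) = lam * ||x||_1 is soft-thresholding at level lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([2.0, -0.5, -3.0])
print(prox_zero(x))        # [ 2.  -0.5 -3. ]
print(prox_box(x, -1, 1))  # [ 1.  -0.5 -1. ]
print(prox_l1(x))          # [ 1.   0.  -2. ]
```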
- Proximal mapping
Theorem
If $h$ is convex and closed (has closed epigraph), then
$$\operatorname{prox}_h(x) = \operatorname{argmin}_z \; \frac{1}{2}\|x - z\|_2^2 + h(z)$$
exists and is unique for all $x$.
Proof.
See http://www.seas.ucla.edu/~vandenbe/236C/lectures/proxop.pdf. Uniqueness holds since the objective function is strictly convex.
Optimality condition:
$$z = \operatorname{prox}_h(x) \;\Leftrightarrow\; x - z \in \partial h(z) \;\Leftrightarrow\; h(u) \ge h(z) + (x - z)^T(u - z), \;\; \forall u.$$
- Properties of proximal mapping
Theorem
Proximal mappings are firmly nonexpansive (co-coercive with constant 1):
$$\big(\operatorname{prox}_h(x) - \operatorname{prox}_h(y)\big)^T (x - y) \ge \|\operatorname{prox}_h(x) - \operatorname{prox}_h(y)\|_2^2.$$
Proof.
With $u = \operatorname{prox}_h(x)$ and $v = \operatorname{prox}_h(y)$ we have $x - u \in \partial h(u)$ and $y - v \in \partial h(v)$. From the monotonicity of the subdifferential we get
$$\big((x - u) - (y - v)\big)^T (u - v) \ge 0,$$
which rearranges to the claimed inequality.
From firm nonexpansiveness and the Cauchy-Schwarz inequality we get nonexpansiveness (Lipschitz continuity with constant 1):
$$\|\operatorname{prox}_h(x) - \operatorname{prox}_h(y)\|_2 \le \|x - y\|_2.$$
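As a quick numerical sanity check (added here, not part of the slides), we can test the firm nonexpansiveness inequality for the soft-thresholding prox from the earlier example on random pairs of points:

```python
import numpy as np

def prox_l1(x, lam=1.0):
    """Soft-thresholding: the prox of lam * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    px, py = prox_l1(x), prox_l1(y)
    # firm nonexpansiveness: <px - py, x - y> >= ||px - py||^2
    assert (px - py) @ (x - y) >= (px - py) @ (px - py) - 1e-12
print("firm nonexpansiveness verified on 1000 random pairs")
```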
- Proximal gradient descent
Proximal gradient descent: choose an initial $x^{(0)}$ and repeat
$$x^{(k)} = \operatorname{prox}_{t_k h}\big(x^{(k-1)} - t_k \cdot \nabla g(x^{(k-1)})\big), \quad k = 1, 2, 3, \ldots$$
To make this update step look familiar, we can rewrite it as
$$x^{(k)} = x^{(k-1)} - t_k \cdot G_{t_k}(x^{(k-1)}),$$
where $G_t$ is the generalized gradient of $f$,
$$G_t(x) = \frac{x - \operatorname{prox}_{th}\big(x - t\nabla g(x)\big)}{t}.$$
For $h = 0$ it is gradient descent.
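A minimal Python sketch of this iteration (the fixed step size and the function signatures are assumptions for illustration):

```python
import numpy as np

def proximal_gradient(grad_g, prox_th, x0, t, steps=500):
    """Proximal gradient descent with fixed step size t.

    grad_g(x):     gradient of the smooth part g
    prox_th(x, t): proximal mapping of t*h evaluated at x
    """
    x = x0.copy()
    for _ in range(steps):
        x = prox_th(x - t * grad_g(x), t)  # gradient step on g, then prox of t*h
    return x
```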
- Examples
Minimize $g(x) + h(x)$.
Gradient method: special case with $h(x) = 0$,
$$x^+ = x - t\nabla g(x).$$
Gradient projection method: special case with $h(x) = \delta_C(x)$ (indicator of $C$),
$$x^+ = P_C\big(x - t\nabla g(x)\big).$$
[Figure: the gradient step $x - t\nabla g(x)$ may leave $C$; projecting it back onto $C$ gives $x^+$.]
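As a usage example (an added illustration reusing the `proximal_gradient` sketch above), the gradient projection method is obtained by taking the prox to be a projection; here we minimize a simple quadratic over a box:

```python
import numpy as np
# assumes proximal_gradient from the sketch above is in scope

# Minimize g(x) = 0.5*||x - c||^2 over the box C = [-1, 1]^3
c = np.array([2.0, 0.3, -5.0])
x = proximal_gradient(grad_g=lambda x: x - c,                      # gradient of g
                      prox_th=lambda x, t: np.clip(x, -1.0, 1.0),  # projection onto C
                      x0=np.zeros(3), t=0.5)
print(x)  # the projection of c onto the box: [ 1.   0.3 -1. ]
```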
- What good did this do?
You have a right to be suspicious... it may look like we just swapped one minimization problem for another. The key point is that $\operatorname{prox}_h(\cdot)$ can be computed analytically for a lot of important functions $h$ (see http://www.seas.ucla.edu/~vandenbe/236C/lectures/proxop.pdf).
Note:
▶ The mapping $\operatorname{prox}_h(\cdot)$ doesn't depend on $g$ at all, only on $h$.
▶ The smooth part $g$ can be complicated; we only need to compute its gradients.
Convergence analysis will be in terms of the number of iterations of the algorithm. Each iteration evaluates $\operatorname{prox}_h(\cdot)$ once, and this can be cheap or expensive depending on $h$.
- Example: ISTA (Iterative Shrinkage-Thresholding Algorithm)
Given $y \in \mathbb{R}^n$, $X \in \mathbb{R}^{n \times p}$, recall the lasso criterion
$$f(\beta) = \underbrace{\frac{1}{2}\|y - X\beta\|_2^2}_{g(\beta)} + \underbrace{\lambda\|\beta\|_1}_{h(\beta)}.$$
The proximal mapping is now
$$\operatorname{prox}_{th}(\beta) = \operatorname{argmin}_z \; \frac{1}{2t}\|\beta - z\|_2^2 + \lambda\|z\|_1 = S_{\lambda t}(\beta),$$
where $S_\lambda(\beta)$ is the soft-thresholding operator,
$$[S_\lambda(\beta)]_i = \begin{cases} \beta_i - \lambda & \text{if } \beta_i > \lambda \\ 0 & \text{if } -\lambda \le \beta_i \le \lambda \\ \beta_i + \lambda & \text{if } \beta_i < -\lambda, \end{cases} \quad i = 1, \ldots, p.$$
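A minimal ISTA sketch assembling the pieces above; the synthetic data and the step size $t = 1/\|X\|_2^2$ (the reciprocal of the Lipschitz constant of $\nabla g$) are assumptions for this illustration:

```python
import numpy as np

def ista(X, y, lam, steps=500):
    """ISTA for the lasso: min 0.5*||y - X b||^2 + lam*||b||_1."""
    t = 1.0 / np.linalg.norm(X, 2) ** 2   # step 1/L, with L = sigma_max(X)^2
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ beta - y)       # gradient of the smooth part g
        z = beta - t * grad               # gradient step
        # prox of t*h: soft-threshold at level lam*t, i.e. S_{lam*t}(z)
        beta = np.sign(z) * np.maximum(np.abs(z) - lam * t, 0.0)
    return beta

# Synthetic example with a sparse ground truth
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -1.0, 0.5]
y = X @ beta_true + 0.1 * rng.normal(size=100)
print(np.round(ista(X, y, lam=0.5), 2))  # recovers a sparse estimate
```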