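Second-order methods such as Newton's method converge quadratically near the optimum, so they need far fewer iterations than first-order methods, but each iteration is expensive: solving the n×n Hessian system costs O(n³)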


Second-order methods (e.g. Newton's method) converge faster but are expensive: O(n³) per step
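The trade-off can be sketched on a small test problem. Everything below is an illustrative assumption rather than part of the original card: the objective f(x, y) = x⁴ + (x − 1)² + 2y², the starting point, the learning rate, and the helper names. The linear solve is a plain Gaussian elimination, which is where the O(n³) per-step cost comes from.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting, O(n^3)."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Assumed test objective: f(x, y) = x^4 + (x - 1)^2 + 2 y^2 (convex).
def gradient(p):
    x, y = p
    return [4 * x**3 + 2 * (x - 1), 4 * y]


def hessian(p):
    x, _ = p
    return [[12 * x**2 + 2, 0.0], [0.0, 4.0]]


def norm(v):
    return sum(c * c for c in v) ** 0.5


def newton(p, tol=1e-10, max_steps=100):
    """Newton's method: each step solves H(p) s = grad(p), then p <- p - s."""
    steps = 0
    while norm(gradient(p)) > tol and steps < max_steps:
        s = solve_linear(hessian(p), gradient(p))
        p = [pi - si for pi, si in zip(p, s)]
        steps += 1
    return p, steps


def gradient_descent(p, lr=0.02, tol=1e-10, max_steps=10_000):
    """First-order baseline: cheap steps, but many more of them."""
    steps = 0
    while norm(gradient(p)) > tol and steps < max_steps:
        p = [pi - lr * gi for pi, gi in zip(p, gradient(p))]
        steps += 1
    return p, steps


if __name__ == "__main__":
    print("Newton:", newton([2.0, 1.0]))
    print("Gradient descent:", gradient_descent([2.0, 1.0]))
```

On this problem Newton's method should reach the tolerance in far fewer iterations than gradient descent, while paying for a linear solve at every step.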


Related concepts

Finite element method

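The classical fourth-order Runge-Kutta method (RK4) improves on Euler's method by combining four slope evaluations k₁, k₂, k₃, k₄ per step, giving fourth-order accuracy instead of Euler's first-order accuracy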
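A minimal sketch of both steppers, assuming the test problem y' = y with y(0) = 1 (exact solution e at t = 1). The function names and the choice of 10 steps are illustrative, not from the original card.

```python
import math


def euler_step(f, t, y, h):
    """Euler: a single slope evaluation at the start of the step."""
    return y + h * f(t, y)


def rk4_step(f, t, y, h):
    """Classical RK4: weighted average of four slopes k1..k4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(step, f, y0, t_end, n):
    """Advance from t = 0 to t_end in n equal steps with the given stepper."""
    h = t_end / n
    t, y = 0.0, y0
    for _ in range(n):
        y = step(f, t, y, h)
        t += h
    return y


if __name__ == "__main__":
    f = lambda t, y: y  # assumed test ODE: y' = y
    for name, step in [("Euler", euler_step), ("RK4", rk4_step)]:
        approx = integrate(step, f, 1.0, 1.0, 10)
        print(name, "error:", abs(approx - math.e))
```

With the same step size, the RK4 error should be several orders of magnitude smaller than the Euler error, at the cost of four function evaluations per step instead of one.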

What iterative methods (CG, GMRES) do: solve Ax = b without explicitly inverting A

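Conjugate gradient (CG) and GMRES solve Ax = b iteratively using only matrix-vector products, so A is never explicitly inverted; CG requires A to be symmetric positive definite, while GMRES handles general nonsingular matrices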
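A minimal pure-Python sketch of conjugate gradient, assuming A is symmetric positive definite and given as a list of rows. The helper names, tolerance, and the 2×2 example system are illustrative assumptions; note that A only ever appears inside a matrix-vector product.

```python
def matvec(A, x):
    """Matrix-vector product for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = b[:]            # residual b - A x, which equals b when x = 0
    p = r[:]            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)          # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # Next direction: new residual plus a multiple of the old direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))  # close to [1/11, 7/11]
```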

What approximation algorithms guarantee: a solution within a factor α of optimal

Approximation algorithms guarantee a solution within a factor α of the optimal solution
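One standard instance of such a guarantee is the matching-based algorithm for minimum vertex cover, which has α = 2. The sketch below is illustrative: the function names and the example graph are assumptions, and the brute-force optimum is included only to check the factor on a tiny input.

```python
from itertools import combinations


def vertex_cover_2approx(edges):
    """Pick any uncovered edge and add both endpoints; repeat.

    The chosen edges form a matching, and any cover needs at least one
    endpoint per matched edge, so the result is at most twice the optimum.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


def min_vertex_cover_size(edges):
    """Exponential-time brute force, usable only on tiny graphs."""
    vertices = sorted({v for e in edges for v in e})
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return k
    return len(vertices)


if __name__ == "__main__":
    edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
    approx = vertex_cover_2approx(edges)
    print(len(approx), "vs optimum", min_vertex_cover_size(edges))
```

On this example the approximate cover has 4 vertices and the optimum has 2, so the factor-2 bound is met exactly.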

Overlapping subproblems

Dynamic programming handles overlapping subproblems by storing each subproblem's result (memoization or tabulation), so every subproblem is computed only once
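Fibonacci numbers are a common way to see the effect. In this sketch (the function names and the call counter are illustrative assumptions), the naive recursion recomputes the same subproblems many times, while the memoized version computes each one once.

```python
from functools import lru_cache


def fib_naive(n, counter):
    """Plain recursion; counter[0] records how many calls were made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recurrence, but each result is cached after its first computation."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


if __name__ == "__main__":
    calls = [0]
    print(fib_naive(20, calls), "with", calls[0], "calls")
    print(fib_memo(20), "with", fib_memo.cache_info().misses, "distinct subproblems")
```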

What the momentum term does: v_t = βv_{t-1} + ∇L accumulates the gradient direction over steps

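The momentum term accumulates an exponentially weighted sum of past gradients, accelerating convergence along directions where gradients consistently agree and damping oscillations where they alternate in sign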
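A small sketch of the update v_t = βv_{t-1} + ∇L on an ill-conditioned quadratic. The parameter step θ ← θ − η·v_t, the learning rate η = 0.02, β = 0.9, and the loss L = ½(x² + 50y²) are assumptions made for illustration; setting β = 0 recovers plain gradient descent.

```python
def grad(theta):
    """Gradient of the assumed loss L = 0.5 * (x^2 + 50 * y^2)."""
    x, y = theta
    return [x, 50.0 * y]


def minimize(theta, lr=0.02, beta=0.9, tol=1e-6, max_steps=10_000):
    """Gradient descent with momentum; returns (theta, steps taken)."""
    v = [0.0] * len(theta)
    for step in range(max_steps):
        g = grad(theta)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            return theta, step
        # v_t = beta * v_{t-1} + grad L  (velocity accumulates gradients)
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        # theta <- theta - lr * v_t  (assumed parameter update)
        theta = [ti - lr * vi for ti, vi in zip(theta, v)]
    return theta, max_steps


if __name__ == "__main__":
    print("plain GD steps:", minimize([1.0, 1.0], beta=0.0)[1])
    print("momentum steps:", minimize([1.0, 1.0], beta=0.9)[1])
```

On this loss the momentum run should need noticeably fewer steps, because the slow, shallow x direction benefits from the accumulated velocity.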

Greedy vs dynamic programming: greedy makes locally optimal choices; DP considers all subproblems

Greedy: commits to the locally optimal choice at each step and never revisits it; DP: solves all relevant subproblems and combines their optimal solutions
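Coin change is a compact example where the two strategies disagree. The coin set [1, 3, 4], the target 6, and the function names below are illustrative assumptions.

```python
def greedy_coins(coins, amount):
    """Always take as many of the largest remaining coin as possible."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None


def dp_coins(coins, amount):
    """best[a] = fewest coins that sum to a, built up from smaller amounts."""
    inf = float("inf")
    best = [0] + [inf] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] != inf else None


if __name__ == "__main__":
    print(greedy_coins([1, 3, 4], 6))  # 3 coins: 4 + 1 + 1
    print(dp_coins([1, 3, 4], 6))      # 2 coins: 3 + 3
```

Greedy happens to be optimal for some coin systems, such as [1, 5, 10, 25], but it carries no guarantee in general, whereas the DP table always yields the minimum.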
