pypose.optim.strategy.Adaptive
class pypose.optim.strategy.Adaptive(damping=1e-06, high=0.5, low=0.001, up=2.0, down=0.5, min=1e-06, max=1e+16)
Adaptive damping strategy used in the Levenberg-Marquardt (LM) algorithm.
It scales the damping factor down if the optimization step is very successful, keeps the factor unchanged if the step is successful, and scales the factor up if the step is unsuccessful.
\[\begin{aligned}
&\rule{113mm}{0.4pt} \\
&\textbf{input}: \lambda ~\text{(damping)}, \bm{f}(\bm{\theta})~(\text{model}), \theta_h~(\text{high}), \theta_l~(\text{low}), \delta_u~(\text{up}), \delta_d~(\text{down}), \\
&\hspace{13mm} \epsilon_{s}~(\text{min}), \epsilon_{l}~(\text{max}) \\
&\rule{113mm}{0.4pt} \\
&\rho = \frac{\|\bm{f}(\bm{\theta})\|^2 - \|\bm{f}(\bm{\theta} + \delta)\|^2}{\|\bm{f}(\bm{\theta})\|^2 - \|\bm{f}(\bm{\theta}) + \mathbf{J}\delta\|^2} ~\text{(step quality)} \\
&\textbf{if} ~~ \rho > \theta_h ~\text{(``very successful'' step)} \\
&\hspace{5mm} \lambda \leftarrow \delta_d \cdot \lambda \\
&\textbf{elif} ~~ \rho > \theta_l ~\text{(``successful'' step)} \\
&\hspace{5mm} \lambda \leftarrow \lambda \\
&\textbf{else} ~\text{(``unsuccessful'' step)} \\
&\hspace{5mm} \lambda \leftarrow \delta_u \cdot \lambda \\
&\lambda \leftarrow \mathrm{min}(\mathrm{max}(\lambda, \epsilon_{s}), \epsilon_{l}) \\
&\rule{113mm}{0.4pt} \\[-1.ex]
&\textbf{return} \: \lambda \\[-1.ex]
&\rule{113mm}{0.4pt} \\[-1.ex]
\end{aligned}
\]
Parameters
damping (float, optional) – initial damping factor of LM optimizer. Default: 1e-6.
high (float, optional) – high threshold for scaling down the damping factor. Default: 0.5.
low (float, optional) – low threshold for scaling up the damping factor. Default: 1e-3.
up (float, optional) – the up scaling factor in the range of \((1,\infty)\). Default: 2.0.
down (float, optional) – the down scaling factor in the range of \((0,1)\). Default: 0.5.
min (float, optional) – the lower-bound of damping factor. Default: 1e-6.
max (float, optional) – the upper-bound of damping factor. Default: 1e16.
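The update rule above fits in a few lines of Python. The following is a minimal sketch for illustration only (the helper name `update_damping` and its argument names are hypothetical, not the library's implementation):

>>> def update_damping(lam, rho, high=0.5, low=1e-3, up=2.0, down=0.5,
...                    min_=1e-6, max_=1e16):
...     if rho > high:    # "very successful" step: relax the damping
...         lam = down * lam
...     elif rho > low:   # "successful" step: keep the damping unchanged
...         pass
...     else:             # "unsuccessful" step: increase the damping
...         lam = up * lam
...     # clamp the result to the bounds [min_, max_]
...     return min(max(lam, min_), max_)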
For more details about the optimization process, see
pypose.optim.LevenbergMarquardt().
Note
For efficiency, we calculate the denominator of the step quality \(\rho\) as
\[\begin{aligned}
\|\bm{f}(\bm{\theta})\|^2 - \|\bm{f}(\bm{\theta}) + \mathbf{J}\delta\|^2
&= \bm{f}^T\bm{f} - \left( \bm{f}^T\bm{f} + 2 \bm{f}^T \mathbf{J} \delta + \delta^T \mathbf{J}^T\mathbf{J}\delta \right) \\
&= -(\mathbf{J} \delta)^T (2\bm{f} + \mathbf{J}\delta),
\end{aligned}
\]
where \(\mathbf{J}\) is the Jacobian of the model \(\bm{f}\) at the evaluation point \(\bm{\theta}\).
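The two forms can be checked numerically. A small sketch with random tensors, where J, f, and d are arbitrary stand-ins for the Jacobian, residual, and step (not library code):

>>> import torch
>>> J, f, d = torch.randn(5, 3), torch.randn(5), torch.randn(3)
>>> direct = f.dot(f) - (f + J @ d).dot(f + J @ d)  # expanded form
>>> fast = -(J @ d).dot(2 * f + J @ d)              # factored form
>>> torch.allclose(direct, fast)  # equal up to floating point
True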
Example
>>> import torch
>>> from torch import nn
>>> import pypose as pp
>>> class PoseInv(nn.Module):
...     def __init__(self, *dim):
...         super().__init__()
...         self.pose = pp.Parameter(pp.randn_SE3(*dim))
...
...     def forward(self, inputs):
...         return (self.pose @ inputs).Log().tensor()
...
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>> inputs = pp.randn_SE3(2, 2).to(device)
>>> invnet = PoseInv(2, 2).to(device)
>>> strategy = pp.optim.strategy.Adaptive(damping=1e-6)
>>> optimizer = pp.optim.LM(invnet, strategy=strategy)
>>> for idx in range(10):
...     loss = optimizer.step(inputs)
...     print('Pose loss %.7f @ %dit' % (loss, idx))
...     if loss < 1e-5:
...         print('Early Stopping!')
...         print('Optimization Early Done with loss:', loss.item())
...         break
Pose loss 0.0000000 @ 0it
Early Stopping!
Optimization Early Done with loss: 9.236661990819073e-10