pypose.optim.strategy.TrustRegion
- class pypose.optim.strategy.TrustRegion(radius=1000000.0, high=0.5, low=0.001, up=2.0, down=0.5, factor=0.5, min=1e-06, max=1e+16)
The trust region (TR) algorithm used in the Levenberg-Marquardt (LM) algorithm.
\[\begin{aligned}
&\rule{113mm}{0.4pt} \\
&\textbf{input}: \Delta ~\text{(radius)}, \bm{f}(\bm{\theta})~(\text{model}), \theta_h~(\text{high}), \theta_l~(\text{low}), \delta_u~(\text{up}), \delta_d~(\text{down}), \\
&\hspace{13mm} \sigma~(\text{factor}), \epsilon_{s}~(\text{min}), \epsilon_{l}~(\text{max}) \\
&\rule{113mm}{0.4pt} \\
& \rho = \frac{ \|\bm{f}(\bm{\theta})\|^2 - \|\bm{f}(\bm{\theta} + \delta)\|^2} {\|\bm{f}(\bm{\theta})\|^2 - \|\bm{f}(\bm{\theta}) + \mathbf{J}\delta\|^2} ~\text{(step quality)} \\
&\textbf{if} ~~ \rho > \theta_h ~ \text{(``very successful'' step)} \\
&\hspace{5mm} \Delta \leftarrow \delta_u \cdot \Delta \\
&\hspace{5mm} \delta_d \leftarrow \delta_d^{\text{init}} \\
&\textbf{elif} ~~ \rho > \theta_l ~ \text{(``successful'' step)} \\
&\hspace{5mm} \Delta \leftarrow \Delta \\
&\hspace{5mm} \delta_d \leftarrow \delta_d^{\text{init}} \\
&\textbf{else} ~ \text{(``unsuccessful'' step)} \\
&\hspace{5mm} \Delta \leftarrow \delta_d \cdot \Delta \\
&\hspace{5mm} \delta_d \leftarrow \sigma \cdot \delta_d \\
&\Delta \leftarrow \mathrm{min}(\mathrm{max}(\Delta, \epsilon_{s}), \epsilon_{l}) \\
&\delta_d \leftarrow \mathrm{min}(\mathrm{max}(\delta_d, \epsilon_{s}), \epsilon_{l})\\
&\rule{113mm}{0.4pt} \\[-1.ex]
&\textbf{return} \: \Delta, \delta_d \\[-1.ex]
&\rule{113mm}{0.4pt} \\[-1.ex]
\end{aligned} \]
- Parameters
radius (float, optional) – the initial radius of the trust region (positive number). Default: 1e6.
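high (float, optional) – high threshold on the step quality \(\rho\); above it the trust region radius is scaled up, which corresponds to scaling down the damping factor. Default: 0.5.
low (float, optional) – low threshold on the step quality \(\rho\); at or below it the trust region radius is scaled down, which corresponds to scaling up the damping factor. Default: 1e-3.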
up (float, optional) – the up scaling factor in the range of \((1,\infty)\). Default: 2.0.
down (float, optional) – the initial down scaling factor in the range of \((0,1)\). Default: 0.5.
factor (float, optional) – multiplier \(\sigma\) applied to the down scaling factor after each unsuccessful step, so that consecutive unsuccessful steps shrink the trust region exponentially. Default: 0.5.
min (float, optional) – the lower bound of the trust region radius and the down scaling factor. Default: 1e-6.
max (float, optional) – the upper bound of the trust region radius and the down scaling factor. Default: 1e16.
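The update rule in the algorithm table above can be sketched in plain Python. This is only an illustration of the table, not PyPose's implementation: the function name `trust_region_update` and the keyword names `down_init`, `min_`, and `max_` are hypothetical (the trailing underscores avoid shadowing Python built-ins), and the step quality `rho` is assumed to be computed elsewhere.

```python
def trust_region_update(rho, radius, down, *, down_init=0.5, high=0.5,
                        low=1e-3, up=2.0, factor=0.5, min_=1e-6, max_=1e16):
    """Return the updated (radius, down) pair for one step quality rho.

    Hypothetical helper that mirrors the algorithm table; the defaults
    copy the documented defaults of TrustRegion.
    """
    if rho > high:
        # "Very successful" step: enlarge the region, reset the down factor.
        radius = up * radius
        down = down_init
    elif rho > low:
        # "Successful" step: keep the radius, reset the down factor.
        down = down_init
    else:
        # "Unsuccessful" step: shrink the region, and shrink faster next time.
        radius = down * radius
        down = factor * down

    def clamp(x):
        # Keep both quantities inside [min_, max_].
        return min(max(x, min_), max_)

    return clamp(radius), clamp(down)


# Usage: two unsuccessful steps in a row shrink the radius by 0.5, then 0.25.
radius, down = 2.0, 0.5
radius, down = trust_region_update(-0.2, radius, down)  # radius 1.0, down 0.25
radius, down = trust_region_update(-0.2, radius, down)  # radius 0.25, down 0.125
```

Because the down factor is itself multiplied by `factor` after every unsuccessful step, a run of rejected steps shrinks the radius geometrically faster, while any accepted step resets the down factor to its initial value.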
For more details about the optimization process, see
pypose.optim.LevenbergMarquardt().
Note
This implementation is an improved version of TR in Ceres.
For efficiency, we calculate the denominator of the step quality \(\rho\) as
\[\begin{aligned}
& \|\bm{f}(\bm{\theta})\|^2 - \|\bm{f}(\bm{\theta}) + \mathbf{J}\delta\|^2 \\
& = \bm{f}^T\bm{f} - \left( \bm{f}^T\bm{f} + 2 \bm{f}^T \mathbf{J} \delta + \delta^T \mathbf{J}^T\mathbf{J}\delta \right) \\
& = -(\mathbf{J} \delta)^T (2\bm{f} + \mathbf{J}\delta)
\end{aligned} \]
where \(\mathbf{J}\) is the Jacobian of the model \(\bm{f}\) at evaluation point \(\bm{\theta}\).
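The identity for the denominator can be checked numerically. The sketch below uses plain Python lists and made-up values for the residual `f`, the Jacobian `J`, and the step `delta`; the helpers `matvec` and `dot` are hypothetical and exist only for this illustration.

```python
def matvec(J, d):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(a * b for a, b in zip(row, d)) for row in J]


def dot(a, b):
    # Inner product of two equal-length lists.
    return sum(x * y for x, y in zip(a, b))


f = [1.0, -2.0, 0.5]                        # residual f(theta), made-up values
J = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]   # Jacobian at theta, made-up values
delta = [0.25, -0.5]                        # candidate step

Jd = matvec(J, delta)

# Direct form: ||f||^2 - ||f + J delta||^2
linearized = [fi + ji for fi, ji in zip(f, Jd)]
direct = dot(f, f) - dot(linearized, linearized)

# Shortcut form: -(J delta)^T (2 f + J delta)
shortcut = -dot(Jd, [2.0 * fi + ji for fi, ji in zip(f, Jd)])

print(direct, shortcut)  # both are 1.9375 for these values
```

The shortcut needs only the product \(\mathbf{J}\delta\) and the current residual, so the squared norm of the linearized residual never has to be formed separately.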
Example
>>> import torch
>>> import pypose as pp
>>> from torch import nn
>>> class PoseInv(nn.Module):
...     def __init__(self, *dim):
...         super().__init__()
...         self.pose = pp.Parameter(pp.randn_SE3(*dim))
...
...     def forward(self, inputs):
...         # Residual: Lie algebra log of the composed pose
...         return (self.pose @ inputs).Log().tensor()
...
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>> inputs = pp.randn_SE3(2, 2).to(device)
>>> invnet = PoseInv(2, 2).to(device)
>>> strategy = pp.optim.strategy.TrustRegion(radius=1e6)
>>> optimizer = pp.optim.LM(invnet, strategy=strategy)
>>> for idx in range(10):
...     loss = optimizer.step(inputs)
...     print('Pose loss %.7f @ %dit' % (loss, idx))
...     if loss < 1e-5:
...         print('Early Stopping!')
...         print('Optimization Early Done with loss:', loss.item())
...         break
Pose loss 0.0000000 @ 0it
Early Stopping!
Optimization Early Done with loss: 7.462681583803032e-10