class netket.optimizer.AdaMax

AdaMax is an adaptive stochastic gradient descent method and a variant of Adam based on the infinity norm. In contrast to plain SGD, AdaMax has the important advantage of being much less sensitive to the choice of hyper-parameters (for example, the learning rate).

Given a stochastic estimate of the gradient of the cost function $$G(\mathbf{p})$$, AdaMax performs an update:

$$p^{\prime}_k = p_k + \mathcal{S}_k,$$

where $$\mathcal{S}_k$$ implicitly depends on the entire history of the optimization up to the current point. The NetKet naming convention for the parameters strictly follows the one introduced by the authors of AdaMax. For an in-depth description of this method, please refer to Kingma, D., & Ba, J. (2015), Adam: a method for stochastic optimization (Algorithm 2 therein).
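The update $$\mathcal{S}_k$$ can be made concrete with a minimal NumPy sketch of Algorithm 2 from Kingma & Ba. This is an illustrative re-implementation, not NetKet's actual (C++) code; the parameter names mirror the constructor below, and the role assigned to epscut (a lower bound on the infinity-norm accumulator to avoid division by zero) is an assumption:

```python
import numpy as np

class AdaMaxSketch:
    """Illustrative sketch of the AdaMax update (Kingma & Ba, Algorithm 2).

    Hypothetical re-implementation for exposition; NetKet's real optimizer
    lives in its compiled core.
    """

    def __init__(self, alpha=0.001, beta1=0.9, beta2=0.999, epscut=1e-7):
        self.alpha = alpha    # step size
        self.beta1 = beta1    # first exponential decay rate
        self.beta2 = beta2    # second exponential decay rate
        self.epscut = epscut  # small cutoff (assumed to guard the division)
        self.m = None         # biased first-moment estimate
        self.u = None         # exponentially weighted infinity norm
        self.t = 0            # step counter

    def update(self, grad, params):
        grad = np.asarray(grad, dtype=float)
        if self.m is None:
            self.m = np.zeros_like(grad)
            self.u = np.zeros_like(grad)
        self.t += 1
        # First-moment estimate with decay rate beta1.
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        # Infinity-norm accumulator: elementwise max of decayed history
        # and the current gradient magnitude.
        self.u = np.maximum(self.beta2 * self.u, np.abs(grad))
        # Bias-corrected step; epscut prevents division by zero.
        step = (self.alpha / (1 - self.beta1 ** self.t)) \
            * self.m / np.maximum(self.u, self.epscut)
        return params - step
```

One step on a quadratic cost (gradient $$2p$$) starting from $$p = 1$$ moves the parameter by $$\alpha \cdot 0.1 = 0.001$$, illustrating how the step size is capped by alpha regardless of the gradient's magnitude.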

__init__(self: netket._C_netket.optimizer.AdaMax, alpha: float = 0.001, beta1: float = 0.9, beta2: float = 0.999, epscut: float = 1e-07) → None

Constructs a new AdaMax optimizer.

Parameters
• alpha – The step size.

• beta1 – First exponential decay rate.

• beta2 – Second exponential decay rate.

• epscut – Small epsilon cutoff.

Examples

>>> from netket.optimizer import AdaMax
>>> op = AdaMax(alpha=0.001)


Methods

__init__(self, alpha, beta1, beta2, epscut)
    Constructs a new AdaMax optimizer.

init(self, arg0, arg1)

reset(self)
    Member function resetting the internal state of the optimizer.

update(*args, **kwargs)
    Overloaded function.
init(self: netket._C_netket.optimizer.Optimizer, arg0: int, arg1: bool) → None
reset(self: netket._C_netket.optimizer.Optimizer) → None

Member function resetting the internal state of the optimizer.

update(*args, **kwargs)

Overloaded function.