class netket.optimizer.AdaDelta

AdaDelta Optimizer. Like RMSProp, AdaDelta corrects the monotonic decay of learning rates associated with AdaGrad, while additionally eliminating the need to choose a global learning rate $$\eta$$. The parameter names in NetKet strictly follow those introduced in the original paper (Zeiler, 2012); here $$E[g^2]$$ is equivalent to the vector $$\mathbf{s}$$ from RMSProp. Both $$E[g^2]$$ and $$E[\Delta x^2]$$ are initialized as zero vectors.

$\begin{split}E[g^2]^\prime_k &= \rho\, E[g^2]_k + (1-\rho)\, G_k(\mathbf{p})^2\\ \Delta p_k &= - \frac{\sqrt{E[\Delta x^2]_k+\epsilon}}{\sqrt{E[g^2]^\prime_k+ \epsilon}}\, G_k(\mathbf{p})\\ E[\Delta x^2]^\prime_k &= \rho\, E[\Delta x^2]_k + (1-\rho)\,\Delta p_k^2\\ p^\prime_k &= p_k + \Delta p_k\end{split}$
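
For concreteness, here is a minimal NumPy sketch of this recurrence (an illustrative re-implementation, not NetKet's internal code; the name `adadelta_step` is hypothetical):

import numpy as np

def adadelta_step(p, grad, Eg2, Edx2, rho=0.95, epscut=1e-7):
    """One AdaDelta step; Eg2 and Edx2 hold E[g^2] and E[Delta x^2]."""
    Eg2 = rho * Eg2 + (1 - rho) * grad**2                        # E[g^2]'_k
    dp = -np.sqrt(Edx2 + epscut) / np.sqrt(Eg2 + epscut) * grad  # Delta p_k
    Edx2 = rho * Edx2 + (1 - rho) * dp**2                        # E[Delta x^2]'_k
    return p + dp, Eg2, Edx2                                     # p'_k = p_k + Delta p_k

# Toy usage: minimize f(p) = 0.5 ||p||^2, whose gradient is p itself.
p = np.array([1.0, -2.0])
Eg2, Edx2 = np.zeros_like(p), np.zeros_like(p)  # both initialized as zero vectors
for _ in range(1000):
    p, Eg2, Edx2 = adadelta_step(p, p, Eg2, Edx2)

Note how the effective step size adapts per component through the ratio of the two running averages, which is why no global learning rate appears.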
__init__(self: netket._C_netket.optimizer.AdaDelta, rho: float = 0.95, epscut: float = 1e-07) → None

Constructs a new AdaDelta optimizer.

Parameters
• rho – Exponential decay rate, in [0,1].

• epscut – Small $$\epsilon$$ cutoff.

Examples

>>> from netket.optimizer import AdaDelta
>>> op = AdaDelta(rho=0.95, epscut=1e-7)


Methods

__init__(self, rho, epscut)
    Constructs a new AdaDelta optimizer.
init(self, arg0, arg1)
reset(self)
    Member function resetting the internal state of the optimizer.
update(*args, **kwargs)
    Overloaded function.
init(self: netket._C_netket.optimizer.Optimizer, arg0: int, arg1: bool) → None
reset(self: netket._C_netket.optimizer.Optimizer) → None

Member function resetting the internal state of the optimizer.

update(*args, **kwargs)

Overloaded function.
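
The individual overload signatures are not expanded here. A doctest-style sketch of a single update, assuming the overloads take a gradient array followed by a writable parameter array (this argument order is an assumption, not confirmed by the listing above):

>>> import numpy as np
>>> from netket.optimizer import AdaDelta
>>> op = AdaDelta(rho=0.95, epscut=1e-7)
>>> pars = np.ones(4)       # current parameters p
>>> grad = 2.0 * pars       # gradient G(p) of the user's cost function
>>> op.update(grad, pars)   # assumed (gradient, parameters) order; pars is modified in place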