# Learning the Ground State

NetKet implements learning algorithms to find the ground state of a given many-body quantum Hamiltonian $\mathcal{H}$. Here, given a wave-function $\Psi(\theta)$ depending on a set of variational parameters $\theta$, the goal is to minimize the expectation value of the Hamiltonian:

$$E(\theta) = \frac{\langle \Psi(\theta) | \mathcal{H} | \Psi(\theta) \rangle}{\langle \Psi(\theta) | \Psi(\theta) \rangle}.$$

In NetKet, $E(\theta)$ is computed by means of stochastic estimates, and it is also minimized using stochastic estimates of its gradient. Further details can be found in References (1-3) listed below.

Given the estimates of $E(\theta)$ and of its gradient, there are different methods to update the network parameters at each learning iteration. Each of the different methods yields a direction $G_k(\theta)$, such that the parameters are updated according to:

$$\theta_k \rightarrow \theta_k + \Delta_k(G),$$

where $\Delta_k(G)$ is the parameter update given by the chosen `Optimizer`.
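
As an illustration (not NetKet's internal code), when plain stochastic gradient descent is the chosen Optimizer the update rule above reduces to a one-liner; `theta`, `G`, and the learning rate are assumed here to come from the surrounding optimization loop:

```python
import numpy as np

def sgd_update(theta, G, learning_rate=0.1):
    """One parameter update theta_k -> theta_k + Delta_k(G); for plain
    stochastic gradient descent the update is simply Delta_k = -eta * G_k."""
    return theta - learning_rate * G

theta = np.array([0.5, -0.3, 1.2])   # current variational parameters
G = np.array([0.1, -0.2, 0.05])      # estimated direction
theta = sgd_update(theta, G)
```

Other optimizers (e.g. `AdaMax` below) replace this simple rule with adaptive, per-parameter updates, but the overall iteration structure is the same.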

## Common Parameters

Independently of the specific optimization method chosen, you have to specify a set of parameters that control general aspects of the learning.

In addition to the choice of the `Method`, another crucial control parameter is the number of samples used to estimate $E(\theta)$ and its gradient.
The number of samples is fixed by the field `Nsamples`, which sets the number of sweeps that the `Sampler` should perform to sample from the probability distribution:

$$P(x) = \frac{|\Psi(x)|^2}{\langle \Psi | \Psi \rangle}.$$
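
For illustration only, a single-spin-flip Metropolis sweep sampling from $P(x) \propto |\Psi(x)|^2$ might look as follows; the toy `log_psi` below is an assumption, standing in for whatever machine NetKet actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_psi(x, w):
    """Toy log-amplitude of a mean-field state: log Psi(x) = sum_i w_i x_i.
    A real NetKet machine (e.g. an RBM) would replace this."""
    return np.dot(w, x)

def metropolis_sweep(x, w):
    """One sweep = one attempted single-spin flip per site, accepted with
    probability min(1, |Psi(x')/Psi(x)|^2)."""
    for i in range(len(x)):
        x_new = x.copy()
        x_new[i] *= -1  # flip spin i
        # |Psi(x')/Psi(x)|^2 = exp(2 * (log Psi(x') - log Psi(x)))
        ratio2 = np.exp(2.0 * (log_psi(x_new, w) - log_psi(x, w)))
        if rng.random() < ratio2:
            x = x_new
    return x

w = rng.normal(scale=0.1, size=8)
x = rng.choice([-1.0, 1.0], size=8)  # random initial spin configuration
for _ in range(100):  # Nsamples sweeps
    x = metropolis_sweep(x, w)
```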

The user must also specify a suitable `Optimizer`, which ultimately yields the parameter updates $\Delta_k(G)$ in combination with the direction $G$ given by `Method`, as defined above.
More information about the available Optimizers is found here.

Parameter | Possible values | Description | Default value |
---|---|---|---|
`Method` | `Sr` or `Gd` | The chosen method to learn the parameters of the wave-function | `Sr` |
`NiterOpt` | Integer | Number of optimization steps (epochs, in the Machine Learning parlance) | None |
`Nsamples` | Integer | Number of Markov Chain Monte Carlo sweeps to be performed at each step of the optimization | None |
`DiscardedSamples` | Integer | Number of sweeps to be discarded at the beginning of the sampling, at each step of the optimization | 10% of sweeps per CPU core |
`DiscardedSamplesOnInit` | Integer | Number of sweeps to be discarded at the beginning of the sampling, in the first step of the optimization | 0 |
`OutputFile` | String | The prefix for the output files (the log is stored in prefix.log, the wave-function is saved in prefix.wf) | None |
`SaveEvery` | Integer | The wave-function is saved every `SaveEvery` optimization steps | 50 |

## Gradient Descent

The simplest optimization method is Gradient Descent, obtained when the direction is taken along the gradient of the energy:

$$G_k = \frac{\partial E(\theta)}{\partial \theta_k},$$

where the gradient of the energy is estimated through

$$\frac{\partial E(\theta)}{\partial \theta_k} = \langle E_{\mathrm{loc}} O_k^* \rangle - \langle E_{\mathrm{loc}} \rangle \langle O_k^* \rangle,$$

where $O_k(x) = \frac{\partial \log \Psi(x; \theta)}{\partial \theta_k}$ are the log-derivatives of the wave-function, and

$$E_{\mathrm{loc}}(x) = \frac{\langle x | \mathcal{H} | \Psi \rangle}{\langle x | \Psi \rangle}$$

is the so-called *local energy*.

The expectation values above are computed stochastically, using samples from the probability distribution $P(x) = |\Psi(x)|^2 / \langle \Psi | \Psi \rangle$. See References (2-3) for further details.
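
The gradient estimator above can be sketched in a few lines of NumPy (illustrative only; the sample values below are made up):

```python
import numpy as np

def energy_gradient(O, E_loc):
    """Stochastic estimate of the energy gradient,
        G_k = <E_loc O_k^*> - <E_loc> <O_k^*>,
    from N samples. O has shape (N, n_params), with O[s, k] the log-derivative
    of the wave-function at sample s; E_loc has shape (N,)."""
    O_conj = np.conj(O)
    return (E_loc @ O_conj) / len(E_loc) - E_loc.mean() * O_conj.mean(axis=0)

# Toy data: 4 samples, 2 parameters (purely illustrative values).
O = np.array([[0.1 + 0.2j, -0.3], [0.0, 0.1j], [0.2, 0.4], [-0.1, 0.0]])
E_loc = np.array([-1.0, -0.5, -1.5, -0.8])
G = energy_gradient(O, E_loc)
```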

Parameter | Possible values | Description | Default value |
---|---|---|---|
None | None | None | None |

### Example

```python
pars = {}
pars['Learning'] = {
    'Method'     : 'Gd',
    'Nsamples'   : 1000,
    'NiterOpt'   : 1000,
    'OutputFile' : "test",
}
pars['Optimizer'] = {
    'Name' : 'AdaMax',
}
```

## Stochastic Reconfiguration

The method of choice in NetKet is the Stochastic Reconfiguration (`Sr`), developed by S. Sorella and coworkers. For an introduction to this method, you can have a look at the book (1).
In a nutshell, the vector $G$ is found as a solution of the following linear system:

$$\sum_{k^{\prime}} S_{k,k^{\prime}} \, G_{k^{\prime}} = \frac{\partial E(\theta)}{\partial \theta_k},$$

where the Hermitian (Gram) matrix reads:

$$S_{k,k^{\prime}} = \langle O_k^* O_{k^{\prime}} \rangle - \langle O_k^* \rangle \langle O_{k^{\prime}} \rangle.$$

In most cases, it is necessary to regularize the matrix $S$, in order to have a well-conditioned solution and avoid numerical instabilities. The regularization procedure implemented in NetKet is a simple diagonal shift:

$$S_{k,k} \rightarrow S_{k,k} + \lambda,$$

where the parameter $\lambda$ is chosen by the user through the field `DiagShift`.
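
The regularized linear system can be sketched with a dense NumPy solver (an illustration of the equations, not NetKet's implementation):

```python
import numpy as np

def sr_update(O, E_loc, diag_shift=0.01):
    """Solve (S + lambda*I) G = grad for the SR direction G, where
    S_{k,k'} = <O_k^* O_k'> - <O_k^*><O_k'> and grad is the energy gradient."""
    N = O.shape[0]
    O_mean = O.mean(axis=0)
    O_centered = O - O_mean
    # Covariance (Gram) matrix S, plus the diagonal regularization
    S = (O_centered.conj().T @ O_centered) / N
    S += diag_shift * np.eye(S.shape[0])
    # Energy gradient: G_k = <E_loc O_k^*> - <E_loc><O_k^*>
    grad = (E_loc @ O.conj()) / N - E_loc.mean() * O_mean.conj()
    return np.linalg.solve(S, grad)

# Toy data: 100 samples, 3 parameters (purely illustrative values).
rng = np.random.default_rng(1)
O = rng.normal(size=(100, 3)) + 1j * rng.normal(size=(100, 3))
E_loc = rng.normal(size=100) - 1.0
G = sr_update(O, E_loc, diag_shift=0.01)
```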

The Stochastic Reconfiguration is typically a more robust method than simple Gradient Descent; however, its computational cost is at least quadratic in the number of parameters to be optimized, at variance with the linear cost of Gradient Descent. In order to reduce the computational burden, NetKet implements an iterative Conjugate Gradient solver that finds $G$ without ever forming the matrix $S$ (which is the computational bottleneck of the algorithm).
The iterative solver can be activated with the flag `UseIterative`, and should be used when optimizing a very large number of parameters.
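
The matrix-free idea behind `UseIterative` can be illustrated with SciPy's Conjugate Gradient solver: only products $S v$ are needed, and these are obtained from the centered log-derivatives without ever building $S$ (random real-valued data here, purely for illustration):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(2)
n_samples, n_params = 200, 50
O = rng.normal(size=(n_samples, n_params))  # log-derivatives, one row per sample
grad = rng.normal(size=n_params)            # energy-gradient estimate
diag_shift = 0.01

O_c = O - O.mean(axis=0)                    # centered log-derivatives

def S_matvec(v):
    # (S + lambda*I) v = O_c^T (O_c v) / n_samples + lambda * v,
    # so the n_params x n_params matrix S is never formed explicitly.
    return O_c.T @ (O_c @ v) / n_samples + diag_shift * v

S_op = LinearOperator((n_params, n_params), matvec=S_matvec)
G, info = cg(S_op, grad)                    # info == 0 signals convergence
```

Each matrix-vector product costs $O(N \cdot n_{\mathrm{params}})$, which is why this route pays off when the number of parameters is large.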

Parameter | Possible values | Description | Default value |
---|---|---|---|
`DiagShift` | Double | The regularization parameter $\lambda$ for the Sr method | 0.01 |
`UseIterative` | Boolean | Whether to use the iterative Conjugate Gradient solver in the Sr method (extremely useful when the number of parameters to optimize is very large) | False |

### Example

```python
pars = {}
pars['Learning'] = {
    'Method'       : 'Sr',
    'Nsamples'     : 1000,
    'NiterOpt'     : 500,
    'DiagShift'    : 0.1,
    'UseIterative' : False,
    'OutputFile'   : "test",
}
pars['Optimizer'] = {
    'Name'         : 'Sgd',
    'LearningRate' : 0.1,
}
```

## References

1. Becca, F., & Sorella, S. (2017). *Quantum Monte Carlo Approaches for Correlated Systems*. Cambridge University Press.
2. Carleo, G., & Troyer, M. (2017). Solving the quantum many-body problem with artificial neural networks. *Science*, 355, 602.
3. Carleo, G. (2017). Lecture notes for the Advanced School on Quantum Science and Quantum Technology.
