# The nep.in input file

## Brief descriptions

• This file specifies the hyperparameters used for training a NEP (neuroevolution potential) model, which is a highly efficient machine-learned potential.
• The NEP approach was proposed in [Fan2021] and later improved in [Fan2022a] and [Fan2022b]; the versions described in these three papers are called NEP1, NEP2, and NEP3, respectively.
• We are writing a new paper [Someone202x], where we propose NEP4.
• Currently, we support NEP2, NEP3 and NEP4, which can be chosen by the version keyword as described below.
• Although the NEP approach was introduced in GPUMD-v2.6, here we only document GPUMD-v3.2 and later versions. Older versions are deprecated.

## Mathematical formulation

### The neural network model

NEP uses the simplest feedforward neural network (NN) to represent the site energy of atom $i$ as a function of a descriptor vector with $N_\mathrm{des}$ components, $$U_i(\mathbf{q}) = U_i \left(\{q^i_{\nu}\}_{\nu =1}^{N_\mathrm{des}}\right).$$ There is a single hidden layer with $N_\mathrm{neu}$ neurons in the NN and we have $$U_i = \sum_{\mu=1}^{N_\mathrm{neu}}w^{(1)}_{\mu}\tanh\left(\sum_{\nu=1}^{N_\mathrm{des}} w^{(0)}_{\mu\nu} q^i_{\nu} - b^{(0)}_{\mu}\right) - b^{(1)},$$ where $\tanh(x)$ is the activation function in the hidden layer, $\mathbf{w}^{(0)}$ is the connection weight matrix from the input layer (descriptor vector) to the hidden layer, $\mathbf{w}^{(1)}$ is the connection weight vector from the hidden layer to the output node, which is the energy $U_i$, $\mathbf{b}^{(0)}$ is the bias vector in the hidden layer, and $b^{(1)}$ is the bias for node $U_i$.
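As a minimal illustration, the single-hidden-layer model above can be sketched in a few lines of NumPy (the function name and array layout are illustrative; the production implementation in GPUMD is GPU code):

```python
import numpy as np

def site_energy(q, w0, b0, w1, b1):
    """Site energy U_i of one atom from its descriptor vector q.

    q  : (N_des,)        descriptor vector
    w0 : (N_neu, N_des)  input-to-hidden weight matrix
    b0 : (N_neu,)        hidden-layer bias vector
    w1 : (N_neu,)        hidden-to-output weight vector
    b1 : float           output bias
    """
    hidden = np.tanh(w0 @ q - b0)  # single hidden layer with tanh activation
    return w1 @ hidden - b1        # note: both biases enter with a minus sign
```

Note the minus signs on the biases, matching the formula above.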

### The descriptor

• The descriptor for atom $i$ consists of a number of radial and angular components as described below.
• The radial descriptor components are defined as

$$q^i_{n} = \sum_{j\neq i} g_{n}(r_{ij}) \quad\text{with}\quad 0\leq n\leq n_\mathrm{max}^\mathrm{R},$$ where the summation runs over all the neighbors of atom $i$ within a certain cutoff distance. Clearly, there are $n_\mathrm{max}^\mathrm{R}+1$ radial descriptor components.

• For the angular descriptor components, we consider 3-body to 5-body ones. The formulation is similar but not identical to the ACE (atomic cluster expansion) approach [Drautz2019]. For 3-body ones, we define ($0\leq n\leq n_\mathrm{max}^\mathrm{A}$, $1\leq l \leq l_\mathrm{max}^\mathrm{3b}$)

$$q^i_{nl} = \sum_{m=-l}^l (-1)^m A^i_{nlm} A^i_{nl(-m)},$$ where $$A^i_{nlm} = \sum_{j\neq i} g_n(r_{ij}) Y_{lm}(\theta_{ij},\phi_{ij}),$$ and $Y_{lm}(\theta_{ij},\phi_{ij})$ are the spherical harmonics as a function of the polar angle $\theta_{ij}$ and the azimuthal angle $\phi_{ij}$. For expressions of the 4-body and 5-body descriptor components, we refer to [Fan2022b].
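Since $Y_{l(-m)} = (-1)^m Y_{lm}^*$, the sum over $m$ can be evaluated without complex spherical harmonics by the addition theorem, $\sum_m Y_{lm}(\hat{r}_{ij}) Y^*_{lm}(\hat{r}_{ik}) = \frac{2l+1}{4\pi} P_l(\cos\theta_{jik})$, turning $q^i_{nl}$ into a double sum over neighbor pairs. A NumPy sketch for fixed $n$ and $l$ (the function name and interface are illustrative, not GPUMD's API):

```python
import numpy as np

def q_nl_three_body(g_values, unit_vectors, l):
    """3-body component q_nl = (2l+1)/(4 pi) * sum_{j,k} g_n(r_ij) g_n(r_ik) P_l(cos theta_jik).

    g_values     : (M,)   g_n(r_ij) for the M neighbors of atom i
    unit_vectors : (M, 3) unit vectors r_ij / |r_ij|
    """
    cos_theta = unit_vectors @ unit_vectors.T  # cosine of angle between each neighbor pair
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                            # select the single polynomial P_l
    P_l = np.polynomial.legendre.legval(cos_theta, coeffs)
    return (2 * l + 1) / (4.0 * np.pi) * (g_values @ P_l @ g_values)
```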

• The radial functions $g_n(r_{ij})$ appear in both the radial and the angular descriptor components. For NEP3, in the radial descriptor components,

$$g_n(r_{ij}) = \sum_{k=0}^{N_\mathrm{bas}^\mathrm{R}} c^{ij}_{nk} f_k(r_{ij}),$$ with $$f_k(r_{ij}) = \frac{1}{2} \left[T_k\left(2\left(r_{ij}/r_\mathrm{c}^\mathrm{R}-1\right)^2-1\right)+1\right] f_\mathrm{c}(r_{ij}),$$ and $$f_\mathrm{c}(r_{ij}) = \begin{cases} \frac{1}{2}\left[ 1 + \cos\left( \pi \frac{r_{ij}}{r_\mathrm{c}^\mathrm{R}} \right) \right],& r_{ij}\leq r_\mathrm{c}^\mathrm{R}; \\ 0, & r_{ij} > r_\mathrm{c}^\mathrm{R}. \end{cases}$$ For NEP2, the trainable parameters $c^{ij}_{nk}$ reduce to $c^{ij}_{n}\delta_{nk}$. In the angular descriptor components, $g_n(r_{ij})$ have similar forms but with $N_\mathrm{bas}^\mathrm{R}$ changed to $N_\mathrm{bas}^\mathrm{A}$ and with $r_\mathrm{c}^\mathrm{R}$ changed to $r_\mathrm{c}^\mathrm{A}$.
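These three ingredients (cutoff function, Chebyshev basis, and linear combination) can be sketched directly in NumPy, using $T_k(x) = \cos(k \arccos x)$ for $|x|\leq 1$ (the helper names are illustrative):

```python
import numpy as np

def f_cut(r, rc):
    """Smooth cutoff f_c(r): cosine form up to rc, zero beyond."""
    return np.where(r <= rc, 0.5 * (1.0 + np.cos(np.pi * r / rc)), 0.0)

def f_k(r, k, rc):
    """Basis function f_k(r) built from the Chebyshev polynomial T_k."""
    x = 2.0 * (r / rc - 1.0) ** 2 - 1.0  # maps [0, rc] into [-1, 1]
    x = np.clip(x, -1.0, 1.0)            # guard r > rc (f_c vanishes there anyway)
    T_k = np.cos(k * np.arccos(x))       # T_k(x) = cos(k arccos x)
    return 0.5 * (T_k + 1.0) * f_cut(r, rc)

def g_n(r, c_n, rc):
    """Radial function g_n(r) = sum_k c_nk f_k(r) for one pair (i, j)."""
    return sum(c * f_k(r, k, rc) for k, c in enumerate(c_n))
```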

### Some dimensions

• The number of radial descriptor components: $n_\mathrm{max}^\mathrm{R}+1$
• The number of 3-body angular descriptor components: $(n_\mathrm{max}^\mathrm{A}+1) l_\mathrm{max}^\mathrm{3b}$
• The number of 4-body angular descriptor components: Currently only $(n_\mathrm{max}^\mathrm{A}+1)$ or zero (if not used)
• The number of 5-body angular descriptor components: Currently only $(n_\mathrm{max}^\mathrm{A}+1)$ or zero (if not used)
• The total number of descriptor components $N_{\rm des}$ is the sum of the above numbers.
• The number of the trainable parameters $c_{nk}^{ij}$ in the descriptor ($N_{\rm typ}$ is the number of atom types):
• In NEP2: $N_{\rm typ}^2 [(n_\mathrm{max}^\mathrm{R}+1)+(n_\mathrm{max}^\mathrm{A}+1)]$
• In NEP3 and NEP4: $N_{\rm typ}^2 [(n_\mathrm{max}^\mathrm{R}+1)(N_\mathrm{bas}^\mathrm{R}+1)+(n_\mathrm{max}^\mathrm{A}+1)(N_\mathrm{bas}^\mathrm{A}+1)]$
• The number of the trainable parameters in the NN:
• In NEP2 and NEP3: $(N_{\rm des} +2) N_{\rm neu}+1$
• In NEP4: $(N_{\rm des} +2) N_{\rm neu} N_{\rm typ} +1$
• The total number of trainable parameters: the sum of the number of trainable parameters in the descriptor and that in the NN.
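The bookkeeping above can be written down as a short sketch (hypothetical helper names; the usage example takes the default hyperparameters with $N_{\rm typ}=2$):

```python
def nep_descriptor_dim(n_max_R, n_max_A, l_max_3b, l_max_4b=0, l_max_5b=0):
    """Total number of descriptor components N_des."""
    n_des = (n_max_R + 1) + (n_max_A + 1) * l_max_3b  # radial + 3-body
    if l_max_4b > 0:
        n_des += n_max_A + 1                          # 4-body block
    if l_max_5b > 0:
        n_des += n_max_A + 1                          # 5-body block
    return n_des

def nep_num_parameters(version, n_typ, n_des, n_neu,
                       n_max_R, n_max_A, n_bas_R, n_bas_A):
    """Total number of trainable parameters for NEP2/3/4."""
    if version == 2:
        n_c = n_typ**2 * ((n_max_R + 1) + (n_max_A + 1))
    else:  # NEP3 and NEP4 share the descriptor parameter count
        n_c = n_typ**2 * ((n_max_R + 1) * (n_bas_R + 1)
                          + (n_max_A + 1) * (n_bas_A + 1))
    n_nn = (n_des + 2) * n_neu * (n_typ if version == 4 else 1) + 1
    return n_c + n_nn

# default hyperparameters, 2 atom types, NEP4:
print(nep_descriptor_dim(4, 4, 4, 2, 0))             # 30
print(nep_num_parameters(4, 2, 30, 30, 4, 4, 8, 8))  # 2281
```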

## File format

• In this input file, blank lines and lines starting with # are ignored. One can thus write comments after #.
• All the other lines should be of the form `keyword parameter_1 parameter_2 ...`.
• The keywords can appear in any order with one exception: type_weight cannot appear earlier than type.
• The type keyword does not have default parameters, so it must be set by the user.
• All the other keywords have default values and can be absent.
• Here is the complete list of the keywords:
| Keyword | Parameters | Defaults | Restrictions | Comments |
|---------|------------|----------|--------------|----------|
| version | an integer | 4 | can only be 2, 3, or 4, corresponding to NEP2, NEP3, and NEP4, respectively | All versions have comparable accuracy for single-component systems; for multi-component systems, NEP2 has the lowest accuracy and NEP4 the highest, all other hyperparameters being the same. |
| type | an integer $N_{\rm typ}$ (number of atom types) followed by a list of atom symbols | no default values | $N_{\rm typ}\leq 10$; atom symbols must be standard chemical symbols (case sensitive) in the periodic table | We have tried systems with up to 5 atom types. |
| type_weight | $N_{\rm typ}$ real numbers: the relative force weights for the different atom types | all 1.0 | any non-negative values | Can be useful if the numbers of atoms for the different atom types are very different. |
| zbl | one real number $r_{\rm c}^{\rm ZBL-outer}$ (outer cutoff of the universal ZBL potential) | no default value; when this keyword is absent, the ZBL potential is not enabled and $r_{\rm c}^{\rm ZBL-outer}$ is irrelevant | 1 Å $\leq r_{\rm c}^{\rm ZBL-outer} \leq$ 2.5 Å | The inner cutoff is fixed to half the outer cutoff: $r_{\rm c}^{\rm ZBL-inner} = r_{\rm c}^{\rm ZBL-outer}/2$. We believe this is a good choice; more details will be discussed in a future publication. |
| cutoff | two real numbers: $r_{\rm c}^{\rm R}$ (radial cutoff) and $r_{\rm c}^{\rm A}$ (angular cutoff) | $r_{\rm c}^{\rm R}$ = 8 Å, $r_{\rm c}^{\rm A}$ = 4 Å | 2.5 Å $\leq r_{\rm c}^{\rm A} \leq r_{\rm c}^{\rm R} \leq$ 10 Å | It can be beneficial to use a much larger $r_{\rm c}^{\rm R}$; the default values should be good in most cases. |
| n_max | two integers $n_{\rm max}^{\rm R}$ and $n_{\rm max}^{\rm A}$ | $n_{\rm max}^{\rm R}=4$, $n_{\rm max}^{\rm A}=4$ | $0 \leq n_{\rm max}^{\rm R},n_{\rm max}^{\rm A} \leq 19$ | The default values are relatively small but lead to very high speed. |
| basis_size | two integers $N_{\rm bas}^{\rm R}$ and $N_{\rm bas}^{\rm A}$ | $N_{\rm bas}^{\rm R}=8$, $N_{\rm bas}^{\rm A}=8$ | $0 \leq N_{\rm bas}^{\rm R},N_{\rm bas}^{\rm A} \leq 19$ | The default values are usually sufficient. Only relevant for NEP3 and NEP4. |
| l_max | one, two, or three integers $l_{\rm max}^{\rm 3b}$, $l_{\rm max}^{\rm 4b}$, and $l_{\rm max}^{\rm 5b}$; omitted values are taken as 0 | $l_{\rm max}^{\rm 3b}=4$, $l_{\rm max}^{\rm 4b}=2$, $l_{\rm max}^{\rm 5b}=0$ | $l_{\rm max}^{\rm 3b}$ can only be 4; $l_{\rm max}^{\rm 4b}$ can be 0 or 2; $l_{\rm max}^{\rm 5b}$ can be 0 or 1 | $l_{\rm max}^{\rm 4b}$ and $l_{\rm max}^{\rm 5b}$ are only relevant for NEP3 and NEP4. |
| neuron | an integer $N_{\rm neu}$ (number of neurons in the hidden layer) | $N_{\rm neu}=30$ | $1 \leq N_{\rm neu} \leq 200$ | The default value is relatively small but leads to very high speed. |
| lambda_1 | a real number $\lambda_{1}$ (loss weight of the $\mathcal{L}_1$ regularization) | $\lambda_{1} = 0.05$ | $\lambda_{1} \geq 0$ | In some cases it is better to use a stronger regularization with $\lambda_{1} = 0.1$. |
| lambda_2 | a real number $\lambda_{2}$ (loss weight of the $\mathcal{L}_2$ regularization) | $\lambda_{2} = 0.05$ | $\lambda_{2} \geq 0$ | In some cases it is better to use a stronger regularization with $\lambda_{2} = 0.1$. |
| lambda_e | a real number $\lambda_{\rm e}$ (loss weight of the energy) | $\lambda_{\rm e} = 1.0$ | $\lambda_{\rm e} \geq 0$ | The default value is good. |
| lambda_f | a real number $\lambda_{\rm f}$ (loss weight of the force) | $\lambda_{\rm f} = 1.0$ | $\lambda_{\rm f} \geq 0$ | The default value is good. |
| lambda_v | a real number $\lambda_{\rm v}$ (loss weight of the virial) | $\lambda_{\rm v} = 0.1$ | $\lambda_{\rm v} \geq 0$ | The default value is good. |
| force_delta | a real number $\delta$ used to make smaller forces more accurate | $\delta = 0$ eV/Å | $\delta \geq 0$ eV/Å | When $\delta = 0$, the force loss is the force RMSE $\sqrt{\frac{1}{3N}\sum_{i=1}^{N}\left(\vec{F}_i^{\rm NEP} - \vec{F}_i^{\rm tar}\right)^2}$; when $\delta > 0$, it is modified to $\sqrt{\frac{1}{3N}\sum_{i=1}^{N}\left(\vec{F}_i^{\rm NEP} - \vec{F}_i^{\rm tar}\right)^2 \frac{1}{1+\Vert\vec{F}_i^{\rm tar}\Vert/\delta}}$, and a smaller $\delta$ induces a stronger bias towards smaller forces. |
| batch | an integer $N_{\rm bat}$ (batch size) | $N_{\rm bat}=1000$ | $N_{\rm bat}\geq 1$ | A training set with more abundant structures usually requires a larger batch size to reach a given accuracy; $N_{\rm bat}=100$ to $1000$ should be good. With a powerful GPU (such as a Tesla A100), use a large batch size (such as $N_{\rm bat}=1000$) or the full batch ($N_{\rm bat}\geq$ the number of configurations in the training set); a small batch size on a powerful GPU simply wastes GPU resources. With a weaker GPU, a smaller batch size can be used. |
| population | an integer $N_{\rm pop}$ (population size) | $N_{\rm pop}=50$ | $10 \leq N_{\rm pop}\leq 100$ | $N_{\rm pop}=50$ is close to optimal. |
| generation | an integer $N_{\rm gen}$ (maximum number of generations to evolve) | $N_{\rm gen}=10^5$ | $0 \leq N_{\rm gen}\leq 10^7$ | For simple systems, $N_{\rm gen}= 10^4 \sim 10^5$ is enough; complicated systems might need $N_{\rm gen}= 10^5\sim10^6$ to converge the accuracy. |
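The effect of force_delta on the force loss can be sketched as follows (a NumPy illustration of the two formulas above; force arrays have shape (N, 3)):

```python
import numpy as np

def force_loss(f_nep, f_tar, delta=0.0):
    """Force loss: RMSE when delta == 0; for delta > 0 each atom's squared
    error is scaled by 1 / (1 + |F_tar| / delta), biasing towards small forces."""
    sq = np.sum((f_nep - f_tar) ** 2, axis=1)  # per-atom squared force error
    if delta > 0:
        sq = sq / (1.0 + np.linalg.norm(f_tar, axis=1) / delta)
    return np.sqrt(np.mean(sq) / 3.0)          # the 1/(3N) normalization
```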
• Here is an example nep.in file using all the default parameters:

```
type        2 Te Pb # this is a mandatory keyword
version     4       # default
cutoff      8 4     # default
n_max       4 4     # default
basis_size  8 8     # default
l_max       4 2 0   # default
neuron      30      # default
lambda_1    0.05    # default
lambda_2    0.05    # default
lambda_e    1.0     # default
lambda_f    1.0     # default
lambda_v    0.1     # default
batch       1000    # default
population  50      # default
generation  100000  # default
```
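The parsing rules above (blank lines and `#` comments are ignored; each remaining line is a keyword followed by its parameters) can be sketched as a small reader. `parse_nep_in` is a hypothetical helper for illustration, not part of GPUMD:

```python
def parse_nep_in(text):
    """Return {keyword: [parameter strings]} from nep.in file contents."""
    params = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()  # drop trailing comments
        if not line:
            continue                          # skip blank / comment-only lines
        keyword, *values = line.split()
        params[keyword] = values
    return params

example = """
type        2 Te Pb  # mandatory
cutoff      8 4
"""
print(parse_nep_in(example))
# {'type': ['2', 'Te', 'Pb'], 'cutoff': ['8', '4']}
```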

## References

• [Fan2021] Zheyong Fan, Zezhu Zeng, Cunzhi Zhang, Yanzhou Wang, Keke Song, Haikuan Dong, Yue Chen, and Tapio Ala-Nissila, Neuroevolution machine learning potentials: Combining high accuracy and low cost in atomistic simulations and application to heat transport, Phys. Rev. B. 104, 104309 (2021).
• [Fan2022a] Zheyong Fan, Improving the accuracy of the neuroevolution machine learning potentials for multi-component systems, Journal of Physics: Condensed Matter 34 125902 (2022).
• [Fan2022b] Zheyong Fan, Yanzhou Wang, Penghua Ying, Keke Song, Junjie Wang, Yong Wang, Zezhu Zeng, Ke Xu, Eric Lindgren, J. Magnus Rahm, Alexander J. Gabourie, Jiahui Liu, Haikuan Dong, Jianyang Wu, Yue Chen, Zheng Zhong, Jian Sun, Paul Erhart, Yanjing Su, Tapio Ala-Nissila, GPUMD: A package for constructing accurate machine-learned potentials and performing highly efficient atomistic simulations, Journal of Chemical Physics 157, 114801 (2022).
• [Someone202x] Someone, Title, Journal (202x).
• [Drautz2019] Ralf Drautz, Atomic cluster expansion for accurate and transferable interatomic potentials, Phys. Rev. B 99, 014104 (2019).