# The nep.in input file

## Brief descriptions

• This file specifies some hyperparameters of the NEP potential [Fan2021].

## File format

• This file has the following fixed form:
cutoff              [cutoff_radial] [cutoff_angular]
n_max               [n_max_radial] [n_max_angular]
l_max               [l_max]
number_of_neurons   [number_of_neurons]
regularization      [lambda_1] [lambda_2]
batch_size          [batch_size]
population_size     [population_size]
maximum_generation  [maximum_generation]

• Do not modify the first column above.
• For each row, items within [ ] are to be filled by the user.
• We explain the meaning of each input item using the example in the next section.
• To fully understand this input file, one needs to consult [Fan2021].
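Because the layout is fixed, the file is easy to generate from a script. The following is a minimal Python sketch under stated assumptions: the helper name `render_nep_in`, the dict-based interface, and the 20-character keyword column are illustrative choices and not part of the NEP code. The keyword order follows the example given in this document, and the values are the ones from that example.

```python
# Hypothetical helper (not part of the NEP code): render nep.in text from a dict.
# Keyword order follows the example in this document.
KEYWORD_ORDER = [
    "cutoff",
    "n_max",
    "l_max",
    "number_of_neurons",
    "regularization",
    "batch_size",
    "population_size",
    "maximum_generation",
]


def render_nep_in(params):
    """Return the text of a nep.in file; each value is a scalar or a sequence."""
    missing = [key for key in KEYWORD_ORDER if key not in params]
    if missing:
        raise ValueError(f"missing keywords: {missing}")
    lines = []
    for key in KEYWORD_ORDER:
        value = params[key]
        values = value if isinstance(value, (list, tuple)) else [value]
        # Left-align the keyword in a 20-character column, then the values.
        lines.append(f"{key:<20}{' '.join(str(v) for v in values)}")
    return "\n".join(lines) + "\n"


example = {
    "cutoff": (8.0, 4.0),            # radial, angular
    "n_max": (12, 8),                # radial, angular
    "l_max": 6,
    "number_of_neurons": 25,
    "regularization": (0.01, 0.05),  # lambda_1, lambda_2
    "batch_size": 1000,
    "population_size": 50,
    "maximum_generation": 100000,
}

text = render_nep_in(example)
print(text, end="")
```

Writing `text` to a file named `nep.in` would then give a file in the form described above.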

## An example

cutoff               8.0 4.0
n_max                12 8
l_max                6
number_of_neurons    25
regularization       0.01 0.05
batch_size           1000
population_size      50
maximum_generation   100000

• Explanations:
• The cutoff distances for the radial and angular descriptor components are $r_{\rm c}^{\rm R}=8.0$ Å and $r_{\rm c}^{\rm A}=4.0$ Å, respectively. We require that 1 Å $\leq r_{\rm c}^{\rm A} \leq r_{\rm c}^{\rm R} \leq$ 10 Å.
• The Chebyshev polynomial expansion orders for the radial and angular descriptor components are $n_{\rm max}^{\rm R}=12$ and $n_{\rm max}^{\rm A}=8$, respectively. We require that $0 \leq n_{\rm max}^{\rm R},n_{\rm max}^{\rm A} \leq 12$.
• The Legendre polynomial expansion order for the angular part is $l_{\rm max}=6$. We require that $0 \leq l_{\rm max} \leq 6$.
• The number of neurons in the hidden layer is $N_{\rm neu}=25$. Our tests indicate that a single hidden layer is sufficient in NEP. We require that $1 \leq N_{\rm neu} \leq 50$.
• The weight parameters for the $L_1$ and $L_2$ regularization terms are 0.01 and 0.05, respectively. Each of these two parameters can take any non-negative value.
• The batch size is $N_{\rm bat}=1000$, i.e., 1000 structures are evaluated at each generation. This parameter must be a positive integer. If it is equal to or larger than the total number of structures in the training set, the full training set is used at each generation (full batch).
• The population size is $N_{\rm pop}=50$. We require that $10 \leq N_{\rm pop} \leq 100$.
• The training will be carried out for $N_{\rm gen}=10^5$ generations. We require that $10^2 \leq N_{\rm gen} \leq 10^6$.
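The constraints listed above can be checked before starting a training run. The sketch below is one possible way to do so in Python; the function names `parse_nep_in` and `check_nep_in` and the error messages are hypothetical and are not part of the NEP code, which performs its own input checks. Only the requirements stated in this document are tested.

```python
# Hypothetical pre-flight check (not part of the NEP code).
EXAMPLE = """\
cutoff               8.0 4.0
n_max                12 8
l_max                6
number_of_neurons    25
regularization       0.01 0.05
batch_size           1000
population_size      50
maximum_generation   100000
"""

# Number of values expected after each keyword, as in the example above.
EXPECTED_COUNTS = {
    "cutoff": 2, "n_max": 2, "l_max": 1, "number_of_neurons": 1,
    "regularization": 2, "batch_size": 1, "population_size": 1,
    "maximum_generation": 1,
}


def parse_nep_in(text):
    """Parse 'keyword value [value]' lines into a dict of lists of floats."""
    params = {}
    for line in text.splitlines():
        tokens = line.split()
        if tokens:
            params[tokens[0]] = [float(t) for t in tokens[1:]]
    return params


def check_nep_in(p):
    """Return a list of violated requirements (empty if none are violated)."""
    errors = [f"{key}: expected {n} value(s)"
              for key, n in EXPECTED_COUNTS.items() if len(p.get(key, [])) != n]
    if errors:
        return errors
    rc_radial, rc_angular = p["cutoff"]
    if not 1.0 <= rc_angular <= rc_radial <= 10.0:
        errors.append("cutoff: need 1 <= angular <= radial <= 10")
    if not all(0 <= n <= 12 for n in p["n_max"]):
        errors.append("n_max: both values must be in [0, 12]")
    if not 0 <= p["l_max"][0] <= 6:
        errors.append("l_max: must be in [0, 6]")
    if not 1 <= p["number_of_neurons"][0] <= 50:
        errors.append("number_of_neurons: must be in [1, 50]")
    if any(lam < 0 for lam in p["regularization"]):
        errors.append("regularization: both values must be non-negative")
    n_bat = p["batch_size"][0]
    if n_bat < 1 or not n_bat.is_integer():
        errors.append("batch_size: must be a positive integer")
    if not 10 <= p["population_size"][0] <= 100:
        errors.append("population_size: must be in [10, 100]")
    if not 100 <= p["maximum_generation"][0] <= 10**6:
        errors.append("maximum_generation: must be in [10^2, 10^6]")
    return errors


params = parse_nep_in(EXAMPLE)
print(check_nep_in(params))
```

For the example file, no requirement is violated, so the returned list is empty. Swapping the two cutoff values would make the angular cutoff larger than the radial one and produce a single `cutoff` message.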

## Units

• The cutoff distances $r_{\rm c}^{\rm R}$ and $r_{\rm c}^{\rm A}$ are in units of Angstrom.
• Other quantities in this input file are dimensionless.

## References

• [Fan2021] Zheyong Fan, Zezhu Zeng, Cunzhi Zhang, Yanzhou Wang, Haikuan Dong, Yue Chen, and Tapio Ala-Nissila, Neuroevolution machine learning potentials: Combining high accuracy and low cost in atomistic simulations, To be submitted.