# The train.in input file

## Purpose

This file contains all the training data, possibly from DFT calculations.

## Data format

The data format in this file is fixed:

    Nc
    N_1 has_virial
    N_2 has_virial
    ...
    N_Nc has_virial
    Data for configuration 1
    Data for configuration 2
    ...
    Data for configuration Nc

Here,

- `Nc` is the total number of configurations (systems).
- `N_i` is the number of atoms in configuration `i`.
- The `has_virial` flag (can only be 0 or 1) dictates whether or not there is virial information for the current configuration.
- Data for one configuration occupy `N_i + 2` lines:
  - The first line should have 1 or 7 numbers. If `has_virial` for the current configuration is 0, this line has only one number, which is the **total energy** of the current configuration. If `has_virial` for the current configuration is 1, this line has 7 numbers, which are the **total energy** of the current configuration followed by 6 virial components (in the order of `xx`, `yy`, `zz`, `xy`, `yz`, and `zx`) of the current configuration.
  - The second line should have nine numbers defining the cell vectors ($\vec{a}$, $\vec{b}$, $\vec{c}$):

        ax ay az bx by bz cx cy cz

  - In the remaining `N_i` lines, each line contains 7 numbers, corresponding to the atomic number (that is, number of protons, `Z`), position components (`x`, `y`, `z`), and force components (`fx`, `fy`, `fz`):

        Z x y z fx fy fz
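The layout described above can be checked with a small reader. The following is only a minimal sketch, not part of any official tooling: the names `Configuration`, `parse_train_in`, and `SAMPLE` are hypothetical, and the numbers in `SAMPLE` are made up purely to illustrate the line structure (two configurations of two atoms each, the first with virial data, the second without).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Configuration:
    energy: float                  # total energy of the box
    virial: Optional[List[float]]  # xx, yy, zz, xy, yz, zx, or None if absent
    cell: List[float]              # ax ay az bx by bz cx cy cz
    atomic_numbers: List[int]      # Z for each atom
    positions: List[Vec3]          # x, y, z for each atom
    forces: List[Vec3]             # fx, fy, fz for each atom


def parse_train_in(text: str) -> List[Configuration]:
    """Parse text laid out as described above (blank lines are ignored)."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    nc = int(rows[0][0])
    # One "N_i has_virial" line per configuration.
    headers = [(int(r[0]), int(r[1])) for r in rows[1:1 + nc]]
    pos = 1 + nc
    configs = []
    for n_atoms, has_virial in headers:
        if has_virial not in (0, 1):
            raise ValueError("has_virial must be 0 or 1")
        first = [float(v) for v in rows[pos]]
        if len(first) != (7 if has_virial else 1):
            raise ValueError("first line must have 1 or 7 numbers")
        cell = [float(v) for v in rows[pos + 1]]
        if len(cell) != 9:
            raise ValueError("cell line must have 9 numbers")
        atom_rows = rows[pos + 2:pos + 2 + n_atoms]
        if len(atom_rows) != n_atoms or any(len(r) != 7 for r in atom_rows):
            raise ValueError("each atom line must have 7 numbers")
        configs.append(Configuration(
            energy=first[0],
            virial=first[1:] if has_virial else None,
            cell=cell,
            atomic_numbers=[int(r[0]) for r in atom_rows],
            positions=[tuple(float(v) for v in r[1:4]) for r in atom_rows],
            forces=[tuple(float(v) for v in r[4:7]) for r in atom_rows],
        ))
        pos += n_atoms + 2  # each configuration occupies N_i + 2 lines
    return configs


# Illustrative data only; these values do not come from a real calculation.
SAMPLE = """2
2 1
2 0
-10.5 0.1 0.2 0.3 0.0 0.0 0.0
10 0 0 0 10 0 0 0 10
14 0.0 0.0 0.0 0.01 0.0 0.0
14 2.35 0.0 0.0 -0.01 0.0 0.0
-10.2
10 0 0 0 10 0 0 0 10
14 0.0 0.0 0.0 0.0 0.0 0.0
14 2.40 0.0 0.0 0.0 0.0 0.0
"""

configs = parse_train_in(SAMPLE)
```

The sketch reads line by line rather than as one token stream so that a wrong number of values on any line is reported instead of silently shifting the rest of the file.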

## Units

In this file:

- Length and position are in units of Angstrom.
- Energy is in units of eV.
- Force is in units of eV/Angstrom.
- Virial is in units of eV (so virial divided by volume gives pressure).
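As a worked illustration of the last unit relation, the sketch below divides virial components by the cell volume and converts the result from eV/Angstrom^3 to GPa. The function names and the example numbers are hypothetical. The conversion factor follows from 1 eV = 1.602176634e-19 J and 1 Angstrom^3 = 1e-30 m^3. The sign convention of the resulting pressure is assumed to be whatever convention was used to produce the virial data.

```python
EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/Angstrom^3 expressed in GPa


def cell_volume(cell):
    """Volume |a . (b x c)| from the nine numbers ax ay az bx by bz cx cy cz."""
    ax, ay, az, bx, by, bz, cx, cy, cz = cell
    det = (ax * (by * cz - bz * cy)
           + ay * (bz * cx - bx * cz)
           + az * (bx * cy - by * cx))
    return abs(det)


def virial_to_pressure_gpa(virial, cell):
    """Divide each per-box virial component (eV) by the volume, then convert to GPa."""
    volume = cell_volume(cell)
    return [v / volume * EV_PER_A3_TO_GPA for v in virial]


# Illustrative numbers only: a 10 Angstrom cubic box and a made-up virial.
example_cell = [10.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 10.0]
example_virial = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]  # xx, yy, zz, xy, yz, zx
example_pressure = virial_to_pressure_gpa(example_virial, example_cell)
```

For this box the volume is 1000 Angstrom^3, so a virial component of 1 eV corresponds to 0.001 eV/Angstrom^3, or about 0.16 GPa.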

## Tips

- Periodic boundary conditions are always assumed in all directions for each configuration. We use the minimum image convention, and it is the responsibility of the user to make sure that the box is large enough for the chosen cutoff distance.
- The minimal number of atoms in a configuration is 2.
- The user is responsible for choosing a **good** reference energy when preparing the energy data.
- The energy and virial data refer to the total energy and virial for the system. They are not per-atom but per-box quantities.
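One way to check the box size before training is to compare the box thickness in each direction with the cutoff. The sketch below is a hedged illustration: the function names are hypothetical, and the criterion used (every thickness at least twice the cutoff) is the usual requirement for the minimum image convention, assumed here rather than taken from this page.

```python
import math


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def box_thicknesses(cell):
    """Perpendicular thickness of the box along a, b, and c.

    `cell` holds the nine numbers ax ay az bx by bz cx cy cz.
    The thickness along a is volume / |b x c|, and similarly for b and c.
    """
    a, b, c = cell[0:3], cell[3:6], cell[6:9]
    bc, ca, ab = _cross(b, c), _cross(c, a), _cross(a, b)
    volume = abs(_dot(a, bc))
    return tuple(volume / math.sqrt(_dot(n, n)) for n in (bc, ca, ab))


def box_is_large_enough(cell, cutoff):
    """Assumed criterion: every thickness is at least twice the cutoff."""
    return all(t >= 2.0 * cutoff for t in box_thicknesses(cell))


# Illustrative check for a 10 Angstrom cubic box.
cubic_cell = [10.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 10.0]
ok_with_5 = box_is_large_enough(cubic_cell, 5.0)
ok_with_5p5 = box_is_large_enough(cubic_cell, 5.5)
```

Using perpendicular thicknesses instead of the lengths of the cell vectors keeps the check meaningful for triclinic boxes, where a long but strongly tilted cell vector can still give a thin box.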