ParAMS

ParAMS is a reparameterization tool for computational chemistry models which ships with SCM’s AMS suite. Several interfaces to ParAMS have been included in the GloMPO package. They allow GloMPO to manage ReaxFF and GFN-xTB reparameterisations.

There are two ways to interface the two pieces of software, depending on your preferred workflow or interface:

  1. ParAMS is primary: set up an Optimization instance as normal. GloMPO is wrapped using the GlompoParamsWrapper to look like a scm.params.optimizers.base.BaseOptimizer.
  2. GloMPO is primary: set up a GloMPOManager instance as normal. The ReaxFFError class below will create the error function to be used as the manager's GloMPOManager.task.

The second approach is recommended.

class glompo.interfaces.params.BaseParamsError(data_set: DataSet, job_collection: JobCollection, parameters: BaseParameters, validation_dataset: Optional[DataSet] = None, scale_residuals: bool = False)[source]

Bases: object

Base error function instance from which other classes derive depending on the engine used e.g. ReaxFF, xTB etc. Primarily initialized from ParAMS objects. To initialize from files see the class methods from_classic_files() or from_params_files().

Parameters:
  • data_set – Reference data used to compare against force field results.
  • job_collection – AMS jobs from which the data can be extracted for comparison to the DataSet
  • parameters – BaseParameters object which holds the force field values, ranges, engine and which parameters are active or not.
  • validation_dataset – If a validation set is being used and evaluated along with the training set, it may be added here. Jobs for the validation set are expected to be included in job_collection.
  • scale_residuals – See scale_residuals.

Notes

The class provides several convenience functions to access/read/modify the force field parameters (for example: n_parms, active_names, set_parameters(), reweigh_residuals() etc.). These are typically light wrappers around various par_eng commands. Not all forms of interface have been provided and, in general, the user may access the par_eng directly for fine control.

dat_set

Represents the training set.

Type:DataSet
job_col

Represents the jobs from which model results will be extracted and compared to the training set.

Type:JobCollection
loss

Method by which individual errors are grouped into a single error function value.

Type:Union[str, Loss]
par_eng

Parameter engine interface representing the model and its parameters to tune.

Type:BaseParameters
par_levels

The layers of parallelism possible within the evaluation of the jobs.

Type:ParallelLevels
scale_residuals

If True then the raw residuals (i.e. the differences between engine evaluation and training data) will be scaled by the weight and sigma values in the datasets i.e. r_scaled = weight * (r / sigma) ** 2. Otherwise the raw residual is returned. This setting affects resids() and detailed_call().

Type:bool
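
As a rough illustration of the scaling formula above, the sketch below applies r_scaled = weight * (r / sigma) ** 2 to a few made-up residuals in plain Python. The function name scale_residuals_sketch and all numbers are assumptions for this example; this is not GloMPO's or ParAMS's implementation.

```python
from typing import List, Sequence


def scale_residuals_sketch(raw: Sequence[float],
                           weights: Sequence[float],
                           sigmas: Sequence[float]) -> List[float]:
    """Apply r_scaled = weight * (r / sigma) ** 2 element-wise (illustrative only)."""
    return [w * (r / s) ** 2 for r, w, s in zip(raw, weights, sigmas)]


raw = [0.5, -2.0, 3.0]     # hypothetical raw residuals (engine result minus reference)
weights = [1.0, 2.0, 0.0]  # a zero weight removes that entry's contribution
sigmas = [0.5, 1.0, 1.5]

scaled = scale_residuals_sketch(raw, weights, sigmas)
```

Note that squaring discards the sign of each residual, so scaled values are never negative when the weights are non-negative.
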
val_set

Optional validation set to evaluate in parallel to the training set.

Type:DataSet
__call__(x: Sequence[float]) → float[source]

Returns the error value between the force field with the given parameters and the training values.

Notes

Optimizations are done in scaled space to improve the numerics of the problem. Thus x is expected to be given in scaled space. To transform from one space to another see convert_parms_real2scaled() and convert_parms_scaled2real().

active_abs_indices

Returns the absolute index number of the active parameters.

See also

active_names, convert_indices_abs2rel(), convert_indices_rel2abs()

active_names

Returns the names of the active parameters.

See also

active_abs_indices, convert_indices_abs2rel(), convert_indices_rel2abs()

bounds

Returns the min, max bounds in each dimension in scaled space i.e. a list of (0, 1) tuples for each parameter.

convert_indices_abs2rel(indices: List[int]) → List[int][source]

Converts a sequence of absolute indices to relative indices pointing to the corresponding parameter in the active subset.

Parameters:indices – Sequence of absolute indices for active parameters.
Returns:List of the same length as indices with corresponding elements giving the index of the parameters in the smaller active subset.
Return type:List[int]
Warns:UserWarning – If indices contains an index for an inactive parameter. None will be returned for that index.

Examples

Suppose par_eng has 100 parameters of which 5 are active. The absolute index numbers of these five are:

>>> active = [23, 57, 78, 10, 98]

Converting to the relative indices in the active subset:

>>> err.convert_indices_abs2rel(active)
[1, 2, 3, 0, 4]

Note that this method correctly accounts for the ordering of the parameters given to indices.

Suppose you attempted to convert a parameter which was not active:

>>> err.convert_indices_abs2rel([23, 57, 1])
[1, 2, None]
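
The behaviour shown in these examples can be mimicked in plain Python. The sketch below is only an approximation of the documented behaviour, assuming that relative indices follow the sorted order of the absolute indices; abs2rel_sketch is a hypothetical helper, not part of GloMPO.

```python
import warnings
from typing import List, Optional, Sequence


def abs2rel_sketch(indices: Sequence[int],
                   active_abs: Sequence[int]) -> List[Optional[int]]:
    """Map absolute parameter indices to positions in the sorted active subset."""
    lookup = {abs_i: rel_i for rel_i, abs_i in enumerate(sorted(active_abs))}
    result = []
    for i in indices:
        if i not in lookup:
            # Mirrors the documented UserWarning for inactive parameters
            warnings.warn(f"Parameter {i} is not active.", UserWarning)
        result.append(lookup.get(i))  # None for inactive parameters
    return result


active = [23, 57, 78, 10, 98]
rel = abs2rel_sketch(active, active)
```
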

convert_indices_rel2abs(indices: List[int]) → List[int][source]

Converts a sequence of relative indices in the active parameter subset to absolute indices in the par_eng.

Parameters:indices – Sequence of relative indices in the active parameter subset.
Returns:List of the same length as indices with corresponding elements giving the index of the parameters in the par_eng.
Return type:List[int]

Examples

Suppose par_eng has 100 parameters of which 5 are active. To find the absolute index numbers of all of them:

>>> err.convert_indices_rel2abs(range(5))
[10, 23, 57, 78, 98]

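
For intuition, the reverse lookup amounts to indexing into the sorted list of active absolute indices. The following is a minimal sketch under that assumption; rel2abs_sketch is a hypothetical helper, not the GloMPO method.

```python
from typing import List, Sequence


def rel2abs_sketch(indices: Sequence[int], active_abs: Sequence[int]) -> List[int]:
    """Map positions in the active subset back to absolute parameter indices."""
    ordered = sorted(active_abs)
    return [ordered[i] for i in indices]


active = [23, 57, 78, 10, 98]  # absolute indices of the five active parameters
absolute = rel2abs_sketch(range(5), active)
```
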
convert_parms_real2scaled(x: List[float]) → numpy.ndarray[source]

Transforms parameters from their actual values, to values between 0 and 1 where 0 and 1 represent the lower and upper bounds of the parameter respectively.

Important

Active parameter values exist in two spaces:

  1. The real and actual parameter values which appear in the force field.
  2. A scaled space between 0 and 1 in all dimensions where 0 and 1 represent the lower bound and upper bounds of the active parameters respectively.

Optimizations are done in scaled space to improve the numerics of the problem.

Parameters:x – Sequence of parameter values to transform. May be the same length as the number of active parameters, or the length of the total number of parameters in the set.
Raises:ValueError – If the length of x does not match the number of active or total parameters
convert_parms_scaled2real(x: List[float]) → numpy.ndarray[source]

Transforms parameters from their [0, 1] scaled values, to actual parameter values. Exact opposite transformation of convert_parms_real2scaled().
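
The two transformations can be pictured as ordinary min-max scaling and its inverse. The sketch below assumes a simple linear mapping between each parameter's lower and upper bound; the function names and the ranges are invented for illustration and do not come from GloMPO.

```python
from typing import List, Sequence, Tuple

Range = Tuple[float, float]


def real2scaled_sketch(x: Sequence[float], ranges: Sequence[Range]) -> List[float]:
    """Map real values to [0, 1], where 0 is the lower and 1 the upper bound."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(x, ranges)]


def scaled2real_sketch(x: Sequence[float], ranges: Sequence[Range]) -> List[float]:
    """Inverse mapping from [0, 1] back to real parameter values."""
    return [lo + v * (hi - lo) for v, (lo, hi) in zip(x, ranges)]


ranges = [(0.0, 10.0), (-1.0, 1.0)]  # hypothetical (min, max) per active parameter
real = [2.5, 0.5]

scaled = real2scaled_sketch(real, ranges)
round_trip = scaled2real_sketch(scaled, ranges)
```
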

detailed_call(x: Sequence[float]) → Union[Tuple[float, numpy.ndarray], Tuple[float, numpy.ndarray, float, numpy.ndarray]][source]

A full return of the error results. Returns a tuple of:

training_set_error, [training_set_residual_1, ..., training_set_residual_N]

If a validation set is included then returned tuple is:

training_set_error, [training_set_residual_1, ..., training_set_residual_N],
validation_set_error, [validation_set_residual_1, ..., validation_set_residual_N]

See also

__call__()
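
Because the returned tuple has either two or four elements, calling code may want to normalise it before use. The helper below is a hypothetical sketch of one way to do that; split_detailed_result and the sample tuples are assumptions standing in for real detailed_call() output.

```python
from typing import Optional, Sequence, Tuple

SetResult = Tuple[float, Sequence[float]]


def split_detailed_result(result: tuple) -> Tuple[SetResult, Optional[SetResult]]:
    """Split a detailed result into (training, validation); validation may be None."""
    if len(result) == 2:
        return (result[0], result[1]), None
    if len(result) == 4:
        return (result[0], result[1]), (result[2], result[3])
    raise ValueError(f"Expected 2 or 4 elements, got {len(result)}")


# Hypothetical values standing in for err.detailed_call(x)
without_validation = (1.5, [0.5, -1.0])
with_validation = (1.5, [0.5, -1.0], 2.0, [1.0, 1.0])

train, val = split_detailed_result(with_validation)
```
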

headers() → Dict[str, tables.description.Col][source]

Returns the column headers for the detailed_call() return. See BaseFunction.headers().

n_all_parms

Returns the total number of active and inactive parameters.

See also

n_parms

n_parms

Returns the number of active parameters.

See also

n_all_parms

resids(x: Sequence[float]) → numpy.ndarray[source]

Method for compatibility with the GFLS optimizer. Returns the signed differences between the force field results and the training set values. These residuals are scaled by sigma and weight if scale_residuals is True, otherwise they are returned unscaled.

reweigh_residuals(resids: Union[Sequence[str], Sequence[int], Dict[Union[str, int], float]], new_weight: Optional[float] = None)[source]

Changes weights for elements in the DataSet. Can be used to deactivate contributions to the training set by setting their weight to zero.

Note

Deactivating a residual does not stop its associated jobs from still being calculated.

Parameters:
  • resids – Sequence of integers (which refer to the DataSetEntry indices in the DataSet) or strings corresponding to DataSet keys. A mix of integers and strings is not supported. May also be a dictionary mapping the above to new weight values.
  • new_weight – New weight to apply to all elements in resids. Ignored if resids is a dictionary, must be supplied otherwise.
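
The accepted argument combinations can be summarised by reducing them to a single mapping of key to weight. The sketch below is one possible reading of the rules above, not GloMPO's code: normalise_weights is a hypothetical helper, and raising TypeError for mixed keys is an assumption (the documentation only says a mix is not supported).

```python
from typing import Dict, Optional, Sequence, Union

Key = Union[str, int]


def normalise_weights(resids: Union[Sequence[Key], Dict[Key, float]],
                      new_weight: Optional[float] = None) -> Dict[Key, float]:
    """Reduce the (resids, new_weight) arguments to a {key: weight} mapping."""
    if isinstance(resids, dict):
        mapping = dict(resids)  # new_weight is ignored in this case
    else:
        if new_weight is None:
            raise ValueError("new_weight must be supplied unless resids is a dictionary")
        mapping = {key: new_weight for key in resids}
    if len({type(key) for key in mapping}) > 1:
        raise TypeError("A mix of integers and strings is not supported")
    return mapping


deactivated = normalise_weights([0, 3], 0.0)               # switch off entries 0 and 3
custom = normalise_weights({'hypothetical_key': 2.0}, 5.0)  # new_weight ignored
```
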
save(path: Union[pathlib.Path, str], filenames: Optional[Dict[str, str]] = None, parameters: Optional[Sequence[float]] = None)[source]

Writes the dat_set and job_col to YAML files. Writes the engine object to an appropriate parameter file.

Parameters:
  • path – Path to directory in which files will be saved.
  • filenames

    Custom filenames for the written files. The dictionary may include any/all of the keys in the example below. This example contains the default names used if not given:

    {'ds': 'data_set.yml', 'jc': 'job_collection.yml', 'ff': 'ffield'}
    
  • parameters – Optional parameters to be written into the force field file. If not given, the parameters currently therein will be used.
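
Since filenames may contain any subset of the keys, the effective names can be thought of as the defaults overlaid with the user's entries. This is a minimal sketch of that idea; resolve_filenames and the custom name 'ffield_best' are assumptions for illustration.

```python
from typing import Dict, Optional

# Defaults as listed in the documentation above
DEFAULT_FILENAMES = {'ds': 'data_set.yml', 'jc': 'job_collection.yml', 'ff': 'ffield'}


def resolve_filenames(filenames: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Overlay user-supplied names on the defaults; missing keys keep defaults."""
    resolved = dict(DEFAULT_FILENAMES)
    resolved.update(filenames or {})
    return resolved


names = resolve_filenames({'ff': 'ffield_best'})
```
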
set_parameters(x: Sequence[float], space: str, full: bool = False)[source]

Store parameters in the class.

Parameters:
  • x – Parameters to store in BaseParameters.
  • space

    Represents the space in which x is given. Accepts:

    1. 'real': Actual parameter values
    2. 'scaled': Transformed parameter values, bounded by 0 and 1 according to their ranges (see convert_parms_real2scaled() and convert_parms_scaled2real()).
  • full – If True, x is expected to be an array of ALL parameters in the force field, otherwise x is expected to be an array of active parameters only.
Warns:

UserWarning – If any value in x is outside of the bounds associated with that parameter.

toggle_parameter(parameters: Union[Sequence[int], Sequence[str]], toggle: Union[str, bool] = None)[source]

De/Activate parameters. This means either allowing them to be changed during an optimization, or fixing their value so that they are not changed.

Parameters:
  • parameters – Sequence of integers (which refer to the parameters’ indices in BaseParameters) or parameter name strings which should be de/activated. A mix of integers and strings is not supported.
  • toggle – Accepts 'on', 'off', True or False. Specifies how the toggle should be applied. Must be supplied. 'on' means the parameters will be optimized and changed during the optimization. 'off' means the parameters will be fixed. To set the parameter values see set_parameters().

Notes

If using integers in parameters, these are the absolute index numbers of the full parameter set, not the indices within the already activated subset. Confusing the two may lead to unexpected results. For example, if you have a force field with five activated parameters, calling err.toggle_parameter([4], 'off') will not deactivate the fifth active parameter but rather the parameter indexed 4 in the overall set. See convert_indices_abs2rel() and convert_indices_rel2abs() to convert between the two reference systems.

Warning

When toggling parameters on, make sure that their associated bounds are sensible!

See also

active_abs_indices, convert_indices_abs2rel(), convert_indices_rel2abs(), set_parameters()

class glompo.interfaces.params.GlompoParamsWrapper(opt_selector: glompo.opt_selectors.baseselector.BaseSelector, **manager_kwargs)[source]

Bases: scm.params.optimizers.base.BaseOptimizer

Wraps the GloMPO manager into a ParAMS BaseOptimizer. This is not the recommended way to make use of the GloMPO interface; it is preferable to make use of the BaseParamsError classes. This class is only applicable in cases where the ParAMS Optimization class interface is preferred.

Parameters:
  • opt_selector – Initialised BaseSelector object which specifies how optimizers are selected and initialised.
  • **manager_kwargs – Optional arguments to the GloMPOManager initialisation function.

Notes

manager_kwargs accepts all arguments of GloMPOManager.setup() but required GloMPO arguments task and bounds will be overwritten as they are passed by the minimize() function in accordance with ParAMS API.

minimize(function, x0: Sequence[float], bounds: Sequence[Tuple[float, float]], workers: int = 1) → MinimizeResult[source]

Passes ‘function’ to GloMPO to be minimized. Returns an instance of MinimizeResult.

Parameters:
  • function – Function to be minimized. This is passed as GloMPO’s task parameter.
  • x0 – Ignored by GloMPO. The correct way to control the optimizer starting points is by using GloMPO BaseGenerator objects.
  • bounds – Sequence of (min, max) pairs used to bound the search area for every parameter. The ‘bounds’ parameter is passed to GloMPO as its bounds parameter.
  • workers – Represents the maximum number of optimizers run in parallel. Passed to GloMPO as its max_jobs parameter if it has not been sent during initialisation via manager_kwargs, otherwise ignored. If allowed to default, this will usually result in as many optimizers as there are cores available.

Notes

GloMPO is not currently compatible with using multiple DataSets; only the first one will be considered.

By default ParAMS shifts and scales all parameters to the interval (0, 1). GloMPO will work in this space and be blind to the true bounds, thus results from the GloMPO logs cannot be applied directly to the function.

class glompo.interfaces.params.ReaxFFError(data_set: DataSet, job_collection: JobCollection, parameters: BaseParameters, validation_dataset: Optional[DataSet] = None, scale_residuals: bool = False)[source]

Bases: glompo.interfaces.params.BaseParamsError

ReaxFF error function.

checkpoint_save(path: Union[pathlib.Path, str])[source]

Used to store files into a GloMPO checkpoint (at path) suitable to reconstruct the task when the checkpoint is loaded.

classmethod from_classic_files(path: Union[pathlib.Path, str], **kwargs) → glompo.interfaces.params.ReaxFFError[source]

Initializes the error function from classic ReaxFF files.

Parameters:path – Path to classic ReaxFF files, passed to setup_reax_from_classic().
classmethod from_params_files(path: Union[pathlib.Path, str], **kwargs) → glompo.interfaces.params.ReaxFFError[source]

Initializes the error function from ParAMS data files.

Parameters:path – Path to directory containing ParAMS data set, job collection and ReaxFF engine files (see setup_reax_from_params()).
toggle_parameter(parameters: Union[Sequence[int], Sequence[str]], toggle: Union[str, bool] = None, force: bool = False)[source]

De/Activate parameters. This means either allowing them to be changed during an optimization, or fixing their value so that they are not changed.

See BaseParamsError.toggle_parameter() for the parameters and toggle arguments.

Parameters:force – If True, the sense checks which verify that certain parameters are not activated will be bypassed.
Warns:UserWarning – If parameters contains a parameter which should never be activated and toggle is True or 'on'.

Notes

Certain parameters should never be activated. For example, some represent two- or three-way toggles for certain behaviours. Others can only take very specific values based on which atoms are present. This method will ignore and warn about attempts to activate such parameters unless force is used.

class glompo.interfaces.params.XTBError(data_set: DataSet, job_collection: JobCollection, parameters: BaseParameters, validation_dataset: Optional[DataSet] = None, scale_residuals: bool = False)[source]

Bases: glompo.interfaces.params.BaseParamsError

GFN-xTB error function.

checkpoint_save(path: Union[pathlib.Path, str])[source]

Used to store files into a GloMPO checkpoint (at path) suitable to reconstruct the task when the checkpoint is loaded.

classmethod from_params_files(path: Union[pathlib.Path, str], **kwargs) → glompo.interfaces.params.XTBError[source]

Initializes the error function from ParAMS data files.

Parameters:path – Path to directory containing ParAMS data set, job collection and xTB engine files (see setup_xtb_from_params()).
glompo.interfaces.params.setup_reax_from_classic(path: Union[pathlib.Path, str]) → Tuple[DataSet, JobCollection, ReaxParams][source]

Parses classic ReaxFF force field and configuration files into instances which can be evaluated by AMS.

Parameters:path – Path to directory containing classic ReaxFF configuration files.

Notes

path must contain:

trainset.in: Contains the description of the items in the training set.

control: Contains ReaxFF settings.

geo: Contains the geometries of the items used in the training set, will make the JobCollection along with the control file.

ffield: A force field file which contains values for all the parameters. By default almost all parameters are activated and given ranges of ±20% if non-zero and [-1, 1] otherwise. See ReaxParams for details.
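
To make the default range rule concrete, the sketch below computes a range from a single parameter value. It is an assumption-laden illustration: default_range_sketch is a hypothetical helper, and taking the 20% relative to the absolute value (so that negative parameters still get a lower bound below their upper bound) is an assumption. See ReaxParams for the actual behaviour.

```python
from typing import Tuple


def default_range_sketch(value: float) -> Tuple[float, float]:
    """Return (min, max): +/-20% around a non-zero value, [-1, 1] for zero."""
    if value == 0:
        return (-1.0, 1.0)
    delta = 0.2 * abs(value)  # assumption: 20% of the magnitude
    return (value - delta, value + delta)


zero_range = default_range_sketch(0.0)
positive_range = default_range_sketch(10.0)
negative_range = default_range_sketch(-5.0)
```
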

Optionally, the directory may contain:

params: File which describes which parameters to optimize and their ranges.

Or, alternatively:

ffield_bool: A force field file with all parameters set to 0 or 1. 1 indicates it will be adjusted during optimization. 0 indicates it will not be changed during optimization.

ffield_max: A force field file where the active parameters are set to their maximum value (value of other parameters is ignored).

ffield_min: A force field file where the active parameters are set to their minimum value (value of other parameters is ignored).

The method will ignore ffield_bool, ffield_min and ffield_max if params is also present.

Caution

params files are not supported in ParAMS <v0.5.1. In this case the file will be ignored and the method will directly look for ffield_bool, ffield_min and ffield_max.
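
The precedence between these optional files can be sketched as a small decision function. This is an illustration only: activation_source, its return labels and the params_supported flag (standing in for the ParAMS version check) are hypothetical, and requiring all three ffield_* files to be present is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Union

FFIELD_TRIPLE = ('ffield_bool', 'ffield_min', 'ffield_max')


def activation_source(path: Union[Path, str], params_supported: bool = True) -> str:
    """Report which optional files would define the active parameters and ranges."""
    path = Path(path)
    if params_supported and (path / 'params').is_file():
        return 'params'  # params takes priority when it can be read
    if all((path / name).is_file() for name in FFIELD_TRIPLE):
        return 'ffield_bool/min/max'
    return 'defaults'    # fall back to the default ranges


# Demonstration in a throwaway directory containing every optional file
with tempfile.TemporaryDirectory() as tmp:
    for name in ('params',) + FFIELD_TRIPLE:
        (Path(tmp) / name).touch()
    newer = activation_source(tmp)
    older = activation_source(tmp, params_supported=False)
```
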

Returns:ParAMS reparameterization objects: training set, job collection and engine.
Return type:Tuple[DataSet, JobCollection, ReaxParams]
glompo.interfaces.params.setup_reax_from_params(path: Union[pathlib.Path, str]) → Tuple[DataSet, JobCollection, ReaxParams][source]

Loads ParAMS produced ReaxFF files into ParAMS objects.

Parameters:path – Path to folder containing:

data_set.yml OR data_set.pkl: Contains the description of the items in the training set. A YAML file must be of the form produced by store(), a pickle file must be of the form produced by pickle_dump(). If both files are present, the pickle is given priority.

job_collection.yml OR job_collection.pkl: Contains descriptions of the AMS jobs to evaluate. A YAML file must be of the form produced by store(), a pickle file must be of the form produced by pickle_dump(). If both files are present, the pickle is given priority.

reax_params.pkl: Pickle produced by pickle_dump(), representing the force field, active parameters and their ranges.
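
The rule that a pickle is given priority over a YAML file can be expressed as an ordered search over suffixes. The sketch below only locates the file and does not load it; pick_data_file is a hypothetical helper, not part of GloMPO or ParAMS.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def pick_data_file(folder: Union[Path, str], stem: str) -> Optional[Path]:
    """Return <stem>.pkl if it exists, else <stem>.yml, else None."""
    folder = Path(folder)
    for suffix in ('.pkl', '.yml'):  # pickle is checked first
        candidate = folder / (stem + suffix)
        if candidate.is_file():
            return candidate
    return None


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ('data_set.yml', 'data_set.pkl', 'job_collection.yml'):
        (Path(tmp) / name).touch()
    chosen_ds = pick_data_file(tmp, 'data_set').name
    chosen_jc = pick_data_file(tmp, 'job_collection').name
    missing = pick_data_file(tmp, 'reax_params')
```
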
glompo.interfaces.params.setup_xtb_from_params(path: Union[pathlib.Path, str]) → Tuple[DataSet, JobCollection, XTBParams][source]

Loads ParAMS produced xTB files into ParAMS objects.

Parameters:path – Path to folder containing:

data_set.yml OR data_set.pkl: Contains the description of the items in the training set. A YAML file must be of the form produced by store(), a pickle file must be of the form produced by pickle_dump(). If both files are present, the pickle is given priority.

job_collection.yml OR job_collection.pkl: Contains descriptions of the AMS jobs to evaluate. A YAML file must be of the form produced by store(), a pickle file must be of the form produced by pickle_dump(). If both files are present, the pickle is given priority.

elements.xtbpar, basis.xtbpar, globals.xtbpar, additional_parameters.yaml, metainfo.yaml, atomic_configurations.xtbpar, metals.xtbpar: Classic xTB parameter files.