Examples

GloMPO comes bundled with several examples to get you started. They can all be found in the examples directory of the package. The examples are best done in the order presented here. After working them, the user is encouraged to read further in the documentation to get a proper understanding of all of GloMPO’s components.

Minimal

The minimal example configures the minimum number of GloMPO options (i.e. uses all of its defaults) and demonstrates that very simple configurations are possible.

from glompo.benchmark_fncs import Michalewicz
from glompo.core.manager import GloMPOManager
from glompo.opt_selectors import CycleSelector
from glompo.optimizers.cmawrapper import CMAOptimizer

if __name__ == '__main__':
    task = Michalewicz(dims=5)

    sigma = (task.bounds[0][1] - task.bounds[0][0]) / 2
    call_args = {'sigma0': sigma}
    init_args = {'workers': 1, 'popsize': 6}
    selector = CycleSelector((CMAOptimizer, init_args, call_args))

    manager = GloMPOManager.new_manager(task=task, bounds=task.bounds, opt_selector=selector)
    result = manager.start_manager()

    print(f"Global min for Michalewicz Function: {task.min_fx:.3E}")
    print("GloMPO minimum found:")
    print(result)

The Michalewicz global optimization test function is a good example of where GloMPO can outperform normal optimization.

    task = Michalewicz(dims=5)

For this task we will use CMA-ES which has good optimization properties for many function classes. Optimizers are sent to GloMPO via BaseSelector objects. These are code stubs which propose an optimizer type and configuration to start when asked by the manager.

A very basic selector is the CycleSelector, which returns a rotating list of optimizers when asked, but can also be used with just a single optimizer type.

Setting up any selector requires that a sequence of available optimizers be given to it during initialisation. The elements in this list can take two forms:

  1. Uninitiated optimizer class.
  2. Tuple of:
    1. Uninitiated optimizer class;
    2. Dictionary of optional initialisation arguments;
    3. Dictionary of optional arguments passed to BaseOptimizer.minimize().
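The cycling behaviour can be sketched with plain Python (an illustration of the concept using itertools.cycle and hypothetical optimizer names, not the glompo API):

```python
from itertools import cycle

# Hypothetical optimizer entries in the (class, init_args, call_args) form.
# A CycleSelector-style rotation simply hands out the next entry each time
# the manager requests a new optimizer.
available = cycle([('OptimizerA', {'workers': 1}, {}),
                   ('OptimizerB', {'workers': 2}, {})])

first = next(available)
second = next(available)
third = next(available)  # rotation wraps back to the first entry
```

With a single entry in the sequence, every spawned optimizer is of the same type.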

In this case we need to set up:

The initial \(\sigma\) value:
We choose this to be half the range of the bounds in each direction (in this case the bounds are equal in all directions). This value must be sent to minimize().
    sigma = (task.bounds[0][1] - task.bounds[0][0]) / 2
    call_args = {'sigma0': sigma}
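As a quick sanity check (assuming the standard Michalewicz domain of \([0, \pi]\) in every dimension, which is an assumption of this sketch), the half-range works out to \(\sigma \approx 1.57\):

```python
import math

# Assumed Michalewicz search domain: [0, pi] in each of the 5 dimensions.
bounds = [(0, math.pi)] * 5

# Half the range of the first dimension, as in the example above.
sigma = (bounds[0][1] - bounds[0][0]) / 2  # ~1.5708
```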
The number of parallel workers:
CMA is a population-based solver and uses multiple function evaluations per iteration; this is the population size. It can also parallelize internally, evaluating population members simultaneously; this is the number of workers or threads it can start. It is important that the user takes care of load balancing at this point to ensure the most efficient performance. In this case we will use 1 worker and a population of 6 (the function evaluation in this toy example is too fast to justify the overhead of multithreading or multiprocessing). These arguments are required at CMA initialisation.
    init_args = {'workers': 1, 'popsize': 6}

We can now set up the selector.

    selector = CycleSelector((CMAOptimizer, init_args, call_args))

Note the load balancing here. GloMPO will allow a fixed number of threads to be run. By default this is one less than the number of CPUs available. If your machine has 32 cores, for example, then the manager will use 1 and allow 31 to be used by the local optimizers. The 'workers' keyword we used for the optimizer earlier tells GloMPO that each instance of CMA will use 1 of these slots. Thus, GloMPO will start a maximum of 31 parallel CMA optimizers in this run. Alternatively, if we had parallelized the function evaluations (by setting 'workers' equal to 6), then 5 optimizers would be started, taking 6 slots each. In such a configuration one core of the 32-core machine would remain unused: \(5\times6=30\text{ optimizer slots} + 1\text{ manager} = 31\).
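The slot arithmetic can be written out explicitly (a toy calculation, not glompo code):

```python
def max_parallel_optimizers(available_slots, workers_per_optimizer):
    """Number of optimizers that fit when each consumes a fixed number of slots."""
    return available_slots // workers_per_optimizer

slots = 32 - 1  # a 32-core machine: 1 core for the manager, 31 slots for optimizers

serial = max_parallel_optimizers(slots, 1)    # 'workers': 1 -> 31 optimizers
parallel = max_parallel_optimizers(slots, 6)  # 'workers': 6 -> 5 optimizers
idle = slots - 6 * parallel                   # 1 slot left unused
```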

If you want to fix the number of threads used regardless of the system resources, pass the optional max_jobs argument during the manager initialisation.

The manager is set up using all GloMPO defaults in this case. Only the task, its box bounds and the local optimizers need be provided.

    manager = GloMPOManager.new_manager(task=task, bounds=task.bounds, opt_selector=selector)

To execute the minimization we simply run GloMPOManager.start_manager(). Note: by default GloMPO will not save any files, but file outputs can be configured.

    result = manager.start_manager()

Finally we print the selected minimum.

    print(f"Global min for Michalewicz Function: {task.min_fx:.3E}")
    print("GloMPO minimum found:")
    print(result)

Customized

The customized example guides users through each of the options available to configure the manager and will give the user a good overview of what is possible.

import logging
import sys

from glompo.benchmark_fncs import Schwefel
from glompo.convergence import MaxFuncCalls, TargetCost
from glompo.core.checkpointing import CheckpointingControl
from glompo.core.manager import GloMPOManager
from glompo.generators import RandomGenerator
from glompo.hunters import BestUnmoving, EvaluationsUnmoving, ParameterDistance, ValueAnnealing
from glompo.opt_selectors import CycleSelector

try:
    from glompo.optimizers.cmawrapper import CMAOptimizer
except ModuleNotFoundError:
    raise ModuleNotFoundError("To run this example the cma package is required.")

if __name__ == '__main__':
    formatter = logging.Formatter("%(asctime)s : %(levelname)s : %(lineno)d : %(name)s :: %(message)s")

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)

    logger = logging.getLogger('glompo.manager')
    logger.addHandler(handler)
    logger.setLevel('INFO')

    task = Schwefel(dims=20)

    max_calls = 100000  # Imposed iteration budget
    checker = MaxFuncCalls(max_calls) | TargetCost(task.min_fx)  # Combined two conditions into a single stop criterion.

    sigma = (task.bounds[0][1] - task.bounds[0][0]) / 2
    call_args = {'sigma0': sigma}
    init_args = {'workers': 1, 'popsize': 12}
    selector = CycleSelector((CMAOptimizer, init_args, call_args))

    max_jobs = 10  # OR os.cpu_count()

    hunters = (EvaluationsUnmoving(100, 0.01) &  # Kill optimizers which are incorrectly focussing
               ValueAnnealing(0.10) |  # Keep competitive optimizers alive
               BestUnmoving(int(max_calls / 15), 0.2) |  # Kill optimizers that go nowhere for a long time
               ParameterDistance(task.bounds, 0.05))  # Kill optimizers that go to the same minimum

    generator = RandomGenerator(task.bounds)

    backend = 'processes'

    visualisation = True
    visualisation_args = {'record_movie': True,
                          'x_range': (0, max_calls),
                          'y_range': None,
                          'log_scale': False,
                          'events_per_flush': 500,
                          'interactive_mode': True,
                          'writer_kwargs': {'fps': 8},
                          'movie_kwargs': {'outfile': 'demo.mp4',
                                           'dpi': 200}}

    force_terminations = -1

    checkpointing = CheckpointingControl(checkpoint_at_conv=True,
                                         naming_format='customized_completed_%(date)_%(time).tar.gz',
                                         checkpointing_dir="customized_example_outputs")

    manager = GloMPOManager.new_manager(task=task,
                                        bounds=task.bounds,
                                        opt_selector=selector,
                                        working_dir="customized_example_outputs",  # Dir in which files are saved
                                        overwrite_existing=True,  # Replaces existing files in working directory
                                        max_jobs=max_jobs,
                                        backend=backend,
                                        convergence_checker=checker,
                                        x0_generator=generator,
                                        killing_conditions=hunters,
                                        share_best_solutions=False,  # Send good evals from one opt to another
                                        hunt_frequency=500,  # Function evaluations between hunts
                                        status_frequency=60,  # Interval in seconds in which a status message is logged
                                        checkpoint_control=checkpointing,
                                        summary_files=3,  # Controls the level of output produced
                                        is_log_detailed=False,  # Functions can produce extra info which can be logged
                                        visualisation=visualisation,
                                        visualisation_args=visualisation_args,
                                        force_terminations_after=-1,
                                        split_printstreams=True)  # Autosend print statements from opts to files

    result = manager.start_manager()

    print(f"Global min for Schwefel Function: {task.min_fx:.3E}")
    print("GloMPO minimum found:")
    print(result)

GloMPO contains built-in logging statements throughout the library. These will not show up by default but can be accessed if desired. In fact intercepting the logging.INFO level statements from the manager creates a nice progress stream from the optimization; we will set this up here. See Logging Messages for more information.

    formatter = logging.Formatter("%(asctime)s : %(levelname)s : %(lineno)d : %(name)s :: %(message)s")

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)

    logger = logging.getLogger('glompo.manager')
    logger.addHandler(handler)
    logger.setLevel('INFO')

In this example GloMPO will be run on a well known global optimization test function but each configuration option will be individually set and explained.

The Schwefel global optimization test function is a good example of where GloMPO can outperform normal optimization.

    task = Schwefel(dims=20)

Convergence of the GloMPO manager is controlled by BaseChecker objects. These are small classes which define a single termination condition. These classes can then be easily combined to create sophisticated termination conditions using & and | symbolics.

In this case we would like the optimizer to run for a fixed number of iterations or stop if the global minimum is found. Of course we would not know the global minimum in typical problems but we do in this case.

    max_calls = 100000  # Imposed iteration budget
    checker = MaxFuncCalls(max_calls) | TargetCost(task.min_fx)  # Combined two conditions into a single stop criterion.

We will configure the optimizers as was done in the Minimal example:

    sigma = (task.bounds[0][1] - task.bounds[0][0]) / 2
    call_args = {'sigma0': sigma}
    init_args = {'workers': 1, 'popsize': 12}
    selector = CycleSelector((CMAOptimizer, init_args, call_args))

The Minimal example discussed the importance of load balancing. In this example we will override the default number of slots and limit the manager to 10:

    max_jobs = 10  # OR os.cpu_count()

BaseHunter objects are set up in a similar way to BaseChecker objects and control the conditions in which optimizers are shut down by the manager. Each hunter is individually documented here.

In this example we will use a hunting set which has proven effective on several problems:

    hunters = (EvaluationsUnmoving(100, 0.01) &  # Kill optimizers which are incorrectly focussing
               ValueAnnealing(0.10) |  # Keep competitive optimizers alive
               BestUnmoving(int(max_calls / 15), 0.2) |  # Kill optimizers that go nowhere for a long time
               ParameterDistance(task.bounds, 0.05))  # Kill optimizers that go to the same minimum

Note

BaseHunter and BaseChecker objects are evaluated lazily: in x | y, y will not be evaluated if x is True, and in x & y, y will not be evaluated if x is False.
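The short-circuiting can be mimicked with a tiny combinable condition class (a sketch of the idea only, not the BaseChecker/BaseHunter implementation):

```python
class Condition:
    """A callable condition that combines lazily with & and |."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0  # track how often this condition is evaluated

    def __call__(self):
        self.calls += 1
        return self.fn()

    def __or__(self, other):
        # `or` short-circuits: other() is skipped when self() is True
        return Condition(lambda: self() or other())

    def __and__(self, other):
        # `and` short-circuits: other() is skipped when self() is False
        return Condition(lambda: self() and other())


x = Condition(lambda: True)
y = Condition(lambda: True)
(x | y)()  # x fires; y is never evaluated
```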

BaseSelector objects select which optimizers to start but BaseGenerator objects select a point in parameter space where to start them.

In this example we will use the RandomGenerator which starts optimizers at random locations.

    generator = RandomGenerator(task.bounds)
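Conceptually, the generator just draws a uniform point inside the bounds; a minimal sketch of that behaviour (not the RandomGenerator implementation):

```python
import random

def random_x0(bounds, rng=random):
    """Draw a uniformly random starting point within a list of (min, max) bounds."""
    return [rng.uniform(lo, hi) for lo, hi in bounds]

bounds = [(-500, 500)] * 3  # e.g. the usual Schwefel domain, assumed here
x0 = random_x0(bounds)
```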

GloMPO supports running the optimizers both as threads and processes. Processes are preferred and the default since they circumvent the Python Global Interpreter Lock but threads can also be used for tasks that are not multiprocessing safe. In this example we will use processes.

Attention

It is highly recommended that the user familiarize themselves with Python’s behavior in this regard! If all computations are performed within Python, then multithreading will NOT distribute the calculations over more than one core.

    backend = 'processes'

GloMPO includes a dynamic scope allowing one to watch the optimization progress in real-time using a graphic. This can be very helpful when configuring GloMPO, and the results can be saved as movie files. This functionality requires matplotlib and ffmpeg to be installed on the system.

This is turned on for this example, but if the script fails, simply set visualisation to False to bypass it. Note also that the scope is very helpful for preliminary configuration but is slow (due to matplotlib's limitations) and should not be used during production runs.

    visualisation = True
    visualisation_args = {'record_movie': True,
                          'x_range': (0, max_calls),
                          'y_range': None,
                          'log_scale': False,
                          'events_per_flush': 500,
                          'interactive_mode': True,
                          'writer_kwargs': {'fps': 8},
                          'movie_kwargs': {'outfile': 'demo.mp4',
                                           'dpi': 200}}

For buggy tasks which are liable to fail or produce extreme results, it is possible that optimizers get stuck and simply never return. If this is a risk, we can set a timeout after which the manager will force them to crash. Note that this will not work on threaded backends. In this example this is not needed, so we leave the default of -1.

    force_terminations = -1

GloMPO supports checkpointing. This means that its state can be persisted to file during an optimization and this checkpoint file can be loaded by another GloMPO instance to resume the optimization from that point. Checkpointing options are configured through a CheckpointingControl instance. In this case we will produce a checkpoint called customized_completed_<DATE>_<TIME>.tar.gz once the task has converged.
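To illustrate how such a naming format might expand (the exact substitution codes are defined by CheckpointingControl; the %(date)/%(time) expansion below is a hypothetical stand-in for demonstration only):

```python
from datetime import datetime

def resolve_name(fmt, now=None):
    """Hypothetical expansion of the %(date) and %(time) placeholders."""
    now = now or datetime.now()
    return (fmt.replace('%(date)', now.strftime('%Y%m%d'))
               .replace('%(time)', now.strftime('%H%M%S')))

name = resolve_name('customized_completed_%(date)_%(time).tar.gz',
                    now=datetime(2021, 3, 1, 12, 30, 0))
# -> 'customized_completed_20210301_123000.tar.gz'
```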

    checkpointing = CheckpointingControl(checkpoint_at_conv=True,
                                         naming_format='customized_completed_%(date)_%(time).tar.gz',
                                         checkpointing_dir="customized_example_outputs")

All arguments are now fed to the manager initialisation. Other settings which did not warrant detailed discussion above are commented upon below:

    manager = GloMPOManager.new_manager(task=task,
                                        bounds=task.bounds,
                                        opt_selector=selector,
                                        working_dir="customized_example_outputs",  # Dir in which files are saved
                                        overwrite_existing=True,  # Replaces existing files in working directory
                                        max_jobs=max_jobs,
                                        backend=backend,
                                        convergence_checker=checker,
                                        x0_generator=generator,
                                        killing_conditions=hunters,
                                        share_best_solutions=False,  # Send good evals from one opt to another
                                        hunt_frequency=500,  # Function evaluations between hunts
                                        status_frequency=60,  # Interval in seconds in which a status message is logged
                                        checkpoint_control=checkpointing,
                                        summary_files=3,  # Controls the level of output produced
                                        is_log_detailed=False,  # Functions can produce extra info which can be logged
                                        visualisation=visualisation,
                                        visualisation_args=visualisation_args,
                                        force_terminations_after=-1,
                                        split_printstreams=True)  # Autosend print statements from opts to files

To execute the minimization we simply run GloMPOManager.start_manager().

    result = manager.start_manager()

Finally we print the selected minimum.

    print(f"Global min for Schwefel Function: {task.min_fx:.3E}")
    print("GloMPO minimum found:")
    print(result)

Nudging

The nudging example is a variation of the Customized one. GloMPO will be run on the same task with virtually the same configuration, but in this case good iterations will be shared between optimizers. The optimizers, in turn, will use this information to accelerate their convergence. The user should see a marked improvement in GloMPO’s performance. Only two modifications to the Customized example are necessary:

In this case we tell CMA-ES to accept suggestions from the manager and sample these points once every 10 iterations.

    init_args = {'workers': 1, 'popsize': 12, 'force_injects': True, 'injection_frequency': 10}

The hunting must be reconfigured slightly to better suit the new optimization behavior:

    hunters = EvaluationsUnmoving(100, 0.01) | BestUnmoving(int(max_calls / 15), 0.2)

Managing Two-Level Algorithms

The twolevel script provides a simple demonstration of GloMPO managing a popular two-level algorithm: basin-hopping. By ‘two-level’ algorithm we mean metaheuristics which include periodic local searches, such as SciPy’s basin-hopping or dual annealing algorithms.

These algorithms typically have an upper level routine (usually a Monte-Carlo jump) which selects points to evaluate. Local search routines are then started at these points. One can configure GloMPO to manage the overall strategy by launching instances of routines as its children (see ScipyOptimizeWrapper).
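The two-level structure itself is easy to sketch in plain Python; below, a toy double-well problem with a crude descent standing in for the BFGS local search (an illustration of the idea, not SciPy's implementation):

```python
import random

def local_search(f, x, step=0.01, iters=400):
    # Crude 1-D descent standing in for the lower-level BFGS search.
    for _ in range(iters):
        for cand in (x - step, x + step):
            if f(cand) < f(x):
                x = cand
    return x

def basin_hop(f, x0, hops=50, jump=2.0, seed=0):
    rng = random.Random(seed)
    best = local_search(f, x0)
    for _ in range(hops):
        trial = best + rng.uniform(-jump, jump)  # upper level: Monte-Carlo jump
        trial = local_search(f, trial)           # lower level: local search
        if f(trial) < f(best):
            best = trial
    return best

# Asymmetric double well: local minimum near x = +0.96, global minimum near x = -1.04.
f = lambda x: (x ** 2 - 1) ** 2 + 0.3 * x
best = basin_hop(f, x0=1.0)  # starts in the wrong basin; the jumps should escape it
```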

In this example, however, we demonstrate a different approach. Here the ‘upper’ level algorithm (which chooses where optimizers are started) is used as the Generator, while the local searches are started as its children.

This is a proof of concept, showing how GloMPO’s management and supervision aspects can be brought into existing optimization strategies without requiring a large amount of reimplementation.

import logging
import sys

from glompo.benchmark_fncs import LennardJones
from glompo.convergence import MaxFuncCalls, TargetCost
from glompo.core.manager import GloMPOManager
from glompo.generators.basinhopping import BasinHoppingGenerator
from glompo.hunters import EvaluationsUnmoving
from glompo.opt_selectors import CycleSelector
from glompo.optimizers.scipy import ScipyOptimizeWrapper

try:
    from glompo.optimizers.cmawrapper import CMAOptimizer
except ModuleNotFoundError:
    raise ModuleNotFoundError("To run this example the cma package is required.")


class ShiftedLennardJones(LennardJones):
    def __call__(self, *args, **kwargs):
        return super().__call__(*args, **kwargs) + 28.422532 + 1


if __name__ == '__main__':
    formatter = logging.Formatter("%(asctime)s : %(levelname)s : %(name)s :: %(message)s")

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)

    logger = logging.getLogger('glompo.manager')
    logger.addHandler(handler)
    logger.setLevel('INFO')

    task = ShiftedLennardJones(atoms=10, dims=3)

    max_calls = 4000  # Imposed iteration budget
    checker = MaxFuncCalls(max_calls) | TargetCost(1)  # Combined two conditions into a single stop criterion.

    generator = BasinHoppingGenerator()

    call_args = {'jac': task.jacobian}
    init_args = {'workers': 1, 'method': 'BFGS'}
    selector = CycleSelector((ScipyOptimizeWrapper, init_args, call_args))

    max_jobs = 4

    hunters = EvaluationsUnmoving(100, 0.01)

    manager = GloMPOManager.new_manager(task=task,
                                        bounds=task.bounds,
                                        opt_selector=selector,
                                        working_dir="twolevel_example_outputs",  # Dir in which files are saved
                                        overwrite_existing=True,  # Replaces existing files in working directory
                                        max_jobs=max_jobs,
                                        backend='processes',
                                        convergence_checker=checker,
                                        x0_generator=generator,
                                        killing_conditions=hunters,
                                        share_best_solutions=False,  # Send good evals from one opt to another
                                        hunt_frequency=1,  # Function evaluations between hunts
                                        status_frequency=60,  # Interval in seconds in which a status message is logged
                                        summary_files=3,  # Controls the level of output produced
                                        visualisation=True,
                                        split_printstreams=True)  # Autosend print statements from opts to files

    result = manager.start_manager()

    print("GloMPO minimum found:")
    print(result)

In this example, we will use the LennardJones problem. The minimization seeks the arrangement of \(N\) atoms in \(d\)-dimensional space which minimizes the overall energy (balancing attractive and repulsive forces). We use a shifted version so that all values are positive and can be conveniently displayed on the output plots.

class ShiftedLennardJones(LennardJones):
    def __call__(self, *args, **kwargs):
        return super().__call__(*args, **kwargs) + 28.422532 + 1

    task = ShiftedLennardJones(atoms=10, dims=3)

The optimization will be terminated when the minimum is found or the function has been called 4000 times:

    max_calls = 4000  # Imposed iteration budget
    checker = MaxFuncCalls(max_calls) | TargetCost(1)  # Combined two conditions into a single stop criterion.

We will use the BasinHoppingGenerator to pick points in the same way that the basin-hopping algorithm does.

    generator = BasinHoppingGenerator()

The managed optimizers will be instances of the BFGS algorithm used by the full routine:

    call_args = {'jac': task.jacobian}
    init_args = {'workers': 1, 'method': 'BFGS'}
    selector = CycleSelector((ScipyOptimizeWrapper, init_args, call_args))

We can use a simple hunting scheme in this example, which terminates optimizers that stagnate at levels worse than those previously seen.

    hunters = EvaluationsUnmoving(100, 0.01)

The rest of the script follows the pattern of the previous examples. The components can now be given to the manager and run.

ReaxFF

Attention

This example requires the Amsterdam Modelling Suite be present on your system with licenses for ReaxAMS and ParAMS.

This example is also quite expensive and should be performed on an HPC node.

The reaxff script provides an example of GloMPO configured to manage several parallel CMA-ES reparameterizations of the disulfide training set designed by Muller et al. (2015).

import logging
import sys

from glompo.convergence import MaxSeconds
from glompo.core.checkpointing import CheckpointingControl
from glompo.core.manager import GloMPOManager
from glompo.generators import PerturbationGenerator
from glompo.hunters import EvaluationsUnmoving
from glompo.opt_selectors import CycleSelector

try:
    from glompo.optimizers.cmawrapper import CMAOptimizer
except ModuleNotFoundError:
    raise ModuleNotFoundError("To run this example the cma package is required.")

try:
    from glompo.interfaces.params import ReaxFFError
except ModuleNotFoundError:
    raise ModuleNotFoundError("To run this example the scm package (part of the Amsterdam Modelling Suite [scm.com]) "
                              "is required.")

if __name__ == '__main__':
    formatter = logging.Formatter("%(asctime)s : %(levelname)s : %(name)s :: %(message)s")

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)

    logger = logging.getLogger('glompo.manager')
    logger.addHandler(handler)
    logger.setLevel('INFO')

    cost_function = ReaxFFError.from_classic_files('../tests/_test_inputs')
    print("Reparameterizing ReaxFF Force Field for Disulfide Training Set")
    print("--------------------------------------------------------------")
    print("Active Parameters:", cost_function.n_parms, "/", cost_function.n_all_parms)
    print("Initial Error Value:",
          cost_function(cost_function.convert_parms_real2scaled(cost_function.par_eng.active.x)))

    checker = MaxSeconds(session_max=24 * 60 * 60)

    sigma = 0.25
    call_args = {'sigma0': sigma}
    init_args = {'workers': 4, 'popsize': 12}  # Change for better balancing on your available computational resources
    selector = CycleSelector((CMAOptimizer, init_args, call_args))

    max_jobs = 24  # Change for better balancing on your available computational resources

    hunters = EvaluationsUnmoving(25000, 0.01)  # Kill optimizers which are incorrectly focusing

    generator = PerturbationGenerator(x0=cost_function.convert_parms_real2scaled(cost_function.par_eng.active.x),
                                      bounds=cost_function.bounds,
                                      scale=[0.25] * cost_function.n_parms)

    backend = 'threads'  # DO NOT CHANGE

    checkpointing = CheckpointingControl(checkpoint_at_conv=True,
                                         naming_format='customized_completed_%(date)_%(time).tar.gz',
                                         checkpointing_dir="customized_example_outputs")

    manager = GloMPOManager.new_manager(task=cost_function,
                                        bounds=cost_function.bounds,
                                        opt_selector=selector,
                                        working_dir="reaxff_example_outputs",  # Dir in which files are saved
                                        overwrite_existing=True,  # Replaces existing files in working directory
                                        max_jobs=max_jobs,
                                        backend=backend,
                                        convergence_checker=checker,
                                        x0_generator=generator,
                                        killing_conditions=hunters,
                                        hunt_frequency=500,  # Function evaluations between hunts
                                        status_frequency=600,  # Interval in seconds in which a status message is logged
                                        checkpoint_control=checkpointing,  # Saves state for restart
                                        summary_files=3,  # Controls the level of output produced
                                        force_terminations_after=-1,
                                        split_printstreams=True)  # Autosend print statements from opts to files

    result = manager.start_manager()

    print("Final Error Value:", result.fx)

We begin by setting up the cost function from the trainset.in, geo and ffield files published with the paper. These are included in GloMPO’s tests/_test_inputs directory:

    cost_function = ReaxFFError.from_classic_files('../tests/_test_inputs')

The optimization will be time-limited in this case to 24hrs:

    checker = MaxSeconds(session_max=24 * 60 * 60)

Resource balancing is critically important in this case since the optimization is expensive. Generally the cost function evaluation time is a function of the parameters being tested since geometry optimizations take longer to converge for some parameter combinations.

Thus, one can set:

workers = popsize
This evaluates very quickly, but has the worst computational efficiency as many cores remain idle while waiting for the slowest evaluation.
workers = 1
This evaluates the cost function sequentially and is the most computationally efficient (no idling time), but requires the most wall time since there is no parallelization.

It is often best to choose a balance between these considerations and factor in the availability of computational resources and the number of parallel optimizations one would like to run.

In this example, we assume a 24 core machine and would like to run 6 optimizers in parallel. This suggests that each optimizer should run 4 function evaluations in parallel; a suitable load balancing compromise. We select a population size of 12 (multiple of 4) to further optimize computational efficiency.
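That balancing logic can be written out as a small calculation (a toy mirror of the reasoning above):

```python
def plan(total_cores, n_optimizers, popsize):
    """Split cores across optimizers; return workers per optimizer and
    the number of evaluation rounds needed per CMA generation."""
    workers = total_cores // n_optimizers
    rounds = -(-popsize // workers)  # ceiling division
    return workers, rounds

workers, rounds = plan(total_cores=24, n_optimizers=6, popsize=12)
# 4 workers per optimizer; a population of 12 is cleanly evaluated in 3 rounds
```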

We use the CMA-ES algorithm which has been shown to perform well on ReaxFF reparameterizations, and select a moderate \(\sigma\) value.

    sigma = 0.25
    call_args = {'sigma0': sigma}
    init_args = {'workers': 4, 'popsize': 12}  # Change for better balancing on your available computational resources
    selector = CycleSelector((CMAOptimizer, init_args, call_args))

Attention

It is important that you rebalance these parameters according to your available resources before running the example.

We will keep hunting simple in this example, and shut down optimizers which begin to converge towards points which are worse than those already seen. You could also consider including conditions based on the status of a validation set, or some other measure of overfitting. If you are testing a different training set, it may be helpful to include a condition to terminate optimizers which do not start to converge after a long time.

    hunters = EvaluationsUnmoving(25000, 0.01)  # Kill optimizers which are incorrectly focusing

In this example, we will look for parameter sets near the incumbent, as other good values are likely in the same region. It is possible to do a more exploratory search by using a RandomGenerator and a larger \(\sigma\) value, as was done in the Customized example. This creates the possibility of finding very different parameter sets, but may end up being more expensive as the optimizers explore non-physical and unstable parameters.

    generator = PerturbationGenerator(x0=cost_function.convert_parms_real2scaled(cost_function.par_eng.active.x),
                                      bounds=cost_function.bounds,
                                      scale=[0.25] * cost_function.n_parms)

'threads' must be used for the backend when working with ParAMS since it does not support multiprocessing at this level. However, actual evaluations of the cost function will be spun off as separate processes at the PLAMS level and will not be subject to the Python GIL.

    backend = 'threads'  # DO NOT CHANGE

A checkpoint is configured for the end of the optimization. If you would like to continue the optimization further, this file can be used to restart the optimization from the final state.

    checkpointing = CheckpointingControl(checkpoint_at_conv=True,
                                         naming_format='customized_completed_%(date)_%(time).tar.gz',
                                         checkpointing_dir="customized_example_outputs")

The rest of the script follows the pattern of the previous examples. The components can now be given to the manager and run.