CrystalStructurePrediction(initial_population, number_of_generations, selection_pressure=2.0, number_of_elites=None, number_to_promote=1, heredity_probability=0.6, permutation_probability=0.0, mutation_probability=0.4, sigma_lattice=0.7, max_forces=PhysicalQuantity(0.01, eV/Ang), max_stress=PhysicalQuantity(0.1, GPa), max_steps=1000, max_step_length=PhysicalQuantity(0.2, Ang), external_pressure=None, rng=None, write_population=True, population_format=None, log_filename_prefix='generation_', permutation_operator=<class 'NL.Dynamics.CrystalStructurePrediction.GeneticOperators.PermutationOperator'>, heredity_operator=<class 'NL.Dynamics.CrystalStructurePrediction.GeneticOperators.HeredityOperator'>, mutation_operator=<class 'NL.Dynamics.CrystalStructurePrediction.GeneticOperators.MutationOperator'>)¶
Function to perform a crystal sturcture search by applying a genetic algorithm on a given initial population.
- initial_population (list of individuals) – A list of initial BulkConfigurations
- number_of_generations (int) – The number of times the population should be evolved.
- selection_pressure (float) – The pressure applied during the selection of the parents from the population. The higher the pressure the less likely it is to pick an individual with lower fitness. Default: 2.0
- number_of_elites (int) – Only consider the
number_of_elitesfittest individuals when making the selection. Default: Consider all individuals during selection.
- number_to_promote (int) – The number of most fittest individuals to promote to the next generation. Default: 1
- heredity_probability (float) – The unnormalized probabilty of using the heredity operator. Default: 0.6
- permutation_probability (float) – The unnormalized probabilty of using the permutation operator. The permutation operator will only be used when there is more than one type of an element in the system. Default: 0.0
- mutation_probability (float) – The probabilty of using the mutation operator. Default: 0.4
- sigma_lattice (float) – The standard deviation of the random strain applied by the mutation operator. Default: 0.7
- max_forces (PhysicalQuantity in units of force) – The maximum forces determining when the optimization should stop. Default: 0.01*eV/Ang
- max_stress (PhysicalQuantity in units of pressure) – The maximum stress determining when the optimization should stop. Default: 0.1*GPa
- max_steps (uint) – The maximum number of steps for local optimization. Default: 1000
- max_step_length (PhysicalQuantity in units of length) – The maximum step length the local optimizer may take. Default: 0.2*Ang
- external_pressure (PhysicalQuantity in units of pressure) – The target external pressure of the system. Default: 0*GPa
- rng (numpy.random.RandomState) – The random number generated used to generate the random number stream. Default: A new random number generator (seeded from OS entropy pool)
- write_population (bool) – Enables saving the configuration of each individual configuration during the crystal structure prediction. Default: True
- population_format (str) – Sets the storage format for populations. Default: hdf5
- log_filename_prefix (str) – The filename prefix to use for logging the individual optimizations.
Evolves the current population for
Parameters: number_of_generations (int) – The number of times the genetic algorithm is applied to the population.
Return the fitnesses of the current population. Sorted by descending fitness.
Returns: a list of fitnesses Return type: list
Return the current population.
Returns: a population Return type: list of individuals
Return the international symbol for each individual of the current population. Sorted by descending fitness.
Returns: a list of international symbols Return type: list of str
Determine the global minimum energy crystal structure for TiO2.
# ------------------------------------------------------------- # Initial Population # ------------------------------------------------------------- initial_population = generateInitialPopulation( elements=elements, population_size=10, calculator=calculator, volume=206.29*Angstrom**3, max_forces=0.01*eV/Angstrom, max_stress=0.1*GPa, max_steps=1000, max_step_length=0.2*Angstrom, external_pressure=0.0*GPa, rng=rng, log_filename_prefix="initial_population_", ) # ------------------------------------------------------------- # Crystal Structure Prediction # ------------------------------------------------------------- global_optimization = CrystalStructurePrediction( initial_population, number_of_generations=16, heredity_probability=0.85, mutation_probability=0.1, permutation_probability=0.05, number_to_promote=2, number_of_elites=int(0.6*population_size), selection_pressure=3.0, population_format='hdf5', max_forces=0.005 * eV/Angstrom, )
The goal of crystal structure prediction is to explore the state space of a compound and locate the structure with the maximum fitness for a given chemical composition. In QuantumATK, the fitness function is set to the negative enthalpy, i.e. the algorithm converges to the structure with the minimal enthalpy (or energy if no external pressure is applied). The primary input to the calculation is the chemical formula, which determines the number of atoms in the unit cell. It is important that enough atoms are present in the unit cell to represent the crystal structure. Just because the correct stoichiometric ratio is used, does not ensure that it is possible to represent the minimum energy structure.
The crystal structure prediction algorithm in QuantumATK uses a genetic algorithm. The genetic algorithm begins with an initial population of structures provided by the user. The initial population can be generated using the generateInitialPopulation function. Each step of the algorithm involves generating the next population of structures from the previous one. This evolution process is done in the following manner:
- Promote the
number_to_promotefittest individuals to the next generation.
- Loop until there are
population_sizeindividuals in the new generation:
- Pick which genetic operation should be performed. Each operation is given a user defined probability of being selected.
- Select one (mutation and permutation operator) or two parent structures (heredity operator). The selection probability is weighted by the fitness of the parental structures.
- Execute the genetic operation and create a new structure.
- Perform a local optimization of each individual of the population.
There are four types of genetic operators available:
- Promotion: this operation is always performed at the beginning of the
evolution process to ensure that the best
number_to_promoteindividuals are kept.
- Mutation: perturb the lattice vectors and atomic positions of the chosen configuration randomly.
- Permutation: swap the position of atoms of differing elements.
- Heredity: combine two parental structures to a new one. A random cut plane is used to divide the atoms of the two parent structures; all atoms below the plane in one structure are combined with the atoms above the plane in the other structure to create a new “child” structure. Care is taken to ensure that the total number of each element is not changed.
The algorithm stops after evolving the population
times. Given the stochastic nature of the algorithm, this stopping criterion
does not ensure that one has found the global solution. The stability of the
search should be checked at least with respect to the chosen seed of the pseudo
random number generator (
After generating a new population, each individual of the population is away
from a local minimum in enthalpy, hence a geometry optimization must be
performed to converge to a close local minimum. Since the global minimum is
also a local minimum, one is guaranteed to find it for sufficiently large
number_of_generations. These geometry (local) optimizations
comprise nearly all of the computational work during crystal structure
One key difference between a genetic algorithm and a random search is that
the selection of the parent structure(s), on which the genetic operations
are performed, is biased by the fitness of that structure. The higher the
fitness (i.e. the lower the calculated enthalpy in this implementation),
the more likely it is to pick this structure. The skewness of the
probability distribution is altered by
selection_pressure. A large
selection_pressure makes it less likely to select a
configuration with a lower fitness.
A crystal structure prediction run can be restarted / resumed from a previous point in the following manner:
CSP = CrystalStructurePrediction( ... ) # Save the current state of the CSP object. nlsave('my_file.hdf5', CSP, object_id='my_run') ... # Read back a stored CSP object. The last population is restored # and all given properties / attributes are set up. CSP = nlread('my_file.hdf5', object_id='my_run') # Run 10 more iterations of the genetic algorithm. CSP.evolvePopulation(10) # Run 5 more iterations of the genetic algorithm. CSP.evolvePopulation(5)
True then after each step of the genetic
algorithm the current population will be written to disk. The filename will
log_filename_prefix and include the current generation number.
For example, the initial population will be saved as
and after the first step a new file
generation_1.hdf5 will be saved. The
file format can be either
nc for NetCDF or
hdf5 for HDF5, by default
HDF5 is used. The crystal structures contained in these files can be viewed
in QuantumATK using the Viewer.