Tutorial 2 - Generators, Properties and Calculators#
_
/|_|\
/ / \ \
/_/ \_\
\ \ / /
\ \_/ /
\|_|/
SOPRANO: a Python library for generation, manipulation and analysis of large batches of crystalline structures
Developed within the CCP-NC project. Copyright STFC 2022
# Basic imports
import os, sys
sys.path.insert(0, os.path.abspath('..')) # This to add the Soprano path to the PYTHONPATH
# so we can load it without installing it
# Other useful imports
import numpy as np
from ase import Atoms
from ase import io as ase_io
from ase.visualize.plot import plot_atoms
1 - USING GENERATORS#
Soprano provides a series of generators able to create multiple structures on one go based on simple criteria. One of these, used here, is the linspaceGen, which interpolates linearly between two extreme structures. Others are the rattleGen (generating copies of a given structure with random atomic displacements) and the airssGen (binding to AIRSS’ buildcell executable to generate random structures, only available if AIRSS is installed).
from soprano.collection import AtomsCollection
from soprano.collection.generate import linspaceGen
# Let's use the ammonia molecule switching configurations as an example
nh3coords = np.array([[ 2.5, 2.5, 2.5 ],
[ 3.4373, 2.5, 2.1193],
[ 2.0314, 3.3117, 2.1193],
[ 2.0314, 1.6883, 2.1193]])
nh3l = Atoms('NHHH', nh3coords, cell=[5,5,5]) # The cell is just an empty box
# Now the right version
nh3coords *= [1, 1, -1]
nh3r = Atoms('NHHH', nh3coords, cell=[5,5,5])
# Now let's build a collection of 21 intermediate steps between the two structures
# Note that 21 includes the two endpoints
nh3linsp = linspaceGen(nh3l, nh3r, steps=21, periodic=True)
# Generators can be passed directly to the AtomsCollection constructor
nh3coll = AtomsCollection(nh3linsp)
We can now plot the collection of structures generated by the linspaceGen. The example below shows how to do this.
# Make a 1x5 grid of subplots
from matplotlib import pyplot as plt
N = len(nh3coll)
N_plots = 5
fig, axs = plt.subplots(1, N_plots, figsize=(20, 4))
# Plot each structure in the grid
for i, ax in enumerate(axs):
structure_index = int(i * (N +N_plots-1)/ N_plots)
ax.set_title(f'Structure {structure_index}')
# Use the plot_atoms function to plot the structure
plot_atoms(nh3coll.structures[structure_index], ax=ax, rotation='0x,90y,0z', show_unit_cell=2,radii=0.65)

2 - PROPERTIES#
Soprano Properties are classes meant to extract complex arrays of information from collections. A number of these are provided by default, but advanced users can easily create their own class inheriting from the generic AtomsProperty class to implement particular needs.
from soprano.properties.linkage import LinkageList
# As a first experiment we try using LinkageList, a property meant to return a list of all pair interatomic distances
# in a system. This can serve as a fingerprint to distinguish different structures
# The basic usage is to just call the Property's method "get". In this way the Property is calculated with
# default parameters.
# The three shortest values (varying) are N-H distances, while the constant ones are H-H distances
print("---- Linkage List for all NH3 configurations - Default parameters\n")
print('\n'.join(['{0}'.format(x) for x in LinkageList.get(nh3coll)]), "\n\n")
---- Linkage List for all NH3 configurations - Default parameters
[1.01162016 1.01162016 1.01166387 1.62339512 1.62339512 1.6234 ]
[0.99791691 0.99791691 0.99796123 1.62339512 1.62339512 1.6234 ]
[0.98549462 0.98549462 0.98553949 1.62339512 1.62339512 1.6234 ]
[0.97440226 0.97440226 0.97444764 1.62339512 1.62339512 1.6234 ]
[0.96468572 0.96468572 0.96473156 1.62339512 1.62339512 1.6234 ]
[0.95638694 0.95638694 0.95643317 1.62339512 1.62339512 1.6234 ]
[0.94954307 0.94954307 0.94958964 1.62339512 1.62339512 1.6234 ]
[0.94418577 0.94418577 0.94423261 1.62339512 1.62339512 1.6234 ]
[0.94034044 0.94034044 0.94038747 1.62339512 1.62339512 1.6234 ]
[0.93802568 0.93802568 0.93807282 1.62339512 1.62339512 1.6234 ]
[0.93725282 0.93725282 0.9373 1.62339512 1.62339512 1.6234 ]
[0.93802568 0.93802568 0.93807282 1.62339512 1.62339512 1.6234 ]
[0.94034044 0.94034044 0.94038747 1.62339512 1.62339512 1.6234 ]
[0.94418577 0.94418577 0.94423261 1.62339512 1.62339512 1.6234 ]
[0.94954307 0.94954307 0.94958964 1.62339512 1.62339512 1.6234 ]
[0.95638694 0.95638694 0.95643317 1.62339512 1.62339512 1.6234 ]
[0.96468572 0.96468572 0.96473156 1.62339512 1.62339512 1.6234 ]
[0.97440226 0.97440226 0.97444764 1.62339512 1.62339512 1.6234 ]
[0.98549462 0.98549462 0.98553949 1.62339512 1.62339512 1.6234 ]
[0.99791691 0.99791691 0.99796123 1.62339512 1.62339512 1.6234 ]
[1.01162016 1.01162016 1.01166387 1.62339512 1.62339512 1.6234 ]
# If one wants to use parameters, an instance of the Property has to be created.
# For example LinkageList accepts a parameter "size" that limits the number of distances computed.
# This can then just be called on the AtomsCollection
customLL = LinkageList(size=3)
print("---- Linkage List for all NH3 configurations - Custom parameters\n")
print('\n'.join(['{0}'.format(x) for x in customLL(nh3coll)]), "\n\n")
---- Linkage List for all NH3 configurations - Custom parameters
[1.01162016 1.01162016 1.01166387]
[0.99791691 0.99791691 0.99796123]
[0.98549462 0.98549462 0.98553949]
[0.97440226 0.97440226 0.97444764]
[0.96468572 0.96468572 0.96473156]
[0.95638694 0.95638694 0.95643317]
[0.94954307 0.94954307 0.94958964]
[0.94418577 0.94418577 0.94423261]
[0.94034044 0.94034044 0.94038747]
[0.93802568 0.93802568 0.93807282]
[0.93725282 0.93725282 0.9373 ]
[0.93802568 0.93802568 0.93807282]
[0.94034044 0.94034044 0.94038747]
[0.94418577 0.94418577 0.94423261]
[0.94954307 0.94954307 0.94958964]
[0.95638694 0.95638694 0.95643317]
[0.96468572 0.96468572 0.96473156]
[0.97440226 0.97440226 0.97444764]
[0.98549462 0.98549462 0.98553949]
[0.99791691 0.99791691 0.99796123]
[1.01162016 1.01162016 1.01166387]
# Now we can try creating a custom property. This one will calculate the center of mass of all Hydrogen atoms.
from soprano.properties import AtomsProperty
class HydrogenCOM(AtomsProperty):
default_name = 'hydrogen_com' # These need to be defined for any property
default_params = {}
@staticmethod
def extract(s): # This is where the core of the calculation happens
# s is a single Atoms object passed to this method
chemsyms = s.get_chemical_symbols()
h_inds = [i for i, sym in enumerate(chemsyms) if sym == 'H']
h_pos = s.get_positions()[h_inds]
com = np.average(h_pos, axis=0)
return com
print("---- Hydrogen COM for all NH3 configurations\n")
print('\n'.join(['{0}'.format(x) for x in HydrogenCOM.get(nh3coll)]), "\n\n")
---- Hydrogen COM for all NH3 configurations
[2.50003333 2.5 2.1193 ]
[2.50003333 2.5 2.15737 ]
[2.50003333 2.5 2.19544 ]
[2.50003333 2.5 2.23351 ]
[2.50003333 2.5 2.27158 ]
[2.50003333 2.5 2.30965 ]
[2.50003333 2.5 2.34772 ]
[2.50003333 2.5 2.38579 ]
[2.50003333 2.5 2.42386 ]
[2.50003333 2.5 2.46193 ]
[2.50003333 2.5 2.5 ]
[2.50003333 2.5 2.53807 ]
[2.50003333 2.5 2.57614 ]
[2.50003333 2.5 2.61421 ]
[2.50003333 2.5 2.65228 ]
[2.50003333 2.5 2.69035 ]
[2.50003333 2.5 2.72842 ]
[2.50003333 2.5 2.76649 ]
[2.50003333 2.5 2.80456 ]
[2.50003333 2.5 2.84263 ]
[2.50003333 2.5 2.8807 ]
Average properties#
If you want to compute the mean of a property over a collection of structures or over sites within a structure, you can use the .mean
method. For example:
AtomsProperty().mean(collection, axis=0)
will compute the mean of the property over all structures in the collection, while
AtomsProperty().mean(collection, axis=1)
will compute the mean of the property over all sites for each structure in the collection.
In the previous cell we defined a new AtomsProperty
called HydrogenCOM
in which we compute the center of mass of the hydrogen atoms in a structure. We could have instead approached this problem using a much simpler property that gets all cartesian coordinates, and then use a custom selection to return the hydrogen positions and then compute the mean of those to get the center of mass.
# Simple class to return the positions of all atoms in the system
class AtomsPositions(AtomsProperty):
default_name = 'atoms_positions'
default_params = {}
@staticmethod
def extract(s):
return s.get_positions()
# Now we can use the selection argument to select only the H atoms and choose to
# average them along the first axis (i.e. average over all the H positions
# within each structure in the collection)
H_COM_alternative = AtomsPositions().mean(nh3coll, selection="H", axis=1)
# This is equivalent to the previous property, but it shows how to use the selection
# and mean methods to get the same result
print("---- Hydrogen COM for all NH3 configurations\n")
print('\n'.join(['{0}'.format(x) for x in H_COM_alternative]), "\n\n")
---- Hydrogen COM for all NH3 configurations
[2.50003333 2.5 2.1193 ]
[2.50003333 2.5 2.15737 ]
[2.50003333 2.5 2.19544 ]
[2.50003333 2.5 2.23351 ]
[2.50003333 2.5 2.27158 ]
[2.50003333 2.5 2.30965 ]
[2.50003333 2.5 2.34772 ]
[2.50003333 2.5 2.38579 ]
[2.50003333 2.5 2.42386 ]
[2.50003333 2.5 2.46193 ]
[2.50003333 2.5 2.5 ]
[2.50003333 2.5 2.53807 ]
[2.50003333 2.5 2.57614 ]
[2.50003333 2.5 2.61421 ]
[2.50003333 2.5 2.65228 ]
[2.50003333 2.5 2.69035 ]
[2.50003333 2.5 2.72842 ]
[2.50003333 2.5 2.76649 ]
[2.50003333 2.5 2.80456 ]
[2.50003333 2.5 2.84263 ]
[2.50003333 2.5 2.8807 ]
3 - CALCULATORS#
The Atomic Simulation Environment provides bindings to many codes in the form of calculators. These include ab initio codes like CASTEP and VASP as well as empirical force fields. These calculators can be set and used in Soprano as well. Here we’re going to use the most basic one, the Lennard-Jones force field, as an example.
from ase.calculators.lj import LennardJones
from soprano.properties.basic import CalcEnergy
nh3coll.set_calculators(LennardJones) # Creates calculators of the given type for all structures
print("---- NH3 Lennard-Jones energy for all configurations ----\n")
print('\n'.join(['{0}'.format(x) for x in CalcEnergy.get(nh3coll)]), "\n\n")
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[9], line 4
1 from ase.calculators.lj import LennardJones
2 from soprano.properties.basic import CalcEnergy
----> 4 nh3coll.set_calculators(LennardJones) # Creates calculators of the given type for all structures
6 print("---- NH3 Lennard-Jones energy for all configurations ----\n")
7 print('\n'.join(['{0}'.format(x) for x in CalcEnergy.get(nh3coll)]), "\n\n")
File ~/work/soprano/soprano/soprano/collection/collection.py:409, in AtomsCollection.set_calculators(self, calctype, labels, params)
407 from ase.calculators.calculator import Calculator as cCalculator
408 from ase.calculators.calculator import FileIOCalculator as ioCalculator
--> 409 from ase.calculators.general import Calculator as gCalculator
411 if (
412 (gCalculator not in calctype.__bases__)
413 and (cCalculator not in calctype.__bases__)
414 and (ioCalculator not in calctype.__bases__)
415 ):
416 raise TypeError("calctype must be a type of ASE Calculator")
ModuleNotFoundError: No module named 'ase.calculators.general'
# Now let's try a plot
%matplotlib inline
import matplotlib.pyplot as plt
comz = np.array(HydrogenCOM.get(nh3coll))[:,2]-2.5
ljE = CalcEnergy.get(nh3coll)
plt.plot(comz, ljE)
[<matplotlib.lines.Line2D at 0x7c93a26eedb0>]
