soprano.scripts.cli_utils

A collection of functions and options for the command line interface for soprano.

Functions

add_options(options)

apply_df_filtering(df, include, exclude, query)

Include/exclude columns and filter the dataframe using a pandas query.

average_quaternions_by_tags(quaternions, tags)

For repeated tags, average the quaternions.

configure(ctx, param, filename)

expand_aliases(input_list, alias_dict)

find_XHn_groups(atoms, pattern_string[, ...])

Find groups of atoms based on a functional group pattern.

get_column_list(ctx, parameter, value)

Parse the comma-separated column names string. Some shortcuts are defined for MS_angles and EFG_angles.

get_duplicates(seq)

Returns dict {duplicate_value: [indices]} for duplicates in a list

get_matching_cols(df, lst)

Get the columns of a dataframe that roughly match a list of strings.

get_missing_cols(df, lst)

Get the items in list that don't match any of the columns of a dataframe

has_CH_bonds(atoms[, rcut])

Check if the atoms object has bonds with CH atoms. Does this simply by comparing the labels with the element symbols.

isotope_selection(ctx, parameter, isotope_string)

Parse the isotope string.

keyvalue_parser(ctx, parameter, value)

Parse strings in the form 'C:1,H:2' into a dictionary.

print_results(dfs[, output, output_format, ...])

reload_as_molecular_crystal(atoms[, force])

If the atoms object is a molecular crystal, reload it with the correct connectivity.

sortdf(df, sortby, sort_order)

Sort the dataframe by column and return a new dataframe.

units_rename(colname[, units_dict])

viewimages(images[, reload_as_molecular])

Use ASE GUI to view the images.

soprano.scripts.cli_utils._validate_df_output_extension(output_filename, output_format=None)

Validate the output filename extension.

Parameters:
  • output_filename (str) – The output filename.

  • output_format (str | None) – The output format.

Returns:

The output filename.

Return type:

str

Raises:

click.UsageError – If the output format is not valid.

soprano.scripts.cli_utils.apply_df_filtering(df, include, exclude, query, essential_columns=[], logger=None)

Include/exclude columns and filter the dataframe using a pandas query.

Parameters:
  • df (pd.DataFrame) – the dataframe to filter

  • include (list) – list of columns to include

  • exclude (list) – list of columns to exclude

  • query (str) – pandas query string to filter the dataframe

  • essential_columns (list) – list of columns that must be included

  • logger (logging.Logger) – logger to use

Returns:

the filtered dataframe

Return type:

pd.DataFrame

soprano.scripts.cli_utils.average_quaternions_by_tags(quaternions, tags)

For repeated tags, average the quaternions. Return the modified list of quaternions.
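
The entry above does not say how the average is computed. As a rough illustration only, the sketch below groups quaternions by tag and replaces each repeated-tag group with a sign-aligned, renormalised arithmetic mean, a common approximation for nearby rotations. The name `average_by_tags`, the tuple representation, and the averaging method are all assumptions for this example and may differ from what soprano actually does.

```python
import math
from collections import defaultdict


def average_by_tags(quaternions, tags):
    """Hypothetical sketch: replace quaternions that share a tag with their mean.

    Quaternions are plain 4-tuples here. The mean is a sign-aligned,
    renormalised arithmetic mean (an assumption, not soprano's confirmed method).
    """
    # Collect the indices belonging to each tag.
    groups = defaultdict(list)
    for index, tag in enumerate(tags):
        groups[tag].append(index)

    averaged = [tuple(q) for q in quaternions]
    for indices in groups.values():
        if len(indices) < 2:
            continue  # unique tag: leave the quaternion untouched
        reference = quaternions[indices[0]]
        total = [0.0, 0.0, 0.0, 0.0]
        for i in indices:
            q = quaternions[i]
            # q and -q describe the same rotation, so flip q onto the
            # same hemisphere as the reference before summing.
            dot = sum(a * b for a, b in zip(q, reference))
            sign = 1.0 if dot >= 0 else -1.0
            for k in range(4):
                total[k] += sign * q[k]
        norm = math.sqrt(sum(c * c for c in total))
        mean = tuple(c / norm for c in total)
        for i in indices:
            averaged[i] = mean
    return averaged


quats = [(1, 0, 0, 0), (-1, 0, 0, 0), (0, 1, 0, 0)]
result = average_by_tags(quats, tags=[7, 7, 8])
```

In this sketch the returned list has the same length as the input, with every member of a repeated-tag group set to the group mean.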

soprano.scripts.cli_utils.find_XHn_groups(atoms, pattern_string, tags=None, vdw_scale=1.0)

Find groups of atoms based on a functional group pattern. The pattern is a string such as CH3 or CH2. It must contain an element symbol, H, and the number of H atoms.

Parameters:
  • atoms (ase.Atoms) – Atoms object on which to perform selection

  • pattern_string (str) – functional group pattern, e.g. ‘CH3’ for a methyl group. Assumes the group is the thing(s) connected to the first atom. Patterns can be combined, comma separated. TODO: add SMILES/SMARTS support?

  • vdw_scale (float) – scale factor for vdw radius (used for bond searching)

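
To make the pattern format concrete, here is a minimal sketch that only parses a comma-separated pattern string such as 'CH3,NH2' into (element, number of H) pairs. It does not perform any bond search, and `parse_xhn_patterns` and its regular expression are hypothetical helpers written for this example, not part of soprano.

```python
import re

# An element symbol, a literal H, then the number of H atoms (e.g. CH3, NH2).
GROUP_PATTERN = re.compile(r"^([A-Z][a-z]?)H(\d+)$")


def parse_xhn_patterns(pattern_string):
    """Hypothetical helper: split 'CH3,NH2' into [('C', 3), ('N', 2)]."""
    groups = []
    for item in pattern_string.split(","):
        match = GROUP_PATTERN.match(item.strip())
        if match is None:
            raise ValueError(f"Not an XHn pattern: {item!r}")
        element, n_hydrogens = match.groups()
        groups.append((element, int(n_hydrogens)))
    return groups


print(parse_xhn_patterns("CH3,NH2"))  # [('C', 3), ('N', 2)]
```
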
soprano.scripts.cli_utils.get_column_list(ctx, parameter, value)

Parse the column names string. TODO: Document this a bit better.

Parameters:
  • ctx – click context

  • parameter – click parameter

  • value (str) – The column names, comma-separated. Some shortcuts defined for MS_angles and EFG_angles.

Returns:

The column names specified.

Return type:

list

soprano.scripts.cli_utils.get_duplicates(seq)

Returns dict {duplicate_value: [indices]} for duplicates in a list
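
The return shape described above can be illustrated with a short standalone sketch. `find_duplicates` is a hypothetical reimplementation written for this example; it only mirrors the documented {duplicate_value: [indices]} result and is not taken from the soprano source.

```python
from collections import defaultdict


def find_duplicates(seq):
    """Hypothetical sketch: map each repeated value to the indices where it occurs."""
    positions = defaultdict(list)
    for index, value in enumerate(seq):
        positions[value].append(index)
    # Keep only values that appear more than once.
    return {value: idxs for value, idxs in positions.items() if len(idxs) > 1}


tags = [1, 2, 2, 3, 1, 4]
print(find_duplicates(tags))  # {1: [0, 4], 2: [1, 2]}
```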

soprano.scripts.cli_utils.get_matching_cols(df, lst)

Get the columns of a dataframe that roughly match a list of strings.

soprano.scripts.cli_utils.get_missing_cols(df, lst)

Get the items in list that don’t match any of the columns of a dataframe

soprano.scripts.cli_utils.has_CH_bonds(atoms, rcut=1.5)

Check if the atoms object has bonds with CH atoms. Does this simply by comparing the labels with the element symbols. If they’re all the same, then no special labels are present.

Parameters:
  • atoms (Atoms)

  • rcut (float)

Return type:

bool

soprano.scripts.cli_utils.isotope_selection(ctx, parameter, isotope_string)

Parse the isotope string.

Parameters:
  • ctx – click context

  • parameter – click parameter

  • isotope_string (str) – The isotopes specification, in the form "2H,15N" for deuterium and 15N.

Returns:

The isotope for each element specified, formatted as {Element: Isotope}.

Return type:

dict
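
As an illustration of the "2H,15N" format and the {Element: Isotope} result, the sketch below parses such a string with a regular expression. `parse_isotopes` is a hypothetical standalone helper; storing the isotope as an integer mass number and raising `ValueError` on bad input are assumptions of this example, not confirmed soprano behaviour.

```python
import re

# A mass number followed by an element symbol, e.g. 2H or 15N.
ISOTOPE_PATTERN = re.compile(r"^(\d+)([A-Z][a-z]?)$")


def parse_isotopes(isotope_string):
    """Hypothetical sketch: turn '2H,15N' into {'H': 2, 'N': 15}."""
    isotopes = {}
    for item in isotope_string.split(","):
        match = ISOTOPE_PATTERN.match(item.strip())
        if match is None:
            raise ValueError(f"Cannot parse isotope {item!r}")
        mass_number, element = match.groups()
        isotopes[element] = int(mass_number)
    return isotopes


print(parse_isotopes("2H,15N"))  # {'H': 2, 'N': 15}
```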

soprano.scripts.cli_utils.keyvalue_parser(ctx, parameter, value)

Parse strings in the form ‘C:1,H:2’ into a dictionary.

Also accepts = as the separator between key and value. Used, for example, for the MS shift reference and gradient strings.

Parameters:
  • ctx – click context

  • parameter – click parameter

  • value (str) – The references specification, in the form "C:100,H:123". If value is a single float, that will be returned instead of a dict.

Returns:

The values for each key specified, formatted as {key: value}.

Return type:

dict
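
The parsing rules described for this entry (':' or '=' as separator, a bare number returned as a float) can be sketched as follows. `parse_keyvalue` is a hypothetical standalone function; it drops the click `ctx` and `parameter` arguments, and converting every value to `float` is an assumption of this example.

```python
import re


def parse_keyvalue(value):
    """Hypothetical sketch: parse 'C:100,H:123' or 'C=100' into a dict.

    A string holding a single number is returned as a float instead.
    """
    text = value.strip()
    try:
        return float(text)  # e.g. "31.7": one value rather than key/value pairs
    except ValueError:
        pass

    result = {}
    for item in text.split(","):
        # Accept either ':' or '=' between key and value.
        parts = re.split(r"[:=]", item.strip(), maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"Cannot parse {item!r}; expected key:value or key=value")
        key, raw = parts
        result[key.strip()] = float(raw)
    return result


print(parse_keyvalue("C:100,H:123"))  # {'C': 100.0, 'H': 123.0}
print(parse_keyvalue("H=30.5"))       # {'H': 30.5}
print(parse_keyvalue("31.7"))         # 31.7
```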

soprano.scripts.cli_utils.reload_as_molecular_crystal(atoms, force=False)

If the atoms object is a molecular crystal, reload it with the correct connectivity.

We check if there are any C-H bonds, and if so, we assume it’s a molecular crystal and reload it as such. If there are no C-H bonds, we assume it’s not and return the original atoms object, unless force=True.

Parameters:
  • atoms (ASE Atoms object) – the atoms object to be reloaded.

  • force (bool) – if True, force the reload even if there are no C-H bonds.

Returns:

the atoms object with the correct connectivity.

Return type:

atoms (ASE Atoms object)

soprano.scripts.cli_utils.sortdf(df, sortby, sort_order)

Sort the dataframe by column and return a new dataframe.

soprano.scripts.cli_utils.viewimages(images, reload_as_molecular=None)

Use ASE GUI to view the images.

If they contain C and H, we’ll assume it’s a molecular crystal and reload it as such.

Parameters:
  • images (list) – list of ASE Atoms objects to view

  • reload_as_molecular (bool) – whether to reload as a molecular crystal. Default is None. This will be set to True if the images contain a C-H bond.