Tutorial 0 - Python quickstart guide#
SOPRANO: a Python library for generation, manipulation and analysis of large batches of crystalline structures
Developed within the CCP-NC project. Copyright STFC 2022
In this tutorial we look at how to quickly get started with Python, Jupyter notebooks and using Soprano.
Setting up Python#
There are many ways to run Python, either on your own machine or in the cloud. In these tutorials we use Jupyter notebooks, which are a great way to run Python code interactively, combining text, code and code-outputs such as plots. More on these notebooks later.
Installing Python#
If you are new to Python, we recommend using the Anaconda distribution, which is a free and open-source distribution of Python. It is widely used in the scientific community and comes with many pre-installed packages. You can download it from here and it is available for Windows, macOS and Linux.
There are different flavours of “conda”, but we recommend using the full Anaconda distribution if you’re new to Python. This will install Python, Jupyter notebooks and many other useful packages. If you have limited disk space, you can use the Miniconda distribution, which is a smaller version of Anaconda. Another alternative is Mamba, which can be a faster and more efficient package manager than conda.
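For Windows users, the Anaconda installer can add Python to your PATH (this is an option during installation), so you can run Python from the command line. If you do not select that option, you can run Python from the Anaconda Prompt instead. You can also use the Anaconda Navigator, which is a graphical user interface for managing packages and environments.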
Alternatively, WSL (Windows Subsystem for Linux) allows you to run a Linux distribution on Windows. You can then install Python using the package manager of the Linux distribution you are using (see the instructions below). Using WSL is a good option if you want to use Python in a Linux-like environment (sometimes scientific packages are less well tested on Windows, for example). It also helps when it comes to ssh’ing into an HPC cluster such as Young.
For Linux users, you can also install Python via Anaconda. Or you can use your package manager. For example, on Ubuntu you can install Python using the following command:
sudo apt-get install python3
For macOS users, you can also install Python via Anaconda. Or you can use Homebrew, which is a package manager for macOS. If you have Homebrew already, you can install Python using the following command:
brew install python3
Using Python in the cloud#
Instead of installing Python on your own machine, you can also use Python in the cloud via e.g. Google Colab. This is a free service that allows you to run Python code in the cloud. You can also use Binder to run Jupyter notebooks in the cloud. These tutorials can be launched in either Binder or Google Colab by clicking on the respective badge after hovering over the rocket icon at the top of the page. Note that launching in Binder may take a few minutes but brings you into an environment with Soprano pre-installed, whereas Google Colab requires you to install the software yourself (you need to add !pip install soprano into a new cell and run that first).
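Before running the rest of a notebook, it can be handy to check whether Soprano is already available in the current environment. The following is only a minimal sketch using the standard library: the helper name is_installed is made up for this example, and it assumes the package is imported under the name soprano.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when no such top-level module is available
    return importlib.util.find_spec(module_name) is not None


# Assumption: the import name of the package is "soprano"
if is_installed("soprano"):
    print("Soprano is available in this environment.")
else:
    print("Soprano not found; in Colab, run '!pip install soprano' in a cell first.")
```

The check only looks the module up and does not import it, so it is cheap to run at the top of a notebook.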
Environment management#
Once you have Python installed, it is a good idea to create a new environment for each project (or set of projects with similar requirements) you are working on. This helps to keep your packages separate and avoid conflicts between different versions of packages. The Anaconda Navigator allows you to create and manage environments using a graphical user interface.
You can also create a new environment using conda with the following command:
conda create --name myenv python=3.9
This will create a new environment called myenv with Python 3.9. You can then activate the environment with the following command:
conda activate myenv
You can then install packages into this environment using conda install or pip install.
Alternatively, if you installed Python using your package manager, you can use virtualenv to create environments. You can install virtualenv using the following command:
pip install virtualenv
You can then create a new environment using the following command:
virtualenv myenv
This will create the environment in your current directory - you can specify a different directory if you want.
You can activate the environment using the following command:
source myenv/bin/activate
where myenv is the name of your environment and also the name of the directory where the environment is stored.
You can then install packages using pip install.
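If you are ever unsure which environment is active, Python itself can give some hints. The sketch below is one possible way to check, not an official tool: the function name describe_environment is hypothetical, the virtualenv test assumes a reasonably recent virtualenv (one that sets sys.base_prefix), and the conda test relies on the CONDA_DEFAULT_ENV variable that conda sets on activation.

```python
import os
import sys


def describe_environment() -> dict:
    """Collect a few hints about which Python environment is active."""
    return {
        # Path of the interpreter that is running this code
        "executable": sys.executable,
        # Inside a virtual environment the two prefixes differ
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # Name of the active conda environment, or None if not set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this before and after activating myenv should show the executable path changing to one inside the environment.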
Running Python#
Once installed, you can use Python either by running a python script (python my_script.py) or by using an interactive Python shell (python or, better, ipython, if it’s installed) from your terminal.
You can also use Jupyter notebooks, which allow you to run Python code interactively in a browser, or some other IDE (Integrated Development Environment) like PyCharm. Another excellent IDE is VS Code which comes with excellent Python and Jupyter notebook support via the Python extension and is a popular choice for many developers. VS Code also works well with WSL (Windows Subsystem for Linux) if you are using Windows.
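As a concrete starting point, a file such as my_script.py could look like the sketch below. Its contents are purely illustrative (the greeting function is invented for this example); it simply reports the Python version so you can confirm that the interpreter you expect is the one being run.

```python
"""my_script.py: a minimal script to confirm that Python runs correctly."""
import platform


def greeting() -> str:
    """Build a short message that includes the running Python version."""
    return f"Hello from Python {platform.python_version()}"


if __name__ == "__main__":
    # This part only runs when the file is executed as a script,
    # e.g. with "python my_script.py", not when it is imported.
    print(greeting())
```

The same function can be tried interactively by starting python or ipython in the same directory and typing from my_script import greeting.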
Jupyter notebooks#
There is excellent documentation on installing and using Jupyter notebooks on the web (for example here).
Once you have a notebook running, here are a few tips for using notebooks for computational materials science:
Use the %matplotlib inline magic command to display plots inline in the notebook. This is especially useful when using matplotlib to plot data. You can also use %matplotlib notebook for interactive plots.
Use ? to get help on a function or module. For example, np.linspace? will show you the documentation for the linspace function in numpy.
Use ! to run shell commands. For example, !ls will list the files in the current directory. Any other shell command can be run in this way.
Use the %%time magic command to time the execution of a cell. For example, %%time at the top of a cell will time how long it takes to run the cell.
Use the %%bash magic command to run bash commands in a cell. For example, %%bash at the top of a cell will run all the commands in the cell as bash commands.
In markdown cells, you can use LaTeX to write mathematical equations. For example, $\int_0^\infty e^{-x} dx$ will render as \(\int_0^\infty e^{-x} dx\). This is very helpful for documenting your research in a notebook.
The Atomic Simulation Environment (ASE) is a useful package for working with atomic structures in Python. It can read and write many file formats, and has many useful tools for manipulating atomic structures. Soprano is built on top of ASE and so learning a little more about how ASE works can be very helpful. One useful feature of ASE is the view function, which allows you to view atomic structures in the ASE GUI. For example, a cell that has this:
from ase.io import read
from ase.visualize import view
# Load the structure from a file in the current directory
atoms = read('my_structure.cif')
# Open the ASE GUI to inspect the structure
view(atoms)
will open a GUI window showing the structure in my_structure.cif. This is very useful for quickly checking that you have read in the correct structure. Note that the GUI will only work if you are running the notebook on your local machine, not in the cloud. For WSL users, you will need to either have WSLg installed or use an X server such as VcXsrv.
Whichever way you start a Jupyter notebook, you will have the option to pick which Python environment you want to use. This is useful if you have multiple environments set up on your machine with different versions of packages. For example, if you have a conda environment called myenv, you can start Jupyter notebook with this environment using the following commands:
conda activate myenv
jupyter notebook
This will start a Jupyter notebook server with the myenv environment. You can then select this environment in the notebook by going to Kernel -> Change kernel and selecting myenv from the list of available kernels. If myenv does not appear in the list, it may first need to be registered as a kernel (for example using the ipykernel package).
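The %%time magic mentioned in the tips above only exists inside IPython and Jupyter. In a plain Python script, a rough equivalent can be sketched with the standard library, as below. This is an illustration rather than a replacement: the helper name timed is hypothetical, and unlike %%time it measures wall-clock time only.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Example: time how long it takes to sum the first million integers
total, seconds = timed(sum, range(1_000_000))
print(f"sum = {total}, took {seconds:.4f} s")
```

For more careful benchmarks that repeat the measurement several times, the standard-library timeit module is usually a better fit.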