Python in ATK

All calculations in ATK are controlled via Python and the QuantumWise extension to Python, called NanoLanguage; the combination of the two is called ATK Python. As briefly mentioned in the section Installing and running the software, all ATK Python scripts (written in the Python programming language, but using ATK-specific functionality from the NanoLanguage module) are executed using the atkpython executable, which is installed with ATK.

If you have an ATK Python script named script.py, it is easy to run it from the command line:

$ atkpython script.py

You can also simply execute atkpython without an argument to invoke an interactive session that allows you to execute ATK Python commands one after another (hit Enter to execute a line):

$ atkpython
# ---------------------------------------------------------------- #
# ATK license information.                                         #
# ---------------------------------------------------------------- #
Atomistix ToolKit 2016.0

In [1]: a=1.0

In [2]: b=2.0

In [3]: c=a+b

In [4]: print "a+b = c =", c
a+b = c = 3.0

In [5]: bulk_configuration = BulkConfiguration(
    ...:     bravais_lattice=FaceCenteredCubic(5.4306*Angstrom),
    ...:     elements=[Silicon, Silicon],
    ...:     fractional_coordinates=[[0.,0.,0.],[0.25,0.25,0.25]])

In [6]: nlprint(bulk_configuration)
+----------------------------------------------------------+
| Bulk Bravais lattice                                     |
+----------------------------------------------------------+
Type:
FaceCenteredCubic

Lattice constants:
a =     5.430600 Ang
b =     5.430600 Ang
c =     5.430600 Ang

Lattice angles:
alpha =    90.000000 deg
beta  =    90.000000 deg
gamma =    90.000000 deg

Primitive vectors:
u_1 =      0.000000      2.715300      2.715300 Ang
u_2 =      2.715300      0.000000      2.715300 Ang
u_3 =      2.715300      2.715300      0.000000 Ang

+----------------------------------------------------------+
| Bulk: Cartesian (Angstrom) / fractional                  |
+----------------------------------------------------------+
2
Bulk
Si    0.000000e+00  0.000000e+00  0.000000e+00    0.00000  0.00000  0.00000
Si    1.357650e+00  1.357650e+00  1.357650e+00    0.25000  0.25000  0.25000

Input lines 1–4 in the example above are standard Python commands. However, input line 5 creates the primitive silicon bulk using the BulkConfiguration class from NanoLanguage, while input line 6 uses the nlprint functionality to print the main contents of the parameters defining the silicon bulk.

All the standard functionality of Python is available when you invoke ATK. However, the main purpose of this chapter is to introduce the NanoLanguage module. If you have no prior experience with Python, we encourage you to first go through the section Python basics.

NanoLanguage

NanoLanguage extends the standard Python environment with concepts and objects relevant for computational nano-scale physics and chemistry. This enables a simple, flexible, and intuitive way to operate ATK: Use Python scripting to define nano-structures, atomic-scale simulators, and post-SCF analyses to be performed. Simply write the ATK Python script and execute it.

For example, NanoLanguage contains a periodic table of the elements, units such as Rydberg and Angstrom, methods for calculating the one-electron spectrum of a molecule, band structure of solids, and transmission spectra of nano-scale devices, as well as constructors for creating molecules, Bravais lattices, and devices.

The ATK-DFT calculator may be largely implemented in highly efficient C++ routines, but setting up and executing DFT calculations is done using Python and NanoLanguage commands. The same applies to setting up and executing analysis tools, as well as reading and writing of computational data. Moreover, the graphical user interface VNL uses NanoLanguage to read data produced by ATK. NanoLanguage is therefore the scripting language that binds all QuantumWise products together, and is a platform on which other developers and companies can build applications and extend the functionality of QuantumWise products.

Important

All ATK functionality available through NanoLanguage is documented in the NanoLanguage Reference Manual.

Python packages in NanoLanguage

The ATK distribution comes with all the standard Python packages, plus the non-standard packages listed in the table below. Additional packages can be installed in the same manner as in any other regular Python environment.

Table 19 Pre-installed non-standard packages available with the atkpython executable.
Package      Load command      Description
NumPy        import numpy      Linear algebra and numerical routines.
SciPy        import scipy      Scientific computing and algorithms.
mpi4py       import mpi4py     MPI functionality for parallel computing.
matplotlib   import pylab      Advanced plotting of data.
ASE          import ase        Support for external atomic-scale calculators.
cclib        import cclib      Interface to computational chemistry packages.
pymatgen     import pymatgen   Advanced materials analysis.

Using NumPy with NanoLanguage

NumPy is the fundamental package for scientific computing with Python: it performs advanced mathematical operations on arrays much faster than is possible with ordinary Python lists. The NumPy module is therefore used throughout NanoLanguage to store and manipulate values from analysis functions. NumPy arrays resemble ordinary lists but offer much more functionality, and NanoLanguage ships with built-in NumPy support.

A few major differences between ordinary lists and NumPy arrays can be seen from this short example:

>>> from numpy import array
>>> a = array([1,2]) # a NumPy array
>>> a = a+[3,4]
>>> print a
[4 6]
>>> b = [1,2]  # an ordinary Python list
>>> b = b+[3,4]
>>> print b
[1, 2, 3, 4]
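
For comparison, obtaining the element-wise sum with ordinary lists requires writing the loop yourself. The following is a minimal plain-Python sketch (no NumPy involved; the variable names are chosen for illustration only):

```python
b = [1, 2]

# For lists, "+" concatenates ...
concatenated = b + [3, 4]

# ... so an element-wise sum needs an explicit loop or comprehension.
elementwise = [p + q for p, q in zip(b, [3, 4])]

print(concatenated)
print(elementwise)
```

With a NumPy array, the single expression a + [3,4] does the work of the comprehension.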

As seen in the NumPy example above, NumPy arrays can in many ways be regarded as matrices. The following example underlines this:

>>> a = array([[1,2],[3,4]])
>>> a = a*[3,4]
>>> print a
[[ 3  8]
 [ 9 16]]
>>> print a.trace()
19
>>> a = a.transpose()
>>> print a
[[ 3  9]
 [ 8 16]]
>>> print a.trace()
19

Note in the above that the values 8 and 9 changed place in the matrix after applying the transpose() operation. However, as expected, the trace of the matrix remains the same.

NumPy arrays may also be converted into lists:

>>> a = array([[1,2],[3,4]])
>>> print a.tolist()
[[1, 2], [3, 4]]

There are many more possibilities using arrays from the NumPy module, and it is usually faster than iterating through for-loops or using ordinary lists!

More information can be found at the NumPy website, or by using the dir() command on a NumPy array object. Details on how NumPy can be used for improved performance can be found at the online resource Python Performance Tips.

Cloning of ATK Python objects

It is possible to get a copy of an ATK object using a method called cloning. This is done by adding a pair of parentheses after the object, i.e. by calling it:

parameter_object = IterationControlParameters(tolerance=1.e-5)
parameter_object_clone = parameter_object()

Very importantly, it is possible to modify the parameters of the clone by specifying the new parameters during cloning:

parameter_object = IterationControlParameters(tolerance=1.e-5)
parameter_object_clone = parameter_object(max_steps=50, damping_factor=0.2)
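
The call-to-clone pattern can be pictured with a small plain-Python class. The sketch below is not the ATK implementation: the class name SketchParameters, its parameter() method, and its internals are hypothetical and only illustrate how calling an object could return a modified copy while leaving the original untouched.

```python
import copy


class SketchParameters(object):
    """Hypothetical stand-in for an ATK parameter object (not ATK code)."""

    def __init__(self, **parameters):
        self._parameters = dict(parameters)

    def __call__(self, **overrides):
        # Calling the object returns a copy; keyword arguments
        # replace the corresponding values in the copy only.
        clone = copy.deepcopy(self)
        clone._parameters.update(overrides)
        return clone

    def parameter(self, name):
        return self._parameters[name]


original = SketchParameters(tolerance=1.0e-5, max_steps=100)
clone = original(max_steps=50)

# The clone keeps the unchanged value and picks up the override ...
clone_values = (clone.parameter('tolerance'), clone.parameter('max_steps'))
# ... while the original is left untouched.
original_values = (original.parameter('tolerance'), original.parameter('max_steps'))
```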

Plotting using pylab

The following script uses the NumPy and matplotlib modules for creating a 2D plot:

import numpy
import pylab
x = numpy.linspace(-1,5,10)
y = numpy.exp(x)
pylab.figure()
pylab.plot(x,y)
pylab.show()

Note that the NumPy package is automatically loaded with the NanoLanguage module, so the import numpy statement is not really needed when running atkpython.

Physical quantities and units

Units are a key concept in ATK. All parameters that correspond to physical quantities, such as lengths, energies, voltages, etc., should be specified with an explicit unit. Similarly, all physical results returned from ATK calculations also contain an explicit unit. PhysicalQuantity objects are created by multiplying the scalar, list, or array, containing the quantity’s value(s), with the desired unit:

>>> a = [[1.0, 2.0], [3.0, 4.0]]*Angstrom
>>> t = 0.5*femtoSecond**-1

See below for physical units available in NanoLanguage.

All PhysicalQuantity objects have two query methods:

  • inUnitsOf(Unit): Returns the numerical value in the specified unit as a NumPy array, or as a NumPy float for scalar values.
  • convertTo(Unit): Returns the value of the PhysicalQuantity as a new PhysicalQuantity object in the specified unit.

Moreover, since the PhysicalQuantity class derives from the NumPy array class (numpy.ndarray), PhysicalQuantity objects can be used, in most respects, like a NumPy array. This means that many methods of NumPy arrays, such as sum(), max(), or reshape(), can be used with PhysicalQuantity objects.

Element-wise operations between two PhysicalQuantity objects work as in numpy, e.g.:

>>> a = [[1.0, 2.0], [3.0, 4.0]]*Ang
>>> b = [[2.0, 2.0], [4.0, 4.0]]*nanoMeter
>>> c = a + b
>>> print c
[[ 21.  22.]
 [ 43.  44.]] Ang

Note that addition and subtraction require compatible units for all operands.

Most numpy universal functions, as well as the two numpy functions numpy.dot and numpy.cross, work for PhysicalQuantity objects, in the same way as for numpy arrays.

Note, however, that most other NumPy and Python functions, e.g. numpy.arange, are not supported for PhysicalQuantity objects. To use them, the units have to be removed via inUnitsOf() before the function is invoked:

>>> a = 5.0*Ang
>>> b = 1.0*nanoMeter
>>> delta = 0.5*Ang
>>> distances = numpy.arange(a.inUnitsOf(Ang), b.inUnitsOf(Ang), delta.inUnitsOf(Ang))

If the result of a PhysicalQuantity operation is unitless, e.g.:

>>> a = [[1.0, 2.0], [3.0, 4.0]]*Ang
>>> b = [[2.0, 2.0], [4.0, 4.0]]*Ang**-1
>>> c = a*b
>>> print c
[[  2.   4.]
 [ 12.  16.]]

the result is returned directly as a NumPy array, or as a NumPy float for scalar values.

Usage Examples

Getting a float value:

>>> a = 5*Angstrom
>>> print a.inUnitsOf(nanoMeter)
0.5

Getting a PhysicalQuantity object:

>>> print a.convertTo(nanoMeter)
0.5 nm

Physical quantities can be raised to a power:

>>> a = 2. * Meter * Second**-2
>>> v = (2 * a * (1*Meter))**0.5
>>> print v
2.0 m/s

Inverse units are specified by division or by using the exponent operator ** with a negative exponent:

>>> f = 2.2/Second
>>> print f.inUnitsOf(Second**-1)
2.2

Units are attached to values by multiplication. Thus, to specify a length of 5 Bohr:

>>> a = 5*Bohr

By printing the value of the variable a, the unit will automatically be displayed:

>>> print a
5.0 Bohr

Units can also be composite. The unit for force is Newton, which is Joule per Meter. This is a rather awkward unit for nano-scale calculations, where something like electron volt per nm makes more sense. Any energy divided by a length is, however, a valid force unit, so to specify a force, write:

>>> F = 5*eV/Bohr

Next, multiply this by a length again and the result will be an energy:

>>> b = F*5*Bohr
>>> print b
25.0 eV
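
To see how a composite unit such as eV/Bohr relates to SI units, the conversion can be worked out by hand. The sketch below uses plain Python floats rather than PhysicalQuantity objects; the constant names are made up for this illustration, and the numerical values are the CODATA 2018 values of the elementary charge and the Bohr radius.

```python
# Conversion factors to SI (CODATA 2018 values).
EV_IN_JOULE = 1.602176634e-19      # 1 eV expressed in Joule
BOHR_IN_METER = 5.29177210903e-11  # 1 Bohr expressed in Meter

# A force of 5 eV/Bohr expressed in Newton and in nanoNewton.
force_in_newton = 5.0 * EV_IN_JOULE / BOHR_IN_METER
force_in_nano_newton = force_in_newton * 1.0e9

# Multiplying the force by a length of 5 Bohr gives an energy:
# the Bohr factors cancel, leaving 25 eV.
energy_in_joule = force_in_newton * (5.0 * BOHR_IN_METER)
energy_in_ev = energy_in_joule / EV_IN_JOULE

print("%.3f nN" % force_in_nano_newton)
print("%.1f eV" % energy_in_ev)
```

The force comes out at roughly 15.1 nN, which shows why eV per length is the more convenient scale for atomic-scale forces.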

Some unit abbreviations are only available with the Units prefix:

>>> b = 5.1*Units.Ry
>>> print b
5.1 Rydberg

Units that are available without the Units prefix can also be given with it:

>>> b = 5.1*Rydberg
>>> c = 5.1*Units.Rydberg

Units available in NanoLanguage

The following units are made available when importing NanoLanguage:

Table 20 Units available in NanoLanguage. More units are available using the Units prefix.
Unit type                    Names
Length units                 nm, nanoMeter, Ang, Angstrom, Bohr, Meter
Energy units                 Rydberg, eV, meV, electronVolt, Hartree, J, Joule, Calorie, kiloCaloriePerMol, kiloJoulePerMol
Force units                  Newton, nanoNewton
Mass unit                    kiloGram
Temperature unit             Kelvin
Time units                   fs, femtoSecond, femtosecond, ps, picoSecond, picosecond, ns, nanoSecond, nanosecond, microSecond, microsecond, milliSecond, millisecond, Second, Minute, Hour, Day
Conductivity related units   Ampere, Volt, Siemens, G0, Coulomb
Pressure units               bar, Pa, GPa
Spin unit                    hbar
Number unit                  Mol, mol
Angle units                  Radians, Degrees
Physical constants           boltzmann_constant, planck_constant, avogadro_number, speed_of_light, atomic_mass_unit, hbar, electron_mass, elementary_charge, vacuum_permitivity

Read and Write Support

Read and write functionality in NanoLanguage is provided by two functions: nlread and nlsave. Storage of several objects per file is supported. Each object in native ATK files is associated with a unique identifier – the object_id. If a new entry is saved without specifying an object_id, the entry is appended to the file with an auto-generated object_id. If an object_id is specified which already is present in the file, the old entry is automatically deleted.

ATK supports two file formats natively: NetCDF (from ATK version \(\ge\) 10.8) and HDF5 (from ATK version \(\ge\) 2017). Both formats are platform independent, i.e. the files can, for instance, be written on a Linux platform and later be read on a Windows platform. The internal data structure is performance-optimized.

HDF5 (Default File Format)

specification: HDF5 format
object_id (default): classname_x, with x being an increasing integer

HDF5 is the default file format for ATK version \(\ge\) 2017. It has superior performance to NetCDF, especially for large Configurations and MDTrajectory. The file format supports Metatext, and deleting objects – see nldelete. Due to the performance-optimized storage, the file size is not automatically reduced if objects have been deleted / overwritten. The free space can be reclaimed with nlrepack. The stored data can easily be accessed by hdf-view or by any program based on libhdf5.

NetCDF

specification: NetCDF format
object_id (default): gIDxxx, with xxx being an increasing 0-padded integer

The content of a NetCDF file can be converted to ASCII format with the command ncdump.

Metatext

Most of the objects available in NanoLanguage have support for Metatext. This feature allows the user to store additional text on an object. All Configuration and Analysis objects support this feature. The information is automatically written to / read from HDF5 files. This is not supported with NetCDF files. Access to the Metatext of an object obj is provided by two functions:

obj.setMetatext(metatext) Sets the metatext on obj. metatext has to be of type str.
obj.metatext() Returns the metatext of obj.

Moreover, with the utility functions readMetatext and writeMetatext, one can directly access/modify the Metatext of an object stored in a file without the need of reading and/or saving the full object.

To remove the metatext information from an object, the metatext argument in setMetatext/writeMetatext must be set to None. The Metatext can also be directly modified from the GUI in VNL.

Spin

Spin is a flag. As such it cannot be constructed; Spin() is an invalid command. Instead, Spin provides derived classes (flags) to represent spin components and projections:

Spin.Up The ‘up’ component of a spinor (up-up component of a spin matrix).
Spin.Down The ‘down’ component of a spinor (down-down component of a spin matrix).
Spin.RealUpDown The real part of the ‘up-down’ component of a spinor (spin matrix).
Spin.ImagUpDown The imaginary part of the ‘up-down’ component of a spinor (spin matrix).
Spin.All All spin components.
Spin.Sum The sum ‘Spin.Up + Spin.Down’.
Spin.X The spin projection along ‘x’ (Spin.X = 2*Spin.RealUpDown).
Spin.Y The spin projection along ‘y’ (Spin.Y = -2*Spin.ImagUpDown).
Spin.Z The spin projection along ‘z’ (Spin.Z = Spin.Up - Spin.Down).
Spin.Unknown Unknown spin.

Usage Example

Calculate the electron density for all spin components and evaluate some of them:

# Calculate the electron density for a given configuration.
ed = ElectronDensity(configuration, spin=Spin.All)

# Take some spin projections.
x = ed.spinProjection(spin=Spin.X)
y = ed.spinProjection(spin=Spin.Y)
z = ed.spinProjection(spin=Spin.Z)
s = ed.spinProjection(spin=Spin.Sum)
r = ed.spinProjection(spin=Spin.RealUpDown)
i = ed.spinProjection(spin=Spin.ImagUpDown)
u = ed.spinProjection(spin=Spin.Up)
d = ed.spinProjection(spin=Spin.Down)

# Evaluate for Spin.X at the origin.
data = ed.evaluate(0.0*Bohr, 0.0*Bohr, 0.0*Bohr, spin=Spin.X)

Note about Spin.All

Precisely which spin components are returned when calling an object’s query method with spin = Spin.All depends on the queried object. For example, ElectronDensity.evaluate(x, y, z, spin=Spin.All) returns a list of four electron density values at the grid point (x, y, z) corresponding to Spin.Sum, Spin.X, Spin.Y, and Spin.Z. In other cases (e.g. ExchangeCorrelationPotential), the returned array contains the values corresponding to the spinor components Spin.Up, Spin.Down, Spin.RealUpDown, and Spin.ImagUpDown. Refer to the object’s documentation for details.

Note on Spin in low level interface functions

In all low level interface functions such as calculateHamiltonianAndOverlap, calculateDensityMatrix, calculateSelfEnergy etc., the following rules for the spin parameter apply:

  • UNPOLARIZED: Valid spin parameters are Spin.Up and Spin.All, which both yield the same result in this case, as there is no designated spin direction in UNPOLARIZED calculations.
  • POLARIZED: Valid spin parameters are Spin.All, Spin.Up, and Spin.Down. The default is Spin.All, in which case the function returns a pair of matrices, one for the Spin.Up and one for the Spin.Down component. For Spin.Up or Spin.Down, only the respective spin component is returned.
  • NONCOLLINEAR / SPINORBIT: In noncollinear calculations, only Spin.All is an accepted parameter. The returned matrix contains the spin components Spin.Up, Spin.Down, Spin.UpDown, and Spin.DownUp in an interleaved fashion, see below for an example.

Examples

# Calculate the density matrix for a polarized system.
D = calculateDensityMatrix(polarized_configuration, spin=Spin.All)
# Extract the Spin.Up component.
D_uu = D[0]
# Extract the Spin.Down component.
D_dd = D[1]

# Calculate the density matrix for a noncollinear system.
D = calculateDensityMatrix(noncollinear_configuration, spin=Spin.All)
# Get all up-up entries:
D_uu = D[::2,::2]
# Get all down-down entries:
D_dd = D[1::2,1::2]
# Get all up-down entries:
D_ud = D[1::2,::2]
# Get all down-up entries:
D_du = D[::2,1::2]

Python basics

This section introduces the basics of Python, which is a mature and modern object-oriented programming language with a powerful syntax that is surprisingly easy to learn. If you are not familiar with Python at all, there are many good resources available on the web.

The spectrum of features offered by Python is enormous, and many of them will not be needed in your ATK scripts. The minimum set of Python structures you really should know about comprises modules, lists, tuples, dictionaries, for-loops, objects, and functions.

The next sections discuss the basic usage of these Python concepts and some general Python features.

Indentation

One important point to know before you embark on writing your first NanoLanguage script is that Python relies on indentation when interpreting your script. If your code is not correctly indented, Python will stop executing the provided script and return an error. Exactly when and how you should indent code in your scripts will become apparent through the examples in this manual; a brief example, however, illustrates the point:

def myNewMethod():
    print 'Hello World'

The colon after the first code line and the indentation of the second code line tell the Python interpreter that the print statement is part of the myNewMethod() function. The indentation thereby determines whether code belongs to the defined function or to the code that follows it.

Important

Please note that using both spaces and tabulation when indenting code sections or statements could mean trouble. The reason for this is that tabulation might not be interpreted the same way in different editors. This could become an issue if you work on the same script using different operating systems or collaborate with others on writing them. Some editors allow you to specify the number of spaces that should be inserted when pressing the TAB key, and we recommend that you use this option when available, or simply use the SPACE key for indentation to increase interoperability.

This will do for now, but keep in mind that Python code must be properly indented, and never use both types of indentation in the same script. For a more complete discussion of the indentation rules used in Python, see this online resource: Indenting Code.
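
As a further illustration of how indentation decides which block a statement belongs to, the sketch below places the same return statement at two different indentation levels. The function names are made up for this example.

```python
def sum_all(values):
    total = 0
    for value in values:
        total = total + value
    # Indented at function level: runs once, after the loop has finished.
    return total


def sum_first_only(values):
    total = 0
    for value in values:
        total = total + value
        # Indented at loop level: runs during the first iteration,
        # so the function returns after adding only the first value.
        return total
    return total  # reached only when values is empty


result_all = sum_all([1, 2, 3])
result_first = sum_first_only([1, 2, 3])
print("%d %d" % (result_all, result_first))
```

The two functions differ only in whitespace, yet the first adds up the whole list while the second stops after the first element.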

Comments

A comment line in Python starts with the character #:

# This is a comment line in Python
print 'Only this line will be executed'

The first line is ignored when interpreting the Python script. The second line will print the string to the screen:

Only this line will be executed

Longer (multi-line) comments can be made using triple quotes:

a = 2
"""
A value was just assigned to a.
We will now assign a value to b.
Are you ready?
"""
b = 3
print "a x b =", b*a

The lines between the triple quotes are ignored by the Python interpreter, so the result printed by the above would be

a x b = 6

In Python, it does not matter whether you use single quotes (') or double quotes (") for declaring a triple-quoted region.

Importing modules

A Python module is a file containing a collection of functions, classes, and many other tools that initially are not available when Python is invoked. In some sense, you may think of a Python module as a library. You load a Python module by using the import statement. Modules are typically imported in three different ways:

  • By importing the entire module:

    import math
    # Entire math module is now available
    x = 3.14
    y = math.cos(x)
    z = math.sin(x)
    
  • By importing specific elements from the module:

    from math import cos
    # Only cos() has been loaded from the math module
    x = 3.14
    y = cos(x)
    
  • By importing all methods from a module:

    from math import *
    # All methods available in the math module have been loaded
    x = 3.14
    y = cos(x)
    z = sin(x)
    

As mentioned above, a # denotes a comment in Python. Everything past this character, but still on the same line, will not be interpreted.

For more details on modules, consult the Modules entry in the official Python tutorial. An overview of the math module is provided here: math module.
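
One practical consequence of the import styles is where the imported names end up. In the following minimal sketch (variable names are illustrative), names loaded with import math stay behind the math. prefix, whereas a name loaded with from math import ... sits directly in your script, where a later assignment can silently replace it.

```python
import math
from math import pi

# With "import math", the module's names live behind the "math." prefix.
circumference = 2.0 * math.pi * 1.5

# With "from math import pi", the name pi sits directly in the script's
# namespace, so a later assignment silently replaces it.
pi = 3.0
rough_circumference = 2.0 * pi * 1.5

# The prefixed name is unaffected by the assignment above.
still_exact = math.pi
```

This is the main reason to be careful with from module import *, which places every public name of the module in your script at once.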

Two modules that are not part of standard Python are automatically imported when running ATK Python: NanoLanguage and NumPy.

Lists

A list is a Python object used to collect elements. Lists are easily created:

numbers  = [1, 2, 3, 4, 5, 6, 7, 8, 9]
romans   = ['a', 'b', 'c', 'd']
elements = ['Hydrogen', 'Helium']

The last example above creates a list containing two strings and saves the list in the variable elements. Lists can contain several different data types at the same time (integers, floats, strings, etc.), which makes it a very flexible data structure.

Elements in a list are numbered starting from zero, so the first element in the list elements (Hydrogen) is accessed by index 0:

>>> print elements
['Hydrogen', 'Helium']
>>> print elements[0]
Hydrogen

It is also possible to store different data types within the same list structure:

elements = [1, 'Hydrogen', 2, 'Helium']

and then extend the list with additional elements:

>>> elements.extend([7,'Nitrogen'])
>>> print elements
[1, 'Hydrogen', 2, 'Helium', 7, 'Nitrogen']

In the above, we added the elements of another list to the list elements. If we instead apply the list method append(), the result is different:

>>> elements.append([8,'Oxygen'])
>>> print elements
[1, 'Hydrogen', 2, 'Helium', 7, 'Nitrogen', [8, 'Oxygen']]

In this case, the actual list (and not the elements in it) is added to the elements list. Another (and shorter) way of adding elements to a list is by using the + operator:

>>> a = [1,2]
>>> a = a + [3,4]
>>> print a
[1, 2, 3, 4]

Additional information on lists can be found in the Lists entry in the official Python tutorial.

Tuples

A tuple is constructed much like a list, but by using parentheses instead of square brackets:

mytuple = ('uno','duo')  # Note the curved parentheses
myothertuple = ('uno', ) # Note the comma just after 'uno'

An important detail in the above example is that a trailing comma is needed when the tuple only contains a single element; otherwise, it could not be distinguished from an ordinary parentheses construction. For example:

t = ('uno',) # t is a tuple
s = ('uno')  # s is 'just' a string

Contrary to a list, a tuple is immutable, meaning that once it is defined, its values cannot be changed. For example:

>>> mytuple = ('uno','duo')
>>> mytuple[1] = 'quattro'

results in the error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

because the assignment is illegal. The TypeError raised by Python indicates that the error is related to the type of a variable. Python also informs us that the error occurred on input line 1. This is the kind of message you would get when using Python interactively. Had we used it in a script, the line number would refer to the actual line in the script.

Combinations of tuples and lists are allowed. To set up a collection of vectors to describe atomic coordinates, we may use a combination of lists and tuples:

>>> atom_coordinate_1 = (0.1, 0.2, 0.3)
>>> atom_coordinate_2 = (0.4, 0.5, 0.6)
>>> atom_coordinate_3 = (0.7, 0.8, 0.9)

>>> collection_of_atoms = [
...     atom_coordinate_1,
...     atom_coordinate_2,
...     atom_coordinate_3]

>>> print collection_of_atoms
[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (0.7, 0.8, 0.9)]

For more details on tuples, consult the Tuples and Sequences entry in the official Python tutorial.
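
The immutability of tuples and the unpacking of coordinate triples can be tried out in a short script. This is a minimal sketch with illustrative variable names; it catches the TypeError instead of letting it terminate the script.

```python
atom_coordinate = (0.1, 0.2, 0.3)

# A tuple can be unpacked into separate variables in one statement.
x, y, z = atom_coordinate

# Assigning to an element of a tuple raises a TypeError.
try:
    atom_coordinate[0] = 0.5
    assignment_failed = False
except TypeError:
    assignment_failed = True

# To "change" a coordinate, build a new tuple instead.
shifted_coordinate = (x + 0.5, y, z)

# A list of tuples can itself still be modified.
collection_of_atoms = [atom_coordinate]
collection_of_atoms.append(shifted_coordinate)
```

Storing each coordinate as a tuple protects the individual vectors from accidental modification, while the surrounding list can still grow or shrink.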

Dictionaries

It is often useful to assign keys to different values in order to distinguish among them (a sort of tagging). This can be accomplished by using dictionaries. In Python, a dictionary is called a dict. A dict is created using curly braces, with each key separated from its value by a colon:

>>> myDict = {'username' : 'henry', 'password' : 'secret'}
>>> print myDict['username']
henry

In this example, username and henry form a key–value pair, as do password and secret. Note how the dict is created using curly braces {} (tuples use parentheses () and lists use square brackets []).

There is no internal ordering in a dict, i.e. keys and values are not stored in the same order as they are entered into the dict. Values in the dict are accessed via their key. A value can be associated with several keys, whereas a key may be associated with a single value only.

Two frequently used methods associated with a dict are keys() and values(). The method keys() returns a list containing the keys of the dict, while values() returns the list of values:

>>> myDict = {'username':'henry', 'password':'secret'}
>>> print myDict.keys()
['username', 'password']
>>> print myDict.values()
['henry', 'secret']

It is also possible to query a dict regarding its length using the standard method len():

>>> myDict = {'username':'henry','password':'secret'}
>>> print 'myDict has length', len(myDict)
myDict has length 2

The return value of len() corresponds to the number of key–value pairs in the dict.

For more details on dicts, consult the Dictionaries entry in the official Python tutorial.
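
The rules described above (one value per key, no guaranteed ordering) can be explored with a few lines of plain Python. This is a minimal sketch; the extra key and the values are made up for illustration.

```python
myDict = {'username': 'henry', 'password': 'secret'}

# Assigning to a new key adds a key-value pair ...
myDict['email'] = 'henry@example.com'
# ... while assigning to an existing key replaces its value.
myDict['password'] = 'top-secret'

# The "in" operator tests whether a key is present.
has_username = 'username' in myDict
has_address = 'address' in myDict

# Looping over the sorted keys gives a reproducible order.
for key in sorted(myDict.keys()):
    print("%s -> %s" % (key, myDict[key]))
```

Sorting the keys before looping is a simple way to get the same output on every run, since the dict itself makes no promise about ordering.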

For-loops

Once we have created a list, it would be nice if we had an automatic way of addressing its individual elements. Python offers this functionality through the for-loop construction:

>>> numbers = [1, 2, 3, 4, 5]
>>> for x in numbers:
...     print x,
...
1 2 3 4 5

For-loops are very useful for constructing iterative loops in numerical algorithms. Here is a simple example using Newton–Raphson iteration for determining the value of \(\sqrt{2}\):

x = 20.0
for i in range(8):
    x = x - (x*x - 2.0)/(2.0*x)
    print x

which converges quadratically to \(\sqrt{2}\):

10.05
5.12450248756
2.75739213842
1.74135758045
1.44494338196
1.41454033013
1.41421360012
1.41421356237

The range() function used above returns a list containing all non-negative integers less than the argument passed to the function, starting from zero. The for-loop then iterates over all the elements generated by the range(8) call, performing a Newton update of the variable x in each iteration step. The value of the first element in the list and the increment between neighboring elements can be controlled by calling range() with more than one argument:

>>> for i in range(9,21,3):
...     print i
...
9
12
15
18

The range() function is one of the built-in functions in Python. There are a lot of these, all adding to the flexibility of Python. The function len() is another built-in function. It makes it possible to find the length of a list or a tuple. This can neatly be combined with a for-loop to iterate over a list:

>>> m = range(6)
>>> for i in range(len(m)):
...     print m[i],
...
0 1 2 3 4 5

In the above, len(m) returns the length of the list m, i.e. the number of elements in the list. This is then used to create a new list (using the range() function) over which the for-loop iterates. The comma at the end of the print statement instructs Python to suppress printing a new line character. Otherwise, all numbers would have been printed on separate lines.

Consult the More Control Flow Statements entry in the official Python tutorial for additional information about the if, for, and while statements, and built-in functions like len(), range() and enumerate().
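
The built-in enumerate() mentioned above pairs each element of a list with its index. As a minimal sketch (variable names are illustrative), the loop below uses enumerate() on the Newton iterates from the earlier example and records how far each one is from \(\sqrt{2}\). This also makes the quadratic convergence visible: once the iterate is close to the solution, the error is roughly squared (up to a constant factor) in every step.

```python
import math

# Collect the Newton iterates for sqrt(2), starting from x = 20.0.
iterates = []
x = 20.0
for i in range(8):
    x = x - (x*x - 2.0)/(2.0*x)
    iterates.append(x)

# enumerate() yields (index, element) pairs, so there is no need
# to write range(len(iterates)) and index the list by hand.
errors = []
for step, value in enumerate(iterates):
    error = abs(value - math.sqrt(2.0))
    errors.append(error)
    print("step %d: x = %.11f, error = %.2e" % (step + 1, value, error))
```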

Objects

Many of the structures you work with in both NanoLanguage and Python are so-called objects. An object is a structure that contains a lot of handy functions for accessing and manipulating the data assigned to the object. These special functions are called methods. Let us see how we work with these in practice. If we define a list like this:

numbers = [1, 5, 3, 6, 2, 8, 7, 9, 4]

the variable numbers in fact refers to a list object holding the numbers [1,5,3,6,2,8,7,9,4]. A list object contains several helpful methods, one of them being reverse(), which you call like this:

>>> numbers.reverse()
>>> print numbers
[4, 9, 7, 8, 2, 6, 3, 5, 1]

Note that reverse() modifies the list in place, so numbers is now in reverse order.

Another list method is sort(), which sorts the elements of a list:

>>> numbers.sort()
>>> print numbers
[1, 2, 3, 4, 5, 6, 7, 8, 9]

You can always use the built-in Python function dir() to display information about the functionality provided by a given object. For example,

print dir(list)

returns the following methods for the list type (special names enclosed in double underscores are omitted here):

['append', 'count', 'extend', 'index',
'insert', 'pop', 'remove', 'reverse', 'sort']

For instructions about their specific usage, e.g. for reverse(), you can use Python's built-in help system via the function help(). So, to get more information on the list method reverse(), invoke help() like this:

>>> help(list.reverse)
Help on method_descriptor:
reverse(...)
    L.reverse() -- reverse *IN PLACE*
(END)

For more details on objects, consult the Classes entry in the official Python tutorial.
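
Objects are not limited to built-in types such as lists. As a minimal sketch of how data and methods are bundled together, the hypothetical class below (it is not part of NanoLanguage, and its name and method are invented for this example) stores a list of numbers and provides one method that works on them.

```python
class Measurement(object):
    """Hypothetical example class: a named collection of values."""

    def __init__(self, name, values):
        # Data assigned to the object.
        self.name = name
        self.values = list(values)

    def mean(self):
        # A method: a function that works on the object's own data.
        return sum(self.values) / float(len(self.values))


m = Measurement('bond length', [1.0, 2.0, 3.0, 6.0])
average = m.mean()

# dir() lists the names the object provides, including our method.
has_mean = 'mean' in dir(m)
```

Calling m.mean() follows the same pattern as numbers.reverse(): the method is looked up on the object and operates on the data stored in it.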

Functions and arguments

You will often find that you keep copying and repeating almost identical segments of Python code. A common approach to avoid this redundancy is to encapsulate these structures in a function. This way, you keep your code readable and reusable, as well as concise and clear. You also avoid “reinventing the wheel” every time you start on a new problem. Instead, you simply use the function you already made in a previous script.

We will use the Newton iteration scheme introduced in the section on For-loops as an example. To encapsulate it in a function, we use the def statement to define a function named newton():

def newton():
    x = 20.0
    for i in range(10):
        x = x - (x*x - 2.0)/(2.0*x)
        print x

All indented lines following the colon belong to the function definition. Indentation is very important: all lines belonging to the function must be indented relative to the def line, and lines at the same nesting level must be indented by the same number of spaces. It is now simple to call the function to execute it:

>>> newton()
10.05
5.12450248756
2.75739213842
1.74135758045
1.44494338196
1.41454033013
1.41421360012
1.41421356237
1.41421356237
1.41421356237

Even though this already makes life easier, the function newton() still has certain shortcomings. For example, it would be nice, if we could

  1. supply the initial guess (currently x = 20 is always used);
  2. set the maximum number of iteration steps.

In Python, we do this by passing arguments to the function. Here is an implementation that fulfills the wish list given above by allowing you to pass the arguments n and x to the function:

>>> def newton(n,x):
...     for i in range(n):
...         x = x - (x*x - 2.0)/(2.0*x)
...         print x
...
>>> newton(8,4.0)
2.25
1.56944444444
1.42189036382
1.41423428594
1.41421356252
1.41421356237
1.41421356237
1.41421356237

Still, this is of limited use: suppose that we actually wanted to use the result of the calculation (the numerical value of \(\sqrt{2}\)) in some subsequent part of our script. We solve this problem by letting the function return the result of the calculation, which we then “grab” and store in a new variable:

>>> def newton(n,x):
...     for i in range(n):
...         x = x - (x*x - 2.0)/(2.0*x)
...     return x
...
>>> x = newton(8,4.0)
>>> print 'sqrt(2) = ', x
sqrt(2) =  1.41421356237

This is more satisfactory, but there are still some handy features regarding function definitions that can make life even easier for us. Often we might be completely satisfied with using x=2.0 and n=10 when we call the newton() function. To avoid supplying this redundant information, we can define default values for the function arguments:

def newton(n=10, x=2.0):
    for i in range(n):
        x = x - (x*x - 2.0)/(2.0*x)
    return x

If we are happy with the default settings, we may invoke the function by calling it as newton(). Conversely, to use other settings, we may invoke the function with explicit values, e.g. newton(8,2.0). When the variables of a Python function are specified as above, they are called optional variables, as opposed to required variables, which have no default value.

This is certainly handy, but what if we often wanted to change the initial guess for x while keeping the value of n at the default setting? It is possible to override the default value by explicitly naming the variable:

newton(x=3.0)

which overrides the default value of x while keeping the default value of the argument n. This way of assigning values to variables makes it possible to specify the variables of the function in whichever order you prefer. The following function calls are therefore completely equivalent:

newton(n=30, x=2.0)
newton(x=2.0, n=30)

You may include both optional and required variables when calling a function. In this case, however, the order is important! Once you have specified your first variable by name, no more variables may be specified according to order.
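
The rule about mixing positional and named arguments can be checked with a short sketch. The function below is a variant of newton() with one required argument a (the number whose square root is sought) added purely for illustration; that extra argument is not part of the examples above.

```python
def newton(a, n=10, x=2.0):
    """Approximate sqrt(a) with n Newton steps starting from x."""
    for i in range(n):
        x = x - (x*x - a)/(2.0*x)
    return x


# Required argument by position, optional ones left at their defaults.
r1 = newton(2.0)
# Required argument by position, one optional argument by name.
r2 = newton(2.0, x=5.0)
# All arguments by name, in arbitrary order.
r3 = newton(x=5.0, n=20, a=2.0)

# Omitting the required argument raises a TypeError.
try:
    newton(n=5)
    missing_argument_detected = False
except TypeError:
    missing_argument_detected = True
```

A call such as newton(x=5.0, 2.0), where a positional value follows a named one, is rejected with a SyntaxError before the script even starts running, which is why it cannot be caught inside the same script with try/except.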

If you want to know more about specifying Python functions and their arguments, please see the Defining Functions entry in the official Python tutorial or the online resource Using Optional and Named Arguments.