Python

  • High level language
  • Free and open source
  • Interactive environment
  • Has a large scientific community
  • Python is object-oriented
  • Python is interpreted
  • There are modules for almost anything in scientific computation

Problems

  • Python is interpreted
  • Dictionary lookups
  • Function call overhead
  • GIL - global interpreter lock
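
To make the first point concrete, here is a minimal sketch using the standard-library dis module. The helper multiply_at is a hypothetical name introduced only for this illustration. It shows that a single element-wise multiply-and-store expands into several bytecode instructions, each of which the interpreter has to dispatch at run time.

```python
import dis


def multiply_at(A, B, C, j):
    # One element of the product, written the way the loop body would be
    C[j] = A[j] * B[j]


# Collect the names of the bytecode instructions the interpreter executes
ops = [ins.opname for ins in dis.get_instructions(multiply_at)]

print(len(ops), "bytecode instructions for one element:")
print(ops)
```

The exact instruction names and their number depend on the Python version, so no output is shown here.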

Cython

Cython is a superset of Python that aims to combine the high-level expressiveness of Python with the speed of C. This is achieved because

  • Cython is compiled
  • Cython has cdef variables, attributes, functions
  • Cython supports parallelism (OpenMP) by releasing the GIL.

Cython is thus a Python-like language that can be used to do the heavy lifting in the code. From the Cython '.pyx' file, optimised C code is generated internally. Cython, as we will see, has support for NumPy. Almost all Python code can be moved to Cython. Moreover, with a little work it can be made an order of magnitude or more faster!

Below are the steps involved in building Cython code:

  • A .pyx file is compiled by Cython to a .c file.
  • The .c file is then compiled by the C compiler into an extension module that Python can import.
  • Both steps are usually driven by a setup.py build script. More here.
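
The build steps above can be sketched as a minimal setup.py. This is an illustration under stated assumptions, not the author's actual build script: the file name cythontest.pyx is assumed, and -fopenmp is the GCC/Clang flag that enables the OpenMP support that prange relies on (other compilers use different flags).

```python
# setup.py: a minimal sketch; "cythontest.pyx" is an assumed file name
from setuptools import Extension, setup
from Cython.Build import cythonize
import numpy as np

extensions = [
    Extension(
        "cythontest",                      # name of the importable module
        sources=["cythontest.pyx"],        # step 1: Cython translates this to .c
        include_dirs=[np.get_include()],   # headers needed by "cimport numpy"
        extra_compile_args=["-fopenmp"],   # assumption: GCC/Clang OpenMP flag
        extra_link_args=["-fopenmp"],
    )
]

# step 2: the C compiler builds the generated .c file into an extension module
setup(name="cythontest", ext_modules=cythonize(extensions))
```

With such a file in place, running `python setup.py build_ext --inplace` in the same directory builds the module next to the .pyx file, after which it can be imported like any other Python module.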

Element-wise multiplication of two one-dimensional arrays in Python, Cython and C

In [1]:
import numpy as np
N = 6000    # size of the array
p = 1000    # number of iterations 

A = np.linspace(-10,10, N)
B = np.linspace(-10,10, N)
C = np.zeros( np.size(A))
In [2]:
%%timeit
for j in range(N):
    for tn in range(p):
        C[j] = A[j]*B[j] 
1 loop, best of 3: 3.66 s per loop
In [3]:
%load_ext cython
In [4]:
%%cython
# Basic Cython class for element-wise multiplication of two arrays.
#
#             : This code uses OpenMP multithreading (via prange)
#             : and typed memoryviews for fast array access
#


import  numpy as np
cimport numpy as np
cimport cython
from libc.math cimport sqrt
from cython.parallel import prange

# np.float was removed in NumPy 1.24; use the explicit 64-bit float type
DTYPE   = np.float64
ctypedef np.float64_t DTYPE_t

@cython.wraparound(False)
@cython.boundscheck(False)
@cython.cdivision(True)
@cython.nonecheck(False)
cdef class cythontest:
    cdef readonly np.ndarray A, B, C 
    cdef readonly int N
       
    def __init__(self, N):
        self.N = N
        self.C = np.empty(N, dtype=DTYPE)
    
    
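    cpdef calcC(self, np.ndarray A, np.ndarray B, int iter):
        cdef int N = self.N
        # Typed memoryviews give C-level access to the array data
        cdef double [:] t1   = A
        cdef double [:] t2   = B

        cdef double [:] F   = self.C
        cdef int i, j

        # prange releases the GIL and, when compiled with OpenMP enabled,
        # splits the outer loop across threads
        for i in prange(N, nogil=True):
            for j in range(iter):
                F[i] = t1[i] * t2[i]
        return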
In [5]:
rm = cythontest(N)
rm.calcC(A, B, p)
np.allclose(C, rm.C)
Out[5]:
True
In [6]:
%%timeit
rm = cythontest(N)
rm.calcC(A, B, p)
np.allclose(C, rm.C)
100 loops, best of 3: 8.97 ms per loop

Thus, merely by writing the code in Cython, we get a speed-up of roughly 400 times (3.66 s versus 8.97 ms), that is, more than two orders of magnitude. To be honest, the Python code was deliberately written poorly, as an explicit loop, to show the speed difference. Vectorised NumPy code in Python will also be very fast. I leave it for the readers to check!
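
One way to carry out that check is sketched below. The array size matches the cells above. The names C_loop and C_vec are introduced here only for illustration, and the repeated inner iterations of the benchmark are left out because they do not change the result.

```python
import numpy as np

N = 6000                          # same array size as in the cells above
A = np.linspace(-10, 10, N)
B = np.linspace(-10, 10, N)

# Explicit Python loop: one interpreted iteration per element
C_loop = np.zeros(N)
for j in range(N):
    C_loop[j] = A[j] * B[j]

# Vectorised form: the loop over elements runs inside NumPy's compiled code
C_vec = A * B

# Both approaches should give the same result
print(np.allclose(C_loop, C_vec))
```

In a notebook, timing the loop and the expression `A * B` in separate %%timeit cells shows how large the difference is on a given machine.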

The Cython code, along with the setup file etc., can be accessed here. Moreover, we wrote the same thing in pure C and, to our amazement, Cython is as fast as the C code!

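The decorators (compiler directives) before the class declaration give a further speed-up by switching off the corresponding Python checks: wraparound (negative indexing), boundscheck (index bounds checking) and nonecheck (checks for None), while cdivision(True) selects C rather than Python division semantics. Disabling such checks can enhance the performance of the code considerably.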

C-Cython interface

Cython optimizes code internally, and thus, in the best case, bare-bones C code may even be beaten by Cython. In some cases, though, we may need to write some performance-critical part of the code in C/C++. Cython comes in very handy in this case as well, acting as an interface between Python and C. So our code now has three layers. The uppermost layer is in Python, which is extremely human readable, the third layer is C, and Cython is layer 2, sandwiched between them. NumPy arrays can also be passed between the layers, as shown here.

Writing and debugging take very little time in Python, and the code is very human readable. Also, personally, I believe that writing code in a high-level language like Python is fun: you can import almost anything (try importing antigravity)!

One can also wrap C++ classes in Cython. Look here.

Conclusions:

  • Premature optimization is the root of all evil: Donald Knuth
  • Make it work, make it right, make it fast: Kent Beck