Feb 23, 2013

Speeding up Python performance with Cython (II)

Following up on the last article: as you can see in the setup.py script shown there, the full path of the Cython installation is first added to the module search path (sys.path). Running this script generates both a C file (calculate_primes1.c) and a module loadable by Python (calculate_primes1.so).

cython$ python setup.py build_ext --inplace
running build_ext
cythoning calculate_primes1.pyx to calculate_primes1.c
building 'calculate_primes1' extension
gcc -pthread -fno-strict-aliasing -DNDEBUG -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -fwrapv -fPIC -I/usr/include/python2.6 -c calculate_primes1.c -o build/temp.linux-x86_64-2.6/calculate_primes1.o
gcc -pthread -shared build/temp.linux-x86_64-2.6/calculate_primes1.o -L/usr/lib64 -lpython2.6 -o calculate_primes1.so

cython$ ls -l
drwxr-xr-x  3 javi javi  4096 Feb 16 17:46 build
-rw-r--r--  1 javi javi 68295 Feb 17 11:27 calculate_primes1.c
-rw-r--r--  1 javi javi   393 Feb 17 11:06 calculate_primes1.pyx
-rwxr-xr-x  1 javi javi 53388 Feb 17 11:27 calculate_primes1.so
-rwxr-xr-x  1 javi javi   393 Feb 16 21:43 calculate_primes.py
-rwxr-xr-x  1 javi javi   530 Feb 16 21:49 calculate_primes.pyc
drwxr-xr-x 10 javi javi  4096 Feb 16 16:34 Cython-0.18
-rwxr-xr-x  1 javi javi   314 Feb 17 11:06 setup.py
-rwxr-xr-x  1 javi javi   426 Feb 16 21:53 test.py


If you take a look at the C file, you will notice that it is huge compared with the Python version: some parts of the code have been translated to plain C, while others remain as Python API calls.

A handy way to inspect this is the HTML report that the compiler itself produces when executed with the "-a" option, which shows the Cython code interleaved with the generated C code.

cython$ Cython-0.18/cython.py -a calculate_primes1.pyx


If you open the generated HTML file, you will see that lines are colored according to their level of "typedness" (white lines translate to pure C without any Python API calls).



Now let's run the test again with this new version of the module compiled with Cython (remember to change the name of the module inside the test.py file). As the result shows, the execution time drops from about 1.80 to about 1.20 seconds, an improvement of roughly 30%.

cython$ python test.py 
1.20011019707
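
The figure of roughly 30% can be checked with a few lines of arithmetic. This is only a sketch: the two values are the averages printed by test.py in this article and the previous one, and the variable names are made up for the illustration.

```python
# Averages printed by test.py in the two articles (seconds).
pure_python = 1.7978372097
cython_untyped = 1.20011019707

# Fraction of the original runtime that was saved.
reduction = (pure_python - cython_untyped) / pure_python
# How many times faster the compiled module runs.
speedup = pure_python / cython_untyped

print("runtime reduced by %.1f%%" % (reduction * 100))  # 33.2%
print("speedup factor: %.2fx" % speedup)                # 1.50x
```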


Let's go a step further and add some static types in a second version of our module (now called calculate_primes2).

def calculate_primes(int limit):
    primes = []
    cdef int number = 0
    cdef int divisor = 0

    for number in range(limit):
        for divisor in range(2, number+1):
            if number % divisor == 0 and number == divisor:
                primes.append(number)
                break
            elif number % divisor == 0 and number != divisor:
                break

    return primes


Feb 17, 2013

Speeding up Python performance with Cython (I)

Cython is a programming language for writing C/C++ extensions for Python; that is, it translates parts of a Python program into C code and in this way can considerably reduce the program's execution time. In addition, it provides some predefined constructs that you can use directly when you develop in Cython.

Let's see a Python function which calculates all prime numbers below a given limit (calculate_primes.py).

def calculate_primes(limit):
    primes = []
    number = 0
    divisor = 0

    for number in range(limit):
        for divisor in range(2, number+1):
            if number % divisor == 0 and number == divisor:
                primes.append(number)
                break
            elif number % divisor == 0 and number != divisor:
                break

    return primes
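
To make the logic of the inner loop explicit: it stops at the first divisor it finds, and the number is prime only when that first divisor is the number itself. The following is just an equivalent restatement of the function with the two branches folded into one (written so that it also runs on Python 3), not a faster algorithm.

```python
def calculate_primes(limit):
    """Same trial-division logic as above, with the two branches folded."""
    primes = []
    for number in range(limit):
        for divisor in range(2, number + 1):
            if number % divisor == 0:
                # First divisor found: the number is prime only if
                # that divisor is the number itself.
                if number == divisor:
                    primes.append(number)
                break
    return primes


print(calculate_primes(20))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

Note that 0 and 1 are never reported, because for them range(2, number + 1) is empty and the inner loop does not run.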


Now let's import this module in another program and run it in order to measure the time spent searching for all prime numbers below 10000 (the call is executed five times and the timings are averaged).

$ cat test.py 
from timeit import Timer  

timer = Timer("calculate_primes(10000)",
              "from calculate_primes import calculate_primes")
print sum(timer.repeat(repeat=5, number=1)) / 5


$ python test.py 
1.7978372097
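
The measurement pattern used in test.py can be sketched in a reusable form. Timer.repeat(repeat=5, number=1) returns a list with five timings, one per single call, which are then averaged. The helper average_runtime and the placeholder workload are names invented for this sketch, and the syntax is Python 3.

```python
from timeit import Timer


def average_runtime(func, repeat=5):
    """Hypothetical helper: time `repeat` single calls of func and average."""
    timings = Timer(func).repeat(repeat=repeat, number=1)
    return sum(timings) / len(timings)


def workload():
    # Placeholder; in the article this would be calculate_primes(10000).
    return sum(i * i for i in range(100000))


print(average_runtime(workload))
```

The timeit documentation suggests that the minimum of the repeated timings is often more informative than the average, since higher values are usually caused by other processes interfering; the average is kept here only to match the script above.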


The next step will be to compile this program with Cython in order to generate a C extension for it. To install Cython, you have a couple of options: either grab its source code directly from the project's web page, or install it from a repository. In my case, I am going to download the source code so as to have the most recent version (0.18 at present).

After downloading and uncompressing it, you can install it on your system by running its setup.py script or, as in my case, just decompress it in a directory under your home directory and use that path later. Note also that all the tools needed to compile C/C++ programs must be installed on your system (for example, the build-essential package on Ubuntu).

$ mkdir cython ; cd cython

cython$ wget http://cython.org/release/Cython-0.18.zip ; unzip Cython-0.18.zip


The first test will be to compile the calculate_primes module with Cython, so as to generate on the one hand a C file, and on the other a module loadable by Python. First, you need to change the extension of the module from py to pyx. You could then compile it by hand, but it is much better to create a tiny script responsible for doing it.

cython$ cp -a calculate_primes.py calculate_primes1.pyx

cython$ cat setup.py 
import sys
sys.path.insert(0, "/home/javi/cython/Cython-0.18")
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [Extension("calculate_primes1", ["calculate_primes1.pyx"])]
)


Feb 10, 2013

Paramiko object with Process (RNG must be re-initialized after fork)

When you create an object in Python and later use it from a Process object, it normally works correctly. Let's see an example where a Process prints an attribute of another object, an instance of the class A.

from multiprocessing import Process

class A():
    def __init__(self):
        self.my_dict = {"day": "Tuesday"}

class B():
    def __init__(self):
        self.a = A()
        self.__process = None

    def start(self):
        self.__process = Process(target=self.__run_process)
        self.__process.start()
        
    def __run_process(self):
        print self.a.my_dict
        
if __name__ == "__main__":
    b = B()
    b.start()


$ python test.py 
{'day': 'Tuesday'}


But when that process has to handle a Paramiko object whose connection was initialized before the fork, it will not work.

import paramiko
from multiprocessing import Process

class B():
    def __init__(self):
        self.__process = None
        self.__ssh = paramiko.SSHClient()
        self.__ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self.__ssh.connect('127.0.0.1', username='javi', password='xxxxxx')
        
    def start(self):
        self.__process = Process(target=self.__run_process)
        self.__process.start()
        
    def __run_process(self):
        try:
            _, stdout, _ = self.__ssh.exec_command("hostname")
        except Exception:
            print "Failed the execution"
        else:
            print stdout.read()
        
if __name__ == "__main__":
    b = B()
    b.start()


$ python test.py 
Failed the execution


If you run the script without catching the exception, you can observe an error like the following.

$ python test.py 
Process Process-1:
Traceback (most recent call last):
...
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()


This is a well-known issue in Paramiko that has apparently been fixed in recent versions (for my tests, I am using the version provided by default in Ubuntu 12.10: 1.7.7.1-3). A possible workaround is, for example, to open the SSH connection from within the new process. Let's see an example.

import paramiko
from multiprocessing import Process

class B():
    def __init__(self):
        self.__process = None
        self.__ssh = paramiko.SSHClient()
        self.__ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        
    def start(self):
        self.__process = Process(target=self.__run_process)
        self.__process.start()
        
    def __run_process(self):
        self.__ssh.connect('127.0.0.1', username='javi', password='xxxxxx')
        try:
            _, stdout, _ = self.__ssh.exec_command("hostname")
        except Exception:
            print "Failed the execution"
        else:
            print stdout.read()
        
if __name__ == "__main__":
    b = B()
    b.start()


$ python test.py 
javi-pc


As you can see above, the Paramiko object is still created within the constructor of the class B, but the connection is now opened by the forked process.
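
The same idea can be reproduced without Paramiko or an SSH server. The sketch below uses a made-up class, PidBoundResource, that only mimics the kind of PID check reported in the traceback above; it is not Paramiko's API. It is written for Python 3 and assumes a Unix system where the "fork" start method is available.

```python
import multiprocessing
import os


class PidBoundResource:
    """Hypothetical stand-in for a connection that must not cross a fork."""

    def __init__(self):
        # Remember which process created (initialized) the resource.
        self.owner_pid = os.getpid()

    def use(self):
        # Mimics a PID check: refuse to work in a different process.
        if os.getpid() != self.owner_pid:
            raise AssertionError("resource used in a different process")
        return "ok"


def child(resource, queue):
    # Workaround: if nothing was inherited, initialize inside the child.
    if resource is None:
        resource = PidBoundResource()
    try:
        queue.put(resource.use())
    except AssertionError as exc:
        queue.put("failed: %s" % exc)


def run(resource):
    ctx = multiprocessing.get_context("fork")  # assumption: Unix only
    queue = ctx.Queue()
    process = ctx.Process(target=child, args=(resource, queue))
    process.start()
    result = queue.get()
    process.join()
    return result


if __name__ == "__main__":
    # Initialized in the parent, used in the child: the check fails.
    print(run(PidBoundResource()))
    # Initialized and used in the child: works.
    print(run(None))
```

The first call reports the failure and the second one prints "ok", which is the same behavior as the two Paramiko scripts: what matters is in which process the resource gets initialized, not where the object is declared.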


Feb 3, 2013

Merging dictionaries in Python with lists within

Because I have just started programming in Python at work (in depth), I am going to write down on my blog the interesting things that I have to work out in relation to this activity.

For example, this week I had to cope with a curious problem: merging dictionaries containing any kind of element, such as lists or other dictionaries. Browsing the Internet, I was able to find some functions to combine dictionaries, but those functions did not take into account the possibility of having lists inside.

So in order to get over this issue, I had to develop the following two recursive functions.

def merge_dict(dict1, dict2):   
    if isinstance(dict1, dict) and isinstance(dict2, dict):
        for k, v in dict2.iteritems():
            if k not in dict1:
                dict1[k] = v
            else:
                dict1[k] = merge_dict(dict1[k], v)
    elif isinstance(dict1, list) and isinstance(dict2, list):
        dict1 = merge_list(dict1, dict2)
    elif dict2 is None:
        return dict1
    else:
        return dict2
        
    return dict1

def merge_list(list1, list2):
    if isinstance(list1, list) and isinstance(list2, list):
        for i, item in enumerate(list2):
            if len(list1) > i:
                if isinstance(list1[i], dict) and isinstance(item, dict):
                    list1[i] = merge_dict(list1[i], item)
                elif isinstance(list1[i], list) and isinstance(item, list):
                    list1[i] = merge_list(list1[i], item)
                else:
                    list1[i] = item
            else:
                list1.append(list2[i])
    elif isinstance(list1, dict) and isinstance(list2, dict):
        list1 = merge_dict(list1, list2)
    else:
        return list2

    return list1


As you can see in the above code, merge_dict loops through the dictionary and, if it comes across a list, hands it over to merge_list, which merges lists element by element (and calls back into merge_dict for any dictionaries inside them). Note also that dict2 and list2 take priority over dict1 and list1 respectively.

Let's see an example with a couple of dictionaries containing several elements.

dict1 = {"country": [{"name": "Spain", "capital": "Madrid"}, {"name": "France"}]}
dict2 = {"country": [{"name": "Germany"}], "continent": "Europe"}

print merge_dict(dict1, dict2)


This is the output for the preceding script.

$ python test.py 

{'country': [{'name': 'Germany', 'capital': 'Madrid'}, {'name': 'France'}], 'continent': 'Europe'}
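
One detail worth keeping in mind is that the two functions modify dict1 (and the lists inside it) in place. If the original dictionary must be preserved, a deep copy can be merged instead. The sketch below shows this with a condensed single-function variant written for Python 3; merge is a name chosen for the illustration, not the code above, and it differs in one corner case noted in the comments.

```python
import copy


def merge(left, right):
    """Recursively merge right into left (in place); right takes priority."""
    if isinstance(left, dict) and isinstance(right, dict):
        for key, value in right.items():
            left[key] = merge(left[key], value) if key in left else value
        return left
    if isinstance(left, list) and isinstance(right, list):
        for i, item in enumerate(right):
            if i < len(left):
                left[i] = merge(left[i], item)
            else:
                left.append(item)
        return left
    # Difference from merge_list above: a None on the right never
    # overwrites the value on the left, even inside a list.
    return left if right is None else right


dict1 = {"country": [{"name": "Spain", "capital": "Madrid"}, {"name": "France"}]}
dict2 = {"country": [{"name": "Germany"}], "continent": "Europe"}

# Merge into a deep copy so that dict1 itself stays untouched.
merged = merge(copy.deepcopy(dict1), dict2)

print(merged)
print(dict1)
```

The first print shows the same merged result as the output above, while the second one shows that dict1 still holds "Spain" and has no "continent" key.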