
Writing pygtk applications with style, using pygtkhelpers

24/05/2010

Introduction

pygtkhelpers is an awesome library for writing pygtk applications. It was developed by the PIDA developers and makes the pygtk programming experience much better. Let’s start with the tutorial.

I’ve used this library for my project, filesnake, and I want to explain my workflow.

GUIs as templates

In pygtkhelpers, “glade” files and hand-written GUIs blend together in a wonderful manner. In particular, each piece of the GUI is separate from the rest, which gives you a lot of control, flexibility and maintainability.

I started with glade, writing a “skeleton GUI”: a window with a VBox inside (3 slots). In each slot of the VBox I packed a gtk.EventBox (another gtk.VBox would have been fine for the purpose too). These EventBoxes serve as “placeholders” for the other pieces of the GUI: the menu, the userlist and the status bar.

I’ve written the little bit of code needed to make the GUI run:

from pygtkhelpers.delegates import WindowView

class FileSnakeGUI(WindowView):
    builder_file = "main_win.glade"

def test_run():
    fs = FileSnakeGUI()
    fs.show_and_run()

if __name__ == '__main__':
    test_run()

It’s time to write the three components:

  • menu
  • userlist
  • statusbar

The Menu

I’ve written another glade file that contains a window with the menu inside; the library handles the extraction of the menu from its “container” window for you.

To connect the events I’ve used the signal handling facilities provided by pygtkhelpers.

The connection is made automatically via the naming convention on_widget__signal; this simple convention lets us eliminate the boilerplate code related to the “connect” calls.

from pygtkhelpers.delegates import SlaveView

class Menu(SlaveView):

    builder_file = "menu.glade"

    def __init__(self, parent):
        SlaveView.__init__(self)
        self.parent = parent

    def on_quit__activate(self, *a):
        self.parent.hide_and_quit()

    def on_sendfile__activate(self, *a):
        self.parent.send_file()

Handwritten GUIs

I’ve written the statusbar and the userlist by hand. The code can be as complex as you want, demonstrating the flexibility and “invisibility” of the framework; all the code is neatly packed and organized in its own place.

When subclassing a generic SlaveView/WindowView (both subclasses of BaseDelegate) you can override these methods to customize the behaviour of the class:

  • create_ui: in this method you can build your GUI by hand; usually all the
    “add_slave” code goes here
  • on_mywidget__event: these are the signal handlers; they let you write
    cleaner code without all the self.connect boilerplate
  • __init__: here you can accept custom initialization arguments and put
    control code not related to the GUI (nothing stops you from doing that
    in create_ui; it just adds a bit of convention)

Here’s the UserList code. You can add extra methods to simplify
external access, like add_user(); such a method would be used, e.g.,
in the main controller (FileSnakeGUI). It’s a component, and it’s reusable.

from collections import namedtuple

import gtk
from pygtkhelpers.delegates import SlaveView

# Defining a user container
User = namedtuple("User", "name icon address port")

# UserList section
class UserList(SlaveView):

    def create_ui(self):
        model = gtk.ListStore(object)
        treeview = gtk.TreeView(model)
        treeview.set_name("User List")

        iconrend = gtk.CellRendererPixbuf()
        inforend = gtk.CellRendererText()

        iconcol = gtk.TreeViewColumn('Icon', iconrend)
        infocol = gtk.TreeViewColumn('Info', inforend)

        iconcol.set_cell_data_func(iconrend, self._icon_data)
        infocol.set_cell_data_func(inforend, self._info_data)

        treeview.append_column(iconcol)
        treeview.append_column(infocol)
        treeview.set_headers_visible(False)

        self.store = model
        self.treeview = treeview
        self.widget.add(treeview)

    def _icon_data(self, column, cell, model, iter):
        row = model[iter]
        user = row[0]
        cell.set_property("pixbuf", gtk.gdk.pixbuf_new_from_file(user.icon))

    def _info_data(self, column, cell, model, iter):
        row = model[iter]
        user = row[0]

        template = "<big><b>{user}</b></big>\n<small><i>{address}:{port}</i></small>"
        label = template.format(user=user.name,
                                address=user.address,
                                port=user.port)
        cell.set_property("markup", label)

    def add_user(self, user):
        self.store.append([user])

The statusbar doesn’t introduce anything new. You can find the source following the link at the end of the article.

To pack the GUI together, we’ll use the add_slave method: it adds the widget defined in a slave view to a container widget (the EventBoxes we placed). The resulting code is this:

from pygtkhelpers.delegates import WindowView

class FileSnakeGUI(WindowView):

    builder_file = "main_win.glade"

    def create_ui(self):
        self.userlist = UserList()

        self.add_slave("statusbar_cont", Statusbar())
        self.add_slave("menu_cont", Menu(self))
        self.add_slave("userlist_cont", self.userlist)

    def on_window1__delete_event(self, *a):
        self.hide_and_quit()

pygtk signal facilities: now you have no excuse

It’s trivial to add your own signals to a BaseDelegate subclass; let’s see
how to add a “user-added” signal to my UserList:

from pygtkhelpers.utils import gsignal

class UserList(SlaveView):

    gsignal("user-added", object)

    # ... source code defined before ...
    def add_user(self, user):
        self.emit("user-added", user)
        self.store.append([user])

It’s just one line of code, and you can connect to it like on a gtk widget,
implementing the observer pattern in the easiest way I’ve seen. It’s
cool!

Refactoring the UserList with the pygtkhelpers.ui.objectlist.ObjectList widget

Among the most useful features of pygtkhelpers are ObjectList and ObjectTree, pythonic versions of the common gtk list/tree widgets. They automate the tedious task of setting up treeview widgets in a very powerful (yet customizable) manner.

In a few words: given an object, its attributes can be mapped to the treeview through Columns, which almost automatically select the correct renderer for the data to be displayed, as in the following scheme:

The resulting code is much cleaner and more readable:

from pygtkhelpers.ui.objectlist import Column, ObjectList
from pygtkhelpers.delegates import SlaveView
import gtk

from collections import namedtuple

# This is the "user object"
User = namedtuple("User", "name address port icon")

class UserList(SlaveView):

    def create_ui(self):
        # Each Column maps one UserEntry attribute: Column("attribute", type)
        self.users = ObjectList(columns=[Column("icon", gtk.gdk.Pixbuf),
                                         Column("info", str, use_markup=True)])

        self.users.set_headers_visible(False)
        self.widget.add(self.users)

    def add_user(self, user):
        self.users.append(UserEntry(user))

# UserEntry exposes the attributes mapped by the treeview: info and icon
class UserEntry(object):

    def __init__(self, user):
        template = "<big><b>{user}</b></big>\n<small><i>{address}:{port}</i></small>"
        self.info = template.format(user=user.name,
                                    address=user.address,
                                    port=user.port)
        self.icon = gtk.gdk.pixbuf_new_from_file(user.icon)

There are other interesting features too that really boost pygtk GUI programming, especially around ObjectList, but also in the pygtkhelpers.ui.dialogs module: well-cooked dialogs for common uses.

You can find the full source code at the blog repo:

http://bitbucket.org/gabriele/pygabriel-blogging/src/tip/pygtkhelpers/source/

Profiling python C extensions

14/04/2010

In the last few days I was optimizing some code I’ve written for PyQuante
(http://pyquante.sourceforge.net/). I had to do a lot of searching to find
my way around profiling C extensions from python.

I’ll also digress on various tools I’ve used along the way; feel free to
skip ahead to “The Solution” section.

It’s all tested on Linux; if you need help with other platforms,
I’ll see what I can do.

The Test Case

Let’s see a typical example: imagine you have some data (numbers)
and you want to compute something from it.

This data comes from some source or is generated by python control
code; in this example we’ll use xml, but there are a lot of use cases.

Let’s see a snippet; this code can lock up your computer even for a small
set of data (like 100 numbers):

#!/usr/bin/python
# -*- coding: utf-8 -*-
import math
import xml.etree.ElementTree as ET

def parse_numbers(xmlfile):
    '''
    Parses numeric data from a dummy xmlfile
    '''

    tree = ET.parse(xmlfile)
    root = tree.getroot()
    numbers = []

    for element in root:
        numbers.append(float(element.text))
    return numbers

def stddev(numbers):
    res = 0
    for n in numbers:
        res += math.pow(n, 2)
    return math.sqrt(res)

def heavycrunch(numbers):
    res = 0
    for n1 in numbers:
        for n2 in numbers:
            for n3 in numbers:
                for n4 in numbers:
                    tocalc = (n1, n2, n3, n4)
                    res += stddev(tocalc)
    return res / len(numbers) ** 4

if __name__ == '__main__':
    numbers = parse_numbers('data.xml')
    result = heavycrunch(numbers)
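The article never shows data.xml itself, so here is a hedged sketch of a layout the parser would accept (the element names are my guesses), together with a quick round-trip:

```python
import xml.etree.ElementTree as ET

# Guessed layout: a root element whose children each hold one number
# as text. The real data.xml used in the article is not shown.
with open('data.xml', 'w') as f:
    f.write('<numbers><n>1.0</n><n>2.5</n><n>3.0</n></numbers>')

def parse_numbers(xmlfile):
    '''Parses numeric data from a dummy xmlfile.'''
    root = ET.parse(xmlfile).getroot()
    return [float(element.text) for element in root]

numbers = parse_numbers('data.xml')
```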

First point: profile and optimize it in python. It’s not uncommon
to reach enough speed in pure python, and staying in python has a lot
of advantages, like ease of deployment and ease of testing. For this
there is the cProfile profiler, which is pretty good.

This command lets me profile the code and sort the results by time
spent in each function (subfunctions excluded):

python -m cProfile -s time example.py

         77765214 function calls (77765107 primitive calls) in 173.175 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 12960000   89.673    0.000  156.182    0.000 example.py:33(stddev)
 51840000   53.832    0.000   53.832    0.000 {math.pow}
        1   16.961   16.961  173.143  173.143 example.py:39(heavycrunch)
 12960001   12.677    0.000   12.677    0.000 {math.sqrt}
        1    0.007    0.007    0.008    0.008 sre_compile.py:307(_optimize_unicode)
       61    0.003    0.000    0.004    0.000 ElementTree.py:1075(start)
        1    0.002    0.002  173.174  173.174 example.py:5(<module>)
        2    0.002    0.001    0.002    0.001 {range}
...........................................................
  • ncalls: the number of calls
  • tottime: total time spent in this function, excluding the time spent in
    subfunctions (this is the most important one)

This output tells me that 89 seconds are spent in the inner loop of stddev
(and perhaps in call overhead), 53 seconds in computing the
powers, and 16 seconds in the heavycrunch loops.

This can be optimized well in pure python; several suggestions
come to mind:

  • Inlining the loop body of stddev
  • Substituting x*x for pow(x, 2)
  • Using sum(map(operator.mul, numbers, numbers)) instead of all of these
    explicit loops.
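To get a feel for the second suggestion, here is a quick timeit sketch; absolute timings vary per machine, so none are claimed here:

```python
import math
import timeit

# math.pow goes through a function call and C's pow(); plain
# multiplication is a single interpreter-level operation.
t_pow = timeit.timeit('math.pow(x, 2)', setup='import math; x = 1.5',
                      number=100000)
t_mul = timeit.timeit('x * x', setup='x = 1.5', number=100000)

# Both compute the same value, of course:
same_result = math.pow(1.5, 2) == 1.5 * 1.5
```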

Let’s see what happens when we edit the relevant lines:

import operator

def heavycrunch(numbers):
    res = 0
    for n1 in numbers:
        for n2 in numbers:
            for n3 in numbers:
                for n4 in numbers:
                    tocalc = (n1, n2, n3, n4)
                    res += sum(map(operator.mul, tocalc, tocalc)) ** 0.5
    return res / (len(numbers) ** 4)

The profiling output:

         25925214 function calls (25925107 primitive calls) in 79.984 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   34.535   34.535   79.951   79.951 example.py:43(heavycrunch)
 12960000   29.576    0.000   29.576    0.000 {map}
 12960000   15.840    0.000   15.840    0.000 {sum}
        1    0.008    0.008    0.009    0.009 sre_compile.py:307(_optimize_unicode)
       61    0.003    0.000    0.004    0.000 ElementTree.py:1075(start)
        2    0.002    0.001    0.002    0.001 {range}

We have roughly doubled the speed, but we can’t do much more than that.
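As a sanity check, here is a small sketch verifying on a tiny arbitrary input that the rewritten loop computes the same value as the original:

```python
import math
import operator

def stddev(numbers):
    # original helper: sqrt of the sum of squares
    res = 0
    for n in numbers:
        res += math.pow(n, 2)
    return math.sqrt(res)

def heavycrunch_old(numbers):
    res = 0
    for n1 in numbers:
        for n2 in numbers:
            for n3 in numbers:
                for n4 in numbers:
                    res += stddev((n1, n2, n3, n4))
    return res / len(numbers) ** 4

def heavycrunch_new(numbers):
    res = 0
    for n1 in numbers:
        for n2 in numbers:
            for n3 in numbers:
                for n4 in numbers:
                    tocalc = (n1, n2, n3, n4)
                    res += sum(map(operator.mul, tocalc, tocalc)) ** 0.5
    return res / len(numbers) ** 4

diff = abs(heavycrunch_old([1.0, 2.0, 3.0]) - heavycrunch_new([1.0, 2.0, 3.0]))
```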

Anyway, profiling python code is not the main purpose of this article :).

Writing the function and the wrapper in C

Let’s implement the interesting functions in C: headers in example.h and
sources in example.c.

Nothing much to say, they’re just pure C functions.

/* example.h */

double stddev(double *numbers, int len);
double heavycrunch(double *numbers, int len);

/* example.c */

#include <math.h>

double stddev(double *numbers, int len)
{
  double res = 0;
  int i;
  for (i = 0; i < len; i++)
    {
      res += pow(numbers[i], 2);
    }
  return sqrt(res);
}

double heavycrunch(double *numbers, int len)
{
  double res = 0;
  double topass[4];
  int i, j, k, l;

  for (i = 0; i < len; i++)
    {
      for (j = 0; j < len; j++)
        {
          for (k = 0; k < len; k++)
            {
              for (l = 0; l < len; l++)
                {
                  topass[0] = numbers[i];
                  topass[1] = numbers[j];
                  topass[2] = numbers[k];
                  topass[3] = numbers[l];

                  res += stddev(topass, 4);
                }
            }
        }
    }
  return res / pow(len, 4);
}

Now we have to implement the wrappers; we will wrap just heavycrunch,
because we don’t use stddev externally.

I used cython to wrap the extension and I suggest you do the same; it
makes the process of wrapping C stuff straightforward and fun :).

# example_wrap.pyx
from stdlib cimport malloc, free  # we need to allocate a double *

cdef extern from "example.h":
    double cheavycrunch "heavycrunch" (double *numbers, int len)

def heavycrunch(numbers):
    cdef double *numarray

    numarray = <double *> malloc(sizeof(double) * len(numbers))
    for i, num in enumerate(numbers):
        numarray[i] = num

    res = cheavycrunch(numarray, len(numbers))
    free(numarray)
    return res

OK, sorry for the code spamming; you will find everything at the end of
the article so you can run a working example.

Let’s compile and run. I used scons to do that, but you can also use the
following commands (well, scons generated these for me):

gcc -o example.os -c -fPIC -I/usr/include/python2.6 example.c
cython -o example_wrap.c example_wrap.pyx
gcc -o example_wrap.os -c -fPIC -I/usr/include/python2.6 example_wrap.c
gcc -o example_wrap.so -shared example_wrap.os example.os -lpython2.6

We’re done with all this stuff; let’s test how much speed we’ve
gained.

from example_wrap import heavycrunch

if __name__ == '__main__':
    numbers = parse_numbers('data.xml')
    result = heavycrunch(numbers)

         5213 function calls (5106 primitive calls) in 0.402 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.369    0.369    0.369    0.369 {example_wrap.heavycrunch}
        1    0.007    0.007    0.008    0.008 sre_compile.py:307(_optimize_unicode)
       61    0.004    0.000    0.005    0.000 ElementTree.py:1075(start)
        2    0.002    0.001    0.002    0.001 {range}

Nice! But now the real purpose of the article begins: how do we get
profiling information from *inside* heavycrunch?

If you pass a big xml file with ~100 numbers, we still have a problem
in C, and you may want to profile and optimize it.

gprof – the plain old way

You can try to profile the functions independently of the python
code: write a main.c file and implement some tests. This implies
feeding the data into the function yourself, so you would need a C xml parser.

After that you can profile your code with various C tools (like gprof).

This is a lot of work; “scripting” in C is somewhat difficult and you can
lose a lot of time.

There should be another way to accomplish this, but I couldn’t figure
out how: compiling the whole python interpreter with profiling
flags. That could work as well, but it’s a bit overwhelming.

The Solution – google-perftools

There’s a really nice library out there: google-perftools

http://code.google.com/p/google-perftools/

This library is not invasive: you can profile without special
requirements and run the code under any conditions. Nice!

The library works in this way:

ProfilerStart("logfile.log")
... code to profile ...
ProfilerStop()

And it dumps profiling information about the code in between into logfile.log.

Question: what if we call these two functions *from python*?
Answer: we get profiling information on what’s happening inside those
two calls!

So basically we have to install this library and wrap these two functions;
it’s quite easy with cython.

# prof.pyx
cdef extern from "google/profiler.h":
    void ProfilerStart( char* fname )
    void ProfilerStop()

def profiler_start(fname):
    ProfilerStart(<char *>fname)

def profiler_stop():
    ProfilerStop()

Compiling goes something like this:

cython -o prof.c prof.pyx
gcc -o prof.so -shared prof.c -fPIC -lpython2.6 -lprofiler -I/usr/include/python2.6

Now that we have our profiling extension we can profile and analyze the
output:

from example_wrap import heavycrunch
from prof import profiler_start, profiler_stop

if __name__ == '__main__':
    numbers = parse_numbers('data.xml')
    profiler_start("heavycrunch.log")
    result = heavycrunch(numbers)
    profiler_stop()

OK, we have produced our profiling data; we can analyze it in a lot of ways:

http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html

Personally I like the graphical interface of kcachegrind.

pprof --callgrind example_wrap.so heavycrunch.log > heavycrunch.callgrind
kcachegrind heavycrunch.callgrind

The graphical interface is quite intuitive; anyway, take a look at the
picture.

Kcachegrind example

There is a “Flat Profile” on the left, with fields:

  • incl: the “time” (well, it’s more like a rate) spent in this
    function, subcalls included
  • self: “time” spent in the function, subcalls excluded; this is the
    most important field.
  • The other two are self-explanatory.

There’s a nice feature that gives you a graphical representation of all
this: on the right, go to “Callee Map”. It shows graphically
and interactively how the time is distributed between callees.

In our case it’s evident that we have to optimize stddev (you had guessed
it already); we can cut that 75% off, but this is left as an exercise.

If any reader has questions or suggestions for improving the
article, I’ll be happy to answer quickly! Bye!

Emacs coding and … blogging

29/03/2010

After some time of intensive hacking, here I am. I’ve just found some
comments on another post, so I’ve decided to write down some thoughts…

Well, this is just a little test of writing a blog entry directly
in emacs using “weblogger-mode”. I guess now I’ll have to write a little
tutorial about it.

I’m also using emacs as my development environment, and it’s quite good
at it. I’m packing together a lot of useful, well-tested extensions in a
bundle that I ship on my machines. Maybe this can be useful to other
python developers!

See you for updates, Bye!

Invites for Google Wave

15/11/2009

If anyone wants an invite to google wave, I have 9 invites remaining; tell me if you need one!

pygtkscintilla – version 0.1 released

15/09/2009

pygtkscintilla is a python wrapper for Scintilla; it gives pygtk a powerful source-editing widget with features such as:

* Syntax highlighting for tons of languages
* Autocompletion
* Code Folding
* Calltips
* Annotations
* Markers
* Multiple selections (NEW!)

It provides a nice pythonic API that looks and works better.

Sourceforge Page:

pygtkscintilla, some documentation written

02/09/2009

I’ve written some documentation for the pygtkscintilla project (svn version).
The sourceforge page:
http://sourceforge.net/projects/pygtksci/

The documentation:
http://pygtksci.sourceforge.net

The wiki for bugs, status, other info:
https://sourceforge.net/apps/trac/pygtksci/wiki

pygtkscintilla preliminary version 0.0.1 released

10/08/2009

I’ve packaged the gtkscintilla module in a rough manner;
follow the instructions in the trac wiki. If you have problems or want to tell me something, leave a comment!
