A few months ago I proposed a new analysis for my thesis data. The trouble was that I had only a faint scent of the direction: I had no algorithm, nor the means to find out what that algorithm ought to be. I knew it must involve some automation (after all, there is ~30 GB of data), but the impasse was the lack of live interaction with my data, where I could explore the different transformation options.
After numerous blunders on my self-devised curriculum, I’m now well on my path, helped tremendously along the way by the Enthought Traits Python package. Traits offers a nice view at the top but has a pretty slippery learning curve on the way up. I’ve constantly wished there were a single QuickStart tutorial that would take me from 0 to 100mph. Here’s my attempt at a 20->40 tutorial: it assumes some general knowledge of Python (Software Carpentry is a good resource to get started) and ends at what I know.
Part I: What does Traits do?
“A manifest type definition library for Python that provides initialization, validation, delegation, notification, and visualization.”
The official definition reads like gibberish to a novice Pythonista. Let’s do an example: say we would like to conduct a study correlating happiness with age. We’d probably want to make a class of subjects, with variables that hold each subject’s age and happiness. In the absence of the Traits package, this would probably look like:
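A minimal sketch of such a baseline class. The default values (30 and 50.0) and the name `bob` are borrowed from later in this post; the rest is an assumption about what the plain version would look like:

```python
class Subject:
    """A study participant described by plain Python attributes."""

    def __init__(self, age=30, happiness=50.0):
        self.age = age              # age in years, an int
        self.happiness = happiness  # happiness score, a float


bob = Subject()
bob.age = 31  # nothing checks this value or reacts to the change
```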
That’s the baseline we’re working from: it represents the data but does not do much beyond that. To enter data, I would most likely have written a text prompt that says, “What is the subject’s age? (press enter when finished)”, and written that input into bob.age. It gets the job done, but is neither convenient nor pretty (what do you do when you want to go back and change things?).
Compounding that, to err is human, and sooner or later I’ll find myself entering jane.age as 455 instead of 45. Both are perfectly fine integers, and Python would accept either without skipping a beat, even though 455 isn’t a feasible human age. To keep this from happening, we’ll need to write some code to make sure that the integers are reasonable, that is, some code to validate the values. Then, to make life easier, we’ll try to make ourselves a pretty graphical interface instead of the text prompt, and life just got a whole lot messier.
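To give a feel for that extra work, here is a rough sketch of hand-written validation using a plain Python property. The 15 to 100 bounds are the ones used for the Traits version later in this post; the error message and the class layout are my own assumptions:

```python
class Subject:
    """Plain-Python subject that validates age by hand."""

    MIN_AGE, MAX_AGE = 15, 100  # assumed plausible bounds for this study

    def __init__(self, age=30, happiness=50.0):
        self.age = age  # routed through the property setter below
        self.happiness = happiness

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        # Reject anything that is not an integer inside the allowed range.
        in_range = isinstance(value, int) and self.MIN_AGE <= value <= self.MAX_AGE
        if not in_range:
            raise ValueError(
                f"age must be an integer between {self.MIN_AGE} "
                f"and {self.MAX_AGE}, got {value!r}"
            )
        self._age = value


jane = Subject(age=45)
try:
    jane.age = 455  # the typo from the text
except ValueError as err:
    print(err)
```

That is a dozen lines for one attribute, with no graphical interface yet, and happiness would need the same treatment.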
With Traits, you would instead write code like this, which is not much different from what we had:
Since this is our first example, let’s look at this line by line, and compare it with our base class.
line 1, 2: These import the Enthought libraries for this scenario. (These are installed if you use the Enthought Python Distribution; it’s needlessly painful to install them on your own.)
line 4: This makes Subject a subclass of HasTraits. That is, Subject now inherits all the goodness that HasTraits has: the whole “initialization, validation, delegation, notification, and visualization” bit.
line 5: Instead of saying age is an integer, we talk about age as (an instance of) a Range object as set up by Enthought Traits. We’ll talk more fully about what this means when we look at the UI, but for now, you can see that this attribute will only accept values between 15 and 100, and is trained to complain if someone tries to enter 1000.
line 6: Range objects can hold floating point numbers too. Sweet 🙂
line 9: We construct a Subject, and call him bob. At this point, bob has an age (30) and happiness (50.0) already.
line 10: bob is a Subject, which is in turn also an (object that) HasTraits. HasTraits and its descendants (like Subject, and therefore bob) have a method called configure_traits(), whose function is to launch a graphical interface for the object. For the case of bob, that default interface looks like:
This looks pretty good for essentially no effort on our part, and we can do a whole lot more with just a little bit more effort. At this point though, let’s go back and look at what used to be gibberish:
“Enthought Traits is a manifest type definition library for Python that provides initialization, validation, delegation, notification, and visualization.”
Initialization means that there is a default age and happiness, set by the programmer (you). If you do not define the default explicitly as we did here, the object holds a default value as defined by the Traits package.
Validation means that some basic checking is in place. For example, if we try to enter happiness = 450.0 for bob, the following window pops up (“you can be too happy”):
Delegation… we won’t deal with that at all here.
Notification… we’ll look at this when we move on to some basic plotting. But imagine we have calculated an average happiness for all the 30-year-olds in the study, and now bob is suddenly a great deal happier. So we enter the value 100.0 for bob.happiness. If happiness is not traited, the average will stay where it is, because it doesn’t know that it ought to be recomputed. But since Subject.happiness is traited, it has the ability to notify the average to recompute itself accordingly. This is really useful when you start connecting different pieces together with a graphical interface.
Visualization is the pretty slider you see. bob, age, and happiness are all Traited objects, so they can be edited easily with an interface that makes good sense.
In the most basic sense, this example showcases what the Traits package can do above and beyond the native Python types. Besides Range, the Enthought package comes with a great many other useful traited objects: you can find a list of them here. The linked page is from the Traits User Manual, which is full of information but which, as a novice, I found rather dense and difficult to read.
In the next section, we’ll look at making some simple modifications to the interface that Subject presents.