Attracting Google
I find it interesting that Google seems to pay fairly close attention to Advogato--within only a day or so of my last Google bait entry for "native-selector" (in relation to pyobjc) the entry appeared in the Google results. I've noticed this phenomenon before.

Contrast that with the fact that the Hydra project has changed its name to SubEthaEdit, which at this point in time returns exactly zero results. I wonder how long this entry will take to turn up... :-)

Update: As it happens it took about a day for Google to return results, but it didn't include this post...

Okay, so it's been a while, and I've got an entry from ages ago that I was going to post, but that'll have to wait...

Isn't it always the way...
Heh, unbelievable, I've just spent a huge chunk of time trying to work out how to access Mac OS X Rendezvous (ZeroConf) functionality via Apple Cocoa APIs with pyobjc. Anyway, between trying to learn to read Objective-C, working out how pyobjc interacts with it and understanding the Rendezvous API I wasn't making much progress...

In fact, it turns out I wasn't too far off.

I just needed to discover NSRunLoop.currentRunLoop().run() (yeah, okay, so I needed to walk before I tried to run; excuse the pun) and an underscore character.

An underscore character? Yep, my object instance had a method named netServiceBrowserWillSearch but it should have been netServiceBrowserWillSearch_.

I must find out whether there's a way to see which methods the runtime is attempting to call. With Objective-C's use of ':' and its subsequent conversion to '_' for Python, it all gets somewhat confusing...
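As a rough illustration of the naming rule (a sketch only: it doesn't import pyobjc, and python_name and missing_delegate_methods are names made up for this example), each ':' in an Objective-C selector becomes a '_' in the Python method name. A quick check of an object against the selectors you expect to be called can then catch a missing underscore:

```python
def python_name(selector):
    """Map an Objective-C selector to the method name Python code would use.

    Each ':' (one per argument) becomes a '_'.
    """
    return selector.replace(":", "_")


def missing_delegate_methods(obj, selectors):
    """Return the Python method names that obj fails to provide.

    Hypothetical helper: `selectors` is whatever list of Objective-C
    selectors you expect the framework to call on your delegate.
    """
    missing = []
    for selector in selectors:
        name = python_name(selector)
        if not callable(getattr(obj, name, None)):
            missing.append(name)
    return missing


class BrowserDelegate(object):
    # Oops: the trailing underscore is missing, as in the entry above.
    def netServiceBrowserWillSearch(self, browser):
        pass


if __name__ == "__main__":
    expected = [
        "netServiceBrowserWillSearch:",
        "netServiceBrowser:didFindService:moreComing:",
    ]
    print(missing_delegate_methods(BrowserDelegate(), expected))
```

This only checks names you already know about; it doesn't answer the harder question of discovering what the Objective-C side actually tried to call.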

Oh, and how did I discover all this in the end? Oh, that's easy, I just looked at rendezvous.py, a sample that can be found in the examples directory of the pyobjc distribution! I do have a little bit of an excuse since I installed pyobjc via the MacPython Package Manager, and I don't think it installs the examples... Still, I should have looked in the pyobjc CVS repository earlier...

I think I've learned a bit in the process though.

Unfortunately I couldn't get the Python Rendezvous implementation I found to run successfully, despite trying it on Linux, Windows 98 and XP. I also couldn't use it on Mac OS X because it can't cooperate with the implementation that's already there.

All I really wanted was to be able to browse things, in a somewhat cross-platform Python way, but it looks like more work is required.

I also could not get jrendezvous to work successfully between machines. In the end I used the sample mDNSPosix code from Apple successfully under Linux to publish details of services that my Mac OS X machine could find.

Update...
Google-bait: In pyobjc if the string representation of an object is of the following form:

it means (as far as I've figured it...) that you are accessing something (in this case name) incorrectly as an attribute instead of as a method. (So I assume a "native-selector" is Objective-C speak for something like a Python method.)

BayPiggies & PyChecker
Gave a talk tonight at BayPiggies (Bay Area Python Interest Group). It seems to have been fairly well received and went pretty much without any hiccups. This is even more surprising when you consider I only had about two days to prepare (both of which I was working), and only found a laptop to use this morning.

The talk was about PyChecker, which can be thought of as lint for Python.

Also previewed another Python project I've been working on, should be released next week. People showed some interest in it.

I think there were about twenty people present, and I talked for just under an hour.

Enjoyed the talk from the guys at GNU Radio the most, but all the presenters did well.

Outside of the "official" programme I had some interesting chats with people, including an editor from Salon; the owner of No Starch Press; and the guy who did the original reverse engineering of the AIM protocol.

Also had a useful conversation with another guy about iBooks. It was very interesting to note the number of iBooks (and running OS X) in the audience.

Another observation of note was that although this is CodeCon, there were often as many, if not more, questions relating to the societal impact of the technology being presented as there were about technical issues.

Blogs...
A sudden thought occurred to me as I was riding the bus to work... "Are blogs the talkback radio of the 00's?"

Whoops...
Here's a hint, it starts with "rm", ends with "*", with a "rf" in between... How could I be so stupid??? Fortunately the important stuff (all?) is in CVS and I think I'll cope without the rest, I guess I'll find out in due course. (Although the nice thing is that my '.' configuration files didn't get blown away which is one less irritation...) I think I'll be looking for an anti-stupidity tool tomorrow.

CodeCon
Looks like I'm going to CodeCon! (I mentioned it at the BayPiggies Python user group tonight too, maybe I could get a commission if anyone goes from there? :-) )

Exciting...

os.popen()
Hey there Python programmer, wanna execute a Python script using one of the os.popen() family and use a custom sys.path? You're in luck: even though there's no command line option to provide an alternative sys.path or PYTHONPATH to the Python interpreter, you can achieve the desired result (in a cross-platform way) by setting PYTHONPATH in the environment with os.putenv() or, more idiomatically, by assigning to os.environ. The exec*() and popen*() families inherit their environment from the current process, so the child sees the modified value. Just what I wanted. Now you can know, if you didn't already.
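Something along these lines shows the idea (a minimal sketch, not the code I actually used; run_with_pythonpath is a name invented for the example, and the command quoting is deliberately simplistic, so the code string must not contain double quotes):

```python
import os
import sys
import tempfile


def run_with_pythonpath(extra_dir, code):
    """Run `code` in a child interpreter with extra_dir as its PYTHONPATH."""
    previous = os.environ.get("PYTHONPATH")
    # Assigning to os.environ also updates the real process environment,
    # which is what the child started by os.popen() inherits.
    os.environ["PYTHONPATH"] = extra_dir
    try:
        command = '"%s" -c "%s"' % (sys.executable, code)
        with os.popen(command) as pipe:
            return pipe.read()
    finally:
        # Put the parent's environment back the way we found it.
        if previous is None:
            del os.environ["PYTHONPATH"]
        else:
            os.environ["PYTHONPATH"] = previous


if __name__ == "__main__":
    extra_dir = tempfile.mkdtemp()
    # The child's sys.path should include extra_dir.
    print(run_with_pythonpath(extra_dir, "import sys; print(sys.path)"))
```

Restoring the old value afterwards is optional, but it stops the change leaking into anything else the parent launches later.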

Work
Be happy for me, I have a job I enjoy... :-) (For at least the next month and a bit anyway.)

Development continues on the uncaught exceptions handler for Python, although mostly on the unit testing side which leads to:

A (possibly) interesting question: How do you unit test a replacement for the system exception handler? If you use the unittest module as it is, it always traps exceptions, so if you want to check whether your exception handler captures uncaught exceptions you can't, because they're not uncaught, 'cos unittest's exception handler captures them... Got it? :-) My answer was to run the tests outside the standard unittest framework and somehow link them into it. Which leads to...

External unit tests: I'm probably reinventing the wheel here, but I now have a new ExternalUnitTestMixin class which allows the standard unittest framework to run an outside command and test its output against the "correct" output (using os.popen4()). In my situation the command simply runs a Python script which tests that the exception handler prints the correct results (e.g. nothing, an incident id or a traceback, depending on the debug mode and where the exception occurred). Of course, when you're capturing exceptions the line numbers can change if you're editing the code, so my comparison checker optionally filters out line number references to avoid this issue.
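To give a flavour of the mixin idea, here's a sketch under some assumptions: it uses the subprocess module rather than os.popen4(), the method name assertCommandOutput and the "line N" filter are invented for the example, and the real class may look quite different.

```python
import re
import subprocess
import sys
import unittest

_LINE_NUMBER = re.compile(r"line \d+")


def strip_line_numbers(text):
    """Replace 'line 123' style references so edits don't break comparisons."""
    return _LINE_NUMBER.sub("line N", text)


class ExternalUnitTestMixin(object):
    """Mix into a TestCase to check the output of an external command."""

    def assertCommandOutput(self, args, expected, ignore_line_numbers=False):
        # stdout and stderr are merged, much as os.popen4() used to do.
        proc = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        actual = proc.stdout
        if ignore_line_numbers:
            actual = strip_line_numbers(actual)
            expected = strip_line_numbers(expected)
        self.assertEqual(actual, expected)


class UncaughtExceptionTest(ExternalUnitTestMixin, unittest.TestCase):
    def test_handler_reports_incident(self):
        # The child installs its own hook, then lets an exception escape.
        script = (
            "import sys; "
            "sys.excepthook = lambda t, v, tb: "
            "print('incident: %s at line %d' % (t.__name__, tb.tb_lineno)); "
            "raise ValueError('boom')"
        )
        self.assertCommandOutput(
            [sys.executable, "-c", script],
            "incident: ValueError at line 42\n",
            ignore_line_numbers=True,
        )


if __name__ == "__main__":
    unittest.main()
```

Because the exception escapes in a separate process, unittest never gets the chance to trap it, which is the whole point of running the check externally.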

The kinda nifty thing about this implementation is that the external unit tests are performed by running the same Python file that contains the "normal" unit tests as a script instead of importing it into the test runner. The external tests are all functions within the module which are extracted dynamically by a customised TestSuite class. I like it being done this way because it reduces the number of places you have to look for test code.

Unfortunately I discovered that the unittest module's loadTestsFromModule() method only loads TestCases automatically, not TestSuites, which leads us to...

Importing test suites: I came up with a two line patch to loadTestsFromModule() to make it import any TestSuite subclasses it finds in addition to TestCases. This means you can use "tricky" custom TestSuites but they're imported automatically, and treated as just another TestCase. Admittedly this is all a somewhat vague description, but hopefully the code's clearer, and it'll be better when/if I can put up some example somewhere.
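The patch itself isn't reproduced here, but the same effect can be approximated without touching unittest by subclassing the loader. This is a sketch only: SuiteAwareLoader and the sample classes are invented names, and it assumes each custom suite can be constructed with no arguments.

```python
import types
import unittest


class SuiteAwareLoader(unittest.TestLoader):
    """Loader that also picks up TestSuite subclasses found in a module."""

    def loadTestsFromModule(self, module, *args, **kwargs):
        suite = super().loadTestsFromModule(module, *args, **kwargs)
        for name in dir(module):
            obj = getattr(module, name)
            if (isinstance(obj, type)
                    and issubclass(obj, unittest.TestSuite)
                    and obj is not unittest.TestSuite):
                # Assumption: custom suites take no constructor arguments.
                suite.addTest(obj())
        return suite


class SampleCase(unittest.TestCase):
    def test_ok(self):
        self.assertTrue(True)


class SampleSuite(unittest.TestSuite):
    """A 'tricky' suite that builds its own tests dynamically."""

    def __init__(self):
        super().__init__([unittest.FunctionTestCase(lambda: None)])


# Stand-in for a real test module, so the example is self-contained.
demo_module = types.ModuleType("demo_tests")
demo_module.SampleCase = SampleCase
demo_module.SampleSuite = SampleSuite

loaded = SuiteAwareLoader().loadTestsFromModule(demo_module)
```

Here loaded ends up holding both the ordinary test case and the contents of the custom suite, so a test runner treats them alike.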

All of this is controlled by our unit test runner, which helpfully finds, loads and runs all our unit tests with just one command. That's handy, although not that original; hopefully I'll be able to put this up somewhere too.

The final "complete" incident id that is generated will probably have a version number prepended; I think it's more useful than making it part of the hash itself. The final form is probably something like <ver>-<id>-<line> (e.g. 501-33425-22), where <id> is generated from the hash of the function path. Keeping the version number separate means that exceptions from the same call path in different versions keep the same id, which I think is useful.
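A small sketch of why the separate prefix is handy (format_incident and same_call_path are made-up names, and it assumes the version string itself contains no '-'):

```python
def format_incident(version, call_id, line):
    """Build a full id of the form <ver>-<id>-<line>."""
    return "%s-%05d-%02d" % (version, call_id % 100000, line % 100)


def same_call_path(incident_a, incident_b):
    """True when two full ids differ at most in their version prefix."""
    return incident_a.split("-", 1)[1] == incident_b.split("-", 1)[1]


if __name__ == "__main__":
    old = format_incident("501", 33425, 22)
    new = format_incident("502", 33425, 22)
    print(old, new, same_call_path(old, new))
```

If the version were mixed into the hash instead, the two reports above would look completely unrelated.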

The final purpose of all this is to create a "catchall uncaught exception handler" so you might not have to make time to do it yourself... :-) Essentially you'll be able to import the handler module, start it (which hooks the exception handler), and register some Reporters. The reporters determine how the otherwise unhandled exception is reported (surprise!). For example, the current very basic reporters display the incident id in the console and email a report to an email address. Other possibilities include writing a log file, paging a tech or displaying a message dialog--which are reporters we're likely to implement. (Could have a direct-to-web option too...)

For our purposes we don't want to re-raise the exception--we want to suppress it completely as far as the user is concerned--but a reporter could be created to display the traceback as per normal behaviour.

I also briefly considered embedding a mini-stack trace in the incident id, but decided it would probably make it too long and enumerating the functions would make the system more difficult to use overall.

Thanks for mentioning anonymous functions and lambdas, I hadn't specifically thought about them. I haven't checked what the module does with them at the moment, but will have to do so, it depends on how Python reports them in the stack trace...

The module is mostly production ready in its current state, could be ready for release early next week--my employers are happy for it to be GPL'd.

Other stuff...

Having an employer that uses and is happy to release open source is cool.

Being paid to work with Python is (still) cool.

I have no idea what other people think about reading it, but looking at what I've written for this & my last entry maybe I will yet be able to get interested enough about something to actually write something significant about it.

Later...

CodeCon
How come it's taken me until now to realise that most of CodeCon takes place over the weekend, not during the week? That sudden realisation has greatly increased the likelihood of me going... (One other question, is it just my imagination or was the original discount price $75?)

Talkback ID functionality in Python
Interesting task at work today... While working on a wrapper to catch otherwise unhandled exceptions and deal with them "nicely", I started wondering how Full Circle Software's TalkBack (their website is surprisingly difficult to find using Google...) software and other similar products calculate their unique "Incident ID".

By my understanding, the key attributes of an incident id are:

It is short, say 5 to 7 digits.

Unique for the incident, i.e. no matter what machine the error occurs on, the same cause generates the same ID. (With some degree of certainty anyway...)

I had a look around on the net but after discovering that TalkBack is actually a closed source product, didn't find any implementation details. I eventually found Anet: A Network Game Programming Library which includes crash logging functionality. After a quick glance at the code (note: only the tar file seems to exist now) I couldn't readily identify a routine that calculates an incident id.

After further reading on crash signatures and the like I've hypothesized that the TalkBack ID and other similar incident ids are probably calculated by running some sort of hash algorithm over items (e.g. relative function call addresses) from the stack trace. But, I don't know for sure. Does anyone around here have any details on how TalkBack or BugToaster crash signatures/incident ids are created?

Anyway, I wanted to come up with some way to calculate a similar incident ID for otherwise uncaught Python exceptions.

The two approaches (with different trade-offs) I ended up with were to generate an incident id from either of the following sets of information found in the trace-back:

Line numbers

Function names

Using line numbers has the advantage that each exception in a particular function has a unique incident id, but the disadvantage that changes in the code (even the addition or deletion of comments) affect the id dramatically. This method is most suitable for stable final-release type code.

I think we've decided to use function names to generate the ids. Given a series of calls to the following functions (with an uncaught exception handled in the last function):

f1(), f2(), f3()

we assemble a string of the form "f1f2f3" (i.e. the function names concatenated together) and then get a hash from the string with Python's hash() function. Then, in order to get a "nicer" number we do a mod 100000 operation on the hash to get our incident id.

(Actually I ended up adding a line number (mod 100) to the end of the incident id also. If you had some standard way of enumerating exception types you could probably throw the exception type into the mix too.)
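Roughly, the calculation looks like the sketch below. One deliberate substitution: current Python versions randomise hash() for strings per process unless PYTHONHASHSEED is set, which would give a different id on every run, so this example uses zlib.crc32 as a stand-in stable hash. The function names are invented for the example.

```python
import sys
import traceback
import zlib


def incident_id(tb):
    """Derive a short id from the function names in a traceback."""
    frames = traceback.extract_tb(tb)
    # Concatenate the function names along the call path, e.g. "f1f2f3".
    path = "".join(frame.name for frame in frames)
    # crc32 stands in for hash() so the id is stable between runs.
    call_id = zlib.crc32(path.encode("utf-8")) % 100000
    # Append the line number (mod 100) of the innermost frame.
    line = frames[-1].lineno % 100
    return "%05d-%02d" % (call_id, line)


def f3():
    raise RuntimeError("boom")


def f2():
    f3()


def f1():
    f2()


def demo():
    """Trigger the f1 -> f2 -> f3 failure and return its incident id."""
    try:
        f1()
    except RuntimeError:
        # The traceback starts at demo itself, so the hashed path
        # here is "demof1f2f3".
        return incident_id(sys.exc_info()[2])


if __name__ == "__main__":
    print(demo())
```

The same call path always yields the same id, while editing the code around the raise only changes the two digits after the dash.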

I think this will serve our purposes for a start, I figured the hashing algorithm can afford to be relatively simple (so using Python's builtin hash() is probably overkill...) as one would hope that the number of items in the hash space would be small (being the total number of paths to uncaught exceptions after all!).

Would be interested in comments on this from anybody who's had more experience with this type of thing than me.