Posts Tagged ‘Pidgin’

I realize that I haven’t written my customary “software stack” post for this year yet. But hey, from where I’m sitting, I still have … 36 minutes to spare.

I’ll be using the same categories as last year: system, communications, web, development, office suite, server, organization, and entertainment.

System

The OS of choice is still Archlinux, my window manager is still wmii, my terminal emulator is rxvt-unicode, upgraded by also installing urxvt-tabbedex.

My shell is still bash, my cron daemon is still fcron, and my network manager is wicd.

To this configuration I’ve added the terminal multiplexer tmux, and have lately found out just how useful mc can be. Oh, and qmv from the renameutils package is now a given part of the stack.

Communications

Not much change here, Thunderbird for email, Pidgin for instant messaging, irssi for IRC.

Heybuddy has been replaced by identicurse as my micro-blogging (identi.ca) client. Heybuddy is very nice, but I can use identicurse from the commandline, and it has vim-like bindings.

For Pidgin I use OTR to encrypt conversations. For Thunderbird I use the enigmail addon along with GnuPG.

This means that Thunderbird still hasn’t been replaced by the “mutt-stack” (mutt, msmtp, offlineimap and mairix) and this is mostly due to me not having the energy to learn how to configure mutt.

I also considered trying to replace Pidgin with irssi and bitlbee, but Pidgin + OTR works so well, and I have no idea about how well OTR works with bitlbee/irssi (well, actually, I’ve found irssi + OTR to be flaky at best).

Web

Not much changed here either, Firefox dominates, and I haven’t looked further into uzbl although that is still on the TODO list, for some day.

I do sometimes also use w3m, elinks, wget, curl and perl-libwww.

My Firefox is customized with NoScript, RequestPolicy, some other stuff, and Pentadactyl.

Privoxy is nowadays also part of the loadout, to filter out ads and other undesirable web “resources”.

Development

In this category there have actually been some changes:

gvim has been completely dropped

eclipse has been dropped, using vim instead

mercurial has been replaced by git

Thanks in no small part to my job, I have gotten more intimate knowledge of awk and expect, as well as beginning to learn Perl.

I still do some Python hacking, a whole lot of shell scripting, and for many of these hacks, SQLite is a faithful companion.

Doh! I completely forgot that I’ve been dabbling around with Erlang as well, and that mscgen has been immensely helpful in helping me visualize communication paths between various modules.

“Office suite”

I still use LaTeX for PDF creation (sorry hook, still haven’t gotten around to checking out ConTeXt), I haven’t really used sc at all, it was just too hard to learn the controls, and I had too few spreadsheets in need of creating. I use qalculate almost on a weekly basis, but for shell scripts I’ve started using bc instead.

A potential replacement for sc could be teapot, but again, I usually don’t create spreadsheets…

Server

Since I’ve dropped mercurial, and since the mercurial-server package suddenly stopped working after a system update, I couldn’t be bothered to fix it, and it is now dropped.

screen and irssi are of course always a winning combination.

nginx and uwsgi have not been used to any extent, and I haven’t tried setting up a VPN service, but I have a couple of ideas for the coming year (mumble, some VPN service, some nginx + Python/Perl thingies, bitlbee), and I may replace the Ubuntu installation with Debian.

Organization

I still use both vimwiki and vim outliner, and my Important Dates Notifier script.

Still no TaskJuggler, and I haven’t gotten much use out of abook.

remind has completely replaced when, while I haven’t gotten any use whatsoever out of wyrd.

Entertainment

For consuming stuff I use evince (PDF), mplayer (video), while for music, moc has had to step down from the throne, to leave place for mpd and ncmpcpp.

eog along with gthumb (replacing geeqie) handles viewing images.

For manipulation/creation needs I use LaTeX, or possibly Scribus, ffmpeg, audacity, imagemagick, inkscape, and gimp.

Bonus: Security

I thought I’d add another category, security, since I finally have something worthwhile to report here.

And sometimes (this was mostly relevant when debugging passtore after having begun actively using it), when I have a sensitive file which I need to store on the hard drive in clear text for a session, I use quixand to create an encrypted directory with a session key stored only in RAM. So once the session has ended, there is little chance of retrieving the key and decrypting the encrypted directory.

Ending notes

That’s about it. Some new stuff, mostly old stuff, only a few things getting kicked off the list. My stack is pretty stable for now. I wonder what cool stuff I will find in 2012.

Although not a whole lot has changed in this part of the stack, I’ll go through it for completeness’ sake. And there is actually an addition to the stack if you look closely.

Email

Thunderbird remains my email client of choice, augmented by the Enigmail add-on which enables support for GnuPG.

Instant messaging

Pidgin still remains my IM client, because it works well, has multi-protocol support (which is necessary, as it is hard to get all your friends to switch from MSN to Jabber), and supports OTR (the primary reason why empathy won’t exist on any of my systems any time soon).

IRC

Microblogging

Finally, we’ve come to the addition to the stack: a microblogging client.

I tried Gwibber, and it worked ok (never had the latest and greatest as I at that time was running Ubuntu Jaunty so it might be better now) but it wasn’t perfect.

I then started having problems with the way twitter operates (more on that in a separate post later), and all of a sudden, the fact that they’d changed authentication to OAuth, which gwibber on my old Jaunty installation couldn’t interface with, didn’t much matter anymore.

That’s because twitter isn’t the only game in town. It might be the most populated service, but not the only one… so when I heard of heybuddy, a lightweight Identi.ca-only client, I jumped ship and haven’t looked back since.

The next post will be about the software I have come to use to organize my life.

I like Pidgin. I like it a lot, and have done so since the days when it was named Gaim. Add plugins like Off-The-Record and hook it up to TOR, and we have a pretty powerful communications package.

Pidgin doesn’t support audio or web cam conversations, which is a bummer, but nothing I personally need or use, so it isn’t a big deal, it just means I can’t promote it to other people who do use/need those features.

But, on to the “battle”. You see, while Pidgin is a great piece of software, it has ugly sides as well. Tonight I upgraded my Pidgin installation. I don’t upgrade Pidgin as often as I should. There is a reason for this. Until this evening I’ve lived with Pidgin 2.4.1. A customized version of 2.4.1.

Most of the time, regarding most issues, I consider the developers of Pidgin to be awesome. Their software is awesome. Most of the time…

So I don’t upgrade very often, because I customize my Pidgin, which means that I go through all the trouble of downloading the source, customize the code, and go through the build-process. What is this customization which I require so badly you wonder? What would be so hard to reconcile with, as to trigger this kind of response?

It is actually quite silly. Silly of the developers, and silly of me. I have loads of respect for them and they have posted arguments for their “improvement” but to this day, 16 months later, I still cannot see it as anything less than a usability regression.

I am of course talking about Issue #4986… “closed enhancement: wontfix”, a bug which yielded 325 comments, with some users registering just to voice their discontent with this “feature”, where the message input area “grows” upwards as you type long messages. No more manually resizable input area.

I know this is a free software project, I know that if I don’t like it I can either fork the project or… well… “fork off”.

And rather than learning to live with that “feature” (for me it is still a usability regression), I’d rather download their source, hard code the textarea to 4 lines high, and accept that neither they, nor I, have control over the resizing of the input area. I much prefer this option to the alternative.

And then I came along this post which got me thinking about what software I ended up using towards the end of my bachelors. Or the software I have learned of since, but wish I’d known about earlier. I began to write a comment to her post, but realized that it would be too long, so I write here instead. All credit to Hazel though, since without her post I wouldn’t have been inspired to write this one.

My list, as compared to Hazel’s, will not be as well-rounded, and it won’t necessarily fit every student the way her list does. Also, the software I list will only be guaranteed to work in GNU/Linux, as that is what I used in the final semesters, and have continued to use since.

First of all, a text editor. It doesn’t really matter which, just evaluate a bunch until you find one you feel comfortable with. Once you have found “the one” become intimate with it. Become a frakking Jedi-master at wielding it. I’m still a padawan-level user of Vim, but I’m getting there.

I say the same about web browsers, mail clients and instant messaging clients. Find a good one, learn as much as you can about it, and use it effectively. Firefox, Thunderbird and Pidgin are my preferred tools.

A bug-tracker, although often web based and thus in need of a web server, can often provide more “good stuff” than just tracking bugs. Stuff like statistics, or, if you think outside the box, you’d be able to track things other than bugs, which I guess is what issue-trackers do. Some of these also include a wiki-system, which makes establishing a project-specific knowledge-base kind of easy. In the one university project where we used such a system (and where I realized its potential) we used Trac.

A blogging-system with an RSS-feed capable of being filtered on tags or categories could be used to distribute status updates to other members of a group. That I’m using WordPress should be fairly obvious to all.

Use a version control system wherever and whenever possible. With the next two suggestions on the list, “wherever” will be a lot more commonplace than one might first believe, even for non-programmers. At the university we had access to SVN-servers, and also tried Mercurial, a distributed VCS. Mercurial has stuck with me ever since.

From generic suggestions, let’s go specific.

I could encourage you to check out markup languages such as reStructuredText or Markdown, to find one which suits you best and to run with it. And since I’ve now written the terms you’d need to Google, you could do that, but I’ll simply recommend LaTeX. The reason for markup languages in general, and LaTeX specifically is that you can then store your information in one plaintext format (which makes it easy to manage in version control) and can then transform it to a slew of other formats as needed.

Most of the time we needed to hand in PDFs. LaTeX excels in that and manages all the typesetting stuff and (obvious) formatting. Which leaves you with more time to focus on the content. One could also either extend LaTeX with Beamer, to create presentations, or simply generate a PDF and run Impress!ve.

For diagrams, graphs and flowcharts or representations of state-machines, Graphviz would be my recommended way to go. Again using plaintext to control the content, again with the benefits of version control. Inkscape saves files in the SVG format (again, plaintext) which might be usable (especially since it can also save files as both PS and PDF).
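As a small illustration of the plaintext idea, here is a sketch that writes a state machine as Graphviz DOT text from Python. The function name and the states are made up for the example. Rendering the result needs the Graphviz tools, for instance `dot -Tpdf machine.dot -o machine.pdf` once the text has been saved to a file.

```python
def to_dot(name, transitions):
    """Return DOT source for a directed graph.

    transitions is a list of (source, target, label) tuples. All names
    used here are hypothetical and only serve the example.
    """
    lines = ["digraph %s {" % name]
    for source, target, label in transitions:
        lines.append('    %s -> %s [label="%s"];' % (source, target, label))
    lines.append("}")
    return "\n".join(lines)


# A two-state machine: idle <-> running.
machine = [
    ("idle", "running", "start"),
    ("running", "idle", "stop"),
]
dot_source = to_dot("machine", machine)
print(dot_source)
```

Since the output is plain text, it diffs nicely in version control, which is the whole point.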

If you need graphical representations of statistical data or other plots, matplotlib could be the way to go.

I personally don’t like managing things, or management-related stuff, but lately I have been haunted by the feeling that if I used management tools, even if I would only be managing myself and my pet projects, I could be more organized and efficient. So I have started looking at TaskJuggler. It is similar to Microsoft Project, with the largest difference being that… you guessed it, you code the project plan ;D. Plaintext yet again. And then you compile the plan and TaskJuggler attempts to verify that no resources have been double-booked.

Considering each piece in this list on its own, it might seem like a waste of time to exchange one piece of software for another. I do find each of these impressive in and of themselves, but it is when they are put together, when all their strengths are combined, that you tend to get the most out of them.

The all plaintext approach I have tried, both in groupwork at the university, and later on my own, works rather well. That so many of the softwares on the list can be used to communicate and transfer information between parties is also intentional, as without communication the chance of a successful project outcome diminishes rapidly.

The last (bonus?) item on the list would be to recommend learning, at least superficially, a programming language which you could hack together small scripts with. Something which you could use to “glue” together the other parts. I adore Python, and many of the softwares listed above have python-bindings ready to use. Perl, Ruby and others, which elude me right now, would undoubtedly work equally well or better, but as with the text editor, pick a language you feel comfortable with, and rock on.

I’m in an ambivalent state about Pidgin. I really have no love for the “expanding input field” “feature” the developers introduced a couple of versions back. Frankly, I think it sucks and I am not alone (search for “input” on that page), which is why, when I update to a newer version, I download the source and modify it to always show, and ONLY show, four lines of text. No expanding, manual or automatic. Hard coded. I find that to be better than the alternative. My opinion is simply that the developers could have handled that issue much better.

But, Pidgin is not all suck. Far from it actually. Take the buddy pounce system for instance. That is a work of pure genius. There are so many options available! Your imagination is the limiting factor here.

Pidgin buddy pounce window

There is the devilishly annoying game you can play with other people by having Pidgin send them a message as they start typing to you (preempting them), or return a message when your status is away and they have just sent a message. But all this pales in comparison once you look at what options I have checked in the screenshot. When a user comes online, execute a command. I tested this, my Python script ran perfectly.

So what then could you do with this? You could turn Pidgin into a computer remote control (although I would probably advise against it) by doing something along these lines:

Register a new account, and set it up to always be online on the computer to be remotely controlled

Add your normal user to the contact list of the new user (disallow anyone else, for security reasons)

Set up a buddy pounce for this new account, so that anytime a message is sent to it a script is executed, which reads the last line in the latest log-file for that account

Do a little parsing perhaps, and then execute whatever command was sent in the message
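The steps above could be sketched roughly like this. Everything here is an assumption for the sake of illustration: the function names, the keyword whitelist, that the logs are .html files somewhere below one directory, and that the last non-empty line of the newest log is the message that triggered the pounce. Note that the sketch only maps known keywords to fixed commands. Executing the raw message text would hand the machine to anyone who manages to message the account.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical whitelist: keyword in the message -> command to run.
ALLOWED = {
    "uptime": ["uptime"],
    "disk": ["df", "-h"],
}


def latest_log(log_dir):
    """Return the most recently modified .html file below log_dir, or None."""
    logs = sorted(Path(log_dir).rglob("*.html"), key=lambda p: p.stat().st_mtime)
    return logs[-1] if logs else None


def last_message(log_file):
    """Return the last non-empty line of the log with HTML tags removed."""
    text = Path(log_file).read_text(encoding="utf-8", errors="replace")
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ""
    return re.sub(r"<[^>]+>", "", lines[-1]).strip()


def command_for(message):
    """Map the last word of the message to a whitelisted command, or None."""
    words = message.split()
    return ALLOWED.get(words[-1].lower()) if words else None


def run_pounce(log_dir):
    """Run the whitelisted command named in the newest message, if any."""
    log_file = latest_log(log_dir)
    if log_file is None:
        return None
    command = command_for(last_message(log_file))
    if command is None:
        return None
    return subprocess.run(command, capture_output=True, text=True, check=False)
```

The pounce would then call a small script that hands the log directory of the new account to `run_pounce`.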

I don’t think that I would ever use it for that, I have SSH for such things. But another thing one might do is to create a second account, add it to the contact-list of the primary account, and have the primary account pounce when the new account signs on. And the executed command could be something as simple as opening a port in the firewall on the home computer. There would of course need to be a pounce for when the user signs off as well, to close the hole.

One could probably do something more advanced as well since, through the application, you can access databases, and thus can synchronize data over several users. What I’m thinking is that executing the same pounce for several users will mean that a command (inside the script being run for each of the users) can be executed when all of the users are online, offline or whatever, but otherwise do nothing. As I said, my/your imagination is the limiting factor here.
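A minimal sketch of that idea, assuming the pounce script is told which account triggered it and whether that account signed on or off. The table, the column names and the function names are invented for the example, and SQLite is simply the database I would reach for.

```python
import sqlite3


def open_db(path):
    """Open the shared database and make sure the presence table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS presence ("
        "account TEXT PRIMARY KEY, online INTEGER NOT NULL)"
    )
    return conn


def record_status(conn, account, online):
    """Remember whether an account is currently online."""
    with conn:  # commits on success
        conn.execute(
            "INSERT OR REPLACE INTO presence (account, online) VALUES (?, ?)",
            (account, int(online)),
        )


def all_online(conn, accounts):
    """Return True only if every listed account is recorded as online."""
    wanted = sorted(set(accounts))
    if not wanted:
        return False
    placeholders = ", ".join("?" for _ in wanted)
    query = (
        "SELECT COUNT(*) FROM presence "
        "WHERE online = 1 AND account IN (%s)" % placeholders
    )
    (count,) = conn.execute(query, wanted).fetchone()
    return count == len(wanted)


conn = open_db(":memory:")
group = ["alice", "bob"]
record_status(conn, "alice", True)
first_check = all_online(conn, group)   # bob has not signed on yet
record_status(conn, "bob", True)
second_check = all_online(conn, group)
print(first_check, second_check)  # prints: False True
```

Each pounce would call `record_status` for its own account and then only run the real command if `all_online` says so.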

I’m not sure how “robust” (or fragile) the command execution is, if you have to return a value to Pidgin at end of execution, but I had my Python script terminate with a sys.exit(0) just to be on the safe side.

Last Saturday… sorry, early Sunday, way past any reasonable bedtime, I was twisting and turning, finding it impossible to fall asleep. Reading a magazine didn’t work, in fact it might just have had the opposite effect. It got my brain working, and all of a sudden an idea entered my mind.

I can’t understand it myself, so don’t bother asking, there will be no coherent or reasonable answer, but I got the idea to pull my Pidgin log-files, all 2900 of them, dating back from 2008.01.01, and have a program go through all of them, cataloging and counting the outgoing words.

Maybe it was some urge to code, maybe my subconscious has a plan for the code which it has yet to reveal to me, I couldn’t tell you, but the more I thought about it, the more the idea appealed to me. Within half an hour I knew roughly how I wanted to do it.

The premise was: 2900 HTML-formatted log-files describing interactions between me and one or more external parties. Pidgin stores each sent message on a separate line, so except for some meta-data about when the conversation took place, located at the top of the file, there was one line per message.

I wanted the code to be… I hesitate to call the resulting code “modular”, so “dynamic” might be a better word: no hard coded values about what alias to look for. This worked out fine, as I soon realized I needed a file name for the SQLite database which would store the data.

The script is called with two parameters, an alias and a path to the directory in which the logs can be found. This is also where I cheated. I should have made the script recursively walk into any sub directory in that path, looking for HTML-files, but I opted instead to move all the files from their separate sub directories into one large directory. Nautilus gives me an angry stare every time I even hint at wanting to open that directory, but handling sub directories will come in a later revision.

So, given an alias (the unique identifier which every line that shall be consumed should have) and a path, list all HTML files found in that path. Once this list has been compiled, begin working through it, opening one file at a time, and for each line in that file, determine if the line should be consumed, or discarded.
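In code, that loop could look something like the sketch below. It is a reconstruction, not the original script: the function names are made up, and it assumes that a line belongs to me if it contains the alias followed by a colon. That is a crude test (an alias that is a suffix of another alias would match too), but it shows the shape of the thing.

```python
from pathlib import Path


def html_files(directory):
    """Return the .html files directly inside directory, sorted by name."""
    return sorted(Path(directory).glob("*.html"))


def matching_lines(path, alias):
    """Yield the lines of one log file that contain the alias marker."""
    marker = alias + ":"
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if marker in line:
                yield line.rstrip("\n")


def consume(directory, alias):
    """Yield every matching line from every HTML file in directory."""
    for path in html_files(directory):
        yield from matching_lines(path, alias)
```

Calling `consume(path, alias)` with the two command line parameters then gives one long stream of lines to clean up and count.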

Since the line contains HTML-formatting, as well as the alias and a timestamp, this would be prudent to scrape away, regular expressions to the rescue. Notice the trailing “s”, simple code is better than complex code, and a couple of fairly readable regular expressions is better than one monster of an expression. So away goes HTML-formatting, smileys, timestamps and the alias. What should now, theoretically, be left, is a string of words.
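To give an idea of what “a couple of fairly readable regular expressions” might look like, here is a sketch. The patterns are my guesses at a typical log line, with a timestamp in parentheses and the alias followed by a colon, and are not copied from the script. The smiley pattern in particular only knows a handful of faces and will happily eat things like the “:/” in a URL.

```python
import html
import re

TAG = re.compile(r"<[^>]+>")                                   # HTML tags
TIMESTAMP = re.compile(r"\(\d{1,2}:\d{2}:\d{2}(?: [AP]M)?\)")  # e.g. (21:04:15)
SMILEY = re.compile(r"[:;=][-^']?[()DPpOo/\\|]")               # e.g. :-) ;P =D
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")                 # letters, optional apostrophe


def clean(line, alias):
    """Strip tags, timestamp, alias and smileys from one log line."""
    text = TAG.sub(" ", line)
    text = html.unescape(text)              # turn &amp; and friends back into text
    text = TIMESTAMP.sub(" ", text)
    text = text.replace(alias + ":", " ", 1)
    text = SMILEY.sub(" ", text)
    return text


def words(line, alias):
    """Return the lowercased words left in a line after cleaning."""
    return [word.lower() for word in WORD.findall(clean(line, alias))]


sample = (
    '<font color="#16569E"><font size="2">(21:04:15)</font> '
    "<b>me:</b></font> Hello there, it's working :-)<br/>"
)
print(words(sample, "me"))  # prints: ['hello', 'there', "it's", 'working']
```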

So that string is split up into words and fed into the SQLite database.

I was happy: this was my first attempt at working with SQLite, and thus my first attempt at getting Python to work with SQLite. It worked like a charm. Three separate queries were used. The first tried to select the word being stored. If the select returned a result, the returned value was incremented by one and written back with an update. If no result was returned, a simple insert was called.
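Roughly, the select-then-update-or-insert dance looks like this. The table and column names are invented for the example, since I am only sketching the pattern here.

```python
import sqlite3


def open_db(path):
    """Open the database and create the (hypothetical) words table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        "word TEXT PRIMARY KEY, occurrences INTEGER NOT NULL)"
    )
    return conn


def store_word(conn, word):
    """Naive approach: one select, then either an update or an insert."""
    row = conn.execute(
        "SELECT occurrences FROM words WHERE word = ?", (word,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO words (word, occurrences) VALUES (?, 1)", (word,)
        )
    else:
        conn.execute(
            "UPDATE words SET occurrences = ? WHERE word = ?", (row[0] + 1, word)
        )
    conn.commit()  # committing per word is what makes this so slow on disk


def count_of(conn, word):
    """Return the stored count for word, or 0 if it is unknown."""
    row = conn.execute(
        "SELECT occurrences FROM words WHERE word = ?", (word,)
    ).fetchone()
    return row[0] if row else 0


conn = open_db(":memory:")
for word in "the cat and the hat".split():
    store_word(conn, word)
print(count_of(conn, "the"))  # prints: 2
```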

This is of course the naive and sub-optimal way to do it, but right then I was just so thrilled about coding something that I didn’t want to risk leaving “the zone”. Needless to say, doing two queries per word, means hitting the hard drive two times per word, EVERY word, for EVERY matching line, for EVERY one of the 2900 files. Yeah, I think there is room for some improvement here.
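One possible improvement, sketched under the same made-up table layout (a words table with a word and an occurrences column): count everything in memory first with collections.Counter, and then write the totals to the database in a single transaction. This is not what the script did, just the first thing I would try.

```python
import sqlite3
from collections import Counter


def store_counts(conn, counts):
    """Add a whole batch of word counts in one transaction."""
    with conn:  # a single commit for the whole batch
        conn.execute(
            "CREATE TABLE IF NOT EXISTS words ("
            "word TEXT PRIMARY KEY, occurrences INTEGER NOT NULL)"
        )
        # Make sure every word has a row, then add the new counts on top.
        conn.executemany(
            "INSERT OR IGNORE INTO words (word, occurrences) VALUES (?, 0)",
            ((word,) for word in counts),
        )
        conn.executemany(
            "UPDATE words SET occurrences = occurrences + ? WHERE word = ?",
            ((number, word) for word, number in counts.items()),
        )


conn = sqlite3.connect(":memory:")
store_counts(conn, Counter("the cat and the hat".split()))
store_counts(conn, Counter(["the"]))  # a later file mentioning "the" once more
total = conn.execute(
    "SELECT occurrences FROM words WHERE word = ?", ("the",)
).fetchone()[0]
print(total)  # prints: 3
```

The Counter can be filled per file or for the whole run, depending on how much memory one is willing to spend.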

But I have to admit, I am impressed: in roughly four hours, give or take half an hour, I managed to put together a program which worked surprisingly well. The one regret I have right now is that I didn’t write anything in the way of tests. No unit tests, no performance tests, no nothing. Of course, had I done that, I probably would have gotten bored half way through, and fallen asleep. And tests can be written after the fact.

Well, I wrote one simple progress measurement. The loop going through the files is called through the function enumerate, so I got hold of an index indicating what file was being processed, and for each file having been closed (processed and done) I printed the message “File %d done!”. From this I was able to clock the script at finishing roughly 20 files a minute (the measurements were taken at ten minute intervals) but this is rather imprecise as no file equals another in line or word length.
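The progress message is about as simple as it sounds. A sketch, with a made-up function name and with the files-per-minute figure added so nobody has to sit with a stopwatch:

```python
import time


def process_all(paths, process):
    """Call process(path) for every path and report progress after each one."""
    started = time.monotonic()
    done = 0
    for index, path in enumerate(paths, start=1):
        process(path)
        done = index
        elapsed = time.monotonic() - started
        # Files per minute so far; guard against a zero elapsed time.
        rate = done * 60 / elapsed if elapsed > 0 else 0.0
        print("File %d done! (%.1f files/minute)" % (done, rate))
    return done
```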

It was truly inspiring to realize how much can be done, in so little time. The next step, besides the obvious room for improvement and optimization, is to use this little project as a real-life exercise to test how much I have learned by reading Martin Fowler’s Refactoring – Improving the Design of Existing Code.

The ability to walk down into sub directories should of course be added, but the most interesting thing at the moment is going to be finding a way to detect typos. The regexp rule for how to detect and split up words is a little… “stupid” at the moment.
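For the sub directory part, os.walk from the standard library does most of the work. A sketch of how the file listing might be done (the function name is made up):

```python
import os


def find_html_files(root):
    """Return all .html/.htm files below root, including sub directories."""
    found = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                found.append(os.path.join(directory, name))
    return sorted(found)
```

With that in place the logs could stay in their separate sub directories, and Nautilus could stop giving me angry stares.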

Initially (after having slept through the day following that session) I thought about the typos, and how to detect them, and how one might be able to use something like Levenshtein distance, but again, this would entail IO-heavy operations, and also start impacting the processor. There is probably some Python binding for Aspell one could use, I will have to look into that.
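For reference, the Levenshtein distance itself is not much code. Below is a plain Python sketch of the textbook dynamic programming version, plus a hypothetical helper which lists the known words within a given distance of a suspect word. It keeps the vocabulary in memory, so the cost is CPU rather than IO. The standard library also has difflib.get_close_matches, which might be good enough without any of this.

```python
def levenshtein(a, b):
    """Return the edit distance (insert, delete, substitute) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def close_matches(word, vocabulary, max_distance=1):
    """Return the known words that are near, but not equal to, word."""
    return [
        known for known in vocabulary
        if 0 < levenshtein(word, known) <= max_distance
    ]


print(levenshtein("kitten", "sitting"))  # prints: 3
print(close_matches("typoes", ["typos", "types", "tomatoes"]))  # prints: ['typos', 'types']
```

Note that a swapped pair of letters, as in “teh”, counts as two edits here, so a distance of one would not catch it.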

So, finally, why? What’s the reason? The motivation?

Well, somewhere in the back of my mind I remember having read an article somewhere which discussed the ability to use writing to identify people. So if I publish enough text on this blog, and then, on another, more anonymous blog, I publish something else, the words I use, or the frequency with which I use them, should give me away. In order to prove or disprove that hypothesis a signature would need to be identified, in the form of a database containing words and their frequency (in such a case it might even be beneficial to NOT attempt to correct spelling errors as they might indeed also be a “tell”) and then write a program which attempts to determine whether or not it is probable that this text was written by me.
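To make the idea a bit more concrete, here is one very simple way to compare two texts by word frequency: build a word count for each and compute the cosine similarity between the two. This is my own illustration and not the method from the article I half remember. Real authorship attribution surely uses more than raw word counts, and I have no idea what score would count as “probably me”.

```python
import math
import re
from collections import Counter


def profile(text):
    """Return a word -> count mapping for a piece of text."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def cosine_similarity(a, b):
    """Return a value between 0.0 (nothing shared) and 1.0 (same proportions)."""
    shared = set(a) & set(b)
    dot = sum(a[word] * b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


known = profile("well I guess that I kind of like plaintext")
unknown = profile("I guess plaintext is kind of nice")
print(round(cosine_similarity(known, unknown), 2))
```

The “signature” would then be the profile built from everything I have written, and any new text could be scored against it.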

While talking about the idea with a friend, she asked me about privacy concerns (I can only assume that she didn’t feel entirely satisfied with the thought of me sitting on a database with her words and their frequencies) and that is a valid concern. Besides the ethical ramifications of deriving data from and about people I call my friends, there is a potential flaw in trying to generate signatures for my friends from the partial data feed I am privy to. I base this potential flaw on the fact that I know that my relationship, experience and history with my various friends make for a diversified use of my vocabulary. In short, what language I use is determined by the person I speak with.