domingo, 22 de enero de 2017

This is a very quick hack for something I've wanted for some time. I'm using slack through their IRC gateway. That allows me to use erc, yasnippets, bug-reference-mode, and whatnot.

The few things that annoy me of that interface is that to ping users, in slack you are expected to prepend the nick with an at sign. That makes autocompletion kinda useless because, as this redditor explains , it's a pain in the ass to type the nick, autocomplete, go back, add the @ sign, and move forward.

I haven't found a way to solve this in an elegant way, but as a hack, I could add tihs advice functions that will understand @nick as well as 'nick' when autocompleting. I'd love to be able to just type the name and the autocompletion framework fill the '@' sign itself, but I couldn't find it, so I'll try this approach I got so far and see if it works well enough for me.

sábado, 21 de enero de 2017

Did you know that xargs can spawn multiple processes to execute the commands it gets?
It's the poor man's gnu parallel (very poor).

Well, I've been dabbling quite a bit with shell scripts lately and I've come to this balance of using shellscripts for simple (or not so simple) tasks. It's a mix of suckless and Taco Bell programming....

Well, the problem at point was to create many connections to some api endpoints. The endpoints would be serving data in a streamming fashion (think twitter hose). My idea would be to spawn several curls to a "circular" list of urls.

I thought about wrk (because lua), vegeta or siege, but I'm not sure how they would cope with 'persistent' connections, so I tried my luck with plain curl and bash.

It's funny how you can solve some issues in bash with so few lines of code. I don't have stats, or anything, but the only thing I wanted, that was to generate traffic can be done in plain simple shellscripting.

Things I learned:

- ${variable:-"default-value"} is great
- doing arithmetic in bash is next to impossible.
- xargs can spawn multiple processes, and you can control the number, so you can use the number as an upper limit.
- while true + timeout are a very nice combo when you want to generate any kind of 'ticking' or 'movement' on the loop.

jueves, 19 de enero de 2017

Let's play a little, just for the fun of it, and to gather some metainformation about a codebase.

I sometimes would like to find out the commit spans of the different authors. I may not care about the number of commits, but only first and last. This info may be useful to know if a repo has most long time commiters or the majority of contributors are one-off, or one-feature-and-forget.

First we'll sharpen a bit our unix chainsaw:

Here's a little helper that has proven to be very useful. It's similar to uniq, but you don't need to sort first, and also it accepts a parameter meaning the column you want to be unique. (Unique by Column)

function uc () {
awk -F" " "!_[\$$1]++"
}

And then, you only have to look at man git-log to find out you can reverse the order of the logs, so you can track the first and the last appearence of each commiter.

Here is the output file. it has some spurious parsing errors, but I think it still shows how powerful insights we can get with just a bit of bash, man and pipelines.

Just by chance, the day after I tried this, I discovered git-quick-stats, which is a utility tool to get simple stats out of a git repo. It's great that I could add the same functionality there too via a Pull Request :).

Big thanks to every one of the commiters in emacs, being long or short span and amount of code contributed. Thanks to y'all

This post is about a gnu parallel, a tool I recently discovered, and I'm starting to use a bit more every day.

At its core, it's a command to multiple commands in parallel, but it has many many different options to customize how the paralelization is done, notifications, and other configs. Take a look at the official tutorial or the man page, which contain a wealth of info and examples.

Let's get SICP videos

The use case I have today is to use it as a simple queuing system. I just want processes to start when I have a new job for them. The task at hand is to download all sicp lectures, at one download at a time (don't want to hog the network).

After we generated the file, we're going to run the command in the following way:

cat sicp.lisp | parallel -j1 --tmux

this makes parallel run one after the other, and just putting the outputs of each job in a tmux tab.

B, b, but.... what's the point of all this?

Ok, we didn't use parallel for anything useful, we could have run the list as a shellscript, and be happy. The idea is that we can use this simple mechanism to treat the file as a job queue, that waits for new incomming jobs and then processes them. Beware, it has very little logging, and you can't do very sophisticated error recovery (see --resume-failed), so it's NOT a replacement for resque/sidekiq/etc... In fact, I'd love to see something like a suckless queuing system based on parallel and bash.

So, here's the command to use this tool as a queuing system.

touch joblist; tail -f joblist | parallel --tmux

Then, add lines to the joblist for them to be executed. It's a very easy plumbing task:

echo "sleep 10" >joblist

tail -f works as our event loop, waiting for new tasks to come, and it will pass the tasks to parallel, that will apply the job contention depending on the number of jobs you configure it to run in parallel.

I've just scratched the surface of what parallel is able to do. Do some searching around, and take a look at the man and tutorial to get a grasp of what this amazing gnu tool is able to do. Taco Bell programming at its best!