
If you’re using a lot of cheap VPSes like me, then you need a way to ship database backups to a third-party site. If you’re on EC2, you’re probably shipping them to Amazon S3, and other providers have their own automated backup offerings. I have a lot of smaller sites that run on cheap VPSes with no backup solution, so I tend to use Dropbox to sync them out to external storage. It’s not the best way, but it’s definitely cheap (read: free) and quick to set up. More importantly, you can automatically have the backups sync to your local machine, or any machines that you want!

Dropbox basically designates a folder on your computer as a sync folder. Any change to that folder is synced to the cloud automatically, as long as the Dropbox service is running locally.

Configure
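First, download and start the headless Dropbox client. A sketch of that step, using the standard headless-install URL (adjust the `plat` parameter for 32-bit machines):

```shell
# Fetch and unpack the headless Dropbox client into ~/.dropbox-dist
# (use plat=lnx.x86 on a 32-bit machine)
cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -

# First run in the foreground: it prints the account-linking URL
~/.dropbox-dist/dropboxd
```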

This will print out a link. Visit that link and it will ask you to create an account (or sign in), which automatically hooks that Dropbox account up to the local or server machine you ran it on.

This client is not linked to any account...
Please visit https://www.dropbox.com/cli_link?host_id=7aaaaa8f285f2da0x67334d02c1 to link this machine.

Once you’ve signed up, you’ll actually see the dropboxd command line acknowledge the account link. You can close the daemon with Ctrl+C. To run it again in the background so it always syncs automatically, run

~/.dropbox-dist/dropboxd &

Usage

The dropboxd daemon has to be running for automatic sync, so make sure it’s up by checking htop or top. Once it is, you should also see a directory called Dropbox in your home folder.

cd ~/Dropbox

To back up anything, simply move it into this folder. For example, I periodically run a cronjob that dumps the live SQL database to this folder; see below for an example. You can dump hourly backups of your database this way and they’ll be uploaded to Dropbox. You can also link your local machine to the same account and they’ll be downloaded there as well. Instant gratification!

mysqldump --single-transaction -h HOST -u USER -p DBNAME > ~/Dropbox/dump.sql
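The cronjob itself could look like the following crontab fragment (host, user, password, and database names are placeholders; install it with `crontab -e`):

```shell
# m h dom mon dow   command  -- run at minute 0 of every hour
0 * * * * mysqldump --single-transaction -h HOST -u USER -pPASSWORD DBNAME > "$HOME/Dropbox/dump.sql"
```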

More Dropbox tricks to come later.


You can easily make apt-get use a download accelerator for faster upgrades with apt-fast, a simple script that wraps apt-get around Axel, a command-line application that accelerates HTTP/FTP downloads by fetching a file from multiple sources at once.
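A rough sketch of the setup, assuming you’ve downloaded the apt-fast script separately and put it somewhere on your PATH (the package name below is a placeholder):

```shell
# axel is in the standard Ubuntu repos
sudo apt-get install axel

# once the apt-fast script is on your PATH, use it as a drop-in
# replacement for apt-get; downloads go through axel
sudo apt-fast install some-package
sudo apt-fast upgrade
```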


If you’re a freelancer, then you’re probably working on multiple projects at the same time. If by any chance the projects use different versions of Django, then you’re pretty much in version hell, writing bash scripts to change paths and committing all the other sins. This is where virtualenv will save your rear end.

virtualenv is a Python module that helps you isolate virtual Python environments. Each environment has its own python executable and its own site-packages, effectively allowing you to install completely different dependencies for each project.
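The install step itself, for both virtualenv and virtualenvwrapper, is presumably something like:

```shell
# Installs virtualenv and virtualenvwrapper into the global site-packages;
# watch the output for the path to virtualenvwrapper.sh
sudo pip install virtualenv virtualenvwrapper
```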

This should install both packages into the global Python site-packages. One thing to note during installation is the path to virtualenvwrapper.sh, which is printed during installation. Typically it’s found at /usr/local/bin/virtualenvwrapper.sh. We’ll need this later.

There are some extra steps to set up virtualenvwrapper. If you’re on *nix, add the following lines at the top of ~/.bashrc. Here we’re just specifying that we want to save all our environments in a particular directory.
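The lines look roughly like this (the WORKON_HOME directory is your choice; the script path is whatever was printed during installation):

```shell
# keep all virtual environments under one directory
export WORKON_HOME=$HOME/.virtualenvs
# load the wrapper's shell functions (mkvirtualenv, workon, ...)
source /usr/local/bin/virtualenvwrapper.sh
```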

Usage

mkvirtualenv --no-site-packages myenv

This will create an environment that does not import the global packages. This is important: it means the environment only uses the libraries and versions installed locally inside it. I prefer to do this on all production environments, to make sure upgrading a global package doesn’t affect the web app running in the current virtualenv. The only downside is that you have to install all the modules again in each new environment you create. If you skip the --no-site-packages argument, the environment will also use all the modules on your global Python path.

To activate a particular environment, virtualenvwrapper gives us some shortcut scripts. Activating one changes the prompt to show the name of the current environment, e.g. (myenv)$>

workon myenv

Tips

You now have a brand new virtual environment to play with. While it’s active, you can install pretty much any module and it will be installed only for this environment; your global packages stay safe as ever. It’s also a good idea to keep all your project dependencies in a requirements.txt file.

cd ~/go/to/project/dir
pip install -r requirements.txt

A sample requirements file could look like the following.

django==1.2.4
South==0.7.2

You can run pip freeze > requirements.txt to port an existing virtual environment to a requirements file. Now you can add a step to your deployment script to install all required dependencies each time. Fabric is also an awesome tool that every Django programmer should know and use.

There are other awesome things that will save you time. If you go to your WORKON_HOME directory, you will notice files like postactivate and postdeactivate. These are global shell scripts that you can add commands to; they are executed on each activate, deactivate, and so on. In fact, each environment directory inside can have its own hooks like these. Enough spoon feeding; browse the docs to find out more.
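As a toy example, here’s a global postactivate hook that jumps into a project directory named after the environment. The ~/projects layout is purely my assumption:

```shell
# $WORKON_HOME/postactivate -- sourced after every `workon`
# Jump to ~/projects/<envname> if such a directory exists
# (the ~/projects layout is an assumption, adjust to taste)
proj="$HOME/projects/$(basename "$VIRTUAL_ENV")"
if [ -d "$proj" ]; then
    cd "$proj"
fi
```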

That’s it!

But wait, it’s not that easy. If you’re running a Django project like this, you’ll have a hell of a time installing modules like mysql-python or PIL with pip inside a virtualenv. They need to be compiled, and due to the changed paths the correct libraries aren’t found on a fresh system. I’ll explain some more common gotchas in my next post.


Continuing on from the last post: Django provides an excellent hook for processing exceptions in middleware. You can add a new middleware class and override the process_exception method to attach any kind of info to the request object. This info then shows up in the errors displayed on screen (in debug mode) or sent as email. Profit!

class ExceptionUserInfoMiddleware(object):
    """
    Adds user details to request.META, so that they show up in the error emails.

    Add to settings.MIDDLEWARE_CLASSES and keep it outermost (i.e. on top if
    possible). This allows it to catch exceptions in other middlewares as well.
    """

    def process_exception(self, request, exception):
        """
        Process the exception.

        :Parameters:
          - `request`: request that caused the exception
          - `exception`: actual exception being raised
        """
        try:
            if request.user.is_authenticated():
                request.META['USERNAME'] = str(request.user.username)
                request.META['USER_EMAIL'] = str(request.user.email)
        except:
            pass
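To enable the middleware, add it to MIDDLEWARE_CLASSES in settings.py, on top. The module path 'yourproject.middleware' below is hypothetical and depends on where you put the class:

```python
# settings.py -- the 'yourproject.middleware' path is an assumption
MIDDLEWARE_CLASSES = (
    'yourproject.middleware.ExceptionUserInfoMiddleware',  # keep outermost
    'django.middleware.common.CommonMiddleware',
    # ... the rest of your middleware ...
)
```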


Django error emails on 500 errors are pretty useless if you’re the acting customer support for the day. There’s not much besides a session ID to identify which user actually got the exception. For a startup, that’s essential for reaching out to the few customers we do have, if for nothing else than to apologize. Anyway, here’s a simple little snippet that can help you figure out a user from a session ID. Note that the user might not be logged in; in that case there’s not much you can do.
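A sketch of such a lookup, assuming Django’s database-backed session engine (the helper name is mine):

```python
# Look up the user behind a session key taken from an error email.
# Assumes the database-backed session engine is in use.
from django.contrib.sessions.models import Session
from django.contrib.auth.models import User

def user_from_session_key(session_key):
    """Return the User for a session key, or None if anonymous/expired."""
    try:
        session = Session.objects.get(session_key=session_key)
    except Session.DoesNotExist:
        return None
    uid = session.get_decoded().get('_auth_user_id')
    if uid is None:
        return None  # the user wasn't logged in
    try:
        return User.objects.get(pk=uid)
    except User.DoesNotExist:
        return None
```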


Here’s a quick git tip that I end up searching for each time I need it, so I’m posting it here. There have been plenty of times I’ve added a tag, pushed it to the remote, and realised I’d named it wrong, e.g. 1.59 instead of 1.49. To fix it, you need to add the correctly named tag, delete the wrong one both locally and on the remote, and push the new tag.
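The steps look roughly like this, using the tag names from the example and assuming the remote is called origin:

```shell
# Create the correct tag pointing at the same commit as the wrong one
git tag 1.49 1.59

# Delete the wrong tag locally, then on the remote
git tag -d 1.59
git push origin :refs/tags/1.59

# Push the correct tag
git push origin 1.49
```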


Ubuntu Lucid’s repos only contain Apache 2.2.14. If you intend to install a newer version, you don’t have to worry about downloading and building all the dependencies yourself: luckily, the new Maverick repos contain a newer version. Thus, all you need to do is add them to your sources.list.

#Add the following line to /etc/apt/sources.list
#deb http://cz.archive.ubuntu.com/ubuntu maverick main
#and run the commands below.
sudo apt-get update
sudo apt-get install apache2

Also remember to remove the new repo afterwards, otherwise apt might try to upgrade everything to Maverick versions, and you certainly don’t want that. I recently upgraded to Maverick, and I’m loving the new font!


I recently switched to Ubuntu Lucid from Windows XP. Switching to a new OS is probably harder than the decision to change one’s religion. Being a web developer, IE is a staple diet for most of us, and unfortunately I couldn’t get IEs4Linux to run on Lucid. After trying to figure out loads of install issues, and then almost settling on VirtualBox, I found out about winetricks.

wget http://www.kegel.com/wine/winetricks
sh winetricks

It should pop up a box of tools you can install; simply select IE8 or any of the other goodies on offer.

I was reading a post yesterday. It touches on the topic of social filtering, or collaborative filtering; I’m not sure yet if those are equivalent terms.
It raises questions about recommendations and popularity, but from the other end. Most recommendation engines, in my opinion, focus on similarity of attributes, doing matrix manipulations to come up with more of the same. The reasons are obvious: people like something and want to see more of it. If you show them items more aligned with their interests, they are more likely to buy stuff, spend more time on your site, and improve a variety of such positive metrics.

I’ve always thought that was an incomplete way of going about it. In anything involving social behavior, there are infinitely many aspects we might never fully understand. All we can hope to do is capture and simplify everything into the essential bits that explain most of it. I’d rather see recommendations as:

recommendations = similarity + serendipity + sieving

similarity – this is simply including the items that we think are most similar based on some finite attribute set. This is where the current focus is and what collaborative filtering tries to do.

serendipity – a good recommendation system should have an element of surprise built in as well. Given the same item, the result set should not be fully predictable: tastes change over time, and in addition a StumbleUpon-like feature has potential for interesting user feedback.

sieving – another kind of filtering: simply the act of filtering out stuff I would hate.

I think sites like http://www.thesixtyone.com do a good job of hooking people in with good music. The reason I spend so much time on that site is that they give you a feeler first, of tried, tested, proven music. If the people before you have liked a song a lot, the probability that you’ll feel the same way goes up. They also have an explore section where I can set the level of “adventure” and listen to completely random new music, which is the most interesting aspect of the site for me. It’s probably hard to figure out exactly what algorithm they or anyone else use to recommend the next item, but it’s easy to get a feel for it based on the kinds of data they collect.

Facebook Like buttons are everywhere now, adding a social layer on top of external links that tells us something important about their popularity. It also tells me Facebook wants to know what people read, watched, or listened to. But I still think they’re missing important pieces of the puzzle. Why shouldn’t there be a dislike button? Why are companies not collecting any negative metrics? PageRank builds on the implicit recommendations in links, but does the metadata even exist on the Internet for generic web pages to specify dissimilarity? Sites like Digg and Reddit have downvotes, which is a start. My Yahoo Hackday project was an anti-recommendation engine for music and movies (it used to live at http://epcntr.appspot.com), but it didn’t do a very good job, and the reason was simply the lack of the right dataset. The questions being asked right now, of course, do not provide enough data to answer the converse questions.

How can we start collecting the right data? How would it improve the web if we had additional data on “hate patterns”? Does the upvote/downvote data from Digg and Reddit tell us something more about all the uncharted anti-matter on the Internet? Like the article asks, what would a negative PageRank look like? I have no clue, but collecting the right data might be a start.