Month: March 2014

There are a lot of valid use cases where you need to protect your identity while communicating over the public internet. It is 2013 and so you probably already know about Tor. Most people use Tor through the browser. The cool thing is that you can access the Tor network programmatically, so you can build interesting tools with privacy built in.

The most common use case for hiding your identity with Tor, or for changing identities programmatically, is crawling a website like Google (well, this one is harder than you think) when you don’t want to be rate-limited or blocked.

This did take a fair amount of trial and error to get working, though.

Tor
First of all, let’s install Tor.

apt-get update
apt-get install tor
/etc/init.d/tor restart

You will notice that the SOCKS listener is on port 9050.
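If you would rather confirm the listener from code than from the startup log, here is a minimal Python sketch. The helper name port_is_open is my own, and it only checks that something accepts TCP connections on the port, not that it is actually Tor.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0


# Example: port_is_open("127.0.0.1", 9050) should be True once Tor is running.
```

The same check works for the other ports used in this setup.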

Let’s enable the ControlPort listener so that Tor listens on port 9051. This is the port Tor uses for communication with applications that talk to the Tor controller. The hashed password enables authentication on the port, to prevent random access to it. Both are set in /etc/tor/torrc through the ControlPort and HashedControlPassword options, and the hash itself can be generated with tor --hash-password.

Privoxy
Tor itself is not an HTTP proxy. So, to get access to the Tor network, we will use Privoxy as an HTTP proxy that forwards through SOCKS5.

Install Privoxy.

apt-get install privoxy

Now let’s tell Privoxy to use Tor. This will tell Privoxy to route all traffic through the SOCKS server at localhost port 9050.
Go to /etc/privoxy/config and enable forward-socks5:

forward-socks5 / localhost:9050 .

Restart Privoxy after making the change to the configuration file.

/etc/init.d/privoxy restart

Script:
In the script below, we’re using urllib2 to send requests through the proxy. Privoxy listens on port 8118 by default, and forwards the traffic to port 9050, which the Tor SOCKS listener is on.
Additionally, in the renew_connection() function, I am also sending a signal to the Tor controller to change the identity, so you get new identities without restarting Tor. You don’t have to change the IP, but it sometimes comes in handy when you are crawling and don’t want to be blocked based on IP.
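As a rough illustration of the two pieces described above, here is a minimal Python 3 sketch. It is written under a few assumptions: it uses urllib.request (the Python 3 successor to urllib2), it talks to the control port over a plain socket instead of a controller library, and the helper name make_opener, the password argument, and the example URL in the comments are placeholders of mine. It also assumes the control password contains no quotes or backslashes.

```python
import socket
import urllib.request

PRIVOXY_URL = "http://127.0.0.1:8118"            # Privoxy's default listen address
CONTROL_HOST, CONTROL_PORT = "127.0.0.1", 9051   # Tor ControlPort enabled earlier


def make_opener(proxy_url=PRIVOXY_URL):
    """Return a urllib opener that sends HTTP and HTTPS requests through Privoxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def renew_connection(password, host=CONTROL_HOST, port=CONTROL_PORT, timeout=10):
    """Ask Tor for a new identity; return True if Tor accepted both commands."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("rb") as reader:
            # The control protocol is line based: send a command, read one reply.
            for command in ('AUTHENTICATE "%s"' % password, "SIGNAL NEWNYM"):
                conn.sendall((command + "\r\n").encode("ascii"))
                reply = reader.readline()
                if not reply.startswith(b"250"):   # 250 means the command succeeded
                    return False
    return True


# Example use (needs Tor and Privoxy running; URL and password are placeholders):
#   opener = make_opener()
#   print(opener.open("http://example.com/").read()[:80])
#   renew_connection("my_password")
```

Tor may take a few seconds to build a new circuit after NEWNYM, so a request sent immediately afterwards can still leave from the old exit.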

Typically, with Ansible you create one or more hosts files, which it calls inventory files, and Ansible picks the servers from them and runs the playbooks on those servers. This is a simple and straightforward way to do it. However, if you are using the cloud, it’s very likely that your applications are creating and deleting servers based on some other logic, and it’s very impractical to maintain a static inventory file. In that case, Ansible can talk directly to your cloud (AWS, Rackspace, OpenStack, etc.) or to a dynamic source (Cobbler, etc.) through what it calls Dynamic Inventory plugins, without you having to maintain a static list of servers.

Here, I will go through the process of using the Rackspace Public Cloud Dynamic Inventory Plugin with Ansible.

Install Ansible
First of all, if you have not already installed Ansible, go ahead and do so. I like to install Ansible inside a virtualenv using pip.

Install Rax Dynamic Inventory Plugin
Ansible maintains an external RAX inventory script in its repository (not sure why these plugins do not get bundled with the Ansible package). The rax.py script depends on the pyrax module, which is the client binding for the Rackspace Cloud.

Run rax.py
As you can see, rax.py is a very simple script that provides a couple of methods to list and show servers in your cloud. By default, it grabs the servers in all Rackspace regions. If you are interested in only one region, you can set the RAX_REGION environment variable.
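To look at what the inventory script reports from your own code, here is a minimal Python sketch. It assumes the script follows the usual Ansible inventory convention of printing JSON when called with --list, that it runs under the same Python interpreter as the caller, and that any credentials it needs are already exported in the environment. The function names list_inventory and hosts_in_group are mine.

```python
import json
import os
import subprocess
import sys


def list_inventory(script_path, region=None):
    """Run an inventory script with --list and return its JSON output as a dict."""
    env = dict(os.environ)            # keep credentials and settings already exported
    if region:
        env["RAX_REGION"] = region    # restrict the listing to one region
    result = subprocess.run(
        [sys.executable, script_path, "--list"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def hosts_in_group(inventory, group):
    """Return a group's hosts, accepting both the list and {"hosts": [...]} forms."""
    entry = inventory.get(group, [])
    if isinstance(entry, dict):
        return entry.get("hosts", [])
    return entry


# Example: hosts_in_group(list_inventory("rax.py", region="DFW"), "staging-apache")
```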

Create Cloud Servers
Since you already have pyrax installed as a dependency of the rax.py inventory plugin, you can use the command line to create a cloud server named ‘staging-apache1’ and tag the server as part of the staging-apache group using the metadata key-value feature.

If you want to install Apache on more staging servers, you would create a server named staging-apache2 and tag it with the same group name, staging-apache.

Also note that we are injecting SSH keys into the servers on creation, so Ansible will be able to log in over SSH without a password. With Ansible, you also have the option of using a username and password if you prefer.

Once the server is booted, let’s make sure Ansible can ping all the servers tagged with the group staging-apache.

ansible -i rax.py staging-apache -u root -m ping

Run a sample playbook
Now, let’s create a very simple playbook to install Apache on the inventory.

Let’s run the apache playbook on all Rackspace servers in the DFW region that belong to the staging-apache group.

RAX_REGION=DFW ansible-playbook -i rax.py apache.yml

With static inventory, you’d be doing this instead, and manually updating the hosts file:

ansible-playbook -i hosts apache.yml

Now you can ssh into the staging-apache1 server and make sure everything is configured as per your playbook.

ssh -i ~/.ssh/id_rsa root@staging-apache1

You may add more servers to the staging-apache group, and on the next run, Ansible will detect the updated inventory dynamically and run the playbooks on them too.

Rackspace Public Cloud is based on OpenStack Nova, so the nova.py inventory should work pretty much the same way. You can look at the complete list of dynamic inventory plugins here. Adding a new inventory plugin for something that isn’t already there, say Razor, would be fairly simple.
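To give an idea of how little such a plugin needs, here is a minimal Python sketch of an inventory script. It is not how rax.py or any existing plugin is implemented. The SERVERS list is hypothetical stand-in data, the function names are mine, and a real plugin would query its source (Razor, for example) instead. The parts that follow Ansible’s convention are the --list and --host arguments and the _meta/hostvars key in the JSON output.

```python
import argparse
import json

# Hypothetical server records; a real plugin would fetch these from its source.
SERVERS = [
    {"name": "staging-apache1", "address": "10.0.0.11",
     "metadata": {"group": "staging-apache"}},
    {"name": "staging-apache2", "address": "10.0.0.12",
     "metadata": {"group": "staging-apache"}},
    {"name": "untagged-box", "address": "10.0.0.20", "metadata": {}},
]


def build_inventory(servers):
    """Group servers by their "group" metadata value and collect per-host variables."""
    inventory = {"_meta": {"hostvars": {}}}
    for server in servers:
        # ansible_ssh_host is the older variable name; newer releases use ansible_host.
        inventory["_meta"]["hostvars"][server["name"]] = {
            "ansible_ssh_host": server["address"],
        }
        group = server["metadata"].get("group")
        if group:
            inventory.setdefault(group, {"hosts": []})["hosts"].append(server["name"])
    return inventory


def main(argv):
    """Return the JSON text Ansible expects for --list or --host NAME."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--list", action="store_true")
    parser.add_argument("--host")
    args = parser.parse_args(argv)
    inventory = build_inventory(SERVERS)
    if args.host:
        return json.dumps(inventory["_meta"]["hostvars"].get(args.host, {}))
    return json.dumps(inventory if args.list else {})
```

To use it with ansible -i, the file would need to be executable and print the result of main(sys.argv[1:]) under the usual main guard.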