Wednesday, May 25, 2011

Set Up SSH to Bypass GFW - The Definitive Guide

In this article, I'm gonna show you how to set up SSH to bypass the GFW, and how
to make the most of an SSH connection in various ways.

The presumed platform is Linux, but OS X will probably work fine too.

Configure SSH

Now you have acquired an SSH account. You usually get a username/password pair,
but you find it a bit inconvenient having to type the password every time you
log in. The good news is you don't have to, if you use public key
authentication instead of password authentication.
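
A minimal sketch of that setup, assuming OpenSSH's ssh-keygen and ssh-copy-id
are installed. The key path ~/.ssh/id_gfw and the user@host argument are
placeholders of mine, not values from your provider:

```shell
# Generate a key pair (if needed) and install the public key on the server.
# Usage: setup_ssh_key user@host [keyfile]
setup_ssh_key() {
    target=$1
    keyfile=${2:-$HOME/.ssh/id_gfw}   # hypothetical key path

    # Create the key pair only if it does not exist yet.
    [ -f "$keyfile" ] || ssh-keygen -t rsa -b 4096 -f "$keyfile"

    # Append the public key to ~/.ssh/authorized_keys on the server.
    # This asks for the account password one last time.
    ssh-copy-id -i "$keyfile.pub" "$target"
}
```

After running something like setup_ssh_key user@example.com, logging in with
ssh -i ~/.ssh/id_gfw user@example.com should no longer ask for the account
password (only for the key's passphrase, if you set one).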

Use Proxy Auto-Config

Since our purpose is just to unblock the blocked content, we don't really need
to connect to all web sites through the proxy. In fact, some content inside
the GFW is not available from outside of it.

It is wise to only proxify connections that otherwise would be blocked or
faked. How do you do that? One answer is Proxy Auto-Config.

OK, I see, but just how the GFW do I write a PAC file that covers all blocked
sites? There are simply too many of them!

The answer is, as usual, Open Source: the autoproxy-gfwlist project
maintains a GFW list, which covers a large list of blocked URLs, for the
Firefox AutoProxy addon. You may just use this addon, but I'd like something
lighter, and being Firefox-only, the addon is not as universal as a PAC file.

Next into the spotlight is, tada! the autoproxy2pac project! What it does is
convert the autoproxy-gfwlist list to a PAC file. The URL to the PAC file is:

http://autoproxy2pac.appspot.com/pac/TYPE/IP/PORT

TYPE being either proxy or socks, and IP and PORT being the
IP and port of your proxy. So in my case:
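
As a sketch of how the three parts fit together, here is a tiny helper that
assembles the URL. The function name is made up, and 127.0.0.1:7127 is only an
assumed local SOCKS address (the one used in the privoxy section of this
article); substitute your own:

```shell
# Build the autoproxy2pac URL for a given proxy type, IP and port.
pac_url() {
    kind=$1
    ip=$2
    port=$3
    case $kind in
        proxy|socks) ;;   # the two types the service accepts
        *) echo "TYPE must be proxy or socks" >&2; return 1 ;;
    esac
    printf 'http://autoproxy2pac.appspot.com/pac/%s/%s/%s\n' "$kind" "$ip" "$port"
}

pac_url socks 127.0.0.1 7127
# prints http://autoproxy2pac.appspot.com/pac/socks/127.0.0.1/7127
```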

The PAC file retrieved is cryptically encoded in order to bypass the GFW keyword
filtering. It's OK to use it as is, but for the sake of readability and possible
further modifications, I always decode it:

which corresponds to what we specified in the autoproxy2pac URL.
For Firefox, the proxy type specified can be PROXY | SOCKS | SOCKS4 | SOCKS5
(nsIProxyAutoConfig). PROXY is for HTTP(S) proxies, the rest for SOCKS.
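
To make the proxy type string concrete, here is a hand-written toy PAC file,
generated from the shell. It is not the autoproxy2pac output; the file name,
the single twitter.com pattern and the 127.0.0.1:7127 address are assumptions
for illustration only:

```shell
# Write a minimal hand-made PAC file (a sketch, not the generated list).
pac_file=${PAC_FILE:-./gfw-sketch.pac}   # hypothetical file name

cat > "$pac_file" <<'EOF'
function FindProxyForURL(url, host) {
    // Example pattern only; a real list covers many more hosts.
    if (host == "twitter.com" || dnsDomainIs(host, ".twitter.com")) {
        // Proxy type, then address:port of the local SOCKS proxy.
        return "SOCKS5 127.0.0.1:7127";
    }
    // Everything else connects directly.
    return "DIRECT";
}
EOF
```

For an HTTP proxy, the returned string would start with PROXY instead, followed
by the same address:port form.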

OK. The hard part is resolved. Now you just point your browser to this PAC file.
All major browsers support this feature. For Firefox 4, the setting goes:

Proxify DNS Queries

Setting a SOCKS proxy for your browser does not necessarily proxify your DNS
queries, meaning you still may not reach some of the sites (e.g. twitter.com)
due to DNS cache pollution or hijacking. In Firefox, this can be eliminated
by setting the following preference through about:config:

network.proxy.socks_remote_dns = true

For manual proxy configuration, this will bypass local DNS resolution totally,
and request DNS resolutions from the remote host, ensuring better reliability
and privacy at the same time.

For Proxy Auto-Config setups, even with socks_remote_dns, URLs that should
be requested through the SOCKS proxy still trigger local DNS queries. This does
not affect browsing, since the effective DNS resolution is done remotely, but
the counterintuitive behavior may be a privacy concern for some. To solve this
issue, set the preference network.dns.disablePrefetch to true through
about:config to disable DNS prefetch. Hosts that are not proxied will,
of course, still be resolved by your local DNS resolver.
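
If you prefer not to click through about:config, both preferences can also be
kept in a user.js file in your Firefox profile, which Firefox reads at startup.
A rough sketch; PROFILE_DIR is an assumption and must point at your actual
profile directory (it defaults to the current directory here so that nothing
gets touched by accident):

```shell
# Append both preferences to a Firefox user.js file.
profile_dir=${PROFILE_DIR:-.}   # assumed location; set to your real profile

cat >> "$profile_dir/user.js" <<'EOF'
// Resolve host names on the remote side of the SOCKS proxy.
user_pref("network.proxy.socks_remote_dns", true);
// Stop speculative DNS lookups from hitting the local resolver.
user_pref("network.dns.disablePrefetch", true);
EOF
```

Restart Firefox afterwards for the settings to take effect.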

There are other more general ways to do DNS queries on the remote side. We will
cover those later.

Proxify Any Program

Not everything is done inside your browser, and not every program supports all
types of proxies. Luckily, there tends to be an answer for every problem ;)

For example, the excellent downloader wget does not understand SOCKS. This
is where tsocks comes in:

$ tsocks wget -qO- http://ip.appspot.com

will print the IP address of the remote host. Cool? Indeed.
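
tsocks has to be told where the SOCKS server lives. A minimal sketch of a
configuration file, assuming the SSH SOCKS proxy listens on 127.0.0.1:7127 and
your home network is 192.168.1.0/24 (both are assumptions, adjust to taste):

```shell
# Write a small tsocks configuration file.
conf=${TSOCKS_CONF_FILE:-./tsocks.conf}

cat > "$conf" <<'EOF'
# Reach the local network directly, without the proxy.
local = 192.168.1.0/255.255.255.0
# The SOCKS5 server opened by ssh -D (address and port assumed).
server = 127.0.0.1
server_port = 7127
server_type = 5
EOF
```

The system-wide location is usually /etc/tsocks.conf. To try a private copy
first, point the TSOCKS_CONF_FILE environment variable at it when running
tsocks.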

And there's something even cooler: proxychains. Apart from its ability to
chain mixed types of proxies all into one proxy chain, it supports DNS tunneling.
This is important, as I explained earlier regarding the Firefox preference
network.proxy.socks_remote_dns.
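
A minimal proxychains configuration with DNS tunneling switched on might look
like the sketch below. The single-proxy chain and the 127.0.0.1:7127 address
are assumptions; proxychains looks for proxychains.conf in the current
directory before falling back to the per-user and system-wide files:

```shell
# Write a one-hop proxychains configuration with remote DNS.
conf=${PROXYCHAINS_CONF:-./proxychains.conf}   # hypothetical variable name

cat > "$conf" <<'EOF'
# Use the proxies in the order listed; fail if one is down.
strict_chain
# Resolve host names through the proxy chain, not locally.
proxy_dns

[ProxyList]
# type  host       port
socks5  127.0.0.1  7127
EOF
```

With that file in place, running proxychains wget -qO- http://ip.appspot.com
from the same directory should go through the chain, DNS lookup included.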

Privoxy As HTTP Proxy

So far, we have been focusing on proxying connections directly through the SOCKS
proxy, and have managed to do so for almost all types of applications. In this
section, I'm gonna introduce you to an alternative approach.

Privoxy is an HTTP proxy that supports forwarding requests to HTTP or SOCKS
proxies. It is widely used in conjunction with Tor, for example.

Basically, we put privoxy between applications and the SSH SOCKS server, acting
as an HTTP proxy as far as the applications are concerned.

This alternative approach simplifies things for HTTP(S) connections:

Any program that understands HTTP proxies can talk to privoxy directly.

With SOCKS5, all DNS resolution will happen on the remote server, rendering
network.proxy.socks_remote_dns moot, and all programs using this HTTP
proxy will get correct DNS resolutions as well.

Privoxy supports flexible forwarding rules, effectively voiding the need for
a PAC file, and being universal for any program.

If that sounds interesting, read on.

To set up privoxy to forward requests to your SSH SOCKS5 server, put this in
/etc/privoxy/config:

forward-socks5 / 127.0.0.1:7127 .

Now make sure you have ssh -D running, and start the privoxy daemon with
the command rc start privoxy, for example.

By default, privoxy listens on localhost:8118. Quickly test it out with
your browser; for Firefox users:

As for PAC, you need to change the proxy type to PROXY. Note also that, for
PAC setups, DNS prefetch will always query local DNS resolvers regardless of
whether the proxy does the DNS resolution or not. Set network.dns.disablePrefetch
to true to disable this behavior, as mentioned before.

Wait, screw Proxy Auto-Config! (Yes I promised.) Let's opt for privoxy's builtin, flexible
pattern-based forward rules. I'm not going to cover the details here, but
give you the answer to the question that is now on your mind: Yes, there is a
written list of GFW proxy forward rules for privoxy! And of course, it's
based on autoproxy-gfwlist, no doubt about that :)

You should read the README on the AutoProxy2Privoxy page for instructions, but here is
what you need to do in brief. First you need to edit the following line in
gfw.action according to address:port of your SOCKS proxy:

Privoxy should automatically pick up the new config. Now just point your program
to privoxy, which will automatically determine whether to forward to SOCKS or not.
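
To give a feel for how pattern-based forwarding works, here is a hand-written
toy action file. It is not the real gfw.action; the file name, the single
twitter.com pattern and the 127.0.0.1:7127 address are all assumptions for
illustration:

```shell
# Write a tiny privoxy action file that forwards one site through SOCKS5.
action_file=${ACTION_FILE:-./gfw-sketch.action}   # hypothetical file name

cat > "$action_file" <<'EOF'
# Requests matching the patterns below bypass the default forwarding
# and go through the SOCKS5 proxy opened by ssh -D instead.
{+forward-override{forward-socks5 127.0.0.1:7127 .}}
.twitter.com
EOF
```

Privoxy loads such a file through an actionsfile line in its main config. The
leading dot in the pattern matches the domain itself and all of its subdomains.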

As a side note, the name privoxy stands for "Privacy Enhancing Proxy". It has
advanced filtering capabilities and can be used to enhance your privacy, or block
obnoxious advertisements. I suggest you read more about it.

Share Your Proxy

In the examples above, we always set up the proxy to listen on the loopback
interface, providing service exclusively to our own computer. In reality, you
probably want to share this proxy with other devices, for example to let your
iPhone use Twitter through it.

Usually it's safe to allow access from clients in your local home network, but
if you want to serve clients from the internet, make sure you set up your firewall
correctly.

Example for a privoxy setup (/etc/privoxy/config):

listen-address 192.168.1.100:8118

192.168.1.100 is the static address assigned to my computer by my
router, so that I don't have to change this address every now and then.
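
If you want privoxy itself to turn away clients from outside your home network,
its permit-access directive can help (provided your build has ACL support). A
sketch, assuming the home network is 192.168.1.0/24; the file name is made up,
and the two lines would normally go into /etc/privoxy/config:

```shell
# Write the sharing-related privoxy settings to a scratch file.
conf=${PRIVOXY_SHARE_CONF:-./privoxy-share.conf}   # hypothetical file name

cat > "$conf" <<'EOF'
# Listen on the LAN address instead of localhost only.
listen-address 192.168.1.100:8118
# Only accept clients from the local home network (assumed range).
permit-access 192.168.1.0/24
EOF
```

This complements a firewall rather than replacing it.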

Of course, you can put the proxy on your router, which is better. It just might
take a little more work to set it up.

Now if you have your iPhone connected to the same wireless network, point it to
the HTTP proxy at 192.168.1.100:8118. If it doesn't work, try
restarting privoxy.

Another example to share the SOCKS proxy on all interfaces:

$ ssh -v24NnD :7127 ssh.gfw

Be careful with that.

That's about it, in brief. For more, read the manpages; Google is your friend.