About policy routing

In classic routing, the destination address of an IP packet determines how the packet is routed. Policy routing is an advanced type of routing that lets you base the routing decision on parameters other than the destination IP address. For instance, you may want to use the source address or the port to make the routing decision. Policy routing is only possible with the new generation of network tools, so you will need the iproute2 programs. The route command cannot do policy routing. The good news is that support for policy routing was merged into the Linux kernel a long time ago, so you don't have to patch your kernel. The support is optional, but most Linux distributions enable it by default.

Since the old routing tools are still supported, it's not possible to mix the standard routes based on the destination addresses and the advanced routes based on extended parameters in a single table. That's why you have to use several routing tables. You write rules that tell the kernel which routing table to use for each network packet. For instance, you can say that TCP packets with a specific destination port will be routed using a specific routing table, and that all other packets will use the main routing table.

The routing tables

Policy routing requires more than one routing table. Linux-2.6 supports up to 255 different routing tables. By default, with classic routing, you use just two tables: the local routing table and the main routing table. With policy routing, you will create other routing tables. The table names are declared in /etc/iproute2/rt_tables, which maps each table number to a name.
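A custom table only needs a free number, but giving it a name in /etc/iproute2/rt_tables makes the rules easier to read. The sketch below shows one possible way to register a name without creating duplicates. The helper add_rt_table and the table number 100 are assumptions made for this example; they are not part of iproute2.

```shell
#!/bin/sh
# add_rt_table FILE ID NAME  (hypothetical helper)
# Append "ID NAME" to FILE (normally /etc/iproute2/rt_tables) unless a
# table called NAME is already declared there.
add_rt_table() {
    file=$1 id=$2 name=$3
    # An existing declaration looks like "<number><blanks><name>".
    if grep -Eq "^[[:space:]]*[0-9]+[[:space:]]+$name[[:space:]]*\$" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\t%s\n' "$id" "$name" >> "$file"
}

# Example (as root): add_rt_table /etc/iproute2/rt_tables 100 rt_table1
```

Calling the helper twice with the same name leaves a single entry in the file, so it is safe to use from a script that runs at every boot.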

The normal routing tables

The local routing table is managed automatically by the kernel. The user does not have to take care of this table. It stores all the local addresses, and it lets the kernel know whether a network packet has to be delivered locally (on the local machine) or to another computer (in which case it is routed, if forwarding is allowed). It's the first real routing table the kernel consults when it performs a lookup, just after the routing cache, which is a special routing table.

The main routing table is used for all the other addresses by default. This is the routing table used by the route command, and it's also the table that is used when you don't specify the name of the table with ip route.

The custom routing tables are all the other tables. They are used when a network packet matches the advanced routing rules specified by ip rule. They are ordinary routing tables, so they can have a default route.

The routing cache table

The routing cache table is a special routing table managed automatically by the kernel to improve performance. There is only one routing cache table even if you have several routing tables configured. The cache is where the kernel saves the results of its recent routing lookups. It only saves lookup results for specific IP addresses; it does not save routing information about subnets. The routing cache can be manipulated with ip route like any other table. It can be viewed with ip route show cache and flushed with ip route flush cache. The cache is the first routing table used by the kernel every time it makes a lookup. It's used even before the local routing table.

Rules and policies

The rules tell the kernel what action to take for each kind of network packet. The action is often to use a custom routing table, but it may also be a specific action such as throw, unreachable, prohibit or blackhole. The rules are where the advanced packet matching happens. Here are the parameters that you can use to decide which routing table to use:

the IP source address and the IP destination address

the ingress device, that is, the interface the packet arrived on

the TOS (Type Of Service) field of the IP packet header

the fwmark number (firewall mark). It's an attribute that is set in netfilter, so you have to use iptables to set it.
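The port is not in the list above, so a port-based policy goes through fwmark: netfilter marks the packet, and a rule selects the table for that mark. The following is a minimal sketch, assuming forwarded TCP traffic. The helper name, port 80, mark 1 and the table name rt_table_adsl are illustrative choices. The function only prints the commands so that they can be reviewed before being run as root.

```shell
#!/bin/sh
# print_port_policy PORT MARK TABLE  (hypothetical helper)
# Print (not run) the two commands that send TCP traffic for PORT
# through routing table TABLE, using MARK as the firewall mark.
print_port_policy() {
    port=$1 mark=$2 table=$3
    # 1. netfilter: mark matching packets before the routing decision.
    #    PREROUTING covers forwarded traffic; locally generated packets
    #    would need the same rule in the OUTPUT chain of the mangle table.
    echo "iptables -t mangle -A PREROUTING -p tcp --dport $port -j MARK --set-mark $mark"
    # 2. policy rule: marked packets are looked up in the custom table.
    echo "ip rule add fwmark $mark table $table"
}

print_port_policy 80 1 rt_table_adsl
```

Once the printed commands look right, the output can be piped to a root shell to apply them.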

Examples of rules

All the packets from 192.168.114.0/24 should use the routing table named rt_table1:

ip rule add from 192.168.114.0/24 table rt_table1

All the packets from 192.168.5.1 to 172.16.1.100 should not be routed and unreachable must be returned via ICMP:

ip rule add from 192.168.5.1 to 172.16.1.100 unreachable

All the packets from 192.168.5.1 should not be routed, and "communication administratively prohibited" must be returned via ICMP:

ip rule add from 192.168.5.1 prohibit

All the packets that were marked with fwmark=1 by netfilter (you can do that with iptables) should be routed using rt_table_adsl:

ip rule add fwmark 1 table rt_table_adsl

All the packets that were marked with fwmark=2 by netfilter (you can do that with iptables) should be routed using rt_table_cable:

ip rule add fwmark 2 table rt_table_cable

You can remove an existing rule using a syntax similar to the add subcommand. Here is how to remove the last rule we added:

ip rule del fwmark 2 table rt_table_cable

How rules are processed

The kernel supports up to 32767 rules. By default only the following rules are used:

% ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default

The rules are evaluated in order of priority, from priority 0 to priority 32767. This means that the first rule (rule 0) is the first one to be evaluated, and it sends all packets to the local routing table so that packets that have to be delivered locally are processed quickly. The main and default rules are the last ones used for the lookup. So the main routing table is used only if no custom rule produced a valid route for a network packet. In other words, all the packets that do not match the rules specified by ip rule will use the main routing table. The default routing table is empty by default. You can use it to specify routes for packets that were not routed by any of the previous rules.

Be careful: even if you only have classic routing tables on your system, the default rules are important. Removing the default rules would break classic routing, since packets could no longer reach the main routing table.

When you create a rule, you can set its priority explicitly with priority xxx. If you don't specify the priority, the rule is given the number just below the lowest priority already in use (not counting rule 0). This means the first new rules to be created will get 32765, 32764, ...
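To make the numbering concrete, the sketch below reads a rule listing in the format printed by ip rule and computes the priority that the next rule would be expected to receive when none is given, assuming the behaviour described above (one less than the lowest priority other than 0). The function name is made up for this example.

```shell
#!/bin/sh
# next_auto_priority  (hypothetical helper)
# Read `ip rule show` output on stdin and print the priority expected
# for the next rule added without an explicit priority.
next_auto_priority() {
    awk -F: '
        # $1 is the priority, e.g. "32766" in "32766: from all lookup main".
        # Keep the smallest priority that is greater than 0.
        $1 + 0 > 0 && (min == "" || $1 + 0 < min) { min = $1 + 0 }
        END { if (min != "") print min - 1 }
    '
}

# Default rule set, as listed in the text:
printf '%s\n' \
    '0: from all lookup local' \
    '32766: from all lookup main' \
    '32767: from all lookup default' | next_auto_priority   # prints 32765
```

On a live system the same function can be fed directly with ip rule show | next_auto_priority. To avoid depending on the automatic numbering, pass the priority yourself, for instance ip rule add from 192.168.114.0/24 table rt_table1 priority 1000.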

Example of rules

Here is the list of rules we get after adding all the example rules given earlier:

% ip rule show
0: from all lookup local
32761: from all fwmark 0x2 lookup rt_table_cable
32762: from all fwmark 0x1 lookup rt_table_adsl
32763: from 192.168.5.1 prohibit
32764: from 192.168.5.1 to 172.16.1.100 unreachable
32765: from 192.168.114.0/24 lookup rt_table1
32766: from all lookup main
32767: from all lookup default

How to organize the routing tables

Keep in mind that a lookup for one packet can go through multiple routing tables, and each routing table can have multiple routes. If no route matches in the first routing table, the other tables are checked in rule order until a route is found for the packet. So you don't have to duplicate the routes of the main routing table in the custom tables.

One good way to manage the different routing tables is to use the following method:

You add all the normal routes based on the destination in the main routing table

You just add one default route in each of the other routing tables

That way, you end up with something quite simple: there is just one route for each rule, and the list of rules acts as a master routing table.
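The method above could be sketched as follows for the two fwmark tables used in the rule examples. The gateway addresses and interface names are placeholders invented for this example, and the function prints the commands rather than running them, so nothing changes until you execute the output as root.

```shell
#!/bin/sh
# print_default_route TABLE GATEWAY DEVICE  (hypothetical helper)
# Print the single default route that a custom table needs. Destination
# based routes stay in the main table, so nothing else is added here.
print_default_route() {
    echo "ip route add default via $2 dev $3 table $1"
}

# Placeholder gateways and devices: adjust them to the real uplinks.
print_default_route rt_table_adsl  192.0.2.1    eth1
print_default_route rt_table_cable 198.51.100.1 eth2
```

Each custom table then holds exactly one route, and the choice between the two uplinks is made entirely by the rules.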