Mailing List Archive

Rout looping through local host.
After many, many hours of frustration and failures I'm almost at the
point of thinking that this is not even currently possible with Linux.

Without going into too much detail, I effectively want to do the
following.

I want to be able to take traffic in from a local LAN on eth0 and route
it out eth1 to a default gateway with a static IP. I want said default
gateway with the static IP to be assigned to eth2. I then want to route
and masquerade traffic that came in eth2 out eth3.

(Enter ASCII art)

--------------+
  Context 0   |
           +------+      +-----------+
       +---+ eth0 |------+ Local LAN |
       |   +------+      +-----------+
       |      |
       |   +------+
       +---+ eth1 +---+
           +------+   |
              |       |
  Context 1   |       |
           +------+   |
       +---+ eth2 +---+
       |   +------+
       |      |
       |   +------+      +----------+
       +---+ eth3 +------+ Internet |
           +------+      +----------+
              |
--------------+

I want the "router" in context 0 to effectively (for the sake of
discussion) do basic static NAT routing for the local LAN. This router
will have two static IP addresses, LAN facing and upstream router facing.

I want the "router" in context 1 to effectively (for the sake of
discussion) do basic MASQUERADEing for the equipment behind it. This
router will have one static IP facing the LAN and one dynamic IP facing
its upstream provider.

I have followed Julian Anastasov's directions
(http://www.ssi.bg/~ja/send-to-self.txt) and applied his Send-to-Self
patch (http://www.ssi.bg/~ja/send-to-self-2.6.22-1.diff) to a stock
2.6.22 kernel and I am able to ping the IP address assigned to eth2 from
eth1 without any problems. However, I don't think Julian's patch covers
routing traffic through the crossover cable (as opposed to traffic that
terminates at or originates on the local host).

I have also done some experimenting on my own to see if this is even
remotely possible to do by altering the routing tables in the kernel.
The closest I can come is to remove all references to eth2 from the
kernel's 'local' routing table, so that the kernel no longer knows that
the IP address in question is local to the system. It then thinks that
it needs to send the traffic out eth1, which is on the same subnet as
the target IP assigned to eth2.

I can tell from packet counters that this does indeed send the traffic
as it is supposed to. However, when the packet arrives on eth2 the
kernel does not know what to do with it, as it does not see the IP in
question as being bound to anything anywhere, and drops the packet.
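
For illustration, the experiment described above can be sketched roughly
as follows with iproute2. The address 10.0.1.2 for eth2 is an assumption
made up for this example (the post does not give the real one), and the
script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. ETH2_ADDR is an assumed example address, not from the post.
# By default commands are printed; set DRY_RUN=0 (as root) to execute them.
DRY_RUN="${DRY_RUN:-1}"
ETH2_ADDR="${ETH2_ADDR:-10.0.1.2}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# List the entries the kernel keeps for eth2 in the 'local' table.
show_local_eth2() {
    run ip route show table local dev eth2
}

# Remove the host route that makes the kernel treat ETH2_ADDR as its own.
forget_local_eth2() {
    run ip route del local "$ETH2_ADDR" dev eth2 table local
}

show_local_eth2
forget_local_eth2
```

Once the entry is gone, the kernel stops answering for that address,
which matches the dropped packets described above.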

To this end I have re-added the entries from the 'local' routing table
to a new routing table 'local_new' and set up an 'ip rule' that
indicates that any traffic coming in the eth2 interface should use this
'local_new' routing table. However, I have no way to know whether this is
doing any good, as I cannot progress further. I have also tried,
without success, to use CONNMARKing in conjunction with (packet)
MARKing and an additional 'ip rule' to specify that any traffic that
would be leaving the system should also use the 'local_new' routing
table. All of this has been to no avail.
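
A minimal sketch of what such a setup might look like, under stated
assumptions: the table is given the numeric ID 200 instead of the name
'local_new' (a name would need an entry in /etc/iproute2/rt_tables), the
eth2 address 10.0.1.2, the mark value 1 and the rule priorities are all
made up for the example, and the CONNMARK part is only one plausible
reading of the description above. Commands are printed, not executed,
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. TABLE_ID, ETH2_ADDR, MARK and the priorities are assumed
# example values, not taken from the post.
DRY_RUN="${DRY_RUN:-1}"
ETH2_ADDR="${ETH2_ADDR:-10.0.1.2}"
TABLE_ID="${TABLE_ID:-200}"   # numeric stand-in for the 'local_new' table
MARK="${MARK:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Re-create the local entry for eth2's address in the extra table.
add_local_new_route() {
    run ip route add local "$ETH2_ADDR" dev eth2 table "$TABLE_ID"
}

# Consult the extra table only for packets that arrive on eth2.
add_iif_rule() {
    run ip rule add iif eth2 prio 50 lookup "$TABLE_ID"
}

# Mark connections seen on eth2, restore the mark on locally generated
# packets, and send marked packets to the extra table as well.
add_mark_rules() {
    run iptables -t mangle -A PREROUTING -i eth2 -j CONNMARK --set-mark "$MARK"
    run iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark
    run ip rule add fwmark "$MARK" prio 51 lookup "$TABLE_ID"
}

add_local_new_route
add_iif_rule
add_mark_rules
```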

If I stick more with Julian's 'Send-to-Self' document and just alter
source IPs for different destinations (per the end of said document) I
can get the traffic to flow through the system, but not as I want it to.
To the best of my knowledge traffic will come in eth0 and go directly
out eth3 while somehow in the return path passing through eth2, but
never touching eth1.

If I cannot get this to work the way that I need / want it to, I will
have to fall back to UML routers to fulfill the role of the context 1
"router". So any help that anyone could provide would be _*GREATLY*_
appreciated.



Thanks in advance for any and all help that any one can provide,

Grant. . . .
Re: Rout looping through local host.
Hi Grant,

here's my 2¢:
there is no need to patch the kernel.
what you should do is PBR and little arp hacking:
let's say your eth0 is 10.0.0.1/24
what i'd do is to put eth1 and eth2 in different subnets:
eth1 -> 10.0.1.1/24
eth2 -> 10.0.2.1/24

default routes:
ip ro add default via 10.0.1.254 table 252 # from eth1 to eth2
ip ro add default via <gateway on eth3 side> table default # from eth3 to outside

PBR rules:
ip rule del prio 32766 # we need to put rules between lookup to main and default
ip rule add prio 100 lookup main # rule 32766 becomes 100
ip rule add prio 200 iif eth0 lookup 252 # alternative default route for local LAN

arp override:
arp -s 10.0.1.254 <ETH addr of eth2>

disable antispoof on eth{1,2} (may be not needed if you do NAT)
echo 0 > /proc/sys/net/ipv4/conf/eth1/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/eth2/rp_filter


there is one thing to look out for: the dhcp client will put its default
route in table main(254) and you should move it from there to table
default(253)
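
A rough sketch of that move, assuming the DHCP-facing interface is eth3
and using the documentation address 192.0.2.1 as a stand-in for whatever
gateway the DHCP server hands out (both are assumptions for the example).
Commands are printed, not executed, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. The interface name and gateway below are assumed examples.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Read "ip route show" style output on stdin and print the first default
# gateway found.
gateway_from_routes() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Move the default route via gateway $1 on device $2 from 'main' to 'default'.
move_default_route() {
    run ip route del default via "$1" dev "$2" table main
    run ip route add default via "$1" dev "$2" table default
}

# Example input standing in for the output of "ip route show table main".
gw="$(printf 'default via 192.0.2.1 dev eth3\n' | gateway_from_routes)"
move_default_route "$gw" eth3
```

On a live system the example input would be replaced by the real output
of "ip route show table main", typically from the DHCP client's hook
script.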

This setup should work with or without NAT

one last thing:

if you happen to snat packets from local LAN(received by eth0) to
10.0.1.1(address of eth1) then IIRC you *will* need to patch the
kernel as incoming packets with source address that the linux box
considers as its own are dropped.


Hope this works(did not test this exact setup) and helps

Best regards
Michel
Re: Rout looping through local host.
After more Googling and searching through mailing lists with ever
widening search terms I think I may have come across something that has
some potential.

If I understand it correctly, the "Linux Virtual Router and Forwarding"
project (http://linux-vrf.sourceforge.net/) has a LOT of potential.
I think it might even help me.

(More to come.)



Grant. . . .
Re: Rout looping through local host.
On 08/21/07 12:26, michel banguerski wrote:
> here's my 2¢:
> there is no need to patch the kernel.
> what you should do is PBR and little arp hacking:
> let's say your eth0 is 10.0.0.1/24
> what i'd do is to put eth1 and eth2 in different subnets:
> eth1 -> 10.0.1.1/24
> eth2 -> 10.0.2.1/24

(For the sake of discussion) Ok...

> default routes:
> ip ro add default via 10.0.1.254 table 252 # from eth1 to eth2
> ip ro add default via <gateway on eth3 side> table default # from eth3
> to outside

I can see how this might get packets from eth1 into eth2, but I fail to
see how returning packets will get from eth2 back to eth1 without doing
the same in reverse. Doing the same in reverse will either require more
routing tables or make routing table 252 more complex, if that is even
possible.

> PBR rules:
> ip rule del prio 32766 # we need to put rules between lookup to main and default
> ip rule add prio 100 lookup main # rule 32766 becomes 100
> ip rule add prio 200 iif eth0 lookup 252 # alternative default route for local LAN

Again, returning traffic is going to be problematic.

> arp override:
> arp -s 10.0.1.254 <ETH addr of eth2>

Presuming that the reverse path can be worked out, this might work if I
were really using crossover cables. For scalability reasons I'm not;
I'm using a program that acts more like a tunnel than a physical
interface. Said program generates a NOARP interface, so I don't think
this approach will work.

> disable antispoof on eth{1,2} (may be not needed if you do NAT)
> echo 0 > /proc/sys/net/ipv4/conf/eth1/rp_filter
> echo 0 > /proc/sys/net/ipv4/conf/eth2/rp_filter

Agreed. I don't know for sure yet whether reverse path protection is
needed, so for safety's sake we'll turn it off and do our own reverse
path filtering in IPTables, or see if the kernel can do it for us later on.
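
As a minimal sketch of that idea: turn off rp_filter on the looped
interfaces and replace it on the LAN side with an explicit iptables
rule. The LAN subnet 10.0.0.0/24 follows the example addressing used in
this thread and is an assumption about the real network. Commands are
printed, not executed, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. LAN_NET follows the example addressing in this thread
# (eth0 = 10.0.0.1/24); adjust it to the real network.
DRY_RUN="${DRY_RUN:-1}"
LAN_NET="${LAN_NET:-10.0.0.0/24}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Turn off the kernel's reverse path filter on the looped interfaces.
# On more recent kernels the larger of conf/all and conf/<iface> applies,
# so net.ipv4.conf.all.rp_filter may need to be 0 as well.
disable_rp_filter() {
    for ifname in eth1 eth2; do
        run sysctl -w "net.ipv4.conf.$ifname.rp_filter=0"
    done
}

# Hand-rolled replacement on the LAN side: do not forward anything that
# arrives on eth0 with a source address outside the LAN subnet.
add_antispoof_rule() {
    run iptables -A FORWARD -i eth0 ! -s "$LAN_NET" -j DROP
}

disable_rp_filter
add_antispoof_rule
```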

> there is one thing to look out for: the dhcp client will put its default
> route in table main(254) and you should move it from there to table
> default(253)

Agreed. However, this is what the options for dhcpcd (or the like) are
for: to tell it not to overwrite any files or change anything else, for
that matter. In fact, I think I'll just have the client request the
information and log it to files in the /etc/dhcp<whatever> directory,
and I'll alter interfaces and routing table(s) myself.

> This setup should work with or without NAT

Possibly.

> Hope this works(did not test this exact setup) and helps

In theory, what I understand of your suggestion has merit, but it
lacks a return path. Even if the return path can be fixed, there is still
the issue of the NOARP interfaces.

Also, my project requires me to have several of these additional
routers, so this will not scale very well and thus is not really an
ideal solution. (See my follow-up post to my own question.)



Thank you very much for your input and for providing a very unique
solution, albeit one fairly far outside the ballpark (sending
traffic to an IP address that does not exist...).

Grant. . . .
Re: Rout looping through local host.
Grant, You are 100% right about return path, it needs to be set up and
I hope the VRF will prove to be more adequate for your needs.

For sake of completeness I'll still provide the missing commands.
It is much easier if you SNAT packets coming in on eth0 using a dedicated
IP address from the subnet attached to eth2 (POSTROUTING cannot match the
input interface with -i, so match the LAN source subnet instead):
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth1 -j SNAT --to-source 10.0.2.254
iptables -t nat -A POSTROUTING -o eth3 -j MASQUERADE

then all you need is :
arp -s 10.0.2.254 <ETH addr of eth1>

no additional routes or PBR rules are needed because on return path
the packet will be DNATed to appropriate address before routing
decision.

As for NOARP and tunneling interfaces AFAIK it makes things even
easier because static arp is not even needed: the packet will be
exchanged between eth1 and eth2 just because of routing.

I did the following to test the behaviour of NOARP interfaces:

#vtund -snf /usr/share/doc/vtun/examples/vtund-server.conf
#vtund -nf /usr/share/doc/vtun/examples/vtund-client.conf cobra 127.0.0.1

that brought up tun0 and tun1:
tun0      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet addr:10.3.0.1  P-t-P:10.3.0.2  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP  MTU:1450  Metric:1
          RX packets:192 errors:0 dropped:0 overruns:0 frame:0
          TX packets:235 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:21504 (21.0 KiB)  TX bytes:19740 (19.2 KiB)

tun1      Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          inet addr:10.3.0.2  P-t-P:10.3.0.1  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP  MTU:1450  Metric:1
          RX packets:235 errors:0 dropped:0 overruns:0 frame:0
          TX packets:192 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:19740 (19.2 KiB)  TX bytes:21504 (21.0 KiB)


#ip ro add 192.168.0.0/24 dev tun0
#ip ro add 192.168.1.0/24 dev tun1

#iptables -t nat -A POSTROUTING -d 192.168.0.1 -s 10.3.0.1 -j SNAT --to-source 192.168.1.1

# ping 192.168.0.1
PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
From 192.168.0.1 icmp_seq=1 Time to live exceeded
...
Bingo! the icmp packet loops through tun0 and tun1 until the TTL reaches 0,
no 'arp -s' needed.

The fun starts if we want to avoid NAT, in that case more magic is needed.

2007/8/21, Grant Taylor <gtaylor@riverviewtech.net>:
> On 08/21/07 12:26, michel banguerski wrote:
> > here's my 2¢:
> > there is no need to patch the kernel.
> > what you should do is PBR and little arp hacking:
> > let's say your eth0 is 10.0.0.1/24
> > what i'd do is to put eth1 and eth2 in different subnets:
> > eth1 -> 10.0.1.1/24
> > eth2 -> 10.0.2.1/24
eth0 -> 10.0.0.1/32 # so we don't have that nasty connected route in 'local' table

>
> (For the sake of discussion) Ok...
>
> > default routes:
> > ip ro add default via 10.0.1.254 table 252 # from eth1 to eth2
ip ro add 10.0.0.0/24 via 10.0.2.254 table 252 # from eth3 to eth2
If you use tun devices, no need to give a gateway, just "dev tunX".

> > ip ro add default via <gateway on eth3 side> table default # from eth3
> > to outside
ip ro add 10.0.0.0/24 dev eth0 table default # from eth1 to eth0

>
> I can see how this might get packets from eth1 in to eth2, but I fail to
> see how returning packets will get from eth2 back to eth1 with out doing
> the same in reverse. Doing the same in reverse will either require more
> routing tables or make routing table 252 more complex if possible.
>
> > PBR rules:
> > ip rule del prio 32766 # we need to put rules between lookup to main and default
> > ip rule add prio 100 lookup main # rule 32766 becomes 100
> > > ip rule add prio 200 iif eth0 lookup 252 # alternative default route for local LAN
> > > ip rule add prio 201 iif eth3 lookup 252 # alternative default route for local LAN
No changes in PBR rules
>
> Again, returning traffic is going to be problematic.
>
> > arp override:
> > arp -s 10.0.1.254 <ETH addr of eth2>

arp -s 10.0.2.254 <ETH addr of eth1>

>
> Presuming that the reverse path can be worked out, this might work if I
> was really using cross over cables, but for scalability reasons I'm not
> doing so, but rather a program more like a tunnel than a physical
> interface. Said program generates a NOARP interface, so I don't think
> this approach will work.

I believe on NOARP interfaces 'arp -s' is not needed at all.

> Thank you very much for your input and for providing a very unique
> solution, all be it fairly out side of the ball park one (sending
> traffic to an IP address that does not exist...).
It was an interesting exercise. I wish I could have been of more help,
but at least it was entertaining. Thank you :)

>
> Grant. . . .
>
Best regards
Michel

PS thank You for pointing to vrf, interesting indeed
Re: Rout looping through local host.
oops, mistype because of non-linear writing and poor proofreading:

> > > ip rule add prio 200 iif eth0 lookup 252 # alternative default route for local LAN
> > > ip rule add prio 201 iif eth3 lookup 252 # alternative default route for local LAN
> No changes in PBR rules
> >

This line:
ip rule add prio 201 iif eth3 lookup 252 # alternative default route for local LAN
*is* the required change to PBR rules

sorry for flooding you.
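
For reference, the no-NAT commands scattered across the messages above
can be collected roughly as below. This is only a sketch and, like the
original suggestion, untested: the eth3 gateway 192.0.2.1 and the two
MAC addresses are hypothetical placeholders, the rules are spelled with
an explicit prio keyword, and the addressing (eth0 10.0.0.1/32, eth1
10.0.1.1/24, eth2 10.0.2.1/24) is the example addressing from the
thread. Commands are printed, not executed, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: collects the no-NAT commands from this thread in one place.
# GW_ETH3 and the two MAC addresses are hypothetical placeholders.
DRY_RUN="${DRY_RUN:-1}"
GW_ETH3="${GW_ETH3:-192.0.2.1}"
MAC_ETH1="${MAC_ETH1:-02:00:00:00:00:01}"
MAC_ETH2="${MAC_ETH2:-02:00:00:00:00:02}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

setup_routes() {
    run ip route add default via 10.0.1.254 table 252      # from eth1 to eth2
    run ip route add 10.0.0.0/24 via 10.0.2.254 table 252  # from eth3 to eth2
    run ip route add default via "$GW_ETH3" table default  # from eth3 to outside
    run ip route add 10.0.0.0/24 dev eth0 table default    # from eth1 to eth0
}

setup_rules() {
    run ip rule del prio 32766                    # make room between main and default
    run ip rule add prio 100 lookup main          # rule 32766 becomes 100
    run ip rule add prio 200 iif eth0 lookup 252  # traffic from the local LAN
    run ip rule add prio 201 iif eth3 lookup 252  # return traffic from outside
}

# Static ARP entries pointing the fake gateways at the opposite interface.
# Per the thread, these should not be needed on NOARP (tunnel) interfaces.
setup_arp() {
    run arp -s 10.0.1.254 "$MAC_ETH2"
    run arp -s 10.0.2.254 "$MAC_ETH1"
}

setup_routes
setup_rules
setup_arp
```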

Re: Rout looping through local host.
On 08/21/07 15:47, michel banguerski wrote:
> Grant, You are 100% right about return path, it needs to be set up
> and I hope the VRF will prove to be more adequate for your needs.

Michel, thank you for your help. I will keep what you have pointed out
in mind. However for now I'm pursuing the VRF route as I think it will
better handle my problem and scale better, especially considering I need
to not NAT what is coming out of eth1 on its way to eth2.

I'll keep everyone apprised of what I find out.



Thanks again,

Grant. . . .