I have questions about the order of events in the routing/iptables pipeline. I will first explain my setup; the questions are at the end of this post.
I use policy routing and iptables on Linux 4.4.
I have two interfaces: wan0 (towards my ISP) and vpn-crypto (a tun device towards a VPN provider).
I want to selectively route some traffic to the VPN and everything else through wan0.
I implement policy routing as follows:
In the OUTPUT chain of the mangle table I set the mark 0xC on NEW traffic that matches some pattern. For testing purposes I mark traffic directed to a specific IP address (37.9.239.33). Of course I also have other iptables rules, such as MASQUERADE, but they are irrelevant to this question.
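A minimal sketch of such a marking rule, for illustration only. The conntrack match used to detect NEW traffic and the hypothetical run helper (which only prints the command unless APPLY=1 is set) are assumptions made for this example; the actual rule set may differ, and how later packets of the same flow get their mark is not shown.

```shell
#!/bin/sh
# Sketch only: print the command by default; set APPLY=1 (as root) to run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Mark NEW connections to the test address with 0xC in mangle/OUTPUT.
# The conntrack match is an assumption about how "NEW" is detected.
mark_rule() {
    run iptables -t mangle -A OUTPUT -d 37.9.239.33 \
        -m conntrack --ctstate NEW -j MARK --set-mark 0xC
}

mark_rule
```

Run without APPLY=1, the script only prints the iptables command, so it can be reviewed before touching the live rule set.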
I have an iproute2 rule (priority 190) which dispatches all marked packets to the vpn table:
0:      from all lookup local
190:    from all fwmark 0x4/0x4 lookup vpn
400:    from all fwmark 0x8/0x8 oif wan0 unreachable
32766:  from all lookup main
32767:  from all lookup default
The vpn table contains only one route, which sends everything via the VPN:
default via 10.33.0.1 dev vpn-crypto proto static src 10.33.148.125
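Since the mark is 0xC while the rules select on 0x4/0x4 and 0x8/0x8, it may help to spell out the mask comparison. The sketch below assumes the usual semantics of a "fwmark VALUE/MASK" selector (mark and value must agree on every bit selected by the mask). The helper names fwmark_matches and check_mark are made up for this example, and the "oif wan0" part of rule 400 is not modeled.

```shell
#!/bin/sh
# fwmark_matches MARK VALUE MASK
# Succeeds when MARK and VALUE agree on every bit selected by MASK.
fwmark_matches() {
    [ "$(( ($1 ^ $2) & $3 ))" -eq 0 ]
}

# Report which fwmark selectors from the rule list a given mark satisfies.
# Only the fwmark part is checked; "oif wan0" on rule 400 is ignored here.
check_mark() {
    mark=$1
    for sel in "190 0x4 0x4" "400 0x8 0x8"; do
        set -- $sel   # $1 = priority, $2 = value, $3 = mask
        if fwmark_matches "$mark" "$2" "$3"; then
            echo "mark $mark matches rule $1 (fwmark $2/$3)"
        else
            echo "mark $mark does not match rule $1 (fwmark $2/$3)"
        fi
    done
}

check_mark 0xC   # the mark set in mangle/OUTPUT
check_mark 0x0   # an unmarked packet
```

Under these assumptions, 0xC satisfies the fwmark part of both rule 190 and rule 400, while an unmarked packet satisfies neither and falls through to the main table.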
I test using the command ping -c3 37.9.239.33
where 37.9.239.33 is the IP address marked for the VPN. Two tcpdump sessions, attached to wan0 and vpn-crypto, show that the packets do go out and come back via vpn-crypto, as expected. So far so good, everything works.
However, I observe a weird behavior which I do not understand when logging the packet flow through iptables. I inserted LOG rules in the OUTPUT chain of the filter table (FILTER.OUTPUT) and in the POSTROUTING chain of the mangle table (MANGLE.POSTROUTING).
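Logging rules of this kind could look roughly as follows. This is a sketch under assumptions: the -d match on the test address and the hypothetical run helper (print-only unless APPLY=1 is set) are illustrative, and the prefixes are chosen to mirror the FILTER.OUTPUT and MANGLE.POSTROUTING labels used in this post.

```shell
#!/bin/sh
# Sketch only: print each command by default; set APPLY=1 (as root) to run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Insert LOG rules at the top of filter/OUTPUT and mangle/POSTROUTING.
# The -d match is an assumption; the prefixes mirror the labels in this post.
log_rules() {
    run iptables -t filter -I OUTPUT -d 37.9.239.33 \
        -j LOG --log-prefix "FILTER.OUTPUT: "
    run iptables -t mangle -I POSTROUTING -d 37.9.239.33 \
        -j LOG --log-prefix "MANGLE.POSTROUTING: "
}

log_rules
```

The LOG target writes to the kernel log, so the traces can be read with dmesg or from the system log.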
I observe this sequence of events:
FILTER.OUTPUT: IN= OUT=wan0 SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=26960 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=1 MARK=0xc
MANGLE.POSTROUTING: IN= OUT=vpn-crypto SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=26960 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=1 MARK=0xc
MANGLE.POSTROUTING: IN= OUT=wan0 SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27049 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=2
FILTER.OUTPUT: IN= OUT=wan0 SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27100 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=2 MARK=0xc
MANGLE.POSTROUTING: IN= OUT=vpn-crypto SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27100 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=2 MARK=0xc
MANGLE.POSTROUTING: IN= OUT=wan0 SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27193 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=3
FILTER.OUTPUT: IN= OUT=wan0 SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27237 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=3 MARK=0xc
MANGLE.POSTROUTING: IN= OUT=vpn-crypto SRC=59.189.21.112 DST=37.9.239.33 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=27237 DF PROTO=ICMP TYPE=8 CODE=0 ID=9649 SEQ=3 MARK=0xc
I expected the re-routing decision triggered by the mark set in MANGLE.OUTPUT to happen before FILTER.OUTPUT, but it seems to happen after! In fact, every packet logged in FILTER.OUTPUT still shows OUT=wan0, indicating that no re-routing decision has happened yet. The documentation available on the web is split 50/50 on this point. For instance:
https://www.frozentux.net/iptables-tutorial/iptables-tutorial.html, chapter 6, table 6-2, states the re-routing decision happens before filter.output
http://www.aptalaska.net/~jclive/IPTablesFlowChart.pdf states that the re-routing decision happens after filter.output
Which one of the two is correct? Can someone shed some light on this?
During the 3 pings, MANGLE.POSTROUTING also logs 2 unexpected packets routed via wan0, which are unmarked and seem to be duplicates of the ones routed via vpn-crypto. Why is this? According to tcpdump, these packets never reach the interface (which is correct).
Can someone shed some light on this behaviour?