Message ID | 94839fa9e7995afa6139b4f65c12ac15c1a8dc2f.1618844973.git.dcaratti@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | fix stack OOB read while fragmenting IPv4 packets | expand |
On 19 Apr 2021, at 17:23, Davide Caratti wrote: > running openvswitch on kernels built with KASAN, it's possible to see > the > following splat while testing fragmentation of IPv4 packets: <SNIP> > for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. > Then, > in the following call graph: > > ip_do_fragment() > ip_skb_dst_mtu() > ip_dst_mtu_maybe_forward() > ip_mtu_locked() > > the pointer to struct dst_entry is used as pointer to struct rtable: > this > turns the access to struct members like rt_mtu_locked into an OOB read > in > the stack. Fix this changing the temporary variable used for IPv4 > packets > in ovs_fragment(), similarly to what is done for IPv6 few lines below. > > Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < > net.ipv4.route.min_pmt") > Cc: <stable@vger.kernel.org> > Signed-off-by: Davide Caratti <dcaratti@redhat.com> The fix looks good to me, however isn’t the real root cause ip_mtu_locked() who casts struct dst_entry to struct rtable (not even using container_of())? I do not know details in this area of the code, so maybe it’s just fine to always assume dst_entry is part of a rtable struct, as I see other core functions do the same ipv4_neigh_lookup()/ipv4_confirm_neigh(). Acked-by: Eelco Chaudron <echaudro@redhat.com>
hello Eelco, thanks for looking at this! On Wed, 2021-04-21 at 11:27 +0200, Eelco Chaudron wrote: > > On 19 Apr 2021, at 17:23, Davide Caratti wrote: > > > running openvswitch on kernels built with KASAN, it's possible to see > > the > > following splat while testing fragmentation of IPv4 packets: > > <SNIP> > > > for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. > > Then, > > in the following call graph: > > > > ip_do_fragment() > > ip_skb_dst_mtu() > > ip_dst_mtu_maybe_forward() > > ip_mtu_locked() > > > > the pointer to struct dst_entry is used as pointer to struct rtable: > > this > > turns the access to struct members like rt_mtu_locked into an OOB read > > in > > the stack. Fix this changing the temporary variable used for IPv4 > > packets > > in ovs_fragment(), similarly to what is done for IPv6 few lines below. > > > > Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < > > net.ipv4.route.min_pmt") > > Cc: <stable@vger.kernel.org> > > Signed-off-by: Davide Caratti <dcaratti@redhat.com> > > The fix looks good to me, however isn’t the real root cause > ip_mtu_locked() who casts struct dst_entry to struct rtable (not even > using container_of())? good point, that's my understanding (and the reason for that 'Fixes:' tag). Probably openvswitch was doing this on purpose, and it was "just working" until commit d52e5a7e7ca4. But at the current state, I see much easier to just fix the IPv4 part to have the same behavior as other "users" of ip_do_fragment(), like it happens for ovs_fragment() when the packet is IPv6 (or br_netfilter core, see [1]). By the way, apparently ip_do_fragment() already assumes that a struct rtable is available for the skb [2]. So, the fix in ovs_fragment() looks safer to me. WDYT? -- davide [1] https://elixir.bootlin.com/linux/v5.12-rc8/source/net/bridge/br_nf_core.c#L72 [2] https://elixir.bootlin.com/linux/v5.12-rc8/source/net/ipv4/ip_output.c#L778
On 21 Apr 2021, at 17:05, Davide Caratti wrote: > hello Eelco, thanks for looking at this! > > On Wed, 2021-04-21 at 11:27 +0200, Eelco Chaudron wrote: >> >> On 19 Apr 2021, at 17:23, Davide Caratti wrote: >> >>> running openvswitch on kernels built with KASAN, it's possible to >>> see >>> the >>> following splat while testing fragmentation of IPv4 packets: >> >> <SNIP> >> >>> for IPv4 packets, ovs_fragment() uses a temporary struct dst_entry. >>> Then, >>> in the following call graph: >>> >>> ip_do_fragment() >>> ip_skb_dst_mtu() >>> ip_dst_mtu_maybe_forward() >>> ip_mtu_locked() >>> >>> the pointer to struct dst_entry is used as pointer to struct rtable: >>> this >>> turns the access to struct members like rt_mtu_locked into an OOB >>> read >>> in >>> the stack. Fix this changing the temporary variable used for IPv4 >>> packets >>> in ovs_fragment(), similarly to what is done for IPv6 few lines >>> below. >>> >>> Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < >>> net.ipv4.route.min_pmt") >>> Cc: <stable@vger.kernel.org> >>> Signed-off-by: Davide Caratti <dcaratti@redhat.com> >> >> The fix looks good to me, however isn’t the real root cause >> ip_mtu_locked() who casts struct dst_entry to struct rtable (not even >> using container_of())? > > good point, that's my understanding (and the reason for that 'Fixes:' > tag). Probably openvswitch was doing this on purpose, and it was "just > working" until commit d52e5a7e7ca4. > > But at the current state, I see much easier to just fix the IPv4 part > to > have the same behavior as other "users" of ip_do_fragment(), like it > happens for ovs_fragment() when the packet is IPv6 (or br_netfilter > core, see [1]). > > By the way, apparently ip_do_fragment() already assumes that a struct > rtable is available for the skb [2]. So, the fix in ovs_fragment() > looks > safer to me. WDYT? It looks like the assumption that a dst_entry is always embedded in rtable seems deeply embedded already, looking at skb_rtable(), so I agree this patch is the best solution. So again, Acked-by: Eelco Chaudron <echaudro@redhat.com>
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index 92a0b67b2728..77d924ab8cdb 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -827,17 +827,17 @@ static void ovs_fragment(struct net *net, struct vport *vport, } if (key->eth.type == htons(ETH_P_IP)) { - struct dst_entry ovs_dst; + struct rtable ovs_rt = { 0 }; unsigned long orig_dst; prepare_frag(vport, skb, orig_network_offset, ovs_key_mac_proto(key)); - dst_init(&ovs_dst, &ovs_dst_ops, NULL, 1, + dst_init(&ovs_rt.dst, &ovs_dst_ops, NULL, 1, DST_OBSOLETE_NONE, DST_NOCOUNT); - ovs_dst.dev = vport->dev; + ovs_rt.dst.dev = vport->dev; orig_dst = skb->_skb_refdst; - skb_dst_set_noref(skb, &ovs_dst); + skb_dst_set_noref(skb, &ovs_rt.dst); IPCB(skb)->frag_max_size = mru; ip_do_fragment(net, skb->sk, skb, ovs_vport_output);