Message ID | 20201201115250.6381-2-tparkin@katalix.com |
---|---|
State | Superseded |
Series | add ppp_generic ioctl(s) to bridge channels |
On Tue, Dec 01, 2020 at 11:52:49AM +0000, Tom Parkin wrote:
> This new ioctl pair allows two ppp channels to be bridged together:
> frames arriving in one channel are transmitted in the other channel
> and vice versa.
>
> The practical use for this is primarily to support the L2TP Access
> Concentrator use-case. The end-user session is presented as a ppp
> channel (typically PPPoE, although it could be e.g. PPPoA, or even PPP
> over a serial link) and is switched into a PPPoL2TP session for
> transmission to the LNS. At the LNS the PPP session is terminated in
> the ISP's network.
>
> When a PPP channel is bridged to another it takes a reference on the
> other's struct ppp_file. This reference is dropped when the channels
> are unbridged, which can occur either explicitly on userspace calling
> the PPPIOCUNBRIDGECHAN ioctl, or implicitly when either channel in the
> bridge is unregistered.
>
> In order to implement the channel bridge, struct channel is extended
> with a new field, 'bridge', which points to the other struct channel
> making up the bridge.
>
> This pointer is RCU protected to avoid adding another lock to the data
> path.
>
> To guard against concurrent writes to the pointer, the existing struct
> channel lock 'upl' coverage is extended rather than adding a new lock.
>
> The 'upl' lock is used to protect the existing unit pointer. Since the
> bridge effectively replaces the unit (they're mutually exclusive for a
> channel) it makes coding easier to use the same lock to cover them
> both.
>
> Signed-off-by: Tom Parkin <tparkin@katalix.com>
> ---
>  drivers/net/ppp/ppp_generic.c  | 139 ++++++++++++++++++++++++++++++++-
>  include/uapi/linux/ppp-ioctl.h |   2 +
>  2 files changed, 138 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index 7d005896a0f9..5babf0aff840 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -174,7 +174,8 @@ struct channel {
>  	struct ppp *ppp; /* ppp unit we're connected to */
>  	struct net *chan_net; /* the net channel belongs to */
>  	struct list_head clist; /* link in list of channels per unit */
> -	rwlock_t upl; /* protects `ppp' */
> +	rwlock_t upl; /* protects `ppp' and 'bridge' */
> +	struct channel __rcu *bridge; /* "bridged" ppp channel */
>  #ifdef CONFIG_PPP_MULTILINK
>  	u8 avail; /* flag used in multilink stuff */
>  	u8 had_frag; /* >= 1 fragments have been sent */
> @@ -606,6 +607,73 @@ static struct bpf_prog *compat_ppp_get_filter(struct sock_fprog32 __user *p)
>  #endif
>  #endif
>
> +/* Bridge one PPP channel to another.
> + * When two channels are bridged, ppp_input on one channel is redirected to
> + * the other's ops->start_xmit handler.
> + * In order to safely bridge channels we must reject channels which are already
> + * part of a bridge instance, or which form part of an existing unit.
> + * Once successfully bridged, each channel holds a reference on the other
> + * to prevent it being freed while the bridge is extant.
> + */
> +static int ppp_bridge_channels(struct channel *pch, struct channel *pchb)
> +{
> +	write_lock_bh(&pch->upl);
> +	if (pch->ppp || pch->bridge) {

Since ->bridge is RCU protected, it should be dereferenced with
rcu_dereference_protected() here:
rcu_dereference_protected(pch->bridge, lockdep_is_held(&pch->upl)).

> +		write_unlock_bh(&pch->upl);
> +		return -EALREADY;
> +	}
> +	rcu_assign_pointer(pch->bridge, pchb);
> +	write_unlock_bh(&pch->upl);
> +
> +	write_lock_bh(&pchb->upl);
> +	if (pchb->ppp || pchb->bridge) {

Same here (with adaptation of the lockdep part of course).

> +		write_unlock_bh(&pchb->upl);
> +		goto err_unset;
> +	}
> +	rcu_assign_pointer(pchb->bridge, pch);
> +	write_unlock_bh(&pchb->upl);
> +
> +	refcount_inc(&pch->file.refcnt);
> +	refcount_inc(&pchb->file.refcnt);
> +
> +	return 0;
> +
> +err_unset:
> +	write_lock_bh(&pch->upl);
> +	RCU_INIT_POINTER(pch->bridge, NULL);
> +	write_unlock_bh(&pch->upl);
> +	synchronize_rcu();
> +	return -EALREADY;
> +}
> +
> +static int ppp_unbridge_channels(struct channel *pch)
> +{
> +	struct channel *pchb;
> +
> +	write_lock_bh(&pch->upl);
> +	pchb = rcu_dereference(pch->bridge);

It'd make more sense to use rcu_dereference_protected() here too.

> +	if (!pchb) {
> +		write_unlock_bh(&pch->upl);
> +		return -EINVAL;

I'm not sure I'd consider this case as an error.
If there's no bridged channel, there's just nothing to do.
Furthermore, there might be situations where this is not really an
error (see the possible race below).

> +	}
> +	RCU_INIT_POINTER(pch->bridge, NULL);
> +	write_unlock_bh(&pch->upl);
> +
> +	write_lock_bh(&pchb->upl);
> +	RCU_INIT_POINTER(pchb->bridge, NULL);
> +	write_unlock_bh(&pchb->upl);
> +
> +	synchronize_rcu();
> +
> +	if (refcount_dec_and_test(&pch->file.refcnt))
> +		ppp_destroy_channel(pch);

I think that we could have a situation where pchb->bridge could be
different from pch.
If ppp_unbridge_channels() was also called concurrently on pchb, then
pchb->bridge might have been already reset. And it might have dropped
the reference it had on pch. In this case, we'd erroneously decrement
the refcnt again.

In theory, pchb->bridge might even have been reassigned to a different
channel. So we'd reset pchb->bridge, but without decrementing the
refcnt of the channel it pointed to (and again, we'd erroneously
decrement pch's refcount instead).

So I think we need to save pchb->bridge to a local variable while we
hold pchb->upl, and then drop the refcount of that channel, instead of
assuming that it's equal to pch.

> +	if (refcount_dec_and_test(&pchb->file.refcnt))
> +		ppp_destroy_channel(pchb);
> +
> +	return 0;
> +}
> +
>  static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>  {
>  	struct ppp_file *pf;

snip

> @@ -3270,7 +3403,7 @@ ppp_connect_channel(struct channel *pch, int unit)
>  		goto out;
>  	write_lock_bh(&pch->upl);
>  	ret = -EINVAL;
> -	if (pch->ppp)
> +	if (pch->ppp || pch->bridge)

rcu_dereference_protected().

>  		goto outl;
>
>  	ppp_lock(ppp);
> diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
> index 7bd2a5a75348..8dbecb3ad036 100644
> --- a/include/uapi/linux/ppp-ioctl.h
> +++ b/include/uapi/linux/ppp-ioctl.h
> @@ -115,6 +115,8 @@ struct pppol2tp_ioc_stats {
>  #define PPPIOCATTCHAN _IOW('t', 56, int) /* attach to ppp channel */
>  #define PPPIOCGCHAN _IOR('t', 55, int) /* get ppp channel number */
>  #define PPPIOCGL2TPSTATS _IOR('t', 54, struct pppol2tp_ioc_stats)
> +#define PPPIOCBRIDGECHAN _IOW('t', 53, int) /* bridge one channel to another */
> +#define PPPIOCUNBRIDGECHAN _IO('t', 54) /* unbridge channel */
>
>  #define SIOCGPPPSTATS (SIOCDEVPRIVATE + 0)
>  #define SIOCGPPPVER (SIOCDEVPRIVATE + 1) /* NEVER change this!! */
> --
> 2.17.1
>
On Thu, Dec 03, 2020 at 01:23:18 +0100, Guillaume Nault wrote:
> On Tue, Dec 01, 2020 at 11:52:49AM +0000, Tom Parkin wrote:
> > +static int ppp_bridge_channels(struct channel *pch, struct channel *pchb)
> > +{
> > +	write_lock_bh(&pch->upl);
> > +	if (pch->ppp || pch->bridge) {
>
> Since ->bridge is RCU protected, it should be dereferenced with
> rcu_dereference_protected() here:
> rcu_dereference_protected(pch->bridge, lockdep_is_held(&pch->upl)).
>

Ack, thanks. Ditto for the other callsites which should also be using
rcu_dereference_protected for access to the rcu-protected pointer.

<snip>

> > +	if (!pchb) {
> > +		write_unlock_bh(&pch->upl);
> > +		return -EINVAL;
>
> I'm not sure I'd consider this case as an error.

To be honest I'd probably tend to agree with you, but I was seeking to
maintain consistency with how PPPIOCCONNECT/PPPIOCDISCONN behave. The
latter returns EINVAL if the channel isn't connected to an interface.

If you feel strongly I'm happy to change it but IMO it's better to be
consistent with existing ioctl calls.

> If there's no bridged channel, there's just nothing to do.
> Furthermore, there might be situations where this is not really an
> error (see the possible race below).
>
> > +	}
> > +	RCU_INIT_POINTER(pch->bridge, NULL);
> > +	write_unlock_bh(&pch->upl);
> > +
> > +	write_lock_bh(&pchb->upl);
> > +	RCU_INIT_POINTER(pchb->bridge, NULL);
> > +	write_unlock_bh(&pchb->upl);
> > +
> > +	synchronize_rcu();
> > +
> > +	if (refcount_dec_and_test(&pch->file.refcnt))
> > +		ppp_destroy_channel(pch);
>
> I think that we could have a situation where pchb->bridge could be
> different from pch.
> If ppp_unbridge_channels() was also called concurrently on pchb, then
> pchb->bridge might have been already reset. And it might have dropped
> the reference it had on pch. In this case, we'd erroneously decrement
> the refcnt again.
>
> In theory, pchb->bridge might even have been reassigned to a different
> channel. So we'd reset pchb->bridge, but without decrementing the
> refcnt of the channel it pointed to (and again, we'd erroneously
> decrement pch's refcount instead).
>
> So I think we need to save pchb->bridge to a local variable while we
> hold pchb->upl, and then drop the refcount of that channel, instead of
> assuming that it's equal to pch.

Ack, yes.

The v1 series protected against this, although by nesting locks :-|

I think in the case that pchb->bridge != pch, we probably want to
leave pchb alone, so:

 1. Don't unset the pchb->bridge pointer
 2. Don't drop the pch reference (pchb doesn't hold a reference on pch
    because pchb->bridge != pch)

This is on the assumption that pchb has been reassigned -- in that
scenario we don't want to mess with pchb at all since it'll break the
other bridge instance.
On Thu, Dec 03, 2020 at 11:57:18AM +0000, Tom Parkin wrote:
> On Thu, Dec 03, 2020 at 01:23:18 +0100, Guillaume Nault wrote:
> > On Tue, Dec 01, 2020 at 11:52:49AM +0000, Tom Parkin wrote:
> > > +	if (!pchb) {
> > > +		write_unlock_bh(&pch->upl);
> > > +		return -EINVAL;
> >
> > I'm not sure I'd consider this case as an error.
>
> To be honest I'd probably tend to agree with you, but I was seeking to
> maintain consistency with how PPPIOCCONNECT/PPPIOCDISCONN behave. The
> latter returns EINVAL if the channel isn't connected to an interface.

Indeed, that makes sense. I didn't think about that.
I was mostly thinking about the case where ->bridge was concurrently
reset by another ppp_unbridge_channels(), which doesn't look like an
error to me. But we can leave userspace responsible for properly using
the API (or ignoring EINVAL when they don't).

> If you feel strongly I'm happy to change it but IMO it's better to be
> consistent with existing ioctl calls.

I don't feel strongly about it :).

> > If there's no bridged channel, there's just nothing to do.
> > Furthermore, there might be situations where this is not really an
> > error (see the possible race below).
> >
> > > +	}
> > > +	RCU_INIT_POINTER(pch->bridge, NULL);
> > > +	write_unlock_bh(&pch->upl);
> > > +
> > > +	write_lock_bh(&pchb->upl);
> > > +	RCU_INIT_POINTER(pchb->bridge, NULL);
> > > +	write_unlock_bh(&pchb->upl);
> > > +
> > > +	synchronize_rcu();
> > > +
> > > +	if (refcount_dec_and_test(&pch->file.refcnt))
> > > +		ppp_destroy_channel(pch);
> >
> > I think that we could have a situation where pchb->bridge could be
> > different from pch.
> > If ppp_unbridge_channels() was also called concurrently on pchb, then
> > pchb->bridge might have been already reset. And it might have dropped
> > the reference it had on pch. In this case, we'd erroneously decrement
> > the refcnt again.
> >
> > In theory, pchb->bridge might even have been reassigned to a different
> > channel. So we'd reset pchb->bridge, but without decrementing the
> > refcnt of the channel it pointed to (and again, we'd erroneously
> > decrement pch's refcount instead).
> >
> > So I think we need to save pchb->bridge to a local variable while we
> > hold pchb->upl, and then drop the refcount of that channel, instead of
> > assuming that it's equal to pch.
>
> Ack, yes.
>
> The v1 series protected against this, although by nesting locks :-|

Well, I think the v1 could deadlock in this situation. The RFC was
immune to this problem, as it didn't modify ->bridge on pchb.

> I think in the case that pchb->bridge != pch, we probably want to
> leave pchb alone, so:
>
>  1. Don't unset the pchb->bridge pointer
>  2. Don't drop the pch reference (pchb doesn't hold a reference on pch
>     because pchb->bridge != pch)
>
> This is on the assumption that pchb has been reassigned -- in that
> scenario we don't want to mess with pchb at all since it'll break the
> other bridge instance.

Yes you're right. Thanks!
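The handling agreed in this exchange (save pchb->bridge while holding pchb's lock, and only touch pchb if it still points back at pch) can be modelled in userspace. The following is a minimal sketch, not kernel code: a pthread mutex stands in for 'upl', a C11 atomic int stands in for refcount_t, RCU and channel destruction are not modelled, and the names chan_model, chan_init, chan_link and chan_unbridge are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct channel. */
struct chan_model {
	pthread_mutex_t upl;		/* stands in for channel->upl */
	struct chan_model *bridge;	/* stands in for channel->bridge */
	atomic_int refcnt;		/* stands in for file.refcnt */
};

static void chan_init(struct chan_model *c)
{
	pthread_mutex_init(&c->upl, NULL);
	c->bridge = NULL;
	atomic_init(&c->refcnt, 1);	/* the owner's own reference */
}

/* Test-setup helper: bridge a and b, each taking a reference on the
 * other.  No locking here, so it must not race with chan_unbridge().
 */
static void chan_link(struct chan_model *a, struct chan_model *b)
{
	a->bridge = b;
	b->bridge = a;
	atomic_fetch_add(&a->refcnt, 1);
	atomic_fetch_add(&b->refcnt, 1);
}

static int chan_unbridge(struct chan_model *pch)
{
	struct chan_model *pchb, *pchbb;

	pthread_mutex_lock(&pch->upl);
	pchb = pch->bridge;
	if (!pchb) {
		pthread_mutex_unlock(&pch->upl);
		return -EINVAL;
	}
	pch->bridge = NULL;
	pthread_mutex_unlock(&pch->upl);

	/* Save pchb->bridge under pchb's lock.  Only reset it if it still
	 * points back at pch; otherwise pchb was unbridged (and possibly
	 * rebridged) concurrently and is left alone.
	 */
	pthread_mutex_lock(&pchb->upl);
	pchbb = pchb->bridge;
	if (pchbb == pch)
		pchb->bridge = NULL;
	pthread_mutex_unlock(&pchb->upl);

	/* Drop the reference pchb held on pch only if pchb really held one. */
	if (pchbb == pch)
		atomic_fetch_sub(&pch->refcnt, 1);
	/* pch always held a reference on pchb. */
	atomic_fetch_sub(&pchb->refcnt, 1);

	return 0;
}
```

In this model, unbridging a channel whose former peer has since been bridged to a third channel leaves that second bridge intact and does not decrement the caller's own count a second time.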
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 7d005896a0f9..5babf0aff840 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -174,7 +174,8 @@ struct channel {
 	struct ppp *ppp; /* ppp unit we're connected to */
 	struct net *chan_net; /* the net channel belongs to */
 	struct list_head clist; /* link in list of channels per unit */
-	rwlock_t upl; /* protects `ppp' */
+	rwlock_t upl; /* protects `ppp' and 'bridge' */
+	struct channel __rcu *bridge; /* "bridged" ppp channel */
 #ifdef CONFIG_PPP_MULTILINK
 	u8 avail; /* flag used in multilink stuff */
 	u8 had_frag; /* >= 1 fragments have been sent */
@@ -606,6 +607,73 @@ static struct bpf_prog *compat_ppp_get_filter(struct sock_fprog32 __user *p)
 #endif
 #endif
 
+/* Bridge one PPP channel to another.
+ * When two channels are bridged, ppp_input on one channel is redirected to
+ * the other's ops->start_xmit handler.
+ * In order to safely bridge channels we must reject channels which are already
+ * part of a bridge instance, or which form part of an existing unit.
+ * Once successfully bridged, each channel holds a reference on the other
+ * to prevent it being freed while the bridge is extant.
+ */
+static int ppp_bridge_channels(struct channel *pch, struct channel *pchb)
+{
+	write_lock_bh(&pch->upl);
+	if (pch->ppp || pch->bridge) {
+		write_unlock_bh(&pch->upl);
+		return -EALREADY;
+	}
+	rcu_assign_pointer(pch->bridge, pchb);
+	write_unlock_bh(&pch->upl);
+
+	write_lock_bh(&pchb->upl);
+	if (pchb->ppp || pchb->bridge) {
+		write_unlock_bh(&pchb->upl);
+		goto err_unset;
+	}
+	rcu_assign_pointer(pchb->bridge, pch);
+	write_unlock_bh(&pchb->upl);
+
+	refcount_inc(&pch->file.refcnt);
+	refcount_inc(&pchb->file.refcnt);
+
+	return 0;
+
+err_unset:
+	write_lock_bh(&pch->upl);
+	RCU_INIT_POINTER(pch->bridge, NULL);
+	write_unlock_bh(&pch->upl);
+	synchronize_rcu();
+	return -EALREADY;
+}
+
+static int ppp_unbridge_channels(struct channel *pch)
+{
+	struct channel *pchb;
+
+	write_lock_bh(&pch->upl);
+	pchb = rcu_dereference(pch->bridge);
+	if (!pchb) {
+		write_unlock_bh(&pch->upl);
+		return -EINVAL;
+	}
+	RCU_INIT_POINTER(pch->bridge, NULL);
+	write_unlock_bh(&pch->upl);
+
+	write_lock_bh(&pchb->upl);
+	RCU_INIT_POINTER(pchb->bridge, NULL);
+	write_unlock_bh(&pchb->upl);
+
+	synchronize_rcu();
+
+	if (refcount_dec_and_test(&pch->file.refcnt))
+		ppp_destroy_channel(pch);
+
+	if (refcount_dec_and_test(&pchb->file.refcnt))
+		ppp_destroy_channel(pchb);
+
+	return 0;
+}
+
 static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct ppp_file *pf;
@@ -641,8 +709,9 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	}
 
 	if (pf->kind == CHANNEL) {
-		struct channel *pch;
+		struct channel *pch, *pchb;
 		struct ppp_channel *chan;
+		struct ppp_net *pn;
 
 		pch = PF_TO_CHANNEL(pf);
 
@@ -657,6 +726,29 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 			err = ppp_disconnect_channel(pch);
 			break;
 
+		case PPPIOCBRIDGECHAN:
+			if (get_user(unit, p))
+				break;
+			err = -ENXIO;
+			pn = ppp_pernet(current->nsproxy->net_ns);
+			spin_lock_bh(&pn->all_channels_lock);
+			pchb = ppp_find_channel(pn, unit);
+			/* Hold a reference to prevent pchb being freed while
+			 * we establish the bridge.
+			 */
+			if (pchb)
+				refcount_inc(&pchb->file.refcnt);
+			spin_unlock_bh(&pn->all_channels_lock);
+			err = ppp_bridge_channels(pch, pchb);
+			/* Drop earlier refcount now bridge establishment is complete */
+			if (refcount_dec_and_test(&pchb->file.refcnt))
+				ppp_destroy_channel(pchb);
+			break;
+
+		case PPPIOCUNBRIDGECHAN:
+			err = ppp_unbridge_channels(pch);
+			break;
+
 		default:
 			down_read(&pch->chan_sem);
 			chan = pch->chan;
@@ -2089,6 +2181,40 @@ static bool ppp_decompress_proto(struct sk_buff *skb)
 	return pskb_may_pull(skb, 2);
 }
 
+/* Attempt to handle a frame via. a bridged channel, if one exists.
+ * If the channel is bridged, the frame is consumed by the bridge.
+ * If not, the caller must handle the frame by normal recv mechanisms.
+ * Returns true if the frame is consumed, false otherwise.
+ */
+static bool ppp_channel_bridge_input(struct channel *pch, struct sk_buff *skb)
+{
+	struct channel *pchb;
+
+	rcu_read_lock();
+	pchb = rcu_dereference(pch->bridge);
+	if (!pchb)
+		goto out_rcu;
+
+	spin_lock(&pchb->downl);
+	if (!pchb->chan) {
+		/* channel got unregistered */
+		kfree_skb(skb);
+		goto outl;
+	}
+
+	skb_scrub_packet(skb, !net_eq(pch->chan_net, pchb->chan_net));
+	if (!pchb->chan->ops->start_xmit(pchb->chan, skb))
+		kfree_skb(skb);
+
+outl:
+	spin_unlock(&pchb->downl);
+out_rcu:
+	rcu_read_unlock();
+
+	/* If pchb is set then we've consumed the packet */
+	return !!pchb;
+}
+
 void
 ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
 {
@@ -2100,6 +2226,10 @@ ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
 		return;
 	}
 
+	/* If the channel is bridged, transmit via. bridge */
+	if (ppp_channel_bridge_input(pch, skb))
+		return;
+
 	read_lock_bh(&pch->upl);
 	if (!ppp_decompress_proto(skb)) {
 		kfree_skb(skb);
@@ -2796,8 +2926,11 @@ ppp_unregister_channel(struct ppp_channel *chan)
 	list_del(&pch->list);
 	spin_unlock_bh(&pn->all_channels_lock);
 
+	ppp_unbridge_channels(pch);
+
 	pch->file.dead = 1;
 	wake_up_interruptible(&pch->file.rwait);
+
 	if (refcount_dec_and_test(&pch->file.refcnt))
 		ppp_destroy_channel(pch);
 }
@@ -3270,7 +3403,7 @@ ppp_connect_channel(struct channel *pch, int unit)
 		goto out;
 	write_lock_bh(&pch->upl);
 	ret = -EINVAL;
-	if (pch->ppp)
+	if (pch->ppp || pch->bridge)
 		goto outl;
 
 	ppp_lock(ppp);
diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
index 7bd2a5a75348..8dbecb3ad036 100644
--- a/include/uapi/linux/ppp-ioctl.h
+++ b/include/uapi/linux/ppp-ioctl.h
@@ -115,6 +115,8 @@ struct pppol2tp_ioc_stats {
 #define PPPIOCATTCHAN _IOW('t', 56, int) /* attach to ppp channel */
 #define PPPIOCGCHAN _IOR('t', 55, int) /* get ppp channel number */
 #define PPPIOCGL2TPSTATS _IOR('t', 54, struct pppol2tp_ioc_stats)
+#define PPPIOCBRIDGECHAN _IOW('t', 53, int) /* bridge one channel to another */
+#define PPPIOCUNBRIDGECHAN _IO('t', 54) /* unbridge channel */
 
 #define SIOCGPPPSTATS (SIOCDEVPRIVATE + 0)
 #define SIOCGPPPVER (SIOCDEVPRIVATE + 1) /* NEVER change this!! */
This new ioctl pair allows two ppp channels to be bridged together:
frames arriving in one channel are transmitted in the other channel
and vice versa.

The practical use for this is primarily to support the L2TP Access
Concentrator use-case. The end-user session is presented as a ppp
channel (typically PPPoE, although it could be e.g. PPPoA, or even PPP
over a serial link) and is switched into a PPPoL2TP session for
transmission to the LNS. At the LNS the PPP session is terminated in
the ISP's network.

When a PPP channel is bridged to another it takes a reference on the
other's struct ppp_file. This reference is dropped when the channels
are unbridged, which can occur either explicitly on userspace calling
the PPPIOCUNBRIDGECHAN ioctl, or implicitly when either channel in the
bridge is unregistered.

In order to implement the channel bridge, struct channel is extended
with a new field, 'bridge', which points to the other struct channel
making up the bridge.

This pointer is RCU protected to avoid adding another lock to the data
path.

To guard against concurrent writes to the pointer, the existing struct
channel lock 'upl' coverage is extended rather than adding a new lock.

The 'upl' lock is used to protect the existing unit pointer. Since the
bridge effectively replaces the unit (they're mutually exclusive for a
channel) it makes coding easier to use the same lock to cover them
both.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
---
 drivers/net/ppp/ppp_generic.c  | 139 ++++++++++++++++++++++++++++++++-
 include/uapi/linux/ppp-ioctl.h |   2 +
 2 files changed, 138 insertions(+), 3 deletions(-)
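From userspace, the ioctl pair described above might be driven roughly as follows. This is a sketch under assumptions rather than a tested recipe: the wrapper names ppp_bridge and ppp_unbridge are hypothetical, chan_fd is assumed to be a /dev/ppp descriptor already attached to a channel with PPPIOCATTCHAN, and peer_idx is assumed to be the other channel's index as returned by PPPIOCGCHAN. The two request codes are defined locally, with the values from the patch, only when the installed headers lack them.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/ppp-ioctl.h>

/* Values taken from the patch; only defined here if the headers lack them. */
#ifndef PPPIOCBRIDGECHAN
#define PPPIOCBRIDGECHAN _IOW('t', 53, int)
#endif
#ifndef PPPIOCUNBRIDGECHAN
#define PPPIOCUNBRIDGECHAN _IO('t', 54)
#endif

/* Bridge the channel behind chan_fd to the channel with index peer_idx.
 * Returns 0 on success or a negative errno value.
 */
static int ppp_bridge(int chan_fd, int peer_idx)
{
	if (ioctl(chan_fd, PPPIOCBRIDGECHAN, &peer_idx) < 0)
		return -errno;
	return 0;
}

/* Tear down the bridge the channel behind chan_fd is part of.
 * Returns 0 on success or a negative errno value; per the patch, the
 * kernel reports EINVAL when the channel is not bridged.
 */
static int ppp_unbridge(int chan_fd)
{
	if (ioctl(chan_fd, PPPIOCUNBRIDGECHAN) < 0)
		return -errno;
	return 0;
}
```

Since the patch sets the bridge pointer on both channels, a single PPPIOCBRIDGECHAN call on one channel's descriptor is enough to forward frames in both directions.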