Message ID | 20201002141531.7081-1-manivannan.sadhasivam@linaro.org |
---|---|
State | New |
Headers | show |
Series | net: qrtr: ns: Fix the incorrect usage of rcu_read_lock() | expand |
Hi, On Fri, Oct 2, 2020 at 7:15 AM Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote: > > The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API > since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence, > fix it by excluding the locking for kernel_sendmsg(). > > Fixes: a7809ff90ce6 ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks") > Reported-by: Doug Anderson <dianders@chromium.org> > Tested-by: Alex Elder <elder@linaro.org> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > --- > net/qrtr/ns.c | 20 ++++++++++++++------ > 1 file changed, 14 insertions(+), 6 deletions(-) > > diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c > index 934999b56d60..0515433de922 100644 > --- a/net/qrtr/ns.c > +++ b/net/qrtr/ns.c > @@ -203,15 +203,17 @@ static int announce_servers(struct sockaddr_qrtr *sq) > /* Announce the list of servers registered in this node */ > radix_tree_for_each_slot(slot, &node->servers, &iter, 0) { > srv = radix_tree_deref_slot(slot); > + rcu_read_unlock(); My RCU-fu is mediocre at best and my radix-tree knowledge is non-existent. However: => Reading through radix_tree_deref_slot() it says that if you are only holding the read lock that you need to be calling radix_tree_deref_retry(). Why don't I see that here? => Without any real knowledge, it seems super sketchy to drop the lock while iterating over the tree. Somehow that feels unsafe. Hrm, there seems to be a function radix_tree_iter_resume() that might be exactly what you want, but I'm not totally sure. The only user I can see in-tree (other than radix tree regression testing) is btrfs-tests.c but it's using it together with radix_tree_deref_slot_protected(). In any case, my totally untested and totally knowedge-free proposal would look something like this: rcu_read_lock(); /* Announce the list of servers registered in this node */ radix_tree_for_each_slot(slot, &node->servers, &iter, 0) { srv = radix_tree_deref_slot(slot); if (!srv) continue; if (radix_tree_deref_retry(srv)) { slot = radix_tree_iter_retry(&iter); continue; } slot = radix_tree_iter_resume(slot, &iter); rcu_read_unlock(); ret = service_announce_new(sq, srv); if (ret < 0) { pr_err("failed to announce new service\n"); return ret; } rcu_read_lock(); } rcu_read_unlock(); What a beast! Given that this doesn't seem to be what anyone else in the kernel is doing exactly, it makes me suspect that there's a more fundamental design issue here, though... -Doug
Hi Doug, On Fri, Oct 02, 2020 at 08:28:51AM -0700, Doug Anderson wrote: > Hi, > > On Fri, Oct 2, 2020 at 7:15 AM Manivannan Sadhasivam > <manivannan.sadhasivam@linaro.org> wrote: > > > > The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API > > since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence, > > fix it by excluding the locking for kernel_sendmsg(). > > > > Fixes: a7809ff90ce6 ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks") > > Reported-by: Doug Anderson <dianders@chromium.org> > > Tested-by: Alex Elder <elder@linaro.org> > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > > --- > > net/qrtr/ns.c | 20 ++++++++++++++------ > > 1 file changed, 14 insertions(+), 6 deletions(-) > > > > diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c > > index 934999b56d60..0515433de922 100644 > > --- a/net/qrtr/ns.c > > +++ b/net/qrtr/ns.c > > @@ -203,15 +203,17 @@ static int announce_servers(struct sockaddr_qrtr *sq) > > /* Announce the list of servers registered in this node */ > > radix_tree_for_each_slot(slot, &node->servers, &iter, 0) { > > srv = radix_tree_deref_slot(slot); > > + rcu_read_unlock(); > > My RCU-fu is mediocre at best and my radix-tree knowledge is > non-existent. However: > > => Reading through radix_tree_deref_slot() it says that if you are > only holding the read lock that you need to be calling > radix_tree_deref_retry(). Why don't I see that here? > Well, I drew inspiration from peer drivers and didn't look into the API documentation properly, my bad :( > => Without any real knowledge, it seems super sketchy to drop the lock > while iterating over the tree. Somehow that feels unsafe. Hrm, there > seems to be a function radix_tree_iter_resume() that might be exactly > what you want, but I'm not totally sure. The only user I can see > in-tree (other than radix tree regression testing) is btrfs-tests.c > but it's using it together with radix_tree_deref_slot_protected(). > > In any case, my totally untested and totally knowedge-free proposal > would look something like this: > > rcu_read_lock(); > /* Announce the list of servers registered in this node */ > radix_tree_for_each_slot(slot, &node->servers, &iter, 0) { > srv = radix_tree_deref_slot(slot); > if (!srv) > continue; > if (radix_tree_deref_retry(srv)) { > slot = radix_tree_iter_retry(&iter); > continue; > } > slot = radix_tree_iter_resume(slot, &iter); > rcu_read_unlock(); > > ret = service_announce_new(sq, srv); > if (ret < 0) { > pr_err("failed to announce new service\n"); > return ret; > } > > rcu_read_lock(); > } > > rcu_read_unlock(); > > What a beast! Given that this doesn't seem to be what anyone else in > the kernel is doing exactly, it makes me suspect that there's a more > fundamental design issue here, though... > That's how it is supposed to be. So I'm going to roll out next revision with your suggestion for the rest of the deref_slot() calls also. Thanks for your time looking into this. Regards, Mani > -Doug
diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c index 934999b56d60..0515433de922 100644 --- a/net/qrtr/ns.c +++ b/net/qrtr/ns.c @@ -203,15 +203,17 @@ static int announce_servers(struct sockaddr_qrtr *sq) /* Announce the list of servers registered in this node */ radix_tree_for_each_slot(slot, &node->servers, &iter, 0) { srv = radix_tree_deref_slot(slot); + rcu_read_unlock(); ret = service_announce_new(sq, srv); if (ret < 0) { pr_err("failed to announce new service\n"); - goto err_out; + return ret; } + + rcu_read_lock(); } -err_out: rcu_read_unlock(); return ret; @@ -352,7 +354,9 @@ static int ctrl_cmd_bye(struct sockaddr_qrtr *from) /* Advertise removal of this client to all servers of remote node */ radix_tree_for_each_slot(slot, &node->servers, &iter, 0) { srv = radix_tree_deref_slot(slot); + rcu_read_unlock(); server_del(node, srv->port); + rcu_read_lock(); } rcu_read_unlock(); @@ -368,6 +372,7 @@ static int ctrl_cmd_bye(struct sockaddr_qrtr *from) rcu_read_lock(); radix_tree_for_each_slot(slot, &local_node->servers, &iter, 0) { srv = radix_tree_deref_slot(slot); + rcu_read_unlock(); sq.sq_family = AF_QIPCRTR; sq.sq_node = srv->node; @@ -379,11 +384,11 @@ static int ctrl_cmd_bye(struct sockaddr_qrtr *from) ret = kernel_sendmsg(qrtr_ns.sock, &msg, &iv, 1, sizeof(pkt)); if (ret < 0) { pr_err("failed to send bye cmd\n"); - goto err_out; + return ret; } + rcu_read_lock(); } -err_out: rcu_read_unlock(); return ret; @@ -447,6 +452,7 @@ static int ctrl_cmd_del_client(struct sockaddr_qrtr *from, rcu_read_lock(); radix_tree_for_each_slot(slot, &local_node->servers, &iter, 0) { srv = radix_tree_deref_slot(slot); + rcu_read_unlock(); sq.sq_family = AF_QIPCRTR; sq.sq_node = srv->node; @@ -458,11 +464,11 @@ static int ctrl_cmd_del_client(struct sockaddr_qrtr *from, ret = kernel_sendmsg(qrtr_ns.sock, &msg, &iv, 1, sizeof(pkt)); if (ret < 0) { pr_err("failed to send del client cmd\n"); - goto err_out; + return ret; } + rcu_read_lock(); } -err_out: rcu_read_unlock(); return ret; @@ -580,7 +586,9 @@ static int ctrl_cmd_new_lookup(struct sockaddr_qrtr *from, if (!server_match(srv, &filter)) continue; + rcu_read_unlock(); lookup_notify(from, srv, true); + rcu_read_lock(); } } rcu_read_unlock();