@@ -465,9 +465,15 @@ EXPORT_SYMBOL(team_mode_unregister);
static const struct team_mode *team_mode_get(const char *kind)
{
- struct team_mode_item *mitem;
const struct team_mode *mode = NULL;
+ struct team_mode_item *mitem;
+ bool put = false;
+#if IS_MODULE(CONFIG_NET_TEAM)
+ if (!try_module_get(THIS_MODULE))
+ return NULL;
+ put = true;
+#endif
spin_lock(&mode_list_lock);
mitem = __find_mode(kind);
if (!mitem) {
@@ -483,6 +489,8 @@ static const struct team_mode *team_mode_get(const char *kind)
}
spin_unlock(&mode_list_lock);
+ if (put)
+ module_put(THIS_MODULE);
return mode;
}
When team mode is changed or set, the team_mode_get() is called to check whether the mode module is inserted or not. If the mode module is not inserted, it calls the request_module(). In the request_module(), it creates a child process, which is the "modprobe" process and waits for the done of the child process. At this point, the following locks were used. down_read(&cb_lock()); by genl_rcv() genl_lock(); by genl_rcv_msc() rtnl_lock(); by team_nl_cmd_options_set() mutex_lock(&team->lock); by team_nl_team_get() Concurrently, the team module could be removed by rmmod or "modprobe -r" The __exit function of team module is team_module_exit(), which calls team_nl_fini() and it tries to acquire following locks. down_write(&cb_lock); genl_lock(); Because of the genl_lock() and cb_lock, this process can't be finished earlier than request_module() routine. The problem secenario. CPU0 CPU1 team_mode_get request_module() modprobe -r team_mode_roundrobin team <--(B) modprobe team <--(A) team_mode_roundrobin By request_module(), the "modprobe team_mode_roundrobin" command will be executed. At this point, the modprobe process will decide that the team module should be inserted before team_mode_roundrobin. Because the team module is being removed. By the module infrastructure, the same module insert/remove operations can't be executed concurrently. So, (A) waits for (B) but (B) also waits for (A) because of locks. So that the hang occurs at this point. Test commands: while : do teamd -d & killall teamd & modprobe -rv team_mode_roundrobin & done The approach of this patch is to hold the reference count of the team module if the team module is compiled as a module. If the reference count of the team module is not zero while request_module() is being called, the team module will not be removed at that moment. So that the above scenario could not occur. Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: Taehee Yoo <ap420073@gmail.com> --- drivers/net/team/team.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)