From patchwork Fri Nov 7 01:44:40 2014
X-Patchwork-Submitter: wangyijing
X-Patchwork-Id: 40372
Message-ID: <545C2408.60703@huawei.com>
Date:
Fri, 7 Nov 2014 09:44:40 +0800
From: Yijing Wang
To: Greg KH, Tejun Heo
CC: Weng Meiling
Subject: Re: [PATCH] sysfs: driver core: Fix glue dir race condition
References: <1415261798-9671-1-git-send-email-wangyijing@huawei.com> <20141106165547.GG25642@htj.dyndns.org> <20141106172246.GA20192@kroah.com>
In-Reply-To: <20141106172246.GA20192@kroah.com>
X-Mailing-List: stable@vger.kernel.org

On 2014/11/7 1:22, Greg KH wrote:
> On Thu, Nov 06, 2014 at 11:55:47AM -0500, Tejun Heo wrote:
>> Maybe "fix glue dir race condition by not removing them" is a better
>> title?
>>
>> On Thu, Nov 06, 2014 at 04:16:38PM +0800, Yijing Wang wrote:
>>> There is a race condition when removing glue directory.
>>> It can be reproduced in following test:
>>>
>>> path 1: Add first child device
>>> device_add()
>>>     get_device_parent()
>>>         /*find parent from glue_dirs.list*/
>>>         list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
>>>             if (k->parent == parent_kobj) {
>>>                 kobj = kobject_get(k);
>>>                 break;
>>>             }
>>>         ....
>>>         class_dir_create_and_add()
>>>
>>> path2: Remove last child device under glue dir
>>> device_del()
>>>     cleanup_device_parent()
>>>         cleanup_glue_dir()
>>>             kobject_put(glue_dir);
>>>
>>> If path2 has been called cleanup_glue_dir(), but not
>>> call kobject_put(glue_dir), the glue dir is still
>>> in parent's kset list. Meanwhile, path1 find the glue
>>> dir from the glue_dirs.list. Path2 may release glue dir
>>> before path1 call kobject_get(). So kernel will report
>>> the warning and bug_on.
>>>
>>> This fix keep glue dir around once it created suggested
>>> by Tejun Heo.
>>
>> I think you prolly want to explain why this is okay / desired.
>> e.g. list how the glue dir is used and how many of them are there and
>> explain that there's no real benefit in removing them.
>
> I'd really _like_ to remove them if at all possible, as if there isn't
> any "children" in the subdirectory, there shouldn't be a need for that
> directory to be there.
>
> This seems to be the "classic" problem we have of a kref in a list that
> can be found while the last instance could be removed at the same time.
> I hate to just throw another lock at the problem, but wouldn't a lock to
> protect the list of glue_dirs be the answer here?

Hi Greg, in this case we need to close the race between the traversal of
dev->class->p->glue_dirs.list and the kobject_put(glue_dir) in
cleanup_glue_dir(). glue_dirs.list_lock is only used to protect
glue_dirs.list itself; what we need to prevent is kobject_put(glue_dir)
dropping the glue_dir reference count while we are traversing
dev->class->p->glue_dirs.list.

---------------------------------------------------------------------------
	/* find our class-directory at the parent and reference it */
	spin_lock(&dev->class->p->glue_dirs.list_lock);
	list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)   ------>A
		if (k->parent == parent_kobj) {
			kobj = kobject_get(k);
			break;
		}
	spin_unlock(&dev->class->p->glue_dirs.list_lock);
------------------------------------------------------------------------------
static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
{
	/* see if we live in a "glue" directory */
	if (!glue_dir || !dev->class ||
	    glue_dir->kset != &dev->class->p->glue_dirs)
		return;

	kobject_put(glue_dir);   --------------->B
}
------------------------------------------------------------------------------

Tejun introduced a mutex, gdp_mutex, in commit 77d3d7c1d561f49 to fix a race
condition in get_device_parent(). We could reuse that mutex to fix the race
between the glue_dirs.list traversal (A) and kobject_put(glue_dir) (B).

Greg, of the two solutions (reusing gdp_mutex, or not removing the glue dir),
which one do you prefer?

>
> thanks,
>
> greg k-h
>
> .
>

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 28b808c..645eacf 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -724,12 +724,12 @@ class_dir_create_and_add(struct class *class, struct kobject *parent_kobj)
 	return &dir->kobj;
 }
 
+static DEFINE_MUTEX(gdp_mutex);
 
 static struct kobject *get_device_parent(struct device *dev,
 					 struct device *parent)
 {
 	if (dev->class) {
-		static DEFINE_MUTEX(gdp_mutex);
 		struct kobject *kobj = NULL;
 		struct kobject *parent_kobj;
 		struct kobject *k;
@@ -793,7 +793,9 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 	    glue_dir->kset != &dev->class->p->glue_dirs)
 		return;
 
+	mutex_lock(&gdp_mutex);
 	kobject_put(glue_dir);
+	mutex_unlock(&gdp_mutex);
 }
 
 static void cleanup_device_parent(struct device *dev)
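
For illustration only, here is a minimal userspace sketch of the locking idea in the diff above: the lookup-and-get path and the final put take the same mutex, so a lookup can never find a node whose last reference is being dropped. This is not kernel code. struct glue, glue_get(), glue_put() and glue_count() are hypothetical names, a pthread mutex stands in for the kernel mutex, and a plain int under the lock stands in for the kobject's kref.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a glue directory: a refcounted list node. */
struct glue {
	int refcount;		/* protected by gdp_mutex in this sketch */
	const void *parent;	/* lookup key, playing the role of k->parent */
	struct glue *next;
};

/* One lock covers both the list and the refcounts (assumption of the sketch). */
static pthread_mutex_t gdp_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct glue *glue_dirs;	/* list head, protected by gdp_mutex */

/* Find the node for this parent and take a reference, or create it. */
static struct glue *glue_get(const void *parent)
{
	struct glue *k;

	pthread_mutex_lock(&gdp_mutex);
	for (k = glue_dirs; k; k = k->next)
		if (k->parent == parent)
			break;
	if (k) {
		k->refcount++;	/* safe: no put can run concurrently */
	} else {
		k = calloc(1, sizeof(*k));
		if (k) {
			k->refcount = 1;
			k->parent = parent;
			k->next = glue_dirs;
			glue_dirs = k;
		}
	}
	pthread_mutex_unlock(&gdp_mutex);
	return k;
}

/* Drop a reference; the last put unlinks and frees under the same lock. */
static void glue_put(struct glue *g)
{
	struct glue **pp;

	pthread_mutex_lock(&gdp_mutex);
	if (--g->refcount == 0) {
		for (pp = &glue_dirs; *pp; pp = &(*pp)->next)
			if (*pp == g) {
				*pp = g->next;
				break;
			}
		free(g);
	}
	pthread_mutex_unlock(&gdp_mutex);
}

/* Number of nodes currently on the list (used only for checking). */
static int glue_count(void)
{
	int n = 0;
	struct glue *k;

	pthread_mutex_lock(&gdp_mutex);
	for (k = glue_dirs; k; k = k->next)
		n++;
	pthread_mutex_unlock(&gdp_mutex);
	return n;
}
```

Because the decrement, the unlink and the free all happen inside the critical section, the window between "last reference dropped" and "node removed from the list" is never visible to glue_get(). The kernel case differs in detail (the kobject release path removes the entry from the kset under its own list_lock), so treat this only as a model of why sharing one lock between A and B closes the race.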