diff mbox

PROBLEM: Kernel OOPS and possible system freeze after concurrent writing to cpufreq/scaling_governor (Resend)

Message ID CAKohpo=DeQQq505U_EDsJ+5dPGPuSOypx_YbTYW5owj4FDYLpA@mail.gmail.com
State New
Headers show

Commit Message

Viresh Kumar Sept. 8, 2014, 10:56 a.m. UTC
On 8 September 2014 13:46, Robert Schöne <robert.schoene@tu-dresden.de> wrote:
> (Sorry for the resend, I forgot to disable my S/MIME signature)
>
> The patch you suggested did not work, so I introduced a new mutex in the
> patch below.

Okay, let me apologize first. This thread has taken much longer than I
expected, to some level due to me. I have changed my job recently
and have been running busy with new assignments, etc..

> I am not happy with adding just another mutex, but it fixes my problem
> of changing governors concurrently.

So, yeah back to the problem.

Can you please try attached patch with the revert earlier suggested?
To make it clear again, cherry-pick: 19c7630 and then apply  this
patch.

Let me know if it still doesn't work for you..

I don't have a local setup to test this and so its just compile tested.

Cc'd few more people who also reported similar issues.

--
viresh

Comments

Robert Schöne Sept. 8, 2014, 12:28 p.m. UTC | #1
Am Montag, den 08.09.2014, 16:26 +0530 schrieb Viresh Kumar:
> Can you please try attached patch with the revert earlier suggested?
> To make it clear again, cherry-pick: 19c7630 and then apply  this
> patch.

The patch works for me.

Thank you!

PS: With this patch I get an EINVAL when writing fails (instead of an
EBUSY). Is this intended? 


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar Sept. 8, 2014, 12:57 p.m. UTC | #2
On 8 September 2014 17:58, Robert Schöne <robert.schoene@tu-dresden.de> wrote:
> The patch works for me.
>
> Thank you!
>
> PS: With this patch I get an EINVAL when writing fails (instead of an
> EBUSY). Is this intended?

That's because cpufreq_set_policy() returns.. We can fix it separately if we
really want.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Sept. 8, 2014, 9:14 p.m. UTC | #3
On Monday, September 08, 2014 04:26:42 PM Viresh Kumar wrote:
> 
> --001a11c2eb5293ba1305028bac8b
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: quoted-printable
> 
> On 8 September 2014 13:46, Robert Sch=C3=B6ne <robert.schoene@tu-dresden.de=
> > wrote:
> > (Sorry for the resend, I forgot to disable my S/MIME signature)
> >
> > The patch you suggested did not work, so I introduced a new mutex in the
> > patch below.
> 
> Okay, let me apologize first. This thread has taken much longer than I
> expected, to some level due to me. I have changed my job recently
> and have been running busy with new assignments, etc..
> 
> > I am not happy with adding just another mutex, but it fixes my problem
> > of changing governors concurrently.
> 
> So, yeah back to the problem.
> 
> Can you please try attached patch with the revert earlier suggested?
> To make it clear again, cherry-pick: 19c7630 and then apply  this
> patch.
> 
> Let me know if it still doesn't work for you..
> 
> I don't have a local setup to test this and so its just compile tested.
> 
> Cc'd few more people who also reported similar issues.

Some mailer (mine or yours) mangled the attachment.  I can see it in patchwork,
though.

It looks reasonable to me.

Am I supposed to pick it up?  If so, do we need it in -stable?  If so, which
-stable (or what commit does it fix)?

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar Sept. 9, 2014, 4:18 a.m. UTC | #4
On 9 September 2014 02:44, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> Some mailer (mine or yours) mangled the attachment.  I can see it in patchwork,
> though.

I didn't knew that they play with attachments as well :)

> It looks reasonable to me.
>
> Am I supposed to pick it up?  If so, do we need it in -stable?  If so, which
> -stable (or what commit does it fix)?

I wanted to send this separately and so attached it here. I have sent
two patches
to you with details about stable as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

From 250686bfdbb183a9d798b071d32972b65d35c915 Mon Sep 17 00:00:00 2001
Message-Id: <250686bfdbb183a9d798b071d32972b65d35c915.1410173112.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Mon, 8 Sep 2014 16:01:15 +0530
Subject: [PATCH] cpufreq: Track governor-state with 'policy->governor_state'

Even after serializing calls to __cpufreq_governor() there are some races left.
The races are around doing the invalid operation during some state of cpufreq
governors. For example, while the governor is in CPUFREQ_GOV_POLICY_EXIT state,
we can't do CPUFREQ_GOV_START without first doing CPUFREQ_GOV_POLICY_INIT.

All these cases weren't handled elegantly in __cpufreq_governor() and so there
were enough chances that things may go wrong when governors are changed with
multiple thread.

This patch renames an existing field 'policy->governor_enabled' with
'policy->governor_state' which can have more values than 0 & 1 now.

We maintain the current state of governors for each policy now and reject any
invalid operation.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c          | 26 +++++++++++++-------------
 drivers/cpufreq/cpufreq_governor.c |  2 +-
 include/linux/cpufreq.h            |  2 +-
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a7ceae3..c597361 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -935,6 +935,7 @@  static void cpufreq_init_policy(struct cpufreq_policy *policy)
 	struct cpufreq_policy new_policy;
 	int ret = 0;
 
+	policy->governor_state = CPUFREQ_GOV_POLICY_EXIT;
 	memcpy(&new_policy, policy, sizeof(*policy));
 
 	/* Update governor of new_policy to the governor used before hotplug */
@@ -1976,7 +1977,7 @@  EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 static int __cpufreq_governor(struct cpufreq_policy *policy,
 					unsigned int event)
 {
-	int ret;
+	int ret, state;
 
 	/* Only must be defined when default governor is known to have latency
 	   restrictions, like e.g. conservative or ondemand.
@@ -2012,19 +2013,21 @@  static int __cpufreq_governor(struct cpufreq_policy *policy,
 		 policy->cpu, event);
 
 	mutex_lock(&cpufreq_governor_lock);
+	state = policy->governor_state;
+
+	/* Check if operation is permitted or not */
 	if (policy->governor_busy
-	    || (policy->governor_enabled && event == CPUFREQ_GOV_START)
-	    || (!policy->governor_enabled
-	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
+	    || (state == CPUFREQ_GOV_START && event != CPUFREQ_GOV_LIMITS && event != CPUFREQ_GOV_STOP)
+	    || (state == CPUFREQ_GOV_STOP && event != CPUFREQ_GOV_START && event != CPUFREQ_GOV_POLICY_EXIT)
+	    || (state == CPUFREQ_GOV_POLICY_INIT && event != CPUFREQ_GOV_START && event != CPUFREQ_GOV_POLICY_EXIT)
+	    || (state == CPUFREQ_GOV_POLICY_EXIT && event != CPUFREQ_GOV_POLICY_INIT)) {
 		mutex_unlock(&cpufreq_governor_lock);
 		return -EBUSY;
 	}
 
 	policy->governor_busy = true;
-	if (event == CPUFREQ_GOV_STOP)
-		policy->governor_enabled = false;
-	else if (event == CPUFREQ_GOV_START)
-		policy->governor_enabled = true;
+	if (event != CPUFREQ_GOV_LIMITS)
+		policy->governor_state = event;
 
 	mutex_unlock(&cpufreq_governor_lock);
 
@@ -2035,13 +2038,10 @@  static int __cpufreq_governor(struct cpufreq_policy *policy,
 			policy->governor->initialized++;
 		else if (event == CPUFREQ_GOV_POLICY_EXIT)
 			policy->governor->initialized--;
-	} else {
+	} else if (event != CPUFREQ_GOV_LIMITS) {
 		/* Restore original values */
 		mutex_lock(&cpufreq_governor_lock);
-		if (event == CPUFREQ_GOV_STOP)
-			policy->governor_enabled = true;
-		else if (event == CPUFREQ_GOV_START)
-			policy->governor_enabled = false;
+		policy->governor_state = state;
 		mutex_unlock(&cpufreq_governor_lock);
 	}
 
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 1b44496..d173181 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -174,7 +174,7 @@  void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
 	int i;
 
 	mutex_lock(&cpufreq_governor_lock);
-	if (!policy->governor_enabled)
+	if (policy->governor_state != CPUFREQ_GOV_START)
 		goto out_unlock;
 
 	if (!all_cpus) {
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index c7aa96b..39ddd3e 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -81,7 +81,7 @@  struct cpufreq_policy {
 	unsigned int		policy; /* see above */
 	struct cpufreq_governor	*governor; /* see below */
 	void			*governor_data;
-	bool			governor_enabled; /* governor start/stop flag */
+	bool			governor_state; /* Governor's current state */
 	bool			governor_busy;
 
 	struct work_struct	update; /* if update_policy() needs to be
-- 
2.0.3.693.g996b0fd