From patchwork Tue May 13 09:57:49 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Juri Lelli
X-Patchwork-Id: 30021
Date: Tue, 13 May 2014 11:57:49 +0200
From: Juri Lelli
To: Peter Zijlstra
Cc: "Michael Kerrisk (man-pages)", Dario Faggioli, Ingo Molnar, lkml, Dave Jones
Subject: Re: [BUG] sched_setattr() SCHED_DEADLINE hangs system
Message-Id: <20140513115749.ebf3eebc64e44aac6f183410@gmail.com>
In-Reply-To: <20140512123032.GC13467@laptop.programming.kicks-ass.net>
References: <536F8F0E.7020301@gmail.com>
 <53707007.3080003@gmail.com>
 <20140512084727.GN30445@twins.programming.kicks-ass.net>
 <20140512123032.GC13467@laptop.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

On Mon, 12 May 2014 14:30:32 +0200 Peter Zijlstra wrote:

> On Mon, May 12, 2014 at 11:19:39AM +0200, Michael Kerrisk (man-pages) wrote:
> > Hi Peter,
> >
> > On Mon, May 12, 2014 at 10:47 AM, Peter Zijlstra wrote:
> > > On Mon, May 12, 2014 at 08:53:59AM +0200, Michael Kerrisk (man-pages) wrote:
> > >> On 05/11/2014 04:54 PM, Michael Kerrisk (man-pages) wrote:
> > >
> > >> > $ time sudo ./t_sched_setattr d 18446744072 18446744072 18446744073
> > >>
> > >> I realize my speculation
> > >> was completely off the mark. time(2) really
> > >> is reporting the truth, and the sched_setattr() call returns immediately.
> > >> But it looks like with these settings the deadline scheduler gets itself
> > >> into a confused state. The process chews up a vast amount of CPU time
> > >> for the few actions (including process teardown) that occur after
> > >> the sched_setattr() call, and since the SCHED_DEADLINE process has
> > >> priority over everything else, the system locks up.
> > >
> > > Yeah, its doing something weird alright.. let me see if I can get
> > > something useful out.
> >
> > Thanks!
>
> So I think its because the way we check wrapping
>
> 	(s64)(a - b) < 0
>
> This means that its impossible to tell if time went fwd or bwd with
> 64bit increments. I've not entirely pinpointed where this is wrecking
> things, but it seems like a fair bet this is what's going wrong.
>
> So I'm tempted to put a sanity check on all these values to make sure <=
> 2^63. That way the wrapping logic in the kernel keeps working.
>
> And 2^63 [ns] should be plenty large enough for everyone (famous last
> words of course).
>

Does the following fix the thing?

Thanks,

- Juri

---
>From 90a7603a0b6b620c9d07e3f375906b436dcc2230 Mon Sep 17 00:00:00 2001
From: Juri Lelli
Date: Tue, 13 May 2014 10:15:59 +0200
Subject: [PATCH] sched/deadline: restrict user params max value to 2^63 ns

Michael Kerrisk noticed that creating SCHED_DEADLINE reservations
with certain parameters (e.g., a runtime of something near 2^64 ns)
can cause a system freeze for some amount of time.

The problem is that in the interface we have

	u64 sched_runtime;

while internally we need to have a signed runtime (to cope with
budget overruns)

	s64 runtime;

At the time we set up a new dl_entity we copy the first value into
the second. The cast yields a negative value when sched_runtime is
2^63 ns or larger, and this causes the scheduler to go crazy right
from the start.

Moreover, considering how we deal with deadline wraparound

	(s64)(a - b) < 0

we also have to restrict acceptable values for sched_{deadline,period}.

This patch fixes the problem by checking that user parameters are always
below 2^63 ns (still large enough for everyone). It also rewrites the
other conditions that we check, since in __checkparam_dl we don't have
to deal with deadline wraparound, and the current checks erroneously
fail when the difference between values is too big.

Reported-by: Michael Kerrisk
Suggested-by: Peter Zijlstra
Signed-off-by: Juri Lelli
---
 kernel/sched/core.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d9d8ece..96ba59d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3188,17 +3188,21 @@ __getparam_dl(struct task_struct *p, struct sched_attr *attr)
  * We ask for the deadline not being zero, and greater or equal
  * than the runtime, as well as the period of being zero or
  * greater than deadline. Furthermore, we have to be sure that
- * user parameters are above the internal resolution (1us); we
- * check sched_runtime only since it is always the smaller one.
+ * user parameters are above the internal resolution of 1us (we
+ * check sched_runtime only since it is always the smaller one) and
+ * below 2^63 ns (we have to check both sched_deadline and
+ * sched_period, as the latter can be zero).
  */
 static bool
 __checkparam_dl(const struct sched_attr *attr)
 {
 	return attr && attr->sched_deadline != 0 &&
 		(attr->sched_period == 0 ||
-		(s64)(attr->sched_period - attr->sched_deadline) >= 0) &&
-		(s64)(attr->sched_deadline - attr->sched_runtime ) >= 0 &&
-		attr->sched_runtime >= (2 << (DL_SCALE - 1));
+		(attr->sched_period >= attr->sched_deadline)) &&
+		(attr->sched_deadline >= attr->sched_runtime) &&
+		attr->sched_runtime >= (1ULL << DL_SCALE) &&
+		(attr->sched_deadline < (1ULL << 63) &&
+		attr->sched_period < (1ULL << 63));
 }
 
 /*
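
For anyone who wants to poke at the arithmetic outside the kernel, here is a
rough user-space sketch of the two failure modes described in the changelog
and of the new parameter check. It is not kernel code: struct dl_params,
to_internal_runtime(), dl_before() and checkparam_dl() are hypothetical names
made up for this illustration, and DL_SCALE is assumed to be 10 (about 1us),
which should be checked against kernel/sched/sched.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: DL_SCALE is 10 in the kernel, i.e. a 1024 ns (~1us) resolution. */
#define DL_SCALE 10

/* Hypothetical stand-in for the three sched_attr fields the check looks at. */
struct dl_params {
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

/*
 * Models copying the u64 user value into the signed internal runtime.
 * On the usual two's complement targets, any value >= 2^63 comes out
 * negative (the conversion is implementation-defined in ISO C).
 */
static int64_t to_internal_runtime(uint64_t sched_runtime)
{
	return (int64_t)sched_runtime;
}

/*
 * Wraparound-tolerant "a is before b". The answer is only meaningful
 * while the two values are less than 2^63 apart.
 */
static bool dl_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Same conditions as the patched __checkparam_dl(), on the stand-in struct. */
static bool checkparam_dl(const struct dl_params *attr)
{
	return attr && attr->sched_deadline != 0 &&
		(attr->sched_period == 0 ||
		 attr->sched_period >= attr->sched_deadline) &&
		attr->sched_deadline >= attr->sched_runtime &&
		attr->sched_runtime >= (1ULL << DL_SCALE) &&
		attr->sched_deadline < (1ULL << 63) &&
		attr->sched_period < (1ULL << 63);
}
```

With these helpers, a runtime such as 18446744072000000000 ns (an illustrative
value just under 2^64) converts to a negative internal runtime, dl_before(0,
2^63 + 1) reports that 0 is not before the larger value, and checkparam_dl()
rejects the parameter set, while something like runtime 2^20, deadline 2^21,
period 0 is accepted.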