vl: rework smp_parse

Message ID	1415290175-17314-1-git-send-email-drjones@redhat.com
State	New
Headers	show Return-Path: <patchwork-forward+bncBDY2J6HAR4PBBKF452RAKGQE7FMSBNQ@linaro.org> MIME-Version: 1.0 Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.42 as permitted sender) client-ip=209.85.215.42; Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; From: Andrew Jones <drjones@redhat.com> To: qemu-devel@nongnu.org Date: Thu, 6 Nov 2014 17:09:35 +0100 Message-Id: <1415290175-17314-1-git-send-email-drjones@redhat.com> Cc: ehabkost@redhat.com Subject: [Qemu-devel] [PATCH] vl: rework smp_parse Precedence: list Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org

Andrew Jones Nov. 6, 2014, 4:09 p.m. UTC

smp_parse has a couple problems. First, it should use max_cpus,
not smp_cpus when calculating missing topology information.
Conversely, if maxcpus is not input, then the topology should
dictate max_cpus, as the topology may support more than the
input smp_cpus number. Second, smp_parse shouldn't silently
adjust the number of threads a user provides, which it currently
does in order to ensure the topology supports the number
of cpus provided. smp_parse should rather complain and exit
when input isn't consistent. This patch fixes those issues and
attempts to clarify the code a bit too.

I don't believe we need to consider compatibility with the old
behaviors. Specifying something like

  -smp 4,sockets=1,cores=1,threads=1

is wrong, even though it currently works (the number of threads
is silently adjusted to 4). And, specifying something like

  -smp 4,sockets=1,cores=4,threads=1,maxcpus=8

is also wrong, as there's no way to hotplug the additional 4 cpus
with this topology. So, any users doing these things should be
corrected. The new error message this patch provides should help
them do that.

Below are some examples with before/after results

// topology should support up to max_cpus
< -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=4 threads=1 smp_cpus=4 max_cpus=8

// topology supports more than smp_cpus, max_cpus should be higher
< -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=4
> -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=8

// shouldn't silently adjust threads
< -smp 4,sockets=1,cores=2,threads=1            sockets=1 cores=2 threads=2 smp_cpus=4 max_cpus=4
> -smp 4,sockets=1,cores=2,threads=1            "maxcpus must be equal to or greater than smp"
< -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  "cpu topology: sockets (2) * cores (2) * threads (4) != max_cpus (8)"

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 vl.c | 59 ++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 38 insertions(+), 21 deletions(-)

Eduardo Habkost Nov. 6, 2014, 7:17 p.m. UTC | #1

On Thu, Nov 06, 2014 at 05:09:35PM +0100, Andrew Jones wrote:
> smp_parse has a couple problems. First, it should use max_cpus,
> not smp_cpus when calculating missing topology information.
> Conversely, if maxcpus is not input, then the topology should
> dictate max_cpus, as the topology may support more than the
> input smp_cpus number. Second, smp_parse shouldn't silently
> adjust the number of threads a user provides, which it currently
> does in order to ensure the topology supports the number
> of cpus provided. smp_parse should rather complain and exit
> when input isn't consistent. This patch fixes those issues and
> attempts to clarify the code a bit too.
> 
> I don't believe we need to consider compatibility with the old
> behaviors. Specifying something like
> 
>   -smp 4,sockets=1,cores=1,threads=1
> 
> is wrong, even though it currently works (the number of threads
> is silently adjusted to 4).

Agreed, in this case.

> And, specifying something like
> 
>   -smp 4,sockets=1,cores=4,threads=1,maxcpus=8
> 
> is also wrong, as there's no way to hotplug the additional 4 cpus
> with this topology. So, any users doing these things should be
> corrected. The new error message this patch provides should help
> them do that.

In this case, you are proposing a new meaning for "sockets", and it
makes sense (I as suppose people understand "sockets" to mean all
sockets, even the empty ones). But it looks like it will break
compatility on cases we don't want to.

For example, the case below is currently valid, and probably can be
generated by libvirt today. But you are now rejecting it:

 -smp 30,sockets=5,cores=3,threads=2,maxcpus=60   "cpu topology: sockets (5) * cores (3) * threads (2) != max_cpus (60)"

This is currently valid because "sockets" today means "online sockets",
not "all sockets, even the empty ones".

It looks like the patch also breaks automatic socket count calculation,
which (AFAIK) works today:

I get:
 -smp 30,cores=3,threads=2        "maxcpus must be equal to or greater than smp"
but I expect:
 -smp 30,cores=3,threads=2        sockets=5 cores=3 threads=2 smp_cpus=30 max_cpus=30

(And the error message is confusing: I am not even setting max_cpus, why
is it saying maxcpus is wrong?)

> 
> Below are some examples with before/after results
> 
> // topology should support up to max_cpus
> < -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=4 threads=1 smp_cpus=4 max_cpus=8

So, like in my first example above, we are breaking compatibility here
because the meaning of "sockets" changed. Do we want to?

(Unless otherwise noted in my other comments below, I am using the new
meaning for "sockets", not the old one.)

> 
> // topology supports more than smp_cpus, max_cpus should be higher
> < -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=4
> > -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=8

There are also existing cases where the user could simply expect
smp_threads to be calculated automatically. You seem to cover that, but
explicit testing would be nice. e.g., to make sure we won't break stuff
like:

 -smp  8,sockets=2,cores=2               sockets=2 cores=2 threads=2 smp_cpus=8 max_cpus=8
 -smp 12,sockets=2,cores=2               sockets=2 cores=2 threads=3 smp_cpus=12 max_cpus=12

Anyway, I see a problem here because we are trying to automatically
calculate two variables (max_cpus and smp_threads) when multiple
solutions may exist. We are just trying to choose the solution that
makes more sense, I guess, but do we really want to go down that path?

e.g. what about this one:
  -smp 6,sockets=2,cores=2

It has a solution:
  sockets=2 cores=2 threads=3 smp_cpus=6 max_cpus=12
but should the code find it?

> 
> // shouldn't silently adjust threads
> < -smp 4,sockets=1,cores=2,threads=1            sockets=1 cores=2 threads=2 smp_cpus=4 max_cpus=4
> > -smp 4,sockets=1,cores=2,threads=1            "maxcpus must be equal to or greater than smp"

I find it very confusing that the error message mentions "maxcpus" when
it was never used in the command-line. But I agree we should reject the
above configuration, as we must never silently adjust any option that
was explicitly set.

> < -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  "cpu topology: sockets (2) * cores (2) * threads (4) != max_cpus (8)"

In this specific case I would expect it to abort too, because both
max_cpus (8) and smp_cpus (4) don't match sockets*cores*threads (16).

Paolo Bonzini Nov. 6, 2014, 10:11 p.m. UTC | #2

On 06/11/2014 17:09, Andrew Jones wrote:
> +        if (sockets * cores * threads != max_cpus) {
> +            fprintf(stderr, "cpu topology: "
> +                    "sockets (%u) * cores (%u) * threads (%u) != max_cpus (%u)\n",
> +                    sockets, cores, threads, max_cpus);
> +            exit(1);
> +        }

I think this would cause too many failures in the wild.  Perhaps error
out if it is lower, and warn if sockets * cores * threads > max_cpus
since we actually allow hot-plug a thread at a time?

Paolo

Andrew Jones Nov. 7, 2014, 9:22 a.m. UTC | #3

On Thu, Nov 06, 2014 at 05:17:44PM -0200, Eduardo Habkost wrote:
> On Thu, Nov 06, 2014 at 05:09:35PM +0100, Andrew Jones wrote:
> > smp_parse has a couple problems. First, it should use max_cpus,
> > not smp_cpus when calculating missing topology information.
> > Conversely, if maxcpus is not input, then the topology should
> > dictate max_cpus, as the topology may support more than the
> > input smp_cpus number. Second, smp_parse shouldn't silently
> > adjust the number of threads a user provides, which it currently
> > does in order to ensure the topology supports the number
> > of cpus provided. smp_parse should rather complain and exit
> > when input isn't consistent. This patch fixes those issues and
> > attempts to clarify the code a bit too.
> > 
> > I don't believe we need to consider compatibility with the old
> > behaviors. Specifying something like
> > 
> >   -smp 4,sockets=1,cores=1,threads=1
> > 
> > is wrong, even though it currently works (the number of threads
> > is silently adjusted to 4).
> 
> Agreed, in this case.
> 
> > And, specifying something like
> > 
> >   -smp 4,sockets=1,cores=4,threads=1,maxcpus=8
> > 
> > is also wrong, as there's no way to hotplug the additional 4 cpus
> > with this topology. So, any users doing these things should be
> > corrected. The new error message this patch provides should help
> > them do that.
> 
> In this case, you are proposing a new meaning for "sockets", and it
> makes sense (I as suppose people understand "sockets" to mean all
> sockets, even the empty ones). But it looks like it will break
> compatility on cases we don't want to.
> 
> For example, the case below is currently valid, and probably can be
> generated by libvirt today. But you are now rejecting it:
> 
>  -smp 30,sockets=5,cores=3,threads=2,maxcpus=60   "cpu topology: sockets (5) * cores (3) * threads (2) != max_cpus (60)"
> 
> This is currently valid because "sockets" today means "online sockets",
> not "all sockets, even the empty ones".

Will hotplug still work correctly with this meaning? I expect that
we need holes to fill.

> 
> 
> It looks like the patch also breaks automatic socket count calculation,
> which (AFAIK) works today:
> 
> I get:
>  -smp 30,cores=3,threads=2        "maxcpus must be equal to or greater than smp"
> but I expect:
>  -smp 30,cores=3,threads=2        sockets=5 cores=3 threads=2 smp_cpus=30 max_cpus=30

True. I should have preserved that behavior.

The current code actually ends up with

  sockets=1 cores=3 threads=2 cpus=30 maxcpus=30

in smp_parse(), which is nonsense, but as the sockets count isn't saved
(it's always derivable from smp_cpus, cores, threads), then in practice
it doesn't matter.

> 
> (And the error message is confusing: I am not even setting max_cpus, why
> is it saying maxcpus is wrong?)
>

I should have added an underscore to maxcpus in that error message, as
max_cpus is either maxcpus or the result of the topology now.

> 
> > 
> > Below are some examples with before/after results
> > 
> > // topology should support up to max_cpus
> > < -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > > -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=4 threads=1 smp_cpus=4 max_cpus=8
> 
> So, like in my first example above, we are breaking compatibility here
> because the meaning of "sockets" changed. Do we want to?
>

I guess the main questions are:
  - do we need to make this change for hotplug?
  - do we need to make this change to make the command line more
    intuitive?
 
> 
> (Unless otherwise noted in my other comments below, I am using the new
> meaning for "sockets", not the old one.)
> 
> > 
> > // topology supports more than smp_cpus, max_cpus should be higher
> > < -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=4
> > > -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=8
> 
> There are also existing cases where the user could simply expect
> smp_threads to be calculated automatically. You seem to cover that, but
> explicit testing would be nice. e.g., to make sure we won't break stuff
> like:
> 
>  -smp  8,sockets=2,cores=2               sockets=2 cores=2 threads=2 smp_cpus=8 max_cpus=8
>  -smp 12,sockets=2,cores=2               sockets=2 cores=2 threads=3 smp_cpus=12 max_cpus=12
>

yeah, these work fine. But maybe they shouldn't... It seems to me
that if any topology is specified then the whole topology should
be specified. And, specifying both topology and maxcpus is redundant.
So we only need/should allow

-smp <n>					# smp_cpus = max_cpus = <n>
-smp maxcpus=<m>				# smp_cpus = max_cpus = <m>
-smp sockets=<s>,cores=<c>,threads=<t>		# smp_cpus = max_cpus = <s>*<c>*<t>

# hotplug support: smp_cpus = <n>, max_cpus = <m>
-smp <n>,maxcpus=<m>

# hotplug support: smp_cpus = <n>, max_cpus = <s>*<c>*<t>
-smp <n>,sockets=<s>,cores=<c>,threads=<t>

> 
> Anyway, I see a problem here because we are trying to automatically
> calculate two variables (max_cpus and smp_threads) when multiple
> solutions may exist. We are just trying to choose the solution that
> makes more sense, I guess, but do we really want to go down that path?
> 
> e.g. what about this one:
>   -smp 6,sockets=2,cores=2
> 
> It has a solution:
>   sockets=2 cores=2 threads=3 smp_cpus=6 max_cpus=12
> but should the code find it?

I guess not, see my opinion above.

> 
> > 
> > // shouldn't silently adjust threads
> > < -smp 4,sockets=1,cores=2,threads=1            sockets=1 cores=2 threads=2 smp_cpus=4 max_cpus=4
> > > -smp 4,sockets=1,cores=2,threads=1            "maxcpus must be equal to or greater than smp"
> 
> I find it very confusing that the error message mentions "maxcpus" when
> it was never used in the command-line. But I agree we should reject the
> above configuration, as we must never silently adjust any option that
> was explicitly set.

As said above, it should be max_cpus.

> 
> > < -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > > -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  "cpu topology: sockets (2) * cores (2) * threads (4) != max_cpus (8)"
> 
> In this specific case I would expect it to abort too, because both
> max_cpus (8) and smp_cpus (4) don't match sockets*cores*threads (16).

The current code doesn't abort, as there's no final sanity check.


There's another max_cpus issue I thought of after posting. I think
max_cpus should be initialized to machine_class->max_cpus. Then, for the
case we only have '-smp <n>' input by the user (assuming <n> <=
max_cpus and we don't error out), max_cpus should remain equal to
machine_class->max_cpus, keeping holes for hotplug. Currently it would
get set to smp_cpus.

drew

Andrew Jones Nov. 7, 2014, 9:29 a.m. UTC | #4

On Thu, Nov 06, 2014 at 11:11:30PM +0100, Paolo Bonzini wrote:
> 
> 
> On 06/11/2014 17:09, Andrew Jones wrote:
> > +        if (sockets * cores * threads != max_cpus) {
> > +            fprintf(stderr, "cpu topology: "
> > +                    "sockets (%u) * cores (%u) * threads (%u) != max_cpus (%u)\n",
> > +                    sockets, cores, threads, max_cpus);
> > +            exit(1);
> > +        }
> 
> I think this would cause too many failures in the wild.  Perhaps error
> out if it is lower, and warn if sockets * cores * threads > max_cpus
> since we actually allow hot-plug a thread at a time?

We'd still have more failures if we choose to error out when it's lower,
since we currently silently adjust threads in some of those cases, or
just don't care that the topology doesn't support up to maxcpus in other.

I'm not sure how best to go about modifying the command line semantics
in a backwards compatible way, other than to just create a new "-smp"
option. I'm open to all opinions and suggestions.

Thanks,
drew

Paolo Bonzini Nov. 7, 2014, 9:40 a.m. UTC | #5

On 07/11/2014 10:29, Andrew Jones wrote:
>> > I think this would cause too many failures in the wild.  Perhaps error
>> > out if it is lower, and warn if sockets * cores * threads > max_cpus
>> > since we actually allow hot-plug a thread at a time?
> We'd still have more failures if we choose to error out when it's lower,
> since we currently silently adjust threads in some of those cases, or
> just don't care that the topology doesn't support up to maxcpus in other.

So I guess we need a decent fallback if it doesn't match.  Something
like (based also on the reply from Eduardo):

1) always warn if max_cpus % (cores*threads) != 0 || smp_cpus %
(cores*threads) != 0

2) if sockets*cpus*threads < max_cpus, adjust sockets to
DIV_ROUND_UP(max_cpus, cores*threads).  If we didn't warn in step 1, do
it now.  Give a different, less harsh warning if the cmdline
sockets*cpus*threads did match smp_cpus.  In the latter case, the user
_almost_ knows what he was doing.

Not perfect, but it could be something to start from.  Adjusting sockets
is better than adjusting threads.

Paolo

> I'm not sure how best to go about modifying the command line semantics
> in a backwards compatible way, other than to just create a new "-smp"
> option. I'm open to all opinions and suggestions.

Andrew Jones Nov. 7, 2014, 9:52 a.m. UTC | #6

On Fri, Nov 07, 2014 at 10:40:14AM +0100, Paolo Bonzini wrote:
> 
> 
> On 07/11/2014 10:29, Andrew Jones wrote:
> >> > I think this would cause too many failures in the wild.  Perhaps error
> >> > out if it is lower, and warn if sockets * cores * threads > max_cpus
> >> > since we actually allow hot-plug a thread at a time?
> > We'd still have more failures if we choose to error out when it's lower,
> > since we currently silently adjust threads in some of those cases, or
> > just don't care that the topology doesn't support up to maxcpus in other.
> 
> So I guess we need a decent fallback if it doesn't match.  Something
> like (based also on the reply from Eduardo):
> 
> 1) always warn if max_cpus % (cores*threads) != 0 || smp_cpus %
> (cores*threads) != 0
> 
> 2) if sockets*cpus*threads < max_cpus, adjust sockets to
> DIV_ROUND_UP(max_cpus, cores*threads).  If we didn't warn in step 1, do
> it now.  Give a different, less harsh warning if the cmdline
> sockets*cpus*threads did match smp_cpus.  In the latter case, the user
> _almost_ knows what he was doing.
> 
> Not perfect, but it could be something to start from.  Adjusting sockets
> is better than adjusting threads.

OK. I can whip up a v2 that is less harsh (more warnings, less aborts).
I'll also address the other issue I mentioned in the bottom of my reply
to Eduardo, which is to make sure we consider machine_class->max_cpus.

drew
> 
> Paolo
> 
> > I'm not sure how best to go about modifying the command line semantics
> > in a backwards compatible way, other than to just create a new "-smp"
> > option. I'm open to all opinions and suggestions.

Andrew Jones Nov. 7, 2014, 11:21 a.m. UTC | #7

On Fri, Nov 07, 2014 at 10:52:31AM +0100, Andrew Jones wrote:
> On Fri, Nov 07, 2014 at 10:40:14AM +0100, Paolo Bonzini wrote:
> > 
> > 
> > On 07/11/2014 10:29, Andrew Jones wrote:
> > >> > I think this would cause too many failures in the wild.  Perhaps error
> > >> > out if it is lower, and warn if sockets * cores * threads > max_cpus
> > >> > since we actually allow hot-plug a thread at a time?
> > > We'd still have more failures if we choose to error out when it's lower,
> > > since we currently silently adjust threads in some of those cases, or
> > > just don't care that the topology doesn't support up to maxcpus in other.
> > 
> > So I guess we need a decent fallback if it doesn't match.  Something
> > like (based also on the reply from Eduardo):
> > 
> > 1) always warn if max_cpus % (cores*threads) != 0 || smp_cpus %
> > (cores*threads) != 0
> > 
> > 2) if sockets*cpus*threads < max_cpus, adjust sockets to
> > DIV_ROUND_UP(max_cpus, cores*threads).  If we didn't warn in step 1, do
> > it now.  Give a different, less harsh warning if the cmdline
> > sockets*cpus*threads did match smp_cpus.  In the latter case, the user
> > _almost_ knows what he was doing.
> > 
> > Not perfect, but it could be something to start from.  Adjusting sockets
> > is better than adjusting threads.
> 
> OK. I can whip up a v2 that is less harsh (more warnings, less aborts).
> I'll also address the other issue I mentioned in the bottom of my reply
> to Eduardo, which is to make sure we consider machine_class->max_cpus.

After talking with Igor, it seems like the better approach is to get
smp parameters converted over to machine properties, allowing us to
use the old parser for old machine types, and the new for new. I'll
look into it.

drew

Andrew Jones Nov. 7, 2014, 11:29 a.m. UTC | #8

On Fri, Nov 07, 2014 at 10:22:39AM +0100, Andrew Jones wrote:
> On Thu, Nov 06, 2014 at 05:17:44PM -0200, Eduardo Habkost wrote:
> > On Thu, Nov 06, 2014 at 05:09:35PM +0100, Andrew Jones wrote:
> > > smp_parse has a couple problems. First, it should use max_cpus,
> > > not smp_cpus when calculating missing topology information.
> > > Conversely, if maxcpus is not input, then the topology should
> > > dictate max_cpus, as the topology may support more than the
> > > input smp_cpus number. Second, smp_parse shouldn't silently
> > > adjust the number of threads a user provides, which it currently
> > > does in order to ensure the topology supports the number
> > > of cpus provided. smp_parse should rather complain and exit
> > > when input isn't consistent. This patch fixes those issues and
> > > attempts to clarify the code a bit too.
> > > 
> > > I don't believe we need to consider compatibility with the old
> > > behaviors. Specifying something like
> > > 
> > >   -smp 4,sockets=1,cores=1,threads=1
> > > 
> > > is wrong, even though it currently works (the number of threads
> > > is silently adjusted to 4).
> > 
> > Agreed, in this case.
> > 
> > > And, specifying something like
> > > 
> > >   -smp 4,sockets=1,cores=4,threads=1,maxcpus=8
> > > 
> > > is also wrong, as there's no way to hotplug the additional 4 cpus
> > > with this topology. So, any users doing these things should be
> > > corrected. The new error message this patch provides should help
> > > them do that.
> > 
> > In this case, you are proposing a new meaning for "sockets", and it
> > makes sense (I as suppose people understand "sockets" to mean all
> > sockets, even the empty ones). But it looks like it will break
> > compatility on cases we don't want to.
> > 
> > For example, the case below is currently valid, and probably can be
> > generated by libvirt today. But you are now rejecting it:
> > 
> >  -smp 30,sockets=5,cores=3,threads=2,maxcpus=60   "cpu topology: sockets (5) * cores (3) * threads (2) != max_cpus (60)"
> > 
> > This is currently valid because "sockets" today means "online sockets",
> > not "all sockets, even the empty ones".
> 
> Will hotplug still work correctly with this meaning? I expect that
> we need holes to fill.
> 
> > 
> > 
> > It looks like the patch also breaks automatic socket count calculation,
> > which (AFAIK) works today:
> > 
> > I get:
> >  -smp 30,cores=3,threads=2        "maxcpus must be equal to or greater than smp"
> > but I expect:
> >  -smp 30,cores=3,threads=2        sockets=5 cores=3 threads=2 smp_cpus=30 max_cpus=30
> 
> True. I should have preserved that behavior.
> 
> The current code actually ends up with
> 
>   sockets=1 cores=3 threads=2 cpus=30 maxcpus=30
> 
> in smp_parse(), which is nonsense, but as the sockets count isn't saved
> (it's always derivable from smp_cpus, cores, threads), then in practice
> it doesn't matter.
> 
> > 
> > (And the error message is confusing: I am not even setting max_cpus, why
> > is it saying maxcpus is wrong?)
> >
> 
> I should have added an underscore to maxcpus in that error message, as
> max_cpus is either maxcpus or the result of the topology now.
> 
> > 
> > > 
> > > Below are some examples with before/after results
> > > 
> > > // topology should support up to max_cpus
> > > < -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > > > -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=4 threads=1 smp_cpus=4 max_cpus=8
> > 
> > So, like in my first example above, we are breaking compatibility here
> > because the meaning of "sockets" changed. Do we want to?
> >
> 
> I guess the main questions are:
>   - do we need to make this change for hotplug?
>   - do we need to make this change to make the command line more
>     intuitive?
>  
> > 
> > (Unless otherwise noted in my other comments below, I am using the new
> > meaning for "sockets", not the old one.)
> > 
> > > 
> > > // topology supports more than smp_cpus, max_cpus should be higher
> > > < -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=4
> > > > -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > 
> > There are also existing cases where the user could simply expect
> > smp_threads to be calculated automatically. You seem to cover that, but
> > explicit testing would be nice. e.g., to make sure we won't break stuff
> > like:
> > 
> >  -smp  8,sockets=2,cores=2               sockets=2 cores=2 threads=2 smp_cpus=8 max_cpus=8
> >  -smp 12,sockets=2,cores=2               sockets=2 cores=2 threads=3 smp_cpus=12 max_cpus=12
> >
> 
> yeah, these work fine. But maybe they shouldn't... It seems to me
> that if any topology is specified then the whole topology should
> be specified. And, specifying both topology and maxcpus is redundant.
> So we only need/should allow
> 
> -smp <n>					# smp_cpus = max_cpus = <n>
> -smp maxcpus=<m>				# smp_cpus = max_cpus = <m>
> -smp sockets=<s>,cores=<c>,threads=<t>		# smp_cpus = max_cpus = <s>*<c>*<t>
> 
> # hotplug support: smp_cpus = <n>, max_cpus = <m>
> -smp <n>,maxcpus=<m>
> 
> # hotplug support: smp_cpus = <n>, max_cpus = <s>*<c>*<t>
> -smp <n>,sockets=<s>,cores=<c>,threads=<t>
> 
> > 
> > Anyway, I see a problem here because we are trying to automatically
> > calculate two variables (max_cpus and smp_threads) when multiple
> > solutions may exist. We are just trying to choose the solution that
> > makes more sense, I guess, but do we really want to go down that path?
> > 
> > e.g. what about this one:
> >   -smp 6,sockets=2,cores=2
> > 
> > It has a solution:
> >   sockets=2 cores=2 threads=3 smp_cpus=6 max_cpus=12
> > but should the code find it?
> 
> I guess not, see my opinion above.
> 
> > 
> > > 
> > > // shouldn't silently adjust threads
> > > < -smp 4,sockets=1,cores=2,threads=1            sockets=1 cores=2 threads=2 smp_cpus=4 max_cpus=4
> > > > -smp 4,sockets=1,cores=2,threads=1            "maxcpus must be equal to or greater than smp"
> > 
> > I find it very confusing that the error message mentions "maxcpus" when
> > it was never used in the command-line. But I agree we should reject the
> > above configuration, as we must never silently adjust any option that
> > was explicitly set.
> 
> As said above, it should be max_cpus.
> 
> > 
> > > < -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > > > -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  "cpu topology: sockets (2) * cores (2) * threads (4) != max_cpus (8)"
> > 
> > In this specific case I would expect it to abort too, because both
> > max_cpus (8) and smp_cpus (4) don't match sockets*cores*threads (16).
> 
> The current code doesn't abort, as there's no final sanity check.
> 
> 
> There's another max_cpus issue I thought of after posting. I think
> max_cpus should be initialized to machine_class->max_cpus. Then, for the
> case we only have '-smp <n>' input by the user (assuming <n> <=
> max_cpus and we don't error out), max_cpus should remain equal to
> machine_class->max_cpus, keeping holes for hotplug. Currently it would
> get set to smp_cpus.

Actually, scratch most of this. In the absence of an input max_cpus, we
should use smp_cpus, as we don't necessarily want to support hotplug for
this machine instance at all. However the sanity check against
machine_class->max_cpus needs to be updated to check against max_cpus,
not smp_cpus in order to be correct.

drew

Eduardo Habkost Nov. 7, 2014, 12:16 p.m. UTC | #9

On Fri, Nov 07, 2014 at 12:21:26PM +0100, Andrew Jones wrote:
> On Fri, Nov 07, 2014 at 10:52:31AM +0100, Andrew Jones wrote:
> > On Fri, Nov 07, 2014 at 10:40:14AM +0100, Paolo Bonzini wrote:
> > > 
> > > 
> > > On 07/11/2014 10:29, Andrew Jones wrote:
> > > >> > I think this would cause too many failures in the wild.  Perhaps error
> > > >> > out if it is lower, and warn if sockets * cores * threads > max_cpus
> > > >> > since we actually allow hot-plug a thread at a time?
> > > > We'd still have more failures if we choose to error out when it's lower,
> > > > since we currently silently adjust threads in some of those cases, or
> > > > just don't care that the topology doesn't support up to maxcpus in other.
> > > 
> > > So I guess we need a decent fallback if it doesn't match.  Something
> > > like (based also on the reply from Eduardo):
> > > 
> > > 1) always warn if max_cpus % (cores*threads) != 0 || smp_cpus %
> > > (cores*threads) != 0
> > > 
> > > 2) if sockets*cpus*threads < max_cpus, adjust sockets to
> > > DIV_ROUND_UP(max_cpus, cores*threads).  If we didn't warn in step 1, do
> > > it now.  Give a different, less harsh warning if the cmdline
> > > sockets*cpus*threads did match smp_cpus.  In the latter case, the user
> > > _almost_ knows what he was doing.
> > > 
> > > Not perfect, but it could be something to start from.  Adjusting sockets
> > > is better than adjusting threads.
> > 
> > OK. I can whip up a v2 that is less harsh (more warnings, less aborts).
> > I'll also address the other issue I mentioned in the bottom of my reply
> > to Eduardo, which is to make sure we consider machine_class->max_cpus.
> 
> After talking with Igor, it seems like the better approach is to get
> smp parameters converted over to machine properties, allowing us to
> use the old parser for old machine types, and the new for new. I'll
> look into it.

While we work on that, I think we could at least change the existing
code to abort if sockets*cores*threads < smp_cpus (when all 4 options
are present in the command-line), and never adjust any option silently.

Andrew Jones Nov. 7, 2014, 12:23 p.m. UTC | #10

On Fri, Nov 07, 2014 at 10:16:06AM -0200, Eduardo Habkost wrote:
> On Fri, Nov 07, 2014 at 12:21:26PM +0100, Andrew Jones wrote:
> > On Fri, Nov 07, 2014 at 10:52:31AM +0100, Andrew Jones wrote:
> > > On Fri, Nov 07, 2014 at 10:40:14AM +0100, Paolo Bonzini wrote:
> > > > 
> > > > 
> > > > On 07/11/2014 10:29, Andrew Jones wrote:
> > > > >> > I think this would cause too many failures in the wild.  Perhaps error
> > > > >> > out if it is lower, and warn if sockets * cores * threads > max_cpus
> > > > >> > since we actually allow hot-plug a thread at a time?
> > > > > We'd still have more failures if we choose to error out when it's lower,
> > > > > since we currently silently adjust threads in some of those cases, or
> > > > > just don't care that the topology doesn't support up to maxcpus in other.
> > > > 
> > > > So I guess we need a decent fallback if it doesn't match.  Something
> > > > like (based also on the reply from Eduardo):
> > > > 
> > > > 1) always warn if max_cpus % (cores*threads) != 0 || smp_cpus %
> > > > (cores*threads) != 0
> > > > 
> > > > 2) if sockets*cpus*threads < max_cpus, adjust sockets to
> > > > DIV_ROUND_UP(max_cpus, cores*threads).  If we didn't warn in step 1, do
> > > > it now.  Give a different, less harsh warning if the cmdline
> > > > sockets*cpus*threads did match smp_cpus.  In the latter case, the user
> > > > _almost_ knows what he was doing.
> > > > 
> > > > Not perfect, but it could be something to start from.  Adjusting sockets
> > > > is better than adjusting threads.
> > > 
> > > OK. I can whip up a v2 that is less harsh (more warnings, less aborts).
> > > I'll also address the other issue I mentioned in the bottom of my reply
> > > to Eduardo, which is to make sure we consider machine_class->max_cpus.
> > 
> > After talking with Igor, it seems like the better approach is to get
> > smp parameters converted over to machine properties, allowing us to
> > use the old parser for old machine types, and the new for new. I'll
> > look into it.
> 
> While we work on that, I think we could at least change the existing
> code to abort if sockets*cores*threads < smp_cpus (when all 4 options
> are present in the command-line), and never adjust any option silently.

But then we'll end up with 3 different behaviors. If we do anything for
the short-term, then maybe it should only be to add warnings. The
machine property version will then convert those warnings to aborts for
new machines types.

drew

Eduardo Habkost Nov. 7, 2014, 12:32 p.m. UTC | #11

On Fri, Nov 07, 2014 at 01:23:12PM +0100, Andrew Jones wrote:
> On Fri, Nov 07, 2014 at 10:16:06AM -0200, Eduardo Habkost wrote:
> > On Fri, Nov 07, 2014 at 12:21:26PM +0100, Andrew Jones wrote:
> > > On Fri, Nov 07, 2014 at 10:52:31AM +0100, Andrew Jones wrote:
[...]
> > > After talking with Igor, it seems like the better approach is to get
> > > smp parameters converted over to machine properties, allowing us to
> > > use the old parser for old machine types, and the new for new. I'll
> > > look into it.
> > 
> > While we work on that, I think we could at least change the existing
> > code to abort if sockets*cores*threads < smp_cpus (when all 4 options
> > are present in the command-line), and never adjust any option silently.
> 
> But then we'll end up with 3 different behaviors. If we do anything for
> the short-term, then maybe it should only be to add warnings. The
> machine property version will then convert those warnings to aborts for
> new machines types.

Oh, in that specific case I didn't even consider keeping the old
behavior, so there wouldn't be 3 different behaviors to emulate.

Silently changing smp_threads when it was explicitly set in the
command-line is so wrong that I would always prefer to see QEMU
aborting.

Andrew Jones Nov. 7, 2014, 4:03 p.m. UTC | #12

On Thu, Nov 06, 2014 at 05:09:35PM +0100, Andrew Jones wrote:
> smp_parse has a couple problems. First, it should use max_cpus,
> not smp_cpus when calculating missing topology information.
> Conversely, if maxcpus is not input, then the topology should
> dictate max_cpus, as the topology may support more than the
> input smp_cpus number. Second, smp_parse shouldn't silently
> adjust the number of threads a user provides, which it currently
> does in order to ensure the topology supports the number
> of cpus provided. smp_parse should rather complain and exit
> when input isn't consistent. This patch fixes those issues and
> attempts to clarify the code a bit too.
> 
> I don't believe we need to consider compatibility with the old
> behaviors. Specifying something like
> 
>   -smp 4,sockets=1,cores=1,threads=1
> 
> is wrong, even though it currently works (the number of threads
> is silently adjusted to 4). And, specifying something like
> 
>   -smp 4,sockets=1,cores=4,threads=1,maxcpus=8
> 
> is also wrong, as there's no way to hotplug the additional 4 cpus
> with this topology. So, any users doing these things should be
> corrected. The new error message this patch provides should help
> them do that.
> 
> Below are some examples with before/after results
> 
> // topology should support up to max_cpus
> < -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > -smp 4,sockets=2,maxcpus=8                    sockets=2 cores=4 threads=1 smp_cpus=4 max_cpus=8
> 
> // topology supports more than smp_cpus, max_cpus should be higher
> < -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=4
> > -smp 4,sockets=4,cores=2                      sockets=4 cores=2 threads=1 smp_cpus=4 max_cpus=8
> 
> // shouldn't silently adjust threads
> < -smp 4,sockets=1,cores=2,threads=1            sockets=1 cores=2 threads=2 smp_cpus=4 max_cpus=4
> > -smp 4,sockets=1,cores=2,threads=1            "maxcpus must be equal to or greater than smp"
> < -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  sockets=2 cores=2 threads=1 smp_cpus=4 max_cpus=8
> > -smp 4,sockets=2,cores=2,threads=4,maxcpus=8  "cpu topology: sockets (2) * cores (2) * threads (4) != max_cpus (8)"
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  vl.c | 59 ++++++++++++++++++++++++++++++++++++++---------------------
>  1 file changed, 38 insertions(+), 21 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index f4a6e5e05bce2..23b21a5fbca50 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1270,35 +1270,52 @@ static QemuOptsList qemu_smp_opts = {
>  static void smp_parse(QemuOpts *opts)
>  {
>      if (opts) {
> -
> -        unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
>          unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
>          unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
>          unsigned threads = qemu_opt_get_number(opts, "threads", 0);
> +        unsigned cpus;
> +
> +        smp_cpus = qemu_opt_get_number(opts, "cpus", 0);
> +        max_cpus = qemu_opt_get_number(opts, "maxcpus", 0);
> +
> +        cpus = max_cpus ? max_cpus : smp_cpus;
>  
>          /* compute missing values, prefer sockets over cores over threads */
> -        if (cpus == 0 || sockets == 0) {
> -            sockets = sockets > 0 ? sockets : 1;
> -            cores = cores > 0 ? cores : 1;
> -            threads = threads > 0 ? threads : 1;
> -            if (cpus == 0) {
> -                cpus = cores * threads * sockets;
> -            }
> -        } else {
> -            if (cores == 0) {
> -                threads = threads > 0 ? threads : 1;
> -                cores = cpus / (sockets * threads);
> -            } else {
> -                threads = cpus / (cores * sockets);
> -            }
> +        if (cpus == 0) {
> +            cpus = sockets ? sockets : 1;
> +            cpus = cpus * cores ? cpus * cores : cpus;
> +            cpus = cpus * threads ? cpus * threads : cpus;
>          }
>  
> -        max_cpus = qemu_opt_get_number(opts, "maxcpus", 0);
> +        if (sockets == 0) {
> +            sockets = 1;
> +        }
>  
> -        smp_cpus = cpus;
> -        smp_cores = cores > 0 ? cores : 1;
> -        smp_threads = threads > 0 ? threads : 1;
> +        if (cores == 0) {
> +            threads = threads ? threads : 1;
> +            cores = cpus / (sockets * threads);
> +            cores = cores ? cores : 1;
> +        }
> +
> +        if (threads == 0) {
> +            threads = cpus / (sockets * cores);
> +            threads = threads ? threads : 1;
> +        }
> +
> +        if (max_cpus == 0) {
> +            max_cpus = sockets * cores * threads;
> +        }
>  
> +        if (sockets * cores * threads != max_cpus) {
> +            fprintf(stderr, "cpu topology: "
> +                    "sockets (%u) * cores (%u) * threads (%u) != max_cpus (%u)\n",
> +                    sockets, cores, threads, max_cpus);
> +            exit(1);
> +        }
> +
> +        smp_cpus = smp_cpus ? smp_cpus : max_cpus;
> +        smp_cores = cores;
> +        smp_threads = threads;
>      }
>  
>      if (max_cpus == 0) {
> @@ -1309,11 +1326,11 @@ static void smp_parse(QemuOpts *opts)
>          fprintf(stderr, "Unsupported number of maxcpus\n");
>          exit(1);
>      }
> +
>      if (max_cpus < smp_cpus) {
>          fprintf(stderr, "maxcpus must be equal to or greater than smp\n");
>          exit(1);
>      }
> -
>  }
>  
>  static void realtime_init(void)
> -- 
> 1.9.3
>

Dropping this patch in favor of series I'll post in a few seconds.

drew

vl: rework smp_parse

Commit Message

Comments

Patch