diff mbox series

mm/mmzone.c: fix page_cpupid_xchg_last() to READ_ONCE() the page flags

Message ID 20220118230539.323058-1-pcc@google.com
State New
Headers show
Series mm/mmzone.c: fix page_cpupid_xchg_last() to READ_ONCE() the page flags | expand

Commit Message

Peter Collingbourne Jan. 18, 2022, 11:05 p.m. UTC
After submitting a patch with a compare-exchange loop similar to this
one to set the KASAN tag in the page flags, Andrey Konovalov pointed
out that we should be using READ_ONCE() to read the page flags. Fix
it here.

Fixes: 75980e97dacc ("mm: fold page->_last_nid into page->flags where possible")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
Cc: stable@vger.kernel.org
---
 mm/mmzone.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Peter Zijlstra Jan. 19, 2022, 10:02 a.m. UTC | #1
On Tue, Jan 18, 2022 at 03:05:39PM -0800, Peter Collingbourne wrote:
> After submitting a patch with a compare-exchange loop similar to this
> one to set the KASAN tag in the page flags, Andrey Konovalov pointed
> out that we should be using READ_ONCE() to read the page flags. Fix
> it here.

What does it actually fix? If it manages to split the read and read
garbage the cmpxchg will fail and we go another round, no harm done.

> Fixes: 75980e97dacc ("mm: fold page->_last_nid into page->flags where possible")

As per the above argument, I don't think this rates a Fixes tag, there
is no actual fix.

> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693

That's that doing here?

> Cc: stable@vger.kernel.org

That's massively over-selling things.

> ---
>  mm/mmzone.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/mmzone.c b/mm/mmzone.c
> index eb89d6e018e2..f84b84b0d3fc 100644
> --- a/mm/mmzone.c
> +++ b/mm/mmzone.c
> @@ -90,7 +90,7 @@ int page_cpupid_xchg_last(struct page *page, int cpupid)
>  	int last_cpupid;
>  
>  	do {
> -		old_flags = flags = page->flags;
> +		old_flags = flags = READ_ONCE(page->flags);
>  		last_cpupid = page_cpupid_last(page);
>  
>  		flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);

I think that if you want to touch that code, something like the below
makes more sense...

diff --git a/mm/mmzone.c b/mm/mmzone.c
index eb89d6e018e2..ed9f4bcdc9ee 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -89,13 +89,14 @@ int page_cpupid_xchg_last(struct page *page, int cpupid)
 	unsigned long old_flags, flags;
 	int last_cpupid;
 
+	old_flags = READ_ONCE(page->flags);
 	do {
-		old_flags = flags = page->flags;
-		last_cpupid = page_cpupid_last(page);
+		flags = old_flags;
+		last_cpupid = (flags >> LAST_CPUPID_PGSHIFT) & LAST_CPUPID_MASK;
 
 		flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);
 		flags |= (cpupid & LAST_CPUPID_MASK) << LAST_CPUPID_PGSHIFT;
-	} while (unlikely(cmpxchg(&page->flags, old_flags, flags) != old_flags));
+	} while (unlikely(!try_cmpxchg(&page->flags, &old_flags, flags)));
 
 	return last_cpupid;
 }
Peter Collingbourne Jan. 19, 2022, 11:28 p.m. UTC | #2
On Wed, Jan 19, 2022 at 2:03 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jan 18, 2022 at 03:05:39PM -0800, Peter Collingbourne wrote:
> > After submitting a patch with a compare-exchange loop similar to this
> > one to set the KASAN tag in the page flags, Andrey Konovalov pointed
> > out that we should be using READ_ONCE() to read the page flags. Fix
> > it here.
>
> What does it actually fix? If it manages to split the read and read
> garbage the cmpxchg will fail and we go another round, no harm done.

What I wasn't sure about was whether the compiler would be allowed to
break this code by hoisting the read of page->flags out of the loop
(because nothing in the loop actually writes to page->flags aside from
the compare-exchange, and if that succeeds we're *leaving* the loop).
That could potentially result in a loop that never terminates if the
first compare-exchange fails. This is largely a theoretical problem as
far as I know; the assembly produced by clang and gcc on x86_64 and
arm64 appears to be doing the expected thing for now, and we're using
inline asm for compare-exchange instead of the compiler builtins on
those architectures (and on all other architectures it seems? no
matches for __atomic_compare_exchange outside of kcsan and the
selftests) so the compiler wouldn't be able to look inside it anyway.

> > Fixes: 75980e97dacc ("mm: fold page->_last_nid into page->flags where possible")
>
> As per the above argument, I don't think this rates a Fixes tag, there
> is no actual fix.

Okay, I'll remove it unless you find the above convincing.

> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > Link: https://linux-review.googlesource.com/id/I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
>
> That's that doing here?

I upload my changes to Gerrit and link to them here so that I (and
others) can see the progression of the patch via the web UI.

> > Cc: stable@vger.kernel.org
>
> That's massively over-selling things.

Fair enough since it isn't causing an actual problem, I'll remove this tag.

> > ---
> >  mm/mmzone.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/mmzone.c b/mm/mmzone.c
> > index eb89d6e018e2..f84b84b0d3fc 100644
> > --- a/mm/mmzone.c
> > +++ b/mm/mmzone.c
> > @@ -90,7 +90,7 @@ int page_cpupid_xchg_last(struct page *page, int cpupid)
> >       int last_cpupid;
> >
> >       do {
> > -             old_flags = flags = page->flags;
> > +             old_flags = flags = READ_ONCE(page->flags);
> >               last_cpupid = page_cpupid_last(page);
> >
> >               flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);
>
> I think that if you want to touch that code, something like the below
> makes more sense...

Yeah, that looks a bit nicer. I'll send a v2 and update the other patch as well.

Peter
Peter Zijlstra Jan. 20, 2022, 10:43 a.m. UTC | #3
On Wed, Jan 19, 2022 at 03:28:32PM -0800, Peter Collingbourne wrote:
> On Wed, Jan 19, 2022 at 2:03 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Jan 18, 2022 at 03:05:39PM -0800, Peter Collingbourne wrote:
> > > After submitting a patch with a compare-exchange loop similar to this
> > > one to set the KASAN tag in the page flags, Andrey Konovalov pointed
> > > out that we should be using READ_ONCE() to read the page flags. Fix
> > > it here.
> >
> > What does it actually fix? If it manages to split the read and read
> > garbage the cmpxchg will fail and we go another round, no harm done.
> 
> What I wasn't sure about was whether the compiler would be allowed to
> break this code by hoisting the read of page->flags out of the loop
> (because nothing in the loop actually writes to page->flags aside from
> the compare-exchange, and if that succeeds we're *leaving* the loop).

The cmpxchg is a barrier() and as such I don't think it's allowed to
hoist anything out of the loop. Except perhaps since it's do-while, it
could try and unroll the first iteration and wreck that something
fierce.

The bigger problem is I think that page_cpuid_last() usage which does a
second load of page->flags, and given sufficient races that could
actually load a different value and then things would be screwy. But
that's not actually fixed.

> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > Link: https://linux-review.googlesource.com/id/I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
> >
> > That's that doing here?
> 
> I upload my changes to Gerrit and link to them here so that I (and
> others) can see the progression of the patch via the web UI.

What's the life-time guarantee for that URL existing? Because if it
becomes part of the git commit, it had better stay around 'forever'
etc..
Peter Collingbourne Jan. 21, 2022, 10:35 p.m. UTC | #4
On Thu, Jan 20, 2022 at 2:43 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jan 19, 2022 at 03:28:32PM -0800, Peter Collingbourne wrote:
> > On Wed, Jan 19, 2022 at 2:03 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Tue, Jan 18, 2022 at 03:05:39PM -0800, Peter Collingbourne wrote:
> > > > After submitting a patch with a compare-exchange loop similar to this
> > > > one to set the KASAN tag in the page flags, Andrey Konovalov pointed
> > > > out that we should be using READ_ONCE() to read the page flags. Fix
> > > > it here.
> > >
> > > What does it actually fix? If it manages to split the read and read
> > > garbage the cmpxchg will fail and we go another round, no harm done.
> >
> > What I wasn't sure about was whether the compiler would be allowed to
> > break this code by hoisting the read of page->flags out of the loop
> > (because nothing in the loop actually writes to page->flags aside from
> > the compare-exchange, and if that succeeds we're *leaving* the loop).
>
> The cmpxchg is a barrier() and as such I don't think it's allowed to
> hoist anything out of the loop.

Yes it looks like it's at least as powerful as a barrier() because the
implementations I looked at (arm64 and x86) use inline asm with memory
operand (i.e. same as barrier()).

> The bigger problem is I think that page_cpuid_last() usage which does a
> second load of page->flags, and given sufficient races that could
> actually load a different value and then things would be screwy. But
> that's not actually fixed.

Right, the patch you provided (which I posted as v2) inlines the
page_cpuid_last() and should resolve this.

> > > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > > Link: https://linux-review.googlesource.com/id/I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
> > >
> > > That's that doing here?
> >
> > I upload my changes to Gerrit and link to them here so that I (and
> > others) can see the progression of the patch via the web UI.
>
> What's the life-time guarantee for that URL existing? Because if it
> becomes part of the git commit, it had better stay around 'forever'
> etc..

I'd be surprised if it went away any time soon, the same hosting is
used for projects like Android and Chromium which have been using it
for years and have a long track record of stability.

Also note that the link is useful independent of the host continuing
to be up, it means that you can also do a search of the mailing list
archive like so:
https://lore.kernel.org/all/?q=I2e1f5b5b080ac9c4e0eb7f98768dba6fd7821693
(or the equivalent link on a different mailing list archive if
lore.kernel.org ever gets shut down) and find the progression of the
patch that way. This is particularly useful if (as in this case) the
subject line of the patch changes.

Peter
diff mbox series

Patch

diff --git a/mm/mmzone.c b/mm/mmzone.c
index eb89d6e018e2..f84b84b0d3fc 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -90,7 +90,7 @@  int page_cpupid_xchg_last(struct page *page, int cpupid)
 	int last_cpupid;
 
 	do {
-		old_flags = flags = page->flags;
+		old_flags = flags = READ_ONCE(page->flags);
 		last_cpupid = page_cpupid_last(page);
 
 		flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);