diff mbox series

[v2,06/17] vdso: Change getrandom's generation to unsigned long

Message ID 525b48eb79978ddba2d1b8ee23b27bd6c5b0b4ee.1724309198.git.christophe.leroy@csgroup.eu
State New
Headers show
Series Wire up getrandom() vDSO implementation on powerpc | expand

Commit Message

Christophe Leroy Aug. 22, 2024, 7:13 a.m. UTC
Performing SMP atomic operations on u64 fails on powerpc32.

Random driver generation is handled as unsigned long not u64,
see for instance base_cnrg or struct crng.

Use the same type for vDSO's getrandom as it gets copied
from the above. This is also in line with the local
current_generation which is already an unsigned long.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 include/vdso/datapage.h  | 2 +-
 include/vdso/getrandom.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Jason A. Donenfeld Aug. 26, 2024, 7:50 a.m. UTC | #1
On Thu, Aug 22, 2024 at 09:13:14AM +0200, Christophe Leroy wrote:
> Performing SMP atomic operations on u64 fails on powerpc32.
> 
> Random driver generation is handled as unsigned long not u64,
> see for instance base_cnrg or struct crng.
> 
> Use the same type for vDSO's getrandom as it gets copied
> from the above. This is also in line with the local
> current_generation which is already an unsigned long.

This isn't going to work when 32-bit userspace tries to access a 64-bit
kernel.

I had "fixed" this with a vdso_kernel_ulong type way back in an earlier
version: https://lore.kernel.org/lkml/20240528122352.2485958-5-Jason@zx2c4.com/#Z31include:vdso:types.h

But tglx pointed out in that thread that this actually isn't necessary:

| All of this is pointless because if a 32-bit application runs on a
| 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
| we need magic here for a 32-bit kernel?
| 
| Just use u64 for both and spare all this voodoo. We're seriously not
| "optimizing" for 32-bit kernels.
|
| All what happens on a 32-bit kernel is that the RNG will store the
| unsigned long (32bit) generation into a 64bit variable:
| 
| 	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
| 
| As the upper 32bit are always zero, there is no issue vs. load store
| tearing at all. So there is zero benefit for this aside of slightly
| "better" user space code when running on a 32-bit kernel. Who cares?

So I just got rid of it and used a u64 as he suggested.

However, there's also an additional reason why it's not worth churning
further over this - because VM_DROPPABLE is 64-bit only (due to flags in
vma bits), likely so is vDSO getrandom() for the time being. So I think
it makes more sense to retool this series to be ppc64, and then if you
really really want 32-bit and can convince folks it matters, then all of
these parts (for example, here, the fact that the smp helper doesn't
want to tear) can be fixed up in a separate series.

Jason
Christophe Leroy Aug. 26, 2024, 8:01 a.m. UTC | #2
Le 26/08/2024 à 09:50, Jason A. Donenfeld a écrit :
> On Thu, Aug 22, 2024 at 09:13:14AM +0200, Christophe Leroy wrote:
>> Performing SMP atomic operations on u64 fails on powerpc32.
>>
>> Random driver generation is handled as unsigned long not u64,
>> see for instance base_cnrg or struct crng.
>>
>> Use the same type for vDSO's getrandom as it gets copied
>> from the above. This is also in line with the local
>> current_generation which is already an unsigned long.
> 
> This isn't going to work when 32-bit userspace tries to access a 64-bit
> kernel.
> 
> I had "fixed" this with a vdso_kernel_ulong type way back in an earlier
> version: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20240528122352.2485958-5-Jason%40zx2c4.com%2F%23Z31include%3Avdso%3Atypes.h&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C41747dd989164267c1cc08dcc5a3c424%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638602554376441761%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=Tf9ShSN6aOOFZ1HymAmHhj0xhQ6BUtHJX95t50gsp9k%3D&reserved=0
> 
> But tglx pointed out in that thread that this actually isn't necessary:
> 
> | All of this is pointless because if a 32-bit application runs on a
> | 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
> | we need magic here for a 32-bit kernel?
> |
> | Just use u64 for both and spare all this voodoo. We're seriously not
> | "optimizing" for 32-bit kernels.
> |
> | All what happens on a 32-bit kernel is that the RNG will store the
> | unsigned long (32bit) generation into a 64bit variable:
> |
> | 	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
> |
> | As the upper 32bit are always zero, there is no issue vs. load store
> | tearing at all. So there is zero benefit for this aside of slightly
> | "better" user space code when running on a 32-bit kernel. Who cares?
> 
> So I just got rid of it and used a u64 as he suggested.
> 
> However, there's also an additional reason why it's not worth churning
> further over this - because VM_DROPPABLE is 64-bit only (due to flags in
> vma bits), likely so is vDSO getrandom() for the time being. So I think
> it makes more sense to retool this series to be ppc64, and then if you
> really really want 32-bit and can convince folks it matters, then all of
> these parts (for example, here, the fact that the smp helper doesn't
> want to tear) can be fixed up in a separate series.

So yes I really really want it on ppc32 because this is the only type of 
boards I have and this is really were we need getrandom() to be 
optimised, indeed ppc64 was sherry-on-the-cake in my series, I just 
added it because it was easy to do after doing ppc32.

Christophe
Jason A. Donenfeld Aug. 26, 2024, 8:16 a.m. UTC | #3
On Mon, Aug 26, 2024 at 10:01:17AM +0200, Christophe Leroy wrote:
> 
> 
> Le 26/08/2024 à 09:50, Jason A. Donenfeld a écrit :
> > On Thu, Aug 22, 2024 at 09:13:14AM +0200, Christophe Leroy wrote:
> >> Performing SMP atomic operations on u64 fails on powerpc32.
> >>
> >> Random driver generation is handled as unsigned long not u64,
> >> see for instance base_cnrg or struct crng.
> >>
> >> Use the same type for vDSO's getrandom as it gets copied
> >> from the above. This is also in line with the local
> >> current_generation which is already an unsigned long.
> > 
> > This isn't going to work when 32-bit userspace tries to access a 64-bit
> > kernel.
> > 
> > I had "fixed" this with a vdso_kernel_ulong type way back in an earlier
> > version: https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20240528122352.2485958-5-Jason%40zx2c4.com%2F%23Z31include%3Avdso%3Atypes.h&data=05%7C02%7Cchristophe.leroy%40csgroup.eu%7C41747dd989164267c1cc08dcc5a3c424%7C8b87af7d86474dc78df45f69a2011bb5%7C0%7C0%7C638602554376441761%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=Tf9ShSN6aOOFZ1HymAmHhj0xhQ6BUtHJX95t50gsp9k%3D&reserved=0
> > 
> > But tglx pointed out in that thread that this actually isn't necessary:
> > 
> > | All of this is pointless because if a 32-bit application runs on a
> > | 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
> > | we need magic here for a 32-bit kernel?
> > |
> > | Just use u64 for both and spare all this voodoo. We're seriously not
> > | "optimizing" for 32-bit kernels.
> > |
> > | All what happens on a 32-bit kernel is that the RNG will store the
> > | unsigned long (32bit) generation into a 64bit variable:
> > |
> > | 	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
> > |
> > | As the upper 32bit are always zero, there is no issue vs. load store
> > | tearing at all. So there is zero benefit for this aside of slightly
> > | "better" user space code when running on a 32-bit kernel. Who cares?
> > 
> > So I just got rid of it and used a u64 as he suggested.
> > 
> > However, there's also an additional reason why it's not worth churning
> > further over this - because VM_DROPPABLE is 64-bit only (due to flags in
> > vma bits), likely so is vDSO getrandom() for the time being. So I think
> > it makes more sense to retool this series to be ppc64, and then if you
> > really really want 32-bit and can convince folks it matters, then all of
> > these parts (for example, here, the fact that the smp helper doesn't
> > want to tear) can be fixed up in a separate series.
> 
> So yes I really really want it on ppc32 because this is the only type of 
> boards I have and this is really were we need getrandom() to be 
> optimised, indeed ppc64 was sherry-on-the-cake in my series, I just 
> added it because it was easy to do after doing ppc32.

I saw that you did in fact find a bit on ppc32 for VM_DROPPABLE. So it
looks at least possible. Because of this generation counter business, I
still think it might make sense to do in two steps, though, first doing
64-bit, and then doing 32-bit after.

As for the generation counter error you're seeing, I guess what we want
is smp_store_release memory ordering semantics, but letting tearing
happen (since the upper 32-bits will be zero anyway). I'm not sure the
best way to do this, whether it's a new helper, or doing a WRITE_ONCE
together with an smp barrier, or what. But I imagine it's something like
that.

Jason
Thomas Gleixner Aug. 26, 2024, 9:43 a.m. UTC | #4
On Mon, Aug 26 2024 at 10:01, Christophe Leroy wrote:
> Le 26/08/2024 à 09:50, Jason A. Donenfeld a écrit :
>> But tglx pointed out in that thread that this actually isn't necessary:
>> 
>> | All of this is pointless because if a 32-bit application runs on a
>> | 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
>> | we need magic here for a 32-bit kernel?
>> |
>> | Just use u64 for both and spare all this voodoo. We're seriously not
>> | "optimizing" for 32-bit kernels.
>> |
>> | All what happens on a 32-bit kernel is that the RNG will store the
>> | unsigned long (32bit) generation into a 64bit variable:
>> |
>> | 	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
>> |
>> | As the upper 32bit are always zero, there is no issue vs. load store
>> | tearing at all. So there is zero benefit for this aside of slightly
>> | "better" user space code when running on a 32-bit kernel. Who cares?
>> 
>> So I just got rid of it and used a u64 as he suggested.
>> 
>> However, there's also an additional reason why it's not worth churning
>> further over this - because VM_DROPPABLE is 64-bit only (due to flags in
>> vma bits), likely so is vDSO getrandom() for the time being. So I think
>> it makes more sense to retool this series to be ppc64, and then if you
>> really really want 32-bit and can convince folks it matters, then all of
>> these parts (for example, here, the fact that the smp helper doesn't
>> want to tear) can be fixed up in a separate series.
>
> So yes I really really want it on ppc32 because this is the only type of 
> boards I have and this is really were we need getrandom() to be 
> optimised,

For nostalgic reasons?

> indeed ppc64 was sherry-on-the-cake in my series, I just added it
> because it was easy to do after doing ppc32.

The rng problem for ppc32 seems to be:

   smp_store_release(&_vdso_rng_data.generation, next_gen + 1);

right? 

Your proposed type change creates inconsistency for 32-bit userspace
running on 64-bit kernels because the data struct layout changes.

As explained before, there is no problem with store or load tearing on
32bit systems because the generation counter is only 32bit wide. So the
obvious solution is to only update 32 bits on a 32bit kernel:

--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru
 	 * is ordered with the write above to base_crng.generation. Pairs with
 	 * the smp_rmb() before the syscall in the vDSO code.
 	 */
-	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
+	smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1);
 #endif
 	if (!static_branch_likely(&crng_is_ready))
 		crng_init = CRNG_READY;

Which is perfectly fine on 32-bit independent of endianess because the
user space side does READ_ONCE(data->generation) and the read value is
solely used for comparison so it does not matter at all whether the
generation information is in the upper or the lower 32bit of the u64.

No?

But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit
correclty. :)

Thanks,

        tglx
Jason A. Donenfeld Aug. 26, 2024, 9:48 a.m. UTC | #5
On Mon, Aug 26, 2024 at 11:43:39AM +0200, Thomas Gleixner wrote:
> As explained before, there is no problem with store or load tearing on
> 32bit systems because the generation counter is only 32bit wide. So the
> obvious solution is to only update 32 bits on a 32bit kernel:
> 
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru
>  	 * is ordered with the write above to base_crng.generation. Pairs with
>  	 * the smp_rmb() before the syscall in the vDSO code.
>  	 */
> -	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
> +	smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1);
>  #endif
>  	if (!static_branch_likely(&crng_is_ready))
>  		crng_init = CRNG_READY;

That seems like a pretty clean fix.

> But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit
> correclty. :)

My initial response too, and then I noticed he posted this:
https://lore.kernel.org/all/315e3a268b165b6edad7dcb723b0d8a506a56c4e.1724309198.git.christophe.leroy@csgroup.eu/
If that's correct, maybe it's not so bad, at least here. I haven't yet
looked into the details of it.

Jason
LEROY Christophe Aug. 26, 2024, 10:20 a.m. UTC | #6
Le 26/08/2024 à 11:43, Thomas Gleixner a écrit :
> On Mon, Aug 26 2024 at 10:01, Christophe Leroy wrote:
>> Le 26/08/2024 à 09:50, Jason A. Donenfeld a écrit :
>>> But tglx pointed out in that thread that this actually isn't necessary:
>>>
>>> | All of this is pointless because if a 32-bit application runs on a
>>> | 64-bit kernel it has to use the 64-bit 'generation'. So why on earth do
>>> | we need magic here for a 32-bit kernel?
>>> |
>>> | Just use u64 for both and spare all this voodoo. We're seriously not
>>> | "optimizing" for 32-bit kernels.
>>> |
>>> | All what happens on a 32-bit kernel is that the RNG will store the
>>> | unsigned long (32bit) generation into a 64bit variable:
>>> |
>>> | 	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
>>> |
>>> | As the upper 32bit are always zero, there is no issue vs. load store
>>> | tearing at all. So there is zero benefit for this aside of slightly
>>> | "better" user space code when running on a 32-bit kernel. Who cares?
>>>
>>> So I just got rid of it and used a u64 as he suggested.
>>>
>>> However, there's also an additional reason why it's not worth churning
>>> further over this - because VM_DROPPABLE is 64-bit only (due to flags in
>>> vma bits), likely so is vDSO getrandom() for the time being. So I think
>>> it makes more sense to retool this series to be ppc64, and then if you
>>> really really want 32-bit and can convince folks it matters, then all of
>>> these parts (for example, here, the fact that the smp helper doesn't
>>> want to tear) can be fixed up in a separate series.
>>
>> So yes I really really want it on ppc32 because this is the only type of
>> boards I have and this is really were we need getrandom() to be
>> optimised,
> 
> For nostalgic reasons?

No nostalgy here. We deliver voice communication systems for air trafic 
control that we are commited to maintain for at least the next 15 years.

The MPC 885 was manufactured by NXP until recently and we have several 
hundreds in stock in case one of our customer needs to extend an 
existing system.

The MPC 8321 is still manufactured by NXP and the boards on new systems 
we deliver use that CPU.

Both are 32 bits powerpc.


> 
>> indeed ppc64 was sherry-on-the-cake in my series, I just added it
>> because it was easy to do after doing ppc32.
> 
> The rng problem for ppc32 seems to be:
> 
>     smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
> 
> right?

right.

> 
> Your proposed type change creates inconsistency for 32-bit userspace
> running on 64-bit kernels because the data struct layout changes.
> 
> As explained before, there is no problem with store or load tearing on
> 32bit systems because the generation counter is only 32bit wide. So the
> obvious solution is to only update 32 bits on a 32bit kernel:
> 
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -282,7 +282,7 @@ static void crng_reseed(struct work_stru
>   	 * is ordered with the write above to base_crng.generation. Pairs with
>   	 * the smp_rmb() before the syscall in the vDSO code.
>   	 */
> -	smp_store_release(&_vdso_rng_data.generation, next_gen + 1);
> +	smp_store_release((unsigned long *)&_vdso_rng_data.generation, next_gen + 1);
>   #endif
>   	if (!static_branch_likely(&crng_is_ready))
>   		crng_init = CRNG_READY;
> 
> Which is perfectly fine on 32-bit independent of endianess because the
> user space side does READ_ONCE(data->generation) and the read value is
> solely used for comparison so it does not matter at all whether the
> generation information is in the upper or the lower 32bit of the u64. >
> No?

Yes that looks ok, it will only require a big fat comment to explain 
that. I'll give it a try.

> 
> But that's a trivial fix compared to making VM_DROPPABLE work on 32-bit
> correclty. :)

Euh ... Ok. What's the problem here ? I have used VM_ARCH_1 which is 
available. Is there more to do ?

Thanks
Christophe
diff mbox series

Patch

diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index b85f24cac3f5..00b66a9b5778 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -123,7 +123,7 @@  struct vdso_data {
  * @is_ready:	boolean signaling whether the RNG is initialized
  */
 struct vdso_rng_data {
-	u64	generation;
+	unsigned long	generation;
 	u8	is_ready;
 };
 
diff --git a/include/vdso/getrandom.h b/include/vdso/getrandom.h
index a8b7c14b0ae0..36fd166bf330 100644
--- a/include/vdso/getrandom.h
+++ b/include/vdso/getrandom.h
@@ -38,7 +38,7 @@  struct vgetrandom_state {
 		};
 		u8		batch_key[CHACHA_BLOCK_SIZE * 2];
 	};
-	u64			generation;
+	unsigned long		generation;
 	u8			pos;
 	bool 			in_use;
 };