diff mbox series

[for-5.10] spi: rpc-if: Fix use-after-free on unbind

Message ID bf610a9fc88376e2cdf661c4ad0bb275ee5f4f20.1605512876.git.lukas@wunner.de
State Superseded
Series [for-5.10] spi: rpc-if: Fix use-after-free on unbind

Commit Message

Lukas Wunner Nov. 16, 2020, 8:23 a.m. UTC
rpcif_spi_remove() accesses the driver's private data after calling
spi_unregister_controller() even though that function releases the last
reference on the spi_controller and thereby frees the private data.

Fix by switching over to the new devm_spi_alloc_master() helper which
keeps the private data accessible until the driver has unbound.

Fixes: eb8d6d464a27 ("spi: add Renesas RPC-IF driver")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: <stable@vger.kernel.org> # v5.9+: 5e844cc37a5c: spi: Introduce device-managed SPI controller allocation
Cc: <stable@vger.kernel.org> # v5.9+
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
---
 drivers/spi/spi-rpc-if.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

Comments

Sergey Shtylyov Nov. 28, 2020, 8:20 p.m. UTC | #1
Hello!

On 11/16/20 11:23 AM, Lukas Wunner wrote:

> rpcif_spi_remove() accesses the driver's private data after calling
> spi_unregister_controller() even though that function releases the last
> reference on the spi_controller and thereby frees the private data.


   OK, your analysis seems correct (sorry for the delay admitting this :-).
   Not sure why spi_unregister_controller() drops the device reference while
spi_register_controller() itself doesn't allocate the memory... 

> Fix by switching over to the new devm_spi_alloc_master() helper which
> keeps the private data accessible until the driver has unbound.


   Perhaps the order of the calls in the remove() method could be reversed? 

> Fixes: eb8d6d464a27 ("spi: add Renesas RPC-IF driver")
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: <stable@vger.kernel.org> # v5.9+: 5e844cc37a5c: spi: Introduce device-managed SPI controller allocation
> Cc: <stable@vger.kernel.org> # v5.9+
> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

[...]

MBR, Sergei
Lukas Wunner Nov. 29, 2020, 11:35 a.m. UTC | #2
On Sat, Nov 28, 2020 at 11:20:50PM +0300, Sergey Shtylyov wrote:
> On 11/16/20 11:23 AM, Lukas Wunner wrote:

> > rpcif_spi_remove() accesses the driver's private data after calling
> > spi_unregister_controller() even though that function releases the last
> > reference on the spi_controller and thereby frees the private data.
> 
> OK, your analysis seems correct (sorry for the delay admitting this :-).


Thanks!  Is it okay to take this for an Acked-by?


> Not sure why spi_unregister_controller() drops the device reference
> while spi_register_controller() itself doesn't allocate the memory... 


Yes, that's exactly what I'm trying to move away from with
devm_spi_alloc_master() (introduced in v5.10-rc5 by 5e844cc37a5c).
The API as it has been so far has made it really easy to shoot oneself
in the foot.


> > Fix by switching over to the new devm_spi_alloc_master() helper which
> > keeps the private data accessible until the driver has unbound.
> 
>    Perhaps the order of the calls in the remove() method could be reversed? 


I'm not familiar with power management on these Renesas controllers
but rpcif_disable_rpm() calls pm_runtime_put_sync(), which I assume
may put the controller to sleep.

SPI transfers may still be ongoing until spi_unregister_controller()
returns.  Specifically, this function unbinds and unregisters all
SPI slaves attached to the controller and the slaves' drivers may
need to perform SPI transfers to quiesce interrupts on the slaves etc.

Thus, the correct order is to call spi_unregister_controller() first
and only then perform further teardown steps.  So the order in
rpcif_spi_remove() seems correct to me.
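
To illustrate the ordering argument, here is a minimal userspace sketch, not kernel code. The names order_ctlr, order_transfer(), order_unregister(), order_power_down() and order_teardown() are hypothetical stand-ins: order_unregister() models spi_unregister_controller() unbinding a slave driver that issues one last transfer, and order_power_down() models any teardown step that may put the controller to sleep. It assumes a transfer simply fails when the controller is powered down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a controller; not the kernel's struct spi_controller. */
struct order_ctlr {
	bool powered;          /* controller is awake and can do transfers */
	bool registered;       /* controller is still registered */
	int failed_transfers;  /* transfers attempted while powered down */
};

/* A transfer only succeeds while the controller is powered. */
static int order_transfer(struct order_ctlr *c)
{
	if (!c->powered) {
		c->failed_transfers++;
		return -1;
	}
	return 0;
}

/*
 * Models spi_unregister_controller(): the slave driver is unbound and
 * performs one last transfer, e.g. to quiesce interrupts on the slave.
 */
static void order_unregister(struct order_ctlr *c)
{
	order_transfer(c);
	c->registered = false;
}

/* Models a teardown step that may put the controller to sleep. */
static void order_power_down(struct order_ctlr *c)
{
	c->powered = false;
}

/* Runs both teardown steps in the given order, returns failed transfers. */
static int order_teardown(bool unregister_first)
{
	struct order_ctlr c = { .powered = true, .registered = true };

	if (unregister_first) {
		order_unregister(&c);
		order_power_down(&c);
	} else {
		order_power_down(&c);
		order_unregister(&c);
	}
	return c.failed_transfers;
}
```

In this model, order_teardown(true) finishes without a failed transfer, whereas order_teardown(false) leaves the slave's final transfer hitting a powered-down controller.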

The only thing that looks confusing is that rpcif_enable_rpm() calls
pm_runtime_enable(), whereas rpcif_disable_rpm() calls
pm_runtime_put_sync().  That looks incongruent.

Thanks,

Lukas
Sergey Shtylyov Nov. 30, 2020, 7:18 p.m. UTC | #3
On 11/29/20 2:35 PM, Lukas Wunner wrote:

>>> rpcif_spi_remove() accesses the driver's private data after calling
>>> spi_unregister_controller() even though that function releases the last
>>> reference on the spi_controller and thereby frees the private data.
>>
>> OK, your analysis seems correct (sorry for the delay admitting this :-).
> 
> Thanks!  Is it okay to take this for an Acked-by?


   Not yet. :-)

>> Not sure why spi_unregister_controller() drops the device reference
>> while spi_register_controller() itself doesn't allocate the memory... 
> 
> Yes, that's exactly what I'm trying to move away from with
> devm_spi_alloc_master() (introduced in v5.10-rc5 by 5e844cc37a5c).
> The API as it has been so far has made it really easy to shoot oneself
> in the foot.


   Maybe it needs to be fixed, rather than using the managed device API?

>>> Fix by switching over to the new devm_spi_alloc_master() helper which
>>> keeps the private data accessible until the driver has unbound.
>>
>>    Perhaps the order of the calls in the remove() method could be reversed? 
> 
> I'm not familiar with power management on these Renesas controllers
> but rpcif_disable_rpm() calls pm_runtime_put_sync(), which I assume
> may put the controller to sleep.


   Sigh, that's a stupid typo on my part, being fixed now to pm_runtime_disable()...

> SPI transfers may still be ongoing until spi_unregister_controller()
> returns.  Specifically, this function unbinds and unregisters all
> SPI slaves attached to the controller and the slaves' drivers may
> need to perform SPI transfers to quiesce interrupts on the slaves etc.
> 
> Thus, the correct order is to call spi_unregister_controller() first
> and only then perform further teardown steps.  So the order in
> rpcif_spi_remove() seems correct to me.


   OK. :-)

> The only thing that looks confusing is that rpcif_enable_rpm() calls
> pm_runtime_enable(), whereas rpcif_disable_rpm() calls
> pm_runtime_put_sync().  That looks incongruent.


   Do you need a link to the fix (it's a whole patchset of minor fixes)?

> Thanks,
> 
> Lukas


MBR, Sergei
Lukas Wunner Dec. 2, 2020, 11:43 a.m. UTC | #4
On Mon, Nov 30, 2020 at 10:18:12PM +0300, Sergey Shtylyov wrote:
> On 11/29/20 2:35 PM, Lukas Wunner wrote:

> > > Not sure why spi_unregister_controller() drops the device reference
> > > while spi_register_controller() itself doesn't allocate the memory... 
> > 
> > Yes, that's exactly what I'm trying to move away from with
> > devm_spi_alloc_master() (introduced in v5.10-rc5 by 5e844cc37a5c).
> > The API as it has been so far has made it really easy to shoot oneself
> > in the foot.
> 
> Maybe it needs to be fixed, rather than using the managed device API?


devm_spi_alloc_master() *is* the fix, or at least a means to get there:

No longer dropping the reference in spi_unregister_controller() requires
that the drivers drop the reference.  So every single SPI driver needs to
be touched.  However, upon closer examination I've found tons of bugs in
the ->probe and ->remove hooks of SPI drivers, some of them related to
reference counting (leaks or use-after-free), others related to not
disabling clocks properly etc.  Ideally, the fixes for those bugs should
be backported to stable.

devm_spi_alloc_master() allows me to do that and at the same time it
allows stretching the migration across multiple releases.  That's because
spi_unregister_controller() auto-senses if devm_spi_alloc_master() was
used, and if so, it no longer drops a reference.
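
The reference handling described above can be modelled with a minimal userspace sketch. This is not the kernel implementation: life_ctlr, life_put(), life_alloc(), life_unregister(), life_devres_release() and remove_can_touch_priv() are hypothetical names, and the devm_allocated flag is an assumption standing in for however spi_unregister_controller() detects a managed allocation. The freed pointer exists only so the model can observe the release without touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a refcounted controller with private data. */
struct life_ctlr {
	int refcount;
	bool devm_allocated;  /* assumption: marks a managed allocation */
	bool *freed;          /* set to true when the memory is released */
	int priv;             /* stands in for the driver's private data */
};

/* Drop one reference; the last one frees controller and private data. */
static void life_put(struct life_ctlr *c)
{
	if (--c->refcount == 0) {
		*c->freed = true;
		free(c);
	}
}

/* Models spi_alloc_master() (devm == false) or the managed variant. */
static struct life_ctlr *life_alloc(bool devm, bool *freed)
{
	struct life_ctlr *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->refcount = 1;
	c->devm_allocated = devm;
	c->freed = freed;
	*freed = false;
	return c;
}

/* Models unregistering: drops the reference only if not managed. */
static void life_unregister(struct life_ctlr *c)
{
	if (!c->devm_allocated)
		life_put(c);
}

/* Models the managed release that runs after the driver has unbound. */
static void life_devres_release(struct life_ctlr *c)
{
	if (c->devm_allocated)
		life_put(c);
}

/* Is the private data still valid in remove() after unregistering? */
static bool remove_can_touch_priv(bool devm)
{
	bool freed;
	struct life_ctlr *c = life_alloc(devm, &freed);
	bool ok;

	if (!c)
		return false;
	life_unregister(c);
	ok = !freed;  /* would a later teardown step be safe here? */
	if (ok)
		life_devres_release(c);  /* released only after unbind */
	return ok;
}
```

With the unmanaged allocation, the model reports that the private data is gone by the time remove() would touch it; with the managed allocation, it stays valid until the release step that follows unbinding.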

devm_spi_alloc_master() has the additional advantage of simplifying
probe error paths, as is apparent from the diffstat of the $subject patch:

 drivers/spi/spi-rpc-if.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

I think the vast majority of SPI drivers can be converted to
devm_spi_alloc_master() and the few that can't will be amended to
explicitly drop a reference.


> > > Perhaps the order of the calls in the remove() method could be reversed? 
> > 
> > I'm not familiar with power management on these Renesas controllers
> > but rpcif_disable_rpm() calls pm_runtime_put_sync(), which I assume
> > may put the controller to sleep.
> 
> Sigh, that's a stupid typo on my part, being fixed now to
> pm_runtime_disable()...


Okay, in that case the order of the two calls in rpcif_spi_remove()
won't matter, i.e. it would actually be possible to fix the UAF by
calling rpcif_disable_rpm() before spi_unregister_controller().

However, I still recommend fixing the UAF in the way proposed by
the $subject patch because of the simplified probe error path and
reduced LoC.


> > The only thing that looks confusing is that rpcif_enable_rpm() calls
> > pm_runtime_enable(), whereas rpcif_disable_rpm() calls
> > pm_runtime_put_sync().  That looks incongruent.
> 
> Do you need a link to the fix (it's a whole patchset of minor fixes)?


I don't *need* it, but am happy to take a look.  Glad that I was able to
point out another bug. :)

Thanks,

Lukas
Patch

diff --git a/drivers/spi/spi-rpc-if.c b/drivers/spi/spi-rpc-if.c
index ed3e548227f4..3579675485a5 100644
--- a/drivers/spi/spi-rpc-if.c
+++ b/drivers/spi/spi-rpc-if.c
@@ -134,7 +134,7 @@  static int rpcif_spi_probe(struct platform_device *pdev)
 	struct rpcif *rpc;
 	int error;
 
-	ctlr = spi_alloc_master(&pdev->dev, sizeof(*rpc));
+	ctlr = devm_spi_alloc_master(&pdev->dev, sizeof(*rpc));
 	if (!ctlr)
 		return -ENOMEM;
 
@@ -159,13 +159,8 @@  static int rpcif_spi_probe(struct platform_device *pdev)
 	error = spi_register_controller(ctlr);
 	if (error) {
 		dev_err(&pdev->dev, "spi_register_controller failed\n");
-		goto err_put_ctlr;
+		rpcif_disable_rpm(rpc);
 	}
-	return 0;
-
-err_put_ctlr:
-	rpcif_disable_rpm(rpc);
-	spi_controller_put(ctlr);
 
 	return error;
 }