diff mbox series

spi: bcm2835: Fix buffer overflow with CS able to go beyond limit.

Message ID 20210420083402.6950-1-joe.burmeister@devtank.co.uk
State New
Headers show
Series spi: bcm2835: Fix buffer overflow with CS able to go beyond limit. | expand

Commit Message

Joe Burmeister April 20, 2021, 8:34 a.m. UTC
It was previoulsy possible to have a device tree with more chips than
the driver supports and go off the end of CS arrays.

This patches inforces CS limit but sets that limit to the max of the
default limit and what is in the device tree when driver is loaded.

Signed-off-by: Joe Burmeister <joe.burmeister@devtank.co.uk>
---
 drivers/spi/spi-bcm2835.c | 77 +++++++++++++++++++++++++++++----------
 1 file changed, 58 insertions(+), 19 deletions(-)

Comments

Mark Brown April 22, 2021, 4:42 p.m. UTC | #1
On Tue, Apr 20, 2021 at 09:34:02AM +0100, Joe Burmeister wrote:
> It was previoulsy possible to have a device tree with more chips than

> the driver supports and go off the end of CS arrays.

> 

> This patches inforces CS limit but sets that limit to the max of the

> default limit and what is in the device tree when driver is loaded.


This doesn't apply against current code, please check and resend.

Also s/previoulsy/previously/ and s/inforces/enforces/
Florian Fainelli April 22, 2021, 4:50 p.m. UTC | #2
On 4/20/2021 1:34 AM, Joe Burmeister wrote:
> It was previoulsy possible to have a device tree with more chips than

> the driver supports and go off the end of CS arrays.


Do you mind walking me through the code how that could have happened? We
have spi_register_controller() call of_spi_get_gpio_numbers() which has
the following:

ctlr->num_chipselect = max_t(int, nb, ctlr->num_chipselect);

such that what the controller has is the maximum between the number of
'cs-gpios' properties parsed and what was already populated in
ctrl->num_chipselect during bcm2835_spi_probe(), which for this driver
is BCM2835_SPI_NUM_CS (3).

> 

> This patches inforces CS limit but sets that limit to the max of the

> default limit and what is in the device tree when driver is loaded.

> 

> Signed-off-by: Joe Burmeister <joe.burmeister@devtank.co.uk>


You have changed many more things that just enforcing a limit on
BCM2835_SPI_NUM_CS you have now made all chip-select related data
structuresd dynamically allocated and you have changed a number of
prints to use the shorthand "dev" instead of &pdev->dev.
-- 
Florian
Joe Burmeister April 22, 2021, 8:10 p.m. UTC | #3
> On 4/20/2021 1:34 AM, Joe Burmeister wrote:

>> It was previoulsy possible to have a device tree with more chips than

>> the driver supports and go off the end of CS arrays.

> Do you mind walking me through the code how that could have happened? We

> have spi_register_controller() call of_spi_get_gpio_numbers() which has

> the following:

>

> ctlr->num_chipselect = max_t(int, nb, ctlr->num_chipselect);

>

> such that what the controller has is the maximum between the number of

> 'cs-gpios' properties parsed and what was already populated in

> ctrl->num_chipselect during bcm2835_spi_probe(), which for this driver

> is BCM2835_SPI_NUM_CS (3).


If you make a initial device tree (or add overlay in the rpi's 
config.txt) with more on the bus than BCM2835_SPI_NUM_CS (in my case 8
devices), you get into this trampling memory state. As the devices are
added, once the chip_select is equal to or greater than
BCM2835_SPI_NUM_CS, it's writing off the end of the arrays.

There is no protection from this happening. By the looks of it, this
isn't the only driver this could happen with, but it is the one I have
hardware for to test. There are also drivers that look like they don't
have a problem going well beyond the limit they gave.

There is protection in spi_add_device, which will catch extra added
later, but not ones in the device tree when the spi controller was
registered.

>> This patches inforces CS limit but sets that limit to the max of the

>> default limit and what is in the device tree when driver is loaded.

>>

>> Signed-off-by: Joe Burmeister <joe.burmeister@devtank.co.uk>

> You have changed many more things that just enforcing a limit on

> BCM2835_SPI_NUM_CS you have now made all chip-select related data

> structuresd dynamically allocated and you have changed a number of

> prints to use the shorthand "dev" instead of &pdev->dev.

The change to dynamic allocated arrays is just to support what is given
in the device  tree rather than increase and enforce the CS limit just
for my case.

The shorthand is of course not required. I'll drop it on resubmitting.

Regards,

Joe
Florian Fainelli April 22, 2021, 11:49 p.m. UTC | #4
On 4/22/2021 1:10 PM, Joe Burmeister wrote:
>> On 4/20/2021 1:34 AM, Joe Burmeister wrote:

>>> It was previoulsy possible to have a device tree with more chips than

>>> the driver supports and go off the end of CS arrays.

>> Do you mind walking me through the code how that could have happened? We

>> have spi_register_controller() call of_spi_get_gpio_numbers() which has

>> the following:

>>

>> ctlr->num_chipselect = max_t(int, nb, ctlr->num_chipselect);

>>

>> such that what the controller has is the maximum between the number of

>> 'cs-gpios' properties parsed and what was already populated in

>> ctrl->num_chipselect during bcm2835_spi_probe(), which for this driver

>> is BCM2835_SPI_NUM_CS (3).

> 

> If you make a initial device tree (or add overlay in the rpi's 

> config.txt) with more on the bus than BCM2835_SPI_NUM_CS (in my case 8

> devices), you get into this trampling memory state. As the devices are

> added, once the chip_select is equal to or greater than

> BCM2835_SPI_NUM_CS, it's writing off the end of the arrays.


OK.

> 

> There is no protection from this happening. By the looks of it, this

> isn't the only driver this could happen with, but it is the one I have

> hardware for to test. There are also drivers that look like they don't

> have a problem going well beyond the limit they gave.


Right, which means that we should probably seek a solution within the
SPI core itself, even if you can only test with spi-bcm2835.c chances
are that the fix would be applicable for other controllers if done in
the core.

> 

> There is protection in spi_add_device, which will catch extra added

> later, but not ones in the device tree when the spi controller was

> registered.


Not sure I follow you, if we have the overlay before
spi_register_controller() is called, how can the check there not
trigger? And if we load the overlay later when the SPI controller is
already registered, why does not spi_add_device()'s check work?

How would I go about reproducing this on a Pi4?

> 

>>> This patches inforces CS limit but sets that limit to the max of the

>>> default limit and what is in the device tree when driver is loaded.

>>>

>>> Signed-off-by: Joe Burmeister <joe.burmeister@devtank.co.uk>

>> You have changed many more things that just enforcing a limit on

>> BCM2835_SPI_NUM_CS you have now made all chip-select related data

>> structuresd dynamically allocated and you have changed a number of

>> prints to use the shorthand "dev" instead of &pdev->dev.

> The change to dynamic allocated arrays is just to support what is given

> in the device  tree rather than increase and enforce the CS limit just

> for my case.

> 

> The shorthand is of course not required. I'll drop it on resubmitting.

> 

> Regards,

> 

> Joe

> 

> 


-- 
Florian
Joe Burmeister April 23, 2021, 10:03 a.m. UTC | #5
On 23/04/2021 00:49, Florian Fainelli wrote:
> On 4/22/2021 1:10 PM, Joe Burmeister wrote:

>>> On 4/20/2021 1:34 AM, Joe Burmeister wrote:

>>>> It was previoulsy possible to have a device tree with more chips than

>>>> the driver supports and go off the end of CS arrays.

>>> Do you mind walking me through the code how that could have happened? 

We
>>> have spi_register_controller() call of_spi_get_gpio_numbers() which has

>>> the following:

>>>

>>> ctlr->num_chipselect = max_t(int, nb, ctlr->num_chipselect);

>>>

>>> such that what the controller has is the maximum between the number of

>>> 'cs-gpios' properties parsed and what was already populated in

>>> ctrl->num_chipselect during bcm2835_spi_probe(), which for this driver

>>> is BCM2835_SPI_NUM_CS (3).

>> If you make a initial device tree (or add overlay in the rpi's 

>> config.txt) with more on the bus than BCM2835_SPI_NUM_CS (in my case 8

>> devices), you get into this trampling memory state. As the devices are

>> added, once the chip_select is equal to or greater than

>> BCM2835_SPI_NUM_CS, it's writing off the end of the arrays.

> OK.

>

>> There is no protection from this happening. By the looks of it, this

>> isn't the only driver this could happen with, but it is the one I have

>> hardware for to test. There are also drivers that look like they don't

>> have a problem going well beyond the limit they gave.

> Right, which means that we should probably seek a solution within the

> SPI core itself, even if you can only test with spi-bcm2835.c chances

> are that the fix would be applicable for other controllers if done in

> the core.


I'm not sure it's possible to do in the core alone. The numb of the
issue is the core changes ctlr->num_chipselect to what is in the device
tree and some drivers are cool with that overs quietly stomp memory.

If we stop the core changing ctlr->num_chipselect then sod's law says
that we'd break existing devices which exceed the drivers num_chipselect
without a problem.

I've got a simple little patch to warn when the core expands
ctlr->num_chipselect. This warning won't go off in bcm2835 with my patch
because I am also extending ctlr->num_chipselect to the amount in the
device tree before the core does that expansion. Hopefully that new
warning would make people investigate and fix problem drivers.


>> There is protection in spi_add_device, which will catch extra added

>> later, but not ones in the device tree when the spi controller was

>> registered.

> Not sure I follow you, if we have the overlay before

> spi_register_controller() is called, how can the check there not

> trigger? And if we load the overlay later when the SPI controller is

> already registered, why does not spi_add_device()'s check work?


I think it might be a RPI thing. I think it is merging in the overlay
and giving Linux one already merged.


> How would I go about reproducing this on a Pi4?


Attached is a device tree overlay. If you compile that up and stick it
in /boot/overlays and add dtoverlay=rpi-bug to your config.txt, you can
get into this state.

If you do dtoverlay, you don't see anything, but if you do:

ls /dev/spi*

You can see all the spidev added by this are added. 2 of which go beyond
the drivers CS arrays.


>>>> This patches inforces CS limit but sets that limit to the max of the

>>>> default limit and what is in the device tree when driver is loaded.

>>>>

>>>> Signed-off-by: Joe Burmeister <joe.burmeister@devtank.co.uk>

>>> You have changed many more things that just enforcing a limit on

>>> BCM2835_SPI_NUM_CS you have now made all chip-select related data

>>> structuresd dynamically allocated and you have changed a number of

>>> prints to use the shorthand "dev" instead of &pdev->dev.

>> The change to dynamic allocated arrays is just to support what is given

>> in the device  tree rather than increase and enforce the CS limit 

just
>> for my case.

>>

>> The shorthand is of course not required. I'll drop it on resubmitting.

>>

>> Regards,

>>

>> Joe

>>

>>
Mark Brown April 23, 2021, 11:57 a.m. UTC | #6
On Fri, Apr 23, 2021 at 11:03:22AM +0100, Joe Burmeister wrote:
> On 23/04/2021 00:49, Florian Fainelli wrote:


> > Right, which means that we should probably seek a solution within the

> > SPI core itself, even if you can only test with spi-bcm2835.c chances

> > are that the fix would be applicable for other controllers if done in

> > the core.


> I'm not sure it's possible to do in the core alone. The numb of the

> issue is the core changes ctlr->num_chipselect to what is in the device

> tree and some drivers are cool with that overs quietly stomp memory.


I wouldn't expect any controller to be OK with that?  Drivers can store
per-client data in spi_device->controller_data which doesn't need
scaling (but is also not so helpful if you need to look at clients other
than the one you're currently controlling).

> I've got a simple little patch to warn when the core expands

> ctlr->num_chipselect. This warning won't go off in bcm2835 with my patch

> because I am also extending ctlr->num_chipselect to the amount in the

> device tree before the core does that expansion. Hopefully that new

> warning would make people investigate and fix problem drivers.


> >> There is protection in spi_add_device, which will catch extra added

> >> later, but not ones in the device tree when the spi controller was

> >> registered.


> > Not sure I follow you, if we have the overlay before

> > spi_register_controller() is called, how can the check there not

> > trigger? And if we load the overlay later when the SPI controller is

> > already registered, why does not spi_add_device()'s check work?


> I think it might be a RPI thing. I think it is merging in the overlay

> and giving Linux one already merged.


If the overlay is handled by the bootloader then from the point of view
of Linux there is no overlay - sounds like there's an issue in the
overlay, it should be overriding something that it doesn't?
Joe Burmeister April 23, 2021, 2:12 p.m. UTC | #7
On 23/04/2021 12:57, Mark Brown wrote:
> On Fri, Apr 23, 2021 at 11:03:22AM +0100, Joe Burmeister wrote:

>> On 23/04/2021 00:49, Florian Fainelli wrote:

>>> Right, which means that we should probably seek a solution within the

>>> SPI core itself, even if you can only test with spi-bcm2835.c chances

>>> are that the fix would be applicable for other controllers if done in

>>> the core.

>> I'm not sure it's possible to do in the core alone. The numb of the

>> issue is the core changes ctlr->num_chipselect to what is in the device

>> tree and some drivers are cool with that overs quietly stomp memory.

> I wouldn't expect any controller to be OK with that?  Drivers can store

> per-client data in spi_device->controller_data which doesn't need

> scaling (but is also not so helpful if you need to look at clients other

> than the one you're currently controlling).

I can see a number which certainly wouldn't. Though I don't want to
assume that all don't.

If we are happy just not letting the core expand num_chipselect that
does stop the condition on everything.

Any controller that can go higher without issue could them have their
num_chipselect set to what their real limit is if this enforcement
causes an issue.


>> I've got a simple little patch to warn when the core expands

>> ctlr->num_chipselect. This warning won't go off in bcm2835 with my patch

>> because I am also extending ctlr->num_chipselect to the amount in the

>> device tree before the core does that expansion. Hopefully that new

>> warning would make people investigate and fix problem drivers.

>>>> There is protection in spi_add_device, which will catch extra added

>>>> later, but not ones in the device tree when the spi controller was

>>>> registered.

>>> Not sure I follow you, if we have the overlay before

>>> spi_register_controller() is called, how can the check there not

>>> trigger? And if we load the overlay later when the SPI controller is

>>> already registered, why does not spi_add_device()'s check work?

>> I think it might be a RPI thing. I think it is merging in the overlay

>> and giving Linux one already merged.

> If the overlay is handled by the bootloader then from the point of view

> of Linux there is no overlay - sounds like there's an issue in the

> overlay, it should be overriding something that it doesn't?

Does it matter if the final device tree was compiled like that in the
first place or merge into that by the bootloader?

The limit isn't an hardware issue because the bcm2835 just uses any
gpios for CS. So hardware like ours with 8 spi chips on the bus is fine.
The driver's limit at 4 is arbitrary.

My patch for the bcm2835 just compares of what is in the device
tree and the harcoded limit and uses the largest. Other drivers do this.

Of course we could just raise BCM2835_SPI_NUM_CS to 8 or more if that is
preferred. Does seams like the dynamic solution is less favoured.

Regards,

Joe
Mark Brown April 23, 2021, 4:20 p.m. UTC | #8
On Fri, Apr 23, 2021 at 03:12:11PM +0100, Joe Burmeister wrote:
> On 23/04/2021 12:57, Mark Brown wrote:


> > I wouldn't expect any controller to be OK with that?  Drivers can store

> > per-client data in spi_device->controller_data which doesn't need

> > scaling (but is also not so helpful if you need to look at clients other

> > than the one you're currently controlling).


> I can see a number which certainly wouldn't. Though I don't want to

> assume that all don't.


Yeah, some won't - some do also rely on system specific assumptions
about what's possible but there's not really mechanisms for declaring
that.

> If we are happy just not letting the core expand num_chipselect that

> does stop the condition on everything.


> Any controller that can go higher without issue could them have their

> num_chipselect set to what their real limit is if this enforcement

> causes an issue.


Part of the issue here is that there has been some variation in how
num_chipselect is interpreted with regard to GPIO based chip selects
over time.  It *should* be redundant, I'm not clear why it's in the
generic bindings at all but that's lost to history AFAICT.

> >>> Not sure I follow you, if we have the overlay before

> >>> spi_register_controller() is called, how can the check there not

> >>> trigger? And if we load the overlay later when the SPI controller is

> >>> already registered, why does not spi_add_device()'s check work?


> >> I think it might be a RPI thing. I think it is merging in the overlay

> >> and giving Linux one already merged.


> > If the overlay is handled by the bootloader then from the point of view

> > of Linux there is no overlay - sounds like there's an issue in the

> > overlay, it should be overriding something that it doesn't?


> Does it matter if the final device tree was compiled like that in the

> first place or merge into that by the bootloader?


It matters in the context of a discussion of ordering between loading
the overlay and spi_register_controller() - it's clearly not loaded
afterwards.

> Of course we could just raise BCM2835_SPI_NUM_CS to 8 or more if that is

> preferred. Does seams like the dynamic solution is less favoured.


The best thing would be to have it not have a single array of chip
select specific data and instead store everything in the controller_data
that's there per-device.
Nicolas Saenz Julienne April 23, 2021, 5:34 p.m. UTC | #9
On Fri, 2021-04-23 at 17:20 +0100, Mark Brown wrote:
> On Fri, Apr 23, 2021 at 03:12:11PM +0100, Joe Burmeister wrote:

> > Of course we could just raise BCM2835_SPI_NUM_CS to 8 or more if that is

> > preferred. Does seams like the dynamic solution is less favoured.

> 

> The best thing would be to have it not have a single array of chip

> select specific data and instead store everything in the controller_data

> that's there per-device.


+1

All in all, it would make for a cleaner driver.

-- 
Nicolás Sáenz
Lukas Wunner May 1, 2021, 7:51 p.m. UTC | #10
On Fri, Apr 23, 2021 at 05:20:55PM +0100, Mark Brown wrote:
> Part of the issue here is that there has been some variation in how

> num_chipselect is interpreted with regard to GPIO based chip selects

> over time.  It *should* be redundant, I'm not clear why it's in the

> generic bindings at all but that's lost to history AFAICT.


It seems num_chipselect is meant to be set to the maximum number of
*native* chipselects supported by the controller.  Which is overwritten
if GPIO chipselects are used.

I failed to appreciate that when I changed num_chipselects for
spi-bcm2835.c with commit 571e31fa60b3.  That single line change
in the commit ought to be reverted.

And the kernel-doc ought to be amended because the crucial detail
that num_chipselect needs to be set to the maximum *native* chipselects
isn't mentioned anywhere.


> The best thing would be to have it not have a single array of chip

> select specific data and instead store everything in the controller_data

> that's there per-device.


Unfortunately that's non-trivial.  The slave-specific data is DMA-mapped.
It could be DMA-mapped in ->setup but there's no ->unsetup to DMA-unmap
the memory once the slave is removed.  Note that the slave could be removed
dynamically with a DT overlay, not just when the controller is unbound.

So we'd need a new ->unsetup hook at the very least to make this work.

The Foundation's downstream kernel now contains a bandaid commit which
raises the limit to 24 and errors out of ->probe if the limit is exceeded.
I would have preferred if it errored out of ->setup.  That way only
the slaves exceeding the limit would fail to instantiate:

https://github.com/raspberrypi/linux/commit/05f8d5826e28

Thoughts?

Lukas
Mark Brown May 4, 2021, 11:51 a.m. UTC | #11
On Sat, May 01, 2021 at 09:51:35PM +0200, Lukas Wunner wrote:
> On Fri, Apr 23, 2021 at 05:20:55PM +0100, Mark Brown wrote:

> > Part of the issue here is that there has been some variation in how

> > num_chipselect is interpreted with regard to GPIO based chip selects

> > over time.  It *should* be redundant, I'm not clear why it's in the

> > generic bindings at all but that's lost to history AFAICT.


> It seems num_chipselect is meant to be set to the maximum number of

> *native* chipselects supported by the controller.  Which is overwritten

> if GPIO chipselects are used.


This gets fun with the controllers that have for various reasons open
coded some or all of the GPIO chip select handling.

> I failed to appreciate that when I changed num_chipselects for

> spi-bcm2835.c with commit 571e31fa60b3.  That single line change

> in the commit ought to be reverted.


> And the kernel-doc ought to be amended because the crucial detail

> that num_chipselect needs to be set to the maximum *native* chipselects

> isn't mentioned anywhere.


Can you send patches for these please?

> > The best thing would be to have it not have a single array of chip

> > select specific data and instead store everything in the controller_data

> > that's there per-device.


> Unfortunately that's non-trivial.  The slave-specific data is DMA-mapped.

> It could be DMA-mapped in ->setup but there's no ->unsetup to DMA-unmap

> the memory once the slave is removed.  Note that the slave could be removed

> dynamically with a DT overlay, not just when the controller is unbound.


> So we'd need a new ->unsetup hook at the very least to make this work.


There's the cleanup() callback which seems to fit?
Lukas Wunner May 4, 2021, 1:53 p.m. UTC | #12
On Tue, May 04, 2021 at 12:51:30PM +0100, Mark Brown wrote:
> On Sat, May 01, 2021 at 09:51:35PM +0200, Lukas Wunner wrote:

> > I failed to appreciate that when I changed num_chipselects for

> > spi-bcm2835.c with commit 571e31fa60b3.  That single line change

> > in the commit ought to be reverted.

> 

> > And the kernel-doc ought to be amended because the crucial detail

> > that num_chipselect needs to be set to the maximum *native* chipselects

> > isn't mentioned anywhere.

> 

> Can you send patches for these please?


Yup, I've cooked up two patches over the weekend, one bcm2835 short-term
fix for-5.13 and one long-term solution for-5.14, they're on this branch:

https://github.com/l1k/linux/commits/bcm2835_spi_limit

Just needs some more polishing and testing before submission (hopefully
in the second half of this week).


> > Unfortunately that's non-trivial.  The slave-specific data is DMA-mapped.

> > It could be DMA-mapped in ->setup but there's no ->unsetup to DMA-unmap

> > the memory once the slave is removed.  Note that the slave could be removed

> > dynamically with a DT overlay, not just when the controller is unbound.

> 

> > So we'd need a new ->unsetup hook at the very least to make this work.

> 

> There's the cleanup() callback which seems to fit?


Right, I initially missed that but found it and then prepared the patch
on the above-linked branch.

Thanks,

Lukas
diff mbox series

Patch

diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
index aab6c7e5c114..cee761bfffe4 100644
--- a/drivers/spi/spi-bcm2835.c
+++ b/drivers/spi/spi-bcm2835.c
@@ -28,6 +28,7 @@ 
 #include <linux/gpio/consumer.h>
 #include <linux/gpio/machine.h> /* FIXME: using chip internals */
 #include <linux/gpio/driver.h> /* FIXME: using chip internals */
+#include <linux/of_gpio.h>
 #include <linux/of_irq.h>
 #include <linux/spi/spi.h>
 
@@ -134,7 +135,7 @@  struct bcm2835_spi {
 	int tx_prologue;
 	int rx_prologue;
 	unsigned int tx_spillover;
-	u32 prepare_cs[BCM2835_SPI_NUM_CS];
+	u32 *prepare_cs;
 
 	struct dentry *debugfs_dir;
 	u64 count_transfer_polling;
@@ -147,9 +148,9 @@  struct bcm2835_spi {
 	unsigned int rx_dma_active;
 	struct dma_async_tx_descriptor *fill_tx_desc;
 	dma_addr_t fill_tx_addr;
-	struct dma_async_tx_descriptor *clear_rx_desc[BCM2835_SPI_NUM_CS];
+	struct dma_async_tx_descriptor **clear_rx_desc;
 	dma_addr_t clear_rx_addr;
-	u32 clear_rx_cs[BCM2835_SPI_NUM_CS] ____cacheline_aligned;
+	u32 *clear_rx_cs;
 };
 
 #if defined(CONFIG_DEBUG_FS)
@@ -875,14 +876,14 @@  static void bcm2835_dma_release(struct spi_controller *ctlr,
 	if (ctlr->dma_rx) {
 		dmaengine_terminate_sync(ctlr->dma_rx);
 
-		for (i = 0; i < BCM2835_SPI_NUM_CS; i++)
+		for (i = 0; i < ctlr->num_chipselect; i++)
 			if (bs->clear_rx_desc[i])
 				dmaengine_desc_free(bs->clear_rx_desc[i]);
 
 		if (bs->clear_rx_addr)
 			dma_unmap_single(ctlr->dma_rx->device->dev,
 					 bs->clear_rx_addr,
-					 sizeof(bs->clear_rx_cs),
+					 sizeof(u32) * ctlr->num_chipselect,
 					 DMA_TO_DEVICE);
 
 		dma_release_channel(ctlr->dma_rx);
@@ -978,7 +979,7 @@  static int bcm2835_dma_init(struct spi_controller *ctlr, struct device *dev,
 
 	bs->clear_rx_addr = dma_map_single(ctlr->dma_rx->device->dev,
 					   bs->clear_rx_cs,
-					   sizeof(bs->clear_rx_cs),
+					   sizeof(u32) * ctlr->num_chipselect,
 					   DMA_TO_DEVICE);
 	if (dma_mapping_error(ctlr->dma_rx->device->dev, bs->clear_rx_addr)) {
 		dev_err(dev, "cannot map clear_rx_cs - not using DMA mode\n");
@@ -987,7 +988,7 @@  static int bcm2835_dma_init(struct spi_controller *ctlr, struct device *dev,
 		goto err_release;
 	}
 
-	for (i = 0; i < BCM2835_SPI_NUM_CS; i++) {
+	for (i = 0; i < ctlr->num_chipselect; i++) {
 		bs->clear_rx_desc[i] = dmaengine_prep_dma_cyclic(ctlr->dma_rx,
 					   bs->clear_rx_addr + i * sizeof(u32),
 					   sizeof(u32), 0,
@@ -1209,6 +1210,12 @@  static int bcm2835_spi_setup(struct spi_device *spi)
 	struct gpio_chip *chip;
 	u32 cs;
 
+	if (spi->chip_select >= ctlr->num_chipselect) {
+		dev_err(&spi->dev, "cs%d >= max %d\n", spi->chip_select,
+			ctlr->num_chipselect);
+		return -EINVAL;
+	}
+
 	/*
 	 * Precalculate SPI slave's CS register value for ->prepare_message():
 	 * The driver always uses software-controlled GPIO chip select, hence
@@ -1233,7 +1240,7 @@  static int bcm2835_spi_setup(struct spi_device *spi)
 						    BCM2835_SPI_CS_CLEAR_RX;
 		dma_sync_single_for_device(ctlr->dma_rx->device->dev,
 					   bs->clear_rx_addr,
-					   sizeof(bs->clear_rx_cs),
+					   sizeof(u32) * ctlr->num_chipselect,
 					   DMA_TO_DEVICE);
 	}
 
@@ -1286,39 +1293,71 @@  static int bcm2835_spi_setup(struct spi_device *spi)
 	return 0;
 }
 
+
+#ifdef CONFIG_OF
+static int bcm2835_spi_get_num_chipselect(struct platform_device *pdev)
+{
+	return max_t(int, of_gpio_named_count(pdev->dev.of_node, "cs-gpios"),
+		BCM2835_SPI_NUM_CS);
+}
+#else
+static int bcm2835_spi_get_num_chipselect(struct platform_device *pdev)
+{
+	return BCM2835_SPI_NUM_CS;
+}
+#endif
+
+
 static int bcm2835_spi_probe(struct platform_device *pdev)
 {
+	struct device *dev = &pdev->dev;
 	struct spi_controller *ctlr;
 	struct bcm2835_spi *bs;
+	int num_chipselect;
 	int err;
 
-	ctlr = devm_spi_alloc_master(&pdev->dev, ALIGN(sizeof(*bs),
+	ctlr = devm_spi_alloc_master(dev, ALIGN(sizeof(*bs),
 						  dma_get_cache_alignment()));
 	if (!ctlr)
 		return -ENOMEM;
 
+	num_chipselect = bcm2835_spi_get_num_chipselect(pdev);
+
 	platform_set_drvdata(pdev, ctlr);
 
 	ctlr->use_gpio_descriptors = true;
 	ctlr->mode_bits = BCM2835_SPI_MODE_BITS;
 	ctlr->bits_per_word_mask = SPI_BPW_MASK(8);
-	ctlr->num_chipselect = BCM2835_SPI_NUM_CS;
+	ctlr->num_chipselect = num_chipselect;
 	ctlr->setup = bcm2835_spi_setup;
 	ctlr->transfer_one = bcm2835_spi_transfer_one;
 	ctlr->handle_err = bcm2835_spi_handle_err;
 	ctlr->prepare_message = bcm2835_spi_prepare_message;
-	ctlr->dev.of_node = pdev->dev.of_node;
+	ctlr->dev.of_node = dev->of_node;
 
 	bs = spi_controller_get_devdata(ctlr);
 	bs->ctlr = ctlr;
 
+	bs->prepare_cs = devm_kmalloc(dev, num_chipselect * sizeof(u32), GFP_KERNEL);
+	if (!bs->prepare_cs)
+		return -ENOMEM;
+
+	bs->clear_rx_desc = devm_kmalloc(dev, num_chipselect *
+		sizeof(struct dma_async_tx_descriptor *), GFP_KERNEL);
+	if (!bs->clear_rx_desc)
+		return -ENOMEM;
+
+	bs->clear_rx_cs = devm_kmalloc(dev, num_chipselect * sizeof(u32), GFP_DMA);
+	if (!bs->clear_rx_cs)
+		return -ENOMEM;
+
 	bs->regs = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(bs->regs))
 		return PTR_ERR(bs->regs);
 
-	bs->clk = devm_clk_get(&pdev->dev, NULL);
+	bs->clk = devm_clk_get(dev, NULL);
 	if (IS_ERR(bs->clk))
-		return dev_err_probe(&pdev->dev, PTR_ERR(bs->clk),
+		return dev_err_probe(dev, PTR_ERR(bs->clk),
 				     "could not get clk\n");
 
 	bs->irq = platform_get_irq(pdev, 0);
@@ -1327,7 +1366,7 @@  static int bcm2835_spi_probe(struct platform_device *pdev)
 
 	clk_prepare_enable(bs->clk);
 
-	err = bcm2835_dma_init(ctlr, &pdev->dev, bs);
+	err = bcm2835_dma_init(ctlr, dev, bs);
 	if (err)
 		goto out_clk_disable;
 
@@ -1335,22 +1374,22 @@  static int bcm2835_spi_probe(struct platform_device *pdev)
 	bcm2835_wr(bs, BCM2835_SPI_CS,
 		   BCM2835_SPI_CS_CLEAR_RX | BCM2835_SPI_CS_CLEAR_TX);
 
-	err = devm_request_irq(&pdev->dev, bs->irq, bcm2835_spi_interrupt,
+	err = devm_request_irq(dev, bs->irq, bcm2835_spi_interrupt,
 			       IRQF_SHARED,
-			       dev_name(&pdev->dev), bs);
+			       dev_name(dev), bs);
 	if (err) {
-		dev_err(&pdev->dev, "could not request IRQ: %d\n", err);
+		dev_err(dev, "could not request IRQ: %d\n", err);
 		goto out_dma_release;
 	}
 
 	err = spi_register_controller(ctlr);
 	if (err) {
-		dev_err(&pdev->dev, "could not register SPI controller: %d\n",
+		dev_err(dev, "could not register SPI controller: %d\n",
 			err);
 		goto out_dma_release;
 	}
 
-	bcm2835_debugfs_create(bs, dev_name(&pdev->dev));
+	bcm2835_debugfs_create(bs, dev_name(dev));
 
 	return 0;