[2/3] scsi: megaraid_sas: check user-provided offsets

Message ID 20200908213715.3553098-2-arnd@arndb.de
State Superseded
Series [1/3] scsi: aacraid: improve compat_ioctl handlers

Commit Message

Arnd Bergmann Sept. 8, 2020, 9:36 p.m. UTC
It sounds unwise to let user space pass an unchecked 32-bit
offset into a kernel structure in an ioctl. This is an unsigned
variable, so checking the upper bound for the size of the structure
it points into is sufficient to avoid data corruption, but as
the pointer might also be unaligned, it has to be written carefully
as well.

While I stumbled over this problem by reading the code, I did not
continue checking the function for further problems like it.

Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
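
The check described in the commit message can be illustrated outside the
kernel. The following is a minimal user-space sketch, not the driver code:
FRAME_SIZE, store_le64() and set_sense_addr() are hypothetical names, and the
64-byte frame is an assumption standing in for sizeof(union megasas_frame).
Because the offset is unsigned, one upper-bound comparison is enough, and the
byte-wise store plays the role of the kernel's put_unaligned_le64().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for sizeof(union megasas_frame). */
#define FRAME_SIZE 64u

static_assert(FRAME_SIZE >= sizeof(uint64_t), "frame must hold one 64-bit field");

/*
 * Store a 64-bit value in little-endian order one byte at a time,
 * so the destination needs no particular alignment.
 */
static void store_le64(uint64_t val, void *dst)
{
	uint8_t *p = dst;

	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(val >> (8 * i));
}

/*
 * Write a DMA handle at a caller-supplied offset into the frame.
 * Returns 0 on success, or -EINVAL if the 8-byte store would run
 * past the end of the frame.
 */
static int set_sense_addr(uint8_t *frame, uint32_t sense_off, uint64_t handle)
{
	/* sense_off is unsigned, so only the upper bound needs checking */
	if (sense_off > FRAME_SIZE - sizeof(uint64_t))
		return -EINVAL;

	store_le64(handle, frame + sense_off);
	return 0;
}
```

With these assumptions, offset 56 is the last one accepted and any odd offset
below it is stored safely even though it is not 8-byte aligned.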

Comments

Phil Oester Dec. 31, 2020, 12:15 a.m. UTC | #1
On Tue, Sep 08, 2020 at 11:36:22PM +0200, Arnd Bergmann wrote:
> It sounds unwise to let user space pass an unchecked 32-bit
> offset into a kernel structure in an ioctl. This is an unsigned
> variable, so checking the upper bound for the size of the structure
> it points into is sufficient to avoid data corruption, but as
> the pointer might also be unaligned, it has to be written carefully
> as well.
> 
> While I stumbled over this problem by reading the code, I did not
> continue checking the function for further problems like it.

Sorry for replying to an ancient thread, but this patch just recently
made it into 5.10.3 and has caused unintended consequences.  On Dell
servers with PERC RAID controllers, booting 5.10.3+ with this patch
causes a PCI parity error.  Specifically:

Event Message: A PCI parity error was detected on a component at bus 0 device 5 function 0.
Severity: Critical
Message ID: PCI1308

I reverted this single patch and the errors went away.

Thoughts?

Phil Oester
Arnd Bergmann Jan. 3, 2021, 4:26 p.m. UTC | #2
On Thu, Dec 31, 2020 at 1:15 AM Phil Oester <kernel@linuxace.com> wrote:
>
> On Tue, Sep 08, 2020 at 11:36:22PM +0200, Arnd Bergmann wrote:
> > It sounds unwise to let user space pass an unchecked 32-bit
> > offset into a kernel structure in an ioctl. This is an unsigned
> > variable, so checking the upper bound for the size of the structure
> > it points into is sufficient to avoid data corruption, but as
> > the pointer might also be unaligned, it has to be written carefully
> > as well.
> >
> > While I stumbled over this problem by reading the code, I did not
> > continue checking the function for further problems like it.
>
> Sorry for replying to an ancient thread, but this patch just recently
> made it into 5.10.3 and has caused unintended consequences.  On Dell
> servers with PERC RAID controllers, booting 5.10.3+ with this patch
> causes a PCI parity error.  Specifically:
>
> Event Message: A PCI parity error was detected on a component at bus 0 device 5 function 0.
> Severity: Critical
> Message ID: PCI1308
>
> I reverted this single patch and the errors went away.
>
> Thoughts?

Thank you for the report and bisecting the issue, and sorry this broke
your system!

Fortunately, the patch is fairly small, so there are only a limited number
of things that could go wrong. I haven't tried to analyze that message,
but I have two ideas:

a) The added ioc->sense_off check gets triggered and the code relies
  on the data being written outside of the structure

b) the address actually needs to always be written as a 64-bit value
    regardless of the instance->consistent_mask_64bit flag, as the
   driver did before. This looked like it was done in error.

Can you try the patch below instead of the revert and see if that
resolves the regression, and if it triggers the warning message I
add?

       Arnd

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 6e4bf05c6d77..248063a4148b 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -8194,8 +8194,7 @@ megasas_mgmt_fw_ioctl(struct megasas_instance *instance,
                /* make sure the pointer is part of the frame */
                if (ioc->sense_off >
                    (sizeof(union megasas_frame) - sizeof(__le64))) {
-                       error = -EINVAL;
-                       goto out;
+                       pr_warn("possible out of bounds access offset %d\n", ioc->sense_off);
                }

                sense = dma_alloc_coherent(&instance->pdev->dev, ioc->sense_len,
@@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct megasas_instance *instance,
                if (instance->consistent_mask_64bit)
                        put_unaligned_le64(sense_handle, sense_ptr);
                else
-                       put_unaligned_le32(sense_handle, sense_ptr);
+                       put_unaligned_le64(sense_handle, sense_ptr);
        }

        /*
James Bottomley Jan. 3, 2021, 5 p.m. UTC | #3
On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote:
[...]
> @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct megasas_instance
> *instance,
>                 if (instance->consistent_mask_64bit)
>                         put_unaligned_le64(sense_handle, sense_ptr);
>                 else
> -                       put_unaligned_le32(sense_handle, sense_ptr);
> +                       put_unaligned_le64(sense_handle, sense_ptr);
>         }

This hunk can't be right.  It effectively means removing the if.
However, the if is needed because sense_handle is a dma_addr_t which
can be either 32 or 64 bit.  What about changing the if to 

if (sizeof(dma_addr_t) == 8)

instead?

James
Arnd Bergmann Jan. 3, 2021, 6:49 p.m. UTC | #4
On Sun, Jan 3, 2021 at 6:00 PM James Bottomley <jejb@linux.ibm.com> wrote:
> On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote:
> [...]
> > @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct megasas_instance
> > *instance,
> >                 if (instance->consistent_mask_64bit)
> >                         put_unaligned_le64(sense_handle, sense_ptr);
> >                 else
> > -                       put_unaligned_le32(sense_handle, sense_ptr);
> > +                       put_unaligned_le64(sense_handle, sense_ptr);
> >         }
>
> This hunk can't be right.  It effectively means removing the if.

I'm just trying to restore the state before the regression introduced
in my 381d34e376e3 ("scsi: megaraid_sas: Check user-provided offsets").

The old code always stored 'sizeof(long)' bytes into sense_ptr,
regardless of instance->consistent_mask_64bit, but it would truncate
the address to 32 bit if that was cleared. This was clearly bogus
and I tried to make it do something more meaningful, only storing
8 bytes into the structure if it was configured for 64-bit DMA, regardless
of the capabilities of the kernel.

> However, the if is needed because sense_handle is a dma_addr_t which
> can be either 32 or 64 bit.  What about changing the if to
>
> if (sizeof(dma_addr_t) == 8)
>
> instead?

That would not be useful either, the device surely does not care
if the kernel supports 64-bit DMA. What we'd really need here is
someone with access to the interface specifications to see how
many bytes should be stored in the structure. I suspect always
storing 64 bits (as my patch does) is correct, and would send a
proper patch to remove the if() if Phil confirms that my test
patch fixes the regression.

        Arnd
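
The difference Arnd describes between the old store and the patched one can
be sketched in user space. This is an illustration only: the function names
are hypothetical, and the "old" variant assumes a 64-bit kernel where
sizeof(long) is 8. In 32-bit mode the old code wrote a truncated value across
all eight bytes, while the patched code writes four bytes and leaves whatever
was already in the upper half of the field.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low nbytes of val in little-endian order, byte by byte. */
static void store_le(uint64_t val, uint8_t *dst, int nbytes)
{
	assert(nbytes >= 0 && nbytes <= 8);
	for (int i = 0; i < nbytes; i++)
		dst[i] = (uint8_t)(val >> (8 * i));
}

/* Read back an 8-byte little-endian field. */
static uint64_t load_le64(const uint8_t *src)
{
	uint64_t val = 0;

	for (int i = 0; i < 8; i++)
		val |= (uint64_t)src[i] << (8 * i);
	return val;
}

/*
 * Old behaviour, 32-bit mode, 64-bit kernel (assumption): the value is
 * truncated to 32 bits but the store still covers sizeof(long) == 8 bytes,
 * so the upper half of the field ends up zero.
 */
static void old_store_32bit_mode(uint64_t handle, uint8_t *field)
{
	store_le(handle & 0xffffffffu, field, 8);
}

/*
 * Behaviour after the first patch, 32-bit mode: only four bytes are
 * written, so the upper half of the field keeps its previous contents.
 */
static void patched_store_32bit_mode(uint64_t handle, uint8_t *field)
{
	store_le(handle, field, 4);
}
```

If the field held stale data before the store, only the old variant clears
the upper four bytes, which matches the regression discussed in this thread.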
James Bottomley Jan. 3, 2021, 8:12 p.m. UTC | #5
On Sun, 2021-01-03 at 19:49 +0100, Arnd Bergmann wrote:
> On Sun, Jan 3, 2021 at 6:00 PM James Bottomley <jejb@linux.ibm.com>
> wrote:
> > On Sun, 2021-01-03 at 17:26 +0100, Arnd Bergmann wrote:
> > [...]
> > > @@ -8209,7 +8208,7 @@ megasas_mgmt_fw_ioctl(struct
> > > megasas_instance
> > > *instance,
> > >                 if (instance->consistent_mask_64bit)
> > >                         put_unaligned_le64(sense_handle,
> > > sense_ptr);
> > >                 else
> > > -                       put_unaligned_le32(sense_handle,
> > > sense_ptr);
> > > +                       put_unaligned_le64(sense_handle,
> > > sense_ptr);
> > >         }
> > 
> > This hunk can't be right.  It effectively means removing the if.
> 
> I'm just trying to restore the state before the regression introduced
> in my 381d34e376e3 ("scsi: megaraid_sas: Check user-provided
> offsets").
> 
> The old code always stored 'sizeof(long)' bytes into sense_ptr,
> regardless of instance->consistent_mask_64bit, but it would truncate
> the address to 32 bit if that was cleared. This was clearly bogus
> and I tried to make it do something more meaningful, only storing
> 8 bytes into the structure if it was configured for 64-bit DMA,
> regardless of the capabilities of the kernel.

Heh, well, all this depends on how the firmware interprets the pointer,
for which we don't seem to have a manual.  Instinct tells me the flag
MFI_FRAME_SENSE64 is what does this and that's conditioned on the same
if clause 100 lines above this, so the fix you're proposing would still
seem to be wrong, because I think when that flag is not set, the device
expects the sense pointer to be 32 bit.

> > However, the if is needed because sense_handle is a dma_addr_t
> > which can be either 32 or 64 bit.  What about changing the if to
> > 
> > if (sizeof(dma_addr_t) == 8)
> > 
> > instead?
> 
> That would not be useful either, the device surely does not care
> if the kernel supports 64-bit DMA. What we'd really need here is
> someone with access to the interface specifications to see how
> many bytes should be stored in the structure. I suspect always
> storing 64 bits (as my patch does) is correct, and would send a
> proper patch to remove the if() if Phil confirms that my test
> patch fixes the regression.

Well, as I said above, I'm speculating the device does what we tell it,
and whether to use 32 or 64 bits for the sense pointer definitely seems
to be a flag the driver controls ... we really need someone with access
to the programming manual to tell us if this speculation is accurate,
though.

James
Phil Oester Jan. 4, 2021, 5:48 p.m. UTC | #6
On Sun, Jan 03, 2021 at 05:26:29PM +0100, Arnd Bergmann wrote:
> Thank you for the report and bisecting the issue, and sorry this broke
> your system!
> 
> Fortunately, the patch is fairly small, so there are only a limited number
> of things that could go wrong. I haven't tried to analyze that message,
> but I have two ideas:
> 
> a) The added ioc->sense_off check gets triggered and the code relies
>   on the data being written outside of the structure
> 
> b) the address actually needs to always be written as a 64-bit value
>     regardless of the instance->consistent_mask_64bit flag, as the
>    driver did before. This looked like it was done in error.
> 
> Can you try the patch below instead of the revert and see if that
> resolves the regression, and if it triggers the warning message I
> add?

Thanks Arnd, I tried your patch and it resolves the regression.  It does not
trigger the warning message you added.

Phil
Arnd Bergmann Jan. 4, 2021, 10:24 p.m. UTC | #7
On Mon, Jan 4, 2021 at 6:48 PM Phil Oester <kernel@linuxace.com> wrote:
>
> On Sun, Jan 03, 2021 at 05:26:29PM +0100, Arnd Bergmann wrote:
> > Thank you for the report and bisecting the issue, and sorry this broke
> > your system!
> >
> > Fortunately, the patch is fairly small, so there are only a limited number
> > of things that could go wrong. I haven't tried to analyze that message,
> > but I have two ideas:
> >
> > a) The added ioc->sense_off check gets triggered and the code relies
> >   on the data being written outside of the structure
> >
> > b) the address actually needs to always be written as a 64-bit value
> >     regardless of the instance->consistent_mask_64bit flag, as the
> >    driver did before. This looked like it was done in error.
> >
> > Can you try the patch below instead of the revert and see if that
> > resolves the regression, and if it triggers the warning message I
> > add?
>
> Thanks Arnd, I tried your patch and it resolves the regression.  It does not
> trigger the warning message you added.

Ok, thanks for testing! That would mean the range check is correct,
but the sense pointer must indeed be treated as a 64-bit entity
regardless of instance->consistent_mask_64bit, or at least the
upper 32 bit must be zero when the flag is unset, rather than
the recycled previous value.

I'll send a proper fix shortly, it would be nice if you could give it
another spin, but the behavior should be the same as this patch.

       Arnd
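
One way to express the conclusion above is to always write the full 64-bit
field and zero the upper half when 64-bit DMA is not in use. The sketch below
is user-space C under that assumption, not necessarily the fix that was
eventually sent; store_sense_addr() and its mask_64bit parameter are
hypothetical stand-ins for the driver's logic and for
instance->consistent_mask_64bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Always write eight little-endian bytes into the sense address field.
 * In 32-bit mode only the low 32 bits of the handle are kept, so the
 * upper four bytes are explicitly zeroed instead of being left stale.
 */
static void store_sense_addr(uint64_t handle, bool mask_64bit, uint8_t *field)
{
	uint64_t val = mask_64bit ? handle : (handle & 0xffffffffu);

	assert(field != NULL);
	for (int i = 0; i < 8; i++)
		field[i] = (uint8_t)(val >> (8 * i));
}
```

The byte-wise loop keeps the store safe for unaligned offsets, and both modes
now overwrite the same eight bytes of the frame.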
Patch

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 861f7140f52e..c3de69f3bee8 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -8095,7 +8095,7 @@  megasas_mgmt_fw_ioctl(struct megasas_instance *instance,
 	int error = 0, i;
 	void *sense = NULL;
 	dma_addr_t sense_handle;
-	unsigned long *sense_ptr;
+	void *sense_ptr;
 	u32 opcode = 0;
 	int ret = DCMD_SUCCESS;
 
@@ -8218,6 +8218,12 @@  megasas_mgmt_fw_ioctl(struct megasas_instance *instance,
 	}
 
 	if (ioc->sense_len) {
+		/* make sure the pointer is part of the frame */
+		if (ioc->sense_off > (sizeof(union megasas_frame) - sizeof(__le64))) {
+			error = -EINVAL;
+			goto out;
+		}
+
 		sense = dma_alloc_coherent(&instance->pdev->dev, ioc->sense_len,
 					     &sense_handle, GFP_KERNEL);
 		if (!sense) {
@@ -8225,12 +8231,11 @@  megasas_mgmt_fw_ioctl(struct megasas_instance *instance,
 			goto out;
 		}
 
-		sense_ptr =
-		(unsigned long *) ((unsigned long)cmd->frame + ioc->sense_off);
+		sense_ptr = (void *)cmd->frame + ioc->sense_off;
 		if (instance->consistent_mask_64bit)
-			*sense_ptr = cpu_to_le64(sense_handle);
+			put_unaligned_le64(sense_handle, sense_ptr);
 		else
-			*sense_ptr = cpu_to_le32(sense_handle);
+			put_unaligned_le32(sense_handle, sense_ptr);
 	}
 
 	/*