diff mbox series

[for-rc,4/4] IB/hfi1: Fix regressions in security fix

Message ID 1617025700-31865-5-git-send-email-dennis.dalessandro@cornelisnetworks.com
State New
Headers show
Series [for-rc,1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev | expand

Commit Message

Dennis Dalessandro March 29, 2021, 1:48 p.m. UTC
From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The security code guards for non-current mm in all cases for
updating the rb tree.

That is ok for insert, but NOT ok for remove, since the insert
has already guarded the node from being inserted and the remove
can be called with a different mm because of a segfault other similar
"close" issues where current-mm is NULL.

Best case, is we leak pages. worst case we delete items for an lru_list
more than once:
[20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)

Fix by removing the guard from any functions that remove nodes
from the tree assuming the node was entered into the tree as valid since
the insert is guarded.

Fixes: 3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c | 9 ---------
 1 file changed, 9 deletions(-)

Comments

Jason Gunthorpe April 7, 2021, 6:33 p.m. UTC | #1
On Mon, Mar 29, 2021 at 11:36:09AM -0700, Ira Weiny wrote:
> On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:

> > From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

> > 

> > The security code guards for non-current mm in all cases for

> > updating the rb tree.

> > 

> > That is ok for insert, but NOT ok for remove, since the insert

> > has already guarded the node from being inserted and the remove

> > can be called with a different mm because of a segfault other similar

> > "close" issues where current-mm is NULL.

> > 

> > Best case, is we leak pages. worst case we delete items for an lru_list

> > more than once:

> > [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)

> > 

> > Fix by removing the guard from any functions that remove nodes

> > from the tree assuming the node was entered into the tree as valid since

> > the insert is guarded.

> 

> Does this open up a child process being able to remove nodes which the parent

> added?


Dennis?

Jason
Dennis Dalessandro April 7, 2021, 8:20 p.m. UTC | #2
On 4/7/2021 2:33 PM, Jason Gunthorpe wrote:
> On Mon, Mar 29, 2021 at 11:36:09AM -0700, Ira Weiny wrote:

>> On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:

>>> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

>>>

>>> The security code guards for non-current mm in all cases for

>>> updating the rb tree.

>>>

>>> That is ok for insert, but NOT ok for remove, since the insert

>>> has already guarded the node from being inserted and the remove

>>> can be called with a different mm because of a segfault other similar

>>> "close" issues where current-mm is NULL.

>>>

>>> Best case, is we leak pages. worst case we delete items for an lru_list

>>> more than once:

>>> [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)

>>>

>>> Fix by removing the guard from any functions that remove nodes

>>> from the tree assuming the node was entered into the tree as valid since

>>> the insert is guarded.

>>

>> Does this open up a child process being able to remove nodes which the parent

>> added?

> 

> Dennis?


I believe it does in a way. I'm not sure what we can do about it.

One thought was to check mm for NULL and if so remove unconditionally 
because that means it's coming from the kernel killing the proc or 
something along those lines. If it's not NULL check against the saved mm 
value. Ira, do you recall discussing that during our internal review?

Need to do some more thinking on the right thing to do as I'm sure there 
are corner cases that I'm not seeing.

-Denny
Jason Gunthorpe April 13, 2021, 10:55 p.m. UTC | #3
On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

> 

> The security code guards for non-current mm in all cases for

> updating the rb tree.

> 

> That is ok for insert, but NOT ok for remove, since the insert

> has already guarded the node from being inserted and the remove

> can be called with a different mm because of a segfault other similar

> "close" issues where current-mm is NULL.

> 

> Best case, is we leak pages. worst case we delete items for an lru_list

> more than once:

> [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)

> 

> Fix by removing the guard from any functions that remove nodes

> from the tree assuming the node was entered into the tree as valid since

> the insert is guarded.

> 

> Fixes: 3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times")

> Cc: <stable@vger.kernel.org>

> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>

>  drivers/infiniband/hw/hfi1/mmu_rb.c | 9 ---------

>  1 file changed, 9 deletions(-)


I'm going to drop this - resend it when the more thinking is done

But generally the security concern is establishing new access to a mm,
not so much destroying access created by another user of a FD.

Jason
diff mbox series

Patch

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index f3fb28e..375a881 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -210,9 +210,6 @@  bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
 	unsigned long flags;
 	bool ret = false;
 
-	if (current->mm != handler->mn.mm)
-		return ret;
-
 	spin_lock_irqsave(&handler->lock, flags);
 	node = __mmu_rb_search(handler, addr, len);
 	if (node) {
@@ -235,9 +232,6 @@  void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 	unsigned long flags;
 	bool stop = false;
 
-	if (current->mm != handler->mn.mm)
-		return;
-
 	INIT_LIST_HEAD(&del_list);
 
 	spin_lock_irqsave(&handler->lock, flags);
@@ -271,9 +265,6 @@  void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 {
 	unsigned long flags;
 
-	if (current->mm != handler->mn.mm)
-		return;
-
 	/* Validity of handler and node pointers has been checked by caller. */
 	trace_hfi1_mmu_rb_remove(node->addr, node->len);
 	spin_lock_irqsave(&handler->lock, flags);