Message ID | 20221116111955.21057-1-quic_ugoswami@quicinc.com |
---|---|
State | Superseded |
Series | [v2] usb: gadget: f_fs: Prevent race between functionfs_unbind & ffs_ep0_queue_wait |

On Sun, Nov 20, 2022 at 12:23:50PM +0530, Udipto Goswami wrote:
> On 11/18/22 9:49 PM, John Keeping wrote:
> > On Wed, Nov 16, 2022 at 04:49:55PM +0530, Udipto Goswami wrote:
> > > While performing fast composition switch, there is a possibility that the
> > > process of ffs_ep0_write/ffs_ep0_read get into a race condition
> > > due to ep0req being freed up from functionfs_unbind.
> > >
> > > Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
> > > by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
> > > bounded so it can go ahead and mark the ep0req to NULL, and since there
> > > is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
> > >
> > > Fix this by making a serialized execution between the two functions using
> > > a mutex_lock(ffs->mutex). Also, dequeue the ep0req to ensure that no
> > > other function can use it after the free operation.
> > >
> > > Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
> > > Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
> > > ---
> > > v2: Replaces spinlock with mutex & added dequeue operation in unbind.
> > >
> > >  drivers/usb/gadget/function/f_fs.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > index 73dc10a77cde..1439449df39a 100644
> > > --- a/drivers/usb/gadget/function/f_fs.c
> > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > @@ -279,6 +279,9 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
> > >  	struct usb_request *req = ffs->ep0req;
> > >  	int ret;
> > >
> > > +	if (!req)
> > > +		return -EINVAL;
> > > +
> > >  	req->zero = len < le16_to_cpu(ffs->ev.setup.wLength);
> > >
> > >  	spin_unlock_irq(&ffs->ev.waitq.lock);
> > > @@ -1892,10 +1895,14 @@ static void functionfs_unbind(struct ffs_data *ffs)
> > >  	ENTER();
> > >
> > >  	if (!WARN_ON(!ffs->gadget)) {
> > > +		mutex_lock(&ffs->mutex);
> > > +		/* dequeue before freeing ep0req */
> > > +		usb_ep_dequeue(ffs->gadget->ep0, ffs->ep0req);
> > >  		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
> > >  		ffs->ep0req = NULL;
> > >  		ffs->gadget = NULL;
> > >  		clear_bit(FFS_FL_BOUND, &ffs->flags);
> > > +		mutex_unlock(&ffs->mutex);
> >
> > There's now a deadlock here if some other thread is waiting in
> > __ffs_ep0_queue_wait() on ep0req_completion.
> >
> > You need to dequeue before taking the lock.
> That's a control request right, will it be async?
>
> Anyway I see only 2 possible threads ep0_read/ep0_write who calls
> ep0_queue_wait and waits for the completion of ep0req and both
> ep0_read/write are protected by the mutex lock so I guess execution won't
> reach there right ?
> Say functionfs_unbind ran first then ep0_read/write had to wait till the
> functionfs_unbind is completed so ep_dequeue will run, will get completed,
> further free the request, mark it NULL. Now ep0_read/write will have ep0req
> as NULL so bail out.
>
> If reverse then functionfs_unbind will wait till the ep0_read/write is
> completed.

What guarantee is there that the transfer completes?

If there is such a guarantee, then the request will not be queued, so
why is usb_ep_dequeue() necessary?

On Mon, Nov 21, 2022 at 09:52:43AM +0530, Udipto Goswami wrote:
> Hi John
>
> On 11/20/22 11:18 PM, John Keeping wrote:
> > On Sun, Nov 20, 2022 at 12:23:50PM +0530, Udipto Goswami wrote:
> > > On 11/18/22 9:49 PM, John Keeping wrote:
> > > > On Wed, Nov 16, 2022 at 04:49:55PM +0530, Udipto Goswami wrote:
> > > > > While performing fast composition switch, there is a possibility that the
> > > > > process of ffs_ep0_write/ffs_ep0_read get into a race condition
> > > > > due to ep0req being freed up from functionfs_unbind.
> > > > >
> > > > > Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
> > > > > by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
> > > > > bounded so it can go ahead and mark the ep0req to NULL, and since there
> > > > > is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
> > > > >
> > > > > Fix this by making a serialized execution between the two functions using
> > > > > a mutex_lock(ffs->mutex). Also, dequeue the ep0req to ensure that no
> > > > > other function can use it after the free operation.
> > > > >
> > > > > Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
> > > > > Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
> > > > > ---
> > > > > v2: Replaces spinlock with mutex & added dequeue operation in unbind.
> > > > >
> > > > >  drivers/usb/gadget/function/f_fs.c | 7 +++++++
> > > > >  1 file changed, 7 insertions(+)
> > > > >
> > > > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > > > index 73dc10a77cde..1439449df39a 100644
> > > > > --- a/drivers/usb/gadget/function/f_fs.c
> > > > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > > > @@ -279,6 +279,9 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
> > > > >  	struct usb_request *req = ffs->ep0req;
> > > > >  	int ret;
> > > > >
> > > > > +	if (!req)
> > > > > +		return -EINVAL;
> > > > > +
> > > > >  	req->zero = len < le16_to_cpu(ffs->ev.setup.wLength);
> > > > >
> > > > >  	spin_unlock_irq(&ffs->ev.waitq.lock);
> > > > > @@ -1892,10 +1895,14 @@ static void functionfs_unbind(struct ffs_data *ffs)
> > > > >  	ENTER();
> > > > >
> > > > >  	if (!WARN_ON(!ffs->gadget)) {
> > > > > +		mutex_lock(&ffs->mutex);
> > > > > +		/* dequeue before freeing ep0req */
> > > > > +		usb_ep_dequeue(ffs->gadget->ep0, ffs->ep0req);
> > > > >  		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
> > > > >  		ffs->ep0req = NULL;
> > > > >  		ffs->gadget = NULL;
> > > > >  		clear_bit(FFS_FL_BOUND, &ffs->flags);
> > > > > +		mutex_unlock(&ffs->mutex);
> > > >
> > > > There's now a deadlock here if some other thread is waiting in
> > > > __ffs_ep0_queue_wait() on ep0req_completion.
> > > >
> > > > You need to dequeue before taking the lock.
> > > That's a control request right, will it be async?
> > >
> > > Anyway I see only 2 possible threads ep0_read/ep0_write who calls
> > > ep0_queue_wait and waits for the completion of ep0req and both
> > > ep0_read/write are protected by the mutex lock so I guess execution won't
> > > reach there right ?
> > > Say functionfs_unbind ran first then ep0_read/write had to wait till the
> > > functionfs_unbind is completed so ep_dequeue will run, will get completed,
> > > further free the request, mark it NULL. Now ep0_read/write will have ep0req
> > > as NULL so bail out.
> > >
> > > If reverse then functionfs_unbind will wait till the ep0_read/write is
> > > completed.
> >
> > What guarantee is there that the transfer completes?
> >
> > If there is such a guarantee, then the request will not be queued, so
> > why is usb_ep_dequeue() necessary?
>
> I agree that we cannot say that for sure, but we see that
> wait_for_completion in the ep0_queue_wait is also inside mutex which was
> acquired in ep0_read/write right?

Correct.

> I thought of maintaining the uniformity for the approaches.

What uniformity? If one process is blocked waiting for completion and
another process wants to cancel the operation, then the cancel
(usb_ep_dequeue()) must run concurrently with the wait, otherwise the
blocked process will never wake up.

On Tue, Nov 22, 2022 at 05:56:56PM +0530, Udipto Goswami wrote:
> On 11/22/22 5:17 PM, John Keeping wrote:
> > On Mon, Nov 21, 2022 at 09:52:43AM +0530, Udipto Goswami wrote:
> > > Hi John
> > >
> > > On 11/20/22 11:18 PM, John Keeping wrote:
> > > > On Sun, Nov 20, 2022 at 12:23:50PM +0530, Udipto Goswami wrote:
> > > > > On 11/18/22 9:49 PM, John Keeping wrote:
> > > > > > On Wed, Nov 16, 2022 at 04:49:55PM +0530, Udipto Goswami wrote:
> > > > > > > While performing fast composition switch, there is a possibility that the
> > > > > > > process of ffs_ep0_write/ffs_ep0_read get into a race condition
> > > > > > > due to ep0req being freed up from functionfs_unbind.
> > > > > > >
> > > > > > > Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
> > > > > > > by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
> > > > > > > bounded so it can go ahead and mark the ep0req to NULL, and since there
> > > > > > > is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.
> > > > > > >
> > > > > > > Fix this by making a serialized execution between the two functions using
> > > > > > > a mutex_lock(ffs->mutex). Also, dequeue the ep0req to ensure that no
> > > > > > > other function can use it after the free operation.
> > > > > > >
> > > > > > > Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
> > > > > > > Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
> > > > > > > ---
> > > > > > > v2: Replaces spinlock with mutex & added dequeue operation in unbind.
> > > > > > >
> > > > > > >  drivers/usb/gadget/function/f_fs.c | 7 +++++++
> > > > > > >  1 file changed, 7 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > > > > > index 73dc10a77cde..1439449df39a 100644
> > > > > > > --- a/drivers/usb/gadget/function/f_fs.c
> > > > > > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > > > > > @@ -279,6 +279,9 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
> > > > > > >  	struct usb_request *req = ffs->ep0req;
> > > > > > >  	int ret;
> > > > > > >
> > > > > > > +	if (!req)
> > > > > > > +		return -EINVAL;
> > > > > > > +
> > > > > > >  	req->zero = len < le16_to_cpu(ffs->ev.setup.wLength);
> > > > > > >
> > > > > > >  	spin_unlock_irq(&ffs->ev.waitq.lock);
> > > > > > > @@ -1892,10 +1895,14 @@ static void functionfs_unbind(struct ffs_data *ffs)
> > > > > > >  	ENTER();
> > > > > > >
> > > > > > >  	if (!WARN_ON(!ffs->gadget)) {
> > > > > > > +		mutex_lock(&ffs->mutex);
> > > > > > > +		/* dequeue before freeing ep0req */
> > > > > > > +		usb_ep_dequeue(ffs->gadget->ep0, ffs->ep0req);
> > > > > > >  		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
> > > > > > >  		ffs->ep0req = NULL;
> > > > > > >  		ffs->gadget = NULL;
> > > > > > >  		clear_bit(FFS_FL_BOUND, &ffs->flags);
> > > > > > > +		mutex_unlock(&ffs->mutex);
> > > > > >
> > > > > > There's now a deadlock here if some other thread is waiting in
> > > > > > __ffs_ep0_queue_wait() on ep0req_completion.
> > > > > >
> > > > > > You need to dequeue before taking the lock.
> > > > > That's a control request right, will it be async?
> > > > >
> > > > > Anyway I see only 2 possible threads ep0_read/ep0_write who calls
> > > > > ep0_queue_wait and waits for the completion of ep0req and both
> > > > > ep0_read/write are protected by the mutex lock so I guess execution won't
> > > > > reach there right ?
> > > > > Say functionfs_unbind ran first then ep0_read/write had to wait till the
> > > > > functionfs_unbind is completed so ep_dequeue will run, will get completed,
> > > > > further free the request, mark it NULL. Now ep0_read/write will have ep0req
> > > > > as NULL so bail out.
> > > > >
> > > > > If reverse then functionfs_unbind will wait till the ep0_read/write is
> > > > > completed.
> > > >
> > > > What guarantee is there that the transfer completes?
> > > >
> > > > If there is such a guarantee, then the request will not be queued, so
> > > > why is usb_ep_dequeue() necessary?
> > >
> > > I agree that we cannot say that for sure, but we see that
> > > wait_for_completion in the ep0_queue_wait is also inside mutex which was
> > > acquired in ep0_read/write right?
> >
> > Correct.
> >
> > > I thought of maintaining the uniformity for the approaches.
> >
> > What uniformity? If one process is blocked waiting for completion and
> > another process wants to cancel the operation, then the cancel
> > (usb_ep_dequeue()) must run concurrently with the wait, otherwise the
> > blocked process will never wake up.
>
> I get that, we want to rely on the dequeue to get us unblocked.
> But this is also true right that doing dequeue outside might cause this?
>
> functionfs_unbind
> ep0_dequeue
>                         ffs_ep0_read
>                         mutex_lock()
> giveback                ep0_queue
>                         map request buffer
> unmap buffer
>
> This can affect the controller's list i.e. the pending_list for ep0 or might
> also result in the controller accessing a stale memory address, isn't it?
>
> Or would the mutex let the ep0_read execute in atomic context? No
> right. Will it not be able to execute in parallel? If not then yeah we can do
> dequeue outside mutex for sure.

I would expect that if we're in unbind then any attempt to enqueue a new
request will fail, so if the mutex is taken in the case above ep_queue()
should fail with -ESHUTDOWN. But I can't actually find and point to any
code that ensures that is the case!

This doesn't matter too much though - it's not going to result in any
access to stale memory because either:

        ep0_dequeue
                                ffs_ep0_read
                                mutex_lock()
                                ep0_queue
                                ... wait for response ...
                                read ep0req->status
                                mutex_unlock()
        mutex_lock()
        free_ep0_request
        ...

or:

                                ffs_ep0_read
                                mutex_lock()
                                ep0_queue
        ep0_dequeue
                                wake up
                                read ep0req->status
                                mutex_unlock()
        mutex_lock()
        free_ep0_request
        ...

The first case isn't ideal as we don't want to be waiting for a request
while unbinding, but it's not unsafe.
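
The second ordering above (dequeue first, then take the mutex, then free) can be tried out in userspace. The following is only a rough pthread sketch under stated assumptions, not kernel code and not the eventual patch: `fake_ffs`, `fake_req`, `fake_ep0_read`, `fake_unbind` and `fake_unbind_demo` are hypothetical names invented for the illustration, a condition variable stands in for ep0req_completion, and the "dequeue" is modelled as completing the request with -ECONNRESET.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in f_fs.c. */
struct fake_req {
	int status;			/* plays the role of ep0req->status */
};

struct fake_ffs {
	pthread_mutex_t mutex;		/* plays the role of ffs->mutex */
	pthread_mutex_t lock;		/* protects the fields below */
	pthread_cond_t cond;		/* plays the role of ep0req_completion */
	struct fake_req *ep0req;
	bool queued;			/* request has been queued */
	bool done;			/* request completed or cancelled */
	int read_ret;			/* status seen by the reader thread */
};

/* Analogue of ffs_ep0_read(): queue and wait while holding the mutex. */
static void *fake_ep0_read(void *arg)
{
	struct fake_ffs *ffs = arg;

	pthread_mutex_lock(&ffs->mutex);
	pthread_mutex_lock(&ffs->lock);
	if (!ffs->ep0req) {
		ffs->read_ret = -EINVAL;	/* the NULL check from the patch */
	} else {
		ffs->queued = true;
		pthread_cond_broadcast(&ffs->cond);
		while (!ffs->done)		/* wait for completion */
			pthread_cond_wait(&ffs->cond, &ffs->lock);
		/* The request is still allocated: unbind cannot free it yet. */
		ffs->read_ret = ffs->ep0req->status;
	}
	pthread_mutex_unlock(&ffs->lock);
	pthread_mutex_unlock(&ffs->mutex);
	return NULL;
}

/* Analogue of functionfs_unbind() with the dequeue moved before the mutex. */
static void fake_unbind(struct fake_ffs *ffs)
{
	/* "Dequeue": cancel the in-flight request without holding the mutex. */
	pthread_mutex_lock(&ffs->lock);
	if (ffs->queued && !ffs->done) {
		ffs->ep0req->status = -ECONNRESET;
		ffs->done = true;
		pthread_cond_broadcast(&ffs->cond);
	}
	pthread_mutex_unlock(&ffs->lock);

	/* Only now serialise against the reader and free the request. */
	pthread_mutex_lock(&ffs->mutex);
	pthread_mutex_lock(&ffs->lock);
	free(ffs->ep0req);
	ffs->ep0req = NULL;
	pthread_mutex_unlock(&ffs->lock);
	pthread_mutex_unlock(&ffs->mutex);
}

/* Runs one reader against one unbind; returns the status the reader saw. */
int fake_unbind_demo(void)
{
	struct fake_ffs ffs = { .read_ret = 0 };
	pthread_t reader;
	int rc;

	pthread_mutex_init(&ffs.mutex, NULL);
	pthread_mutex_init(&ffs.lock, NULL);
	pthread_cond_init(&ffs.cond, NULL);
	ffs.ep0req = calloc(1, sizeof(*ffs.ep0req));
	assert(ffs.ep0req != NULL);

	rc = pthread_create(&reader, NULL, fake_ep0_read, &ffs);
	assert(rc == 0);
	(void)rc;

	/* Wait until the reader has queued the request and is blocked. */
	pthread_mutex_lock(&ffs.lock);
	while (!ffs.queued)
		pthread_cond_wait(&ffs.cond, &ffs.lock);
	pthread_mutex_unlock(&ffs.lock);

	fake_unbind(&ffs);
	pthread_join(reader, NULL);

	pthread_cond_destroy(&ffs.cond);
	pthread_mutex_destroy(&ffs.lock);
	pthread_mutex_destroy(&ffs.mutex);
	return ffs.read_ret;
}
```

In this sketch the reader is woken by the cancel and reads the status before `fake_unbind()` can acquire the mutex and free the request, so `fake_unbind_demo()` returns -ECONNRESET. If the cancel step were moved after `pthread_mutex_lock(&ffs->mutex)`, `fake_unbind()` would block on a mutex held by a reader that nothing can wake, which is the deadlock pointed out for v2.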

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 73dc10a77cde..1439449df39a 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -279,6 +279,9 @@ static int __ffs_ep0_queue_wait(struct ffs_data *ffs, char *data, size_t len)
 	struct usb_request *req = ffs->ep0req;
 	int ret;

+	if (!req)
+		return -EINVAL;
+
 	req->zero = len < le16_to_cpu(ffs->ev.setup.wLength);

 	spin_unlock_irq(&ffs->ev.waitq.lock);
@@ -1892,10 +1895,14 @@ static void functionfs_unbind(struct ffs_data *ffs)
 	ENTER();

 	if (!WARN_ON(!ffs->gadget)) {
+		mutex_lock(&ffs->mutex);
+		/* dequeue before freeing ep0req */
+		usb_ep_dequeue(ffs->gadget->ep0, ffs->ep0req);
 		usb_ep_free_request(ffs->gadget->ep0, ffs->ep0req);
 		ffs->ep0req = NULL;
 		ffs->gadget = NULL;
 		clear_bit(FFS_FL_BOUND, &ffs->flags);
+		mutex_unlock(&ffs->mutex);
 		ffs_data_put(ffs);
 	}
 }

While performing fast composition switch, there is a possibility that the
process of ffs_ep0_write/ffs_ep0_read get into a race condition
due to ep0req being freed up from functionfs_unbind.

Consider the scenario that the ffs_ep0_write calls the ffs_ep0_queue_wait
by taking a lock &ffs->ev.waitq.lock. However, the functionfs_unbind isn't
bounded so it can go ahead and mark the ep0req to NULL, and since there
is no NULL check in ffs_ep0_queue_wait we will end up in use-after-free.

Fix this by making a serialized execution between the two functions using
a mutex_lock(ffs->mutex). Also, dequeue the ep0req to ensure that no
other function can use it after the free operation.

Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver")
Signed-off-by: Udipto Goswami <quic_ugoswami@quicinc.com>
---
v2: Replaces spinlock with mutex & added dequeue operation in unbind.

 drivers/usb/gadget/function/f_fs.c | 7 +++++++
 1 file changed, 7 insertions(+)