[v3,0/4] Disconnect devices before rfkilling adapter

Message ID 20240107180252.73436-1-verdre@v0yd.nl

Message

Jonas Dreßler Jan. 7, 2024, 6:02 p.m. UTC
Apparently the firmware is supposed to power off the bluetooth card
properly, including disconnecting devices, when we use rfkill to block
bluetooth. This doesn't work on a lot of laptops though, leading to weird
issues after turning off bluetooth, like the connection timing out on the
peripherals which were connected, and bluetooth not connecting properly
when the adapter is turned on again after rfkilling.

This series uses the rfkill hook in the bluetooth subsystem
to execute a few more shutdown commands and make sure that all
devices get disconnected before we close the HCI connection to the adapter.

---

v1: https://lore.kernel.org/linux-bluetooth/20240102133311.6712-1-verdre@v0yd.nl/
v2: https://lore.kernel.org/linux-bluetooth/20240102181946.57288-1-verdre@v0yd.nl/
v3:
 - Update commit message titles to reflect what's actually happening
   (disconnecting devices, not sending a power-off command).
 - Do the shutdown sequence synchronously instead of asynchronously.
 - Move the HCI_RFKILLED flag back so that it is set before shutdown.
 - Add a "fallback" hci_dev_do_close() to the error path because
   hci_set_powered_sync() might bail out early on error.

Jonas Dreßler (4):
  Bluetooth: Remove HCI_POWER_OFF_TIMEOUT
  Bluetooth: mgmt: Remove leftover queuing of power_off work
  Bluetooth: Add new state HCI_POWERING_DOWN
  Bluetooth: Disconnect connected devices before rfkilling adapter

 include/net/bluetooth/hci.h |  2 +-
 net/bluetooth/hci_core.c    | 35 +++++++++++++++++++++++++++++++++--
 net/bluetooth/hci_sync.c    | 16 +++++++++++-----
 net/bluetooth/mgmt.c        | 30 ++++++++++++++----------------
 4 files changed, 59 insertions(+), 24 deletions(-)

Comments

Jonas Dreßler Jan. 24, 2024, 6 p.m. UTC | #1
Hi Luiz,

On 1/8/24 11:25 PM, Jonas Dreßler wrote:
> Hi Luiz,
> 
> On 1/8/24 19:05, Luiz Augusto von Dentz wrote:
>> Hi Jonas,
>>
>> I will probably be applying this shortly, but let's try to add tests to
>> mgmt-tester just to make sure we don't introduce regressions later.
>> By the way, it seems there are a few suspend tests that do connect, for
>> example:
>>
>> Suspend - Success 5 (Pairing - Legacy) - waiting 1 seconds
>> random: crng init done
>>    New connection with handle 0x002a
>>    Test condition complete, 1 left
>> Suspend - Success 5 (Pairing - Legacy) - waiting done
>>    Set the system into Suspend via force_suspend
>>    New Controller Suspend event received
>>    Test condition complete, 0 left
>>
> 
> Thanks for that hint, I've been starting to write a test and managed to
> write to the rfkill file and it's blocking the device just fine, except
> I've run into what might be a bug in the virtual HCI driver:
> 
> So the power down sequence is initiated on the rfkill as expected and
> hci_set_powered_sync(false) is called. That then calls
> hci_write_scan_enable_sync(), and this HCI command never gets a response
> from the virtual HCI driver. Strangely, BT_HCI_CMD_WRITE_SCAN_ENABLE is
> implemented in btdev.c and the callback does get executed (I checked), it
> just doesn't send the command completed event:
> 
> < HCI Command: Write Scan Enable (0x03|0x001a) plen 1  #1588 [hci1] 12.294234
>          Scan enable: No Scans (0x00)
> 
> no response after...
> 

So I think I found the problem here too:

Calling hci_set_powered_sync() from within the context of the write to
the rfkill device blocks the write() until the HCI commands have
returned. Because the mgmt-tester process is stuck in write(), it can't
reply to the HCI commands using the emulator (which runs in the same
thread). After two seconds the HCI command times out and the test ends.

I haven't been able to confirm this beyond seeing that we are indeed
blocked in write(). Does this sound like a sane explanation to you?

It seems that for this to work we'd either have to stop blocking
userspace until the rfkill has finished or failed (I don't think that's
a good idea), or write to the rfkill device from a separate thread in
mgmt-tester. The latter should be fairly easy, so I'll give that a shot.
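
The blocking behaviour described above can be modelled in a small,
self-contained C sketch. This is a toy model and not mgmt-tester or
kernel code: a socketpair stands in for the path between the kernel and
the emulator, and every function name (blocking_rfkill_write,
emulator_service, run_single_threaded, run_with_writer_thread) is
hypothetical. The single-threaded case uses a 100 ms timeout instead of
the two seconds mentioned above, only to keep the example fast.

```c
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Models write() on the rfkill device: send one "HCI command", then block
 * until the emulator answers or timeout_ms expires. Returns 0 or -1. */
int blocking_rfkill_write(int kernel_fd, int timeout_ms)
{
	char cmd = 'C', rsp;
	struct pollfd pfd = { .fd = kernel_fd, .events = POLLIN };

	if (write(kernel_fd, &cmd, 1) != 1)
		return -1;
	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;	/* nobody replied in time */
	return read(kernel_fd, &rsp, 1) == 1 ? 0 : -1;
}

/* Models the emulator: wait up to timeout_ms for one command and answer it.
 * Returns 1 if a reply was sent, 0 otherwise. */
int emulator_service(int emu_fd, int timeout_ms)
{
	char cmd, rsp = 'R';
	struct pollfd pfd = { .fd = emu_fd, .events = POLLIN };

	if (poll(&pfd, 1, timeout_ms) <= 0)
		return 0;
	if (read(emu_fd, &cmd, 1) != 1)
		return 0;
	return write(emu_fd, &rsp, 1) == 1;
}

struct writer { int fd; int timeout_ms; int result; };

static void *writer_thread(void *arg)
{
	struct writer *w = arg;

	w->result = blocking_rfkill_write(w->fd, w->timeout_ms);
	return NULL;
}

/* One thread does the write and would also run the emulator: the emulator
 * never gets a turn while the write is blocked, so the write times out. */
int run_single_threaded(void)
{
	int sv[2], ret;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -2;
	ret = blocking_rfkill_write(sv[0], 100);
	close(sv[0]);
	close(sv[1]);
	return ret;
}

/* The write runs in its own thread, so the main thread stays free to
 * service the emulator and the write completes. */
int run_with_writer_thread(void)
{
	int sv[2];
	struct writer w;
	pthread_t tid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -2;
	w = (struct writer){ .fd = sv[0], .timeout_ms = 2000, .result = -1 };
	if (pthread_create(&tid, NULL, writer_thread, &w) != 0) {
		close(sv[0]);
		close(sv[1]);
		return -2;
	}
	emulator_service(sv[1], 2000);
	pthread_join(tid, NULL);
	close(sv[0]);
	close(sv[1]);
	return w.result;
}
```

In this model run_single_threaded() fails with a timeout while
run_with_writer_thread() succeeds. Whether the real test behaves the
same way still depends on the unconfirmed explanation above.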

Cheers,
Jonas

Luiz Augusto von Dentz Jan. 24, 2024, 6:10 p.m. UTC | #2
Hi Jonas,

On Wed, Jan 24, 2024 at 1:00 PM Jonas Dreßler <verdre@v0yd.nl> wrote:
>
> Hi Luiz,
>
> So I think I found the problem here too:
>
> The problem with this one is that calling hci_set_powered_sync() from
> within the context of the write to the rfkill device blocks the write()
> until the HCI commands have returned. Because the mgmt-tester process is
> stuck in write(), it can't reply to the HCI commands using the emulator
> (which runs in the same thread), and after two seconds the HCI command
> times out and the test ends.
>
> I haven't really been able to confirm this other than that we're indeed
> blocked in write(), does this sound like a sane explanation to you?
>
> Seems like for this to work we'd either have to stop blocking userspace
> until the rfkill has finished/failed (don't think that's a good idea), or
> write to the rfkill device from a separate thread in mgmt-tester? The
> latter should be fairly easy, so I'll give that a shot.

Userspace code is normally not thread safe, since we have usually relied
on the mainloop concept to avoid threading support, which might require
locking, etc. That said, we could perhaps either not block in the vhci
driver, by using hci_cmd_sync_queue etc., or use an async IO mechanism
in userspace so that we avoid blocking the btdev handling.
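
As a rough illustration of the mainloop-friendly alternative, the sketch
below splits the operation into a submit step that returns immediately
and a completion that a single poll() loop picks up alongside the
emulator traffic. It is a toy model under stated assumptions: a
socketpair again stands in for the kernel and emulator path, all names
(struct async_op, op_submit, mainloop_run, run_async_single_threaded)
are hypothetical, and it does not show which concrete async IO mechanism
would work for the real rfkill device.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

enum { OP_IDLE, OP_PENDING, OP_DONE, OP_FAILED };

struct async_op { int fd; int state; };

/* Submit step: send the "HCI command" and return without waiting. */
int op_submit(struct async_op *op, int kernel_fd)
{
	char cmd = 'C';

	op->fd = kernel_fd;
	op->state = write(kernel_fd, &cmd, 1) == 1 ? OP_PENDING : OP_FAILED;
	return op->state == OP_PENDING ? 0 : -1;
}

/* One loop, one thread: watch the emulator fd and the completion fd, so
 * the emulator can answer while the operation is still pending. */
int mainloop_run(struct async_op *op, int emu_fd, int timeout_ms)
{
	struct pollfd pfds[2] = {
		{ .fd = emu_fd, .events = POLLIN },
		{ .fd = op->fd, .events = POLLIN },
	};
	char c, rsp = 'R';

	while (op->state == OP_PENDING) {
		if (poll(pfds, 2, timeout_ms) <= 0) {
			op->state = OP_FAILED;	/* timeout or poll error */
			break;
		}
		if (pfds[0].revents & POLLIN) {
			/* Emulator side: consume the command and reply. */
			if (read(emu_fd, &c, 1) != 1 ||
			    write(emu_fd, &rsp, 1) != 1) {
				op->state = OP_FAILED;
				break;
			}
		}
		if (pfds[1].revents & POLLIN)
			/* Completion side: the reply has arrived. */
			op->state = read(op->fd, &c, 1) == 1 ?
				    OP_DONE : OP_FAILED;
	}
	return op->state == OP_DONE ? 0 : -1;
}

int run_async_single_threaded(void)
{
	int sv[2], ret;
	struct async_op op = { .fd = -1, .state = OP_IDLE };

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -2;
	ret = op_submit(&op, sv[0]);
	if (ret == 0)
		ret = mainloop_run(&op, sv[1], 2000);
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Because nothing blocks outside poll(), no locking is needed in this
model, which is the property the mainloop design is meant to preserve.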

> Cheers,
> Jonas