Message ID | 20210624041316.567622-3-sukadev@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | ibmvnic: Assorted bug fixes | expand |
On Wed, Jun 23, 2021 at 11:17 PM Sukadev Bhattiprolu <sukadev@linux.ibm.com> wrote: > > From: Dany Madden <drt@linux.ibm.com> > > This reverts commit 7c451f3ef676c805a4b77a743a01a5c21a250a73. > > When a vnic interface is taken down and then up, connectivity is not > restored. We bisected it to this commit. Reverting this commit until > we can fully investigate the issue/benefit of the change. > Sometimes git bisect may lead us to the false positives. Full investigation is always the best solution if it is not too hard.
On 6/24/21 12:02 AM, Johaan Smith wrote: > On Wed, Jun 23, 2021 at 11:17 PM Sukadev Bhattiprolu > Sometimes git bisect may lead us to the false positives. Full investigation is > always the best solution if it is not too hard. I'd be happy to view evidence that points at another patch, if someone has some. But the original patch did not address any observed problem: "So there is no need for do_reset to call napi_schedule again at the end of the function though napi_schedule will neglect the request if napi is already scheduled." Given that it fixed no problem but appears to have introduced one, the efficient action is to revert it for now.
On Thu, Jun 24, 2021 at 2:29 AM Rick Lindsley <ricklind@linux.vnet.ibm.com> wrote: > > On 6/24/21 12:02 AM, Johaan Smith wrote: > > On Wed, Jun 23, 2021 at 11:17 PM Sukadev Bhattiprolu > > > Sometimes git bisect may lead us to the false positives. Full investigation is > > always the best solution if it is not too hard. > > I'd be happy to view evidence that points at another patch, if someone has some. > But the original patch did not address any observed problem: > > "So there is no need for do_reset to call napi_schedule again > at the end of the function though napi_schedule will neglect > the request if napi is already scheduled." > > Given that it fixed no problem but appears to have introduced one, the efficient > action is to revert it for now. > With this reverted patch, there are lots of errors after bringing device down then up, e.g., "ibmvnic 30000003 env3: rx buffer returned with rc 6". That seems something wrong with the handling of set_link_state DOWN/UP. It is either the communication protocol or the VIOS itself.
On 2021-06-24 10:05, Lijun Pan wrote: > On Thu, Jun 24, 2021 at 2:29 AM Rick Lindsley > <ricklind@linux.vnet.ibm.com> wrote: >> >> On 6/24/21 12:02 AM, Johaan Smith wrote: >> > On Wed, Jun 23, 2021 at 11:17 PM Sukadev Bhattiprolu >> >> > Sometimes git bisect may lead us to the false positives. Full investigation is >> > always the best solution if it is not too hard. >> >> I'd be happy to view evidence that points at another patch, if someone >> has some. >> But the original patch did not address any observed problem: >> >> "So there is no need for do_reset to call napi_schedule again >> at the end of the function though napi_schedule will neglect >> the request if napi is already scheduled." >> >> Given that it fixed no problem but appears to have introduced one, the >> efficient >> action is to revert it for now. >> > > With this reverted patch, there are lots of errors after bringing > device down then up, e.g., > "ibmvnic 30000003 env3: rx buffer returned with rc 6". > That seems something wrong with the handling of set_link_state > DOWN/UP. It is either the communication protocol or the VIOS itself. Did the driver bring the link back up with the patch is reverted? When link is down, vnic server returns rx buffers back to the client. This error message is printed when debug is turned on, driver's way to log what happened.
On 6/24/21 10:05 AM, Lijun Pan wrote: > With this reverted patch, there are lots of errors after bringing > device down then up, e.g., > "ibmvnic 30000003 env3: rx buffer returned with rc 6". Ok, thanks. Can you provide some more details about your test? When you ran this test, did you include the rest of the patches in this series, only some, or no others at all? I assume it was based on the contents of net from 6/23 or 6/24 - is that correct? Can you also describe the hardware you ran your test on? And lastly, would you briefly describe the nature of your test? The message you quoted is only seen with debug turned on, and as Dany remarked, can be expected and normal in some recovery situations. It's always possible you've found a case undetected by our testing, so the above information can help us better understand what you saw.
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index f13ad6bc67cd..fe1627ea9762 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -1234,6 +1234,11 @@ static int __ibmvnic_open(struct net_device *netdev) netif_tx_start_all_queues(netdev); + if (prev_state == VNIC_CLOSED) { + for (i = 0; i < adapter->req_rx_queues; i++) + napi_schedule(&adapter->napi[i]); + } + adapter->state = VNIC_OPEN; return rc; }