Message ID | 20210318081507.36287-1-yongxin.liu@windriver.com |
---|---|
State | New |
Series | [net] ice: fix memory leak of aRFS after resuming from suspend |
> -----Original Message-----
> From: Creeley, Brett <brett.creeley@intel.com>
> Sent: Friday, March 19, 2021 06:20
> To: Liu, Yongxin <Yongxin.Liu@windriver.com>; jeffrey.t.kirsher@intel.com;
> Chittim, Madhu <madhu.chittim@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; andrewx.bowers@intel.com
> Cc: netdev@vger.kernel.org
> Subject: Re: [PATCH net] ice: fix memory leak of aRFS after resuming from
> suspend
>
> On Thu, 2021-03-18 at 16:15 +0800, Yongxin Liu wrote:
> > In ice_suspend(), ice_clear_interrupt_scheme() is called, and then
> > irq_free_descs() will be eventually called to free irq and its
> > descriptor.
> >
> > In ice_resume(), ice_init_interrupt_scheme() is called to allocate new
> > irqs.
> > However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap
> > maybe cannot be freed, if the irqs that released in ice_suspend() were
> > reassigned to other devices, which makes irq descriptor's
> > affinity_notify lost.
> >
> > So move ice_remove_arfs() before ice_clear_interrupt_scheme(), which
> > can make sure all irq_glue and cpu_rmap can be correctly released
> > before corresponding irq and descriptor are released.
> >
> > Fix the following memeory leak.
>
> s/memeory/memory
>
> <snip>
>
> > diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c
> > b/drivers/net/ethernet/intel/ice/ice_arfs.c
> > index 6560acd76c94..c748d0a5c7d4 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_arfs.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
> > @@ -654,7 +654,6 @@ void ice_rebuild_arfs(struct ice_pf *pf)
> >  	if (!pf_vsi)
> >  		return;
> >
> > -	ice_remove_arfs(pf);
>
> This should not be removed. Removing this would break the reset flows
> outside of the suspend/remove case.
>
> >  	if (ice_set_cpu_rx_rmap(pf_vsi)) {
> >  		dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
> >  		return;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> > b/drivers/net/ethernet/intel/ice/ice_main.c
> > index 2c23c8f468a5..dba901bf2b9b 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -4568,6 +4568,9 @@ static int __maybe_unused ice_suspend(struct
> > device *dev)
> >  			continue;
> >  		ice_vsi_free_q_vectors(pf->vsi[v]);
> >  	}
> > +	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
> > +		ice_remove_arfs(pf);
> > +	}
>
> Braces aren't needed around a single if statement like this.
>
> Also, I don't think this is the right solution. I think a better approach
> would be to call ice_free_rx_cpu_map() here. With this, it seems like no
> other changes are necessary. It also isn't necessary to check the
> ICE_FLAG_FD_ENA bit with this change.

Thanks for your valuable review. I will send V2.

--Yongxin

> >  	ice_clear_interrupt_scheme(pf);
> >
> >  	pci_save_state(pdev);
diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
index 6560acd76c94..c748d0a5c7d4 100644
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c
+++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
@@ -654,7 +654,6 @@ void ice_rebuild_arfs(struct ice_pf *pf)
 	if (!pf_vsi)
 		return;
 
-	ice_remove_arfs(pf);
 	if (ice_set_cpu_rx_rmap(pf_vsi)) {
 		dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
 		return;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 2c23c8f468a5..dba901bf2b9b 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -4568,6 +4568,9 @@ static int __maybe_unused ice_suspend(struct device *dev)
 			continue;
 		ice_vsi_free_q_vectors(pf->vsi[v]);
 	}
+	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
+		ice_remove_arfs(pf);
+	}
 	ice_clear_interrupt_scheme(pf);
 
 	pci_save_state(pdev);
In ice_suspend(), ice_clear_interrupt_scheme() is called, and then
irq_free_descs() will be eventually called to free irq and its descriptor.

In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs.
However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe
cannot be freed, if the irqs that released in ice_suspend() were reassigned
to other devices, which makes irq descriptor's affinity_notify lost.

So move ice_remove_arfs() before ice_clear_interrupt_scheme(), which can
make sure all irq_glue and cpu_rmap can be correctly released before
corresponding irq and descriptor are released.

Fix the following memeory leak.

unreferenced object 0xffff95bd951afc00 (size 512):
  comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
  hex dump (first 32 bytes):
    18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff  ........p.......
    00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff  ................
  backtrace:
    [<0000000072e4b914>] __kmalloc+0x336/0x540
    [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0
    [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice]
    [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
    [<00000000d692edba>] local_pci_probe+0x47/0xa0
    [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
    [<00000000555a9e4a>] process_one_work+0x1dd/0x410
    [<000000002c4b414a>] worker_thread+0x221/0x3f0
    [<00000000bb2b556b>] kthread+0x14c/0x170
    [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff95bd81b0a2a0 (size 96):
  comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
  hex dump (first 32 bytes):
    38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00  8...............
    b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff  ................
  backtrace:
    [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0
    [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0
    [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice]
    [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
    [<00000000d692edba>] local_pci_probe+0x47/0xa0
    [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
    [<00000000555a9e4a>] process_one_work+0x1dd/0x410
    [<000000002c4b414a>] worker_thread+0x221/0x3f0
    [<00000000bb2b556b>] kthread+0x14c/0x170
    [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30

Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
---
 drivers/net/ethernet/intel/ice/ice_arfs.c | 1 -
 drivers/net/ethernet/intel/ice/ice_main.c | 3 +++
 2 files changed, 3 insertions(+), 1 deletion(-)
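
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch and not driver code: every `toy_*` name is hypothetical, the fixed pool stands in for kernel allocations, and the single notifier pointer per descriptor is an assumed simplification of how the descriptor's affinity_notify refers to the glue object. It shows that if the descriptor is freed before the notifier is released, a later cleanup can no longer reach the notifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_IRQS 4

/* Hypothetical stand-in for the per-IRQ glue object. */
struct toy_notify {
	int irq;
};

/* Hypothetical stand-in for an IRQ descriptor; it holds the only pointer. */
struct toy_irq_desc {
	bool in_use;
	struct toy_notify *notify;
};

static struct toy_irq_desc toy_descs[TOY_NR_IRQS];
static struct toy_notify toy_pool[TOY_NR_IRQS];	/* models allocations */
static int toy_live_notifiers;			/* allocated, not yet released */

static void toy_reset(void)
{
	memset(toy_descs, 0, sizeof(toy_descs));
	toy_live_notifiers = 0;
}

/* Claim the first free descriptor; returns its number, or -1 if none. */
static int toy_alloc_irq(void)
{
	for (int i = 0; i < TOY_NR_IRQS; i++) {
		if (!toy_descs[i].in_use) {
			toy_descs[i].in_use = true;
			toy_descs[i].notify = NULL;
			return i;
		}
	}
	return -1;
}

/* Attach a notifier to the descriptor (models setting up the rmap). */
static void toy_add_notifier(int irq)
{
	toy_pool[irq].irq = irq;
	toy_descs[irq].notify = &toy_pool[irq];
	toy_live_notifiers++;
}

/* Release the notifier via the descriptor; a no-op if it cannot be found. */
static void toy_remove_notifier(int irq)
{
	struct toy_irq_desc *d = &toy_descs[irq];

	if (!d->in_use || !d->notify)
		return;
	d->notify = NULL;
	toy_live_notifiers--;
}

/* Free the descriptor; the pointer to any attached notifier is dropped. */
static void toy_free_irq(int irq)
{
	toy_descs[irq].in_use = false;
	toy_descs[irq].notify = NULL;
}

/*
 * Model one suspend followed by a late cleanup attempt.
 * Returns the number of notifiers that were never released.
 */
int toy_suspend_leaked(bool release_notifier_first)
{
	int irq;

	toy_reset();
	irq = toy_alloc_irq();
	assert(irq >= 0);
	toy_add_notifier(irq);

	if (release_notifier_first)
		toy_remove_notifier(irq);	/* ordering the patch aims for */
	toy_free_irq(irq);			/* models clearing the interrupt scheme */

	toy_remove_notifier(irq);		/* late cleanup finds nothing */
	return toy_live_notifiers;
}
```

In this model, `toy_suspend_leaked(false)` leaves one notifier unreleased, while `toy_suspend_leaked(true)` leaves none, which mirrors the intent of releasing the rmap before the interrupt scheme is cleared.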