On 28.12.2020 13:50, Greg Kroah-Hartman wrote: > From: Stylon Wang <stylon.wang@amd.com> > > commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream. > > EDID parsing in S3 resume pushes new display modes > to probed_modes list but doesn't consolidate to actual > mode list. This creates a race condition when > amdgpu_dm_connector_ddc_get_modes() re-initializes the > list head without walking the list and results in memory leak. This commit is causing me problems on 5.10.4: when I turn off the display (a LG TV in this case), and turn it back on again later there is no video output and I get the following in the kernel log: [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]] *ERROR* Restoring old state failed with -12 I've found another report on this commit as well: https://bugzilla.kernel.org/show_bug.cgi?id=211033 And I suspect this is the same: https://bugs.archlinux.org/task/69202 Reverting it from 5.10.4 makes things behave again. Have not tested 5.4.86 or 5.11-rc. I'm using a RX570 Polaris based card.
On Mon, Jan 04, 2021 at 08:04:08PM +0100, Andre Tomt wrote: > On 28.12.2020 13:50, Greg Kroah-Hartman wrote: > > From: Stylon Wang <stylon.wang@amd.com> > > > > commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream. > > > > EDID parsing in S3 resume pushes new display modes > > to probed_modes list but doesn't consolidate to actual > > mode list. This creates a race condition when > > amdgpu_dm_connector_ddc_get_modes() re-initializes the > > list head without walking the list and results in memory leak. > > This commit is causing me problems on 5.10.4: when I turn off the display (a > LG TV in this case), and turn it back on again later there is no video > output and I get the following in the kernel log: > > [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]] *ERROR* > Restoring old state failed with -12 Uh, it seems you've just saved me a ton of gray hair. I have the very same issue and I'm going to revert this patch now in order to check whether it makes any difference. Thanks! > > I've found another report on this commit as well: > https://bugzilla.kernel.org/show_bug.cgi?id=211033 > > And I suspect this is the same: > https://bugs.archlinux.org/task/69202 > > Reverting it from 5.10.4 makes things behave again. > > Have not tested 5.4.86 or 5.11-rc. > > I'm using a RX570 Polaris based card. -- Oleksandr Natalenko (post-factum)
On Mon, Jan 04, 2021 at 09:10:17PM +0100, Oleksandr Natalenko wrote: > On Mon, Jan 04, 2021 at 08:04:08PM +0100, Andre Tomt wrote: > > On 28.12.2020 13:50, Greg Kroah-Hartman wrote: > > > From: Stylon Wang <stylon.wang@amd.com> > > > > > > commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream. > > > > > > EDID parsing in S3 resume pushes new display modes > > > to probed_modes list but doesn't consolidate to actual > > > mode list. This creates a race condition when > > > amdgpu_dm_connector_ddc_get_modes() re-initializes the > > > list head without walking the list and results in memory leak. > > > > This commit is causing me problems on 5.10.4: when I turn off the display (a > > LG TV in this case), and turn it back on again later there is no video > > output and I get the following in the kernel log: > > > > [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]] *ERROR* > > Restoring old state failed with -12 > > Uh, it seems you've just saved me a ton of gray hair. I have the very > same issue and I'm going to revert this patch now in order to check > whether it makes any difference. Confirmed, reverting this patch makes my monitor light back after turning off/on. Also, during testing, I've noticed that with the stock v5.10.4 kernel once reboot sequence is initiated and xorg gets killed, the monitor also lights back and shows the console. My HW: 0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] (rev c7) > > Thanks! > > > > > I've found another report on this commit as well: > > https://bugzilla.kernel.org/show_bug.cgi?id=211033 > > > > And I suspect this is the same: > > https://bugs.archlinux.org/task/69202 > > > > Reverting it from 5.10.4 makes things behave again. > > > > Have not tested 5.4.86 or 5.11-rc. > > > > I'm using a RX570 Polaris based card. -- Oleksandr Natalenko (post-factum)
On Mon, Jan 04, 2021 at 08:04:08PM +0100, Andre Tomt wrote: > On 28.12.2020 13:50, Greg Kroah-Hartman wrote: > > From: Stylon Wang <stylon.wang@amd.com> > > > > commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream. > > > > EDID parsing in S3 resume pushes new display modes > > to probed_modes list but doesn't consolidate to actual > > mode list. This creates a race condition when > > amdgpu_dm_connector_ddc_get_modes() re-initializes the > > list head without walking the list and results in memory leak. > > This commit is causing me problems on 5.10.4: when I turn off the display (a > LG TV in this case), and turn it back on again later there is no video > output and I get the following in the kernel log: > > [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]] *ERROR* > Restoring old state failed with -12 > > I've found another report on this commit as well: > https://bugzilla.kernel.org/show_bug.cgi?id=211033 > > And I suspect this is the same: > https://bugs.archlinux.org/task/69202 > > Reverting it from 5.10.4 makes things behave again. > > Have not tested 5.4.86 or 5.11-rc. > > I'm using a RX570 Polaris based card. Can you test 5.11-rc to see if this issue is there as well? thanks, greg k-h
On 05.01.2021 07:54, Greg Kroah-Hartman wrote: > On Mon, Jan 04, 2021 at 08:04:08PM +0100, Andre Tomt wrote: >> On 28.12.2020 13:50, Greg Kroah-Hartman wrote: >>> From: Stylon Wang <stylon.wang@amd.com> >>> >>> commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream. >>> >>> EDID parsing in S3 resume pushes new display modes >>> to probed_modes list but doesn't consolidate to actual >>> mode list. This creates a race condition when >>> amdgpu_dm_connector_ddc_get_modes() re-initializes the >>> list head without walking the list and results in memory leak. >> >> This commit is causing me problems on 5.10.4: when I turn off the display (a >> LG TV in this case), and turn it back on again later there is no video >> output and I get the following in the kernel log: >> >> [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]] *ERROR* >> Restoring old state failed with -12 >> >> I've found another report on this commit as well: >> https://bugzilla.kernel.org/show_bug.cgi?id=211033 >> >> And I suspect this is the same: >> https://bugs.archlinux.org/task/69202 >> >> Reverting it from 5.10.4 makes things behave again. >> >> Have not tested 5.4.86 or 5.11-rc. >> >> I'm using a RX570 Polaris based card. > > Can you test 5.11-rc to see if this issue is there as well? Just did, and have the same issue on 5.11-rc2. Reverting it also solves the problem on 5.11-rc2, as it does on 5.10.4 FWIW one easy way to reproduce seems to be unplugging and re-plugging the HDMI.
[AMD Public Use] > -----Original Message----- > From: Andre Tomt <andre@tomt.net> > Sent: Tuesday, January 5, 2021 11:32 AM > To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: linux-kernel@vger.kernel.org; stable@vger.kernel.org; Wentland, Harry > <Harry.Wentland@amd.com>; Deucher, Alexander > <Alexander.Deucher@amd.com>; Kazlauskas, Nicholas > <Nicholas.Kazlauskas@amd.com>; Wang, Chao-kai (Stylon) > <Stylon.Wang@amd.com> > Subject: Re: [PATCH 5.10 637/717] drm/amd/display: Fix memory leaks in S3 > resume > > On 05.01.2021 07:54, Greg Kroah-Hartman wrote: > > On Mon, Jan 04, 2021 at 08:04:08PM +0100, Andre Tomt wrote: > >> On 28.12.2020 13:50, Greg Kroah-Hartman wrote: > >>> From: Stylon Wang <stylon.wang@amd.com> > >>> > >>> commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362 upstream. > >>> > >>> EDID parsing in S3 resume pushes new display modes to probed_modes > >>> list but doesn't consolidate to actual mode list. This creates a > >>> race condition when > >>> amdgpu_dm_connector_ddc_get_modes() re-initializes the list head > >>> without walking the list and results in memory leak. > >> > >> This commit is causing me problems on 5.10.4: when I turn off the > >> display (a LG TV in this case), and turn it back on again later there > >> is no video output and I get the following in the kernel log: > >> > >> [ 8245.259628] [drm:dm_restore_drm_connector_state [amdgpu]] > *ERROR* > >> Restoring old state failed with -12 > >> > >> I've found another report on this commit as well: > >> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbug > >> > zilla.kernel.org%2Fshow_bug.cgi%3Fid%3D211033&data=04%7C01%7Cal > ex > >> > ander.deucher%40amd.com%7Cad673e351bab4f6af94508d8b1977ed8%7C3d > d8961f > >> > e4884e608e11a82d994e183d%7C0%7C0%7C637454612140560971%7CUnknow > n%7CTWF > >> > pbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXV > CI6 > >> > Mn0%3D%7C3000&sdata=8Rsnbfh4P5GmFUlybb31mT7C0Ee4vDInxJ1gt > C3jrVI%3 > >> D&reserved=0 > >> > >> And I suspect this is the same: > >> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbug > >> > s.archlinux.org%2Ftask%2F69202&data=04%7C01%7Calexander.deuche > r%4 > >> > 0amd.com%7Cad673e351bab4f6af94508d8b1977ed8%7C3dd8961fe4884e608 > e11a82 > >> > d994e183d%7C0%7C0%7C637454612140560971%7CUnknown%7CTWFpbGZsb > 3d8eyJWIj > >> > oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3 > 000&a > >> > mp;sdata=1149tCcm3rcaj1MkVPbWqWhWFIPgkeBoYxo0oVv%2FzNI%3D&a > mp;reserve > >> d=0 > >> > >> Reverting it from 5.10.4 makes things behave again. > >> > >> Have not tested 5.4.86 or 5.11-rc. > >> > >> I'm using a RX570 Polaris based card. > > > > Can you test 5.11-rc to see if this issue is there as well? > > Just did, and have the same issue on 5.11-rc2. Reverting it also solves the > problem on 5.11-rc2, as it does on 5.10.4 > > FWIW one easy way to reproduce seems to be unplugging and re-plugging > the HDMI. We are looking into the root cause, but I'll send out the revert for now. Thanks, Alex
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2278,7 +2278,8 @@ void amdgpu_dm_update_connector_after_de drm_connector_update_edid_property(connector, aconnector->edid); - drm_add_edid_modes(connector, aconnector->edid); + aconnector->num_modes = drm_add_edid_modes(connector, aconnector->edid); + drm_connector_list_update(connector); if (aconnector->dc_link->aux_mode) drm_dp_cec_set_edid(&aconnector->dm_dp_aux.aux,