Message ID | 20210516230551.12469-1-afaerber@suse.de |
---|---|
Series | arm64: dts: rockchip: Initial Toybrick TB-RK1808M0 support |
On Mon, 17 May 2021 00:05:42 +0100,
Andreas Färber <afaerber@suse.de> wrote:
>
> Hello Heiko et al.,
>
> It seems linux-rockchip list only saw two RK1808 patches for ASoC in 2019.
> Following up on a SUSE Hackweek 20 project of mine, here's some patches that
> allow me to start booting into the TB-RK1808M0 mPCIe card's eMMC.
>
> Tested using its USB adapter, which allows to connect a serial cable and a
> USB storage device that I load kernel+dtb from. It has a reset button, and
> Ctrl+C allows to enter a U-Boot prompt (without EBBR/UEFI support though).
>
> Patches are based on the shipping toybrick.dtb file.
> http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
> compiling sources, but no source download or link is actually provided.
>
> I encountered a hang: earlycon revealed it being related to KVM and
> vGIC. Disabling KVM in Kconfig works around it, as does removing
> the vGIC irq in DT. I've already tried low and high for the vGIC
> interrupt, so no clue what might cause it. On an mPCIe card with 1
> GiB of RAM I figured KVM is not going to be a major use case, so if
> we find no other solution, we could just delete the interrupts
> property in its .dts, as demonstrated here.

I think you figured it out wrong, for a number of reasons:

- KVM hanging is usually a sign that you have described the platform
  the wrong way. Either you are stepping over reserved memory regions,
  or you have badly described the GIC itself.

- It could also be a bug in KVM, which will need to be fixed. If
  that's because the HW is broken, we need to be able to detect it.

- You cannot be prescriptive of what a user is going to run. People
  have been running KVM on systems with less memory than that.

So no, we don't paper over these issues. We work out what is going
wrong and we fix it.

Thanks,

        M.

--
Without deviation from the norm, progress is not possible.

Hi Marc,

On 17.05.21 11:02, Marc Zyngier wrote:
> On Mon, 17 May 2021 00:05:42 +0100,
> Andreas Färber <afaerber@suse.de> wrote:
>> Patches are based on the shipping toybrick.dtb file.
>> http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
>> compiling sources, but no source download or link is actually provided.
>>
>> I encountered a hang: earlycon revealed it being related to KVM and
>> vGIC. Disabling KVM in Kconfig works around it, as does removing
>> the vGIC irq in DT. I've already tried low and high for the vGIC
>> interrupt, so no clue what might cause it. On an mPCIe card with 1
>> GiB of RAM I figured KVM is not going to be a major use case, so if
>> we find no other solution, we could just delete the interrupts
>> property in its .dts, as demonstrated here.
>
> I think you figured it out wrong,

Did I? I identified that an issue resulting in no serial console was
dependent on CONFIG_KVM being enabled and specifically to the vGIC
interrupt being specified in my DT. That's all I said.

I never claimed KVM code was to blame, you should know me better by
now!

> for a number of reasons:
>
> - KVM hanging is usually a sign that you have described the platform
>   the wrong way. Either you are stepping over reserved memory regions,
>   or you have badly described the GIC itself.

This whole series is about a new DT hardware description, so yes, that
is the most likely source of the problem I'm observing. Without further
hints how to verify what may cause it, you're just stating the obvious.

The only /reserved-memory entries in the shipping DTB are drm-logo of
size 0 and ramoops - the latter I could try to test, but I'd assume that
to just be a software convention that for lack of oops should not affect
KVM here?

And why would reserved memory affect the vGIC but no other driver doing
allocations? Any way to narrow it down, does vGIC allocate specially?

Only other issue I'm seeing is Debian failing to mount partitions that I
checked I do have drivers built in for and ends up failing to provide an
emergency shell. In order to boot a clean openSUSE rootfs for comparison
I'd first need to figure out adding any USB host nodes and clocks.

>
> - It could also be a bug in KVM, which will need to be fixed. If
>   that's because the HW is broken, we need to be able to detect it.
>
> - You cannot be prescriptive of what a user is going to run. People
>   have been running KVM on systems with less memory than that.
>
> So no, we don't paper over these issues.

As you can see in patch 3, it does include the vGIC interrupt, so that
anyone with access to the TB-96AIoT or any EVB can test KVM and report
success or failure. Thus I don't see me as papering over something here.

However, patch 5 is needed to test this patchset on at least M0 - to
have serial and eMMC rootfs working - until a better fix is found.

> We work out what is going
> wrong and we fix it.

Thanks. You were specifically copied to advise on
how to figure out what might cause it, so that we/I can fix it properly.
:)

As I mentioned, I already tried changing the interrupt between high and
low (which was a likely bug source on Realtek RK1319 (where I'm still
waiting on them to confirm a ~year later...)).

I don't have a data source other than the downstream .dtb to check the
interrupt number - mainline PX30/RK3308/RK3328/RK3368/RK3399 do all use
9 and high consistently though, so I figured it's likely correct.

What I was wondering is whether the vGIC, similar to arch timer, might
need some initialization in the bootloader? (Note: No U-Boot sources
either at the link.)

Unfortunately I'm seeing a recurring pattern (cf. Realtek) that vendors
in their BSPs don't enable KVM and thus don't validate their hardware
description against KVM; their shipping 4.4 based kernel here does not
seem to have KVM enabled.

Or is it possible for vendors to actually have a Cortex-A35 without the
Armv8 Virtualization Extensions in silicon? If so, how could one verify?

Thanks,
Andreas

--
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer
HRB 36809 (AG Nürnberg)

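
For readers following the "9 and high" discussion of the vGIC interrupt, here is a minimal Python sketch of how a three-cell GIC interrupt specifier maps to an interrupt ID and trigger type. The function name `decode_gic_specifier` and the returned dictionary layout are illustrative only and do not come from the series. The cell meanings assumed here (0 = SPI, 1 = PPI, PPI n is INTID 16 + n, SPI n is INTID 32 + n, flag 4 = level high, flag 8 = level low) follow the upstream GIC device tree binding and `dt-bindings/interrupt-controller/irq.h`.

```python
# Sketch only: decode a 3-cell GIC interrupt specifier as written in a .dts,
# e.g. <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH> which is <1 9 4> numerically.
GIC_SPI = 0
GIC_PPI = 1

# Trigger flags from dt-bindings/interrupt-controller/irq.h (low 4 bits).
TRIGGERS = {
    1: "edge-rising",
    2: "edge-falling",
    4: "level-high",
    8: "level-low",
}


def decode_gic_specifier(cells):
    """Return kind, number, INTID and trigger for a (type, number, flags) tuple."""
    kind, number, flags = cells[:3]
    if kind == GIC_PPI:
        name, intid = "PPI", 16 + number   # PPIs occupy INTIDs 16..31
    elif kind == GIC_SPI:
        name, intid = "SPI", 32 + number   # SPIs start at INTID 32
    else:
        raise ValueError("unknown interrupt type cell: %r" % (kind,))
    trigger = TRIGGERS.get(flags & 0xF, "unknown")
    return {"kind": name, "number": number, "intid": intid, "trigger": trigger}


if __name__ == "__main__":
    # The maintenance interrupt as discussed in this thread: PPI 9, level high.
    print(decode_gic_specifier((GIC_PPI, 9, 4)))
```

With the values from this thread, PPI 9 corresponds to INTID 25, which is the bit position that matters when inspecting per-interrupt GIC registers.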
Andreas,

On Mon, 17 May 2021 13:22:27 +0100,
Andreas Färber <afaerber@suse.de> wrote:
>
> Hi Marc,
>
> On 17.05.21 11:02, Marc Zyngier wrote:
> > On Mon, 17 May 2021 00:05:42 +0100,
> > Andreas Färber <afaerber@suse.de> wrote:
> >> Patches are based on the shipping toybrick.dtb file.
> >> http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
> >> compiling sources, but no source download or link is actually provided.
> >>
> >> I encountered a hang: earlycon revealed it being related to KVM and
> >> vGIC. Disabling KVM in Kconfig works around it, as does removing
> >> the vGIC irq in DT. I've already tried low and high for the vGIC
> >> interrupt, so no clue what might cause it. On an mPCIe card with 1
> >> GiB of RAM I figured KVM is not going to be a major use case, so if
> >> we find no other solution, we could just delete the interrupts
> >> property in its .dts, as demonstrated here.
> >
> > I think you figured it out wrong,
>
> Did I? I identified that an issue resulting in no serial console was
> dependent on CONFIG_KVM being enabled and specifically to the vGIC
> interrupt being specified in my DT. That's all I said.

I guess we have a different way to approach these issues. Rather than
disabling a feature, I would have reached out to narrow the problem
down *before* posting a series.

> I never claimed KVM code was to blame, you should know me better by
> now!

Maybe it *is* to blame, and I'd really like to know.

> > for a number of reasons:
> >
> > - KVM hanging is usually a sign that you have described the platform
> >   the wrong way. Either you are stepping over reserved memory regions,
> >   or you have badly described the GIC itself.
>
> This whole series is about a new DT hardware description, so yes, that
> is the most likely source of the problem I'm observing. Without further
> hints how to verify what may cause it, you're just stating the obvious.
>
> The only /reserved-memory entries in the shipping DTB are drm-logo of
> size 0 and ramoops - the latter I could try to test, but I'd assume that
> to just be a software convention that for lack of oops should not affect
> KVM here?
>
> And why would reserved memory affect the vGIC but no other driver doing
> allocations? Any way to narrow it down, does vGIC allocate specially?

Not an existing reserved memory, but instead the lack of a reserved
memory description in the DT, on which KVM would happily step as part
of its own allocations. Having a working vGIC adds a substantial
amount of code paths and (surprise!) interrupt handling.

> Only other issue I'm seeing is Debian failing to mount partitions that I
> checked I do have drivers built in for and ends up failing to provide an
> emergency shell. In order to boot a clean openSUSE rootfs for comparison
> I'd first need to figure out adding any USB host nodes and clocks.
>
> >
> > - It could also be a bug in KVM, which will need to be fixed. If
> >   that's because the HW is broken, we need to be able to detect it.
> >
> > - You cannot be prescriptive of what a user is going to run. People
> >   have been running KVM on systems with less memory than that.
> >
> > So no, we don't paper over these issues.
>
> As you can see in patch 3, it does include the vGIC interrupt, so that
> anyone with access to the TB-96AIoT or any EVB can test KVM and report
> success or failure. Thus I don't see me as papering over something here.
>
> However, patch 5 is needed to test this patchset on at least M0 - to
> have serial and eMMC rootfs working - until a better fix is found.

And that's not papering over the problem? OK, nevermind. Not to
mention that the GIC node has some obvious mistakes which result from
copy-paste.

> > We work out what is going
> > wrong and we fix it.
>
> Thanks. You were specifically copied to advise on
> how to figure out what might cause it, so that we/I can fix it properly.
> :)
>
> As I mentioned, I already tried changing the interrupt between high and
> low (which was a likely bug source on Realtek RK1319 (where I'm still
> waiting on them to confirm a ~year later...)).

Which has no influence since the GIC-500 PPIs are not configurable in
SW, and the presence of this attribute in the DT is just for
documentation.

> I don't have a data source other than the downstream .dtb to check the
> interrupt number - mainline PX30/RK3308/RK3328/RK3368/RK3399 do all use
> 9 and high consistently though, so I figured it's likely correct.
>
> What I was wondering is whether the vGIC, similar to arch timer, might
> need some initialization in the bootloader? (Note: No U-Boot sources
> either at the link.)

As long as the PPIs are set as group-1NS, this is enough. You can find
out by dumping the redistributors' GICR_IGROUPR0 registers. Nothing
else is required for the GIC to behave.

> Unfortunately I'm seeing a recurring pattern (cf. Realtek) that vendors
> in their BSPs don't enable KVM and thus don't validate their hardware
> description against KVM; their shipping 4.4 based kernel here does not
> seem to have KVM enabled.
>
> Or is it possible for vendors to actually have a Cortex-A35 without the
> Armv8 Virtualization Extensions in silicon? If so, how could one verify?

There is no "Armv8 Virtualization Extensions". There is only EL2, and
you are already booting at that exception level, or KVM wouldn't even
try to initialise.

It would probably help if you posted a full dmesg as well as added
some basic tracing in the vgic init code so that we can figure out
*what* is going wrong, so that we can all stop making idle guesses.

        M.

--
Without deviation from the norm, progress is not possible.
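
To make the GICR_IGROUPR0 suggestion above concrete, here is a small Python sketch that interprets an already-dumped 32-bit register value. It only decodes bits; how the value is obtained (bootloader prompt, debugger, or temporary kernel tracing) is outside the sketch, and the helper names `ppi_is_group1` and `ppis_not_in_group1` are hypothetical. It assumes the architectural layout in which bit n of GICR_IGROUPR0 covers INTID n, PPIs are INTIDs 16 to 31, and a set bit means the interrupt is in Group 1.

```python
# Sketch only: decode a dumped GICR_IGROUPR0 value (one per redistributor).
PPI_BASE_INTID = 16   # PPI n is INTID 16 + n
NUM_PPIS = 16         # INTIDs 16..31


def ppi_is_group1(igroupr0, ppi):
    """True if the group bit for the given PPI number (0..15) is set."""
    if not 0 <= ppi < NUM_PPIS:
        raise ValueError("PPI number out of range: %r" % (ppi,))
    return bool((igroupr0 >> (PPI_BASE_INTID + ppi)) & 1)


def ppis_not_in_group1(igroupr0):
    """List the PPI numbers whose group bit is clear."""
    return [ppi for ppi in range(NUM_PPIS) if not ppi_is_group1(igroupr0, ppi)]


if __name__ == "__main__":
    # Made-up example value, not a dump from real hardware:
    # every PPI bit set except bit 25 (PPI 9).
    value = 0xFDFF0000
    print("PPI 9 in Group 1:", ppi_is_group1(value, 9))
    print("PPIs not in Group 1:", ppis_not_in_group1(value))
```

Running the check against the value read from each CPU's redistributor would show whether PPI 9, the interrupt discussed in this thread, is left out of Group 1 on any of them.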