Message ID | cover.1734392473.git.ashish.kalra@amd.com |
---|---|
Series | Move initializing SEV/SNP functionality to KVM |
On 12/17/2024 10:00 AM, Dionna Amalie Glaze wrote:
> On Mon, Dec 16, 2024 at 3:57 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
>>
>> From: Ashish Kalra <ashish.kalra@amd.com>
>
>> The on-demand SEV initialization support requires a fix in QEMU to
>> remove check for SEV initialization to be done prior to launching
>> SEV/SEV-ES VMs.
>> NOTE: With the above fix for QEMU, older QEMU versions will be broken
>> with respect to launching SEV/SEV-ES VMs with the newer kernel/KVM as
>> older QEMU versions require SEV initialization to be done before
>> launching SEV/SEV-ES VMs.
>>
>
> I don't think this is okay. I think you need to introduce a KVM
> capability to switch over to the new way of initializing SEV VMs and
> deprecate the old way so it doesn't need to be supported for any new
> additions to the interface.
>

But that means KVM will need to support both mechanisms of doing SEV
initialization - during KVM module load time and the deferred/lazy
(on-demand) SEV INIT during VM launch.

Thanks,
Ashish
On Tue, Dec 17, 2024, Ashish Kalra wrote:
> On 12/17/2024 10:00 AM, Dionna Amalie Glaze wrote:
> > On Mon, Dec 16, 2024 at 3:57 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> >>
> >> From: Ashish Kalra <ashish.kalra@amd.com>
> >
> >> The on-demand SEV initialization support requires a fix in QEMU to
> >> remove check for SEV initialization to be done prior to launching
> >> SEV/SEV-ES VMs.
> >> NOTE: With the above fix for QEMU, older QEMU versions will be broken
> >> with respect to launching SEV/SEV-ES VMs with the newer kernel/KVM as
> >> older QEMU versions require SEV initialization to be done before
> >> launching SEV/SEV-ES VMs.
> >>
> >
> > I don't think this is okay. I think you need to introduce a KVM
> > capability to switch over to the new way of initializing SEV VMs and
> > deprecate the old way so it doesn't need to be supported for any new
> > additions to the interface.
> >
>
> But that means KVM will need to support both mechanisms of doing SEV
> initialization - during KVM module load time and the deferred/lazy
> (on-demand) SEV INIT during VM launch.

What's the QEMU change? Dionna is right, we can't break userspace, but maybe
there's an alternative to supporting both models.
On 12/17/2024 3:37 PM, Sean Christopherson wrote:
> On Tue, Dec 17, 2024, Ashish Kalra wrote:
>> But that means KVM will need to support both mechanisms of doing SEV
>> initialization - during KVM module load time and the deferred/lazy
>> (on-demand) SEV INIT during VM launch.
>
> What's the QEMU change? Dionna is right, we can't break userspace, but maybe
> there's an alternative to supporting both models.

Here is the QEMU fix: (makes a SEV PLATFORM STATUS firmware call via PSP
driver ioctl to check if SEV is in INIT state)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 1a4eb1ada6..4fa8665395 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1503,15 +1503,6 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
         }
     }
 
-    if (sev_es_enabled() && !sev_snp_enabled()) {
-        if (!(status.flags & SEV_STATUS_FLAGS_CONFIG_ES)) {
-            error_setg(errp, "%s: guest policy requires SEV-ES, but "
-                         "host SEV-ES support unavailable",
-                         __func__);
-            return -1;
-        }
-    }
-
     trace_kvm_sev_init();
     switch (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common))) {
     case KVM_X86_DEFAULT_VM:
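To see why the removed check trips under lazy initialization, the logic can be modeled in isolation. This is a minimal sketch, not QEMU code: `model_status`, `model_platform_status()` and `model_qemu_es_check()` are hypothetical names, and the flag value 0x0100 is an assumption standing in for `SEV_STATUS_FLAGS_CONFIG_ES`. It also assumes, as the thread implies, that the firmware only reports the ES configuration flag once SEV INIT has run.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; stand-in for SEV_STATUS_FLAGS_CONFIG_ES. */
#define MODEL_FLAG_CONFIG_ES 0x0100u

/* Hypothetical, trimmed-down view of a PLATFORM_STATUS reply. */
struct model_status {
    bool     initialized; /* has SEV INIT been issued? */
    uint32_t flags;       /* configuration flags reported by firmware */
};

/* Model of the firmware reply: the ES flag appears only after INIT
 * (assumption taken from the discussion, not from a specification). */
static struct model_status model_platform_status(bool sev_init_done,
                                                 bool es_supported)
{
    struct model_status st = { .initialized = sev_init_done, .flags = 0 };

    if (sev_init_done && es_supported)
        st.flags |= MODEL_FLAG_CONFIG_ES;
    return st;
}

/* Same shape as the removed check: 0 on success, -1 on failure. */
static int model_qemu_es_check(bool guest_wants_es, bool guest_is_snp,
                               const struct model_status *st)
{
    if (guest_wants_es && !guest_is_snp) {
        if (!(st->flags & MODEL_FLAG_CONFIG_ES))
            return -1; /* "host SEV-ES support unavailable" */
    }
    return 0;
}
```

With `model_platform_status(false, true)` (INIT deferred on an ES-capable host) the check returns -1 for an SEV-ES guest, which is the failure mode an unmodified QEMU would hit if the kernel stopped initializing SEV at module load.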
On Tue, Dec 17, 2024 at 05:16:01PM -0600, Kalra, Ashish wrote:
> Here is the QEMU fix : (makes a SEV PLATFORM STATUS firmware call via PSP driver ioctl
> to check if SEV is in INIT state)
>
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 1a4eb1ada6..4fa8665395 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -1503,15 +1503,6 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
>          }
>      }
>
> -    if (sev_es_enabled() && !sev_snp_enabled()) {
> -        if (!(status.flags & SEV_STATUS_FLAGS_CONFIG_ES)) {
> -            error_setg(errp, "%s: guest policy requires SEV-ES, but "
> -                         "host SEV-ES support unavailable",
> -                         __func__);
> -            return -1;
> -        }
> -    }

Sigh, that code exists in all versions of QEMU that shipped with SEV-ES
support. IOW the proposed kernel change is not limited to breaking
"older QEMU versions". Every QEMU for the last 3 years will break,
including the newest version released last week.

Please don't do that.

If the kvm-svm kmod supports both load time init and lazy init, then the
QEMU incompatibility still exists, and will likely get pushed on users
by the OS distro forcing use of the lazy-load option :-(

With regards,
Daniel
On Tue, Dec 17, 2024, Ashish Kalra wrote:
> Here is the QEMU fix : (makes a SEV PLATFORM STATUS firmware call via PSP
> driver ioctl to check if SEV is in INIT state)
>
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 1a4eb1ada6..4fa8665395 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -1503,15 +1503,6 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
>          }
>      }
>
> -    if (sev_es_enabled() && !sev_snp_enabled()) {
> -        if (!(status.flags & SEV_STATUS_FLAGS_CONFIG_ES)) {
> -            error_setg(errp, "%s: guest policy requires SEV-ES, but "
> -                         "host SEV-ES support unavailable",
> -                         __func__);
> -            return -1;
> -        }
> -    }

Aside from breaking userspace, removing a sanity check is not a "fix".

Can't we simply have the kernel do __sev_platform_init_locked() on-demand for
SEV_PLATFORM_STATUS? The goal with lazy initialization is to defer
initialization until it's necessary so that userspace can do firmware updates.
And it's quite clearly necessary in this case, so...
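The suggestion above, initializing on demand when userspace asks for the platform status, could look roughly like the following. This is a hedged sketch using a toy model, not the ccp driver: `model_psp`, `model_platform_init()` and `model_platform_status_ioctl()` are hypothetical names, and the only real symbol referenced is `__sev_platform_init_locked()`, which the model function merely stands in for.

```c
#include <stdbool.h>

/* Hypothetical model of the PSP driver's SEV state; not kernel code. */
struct model_psp {
    bool sev_initialized;
    int  init_calls;      /* how many times INIT was actually issued */
};

/* Stand-in for __sev_platform_init_locked(): a no-op if already done. */
static int model_platform_init(struct model_psp *psp)
{
    if (psp->sev_initialized)
        return 0;
    psp->sev_initialized = true;
    psp->init_calls++;
    return 0;
}

/* Status handler that initializes lazily, so a caller that inspects the
 * platform state always observes an initialized platform. */
static int model_platform_status_ioctl(struct model_psp *psp,
                                       bool *state_is_init)
{
    int ret = model_platform_init(psp);

    if (ret)
        return ret;
    *state_is_init = psp->sev_initialized;
    return 0;
}
```

Calling the handler twice issues INIT only once. The trade-off, raised in the reply that follows, is that userspace can no longer observe the raw uninitialized state through this path.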
On 12/18/2024 1:10 PM, Sean Christopherson wrote:
> On Tue, Dec 17, 2024, Ashish Kalra wrote:
>> -    if (sev_es_enabled() && !sev_snp_enabled()) {
>> -        if (!(status.flags & SEV_STATUS_FLAGS_CONFIG_ES)) {
>> -            error_setg(errp, "%s: guest policy requires SEV-ES, but "
>> -                         "host SEV-ES support unavailable",
>> -                         __func__);
>> -            return -1;
>> -        }
>> -    }
>
> Aside from breaking userspace, removing a sanity check is not a "fix".

Actually this sanity check is not really required. If SEV INIT is not done
before launching a SEV/SEV-ES VM, then LAUNCH_START will fail with an invalid
platform state error as below:

...
qemu-system-x86_64: sev_launch_start: LAUNCH_START ret=1 fw_error=1 'Platform state is invalid'
...

So we can safely remove this check without causing a SEV/SEV-ES VM to blow up
or something.

> Can't we simply have the kernel do __sev_platform_init_locked() on-demand for
> SEV_PLATFORM_STATUS? The goal with lazy initialization is defer initialization
> until it's necessary so that userspace can do firmware updates. And it's quite
> clearly necessary in this case, so...

I don't think we want to do that. We probably want to return the "raw" status
back to userspace: if SEV INIT has not been done we probably need to return
that status, otherwise it may break some other userspace tool.

Now, looking at this QEMU check, we will always have issues launching
SEV/SEV-ES VMs with SEV INIT on demand, as this check enforces SEV INIT to be
done before launching the VMs. This in turn causes issues with SEV firmware
hotloading, as the check enforces SEV INIT before launching VMs and once
SEV INIT is done we can't do firmware hotloading.

But I believe there is an alternative approach:

- The PSP driver can call SEV Shutdown right before calling DLFW_EX and then
  do a SEV INIT after a successful DLFW_EX; in other words, we wrap DLFW_EX
  with SEV_SHUTDOWN prior to it and SEV INIT post it. This approach will also
  allow us to do both SNP and SEV INIT at KVM module load time; there is no
  need to do SEV INIT lazily or on demand before SEV/SEV-ES VM launch.

This approach should work without any changes in QEMU and also allow
SEV firmware hotloading without any concerns about SEV INIT state.

Thanks,
Ashish
On 12/18/2024 7:11 PM, Kalra, Ashish wrote:
> But I believe there is an alternative approach:
>
> - The PSP driver can call SEV Shutdown right before calling DLFW_EX and then
>   do a SEV INIT after a successful DLFW_EX; in other words, we wrap DLFW_EX
>   with SEV_SHUTDOWN prior to it and SEV INIT post it. This approach will also
>   allow us to do both SNP and SEV INIT at KVM module load time; there is no
>   need to do SEV INIT lazily or on demand before SEV/SEV-ES VM launch.
>
> This approach should work without any changes in QEMU and also allow
> SEV firmware hotloading without any concerns about SEV INIT state.
>

And to add here that SEV Shutdown will succeed with active SEV and SNP guests.

SEV Shutdown (internally) marks all SEV ASIDs as invalid and decommissions all
SEV guests, and does not affect SNP guests.

So any active SEV guests will be implicitly shut down, and SNP guests will not
be affected, by the SEV Shutdown done right before SEV firmware hotloading
and the DLFW_EX command.

It should be fine to expect that there are no active SEV guests, or that any
active SEV guests will be shut down as part of SEV firmware hotloading while
keeping SNP guests running.

Thanks,
Ashish
On Thu, Dec 19, 2024 at 2:04 PM Kalra, Ashish <ashish.kalra@amd.com> wrote:
> And to add here that SEV Shutdown will succeed with active SEV and SNP guests.
>
> SEV Shutdown (internally) marks all SEV ASIDs as invalid and decommissions all
> SEV guests, and does not affect SNP guests.
>
> So any active SEV guests will be implicitly shut down, and SNP guests will not
> be affected, by the SEV Shutdown done right before SEV firmware hotloading
> and the DLFW_EX command.
>

Please don't implicitly shut down VMs. At least have a safe and unsafe
option for dlfw_ex where the default is to not destroy active
workloads. That's why the 2022 patch series for Intel SGX EUPDATESVN
on microcode hotload was shot down. It's very rude to destroy running
workloads because a system update was scheduled.

> It should be fine to expect that there are no active SEV guests, or that any
> active SEV guests will be shut down as part of SEV firmware hotloading while
> keeping SNP guests running.
>
> Thanks,
> Ashish
On Thu, Dec 19, 2024 at 04:04:45PM -0600, Kalra, Ashish wrote:
> And to add here that SEV Shutdown will succeed with active SEV and SNP guests.
>
> SEV Shutdown (internally) marks all SEV ASIDs as invalid and decommissions all
> SEV guests, and does not affect SNP guests.
>
> So any active SEV guests will be implicitly shut down, and SNP guests will not
> be affected, by the SEV Shutdown done right before SEV firmware hotloading
> and the DLFW_EX command.
>
> It should be fine to expect that there are no active SEV guests, or that any
> active SEV guests will be shut down as part of SEV firmware hotloading while
> keeping SNP guests running.

That's a pretty subtle distinction that I don't think host admins will
be likely to either learn about or remember. IMHO if there are active
SEV guests, the kernel should refuse to run the operation, rather
than kill running guests. The host admin must decide whether it is
appropriate to shut down the guests in order to be able to run the
upgrade.

With regards,
Daniel
On Fri, Dec 20, 2024, Daniel P. Berrangé wrote:
> On Thu, Dec 19, 2024 at 04:04:45PM -0600, Kalra, Ashish wrote:
> > On 12/18/2024 7:11 PM, Kalra, Ashish wrote:
> > > But I believe there is an alternative approach:
> > >
> > > - The PSP driver can call SEV Shutdown right before calling DLFW_EX and then
> > >   do a SEV INIT after a successful DLFW_EX; in other words, we wrap DLFW_EX
> > >   with SEV_SHUTDOWN prior to it and SEV INIT post it. This approach will also
> > >   allow us to do both SNP and SEV INIT at KVM module load time; there is no
> > >   need to do SEV INIT lazily or on demand before SEV/SEV-ES VM launch.
> > >
> > > This approach should work without any changes in QEMU and also allow
> > > SEV firmware hotloading without any concerns about SEV INIT state.
> > >
> >
> > And to add here that SEV Shutdown will succeed with active SEV and SNP guests.
> >
> > SEV Shutdown (internally) marks all SEV ASIDs as invalid and decommissions all
> > SEV guests, and does not affect SNP guests.
> >
> > So any active SEV guests will be implicitly shut down, and SNP guests will not
> > be affected, by the SEV Shutdown done right before SEV firmware hotloading
> > and the DLFW_EX command.
> >
> > It should be fine to expect that there are no active SEV guests, or that any
> > active SEV guests will be shut down as part of SEV firmware hotloading while
> > keeping SNP guests running.
>
> That's a pretty subtle distinction that I don't think host admins will
> be likely to either learn about or remember. IMHO if there are active
> SEV guests, the kernel should refuse to run the operation, rather
> than kill running guests. The host admin must decide whether it is
> appropriate to shut down the guests in order to be able to run the
> upgrade.

+1 to this and what Dionna said. Aside from being a horrible experience for
userspace, trying to forcefully stop actions from within the kernel gets ugly.
On 12/20/2024 10:25 AM, Sean Christopherson wrote:
> On Fri, Dec 20, 2024, Daniel P. Berrangé wrote:
>> That's a pretty subtle distinction that I don't think host admins will
>> be likely to either learn about or remember. IMHO if there are active
>> SEV guests, the kernel should refuse to run the operation, rather
>> than kill running guests. The host admin must decide whether it is
>> appropriate to shut down the guests in order to be able to run the
>> upgrade.
>
> +1 to this and what Dionna said. Aside from being a horrible experience for
> userspace, trying to forcefully stop actions from within the kernel gets ugly.

Ok, SEV firmware hotloading will refuse the operation if there are active
SEV/SEV-ES guests.

SNP firmware hotloading/DLFW_EX is anyway transparent to SNP guests.

If there are no active SEV/SEV-ES guests, the DLFW_EX path will do SEV
Shutdown before the command and SEV INIT after it, to satisfy the requirement
that SEV be in UNINIT state for DLFW_EX.

Both SNP and SEV INIT will be done at KVM module load time. There is no reason
to support lazy/on-demand SEV INIT when the first SEV VM is launched, and that
can't be supported anyway until QEMU is changed to remove the check that
SEV INIT is done before launching SEV/SEV-ES VMs.

Hopefully this is the final design for the SEV/SNP platform initialization
changes and SEV firmware hotloading support, which I can go ahead and
implement. If anyone has comments or concerns with the above, please let me
know.

Thanks,
Ashish
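The design summarized above can be sketched as a small state model. This is illustrative only and not the ccp driver: `model_psp` and `model_download_firmware_ex()` are hypothetical names, the guest counter is assumed to track SEV and SEV-ES guests only, and returning `-EBUSY` on refusal is an assumption (the thread only says the operation is refused).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of PSP driver state; not kernel code. */
struct model_psp {
    bool sev_initialized;
    int  active_sev_guests; /* SEV and SEV-ES guests; SNP guests not counted */
    int  fw_version;
};

/* Model of the hotload path: refuse if guests are running, otherwise
 * wrap the firmware download in SEV SHUTDOWN / SEV INIT. */
static int model_download_firmware_ex(struct model_psp *psp, int new_version)
{
    bool was_initialized;

    /* Never implicitly tear down running SEV/SEV-ES guests. */
    if (psp->active_sev_guests > 0)
        return -EBUSY; /* assumed error code */

    /* DLFW_EX requires SEV to be in UNINIT state. */
    was_initialized = psp->sev_initialized;
    psp->sev_initialized = false;    /* SEV SHUTDOWN */

    psp->fw_version = new_version;   /* DLFW_EX */

    if (was_initialized)
        psp->sev_initialized = true; /* SEV INIT */
    return 0;
}
```

With two active guests the call returns `-EBUSY` and leaves both the firmware version and the INIT state untouched; with none, the version is updated and SEV ends up initialized again, so an unmodified QEMU still sees the state it expects.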
From: Ashish Kalra <ashish.kalra@amd.com>

Remove initializing SEV/SNP functionality from PSP driver and instead add
support to KVM to explicitly initialize the PSP if KVM wants to use
SEV/SNP functionality.

This removes SEV/SNP initialization at PSP module probe time and does
on-demand SEV/SNP initialization when KVM really wants to use
SEV/SNP functionality. This will allow running legacy non-confidential
VMs without initializing SEV functionality.

This will assist in adding SNP CipherTextHiding support and SEV firmware
hotloading support in KVM without sharing SEV ASID management and SNP
guest context support between PSP driver and KVM and keeping all that
support only in KVM.

The on-demand SEV initialization support requires a fix in QEMU to
remove check for SEV initialization to be done prior to launching
SEV/SEV-ES VMs.
NOTE: With the above fix for QEMU, older QEMU versions will be broken
with respect to launching SEV/SEV-ES VMs with the newer kernel/KVM as
older QEMU versions require SEV initialization to be done before
launching SEV/SEV-ES VMs.

v2:
- Added support for separate SEV and SNP platform initialization: SNP
platform initialization is done at KVM module load time, while SEV
platform initialization is done on demand at SEV/SEV-ES VM launch.
- Added support for separate SEV and SNP platform shutdown: both
SEV and SNP shutdown are done at KVM module unload time, and SEV-only
shutdown is done when all SEV/SEV-ES VMs have been destroyed. This
allows SEV firmware hotloading support anytime during system lifetime.
- Updated commit messages for a couple of patches in the series with
reference to the feedback received on v1 patches.
Ashish Kalra (9):
  crypto: ccp: Move dev_info/err messages for SEV/SNP initialization
  crypto: ccp: Fix implicit SEV/SNP init and shutdown in ioctls
  crypto: ccp: Reset TMR size at SNP Shutdown
  crypto: ccp: Register SNP panic notifier only if SNP is enabled
  crypto: ccp: Add new SEV platform shutdown API
  crypto: ccp: Add new SEV/SNP platform shutdown API
  crypto: ccp: Add new SEV/SNP platform initialization API
  KVM: SVM: Add support to initialize SEV/SNP functionality in KVM
  crypto: ccp: Move SEV/SNP Platform initialization to KVM

 arch/x86/kvm/svm/sev.c       |  33 +++-
 drivers/crypto/ccp/sev-dev.c | 283 ++++++++++++++++++++++++-----------
 include/linux/psp-sev.h      |  27 +++-
 3 files changed, 248 insertions(+), 95 deletions(-)