Message ID | 20220128171804.569796-39-brijesh.singh@amd.com |
---|---|
State | New |
Headers | show |
Series | Add AMD Secure Nested Paging (SEV-SNP) Guest Support | expand |
On Fri, Jan 28, 2022 at 11:17:59AM -0600, Brijesh Singh wrote: > From: Michael Roth <michael.roth@amd.com> > > SEV-SNP guests will be provided the location of special 'secrets' and > 'CPUID' pages via the Confidential Computing blob. This blob is > provided to the run-time kernel either through a bootparams field that > was initialized by the boot/compressed kernel, or via a setup_data > structure as defined by the Linux Boot Protocol. > > Locate the Confidential Computing blob from these sources and, if found, > use the provided CPUID page/table address to create a copy that the > run-time kernel will use when servicing CPUID instructions via a #VC > handler. > > Signed-off-by: Michael Roth <michael.roth@amd.com> > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> > --- > arch/x86/boot/compressed/sev.c | 37 ----------------- > arch/x86/kernel/sev-shared.c | 37 +++++++++++++++++ > arch/x86/kernel/sev.c | 75 ++++++++++++++++++++++++++++++++++ > 3 files changed, 112 insertions(+), 37 deletions(-) > <snip> > diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c > index 5c72b21c7e7b..cb97200bfda7 100644 > --- a/arch/x86/kernel/sev.c > +++ b/arch/x86/kernel/sev.c > @@ -2034,6 +2034,8 @@ bool __init snp_init(struct boot_params *bp) > if (!cc_info) > return false; > > + snp_setup_cpuid_table(cc_info); > + > /* > * The CC blob will be used later to access the secrets page. Cache > * it here like the boot kernel does. > @@ -2047,3 +2049,76 @@ void __init snp_abort(void) > { > sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED); > } > + > +static void snp_dump_cpuid_table(void) > +{ > + const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr(); > + int i = 0; > + > + pr_info("count=%d reserved=0x%x reserved2=0x%llx\n", > + cpuid_info->count, cpuid_info->__reserved1, cpuid_info->__reserved2); > + > + for (i = 0; i < SNP_CPUID_COUNT_MAX; i++) { > + const struct snp_cpuid_fn *fn = &cpuid_info->fn[i]; > + > + pr_info("index=%03d fn=0x%08x subfn=0x%08x: eax=0x%08x ebx=0x%08x ecx=0x%08x edx=0x%08x xcr0_in=0x%016llx xss_in=0x%016llx reserved=0x%016llx\n", > + i, fn->eax_in, fn->ecx_in, fn->eax, fn->ebx, fn->ecx, > + fn->edx, fn->xcr0_in, fn->xss_in, fn->__reserved); > + } > +} > + > +/* > + * It is useful from an auditing/testing perspective to provide an easy way > + * for the guest owner to know that the CPUID table has been initialized as > + * expected, but that initialization happens too early in boot to print any > + * sort of indicator, and there's not really any other good place to do it. > + * So do it here. This is also a good place to flag unexpected usage of the > + * CPUID table that does not violate the SNP specification, but may still > + * be worth noting to the user. > + */ > +static int __init snp_check_cpuid_table(void) > +{ > + const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr(); > + bool dump_table = false; > + int i; > + > + if (!cpuid_info->count) > + return 0; > + > + pr_info("Using SEV-SNP CPUID table, %d entries present.\n", > + cpuid_info->count); > + > + if (cpuid_info->__reserved1 || cpuid_info->__reserved2) { > + pr_warn("Unexpected use of reserved fields in CPUID table header.\n"); > + dump_table = true; > + } > + > + for (i = 0; i < cpuid_info->count; i++) { > + const struct snp_cpuid_fn *fn = &cpuid_info->fn[i]; > + struct snp_cpuid_fn zero_fn = {0}; > + > + /* > + * Though allowed, an all-zero entry is not a valid leaf for linux > + * guests, and likely the result of an incorrect entry count. > + */ > + if (!memcmp(fn, &zero_fn, sizeof(struct snp_cpuid_fn))) { > + pr_warn("CPUID table contains all-zero entry at index=%d\n", i); > + dump_table = true; > + } > + > + if (fn->__reserved) { > + pr_warn("Unexpected use of reserved fields in CPUID table entry at index=%d\n", > + i); > + dump_table = true; > + } > + } > + > + if (dump_table) { > + pr_warn("Dumping CPUID table:\n"); > + snp_dump_cpuid_table(); > + } > + > + return 0; > +} > + > +arch_initcall(snp_check_cpuid_table); Just a quick note on this one. I know you'd recently suggested dropping this check since it was redundant: https://lore.kernel.org/all/YfGUWLmg82G+l4jU@zn.tnic/ I've dropped the redundant checks from in this version, but with the additional sanity checks you suggested adding, this function remained relevant for v9 so we can provide user-readable warnings for "weird" stuff in the table that doesn't necessarily violate the spec, but also doesn't seem to make much sense: https://lore.kernel.org/all/YefzQuqrV8kdLr9z@zn.tnic/ But I also agree with your later response here, where sanity-checking doesn't gain us much since ultimately what really matter is the resulting values we provide in response to CPUID instructions, which would be ultimately what gets used for guest owner's attestation flow: https://lore.kernel.org/all/YfR1GNb%2FyzKu4n5+@zn.tnic/ So I agree we should drop the sanity-check in this function since there's no cases covered here that would actually effect the end result of those CPUID instructions, nor do any of the checks here violate the SNP spec... ...except possibly the checks that enforce that the Reserved fields here are all 0, which we discussed here: https://lore.kernel.org/all/YeGhKll2fTcTr2wS@zn.tnic/ I followed up with our firmware team on this to confirm whether our intended handling and the potential backward-compatibility issues aligned with the intent of the SNP FW spec, and the guidance I got on that is that those Reserved/MBZ constraints are meant to apply to what a hypervisor places in the initial CPUID table input to SNP_LAUNCH_UPDATE (or what the guest places in the MSG_CPUID_REQ structure in the guest message version). They are not meant to cover what the guest should expect to read for these Reserved fields in the resulting CPUID table: the guest is supposed to ignore them completely and not enforce any value. I mentioned the concern you raised about out-of-spec hypervisors using non-zero for Reserved fields, and potentially breaking future guests that attempt to make use of them if they ever get re-purposed for some other functionality, but their take on that is that such a hypervisor is clearly out-of-spec, and would not be supported. Their preference would be to update the spec for clarity on this if we have any better suggested wording, rather than implementing anything on the guest side that might effect backward-compatibility in the future. Another possibility is enforcing 0 in the firmware itself. So given their guidance on the Reserved fields, and your guidance on not doing the other sanity-checks, my current plan to to drop snp_check_cpuid_table() completely for v10, but if you have other thoughts on that just let me know. Thanks! -Mike > -- > 2.25.1 >
On Sat, Feb 05, 2022 at 11:19:01AM -0600, Michael Roth wrote: > I mentioned the concern you raised about out-of-spec hypervisors > using non-zero for Reserved fields, and potentially breaking future > guests that attempt to make use of them if they ever get re-purposed > for some other functionality, but their take on that is that such a > hypervisor is clearly out-of-spec, and would not be supported. Yah, like stating that something should not be done in the spec is going to stop people from doing it anyway. You folks need to understand one thing: the smaller the attack surface, the better. And you need to *enforce* that - not state it in a spec. No one cares about the spec when it comes to poking holes in the architecture. And people do poke and will poke holes *especially* in this architecture, as its main goal is security. > Another possibility is enforcing 0 in the firmware itself. Yes, this is the thing to do. If they're going to be reused in the future, then guests can be changed to handle that. Like we do all the time anyway. > So given their guidance on the Reserved fields, and your guidance > on not doing the other sanity-checks, my current plan to to drop > snp_check_cpuid_table() completely for v10, but if you have other > thoughts on that just let me know. Yes, and pls fix the firmware to zero them out, because from reading your very detailed explanation - btw, thanks for taking the time to explain properly what's not in the ABI doc: https://lore.kernel.org/r/20220205154243.s33gwghqwbb4efyl@amd.com it sounds like those two input fields are not really needed. So the earlier you fix them in the firmware and deprecate them, the better. Thx!
On Fri, Jan 28, 2022 at 11:17:59AM -0600, Brijesh Singh wrote: > +/* > + * It is useful from an auditing/testing perspective to provide an easy way > + * for the guest owner to know that the CPUID table has been initialized as > + * expected, but that initialization happens too early in boot to print any > + * sort of indicator, and there's not really any other good place to do it. If you really wanna do it - and I think it might make sense to be able to dump that table for debugging or whatever purposes, you could define a cmdline parameter like sev=debug or so, which would enable checking and dumping of the cpuid table. For example...
On Sun, Feb 06, 2022 at 04:46:08PM +0100, Borislav Petkov wrote: > On Sat, Feb 05, 2022 at 11:19:01AM -0600, Michael Roth wrote: > > I mentioned the concern you raised about out-of-spec hypervisors > > using non-zero for Reserved fields, and potentially breaking future > > guests that attempt to make use of them if they ever get re-purposed > > for some other functionality, but their take on that is that such a > > hypervisor is clearly out-of-spec, and would not be supported. > > Yah, like stating that something should not be done in the spec is > going to stop people from doing it anyway. You folks need to understand > one thing: the smaller the attack surface, the better. And you need to > *enforce* that - not state it in a spec. No one cares about the spec > when it comes to poking holes in the architecture. And people do poke > and will poke holes *especially* in this architecture, as its main goal > is security. Agreed, but SNP never relies on the hypervisor to be the single authority on what security features are available, if these fields were ever re-purposed for anything in the future that would almost assuredly be accompanied by a new SEV status MSR bit (that can't be intercepted), a new policy bit (which affects measurement), a new "type2" CPUID page that would effect measurement (contents don't affect measurement in current firmware, but the metadata, like what type of page it is, is), etc. At that point any guest code changes to make use of those new fields could then fail any out-of-spec hypervisors trying to use those fields without the requisite firmware/hardware support. So please don't take my summary as an indication that the security relies on hypervisors abiding by the spec, this is more a statement that an out-of-spec hypervisor should not expect that their guests will continue working in future firmware versions, and what's being determined here is whether to break those out-of-spec hypervisor now, or later when/if we actually make use of the fields in the guest code, and whether we need to make clarifications in the spec to help implementations avoid such breakage. > > > Another possibility is enforcing 0 in the firmware itself. > > Yes, this is the thing to do. If they're going to be reused in the > future, then guests can be changed to handle that. Like we do all the > time anyway. Ok, I'll follow up with the firmware team on this. But just to be clear, what they're suggesting is that the firmware could enforce the MBZ checks on the CPUID page, so out-of-spec hypervisors will fail immediately, rather than in some future version of the spec/cpuid page, and guests can continue ignoring them in the meantime. I'll also note the type you spotted with the Table 13 reference and see if there's anything else that can be cleared up there. > > > So given their guidance on the Reserved fields, and your guidance > > on not doing the other sanity-checks, my current plan to to drop > > snp_check_cpuid_table() completely for v10, but if you have other > > thoughts on that just let me know. > > Yes, and pls fix the firmware to zero them out, because from reading > your very detailed explanation - btw, thanks for taking the time to > explain properly what's not in the ABI doc: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fr%2F20220205154243.s33gwghqwbb4efyl%40amd.com&data=04%7C01%7Cmichael.roth%40amd.com%7C377208b805be40f55c0f08d9e987d19e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637797591858315372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=L9McNfcn514uHmHLCtFPI2TqLKoCK0%2FKDMPk32lO8r8%3D&reserved=0 > > it sounds like those two input fields are not really needed. So the > earlier you fix them in the firmware and deprecate them, the better. XCR0_IN/XSS_IN aren't needed by a guest if it follows the recommended implementation and computes them on the fly, but they do still serve a purpose in the context of firmware validation of 0xD,0x0/0xD,0x1 leaves, since those are validated relative to a particular XCR0_IN/XSS_IN. Whether a guest chooses to use the resulting firmware-validated version of EBX for those, or compute it on it's own, is considered outside the scope of the SNP firmware ABI, so I think the plan is to leave those fields in the spec at least for current SNP version, and rely on the implementation recommendations to document anything outside of that. But since the recommendations need to be compatibile with the SNP firmware ABI, I've updated the recommendations to have the guest to go ahead and check for XCR0_IN={1,3}/XSS_IN=0 when searching for base 0xD,0x0/0xD,0x1 entries, so that there's no question of whether a guest is supposed to expect XCR0_IN/XSS_IN to be 0. Getting that document ready to send if a few. Hope that helps clear things up, but please let me know if there's anything else that needs clarification. Thanks! -Mike > > Thx! > > -- > Regards/Gruss, > Boris. > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeople.kernel.org%2Ftglx%2Fnotes-about-netiquette&data=04%7C01%7Cmichael.roth%40amd.com%7C377208b805be40f55c0f08d9e987d19e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637797591858315372%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=0uNI0Ojku3ZIgiQRbNf0UxNXruycXsOYIUNIY9I2IiY%3D&reserved=0
On Mon, Feb 07, 2022 at 11:00:18AM -0600, Michael Roth wrote: > this is more a statement that an out-of-spec hypervisor should not > expect that their guests will continue working in future firmware > versions, and what's being determined here is whether to break > those out-of-spec hypervisor now, or later when/if we actually > make use of the fields in the guest code, I think you're missing a very important aspect here called reality. Let's say that HV is a huge cloud vendor who means a lot of $ and a huge use case for SEV tech. And let's say that same HV is doing those incompatible things. Now imagine you break it with the spec change. But they already have a gazillion of deployments on real hw which they can't simply update just like that. Hell, cloud vendors are even trying to dictate how CPU vendors should do microcode updates on a live system, without rebooting, and we're talking about some wimpy fields in some table. Now imagine your business unit calls your engineering and says, you need to fix this because a very persuasive chunk of money. What you most likely will end up with is an ugly ugly workaround after a lot of managers screaming at each other and you won't even think about breaking that HV. Now imagine you've designed it the right and unambiguous way from the getgo. You wake up and realize, it was all just a bad dream... > Ok, I'll follow up with the firmware team on this. But just to be clear, > what they're suggesting is that the firmware could enforce the MBZ checks > on the CPUID page, so out-of-spec hypervisors will fail immediately, > rather than in some future version of the spec/cpuid page, and guests > can continue ignoring them in the meantime. Yes, exactly. Fail immediately. Thx.
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index cde5658e1007..21cdafac71e3 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -393,43 +393,6 @@ static struct cc_blob_sev_info *snp_find_cc_blob(struct boot_params *bp) return cc_info; } -/* - * Initialize the kernel's copy of the SEV-SNP CPUID table, and set up the - * pointer that will be used to access it. - * - * Maintaining a direct mapping of the SEV-SNP CPUID table used by firmware - * would be possible as an alternative, but the approach is brittle since the - * mapping needs to be updated in sync with all the changes to virtual memory - * layout and related mapping facilities throughout the boot process. - */ -static void snp_setup_cpuid_table(const struct cc_blob_sev_info *cc_info) -{ - const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info; - int i; - - if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID); - - cpuid_info_fw = (const struct snp_cpuid_info *)cc_info->cpuid_phys; - if (!cpuid_info_fw->count || cpuid_info_fw->count > SNP_CPUID_COUNT_MAX) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID); - - cpuid_info = snp_cpuid_info_get_ptr(); - memcpy((void *)cpuid_info, cpuid_info_fw, sizeof(*cpuid_info)); - - /* Initialize CPUID ranges for range-checking. */ - for (i = 0; i < cpuid_info->count; i++) { - const struct snp_cpuid_fn *fn = &cpuid_info->fn[i]; - - if (fn->eax_in == 0x0) - cpuid_std_range_max = fn->eax; - else if (fn->eax_in == 0x40000000) - cpuid_hyp_range_max = fn->eax; - else if (fn->eax_in == 0x80000000) - cpuid_ext_range_max = fn->eax; - } -} - bool snp_init(struct boot_params *bp) { struct cc_blob_sev_info *cc_info; diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c index 2bddbfea079b..795426ee6108 100644 --- a/arch/x86/kernel/sev-shared.c +++ b/arch/x86/kernel/sev-shared.c @@ -961,3 +961,40 @@ static struct cc_blob_sev_info *snp_find_cc_blob_setup_data(struct boot_params * return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address; } + +/* + * Initialize the kernel's copy of the SEV-SNP CPUID table, and set up the + * pointer that will be used to access it. + * + * Maintaining a direct mapping of the SEV-SNP CPUID table used by firmware + * would be possible as an alternative, but the approach is brittle since the + * mapping needs to be updated in sync with all the changes to virtual memory + * layout and related mapping facilities throughout the boot process. + */ +static void __init snp_setup_cpuid_table(const struct cc_blob_sev_info *cc_info) +{ + const struct snp_cpuid_info *cpuid_info_fw, *cpuid_info; + int i; + + if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE) + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID); + + cpuid_info_fw = (const struct snp_cpuid_info *)cc_info->cpuid_phys; + if (!cpuid_info_fw->count || cpuid_info_fw->count > SNP_CPUID_COUNT_MAX) + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID); + + cpuid_info = snp_cpuid_info_get_ptr(); + memcpy((void *)cpuid_info, cpuid_info_fw, sizeof(*cpuid_info)); + + /* Initialize CPUID ranges for range-checking. */ + for (i = 0; i < cpuid_info->count; i++) { + const struct snp_cpuid_fn *fn = &cpuid_info->fn[i]; + + if (fn->eax_in == 0x0) + cpuid_std_range_max = fn->eax; + else if (fn->eax_in == 0x40000000) + cpuid_hyp_range_max = fn->eax; + else if (fn->eax_in == 0x80000000) + cpuid_ext_range_max = fn->eax; + } +} diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 5c72b21c7e7b..cb97200bfda7 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -2034,6 +2034,8 @@ bool __init snp_init(struct boot_params *bp) if (!cc_info) return false; + snp_setup_cpuid_table(cc_info); + /* * The CC blob will be used later to access the secrets page. Cache * it here like the boot kernel does. @@ -2047,3 +2049,76 @@ void __init snp_abort(void) { sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED); } + +static void snp_dump_cpuid_table(void) +{ + const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr(); + int i = 0; + + pr_info("count=%d reserved=0x%x reserved2=0x%llx\n", + cpuid_info->count, cpuid_info->__reserved1, cpuid_info->__reserved2); + + for (i = 0; i < SNP_CPUID_COUNT_MAX; i++) { + const struct snp_cpuid_fn *fn = &cpuid_info->fn[i]; + + pr_info("index=%03d fn=0x%08x subfn=0x%08x: eax=0x%08x ebx=0x%08x ecx=0x%08x edx=0x%08x xcr0_in=0x%016llx xss_in=0x%016llx reserved=0x%016llx\n", + i, fn->eax_in, fn->ecx_in, fn->eax, fn->ebx, fn->ecx, + fn->edx, fn->xcr0_in, fn->xss_in, fn->__reserved); + } +} + +/* + * It is useful from an auditing/testing perspective to provide an easy way + * for the guest owner to know that the CPUID table has been initialized as + * expected, but that initialization happens too early in boot to print any + * sort of indicator, and there's not really any other good place to do it. + * So do it here. This is also a good place to flag unexpected usage of the + * CPUID table that does not violate the SNP specification, but may still + * be worth noting to the user. + */ +static int __init snp_check_cpuid_table(void) +{ + const struct snp_cpuid_info *cpuid_info = snp_cpuid_info_get_ptr(); + bool dump_table = false; + int i; + + if (!cpuid_info->count) + return 0; + + pr_info("Using SEV-SNP CPUID table, %d entries present.\n", + cpuid_info->count); + + if (cpuid_info->__reserved1 || cpuid_info->__reserved2) { + pr_warn("Unexpected use of reserved fields in CPUID table header.\n"); + dump_table = true; + } + + for (i = 0; i < cpuid_info->count; i++) { + const struct snp_cpuid_fn *fn = &cpuid_info->fn[i]; + struct snp_cpuid_fn zero_fn = {0}; + + /* + * Though allowed, an all-zero entry is not a valid leaf for linux + * guests, and likely the result of an incorrect entry count. + */ + if (!memcmp(fn, &zero_fn, sizeof(struct snp_cpuid_fn))) { + pr_warn("CPUID table contains all-zero entry at index=%d\n", i); + dump_table = true; + } + + if (fn->__reserved) { + pr_warn("Unexpected use of reserved fields in CPUID table entry at index=%d\n", + i); + dump_table = true; + } + } + + if (dump_table) { + pr_warn("Dumping CPUID table:\n"); + snp_dump_cpuid_table(); + } + + return 0; +} + +arch_initcall(snp_check_cpuid_table);