Message ID | 20210824094109.7272-1-shawnguo@kernel.org |
---|---|
State | New |
Headers | show |
Series | soc: qcom: mdt_loader: Drop PT_LOAD check on hash segment | expand |
Hi Shawn, On 8/24/21 11:41 AM, Shawn Guo wrote: > From: Shawn Guo <shawn.guo@linaro.org> > > It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has > the type of the second segment holding hashes just be PT_LOAD. So drop > the check on phdrs[1].p_type to get it go on that phone. > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org> > --- > drivers/soc/qcom/mdt_loader.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c > index eba7f76f9d61..6034cd8992b0 100644 > --- a/drivers/soc/qcom/mdt_loader.c > +++ b/drivers/soc/qcom/mdt_loader.c > @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len) > if (ehdr->e_phnum < 2) > return ERR_PTR(-EINVAL); > > - if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD) > + if (phdrs[0].p_type == PT_LOAD) > return ERR_PTR(-EINVAL); > > if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH) > Konrad (on the CC-list) originally came up with a similar patch to make his Sony phone boot (Xperia XZ, MSM8996). We however concluded that this solution is wrong, for the simple fact that the code below expects a PT_NULL header with data in the right place. Skipping this check most likely means that the code below will read out of bounds since the mdt file isn't large enough; this data is supposed to be read from an external file as indicated by the meaning of PT_LOAD. We built a solution for this, and fortunately CAF independently submitted a similar solution to the lists a while ago which is more thorough: https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/ It's been a while since I last compared mdt files with and without PT_LOAD, but this is the conclusion I remember coming to: The code that packs the hash from program header 1 tightly after the ELF header in program header 0 doesn't actually need to run, since our mdt binaries already contain the signature in that place; the bytes aren't used for anything else. Simply sending the contents of the entire file as-is (similar to our downstream kernels) worked just fine (of course files beyond the size of the mdt file still have to be appended from .bXX files, and I'm not sure why this isn't checking for PT_LOAD program-header type there). - Marijn
Looping Siddharth Gupta in this thread who sent the mdt_loader improvements referenced in my reply. Siddharth, it appears review comments have been handled but no v2 made it to the lists yet; are you still able to send this? On 8/24/21 5:18 PM, Marijn Suijten wrote: > [..] > > https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/ > > [..] - Marijn
Hi Marijn, Thanks much for the information! On Tue, Aug 24, 2021 at 05:18:05PM +0200, Marijn Suijten wrote: > Hi Shawn, > > On 8/24/21 11:41 AM, Shawn Guo wrote: > > From: Shawn Guo <shawn.guo@linaro.org> > > > > It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has > > the type of the second segment holding hashes just be PT_LOAD. So drop > > the check on phdrs[1].p_type to get it go on that phone. > > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org> > > --- > > drivers/soc/qcom/mdt_loader.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c > > index eba7f76f9d61..6034cd8992b0 100644 > > --- a/drivers/soc/qcom/mdt_loader.c > > +++ b/drivers/soc/qcom/mdt_loader.c > > @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len) > > if (ehdr->e_phnum < 2) > > return ERR_PTR(-EINVAL); > > - if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD) > > + if (phdrs[0].p_type == PT_LOAD) > > return ERR_PTR(-EINVAL); > > if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH) > > > > > Konrad (on the CC-list) originally came up with a similar patch to make his > Sony phone boot (Xperia XZ, MSM8996). We however concluded that this > solution is wrong, for the simple fact that the code below expects a PT_NULL > header with data in the right place. Skipping this check most likely means > that the code below will read out of bounds since the mdt file isn't large > enough; this data is supposed to be read from an external file as indicated > by the meaning of PT_LOAD. We built a solution for this, and fortunately > CAF independently submitted a similar solution to the lists a while ago > which is more thorough: > > https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/ While a thorough solution is good, I do not think my patch makes anything worse or breaks anything, while fixing my issue on Sony Xperia M4 Aqua. All the current assumptions hold true for my WCNSS firmware, only except that phdrs[1] gets a PT_LOAD type. There is a blog[1] illustrating the situation quite nicely. Again, the only thing my WCNSS firmware differs from the illustration is that hash segment gets a PT_LOAD type. Skipping the check will not cause problem for firmwares we have seen, because hash segment is duplicated in .mdt file, and we are using that duplication instead of reading from .b01 file. Shawn [1] http://bits-please.blogspot.com/2016/04/exploring-qualcomms-secure-execution.html
Hi Shawn, On 8/26/21 4:18 PM, Shawn Guo wrote: > Hi Marijn, > > Thanks much for the information! > > On Tue, Aug 24, 2021 at 05:18:05PM +0200, Marijn Suijten wrote: >> Hi Shawn, >> >> On 8/24/21 11:41 AM, Shawn Guo wrote: >>> From: Shawn Guo <shawn.guo@linaro.org> >>> >>> It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has >>> the type of the second segment holding hashes just be PT_LOAD. So drop >>> the check on phdrs[1].p_type to get it go on that phone. >>> >>> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> >>> --- >>> drivers/soc/qcom/mdt_loader.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c >>> index eba7f76f9d61..6034cd8992b0 100644 >>> --- a/drivers/soc/qcom/mdt_loader.c >>> +++ b/drivers/soc/qcom/mdt_loader.c >>> @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len) >>> if (ehdr->e_phnum < 2) >>> return ERR_PTR(-EINVAL); >>> - if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD) >>> + if (phdrs[0].p_type == PT_LOAD) >>> return ERR_PTR(-EINVAL); >>> if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH) >>> >> >> >> Konrad (on the CC-list) originally came up with a similar patch to make his >> Sony phone boot (Xperia XZ, MSM8996). We however concluded that this >> solution is wrong, for the simple fact that the code below expects a PT_NULL >> header with data in the right place. Skipping this check most likely means >> that the code below will read out of bounds since the mdt file isn't large >> enough; this data is supposed to be read from an external file as indicated >> by the meaning of PT_LOAD. We built a solution for this, and fortunately >> CAF independently submitted a similar solution to the lists a while ago >> which is more thorough: >> >> https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/ > > While a thorough solution is good, I do not think my patch makes > anything worse or breaks anything, while fixing my issue on Sony Xperia > M4 Aqua. All the current assumptions hold true for my WCNSS firmware, > only except that phdrs[1] gets a PT_LOAD type. It's not up to me to choose between a thorough or quick solution, but this patch seems to open up the possibility for an out-of-bounds read. The assertion was placed in this function for a reason after all. > There is a blog[1] illustrating the situation quite nicely. Again, the > only thing my WCNSS firmware differs from the illustration is that > hash segment gets a PT_LOAD type. Yes, that blog post nicely explains the format but it doesn't cover that the hash is encoded in the second program header nor that it can be copied to be packed tightly against the ELF header? Maybe that only applies to newer formats? > Skipping the check will not cause problem for firmwares we have seen, > because hash segment is duplicated in .mdt file, and we are using that > duplication instead of reading from .b01 file. Can you share some more details about the firmware file you're using: size and the program headers as shown by `readelf -l`? If possible, can you also validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags & QCOM_MDT_TYPE_MASK (using something like GNU poke)? All the files I'm able to test adhere to `/* Is the header and hash already packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely consists of a tightly packed ELF header and hash signature) but I won't be surprised if there are firmware files out in the wild with different headers or the hash missing, otherwise the else clause wouldn't exist. This else clause causes a read starting at fw->data + phdrs[1].p_offset of phdrs[1].p_filesz bytes which is almost certainly out of bounds if the program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't need to be loaded from a .bXX file in the first place. For this quick solution to be valid I propose to reject PT_LOAD in the else clause, and otherwise validate that phdrs[1].p_offset + hash_size <= fw->size. The aforementioned patch series in qcom_mdt_bins_are_split (and certain firmware files from Sony phones) show that it is also possible to read PT_NULL headers from external files, as long as the header references a range that is out of bounds of the mdt file. > Shawn > > [1] http://bits-please.blogspot.com/2016/04/exploring-qualcomms-secure-execution.html > - Marijn
On Thu, Aug 26, 2021 at 10:52:21PM +0200, Marijn Suijten wrote: > Hi Shawn, > > On 8/26/21 4:18 PM, Shawn Guo wrote: > > Hi Marijn, > > > > Thanks much for the information! > > > > On Tue, Aug 24, 2021 at 05:18:05PM +0200, Marijn Suijten wrote: > > > Hi Shawn, > > > > > > On 8/24/21 11:41 AM, Shawn Guo wrote: > > > > From: Shawn Guo <shawn.guo@linaro.org> > > > > > > > > It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has > > > > the type of the second segment holding hashes just be PT_LOAD. So drop > > > > the check on phdrs[1].p_type to get it go on that phone. > > > > > > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org> > > > > --- > > > > drivers/soc/qcom/mdt_loader.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c > > > > index eba7f76f9d61..6034cd8992b0 100644 > > > > --- a/drivers/soc/qcom/mdt_loader.c > > > > +++ b/drivers/soc/qcom/mdt_loader.c > > > > @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len) > > > > if (ehdr->e_phnum < 2) > > > > return ERR_PTR(-EINVAL); > > > > - if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD) > > > > + if (phdrs[0].p_type == PT_LOAD) > > > > return ERR_PTR(-EINVAL); > > > > if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH) > > > > > > > > > > > > > Konrad (on the CC-list) originally came up with a similar patch to make his > > > Sony phone boot (Xperia XZ, MSM8996). We however concluded that this > > > solution is wrong, for the simple fact that the code below expects a PT_NULL > > > header with data in the right place. Skipping this check most likely means > > > that the code below will read out of bounds since the mdt file isn't large > > > enough; this data is supposed to be read from an external file as indicated > > > by the meaning of PT_LOAD. We built a solution for this, and fortunately > > > CAF independently submitted a similar solution to the lists a while ago > > > which is more thorough: > > > > > > https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/ > > > > While a thorough solution is good, I do not think my patch makes > > anything worse or breaks anything, while fixing my issue on Sony Xperia > > M4 Aqua. All the current assumptions hold true for my WCNSS firmware, > > only except that phdrs[1] gets a PT_LOAD type. > > > It's not up to me to choose between a thorough or quick solution, To be clear, Siddharth's series doesn't resolve my problem, as the assumption that hash segment type cannot be PT_LOAD is still there. I have to add the following change on top to fix my problem. @@ -126,8 +126,7 @@ void *qcom_mdt_read_metadata(struct device *dev, const struct firmware *fw, cons return ERR_PTR(-EINVAL); for (hash_index = 1; hash_index < ehdr->e_phnum; hash_index++) { - if (phdrs[hash_index].p_type != PT_LOAD && - (phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH) + if ((phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH) break; } if (hash_index >= ehdr->e_phnum) That said, my patch is merely to break the assumption that hash segment type cannot be PT_LOAD, and it's really orthogonal to Siddharth's series. > but this > patch seems to open up the possibility for an out-of-bounds read. The > assertion was placed in this function for a reason after all. I would much appreciate it if someone helps to elaborate the reason. > > > There is a blog[1] illustrating the situation quite nicely. Again, the > > only thing my WCNSS firmware differs from the illustration is that > > hash segment gets a PT_LOAD type. > > > Yes, that blog post nicely explains the format but it doesn't cover that the > hash is encoded in the second program header nor that it can be copied to be > packed tightly against the ELF header? Maybe that only applies to newer > formats? Hmm, it does cover the things. It's been illustrated that .mdt is simply a concatenating of .b00 and .b01. The .b00 includes ELF header and program headers, while .b01 is just hash segment. $ cat wcnss.b00 wcnss.b01 > wcnss.b00_b01 $ cmp wcnss.b00_b01 wcnss.mdt $ That said, .mdt = ELF header + program headers + hash > > > Skipping the check will not cause problem for firmwares we have seen, > > because hash segment is duplicated in .mdt file, and we are using that > > duplication instead of reading from .b01 file. > > > Can you share some more details about the firmware file you're using: size > and the program headers as shown by `readelf -l`? $ readelf -l wcnss.mdt Elf file type is EXEC (Executable file) Entry point 0x8b6018d4 There are 12 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align NULL 0x000000 0x00000000 0x00000000 0x001b4 0x00000 0 LOAD 0x001000 0x8bbe0000 0x8bbe0000 0x00c58 0x02000 0x1000 LOAD 0x003000 0x8b600000 0x8b600000 0x03308 0x03438 RWE 0x100000 LOAD 0x00afcc 0x8b604000 0x8b604000 0x00000 0x08000 RW 0x4000 LOAD 0x00afcc 0x8b60c000 0x8b60c000 0x0f000 0x0f000 RW 0x4 LOAD 0x019fcc 0x8b61b000 0x8b61b000 0x00000 0x0e000 RW 0x4 LOAD 0x019fcc 0x8b629000 0x8b629000 0x32eb04 0x4c2b10 RWE 0x80 LOAD 0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x37cf8 RW 0x8 LOAD 0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x21c44 RW 0x4 LOAD 0x348ad0 0x8bb30000 0x8bb30000 0x00034 0x001a8 RW 0x8 LOAD 0x349fcc 0x8bb31000 0x8bb31000 0xa0000 0xa0000 RW 0x1000 LOAD 0x3e9fcc 0x8bbd1000 0x8bbd1000 0x0ac3c 0x0ee00 RWE 0x1000 > If possible, can you also > validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags & > QCOM_MDT_TYPE_MASK (using something like GNU poke)? It is set, otherwise the QCOM_MDT_TYPE_HASH check in qcom_mdt_read_metadata() will just fail. To be clear, everything works fine for me, as long as I drop the check of (phdrs[1].p_type == PT_LOAD). > > All the files I'm able to test adhere to `/* Is the header and hash already > packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely > consists of a tightly packed ELF header and hash signature) Yeah, my wcnss firmware is same here. Actually, all split firmwares I have seen are this case. But bear it in mind there are non-split firmware like a single .mbn file. My understanding is that condition (ehdr_size + hash_size == fw->size) is being used to check whether it's a split firmware or non-split one. > but I won't be > surprised if there are firmware files out in the wild with different headers > or the hash missing, otherwise the else clause wouldn't exist. > This else clause causes a read starting at fw->data + phdrs[1].p_offset of > phdrs[1].p_filesz bytes which is almost certainly out of bounds if the > program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't > need to be loaded from a .bXX file in the first place. So the else clause is there to handle non-split firmware, which has everything within fw->size. > > For this quick solution to be valid I propose to reject PT_LOAD in the else > clause, and otherwise validate that phdrs[1].p_offset + hash_size <= > fw->size. I'm not sure what you propose here are required for non-split firmware. > The aforementioned patch series in qcom_mdt_bins_are_split (and certain > firmware files from Sony phones) show that it is also possible to read > PT_NULL headers from external files, as long as the header references a > range that is out of bounds of the mdt file. It's been shown that hashes can be read from .mdt or hash segment, and independently the hash segment type could be PT_NULL or PT_LOAD. With Siddharth's patch, instead of using the hash duplication in .mdt, hash will be read from hash segment. But again, to resolve my problem, the assertion that hash segment type cannot be PT_LOAD has to be dropped. And that's the only thing my patch is doing. Shawn
Hi Shawn, On 8/27/21 8:24 AM, Shawn Guo wrote: >> [..] >> It's not up to me to choose between a thorough or quick solution, > > To be clear, Siddharth's series doesn't resolve my problem, as the > assumption that hash segment type cannot be PT_LOAD is still there. > I have to add the following change on top to fix my problem. > > @@ -126,8 +126,7 @@ void *qcom_mdt_read_metadata(struct device *dev, const struct firmware *fw, cons > return ERR_PTR(-EINVAL); > > for (hash_index = 1; hash_index < ehdr->e_phnum; hash_index++) { > - if (phdrs[hash_index].p_type != PT_LOAD && > - (phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH) > + if ((phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH) > break; > } > if (hash_index >= ehdr->e_phnum) Patch 3/3 allows the hash segment to be read from an external file and should indeed get rid of this comparison, as external file loading is possible with PT_NULL and required with PT_LOAD. I'd go as far as moving the check into the if, next to qcom_mdt_bins_are_split: if (phdrs[hash_index].p_type == PT_LOAD || qcom_mdt_bins_are_split(fw)) Unfortunately it seems this patchset lost optimization for the packed `ehdr_size + hash_size == fw->size` case, where the hash segment is already available in the loaded mbt. > That said, my patch is merely to break the assumption that hash segment > type cannot be PT_LOAD, and it's really orthogonal to Siddharth's > series. It is fine - correct even - to break that assumption, but it should go with extra validation that we are dealing with packed mdt instead. >> but this >> patch seems to open up the possibility for an out-of-bounds read. The >> assertion was placed in this function for a reason after all. > > I would much appreciate it if someone helps to elaborate the reason. PT_LOAD specifies that the segment is to be loaded externally. The fact that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is it also a given for the future? Can we rely on this assumption to never change? >>> There is a blog[1] illustrating the situation quite nicely. Again, the >>> only thing my WCNSS firmware differs from the illustration is that >>> hash segment gets a PT_LOAD type. >> >> >> Yes, that blog post nicely explains the format but it doesn't cover that the >> hash is encoded in the second program header nor that it can be copied to be >> packed tightly against the ELF header? Maybe that only applies to newer >> formats? > > Hmm, it does cover the things. It's been illustrated that .mdt is > simply a concatenating of .b00 and .b01. The .b00 includes ELF header > and program headers, while .b01 is just hash segment. > > $ cat wcnss.b00 wcnss.b01 > wcnss.b00_b01 > $ cmp wcnss.b00_b01 wcnss.mdt > $ > > That said, .mdt = ELF header + program headers + hash Ack, there's one picture halfway demonstrating the `ehdr_size + hash_size == fw->size` case. >>> Skipping the check will not cause problem for firmwares we have seen, >>> because hash segment is duplicated in .mdt file, and we are using that >>> duplication instead of reading from .b01 file. >> >> >> Can you share some more details about the firmware file you're using: size >> and the program headers as shown by `readelf -l`? > > $ readelf -l wcnss.mdt > > Elf file type is EXEC (Executable file) > Entry point 0x8b6018d4 > There are 12 program headers, starting at offset 52 > > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > NULL 0x000000 0x00000000 0x00000000 0x001b4 0x00000 0 > LOAD 0x001000 0x8bbe0000 0x8bbe0000 0x00c58 0x02000 0x1000 > LOAD 0x003000 0x8b600000 0x8b600000 0x03308 0x03438 RWE 0x100000 > LOAD 0x00afcc 0x8b604000 0x8b604000 0x00000 0x08000 RW 0x4000 > LOAD 0x00afcc 0x8b60c000 0x8b60c000 0x0f000 0x0f000 RW 0x4 > LOAD 0x019fcc 0x8b61b000 0x8b61b000 0x00000 0x0e000 RW 0x4 > LOAD 0x019fcc 0x8b629000 0x8b629000 0x32eb04 0x4c2b10 RWE 0x80 > LOAD 0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x37cf8 RW 0x8 > LOAD 0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x21c44 RW 0x4 > LOAD 0x348ad0 0x8bb30000 0x8bb30000 0x00034 0x001a8 RW 0x8 > LOAD 0x349fcc 0x8bb31000 0x8bb31000 0xa0000 0xa0000 RW 0x1000 > LOAD 0x3e9fcc 0x8bbd1000 0x8bbd1000 0x0ac3c 0x0ee00 RWE 0x1000 > >> If possible, can you also >> validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags & >> QCOM_MDT_TYPE_MASK (using something like GNU poke)? > > It is set, otherwise the QCOM_MDT_TYPE_HASH check in > qcom_mdt_read_metadata() will just fail. To be clear, everything works > fine for me, as long as I drop the check of (phdrs[1].p_type == PT_LOAD). Thanks, this all checks out and is similar to what I'm seeing here. It all comes down to the mdt packing b00 and b01 tightly together already. >> All the files I'm able to test adhere to `/* Is the header and hash already >> packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely >> consists of a tightly packed ELF header and hash signature) > > Yeah, my wcnss firmware is same here. Actually, all split firmwares I > have seen are this case. But bear it in mind there are non-split > firmware like a single .mbn file. My understanding is that condition > (ehdr_size + hash_size == fw->size) is being used to check whether it's > a split firmware or non-split one. Is it unreasonable to assume that more configurations besides split and non-split firmware might occur in the future (or are already out in the wild)? These program headers allow for a variety of configurations which we should - in my opinion - parse and handle in a generic manner. `ehdr_size + hash_size == fw->size` is convenient to have, but not something we should rely on. >> but I won't be >> surprised if there are firmware files out in the wild with different headers >> or the hash missing, otherwise the else clause wouldn't exist. >> This else clause causes a read starting at fw->data + phdrs[1].p_offset of >> phdrs[1].p_filesz bytes which is almost certainly out of bounds if the >> program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't >> need to be loaded from a .bXX file in the first place. > > So the else clause is there to handle non-split firmware, which has > everything within fw->size. Is it too much to ask for extra validation here, instead of always assuming either "split" or "non-split" firmware? As mentioned before these program headers allow for a variety of configurations, which is confirmed by Siddharth's first patch looking for QCOM_MDT_TYPE_HASH in all headers instead of assuming it to reside in the second. >> >> For this quick solution to be valid I propose to reject PT_LOAD in the else >> clause, and otherwise validate that phdrs[1].p_offset + hash_size <= >> fw->size. > > I'm not sure what you propose here are required for non-split firmware. Would it help if I sent a patch based on yours, with this extra validation and comments + commit message explaining the cases? Alternatively we can try re-spinning the patches from Siddharth while taking review comments, the possible .mdt == .b00 + . b01 packing, and my suggestion to handle the headers generically into account. >> The aforementioned patch series in qcom_mdt_bins_are_split (and certain >> firmware files from Sony phones) show that it is also possible to read >> PT_NULL headers from external files, as long as the header references a >> range that is out of bounds of the mdt file. > > It's been shown that hashes can be read from .mdt or hash segment, and > independently the hash segment type could be PT_NULL or PT_LOAD. With > Siddharth's patch, instead of using the hash duplication in .mdt, hash > will be read from hash segment. But again, to resolve my problem, the > assertion that hash segment type cannot be PT_LOAD has to be dropped. > And that's the only thing my patch is doing. Correct. If I were to respin Siddharth's patchset, I'd write qcom_mdt_read_metadata as follows: 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE and PT_LOAD); 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and the mdt file can be used as-is for metadata; 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but `p_offset + p_filesz` does not fit inside the .mdt, this is (a variant of) split-firmware, and the hash needs to be loaded from an external file and appended to the ELF header. 4. If none of the above, this is a non-split firmware file. Concatenate the ELF and hash headers by reading them directly from the firmware. - Marijn
On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote: > Hi Shawn, > > On 8/27/21 8:24 AM, Shawn Guo wrote: > > > [..] > > > It's not up to me to choose between a thorough or quick solution, > > > > To be clear, Siddharth's series doesn't resolve my problem, as the > > assumption that hash segment type cannot be PT_LOAD is still there. > > I have to add the following change on top to fix my problem. > > > > @@ -126,8 +126,7 @@ void *qcom_mdt_read_metadata(struct device *dev, const struct firmware *fw, cons > > return ERR_PTR(-EINVAL); > > for (hash_index = 1; hash_index < ehdr->e_phnum; hash_index++) { > > - if (phdrs[hash_index].p_type != PT_LOAD && > > - (phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH) > > + if ((phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH) > > break; > > } > > if (hash_index >= ehdr->e_phnum) > > > Patch 3/3 allows the hash segment to be read from an external file and > should indeed get rid of this comparison, as external file loading is > possible with PT_NULL and required with PT_LOAD. I'd go as far as moving > the check into the if, next to qcom_mdt_bins_are_split: > > if (phdrs[hash_index].p_type == PT_LOAD || qcom_mdt_bins_are_split(fw)) > > Unfortunately it seems this patchset lost optimization for the packed > `ehdr_size + hash_size == fw->size` case, where the hash segment is already > available in the loaded mbt. > > > That said, my patch is merely to break the assumption that hash segment > > type cannot be PT_LOAD, and it's really orthogonal to Siddharth's > > series. > > > It is fine - correct even - to break that assumption, but it should go with > extra validation that we are dealing with packed mdt instead. > > > > but this > > > patch seems to open up the possibility for an out-of-bounds read. The > > > assertion was placed in this function for a reason after all. > > > > I would much appreciate it if someone helps to elaborate the reason. > > > PT_LOAD specifies that the segment is to be loaded externally. The fact > that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is > it also a given for the future? Can we rely on this assumption to never > change? My patch is trying to fix an existing issue, not anything for the future. > > > > > There is a blog[1] illustrating the situation quite nicely. Again, the > > > > only thing my WCNSS firmware differs from the illustration is that > > > > hash segment gets a PT_LOAD type. > > > > > > > > > Yes, that blog post nicely explains the format but it doesn't cover that the > > > hash is encoded in the second program header nor that it can be copied to be > > > packed tightly against the ELF header? Maybe that only applies to newer > > > formats? > > > > Hmm, it does cover the things. It's been illustrated that .mdt is > > simply a concatenating of .b00 and .b01. The .b00 includes ELF header > > and program headers, while .b01 is just hash segment. > > > > $ cat wcnss.b00 wcnss.b01 > wcnss.b00_b01 > > $ cmp wcnss.b00_b01 wcnss.mdt > > $ > > > > That said, .mdt = ELF header + program headers + hash > > > Ack, there's one picture halfway demonstrating the `ehdr_size + hash_size == > fw->size` case. > > > > > Skipping the check will not cause problem for firmwares we have seen, > > > > because hash segment is duplicated in .mdt file, and we are using that > > > > duplication instead of reading from .b01 file. > > > > > > > > > Can you share some more details about the firmware file you're using: size > > > and the program headers as shown by `readelf -l`? > > > > $ readelf -l wcnss.mdt > > > > Elf file type is EXEC (Executable file) > > Entry point 0x8b6018d4 > > There are 12 program headers, starting at offset 52 > > > > Program Headers: > > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > > NULL 0x000000 0x00000000 0x00000000 0x001b4 0x00000 0 > > LOAD 0x001000 0x8bbe0000 0x8bbe0000 0x00c58 0x02000 0x1000 > > LOAD 0x003000 0x8b600000 0x8b600000 0x03308 0x03438 RWE 0x100000 > > LOAD 0x00afcc 0x8b604000 0x8b604000 0x00000 0x08000 RW 0x4000 > > LOAD 0x00afcc 0x8b60c000 0x8b60c000 0x0f000 0x0f000 RW 0x4 > > LOAD 0x019fcc 0x8b61b000 0x8b61b000 0x00000 0x0e000 RW 0x4 > > LOAD 0x019fcc 0x8b629000 0x8b629000 0x32eb04 0x4c2b10 RWE 0x80 > > LOAD 0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x37cf8 RW 0x8 > > LOAD 0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x21c44 RW 0x4 > > LOAD 0x348ad0 0x8bb30000 0x8bb30000 0x00034 0x001a8 RW 0x8 > > LOAD 0x349fcc 0x8bb31000 0x8bb31000 0xa0000 0xa0000 RW 0x1000 > > LOAD 0x3e9fcc 0x8bbd1000 0x8bbd1000 0x0ac3c 0x0ee00 RWE 0x1000 > > > > > If possible, can you also > > > validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags & > > > QCOM_MDT_TYPE_MASK (using something like GNU poke)? > > > > It is set, otherwise the QCOM_MDT_TYPE_HASH check in > > qcom_mdt_read_metadata() will just fail. To be clear, everything works > > fine for me, as long as I drop the check of (phdrs[1].p_type == PT_LOAD). > > > Thanks, this all checks out and is similar to what I'm seeing here. It all > comes down to the mdt packing b00 and b01 tightly together already. > > > > All the files I'm able to test adhere to `/* Is the header and hash already > > > packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely > > > consists of a tightly packed ELF header and hash signature) > > > > Yeah, my wcnss firmware is same here. Actually, all split firmwares I > > have seen are this case. But bear it in mind there are non-split > > firmware like a single .mbn file. My understanding is that condition > > (ehdr_size + hash_size == fw->size) is being used to check whether it's > > a split firmware or non-split one. > > > Is it unreasonable to assume that more configurations besides split and > non-split firmware might occur in the future (or are already out in the > wild)? These program headers allow for a variety of configurations which we > should - in my opinion - parse and handle in a generic manner. `ehdr_size + > hash_size == fw->size` is convenient to have, but not something we should > rely on. > > > > but I won't be > > > surprised if there are firmware files out in the wild with different headers > > > or the hash missing, otherwise the else clause wouldn't exist. > > > This else clause causes a read starting at fw->data + phdrs[1].p_offset of > > > phdrs[1].p_filesz bytes which is almost certainly out of bounds if the > > > program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't > > > need to be loaded from a .bXX file in the first place. > > > > So the else clause is there to handle non-split firmware, which has > > everything within fw->size. > > > Is it too much to ask for extra validation here, instead of always assuming > either "split" or "non-split" firmware? As mentioned before these program > headers allow for a variety of configurations, which is confirmed by > Siddharth's first patch looking for QCOM_MDT_TYPE_HASH in all headers > instead of assuming it to reside in the second. > > > > > > > For this quick solution to be valid I propose to reject PT_LOAD in the else > > > clause, and otherwise validate that phdrs[1].p_offset + hash_size <= > > > fw->size. > > > > I'm not sure what you propose here are required for non-split firmware. > > > Would it help if I sent a patch based on yours, with this extra validation > and comments + commit message explaining the cases? > > Alternatively we can try re-spinning the patches from Siddharth while taking > review comments, the possible .mdt == .b00 + . b01 packing, and my > suggestion to handle the headers generically into account. I would suggest this path. It's been 8 months, so Siddharth probably lost the interest here. It's reasonable that someone picks up the work. > > > > The aforementioned patch series in qcom_mdt_bins_are_split (and certain > > > firmware files from Sony phones) show that it is also possible to read > > > PT_NULL headers from external files, as long as the header references a > > > range that is out of bounds of the mdt file. > > > > It's been shown that hashes can be read from .mdt or hash segment, and > > independently the hash segment type could be PT_NULL or PT_LOAD. With > > Siddharth's patch, instead of using the hash duplication in .mdt, hash > > will be read from hash segment. But again, to resolve my problem, the > > assertion that hash segment type cannot be PT_LOAD has to be dropped. > > And that's the only thing my patch is doing. > > Correct. If I were to respin Siddharth's patchset, I'd write > qcom_mdt_read_metadata as follows: > > 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE > and PT_LOAD); > 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and > the mdt file can be used as-is for metadata; > 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but > `p_offset + p_filesz` does not fit inside the .mdt, this is (a > variant of) split-firmware, and the hash needs to be loaded from an > external file and appended to the ELF header. > 4. If none of the above, this is a non-split firmware file. Concatenate > the ELF and hash headers by reading them directly from the firmware. Looks good to me. I will be happy to review patches if you pick up the work. Shawn
Hi Shawn, On 8/27/21 11:57 AM, Shawn Guo wrote: >> [..] >> PT_LOAD specifies that the segment is to be loaded externally. The fact >> that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is >> it also a given for the future? Can we rely on this assumption to never >> change? > > My patch is trying to fix an existing issue, not anything for the > future. We both agree that the PT_LOAD assertion here is too strict, but removing it altogether makes the function too lenient and allows for possible bugs. To solve your issue in the simple case I have already suggested to add an extra bounds check. >> [..] >> Alternatively we can try re-spinning the patches from Siddharth while taking >> review comments, the possible .mdt == .b00 + . b01 packing, and my >> suggestion to handle the headers generically into account. > > I would suggest this path. It's been 8 months, so Siddharth probably lost > the interest here. It's reasonable that someone picks up the work. > >> >>>> The aforementioned patch series in qcom_mdt_bins_are_split (and certain >>>> firmware files from Sony phones) show that it is also possible to read >>>> PT_NULL headers from external files, as long as the header references a >>>> range that is out of bounds of the mdt file. >>> >>> It's been shown that hashes can be read from .mdt or hash segment, and >>> independently the hash segment type could be PT_NULL or PT_LOAD. With >>> Siddharth's patch, instead of using the hash duplication in .mdt, hash >>> will be read from hash segment. But again, to resolve my problem, the >>> assertion that hash segment type cannot be PT_LOAD has to be dropped. >>> And that's the only thing my patch is doing. >> >> Correct. If I were to respin Siddharth's patchset, I'd write >> qcom_mdt_read_metadata as follows: >> >> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE >> and PT_LOAD); >> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and >> the mdt file can be used as-is for metadata; >> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but >> `p_offset + p_filesz` does not fit inside the .mdt, this is (a >> variant of) split-firmware, and the hash needs to be loaded from an >> external file and appended to the ELF header. >> 4. If none of the above, this is a non-split firmware file. Concatenate >> the ELF and hash headers by reading them directly from the firmware. > > Looks good to me. I will be happy to review patches if you pick up the > work. I'll try to get this going over the weekend or next week. I don't think I can salvage anything from Siddharth's patchset and will likely start from scratch as I'm not confident "just detecting split firmware" is enough, but favour generic handling of the program headers as described above instead (for both the hash and firmware loading itself). Will keep you posted! - Marijn
On Fri, Aug 27, 2021 at 12:46:47PM +0200, Marijn Suijten wrote: > Hi Shawn, > > On 8/27/21 11:57 AM, Shawn Guo wrote: > > > [..] > > > PT_LOAD specifies that the segment is to be loaded externally. The fact > > > that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is > > > it also a given for the future? Can we rely on this assumption to never > > > change? > > > > My patch is trying to fix an existing issue, not anything for the > > future. > > > We both agree that the PT_LOAD assertion here is too strict, but removing it > altogether makes the function too lenient and allows for possible bugs. To > solve your issue in the simple case I have already suggested to add an extra > bounds check. So you proposed to reject PT_LOAD in the else clause, which right now handles .mbn case, are you sure hash segment in .mbn is not going to be PT_LOAD? Shawn
Hi Shawn, On 8/27/21 4:12 PM, Shawn Guo wrote: > [..] > > So you proposed to reject PT_LOAD in the else clause, which right now > handles .mbn case Yes, I propose to reject PT_LOAD in the else-case, and additionally reject cases where p_offset+p_filesz > sw->size since PT_NULL can also cause external file loads (meaning split-firmware). This is what Siddharth's patchset - or my respin of it - is going to implement. > are you sure hash segment in .mbn is not going to be > PT_LOAD? PT_LOAD unambiguously indicates a program header that ought to be loaded from an external file. Any mbn file (non-split firmware) without split files that set PT_LOAD are misusing this program header type field. I have no way to validate whether such mbns are in circulation. Of note, I have never referenced the definition of the program header types yet. Please see [1]: PT_LOAD (1) Indicates that this program header describes a segment to be loaded from the file. Let me know if you're planning to send a v2 of this patch with aforementioned improvements, then I'll adjust my plans to respin Siddharth's patchset accordingly. - Marijn [1]: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_23.html
On Fri 27 Aug 04:57 CDT 2021, Shawn Guo wrote: > On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote: > > On 8/27/21 8:24 AM, Shawn Guo wrote: [..] > > > > > > > > For this quick solution to be valid I propose to reject PT_LOAD in the else > > > > clause, and otherwise validate that phdrs[1].p_offset + hash_size <= > > > > fw->size. > > > > > > I'm not sure what you propose here are required for non-split firmware. > > > > > > Would it help if I sent a patch based on yours, with this extra validation > > and comments + commit message explaining the cases? > > > > Alternatively we can try re-spinning the patches from Siddharth while taking > > review comments, the possible .mdt == .b00 + . b01 packing, and my > > suggestion to handle the headers generically into account. > > I would suggest this path. It's been 8 months, so Siddharth probably lost > the interest here. It's reasonable that someone picks up the work. > Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split images") we'd just blindly pass the entire .mdt into the SCM call. So reading through your discussion and looking at the problem is that I assumed (based on the firmware files I looked at) that the hash segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1 are not PT_LOAD and define the part that should be passed to init_image, and that this may or may not be packed. And the problem we have here is that we have the packed case, but the hash segment isn't described explicitly in the ELF header. > > > > > > The aforementioned patch series in qcom_mdt_bins_are_split (and certain > > > > firmware files from Sony phones) show that it is also possible to read > > > > PT_NULL headers from external files, as long as the header references a > > > > range that is out of bounds of the mdt file. > > > > > > It's been shown that hashes can be read from .mdt or hash segment, and > > > independently the hash segment type could be PT_NULL or PT_LOAD. With > > > Siddharth's patch, instead of using the hash duplication in .mdt, hash > > > will be read from hash segment. But again, to resolve my problem, the > > > assertion that hash segment type cannot be PT_LOAD has to be dropped. > > > And that's the only thing my patch is doing. > > > > Correct. If I were to respin Siddharth's patchset, I'd write > > qcom_mdt_read_metadata as follows: > > > > 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE > > and PT_LOAD); > > 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and > > the mdt file can be used as-is for metadata; > > 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but > > `p_offset + p_filesz` does not fit inside the .mdt, this is (a > > variant of) split-firmware, and the hash needs to be loaded from an > > external file and appended to the ELF header. I would expect that PT_LOAD denotes that the segment should be loaded into the final firmware region and that the hash segment would be PT_NULL regardless of being part of the .mdt, single .mbn or a separate .bNN segment. > > 4. If none of the above, this is a non-split firmware file. Concatenate > > the ELF and hash headers by reading them directly from the firmware. > I'm happy with this plan and I think Sid will be as well. I hope that we can introduce a change that fixes Shawn's reported problem first, so that can be backported to stable, and then add Sid's need for loading the additional .bNN after that (same series is fine). > Looks good to me. I will be happy to review patches if you pick up the > work. > Thanks to the both of you for looking into this! Regards, Bjorn
Hi Bjorn, On 8/27/21 6:07 PM, Bjorn Andersson wrote: > On Fri 27 Aug 04:57 CDT 2021, Shawn Guo wrote: > >> On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote: >>> On 8/27/21 8:24 AM, Shawn Guo wrote: > [..] >>>>> >>>>> For this quick solution to be valid I propose to reject PT_LOAD in the else >>>>> clause, and otherwise validate that phdrs[1].p_offset + hash_size <= >>>>> fw->size. >>>> >>>> I'm not sure what you propose here are required for non-split firmware. >>> >>> >>> Would it help if I sent a patch based on yours, with this extra validation >>> and comments + commit message explaining the cases? >>> >>> Alternatively we can try re-spinning the patches from Siddharth while taking >>> review comments, the possible .mdt == .b00 + . b01 packing, and my >>> suggestion to handle the headers generically into account. >> >> I would suggest this path. It's been 8 months, so Siddharth probably lost >> the interest here. It's reasonable that someone picks up the work. >> > > Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split > images") we'd just blindly pass the entire .mdt into the SCM call. > > So reading through your discussion and looking at the problem is that > I assumed (based on the firmware files I looked at) that the hash > segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1 > are not PT_LOAD and define the part that should be passed to init_image, > and that this may or may not be packed. > > And the problem we have here is that we have the packed case, but the > hash segment isn't described explicitly in the ELF header. The hash segment is still explicitly described by the presence of QCOM_MDT_TYPE_HASH, and the external file (.b01 when the program header is in slot 1) contains the hash signature. And as Shawn demonstrated, concatenating the ELF header in .b00 and this hash in .b01 results in the .mdt file. This is the case we can (and already) optimize for, by passing the entire .mdt as-is into SCM if such packing is detected. >>>>> The aforementioned patch series in qcom_mdt_bins_are_split (and certain >>>>> firmware files from Sony phones) show that it is also possible to read >>>>> PT_NULL headers from external files, as long as the header references a >>>>> range that is out of bounds of the mdt file. >>>> >>>> It's been shown that hashes can be read from .mdt or hash segment, and >>>> independently the hash segment type could be PT_NULL or PT_LOAD. With >>>> Siddharth's patch, instead of using the hash duplication in .mdt, hash >>>> will be read from hash segment. But again, to resolve my problem, the >>>> assertion that hash segment type cannot be PT_LOAD has to be dropped. >>>> And that's the only thing my patch is doing. >>> >>> Correct. If I were to respin Siddharth's patchset, I'd write >>> qcom_mdt_read_metadata as follows: >>> >>> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE >>> and PT_LOAD); >>> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and >>> the mdt file can be used as-is for metadata; >>> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but >>> `p_offset + p_filesz` does not fit inside the .mdt, this is (a >>> variant of) split-firmware, and the hash needs to be loaded from an >>> external file and appended to the ELF header. > > I would expect that PT_LOAD denotes that the segment should be loaded > into the final firmware region and that the hash segment would be > PT_NULL regardless of being part of the .mdt, single .mbn or a separate > .bNN segment. I have two mdt files here that load the hash externally, one with PT_LOAD and one with PT_NULL annotation (QCOM_MDT_TYPE_HASH is set). Both have their p_offset and/or p_offset+p_filesz well out of bounds of the .mdt file, but the .mdt file is of the packed variant (.b00 + .b01 == .mdt). This packing seems to be an optimization, making it possible to send the .mdt contents as-is over SCM without having to load and concatenate the header from a separate file. The image produced by __qcom_mdt_load should still be reconstructed based on the layout specified in the program headers, of course. >>> 4. If none of the above, this is a non-split firmware file. Concatenate >>> the ELF and hash headers by reading them directly from the firmware. >> > > I'm happy with this plan and I think Sid will be as well. > > I hope that we can introduce a change that fixes Shawn's reported > problem first, so that can be backported to stable, and then add Sid's > need for loading the additional .bNN after that (same series is fine). I'm totally for fixing Shawn's reported issue first, before diving into extending mdt_loader as per Sid's patches. In hindsight I'm not sure if I should be the person writing that until coming across a platform/firmware that needs this setup (hash in a different header, hash not included in the packed .mdt file). Note that backporting needs a Fixes: tag, probably on 498b98e93900. Shawn, do you want to send your reworded patch with the necessary safeguards as v2, or should I send it? - Marijn
On Fri 27 Aug 10:40 PDT 2021, Marijn Suijten wrote: > Hi Bjorn, > > On 8/27/21 6:07 PM, Bjorn Andersson wrote: > > On Fri 27 Aug 04:57 CDT 2021, Shawn Guo wrote: > > > > > On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote: > > > > On 8/27/21 8:24 AM, Shawn Guo wrote: > > [..] > > > > > > > > > > > > For this quick solution to be valid I propose to reject PT_LOAD in the else > > > > > > clause, and otherwise validate that phdrs[1].p_offset + hash_size <= > > > > > > fw->size. > > > > > > > > > > I'm not sure what you propose here are required for non-split firmware. > > > > > > > > > > > > Would it help if I sent a patch based on yours, with this extra validation > > > > and comments + commit message explaining the cases? > > > > > > > > Alternatively we can try re-spinning the patches from Siddharth while taking > > > > review comments, the possible .mdt == .b00 + . b01 packing, and my > > > > suggestion to handle the headers generically into account. > > > > > > I would suggest this path. It's been 8 months, so Siddharth probably lost > > > the interest here. It's reasonable that someone picks up the work. > > > > > > > Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split > > images") we'd just blindly pass the entire .mdt into the SCM call. > > > > So reading through your discussion and looking at the problem is that > > I assumed (based on the firmware files I looked at) that the hash > > segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1 > > are not PT_LOAD and define the part that should be passed to init_image, > > and that this may or may not be packed. > > > > And the problem we have here is that we have the packed case, but the > > hash segment isn't described explicitly in the ELF header. > > > The hash segment is still explicitly described by the presence of > QCOM_MDT_TYPE_HASH, and the external file (.b01 when the program header is > in slot 1) contains the hash signature. But in Shawn's firmware the hashes follows the elf header in the .mdt and both are described in a single program header. Are you saying that this entry is of type QCOM_MDT_TYPE_HASH? > And as Shawn demonstrated, > concatenating the ELF header in .b00 and this hash in .b01 results in the > .mdt file. This is the case we can (and already) optimize for, by passing > the entire .mdt as-is into SCM if such packing is detected. > In the other case, where the hash is described by a separate program header, we need to ensure that it's concatenated directly following the ELF header. There are three cases here: 1) It's already directly following the ELF header and we should send ehdr_size + hash_size bytes to PAS. 2) It's in the loaded file, but it's not tightly packed (this is what we see in the .mbn). In this case we need to pack up the two pieces before we send them to PAS. 3) It's not part of the loaded .mdt, in which case we need to read it from whichever program header that happens to contain it (might not be .b01). In the case of #1 we should not do #3, because I've seen several cases where the signature in .b01 does not match the one present in the .mdt. I do expect that passing the metadata of ELF+hash+b01 will still work, as it would ignore the tailing garbage. Replacing ELF+hash with ELF+b01 is different from what we've been doing all these years, so that will require retesting on all possible platforms... So I think what we're looking at it the three possible operations: 1) init_image(ELF+hash) 2) init_image(pack(ELF+hash)) 3) init_image(pack(ELF+load(b01))) Let's see if Sid agrees. > > > > > > The aforementioned patch series in qcom_mdt_bins_are_split (and certain > > > > > > firmware files from Sony phones) show that it is also possible to read > > > > > > PT_NULL headers from external files, as long as the header references a > > > > > > range that is out of bounds of the mdt file. > > > > > > > > > > It's been shown that hashes can be read from .mdt or hash segment, and > > > > > independently the hash segment type could be PT_NULL or PT_LOAD. With > > > > > Siddharth's patch, instead of using the hash duplication in .mdt, hash > > > > > will be read from hash segment. But again, to resolve my problem, the > > > > > assertion that hash segment type cannot be PT_LOAD has to be dropped. > > > > > And that's the only thing my patch is doing. > > > > > > > > Correct. If I were to respin Siddharth's patchset, I'd write > > > > qcom_mdt_read_metadata as follows: > > > > > > > > 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE > > > > and PT_LOAD); > > > > 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and > > > > the mdt file can be used as-is for metadata; > > > > 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but > > > > `p_offset + p_filesz` does not fit inside the .mdt, this is (a > > > > variant of) split-firmware, and the hash needs to be loaded from an > > > > external file and appended to the ELF header. > > > > I would expect that PT_LOAD denotes that the segment should be loaded > > into the final firmware region and that the hash segment would be > > PT_NULL regardless of being part of the .mdt, single .mbn or a separate > > .bNN segment. > > > I have two mdt files here that load the hash externally, one with PT_LOAD > and one with PT_NULL annotation (QCOM_MDT_TYPE_HASH is set). Both have their > p_offset and/or p_offset+p_filesz well out of bounds of the .mdt file, but > the .mdt file is of the packed variant (.b00 + .b01 == .mdt). > Ahh, because mdt_phdr_valid() will discard them anyways as we look for PT_LOAD segments... > This packing seems to be an optimization, making it possible to send the > .mdt contents as-is over SCM without having to load and concatenate the > header from a separate file. > Yes, that's probably the case. I was surprised to see the need for repacking the two segments in the .mbn case. I can only assume that Windows does this... Regards, Bjorn > The image produced by __qcom_mdt_load should still be reconstructed based on > the layout specified in the program headers, of course. > > > > > 4. If none of the above, this is a non-split firmware file. Concatenate > > > > the ELF and hash headers by reading them directly from the firmware. > > > > > > > I'm happy with this plan and I think Sid will be as well. > > > > I hope that we can introduce a change that fixes Shawn's reported > > problem first, so that can be backported to stable, and then add Sid's > > need for loading the additional .bNN after that (same series is fine). > > > I'm totally for fixing Shawn's reported issue first, before diving into > extending mdt_loader as per Sid's patches. In hindsight I'm not sure if I > should be the person writing that until coming across a platform/firmware > that needs this setup (hash in a different header, hash not included in the > packed .mdt file). > > Note that backporting needs a Fixes: tag, probably on 498b98e93900. > > Shawn, do you want to send your reworded patch with the necessary safeguards > as v2, or should I send it? > > - Marijn
Hi Bjorn, On 8/27/21 11:25 PM, Bjorn Andersson wrote: > On Fri 27 Aug 10:40 PDT 2021, Marijn Suijten wrote: > [..] >> On 8/27/21 6:07 PM, Bjorn Andersson wrote: >>> [..] >>> Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split >>> images") we'd just blindly pass the entire .mdt into the SCM call. >>> >>> So reading through your discussion and looking at the problem is that >>> I assumed (based on the firmware files I looked at) that the hash >>> segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1 >>> are not PT_LOAD and define the part that should be passed to init_image, >>> and that this may or may not be packed. >>> >>> And the problem we have here is that we have the packed case, but the >>> hash segment isn't described explicitly in the ELF header. >> >> >> The hash segment is still explicitly described by the presence of >> QCOM_MDT_TYPE_HASH, and the external file (.b01 when the program header is >> in slot 1) contains the hash signature. > > But in Shawn's firmware the hashes follows the elf header in the .mdt Apologies, I wasn't denying that the hash doesn't follow the ELF header in the .mdt file (in his example, ELF (.b00) + hash (.b01) == .mdt), simply forgot to mention it. > and both are described in a single program header. Only the ELF header is described in the first program header, which is merely 0x001b4 bytes in size. The hash segment is described by the second header, which loads it from an external file. The trailing data in the .mdt file directly after the ELF header is unspecified by the program headers (all other tables are PT_LOAD after all). As discussed below this seems to be an "ease-of-use" feature of the .mdt, doubling as metadata sent directly to SCM. Its "presence" is (seemingly) detected by comparing the size of the mdt file against the summed size of the ELF header and the segment containing QCOM_MDT_TYPE_HASH. > Are you saying that this entry is of type QCOM_MDT_TYPE_HASH? According to Shawn's validation (matching what I see locally), the second program header starting at 0x1000 has the QCOM_MDT_TYPE_HASH bit set. >> And as Shawn demonstrated, >> concatenating the ELF header in .b00 and this hash in .b01 results in the >> .mdt file. This is the case we can (and already) optimize for, by passing >> the entire .mdt as-is into SCM if such packing is detected. >> > > In the other case, where the hash is described by a separate program > header, we need to ensure that it's concatenated directly following the > ELF header. Shawn's case has the hash described in a separate program header as well, but it is also appended directly after the ELF header. The program headers do not specify anything about this region in the .mdt file though. > There are three cases here: > 1) It's already directly following the ELF header and we should send > ehdr_size + hash_size bytes to PAS. Ack. > 2) It's in the loaded file, but it's not tightly packed (this is what we > see in the .mbn). In this case we need to pack up the two pieces before > we send them to PAS. Ack. > 3) It's not part of the loaded .mdt, in which case we need to read it > from whichever program header that happens to contain it (might not be > .b01). How should we detect that it is not part of the .mdt? Will the size of ELF header + hash segment simply not equal the size of the .mdt file? > In the case of #1 we should not do #3, because I've seen several cases > where the signature in .b01 does not match the one present in the .mdt. Oh, that really complicates matters :(. It means that SCM will see a different signature than what we eventually load into memory (recall that the hash bit in a .mdt is not specified in any program header, and will hence never be copied to memory. The "memory size" [p_memsz] of the ELF header is zero, and will never be copied either). Maybe that's never read though. > I do expect that passing the metadata of ELF+hash+b01 will still work, I don't think we have an ELF+hash+b01 case. It's always ELF+hash, but that hash can come from one of three locations as described in your summation below. > as it would ignore the tailing garbage. Replacing ELF+hash with ELF+b01 > is different from what we've been doing all these years, so that will > require retesting on all possible platforms... I don't think we should be doing this replacement (Sid's patch does, which seems to have lost this check) and simply keep the ehdr_size + hash_size == fw->size predicate for direct ELF+hash reading from a .mdt. > So I think what we're looking at it the three possible operations: > 1) init_image(ELF+hash) > 2) init_image(pack(ELF+hash)) > 3) init_image(pack(ELF+load(b01))) Correct, and all three cases should be handled by the four steps I mentioned to implement qcom_mdt_read_metadata. > Let's see if Sid agrees. > >>>>> [..] >>>>> Correct. If I were to respin Siddharth's patchset, I'd write >>>>> qcom_mdt_read_metadata as follows: >>>>> >>>>> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE >>>>> and PT_LOAD); >>>>> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and >>>>> the mdt file can be used as-is for metadata; >>>>> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but >>>>> `p_offset + p_filesz` does not fit inside the .mdt, this is (a >>>>> variant of) split-firmware, and the hash needs to be loaded from an >>>>> external file and appended to the ELF header. >>> >>> I would expect that PT_LOAD denotes that the segment should be loaded >>> into the final firmware region and that the hash segment would be >>> PT_NULL regardless of being part of the .mdt, single .mbn or a separate >>> .bNN segment. >> >> >> I have two mdt files here that load the hash externally, one with PT_LOAD >> and one with PT_NULL annotation (QCOM_MDT_TYPE_HASH is set). Both have their >> p_offset and/or p_offset+p_filesz well out of bounds of the .mdt file, but >> the .mdt file is of the packed variant (.b00 + .b01 == .mdt). >> > > Ahh, because mdt_phdr_valid() will discard them anyways as we look for > PT_LOAD segments... Yes. Curious how that works for mbn files then, as those should not have externally loadable files and hence no PT_LOAD segments? >> This packing seems to be an optimization, making it possible to send the >> .mdt contents as-is over SCM without having to load and concatenate the >> header from a separate file. >> > > Yes, that's probably the case. I was surprised to see the need for > repacking the two segments in the .mbn case. I can only assume that > Windows does this... I take it the .mbns also do not have a program header defining the memory contents directly after the ELF header? Is this part zero-initialized in the file, or is it pre-initialized with the hash just like .mdt? - Marijn
On Fri, Aug 27, 2021 at 05:13:34PM +0200, Marijn Suijten wrote: > Hi Shawn, > > On 8/27/21 4:12 PM, Shawn Guo wrote: > > [..] > > > > So you proposed to reject PT_LOAD in the else clause, which right now > > handles .mbn case > > > Yes, I propose to reject PT_LOAD in the else-case, and additionally reject > cases where p_offset+p_filesz > sw->size since PT_NULL can also cause > external file loads (meaning split-firmware). This is what Siddharth's > patchset - or my respin of it - is going to implement. > > > are you sure hash segment in .mbn is not going to be > > PT_LOAD? > > > PT_LOAD unambiguously indicates a program header that ought to be loaded > from an external file. Any mbn file (non-split firmware) without split > files that set PT_LOAD are misusing this program header type field. I have > no way to validate whether such mbns are in circulation. Following your take on PT_LOAD, I assume that no PT_LOAD segment should be found in .mbn file, correct? Here are two .mbn I found in circulation. Both have PT_LOAD type for a few segments. $ readelf -l venus.mbn Elf file type is EXEC (Executable file) Entry point 0xf500000 There are 5 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align NULL 0x000000 0x00000000 0x00000000 0x000d4 0x00000 0 LOAD 0x001000 0x0fa00000 0x0fa00000 0x00b88 0x02000 0x1000 LOAD 0x003000 0x00000000 0x0f500000 0xeecd0 0xeecd0 R E 0x100000 LOAD 0x0f1cd0 0x000ef000 0x0f5ef000 0x015c0 0x405000 RW 0x4000 LOAD 0x0f3290 0x004ff000 0x0f9ff000 0x00020 0x00020 RW 0x4 $ readelf -l mba.mbn Elf file type is EXEC (Executable file) Entry point 0x4417000 There are 5 program headers, starting at offset 52 Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align NULL 0x000000 0x00000000 0x00000000 0x000d4 0x00000 0 NULL 0x001000 0x8ea4a000 0x8ea4a000 0x019c8 0x02000 0x1000 LOAD 0x003000 0x04417000 0x8ea00000 0x350d0 0x3bbc8 RWE 0x1000 LOAD 0x039000 0x04460000 0x8ea49000 0x00380 0x00380 RW 0x1000 GNU_STACK 0x002000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4 Or you think these are all misusing of PT_LOAD? Sorry, I hardly believe your understanding on PT_LOAD matches the reality. Instead, I'm inclined to agree with Bjorn's comment. Quote: "I would expect that PT_LOAD denotes that the segment should be loaded into the final firmware region and that the hash segment would be PT_NULL regardless of being part of the .mdt, single .mbn or a separate .bNN segment." The only part that doesn't hold is "the hash segment would be PT_NULL". But the point is that PT_LOAD doesn't mean the segment should be loaded externally (from .bNN file). > > Of note, I have never referenced the definition of the program header types > yet. Please see [1]: > > PT_LOAD (1) > Indicates that this program header describes a segment to be > loaded from the file. > > Let me know if you're planning to send a v2 of this patch with > aforementioned improvements, then I'll adjust my plans to respin Siddharth's > patchset accordingly. I will send v2. However there will be no code change but just commit log update based on all these discussion. Thanks! Shawn > [1]: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_23.html
Hi Shawn, On 8/28/21 8:03 AM, Shawn Guo wrote: > [..] > > Or you think these are all misusing of PT_LOAD? Sorry, I hardly believe > your understanding on PT_LOAD matches the reality. Instead, I'm inclined > to agree with Bjorn's comment. Agreed, PT_LOAD does not imply _external_ loading at all, merely whether it has to be copied to the final memory region. That's my misunderstanding, even after quoting the documentation. This external loading is merely a qcom implementation detail AFAIK, and solely detected by the source boundaries (p_offset - p_offset+p_filesz) residing outside the .mdt file. > Quote: > > "I would expect that PT_LOAD denotes that the segment should be loaded > into the final firmware region and that the hash segment would be > PT_NULL regardless of being part of the .mdt, single .mbn or a separate > .bNN segment." > > The only part that doesn't hold is "the hash segment would be PT_NULL". > But the point is that PT_LOAD doesn't mean the segment should be loaded > externally (from .bNN file). Yes, this is strange. Why would the .mdt request the hash segment to be loaded to firmware memory when it's filtered out anyway? Are we supposed to filter it out? > [..] > > I will send v2. However there will be no code change but just commit log > update based on all these discussion. Thanks! Sounds good - glad we could have this discussion to get to the bottom of this and conclude that removing this check is the right approach without any side-effects, including a detailed justification in the commit-message. - Marijn
diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c index eba7f76f9d61..6034cd8992b0 100644 --- a/drivers/soc/qcom/mdt_loader.c +++ b/drivers/soc/qcom/mdt_loader.c @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len) if (ehdr->e_phnum < 2) return ERR_PTR(-EINVAL); - if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD) + if (phdrs[0].p_type == PT_LOAD) return ERR_PTR(-EINVAL); if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH)