Message ID | 1594378641-26360-1-git-send-email-bmeng.cn@gmail.com |
---|---|
State | New |
Headers | show |
Series | memsize: Make get_ram_size() work with arbitary RAM size | expand |
Dear Bin Meng, In message <1594378641-26360-1-git-send-email-bmeng.cn at gmail.com> you wrote: > > Currently get_ram_size() only works with certain RAM size like 1GiB, > 2GiB, 4GiB, 8GiB, etc. Chanage the codes to work with any RAM size. I'm afraid I don't understand this change, Can you please explain a bit more detailed what "any RAM size" means? The existing code should work fine with any RAM size that is a power of two (and the algoithm used is based on this assumption, so the code would fail if it is not met). > - long save[BITS_PER_LONG - 1]; > - long save_base; > - long cnt; > - long val; > - long size; > - int i = 0; > + long save[BITS_PER_LONG - 1]; > + long save_base; > + long cnt; > + long val; > + long size; > + int i = 0; Why do you change the formatting here? I can see no need for that? > + long n = maxsize / sizeof(long); > > - for (cnt = (maxsize / sizeof(long)) >> 1; cnt > 0; cnt >>= 1) { > + n = __ffs(n); > + n = BIT(n); > + > + for (cnt = n >> 1; cnt > 0; cnt >>= 1) { I can only speculate - but do you think that this will work for a memory size of - for example - 2.5 GiB as might result from combining two banks of 2 GiB resp. 512 MiB ? I bet it doesn't. For correct operation (as originally intended) you would always specify a maxsize twice as large as the maximum possible memory size of a bank of memory, and the function would return the real size it found. Any other use, especially not checking one bank of memory at a time, will not work as intended. And I have yet to see systems where the size of a bank of memory is not a power of two. So I feel what you are doing here is not an improvement, but a "workaround" for some incorrect usage. I don't think we should accept this patch. Best regards, Wolfgang Denk
Hi Wolfgang, On Tue, Jul 14, 2020 at 1:18 AM Wolfgang Denk <wd at denx.de> wrote: > > Dear Bin Meng, > > In message <1594378641-26360-1-git-send-email-bmeng.cn at gmail.com> you wrote: > > > > Currently get_ram_size() only works with certain RAM size like 1GiB, > > 2GiB, 4GiB, 8GiB, etc. Chanage the codes to work with any RAM size. > > I'm afraid I don't understand this change, Can you please explain a > bit more detailed what "any RAM size" means? I meant "any RAM size" that is not power of two. > > The existing code should work fine with any RAM size that is a power > of two (and the algoithm used is based on this assumption, so the > code would fail if it is not met). Unfortunately this limitation is not documented clearly. > > > - long save[BITS_PER_LONG - 1]; > > - long save_base; > > - long cnt; > > - long val; > > - long size; > > - int i = 0; > > + long save[BITS_PER_LONG - 1]; > > + long save_base; > > + long cnt; > > + long val; > > + long size; > > + int i = 0; > > Why do you change the formatting here? I can see no need for that? I see the above are the only places in this file that do not follow our coding practices. Since I was modifying this function, I think it's fine to do the updates while we are here. > > > > + long n = maxsize / sizeof(long); > > > > - for (cnt = (maxsize / sizeof(long)) >> 1; cnt > 0; cnt >>= 1) { > > + n = __ffs(n); > > + n = BIT(n); > > + > > + for (cnt = n >> 1; cnt > 0; cnt >>= 1) { > > I can only speculate - but do you think that this will work for a > memory size of - for example - 2.5 GiB as might result from combining > two banks of 2 GiB resp. 512 MiB ? I bet it doesn't. Correct. This issue was exposed when I was testing U-Boot with QEMU on RISC-V. With QEMU we can instantiate arbitrary RAM size for a given machine. > > For correct operation (as originally intended) you would always > specify a maxsize twice as large as the maximum possible memory size > of a bank of memory, and the function would return the real size it > found. I don't think this function can work as expected. Even if I pass a power-of-two RAM size to this function, it could still fail. Imagine the target has 2GiB memory size, and I pass 8GiB as the maxsize to this function. The function will hang when it tries to read/write something at an address that is beyond 2GiB. > > Any other use, especially not checking one bank of memory at a time, > will not work as intended. And I have yet to see systems where > the size of a bank of memory is not a power of two. > > So I feel what you are doing here is not an improvement, but a > "workaround" for some incorrect usage. > Given what you mentioned the function limitation, we should update the codes with the above power of two assumptions documented. > I don't think we should accept this patch. > Regards, Bin
Dear Bin Meng, In message <CAEUhbmWBFvb_SGobk2f_JXhvAtyZkZ1jJ0HXiWdC0xU61Hn1jg at mail.gmail.com> you wrote: > > > I'm afraid I don't understand this change, Can you please explain a > > bit more detailed what "any RAM size" means? > > I meant "any RAM size" that is not power of two. I was afraid you meant this. > > The existing code should work fine with any RAM size that is a power > > of two (and the algoithm used is based on this assumption, so the > > code would fail if it is not met). > > Unfortunately this limitation is not documented clearly. You are right - the code is completely undocumented, not even the fact that it should always operate on a single bank of memory at a time. > > For correct operation (as originally intended) you would always > > specify a maxsize twice as large as the maximum possible memory size > > of a bank of memory, and the function would return the real size it > > found. > > I don't think this function can work as expected. Even if I pass a > power-of-two RAM size to this function, it could still fail. Imagine > the target has 2GiB memory size, and I pass 8GiB as the maxsize to > this function. The function will hang when it tries to read/write > something at an address that is beyond 2GiB. It should not hang if used properly - but of course this depends a lot on the hardware properties, and you have to know what you are doing. When the code was written, all we had to deal with was Power Architecture systems with pretty flexible and easy to ptogram memory controllers. A typical use case was a system where you could for example populate either 64 or 128 or 256 MB of RAM. In such a case you would run the code with a max size of 512 MB, after configuring he memory controller to map such a size. In this case, acceses to the whole 512 MB range will work without problems, the higher address ranges just mirroring the data of the actually populated RAM. This mirroring will be detected, so you get the information how big the populated RAM really is. If we would only check with a max size of 256 MB, we would not be able to detect the case when there is bigger RAM (say, 512 MB) populated. I am aware that there are situations where you can't program the memory controller so freely, so we often cannot use this complete testing and lose the ability to detect working, but bigger RAM. In any case we always test only the memory locations at the power-of-two borders - the main reason is that this test is supposed to run on every boot, so it must be fast. In your case this immediately shows the problem. Assume you have a memory configuration which is supposed to have 2 GiB + 512 MiB, but by mistake wrong components were fit and instead of 512 MiB there are only 128 MiB actually present. U-Boot will test memory below the ... 128M, 256M, 512M, 1G, 2G (and ideally 4G) limits. It will tell you that there are 2 GiB working memory. It will NOT test any location between 2 GB and 4GB, so it has no chance to find out if there is 128 MB or 512 MB present, or nothing at all, or if this memory is not working at all and returning random data. This code works only for bank sizes that are a power of two! > Given what you mentioned the function limitation, we should update the > codes with the above power of two assumptions documented. That would indeed make sense. On the other hand, I would have expected that this behaviour is kind of obvious from the fact, that we just bit shift the address location used for testing? Best regards, Wolfgang Denk
On 14.07.20 12:35, Bin Meng wrote: > Hi Wolfgang, > > On Tue, Jul 14, 2020 at 1:18 AM Wolfgang Denk <wd at denx.de> wrote: >> >> Dear Bin Meng, >> >> In message <1594378641-26360-1-git-send-email-bmeng.cn at gmail.com> you wrote: >>> >>> Currently get_ram_size() only works with certain RAM size like 1GiB, >>> 2GiB, 4GiB, 8GiB, etc. Chanage the codes to work with any RAM size. >> >> I'm afraid I don't understand this change, Can you please explain a >> bit more detailed what "any RAM size" means? > > I meant "any RAM size" that is not power of two. > >> >> The existing code should work fine with any RAM size that is a power >> of two (and the algoithm used is based on this assumption, so the >> code would fail if it is not met). > > Unfortunately this limitation is not documented clearly. > >> >>> - long save[BITS_PER_LONG - 1]; >>> - long save_base; >>> - long cnt; >>> - long val; >>> - long size; >>> - int i = 0; >>> + long save[BITS_PER_LONG - 1]; >>> + long save_base; >>> + long cnt; >>> + long val; >>> + long size; >>> + int i = 0; >> >> Why do you change the formatting here? I can see no need for that? > > I see the above are the only places in this file that do not follow > our coding practices. Since I was modifying this function, I think > it's fine to do the updates while we are here. > >> >> >>> + long n = maxsize / sizeof(long); >>> >>> - for (cnt = (maxsize / sizeof(long)) >> 1; cnt > 0; cnt >>= 1) { >>> + n = __ffs(n); >>> + n = BIT(n); >>> + >>> + for (cnt = n >> 1; cnt > 0; cnt >>= 1) { >> >> I can only speculate - but do you think that this will work for a >> memory size of - for example - 2.5 GiB as might result from combining >> two banks of 2 GiB resp. 512 MiB ? I bet it doesn't. > > Correct. This issue was exposed when I was testing U-Boot with QEMU on > RISC-V. With QEMU we can instantiate arbitrary RAM size for a given > machine. > >> >> For correct operation (as originally intended) you would always >> specify a maxsize twice as large as the maximum possible memory size >> of a bank of memory, and the function would return the real size it >> found. > > I don't think this function can work as expected. Even if I pass a > power-of-two RAM size to this function, it could still fail. Imagine > the target has 2GiB memory size, and I pass 8GiB as the maxsize to > this function. The function will hang when it tries to read/write > something at an address that is beyond 2GiB. > >> >> Any other use, especially not checking one bank of memory at a time, >> will not work as intended. And I have yet to see systems where >> the size of a bank of memory is not a power of two. >> >> So I feel what you are doing here is not an improvement, but a >> "workaround" for some incorrect usage. >> > > Given what you mentioned the function limitation, we should update the > codes with the above power of two assumptions documented. > >> I don't think we should accept this patch. >> > > Regards, > Bin > If we want a fast algorithm to determine the last supported address even if the start or size is not a power of two: unsigned long n = sizeof(unsigned long) * 8 - 1; unsigned long addr = 0; n = 1UL << n; for (; n; n >>= 1) { ????????unsigned long next = addr + n; ????????if (next < start || next <= start + size - 1 && test(next)) ????????????????addr = next; } Best regards Heinrich
Dear Heinrich, In message <53dad1c7-7684-f975-1567-6ec5e03fa4b6 at gmx.de> you wrote: > > If we want a fast algorithm to determine the last supported address even > if the start or size is not a power of two: Are you sure? How is this supposed to work? Running it with start = 0 and size = 0xC0000000 it will test the memory locations 0x80000000 0xA0000000 0xB0000000 0xB8000000 0xBC000000 0xBE000000 0xBF000000 0xBF800000 0xBFC00000 0xBFE00000 0xBFF00000 0xBFF80000 0xBFFC0000 0xBFFE0000 0xBFFF0000 0xBFFF8000 0xBFFFC000 0xBFFFE000 0xBFFFF000 0xBFFFF800 0xBFFFFC00 0xBFFFFE00 0xBFFFFF00 0xBFFFFF80 0xBFFFFFC0 0xBFFFFFE0 0xBFFFFFF0 0xBFFFFFF8 0xBFFFFFFC 0xBFFFFFFE 0xBFFFFFFF What do you intend with such a sequence? Best regards, Wolfgang Denk
Dear Heinrich, In message <4d5aaf77-378d-4ac2-0008-599ceeeb95ed@gmx.de> you wrote: > > > Are you sure? How is this supposed to work? > > > > Running it with start = 0 and size = 0xC0000000 it will test the > > memory locations > > > > 0x80000000 > > 0xA0000000 > > 0xB0000000 > > 0xB8000000 > > 0xBC000000 > > 0xBE000000 > > 0xBF000000 > > 0xBF800000 > > 0xBFC00000 > > 0xBFE00000 > > 0xBFF00000 > > 0xBFF80000 > > 0xBFFC0000 > > 0xBFFE0000 > > 0xBFFF0000 > > 0xBFFF8000 > > 0xBFFFC000 > > 0xBFFFE000 > > 0xBFFFF000 > > 0xBFFFF800 > > 0xBFFFFC00 > > 0xBFFFFE00 > > 0xBFFFFF00 > > 0xBFFFFF80 > > 0xBFFFFFC0 > > 0xBFFFFFE0 > > 0xBFFFFFF0 > > 0xBFFFFFF8 > > 0xBFFFFFFC > > 0xBFFFFFFE > > 0xBFFFFFFF > > The last accessible byte is at 0xBFFFFFFF which matches (start + size - > 1) in your example. > > The difference to the current logic is that it does not require start or > size to be power of two. Yes, I know, but your test strategy makes no sense. You have to check memory at the power-of-two boudanries to find out how big the ram actually is and whether your memory accesses may be mirrored at other locations in the address room. You are just testing the end of the expected size in some way for which I cannot find a good explanation. > If the size input is larger than the actually accessible memory size, > you will get the actual memory size, e.g. lets assume that the last > accessible address is 0x3333: Please understand thtat this function is not a plain stupid size test, but there is more intelligece in it. We have seen many cases where for example the memory controller was set up for 256 MiB address room, but only 64 MiB of memory were installed. The current version of the test reliably detects that there are only 64 MiB of memory presnet, clearly indicating the problem. I think your test would happyly report to see 256 MiB of memory and miss the fact that this is only one of the 4 mirrors of the real memory - and of course any attempt to run Linux on such a system would result in funny crashes... > Now the algorithm returns that the last accessible memory location is > 0x3332. Come on. The purpose of this function as it was designed is to find the size of installed memory in systems where this may change because there may be several hardware configurations. And memory chips tend to come in sizes that are a power of two. And we not only test the size, we also verify basic functionality (including errors like shorts of breaks in the data and address lanes), and this really fast so it can be kept in the production version. You drop all this very practical functionality for a feature which is of hypothetical use at best? > The algorithm runs in O(log(UINT_MAX)) which is the same time complexity > as for the current algorithm. Yes, but it makes absolutely no sense to me. Best regards, Wolfgang Denk -- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de I can't say I've ever been lost, but I was bewildered once for three days. - Daniel Boone (Attributed)
diff --git a/common/memsize.c b/common/memsize.c index e95c682..776737a 100644 --- a/common/memsize.c +++ b/common/memsize.c @@ -6,6 +6,7 @@ #include <common.h> #include <init.h> +#include <linux/bitops.h> DECLARE_GLOBAL_DATA_PTR; @@ -27,14 +28,18 @@ DECLARE_GLOBAL_DATA_PTR; long get_ram_size(long *base, long maxsize) { volatile long *addr; - long save[BITS_PER_LONG - 1]; - long save_base; - long cnt; - long val; - long size; - int i = 0; + long save[BITS_PER_LONG - 1]; + long save_base; + long cnt; + long val; + long size; + int i = 0; + long n = maxsize / sizeof(long); - for (cnt = (maxsize / sizeof(long)) >> 1; cnt > 0; cnt >>= 1) { + n = __ffs(n); + n = BIT(n); + + for (cnt = n >> 1; cnt > 0; cnt >>= 1) { addr = base + cnt; /* pointer arith! */ sync(); save[i++] = *addr; @@ -53,7 +58,7 @@ long get_ram_size(long *base, long maxsize) /* Restore the original data before leaving the function. */ sync(); *base = save_base; - for (cnt = 1; cnt < maxsize / sizeof(long); cnt <<= 1) { + for (cnt = 1; cnt < n; cnt <<= 1) { addr = base + cnt; sync(); *addr = save[--i]; @@ -61,7 +66,7 @@ long get_ram_size(long *base, long maxsize) return (0); } - for (cnt = 1; cnt < maxsize / sizeof(long); cnt <<= 1) { + for (cnt = 1; cnt < n; cnt <<= 1) { addr = base + cnt; /* pointer arith! */ val = *addr; *addr = save[--i]; @@ -72,12 +77,13 @@ long get_ram_size(long *base, long maxsize) * before leaving the function. */ for (cnt <<= 1; - cnt < maxsize / sizeof(long); + cnt < n; cnt <<= 1) { addr = base + cnt; *addr = save[--i]; } - /* warning: don't restore save_base in this case, + /* + * warning: don't restore save_base in this case, * it is already done in the loop because * base and base+size share the same physical memory * and *base is saved after *(base+size) modification