Message ID | 1485185917-3677-1-git-send-email-christophe.milard@linaro.org |
---|---|
State | Accepted |
Commit | 0d6c0804640151f8aa9970cc74f1fef28a11fac1 |
Headers | show |
On Mon, Jan 23, 2017 at 9:38 AM, Christophe Milard <christophe.milard@linaro.org> wrote: > _ishm prereserves address space for _ODP_ISHM_SINGLE_VA allocated memory > by reserving unreachable "memory", using PROT_NONE, hence really allocating > virtual space (as the memory cannot be hit). > The problem came when trying to use some of this preallocated space: > strangely, if a new mapping on the preallocated virtual space failed (e.g. > due to lack of huge pages), linux would returned MAP_FAILED (OK) but > also cancel the previous PROT_NONE mapping, hence making the virtual > memory space available for further mmaps. (code Point A) > Before this patch, the code simply re-reserved (PROT_NONE) the area > (point B): > This was NOT OK: yes, _ishm_reserve calls are mutexed, so no other > odpthread 2 could do a reserve between point A and B of opdthread1, but if > thread2 did its own mmap(), malloc(),..., there was a chance for thread 2 > to get virtual space from the preserved area (which should be blocked). > This patch does not allow any A-B window by first doing an mmap (non fixed) > and then performing a second mapping at the correct address (in the > pre-reserved area), which is guaranteed to succeed, and finally removing > the non-fixed mapping. > The non-fix mapping just acts as a failure guard when the proper maping > is done. > Fixes https://bugs.linaro.org/show_bug.cgi?id=2834 > (but very hard to trigger i.e. to prove) > > Signed-off-by: Christophe Milard <christophe.milard@linaro.org> Reviewed-and-tested-by: Bill Fischofer <bill.fischofer@linaro.org> > --- > platform/linux-generic/_ishmphy.c | 42 +++++++++++++++++++++++++++++---------- > 1 file changed, 32 insertions(+), 10 deletions(-) > > diff --git a/platform/linux-generic/_ishmphy.c b/platform/linux-generic/_ishmphy.c > index 2b2d100..d519af6 100644 > --- a/platform/linux-generic/_ishmphy.c > +++ b/platform/linux-generic/_ishmphy.c > @@ -94,7 +94,7 @@ int _odp_ishmphy_unbook_va(void) > void *_odp_ishmphy_map(int fd, void *start, uint64_t size, > int flags) > { > - void *mapped_addr; > + void *mapped_addr_tmp, *mapped_addr; > int mmap_flags = 0; > > if (flags & _ODP_ISHM_SINGLE_VA) { > @@ -103,15 +103,37 @@ void *_odp_ishmphy_map(int fd, void *start, uint64_t size, > return NULL; > } > /* maps over fragment of reserved VA: */ > - mapped_addr = mmap(start, size, PROT_READ | PROT_WRITE, > - MAP_SHARED | MAP_FIXED | mmap_flags, fd, 0); > - /* if mapping fails, re-block the space we tried to take > - * as it seems a mapping failure still affect what was there??*/ > - if (mapped_addr == MAP_FAILED) { > - mmap_flags = MAP_SHARED | MAP_FIXED | > - MAP_ANONYMOUS | MAP_NORESERVE; > - mmap(start, size, PROT_NONE, mmap_flags, -1, 0); > - mprotect(start, size, PROT_NONE); > + /* first, try a normal map. If that works, remap it where it > + * should (on the prereverved space), and remove the initial > + * normal mapping: > + * This is because it turned out that if a mapping fails > + * on a the prereserved virtual address space, then > + * the prereserved address space which was tried to be mapped > + * on becomes available to the kernel again! This was not > + * according to expectations: the assumption was that if a > + * mapping fails, the system should remain unchanged, but this > + * is obvioulsy not true (at least for huge pages when > + * exhausted). > + * So the strategy is to first map at a non reserved place > + * (which can then be freed and returned to the kernel on > + * failure) and peform a new map to the prereserved space on > + * success (which is then guaranteed to work). > + * The initial free maping can then be removed. > + */ > + mapped_addr = MAP_FAILED; > + mapped_addr_tmp = mmap(NULL, size, PROT_READ | PROT_WRITE, > + MAP_SHARED | mmap_flags, fd, 0); > + if (mapped_addr_tmp != MAP_FAILED) { > + /* If OK, do new map at right fixed location... */ > + mapped_addr = mmap(start, > + size, PROT_READ | PROT_WRITE, > + MAP_SHARED | MAP_FIXED | mmap_flags, > + fd, 0); > + if (mapped_addr != start) > + ODP_ERR("new map failed:%s\n", strerror(errno)); > + /* ... and remove initial mapping: */ > + if (munmap(mapped_addr_tmp, size)) > + ODP_ERR("munmap failed:%s\n", strerror(errno)); > } > } else { > /* just do a new mapping in the VA space: */ > -- > 2.7.4 >
merged, Maxim. On 01/24/17 03:55, Bill Fischofer wrote: > On Mon, Jan 23, 2017 at 9:38 AM, Christophe Milard > <christophe.milard@linaro.org> wrote: >> _ishm prereserves address space for _ODP_ISHM_SINGLE_VA allocated memory >> by reserving unreachable "memory", using PROT_NONE, hence really allocating >> virtual space (as the memory cannot be hit). >> The problem came when trying to use some of this preallocated space: >> strangely, if a new mapping on the preallocated virtual space failed (e.g. >> due to lack of huge pages), linux would returned MAP_FAILED (OK) but >> also cancel the previous PROT_NONE mapping, hence making the virtual >> memory space available for further mmaps. (code Point A) >> Before this patch, the code simply re-reserved (PROT_NONE) the area >> (point B): >> This was NOT OK: yes, _ishm_reserve calls are mutexed, so no other >> odpthread 2 could do a reserve between point A and B of opdthread1, but if >> thread2 did its own mmap(), malloc(),..., there was a chance for thread 2 >> to get virtual space from the preserved area (which should be blocked). >> This patch does not allow any A-B window by first doing an mmap (non fixed) >> and then performing a second mapping at the correct address (in the >> pre-reserved area), which is guaranteed to succeed, and finally removing >> the non-fixed mapping. >> The non-fix mapping just acts as a failure guard when the proper maping >> is done. >> Fixes https://bugs.linaro.org/show_bug.cgi?id=2834 >> (but very hard to trigger i.e. to prove) >> >> Signed-off-by: Christophe Milard <christophe.milard@linaro.org> > > Reviewed-and-tested-by: Bill Fischofer <bill.fischofer@linaro.org> > >> --- >> platform/linux-generic/_ishmphy.c | 42 +++++++++++++++++++++++++++++---------- >> 1 file changed, 32 insertions(+), 10 deletions(-) >> >> diff --git a/platform/linux-generic/_ishmphy.c b/platform/linux-generic/_ishmphy.c >> index 2b2d100..d519af6 100644 >> --- a/platform/linux-generic/_ishmphy.c >> +++ b/platform/linux-generic/_ishmphy.c >> @@ -94,7 +94,7 @@ int _odp_ishmphy_unbook_va(void) >> void *_odp_ishmphy_map(int fd, void *start, uint64_t size, >> int flags) >> { >> - void *mapped_addr; >> + void *mapped_addr_tmp, *mapped_addr; >> int mmap_flags = 0; >> >> if (flags & _ODP_ISHM_SINGLE_VA) { >> @@ -103,15 +103,37 @@ void *_odp_ishmphy_map(int fd, void *start, uint64_t size, >> return NULL; >> } >> /* maps over fragment of reserved VA: */ >> - mapped_addr = mmap(start, size, PROT_READ | PROT_WRITE, >> - MAP_SHARED | MAP_FIXED | mmap_flags, fd, 0); >> - /* if mapping fails, re-block the space we tried to take >> - * as it seems a mapping failure still affect what was there??*/ >> - if (mapped_addr == MAP_FAILED) { >> - mmap_flags = MAP_SHARED | MAP_FIXED | >> - MAP_ANONYMOUS | MAP_NORESERVE; >> - mmap(start, size, PROT_NONE, mmap_flags, -1, 0); >> - mprotect(start, size, PROT_NONE); >> + /* first, try a normal map. If that works, remap it where it >> + * should (on the prereverved space), and remove the initial >> + * normal mapping: >> + * This is because it turned out that if a mapping fails >> + * on a the prereserved virtual address space, then >> + * the prereserved address space which was tried to be mapped >> + * on becomes available to the kernel again! This was not >> + * according to expectations: the assumption was that if a >> + * mapping fails, the system should remain unchanged, but this >> + * is obvioulsy not true (at least for huge pages when >> + * exhausted). >> + * So the strategy is to first map at a non reserved place >> + * (which can then be freed and returned to the kernel on >> + * failure) and peform a new map to the prereserved space on >> + * success (which is then guaranteed to work). >> + * The initial free maping can then be removed. >> + */ >> + mapped_addr = MAP_FAILED; >> + mapped_addr_tmp = mmap(NULL, size, PROT_READ | PROT_WRITE, >> + MAP_SHARED | mmap_flags, fd, 0); >> + if (mapped_addr_tmp != MAP_FAILED) { >> + /* If OK, do new map at right fixed location... */ >> + mapped_addr = mmap(start, >> + size, PROT_READ | PROT_WRITE, >> + MAP_SHARED | MAP_FIXED | mmap_flags, >> + fd, 0); >> + if (mapped_addr != start) >> + ODP_ERR("new map failed:%s\n", strerror(errno)); >> + /* ... and remove initial mapping: */ >> + if (munmap(mapped_addr_tmp, size)) >> + ODP_ERR("munmap failed:%s\n", strerror(errno)); >> } >> } else { >> /* just do a new mapping in the VA space: */ >> -- >> 2.7.4 >>
diff --git a/platform/linux-generic/_ishmphy.c b/platform/linux-generic/_ishmphy.c index 2b2d100..d519af6 100644 --- a/platform/linux-generic/_ishmphy.c +++ b/platform/linux-generic/_ishmphy.c @@ -94,7 +94,7 @@ int _odp_ishmphy_unbook_va(void) void *_odp_ishmphy_map(int fd, void *start, uint64_t size, int flags) { - void *mapped_addr; + void *mapped_addr_tmp, *mapped_addr; int mmap_flags = 0; if (flags & _ODP_ISHM_SINGLE_VA) { @@ -103,15 +103,37 @@ void *_odp_ishmphy_map(int fd, void *start, uint64_t size, return NULL; } /* maps over fragment of reserved VA: */ - mapped_addr = mmap(start, size, PROT_READ | PROT_WRITE, - MAP_SHARED | MAP_FIXED | mmap_flags, fd, 0); - /* if mapping fails, re-block the space we tried to take - * as it seems a mapping failure still affect what was there??*/ - if (mapped_addr == MAP_FAILED) { - mmap_flags = MAP_SHARED | MAP_FIXED | - MAP_ANONYMOUS | MAP_NORESERVE; - mmap(start, size, PROT_NONE, mmap_flags, -1, 0); - mprotect(start, size, PROT_NONE); + /* first, try a normal map. If that works, remap it where it + * should (on the prereverved space), and remove the initial + * normal mapping: + * This is because it turned out that if a mapping fails + * on a the prereserved virtual address space, then + * the prereserved address space which was tried to be mapped + * on becomes available to the kernel again! This was not + * according to expectations: the assumption was that if a + * mapping fails, the system should remain unchanged, but this + * is obvioulsy not true (at least for huge pages when + * exhausted). + * So the strategy is to first map at a non reserved place + * (which can then be freed and returned to the kernel on + * failure) and peform a new map to the prereserved space on + * success (which is then guaranteed to work). + * The initial free maping can then be removed. + */ + mapped_addr = MAP_FAILED; + mapped_addr_tmp = mmap(NULL, size, PROT_READ | PROT_WRITE, + MAP_SHARED | mmap_flags, fd, 0); + if (mapped_addr_tmp != MAP_FAILED) { + /* If OK, do new map at right fixed location... */ + mapped_addr = mmap(start, + size, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_FIXED | mmap_flags, + fd, 0); + if (mapped_addr != start) + ODP_ERR("new map failed:%s\n", strerror(errno)); + /* ... and remove initial mapping: */ + if (munmap(mapped_addr_tmp, size)) + ODP_ERR("munmap failed:%s\n", strerror(errno)); } } else { /* just do a new mapping in the VA space: */
_ishm prereserves address space for _ODP_ISHM_SINGLE_VA allocated memory by reserving unreachable "memory", using PROT_NONE, hence really allocating virtual space (as the memory cannot be hit). The problem came when trying to use some of this preallocated space: strangely, if a new mapping on the preallocated virtual space failed (e.g. due to lack of huge pages), linux would returned MAP_FAILED (OK) but also cancel the previous PROT_NONE mapping, hence making the virtual memory space available for further mmaps. (code Point A) Before this patch, the code simply re-reserved (PROT_NONE) the area (point B): This was NOT OK: yes, _ishm_reserve calls are mutexed, so no other odpthread 2 could do a reserve between point A and B of opdthread1, but if thread2 did its own mmap(), malloc(),..., there was a chance for thread 2 to get virtual space from the preserved area (which should be blocked). This patch does not allow any A-B window by first doing an mmap (non fixed) and then performing a second mapping at the correct address (in the pre-reserved area), which is guaranteed to succeed, and finally removing the non-fixed mapping. The non-fix mapping just acts as a failure guard when the proper maping is done. Fixes https://bugs.linaro.org/show_bug.cgi?id=2834 (but very hard to trigger i.e. to prove) Signed-off-by: Christophe Milard <christophe.milard@linaro.org> --- platform/linux-generic/_ishmphy.c | 42 +++++++++++++++++++++++++++++---------- 1 file changed, 32 insertions(+), 10 deletions(-) -- 2.7.4