Message ID | 20201021144456.1072-1-s.reiter@proxmox.com |
---|---|
State | New |
Headers | show |
Series | migration/block-dirty-bitmap: fix larger granularity bitmaps | expand |
21.10.2020 17:44, Stefan Reiter wrote: > sectors_per_chunk is a 64 bit integer, but the calculation is done in 32 > bits, leading to an overflow for coarse bitmap granularities. > > If that results in the value 0, it leads to a hang where no progress is > made but send_bitmap_bits is constantly called with nr_sectors being 0. > > Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> > --- > migration/block-dirty-bitmap.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c > index 5bef793ac0..5398869e2b 100644 > --- a/migration/block-dirty-bitmap.c > +++ b/migration/block-dirty-bitmap.c > @@ -562,8 +562,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, > dbms->bitmap_alias = g_strdup(bitmap_alias); > dbms->bitmap = bitmap; > dbms->total_sectors = bdrv_nb_sectors(bs); > - dbms->sectors_per_chunk = CHUNK_SIZE * 8 * > + dbms->sectors_per_chunk = CHUNK_SIZE * 8lu * I'd prefer 8llu for absolute safety. > bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS; > + assert(dbms->sectors_per_chunk != 0); I doubt that we need this assertion. Bug fixed, and it's obviously impossible. And if we really want to assert that there is no overflow (assuming future changes), it should look like this: assert(bdrv_dirty_bitmap_granularity(bitmap) < (1ull << 63) / CHUNK_SIZE / 8 >> BDRV_SECTOR_BITS); to cover not only corner case but any overflow.. And of course we should modify original expression to do ">> BDRV_SECTOR_BITS" earlier than all multiplies, like dbms->sectors_per_chunk = CHUNK_SIZE * 8llu * (bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS); But I think that only s/8/8ull/ change is enough. > if (bdrv_dirty_bitmap_enabled(bitmap)) { > dbms->flags |= DIRTY_BITMAP_MIG_START_FLAG_ENABLED; > } > With 8llu and with or without assertion: Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> -- Best regards, Vladimir
On 10/21/20 9:44 AM, Stefan Reiter wrote: > sectors_per_chunk is a 64 bit integer, but the calculation is done in 32 > bits, leading to an overflow for coarse bitmap granularities. > > If that results in the value 0, it leads to a hang where no progress is > made but send_bitmap_bits is constantly called with nr_sectors being 0. > > Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> > --- > migration/block-dirty-bitmap.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c > index 5bef793ac0..5398869e2b 100644 > --- a/migration/block-dirty-bitmap.c > +++ b/migration/block-dirty-bitmap.c > @@ -562,8 +562,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, > dbms->bitmap_alias = g_strdup(bitmap_alias); > dbms->bitmap = bitmap; > dbms->total_sectors = bdrv_nb_sectors(bs); > - dbms->sectors_per_chunk = CHUNK_SIZE * 8 * > + dbms->sectors_per_chunk = CHUNK_SIZE * 8lu * 8lu is not necessarily 64-bit; you need llu. Also, I prefer capitalizing the type suffix, as in 8LLU. I can touch that up while queuing through my bitmaps tree, so: Reviewed-by: Eric Blake <eblake@redhat.com> > bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS; > + assert(dbms->sectors_per_chunk != 0); > if (bdrv_dirty_bitmap_enabled(bitmap)) { > dbms->flags |= DIRTY_BITMAP_MIG_START_FLAG_ENABLED; > } > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
On 10/21/20 5:17 PM, Vladimir Sementsov-Ogievskiy wrote: > 21.10.2020 17:44, Stefan Reiter wrote: >> sectors_per_chunk is a 64 bit integer, but the calculation is done in 32 >> bits, leading to an overflow for coarse bitmap granularities. >> >> If that results in the value 0, it leads to a hang where no progress is >> made but send_bitmap_bits is constantly called with nr_sectors being 0. >> >> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> >> --- >> migration/block-dirty-bitmap.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/migration/block-dirty-bitmap.c >> b/migration/block-dirty-bitmap.c >> index 5bef793ac0..5398869e2b 100644 >> --- a/migration/block-dirty-bitmap.c >> +++ b/migration/block-dirty-bitmap.c >> @@ -562,8 +562,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, >> BlockDriverState *bs, >> dbms->bitmap_alias = g_strdup(bitmap_alias); >> dbms->bitmap = bitmap; >> dbms->total_sectors = bdrv_nb_sectors(bs); >> - dbms->sectors_per_chunk = CHUNK_SIZE * 8 * >> + dbms->sectors_per_chunk = CHUNK_SIZE * 8lu * > > I'd prefer 8llu for absolute safety. > >> bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS; >> + assert(dbms->sectors_per_chunk != 0); > > I doubt that we need this assertion. Bug fixed, and it's obviously > impossible. > And if we really want to assert that there is no overflow (assuming > future changes), > it should look like this: > > assert(bdrv_dirty_bitmap_granularity(bitmap) < (1ull << 63) / > CHUNK_SIZE / 8 >> BDRV_SECTOR_BITS); > > to cover not only corner case but any overflow.. And of course we should > modify original expression > to do ">> BDRV_SECTOR_BITS" earlier than all multiplies, like > > dbms->sectors_per_chunk = CHUNK_SIZE * 8llu * > (bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS); > > > But I think that only s/8/8ull/ change is enough. > I agree, and I wouldn't mind removing the assert, but just to clarify it was mostly meant to prevent the case where the migration gets stuck entirely. Even if the calculation is wrong, it would at least do _something_ instead of endlessly looping. Maybe an assert(nr_sectors != 0); in send_bitmap_bits instead for that? >> if (bdrv_dirty_bitmap_enabled(bitmap)) { >> dbms->flags |= DIRTY_BITMAP_MIG_START_FLAG_ENABLED; >> } >> > > > With 8llu and with or without assertion: > Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >
22.10.2020 10:46, Stefan Reiter wrote: > On 10/21/20 5:17 PM, Vladimir Sementsov-Ogievskiy wrote: >> 21.10.2020 17:44, Stefan Reiter wrote: >>> sectors_per_chunk is a 64 bit integer, but the calculation is done in 32 >>> bits, leading to an overflow for coarse bitmap granularities. >>> >>> If that results in the value 0, it leads to a hang where no progress is >>> made but send_bitmap_bits is constantly called with nr_sectors being 0. >>> >>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> >>> --- >>> migration/block-dirty-bitmap.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c >>> index 5bef793ac0..5398869e2b 100644 >>> --- a/migration/block-dirty-bitmap.c >>> +++ b/migration/block-dirty-bitmap.c >>> @@ -562,8 +562,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, >>> dbms->bitmap_alias = g_strdup(bitmap_alias); >>> dbms->bitmap = bitmap; >>> dbms->total_sectors = bdrv_nb_sectors(bs); >>> - dbms->sectors_per_chunk = CHUNK_SIZE * 8 * >>> + dbms->sectors_per_chunk = CHUNK_SIZE * 8lu * >> >> I'd prefer 8llu for absolute safety. >> >>> bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS; >>> + assert(dbms->sectors_per_chunk != 0); >> >> I doubt that we need this assertion. Bug fixed, and it's obviously impossible. >> And if we really want to assert that there is no overflow (assuming future changes), >> it should look like this: >> >> assert(bdrv_dirty_bitmap_granularity(bitmap) < (1ull << 63) / CHUNK_SIZE / 8 >> BDRV_SECTOR_BITS); >> >> to cover not only corner case but any overflow.. And of course we should modify original expression >> to do ">> BDRV_SECTOR_BITS" earlier than all multiplies, like >> >> dbms->sectors_per_chunk = CHUNK_SIZE * 8llu * (bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS); >> >> >> But I think that only s/8/8ull/ change is enough. >> > > I agree, and I wouldn't mind removing the assert, but just to clarify it was mostly meant to prevent the case where the migration gets stuck entirely. Even if the calculation is wrong, it would at least do _something_ instead of endlessly looping. > > Maybe an > > assert(nr_sectors != 0); > > in send_bitmap_bits instead for that? Hmm, just sending 0 sectors should not be a problem by itself. It's a problem when we don't make a progress in the loop in bulk_phase().. So, I'd prefer your original assertion, as sectors_per_chunk=0 is definitely wrong thing. > >>> if (bdrv_dirty_bitmap_enabled(bitmap)) { >>> dbms->flags |= DIRTY_BITMAP_MIG_START_FLAG_ENABLED; >>> } >>> >> >> >> With 8llu and with or without assertion: >> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> >> > -- Best regards, Vladimir
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c index 5bef793ac0..5398869e2b 100644 --- a/migration/block-dirty-bitmap.c +++ b/migration/block-dirty-bitmap.c @@ -562,8 +562,9 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs, dbms->bitmap_alias = g_strdup(bitmap_alias); dbms->bitmap = bitmap; dbms->total_sectors = bdrv_nb_sectors(bs); - dbms->sectors_per_chunk = CHUNK_SIZE * 8 * + dbms->sectors_per_chunk = CHUNK_SIZE * 8lu * bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS; + assert(dbms->sectors_per_chunk != 0); if (bdrv_dirty_bitmap_enabled(bitmap)) { dbms->flags |= DIRTY_BITMAP_MIG_START_FLAG_ENABLED; }
sectors_per_chunk is a 64 bit integer, but the calculation is done in 32 bits, leading to an overflow for coarse bitmap granularities. If that results in the value 0, it leads to a hang where no progress is made but send_bitmap_bits is constantly called with nr_sectors being 0. Signed-off-by: Stefan Reiter <s.reiter@proxmox.com> --- migration/block-dirty-bitmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)