Message ID | 20201031123502.4558-2-vsementsov@virtuozzo.com |
---|---|
State | New |
Headers | show |
Series | [1/2] block: make bdrv_drop_intermediate() less wrong | expand |
On 31.10.20 13:35, Vladimir Sementsov-Ogievskiy wrote: > First, permission update loop tries to do iterations transactionally, > but the whole update is not transactional: nobody roll-back successful > loop iterations when some iteration fails. Indeed. Another thing that should be noted: bdrv_check_update_perm() doc: > Needs to be followed by a call to either bdrv_set_perm() > or bdrv_abort_perm_update(). And besides rolling back on error, we don’t commit on success either. > Second, in the iteration we have nested permission update: > c->klass->update_filename may point to bdrv_child_cb_update_filename() > which calls bdrv_backing_update_filename(), which may do node reopen to > RW. > > Permission update system is not prepared to nested updates, at least it > has intermediate permission-update state stored in BdrvChild > structures: has_backup_perm, backup_perm and backup_shared_perm. > > So, let's first do bdrv_replace_node() (which is more transactional > than open-coded update in bdrv_drop_intermediate()) and then call > update_filename() in separate. We still do not rollback changes in case > of update_filename() failure but it's not much worse than pre-patch > behavior. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> > --- > block.c | 36 +++++++++++++++--------------------- > 1 file changed, 15 insertions(+), 21 deletions(-) And it’s also much simpler and clearer now. Reviewed-by: Max Reitz <mreitz@redhat.com>
On 05.11.20 14:22, Max Reitz wrote: > On 31.10.20 13:35, Vladimir Sementsov-Ogievskiy wrote: >> First, permission update loop tries to do iterations transactionally, >> but the whole update is not transactional: nobody roll-back successful >> loop iterations when some iteration fails. [...] > And besides rolling back on error, we don’t commit on success either. (Oh, we do, through bdrv_replace_child(). Well.) Max
On Sat 31 Oct 2020 01:35:01 PM CET, Vladimir Sementsov-Ogievskiy wrote: > @@ -4958,36 +4958,30 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base, > backing_file_str = base->filename; > } > > - QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) { > - /* Check whether we are allowed to switch c from top to base */ > - GSList *ignore_children = g_slist_prepend(NULL, c); > - ret = bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm, > - ignore_children, NULL, &local_err); > - g_slist_free(ignore_children); > - if (ret < 0) { > - error_report_err(local_err); > - goto exit; > - } > + bdrv_replace_node(top, base, &local_err); > + if (local_err) { > + error_report_err(local_err); > + goto exit; > + } At the beginning of this function there's a check for c->frozen. I think you can remove it safely because you also have it in bdrv_replace_node() Berto
On Sat 31 Oct 2020 01:35:01 PM CET, Vladimir Sementsov-Ogievskiy wrote: > - QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) { /* ... */ > + QLIST_FOREACH_SAFE(c, &base->parents, next_parent, next) { I also wonder, is top->parents and base->parents guaranteed to be the same list in this case? If not you could store the list of top->parents before calling bdrv_replace_node() and use it afterwards. QLIST_FOREACH(c, &top->parents, next_parent) { parents = g_slist_prepend(parents, c); } Berto
05.11.2020 18:14, Alberto Garcia wrote: > On Sat 31 Oct 2020 01:35:01 PM CET, Vladimir Sementsov-Ogievskiy wrote: >> - QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) { > /* ... */ >> + QLIST_FOREACH_SAFE(c, &base->parents, next_parent, next) { > > I also wonder, is top->parents and base->parents guaranteed to be the > same list in this case? > > If not you could store the list of top->parents before calling > bdrv_replace_node() and use it afterwards. > > QLIST_FOREACH(c, &top->parents, next_parent) { > parents = g_slist_prepend(parents, c); > } > > Berto > Hmm... We should not touch other parents of base, which it had prior to bdrv_replace_node(). On the other hand, it should be safe to call update_filename for not-updated parents.. And we should keep in mind that bdrv_replace_node may replace not all children. Still, in this case we must replace all of them. Seems, we need additional argument for bdrv_replace_node() to fail, if it want's to skip some children replacement. So, I'm going to resend to add this parameter to bdrv_replace_node() and will use your suggestion to be stricter on what we update, thanks!
diff --git a/block.c b/block.c index 9538af4884..bd9f4e534b 100644 --- a/block.c +++ b/block.c @@ -4958,36 +4958,30 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base, backing_file_str = base->filename; } - QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) { - /* Check whether we are allowed to switch c from top to base */ - GSList *ignore_children = g_slist_prepend(NULL, c); - ret = bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm, - ignore_children, NULL, &local_err); - g_slist_free(ignore_children); - if (ret < 0) { - error_report_err(local_err); - goto exit; - } + bdrv_replace_node(top, base, &local_err); + if (local_err) { + error_report_err(local_err); + goto exit; + } - /* If so, update the backing file path in the image file */ + QLIST_FOREACH_SAFE(c, &base->parents, next_parent, next) { if (c->klass->update_filename) { ret = c->klass->update_filename(c, base, backing_file_str, &local_err); if (ret < 0) { - bdrv_abort_perm_update(base); + /* + * TODO: Actually, we want to rollback all previous iterations + * of this loop, and (which is almost impossible) previous + * bdrv_replace_node()... + * + * Note, that c->klass->update_filename may lead to permission + * update, so it's a bad idea to call it inside permission + * update transaction of bdrv_replace_node. + */ error_report_err(local_err); goto exit; } } - - /* - * Do the actual switch in the in-memory graph. - * Completes bdrv_check_update_perm() transaction internally. - * c->frozen is false, we have checked that above. - */ - bdrv_ref(base); - bdrv_replace_child(c, base); - bdrv_unref(top); } if (update_inherits_from) {
First, permission update loop tries to do iterations transactionally, but the whole update is not transactional: nobody roll-back successful loop iterations when some iteration fails. Second, in the iteration we have nested permission update: c->klass->update_filename may point to bdrv_child_cb_update_filename() which calls bdrv_backing_update_filename(), which may do node reopen to RW. Permission update system is not prepared to nested updates, at least it has intermediate permission-update state stored in BdrvChild structures: has_backup_perm, backup_perm and backup_shared_perm. So, let's first do bdrv_replace_node() (which is more transactional than open-coded update in bdrv_drop_intermediate()) and then call update_filename() in separate. We still do not rollback changes in case of update_filename() failure but it's not much worse than pre-patch behavior. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> --- block.c | 36 +++++++++++++++--------------------- 1 file changed, 15 insertions(+), 21 deletions(-)