diff mbox series

[1/2] block: make bdrv_drop_intermediate() less wrong

Message ID 20201031123502.4558-2-vsementsov@virtuozzo.com
State New
Headers show
Series [1/2] block: make bdrv_drop_intermediate() less wrong | expand

Commit Message

Vladimir Sementsov-Ogievskiy Oct. 31, 2020, 12:35 p.m. UTC
First, permission update loop tries to do iterations transactionally,
but the whole update is not transactional: nobody roll-back successful
loop iterations when some iteration fails.

Second, in the iteration we have nested permission update:
c->klass->update_filename may point to bdrv_child_cb_update_filename()
which calls bdrv_backing_update_filename(), which may do node reopen to
RW.

Permission update system is not prepared to nested updates, at least it
has intermediate permission-update state stored in BdrvChild
structures: has_backup_perm, backup_perm and backup_shared_perm.

So, let's first do bdrv_replace_node() (which is more transactional
than open-coded update in bdrv_drop_intermediate()) and then call
update_filename() in separate. We still do not rollback changes in case
of update_filename() failure but it's not much worse than pre-patch
behavior.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c | 36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

Comments

Max Reitz Nov. 5, 2020, 1:22 p.m. UTC | #1
On 31.10.20 13:35, Vladimir Sementsov-Ogievskiy wrote:
> First, permission update loop tries to do iterations transactionally,
> but the whole update is not transactional: nobody roll-back successful
> loop iterations when some iteration fails.

Indeed.  Another thing that should be noted:

bdrv_check_update_perm() doc:
 > Needs to be followed by a call to either bdrv_set_perm()
 > or bdrv_abort_perm_update().

And besides rolling back on error, we don’t commit on success either.

> Second, in the iteration we have nested permission update:
> c->klass->update_filename may point to bdrv_child_cb_update_filename()
> which calls bdrv_backing_update_filename(), which may do node reopen to
> RW.
> 
> Permission update system is not prepared to nested updates, at least it
> has intermediate permission-update state stored in BdrvChild
> structures: has_backup_perm, backup_perm and backup_shared_perm.
> 
> So, let's first do bdrv_replace_node() (which is more transactional
> than open-coded update in bdrv_drop_intermediate()) and then call
> update_filename() in separate. We still do not rollback changes in case
> of update_filename() failure but it's not much worse than pre-patch
> behavior.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block.c | 36 +++++++++++++++---------------------
>   1 file changed, 15 insertions(+), 21 deletions(-)

And it’s also much simpler and clearer now.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Max Reitz Nov. 5, 2020, 1:24 p.m. UTC | #2
On 05.11.20 14:22, Max Reitz wrote:
> On 31.10.20 13:35, Vladimir Sementsov-Ogievskiy wrote:
>> First, permission update loop tries to do iterations transactionally,
>> but the whole update is not transactional: nobody roll-back successful
>> loop iterations when some iteration fails.

[...]

> And besides rolling back on error, we don’t commit on success either.

(Oh, we do, through bdrv_replace_child().  Well.)

Max
Alberto Garcia Nov. 5, 2020, 3:03 p.m. UTC | #3
On Sat 31 Oct 2020 01:35:01 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> @@ -4958,36 +4958,30 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
>          backing_file_str = base->filename;
>      }
>  
> -    QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) {
> -        /* Check whether we are allowed to switch c from top to base */
> -        GSList *ignore_children = g_slist_prepend(NULL, c);
> -        ret = bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm,
> -                                     ignore_children, NULL, &local_err);
> -        g_slist_free(ignore_children);
> -        if (ret < 0) {
> -            error_report_err(local_err);
> -            goto exit;
> -        }
> +    bdrv_replace_node(top, base, &local_err);
> +    if (local_err) {
> +        error_report_err(local_err);
> +        goto exit;
> +    }

At the beginning of this function there's a check for c->frozen. I think
you can remove it safely because you also have it in bdrv_replace_node()

Berto
Alberto Garcia Nov. 5, 2020, 3:14 p.m. UTC | #4
On Sat 31 Oct 2020 01:35:01 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> -    QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) {
  /* ... */
> +    QLIST_FOREACH_SAFE(c, &base->parents, next_parent, next) {

I also wonder, is top->parents and base->parents guaranteed to be the
same list in this case?

If not you could store the list of top->parents before calling
bdrv_replace_node() and use it afterwards.

    QLIST_FOREACH(c, &top->parents, next_parent) {
        parents = g_slist_prepend(parents, c);
    }

Berto
Vladimir Sementsov-Ogievskiy Nov. 6, 2020, 9:26 a.m. UTC | #5
05.11.2020 18:14, Alberto Garcia wrote:
> On Sat 31 Oct 2020 01:35:01 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>> -    QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) {
>    /* ... */
>> +    QLIST_FOREACH_SAFE(c, &base->parents, next_parent, next) {
> 
> I also wonder, is top->parents and base->parents guaranteed to be the
> same list in this case?
> 
> If not you could store the list of top->parents before calling
> bdrv_replace_node() and use it afterwards.
> 
>      QLIST_FOREACH(c, &top->parents, next_parent) {
>          parents = g_slist_prepend(parents, c);
>      }
> 
> Berto
> 

Hmm... We should not touch other parents of base, which it had prior to bdrv_replace_node().

On the other hand, it should be safe to call update_filename for not-updated parents..

And we should keep in mind that bdrv_replace_node may replace not all children. Still, in this
case we must replace all of them. Seems, we need additional argument for bdrv_replace_node()
to fail, if it want's to skip some children replacement.

So, I'm going to resend to add this parameter to bdrv_replace_node() and will use your
suggestion to be stricter on what we update, thanks!
diff mbox series

Patch

diff --git a/block.c b/block.c
index 9538af4884..bd9f4e534b 100644
--- a/block.c
+++ b/block.c
@@ -4958,36 +4958,30 @@  int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
         backing_file_str = base->filename;
     }
 
-    QLIST_FOREACH_SAFE(c, &top->parents, next_parent, next) {
-        /* Check whether we are allowed to switch c from top to base */
-        GSList *ignore_children = g_slist_prepend(NULL, c);
-        ret = bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm,
-                                     ignore_children, NULL, &local_err);
-        g_slist_free(ignore_children);
-        if (ret < 0) {
-            error_report_err(local_err);
-            goto exit;
-        }
+    bdrv_replace_node(top, base, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        goto exit;
+    }
 
-        /* If so, update the backing file path in the image file */
+    QLIST_FOREACH_SAFE(c, &base->parents, next_parent, next) {
         if (c->klass->update_filename) {
             ret = c->klass->update_filename(c, base, backing_file_str,
                                             &local_err);
             if (ret < 0) {
-                bdrv_abort_perm_update(base);
+                /*
+                 * TODO: Actually, we want to rollback all previous iterations
+                 * of this loop, and (which is almost impossible) previous
+                 * bdrv_replace_node()...
+                 *
+                 * Note, that c->klass->update_filename may lead to permission
+                 * update, so it's a bad idea to call it inside permission
+                 * update transaction of bdrv_replace_node.
+                 */
                 error_report_err(local_err);
                 goto exit;
             }
         }
-
-        /*
-         * Do the actual switch in the in-memory graph.
-         * Completes bdrv_check_update_perm() transaction internally.
-         * c->frozen is false, we have checked that above.
-         */
-        bdrv_ref(base);
-        bdrv_replace_child(c, base);
-        bdrv_unref(top);
     }
 
     if (update_inherits_from) {