[RFC,0/7] block-backend: Introduce I/O hang

Message ID	20200927130420.1095-1-fangying1@huawei.com

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C6E1E2389F
From: Ying Fang <fangying1@huawei.com>
To: <qemu-devel@nongnu.org>
Subject: [RFC PATCH 0/7] block-backend: Introduce I/O hang
Date: Sun, 27 Sep 2020 21:04:13 +0800
Message-ID: <20200927130420.1095-1-fangying1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
Received-SPF: pass client-ip=45.249.212.32;
	envelope-from=fangying1@huawei.com; helo=huawei.com
X-Spam_score_int: -41
X-Spam_score: -4.2
X-Spam_bar: ----
X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3,
	RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_PASS=-0.001,
	SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: kwolf@redhat.com, Ying Fang <fangying1@huawei.com>,
	zhang.zhanghailiang@huawei.com, mreitz@redhat.com
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>

Message

Ying Fang Sept. 27, 2020, 1:04 p.m. UTC

A VM in a cloud environment may use a virtual disk as its backend storage,
and there are usually filesystems on the virtual block device. When the
backend storage is temporarily down, any I/O issued to the virtual block
device fails with an error. For example, an error in an ext4 filesystem can
make the filesystem read-only. However, cloud backend storage can often
recover quickly. For example, an IP-SAN may go down due to a network failure
and come back online soon after the network recovers. The error in the
filesystem, however, may not be recoverable without a device reattach or a
system restart. So an I/O rehandle is needed to implement a self-healing
mechanism.

This patch series proposes a feature called I/O hang. It can rehandle AIOs
that fail with EIO without sending the error back to the guest. From the
guest's point of view, it looks as if the I/O is hanging and has not
returned. With this feature enabled, the guest can resume running smoothly
once I/O recovers.
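
The idea of rehandling a failed request instead of reporting the error can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from this series: FakeBackend, fake_submit() and submit_with_rehandle() are hypothetical names, and a retry budget stands in for the I/O hang timeout that patch 3/7 adds.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for backend storage that is down for a while. */
typedef struct FakeBackend {
    int failures_left;   /* submissions that will still fail with EIO */
} FakeBackend;

/* Submit one request: 0 on success, -EIO while the backend is down. */
static int fake_submit(FakeBackend *be)
{
    if (be->failures_left > 0) {
        be->failures_left--;
        return -EIO;
    }
    return 0;
}

/*
 * Resubmit a request that fails with EIO instead of reporting the error.
 * max_retries models the hang timeout: once the budget is exhausted, the
 * error is finally returned to the caller (the guest, in the real series).
 * *attempts receives the total number of submissions made.
 */
static int submit_with_rehandle(FakeBackend *be, int max_retries, int *attempts)
{
    int ret;

    assert(max_retries >= 0);
    *attempts = 0;
    do {
        ret = fake_submit(be);
        (*attempts)++;
    } while (ret == -EIO && *attempts <= max_retries);

    return ret;
}
```

With a backend that fails three times and a budget of five retries, the caller never sees the error and the request completes on the fourth submission. With a backend that stays down longer than the budget, -EIO is returned after the initial submission plus five retries. A real implementation would retry asynchronously (for example from a timer) rather than in a loop.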


Ying Fang (7):
  block-backend: introduce I/O rehandle info
  block-backend: rehandle block aios when EIO
  block-backend: add I/O hang timeout
  block-backend: add I/O hang drain when disbale
  virtio-blk: disable I/O hang when resetting
  qemu-option: add I/O hang timeout option
  qapi: add I/O hang and I/O hang timeout qapi event

 block/block-backend.c          | 285 +++++++++++++++++++++++++++++++++
 blockdev.c                     |  11 ++
 hw/block/virtio-blk.c          |   8 +
 include/sysemu/block-backend.h |   5 +
 qapi/block-core.json           |  26 +++
 5 files changed, 335 insertions(+)

Comments

no-reply@patchew.org Sept. 27, 2020, 1:27 p.m. UTC | #1

Patchew URL: https://patchew.org/QEMU/20200927130420.1095-1-fangying1@huawei.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

Host machine cpu: x86_64
Target machine cpu family: x86
Target machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Compiling C object libblock.fa.p/block_vdi.c.obj
Compiling C object libblock.fa.p/block_cloop.c.obj
../src/block/block-backend.c: In function 'blk_new':
../src/block/block-backend.c:386:5: error: implicit declaration of function 'atomic_set'; did you mean 'qatomic_set'? [-Werror=implicit-function-declaration]
  386 |     atomic_set(&blk->reinfo.in_flight, 0);
      |     ^~~~~~~~~~
      |     qatomic_set
../src/block/block-backend.c:386:5: error: nested extern declaration of 'atomic_set' [-Werror=nested-externs]
In file included from /usr/x86_64-w64-mingw32/sys-root/mingw/lib/glib-2.0/include/glibconfig.h:9,
                 from /usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/gtypes.h:32,
                 from /usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/galloca.h:32,
---
                 from /tmp/qemu-test/src/include/qemu/osdep.h:126,
                 from ../src/block/block-backend.c:13:
../src/block/block-backend.c: In function 'blk_delete':
../src/block/block-backend.c:479:12: error: implicit declaration of function 'atomic_read'; did you mean 'qatomic_read'? [-Werror=implicit-function-declaration]
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |            ^~~~~~~~~~~
/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/gmacros.h:928:8: note: in definition of macro '_G_BOOLEAN_EXPR'
---
../src/block/block-backend.c:479:5: note: in expansion of macro 'assert'
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |     ^~~~~~
../src/block/block-backend.c:479:12: error: nested extern declaration of 'atomic_read' [-Werror=nested-externs]
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |            ^~~~~~~~~~~
/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/gmacros.h:928:8: note: in definition of macro '_G_BOOLEAN_EXPR'
---
  479 |     assert(atomic_read(&blk->reinfo.in_flight) == 0);
      |     ^~~~~~
../src/block/block-backend.c: In function 'blk_rehandle_insert_aiocb':
../src/block/block-backend.c:2459:5: error: implicit declaration of function 'atomic_inc'; did you mean 'qatomic_inc'? [-Werror=implicit-function-declaration]
 2459 |     atomic_inc(&blk->reinfo.in_flight);
      |     ^~~~~~~~~~
      |     qatomic_inc
../src/block/block-backend.c:2459:5: error: nested extern declaration of 'atomic_inc' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_rehandle_remove_aiocb':
../src/block/block-backend.c:2468:5: error: implicit declaration of function 'atomic_dec'; did you mean 'qatomic_dec'? [-Werror=implicit-function-declaration]
 2468 |     atomic_dec(&blk->reinfo.in_flight);
      |     ^~~~~~~~~~
      |     qatomic_dec
../src/block/block-backend.c:2468:5: error: nested extern declaration of 'atomic_dec' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [Makefile.ninja:888: libblock.fa.p/block_block-backend.c.obj] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=4c3aba1eb35b428ca91e79a610e892a6', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-1pm_eno6/src/docker-src.2020-09-27-09.21.55.30331:/var/tmp/qemu:z,ro', 'qemu/fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=4c3aba1eb35b428ca91e79a610e892a6
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-1pm_eno6/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    5m11.016s
user    0m19.775s


The full log is available at
http://patchew.org/logs/20200927130420.1095-1-fangying1@huawei.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
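
The compiler hints in the log above point at the likely fix: the tree this series was tested against spells the atomic helpers qatomic_*, so the four calls on reinfo.in_flight need the new names. The following is a minimal sketch of that rename, not the actual patch. RehandleInfo and the reinfo_* helpers are hypothetical, and the macro definitions are simplified stand-ins (GCC/Clang builtins) so the example compiles outside QEMU; inside QEMU the names come from "qemu/atomic.h".

```c
#include <assert.h>

/*
 * Simplified stand-ins, assuming GCC or Clang. These approximate the QEMU
 * helpers of the same name and are NOT copies of the QEMU definitions.
 */
#define qatomic_set(ptr, val) __atomic_store_n(ptr, val, __ATOMIC_RELAXED)
#define qatomic_read(ptr)     __atomic_load_n(ptr, __ATOMIC_RELAXED)
#define qatomic_inc(ptr)      ((void)__atomic_fetch_add(ptr, 1, __ATOMIC_SEQ_CST))
#define qatomic_dec(ptr)      ((void)__atomic_fetch_sub(ptr, 1, __ATOMIC_SEQ_CST))

/* Hypothetical reduction of the rehandle bookkeeping seen in the log. */
typedef struct RehandleInfo {
    int in_flight;
} RehandleInfo;

static void reinfo_init(RehandleInfo *reinfo)
{
    qatomic_set(&reinfo->in_flight, 0);       /* was: atomic_set(...) */
}

static void reinfo_insert(RehandleInfo *reinfo)
{
    qatomic_inc(&reinfo->in_flight);          /* was: atomic_inc(...) */
}

static void reinfo_remove(RehandleInfo *reinfo)
{
    qatomic_dec(&reinfo->in_flight);          /* was: atomic_dec(...) */
}

static int reinfo_in_flight(RehandleInfo *reinfo)
{
    return qatomic_read(&reinfo->in_flight);  /* was: atomic_read(...) */
}
```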

no-reply@patchew.org Sept. 27, 2020, 1:32 p.m. UTC | #2

Patchew URL: https://patchew.org/QEMU/20200927130420.1095-1-fangying1@huawei.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

C linker for the host machine: cc ld.bfd 2.27-43
Host machine cpu family: x86_64
Host machine cpu: x86_64
../src/meson.build:10: WARNING: Module unstable-keyval has no backwards or forwards compatibility and might not exist in future releases.
Program sh found: YES
Program python3 found: YES (/usr/bin/python3)
Configuring ninjatool using configuration
---
Compiling C object libblock.fa.p/block_commit.c.o
Compiling C object libblock.fa.p/block_vhdx-endian.c.o
../src/block/block-backend.c: In function 'blk_new':
../src/block/block-backend.c:386:5: error: implicit declaration of function 'atomic_set' [-Werror=implicit-function-declaration]
     atomic_set(&blk->reinfo.in_flight, 0);
     ^
../src/block/block-backend.c:386:5: error: nested extern declaration of 'atomic_set' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_delete':
../src/block/block-backend.c:479:5: error: implicit declaration of function 'atomic_read' [-Werror=implicit-function-declaration]
     assert(atomic_read(&blk->reinfo.in_flight) == 0);
     ^
../src/block/block-backend.c:479:5: error: nested extern declaration of 'atomic_read' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_rehandle_insert_aiocb':
../src/block/block-backend.c:2459:5: error: implicit declaration of function 'atomic_inc' [-Werror=implicit-function-declaration]
     atomic_inc(&blk->reinfo.in_flight);
     ^
../src/block/block-backend.c:2459:5: error: nested extern declaration of 'atomic_inc' [-Werror=nested-externs]
../src/block/block-backend.c: In function 'blk_rehandle_remove_aiocb':
../src/block/block-backend.c:2468:5: error: implicit declaration of function 'atomic_dec' [-Werror=implicit-function-declaration]
     atomic_dec(&blk->reinfo.in_flight);
     ^
../src/block/block-backend.c:2468:5: error: nested extern declaration of 'atomic_dec' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [libblock.fa.p/block_block-backend.c.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 709, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--rm', '--label', 'com.qemu.instance.uuid=39951e04bf3b4809a4afe5755ca771f5', '-u', '1001', '--security-opt', 'seccomp=unconfined', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-5qkiksy1/src/docker-src.2020-09-27-09.28.20.6987:/var/tmp/qemu:z,ro', 'qemu/centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=39951e04bf3b4809a4afe5755ca771f5
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-5qkiksy1/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    4m6.755s
user    0m23.139s


The full log is available at
http://patchew.org/logs/20200927130420.1095-1-fangying1@huawei.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Kevin Wolf Sept. 28, 2020, 10:57 a.m. UTC | #3

On 27.09.2020 at 15:04, Ying Fang wrote:
> A VM in a cloud environment may use a virtual disk as its backend storage,
> and there are usually filesystems on the virtual block device. When the
> backend storage is temporarily down, any I/O issued to the virtual block
> device fails with an error. For example, an error in an ext4 filesystem can
> make the filesystem read-only. However, cloud backend storage can often
> recover quickly. For example, an IP-SAN may go down due to a network failure
> and come back online soon after the network recovers. The error in the
> filesystem, however, may not be recoverable without a device reattach or a
> system restart. So an I/O rehandle is needed to implement a self-healing
> mechanism.
>
> This patch series proposes a feature called I/O hang. It can rehandle AIOs
> that fail with EIO without sending the error back to the guest. From the
> guest's point of view, it looks as if the I/O is hanging and has not
> returned. With this feature enabled, the guest can resume running smoothly
> once I/O recovers.

What is the problem with setting werror=stop and rerror=stop for the
device? Is it that QEMU won't automatically retry, but management tool
interaction is required to resume the guest?

I haven't checked your patches in detail yet, but implementing this
functionality in the backend means that blk_drain() will hang (or if it
doesn't hang, it doesn't do what it's supposed to do), making the whole
QEMU process unresponsive until the I/O succeeds again. Amongst others,
this would make it impossible to migrate away from a host with storage
problems.

Kevin

cenjiahui Sept. 29, 2020, 9:48 a.m. UTC | #4

On 2020/9/28 18:57, Kevin Wolf wrote:
> On 27.09.2020 at 15:04, Ying Fang wrote:
>> A VM in a cloud environment may use a virtual disk as its backend storage,
>> and there are usually filesystems on the virtual block device. When the
>> backend storage is temporarily down, any I/O issued to the virtual block
>> device fails with an error. For example, an error in an ext4 filesystem can
>> make the filesystem read-only. However, cloud backend storage can often
>> recover quickly. For example, an IP-SAN may go down due to a network failure
>> and come back online soon after the network recovers. The error in the
>> filesystem, however, may not be recoverable without a device reattach or a
>> system restart. So an I/O rehandle is needed to implement a self-healing
>> mechanism.
>>
>> This patch series proposes a feature called I/O hang. It can rehandle AIOs
>> that fail with EIO without sending the error back to the guest. From the
>> guest's point of view, it looks as if the I/O is hanging and has not
>> returned. With this feature enabled, the guest can resume running smoothly
>> once I/O recovers.
>
> What is the problem with setting werror=stop and rerror=stop for the
> device? Is it that QEMU won't automatically retry, but management tool
> interaction is required to resume the guest?

With only werror=stop and rerror=stop set, an I/O error pauses the whole VM
and makes it unavailable. Moreover, the VM does not recover until the
management tool manually resumes it after the backend storage recovers.

With the I/O hang mechanism, we can temporarily hang the I/Os, while other
services unrelated to the hung virtual block device, such as networking, keep
working. Besides, once the backend storage recovers, our I/O rehandle
mechanism automatically completes the hung I/Os and the VM continues its work.

> I haven't checked your patches in detail yet, but implementing this
> functionality in the backend means that blk_drain() will hang (or if it
> doesn't hang, it doesn't do what it's supposed to do), making the whole
> QEMU process unresponsive until the I/O succeeds again. Amongst others,
> this would make it impossible to migrate away from a host with storage
> problems.

What if we disable rehandle before blk_drain()?

As for migration: if the storage recovers during the migration iteration
phase, the migration can succeed. But if the storage has still not recovered
by the migration completion phase, the migration should fail and be cancelled.

Thanks,
Jiahui Cen
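
The suggestion to disable rehandle before draining can be modelled in a few lines. This is a toy sketch under assumptions, not the behaviour of blk_drain() or of patch 4/7: FakeBlk, fake_poll() and fake_drain() are hypothetical names, and requests are reduced to counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a block backend with rehandling. */
typedef struct FakeBlk {
    bool rehandle_enabled;  /* keep failed requests queued for retry */
    bool storage_up;        /* whether the simulated storage works */
    int in_flight;          /* requests not yet completed */
    int failed;             /* requests completed with an error */
} FakeBlk;

/* One pass over the queue: try to complete every in-flight request once. */
static void fake_poll(FakeBlk *blk)
{
    int pending = blk->in_flight;

    for (int i = 0; i < pending; i++) {
        if (blk->storage_up) {
            blk->in_flight--;            /* completes successfully */
        } else if (!blk->rehandle_enabled) {
            blk->in_flight--;            /* completes with an error */
            blk->failed++;
        }
        /* Otherwise the storage is down and the request stays queued. */
    }
}

/*
 * Drain all requests. Rehandling is switched off first, so every request
 * either succeeds or fails and the loop is guaranteed to terminate even
 * while the storage is down.
 */
static void fake_drain(FakeBlk *blk)
{
    blk->rehandle_enabled = false;
    while (blk->in_flight > 0) {
        fake_poll(blk);
    }
}
```

In this model, polling with rehandle enabled and the storage down completes nothing, which is the hang Kevin describes. Draining with rehandle disabled always finishes, at the cost of completing the pending requests with errors.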