Message ID | 20200922104932.46384-3-mreitz@redhat.com |
---|---|
State | New |
Headers | show |
Series | block/export: Allow exporting BDSs via FUSE | expand |
Am 22.09.2020 um 12:49 hat Max Reitz geschrieben: > block-export-add type=fuse allows mounting block graph nodes via FUSE on > some existing regular file. That file should then appears like a raw > disk image, and accesses to it result in accesses to the exported BDS. > > Right now, we only implement the necessary block export functions to set > it up and shut it down. We do not implement any access functions, so > accessing the mount point only results in errors. This will be > addressed by a followup patch. > > We keep a hash table of exported mount points, because we want to be > able to detect when users try to use a mount point twice. This is > because we invoke stat() to check whether the given mount point is a > regular file, but if that file is served by ourselves (because it is > already used as a mount point), then this stat() would have to be served > by ourselves, too, which is impossible to do while we (as the caller) > are waiting for it to settle. Therefore, keep track of mount point > paths to at least catch the most obvious instances of that problem. > > Signed-off-by: Max Reitz <mreitz@redhat.com> > --- > qapi/block-export.json | 24 +++- > include/block/fuse.h | 30 +++++ > block.c | 1 + > block/export/export.c | 4 + > block/export/fuse.c | 253 +++++++++++++++++++++++++++++++++++++++ > block/export/meson.build | 1 + > 6 files changed, 311 insertions(+), 2 deletions(-) > create mode 100644 include/block/fuse.h > create mode 100644 block/export/fuse.c > > diff --git a/qapi/block-export.json b/qapi/block-export.json > index 0c045dfe7b..cb5bd54cbf 100644 > --- a/qapi/block-export.json > +++ b/qapi/block-export.json > @@ -174,6 +174,21 @@ > ## > { 'command': 'nbd-server-stop' } > > +## > +# @BlockExportOptionsFuse: > +# > +# Options for exporting a block graph node on some (file) mountpoint > +# as a raw image. > +# > +# @mountpoint: Path on which to export the block device via FUSE. > +# This must point to an existing regular file. > +# > +# Since: 5.2 > +## > +{ 'struct': 'BlockExportOptionsFuse', > + 'data': { 'mountpoint': 'str' }, > + 'if': 'defined(CONFIG_FUSE)' } > + > ## > # @BlockExportType: > # > @@ -181,10 +196,13 @@ > # > # @nbd: NBD export > # > +# @fuse: Fuse export (since: 5.2) s/Fuse/FUSE/? > +# > # Since: 4.2 > ## > { 'enum': 'BlockExportType', > - 'data': [ 'nbd' ] } > + 'data': [ 'nbd', > + { 'name': 'fuse', 'if': 'defined(CONFIG_FUSE)' } ] } > > ## > # @BlockExportOptions: > @@ -213,7 +231,9 @@ > '*writethrough': 'bool' }, > 'discriminator': 'type', > 'data': { > - 'nbd': 'BlockExportOptionsNbd' > + 'nbd': 'BlockExportOptionsNbd', > + 'fuse': { 'type': 'BlockExportOptionsFuse', > + 'if': 'defined(CONFIG_FUSE)' } > } } > > ## > diff --git a/include/block/fuse.h b/include/block/fuse.h > new file mode 100644 > index 0000000000..ffa91fe364 > --- /dev/null > +++ b/include/block/fuse.h > @@ -0,0 +1,30 @@ > +/* > + * Present a block device as a raw image through FUSE > + * > + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com> > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; under version 2 or later of the License. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, see <http://www.gnu.org/licenses/>. > + */ > + > +#ifndef BLOCK_FUSE_H > +#define BLOCK_FUSE_H > + > +#ifdef CONFIG_FUSE > + > +#include "block/export.h" > + > +extern const BlockExportDriver blk_exp_fuse; > + > +#endif /* CONFIG_FUSE */ > + > +#endif > diff --git a/block.c b/block.c > index acde53f92a..7bf45f6ede 100644 > --- a/block.c > +++ b/block.c > @@ -26,6 +26,7 @@ > #include "block/trace.h" > #include "block/block_int.h" > #include "block/blockjob.h" > +#include "block/fuse.h" > #include "block/nbd.h" > #include "block/qdict.h" > #include "qemu/error-report.h" > diff --git a/block/export/export.c b/block/export/export.c > index 64d39a6934..ba866d6ba7 100644 > --- a/block/export/export.c > +++ b/block/export/export.c > @@ -16,6 +16,7 @@ > #include "block/block.h" > #include "sysemu/block-backend.h" > #include "block/export.h" > +#include "block/fuse.h" > #include "block/nbd.h" > #include "qapi/error.h" > #include "qapi/qapi-commands-block-export.h" > @@ -24,6 +25,9 @@ > > static const BlockExportDriver *blk_exp_drivers[] = { > &blk_exp_nbd, > +#ifdef CONFIG_FUSE > + &blk_exp_fuse, > +#endif > }; > > /* Only accessed from the main thread */ > diff --git a/block/export/fuse.c b/block/export/fuse.c > new file mode 100644 > index 0000000000..75f11d2514 > --- /dev/null > +++ b/block/export/fuse.c > @@ -0,0 +1,253 @@ > +/* > + * Present a block device as a raw image through FUSE > + * > + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com> > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; under version 2 or later of the License. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, see <http://www.gnu.org/licenses/>. > + */ > + > +#define FUSE_USE_VERSION 31 > + > +#include "qemu/osdep.h" > +#include "block/aio.h" > +#include "block/block.h" > +#include "block/export.h" > +#include "block/fuse.h" > +#include "block/qapi.h" > +#include "qapi/error.h" > +#include "qapi/qapi-commands-block.h" > +#include "sysemu/block-backend.h" > + > +#include <fuse.h> > +#include <fuse_lowlevel.h> > + > + > +typedef struct FuseExport { > + BlockExport common; > + > + struct fuse_session *fuse_session; > + struct fuse_buf fuse_buf; > + bool mounted, fd_handler_set_up; > + > + char *mountpoint; > + bool writable; > +} FuseExport; > + > +static GHashTable *exports; > +static const struct fuse_lowlevel_ops fuse_ops; > + > +static void fuse_export_shutdown(BlockExport *exp); > +static void fuse_export_delete(BlockExport *exp); > + > +static void init_fuse(void); > +static int setup_fuse_export(FuseExport *exp, const char *mountpoint, > + Error **errp); > +static void read_from_fuse_export(void *opaque); > + > +static bool is_regular_file(const char *path, Error **errp); > + > + > +static int fuse_export_create(BlockExport *blk_exp, > + BlockExportOptions *blk_exp_args, > + Error **errp) > +{ > + FuseExport *exp = container_of(blk_exp, FuseExport, common); > + BlockExportOptionsFuse *args = &blk_exp_args->u.fuse; > + int ret; > + > + assert(blk_exp_args->type == BLOCK_EXPORT_TYPE_FUSE); > + > + init_fuse(); > + > + /* > + * It is important to do this check before calling is_regular_file() -- > + * that function will do a stat(), which we would have to handle if we > + * already exported something on @mountpoint. But we cannot, because > + * we are currently caught up here. > + * (Note that ideally we would want to resolve relative paths here, > + * but bdrv_make_absolute_filename() might do the wrong thing for > + * paths that contain colons, and realpath() would resolve symlinks, > + * which we do not want: The mount point is not going to be the > + * symlink's destination, but the link itself.) > + * So this will not catch all potential clashes, but hopefully at > + * least the most common one of specifying exactly the same path > + * string twice. > + */ > + if (g_hash_table_contains(exports, args->mountpoint)) { > + error_setg(errp, "There already is a FUSE export on '%s'", > + args->mountpoint); > + ret = -EEXIST; > + goto fail; > + } I guess the proper solution would be to make block export creation asynchronous so that we can actually serve that request in the background. Something for another day... > + > + if (!is_regular_file(args->mountpoint, errp)) { > + ret = -EINVAL; > + goto fail; > + } > + > + exp->mountpoint = g_strdup(args->mountpoint); > + exp->writable = blk_exp_args->writable; > + > + ret = setup_fuse_export(exp, args->mountpoint, errp); > + if (ret < 0) { > + goto fail; > + } > + > + g_hash_table_insert(exports, g_strdup(args->mountpoint), NULL); > + return 0; > + > +fail: > + fuse_export_shutdown(blk_exp); > + fuse_export_delete(blk_exp); Why fuse_export_shutdown()? The only thing to possibly undo at this point is allocating exp->mountpoint, which is freed in fuse_export_delete(). Or can setup_fuse_export() fail without cleaning up? I hope not. > + return ret; > +} > + > +/** > + * Ensure that the global FUSE context is set up. > + */ > +static void init_fuse(void) > +{ > + if (exports) { > + return; > + } > + > + exports = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, NULL); > +} > + > +/** > + * Create exp->fuse_session and mount it. > + */ > +static int setup_fuse_export(FuseExport *exp, const char *mountpoint, > + Error **errp) > +{ > + const char *fuse_argv[2]; > + struct fuse_args fuse_args; > + int ret; > + > + fuse_argv[0] = ""; /* Dummy program name */ > + fuse_argv[1] = NULL; > + fuse_args = (struct fuse_args)FUSE_ARGS_INIT(1, (char **)fuse_argv); > + > + exp->fuse_session = fuse_session_new(&fuse_args, &fuse_ops, > + sizeof(fuse_ops), exp); > + if (!exp->fuse_session) { > + error_setg(errp, "Failed to set up FUSE session"); > + return -EIO; > + } > + > + ret = fuse_session_mount(exp->fuse_session, mountpoint); > + if (ret < 0) { > + error_setg(errp, "Failed to mount FUSE session to export"); > + return -EIO; Oh, it can. :-( I think it would be better to call fuse_export_shutdown() here so that the function doesn't change the state when it returns failure. > + } > + exp->mounted = true; > + > + aio_set_fd_handler(exp->common.ctx, > + fuse_session_fd(exp->fuse_session), true, > + read_from_fuse_export, NULL, NULL, exp); > + exp->fd_handler_set_up = true; > + > + return 0; > +} > + > +/** > + * Callback to be invoked when the FUSE session FD can be read from. > + * (This is basically the FUSE event loop.) > + */ > +static void read_from_fuse_export(void *opaque) > +{ > + FuseExport *exp = opaque; > + int ret; > + > + blk_exp_ref(&exp->common); > + > + ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf); The fuse_session_loop() implementation seems to imply that we should retry on EINTR. > + if (ret < 0) { > + goto out; > + } > + > + fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf); > + > +out: > + blk_exp_unref(&exp->common); > +} > + > +static void fuse_export_shutdown(BlockExport *blk_exp) > +{ > + FuseExport *exp = container_of(blk_exp, FuseExport, common); > + > + if (exp->fuse_session) { > + fuse_session_exit(exp->fuse_session); > + > + if (exp->mounted) { > + fuse_session_unmount(exp->fuse_session); > + exp->mounted = false; > + } > + > + if (exp->fd_handler_set_up) { > + aio_set_fd_handler(exp->common.ctx, > + fuse_session_fd(exp->fuse_session), true, > + NULL, NULL, NULL, NULL); > + exp->fd_handler_set_up = false; > + } > + > + fuse_session_destroy(exp->fuse_session); > + exp->fuse_session = NULL; What happens if a request is still in flight? Oh, can't happen because the driver is fully synchronous after this series. Fair enough, making it asynchronous can come on top of it. However, I think we should move fuse_session_unmount() and fuse_session_destroy() to .delete instead of .shutdown to avoid setting up a trap for whoever introduces async requests to the FUSE export. > + } > + > + if (exp->mountpoint) { > + g_hash_table_remove(exports, exp->mountpoint); > + } > +} > + > +static void fuse_export_delete(BlockExport *blk_exp) > +{ > + FuseExport *exp = container_of(blk_exp, FuseExport, common); > + > + free(exp->fuse_buf.mem); > + g_free(exp->mountpoint); > +} > + > +/** > + * Check whether @path points to a regular file. If not, put an > + * appropriate message into *errp. > + */ > +static bool is_regular_file(const char *path, Error **errp) > +{ > + struct stat statbuf; > + int ret; > + > + ret = stat(path, &statbuf); > + if (ret < 0) { > + error_setg_errno(errp, errno, "Failed to stat '%s'", path); > + return false; > + } > + > + if (!S_ISREG(statbuf.st_mode)) { > + error_setg(errp, "'%s' is not a regular file", path); > + return false; > + } > + > + return true; > +} This is obviously racy (TOCTOU), but if I understand correctly, we use it only to provide a nicer error message than if fuse_session_mount() fails, so this is fine. > +static const struct fuse_lowlevel_ops fuse_ops = { > +}; > + > +const BlockExportDriver blk_exp_fuse = { > + .type = BLOCK_EXPORT_TYPE_FUSE, > + .instance_size = sizeof(FuseExport), > + .create = fuse_export_create, > + .delete = fuse_export_delete, > + .request_shutdown = fuse_export_shutdown, > +}; Kevin
On 15.10.20 10:57, Kevin Wolf wrote: > Am 22.09.2020 um 12:49 hat Max Reitz geschrieben: >> block-export-add type=fuse allows mounting block graph nodes via FUSE on >> some existing regular file. That file should then appears like a raw >> disk image, and accesses to it result in accesses to the exported BDS. >> >> Right now, we only implement the necessary block export functions to set >> it up and shut it down. We do not implement any access functions, so >> accessing the mount point only results in errors. This will be >> addressed by a followup patch. >> >> We keep a hash table of exported mount points, because we want to be >> able to detect when users try to use a mount point twice. This is >> because we invoke stat() to check whether the given mount point is a >> regular file, but if that file is served by ourselves (because it is >> already used as a mount point), then this stat() would have to be served >> by ourselves, too, which is impossible to do while we (as the caller) >> are waiting for it to settle. Therefore, keep track of mount point >> paths to at least catch the most obvious instances of that problem. >> >> Signed-off-by: Max Reitz <mreitz@redhat.com> >> --- >> qapi/block-export.json | 24 +++- >> include/block/fuse.h | 30 +++++ >> block.c | 1 + >> block/export/export.c | 4 + >> block/export/fuse.c | 253 +++++++++++++++++++++++++++++++++++++++ >> block/export/meson.build | 1 + >> 6 files changed, 311 insertions(+), 2 deletions(-) >> create mode 100644 include/block/fuse.h >> create mode 100644 block/export/fuse.c >> >> diff --git a/qapi/block-export.json b/qapi/block-export.json >> index 0c045dfe7b..cb5bd54cbf 100644 >> --- a/qapi/block-export.json >> +++ b/qapi/block-export.json >> @@ -174,6 +174,21 @@ >> ## >> { 'command': 'nbd-server-stop' } >> >> +## >> +# @BlockExportOptionsFuse: >> +# >> +# Options for exporting a block graph node on some (file) mountpoint >> +# as a raw image. >> +# >> +# @mountpoint: Path on which to export the block device via FUSE. >> +# This must point to an existing regular file. >> +# >> +# Since: 5.2 >> +## >> +{ 'struct': 'BlockExportOptionsFuse', >> + 'data': { 'mountpoint': 'str' }, >> + 'if': 'defined(CONFIG_FUSE)' } >> + >> ## >> # @BlockExportType: >> # >> @@ -181,10 +196,13 @@ >> # >> # @nbd: NBD export >> # >> +# @fuse: Fuse export (since: 5.2) > > s/Fuse/FUSE/? Tomato, TOMATO. Sure. :) >> +# >> # Since: 4.2 >> ## >> { 'enum': 'BlockExportType', >> - 'data': [ 'nbd' ] } >> + 'data': [ 'nbd', >> + { 'name': 'fuse', 'if': 'defined(CONFIG_FUSE)' } ] } >> >> ## >> # @BlockExportOptions: >> @@ -213,7 +231,9 @@ >> '*writethrough': 'bool' }, >> 'discriminator': 'type', >> 'data': { >> - 'nbd': 'BlockExportOptionsNbd' >> + 'nbd': 'BlockExportOptionsNbd', >> + 'fuse': { 'type': 'BlockExportOptionsFuse', >> + 'if': 'defined(CONFIG_FUSE)' } >> } } >> >> ## >> diff --git a/include/block/fuse.h b/include/block/fuse.h >> new file mode 100644 >> index 0000000000..ffa91fe364 >> --- /dev/null >> +++ b/include/block/fuse.h >> @@ -0,0 +1,30 @@ >> +/* >> + * Present a block device as a raw image through FUSE >> + * >> + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com> >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; under version 2 or later of the License. >> + * >> + * This program is distributed in the hope that it will be useful, >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> + * GNU General Public License for more details. >> + * >> + * You should have received a copy of the GNU General Public License >> + * along with this program; if not, see <http://www.gnu.org/licenses/>. >> + */ >> + >> +#ifndef BLOCK_FUSE_H >> +#define BLOCK_FUSE_H >> + >> +#ifdef CONFIG_FUSE >> + >> +#include "block/export.h" >> + >> +extern const BlockExportDriver blk_exp_fuse; >> + >> +#endif /* CONFIG_FUSE */ >> + >> +#endif >> diff --git a/block.c b/block.c >> index acde53f92a..7bf45f6ede 100644 >> --- a/block.c >> +++ b/block.c >> @@ -26,6 +26,7 @@ >> #include "block/trace.h" >> #include "block/block_int.h" >> #include "block/blockjob.h" >> +#include "block/fuse.h" >> #include "block/nbd.h" >> #include "block/qdict.h" >> #include "qemu/error-report.h" >> diff --git a/block/export/export.c b/block/export/export.c >> index 64d39a6934..ba866d6ba7 100644 >> --- a/block/export/export.c >> +++ b/block/export/export.c >> @@ -16,6 +16,7 @@ >> #include "block/block.h" >> #include "sysemu/block-backend.h" >> #include "block/export.h" >> +#include "block/fuse.h" >> #include "block/nbd.h" >> #include "qapi/error.h" >> #include "qapi/qapi-commands-block-export.h" >> @@ -24,6 +25,9 @@ >> >> static const BlockExportDriver *blk_exp_drivers[] = { >> &blk_exp_nbd, >> +#ifdef CONFIG_FUSE >> + &blk_exp_fuse, >> +#endif >> }; >> >> /* Only accessed from the main thread */ >> diff --git a/block/export/fuse.c b/block/export/fuse.c >> new file mode 100644 >> index 0000000000..75f11d2514 >> --- /dev/null >> +++ b/block/export/fuse.c >> @@ -0,0 +1,253 @@ >> +/* >> + * Present a block device as a raw image through FUSE >> + * >> + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com> >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; under version 2 or later of the License. >> + * >> + * This program is distributed in the hope that it will be useful, >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> + * GNU General Public License for more details. >> + * >> + * You should have received a copy of the GNU General Public License >> + * along with this program; if not, see <http://www.gnu.org/licenses/>. >> + */ >> + >> +#define FUSE_USE_VERSION 31 >> + >> +#include "qemu/osdep.h" >> +#include "block/aio.h" >> +#include "block/block.h" >> +#include "block/export.h" >> +#include "block/fuse.h" >> +#include "block/qapi.h" >> +#include "qapi/error.h" >> +#include "qapi/qapi-commands-block.h" >> +#include "sysemu/block-backend.h" >> + >> +#include <fuse.h> >> +#include <fuse_lowlevel.h> >> + >> + >> +typedef struct FuseExport { >> + BlockExport common; >> + >> + struct fuse_session *fuse_session; >> + struct fuse_buf fuse_buf; >> + bool mounted, fd_handler_set_up; >> + >> + char *mountpoint; >> + bool writable; >> +} FuseExport; >> + >> +static GHashTable *exports; >> +static const struct fuse_lowlevel_ops fuse_ops; >> + >> +static void fuse_export_shutdown(BlockExport *exp); >> +static void fuse_export_delete(BlockExport *exp); >> + >> +static void init_fuse(void); >> +static int setup_fuse_export(FuseExport *exp, const char *mountpoint, >> + Error **errp); >> +static void read_from_fuse_export(void *opaque); >> + >> +static bool is_regular_file(const char *path, Error **errp); >> + >> + >> +static int fuse_export_create(BlockExport *blk_exp, >> + BlockExportOptions *blk_exp_args, >> + Error **errp) >> +{ >> + FuseExport *exp = container_of(blk_exp, FuseExport, common); >> + BlockExportOptionsFuse *args = &blk_exp_args->u.fuse; >> + int ret; >> + >> + assert(blk_exp_args->type == BLOCK_EXPORT_TYPE_FUSE); >> + >> + init_fuse(); >> + >> + /* >> + * It is important to do this check before calling is_regular_file() -- >> + * that function will do a stat(), which we would have to handle if we >> + * already exported something on @mountpoint. But we cannot, because >> + * we are currently caught up here. >> + * (Note that ideally we would want to resolve relative paths here, >> + * but bdrv_make_absolute_filename() might do the wrong thing for >> + * paths that contain colons, and realpath() would resolve symlinks, >> + * which we do not want: The mount point is not going to be the >> + * symlink's destination, but the link itself.) >> + * So this will not catch all potential clashes, but hopefully at >> + * least the most common one of specifying exactly the same path >> + * string twice. >> + */ >> + if (g_hash_table_contains(exports, args->mountpoint)) { >> + error_setg(errp, "There already is a FUSE export on '%s'", >> + args->mountpoint); >> + ret = -EEXIST; >> + goto fail; >> + } > > I guess the proper solution would be to make block export creation > asynchronous so that we can actually serve that request in the > background. > > Something for another day... > >> + >> + if (!is_regular_file(args->mountpoint, errp)) { >> + ret = -EINVAL; >> + goto fail; >> + } >> + >> + exp->mountpoint = g_strdup(args->mountpoint); >> + exp->writable = blk_exp_args->writable; >> + >> + ret = setup_fuse_export(exp, args->mountpoint, errp); >> + if (ret < 0) { >> + goto fail; >> + } >> + >> + g_hash_table_insert(exports, g_strdup(args->mountpoint), NULL); >> + return 0; >> + >> +fail: >> + fuse_export_shutdown(blk_exp); >> + fuse_export_delete(blk_exp); > > Why fuse_export_shutdown()? The only thing to possibly undo at > this point is allocating exp->mountpoint, which is freed in > fuse_export_delete(). I just realized it’s not only a question of style, but even technically wrong to do so. fuse_export_shutdown() drops the mount point path from the @exports hash table, which shouldn’t be done if we got here from the g_hash_table_contains() condition. That could be fixed by introducing a new label that won’t execute fuse_export_shutdown(), but we can also just let setup_fuse_export() call that. > Or can setup_fuse_export() fail without cleaning up? I hope not. > >> + return ret; >> +} >> + >> +/** >> + * Ensure that the global FUSE context is set up. >> + */ >> +static void init_fuse(void) >> +{ >> + if (exports) { >> + return; >> + } >> + >> + exports = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, NULL); >> +} >> + >> +/** >> + * Create exp->fuse_session and mount it. >> + */ >> +static int setup_fuse_export(FuseExport *exp, const char *mountpoint, >> + Error **errp) >> +{ >> + const char *fuse_argv[2]; >> + struct fuse_args fuse_args; >> + int ret; >> + >> + fuse_argv[0] = ""; /* Dummy program name */ >> + fuse_argv[1] = NULL; >> + fuse_args = (struct fuse_args)FUSE_ARGS_INIT(1, (char **)fuse_argv); >> + >> + exp->fuse_session = fuse_session_new(&fuse_args, &fuse_ops, >> + sizeof(fuse_ops), exp); >> + if (!exp->fuse_session) { >> + error_setg(errp, "Failed to set up FUSE session"); >> + return -EIO; >> + } >> + >> + ret = fuse_session_mount(exp->fuse_session, mountpoint); >> + if (ret < 0) { >> + error_setg(errp, "Failed to mount FUSE session to export"); >> + return -EIO; > > Oh, it can. :-( > > I think it would be better to call fuse_export_shutdown() here so that > the function doesn't change the state when it returns failure. I suppose then this function (setup_fuse_export()) should also add the mount point name from the @exports hash table, so it makes sense to drop it via fuse_export_shutdown(). But then yeah, why not. Makes sense. >> + } >> + exp->mounted = true; >> + >> + aio_set_fd_handler(exp->common.ctx, >> + fuse_session_fd(exp->fuse_session), true, >> + read_from_fuse_export, NULL, NULL, exp); >> + exp->fd_handler_set_up = true; >> + >> + return 0; >> +} >> + >> +/** >> + * Callback to be invoked when the FUSE session FD can be read from. >> + * (This is basically the FUSE event loop.) >> + */ >> +static void read_from_fuse_export(void *opaque) >> +{ >> + FuseExport *exp = opaque; >> + int ret; >> + >> + blk_exp_ref(&exp->common); >> + >> + ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf); > > The fuse_session_loop() implementation seems to imply that we should > retry on EINTR. OK. I see you’re digging into libfuse already. :) >> + if (ret < 0) { >> + goto out; >> + } >> + >> + fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf); >> + >> +out: >> + blk_exp_unref(&exp->common); >> +} >> + >> +static void fuse_export_shutdown(BlockExport *blk_exp) >> +{ >> + FuseExport *exp = container_of(blk_exp, FuseExport, common); >> + >> + if (exp->fuse_session) { >> + fuse_session_exit(exp->fuse_session); >> + >> + if (exp->mounted) { >> + fuse_session_unmount(exp->fuse_session); >> + exp->mounted = false; >> + } >> + >> + if (exp->fd_handler_set_up) { >> + aio_set_fd_handler(exp->common.ctx, >> + fuse_session_fd(exp->fuse_session), true, >> + NULL, NULL, NULL, NULL); >> + exp->fd_handler_set_up = false; >> + } >> + >> + fuse_session_destroy(exp->fuse_session); >> + exp->fuse_session = NULL; > > What happens if a request is still in flight? > > Oh, can't happen because the driver is fully synchronous after this > series. Fair enough, making it asynchronous can come on top of it. (I had multiple approaches of handling parallel requests, but none made a substantial performance difference, which is why I left the driver in the most simple form for this first proposal.) > However, I think we should move fuse_session_unmount() and > fuse_session_destroy() to .delete instead of .shutdown to avoid setting > up a trap for whoever introduces async requests to the FUSE export. Yes, makes sense. >> + } >> + >> + if (exp->mountpoint) { >> + g_hash_table_remove(exports, exp->mountpoint); >> + } >> +} >> + >> +static void fuse_export_delete(BlockExport *blk_exp) >> +{ >> + FuseExport *exp = container_of(blk_exp, FuseExport, common); >> + >> + free(exp->fuse_buf.mem); >> + g_free(exp->mountpoint); >> +} >> + >> +/** >> + * Check whether @path points to a regular file. If not, put an >> + * appropriate message into *errp. >> + */ >> +static bool is_regular_file(const char *path, Error **errp) >> +{ >> + struct stat statbuf; >> + int ret; >> + >> + ret = stat(path, &statbuf); >> + if (ret < 0) { >> + error_setg_errno(errp, errno, "Failed to stat '%s'", path); >> + return false; >> + } >> + >> + if (!S_ISREG(statbuf.st_mode)) { >> + error_setg(errp, "'%s' is not a regular file", path); >> + return false; >> + } >> + >> + return true; >> +} > > This is obviously racy (TOCTOU), but if I understand correctly, we use > it only to provide a nicer error message than if fuse_session_mount() > fails, so this is fine. It’s a check provided for convenience, yes. More important is perhaps that users might provide a directory instead of a file as the mount point, which wouldn’t work. (You’ll just get errors back. Which isn’t a catastrophe, it’s just nice to tell users early and explicitly that that won’t work.) Max
Am 15.10.2020 um 16:46 hat Max Reitz geschrieben: > On 15.10.20 10:57, Kevin Wolf wrote: > > Am 22.09.2020 um 12:49 hat Max Reitz geschrieben: > >> block-export-add type=fuse allows mounting block graph nodes via FUSE on > >> some existing regular file. That file should then appears like a raw > >> disk image, and accesses to it result in accesses to the exported BDS. > >> > >> Right now, we only implement the necessary block export functions to set > >> it up and shut it down. We do not implement any access functions, so > >> accessing the mount point only results in errors. This will be > >> addressed by a followup patch. > >> > >> We keep a hash table of exported mount points, because we want to be > >> able to detect when users try to use a mount point twice. This is > >> because we invoke stat() to check whether the given mount point is a > >> regular file, but if that file is served by ourselves (because it is > >> already used as a mount point), then this stat() would have to be served > >> by ourselves, too, which is impossible to do while we (as the caller) > >> are waiting for it to settle. Therefore, keep track of mount point > >> paths to at least catch the most obvious instances of that problem. > >> > >> Signed-off-by: Max Reitz <mreitz@redhat.com> > >> +/** > >> + * Callback to be invoked when the FUSE session FD can be read from. > >> + * (This is basically the FUSE event loop.) > >> + */ > >> +static void read_from_fuse_export(void *opaque) > >> +{ > >> + FuseExport *exp = opaque; > >> + int ret; > >> + > >> + blk_exp_ref(&exp->common); > >> + > >> + ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf); > > > > The fuse_session_loop() implementation seems to imply that we should > > retry on EINTR. > > OK. I see you’re digging into libfuse already. :) Well, I have to review against something, and the documentation tends to be terse... > >> + if (ret < 0) { > >> + goto out; > >> + } > >> + > >> + fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf); > >> + > >> +out: > >> + blk_exp_unref(&exp->common); > >> +} > >> + > >> +static void fuse_export_shutdown(BlockExport *blk_exp) > >> +{ > >> + FuseExport *exp = container_of(blk_exp, FuseExport, common); > >> + > >> + if (exp->fuse_session) { > >> + fuse_session_exit(exp->fuse_session); > >> + > >> + if (exp->mounted) { > >> + fuse_session_unmount(exp->fuse_session); > >> + exp->mounted = false; > >> + } > >> + > >> + if (exp->fd_handler_set_up) { > >> + aio_set_fd_handler(exp->common.ctx, > >> + fuse_session_fd(exp->fuse_session), true, > >> + NULL, NULL, NULL, NULL); > >> + exp->fd_handler_set_up = false; > >> + } > >> + > >> + fuse_session_destroy(exp->fuse_session); > >> + exp->fuse_session = NULL; > > > > What happens if a request is still in flight? > > > > Oh, can't happen because the driver is fully synchronous after this > > series. Fair enough, making it asynchronous can come on top of it. > > (I had multiple approaches of handling parallel requests, but none made > a substantial performance difference, which is why I left the driver in > the most simple form for this first proposal.) I think the more relevant part is that we'd block the guest or anything else running in the main loop while doing I/O. Not a problem if you spawn a new qemu-storage-daemon just for this FUSE export, but if you want to have multiple exports, or export from the system emulator, you probably don't want to have synchronous operations. Kevin
On 15.10.20 17:41, Kevin Wolf wrote: > Am 15.10.2020 um 16:46 hat Max Reitz geschrieben: >> On 15.10.20 10:57, Kevin Wolf wrote: >>> Am 22.09.2020 um 12:49 hat Max Reitz geschrieben: [...] >>>> +static void fuse_export_shutdown(BlockExport *blk_exp) >>>> +{ >>>> + FuseExport *exp = container_of(blk_exp, FuseExport, common); >>>> + >>>> + if (exp->fuse_session) { >>>> + fuse_session_exit(exp->fuse_session); >>>> + >>>> + if (exp->mounted) { >>>> + fuse_session_unmount(exp->fuse_session); >>>> + exp->mounted = false; >>>> + } >>>> + >>>> + if (exp->fd_handler_set_up) { >>>> + aio_set_fd_handler(exp->common.ctx, >>>> + fuse_session_fd(exp->fuse_session), true, >>>> + NULL, NULL, NULL, NULL); >>>> + exp->fd_handler_set_up = false; >>>> + } >>>> + >>>> + fuse_session_destroy(exp->fuse_session); >>>> + exp->fuse_session = NULL; >>> >>> What happens if a request is still in flight? >>> >>> Oh, can't happen because the driver is fully synchronous after this >>> series. Fair enough, making it asynchronous can come on top of it. >> >> (I had multiple approaches of handling parallel requests, but none made >> a substantial performance difference, which is why I left the driver in >> the most simple form for this first proposal.) > > I think the more relevant part is that we'd block the guest or anything > else running in the main loop while doing I/O. > > Not a problem if you spawn a new qemu-storage-daemon just for this FUSE > export, but if you want to have multiple exports, or export from the > system emulator, you probably don't want to have synchronous operations. Ah, hm. Hmm. O:) Does NBD work any different, though? I had always assumed it runs in the BB’s context. Max
Am 15.10.2020 um 17:59 hat Max Reitz geschrieben: > On 15.10.20 17:41, Kevin Wolf wrote: > > Am 15.10.2020 um 16:46 hat Max Reitz geschrieben: > >> On 15.10.20 10:57, Kevin Wolf wrote: > >>> Am 22.09.2020 um 12:49 hat Max Reitz geschrieben: > > [...] > > >>>> +static void fuse_export_shutdown(BlockExport *blk_exp) > >>>> +{ > >>>> + FuseExport *exp = container_of(blk_exp, FuseExport, common); > >>>> + > >>>> + if (exp->fuse_session) { > >>>> + fuse_session_exit(exp->fuse_session); > >>>> + > >>>> + if (exp->mounted) { > >>>> + fuse_session_unmount(exp->fuse_session); > >>>> + exp->mounted = false; > >>>> + } > >>>> + > >>>> + if (exp->fd_handler_set_up) { > >>>> + aio_set_fd_handler(exp->common.ctx, > >>>> + fuse_session_fd(exp->fuse_session), true, > >>>> + NULL, NULL, NULL, NULL); > >>>> + exp->fd_handler_set_up = false; > >>>> + } > >>>> + > >>>> + fuse_session_destroy(exp->fuse_session); > >>>> + exp->fuse_session = NULL; > >>> > >>> What happens if a request is still in flight? > >>> > >>> Oh, can't happen because the driver is fully synchronous after this > >>> series. Fair enough, making it asynchronous can come on top of it. > >> > >> (I had multiple approaches of handling parallel requests, but none made > >> a substantial performance difference, which is why I left the driver in > >> the most simple form for this first proposal.) > > > > I think the more relevant part is that we'd block the guest or anything > > else running in the main loop while doing I/O. > > > > Not a problem if you spawn a new qemu-storage-daemon just for this FUSE > > export, but if you want to have multiple exports, or export from the > > system emulator, you probably don't want to have synchronous operations. > > Ah, hm. Hmm. O:) > > Does NBD work any different, though? I had always assumed it runs in > the BB’s context. Yes, it runs in the BB's AioContext, but it's coroutine based, so instead of blocking, it just yields and lets other stuff in the same event loop make progress while it's waiting for its I/O. Kevin
diff --git a/qapi/block-export.json b/qapi/block-export.json index 0c045dfe7b..cb5bd54cbf 100644 --- a/qapi/block-export.json +++ b/qapi/block-export.json @@ -174,6 +174,21 @@ ## { 'command': 'nbd-server-stop' } +## +# @BlockExportOptionsFuse: +# +# Options for exporting a block graph node on some (file) mountpoint +# as a raw image. +# +# @mountpoint: Path on which to export the block device via FUSE. +# This must point to an existing regular file. +# +# Since: 5.2 +## +{ 'struct': 'BlockExportOptionsFuse', + 'data': { 'mountpoint': 'str' }, + 'if': 'defined(CONFIG_FUSE)' } + ## # @BlockExportType: # @@ -181,10 +196,13 @@ # # @nbd: NBD export # +# @fuse: Fuse export (since: 5.2) +# # Since: 4.2 ## { 'enum': 'BlockExportType', - 'data': [ 'nbd' ] } + 'data': [ 'nbd', + { 'name': 'fuse', 'if': 'defined(CONFIG_FUSE)' } ] } ## # @BlockExportOptions: @@ -213,7 +231,9 @@ '*writethrough': 'bool' }, 'discriminator': 'type', 'data': { - 'nbd': 'BlockExportOptionsNbd' + 'nbd': 'BlockExportOptionsNbd', + 'fuse': { 'type': 'BlockExportOptionsFuse', + 'if': 'defined(CONFIG_FUSE)' } } } ## diff --git a/include/block/fuse.h b/include/block/fuse.h new file mode 100644 index 0000000000..ffa91fe364 --- /dev/null +++ b/include/block/fuse.h @@ -0,0 +1,30 @@ +/* + * Present a block device as a raw image through FUSE + * + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 or later of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef BLOCK_FUSE_H +#define BLOCK_FUSE_H + +#ifdef CONFIG_FUSE + +#include "block/export.h" + +extern const BlockExportDriver blk_exp_fuse; + +#endif /* CONFIG_FUSE */ + +#endif diff --git a/block.c b/block.c index acde53f92a..7bf45f6ede 100644 --- a/block.c +++ b/block.c @@ -26,6 +26,7 @@ #include "block/trace.h" #include "block/block_int.h" #include "block/blockjob.h" +#include "block/fuse.h" #include "block/nbd.h" #include "block/qdict.h" #include "qemu/error-report.h" diff --git a/block/export/export.c b/block/export/export.c index 64d39a6934..ba866d6ba7 100644 --- a/block/export/export.c +++ b/block/export/export.c @@ -16,6 +16,7 @@ #include "block/block.h" #include "sysemu/block-backend.h" #include "block/export.h" +#include "block/fuse.h" #include "block/nbd.h" #include "qapi/error.h" #include "qapi/qapi-commands-block-export.h" @@ -24,6 +25,9 @@ static const BlockExportDriver *blk_exp_drivers[] = { &blk_exp_nbd, +#ifdef CONFIG_FUSE + &blk_exp_fuse, +#endif }; /* Only accessed from the main thread */ diff --git a/block/export/fuse.c b/block/export/fuse.c new file mode 100644 index 0000000000..75f11d2514 --- /dev/null +++ b/block/export/fuse.c @@ -0,0 +1,253 @@ +/* + * Present a block device as a raw image through FUSE + * + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; under version 2 or later of the License. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see <http://www.gnu.org/licenses/>. + */ + +#define FUSE_USE_VERSION 31 + +#include "qemu/osdep.h" +#include "block/aio.h" +#include "block/block.h" +#include "block/export.h" +#include "block/fuse.h" +#include "block/qapi.h" +#include "qapi/error.h" +#include "qapi/qapi-commands-block.h" +#include "sysemu/block-backend.h" + +#include <fuse.h> +#include <fuse_lowlevel.h> + + +typedef struct FuseExport { + BlockExport common; + + struct fuse_session *fuse_session; + struct fuse_buf fuse_buf; + bool mounted, fd_handler_set_up; + + char *mountpoint; + bool writable; +} FuseExport; + +static GHashTable *exports; +static const struct fuse_lowlevel_ops fuse_ops; + +static void fuse_export_shutdown(BlockExport *exp); +static void fuse_export_delete(BlockExport *exp); + +static void init_fuse(void); +static int setup_fuse_export(FuseExport *exp, const char *mountpoint, + Error **errp); +static void read_from_fuse_export(void *opaque); + +static bool is_regular_file(const char *path, Error **errp); + + +static int fuse_export_create(BlockExport *blk_exp, + BlockExportOptions *blk_exp_args, + Error **errp) +{ + FuseExport *exp = container_of(blk_exp, FuseExport, common); + BlockExportOptionsFuse *args = &blk_exp_args->u.fuse; + int ret; + + assert(blk_exp_args->type == BLOCK_EXPORT_TYPE_FUSE); + + init_fuse(); + + /* + * It is important to do this check before calling is_regular_file() -- + * that function will do a stat(), which we would have to handle if we + * already exported something on @mountpoint. But we cannot, because + * we are currently caught up here. + * (Note that ideally we would want to resolve relative paths here, + * but bdrv_make_absolute_filename() might do the wrong thing for + * paths that contain colons, and realpath() would resolve symlinks, + * which we do not want: The mount point is not going to be the + * symlink's destination, but the link itself.) + * So this will not catch all potential clashes, but hopefully at + * least the most common one of specifying exactly the same path + * string twice. + */ + if (g_hash_table_contains(exports, args->mountpoint)) { + error_setg(errp, "There already is a FUSE export on '%s'", + args->mountpoint); + ret = -EEXIST; + goto fail; + } + + if (!is_regular_file(args->mountpoint, errp)) { + ret = -EINVAL; + goto fail; + } + + exp->mountpoint = g_strdup(args->mountpoint); + exp->writable = blk_exp_args->writable; + + ret = setup_fuse_export(exp, args->mountpoint, errp); + if (ret < 0) { + goto fail; + } + + g_hash_table_insert(exports, g_strdup(args->mountpoint), NULL); + return 0; + +fail: + fuse_export_shutdown(blk_exp); + fuse_export_delete(blk_exp); + return ret; +} + +/** + * Ensure that the global FUSE context is set up. + */ +static void init_fuse(void) +{ + if (exports) { + return; + } + + exports = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, NULL); +} + +/** + * Create exp->fuse_session and mount it. + */ +static int setup_fuse_export(FuseExport *exp, const char *mountpoint, + Error **errp) +{ + const char *fuse_argv[2]; + struct fuse_args fuse_args; + int ret; + + fuse_argv[0] = ""; /* Dummy program name */ + fuse_argv[1] = NULL; + fuse_args = (struct fuse_args)FUSE_ARGS_INIT(1, (char **)fuse_argv); + + exp->fuse_session = fuse_session_new(&fuse_args, &fuse_ops, + sizeof(fuse_ops), exp); + if (!exp->fuse_session) { + error_setg(errp, "Failed to set up FUSE session"); + return -EIO; + } + + ret = fuse_session_mount(exp->fuse_session, mountpoint); + if (ret < 0) { + error_setg(errp, "Failed to mount FUSE session to export"); + return -EIO; + } + exp->mounted = true; + + aio_set_fd_handler(exp->common.ctx, + fuse_session_fd(exp->fuse_session), true, + read_from_fuse_export, NULL, NULL, exp); + exp->fd_handler_set_up = true; + + return 0; +} + +/** + * Callback to be invoked when the FUSE session FD can be read from. + * (This is basically the FUSE event loop.) + */ +static void read_from_fuse_export(void *opaque) +{ + FuseExport *exp = opaque; + int ret; + + blk_exp_ref(&exp->common); + + ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf); + if (ret < 0) { + goto out; + } + + fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf); + +out: + blk_exp_unref(&exp->common); +} + +static void fuse_export_shutdown(BlockExport *blk_exp) +{ + FuseExport *exp = container_of(blk_exp, FuseExport, common); + + if (exp->fuse_session) { + fuse_session_exit(exp->fuse_session); + + if (exp->mounted) { + fuse_session_unmount(exp->fuse_session); + exp->mounted = false; + } + + if (exp->fd_handler_set_up) { + aio_set_fd_handler(exp->common.ctx, + fuse_session_fd(exp->fuse_session), true, + NULL, NULL, NULL, NULL); + exp->fd_handler_set_up = false; + } + + fuse_session_destroy(exp->fuse_session); + exp->fuse_session = NULL; + } + + if (exp->mountpoint) { + g_hash_table_remove(exports, exp->mountpoint); + } +} + +static void fuse_export_delete(BlockExport *blk_exp) +{ + FuseExport *exp = container_of(blk_exp, FuseExport, common); + + free(exp->fuse_buf.mem); + g_free(exp->mountpoint); +} + +/** + * Check whether @path points to a regular file. If not, put an + * appropriate message into *errp. + */ +static bool is_regular_file(const char *path, Error **errp) +{ + struct stat statbuf; + int ret; + + ret = stat(path, &statbuf); + if (ret < 0) { + error_setg_errno(errp, errno, "Failed to stat '%s'", path); + return false; + } + + if (!S_ISREG(statbuf.st_mode)) { + error_setg(errp, "'%s' is not a regular file", path); + return false; + } + + return true; +} + +static const struct fuse_lowlevel_ops fuse_ops = { +}; + +const BlockExportDriver blk_exp_fuse = { + .type = BLOCK_EXPORT_TYPE_FUSE, + .instance_size = sizeof(FuseExport), + .create = fuse_export_create, + .delete = fuse_export_delete, + .request_shutdown = fuse_export_shutdown, +}; diff --git a/block/export/meson.build b/block/export/meson.build index 558ef35d38..3f7f8c5263 100644 --- a/block/export/meson.build +++ b/block/export/meson.build @@ -1 +1,2 @@ block_ss.add(files('export.c')) +block_ss.add(when: ['CONFIG_FUSE', libfuse], if_true: files('fuse.c'))
block-export-add type=fuse allows mounting block graph nodes via FUSE on some existing regular file. That file should then appears like a raw disk image, and accesses to it result in accesses to the exported BDS. Right now, we only implement the necessary block export functions to set it up and shut it down. We do not implement any access functions, so accessing the mount point only results in errors. This will be addressed by a followup patch. We keep a hash table of exported mount points, because we want to be able to detect when users try to use a mount point twice. This is because we invoke stat() to check whether the given mount point is a regular file, but if that file is served by ourselves (because it is already used as a mount point), then this stat() would have to be served by ourselves, too, which is impossible to do while we (as the caller) are waiting for it to settle. Therefore, keep track of mount point paths to at least catch the most obvious instances of that problem. Signed-off-by: Max Reitz <mreitz@redhat.com> --- qapi/block-export.json | 24 +++- include/block/fuse.h | 30 +++++ block.c | 1 + block/export/export.c | 4 + block/export/fuse.c | 253 +++++++++++++++++++++++++++++++++++++++ block/export/meson.build | 1 + 6 files changed, 311 insertions(+), 2 deletions(-) create mode 100644 include/block/fuse.h create mode 100644 block/export/fuse.c