Message ID | 1413559886-11516-6-git-send-email-stefano.stabellini@eu.citrix.com |
---|---|
State | New |
Headers | show |
On Fri, 17 Oct 2014, Stefano Stabellini wrote: > Introduce a new hypercall to perform cache maintenance operation on > behalf of the guest. The argument is a machine address and a size. The > implementation checks that the memory range is owned by the guest or the > guest has been granted access to it by another domain. > > Introduce grant_map_exists: an internal grant table function to check > whether an mfn has been granted to a given domain on a target grant > table. Check hypercall_preempt_check() every 2048 iterations in the > implementation of grant_map_exists. > Use the top 16 bits of the GNTTABOP cmd encoding to save the last ref > across the hypercall continuation. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > Changes in v7: > - remove warning message; > - prefix second line of the warning with XENLOG_WARNING; > - do not lower DEFAULT_MAX_NR_GRANT_FRAMES; > - no long lines; > - interrupt loops in grant_map_exists with more than 2048 iterations, > create an hypercall continuation if necessary. > > Changes in v6: > - set DEFAULT_MAX_NR_GRANT_FRAMES to 10; > - warn if max_grant_frames > 10. > > Changes in v5: > - make mfn mfn unsigned long; > - remove unhelpful error message; > - handle errors returned by cache maintenance functions. > > Changes in v4: > - ASSERT(spin_is_locked); > - return instead of break in grant_map_exists; > - pass a pointer to __gnttab_cache_flush; > - code style; > - unsigned int iterator in gnttab_cache_flush; > - return ret instead -ret; > - cflush.offset >= PAGE_SIZE return -EINVAL. > > Changes in v3: > - reduce the time the grant_table lock is held; > - fix warning message; > - s/EFAULT/EPERM; > - s/llx/PRIx64; > - check offset and size independetly before checking their sum; > - rcu_lock_current_domain cannot fail; > - s/ENOSYS/EOPNOTSUPP; > - use clean_and_invalidate_xen_dcache_va_range to do both operations at > once; > - fold grant_map_exists in this patch; > - support "count" argument; > - make correspondent changes to compat/grant_table.c; > - introduce GNTTAB_CACHE_SOURCE_GREF to select the type of input in the > union; > - rename size field to length; > - make length and offset uint16_t; > - only take spin_lock if d != owner. > > Changes in v2: > - do not check for mfn_to_page errors; > - take a reference to the page; > - replace printk with gdprintk; > - split long line; > - remove out label; > - move rcu_lock_current_domain down before the loop. > - move the hypercall to GNTTABOP; > - take a spin_lock before calling grant_map_exists. > --- > xen/common/compat/grant_table.c | 8 ++ > xen/common/grant_table.c | 166 +++++++++++++++++++++++++++++++++++++- > xen/include/public/grant_table.h | 21 +++++ > xen/include/xlat.lst | 1 + > 4 files changed, 193 insertions(+), 3 deletions(-) > > diff --git a/xen/common/compat/grant_table.c b/xen/common/compat/grant_table.c > index 2dc1e44..5130c35 100644 > --- a/xen/common/compat/grant_table.c > +++ b/xen/common/compat/grant_table.c > @@ -51,6 +51,10 @@ CHECK_gnttab_get_version; > CHECK_gnttab_swap_grant_ref; > #undef xen_gnttab_swap_grant_ref > > +#define xen_gnttab_cache_flush gnttab_cache_flush > +CHECK_gnttab_cache_flush; > +#undef xen_gnttab_cache_flush > + > int compat_grant_table_op(unsigned int cmd, > XEN_GUEST_HANDLE_PARAM(void) cmp_uop, > unsigned int count) > @@ -106,6 +110,10 @@ int compat_grant_table_op(unsigned int cmd, > CASE(swap_grant_ref); > #endif > > +#ifndef CHECK_gnttab_cache_flush > + CASE(cache_flush); > +#endif > + > #undef CASE > default: > return do_grant_table_op(cmd, cmp_uop, count); > diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c > index 12f4d5e..9d05e89 100644 > --- a/xen/common/grant_table.c > +++ b/xen/common/grant_table.c > @@ -490,6 +490,42 @@ static int _set_status(unsigned gt_version, > return _set_status_v2(domid, readonly, mapflag, shah, act, status); > } > > +#define MAX_GRANT_ENTRIES_ITERATIONS 2048 > +static int grant_map_exists(const struct domain *ld, > + struct grant_table *rgt, > + unsigned long mfn, > + grant_ref_t *ref_count) > +{ > + const struct active_grant_entry *act; > + grant_ref_t ref; > + > + ASSERT(spin_is_locked(&rgt->lock)); > + > + for ( ref = *ref_count; ref < nr_grant_entries(rgt); ref++ ) > + { > + if ( (ref % MAX_GRANT_ENTRIES_ITERATIONS) == 0 && ref != *ref_count ) > + { > + *ref_count = ref; > + return ref; > + } > + > + act = &active_entry(rgt, ref); > + > + if ( !act->pin ) > + continue; > + > + if ( act->domid != ld->domain_id ) > + continue; > + > + if ( act->frame != mfn ) > + continue; > + > + return 0; > + } > + > + return -1; > +} > + > static void mapcount( > struct grant_table *lgt, struct domain *rd, unsigned long mfn, > unsigned int *wrc, unsigned int *rdc) > @@ -2488,16 +2524,121 @@ gnttab_swap_grant_ref(XEN_GUEST_HANDLE_PARAM(gnttab_swap_grant_ref_t) uop, > return 0; > } > > +static int __gnttab_cache_flush(gnttab_cache_flush_t *cflush, > + grant_ref_t *ref_count) > +{ > + struct domain *d, *owner; > + struct page_info *page; > + unsigned long mfn; > + void *v; > + int ret = 0; > + > + if ( (cflush->offset >= PAGE_SIZE) || > + (cflush->length > PAGE_SIZE) || > + (cflush->offset + cflush->length > PAGE_SIZE) ) > + return -EINVAL; > + > + if ( cflush->length == 0 || cflush->op == 0 ) > + return 0; > + > + /* currently unimplemented */ > + if ( cflush->op & GNTTAB_CACHE_SOURCE_GREF ) > + return -EOPNOTSUPP; > + > + d = rcu_lock_current_domain(); > + mfn = cflush->a.dev_bus_addr >> PAGE_SHIFT; > + > + if ( !mfn_valid(mfn) ) > + { > + rcu_unlock_domain(d); > + return -EINVAL; > + } > + > + page = mfn_to_page(mfn); > + owner = page_get_owner_and_reference(page); > + if ( !owner ) > + { > + rcu_unlock_domain(d); > + return -EPERM; > + } > + > + if ( d != owner ) > + { > + spin_lock(&owner->grant_table->lock); > + > + ret = grant_map_exists(d, owner->grant_table, mfn, ref_count); > + if ( ret != 0 ) > + { > + spin_unlock(&owner->grant_table->lock); > + rcu_unlock_domain(d); > + put_page(page); > + return ret; > + } > + } > + > + v = map_domain_page(mfn); > + v += cflush->offset; > + > + if ( (cflush->op & GNTTAB_CACHE_INVAL) && (cflush->op & GNTTAB_CACHE_CLEAN) ) > + ret = clean_and_invalidate_dcache_va_range(v, cflush->length); > + else if ( cflush->op & GNTTAB_CACHE_INVAL ) > + ret = invalidate_dcache_va_range(v, cflush->length); > + else if ( cflush->op & GNTTAB_CACHE_CLEAN ) > + ret = clean_dcache_va_range(v, cflush->length); > + > + if ( d != owner ) > + spin_unlock(&owner->grant_table->lock); > + unmap_domain_page(v); > + put_page(page); > + > + return ret; > +} > + > +static long > +gnttab_cache_flush(XEN_GUEST_HANDLE_PARAM(gnttab_cache_flush_t) uop, > + grant_ref_t *ref_count, > + unsigned int count) > +{ > + int ret = 0; > + unsigned int i; > + gnttab_cache_flush_t op; > + > + for ( i = 0; i < count; i++ ) > + { > + if ( i && hypercall_preempt_check() ) > + return i; > + if ( unlikely(__copy_from_guest(&op, uop, 1)) ) > + return -EFAULT; > + do { > + ret = __gnttab_cache_flush(&op, ref_count); > + if ( ret < 0 ) > + return ret; > + if ( ret > 0 && hypercall_preempt_check() ) > + return i; > + } while ( ret > 0 ); > + *ref_count = 0; > + guest_handle_add_offset(uop, 1); > + } > + return 0; > +} > + > + > +#define OPAQUE_CONTINUATION_ARG_SHIFT 16 > +#define CMD_MASK ((1<<OPAQUE_CONTINUATION_ARG_SHIFT)-1) > long > do_grant_table_op( > unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) uop, unsigned int count) > { > long rc; > + unsigned int opaque; > > if ( (int)count < 0 ) > return -EINVAL; > > rc = -EFAULT; > + > + opaque = cmd >> OPAQUE_CONTINUATION_ARG_SHIFT; > + cmd &= CMD_MASK; I realize now that opaque is not reset correctly for hypercalls others than GNTTABOP_cache_flush. I'll fix in the next version. > switch ( cmd ) > { > case GNTTABOP_map_grant_ref: > @@ -2617,17 +2758,36 @@ do_grant_table_op( > } > break; > } > + case GNTTABOP_cache_flush: > + { > + grant_ref_t ref_count = opaque; > + XEN_GUEST_HANDLE_PARAM(gnttab_cache_flush_t) cflush = > + guest_handle_cast(uop, gnttab_cache_flush_t); > + if ( unlikely(!guest_handle_okay(cflush, count)) ) > + goto out; > + rc = gnttab_cache_flush(cflush, &ref_count, count); > + if ( rc > 0 ) > + { > + guest_handle_add_offset(cflush, rc); > + uop = guest_handle_cast(cflush, void); > + } > + opaque = ref_count; > + break; > + } > default: > rc = -ENOSYS; > break; > } > > out: > - if ( rc > 0 ) > + if ( rc > 0 || opaque != 0 ) > { > ASSERT(rc < count); > - rc = hypercall_create_continuation(__HYPERVISOR_grant_table_op, > - "ihi", cmd, uop, count - rc); > + ASSERT(opaque < (1 << (sizeof(cmd)*8 - OPAQUE_CONTINUATION_ARG_SHIFT))); > + rc = hypercall_create_continuation(__HYPERVISOR_grant_table_op, "ihi", > + (opaque << > + OPAQUE_CONTINUATION_ARG_SHIFT) | cmd, > + uop, count - rc); > } > > return rc; > diff --git a/xen/include/public/grant_table.h b/xen/include/public/grant_table.h > index b8a3d6c..0e8bc91 100644 > --- a/xen/include/public/grant_table.h > +++ b/xen/include/public/grant_table.h > @@ -309,6 +309,8 @@ typedef uint16_t grant_status_t; > #define GNTTABOP_get_status_frames 9 > #define GNTTABOP_get_version 10 > #define GNTTABOP_swap_grant_ref 11 > +#define GNTTABOP_cache_flush 12 > +/* max GNTTABOP is 65535 */ > #endif /* __XEN_INTERFACE_VERSION__ */ > /* ` } */ > > @@ -574,6 +576,25 @@ struct gnttab_swap_grant_ref { > typedef struct gnttab_swap_grant_ref gnttab_swap_grant_ref_t; > DEFINE_XEN_GUEST_HANDLE(gnttab_swap_grant_ref_t); > > +/* > + * Issue one or more cache maintenance operations on a portion of a > + * page granted to the calling domain by a foreign domain. > + */ > +struct gnttab_cache_flush { > + union { > + uint64_t dev_bus_addr; > + grant_ref_t ref; > + } a; > + uint16_t offset; /* offset from start of grant */ > + uint16_t length; /* size within the grant */ > +#define GNTTAB_CACHE_CLEAN (1<<0) > +#define GNTTAB_CACHE_INVAL (1<<1) > +#define GNTTAB_CACHE_SOURCE_GREF (1<<31) > + uint32_t op; > +}; > +typedef struct gnttab_cache_flush gnttab_cache_flush_t; > +DEFINE_XEN_GUEST_HANDLE(gnttab_cache_flush_t); > + > #endif /* __XEN_INTERFACE_VERSION__ */ > > /* > diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst > index 234b668..9ce9fee 100644 > --- a/xen/include/xlat.lst > +++ b/xen/include/xlat.lst > @@ -51,6 +51,7 @@ > ? grant_entry_header grant_table.h > ? grant_entry_v2 grant_table.h > ? gnttab_swap_grant_ref grant_table.h > +? gnttab_cache_flush grant_table.h > ? kexec_exec kexec.h > ! kexec_image kexec.h > ! kexec_range kexec.h > -- > 1.7.10.4 >
diff --git a/xen/common/compat/grant_table.c b/xen/common/compat/grant_table.c index 2dc1e44..5130c35 100644 --- a/xen/common/compat/grant_table.c +++ b/xen/common/compat/grant_table.c @@ -51,6 +51,10 @@ CHECK_gnttab_get_version; CHECK_gnttab_swap_grant_ref; #undef xen_gnttab_swap_grant_ref +#define xen_gnttab_cache_flush gnttab_cache_flush +CHECK_gnttab_cache_flush; +#undef xen_gnttab_cache_flush + int compat_grant_table_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) cmp_uop, unsigned int count) @@ -106,6 +110,10 @@ int compat_grant_table_op(unsigned int cmd, CASE(swap_grant_ref); #endif +#ifndef CHECK_gnttab_cache_flush + CASE(cache_flush); +#endif + #undef CASE default: return do_grant_table_op(cmd, cmp_uop, count); diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c index 12f4d5e..9d05e89 100644 --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -490,6 +490,42 @@ static int _set_status(unsigned gt_version, return _set_status_v2(domid, readonly, mapflag, shah, act, status); } +#define MAX_GRANT_ENTRIES_ITERATIONS 2048 +static int grant_map_exists(const struct domain *ld, + struct grant_table *rgt, + unsigned long mfn, + grant_ref_t *ref_count) +{ + const struct active_grant_entry *act; + grant_ref_t ref; + + ASSERT(spin_is_locked(&rgt->lock)); + + for ( ref = *ref_count; ref < nr_grant_entries(rgt); ref++ ) + { + if ( (ref % MAX_GRANT_ENTRIES_ITERATIONS) == 0 && ref != *ref_count ) + { + *ref_count = ref; + return ref; + } + + act = &active_entry(rgt, ref); + + if ( !act->pin ) + continue; + + if ( act->domid != ld->domain_id ) + continue; + + if ( act->frame != mfn ) + continue; + + return 0; + } + + return -1; +} + static void mapcount( struct grant_table *lgt, struct domain *rd, unsigned long mfn, unsigned int *wrc, unsigned int *rdc) @@ -2488,16 +2524,121 @@ gnttab_swap_grant_ref(XEN_GUEST_HANDLE_PARAM(gnttab_swap_grant_ref_t) uop, return 0; } +static int __gnttab_cache_flush(gnttab_cache_flush_t *cflush, + grant_ref_t *ref_count) +{ + struct domain *d, *owner; + struct page_info *page; + unsigned long mfn; + void *v; + int ret = 0; + + if ( (cflush->offset >= PAGE_SIZE) || + (cflush->length > PAGE_SIZE) || + (cflush->offset + cflush->length > PAGE_SIZE) ) + return -EINVAL; + + if ( cflush->length == 0 || cflush->op == 0 ) + return 0; + + /* currently unimplemented */ + if ( cflush->op & GNTTAB_CACHE_SOURCE_GREF ) + return -EOPNOTSUPP; + + d = rcu_lock_current_domain(); + mfn = cflush->a.dev_bus_addr >> PAGE_SHIFT; + + if ( !mfn_valid(mfn) ) + { + rcu_unlock_domain(d); + return -EINVAL; + } + + page = mfn_to_page(mfn); + owner = page_get_owner_and_reference(page); + if ( !owner ) + { + rcu_unlock_domain(d); + return -EPERM; + } + + if ( d != owner ) + { + spin_lock(&owner->grant_table->lock); + + ret = grant_map_exists(d, owner->grant_table, mfn, ref_count); + if ( ret != 0 ) + { + spin_unlock(&owner->grant_table->lock); + rcu_unlock_domain(d); + put_page(page); + return ret; + } + } + + v = map_domain_page(mfn); + v += cflush->offset; + + if ( (cflush->op & GNTTAB_CACHE_INVAL) && (cflush->op & GNTTAB_CACHE_CLEAN) ) + ret = clean_and_invalidate_dcache_va_range(v, cflush->length); + else if ( cflush->op & GNTTAB_CACHE_INVAL ) + ret = invalidate_dcache_va_range(v, cflush->length); + else if ( cflush->op & GNTTAB_CACHE_CLEAN ) + ret = clean_dcache_va_range(v, cflush->length); + + if ( d != owner ) + spin_unlock(&owner->grant_table->lock); + unmap_domain_page(v); + put_page(page); + + return ret; +} + +static long +gnttab_cache_flush(XEN_GUEST_HANDLE_PARAM(gnttab_cache_flush_t) uop, + grant_ref_t *ref_count, + unsigned int count) +{ + int ret = 0; + unsigned int i; + gnttab_cache_flush_t op; + + for ( i = 0; i < count; i++ ) + { + if ( i && hypercall_preempt_check() ) + return i; + if ( unlikely(__copy_from_guest(&op, uop, 1)) ) + return -EFAULT; + do { + ret = __gnttab_cache_flush(&op, ref_count); + if ( ret < 0 ) + return ret; + if ( ret > 0 && hypercall_preempt_check() ) + return i; + } while ( ret > 0 ); + *ref_count = 0; + guest_handle_add_offset(uop, 1); + } + return 0; +} + + +#define OPAQUE_CONTINUATION_ARG_SHIFT 16 +#define CMD_MASK ((1<<OPAQUE_CONTINUATION_ARG_SHIFT)-1) long do_grant_table_op( unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) uop, unsigned int count) { long rc; + unsigned int opaque; if ( (int)count < 0 ) return -EINVAL; rc = -EFAULT; + + opaque = cmd >> OPAQUE_CONTINUATION_ARG_SHIFT; + cmd &= CMD_MASK; switch ( cmd ) { case GNTTABOP_map_grant_ref: @@ -2617,17 +2758,36 @@ do_grant_table_op( } break; } + case GNTTABOP_cache_flush: + { + grant_ref_t ref_count = opaque; + XEN_GUEST_HANDLE_PARAM(gnttab_cache_flush_t) cflush = + guest_handle_cast(uop, gnttab_cache_flush_t); + if ( unlikely(!guest_handle_okay(cflush, count)) ) + goto out; + rc = gnttab_cache_flush(cflush, &ref_count, count); + if ( rc > 0 ) + { + guest_handle_add_offset(cflush, rc); + uop = guest_handle_cast(cflush, void); + } + opaque = ref_count; + break; + } default: rc = -ENOSYS; break; } out: - if ( rc > 0 ) + if ( rc > 0 || opaque != 0 ) { ASSERT(rc < count); - rc = hypercall_create_continuation(__HYPERVISOR_grant_table_op, - "ihi", cmd, uop, count - rc); + ASSERT(opaque < (1 << (sizeof(cmd)*8 - OPAQUE_CONTINUATION_ARG_SHIFT))); + rc = hypercall_create_continuation(__HYPERVISOR_grant_table_op, "ihi", + (opaque << + OPAQUE_CONTINUATION_ARG_SHIFT) | cmd, + uop, count - rc); } return rc; diff --git a/xen/include/public/grant_table.h b/xen/include/public/grant_table.h index b8a3d6c..0e8bc91 100644 --- a/xen/include/public/grant_table.h +++ b/xen/include/public/grant_table.h @@ -309,6 +309,8 @@ typedef uint16_t grant_status_t; #define GNTTABOP_get_status_frames 9 #define GNTTABOP_get_version 10 #define GNTTABOP_swap_grant_ref 11 +#define GNTTABOP_cache_flush 12 +/* max GNTTABOP is 65535 */ #endif /* __XEN_INTERFACE_VERSION__ */ /* ` } */ @@ -574,6 +576,25 @@ struct gnttab_swap_grant_ref { typedef struct gnttab_swap_grant_ref gnttab_swap_grant_ref_t; DEFINE_XEN_GUEST_HANDLE(gnttab_swap_grant_ref_t); +/* + * Issue one or more cache maintenance operations on a portion of a + * page granted to the calling domain by a foreign domain. + */ +struct gnttab_cache_flush { + union { + uint64_t dev_bus_addr; + grant_ref_t ref; + } a; + uint16_t offset; /* offset from start of grant */ + uint16_t length; /* size within the grant */ +#define GNTTAB_CACHE_CLEAN (1<<0) +#define GNTTAB_CACHE_INVAL (1<<1) +#define GNTTAB_CACHE_SOURCE_GREF (1<<31) + uint32_t op; +}; +typedef struct gnttab_cache_flush gnttab_cache_flush_t; +DEFINE_XEN_GUEST_HANDLE(gnttab_cache_flush_t); + #endif /* __XEN_INTERFACE_VERSION__ */ /* diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst index 234b668..9ce9fee 100644 --- a/xen/include/xlat.lst +++ b/xen/include/xlat.lst @@ -51,6 +51,7 @@ ? grant_entry_header grant_table.h ? grant_entry_v2 grant_table.h ? gnttab_swap_grant_ref grant_table.h +? gnttab_cache_flush grant_table.h ? kexec_exec kexec.h ! kexec_image kexec.h ! kexec_range kexec.h
Introduce a new hypercall to perform cache maintenance operation on behalf of the guest. The argument is a machine address and a size. The implementation checks that the memory range is owned by the guest or the guest has been granted access to it by another domain. Introduce grant_map_exists: an internal grant table function to check whether an mfn has been granted to a given domain on a target grant table. Check hypercall_preempt_check() every 2048 iterations in the implementation of grant_map_exists. Use the top 16 bits of the GNTTABOP cmd encoding to save the last ref across the hypercall continuation. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- Changes in v7: - remove warning message; - prefix second line of the warning with XENLOG_WARNING; - do not lower DEFAULT_MAX_NR_GRANT_FRAMES; - no long lines; - interrupt loops in grant_map_exists with more than 2048 iterations, create an hypercall continuation if necessary. Changes in v6: - set DEFAULT_MAX_NR_GRANT_FRAMES to 10; - warn if max_grant_frames > 10. Changes in v5: - make mfn mfn unsigned long; - remove unhelpful error message; - handle errors returned by cache maintenance functions. Changes in v4: - ASSERT(spin_is_locked); - return instead of break in grant_map_exists; - pass a pointer to __gnttab_cache_flush; - code style; - unsigned int iterator in gnttab_cache_flush; - return ret instead -ret; - cflush.offset >= PAGE_SIZE return -EINVAL. Changes in v3: - reduce the time the grant_table lock is held; - fix warning message; - s/EFAULT/EPERM; - s/llx/PRIx64; - check offset and size independetly before checking their sum; - rcu_lock_current_domain cannot fail; - s/ENOSYS/EOPNOTSUPP; - use clean_and_invalidate_xen_dcache_va_range to do both operations at once; - fold grant_map_exists in this patch; - support "count" argument; - make correspondent changes to compat/grant_table.c; - introduce GNTTAB_CACHE_SOURCE_GREF to select the type of input in the union; - rename size field to length; - make length and offset uint16_t; - only take spin_lock if d != owner. Changes in v2: - do not check for mfn_to_page errors; - take a reference to the page; - replace printk with gdprintk; - split long line; - remove out label; - move rcu_lock_current_domain down before the loop. - move the hypercall to GNTTABOP; - take a spin_lock before calling grant_map_exists. --- xen/common/compat/grant_table.c | 8 ++ xen/common/grant_table.c | 166 +++++++++++++++++++++++++++++++++++++- xen/include/public/grant_table.h | 21 +++++ xen/include/xlat.lst | 1 + 4 files changed, 193 insertions(+), 3 deletions(-)