diff mbox

[Xen-devel,v2,3/8] tools: arm: allocate large pages to guests.

Message ID 1402504804-29173-3-git-send-email-ian.campbell@citrix.com
State New
Headers show

Commit Message

Ian Campbell June 11, 2014, 4:39 p.m. UTC
Tries to allocate as many large (1G or 2M) pages as it can to the guest and tries
to align to the next larger size on each attempt so that we can attempt to
allocate even larger pages on the next iteration (this is currently only
exercised via a compile time debug option, which is left in place under a #if
0).

Since ARM page tables are consistent at each level there is a common helper
function which tries to allocate a levels worth of pages. The exception to this
consistency is level 0 which does not support table mappings (0.5TB
superpages!).

Previously we would allocate in batches of up to 4GB worth of pages (allocsz
clamped at 1024*1024 pages) however this would now require 8MB worth of start
for the extents array in populate_one_size. Reduce to just 256*1024 or 1GB
worth of pages (at level 3) or 2MB of stack.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: min_t defintion moved into earlier patch
    drop debug_iters
    allocate extents array explicitly instead of putting 8M on the stack.
    Handle OOM, xc_populate_physmap returns 0, which needs handling for L3.
---
 tools/libxc/xc_dom_arm.c |  119 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 105 insertions(+), 14 deletions(-)

Comments

Julien Grall June 11, 2014, 9:26 p.m. UTC | #1
Hi Ian,

On 11/06/14 17:39, Ian Campbell wrote:
> Previously we would allocate in batches of up to 4GB worth of pages (allocsz
> clamped at 1024*1024 pages) however this would now require 8MB worth of start
> for the extents array in populate_one_size. Reduce to just 256*1024 or 1GB
> worth of pages (at level 3) or 2MB of stack.

I think you can drop this paragraph. You are using calloc rather than 
the stack in your patch.

> -    return rc;
> +    for ( pfn = 0; pfn < nr_pfns; pfn++ )
> +        dom->p2m_host[pfn] = base_pfn + pfn;
> +
> +out:
> +    free(extents);

Does free preserve errno? I can't find anything saying it... If so we 
may lose it when there is not enought memory.

Regards,
Ian Campbell June 12, 2014, 7:19 a.m. UTC | #2
On Wed, 2014-06-11 at 22:26 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 11/06/14 17:39, Ian Campbell wrote:
> > Previously we would allocate in batches of up to 4GB worth of pages (allocsz
> > clamped at 1024*1024 pages) however this would now require 8MB worth of start
> > for the extents array in populate_one_size. Reduce to just 256*1024 or 1GB
> > worth of pages (at level 3) or 2MB of stack.
> 
> I think you can drop this paragraph. You are using calloc rather than 
> the stack in your patch.

Oops, yes. I'll do that on commit unless I have another reason to
resend.

> > -    return rc;
> > +    for ( pfn = 0; pfn < nr_pfns; pfn++ )
> > +        dom->p2m_host[pfn] = base_pfn + pfn;
> > +
> > +out:
> > +    free(extents);
> 
> Does free preserve errno? I can't find anything saying it... If so we 
> may lose it when there is not enought memory.

I'm not 100% sure but I'm almost certain it is fine. free() returns void
and http://pubs.opengroup.org/onlinepubs/007908799/xsh/free.html says no
error codes are defined for it.

We don't preserve errno in numerous error paths.

Ian.
Julien Grall June 12, 2014, 8:47 a.m. UTC | #3
On 12/06/14 08:19, Ian Campbell wrote:
> On Wed, 2014-06-11 at 22:26 +0100, Julien Grall wrote:
>> Hi Ian,
>>
>> On 11/06/14 17:39, Ian Campbell wrote:
>>> Previously we would allocate in batches of up to 4GB worth of pages (allocsz
>>> clamped at 1024*1024 pages) however this would now require 8MB worth of start
>>> for the extents array in populate_one_size. Reduce to just 256*1024 or 1GB
>>> worth of pages (at level 3) or 2MB of stack.
>>
>> I think you can drop this paragraph. You are using calloc rather than
>> the stack in your patch.
>
> Oops, yes. I'll do that on commit unless I have another reason to
> resend.
>
>>> -    return rc;
>>> +    for ( pfn = 0; pfn < nr_pfns; pfn++ )
>>> +        dom->p2m_host[pfn] = base_pfn + pfn;
>>> +
>>> +out:
>>> +    free(extents);
>>
>> Does free preserve errno? I can't find anything saying it... If so we
>> may lose it when there is not enought memory.
>
> I'm not 100% sure but I'm almost certain it is fine. free() returns void
> and http://pubs.opengroup.org/onlinepubs/007908799/xsh/free.html says no
> error codes are defined for it.

Thanks for the answer. Assuming that:

Acked-by: Julien Grall <julien.grall@linaro.org>

Regards,
diff mbox

Patch

diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 75f8363..6351ead 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -30,6 +30,13 @@ 
 #define CONSOLE_PFN_OFFSET 0
 #define XENSTORE_PFN_OFFSET 1
 
+#define LPAE_SHIFT 9
+
+#define PFN_4K_SHIFT  (0)
+#define PFN_2M_SHIFT  (PFN_4K_SHIFT+LPAE_SHIFT)
+#define PFN_1G_SHIFT  (PFN_2M_SHIFT+LPAE_SHIFT)
+#define PFN_512G_SHIFT (PFN_1G_SHIFT+LPAE_SHIFT)
+
 /* get guest IO ABI protocol */
 const char *xc_domain_get_native_protocol(xc_interface *xch,
                                           uint32_t domid)
@@ -249,11 +256,60 @@  static int set_mode(xc_interface *xch, domid_t domid, char *guest_type)
     return rc;
 }
 
+/*  >0: success, *nr_pfns set to number actually populated
+ *   0: didn't try with this pfn shift (e.g. misaligned base etc)
+ *  <0: ERROR
+ */
+static int populate_one_size(struct xc_dom_image *dom, int pfn_shift,
+                             xen_pfn_t base_pfn, xen_pfn_t *nr_pfns,
+                             xen_pfn_t *extents)
+{
+    uint64_t mask = ((uint64_t)1<<(pfn_shift))-1;
+    uint64_t next_mask = ((uint64_t)1<<(LPAE_SHIFT))-1;
+    int nr, i, count = *nr_pfns >> pfn_shift;
+
+    /* No level zero super pages with current hardware */
+    if ( pfn_shift == PFN_512G_SHIFT )
+        return 0;
+
+    /* Nothing to allocate */
+    if ( !count )
+        return 0;
+
+    /* base is misaligned for this level */
+    if ( mask & base_pfn )
+        return 0;
+
+    /* align to the end of a super page at this level */
+    if ( count & next_mask )
+        count &= next_mask;
+
+    for ( i = 0 ; i < count ; i ++ )
+        extents[i] = base_pfn + (i<<pfn_shift);
+
+    nr = xc_domain_populate_physmap(dom->xch, dom->guest_domid, count,
+                                    pfn_shift, 0, extents);
+    if ( nr <= 0 ) return nr;
+    DOMPRINTF("%s: populated %#x/%#x entries with shift %d",
+              __FUNCTION__, nr, count, pfn_shift);
+
+    *nr_pfns = nr << pfn_shift;
+
+    return 1;
+}
+
 static int populate_guest_memory(struct xc_dom_image *dom,
                                  xen_pfn_t base_pfn, xen_pfn_t nr_pfns)
 {
-    int rc;
-    xen_pfn_t allocsz, pfn;
+    int rc = 0;
+    xen_pfn_t allocsz, pfn, *extents;
+
+    extents = calloc(1024*1024,sizeof(xen_pfn_t));
+    if ( extents == NULL )
+    {
+        DOMPRINTF("%s: Unable to allocate extent array", __FUNCTION__);
+        return -1;
+    }
 
     DOMPRINTF("%s: populating RAM @ %016"PRIx64"-%016"PRIx64" (%"PRId64"MB)",
               __FUNCTION__,
@@ -261,21 +317,56 @@  static int populate_guest_memory(struct xc_dom_image *dom,
               (uint64_t)(base_pfn + nr_pfns) << XC_PAGE_SHIFT,
               (uint64_t)nr_pfns >> (20-XC_PAGE_SHIFT));
 
-    for ( pfn = 0; pfn < nr_pfns; pfn++ )
-        dom->p2m_host[pfn] = base_pfn + pfn;
-
-    for ( pfn = rc = allocsz = 0; (pfn < nr_pfns) && !rc; pfn += allocsz )
+    for ( pfn = 0; pfn < nr_pfns; pfn += allocsz )
     {
-        allocsz = nr_pfns - pfn;
-        if ( allocsz > 1024*1024 )
-            allocsz = 1024*1024;
-
-        rc = xc_domain_populate_physmap_exact(
-            dom->xch, dom->guest_domid, allocsz,
-            0, 0, &dom->p2m_host[pfn]);
+        allocsz = min_t(int, 1024*1024, nr_pfns - pfn);
+#if 0 /* Enable this to exercise/debug the code which tries to realign
+       * to a superpage boundary, by misaligning at the start. */
+        if ( pfn == 0 )
+        {
+            allocsz = 1;
+            rc = populate_one_size(dom, PFN_4K_SHIFT,
+                                   base_pfn + pfn, &allocsz, extents);
+            if (rc < 0) break;
+            if (rc > 0) continue;
+            /* Failed to allocate a single page? */
+            break;
+        }
+#endif
+
+        rc = populate_one_size(dom, PFN_512G_SHIFT,
+                               base_pfn + pfn, &allocsz, extents);
+        if ( rc < 0 ) break;
+        if ( rc > 0 ) continue;
+
+        rc = populate_one_size(dom, PFN_1G_SHIFT,
+                               base_pfn + pfn, &allocsz, extents);
+        if ( rc < 0 ) break;
+        if ( rc > 0 ) continue;
+
+        rc = populate_one_size(dom, PFN_2M_SHIFT,
+                               base_pfn + pfn, &allocsz, extents);
+        if ( rc < 0 ) break;
+        if ( rc > 0 ) continue;
+
+        rc = populate_one_size(dom, PFN_4K_SHIFT,
+                               base_pfn + pfn, &allocsz, extents);
+        if ( rc < 0 ) break;
+        if ( rc == 0 )
+        {
+            DOMPRINTF("%s: Not enough RAM", __FUNCTION__);
+            errno = ENOMEM;
+            rc = -1;
+            goto out;
+        }
     }
 
-    return rc;
+    for ( pfn = 0; pfn < nr_pfns; pfn++ )
+        dom->p2m_host[pfn] = base_pfn + pfn;
+
+out:
+    free(extents);
+    return rc < 0 ? rc : 0;
 }
 
 int arch_setup_meminit(struct xc_dom_image *dom)