From patchwork Thu Feb 10 14:11:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Zimmermann X-Patchwork-Id: 541577 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 085CEC433EF for ; Thu, 10 Feb 2022 14:11:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239028AbiBJOLZ (ORCPT ); Thu, 10 Feb 2022 09:11:25 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:32966 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235081AbiBJOLY (ORCPT ); Thu, 10 Feb 2022 09:11:24 -0500 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D135C1 for ; Thu, 10 Feb 2022 06:11:26 -0800 (PST) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 05C2921114; Thu, 10 Feb 2022 14:11:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1644502285; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2m0LxcQyStkLKfwrGKW8CTnydl47YLuxekTjHFPor3k=; b=p3eeCS/JtdPquyZV0tmRg/kuPMQ3S/X0zZMON8eA2agA7jU4PBpgyd+Jya0v2EhYUXZSsC IZ+/zwAAu9OC283sGY4qzdtynFR+Yota83D29n4GnOLHcHL2poKR1SqVQ8xGBPNDxv4/mE NKPqKFWXRLJoJelWRsP6ZcNSs7aaOKg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1644502285; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2m0LxcQyStkLKfwrGKW8CTnydl47YLuxekTjHFPor3k=; b=wZ0jBTAVWnHk37m+aYTSRrx/sJMEchtxBXEeMhAt4v2O1eBNqJfP4wp+GGKmy/JAISQfTP aBZYLG60w5pIUdCQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id C05C913B9E; Thu, 10 Feb 2022 14:11:24 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id IP8HLgwdBWIPaQAAMHmgww (envelope-from ); Thu, 10 Feb 2022 14:11:24 +0000 From: Thomas Zimmermann To: daniel@ffwll.ch, javierm@redhat.com, noralf@tronnes.org, andriy.shevchenko@linux.intel.com, deller@gmx.de, bernie@plugable.com, jayalk@intworks.biz Cc: dri-devel@lists.freedesktop.org, linux-fbdev@vger.kernel.org, linux-staging@lists.linux.dev, Thomas Zimmermann Subject: [PATCH 1/2] fbdev/defio: Early-out if page is already enlisted Date: Thu, 10 Feb 2022 15:11:11 +0100 Message-Id: <20220210141111.5231-2-tzimmermann@suse.de> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220210141111.5231-1-tzimmermann@suse.de> References: <20220210141111.5231-1-tzimmermann@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fbdev@vger.kernel.org Return early if a page is already in the list of dirty pages for deferred I/O. This can be detected if the page's list head is not empty. Keep the list head initialized while the page is not enlisted to make this work reliably. Signed-off-by: Thomas Zimmermann Acked-by: Sam Ravnborg --- drivers/video/fbdev/core/fb_defio.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index a591d291b231..3727b1ca87b1 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -59,6 +59,7 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf) printk(KERN_ERR "no mapping available\n"); BUG_ON(!page->mapping); + INIT_LIST_HEAD(&page->lru); page->index = vmf->pgoff; vmf->page = page; @@ -122,17 +123,19 @@ static vm_fault_t fb_deferred_io_mkwrite(struct vm_fault *vmf) */ lock_page(page); + /* + * This check is to catch the case where a new process could start + * writing to the same page through a new pte. this new access + * can cause the mkwrite even when the original ps's pte is marked + * writable. + */ + if (!list_empty(&page->lru)) + goto page_already_added; + /* we loop through the pagelist before adding in order to keep the pagelist sorted */ list_for_each_entry(cur, &fbdefio->pagelist, lru) { - /* this check is to catch the case where a new - process could start writing to the same page - through a new pte. this new access can cause the - mkwrite even when the original ps's pte is marked - writable */ - if (unlikely(cur == page)) - goto page_already_added; - else if (cur->index > page->index) + if (cur->index > page->index) break; } @@ -194,7 +197,7 @@ static void fb_deferred_io_work(struct work_struct *work) /* clear the list */ list_for_each_safe(node, next, &fbdefio->pagelist) { - list_del(node); + list_del_init(node); } mutex_unlock(&fbdefio->lock); } From patchwork Thu Feb 10 14:11:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Zimmermann X-Patchwork-Id: 542239 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6F5BC433EF for ; Thu, 10 Feb 2022 14:11:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242645AbiBJOLf (ORCPT ); Thu, 10 Feb 2022 09:11:35 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:33036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235081AbiBJOLe (ORCPT ); Thu, 10 Feb 2022 09:11:34 -0500 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF59BC1 for ; Thu, 10 Feb 2022 06:11:35 -0800 (PST) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id AD1A021114; Thu, 10 Feb 2022 14:11:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1644502294; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H3S4RvJdlP5BIh8B4OhTvpSZaUM6LEPYp7U56LaQG7I=; b=xWJ4PK0MiulEMPfYU/6DM7Ecy99FQj6f4dEkSoi088vMDrvTcM1nKceMuXjAO6SGuueQ3V qedkd72SUgQCoEo67E5GHGJuL3vMl4pn+omSMt5KRVgbw6TYzYtneSWAU3uryhXFFbfI+q 9BnfuVDehOAV+4HndxR4xcJ2CeOuyQY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1644502294; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H3S4RvJdlP5BIh8B4OhTvpSZaUM6LEPYp7U56LaQG7I=; b=RbYaO040hRZcViS8llm9DihB/x6+aIeyZmFNUhh0lbSR8PxTbqcXSsJO7lmV4BfEMmT6Rk QSAhlzBaKRbfeODg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 6981D13B9E; Thu, 10 Feb 2022 14:11:34 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 4MrNGBYdBWIPaQAAMHmgww (envelope-from ); Thu, 10 Feb 2022 14:11:34 +0000 From: Thomas Zimmermann To: daniel@ffwll.ch, javierm@redhat.com, noralf@tronnes.org, andriy.shevchenko@linux.intel.com, deller@gmx.de, bernie@plugable.com, jayalk@intworks.biz Cc: dri-devel@lists.freedesktop.org, linux-fbdev@vger.kernel.org, linux-staging@lists.linux.dev, Thomas Zimmermann Subject: [PATCH 2/2] fbdev: Don't sort deferred-I/O pages by default Date: Thu, 10 Feb 2022 15:11:13 +0100 Message-Id: <20220210141111.5231-3-tzimmermann@suse.de> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220210141111.5231-1-tzimmermann@suse.de> References: <20220210141111.5231-1-tzimmermann@suse.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fbdev@vger.kernel.org Fbdev's deferred I/O sorts all dirty pages by default, which incurs a significant overhead. Make the sorting step optional and update the few drivers that require it. Use a FIFO list by default. Sorting pages by memory offset for deferred I/O performs an implicit bubble-sort step on the list of dirty pages. The algorithm goes through the list of dirty pages and inserts each new page according to its index field. Even worse, list traversal always starts at the first entry. As video memory is most likely updated scanline by scanline, the algorithm traverses through the complete list for each updated page. For example, with 1024x768x32bpp a page covers exactly one scanline. Writing a single screen update from top to bottom requires updating 768 pages. With an average list length of 384 entries, a screen update creates (768 * 384 =) 294912 compare operation. Fix this by making the sorting step opt-in and update the few drivers that require it. All other drivers work with unsorted page lists. Pages are appended to the list. Therefore, in the common case of writing the framebuffer top to bottom, pages are still sorted by offset, which may have a positive effect on performance. Playing a video [1] in mplayer's benchmark mode shows the difference (i7-4790, FullHD, simpledrm, kernel with debugging). mplayer -benchmark -nosound -vo fbdev ./big_buck_bunny_720p_stereo.ogg With sorted page lists: BENCHMARKs: VC: 32.960s VO: 73.068s A: 0.000s Sys: 2.413s = 108.441s BENCHMARK%: VC: 30.3947% VO: 67.3802% A: 0.0000% Sys: 2.2251% = 100.0000% With unsorted page lists: BENCHMARKs: VC: 31.005s VO: 42.889s A: 0.000s Sys: 2.256s = 76.150s BENCHMARK%: VC: 40.7156% VO: 56.3219% A: 0.0000% Sys: 2.9625% = 100.0000% VC shows the overhead of video decoding, VO shows the overhead of the video output. Using unsorted page lists reduces the benchmark's run time by ~32s/~25%. Signed-off-by: Thomas Zimmermann Link: https://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_720p_stereo.ogg # [1] --- drivers/staging/fbtft/fbtft-core.c | 1 + drivers/video/fbdev/broadsheetfb.c | 1 + drivers/video/fbdev/core/fb_defio.c | 19 ++++++++++++------- drivers/video/fbdev/metronomefb.c | 1 + drivers/video/fbdev/udlfb.c | 1 + include/linux/fb.h | 1 + 6 files changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c index f2684d2d6851..4a35347b3020 100644 --- a/drivers/staging/fbtft/fbtft-core.c +++ b/drivers/staging/fbtft/fbtft-core.c @@ -654,6 +654,7 @@ struct fb_info *fbtft_framebuffer_alloc(struct fbtft_display *display, fbops->fb_blank = fbtft_fb_blank; fbdefio->delay = HZ / fps; + fbdefio->sort_pagelist = true; fbdefio->deferred_io = fbtft_deferred_io; fb_deferred_io_init(info); diff --git a/drivers/video/fbdev/broadsheetfb.c b/drivers/video/fbdev/broadsheetfb.c index fd66f4d4a621..b9054f658838 100644 --- a/drivers/video/fbdev/broadsheetfb.c +++ b/drivers/video/fbdev/broadsheetfb.c @@ -1059,6 +1059,7 @@ static const struct fb_ops broadsheetfb_ops = { static struct fb_deferred_io broadsheetfb_defio = { .delay = HZ/4, + .sort_pagelist = true, .deferred_io = broadsheetfb_dpy_deferred_io, }; diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c index 3727b1ca87b1..1f672cf253b2 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -132,15 +132,20 @@ static vm_fault_t fb_deferred_io_mkwrite(struct vm_fault *vmf) if (!list_empty(&page->lru)) goto page_already_added; - /* we loop through the pagelist before adding in order - to keep the pagelist sorted */ - list_for_each_entry(cur, &fbdefio->pagelist, lru) { - if (cur->index > page->index) - break; + if (fbdefio->sort_pagelist) { + /* + * We loop through the pagelist before adding in order + * to keep the pagelist sorted. + */ + list_for_each_entry(cur, &fbdefio->pagelist, lru) { + if (cur->index > page->index) + break; + } + list_add_tail(&page->lru, &cur->lru); + } else { + list_add_tail(&page->lru, &fbdefio->pagelist); } - list_add_tail(&page->lru, &cur->lru); - page_already_added: mutex_unlock(&fbdefio->lock); diff --git a/drivers/video/fbdev/metronomefb.c b/drivers/video/fbdev/metronomefb.c index 952826557a0c..af858dd23ea6 100644 --- a/drivers/video/fbdev/metronomefb.c +++ b/drivers/video/fbdev/metronomefb.c @@ -568,6 +568,7 @@ static const struct fb_ops metronomefb_ops = { static struct fb_deferred_io metronomefb_defio = { .delay = HZ, + .sort_pagelist = true, .deferred_io = metronomefb_dpy_deferred_io, }; diff --git a/drivers/video/fbdev/udlfb.c b/drivers/video/fbdev/udlfb.c index b9cdd02c1000..184bb8433b78 100644 --- a/drivers/video/fbdev/udlfb.c +++ b/drivers/video/fbdev/udlfb.c @@ -980,6 +980,7 @@ static int dlfb_ops_open(struct fb_info *info, int user) if (fbdefio) { fbdefio->delay = DL_DEFIO_WRITE_DELAY; + fbdefio->sort_pagelist = true; fbdefio->deferred_io = dlfb_dpy_deferred_io; } diff --git a/include/linux/fb.h b/include/linux/fb.h index 3d7306c9a706..9a77ab615c36 100644 --- a/include/linux/fb.h +++ b/include/linux/fb.h @@ -204,6 +204,7 @@ struct fb_pixmap { struct fb_deferred_io { /* delay between mkwrite and deferred handler */ unsigned long delay; + bool sort_pagelist; /* sort pagelist by offset */ struct mutex lock; /* mutex that protects the page list */ struct list_head pagelist; /* list of touched pages */ /* callback */