From patchwork Tue May 20 10:26:55 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 891438
From: Tomeu Vizoso
Date: Tue, 20 May 2025 12:26:55 +0200
Subject: [PATCH v5 02/10] arm64: dts: rockchip: Add nodes for NPU and its MMU
 to rk3588s
Precedence: bulk
X-Mailing-List: linux-media@vger.kernel.org
Message-Id: <20250520-6-10-rocket-v5-2-18c9ca0fcb3c@tomeuvizoso.net>
References: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
In-Reply-To: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
 Tomeu Vizoso
X-Mailer: b4 0.14.2

See Chapter 36 "RKNN" from the RK3588 TRM (Part 1).

This is a derivative of NVIDIA's NVDLA, but with its own front-end
processor.

The IP is divided into three cores, programmed independently. The first
core is special, though: it must be powered on before any of the others
can be used.

The IOMMU of the first core is also special in that it has two subunits
(read/write?) that need to be programmed in sync.

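
The power-on dependency between the first core and the others can be
pictured with a small userspace model in C. This is only a sketch under
stated assumptions, not driver code: `struct npu_core`, `npu_core_get()`
and `npu_core_put()` are hypothetical names, and the reference counting
merely mimics what a runtime-PM supplier relationship would provide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one NPU core; not a kernel structure. */
struct npu_core {
	struct npu_core *supplier;	/* core that must be powered first, or NULL */
	unsigned int usage;		/* number of active users */
	int powered;			/* 1 while the core is powered */
};

/* Take a reference, powering the supplier before the core itself. */
static void npu_core_get(struct npu_core *core)
{
	if (core->usage++ > 0)
		return;
	if (core->supplier)
		npu_core_get(core->supplier);
	core->powered = 1;
}

/* Drop a reference, powering the core off before its supplier. */
static void npu_core_put(struct npu_core *core)
{
	assert(core->usage > 0);
	if (--core->usage > 0)
		return;
	core->powered = 0;
	if (core->supplier)
		npu_core_put(core->supplier);
}
```

With the first core set as the supplier of the other two, taking a
reference on either secondary core powers the first core as a side
effect, and the first core only powers down once both are released.
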
v2:
- Have one device for each NPU core (Sebastian Reichel)
- Have one device for each IOMMU (Sebastian Reichel)
- Correctly sort nodes (Diederik de Haas)
- Add rockchip,iommu compatible to IOMMU nodes (Sebastian Reichel)

v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
  Frattaroli)

v4:
- Adapt to changes in bindings

Signed-off-by: Tomeu Vizoso
---
 arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 85 +++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 1e18ad93ba0ebdad31642b88ff0f90ef4e8dc76f..f5e58851047e80b23f9ff3244692ad868ddc1ff6 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -1136,6 +1136,91 @@ power-domain@RK3588_PD_SDMMC {
 		};
 	};
 
+	rknn_core_top: npu@fdab0000 {
+		compatible = "rockchip,rk3588-rknn-core-top";
+		reg = <0x0 0xfdab0000 0x0 0x1000>,
+		      <0x0 0xfdab1000 0x0 0x1000>,
+		      <0x0 0xfdab3000 0x0 0x1000>;
+		reg-names = "pc", "cna", "core";
+		interrupts = ;
+		clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>,
+			 <&scmi_clk SCMI_CLK_NPU>, <&cru PCLK_NPU_ROOT>;
+		clock-names = "aclk", "hclk", "npu", "pclk";
+		assigned-clocks = <&scmi_clk SCMI_CLK_NPU>;
+		assigned-clock-rates = <200000000>;
+		resets = <&cru SRST_A_RKNN0>, <&cru SRST_H_RKNN0>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3588_PD_NPUTOP>;
+		iommus = <&rknn_mmu_top>;
+		status = "disabled";
+	};
+
+	rknn_mmu_top: iommu@fdab9000 {
+		compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+		reg = <0x0 0xfdab9000 0x0 0x100>,
+		      <0x0 0xfdaba000 0x0 0x100>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU0>, <&cru HCLK_NPU0>;
+		clock-names = "aclk", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_NPUTOP>;
+		status = "disabled";
+	};
+
+	rknn_core_1: npu@fdac0000 {
+		compatible = "rockchip,rk3588-rknn-core";
+		reg = <0x0 0xfdac0000 0x0 0x1000>,
+		      <0x0 0xfdac1000 0x0 0x1000>,
+		      <0x0 0xfdac3000 0x0 0x1000>;
+		reg-names = "pc", "cna", "core";
+		interrupts = ;
+		clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+		clock-names = "aclk", "hclk";
+		resets = <&cru SRST_A_RKNN1>, <&cru SRST_H_RKNN1>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3588_PD_NPU1>;
+		iommus = <&rknn_mmu_1>;
+		status = "disabled";
+	};
+
+	rknn_mmu_1: iommu@fdac9000 {
+		compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+		reg = <0x0 0xfdaca000 0x0 0x100>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU1>, <&cru HCLK_NPU1>;
+		clock-names = "aclk", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_NPU1>;
+		status = "disabled";
+	};
+
+	rknn_core_2: npu@fdad0000 {
+		compatible = "rockchip,rk3588-rknn-core";
+		reg = <0x0 0xfdad0000 0x0 0x1000>,
+		      <0x0 0xfdad1000 0x0 0x1000>,
+		      <0x0 0xfdad3000 0x0 0x1000>;
+		reg-names = "pc", "cna", "core";
+		interrupts = ;
+		clocks = <&cru ACLK_NPU2>, <&cru HCLK_NPU2>;
+		clock-names = "aclk", "hclk";
+		resets = <&cru SRST_A_RKNN2>, <&cru SRST_H_RKNN2>;
+		reset-names = "srst_a", "srst_h";
+		power-domains = <&power RK3588_PD_NPU2>;
+		iommus = <&rknn_mmu_2>;
+		status = "disabled";
+	};
+
+	rknn_mmu_2: iommu@fdad9000 {
+		compatible = "rockchip,rk3588-iommu", "rockchip,rk3568-iommu";
+		reg = <0x0 0xfdada000 0x0 0x100>;
+		interrupts = ;
+		clocks = <&cru ACLK_NPU2>, <&cru HCLK_NPU2>;
+		clock-names = "aclk", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_NPU2>;
+		status = "disabled";
+	};
+
 	vpu121: video-codec@fdb50000 {
 		compatible = "rockchip,rk3588-vpu121", "rockchip,rk3568-vpu";
 		reg = <0x0 0xfdb50000 0x0 0x800>;

From patchwork Tue May 20 10:26:58 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 891437
From: Tomeu Vizoso
Date: Tue, 20 May 2025 12:26:58 +0200
Subject: [PATCH v5 05/10] accel/rocket: Add a new driver for Rockchip's NPU
Precedence: bulk
X-Mailing-List: linux-media@vger.kernel.org
Message-Id: <20250520-6-10-rocket-v5-5-18c9ca0fcb3c@tomeuvizoso.net>
References: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
In-Reply-To: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
 Tomeu Vizoso
X-Mailer: b4 0.14.2

This initial version supports the NPU as shipped in the RK3588 SoC and
described in the first part of its TRM, in Chapter 36.

This NPU contains 3 independent cores that the driver can submit jobs
to.

This commit adds just hardware initialization and power management.

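
The power management part boils down to enabling clocks in a fixed
order and undoing that order on failure or on suspend. The following is
a minimal userspace sketch in C of that pattern, assuming one shared
clock group owned by core 0 and one clock group per core. The names
`struct clk_group`, `core_resume()` and `core_suspend()` are
hypothetical and stand in for the kernel clk API; they are not part of
the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a group of clocks; not the kernel clk API. */
struct clk_group {
	bool enabled;
	bool fail;	/* when true, enabling this group fails */
};

static int clk_group_enable(struct clk_group *g)
{
	if (g->fail)
		return -1;
	g->enabled = true;
	return 0;
}

static void clk_group_disable(struct clk_group *g)
{
	g->enabled = false;
}

/*
 * Resume one core: the shared device clocks are only handled for core 0,
 * and they are released again if the per-core clocks cannot be enabled.
 */
static int core_resume(unsigned int index, struct clk_group *dev_clks,
		       struct clk_group *core_clks)
{
	int err;

	if (index == 0) {
		err = clk_group_enable(dev_clks);
		if (err)
			return err;
	}

	err = clk_group_enable(core_clks);
	if (err && index == 0)
		clk_group_disable(dev_clks);

	return err;
}

/* Suspend mirrors resume in reverse order. */
static void core_suspend(unsigned int index, struct clk_group *dev_clks,
			 struct clk_group *core_clks)
{
	clk_group_disable(core_clks);
	if (index == 0)
		clk_group_disable(dev_clks);
}
```

The point of the unwind in `core_resume()` is that a failed resume
leaves no clock enabled, so a later retry starts from a known state.
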
v2:
- Split cores and IOMMUs as independent devices (Sebastian Reichel)
- Add some documentation (Jeffrey Hugo)
- Be more explicit in the Kconfig documentation (Jeffrey Hugo)
- Remove resets, as these haven't been found useful so far (Zenghui Yu)
- Repack structs (Jeffrey Hugo)
- Use DEFINE_DRM_ACCEL_FOPS (Jeffrey Hugo)
- Use devm_drm_dev_alloc (Jeffrey Hugo)
- Use probe log helper (Jeffrey Hugo)
- Introduce UABI header in a later patch (Jeffrey Hugo)

v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
  Frattaroli)
- Move registers header to its own commit (Thomas Zimmermann)
- Misc. cleanups (Thomas Zimmermann and Jeff Hugo)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- PM improvements (Nicolas Frattaroli)

v4:
- Use bulk clk API (Krzysztof Kozlowski)

Signed-off-by: Tomeu Vizoso
---
 Documentation/accel/index.rst        |   1 +
 Documentation/accel/rocket/index.rst |  25 +++
 MAINTAINERS                          |  10 ++
 drivers/accel/Kconfig                |   1 +
 drivers/accel/Makefile               |   1 +
 drivers/accel/rocket/Kconfig         |  25 +++
 drivers/accel/rocket/Makefile        |   8 +
 drivers/accel/rocket/rocket_core.c   |  70 +++++++++
 drivers/accel/rocket/rocket_core.h   |  45 ++++++
 drivers/accel/rocket/rocket_device.c |  30 ++++
 drivers/accel/rocket/rocket_device.h |  27 ++++
 drivers/accel/rocket/rocket_drv.c    | 294 +++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_drv.h    |  13 ++
 13 files changed, 550 insertions(+)

diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index bc85f26533d88891dde482f91e26c99991b22869..d8fa332d60a890dbb617454d2a26d9b6f9b196aa 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -10,6 +10,7 @@ Compute Accelerators
    introduction
    amdxdna/index
    qaic/index
+   rocket/index
 
 .. only:: subproject and html
 
diff --git a/Documentation/accel/rocket/index.rst b/Documentation/accel/rocket/index.rst
new file mode 100644
index 0000000000000000000000000000000000000000..a3389f9a284c0975bc201f6e09082c01970e08a3
--- /dev/null
+++ b/Documentation/accel/rocket/index.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+=====================================
+ accel/rocket Rockchip NPU driver
+=====================================
+
+The accel/rocket driver supports the Neural Processing Units (NPUs) inside some
+Rockchip SoCs such as the RK3588. Rockchip calls it RKNN and sometimes RKNPU.
+
+This NPU is closely based on the NVDLA IP released by NVIDIA as open hardware in
+2018, along with open source kernel and userspace drivers.
+
+The frontend unit in Rockchip's NPU though is completely different from that in
+the open source IP, so this kernel driver is specific to Rockchip's version.
+
+The hardware is described in chapter 36 in the RK3588 TRM.
+
+This driver just powers the hardware on and off, allocates and maps buffers to
+the device and submits jobs to the frontend unit. Everything else is done in
+userspace, as a Gallium driver (also called rocket) that is part of the Mesa3D
+project.
+
+Hardware currently supported:
+
+* RK3588
\ No newline at end of file
diff --git a/MAINTAINERS b/MAINTAINERS
index 96b82704950184bd71623ff41fc4df31e4c7fe87..2d8833bf1f2db06ca624d703f19066adab2f9fde 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7263,6 +7263,16 @@ T:	git https://gitlab.freedesktop.org/drm/misc/kernel.git
 F:	drivers/accel/ivpu/
 F:	include/uapi/drm/ivpu_accel.h
 
+DRM ACCEL DRIVER FOR ROCKCHIP NPU
+M:	Tomeu Vizoso
+L:	dri-devel@lists.freedesktop.org
+S:	Supported
+T:	git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F:	Documentation/accel/rocket/
+F:	Documentation/devicetree/bindings/npu/rockchip,rknn-core.yaml
+F:	drivers/accel/rocket/
+F:	include/uapi/drm/rocket_accel.h
+
 DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
 M:	Oded Gabbay
 L:	dri-devel@lists.freedesktop.org
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index 5b9490367a39fd12d35a8d9021768aa186c09308..bb01cebc42bf16ebf02e938040f339ff94869e33 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -28,5 +28,6 @@ source "drivers/accel/amdxdna/Kconfig"
 source "drivers/accel/habanalabs/Kconfig"
 source "drivers/accel/ivpu/Kconfig"
 source "drivers/accel/qaic/Kconfig"
+source "drivers/accel/rocket/Kconfig"
 
 endif
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index a301fb6089d4c515430175c5e2ba9190f6dc9158..ffc3fa58866616d933184a7659573cd4d4780a8d 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_DRM_ACCEL_AMDXDNA)		+= amdxdna/
 obj-$(CONFIG_DRM_ACCEL_HABANALABS)	+= habanalabs/
 obj-$(CONFIG_DRM_ACCEL_IVPU)		+= ivpu/
 obj-$(CONFIG_DRM_ACCEL_QAIC)		+= qaic/
+obj-$(CONFIG_DRM_ACCEL_ROCKET)		+= rocket/
\ No newline at end of file
diff --git a/drivers/accel/rocket/Kconfig b/drivers/accel/rocket/Kconfig
new file mode 100644
index 0000000000000000000000000000000000000000..9a59c6c61bf4d6460d8008b16331f001c97de67d
--- /dev/null
+++ b/drivers/accel/rocket/Kconfig
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config DRM_ACCEL_ROCKET
+	tristate "Rocket (support for Rockchip NPUs)"
+	depends on DRM
+	depends on ARM64 || COMPILE_TEST
+	depends on MMU
+	select DRM_SCHED
+	select IOMMU_SUPPORT
+	select IOMMU_IO_PGTABLE_LPAE
+	select DRM_GEM_SHMEM_HELPER
+	help
+	  Choose this option if you have a Rockchip SoC that contains a
+	  compatible Neural Processing Unit (NPU), such as the RK3588. Called by
+	  Rockchip either RKNN or RKNPU, it accelerates inference of neural
+	  networks.
+
+	  The interface exposed to userspace is described in
+	  include/uapi/drm/rocket_accel.h and is used by the Rocket userspace
+	  driver in Mesa3D.
+
+	  If unsure, say N.
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called rocket.
diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..abdd75f2492eaecf8bf5e78a2ac150ea19ac3e96
--- /dev/null
+++ b/drivers/accel/rocket/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_DRM_ACCEL_ROCKET) := rocket.o
+
+rocket-y := \
+	rocket_core.o \
+	rocket_device.o \
+	rocket_drv.o
diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
new file mode 100644
index 0000000000000000000000000000000000000000..a852ad7874b9c161963b1aa5f0fc2720c84738a6
--- /dev/null
+++ b/drivers/accel/rocket/rocket_core.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#include
+#include
+#include
+#include
+#include
+
+#include "rocket_core.h"
+
+int rocket_core_init(struct rocket_core *core)
+{
+	struct device *dev = core->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	u32 version;
+	int err = 0;
+
+	err = devm_clk_bulk_get(dev, ARRAY_SIZE(core->clks), core->clks);
+	if (err)
+		return dev_err_probe(dev, err, "failed to get clocks for core %d\n", core->index);
+
+	core->pc_iomem = devm_platform_ioremap_resource_byname(pdev, "pc");
+	if (IS_ERR(core->pc_iomem)) {
+		dev_err(dev, "couldn't find PC registers %ld\n", PTR_ERR(core->pc_iomem));
+		return PTR_ERR(core->pc_iomem);
+	}
+
+	core->cna_iomem = devm_platform_ioremap_resource_byname(pdev, "cna");
+	if (IS_ERR(core->cna_iomem)) {
+		dev_err(dev, "couldn't find CNA registers %ld\n", PTR_ERR(core->cna_iomem));
+		return PTR_ERR(core->cna_iomem);
+	}
+
+	core->core_iomem = devm_platform_ioremap_resource_byname(pdev, "core");
+	if (IS_ERR(core->core_iomem)) {
+		dev_err(dev, "couldn't find CORE registers %ld\n", PTR_ERR(core->core_iomem));
+		return PTR_ERR(core->core_iomem);
+	}
+
+	pm_runtime_use_autosuspend(dev);
+
+	/*
+	 * As this NPU will be most often used as part of a media pipeline that
+	 * ends presenting in a display, choose 50 ms (~3 frames at 60Hz) as an
+	 * autosuspend delay as that will keep the device powered up while the
+	 * pipeline is running.
+	 */
+	pm_runtime_set_autosuspend_delay(dev, 50);
+
+	pm_runtime_enable(dev);
+
+	err = pm_runtime_get_sync(dev);
+
+	version = rocket_pc_readl(core, VERSION);
+	version += rocket_pc_readl(core, VERSION_NUM) & 0xffff;
+
+	pm_runtime_mark_last_busy(dev);
+	pm_runtime_put_autosuspend(dev);
+
+	dev_info(dev, "Rockchip NPU core %d version: %d\n", core->index, version);
+
+	return 0;
+}
+
+void rocket_core_fini(struct rocket_core *core)
+{
+	pm_runtime_dont_use_autosuspend(core->dev);
+	pm_runtime_disable(core->dev);
+}
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
new file mode 100644
index 0000000000000000000000000000000000000000..ec89b8b5641f9714f157fd777580c98e20b09ec5
--- /dev/null
+++ b/drivers/accel/rocket/rocket_core.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#ifndef __ROCKET_CORE_H__
+#define __ROCKET_CORE_H__
+
+#include
+#include
+#include
+#include
+
+#include "rocket_registers.h"
+
+#define rocket_pc_readl(core, reg) \
+	readl((core)->pc_iomem + (REG_PC_##reg))
+#define rocket_pc_writel(core, reg, value) \
+	writel(value, (core)->pc_iomem + (REG_PC_##reg))
+
+#define rocket_cna_readl(core, reg) \
+	readl((core)->cna_iomem + (REG_CNA_##reg) - REG_CNA_S_STATUS)
+#define rocket_cna_writel(core, reg, value) \
+	writel(value, (core)->cna_iomem + (REG_CNA_##reg) - REG_CNA_S_STATUS)
+
+#define rocket_core_readl(core, reg) \
+	readl((core)->core_iomem + (REG_CORE_##reg) - REG_CORE_S_STATUS)
+#define rocket_core_writel(core, reg, value) \
+	writel(value, (core)->core_iomem + (REG_CORE_##reg) - REG_CORE_S_STATUS)
+
+struct rocket_core {
+	struct device *dev;
+	struct rocket_device *rdev;
+	struct device_link *link;
+	unsigned int index;
+
+	int irq;
+	void __iomem *pc_iomem;
+	void __iomem *cna_iomem;
+	void __iomem *core_iomem;
+	struct clk_bulk_data clks[2];
+};
+
+int rocket_core_init(struct rocket_core *core);
+void rocket_core_fini(struct rocket_core *core);
+
+#endif
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c
new file mode 100644
index 0000000000000000000000000000000000000000..97e32d19a1b4a36177b8039b67b4892887daa880
--- /dev/null
+++ b/drivers/accel/rocket/rocket_device.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#include
+#include
+#include
+
+#include "rocket_device.h"
+
+int rocket_device_init(struct rocket_device *rdev)
+{
+	struct device *dev = rdev->cores[0].dev;
+	int err;
+
+	err = devm_clk_bulk_get(dev, ARRAY_SIZE(rdev->clks), rdev->clks);
+	if (err)
+		return dev_err_probe(dev, err, "failed to get device clocks\n");
+
+	/* Initialize core 0 (top) */
+	err = rocket_core_init(&rdev->cores[0]);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+void rocket_device_fini(struct rocket_device *rdev)
+{
+	rocket_core_fini(&rdev->cores[0]);
+}
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h
new file mode 100644
index 0000000000000000000000000000000000000000..55f4da252cfbd1f102c56e5009472deff59aaaec
--- /dev/null
+++ b/drivers/accel/rocket/rocket_device.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#ifndef __ROCKET_DEVICE_H__
+#define __ROCKET_DEVICE_H__
+
+#include
+#include
+
+#include "rocket_core.h"
+
+struct rocket_device {
+	struct drm_device ddev;
+
+	struct clk_bulk_data clks[2];
+
+	struct rocket_core *cores;
+	unsigned int num_cores;
+};
+
+int rocket_device_init(struct rocket_device *rdev);
+void rocket_device_fini(struct rocket_device *rdev);
+
+#define to_rocket_device(drm_dev) \
+	((struct rocket_device *)container_of(drm_dev, struct rocket_device, ddev))
+
+#endif
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
new file mode 100644
index 0000000000000000000000000000000000000000..d1a1be32760feed864db86963b9942f1e37b17eb
--- /dev/null
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -0,0 +1,294 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "rocket_drv.h"
+
+static int
+rocket_open(struct drm_device *dev, struct drm_file *file)
+{
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct rocket_file_priv *rocket_priv;
+
+	rocket_priv = kzalloc(sizeof(*rocket_priv), GFP_KERNEL);
+	if (!rocket_priv)
+		return -ENOMEM;
+
+	rocket_priv->rdev = rdev;
+	file->driver_priv = rocket_priv;
+
+	return 0;
+}
+
+static void
+rocket_postclose(struct drm_device *dev, struct drm_file *file)
+{
+	struct rocket_file_priv *rocket_priv = file->driver_priv;
+
+	kfree(rocket_priv);
+}
+
+static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
+#define ROCKET_IOCTL(n, func) \
+	DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0)
+};
+
+DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
+
+/*
+ * Rocket driver version:
+ * - 1.0 - initial interface
+ */
+static const struct drm_driver rocket_drm_driver = {
+	.driver_features = DRIVER_COMPUTE_ACCEL,
+	.open = rocket_open,
+	.postclose = rocket_postclose,
+	.ioctls = rocket_drm_driver_ioctls,
+	.num_ioctls = ARRAY_SIZE(rocket_drm_driver_ioctls),
+	.fops = &rocket_accel_driver_fops,
+	.name = "rocket",
+	.desc = "rocket DRM",
+};
+
+static int rocket_drm_bind(struct device *dev)
+{
+	struct device_node *core_node;
+	struct rocket_device *rdev;
+	struct drm_device *ddev;
+	unsigned int num_cores = 1;
+	int err;
+
+	rdev = devm_drm_dev_alloc(dev, &rocket_drm_driver, struct rocket_device, ddev);
+	if (IS_ERR(rdev))
+		return PTR_ERR(rdev);
+
+	ddev = &rdev->ddev;
+	dev_set_drvdata(dev, rdev);
+
+	for_each_compatible_node(core_node, NULL, "rockchip,rk3588-rknn-core")
+		if (of_device_is_available(core_node))
+			num_cores++;
+
+	rdev->cores = devm_kmalloc_array(dev, num_cores, sizeof(*rdev->cores),
+					 GFP_KERNEL | __GFP_ZERO);
+	if (!rdev->cores)
+		return -ENOMEM;
+
+	/* Add core 0, any other cores will be added later when they are bound */
+	rdev->cores[0].rdev = rdev;
+	rdev->cores[0].dev = dev;
+	rdev->cores[0].index = 0;
+	rdev->num_cores = 1;
+
+	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
+	if (err)
+		return err;
+
+	err = rocket_device_init(rdev);
+	if (err) {
+		dev_err_probe(dev, err, "Fatal error during NPU init\n");
+		goto err_device_fini;
+	}
+
+	err = component_bind_all(dev, rdev);
+	if (err)
+		goto err_device_fini;
+
+	err = drm_dev_register(ddev, 0);
+	if (err < 0)
+		goto err_unbind;
+
+	return 0;
+
+err_unbind:
+	component_unbind_all(dev, rdev);
+err_device_fini:
+	rocket_device_fini(rdev);
+	return err;
+}
+
+static void rocket_drm_unbind(struct device *dev)
+{
+	struct rocket_device *rdev = dev_get_drvdata(dev);
+	struct drm_device *ddev = &rdev->ddev;
+
+	drm_dev_unregister(ddev);
+
+	component_unbind_all(dev, rdev);
+
+	rocket_device_fini(rdev);
+}
+
+const struct component_master_ops rocket_drm_ops = {
+	.bind = rocket_drm_bind,
+	.unbind = rocket_drm_unbind,
+};
+
+static int rocket_core_bind(struct device *dev, struct device *master, void *data)
+{
+	struct rocket_device *rdev = data;
+	unsigned int core = rdev->num_cores;
+	int err;
+
+	dev_set_drvdata(dev, rdev);
+
+	rdev->cores[core].rdev = rdev;
+	rdev->cores[core].dev = dev;
+	rdev->cores[core].index = core;
+	rdev->cores[core].link = device_link_add(dev, rdev->cores[0].dev,
+						 DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
+
+	rdev->num_cores++;
+
+	err = rocket_core_init(&rdev->cores[core]);
+	if (err) {
+		rocket_device_fini(rdev);
+		return err;
+	}
+
+	return 0;
+}
+
+static void rocket_core_unbind(struct device *dev, struct device *master, void *data)
+{
+	struct rocket_device *rdev = data;
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		if (rdev->cores[core].dev == dev) {
+			rocket_core_fini(&rdev->cores[core]);
+			device_link_del(rdev->cores[core].link);
+			break;
+		}
+	}
+}
+
+const struct component_ops rocket_core_ops = {
+	.bind = rocket_core_bind,
+	.unbind = rocket_core_unbind,
+};
+
+static int rocket_probe(struct platform_device *pdev)
+{
+	struct component_match *match = NULL;
+	struct device_node *core_node;
+
+	if (fwnode_device_is_compatible(pdev->dev.fwnode, "rockchip,rk3588-rknn-core"))
+		return component_add(&pdev->dev, &rocket_core_ops);
+
+	for_each_compatible_node(core_node, NULL, "rockchip,rk3588-rknn-core") {
+		if (!of_device_is_available(core_node))
+			continue;
+
+		drm_of_component_match_add(&pdev->dev, &match,
+					   component_compare_of, core_node);
+	}
+
+	return component_master_add_with_match(&pdev->dev, &rocket_drm_ops, match);
+}
+
+static void rocket_remove(struct platform_device *pdev)
+{
+	if (fwnode_device_is_compatible(pdev->dev.fwnode, "rockchip,rk3588-rknn-core-top"))
+		component_master_del(&pdev->dev, &rocket_drm_ops);
+	else if (fwnode_device_is_compatible(pdev->dev.fwnode, "rockchip,rk3588-rknn-core"))
+		component_del(&pdev->dev, &rocket_core_ops);
+}
+
+static const struct of_device_id dt_match[] = {
+	{ .compatible = "rockchip,rk3588-rknn-core-top" },
+	{ .compatible = "rockchip,rk3588-rknn-core" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, dt_match);
+
+static int find_core_for_dev(struct device *dev)
+{
+	struct rocket_device *rdev = dev_get_drvdata(dev);
+
+	for (unsigned int core = 0; core < rdev->num_cores; core++) {
+		if (dev == rdev->cores[core].dev)
+			return core;
+	}
+
+	return -1;
+}
+
+static int rocket_device_runtime_resume(struct device *dev)
+{
+	struct rocket_device *rdev = dev_get_drvdata(dev);
+	int core = find_core_for_dev(dev);
+	int err = 0;
+
+	if (core < 0)
+		return -ENODEV;
+
+	if (core == 0) {
+		err = clk_bulk_prepare_enable(ARRAY_SIZE(rdev->clks), rdev->clks);
+		if (err) {
+			dev_err(dev, "failed to enable (%d) device clocks\n", err);
+			return err;
+		}
+	}
+
+	err = clk_bulk_prepare_enable(ARRAY_SIZE(rdev->cores[core].clks), rdev->cores[core].clks);
+	if (err) {
+		dev_err(dev, "failed to enable (%d) clocks for core %d\n", err, core);
+		goto error_dev_clocks;
+	}
+
+	return 0;
+
+error_dev_clocks:
+	if (core == 0)
+		clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->clks), rdev->clks);
+
+	return err;
+}
+
+static int rocket_device_runtime_suspend(struct device *dev)
+{
+	struct rocket_device *rdev = dev_get_drvdata(dev);
+	int core = find_core_for_dev(dev);
+
+	if (core < 0)
+		return -ENODEV;
+
+	clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->cores[core].clks), rdev->cores[core].clks);
+
+	if (core == 0)
+		clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->clks), rdev->clks);
+
+	return 0;
+}
+
+EXPORT_GPL_DEV_PM_OPS(rocket_pm_ops) = {
+	RUNTIME_PM_OPS(rocket_device_runtime_suspend, rocket_device_runtime_resume, NULL)
+	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+};
+
+static struct platform_driver rocket_driver = {
+	.probe = rocket_probe,
+	.remove = rocket_remove,
+	.driver = {
+		.name = "rocket",
+		.pm = pm_ptr(&rocket_pm_ops),
+		.of_match_table = dt_match,
+	},
+};
+module_platform_driver(rocket_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("DRM driver for the Rockchip NPU IP");
+MODULE_AUTHOR("Tomeu Vizoso");
diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h
new file mode 100644
index 0000000000000000000000000000000000000000..bd3a697ab7c8e378967ce638b04d7d86845b53c7
--- /dev/null
+++ b/drivers/accel/rocket/rocket_drv.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#ifndef __ROCKET_DRV_H__
+#define __ROCKET_DRV_H__
+
+#include "rocket_device.h"
+
+struct rocket_file_priv {
+	struct rocket_device *rdev;
+};
+
+#endif

From patchwork Tue May 20 10:27:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 891436
From: Tomeu Vizoso
Date: Tue, 20 May 2025 12:27:00 +0200
Subject: [PATCH v5 07/10] accel/rocket: Add job submission IOCTL
Precedence: bulk
X-Mailing-List: linux-media@vger.kernel.org
Message-Id: <20250520-6-10-rocket-v5-7-18c9ca0fcb3c@tomeuvizoso.net>
References: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
In-Reply-To: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

Use the DRM GPU scheduler infrastructure, with one scheduler per core.

Userspace can request that a series of tasks be executed sequentially on
the same core, so that SRAM locality can be exploited.

The job submission code was initially based on Panfrost.

v2:
- Remove hardcoded number of cores
- Misc.
  style fixes (Jeffrey Hugo)
- Repack IOCTL struct (Jeffrey Hugo)

v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
  Frattaroli)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- Use drm_* logging functions (Thomas Zimmermann)
- Rename reg i/o macros (Thomas Zimmermann)
- Add padding to ioctls and check for zero (Jeff Hugo)
- Improve error handling (Nicolas Frattaroli)

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/Makefile        |   3 +-
 drivers/accel/rocket/rocket_core.c   |  14 +-
 drivers/accel/rocket/rocket_core.h   |  14 +
 drivers/accel/rocket/rocket_device.c |   2 +
 drivers/accel/rocket/rocket_device.h |   2 +
 drivers/accel/rocket/rocket_drv.c    |  15 +
 drivers/accel/rocket/rocket_drv.h    |   4 +
 drivers/accel/rocket/rocket_job.c    | 723 +++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_job.h    |  50 +++
 include/uapi/drm/rocket_accel.h      |  64 ++++
 10 files changed, 888 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/rocket/Makefile b/drivers/accel/rocket/Makefile
index 4deef267f9e1238c4d8bd108dcc8afd9dc8b2b8f..3713dfe223d6ec6293ced3ef9291af2f3d144131 100644
--- a/drivers/accel/rocket/Makefile
+++ b/drivers/accel/rocket/Makefile
@@ -6,4 +6,5 @@ rocket-y := \
 	rocket_core.o \
 	rocket_device.o \
 	rocket_drv.o \
-	rocket_gem.o
+	rocket_gem.o \
+	rocket_job.o
diff --git a/drivers/accel/rocket/rocket_core.c b/drivers/accel/rocket/rocket_core.c
index a852ad7874b9c161963b1aa5f0fc2720c84738a6..b57e10d9938c0f71d0107841244ec969ca9e30e1 100644
--- a/drivers/accel/rocket/rocket_core.c
+++ b/drivers/accel/rocket/rocket_core.c
@@ -8,6 +8,7 @@
 #include
 
 #include "rocket_core.h"
+#include "rocket_job.h"
 
 int rocket_core_init(struct rocket_core *core)
 {
@@ -38,6 +39,10 @@ int rocket_core_init(struct rocket_core *core)
 		return PTR_ERR(core->core_iomem);
 	}
 
+	err = rocket_job_init(core);
+	if (err)
+		return err;
+
 	pm_runtime_use_autosuspend(dev);
 
 	/*
@@ -51,9 +56,13 @@ int rocket_core_init(struct rocket_core *core)
 	pm_runtime_enable(dev);
 
 	err = pm_runtime_get_sync(dev);
+	if (err < 0) {
+		rocket_job_fini(core);
+		return err;
+	}
 
-	version = rocket_pc_read(core, VERSION);
-	version += rocket_pc_read(core, VERSION_NUM) & 0xffff;
+	version = rocket_pc_readl(core, VERSION);
+	version += rocket_pc_readl(core, VERSION_NUM) & 0xffff;
 
 	pm_runtime_mark_last_busy(dev);
 	pm_runtime_put_autosuspend(dev);
@@ -67,4 +76,5 @@ void rocket_core_fini(struct rocket_core *core)
 {
 	pm_runtime_dont_use_autosuspend(core->dev);
 	pm_runtime_disable(core->dev);
+	rocket_job_fini(core);
 }
diff --git a/drivers/accel/rocket/rocket_core.h b/drivers/accel/rocket/rocket_core.h
index ec89b8b5641f9714f157fd777580c98e20b09ec5..311f8714558756166411ea61883eef4323c2b726 100644
--- a/drivers/accel/rocket/rocket_core.h
+++ b/drivers/accel/rocket/rocket_core.h
@@ -37,6 +37,20 @@ struct rocket_core {
 	void __iomem *cna_iomem;
 	void __iomem *core_iomem;
 	struct clk_bulk_data clks[2];
+
+	struct rocket_job *in_flight_job;
+
+	spinlock_t job_lock;
+
+	struct {
+		struct workqueue_struct *wq;
+		struct work_struct work;
+		atomic_t pending;
+	} reset;
+
+	struct drm_gpu_scheduler sched;
+	u64 fence_context;
+	u64 emit_seqno;
 };
 
 int rocket_core_init(struct rocket_core *core);
diff --git a/drivers/accel/rocket/rocket_device.c b/drivers/accel/rocket/rocket_device.c
index ee81810dd171ef1cdb1582c1bbe5099c669e42cc..e8d354995252389c2b9ef42aef07477f9e98cb92 100644
--- a/drivers/accel/rocket/rocket_device.c
+++ b/drivers/accel/rocket/rocket_device.c
@@ -23,12 +23,14 @@ int rocket_device_init(struct rocket_device *rdev)
 		return err;
 
 	mutex_init(&rdev->iommu_lock);
+	mutex_init(&rdev->sched_lock);
 
 	return 0;
 }
 
 void rocket_device_fini(struct rocket_device *rdev)
 {
+	mutex_destroy(&rdev->sched_lock);
 	mutex_destroy(&rdev->iommu_lock);
 	rocket_core_fini(&rdev->cores[0]);
 }
diff --git a/drivers/accel/rocket/rocket_device.h b/drivers/accel/rocket/rocket_device.h
index 2e22aa2b95252a2850a40c3271a91cb3aca578ae..d5df6dd72b6edcb38f240be1794b6762d9002b4d 100644
--- a/drivers/accel/rocket/rocket_device.h
+++ b/drivers/accel/rocket/rocket_device.h
@@ -12,6 +12,8 @@ struct rocket_device {
 	struct drm_device ddev;
 
+	struct mutex sched_lock;
+
 	struct clk_bulk_data clks[2];
 
 	struct mutex iommu_lock;
 
diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index 685499537a0a8a206452b745ff23f9ff170b35db..fef9b93372d3f65c41c1ac35a9bfa0c01ee721a5 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -17,12 +17,14 @@
 
 #include "rocket_drv.h"
 #include "rocket_gem.h"
+#include "rocket_job.h"
 
 static int
 rocket_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct rocket_device *rdev = to_rocket_device(dev);
 	struct rocket_file_priv *rocket_priv;
+	int ret;
 
 	rocket_priv = kzalloc(sizeof(*rocket_priv), GFP_KERNEL);
 	if (!rocket_priv)
@@ -31,7 +33,15 @@ rocket_open(struct drm_device *dev, struct drm_file *file)
 	rocket_priv->rdev = rdev;
 	file->driver_priv = rocket_priv;
 
+	ret = rocket_job_open(rocket_priv);
+	if (ret)
+		goto err_free;
+
 	return 0;
+
+err_free:
+	kfree(rocket_priv);
+	return ret;
 }
 
 static void
@@ -39,6 +49,7 @@ rocket_postclose(struct drm_device *dev, struct drm_file *file)
 {
 	struct rocket_file_priv *rocket_priv = file->driver_priv;
 
+	rocket_job_close(rocket_priv);
 	kfree(rocket_priv);
 }
 
@@ -47,6 +58,7 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(ROCKET_##n, rocket_ioctl_##func, 0)
 
 	ROCKET_IOCTL(CREATE_BO, create_bo),
+	ROCKET_IOCTL(SUBMIT, submit),
 };
 
 DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
@@ -270,6 +282,9 @@ static int rocket_device_runtime_suspend(struct device *dev)
 	if (core < 0)
 		return -ENODEV;
 
+	if (!rocket_job_is_idle(&rdev->cores[core]))
+		return -EBUSY;
+
 	clk_bulk_disable_unprepare(ARRAY_SIZE(rdev->cores[core].clks), rdev->cores[core].clks);
 
 	if (core == 0)
diff --git a/drivers/accel/rocket/rocket_drv.h b/drivers/accel/rocket/rocket_drv.h
index bd3a697ab7c8e378967ce638b04d7d86845b53c7..b4055cfad6bd431b7c59b0848653748ab945615c 100644
--- a/drivers/accel/rocket/rocket_drv.h
+++ b/drivers/accel/rocket/rocket_drv.h
@@ -4,10 +4,14 @@
 #ifndef __ROCKET_DRV_H__
 #define __ROCKET_DRV_H__
 
+#include
+
 #include "rocket_device.h"
 
 struct rocket_file_priv {
 	struct rocket_device *rdev;
+
+	struct drm_sched_entity sched_entity;
 };
 
 #endif
diff --git a/drivers/accel/rocket/rocket_job.c b/drivers/accel/rocket/rocket_job.c
new file mode 100644
index 0000000000000000000000000000000000000000..aee6ebdb2bd227439449fdfcab3ce7d1e39cd4c4
--- /dev/null
+++ b/drivers/accel/rocket/rocket_job.c
@@ -0,0 +1,723 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright 2019 Linaro, Ltd, Rob Herring */
+/* Copyright 2019 Collabora ltd. */
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "rocket_core.h"
+#include "rocket_device.h"
+#include "rocket_drv.h"
+#include "rocket_job.h"
+#include "rocket_registers.h"
+
+#define JOB_TIMEOUT_MS 500
+
+static struct rocket_job *
+to_rocket_job(struct drm_sched_job *sched_job)
+{
+	return container_of(sched_job, struct rocket_job, base);
+}
+
+struct rocket_fence {
+	struct dma_fence base;
+	struct drm_device *dev;
+	/* rocket seqno for signaled() test */
+	u64 seqno;
+	int queue;
+};
+
+#define to_rocket_fence(dma_fence) \
+	((struct rocket_fence *)container_of(dma_fence, struct rocket_fence, base))
+
+static const char *rocket_fence_get_driver_name(struct dma_fence *fence)
+{
+	return "rocket";
+}
+
+static const char *rocket_fence_get_timeline_name(struct dma_fence *fence)
+{
+	return "rockchip-npu";
+}
+
+static const struct dma_fence_ops rocket_fence_ops = {
+	.get_driver_name = rocket_fence_get_driver_name,
+	.get_timeline_name = rocket_fence_get_timeline_name,
+};
+
+static struct dma_fence *rocket_fence_create(struct rocket_core *core)
+{
+	struct rocket_device *rdev = core->rdev;
+	struct rocket_fence *fence;
+
+	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+	if (!fence)
+		return ERR_PTR(-ENOMEM);
+
+	fence->dev = &rdev->ddev;
+	fence->seqno = ++core->emit_seqno;
+	dma_fence_init(&fence->base, &rocket_fence_ops, &core->job_lock,
+		       core->fence_context, fence->seqno);
+
+	return &fence->base;
+}
+
+static int
+rocket_copy_tasks(struct drm_device *dev,
+		  struct drm_file *file_priv,
+		  struct drm_rocket_job *job,
+		  struct rocket_job *rjob)
+{
+	struct drm_rocket_task *tasks;
+	int ret = 0;
+	int i;
+
+	rjob->task_count = job->task_count;
+
+	if (!rjob->task_count)
+		return 0;
+
+	tasks = kvmalloc_array(rjob->task_count, sizeof(*tasks), GFP_KERNEL);
+	if (!tasks) {
+		ret = -ENOMEM;
+		drm_dbg(dev, "Failed to allocate incoming tasks\n");
+		goto fail;
+	}
+
+	if (copy_from_user(tasks,
+			   (void __user *)(uintptr_t)job->tasks,
+			   rjob->task_count * sizeof(*tasks))) {
+		ret = -EFAULT;
+		drm_dbg(dev, "Failed to copy incoming tasks\n");
+		goto fail;
+	}
+
+	rjob->tasks = kvmalloc_array(job->task_count, sizeof(*rjob->tasks), GFP_KERNEL);
+	if (!rjob->tasks) {
+		drm_dbg(dev, "Failed to allocate task array\n");
+		ret = -ENOMEM;
+		goto fail;
+	}
+
+	for (i = 0; i < rjob->task_count; i++) {
+		if (tasks[i].reserved != 0) {
+			drm_dbg(dev, "Reserved field in drm_rocket_task struct should be 0.\n");
+			ret = -EINVAL;
+			goto fail;
+		}
+
+		if (tasks[i].regcmd_count == 0) {
+			ret = -EINVAL;
+			goto fail;
+		}
+		rjob->tasks[i].regcmd = tasks[i].regcmd;
+		rjob->tasks[i].regcmd_count = tasks[i].regcmd_count;
+	}
+
+fail:
+	kvfree(tasks);
+	return ret;
+}
+
+static void rocket_job_hw_submit(struct rocket_core *core, struct rocket_job *job)
+{
+	struct rocket_task *task;
+	bool task_pp_en = 1;
+	bool task_count = 1;
+
+	/* GO ! */
+
+	/* Don't queue the job if a reset is in progress */
+	if (!atomic_read(&core->reset.pending)) {
+		task = &job->tasks[job->next_task_idx];
+		job->next_task_idx++; /* TODO: Do this only after a successful run?
+				       */
+
+		rocket_pc_writel(core, BASE_ADDRESS, 0x1);
+
+		rocket_cna_writel(core, S_POINTER, 0xe + 0x10000000 * core->index);
+		rocket_core_writel(core, S_POINTER, 0xe + 0x10000000 * core->index);
+
+		rocket_pc_writel(core, BASE_ADDRESS, task->regcmd);
+		rocket_pc_writel(core, REGISTER_AMOUNTS, (task->regcmd_count + 1) / 2 - 1);
+
+		rocket_pc_writel(core, INTERRUPT_MASK,
+				 PC_INTERRUPT_MASK_DPU_0 | PC_INTERRUPT_MASK_DPU_1);
+		rocket_pc_writel(core, INTERRUPT_CLEAR,
+				 PC_INTERRUPT_CLEAR_DPU_0 | PC_INTERRUPT_CLEAR_DPU_1);
+
+		rocket_pc_writel(core, TASK_CON, ((0x6 | task_pp_en) << 12) | task_count);
+
+		rocket_pc_writel(core, TASK_DMA_BASE_ADDR, 0x0);
+
+		rocket_pc_writel(core, OPERATION_ENABLE, 0x1);
+
+		dev_dbg(core->dev,
+			"Submitted regcmd at 0x%llx to core %d\n",
+			task->regcmd, core->index);
+	}
+}
+
+static int rocket_acquire_object_fences(struct drm_gem_object **bos,
+					int bo_count,
+					struct drm_sched_job *job,
+					bool is_write)
+{
+	int i, ret;
+
+	for (i = 0; i < bo_count; i++) {
+		ret = dma_resv_reserve_fences(bos[i]->resv, 1);
+		if (ret)
+			return ret;
+
+		ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
+							      is_write);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static void rocket_attach_object_fences(struct drm_gem_object **bos,
+					int bo_count,
+					struct dma_fence *fence)
+{
+	int i;
+
+	for (i = 0; i < bo_count; i++)
+		dma_resv_add_fence(bos[i]->resv, fence, DMA_RESV_USAGE_WRITE);
+}
+
+static int rocket_job_push(struct rocket_job *job)
+{
+	struct rocket_device *rdev = job->rdev;
+	struct drm_gem_object **bos;
+	struct ww_acquire_ctx acquire_ctx;
+	int ret = 0;
+
+	bos = kvmalloc_array(job->in_bo_count + job->out_bo_count, sizeof(void *),
+			     GFP_KERNEL);
+	if (!bos)
+		return -ENOMEM;
+	memcpy(bos, job->in_bos, job->in_bo_count * sizeof(void *));
+	memcpy(&bos[job->in_bo_count], job->out_bos, job->out_bo_count * sizeof(void *));
+
+	ret = drm_gem_lock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+	if (ret)
+		goto err;
+
+	mutex_lock(&rdev->sched_lock);
+	drm_sched_job_arm(&job->base);
+
+	job->inference_done_fence = dma_fence_get(&job->base.s_fence->finished);
+
+	ret = rocket_acquire_object_fences(job->in_bos, job->in_bo_count, &job->base, false);
+	if (ret) {
+		mutex_unlock(&rdev->sched_lock);
+		goto err_unlock;
+	}
+
+	ret = rocket_acquire_object_fences(job->out_bos, job->out_bo_count, &job->base, true);
+	if (ret) {
+		mutex_unlock(&rdev->sched_lock);
+		goto err_unlock;
+	}
+
+	kref_get(&job->refcount); /* put by scheduler job completion */
+
+	drm_sched_entity_push_job(&job->base);
+
+	mutex_unlock(&rdev->sched_lock);
+
+	rocket_attach_object_fences(job->out_bos, job->out_bo_count, job->inference_done_fence);
+
+err_unlock:
+	drm_gem_unlock_reservations(bos, job->in_bo_count + job->out_bo_count, &acquire_ctx);
+err:
+	kvfree(bos);
+
+	return ret;
+}
+
+static void rocket_job_cleanup(struct kref *ref)
+{
+	struct rocket_job *job = container_of(ref, struct rocket_job,
+					      refcount);
+	unsigned int i;
+
+	dma_fence_put(job->done_fence);
+	dma_fence_put(job->inference_done_fence);
+
+	if (job->in_bos) {
+		for (i = 0; i < job->in_bo_count; i++)
+			drm_gem_object_put(job->in_bos[i]);
+
+		kvfree(job->in_bos);
+	}
+
+	if (job->out_bos) {
+		for (i = 0; i < job->out_bo_count; i++)
+			drm_gem_object_put(job->out_bos[i]);
+
+		kvfree(job->out_bos);
+	}
+
+	kvfree(job->tasks);
+
+	kfree(job);
+}
+
+static void rocket_job_put(struct rocket_job *job)
+{
+	kref_put(&job->refcount, rocket_job_cleanup);
+}
+
+static void rocket_job_free(struct drm_sched_job *sched_job)
+{
+	struct rocket_job *job = to_rocket_job(sched_job);
+
+	drm_sched_job_cleanup(sched_job);
+
+	rocket_job_put(job);
+}
+
+static struct rocket_core *sched_to_core(struct rocket_device *rdev,
+					 struct drm_gpu_scheduler *sched)
+{
+	unsigned int core;
+
+	for (core = 0; core < rdev->num_cores; core++) {
+		if (&rdev->cores[core].sched == sched)
+			return &rdev->cores[core];
+	}
+
+	return NULL;
+}
+
+static struct dma_fence *rocket_job_run(struct drm_sched_job *sched_job)
+{
+	struct rocket_job *job = to_rocket_job(sched_job);
+	struct rocket_device *rdev = job->rdev;
+	struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+	struct dma_fence *fence = NULL;
+	int ret;
+
+	if (unlikely(job->base.s_fence->finished.error))
+		return NULL;
+
+	/*
+	 * Nothing to execute: can happen if the job has finished while
+	 * we were resetting the NPU.
+	 */
+	if (job->next_task_idx == job->task_count)
+		return NULL;
+
+	fence = rocket_fence_create(core);
+	if (IS_ERR(fence))
+		return fence;
+
+	if (job->done_fence)
+		dma_fence_put(job->done_fence);
+	job->done_fence = dma_fence_get(fence);
+
+	ret = pm_runtime_get_sync(core->dev);
+	if (ret < 0)
+		return fence;
+
+	spin_lock(&core->job_lock);
+
+	core->in_flight_job = job;
+	rocket_job_hw_submit(core, job);
+
+	spin_unlock(&core->job_lock);
+
+	return fence;
+}
+
+static void rocket_job_handle_done(struct rocket_core *core,
+				   struct rocket_job *job)
+{
+	if (job->next_task_idx < job->task_count) {
+		rocket_job_hw_submit(core, job);
+		return;
+	}
+
+	core->in_flight_job = NULL;
+	dma_fence_signal_locked(job->done_fence);
+	pm_runtime_put_autosuspend(core->dev);
+}
+
+static void rocket_job_handle_irq(struct rocket_core *core)
+{
+	u32 status, raw_status;
+
+	pm_runtime_mark_last_busy(core->dev);
+
+	status = rocket_pc_readl(core, INTERRUPT_STATUS);
+	raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS);
+
+	rocket_pc_writel(core, OPERATION_ENABLE, 0x0);
+	rocket_pc_writel(core, INTERRUPT_CLEAR, 0x1ffff);
+
+	spin_lock(&core->job_lock);
+
+	if (core->in_flight_job)
+		rocket_job_handle_done(core, core->in_flight_job);
+
+	spin_unlock(&core->job_lock);
+}
+
+static void
+rocket_reset(struct rocket_core *core, struct drm_sched_job *bad)
+{
+	bool cookie;
+
+	if (!atomic_read(&core->reset.pending))
+		return;
+
+	/*
+	 * Stop the scheduler.
+	 *
+	 * FIXME: We temporarily get out of the dma_fence_signalling section
+	 * because the cleanup path generates lockdep splats when taking locks
+	 * to release job resources. We should rework the code to follow this
+	 * pattern:
+	 *
+	 *	try_lock
+	 *	if (locked)
+	 *		release
+	 *	else
+	 *		schedule_work_to_release_later
+	 */
+	drm_sched_stop(&core->sched, bad);
+
+	cookie = dma_fence_begin_signalling();
+
+	if (bad)
+		drm_sched_increase_karma(bad);
+
+	/*
+	 * Mask job interrupts and synchronize to make sure we won't be
+	 * interrupted during our reset.
+	 */
+	rocket_pc_writel(core, INTERRUPT_MASK, 0x0);
+	synchronize_irq(core->irq);
+
+	/* Handle the remaining interrupts before we reset. */
+	rocket_job_handle_irq(core);
+
+	/*
+	 * Remaining interrupts have been handled, but we might still have
+	 * a stuck job. Keep the PM usage counter balanced by manually
+	 * calling pm_runtime_put_noidle() for it.
+	 */
+	spin_lock(&core->job_lock);
+	if (core->in_flight_job)
+		pm_runtime_put_noidle(core->dev);
+
+	core->in_flight_job = NULL;
+	spin_unlock(&core->job_lock);
+
+	/* Proceed with reset now. */
+	pm_runtime_force_suspend(core->dev);
+	pm_runtime_force_resume(core->dev);
+
+	/* NPU has been reset, we can clear the reset pending bit. */
+	atomic_set(&core->reset.pending, 0);
+
+	/*
+	 * Now resubmit jobs that were previously queued but didn't have a
+	 * chance to finish.
+	 * FIXME: We temporarily get out of the DMA fence signalling section
+	 * while resubmitting jobs because the job submission logic will
+	 * allocate memory with the GFP_KERNEL flag which can trigger memory
+	 * reclaim and exposes a lock ordering issue.
+	 */
+	dma_fence_end_signalling(cookie);
+	drm_sched_resubmit_jobs(&core->sched);
+	cookie = dma_fence_begin_signalling();
+
+	/* Restart the scheduler */
+	drm_sched_start(&core->sched, 0);
+
+	dma_fence_end_signalling(cookie);
+}
+
+static enum drm_gpu_sched_stat rocket_job_timedout(struct drm_sched_job *sched_job)
+{
+	struct rocket_job *job = to_rocket_job(sched_job);
+	struct rocket_device *rdev = job->rdev;
+	struct rocket_core *core = sched_to_core(rdev, sched_job->sched);
+
+	/*
+	 * If the NPU managed to complete this job's fence, the timeout is
+	 * spurious. Bail out.
+	 */
+	if (dma_fence_is_signaled(job->done_fence))
+		return DRM_GPU_SCHED_STAT_NOMINAL;
+
+	/*
+	 * Rocket IRQ handler may take a long time to process an interrupt
+	 * if there is another IRQ handler hogging the processing.
+	 * For example, the HDMI encoder driver might be stuck in the IRQ
+	 * handler for a significant time in a case of bad cable connection.
+	 * In order to catch such cases and not report spurious rocket
+	 * job timeouts, synchronize the IRQ handler and re-check the fence
+	 * status.
+	 */
+	synchronize_irq(core->irq);
+
+	if (dma_fence_is_signaled(job->done_fence)) {
+		dev_warn(core->dev, "unexpectedly high interrupt latency\n");
+		return DRM_GPU_SCHED_STAT_NOMINAL;
+	}
+
+	dev_err(core->dev, "gpu sched timeout\n");
+
+	atomic_set(&core->reset.pending, 1);
+	rocket_reset(core, sched_job);
+
+	return DRM_GPU_SCHED_STAT_NOMINAL;
+}
+
+static void rocket_reset_work(struct work_struct *work)
+{
+	struct rocket_core *core;
+
+	core = container_of(work, struct rocket_core, reset.work);
+	rocket_reset(core, NULL);
+}
+
+static const struct drm_sched_backend_ops rocket_sched_ops = {
+	.run_job = rocket_job_run,
+	.timedout_job = rocket_job_timedout,
+	.free_job = rocket_job_free
+};
+
+static irqreturn_t rocket_job_irq_handler_thread(int irq, void *data)
+{
+	struct rocket_core *core = data;
+
+	rocket_job_handle_irq(core);
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t rocket_job_irq_handler(int irq, void *data)
+{
+	struct rocket_core *core = data;
+	u32 raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS);
+
+	WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_READ_ERROR);
+	WARN_ON(raw_status & PC_INTERRUPT_RAW_STATUS_DMA_WRITE_ERROR);
+
+	if (!(raw_status & PC_INTERRUPT_RAW_STATUS_DPU_0 ||
+	      raw_status & PC_INTERRUPT_RAW_STATUS_DPU_1))
+		return IRQ_NONE;
+
+	rocket_pc_writel(core, INTERRUPT_MASK, 0x0);
+
+	return IRQ_WAKE_THREAD;
+}
+
+int rocket_job_init(struct rocket_core *core)
+{
+	struct drm_sched_init_args args = {
+		.ops = &rocket_sched_ops,
+		.num_rqs = DRM_SCHED_PRIORITY_COUNT,
+		.credit_limit = 1,
+		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
+		.name = dev_name(core->dev),
+		.dev = core->dev,
+	};
+	int ret;
+
+	INIT_WORK(&core->reset.work, rocket_reset_work);
+	spin_lock_init(&core->job_lock);
+
+	core->irq = platform_get_irq(to_platform_device(core->dev), 0);
+	if (core->irq < 0)
+		return core->irq;
+
+	ret = devm_request_threaded_irq(core->dev, core->irq,
+					rocket_job_irq_handler,
+					rocket_job_irq_handler_thread,
+					IRQF_SHARED, KBUILD_MODNAME "-job",
+					core);
+	if (ret) {
+		dev_err(core->dev, "failed to request job irq\n");
+		return ret;
+	}
+
+	core->reset.wq = alloc_ordered_workqueue("rocket-reset-%d", 0, core->index);
+	if (!core->reset.wq)
+		return -ENOMEM;
+
+	core->fence_context = dma_fence_context_alloc(1);
+
+	args.timeout_wq = core->reset.wq;
+	ret = drm_sched_init(&core->sched, &args);
+	if (ret) {
+		dev_err(core->dev, "Failed to create scheduler: %d.\n", ret);
+		goto err_sched;
+	}
+
+	return 0;
+
+err_sched:
+	drm_sched_fini(&core->sched);
+
+	destroy_workqueue(core->reset.wq);
+	return ret;
+}
+
+void rocket_job_fini(struct rocket_core *core)
+{
+	drm_sched_fini(&core->sched);
+
+	cancel_work_sync(&core->reset.work);
+	destroy_workqueue(core->reset.wq);
+}
+
+int rocket_job_open(struct rocket_file_priv *rocket_priv)
+{
+	struct rocket_device *rdev = rocket_priv->rdev;
+	struct drm_gpu_scheduler **scheds = kmalloc_array(rdev->num_cores, sizeof(*scheds),
+							  GFP_KERNEL);
+	unsigned int core;
+	int ret;
+
+	if (!scheds)
+		return -ENOMEM;
+
+	for (core = 0; core < rdev->num_cores; core++)
+		scheds[core] = &rdev->cores[core].sched;
+
+	ret = drm_sched_entity_init(&rocket_priv->sched_entity,
+				    DRM_SCHED_PRIORITY_NORMAL,
+				    scheds,
+				    rdev->num_cores, NULL);
+	if (WARN_ON(ret)) {
+		kfree(scheds);
+		return ret;
+	}
+
+	return 0;
+}
+
+void rocket_job_close(struct rocket_file_priv *rocket_priv)
+{
+	struct drm_sched_entity *entity = &rocket_priv->sched_entity;
+
+	kfree(entity->sched_list);
+	drm_sched_entity_destroy(entity);
+}
+
+int rocket_job_is_idle(struct rocket_core *core)
+{
+	/* If there are any jobs in this HW queue, we're not idle */
+	if (atomic_read(&core->sched.credit_count))
+		return false;
+
+	return true;
+}
+
+static int rocket_ioctl_submit_job(struct drm_device *dev, struct drm_file *file,
+				   struct drm_rocket_job *job)
+{
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct rocket_file_priv *file_priv = file->driver_priv;
+	struct rocket_job *rjob = NULL;
+	int ret = 0;
+
+	if (job->task_count == 0)
+		return -EINVAL;
+
+	rjob = kzalloc(sizeof(*rjob), GFP_KERNEL);
+	if (!rjob)
+		return -ENOMEM;
+
+	kref_init(&rjob->refcount);
+
+	rjob->rdev = rdev;
+
+	ret = drm_sched_job_init(&rjob->base,
+				 &file_priv->sched_entity,
+				 1, NULL);
+	if (ret)
+		goto out_put_job;
+
+	ret = rocket_copy_tasks(dev, file, job, rjob);
+	if (ret)
+		goto out_cleanup_job;
+
+	ret = drm_gem_objects_lookup(file,
+				     (void __user *)(uintptr_t)job->in_bo_handles,
+				     job->in_bo_handle_count, &rjob->in_bos);
+	if (ret)
+		goto out_cleanup_job;
+
+	rjob->in_bo_count = job->in_bo_handle_count;
+
+	ret = drm_gem_objects_lookup(file,
+				     (void __user *)(uintptr_t)job->out_bo_handles,
+				     job->out_bo_handle_count, &rjob->out_bos);
+	if (ret)
+		goto out_cleanup_job;
+
+	rjob->out_bo_count = job->out_bo_handle_count;
+
+	ret = rocket_job_push(rjob);
+	if (ret)
+		goto out_cleanup_job;
+
+out_cleanup_job:
+	if (ret)
+		drm_sched_job_cleanup(&rjob->base);
+out_put_job:
+	rocket_job_put(rjob);
+
+	return ret;
+}
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_rocket_submit *args = data;
+	struct drm_rocket_job *jobs;
+	int ret = 0;
+	unsigned int i = 0;
+
+	if (args->reserved != 0) {
+		drm_dbg(dev, "Reserved field in drm_rocket_submit struct should be 0.\n");
+		return -EINVAL;
+	}
+
+	jobs = kvmalloc_array(args->job_count, sizeof(*jobs), GFP_KERNEL);
+	if (!jobs) {
+		drm_dbg(dev, "Failed to allocate incoming job array\n");
+		return -ENOMEM;
+	}
+
+	if (copy_from_user(jobs,
+			   (void __user *)(uintptr_t)args->jobs,
+			   args->job_count * sizeof(*jobs))) {
+		ret = -EFAULT;
+		drm_dbg(dev, "Failed to copy incoming job array\n");
+		goto exit;
+	}
+
+	for (i = 0; i < args->job_count; i++) {
+		if (jobs[i].reserved != 0) {
+			drm_dbg(dev, "Reserved field in drm_rocket_job struct should be 0.\n");
+			ret = -EINVAL;
+			goto exit;
+		}
+
+		rocket_ioctl_submit_job(dev, file, &jobs[i]);
+	}
+
+exit:
+	kvfree(jobs);
+
+	return ret;
+}
diff --git a/drivers/accel/rocket/rocket_job.h b/drivers/accel/rocket/rocket_job.h
new file mode 100644
index 0000000000000000000000000000000000000000..99e1928fbd89f9b506c63bf9dd591124feeb54b5
--- /dev/null
+++ b/drivers/accel/rocket/rocket_job.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright 2024-2025 Tomeu Vizoso */
+
+#ifndef __ROCKET_JOB_H__
+#define __ROCKET_JOB_H__
+
+#include
+#include
+
+#include "rocket_core.h"
+#include "rocket_drv.h"
+
+struct rocket_task {
+	u64 regcmd;
+	u32 regcmd_count;
+};
+
+struct rocket_job {
+	struct drm_sched_job base;
+
+	struct rocket_device *rdev;
+
+	struct drm_gem_object **in_bos;
+	struct drm_gem_object **out_bos;
+
+	u32 in_bo_count;
+	u32 out_bo_count;
+
+	struct rocket_task *tasks;
+	u32 task_count;
+	u32 next_task_idx;
+
+	/* Fence to be signaled by drm-sched once it's done with the job */
+	struct dma_fence *inference_done_fence;
+
+	/* Fence to be signaled by IRQ handler when the job is complete. */
+	struct dma_fence *done_fence;
+
+	struct kref refcount;
+};
+
+int rocket_ioctl_submit(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_job_init(struct rocket_core *core);
+void rocket_job_fini(struct rocket_core *core);
+int rocket_job_open(struct rocket_file_priv *rocket_priv);
+void rocket_job_close(struct rocket_file_priv *rocket_priv);
+int rocket_job_is_idle(struct rocket_core *core);
+
+#endif
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index 95720702b7c4413d72b89c1f0f59abb22dc8c6b3..cb1b5934c201160e7650aabd1b3a2b1c77b1fd7b 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -12,8 +12,10 @@ extern "C" {
 #endif
 
 #define DRM_ROCKET_CREATE_BO			0x00
+#define DRM_ROCKET_SUBMIT			0x01
 
 #define DRM_IOCTL_ROCKET_CREATE_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
+#define DRM_IOCTL_ROCKET_SUBMIT			DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
 
 /**
  * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -37,6 +39,68 @@ struct drm_rocket_create_bo {
 	__u64 offset;
 };
 
+/**
+ * struct drm_rocket_task - A task to be run on the NPU
+ *
+ * A task is the smallest unit of work that can be run on the NPU.
+ */
+struct drm_rocket_task {
+	/** Input: DMA address to NPU mapping of register command buffer */
+	__u64 regcmd;
+
+	/** Input: Number of commands in the register command buffer */
+	__u32 regcmd_count;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
+/**
+ * struct drm_rocket_job - A job to be run on the NPU
+ *
+ * The kernel will schedule the execution of this job taking into account its
+ * dependencies with other jobs. All tasks in the same job will be executed
+ * sequentially on the same core, to benefit from memory residency in SRAM.
+ */
+struct drm_rocket_job {
+	/** Input: Pointer to an array of struct drm_rocket_task. */
+	__u64 tasks;
+
+	/** Input: Pointer to a u32 array of the BOs that are read by the job. */
+	__u64 in_bo_handles;
+
+	/** Input: Pointer to a u32 array of the BOs that are written to by the job. */
+	__u64 out_bo_handles;
+
+	/** Input: Number of tasks passed in. */
+	__u32 task_count;
+
+	/** Input: Number of input BO handles passed in (size is that times 4). */
+	__u32 in_bo_handle_count;
+
+	/** Input: Number of output BO handles passed in (size is that times 4). */
+	__u32 out_bo_handle_count;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
+/**
+ * struct drm_rocket_submit - ioctl argument for submitting commands to the NPU.
+ *
+ * The kernel will schedule the execution of these jobs in dependency order.
+ */
+struct drm_rocket_submit {
+	/** Input: Pointer to an array of struct drm_rocket_job. */
+	__u64 jobs;
+
+	/** Input: Number of jobs passed in. */
+	__u32 job_count;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
 #if defined(__cplusplus)
 }
 #endif

From patchwork Tue May 20 10:27:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 891435
From: Tomeu Vizoso
Date: Tue, 20 May 2025 12:27:01 +0200
Subject: [PATCH v5 08/10] accel/rocket: Add IOCTLs for synchronizing memory accesses
Precedence: bulk
X-Mailing-List: linux-media@vger.kernel.org
Message-Id: <20250520-6-10-rocket-v5-8-18c9ca0fcb3c@tomeuvizoso.net>
References: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
In-Reply-To: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

The NPU cores have their own access to the memory bus, and this isn't
cache coherent with the CPUs.

Add IOCTLs so userspace can mark when the caches need to be flushed, and
also when a writer job needs to be waited for before the buffer can be
accessed from the CPU.

Initially based on the same IOCTLs from the Etnaviv driver.
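
For illustration only, userspace could bracket each CPU access to a BO
roughly as sketched below. This is a minimal sketch, not code from this
series: the structs are local mirrors of the UAPI definitions added here,
while REQ_PREP_BO, REQ_FINI_BO, ioctl_fn, cpu_access_bo(), fake_ioctl()
and fill_input() are hypothetical names chosen so that the flow can be
built and run without an NPU. Real code would include the UAPI header and
issue DRM_IOCTL_ROCKET_PREP_BO and DRM_IOCTL_ROCKET_FINI_BO through
ioctl(2). Note that the kernel side converts timeout_ns with
drm_timeout_abs_to_jiffies(), so the value appears to be treated as an
absolute deadline rather than a duration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Local mirrors of the UAPI definitions added by this patch, so the sketch
 * builds standalone. Real code would include the rocket_accel.h UAPI header.
 */
#define ROCKET_PREP_READ	0x01
#define ROCKET_PREP_WRITE	0x02

struct drm_rocket_prep_bo {
	uint32_t handle;
	uint32_t op;
	int64_t timeout_ns;
};

struct drm_rocket_fini_bo {
	uint32_t handle;
	uint32_t reserved;
};

/* Hypothetical request codes standing in for DRM_IOCTL_ROCKET_{PREP,FINI}_BO. */
enum { REQ_PREP_BO = 1, REQ_FINI_BO = 2 };

/* Hypothetical ioctl-shaped callback; real code would wrap ioctl(2). */
typedef int (*ioctl_fn)(int fd, unsigned long req, void *arg);

/* Bracket one CPU access to a BO with PREP_BO and FINI_BO. */
static int cpu_access_bo(ioctl_fn do_ioctl, int fd, uint32_t handle,
			 uint32_t op, int64_t timeout_ns,
			 void (*access)(void *map), void *map)
{
	struct drm_rocket_prep_bo prep = {
		.handle = handle,
		.op = op,
		.timeout_ns = timeout_ns,
	};
	struct drm_rocket_fini_bo fini = { .handle = handle }; /* reserved = 0 */
	int ret;

	/* Wait for NPU jobs using the BO and sync caches for the CPU. */
	ret = do_ioctl(fd, REQ_PREP_BO, &prep);
	if (ret)
		return ret;

	access(map);

	/* Hand the BO back: sync caches for NPU access. */
	return do_ioctl(fd, REQ_FINI_BO, &fini);
}

/* Hypothetical fake ioctl that records the request order (no hardware needed). */
static unsigned long call_log[4];
static int call_count;

static int fake_ioctl(int fd, unsigned long req, void *arg)
{
	(void)fd;
	(void)arg;
	if (call_count < 4)
		call_log[call_count++] = req;
	return 0;
}

/* Example CPU access: fill the first 16 bytes of the mapping. */
static void fill_input(void *map)
{
	memset(map, 0xab, 16);
}
```

Calling cpu_access_bo(fake_ioctl, -1, handle, ROCKET_PREP_WRITE, 0,
fill_input, buf) issues the prep request, runs the CPU access, then issues
the fini request. If the prep step fails, the sketch skips both the access
and the fini call.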

v2:
- Don't break UABI by reordering the IOCTL IDs (Jeff Hugo)

v3:
- Check that padding fields in IOCTLs are zero (Jeff Hugo)

Signed-off-by: Tomeu Vizoso
---
 drivers/accel/rocket/rocket_drv.c |  2 +
 drivers/accel/rocket/rocket_gem.c | 80 +++++++++++++++++++++++++++++++++++++++
 drivers/accel/rocket/rocket_gem.h |  5 +++
 include/uapi/drm/rocket_accel.h   | 37 ++++++++++++++++++
 4 files changed, 124 insertions(+)

diff --git a/drivers/accel/rocket/rocket_drv.c b/drivers/accel/rocket/rocket_drv.c
index fef9b93372d3f65c41c1ac35a9bfa0c01ee721a5..c06e66939e6c39909fe08bef3c4f301b07bf8fbf 100644
--- a/drivers/accel/rocket/rocket_drv.c
+++ b/drivers/accel/rocket/rocket_drv.c
@@ -59,6 +59,8 @@ static const struct drm_ioctl_desc rocket_drm_driver_ioctls[] = {
 
 	ROCKET_IOCTL(CREATE_BO, create_bo),
 	ROCKET_IOCTL(SUBMIT, submit),
+	ROCKET_IOCTL(PREP_BO, prep_bo),
+	ROCKET_IOCTL(FINI_BO, fini_bo),
 };
 
 DEFINE_DRM_ACCEL_FOPS(rocket_accel_driver_fops);
diff --git a/drivers/accel/rocket/rocket_gem.c b/drivers/accel/rocket/rocket_gem.c
index 8a8a7185daac4740081293aae6945c9b2bbeb2dd..cdc5238a93fa5978129dc1ac8ec8de955160dc18 100644
--- a/drivers/accel/rocket/rocket_gem.c
+++ b/drivers/accel/rocket/rocket_gem.c
@@ -129,3 +129,83 @@ int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *
 
 	return ret;
 }
+
+static inline enum dma_data_direction rocket_op_to_dma_dir(u32 op)
+{
+	if (op & ROCKET_PREP_READ)
+		return DMA_FROM_DEVICE;
+	else if (op & ROCKET_PREP_WRITE)
+		return DMA_TO_DEVICE;
+	else
+		return DMA_BIDIRECTIONAL;
+}
+
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_rocket_prep_bo *args = data;
+	unsigned long timeout = drm_timeout_abs_to_jiffies(args->timeout_ns);
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct drm_gem_object *gem_obj;
+	struct drm_gem_shmem_object *shmem_obj;
+	bool write = !!(args->op & ROCKET_PREP_WRITE);
+	long ret = 0;
+
+	if (args->op & ~(ROCKET_PREP_READ | ROCKET_PREP_WRITE))
+		return -EINVAL;
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	ret = dma_resv_wait_timeout(gem_obj->resv, dma_resv_usage_rw(write),
+				    true, timeout);
+	if (!ret)
+		ret = timeout ? -ETIMEDOUT : -EBUSY;
+
+	shmem_obj = &to_rocket_bo(gem_obj)->base;
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		dma_sync_sgtable_for_cpu(rdev->cores[core].dev, shmem_obj->sgt,
+					 rocket_op_to_dma_dir(args->op));
+	}
+
+	to_rocket_bo(gem_obj)->last_cpu_prep_op = args->op;
+
+	drm_gem_object_put(gem_obj);
+
+	return ret;
+}
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct rocket_device *rdev = to_rocket_device(dev);
+	struct drm_rocket_fini_bo *args = data;
+	struct drm_gem_shmem_object *shmem_obj;
+	struct rocket_gem_object *rkt_obj;
+	struct drm_gem_object *gem_obj;
+
+	if (args->reserved != 0) {
+		drm_dbg(dev, "Reserved field in drm_rocket_fini_bo struct should be 0.\n");
+		return -EINVAL;
+	}
+
+	gem_obj = drm_gem_object_lookup(file, args->handle);
+	if (!gem_obj)
+		return -ENOENT;
+
+	rkt_obj = to_rocket_bo(gem_obj);
+	shmem_obj = &rkt_obj->base;
+
+	WARN_ON(rkt_obj->last_cpu_prep_op == 0);
+
+	for (unsigned int core = 1; core < rdev->num_cores; core++) {
+		dma_sync_sgtable_for_device(rdev->cores[core].dev, shmem_obj->sgt,
+					    rocket_op_to_dma_dir(rkt_obj->last_cpu_prep_op));
+	}
+
+	rkt_obj->last_cpu_prep_op = 0;
+
+	drm_gem_object_put(gem_obj);
+
+	return 0;
+}
diff --git a/drivers/accel/rocket/rocket_gem.h b/drivers/accel/rocket/rocket_gem.h
index 41497554366961cfe18cf6c7e93ab1e4e5dc1886..2caa268f7f496f782996c6ad2c4eb851a225a86f 100644
--- a/drivers/accel/rocket/rocket_gem.h
+++ b/drivers/accel/rocket/rocket_gem.h
@@ -11,12 +11,17 @@ struct rocket_gem_object {
 
 	size_t size;
 	u32 offset;
+	u32 last_cpu_prep_op;
 };
 
 struct drm_gem_object *rocket_gem_create_object(struct drm_device *dev, size_t size);
 
 int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file);
 
+int rocket_ioctl_prep_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
+int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file);
+
 static inline
 struct rocket_gem_object *to_rocket_bo(struct drm_gem_object *obj)
 {
diff --git a/include/uapi/drm/rocket_accel.h b/include/uapi/drm/rocket_accel.h
index cb1b5934c201160e7650aabd1b3a2b1c77b1fd7b..b5c80dd767be56e9720b51e4a82617a425a881a1 100644
--- a/include/uapi/drm/rocket_accel.h
+++ b/include/uapi/drm/rocket_accel.h
@@ -13,9 +13,13 @@ extern "C" {
 
 #define DRM_ROCKET_CREATE_BO			0x00
 #define DRM_ROCKET_SUBMIT			0x01
+#define DRM_ROCKET_PREP_BO			0x02
+#define DRM_ROCKET_FINI_BO			0x03
 
 #define DRM_IOCTL_ROCKET_CREATE_BO		DRM_IOWR(DRM_COMMAND_BASE + DRM_ROCKET_CREATE_BO, struct drm_rocket_create_bo)
 #define DRM_IOCTL_ROCKET_SUBMIT			DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_SUBMIT, struct drm_rocket_submit)
+#define DRM_IOCTL_ROCKET_PREP_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_PREP_BO, struct drm_rocket_prep_bo)
+#define DRM_IOCTL_ROCKET_FINI_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_ROCKET_FINI_BO, struct drm_rocket_fini_bo)
 
 /**
  * struct drm_rocket_create_bo - ioctl argument for creating Rocket BOs.
@@ -39,6 +43,39 @@ struct drm_rocket_create_bo {
 	__u64 offset;
 };
 
+#define ROCKET_PREP_READ        0x01
+#define ROCKET_PREP_WRITE       0x02
+
+/**
+ * struct drm_rocket_prep_bo - ioctl argument for starting CPU ownership of the BO.
+ *
+ * Takes care of waiting for any NPU jobs that might still use the NPU and performs cache
+ * synchronization.
+ */
+struct drm_rocket_prep_bo {
+	/** Input: GEM handle of the buffer object. */
+	__u32 handle;
+
+	/** Input: mask of ROCKET_PREP_x, direction of the access. */
+	__u32 op;
+
+	/** Input: Amount of time to wait for NPU jobs. */
+	__s64 timeout_ns;
+};
+
+/**
+ * struct drm_rocket_fini_bo - ioctl argument for finishing CPU ownership of the BO.
+ *
+ * Synchronize caches for NPU access.
+ */
+struct drm_rocket_fini_bo {
+	/** Input: GEM handle of the buffer object. */
+	__u32 handle;
+
+	/** Reserved, must be zero. */
+	__u32 reserved;
+};
+
 /**
  * struct drm_rocket_task - A task to be run on the NPU
  *

From patchwork Tue May 20 10:27:03 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tomeu Vizoso
X-Patchwork-Id: 891434
From: Tomeu Vizoso
Date: Tue, 20 May 2025 12:27:03 +0200
Subject: [PATCH v5 10/10] arm64: dts: rockchip: enable NPU on ROCK 5B
Precedence: bulk
X-Mailing-List: linux-media@vger.kernel.org
Message-Id: <20250520-6-10-rocket-v5-10-18c9ca0fcb3c@tomeuvizoso.net>
References: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
In-Reply-To: <20250520-6-10-rocket-v5-0-18c9ca0fcb3c@tomeuvizoso.net>
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
 Oded Gabbay, Jonathan Corbet, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
 Christian König, Sebastian Reichel, Nicolas Frattaroli, Jeff Hugo
Cc: devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org,
 linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, Tomeu Vizoso
X-Mailer: b4 0.14.2

From: Nicolas Frattaroli

The NPU on the ROCK 5B uses the same regulator for both the sram-supply
and the NPU's supply. Add this regulator, and enable all the NPU bits.
Also add the regulator as a domain-supply to the pd_npu power domain.

Signed-off-by: Nicolas Frattaroli
Signed-off-by: Tomeu Vizoso
---
 arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 56 +++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
index d22068475c5dc6cb885f878f3f527a66edf1ba70..49500f7cbcb14af4919a6c1997e9e53a01d84973 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
@@ -316,6 +316,28 @@ regulator-state-mem {
 	};
 };
 
+&i2c1 {
+	pinctrl-names = "default";
+	pinctrl-0 = <&i2c1m2_xfer>;
+	status = "okay";
+
+	vdd_npu_s0: regulator@42 {
+		compatible = "rockchip,rk8602";
+		reg = <0x42>;
+		fcs,suspend-voltage-selector = <1>;
+		regulator-name = "vdd_npu_s0";
+		regulator-boot-on;
+		regulator-min-microvolt = <550000>;
+		regulator-max-microvolt = <950000>;
+		regulator-ramp-delay = <2300>;
+		vin-supply = <&vcc5v0_sys>;
+
+		regulator-state-mem {
+			regulator-off-in-suspend;
+		};
+	};
+};
+
 &i2c6 {
 	status = "okay";
 
@@ -440,6 +462,10 @@ &pd_gpu {
 	domain-supply = <&vdd_gpu_s0>;
 };
 
+&pd_npu {
+	domain-supply = <&vdd_npu_s0>;
+};
+
 &pinctrl {
 	hdmirx {
 		hdmirx_hpd: hdmirx-5v-detection {
@@ -500,6 +526,36 @@ &pwm1 {
 	status = "okay";
 };
 
+&rknn_core_top {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_s0>;
+	status = "okay";
+};
+
+&rknn_core_1 {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_s0>;
+	status = "okay";
+};
+
+&rknn_core_2 {
+	npu-supply = <&vdd_npu_s0>;
+	sram-supply = <&vdd_npu_s0>;
+	status = "okay";
+};
+
+&rknn_mmu_top {
+	status = "okay";
+};
+
+&rknn_mmu_1 {
+	status = "okay";
+};
+
+&rknn_mmu_2 {
+	status = "okay";
+};
+
 &saradc {
 	vref-supply = <&avcc_1v8_s0>;
 	status = "okay";