[0/9] Enable two H264 encoder cores on MT8195

Message ID 20210816105934.28265-1-irui.wang@mediatek.com

Message

Irui Wang Aug. 16, 2021, 10:59 a.m. UTC
MT8195 has two H264 encoder cores, each with its own power domain,
clocks, interrupt, and register base. The two H264 encoder cores can
work together to achieve higher performance.

This series enables the two H264 encoder cores.
patch[1..2]: use the Linux component framework to manage the encoder
hardware. A user calling the "mt8195-vcodec-enc" driver gets the encoder
master device, while the encoding work is done by the two encoder core
devices. The hw_mode variable is added to distinguish MT8195 from older
platforms; the two-core mode is called "FRAME_RACING_MODE".

On MT8195, the hardware mode in which the two encoder cores work
together (also called overlap mode) is named "frame_racing_mode". Both
encoder power domains should be powered on together while encoding. The
encoding process looks like this:

    VENC Core0 frm#0....frm#2....frm#4
    VENC Core1  .frm#1....frm#3....frm#5
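
A minimal sketch of the alternation shown in the diagram, assuming frames
are assigned to the cores strictly round-robin by frame index. The names
venc_core_id, pick_core() and print_schedule() are hypothetical and do not
come from the patches; the driver may select the core differently.

```c
#include <stdio.h>

/* Hypothetical core IDs for illustration; not taken from the patches. */
enum venc_core_id { VENC_CORE0 = 0, VENC_CORE1 = 1, VENC_CORE_MAX };

/* Assumption: even frames go to core0, odd frames go to core1. */
enum venc_core_id pick_core(unsigned int frame_idx)
{
	return (enum venc_core_id)(frame_idx % VENC_CORE_MAX);
}

/* Print which core would encode each of the first n frames. */
void print_schedule(unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		printf("frm#%u -> core%d\n", i, (int)pick_core(i));
}
```

Calling print_schedule(6) lists frm#0, frm#2, frm#4 on core0 and frm#1,
frm#3, frm#5 on core1, matching the diagram.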

patch[3..5]: because of the component design, the master device has no
power-domains/clocks properties in the dtsi, so the power/clock init
function cannot be used in the master device probe for a
"frame_racing_mode" device; it should be called in the component device
probe instead. Hardware power and clocks are turned on on demand.

patch[6]: "frame_racing_mode" encoding needs a new set of memory buffers
for the two encoder cores. For compatibility, we add a new encoder
driver interface.

patch[7..9]: add "frame_racing_mode" encoding process:
As-Is: Synchronous
VIDIOC_QBUF#0 --> device_run(trigger encoder) --> wait encoder IRQ -->
encode done with result --> job_finish

VIDIOC_QBUF#1 --> device_run(trigger encoder) --> wait encoder IRQ -->
encode done with result --> job_finish
...

To-Be: Asynchronous
VIDIOC_QBUF#0 --> device_run(trigger encoder core0) --> job_finish
..VIDIOC_QBUF#1 --> device_run(trigger encoder core1) --> job_finish
(core0 may have finished encoding here; return the encode result to the
client)
VIDIOC_QBUF#2 --> device_run(trigger encoder core0) --> job_finish

There is no synchronous "wait encoder IRQ" call in the
"frame_racing_mode" encoding process, so both encoder cores can be fully
used to achieve higher performance.

Irui Wang (9):
  dt-bindings: media: mtk-vcodec: Add binding for MT8195 two venc cores
  media: mtk-vcodec: Use component framework to manage encoder hardware
  media: mtk-vcodec: Rewrite venc power manage interface
  media: mtk-vcodec: Add venc power on/off interface
  media: mtk-vcodec: Rewrite venc clock interface
  media: mtk-vcodec: Add new venc drv interface for frame_racing mode
  media: mtk-vcodec: Add frame racing mode encode process
  media: mtk-vcodec: Return encode result to client
  media: mtk-vcodec: Add delayed worker for encode timeout

 .../bindings/media/mediatek-vcodec.txt        |   2 +
 drivers/media/platform/mtk-vcodec/Makefile    |   2 +
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  34 +-
 .../platform/mtk-vcodec/mtk_vcodec_enc.c      | 120 +++-
 .../platform/mtk-vcodec/mtk_vcodec_enc.h      |  10 +-
 .../platform/mtk-vcodec/mtk_vcodec_enc_drv.c  | 204 +++++-
 .../platform/mtk-vcodec/mtk_vcodec_enc_hw.c   | 253 +++++++
 .../platform/mtk-vcodec/mtk_vcodec_enc_hw.h   |  38 +
 .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 213 ++++--
 .../platform/mtk-vcodec/mtk_vcodec_enc_pm.h   |  13 +-
 .../platform/mtk-vcodec/mtk_vcodec_util.c     |  19 +
 .../platform/mtk-vcodec/mtk_vcodec_util.h     |   5 +
 .../platform/mtk-vcodec/venc/venc_common_if.c | 675 ++++++++++++++++++
 .../platform/mtk-vcodec/venc/venc_h264_if.c   |   6 +-
 .../platform/mtk-vcodec/venc/venc_vp8_if.c    |   2 +-
 .../media/platform/mtk-vcodec/venc_drv_if.c   |  96 ++-
 .../media/platform/mtk-vcodec/venc_drv_if.h   |   7 +
 .../media/platform/mtk-vcodec/venc_vpu_if.c   |  11 +-
 .../media/platform/mtk-vcodec/venc_vpu_if.h   |   3 +-
 19 files changed, 1564 insertions(+), 149 deletions(-)
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_hw.c
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_hw.h
 create mode 100644 drivers/media/platform/mtk-vcodec/venc/venc_common_if.c

Comments

Tzung-Bi Shih Aug. 23, 2021, 10:01 a.m. UTC | #1
On Mon, Aug 16, 2021 at 06:59:27PM +0800, Irui Wang wrote:
> +static struct component_match *mtk_venc_match_add(struct mtk_vcodec_dev *dev)
> +{
> +	struct platform_device *pdev = dev->plat_dev;
> +	struct component_match *match = NULL;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(mtk_venc_comp_ids); i++) {
> +		enum mtk_venc_hw_id comp_idx;
> +		struct device_node *comp_node;
> +		const struct of_device_id *of_id;
To be neat, prefer to define the variables outside of the loop (i.e. at the beginning of the function).

> +
> +		comp_node = of_find_compatible_node(NULL, NULL,
> +			mtk_venc_comp_ids[i].compatible);
> +		if (!comp_node)
> +			continue;
> +
> +		of_id = of_match_node(mtk_venc_comp_ids, comp_node);
> +		if (!of_id) {
> +			dev_err(&pdev->dev, "Failed to get match node\n");
Need to call of_node_put() actually, but see comment below.

> +			return ERR_PTR(-EINVAL);
> +		}
> +
> +		comp_idx = (enum mtk_venc_hw_id)of_id->data;
To get comp_idx, mtk_venc_comp_ids[i].data should be sufficient.  If so, of_match_node() can be removed so that the error handling path won't need to call of_node_put().
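
A userspace sketch of that idea, assuming the loop index already
identifies the table entry so the ID can be read straight from its data
field. struct comp_match_entry, comp_ids[] and the "example,..."
compatible strings are hypothetical stand-ins for struct of_device_id and
mtk_venc_comp_ids[], not the real kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

enum venc_hw_id { VENC_CORE0 = 0, VENC_CORE1 = 1, VENC_HW_MAX };

/* Hypothetical stand-in for struct of_device_id (compatible + data). */
struct comp_match_entry {
	const char *compatible;
	const void *data;
};

/* Hypothetical match table; the compatible strings are placeholders. */
static const struct comp_match_entry comp_ids[] = {
	{ "example,venc-core0", (const void *)(uintptr_t)VENC_CORE0 },
	{ "example,venc-core1", (const void *)(uintptr_t)VENC_CORE1 },
};

/* Read the core ID from entry i directly, without a second table lookup. */
enum venc_hw_id comp_idx_at(size_t i)
{
	return (enum venc_hw_id)(uintptr_t)comp_ids[i].data;
}
```

Because no second lookup can fail, the loop body has one fewer error path
that would need to drop the node reference.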

> @@ -239,6 +314,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
>  	phandle rproc_phandle;
>  	enum mtk_vcodec_fw_type fw_type;
>  	int ret;
> +	struct component_match *match = NULL;
It doesn't need to be initialized.

> -	res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
> -	if (res == NULL) {
> -		dev_err(&pdev->dev, "failed to get irq resource");
> -		ret = -ENOENT;
> -		goto err_res;
> -	}
> +		res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
> +		if (!res) {
> +			dev_err(&pdev->dev, "failed to get irq resource");
> +			ret = -ENOENT;
> +			goto err_res;
> +		}
res is not used.  Can be removed in next version or in another patch.

> diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_hw.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_hw.c
> new file mode 100644
> index 000000000000..4e6a8a81ff67
> --- /dev/null
> +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_hw.c
> @@ -0,0 +1,179 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 MediaTek Inc.
> + */
> +
> +#include <linux/pm_runtime.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/of_platform.h>
> +#include <linux/module.h>
It would be better to keep the includes in a consistent (e.g. alphabetical) order.

> +#include "mtk_vcodec_enc_hw.h"
> +#include "mtk_vcodec_enc.h"
Same here: it would be better to keep the includes in a consistent order.

> +static irqreturn_t mtk_enc_comp_irq_handler(int irq, void *priv)
> +{
> +	struct mtk_venc_comp_dev *dev = priv;
> +	struct mtk_vcodec_ctx *ctx;
> +	unsigned long flags;
> +	void __iomem *addr;
> +
> +	spin_lock_irqsave(&dev->master_dev->irqlock, flags);
> +	ctx = dev->curr_ctx;
> +	spin_unlock_irqrestore(&dev->master_dev->irqlock, flags);
> +	if (!ctx)
> +		return IRQ_HANDLED;
The lock here protects the read of curr_ctx.  The patch doesn't contain the corresponding locked write.

I am not sure whether the following situation could happen:
1. curr_ctx is not NULL.
2. mtk_enc_comp_irq_handler() gets the curr_ctx.
3. The curr_ctx is destroyed somewhere else.
4. mtk_enc_comp_irq_handler() finds that ctx is not NULL, so it continues to execute.
5. Something goes wrong later in mtk_enc_comp_irq_handler() because the ctx has been destroyed.

Does it make more sense to set curr_ctx to NULL to indicate the ownership has been transferred to mtk_enc_comp_irq_handler()?  For example:

spin_lock_irqsave(...);
ctx = dev->curr_ctx;
dev->curr_ctx = NULL;
spin_unlock_irqrestore(...);
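
A userspace sketch of this take-and-clear pattern, with a pthread mutex
standing in for spin_lock_irqsave()/spin_unlock_irqrestore(). struct ctx,
struct comp_dev, comp_dev_init() and take_curr_ctx() are hypothetical
names, not driver code, and the sketch assumes every writer of curr_ctx
takes the same lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mtk_vcodec_ctx. */
struct ctx {
	int id;
};

/* Hypothetical stand-in for the component device. */
struct comp_dev {
	pthread_mutex_t lock;	/* plays the role of master_dev->irqlock */
	struct ctx *curr_ctx;
};

/* Set up the lock and publish the initial context. */
void comp_dev_init(struct comp_dev *dev, struct ctx *ctx)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->curr_ctx = ctx;
}

/* Read and clear curr_ctx in one critical section to take ownership. */
struct ctx *take_curr_ctx(struct comp_dev *dev)
{
	struct ctx *ctx;

	pthread_mutex_lock(&dev->lock);
	ctx = dev->curr_ctx;
	dev->curr_ctx = NULL;
	pthread_mutex_unlock(&dev->lock);

	return ctx;	/* NULL means there is no context to handle */
}
```

A second call returns NULL until a new context is published, so only one
caller can ever act on a given context.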

> +static int mtk_venc_comp_bind(struct device *dev,
> +			      struct device *master, void *data)
> +{
> +	struct mtk_venc_comp_dev *comp_dev = dev_get_drvdata(dev);
> +	struct mtk_vcodec_dev *master_dev = data;
> +	int i;
> +
> +	for (i = 0; i < MTK_VENC_HW_MAX; i++) {
> +		if (dev->of_node != master_dev->enc_comp_node[i])
> +			continue;
> +
> +		/*add component device by order*/
> +		if (comp_dev->core_id == MTK_VENC_CORE0)
> +			master_dev->enc_comp_dev[MTK_VENC_CORE0] = comp_dev;
> +		else if (comp_dev->core_id == MTK_VENC_CORE1)
> +			master_dev->enc_comp_dev[MTK_VENC_CORE1] = comp_dev;
> +		else
> +			return -EINVAL;
if (comp_dev->core_id < 0 || comp_dev->core_id >= MTK_VENC_HW_MAX)
    return -EINVAL;

master_dev->enc_comp_dev[comp_dev->core_id] = comp_dev;
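
The same range check as a self-contained userspace sketch. The struct and
enum names below are simplified, hypothetical stand-ins for the driver
types, and core_id is assumed to be a signed int.

```c
#include <errno.h>
#include <stddef.h>

enum venc_hw_id { VENC_CORE0 = 0, VENC_CORE1 = 1, VENC_HW_MAX };

/* Hypothetical, reduced versions of the driver structures. */
struct comp_dev {
	int core_id;
};

struct master_dev {
	struct comp_dev *enc_comp_dev[VENC_HW_MAX];
};

/* Validate core_id once, then use it directly as the array index. */
int register_core(struct master_dev *master, struct comp_dev *comp)
{
	if (comp->core_id < 0 || comp->core_id >= VENC_HW_MAX)
		return -EINVAL;

	master->enc_comp_dev[comp->core_id] = comp;
	return 0;
}
```

With this form, adding a third core only requires extending the enum, not
adding another else-if branch.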

Tzung-Bi Shih Aug. 23, 2021, 10:16 a.m. UTC | #2
On Mon, Aug 16, 2021 at 06:59:28PM +0800, Irui Wang wrote:
> @@ -105,6 +106,14 @@ static int mtk_venc_comp_probe(struct platform_device *pdev)
>  
>  	comp_dev->plat_dev = pdev;
>  
> +	ret = mtk_vcodec_init_enc_pm(pdev, &comp_dev->pm);
> +	if (ret < 0) {
> +		dev_err(&pdev->dev, "Failed to get venc component clock source!");
> +		return ret;
> +	}
> +
> +	pm_runtime_enable(&pdev->dev);
mtk_vcodec_init_enc_pm() and mtk_vcodec_release_enc_pm() look like a pair.  Does it make more sense to call pm_runtime_enable() in mtk_vcodec_init_enc_pm()?

Tzung-Bi Shih Aug. 24, 2021, 10:24 a.m. UTC | #3
On Mon, Aug 16, 2021 at 06:59:30PM +0800, Irui Wang wrote:
> -void mtk_vcodec_enc_clock_on(struct mtk_vcodec_pm *pm)
> +void mtk_vcodec_enc_clock_on(struct mtk_vcodec_dev *dev, int core_id)
>  {
> -	struct mtk_vcodec_clk *enc_clk = &pm->venc_clk;
> -	int ret, i = 0;
> +	struct mtk_venc_comp_dev *venc;
> +	struct mtk_vcodec_pm *enc_pm;
> +	struct mtk_vcodec_clk *enc_clk;
> +	struct clk		*clk;
To be neat, remove the extra spaces.

> -	ret = mtk_smi_larb_get(pm->larbvenc);
> -	if (ret) {
> -		mtk_v4l2_err("mtk_smi_larb_get larb3 fail %d", ret);
> -		goto clkerr;
I may be missing the context, but why is mtk_smi_larb_get() removed?

> -void mtk_vcodec_enc_clock_off(struct mtk_vcodec_pm *pm)
> +void mtk_vcodec_enc_clock_off(struct mtk_vcodec_dev *dev, int core_id)
>  {
> -	struct mtk_vcodec_clk *enc_clk = &pm->venc_clk;
> -	int i = 0;
> +	struct mtk_venc_comp_dev *venc;
> +	struct mtk_vcodec_pm *enc_pm;
> +	struct mtk_vcodec_clk *enc_clk;
> +	int i;
>  
> -	mtk_smi_larb_put(pm->larbvenc);
Same here.  Why is mtk_smi_larb_put() removed?

>  int mtk_venc_enable_comp_hw(struct mtk_vcodec_dev *dev)
>  {
>  	int i, ret;
>  	struct mtk_venc_comp_dev *venc_comp;
> +	struct mtk_vcodec_clk *enc_clk;
> +	int j = 0;
It doesn't need to be initialized, and it can be declared on the same line as "int i, ret;".